The notebook is here: talk.ipynb.

How to make things fast.

  • Basics of shared-memory and per-process parallelism


  • The Global Interpreter Lock and why shared memory parallelism is limited in Python


  • Multiprocessing and multithreading pools


  • Floating-point representations: what they really mean and how to think about them, e.g. float16 vs. bfloat16 vs. float32


  • How to do multi-node training on the Mila Cluster
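
As a rough illustration of the floating-point item above, here is a minimal sketch (not taken from the notebook) that compares how float32, float16 and bfloat16 store the same value, using only the Python standard library. The helper names are hypothetical. The bfloat16 conversion is simulated by truncating a float32 to its top 16 bits, which is an assumption made for simplicity; real hardware and libraries usually round to nearest instead.

```python
import struct


def float32_bits(x: float) -> int:
    """Return the 32-bit IEEE 754 pattern of x after rounding to float32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def round_to_float32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def round_to_float16(x: float) -> float:
    """Round to IEEE half precision: 1 sign, 5 exponent, 10 mantissa bits.

    struct raises OverflowError when x is too large for float16.
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


def truncate_to_bfloat16(x: float) -> float:
    """Simulate bfloat16: 1 sign, 8 exponent, 7 mantissa bits.

    Assumption: we keep the top 16 bits of the float32 pattern and drop
    the rest (truncation), rather than rounding to nearest.
    """
    bits = float32_bits(x) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    # Precision: float16 keeps more mantissa bits than bfloat16.
    for value in (0.1, 1.0, 3.14159):
        print(
            value,
            round_to_float32(value),
            round_to_float16(value),
            truncate_to_bfloat16(value),
        )

    # Range: bfloat16 shares float32's exponent width, float16 does not.
    big = 70000.0
    print("bfloat16:", truncate_to_bfloat16(big))
    try:
        print("float16:", round_to_float16(big))
    except OverflowError:
        print("float16: overflow, the largest finite float16 is 65504")
```

The trade-off this is meant to show: float16 spends its bits on precision (10 mantissa bits) and overflows early, while bfloat16 keeps float32's range (8 exponent bits) at the cost of a coarser mantissa.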