The notebook is here: talk.ipynb.

How to make things fast.

  • Basics of shared-memory and per-process parallelism

 

  • The Global Interpreter Lock and why shared-memory parallelism is limited in Python

 

  • Multiprocessing and multithreading pools

 

  • Floating-point representations: what they really mean and how to think about them, e.g. float16 vs bfloat16 vs float32

 

  • How to do multi-node training on the Mila Cluster
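
As a rough illustration of the floating-point item above, the sketch below rounds a Python float to each of the three formats using only the standard library. It is not taken from the notebook. The helper names (to_float32, to_float16, to_bfloat16) are made up for this example, and bfloat16 is emulated by truncating the low 16 bits of the float32 encoding, which is a simplification: real hardware and libraries usually round to nearest instead.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_float16(x: float) -> float:
    """Round to the nearest IEEE 754 half-precision value
    (1 sign bit, 5 exponent bits, 10 mantissa bits)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_bfloat16(x: float) -> float:
    """Emulate bfloat16 (1 sign bit, 8 exponent bits, 7 mantissa bits).

    Assumption: we keep the top 16 bits of the float32 encoding and zero
    the rest, i.e. truncation rather than round-to-nearest.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


if __name__ == "__main__":
    # Compare how closely each format can represent the same inputs.
    for value in (0.1, 1.0, 1000.5):
        print(
            f"{value!r}: float32={to_float32(value)!r} "
            f"float16={to_float16(value)!r} "
            f"bfloat16={to_bfloat16(value)!r}"
        )
```

The trade-off this is meant to show: float16 spends more bits on the mantissa, so it is more precise near 1 but has a small range, while bfloat16 keeps the float32 exponent range and gives up precision.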