Building the foundations of deep learning, from matrix multiplication and backpropagation to ResNets and beyond
- Matrix Multiplication
- Neural Network Forward Pass
- Neural Network Backpropagation
- Rebuilding PyTorch Essentials
Optimizing matrix multiplication from scratch:
- Nested loops in standard Python
- Array Slicing
- Array Broadcasting
- Einstein Summation in PyTorch
- Standard PyTorch
Matrix multiplication in standard PyTorch is about 44,000 times faster than the nested-loop implementation in plain Python.
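A minimal sketch of these variants is shown below; the function names are illustrative, and the single-run timings are only indicative, since exact speedups depend on matrix sizes and hardware.

```python
import time
import torch

def matmul_loops(a, b):
    # Naive triple loop in plain Python: one scalar multiply-add at a time.
    n, m = a.shape
    _, p = b.shape
    c = torch.zeros(n, p)
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i, j] += a[i, k] * b[k, j]
    return c

def matmul_broadcast(a, b):
    # One row at a time: broadcast a[i] against all of b, sum over the shared dim.
    c = torch.zeros(a.shape[0], b.shape[1])
    for i in range(a.shape[0]):
        c[i] = (a[i].unsqueeze(-1) * b).sum(dim=0)
    return c

def matmul_einsum(a, b):
    # Einstein summation: contract over the shared index k.
    return torch.einsum('ik,kj->ij', a, b)

a, b = torch.randn(64, 32), torch.randn(32, 16)
for name, f in [('loops', matmul_loops), ('broadcast', matmul_broadcast),
                ('einsum', matmul_einsum), ('torch.matmul', torch.matmul)]:
    t0 = time.perf_counter()
    out = f(a, b)
    print(f'{name}: {time.perf_counter() - t0:.6f}s')
    assert torch.allclose(out, a @ b, atol=1e-4)
```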
Demonstrating the difficulty in training neural networks:
- Exploding Activations with added depth [Solution: Xavier Initialization]
- Vanishing Activations when using ReLU [Solution: Kaiming Initialization]
- Improvements with Parametric/Leaky/Shifted ReLU
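A minimal sketch of both failure modes and their fixes, assuming a stack of plain linear layers; the depth, width, and batch size are arbitrary, and the scaling factors are the simplified Xavier (variance 1/fan_in) and Kaiming (variance 2/fan_in) rules:

```python
import torch

def final_activation_std(depth=50, width=256, init=None, relu=False):
    # Push random data through `depth` linear layers and report the
    # standard deviation of the final activations.
    x = torch.randn(512, width)
    for _ in range(depth):
        w = torch.randn(width, width)
        if init == 'xavier':
            w *= (1 / width) ** 0.5      # Xavier: Var(w) = 1 / fan_in
        elif init == 'kaiming':
            w *= (2 / width) ** 0.5      # Kaiming: Var(w) = 2 / fan_in, for ReLU
        x = x @ w
        if relu:
            x = x.clamp(min=0)           # ReLU
    return x.std()

print(final_activation_std())                          # no scaling: explodes to inf/nan
print(final_activation_std(init='xavier'))              # stays near 1 without ReLU
print(final_activation_std(init='xavier', relu=True))   # shrinks toward 0: vanishing
print(final_activation_std(init='kaiming', relu=True))  # roughly stable with ReLU
```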
After developing an appreciation of the challenges in training neural networks, we build a Feed Forward Neural Network that mimics PyTorch's modular design.
Implementing Autograd (Automatic Differentiation) functionality for:
- Linear layer: Affine function
- Activation layer: ReLU
- Loss layer: Mean Squared Error
We design a layer abstraction class to build a Fully Connected Neural Network capable of backpropagating errors using automatic differentiation of its computation graph. PyTorch's design choices, such as nn.Module, start to make perfect sense.
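One way to sketch that abstraction, assuming each layer caches its input in the forward pass and stores gradients in a `.g` attribute during the backward pass; the class names `Lin`, `Relu`, `Mse`, and `Model` are illustrative, not PyTorch's:

```python
import torch

class Lin:
    # Affine layer: out = x @ w + b
    def __init__(self, n_in, n_out):
        self.w = torch.randn(n_in, n_out) * (2 / n_in) ** 0.5  # Kaiming init
        self.b = torch.zeros(n_out)

    def __call__(self, x):
        self.x = x                        # cache input for the backward pass
        self.out = x @ self.w + self.b
        return self.out

    def backward(self):
        self.x.g = self.out.g @ self.w.t()
        self.w.g = self.x.t() @ self.out.g
        self.b.g = self.out.g.sum(0)

class Relu:
    def __call__(self, x):
        self.x = x
        self.out = x.clamp(min=0)
        return self.out

    def backward(self):
        self.x.g = (self.x > 0).float() * self.out.g

class Mse:
    def __call__(self, pred, targ):
        self.pred, self.targ = pred, targ
        return ((pred - targ) ** 2).mean()

    def backward(self):
        self.pred.g = 2 * (self.pred - self.targ) / self.pred.numel()

class Model:
    # Chains the layers, then backpropagates through them in reverse order.
    def __init__(self, n_in, n_hidden, n_out):
        self.layers = [Lin(n_in, n_hidden), Relu(), Lin(n_hidden, n_out)]
        self.loss = Mse()

    def __call__(self, x, targ):
        for layer in self.layers:
            x = layer(x)
        return self.loss(x, targ)

    def backward(self):
        self.loss.backward()
        for layer in reversed(self.layers):
            layer.backward()

x, y = torch.randn(100, 10), torch.randn(100, 1)
model = Model(10, 32, 1)
loss = model(x, y)
model.backward()          # every Lin layer now holds w.g and b.g
```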
We explore the internal abstractions and architecture of PyTorch in depth and rebuild them from scratch:
- PyTorch Data Abstractions
- Dataset
- DataLoader
- DataSampler
- PyTorch Training Abstractions
- nn.Parameter
- nn.Sequential
- Optimizer
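A minimal sketch of a few of these rebuilt abstractions, assuming the model itself is an ordinary autograd-capable PyTorch module; the class names mirror the roles above but the implementations are deliberately simplified:

```python
import random
import torch

class Dataset:
    # Pairs inputs with targets and exposes length and indexing.
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __len__(self):
        return len(self.x)
    def __getitem__(self, i):
        return self.x[i], self.y[i]

class Sampler:
    # Yields dataset indices, optionally shuffled each epoch.
    def __init__(self, ds, shuffle=False):
        self.n, self.shuffle = len(ds), shuffle
    def __iter__(self):
        idxs = list(range(self.n))
        if self.shuffle:
            random.shuffle(idxs)
        return iter(idxs)

class DataLoader:
    # Groups sampled indices into batches and collates them into tensors.
    def __init__(self, ds, sampler, bs):
        self.ds, self.sampler, self.bs = ds, sampler, bs
    def __iter__(self):
        batch = []
        for i in self.sampler:
            batch.append(i)
            if len(batch) == self.bs:
                yield self.collate(batch)
                batch = []
        if batch:
            yield self.collate(batch)
    def collate(self, idxs):
        xs, ys = zip(*[self.ds[i] for i in idxs])
        return torch.stack(xs), torch.stack(ys)

class SGD:
    # Bare-bones optimizer: step() applies the gradients, zero_grad() clears them.
    def __init__(self, params, lr):
        self.params, self.lr = list(params), lr
    def step(self):
        with torch.no_grad():
            for p in self.params:
                p -= self.lr * p.grad
    def zero_grad(self):
        for p in self.params:
            p.grad = None

ds = Dataset(torch.randn(256, 10), torch.randn(256, 1))
dl = DataLoader(ds, Sampler(ds, shuffle=True), bs=32)
model = torch.nn.Sequential(torch.nn.Linear(10, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))
opt = SGD(model.parameters(), lr=0.1)
for xb, yb in dl:
    loss = torch.nn.functional.mse_loss(model(xb), yb)
    loss.backward()
    opt.step()
    opt.zero_grad()
```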
Having dived deep into the inner workings of PyTorch, we gain a deeper understanding of deep learning concepts, the problems that arise, and the existing solutions. We also gain insight into the software architecture, design, and development process of a popular deep learning framework.