This repository contains materials for the Efficient Deep Learning Systems course taught at the Faculty of Computer Science of HSE University and Yandex School of Data Analysis.

- Week 1: Introduction
  - Lecture: Course overview and organizational details. Core concepts of the GPU architecture and CUDA API.
  - Seminar: CUDA operations in PyTorch. Introduction to benchmarking.
- Week 2: Basics of distributed ML
  - Lecture: Introduction to distributed training. Process-based communication. Parameter Server architecture.
  - Seminar: Multiprocessing basics. Parallel GloVe training.
- Week 3: Data-parallel training and All-Reduce
  - Lecture: Data-parallel training of neural networks. All-Reduce and its efficient implementations.
  - Seminar: Introduction to PyTorch Distributed. Data-parallel training primitives.
- Week 4: Memory-efficient and model-parallel training
  - Lecture: Model-parallel training, gradient checkpointing, offloading.
  - Seminar: Gradient checkpointing in practice.
- Week 5: Training optimizations, profiling DL code
  - Lecture: Mixed-precision training. Data storage and loading optimizations. Tools for profiling deep learning workloads.
  - Seminar: Automatic Mixed Precision in PyTorch. Dynamic padding for sequence data and JPEG decoding benchmarks. Basics of PyTorch Profiler and cProfile.
- Week 6: Python web application deployment
  - Lecture/Seminar: Building and deploying production-ready web services. App and web servers, Docker containers, Prometheus metrics, APIs over HTTP and gRPC.
- Week 7: Software for serving neural networks
  - Lecture/Seminar: Formats for packaging neural networks: ONNX, TorchScript, IR. Inference servers: OpenVINO, Triton. ML on client devices: TensorFlow.js, ML Kit, Core ML.
- Week 8: Optimizing models for faster inference
  - Lecture: Knowledge distillation, pruning, quantization, NAS, efficient architectures.
  - Seminar: Quantization and distillation of Transformers.
- Week 9: Experiment tracking, model and data versioning, testing DL code in Python
  - Lecture: Experiment management basics and pipeline versioning. Configuring Python applications. Intro to regular and property-based testing.
  - Seminar: Example DVC+W&B project walkthrough. Intro to testing with pytest.
- Week 10: Invited talks
  - Memory Footprint Reduction Techniques for DNN Training: An Overview. Gennady Pekhimenko, University of Toronto, Vector Institute
  - Efficient Inference of Deep Learning Models on (GP)GPU. Ivan Komarov, Yandex

There will be three homework assignments in total, some of them spread over several weeks. The final grade is a weighted sum of the per-assignment grades. Please refer to the course page of your institution for details.