/dl-opt

Tutorials on optimizers for deep neural networks

MIT License

Optimization methods for training neural networks

This repository collects tutorials based on my experience training large neural networks. They cover the characteristic features of different optimizers, models, and regularization techniques, as well as different training setups. I exclude everything related to convex deterministic optimization and focus only on stochastic methods that address problems arising in processing data from different domains.

  1. Basic concepts: models, autograd, generalization, local minima and their features
  2. Ingredients of basic optimizers
  3. Key elements of models
  4. Federated learning
  5. Few-bit optimizers
  6. Privacy-aware optimizers
  7. From first-order stochastic methods to higher-order optimizers
  8. Parallelism in training large neural networks
  9. Meta optimizers
  10. Challenges and perspectives