Interpolation, Growth Conditions, and Stochastic Gradient Descent
This repository contains the draft of my M.Sc. thesis. The main contributions are the following:
- a generalized notion of interpolation for stochastic optimization problems (the standard finite-sum condition is recalled after this list);
- a careful complexity analysis of stochastic gradient descent under interpolation, both with a constant step-size and with a stochastic Armijo line-search; and
- an analysis of Nesterov's accelerated gradient method (under interpolation) using the estimating sequences framework.
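For context, one common statement of the interpolation condition for finite-sum problems is given below. This is the textbook form, not the generalized definition developed in the thesis:

```latex
f(w) = \frac{1}{n} \sum_{i=1}^{n} f_i(w),
\qquad
\nabla f_i(w^*) = 0 \ \text{ for every } i,
\quad \text{where } w^* \in \operatorname*{arg\,min}_{w} f(w).
```

Informally, every stochastic gradient vanishes at a minimizer of the full objective, as happens for over-parameterized models that fit the training data exactly.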
Many of the results developed here are improved versions of theorems from the following two papers:
- Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron:
  - constant step-size SGD;
  - Nesterov's accelerated gradient method.
- Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates:
  - SGD with a stochastic Armijo line-search (a minimal sketch of this procedure follows below).
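As a rough illustration of the line-search variant, here is a minimal sketch of SGD with a backtracking stochastic Armijo line-search. It assumes user-supplied callables `loss_fi(w, i)` and `grad_fi(w, i)` for the individual component functions; the names, defaults, and resetting strategy are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def sgd_stochastic_armijo(loss_fi, grad_fi, w0, n, max_iters=1000,
                          eta_max=1.0, c=0.5, beta=0.9):
    """Sketch of SGD with a stochastic Armijo line-search (illustrative only).

    loss_fi(w, i) and grad_fi(w, i) evaluate the loss and gradient of the
    i-th component function f_i at the point w.
    """
    w = np.asarray(w0, dtype=float).copy()
    for _ in range(max_iters):
        i = np.random.randint(n)      # sample one component function f_i
        g = grad_fi(w, i)
        f_w = loss_fi(w, i)
        eta = eta_max                 # reset the step size each iteration
        # Backtrack until the Armijo condition holds on the sampled function:
        #   f_i(w - eta * g) <= f_i(w) - c * eta * ||g||^2
        while loss_fi(w - eta * g, i) > f_w - c * eta * g.dot(g):
            eta *= beta
            if eta < 1e-10:           # safeguard against underflow
                break
        w = w - eta * g
    return w
```

Resetting the step size to `eta_max` at every iteration is one simple choice; other resetting heuristics (for example, warm-starting from the previously accepted step size) are also possible.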
Acknowledgements
I am extremely fortunate to have been supervised by Mark Schmidt throughout my master's.
I am also greatly indebted to my colleagues and friends Sharan Vaswani, Frederik Kunstner, and Si Yi (Cathy) Meng.