This is the supplementary code (in Python 3) for the paper Y. Malitsky and K. Mishchenko, “Adaptive Gradient Descent without Descent” (available as the two-column ICML version or the one-column arXiv version).
The implemented adaptive method is a reliable tool for minimizing differentiable functions. It is among the most general gradient-based algorithms, and its fast convergence is theoretically guaranteed. The method itself takes just two lines:
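In the notation of the paper, with θ_k = λ_k / λ_{k−1}:

    λ_k     = min{ √(1 + θ_{k−1}) · λ_{k−1},  ‖x^k − x^{k−1}‖ / (2 ‖∇f(x^k) − ∇f(x^{k−1})‖) }
    x^{k+1} = x^k − λ_k ∇f(x^k)

Below is a minimal NumPy sketch of this loop. It is not the repository's implementation; the function name `adgd` and the defaults `lr0` and `n_iter` are our choices for illustration.

```python
import numpy as np

def adgd(x0, grad, lr0=1e-6, n_iter=1000):
    """Sketch of the two-line adaptive method from the paper.

    x0     -- starting point (NumPy array)
    grad   -- callable returning the gradient of f at a point
    lr0    -- small initial step size lambda_0 (our default, not from the repo)
    n_iter -- number of iterations (our default, not from the repo)
    """
    x_prev, g_prev, la_prev = x0, grad(x0), lr0
    theta = np.inf                     # theta_0 = +inf, so the first min picks the ratio term
    x = x_prev - la_prev * g_prev      # the very first step is a plain gradient step
    for _ in range(n_iter):
        g = grad(x)
        # Line 1: the adaptive step size.
        la = min(np.sqrt(1 + theta) * la_prev,
                 np.linalg.norm(x - x_prev) / (2 * np.linalg.norm(g - g_prev)))
        # Line 2: the gradient step.
        x, x_prev = x - la * g, x
        g_prev, theta, la_prev = g, la / la_prev, la
    return x
```

For example, `adgd(np.ones(2), lambda x: 2 * x, n_iter=100)` drives the quadratic ‖x‖² to its minimizer at the origin without any step-size tuning.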
There are five experiments in total. The first four are provided as Jupyter notebooks; for the neural networks, we include a PyTorch implementation of the proposed optimizer (see the usage sketch after the list below).
- Logistic regression
- Matrix factorization
- Cubic regularization
- Line search for logistic regression
- Neural networks
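As a usage sketch only: the snippet below shows how an optimizer exposing the standard `torch.optim.Optimizer` interface plugs into a training loop. The module and class names (`optimizer`, `AdGD`) are placeholders, not necessarily those used in this repository.

```python
import torch
import torch.nn.functional as F

# Hypothetical import: module and class names are placeholders for
# whatever the repository actually exposes.
from optimizer import AdGD

model = torch.nn.Linear(10, 1)
opt = AdGD(model.parameters())        # assumes the usual torch.optim API

x, y = torch.randn(64, 10), torch.randn(64, 1)
for _ in range(100):
    opt.zero_grad()                   # clear accumulated gradients
    loss = F.mse_loss(model(x), y)    # any differentiable loss works
    loss.backward()                   # compute gradients
    opt.step()                        # adaptive step; no learning rate to tune
```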
If you find this code useful, please cite our paper:
    @article{malitsky2019adaptive,
      title={Adaptive gradient descent without descent},
      author={Malitsky, Yura and Mishchenko, Konstantin},
      journal={arXiv preprint arXiv:1910.09529},
      year={2019}
    }