
Adaptive gradient descent without descent


About

This is the supplementary code (in Python 3) for the paper Y. Malitsky and K. Mishchenko, “Adaptive Gradient Descent without Descent” (available in a two-column ICML version and a one-column arXiv version).

The implemented adaptive method is a reliable tool for minimizing differentiable functions. It is among the most general gradient-based algorithms, its fast convergence is theoretically guaranteed, and the update itself takes only two lines, sketched below.
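
For reference, the two lines are: the step size λ_k = min{ √(1 + θ_{k−1}) λ_{k−1}, ‖x_k − x_{k−1}‖ / (2‖∇f(x_k) − ∇f(x_{k−1})‖) } with θ_k = λ_k / λ_{k−1}, followed by the plain gradient step x_{k+1} = x_k − λ_k ∇f(x_k). Here is a minimal NumPy sketch of that update; the function `adgd`, the gradient oracle `f_grad`, and the initialization constants are illustrative, not the repository's API:

```python
import numpy as np

def adgd(f_grad, x0, n_iter=1000, la0=1e-10):
    """Sketch of adaptive gradient descent. `f_grad` returns the gradient."""
    x_prev, g_prev = x0, f_grad(x0)
    la_prev, theta = la0, np.inf       # lambda_0 > 0 small, theta_0 = +inf
    x = x_prev - la_prev * g_prev      # tiny first step so that x_1 != x_0
    for _ in range(n_iter):
        g = f_grad(x)
        # Line 1: step size = min of controlled growth and a local 1/(2L) estimate.
        la = np.sqrt(1 + theta) * la_prev
        denom = 2 * np.linalg.norm(g - g_prev)
        if denom > 0:
            la = min(la, np.linalg.norm(x - x_prev) / denom)
        # Line 2: plain gradient step with the adaptive step size.
        x_new = x - la * g
        theta, la_prev = la / la_prev, la
        x_prev, g_prev, x = x, g, x_new
    return x
```

For example, `adgd(lambda x: 2 * x, np.ones(3))` minimizes ‖x‖² without any step-size tuning: the ratio term locally estimates 1/(2L), so no Lipschitz constant has to be supplied.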



Usage

There are 5 experiments in total. The first four are provided as Jupyter notebooks, and for the neural network experiment we include a PyTorch implementation of the proposed optimizer.
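
For the neural network experiment, the optimizer plugs into a standard PyTorch training loop. The sketch below is hypothetical: the module name `adgd` and class name `AdGD` are assumptions made for illustration, so check the repository for the actual optimizer name and its constructor arguments.

```python
import torch
from adgd import AdGD  # hypothetical import; adjust to the repo's actual layout

model = torch.nn.Linear(10, 1)
optimizer = AdGD(model.parameters())  # hypothetical class name
loss_fn = torch.nn.MSELoss()

data, target = torch.randn(32, 10), torch.randn(32, 1)
for _ in range(100):
    optimizer.zero_grad()
    loss_fn(model(data), target).backward()
    optimizer.step()  # the step size adapts internally; no learning rate to tune
```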

Reference

If you find this code useful, please cite our paper:

@article{malitsky2019adaptive,
  title={Adaptive gradient descent without descent},
  author={Malitsky, Yura and Mishchenko, Konstantin},
  journal={arXiv preprint arXiv:1910.09529},
  year={2019}
}