
👑 PyTorch code for the Nero optimiser.


Nero optimiser

Yang Liu · Jeremy Bernstein · Markus Meister · Yisong Yue

Getting started

  • Grab nero.py and place it in your PyTorch project directory. Then type:
from nero import Nero
optimizer = Nero(net.parameters(), lr=0.01)
  • An initial learning rate of lr=0.01 is the recommended default; it worked in almost all of our experiments. Otherwise, try lr=0.001.
  • Learning rate decay over the course of optimisation also helps; see the sketch after this list.
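
For concreteness, here is a minimal sketch of a full training loop: Nero at the recommended lr=0.01, combined with cosine annealing as one reasonable choice of decay schedule (the schedule is our illustration, not one prescribed by the paper). The tiny model and synthetic data are placeholders for your own task; since Nero in nero.py subclasses torch.optim.Optimizer, the standard PyTorch schedulers apply.

import torch
import torch.nn.functional as F
from nero import Nero

# Placeholder model and synthetic data -- substitute your own task here.
net = torch.nn.Sequential(torch.nn.Linear(20, 64), torch.nn.ReLU(), torch.nn.Linear(64, 2))
x = torch.randn(256, 20)
y = torch.randint(0, 2, (256,))

optimizer = Nero(net.parameters(), lr=0.01)  # recommended default learning rate
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50)

for epoch in range(50):
    optimizer.zero_grad()
    loss = F.cross_entropy(net(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()  # decay the learning rate once per epoch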

About this repository

This repository was built by Yang Liu and Jeremy Bernstein to accompany the following paper:

Learning by Turning: Neural Architecture Aware Optimisation.

We're providing this code so that you can test our optimisation algorithm in your own applications, and so that you can reproduce the experiments in our paper.

If something isn't clear or isn't working, let us know in the Issues section or contact yang@abacus.ai and bernstein@caltech.edu.

Repository structure

.
├── cGAN/                   # class-conditional GAN image generation experiments
├── cifar/                  # CIFAR-10 classification experiments
├── imagenet/               # ImageNet classification experiments
├── mnist/                  # MNIST experiments with a deep MLP and reparameterisation
├── optim/                  # optimiser definitions
├── ppo/                    # reinforcement learning experiment
├── transformer-wmt16/      # large transformer
├── wikitext2/              # small transformer
├── LICENSE                 # license on our algorithm
├── README.md               # the page you're reading now
└── nero.py                 # our optimiser

Citation

If you find Nero useful, feel free to cite the paper:

@misc{nero2021,
  title={Learning by Turning: Neural Architecture Aware Optimisation},
  author={Yang Liu and Jeremy Bernstein and Markus Meister and Yisong Yue},
  year={2021},
  eprint={2102.07227},
  archivePrefix={arXiv}
}

License

We are making our algorithm available under a CC BY-NC-SA 4.0 license. The other code we have used is subject to its own license restrictions, as indicated in the relevant subfolders.