/top-sgd

An optimization algorithm that fits a ResNet to CIFAR-10 5x faster than SGD or Adam, but with terrible generalization.

Primary Language: Python
