This repository contains PyTorch code implementing Stochastic Cubic Adjusted Gradient Descent (SCAG), the optimizer proposed in the paper: Making Use of Second-order Information: Cubic Regularization for Training DNNs.
It includes example code for two datasets (cifar10, cifar100) and four architectures (alexnet, vgg, resnet, wide_resnet).
```
python main.py --arch model_name --dataset dataset_name
```

where `model_name` is one of `alexnet`, `vgg`, `wide_resnet`, and `resnet`, and `dataset_name` is one of `cifar10` and `cifar100`.
For example:

```
python main.py --arch vgg --dataset cifar10
python main.py --arch wide_resnet --dataset cifar10
python main.py --arch alexnet --dataset cifar100
python main.py --arch wide_resnet --dataset cifar100
```
The learning rate starts at 0.5 and is decayed by a factor of 0.1 every 50 epochs; other schedules can be applied for better performance.
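A minimal sketch of this schedule using PyTorch's built-in `StepLR`, assuming the optimizer follows the standard `torch.optim` interface (the `SGD` optimizer and dummy model below are placeholders, not the repository's SCAG implementation):

```python
import torch
from torch.optim.lr_scheduler import StepLR

# Placeholder model and optimizer; SGD stands in for SCAG,
# whose interface is assumed to match torch.optim optimizers.
model = torch.nn.Linear(10, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)

# Multiply the learning rate by 0.1 every 50 epochs:
# 0.5 -> 0.05 at epoch 50 -> 0.005 at epoch 100.
scheduler = StepLR(optimizer, step_size=50, gamma=0.1)

for epoch in range(150):
    # ... one epoch of training here ...
    scheduler.step()
```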
The initial learning rate `lr` can be set as follows:

```
python main.py --arch vgg --dataset cifar10 --lr 0.5
```