This repository contains PyTorch code implementing Stochastic Cubic Adjusted Gradient Descent (SCAG), the optimizer proposed in the paper: Making Use of Second-order Information: Cubic Regularization for Training DNNs.
It includes example code for two datasets (cifar10, cifar100) and four architectures (alexnet, vgg, resnet, wide_resnet).
```
python main.py --arch model_name --dataset dataset_name
```

where `model_name` is one of `alexnet`, `vgg`, `wide_resnet`, and `resnet`, and `dataset_name` is one of `cifar10` and `cifar100`.
For example:

```
python main.py --arch vgg --dataset cifar10
python main.py --arch wide_resnet --dataset cifar10
python main.py --arch alexnet --dataset cifar100
python main.py --arch wide_resnet --dataset cifar100
```
The learning rate starts at 0.5 and is decayed by a factor of 0.1 every 50 epochs; other schedules can be applied for better performance.
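A minimal sketch of this schedule using PyTorch's built-in `StepLR`, assuming the optimizer follows the standard `torch.optim` interface (the `SGD` optimizer and dummy model below are placeholders, not the repository's SCAG implementation):

```python
import torch
from torch.optim.lr_scheduler import StepLR

# Placeholder model and optimizer; SGD stands in for SCAG,
# whose interface is assumed to match torch.optim optimizers.
model = torch.nn.Linear(10, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)

# Multiply the learning rate by 0.1 every 50 epochs:
# 0.5 -> 0.05 at epoch 50 -> 0.005 at epoch 100.
scheduler = StepLR(optimizer, step_size=50, gamma=0.1)

for epoch in range(150):
    # ... one epoch of training here ...
    scheduler.step()
```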
The initial learning rate `lr` can be set as follows:

```
python main.py --arch vgg --dataset cifar10 --lr 0.5
```