Code for the paper *Large-Scale Methods for Distributionally Robust Optimization* by Daniel Levy*, Yair Carmon*, John C. Duchi and Aaron Sidford, to appear at NeurIPS 2020.
This code is written in Python; the dependencies are:
- Python >= 3.6
- PyTorch >= 1.1.0
- torchvision >= 0.3.0
- numpy
- pandas
- for unit tests only: CVXPY and MOSEK (see the sketch below)
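The unit tests presumably use CVXPY (with MOSEK as a solver) to compute the robust loss exactly as a reference for the fast estimators. Below is a rough, purely illustrative sketch of such an oracle for the χ² geometry; the divergence convention and the function name are assumptions, not the repository's actual test code:

```python
import cvxpy as cp
import numpy as np

def exact_chi_square_robust_loss(losses, rho):
    """Hypothetical oracle: maximize q @ losses over distributions q whose
    chi-square divergence to the uniform distribution is at most rho."""
    losses = np.asarray(losses)
    n = len(losses)
    q = cp.Variable(n, nonneg=True)
    constraints = [
        cp.sum(q) == 1,
        # One common convention: D(q || 1/n) = (1/2n) * sum((n*q_i - 1)^2);
        # check the paper / tests for the exact one used there.
        cp.sum_squares(n * q - np.ones(n)) / (2 * n) <= rho,
    ]
    problem = cp.Problem(cp.Maximize(losses @ q), constraints)
    problem.solve()  # pass solver=cp.MOSEK if a MOSEK license is available
    return problem.value
```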
For ImageNet and MNIST, we use the datasets provided by `torchvision`. For the typewritten digits, see Campos, Babu & Varma, 2009. We provide the details of how we use the datasets in Section F.1 of the Appendix of the paper. The code for extracting features can be found in `./features`.
Features for the Digits experiments are also provided.
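For reference, the `torchvision` datasets can be loaded along the following lines; the paths and the transform are illustrative only, and Section F.1 describes the actual preprocessing:

```python
import torchvision
import torchvision.transforms as transforms

# Illustrative transform; not the preprocessing described in Section F.1.
to_tensor = transforms.ToTensor()

# MNIST downloads automatically.
mnist = torchvision.datasets.MNIST('../data', train=True, download=True,
                                   transform=to_tensor)

# ImageNet must already be present under the given root.
imagenet = torchvision.datasets.ImageNet('../data/imagenet', split='train',
                                         transform=to_tensor)
```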
The file `robust_losses.py` implements the gradient estimators we consider in the paper. In particular, it includes the two main estimators we study: the mini-batch estimator and the multilevel Monte Carlo (MLMC) estimator.
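Schematically, the MLMC estimator debiases a cheap small-batch estimate with a randomized telescoping correction. The sketch below conveys the idea only; the level distribution and function names are assumptions, and the actual implementation is the one in `robust_losses.py`:

```python
import numpy as np

def mlmc_robust_loss(robust_loss, losses, n0=10, j_max=6):
    """Illustrative MLMC estimate of a robust loss.

    robust_loss: maps a vector of per-example losses to a scalar robust loss.
    losses:      per-example losses; the first n0 * 2**j_max entries are used.
    """
    # Draw the level J with P(J = j) proportional to 2**(-j).
    probs = 2.0 ** -np.arange(1, j_max + 1)
    probs /= probs.sum()
    j = int(np.random.choice(np.arange(1, j_max + 1), p=probs))

    n = n0 * 2 ** j
    batch = losses[:n]
    # Randomized telescoping correction: the level-j estimate minus the
    # average of the two half-batch estimates, debiased by 1 / P(J = j).
    fine = robust_loss(batch)
    coarse = 0.5 * (robust_loss(batch[: n // 2]) + robust_loss(batch[n // 2:]))
    return robust_loss(losses[:n0]) + (fine - coarse) / probs[j - 1]
```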
The file also includes implementations of the baseline methods we consider: dual-SGM and primal-dual. Our code relies on PyTorch for auto-differentiation and is usable in any existing (PyTorch) training code. We show in Appendix F.3 how to integrate it in less than 3 lines of code.
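As a rough illustration of that integration (not the snippet from Appendix F.3), assuming `robust_losses.py` exposes a `RobustLoss` module that maps per-example losses to a robust loss; the constructor arguments below mirror the command-line flags and are guesses, not a documented signature:

```python
import torch
import torch.nn as nn
from robust_losses import RobustLoss  # assumed interface

model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
criterion = nn.CrossEntropyLoss(reduction='none')  # keep per-example losses
robust_loss = RobustLoss(geometry='chi-square', size=1.0)  # guessed arguments

x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))
optimizer.zero_grad()
loss = robust_loss(criterion(model(x), y))  # robust loss of per-example losses
loss.backward()
optimizer.step()
```

The key change relative to a standard training loop is switching the criterion to `reduction='none'` and passing the resulting per-example losses through the robust loss before calling `backward()`.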
The training and evaluation code is contained in `train.py`. Here is an example command for the ImageNet dataset, using the mini-batch estimator with the χ² geometry:
```
python train.py --algorithm batch --dataset imagenet --data_dir ../data --epochs 30 --momentum 0.9 --lr_schedule constant --averaging constant_3.0 --wd 1e-3 --geometry chi-square --size 1.0 --batch_size 500 --lr 1e-2 --output_dir ../output-dir
```
The hyperparameters (including seeds) for all the experiments we report in the paper are detailed in the `./hyperparameters` folder. We describe our search strategy (a coarse-to-fine grid) in Appendix F.2.
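A coarse-to-fine search of that kind can be sketched as follows (purely illustrative; the grids actually used are the ones listed in `./hyperparameters`):

```python
import numpy as np

def coarse_to_fine(evaluate, lo=1e-4, hi=1e0, size=5):
    """Two-stage grid search: a coarse log-spaced grid, then a finer grid
    around the best coarse value. `evaluate` maps a hyperparameter value
    to a validation metric (lower is better)."""
    coarse = np.logspace(np.log10(lo), np.log10(hi), size)
    best = min(coarse, key=evaluate)
    fine = np.logspace(np.log10(best / 3), np.log10(best * 3), size)
    return min(fine, key=evaluate)
```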
If you use this code, please cite:

```
@inproceedings{levy2020large,
  title={Large-Scale Methods for Distributionally Robust Optimization},
  author={Levy, Daniel and Carmon, Yair and Duchi, John C and Sidford, Aaron},
  booktitle={Advances in Neural Information Processing Systems},
  year={2020}
}
```