/fast-dro

PyTorch implementation of efficient algorithms for DRO with CVaR and Chi-Square uncertainty sets

Primary language: Python. License: MIT.

Large-Scale Methods for Distributionally Robust Optimization

Code for the paper Large-Scale Methods for Distributionally Robust Optimization by Daniel Levy*, Yair Carmon*, John C. Duchi and Aaron Sidford, to appear at NeurIPS 2020.

Dependencies

This code is written in Python. The dependencies are:

  • Python >= 3.6
  • PyTorch >= 1.1.0
  • torchvision >= 0.3.0
  • numpy
  • pandas
  • for unit tests only: CVXPY and MOSEK

Datasets

For ImageNet and MNIST, we use the datasets provided by torchvision. For the typewritten digits, see Campos, Babu & Varma, 2009. Section F.1 of the paper's appendix details how we use the datasets. The feature-extraction code can be found in ./features.

Features for the Digits experiments

Robust Losses

The file robust_losses.py implements the gradient estimators we consider in the paper. In particular, it includes the two main estimators we study: mini-batch and multilevel Monte Carlo (MLMC). It also includes implementations of the baseline methods we consider: dual-SGM and primal-dual. Our code relies on PyTorch for automatic differentiation and can be used in any existing PyTorch training code. Appendix F.3 shows how to integrate it in fewer than 3 lines of code.
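To give a feel for what a mini-batch robust loss computes, here is a minimal sketch for the CVaR case. It is not the code in robust_losses.py: the function name cvar_batch_loss and its alpha parameter are hypothetical, chosen only for this illustration. It assumes the standard CVaR uncertainty set on a batch of n examples, in which no example may receive weight above 1/(alpha * n), so the worst case puts the maximum weight on the largest losses.

```python
import torch
import torch.nn.functional as F


def cvar_batch_loss(losses: torch.Tensor, alpha: float) -> torch.Tensor:
    """Illustrative CVaR of per-example losses (hypothetical helper, not the repo API).

    Maximizes sum_i q_i * losses_i over weights q with q_i >= 0, sum_i q_i = 1,
    and q_i <= 1 / (alpha * n), where n is the batch size and 0 < alpha <= 1.
    """
    assert 0.0 < alpha <= 1.0, "alpha must lie in (0, 1]"
    flat = losses.flatten()
    n = flat.numel()
    sorted_losses, _ = torch.sort(flat, descending=True)

    cap = 1.0 / (alpha * n)        # largest weight any single example may get
    k = min(int(alpha * n), n)     # number of examples that receive the full cap

    # The weights are constants with respect to autograd; gradients flow
    # only through the sorted loss values.
    weights = torch.zeros_like(sorted_losses)
    weights[:k] = cap
    if k < n:
        weights[k] = 1.0 - k * cap  # leftover mass goes to the next-largest loss
    return torch.dot(weights, sorted_losses)


# Usage in an ordinary training step: keep per-example losses
# (reduction="none") and replace the usual mean with the robust loss.
model = torch.nn.Linear(5, 3)
inputs, targets = torch.randn(8, 5), torch.randint(0, 3, (8,))
per_example = F.cross_entropy(model(inputs), targets, reduction="none")
robust = cvar_batch_loss(per_example, alpha=0.25)
robust.backward()
```

With alpha = 1 the sketch reduces to the ordinary batch mean, and smaller values of alpha concentrate the weight on the hardest examples in the batch.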

Training and Evaluation

The training and evaluation code is contained in train.py. Here is an example command for the ImageNet dataset, using the $\chi^2$ uncertainty set of size 1, a batch size of 500, momentum of 0.9, and a learning rate of 0.01 for 30 epochs.

python train.py --algorithm batch --dataset imagenet --data_dir ../data --epochs 30 --momentum 0.9 --lr_schedule constant --averaging constant_3.0 --wd 1e-3 --geometry chi-square --size 1.0 --batch_size 500 --lr 1e-2 --output_dir ../output-dir
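When running several configurations, it can be convenient to assemble this command programmatically. The sketch below is only an illustration: build_train_command is a hypothetical helper, not part of this repository, and it uses only the flags that appear in the example command above.

```python
import shlex


def build_train_command(options, script="train.py"):
    """Turn a dict of option names and values into an argv list (hypothetical helper)."""
    argv = ["python", script]
    for flag, value in options.items():
        argv += ["--" + flag, str(value)]
    return argv


# Options copied from the example command above; values such as "1e-3"
# are kept as strings so they are passed through unchanged.
example_options = {
    "algorithm": "batch",
    "dataset": "imagenet",
    "data_dir": "../data",
    "epochs": 30,
    "momentum": 0.9,
    "lr_schedule": "constant",
    "averaging": "constant_3.0",
    "wd": "1e-3",
    "geometry": "chi-square",
    "size": 1.0,
    "batch_size": 500,
    "lr": "1e-2",
    "output_dir": "../output-dir",
}

command = build_train_command(example_options)
command_line = " ".join(shlex.quote(arg) for arg in command)
print(command_line)
```

The resulting argv list can be passed to subprocess.run(command, check=True) from the directory containing train.py, for example inside a loop that varies one option at a time.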

Hyperparameters

The hyperparameters (including seeds) for all the experiments we show in the paper are detailed in the ./hyperparameters folder. We describe our search strategy (a coarse-to-fine grid) in Appendix F.2.

Reference

@inproceedings{levy2020large,
  title={Large-Scale Methods for Distributionally Robust Optimization},
  author={Levy, Daniel and Carmon, Yair and Duchi, John C and Sidford, Aaron},
  booktitle={Advances in Neural Information Processing Systems},
  year={2020}
}