DomainBed is a PyTorch suite containing benchmark datasets and algorithms for domain generalization, as introduced in In Search of Lost Domain Generalization.
Full results for commit 7df6f06 are available in LaTeX format here.
The currently available algorithms are:
- Empirical Risk Minimization (ERM, Vapnik, 1998)
- Invariant Risk Minimization (IRM, Arjovsky et al., 2019)
- Group Distributionally Robust Optimization (GroupDRO, Sagawa et al., 2020)
- Interdomain Mixup (Mixup, Yan et al., 2020)
- Marginal Transfer Learning (MTL, Blanchard et al., 2011-2020)
- Meta Learning Domain Generalization (MLDG, Li et al., 2017)
- Maximum Mean Discrepancy (MMD, Li et al., 2018)
- Deep CORAL (CORAL, Sun and Saenko, 2016)
- Domain Adversarial Neural Network (DANN, Ganin et al., 2015)
- Conditional Domain Adversarial Neural Network (CDANN, Li et al., 2018)
- Style Agnostic Networks (SagNet, Nam et al., 2020)
- Adaptive Risk Minimization (ARM, Zhang et al., 2020), contributed by @zhangmarvin
- Variance Risk Extrapolation (VREx, Krueger et al., 2020), contributed by @zdhNarsil
- Representation Self-Challenging (RSC, Huang et al., 2020), contributed by @SirRob1997
- Spectral Decoupling (SD, Pezeshki et al., 2020)
Send us a PR to add your algorithm! Our implementations use ResNet50 / ResNet18 networks (He et al., 2015) and the hyper-parameter grids described here.
The currently available datasets are:
- RotatedMNIST (Ghifary et al., 2015)
- ColoredMNIST (Arjovsky et al., 2019)
- VLCS (Fang et al., 2013)
- PACS (Li et al., 2017)
- Office-Home (Venkateswara et al., 2017)
- A TerraIncognita (Beery et al., 2018) subset
- DomainNet (Peng et al., 2019)
- A SVIRO (Dias Da Cruz et al., 2020) subset
- WILDS (Koh et al., 2020) FMoW (Christie et al., 2018) about satellite images
- WILDS (Koh et al., 2020) Camelyon17 (Bandi et al., 2019) about tumor detection in tissues
Send us a PR to add your dataset! Any custom image dataset with folder structure dataset/domain/class/image.xyz is readily usable. While we include some datasets from the WILDS project, please use their official code if you wish to participate in their leaderboard.
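As an illustration of the dataset/domain/class/image.xyz layout, the following is a minimal sketch that indexes such a folder tree using only the Python standard library. The function name index_image_folder is hypothetical and is not part of DomainBed; it only shows how domains and classes can be read off the directory names.

```python
from pathlib import Path


def index_image_folder(root):
    """Map each domain name to a dict of class name -> sorted list of image paths.

    Assumes the layout root/domain/class/image.xyz.
    (Hypothetical helper for illustration; not a DomainBed function.)
    """
    index = {}
    for domain_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        classes = {}
        for class_dir in sorted(p for p in domain_dir.iterdir() if p.is_dir()):
            # Every regular file inside a class folder is treated as one image.
            classes[class_dir.name] = sorted(
                f for f in class_dir.iterdir() if f.is_file()
            )
        index[domain_dir.name] = classes
    return index
```

Calling index_image_folder("/my/datasets/path/my_dataset") on a tree laid out this way would return one entry per domain, each holding one entry per class.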
Model selection criteria differ in what data is used to choose the best hyper-parameters for a given model:
- IIDAccuracySelectionMethod: A random subset from the data of the training domains.
- LeaveOneOutSelectionMethod: A random subset from the data of a held-out (not training, not testing) domain.
- OracleSelectionMethod: A random subset from the data of the test domain.
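The difference between the three criteria can be sketched with a toy example. The record fields, run names, and accuracy values below are invented for illustration, and select_run is a hypothetical helper, not the DomainBed implementation (which may involve more steps, especially for the leave-one-out criterion). The only point is that each criterion ranks the same runs by accuracy on a different validation split.

```python
# Toy records: one per hyper-parameter choice. All values are made up.
runs = [
    {"name": "run_a", "train_domain_val_acc": 0.95,
     "held_out_domain_acc": 0.70, "test_domain_val_acc": 0.62},
    {"name": "run_b", "train_domain_val_acc": 0.93,
     "held_out_domain_acc": 0.74, "test_domain_val_acc": 0.66},
    {"name": "run_c", "train_domain_val_acc": 0.90,
     "held_out_domain_acc": 0.72, "test_domain_val_acc": 0.68},
]

# Which validation split each criterion looks at (hypothetical field names).
CRITERION_TO_FIELD = {
    "iid": "train_domain_val_acc",           # subset of the training domains
    "leave_one_out": "held_out_domain_acc",  # subset of a held-out domain
    "oracle": "test_domain_val_acc",         # subset of the test domain
}


def select_run(runs, criterion):
    """Return the run with the highest accuracy on the split used by `criterion`."""
    field = CRITERION_TO_FIELD[criterion]
    return max(runs, key=lambda run: run[field])


for criterion in CRITERION_TO_FIELD:
    print(criterion, "->", select_run(runs, criterion)["name"])
```

With these toy numbers each criterion picks a different run, which is why reported results depend on the model selection method used.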
Download the datasets:
python -m domainbed.scripts.download \
--data_dir=/my/datasets/path
Train a model:
python -m domainbed.scripts.train \
--data_dir=/my/datasets/path \
--algorithm ERM \
--dataset RotatedMNIST
Launch a sweep:
python -m domainbed.scripts.sweep launch \
--data_dir=/my/datasets/path \
--output_dir=/my/sweep/output/path \
--command_launcher MyLauncher
Here, MyLauncher is your cluster's command launcher, as implemented in command_launchers.py. At the time of writing, the entire sweep trains tens of thousands of models (all algorithms x all datasets x 3 independent trials x 20 random hyper-parameter choices). You can pass arguments to make the sweep smaller:
python -m domainbed.scripts.sweep launch \
--data_dir=/my/datasets/path \
--output_dir=/my/sweep/output/path \
--command_launcher MyLauncher \
--algorithms ERM DANN \
--datasets RotatedMNIST VLCS \
--n_hparams 5 \
--n_trials 1
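To get a feel for how these arguments shrink the sweep, here is a rough sketch that enumerates the combinations of algorithm, dataset, hyper-parameter draw, and trial for the reduced command above. The function enumerate_jobs and the dictionary keys are hypothetical. The real sweep script may multiply the count further (for example, by a per-dataset factor), so treat the number as a lower bound on the jobs launched.

```python
import itertools


def enumerate_jobs(algorithms, datasets, n_hparams, n_trials):
    """Return one dict per (algorithm, dataset, hparams draw, trial) combination.

    Hypothetical helper for illustration; not the DomainBed sweep code.
    """
    return [
        {"algorithm": algorithm, "dataset": dataset,
         "hparams_index": hparams_index, "trial_index": trial_index}
        for algorithm, dataset, hparams_index, trial_index in itertools.product(
            algorithms, datasets, range(n_hparams), range(n_trials)
        )
    ]


jobs = enumerate_jobs(
    algorithms=["ERM", "DANN"],
    datasets=["RotatedMNIST", "VLCS"],
    n_hparams=5,
    n_trials=1,
)
print(len(jobs))  # 2 algorithms * 2 datasets * 5 draws * 1 trial = 20
```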
After all jobs have either succeeded or failed, you can delete the data from failed jobs with python -m domainbed.scripts.sweep delete_incomplete and then re-launch them by running python -m domainbed.scripts.sweep launch again. Specify the same command-line arguments in all calls to sweep as you did the first time; this is how the sweep script knows which jobs were launched originally.
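One way such bookkeeping can work is sketched below, under two assumptions that the text above does not confirm: each job writes to a folder named by a hash of its arguments, and a marker file signals completion. The names job_dir, job_state, and the marker file name "done" are hypothetical. The sketch shows why identical arguments are needed: they map to the same folders, so the state of each job can be recovered.

```python
import hashlib
import json
from pathlib import Path


def job_dir(output_dir, args):
    """Derive a deterministic folder for a job from its arguments (assumed scheme)."""
    # sort_keys makes the hash independent of dictionary ordering.
    encoded = json.dumps(args, sort_keys=True).encode("utf-8")
    return Path(output_dir) / hashlib.md5(encoded).hexdigest()


def job_state(output_dir, args):
    """Classify a job as 'done', 'incomplete', or 'not launched'."""
    directory = job_dir(output_dir, args)
    if (directory / "done").exists():  # hypothetical completion marker
        return "done"
    if directory.exists():             # folder exists but no marker: failed or still running
        return "incomplete"
    return "not launched"
```

Under this scheme, deleting incomplete jobs amounts to removing every folder whose state is "incomplete", and re-launching amounts to starting every job whose state is "not launched".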
To view the results of your sweep:
python -m domainbed.scripts.collect_results \
--input_dir=/my/sweep/output/path
DomainBed includes some unit tests and end-to-end tests. While not exhaustive, they are a good sanity check. To run the tests:
python -m unittest discover
By default, this only runs tests which don't depend on a dataset directory. To run the dataset-dependent tests as well:
DATA_DIR=/my/datasets/path python -m unittest discover
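A common way to gate tests on an environment variable like this is unittest.skipIf. The sketch below shows the general pattern with two made-up test classes; it is an illustration of the mechanism and not a copy of the DomainBed test suite.

```python
import os
import unittest


class StandaloneTest(unittest.TestCase):
    """Runs everywhere: needs no dataset directory."""

    def test_arithmetic(self):
        self.assertEqual(1 + 1, 2)


class DatasetDependentTest(unittest.TestCase):
    """Skipped unless the DATA_DIR environment variable is set."""

    @unittest.skipIf("DATA_DIR" not in os.environ,
                     "needs DATA_DIR environment variable")
    def test_data_dir_is_a_directory(self):
        self.assertTrue(os.path.isdir(os.environ["DATA_DIR"]))
```

When discovered by python -m unittest discover, the second test is reported as skipped unless DATA_DIR is set in the environment.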
This source code is released under the MIT license, included here.