/CSD

Implementation and datasets for Efficient Domain Generalization via Common-Specific Low-Rank Decomposition (https://arxiv.org/abs/2003.12815)

Primary LanguagePythonMIT LicenseMIT

This folder contains code accompanying the submission: Efficient Domain Generalization via Common-Specific Low-Rank Decomposition (ICML 2020).

Our proposed technique: Common Specific Decomposition (CSD) is extremely simple and can be easily incorporated in to your codestack by replacing the final layer with CSD. You may find the this isolated code nuggets of CSD for TensorFlow and PyTorch handy.

However, if you wish to reproduce numbers from our paper, we highly encourage that you use our codestack.

The folders are organized into

  1. rotation: Rotation tasks on MNIST and Fashion-MNIST
  2. hw: Hand-written tasks on LipitK and Nepali Character recognition datasets
  3. pacs: PACS evaluation with ResNet18 and AlexNet
  4. speech: Speech utterance classification evaluation

To be able to run all the experiments, you need the following packages for Python3.6

  1. Tensorflow <= 1.16 >1.13 for rotation, speech and hand-written experiments
  2. PyTorch >=1.0 for PACS
  3. Keras for loading datasets in rotation tasks.

The data for hand-written and rotation tasks are either provided or scripted to automatically download, however the PACS and speech dataset are to be downloaded and configured in order to run the provided code.

The following sections give more specific instructions on how to run the code.

Rotation tasks

CSD:

python main.py --dataset mnist --classifier mos

ERM:

python main.py --dataset mnist --classifier simple

Hand-written tasks

LipitK Task

python hw_train_and_test.py --dataset lipitk --num_train <number of train domains>

NHCD Task

python hw_train_and_test.py --dataset nhcd --lr 1e-4

Use flags: –simple or –cg for ERM or CG baselines.

PACS

CSD is coded in to the implementation of JigenDG, a previous domain generalizing approach.

Use the following steps in order to run experiments.

  1. Download and configure the PACS dataset such that all the paths in pacs/data/txt_lists are appropriate.
  2. Use pacs/run.sh either to run PACS evaluation using AlexNet or ResNet-18.

Speech task

You need to download Speech dataset and extract it in to speech/speech_dataset/ folder. It can be downloaded from: http://download.tensorflow.org/data/speech_commands_v0.01.tar.gz

You can then CSD using the follwing command.

$ cd speech;
$ python train.py --training_percentage <num train domains> --train_dir=<checkpoints folder> --model mos2 --seed 0 --learning_rate 2e-3 --how_many_epochs 500 --lmbda 0.5 --num_uids=<2(K)>

Feel free to open an issue or write to me if you find something missing or confusing.