Mixture-of-Adapters

Domain Generalization Using Large Pretrained Models with Mixture-of-Adapters (MoA)

Paper | Project Page

Prerequisites

Dataset Preparation

python -m domainbed.scripts.download --data_dir=/my/datasets/path

Environment Setup

conda create -n MoA python=3.9.12
conda activate MoA
pip install -r requirements.txt

Training

We use OpenCLIP ViT-B/16 for all experiments. The pretrained weights can be loaded through timm with the following Python call:

import timm
model = timm.create_model('vit_base_patch16_clip_224.laion2b', pretrained=True)

Full fine-tuning

python train_all.py [train_name] --data_dir [domainbed_data_dir] --algorithm ERM \
 --dataset DomainNet --model vitbase --seed 1

LoRA

python train_all.py [train_name] --data_dir [domainbed_data_dir] --algorithm ERM \
 --dataset DomainNet --model nf_vitbase_lora --r 2 --seed 1
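To give a feel for what a rank of 2 means, here is a small back-of-the-envelope calculation. It assumes a single LoRA adapter attached to one square 768 x 768 projection (768 is the ViT-B/16 embedding width); which layers the repository actually adapts is not stated here, and the helper name lora_param_count is made up for this example.

```python
def lora_param_count(d_in, d_out, r):
    """Trainable parameters added by one LoRA adapter.

    LoRA learns an update B @ A with A of shape (r, d_in) and
    B of shape (d_out, r), so it adds r * (d_in + d_out) weights.
    """
    return r * (d_in + d_out)


hidden = 768   # ViT-B/16 embedding width
rank = 2       # matches --r 2 in the command above

full = hidden * hidden                         # weights in one full projection
lora = lora_param_count(hidden, hidden, rank)  # weights in the low-rank update

print(f"full projection: {full} weights")
print(f"LoRA (r={rank}): {lora} weights ({100 * lora / full:.2f}% of full)")
```

Under these assumptions the adapter trains 3072 weights against 589824 in the full projection, roughly 0.52%.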

Mixture-of-LoRA

python train_all.py [train_name] --data_dir [domainbed_data_dir] --algorithm ERM \
 --dataset DomainNet --model nf_vitbase_moelora_last_qkv --seed 1
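For orientation, the idea of a mixture of LoRA adapters can be sketched in dependency-free Python: a frozen linear map plus several low-rank experts, with a router choosing which experts contribute for a given token. This is an illustrative sketch, not the repository's implementation. The class name MixtureOfLoRA, the number of experts, top-k selection, the softmax over selected scores, and the temperature are all assumptions; cosine-similarity routing is used only because the acknowledgements mention a Cosine Router.

```python
import math
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-8)


def rand_matrix(rows, cols, rng, scale=0.02):
    """Gaussian random matrix as a list of rows."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class MixtureOfLoRA:
    """Frozen linear map plus a top-k routed sum of low-rank updates (hypothetical)."""

    def __init__(self, d_in, d_out, r=2, num_experts=4, top_k=2,
                 temperature=0.1, seed=0):
        rng = random.Random(seed)
        self.W = rand_matrix(d_out, d_in, rng)  # stands in for the frozen pretrained weight
        # Usual LoRA initialisation: A_i random, B_i zero, so every update starts at zero.
        self.A = [rand_matrix(r, d_in, rng) for _ in range(num_experts)]
        self.B = [[[0.0] * r for _ in range(d_out)] for _ in range(num_experts)]
        # One routing embedding per expert (assumed form of a cosine router).
        self.expert_keys = rand_matrix(num_experts, d_in, rng, scale=1.0)
        self.top_k = top_k
        self.temperature = temperature

    def route(self, x):
        """Return (expert_index, gate_weight) pairs for the top-k experts."""
        scores = [cosine(x, key) / self.temperature for key in self.expert_keys]
        top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:self.top_k]
        peak = max(scores[i] for i in top)
        exps = [math.exp(scores[i] - peak) for i in top]  # softmax over selected experts only
        total = sum(exps)
        return [(i, e / total) for i, e in zip(top, exps)]

    def forward(self, x):
        """Compute W x + sum_i gate_i * B_i (A_i x) over the routed experts."""
        out = matvec(self.W, x)
        for i, gate in self.route(x):
            delta = matvec(self.B[i], matvec(self.A[i], x))
            out = [o + gate * d for o, d in zip(out, delta)]
        return out


rng = random.Random(1)
layer = MixtureOfLoRA(d_in=8, d_out=8)
x = [rng.gauss(0.0, 1.0) for _ in range(8)]
gates = layer.route(x)
y = layer.forward(x)
print("selected experts and gate weights:", gates)
```

Because every B matrix starts at zero, the layer initially reproduces the frozen projection exactly; the experts only begin to differ once training updates A and B.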

KAdaptation + Mixture-of-Attention (our best results)

python train_all.py nf_vitbase_moelora_every_qkv_new_laux --data_dir [domainbed_data_dir] --algorithm ERM \
 --dataset DomainNet --model nf_vitbase_moek_every_qkv_new --l_aux --seed 1
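If you want to repeat a run over several seeds, a small helper can assemble the command lines shown above. This is a convenience sketch: the function name build_train_command, the run-name pattern, and the seed values 1 to 3 are hypothetical, and only flags that appear in the commands above are used. It prints the commands and does not launch training.

```python
import shlex


def build_train_command(train_name, data_dir, model, seed,
                        dataset="DomainNet", algorithm="ERM", extra_flags=()):
    """Assemble a train_all.py invocation as an argument list (hypothetical helper)."""
    return [
        "python", "train_all.py", train_name,
        "--data_dir", data_dir,
        "--algorithm", algorithm,
        "--dataset", dataset,
        "--model", model,
        *extra_flags,              # e.g. ("--l_aux",) or ("--r", "2")
        "--seed", str(seed),
    ]


commands = [
    build_train_command(
        train_name=f"moek_every_qkv_seed{seed}",   # hypothetical run name
        data_dir="/my/datasets/path",              # placeholder path from the download step
        model="nf_vitbase_moek_every_qkv_new",
        seed=seed,
        extra_flags=("--l_aux",),
    )
    for seed in (1, 2, 3)                          # example seeds, chosen for illustration
]

for cmd in commands:
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the argument list can be passed to subprocess.run once the output looks right.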

Results

Acknowledgements

This code is heavily based on MIRO, SWAD, and DomainBed. The LoRA implementation is based on LoRA. We also use the official implementation of KAdaptation, and the Cosine Router is adapted from an existing GitHub implementation. We greatly appreciate the authors' work.

Citation

If you find this code useful, please consider citing our paper.

@article{lee2023domain,
  title={Domain Generalization Using Large Pretrained Models with Mixture-of-Adapters},
  author={Lee, Gyuseong and Jang, Wooseok and Kim, Jin Hyeon and Jung, Jaewoo and Kim, Seungryong},
  journal={arXiv preprint arXiv:2310.11031},
  year={2023}
}