[ICLR 2023 Spotlight] Mosaic Representation Learning for Self-supervised Visual Pre-training

This is a PyTorch implementation of our paper:

@inproceedings{wang2023mosaic,
  title={Mosaic Representation Learning for Self-supervised Visual Pre-training},
  author={Wang, Zhaoqing and Chen, Ziyu and Li, Yaqian and Guo, Yandong and Yu, Jun and Gong, Mingming and Liu, Tongliang},
  booktitle={The Eleventh International Conference on Learning Representations},
  year={2023}
}

Requirements

  • Python >= 3.7.12
  • PyTorch >= 1.10.2
  • torchvision >= 0.11.3

Install PyTorch and prepare the ImageNet dataset following the official PyTorch ImageNet training code.
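
The official PyTorch ImageNet example expects the dataset in the standard ImageFolder layout; the class folder name below is only an example:

$IMAGENET_DIR/
  train/
    n01440764/
      *.JPEG
    ...
  val/
    n01440764/
      *.JPEG
    ...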

For other dependencies, please run:

pip install -r requirements.txt

Webdataset

We use webdataset to speed up our training process. To convert the ImageNet dataset into webdataset format, please run:

python imagenet2wds.py -i $IMAGENET_DIR -o $OUTPUT_FOLDER
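
The resulting shards can then be read with the webdataset library. The sketch below is only an illustration: the shard naming pattern, shard count, and sample keys ("jpg", "cls") are assumptions, so check the output of imagenet2wds.py for the actual names.

import webdataset as wds
from torchvision import transforms

transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.ToTensor(),
])

# Placeholder shard pattern; adjust to what imagenet2wds.py actually writes.
shards = "OUTPUT_FOLDER/imagenet-train-{000000..001281}.tar"

dataset = (
    wds.WebDataset(shards)
    .shuffle(1000)
    .decode("pil")
    .to_tuple("jpg", "cls")
    .map_tuple(transform, lambda label: label)
)

loader = wds.WebLoader(dataset, batch_size=64, num_workers=8)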

Unsupervised Pre-training

This implementation only supports multi-GPU DistributedDataParallel training, which is faster and simpler; single-GPU and DataParallel training are not supported.

To do unsupervised pre-training of a ResNet-50 model on ImageNet on an 8-GPU machine, please run:

# MoCo-v2
bash pretrain_moco.sh

# MosRep
bash pretrain_mosrep.sh
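
For intuition, the sketch below shows how four 112x112 crops can be tiled into a single 224x224 mosaic view, as in the "2x224 + 4x112" setting listed under Models. It is only an illustration, not the exact augmentation pipeline used in pretrain_mosrep.sh.

import torch

def make_mosaic(crops):
    # Tile four (3, 112, 112) crops into one (3, 224, 224) mosaic:
    # two crops side by side on the top row, two on the bottom row.
    assert len(crops) == 4
    top = torch.cat(crops[:2], dim=2)
    bottom = torch.cat(crops[2:], dim=2)
    return torch.cat([top, bottom], dim=1)

mosaic = make_mosaic([torch.rand(3, 112, 112) for _ in range(4)])
print(mosaic.shape)  # torch.Size([3, 224, 224])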

Linear Classification

With a pre-trained model, to train a supervised linear classifier on frozen features/weights on an 8-GPU machine, please run:

# MoCo-v2
bash linear_moco.sh

# MosRep
bash linear_mosrep.sh
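
Below is a minimal sketch of how a MoCo-style pre-trained checkpoint is typically loaded into a torchvision ResNet-50 before linear probing. The checkpoint layout ("state_dict", the "module.encoder_q." prefix) is an assumption; refer to linear_mosrep.sh and the script it launches for the exact logic.

import torch
import torchvision.models as models

model = models.resnet50()
checkpoint = torch.load("pretrained_checkpoint.pth.tar", map_location="cpu")
state_dict = checkpoint.get("state_dict", checkpoint)

# Keep only the query-encoder backbone weights; drop the projection head.
backbone = {
    k.replace("module.encoder_q.", ""): v
    for k, v in state_dict.items()
    if k.startswith("module.encoder_q.") and not k.startswith("module.encoder_q.fc")
}
model.load_state_dict(backbone, strict=False)

# Freeze everything except the final linear classifier.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("fc.")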

Models

Our pre-trained ResNet-50 models can be downloaded as follows:

Method  | Input view    | Epochs | Linear probing top-1 acc. (%) | Model
MoCo v2 | 2x224         | 200    | 67.7                          | download
MoCo v2 | 2x224 + 4x112 | 200    | 69.8                          | download
MosRep  | 2x224 + 4x112 | 200    | 72.3                          | download

License

This project is under the MIT license. See the LICENSE file for more details.