AdCo is a contrastive-learning-based self-supervised learning method, published at CVPR 2021.
Copyright (C) 2020 Qianjiang Hu*, Xiao Wang*, Wei Hu, Guo-Jun Qi
License: MIT for academic use.
Contact: Guo-Jun Qi (
Contrastive learning relies on constructing a collection of negative examples that are sufficiently hard to discriminate against positive queries when their representations are self-trained. Existing contrastive learning methods either maintain a queue of negative samples over minibatches, of which only a small portion is updated in each iteration, or use only the other examples from the current minibatch as negatives. The former cannot closely track the change of the learned representation over iterations, because the queue is never updated as a whole; the latter discards useful information from past minibatches. Instead, we propose to directly learn a set of negative adversaries playing against the self-trained representation. Two players, the representation network and the negative adversaries, are alternately updated to obtain the most challenging negative examples against which the representation of positive queries will be trained to discriminate. We further show that the negative adversaries are updated towards a weighted combination of positive queries by maximizing the adversarial contrastive loss, thereby allowing them to closely track the change of representations over time. Experimental results demonstrate that the proposed Adversarial Contrastive (AdCo) model not only achieves superior performance (a top-1 accuracy of 73.2% over 200 epochs and 75.7% over 800 epochs with linear evaluation on ImageNet), but can also be pre-trained more efficiently, with much shorter GPU time and fewer epochs.
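As a rough illustration of the "weighted combination of positive queries" claim (not the repository's implementation), the sketch below computes an InfoNCE-style contrastive loss in plain Python and takes one gradient-ascent step on the negatives. The function names, the toy vectors, and the re-normalization step are assumptions made for illustration. For a single query `q`, the gradient of the loss with respect to negative `n_j` is `p_j * q / temperature`, where `p_j` is the softmax probability assigned to `n_j`, so each negative moves towards a probability-weighted sum of the queries.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def softmax_weights(query, positive, negatives, temperature):
    """Return (p_positive, [p_negative_j]) for one query."""
    logits = [dot(query, positive) / temperature]
    logits += [dot(query, n) / temperature for n in negatives]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - top) for value in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return probs[0], probs[1:]


def contrastive_loss(queries, positives, negatives, temperature):
    """Mean of -log p_positive over all queries."""
    loss = 0.0
    for q, k in zip(queries, positives):
        p_pos, _ = softmax_weights(q, k, negatives, temperature)
        loss -= math.log(p_pos)
    return loss / len(queries)


def update_negatives(queries, positives, negatives, temperature, step):
    """One gradient-ascent step on the loss with respect to the negatives."""
    dim = len(negatives[0])
    grads = [[0.0] * dim for _ in negatives]
    for q, k in zip(queries, positives):
        _, p_neg = softmax_weights(q, k, negatives, temperature)
        for j, p in enumerate(p_neg):
            for d in range(dim):
                # d(loss)/d(n_j) = p_j * q / temperature, averaged over queries
                grads[j][d] += p * q[d] / (temperature * len(queries))
    # Assumption: negatives are kept on the unit sphere after each step.
    return [
        normalize([n[d] + step * g[d] for d in range(dim)])
        for n, g in zip(negatives, grads)
    ]


# Toy example: one query, one positive, two negatives orthogonal to the query.
queries = [[1.0, 0.0]]
positives = [normalize([1.0, 0.1])]
negatives = [[0.0, 1.0], [0.0, -1.0]]

loss_before = contrastive_loss(queries, positives, negatives, temperature=0.5)
harder = update_negatives(queries, positives, negatives, temperature=0.5, step=0.5)
loss_after = contrastive_loss(queries, positives, harder, temperature=0.5)
print(loss_before, loss_after)
```

In this toy setting both negatives rotate towards the query, which makes them harder to discriminate and raises the loss. The actual training alternates such a step with an ordinary gradient-descent step on the representation network.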
CUDA version should be 10.1 or higher.
1. Install git
git clone && cd AdCo
You have two options for installing the dependencies on your computer:
3.1.1 Install pip
pip install -r requirements.txt --user
If you encounter any errors, you can install each library one by one:
pip install torch==1.7.1
pip install torchvision==0.8.2
pip install numpy==1.19.5
pip install Pillow==5.1.0
pip install tensorboard==1.14.0
pip install tensorboardX==1.7
3.2.1 Install conda
conda create -n AdCo python=3.6.9
conda activate AdCo
pip install -r requirements.txt
Each time you want to run the code, activate the environment with:
conda activate AdCo
conda deactivate  # run this when you want to exit the environment
4.1 Download the ImageNet2012 Dataset under "./datasets/imagenet2012".
4.3 Move the validation images into labeled subfolders, using the following shell script
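For readers who prefer Python, the same step can be sketched as below. This is a minimal sketch rather than the script referenced above: the function name and the `filename_to_class` mapping are hypothetical, and building that mapping from the official ImageNet validation annotations is left to you.

```python
import os
import shutil


def sort_validation_images(val_dir, filename_to_class):
    """Move each validation image into a subfolder named after its class.

    `filename_to_class` is a hypothetical mapping from image filename to
    class folder name, e.g. {"val_0001.JPEG": "class_a"} (made-up names).
    Returns the number of files that were moved.
    """
    moved = 0
    for filename, class_id in filename_to_class.items():
        source = os.path.join(val_dir, filename)
        if not os.path.isfile(source):
            continue  # already moved or missing, so skip it
        class_dir = os.path.join(val_dir, class_id)
        os.makedirs(class_dir, exist_ok=True)
        shutil.move(source, os.path.join(class_dir, filename))
        moved += 1
    return moved
```

After this step, the validation directory should contain one subfolder per class, which is the layout that folder-based image loaders generally expect.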
This implementation only supports multi-GPU DistributedDataParallel training, which is faster and simpler; single-GPU or DataParallel training is not supported.
python3 --sym=0 --lr=0.03 --memory_lr=3 --moco_t=0.12 --mem_t=0.02 --data=./datasets/imagenet2012 --dist_url=tcp://localhost:10001
python3 --sym=1 --lr=0.03 --memory_lr=3 --moco_t=0.12 --mem_t=0.02 --data=./datasets/imagenet2012 --dist_url=tcp://localhost:10001
# e.g., training with 8192 negative samples and symmetrical loss
python3 --sym=1 --lr=0.04 --memory_lr=3 --moco_t=0.14 --mem_t=0.03 --cluster 8192 --data=./datasets/imagenet2012 --dist_url=tcp://localhost:10001
python3 --multi_crop=1 --lr=0.03 --memory_lr=3 --moco_t=0.12 --mem_t=0.02 --data=./datasets/imagenet2012 --dist_url=tcp://localhost:10001
Multi-crop with symmetrical loss is not yet supported; this feature will be added in the future.
With a pre-trained model, we can easily evaluate its performance on ImageNet with:
python3 --data=./datasets/imagenet2012 --dist-url=tcp://localhost:10001 --pretrained=input.pth.tar
Pre-train network | Pre-train epochs | Crop | Symmetrical loss | AdCo top-1 acc. (%) | Model link |
ResNet-50 | 200 | Single | No | 68.6 | model |
ResNet-50 | 200 | Multi | No | 73.2 | model |
ResNet-50 | 800 | Single | No | 72.8 | None |
ResNet-50 | 800 | Multi | No | 75.7 | None |
ResNet-50 | 200 | Single | Yes | 70.6 | model |
We are sorry that we cannot provide the 800-epoch models: they were trained on company machines, and company regulations prevent their release. For downstream tasks, we found that the multi-crop 200-epoch model performs similarly, so we suggest using that model for downstream purposes.
Performance with different numbers of negative samples:
Pre-train network | Pre-train epochs | Negative samples | Symmetrical loss | AdCo top-1 acc. (%) | Model link |
ResNet-50 | 200 | 65536 | No | 68.6 | model |
ResNet-50 | 200 | 65536 | Yes | 70.6 | model |
ResNet-50 | 200 | 16384 | No | 68.6 | model |
ResNet-50 | 200 | 16384 | Yes | 70.2 | model |
ResNet-50 | 200 | 8192 | No | 68.4 | model |
ResNet-50 | 200 | 8192 | Yes | 70.2 | model |
These results were obtained on a single machine with 8 V100 GPUs.
1. Download the dataset under "./datasets/voc"
python3 --data=../datasets/voc --pretrained=../input.pth.tar
Here the VOC directory should be the directory that contains the "vockit" directory.
1. Download the dataset under "./datasets/places205"
python3 --dataset=Place205 --sgdr=1 --data=./datasets/places205 --lr=5 --dist-url=tcp://localhost:10001 --pretrained=input.pth.tar
1. Install detectron2.
# in detection folder
python3 input.pth.tar output.pkl
3. Download the VOC and COCO datasets under the "./detection/datasets" directory,
following the directory structure required by detectron2.
The number of GPUs affects the overall batch size, so all experiments should be run with 8 GPUs. If you use fewer GPUs, please adjust SOLVER.BASE_LR to suit your setup.
cd detection
python --config-file configs/pascal_voc_R_50_C4_24k_adco.yaml --num-gpus 8 MODEL.WEIGHTS ./output.pkl
The number of GPUs affects the overall batch size, so all experiments should be run with 8 GPUs. If you use fewer GPUs, please adjust SOLVER.BASE_LR to suit your setup.
cd detection
python --config-file configs/coco_R_50_C4_2x_adco.yaml --num-gpus 8 MODEL.WEIGHTS ./output.pkl
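One common heuristic for adjusting SOLVER.BASE_LR with fewer GPUs is the linear scaling rule: scale the learning rate in proportion to the overall batch size. The sketch below assumes the overall batch size shrinks in proportion to the number of GPUs; the function name is hypothetical and the reference learning rate of 0.02 is a placeholder, not a value taken from the provided configs. This is a starting point for tuning, not a recommendation from the authors.

```python
def scale_base_lr(reference_lr, reference_gpus, available_gpus):
    """Linear scaling rule: learning rate proportional to the GPU count.

    Assumes the per-GPU batch size stays fixed, so the overall batch size
    (and therefore the learning rate) scales with the number of GPUs.
    """
    if reference_gpus <= 0 or available_gpus <= 0:
        raise ValueError("GPU counts must be positive")
    return reference_lr * available_gpus / reference_gpus


# Placeholder reference value; read the real one from the config you use.
suggested_lr = scale_base_lr(reference_lr=0.02, reference_gpus=8, available_gpus=4)
print(suggested_lr)
```

The resulting value could then be supplied as an extra `SOLVER.BASE_LR <value>` override on the command line, next to the `MODEL.WEIGHTS` override shown above, and refined from there.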
title={Adco: Adversarial contrast for efficient learning of unsupervised representations from self-trained negative adversaries},
author={Hu, Qianjiang and Wang, Xiao and Hu, Wei and Qi, Guo-Jun},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},