A library of self-supervised methods for unsupervised visual representation learning powered by PyTorch Lightning. We aim to provide SOTA self-supervised methods in a comparable environment while also implementing training tricks. Although the library is self-contained, the models can be used outside of solo-learn.
- [Aug 13 2021]: DeepCluster V2 is now available.
- [Jul 31 2021]: ReSSL is now available.
- [Jul 21 2021]: Added Custom Dataset support.
- [Jul 21 2021]: Added AutoUMAP.
- Barlow Twins
- BYOL
- DeepCluster V2
- DINO
- MoCo V2+
- NNCLR
- ReSSL
- SimCLR + Supervised Contrastive Learning
- SimSiam
- SwAV
- VICReg
- W-MSE
- Increased data processing speed by up to 100% using NVIDIA DALI
- Asymmetric and symmetric augmentations
- Online linear evaluation via stop-gradient for easier debugging and prototyping (optionally available for the momentum encoder as well)
- Normal offline linear evaluation
- All the perks of PyTorch Lightning (mixed precision, gradient accumulation, clipping, automatic logging and much more)
- Easy-to-extend modular code structure
- Custom model logging with a simpler file organization
- Automatic feature space visualization with UMAP
- Common metrics and more to come...
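The online linear evaluation above hinges on a stop-gradient: the probe's classification loss must not update the encoder. A minimal PyTorch sketch of the idea (module names here are illustrative, not solo-learn's API):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Linear(8, 4)  # stand-in for the backbone
probe = nn.Linear(4, 3)    # online linear classifier

x = torch.randn(2, 8)
y = torch.tensor([0, 2])

feats = encoder(x)
# detach() is the stop-gradient: the probe trains on frozen features,
# so its loss never influences the self-supervised representation
logits = probe(feats.detach())
F.cross_entropy(logits, y).backward()

print(encoder.weight.grad is None)    # True: encoder untouched by the probe
print(probe.weight.grad is not None)  # True: probe still learns
```

This is why the online accuracy can be monitored during pretraining without contaminating the representation being learned.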
- Multi-cropping dataloading following SwAV:
- Note: currently, only SimCLR supports this
- Exclude batchnorm and biases from LARS
- No LR scheduler for the projection head in SimSiam
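Excluding batchnorm and biases from LARS is typically done by splitting parameters into two optimizer groups. A framework-agnostic sketch of the usual heuristic (the helper and names below are ours, not solo-learn's code): 1-D tensors, i.e. biases and batchnorm affine parameters, go into a group with no weight decay and no LARS adaptation.

```python
def exclude_from_lars(shape) -> bool:
    # biases and batchnorm scales/shifts are 1-D tensors
    return len(shape) <= 1

# toy stand-in for model.named_parameters()
params = [
    ("conv1.weight", (64, 3, 7, 7)),
    ("bn1.weight", (64,)),
    ("bn1.bias", (64,)),
    ("fc.weight", (10, 512)),
    ("fc.bias", (10,)),
]

lars_group = [n for n, s in params if not exclude_from_lars(s)]
plain_group = [n for n, s in params if exclude_from_lars(s)]
print(lars_group)   # ['conv1.weight', 'fc.weight']
print(plain_group)  # ['bn1.weight', 'bn1.bias', 'fc.bias']
```

In PyTorch, these two lists would become separate optimizer parameter groups, with `weight_decay=0` and a LARS-exclusion flag set on the second group.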
- torch
- tqdm
- einops
- wandb
- pytorch-lightning
- lightning-bolts
Optional:
- nvidia-dali

NOTE: if you are using CUDA 10.X, change nvidia-dali-cuda110 to nvidia-dali-cuda100 in setup.py, line 7.
To install the repository with DALI and/or UMAP support, use:

`pip3 install .[dali,umap]`

If no DALI/UMAP support is needed, the repository can be installed with:

`pip3 install .`

NOTE: If you want to modify the library, install it in dev mode with `pip3 install -e .`
NOTE 2: Soon to be on pip.
For pretraining the encoder, follow one of the many bash files in `bash_files/pretrain/`.

After that, for offline linear evaluation, follow the examples in `bash_files/linear/`.
NOTE: The bash files are kept up-to-date and follow the recommended parameters of each paper as closely as possible, but check them before running.

Note: hyperparameters may not be the best; we will eventually re-run the methods with lower performance.
CIFAR-10

Method | Backbone | Epochs | Dali | Acc@1 (online) | Acc@1 (offline) | Acc@5 (online) | Acc@5 (offline) | Checkpoint |
---|---|---|---|---|---|---|---|---|
Barlow Twins | ResNet18 | 1000 | ❌ | 92.10 | | 99.73 | | 🔗 |
BYOL | ResNet18 | 1000 | ❌ | 92.58 | | 99.79 | | 🔗 |
DeepCluster V2 | ResNet18 | 1000 | ❌ | 88.85 | | 99.58 | | 🔗 |
DINO | ResNet18 | 1000 | ❌ | 89.52 | | 99.71 | | 🔗 |
MoCo V2+ | ResNet18 | 1000 | ❌ | 92.94 | | 99.79 | | 🔗 |
NNCLR | ResNet18 | 1000 | ❌ | 91.88 | | 99.78 | | 🔗 |
ReSSL | ResNet18 | 1000 | ❌ | 90.63 | | 99.62 | | 🔗 |
SimCLR | ResNet18 | 1000 | ❌ | 90.74 | | 99.75 | | 🔗 |
SimSiam | ResNet18 | 1000 | ❌ | 90.51 | | 99.72 | | 🔗 |
SwAV | ResNet18 | 1000 | ❌ | 89.17 | | 99.68 | | 🔗 |
VICReg | ResNet18 | 1000 | ❌ | 92.07 | | 99.74 | | 🔗 |
W-MSE | ResNet18 | 1000 | ❌ | 88.67 | | 99.68 | | 🔗 |
CIFAR-100

Method | Backbone | Epochs | Dali | Acc@1 (online) | Acc@1 (offline) | Acc@5 (online) | Acc@5 (offline) | Checkpoint |
---|---|---|---|---|---|---|---|---|
Barlow Twins | ResNet18 | 1000 | ❌ | 70.90 | | 91.91 | | 🔗 |
BYOL | ResNet18 | 1000 | ❌ | 70.46 | | 91.96 | | 🔗 |
DeepCluster V2 | ResNet18 | 1000 | ❌ | 63.61 | | 88.09 | | 🔗 |
DINO | ResNet18 | 1000 | ❌ | 66.76 | | 90.34 | | 🔗 |
MoCo V2+ | ResNet18 | 1000 | ❌ | 69.89 | | 91.65 | | 🔗 |
NNCLR | ResNet18 | 1000 | ❌ | 69.62 | | 91.52 | | 🔗 |
ReSSL | ResNet18 | 1000 | ❌ | 65.92 | | 89.73 | | 🔗 |
SimCLR | ResNet18 | 1000 | ❌ | 65.78 | | 89.04 | | 🔗 |
SimSiam | ResNet18 | 1000 | ❌ | 66.04 | | 89.62 | | 🔗 |
SwAV | ResNet18 | 1000 | ❌ | 64.88 | | 88.78 | | 🔗 |
VICReg | ResNet18 | 1000 | ❌ | 68.54 | | 90.83 | | 🔗 |
W-MSE | ResNet18 | 1000 | ❌ | 61.33 | | 87.26 | | 🔗 |
ImageNet-100

Method | Backbone | Epochs | Dali | Acc@1 (online) | Acc@1 (offline) | Acc@5 (online) | Acc@5 (offline) | Checkpoint |
---|---|---|---|---|---|---|---|---|
Barlow Twins 🚀 | ResNet18 | 400 | ✔️ | 80.38 | 80.16 | 95.28 | 95.14 | 🔗 |
BYOL 🚀 | ResNet18 | 400 | ✔️ | 79.76 | 80.16 | 94.80 | 95.14 | 🔗 |
DeepCluster V2 | ResNet18 | 400 | ❌ | 75.36 | 75.40 | 93.22 | 93.10 | 🔗 |
DINO | ResNet18 | 400 | ✔️ | 74.84 | 74.92 | 92.92 | 92.78 | 🔗 |
MoCo V2+ 🚀 | ResNet18 | 400 | ✔️ | 78.20 | 79.28 | 95.50 | 95.18 | 🔗 |
NNCLR 🚀 | ResNet18 | 400 | ✔️ | 79.80 | 80.16 | 95.28 | 95.30 | 🔗 |
ReSSL | ResNet18 | 400 | ✔️ | 76.92 | 78.48 | 94.20 | 94.24 | 🔗 |
SimCLR 🚀 | ResNet18 | 400 | ✔️ | 77.04 | 77.48 | 94.02 | 93.42 | 🔗 |
SimSiam | ResNet18 | 400 | ✔️ | 74.54 | 78.72 | 93.16 | 94.78 | 🔗 |
SwAV | ResNet18 | 400 | ✔️ | 74.04 | 74.28 | 92.70 | 92.84 | 🔗 |
VICReg 🚀 | ResNet18 | 400 | ✔️ | 79.22 | 79.40 | 95.06 | 95.02 | 🔗 |
W-MSE | ResNet18 | 400 | ✔️ | 67.60 | 69.06 | 90.94 | 91.22 | 🔗 |

🚀 marks methods where hyperparameters were heavily tuned.
ImageNet

Method | Backbone | Epochs | Dali | Acc@1 (online) | Acc@1 (offline) | Acc@5 (online) | Acc@5 (offline) | Checkpoint |
---|---|---|---|---|---|---|---|---|
Barlow Twins | ResNet50 | 100 | ✔️ | | | | | |
BYOL | ResNet50 | 100 | ✔️ | 68.63 | 68.37 | 88.80 | 88.66 | 🔗 |
DeepCluster V2 | ResNet50 | 100 | ✔️ | | | | | |
DINO | ResNet50 | 100 | ✔️ | | | | | |
MoCo V2+ | ResNet50 | 100 | ✔️ | | | | | |
NNCLR | ResNet50 | 100 | ✔️ | | | | | |
ReSSL | ResNet50 | 100 | ✔️ | | | | | |
SimCLR | ResNet50 | 100 | ✔️ | | | | | |
SimSiam | ResNet50 | 100 | ✔️ | | | | | |
SwAV | ResNet50 | 100 | ✔️ | | | | | |
VICReg | ResNet50 | 100 | ✔️ | | | | | |
W-MSE | ResNet50 | 100 | ✔️ | | | | | |
We report the training efficiency of some methods using a ResNet18, with and without DALI (4 workers per GPU), on a server with an Intel i9-9820X CPU and two RTX 2080 Ti GPUs.
Method | Dali | Total time for 20 epochs | Time per epoch | GPU memory (per GPU) |
---|---|---|---|---|
Barlow Twins | ❌ | 1h 38m 27s | 4m 55s | 5097 MB |
Barlow Twins | ✔️ | 43m 2s | 2m 10s (56% faster) | 9292 MB |
BYOL | ❌ | 1h 38m 46s | 4m 56s | 5409 MB |
BYOL | ✔️ | 50m 33s | 2m 31s (49% faster) | 9521 MB |
NNCLR | ❌ | 1h 38m 30s | 4m 55s | 5060 MB |
NNCLR | ✔️ | 42m 3s | 2m 6s (64% faster) | 9244 MB |
Note: the GPU memory increase scales with the number of DALI workers rather than with the model.
If you use solo-learn, please cite our preprint:
@misc{turrisi2021sololearn,
title={Solo-learn: A Library of Self-supervised Methods for Visual Representation Learning},
author={Victor G. Turrisi da Costa and Enrico Fini and Moin Nabi and Nicu Sebe and Elisa Ricci},
year={2021},
eprint={2108.01775},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://github.com/vturrisi/solo-learn},
}