A library of self-supervised methods for unsupervised visual representation learning powered by PyTorch Lightning. We aim to provide SOTA self-supervised methods in a comparable environment while also implementing training tricks. Although the library is self-contained, the models can be used outside of solo-learn.
- [Aug 13 2021]: DeepCluster V2 is now available.
- [Jul 31 2021]: ReSSL is now available.
- [Jul 21 2021]: Added Custom Dataset support.
- [Jul 21 2021]: Added AutoUMAP.
- Barlow Twins
- BYOL
- DeepCluster V2
- DINO
- MoCo V2+
- NNCLR
- ReSSL
- SimCLR + Supervised Contrastive Learning
- SimSiam
- SwAV
- VICReg
- W-MSE
- Increased data processing speed by up to 100% using NVIDIA DALI
- Asymmetric and symmetric augmentations
- Online linear evaluation via stop-gradient for easier debugging and prototyping (optionally available for the momentum encoder as well)
- Normal offline linear evaluation
- All the perks of PyTorch Lightning (mixed precision, gradient accumulation, clipping, automatic logging and much more)
- Easy-to-extend modular code structure
- Custom model logging with a simpler file organization
- Automatic feature space visualization with UMAP
- Common metrics and more to come...
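The online linear evaluation above hinges on a stop-gradient: the probe's classification loss must not update the encoder. A minimal PyTorch sketch of the idea (module names here are illustrative, not solo-learn's API):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Linear(8, 4)  # stand-in for the backbone
probe = nn.Linear(4, 3)    # online linear classifier

x = torch.randn(2, 8)
y = torch.tensor([0, 2])

feats = encoder(x)
# detach() is the stop-gradient: the probe trains on frozen features,
# so its loss never influences the self-supervised representation
logits = probe(feats.detach())
F.cross_entropy(logits, y).backward()

print(encoder.weight.grad is None)    # True: encoder untouched by the probe
print(probe.weight.grad is not None)  # True: probe still learns
```

This is why the online accuracy can be monitored during pretraining without contaminating the representation being learned.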
- Multi-cropping dataloading following SwAV:
- Note: currently, only SimCLR supports this
- Exclude batchnorm and biases from LARS
- No LR scheduler for the projection head in SimSiam
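Excluding batchnorm and biases from LARS is typically done by splitting parameters into two optimizer groups. A framework-agnostic sketch of the usual heuristic (the helper and names below are ours, not solo-learn's code): 1-D tensors, i.e. biases and batchnorm affine parameters, go into a group with no weight decay and no LARS adaptation.

```python
def exclude_from_lars(shape) -> bool:
    # biases and batchnorm scales/shifts are 1-D tensors
    return len(shape) <= 1

# toy stand-in for model.named_parameters()
params = [
    ("conv1.weight", (64, 3, 7, 7)),
    ("bn1.weight", (64,)),
    ("bn1.bias", (64,)),
    ("fc.weight", (10, 512)),
    ("fc.bias", (10,)),
]

lars_group = [n for n, s in params if not exclude_from_lars(s)]
plain_group = [n for n, s in params if exclude_from_lars(s)]
print(lars_group)   # ['conv1.weight', 'fc.weight']
print(plain_group)  # ['bn1.weight', 'bn1.bias', 'fc.bias']
```

In PyTorch, these two lists would become separate optimizer parameter groups, with `weight_decay=0` and a LARS-exclusion flag set on the second group.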
- torch
- tqdm
- einops
- wandb
- pytorch-lightning
- lightning-bolts
Optional:
- nvidia-dali

NOTE: if you are using CUDA 10.X, change nvidia-dali-cuda110 to nvidia-dali-cuda100 in setup.py, line 7.
To install the repository with DALI and/or UMAP support, use:

`pip3 install .[dali,umap]`

If no DALI/UMAP support is needed, the repository can be installed with:

`pip3 install .`

NOTE: If you want to modify the library, install it in dev mode with `pip3 install -e .`
NOTE 2: Soon to be on pip.
For pretraining the encoder, follow one of the many bash files in `bash_files/pretrain/`.

After that, for offline linear evaluation, follow the examples in `bash_files/linear/`.
NOTE: The bash files are kept up-to-date and follow the recommended parameters of each paper as closely as possible, but check them before running.

Note: hyperparameters may not be the best; we will eventually re-run the methods with lower performance.
CIFAR-10

Method | Backbone | Epochs | Dali | Acc@1 (online) | Acc@1 (offline) | Acc@5 (online) | Acc@5 (offline) | Checkpoint |
---|---|---|---|---|---|---|---|---|
Barlow Twins | ResNet18 | 1000 | ❌ | 92.10 | | 99.73 | | 🔗 |
BYOL | ResNet18 | 1000 | ❌ | 92.58 | | 99.79 | | 🔗 |
DeepCluster V2 | ResNet18 | 1000 | ❌ | 88.85 | | 99.58 | | 🔗 |
DINO | ResNet18 | 1000 | ❌ | 89.52 | | 99.71 | | 🔗 |
MoCo V2+ | ResNet18 | 1000 | ❌ | 92.94 | | 99.79 | | 🔗 |
NNCLR | ResNet18 | 1000 | ❌ | 91.88 | | 99.78 | | 🔗 |
ReSSL | ResNet18 | 1000 | ❌ | 90.63 | | 99.62 | | 🔗 |
SimCLR | ResNet18 | 1000 | ❌ | 90.74 | | 99.75 | | 🔗 |
SimSiam | ResNet18 | 1000 | ❌ | 90.51 | | 99.72 | | 🔗 |
SwAV | ResNet18 | 1000 | ❌ | 89.17 | | 99.68 | | 🔗 |
VICReg | ResNet18 | 1000 | ❌ | 92.07 | | 99.74 | | 🔗 |
W-MSE | ResNet18 | 1000 | ❌ | 88.67 | | 99.68 | | 🔗 |
CIFAR-100

Method | Backbone | Epochs | Dali | Acc@1 (online) | Acc@1 (offline) | Acc@5 (online) | Acc@5 (offline) | Checkpoint |
---|---|---|---|---|---|---|---|---|
Barlow Twins | ResNet18 | 1000 | ❌ | 70.90 | | 91.91 | | 🔗 |
BYOL | ResNet18 | 1000 | ❌ | 70.46 | | 91.96 | | 🔗 |
DeepCluster V2 | ResNet18 | 1000 | ❌ | 63.61 | | 88.09 | | 🔗 |
DINO | ResNet18 | 1000 | ❌ | 66.76 | | 90.34 | | 🔗 |
MoCo V2+ | ResNet18 | 1000 | ❌ | 69.89 | | 91.65 | | 🔗 |
NNCLR | ResNet18 | 1000 | ❌ | 69.62 | | 91.52 | | 🔗 |
ReSSL | ResNet18 | 1000 | ❌ | 65.92 | | 89.73 | | 🔗 |
SimCLR | ResNet18 | 1000 | ❌ | 65.78 | | 89.04 | | 🔗 |
SimSiam | ResNet18 | 1000 | ❌ | 66.04 | | 89.62 | | 🔗 |
SwAV | ResNet18 | 1000 | ❌ | 64.88 | | 88.78 | | 🔗 |
VICReg | ResNet18 | 1000 | ❌ | 68.54 | | 90.83 | | 🔗 |
W-MSE | ResNet18 | 1000 | ❌ | 61.33 | | 87.26 | | 🔗 |
ImageNet-100

Method | Backbone | Epochs | Dali | Acc@1 (online) | Acc@1 (offline) | Acc@5 (online) | Acc@5 (offline) | Checkpoint |
---|---|---|---|---|---|---|---|---|
Barlow Twins 🚀 | ResNet18 | 400 | ✔️ | 80.38 | 80.16 | 95.28 | 95.14 | 🔗 |
BYOL 🚀 | ResNet18 | 400 | ✔️ | 79.76 | 80.16 | 94.80 | 95.14 | 🔗 |
DeepCluster V2 | ResNet18 | 400 | ❌ | 75.36 | 75.40 | 93.22 | 93.10 | 🔗 |
DINO | ResNet18 | 400 | ✔️ | 74.84 | 74.92 | 92.92 | 92.78 | 🔗 |
MoCo V2+ 🚀 | ResNet18 | 400 | ✔️ | 78.20 | 79.28 | 95.50 | 95.18 | 🔗 |
NNCLR 🚀 | ResNet18 | 400 | ✔️ | 79.80 | 80.16 | 95.28 | 95.30 | 🔗 |
ReSSL | ResNet18 | 400 | ✔️ | 76.92 | 78.48 | 94.20 | 94.24 | 🔗 |
SimCLR 🚀 | ResNet18 | 400 | ✔️ | 77.04 | 77.48 | 94.02 | 93.42 | 🔗 |
SimSiam | ResNet18 | 400 | ✔️ | 74.54 | 78.72 | 93.16 | 94.78 | 🔗 |
SwAV | ResNet18 | 400 | ✔️ | 74.04 | 74.28 | 92.70 | 92.84 | 🔗 |
VICReg 🚀 | ResNet18 | 400 | ✔️ | 79.22 | 79.40 | 95.06 | 95.02 | 🔗 |
W-MSE | ResNet18 | 400 | ✔️ | 67.60 | 69.06 | 90.94 | 91.22 | 🔗 |

🚀 marks methods where hyperparameters were heavily tuned.
ImageNet

Method | Backbone | Epochs | Dali | Acc@1 (online) | Acc@1 (offline) | Acc@5 (online) | Acc@5 (offline) | Checkpoint |
---|---|---|---|---|---|---|---|---|
Barlow Twins | ResNet50 | 100 | ✔️ | | | | | |
BYOL | ResNet50 | 100 | ✔️ | 68.63 | 68.37 | 88.80 | 88.66 | 🔗 |
DeepCluster V2 | ResNet50 | 100 | ✔️ | | | | | |
DINO | ResNet50 | 100 | ✔️ | | | | | |
MoCo V2+ | ResNet50 | 100 | ✔️ | | | | | |
NNCLR | ResNet50 | 100 | ✔️ | | | | | |
ReSSL | ResNet50 | 100 | ✔️ | | | | | |
SimCLR | ResNet50 | 100 | ✔️ | | | | | |
SimSiam | ResNet50 | 100 | ✔️ | | | | | |
SwAV | ResNet50 | 100 | ✔️ | | | | | |
VICReg | ResNet50 | 100 | ✔️ | | | | | |
W-MSE | ResNet50 | 100 | ✔️ | | | | | |
We report the training efficiency of some methods using a ResNet18, with and without DALI (4 workers per GPU), on a server with an Intel i9-9820X CPU and two RTX 2080 Ti GPUs.
Method | Dali | Total time for 20 epochs | Time per epoch | GPU memory (per GPU) |
---|---|---|---|---|
Barlow Twins | ❌ | 1h 38m 27s | 4m 55s | 5097 MB |
Barlow Twins | ✔️ | 43m 2s | 2m 10s (56% faster) | 9292 MB |
BYOL | ❌ | 1h 38m 46s | 4m 56s | 5409 MB |
BYOL | ✔️ | 50m 33s | 2m 31s (49% faster) | 9521 MB |
NNCLR | ❌ | 1h 38m 30s | 4m 55s | 5060 MB |
NNCLR | ✔️ | 42m 3s | 2m 6s (64% faster) | 9244 MB |
Note: the GPU memory increase scales with the number of DALI workers rather than with the model.
If you use solo-learn, please cite our preprint:
@misc{turrisi2021sololearn,
title={Solo-learn: A Library of Self-supervised Methods for Visual Representation Learning},
author={Victor G. Turrisi da Costa and Enrico Fini and Moin Nabi and Nicu Sebe and Elisa Ricci},
year={2021},
eprint={2108.01775},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://github.com/vturrisi/solo-learn},
}