VICRegL: Self-Supervised Learning of Local Visual Features

This repository provides a PyTorch implementation and pretrained models for VICRegL, a self-supervised pretraining method for learning global and local features, described in the paper VICRegL: Self-Supervised Learning of Local Visual Features, published at NeurIPS 2022.
Adrien Bardes, Jean Ponce and Yann LeCun
Meta AI, Inria

Pre-trained Models

You can download either the weights of the pretrained backbone alone, which is what downstream tasks use, or the full checkpoint, which contains the backbone and the expander/projector weights. All models are pretrained on ImageNet-1k, except the ConvNeXt-XL model, which is pretrained on ImageNet-22k. In the table, "linear cls." is the linear classification accuracy on the ImageNet validation set, and "linear seg." is the linear frozen mIoU on the Pascal VOC validation set.

arch	alpha	params	linear cls. (%)	linear seg. (mIoU)	download
ResNet-50	0.9	23M	71.2	54.0	backbone	full ckpt	logs
ResNet-50	0.75	23M	70.4	55.9	backbone	full ckpt	logs
ConvNeXt-S	0.9	50M	75.9	66.7	backbone	full ckpt	logs
ConvNeXt-S	0.75	50M	74.6	67.5	backbone	full ckpt	logs
ConvNeXt-B	0.9	85M	77.1	69.3	backbone	full ckpt	logs
ConvNeXt-B	0.75	85M	76.3	70.4	backbone	full ckpt	logs
ConvNeXt-XL	0.75	350M	79.4	78.7	backbone	full ckpt	logs

Pretrained models on PyTorch Hub

import torch

# Each call below loads a different pretrained model; keep only the line you need.
model = torch.hub.load('facebookresearch/vicregl:main', 'resnet50_alpha0p9')
model = torch.hub.load('facebookresearch/vicregl:main', 'resnet50_alpha0p75')
model = torch.hub.load('facebookresearch/vicregl:main', 'convnext_small_alpha0p9')
model = torch.hub.load('facebookresearch/vicregl:main', 'convnext_small_alpha0p75')
model = torch.hub.load('facebookresearch/vicregl:main', 'convnext_base_alpha0p9')
model = torch.hub.load('facebookresearch/vicregl:main', 'convnext_base_alpha0p75')
model = torch.hub.load('facebookresearch/vicregl:main', 'convnext_xlarge_alpha0p75')
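
The entry names above follow a visible pattern: the architecture, then `_alpha`, then the alpha value with the decimal point written as `p`. The following is a minimal sketch of a helper that builds such a name and checks it against the seven entries listed above. The function `hub_entry_name` and the set `AVAILABLE` are hypothetical illustrations and are not part of this repository.

```python
# Hypothetical helper, not part of the VICRegL repository.
# The names below are copied from the torch.hub.load calls listed above.
AVAILABLE = {
    "resnet50_alpha0p9",
    "resnet50_alpha0p75",
    "convnext_small_alpha0p9",
    "convnext_small_alpha0p75",
    "convnext_base_alpha0p9",
    "convnext_base_alpha0p75",
    "convnext_xlarge_alpha0p75",
}


def hub_entry_name(arch: str, alpha: float) -> str:
    """Build a hub entry name such as 'resnet50_alpha0p9' and validate it."""
    # str(0.75) is '0.75'; the hub names write the decimal point as 'p'.
    name = f"{arch}_alpha{str(alpha).replace('.', 'p')}"
    if name not in AVAILABLE:
        raise ValueError(f"no pretrained model named {name!r}")
    return name
```

The result can then be passed as the second argument of the call shown above, for example `torch.hub.load('facebookresearch/vicregl:main', hub_entry_name('convnext_small', 0.75))`.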

Training

Install PyTorch (pytorch.org) and download ImageNet. The code was developed with PyTorch 1.8.1 and torchvision 0.9.1, but should also work with other versions. Set the ImageNet path in the file datasets.py:

IMAGENET_PATH = "path/to/imagenet"

ImageNet can also be loaded from NumPy files by passing the flag --dataset_from_numpy and setting the path:

IMAGENET_NUMPY_PATH = "path/to/imagenet/numpy/files"

The argument --alpha controls the weighting between the global and local losses. Its default value is 0.75.
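
As a rough illustration of what this weighting means, the sketch below assumes the total loss is the convex combination `alpha * global_loss + (1 - alpha) * local_loss`. That form is an assumption made for illustration; the exact combination is defined in main_vicregl.py. The function name `combine_losses` is hypothetical.

```python
def combine_losses(global_loss: float, local_loss: float, alpha: float = 0.75) -> float:
    """Weighted sum of the two criteria (assumed form, see the note above).

    alpha close to 1 emphasizes the global criterion;
    alpha close to 0 emphasizes the local criterion.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is expected to lie in [0, 1]")
    return alpha * global_loss + (1.0 - alpha) * local_loss
```

Under this assumption, the default of 0.75 gives the global term three times the weight of the local term.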

Single-node local training

To pretrain VICRegL with a ResNet-50 backbone on a single node with 8 GPUs for 100 epochs, run:

python -m torch.distributed.launch --nproc_per_node=8 main_vicregl.py --fp16 --exp-dir /path/to/experiment/ --arch resnet50 --epochs 100 --batch-size 512 --optimizer lars --base-lr 0.3 --weight-decay 1e-06 --size-crops 224 --num-crops 2 --min_scale_crops 0.08 --max_scale_crops 1.0 --alpha 0.75

To pretrain VICRegL with a ConvNeXt-S backbone, run:

python -m torch.distributed.launch --nproc_per_node=8 main_vicregl.py --fp16 --exp-dir /path/to/experiment/ --arch convnext_small --epochs 100 --batch-size 384 --optimizer adamw --base-lr 0.00075 --alpha 0.75

Multi-node training with SLURM

To pretrain VICRegL with a ResNet-50 backbone, with submitit (pip install submitit) and SLURM on 4 nodes with 8 GPUs each for 300 epochs, run:

python run_with_submitit.py --nodes 4 --ngpus 8 --fp16 --exp-dir /path/to/experiment/ --arch resnet50 --epochs 300 --batch-size 2048 --optimizer lars --base-lr 0.2 --weight-decay 1e-06 --size-crops 224 --num-crops 2 --min_scale_crops 0.08 --max_scale_crops 1.0 --alpha 0.75
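
When adapting this command to a different number of nodes or GPUs, it can help to check how many samples each GPU receives. The sketch below assumes that --batch-size is the total batch size across all GPUs, which should be verified against main_vicregl.py. The function `per_gpu_batch_size` is a hypothetical helper, not part of this repository.

```python
def per_gpu_batch_size(batch_size: int, nodes: int, gpus_per_node: int) -> int:
    """Samples per GPU, ASSUMING --batch-size is the total over all GPUs."""
    world_size = nodes * gpus_per_node
    if batch_size % world_size != 0:
        raise ValueError("batch size is not divisible by the number of GPUs")
    return batch_size // world_size
```

Under that assumption, the command above (4 nodes, 8 GPUs each, batch size 2048) corresponds to 64 samples per GPU.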

To pretrain VICRegL with a ConvNeXt-B backbone, run:

python run_with_submitit.py --nodes 2 --ngpus 8 --fp16 --exp-dir /path/to/experiment/ --arch convnext_base --epochs 400 --batch-size 576 --optimizer adamw --base-lr 0.0005 --alpha 0.75

Evaluation

Linear evaluation

To evaluate a pretrained backbone (resnet50, convnext_small, convnext_base, convnext_xlarge) on linear classification on ImageNet, run:

python evaluate.py --data-dir /path/to/imagenet/ --pretrained /path/to/checkpoint/model.pth --exp-dir /path/to/experiment/ --arch [backbone] --lr-head [lr]

Use lr=0.02 for ResNet models and lr=0.3 for ConvNeXt models.
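
The choice of head learning rate can be scripted. The following is a minimal sketch that picks the value from the architecture name and assembles the evaluate.py command shown above. The helpers `lr_head_for` and `build_eval_command` are hypothetical and not part of this repository; only the flags and the two learning rates come from this README.

```python
from typing import List

# Head learning rates stated in this README, keyed by architecture prefix.
LR_HEAD = {"resnet": 0.02, "convnext": 0.3}


def lr_head_for(arch: str) -> float:
    """Return the recommended --lr-head value for a backbone name."""
    for prefix, lr in LR_HEAD.items():
        if arch.startswith(prefix):
            return lr
    raise ValueError(f"unknown architecture: {arch!r}")


def build_eval_command(arch: str, data_dir: str, checkpoint: str, exp_dir: str) -> List[str]:
    """Assemble the linear evaluation command as an argument list."""
    return [
        "python", "evaluate.py",
        "--data-dir", data_dir,
        "--pretrained", checkpoint,
        "--exp-dir", exp_dir,
        "--arch", arch,
        "--lr-head", str(lr_head_for(arch)),
    ]
```

The returned list can be passed directly to `subprocess.run`, or joined with `shlex.join` to print a command line for copying into a shell.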

Linear segmentation

See the segmentation folder.

License

This project is released under the CC-BY-NC License. See LICENSE for details.

Citation

If you find this repository useful, please consider giving it a star ⭐ and citing the paper:

@inproceedings{bardes2022vicregl,
  author  = {Adrien Bardes and Jean Ponce and Yann LeCun},
  title   = {VICRegL: Self-Supervised Learning of Local Visual Features},
  booktitle = {NeurIPS},
  year    = {2022},
}
