
Pre-training strategies and datasets for facial representation learning

This is the PyTorch implementation of the Facial Representation Learning (FRL) paper:

@inproceedings{bulat2022pre,
  title={Pre-training strategies and datasets for facial representation learning},
  author={Bulat, Adrian and Cheng, Shiyang and Yang, Jing and Garbett, Andrew and Sanchez, Enrique and Tzimiropoulos, Georgios},
  booktitle={ECCV},
  year={2022}
}

Model Zoo

We provide below some of the models trained in a self-supervised manner. More models will be added later.

data       | backbone  | url
VGG        | ResNet 50 | model
VGG (1M)   | ResNet 50 | model
FPR-Flickr | ResNet 50 | model

Code snippet for loading the weights into a standard torchvision ResNet-50:

import torch
from torchvision.models import resnet50

init_weights = torch.load('flr_r50_flickr_face.pth', map_location=torch.device('cpu'))['state_dict']
converted_weights = {k.replace('module.base_net.', ''): v for k, v in init_weights.items()}

model = resnet50(weights=None)
results = model.load_state_dict(converted_weights, strict=False)
# Note: the classifier layer (fc.weight and fc.bias) is not loaded;
# similarly, the projection layers used for pre-training are discarded.
print(results)
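
As a usage example, the loaded backbone can then serve as a feature extractor. The sketch below is our suggestion rather than the paper's evaluation pipeline: it swaps the classifier for an identity layer and uses standard ImageNet preprocessing, which may differ from the face crops and normalisation used during pre-training.

import torch
from PIL import Image
from torchvision import transforms

model.fc = torch.nn.Identity()  # expose the 2048-d pooled features
model.eval()

# Assumption: standard ImageNet statistics; the pre-training pipeline
# may use a face-specific crop and different normalisation.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open('face.jpg').convert('RGB')  # hypothetical input image
with torch.no_grad():
    embedding = model(preprocess(img).unsqueeze(0))  # shape: (1, 2048)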

Installation

To use the code, clone the repository and install the packages listed under Requirements below:

git clone https://github.com/1adrianb/unsupervised-face-representation
cd unsupervised-face-representation

Requirements

  • Python >= 3.8
  • NumPy
  • PyTorch: install instructions
  • torchvision: conda install torchvision -c pytorch
  • apex: install instructions
  • OpenCV: pip install opencv-python
  • h5py: conda install h5py
  • TensorBoard: pip install tensorboard
  • pandas
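
Once everything is installed, a quick sanity check of the environment can look like the sketch below (our suggestion, not part of the repo; apex is omitted since it is only needed when not using native AMP):

import torch
import torchvision
import cv2    # opencv-python
import h5py
import numpy
import pandas

print('torch', torch.__version__, '| torchvision', torchvision.__version__)
print('CUDA available:', torch.cuda.is_available())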

Note: if you are using PyTorch > 1.10 and experience issues with apex, please see #1282. Alternatively, you can switch to native PyTorch AMP.
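
For reference, here is a minimal sketch of a mixed-precision training step with native PyTorch AMP (torch.cuda.amp); the model, optimiser, and batch are placeholders rather than this repo's training objects:

import torch
from torchvision.models import resnet50

# Placeholder model, optimiser, and batch; substitute the real ones.
model = resnet50(weights=None).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
images = torch.randn(8, 3, 224, 224, device='cuda')

scaler = torch.cuda.amp.GradScaler()

optimizer.zero_grad()
with torch.cuda.amp.autocast():   # run the forward pass in mixed precision
    loss = model(images).mean()   # dummy loss, for illustration only
scaler.scale(loss).backward()     # scale the loss to avoid fp16 underflow
scaler.step(optimizer)            # unscale gradients and apply the update
scaler.update()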

Training

bash scripts/run.sh

Before running the script, make sure to set the appropriate paths. The models released with the paper were trained using 64 K40 GPUs.

Preparing data

For instructions on obtaining the data, see DATASET.md.

Acknowledgement

We thank the authors of SwAV, MoCo, BYOL, and VISSL for releasing their code, upon which our code base is built.