DigiFace-1M Dataset

The DigiFace-1M dataset is a collection of over one million diverse synthetic face images for face recognition.

It was introduced in our paper DigiFace-1M: 1 Million Digital Face Images for Face Recognition and can be used to train deep learning models for facial recognition.

The dataset contains:

  • 720K images with 10K identities (72 images per identity). For each identity, 4 different sets of accessories are sampled and 18 images are rendered for each set.
  • 500K images with 100K identities (5 images per identity). For each identity, only one set of accessories is sampled.
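As a quick sanity check on the numbers above, the arithmetic can be sketched in Python. The dictionary keys and function names are illustrative and are not part of the dataset. The figure of 5 renders per set for the second part is inferred from "5 images per identity" with a single accessory set.

```python
# Sizes as stated in the list above; all names here are illustrative only.
PARTS = {
    "72 images per identity": {
        "identities": 10_000,
        "accessory_sets": 4,
        "renders_per_set": 18,
    },
    "5 images per identity": {
        "identities": 100_000,
        "accessory_sets": 1,
        # Inferred: 5 images per identity with one accessory set.
        "renders_per_set": 5,
    },
}


def images_per_identity(part):
    """Images rendered for one identity: accessory sets times renders per set."""
    return part["accessory_sets"] * part["renders_per_set"]


def total_images(part):
    """Total images in one part of the dataset."""
    return part["identities"] * images_per_identity(part)


totals = {name: total_images(part) for name, part in PARTS.items()}
grand_total = sum(totals.values())

for name, count in totals.items():
    print(f"{name}: {count:,} images")
print(f"total: {grand_total:,} images")
```

The two parts add up to 1.22 million images, which matches the "over one million" figure quoted at the top.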

The DigiFace-1M dataset can be used for non-commercial research and is released under the terms found in the LICENSE file.

Downloading the Dataset

For convenience, the dataset is split into 8 parts, which can be downloaded here:

72 images per identity

5 images per identity

Dataset Layout

The DigiFace-1M dataset contains cropped color images in the following layout.

subj_id_n
├── 0.png                 # First rendered image of subject subj_id_n
├── 1.png                 # Second rendered image of subject subj_id_n
...
├── k.png                 # (k+1)-th rendered image of subject subj_id_n
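
Given this layout, a training pipeline typically needs a list of (image path, identity label) pairs. The following is a minimal sketch, not an official loader. It assumes that the downloaded parts have been extracted into a single root directory containing one folder per subject, and that image files are named with integer stems as shown above. The function name `index_digiface` is hypothetical.

```python
from pathlib import Path


def index_digiface(root):
    """Return a list of (image_path, label) pairs for a DigiFace-1M style tree.

    Assumes `root` contains one directory per subject, each holding
    images named 0.png, 1.png, ..., k.png. Labels are assigned by the
    sorted order of the subject directory names, starting from 0.
    """
    root = Path(root)
    subject_dirs = sorted(p for p in root.iterdir() if p.is_dir())

    samples = []
    for label, subject_dir in enumerate(subject_dirs):
        # Keep only files with integer stems, sorted numerically so that
        # 2.png comes before 10.png.
        images = sorted(
            (p for p in subject_dir.glob("*.png") if p.stem.isdigit()),
            key=lambda p: int(p.stem),
        )
        samples.extend((image, label) for image in images)
    return samples
```

Calling `index_digiface("path/to/extracted/dataset")` (a placeholder path) returns the pairs, which can then be wrapped in whatever dataset class your training framework expects. Labels here depend on the lexicographic order of folder names, so if the two parts of the dataset are combined, check that subject folder names do not collide.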

Disclaimer

Some of our rendered faces may be close in appearance to the faces of real people. Any such similarity is naturally unintentional, as it would be in a dataset of real images, where people may appear similar to others unknown to them.

Citation

If you use the DigiFace-1M dataset in your work, please cite the following paper:

@inproceedings{bae2023digiface1m,
  title={DigiFace-1M: 1 Million Digital Face Images for Face Recognition},
  author={Bae, Gwangbin and de La Gorce, Martin and Baltru{\v{s}}aitis, Tadas and Hewitt, Charlie and Chen, Dong and Valentin, Julien and Cipolla, Roberto and Shen, Jingjing},
  booktitle={2023 IEEE Winter Conference on Applications of Computer Vision (WACV)},
  year={2023},
  organization={IEEE}
}