Training Code for ADMIS Teams in CVPR2024 FRCSyn Competition


This is the official GitHub repository for our team's contribution (ADMIS) to the CVPR 2024 FRCSyn competition.



We use a latent diffusion model (LDM) based on IDiff-Face to synthesize faces. The LDM is conditioned on identity embeddings, used as contexts, which a pretrained ElasticFace recognition model extracts from face images.

Dataset and pretrained models

We use the CASIA-WebFace dataset to train an IDiff-Face diffusion model. The download link for our pretrained diffusion model weight is:

We use a dataset of 10K identities x 50 images in SubTask 1.1/2.1 and a dataset of 30K identities x 50 images in SubTask 1.2/2.2. We provide the pre-generated synthetic 10K-identity dataset at:

We train the recognition model based on TFace. The pretrained IR-50 model trained on the synthetic dataset is available at:

Evaluation results of our pretrained face recognition models on widely used benchmarks (accuracy in %):

Backbone  Head     Dataset         IDs    LFW    CFP-FP  CPLFW  AGEDB  CALFW  Average
IR-50     ArcFace  CASIA_WebFace   10.5K  99.43  97.40   90.23  94.80  93.55  95.08
IR-50     ArcFace  IDiff-Face      10K    97.10  82.00   76.65  78.40  86.32  84.09
IR-50     ArcFace  DCFace          10K    98.60  88.21   83.33  88.18  91.38  89.94
IR-50     ArcFace  Syn_10k (ours)  10K    99.17  92.79   87.67  89.42  91.43  92.09
IR-50     ArcFace  Syn_30k (ours)  30K    99.52  94.66   89.75  91.78  93.13  93.77
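
As a quick sanity check on the table, the Average column can be recomputed as the unweighted mean of the five benchmark accuracies. The sketch below assumes that this is how the averages were obtained, which the README does not state. The helper name mean_accuracy is ours, and recomputed values may differ from the listed ones in the last digit because of rounding.

```python
# Accuracies copied from the table above, in the order
# LFW, CFP-FP, CPLFW, AGEDB, CALFW, paired with the listed Average.
ROWS = {
    "CASIA_WebFace": ([99.43, 97.40, 90.23, 94.80, 93.55], 95.08),
    "IDiff-Face": ([97.10, 82.00, 76.65, 78.40, 86.32], 84.09),
    "DCFace": ([98.60, 88.21, 83.33, 88.18, 91.38], 89.94),
    "Syn_10k (ours)": ([99.17, 92.79, 87.67, 89.42, 91.43], 92.09),
    "Syn_30k (ours)": ([99.52, 94.66, 89.75, 91.78, 93.13], 93.77),
}


def mean_accuracy(scores):
    """Unweighted mean over the benchmark accuracies (assumed definition)."""
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for name, (scores, listed) in ROWS.items():
        recomputed = mean_accuracy(scores)
        print(f"{name:15s} recomputed={recomputed:.3f} listed={listed:.2f}")
```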


Our method consists of three stages: identity conditioned LDM training, context enhanced sampling, and recognition model training. The first two stages are implemented on top of the IDiff-Face repository, with some modifications to its dataset and sampling code. Recognition model training is based on the TFace repository, with only minor modifications to the 'transform' method to add cropping augmentations.

1. Identity conditioned LDM training

  • Install environment: Please refer to 'How to use the code' to set up the environment.

  • Download the data and pretrained models required for training the LDM: The training embeddings used as contexts during training and their corresponding images have to be downloaded from the link and placed under dataset/CASIA. The pre-trained autoencoder for the latent diffusion training is obtained from the pre-trained ffhq256 LDM from Rombach et al.; please follow their license terms. For training, make sure the directory tree is as follows:

      ├── dataset
      │   ├── CASIA
      │   │   ├── elasticface_embeddings # context file and image index file
      │   │   ├── CASIA_namelist.txt # for training
      │   │   └── images # decompressed CASIA-WebFace images
      │   ...
      ├── generative_model_training
      │   ├── ckpt
      │   │   ├── autoencoder
      │   │   │   ├── first_stage_decoder_state_dict.pt # for training
      │   │   │   └── first_stage_encoder_state_dict.pt # for training
      │   │   ...
      │   ...
  • Start training: Make sure that the dataset: CASIA_file option is set and that the paths in the corresponding subconfiguration generative_model_training/configs/dataset/CASIA_file.yaml point to the training images and pre-extracted embeddings. Training can then be started by executing:
    cd generative_model_training
    python main.py
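
The training layout shown in the directory tree above can be checked before launching a run. The following is a minimal sketch, not part of the repository: the relative paths are copied from the tree, while the function name missing_paths and the idea of running it from the repository root are our assumptions.

```python
from pathlib import Path

# Relative paths taken from the directory tree in this section.
REQUIRED_FOR_TRAINING = [
    "dataset/CASIA/elasticface_embeddings",
    "dataset/CASIA/CASIA_namelist.txt",
    "dataset/CASIA/images",
    "generative_model_training/ckpt/autoencoder/first_stage_decoder_state_dict.pt",
    "generative_model_training/ckpt/autoencoder/first_stage_encoder_state_dict.pt",
]


def missing_paths(repo_root, required=REQUIRED_FOR_TRAINING):
    """Return the required entries that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in required if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    missing = missing_paths(".")
    if missing:
        print("Missing before training:")
        for rel in missing:
            print("  " + rel)
    else:
        print("Training layout looks complete.")
```

The same helper can be reused for the sampling step by passing a different list of required paths.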

2. Context enhanced sampling

To synthesize new faces with unseen identities, IDiff-Face suggests that a noise embedding sampled from a Gaussian distribution can serve as the LDM’s context. However, we observe that faces synthesized this way exhibit weak identity consistency. We therefore employ another unconditional DDPM, pretrained on the FFHQ dataset, to generate high-quality context faces, whose ElasticFace embeddings then serve as contexts.
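
To make the noise-context baseline concrete, the sketch below draws a vector from a standard Gaussian and L2-normalizes it so that it lies on the unit hypersphere, as face embeddings typically do. The 512-dimensional size, the normalization step, and the helper names are assumptions for illustration and are not taken from the repository. Cosine similarity is included as one common way to compare identity embeddings, for example when judging how consistent synthesized faces are with their context.

```python
import math
import random

EMBED_DIM = 512  # assumed embedding size; not stated in this README


def sample_noise_context(dim=EMBED_DIM, rng=None):
    """Draw a Gaussian noise vector and scale it to unit L2 norm."""
    rng = rng or random.Random()
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    rng = random.Random(0)
    first = sample_noise_context(rng=rng)
    second = sample_noise_context(rng=rng)
    # Independent noise contexts are typically close to orthogonal
    # in high dimensions, so this value is usually near zero.
    print("similarity between two noise contexts:",
          cosine_similarity(first, second))
```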

  • Prepare contexts: For ease of use, we provide the pre-generated context faces along with their context embeddings extracted by the ElasticFace model. Please download them from this link and place them in dataset/context_database. For sampling, make sure the directory tree is as follows:
      ├── dataset
      │   ├── context_database
      │   │   ├── elasticface_embeddings # context file 
      │   │   └── images # decompressed context faces images
      │   ...
  • Run sampling script: If you choose to utilize our pretrained LDM checkpoint, please download the Pre-trained LDM (25% CPD) and make sure the tree of the directory is as follows:
    ├── generative_model_training
    │   ├── ckpt
    │   │   ├── ADMIS_FRCSyn_ckpt
    │   │   │   └── ema_averaged_model_200000.ckpt # for sampling  
    │   │   ...
    │   ...
    Then the sampling process can be initiated by executing:
    cd generative_model_training
    python sample.py
  • ID augmentation: Following the oversampling strategy from DCFace, we mix each context face (augmented 5 times) with its corresponding synthesized faces. Please run:
    cd generative_model_training
    python id_augment.py
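
The mixing step can be pictured as building, for each identity, one sample list that holds several copies of the context face (each to be augmented differently) followed by the synthesized faces. The sketch below is only an illustration under that reading of the description. It is not the code in id_augment.py, and the function name, the file names, and the tuple layout are hypothetical. The default of 5 context copies comes from the text above.

```python
def mix_identity_samples(context_image, synthesized_images, n_context_copies=5):
    """Return (path, source, copy_index) entries for one identity.

    The context face is listed n_context_copies times so that a later
    augmentation step can produce a different variant for each copy.
    Synthesized faces are listed once each with copy_index 0.
    """
    samples = [
        (context_image, "context", k) for k in range(n_context_copies)
    ]
    samples.extend((path, "synthesized", 0) for path in synthesized_images)
    return samples


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    mixed = mix_identity_samples(
        "context/000001.png",
        ["synth/000001_00.png", "synth/000001_01.png"],
    )
    for entry in mixed:
        print(entry)
```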

3. Recognition model training

  • Prepare TFR format data: To convert raw images to tfrecords and generate a new data directory containing the tfrecord files and an index_map file, please run:

    cd recognition_model_training
    python3 tools/img2tfrecord.py --img_list YOUR_IMAGE_ROOT --tfrecords_dir SAVE_ROOT --tfrecords_name SAVE_NAME
  • Train: Modify DATA_ROOT and INDEX_ROOT in train.yaml. DATA_ROOT is the parent directory of the tfrecord directory, and INDEX_ROOT is the parent directory of the index file. Then run:

    cd recognition_model_training
    bash local_train.sh
  • Test: For the detailed implementation and steps, see the Test section in the TFace repository.

If you have any more questions, please contact zzhizhou66@gmail.com.


This repo is modified and adapted from these great repositories. We thank the authors for their great efforts.