FFEM

FFEM stands for Face Feature Embedding Module.
This project includes the following implementations:

ArcFace
GroupFace
CenterLoss

Requirements

tensorflow==2.4.1
tensorflow-addons==0.12.1
tensorflow-model-optimization==0.5.0
numpy==1.19.5

How to Make Your Dataset

Prepare an image_list.json file in the JSON format shown below.

{
  "Caucacian/a12/frontal1.jpg": {
    "label": 0,
    "x1": 9,
    "y1": 13,
    "x2": 75,
    "y2": 100
  },
  "Asian/p14/profile2.jpg": {
    "label": 0,
    "x1": 7,
    "y1": 7,
    "x2": 43,
    "y2": 65
  },
  ...
}

Each key is the relative path of a face image.
Each value holds the label number and a bounding box that marks the exact face location.
The bounding box [x1, y1, x2, y2] corresponds to [left, top, right, bottom].
We generated the bounding box using [11].
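
As a quick sanity check before generating a TFRecord, the annotation file can be loaded and validated with a few lines of Python. This is only a sketch: the function name load_annotations is hypothetical and not part of this project, and it assumes every entry carries the five fields shown in the example above.

```python
import json


def load_annotations(json_path):
    """Return a list of (relative_path, label, (x1, y1, x2, y2)) tuples."""
    with open(json_path, "r", encoding="utf-8") as f:
        entries = json.load(f)

    samples = []
    for rel_path, info in entries.items():
        box = (info["x1"], info["y1"], info["x2"], info["y2"])
        left, top, right, bottom = box
        # The box is [left, top, right, bottom], so right/bottom must be larger.
        if right <= left or bottom <= top:
            raise ValueError(f"degenerate bounding box for {rel_path}: {box}")
        samples.append((rel_path, int(info["label"]), box))
    return samples
```

Calling load_annotations("image_list.json") raises a ValueError on the first entry whose box has zero or negative width or height, which catches swapped coordinates early.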

Transform the JSON File into a TFRecord

An input pipeline bottleneck increases training time.
Reading sequentially from one large file is faster than reading many small files at random.
The command below generates a [name].tfrecord file from the JSON file described above.

python generate_tfrecord/main.py --root_path [path] --json_file [path] --output [name].tfrecord
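
For readers curious what such a conversion involves, here is a minimal sketch of packing annotated images into a single TFRecord file. It is not the code in generate_tfrecord/main.py: the function names and the feature keys ("image", "label", "box") are assumptions made for illustration, and the project's real schema may differ.

```python
import os


def iter_records(root_path, annotations):
    """Yield (image_bytes, label, box) for each annotated image.

    `annotations` is the dict parsed from image_list.json.
    """
    for rel_path, info in annotations.items():
        with open(os.path.join(root_path, rel_path), "rb") as f:
            image_bytes = f.read()
        box = [info["x1"], info["y1"], info["x2"], info["y2"]]
        yield image_bytes, int(info["label"]), box


def write_tfrecord(root_path, annotations, output_path):
    """Serialize every record into one TFRecord file and return the count."""
    import tensorflow as tf  # imported lazily; only this function needs it

    count = 0
    with tf.io.TFRecordWriter(output_path) as writer:
        for image_bytes, label, box in iter_records(root_path, annotations):
            # Feature keys are hypothetical, not the project's actual schema.
            example = tf.train.Example(features=tf.train.Features(feature={
                "image": tf.train.Feature(
                    bytes_list=tf.train.BytesList(value=[image_bytes])),
                "label": tf.train.Feature(
                    int64_list=tf.train.Int64List(value=[label])),
                "box": tf.train.Feature(
                    int64_list=tf.train.Int64List(value=box)),
            }))
            writer.write(example.SerializeToString())
            count += 1
    return count
```

Storing the encoded image bytes rather than decoded pixels keeps the file small. Decoding and cropping to the box can then happen inside the training input pipeline.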

Common Settings

Run export PYTHONPATH=$(pwd) on Linux, or $env:PYTHONPATH=$pwd in PowerShell on Windows 10.

Recommended Steps for Training

Train a model with 'loss'=SoftmaxCenter on the VGGFACE2 dataset.
Train the pretrained model from the first step with 'loss'=AngularMargin on a dataset with a large number of identities.
The training command is python train/main.py.
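
To give an intuition for what the AngularMargin loss in the second step does, the sketch below applies the additive angular margin from the ArcFace paper to a list of cosine similarities. It is an illustration only, not this project's implementation. The scale of 60 matches the Results table, while the margin of 0.5 is the ArcFace paper's default and is an assumption here. The sketch also ignores the case where the angle plus the margin exceeds pi, which practical implementations treat specially.

```python
import math


def arcface_logits(cosines, target, scale=60.0, margin=0.5):
    """Apply an additive angular margin to the target-class cosine.

    cosines: cosine similarities between one normalized embedding and
             each normalized class weight vector.
    target:  index of the ground-truth class.
    """
    logits = []
    for i, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # guard acos against rounding error
        if i == target:
            # Penalize the true class by enlarging its angle by `margin`.
            c = math.cos(math.acos(c) + margin)
        logits.append(scale * c)
    return logits
```

The returned logits are then fed to an ordinary softmax cross-entropy. Because only the true class is penalized, the model has to pull embeddings closer to their class center to reach the same loss.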

Results

	ResNet50	ResNet50
Recall @ 1, African	51%	55%
Recall @ 1, Asian	83%	84%
Recall @ 1, Caucasian	69%	74%
Recall @ 1, Indian	69%	72%
Recall @ 1, VGGFace2	89%	95%
Epoch	50	70
Batch Size	2048	2048
Embedding Size	512	512
Feature Pooling	*GNAP	*GNAP
Loss Type	AngularMargin(arcface)	AngularMargin(arcface)
Scale	60	60
LR	SGD@1e-1	SGD@1e-1
# of Identities	93979	100979
Dataset	Trillion Pairs	Trillion Pairs + VGGFACE2

*RFW and VGGFACE2 are used for testing
*All models are pretrained on VGGFACE2 train-set
*Global Norm-Aware Pooling (GNAP) is used for pooling the last spatial features of the convolution layers.
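
The exact evaluation protocol behind the Recall @ 1 rows is not described here. A common definition, assumed in the sketch below, counts a query as a hit when its nearest neighbor by cosine similarity (excluding itself) carries the same label. The function names are hypothetical, and the brute-force loop is meant for small examples only.

```python
import math


def _cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def recall_at_1(embeddings, labels):
    """Fraction of samples whose nearest other sample shares their label."""
    hits = 0
    for i, query in enumerate(embeddings):
        best_j, best_sim = None, -math.inf
        for j, other in enumerate(embeddings):
            if j == i:
                continue  # a sample must not retrieve itself
            sim = _cosine(query, other)
            if sim > best_sim:
                best_j, best_sim = j, sim
        hits += labels[best_j] == labels[i]
    return hits / len(embeddings)
```

The function expects at least two samples. For a real test set, the pairwise loop would normally be replaced by one matrix multiplication over L2-normalized embeddings.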

TODO LIST

Known as Center Loss, A Discriminative Feature Learning Approach for Deep Face Recognition, Y. Wen et al., ECCV 2016
Known as L2 Softmax, L2-constrained Softmax Loss for Discriminative Face Verification, R. Ranjan et al., arXiv preprint arXiv:1703.09507 2017
Global Norm-Aware Pooling for Pose-Robust Face Recognition at Low False Positive Rate, S. Chen et al., arXiv preprint arXiv:1808.00435 2018
The Devil of Face Recognition is in the Noise, F. Wang et al., ECCV 2018
Co-Mining: Deep Face Recognition with Noisy Labels, X. Wang et al., ICCV 2019
ArcFace: Additive Angular Margin Loss for Deep Face Recognition, J. Deng et al., CVPR 2019
Relational Deep Feature Learning for Heterogeneous Face Recognition, M. Cho et al., IEEE 2020
Sub-center ArcFace: Boosting Face Recognition by Large-scale Noisy Web Faces, J. Deng et al., ECCV 2020
GroupFace: Learning Latent Groups and Constructing Group-based Representations for Face Recognition, Y. Kim et al., CVPR 2020

shi510/ffem