This repository contains code for face emotion recognition developed within the RSF (Russian Science Foundation) project no. 20-71-10010 (Efficient audiovisual analysis of dynamical changes in emotional state based on information-theoretic approach).
Our approach is described in the arXiv paper.
All models were pre-trained for the face identification task on the VGGFace2 dataset. To train the PyTorch models, the SAM (Sharpness-Aware Minimization) code was borrowed.
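For reference, here is a minimal sketch of the two-step SAM update, assuming the borrowed code exposes the usual `SAM` optimizer wrapper as a `sam` module; the tiny model, loss, and random batch are only illustrative placeholders:

```python
import torch
from torch import nn
from sam import SAM  # the borrowed SAM optimizer (assumed to provide the standard two-step API)

# A tiny stand-in classifier and random batch, just to make the training step concrete
model = nn.Linear(512, 8)                  # e.g. 512-d facial features -> 8 emotion classes
criterion = nn.CrossEntropyLoss()
features = torch.randn(16, 512)
labels = torch.randint(0, 8, (16,))

base_optimizer = torch.optim.SGD
optimizer = SAM(model.parameters(), base_optimizer, lr=0.01, momentum=0.9)

# SAM requires two forward-backward passes per batch
criterion(model(features), labels).backward()
optimizer.first_step(zero_grad=True)       # perturb weights towards the locally sharpest point
criterion(model(features), labels).backward()
optimizer.second_step(zero_grad=True)      # actual weight update at the perturbed point
```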
We provide several models that achieved state-of-the-art results on the AffectNet dataset. The facial features extracted by these models lead to state-of-the-art accuracy of face-only models on the video datasets from the EmotiW 2019 and 2020 challenges: AFEW (Acted Facial Expression In The Wild) and VGAF (Video-level Group AFfect).
Here are the accuracies measured on the test sets of the above-mentioned datasets:
Model | AffectNet (8 classes), original | AffectNet (8 classes), aligned | AffectNet (7 classes), original | AffectNet (7 classes), aligned | AFEW | VGAF |
---|---|---|---|---|---|---|
mobilenet_7.h5 | - | - | 64.71 | - | 55.35 | 68.92 |
enet_b0_8_best_afew.pt | 60.95 | 60.18 | 64.63 | 64.54 | 59.89 | 66.80 |
enet_b0_8_best_vgaf.pt | 61.32 | 61.03 | 64.57 | 64.89 | 55.14 | 68.29 |
enet_b0_7.pt | - | - | 65.74 | 65.74 | 56.99 | 65.18 |
enet_b2_8.pt | 62.05 | 62.425 | 65.87 | 66.17 | 56.46 | 67.88 |
enet_b2_7.pt | - | - | 65.91 | 66.34 | 59.63 | 69.84 |
Please note that we report the accuracies for AFEW and VGAF only on the subsets in which MTCNN detects facial regions. The code also computes the overall accuracy on the complete test set, which is slightly lower due to missing faces or failed face detection.
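As a quick illustration, the following hedged sketch detects a face with MTCNN (here via the facenet-pytorch package, one possible implementation) and classifies its emotion with one of the released checkpoints. The input path, checkpoint location, 224x224 input size, and ImageNet-style normalization are assumptions; check the training notebooks for the exact preprocessing.

```python
import torch
from PIL import Image
from facenet_pytorch import MTCNN           # one possible MTCNN implementation (assumption)
from torchvision import transforms

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Detect the facial region and crop it from the frame
mtcnn = MTCNN(keep_all=False, device=device)
img = Image.open('frame.jpg').convert('RGB')  # hypothetical input frame
boxes, probs = mtcnn.detect(img)
face = img.crop([int(b) for b in boxes[0]])   # take the most confident detection

# Load the serialized emotion model; adjust the path to where the checkpoint is stored
model = torch.load('enet_b0_8_best_afew.pt', map_location=device)
model.eval()

# EfficientNet-B0-style preprocessing (224x224, ImageNet statistics) -- an assumption
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

with torch.no_grad():
    scores = model(preprocess(face).unsqueeze(0).to(device))
print('predicted emotion index:', scores.argmax(dim=1).item())
```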
To run our code on the datasets, please first prepare them using our TensorFlow notebooks: train_emotions.ipynb, AFEW_train.ipynb, and VGAF_train.ipynb.
If you want to run our mobile application, please run the following scripts inside the mobile_app folder:
```
python to_tflite.py
python to_pytorchlite.py
```
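These scripts export the models into mobile-friendly formats. The sketch below shows the general shape of such conversions, assuming the standard TensorFlow Lite and PyTorch Mobile APIs; the file paths, output names, and 224x224 input size are illustrative and may differ from the actual scripts.

```python
import tensorflow as tf
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

# Keras model -> TensorFlow Lite (roughly what to_tflite.py does; path is an assumption)
keras_model = tf.keras.models.load_model('mobilenet_7.h5')
converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
with open('mobilenet_7.tflite', 'wb') as f:
    f.write(converter.convert())

# PyTorch model -> PyTorch Lite interpreter via TorchScript (roughly what to_pytorchlite.py does)
model = torch.load('enet_b0_8_best_afew.pt', map_location='cpu')
model.eval()
scripted = torch.jit.trace(model, torch.rand(1, 3, 224, 224))  # example input size (assumption)
optimized = optimize_for_mobile(scripted)
optimized._save_for_lite_interpreter('enet_b0_8_best_afew.ptl')
```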