[CV_FER] Facial Motion Prior Networks for Facial Expression Recognition
FMPN-FER : Facial Motion Prior Networks for Facial Expression Recognition
FMPN-FER Architecture
- Facial-Motion Mask Generator (FMG)
- Generates a facial mask that focuses on the moving facial-muscle regions
- Uses the average difference between neutral and expressive faces as training guidance (pseudo ground-truth masks)
- Prior Fusion Net (PFN)
- The generated mask is applied to the input expressive face, and the masked face is fused with the original input
- Classification Net (CN)
- Extracts features and predicts the facial expression label (6 classes); see the pipeline sketch below
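Below is a minimal PyTorch sketch of the three-part pipeline described above. The FMG layer sizes, the single grayscale input, and the 1x1-conv fusion are illustrative assumptions (the paper's exact layers differ); only the use of an ImageNet-pretrained Inception V3 as the classification net comes from the notes, and `pseudo_gt_mask` is a hypothetical helper for the mask guidance.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class FMPN(nn.Module):
    def __init__(self, num_classes=6):
        super().__init__()
        # Facial-Motion Mask Generator (FMG): outputs a single-channel mask
        # highlighting the moving facial-muscle regions (layers are illustrative).
        self.fmg = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 3, padding=1), nn.Sigmoid(),
        )
        # Prior Fusion Net (PFN): fuses the masked face with the original face
        # into a 3-channel image for the classifier (simplified to a 1x1 conv).
        self.fusion = nn.Conv2d(2, 3, kernel_size=1)
        # Classification Net (CN): Inception V3 pretrained on ImageNet,
        # final FC layer replaced for 6 expression classes (torchvision >= 0.13).
        self.cls = models.inception_v3(weights="IMAGENET1K_V1")
        self.cls.fc = nn.Linear(self.cls.fc.in_features, num_classes)

    def forward(self, face):
        # face: (B, 1, 299, 299) grayscale expressive face
        mask = self.fmg(face)                       # facial-motion mask in [0, 1]
        masked = face * mask                        # emphasize moving regions
        fused = self.fusion(torch.cat([face, masked], dim=1))
        out = self.cls(fused)
        # In train mode torchvision's Inception V3 returns a namedtuple
        # (logits, aux_logits); in eval mode it returns a plain tensor.
        logits = out.logits if hasattr(out, "logits") else out
        return logits, mask

def pseudo_gt_mask(expressive, neutral):
    """Pseudo ground-truth mask used as FMG training guidance: average absolute
    difference between expressive and neutral faces of the same expression,
    normalized to [0, 1]. The exact averaging/normalization scheme is an assumption."""
    diff = (expressive - neutral).abs().mean(dim=0, keepdim=True)
    return diff / (diff.max() + 1e-8)
```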
Implementation Details
- CN : Inception V3 (pretrained on ImageNet)
- 5 landmarks are extracted, followed by face normalization
- Image Transforms : Random crop from four corners or center & Random horizontal flip
- Training (2 steps)
- Step 1 : Tune only the FMG for 300 epochs, using the Adam optimizer; LR (FMG : 1e-4) linearly decays to 0 starting from epoch 150
- Step 2 : Jointly train the entire framework for 200 epochs with λ1 = 10 and λ2 = 1; LR (FMG : 1e-5, CN : 1e-4) linearly decays from epoch 100
- l_total = λ1 * l_G(MSE) + λ2 * l_C(CE) = 10 * l_G + l_C (see the training sketch below)
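A rough sketch of the described augmentation (random crop from one of the four corners or the center, plus random horizontal flip), assuming torchvision; the resize size and the custom crop class are assumptions, not the paper's released preprocessing code.

```python
import random
import torchvision.transforms as T
import torchvision.transforms.functional as TF

class RandomCornerOrCenterCrop:
    """Randomly crop from one of the four corners or the center."""
    def __init__(self, size):
        self.size = size

    def __call__(self, img):
        w, h = img.size                                       # PIL image: (width, height)
        left, top = random.choice([
            (0, 0), (w - self.size, 0),                       # top corners
            (0, h - self.size), (w - self.size, h - self.size),  # bottom corners
            ((w - self.size) // 2, (h - self.size) // 2),     # center
        ])
        return TF.crop(img, top, left, self.size, self.size)

train_transform = T.Compose([
    T.Resize(320),                        # source size is an assumption
    RandomCornerOrCenterCrop(299),        # 299 matches Inception V3's input size
    T.RandomHorizontalFlip(p=0.5),
    T.ToTensor(),
])
```

And a minimal sketch of the two-step training schedule and the combined loss, reusing the hypothetical `FMPN` module from the earlier sketch. The linear-decay formula, the optimizer grouping, and the `loader` yielding (face, pseudo_mask, label) batches are assumptions, not the authors' code; only the epoch counts, learning rates, and loss weights come from the notes.

```python
import torch.nn as nn
from torch.optim import Adam
from torch.optim.lr_scheduler import LambdaLR

def linear_decay(decay_start, total_epochs):
    """LR multiplier: 1.0 until decay_start, then linear decay to 0 at total_epochs."""
    return lambda e: min(1.0, max(0.0, (total_epochs - e) / (total_epochs - decay_start)))

def pretrain_fmg(model, loader, epochs=300):
    """Step 1: tune only the FMG (Adam, LR 1e-4, decaying to 0 from epoch 150)."""
    mse = nn.MSELoss()
    opt = Adam(model.fmg.parameters(), lr=1e-4)
    sched = LambdaLR(opt, lr_lambda=linear_decay(150, epochs))
    for epoch in range(epochs):
        for face, pseudo_mask, _ in loader:
            mask = model.fmg(face)                   # only the mask generator is needed here
            loss_g = mse(mask, pseudo_mask)          # l_G : mask regression (MSE)
            opt.zero_grad(); loss_g.backward(); opt.step()
        sched.step()

def train_joint(model, loader, epochs=200, lam1=10.0, lam2=1.0):
    """Step 2: jointly train the whole framework (FMG LR 1e-5, rest 1e-4,
    linear decay from epoch 100); l_total = lam1 * l_G + lam2 * l_C."""
    mse, ce = nn.MSELoss(), nn.CrossEntropyLoss()
    opt = Adam([
        {"params": model.fmg.parameters(),    "lr": 1e-5},
        {"params": model.fusion.parameters(), "lr": 1e-4},
        {"params": model.cls.parameters(),    "lr": 1e-4},
    ])
    sched = LambdaLR(opt, lr_lambda=linear_decay(100, epochs))
    for epoch in range(epochs):
        for face, pseudo_mask, label in loader:
            logits, mask = model(face)
            loss = lam1 * mse(mask, pseudo_mask) + lam2 * ce(logits, label)
            opt.zero_grad(); loss.backward(); opt.step()
        sched.step()
```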
Experimental Results
- MMI Facial Expression Database
- Labelled with 6 basic expressions (Disgust > Sadness > Happy > Fear > Surprise > Anger)
- 3 peak frames around center of each labelled sequence are selected → Total : 624 expressive faces
- 10-fold person-independent cross-validation experiments (no subject overlap between folds; see the sketch below)
- Details for MMI
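One way to implement the person-independent 10-fold split is to group samples by subject ID so the same person never appears in both the training and test folds; the sketch below uses scikit-learn's GroupKFold with dummy stand-in data (the sample count and subject count are placeholders, not the actual MMI metadata).

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Dummy stand-ins: 624 samples with 6 expression labels and ~30 subject IDs.
rng = np.random.default_rng(0)
X = np.arange(624).reshape(-1, 1)          # sample indices (placeholder features)
y = rng.integers(0, 6, size=624)           # expression labels (placeholder)
subjects = rng.integers(0, 30, size=624)   # subject ID per sample (placeholder)

gkf = GroupKFold(n_splits=10)
for fold, (train_idx, test_idx) in enumerate(gkf.split(X, y, groups=subjects)):
    # No subject appears in both the training and test split of a fold.
    assert set(subjects[train_idx]).isdisjoint(subjects[test_idx])
    # ...train on train_idx samples, evaluate on test_idx,
    # then average accuracy over the 10 folds.
```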