jeonggg119/DL_paper

[CV_FER] Facial Motion Prior Networks for Facial Expression Recognition

jeonggg119 opened this issue · 0 comments

FMPN-FER : Facial Motion Prior Networks for Facial Expression Recognition

FMPN-FER Architecture

image

  • Facial-Motion Mask Generator (FMG)
    • Generate a facial mask to focus on facial muscle moving regions
    • Use average differences between neutral and expressive faces as training guidance (pseudo ground-truth masks)
  • Prior Fusion Net (PFN)
    • Generated mask is applied to and fused with original input expressive face
  • Classification Net (CN)
    • Extract features and predict facial expression label (6 classes)
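A minimal sketch of the mask idea above, in plain Python. Faces are assumed to be flat lists of grayscale pixel values, already aligned and paired (neutral, expressive). The max-normalization in `pseudo_gt_mask`, the weighted-sum form of `fuse`, and both function names are illustrative assumptions, not details taken from these notes.

```python
def pseudo_gt_mask(neutral_faces, expressive_faces):
    """Average absolute pixel difference over paired faces, scaled to [0, 1].

    Assumption: scaling by the maximum value is one possible normalization.
    """
    n_pixels = len(neutral_faces[0])
    acc = [0.0] * n_pixels
    for neutral, expressive in zip(neutral_faces, expressive_faces):
        for i, (n, e) in enumerate(zip(neutral, expressive)):
            acc[i] += abs(e - n)
    mean_diff = [v / len(neutral_faces) for v in acc]
    peak = max(mean_diff)
    if peak == 0:
        return [0.0] * n_pixels
    return [v / peak for v in mean_diff]


def fuse(face, mask, w_face=1.0, w_masked=1.0):
    """Hypothetical fusion: weighted sum of the face and the mask-weighted face."""
    return [w_face * p + w_masked * (p * m) for p, m in zip(face, mask)]
```

Pixels that change most between neutral and expressive faces get mask values near 1, so `fuse` emphasizes those regions while keeping the original face as context.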

Implementation Details

  • CN : Inception V3 (pretrained on ImageNet)
  • 5 landmarks are extracted, followed by face normalization
  • Image Transforms : Random crop from four corners or center & Random horizontal flip
  • Training (2 steps)
    1. Start by tuning only FMG for 300 epochs, using Adam optimizer
      • LR : 1e-4, linearly decayed to 0 from epoch 150
    2. Jointly training entire framework with λ1 = 10 and λ2 = 1
      • Epoch 200
      • LR linearly decays from epoch 100 (FMG : 1e-5, CN : 1e-4)
      • l_total = λ1 * l_G(MSE) + λ2 * l_C(CE) = 10 * l_G + l_C
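The joint-training schedule and loss above can be sketched as follows. The function names are hypothetical, and the assumption that the rate reaches 0 at the final epoch of the joint stage is not stated in these notes.

```python
def linear_decay_lr(base_lr, epoch, decay_start, total_epochs):
    """Hold base_lr until decay_start, then decay linearly.

    Assumption: the rate reaches 0 exactly at total_epochs.
    """
    if epoch < decay_start:
        return base_lr
    remaining = max(total_epochs - epoch, 0)
    return base_lr * remaining / (total_epochs - decay_start)


def total_loss(loss_g, loss_c, lambda1=10.0, lambda2=1.0):
    """l_total = lambda1 * l_G (MSE) + lambda2 * l_C (CE)."""
    return lambda1 * loss_g + lambda2 * loss_c


# Joint stage: 200 epochs, decay from epoch 100
for epoch in (0, 100, 150, 200):
    lr_fmg = linear_decay_lr(1e-5, epoch, decay_start=100, total_epochs=200)
    lr_cn = linear_decay_lr(1e-4, epoch, decay_start=100, total_epochs=200)
    print(epoch, lr_fmg, lr_cn)
```

With λ1 = 10, the mask regression term is weighted ten times more heavily than the classification term at equal raw loss values.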

Experimental Results

image

  • MMI Facial Expression Database
    • Labelled with 6 basic expressions (Disgust > Sadness > Happy > Fear > Surprise > Anger)
    • 3 peak frames around center of each labelled sequence are selected → Total : 624 expressive faces
    • 10-fold person-independent cross-validation experiments
    • Details for MMI
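A minimal sketch of a person-independent split: every sample of a subject lands in the same fold, so no identity appears in both the training and test sets. The greedy size balancing and the function name are assumptions for illustration; the notes do not say how subjects were assigned to folds.

```python
from collections import defaultdict


def person_independent_folds(subject_ids, n_folds=10):
    """Return n_folds lists of sample indices; each subject appears in one fold only."""
    by_subject = defaultdict(list)
    for idx, subject in enumerate(subject_ids):
        by_subject[subject].append(idx)

    folds = [[] for _ in range(n_folds)]
    # Greedy balancing: largest subjects first, each into the currently smallest fold.
    ordered = sorted(by_subject, key=lambda s: (-len(by_subject[s]), s))
    for subject in ordered:
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(by_subject[subject])
    return folds
```

For fold k, the test set is `folds[k]` and the training set is the union of the other nine folds.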