Facial expression recognition (FER) plays an important role in a variety of real-world applications such as human-computer interaction. POSTER V1 achieves the state-of-the-art (SOTA) performance in FER by effectively combining facial landmark and image features through two-stream pyramid cross-fusion design. However, the architecture of POSTER V1 is undoubtedly complex. It causes expensive computational costs. In order to relieve the computational pressure of POSTER V1, in this paper, we propose POSTER V2. It improves POSTER V1 in three directions: cross-fusion, two-stream, and multi-scale feature extraction. In cross-fusion, we use window-based cross-attention mechanism replacing vanilla cross-attention mechanism. We remove the image-to-landmark branch in the two-stream design. For multi-scale feature extraction, POSTER V2 combines images with landmark's multi-scale features to replace POSTER V1's pyramid design. Extensive experiments on several standard datasets show that our POSTER V2 achieves the SOTA FER performance with the minimum computational cost. For example, POSTER V2 reached 92.21% on RAF-DB, 67.49% on AffectNet (7 cls) and 63.77% on AffectNet (8 cls), respectively, using only 8.4G floating point operations (FLOPs) and 43.7M parameters (Param). This demonstrates the effectiveness of our improvements.
-
Preparing Data
Download the val dataset from baidu disk.
As an example, assume we wish to run RAF-DB. We need to make sure it have a structure like following:
- data/raf-db/ train/ train_00001_aligned.jpg train_00002_aligned.jpg ... valid/ test_0001_aligned.jpg test_0002_aligned.jpg ...
-
Preparing Pretrained Models
The following table provides the pre-trained checkpoints used in this paper. Put entire
pretrain
folder undermodels
folder.pre-trained checkpoint baidu disk codes google drive ir50 download (POST) download mobilefacenet download (POST) download
The following table provides POSTER V2 checkpoints in each dataset.
dataset | top-1 acc | baidu disk | codes | google drive |
---|---|---|---|---|
RAF-DB | 92.21 | download | (POST) | download |
AffectNet (7 cls) | 67.49 | download | (POST) | download |
AffectNet (8 cls) | 63.77 | download | (POST) | download |
CAER-S | 93.00 | download | (POST) | download |
You can evaluate our model on RAF-DB, AffectNet (7 cls) or CAER-S dataset by running:
python main.py --data path/to/dataset --evaluate path/to/checkpoint
You can evaluate our model on AffectNet (8 cls) dataset by running:
python main_8.py --data path/to/dataset --evaluate path/to/checkpoint
You can train POSTER V2 on RAF-DB dataset by running as follows:
python main.py --data path/to/raf-db/dataset --data_type RAF-DB --lr 3.5e-5 --batch-size 144 --epochs 200 --gpu 0
You can train POSTER V2 on AffectNet (7 cls) dataset by running as follows:
python main.py --data path/to/affectnet-7/dataset --data_type AffectNet-7 --lr 1e-6 --batch-size 144 --epochs 200 --gpu 0
You can train POSTER V2 on CAER-S dataset by running as follows:
python main.py --data path/to/caer-s/dataset --data_type CAER-S --lr 4e-5 --batch-size 144 --epochs 200 --gpu 0
You can train POSTER V2 on AffectNet (8 cls) dataset by running as follows:
python main_8.py --data path/to/affectnet-7/dataset --lr 1e-6 --batch-size 144 --epochs 200 --gpu 0
You can continue your training by running:
python main.py --data path/to/dataset --resume checkpoint/to/continue
Our research code is released under the MIT license. See LICENSE for details.
This work was supported by Public-welfare Technology Application Research of Zhejiang Province in China under Grant LGG22F020032, and Key Research and Development Project of Zhejiang Province in China under Grant 2021C03137.
Our implementation and experiments are built on top of open-source GitHub repositories. We thank all the authors who made their code public, which tremendously accelerates our project progress. If you find these works helpful, please consider citing them as well.
JiaweiShiCV/Amend-Representation-Module
If you use this code for your research, please cite our paper POSTER V2: A simpler and stronger facial expression recognition network:
@article{mao2023poster,
title={POSTER V2: A simpler and stronger facial expression recognition network},
author={Mao, Jiawei and Xu, Rui and Yin, Xuesong and Chang, Yuanqi and Nie, Binling and Huang, Aibin},
journal={arXiv preprint arXiv:2301.12149},
year={2023}
}