The audio-video sub-challenge focuses on emotion classification from videos,
with seven basic emotion categories: Angry, Disgust, Fear, Happy, Sad, Surprise, and Neutral.
For more detailed information, please refer to the [EmotiW 2018] website.
We achieved 59.72% accuracy in this challenge, placing 7th out of 32 teams (team name: INHA). The result can be seen at this link.
We use the AFEW dataset, which is used in the EmotiW challenge. You can obtain the AFEW dataset after the administrator's approval on the EmotiW website.
Our method is a combination of 3D CNNs, 2D CNNs, and RNNs (a fusion sketch follows the list below). Some of our networks are based on existing implementations (kenshohara's and titu's work).
- 3D CNN
  - C3D
  - ResNet 3D
  - ResNeXt 3D
- 2D CNN
- RNN
  - LSTM
  - GRU
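The README does not spell out how the branches are combined, so the following is a minimal sketch of one common approach in EmotiW entries: score-level (late) fusion, averaging the softmax probabilities of the 3D CNN, 2D CNN, and RNN branches. The function name, weighting scheme, and branch outputs are assumptions for illustration, not the repository's exact method.

```python
import torch.nn.functional as F

def fuse_predictions(logits_list, weights=None):
    """Average (optionally weighted) softmax scores from several branches.

    logits_list: list of tensors, each of shape (batch, 7) for the
                 seven AFEW emotion classes.
    """
    if weights is None:
        weights = [1.0 / len(logits_list)] * len(logits_list)
    probs = sum(w * F.softmax(l, dim=1) for w, l in zip(weights, logits_list))
    return probs.argmax(dim=1)  # predicted class index per sample

# Usage with hypothetical per-branch outputs:
# pred = fuse_predictions([logits_3dcnn, logits_2dcnn, logits_rnn])
```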
Note: this repository reports accuracy on the validation set only. Because the labels for the AFEW test set are not released, the test accuracy above cannot be reproduced with this code alone.
This code was tested with the following setup.
- Ubuntu 16.04
- Python 3.5
- CUDA 9.0
- PyTorch 0.4.1
Step 1. Clone this repository.
$ git clone https://github.com/lemin/EmotiW-2018.git
$ cd EmotiW-2018
Step 2. Prepare dataset.
- Pre-processing: detect and crop the face in every frame (we use the MTCNN algorithm) and save the result as an `.npz` file.
- Feature extraction: after training DenseNet, we extract a 4069-d feature for every frame and likewise save these features as `.npz` files (see the sketch after this step).
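The following is a hedged sketch of these two pre-processing steps. The MTCNN implementation (`facenet_pytorch` here), the DenseNet variant (`densenet121`, whose 1024-d pooled features differ from the dimension quoted above), the image size, and all paths are assumptions for illustration only; the repository's actual pipeline may differ.

```python
import numpy as np
import torch
import torch.nn.functional as F
from PIL import Image
from facenet_pytorch import MTCNN          # pip install facenet-pytorch
from torchvision import models

mtcnn = MTCNN(image_size=224)              # face detector + cropper (assumed library)

def crop_faces(frame_paths, out_path):
    """Detect and crop the face in every frame, save stacked crops as .npz."""
    faces = []
    for p in frame_paths:
        face = mtcnn(Image.open(p).convert('RGB'))   # cropped face tensor or None
        if face is not None:
            faces.append(face.numpy())
    np.savez(out_path, faces=np.stack(faces))        # (T, 3, 224, 224)

# Frame-wise feature extraction with a DenseNet backbone
# (variant and normalization details are assumptions).
densenet = models.densenet121(pretrained=True).eval()

def extract_features(faces, out_path):
    """Extract one feature vector per frame and save the vectors as .npz."""
    with torch.no_grad():
        x = torch.from_numpy(faces)                          # (T, 3, 224, 224)
        fmap = F.relu(densenet.features(x))                  # conv feature maps
        feat = F.adaptive_avg_pool2d(fmap, 1).flatten(1)     # (T, 1024)
    np.savez(out_path, features=feat.numpy())
```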
Step 3. Train the networks.
$ python main.py
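For orientation, here is a minimal sketch of the RNN branch that consumes the per-frame features saved in Step 2. The class name, layer sizes, optimizer settings, file name, and label index are illustrative assumptions, not the exact configuration in `main.py`.

```python
import numpy as np
import torch
import torch.nn as nn

class FeatureGRU(nn.Module):
    """GRU over per-frame CNN features -> 7-way emotion logits."""
    def __init__(self, feat_dim, hidden=256, num_classes=7):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, x):                  # x: (batch, T, feat_dim)
        _, h = self.rnn(x)                 # h: (1, batch, hidden)
        return self.fc(h[-1])              # logits: (batch, num_classes)

# Hypothetical single training step on one saved clip:
feats = np.load('clip0001.npz')['features']          # (T, feat_dim)
model = FeatureGRU(feat_dim=feats.shape[1])
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

logits = model(torch.from_numpy(feats).float().unsqueeze(0))
loss = criterion(logits, torch.tensor([3]))  # label 3 = Happy, assuming the class order above
optimizer.zero_grad()
loss.backward()
optimizer.step()
```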