Mono- and cross-lingual emotion classification in recorded speech through a convolutional neural network.
Read the paper here
This model was trained and tested on a combined dataset
consisting of the English IEMOCAP
and the French RECOLA datasets.
These datasets must be downloaded manually and processed with openSMILE, using the configuration file in the input folder.
Afterwards, convert them to binaries using the write binaries.py script.
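The conversion step is roughly "parse the extracted feature rows, then pack them into a compact binary form". Below is a minimal sketch of that idea using only the Python standard library. It is not the repository's script. The semicolon-delimited input with a header row, the float32 layout, and the function names `parse_feature_rows`, `pack_rows` and `unpack_rows` are all assumptions made for illustration.

```python
import csv
import io
import struct


def parse_feature_rows(text, delimiter=";"):
    """Parse delimited feature text into a list of float vectors.

    Assumes one header row followed by data rows. Cells that are not
    numeric (for example an instance name) are skipped.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    next(reader, None)  # skip the header row
    rows = []
    for record in reader:
        values = []
        for cell in record:
            try:
                values.append(float(cell))
            except ValueError:
                continue  # non-numeric column, ignore it
        if values:
            rows.append(values)
    return rows


def pack_rows(rows):
    """Pack rows as little-endian float32, prefixed by (n_rows, n_cols)."""
    n_cols = len(rows[0]) if rows else 0
    if any(len(row) != n_cols for row in rows):
        raise ValueError("all rows must have the same number of columns")
    payload = struct.pack("<II", len(rows), n_cols)
    for row in rows:
        payload += struct.pack("<%df" % n_cols, *row)
    return payload


def unpack_rows(payload):
    """Inverse of pack_rows: recover the list of float vectors."""
    n_rows, n_cols = struct.unpack_from("<II", payload, 0)
    offset = struct.calcsize("<II")
    rows = []
    for _ in range(n_rows):
        rows.append(list(struct.unpack_from("<%df" % n_cols, payload, offset)))
        offset += 4 * n_cols
    return rows
```

The fixed-size header makes the file self-describing, so a reader does not need to know the feature count in advance. The format the actual script writes may differ.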
Parameter | Value |
---|---|
Activation Functions | ReLU |
Loss Function | Softmax Cross Entropy |
Optimizer | Adam |
Init. Learning Rate | 0.001 |
Mini-batch size | 50 |
Stride | 3 |
Dropout | 0.5 |
Epochs | 50 |
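To make the table concrete, here is a small, framework-free sketch of the two functions it names (ReLU and softmax cross entropy) together with the listed values collected in a dictionary. The dictionary keys and function names are illustrative and not taken from the notebook, which builds the network in TensorFlow. Whether the dropout value is a keep probability or a drop rate is not stated in the table, so the sketch only records the number.

```python
import math

# Values copied from the parameter table; the key names are hypothetical.
HPARAMS = {
    "learning_rate": 0.001,  # initial learning rate for the optimizer
    "batch_size": 50,
    "stride": 3,
    "dropout": 0.5,          # keep probability vs. drop rate is not specified
    "epochs": 50,
}


def relu(x):
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def softmax_cross_entropy(logits, label):
    """Cross-entropy loss for a single example with an integer class label."""
    return -math.log(softmax(logits)[label])


def steps_per_epoch(n_examples, batch_size=HPARAMS["batch_size"]):
    """Number of mini-batches needed to see every example once."""
    return math.ceil(n_examples / batch_size)
```

With four emotion classes and uniform logits, the loss equals `log(4)`, which is a handy sanity check for an untrained network.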
English test set
Class | Mono-lingual | Multi-lingual | Cross-language |
---|---|---|---|
Sadness | 0.000 | 0.015 | 0.015 |
Anger | 0.019 | 0.014 | 0.014 |
Pleasure | 0.043 | 0.010 | 0.120 |
Joy | 0.942 | 0.985 | 0.864 |
MICRO | 0.421 | 0.432 | 0.405 |
French test set
Class | Mono-lingual | Multi-lingual | Cross-language |
---|---|---|---|
Sadness | 0.070 | 0.000 | 0.230 |
Anger | 0.200 | 0.200 | 0.200 |
Pleasure | 0.350 | 0.035 | 0.357 |
Joy | 0.754 | 0.912 | 0.403 |
MICRO | 0.533 | 0.524 | 0.359 |
You can view the notebook here on GitHub.
- Python 3
- TensorFlow
- Jupyter
Simply open a new terminal in the directory and type:
> jupyter notebook
Make sure you run all code blocks from top to bottom to set up the network.
To test the model, you only need to run the last code block. This will evaluate the model and print the accuracy for each test set.
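For readers who want to reproduce this kind of evaluation outside the notebook, the sketch below computes a per-class accuracy and an overall (micro-averaged) accuracy from lists of true and predicted labels. That the notebook computes its numbers in exactly this way is an assumption. The function names and the toy labels are made up for illustration.

```python
from collections import Counter

# The four emotion classes that appear in the results tables.
CLASSES = ["Sadness", "Anger", "Pleasure", "Joy"]


def per_class_accuracy(y_true, y_pred, classes=CLASSES):
    """Fraction of the examples of each class that were predicted correctly.

    Classes with no examples in y_true get 0.0.
    """
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: (hits[c] / totals[c] if totals[c] else 0.0) for c in classes}


def micro_accuracy(y_true, y_pred):
    """Overall accuracy: correct predictions divided by all predictions."""
    if not y_true:
        return 0.0
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


if __name__ == "__main__":
    # Toy labels, not real model output.
    y_true = ["Joy", "Joy", "Joy", "Anger", "Sadness"]
    y_pred = ["Joy", "Joy", "Anger", "Joy", "Sadness"]
    for name, score in per_class_accuracy(y_true, y_pred).items():
        print(f"{name}: {score:.3f}")
    print(f"MICRO: {micro_accuracy(y_true, y_pred):.3f}")
```

Because the micro average weights every example equally, a frequent class can dominate it even when rarer classes score close to zero.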
- TensorFlow - The framework used to create the model
- Project Jupyter - Nice and easy Python notebooks