Convert VAD to Ekman
mirix opened this issue · 3 comments
Hello,
This model provides VAD values in 3D space.
However, the Ekman model is more intuitive when sharing results with users.
I have found papers with 3D representations hinting at how to perform this conversion.
Are you aware of a straightforward approach to perform the conversion between both models?
Ideally in Python, but any hint on the algorithm would also do.
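For context, one common heuristic in the papers on this topic is nearest-neighbour matching: assign each Ekman category an anchor point in VAD space and pick the closest one. The anchor coordinates below are illustrative assumptions (loosely inspired by published affective-norm tables, normalised to [0, 1]), not values from this model:

```python
import numpy as np

# Illustrative VAD anchors for the six Ekman emotions.
# These values are assumptions for the sketch; tune them to your
# model's output range and to whichever affective-norm table you trust.
EKMAN_VAD = {
    "anger":     (0.167, 0.865, 0.657),
    "disgust":   (0.052, 0.775, 0.317),
    "fear":      (0.073, 0.840, 0.293),
    "happiness": (0.960, 0.732, 0.850),
    "sadness":   (0.052, 0.288, 0.164),
    "surprise":  (0.875, 0.875, 0.562),
}

def vad_to_ekman(valence, arousal, dominance):
    """Map a VAD triple to the nearest Ekman category (Euclidean distance)."""
    point = np.array([valence, arousal, dominance])
    return min(
        EKMAN_VAD,
        key=lambda emotion: np.linalg.norm(point - np.array(EKMAN_VAD[emotion])),
    )

print(vad_to_ekman(0.9, 0.7, 0.8))  # high valence/dominance -> "happiness"
```

A soft variant (e.g. softmax over negative distances) would give a probability per category instead of a hard label.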
Best,
Ed
The VAD model is only fine-tuned on the MSP-Podcast dataset, which has several shortcomings for a full-blown VAD model:
- Podcast recordings most likely do not contain all possible emotions, e.g. fear
- The dominance and arousal annotations show a high correlation, which the model mimics, so we most likely do not cover the 3D VAD space in a meaningful way
With this in mind, I would advise being very careful when trying to map the VAD values to emotional categories.
Another option might be to further fine-tune the model on a database containing the desired emotional categories, or to use the model's embeddings to train a simple classifier on such a database, as we do in the notebook: https://github.com/audeering/w2v2-how-to/blob/main/notebook.ipynb
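The embedding route can be sketched in a few lines: extract embeddings for each clip (the wav2vec 2.0 large model yields 1024-dimensional pooled vectors) and fit a lightweight classifier on top. The arrays below are random placeholders standing in for real embeddings and labels:

```python
# Minimal sketch of training a simple classifier on precomputed
# embeddings, as the linked notebook does. Random data is used here
# as a placeholder for real embeddings/labels from an emotion database.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 1024))  # stand-in for pooled model embeddings
labels = rng.integers(0, 6, size=200)      # stand-in for six Ekman categories

X_train, X_test, y_train, y_test = train_test_split(
    embeddings, labels, test_size=0.25, random_state=0
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.2f}")
```

On random data the accuracy is near chance, of course; with real embeddings and labels the same pipeline gives a usable baseline.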
Thanks a million for the clarifications.
In general, the conversion from VAD to Ekman seems to provide useful results:
https://github.com/mirix/approaches-to-diarisation/tree/main/emotions
However, it is true that fear is never detected.
I will look into which other models are available and pay more attention to which datasets were used to train them.
Hi @hagenw
I have forked MOSEI for SER:
https://huggingface.co/datasets/mirix/messaih
https://github.com/mirix/messaih
Now I will try to train a model and test it in a real-life scenario.