audeering/w2v2-how-to

Convert VAD to Ekman

mirix opened this issue · 3 comments

mirix commented

Hello,

This model provides VAD (valence, arousal, dominance) values in a 3D space.

However, the Ekman model of basic emotions is more intuitive for sharing results with users.

I have found papers with 3D representations hinting at how to perform this conversion.

Are you aware of a straightforward approach to convert between the two models?

Ideally in Python, but any hint on the algorithm would also do.

Best,

Ed

hagenw commented

The VAD model is fine-tuned only on the MSP-Podcast dataset, which has several shortcomings for a full-blown VAD model:

  • Podcast recordings most likely do not contain all possible emotions, e.g. fear
  • The dominance and arousal annotations show a high correlation, which is mimicked by the model; this means we most likely do not cover the 3D VAD space in a meaningful way (see the sketch below)
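
A quick way to check that second point is to correlate the model's arousal and dominance predictions over a batch of recordings. A minimal sketch, assuming you have already collected per-file predictions into an array (the variable names are illustrative, not from this repo):

```python
import numpy as np

# Illustrative placeholder: per-file model outputs collected elsewhere,
# one row per recording, columns = (arousal, dominance, valence).
predictions = np.random.rand(100, 3)  # replace with real model outputs

arousal = predictions[:, 0]
dominance = predictions[:, 1]

# Pearson correlation between the arousal and dominance predictions.
r = np.corrcoef(arousal, dominance)[0, 1]
print(f"arousal/dominance correlation: {r:.2f}")
# A value close to 1 means the model effectively collapses these two
# dimensions and the 3D VAD space is not covered in a meaningful way.
```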

With this in mind, I would advise being very careful when trying to map the VAD values to emotional categories.

Another way might be to further fine-tune the model on a database containing the desired emotional categories, or to use the embeddings of the model to train a simple classifier on such a database, as we do in the notebook: https://github.com/audeering/w2v2-how-to/blob/main/notebook.ipynb
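
For the embedding route, a minimal sketch of the idea (the notebook uses its own model class; here I simply mean-pool the hidden states of the plain transformers Wav2Vec2Model, the regression head of the checkpoint is ignored, and the labelled database is left as a placeholder):

```python
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

name = "audeering/wav2vec2-large-robust-12-ft-emotion-msp-dim"
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(name)
model = Wav2Vec2Model.from_pretrained(name)  # head weights are skipped
model.eval()

def embed(signal: np.ndarray, sampling_rate: int = 16000) -> np.ndarray:
    """Mean-pool the last hidden states into one embedding per recording."""
    inputs = feature_extractor(
        signal, sampling_rate=sampling_rate, return_tensors="pt"
    )
    with torch.no_grad():
        hidden = model(inputs.input_values).last_hidden_state  # (1, time, dim)
    return hidden.mean(dim=1).squeeze(0).numpy()

# Placeholder: load your own labelled database here,
# e.g. a list of 16 kHz signals and their emotion categories.
signals, labels = [], []  # fill with real data

X = np.stack([embed(s) for s in signals])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
```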

mirix commented

@hagenw

Thanks a million for the clarifications.

In general, the conversion from VAD to Ekman seems to provide useful results:

https://github.com/mirix/approaches-to-diarisation/tree/main/emotions
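
Roughly, the kind of mapping I mean looks like the following: place a reference VAD point for each Ekman category and assign the nearest one. A minimal sketch; the coordinates are illustrative values loosely inspired by the Russell and Mehrabian (1977) tables, rescaled to [0, 1], and not the exact values or method from the repo:

```python
import numpy as np

# Illustrative reference points in (valence, arousal, dominance),
# each in [0, 1]; loosely inspired by Russell & Mehrabian (1977),
# not calibrated to the outputs of this model.
EKMAN_REFERENCES = {
    "anger":     (0.17, 0.86, 0.70),
    "disgust":   (0.20, 0.70, 0.60),
    "fear":      (0.15, 0.80, 0.25),
    "happiness": (0.90, 0.75, 0.70),
    "sadness":   (0.15, 0.30, 0.25),
    "surprise":  (0.70, 0.85, 0.45),
}

def vad_to_ekman(vad):
    """Return the Ekman category whose reference point is closest in VAD space."""
    vad = np.asarray(vad)
    distances = {
        label: np.linalg.norm(vad - np.asarray(ref))
        for label, ref in EKMAN_REFERENCES.items()
    }
    return min(distances, key=distances.get)

print(vad_to_ekman((0.8, 0.7, 0.6)))  # -> "happiness"
```

A soft assignment (e.g. a softmax over negative distances) would preserve ambiguity between neighbouring categories instead of forcing a hard label.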

However, it is true that fear is never detected.

I will see what other models are available and pay more attention to which datasets were used.

mirix commented

Hi @hagenw

I have forked MOSEI for SER (speech emotion recognition):

https://huggingface.co/datasets/mirix/messaih

https://github.com/mirix/messaih

Now I will try to train a model and test it in a real-life scenario.