A cross-lingual speech emotion recognition (SER) system that learns language-invariant representations without requiring labels for target-language data.

Primary language: Python. License: MIT.

Cross-lingual-SER

Speech emotion recognition (SER) aims to identify human emotions from speech, and it is useful for automating many real-life applications. Conventional SER approaches train and test their classifiers on the same corpus, so they do not generalize to multilingual environments. We propose a cross-lingual SER system that learns language-invariant representations without requiring labels for the target-language data.

Dataset

Download the dataset from https://drive.google.com/drive/folders/1Geh-LKCutz_9rbQW8ZL8-i9RWdsuh_Mh?usp=sharing and place it in Dataset/ESD/ESD_preprocessed/ (relative to the repository root).
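
To confirm the download landed in the expected place, a small check like the one below may help. This is only a sketch and not part of the repository: the function name check_dataset is hypothetical, and it assumes the script is run from the repository root. It makes no assumption about the files inside the folder beyond there being at least one.

```python
from pathlib import Path

# Directory named in the instructions above, relative to the repository root.
DATASET_DIR = Path("Dataset/ESD/ESD_preprocessed")


def check_dataset(root: Path = DATASET_DIR) -> int:
    """Return the number of files under root (hypothetical helper).

    Raises FileNotFoundError if the directory is missing or contains no files.
    """
    if not root.is_dir():
        raise FileNotFoundError(f"Expected dataset directory not found: {root}")
    # Count regular files at any depth; the internal layout is not assumed.
    n_files = sum(1 for p in root.rglob("*") if p.is_file())
    if n_files == 0:
        raise FileNotFoundError(f"Dataset directory is empty: {root}")
    return n_files
```

Calling check_dataset() from the repository root returns the file count when the data is in place and raises FileNotFoundError otherwise.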