SER CNN Group Project

Overview

This project covered the research, development, implementation, and evaluation of a 5-layer convolutional neural network (CNN) for speech emotion recognition (SER). The work was conducted to satisfy final project requirements and to explore the team's interest in SER.
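For orientation, a 5-layer CNN of this kind could be sketched in PyTorch as follows. This is only an illustration under stated assumptions, not the project's architecture: the class name `SerCnn`, the split into four convolutional blocks plus one linear layer, the channel widths, the log-mel spectrogram input shape, and the six emotion classes are all hypothetical.

```python
import torch
from torch import nn


class SerCnn(nn.Module):
    """Illustrative 5-layer CNN: four conv blocks plus one linear classifier.

    Layer counts, channel widths, and n_classes are assumptions, not the
    project's actual configuration.
    """

    def __init__(self, n_classes: int = 6):
        super().__init__()
        channels = [1, 16, 32, 64, 128]
        blocks = []
        for c_in, c_out in zip(channels[:-1], channels[1:]):
            blocks += [
                nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                nn.BatchNorm2d(c_out),
                nn.ReLU(),
                nn.MaxPool2d(2),  # halves both spectrogram dimensions
            ]
        self.features = nn.Sequential(*blocks)
        self.pool = nn.AdaptiveAvgPool2d(1)  # makes the head input-size agnostic
        self.classifier = nn.Linear(channels[-1], n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        x = self.pool(x).flatten(1)  # (batch, 128)
        return self.classifier(x)    # unnormalized class scores (logits)


model = SerCnn(n_classes=6)
# Assumed input layout: (batch, channel, mel bins, time frames)
batch = torch.randn(8, 1, 64, 128)
logits = model(batch)
```

The adaptive pooling layer is a design choice made here so that clips of different lengths map to the same classifier input size; a fixed-length input with a flatten layer would be an equally plausible alternative.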

ML Libraries and Frameworks

PyTorch
Librosa
Scikit-learn

Scope

The main scope areas of the project are:

preliminary literature search
data collection
data exploration and assessment
pre-processing
model development and training
model investigations
discussion of results

Datasets

This study uses four popular datasets (CREMA-D, TESS, RAVDESS, and SAVEE) and applies several data augmentations to balance and expand the data.
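The specific augmentations are not listed here, so the following is only a sketch of two common waveform augmentations, additive noise and time shifting, that are often used to expand speech datasets. The function names `add_noise` and `time_shift`, the 20 dB signal-to-noise ratio, and the synthetic sine wave standing in for a speech clip are all assumptions made for illustration.

```python
import numpy as np


def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise at the given signal-to-noise ratio (dB)."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def time_shift(signal, max_shift, rng):
    """Circularly shift the waveform by a random number of samples."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(signal, shift)


rng = np.random.default_rng(0)
sr = 16000                                 # assumed sample rate in Hz
t = np.arange(sr) / sr
wave = 0.5 * np.sin(2 * np.pi * 220 * t)   # stand-in for a 1-second clip

noisy = add_noise(wave, snr_db=20, rng=rng)
shifted = time_shift(wave, max_shift=1600, rng=rng)
```

To balance classes with such functions, one option is to generate extra augmented copies only for the under-represented emotion labels until the class counts are roughly equal.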

Model Development and Testing

Training, validation, and testing were carried out on the combined and individual datasets, and the results were evaluated and discussed. The final model reached an accuracy of 48% on the test data. Conclusions are drawn on the effectiveness of the different augmentations and datasets, and on how they could be used more effectively in future models.
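As a rough sketch of how a stratified train/validation/test split and a reference accuracy could be set up with scikit-learn, consider the example below. The 70/15/15 split ratio, the six balanced classes, and the random placeholder features are assumptions for illustration and are not taken from the project.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, n_classes = 600, 6              # assumed sizes, for illustration
features = rng.normal(size=(n_samples, 40))
labels = np.repeat(np.arange(n_classes), n_samples // n_classes)

# First hold out 30% of the data, then split that half-and-half into
# validation and test sets (70/15/15 overall), keeping class proportions.
x_train, x_rest, y_train, y_rest = train_test_split(
    features, labels, test_size=0.30, stratify=labels, random_state=0
)
x_val, x_test, y_val, y_test = train_test_split(
    x_rest, y_rest, test_size=0.50, stratify=y_rest, random_state=0
)

# Majority-class baseline: always predict the most frequent training label.
majority = np.bincount(y_train).argmax()
baseline_pred = np.full_like(y_test, majority)
baseline_acc = accuracy_score(y_test, baseline_pred)
```

A baseline like this gives a reference point for reading a test accuracy such as the 48% reported above: with balanced classes it sits near one over the number of classes, so the gap between the two figures indicates how much the model has learned beyond guessing.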
