Primary Language: Jupyter Notebook

SER CNN Group Project

Overview

This project covered the research, development, implementation, and evaluation of a 5-layer convolutional neural network (CNN) for speech emotion recognition (SER). The work was carried out to satisfy final project requirements and to explore the team's interest in SER.
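To make "5-layer CNN" concrete, here is a minimal PyTorch sketch. It is not the project's actual architecture: the class name `SerCnn`, the reading of "5 layers" as four convolutional blocks plus one linear classifier, the channel widths, the 7 emotion classes, and the 64 x 128 spectrogram input are all assumptions made for illustration.

```python
import torch
from torch import nn


class SerCnn(nn.Module):
    """Hypothetical 5-layer CNN: four conv blocks plus one linear classifier."""

    def __init__(self, n_classes: int = 7):  # class count is an assumption
        super().__init__()
        channels = [1, 16, 32, 64, 128]  # assumed channel widths
        blocks = []
        for c_in, c_out in zip(channels[:-1], channels[1:]):
            blocks += [
                nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                nn.BatchNorm2d(c_out),
                nn.ReLU(),
                nn.MaxPool2d(2),  # halves both spectrogram dimensions
            ]
        self.features = nn.Sequential(*blocks)
        self.pool = nn.AdaptiveAvgPool2d(1)  # collapses to one value per channel
        self.classifier = nn.Linear(channels[-1], n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        x = self.pool(x).flatten(1)
        return self.classifier(x)  # raw logits, one per emotion class


model = SerCnn(n_classes=7)
# Batch of 8 single-channel inputs; 64 mel bands x 128 frames is an assumed size.
dummy = torch.randn(8, 1, 64, 128)
logits = model(dummy)
print(logits.shape)
```

The adaptive pooling layer keeps the classifier independent of clip length, which may be convenient when the datasets contain recordings of different durations.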

ML Libraries and Frameworks

  • PyTorch
  • Librosa
  • Scikit-learn

Scope

Main scope areas of the project include:

  • preliminary literature search
  • data collection
  • data exploration and assessment
  • pre-processing
  • model development and training
  • model investigations
  • discussion of results

Datasets

This study uses four popular datasets (CREMA-D, TESS, RAVDESS, and SAVEE), with several data augmentations applied to balance and expand the data.
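As an illustration of how augmentation can both balance and expand audio data, the following NumPy sketch oversamples under-represented classes with noisy, time-shifted copies. The specific augmentations (additive noise, circular time shift), the function names, and the parameter values are assumptions; the augmentations the project actually used are not listed here.

```python
import numpy as np

rng = np.random.default_rng(0)


def add_noise(signal, snr_db=20.0):
    """Add white Gaussian noise at roughly the given signal-to-noise ratio."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def time_shift(signal, max_fraction=0.1):
    """Circularly shift the waveform by up to max_fraction of its length."""
    limit = int(len(signal) * max_fraction)
    shift = int(rng.integers(-limit, limit + 1))
    return np.roll(signal, shift)


def balance_with_augmentation(clips, labels):
    """Add augmented copies until every class matches the largest class."""
    out_clips, out_labels = list(clips), list(labels)
    counts = {lab: labels.count(lab) for lab in set(labels)}
    target = max(counts.values())
    for lab, count in counts.items():
        originals = [c for c, l in zip(clips, labels) if l == lab]
        for _ in range(target - count):
            source = originals[rng.integers(len(originals))]
            out_clips.append(time_shift(add_noise(source)))
            out_labels.append(lab)
    return out_clips, out_labels


# Toy data: a 1-second 220 Hz tone at an assumed 16 kHz sample rate.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 220 * t)
clips = [tone, 0.5 * tone, 0.8 * tone]
labels = ["happy", "happy", "sad"]

aug_clips, aug_labels = balance_with_augmentation(clips, labels)
print(aug_labels.count("happy"), aug_labels.count("sad"))
```

Augmented copies should be generated from training clips only, so that no variant of a validation or test recording leaks into training.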

Model Development and Testing

Training, validation, and testing were carried out on both the combined and the individual datasets, and the results were evaluated and discussed. The final model reached 48% accuracy on the test data. Conclusions are drawn about the effectiveness of the different augmentations and datasets, and about how they could be used more effectively in future models.
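The train/validation/test workflow can be sketched with scikit-learn as below. This is a hedged example rather than the project's pipeline: the 70/15/15 split ratio, the placeholder features and labels, and the random stand-in predictions are all assumptions, and the printed accuracy says nothing about the real model.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder data: 200 clips, 40 features each, 4 balanced classes (all assumed).
features = rng.normal(size=(200, 40))
labels = np.repeat(np.arange(4), 50)

# First split off 30% of the data, stratified so class proportions are kept.
x_train, x_rest, y_train, y_rest = train_test_split(
    features, labels, test_size=0.30, stratify=labels, random_state=0
)
# Then halve the held-out part into validation and test sets (15% each).
x_val, x_test, y_val, y_test = train_test_split(
    x_rest, y_rest, test_size=0.50, stratify=y_rest, random_state=0
)

# Stand-in for model predictions; a trained model would produce these instead.
y_pred = rng.integers(0, 4, size=len(y_test))
test_accuracy = accuracy_score(y_test, y_pred)

print(len(y_train), len(y_val), len(y_test))
print(f"test accuracy: {test_accuracy:.2%}")
```

Running the same split and metric once on the combined data and once per dataset gives directly comparable accuracy figures.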