Speech Emotion Recognition with the RAVDESS Dataset

License: MIT

Overview

This repository contains code and resources for Speech Emotion Recognition (SER) using two kinds of model: a feedforward neural network and a Long Short-Term Memory (LSTM) network. Both are trained and evaluated on the RAVDESS dataset.

Dataset

Please download the dataset and place it in the appropriate directory before running the code.
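
As a rough sketch of how labels might be recovered once the files are in place: assuming the dataset is RAVDESS and the files keep its standard seven-field names (for example 03-01-06-01-02-01-12.wav, where the third field is the emotion code), the label can be read from the filename. The function name emotion_from_filename is hypothetical and is not taken from this repository's scripts.

```python
from pathlib import Path

# Emotion codes from the RAVDESS filename convention (third field).
EMOTIONS = {
    "01": "neutral",
    "02": "calm",
    "03": "happy",
    "04": "sad",
    "05": "angry",
    "06": "fearful",
    "07": "disgust",
    "08": "surprised",
}


def emotion_from_filename(path):
    """Return the emotion label encoded in a RAVDESS-style filename.

    Hypothetical helper: assumes the unmodified seven-field naming scheme.
    """
    fields = Path(path).stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"unexpected filename format: {path}")
    return EMOTIONS[fields[2]]


if __name__ == "__main__":
    print(emotion_from_filename("Actor_12/03-01-06-01-02-01-12.wav"))
```

If the files have been renamed or reorganized, this mapping will not apply and the labels would need to come from wherever the training scripts expect them.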

Model Architecture

We have implemented two main models for SER:

  1. Neural Network (NN) Model: A feedforward neural network designed for SER.

  2. LSTM Model: A Long Short-Term Memory (LSTM) model tailored for sequence data in SER.

You can find the code for these models in their respective directories.

Usage

Follow these steps to run the code:

  1. Install the required dependencies by running: pip install -r requirements.txt

  2. Organize the dataset as described in the Dataset section.

  3. Train and test the models by running the respective scripts.

  4. Evaluate model performance and visualize the results.
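
To illustrate the evaluation step, here is a minimal sketch of how a confusion matrix and overall accuracy could be computed from true and predicted labels. It uses plain Python only; the function names and the toy labels are assumptions made for illustration and do not reflect how the repository's scripts actually compute their results.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count predictions per class: rows are true labels, columns are predicted."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, pred_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[pred_label]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Toy labels for illustration only, not results from this project.
    labels = ["happy", "sad"]
    y_true = ["happy", "happy", "sad", "sad"]
    y_pred = ["happy", "sad", "sad", "sad"]
    cm = confusion_matrix(y_true, y_pred, labels)
    print(cm)
    print(accuracy(cm))
```

Reading the matrix row by row shows which emotions a model confuses with one another, which a single accuracy figure hides.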

Results

We have obtained the following results:

  • Confusion matrix, Neural Network: (image)
  • Confusion matrix, LSTM: (image)