
This repository performs emotion detection (Happy, Neutral, Anger, Disgust, Fear, Sad) on speech. It uses the popular Crema dataset from Speech Emotion Recognition (en), which contains 7,442 original clips from 91 actors (48 male and 43 female) of a wide range of ages, races and ethnicities.

Primary language: Jupyter Notebook. License: MIT.

Speech Emotion Recognition

Speech is the most natural way for humans to express themselves, so it is natural to extend this communication medium to computer applications. Speech emotion recognition (SER) systems are collections of methodologies that process and classify speech signals to detect the emotions embedded in them. SER is not a new field: it has been around for over two decades and has regained attention thanks to recent advancements.

  • This is my first attempt at audio classification on Colab, using the Crema dataset described above.

  • The actors spoke from a selection of 12 sentences, each presented using one of six emotions (anger, disgust, fear, happiness, neutral and sadness).
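
The dataset structure described above can be read straight from the clip names. The following is a minimal sketch, assuming the clips keep the standard Crema file naming of `ActorID_Sentence_Emotion_Level.wav` (for example `1001_DFA_ANG_XX.wav`). The helper names `parse_clip_name` and `count_emotions` are hypothetical and are not taken from this repository's notebook.

```python
from collections import Counter
from pathlib import Path

# Emotion codes assumed to appear in the third field of each clip name,
# mapped to the six labels used in this README.
EMOTION_CODES = {
    "ANG": "anger",
    "DIS": "disgust",
    "FEA": "fear",
    "HAP": "happy",
    "NEU": "neutral",
    "SAD": "sad",
}


def parse_clip_name(filename):
    """Split a name such as '1001_DFA_ANG_XX.wav' into its four fields."""
    parts = Path(filename).stem.split("_")
    if len(parts) != 4:
        raise ValueError(f"unexpected clip name: {filename!r}")
    actor_id, sentence, emotion_code, level = parts
    if emotion_code not in EMOTION_CODES:
        raise ValueError(f"unknown emotion code: {emotion_code!r}")
    return {
        "actor_id": int(actor_id),
        "sentence": sentence,
        "emotion": EMOTION_CODES[emotion_code],
        "level": level,
    }


def count_emotions(filenames):
    """Count how many clips carry each emotion label."""
    return Counter(parse_clip_name(name)["emotion"] for name in filenames)
```

Calling `count_emotions` on the list of `.wav` names in the dataset folder gives a quick check of class balance before training.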

Examples of Audio Samples

  • Happy 😄:
    Happy Sample

  • Fear 😨:
    Fear Sample

  • Neutral 😐:
    Neutral Sample

  • Anger 😡:
    Anger Sample

  • Disgust 🥴:
    Disgust Sample

  • Sad ☹️:
    Sad Sample

Dataset: Sample Data

Model Evaluation:

Heatmap Evaluation
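
A heatmap evaluation of a classifier is commonly a plot of its confusion matrix. The sketch below assumes that is what the heatmap here shows and builds such a matrix in plain Python. The label order and the function names `confusion_matrix` and `row_normalize` are illustrative choices, not the notebook's actual code.

```python
# Assumed label order for the six emotions; rows are true labels,
# columns are predicted labels.
LABELS = ["anger", "disgust", "fear", "happy", "neutral", "sad"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Count how often each true label was predicted as each label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[predicted_label]] += 1
    return matrix


def row_normalize(matrix):
    """Convert counts to per-class fractions; empty rows stay at zero."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([count / total if total else 0.0 for count in row])
    return normalized
```

The row-normalized matrix can be passed to any plotting library's heatmap function. Its diagonal holds the per-emotion recall, which makes it easy to see which emotions the model confuses.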




Author:

Hey, this is Hrugved Kolhe.

