/emotion2vec-speech-emotion-detection-api

This API uses a pre-trained model to recognize emotion in audio files. It accepts an audio file as input, runs it through the model, and returns the predicted emotion along with a confidence score. It is built with the FastAPI framework.


Speech Emotion Recognition API - IEMOCAP dataset with good accuracy

Requires at least 1.5 GB of memory to run reliably.

This FastAPI application provides an endpoint for performing emotion recognition on audio files.

Installation

  1. Create a virtual environment and activate it:

    python -m venv env
    # Windows
    .\env\Scripts\activate
    # Linux/Mac
    source env/bin/activate

  2. Install dependencies:

    pip install -r requirements.txt

Usage

To start the FastAPI server, run the following command:

    uvicorn app:app --reload

This starts the server on http://localhost:8000 with auto-reload enabled. The interactive API documentation is available at http://localhost:8000/docs.
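
To confirm the server is up and see which routes it exposes, one option is to read the OpenAPI schema that FastAPI serves by default at /openapi.json. The sketch below assumes uvicorn's default host and port and that the schema URL has not been changed in the app; `fetch_schema` and `list_routes` are illustrative names, not part of this project.

```python
import json
from urllib.request import urlopen

# Assumption: the server runs on uvicorn's default host and port.
BASE_URL = "http://localhost:8000"

HTTP_METHODS = {"get", "post", "put", "delete", "patch", "head", "options", "trace"}


def fetch_schema(base_url: str = BASE_URL) -> dict:
    """Download the OpenAPI schema (FastAPI's default path is /openapi.json)."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as resp:
        return json.load(resp)


def list_routes(schema: dict) -> list[tuple[str, str]]:
    """Return sorted (METHOD, path) pairs found in an OpenAPI schema."""
    routes = []
    for path, operations in schema.get("paths", {}).items():
        for key in operations:
            # Path items may carry non-method keys such as "parameters".
            if key in HTTP_METHODS:
                routes.append((key.upper(), path))
    return sorted(routes)


# Usage, with the server running:
#     for method, path in list_routes(fetch_schema()):
#         print(method, path)
```

If the server is running as described, the POST route for /emotion_recognition should appear in the listing.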

Emotion Recognition Endpoint

To run emotion recognition on an audio file, send a POST request to /emotion_recognition with the file attached as multipart form data in the audio_file field. The API returns the detected emotion and its confidence score.

Example:

    curl -X POST -F "audio_file=@/path/to/audio/file.wav" http://localhost:8000/emotion_recognition
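
The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the default host and port, and that the endpoint replies with JSON (FastAPI's default response type); the exact keys in the reply are not documented here, so the sketch returns the decoded reply as-is. `build_multipart` and `recognize_emotion` are hypothetical helper names.

```python
import json
import mimetypes
import uuid
from pathlib import Path
from urllib.request import Request, urlopen

# Assumption: the server runs on uvicorn's default host and port.
ENDPOINT = "http://localhost:8000/emotion_recognition"


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, Content-Type value)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def recognize_emotion(path: str, url: str = ENDPOINT) -> dict:
    """POST the file in the audio_file field and return the decoded JSON reply."""
    audio = Path(path)
    body, content_type = build_multipart("audio_file", audio.name, audio.read_bytes())
    request = Request(url, data=body, headers={"Content-Type": content_type}, method="POST")
    with urlopen(request, timeout=60) as resp:
        # Assumption: the endpoint returns a JSON object.
        return json.load(resp)


# Usage, with the server running:
#     print(recognize_emotion("/path/to/audio/file.wav"))
```

The form field name audio_file matches the one used in the curl example; the server will not find the upload if a different name is sent.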