RagaDetector

Primary language: Jupyter Notebook. License: Apache-2.0

Raga Detection Using Machine Learning

This repository contains the code for an end-to-end model for raga and tonic identification on audio samples.

Note: this repository contains only the inference code. The training code, along with a lot of experimental code, is available at https://github.com/VishwaasHegde/E2ERaga, but that repository is not well maintained.

A paper describing this work has been published: https://aimc2023.pubpub.org/pub/j9v30p0j/release/1

"Please cite the below if you are using this for your work:

Narasinh, V., & Raja, S. (2023). Sequential Pitch Distributions for Raga Detection. AIMC 2023. Retrieved from https://aimc2023.pubpub.org/pub/j9v30p0j"

Getting Started

Requires Python 3.7.16

Download and install Anaconda for easier package management.

Install the dependencies by running pip install -r requirements.txt

Model

  1. Create an empty folder called model inside the SPD_KNN folder
  2. Download the pitch model from here and place it in the model folder
  3. Download the tonic models (Hindustani and Carnatic) from here and place them in the model folder
  4. Download the Carnatic raga models from here and place them in data/RagaDataset/Carnatic/model (create the intermediate folders if they do not exist)
  5. Download the Hindustani raga models from here and place them in data/RagaDataset/Hindustani/model (create the intermediate folders if they do not exist)
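The folder layout from the steps above can be created in one go with a short script run from the repository root (a minimal sketch; the paths are exactly those listed in the steps):

```python
import os

# Create the empty folders the pretrained models will be placed into
for d in [
    "SPD_KNN/model",
    "data/RagaDataset/Carnatic/model",
    "data/RagaDataset/Hindustani/model",
]:
    os.makedirs(d, exist_ok=True)  # no error if a folder already exists
```

Using exist_ok=True makes the script safe to re-run after some of the folders have already been created by hand.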

Data

  1. I do not have permission to redistribute the datasets; they must be requested from https://compmusic.upf.edu/node/328

Run Time Input

E2ERaga supports audio samples recorded live at runtime.

Steps to run:

  1. Run the command python main.py --runtime=True --tradition=h --duration=30
  2. Choose the tradition with --tradition (h for Hindustani, c for Carnatic) and set the recording length in seconds with --duration
  3. Once the command starts, you will see the prompt: Press 1 to start recording or press 0 to exit:
  4. Enter 1 to start recording; audio is captured for the specified duration
  5. The model then outputs the raga label and the tonic
  6. The tonic can optionally be supplied with --tonic=D to specify the pitch D as the tonic
  7. I recommend providing the tonic explicitly via --tonic whenever possible, as the tonic detection model is not very accurate

File input

E2ERaga also accepts pre-recorded audio files supplied at runtime.

Steps to run:

  1. Run the command python main.py --runtime_file=<audio_file_path> --tradition=<h/c>

    Example: python main.py --runtime_file=data/sample_data/Jog_0.wav --tradition=h

  2. The model supports wav and mp3 files; with mp3 there is an extra delay because the file is first converted to wav internally

  3. The model then outputs the raga label and the tonic frequency
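If you just want to sanity-check the file-input path without a real recording, a short wav file can be generated with Python's standard library. This is a minimal sketch; the filename test_tone.wav and the tone parameters are arbitrary, and a pure sine tone will of course not produce a meaningful raga label:

```python
import math
import struct
import wave

# Write a 2-second 440 Hz sine tone as a 16-bit mono wav file
sr = 44100  # sample rate in Hz
frames = b"".join(
    struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * 440 * i / sr)))
    for i in range(2 * sr)
)
with wave.open("test_tone.wav", "wb") as f:
    f.setnchannels(1)   # mono
    f.setsampwidth(2)   # 16-bit samples
    f.setframerate(sr)
    f.writeframes(frames)
```

The resulting file can then be passed in via --runtime_file=test_tone.wav to confirm the pipeline runs end to end.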

Demo videos:

Live Raga Prediction

Demo

Hindustani Raga Embedding cosine similarity obtained from the model


Carnatic Raga Embedding cosine similarity obtained from the model


Acknowledgments:

  1. The model uses CREPE to extract the pitch track from the audio; I would like to thank Jong Wook Kim for clarifying my questions
  2. Thanks also to CompMusic and Sankalp Gulati for providing the datasets