
CRETORA

This repository contains code for an end-to-end model for raga and tonic identification on audio samples.

Note: This repository currently contains only inference code. The training code and additional experimental code can be found at https://github.com/VishwaasHegde/E2ERaga; however, that repository is not well maintained.

Getting Started

Requires python==3.6.9

Download and install Anaconda for easier package management

Install the requirements by running pip install -r requirements.txt

Model

  1. Create an empty folder called model and place it inside the CRETORA folder
  2. Download the models (model-full.h5, hindustani_tonic_model.hdf5, hindustani_raga_model.hdf5, carnatic_tonic_model.hdf5, carnatic_raga_model.hdf5) from here and place them in the model folder
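
As a quick sanity check before running inference, you can verify that all five files landed in the expected folder. The snippet below is not part of the repository; it is just an illustrative helper and assumes the folder is named model and sits next to main.py.

    import os

    # Expected model files listed in step 2 above; adjust MODEL_DIR if your
    # layout differs (this helper is illustrative, not part of the repository).
    MODEL_DIR = "model"
    EXPECTED = [
        "model-full.h5",
        "hindustani_tonic_model.hdf5",
        "hindustani_raga_model.hdf5",
        "carnatic_tonic_model.hdf5",
        "carnatic_raga_model.hdf5",
    ]

    missing = [f for f in EXPECTED if not os.path.isfile(os.path.join(MODEL_DIR, f))]
    print("All model files found" if not missing else "Missing: " + ", ".join(missing))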

Data

  1. I don't have permission to upload the datasets; they have to be obtained by request from here: https://compmusic.upf.edu/node/328

Run Time Input

E2ERaga supports audio samples which can be recorded at runtime

Steps to run:

  1. Run the command python main.py --runtime=True --tradition=h --duration=30
  2. You can change the tradition (Hindustani or Carnatic) by passing h or c, and set the recording duration in seconds
  3. Once you run this command, there will be a prompt: Press 1 to start recording or press 0 to exit
  4. Enter accordingly; if you press 1, recording starts for the specified duration (a standalone recording sketch follows these steps)
  5. After this, the raga label and the tonic are output
  6. The tonic can also optionally be given with --tonic=D to specify the pitch D as the tonic
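
The recording step can also be reproduced on its own with a few lines of Python, which may help when debugging microphone issues. This is only a sketch, not the repository's code; it assumes the sounddevice and scipy packages and a 16 kHz mono format, which may differ from what main.py actually uses internally.

    import sounddevice as sd
    from scipy.io import wavfile

    SR = 16000       # assumed sample rate; main.py may use a different one
    DURATION = 30    # seconds, matching --duration=30 above

    print(f"Recording for {DURATION} seconds...")
    audio = sd.rec(int(DURATION * SR), samplerate=SR, channels=1)
    sd.wait()                                  # block until recording finishes
    wavfile.write("runtime_sample.wav", SR, audio)
    print("Saved runtime_sample.wav")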

File input

E2ERaga supports recorded audio samples which can be provided at runtime

Steps to run:

  1. Run the command python main.py --runtime_file=<audio_file_path> --tradition=<h/c>

    Example: python test_sample.py --runtime_file=data/sample_data/Ahira_bhairav_27.wav --tradition=h

  2. The model supports wav and mp3 files; with mp3, there is a delay while the file is converted to wav internally (a conversion sketch follows these steps)

  3. After this, the raga label and the tonic frequency are output
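
If you want to avoid the internal mp3-to-wav conversion delay mentioned in step 2, you can convert the file yourself beforehand. The sketch below uses pydub as an example converter (an assumption; the repository may convert differently, and pydub requires ffmpeg to be installed). The file path, sample rate, and channel count are illustrative, not requirements stated by the repository.

    from pydub import AudioSegment

    # Convert an mp3 to a mono 16 kHz wav before passing it via --runtime_file
    audio = AudioSegment.from_mp3("data/sample_data/my_recording.mp3")
    audio = audio.set_channels(1).set_frame_rate(16000)
    audio.export("data/sample_data/my_recording.wav", format="wav")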

Demo videos:

Live Raga Prediction for Raga: Miyan Malhar

(Video on YouTube)

Live Raga Prediction for Raga: Des

(Video on YouTube)

Hindustani Raga Embedding cosine similarity obtained from the model

Carnatic Raga Embedding cosine similarity obtained from the model

Acknowledgments:

  1. The model uses CREPE to estimate the pitch contour of the audio; I would like to thank Jong Wook Kim for clarifying my questions (a minimal CREPE usage sketch follows)
  2. I would also like to thank CompMusic and Sankalp Gulati for providing the datasets
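
For context, CREPE exposes a small Python API for pitch tracking. The sketch below shows standalone CREPE usage only; it is independent of how this repository wires CREPE into its raga and tonic models.

    import crepe
    from scipy.io import wavfile

    # Estimate the pitch contour of a wav file with CREPE.
    sr, audio = wavfile.read("data/sample_data/Ahira_bhairav_27.wav")
    time, frequency, confidence, activation = crepe.predict(audio, sr, viterbi=True)

    # Print the first few pitch estimates (time in seconds, f0 in Hz).
    for t, f, c in zip(time[:5], frequency[:5], confidence[:5]):
        print(f"t={t:.2f}s  f0={f:.1f}Hz  conf={c:.2f}")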