
CRETORA

This repository contains code for an end-to-end model for raga and tonic identification on audio samples.

Note: This repository currently contains only inference code. The training code and additional experimental code can be found at https://github.com/VishwaasHegde/E2ERaga; however, that repository is not well maintained.

Getting Started

Requires python==3.6.9

Download and install Anaconda for easier package management

Install the requirements by running pip install -r requirements.txt

Model

  1. Create an empty folder called model and place it inside the CRETORA folder
  2. Download the models (model-full.h5, hindustani_tonic_model.hdf5, hindustani_raga_model.hdf5, carnatic_tonic_model.hdf5, carnatic_raga_model.hdf5) from here and place them in the model folder
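
As a quick sanity check before running inference, you can verify that all five files landed in the expected folder. The snippet below is not part of the repository; it is just an illustrative helper and assumes the folder is named model and sits next to main.py.

    import os

    # Expected model files listed in step 2 above; adjust MODEL_DIR if your
    # layout differs (this helper is illustrative, not part of the repository).
    MODEL_DIR = "model"
    EXPECTED = [
        "model-full.h5",
        "hindustani_tonic_model.hdf5",
        "hindustani_raga_model.hdf5",
        "carnatic_tonic_model.hdf5",
        "carnatic_raga_model.hdf5",
    ]

    missing = [f for f in EXPECTED if not os.path.isfile(os.path.join(MODEL_DIR, f))]
    print("All model files found" if not missing else "Missing: " + ", ".join(missing))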

Data

  1. I don't have permission to upload the datasets; they have to be obtained by request from here: https://compmusic.upf.edu/node/328

Run Time Input

E2ERaga supports audio samples which can be recorded at runtime

Steps to run:

  1. Run the command python main.py --runtime=True --tradition=h --duration=30
  2. You can change the tradition (Hindustani or Carnatic) by passing h or c, and set the recording duration in seconds
  3. Once you run this command, there will be a prompt: Press 1 to start recording or press 0 to exit
  4. Enter accordingly; if you press 1, recording starts for the specified duration (a standalone recording sketch follows these steps)
  5. After this, the raga label and the tonic are output
  6. The tonic can also optionally be given with --tonic=D to specify the pitch D as the tonic
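
The recording step can also be reproduced on its own with a few lines of Python, which may help when debugging microphone issues. This is only a sketch, not the repository's code; it assumes the sounddevice and scipy packages and a 16 kHz mono format, which may differ from what main.py actually uses internally.

    import sounddevice as sd
    from scipy.io import wavfile

    SR = 16000       # assumed sample rate; main.py may use a different one
    DURATION = 30    # seconds, matching --duration=30 above

    print(f"Recording for {DURATION} seconds...")
    audio = sd.rec(int(DURATION * SR), samplerate=SR, channels=1)
    sd.wait()                                  # block until recording finishes
    wavfile.write("runtime_sample.wav", SR, audio)
    print("Saved runtime_sample.wav")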

File input

E2ERaga supports recorded audio samples which can be provided at runtime

Steps to run:

  1. Run the command python main.py --runtime_file=<audio_file_path> --tradition=<h/c>

    Example: python test_sample.py --runtime_file=data/sample_data/Ahira_bhairav_27.wav --tradition=h

  2. The model supports wav and mp3 files; with mp3, there is a delay while the file is converted to wav internally (a conversion sketch follows these steps)

  3. After this, the raga label and the tonic frequency are output
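
If you want to avoid the internal mp3-to-wav conversion delay mentioned in step 2, you can convert the file yourself beforehand. The sketch below uses pydub as an example converter (an assumption; the repository may convert differently, and pydub requires ffmpeg to be installed). The file path, sample rate, and channel count are illustrative, not requirements stated by the repository.

    from pydub import AudioSegment

    # Convert an mp3 to a mono 16 kHz wav before passing it via --runtime_file
    audio = AudioSegment.from_mp3("data/sample_data/my_recording.mp3")
    audio = audio.set_channels(1).set_frame_rate(16000)
    audio.export("data/sample_data/my_recording.wav", format="wav")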

Demo videos:

Live Raga Prediction for Raga: Miyan Malhar

(Video on YouTube)

Live Raga Prediction for Raga: Des

(Video on YouTube)

Hindustani Raga Embedding cosine similarity obtained from the model

Carnatic Raga Embedding cosine similarity obtained from the model

Acknowledgments:

  1. The model uses CREPE to estimate the pitch contour of the audio; I would like to thank Jong Wook Kim for clarifying my questions (a minimal CREPE usage sketch follows)
  2. I would also like to thank CompMusic and Sankalp Gulati for providing the datasets
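
For context, CREPE exposes a small Python API for pitch tracking. The sketch below shows standalone CREPE usage only; it is independent of how this repository wires CREPE into its raga and tonic models.

    import crepe
    from scipy.io import wavfile

    # Estimate the pitch contour of a wav file with CREPE.
    sr, audio = wavfile.read("data/sample_data/Ahira_bhairav_27.wav")
    time, frequency, confidence, activation = crepe.predict(audio, sr, viterbi=True)

    # Print the first few pitch estimates (time in seconds, f0 in Hz).
    for t, f, c in zip(time[:5], frequency[:5], confidence[:5]):
        print(f"t={t:.2f}s  f0={f:.1f}Hz  conf={c:.2f}")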