Instrument Classification in Carnatic Music (ICCM) is a project developed for the optional Music Technology Lab course at Universitat Pompeu Fabra.
It aims to create a tool for musicologists, musicians and enthusiasts alike to identify the presence of some of the main instruments found in a Carnatic music performance (human voice, violin, mridangam).
This is achieved by training several predictive ML models that detect each instrument, with the ultimate goal of producing a graphical representation of when each instrument enters and drops out over the course of a performance.
- Python v3.10.8 or higher
- Pip
- Jupyter Notebook (alternatively, Google Colab can be used, but it's not recommended)
All required dependencies and their versions are listed in requirements.txt. To install them all at once, open a command prompt in the repository's top-level folder and run: pip install -r requirements.txt
After installing the dependencies, you can build the model yourself. To do so, download the Jupyter notebooks located in /src/ and run them in the following order:
- Install_dataset.ipynb
- Dataset_Creation.ipynb
- Feature_Extraction.ipynb
- Add_Instruments_To_Csv.ipynb
- Modelling.ipynb
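
If you prefer not to open each notebook by hand, the order above can be scripted. The following is only a minimal sketch, assuming Jupyter's nbconvert command is available on your PATH and that the notebooks sit in a local src folder; the helper names build_commands and run_all are hypothetical and not part of this repository.

```python
import subprocess
from pathlib import Path

# Notebook order as listed in this README.
NOTEBOOKS = [
    "Install_dataset.ipynb",
    "Dataset_Creation.ipynb",
    "Feature_Extraction.ipynb",
    "Add_Instruments_To_Csv.ipynb",
    "Modelling.ipynb",
]


def build_commands(src_dir="src"):
    """Return one `jupyter nbconvert` command per notebook, in pipeline order."""
    return [
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace",
         str(Path(src_dir) / name)]
        for name in NOTEBOOKS
    ]


def run_all(src_dir="src"):
    """Execute the notebooks one after another, stopping at the first failure."""
    for command in build_commands(src_dir):
        # check=True raises CalledProcessError if a notebook fails to execute.
        subprocess.run(command, check=True)
```

Calling run_all() from the repository's top-level folder would then execute the five notebooks in sequence. Keep in mind that the first one downloads the full dataset, so you may want to run it separately.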
- Install_dataset.ipynb:
Obtains the saraga1.5_carnatic dataset (16.2 GB) using the corresponding functions of the mirdata library. The download might take a long time depending on your connection. (If you already have access to the dataset, you may skip this step.)
- Dataset_Creation.ipynb:
Builds our dataset from performances in saraga1.5_carnatic. Each performance is divided into small audio chunks, and each chunk is tagged by identifying the silent regions of each instrument. The chunks and their respective instrument presence indicators are referenced in a metadata dataframe (metadata.csv).
- Feature_Extraction.ipynb:
Extracts features from the audio chunks and stores them in dataframe format (features.csv).
- Add_Instruments_To_Csv.ipynb:
Adds a column to the features.csv dataframe with the instrument combination of each sample (0 for none of them, 1 for vocal only... 7 for all of them), so that we can later quickly draw random samples in which every instrument combination is equally represented.
- Modelling.ipynb:
Trains and tests the Gradient Boosting Classifier models for each instrument.
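
To illustrate the instrument combination column and the balanced sampling it enables, here is a minimal sketch in plain Python. It is not the code from Add_Instruments_To_Csv.ipynb: the bit assignment (vocal = 1, violin = 2, mridangam = 4) is an assumption that merely agrees with the three codes stated above (0, 1 and 7), and the function names are hypothetical.

```python
import random
from collections import defaultdict

# Assumed bit assignment: only 0 (none), 1 (vocal only) and 7 (all) are stated
# in this README; the codes in between depend on this hypothetical ordering.
VOCAL, VIOLIN, MRIDANGAM = 1, 2, 4


def combination_code(vocal, violin, mridangam):
    """Pack three presence flags into a single integer from 0 to 7."""
    return VOCAL * bool(vocal) + VIOLIN * bool(violin) + MRIDANGAM * bool(mridangam)


def balanced_sample(codes, per_class, seed=0):
    """Pick up to `per_class` row indices for every combination code present."""
    by_code = defaultdict(list)
    for index, code in enumerate(codes):
        by_code[code].append(index)
    rng = random.Random(seed)  # seeded so the selection is reproducible
    chosen = []
    for code in sorted(by_code):
        indices = by_code[code]
        chosen.extend(rng.sample(indices, min(per_class, len(indices))))
    return sorted(chosen)
```

With a single integer per sample, equal representation reduces to grouping rows by that integer and drawing the same number of rows from each group.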
Due to the constraints of the course, we were not able to complete the final stage of this project, in which a user would upload a Carnatic music performance of their choice through a UI. The UI would display a graphical representation of when each instrument enters and drops out over the course of the performance, together with an audio player so that the user could check the result. Here's an interactive prototype we made of the UI using Figma.
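
As a rough idea of the data such a timeline would need, the sketch below merges per-chunk predictions for one instrument into time intervals. It is not part of the project: the function name presence_intervals is hypothetical, and it assumes fixed-length, non-overlapping chunks with one 0/1 prediction each.

```python
def presence_intervals(flags, chunk_seconds):
    """Merge consecutive positive chunks into (start, end) intervals in seconds."""
    intervals = []
    start = None  # index of the first chunk of the interval currently open
    for index, present in enumerate(flags):
        if present and start is None:
            start = index
        elif not present and start is not None:
            intervals.append((start * chunk_seconds, index * chunk_seconds))
            start = None
    if start is not None:
        # The instrument is still playing when the performance ends.
        intervals.append((start * chunk_seconds, len(flags) * chunk_seconds))
    return intervals


# Five 2-second chunks: present in chunks 1-2 and in chunk 4.
print(presence_intervals([0, 1, 1, 0, 1], 2))  # [(2, 6), (8, 10)]
```

Each resulting pair could then be drawn as one bar on the instrument's row of the timeline.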
This PDF file details the development process undertaken for this project, and it contains the following documents:
- Project Plan
- State of the art
- Software development tools
- Software requirements specification
- Ethical Considerations
- Evaluation
- Weekly reports (for each member)
Distributed under the GPLv3 License. See LICENSE for more information.