midi-shark

Automatic piano transcription model based on transformers and the Onsets and Frames architecture. Class project for APS360: Applied Fundamentals of Machine Learning.

Presentation

Usage

Make sure the requirements in requirements.txt are installed, or run:

pip install -r requirements.txt

Then, make sure you have FluidSynth and a .sf2 soundfont installed.
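
Before preprocessing, you can confirm that FluidSynth is on your PATH (this just prints the installed version and exits):

fluidsynth --version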

Preprocessing the Data

  1. Create a file named .env in the project's root directory, following the template in the .env.example file.
  2. Run processing/preprocess_batch.py with Python. You must have the dataset downloaded and sufficient disk space ([] MB) to store the preprocessed data. To preprocess only a subset, pass the --year argument; an example invocation is shown below.
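
For example, to preprocess only one year of recordings (2017 here is purely illustrative; any year folder present in your dataset works):

python processing/preprocess_batch.py --year 2017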

Training a Model

  1. PyTorch dataloaders for all components of the dataset are located in model/database.py. Use these to load your data.
  2. You can then train the models we have built using their fit method and evaluate them using their val_split method; a sketch follows this list. To use your own models, you can still use the dataloaders.
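
A minimal training sketch, assuming the dataloaders and models described above. The class names, module paths, and constructor arguments below (MaestroDataset, model.onsets, OnsetsModel, split=...) are placeholders, since this README does not name the actual exports of model/database.py or the model code; substitute the real ones.

from torch.utils.data import DataLoader
from model.database import MaestroDataset   # placeholder: use the real dataset class
from model.onsets import OnsetsModel        # placeholder: use the real model class

# Wrap the preprocessed data in standard PyTorch dataloaders.
train_loader = DataLoader(MaestroDataset(split="train"), batch_size=8, shuffle=True)
val_loader = DataLoader(MaestroDataset(split="validation"), batch_size=8)

model = OnsetsModel()
model.fit(train_loader)                # fit() trains the model, per step 2 above
metrics = model.val_split(val_loader)  # val_split() evaluates it, per step 2 above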

Making Predictions

  1. You can use the code in this Jupyter notebook to make predictions; a minimal sketch is shown below. However, make sure you have trained a model first.
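
A minimal prediction sketch under the same assumptions as the training sketch above (placeholder class names, and it assumes the trained weights were saved with PyTorch's state_dict mechanism; the notebook shows the actual workflow):

import torch
from model.database import MaestroDataset   # placeholder dataset class
from model.onsets import OnsetsModel        # placeholder model class

model = OnsetsModel()
model.load_state_dict(torch.load("trained_model.pt"))  # weights saved after training
model.eval()

spectrogram, _ = MaestroDataset(split="test")[0]  # one preprocessed input example
with torch.no_grad():
    activations = model(spectrogram.unsqueeze(0))  # predicted onset/frame activations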

Resources/References

Datasets

Papers

Dataset Default File Structures

If any new datasets are added, please update this README with their file structures.

  • MAESTRO should look like:
.
├── 2004
├── 2006
├── 2008
├── 2009
├── 2011
├── 2013
├── 2014
├── 2015
├── 2017
├── 2018
├── LICENSE
├── maestro-v3.0.0.csv
├── maestro-v3.0.0.json
└── README
  • MusicNet should look like:
.
├── test_data
├── test_labels
├── train_data
└── train_labels