midi-shark

Automatic piano transcription model based on transformers and the Onsets and Frames architecture. Class project for APS360: Applied Fundamentals of Machine Learning.

Presentation

Usage

Make sure the requirements in requirements.txt are installed, or run:

pip install -r requirements.txt

Then, make sure you have FluidSynth and a .sf2 soundfont installed.
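
Before preprocessing, you can confirm that FluidSynth is on your PATH (this just prints the installed version and exits):

fluidsynth --version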

Preprocessing the Data

  1. Create a file named .env in the project's root directory, following the template in the .env.example file.
  2. Run processing/preprocess_batch.py with Python. You must have the dataset downloaded and sufficient disk space ([] MB) to store the preprocessed data. To preprocess only a subset, pass the --year argument; an example invocation is shown below.
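
For example, to preprocess only one year of recordings (2017 here is purely illustrative; any year folder present in your dataset works):

python processing/preprocess_batch.py --year 2017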

Training a Model

  1. PyTorch dataloaders for all components of the dataset are located in model/database.py. Use these to load your data.
  2. You can then train the models we have built using their fit method and evaluate them using their val_split method; a sketch follows this list. To use your own models, you can still use the dataloaders.
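
A minimal training sketch, assuming the dataloaders and models described above. The class names, module paths, and constructor arguments below (MaestroDataset, model.onsets, OnsetsModel, split=...) are placeholders, since this README does not name the actual exports of model/database.py or the model code; substitute the real ones.

from torch.utils.data import DataLoader
from model.database import MaestroDataset   # placeholder: use the real dataset class
from model.onsets import OnsetsModel        # placeholder: use the real model class

# Wrap the preprocessed data in standard PyTorch dataloaders.
train_loader = DataLoader(MaestroDataset(split="train"), batch_size=8, shuffle=True)
val_loader = DataLoader(MaestroDataset(split="validation"), batch_size=8)

model = OnsetsModel()
model.fit(train_loader)                # fit() trains the model, per step 2 above
metrics = model.val_split(val_loader)  # val_split() evaluates it, per step 2 above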

Making Predictions

  1. You can use the code in this Jupyter notebook to make predictions; a minimal sketch is shown below. However, make sure you have trained a model first.
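
A minimal prediction sketch under the same assumptions as the training sketch above (placeholder class names, and it assumes the trained weights were saved with PyTorch's state_dict mechanism; the notebook shows the actual workflow):

import torch
from model.database import MaestroDataset   # placeholder dataset class
from model.onsets import OnsetsModel        # placeholder model class

model = OnsetsModel()
model.load_state_dict(torch.load("trained_model.pt"))  # weights saved after training
model.eval()

spectrogram, _ = MaestroDataset(split="test")[0]  # one preprocessed input example
with torch.no_grad():
    activations = model(spectrogram.unsqueeze(0))  # predicted onset/frame activations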

Resources/References

Datasets

Papers

Dataset Default File Structures

If any new datasets are added, please update this README with their file structures.

  • MAESTRO should look like:
.
├── 2004
├── 2006
├── 2008
├── 2009
├── 2011
├── 2013
├── 2014
├── 2015
├── 2017
├── 2018
├── LICENSE
├── maestro-v3.0.0.csv
├── maestro-v3.0.0.json
└── README
  • MusicNet should look like:
.
├── test_data
├── test_labels
├── train_data
└── train_labels