midi-embs

Experiments on MIDI2vec embeddings

Primary language: Jupyter Notebook. License: Apache License 2.0.

Learning MIDI embeddings

Repository supporting the paper MIDI2vec: Learning MIDI Embeddings for Reliable Prediction of Symbolic Music Metadata.

Pasquale Lisena, Albert Meroño-Peñuela, Raphaël Troncy. MIDI2vec: Learning MIDI Embeddings for Reliable Prediction of Symbolic Music Metadata, to appear in Semantic Web Journal, Special Issue on Deep Learning for Knowledge Graphs, 2021. http://www.semantic-web-journal.net/content/midi2vec-learning-midi-embeddings-reliable-prediction-symbolic-music-metadata-0

The experiments are available as 3 notebooks, covering 3 datasets.

The MIDI2vec library is available here.

Pre-computed MIDI embeddings used in the paper are available in Zenodo.

Embedding generation

Repeat the following process for each <dataset_folder>.

  • Generate the edgelists

cd midi2edgelist
npm install
node index.js -i <dataset_folder>
node index.js -i <dataset_folder> -o edgelist_300 -n 300
  • Compute embeddings
cd ../
pip install -r edgelist2vec/requirements.txt

python edgelist2vec/embed.py -o embeddings/<dataset>.bin
python edgelist2vec/embed.py -o embeddings/<dataset>_notes.bin --exclude notes
python edgelist2vec/embed.py -o embeddings/<dataset>_program.bin --exclude program
python edgelist2vec/embed.py -o embeddings/<dataset>_tempo.bin --exclude tempo
python edgelist2vec/embed.py -o embeddings/<dataset>_timesig.bin --exclude time.signature
python edgelist2vec/embed.py -i edgelist_300 -o embeddings/<dataset>_300.bin
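
The four --exclude runs above differ only in the output suffix and the excluded feature group, so they could be generated by a small loop. The following is a minimal dry-run sketch, not part of the repository: it only prints the commands. The function name ablation_commands and the DATASET variable (with its placeholder default) are assumptions introduced for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print one embed.py command per excluded feature group.
# DATASET stands in for the <dataset> name; "mydataset" is a placeholder.
DATASET="${DATASET:-mydataset}"

ablation_commands() {
  # Each pair is "output suffix:excluded group", copied from the commands above.
  for pair in notes:notes program:program tempo:tempo timesig:time.signature; do
    suffix="${pair%%:*}"   # text before the first colon
    group="${pair#*:}"     # text after the first colon
    echo "python edgelist2vec/embed.py -o embeddings/${DATASET}_${suffix}.bin --exclude ${group}"
  done
}

ablation_commands
```

Once the printed commands look right, the output can be piped to sh from the repository root to run them.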
  • For the SLAC and Musedata CCV experiment, we use the script in split_datasets to split each dataset into 10 folds. Then, for each fold i, run the following commands:
node index.js -i <dataset_splitted_folder>/fold<i>/train -o edgelist<i>
node index.js -i <dataset_splitted_folder>/fold<i>/test -o edgelist<i>_test

python edgelist2vec/embed.py -i edgelist<i> -o <dataset><i>.bin
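
Since the same three commands are repeated for all 10 folds, a loop can generate them. This is a minimal dry-run sketch under stated assumptions: folds are assumed to be numbered 0 to 9 (adjust the bounds if the split script numbers them differently), and fold_commands, SPLIT_DIR and DATASET are hypothetical names standing in for <dataset_splitted_folder> and <dataset>. It only prints the commands and does not handle the working directory each one is run from.

```shell
#!/bin/sh
# Dry-run sketch: print the per-fold commands listed above.
# SPLIT_DIR and DATASET are placeholders for <dataset_splitted_folder> and <dataset>.
SPLIT_DIR="${SPLIT_DIR:-dataset_splitted}"
DATASET="${DATASET:-mydataset}"

fold_commands() {
  i=0                        # assumption: folds are numbered 0..9
  while [ "$i" -lt 10 ]; do
    echo "node index.js -i ${SPLIT_DIR}/fold${i}/train -o edgelist${i}"
    echo "node index.js -i ${SPLIT_DIR}/fold${i}/test -o edgelist${i}_test"
    echo "python edgelist2vec/embed.py -i edgelist${i} -o ${DATASET}${i}.bin"
    i=$((i + 1))
  done
}

fold_commands
```

As in the listing above, only the training edgelist of each fold is passed to embed.py; the test edgelist is generated but not embedded here.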

Classification experiment