multipitch_mctc

This is a PyTorch code repository accompanying the following paper:

Christof Weiß and Geoffroy Peeters
Learning Multi-Pitch Estimation From Weakly Aligned Score-Audio Pairs Using a Multi-Label CTC Loss
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2021

This repository contains only example code and pre-trained models for most of the paper's experiments, as well as some individual examples. All datasets used in the paper are at least partially publicly available; our main datasets are the Schubert Winterreise Dataset, MusicNet, and MAESTRO (see the experiments below).

Feature extraction and prediction (Jupyter notebooks)

In the repository's top folder, two Jupyter notebooks (01_precompute_features and 02_predict_with_pretrained_model) demonstrate how to preprocess audio files for our models and how to load a pre-trained model for predicting pitches.
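For orientation, audio preprocessing of this kind typically frames the signal and computes a log-compressed time-frequency representation. The sketch below is a hypothetical stand-in using a plain STFT via scipy; the notebook's actual feature pipeline (and its parameters) may differ:

```python
import numpy as np
from scipy import signal

def log_spectrogram(audio, sr=22050, n_fft=2048, hop=512):
    """Log-magnitude STFT feature matrix (frames x frequency bins).
    Illustrative only; not the repository's actual feature extraction."""
    _, _, Z = signal.stft(audio, fs=sr, nperseg=n_fft, noverlap=n_fft - hop)
    mag = np.abs(Z)                 # (bins, frames)
    return np.log1p(10.0 * mag).T   # log compression, then (frames, bins)

# Toy input: one second of a 440 Hz tone instead of a real recording
sr = 22050
t = np.arange(sr) / sr
audio = np.sin(2 * np.pi * 440 * t).astype(np.float32)
feat = log_spectrogram(audio, sr=sr)
print(feat.shape)  # (n_frames, n_fft // 2 + 1)
```

The resulting frames-by-bins matrix is the kind of input a framewise pitch-estimation CNN consumes.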

Experiments from the paper (Python scripts)

In the experiments folder, you can find all experimental scripts as well as the log files (subfolder logs) and the filewise results (subfolder results_filewise). The folder models_pretrained contains pre-trained models for the main experiments, and the subfolder predictions contains example model predictions for two of the experiments. Please note that re-training requires a GPU as well as the pre-processed training data (see the notebook 01_precompute_features for an example). All scripts must be started from the repository's top folder so that the relative paths resolve correctly.

The experiment files' names relate to the paper's results in the following way:

Experiment 1 (Table 2) - Loss and model variants

  • exp118g_traintest_schubert_allzero_pitch.py (All-Zero baseline)
  • exp118f2_traintest_schubert_librosa_pitch_maxnorm.py (CQT-Chroma baseline)
  • exp112aS_traintest_schubert_aligned_pitch_nooverlap_segmmodel.py (Strongly-aligned training, BCE loss)
  • exp118b_traintest_schubert_sctcthreecomp_pitch.py (Separable CTC (SCTC) loss)
  • exp118d_traintest_schubert_mctcnethreecomp_pitch.py (Non-Epsilon MCTC (MCTC:NE) loss)
  • exp118e_traintest_schubert_mctcwe_pitch.py (MCTC with epsilon (MCTC:WE) loss)
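For intuition on the strongly-aligned baseline: it trains with a framewise binary cross-entropy (BCE) over multi-hot pitch targets, where each frame marks the set of active pitches. Below is a minimal numpy sketch of that loss, independent of the repository's actual implementation (which uses PyTorch and a CNN):

```python
import numpy as np

def framewise_bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted pitch activations
    (frames x pitches, values in (0, 1)) and multi-hot targets."""
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

# Toy example: 4 frames, 6 pitch bins, a sustained two-note chord
target = np.zeros((4, 6))
target[:, [1, 4]] = 1.0
good = np.where(target == 1, 0.9, 0.1)  # confident, correct predictions
bad = np.full_like(target, 0.5)         # uninformative predictions
print(framewise_bce(good, target) < framewise_bce(bad, target))  # True
```

This framewise loss requires strongly aligned (frame-level) annotations; the MCTC variants above relax exactly that requirement by allowing weakly aligned score-audio pairs.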

Experiment 2 (Section 3.2) - Train/test on common datasets

  • exp121a_traintest_musicnet_mctcwe_pitch_basiccnn.py (Train/test MusicNet with MCTC loss)
  • exp121cS_traintest_musicnet_aligned_pitch_basiccnn_segmmodel.py (Train/test MusicNet with strongly-aligned training)
  • exp122a_traintest_maestro_mctcwe_pitch_basiccnn.py (Train/test MAESTRO with MCTC loss)
  • exp122cS_traintest_maestro_aligned_pitch_basiccnn_segmmodel.py (Train/test MAESTRO with strongly-aligned training)

Experiment 3 (Figure 2) - Cross-dataset experiment

  • exp123a_trainmaestromunet_testmix_mctcwe_pitch_basiccnn.py (Train MusicNet & MAESTRO, test others, MCTC)
  • exp123cS_trainmaestromunet_testmix_aligned_pitch_basiccnn_segmmodel.py (Train MusicNet & MAESTRO, test others, aligned)
  • exp124a_trainmix_testmusicnet_mctcwe_pitch_basiccnn.py (Test MusicNet, train others, MCTC)
  • exp124cS_trainmix_testmusicnet_aligned_pitch_basiccnn_segmmodel.py (Test MusicNet, train others, aligned)
  • exp125a_trainmix_testmaestro_mctcwe_pitch_basiccnn.py (Test MAESTRO, train others, MCTC)
  • exp125cS_trainmix_testmaestro_aligned_pitch_basiccnn_segmmodel.py (Test MAESTRO, train others, aligned)

Run scripts using, e.g., the following commands:

  conda activate multipitch_mctc
  export CUDA_VISIBLE_DEVICES=1
  python experiments/exp112aS_traintest_schubert_aligned_pitch_nooverlap_segmmodel.py