
From Audio Encoders to Piano Judges: Benchmarking Performance Understanding for Solo Piano



arXiv Paper Project page

Code repository accompanying the paper From Audio Encoders to Piano Judges: Benchmarking Performance Understanding for Solo Piano.


pip install -r requirement.txt

For the Jukebox model, you also need to install the jukebox package following its documentation.


The Pianism-labeling dataset (PLD) is a ~138-hour dataset of clips labeled with expertise level, difficulty (originally curated from the CIPI dataset), and solo piano technique. For a demonstration of the data, please refer to the project page. We provide metadata with the corresponding YouTube links.


python -m PianoJudge.data_collection.fetch

The lists of channels for the novice, advanced, and virtuoso levels are in data_collection/*_channels.txt; please modify the paths in fetch.py to match your setup. The downloaded audio files can also be requested from the author.

Embedding computation

python -m PianoJudge.scripts.utils

The config can be found in conf/utils/compute_embeddings.yaml. The script saves the computed embeddings to HDF5 files.

  • -encoder: 'Jukebox' or 'MERT' or 'DAC' or 'AudioMAE'.
  • -max_segs: how many 10 s segments are used to compute the embedding. Default: 30 (5 min).
  • -use_trained: whether to use the fine-tuned DAC and AudioMAE. The checkpoints can be found here.
  • category: set your dataset path and output path here.
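
To illustrate what -max_segs controls, here is a minimal sketch of splitting a clip into 10 s segments and capping their number. It is not the repository's implementation: the function name split_into_segments, its parameters, and the choice to keep a shorter final segment are all assumptions made for illustration.

```python
from typing import List, Sequence


def split_into_segments(
    samples: Sequence[float],
    sample_rate: int,
    segment_seconds: float = 10.0,
    max_segs: int = 30,
) -> List[Sequence[float]]:
    """Cut `samples` into consecutive fixed-length segments, keeping at most `max_segs`.

    Hypothetical helper: the final segment may be shorter than the others
    (an assumption; the actual code may pad or drop it instead).
    """
    seg_len = int(segment_seconds * sample_rate)
    if seg_len <= 0:
        raise ValueError("segment length must be at least one sample")

    segments: List[Sequence[float]] = []
    for start in range(0, len(samples), seg_len):
        if len(segments) >= max_segs:
            break  # cap reached: the rest of the clip is ignored
        segments.append(samples[start:start + seg_len])
    return segments


# Toy example with a very low sample rate to keep the lists small.
toy_rate = 100
long_clip = [0.0] * (toy_rate * 400)   # 400 s of silence
short_clip = [0.0] * (toy_rate * 25)   # 25 s of silence

long_segments = split_into_segments(long_clip, toy_rate)
short_segments = split_into_segments(short_clip, toy_rate)
```

With the default of 30 segments, only the first 5 minutes of the 400 s clip are used, while the 25 s clip yields two full segments and one shorter one.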

Training for the three main tasks

python -m PianoJudge.scripts.ranking
python -m PianoJudge.scripts.difficulty
python -m PianoJudge.scripts.technique

Config can be found in their respective path, e.g. conf/ranking.yaml.

  • -encoder: 'Jukebox' or 'MERT' or 'DAC' or 'AudioMAE'. Note that the embeddings must already have been saved in the previous step, as on-the-fly computation is not supported.
  • -dataset.num_classes: number of classes in the respective task.

ICPC-2015 Prediction

The International Chopin Piano Competition 2015 data is curated by this repository. Similarly, all performances can be fetched.

python -m PianoJudge.scripts.ranking mode=test
python -m PianoJudge.scripts.competition_rank

Setting mode=test runs inference on all possible pairs and saves the results to checkpoints/rank_test_*_prediction.csv. The second script, competition_rank, then cleans up the predictions.
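
As a rough illustration of how pairwise predictions can be turned into a ranking, the sketch below enumerates all unordered pairs and counts wins per performance. This is an assumption-laden example, not the logic of competition_rank: the names rank_by_wins and prefers_first, the toy scores, and the win-count aggregation are all hypothetical, and the real CSV layout is not modeled here.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple


def rank_by_wins(
    items: Sequence[str],
    prefers_first: Callable[[str, str], bool],
) -> List[Tuple[str, int]]:
    """Rank items by the number of pairwise comparisons they win.

    `prefers_first(a, b)` stands in for a pairwise model prediction:
    it returns True when `a` is judged better than `b`.
    """
    wins: Dict[str, int] = {item: 0 for item in items}
    for a, b in combinations(items, 2):  # n * (n - 1) / 2 unordered pairs
        winner = a if prefers_first(a, b) else b
        wins[winner] += 1
    # Sort by descending win count; ties are broken alphabetically.
    return sorted(wins.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy stand-in for model output: a hidden quality score per performance.
toy_scores = {"performance_a": 0.2, "performance_b": 0.9, "performance_c": 0.5}
ranking = rank_by_wins(
    list(toy_scores),
    lambda a, b: toy_scores[a] > toy_scores[b],
)
num_pairs = len(list(combinations(toy_scores, 2)))
```

Win counting is only one simple way to aggregate pairwise outcomes; other schemes (for example, fitting a Bradley-Terry model) would produce a ranking from the same pairs.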


    address = {San Francisco, USA},
    author = {Zhang, Huan and Liang, Jinhua and Dixon, Simon},
    booktitle = {Proceedings of the International Society for Music Information Retrieval Conference (ISMIR)},
    title = {From Audio Encoders to Piano Judges: Benchmarking Performance Understanding for Solo Piano},
    year = {2024}