Representation Evaluation Framework for Music Information Retrieval tasks | Paper
mir_ref
is an open-source library for evaluating audio representations (embeddings or others) on a variety of music-related downstream tasks and datasets. It has two main capabilities:
- Using a config file, you can specify all evaluation experiments you want to run.
mir_ref
will automatically acquire data and conduct the experiments - no coding or data handling needed. Many tasks, datasets, embedding models etc. are ready to use (see supported options section). - You can easily integrate your own features (e.g. embedding models), datasets, probes, metrics, and audio deformations and use them in
mir_ref
experiments.
mir_ref
builds upon existing reproducability efforts in music audio, including mirdata
for data handling, mir_eval
for evaluation metrics, and essentia models
for pretrained audio models.
A first beta release is expected by the end of April, and it includes many fixes and documentation improvements.
Clone the repository. Create and activate a python>3.9 environment and install the requirements.
cd mir_ref
pip install -r requirements.txt
To run the experiments specified in a config file configs/example.yml
end-to-end:
python run.py conduct -c example
This will currently save deformations and features to use them later, but an online option will soon be available.
Alternatively, mir_ref
is comprised of 4 main functions-commands: deform
, for generating deformations from a dataset; extract
, for extracting features; train
, for training probes; and evaluate
, for evaluating them. These can be run as follows:
python run.py COMMAND -c example
deform
optionally includes the option n_jobs
for specifying parallelization of deformation computation, and extract
includes the flag --no_overwrite
to skip recomputing existing features.
An example configuration file is provided. Config files are written in YAML. A list of experiments is expected at the top level, and each experiment contains a task, datasets, features, and probes. For each dataset, a list of deformation scenarios can be specified, following the argument syntax of audiomentations.
magnatagatune
: MagnaTagATune (autotagging)mtg_jamendo
: MTG Jamendo (autotagging)vocalset
: VocalSet (singer_identification, technique_identification)tinysol
: TinySol (instrument_classification, pitch_class_classification)beaport
: Beatport (key_estimation)
~Many more soon
effnet-discogs
vggish-audioset
msd-musicnn
openl3
neuralfp
clmr-v2
mert-v1-95m-6
/mert-v1-95m-0-1-2-3
/mert-v1-95m-4-5-6-7-8
/mert-v1-95m-9-10-11-12
(referring to the layers used)maest
~More soon
We conducted an example evaluation of 7 models in 6 tasks with 4 deformations and 5 different probing setups. For the full results, please refer to the 'Evaluation' chapter of this thesis document, pages 39-58.
If you use mir_ref
, please cite the following paper:
@inproceedings{mir_ref,
author = {Christos Plachouras and Pablo Alonso-Jim\'enez and Dmitry Bogdanov},
title = {mir_ref: A Representation Evaluation Framework for Music Information Retrieval Tasks},
booktitle = {37th Conference on Neural Information Processing Systems (NeurIPS), Machine Learning for Audio Workshop},
address = {New Orleans, LA, USA},
year = 2023,
}