mir_ref

Representation Evaluation Framework for Music Information Retrieval tasks | Paper

mir_ref is an open-source library for evaluating audio representations (embeddings or other features) on a variety of music-related downstream tasks and datasets. It has two main capabilities:

  • Using a config file, you can specify all the evaluation experiments you want to run. mir_ref will automatically acquire the data and conduct the experiments, with no coding or data handling needed. Many tasks, datasets, embedding models, and more are ready to use (see the supported options section).
  • You can easily integrate your own features (e.g. embedding models), datasets, probes, metrics, and audio deformations and use them in mir_ref experiments.

mir_ref builds upon existing reproducibility efforts in music audio, including mirdata for data handling, mir_eval for evaluation metrics, and Essentia models for pretrained audio models.

Disclaimer

A first beta release is expected by the end of January; it will include many fixes and documentation improvements.

Setup

Clone the repository, then create and activate a Python (>3.9) environment and install the requirements:

cd mir_ref
pip install -r requirements.txt

Running

To run the experiments specified in the config file configs/example.yml end-to-end:

python run.py conduct -c example

This currently saves deformations and features to disk for later reuse; an option to compute them on the fly will be available soon.

Alternatively, mir_ref consists of four main commands: deform, for generating deformations of a dataset; extract, for extracting features; train, for training probes; and evaluate, for evaluating them. These can be run as follows:

python run.py COMMAND -c example

deform accepts an optional n_jobs argument for parallelizing deformation computation, and extract accepts the --no_overwrite flag to skip recomputing existing features.
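
For example, the full pipeline could be run stage by stage as sketched below. The spelling of the n_jobs option is an assumption shown for illustration; --no_overwrite is as described above.

python run.py deform -c example --n_jobs 4
python run.py extract -c example --no_overwrite
python run.py train -c example
python run.py evaluate -c example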

Configuration file

An example configuration file is provided. Config files are written in YAML. A list of experiments is expected at the top level, and each experiment contains a task, datasets, features, and probes. For each dataset, a list of deformation scenarios can be specified, following the argument syntax of audiomentations.
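
For illustration, a config following this structure might look like the hypothetical sketch below. The key names are assumptions based on the description above; refer to configs/example.yml for the actual schema and to audiomentations for the deformation argument syntax.

experiments:
  - task: instrument_classification
    datasets:
      - name: tinysol
        deformations:
          # one deformation scenario; arguments follow audiomentations (keys assumed)
          - type: AddGaussianSNR
            min_snr_db: 5.0
            max_snr_db: 20.0
    features:
      - vggish-audioset
      - openl3
    probes:
      - type: classifier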

Currently supported options

Datasets and Tasks

  • magnatagatune: MagnaTagATune (autotagging)
  • mtg_jamendo: MTG Jamendo (autotagging)
  • vocalset: VocalSet (singer_identification, technique_identification)
  • tinysol: TinySol (instrument_classification, pitch_class_classification)
  • beatport: Beatport (key_estimation)

Many more soon.

Features

  • effnet-discogs
  • vggish-audioset
  • msd-musicnn
  • openl3
  • neuralfp
  • clmr-v2
  • mert-v1-95m-6 / mert-v1-95m-0-1-2-3 / mert-v1-95m-4-5-6-7-8 / mert-v1-95m-9-10-11-12 (referring to the layers used)
  • maest

More soon.

Example results

We conducted an example evaluation of 7 models on 6 tasks with 4 deformations and 5 different probing setups. For the full results, please refer to the 'Evaluation' chapter of this thesis document, pages 39-58.

Citing

If you use mir_ref, please cite the following paper:

@inproceedings{mir_ref,
    author = {Christos Plachouras and Pablo Alonso-Jim\'enez and Dmitry Bogdanov},
    title = {mir_ref: A Representation Evaluation Framework for Music Information Retrieval Tasks},
    booktitle = {37th Conference on Neural Information Processing Systems (NeurIPS), Machine Learning for Audio Workshop},
    address = {New Orleans, LA, USA},
    year = 2023,
}