Official implementation of Self-Similarity-Based and Novelty-based loss for music structure analysis, published at ISMIR 2023. Includes a pre-trained model.
If you use this code and/or paper in your research please cite:
author = {Peeters, Geoffroy},
booktitle = {Proceedings of the 24th International Society for Music Information Retrieval Conference, ISMIR 2023},
publisher = {International Society for Music Information Retrieval},
title = {Self-Similarity-Based and Novelty-based loss for music structure analysis},
year = {2023}
git clone
cd ssmnet_ISMIR2023/
python -m venv env_ssmnet
source env_ssmnet/bin/activate
pip install -e .
This package includes a CLI as well as pretrained models. To use it, type in a terminal:
ssmnet $fullpath_to_audio_file -o csv_file -p pdf_file
The estimated boundaries are written in .csv format to the file specified with -o. The file given with -p receives the corresponding .pdf plot.
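To process several audio files, the command above can be wrapped in a short script. The following is a minimal sketch, assuming the `ssmnet` command is available on the PATH and is called only with the `-o` and `-p` options shown above. The helper names `build_command` and `run_batch`, and the choice of naming outputs after the audio file, are hypothetical and not part of this package.

```python
import subprocess
from pathlib import Path


def build_command(audio_file, output_dir):
    """Build the ssmnet command line for one audio file (hypothetical helper)."""
    audio_path = Path(audio_file)
    out_dir = Path(output_dir)
    # Assumption: each output is named after the stem of the audio file.
    csv_file = out_dir / (audio_path.stem + ".csv")
    pdf_file = out_dir / (audio_path.stem + ".pdf")
    return ["ssmnet", str(audio_path), "-o", str(csv_file), "-p", str(pdf_file)]


def run_batch(audio_files, output_dir):
    """Run ssmnet on each file in turn; raises CalledProcessError on failure."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for audio_file in audio_files:
        subprocess.run(build_command(audio_file, output_dir), check=True)
```

Calling `run_batch(["a.wav", "b.wav"], "results")` would then produce one .csv and one .pdf per input file in `results/`.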
Alternatively, the functions defined in ssmnet/
can be called directly from other Python code.
ssmnet_deploy = SsmNetDeploy(config_d)
# get the audio features patches
feat_3m, time_sec_v = ssmnet_deploy.m_get_features(args.audio_file)
# process through SSMNet to get the Self-Similarity-Matrix and Novelty-Curve
hat_ssm_np, hat_novelty_np = ssmnet_deploy.m_get_ssm_novelty(feat_3m)
# estimate segment boundaries from the Novelty-Curve
hat_boundary_sec_v, hat_boundary_frame_v = ssmnet_deploy.m_get_boundaries(hat_novelty_np, time_sec_v)
# export as .pdf
ssmnet_deploy.m_plot(hat_ssm_np, hat_novelty_np, hat_boundary_frame_v, args.output_pdf_file)
# export as .csv
ssmnet_deploy.m_export_csv(hat_boundary_sec_v, args.output_csv_file)
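The boundary-estimation step can be pictured as peak picking on the novelty curve. The sketch below is a generic illustration of that idea in plain Python, not the code used by `m_get_boundaries`. The function `pick_boundaries`, its `threshold` parameter and the toy values are assumptions made for the example.

```python
def pick_boundaries(novelty_v, time_sec_v, threshold=0.5):
    """Return (times, frames) of the local maxima of novelty_v above threshold."""
    boundary_frame_v = []
    for n in range(1, len(novelty_v) - 1):
        # A frame is a peak if it rises above its left neighbour
        # and is not exceeded by its right neighbour.
        is_peak = novelty_v[n - 1] < novelty_v[n] >= novelty_v[n + 1]
        if is_peak and novelty_v[n] >= threshold:
            boundary_frame_v.append(n)
    boundary_sec_v = [time_sec_v[n] for n in boundary_frame_v]
    return boundary_sec_v, boundary_frame_v


# Toy novelty curve sampled every 0.5 s (made-up values)
novelty_v = [0.0, 0.1, 0.9, 0.2, 0.1, 0.3, 0.8, 0.7, 0.1]
time_sec_v = [0.5 * n for n in range(len(novelty_v))]
boundary_sec_v, boundary_frame_v = pick_boundaries(novelty_v, time_sec_v)
print(boundary_frame_v, boundary_sec_v)
```

The two outputs mirror the pair returned in the snippet above: boundary positions in seconds and the matching frame indices.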
contains the pyjama files (see doc for a description of the pyjama format) corresponding to the four datasets. Each pyjama file contains all the annotations of a given dataset.
Those are provided for reproducibility. In the paper, the evaluation using
is performed using the subset of entries of `salami.pyjama` with key `CLASS` equal to `popular`
is done using the subset of entries of `salami.pyjama` with key `SOURCE` equal to `IA`
is done using the subset of entries of `salami.pyjama` which have two annotations (both `textfile1_functions.txt` and `textfile2_functions.txt` keys are defined)
import json

with open('salami.pyjama', encoding="utf-8") as json_fid:
    data_d = json.load(json_fid)
subentry_l = [entry for entry in data_d['collection']['entry'] if entry['CLASS'][0]['value']=='popular']
len(subentry_l) # ---> 280
subentry_l = [entry for entry in data_d['collection']['entry'] if entry['SOURCE'][0]['value']=='IA']
len(subentry_l) # ---> 446
subentry_l = [entry for entry in data_d['collection']['entry'] if len(entry['textfile1_functions.txt']) and len(entry['textfile2_functions.txt'])]
len(subentry_l) # ---> 882
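The first two selections above share the same pattern, which can be factored into a small helper. This is a sketch assuming each key maps to a list of dictionaries carrying a `value` field, as in the snippets above. The helper `select_entries` and the toy entries are hypothetical and are not part of the repository.

```python
def select_entries(data_d, key, value):
    """Return the entries whose first annotation under `key` equals `value`."""
    return [
        entry
        for entry in data_d["collection"]["entry"]
        # Skip entries where the key is missing or has no annotation.
        if entry.get(key) and entry[key][0]["value"] == value
    ]


# Toy data mimicking the layout accessed above (not real SALAMI content)
toy_d = {
    "collection": {
        "entry": [
            {"CLASS": [{"value": "popular"}], "SOURCE": [{"value": "IA"}]},
            {"CLASS": [{"value": "jazz"}], "SOURCE": [{"value": "IA"}]},
            {"CLASS": [{"value": "popular"}], "SOURCE": []},
        ]
    }
}
popular_l = select_entries(toy_d, "CLASS", "popular")
ia_l = select_entries(toy_d, "SOURCE", "IA")
```

With the real file, `select_entries(data_d, 'CLASS', 'popular')` would play the role of the first list comprehension above.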
|--*.pt # weights of pre-trained network for the model with `do_nb_attention=1` and `do_nb_attention=3`
| # --- is the pytorch code of the model
contains the code of the library