MATLAB implementation of two segmentation algorithms and one labeling algorithm. It also includes a toolbox and a workspace template to make experiments easier.
NEW: a Python implementation is here.
Related Topics: Segmentation, Labeling, Recurrence Plot (RP), Self-Similarity Matrix (SSM)
- Chroma Toolbox (matlab toolbox) [4] :
- mir_eval (python package) [5] : For evaluation (Optional)
- madmom [7] : For HPCP, DCP chroma feature (Optional)
Note that the original Chroma Toolbox raises a warning and has a small bug that prevents it from reading .mp3 files. I fixed both!
There are two folders:
|- segmentaion toolbox/ : add it to the MATLAB path and it can be used directly
|- workspace/ : a template for testing and evaluating a dataset
Add this folder to your MATLAB toolbox directory, or call addpath on it, and the functions are ready to use.
% Segmentation
audio_filename = 'test.wav';
result = audio_segmenter_sf(audio_filename);
visualize_results(audio_filename, result);
% Segmentation & Labeling
[result_sf, labeling] = audio_segmenter_sf(audio_filename,'clp', 0, 1);
See demo.m for further usage.
If you want to use this template, please follow this folder structure.
|- annotation/ : ground-truth annotation files
|- audio/ : audio files
|- estimation/ : results of the program
|- feature/ : generated features
In the root folder (workspace here), there are three programs. Follow the steps below to run experiments on a dataset.
- run "feature_saving.m"; the generated features will be placed in the feature folder
- run "run_all.m"; the predicted results will be placed in the estimation folder
- run "" to see the performance. (Optional)
- if needed, run "" to obtain the HPCP and DCP chroma features
Note that the number of annotation files determines how many songs are evaluated. See "run_all.m" and "" for details.
Note that the estimation folder already contains results, generated with the default parameters (see below).
Note that for copyright reasons, no audio files are included here.
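The note about annotation files can be illustrated with a small Python sketch. It assumes that an annotation and an estimation belong to the same song when their file names match without the extension; that pairing rule, the function name, and the file extensions in the usage note are assumptions for illustration, not taken from "run_all.m".

```python
from pathlib import Path


def pair_annotations_with_estimates(annotation_dir, estimation_dir):
    """Return (annotation, estimation) path pairs that share a base file name.

    Hypothetical helper: only songs that have an annotation file are kept,
    so the number of annotation files bounds the number of evaluated songs.
    """
    estimates = {
        p.stem: p
        for p in sorted(Path(estimation_dir).iterdir())
        if p.is_file()
    }
    pairs = []
    for ann in sorted(Path(annotation_dir).iterdir()):
        if ann.is_file() and ann.stem in estimates:
            pairs.append((ann, estimates[ann.stem]))
    return pairs
```

For example, with two files in annotation/ and three in estimation/, at most two songs are evaluated; the extra estimation is ignored.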
- Segmentation
1. Structure Feature (2012) (default) [1]
2. Checkerboard Kernel (2000) [2]
- Labeling
1. Structure Feature (2014) [6]
Generally, the first one, "Structure Feature", is recommended; it is still one of the most effective segmentation algorithms. The Checkerboard Kernel, however, is simple to implement :).
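To show why the checkerboard kernel approach [2] is simple to implement, here is a minimal pure-Python sketch of the idea: build a self-similarity matrix, then slide a checkerboard kernel along its main diagonal to get a novelty curve whose peaks suggest boundaries. It is not the MATLAB code in this toolbox. The function names and the cosine similarity are assumptions for illustration, and the Gaussian taper that is usually applied to the kernel is left out for brevity.

```python
import math


def cosine_ssm(features):
    """Self-similarity matrix from a list of per-frame feature vectors."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na > 0 and nb > 0 else 0.0
    return [[cos(a, b) for b in features] for a in features]


def checkerboard_kernel(half):
    """Kernel of size 2*half: +1 on the diagonal quadrants, -1 elsewhere."""
    size = 2 * half
    return [
        [1.0 if (i < half) == (j < half) else -1.0 for j in range(size)]
        for i in range(size)
    ]


def novelty_curve(ssm, half):
    """Correlate the kernel with the SSM along the main diagonal.

    novelty[t] is high when frames before t are similar to each other,
    frames from t onward are similar to each other, and the two groups
    differ. Positions too close to the edges are left at 0.
    """
    n = len(ssm)
    kernel = checkerboard_kernel(half)
    novelty = [0.0] * n
    for t in range(half, n - half + 1):
        total = 0.0
        for i in range(2 * half):
            for j in range(2 * half):
                total += kernel[i][j] * ssm[t - half + i][t - half + j]
        novelty[t] = total
    return novelty


# Toy input: four frames of one "section" followed by four of another.
toy_features = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
toy_novelty = novelty_curve(cosine_ssm(toy_features), half=2)
toy_boundary = max(range(len(toy_novelty)), key=toy_novelty.__getitem__)
```

On the toy input the novelty curve peaks at frame 4, where the second section starts. A real segmenter would then pick peaks of the curve as boundaries.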
- From Chroma Toolbox: CLP (default), CENS, CRP
- From Madmom: HPCP, DCP
To see how the choice of chroma feature affects performance, please refer to [3].
Note that no MFCC feature is provided, but my functions accept a custom feature as input.
- dataset:
Beatles (174 songs)
- parameters (default):
Chroma Feature: winLenSTMSP = 4410
Structure Feature (SF): (m, k, st) = (2.5, 0.04, 30)
Checkerboard Kernel (foote): winLen = 64
- evaluation (by mir_eval):
Segmentation (Seg): F-measure with 3s tolerance
Labeling (Lab): Pairwise Precision
| Algo  | Feature | Seg   | Lab   |
|-------|---------|-------|-------|
| SF    | CENS    | 0.711 | 0.692 |
|       | CLP     | 0.695 | 0.660 |
|       | CRP     | 0.694 | 0.650 |
|       | HPCP    | 0.630 | 0.600 |
|       | HPCP    | 0.689 | 0.645 |
| Foote | CENS    | 0.448 | --    |
|       | CLP     | 0.440 | --    |
|       | CRP     | 0.423 | --    |
I think the performance of Foote is lower than it should be. There may be a mistake somewhere in my implementation.
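For readers unfamiliar with the two measures reported above, the following pure-Python sketch shows what they compute. It is a simplified illustration and not the mir_eval implementation (mir_eval matches boundaries with a bipartite matching and works on time intervals). The function names and the frame-level label input are assumptions made for this example.

```python
from itertools import combinations


def boundary_f_measure(reference, estimated, window=3.0):
    """F-measure of boundary hits: an estimated boundary counts as a hit
    if it lies within `window` seconds of an unmatched reference boundary.
    Matching is a simple in-order two-pointer pass.
    """
    ref = sorted(reference)
    est = sorted(estimated)
    if not ref or not est:
        return 0.0
    i = j = hits = 0
    while i < len(ref) and j < len(est):
        if abs(ref[i] - est[j]) <= window:
            hits += 1
            i += 1
            j += 1
        elif est[j] < ref[i]:
            j += 1
        else:
            i += 1
    precision = hits / len(est)
    recall = hits / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def pairwise_precision(ref_labels, est_labels):
    """Of all frame pairs that share a label in the estimate, the fraction
    that also share a label in the reference. Inputs are per-frame labels.
    """
    if len(ref_labels) != len(est_labels):
        raise ValueError("label sequences must have the same length")
    same_in_est = same_in_both = 0
    for i, j in combinations(range(len(est_labels)), 2):
        if est_labels[i] == est_labels[j]:
            same_in_est += 1
            if ref_labels[i] == ref_labels[j]:
                same_in_both += 1
    return same_in_both / same_in_est if same_in_est else 0.0
```

For example, with reference boundaries [0, 10, 20, 30] and estimated boundaries [0, 12, 25, 30], three of the four boundaries match within 3 s, so precision, recall, and F-measure are all 0.75.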
- Serrà, J., Müller, M., Grosche, P., & Arcos, J. L. (2012). Unsupervised Detection of Music Boundaries by Time Series Structure Features. In Proc. of the 26th AAAI Conference on Artificial Intelligence (pp. 1613–1619). Toronto, Canada.
- Foote, J. (2000). Automatic Audio Segmentation Using a Measure of Audio Novelty. In Proc. of the IEEE International Conference on Multimedia and Expo (pp. 452–455). New York City, NY, USA.
- Nieto, O., & Bello, J. P. (2016). Systematic Exploration of Computational Music Structure Research. In Proc. of the 17th International Society for Music Information Retrieval Conference (ISMIR). New York City, NY, USA.
- Müller, M., & Ewert, S. (2011). Chroma Toolbox: MATLAB Implementations for Extracting Variants of Chroma-Based Audio Features. In Proc. of the International Conference on Music Information Retrieval (ISMIR).
- Raffel, C., McFee, B., Humphrey, E. J., Salamon, J., Nieto, O., Liang, D., & Ellis, D. P. W. (2014). mir_eval: A Transparent Implementation of Common MIR Metrics. In Proc. of the 15th International Conference on Music Information Retrieval.
- Serrà, J., Müller, M., Grosche, P., & Arcos, J. L. (2014). Unsupervised Music Structure Annotation by Time Series Structure Features and Segment Similarity.
- [] (