
Ballroom dataset for MCLNN

The Ballroom dataset was used in the ISMIR2004 tempo induction contest. The dataset music clips can be downloaded from here.

Clip Duration   Format   Count   Categories
30 secs         .wav     698     8

Dataset Summary:

  • Clips are 30 seconds long and sampled at 44100 Hz.
  • The dataset has no predefined cross-validation split.

This folder contains:

  • Pretrained weights and fold indices for the 10-fold cross-validation, together with the standardization parameters, needed to replicate the results in:

Fady Medhat, David Chesmore and John Robinson, "Automatic Classification of Music Genre Using Masked Conditional Neural Networks," 2017 IEEE International Conference on Data Mining (ICDM)
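
As a rough illustration of what standardization parameters are for, the sketch below fits a per-feature mean and standard deviation on training frames and applies them to other frames. The function names `fit_standardizer` and `standardize` are hypothetical, and per-feature z-scoring is an assumption; the stored parameters in this folder may be organized differently.

```python
import math


def fit_standardizer(frames):
    """Compute per-feature mean and standard deviation.

    frames: list of equal-length feature vectors (e.g. spectrogram frames)
    taken from the training folds only. Hypothetical helper, not part of MCLNN.
    """
    n = len(frames)
    dims = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dims)]
    std = [
        math.sqrt(sum((f[d] - mean[d]) ** 2 for f in frames) / n)
        for d in range(dims)
    ]
    return mean, std


def standardize(frames, mean, std, eps=1e-8):
    """Apply z-score standardization using previously fitted parameters."""
    return [
        [(x - m) / (s + eps) for x, m, s in zip(f, mean, std)]
        for f in frames
    ]


# Fit on training frames, then reuse the same parameters for held-out frames.
train = [[1.0, 10.0], [3.0, 30.0]]
mean, std = fit_standardizer(train)
train_z = standardize(train, mean, std)
test_z = standardize([[2.0, 20.0]], mean, std)
```

Fitting the parameters on the training folds only, and reusing them for the held-out fold, keeps information from the test clips out of the training pipeline.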

Preprocessing

The preprocessing applied to the Ballroom dataset consists of resampling the .wav clips to 22050 Hz (note: librosa performs the resampling while the clips are transformed to spectrograms).
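
The resampling described above can be sketched with librosa as follows. This is a minimal sketch, not the Dataset Transformer itself: the function names are hypothetical, and the mel-scaled spectrogram together with the `n_fft`, `hop_length` and `n_mels` values are assumptions; the actual transformation is whatever is configured in the Dataset Transformer.

```python
TARGET_SR = 22050  # target sampling rate stated in this README


def samples_after_resampling(duration_s, target_sr=TARGET_SR):
    """Number of samples a clip of the given duration has at target_sr."""
    return int(round(duration_s * target_sr))


def clip_to_mel_spectrogram(path, n_fft=2048, hop_length=512, n_mels=128):
    """Load a clip, resample it to TARGET_SR and return a log-mel spectrogram.

    Hypothetical helper; the parameter values are assumed, not taken
    from the MCLNN configuration.
    """
    # Third-party imports are kept local so the helper above works without them.
    import librosa
    import numpy as np

    # librosa.load resamples to the requested rate while reading the file
    # (and mixes down to mono by default).
    y, sr = librosa.load(path, sr=TARGET_SR)
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels
    )
    return librosa.power_to_db(mel, ref=np.max)


# A 30-second clip has half as many samples at 22050 Hz as at 44100 Hz.
n_resampled = samples_after_resampling(30)
n_original = samples_after_resampling(30, target_sr=44100)
```

Because `librosa.load` resamples on read, no separate resampled copy of the dataset needs to be written to disk.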

Steps

  1. Make sure the files are ordered following the Ballroom_storage_ordering file.
  2. Configure the spectrogram transformation within the Dataset Transformer and generate the MCLNN-Ready hdf5 for the dataset.
  3. Generate the indices for the folds using the Index Generator script.
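
Since the dataset has no predefined split, step 3 produces the fold indices. The sketch below shows one way such indices could be generated for a 10-fold cross-validation, keeping the class proportions similar across folds. It is an illustration under assumptions: `stratified_folds` is a hypothetical function, and the Index Generator script may use a different strategy, seed, or output format.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=10, seed=0):
    """Split clip indices into n_folds folds with similar class proportions.

    labels: one class label per clip, in storage order.
    Returns a list of n_folds sorted index lists. Hypothetical helper.
    """
    rng = random.Random(seed)

    # Group clip indices by class.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_folds)]
    offset = 0  # continue the round-robin across classes to balance fold sizes
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for i, idx in enumerate(indices):
            folds[(offset + i) % n_folds].append(idx)
        offset += len(indices)
    return [sorted(fold) for fold in folds]


# Example with 698 clips and 8 placeholder class labels (not the real labels).
labels = [i % 8 for i in range(698)]
folds = stratified_folds(labels)

# Fold k is held out for testing; the remaining folds are used for training.
k = 0
test_idx = folds[k]
train_idx = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
```

The indices only stay meaningful if the clips keep a fixed order, which is why step 1 asks for the files to follow the Ballroom_storage_ordering file.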