License: MIT

HOMBURG dataset for MCLNN

The Homburg music genre dataset.

Clip Duration   Format   Count   Categories
10 secs         .mp3     1886    9

Dataset Summary:

  • Clips are 10 seconds long and sampled at 44100 Hz.
  • The dataset does not ship with a predefined split for cross-validation.

This folder contains:

  • Scripts required to prepare the Homburg dataset for MCLNN processing.
  • Pretrained weights and indices for the 10-fold cross-validation, together with the standardization parameters, to replicate the results in:

Fady Medhat, David Chesmore and John Robinson, "Music Genre Classification Using Masked Conditional Neural Networks," International Conference on Neural Information Processing (ICONIP), 2017.

Preprocessing

The following are the steps involved in preparing the Homburg dataset:

  1. Convert .mp3 to .wav
  2. Downsample the clips to 22050 Hz.
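
The two steps above can be sketched from Python as a single ffmpeg call per clip. This is an illustration only, not the repository's preparation scripts: the function names and the example paths are hypothetical, and it assumes `ffmpeg` is on the PATH. The `-ar` option sets the output sampling rate, so decoding and downsampling happen in one pass here, whereas the steps list them separately.

```python
import subprocess
from pathlib import Path


def ffmpeg_command(mp3_path, wav_dir, sample_rate=22050):
    """Build an ffmpeg call that decodes one .mp3 clip to .wav at sample_rate Hz."""
    # Hypothetical layout: the output keeps the clip's base name inside wav_dir.
    wav_path = Path(wav_dir) / (Path(mp3_path).stem + ".wav")
    return [
        "ffmpeg",
        "-y",                     # overwrite an existing output file
        "-i", str(mp3_path),      # input clip
        "-ar", str(sample_rate),  # output sampling rate in Hz
        str(wav_path),
    ]


def convert_clip(mp3_path, wav_dir, sample_rate=22050):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(ffmpeg_command(mp3_path, wav_dir, sample_rate), check=True)
```

Calling `convert_clip("clips/example.mp3", "wav")` would write `wav/example.wav`, provided the `wav` directory already exists.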

Preparation scripts prerequisites

The preparation scripts require the following packages to be installed beforehand:

  • ffmpeg version N-81489-ga37e6dd
  • numpy 1.11.2+mkl
  • librosa 0.4.0
  • h5py 2.6.0
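
Before running the scripts, it may help to compare the installed Python packages against the versions listed above. The snippet below is a minimal sketch, not part of the repository: it assumes Python 3.8 or later for `importlib.metadata` (the listed versions predate that interpreter), the helper names are hypothetical, and ffmpeg is not covered because it is not a Python package.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed in this README (numpy is listed there as 1.11.2+mkl).
PINNED = {"numpy": "1.11.2", "librosa": "0.4.0", "h5py": "2.6.0"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def report(pinned=PINNED):
    """Map each package to (listed version, installed version or None)."""
    return {name: (wanted, installed_version(name)) for name, wanted in pinned.items()}


for name, (wanted, found) in report().items():
    print(f"{name}: README lists {wanted}, installed {found or 'not found'}")
```

A mismatch does not necessarily mean the scripts will fail, but it is the first thing to check if they do.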

Steps

  1. Download the dataset and run the preparation scripts in the order of their labels.
  2. Make sure the files are ordered as listed in the Homburg_storage_ordering file.
  3. Configure the spectrogram transformation within the Dataset Transformer and generate the MCLNN-ready hdf5 file for the dataset.
  4. Generate the indices for the folds using the Index Generator script.
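
Because the dataset has no predefined split, the fold indices have to be generated. The sketch below shows one way to build category-balanced folds for 10-fold cross-validation. It is not the Index Generator script: the function names, the fixed seed, and the round-robin assignment are assumptions made for illustration, and the labels are expected as one category id per clip in storage order.

```python
import random
from collections import defaultdict


def stratified_fold_indices(labels, n_folds=10, seed=0):
    """Assign each clip index to one of n_folds folds, balancing categories.

    labels: list of category ids, one per clip, in storage order.
    Returns a list of n_folds sorted lists of clip indices.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    folds = [[] for _ in range(n_folds)]
    offset = 0
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Deal indices round-robin, continuing where the previous
        # category stopped so that overall fold sizes stay even.
        for position, index in enumerate(indices):
            folds[(offset + position) % n_folds].append(index)
        offset += len(indices)
    return [sorted(fold) for fold in folds]


def train_test_split(folds, test_fold):
    """Return (train, test) index lists, holding out one fold for testing."""
    test = folds[test_fold]
    train = sorted(i for k, fold in enumerate(folds) if k != test_fold for i in fold)
    return train, test
```

Fixing the seed keeps the folds reproducible across runs, which matters when results are compared against stored indices.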