
YorNoise dataset for MCLNN

The YorNoise environmental sound dataset.

Clip Duration   Format   Count   Categories
4 secs          .wav     1527    2

Dataset Summary:

  • Clips are 4 seconds in length, sampled at 44100 Hz.
  • The dataset is released with predefined 10-fold splits for cross-validation.
  • In the experiments, it is combined with the UrbanSound8k dataset to form a 12-class dataset (see the label-mapping sketch after this list).
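For orientation, a minimal sketch of what the combined 12-class label space could look like. The UrbanSound8k names are that dataset's standard ten classes, while the two YorNoise class names and the index ordering are assumptions made here for illustration only, not the mapping used by the released scripts.

```python
# Hypothetical label map for the combined 12-class experiments. The
# UrbanSound8k names are that dataset's standard ten classes; the two
# YorNoise class names and the index ordering are assumptions made here
# purely for illustration.
URBANSOUND8K_CLASSES = [
    "air_conditioner", "car_horn", "children_playing", "dog_bark", "drilling",
    "engine_idling", "gun_shot", "jackhammer", "siren", "street_music",
]
YORNOISE_CLASSES = ["traffic", "rail"]  # assumed names for the two categories

COMBINED_CLASSES = URBANSOUND8K_CLASSES + YORNOISE_CLASSES  # 12 classes in total
LABEL_TO_INDEX = {name: idx for idx, name in enumerate(COMBINED_CLASSES)}
```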

This folder contains:

  • Scripts required to prepare the YorNoise and UrbanSound8k datasets for MCLNN processing.

  • Pretrained weights, fold indices for the 10-fold cross-validation, and the standardization parameters needed to replicate the results in the following paper (a sketch of applying such parameters follows this list):

    Fady Medhat, David Chesmore and John Robinson, "Recognition of Acoustic Events Using Masked Conditional Neural Networks," 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)
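The format of the released standardization parameters is not described here; purely as an illustration, assuming they amount to a per-feature mean and standard deviation stored in a NumPy .npz file, applying them could look like this:

```python
import numpy as np

# Illustrative only: assumes the standardization parameters are a per-feature
# mean and standard deviation stored in an .npz file. The actual file name and
# format shipped with this folder may differ.
params = np.load("standardization_parameters.npz")  # hypothetical file name
mean, std = params["mean"], params["std"]

def standardize(segments):
    """Z-score standardization of a batch of spectrogram segments."""
    return (segments - mean) / std
```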

Preprocessing

The preprocessing applied to the YorNoise dataset is resampling the clips to 22050 Hz.
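Since ffmpeg is listed among the prerequisites below, the resampling step can be sketched as a thin wrapper around it; the directory names in this sketch are placeholders, not the layout the released scripts expect.

```python
import pathlib
import subprocess

# Resample every .wav clip to 22050 Hz with ffmpeg. SRC_DIR and DST_DIR are
# placeholders; point them at the actual dataset folders.
SRC_DIR = pathlib.Path("YorNoise_original")
DST_DIR = pathlib.Path("YorNoise_22050")
DST_DIR.mkdir(exist_ok=True)

for wav in SRC_DIR.rglob("*.wav"):
    subprocess.run(
        ["ffmpeg", "-y", "-i", str(wav), "-ar", "22050", str(DST_DIR / wav.name)],
        check=True,
    )
```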

Preparation scripts prerequisites

The preparation scripts require the following packages to be installed beforehand:

  • ffmpeg version N-81489-ga37e6dd
  • numpy 1.11.2+mkl
  • librosa 0.4.0
  • h5py 2.6.0

Steps

  1. Download the UrbanSound8k dataset and execute the scripts in its preparation scripts folder, following the order of their labels.
  2. Download the YorNoise dataset and execute the scripts in its preparation scripts folder, following the order of their labels.
  3. Make sure the files are ordered as listed in the yornoise_storage_ordering.txt file.
  4. Configure the spectrogram transformation within the Dataset Transformer and generate the MCLNN-ready HDF5 file for the dataset using the YorNoiseUrbansound8k_MCLNN.csv file (see the sketch after this list).
  5. Generate the fold indices using the Index Generator script.
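The actual spectrogram settings belong to the Dataset Transformer configuration and are not reproduced here. Purely as an illustration of step 4, a log mel-spectrogram extraction written to an HDF5 file with librosa and h5py could look like the following; the mel-band count, FFT size, hop length, and file names are assumptions, and the librosa calls target a recent release rather than the 0.4.0 version listed above.

```python
import h5py
import librosa
import numpy as np

# Illustrative sketch of step 4: compute a log mel-spectrogram per clip and
# store it in an HDF5 file. All parameter values and file names below are
# placeholders, not the settings used to produce the released weights.
SR = 22050
N_FFT = 1024
HOP_LENGTH = 512
N_MELS = 60

def logmel(path):
    y, _ = librosa.load(path, sr=SR)
    mel = librosa.feature.melspectrogram(
        y=y, sr=SR, n_fft=N_FFT, hop_length=HOP_LENGTH, n_mels=N_MELS
    )
    return librosa.power_to_db(mel).T  # frames x mel bands

# clip_paths should follow the order given in yornoise_storage_ordering.txt
clip_paths = ["example_clip.wav"]

with h5py.File("YorNoiseUrbansound8k_MCLNN.hdf5", "w") as f:
    for i, clip in enumerate(clip_paths):
        f.create_dataset(str(i), data=logmel(clip).astype(np.float32))
```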