Handy ASR noise dataset

A handy dataset of noise recordings for augmentation in ASR / TTS:

~20k noise files;
~200 distinct categories;
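
As an illustration of the intended use, here is a minimal sketch of mixing a noise clip into a speech clip at a chosen signal-to-noise ratio. It is not code shipped with the dataset: the function name `mix_at_snr` and its parameters are hypothetical, and it assumes both signals are already 1-D float arrays at the same sampling rate.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db, rng=None):
    """Add `noise` to `speech` at the requested signal-to-noise ratio (dB).

    Hypothetical helper: both inputs are 1-D float arrays sampled at the
    same rate. The noise is looped if it is shorter than the speech.
    """
    rng = np.random.default_rng() if rng is None else rng
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Loop the noise if needed, then take a random crop of matching length.
    if len(noise) < len(speech):
        repeats = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, repeats)
    start = rng.integers(0, len(noise) - len(speech) + 1)
    noise = noise[start:start + len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0.0 or noise_power == 0.0:
        # A silent input gives nothing to scale against; return speech as is.
        return speech.copy()

    # Scale the noise so that 10 * log10(P_speech / P_scaled_noise) == snr_db.
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise
```

Files stored as 16-bit integers would first need to be converted to floats (for example by dividing by 32768) before being passed to a function like this.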

Contribution ideas

Add much more data from the BBC Sound Effects dataset.

Download links

Meta data file / 2.0M / 73cb528656a484b20e02d6c5fd05f14c
Noise archive file / 4.7G / 5e069c867a0da891f57616905129b6c3
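
The 32-character hex strings above look like MD5 digests, although the hash algorithm is not named here. Assuming they are MD5, a download could be verified with a sketch like the following; the helper names `md5_of_file` and `matches_checksum` are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Assumption: the checksums listed above are MD5 digests.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so a multi-gigabyte archive
        # never has to fit in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's MD5 digest with an expected hex string."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

For the noise archive, the call would be `matches_checksum(path_to_archive, "5e069c867a0da891f57616905129b6c3")`, where `path_to_archive` is wherever the file was saved locally.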

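Open the feather file with pandas:

import pandas as pd

# file_path: local path to the downloaded feather file
# (pandas needs the pyarrow package to read feather files)
df = pd.read_feather(file_path)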

Data preparation

The dataset is compiled from openly available sources. All labels resembling loud human speech were removed (background noise, e.g. street chatter, was kept). All items are between 0 and 60 seconds long.

All files are normalized as follows:

Converted to mono, if necessary;
Converted to 16 kHz sampling rate, if necessary;
Stored as 16-bit integers;
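
The three steps above can be sketched roughly as follows. This is not the pipeline that was used to build the dataset, only an illustration under stated assumptions: the function name `normalize_audio` is hypothetical, the input is assumed to be float samples in [-1, 1], and linear interpolation stands in for a proper band-limited resampler.

```python
import numpy as np

TARGET_RATE = 16_000  # Hz, the sampling rate stated in the list above


def normalize_audio(samples, rate, target_rate=TARGET_RATE):
    """Return mono, 16 kHz, int16 audio from float samples in [-1, 1].

    `samples` has shape (n_samples,) or (n_samples, n_channels).
    """
    audio = np.asarray(samples, dtype=np.float64)

    # 1. Convert to mono, if necessary, by averaging the channels.
    if audio.ndim == 2:
        audio = audio.mean(axis=1)

    # 2. Convert to the target sampling rate, if necessary.
    #    Linear interpolation is a crude stand-in used only to keep the
    #    sketch dependency-free; a real pipeline would low-pass filter.
    if rate != target_rate:
        n_out = int(round(len(audio) * target_rate / rate))
        old_times = np.arange(len(audio)) / rate
        new_times = np.arange(n_out) / target_rate
        audio = np.interp(new_times, old_times, audio)

    # 3. Store as 16-bit integers, clipping first to avoid overflow.
    audio = np.clip(audio, -1.0, 1.0)
    return np.round(audio * 32767).astype(np.int16)
```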

Contacts

Please contact us here or just create a GitHub issue!

License

cc-by

References / citations / licenses

Links / license

rnnoise / CC0;
acoustic events / if you end up using the dataset, we ask you to cite the following paper;
urban sounds / cc-by-nc;
esc-50 / license (cc-by-nc);
freiburg-106 / ?;
sound-events / ?;
BBC Sound Effects (a small part) / license;
nar dataset / the data are freely accessible for scientific research purposes and for non-commercial applications

Paper citations:

Naoya Takahashi, Michael Gygli, Beat Pfister and Luc Van Gool, "Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Recognition", Proc. Interspeech 2016, San Francisco;
J. Salamon, C. Jacoby and J. P. Bello, "A Dataset and Taxonomy for Urban Sound Research", 22nd ACM International Conference on Multimedia, Orlando USA, Nov. 2014;

Donations

Donate (each coffee pays for several full downloads) / use our DO referral link to help.

evios/asr-noises