Handy ASR noise dataset
A handy dataset for noise augmentation in ASR / TTS:
- ~20k noise files;
- ~200 distinct categories;
Contact us! Open issues, collaborate, submit a PR, contribute, share your datasets!
Contribution ideas
Add much more data from the BBC Sound Effects dataset.
Download links
Metadata file / 2.0M / 73cb528656a484b20e02d6c5fd05f14c
Noise archive file / 4.7G / 5e069c867a0da891f57616905129b6c3
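The 32-character hex strings in the list above look like MD5 checksums; the README does not name the algorithm, so treat that as an assumption. Under that assumption, a download could be checked with a sketch like the one below. The function names and the local file name are hypothetical and only serve the illustration.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Compare a file's MD5 digest with the expected hex string."""
    return md5_of_file(path) == expected_hex.lower()


# Expected values copied from the download list above
# (assumed, not confirmed, to be MD5 digests).
EXPECTED = {
    "metadata": "73cb528656a484b20e02d6c5fd05f14c",
    "archive": "5e069c867a0da891f57616905129b6c3",
}

if __name__ == "__main__":
    # Hypothetical local file name; use whatever name the download was saved under.
    archive_path = Path("noise_archive_download")
    if archive_path.exists():
        print("archive ok:", verify_download(archive_path, EXPECTED["archive"]))
```

Reading in chunks keeps memory use flat, which matters for the 4.7G archive.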
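Open the feather file:
import pandas as pd  # read_feather requires the pyarrow package
file_path = "path/to/downloaded/file.feather"  # placeholder: point this at your local copy
df = pd.read_feather(file_path)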
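Since the dataset is meant for noise augmentation, a common use is mixing a noise clip into a speech signal at a chosen signal-to-noise ratio (SNR). The following is a minimal NumPy sketch of that idea, not the authors' pipeline. The function name `mix_at_snr` and its parameters are illustrative, and it assumes both signals are already loaded as 1-D float arrays at the same sampling rate.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db, rng=None):
    """Add noise to speech so the speech-to-noise power ratio equals snr_db.

    Both inputs are 1-D float arrays at the same sampling rate.
    """
    rng = np.random.default_rng() if rng is None else rng
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    if len(speech) == 0 or len(noise) == 0:
        return speech.copy()

    # Tile short noise clips, then take a random crop matching the speech length.
    if len(noise) < len(speech):
        repeats = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, repeats)
    start = int(rng.integers(0, len(noise) - len(speech) + 1))
    noise = noise[start:start + len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0 or noise_power == 0:
        return speech.copy()  # nothing to scale against

    # Scale the noise so that 10 * log10(speech_power / scaled_noise_power) == snr_db.
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    scale = np.sqrt(target_noise_power / noise_power)
    return speech + scale * noise
```

The mixed signal can exceed the original amplitude range, so it may need rescaling or clipping before it is written back as 16-bit integers.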
Data preparation
The dataset was compiled from open-domain sources. All labels resembling loud human speech were removed (but background noise, e.g. street chatter, was kept). All items are between 0 and 60 seconds long.
All files are normalized as follows:
- Converted to mono, if necessary;
- Converted to 16 kHz sampling rate, if necessary;
- Stored as 16-bit integers;
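The three normalization steps above can be sketched as follows. This is an illustration under stated assumptions, not the script used to build the dataset: the function name `normalize_audio` is hypothetical, the input is assumed to be a float array scaled to [-1, 1], and linear interpolation stands in for a proper resampler.

```python
import numpy as np

TARGET_RATE = 16_000  # sampling rate stated in the list above


def normalize_audio(samples, source_rate, target_rate=TARGET_RATE):
    """Return mono int16 audio at target_rate.

    samples: float array in [-1, 1], shape (n,) or (n, channels).
    """
    samples = np.asarray(samples, dtype=np.float64)

    # 1. Convert to mono by averaging channels, if necessary.
    if samples.ndim == 2:
        samples = samples.mean(axis=1)

    # 2. Resample, if necessary. Linear interpolation is a naive stand-in:
    #    it applies no anti-aliasing filter.
    if source_rate != target_rate and len(samples) > 1:
        duration = len(samples) / source_rate
        n_out = int(round(duration * target_rate))
        old_t = np.arange(len(samples)) / source_rate
        new_t = np.arange(n_out) / target_rate
        samples = np.interp(new_t, old_t, samples)

    # 3. Store as 16-bit integers.
    return np.round(np.clip(samples, -1.0, 1.0) * 32767).astype(np.int16)
```

For real audio, a filtered resampler such as `scipy.signal.resample_poly` would be a better choice for step 2, because downsampling without a low-pass filter introduces aliasing.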
Contacts
Please contact us here or just create a GitHub issue!
License
cc-by
References / citations / licenses
Links / license
- rnnoise / CC0;
- acoustic events / "if you end up using the dataset, we ask you to cite the following paper";
- urban sounds / cc-by-nc;
- esc-50 / license (cc-by-nc);
- freiburg-106 / ?;
- sound-events / ?;
- BBC Sound Effects (a small part) / license;
- nar dataset / "the data are freely accessible for scientific research purposes and for non-commercial applications";
Paper citations:
- Naoya Takahashi, Michael Gygli, Beat Pfister and Luc Van Gool, "Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Recognition", Proc. Interspeech 2016, San Francisco;
- J. Salamon, C. Jacoby and J. P. Bello, "A Dataset and Taxonomy for Urban Sound Research", 22nd ACM International Conference on Multimedia, Orlando USA, Nov. 2014;
Donations
Donate (each coffee pays for several full downloads) / use our DO referral link to help.