/asr-noises

Handy ASR noise dataset

A handy dataset for noise augmentation in ASR / TTS:

  • ~20k noise files;
  • ~200 distinct categories.
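
One common way to use noise files like these for augmentation is to add a noise clip to a speech clip at a chosen signal-to-noise ratio (SNR). The sketch below is a generic illustration, not a procedure taken from this dataset's authors. `mix_at_snr` is a hypothetical helper, and both inputs are assumed to be non-empty mono float arrays at the same sampling rate.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that speech power / noise power matches snr_db."""
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Loop the noise if it is shorter than the speech, then trim it to length.
    repeats = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, repeats)[: len(speech)]
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if noise_power == 0:
        return speech.copy()  # silent noise clip: nothing to add
    # Scale the noise so that 10 * log10(speech_power / scaled_noise_power) == snr_db.
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    scale = np.sqrt(target_noise_power / noise_power)
    return speech + scale * noise
```

The mixed signal can exceed the original amplitude range, so it may need clipping or rescaling before it is written back as 16-bit integers.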

Contact us! Open issues, collaborate, submit a PR, contribute, share your datasets!

Contribution ideas

Add much more data from the BBC Sound Effects dataset.

Download links

  • Metadata file / 2.0M / 73cb528656a484b20e02d6c5fd05f14c
  • Noise archive file / 4.7G / 5e069c867a0da891f57616905129b6c3
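
The 32-character hex strings next to each file look like MD5 digests, although the README does not say so explicitly. Assuming that is the case, a minimal sketch for verifying a download could look like this (`md5_of_file` and `matches_checksum` are hypothetical helper names):

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a multi-gigabyte archive never sits in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's MD5 digest with the value listed next to its link."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_checksum(archive_path, "5e069c867a0da891f57616905129b6c3")` should return `True` for an intact noise archive, where `archive_path` is wherever you saved the download.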

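Open the Feather file with pandas:

import pandas as pd

# Placeholder path: point this at the downloaded Feather file (reading it requires pyarrow).
file_path = "path/to/downloaded_file.feather"
df = pd.read_feather(file_path)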

Data preparation

The dataset is compiled from open-domain sources. All labels resembling loud human speech were removed, but background noise (e.g. street chatter) was kept. All items are between 0 and 60 seconds long.

All files are normalized as follows:

  • Converted to mono, if necessary;
  • Converted to 16 kHz sampling rate, if necessary;
  • Stored as 16-bit integers.
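
As a sanity check on the normalization described above, the sketch below inspects a file's header with Python's standard `wave` module. It assumes the files are stored as WAV, which this README does not state; `describe_wav` and `is_normalized` are hypothetical helper names.

```python
import wave


def describe_wav(path):
    """Read basic header information from a WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "duration_s": frames / rate,
        }


def is_normalized(info, max_duration_s=60.0):
    """Check for mono, 16 kHz, 16-bit audio no longer than max_duration_s."""
    return (
        info["channels"] == 1
        and info["sample_rate"] == 16000
        and info["sample_width_bytes"] == 2
        and info["duration_s"] <= max_duration_s
    )
```

Calling `is_normalized(describe_wav(path))` on an extracted file should return `True` if the file matches the format listed above.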

Contacts

Please contact us here or just create a GitHub issue!

License

CC BY

References / citations / licenses

Links / license

Paper citations:

  • Naoya Takahashi, Michael Gygli, Beat Pfister and Luc Van Gool, "Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Recognition", Proc. Interspeech 2016, San Francisco;
  • J. Salamon, C. Jacoby and J. P. Bello, "A Dataset and Taxonomy for Urban Sound Research", 22nd ACM International Conference on Multimedia, Orlando, USA, Nov. 2014.

Donations

Donate (each coffee pays for several full downloads) / use our DO referral link to help.