
A corpus of speech recordings in 50 languages.

Primary language: PHP

License: Apache License 2.0 (Apache-2.0)

50 Languages speech dataset

The source audio was taken from the 50 Languages project page, then split into segments and transcribed.

The dataset is ready to use as a validation or benchmark set for speech recognition and other speech processing tasks.
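As an illustration of benchmark use, the sketch below computes word error rate (WER), a common metric for comparing a recognizer's output against a reference transcript. It is not part of this repository: the function name `word_error_rate` and the example sentences are hypothetical, and it assumes transcripts can be tokenized on whitespace, which does not hold for languages written without spaces (a character error rate would suit those better).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: whitespace tokenization and case-insensitive matching.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcript pair, not taken from the dataset.
    print(word_error_rate("good morning everyone", "good evening everyone"))
```

WER can exceed 1.0 when the hypothesis contains many inserted words, so it is best read as an error count per reference word rather than a percentage of wrong words.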

Author: Pawel Cyrta

Data taken from the [50 Languages project](https://www.50languages.com/language-mp3.php).