coqui-ai/open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
MIT
Issues
- 0
Persian tts dataset
#220 opened by karim23657 - 1
kreyòl ayisyen :)
#190 opened by JRMeyer - 0
podcast fillers
#218 opened by JRMeyer - 0
TALCS: An Open-Source Mandarin-English Code-Switching Corpus and a Speech Recognition Baseline
#216 opened by JRMeyer - 0
XTREME-S dataset
#215 opened by jhdeov - 0
Who
#214 opened by Jerryagonoy25 - 0
Santa Barbara Speech Corpus
#213 opened by JRMeyer - 0
Kokoro Japanese TTS single speaker
#212 opened by JRMeyer - 0
male LJSpeech italian
#211 opened by JRMeyer - 0
CrowdSpeech
#210 opened by JRMeyer - 0
KsponSpeech (Korean conversations)
#208 opened by JRMeyer - 0
JTubeSpeech (Japanese Youtube)
#207 opened by JRMeyer - 0
EmoV-DB (emotional synthesis)
#206 opened by JRMeyer - 0
finnish parliament
#205 opened by JRMeyer - 0
databases from CMU speech group
#204 opened by JRMeyer - 0
SADiLaR corpora
#203 opened by JRMeyer - 0
all podcasts dataset
#202 opened by JRMeyer - 0
Arabic corpus
#201 opened by JRMeyer - 0
Quran recitation (kaggle)
#200 opened by JRMeyer - 0
falabrasil portuguese
#199 opened by JRMeyer - 0
EasyComDataset (cocktail party effect)
#198 opened by JRMeyer - 0
spoken word QA dataset
#197 opened by JRMeyer - 0
Agriculture keywords (english + luganda)
#196 opened by JRMeyer - 0
key words for african languages
#195 opened by JRMeyer - 0
brazilian portuguese emotion recognition
#194 opened by JRMeyer - 0
Voxlingua 107 (6k hours)
#193 opened by JRMeyer - 0
WeNetSpeech (10k mandarin)
#192 opened by JRMeyer - 0
Has anyone trained with Japanese?
#191 opened by kju196 - 0
2k hours japanese TV
#188 opened by JRMeyer - 0
data is CC0! 400k hours unlabeled voxpopuli
#187 opened by JRMeyer - 0
10k hours japanese youtube
#186 opened by JRMeyer - 0
qualcomm hotwords ( hey snapdragon)
#185 opened by JRMeyer - 0
kerstin german TTS
#184 opened by JRMeyer - 0
Kaggle ukrainian
#183 opened by JRMeyer - 0
SpiCE Corpus === english / cantonese
#181 opened by JRMeyer - 0
Diarization Datasets
#180 opened by JRMeyer - 3
Mongolian 300 synthetic STT data + others
#179 opened by JRMeyer - 0
TwB corpora
#178 opened by JRMeyer - 0
media speech: french / arabic / spanish / turkish
#177 opened by JRMeyer - 0
english 12 speaker anechoic chamber cc-by 3.0
#176 opened by JRMeyer - 0
odia and indic langs
#175 opened by JRMeyer - 0
Datasets from jace-assistant
#173 opened by JRMeyer - 0
african NLP / ASR data
#172 opened by JRMeyer - 0
ASVspoof 2021 challenge
#171 opened by JRMeyer - 0
Chatino `ctp` CC-BY-SA
#170 opened by ftyers - 0
Arabic speech commands
#169 opened by JRMeyer - 0
chinese speech emotion datasets
#168 opened by JRMeyer - 0
Earnings 21 from Rev
#167 opened by JRMeyer