The dataset contains 240 hours of Hindi speech recorded by 401 Indian speakers. Recordings were made in both quiet and noisy environments, which brings the data closer to real application scenarios. The content is varied, covering economy, entertainment, news, spoken language, and more. All texts are manually transcribed with high accuracy. The data can be applied to speech recognition, machine translation, and voiceprint recognition.
For more details, please refer to the link: https://www.nexdata.ai/datasets/speechrecog/118?source=Github
Format: 16 kHz, 16-bit, uncompressed WAV, mono channel
Recording environment: 304 speakers recorded in a quiet indoor environment without echo; 97 recorded in an ordinary environment with background noise that does not affect speech recognition
Recording content: economy, entertainment, news, spoken language, numbers, letters
Speakers: 401 Indian speakers, 61% of whom are male
Device: Android mobile phone, iPhone
Language: Hindi
Annotation: text, time points of the speech data, 5 noise symbols, special identifiers
Commercial License
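
The audio format given above (16 kHz, 16-bit, mono, uncompressed WAV) can be checked on a delivered file before it goes into a training pipeline. The following is a minimal sketch using only Python's standard `wave` module; the function name `check_wav_format` is hypothetical and not part of any Nexdata tooling, and the file name in the usage note is a placeholder.

```python
import wave

# Expected values taken from the format description of this dataset.
EXPECTED_SAMPLE_RATE = 16000   # 16 kHz
EXPECTED_SAMPLE_WIDTH = 2      # 16-bit samples = 2 bytes
EXPECTED_CHANNELS = 1          # mono


def check_wav_format(path):
    """Return a list of mismatches against the documented format.

    An empty list means the file matches. This is an illustrative helper,
    not an official validator.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_SAMPLE_RATE:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
        # The wave module reports "NONE" for uncompressed PCM data.
        if wav.getcomptype() != "NONE":
            problems.append(f"compression type is {wav.getcomptype()}")
    return problems
```

Calling `check_wav_format("sample.wav")` on a conforming file returns an empty list; any other result names the properties that differ from the documented format.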