240-Hours-Hindi-Speech-Data-by-Mobile-Phone_Reading

Description

The dataset contains 240 hours of Hindi read speech recorded on mobile phones by 401 Indian speakers. Recordings were made in both quiet and noisy environments, which better matches real application scenarios. The content is varied, covering economy, entertainment, news, spoken language, and more. All texts are manually transcribed with high accuracy. The data can be used for speech recognition, machine translation, and voiceprint recognition.

For more details, please refer to the link: https://www.nexdata.ai/datasets/speechrecog/118?source=Github

Format

16 kHz, 16-bit, uncompressed WAV, mono channel
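
The stated format can be checked on any file before training. The following is a minimal sketch using Python's standard `wave` module; the function name `check_wav_format` and the `EXPECTED` table are illustrative assumptions, not part of the dataset or any Nexdata tooling.

```python
import wave

# Expected properties taken from the Format section above:
# 16 kHz sample rate, 16-bit samples (2 bytes), mono.
EXPECTED = {"sample_rate": 16000, "sample_width_bytes": 2, "channels": 1}


def check_wav_format(path):
    """Return a list of mismatches between a WAV file and the stated format.

    An empty list means the file matches. The standard `wave` module only
    reads uncompressed PCM, so a compressed file raises `wave.Error`.
    """
    with wave.open(path, "rb") as wf:
        actual = {
            "sample_rate": wf.getframerate(),
            "sample_width_bytes": wf.getsampwidth(),
            "channels": wf.getnchannels(),
        }
    return [
        f"{key}: expected {want}, got {actual[key]}"
        for key, want in EXPECTED.items()
        if actual[key] != want
    ]
```

Calling `check_wav_format("some_file.wav")` on a conforming file should return an empty list; the file name here is a placeholder, since the directory layout of the dataset is not described in this document.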

Recording environment

304 speakers recorded in a quiet indoor environment without echo; 97 recorded in an ordinary environment with background noise that does not affect speech recognition

Recording content (read speech)

economy, entertainment, news, spoken language, numbers, letters

Speaker

401 Indians, 61% of whom are male

Device

Android mobile phone, iPhone

Language

Hindi

Transcription content

text, time points of the speech data, 5 noise symbols, special identifiers

Application scenarios

speech recognition, machine translation, voiceprint recognition

Licensing Information

Commercial License