The data were recorded by 797 Chinese children aged 3 to 5, 39% of whom were aged 5. The recorded content is suited to young children and consists mainly of storybooks, children's songs, and spoken language, with around 120 sentences per speaker. Each speaker was recorded simultaneously with a hi-fi microphone and a cellphone. The valid data total 41.8 hours. Texts were manually transcribed with high accuracy.
For more details, please refer to the link: https://www.nexdata.ai/datasets/speechrecog/76?source=Github
16 kHz/22.05 kHz/44.1 kHz, 16-bit, uncompressed WAV, mono channel
quiet indoor environment, without echo
general category, children's songs, storybooks, human-machine interaction, numbers, letters
797 speakers, 49% of whom are female
recorded with mobile phones (Android and iPhone) and a microphone
Mandarin
text, noise symbols
speech recognition; voiceprint recognition
Commercial License
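
As a quick sanity check, a file can be compared against the audio format stated above (16 kHz, 22.05 kHz, or 44.1 kHz; 16-bit; uncompressed mono WAV). The following is a minimal sketch using only Python's standard `wave` module. The function name `check_wav_format` and the file path in the usage note are hypothetical and not part of the dataset or its tooling.

```python
import wave

# Expected values, taken from the format line in this description.
ALLOWED_SAMPLE_RATES = {16000, 22050, 44100}  # Hz
EXPECTED_SAMPLE_WIDTH_BYTES = 2               # 16-bit samples
EXPECTED_CHANNELS = 1                         # mono


def check_wav_format(path):
    """Return a list of mismatches between a WAV file and the stated format.

    Hypothetical helper for illustration; an empty list means the file
    matches the sample rate, bit depth, and channel count listed above.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        channels = wav.getnchannels()
        if rate not in ALLOWED_SAMPLE_RATES:
            problems.append(f"unexpected sample rate: {rate} Hz")
        if width != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"unexpected sample width: {width * 8} bit")
        if channels != EXPECTED_CHANNELS:
            problems.append(f"unexpected channel count: {channels}")
    return problems
```

For example, `check_wav_format("sample.wav")` (with a placeholder path) returns an empty list when the file matches the stated format, and a list of human-readable mismatches otherwise.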