A curated list of papers and resources for children's automatic speech recognition.
- My Science Tutor (MyST)
  - Freely available for non-commercial use
  - Age: Grade 3-5
  - About 400 hours, 230k utterances, conversational speech
  - 100k utterances have been transcribed
- CSLU Kids' Speech Corpus (OGI)
  - Age: K0 - G11
- PF-STAR
  - Age: 4-14 years old
- The CMU Kids Corpus
  - Wikipedia Page
- Automatic Speech Recognition Tuned for Child Speech in the Classroom
- Improved Children’s Automatic Speech Recognition Combining Adapters and Synthetic Data Augmentation
- Build a 50+ Hours Chinese Mandarin Corpus for Children’s Speech Recognition
- Exploring Adapters with Conformers for Children’s Automatic Speech Recognition
- Sparsely Shared LoRA on Whisper for Child Speech Recognition
- SASB workshop - Analysis of Self-Supervised Speech Models on Children's Speech and Infant Vocalizations
- SASB workshop - SOA: Reducing domain mismatch in SSL Pipeline by Speech Only Adaptation for low resource ASR
- JASA 2024 - ChildAugment: Data Augmentation Methods for Zero-Resource Children's Speaker Verification
- Interspeech 2023 - Data augmentation for children ASR and child-adult speaker classification using voice conversion methods
- ICASSP 2023 - Using Modified Adult Speech as Data Augmentation for Child Speech Recognition
- Interspeech 2022 - Spectral Modification Based Data Augmentation for Improving End-to-End ASR for Children’s Speech
- ICASSP 2022 - LPC Augment: An LPC-Based ASR Data Augmentation Algorithm for Low and Zero-Resource Children's Dialects
- Speech Communication 2021 - Fundamental frequency feature warping for frequency normalization and data augmentation in child automatic speech recognition
- ICASSP 2021 - Fundamental Frequency Feature Normalization and Data Augmentation for Child Speech Recognition
- Interspeech 2020 - Data Augmentation Using Prosody and False Starts to Recognize Non-native Children’s Speech
- Interspeech 2020 - Voice Conversion Based Data Augmentation to Improve Children’s Speech Recognition in Limited Data Scenario
- ASRU 2019 - Data Augmentation Based on Vowel Stretch for Improving Children's Speech Recognition
- ASRU 2019 - GANs for Children: A Generative Data Augmentation Strategy for Children Speech Recognition
- Interspeech 2019 - A Frequency Normalization Technique for Kindergarten Speech Recognition Inspired by the Role of fo in Vowel Perception
- IEEE SPL 2019 - Significance of Pitch-Based Spectral Normalization for Children’s Speech Recognition
- Interspeech 2016 - Improving Children’s Speech Recognition through Out-of-Domain Data Augmentation
- IEEE Access 2024 - Exploring Native and Non-Native English Child Speech Recognition With Whisper
- arXiv 2023 - Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults
- Interspeech 2023 - Adaptation of Whisper models to child speech recognition
- Under-review Speech Communication 2022 - Improving Children's Speech Recognition by Fine-tuning Self-supervised Adult Speech Representations
- IEEE JSTSP 2022 - Towards Better Domain Adaptation for Self-supervised Models: A Case Study of Child ASR
- Interspeech 2022 - DRAFT: A Novel Framework to Reduce Domain Shifting in Self-supervised Learning and Its Application to Children's ASR
- Interspeech 2022 - Transfer Learning for Robust Low-Resource Children's Speech ASR with Transformers and Source-Filter Warping
- ICASSP 2021 - Bi-APC: Bidirectional Autoregressive Predictive Coding for Unsupervised Pre-training and Its Application to Children's ASR
- Computer Speech & Language 2020 - Transfer Learning from Adult to Children for Speech Recognition: Evaluation, Analysis and Recommendations
- Interspeech 2019 - Improving ASR Systems for Children with Autism and Language Impairment Using Domain-Focused DNN Transfer Techniques
- WOCCI 2016 - Improving DNN-Based Automatic Recognition of Non-native Children's Speech with Adult Speech
- ASRU 2023 - No Pitch Left Behind: Addressing Gender Unbalance In Automatic Speech Recognition Through Pitch Manipulation
- SLT 2022 - A Zero-Shot Approach to Identifying Children's Speech in Automatic Gender Classification
- ICASSP 2022 - Towards Better Meta-Initialization with Task Augmentation for Kindergarten-aged Speech Recognition
- Interspeech 2021 - Age-Invariant Training for End-to-End Child Speech Recognition using Adversarial Multi-Task Learning
- ICASSP 2020 - Learning Domain Invariant Representations for Child-Adult Classification from Speech
- Interspeech 2019 - Advances in Automatic Speech Recognition for Child Speech Using Factored Time Delay Neural Network
- ISCSLP 2018 - A Study on Acoustic Modeling for Child Speech Based on Multi-Task Learning
- ICASSP 2019 - Improving Children Speech Recognition Through Feature Learning from Raw Speech Signal
- Connecting Speech science and Speech technology for Children’s Speech
  - Interspeech 2023, Interspeech 2024
- MERLIon CCS Challenge: Language Identification on Code-Switched Child-Directed Speech
  - Interspeech 2023
- ETLT 2021: Shared Task on ASR for Non-Native Children's Speech
  - Interspeech 2021
- CSRC: Children Speech Recognition Challenge
  - SLT 2021
- Spoken Language Processing for Children's Speech
  - Interspeech 2019
This is an active repository and your contributions are always welcome!