License: MIT

Child-ASR-Paper

A curated list of papers and resources for children's automatic speech recognition.

Table of Contents

Datasets
Papers
Special Sessions
Contributing

Datasets

My Science Tutor (MyST)
- Freely available for non-commercial use
- Age: Grade 3-5
- 400 hours, 230k utterances, conversational speech
- 100k utterances have been transcribed
CSLU Kids' Speech Corpus (OGI)
- Age: K0 - G11
PF-STAR
- Age: 4-14 years old
The CMU Kids Corpus
Wikipedia Page

Papers

ICASSP 2024 - updated on 4/1/2024

Normalization and Data Augmentation

JASA 2024 - ChildAugment: Data Augmentation Methods for Zero-Resource Children's Speaker Verification
Interspeech 2023 - Data augmentation for children ASR and child-adult speaker classification using voice conversion methods
ICASSP 2023 - Using Modified Adult Speech as Data Augmentation for Child Speech Recognition
Interspeech 2022 - Spectral Modification Based Data Augmentation for Improving End-to-End ASR for Children’s Speech
ICASSP 2022 - LPC Augment: An LPC-Based ASR Data Augmentation Algorithm for Low and Zero-Resource Children's Dialects
Speech Communication 2021 - Fundamental frequency feature warping for frequency normalization and data augmentation in child automatic speech recognition
ICASSP 2021 - Fundamental Frequency Feature Normalization and Data Augmentation for Child Speech Recognition
Interspeech 2020 - Data Augmentation Using Prosody and False Starts to Recognize Non-native Children’s Speech
Interspeech 2020 - Voice Conversion Based Data Augmentation to Improve Children’s Speech Recognition in Limited Data Scenario
ASRU 2019 - Data Augmentation Based on Vowel Stretch for Improving Children's Speech Recognition
ASRU 2019 - GANs for Children: A Generative Data Augmentation Strategy for Children Speech Recognition
Interspeech 2019 - A Frequency Normalization Technique for Kindergarten Speech Recognition Inspired by the Role of fo in Vowel Perception
IEEE SPL 2019 - Significance of Pitch-Based Spectral Normalization for Children’s Speech Recognition
Interspeech 2016 - Improving Children’s Speech Recognition through Out-of-Domain Data Augmentation

Pretraining + Finetuning

IEEE Access 2024 - Exploring Native and Non-Native English Child Speech Recognition With Whisper
arXiv 2023 - Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults
Interspeech 2023 - Adaptation of Whisper models to child speech recognition
Under-review Speech Communication 2022 - Improving Children's Speech Recognition by Fine-tuning Self-supervised Adult Speech Representations
IEEE JSTSP 2022 - Towards Better Domain Adaptation for Self-supervised Models: A Case Study of Child ASR
Interspeech 2022 - DRAFT: A Novel Framework to Reduce Domain Shifting in Self-supervised Learning and Its Application to Children's ASR
Interspeech 2022 - Transfer Learning for Robust Low-Resource Children's Speech ASR with Transformers and Source-Filter Warping
ICASSP 2021 - Bi-APC: Bidirectional Autoregressive Predictive Coding for Unsupervised Pre-training and Its Application to Children's ASR
Computer Speech & Language 2020 - Transfer Learning from Adult to Children for Speech Recognition: Evaluation, Analysis and Recommendations
Interspeech 2019 - Improving ASR Systems for Children with Autism and Language Impairment Using Domain-Focused DNN Transfer Techniques
WOCCI 2016 - Improving DNN-Based Automatic Recognition of Non-native Children's Speech with Adult Speech

Other topics

ASRU 2023 - No Pitch Left Behind: Addressing Gender Unbalance In Automatic Speech Recognition Through Pitch Manipulation
SLT 2022 - A Zero-Shot Approach to Identifying Children's Speech in Automatic Gender Classification
ICASSP 2022 - Towards Better Meta-Initialization with Task Augmentation for Kindergarten-aged Speech Recognition
Interspeech 2021 - Age-Invariant Training for End-to-End Child Speech Recognition using Adversarial Multi-Task Learning
ICASSP 2020 - Learning Domain Invariant Representations for Child-Adult Classification from Speech
Interspeech 2019 - Advances in Automatic Speech Recognition for Child Speech Using Factored Time Delay Neural Network
ISCSLP 2018 - A Study on Acoustic Modeling for Child Speech Based on Multi-Task Learning
ICASSP 2019 - Improving Children Speech Recognition Through Feature Learning from Raw Speech Signal

Special Sessions

Connecting Speech science and Speech technology for Children’s Speech
- Interspeech 2023, Interspeech 2024
MERLIon CCS Challenge: Language Identification on Code-Switched Child-Directed Speech
- Interspeech 2023
ETLT 2021: Shared Task on ASR for Non-Native Children's Speech
- Interspeech 2021
CSRC: Children Speech Recognition Challenge
- SLT 2021
Spoken Language Processing for Children's Speech
- Interspeech 2019

Contributing

This is an active repository and your contributions are always welcome!