Neural Speech Synthesis

Tutorial @ INTERSPEECH 2022, Sep 18, 2022


Xu Tan, Microsoft Research Asia
Hung-yi Lee, National Taiwan University


Speech synthesis, which comprises several key tasks including text to speech (TTS) and voice conversion (VC), has long been an active research topic in the speech community and has broad applications in industry. With the development of deep learning and artificial intelligence, neural network-based speech synthesis has significantly improved the quality of synthesized speech in recent years. In this tutorial, we give a comprehensive introduction to neural speech synthesis in four parts: 1) the history of speech synthesis technology and a taxonomy of neural speech synthesis; 2) the key methods and applications of text to speech; 3) the key methods and applications of voice conversion; 4) challenges in neural speech synthesis and future research directions.


Slides for TTS
Slides for VC