Neural Speech Synthesis

Tutorial @ INTERSPEECH 2022, Sep 18, 2022

Speakers

Xu Tan, Microsoft Research Asia, xuta@microsoft.com
Hung-yi Lee, National Taiwan University, hungyilee@ntu.edu.tw

Abstract

Speech synthesis, which comprises several key tasks including text to speech (TTS) and voice conversion (VC), has been a hot research topic in the speech community and has broad applications in industry. With the development of deep learning and artificial intelligence, neural network-based speech synthesis has significantly improved the quality of synthesized speech in recent years. In this tutorial, we give a comprehensive introduction to neural speech synthesis in four parts: 1) the history of speech synthesis technology and a taxonomy of neural speech synthesis; 2) the key methods and applications of text to speech; 3) the key methods and applications of voice conversion; 4) challenges in neural speech synthesis and future research directions.

Materials

Slides for TTS
Slides for VC