Deep Learning for Audio Course, 2023
Description
Topics covered in the course:
- Digital Signal Processing
- Automatic Speech Recognition (ASR)
- Key-word spotting (KWS)
- Text-to-Speech (TTS)
- Voice Conversion
- Music Generation with NNs
- Unsupervised learning in Audio
Course materials
Materials
| # | Date | Description | Materials |
|---|---|---|---|
| 1 | February 8 | Lecture: Introduction and Digital Signal Processing | slides |
| | February 8 | Seminar: Introduction and Spectrograms | notebook |
| 2 | February 15 | Lecture: Automatic Speech Recognition 1: introduction, CTC, LAS | slides |
| | February 15 | Seminar: WER, Levenshtein distance, Griffin-Lim Algorithm | notebook |
| 3 | February 22 | Lecture: Automatic Speech Recognition 2: RNN-T, Language models in ASR, BPE, Whisper | slides |
| | February 22 | Seminar: CTC, Beam Search | notebook |
| 4 | March 1 | Lecture: Key-word spotting (KWS) | slides |
| | March 1 | Seminar: RNN-T | notebook |
| 5 | March 15 | Lecture: Text-to-speech: Tacotron, FastSpeech, Guided Attention | slides |
| | March 15 | Seminar: Key-word spotting | notebook |
| 6 | March 22 | Lecture: Text-to-speech: Neural Vocoders (WaveNet, PWGAN, DiffWave) | slides |
| | March 22 | Seminar: Speech generation (TTS): Tacotron2 | notebook |
| 7 | March 29 | Lecture: Voice Conversion: AutoVC, CycleGAN-VC, StarGAN-VC | slides |
| | March 29 | Seminar: WaveNet | notebook |
| 8 | April 5 | Lecture: Self-supervised learning in Audio | slides |
| | April 5 | Seminar: Advanced Vocoders | notebook |
| 9 | April 12 | Lecture: Speaker verification and identification | slides |
| | April 12 | Seminar: HiFi-GAN | notebook |
| 10 | April 19 | Lecture: Music Generation | slides |
| | April 19 | Seminar: Speaker verification | notebook |
Homeworks
Game rules
- 4 homework assignments, 2 points each = 8 points
- final test = 2 points
- maximum score: 8 + 2 = 10 points
Authors
Pavel Severilov
- telegram: @severilov
- e-mail: pseverilov@gmail.com
Daniel Knyazev
- telegram: @Oorgien
- e-mail: xmaximuskn@gmail.com