Deep Learning for Audio Course, 2023

Description

Topics covered in the course:

#	Date	Description	Materials
1	February, 8	Lecture: Introduction and Digital Signal Processing	slides
	February, 8	Seminar: Introduction and Spectrograms	notebook
2	February, 15	Lecture: Automatic Speech Recognition 1: introduction, CTC, LAS	slides
	February, 15	Seminar: WER, Levenshtein distance, Griffin-Lim Algorithm	notebook
3	February, 22	Lecture: Automatic Speech Recognition 2: RNN-T, Language models in ASR, BPE, Whisper	slides
	February, 22	Seminar: CTC, Beam Search	notebook
4	March, 1	Lecture: Keyword spotting (KWS)	slides
	March, 1	Seminar: RNN-T	notebook
5	March, 15	Lecture: Text-to-speech: Tacotron, FastSpeech, Guided Attention	slides
	March, 15	Seminar: Keyword spotting	notebook
6	March, 22	Lecture: Text-to-speech: Neural Vocoders (WaveNet, PWGAN, DiffWave)	slides
	March, 22	Seminar: Speech generation (TTS): Tacotron2	notebook
7	March, 29	Lecture: Voice Conversion: AutoVC, CycleGAN-VC, StarGAN-VC	slides
	March, 29	Seminar: WaveNet	notebook
8	April, 5	Lecture: Self-supervised learning in Audio	slides
	April, 5	Seminar: Advanced Vocoders	notebook
9	April, 12	Lecture: Speaker verification and identification	slides
	April, 12	Seminar: HiFi-GAN	notebook
10	April, 19	Lecture: Music Generation	slides
	April, 19	Seminar: Speaker verification	notebook

Homework	Date	Deadline	Description
1	February, 21	March, 7	Audio classification, audio preprocessing
2	March, 13	March, 27	ASR-1: CTC
3	April, 13	April, 27	ASR-2: RNN-T
4	May, 2	May, 14	Text-to-speech: FastPitch

Pavel Severilov

Daniel Knyazev