Deep Learning Audio Course – AI Masters

Deep Learning for Audio Course, 2023

Description

Topics covered in the course:

  • Digital Signal Processing
  • Automatic Speech Recognition (ASR)
  • Key-word spotting (KWS)
  • Text-to-Speech (TTS)
  • Voice Conversion
  • Music Generation with NNs
  • Unsupervised learning in Audio

Course materials

| #  | Date        | Description | Materials |
|----|-------------|-------------|-----------|
| 1  | February 8  | Lecture: Introduction and Digital Signal Processing | slides |
|    | February 8  | Seminar: Introduction and Spectrograms | notebook |
| 2  | February 15 | Lecture: Automatic Speech Recognition 1: introduction, CTC, LAS | slides |
|    | February 15 | Seminar: WER, Levenshtein distance, Griffin-Lim Algorithm | notebook |
| 3  | February 22 | Lecture: Automatic Speech Recognition 2: RNN-T, Language models in ASR, BPE, Whisper | slides |
|    | February 22 | Seminar: CTC, Beam Search | notebook |
| 4  | March 1     | Lecture: Key-word spotting (KWS) | slides |
|    | March 1     | Seminar: RNN-T | notebook |
| 5  | March 15    | Lecture: Text-to-speech: Tacotron, FastSpeech, Guided Attention | slides |
|    | March 15    | Seminar: Key-word spotting | notebook |
| 6  | March 22    | Lecture: Text-to-speech: Neural Vocoders (WaveNet, PWGAN, DiffWave) | slides |
|    | March 22    | Seminar: Speech generation (TTS): Tacotron2 | notebook |
| 7  | March 29    | Lecture: Voice Conversion: AutoVC, CycleGAN-VC, StarGAN-VC | slides |
|    | March 29    | Seminar: WaveNet | notebook |
| 8  | April 5     | Lecture: Self-supervised learning in Audio | slides |
|    | April 5     | Seminar: Advanced Vocoders | notebook |
| 9  | April 12    | Lecture: Speaker verification and identification | slides |
|    | April 12    | Seminar: Hi-Fi GAN | notebook |
| 10 | April 19    | Lecture: Music Generation | slides |
|    | April 19    | Seminar: Speaker verification | notebook |

Homeworks

| Homework | Date        | Deadline | Description | Link |
|----------|-------------|----------|-------------|------|
| 1        | February 21 | March 7  | 1. Audio classification; 2. Audio preprocessing | Open in GitHub |
| 2        | March 13    | March 27 | ASR-1: CTC | Open in GitHub |
| 3        | April 13    | April 27 | ASR-2: RNN-T | Open in GitHub |
| 4        | May 2       | May 14   | Text-to-speech: FastPitch | Open in GitHub |

Game rules

  • 4 homeworks, 2 points each = 8 points
  • final test = 2 points
  • maximum points: 8 + 2 = 10 points
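
The scoring rules above can be sketched as a small Python helper. This is only an illustration: the function name `final_grade`, the constants, and the validation of input ranges are assumptions made for this example, not part of the official course rules.

```python
# Hypothetical helper illustrating the grading arithmetic described above.
# Names and the input validation are assumptions, not official course policy.
HOMEWORK_COUNT = 4          # number of homeworks
MAX_HOMEWORK_POINTS = 2.0   # maximum points per homework
MAX_TEST_POINTS = 2.0       # maximum points for the final test


def final_grade(homework_points, test_points):
    """Return the total course score (0 to 10) from homework and test points."""
    if len(homework_points) != HOMEWORK_COUNT:
        raise ValueError(f"expected {HOMEWORK_COUNT} homework scores")
    for points in homework_points:
        if not 0 <= points <= MAX_HOMEWORK_POINTS:
            raise ValueError("each homework score must be between 0 and 2")
    if not 0 <= test_points <= MAX_TEST_POINTS:
        raise ValueError("the test score must be between 0 and 2")
    # Total = sum of the four homework scores + final test score.
    return sum(homework_points) + test_points


if __name__ == "__main__":
    # Full marks on every homework and on the final test.
    print(final_grade([2, 2, 2, 2], 2))
```

With full marks everywhere, the helper returns the maximum of 8 + 2 = 10 points.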

Authors

Pavel Severilov

Daniel Knyazev