Deep Learning for Audio

Lectures presented at:

Content:

  • Lecture 1: Physics of sound, Discrete Fourier Transform, Spectrograms.
  • Lecture 2: Learnable DFT, Time and Spectral domain audio augmentations.
  • Lecture 3: Modern TTS, Tacotron, Multi-speaker models, Multi-head models, Phonemes, Griffin-Lim Phase Reconstruction Vocoder.
  • Lecture 4: Expressive TTS, Style Encoders, Style Tokens, SpeechSplit, Zero Resource TTS.
  • Code for training neural DFT.
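
For orientation, the Discrete Fourier Transform from Lecture 1 can be written as a plain matrix multiplication, which is what makes a "learnable" or "neural" DFT possible: the matrix entries can be treated as the weights of a linear layer. The sketch below is not the repository's training code. It is a minimal pure-Python illustration, and the names `dft_matrix`, `dft`, and the test signal are hypothetical.

```python
import cmath
import math


def dft_matrix(n):
    """Return the n x n DFT matrix W with W[k][t] = exp(-2*pi*i*k*t/n)."""
    return [
        [cmath.exp(-2j * math.pi * k * t / n) for t in range(n)]
        for k in range(n)
    ]


def dft(signal):
    """Naive O(n^2) DFT: multiply the signal by the DFT matrix."""
    n = len(signal)
    w = dft_matrix(n)
    return [sum(w[k][t] * signal[t] for t in range(n)) for k in range(n)]


# Hypothetical test signal: a cosine completing 3 cycles over 16 samples.
N = 16
signal = [math.cos(2 * math.pi * 3 * t / N) for t in range(N)]

# Magnitude spectrum; only bins 0..N/2 are unique for a real-valued signal.
magnitudes = [abs(x) for x in dft(signal)]
peak_bin = max(range(N // 2 + 1), key=lambda k: magnitudes[k])
print(peak_bin)  # → 3
```

In a learnable variant, one would presumably store the real and imaginary parts of this matrix as trainable parameters and fit them by gradient descent; that step is an assumption about the approach and is not shown here.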