DT2119: Speech and Speaker Recognition

Lecturer: Prof. Giampiero Salvi

I took this course during my exchange at KTH in Sweden.

The course notes are here (they are in Chinese, because I think there are already plenty of English resources on the internet 😃)

Content

  • Assignment1: Feature extraction and comparison
    • Step-by-step MFCC computation, with each stage examined in detail through experiments
    • Dynamic Time Warping (DTW) algorithm to compare utterances
  • Assignment2: Hidden Markov Model
    • Forward/Backward algorithm to evaluate the likelihood of an observation sequence
    • Viterbi algorithm to find the optimal state sequence
    • Baum-Welch algorithm to train the phoneme models
  • Assignment3: Phoneme Recognition with DNN
    • Use Assignment1 to extract audio features
    • Use Assignment2 to force-align phonemes and use the alignment as labels
    • Train a phoneme recognizer implemented as a Deep Neural Network
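
For a rough idea of two of the algorithms listed above, here is a minimal pure-Python sketch of DTW (Assignment1) and log-domain Viterbi decoding (Assignment2). It is not the code from the notebooks: the function names `dtw_distance` and `viterbi`, the Euclidean local distance, and the input layout (`log_B[t][s]` is the log-likelihood of frame `t` under state `s`) are all assumptions made for illustration.

```python
import math


def dtw_distance(x, y):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumes a Euclidean local distance and the basic step pattern
    (insertion, deletion, match); the assignment may use other choices.
    """
    n, m = len(x), len(y)
    # acc[i][j] = minimal accumulated cost of aligning x[:i] with y[:j]
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.sqrt(sum((p - q) ** 2 for p, q in zip(x[i - 1], y[j - 1])))
            acc[i][j] = local + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


def viterbi(log_pi, log_A, log_B):
    """Most likely state sequence of an HMM, computed in the log domain.

    log_pi[s]   : log prior of starting in state s
    log_A[r][s] : log transition probability from state r to state s
    log_B[t][s] : log-likelihood of observation t under state s
    Returns (best log-probability, best path as a list of state indices).
    """
    num_states = len(log_pi)
    delta = [log_pi[s] + log_B[0][s] for s in range(num_states)]
    backpointers = []
    for t in range(1, len(log_B)):
        prev = delta
        # For each state s, remember the best predecessor state.
        ptr = [max(range(num_states), key=lambda r: prev[r] + log_A[r][s])
               for s in range(num_states)]
        delta = [prev[ptr[s]] + log_A[ptr[s]][s] + log_B[t][s]
                 for s in range(num_states)]
        backpointers.append(ptr)
    # Backtrack from the best final state.
    last = max(range(num_states), key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    path.reverse()
    return delta[last], path
```

Both functions take plain nested lists, so they run without NumPy; a real implementation on MFCC frames would normally be vectorized. Zero probabilities can be passed as `-math.inf`.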