besacier/ASR2022

MIT

ASR2022

https://github.com/besacier/ASR2022/blob/main/ASRcourseMOSIG2022.pdf

This is an introduction (97 slides) to modern ASR systems and speech SSL. The content covers 6h-8h of lecture.

Remarks:

(a) The 'current challenges' part at the end needs to be refreshed

(b) The SSL part does not (yet) include the following models:

- MAESTRO paper (from Interspeech 2022) https://arxiv.org/abs/2204.03409?context=eess
- Speech2C model (by the same authors as SpeechT5) https://www.isca-speech.org/archive/pdfs/interspeech_2022/ao22_interspeech.pdf
- Data2Vec2.0 paper https://ai.facebook.com/research/publications/efficient-self-supervised-learning-with-contextualized-target-representations-for-vision-speech-and-language

(c) The Whisper model is not included in the current slides

Please send your feedback to: laurent.besacier@naverlabs.com
