A. Graves, A. Mohamed, and G. Hinton, “Speech recognition with deep recurrent neural networks,” in Proc. ICASSP, Vancouver, 2013.
J. Chorowski, D. Bahdanau, D. Serdyuk, K. Cho, and Y. Bengio, “Attention-based models for speech recognition,” in Proc. NIPS, 2015.
- Paper
- Official code in PaddlePaddle (Baidu's framework)
- TensorFlow DeepSpeech2
- Simplified tutorial on keras.io
- "SortaGrad": order utterances by length (shortest first) during the first epoch, then return to random minibatch order.
- "BatchNorm": batch normalization
- Using CTC loss
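
The SortaGrad ordering can be sketched in a few lines of plain Python. This is a minimal illustration, not code from any of the implementations listed above: the function name `sortagrad_batches`, its parameters, and the choice to keep minibatches length-bucketed after the first epoch are all assumptions made for the example.

```python
import random


def sortagrad_batches(lengths, batch_size, epoch, seed=0):
    """Return a list of index batches for one epoch (hypothetical helper).

    lengths    -- utterance lengths, e.g. number of audio frames
    batch_size -- utterances per minibatch
    epoch      -- 0-based epoch counter
    """
    # Sort utterance indices from shortest to longest.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    # Cut the sorted order into consecutive minibatches.
    batches = [order[i:i + batch_size] for i in range(0, len(order), batch_size)]
    # Epoch 0 keeps the short-to-long order; later epochs shuffle the
    # minibatches (assumption: batch contents stay grouped by length).
    if epoch > 0:
        random.Random(seed + epoch).shuffle(batches)
    return batches


lengths = [50, 10, 30, 20, 40, 60]
print(sortagrad_batches(lengths, batch_size=2, epoch=0))
# → [[1, 3], [2, 4], [0, 5]]
```

Grouping utterances of similar length also reduces padding inside each minibatch, which is one reason a sketch like this keeps the buckets fixed and only shuffles their order.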