This case study explores end-to-end automatic speech recognition (ASR) using the Deep Speech 2 architecture in PyTorch with the Common Voice dataset.
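Deep Speech 2 is trained with the CTC loss, so the per-frame outputs of the network have to be collapsed into a transcript. The snippet below is a minimal sketch of greedy CTC decoding in plain Python, not code taken from the Chapter 12 notebook. The label set `ALPHABET`, the choice of index 0 as the blank symbol, and the names `ctc_greedy_decode` and `one_hot` are all assumptions made for illustration.

```python
from typing import List, Sequence

# Hypothetical label set: index 0 ("_") is the CTC blank, index 1 is a space.
ALPHABET = "_ abcdefghijklmnopqrstuvwxyz'"
BLANK = 0


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      alphabet: str = ALPHABET,
                      blank: int = BLANK) -> str:
    """Pick the best label per frame, merge repeats, then drop blanks."""
    # Index of the highest-scoring label in each frame.
    best = [max(range(len(frame)), key=frame.__getitem__)
            for frame in frame_scores]

    chars: List[str] = []
    prev = None
    for idx in best:
        # Emit a character only when the label changes and is not the blank.
        if idx != prev and idx != blank:
            chars.append(alphabet[idx])
        prev = idx
    return "".join(chars)


def one_hot(index: int, size: int = len(ALPHABET)) -> List[float]:
    """Build a toy score vector that puts all mass on one label."""
    return [1.0 if i == index else 0.0 for i in range(size)]


if __name__ == "__main__":
    # Toy frame-level path "cc_aa_t": repeats merge and blanks disappear.
    frames = [one_hot(ALPHABET.index(ch)) for ch in "cc_aa_t"]
    print(ctc_greedy_decode(frames))  # cat
```

A blank between two identical labels keeps both of them, so the toy path `l_l` decodes to `ll`, while `ll` alone collapses to a single `l`. A real system would usually replace this greedy step with beam search, often combined with a language model.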
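To start the container (the --runtime=nvidia flag requires the NVIDIA container runtime to be installed on the host):
docker run -it --runtime=nvidia springernlp/chapter_12ds:latest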
The container will start a Jupyter notebook. Follow the commands in the Chapter 12 notebook.
More information can be found in the book Deep Learning for NLP and Speech Recognition, published by Springer.