DeepSpeech

An end-to-end model for automatic speech recognition (ASR), trained on a small VoxForge dataset. It uses a CTC loss function and a single-layer bidirectional LSTM (B-LSTM) network. Training accuracy is around 87%; improving validation accuracy would require a much deeper network and much more data.
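
To make the CTC objective concrete, here is a minimal pure-Python sketch of the CTC forward algorithm. It is not taken from this repository. The function name `ctc_loss`, the argument layout (a list of per-frame probability rows), and the choice of index 0 for the blank symbol are all assumptions made for illustration. It works in plain probability space for readability, whereas practical implementations typically work in log space for numerical stability.

```python
import math


def ctc_loss(probs, target, blank=0):
    """Negative log-likelihood of `target` under CTC (hypothetical helper).

    probs  : list of T rows; probs[t][k] is the probability of symbol k at frame t
             (T >= 1 is assumed).
    target : list of label indices, without blanks.
    blank  : index of the blank symbol (assumed to be 0 here).
    """
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    size = len(ext)

    # alpha[s] = total probability of all alignments that end at ext[s] at frame t
    alpha = [0.0] * size
    alpha[0] = probs[0][blank]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new_alpha = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * probs[t][ext[s]]
        alpha = new_alpha

    # Valid alignments end on the last label or on the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return -math.log(likelihood) if likelihood > 0 else math.inf


# Two frames, two symbols (blank=0, "a"=1), uniform probabilities.
uniform = [[0.5, 0.5], [0.5, 0.5]]
print(ctc_loss(uniform, [1]))
```

With two uniform frames and the single-label target `[1]`, three alignments are valid ("a a", "a blank", "blank a"), so the likelihood is 0.75 and the loss is `-log(0.75)`. In a network like the one described above, `probs` would be the per-frame softmax output of the B-LSTM.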

Primary language: Python
License: GNU General Public License v3.0 (GPL-3.0)
