Tensorflow Speech Recognition

Speech recognition using Google's TensorFlow deep learning framework and sequence-to-sequence neural networks.

Replaces caffe-speech-recognition; see that project for some background.

Ultimate goal

Create a decent standalone speech recognizer for Linux etc. Some people say we have the models but not enough training data. We disagree: there is plenty of training data (100GB here and 21GB here on openslr.org, synthetic text-to-speech snippets, movies with transcripts, Gutenberg, YouTube with captions, etc.); we just need a simple yet powerful model. It's only a question of time...

Sample spectrogram, That's what she said, too laid?

Sample spectrogram, Karen uttering 'zero' with 160 words per minute.
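
For readers new to spectrograms like the samples above, here is a minimal sketch of how one can be computed from a raw waveform with a short-time Fourier transform. It uses only NumPy. The function name `spectrogram`, the frame and hop sizes, and the synthetic 440 Hz test tone are assumptions chosen for illustration; this is not necessarily the preprocessing used by the scripts in this repository.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram, shape (num_frames, frame_len // 2 + 1).

    frame_len and hop are illustrative defaults, not values from this project.
    """
    window = np.hanning(frame_len)
    num_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and taper each with a Hann window.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(num_frames)
    ])
    # The real FFT of each frame gives the magnitude per frequency bin.
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    # A small constant avoids log(0) on silent bins.
    return np.log(magnitudes + 1e-8)

# Hypothetical input: one second of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)

spec = spectrogram(tone)
# The strongest frequency bin, averaged over time, should sit near 440 Hz.
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 256
```

Each row of `spec` is one time step and each column one frequency bin, which is the kind of 2D input a classifier or sequence model can consume in place of raw audio.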

Getting started

Toy examples: ./number_classifier_tflearn.py ./speaker_classifier_tflearn.py

Some less trivial architectures: ./densenet_layer.py

Later: ./train.sh ./record.py

Sample spectrogram or record.py

Partners + collaborators wanted

We are in the process of tackling this project in earnest. If you want to join the party, just drop us an email at info@pannous.com.

Update: Nervana demonstrated that it is possible for 'independents' to build speech recognizers that are state of the art.

Update: Mozilla is working on DeepSpeech and just achieved 0% error rate ... on the training set ;) Free Speech is in good hands.

###Fun tasks for newcomers

###Extensions

Extensions to current TensorFlow which are probably needed:

Even though this project is far from finished, we hope it gives you some starting points.

Looking for a TensorFlow collaboration, consultant, or deep learning contractor? Reach out to info@pannous.com