Single-word speech recognition (Chinese or English) using a CNN on spectrograms. The model is pre-trained on 30 words and reaches 97% accuracy on the test set.
- Pre-processing
  - Recorder
  - End detection
    - Short-time zero-crossing rate
    - Short-time energy
  - (Visualization)
- Speech Spectrum
  - Resizing
  - Mel-frequency cepstral coefficient
- CNN
  - TensorFlow CNN
- UI
  - Interactive recording and predicting command line tool
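
The end-detection step listed above relies on short-time energy and short-time zero-crossing rate. The sketch below illustrates the general idea in plain Python; it is not the repo's MATLAB implementation. The function names, frame sizes, and thresholds are all assumptions chosen for illustration, and real recordings would need thresholds tuned to the microphone and noise level.

```python
def split_frames(signal, frame_len, hop):
    """Split a list of samples into overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Sum of squared sample values in one frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_endpoints(signal, frame_len=256, hop=128,
                     energy_thresh=1.0, energy_floor=0.1, zcr_thresh=0.3):
    """Return (start_sample, end_sample) of the active region, or None.

    A frame counts as active if its energy is above energy_thresh, or if
    its energy is above the lower energy_floor and its zero-crossing rate
    is high (a rough cue for unvoiced consonants).
    All threshold values here are hypothetical defaults.
    """
    active = []
    for idx, frame in enumerate(split_frames(signal, frame_len, hop)):
        energy = short_time_energy(frame)
        zcr = zero_crossing_rate(frame)
        if energy > energy_thresh or (energy > energy_floor and zcr > zcr_thresh):
            active.append(idx)
    if not active:
        return None
    return active[0] * hop, active[-1] * hop + frame_len
```

Because frames overlap, the returned region is rounded outward to frame boundaries, so it may include a little silence on either side of the word.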
Take anything you want! Comments and suggestions are welcome.
The main program is working.py, which recognizes words from tmp/working/working.txt (generated by the MATLAB script described below). The script waits for the txt file to change, based on its modification time. A pre-trained model is required; one will be available in the repo. To generate a pre-processed file like working.txt, see the MATLAB script below.
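
Waiting on a file's modification time can be done with simple polling. The following is a minimal sketch of that pattern, not the actual code in working.py; the function name wait_for_change and its parameters are assumptions made for illustration.

```python
import os
import time


def wait_for_change(path, last_mtime=None, poll_interval=0.5, timeout=None):
    """Block until the modification time of `path` differs from last_mtime.

    Returns the new modification time, or None if `timeout` seconds pass
    without a change. A missing file is treated as "not changed yet".
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        try:
            mtime = os.path.getmtime(path)
        except OSError:
            mtime = None  # file has not been created yet
        if mtime is not None and mtime != last_mtime:
            return mtime
        if deadline is not None and time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)
```

A recognition loop could call wait_for_change("tmp/working/working.txt", last_mtime) repeatedly, running the prediction each time a new modification time comes back and passing that value in as last_mtime on the next call.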
In pre_process/, run working.m in MATLAB. This lets you interactively record audio to a file that TensorFlow in Python can read. You may also place your own audio file, once the Python script described above is already running.