speech_recognition_translation
1. Introduction
This project has two functional modules. The first performs speech recognition using a DNN based on Google's WaveNet. The second performs machine translation using a seq2seq DNN with attention.
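As a rough illustration of what "attention" means in a seq2seq model, the sketch below computes attention weights and a context vector for a single decoder step. It is written in plain Python so that it runs without any of the dependencies listed below. The dot-product scoring and the names `softmax` and `attend` are assumptions made for this example; they are not taken from this project's code, which may use a different scoring function.

```python
import math


def softmax(scores):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Return (weights, context) for one decoder step.

    Scores are plain dot products between the decoder state and each
    encoder state (an assumption for this sketch).
    """
    scores = [
        sum(d * e for d, e in zip(decoder_state, enc))
        for enc in encoder_states
    ]
    weights = softmax(scores)
    dim = len(decoder_state)
    # The context vector is the attention-weighted sum of encoder states.
    context = [
        sum(w * enc[i] for w, enc in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy example: three encoder states of dimension 2.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [1.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
print(weights)
print(context)
```

The weights sum to 1, and the encoder states that align best with the decoder state receive the largest weights. In a trained model the states would come from the encoder and decoder RNNs rather than being hard-coded.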
2. Environment
- Python 2.7
- tensorflow 1.0.0
- tflearn 0.3.2
- numpy 1.13.3
- six 1.11.0
3. Usage
Download the VCTK corpus and extract it to ./data/asr/. This corpus is used to train the speech recognition model. Because it is an English corpus, the trained model recognises English speech only. To train the speech recognition model:
python train_asr.py
To recognise speech:
python recognise.py
The training data for the machine translation model is already in ./data/nmt/. It consists of subtitles from TED talks. Because it is an English-French corpus, the model can only translate from English to French. To train the neural translation model:
python train_nmt.py
To translate text:
python translate.py
4. TODO
- fine-tune the models
- add BLEU scoring to evaluate translation quality
- add beam search to NMT
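
For the beam search item, the following is a minimal sketch of the general technique, not an implementation tied to this project. It assumes a hypothetical `step_fn(prefix)` that returns a dict mapping each possible next token to its probability; in practice that role would be played by the NMT decoder. The toy probability table and the token names are invented for illustration only.

```python
import math


def beam_search(step_fn, start_token, end_token, beam_width=3, max_len=10):
    """Return the highest-scoring token sequence found by beam search.

    step_fn(prefix) -> dict of {next_token: probability} is a
    hypothetical interface assumed for this sketch.
    """
    beams = [(0.0, [start_token])]  # (log-probability, tokens)
    finished = []
    for _ in range(max_len):
        candidates = []
        for logp, tokens in beams:
            for token, prob in step_fn(tokens).items():
                if prob <= 0.0:
                    continue  # skip impossible continuations
                candidates.append((logp + math.log(prob), tokens + [token]))
        if not candidates:
            break
        # Keep only the beam_width best candidates at this step.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for cand in candidates[:beam_width]:
            if cand[1][-1] == end_token:
                finished.append(cand)
            else:
                beams.append(cand)
        if not beams:
            break
    # Prefer finished hypotheses; fall back to unfinished ones otherwise.
    pool = finished or beams
    return max(pool, key=lambda c: c[0])[1]


def toy_step(prefix):
    # Invented next-token probabilities, keyed on the last token only.
    table = {
        "<s>": {"le": 0.6, "un": 0.4},
        "le": {"chat": 0.55, "chien": 0.45},
        "un": {"chat": 0.9, "chien": 0.1},
        "chat": {"</s>": 1.0},
        "chien": {"</s>": 1.0},
    }
    return table[prefix[-1]]


greedy = beam_search(toy_step, "<s>", "</s>", beam_width=1)
wider = beam_search(toy_step, "<s>", "</s>", beam_width=2)
print(greedy)
print(wider)
```

With `beam_width=1` the search is greedy and commits to "le" because it is the most probable first token. With `beam_width=2` the search also keeps "un" alive and finds the sequence with the higher overall probability. This sketch does no length normalisation, which a real decoder would usually want so that short outputs are not favoured.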