Speech recognition based on a deep neural network/hidden Markov model (DNN/HMM)
This project uses the same data as ASR-SG-GMM-HMM.
Data preparation:
- Prepare the HMM trained in the ASR-SG-GMM-HMM project;
- Run the GMM/HMM-based Viterbi algorithm (implemented in project 1) on the whole training data;
- Prepare unique HMM state IDs;
- Use these unique HMM state IDs to convert all state sequences obtained in step 2;
- Perform context expansion (3 left and 3 right context frames) on all feature vector sequences of the training data;
- Make one big label vector and one big feature matrix by concatenating the per-utterance state sequences and feature matrices over all utterances;
- Compute the HMM state prior distribution;
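The last four steps above can be sketched with NumPy as follows. This is a minimal illustration, not the code in submission.py: the function names, the `(model, local_state)` key used for the ID map, and the choice to pad utterance edges by repeating the first and last frame are all assumptions.

```python
import numpy as np

def make_state_id_map(states_per_model):
    """Assign one global ID to every (model, local_state) pair.

    states_per_model: dict mapping a model name to its number of HMM states
    (hypothetical input format).
    """
    id_map = {}
    for model in sorted(states_per_model):
        for local_state in range(states_per_model[model]):
            id_map[(model, local_state)] = len(id_map)
    return id_map

def splice(feats, left=3, right=3):
    """Stack each frame with its left/right neighbours.

    feats: (T, D) array for one utterance.
    Returns a (T, (left + 1 + right) * D) array.
    Edge frames are padded by repeating the first/last frame (an assumption).
    """
    num_frames = feats.shape[0]
    padded = np.concatenate(
        [np.repeat(feats[:1], left, axis=0),
         feats,
         np.repeat(feats[-1:], right, axis=0)],
        axis=0,
    )
    # Column block k holds the frame at offset (k - left) from the centre frame.
    blocks = [padded[k:k + num_frames] for k in range(left + 1 + right)]
    return np.concatenate(blocks, axis=1)

def build_training_set(utt_feats, utt_state_ids):
    """Concatenate all utterances into one feature matrix and one label vector."""
    features = np.concatenate([splice(f) for f in utt_feats], axis=0)
    labels = np.concatenate(utt_state_ids, axis=0)
    return features, labels

def state_priors(labels, num_states):
    """Relative frequency of each global state ID in the aligned training data."""
    counts = np.bincount(labels, minlength=num_states).astype(np.float64)
    return counts / counts.sum()
```

In hybrid DNN/HMM systems the state prior is commonly used to divide the DNN state posteriors, which yields scaled likelihoods that can replace the GMM likelihoods during decoding; that is presumably why the prior is computed here.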
DNN training:
- Set the DNN topologies;
- Perform the DNN training;
- Stop the training when the validation score starts to degrade;
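The training steps above can be sketched in a framework-independent way. This is only an illustration under stated assumptions: `train_one_epoch` and `validation_score` are hypothetical callables supplied by the caller, the validation score is assumed to be "higher is better" (for example frame accuracy), and the hidden layer sizes are placeholders rather than the topology used in this project.

```python
def mlp_layer_sizes(feat_dim, num_states, hidden=(512, 512), context=3):
    """Layer sizes for an MLP on spliced features.

    The input width follows from the 3-left/3-right context expansion,
    the output width from the number of unique HMM state IDs.
    The hidden sizes are placeholder values (an assumption).
    """
    return [(2 * context + 1) * feat_dim, *hidden, num_states]

def train_with_early_stopping(train_one_epoch, validation_score, max_epochs=50):
    """Train until the validation score drops below the best score seen so far.

    train_one_epoch: hypothetical callable running one pass over the training data.
    validation_score: hypothetical callable returning the current validation score.
    Returns (best_epoch, best_score).
    """
    best_epoch, best_score = -1, float("-inf")
    for epoch in range(max_epochs):
        train_one_epoch()
        score = validation_score()
        if score < best_score:
            # The score has started to degrade: stop here.
            break
        # A real implementation would also save the model weights at this point.
        best_epoch, best_score = epoch, score
    return best_epoch, best_score
```

Stopping at the first drop follows the rule stated above literally. A common variation is to tolerate a few non-improving epochs before stopping, since validation scores can fluctuate.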
Predict the most likely digit for each utterance by selecting the digit with the largest likelihood;
Compute the accuracy (# of correct digits / # of test utterances * 100) using the whole training data.
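The prediction rule and the accuracy formula can be written as a short sketch. The function names and the dict-based input format (digit label mapped to the utterance log-likelihood under that digit's model) are assumptions made for illustration.

```python
def predict_digit(log_likelihoods):
    """Return the digit whose model gives the utterance the largest log-likelihood.

    log_likelihoods: dict mapping a digit label to a log-likelihood
    (hypothetical input format).
    """
    return max(log_likelihoods, key=log_likelihoods.get)

def accuracy(predictions, references):
    """Accuracy in percent: # of correct digits / # of test utterances * 100."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)
```

For example, `predict_digit({"1": -120.5, "2": -98.2, "3": -110.0})` selects `"2"`, and three correct predictions out of four test utterances give an accuracy of 75.0.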
Command:
python submission.py --mode mlp train_1digit.feat test_1digit.feat