Goodness of Pronunciation (GoP)

This code accompanies the INTERSPEECH 2019 paper "An improved goodness of pronunciation (GoP) measure for pronunciation evaluation with DNN-HMM system considering HMM transition probabilities".

Requirements :

  • Python (tested with v.2.7.5 & v.3.5.7).
  • Kaldi ASR toolkit (documentation : http://kaldi-asr.org/), with acoustic models trained on LibriSpeech using nnet2 (Dan's recipe). The code was tested with nnet2 & nnet3 models.

How to run the code :

Run prop_gop_eqn.py as shown below to compute the score using the proposed GoP formulation. It takes the posterior_infile.ark and alignment_infile.txt files generated for a given learner's utterance, followed by the name of the output file.

python prop_gop_eqn.py posterior_infile.ark alignment_infile.txt gop_outfile.txt
  • The alignment_infile.txt file is the output of the forced alignment of the learner's speech (.wav file), obtained using align.sh.
  • The posterior_infile.ark file contains the frame-level posterior probabilities of the learner's speech (.wav file), obtained using nnet_am_compute.cc.
  • The gop_outfile.txt file contains the score for each phoneme.
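
The invocation above can also be driven from Python, for example when scoring many utterances in a loop. The following is a minimal sketch, not part of the repository: the helper names build_gop_command and run_gop are hypothetical, and the only thing taken from this README is the argument order (posterior file, alignment file, output file). It avoids f-strings and pathlib so that it runs on both Python versions listed under Requirements.

```python
import os
import subprocess
import sys


def build_gop_command(posterior_ark, alignment_txt, gop_out,
                      script="prop_gop_eqn.py"):
    """Return the argument list in the order documented in this README."""
    # Order: posterior file, alignment file, output file.
    return [sys.executable, script, posterior_ark, alignment_txt, gop_out]


def run_gop(posterior_ark, alignment_txt, gop_out, script="prop_gop_eqn.py"):
    """Check that both input files exist, then run the scoring script."""
    for path in (posterior_ark, alignment_txt):
        if not os.path.isfile(path):
            raise IOError("missing input file: " + path)
    # check_call raises CalledProcessError if the script exits non-zero.
    subprocess.check_call(
        build_gop_command(posterior_ark, alignment_txt, gop_out, script))
```

Calling run_gop("posterior_infile.ark", "alignment_infile.txt", "gop_outfile.txt") from the repository folder would then be equivalent to the command line shown above, with an early error if an input file is missing.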

NOTE :

  • The above Python script requires a lookup table for the acoustic model, as discussed in the paper, in order to generate the scores. The table can be generated using the following code :
./gen_lookup_table.sh
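
Since the lookup table has to exist before scoring, the generation step can be guarded so that it only runs when needed. The sketch below is an illustration under assumptions: ensure_lookup_table is a hypothetical helper, and it assumes that gen_lookup_table.sh is run without arguments from the repository folder and writes reqd_files/lookup_table.txt (the location shown in the file structure in this README). Adjust the path if your setup differs.

```python
import os
import subprocess


def ensure_lookup_table(repo_dir,
                        table_rel=os.path.join("reqd_files", "lookup_table.txt"),
                        runner=subprocess.check_call):
    """Run gen_lookup_table.sh only if the lookup table is not there yet.

    Returns True if the script was run, False if the table already existed.
    The table location is an assumption based on the README file structure.
    """
    table_path = os.path.join(repo_dir, table_rel)
    if os.path.isfile(table_path):
        return False
    # Run the script from inside the repository folder, as in the README.
    runner(["./gen_lookup_table.sh"], cwd=repo_dir)
    return True
```

The runner parameter exists only so the function can be tried out without Kaldi installed, by passing a stand-in that records the call instead of executing the script.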

Placement of the downloaded folder :

  • Once the Goodness-of-Pronunciation-master.zip file is downloaded, place it in /home/user/kaldi/egs/Native_Acoustic_Model/s5/ and unzip it there (Extract Here). This creates the path /home/user/kaldi/egs/Native_Acoustic_Model/s5/Goodness-of-Pronunciation-master/. The native acoustic model needs to be trained with nnet2, with all paths in the exp folder functional.
  • Once the path is created it will have the following file structure :
├── kaldi_folder
│   ├── native_acoustic_model
│   │   ├── s5
│   │   │   ├── Goodness-of-Pronunciation-master
│   │   │   │   ├── extract_from_alignments.sh
│   │   │   │   ├── gen_lookup_table.sh
│   │   │   │   ├── modify_post.sh
│   │   │   │   ├── gop_outfile.txt
│   │   │   │   ├── prop_gop_eqn.py
│   │   │   │   ├── reqd_files
│   │   │   │   │   ├── alignment_infile.txt
│   │   │   │   │   ├── posterior.txt
│   │   │   │   │   ├── posterior_infile.ark
│   │   │   │   │   ├── show_transitions.txt
│   │   │   │   │   ├── lookup_table.txt
│   │   │   │   │   ├── tmp_t_ids.txt
│   │   │   │   │   ├── tmp_phones.txt
│   │   │   │   │   ├── tmp_segments.txt
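
A quick way to confirm that the folder was placed and unpacked correctly is to check for the files listed in the structure above. This is a minimal sketch with a hypothetical helper, missing_files. The file names are copied from the listing, and which of them are inputs versus files produced while running is not stated here, so trim EXPECTED_FILES to what your setup actually needs.

```python
import os

# Names copied from the file structure above, relative to the
# Goodness-of-Pronunciation-master folder. The selection is illustrative.
EXPECTED_FILES = [
    "extract_from_alignments.sh",
    "gen_lookup_table.sh",
    "modify_post.sh",
    "prop_gop_eqn.py",
    os.path.join("reqd_files", "alignment_infile.txt"),
    os.path.join("reqd_files", "posterior_infile.ark"),
    os.path.join("reqd_files", "lookup_table.txt"),
]


def missing_files(repo_dir, expected=EXPECTED_FILES):
    """Return the expected relative paths that are not present in repo_dir."""
    return [rel for rel in expected
            if not os.path.isfile(os.path.join(repo_dir, rel))]
```

For example, missing_files("/home/user/kaldi/egs/Native_Acoustic_Model/s5/Goodness-of-Pronunciation-master") returns an empty list when every listed file is in place.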

Citing:

If you find our work useful, please cite:

@inproceedings{Sudhakara2019,
  author={Sweekar Sudhakara and Manoj Kumar Ramanathi and Chiranjeevi Yarra and Prasanta Kumar Ghosh},
  title={{An Improved Goodness of Pronunciation (GoP) Measure for Pronunciation Evaluation with DNN-HMM System Considering HMM Transition Probabilities}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={954--958},
  doi={10.21437/Interspeech.2019-2363},
  url={http://dx.doi.org/10.21437/Interspeech.2019-2363}
}