std-mandarin-kaldi

Script for training a non-chain TDNN model for Standard Mandarin GOP (Goodness of Pronunciation) scoring. The training data is a filtered subset of AISHELL2 and MAGICDATA that contains (relatively) standard Mandarin pronunciation.

  1. Create symlinks to the AISHELL2 audio at aishell2/wav and to the MAGICDATA audio at magicdata/wav.
  2. Run generate_data.sh to generate the (relatively) standard Mandarin subset of AISHELL2 + MAGICDATA.
  3. Enter scripts/ and run run.sh.
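
The three steps above can be sketched as a single shell script run from the repository root. This is a minimal sketch under stated assumptions, not the author's own tooling: the variables `AISHELL2_WAV` and `MAGICDATA_WAV` and the helper `link_corpus` are hypothetical names introduced here, the `/path/to/...` defaults are placeholders for wherever the corpora live on your machine, and it assumes both scripts can be invoked with `bash` and no arguments.

```shell
#!/usr/bin/env bash
# Sketch of the setup steps; run from the repository root.
# AISHELL2_WAV and MAGICDATA_WAV are hypothetical variables (not defined by
# the repository). Set them to the audio directories of your corpus copies.
set -eu

AISHELL2_WAV="${AISHELL2_WAV:-/path/to/aishell2/wav}"
MAGICDATA_WAV="${MAGICDATA_WAV:-/path/to/magicdata/wav}"

# Hypothetical helper: create a symlink at $2 pointing to $1,
# replacing an existing symlink at that location if there is one.
link_corpus() {
  target="$1"
  link="$2"
  mkdir -p "$(dirname "$link")"
  ln -sfn "$target" "$link"
}

# Step 1: symlink the corpora into the locations the README names.
link_corpus "$AISHELL2_WAV" aishell2/wav
link_corpus "$MAGICDATA_WAV" magicdata/wav

# Step 2: build the filtered subset (assumes the script takes no arguments).
if [ -f generate_data.sh ]; then
  bash generate_data.sh
else
  echo "generate_data.sh not found; run this from the repository root" >&2
fi

# Step 3: run training from inside scripts/ (subshell keeps the cwd unchanged).
if [ -f scripts/run.sh ]; then
  (cd scripts && bash run.sh)
else
  echo "scripts/run.sh not found; skipping training" >&2
fi
```

For example, saving this under a hypothetical name such as `setup.sh` and running `AISHELL2_WAV=/data/aishell2/wav MAGICDATA_WAV=/data/magicdata/wav bash setup.sh` would link both corpora before invoking the two scripts. Symlinks are used so the large audio directories are not copied into the repository.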