data augmentation needs musan (http://www.openslr.org/17/).
set $CORPUS_DIR in path.sh and put musan under $CORPUS_DIR (see local/nnet3/run_aug.sh).

reverberation needs RIRS_NOISES (https://www.openslr.org/28/).
put RIRS_NOISES in this directory (see local/nnet3/run_aug.sh).

python package requirements:
    ckiptagger
    regex
    zhon
    cn2an
    opencc

to prepare PTS_TW-extra data, run:

mandarin:
    local/prepare_pts_data.sh --train-dir PTS_TW-extra --data-dir data/<dataset_name> --add-parent-prefix true \
        --lexicon-path language/mandarin_phn_lexiconp.txt --txtdir mandarin_text

taibun:
    local/prepare_pts_data.sh --train-dir PTS_TW-extra --data-dir data/<dataset_name> --add-parent-prefix true \
        --lexicon-path language/hanlo_tailo_phn_lexiconp.txt --txtdir taibun_text

note: in the previous PTS_TW-extra release, the mandarin and taibun text were in the same file. the PTS_TW-extra-textonly release provided this time only splits that text into taibun_text and mandarin_text, which are passed as the --txtdir argument.
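
before running local/nnet3/run_aug.sh, the corpus layout described above can be checked with a short shell sketch. the function name check_aug_corpora is hypothetical and not part of this recipe. it also assumes that musan sits directly at $CORPUS_DIR/musan and that RIRS_NOISES sits in the recipe directory; adjust the paths if run_aug.sh expects something else.

```shell
#!/usr/bin/env bash
# Sketch only: check_aug_corpora is a hypothetical helper, not part of the recipe.
# Assumed layout: <corpus_dir>/musan and <recipe_dir>/RIRS_NOISES.

check_aug_corpora() {
  local corpus_dir="$1"   # value of $CORPUS_DIR set in path.sh
  local recipe_dir="$2"   # directory where RIRS_NOISES was placed
  local status=0

  # musan is needed for data augmentation
  if [ ! -d "$corpus_dir/musan" ]; then
    echo "missing: $corpus_dir/musan (see http://www.openslr.org/17/)" >&2
    status=1
  fi

  # RIRS_NOISES is needed for reverberation
  if [ ! -d "$recipe_dir/RIRS_NOISES" ]; then
    echo "missing: $recipe_dir/RIRS_NOISES (see https://www.openslr.org/28/)" >&2
    status=1
  fi

  return "$status"
}
```

usage, from the recipe directory: source path.sh so that $CORPUS_DIR is set, then run check_aug_corpora "$CORPUS_DIR" . and continue only if it returns 0.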