MFA-mandarin-pinyin-dict-for-pretrained-mfa-model-v2.0

A dictionary for Montreal-Forced-Aligner (MFA) users who want to align Mandarin data labeled in pinyin using the MFA pretrained acoustic model v2.0.

License: GNU General Public License v3.0 (GPL-3.0)

Related project: Montreal-Forced-Aligner

Use case: Mandarin alignment with Montreal-Forced-Aligner where the input is labeled in pinyin and the MFA pretrained acoustic model v2.0 is used.

Notes: Following the phone set of the pretrained model, the reduced (neutral) tone is split into three types, depending on the tone of the preceding syllable:

tone 6: reduced tone after tone 1 or tone 2;

tone 7: reduced tone after tone 3;

tone 8: reduced tone after tone 4.

Tone 5 can still be used for any reduced tone. In the dictionary, a pinyin label with tone 5 is equally likely to map to each of the three types.

Examples of tones 6-8:

太 高 了 : tai4 gao1 le6

我 的 : wo3 de7

我 干 的 : wo3 gan4 de8
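If your transcripts only use tone 5, the rule above can be applied automatically before alignment. The following is a minimal sketch, not part of this repository: the function name `specify_reduced_tones` is hypothetical, and it assumes space-separated lowercase pinyin tokens that each end in a tone digit. How to label a reduced tone that has no preceding full tone (at the start of a line, or after another reduced tone) is not covered by the notes, so the sketch leaves such tokens as tone 5.

```python
import re

# Tone of the preceding syllable -> reduced-tone label, as described in the notes:
# after tone 1 or 2 -> 6, after tone 3 -> 7, after tone 4 -> 8.
REDUCED_TONE_AFTER = {"1": "6", "2": "6", "3": "7", "4": "8"}

# Assumption: a token is lowercase letters followed by one tone digit, e.g. "gao1".
SYLLABLE = re.compile(r"^([a-zü]+)([1-8])$")


def specify_reduced_tones(line: str) -> str:
    """Replace tone 5 with 6, 7, or 8 based on the preceding syllable's tone."""
    out = []
    prev_tone = None
    for token in line.split():
        match = SYLLABLE.match(token)
        if match is None:
            # Not a pinyin+tone token: keep it unchanged and reset the context.
            out.append(token)
            prev_tone = None
            continue
        base, tone = match.groups()
        if tone == "5" and prev_tone in REDUCED_TONE_AFTER:
            tone = REDUCED_TONE_AFTER[prev_tone]
        # Otherwise the token is kept as written (assumption: tone 5 stays 5
        # when there is no preceding full tone to decide the type).
        out.append(base + tone)
        prev_tone = tone
    return " ".join(out)


print(specify_reduced_tones("tai4 gao1 le5"))
print(specify_reduced_tones("wo3 gan4 de5"))
```

Labels that already use tones 6-8 pass through unchanged, so the function can be run on mixed input.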

Example usage for alignment:

Run in a terminal:

mfa align path/to/input/corpus/folder path/to/this/dictionary/mandarin_pinyin_to_mfa_lty.dict mandarin_mfa path/to/output/folder
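Before running the command, it can help to check that every pinyin label in your transcripts has an entry in the dictionary. The sketch below is an illustration only: the function names are hypothetical, and it assumes that each dictionary line starts with the word (the pinyin label) followed by whitespace, and that transcripts are plain text with space-separated labels.

```python
def dictionary_words(dict_lines):
    """Collect the first whitespace-separated field of each dictionary line."""
    words = set()
    for line in dict_lines:
        fields = line.split()
        if fields:  # skip empty lines
            words.add(fields[0])
    return words


def find_missing_labels(dict_lines, transcript_lines):
    """Return the sorted transcript tokens that have no dictionary entry."""
    known = dictionary_words(dict_lines)
    missing = set()
    for line in transcript_lines:
        for token in line.split():
            if token not in known:
                missing.add(token)
    return sorted(missing)
```

Both arguments accept any iterable of strings, so open file objects (for example `open(path, encoding="utf-8")`) can be passed directly. An empty result means every label was found under these assumptions.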

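In short, MFA splits the reduced tone into three classes according to the tone of the preceding syllable. If you do not want to label at that level of detail, you can simply use 5, which means the syllable is equally likely to belong to each of the three classes.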