NTT123/vietTTS

[Textgrid for dataset]

frank269 opened this issue · 10 comments

I am creating TextGrid files for my dataset. Could you guide me on how to create them, or point me to some information? Thank you so much!

Hi @frank269, I used Montreal Forced Aligner (MFA) to create the TextGrid files. Visit https://montreal-forced-aligner.readthedocs.io/en/latest/ for more information.

Thanks for your response.

I used MFA 2.0 to align the text, but it threw an error. How can I generate a lexicon for Vietnamese?
This is the error:
montreal_forced_aligner.exceptions.PronunciationAcousticMismatchError: There were phones in the dictionary that do not have acoustic models: a, e, i, u, y, ê, ề

The pretrained acoustic model does not include these phones. In this case, you have to train your own acoustic model. See https://montreal-forced-aligner.readthedocs.io/en/latest/aligning.html#align-using-only-the-data-set for more information.
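To see which phone symbols a lexicon actually uses (and so which ones an acoustic model would need to cover), something like the following rough sketch may help. It assumes the simple lexicon layout of one entry per line, a word followed by its phones separated by whitespace; `phones_in_lexicon` is a hypothetical helper name, not part of MFA or vietTTS.

```python
def phones_in_lexicon(lines):
    """Collect the set of phone symbols used in lexicon lines.

    Assumption: each line is "word phone1 phone2 ...", whitespace separated.
    Lines with fewer than two fields are skipped.
    """
    phones = set()
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # blank line or a word with no pronunciation
        phones.update(parts[1:])  # everything after the word is a phone
    return phones


# Usage (the path is only an example):
# with open("MFA/lexicon.txt", encoding="utf-8") as f:
#     print(sorted(phones_in_lexicon(f)))
```

Comparing this set with the phones listed in the error message shows which symbols the pretrained model is missing.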

I tried training and aligning on just the first 6 files, together with the lexicon file from the InfoRe data, using this command:
./bin/mfa_train_and_align MFA/dataset MFA/lexicon.txt MFA/aligned
But only the first file gets a correct TextGrid; the other files come out wrong. Where did I go wrong?

To train an MFA model, you need a lexicon file and a wav/text data directory.
The wav/text data directory contains all your audio clips and their transcript files. Each A.wav clip requires an A.txt transcript file in the same directory.
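A quick way to check this pairing before running MFA is sketched below. It only compares file names, and `missing_transcripts` is a hypothetical helper name, not part of MFA.

```python
from pathlib import Path


def missing_transcripts(filenames):
    """Return the .wav file names that have no .txt file with the same stem."""
    names = list(filenames)  # allow generators to be passed in
    txt_stems = {Path(n).stem for n in names if Path(n).suffix.lower() == ".txt"}
    return sorted(
        n
        for n in names
        if Path(n).suffix.lower() == ".wav" and Path(n).stem not in txt_stems
    )


# Usage (the directory is only an example):
# names = [p.name for p in Path("MFA/dataset").iterdir()]
# print(missing_transcripts(names))
```

An empty list means every clip in the directory has a transcript next to it.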

Yes, I used the first 6 files and the lexicon file from the InfoRe dataset you provided, and I manually created a transcript file for each of the 6 audio clips. But when I run the training command, the output for the first file is correct and the others are wrong. Here are the results for the first 6 files:
1.zip

It is possible that your dataset is too small, so MFA cannot learn a useful model from that little data.

This is the notebook that I used to align the InfoRe data with a slightly different phoneme set:
https://colab.research.google.com/gist/NTT123/95b12ca42a4bdd1a856aba0fbb0f8936/infore-mfa-tutorial.ipynb

Oh, thank you so much!