NTT123/vietTTS

How to create lexicon.txt file

phanan9225 opened this issue · 2 comments

It's a great repo.
I have tried to train my own model, but I am still stuck on preparing the dataset. Could you explain how to create a lexicon.txt file for my dataset, so that I can then use MFA to create the TextGrid files?
Thank you very much.

vietTTS uses a very simple lexicon file. To create one, you need to:

  • list all words in your dataset in lowercase.
  • for each word, treat each character as one phoneme.

The format is:

word1 [tab] phoneme1 [space] phoneme2 [space] ...
....

See https://github.com/NTT123/vietTTS/blob/master/assets/infore/lexicon.txt for an example.
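
The two steps above can be sketched in Python roughly as follows. This is only an illustration, not code from the vietTTS repo: the function names `build_lexicon` and `write_lexicon`, the letters-only word pattern, and the NFC normalization step are all assumptions. The normalization is there so that an accented Vietnamese letter is stored as a single character and therefore counts as a single phoneme; check the result against the example lexicon linked above before using it with MFA.

```python
import re
import unicodedata

# Assumption: a "word" is a run of letters; digits, punctuation and
# underscores are treated as separators.
WORD_RE = re.compile(r"[^\W\d_]+")


def build_lexicon(transcripts):
    """Return lexicon lines of the form 'word<TAB>p1 p2 ...'.

    `transcripts` is any iterable of transcript strings from the dataset.
    """
    words = set()
    for line in transcripts:
        # Assumption: NFC keeps each accented letter as one code point,
        # so splitting a word into characters gives one phoneme per letter.
        text = unicodedata.normalize("NFC", line).lower()
        words.update(WORD_RE.findall(text))
    # Each character of the word becomes one space-separated phoneme.
    return [f"{word}\t{' '.join(word)}" for word in sorted(words)]


def write_lexicon(transcripts, path="lexicon.txt"):
    """Write the lexicon to `path` (hypothetical default file name)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(build_lexicon(transcripts)) + "\n")
```

For example, `write_lexicon(open("transcripts.txt", encoding="utf-8"))` would build the lexicon from a file with one transcript per line; the file name `transcripts.txt` is just a placeholder for wherever your transcripts are stored.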

Thank you for your response. I will try this.