How to add marker of sil, sp to TextGrid after MFA?
nampdn opened this issue · 6 comments
Hi @NTT123,
First of all, thank you for your brilliant work! I have successfully trained on my dataset with MFA, but the generated .TextGrid files contain no markers for silence (sil) or short pauses (sp).
Could you please help me detect and add these symbols to the TextGrid files?
Hi @nampdn, thank you for reporting this. The newest version of MFA removes these markers.
According to MontrealCorpusTools/Montreal-Forced-Aligner#377, you have to run mfa align or mfa train with an additional argument, --disable_textgrid_cleanup.
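If you drive MFA from a Python script, the call could be assembled roughly as below. This is only a sketch: the function name and the four paths are placeholders I made up, the positional order (corpus, dictionary, acoustic model, output) is the usual mfa align layout, and the flag is the one quoted from the linked issue, so check it against mfa align --help for your installed version.

```python
import shlex


def build_mfa_align_command(corpus_dir, dictionary_path, acoustic_model_path, output_dir):
    """Return the argv list for `mfa align` with TextGrid cleanup disabled.

    Hypothetical helper: the paths are placeholders, and the flag is taken
    from MontrealCorpusTools/Montreal-Forced-Aligner#377.
    """
    return [
        "mfa", "align",
        corpus_dir, dictionary_path, acoustic_model_path, output_dir,
        "--disable_textgrid_cleanup",
    ]


cmd = build_mfa_align_command("corpus", "lexicon.txt", "acoustic_model.zip", "aligned")
# Print the command as a single shell-quoted line for inspection.
print(shlex.join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) once the paths point at real files.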
@nampdn, please check out the fix_sil branch for a quick fix. This branch can read TextGrid files that have no "sil" or "sp" markers.
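For anyone who wants to restore the markers themselves, one possible approach is sketched below. It is not necessarily what the fix_sil branch does. It assumes the phone tier has already been read into a sorted list of (start, end, label) tuples, and that silence shows up either as an empty label or as a time span no interval covers. The function name fill_silence and the sample data are made up for illustration.

```python
def fill_silence(intervals, total_duration, sil="sil", min_gap=1e-6):
    """Label empty intervals and uncovered time gaps as silence.

    intervals: list of (start, end, label) tuples sorted by start time.
    total_duration: length of the audio in seconds.
    """
    filled = []
    cursor = 0.0  # end time of the last interval emitted so far
    for start, end, label in intervals:
        if start - cursor > min_gap:
            # Nothing covers [cursor, start): treat it as silence.
            filled.append((cursor, start, sil))
        # An empty or whitespace-only label is also treated as silence.
        filled.append((start, end, label if label.strip() else sil))
        cursor = end
    if total_duration - cursor > min_gap:
        # Trailing silence after the last interval.
        filled.append((cursor, total_duration, sil))
    return filled


phones = [(0.10, 0.25, "n"), (0.25, 0.40, "a"), (0.40, 0.55, ""), (0.70, 0.90, "m")]
result = fill_silence(phones, total_duration=1.0)
labels = [label for _, _, label in result]
print(labels)
```

Adjacent silence intervals are not merged here; add that step if your training code expects a single "sil" per pause.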
Woot! I'm so grateful. I'll try it now.
Have a happy holiday!
Hi @NTT123,
After pulling the latest fixes for sil, I still have a problem with some utterances that contain numbers.
('n', 'g', 'ư', 'ờ', 'i', ' ', 'đ', 'o', ' ', 'c', 'h', 'i', 'ề', 'u', ' ', 'r', 'ộ', 'n', 'g', ' ', 'c', 'ủ', 'a', ' ', 'l', 'ố', 'i', ' ', 'v', 'à', 'o', ' ', 'c', 'ổ', 'n', 'g', ' ', 'sil', 'l', 'à', ' ', 'n', 'ă', 'm', ' ', 'sil', '3', ' ', 'm', 'é', 't', ' ', 'sil', 'v', 'à', ' ', 'c', 'h', 'i', 'ề', 'u', ' ', 'd', 'à', 'i', ' ', 'l', 'à', ' ', 's', 'á', 'u', ' ', 'sil', '9', ' ', 'm', 'é', 't', ' ', 'sil')
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/content/vietTTS/vietTTS/nat/acoustic_trainer.py", line 181, in <module>
    train()
  File "/content/vietTTS/vietTTS/nat/acoustic_trainer.py", line 100, in train
    batch = next(train_data_iter)
  File "/content/vietTTS/vietTTS/nat/data_loader.py", line 111, in load_textgrid_wav
    ps = [phonemes.index(p) for p in ps]
  File "/content/vietTTS/vietTTS/nat/data_loader.py", line 111, in <listcomp>
    ps = [phonemes.index(p) for p in ps]
ValueError: '3' is not in list
Can you take a look at this sample? Can I add 0-9 to the phonemes list, or do I have to spell the numbers out as readable text?
You have to normalize the transcripts before alignment and training. For example, "3" should be converted to "ba".
This is why digits are not included in the phonemes list.
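A minimal sketch of that normalization step is below. It is not code from vietTTS: normalize_digits and unknown_symbols are hypothetical helpers, and the digits are read one by one, which is a simplification. For example, "15" becomes "một năm" rather than "mười lăm", so multi-digit numbers, dates, and units need a proper number-to-words routine.

```python
import re

# Vietnamese words for the ten digits.
DIGIT_WORDS = {
    "0": "không", "1": "một", "2": "hai", "3": "ba", "4": "bốn",
    "5": "năm", "6": "sáu", "7": "bảy", "8": "tám", "9": "chín",
}


def normalize_digits(text):
    """Replace every ASCII digit with its Vietnamese word, digit by digit."""
    spoken = re.sub(r"[0-9]", lambda m: f" {DIGIT_WORDS[m.group()]} ", text)
    # Collapse the extra spaces introduced by the substitution.
    return " ".join(spoken.split())


def unknown_symbols(text, phonemes):
    """Return the characters in text that are missing from the phonemes list."""
    return sorted({ch for ch in text if ch not in phonemes})


print(normalize_digits("chiều dài là 3 mét"))
print(unknown_symbols("3 mét", ["m", "é", "t", " "]))
```

Running unknown_symbols over every transcript before training reports stray symbols up front, instead of failing later inside phonemes.index with a ValueError.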
Oh I got that point, cheers!