How to add marker of sil, sp to TextGrid after MFA?

Question

nampdn opened this issue 3 years ago · 6 comments

Hi @NTT123,
First of all, thank you for your brilliant work! I have successfully trained on my dataset with MFA, but the generated .TextGrid files contain no markers for silence or space (sil, sp). Could you please help me detect these and add the symbols to the TextGrid files?

Answer 1 · 2022-04-30T11:39:20.000Z

~~Hi @nampdn, thank you for reporting this. The newest version of MFA removes these markers.~~

~~According to MontrealCorpusTools/Montreal-Forced-Aligner#377~~
~~you have to run mfa align or mfa train with an additional argument --disable_textgrid_cleanup.~~

Answer 2 · 2022-04-30T15:32:31.000Z

@nampdn, please check out the fix_sil branch for a quick fix. This branch can read TextGrid files that have no "sil" or "sp" markers.
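
For readers who want to patch alignments themselves, here is a minimal sketch of one way to restore the markers. It is not the code from the fix_sil branch. It assumes the phone tier has already been read into plain (start, end, label) tuples, that unlabeled intervals and uncovered time ranges both mean silence, and that "sil" is the symbol your phoneme list expects. The function name fill_silence and its parameters are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation of one tier entry: (start_sec, end_sec, label).
Interval = Tuple[float, float, str]


def fill_silence(
    intervals: List[Interval],
    total_duration: float,
    sil: str = "sil",
    min_gap: float = 1e-6,
) -> List[Interval]:
    """Label empty intervals as silence and fill uncovered time with silence."""
    filled: List[Interval] = []
    cursor = 0.0  # end time of the last interval emitted so far
    for start, end, label in sorted(intervals):
        # Time between the previous interval and this one is treated as silence.
        if start - cursor > min_gap:
            filled.append((cursor, start, sil))
        # An interval with a blank label is treated as silence as well.
        filled.append((start, end, label.strip() or sil))
        cursor = end
    # Trailing silence up to the end of the audio.
    if total_duration - cursor > min_gap:
        filled.append((cursor, total_duration, sil))
    return filled


if __name__ == "__main__":
    phones = [(0.5, 0.8, "a"), (0.8, 1.0, ""), (1.2, 1.5, "b")]
    for row in fill_silence(phones, total_duration=2.0):
        print(row)
```

Adjacent silence intervals are left unmerged here to keep the sketch short; merge them afterwards if your data loader expects a single "sil" between words.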

Answer 3 · 2022-05-01T04:15:34.000Z

Woot! I'm so grateful. I'll try it now.
Have a happy holiday!

Answer 4 · 2022-05-24T18:04:45.000Z

Hi @NTT123,
After pulling the latest fixes for sil, I still have a problem with some utterances that contain numbers.

('n', 'g', 'ư', 'ờ', 'i', ' ', 'đ', 'o', ' ', 'c', 'h', 'i', 'ề', 'u', ' ', 'r', 'ộ', 'n', 'g', ' ', 'c', 'ủ', 'a', ' ', 'l', 'ố', 'i', ' ', 'v', 'à', 'o', ' ', 'c', 'ổ', 'n', 'g', ' ', 'sil', 'l', 'à', ' ', 'n', 'ă', 'm', ' ', 'sil', '3', ' ', 'm', 'é', 't', ' ', 'sil', 'v', 'à', ' ', 'c', 'h', 'i', 'ề', 'u', ' ', 'd', 'à', 'i', ' ', 'l', 'à', ' ', 's', 'á', 'u', ' ', 'sil', '9', ' ', 'm', 'é', 't', ' ', 'sil')
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/content/vietTTS/vietTTS/nat/acoustic_trainer.py", line 181, in <module>
    train()
  File "/content/vietTTS/vietTTS/nat/acoustic_trainer.py", line 100, in train
    batch = next(train_data_iter)
  File "/content/vietTTS/vietTTS/nat/data_loader.py", line 111, in load_textgrid_wav
    ps = [phonemes.index(p) for p in ps]
  File "/content/vietTTS/vietTTS/nat/data_loader.py", line 111, in <listcomp>
    ps = [phonemes.index(p) for p in ps]
ValueError: '3' is not in list

Can you take a look at this sample? Can I add 0-9 to the phonemes list, or do I have to spell the numbers out as readable text?

Answer 5 · 2022-05-25T01:01:38.000Z

You have to normalize the transcripts before alignment and training. For example, "3" should be converted to "ba".
This is the reason why numbers are not included in the phonemes list.
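
A minimal sketch of that normalization step, assuming it is enough to spell out each digit on its own. This is a simplification: multi-digit numbers come out digit by digit ("12" becomes "một hai"), which is not how they are normally read aloud, so a real pipeline needs a proper number-to-words step. The names DIGIT_WORDS and normalize_digits are hypothetical and not part of vietTTS.

```python
import re

# Vietnamese words for the digits 0-9.
DIGIT_WORDS = {
    "0": "không",
    "1": "một",
    "2": "hai",
    "3": "ba",
    "4": "bốn",
    "5": "năm",
    "6": "sáu",
    "7": "bảy",
    "8": "tám",
    "9": "chín",
}


def normalize_digits(text: str) -> str:
    """Replace every ASCII digit with its Vietnamese word, one digit at a time."""
    # Pad each word with spaces so that digits glued to letters or to each
    # other end up as separate words.
    spelled = re.sub(r"[0-9]", lambda m: f" {DIGIT_WORDS[m.group()]} ", text)
    # Collapse the extra whitespace introduced by the padding.
    return " ".join(spelled.split())


if __name__ == "__main__":
    print(normalize_digits("chiều dài là 9 mét"))
```

Run this over the transcripts before alignment, so that the TextGrid files and the phonemes list never see raw digits.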

Answer 6 · 2022-05-25T01:04:11.000Z

Oh I got that point, cheers!