
Unable to synthesize some pinyin syllables (KeyError: 'cuo2')


Hello, I was generating wavs with your project and your pretrained models.
However, synthesis failed on one input with the following error.

2021-10-21 20:35:45,624 synthesize.py: INFO: processing 413|sil e2 cuo2 shan1 sil|sil 峨 痤 山 sil|10
Traceback (most recent call last):
  File "../../mtts/synthesize.py", line 98, in <module>
    name, tokens = text_processor(line)
  File "/mnt/data1/jungwonchang/projects/mandarin-tts/mtts/text/text_processor.py", line 44, in __call__
    return self._process(input)
  File "/mnt/data1/jungwonchang/projects/mandarin-tts/mtts/text/text_processor.py", line 38, in _process
    tokens = tokenizer.tokenize(seg)
  File "/mnt/data1/jungwonchang/projects/mandarin-tts/mtts/datasets/dataset.py", line 69, in tokenize
    tokens = [self.v2i[t] for t in text.split()]
  File "/mnt/data1/jungwonchang/projects/mandarin-tts/mtts/datasets/dataset.py", line 69, in <listcomp>
    tokens = [self.v2i[t] for t in text.split()]
KeyError: 'cuo2'

After debugging, I found that some pinyin syllables (for example 'cuo2') are missing from the vocabulary the tokenizer uses (self.v2i).

(Pdb) 'cuo2' in self.v2i.keys()
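
For anyone who wants to check an input before synthesis, here is a minimal sketch that lists every token missing from the vocabulary instead of stopping at the first KeyError. It assumes v2i behaves like a plain dict from token to index, as the traceback suggests. The function name find_missing_tokens and the toy vocabulary are hypothetical and not part of the project.

```python
def find_missing_tokens(v2i, text):
    """Return the whitespace-separated tokens of `text` that are not keys of `v2i`.

    Each missing token is reported once, in order of first appearance.
    """
    missing = []
    for token in text.split():
        if token not in v2i and token not in missing:
            missing.append(token)
    return missing


# Toy vocabulary for illustration only; the real v2i is built by the project.
toy_v2i = {"sil": 0, "e2": 1, "shan1": 2}

# Pinyin segment taken from the failing input in the log above.
segment = "sil e2 cuo2 shan1 sil"

print(find_missing_tokens(toy_v2i, segment))
```

Running a check like this over the whole input file would show all unsupported syllables at once, which may make it easier to decide what needs to be added to the vocabulary.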

Could you add the missing pinyin syllables to the vocabulary?

By the way, thanks for the great project!