yl4579/StyleTTS

Pronunciation of single words or very short sentences is poor?

Closed · 1 comment

Hi, I trained the StyleTTS models on a multi-speaker Mandarin corpus, after filtering out utterances with a spectrogram length of less than 60. Synthesis of normal-length sentences is good, but when the input text is a single word or a very short sentence, the pronunciation is poor and sometimes hard to make out.
Did you run into this issue when training your StyleTTS models, and do you have any suggestions for me? @yl4579
Thanks again.

yl4579 commented

See #16