idiap/coqui-ai-TTS

[Bug] Mix language inference text

Closed · 4 comments

Describe the bug

What should we do for text that contains multiple languages? Inference sends the whole input to eSpeak with a single fixed language setting, so eSpeak does not phonemize the parts in other languages well.

To Reproduce

Since inference sends everything to eSpeak with a fixed language setting, eSpeak does not handle it well!

Expected behavior

The phonemizer should handle mixed-language text correctly.

Logs

No response

Environment

What should we do for text that contains multiple languages? Since inference sends everything to eSpeak with a fixed language setting, eSpeak does not handle it well!

Additional context

No response

Yes, there is currently no way to do this directly in Coqui. But you can do mixed-language TTS with VITS/YourTTS models if you add some custom code (including calling eSpeak separately per language if you don't use grapheme-based models); see #104 for details.

Integrating this would first need SSML support (see previous discussions in coqui-ai#752), which is very complex, so closing as not planned for now, but I'm open to contributions regarding SSML.
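
For illustration, the "call eSpeak separately per language" idea mentioned above could look roughly like the sketch below for the Persian + English case. This is not Coqui's API and not necessarily what #104 does: `split_by_script`, `phonemize_mixed`, and `espeak_ipa` are hypothetical helper names, the script detection is a simple Unicode-range heuristic (Arabic-script characters are assumed to be Persian, all other letters English), and `espeak_ipa` assumes the `espeak-ng` binary is installed and on `PATH`.

```python
import re
import subprocess
from typing import Callable, List, Tuple

# Arabic-script Unicode blocks (these cover Persian letters).
# Assumption: any other alphabetic character is treated as English.
_ARABIC = re.compile(r"[\u0600-\u06FF\u0750-\u077F\uFB50-\uFDFF\uFE70-\uFEFF]")


def split_by_script(
    text: str, rtl_lang: str = "fa", ltr_lang: str = "en"
) -> List[Tuple[str, str]]:
    """Split text into (language, segment) runs based on character script.

    Neutral characters (spaces, digits, punctuation) stay with the current run.
    """
    segments: List[Tuple[str, str]] = []
    current_lang = None
    buffer: List[str] = []
    for ch in text:
        if _ARABIC.match(ch):
            lang = rtl_lang
        elif ch.isalpha():
            lang = ltr_lang
        else:
            lang = current_lang  # neutral character: no language switch
        if lang is not None and current_lang is not None and lang != current_lang:
            segments.append((current_lang, "".join(buffer)))
            buffer = []
        if lang is not None:
            current_lang = lang
        buffer.append(ch)
    if buffer:
        segments.append((current_lang or ltr_lang, "".join(buffer)))
    return segments


def phonemize_mixed(text: str, phonemize_fn: Callable[[str, str], str]) -> str:
    """Phonemize each run with its own language and join the results."""
    parts = [
        phonemize_fn(segment.strip(), lang)
        for lang, segment in split_by_script(text)
        if segment.strip()
    ]
    return " ".join(parts)


def espeak_ipa(segment: str, lang: str) -> str:
    """Hypothetical backend: call the espeak-ng CLI once per segment."""
    result = subprocess.run(
        ["espeak-ng", "-q", "--ipa", "-v", lang, segment],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

With eSpeak NG installed, `phonemize_mixed("Hello دنیا", espeak_ipa)` would phonemize each run with its own voice. The resulting phonemes would still need to be mapped to whatever symbol set the model was trained on, and a script-based split cannot separate languages that share a script (for example Persian and Arabic, or English and German), which would need a real language identifier or explicit markup.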

Hi, I checked eSpeak with the Persian (fa) language and found that it also handles English. Isn't that enough? @eginhard

If you just need to mix Persian and English, maybe? If you want to also mix other languages, then maybe not? I don't know your use case.

For the first version, yes, just Persian and English.
Maybe Arabic in the next version.
And German in the last one (it is different from all of them).

But step by step. For the first step (Persian and English), do you have any concerns?