X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion
Demo page: https://X-E-Speech.github.io/X-E-Speech-demopage
Anonymous preprint: https://openreview.net/forum?id=J4fL6FDz36
I'm still cleaning up the training code.
The inference code is available now; the environment is similar to that of VITS.
The pre-trained models are available here: https://drive.google.com/drive/folders/1PHzFyqkOa_7O4TVI6vypZa8MIpU7nIbT?usp=sharing