karim23657/Persian-tts-coqui

Hi, Multi Speaker Tutorial

Opened this issue · 4 comments

Hi Karim,
How do I use the multi-speaker VITS train.py (Kamtera/persian-tts-multispeaker-vits) to train a multi-speaker model or to fine-tune an existing one?
Could you help me?
Best regards.

Hi, please have a look at: https://huggingface.co/Kamtera/persian-tts-multispeaker-vits
The training code I used and the trained weights are there.
I've also added these notebooks to the repo: https://github.com/karim23657/Persian-tts-coqui/tree/main/recepies/vits/multispeaker
I followed these tutorials:
readme -> https://github.com/Edresson/YourTTS#reproducibility
my code was inspired by -> https://github.com/coqui-ai/TTS/blob/dev/recipes/vctk/yourtts/train_yourtts.py

*Note: in my code I used a custom dataset loader called mozilla_with_speaker. Here is my fork of the TTS package, at the place where I edited it: https://github.com/karim23657/TTS/blob/3ba73bf488504bc689e3d6d954e1b5220cbad577/TTS/tts/datasets/formatters.py#L16
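
For readers who have not written a Coqui TTS dataset formatter before, here is a minimal sketch of what a speaker-aware formatter might look like. It is not the code from the fork linked above. The function name `mozilla_with_speaker_sketch`, the `text|wav_file|speaker` column layout, and the `wavs/` subfolder are all assumptions made for illustration, and the dict keys follow the convention used by recent Coqui TTS formatters, which the fork may not share.

```python
import os


def mozilla_with_speaker_sketch(root_path, meta_file, **kwargs):
    """Hypothetical formatter: parse 'text|wav_file|speaker' lines into sample dicts.

    The column order and the 'wavs' subfolder are assumptions, not the
    layout used by the actual mozilla_with_speaker loader.
    """
    items = []
    meta_path = os.path.join(root_path, meta_file)
    with open(meta_path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            cols = line.split("|")
            if len(cols) < 3:
                raise ValueError(f"expected 3 '|'-separated columns, got: {line!r}")
            text, wav_name, speaker = (c.strip() for c in cols[:3])
            items.append(
                {
                    "text": text,
                    "audio_file": os.path.join(root_path, "wavs", wav_name),
                    "speaker_name": speaker,  # one label per speaker in the dataset
                    "root_path": root_path,
                }
            )
    return items
```

The point of the extra column is that each sample carries its own speaker label instead of one hard-coded name for the whole dataset, which is what a multi-speaker model needs. Check the linked `formatters.py` for the real column layout before reusing this.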

I would be very thankful if you shared your work and models with us.
Feel free to ask any questions.

Thank you so much.
I will check them.
I have a Tesla A100 40GB, and your training code and config help me a lot.
I will share my model as soon as a checkpoint is produced on my system.
Best regards.

@Veria70 I tried running inference on Karim's pretrained multispeaker model and all 3 voices produced bad wav files. I don't know if it's my fault. I have reverted to using his pretrained "female vits" model which works great. I am very excited to try the result of your training!

@kfatehi Hi Keyvan, the wav files are not bad as such; I know what the problem with that voice is. The problem is the accent in the dataset.
You need to use the Mozilla Common Voice dataset and train with it.
Good luck.