auspicious3000/SpeechSplit

How to get generated speech from the output of the trained Generator?

6lyx opened this issue · 3 comments

6lyx commented

I have trained the Generator model on my own data. However, I could not find any code for generating speech from the trained Generator. I checked "demo.ipynb" to find out how it is done, and it indicates that a trained F0_Converter is needed.
So I would like to ask the author: is it necessary to train an F0_Converter first in order to generate speech from the trained Generator (I could not find any code for training the F0_Converter)? Or can we just use the pretrained F0_Converter?

If your data is very different from VCTK, you probably need to re-train the F0_Converter.

6lyx commented

Many thanks for your quick answer. I am now using speech with a sampling rate of 44100 Hz. Does that mean it is necessary to retrain both the F0_Converter and the wavegen model? I have found that the speech I generated is much shorter than the original speech (using the trained G model and the pretrained wavegen model from this project).

Yes. In that case, you probably need to tweak other parts of the model as well.
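
For anyone hitting the same length mismatch, the arithmetic below may help as a sanity check. It is only a sketch, not code from this repository: the hop length of 256 samples, the 16000 Hz model rate, and the 192-frame count are assumptions chosen for illustration, so check them against your own hyperparameters. It shows that a fixed number of spectrogram frames covers less time at 44100 Hz than at a lower rate, and it derives the integer up/down factors you would need if you chose to resample the data instead of retraining everything.

```python
from fractions import Fraction


def frames_to_seconds(n_frames: int, hop_length: int, sample_rate: int) -> float:
    """Duration in seconds covered by n_frames spectrogram frames."""
    return n_frames * hop_length / sample_rate


def resample_factors(src_rate: int, dst_rate: int) -> tuple[int, int]:
    """Smallest integer (up, down) pair such that src_rate * up / down == dst_rate."""
    ratio = Fraction(dst_rate, src_rate)
    return ratio.numerator, ratio.denominator


# All four values below are assumptions for illustration, not values read from the repo.
ASSUMED_HOP_LENGTH = 256     # samples between consecutive frames
ASSUMED_MODEL_RATE = 16000   # Hz, rate the pretrained models are assumed to expect
SOURCE_RATE = 44100          # Hz, rate of the data described in this thread
N_FRAMES = 192               # hypothetical number of frames in one utterance

at_model_rate = frames_to_seconds(N_FRAMES, ASSUMED_HOP_LENGTH, ASSUMED_MODEL_RATE)
at_source_rate = frames_to_seconds(N_FRAMES, ASSUMED_HOP_LENGTH, SOURCE_RATE)
up, down = resample_factors(SOURCE_RATE, ASSUMED_MODEL_RATE)

print(f"{N_FRAMES} frames at {ASSUMED_MODEL_RATE} Hz: {at_model_rate:.3f} s")
print(f"{N_FRAMES} frames at {SOURCE_RATE} Hz: {at_source_rate:.3f} s")
print(f"Resample {SOURCE_RATE} -> {ASSUMED_MODEL_RATE} Hz with up={up}, down={down}")
```

If you resample, the `up` and `down` values can be passed to a polyphase resampler such as `scipy.signal.resample_poly`. Whether resampling is acceptable, or whether retraining at 44100 Hz is the better route, depends on your data and use case.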