Is it very slow for the WaveNet vocoder to synthesize a voice?

Question


hhhuazi opened this issue 2 years ago · 5 comments

Hello, I use demo.ipynb to synthesize voice from a mel spectrogram, and it takes 5 minutes to synthesize one utterance. Isn't this too slow? Can I use HiFi-GAN's pretrained model directly? Thank you for your answer!

Answer 1 · 2022-11-28T07:45:55.000Z

yes, that's the purpose
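For context, here is a rough sketch of why an autoregressive WaveNet vocoder is slow: it generates one audio sample per network evaluation, and each step depends on the previous one, so the steps cannot run in parallel. The 3-second utterance length and 16 kHz sample rate below are assumptions for illustration. Only the 5-minute wall-clock time comes from the question.

```python
def autoregressive_steps(duration_s: float, sample_rate: int) -> int:
    """Sequential network evaluations needed: one per output audio sample."""
    return round(duration_s * sample_rate)


def seconds_per_sample(wall_clock_s: float, duration_s: float, sample_rate: int) -> float:
    """Average wall-clock time spent on each generated sample."""
    return wall_clock_s / autoregressive_steps(duration_s, sample_rate)


# Assumed for illustration: a 3 s utterance at 16 kHz.
# From the question: about 5 minutes (300 s) of synthesis time.
steps = autoregressive_steps(3.0, 16_000)
per_sample_ms = 1000 * seconds_per_sample(300.0, 3.0, 16_000)

print(f"sequential steps: {steps}")
print(f"average time per sample: {per_sample_ms:.2f} ms")
```

Under these assumptions, a few milliseconds per step already adds up to minutes per utterance. This is why non-autoregressive vocoders such as HiFi-GAN, which produce all samples in one forward pass, are much faster.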

Answer 2 · 2022-11-29T02:46:01.000Z

Thank you for your answer! I found a pretrained HiFi-GAN model on GitHub and added it, but the synthesized audio has no intelligible content; it sounds like noise. Why? Do I need to retrain the HiFi-GAN vocoder on the VCTK dataset? Does the dataset need to be split?

Answer 3 · 2022-11-29T03:49:23.000Z

Yes. But you can also use the HiFi-GAN model under autovc or autopst.
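A common reason a vocoder checkpoint trained elsewhere produces noise is that its mel-spectrogram settings differ from those used to compute the input mels. The thread does not confirm that this was the cause here. The sketch below compares two settings dictionaries. The helper `mel_config_mismatches`, the key names, and all values are hypothetical, so substitute the settings from your own feature extraction code and from the vocoder's config file.

```python
def mel_config_mismatches(frontend: dict, vocoder: dict) -> dict:
    """Return {key: (frontend_value, vocoder_value)} for every differing key."""
    keys = sorted(set(frontend) | set(vocoder))
    return {
        k: (frontend.get(k), vocoder.get(k))
        for k in keys
        if frontend.get(k) != vocoder.get(k)
    }


# Illustrative values only (assumptions, not read from any repository).
frontend_cfg = {
    "sampling_rate": 16000, "n_fft": 1024, "hop_size": 256,
    "win_size": 1024, "num_mels": 80, "fmin": 90, "fmax": 7600,
}
vocoder_cfg = {
    "sampling_rate": 22050, "n_fft": 1024, "hop_size": 256,
    "win_size": 1024, "num_mels": 80, "fmin": 0, "fmax": 8000,
}

mismatches = mel_config_mismatches(frontend_cfg, vocoder_cfg)
for key, (front, voc) in mismatches.items():
    print(f"{key}: front end = {front}, vocoder = {voc}")
```

Any reported key means the vocoder sees features it was not trained on. Amplitude scaling and normalization of the mels can also differ, and a parameter comparison like this would not catch that. Retraining on mels produced by your own pipeline, or using a checkpoint trained with matching settings, avoids both problems.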

Answer 4 · 2022-11-29T05:25:33.000Z

Are there any precautions to take when retraining the vocoder?

Answer 5 · 2022-11-29T16:44:33.000Z

It should be straightforward.