How to use voice files instead of pure TTS?

Question

Vadim2S opened this issue 3 years ago · 4 comments

In the paper you describe a test on the LJ Speech dataset (Section 4.3, Content replacement). Could you provide code for loading voice files instead of pure sample generation in tts.py?

Answer 1 · 2022-01-22T15:02:13.000Z

Hello @Vadim2S, thanks for opening this issue, and apologies for the belated reply. The modeling code is a direct adaptation of Grad-TTS, so you can refer to the upstream repository for detailed instructions on how to load data. Hope this helps!
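For readers who just want a starting point, here is a minimal sketch of reading a WAV file into normalized float samples using only the Python standard library. It is not code from this repository or from Grad-TTS: the function name `load_wav_mono` is hypothetical, and the sketch assumes 16-bit PCM input. LJ Speech audio is sampled at 22,050 Hz, so a file at a different rate would presumably need resampling before it is turned into a mel-spectrogram with the upstream data utilities.

```python
import struct
import wave


def load_wav_mono(path):
    """Read a 16-bit PCM WAV file and return (samples, sample_rate).

    Hypothetical helper, not part of this repository or Grad-TTS.
    Samples are floats in [-1.0, 1.0); multi-channel audio is averaged to mono.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        n_channels = wf.getnchannels()
        sample_rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())

    # WAV PCM data is little-endian; unpack it as signed 16-bit integers.
    ints = struct.unpack("<%dh" % (len(raw) // 2), raw)

    # Channels are interleaved: average each group, then scale to [-1, 1).
    samples = [
        sum(ints[i:i + n_channels]) / (n_channels * 32768.0)
        for i in range(0, len(ints), n_channels)
    ]
    return samples, sample_rate
```

The returned sample list can then be converted to a tensor or array and passed through whatever mel-spectrogram function the upstream repository uses for training data.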

Answer 2 · 2023-01-05T02:50:01.000Z

I have this question as well. Looking at the inference code, it is unclear how I could drop-in replace running Grad-TTS with my own source WAV file. Any tips would be appreciated :) Or if you have pointers to any other methods of doing something similar.

Answer 3 · 2023-01-05T07:12:42.000Z

Hey @mvoodarla, thanks for opening this issue.

> how I could drop-in replace running Grad-TTS with my own source WAV file

Could you explain in more detail what you mean by this? I assume this is in the context of content replacement.

Answer 4 · 2023-01-05T16:11:15.000Z

Yes, I would like to be able to submit a source WAV file, with or without a text transcription of it, and replace a word or a set of words that were said in that file. Does that make sense?
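One piece of this workflow can be sketched independently of the model: working out which mel-spectrogram frames correspond to the words to be replaced. The sketch below is illustrative only and does not describe how this repository implements content replacement. It assumes word start and end times are already known (for example from a forced aligner), and it assumes a 22,050 Hz sample rate and a hop length of 256 samples; both function names are hypothetical.

```python
import math


def word_span_to_frames(start_sec, end_sec, sample_rate=22050, hop_length=256):
    """Map a time span in seconds to a half-open range of mel-frame indices.

    sample_rate and hop_length are assumed values, not read from any config.
    """
    if not 0 <= start_sec < end_sec:
        raise ValueError("expected 0 <= start_sec < end_sec")
    frames_per_sec = sample_rate / hop_length
    # Round outward so the whole word falls inside the range.
    first = math.floor(start_sec * frames_per_sec)
    last = math.ceil(end_sec * frames_per_sec)
    return first, last


def replacement_mask(n_frames, first, last):
    """Return a list with 1 for frames to regenerate and 0 for frames to keep."""
    last = min(last, n_frames)  # clip to the length of the spectrogram
    return [1 if first <= i < last else 0 for i in range(n_frames)]


# Example: a word spoken between 1.0 s and 2.0 s in a 200-frame spectrogram.
first, last = word_span_to_frames(1.0, 2.0)
mask = replacement_mask(200, first, last)
```

A mask like this marks the region an editing method would regenerate while leaving the rest of the source spectrogram untouched; how the masked region is actually filled in depends on the model.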