kanttouchthis/text_generation_webui_xtts

How to make it work on Windows - Instructions

erew123 opened this issue · 6 comments

Not sure if the instructions are Linux based, but I had to perform a couple of extra steps to make this work on Windows:

Clone this repo:
cd extensions
git clone https://github.com/kanttouchthis/text-generation-webui-xtts

It turns out text-generation-webui doesn't like loading extensions with a dash in the name, so rename the folder:
ren text-generation-webui-xtts textgenerationwebuixtts
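
A rough illustration of why the dashed folder name fails, assuming the web UI loads each extension with an ordinary Python import along the lines of `extensions.<folder>.script` (that loading mechanism is an assumption, not something confirmed in this thread). The helper name `can_import_as_package` is hypothetical and only for demonstration:

```python
def can_import_as_package(folder_name: str) -> bool:
    """Return True if folder_name can be used as one segment of `import a.b.c`."""
    # Each dotted segment of an import statement must be a valid identifier.
    return folder_name.isidentifier()


for name in ("text-generation-webui-xtts", "textgenerationwebuixtts"):
    print(name, "->", can_import_as_package(name))

# The dashed name is also rejected by the parser when written as an import.
try:
    compile("import extensions.text-generation-webui-xtts.script", "<demo>", "exec")
    dashed_import_parses = True
except SyntaxError:
    dashed_import_parses = False

print("dashed import parses:", dashed_import_parses)
```

Any folder name made only of letters, digits and underscores (and not starting with a digit) passes the same check, so the exact name chosen in the rename is a matter of taste.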

Move back up to your text-generation-webui folder
cd..

Activate your environment. For example:
conda activate textgen

Install dependencies for TTS using the renamed folder name
pip install -r extensions\textgenerationwebuixtts\requirements.txt

Install TTS. Their version requirements cause issues so we install the dependencies above, without version requirements.
pip install TTS --no-dependencies

Edit \text-generation-webui\installer_files\env\Lib\site-packages\gradio\components\dropdown.py

In that file, currently at line 49, change

allow_custom_value: bool = False,

to

allow_custom_value: bool = True,

Without this setting, text-generation-webui doesn't like having custom interface boxes on it, so it will error when trying to start xtts.
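
If you would rather not edit the file by hand, the same change can be sketched as a small script. This is only a sketch: the function names `patch_source` and `patch_file` are made up for illustration, and it assumes the line appears exactly once in your gradio version, refusing to touch the file otherwise.

```python
from pathlib import Path

OLD = "allow_custom_value: bool = False,"
NEW = "allow_custom_value: bool = True,"


def patch_source(text: str) -> str:
    """Flip the default of allow_custom_value in the given source text."""
    count = text.count(OLD)
    if count != 1:
        # Assumption: the line occurs exactly once. Bail out rather than guess.
        raise ValueError(f"expected exactly one matching line, found {count}")
    return text.replace(OLD, NEW)


def patch_file(path: Path) -> None:
    """Write a .bak copy next to the file, then save the patched source."""
    original = path.read_text(encoding="utf-8")
    patched = patch_source(original)
    path.with_name(path.name + ".bak").write_text(original, encoding="utf-8")
    path.write_text(patched, encoding="utf-8")
```

Run from the text-generation-webui folder, it could be called as `patch_file(Path(r"installer_files\env\Lib\site-packages\gradio\components\dropdown.py"))`. Keep in mind that updating gradio will likely overwrite the edit.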

You can now go back to the main text-generation-webui folder and run start_windows.bat

Once it has loaded, go to the "Session" tab, select textgenerationwebuixtts, and click the "Apply flags/extensions and restart" button. While it restarts, your command prompt window will prompt you to accept the Coqui terms; a few files will then download and the interface should start.

Thanks for the Windows instructions. Does it repeat previous generations with each new generation for you? It does that for me.

I hadn't gotten that far in testing... until you asked! I was too busy writing the instructions on here before I forgot them!

Yeah, it's trying to repeat the 1st message! Damn, I saw this issue on another TTS extension in text-generation-webui and I cannot for the life of me remember what the fix was, though there was one. If it comes to mind, I will post back here!

I have the same memory. I believe I posted about that issue on a bark extension, but I forgot what they did to fix it.

I found it: wsippel/bark_tts#16

Oh jeez... this commit, I guess: wsippel/bark_tts@31f6761

I have absolutely no idea how to incorporate/merge those kinds of changes into this. But it looks like the code temporarily ensures streaming is off (I tried manually turning that off in the interface and it made no difference) and then drops the history of previous generations of audio/text... or something like that!
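
For anyone who wants to have a go, here is a rough sketch of one reading of that description. It is not the code from the linked commit. It assumes the extension can define `state_modifier` and `history_modifier` hooks, that the generation state is a dict with a `stream` key, and that chat history is a dict whose `visible` list holds `[user_text, reply_html]` pairs with audio tags ending in `controls autoplay>`. All of those shapes are assumptions to check against the web UI and the extension before relying on this.

```python
def state_modifier(state: dict) -> dict:
    """Force streaming off for this generation (assumed key: 'stream')."""
    state["stream"] = False
    return state


def history_modifier(history: dict) -> dict:
    """Strip autoplay from earlier replies so old audio is not replayed.

    Assumes each entry in history['visible'] is a [user_text, reply_html] pair.
    """
    history["visible"] = [
        [user_text, reply_html.replace("controls autoplay>", "controls>")]
        for user_text, reply_html in history["visible"]
    ]
    return history
```

The idea is that only the newest reply should carry autoplay; anything already in the history keeps its player controls but stays silent until clicked.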

Sadly I'm not a coder, but at least we've found a sort of path for someone to have a go at fixing it.

Thank you for trying, and you're right, this is a good start. :)

Closing because a few others have looked into this, made some changes, and gotten things working more smoothly!

#3 (comment)