tortoise-tts-fast
HobisPL opened this issue · 8 comments
Have you considered adding tortoise-tts-fast instead of the original Tortoise-TTS? It runs much faster, and you can add your own models to it. It is also better for cloning voices, because you can extract the conditioning latents from whole audio samples, which makes the cloned voices more faithful.
https://github.com/152334H/tortoise-tts-fast
That repo has a very restrictive license (AGPLv3).
However, this repo is under the same Apache 2.0 license: https://git.ecker.tech/mrq/tortoise-tts
Do you know if the mrq repo is worthwhile?
Also, tortoise-tts-fast offers a web UI that could be useful.
If that repo matters to you, you could fork this webui and swap tortoise-tts for tortoise-tts-fast.
Ok, so as far as I tested it: the API seems compatible, so replacing the tortoise/ folder with this fork should work.
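One quick way to sanity-check that drop-in claim (a sketch, not part of this thread; the import path in the usage comment is hypothetical) is to diff the public names of the original package against the fork before swapping folders:

```python
def public_api(mod):
    """Public names a module exposes (respects __all__ when defined)."""
    names = getattr(mod, "__all__", None)
    if names is None:
        names = [n for n in dir(mod) if not n.startswith("_")]
    return set(names)

def missing_names(original, fork):
    """Names the original exposes that the fork does not."""
    return public_api(original) - public_api(fork)

# Hypothetical usage after installing both packages side by side:
# import tortoise.api as orig_api
# import tortoise_tts_fast.tortoise.api as fast_api
# print(missing_names(orig_api, fast_api))  # empty set => likely drop-in
```

An empty result does not prove the signatures match, but a non-empty one immediately shows what the webui would lose in the swap.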
It does partially install, but there are some incompatibilities with torch versions, and I'm not sure how soon they will be fixable.
This is the error I got:

```
ImportError: cannot import name 'fail_with_message' from 'torchaudio._internal.module_utils'
```
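That error means the installed torchaudio no longer exposes a private helper the fork imports. A small diagnostic (a generic sketch, not from this thread) can confirm which name is missing before chasing torch version pins:

```python
import importlib

def has_name(module_name: str, attr: str) -> bool:
    """True if `attr` can be imported from `module_name`."""
    try:
        mod = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(mod, attr)

# The failing import from the traceback above:
# has_name("torchaudio._internal.module_utils", "fail_with_message")
```

If it returns False for an installed torchaudio, the fix is pinning torchaudio to a version the fork was written against rather than patching the fork.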
For installation I used:
```bash
# Installing TortoiseTTS Fast - AGPLv3
git clone https://github.com/152334H/tortoise-tts-fast tortoise_tts_fast
cd tortoise_tts_fast
pip install --ignore-installed llvmlite -e .
pip install --ignore-installed llvmlite git+https://github.com/152334H/BigVGAN.git
```
Improved performance and UI in #45, although tortoise-tts-fast should still be faster.
I found a repository that already uses Gradio, and it includes training and many cool options.
https://git.ecker.tech/mrq/ai-voice-cloning
@rsxdalv you can update this repo with the new API call:

```python
tts = api.TextToSpeech(use_deepspeed=True, kv_cache=True, half=True)
```
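Those keyword arguments only exist in newer tortoise builds, so passing them unconditionally would crash older installs. One defensive option (a sketch, not from this thread; `supported_kwargs` is a hypothetical helper) is to filter the options against the constructor's actual signature:

```python
import inspect

def supported_kwargs(fn, wanted):
    """Return only the entries of `wanted` that `fn` actually accepts."""
    params = inspect.signature(fn).parameters
    return {k: v for k, v in wanted.items() if k in params}

# Hypothetical usage with the tortoise API:
# opts = supported_kwargs(api.TextToSpeech.__init__,
#                         {"use_deepspeed": True, "kv_cache": True, "half": True})
# tts = api.TextToSpeech(**opts)
```

On an older tortoise, unsupported flags are silently dropped instead of raising a TypeError.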
I'm considering this resolved for now. Feel free to reopen this or another issue for more tortoise optimizations.