erew123/alltalk_tts

Espeak NG problems and question

Closed this issue · 4 comments

Describe the bug
Alltalkbeta Standalone: Unable to install Espeak NG on Windows 10.

To Reproduce
Configuration difficulties described in narrative.

Desktop (please complete the following information):
AllTalk was updated: 10/15/2024
Custom Python environment: no
Text-generation-webUI was updated: 10/15/2024

Additional context
Hi! Text-generation-webui update v1.15 damaged my deepspeed/alltalk_tts extension, which I was unable to fix or revert, either not having matching pydantic, an appropriate deepspeed wheel, or something else I'm too dull to discern.

So I decided to reinstall my webui environment and try the alltalkbeta standalone instead, intending to connect with either webui or SillyTavern--anything that would work.

After installing standalone, I see that "Espeak NG" is a requirement. I tried to install that, but the binary installers for both x64 and x86 consistently hung. As I futzed with these I noticed their download directory became undeletable until I took ownership. Trying to install it in Sandboxie yielded an lsass.exe exception (originating from the local system) in which Windows alerted that it would shut down and restart in one minute. On downloading the source archive, Windows Defender reported "Backdoor:PHP/Dirtelti.HA; Alert Level: Severe...".

This detection is discussed in an Espeak NG issue from February, but perhaps the Windows binaries and/or source have yet to be changed.

Conversely, if it is a false positive, then perhaps the binaries will not install on older hardware without AVX/2 instructions (i7 930 CPU). Is there a manual way to install Espeak NG on Windows? I didn't find any so far.

Yet I am able to generate directly using my custom voice checkpoint in the standalone AllTalk. Is it possible to connect to it without using Espeak NG? Is Espeak NG something I can forgo if I just want to connect via API to a single voice on a single XTTSv2 checkpoint (trained previously with the extension)? I'm not exactly sure what it does or if I can skip it.

Thanks!

Edit: Hmm, Espeak seems already to be installed as part of Calibre2. In Calibre2, I activated "read aloud" and it downloaded the "en_US-libritts-high" voice, onnx and json. It read aloud and seems to be functional. Would this installation suffice for AllTalk standalone?

It has *_dict data files in an "espeak-ng-data" directory, as well as espeak-ng.dll, libtashkeel_model.ort, onnxruntime.dll, onnxruntime_providers_shared.dll, piper.exe, and piper_phonemize.dll.

Edit 2: I plucked the log file of the failed install from the temp files, and it shows MSI installation error 1603... One cause of that might be "Windows Installer is attempting to install an app that is already installed on your PC." That seems likely.
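For anyone else chasing a 1603: in a verbose MSI log, the action that actually failed is usually marked with "Return value 3", and the lines just above it tend to name the real cause. Below is a minimal sketch for pulling those spots out of a log. The helper names `read_msi_log` and `find_failures` are made up for illustration, and the log path is whatever file you recovered from your temp folder.

```python
from pathlib import Path


def read_msi_log(path):
    """Read an MSI log as text. Windows Installer often writes UTF-16 logs."""
    raw = Path(path).read_bytes()
    # A UTF-16 byte-order mark at the start means UTF-16 decoding is needed.
    if raw.startswith((b"\xff\xfe", b"\xfe\xff")):
        return raw.decode("utf-16", errors="replace")
    return raw.decode("utf-8", errors="replace")


def find_failures(log_text, context=3):
    """Return (line_number, lines) for each 'Return value 3' marker.

    'lines' holds up to `context` lines before the marker plus the marker
    line itself, since the cause is usually logged just above it.
    """
    lines = log_text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        if "return value 3" in line.lower():
            start = max(0, i - context)
            hits.append((i + 1, lines[start:i + 1]))
    return hits
```

Usage would be along the lines of `for n, ctx in find_failures(read_msi_log(r"C:\path\to\your\install.log")): print(n, *ctx, sep="\n")`. This only locates the failing action; it does not confirm the "already installed" theory by itself.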

I may be able to reinstall Calibre2 without Espeak/Piper, so I will try that and close this issue.

@ShaunCassidyPoster Thanks for the heads up. I had not seen/come across the false positive virus alert warning at all. Not a single installation I've performed on Win 11 has ever flagged it (even brand new, factory fresh installs of Windows).

I can say for certain that the Coqui TTS engines use espeak-ng as their phonemizer, as I helped update the detection code in the Coqui TTS engine a couple of months back (idiap/coqui-ai-TTS#32), and I am pretty sure Piper also uses/requires it on Windows, along with other TTS engines I intend to add.

The short answer to your "would Calibre2 work" is that I don't know at the moment, as the scripts doing the checks come from the actual TTS engine developers and are not my code. Maybe at some point I will have to try to catch the espeak-ng dev and ask if they will provide a new MSI installer, or maybe look at building it myself: https://github.com/espeak-ng/espeak-ng/blob/master/docs/building.md
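In the meantime, here is a rough sketch of how you could check by hand what a detection script might see. It is an assumption on my part that the engine checks look for an `espeak-ng` or `espeak` executable on PATH; the real scripts may do more or something different. The folder list and the function name `find_espeak` are also just illustrative guesses (the MSI's default install folder is assumed to be `C:\Program Files\eSpeak NG`), so pass your own folders, such as the Calibre2 one, via `extra_dirs`.

```python
import shutil
from pathlib import Path

# Assumed default install folders for the espeak-ng MSI; adjust as needed.
DEFAULT_DIRS = [
    r"C:\Program Files\eSpeak NG",
    r"C:\Program Files (x86)\eSpeak NG",
]

# File names to look for, mapped to the result key they fill in.
# The DLL names are assumptions based on the files mentioned in this thread.
CANDIDATES = (
    ("espeak-ng.exe", "exe"),
    ("espeak-ng.dll", "dll"),
    ("libespeak-ng.dll", "dll"),
)


def find_espeak(extra_dirs=()):
    """Report where an espeak-ng executable or library can be found, if at all."""
    result = {
        # What a plain PATH lookup would find (None if nothing is on PATH).
        "on_path": shutil.which("espeak-ng") or shutil.which("espeak"),
        "exe": None,
        "dll": None,
    }
    for folder in list(extra_dirs) + DEFAULT_DIRS:
        for name, key in CANDIDATES:
            candidate = Path(folder) / name
            # Keep the first match for each kind of file.
            if result[key] is None and candidate.is_file():
                result[key] = str(candidate)
    return result
```

If `on_path` comes back `None` while `dll` points into another application's folder, that would be consistent with a bundled copy that a PATH-based check cannot see, though that is only a guess about what is happening on your machine.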

Thanks

Well, I solved the problem by not doing anything further (left Calibre alone) except installing the AllTalk 2 SillyTavern extension. My fine-tuned XTTSv2 model works great with the standalone API. I think it takes less VRAM, too! The only weird thing is AllTalk 2 still doesn't know that Espeak-NG may be installed and shows the red error message, or maybe it's a softer requirement or it works as long as certain files are present. Thanks a lot!

> I can say for certain that the Coqui TTS engines use espeak-ng as their phonemizer

Some Coqui models use Espeak, but not all. In particular, XTTS does not use phonemes at all and it should be possible to run it without Espeak.

@eginhard Thanks, that's very handy to know! :)