Microsoft Edge's online text-to-speech service cannot be used now.
Closed this issue · 6 comments
edge-tts --text "Hello, world!" --write-media hello.mp3
returns:
aiohttp.client_exceptions.ServerTimeoutError: Connection timeout to host wss://speech.platform.bing.com/consumer/speech/synthesize/readaloud/edge/v1?TrustedClientToken=6A5AA1D4EAFF4E9FB37E23D68491D6F4&ConnectionId=7ab700c0e46443f7b5e4e02db4df6776
Could you please help us update to a new TrustedClientToken and ConnectionId so that we can use the service as usual?
Thank you so much.
I haven't dug into the issue, but as a quick workaround I wrapped the edge-tts call in a try/except retry loop in Python, which handles it since the problem doesn't occur every time. The loop tries up to {attempts} times to run edge-tts; if a call fails, it waits {attempt_interval} seconds and then tries again:
import asyncio
import logging

import edge_tts

logger = logging.getLogger(__name__)

async def _create_audio_file(text: str, file_name: str,
                             voice: str = "en-US-SteffanNeural",
                             speed: str = "+0%",
                             volume: str = "+0%",
                             attempts: int = 10,
                             attempt_interval: int = 1) -> None:
    for attempt in range(attempts):
        try:
            communicate = edge_tts.Communicate(text, voice, rate=speed,
                                               volume=volume)
            await communicate.save(file_name)
        except edge_tts.exceptions.NoAudioReceived as e:
            logger.exception("edge_tts.exceptions.NoAudioReceived Error: %s, retry attempt: %s", e, attempt + 1)
            if attempt == attempts - 1:
                # Last attempt failed: give up and re-raise the original error.
                logger.error("Failed to obtain TTS after %s attempts. Max attempts exceeded.", attempts)
                raise
            # We get here if there was an exception and attempts remain.
            # Sleep without blocking the event loop, then retry.
            await asyncio.sleep(attempt_interval)
            continue
        # No exception: the audio file was saved, so stop retrying.
        break
From some quick testing, edge-tts seems to fail with edge_tts.exceptions.NoAudioReceived roughly 5-10% of the time, but each time it succeeded on the second attempt after waiting 1 second.
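The retry loop above can also be pulled out into a small generic helper, so the same logic covers other transient errors. This is only a sketch: retry_async, make_call, and retry_on are names made up for illustration and are not part of edge-tts, and which exceptions are actually worth retrying is an assumption you would need to verify for your setup.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")
logger = logging.getLogger(__name__)


async def retry_async(
    make_call: Callable[[], Awaitable[T]],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 10,
    interval: float = 1.0,
) -> T:
    """Await make_call() until it succeeds or the attempts run out.

    make_call must create a fresh awaitable on every invocation, because
    a coroutine object cannot be awaited twice.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except retry_on as exc:
            if attempt == attempts:
                logger.error("Giving up after %s attempts", attempts)
                raise
            logger.warning("Attempt %s failed (%r); retrying in %ss",
                           attempt, exc, interval)
            # Non-blocking wait so other tasks on the loop keep running.
            await asyncio.sleep(interval)
    raise AssertionError("unreachable")  # the loop always returns or raises
```

With edge-tts this might be called as retry_async(lambda: edge_tts.Communicate(text, voice).save(file_name), (edge_tts.exceptions.NoAudioReceived,)). Passing a factory rather than a coroutine is deliberate: each retry needs a new Communicate object and a new save() call.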
Since I have temporarily switched our TTS service to Windows' built-in SAPI.SpVoice, I will try your advice a bit later.
But still, thank you so much for your advice!
I am using
communicate = edge_tts.Communicate(chapter_content, sti)
asyncio.get_event_loop().run_until_complete(communicate.save(output_file))
audio = AudioSegment.from_file(output_file, format="mp3")
And it worked before, now I get
audio = AudioSegment.from_file(output_file, format="mp3")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "audio_segment.py", line 773, in from_file
raise CouldntDecodeError(
pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1
Output from ffmpeg/avlib:
ffmpeg version 2023-06-21-git-1bcb8a7338-essentials_build-www.gyan.dev
...
[in#0 @ 000001785c644c40] Error opening input: Invalid argument
AudioSegment is not part of edge-tts (it looks like it's from jiaaro/pydub?). This problem could definitely be caused by edge-tts not correctly generating an mp3 file, but the error message from AudioSegment is not particularly useful here. It would be strange, though, for edge-tts not to throw an edge_tts.exceptions error in that case (usually you would get edge_tts.exceptions.NoAudioReceived if this happened).
So, it would be helpful if you could investigate what is happening in the edge-tts part, before your use of AudioSegment. Check the values of chapter_content and sti to confirm they're valid input for edge_tts.Communicate, and confirm communicate is a valid object. Check that output_file is a valid path for your OS. If so, confirm that output_file was generated and try opening the file with an audio player to see whether it's valid.
You can do that by putting a breakpoint at your line audio = AudioSegment.from_file(output_file, format="mp3") and checking those values in the debugger at that line, or by adding something like the following to a testing branch:
In your imports add: import os
Replace the code you included above:
communicate = edge_tts.Communicate(chapter_content, sti)
asyncio.get_event_loop().run_until_complete(communicate.save(output_file))
audio = AudioSegment.from_file(output_file, format="mp3")
with:
communicate = edge_tts.Communicate(chapter_content, sti)
print(f"chapter content: {chapter_content}\nsti: {sti}\noutput_file: {output_file}")
asyncio.get_event_loop().run_until_complete(communicate.save(output_file))
assert os.path.isfile(output_file), f"edge-tts did not write {output_file}"
audio = AudioSegment.from_file(output_file, format="mp3")
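The assert in the snippet above only proves that a file exists at output_file. A slightly stronger check, sketched below, also rejects empty files and files that do not start like MP3 data. The function name looks_like_mp3 is made up for illustration, and the header test is a heuristic (an ID3 tag or an MPEG frame sync at the very start of the file), not a full decoder check.

```python
import os


def looks_like_mp3(path: str) -> bool:
    """Heuristic: the file exists, is non-empty, and starts like MP3 data."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    with open(path, "rb") as f:
        head = f.read(3)
    # Files with an ID3v2 tag begin with the ASCII bytes "ID3".
    if head == b"ID3":
        return True
    # Otherwise expect an MPEG audio frame sync: 11 consecutive set bits.
    return len(head) >= 2 and head[0] == 0xFF and (head[1] & 0xE0) == 0xE0
```

If this returns False right after communicate.save(output_file), the problem is on the edge-tts side (or in the input text) rather than in pydub or ffmpeg.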
You might also need to run edge-tts --list-voices to list the available voices and confirm that the voice you pass in (sti) is still valid.
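The voice check can also be done from Python. The sketch below assumes that edge_tts.list_voices() returns a list of dicts with a "ShortName" key (values such as "en-US-SteffanNeural"); verify that against your installed version. The helper names is_known_voice and fetch_voices are made up for illustration.

```python
from typing import Dict, Iterable


def is_known_voice(voice: str, voices: Iterable[Dict[str, str]]) -> bool:
    """Return True if some entry's "ShortName" equals the requested voice."""
    return any(v.get("ShortName") == voice for v in voices)


async def fetch_voices() -> list:
    """Fetch the current voice list from the service (needs network access)."""
    # Imported lazily so is_known_voice can be used without edge-tts installed.
    import edge_tts

    # Assumption: list_voices() is an async function returning a list of dicts.
    return await edge_tts.list_voices()
```

In the script from this thread, something like voices = asyncio.run(fetch_voices()) followed by is_known_voice(sti, voices) would show whether the voice is still offered.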
The PDF I got the data from happened to be corrupted at the same time Edge seemingly wasn't available: a classic example of correlation mistaken for causation.