v1.2.1 fails when trying to process a long number
Ubuntu 24.04.1 LTS
I tried to convert this book to audio using the web application (app.py), but it fails when the text contains a long number.
Generating fragment: История, которая никогда не заканчиваетсяnotes12345678910111213141516171819202122
Error:
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/gradio/queueing.py", line 536, in process_events
response = await route_utils.call_process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/gradio/route_utils.py", line 322, in call_process_api
output = await app.get_blocks().process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/gradio/blocks.py", line 1935, in process_api
result = await self.call_function(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/gradio/blocks.py", line 1520, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 943, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/gradio/utils.py", line 826, in wrapper
response = f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/Work/ebook2audiobook-1.2.1/app.py", line 956, in <lambda>
lambda *args: convert_ebook_to_audio(
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/Work/ebook2audiobook-1.2.1/app.py", line 801, in convert_ebook_to_audio
convert_chapters_to_audio_standard_model(chapters_directory, output_audio_directory, temperature, length_penalty, repetition_penalty, top_k, top_p, speed, enable_text_splitting, target_voice, language)
File "/home/metacodeine/Work/ebook2audiobook-1.2.1/app.py", line 704, in convert_chapters_to_audio_standard_model
tts.tts_to_file(
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/api.py", line 334, in tts_to_file
wav = self.tts(
^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/api.py", line 276, in tts
wav = self.synthesizer.tts(
^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/utils/synthesizer.py", line 389, in tts
outputs = self.tts_model.synthesize(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/tts/models/xtts.py", line 425, in synthesize
return self.full_inference(text, speaker_wav, language, **settings)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/tts/models/xtts.py", line 494, in full_inference
return self.inference(
^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/tts/models/xtts.py", line 540, in inference
text_tokens = torch.IntTensor(self.tokenizer.encode(sent, lang=language)).unsqueeze(0).to(self.device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 682, in encode
txt = self.preprocess_text(txt, lang)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 668, in preprocess_text
txt = multilingual_cleaners(txt, lang)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 582, in multilingual_cleaners
text = expand_numbers_multilingual(text, lang)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 563, in expand_numbers_multilingual
text = re.sub(_number_re, lambda m: _expand_number(m, lang), text)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/re/__init__.py", line 186, in sub
return _compile(pattern, flags).sub(repl, string, count)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 563, in <lambda>
text = re.sub(_number_re, lambda m: _expand_number(m, lang), text)
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 543, in _expand_number
return num2words(int(m.group(0)), lang=lang if lang != "cs" else "cz")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/num2words/__init__.py", line 104, in num2words
return getattr(converter, 'to_{}'.format(to))(number, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/num2words/lang_RU.py", line 295, in to_cardinal
return self._int2word(int(n), cardinal=True, case=case,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/num2words/lang_RU.py", line 389, in _int2word
self.pluralize(x, get_thousands_elements(i, case)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/metacodeine/miniconda/lib/python3.12/site-packages/num2words/lang_RU.py", line 236, in get_thousands_elements
return THOUSANDS[num][CASE_INDEXES[case]]
~~~~~~~~~^^^^^
KeyError: 11
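For context, a minimal sketch of why the key is 11. It assumes, based only on the traceback above, that num2words looks up a scale name (thousand, million, ...) per 3-digit group and that its Russian table has no entry for group index 11. The helper name `thousands_group_indexes` is hypothetical and is not part of num2words.

```python
def thousands_group_indexes(number: int) -> list[int]:
    """Return the indexes of the non-zero 3-digit groups of `number`.

    Index 0 is the units group, 1 is thousands, 2 is millions, and so on.
    Hypothetical helper for illustration only.
    """
    indexes = []
    index = 0
    while number > 0:
        number, chunk = divmod(number, 1000)
        if chunk:
            indexes.append(index)
        index += 1
    return indexes


# The digit run from the failing fragment has 35 digits, so it spans
# 12 groups of three and its highest group index is 11 (10**33).
long_number = 12345678910111213141516171819202122
highest_index = max(thousands_group_indexes(long_number))
print(len(str(long_number)), highest_index)
```

The highest group index matches the key in the `KeyError`, which suggests the lookup failed on the leading group of the number.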
ru language?
@ROBERT-MCDOWELL Yes
Did it fail at the start, before any conversion?
@ROBERT-MCDOWELL It worked for some time. I think it finished the first chapter before failing, after about 2-3 minutes on my machine with a GeForce 4060. I can't share logs right now, but I'm sure that's how it went.
fixed for v2.0.0, thanks for your report
@ROBERT-MCDOWELL Thank you for help!
The issue was the missing space between letters and numbers, and between Cyrillic and Latin characters, which caused a mess for coqui-tts. Also, most major TTS models support no more than 4 digits when pronouncing "thousand....", which forces us to split long numbers into groups of 4.
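A rough sketch of that kind of preprocessing, using only the standard `re` module. This is not the actual v2.0.0 code: the function names, the regular expressions, and the choice to split from the left are all assumptions made for illustration.

```python
import re

# Zero-width boundary between a Cyrillic letter and a Latin letter (either order).
_CYRILLIC_LATIN = re.compile(
    r"(?<=[\u0400-\u04FF])(?=[A-Za-z])|(?<=[A-Za-z])(?=[\u0400-\u04FF])"
)
# Zero-width boundary between any letter and a digit (either order).
_LETTER_DIGIT = re.compile(r"(?<=[^\W\d_])(?=\d)|(?<=\d)(?=[^\W\d_])")
# Digit runs longer than 4 digits.
_LONG_NUMBER = re.compile(r"\d{5,}")


def split_long_number(match: re.Match, size: int = 4) -> str:
    """Split a digit run into space-separated chunks of at most `size` digits."""
    digits = match.group(0)
    return " ".join(digits[i:i + size] for i in range(0, len(digits), size))


def normalize_for_tts(text: str) -> str:
    """Hypothetical cleanup step run before handing text to the TTS engine."""
    text = _CYRILLIC_LATIN.sub(" ", text)   # "заканчиваетсяnotes" -> "заканчивается notes"
    text = _LETTER_DIGIT.sub(" ", text)     # "notes123" -> "notes 123"
    return _LONG_NUMBER.sub(split_long_number, text)


fragment = "заканчиваетсяnotes12345678910111213141516171819202122"
print(normalize_for_tts(fragment))
# заканчивается notes 1234 5678 9101 1121 3141 5161 7181 9202 122
```

One caveat of this sketch: a chunk that starts with zeros (for example `0123`) would lose them if it is later converted with `int()`, so a real implementation may need to treat such chunks separately.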
please check the result below
test.zip
@ROBERT-MCDOWELL Yes, the audio sounds correct
Version 2.0 has been officially released, which should fix the issue you're having.
Marking this as closed.
you can re-open if you still have the issue :)