DrewThomasson/ebook2audiobook

v1.2.1 fails when trying to process a long number

Closed this issue · 9 comments

Ubuntu 24.04.1 LTS

I tried to convert this book to audio using the web application (app.py), but it fails when the text contains a long number.

Generating fragment: История, которая никогда не заканчиваетсяnotes12345678910111213141516171819202122

Error:

  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/gradio/queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/gradio/route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/gradio/blocks.py", line 1935, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/gradio/blocks.py", line 1520, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 943, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/gradio/utils.py", line 826, in wrapper
    response = f(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/Work/ebook2audiobook-1.2.1/app.py", line 956, in <lambda>
    lambda *args: convert_ebook_to_audio(
                  ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/Work/ebook2audiobook-1.2.1/app.py", line 801, in convert_ebook_to_audio
    convert_chapters_to_audio_standard_model(chapters_directory, output_audio_directory, temperature, length_penalty, repetition_penalty, top_k, top_p, speed, enable_text_splitting, target_voice, language)
  File "/home/metacodeine/Work/ebook2audiobook-1.2.1/app.py", line 704, in convert_chapters_to_audio_standard_model
    tts.tts_to_file(
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/api.py", line 334, in tts_to_file
    wav = self.tts(
          ^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/api.py", line 276, in tts
    wav = self.synthesizer.tts(
          ^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/utils/synthesizer.py", line 389, in tts
    outputs = self.tts_model.synthesize(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/tts/models/xtts.py", line 425, in synthesize
    return self.full_inference(text, speaker_wav, language, **settings)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/tts/models/xtts.py", line 494, in full_inference
    return self.inference(
           ^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/tts/models/xtts.py", line 540, in inference
    text_tokens = torch.IntTensor(self.tokenizer.encode(sent, lang=language)).unsqueeze(0).to(self.device)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 682, in encode
    txt = self.preprocess_text(txt, lang)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 668, in preprocess_text
    txt = multilingual_cleaners(txt, lang)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 582, in multilingual_cleaners
    text = expand_numbers_multilingual(text, lang)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 563, in expand_numbers_multilingual
    text = re.sub(_number_re, lambda m: _expand_number(m, lang), text)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/re/__init__.py", line 186, in sub
    return _compile(pattern, flags).sub(repl, string, count)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 563, in <lambda>
    text = re.sub(_number_re, lambda m: _expand_number(m, lang), text)
                                        ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 543, in _expand_number
    return num2words(int(m.group(0)), lang=lang if lang != "cs" else "cz")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/num2words/__init__.py", line 104, in num2words
    return getattr(converter, 'to_{}'.format(to))(number, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/num2words/lang_RU.py", line 295, in to_cardinal
    return self._int2word(int(n), cardinal=True, case=case,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/num2words/lang_RU.py", line 389, in _int2word
    self.pluralize(x, get_thousands_elements(i, case)))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/metacodeine/miniconda/lib/python3.12/site-packages/num2words/lang_RU.py", line 236, in get_thousands_elements
    return THOUSANDS[num][CASE_INDEXES[case]]
           ~~~~~~~~~^^^^^
KeyError: 11
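
A rough way to see where the `11` may come from: the failing frame indexes `THOUSANDS[num]`, which suggests the converter walks the number in groups of three digits and looks up a scale word (thousand, million, ...) for each group. The sketch below only counts those groups for the digit run in the fragment above. That the whole digit run reaches `num2words` as a single number, and that the Russian table stops before index 11, are assumptions inferred from the traceback and not checked against the library source.

```python
# Digit run copied from the "Generating fragment" line above.
number = "12345678910111213141516171819202122"

# Number of three-digit groups, using ceiling division.
groups = -(-len(number) // 3)

# Groups are indexed from 0 (units), 1 (thousands), 2 (millions), ...
# so the leftmost group has index groups - 1.
highest_index = groups - 1

print(len(number), groups, highest_index)
```

If the assumption holds, the leftmost group of this 35-digit number needs a scale word at index 11, which matches the key reported in the `KeyError`.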

Is this with the ru language?

Did it fail at the start, before any conversion?

@ROBERT-MCDOWELL It worked for a while. I think it finished the first chapter before failing, after about 2-3 minutes on my machine with a GeForce 4060. I can't share logs right now, but I'm sure that's what happened.

Fixed for v2.0.0, thanks for your report.

@ROBERT-MCDOWELL Thank you for your help!

The issue was missing spaces between letters and numbers, and between Cyrillic and Latin text, which caused a mess for coqui-tts. Most major TTS models cannot pronounce numbers longer than 4 digits ("thousand..."), which forces us to split long numbers into groups of 4.
Please check the result below.
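
The idea described in this comment can be sketched as a small preprocessing step. This is only an illustration under stated assumptions, not the code that shipped in v2.0.0: the function name `split_long_numbers` and the `max_digits` parameter are hypothetical, and the Cyrillic/Latin separation is left out for brevity.

```python
import re


def split_long_numbers(text: str, max_digits: int = 4) -> str:
    """Hypothetical helper: space out letter/digit boundaries and
    break long digit runs into groups of at most max_digits."""
    # Insert a space where a letter is directly followed by a digit.
    text = re.sub(r"([^\W\d_])(\d)", r"\1 \2", text)
    # Insert a space where a digit is directly followed by a letter.
    text = re.sub(r"(\d)([^\W\d_])", r"\1 \2", text)

    def _chunk(match: re.Match) -> str:
        digits = match.group(0)
        # Slice the run left to right into pieces of max_digits.
        return " ".join(
            digits[i:i + max_digits]
            for i in range(0, len(digits), max_digits)
        )

    # Only digit runs longer than max_digits are rewritten.
    return re.sub(rf"\d{{{max_digits + 1},}}", _chunk, text)


print(split_long_numbers("notes12345678910"))
```

One caveat with this approach: a group that starts with zeros (for example `0012`) would lose them if it is later converted with `int()`, so a real implementation would need to decide how such groups should be read aloud.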
test.zip

@ROBERT-MCDOWELL Yes, the audio sounds correct

Version 2.0 has been officially released, which should fix the issue you're having.

Marking this as closed.

You can re-open it if you still have the issue :)