DrewThomasson/VoxNovel

Trouble again to convert epub

Closed this issue · 33 comments

Hello,
After performing the adjustments mentioned here

#4

I still experience some issues (see code at the end).
Any idea about what the problem may be?
Thanks!

$ python3.11 gui_run.py
using device cpu
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /home/lorenzo/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
1% Converting input to HTML...
InputFormatPlugin: EPUB Input running
on /home/lorenzo/audiobook-generation/Never_split_the_difference.epub
Found HTML cover titlepage.xhtml
Parsing all content...
34% Running transforms on e-book...
Merging user specified metadata...
Detecting structure...
Detected chapter: CHAPTER 1
Detected chapter: CHAPTER 2
Detected chapter: CHAPTER 3
Detected chapter: CHAPTER 4
Detected chapter: CHAPTER 5
Detected chapter: CHAPTER 6
Detected chapter: CHAPTER 7
Detected chapter: CHAPTER 8
Detected chapter: CHAPTER 9
Detected chapter: CHAPTER 10
Flattening CSS and remapping font sizes...
Source base font size is 11.99998pt
Removing fake margins...
Cleaning up manifest...
Trimming unused files from manifest...
Creating TXT Output...
67% Running TXT Output plugin
Converting XHTML to TXT...
TXT output written to /home/lorenzo/audiobook-generation/Never_split_the_difference.txt
Output saved to /home/lorenzo/audiobook-generation/Never_split_the_difference.txt
{'pipeline': 'entity,quote,supersense,event,coref', 'model': 'big'}
Exception in Tkinter callback
Traceback (most recent call last):
  File "/home/lorenzo/miniconda/lib/python3.11/tkinter/__init__.py", line 1948, in __call__
    return self.func(*args)
           ^^^^^^^^^^^^^^^^
  File "/home/lorenzo/VoxNovel/gui_run.py", line 427, in process_file
    booknlp = BookNLP("en", model_params)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lorenzo/miniconda/lib/python3.11/site-packages/booknlp/booknlp.py", line 14, in __init__
    self.booknlp=EnglishBookNLP(model_params)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lorenzo/miniconda/lib/python3.11/site-packages/booknlp/english/english_booknlp.py", line 148, in __init__
    self.entityTagger=LitBankEntityTagger(self.entityPath, tagsetPath)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lorenzo/miniconda/lib/python3.11/site-packages/booknlp/english/entity_tagger.py", line 22, in __init__
    self.model.load_state_dict(torch.load(model_file, map_location=device))
  File "/home/lorenzo/miniconda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2153, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Tagger:
    Unexpected key(s) in state_dict: "bert.embeddings.position_ids".
Traceback (most recent call last):
  File "/home/lorenzo/VoxNovel/gui_run.py", line 487, in <module>
    filter_and_correct_quotes(file_path)
  File "/home/lorenzo/VoxNovel/gui_run.py", line 468, in filter_and_correct_quotes
    with open(file_path, 'r', encoding='utf-8') as file:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'Working_files/Book/Book.quotes'

I'll try making a Debian virtual machine to further test this giving it 8gb ram.

But in the meantime:

-Make sure it's not running on an ARM processor.

-Try all these pip installs in the order given and then try re-running the program.

  • The pip installs in the readme's install instructions are in a very specific order for a reason: the dependencies get messed up if they're done in a different order :/

pip install styletts2
pip install tts==0.21.3
pip install booknlp
pip install -r Ubuntu_requirements.txt
python -m spacy download en_core_web_sm

-or try running it in a python 3.10 miniconda environment and not python 3.11.

-do a Git pull to update the local voxnovel repo.

These are just the main fixes that might work that I can think of at the moment, while I get it running in a Debian 12 virtual machine.

Hello,
I made a new fresh installation with python 3.10.
Here are the errors I collected (though allegedly everything was installed). I followed 100% the order of the installation instructions

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
styletts2 0.1.6 requires gruut<3.0.0,>=2.3.4, but you have gruut 2.2.3 which is incompatible.
styletts2 0.1.6 requires librosa<0.11.0,>=0.10.1, but you have librosa 0.10.0 which is incompatible.

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tts 0.21.3 requires transformers>=4.33.0, but you have transformers 4.30.0 which is incompatible.
styletts2 0.1.6 requires gruut<3.0.0,>=2.3.4, but you have gruut 2.2.3 which is incompatible.
styletts2 0.1.6 requires librosa<0.11.0,>=0.10.1, but you have librosa 0.10.0 which is incompatible.
styletts2 0.1.6 requires transformers<5.0.0,>=4.36.0, but you have transformers 4.30.0 which is incompatible.

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tts 0.21.3 requires trainer>=0.0.32, but you have trainer 0.0.31 which is incompatible.
tts 0.21.3 requires transformers>=4.33.0, but you have transformers 4.30.0 which is incompatible.
langchain-core 0.1.27 requires packaging<24.0,>=23.2, but you have packaging 23.1 which is incompatible.
styletts2 0.1.6 requires accelerate<0.26.0,>=0.25.0, but you have accelerate 0.24.1 which is incompatible.
styletts2 0.1.6 requires einops<0.8.0,>=0.7.0, but you have einops 0.6.1 which is incompatible.
styletts2 0.1.6 requires filelock<3.13,>=3.12.4, but you have filelock 3.13.1 which is incompatible.
styletts2 0.1.6 requires gruut<3.0.0,>=2.3.4, but you have gruut 2.2.3 which is incompatible.
styletts2 0.1.6 requires librosa<0.11.0,>=0.10.1, but you have librosa 0.10.0 which is incompatible.
styletts2 0.1.6 requires torch<3.0.0,>=2.1.2, but you have torch 2.1.0 which is incompatible.
styletts2 0.1.6 requires torchaudio<3.0.0,>=2.1.2, but you have torchaudio 2.1.0 which is incompatible.
styletts2 0.1.6 requires tqdm<5.0.0,>=4.66.1, but you have tqdm 4.64.1 which is incompatible.
styletts2 0.1.6 requires transformers<5.0.0,>=4.36.0, but you have transformers 4.30.0 which is incompatible.
styletts2 0.1.6 requires typing-extensions<5.0.0,>=4.9.0, but you have typing-extensions 4.8.0 which is incompatible.

Followed by

python gui_run.py
Traceback (most recent call last):
  File "/home/lorenzo/VoxNovel/gui_run.py", line 104, in <module>
    from bs4 import BeautifulSoup
ModuleNotFoundError: No module named 'bs4'

I simply pip installed bs4 and used your new gui_run.py script. Apparently now things work, but the expected time to process an epub on CPU only (with multiple narrators) is above 2 days and I killed the process for now.
I cannot help you, but it must be possible to use my GPU.
A simpler project

https://github.com/aedocw/epub2tts

allows me to use my GPU and I can process a book in a matter of 2-3 hours (only single narrator supported).
Cross my fingers and keep up the good work!

THANK YOU FOR YOUR FEEDBACK!!✨

I was wondering tho... did you say it would take 2 days for the BOOKNLP pre processing???

Or the actual audiobook generation?

Cause my estimator for time left on the audiobook generation in the GUI is... BAD (at first; it averages out over time).

A good way to tell if it's using the CPU or GPU for the audiobook generation:

If you look in the terminal you should be able to see the processing time for each "sentence" as it's generating the audio. If it's something like 30-50 seconds then it's the CPU; if it's a lot less then I can GUARANTEE that it's using the GPU.

It should automatically use the GPU cause it's just calling the coqui TTS API or the StyleTTS2 API, depending on the cloning model you select from the dropdown at the top of the GUI.
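A quick way to double-check which device torch can see, independent of the per-sentence timings, is sketched below. It reports "cuda" when PyTorch finds a usable GPU and "cpu" otherwise. `visible_device` is a made-up helper name, not something from VoxNovel, and it assumes PyTorch is what the TTS backend runs on.

```python
def visible_device() -> str:
    """Return "cuda" if PyTorch can see a usable GPU, otherwise "cpu".

    Hypothetical helper for illustration; not part of VoxNovel.
    """
    try:
        import torch
    except ImportError:
        # Without PyTorch installed nothing can run on the GPU through it.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print("torch sees:", visible_device())
```

Running this inside the same conda env as the app matters, since a different env may have a CPU-only torch build.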

Also for double checking such.... I'd love to know what book ur trying to do so I can attempt to generate an audiobook from it in my own free time.

🥺

Hi!
Estimated time around 1 day and 17 hours for the audiobook generation.
Vox is using the CPU now (I see it with a simple "top" command: around 5 cores are being used).
I use your modified script, because as I wrote, with the original gui Python file, the app crashes on my system.

The book is "Never split the difference".

See

https://www.amazon.com/Never-Split-Difference-Negotiating-Depended/dp/0062407805

and get it here (within a couple of hours)

https://e.pcloud.link/publink/show?code=XZgJPnZscy4fXED3wztyGoQs9tE4bQHMykX

PS: I changed my mind and I am now trying to generate the audiobook which will be ready some time tomorrow to see if I can get Vox to work on my box (albeit relying on CPUs). I would love to carry out the same process in a matter of a few hours and I am looking forward to using the GPU.

EYYYY! Nice, I'll get back to you in 10 hours with my attempt at that book

and hopefully will have a video guide that summarizes all of the many functions of the gui and install such

HEADS-UP!!

-YOU NEED TO HAVE FFMPEG INSTALLED. I found that out with the Debian virtual machine I got it running on.

-If it isn't installed, then by the time it gets to the end of generating it won't combine the audio correctly into the final m4b file lol.

yes | sudo apt-get install ffmpeg
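Since a missing ffmpeg only shows up at the very end of a long run, a pre-flight check could catch it early. This is just a sketch using the standard library; `ffmpeg_available` is a hypothetical name and nothing here is taken from the VoxNovel code.

```python
import shutil


def ffmpeg_available() -> bool:
    """True if an `ffmpeg` executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


if not ffmpeg_available():
    print("ffmpeg not found on PATH; install it before starting a long generation run")
```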

Here you go, it took around 8 hours to generate this 9 and a half hour audiobook on my GPU.
Temp file link (the file will be deleted from the file sharing server after the first download):

https://file.io/R9RnzIt5rrHc

-Btw, if on your end it's managing to generate your audiobook in less than a week then I can GUARANTEE that it's generating the audio with the GPU.

-The fix code I gave you only turned off the GPU access temporarily for the BookNLP part after all,

so the second line you inserted at the end of the BookNLP processing should have turned the GPU access back ON....

Hi, and thanks for your help. In about 8 hours the audio book will be done on my machine and we will talk then.

Hi! The process is still running, but it will definitely take less than one week. However, I do not think it is running on the GPU at all.
See the output of top (a bit scrambled)

%Cpu(s): 57.9 us, 5.3 sy, 0.0 ni, 36.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 32029.5 total, 4327.1 free, 11237.1 used, 17025.9 buff/cache
MiB Swap: 19073.0 total, 19073.0 free, 0.0 used. 20792.5 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10367 lorenzo 0 0 13.6g 8.0g 348492 S 576.5 25.7 79,42 python

So, multiple cores are involved.
Furthermore, see the output of

$ watch -n 1 nvidia-smi

Every 1.0s: nvidia-smi fenix: Wed Feb 28 09:03:12 2024

Wed Feb 28 09:03:12 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05 Driver Version: 525.147.05 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:05:00.0 On | N/A |
| 34% 54C P8 12W / 151W | 390MiB / 8192MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 845 G /usr/lib/xorg/Xorg 184MiB |
| 0 N/A N/A 23629 G ...e/lorenzo/firefox/firefox 199MiB |
| 0 N/A N/A 24209 G Telegram 3MiB

So you see there is no python process or anything else taking up a significant chunk of the GPU memory.
Hope it helps.

Thx! I'll see if I can make a Low-V-ram version to fix this issue.

I see what you mean tho hmm

Thanks. I do not understand why you refer to this as a low-VRAM version. Did you not run a test using 8GB of VRAM, which is what I have on my system? As I see it, I should not encounter any particular difficulty even with my seasoned hardware.

Hi!
I was very close (20 min) to finishing generating the audiobook, when the app crashed.
Here is the error message

Combined audio saved to Working_files/generated_audio_clips/audio_2267_11.wav
Deleted: Working_files/temp/1.wav
Voice actor: mol.F, en
tts_models/multilingual/multi-dataset/xtts_v2 is multi-dataset and multilingual
 > Text splitted to sentences.
['12 reciprocity, 133, 148, 160, 168, 193, 196, 206, 207 Regini, Chuck, 98 Rogers, Carl, 97 Rowling, J. K., 256 Ruby Ridge siege, Idaho, 13 Rule of Three, 177–78, 186 Rust, Kevin, 166 Sabaya, Abu, 98–105, 142–43, 144, 145 Sadat, Anwar, 133']
Exception in thread Thread-1 (generate_audio):
Traceback (most recent call last):
  File "/home/lorenzo/miniconda/envs/VoxNovel/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/home/lorenzo/miniconda/envs/VoxNovel/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/lorenzo/VoxNovel/gui_run.py", line 1904, in generate_audio
    tts.tts_to_file(text=fragment, file_path=f"Working_files/temp/{temp_count}.wav", speaker_wav=list_reference_files(voice_actor), language=language_code)
  File "/home/lorenzo/miniconda/envs/VoxNovel/lib/python3.10/site-packages/TTS/api.py", line 432, in tts_to_file
    wav = self.tts(
  File "/home/lorenzo/miniconda/envs/VoxNovel/lib/python3.10/site-packages/TTS/api.py", line 364, in tts
    wav = self.synthesizer.tts(
  File "/home/lorenzo/miniconda/envs/VoxNovel/lib/python3.10/site-packages/TTS/utils/synthesizer.py", line 383, in tts
    outputs = self.tts_model.synthesize(
  File "/home/lorenzo/miniconda/envs/VoxNovel/lib/python3.10/site-packages/TTS/tts/models/xtts.py", line 397, in synthesize
    return self.inference_with_config(text, config, ref_audio_path=speaker_wav, language=language, **kwargs)
  File "/home/lorenzo/miniconda/envs/VoxNovel/lib/python3.10/site-packages/TTS/tts/models/xtts.py", line 419, in inference_with_config
    return self.full_inference(text, ref_audio_path, language, **settings)
  File "/home/lorenzo/miniconda/envs/VoxNovel/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/lorenzo/miniconda/envs/VoxNovel/lib/python3.10/site-packages/TTS/tts/models/xtts.py", line 488, in full_inference
    return self.inference(
  File "/home/lorenzo/miniconda/envs/VoxNovel/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/lorenzo/miniconda/envs/VoxNovel/lib/python3.10/site-packages/TTS/tts/models/xtts.py", line 535, in inference
    text_tokens.shape[-1] < self.args.gpt_max_text_tokens
AssertionError: ❗ XTTS can only generate text with a maximum of 400 tokens.

As I wrote, I used your modified script and everything was done via CPU.

Finally, despite the errors, the app created an audio book, but I have not checked it in detail. It is here
https://e.pcloud.link/publink/show?code=XZ3qCnZM0nOpUaUHHzxJ9gAV99l9p96MQd7

Ignore this, you were right. I'm gonna make a separate version you'll be able to use your GPU with.

Yeah, I see that it seems to be running on only the CPU on your end, sorry about the miscommunication on my part...

If you have a better phrase to describe offloading the GPU's work to the CPU for the book processing part, that'd be great, cause I can't really think of anything good off the top of my head..... :// 🤔

You might be right that I'm using the phrase "Low_vram version" incorrectly,

  • The audio generation part in particular works well anyway with only 4GB VRAM without my modifications... so you're right, Low-vram isn't the right phrase for how I'm trying to fix it...
    • it's just that the BookNLP part is not optimized well..
    • To be honest I'm not exactly sure what to call it when I offload the BookNLP part to the CPU, for when the GPU isn't enough for BookNLP.
    • ALTHOUGH I might be able to make BookNLP process a chapter at a time instead of the whole book, making it require less VRAM.. but I'll have to check how that affects the quality of the BookNLP output.

About your error:

Yeah I got that error too when generating,

  • I think it has to do with how many commas are in that sentence, making it count as a ton of tokens
    • Probably cause it sees commas as pauses, so it would have made a giant chunk of audio that's super long?

I'll see if I can make it split those again into smaller chunks, cause that's super weird....

I'll be doing a new test with my Low_vram implementation plugging my GeForce GTX 980 (4GB VRAM) into my computer with 32GB RAM later today.

  • So my GTX 980 instead of my usual 3060 (12GB VRAM).

  • I did run it on my GeForce GTX 980 (4GB VRAM) before
    But it was quite a while ago, so I'll be testing it again

How the "Low_Vram" mode will work in the updated script:

  • The test file is in the GitHub repo as the file "test_low_vram_gui_run.py"
    • You don't have to test it, I'll be testing it later today, just thought you might want to know lol.
  • The Low-vram version will just make the GPU invisible to force CPU for the BookNLP part, and then allow torch to see the GPU again for the audio generation, as done before...
    • It'll be a checkbox at the beginning of the GUI
      • If the checkbox is clicked for low VRAM then it'll give you a popup when you click the generate audio button, telling you what device torch sees.



  • I'll also be comparing the runtimes for this new script on my computer for:
    • Low VRAM mode activated with 4GB VRAM for BookNLP and audio generation
    • Run with only CPU and no GPU even plugged into the computer for BookNLP and audio generation

This is the popup that showed up on my Intel Mac when I clicked "generate audio" in the GUI. I don't have an Nvidia GPU on my Intel Mac tho... so I've got to test it later on my desktop :/
[image]

In my testing I see what you mean

The device doesn't change

You're absolutely right, my bad

I'll get you a fix in an hour or so.

HI!!

You were very right about your issue, once again apologies.

  • Turns out once you set the device it stays like that for the whole program.
  • So, I've split the program into two Python programs: one CPU and one GPU. I've tested this on my 4GB VRAM GPU and this solution works, at least on my end. I really hope it works on your end. 🙏

To run the fix I've tailor-made for your low-VRAM GPU situation, follow these steps in order:

  1. Book Processing (CPU Only):

    • Script: 1CPU_Book_processing.py
    • This script handles the task of only processing the book using BookNLP, specifically forcing it to run on the CPU.
  2. Audio Generation (GPU Only):

    • Script: 2GPU_Audio_generation.py
    • This script is dedicated to only generating audio with the GPU and should be run after completing the book processing with 1CPU_Book_processing.py.
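Why two scripts rather than one: hiding the GPU is typically a per-process setting (for example through the CUDA_VISIBLE_DEVICES environment variable, which CUDA reads at startup), and the finding in this thread is that the device stays fixed once set. A minimal sketch of a launcher that runs both stages, each in its own process, might look like this. The launcher, `stage_env`, `run_stage` and `run_pipeline` are hypothetical; only the two script names come from the steps above, and the real scripts may force the device in their own way.

```python
import os
import subprocess
import sys


def stage_env(hide_gpu: bool, base_env=None) -> dict:
    """Build the environment for one stage (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    if hide_gpu:
        # An empty value hides every CUDA device from the child process.
        env["CUDA_VISIBLE_DEVICES"] = ""
    return env


def run_stage(script: str, hide_gpu: bool) -> int:
    """Run one script in a fresh interpreter and return its exit code."""
    return subprocess.run([sys.executable, script], env=stage_env(hide_gpu)).returncode


def run_pipeline() -> None:
    """CPU-only book processing first, then audio generation with the GPU visible."""
    if run_stage("1CPU_Book_processing.py", hide_gpu=True) == 0:
        run_stage("2GPU_Audio_generation.py", hide_gpu=False)
```

Calling `run_pipeline()` from the VoxNovel directory would run the two steps back to back; running the scripts by hand, as described above, does the same thing.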

Performance Results

Upon running a mini test with an epub file using the above setup, the following performance metrics were observed:

Task | Configuration | Time (seconds)
Book Processing | GPU only (GeForce GTX 980), 4GB VRAM, 32GB RAM, Intel i7-8700K | 2.922
Audio Generation | GPU only (GeForce GTX 980), 4GB VRAM, 32GB RAM, Intel i7-8700K | 128.48
Book Processing | CPU only, 32GB RAM, Intel i7-8700K | 4.964
Audio Generation | CPU only, 32GB RAM, Intel i7-8700K | 391.4227

-Also, processing your book on CPU took 321 seconds on my computer
-Currently trying to do your whole book again but on 4gb ram, gonna see the time for that, plus with a fix for the sentence splitting issue we were having before.
-This fix has now been added to the readme.

Thanks a lot, I will be able to give it a go later on.
Do I just need a git pull and then try your scripts?

Yup, the scripts have been added and are in the main directory.
✨
:)

Hi!
At this point I am sure we will get there, but there is still some bumpy road ahead.

$python3.11 1CPU_Book_processing.py
Low Vram mode turned on : cpu
using device cpu
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /home/lorenzo/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
1% Converting input to HTML...
InputFormatPlugin: EPUB Input running
on /home/lorenzo/audiobook-generation/Never_split_the_difference.epub
Found HTML cover titlepage.xhtml
Parsing all content...
34% Running transforms on e-book...
Merging user specified metadata...
Detecting structure...
Detected chapter: CHAPTER 1
Detected chapter: CHAPTER 2
Detected chapter: CHAPTER 3
Detected chapter: CHAPTER 4
Detected chapter: CHAPTER 5
Detected chapter: CHAPTER 6
Detected chapter: CHAPTER 7
Detected chapter: CHAPTER 8
Detected chapter: CHAPTER 9
Detected chapter: CHAPTER 10
Flattening CSS and remapping font sizes...
Source base font size is 11.99998pt
Removing fake margins...
Cleaning up manifest...
Trimming unused files from manifest...
Creating TXT Output...
67% Running TXT Output plugin
Converting XHTML to TXT...
TXT output written to /home/lorenzo/audiobook-generation/Never_split_the_difference.txt
Output saved to /home/lorenzo/audiobook-generation/Never_split_the_difference.txt
{'pipeline': 'entity,quote,supersense,event,coref', 'model': 'big'}
Exception in Tkinter callback
Traceback (most recent call last):
  File "/home/lorenzo/miniconda/lib/python3.11/tkinter/__init__.py", line 1948, in __call__
    return self.func(*args)
           ^^^^^^^^^^^^^^^^
  File "/home/lorenzo/VoxNovel/1CPU_Book_processing.py", line 446, in process_file
    booknlp = BookNLP("en", model_params)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lorenzo/miniconda/lib/python3.11/site-packages/booknlp/booknlp.py", line 14, in __init__
    self.booknlp=EnglishBookNLP(model_params)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lorenzo/miniconda/lib/python3.11/site-packages/booknlp/english/english_booknlp.py", line 148, in __init__
    self.entityTagger=LitBankEntityTagger(self.entityPath, tagsetPath)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lorenzo/miniconda/lib/python3.11/site-packages/booknlp/english/entity_tagger.py", line 22, in __init__
    self.model.load_state_dict(torch.load(model_file, map_location=device))
  File "/home/lorenzo/miniconda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2153, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Tagger:
    Unexpected key(s) in state_dict: "bert.embeddings.position_ids".

In other words: I did a git pull and then launched the first script, but things crashed. Is there anything else I should have done?
I see this error referenced in several places when I google it, but to be fair it goes over my head.

lol realizing the first part of this was a bit loopy and repetitive

Upon second inspection of your error, it might also be this:

I'm seeing python3.11 in your command, what happened to python3.10?

Your previous attempt at running it, the one that didn't have this issue, used VoxNovel's default Python, 3.10, right?

Why... do I see $python3.11 1CPU_Book_processing.py and not $python 1CPU_Book_processing.py?

If you're already in the VoxNovel conda env then you should just be able to run it with python 1CPU_Book_processing.py, cause that'll run it with the version of Python that the VoxNovel conda env uses.

Try:

python 1CPU_Book_processing.py

instead of

python3.11 1CPU_Book_processing.py

while in the VoxNovel conda env?

-PS: if that's not the issue, then see below:

I think I've run into that before,

Try pip install booknlp, and then try running it again

If that doesn't work then try the pip installs again in order from the readme install instructions

pip install bs4
pip install styletts2
pip install tts==0.21.3
pip install booknlp
pip install -r Ubuntu_requirements.txt

I'm not sure what causes it, but I think it's just dependencies or something.
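Some background that may explain the dependency angle: as far as I can tell, "bert.embeddings.position_ids" is a buffer that older transformers releases stored in BERT checkpoints and newer releases no longer expect, so a checkpoint saved with one version can trip the strict key check in `load_state_dict` under another. Reinstalling the packages in the readme order puts the expected versions back. The sketch below only illustrates what "unexpected key" means, using plain dicts as stand-ins for real state dicts; `drop_unexpected_keys` is a hypothetical helper, not something BookNLP or VoxNovel provides.

```python
def drop_unexpected_keys(state_dict: dict, expected_keys: set) -> tuple[dict, list]:
    """Split a checkpoint into the entries a model expects and the leftover key names.

    Hypothetical helper: plain dicts stand in for torch state dicts here.
    """
    kept = {k: v for k, v in state_dict.items() if k in expected_keys}
    dropped = sorted(k for k in state_dict if k not in expected_keys)
    return kept, dropped


# Stand-in data: lists instead of tensors, two keys instead of hundreds.
checkpoint = {
    "bert.embeddings.position_ids": [0, 1, 2],
    "bert.embeddings.word_embeddings.weight": [[0.1, 0.2]],
}
model_keys = {"bert.embeddings.word_embeddings.weight"}

filtered, dropped = drop_unexpected_keys(checkpoint, model_keys)
print("keys the model does not expect:", dropped)
```

With a real model, the filtered dict is what you would hand to `load_state_dict`; in this thread, though, reinstalling booknlp was what actually cleared the error.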

Hi!
It appears a previous comment of mine got lost.
Sorry for the python 3.11 thing (my bad), but the real fix was to reinstall booknlp. Apparently everything works now, but I will test on another epub. Last question (I hope): to handle languages other than English, I just have to select the corresponding language in the interface and that is it, is it not?

Thanks!

Yes, BUT sadly BookNLP only identifies who said what for English right now..

BUT it can be used to change the accent of each character, or if for some reason a character speaks only in Spanish you could use it for that lol

-Also, I've updated the readme to describe all of the functions of each part of the GUI

🎙️ Accents you can give each character with the default cloning model (XTTS). They also allow the characters to speak these languages, but the quotation attribution won't work correctly for anything that's not English: English (en), Spanish (es), French (fr), German (de), Italian (it), Portuguese (pt), Polish (pl), Turkish (tr), Russian (ru), Dutch (nl), Czech (cs), Arabic (ar), Chinese (zh-cn), Japanese (ja), Hungarian (hu), Korean (ko)

Everything is fine and I am about to close this thread. The only thing missing is a cleanup script. It turns out that there were some chapter txt files left over from my previous book, so I generated a Frankenstein monster 😂

oo good call, I think I forgot about that,

-just making sure: so it's making a weird combo of the two books after running the first one, right?

Mmm I'll check that out hmm, I'll see if I can make it pre-wipe the working_folder every time right before it runs the BookNLP script.

developer notes for myself

cause I just thought of this as I was typing and I don't want to forget it lol

And I'll see if I can get a recursive sentence splitter that works better than the current one, to stop the awkward pauses that sometimes show up in lists and such;

Where if the sentence is over 250 in length, or the number of pauses in the sentence is more than the program can handle, then it keeps on splitting it at the middle pause if there is one (if not, then just at the middle) until no piece is over the limit, and then it returns the sentence splits.
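The splitter described in the note above could be sketched roughly like this. It is one reading of the idea, not the code that ended up in VoxNovel: "length" is taken to mean characters, commas, semicolons and colons are assumed to be the "pauses", and `max_pauses=10` is an invented threshold.

```python
PAUSE_CHARS = ",;:"  # assumption: which characters count as pauses


def split_long_sentence(text: str, max_len: int = 250, max_pauses: int = 10) -> list[str]:
    """Recursively split text until every piece is within both limits."""
    text = text.strip()
    if not text:
        return []
    pauses = [i for i, ch in enumerate(text) if ch in PAUSE_CHARS]
    if len(text) <= 1 or (len(text) <= max_len and len(pauses) <= max_pauses):
        return [text]
    middle = len(text) // 2
    # Only pauses that leave text on both sides are usable cut points.
    inner = [i for i in pauses if i < len(text) - 1]
    if inner:
        # Cut right after the pause closest to the middle.
        cut = min(inner, key=lambda i: abs(i - middle)) + 1
    else:
        # No pause available: cut at the last space before the middle, else mid-word.
        space = text.rfind(" ", 0, middle)
        cut = space if space > 0 else middle
    return (split_long_sentence(text[:cut], max_len, max_pauses)
            + split_long_sentence(text[cut:], max_len, max_pauses))


sample = "reciprocity, 133, 148, 160, 168, 193, 196, 206, 207"
for piece in split_long_sentence(sample, max_len=25, max_pauses=3):
    print(repr(piece))
```

Index pages like the one that triggered the 400-token error are mostly commas, so a pause limit is probably what catches them before the length limit does.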

Ok. Let me close this since now the app runs. We can always add extra comments or open a new issue/discussion. Thank you for your support!

But if you're looking to make audiobooks in different languages, I have this beta project that does that with XTTS, BUT only with a single voice :/

I still need to apply the sentence splitting fix to it tho, to fix the thing that happened at the end of your book,

and get a better multilingual sentence splitting method.

https://github.com/DrewThomasson/ebook2audiobookXTTS

Let us say that the first book generated 11 chapter text files and the
second one only 5. Then 6 extra chapters were added to the generated audiobook.

Wait, I'm....confused, what are you trying to say with the chapter thing?

Also added Discussions to this repo, just found out how to do that lol. We can continue this there, that might be a better place now that it's less about bugs :)

Something went wrong with github here

hm : response to -> "Something went wrong with github here"

Update

  • Update: added the cleanup step for the chapter files, now it wipes the temp_ebook directory before processing,
  • added the fix to the normal full script and the two Low Vram files I made for you
    :)

Should fix your weird added chapter thing,

The issue was: it wasn't wiping that folder before, so if you were generating a book with fewer chapters than the last one, that weird Frankenstein book thing would happen.
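For anyone curious what such a pre-run wipe can look like, here is a small sketch. `reset_dir` is a hypothetical helper, and where the temp_ebook directory actually lives inside the project is an assumption; the point is just that the folder is emptied before new chapter files are written.

```python
import shutil
from pathlib import Path


def reset_dir(path) -> Path:
    """Delete the directory if it exists, then recreate it empty."""
    path = Path(path)
    if path.exists():
        shutil.rmtree(path)
    path.mkdir(parents=True)
    return path
```

Calling something like `reset_dir("Working_files/temp_ebook")` right before processing a new book (path assumed) would keep leftover chapters from a previous, longer book out of the next audiobook.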

Super! In the next few days I will explore the generation of non-English audiobooks. Thanks