Feature Requests!

Question

Closed this issue 12 days ago · 26 comments

quantumlump commented 2 months ago

Obviously no expectation that you do these but here are some requests!

1. Batch processing - Allow loading multiple books so that the app starts working on the next book immediately
2. Do not reset the UI on refresh - Right now if I mistakenly reload the web UI, it resets to the default screen (even though the TTS is still working in the terminal). This means I can't download the file from the web UI once it is done (have to find it in the Docker files lol)
3. Ability to set a default output folder for the audiobooks (automatic downloading)

Thx!

DrewThomasson commented a month ago

👀

Answer 1 · 2024-11-01T12:16:06.000Z

1. Done in the next version.
2. Still a to-do indeed. I started developing it but wasn't happy with the result and reverted the change. We need to dig deeper to find a good compromise.
3. Done in the next version: the output folders where all conversions go will be audiobooks/cli and audiobooks/gui/[gradio|host].

Answer 2 · 2024-11-02T23:30:06.000Z

Awesome! Another huge addition would be the ability to use the new F5-TTS model. It seems to have fewer hallucinations.

Answer 3 · 2024-11-02T23:48:31.000Z

Good idea! 👍

We do plan on integrating other TTS services into ebook2audiobook.
But it's being put on the back burner until we figure out a standardized way of integrating many different TTS engines into ebook2audiobook.

Answer 4 · 2024-11-02T23:53:09.000Z

Best of luck!

Answer 5 · 2024-11-03T20:48:44.000Z

About request #2 though,

#38

Are you not able to click the download audiobooks button in the GUI to load download links to all the generated audiobook files in the output folder?
Cause that button should just show everything in the output folder? Irrespective of if the web gui session was re-loaded or not? 🤔

Example from hugging face GUI demo ⬇️

Answer 6 · 2024-11-03T23:02:14.000Z

Okay that is awesome, I never thought about clicking that button 😅. Maybe those should load automatically or work like a drop down menu in the UI? Or maybe I should learn to read better lol

Answer 7 · 2024-11-03T23:04:12.000Z

@scriptpony that's also on the way in the next version

Answer 8 · 2024-11-03T23:13:00.000Z

Short answer: Gradio is extremely restrictive about reloading anything in the GUI.

Easiest is to just give the user a reload/update button at the moment lol

I'll look at renaming the button to make it clearer what it does tho lol

That would probs help
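
For illustration, the reload-button approach comes down to rescanning the output folder each time the button is clicked. A minimal sketch, assuming a hypothetical helper `list_converted_files` and a guessed set of audio extensions (neither is taken from the ebook2audiobook code):

```python
from pathlib import Path

# Hypothetical set of audiobook extensions; adjust to whatever the app writes.
AUDIO_EXTENSIONS = {".m4b", ".mp3", ".wav"}


def list_converted_files(output_dir):
    """Return sorted paths of the audio files currently in output_dir."""
    folder = Path(output_dir)
    if not folder.is_dir():
        # Nothing converted yet, or the folder was never created.
        return []
    return sorted(
        str(p)
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )
```

Because the folder is read on every call, the result is the same whether or not the browser session was reloaded. In a Gradio app a function like this could serve as the callback for the button's click event.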

Answer 9 · 2024-11-03T23:17:53.000Z

Perhaps renamed to

"View Completed Audiobooks" ? 🤔

Or "View Created Audiobooks"?

What do you think? Any suggestions?

Answer 10 · 2024-11-03T23:23:24.000Z

or show converted files

Answer 11 · 2024-11-04T00:41:43.000Z

Show converted files or audiobooks would make most sense to me!

Answer 12 · 2024-11-04T14:48:17.000Z

Changed to "Show Converted Files"

in this git commit

905cb21

:)

Answer 13 · 2024-11-26T05:59:51.000Z

After a lot of trial and error I was able to modify your code to get it working with F5-TTS!

I am running into issues getting it on Hugging Face (I am new to this) but here it is if you wanted to check it out:
https://drive.google.com/file/d/1tWWxoPX66JOoHcZVkf3idx9ZNpU28g9n/view?usp=sharing

Answer 14 · 2024-11-26T06:11:59.000Z

If you make a fork of this repo and then apply your changes, it'll be easier for us to see the changes you made to it

In the meantime we're hard at work on a version 2.0 where the code base is heavily modified 😅

I'll look into your code tho to see how it performs!

We'll probs look into asking you for help in the future when adding it smoothly to ebook2audiobook v2.0

Answer 15 · 2024-11-26T06:32:44.000Z

I'll give you a heads up on the functions we would need you to make once we get our v2.0 stuff finalized 😅

In the meantime here's me testing out your code as a Hugging Face space

https://huggingface.co/spaces/drewThomasson/ebook2audiobook_F5-TTS

I'll probs do some testing with it tomorrow to see if it works

Answer 16 · 2024-11-26T06:52:23.000Z

Awesome! Glad it works on hf. Fair warning the code is hot garbage chatGPT slop but it works so 🤷‍♂️. I just wanted to get something working asap

Super slow on CPU btw, actually idk if it even works on CPU I didn't test that

Answer 17 · 2024-11-28T11:19:33.000Z

It works well on CPU; for sure much slower, but OK. Feature 2 is developed in v2.0.0, but only for a crash on the server side, not the client. That means if the Gradio client loses the connection while the process is still running on the server, it's still not possible to reconnect Gradio to the same process, and I'm not sure it's possible without low-level access to the OS.

Answer 18 · 2024-11-28T18:25:02.000Z

NEXT VERSION!

https://github.com/jondana/eBook_to_Audiobook_with_F5-TTS/tree/main

Batch convert eBooks
Pull the cover image correctly and place it in mp3 "icon" slot (I was having trouble with your .m4b files stopping randomly in my audiobook player?) mp3 seems more stable
Updated inferencing for more natural phrasing with better pauses
UI improvements

I spent way too much time trying to improve the progress bar to show wav stitching but didn't work out
I also tried to make an estimated time to completion feature but also didn't work out (I feel like Gradio should have that feature built in?? total steps/seconds to complete each step)
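
For what it's worth, the "total steps / seconds per step" idea can be sketched independently of Gradio. This is a minimal illustration, assuming a hypothetical function `estimate_remaining_seconds` and a simple average over the steps completed so far; it is not code from either repository:

```python
def estimate_remaining_seconds(steps_done, total_steps, elapsed_seconds):
    """Estimate the time left from the average duration of completed steps.

    Returns None when no step has finished yet, since there is no rate to
    extrapolate from.
    """
    if steps_done <= 0:
        return None
    seconds_per_step = elapsed_seconds / steps_done
    remaining_steps = max(total_steps - steps_done, 0)
    return seconds_per_step * remaining_steps
```

For example, 25 of 100 sentences done after 50 seconds gives 2 seconds per sentence and an estimate of 150 seconds remaining. The estimate is only as good as the assumption that steps take roughly equal time, which may not hold when sentence lengths vary a lot.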

Answer 19 · 2024-11-28T19:35:47.000Z

Don't spend too much effort on the current version. The next version is a full rework of the code base, and integrating what you did into v2 will bring a lot of complications. Better to wait for v2, then you can push a PR.

Answer 20 · 2024-11-28T19:38:15.000Z

Sadly Agreed😅

Answer 21 · 2024-11-28T19:41:32.000Z

Sounds good! I am definitely done for now. Excited to see what you guys come up with

Answer 22 · 2024-11-28T19:43:57.000Z

Updated the hugging face demo of ur code with your new app.py though in the meantime

https://huggingface.co/spaces/drewThomasson/ebook2audiobook_F5-TTS

Answer 23 · 2024-12-27T20:51:09.000Z

Is point 3 already implemented?

Ability to set a default output folder for the audiobooks (automatic downloading)

Answer 24 · 2024-12-27T21:59:57.000Z

@leet1994 yes, it's in ./lib/conf.py, change it to your own.
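
As a rough illustration only, the change might look like the snippet below. The variable name `audiobooks_dir` is the one mentioned in this thread; the path and the way it is built are assumptions, so check the actual file before editing:

```python
import os

# Hypothetical excerpt of ./lib/conf.py. Only the variable name comes from
# this thread; the path below is a placeholder to replace with your own.
audiobooks_dir = os.path.abspath(os.path.join("/path", "to", "my", "audiobooks"))
```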

Answer 25 · 2024-12-27T22:02:54.000Z

@quantumlump

1. DONE
2. DONE, but after a refresh you have to restart the conversion, and it will resume automatically from the next good sentence.
3. DONE, check the "audiobooks_dir" variable in ./lib/conf.py.
Be aware that conf.py will be overwritten by the next git update, so you will have to set it again.
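
To make the resume behaviour in item 2 concrete, here is a minimal sketch of one way such a restart could find where to continue. It assumes a hypothetical layout with one audio file per sentence named by index (`0.wav`, `1.wav`, ...); the real implementation may work quite differently:

```python
from pathlib import Path


def first_unfinished_sentence(sentences, work_dir):
    """Return the index of the first sentence without a usable audio file."""
    folder = Path(work_dir)
    for index, _ in enumerate(sentences):
        # Hypothetical naming scheme: one file per sentence, named by index.
        audio = folder / f"{index}.wav"
        # A missing or zero-byte file is treated as not converted yet.
        if not audio.is_file() or audio.stat().st_size == 0:
            return index
    # Every sentence already has audio; nothing left to convert.
    return len(sentences)
```

A conversion loop would then start at the returned index instead of 0, so sentences that were already synthesized before the refresh are not redone.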