Pinned Repositories
alltalk_tts
AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, however supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, narrator, model finetuning, custom models, wav file maintenance. It can also be used with 3rd Party software via JSON calls.
StoryDiffusion
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
OMG
[ECCV 2024] OMG: Occlusion-friendly Personalized Multi-concept Generation In Diffusion Models
Pandrator
Turn PDFs and EPUBs into audiobooks, subtitles or videos into dubbed videos (including translation), and more. For free. Pandrator uses local models, notably XTTS, including voice-cloning (instant, RVC-enhanced, XTTS fine-tuning) and LLM processing. It aspires to be a user-friendly app with a GUI, an installer and all-in-one packages.
ComfyUI_RyanOnTheInside
Everything-Reactivity in ComfyUI (audio, MIDI, motion, proximity, and more).
BgRemoverLite
Program for removing the background of an image using the u2net AI and the rembg library
Bookwald's Repositories
Bookwald/alltalk_tts
AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, however supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, narrator, model finetuning, custom models, wav file maintenance. It can also be used with 3rd Party software via JSON calls.
Bookwald/clarity-wiki-backups
Bookwald/minecraft-clarity