Pinned Repositories
agenthub_operators
The repository of public operators from agenthub.dev
AI-Faceless-Video-Generator
Generate a video script, voice and a talking face completely with AI
amt-apc
AMT-APC: Automatic Piano Cover by Fine-Tuning an Automatic Music Transcription Model
anything_in_anyscene
awesome-ai-agents
A list of AI autonomous agents
bark
🔊 Text-Prompted Generative Audio Model
Breaking-reCAPTCHAv2
Code for the paper Breaking reCAPTCHAv2 accepted at COMPSAC 2024
Chrome-GPT
An AutoGPT agent that controls Chrome on your desktop
clarity-upscaler
Clarity AI | AI Image Upscaler & Enhancer - free and open-source Magnific Alternative
Comfyui_Object_Migration
A study that aims to transfer a single concept using the self-attention capability of a DiT model
neuromod0's Repositories
neuromod0/AI-Faceless-Video-Generator
Generate a video script, voice and a talking face completely with AI
neuromod0/amt-apc
AMT-APC: Automatic Piano Cover by Fine-Tuning an Automatic Music Transcription Model
neuromod0/bark
🔊 Text-Prompted Generative Audio Model
neuromod0/Breaking-reCAPTCHAv2
Code for the paper Breaking reCAPTCHAv2 accepted at COMPSAC 2024
neuromod0/Comfyui_Object_Migration
A study that aims to transfer a single concept using the self-attention capability of a DiT model
neuromod0/ebook2audiobookXTTS
Generates an audiobook with chapters and ebook metadata using Calibre and XTTS from Coqui TTS, with optional voice cloning and support for multiple languages
neuromod0/facefusion
Industry leading face manipulation platform
neuromod0/flashinfer
FlashInfer: Kernel Library for LLM Serving
neuromod0/hallo2
neuromod0/hertz-dev
first base model for full-duplex conversational audio
neuromod0/HeyGenClone
A simple and open-source analogue of the HeyGen system
neuromod0/ichigo
Llama3.1 learns to Listen
neuromod0/linkedIn_auto_jobs_applier_with_AI
LinkedIn_AIHawk is a tool that automates the job application process on LinkedIn. Using artificial intelligence, it lets users apply to multiple job offers in an automated and personalized way.
neuromod0/LitServe
Lightning-fast serving engine for AI models. Flexible. Easy. Enterprise-scale.
neuromod0/llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Demo apps to showcase Meta Llama for WhatsApp & Messenger.
neuromod0/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.
neuromod0/metavoice-src
Foundational model for human-like, expressive TTS
neuromod0/open-notebooklm
Convert any PDF into a podcast episode!
neuromod0/Open-Source-Ruby-and-Rails-Apps
Awesome Ruby and Rails Open Source applications
neuromod0/OpenVoice
Instant voice cloning by MIT and MyShell.
neuromod0/parler-tts
Inference and training library for high-quality TTS models.
neuromod0/Retrieval-based-Voice-Conversion-WebUI
Easily train a good VC model with voice data <= 10 mins!
neuromod0/SLIViT
An AI framework for clinical diagnosis of 3D biomedical scans
neuromod0/so-vits-svc-fork
so-vits-svc fork with realtime support, improved interface and more features.
neuromod0/street_gaussians
[ECCV 2024] Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting
neuromod0/ultravox
A fast multimodal LLM for real-time voice
neuromod0/UVR5-UI
Ultimate Vocal Remover 5 with Gradio UI. Separate an audio file into various stems, using multiple models
neuromod0/voice-changer
Realtime Voice Changer
neuromod0/voicerestore
VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration
neuromod0/WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.