Pinned Repositories
agents
Build real-time multimodal AI applications 🤖🎙️📹
AniPortrait
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
ansj_seg
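Ansj word segmentation: a true Java implementation of ICT. Its segmentation quality and speed both exceed those of the open-source version of ICT. Chinese word segmentation, person-name recognition, part-of-speech tagging, user-defined dictionaries.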
audio2photoreal
Code and dataset for photorealistic Codec Avatars driven from audio
bark
🔊 Text-Prompted Generative Audio Model
Bark-Voice-Cloning
Bark Voice Cloning and Voice Cloning for Chinese Speech
CatVTON
CatVTON is a simple and efficient virtual try-on diffusion model with 1) Lightweight Network (899.06M parameters in total), 2) Parameter-Efficient Training (49.57M trainable parameters) and 3) Simplified Inference (< 8G VRAM for 1024x768 resolution).
ChatTTS
ChatTTS is a generative speech model for daily dialogue.
cland-websiteManage
News management admin back end
ComfyUI
The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.
XxSuper's Repositories
XxSuper/agents
Build real-time multimodal AI applications 🤖🎙️📹
XxSuper/AniPortrait
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
XxSuper/audio2photoreal
Code and dataset for photorealistic Codec Avatars driven from audio
XxSuper/CatVTON
CatVTON is a simple and efficient virtual try-on diffusion model with 1) Lightweight Network (899.06M parameters in total), 2) Parameter-Efficient Training (49.57M trainable parameters) and 3) Simplified Inference (< 8G VRAM for 1024x768 resolution).
XxSuper/ChatTTS
ChatTTS is a generative speech model for daily dialogue.
XxSuper/ComfyUI
The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.
XxSuper/conversational-ai-livekit
A real-time conversational application built on Alibaba Cloud's TTS, LLM, and STT models
XxSuper/CosyVoice
LLM-based TTS model, providing full-stack inference, training, and deployment capabilities.
XxSuper/digital_human_video_player
A digital human video player with an HTTP API; it uses the Gradio API to connect to Easy-Wav2Lip, Sadtalker, and GeneFacePlusPlus
XxSuper/facefusion
Next generation face swapper and enhancer
XxSuper/fish-speech
Brand new TTS solution
XxSuper/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
XxSuper/grok-1
Grok open release
XxSuper/langchain
🦜🔗 Build context-aware reasoning applications
XxSuper/Linly-Talker
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction method. 🤝🤖 It integrates various technologies like Whisper, Linly, Microsoft Speech Services, and SadTalker talking head generation system. 🌟🔬
XxSuper/live2d-TTS-LLM-GPT-SoVITS-Vtuber
A simple, low-cost live-streaming solution based on Live2D, TTS (text-to-speech), and large-model chat
XxSuper/llama3
The official Meta Llama 3 GitHub site
XxSuper/Llama3-XTuner-CN
Llama3-XTuner-CN (Finetune By XTuner)
XxSuper/MaxKB
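💬 A knowledge-base question-answering system built on large language models (LLMs). It works out of the box, supports quick embedding into third-party business systems, and is officially produced by 1Panel.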
XxSuper/MetaGPT
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
XxSuper/metahuman-stream
Real-time streaming digital human based on NeRF
XxSuper/MuseTalk
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
XxSuper/OpenVoice
Instant voice cloning by MyShell.
XxSuper/pyvideotrans
Translate a video from one language to another and add dubbing.
XxSuper/SenseVoice
Multilingual Voice Understanding Model
XxSuper/stable-diffusion-webui
Stable Diffusion web UI
XxSuper/Streamer-Sales
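Streamer-Sales 销冠 (Sales Champion): a sales-streamer LLM 🛒🎁 that, given a product's features, presents the product in a way designed to spark users' desire to buy. 🚀⭐ Includes a detailed data-generation pipeline ❗ 📦 It also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital human generation 🦸, an Agent that queries the web for real-time information 🌐, and ASR (speech-to-text) 🎙️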
XxSuper/transagents
XxSuper/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
XxSuper/whisper
Robust Speech Recognition via Large-Scale Weak Supervision