dengxb's Stars
louis030195/screen-pipe
Turn your screen into actions (using LLMs). Inspired by adept.ai, rewind.ai, Apple Shortcut. Rust.
CosmosShadow/gptpdf
Using GPT to parse PDF
ShaShekhar/aaiela
Tencent/MimicMotion
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
fudan-generative-vision/hallo
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
modelscope/DiffSynth-Studio
Enjoy the magic of Diffusion models!
TMElyralab/MuseTalk
MuseTalk: Real-Time High Quality Lip Synchorization with Latent Space Inpainting
FoundationVision/VAR
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
datawhalechina/self-llm
《开源大模型食用指南》基于Linux环境快速部署开源大模型,更适合**宝宝的部署教程
QwenLM/Qwen-Agent
Agent framework and applications built upon Qwen2, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
thu-ml/unidiffuser
Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion"
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
xai-org/grok-1
Grok open release
Stability-AI/generative-models
Generative Models by Stability AI
abi/screenshot-to-code
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
emilwallner/Screenshot-to-code
A neural network that transforms a design mock-up into a static website.
TadasBaltrusaitis/OpenFace
OpenFace – a state-of-the art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.
rlawjdghek/StableVITON
[CVPR2024] StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On
ali-vilab/AnyDoor
Official implementations for paper: Anydoor: zero-shot object-level image customization
lllyasviel/Fooocus
Focus on prompting and generating
esbatmop/MNBVC
MNBVC(Massive Never-ending BT Vast Chinese corpus)超大规模中文语料集。对标chatGPT训练的40T数据。MNBVC数据集不但包括主流文化,也包括各个小众文化甚至火星文的数据。MNBVC数据集包括新闻、作文、小说、书籍、杂志、论文、台词、帖子、wiki、古诗、歌词、商品介绍、笑话、糗事、聊天记录等一切形式的纯文本中文数据。
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
RVC-Project/Retrieval-based-Voice-Conversion-WebUI
Easily train a good VC model with voice data <= 10 mins!
myshell-ai/OpenVoice
Instant voice cloning by MyShell.
reworkd/AgentGPT
🤖 Assemble, configure, and deploy autonomous AI Agents in your browser.
cumulo-autumn/StreamDiffusion
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
CVI-SZU/Linly
Chinese-LLaMA 1&2、Chinese-Falcon 基础模型;ChatFlow中文对话模型;中文OpenLLaMA模型;NLP预训练/指令微调数据集
Tencent/TencentPretrain
Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo
magic-research/magic-animate
[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model