KazusaKitakawa's Stars
THUDM/CogVideo
Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
jhc13/taggui
Tag manager and captioner for image datasets
starik222/BooruDatasetTagManager
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
f/awesome-chatgpt-prompts
A curated collection of ChatGPT prompts for using ChatGPT more effectively.
THUDM/CogVLM2
GPT4V-level open-source multi-modal model based on Llama3-8B
Kwai-Kolors/Kolors
Kolors Team
LlamaFamily/Llama-Chinese
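Llama Chinese community. The Llama3 online demo and fine-tuned models are now available, the latest Llama3 learning resources are collected in real time, and all code has been updated for Llama3. Building the best Chinese Llama large model, fully open source and commercially usable.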
Qrange-group/SUR-adapter
ACM MM'23 (oral), SUR-adapter for pre-trained diffusion models can acquire the powerful semantic understanding and reasoning capabilities from large language models to build a high-quality textual semantic representation for text-to-image generation.
YangLing0818/RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
bmaltais/kohya_ss
modelscope/DiffSynth-Studio
Enjoy the magic of Diffusion models!
lllyasviel/Omost
Your image is almost there!
advimman/lama
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022
TencentARC/MotionCtrl
Official Code for MotionCtrl [SIGGRAPH 2024]
Tencent/HunyuanDiT
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
HL-hanlin/Ctrl-Adapter
Official implementation of Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model
Elegycloud/clash-for-linux-backup
A Clash for Linux backup repository based on Clash Core
Kosinkadink/ComfyUI-AnimateDiff-Evolved
Improved AnimateDiff for ComfyUI and Advanced Sampling Support
TMElyralab/MuseV
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
fudan-generative-vision/champ
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
sdbds/champ-for-windows
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
ltdrdata/ComfyUI-Manager
ComfyUI-Manager is an extension designed to enhance the usability of ComfyUI. It offers management functions to install, remove, disable, and enable various custom nodes of ComfyUI. Furthermore, this extension provides a hub feature and convenience functions to access a wide range of information within ComfyUI.
wgwang/awesome-LLMs-In-China
**large models
city96/ComfyUI_ExtraModels
Support for miscellaneous image models. Currently supports: DiT, PixArt, HunYuanDiT, MiaoBi, and a few VAEs.
Zejun-Yang/AniPortrait
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
PixArt-alpha/PixArt-alpha
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
NUS-HPC-AI-Lab/VideoSys
VideoSys: An easy and efficient system for video generation