KazusaKitakawa's Stars

THUDM/CogVideo
Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Language: Python, 7.2k stars, 664 forks
jhc13/taggui
Tag manager and captioner for image datasets
Language:Python65930
starik222/BooruDatasetTagManager
Language: C#, 1.4k stars, 123 forks
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Language: Python, 19.3k stars, 2.1k forks
f/awesome-chatgpt-prompts
This repo includes ChatGPT prompt curation to use ChatGPT better.
Language: HTML, 111k stars, 15k forks
THUDM/CogVLM2
GPT4V-level open-source multi-modal model based on Llama3-8B
Language: Python, 2k stars, 129 forks
Kwai-Kolors/Kolors
Kolors Team
Language: Python, 3.5k stars, 225 forks
LlamaFamily/Llama-Chinese
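Llama Chinese community. The Llama3 online demo and fine-tuned models are now available, the latest Llama3 learning resources are aggregated in real time, and all code has been updated for Llama3. Aims to build the best Chinese Llama large model; fully open source and available for commercial use.
Language: Python, 13.6k stars, 1.2k forks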
Qrange-group/SUR-adapter
ACM MM'23 (oral), SUR-adapter for pre-trained diffusion models can acquire the powerful semantic understanding and reasoning capabilities from large language models to build a high-quality textual semantic representation for text-to-image generation.
Language:Python1112
YangLing0818/RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
Language: Jupyter Notebook, 1.6k stars, 92 forks
bmaltais/kohya_ss
Language: Python, 9.3k stars, 1.2k forks
modelscope/DiffSynth-Studio
Enjoy the magic of Diffusion models!
Language: Python, 6.3k stars, 569 forks
lllyasviel/Omost
Your image is almost there!
Language: Python, 7.2k stars, 417 forks
advimman/lama
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022
Language: Jupyter Notebook, 7.8k stars, 831 forks
TencentARC/MotionCtrl
Official Code for MotionCtrl [SIGGRAPH 2024]
Language: Python, 1.3k stars, 70 forks
Tencent/HunyuanDiT
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Language: Python, 3.3k stars, 281 forks
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
Language: Python, 21.6k stars, 2.1k forks
HL-hanlin/Ctrl-Adapter
Official implementation of Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model
Language:Python37416
Elegycloud/clash-for-linux-backup
A Clash for Linux backup repository based on Clash Core
Language: Shell, 2.1k stars, 885 forks
Kosinkadink/ComfyUI-AnimateDiff-Evolved
Improved AnimateDiff for ComfyUI and Advanced Sampling Support
Language: Python, 2.6k stars, 194 forks
TMElyralab/MuseV
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
Language: Python, 2.3k stars, 251 forks
fudan-generative-vision/champ
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
Language: Python, 3.9k stars, 469 forks
sdbds/champ-for-windows
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
Language:Python181
ltdrdata/ComfyUI-Manager
ComfyUI-Manager is an extension designed to enhance the usability of ComfyUI. It offers management functions to install, remove, disable, and enable various custom nodes of ComfyUI. Furthermore, this extension provides a hub feature and convenience functions to access a wide range of information within ComfyUI.
Language: JavaScript, 6k stars, 739 forks
wgwang/awesome-LLMs-In-China
Large models in China
5.3k stars, 435 forks
city96/ComfyUI_ExtraModels
Support for miscellaneous image models. Currently supports: DiT, PixArt, HunYuanDiT, MiaoBi, and a few VAEs.
Language:Python36233
Zejun-Yang/AniPortrait
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
Language: Python, 4.5k stars, 566 forks
PixArt-alpha/PixArt-alpha
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Language: Python, 2.7k stars, 168 forks
facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Language: Python, 6k stars, 534 forks
NUS-HPC-AI-Lab/VideoSys
VideoSys: An easy and efficient system for video generation
Language: Python, 1.6k stars, 109 forks