fenfenfenfan's Stars
Zuellni/ComfyUI-PickScore-Nodes
PickScore nodes for ComfyUI.
VectorSpaceLab/OmniGen
modelscope/ms-swift
Use PEFT or full-parameter training to fine-tune 350+ LLMs or 90+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
showlab/Awesome-Unified-Multimodal-Models
📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models.
showlab/Show-o
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
XLabs-AI/x-flux
THUDM/CogVideo
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
bghira/SimpleTuner
A general fine-tuning kit geared toward diffusion models.
kongds/E5-V
E5-V: Universal Embeddings with Multimodal Large Language Models
Xiaojiu-z/Stable-Hair
Stable-Hair: Real-World Hair Transfer via Diffusion Model
fusiming3/MARS
Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis
GAIR-NLP/anole
Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation
ShaShekhar/aaiela
dvlab-research/ControlNeXt
Controllable video and image generation, SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA
lks-ai/anynode
A Node for ComfyUI that does what you ask it to do
Nerogar/OneTrainer
OneTrainer is a one-stop solution for all your stable diffusion training needs.
SalesforceAIResearch/DiffusionDPO
Code for "Diffusion Model Alignment Using Direct Preference Optimization"
google/python-fire
Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object.
wangkai930418/DPL
Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing (NeurIPS 2023)
Xiaojiu-z/SSR_Encoder
PyTorch implementation of "SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation" (CVPR 2024)
LC044/WeChatMsg
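Extract WeChat chat history and export it to HTML, Word, and Excel documents for permanent storage; analyze the chat history to generate an annual chat report; and use the chat data to train a personal AI chat assistant.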
FreeStyleFreeLunch/FreeStyle
FreeStyle: Free Lunch for Text-guided Style Transfer using Diffusion Models
CaraJ7/CoMat
[NeurIPS 2024] 💫CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
idealo/image-quality-assessment
Convolutional Neural Networks to predict the aesthetic and technical quality of images.
christophschuhmann/improved-aesthetic-predictor
CLIP+MLP Aesthetic Score Predictor
cosmicman-cvpr2024/CosmicMan
CosmicMan: A Text-to-Image Foundation Model for Humans (CVPR 2024)
TencentARC/BrushNet
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
WUyinwei-hah/RRNet
[CVPR2024] The official implementation of paper Relation Rectification in Diffusion Model
OSU-NLP-Group/MagicBrush
[NeurIPS'23] "MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing".
fudan-generative-vision/champ
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance