text-to-image-generation
There are 152 repositories under the text-to-image-generation topic.
NVlabs/Sana
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
Lightricks/ComfyUI-LTXVideo
LTX-Video Support for ComfyUI
adobe-research/custom-diffusion
Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)
FoundationVision/Infinity
[CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
muzishen/IMAGDressing
[AAAI 2025]👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing. It enables customizable human image generation with flexible garment, pose, and scene control, ensuring high fidelity and garment consistency for virtual dressing.
AIDC-AI/Awesome-Unified-Multimodal-Models
Awesome Unified Multimodal Models
songweige/rich-text-to-image
Rich-Text-to-Image Generation
PKU-YuanGroup/UniWorld
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
FoundationVision/Liquid
(Accepted by IJCV) Liquid: Language Models are Scalable and Unified Multi-modal Generators
markfulton/NanoBananaEditor
The most advanced Nano Banana image generator and editor application. Your central hub for AI image generation and revisions. Intuitive UI features reference images, editing with image masks, version history, and more. Powered by Gemini 2.5 Flash images API.
donahowe/AutoStudio
AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation
Paranioar/Awesome_Matching_Pretraining_Transfering
The Paper List of Large Multi-Modality Model (Perception, Generation, Unification), Parameter-Efficient Finetuning, Vision-Language Pretraining, Conventional Image-Text Matching for Preliminary Insight.
ByteVisionLab/TokenFlow
[CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".
OSU-NLP-Group/MagicBrush
[NeurIPS'23] "MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing".
woctezuma/stable-diffusion-colab
Colab notebook for Stable Diffusion Hyper-SDXL.
RockeyCoss/SPO
[CVPR 2025] Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization
CFGpp-diffusion/CFGpp
Official repository for "CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models" (ICLR 2025)
huggingface/diffusion-fast
Faster generation with text-to-image diffusion models.
yunqing-me/AttackVLM
[NeurIPS-2023] Annual Conference on Neural Information Processing Systems
tsunghan-wu/SLD
🔥 [CVPR 2024] Official implementation of "Self-correcting LLM-controlled Diffusion Models" (SLD)
GuoLanqing/Awesome-High-Resolution-Diffusion
🔥🔥🔥 A curated list of papers on recent diffusion-based high-resolution image and video synthesis.
ExplainableML/ReNO
[NeurIPS 2024] ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization
zituitui/BELM
[NeurIPS 2024] Official implementation of "BELM: Bidirectional Explicit Linear Multi-step Sampler for Exact Inversion in Diffusion Models".
somepago/DCR
Official PyTorch repo of CVPR'23 and NeurIPS'23 papers on understanding replication in diffusion models.
yandex-research/swd
Scale-wise Distillation of Diffusion Models
louisYen/Gen4Gen
🏞️ Official implementation of "Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition"
QY-H00/attention-interpolation-diffusion
[NeurIPS 2024] Official Implementation of Attention Interpolation of Text-to-Image Diffusion
Correr-Zhou/MagicTailor
[IJCAI 2025 (Oral)] Official implementation of the paper "MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models".
j-min/DSG
Davidsonian Scene Graph (DSG) for Text-to-Image Evaluation (ICLR 2024)
CSU-JPG/TextAtlas
A large-scale dataset for training and evaluating models' ability on dense-text image generation
mapo-t2i/mapo
Official codebase for Margin-aware Preference Optimization for Aligning Diffusion Models without Reference (MaPO).
YonghaoXu/Txt2Img-MHN
[IEEE TIP 2023] Txt2Img-MHN: Remote Sensing Image Generation from Text Using Modern Hopfield Networks
glami/glami-1m
The largest multilingual image-text classification dataset. It contains fashion products.
haoosz/ConceptExpress
[ECCV 2024 Oral] ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction
PangzeCheung/SingDiffusion
[CVPR 2024] Tackling the Singularities at the Endpoints of Time Intervals in Diffusion Models
YangLing0818/ContextDiff
[ICLR 2024] Contextualized Diffusion Models for Text-Guided Image and Video Generation