diffusion-transformer
There are 50 repositories under the diffusion-transformer topic.
Tencent-Hunyuan/HunyuanVideo
HunyuanVideo: A Systematic Framework For Large Video Generation Model
bytedance/InfiniteYou
🔥 [ICCV 2025 Highlight] InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
Alpha-VLLM/Lumina-T2X
Lumina-T2X is a unified framework for Text to Any Modality Generation
River-Zhang/ICEdit
[NeurIPS 2025] Image editing is worth a single LoRA! 0.1% training data for fantastic image editing! Surpasses GPT-4o in ID persistence~ MoE ckpt released! Only 4GB VRAM is enough to run!
Fantasy-AMAP/fantasy-talking
[ACM MM 2025] FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
bytedance/UNO
[ICCV 2025] 🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning
shallowdream204/DreamClear
[NeurIPS 2024] DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation
thu-ml/RIFLEx
Official implementation for "RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers" (ICML 2025)
ByteDance-Seed/SeedVR
Repo for SeedVR2 & SeedVR (CVPR2025 Highlight)
Tencent-Hunyuan/HunyuanImage-2.1
HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation
ivcylc/OpenMusic
OpenMusic: SOTA Text-to-music (TTM) Generation
lucasnewman/f5-tts-mlx
Implementation of F5-TTS in MLX
wangjiangshan0725/RF-Solver-Edit
[🚀ICML 2025] "Taming Rectified Flow for Inversion and Editing" Using FLUX and HunyuanVideo for image and video editing!
IceClear/SeedVR2
SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training
HumanAIGC/omnitalker
[NeurIPS 2025] OmniTalker: Real-Time Text-Driven Talking Head Generation with In-Context Audio-Visual Style Replication
nxnai/Voost
[SIGGRAPH Asia 25] Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off
DiT-3D/DiT-3D
🔥🔥🔥Official Codebase of "DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation"
bytedance/ComfyUI_InfiniteYou
🔥 [ICCV 2025 Highlight] Official ComfyUI native node supporting InfiniteYou with FLUX
TiankaiHang/Min-SNR-Diffusion-Training
[ICCV 2023] Efficient Diffusion Training via Min-SNR Weighting Strategy
AdaCache-DiT/AdaCache
Code for our ICCV 2025 paper "Adaptive Caching for Faster Video Generation with Diffusion Transformers"
TencentARC/GenCompositor
Official implementation of the paper "GenCompositor: Generative Video Compositing with Diffusion Transformer"
MyNiuuu/AniCrafter
AniCrafter: Customizing Realistic Human-Centric Animation via Avatar-Background Conditioning in Video Diffusion Models
yangluo7/CAME
[ACL 2023] The official implementation of "CAME: Confidence-guided Adaptive Memory Optimization"
ML-GSAI/Scaling-Diffusion-Transformers-muP
[NeurIPS 2025] Official implementation for our paper "Scaling Diffusion Transformers Efficiently via μP".
desaixie/pa_vdm
[CVPRW 2025] Progressive Autoregressive Video Diffusion Models: https://arxiv.org/abs/2410.08151
lucasnewman/f5-tts-swift
Implementation of F5-TTS in Swift using MLX
milmor/diffusion-transformer
Implementation of the Diffusion Transformer model in PyTorch
keshik6/grafting
[NeurIPS 2025 Oral] Exploring Diffusion Transformer Designs via Grafting
Pur1zumu/RIFT-SVC
Implementation of RIFT-SVC, a singing voice conversion model based on Rectified Flow Transformer.
prathebaselva/FORA
FORA introduces a simple yet effective caching mechanism in the Diffusion Transformer architecture for faster inference sampling.
explainingai-code/DiT-PyTorch
This repo implements Diffusion Transformers (DiT) in PyTorch and provides training and inference code on the CelebHQ dataset
ModelTC/HarmoniCa
[ICML 2025] This is the official PyTorch implementation of "🎵 HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration".
ArchiMickey/Just-a-DiT
A repo for a modified version of the Diffusion Transformer
aimagelab/Alfie
Democratising RGBA Image Generation With No $$$ (AI4VA@ECCV24)
Pengchengpcx/FTEdit
[CVPR2025] Unveil Inversion and Invariance in Flow Transformer for Versatile Image Editing
yuanze-lin/IllumiCraft
[NeurIPS 2025] The official code for "IllumiCraft: Unified Geometry and Illumination Diffusion for Controllable Video Generation"