diffusion-transformer

There are 50 repositories under the diffusion-transformer topic.

  • Tencent-Hunyuan/HunyuanVideo

    HunyuanVideo: A Systematic Framework For Large Video Generation Model

    Language:Python11.2k1342561.1k
  • bytedance/InfiniteYou

    🔥 [ICCV 2025 Highlight] InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity

    Language:Python2.6k2838284
  • Alpha-VLLM/Lumina-T2X

    Lumina-T2X is a unified framework for Text to Any Modality Generation

    Language:Python2.2k319393
  • River-Zhang/ICEdit

    [NeurIPS 2025] Image editing is worth a single LoRA! 0.1% training data for fantastic image editing! Surpasses GPT-4o in ID persistence~ MoE ckpt released! Only 4GB VRAM is enough to run!

    Language:Python2k1774112
  • Fantasy-AMAP/fantasy-talking

    [ACM MM 2025] FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis

    Language:Python1.6k6567124
  • bytedance/UNO

    [ICCV 2025] 🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning

    Language:Python1.3k146777
  • shallowdream204/DreamClear

    [NeurIPS 2024] DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation

    Language:Python1.2k132850
  • thu-ml/RIFLEx

    Official implementation for "RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers" (ICML 2025)

    Language:Python735362571
  • ByteDance-Seed/SeedVR

    Repo for SeedVR2 & SeedVR (CVPR2025 Highlight)

    Language:Python707153142
  • Tencent-Hunyuan/HunyuanImage-2.1

    HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation

    Language:Python65549
  • ivcylc/OpenMusic

    OpenMusic: SOTA Text-to-music (TTM) Generation

    Language:Python615141769
  • lucasnewman/f5-tts-mlx

    Implementation of F5-TTS in MLX

    Language:Python594143660
  • wangjiangshan0725/RF-Solver-Edit

    [🚀ICML 2025] "Taming Rectified Flow for Inversion and Editing" Using FLUX and HunyuanVideo for image and video editing!

    Language:Python594113715
  • IceClear/SeedVR2

    SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training

  • HumanAIGC/omnitalker

    [NeurIPS 2025] OmniTalker: Real-Time Text-Driven Talking Head Generation with In-Context Audio-Visual Style Replication

    Language:JavaScript39228
  • nxnai/Voost

    [SIGGRAPH Asia 25] Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off

    32224
  • DiT-3D/DiT-3D

    🔥🔥🔥Official Codebase of "DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation"

    Language:Python294123323
  • bytedance/ComfyUI_InfiniteYou

    🔥 [ICCV 2025 Highlight] Official ComfyUI native node supporting InfiniteYou with FLUX

    Language:Python281111946
  • TiankaiHang/Min-SNR-Diffusion-Training

    [ICCV 2023] Efficient Diffusion Training via Min-SNR Weighting Strategy

    Language:Python260297
  • AdaCache-DiT/AdaCache

    Code for our ICCV 2025 paper "Adaptive Caching for Faster Video Generation with Diffusion Transformers"

    Language:Python160388
  • TencentARC/GenCompositor

    Official implementation of the paper "GenCompositor: Generative Video Compositing with Diffusion Transformer"

    Language:Python1284
  • MyNiuuu/AniCrafter

    AniCrafter: Customizing Realistic Human-Centric Animation via Avatar-Background Conditioning in Video Diffusion Models

    Language:Python12511104
  • yangluo7/CAME

    [ACL 2023] The official implementation of "CAME: Confidence-guided Adaptive Memory Optimization"

    Language:Python952410
  • ML-GSAI/Scaling-Diffusion-Transformers-muP

    [NeurIPS 2025] Official implementation for our paper "Scaling Diffusion Transformers Efficiently via μP".

    Language:Python91121
  • desaixie/pa_vdm

    CVPRW 2025 paper Progressive Autoregressive Video Diffusion Models: https://arxiv.org/abs/2410.08151

    Language:Python85761
  • lucasnewman/f5-tts-swift

    Implementation of F5-TTS in Swift using MLX

    Language:Swift846616
  • milmor/diffusion-transformer

    Implementation of the Diffusion Transformer model in PyTorch

    Language:Python702311
  • keshik6/grafting

    [NeurIPS 2025 Oral] Exploring Diffusion Transformer Designs via Grafting

    Language:Jupyter Notebook612
  • Pur1zumu/RIFT-SVC

    Implementation of RIFT-SVC, a singing voice conversion model based on Rectified Flow Transformer.

    Language:Python512311
  • prathebaselva/FORA

    FORA introduces a simple yet effective caching mechanism in the Diffusion Transformer architecture for faster inference sampling.

    Language:Python48162
  • explainingai-code/DiT-PyTorch

    This repo implements Diffusion Transformers (DiT) in PyTorch and provides training and inference code on the CelebHQ dataset

    Language:Python47139
  • ModelTC/HarmoniCa

    [ICML 2025] This is the official PyTorch implementation of "🎵 HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration".

    Language:Python43541
  • ArchiMickey/Just-a-DiT

    A repo of a modified version of Diffusion Transformer

    Language:Python35252
  • aimagelab/Alfie

    Democratising RGBA Image Generation With No $$$ (AI4VA@ECCV24)

    Language:Python34411
  • Pengchengpcx/FTEdit

    [CVPR2025] Unveil Inversion and Invariance in Flow Transformer for Versatile Image Editing

    Language:Python20720
  • yuanze-lin/IllumiCraft

    [NeurIPS 2025] The official code for "IllumiCraft: Unified Geometry and Illumination Diffusion for Controllable Video Generation"