Awesome-Diffusion-LLM

A comprehensive list of papers about Large-Language-Diffusion-Models.

Awesome-Large-Language-Diffusion-Models


Important

Contributions welcome:

  • If you have a relevant paper that is not included in this list, please contact us or submit a pull request directly.

  • If you think your paper belongs in a different category, please contact us or submit a pull request.

  • If your paper has been accepted at a venue, please consider updating its entry.

  • Thank you!


💥 News 💥

  • 🔥🔥🔥 Awesome-LLDM is now open!

⭐️ Useful Resources (Blogs & Technical Reports)


⚙️ Framework


Survey Papers

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Discrete Diffusion in Large Language and Multimodal Models: A Survey | 2025 | Arxiv | |
| Diffusion-based Large Language Models Survey | 2025 | Arxiv | |
| A Survey on Parallel Text Generation: From Parallel Decoding to Diffusion Language Models | 2025 | Arxiv | |

Diffusion Language Models

Large Diffusion Language Models (≥7B)

Scaling

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| David helps Goliath: Inference-Time Collaboration Between Small Specialized and Large General Diffusion LMs | 2023 | NAACL | |
| Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning | 2023 | Arxiv | |
| TESS 2: A Large-Scale Generalist Diffusion Language Model | 2025 | ACL | Adapted from Mistral-7B-v0.1 |
| Scaling Diffusion Language Models via Adaptation from Autoregressive Models | 2025 | ICLR | 127M~7B (GPT2, LLaMA2) |
| Large Language Diffusion Models | 2025 | Arxiv | LLaDA-8B |
| LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models | 2025 | Arxiv | |
| Large Language Models to Diffusion Finetuning | 2025 | Arxiv | |
| LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs | 2025 | Arxiv | Long context scaling |
| Dream 7B: Diffusion Large Language Models | 2025 | Arxiv | |
| UltraLLaDA: Scaling the Context Length to 128K for Diffusion Large Language Models | 2025 | Arxiv | |

Accelerating

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Accelerating Diffusion LLMs via Adaptive Parallel Decoding | 2025 | Arxiv | |
| Accelerating Diffusion Language Model Inference via Efficient KV Caching and Guided Diffusion | 2025 | Arxiv | |
| dKV-Cache: The Cache for Diffusion Language Models | 2025 | Arxiv | |
| Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding | 2025 | Arxiv | |
| Wide-In, Narrow-Out: Revokable Decoding for Efficient and Effective DLLMs | 2025 | Arxiv | |
| Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Principles | 2025 | Arxiv | |
| AdaBlock-dLLM: Semantic-Aware Diffusion LLM Inference via Adaptive Block Size | 2025 | Arxiv | |
| Fast-dLLM v2: Efficient Block-Diffusion LLM | 2025 | Arxiv | |
| Spiffy: Multiplying Diffusion LLM Acceleration via Lossless Speculative Decoding | 2025 | Arxiv | |
| Beyond Autoregression: Fast LLMs via Self-Distillation Through Time | 2025 | ICLR | |
| dParallel: Learnable Parallel Decoding for dLLMs | 2025 | Arxiv | |
| d^2Cache: Accelerating Diffusion-Based LLMs via Dual Adaptive Caching | 2025 | Arxiv | |
| Attention Sinks in Diffusion Language Models | 2025 | Arxiv | |
| Learning to Parallel: Accelerating Diffusion Large Language Models via Learnable Parallel Decoding | 2025 | Arxiv | |

Reasoning & Alignment

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models | 2025 | Arxiv | |
| d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning | 2025 | Arxiv | |
| Diffusion of Thought: Chain-of-Thought Reasoning in Diffusion Language Models | 2024 | NeurIPS | |
| wd1: Weighted Policy Optimization for Reasoning in Diffusion Language Models | 2025 | Arxiv | |
| Thinking Inside the Mask: In-Place Prompting in Diffusion LLMs | 2025 | Arxiv | |
| Review, Remask, Refine (R3): Process-Guided Block Diffusion for Text Generation | 2025 | ICML | |
| Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models | 2025 | Arxiv | |
| DiFFPO: Training Diffusion LLMs to Reason Fast and Furious via Reinforcement Learning | 2025 | Arxiv | |
| Principled and Tractable RL for Reasoning with Diffusion Language Models | 2025 | Arxiv | |
| Improving Reasoning for Diffusion Language Models via Group Diffusion Policy Optimization | 2025 | Arxiv | |
| Improving Discrete Diffusion Unmasking Policies Beyond Explicit Reference Policies | 2025 | Arxiv | |
| d2: Improved Techniques for Training Reasoning Diffusion Language Models | 2025 | Arxiv | |
| Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Step | 2025 | Arxiv | |
| Inpainting-Guided Policy Optimization for Diffusion Large Language Models | 2025 | Arxiv | |
| Beyond Surface Reasoning: Unveiling the True Long Chain-of-Thought Capacity of Diffusion Large Language Models | 2025 | Arxiv | |
| Step-Aware Policy Optimization for Reasoning in Diffusion Large Language Models | 2025 | Arxiv | |
| MRO: Enhancing Reasoning in Diffusion Language Models via Multi-Reward Optimization | 2025 | Arxiv | |
| Enhancing Reasoning for Diffusion LLMs via Distribution Matching Policy Optimization | 2025 | Arxiv | |
| Boundary-Guided Policy Optimization for Memory-efficient RL of Diffusion Large Language Models | 2025 | Arxiv | |
| Coevolutionary Continuous Discrete Diffusion: Make Your Diffusion Language Model a Latent Reasoner | 2025 | Arxiv | |
| SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models | 2025 | Arxiv | |
| LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning | 2025 | Arxiv | |
| TR2-D2: Tree Search Guided Trajectory-Aware Fine-Tuning for Discrete Diffusion | 2025 | Arxiv | |
| MDPO: Overcoming the Training-Inference Divide of Masked Diffusion Language Models | 2025 | Arxiv | |
| Loopholing Discrete Diffusion: Deterministic Bypass of the Sampling Wall | 2025 | Arxiv | |
| RFG: Test-Time Scaling for Diffusion Large Language Model Reasoning with Reward-Free Guidance | 2025 | Arxiv | |
| Preference-Based Alignment of Discrete Diffusion Models | 2025 | Arxiv | |

Others

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| DINGO: Constrained Inference for Diffusion LLMs | 2025 | Arxiv | Constrained decoding |
| DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation | 2025 | Arxiv | Coder |
| Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference | 2025 | Arxiv | Coder |
| Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models | 2025 | Arxiv | |
| The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs | 2025 | Arxiv | |
| Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies | 2025 | Arxiv | VLA |
| LLaDA-VLA: Vision Language Diffusion Action Models | 2025 | Arxiv | VLA |
| Beyond Autoregression: An Empirical Study of Diffusion Large Language Models for Code Generation | 2025 | Arxiv | Coder |
| Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs | 2025 | Arxiv | Quantization |
| Sequential Diffusion Language Models | 2025 | Arxiv | |
| SparseD: Sparse Attention for Diffusion Language Models | 2025 | Arxiv | Sparse Attention |
| LLaDA-MoE: A Sparse MoE Diffusion Language Model | 2025 | Arxiv | MoE |
| dVLA: Diffusion Vision-Language-Action Model with Multimodal Chain-of-Thought | 2025 | Arxiv | VLA |
| Test-Time Anchoring for Discrete Diffusion Posterior Sampling | 2025 | Arxiv | Sampling |
| What Makes Diffusion Language Models Super Data Learners? | 2025 | Arxiv | |
| Why mask diffusion does not work | 2025 | Arxiv | |
| DiffuSpec: Unlocking Diffusion Language Models for Speculative Decoding | 2025 | Arxiv | |
| Think While You Generate: Discrete Diffusion with Planned Denoising | 2025 | ICLR | |
| Diffusion Language Models Know the Answer Before Decoding | 2025 | Arxiv | |
| CtrlDiff: Boosting Large Diffusion Language Models with Dynamic Block Prediction and Controllable Generation | 2025 | Arxiv | |
| Next Semantic Scale Prediction via Hierarchical Diffusion Language Models | 2025 | Arxiv | |

Diffusion Language Models (<7B)

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Diffusion-LM Improves Controllable Text Generation | 2022 | NeurIPS | Embedding |
| DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models | 2023 | ICLR | Embedding |
| DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models | 2023 | ACL | Masked |
| Latent Diffusion for Language Generation | 2023 | NeurIPS | Latent |
| Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution | 2024 | ICML | Masked |
| SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control | 2023 | ACL | Simplex, Blockwise |
| AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation | 2023 | NeurIPS | AR-like noise |
| Likelihood-Based Diffusion Language Models | 2023 | NeurIPS | Plaid1B |
| Scaling up Masked Diffusion Models on Text | 2024 | ICLR | 1.1B |
| Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models | 2025 | ICLR | |
| The Diffusion Duality | 2025 | ICML | |
| Generalized Interpolating Discrete Diffusion | 2025 | ICML | |
| Train for the Worst, Plan for the Best: Understanding Token Ordering in Masked Diffusions | 2025 | ICML | |
| Esoteric Language Models | 2025 | Arxiv | |
| Reinforced Context Order Recovery for Adaptive Reasoning and Planning | 2025 | Arxiv | |
| Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning | 2025 | ICLR | |
| Your Absorbing Discrete Diffusion Secretly Models the Bayesian Posterior | 2025 | Arxiv | |
| Any-Order Flexible Length Masked Diffusion | 2025 | Arxiv | |
| Edit Flows: Flow Matching with Edit Operations | 2025 | Arxiv | |
| DLM-One: Diffusion Language Models for One-Step Sequence Generation | 2025 | Arxiv | |
| Simplified and Generalized Masked Diffusion for Discrete Data | 2024 | NeurIPS | |

Multi-Modal Diffusion Models

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Diffuse Everything: Multimodal Diffusion Models on Arbitrary State Spaces | 2025 | ICML | |
| MMaDA: Multimodal Large Diffusion Language Models | 2025 | Arxiv | |
| LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning | 2025 | Arxiv | |
| Unified Multimodal Discrete Diffusion | 2025 | Arxiv | |
| Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding | 2025 | Arxiv | |
| LaViDa: A Large Diffusion Language Model for Multimodal Understanding | 2025 | Arxiv | |
| Dual Diffusion for Unified Image Generation and Understanding | 2025 | Arxiv | |
| Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model | 2025 | Arxiv | |
| Show-o2: Improved Native Unified Multimodal Models | 2025 | Arxiv | |
| Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding | 2025 | Arxiv | |

Seminal Diffusion Papers

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Deep Unsupervised Learning using Nonequilibrium Thermodynamics | 2015 | ICML | Diffusion Formulation |
| Denoising Diffusion Probabilistic Models | 2020 | NeurIPS | |
| Denoising Diffusion Implicit Models | 2021 | ICLR | |
| Score-Based Generative Modeling through Stochastic Differential Equations | 2021 | ICLR | |
| DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps | 2022 | NeurIPS | |
| High-Resolution Image Synthesis with Latent Diffusion Models | 2022 | CVPR | |
| Scalable Diffusion Models with Transformers | 2023 | ICCV | |
| Score-based Generative Modeling in Latent Space | 2021 | NeurIPS | Latent |
| Structured Denoising Diffusion Models in Discrete State-Spaces | 2021 | NeurIPS | Discrete |
| Vector Quantized Diffusion Model for Text-to-Image Synthesis | 2022 | CVPR | VQ |
| Diffusion Models Beat GANs on Image Synthesis | 2021 | NeurIPS | CG |
| Classifier-Free Diffusion Guidance | 2021 | NeurIPS | CFG |
| Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning | 2023 | ICLR | Self-conditioning |
| Progressive Distillation for Fast Sampling of Diffusion Models | 2022 | ICLR | Distillation |
| Consistency Models | 2023 | ICML | |

Contact

We welcome all researchers to contribute to this repository.

If you have a related paper that was not added to the library, please contact us.

Email: jake630@snu.ac.kr / wjk9904@snu.ac.kr / qicher@snu.ac.kr