This repository contains:
- A regularly updated paper list for Diffusion Large Language Models
- A tutorial for Diffusion Large Language Models
- A nano code snippet for Diffusion Large Language Models (coming soon!)
- Gemini Diffusion blog
- Mercury: Ultra-Fast Language Models Based on Diffusion tech report
- Dream7B blog
- LaViDa: A Large Diffusion Language Model for Multimodal Understanding
- MMaDA: Multimodal Large Diffusion Language Models
- LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning
- Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective (arxiv)
- dKV-Cache: The Cache for Diffusion Language Models (arxiv)
- CtrlDiff: Boosting Large Diffusion Language Models with Dynamic Block Prediction and Controllable Generation (arxiv)
- d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning (arxiv)
- Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models (ICLR 2025)
- Large Language Diffusion Models (arxiv 2025)
- Remasking Discrete Diffusion Models with Inference-Time Scaling (arxiv 2025)
- Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data (ICLR 2025)
- Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling (ICLR 2025)
- Scaling Diffusion Language Models via Adaptation from Autoregressive Models (ICLR 2025)
- Scaling up Masked Diffusion Models on Text (ICLR 2025)
- Speculative Diffusion Decoding: Accelerating Language Generation through Diffusion (NAACL 2025)
- Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (ICML 2024 best paper)
- Discrete Flow Matching (NeurIPS 2024)
- Simple and Effective Masked Diffusion Language Models (NeurIPS 2024)
- Simplified and Generalized Masked Diffusion for Discrete Data (NeurIPS 2024)
- Diffusion-NAT: Self-Prompting Discrete Diffusion for Non-Autoregressive Text Generation (EACL 2024)
- A Reparameterized Discrete Diffusion Model for Text Generation (COLM 2024)
- Diffusion Glancing Transformer for Parallel Sequence-to-Sequence Learning (NAACL 2024)
- Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning (arxiv 2024)
- DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models (ACL 2023)
- A Continuous Time Framework for Discrete Denoising Models (NeurIPS 2022)
- Structured Denoising Diffusion Models in Discrete State-Spaces (NeurIPS 2021)
- Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions (NeurIPS 2021)
- DiffusER: Discrete Diffusion via Edit-based Reconstruction (ICLR 2023)
- A Cheaper and Better Diffusion Language Model with Soft-Masked Noise (EMNLP 2023)
- Difformer: Empowering Diffusion Models on the Embedding Space for Text Generation (NAACL 2024)
- Latent Diffusion for Language Generation (NeurIPS 2023)
- AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation (NeurIPS 2023)
- Text Generation with Diffusion Language Models: A Pre-training Approach with Continuous Paragraph Denoise (ICML 2023)
- DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for Accelerated Seq2Seq Diffusion Models (EMNLP 2023)
- DeTiME: Diffusion-Enhanced Topic Modeling using Encoder-decoder based LLM (EMNLP 2023)
- How Does Diffusion Influence Pretrained Language Models on Out-of-Distribution Data? (ECAI 2023)
- SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control (ACL 2023)
- GlyphDiffusion: Text Generation as Image Generation (arxiv 2023)
- DINOISER: Diffused Conditional Sequence Learning by Manipulating Noises (arxiv 2023)
- Diffusion-LM Improves Controllable Text Generation (NeurIPS 2022)
- DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models (ICLR 2023)
- Continuous Diffusion for Categorical Data (arxiv 2022)
- SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers (arxiv 2022)
- Diffusion Guided Language Modeling (ACL 2024)
- Table-to-Text Generation with Pretrained Diffusion Models (IEEE 2024)
- Diffusion of Thought: Chain-of-Thought Reasoning in Diffusion Language Models (NeurIPS 2024)
- DiffusionNER: Boundary Diffusion for Named Entity Recognition (ACL 2023)
- Fine-grained Text Style Transfer with Diffusion-Based Language Models (RepL4NLP 2023)