T3D: Few-Step Diffusion Language Models via Trajectory Self-Distillation with Direct Discriminative Optimization

2026-02-12 · Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

T3D is a trajectory self-distillation framework for making Diffusion Large Language Models (DLLMs) effective at few-step text generation. DLLMs can decode tokens in parallel, but in practice their inference speed is limited by the many refinement steps they need, and cutting the step count aggressively degrades generation quality. T3D addresses this by distilling the model's own generative trajectories and training with Direct Discriminative Optimization (DDO), a reverse-KL objective. Because reverse KL is mode-seeking, the student concentrates on high-probability teacher modes rather than spreading probability mass across all of them. The authors report that T3D consistently outperforms existing few-step baselines and standard training under strict step budgets, substantially narrowing the gap to full-step decoding and laying a foundation for practical few-step DLLMs.

Key takeaway

For research scientists building efficient text generation models, T3D offers a way to improve few-step Diffusion LLM quality. Consider adding trajectory self-distillation and Direct Discriminative Optimization to your training pipeline to cut inference steps without a large quality drop, especially under tight compute budgets.

Key insights

Trajectory self-distillation with Direct Discriminative Optimization improves few-step diffusion language model efficiency.

Principles

Distill generative trajectories for efficiency.
Reverse-KL objective promotes mode-seeking distillation.
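
The mode-seeking behavior of reverse KL can be seen in a toy example. The sketch below is illustrative only and is not taken from the T3D code: the four-token "teacher" distribution and the two candidate "students" are made-up numbers. It compares forward KL, KL(teacher || student), with reverse KL, KL(student || teacher), for a student that commits to one teacher mode and a student that spreads mass evenly.

```python
import math


def kl(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical bimodal teacher over a 4-token vocabulary (made-up numbers).
teacher = [0.49, 0.01, 0.01, 0.49]

# Two hypothetical students: one commits to a single mode, one covers everything.
single_mode = [0.94, 0.02, 0.02, 0.02]
uniform = [0.25, 0.25, 0.25, 0.25]

forward_single = kl(teacher, single_mode)   # forward KL: KL(teacher || student)
forward_uniform = kl(teacher, uniform)
reverse_single = kl(single_mode, teacher)   # reverse KL: KL(student || teacher)
reverse_uniform = kl(uniform, teacher)

print(f"forward KL  single-mode={forward_single:.3f}  uniform={forward_uniform:.3f}")
print(f"reverse KL  single-mode={reverse_single:.3f}  uniform={reverse_uniform:.3f}")
```

With these numbers, reverse KL scores the single-mode student as the better fit, while forward KL prefers the uniform one. Reverse KL penalizes a student heavily for placing mass where the teacher has almost none, so minimizing it favors committing to a high-probability mode.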

Method

T3D combines trajectory self-distillation, in which the model distills its own generative trajectories, with Direct Discriminative Optimization (DDO), a reverse-KL objective that concentrates the student on high-probability teacher modes.
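
One way to picture distilling a generative trajectory is sketched below. This is a toy under stated assumptions, not the T3D implementation: it assumes a masked-diffusion style decoder that reveals tokens step by step, and `toy_teacher_step`, `rollout`, and `distillation_pairs` are hypothetical names. The fixed `sampled_output` stands in for text the model itself would have generated. How T3D actually selects states and builds training targets is not specified in this summary.

```python
import random

MASK = "_"


def toy_teacher_step(state, sampled_output, rng):
    """Hypothetical many-step decoder: reveal one still-masked position per call."""
    masked = [i for i, tok in enumerate(state) if tok == MASK]
    i = rng.choice(masked)
    new_state = list(state)
    new_state[i] = sampled_output[i]
    return new_state


def rollout(sampled_output, rng):
    """Decode from all-mask to the full sequence, recording every intermediate state."""
    state = [MASK] * len(sampled_output)
    trajectory = [state]
    while MASK in state:
        state = toy_teacher_step(state, sampled_output, rng)
        trajectory.append(state)
    return trajectory


def distillation_pairs(trajectory, student_steps):
    """Pick evenly spaced states so each student step spans several teacher steps."""
    last = len(trajectory) - 1
    idx = [round(k * last / student_steps) for k in range(student_steps + 1)]
    return [(trajectory[a], trajectory[b]) for a, b in zip(idx, idx[1:])]


rng = random.Random(0)
sampled_output = "the cat sat on the mat".split()  # stand-in for a model sample
trajectory = rollout(sampled_output, rng)           # 6 teacher steps, 7 states
pairs = distillation_pairs(trajectory, student_steps=2)

for source, target in pairs:
    print(" ".join(source), "->", " ".join(target))
```

Each printed pair is an (input state, target state) example in which the student is asked to cover in one step what the many-step decoder did in three. In a real setup the targets would typically be teacher distributions rather than hard tokens, so that a divergence such as reverse KL can be applied.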

In practice

Apply DDO for few-step DLLM training.
Use trajectory distillation to reduce inference steps.
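
To make a "step budget" concrete, the arithmetic below shows how many tokens must be committed per step when the number of decoding steps shrinks. It assumes a masked-diffusion style decoder that finalizes a fixed share of positions at each step; `unmask_schedule` is a hypothetical helper, and the sequence length and step counts are illustrative rather than settings from the paper.

```python
def unmask_schedule(seq_len, num_steps):
    """Split seq_len positions as evenly as possible across num_steps decoding steps."""
    if not 1 <= num_steps <= seq_len:
        raise ValueError("need 1 <= num_steps <= seq_len")
    base, extra = divmod(seq_len, num_steps)
    # The first `extra` steps commit one additional token each.
    return [base + 1 if step < extra else base for step in range(num_steps)]


full_step = unmask_schedule(256, 256)  # one token per step
few_step = unmask_schedule(256, 8)     # strict budget of 8 steps

print(max(full_step), max(few_step))  # prints "1 32"
```

Under this assumption, moving from 256 steps to 8 means every step has to get 32 tokens right at once instead of one, which is why naive step reduction hurts quality and why the model needs to be trained for the few-step regime.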

Topics

Diffusion Language Models
Trajectory Self-Distillation
Direct Discriminative Optimization
Few-Step Decoding
Text Generation

Code references

Tyrion58/T3D

Best for: Research Scientist, AI Researcher, AI Scientist, Deep Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.