DreamReasoner-8B: Block-Size Curriculum Learning for Diffusion Reasoning Models

2026-06-18 · Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, extended

Summary

DreamReasoner-8B is an open-source 8B-parameter block diffusion reasoning model designed to accelerate decoding for long chain-of-thought (CoT) reasoning. The authors' systematic study shows that training with large block sizes degrades reasoning performance, while training with small block sizes preserves it. To address this, they propose block-size curriculum learning, which gradually moves training from fine-grained to coarse-grained block sizes. With this approach, DreamReasoner-8B achieves competitive results on mathematical and code reasoning benchmarks, matching leading autoregressive models such as Qwen3-8B. The work also introduces RelaxedConfidence, an analytical decoding probe that relaxes token commitment thresholds based on local context and yields average TPF (tokens per forward pass) gains of 22.5% in the thinking phase and 54.5% in the answering phase.
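To make the TPF metric concrete, the toy sketch below counts forward passes for one block under a generic confidence-threshold commitment rule and compares a strict threshold with a looser one. It is not the paper's RelaxedConfidence rule, whose details are not given in this summary. The function names, the confidence values, and the simplification that confidences stay fixed between passes are all assumptions made for illustration.

```python
def count_forward_passes(confidences, threshold):
    """Count passes needed to commit every token in one block.

    Toy model (assumption): confidences are fixed per position, whereas a
    real model would recompute them after each pass. Each pass commits all
    remaining tokens with confidence >= threshold, or the single most
    confident remaining token if none qualifies.
    """
    remaining = sorted(confidences, reverse=True)
    passes = 0
    while remaining:
        passes += 1
        accepted = [c for c in remaining if c >= threshold]
        n_commit = max(1, len(accepted))  # always make progress
        remaining = remaining[n_commit:]
    return passes


def tokens_per_forward(num_tokens, num_passes):
    """TPF: tokens committed divided by forward passes used."""
    return num_tokens / num_passes


def relative_gain(baseline_tpf, new_tpf):
    """Relative TPF gain of `new_tpf` over `baseline_tpf` (0.5 means +50%)."""
    return (new_tpf - baseline_tpf) / baseline_tpf


# Made-up confidences for a 4-token block, for illustration only.
block = [0.95, 0.90, 0.80, 0.60]
strict = tokens_per_forward(len(block), count_forward_passes(block, 0.9))
relaxed = tokens_per_forward(len(block), count_forward_passes(block, 0.7))
gain = relative_gain(strict, relaxed)
```

In this toy block, lowering the threshold from 0.9 to 0.7 cuts the pass count from 3 to 2. A gain of 22.5% or 54.5%, as reported above, would be computed the same way from measured TPF values.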

Key takeaway

AI scientists and machine learning engineers developing efficient reasoning models should consider adopting block-size curriculum learning for diffusion language models. The approach stabilizes performance across inference block sizes, giving reasoning quality competitive with autoregressive models such as Qwen3-8B while keeping the benefits of parallel decoding. The RelaxedConfidence decoding strategy is also worth implementing: it raises average TPF by 22.5% in the thinking phase and 54.5% in the answering phase without compromising reasoning fidelity.

Key insights

Block-size curriculum learning enables diffusion models to achieve efficient, robust long-CoT reasoning competitive with autoregressive LLMs.

Principles

Large training block sizes degrade reasoning in block diffusion models.
Small training block sizes preserve reasoning capabilities.
Gradual block-size increase during training improves robustness.

Method

Block-size curriculum learning first trains with a small block size (e.g., 4), then gradually transitions to larger or mixed block sizes (e.g., 32 or {4,8,16,32}), so that the model acquires robust local causal dependencies and generalizes to larger inference blocks.
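A minimal sketch of such a schedule follows. The block sizes 4 and {4,8,16,32} come from the description above. The two-phase split, the `warm_frac` parameter, and uniform sampling from the mixed set are assumptions, not the authors' published recipe.

```python
import random

SMALL_BLOCK = 4                 # fine-grained block size from the text
MIXED_BLOCKS = (4, 8, 16, 32)   # mixed block-size set from the text


def block_size_for_step(step, total_steps, warm_frac=0.5, rng=random):
    """Return the training block size to use at a given step.

    Hypothetical schedule: the first `warm_frac` of training uses only the
    small block size; the remainder samples uniformly from the mixed set.
    """
    if not 0 <= step < total_steps:
        raise ValueError("step must be in [0, total_steps)")
    if step < warm_frac * total_steps:
        return SMALL_BLOCK
    return rng.choice(MIXED_BLOCKS)


# Example: a seeded 10-step schedule (first half fixed, second half mixed).
schedule_rng = random.Random(0)
schedule = [block_size_for_step(s, 10, rng=schedule_rng) for s in range(10)]
```

A smoother variant could grow the maximum block size step by step instead of switching once. The summary does not say which form the authors use.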

In practice

Implement a block-size curriculum for diffusion model training.
Apply RelaxedConfidence decoding for throughput gains.
Evaluate diffusion models across diverse inference block sizes.
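The evaluation step in the list above could be organized as a simple sweep over inference block sizes. In this sketch, `evaluate` is a hypothetical callable standing in for a real benchmark run, and the scores returned by `fake_evaluate` are invented placeholders, not results from the paper.

```python
from statistics import mean


def sweep_block_sizes(evaluate, block_sizes=(4, 8, 16, 32)):
    """Run `evaluate(block_size)` for each inference block size.

    `evaluate` is a hypothetical callable returning an accuracy in [0, 1].
    Returns per-size scores plus their mean and worst-to-best spread.
    """
    scores = {b: evaluate(b) for b in block_sizes}
    values = list(scores.values())
    return {
        "scores": scores,
        "mean": mean(values),
        "spread": max(values) - min(values),
    }


# Stand-in evaluator with made-up numbers, for illustration only.
def fake_evaluate(block_size):
    return {4: 0.80, 8: 0.79, 16: 0.78, 32: 0.76}[block_size]


report = sweep_block_sizes(fake_evaluate)
```

A small spread suggests the model is robust to the inference block size. A large spread would point to the degradation at large blocks that the curriculum is meant to prevent.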

Topics

Diffusion Language Models
Block Diffusion
Chain-of-Thought Reasoning
Curriculum Learning
Model Efficiency
Mathematical Reasoning
Code Generation

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.