Where to Place the Query? Unveiling and Mitigating Positional Bias in In-Context Learning for Diffusion LLMs via Decoding Dynamics

· Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, extended

Summary

Research on in-context learning (ICL) for diffusion large language models (dLLMs) such as LLaDA-8B-Base and Dream-7B-Base shows that query position is a first-order variable. Unlike autoregressive (AR) LLMs, dLLMs use bidirectional attention, so the query can in principle be placed anywhere in the context, yet current practice often defaults to the AR-style trailing query. Empirical analysis shows that varying the query position affects generation quality about as much as varying the semantic quality of the examples, with a relative importance ratio r = 1.236 on GSM8K. The optimal placement is task-dependent: sequential reasoning (GSM8K) favors a trailing query, while global perception (Sudoku) favors a prefix query. The paper attributes this sensitivity to a spatial "Recency Effect" and to task-dependent "Decoding Trajectories." To mitigate it, the paper proposes Average Confidence (C̄), a novel metric that tracks the iterative decoding process, and Auto-ICL, a training-free adaptive routing strategy. Auto-ICL chooses the query placement dynamically and robustly approaches oracle performance on GSM8K, MATH, MBPP, Sudoku, and Countdown, at a marginal cost in inference latency.

Key takeaway

Machine learning engineers optimizing in-context learning for diffusion LLMs should move beyond static, AR-style trailing query placement. Routing the query by task type can significantly improve performance, especially for global-perception tasks such as Sudoku, which benefit from prefix placement, and under constrained generation budgets. The Auto-ICL strategy uses Average Confidence (C̄) to find the best query position adaptively, keeping performance robust and efficient across diverse reasoning and perception tasks.

Key insights

Query placement is a first-order variable in dLLM In-Context Learning, impacting performance as much as example selection.

Method

Auto-ICL routes each query dynamically: it enumerates candidate query positions, runs dLLM decoding passes to compute Average Confidence (C̄) for each one, and selects the placement that maximizes generation stability.
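
The routing loop described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's implementation: the summary does not give the exact formula for C̄, so the sketch assumes it is the mean over decoding steps of the mean per-token confidence. The names build_prompt, average_confidence, auto_icl, and the decode callable are hypothetical; decode stands in for whatever dLLM inference code returns per-step token confidences, including how it positions the masked generation region.

```python
from statistics import mean
from typing import Callable, Dict, Iterable, List, Optional, Sequence, Tuple

# Hypothetical decoder interface (an assumption, not a real library API):
# given a prompt, return one list of token confidences per decoding step.
StepConfidences = List[List[float]]
Decoder = Callable[[str], StepConfidences]


def build_prompt(examples: Sequence[str], query: str, position: int) -> str:
    """Place the query before examples[position].

    position == 0 is a prefix query; position == len(examples) is the
    AR-style trailing query.
    """
    if not 0 <= position <= len(examples):
        raise ValueError("position out of range")
    parts = list(examples[:position]) + [query] + list(examples[position:])
    return "\n\n".join(parts)


def average_confidence(steps: StepConfidences) -> float:
    """Assumed definition of C-bar: mean over steps of mean token confidence."""
    per_step = [mean(step) for step in steps if step]
    return mean(per_step) if per_step else 0.0


def auto_icl(
    examples: Sequence[str],
    query: str,
    decode: Decoder,
    positions: Optional[Iterable[int]] = None,
) -> Tuple[int, Dict[int, float]]:
    """Score every candidate query position and return the best one."""
    if positions is None:
        positions = range(len(examples) + 1)
    scores = {
        p: average_confidence(decode(build_prompt(examples, query, p)))
        for p in positions
    }
    # Ties resolve to the first candidate enumerated.
    best = max(scores, key=scores.get)
    return best, scores
```

Calling auto_icl(examples, query, decode) returns the selected position together with the per-position scores, so the choice can be inspected. Each candidate costs one decoding pass in this sketch; restricting positions to a small set, such as prefix and trailing only, is one way to keep the overhead low.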

Best for: Research Scientist, AI Engineer, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.