WRIT: Write-Read Intensive Trajectory Synthesis for Multi-Turn User-Facing Agents

2026-06-01 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, quick

Summary

WRIT (Write-Read Intensive Trajectory Synthesis) is a pipeline for generating complex training trajectories for multi-turn user-facing agents. It addresses a gap in existing methods, which focus primarily on "write-intensive" tasks that train sequential execution and neglect the "read-heavy" challenge of gathering and comparing substantial evidence before making a single decision. WRIT synthesizes tasks along two complexity axes: the number of write decisions and the evidence burden per decision. The pipeline generates both write-intensive and read-heavy tasks, diversifies user behavior instructions, and simulates agent-user interactions in an executable environment. A 4B model trained on only 2K WRIT-synthesized trajectories exceeded GPT-5.1 no-think on τ²-bench while significantly reducing inference-time token usage. The result suggests that compact Supervised Fine-Tuning (SFT) data can convert expensive test-time reasoning into efficient agent behavior.

Key takeaway

Machine learning engineers developing multi-turn user-facing agents should consider WRIT's trajectory synthesis approach. Training data should explicitly include both "write-intensive" sequential tasks and "read-heavy" evidence-gathering scenarios to build more robust and efficient agents. In the reported results, this approach improved agent performance on τ²-bench and reduced inference-time token usage, converting expensive test-time reasoning into efficient behavior.

Key insights

WRIT synthesizes agent training trajectories that cover both sequential-execution complexity and evidence-gathering complexity, with the aim of more robust decision-making.

Principles

Even a single write decision can be difficult when it requires gathering and comparing substantial evidence.
Training data should reflect realistic conversational variation.
Compact SFT data can convert test-time reasoning into efficient behavior.

Method

WRIT generates write-intensive and read-heavy tasks, diversifies user behavior instructions, then simulates agent-user interactions in an executable environment.
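
The two complexity axes described above can be pictured as a small task grid. The following is a minimal sketch, not WRIT's actual implementation: the names TaskSpec, task_kind, and build_task_grid, the threshold values, and the example user styles are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical task description; field names are illustrative only."""
    num_writes: int        # axis 1: number of write decisions in the task
    reads_per_write: int   # axis 2: evidence items to gather per decision
    user_style: str        # behavior instruction given to the simulated user


def task_kind(spec, write_threshold=3, read_threshold=4):
    """Label a task by which axis dominates. Thresholds are assumed values."""
    many_writes = spec.num_writes >= write_threshold
    many_reads = spec.reads_per_write >= read_threshold
    if many_writes and many_reads:
        return "write-intensive and read-heavy"
    if many_writes:
        return "write-intensive"
    if many_reads:
        return "read-heavy"
    return "simple"


def build_task_grid(write_counts, read_counts, user_styles):
    """Enumerate every combination of the two axes and the user styles."""
    return [
        TaskSpec(writes, reads, style)
        for writes, reads, style in product(write_counts, read_counts, user_styles)
    ]


if __name__ == "__main__":
    # Assumed example values: two settings per axis, two user styles.
    grid = build_task_grid([1, 4], [1, 6], ["terse", "talkative"])
    for spec in grid:
        print(task_kind(spec), spec)
```

The sketch covers only task enumeration and user-style diversification. The simulation step, in which an agent and a simulated user interact in an executable environment, is omitted because it depends on models and tooling that this summary does not describe.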

In practice

Train agents for longer task execution.
Develop agents for robust, evidence-grounded decision making.
Reduce inference-time token usage with targeted SFT data.

Topics

Multi-turn Agents
Trajectory Synthesis
Agent Training
Evidence-Grounded Decision Making
Large Language Models
Supervised Fine-Tuning

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.