WRIT: Write-Read Intensive Trajectory Synthesis for Multi-Turn User-Facing Agents
Summary
WRIT (Write-Read Intensive Trajectory Synthesis) is a pipeline for generating complex training trajectories for multi-turn user-facing agents. It addresses a gap in existing methods, which focus mainly on "write-intensive" tasks that train sequential execution and neglect the "read-heavy" challenge of gathering and comparing substantial evidence before making a single decision. WRIT synthesizes tasks along two complexity axes: the number of write decisions and the evidence burden per decision. The pipeline generates both write-intensive and read-heavy tasks, diversifies user behavior instructions, and simulates agent-user interactions in an executable environment. A 4B model trained on only 2K WRIT-synthesized trajectories exceeded GPT-5.1 no-think on τ²-bench while significantly reducing inference-time token usage. This suggests that compact Supervised Fine-Tuning (SFT) data can convert expensive test-time reasoning into efficient agent behavior.
Key takeaway
Machine learning engineers building multi-turn user-facing agents should consider WRIT's trajectory synthesis approach. Training data should explicitly cover both "write-intensive" sequential tasks and "read-heavy" evidence-gathering scenarios, so that agents learn to execute long action sequences and to weigh evidence before committing to a decision. The reported results indicate this can improve performance on benchmarks like τ²-bench and reduce inference-time token usage.
Key insights
WRIT synthesizes agent training trajectories that vary both sequential-execution complexity and evidence-gathering complexity, with the goal of robust decision-making.
Principles
- Even a single write decision can be difficult when it requires gathering and comparing substantial evidence first.
- Training data should reflect realistic conversational variation.
- Compact SFT data can convert test-time reasoning into efficient behavior.
Method
WRIT generates write-intensive and read-heavy tasks, diversifies user behavior instructions, then simulates agent-user interactions in an executable environment.
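The three stages above can be pictured with a toy sketch. This is only an illustration under stated assumptions, not the authors' implementation: `TaskSpec`, `synthesize_tasks`, `diversify_users`, `simulate`, and the user style names are all hypothetical, and the "environment" is a stub that records read and write steps instead of executing real tools.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TaskSpec:
    # All fields are hypothetical names chosen for this illustration.
    num_writes: int              # axis 1: number of write decisions in the task
    reads_per_write: int         # axis 2: evidence lookups needed before each write
    user_style: str = "neutral"  # stand-in for a user behavior instruction


def synthesize_tasks(write_levels, read_levels):
    """Stage 1: enumerate task specs across both complexity axes."""
    return [TaskSpec(w, r) for w in write_levels for r in read_levels]


def diversify_users(tasks, styles, rng):
    """Stage 2: attach a randomly chosen user behavior style to each task."""
    return [replace(t, user_style=rng.choice(styles)) for t in tasks]


def simulate(task):
    """Stage 3: toy rollout in which every write follows its evidence reads."""
    steps = [("user", f"request ({task.user_style})")]
    for w in range(task.num_writes):
        for r in range(task.reads_per_write):
            steps.append(("read", f"lookup_{w}_{r}"))
        steps.append(("write", f"commit_{w}"))
    steps.append(("agent", "final confirmation"))
    return steps


# Low/high settings on each axis give four task shapes, from a simple
# single-write task to a long, evidence-heavy one.
tasks = synthesize_tasks(write_levels=[1, 3], read_levels=[1, 4])
tasks = diversify_users(tasks, ["terse", "chatty", "impatient"], random.Random(0))
trajectories = [simulate(t) for t in tasks]
```

In this sketch a read-heavy task is one with a small `num_writes` and a large `reads_per_write`, while a write-intensive task is the reverse; a real pipeline would replace `simulate` with an agent and a simulated user interacting in an executable environment.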
In practice
- Train agents to carry out longer sequences of write decisions within a task.
- Develop agents for robust, evidence-grounded decision making.
- Reduce inference-time token usage with targeted SFT data.
Topics
- Multi-turn Agents
- Trajectory Synthesis
- Agent Training
- Evidence-Grounded Decision Making
- Large Language Models
- Supervised Fine-Tuning
Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.