Retrospective Harness Optimization: Improving LLM Agents via Self-Preference over Trajectory Rollouts
Summary
Retrospective Harness Optimization (RHO) is a self-supervised method for improving AI agent harnesses without ground-truth validation sets, which are often difficult to obtain in practical deployments. RHO selects a diverse coreset of challenging tasks from past agent trajectories and re-solves them in parallel. The agent then analyzes these rollouts using self-validation and self-consistency to generate candidate harness updates, and picks among them by its own pairwise self-preference. RHO was evaluated across software engineering, technical work, and knowledge work domains. On SWE-Bench Pro, a single optimization round raised the pass rate from 59% to 78% without external grading. The method targets prior failure modes, changes agent behavior patterns, and sustains higher accuracy during long-horizon sessions.
Key takeaway
For AI engineers and machine learning scientists who maintain LLM agents in deployment, Retrospective Harness Optimization (RHO) offers a way to improve agent performance and adapt to new tasks without the cost and difficulty of acquiring labeled ground-truth validation data. Consider RHO for automating harness refinement, addressing persistent failure modes, and sustaining higher accuracy in long-horizon operational sessions.
Key insights
RHO uses self-preference over past trajectories to improve AI agent harnesses without external validation.
Principles
- Self-supervision can optimize agent performance.
- Past failures inform future agent improvements.
- Self-preference guides harness update selection.
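The third principle can be illustrated with a small round-robin comparison. This is a minimal sketch, not the paper's procedure: the summary says only that selection uses pairwise self-preference, so the win-count aggregation, the both-orderings query, and the `prefer` judge function are all assumptions made for illustration.

```python
from itertools import combinations
from typing import Callable, Sequence


def select_by_pairwise_preference(
    candidates: Sequence[str],
    prefer: Callable[[str, str], int],
) -> str:
    """Return the candidate harness update with the most pairwise wins.

    `prefer(a, b)` is a hypothetical judge (for example, the agent itself
    comparing two updates). It returns 0 if `a` is preferred, 1 if `b` is.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    wins = [0] * len(candidates)
    for i, j in combinations(range(len(candidates)), 2):
        # Ask in both orders so a position-biased judge cancels out.
        first = prefer(candidates[i], candidates[j])
        second = prefer(candidates[j], candidates[i])
        wins[i] += (first == 0) + (second == 1)
        wins[j] += (first == 1) + (second == 0)
    # Ties resolve to the earliest candidate in the list.
    best = max(range(len(candidates)), key=lambda k: (wins[k], -k))
    return candidates[best]
```

In use, `prefer` would wrap a model call that reads two candidate updates alongside the rollout evidence. Any deterministic stand-in, such as a function that prefers the longer string, is enough to exercise the tallying logic.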
Method
RHO selects diverse, challenging tasks from past trajectories, re-solves them, applies self-validation and self-consistency, then uses pairwise self-preference to select the most effective harness update.
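The first stages of this pipeline can be sketched as follows. The summary does not say how diversity, difficulty, or self-consistency are computed, so everything here is an assumption for illustration: tasks are represented by hypothetical embedding vectors and difficulty scores, the coreset is picked by a greedy farthest-point heuristic, and self-consistency is a simple majority vote over rollout answers. The final pairwise selection step is not shown.

```python
from collections import Counter
from math import dist
from typing import List, Sequence, Tuple


def select_coreset(
    embeddings: Sequence[Sequence[float]],
    difficulty: Sequence[float],
    k: int,
) -> List[int]:
    """Greedily pick up to k task indices that are hard and spread out.

    Assumption: each past task has an embedding and a difficulty score
    (higher means harder). Neither is specified by the source.
    """
    if len(embeddings) != len(difficulty):
        raise ValueError("embeddings and difficulty must align")
    n = len(embeddings)
    k = min(k, n)
    if k <= 0:
        return []
    # Seed with the hardest task.
    chosen = [max(range(n), key=lambda i: difficulty[i])]
    while len(chosen) < k:
        def score(i: int) -> float:
            # Distance to the nearest chosen task, weighted by difficulty.
            nearest = min(dist(embeddings[i], embeddings[c]) for c in chosen)
            return nearest * difficulty[i]

        remaining = [i for i in range(n) if i not in chosen]
        chosen.append(max(remaining, key=score))
    return chosen


def self_consistency(answers: Sequence[str]) -> Tuple[str, float]:
    """Return the majority answer across rollouts and its agreement rate."""
    if not answers:
        raise ValueError("need at least one rollout answer")
    answer, count = Counter(answers).most_common(1)[0]
    return answer, count / len(answers)
```

A low agreement rate on a coreset task is one plausible signal that the harness behaves unreliably there, which is the kind of evidence a candidate update could be written against.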
In practice
- Apply RHO to improve a deployed LLM agent's harness from its own past trajectories.
- Use self-validation for agent feedback.
- Optimize agent behavior on prior failures.
Topics
- LLM Agents
- Self-Supervised Learning
- Harness Optimization
- SWE-Bench Pro
- Trajectory Rollouts
- AI Agent Improvement
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.