Retrospective Harness Optimization: Improving LLM Agents via Self-Preference over Trajectory Rollouts

2026-06-04 · Source: Takara TLDR - Daily AI Papers · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

Retrospective Harness Optimization (RHO) is a self-supervised method for improving AI agent harnesses without ground-truth validation sets, which are often hard to obtain in practical deployments. RHO selects a diverse coreset of challenging tasks from past agent trajectories and re-solves them in parallel. The agent then analyzes these rollouts using self-validation and self-consistency, generates candidate harness updates, and selects the one it judges most effective through its own pairwise self-preference. Evaluated across software engineering, technical work, and knowledge work domains, RHO raised the pass rate on SWE-Bench Pro from 59% to 78% in a single optimization round, without external grading. The resulting updates target prior failure modes, change agent behavior patterns, and sustain higher accuracy during long-horizon sessions.

Key takeaway

For AI engineers and machine learning scientists who continuously improve deployed LLM agents, RHO offers a way to raise agent performance and adapt to new tasks without the cost and difficulty of acquiring labeled ground-truth validation data. Consider RHO for automating agent refinement, addressing persistent failure modes, and sustaining accuracy in long-horizon operational sessions.

Key insights

RHO uses self-preference over past trajectories to improve AI agent harnesses without external validation.

Principles

Self-supervision can optimize agent performance.
Past failures inform future agent improvements.
Self-preference guides harness update selection.
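
The third principle can be illustrated with a small selection routine. This is a minimal sketch, not the authors' implementation: the summary does not say how pairwise preferences are aggregated, so the round-robin win count, the both-orders comparison, and the names `select_by_pairwise_preference`, `prefer`, and `toy_judge` are all assumptions made for illustration. In an RHO-style setup, `prefer` would be the agent itself comparing two candidate harness updates.

```python
from itertools import combinations
from typing import Callable, Sequence


def select_by_pairwise_preference(
    candidates: Sequence[str],
    prefer: Callable[[str, str], int],
) -> str:
    """Return the candidate with the most pairwise wins.

    `prefer(a, b)` is a hypothetical judge that returns 0 if it prefers
    `a` and 1 if it prefers `b`.
    """
    if not candidates:
        raise ValueError("need at least one candidate")

    wins = [0] * len(candidates)
    for i, j in combinations(range(len(candidates)), 2):
        # Ask in both orders to reduce position bias in the judge (assumption).
        if prefer(candidates[i], candidates[j]) == 0:
            wins[i] += 1
        else:
            wins[j] += 1
        if prefer(candidates[j], candidates[i]) == 0:
            wins[j] += 1
        else:
            wins[i] += 1

    # max() keeps the first maximal index, so ties go to the earliest candidate.
    best = max(range(len(candidates)), key=wins.__getitem__)
    return candidates[best]


def toy_judge(a: str, b: str) -> int:
    """Stand-in judge for demonstration only: prefers the longer text."""
    return 0 if len(a) >= len(b) else 1


chosen = select_by_pairwise_preference(["a", "abc", "ab"], toy_judge)
```

A round-robin needs a number of judge calls that grows quadratically with the candidate count, which is tolerable for a handful of harness updates but would call for a different scheme with many candidates.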

Method

RHO selects diverse, challenging tasks from past trajectories, re-solves them, applies self-validation and self-consistency, then uses pairwise self-preference to select the most effective harness update.
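
The first step of the method, picking a diverse set of challenging tasks, could be sketched as a greedy selection. The summary does not specify how diversity or difficulty is measured, so everything here is an assumption: tasks are represented by hypothetical embedding vectors, difficulty is a hypothetical score in [0, 1] (for example, a past failure rate), and the scoring rule multiplies difficulty by distance to the nearest already-selected task.

```python
import math
from typing import List, Sequence


def select_coreset(
    embeddings: Sequence[Sequence[float]],
    difficulty: Sequence[float],
    k: int,
) -> List[int]:
    """Greedily pick up to k task indices that are both hard and spread out."""
    n = len(embeddings)
    if n != len(difficulty):
        raise ValueError("embeddings and difficulty must have the same length")
    if n == 0 or k <= 0:
        return []
    k = min(k, n)

    # Seed with the hardest task.
    first = max(range(n), key=lambda i: difficulty[i])
    selected = [first]
    # Distance from every task to its nearest selected task.
    min_dist = [math.dist(embeddings[i], embeddings[first]) for i in range(n)]

    while len(selected) < k:
        remaining = [i for i in range(n) if i not in selected]
        # Favor tasks that are difficult and far from what is already chosen.
        nxt = max(remaining, key=lambda i: difficulty[i] * min_dist[i])
        selected.append(nxt)
        for i in range(n):
            d = math.dist(embeddings[i], embeddings[nxt])
            min_dist[i] = min(min_dist[i], d)
    return selected


# Two clusters of tasks; the sketch should pick one hard task from each.
toy_embeddings = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.0, 5.1)]
toy_difficulty = [0.9, 0.8, 0.7, 0.1]
coreset = select_coreset(toy_embeddings, toy_difficulty, k=2)
```

In the toy data, the second pick skips the near-duplicate of the first task even though it is harder than the task in the other cluster, which is the trade-off between challenge and diversity that the method description implies.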

In practice

Apply RHO to improve LLM agents.
Use self-validation for agent feedback.
Optimize agent behavior on prior failures.
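
As one way to turn parallel rollouts into a feedback signal without labels, the sketch below computes a simple self-consistency score: the share of rollouts that agree with the most common outcome. This is an illustrative assumption rather than the paper's procedure. It treats each rollout's final outcome as a plain string compared by exact equality, whereas real agent outputs would likely need normalization first, and the name `self_consistency` is hypothetical.

```python
from collections import Counter
from typing import Sequence, Tuple


def self_consistency(outcomes: Sequence[str]) -> Tuple[str, float]:
    """Return the majority outcome and the fraction of rollouts that match it."""
    if not outcomes:
        raise ValueError("need at least one rollout outcome")
    counts = Counter(outcomes)
    top, n = counts.most_common(1)[0]
    return top, n / len(outcomes)


# Four rollouts of the same task; three converge on the same outcome.
majority, agreement = self_consistency(
    ["patch_a", "patch_a", "patch_b", "patch_a"]
)
```

Low agreement on a task suggests the agent's behavior there is unstable, which makes that task a reasonable candidate for closer analysis when proposing harness updates.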

Topics

LLM Agents
Self-Supervised Learning
Harness Optimization
SWE-Bench Pro
Trajectory Rollouts
AI Agent Improvement

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.