Recursive Agent Harnesses

· Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Software Development & Engineering · Depth: Advanced, quick

Summary

Recursive Agent Harness (RAH) is a pattern that extends recursive language models (RLMs): instead of recursing over bare model calls, it recurses over full agent harnesses, each with filesystem tools, code execution, and planning. This "harness recursion" lets a parent agent generate and run executable scripts that spawn parallel subagent harnesses for fine-grained workloads, while using structured function calls for smaller subtasks. In controlled evaluations on Oolong-Synthetic (199 samples across 13 context-length buckets, up to 4M tokens), RAH raised the Codex coding-agent baseline from 71.75% to 81.36% with GPT-5, a gain of 9.61 percentage points. With Claude Sonnet 4.5 as the backbone, the same design achieved 89.77%.

Key takeaway

AI engineers building coding agents or tackling long-context reasoning should investigate the Recursive Agent Harness (RAH) pattern. On Oolong-Synthetic, RAH scored 81.36% with GPT-5 (up from a 71.75% Codex baseline) and 89.77% with Claude Sonnet 4.5. Consider harness recursion when a task can be split into fine-grained workloads that parallel subagents can process independently.

Key insights

Recursive Agent Harnesses extend model recursion with full agent capabilities for long-context reasoning and complex task execution.

Method

A parent agent generates and runs an executable script that spawns parallel subagent harnesses for fine-grained workloads; for smaller subtasks it uses structured function calls instead.
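The fan-out step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `run_subagent_harness`, `split_context`, and `parent_script` are hypothetical names, and the subagent is a local stub that labels lines rather than a real harness with tools and model calls.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class SubagentResult:
    chunk_index: int
    label_counts: dict


def split_context(lines, chunk_size):
    """Split a long context into fixed-size chunks, one per subagent."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]


def run_subagent_harness(chunk_index, chunk):
    """Hypothetical stand-in for launching a full subagent harness.

    A real harness would start an agent with its own tools and model calls.
    This stub only labels each line so the sketch runs offline.
    """
    counts = {}
    for line in chunk:
        label = "question" if line.rstrip().endswith("?") else "statement"
        counts[label] = counts.get(label, 0) + 1
    return SubagentResult(chunk_index, counts)


def parent_script(lines, chunk_size=2, max_workers=4):
    """The kind of script a parent agent might generate: fan out, then merge."""
    chunks = split_context(lines, chunk_size)
    # Run one subagent per chunk in parallel; map preserves chunk order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_subagent_harness, range(len(chunks)), chunks))
    # Aggregate the per-chunk counts into a single answer for the parent.
    total = {}
    for result in results:
        for label, n in result.label_counts.items():
            total[label] = total.get(label, 0) + n
    return total


if __name__ == "__main__":
    print(parent_script(["a?", "b", "c?", "d", "e"]))
```

The split between parallel workers and a final merge mirrors the summary's description: fine-grained work goes to independent subagents, and the parent only combines their structured outputs. A thread pool is one assumption among several possible; a real parent script might launch separate processes or containers instead.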

Best for: Research Scientist, AI Architect, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.