LLM-as-Code Agentic Programming for Agent Harness
Summary
Agentic Programming and LLM-as-Code are proposed as a solution to inherent reliability issues in current LLM agent frameworks, which typically assign deterministic control flow to probabilistic LLMs. This architectural flaw leads to token explosion, control-flow hallucination, and unreliable task completion. The new paradigm moves control flow into the program and invokes the LLM as an "LLM-as-Code" component only for specific reasoning or generation tasks. It features a code-driven workflow and a Directed Acyclic Graph (DAG)-structured context that bounds context length by call depth; it also facilitates multi-agent collaboration and supports self-programmed evolution, in which improvements are committed as durable code. Empirical evidence from a GUI automation agent on the OSWorld benchmark demonstrates its practicality: it achieves an 86.8% success rate in 15 steps, surpassing the strongest prior system's 80.4% in 100 steps.
Key takeaway
Machine Learning Engineers building LLM agents for structured, long-horizon tasks should reconsider the common LLM-as-orchestrator paradigm, which makes agents unreliable and prone to context overflow. Instead, adopt Agentic Programming: the program manages deterministic control flow and invokes LLMs as adaptive components for reasoning. This keeps the agent compliant with the intended workflow, bounds context, and improves overall stability, as demonstrated by an 86.8% success rate on OSWorld.
Key insights
Assigning deterministic control flow to probabilistic LLMs causes inherent agent unreliability; programs should manage control, invoking LLMs only for reasoning.
Principles
- Deterministic control flow must be program-managed.
- Probabilistic LLMs excel at reasoning, not orchestration.
- Context should be bounded by call depth, not steps.
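One way to picture the third principle is a tree of call frames in which an LLM call sees only the notes on the path from the root to the current frame. The sketch below is an illustrative assumption, not the paper's implementation: `Frame`, `visible_context`, and `run_steps` are hypothetical names, and results are assumed to flow back to the program as ordinary return values rather than being appended to a shared transcript.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Frame:
    """Hypothetical call frame holding the notes produced inside one call."""
    name: str
    parent: Optional["Frame"] = None
    notes: List[str] = field(default_factory=list)

    def visible_context(self) -> List[str]:
        """Collect notes on the path root -> this frame; siblings are excluded."""
        chain = []
        node = self
        while node is not None:
            chain.append(node)
            node = node.parent
        lines: List[str] = []
        for frame in reversed(chain):
            lines.extend(f"{frame.name}: {note}" for note in frame.notes)
        return lines


def run_steps(n: int) -> List[int]:
    """Run n sibling steps under one root and record each step's context size."""
    root = Frame("task")
    root.notes.append("goal: fill the form")
    sizes = []
    for i in range(n):
        step = Frame(f"step{i}", parent=root)
        step.notes.append("observation")
        step.notes.append("decision")
        sizes.append(len(step.visible_context()))
        # The step frame is discarded here, so its notes never reach a sibling.
    return sizes


if __name__ == "__main__":
    sizes = run_steps(100)
    print(min(sizes), max(sizes))
```

Under these assumptions the visible context at step 99 is the same size as at step 0, and it grows only when a call nests deeper.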
Method
Implement agent workflows with ordinary code for control flow, invoking LLMs as "LLM-as-Code" components for reasoning or generation within specific function calls. This creates a DAG-structured context.
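A minimal sketch of this method, under stated assumptions, might look like the following. The ticket-triage task, `stub_llm`, `classify`, `summarize`, and `triage` are all hypothetical and chosen only for illustration; `stub_llm` is a canned stand-in so the example runs offline, not a real model client.

```python
from typing import Callable, Dict, List, Optional


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call; returns canned answers."""
    if prompt.startswith("CLASSIFY:"):
        return "bug" if "crash" in prompt.lower() else "question"
    return "Summary: " + prompt.split(":", 1)[1].strip()[:40]


def classify(ticket: str, llm: Callable[[str], str] = stub_llm) -> str:
    """LLM call confined to one narrow reasoning task."""
    label = llm(f"CLASSIFY: {ticket}").strip().lower()
    # The program, not the model, enforces the set of allowed outputs.
    return label if label in {"bug", "question"} else "question"


def summarize(ticket: str, llm: Callable[[str], str] = stub_llm) -> str:
    """LLM call confined to one narrow generation task."""
    return llm(f"SUMMARIZE: {ticket}")


def triage(tickets: List[str]) -> List[Dict[str, Optional[str]]]:
    """Ordinary code owns the loop and the branch."""
    results: List[Dict[str, Optional[str]]] = []
    for ticket in tickets:
        label = classify(ticket)
        if label == "bug":
            results.append({"label": label, "summary": summarize(ticket)})
        else:
            results.append({"label": label, "summary": None})
    return results


if __name__ == "__main__":
    for row in triage(["App crash on login", "How do I export?"]):
        print(row)
```

The model never decides which step runs next; it only answers the question each function poses, so a malformed answer cannot derail the sequence.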
In practice
- Use code for loops, branches, and sequencing.
- Wrap LLM calls in Python functions with decorators.
- Commit agent improvements as durable code.
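The decorator practice above can be sketched as follows. This is an assumption about what such a wrapper could look like, not the API of any particular framework: `llm_function`, `stub_backend`, and `shout` are hypothetical names, and the backend is a stub so the example runs without a model.

```python
import functools
from typing import Callable, Optional


def stub_backend(prompt: str) -> str:
    """Hypothetical stand-in for a model call; simply upper-cases the prompt."""
    return prompt.upper()


def llm_function(
    backend: Callable[[str], str] = stub_backend,
    max_retries: int = 2,
    validate: Optional[Callable[[str], bool]] = None,
):
    """Turn a prompt-building function into a validated, retried LLM call."""

    def decorator(fn: Callable[..., str]) -> Callable[..., str]:
        @functools.wraps(fn)
        def wrapper(*args, **kwargs) -> str:
            prompt = fn(*args, **kwargs)  # the function body builds the prompt
            for _ in range(max_retries + 1):
                output = backend(prompt)
                # Code decides whether the answer is acceptable.
                if validate is None or validate(output):
                    return output
            raise ValueError(f"{fn.__name__}: no valid output after retries")

        return wrapper

    return decorator


@llm_function(validate=lambda s: s.isupper())
def shout(text: str) -> str:
    return f"repeat loudly: {text}"


if __name__ == "__main__":
    print(shout("ok"))
```

Because the wrapped call is an ordinary Python function, an improvement such as a stricter `validate` check can be reviewed and committed like any other code change.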
Topics
- LLM Agents
- Agentic Programming
- LLM-as-Code
- Control Flow
- Context Management
- Software Engineering Agents
- OSWorld Benchmark
Best for: Research Scientist, AI Engineer, AI Scientist, Machine Learning Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.