LLM-as-Code Agentic Programming for Agent Harness

2026-06-02 · Source: cs.SE updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

Agentic Programming and LLM-as-Code are proposed as a solution to inherent reliability issues in current LLM agent frameworks, which typically assign deterministic control flow to probabilistic LLMs. This architectural flaw leads to problems like token explosion, control-flow hallucination, and unreliable task completion. The new paradigm shifts control flow to the program and treats the LLM as an "LLM-as-Code" component invoked only for specific reasoning or generation tasks. It features a code-driven workflow and a Directed Acyclic Graph (DAG)-structured context that bounds context length by call depth rather than by step count. It also facilitates multi-agent collaboration and supports self-programmed evolution, in which improvements are committed as durable code. Empirical evidence from a GUI automation agent on the OSWorld benchmark demonstrates its practicality: the agent achieves an 86.8% success rate in 15 steps, surpassing the strongest prior system's 80.4% in 100 steps.

Key takeaway

Machine Learning Engineers building LLM agents for structured, long-horizon tasks should reconsider the common LLM-as-orchestrator paradigm, which the paper argues inherently leads to unreliability and context overflow. Instead, adopt Agentic Programming, in which the program manages deterministic control flow and invokes LLMs as adaptive components for reasoning. The paper presents this as a way to keep the agent on its intended workflow, bound context, and improve overall stability, citing an 86.8% success rate on OSWorld.

Key insights

Assigning deterministic control flow to probabilistic LLMs causes inherent agent unreliability; programs should manage control, invoking LLMs only for reasoning.

Principles

Deterministic control flow must be program-managed.
Probabilistic LLMs excel at reasoning, not orchestration.
Context should be bounded by call depth, not steps.
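
The third principle can be illustrated with a small, model-free simulation. This is only a sketch under stated assumptions, not the paper's implementation: the names FlatTranscript, CallStackContext, and run are hypothetical, and the sketch assumes a call's notes are discarded when it returns, so that only frames of currently active calls remain visible.

```python
from typing import List, Tuple


class FlatTranscript:
    """Baseline: every step appends to one ever-growing history."""

    def __init__(self) -> None:
        self.lines: List[str] = []

    def record(self, text: str) -> None:
        self.lines.append(text)


class CallStackContext:
    """Hypothetical depth-scoped context: one frame per active call."""

    def __init__(self) -> None:
        self.frames: List[List[str]] = []

    def enter(self, name: str) -> None:
        self.frames.append([f"enter {name}"])

    def note(self, text: str) -> None:
        self.frames[-1].append(text)

    def leave(self) -> None:
        # Assumption: the frame's notes are dropped on return; only the
        # call's return value (held by the program) would survive.
        self.frames.pop()

    def visible(self) -> List[str]:
        # What a model call would see: frames of active calls only.
        return [line for frame in self.frames for line in frame]


def run(steps: int, notes_per_step: int) -> Tuple[int, int]:
    """Return (flat transcript size, peak size of the scoped context)."""
    flat, scoped = FlatTranscript(), CallStackContext()
    scoped.enter("task")
    peak = 0
    for i in range(steps):
        scoped.enter(f"step_{i}")
        for j in range(notes_per_step):
            flat.record(f"step {i} note {j}")
            scoped.note(f"step {i} note {j}")
        peak = max(peak, len(scoped.visible()))
        scoped.leave()
    return len(flat.lines), peak


flat_size, scoped_peak = run(steps=50, notes_per_step=3)
print(flat_size, scoped_peak)
```

In this toy setup the flat transcript grows linearly with the number of steps, while the scoped context's peak size depends only on how many calls are nested at once and how much each active frame holds.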

Method

Implement agent workflows with ordinary code for control flow, invoking LLMs as "LLM-as-Code" components for reasoning or generation within specific function calls. This creates a DAG-structured context.
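
A minimal sketch of this method, assuming a stand-in model so the example runs offline. The names stub_llm and triage, the ticket-triage task, and the prompt strings are all hypothetical illustrations and do not come from the paper. The point is only that the loop, the branch, and the output validation live in ordinary code, while the model is called for two narrow judgments.

```python
from typing import Callable, Dict, List

LLM = Callable[[str], str]


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption: returns plain text)."""
    if prompt.startswith("classify:"):
        return "bug" if "crash" in prompt else "feature"
    return "summary of " + prompt.split(":", 1)[1].strip()


def triage(tickets: List[str], llm: LLM = stub_llm) -> Dict[str, List[str]]:
    """Program-owned workflow: the model never decides what runs next."""
    buckets: Dict[str, List[str]] = {"bug": [], "feature": [], "unknown": []}
    for ticket in tickets:  # sequencing is ordinary code
        label = llm(f"classify: {ticket}").strip().lower()
        if label not in ("bug", "feature"):  # the program validates the reply
            label = "unknown"
        summary = llm(f"summarize: {ticket}")
        buckets[label].append(summary)
    return buckets


result = triage(["App crash on login", "Add dark mode"])
print(result)
```

Because the model's reply is checked against a fixed set of labels, an unexpected reply falls into the "unknown" bucket instead of redirecting the workflow.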

In practice

Use code for loops, branches, and sequencing.
Wrap LLM calls in Python functions with decorators.
Commit agent improvements as durable code.
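
One way the decorator idea might look is sketched below. The source does not specify the decorator's interface, so llm_call, echo_backend, CALL_LOG, and the two example functions are assumptions made for illustration. Here the wrapped function builds the prompt, and the wrapper sends it to a backend, logs the exchange, and parses the reply into a typed value.

```python
import functools
from typing import Callable, List, Tuple

# Hypothetical trace format: (function name, prompt, raw reply).
CALL_LOG: List[Tuple[str, str, str]] = []


def echo_backend(prompt: str) -> str:
    """Stand-in for a real model client (assumption): canned replies only."""
    return "3" if "count" in prompt.lower() else prompt.upper()


def llm_call(parse: Callable[[str], object] = str,
             backend: Callable[[str], str] = echo_backend):
    """Hypothetical decorator: the wrapped function returns a prompt,
    and the wrapper returns the parsed model reply."""
    def decorate(build_prompt):
        @functools.wraps(build_prompt)
        def wrapper(*args, **kwargs):
            prompt = build_prompt(*args, **kwargs)
            reply = backend(prompt)
            CALL_LOG.append((build_prompt.__name__, prompt, reply))
            return parse(reply)  # raises if the reply is malformed
        return wrapper
    return decorate


@llm_call(parse=int)
def count_buttons(screen):
    # The body only builds the prompt; callers receive an int.
    return f"Count the buttons on this screen: {screen}"


@llm_call()
def describe(screen):
    return f"Describe: {screen}"


n = count_buttons("[OK] [Cancel] [Help]")
text = describe("home")
print(n, text)
```

Callers treat count_buttons like any other Python function returning an int, so it can sit inside loops and branches, and a malformed reply surfaces as an ordinary exception the program can handle.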

Topics

LLM Agents
Agentic Programming
LLM-as-Code
Control Flow
Context Management
Software Engineering Agents
OSWorld Benchmark

Code references

langchain-ai/langgraph

Best for: Research Scientist, AI Engineer, AI Scientist, Machine Learning Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.