Self-Programmed Execution for Language-Model Agents

Source: cs.AI updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Software Development & Engineering, Robotics & Autonomous Systems) · Depth: Expert, extended

Summary

This paper introduces Self-Programmed Execution (SPE), an agent architecture in which the language model's completion itself acts as the orchestrator program, instead of a fixed external harness. The authors formalize SPE using "agentic machines" and implement it in Spell, a Lisp-based language designed for self-modifying, executable context. Spell lets programs edit and re-evaluate themselves safely: it prevents side effects from being replayed and manages turn-boundary interference. In evaluations with frontier models such as GPT-5.4 and Claude Opus 4.6 on coding benchmarks (Terminal-Bench 1.1, SWE-bench Lite) and orchestration games, existing models use Spell effectively for tasks such as context management and programmatic tool calling. They achieve competitive accuracy at lower token cost than traditional harnesses like Codex CLI, though performance varied on benchmarks like LongBench v2 and AppWorld.

Key takeaway

For AI architects and research scientists designing advanced LLM agents, the Self-Programmed Execution (SPE) paradigm, paired with a language like Spell, offers an alternative to fixed orchestration policies. Explore Spell's capabilities for dynamic context management and programmatic tool invocation, which can lead to more flexible and cost-efficient agents. Consider training models specifically for SPE, which may unlock more sophisticated self-orchestration strategies and improve performance on complex, long-horizon tasks.

Key insights

Self-programmed execution allows language models to dynamically orchestrate their own multi-turn behavior by generating executable code.

Principles

Method

In SPE, the language model generates a program, written in Spell, that dictates its own turn-to-turn transitions; a harness then evaluates this model-written code. Spell uses Lisp syntax and a trailing-expression pattern to manage context and to prevent side effects from being re-evaluated.
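The loop described above can be illustrated with a minimal Python sketch. It is not Spell, whose syntax this summary does not show, and it is not the authors' implementation. Every name here (`stub_model`, `write_file`, `TOOLS`, `run`, `side_effect_log`) is hypothetical, and reading the trailing-expression pattern as "only the newest expression is evaluated, earlier ones are kept as stored results" is an assumption made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy setup: none of these names come from the paper or from Spell.
Expr = Tuple[str, ...]            # ("call", tool_name, arg) or ("done", answer)
Context = List[Tuple[Expr, str]]  # already-evaluated (expression, result) pairs

side_effect_log: List[str] = []   # records every real tool invocation


def write_file(arg: str) -> str:
    """Stand-in tool with an observable side effect."""
    side_effect_log.append(f"write:{arg}")
    return f"wrote {arg}"


TOOLS: Dict[str, Callable[[str], str]] = {"write_file": write_file}


def stub_model(context: Context) -> Expr:
    """Stand-in for the language model: reads the evaluated context and
    emits the next trailing expression."""
    if len(context) == 0:
        return ("call", "write_file", "a.txt")
    if len(context) == 1:
        return ("call", "write_file", "b.txt")
    return ("done", f"{len(context)} files written")


def run(model: Callable[[Context], Expr], max_turns: int = 10) -> str:
    """Evaluate model-written expressions one turn at a time."""
    context: Context = []
    for _ in range(max_turns):
        expr = model(context)           # the completion decides the next transition
        if expr[0] == "done":
            return expr[1]
        _, tool, arg = expr
        result = TOOLS[tool](arg)       # only the trailing expression is evaluated
        context.append((expr, result))  # earlier expressions are stored, never re-run
    raise RuntimeError("turn budget exhausted")
```

Calling `run(stub_model)` drives the stub through two tool calls and a final answer. Because earlier expressions are kept only as stored results, each side effect is logged once, no matter how many turns follow.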

Best for: AI Architect, Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.