Toward a Science of Intent: Closure Gaps and Delegation Envelopes for Open-World AI Agents

Source: cs.AI updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Software Development & Engineering) · Depth: Expert, long

Summary

This conceptual framework introduces "intent compilation" for open-world AI agents and addresses why capable models often fail in institutional deployment despite their problem-solving abilities. The authors propose transforming partially specified human purpose into inspectable artifacts that bind execution, and they distinguish closed-world solvers from open-world agents. They formalize residual openness as a "closure-gap vector" across semantic, evidentiary, procedural, and institutional dimensions. The paper defines "delegation envelopes" as pre-authorized regions of action space and "misclosure" as a distinct failure mode in which a plausible output cannot be ratified because a contract is underspecified or violated. It also outlines benchmark metrics for testing when closure interventions are more effective than additional inference-time search, and it illustrates the four-contract stack with a travel rebooking example.
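To make the delegation-envelope idea concrete, here is a minimal Python sketch in the spirit of the travel rebooking example. It is not the paper's formalization: the class names, the three envelope fields (fare cap, allowed cabins, maximum delay), and the `ratify` helper are all hypothetical, chosen only to show an envelope as a pre-authorized region that a proposed action either falls inside or does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RebookingAction:
    """A proposed rebooking (hypothetical fields, for illustration only)."""
    fare_difference_usd: float
    cabin: str
    departs_within_hours: float


@dataclass(frozen=True)
class DelegationEnvelope:
    """A pre-authorized region of action space (assumed bounds, not from the paper)."""
    max_fare_difference_usd: float
    allowed_cabins: frozenset
    max_delay_hours: float

    def contains(self, action: RebookingAction) -> bool:
        # An action is inside the envelope only if every bound holds.
        return (
            action.fare_difference_usd <= self.max_fare_difference_usd
            and action.cabin in self.allowed_cabins
            and action.departs_within_hours <= self.max_delay_hours
        )


def ratify(action: RebookingAction, envelope: DelegationEnvelope) -> str:
    """Execute inside the envelope; otherwise hand the decision back to a human."""
    return "execute" if envelope.contains(action) else "escalate"


envelope = DelegationEnvelope(
    max_fare_difference_usd=150.0,
    allowed_cabins=frozenset({"economy"}),
    max_delay_hours=24.0,
)
cheap_swap = RebookingAction(fare_difference_usd=80.0, cabin="economy", departs_within_hours=6.0)
upgrade = RebookingAction(fare_difference_usd=400.0, cabin="business", departs_within_hours=6.0)
```

Under these assumed bounds, `cheap_swap` can be executed without further approval, while `upgrade` may be a perfectly plausible itinerary yet cannot be ratified. That second case is the kind of outcome the summary calls misclosure if the agent were to act on it anyway.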

Key takeaway

Research scientists developing AI agents for institutional deployment should prioritize systems that explicitly compile human intent into inspectable contracts. The framework's working hypothesis, which its proposed benchmark metrics are meant to test, is that reducing "closure gaps" across the semantic, evidentiary, procedural, and institutional dimensions improves time-to-authorized-action and compliance more cost-effectively than adding inference-time search, especially for high-risk tasks. A four-contract stack defines clear delegation envelopes, which supports auditable and safe autonomous operation.

Key insights

Intent compilation transforms human purpose into binding artifacts for open-world AI agents, addressing deployment failures.

Method

Intent compilation produces a contract tuple $(S_t, E_t, M_t, I_t)$ that specifies task semantics, evidence, method, and authority. The tuple exposes the residual closure gaps, which in turn indicate whether the agent should act autonomously, ask for clarification, or escalate.
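The contract tuple and its residual gaps can be sketched as a small data structure. The following Python is an illustrative assumption, not the authors' method: the pairing of each contract with one closure-gap dimension, the binary 0/1 gap values, and the routing rule in `route` are all hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Assumed one-to-one pairing of gap dimensions with (S_t, E_t, M_t, I_t).
DIMENSIONS = ("semantic", "evidentiary", "procedural", "institutional")


@dataclass
class ContractStack:
    """The tuple (S_t, E_t, M_t, I_t); None marks a contract not yet specified."""
    semantics: Optional[str]   # S_t: what the task means
    evidence: Optional[str]    # E_t: what counts as sufficient evidence
    method: Optional[str]      # M_t: which procedure is allowed
    authority: Optional[str]   # I_t: who has authorized the action


def closure_gap(stack: ContractStack) -> Dict[str, int]:
    """Return a binary gap per dimension: 1 if the contract is missing, else 0."""
    contracts = (stack.semantics, stack.evidence, stack.method, stack.authority)
    return {dim: 0 if c else 1 for dim, c in zip(DIMENSIONS, contracts)}


def route(stack: ContractStack) -> str:
    """Hypothetical routing rule: missing authority escalates, other gaps clarify."""
    gap = closure_gap(stack)
    if gap["institutional"]:
        return "escalate"
    if any(gap.values()):
        return "clarify"
    return "act"


full = ContractStack("rebook cancelled flight", "airline cancellation notice",
                     "same carrier, next available", "traveler pre-approval")
no_evidence = ContractStack("rebook cancelled flight", None,
                            "same carrier, next available", "traveler pre-approval")
no_authority = ContractStack("rebook cancelled flight", "airline cancellation notice",
                             "same carrier, next available", None)
```

A real closure-gap vector would presumably be graded rather than binary, and the thresholds for clarification versus escalation would depend on task risk; the sketch only shows how an explicit tuple makes those gaps inspectable before execution.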

Best for: Research Scientist, AI Scientist, AI Architect, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.