Formal Skill: Programmable Runtime Skills for Efficient and Accurate LLM Agents

Source: cs.AI updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Software Development & Engineering, Robotics & Autonomous Systems) · Depth: Advanced, extended

Summary

Formal Skill is a runtime-native abstraction that formalizes the reusable capabilities of Large Language Model (LLM) agents to improve their efficiency and accuracy. Where existing informal skills rely on natural-language prompts, a Formal Skill represents a procedure as a structured executable object with JSON metadata, action schemas, reliable Python executors, hook-governed control logic, and skill-local runtime state. This reduces token consumption and enforces operational semantics. The abstraction is implemented in FairyClaw, an open-source event-driven runtime. Evaluated on Harness-Bench, FairyClaw achieved a competitive average score of 0.690, ranking third overall, and ranked first in the gpt-5.4 group with a score of 0.746. It also used substantially fewer tokens: an average of 49.0K per task, approximately 48% lower than the mean of the other harnesses and 33% lower than the next most token-efficient system. A case study with CodeRepairOps, a code-repair skill, demonstrated its effectiveness on procedural tasks that require controlled actions and verification.

Key takeaway

For AI Architects designing LLM agent systems, adopting Formal Skill can significantly improve operational efficiency and reliability. You should consider implementing runtime-native, programmable skills with structured interfaces and executable policies to reduce token costs and enforce procedural invariants. This approach ensures agents follow explicit workflows, validate actions, and manage recovery states effectively, moving beyond ambiguous natural-language instructions.
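One way to picture "enforcing procedural invariants" is a small state machine that refuses out-of-order actions and routes failed verification back to a recovery step. The sketch below is illustrative only: the phase names, the `RepairWorkflow` class, and the attempt budget are assumptions made for this example and are not taken from FairyClaw or CodeRepairOps.

```python
class WorkflowError(RuntimeError):
    """Raised when an action would violate the workflow invariant."""


# Hypothetical phases for a code-repair style workflow.
# The allowed transitions are the invariant the runtime enforces.
TRANSITIONS = {
    "start": {"inspect"},
    "inspect": {"patch"},
    "patch": {"verify"},
    "verify": {"patch", "done"},
    "done": set(),
}


class RepairWorkflow:
    def __init__(self, max_attempts=3):
        self.phase = "start"
        self.attempts = 0  # number of patch attempts made so far
        self.max_attempts = max_attempts

    def advance(self, next_phase):
        """Move to next_phase only if the transition table allows it."""
        if next_phase not in TRANSITIONS[self.phase]:
            raise WorkflowError(f"{self.phase} -> {next_phase} is not allowed")
        if next_phase == "patch":
            if self.attempts >= self.max_attempts:
                raise WorkflowError("patch budget exhausted")
            self.attempts += 1
        self.phase = next_phase

    def report_verification(self, passed):
        """Finish on success; otherwise recover by returning to 'patch'."""
        if self.phase != "verify":
            raise WorkflowError("verification reported outside the verify phase")
        self.advance("done" if passed else "patch")
```

Because the checks live in code and not in the prompt, an agent that tries to skip from `inspect` straight to `verify` gets an error instead of silently continuing.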

Key insights

Formal Skill transforms LLM agent capabilities from informal text prompts into token-efficient, enforceable runtime-native protocols.

Method

Formal Skill involves defining JSON metadata/schemas, Python executors, lifecycle hooks, skill-local runtime state, and routing metadata for agent capabilities.
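The components listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not FairyClaw's actual format or API: the metadata field names, the `FormalSkill` class, the `before`/`after` hook names, and the `word_count` skill are all hypothetical.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Callable

# Hypothetical JSON metadata: name, routing tags, and one action schema.
# Field names are illustrative and not taken from FairyClaw.
METADATA = json.loads("""
{
  "name": "word_count",
  "description": "Count words in a text snippet.",
  "routing": {"tags": ["text", "metrics"]},
  "actions": {
    "count": {"required": {"text": "str"}}
  }
}
""")

# Maps schema type names to Python types for validation.
TYPES = {"str": str, "int": int}


@dataclass
class FormalSkill:
    metadata: dict
    executors: dict[str, Callable[..., Any]]
    state: dict = field(default_factory=dict)  # skill-local runtime state

    def before(self, action, args):
        """Pre-execution hook: reject calls that violate the action schema."""
        schema = self.metadata["actions"].get(action)
        if schema is None:
            raise ValueError(f"unknown action: {action}")
        for name, type_name in schema["required"].items():
            if not isinstance(args.get(name), TYPES[type_name]):
                raise TypeError(f"{action}: '{name}' must be {type_name}")

    def after(self, action, result):
        """Post-execution hook: record the call in skill-local state."""
        self.state["calls"] = self.state.get("calls", 0) + 1
        self.state["last_result"] = result

    def invoke(self, action, **args):
        self.before(action, args)
        result = self.executors[action](**args)
        self.after(action, result)
        return result


def count_words(text: str) -> int:
    """Python executor for the hypothetical 'count' action."""
    return len(text.split())


skill = FormalSkill(METADATA, {"count": count_words})
```

In this sketch the model only has to emit an action name and arguments, for example `skill.invoke("count", text="formal skills save tokens")`. Validation, execution, and bookkeeping happen in the runtime, which is the general idea behind moving procedure out of the prompt.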

Best for: Research Scientist, AI Engineer, AI Scientist, Machine Learning Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.