OpenSkill: Open-World Self-Evolution for LLM Agents

Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

OpenSkill is a framework for open-world self-evolution of LLM agents: the agent builds both its skills and its verification signals from scratch using public resources such as documentation and web pages, with no target-task supervision. It bootstraps a learning loop by acquiring grounded knowledge and verification anchors, synthesizing them into transferable skills, and refining those skills against self-built virtual tasks. Across three benchmarks (SkillsBench, SocialMaze, ScienceWorld) and two target agents (Opus 4.6, GPT 5.2), OpenSkill achieved the best automated pass rate, with reported gains of 8.9 and 8.8 points over baselines. Its skills transfer across models without adaptation, and its self-built verifier aligns with ground-truth outcomes, covering 88.9% of test intents. The framework has three stages: open-world knowledge acquisition, leakage-free skill evolution, and zero-shot target evaluation.

Key takeaway

For AI engineers building LLM agents for dynamic, open-ended environments, OpenSkill offers a way to keep improving agents after deployment. Consider combining open-world knowledge acquisition with self-built verification so that agents can adapt without costly human-curated skills or target-task supervision. In the reported results, this approach yields skills that transfer across models, a practice environment whose verifier aligns with ground-truth outcomes, and higher pass rates than the baselines, all with less dependence on explicit feedback.

Key insights

OpenSkill enables LLM agents to self-evolve skills and verification signals using open-world data, free from target-task supervision.

Principles

Method

OpenSkill acquires knowledge and verification anchors from open-world resources, synthesizes them into skills, and iteratively refines these skills using virtual tests in a leakage-free environment.
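
The loop described above can be illustrated with a toy sketch. This is not the authors' implementation: every name here (`acquire`, `build_virtual_tasks`, `run_agent`, `evolve`, the `MUST mention` anchor convention, and the `DOCS` corpus) is a hypothetical stand-in, and the LLM agent is replaced by a function that simply echoes the skill's rules. It only shows the shape of the process: knowledge and anchors come from public documents, anchors become virtual tests, and the skill is refined until the virtual tests pass, without ever seeing a target task.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

ANCHOR_PREFIX = "MUST mention "  # hypothetical convention marking a verification anchor


@dataclass
class Skill:
    name: str
    rules: List[str] = field(default_factory=list)


@dataclass
class VirtualTask:
    keyword: str
    check: Callable[[str], bool]  # verifier derived from an anchor


def acquire(docs: List[str]) -> Tuple[List[str], List[str]]:
    """Split public documents into knowledge facts and verification anchors."""
    facts, anchors = [], []
    for doc in docs:
        for line in doc.splitlines():
            line = line.strip()
            if not line:
                continue
            (anchors if line.startswith(ANCHOR_PREFIX) else facts).append(line)
    return facts, anchors


def build_virtual_tasks(anchors: List[str]) -> List[VirtualTask]:
    """Turn each anchor into a self-built test; no target-task data is used."""
    tasks = []
    for anchor in anchors:
        kw = anchor[len(ANCHOR_PREFIX):]
        tasks.append(VirtualTask(keyword=kw, check=lambda out, k=kw: k in out))
    return tasks


def run_agent(skill: Skill, task: VirtualTask) -> str:
    """Stand-in for an LLM agent: its output is just the skill's rules."""
    return " ".join(skill.rules)


def evolve(docs: List[str], max_rounds: int = 5) -> Tuple[Skill, List[float]]:
    facts, anchors = acquire(docs)
    tasks = build_virtual_tasks(anchors)
    skill = Skill(name="draft", rules=facts[:1])  # initial synthesized skill
    history: List[float] = []
    if not tasks:
        return skill, history
    for _ in range(max_rounds):
        failed = [t for t in tasks if not t.check(run_agent(skill, t))]
        history.append(1 - len(failed) / len(tasks))  # virtual pass rate
        if not failed:
            break
        # Refine: add one acquired fact that addresses the first failing test.
        kw = failed[0].keyword
        extra = [f for f in facts if kw in f and f not in skill.rules]
        if not extra:
            break  # nothing in the acquired knowledge can fix this failure
        skill.rules.append(extra[0])
    return skill, history


# Hypothetical public documents: one fact line and one anchor line each.
DOCS = [
    "retry requests with exponential backoff\nMUST mention backoff",
    "cache responses keyed by URL\nMUST mention cache",
]

skill, history = evolve(DOCS)
print(history, skill.rules)
```

In this toy run the virtual pass rate rises across refinement rounds as the skill absorbs the fact needed by the failing test. In a real system the keyword checks would be replaced by richer verifiers and the echo agent by an actual model call.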

Best for: Research Scientist, NLP Engineer, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.