SkillMAS: Skill Co-Evolution with LLM-based Multi-Agent System

Source: cs.MA updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems) · Depth: Expert, extended

Summary

SkillMAS is a non-parametric framework that specializes Large Language Model (LLM) agent systems after deployment, adapting the system without updating model weights. It targets "adaptation decoupling": skill evolution and multi-agent system (MAS) restructuring are usually treated as separate problems, which leads to bottlenecks and mis-specialization. SkillMAS couples the two adaptation targets through three mechanisms. Utility Learning assigns credit from verified execution traces. Bounded skill evolution refines reusable procedures without uncontrolled library growth. Evidence-gated MAS restructuring triggers only when retained failures and Executor Utility indicate a structural mismatch. The framework was evaluated on embodied manipulation (ALFWorld), command-line OS workflows (Lifelong Agent Bench OS Task), and retail workflows (τ-Bench Retail). Reported success rates are described as competitive, including 94.0% on ALFWorld's unseen split and 76.7% on Lifelong Agent Bench OS Task.
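
To make "credit from verified traces" and "bounded" library growth concrete, here is a minimal Python sketch. It is not the paper's implementation: the summary does not give the update rule or the bounding mechanism, so the moving-average credit rule, the neutral starting utility of 0.5, the fixed `capacity`, and the evict-lowest-utility policy are all assumptions made for illustration, as are the names `Skill`, `SkillLibrary`, `credit`, and `add`.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    utility: float = 0.5  # assumed neutral prior before any evidence arrives
    uses: int = 0


@dataclass
class SkillLibrary:
    capacity: int  # hypothetical hard bound on the number of stored skills
    skills: dict = field(default_factory=dict)

    def credit(self, name: str, verified_success: bool, lr: float = 0.5) -> None:
        """Move a skill's utility toward the verified outcome (assumed rule)."""
        skill = self.skills[name]
        target = 1.0 if verified_success else 0.0
        skill.utility += lr * (target - skill.utility)
        skill.uses += 1

    def add(self, skill: Skill) -> None:
        """Insert a skill; if the bound is exceeded, evict the lowest-utility one."""
        self.skills[skill.name] = skill
        if len(self.skills) > self.capacity:
            worst = min(self.skills.values(), key=lambda s: s.utility)
            del self.skills[worst.name]


if __name__ == "__main__":
    lib = SkillLibrary(capacity=2)
    lib.add(Skill("open_drawer"))
    lib.add(Skill("heat_object"))
    lib.credit("open_drawer", verified_success=True)   # utility 0.5 -> 0.75
    lib.credit("heat_object", verified_success=False)  # utility 0.5 -> 0.25
    lib.add(Skill("cool_object"))                      # evicts heat_object
    print(sorted(lib.skills))  # ['cool_object', 'open_drawer']
```

The point of the sketch is only that the library size stays fixed while utilities decide what survives; the actual framework may bound growth in a different way.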

Key takeaway

For AI Architects and AI Engineers designing adaptive LLM agent systems, SkillMAS offers a principled approach to post-deployment specialization. By integrating skill evolution with MAS restructuring through a shared evidence surface, your teams can avoid common pitfalls like context overload and mis-specialization. Consider adopting a similar coupled adaptation strategy to ensure your agent systems improve holistically and efficiently, rather than optimizing skills and organization in isolation.

Key insights

SkillMAS couples skill evolution and MAS restructuring using verified execution traces for adaptive LLM agent specialization.

Method

SkillMAS executes episodes, learns Skill and Executor Utility from verified traces, constructs a retained evidence set, then applies bounded skill evolution and evidence-gated MAS restructuring to update the system state.
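
The ordering of these steps can be sketched as one adaptation round in Python. This is a rough illustration under stated assumptions, not the authors' algorithm: episodes are assumed to have already run and been verified, utility is approximated by plain mean verified success, and the names `Trace`, `mean_success`, `adapt_round` and the thresholds `max_refinements`, `min_failures`, and `utility_floor` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Trace:
    executor: str  # agent role that ran the episode
    skill: str     # skill it applied
    success: bool  # outcome confirmed by a verifier


def mean_success(traces, key):
    """Mean verified success per key; a simple stand-in for Utility Learning."""
    totals = {}
    for t in traces:
        wins, n = totals.get(key(t), (0, 0))
        totals[key(t)] = (wins + int(t.success), n + 1)
    return {k: wins / n for k, (wins, n) in totals.items()}


def adapt_round(traces, max_refinements=1, min_failures=3, utility_floor=0.5):
    # 1. Learn Skill and Executor Utility from verified traces.
    skill_utility = mean_success(traces, lambda t: t.skill)
    executor_utility = mean_success(traces, lambda t: t.executor)

    # 2. Construct the retained evidence set (here: all verified failures).
    retained = [t for t in traces if not t.success]

    # 3. Bounded skill evolution: refine at most `max_refinements` skills,
    #    weakest first, so one round cannot rewrite the whole library.
    failing = sorted({t.skill for t in retained},
                     key=lambda s: (skill_utility[s], s))
    skills_to_refine = failing[:max_refinements]

    # 4. Evidence-gated restructuring: flag an executor only when BOTH
    #    retained failures and low Executor Utility point at it.
    executors_to_restructure = [
        ex for ex, u in executor_utility.items()
        if u < utility_floor
        and sum(t.executor == ex for t in retained) >= min_failures
    ]
    return skills_to_refine, executors_to_restructure


if __name__ == "__main__":
    traces = (
        [Trace("navigator", "find_object", False)] * 3
        + [Trace("navigator", "find_object", True)]
        + [Trace("operator", "heat_object", True)] * 2
        + [Trace("operator", "heat_object", False)]
    )
    print(adapt_round(traces))  # (['find_object'], ['navigator'])
```

In this toy run the operator role has one failure but a utility above the floor, so only its skill would be a refinement candidate and its place in the MAS is left alone. That is the intended effect of gating structural change on evidence rather than on any single failure.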

Best for: Research Scientist, AI Architect, AI Engineer, AI Scientist, Machine Learning Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.MA updates on arXiv.org.