MOCHA: Multi-Objective Chebyshev Annealing for Agent Skill Optimization
Summary
MOCHA (Multi-Objective Chebyshev Annealing) is a framework for optimizing LLM agent skills, which are structured natural-language specifications subject to hard platform constraints. Unlike traditional prompt optimizers that treat skills as single-objective text blobs, MOCHA addresses the inherent multi-objective nature of skill optimization, balancing task performance against limits like description truncation (1,024 characters) and instruction body compaction (5,000 characters). It employs Chebyshev scalarization to cover the entire Pareto front, including non-convex regions, and uses exponential annealing to transition from exploration to exploitation. In experiments across six diverse agent skills, MOCHA achieved a 7.5% relative improvement in mean correctness over the strongest baseline, with gains up to 14.9% on FEVER and 10.4% on TheoremQA. It also discovered twice as many Pareto-optimal skill variants, while existing optimizers failed to improve the seed skill on 4 of 6 tasks after 1,000 rollouts.
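To make the platform limits concrete, here is a minimal Python sketch of how a compliance objective could be scored against the two character limits quoted in the summary. The `Skill` class, the `compliance` function, and the "fraction of text that survives" scoring rule are illustrative assumptions, not MOCHA's actual objective definition.

```python
from dataclasses import dataclass

# Limits quoted in the summary; everything else here is hypothetical.
DESCRIPTION_LIMIT = 1024  # characters before the description is truncated
BODY_LIMIT = 5000         # characters before the instruction body is compacted


@dataclass
class Skill:
    description: str
    body: str


def field_score(text: str, limit: int) -> float:
    """Return 1.0 if the text fits, else the fraction that would survive."""
    if len(text) <= limit:
        return 1.0
    return limit / len(text)


def compliance(skill: Skill) -> float:
    """Assumed compliance objective: the worse of the two field scores."""
    return min(
        field_score(skill.description, DESCRIPTION_LIMIT),
        field_score(skill.body, BODY_LIMIT),
    )


within_limits = compliance(Skill(description="a" * 10, body="b" * 10))
over_limit = compliance(Skill(description="a" * 2048, body="b" * 100))
```

A score like this could sit alongside task correctness as a second objective, so an optimizer sees the cost of a longer description instead of having it silently truncated.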
Key takeaway
For AI scientists and machine learning engineers deploying LLM agents, traditional single-objective prompt optimizers are inadequate for refining multi-field skills constrained by platform limits. Adopt a multi-objective optimization framework such as MOCHA to explicitly navigate the trade-offs between task correctness and compliance (e.g., character limits). This approach surfaces Pareto-optimal skill variants and yields significant performance gains, especially on tasks with high objective conflict, where other methods fail to make progress.
Key insights
Skill optimization for LLM agents is a multi-objective problem requiring principled Pareto front navigation.
Principles
- Skill optimization is inherently multi-objective.
- Chebyshev scalarization covers full Pareto fronts.
- Annealing balances exploration and exploitation.
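The claim that Chebyshev scalarization reaches non-convex regions can be illustrated with a small Python sketch. It compares a linear weighted sum with a weighted Chebyshev distance to an ideal point on three hypothetical skill variants scored on two maximized objectives (say, correctness and compliance). The candidate scores, the ideal point `(1.0, 1.0)`, and the weight grid are assumptions chosen for illustration.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]

# Hypothetical variants. C is Pareto-optimal but sits in a concave
# part of the front, below the line joining A and B.
candidates: Dict[str, Point] = {
    "A": (1.0, 0.0),
    "B": (0.0, 1.0),
    "C": (0.4, 0.4),
}
IDEAL: Point = (1.0, 1.0)  # assumed best achievable value per objective


def weighted_sum(point: Point, weights: Point) -> float:
    """Linear scalarization: larger is better."""
    return sum(w * f for w, f in zip(weights, point))


def chebyshev(point: Point, weights: Point, ideal: Point = IDEAL) -> float:
    """Weighted Chebyshev distance to the ideal point: smaller is better."""
    return max(w * (z - f) for w, f, z in zip(weights, point, ideal))


def best_by_weighted_sum(weights: Point) -> str:
    return max(candidates, key=lambda k: weighted_sum(candidates[k], weights))


def best_by_chebyshev(weights: Point) -> str:
    return min(candidates, key=lambda k: chebyshev(candidates[k], weights))


# Sweep the weights and record which variants each method can select.
weight_grid = [(i / 10, 1 - i / 10) for i in range(11)]
linear_winners = {best_by_weighted_sum(w) for w in weight_grid}
chebyshev_winners = {best_by_chebyshev(w) for w in weight_grid}
```

In this toy setup the weighted sum only ever selects A or B, because C scores 0.4 under every weight vector while the better of A and B scores at least 0.5. The Chebyshev sweep also selects C for balanced weights, which is the behavior that lets a Chebyshev-based selector cover concave parts of a front.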
Method
MOCHA selects parents via randomized Chebyshev scalarization, then mutates them. The acceptance criterion anneals from Hypervolume Contribution (HVC), which favors exploration, to Chebyshev, which favors exploitation.
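One way to picture the annealed acceptance rule is the Python sketch below. The summary does not specify how the schedule is applied, so this sketch assumes an exponentially decaying probability of using the HVC test, with a Chebyshev comparison against the parent otherwise. The names `exploration_prob` and `accept`, the parameters `tau0` and `decay`, and the two-objective hypervolume routine are hypothetical, not the authors' implementation.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, float]


def hypervolume_2d(points: List[Point], ref: Point = (0.0, 0.0)) -> float:
    """Area dominated by the points (maximization) relative to ref."""
    ordered = sorted(points, key=lambda p: -p[0])  # descending first objective
    area, best_y = 0.0, ref[1]
    for x, y in ordered:
        if x > ref[0] and y > best_y:
            area += (x - ref[0]) * (y - best_y)  # newly covered strip
            best_y = y
    return area


def hv_contribution(candidate: Point, front: List[Point]) -> float:
    """Hypervolume gained by adding the candidate to the current front."""
    return hypervolume_2d(front + [candidate]) - hypervolume_2d(front)


def chebyshev(point: Point, weights: Point, ideal: Point = (1.0, 1.0)) -> float:
    """Weighted Chebyshev distance to the ideal point: smaller is better."""
    return max(w * (z - f) for w, f, z in zip(weights, point, ideal))


def exploration_prob(step: int, tau0: float = 1.0, decay: float = 0.01) -> float:
    """Assumed exponential schedule: chance of applying the HVC test."""
    return tau0 * math.exp(-decay * step)


def accept(candidate: Point, parent: Point, front: List[Point],
           weights: Point, step: int, rng: random.Random) -> bool:
    if rng.random() < exploration_prob(step):
        # Exploration: keep anything that enlarges the dominated region.
        return hv_contribution(candidate, front) > 0.0
    # Exploitation: keep only improvements under the sampled weights.
    return chebyshev(candidate, weights) < chebyshev(parent, weights)


rng = random.Random(0)
front = [(1.0, 0.2)]
early = accept((0.5, 0.5), (1.0, 0.2), front, (0.9, 0.1), step=0, rng=rng)
late = accept((0.5, 0.5), (1.0, 0.2), front, (0.9, 0.1), step=10**6, rng=rng)
```

Under these assumptions the same candidate is accepted early, because it adds hypervolume, and rejected late, because it is worse than its parent for the sampled weights. That is the intended shift from widening the front to refining specific trade-offs.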
In practice
- Gains scale with objective conflict (e.g., FEVER).
- Applicable to meta-harness optimization.
- Refine existing tool-backed skill definitions.
Topics
- LLM Agents
- Skill Optimization
- Multi-Objective Optimization
- Chebyshev Scalarization
- Pareto Front
- Prompt Optimization
Best for: AI Engineer, NLP Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, Prompt Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.