MOCHA: Multi-Objective Chebyshev Annealing for Agent Skill Optimization

Source: cs.AI updates on arXiv.org · Field: Technology & Digital: Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

MOCHA (Multi-Objective Chebyshev Annealing) is a framework for optimizing LLM agent skills: structured natural-language specifications subject to hard platform constraints. Unlike traditional prompt optimizers, which treat a skill as a single-objective text blob, MOCHA treats skill optimization as inherently multi-objective, balancing task performance against platform limits such as description truncation (1,024 characters) and instruction body compaction (5,000 characters). It uses Chebyshev scalarization to cover the entire Pareto front, including non-convex regions, and exponential annealing to shift from exploration to exploitation. In experiments across six diverse agent skills, MOCHA achieved a 7.5% relative improvement in mean correctness over the strongest baseline, with gains of up to 14.9% on FEVER and 10.4% on TheoremQA. It also discovered twice as many Pareto-optimal skill variants, whereas existing optimizers failed to improve the seed skill on 4 of 6 tasks after 1,000 rollouts.
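The claim about non-convex regions can be illustrated with a small, self-contained sketch. It is not taken from the paper: the three candidates, their (correctness, compliance) scores, the ideal point, and all function names are hypothetical. It compares a plain weighted sum with a weighted Chebyshev distance to the ideal point, assuming both objectives are maximized.

```python
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float]

# Hypothetical (correctness, compliance) scores, both maximized.
# C is Pareto-optimal but sits in a non-convex dent of the front.
CANDIDATES: Dict[str, Point] = {"A": (1.0, 0.0), "B": (0.0, 1.0), "C": (0.4, 0.4)}
IDEAL: Point = (1.0, 1.0)  # assumed best achievable value per objective


def weighted_sum(f: Sequence[float], w: Sequence[float]) -> float:
    """Linear scalarization; higher is better."""
    return sum(wi * fi for wi, fi in zip(w, f))


def chebyshev(f: Sequence[float], w: Sequence[float], ideal: Sequence[float]) -> float:
    """Weighted Chebyshev distance to the ideal point; lower is better."""
    return max(wi * (zi - fi) for wi, fi, zi in zip(w, f, ideal))


def best_by_weighted_sum(w: Sequence[float]) -> str:
    return max(CANDIDATES, key=lambda k: weighted_sum(CANDIDATES[k], w))


def best_by_chebyshev(w: Sequence[float]) -> str:
    return min(CANDIDATES, key=lambda k: chebyshev(CANDIDATES[k], w, IDEAL))


if __name__ == "__main__":
    for i in range(11):
        w = (i / 10, 1 - i / 10)
        print(w, best_by_weighted_sum(w), best_by_chebyshev(w))
```

In this toy setup no weight vector on the grid makes the weighted sum select C, because C lies below the line joining A and B. The Chebyshev form selects it with balanced weights, which is the property the summary attributes to Chebyshev scalarization.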

Key takeaway

For AI scientists and machine learning engineers deploying LLM agents, traditional single-objective prompt optimizers are inadequate for refining multi-field skills constrained by platform limits. Consider adopting a multi-objective optimization framework such as MOCHA to navigate explicitly the trade-offs between task correctness and compliance (e.g., character limits). This approach surfaces Pareto-optimal skill variants and delivers performance gains, especially on tasks with high objective conflict, where other methods fail to make progress.

Key insights

Skill optimization for LLM agents is a multi-objective problem requiring principled Pareto front navigation.
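As a minimal illustration of what navigating a Pareto front means here, the sketch below filters a set of skill variants down to the non-dominated ones. The score tuples and the function names `dominates` and `pareto_front` are hypothetical and chosen for illustration; both objectives are assumed to be maximized.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical (correctness, compliance) scores for four skill variants.
    variants = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.2, 0.9)]
    print(pareto_front(variants))
```

A single-objective optimizer returns one point from such a set; a multi-objective one aims to return the whole non-dominated set so the trade-off can be chosen afterwards.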

Principles

Method

MOCHA selects parents via randomized Chebyshev scalarization, then mutates them. The acceptance criterion anneals from Hypervolume Contribution (HVC), which favors exploration, to Chebyshev scalarization, which favors exploitation.
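The annealed acceptance rule can be sketched as follows. This is an assumption-heavy illustration, not the paper's algorithm: the summary states only that acceptance anneals exponentially from HVC to Chebyshev, so the probabilistic switch between the two criteria, the decay rate, the two-objective hypervolume routine, and every name below are hypothetical.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def hv_2d(points: Sequence[Point], ref: Point) -> float:
    """Area dominated by points relative to ref (two objectives, both maximized)."""
    pts = sorted((p for p in points if p[0] > ref[0] and p[1] > ref[1]),
                 key=lambda p: -p[0])
    area, best_y = 0.0, ref[1]
    for x, y in pts:
        if y > best_y:  # only points that extend the front add area
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area


def hvc(candidate: Point, front: List[Point], ref: Point) -> float:
    """Hypervolume gained by adding candidate to the current front."""
    return hv_2d(list(front) + [candidate], ref) - hv_2d(front, ref)


def chebyshev(f: Point, w: Point, ideal: Point) -> float:
    """Weighted Chebyshev distance to the ideal point; lower is better."""
    return max(wi * (zi - fi) for wi, fi, zi in zip(w, f, ideal))


def exploration_weight(step: int, total: int, rate: float = 5.0) -> float:
    """Assumed exponential schedule: 1.0 at step 0, decaying toward 0."""
    return math.exp(-rate * step / total)


def accept(candidate: Point, parent: Point, front: List[Point], w: Point,
           ideal: Point, ref: Point, step: int, total: int,
           draw: Optional[float] = None) -> bool:
    """Use the HVC test with probability exploration_weight, else the Chebyshev test."""
    u = random.random() if draw is None else draw
    if u < exploration_weight(step, total):
        return hvc(candidate, front, ref) > 1e-12  # exploration: any new area
    return chebyshev(candidate, w, ideal) < chebyshev(parent, w, ideal)


if __name__ == "__main__":
    front = [(1.0, 0.5), (0.8, 0.8)]
    args = dict(parent=(0.8, 0.8), front=front, w=(0.5, 0.5),
                ideal=(1.0, 1.0), ref=(0.0, 0.0), total=100)
    print(accept((0.5, 1.0), step=0, draw=0.5, **args))
    print(accept((0.5, 1.0), step=100, draw=0.5, **args))
```

In this sketch the same candidate can be accepted early, because it extends the front, and rejected late, because it is worse than its parent under the sampled weights. That is one plausible reading of the shift from exploration to exploitation.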

Best for: AI Engineer, NLP Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, Prompt Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.