A Multi-Agent system for Multi-Objective constrained optimization
Summary
MAMO (Multi-Agent system for Multi-Objective constrained optimization), presented at AAMAS 2026 (Paphos, Cyprus, May 25-29, 2026), is a multi-agent reinforcement learning approach for balancing conflicting objectives in constrained optimization problems in dynamic computing and networking environments. Conventional methods rely on manually selected reward weights. These weights strongly shape policy behavior, so a poor choice makes it hard to trade off cost optimization against constraint satisfaction, especially in non-stationary settings. MAMO decouples task execution from objective design and treats the selection of reward weights as a learning problem. Its hierarchical architecture pairs a Task-Execution (TE) agent, which learns control policies, with a Weight-Adaptation (WA) agent, which observes long-term system indicators and dynamically adjusts the weighting coefficients so the system can adapt autonomously to evolving conditions.
Key takeaway
For Machine Learning Engineers designing RL solutions for dynamic, constrained optimization problems, MAMO offers an alternative to manual reward weight tuning. Consider a hierarchical multi-agent design like MAMO when objective trade-offs need to adapt autonomously, so that policies keep tracking the intended balance between cost and constraint satisfaction as environmental conditions or QoS requirements evolve. This approach can reduce fine-tuning effort and improve system resilience.
Key insights
MAMO learns reward weights autonomously for constrained multi-objective RL by decoupling task execution from objective design.
Principles
- Decouple task execution from objective design.
- Treat reward weight selection as a learning problem.
- Use hierarchical agents for different time scales.
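To make the second principle concrete, one common way to write a weighted reward is shown below. This is an illustrative formulation only: the summary does not state MAMO's actual reward, so the symbols $c_t$ (cost at step $t$), $v_{i,t}$ (violation of constraint $i$ at step $t$), $w_i$ (reward weights), and $K$ (number of constraints) are all assumptions.

```latex
r_t(\mathbf{w}) = -\Big( c_t + \sum_{i=1}^{K} w_i \, v_{i,t} \Big), \qquad w_i \ge 0
```

Under this reading, the TE agent maximizes expected return for a fixed $\mathbf{w}$, while the WA agent chooses $\mathbf{w}$ on a slower time scale from long-term indicators instead of leaving it as a hand-tuned constant.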
Method
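MAMO uses a two-phase iterative workflow. In the first phase, the WA agent selects weights for a training horizon and the TE agent learns a policy under those weights. In the second phase, the WA agent evaluates the resulting performance and adjusts the weights. The cycle then repeats.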
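The two-phase loop can be sketched on a toy problem. The code below is a minimal illustration under stated assumptions, not the paper's implementation: the single-state replica-scaling environment, the squared-shortfall violation, the candidate weight set, the fixed-penalty meta-score, and the use of epsilon-greedy bandits for both agents are all hypothetical choices made to keep the example small.

```python
import random

ACTIONS = [1, 2, 3, 4, 5]            # hypothetical replica counts
CANDIDATE_WEIGHTS = [0.1, 0.5, 2.0]  # hypothetical violation weights
DEMAND = 4                           # toy, fixed workload level


def step(replicas):
    """Toy environment: return (cost, violation) for one scaling decision."""
    cost = float(replicas)
    shortfall = max(0, DEMAND - replicas)
    return cost, float(shortfall ** 2)  # convex penalty is an assumption


def train_te_agent(weight, horizon, rng):
    """Phase 1: the TE agent learns a policy under a fixed reward weight."""
    q = {a: 0.0 for a in ACTIONS}
    visits = {a: 0 for a in ACTIONS}
    for _ in range(horizon):
        explore = rng.random() < 0.2
        action = rng.choice(ACTIONS) if explore else max(ACTIONS, key=q.get)
        cost, violation = step(action)
        reward = -(cost + weight * violation)  # weighted (scalarized) reward
        visits[action] += 1
        q[action] += (reward - q[action]) / visits[action]  # running mean
    return max(ACTIONS, key=q.get)  # greedy action for the single state


def evaluate(policy_action, violation_budget):
    """Phase 2: score long-term indicators, independent of the weight used."""
    cost, violation = step(policy_action)
    return -cost - 100.0 * max(0.0, violation - violation_budget)


def run_wa_agent(rounds=12, horizon=200, violation_budget=0.0, seed=0):
    """WA agent: epsilon-greedy bandit over the candidate weights."""
    rng = random.Random(seed)
    value = {w: 0.0 for w in CANDIDATE_WEIGHTS}
    tries = {w: 0 for w in CANDIDATE_WEIGHTS}
    for _ in range(rounds):
        untried = [w for w in CANDIDATE_WEIGHTS if tries[w] == 0]
        if untried:
            weight = untried[0]                     # try every weight once
        elif rng.random() < 0.2:
            weight = rng.choice(CANDIDATE_WEIGHTS)  # occasional exploration
        else:
            weight = max(CANDIDATE_WEIGHTS, key=value.get)
        policy = train_te_agent(weight, horizon, rng)  # phase 1
        score = evaluate(policy, violation_budget)     # phase 2
        tries[weight] += 1
        value[weight] += (score - value[weight]) / tries[weight]
    best = max(CANDIDATE_WEIGHTS, key=value.get)
    return best, train_te_agent(best, horizon, rng)


if __name__ == "__main__":
    print(run_wa_agent(violation_budget=0.0))
    print(run_wa_agent(violation_budget=1.0))
```

In this sketch the WA agent never sees the TE agent's per-step reward. It judges each weight only by the cost and violation of the resulting policy, which is what lets a looser or tighter violation budget lead to a different chosen weight without anyone retuning the reward by hand.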
In practice
- Apply to edge-FaaS replica scaling.
- Manage resource selection and workload scaling.
- Adapt to non-stationary workload patterns.
Topics
- Multi-Agent Reinforcement Learning
- Constrained Optimization
- Multi-Objective Optimization
- Reward Shaping
- Edge Computing
- FaaS Resource Scaling
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.MA updates on arXiv.org.