Training-Free Agentic AI: Probabilistic Control and Coordination in Multi-Agent LLM Systems

Source: cs.CL updates on arXiv.org · Field: Technology & Digital, Artificial Intelligence & Machine Learning, Multi-Agent Systems · Depth: Expert, extended

Summary

REDEREF is a training-free controller for multi-agent Large Language Model (LLM) systems. It targets three problems that arise in complex, long-horizon tasks: inefficient routing, noisy feedback, and high interaction costs. The framework combines four components: belief-guided delegation, which uses Thompson sampling to prioritize agents with positive historical contributions; reflection-driven re-routing, driven by a calibrated LLM or programmatic judge; evidence-based selection of a final output in place of output averaging; and memory-aware priors that mitigate cold-start inefficiencies. In experiments on split-knowledge tasks, REDEREF reduces token usage by 28%, agent calls by 17%, and time-to-success by 19% relative to random recursive delegation, while maintaining comparable task success rates. It also adapts gracefully when agents or the judge degrade. These results indicate that simple probabilistic control can improve the efficiency of multi-agent LLM systems without training or fine-tuning.
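The belief-guided delegation and memory-aware priors described above can be illustrated with a standard Beta-Bernoulli Thompson sampler. This is a minimal sketch under stated assumptions, not the paper's implementation: the names `AgentBelief`, `prior_from_memory`, and `choose_agent`, the binary "helpful" signal, and the `weight` discount on remembered counts are all hypothetical choices made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class AgentBelief:
    """Beta(alpha, beta) belief over how often an agent's output is judged helpful."""
    alpha: float = 1.0  # 1 + (weighted) count of helpful outcomes
    beta: float = 1.0   # 1 + (weighted) count of unhelpful outcomes

    def sample(self, rng: random.Random) -> float:
        # One posterior draw of the agent's success probability.
        return rng.betavariate(self.alpha, self.beta)

    def update(self, helpful: bool) -> None:
        # Conjugate update from a single binary judge verdict.
        if helpful:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


def prior_from_memory(successes: int, failures: int, weight: float = 0.5) -> AgentBelief:
    """Assumed form of a memory-aware prior: seed the belief with discounted
    counts from earlier runs instead of starting from a uniform Beta(1, 1)."""
    return AgentBelief(1.0 + weight * successes, 1.0 + weight * failures)


def choose_agent(beliefs: dict[str, AgentBelief], rng: random.Random) -> str:
    """Thompson sampling: draw once from each belief and delegate to the largest draw."""
    draws = {name: belief.sample(rng) for name, belief in beliefs.items()}
    return max(draws, key=draws.get)
```

In use, a controller would call `choose_agent` before each delegation and `update` after each judge verdict. Agents with a strong track record are picked more often, while uncertain agents still receive occasional exploratory calls because their posterior draws vary widely.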

Key takeaway

AI engineers building multi-agent LLM systems should consider training-free probabilistic control mechanisms such as REDEREF. In the reported experiments, this approach reduced token usage, agent calls, and time-to-success without the overhead of fine-tuning or complex reinforcement learning. Focus on robust judge calibration and memory-aware prior initialization to keep multi-agent coordination reliable, adaptive, and auditable.

Key insights

Probabilistic control with belief-guided delegation significantly boosts multi-agent LLM system efficiency and robustness without training.

Principles

Method

REDEREF uses Thompson sampling for belief-guided delegation, a calibrated judge for reflection and re-routing, evidence-based selection for aggregation, and memory-aware priors for cold-start mitigation within a recursive loop.
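The control loop described above can be sketched as follows. This is a simplified, iterative illustration rather than the paper's recursive procedure, and every specific in it is an assumption: the function name `delegate`, the acceptance `threshold`, the `max_calls` budget, the rule of not revisiting an agent until all have been tried, and the representation of beliefs as `(alpha, beta)` tuples are hypothetical.

```python
import random
from typing import Callable

Agent = Callable[[str], str]         # task -> candidate answer
Judge = Callable[[str, str], float]  # (task, answer) -> score in [0, 1]


def delegate(
    task: str,
    agents: dict[str, Agent],
    judge: Judge,
    beliefs: dict[str, tuple[float, float]],  # agent name -> Beta (alpha, beta)
    rng: random.Random,
    threshold: float = 0.7,  # assumed acceptance cutoff for the judge score
    max_calls: int = 5,      # assumed budget on agent calls
) -> tuple[str, float, int]:
    """Return (best_answer, best_score, calls_used)."""
    candidates: list[tuple[float, str]] = []
    tried: set[str] = set()
    for calls in range(1, max_calls + 1):
        # Prefer agents not yet tried on this task; fall back to all of them.
        remaining = [n for n in agents if n not in tried] or list(agents)
        # Belief-guided delegation: one Thompson draw per remaining agent.
        name = max(remaining, key=lambda n: rng.betavariate(*beliefs[n]))
        tried.add(name)
        answer = agents[name](task)
        # Reflection: the judge scores the answer; the verdict updates the belief.
        score = judge(task, answer)
        accepted = score >= threshold
        alpha, beta = beliefs[name]
        beliefs[name] = (alpha + 1, beta) if accepted else (alpha, beta + 1)
        candidates.append((score, answer))
        if accepted:
            break  # otherwise re-route to another sampled agent
    # Evidence-based selection: keep the best-scored answer, no averaging.
    best_score, best_answer = max(candidates, key=lambda c: c[0])
    return best_answer, best_score, calls
```

Selecting the single best-scored candidate, not a blend of all outputs, keeps a rejected answer from contaminating the final result. It also leaves an auditable trail of which agent produced the accepted answer and what score the judge assigned.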

Best for: AI Scientist, Research Scientist, AI Engineer, AI Researcher, Machine Learning Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.