Contagion Networks: Evaluator Bias Propagation in Multi-Agent LLM Systems
Summary
A new framework, Contagion Networks, quantifies evaluator bias propagation in multi-agent LLM systems. Experiments using DeepSeek-chat agents with distinct bias profiles (structured, balanced, evidence-based) revealed that evaluator biases consistently propagate, with mean contagion coefficients γ ∈ [0.143, 0.304]. Homogeneous-model agents typically operate in a "suppression regime," where bias attenuates rapidly across hops (cumulative factor β₃=0.0055). This contrasts sharply with prior cross-model work (MM-EPC), which observed 3–5× stronger contagion (γ ≈ 0.85–1.3) leading to a "cascade regime." Crucially, increasing the evaluator committee size from k=1 to k=3 reduced effective contagion by 72.4%, demonstrating a practical mitigation strategy. The open-source framework is released for community use.
Key takeaway
AI architects designing multi-agent LLM systems in which agents evaluate each other must account for evaluator bias propagation. Measure the Cross-Agent Contagion Matrix Γₚ before deployment to predict system stability. Prioritize homogeneous model families for evaluators, as they offer natural bias suppression. Additionally, use evaluator committees of at least three agents, and monitor strategy entropy H(ω) as a real-time health indicator of cognitive diversity.
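The entropy monitor mentioned in the takeaway could look roughly like the sketch below. The summary does not define H(ω), so this assumes it is the Shannon entropy of a vector of strategy weights ω. The function names, the example weights, and the 50% alert threshold are hypothetical choices for illustration, not values from the paper.

```python
import math


def strategy_entropy(weights):
    """Shannon entropy (in nats) of a strategy weight vector.

    Assumption: H(ω) is Shannon entropy over normalized strategy weights;
    the source summary does not give the exact definition.
    """
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    probs = [w / total for w in weights]
    # Zero-probability strategies contribute nothing to the sum.
    return -sum(p * math.log(p) for p in probs if p > 0)


def diversity_alert(weights, min_fraction=0.5):
    """Return True when entropy drops below a fraction of its maximum, log(n).

    min_fraction is a hypothetical threshold chosen for illustration.
    """
    max_entropy = math.log(len(weights))
    return strategy_entropy(weights) < min_fraction * max_entropy


balanced = [1.0, 1.0, 1.0]      # three strategies used equally
collapsed = [0.98, 0.01, 0.01]  # one strategy dominates

print(diversity_alert(balanced))   # entropy is at its maximum
print(diversity_alert(collapsed))  # entropy is far below the maximum
```

Normalizing against log(n) keeps the threshold comparable across systems with different numbers of strategies.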
Key insights
Evaluator biases propagate in multi-agent LLM systems, but homogeneous models suppress them, and diverse evaluator committees mitigate them.
Principles
- Bias propagation magnitude depends on evaluator diversity.
- Homogeneous model pools provide implicit regularization.
- Spectral radius ρ(Γₚ) governs propagation regimes.
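As a rough illustration of the spectral-radius principle, the sketch below estimates ρ(Γₚ) for a small contagion matrix by power iteration and labels the regime. The matrices are hypothetical, with entries picked from the ranges reported in the summary. The threshold ρ = 1 between suppression and cascade is an assumption borrowed from standard linear propagation models, since the summary does not state the boundary. Power iteration as written assumes strictly positive entries.

```python
def spectral_radius(matrix, iters=500, tol=1e-12):
    """Estimate the spectral radius of a square matrix by power iteration.

    Assumption: all entries are strictly positive, so the dominant
    eigenvalue is real and positive (Perron-Frobenius) and the
    iteration converges to it.
    """
    n = len(matrix)
    v = [1.0 / n] * n  # start from a uniform vector whose entries sum to 1
    rho = 0.0
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        new_rho = sum(w)  # v sums to 1, so sum(M v) approaches the eigenvalue
        v = [x / new_rho for x in w]
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return rho


def regime(matrix, threshold=1.0):
    """Label the regime; the threshold of 1.0 is an assumed boundary."""
    return "suppression" if spectral_radius(matrix) < threshold else "cascade"


# Hypothetical same-model matrix, entries within the reported γ range.
gamma_homogeneous = [
    [0.15, 0.20, 0.30],
    [0.30, 0.15, 0.20],
    [0.20, 0.30, 0.15],
]
# Hypothetical cross-model matrix, entries near the reported γ ≈ 0.85–1.3.
gamma_cross_model = [
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
]

print(regime(gamma_homogeneous))
print(regime(gamma_cross_model))
```

For general matrices with zero or negative entries, an eigenvalue routine from a numerical library would be the safer choice than this pure-Python iteration.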
Method
The Contagion Networks framework measures bias propagation using a Cross-Agent Contagion Matrix Γₚ and Test-Time Reinforcement Learning (TTRL) for strategy updates.
In practice
- Measure Γₚ before system deployment.
- Prefer homogeneous evaluator pools.
- Use committees of ≥ 3 evaluators.
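The committee recommendation can be pictured with a toy aggregation, sketched below. It assumes the committee simply averages its members' scores, which the summary does not confirm, and the quality value and bias offsets are made up for illustration. It shows only why differing biases partly cancel under averaging and does not reproduce the paper's 72.4% figure.

```python
from statistics import mean


def committee_score(scores, min_size=3):
    """Average the scores from a committee of evaluators.

    Assumption: plain averaging; the paper's aggregation rule may differ.
    min_size=3 mirrors the recommendation of at least three evaluators.
    """
    if len(scores) < min_size:
        raise ValueError(f"committee needs at least {min_size} evaluators")
    return mean(scores)


# Hypothetical numbers: one true quality and three evaluator bias offsets.
true_quality = 0.70
biases = [0.12, -0.05, -0.04]
scores = [true_quality + b for b in biases]

single_error = abs(scores[0] - true_quality)
committee_error = abs(committee_score(scores) - true_quality)

print(f"single evaluator error: {single_error:.2f}")
print(f"committee error:        {committee_error:.2f}")
```

Averaging helps only when the members' biases differ. A committee whose members all share the same offset would pass that offset through unchanged.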
Topics
- Multi-Agent Systems
- LLM Evaluation
- Bias Propagation
- Contagion Networks
- Evaluator Diversity
- DeepSeek-chat
Best for: Research Scientist, AI Engineer, NLP Engineer, AI Scientist, Machine Learning Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.MA updates on arXiv.org.