Contagion Networks: Evaluator Bias Propagation in Multi-Agent LLM Systems
Summary
Contagion Networks is a formal framework for measuring how evaluator biases propagate between interacting LLM agents in multi-agent systems. In a controlled 3-agent experiment using DeepSeek-chat with three evaluator bias profiles (structured, balanced, and evidence-based), biases propagated consistently between agents, with contagion coefficients (gamma) ranging from 0.157 to 0.352. The study identifies three propagation regimes and finds that homogeneous-model agents produce contagion coefficients 3-5x weaker than the cross-model coefficients reported in prior work (MM-EPC: gamma approx 0.85-1.3), which places them in a suppression regime. Increasing the evaluator committee size from k=1 to k=3 reduced effective contagion by 72.4%, offering a practical mitigation strategy. The Contagion Network experimental framework has been released as open source.
Key takeaway
Machine learning engineers designing multi-agent LLM systems should expect evaluator biases to propagate between agents even when every agent runs the same model (gamma of 0.157 to 0.352 in the study). Increasing the evaluator committee from one to three agents reduced effective bias contagion by 72.4% in the reported experiment, so it is a mitigation worth considering. The Contagion Network framework can be used to measure and manage these biases in multi-agent deployments.
Key insights
Evaluator biases propagate in multi-agent LLM systems, measurable via Contagion Networks.
Principles
- Evaluator biases consistently propagate across interacting LLM agents.
- Homogeneous-model agents exhibit weaker bias contagion than the cross-model setups reported in prior work.
- Increasing evaluator committee size significantly mitigates bias propagation.
Method
Contagion Networks provide a formal framework to measure evaluator bias spread in multi-agent LLM systems, using a Cross-Agent Contagion Matrix Gamma_N.
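This summary does not give the paper's definition of Gamma_N, so the following is only a minimal sketch of what a cross-agent contagion matrix could look like. It assumes, purely for illustration, that each entry gamma[i][j] is the least-squares slope of agent j's per-item score shift on agent i's per-item score shift. The function names, the agent labels, and the toy numbers are all hypothetical and are not taken from the study or its released framework.

```python
from statistics import mean


def contagion_coefficient(source_shift, target_shift):
    """Least-squares slope of target shifts on source shifts.

    Assumed stand-in for a contagion coefficient; not the paper's estimator.
    """
    if len(source_shift) != len(target_shift):
        raise ValueError("shift lists must have the same length")
    mx, my = mean(source_shift), mean(target_shift)
    sxx = sum((x - mx) ** 2 for x in source_shift)
    if sxx == 0:
        raise ValueError("source shifts have zero variance")
    sxy = sum((x - mx) * (y - my) for x, y in zip(source_shift, target_shift))
    return sxy / sxx


def contagion_matrix(shifts):
    """Build gamma[i][j] for every ordered pair of distinct agents.

    shifts maps an agent name to its list of per-item score shifts.
    """
    agents = list(shifts)
    return {
        i: {j: contagion_coefficient(shifts[i], shifts[j]) for j in agents if j != i}
        for i in agents
    }


# Toy, made-up shifts used only to exercise the code; not data from the study.
toy_shifts = {
    "agent_a": [0.0, 1.0, 2.0, 3.0],
    "agent_b": [0.1, 0.35, 0.6, 0.85],
}
gamma = contagion_matrix(toy_shifts)
print(round(gamma["agent_a"]["agent_b"], 3))
```

The matrix is kept as a nested dictionary rather than an array so that entries can be read by agent name. Note that it is not symmetric in general: the slope from agent_a to agent_b differs from the slope in the other direction.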
In practice
- Utilize the open-source Contagion Network experimental framework.
- Increase evaluator committee size from k=1 to k=3; the study reports a 72.4% reduction in effective contagion.
- Consider DeepSeek-chat for multi-agent LLM experiments.
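To make the committee-size figure concrete, the arithmetic can be sketched as below. The sketch assumes that the reported 72.4% reduction applies uniformly to any single-evaluator coefficient, which this summary does not state; the function name `effective_contagion` is hypothetical, and the two inputs are simply the endpoints of the gamma range quoted in the summary.

```python
def effective_contagion(gamma_single, reduction=0.724):
    """Scale a k=1 contagion coefficient by the reported k=3 reduction.

    Assumes the reduction is uniform across coefficients (an illustration,
    not a claim made by the study).
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be a fraction between 0 and 1")
    return gamma_single * (1.0 - reduction)


# Endpoints of the gamma range quoted in the summary (0.157 to 0.352).
for gamma_k1 in (0.157, 0.352):
    gamma_k3 = effective_contagion(gamma_k1)
    print(f"k=1 gamma {gamma_k1:.3f} -> assumed k=3 gamma {gamma_k3:.4f}")
```

Under that assumption, a three-evaluator committee would leave roughly 27.6% of the single-evaluator contagion in place.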
Topics
- Large Language Models
- Multi-Agent Systems
- Evaluator Bias
- Bias Propagation
- Contagion Networks
- AI Safety
Best for: Research Scientist, AI Architect, CTO, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.