Adversarial Consensus Protocol (ACP): Validation by Contradiction in Multi-LLM Systems
Summary
The Adversarial Consensus Protocol (ACP) validates large language model (LLM) outputs by relying on genuine architectural divergence between models rather than the staged disagreement of common multi-agent debate methods. Those methods prompt copies of a single model to argue different sides, so every debater may share the same blind spots. ACP instead assigns three distinct LLM families to three roles: one proposes an answer, a second attacks it by identifying weaknesses, and a third adjudicates the outcome. The protocol is being tested under three conditions (single agent, homogeneous debate, and heterogeneous ACP) and compared on accuracy, error catch rate, adjudication independence, and spurious conflict rate. Earlier research (El Kandoussi, 2026) found that heterogeneous LLM groups developed measurably more divergent behavioral profiles (average pairwise cosine similarity of 0.56) than homogeneous groups (0.85). ACP depends on this divergence, and it also requires agents to interact anonymously so that perceived authority does not shrink the differences between them. Early pilot runs indicate that heterogeneous adjudication catches errors that homogeneous debate misses.
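To make the divergence figure concrete, here is a minimal sketch of how an average pairwise cosine similarity can be computed over a group of behavioral profiles. It assumes each model's profile is a plain numeric vector; how the cited research actually builds those profiles is not described here, and the function names and toy vectors below are illustrative only, not data from the study.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def average_pairwise_cosine(profiles):
    """Mean cosine similarity over every unordered pair of profiles."""
    pairs = list(combinations(profiles, 2))
    if not pairs:
        raise ValueError("need at least two profiles")
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)


# Toy profiles (made up for illustration, one vector per model).
toy_profiles = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(round(average_pairwise_cosine(toy_profiles), 2))
```

A lower average means the group's profiles point in more different directions, which is the sense in which 0.56 indicates more divergence than 0.85.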
Key takeaway
If you are an AI architect designing robust LLM validation systems, prioritize architectural diversity over prompting for conflict. Early pilot runs suggest that multi-agent setups catch more errors when the proposal, attack, and adjudication roles are filled by genuinely different LLM families. Keep agents anonymous to one another to preserve that divergence. This approach moves beyond superficial debate: it anchors validation in real differences between models and builds a reusable "jurisprudence" of decisions.
Key insights
Real architectural divergence between LLMs, not staged conflict, is key to robust validation.
Principles
- Genuine LLM divergence improves error detection.
- Agent anonymity preserves critical differences.
- Group interaction fosters necessary engagement.
Method
ACP involves three distinct LLM families: one proposes, a second attacks by finding weaknesses, and a third adjudicates the final answer. This process builds a "jurisprudence" log of decisions.
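The three roles described above can be sketched as a single orchestration function. This is an illustration under stated assumptions, not the article's implementation: `Agent`, `Ruling`, and `run_acp_round` are hypothetical names, the prompts are invented, and each model is represented by a plain callable so the example runs without any API. The orchestrator knows each agent's family in order to enforce heterogeneity, but no family name is ever placed in a prompt, so agents stay anonymous to one another.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Agent:
    family: str                    # known to the orchestrator only, never put in a prompt
    respond: Callable[[str], str]  # stand-in for a real model call (assumption)


@dataclass(frozen=True)
class Ruling:
    question: str
    proposal: str
    critique: str
    verdict: str


def run_acp_round(question: str, proposer: Agent, attacker: Agent,
                  adjudicator: Agent, log: List[Ruling]) -> Ruling:
    # Enforce heterogeneity: every role must come from a different family.
    if len({proposer.family, attacker.family, adjudicator.family}) != 3:
        raise ValueError("ACP needs three distinct model families")

    proposal = proposer.respond(f"Question: {question}\nPropose an answer.")
    critique = attacker.respond(
        f"Question: {question}\nProposed answer: {proposal}\n"
        "Identify weaknesses in this answer."
    )
    verdict = adjudicator.respond(
        f"Question: {question}\nProposed answer: {proposal}\n"
        f"Critique: {critique}\nGive the final answer."
    )

    # Record the round so rulings accumulate into a decision log.
    ruling = Ruling(question, proposal, critique, verdict)
    log.append(ruling)
    return ruling


# Usage with stub agents that return fixed strings (not real model output).
log: List[Ruling] = []
ruling = run_acp_round(
    "What is 2 + 2?",
    Agent("family_a", lambda prompt: "5"),
    Agent("family_b", lambda prompt: "The sum is off by one."),
    Agent("family_c", lambda prompt: "4"),
    log,
)
print(ruling.verdict, len(log))
```

In a real system each `respond` callable would wrap a client for a different model family; the heterogeneity check and the family-free prompts are the two parts of the sketch that map directly onto the protocol's stated requirements.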
In practice
- Implement multi-LLM systems with diverse architectures.
- Keep LLM agents anonymous to one another.
- Track adjudication rulings to build a decision log.
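The last item, tracking rulings, can be sketched as an append-only log in JSON Lines form (one JSON object per line). The record fields, the `outcome` labels ("upheld" and "overturned"), and the helper names are assumptions made for illustration; the source does not specify a log format.

```python
import io
import json
from collections import Counter


def append_ruling(stream, question, proposal, critique, verdict, outcome):
    """Write one ruling as a single JSON line."""
    record = {
        "question": question,
        "proposal": proposal,
        "critique": critique,
        "verdict": verdict,
        "outcome": outcome,  # hypothetical label, e.g. "upheld" or "overturned"
    }
    stream.write(json.dumps(record) + "\n")


def load_rulings(stream):
    """Read every ruling back from the start of the log."""
    stream.seek(0)
    return [json.loads(line) for line in stream if line.strip()]


def outcome_counts(rulings):
    """Tally rulings by outcome label."""
    return Counter(r["outcome"] for r in rulings)


# In-memory stream for the example; a file opened in "a+" mode works the same way.
log_file = io.StringIO()
append_ruling(log_file, "What is 2 + 2?", "5", "Off by one.", "4", "overturned")
append_ruling(log_file, "Capital of France?", "Paris", "No flaw found.", "Paris", "upheld")
counts = outcome_counts(load_rulings(log_file))
print(dict(counts))
```

An append-only format keeps earlier rulings immutable, which suits a log meant to be consulted as precedent.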
Topics
- Adversarial Consensus Protocol
- Multi-agent LLM Systems
- LLM Validation
- Model Divergence
- AI Safety
- Error Detection
Best for: Research Scientist, AI Scientist, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by LLM on Medium.