Trust, Lies, and Long Memories: Emergent Social Dynamics and Reputation in Multi-Round Avalon with LLM Agents
Summary
A study investigated emergent social dynamics in Large Language Model (LLM) agents playing The Resistance: Avalon, a hidden-role deception game. Unlike prior work, this research focused on multi-round interactions, with agents retaining memory of past games and player behaviors. Across 188 games, two main phenomena emerged: LLM agents developed stable, role-conditional reputations, referencing past behavior in their strategic reasoning. For example, an agent might be described as "straightforward" when playing good but "subtle" when evil, and high-reputation players received 46% more team inclusions. Additionally, higher reasoning effort (using OpenAI GPT-5.1's reasoning_effort parameter) correlated with more sophisticated deception strategies, such as evil players passing early missions to build trust before sabotaging later ones (75% in high-effort games vs. 36% in low-effort games).
Key takeaway
For AI Scientists developing multi-agent systems, this research demonstrates that incorporating persistent memory and variable reasoning depth can lead to the emergence of sophisticated social dynamics, including reputation formation and advanced deceptive strategies. You should consider these architectural elements when designing agents for complex, interactive environments, as they significantly influence strategic behavior and interaction patterns. This suggests that more human-like social intelligence in AI may require longitudinal interaction capabilities.
Key insights
LLM agents with cross-game memory develop role-conditional reputations and sophisticated deception strategies.
Principles
- Reputation dynamics emerge organically with persistent memory.
- Higher reasoning effort enables more strategic deception.
- Social models can be role-conditional for LLM agents.
Method
LLM agents (OpenAI GPT-5.1) played repeated Avalon games, interleaving reasoning with action. A structured reflection system retained cross-game memory, and reasoning depth was manipulated via the reasoning_effort parameter.
In practice
- Implement cross-game memory for complex multi-agent simulations.
- Vary reasoning effort to observe emergent strategic behaviors.
- Use social deduction games as testbeds for AI social reasoning.
Topics
- Large Language Models
- Multi-Agent Systems
- The Resistance: Avalon
- Reputation Dynamics
- Strategic Deception
Code references
Best for: AI Scientist, Research Scientist
Related on AIssential
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.MA updates on arXiv.org.