Scale-Dependent Collective Adaptation in Self-Amending LLM Societies: A Cross-Family Study of Emergent Governance

2026-05-19 · Source: cs.MA updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Social Sciences & Behavioral Studies · Depth: Expert, extended

Summary

A study of collective adaptation in self-amending LLM societies, using the Nomic game as a testbed, finds that institutional change does not improve monotonically with model size in either the Qwen3.5 (0.8B, 2B, 4B, 9B, 27B) or the Gemma3 (270M, 1B, 4B, 12B, 27B) family. Instead, each family exhibits a narrow mid-scale regime (4B for Qwen3.5, 12B for Gemma3) that supports sustained rule adoption, diverse amendments, and balanced consensus. Smaller models tend to be rule-inert, while larger models often converge on restrictive voting patterns. Heterogeneous mixed-size groups collapse into veto-driven gridlock. These non-monotonic patterns persist under temperature perturbations and under a shift from unanimity to majority voting. In the mid-scale sweet spot, vote-predictive information is linearly decodable at intermediate-to-mid layers.

Key takeaway

For AI scientists designing multi-agent LLM systems: institutional adaptability does not simply scale with model size. Consider mid-scale models, such as Qwen3.5 4B or Gemma3 12B, for tasks that require collective adaptation and diverse rule dynamics. Be wary of deploying the largest models or heterogeneous mixed-size groups: the former tend toward restrictive voting and the latter toward veto-driven gridlock, and both undermine the system's capacity for self-governance.

Key insights

Collective adaptation in LLM societies is non-monotonic, with mid-scale models exhibiting optimal rule dynamics.

Principles

Optimal collective adaptation is scale-dependent and non-monotonic.
Heterogeneous LLM societies tend to collapse into gridlock.
Decodability of vote-predictive signals is necessary but not sufficient for adaptive play.
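
To make "linearly decodable" concrete, here is a minimal sketch of a linear probe: a logistic-regression classifier trained to predict a yes/no vote from a feature vector. It is not the paper's probing code. The function names (train_linear_probe, probe_accuracy) are hypothetical, and the two-dimensional vectors are made-up toy stand-ins for hidden states, not data from the study.

```python
import math

def train_linear_probe(features, labels, lr=0.5, epochs=200):
    """Fit a logistic-regression probe with plain batch gradient descent."""
    dim = len(features[0])
    w = [0.0] * dim
    b = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))  # predicted P(vote = yes)
            err = p - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def probe_accuracy(w, b, features, labels):
    """Fraction of examples where the sign of the linear score matches the label."""
    correct = 0
    for x, y in zip(features, labels):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        correct += int((score > 0) == bool(y))
    return correct / len(labels)

# Toy stand-ins for per-layer hidden states (assumption: invented for illustration).
# The vote label depends only on the sign of the first coordinate.
toy_states = [
    [1.0, 0.5], [2.0, -0.5], [1.5, 0.5], [1.0, -0.5],
    [-1.0, 0.5], [-2.0, -0.5], [-1.5, 0.5], [-1.0, -0.5],
]
toy_votes = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = train_linear_probe(toy_states, toy_votes)
accuracy = probe_accuracy(w, b, toy_states, toy_votes)
```

In a real probing setup the probe would be fit per layer and scored on held-out examples. The principle above adds the caveat that high probe accuracy alone does not imply adaptive play.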

Method

The Nomic game serves as a controlled testbed for studying collective adaptation by allowing LLM agents to propose and vote on rule changes, coupling behavioral measurement with hidden-state probing.
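
The propose-and-vote loop can be sketched as follows. This is an illustrative outline only, not the implementation from the referenced repository: the names passes and play_round are hypothetical, and the voters are stub callables standing in for LLM agents. The voting_rule parameter mirrors the unanimity and majority conditions mentioned in the summary.

```python
from typing import Callable, List, Tuple

# A voter takes (proposal, current_rules) and returns True for "yes".
# In the study these would be LLM agents; here they are hypothetical stubs.
Voter = Callable[[str, List[str]], bool]

def passes(votes: List[bool], voting_rule: str = "unanimity") -> bool:
    """Decide whether a proposal is adopted under the given voting rule."""
    if not votes:
        return False
    yes = sum(votes)
    if voting_rule == "unanimity":
        return yes == len(votes)
    if voting_rule == "majority":
        return 2 * yes > len(votes)  # strict majority
    raise ValueError(f"unknown voting rule: {voting_rule}")

def play_round(
    rules: List[str], proposal: str, voters: List[Voter], voting_rule: str
) -> Tuple[List[str], bool]:
    """Collect one vote per agent and amend the rule set if the proposal passes."""
    votes = [voter(proposal, rules) for voter in voters]
    adopted = passes(votes, voting_rule)
    new_rules = rules + [proposal] if adopted else list(rules)
    return new_rules, adopted

# Stub agents: two always approve, one always vetoes.
always_yes: Voter = lambda proposal, rules: True
always_no: Voter = lambda proposal, rules: False
agents = [always_yes, always_yes, always_no]

initial_rules = ["Players take turns proposing one rule change."]
rules_unanimity, adopted_unanimity = play_round(
    initial_rules, "Proposals pass by majority.", agents, "unanimity"
)
rules_majority, adopted_majority = play_round(
    initial_rules, "Proposals pass by majority.", agents, "majority"
)
```

With a single dissenting agent, the proposal fails under unanimity and passes under majority, which is the kind of veto sensitivity the gridlock finding refers to.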

In practice

Avoid deploying the largest LLMs for self-governing multi-agent systems.
Prioritize mid-scale LLMs for emergent institutional dynamics.
Ensure agent homogeneity for stable collective adaptation.

Topics

Self-Amending LLM Societies
Collective Adaptation
Nomic Game
Non-Monotonic Scaling
Vote-Prediction Probing

Code references

KazuyaHoribe/nomic-coevolution

Best for: AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.MA updates on arXiv.org.