Enhancing Decision-Making with Large Language Models through Multi-Agent Fictitious Play

2026-06-18 · Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

Multi-Agent Fictitious Play (MAFP) is a novel paradigm addressing "stance entanglement" in LLM-based multi-agent decision-making, a challenge where stakeholder decisions are mutually dependent. Unlike existing systems that handle "execution complexity," MAFP formulates decision-making as an equilibrium-seeking process. It iteratively updates each agent's decision by best-responding to the empirical mixture of other agents' past decisions, inspired by game theory's fictitious play. Evaluated across 13 competitive game and negotiation scenarios, MAFP outperformed single-round and multi-round baselines, including CoT, Self-Reflection, Debate, and Theory-of-Mind. It achieved the highest average tournament strength (0.533) and robustness (0.421), demonstrating superior effectiveness in complex scenarios with imperfect information or mixed-strategy equilibria, using Qwen3.5-35B-A3B as the action model.

Key takeaway

For AI Architects designing LLM-based multi-agent systems for strategic decision-making, you should consider implementing Multi-Agent Fictitious Play (MAFP). This framework effectively addresses "stance entanglement" by iteratively refining agent policies against empirical opponent mixtures, leading to more robust and less exploitable outcomes. Your systems will achieve superior performance in complex, real-world scenarios involving imperfect information or mixed-strategy equilibria, moving beyond the limitations of single-chain reasoning and traditional divide-and-conquer methods.

Key insights

Iterative fictitious play with LLM agents resolves "stance entanglement" in complex decision-making by decomposing recursive anticipation into sequential best responses.

Principles

Decision complexity arises from "stance entanglement" among interdependent stakeholders.
Robust decisions are equilibrium-seeking, maximizing payoff without exploitable weaknesses.
Fictitious play decomposes recursive anticipation into iterative best responses.

Method

MAFP initializes agents with policies, then iteratively aggregates opponents' past policies into an empirical mixture, and generates best-response policies against this mixture.

In practice

Apply MAFP to competitive games and negotiation scenarios.
Evaluate policies using "tournament strength" and "robustness" metrics.
Utilize Qwen3.5-35B-A3B as the action model for policy execution.

Topics

Large Language Models
Multi-Agent Systems
Game Theory
Fictitious Play
Decision-Making
Stance Entanglement
Strategic Reasoning

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Architect

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.