The Theory of Mind Utility: Formal Specification of a Mentalizing Mechanism
Summary
The Theory of Mind Utility (ToM-U) is a formal, computational-level specification of the epistemic state inference problem at the core of mentalizing: it states what mentalizing computes and why, without committing to an algorithm or a neural implementation. ToM-U constructs Local Epistemic World Models (LEWMs), directed typed graphs that represent agents, state nodes, and the epistemic relationships between them. It evaluates discrete candidate LEWMs against observed behavior, accumulating confidence until it is sufficient. The framework comprises five formal definitions: LEWM structure; agent node properties, including an ordered information access history; a bounded proliferation mechanism for recursive mentalizing; three inference procedures; and a residue function for failed attempts. ToM-U differs from Bayesian Theory of Mind in that it derives belief states from information access history and source credibility rather than presupposing them, and from simulation theory in that it provides a formal apparatus for epistemic state inference. The mechanism is domain-agnostic and generates falsifiable predictions about mentalizing failures.
Key takeaway
For AI scientists and research scientists building social AI, the main shift is to stop presupposing belief states: ToM-U specifies how to derive them from ordered information access and source credibility using Local Epistemic World Models. Adopting its generate-and-filter architecture and residue function may make mentalizing more robust, help avoid the systematic failures seen in current large language models, and support more nuanced social cognition.
Key insights
ToM-U formalizes epistemic state inference by deriving beliefs from ordered information access history and source credibility, not presupposing them.
Principles
- Beliefs derive from ordered information access and source credibility.
- Inference evaluates discrete candidate world models.
- Failed mentalizing leaves a structured, persistent residue.
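The first principle can be illustrated with a minimal Python sketch. It is not the paper's formal definition: the `AccessEvent` structure, the `derive_belief` function, the credibility scale of 0 to 1, the 0.5 threshold, and the rule that the latest sufficiently credible access wins are all assumptions made for illustration. The scenario is the classic false-belief setup in which one agent misses an update.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AccessEvent:
    """One entry in an agent's ordered information access history (hypothetical)."""
    time: int            # position in the ordering
    proposition: str     # what the information is about
    value: str           # what the source reported
    credibility: float   # assumed source credibility in [0, 1]


def derive_belief(history: List[AccessEvent], proposition: str,
                  threshold: float = 0.5) -> Optional[str]:
    """Derive a belief from access history instead of presupposing it.

    Assumed rule: the most recent access from a sufficiently credible
    source wins. Returns None when the agent never accessed the proposition.
    """
    belief = None
    for event in sorted(history, key=lambda e: e.time):
        if event.proposition == proposition and event.credibility >= threshold:
            belief = event.value
    return belief


# Sally saw the marble go into the basket, then left the room.
sally = [AccessEvent(1, "marble_location", "basket", 1.0)]
# Anne also saw it moved to the box afterwards.
anne = sally + [AccessEvent(2, "marble_location", "box", 1.0)]

sally_belief = derive_belief(sally, "marble_location")
anne_belief = derive_belief(anne, "marble_location")
```

Because each belief is computed from what the agent actually accessed and in what order, Sally's outdated belief follows from her history without being supplied as an input.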
Method
ToM-U generates Local Epistemic World Models (LEWMs) as directed typed graphs. It employs a generate-and-filter architecture with backward inference, self-projection, and mutual reconciliation to evaluate candidate LEWMs against observed behavior, accumulating confidence.
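As a rough sketch of the generate-and-filter idea, the Python below encodes a candidate LEWM as a small directed typed graph and keeps the candidate that best predicts observed behavior. Everything here is hypothetical: the `LEWM` class, the `believes` edge type, the scoring by fraction of observations predicted, and the 0.9 confidence threshold are illustrative assumptions. The three inference procedures named above and the residue function are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class LEWM:
    """Candidate world model as a directed typed graph (hypothetical encoding)."""
    nodes: dict = field(default_factory=dict)  # node id -> node type
    edges: set = field(default_factory=set)    # (source, relation, target)

    def add_edge(self, source, relation, target):
        self.edges.add((source, relation, target))

    def predicts(self, agent, target):
        # Assumed rule: an agent acts on the state it is modeled as believing.
        return (agent, "believes", target) in self.edges


def make_candidate(believed_state):
    """Generate one candidate LEWM for a single agent and two possible states."""
    model = LEWM(nodes={"sally": "agent",
                        "marble_in_basket": "state",
                        "marble_in_box": "state"})
    model.add_edge("sally", "believes", believed_state)
    return model


def generate_and_filter(candidates, observations, required=0.9):
    """Score candidates against observed behavior; keep the best if confident enough."""
    if not candidates or not observations:
        return None, 0.0
    scored = []
    for model in candidates:
        hits = sum(model.predicts(agent, target) for agent, target in observations)
        scored.append((hits / len(observations), model))
    best_score, best = max(scored, key=lambda pair: pair[0])
    return (best, best_score) if best_score >= required else (None, best_score)


candidates = [make_candidate("marble_in_basket"), make_candidate("marble_in_box")]
# Sally is observed searching the basket twice.
observations = [("sally", "marble_in_basket"), ("sally", "marble_in_basket")]
best, confidence = generate_and_filter(candidates, observations)
```

Returning `None` when no candidate clears the threshold marks a failed attempt; in this sketch nothing further is recorded about the failure.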
In practice
- Model ordered information access to avoid the Theory of Mind failures seen in LLMs.
- Distinguish "absent" from "false" beliefs for precise inference.
- Use miscalibrated observability to model self-deception.
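The distinction between "absent" and "false" beliefs can be made explicit with a three-valued status, as in the minimal Python sketch below. The `BeliefStatus` enum and the `classify_belief` function are hypothetical names introduced for illustration, and representing an absent belief as `None` is an assumption rather than something the source specifies.

```python
from enum import Enum
from typing import Optional


class BeliefStatus(Enum):
    ABSENT = "absent"  # the agent holds no belief about the proposition
    TRUE = "true"      # the agent's belief matches the actual state
    FALSE = "false"    # the agent's belief contradicts the actual state


def classify_belief(believed: Optional[str], actual: str) -> BeliefStatus:
    """Classify an agent's belief relative to the actual state of the world."""
    if believed is None:
        return BeliefStatus.ABSENT
    return BeliefStatus.TRUE if believed == actual else BeliefStatus.FALSE


# The marble is actually in the box.
newcomer = classify_belief(None, "box")      # never saw the marble
sally = classify_belief("basket", "box")     # saw it before it was moved
anne = classify_belief("box", "box")         # saw the move
```

Collapsing the first two cases into a single "does not know" value would lose the difference between an agent who would search the wrong place and one who has no basis for searching anywhere.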
Topics
- Theory of Mind
- Epistemic State Inference
- Local Epistemic World Models
- Cognitive Architecture
- Belief Inference
- Mentalizing Mechanisms
Best for: AI Scientist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.