The Theory of Mind Utility: Formal Specification of a Mentalizing Mechanism

2026-06-12 · Source: cs.AI updates on arXiv.org · Field: Science & Research — Mathematics & Computational Sciences, Social Sciences & Behavioral Studies, Artificial Intelligence & Machine Learning · Depth: Expert, extended

Summary

The Theory of Mind Utility (ToM-U) formally specifies, at the computational level, the epistemic state inference problem inherent in mentalizing: it states what mentalizing computes and why, without algorithmic or neural commitments. ToM-U constructs Local Epistemic World Models (LEWMs), directed typed graphs that represent agents, state nodes, and the epistemic relationships between them. It evaluates discrete candidate LEWMs against observed behavior until one reaches sufficient confidence. The framework comprises five formal definitions: LEWM structure; agent node properties, including ordered information access history; a bounded proliferation mechanism for recursive mentalizing; three inference procedures (backward inference, self-projection, and mutual reconciliation); and a residue function for failed attempts. ToM-U differs from Bayesian Theory of Mind in that it derives belief states from information access history and source credibility rather than presupposing them, and from simulation theory in that it offers a formal apparatus for epistemic state inference. The mechanism is domain-agnostic and generates falsifiable predictions about mentalizing failures.

Key takeaway

For AI Scientists and Research Scientists developing social AI, this formalization suggests a change of approach: rather than presupposing belief states, derive them from ordered information access and source credibility using Local Epistemic World Models. Incorporating the generate-and-filter architecture and the residue function may yield more robust mentalizing and help avoid the systematic Theory of Mind failures reported in current large language models.

Key insights

ToM-U formalizes epistemic state inference by deriving beliefs from ordered information access history and source credibility, not presupposing them.

Principles

Beliefs derive from ordered information access and source credibility.
Inference evaluates discrete candidate world models.
Failed mentalizing leaves a structured, persistent residue.
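
The first principle can be illustrated with a minimal sketch. It is not the paper's formal definition: the `AccessEvent` type, the `derive_belief` function, the numeric credibility scale, and the threshold rule are all assumptions made here for illustration. The sketch only shows the idea that a belief is computed from an ordered access history and source credibility instead of being supplied as an input.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AccessEvent:
    """One entry in an agent's ordered information access history (hypothetical type)."""
    proposition: str    # what the information is about, e.g. "marble_location"
    value: str          # the content the agent received, e.g. "basket"
    credibility: float  # assumed trust in the source, in [0, 1]


def derive_belief(history: list[AccessEvent], proposition: str,
                  threshold: float = 0.5) -> Optional[str]:
    """Derive a belief about `proposition` from the ordered history.

    Assumed rule: walk the history in order; a later event replaces the
    current belief only if its source is credible enough. Returns None
    when the agent never received credible information (belief absent).
    """
    belief: Optional[str] = None
    for event in history:
        if event.proposition == proposition and event.credibility >= threshold:
            belief = event.value
    return belief


# Sally saw the marble placed in the basket, then heard an unreliable rumor.
sally_history = [
    AccessEvent("marble_location", "basket", credibility=0.9),
    AccessEvent("marble_location", "box", credibility=0.2),
]
sally_belief = derive_belief(sally_history, "marble_location")
```

Under these assumptions `sally_belief` stays at the basket, because the later event comes from a source below the threshold. Swapping in a credible later event would change the result, which is the sense in which order and credibility jointly determine the belief.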

Method

ToM-U generates Local Epistemic World Models (LEWMs) as directed typed graphs. It employs a generate-and-filter architecture: candidate LEWMs are produced and then evaluated against observed behavior using backward inference, self-projection, and mutual reconciliation, with confidence accumulating as observations support a candidate.
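
The generate-and-filter idea can be sketched as follows. This is a toy under stated assumptions, not the paper's specification: the `LEWM` and `Edge` types, the single `"believes"` edge type, the `EXPECTED_ACTION` table, and the integer confidence tally are all hypothetical, and the three named inference procedures are not modeled. The sketch shows only the overall shape: enumerate discrete candidate graphs, discard those that contradict observed behavior, and accept a candidate once its accumulated support reaches a required level.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Edge:
    """A typed, directed edge from an agent node to a state node."""
    source: str
    relation: str
    target: str


@dataclass(frozen=True)
class LEWM:
    """A toy Local Epistemic World Model: a directed typed graph (hypothetical layout)."""
    name: str
    agents: frozenset[str]
    states: frozenset[str]
    edges: frozenset[Edge]

    def believed_states(self, agent: str) -> set[str]:
        return {e.target for e in self.edges
                if e.source == agent and e.relation == "believes"}


# Assumed mapping from a believed state to the behavior it predicts.
EXPECTED_ACTION = {"marble_in_basket": "search_basket",
                   "marble_in_box": "search_box"}


def generate_candidates(agent: str, states: list[str]) -> list[LEWM]:
    """Generate one candidate LEWM per state the agent might believe."""
    return [LEWM(f"{agent}_believes_{s}", frozenset({agent}), frozenset(states),
                 frozenset({Edge(agent, "believes", s)}))
            for s in states]


def filter_candidates(candidates: list[LEWM], agent: str,
                      observed_actions: list[str],
                      required: int = 2) -> Optional[LEWM]:
    """Drop candidates contradicted by behavior; return one once it has
    `required` supporting observations, else None."""
    confidence = {c.name: 0 for c in candidates}
    alive = list(candidates)
    for action in observed_actions:
        alive = [c for c in alive
                 if action in {EXPECTED_ACTION[s] for s in c.believed_states(agent)}]
        for c in alive:
            confidence[c.name] += 1
            if confidence[c.name] >= required:
                return c
    return None


candidates = generate_candidates("sally", ["marble_in_basket", "marble_in_box"])
accepted = filter_candidates(candidates, "sally", ["search_basket", "search_basket"])
```

With two consistent observations the basket candidate is accepted. A single observation, or two contradictory ones, returns `None`; in the framework described above that outcome is where a residue of the failed attempt would be recorded, which this sketch leaves out.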

In practice

Model ordered information access to avoid LLM-like ToM failures.
Distinguish "absent" from "false" beliefs for precise inference.
Miscalibrated observability can model self-deception.
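
The distinction between an absent belief and a false one can be made concrete with a short sketch. The event format, the `BeliefStatus` enum, and the `classify_belief` function are hypothetical names introduced here; the rule that an agent believes the last value it witnessed is a simplifying assumption, and source credibility and observability calibration are not modeled.

```python
from enum import Enum


class BeliefStatus(Enum):
    ABSENT = "absent"  # the agent never accessed information about the proposition
    TRUE = "true"      # the agent's last accessed value matches the current state
    FALSE = "false"    # the agent's last accessed value is out of date


# Each event is (proposition, value, witnesses), listed in temporal order.
Event = tuple[str, str, set[str]]


def classify_belief(events: list[Event], agent: str, proposition: str) -> BeliefStatus:
    """Classify an agent's belief from what it witnessed, in order."""
    actual = None
    believed = None
    for prop, value, witnesses in events:
        if prop != proposition:
            continue
        actual = value            # the world state always updates
        if agent in witnesses:
            believed = value      # the agent's belief updates only if it saw the event
    if believed is None:
        return BeliefStatus.ABSENT
    return BeliefStatus.TRUE if believed == actual else BeliefStatus.FALSE


events: list[Event] = [
    ("marble_location", "basket", {"sally", "anne"}),
    ("marble_location", "box", {"anne"}),  # Sally is out of the room
]
sally_status = classify_belief(events, "sally", "marble_location")
anne_status = classify_belief(events, "anne", "marble_location")
bob_status = classify_belief(events, "bob", "marble_location")
```

Here Sally holds a false belief, Anne a true one, and Bob, who witnessed nothing, has no belief at all. Collapsing the first and third cases into a single "does not know" value would lose the information needed to predict that Sally searches the basket while Bob has no basis for searching anywhere in particular.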

Topics

Theory of Mind
Epistemic State Inference
Local Epistemic World Models
Cognitive Architecture
Belief Inference
Mentalizing Mechanisms

Best for: AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.