GPT-4o Lacks Core Features of Theory of Mind
Summary
A new evaluation framework assesses whether Large Language Models (LLMs) possess a Theory of Mind (ToM) by probing for a causal model of mental states and behavior. The research asks whether LLMs have a coherent, domain-general, and consistent understanding of how mental states drive actions, whether or not that understanding resembles human ToM. The study found that while LLMs, including GPT-4o, can approximate human judgments in basic ToM scenarios, they fail at logically equivalent tasks and show low consistency between their predicted actions and inferred mental states. These results indicate that the observed social proficiency in LLMs does not stem from a domain-general or consistent ToM.
Key takeaway
If you are an AI researcher developing socially intelligent agents, critically re-evaluate current ToM benchmarks. Your models' apparent social proficiency may not reflect a true understanding of mental states, so evaluations should probe for consistent, causal models of behavior rather than only checking how closely outputs approximate human judgments. This shift matters for building robust and reliable AI systems.
Key insights
LLMs like GPT-4o lack a consistent, domain-general causal model of mental states and behavior, even though they succeed on social tasks.
Principles
- Social proficiency does not imply ToM.
- ToM requires causal mental state models.
Method
The framework tests LLMs for a coherent, domain-general, and consistent model of how mental states cause behavior, using logically equivalent tasks and consistency checks between action predictions and mental state inferences.
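One way to picture the consistency check is as a comparison between two sets of model outputs: the action probabilities the model predicts for each mental state, and the mental state probabilities it infers after seeing an action. The sketch below is illustrative only and is not the study's actual procedure. It assumes both sets of judgments have already been elicited as probabilities, it assumes a prior over mental states is available, and all names (`implied_posterior`, `consistency_gap`, the example states and actions) are hypothetical. It uses Bayes' rule to derive the inference implied by the action predictions and measures how far the model's stated inference departs from it.

```python
from typing import Dict

Dist = Dict[str, float]  # maps an outcome label to a probability or weight


def normalize(weights: Dist) -> Dist:
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {k: v / total for k, v in weights.items()}


def implied_posterior(action_given_state: Dict[str, Dist],
                      prior: Dist, action: str) -> Dist:
    """Bayes' rule: P(state | action) is proportional to P(action | state) * P(state)."""
    return normalize({
        state: action_given_state[state].get(action, 0.0) * prior[state]
        for state in prior
    })


def total_variation(p: Dist, q: Dist) -> float:
    """Total variation distance: 0 means identical, 1 means disjoint."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def consistency_gap(action_given_state: Dict[str, Dist], prior: Dist,
                    action: str, stated_posterior: Dist) -> float:
    """Distance between the model's stated inference and the one implied
    by its own action predictions (assumed metric, chosen for simplicity)."""
    implied = implied_posterior(action_given_state, prior, action)
    return total_variation(implied, normalize(stated_posterior))


# Hypothetical elicited judgments for a two-state, two-action scenario.
predictions = {
    "wants_apple": {"go_left": 0.9, "go_right": 0.1},
    "wants_pear": {"go_left": 0.2, "go_right": 0.8},
}
uniform_prior = {"wants_apple": 0.5, "wants_pear": 0.5}
stated = {"wants_apple": 0.5, "wants_pear": 0.5}  # model's direct inference

gap = consistency_gap(predictions, uniform_prior, "go_left", stated)
print(round(gap, 3))  # 0.318
```

A gap near 0 means the model's inferences agree with its own action predictions; a large gap is the kind of inconsistency the framework is designed to expose. Total variation distance is just one convenient choice here, and a real evaluation would aggregate over many scenarios.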
In practice
- Evaluate LLMs beyond simple benchmarks.
- Focus on causal models for ToM assessment.
Topics
- Theory of Mind
- Large Language Models
- GPT-4o
- AI Evaluation
- Social Cognition
Best for: AI Researcher, AI Scientist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.