From Shortcuts to Reasoning: Robust Post-Training of Theory of Mind with Reinforcement Learning

Source: Machine Learning · Field: Technology & Digital, Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

A new study introduces Thinking-RFT, a Reinforcement Fine-Tuning (RFT) method, to address the pervasive "shortcut" problem in Theory of Mind (ToM) datasets for foundation models. On existing datasets, models can reach up to 99% accuracy by exploiting spurious correlations, which creates a false impression of ToM ability. The researchers developed a framework to identify these shortcuts and found that "belief" questions are more prone to them than "intention" questions. Thinking-RFT uses verifiable rewards and explicit reasoning chains. Evaluated across four shortcut-free datasets and three ToM contexts, it improved on Supervised Fine-Tuning (SFT) by 6% overall, including 10% on complex higher-order reasoning and 7% on multimodal cases, with better generalization and robustness. The study attributes a 7% average improvement over Non-Thinking-RFT to the joint effect of reasoning and RL, which grounds reasoning in anchor cues.

Key takeaway

AI scientists developing or fine-tuning foundation models for real-world applications that require Theory of Mind should critically evaluate their ToM datasets for "shortcut" correlations, especially in "belief" questions. Thinking-RFT, which combines explicit reasoning chains with reinforcement learning, yields more robust and generalizable ToM capabilities, particularly for higher-order reasoning and multimodal scenarios. In the study, it improved performance by 6-10% over SFT.
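One simple way to probe a dataset for shortcuts is a cue-only baseline: a predictor that never reads the story and answers from a surface feature alone. The sketch below is illustrative and is not the study's identification framework. The function name `cue_only_accuracy`, the `answer` field, and the `cue_fn` callback are all hypothetical. If such a baseline scores far above chance on held-out items, the chosen cue likely leaks the label.

```python
from collections import Counter, defaultdict


def cue_only_accuracy(train, test, cue_fn):
    """Accuracy of a predictor that sees only a surface cue, never the story.

    train, test: lists of dicts with an "answer" key (assumed schema).
    cue_fn: maps an example to a hashable surface cue (hypothetical helper),
            e.g. the question template or the last location mentioned.
    """
    # Count which answer co-occurs with each cue in the training split.
    votes = defaultdict(Counter)
    for ex in train:
        votes[cue_fn(ex)][ex["answer"]] += 1

    # Fall back to the globally most common answer for unseen cues.
    fallback = Counter(ex["answer"] for ex in train).most_common(1)[0][0]

    correct = 0
    for ex in test:
        seen = votes.get(cue_fn(ex))
        pred = seen.most_common(1)[0][0] if seen else fallback
        correct += pred == ex["answer"]
    return correct / len(test)
```

Running this separately on "belief" and "intention" subsets, with several candidate cues, would show which question types are most exposed to a given shortcut.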

Key insights

Thinking-RFT, combining reasoning and RL, robustly improves Theory of Mind in foundation models by avoiding dataset shortcuts.

Method

Thinking-RFT applies Reinforcement Fine-Tuning with verifiable rewards and explicit reasoning chains to shortcut-free datasets in order to improve Theory of Mind capabilities.
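To make "verifiable rewards and explicit reasoning chains" concrete, here is a minimal sketch of a rule-based reward of the kind commonly used in reinforcement fine-tuning. The summary does not specify the study's reward, so everything here is an assumption: the `<think>`/`<answer>` tag format, the `format_bonus` value, and the exact-match comparison.

```python
import re

# Assumed output format: a reasoning chain followed by a final answer.
_PATTERN = re.compile(
    r"\s*<think>(.+?)</think>\s*<answer>(.+?)</answer>\s*", re.DOTALL
)


def verifiable_reward(completion: str, gold: str, format_bonus: float = 0.1) -> float:
    """Score a completion by format compliance plus answer correctness.

    Returns 0.0 if the output is malformed or the reasoning is empty,
    format_bonus if well-formed but wrong, and format_bonus + 1.0 if
    the extracted answer matches the gold label (case-insensitive).
    """
    match = _PATTERN.fullmatch(completion)
    if match is None:
        return 0.0  # no explicit reasoning chain or no answer tag

    reasoning, answer = match.group(1), match.group(2)
    if not reasoning.strip():
        return 0.0  # whitespace-only reasoning does not count

    correct = answer.strip().lower() == gold.strip().lower()
    return format_bonus + (1.0 if correct else 0.0)
```

Because the reward is computed by a deterministic check against a gold label rather than by a learned judge, it can be verified independently. A Non-Thinking variant could be approximated by dropping the reasoning requirement and scoring the answer alone.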

Best for: Research Scientist, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.