SAVOIR: Learning Social Savoir-Faire via Shapley-based Reward Attribution
Summary
The Savoir framework, grounded in cooperative game theory, addresses the credit assignment problem in training socially intelligent language agents with reinforcement learning. It combines two components: expected utility for prospective valuation, which captures an utterance's strategic potential across future dialogue trajectories, and Shapley values for fair credit distribution, which come with axiomatic guarantees. Evaluated on the SOTOPIA benchmark, Savoir achieves new state-of-the-art performance across all settings, and its 7B model matches or exceeds proprietary models such as GPT-4o and Claude-3.5-Sonnet. Notably, large reasoning models consistently underperform, suggesting that social intelligence requires capabilities distinct from analytical reasoning. Human evaluations further validate that Savoir produces more strategic responses and provides better credit assignment.
Key takeaway
For AI engineers developing socially intelligent agents, Savoir offers a robust, theoretically grounded approach to reward modeling. Consider integrating expected utility and Shapley values into your reinforcement learning pipelines to move beyond heuristic credit assignment. Doing so can lead to agents that exhibit more strategic and human-aligned social behaviors, and that can even outperform larger proprietary models on complex social tasks.
Key insights
Savoir uses game theory's Shapley values and expected utility for principled, forward-looking credit assignment in social reinforcement learning.
Principles
- Expected utility captures strategic potential.
- Shapley values ensure fair credit distribution.
- Social intelligence differs from analytical reasoning.
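For reference, the second principle rests on the standard Shapley value from cooperative game theory. The notation below is an assumption made for illustration and is not taken from the paper: N is the set of utterances being credited, and v(S) is the expected utility obtained when only the utterances in coalition S are present.

```latex
% Assumed notation: N = set of utterances, v(S) = expected utility of coalition S
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,\bigl(|N| - |S| - 1\bigr)!}{|N|!}\,
  \bigl( v(S \cup \{i\}) - v(S) \bigr),
\qquad
\sum_{i \in N} \phi_i(v) \;=\; v(N) - v(\emptyset).
```

The first expression averages utterance i's marginal contribution over all coalitions. The second is the efficiency axiom: the per-utterance credits sum exactly to the total utility being distributed.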
Method
Savoir computes utterance-level rewards in three steps: it samples coalitions of utterances, estimates each coalition's expected utility with Monte Carlo rollouts, and then solves the weighted linear regression used by KernelSHAP to obtain Shapley values for fair credit assignment.
In practice
- Train reward models with 7,500 utterance-level annotations.
- Use KernelSHAP for efficient Shapley value approximation.
- Prioritize extreme coalition sizes in sampling.
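One common way to prioritize extreme coalition sizes, assumed here and not confirmed as the paper's exact scheme, is to draw the coalition size in proportion to the total Shapley kernel mass of that size, which is largest for the smallest and largest coalitions. The helper names `size_sampling_probs` and `sample_coalition` are hypothetical.

```python
import random


def size_sampling_probs(n):
    """Probability of drawing each coalition size 1..n-1.

    The total Shapley kernel mass of size s is proportional to
    (n - 1) / (s * (n - s)), which peaks at s = 1 and s = n - 1.
    """
    mass = {s: (n - 1) / (s * (n - s)) for s in range(1, n)}
    total = sum(mass.values())
    return {s: m / total for s, m in mass.items()}


def sample_coalition(n, rng):
    """Draw a size from the kernel distribution, then a uniform subset of that size."""
    probs = size_sampling_probs(n)
    sizes = list(probs)
    size = rng.choices(sizes, weights=[probs[s] for s in sizes], k=1)[0]
    return frozenset(rng.sample(range(n), size))


rng = random.Random(0)
for size, prob in size_sampling_probs(6).items():
    print(f"size {size}: probability {prob:.3f}")
print(sample_coalition(6, rng))
```

With six utterances, sizes 1 and 5 are each drawn more often than the mid-sized coalitions, so the rollout budget is concentrated on the coalitions that carry the most weight in the regression.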
Topics
- Savoir Framework
- Social Intelligence
- Reinforcement Learning
- Shapley Values
- Expected Utility
Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.