The Representational Limit of Scalar Interactions: An Interventional Decomposition
Summary
The paper "The Representational Limit of Scalar Interactions: An Interventional Decomposition," submitted on 17 Jun 2026, argues that a signed pairwise interaction score cannot separate uniqueness (U), redundancy (R), and synergy (S): a single scalar conflates all three. The authors demonstrate the limitation on a minimal 3-way XOR structural causal model (SCM), where faithful indices such as Shapley-Taylor yield zero for every pair, while projective indices such as Shapley Interaction incorrectly spread the third-order effect across pairs. To address this, the paper introduces Stochastic Hi-Fi, a post-hoc, retraining-free predictability decomposition. According to the authors, the estimator has exact interventional semantics, finite-sample Monte Carlo bounds, strict variance reduction via coupled diamond sampling, and uniform finite-vocabulary convergence. In the reported experiments, Stochastic Hi-Fi recovers structure that scalar baselines miss, with interaction-magnitude recovery ratios up to 411x larger on tabular SCMs, and it separates redundant from synergistic heads in the GPT-2 IOI circuit. It also matches GradCAM on the Pointing Game and significantly improves Deletion AUC on NIH ChestX-ray14.
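To see why 3-way XOR is a useful minimal case, the following Python sketch enumerates Y = X1 xor X2 xor X3 and computes the mutual information between Y and every feature subset. It is not the paper's estimator. The uniform, independent input bits and the helper name `mutual_information` are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter
from itertools import combinations, product
from math import log2

def mutual_information(subset):
    """I(Y; X_subset) in bits, for Y = X1 xor X2 xor X3 (assumed uniform bits)."""
    states = list(product([0, 1], repeat=3))
    p = 1 / len(states)
    joint = Counter()
    for x in states:
        y = x[0] ^ x[1] ^ x[2]
        key = tuple(x[i] for i in subset)
        joint[(key, y)] += p
    # Marginals of the observed subset and of the target.
    px, py = Counter(), Counter()
    for (key, y), pr in joint.items():
        px[key] += pr
        py[y] += pr
    return sum(pr * log2(pr / (px[key] * py[y])) for (key, y), pr in joint.items())

# Mutual information for every non-empty subset of the three features.
info = {s: mutual_information(s) for k in (1, 2, 3) for s in combinations(range(3), k)}
for subset, bits in info.items():
    print(subset, round(bits, 6))
```

Every single feature and every pair carries 0 bits about Y, while the full triple carries 1 bit. All of the structure is third-order, which is why any score defined on pairs alone must either report nothing or borrow from the triple.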
Key takeaway
Machine learning engineers and AI scientists working on model interpretability should recognize that scalar interaction scores can misrepresent feature contributions by conflating uniqueness, redundancy, and synergy. To understand complex model behavior more accurately, consider a decomposition method such as Stochastic Hi-Fi, which reports the three components separately. This matters when debugging models and checking reliability in critical applications.
Key insights
A single scalar interaction score conflates uniqueness, redundancy, and synergy, so accurate feature attribution requires a decomposition that reports them separately.
Principles
- Scalar interaction scores conflate U/R/S.
- Interventional masked inference estimates U/R/S.
- Coupled diamond sampling reduces variance.
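The summary does not define coupled diamond sampling. The sketch below assumes it means evaluating the four corners of a pairwise "diamond" (both features kept, only the first, only the second, neither) with the same replacement draws, which is the classic common-random-numbers trick. The toy model `f`, the Gaussian reference distribution, and all names are hypothetical and chosen only to make the variance effect visible.

```python
import numpy as np

def f(x):
    # Toy model (hypothetical): one pairwise interaction plus a nuisance term.
    return x[..., 0] * x[..., 1] + 3.0 * x[..., 2]

# (sign, keep-mask) for the four diamond corners over features 0 and 1.
# Feature 2 is always masked, so it only contributes noise.
CORNERS = [
    (+1, np.array([True, True, False])),
    (-1, np.array([True, False, False])),
    (-1, np.array([False, True, False])),
    (+1, np.array([False, False, False])),
]

def diamond_samples(x, rng, n, coupled):
    """Per-sample second-order differences for features 0 and 1 at input x."""
    shared = rng.standard_normal((n, 3))
    total = np.zeros(n)
    for sign, keep in CORNERS:
        # Coupled: reuse one set of replacement draws for all four corners.
        # Independent: draw fresh replacements for every corner.
        noise = shared if coupled else rng.standard_normal((n, 3))
        total += sign * f(np.where(keep, x, noise))
    return total

x = np.array([1.0, 1.0, 1.0])
rng = np.random.default_rng(0)
coupled = diamond_samples(x, rng, 20_000, coupled=True)
independent = diamond_samples(x, rng, 20_000, coupled=False)
print("coupled:     mean", coupled.mean(), "variance", coupled.var())
print("independent: mean", independent.mean(), "variance", independent.var())
```

Both estimators target the same quantity, but in this toy the nuisance term cancels exactly when the draws are shared. Analytically the per-sample variance is 3 with coupling and 39 without, so the coupled version needs far fewer model evaluations for the same precision.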
Method
Stochastic Hi-Fi estimates per-feature uniqueness, redundancy, and synergy profiles using interventional masked inference, providing a post-hoc, retraining-free predictability decomposition.
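As a rough illustration of post-hoc, retraining-free masked inference, the sketch below scores a frozen model when only a subset of features is observed and the rest are replaced by draws from their marginals. The replacement scheme, the R^2-style score, and the names `frozen_model` and `masked_predictability` are assumptions made for this example. How the paper turns such scores into U, R, and S profiles is not reproduced here.

```python
import numpy as np

def frozen_model(X):
    # Stand-in for an already-trained model (hypothetical): XOR of the first two bits.
    return np.logical_xor(X[:, 0] > 0.5, X[:, 1] > 0.5).astype(float)

def masked_predictability(model, X, y, keep, rng, n_draws=64):
    """R^2-style score of `model` when only the columns flagged in `keep` are observed."""
    keep = np.asarray(keep, dtype=bool)
    preds = np.zeros(len(X))
    for _ in range(n_draws):
        # Interventional replacement: each masked column is resampled from its own
        # marginal (a permutation of that column), independently of the kept columns.
        donor = np.column_stack([rng.permutation(X[:, j]) for j in range(X.shape[1])])
        preds += model(np.where(keep, X, donor))
    preds /= n_draws  # Monte Carlo estimate of the expected output under masking
    return 1.0 - np.mean((preds - y) ** 2) / np.var(y)

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(4000, 3)).astype(float)
y = frozen_model(X)  # the model's own predictions serve as the target

subsets = {
    "x0": [True, False, False],
    "x1": [False, True, False],
    "x0,x1": [True, True, False],
}
scores = {name: masked_predictability(frozen_model, X, y, keep, rng)
          for name, keep in subsets.items()}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

The model is never retrained: each score comes from forward passes on masked inputs. Here either feature alone scores near zero while the pair scores 1, the pattern a synergy term is meant to capture and a single signed pairwise number cannot distinguish from other structure.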
In practice
- Recover missed structure in tabular SCMs.
- Separate redundant from synergistic heads in the GPT-2 IOI circuit.
- Improve Deletion AUC on medical imaging.
Topics
- Machine Learning
- Feature Attribution
- Model Interpretability
- Causal Inference
- Stochastic Hi-Fi
- Predictability Decomposition
Best for: Research Scientist, NLP Engineer, Computer Vision Engineer, AI Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.