The Representational Limit of Scalar Interactions: An Interventional Decomposition
Summary
Stochastic Hi-Fi is a post-hoc, retraining-free predictability decomposition method. It targets a basic limitation of signed pairwise interaction scores: a single scalar conflates uniqueness (U), redundancy (R), and synergy (S). The paper demonstrates the conflation on a minimal 3-way XOR structural causal model (SCM), where faithful indices such as Shapley-Taylor return zero for every pair and projective indices such as Shapley Interaction spread the third-order effect into conflated pair scalars. Stochastic Hi-Fi instead estimates per-feature U/R/S profiles using interventional masked inference, and it comes with exact interventional semantics, finite-sample Monte Carlo bounds, strict variance reduction from coupled diamond sampling, and uniform finite-vocabulary convergence. In evaluations on tabular SCMs, it achieved interaction-magnitude recovery ratios up to 411x larger than scalar baselines. It also separated redundant and synergistic heads in the GPT-2 IOI (indirect object identification) circuit, and it matched GradCAM on Pointing Game while substantially improving Deletion AUC on NIH ChestX-ray14.
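To see why the 3-way XOR is such a hard case for pairwise scores, the short Python sketch below enumerates the full truth table and measures how much each feature subset tells you about the target. It uses plain mutual information as a stand-in for predictability. That choice, the function name `mutual_information`, and the uniform-bits setup are illustrative assumptions, not the paper's estimator.

```python
import itertools
import math
from collections import Counter

def mutual_information(rows, feature_idx):
    """I(X_S; Y) in bits, computed exactly from an enumerated, equally weighted table."""
    n = len(rows)
    joint = Counter((tuple(x[i] for i in feature_idx), y) for x, y in rows)
    px = Counter(tuple(x[i] for i in feature_idx) for x, _ in rows)
    py = Counter(y for _, y in rows)
    mi = 0.0
    for (xs, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[xs] / n) * (py[y] / n)))
    return mi

# Three independent fair bits; the target is their XOR (assumed toy setup).
rows = [(x, x[0] ^ x[1] ^ x[2]) for x in itertools.product((0, 1), repeat=3)]

# Information about Y carried by every non-empty feature subset.
info = {
    subset: mutual_information(rows, subset)
    for r in range(1, 4)
    for subset in itertools.combinations(range(3), r)
}
for subset, bits in info.items():
    print(subset, round(bits, 6))
```

Every single feature and every pair carries zero bits about the target, while all three together carry one full bit. The structure is purely third-order, so a per-pair number has nothing of its own to report.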
Key takeaway
For AI scientists and ML engineers working on model interpretability: scalar interaction scores such as Shapley Interaction collapse uniqueness, redundancy, and synergy into one number, so they cannot tell you which role a feature plays. If you are analyzing complex models or sensitive applications, consider Stochastic Hi-Fi, which reports the three components separately and has been evaluated on tabular SCMs, GPT-2, and medical imaging.
Key insights
Stochastic Hi-Fi accurately decomposes feature interactions into uniqueness, redundancy, and synergy, overcoming limitations of scalar interaction scores.
Principles
- Scalar interaction scores conflate uniqueness, redundancy, and synergy.
- Interventional masked inference estimates U/R/S profiles.
- Coupled diamond sampling reduces variance in estimations.
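This summary does not define coupled diamond sampling. One plausible reading, assumed here, is that the four coalitions forming a "diamond" in the subset lattice (none, {i}, {j}, {i, j}) are evaluated on the same background draws, which is the classic common-random-numbers trick. The sketch below compares that coupling against independent draws on a hypothetical toy model. The names `model`, `masked`, and `diamond_estimate` are illustrative and not taken from the paper.

```python
import numpy as np

def model(X):
    # Hypothetical toy model: one pairwise interaction plus a nuisance term on feature 2.
    return X[:, 0] * X[:, 1] + 3.0 * X[:, 2]

def masked(x, keep, draws):
    """Copy the background draws and clamp the features in `keep` to the instance x."""
    out = draws.copy()
    cols = np.asarray(keep, dtype=int)
    out[:, cols] = x[cols]
    return out

def diamond_estimate(x, i, j, n, rng, coupled):
    """Monte Carlo estimate of v({i,j}) - v({i}) - v({j}) + v({})."""
    corners = [((i, j), +1), ((i,), -1), ((j,), -1), ((), +1)]
    shared = rng.standard_normal((n, x.size))
    total = 0.0
    for keep, sign in corners:
        # Coupled: all four corners reuse the same draws. Independent: fresh draws per corner.
        draws = shared if coupled else rng.standard_normal((n, x.size))
        total += sign * model(masked(x, keep, draws)).mean()
    return total

rng = np.random.default_rng(0)
x = np.array([1.0, 1.0, 0.0])
coupled = [diamond_estimate(x, 0, 1, 64, rng, True) for _ in range(200)]
independent = [diamond_estimate(x, 0, 1, 64, rng, False) for _ in range(200)]
var_coupled, var_independent = np.var(coupled), np.var(independent)
print(var_coupled, var_independent)
```

With shared draws, the nuisance term on feature 2 cancels exactly across the four corners, so only the interaction itself contributes noise. This toy case illustrates the intuition only; the strict variance-reduction guarantee is the paper's claim, not something this sketch establishes.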
Method
Stochastic Hi-Fi employs interventional masked inference to estimate per-feature uniqueness, redundancy, and synergy profiles. It is a post-hoc, retraining-free predictability decomposition method with exact interventional semantics.
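As a rough illustration of what interventional masked inference can look like, the sketch below estimates the expected model output when a chosen subset of features is held at the instance's values and the remaining features are replaced by draws from a background dataset. It is a generic Monte Carlo value function written under stated assumptions, not the Stochastic Hi-Fi estimator. The names `interventional_value` and `xor_model` and the background table are hypothetical.

```python
import itertools
import numpy as np

def interventional_value(model, x, keep, background, n_samples=4096, seed=0):
    """Estimate E[model(x_keep, X_rest)] with X_rest drawn from rows of `background`."""
    rng = np.random.default_rng(seed)
    rows = rng.integers(0, len(background), size=n_samples)
    samples = background[rows].copy()        # start from reference rows
    cols = np.asarray(list(keep), dtype=int)
    samples[:, cols] = x[cols]               # clamp the kept features to the instance
    return float(np.mean(model(samples)))

def xor_model(X):
    # For 0/1 inputs, the XOR of three bits equals their sum modulo 2.
    return np.mod(X.sum(axis=1), 2)

# Background: the full truth table of three fair bits (assumed reference distribution).
background = np.array(list(itertools.product((0, 1), repeat=3)), dtype=float)
x = np.array([1.0, 0.0, 0.0])

v_none = interventional_value(xor_model, x, (), background)
v_pair = interventional_value(xor_model, x, (0, 1), background)
v_all = interventional_value(xor_model, x, (0, 1, 2), background)
print(v_none, v_pair, v_all)
```

No retraining is involved: the model is only queried on masked inputs. On the XOR model, holding any pair fixed leaves the expected output near 0.5, the same as holding nothing fixed, and only fixing all three features recovers the instance's actual output of 1.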
In practice
- Recover missed structure in tabular SCMs.
- Separate redundant/synergistic LLM heads.
- Enhance interpretability for medical imaging.
Topics
- Machine Learning Interpretability
- Feature Interaction
- Structural Causal Models
- Explainable AI
- GPT-2
- Medical Imaging
Best for: Research Scientist, NLP Engineer, Computer Vision Engineer, AI Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.