Statistical Inference and Learning for Shapley Additive Explanations (SHAP)
Summary
This paper introduces a semi-parametric approach to statistical inference on the $p$-th powers of Shapley Additive Explanations (SHAP) values, specifically $\theta_p := \mathbb{E}|\phi_a(X)|^p$, where $\phi_a(X)$ denotes the SHAP value of feature $a$ at input $X$. SHAP is widely used to attribute feature importance in predictive tasks, but existing methods offer no statistical inference for global measures such as mean absolute SHAP ($p = 1$) or mean squared SHAP ($p = 2$). The authors develop asymptotically normal estimators of $\theta_p$ by treating the SHAP curve as a nuisance function. For $p \geq 2$, they propose a de-biased estimator that combines U-statistics with Neyman orthogonal scores. For $1 \leq p < 2$, where the target parameter is not twice differentiable, they construct de-biased U-statistics for a smoothed alternative and tune a temperature parameter so that inference remains valid for the true, unsmoothed $p$-th power. The paper also presents a Neyman orthogonal loss function for learning the SHAP curve via empirical risk minimization, with excess risk guarantees.
Key takeaway
For AI scientists and machine learning practitioners who rely on SHAP for feature importance and selection, this work adds statistical guarantees that were previously missing. You can now construct asymptotically valid confidence intervals for global SHAP measures, such as mean absolute SHAP and mean squared SHAP, instead of reporting point estimates alone. This supports more reliable decision-making in applications like medicine and marketing, where feature contributions drive conclusions. Consider integrating these inferential techniques to quantify uncertainty in your SHAP-based analyses.
Key insights
New methods enable statistical inference and confidence intervals for global SHAP measures, addressing a gap in explainable AI.
Principles
- Treat the SHAP curve as a nuisance function for robust estimation.
- Neyman orthogonality helps de-bias estimators for complex functionals.
- Smoothing irregular functionals allows asymptotic normality.
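For reference, the Neyman orthogonality principle can be stated in its standard textbook form. The notation here is an assumption and is not taken from the paper: $\psi$ is a score function, $W$ the observed data, $\theta_0$ the target parameter, and $\eta_0$ the true nuisance function (in this setting, the SHAP curve). The score identifies the target and is first-order insensitive to errors in the nuisance:

$$
\mathbb{E}\big[\psi(W; \theta_0, \eta_0)\big] = 0,
\qquad
\frac{d}{dr}\, \mathbb{E}\big[\psi\big(W; \theta_0, \eta_0 + r(\eta - \eta_0)\big)\big]\Big|_{r=0} = 0
\quad \text{for all admissible } \eta .
$$

The second condition is why small errors in an estimated nuisance function affect the estimate of $\theta_0$ only at second order.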
Method
The method uses de-biased U-statistics and Neyman orthogonal scores. For non-differentiable cases ($1 \leq p < 2$), it employs a smoothed surrogate function based on $\tanh(x)$ with a carefully tuned temperature parameter.
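To give a feel for how a temperature-controlled $\tanh$ smoothing behaves, here is a minimal sketch. It assumes one common surrogate, replacing $|x|$ with $x \tanh(x / \tau)$; the paper's exact surrogate and its temperature schedule may differ, and the function name `smoothed_abs_power` is hypothetical.

```python
import math


def smoothed_abs_power(x: float, p: float, temperature: float) -> float:
    """Smooth stand-in for |x|**p (an illustrative assumption, not the paper's form).

    |x| is replaced by x * tanh(x / temperature), which is non-negative,
    smooth at zero, and approaches |x| as the temperature shrinks.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    smooth_abs = x * math.tanh(x / temperature)
    return smooth_abs ** p


if __name__ == "__main__":
    # The gap to the unsmoothed |x| shrinks as the temperature decreases.
    for temperature in (1.0, 0.1, 0.01):
        gap = abs(smoothed_abs_power(0.5, 1.0, temperature) - abs(0.5))
        print(f"temperature={temperature}: gap to |x| = {gap:.2e}")
```

The trade-off this illustrates is that a smaller temperature reduces the smoothing bias but makes the surrogate's curvature near zero larger, which is why the temperature has to be tuned rather than simply set to a tiny value.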
In practice
- Construct confidence intervals for global SHAP importance.
- Improve feature selection reliability with statistical guarantees.
- Utilize the de-biased loss for high-fidelity SHAP curve estimation.
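As a point of comparison for the first item, the sketch below builds a naive plug-in (Wald) confidence interval for $\mathbb{E}|\phi_a(X)|^p$ from a list of per-sample SHAP values. This is not the paper's de-biased estimator: it assumes the SHAP values are known exactly and ignores the error from estimating the SHAP curve, which is the gap the de-biased U-statistics are meant to close. The function name `wald_interval` and the synthetic data are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def wald_interval(shap_values, p=1.0, level=0.95):
    """Naive normal-approximation interval for the mean of |phi|**p.

    Assumes the SHAP values are exact (no nuisance estimation error),
    so it is a baseline only, not a de-biased estimator.
    """
    powered = [abs(v) ** p for v in shap_values]
    n = len(powered)
    if n < 2:
        raise ValueError("need at least two SHAP values")
    estimate = fmean(powered)
    std_error = stdev(powered) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    return estimate, (estimate - z * std_error, estimate + z * std_error)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for one feature's SHAP values across a dataset.
    fake_shap = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    est, (low, high) = wald_interval(fake_shap, p=2.0)
    print(f"mean squared SHAP: {est:.3f}, 95% interval: ({low:.3f}, {high:.3f})")
```

When the SHAP values themselves come from a fitted model, an interval like this can undercover, which is the motivation for the orthogonalized approach summarized above.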
Topics
- Shapley Additive Explanations
- Statistical Inference
- Feature Importance
- Semi-parametric Learning
- Neyman Orthogonality
Best for: AI Scientist, AI Researcher, Data Scientist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.