Towards Rigorous Explainability by Feature Attribution

2026-04-20 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Mathematics & Computational Sciences · Depth: Expert, extended

Summary

This paper addresses the lack of rigor in non-symbolic methods for explaining complex machine learning (ML) models, focusing on the ubiquitous SHAP tool, which uses Shapley values for feature attribution. It shows that SHAP scores can mislead human decision-makers by assigning importance to irrelevant features, or vice versa, and that this happens across classification, regression, continuous, and differentiable models. The authors propose a rigorous alternative, "corrected SHAP scores," built on a novel logic-based characteristic function for the XAI game that respects properties such as strong value independence and compliance with feature (ir)relevancy. The paper also introduces nuSHAP, a tool that approximates the corrected scores using the Castro, Gómez, and Tejada (CGT) algorithm. Experiments show no correlation between nuSHAP and SHAP rankings, with similar runtime performance.

Key takeaway

AI engineers and data scientists evaluating model interpretability should be aware that widely used SHAP scores have provable theoretical flaws that can produce misleading feature importance. If you rely on SHAP for critical decisions, reassess that reliance and consider rigorous alternatives such as nuSHAP, which aligns feature attribution with logical relevancy and offers comparable computational performance.

Key insights

Existing SHAP scores for ML explainability can mislead, giving importance to irrelevant features or withholding it from relevant ones; rigorous alternatives are necessary.

Principles

Rigor is paramount in high-stakes ML explanations.
Feature (ir)relevancy must align with attribution scores.
Popularity does not equate to rigor in XAI methods.

Method

Corrected SHAP scores replace the expected-value characteristic function of the XAI game with a logic-based one that tests whether a set of features is a Weak Abductive Explanation (WAXp). nuSHAP approximates the resulting scores using the CGT algorithm.
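
For reference, the standard Shapley value and the two kinds of characteristic function contrasted here can be written as follows. The notation is an assumption made for illustration and may differ from the paper's: $\mathcal{F}$ is the set of features, $\mathbb{F}$ the feature space, $\kappa$ the model, $\mathbf{v}$ the instance being explained, $\upsilon_e$ the expected-value characteristic function, and $\upsilon_a$ the WAXp-based one. The WAXp condition is stated for the classification case only.

```latex
\begin{align*}
\mathsf{Sc}(i) &= \sum_{S \subseteq \mathcal{F} \setminus \{i\}}
  \frac{|S|!\,(|\mathcal{F}| - |S| - 1)!}{|\mathcal{F}|!}\,
  \bigl(\upsilon(S \cup \{i\}) - \upsilon(S)\bigr) \\
\upsilon_e(S) &= \mathbb{E}\bigl[\kappa(\mathbf{x}) \mid \mathbf{x}_S = \mathbf{v}_S\bigr] \\
\upsilon_a(S) &= \begin{cases} 1 & \text{if } \mathsf{WAXp}(S) \\ 0 & \text{otherwise} \end{cases} \\
\mathsf{WAXp}(S) &\iff \forall \mathbf{x} \in \mathbb{F}.\
  \Bigl(\bigwedge_{i \in S} x_i = v_i\Bigr) \rightarrow \kappa(\mathbf{x}) = \kappa(\mathbf{v})
\end{align*}
```

Under $\upsilon_a$, a feature whose addition never turns a non-explanation into a WAXp has zero marginal contribution in every term of the sum, so its score is zero.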

In practice

Reassess conclusions from studies relying on SHAP scores.
Consider nuSHAP for rigorous feature importance rankings.
Use logic-based explainers for WAXp predicate checks.
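
To make the WAXp-based scoring concrete, here is a minimal sketch on a toy Boolean classifier. Everything in it is an assumption for illustration: the classifier `kappa`, the brute-force `is_waxp` check (a real logic-based explainer would use a reasoner rather than enumeration), and the function names. It is not the nuSHAP implementation. The sampling routine follows the general idea of permutation sampling associated with the CGT algorithm, averaging marginal contributions over random feature orderings.

```python
import itertools
import random


def kappa(x):
    """Hypothetical toy classifier: predicts x[0] AND x[1]; x[2] is ignored."""
    return int(x[0] and x[1])


def is_waxp(S, v, n):
    """Brute-force WAXp check: fixing the features in S to their values in v
    must force the prediction kappa(v) for every assignment of the rest."""
    free = [i for i in range(n) if i not in S]
    for values in itertools.product([0, 1], repeat=len(free)):
        x = list(v)
        for i, val in zip(free, values):
            x[i] = val
        if kappa(x) != kappa(v):
            return False
    return True


def char_fn(S, v, n):
    """Logic-based characteristic function: 1 if S is a WAXp, else 0."""
    return 1.0 if is_waxp(S, v, n) else 0.0


def _accumulate(order, v, n, scores):
    """Add each feature's marginal contribution along one ordering."""
    S = set()
    prev = char_fn(S, v, n)
    for i in order:
        S.add(i)
        cur = char_fn(S, v, n)
        scores[i] += cur - prev
        prev = cur


def exact_shapley(v, n):
    """Exact Shapley values by enumerating all n! orderings (tiny n only)."""
    scores = [0.0] * n
    perms = list(itertools.permutations(range(n)))
    for perm in perms:
        _accumulate(perm, v, n, scores)
    return [s / len(perms) for s in scores]


def sampled_shapley(v, n, num_samples, seed=0):
    """Permutation-sampling estimate: average over random orderings."""
    rng = random.Random(seed)
    scores = [0.0] * n
    order = list(range(n))
    for _ in range(num_samples):
        rng.shuffle(order)
        _accumulate(order, v, n, scores)
    return [s / num_samples for s in scores]


if __name__ == "__main__":
    instance = (1, 1, 0)
    print(exact_shapley(instance, 3))          # [0.5, 0.5, 0.0]
    print(sampled_shapley(instance, 3, 2000))  # estimate of the same scores
```

In this toy case the ignored feature receives a score of exactly zero under both the exact and the sampled computation, because adding it never changes whether a set is a WAXp. That is the alignment between irrelevancy and attribution the principles above call for.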

Topics

Shapley Values
Explainable AI
Feature Attribution
Logic-Based Explanations
Corrected SHAP Scores

Best for: AI Engineer, NLP Engineer, Computer Vision Engineer, AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.