MinShap: A Modified Shapley Value Approach for Feature Selection
Summary
MinShap is a feature selection algorithm that adapts the classic Shapley value framework to address its difficulty in distinguishing direct from indirect feature effects. Developed from a directed acyclic graph (DAG) model perspective, MinShap takes the minimum marginal contribution across feature permutations, rather than the average, to identify features with a direct influence on the target variable. The algorithm comes with theoretical guarantees for Type I error control and is connected to multiple hypothesis testing and p-value approaches, which improve performance in small-sample and high-noise settings. Numerical simulations across linear, non-linear, conditional interaction, and logistic models, using XGBoost and neural networks, show that MinShap and its related Max-p algorithm consistently outperform state-of-the-art methods such as LOCO, GCM, and Lasso in accuracy, F1 score, and stability, while maintaining Type I error control. Applications to the wine quality and California housing datasets further support these accuracy and stability results.
Key takeaway
For AI Engineers and Research Scientists tasked with robust feature selection in complex, non-linear models with highly dependent features, MinShap offers a statistically sound and empirically superior alternative to traditional methods. Its ability to distinguish direct from indirect feature effects, coupled with strong performance in accuracy and stability, means you can build more interpretable and reliable models. Consider integrating MinShap, especially when dealing with sparse datasets or when existing methods like LOCO or GCM yield unstable or inaccurate results.
Key insights
MinShap adapts Shapley values for robust feature selection by focusing on minimum marginal contributions across permutations.
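For readers who prefer a formula, the contrast between averaging and taking the minimum can be written as follows. The notation is an assumption made for illustration and is not taken from the paper: v(S) is some value (for example predictive performance) of a model restricted to the feature subset S, Π is the set of all permutations of the d features, and P_j^π is the set of features preceding j in permutation π.

```latex
% Assumed notation (not from the paper):
%   v(S)      value of a model that uses only the features in S
%   \Pi       all permutations of the d features
%   P_j^\pi   features that precede feature j in permutation \pi
\[
\phi_j^{\mathrm{Shap}}
  = \frac{1}{d!} \sum_{\pi \in \Pi}
    \Bigl[ v\bigl(P_j^{\pi} \cup \{j\}\bigr) - v\bigl(P_j^{\pi}\bigr) \Bigr],
\qquad
\phi_j^{\mathrm{MinShap}}
  = \min_{\pi \in \Pi}
    \Bigl[ v\bigl(P_j^{\pi} \cup \{j\}\bigr) - v\bigl(P_j^{\pi}\bigr) \Bigr].
\]
```

The only difference is the aggregation: a feature whose contribution vanishes once some other subset of features is already in the model keeps a positive average but gets a minimum of zero.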
Principles
- The faithfulness assumption links conditional independence in the data to the structure of the DAG.
- Minimum marginal contribution identifies direct feature effects.
- Multiple hypothesis testing improves power in finite-sample regimes.
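The principle that the minimum marginal contribution isolates direct effects can be seen in a toy example. The sketch below is illustrative only and is not the paper's implementation: it assumes a hypothetical chain X0 -> X1 -> Y with unit-variance noise at each step, and uses a hand-written value table (the variance of Y explained by each feature subset) instead of a fitted model. The function name `marginal_contributions` is also hypothetical.

```python
from itertools import permutations


def marginal_contributions(value, n_features):
    """Return {feature: [marginal contribution in each permutation]}."""
    contribs = {j: [] for j in range(n_features)}
    for perm in permutations(range(n_features)):
        preceding = frozenset()
        for j in perm:
            with_j = preceding | {j}
            # Gain from adding feature j to the features that precede it.
            contribs[j].append(value[with_j] - value[preceding])
            preceding = with_j
    return contribs


# Assumed toy chain X0 -> X1 -> Y (X1 = X0 + noise, Y = X1 + noise).
# value[S] is the variance of Y explained by the features in S.
value = {
    frozenset(): 0.0,
    frozenset({0}): 1.0,     # X0 alone explains part of Y, through X1
    frozenset({1}): 2.0,     # X1 alone explains everything explainable
    frozenset({0, 1}): 2.0,  # adding X0 on top of X1 adds nothing
}

contribs = marginal_contributions(value, n_features=2)
shapley = {j: sum(c) / len(c) for j, c in contribs.items()}  # average
minshap = {j: min(c) for j, c in contribs.items()}           # minimum

print(shapley)  # {0: 0.5, 1: 1.5}
print(minshap)  # {0: 0.0, 1: 1.0}
```

The average gives the indirect feature X0 a positive score, while the minimum drives it to zero and keeps the direct feature X1 positive.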
Method
MinShap replaces the average marginal contribution in Shapley values with the minimum contribution across permutations, leveraging DAG faithfulness. It uses a threshold derived from variance estimates for Type I error control and can be extended with adjusted p-value methods.
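The summary does not give the exact threshold or p-value adjustment, so the following is only one plausible way to turn the idea into a decision rule, stated under explicit assumptions. It assumes per-sample loss differences (loss without the feature minus loss with it) are available for each conditioning subset, uses a one-sided normal approximation based on their estimated variance, and selects a feature only if the largest p-value across subsets is below alpha. The function names and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_sided_p_value(loss_diffs):
    """p-value for H0: mean loss difference <= 0 (normal approximation)."""
    se = stdev(loss_diffs) / sqrt(len(loss_diffs))  # variance-based scale
    z = mean(loss_diffs) / se
    return 1.0 - NormalDist().cdf(z)


def select_feature(loss_diffs_by_subset, alpha=0.05):
    """Select only if every conditioning subset rejects H0 (max p <= alpha)."""
    max_p = max(one_sided_p_value(d) for d in loss_diffs_by_subset.values())
    return max_p <= alpha, max_p


# Hypothetical loss differences for a feature with a direct effect:
# it helps whether or not the other feature is already in the model.
direct = {
    "given {}": [0.9, 1.1, 1.0, 1.2, 0.8, 1.0, 1.1, 0.9],
    "given {X0}": [0.5, 0.4, 0.6, 0.5, 0.7, 0.3, 0.5, 0.5],
}
# Hypothetical loss differences for a feature with only an indirect effect:
# it helps on its own but adds nothing once X1 is in the model.
indirect = {
    "given {}": [0.5, 0.6, 0.4, 0.5, 0.7, 0.3, 0.5, 0.5],
    "given {X1}": [0.1, -0.1, 0.05, -0.05, 0.0, 0.1, -0.1, 0.0],
}

print(select_feature(direct)[0])    # True
print(select_feature(indirect)[0])  # False
```

Requiring every subset to reject is the p-value counterpart of thresholding the minimum contribution; with only eight samples per subset the normal approximation here is purely for illustration.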
In practice
- Use MinShap for feature selection in non-linear models with dependent features.
- Consider adjusted p-value methods for sparse models or small sample sizes.
- Parallelize computation of marginal contributions for efficiency.
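As a rough illustration of the parallelization advice, each permutation can be evaluated independently and the results reduced with a minimum at the end. This is a minimal sketch under assumptions, not the authors' code: `value` is a hypothetical stand-in for fitting and scoring a model on a feature subset, and all function names are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import permutations


def value(subset):
    """Hypothetical value function; in practice, fit and score a model."""
    table = {
        frozenset(): 0.0,
        frozenset({0}): 1.0,
        frozenset({1}): 2.0,
        frozenset({0, 1}): 2.0,
    }
    return table[frozenset(subset)]


def contributions_for(perm):
    """Marginal contribution of every feature along one permutation."""
    out, preceding = {}, frozenset()
    for j in perm:
        with_j = preceding | {j}
        out[j] = value(with_j) - value(preceding)
        preceding = with_j
    return out


def parallel_minshap(n_features, max_workers=4):
    """Evaluate permutations concurrently, then take the minimum per feature."""
    perms = list(permutations(range(n_features)))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_perm = list(pool.map(contributions_for, perms))
    return {j: min(c[j] for c in per_perm) for j in range(n_features)}


scores = parallel_minshap(n_features=2)
print(scores)  # {0: 0.0, 1: 1.0}
```

Threads help mainly when the value function spends its time in native code that releases the interpreter lock; for pure-Python scoring, `ProcessPoolExecutor` offers the same `map` interface. Enumerating every permutation is only feasible for a handful of features, so caching values for repeated subsets is worth considering.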
Topics
- MinShap Algorithm
- Feature Selection
- Shapley Values
- Directed Acyclic Graphical Models
- Multiple Hypothesis Testing
Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, Data Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.