When SHAP and LIME Fail: Lessons from Predicting Quality in the Automotive Industry

2026-02-28 · Source: Towards AI - Medium · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Quality Control & Standards, Data Science & Analytics · Depth: Intermediate, long

Summary

This article examines the failure modes of two popular machine learning explainability tools, SHAP and LIME, in the context of predicting material and composition defects in the automotive industry. Using the UCI Automobile dataset as a reproducible example, it shows how these tools can produce misleading, unstable, or incorrect explanations. The analysis finds that LIME explanations can vary across repeated runs for the same instance, that SHAP can misallocate credit among highly correlated features, and that global SHAP plots can obscure critical part-specific patterns. Unlike predictions, explanations have no ground truth to check against, which makes their accuracy hard to validate and can lead to flawed engineering decisions.

Key takeaway

Machine learning engineers and data scientists working on quality prediction in manufacturing should approach explainability tools with skepticism. Do not treat SHAP or LIME outputs as definitive without rigorous validation. Run stability checks on LIME, analyze feature correlations before interpreting SHAP, and segment your data to reveal patterns that a global view hides. If explainability is a hard requirement, consider a simpler, more interpretable model: an understandable model with slightly lower accuracy is often worth more than a black box with unreliable explanations.

Key insights

Explainability tools like SHAP and LIME have critical failure modes that can lead to misleading insights, especially with correlated data.

Principles

Explanation stability is crucial for actionable insights.
Correlated features distort SHAP credit distribution.
Global explanations can hide segment-specific patterns.
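
The third principle can be illustrated with a minimal sketch that compares a global mean absolute attribution against the same average computed per segment. The function names, feature names, segment labels, and attribution values here are all hypothetical and made up for illustration; they are not results from the article. The sketch assumes each row is a dict of per-feature attributions (for example, SHAP values already extracted for one prediction) and that every row carries the same features.

```python
from collections import defaultdict


def mean_abs_attribution(rows):
    """Average |attribution| per feature over a list of {feature: value} dicts."""
    totals = defaultdict(float)
    for row in rows:
        for feature, value in row.items():
            totals[feature] += abs(value)
    return {feature: total / len(rows) for feature, total in totals.items()}


def per_segment_importance(rows, segments):
    """Group rows by segment label, then average |attribution| within each group."""
    grouped = defaultdict(list)
    for row, segment in zip(rows, segments):
        grouped[segment].append(row)
    return {seg: mean_abs_attribution(group) for seg, group in grouped.items()}


# Made-up attribution values: segment "X" dominates the data set.
rows = [
    {"feature_a": 0.9, "feature_b": 0.1},
    {"feature_a": 0.8, "feature_b": 0.2},
    {"feature_a": 0.7, "feature_b": 0.1},
    {"feature_a": 0.1, "feature_b": 0.7},
]
segments = ["X", "X", "X", "Y"]

global_view = mean_abs_attribution(rows)
segment_view = per_segment_importance(rows, segments)

# The global ranking puts feature_a first, but inside segment "Y"
# feature_b carries most of the attribution.
global_top = max(global_view, key=global_view.get)
segment_y_top = max(segment_view["Y"], key=segment_view["Y"].get)
```

In this toy data the minority segment is driven by a different feature than the global ranking suggests, which is the kind of pattern a single global plot can hide.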

Method

The article demonstrates failure modes by applying SHAP, LIME, and built-in feature importance to the UCI Automobile dataset, comparing their outputs and analyzing their behavior under specific conditions like feature correlation and repeated LIME runs.
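
One way to compare outputs across methods is to measure how much their top-ranked features overlap. The sketch below is an assumption about how such a comparison could be done, not the article's own procedure: `top_k` and `top_k_agreement` are hypothetical helpers, and the input is assumed to be one dict of feature scores per method, already extracted from each tool.

```python
def top_k(scores, k=3):
    """Names of the k features with the largest absolute score."""
    ranked = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
    return set(ranked[:k])


def top_k_agreement(method_scores, k=3):
    """Jaccard overlap of the top-k feature sets for every pair of methods.

    method_scores maps a method name to a {feature: score} dict.
    Returns {(method_a, method_b): overlap between 0.0 and 1.0}.
    """
    names = sorted(method_scores)
    result = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            set_a = top_k(method_scores[a], k)
            set_b = top_k(method_scores[b], k)
            result[(a, b)] = len(set_a & set_b) / len(set_a | set_b)
    return result
```

A low overlap between two methods does not say which one is right; it only flags that the explanation deserves a closer look before anyone acts on it.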

In practice

Triangulate explanations across multiple methods.
Stability-test LIME: rerun it on the same instance and expect the same top feature in at least 70% of runs.
Check feature correlations before interpreting SHAP; treat pairs with |r| > 0.7 as highly correlated.
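
The two threshold checks above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the article's code: `explain_fn` is a hypothetical zero-argument callable that returns one explanation as a list of (feature, weight) pairs (for LIME, it could wrap a call to `explain_instance(...)` followed by `as_list()`), and `columns` is assumed to map each feature name to a list of numeric values of equal length.

```python
import math
from collections import Counter


def top_feature_consistency(explain_fn, n_runs=20):
    """Return the most frequent top-ranked feature and its share of runs."""
    tops = []
    for _ in range(n_runs):
        explanation = explain_fn()  # list of (feature_name, weight) pairs
        top_name, _ = max(explanation, key=lambda pair: abs(pair[1]))
        tops.append(top_name)
    name, count = Counter(tops).most_common(1)[0]
    return name, count / n_runs


def pearson(xs, ys):
    """Pearson correlation; assumes neither column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def correlated_pairs(columns, threshold=0.7):
    """Return (name_a, name_b, r) for feature pairs with |r| above threshold."""
    names = sorted(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) > threshold:
                flagged.append((a, b, r))
    return flagged
```

A consistency share below 0.7 suggests the explanation for that instance is too unstable to act on, and any pair returned by `correlated_pairs` is a cue to read the SHAP values of those two features together rather than individually.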

Topics

Machine Learning Explainability
SHAP
LIME
Feature Importance
Automotive Quality Prediction

Best for: Machine Learning Engineer, Data Scientist, AI Operations Specialist

Editorial summary, takeaway, and curation by AIssential. Original article published by Towards AI - Medium.