What Is the Prediction Actually Made Of?

2026-01-11 · Source: Agus’s Substack · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Advanced, long

Summary

This post examines the limitations of SHAP (SHapley Additive exPlanations) for retrieval-based machine learning models, particularly in the context of tabular in-context learning (ICL). It argues that while SHAP provides additive feature attributions, it fails to explain the underlying mechanism of how predictions are formed in models like k-nearest neighbors (kNN), TabPFN, and TabICL. Using soft kNN as a transparent teaching model, the author demonstrates four critical aspects SHAP misses: neighbor identity, label disagreement, neighbor switching versus structural effects, and local data density/extrapolation. The article introduces an "inner explanation" framework that focuses on the weighted neighborhood of training examples, providing metrics like Neff (effective neighbor count), Δy (label dispersion), and Δx (local radius) to offer a more complete understanding of model predictions and their stability.

Key takeaway

For machine learning engineers building or deploying retrieval-based models, relying solely on SHAP for interpretability can be misleading. Add an "inner explanation" by analyzing neighbor weights, effective neighbor count (Neff), label dispersion (Δy), and local radius (Δx) to understand how stable a prediction is and how much confidence it deserves. This two-layer approach gives a more robust picture of model behavior, especially when a prediction is an accidental average or rests on sparse data, and it supports better model debugging and risk assessment.

Key insights

SHAP alone is insufficient for explaining retrieval-based model predictions, which require an "inner explanation" of neighbor influence.

Principles

Prediction is an average over weighted training examples.
SHAP explains query features, not underlying retrieval mechanisms.
Soft kNN weights are exact local label influences.
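
The first and third principles can be made concrete with a minimal sketch. It assumes a soft kNN regressor whose weights are a softmax over negative squared Euclidean distances, scaled by a temperature. The function names, the kernel choice, and the toy data are illustrative assumptions, not the article's implementation.

```python
import numpy as np

def soft_knn_weights(X_train, x_query, temperature=1.0):
    """Softmax weights over negative squared distances (assumed kernel)."""
    d2 = np.sum((X_train - x_query) ** 2, axis=1)
    logits = -d2 / temperature
    logits = logits - logits.max()  # stabilise the exponentials
    w = np.exp(logits)
    return w / w.sum()

def soft_knn_predict(X_train, y_train, x_query, temperature=1.0):
    """Prediction is the weighted average of the training labels."""
    w = soft_knn_weights(X_train, x_query, temperature)
    return float(w @ y_train), w

# Toy data (hypothetical): two close neighbors and one far point.
X_train = np.array([[0.0], [1.0], [10.0]])
y_train = np.array([0.0, 1.0, 5.0])
x_query = np.array([0.5])

pred, weights = soft_knn_predict(X_train, y_train, x_query)

# Changing one label by 1.0 shifts the prediction by exactly that
# neighbor's weight, because the prediction is linear in the labels.
y_bumped = y_train.copy()
y_bumped[1] += 1.0
pred_bumped, _ = soft_knn_predict(X_train, y_bumped, x_query)
label_influence = pred_bumped - pred
```

In this sketch each weight is the sensitivity of the prediction to that neighbor's label, which is the sense in which the weights act as exact local label influences. SHAP on the query features would not expose these per-example quantities.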

Method

The proposed explanation workflow starts with the inner explanation (neighbor weights, Neff, Δy, Δx), then uses SHAP as an outer feature summary and correlates the SHAP values with neighborhood-formation scores.

In practice

Use Neff to gauge effective neighbor count.
Monitor Δy for label disagreement in neighborhoods.
Use Δx to assess whether the query is supported by nearby training data.
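
The three diagnostics can be sketched as follows. The summary does not state the article's formulas, so the definitions here are assumptions: Neff as the inverse sum of squared weights, Δy as the weighted standard deviation of neighbor labels around the prediction, and Δx as the weighted mean distance from the query to its neighbors. The function name and the toy inputs are hypothetical.

```python
import numpy as np

def neighborhood_diagnostics(weights, y_train, distances):
    """Summarise a weighted neighborhood (assumed definitions, see lead-in)."""
    w = np.asarray(weights, dtype=float)
    y = np.asarray(y_train, dtype=float)
    d = np.asarray(distances, dtype=float)
    w = w / w.sum()  # make sure the weights sum to one

    y_hat = float(w @ y)                            # weighted-average prediction
    n_eff = float(1.0 / np.sum(w ** 2))             # assumed: inverse sum of squared weights
    delta_y = float(np.sqrt(w @ (y - y_hat) ** 2))  # assumed: weighted label std
    delta_x = float(w @ d)                          # assumed: weighted mean distance
    return {"prediction": y_hat, "n_eff": n_eff,
            "delta_y": delta_y, "delta_x": delta_x}

# Hypothetical neighborhood: four equally weighted neighbors whose
# labels split evenly between 0 and 2.
diag = neighborhood_diagnostics(
    weights=[0.25, 0.25, 0.25, 0.25],
    y_train=[0.0, 0.0, 2.0, 2.0],
    distances=[1.0, 1.0, 3.0, 3.0],
)
```

Under these assumed definitions, a low Neff means a handful of training rows dominate the prediction, a large Δy means those rows disagree on the label, and a large Δx means the query sits far from its supporting data.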

Topics

SHAP Feature Attribution
Soft kNN Explanations
Tabular In-Context Learning
Model Interpretability
Prediction Stability

Code references

asudjianto-xml/ICL

Best for: AI Scientist, Machine Learning Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Agus’s Substack.