Path-Sampled Integrated Gradients

2026-04-17 · Source: stat.ML updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, long

Summary

Path-Sampled Integrated Gradients (PS-IG) is a framework for feature attribution in deep neural networks that addresses two limitations of standard Integrated Gradients (IG): its dependence on the choice of baseline and its susceptibility to gradient noise. Rather than relying on a single reference point or on baselines sampled globally from the dataset, PS-IG computes the expected attribution over baselines sampled along a linear interpolation path. The framework is shown to be mathematically equivalent to Path-Weighted Integrated Gradients (PWIG) when the weighting function equals the cumulative distribution function (CDF) of the sampling density. This equivalence allows the expectation to be evaluated with a deterministic Riemann sum, improving the error convergence rate from O(m^(-1/2)) to O(m^(-1)) for smooth models. PS-IG also acts as a variance-reducing filter against gradient noise, strictly lowering attribution variance by a factor of 1/3 under uniform sampling, while preserving key axiomatic properties such as linearity and implementation invariance.
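
The equivalence between sampling baselines along the path and weighting the path integral by the CDF can be sketched in a few lines. The notation here is an assumption made for illustration and is not taken from the original article: x is the input, x' is the reference endpoint of the path, t is drawn from a density p on [0, 1] with CDF F, and b_t is the sampled baseline.

```latex
\begin{aligned}
b_t &= x' + t\,(x - x'), \qquad t \sim p \ \text{on } [0,1],\\[4pt]
\mathrm{IG}_i(x;\, b_t)
 &= (x_i - b_{t,i}) \int_0^1 \partial_i f\bigl(b_t + s\,(x - b_t)\bigr)\,ds\\
 &= (x_i - x'_i) \int_t^1 \partial_i f\bigl(x' + \alpha\,(x - x')\bigr)\,d\alpha
 \qquad (\alpha = t + s\,(1 - t)),\\[4pt]
\mathbb{E}_{t \sim p}\bigl[\mathrm{IG}_i(x;\, b_t)\bigr]
 &= (x_i - x'_i) \int_0^1 F(\alpha)\,\partial_i f\bigl(x' + \alpha\,(x - x')\bigr)\,d\alpha,
 \qquad F(\alpha) = \int_0^{\alpha} p(t)\,dt.
\end{aligned}
```

The last step swaps the order of integration: a path point α contributes only when the sampled baseline lies before it, which happens with probability F(α). Under uniform sampling, F(α) = α.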

Key takeaway

For research scientists developing or applying interpretability methods, PS-IG offers a robust, computationally efficient alternative to standard Integrated Gradients. Consider adopting it where baseline sensitivity or gradient noise is a concern: its deterministic implementation converges faster and reduces attribution variance without additional computational overhead, which can yield more reliable explanations of deep learning model predictions.

Key insights

PS-IG unifies stochastic baseline sampling with deterministic path weighting for robust, efficient feature attribution.

Principles

Stochastic sampling can be deterministically weighted.
Weighting functions reduce gradient noise variance.
Completeness axiom adapts for stochastic baselines.
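
As a hedged illustration of the third principle, the adapted completeness statement can be worked out from the weighted path-integral form. The symbols are assumptions for this sketch rather than the article's notation: γ(α) = x' + α(x − x') is the linear path from a reference x' to the input x, p is the sampling density on [0, 1] with CDF F (so F(0) = 0 and F(1) = 1), and PSIG_i denotes the attribution to feature i.

```latex
\begin{aligned}
\sum_i \mathrm{PSIG}_i(x)
 &= \int_0^1 F(\alpha)\,\sum_i (x_i - x'_i)\,\partial_i f\bigl(\gamma(\alpha)\bigr)\,d\alpha
  = \int_0^1 F(\alpha)\,\frac{d}{d\alpha} f\bigl(\gamma(\alpha)\bigr)\,d\alpha\\
 &= \Bigl[F(\alpha)\,f\bigl(\gamma(\alpha)\bigr)\Bigr]_0^1
    - \int_0^1 p(\alpha)\,f\bigl(\gamma(\alpha)\bigr)\,d\alpha\\
 &= f(x) - \mathbb{E}_{t \sim p}\bigl[f(\gamma(t))\bigr].
\end{aligned}
```

Under these assumptions the attributions sum to the gap between the model output at the input and the expected output over the sampled baselines, instead of the gap to a single fixed baseline as in standard IG.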

Method

PS-IG computes expected IG over baselines sampled along a linear path, then converts this expectation into a deterministic, weighted path integral using the sampling density's CDF for efficient calculation.
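
The method described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy model `f`, its analytic gradient `grad_f`, and the function names `ps_ig_deterministic` and `ps_ig_monte_carlo` are all hypothetical, and the path is assumed to run linearly from a reference point `x_ref` to the input `x`.

```python
import math
import random


def f(x):
    # Hypothetical smooth toy model (not from the source): x0^2 * x1 + sin(x2).
    return x[0] ** 2 * x[1] + math.sin(x[2])


def grad_f(x):
    # Analytic gradient of the toy model above.
    return [2.0 * x[0] * x[1], x[0] ** 2, math.cos(x[2])]


def ps_ig_deterministic(x, x_ref, cdf, m=256):
    """Midpoint Riemann sum of (x - x_ref) * F(alpha) * grad f along the path."""
    d = len(x)
    attr = [0.0] * d
    for k in range(m):
        alpha = (k + 0.5) / m  # midpoint of the k-th sub-interval of [0, 1]
        point = [x_ref[i] + alpha * (x[i] - x_ref[i]) for i in range(d)]
        g = grad_f(point)
        w = cdf(alpha)  # weight = CDF of the baseline sampling density
        for i in range(d):
            attr[i] += w * g[i] * (x[i] - x_ref[i]) / m
    return attr


def ps_ig_monte_carlo(x, x_ref, sample_t, n=20000, seed=0):
    """Stochastic estimate: draw a baseline on the path, then one IG sample from it."""
    rng = random.Random(seed)
    d = len(x)
    attr = [0.0] * d
    for _ in range(n):
        t = sample_t(rng)  # position of the sampled baseline on the path
        base = [x_ref[i] + t * (x[i] - x_ref[i]) for i in range(d)]
        s = rng.random()  # one uniform sample of the IG integral from that baseline
        point = [base[i] + s * (x[i] - base[i]) for i in range(d)]
        g = grad_f(point)
        for i in range(d):
            attr[i] += g[i] * (x[i] - base[i]) / n
    return attr


if __name__ == "__main__":
    x, x_ref = [1.0, 2.0, 0.5], [0.0, 0.0, 0.0]
    # Uniform sampling density on [0, 1] has CDF F(alpha) = alpha.
    print(ps_ig_deterministic(x, x_ref, cdf=lambda a: a))
    print(ps_ig_monte_carlo(x, x_ref, sample_t=lambda rng: rng.random()))
```

Both functions target the same quantity. The deterministic version needs only the CDF of the sampling density and one gradient evaluation per step, which is the same per-step cost as a standard IG Riemann sum.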

In practice

Use PS-IG for more stable attribution maps.
Apply deterministic Riemann sum for faster convergence.
Consider uniform sampling for 1/3 variance reduction.
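
One way to see where a 1/3 factor can come from is the following sketch. It assumes independent, zero-mean noise of equal variance on every gradient evaluation along the path, which is a simplifying assumption of this example and not a statement of the article's noise model. Under that assumption, weighting the m-point Riemann sum by the uniform CDF F(α) = α scales the noise variance by the mean of α², which approaches 1/3. The function names are hypothetical.

```python
import random
import statistics


def variance_ratio_closed_form(m, cdf=lambda a: a):
    """Var[weighted sum] / Var[unweighted sum] for i.i.d. noise on m midpoints."""
    alphas = [(k + 0.5) / m for k in range(m)]
    # Each noise term is scaled by cdf(alpha), so variances scale by cdf(alpha)^2.
    return sum(cdf(a) ** 2 for a in alphas) / m


def variance_ratio_simulated(m=64, trials=4000, sigma=1.0, seed=0):
    """Empirical version of the same ratio using simulated Gaussian gradient noise."""
    rng = random.Random(seed)
    alphas = [(k + 0.5) / m for k in range(m)]
    plain, weighted = [], []
    for _ in range(trials):
        noise = [rng.gauss(0.0, sigma) for _ in range(m)]
        plain.append(sum(noise) / m)  # noise in an unweighted (IG-style) sum
        weighted.append(sum(a * e for a, e in zip(alphas, noise)) / m)  # CDF-weighted
    return statistics.pvariance(weighted) / statistics.pvariance(plain)


if __name__ == "__main__":
    print(variance_ratio_closed_form(64))
    print(variance_ratio_simulated())
```

Passing a different `cdf` to `variance_ratio_closed_form` gives the corresponding factor for a non-uniform sampling density under the same noise assumption.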

Topics

Path-Sampled Integrated Gradients
Feature Attribution
Integrated Gradients
Variance Reduction
Computational Efficiency

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.