Epistemic Uncertainty Is Not the Reducible Kind

2026-06-12 · Source: stat.ML updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Mathematics & Computational Sciences · Depth: Expert, extended

Summary

Robin Young (University of Cambridge) shows that the standard taxonomy of predictive uncertainty is inconsistent in how it treats epistemic uncertainty. The paper proves that the common definition (epistemic uncertainty is the part that more data can reduce) and its usual measure (the mutual-information term) are extensionally inconsistent: in an explicit construction, the measure assigns all uncertainty to the epistemic class, yet no amount of training data reduces it. The paper therefore proposes a trichotomy of aleatoric, sample-reducible epistemic, and mechanism-reducible epistemic uncertainty, where the last can be reduced only by changing how data is acquired. It also shows that in-distribution data never reduces mechanism-irreducible uncertainty and can increase it, and that ensemble disagreement, a widely used epistemic estimate, tracks the training procedure rather than the true epistemic term: it either collapses to zero or reflects initialization noise. Experiments confirm these theoretical findings with high statistical significance.
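To make the inconsistency concrete, here is a minimal sketch of the standard entropy decomposition, in which the epistemic part is total predictive entropy minus expected per-hypothesis entropy (the mutual-information term). The toy problem is an assumption made for illustration and is not the paper's construction: a parameter theta in {-1, +1}, a noise-free rule y = theta * x, training inputs fixed at x = 0, and a query at x = 1. The helper names `entropy`, `decompose`, and `update` are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy in nats of a discrete distribution given as a list."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def decompose(posterior, likelihoods):
    """Split total predictive entropy into aleatoric and epistemic parts.

    posterior:   weights over hypotheses theta (sums to 1)
    likelihoods: likelihoods[k] is p(y | x, theta_k) as a list over outcomes y
    """
    n_out = len(likelihoods[0])
    # Posterior predictive: average the per-hypothesis predictions.
    predictive = [sum(w * lik[j] for w, lik in zip(posterior, likelihoods))
                  for j in range(n_out)]
    total = entropy(predictive)
    aleatoric = sum(w * entropy(lik) for w, lik in zip(posterior, likelihoods))
    return total, aleatoric, total - aleatoric

def update(posterior, likelihoods_at_x, y_index):
    """One Bayes update on an observed outcome y at a given input x."""
    unnorm = [w * lik[y_index] for w, lik in zip(posterior, likelihoods_at_x)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# Assumed toy: outcomes are indexed as [y = -1, y = 0, y = +1].
at_train_x = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]  # x = 0: both thetas give y = 0
at_query_x = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # x = 1: y reveals theta

posterior = [0.5, 0.5]
for _ in range(1000):  # 1000 in-distribution observations, all uninformative
    posterior = update(posterior, at_train_x, y_index=1)

total, aleatoric, epistemic = decompose(posterior, at_query_x)

# Changing the acquisition mechanism: a single observation at x = 1.
resolved = update(posterior, at_query_x, y_index=2)
total_after, _, epistemic_after = decompose(resolved, at_query_x)
print(total, aleatoric, epistemic, epistemic_after)
```

In this toy, the aleatoric part at the query is zero and the epistemic part stays at log 2 however many x = 0 samples arrive, while one observation taken at x = 1 removes it entirely. That is the sense in which reducibility depends on how data is acquired.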

Key takeaway

If you design or deploy uncertainty quantification systems, re-examine your assumptions about epistemic uncertainty. Active learning pipelines that route high-epistemic inputs to additional in-distribution sampling may be ineffective, or even counterproductive, when the uncertainty is mechanism-reducible. Ensemble disagreement can also mislead as a proxy for epistemic uncertainty, because it tracks training dynamics rather than the underlying uncertainty. Consider acquisition strategies beyond i.i.d. sampling for uncertainty that in-distribution data cannot reduce.

Key insights

Standard epistemic uncertainty definitions and measures are inconsistent; reducibility depends on data acquisition methods.

Principles

Reducibility is a property of the pair (uncertainty, acquisition class), not of the uncertainty alone.
In-distribution data can increase mechanism-irreducible epistemic uncertainty.
Ensemble disagreement tracks optimization, not true epistemic uncertainty.
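The point about ensembles can be illustrated with a small sketch of the usual disagreement score: entropy of the mean prediction minus the mean of the member entropies. The function name `ensemble_disagreement` and the member predictions below are made up for illustration; they are not taken from the paper's experiments.

```python
import math

def entropy(p):
    """Shannon entropy in nats of a discrete distribution given as a list."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def ensemble_disagreement(member_probs):
    """Mutual-information-style disagreement of an ensemble at one input.

    member_probs[m] is member m's predicted class distribution.
    Returns H[mean prediction] - mean of H[member prediction].
    """
    m = len(member_probs)
    k = len(member_probs[0])
    mean = [sum(p[j] for p in member_probs) / m for j in range(k)]
    return entropy(mean) - sum(entropy(p) for p in member_probs) / m

# Members that converged to the same function: the score is zero,
# whatever remains unknown about the data-generating process.
collapsed = [[0.5, 0.5]] * 5

# Members that differ only through (hypothetical) initialization noise:
# the score is positive, but it measures the spread of training runs.
jittered = [[0.9, 0.1], [0.6, 0.4], [0.8, 0.2], [0.7, 0.3], [0.5, 0.5]]

d_collapsed = ensemble_disagreement(collapsed)
d_jittered = ensemble_disagreement(jittered)
print(d_collapsed, d_jittered)
```

The score depends only on how much the trained members differ from one another. Nothing in it refers to the posterior over data-generating mechanisms, so a value of zero says the members agree, not that the epistemic term is zero.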

Method

A level-$\alpha$ finite-sample falsification test can reject mechanism-irreducibility by detecting a statistically significant decrease in the epistemic term under exact unidentifiability.
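One simple way such a test could look is sketched below. This is not the paper's test: it swaps in an exact one-sided sign-flip (paired permutation) test. It assumes independent repeated runs and, under the null hypothesis of no decrease, differences that are symmetric about zero. The function names and the numbers are hypothetical.

```python
import itertools

def sign_flip_pvalue(diffs):
    """Exact one-sided sign-flip p-value for 'the epistemic term decreased'.

    diffs[i] = epistemic_before[i] - epistemic_after[i] for independent run i,
    so a positive value means the estimated epistemic term went down.
    Enumerates all 2**n sign patterns, so keep n small (about 16 or fewer).
    """
    observed = sum(diffs)
    count = 0
    total = 0
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        total += 1
        if sum(s * d for s, d in zip(signs, diffs)) >= observed:
            count += 1
    return count / total

def rejects_irreducibility(diffs, alpha=0.05):
    """Reject 'no decrease under this acquisition mechanism' at level alpha."""
    return sign_flip_pvalue(diffs) <= alpha

# Made-up paired differences from 10 runs under two acquisition mechanisms.
decreased = [0.05, 0.08, 0.03, 0.06, 0.07, 0.04, 0.09, 0.05, 0.06, 0.02]
flat = [0.01, -0.02, 0.015, -0.01, 0.005, -0.015, 0.02, -0.005, 0.0, -0.01]

print(rejects_irreducibility(decreased), rejects_irreducibility(flat))
```

Because the p-value comes from full enumeration rather than an asymptotic approximation, the level-$\alpha$ guarantee holds at finite sample size under the stated symmetry assumption. With n runs, the smallest attainable p-value is 1/2^n, so at least 5 runs are needed to reject at alpha = 0.05.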

In practice

Re-evaluate active learning strategies that assume i.i.d. data reduces all epistemic uncertainty.
Do not rely on ensemble disagreement as a direct measure of epistemic uncertainty.
Consider off-support data acquisition for uncertainty that in-distribution sampling cannot reduce.

Topics

Uncertainty Quantification
Epistemic Uncertainty
Aleatoric Uncertainty
Active Learning
Ensemble Methods
Data Acquisition
Model Calibration

Best for: Research Scientist, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.