Density-Informed Pseudo-Counts for Calibrated Evidential Deep Learning

· Source: stat.ML updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Mathematics & Computational Sciences · Depth: Expert, medium

Summary

Evidential Deep Learning (EDL) is a popular framework for uncertainty-aware classification that models predictive uncertainty with Dirichlet distributions whose parameters are output by a neural network. This work gives EDL a statistical interpretation, proving that EDL training corresponds to amortized variational inference in a hierarchical Bayesian model. A key finding is that standard EDL conflates epistemic and aleatoric uncertainty, which leads to systematic overconfidence on out-of-distribution (OOD) inputs. To mitigate this, the paper introduces Density-Informed Pseudo-count EDL (DIP-EDL), a new parametrization that decouples class prediction from uncertainty magnitude. DIP-EDL does so by separately estimating the conditional label distribution and the marginal covariate density, preserving evidence in high-density regions while shrinking predictions toward a uniform prior for OOD data. Theoretically, DIP-EDL achieves asymptotic concentration; empirically, it improves interpretability, robustness, and uncertainty calibration under distributional shift.

Key takeaway

For machine learning engineers building uncertainty-aware classifiers, the central point is that standard EDL is systematically overconfident on out-of-distribution data. Density-Informed Pseudo-count EDL (DIP-EDL) is worth considering because it decouples class prediction from uncertainty magnitude. The paper reports that this improves calibration, yielding more reliable uncertainty estimates and greater robustness on novel or shifted data distributions.

Key insights

Standard Evidential Deep Learning conflates epistemic and aleatoric uncertainty, causing systematic overconfidence on out-of-distribution inputs.
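To make the overconfidence concrete, here is a minimal sketch of the Dirichlet parametrization commonly used in EDL, where non-negative evidence is added to a uniform prior and the "vacuity" K / sum(alpha) serves as the uncertainty score. The softplus activation, the function name `edl_dirichlet`, and the example logits are assumptions chosen for illustration, not details taken from the paper.

```python
import numpy as np

def softplus(z):
    # Numerically stable softplus: log(1 + exp(z)).
    return np.logaddexp(0.0, z)

def edl_dirichlet(logits):
    """Map logits to Dirichlet parameters (common EDL recipe, assumed here)."""
    evidence = softplus(np.asarray(logits, dtype=float))  # non-negative evidence
    alpha = evidence + 1.0                                # add uniform prior Dir(1, ..., 1)
    strength = alpha.sum(axis=-1, keepdims=True)          # total pseudo-count
    probs = alpha / strength                              # Dirichlet mean = predicted class probabilities
    vacuity = alpha.shape[-1] / strength.squeeze(-1)      # K / sum(alpha); 1.0 means "no evidence"
    return alpha, probs, vacuity

# Hypothetical logits a network might emit for an input far from the training data.
# Nothing in this parametrization ties the evidence magnitude to how typical the
# input is, so a large logit yields low vacuity regardless of where x lies.
logits_ood = np.array([12.0, -3.0, -3.0])
alpha, probs, vacuity = edl_dirichlet(logits_ood)
print("alpha:", alpha, "probs:", probs, "vacuity:", vacuity)
```

Because the same network output controls both which class is predicted and how much evidence is claimed, a confident logit on an unfamiliar input translates directly into a confident Dirichlet.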

Principles

Method

DIP-EDL separately estimates the conditional label distribution and the marginal covariate density. This preserves evidence in high-density regions and reduces OOD overconfidence by shrinking predictions toward a uniform prior where density is low.
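The decoupling described above can be sketched as follows. This is an illustrative reading of the summary, not the paper's exact construction: it assumes the Dirichlet parameters take the form alpha_k(x) = 1 + n(x) * p(k | x), with the pseudo-count n(x) set to the training-set size times a kernel density estimate of p(x). The helper names, the Gaussian KDE, the bandwidth, and the fixed class probabilities are all hypothetical choices.

```python
import numpy as np

def gaussian_kde_density(x_query, x_train, bandwidth=0.5):
    """Isotropic Gaussian kernel density estimate of the covariate density p(x)."""
    x_query = np.atleast_2d(x_query)
    d = x_train.shape[1]
    sq_dist = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=-1)
    norm = (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    return np.exp(-0.5 * sq_dist / bandwidth ** 2).mean(axis=1) / norm

def dip_edl_alpha(class_probs, density, n_train):
    """Assumed form: alpha_k(x) = 1 + n(x) * p(k | x), with n(x) = n_train * p_hat(x)."""
    pseudo_count = n_train * density                  # uncertainty magnitude, from density only
    return 1.0 + pseudo_count[:, None] * class_probs  # class direction, from the classifier only

rng = np.random.default_rng(0)
x_train = rng.normal(size=(200, 2))           # synthetic training covariates
queries = np.array([[0.0, 0.0], [8.0, 8.0]])  # in-distribution point, far-away point

# Hypothetical classifier output: equally confident at both query points.
class_probs = np.array([[0.9, 0.05, 0.05], [0.9, 0.05, 0.05]])

density = gaussian_kde_density(queries, x_train)
alpha = dip_edl_alpha(class_probs, density, n_train=len(x_train))
mean = alpha / alpha.sum(axis=1, keepdims=True)  # Dirichlet predictive mean
vacuity = alpha.shape[1] / alpha.sum(axis=1)     # K / sum(alpha)
print("density:", density)
print("predictive mean:", mean)
print("vacuity:", vacuity)
```

In this sketch the classifier is equally confident at both points, yet the far-away query receives almost no pseudo-count, so its predictive mean falls back to roughly uniform while the in-distribution query keeps most of its evidence.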

Best for: Research Scientist, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.