Everywhere Valid Bounds on False Discovery Proportions in Conformal Inference

2026-05-21 · Source: stat.ML updates on arXiv.org · Field: Science & Research — Mathematics & Computational Sciences, Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, extended

Summary

This paper introduces a unified framework for establishing finite-sample, distribution-free upper bounds on the False Discovery Proportion (FDP) in conformal inference. Unlike existing methods, which control only the expected FDP and require the rejection threshold to be fixed in advance, this approach provides high-probability bounds that hold simultaneously across all possible rejection thresholds. Thresholds can therefore be selected post hoc without invalidating the statistical guarantees. The method constructs a high-probability envelope for the empirical distribution function of the "null" conformal p-values by sampling from their joint distribution. It is applied to outlier detection and conformal selection, where synthetic and real-data experiments, including a drug-target interaction task on the DAVIS dataset, show bounds that are tighter and more reliably valid than those of previous approaches.

Key takeaway

For Data Scientists or Research Scientists performing multiple testing with conformal p-values, this framework allows you to adaptively select rejection thresholds after inspecting data, without sacrificing statistical validity. You gain rigorous, high-probability FDP bounds that hold across all thresholds, providing reliable instance-wise error control. This flexibility is crucial for exploratory analysis in areas like drug discovery or outlier detection, where initial results often guide subsequent adjustments.

Key insights

Simultaneous FDP bounds enable flexible, data-driven threshold selection with rigorous statistical guarantees.
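
To make the distinction concrete, the guarantee can be written out as follows. The notation is an assumption introduced here for illustration and is not taken from the paper: p_i are the conformal p-values, H_0 is the set of true nulls, t is the rejection threshold, and delta is the allowed failure probability.

```latex
% FDP at threshold t: fraction of rejections that are true nulls
\mathrm{FDP}(t) = \frac{\lvert \{ i \in \mathcal{H}_0 : p_i \le t \} \rvert}
                       {\max\bigl(1,\ \lvert \{ i : p_i \le t \} \rvert\bigr)}

% Classical FDR control: a bound in expectation, at a threshold rule fixed in advance
\mathrm{FDR} = \mathbb{E}\bigl[\mathrm{FDP}(t)\bigr] \le \alpha

% Simultaneous (everywhere valid) bound: holds for all thresholds at once
\mathbb{P}\Bigl( \forall\, t \in [0,1] :\ \mathrm{FDP}(t) \le \overline{\mathrm{FDP}}(t) \Bigr) \ge 1 - \delta
```

Because the last statement holds for every t on the same event, it also holds at any data-dependent threshold, which is what licenses post hoc selection.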

Principles

FDR control only guarantees FDP is small on average, not for the data at hand.
Post hoc threshold adjustments invalidate traditional FDR control.
Exchangeability allows tractable joint distribution sampling for null conformal p-values.
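
As a minimal sketch of the objects these principles refer to, the snippet below computes standard conformal p-values from calibration and test scores. It assumes larger scores mean "more outlying" and ignores ties; the function name is illustrative and not taken from the paper's code.

```python
import numpy as np

def conformal_pvalues(cal_scores, test_scores):
    """Conformal p-values: p_j = (1 + #{i : cal_i >= test_j}) / (n + 1).

    Assumes larger scores indicate stronger evidence against the null.
    """
    cal = np.sort(np.asarray(cal_scores, dtype=float))
    test = np.asarray(test_scores, dtype=float)
    n = cal.size
    # searchsorted(side="left") counts calibration scores strictly below each test score
    n_geq = n - np.searchsorted(cal, test, side="left")
    return (1 + n_geq) / (n + 1)
```

The p-values depend on the scores only through their relative ordering, so any strictly increasing transformation of all scores leaves them unchanged. This is why, under exchangeability, the joint distribution of the null p-values can be simulated without knowing the score distribution.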

Method

Construct a high-probability envelope for the empirical CDF of the "null" conformal p-values by sampling from their joint distribution. The shape of the envelope is modulated by a summary statistic such as Truncated Higher Criticism.
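
The construction can be illustrated with a simplified Monte Carlo sketch. This is not the authors' implementation. It assumes continuous scores, treats all m test points as nulls (a conservative worst case, since the nulls are a subset), and uses a generic Higher Criticism style statistic restricted to a range [t_lo, t_hi], which may differ from the paper's exact definition. All function and parameter names are hypothetical.

```python
import numpy as np

def null_rank_counts(n_cal, m, rng):
    """One joint draw of V(k) = #{null p-values <= k/(n_cal+1)}, k = 1..n_cal+1.

    Only ranks matter under exchangeability, so i.i.d. uniform scores suffice.
    """
    cal = np.sort(rng.random(n_cal))
    test = rng.random(m)
    # integer ranks r with p-value r / (n_cal + 1)
    ranks = 1 + n_cal - np.searchsorted(cal, test, side="left")
    return np.cumsum(np.bincount(ranks, minlength=n_cal + 2)[1:])

def null_count_envelope(n_cal, m, delta=0.1, t_lo=0.01, t_hi=0.5,
                        n_sim=2000, seed=0):
    """Upper envelope for the number of null p-values <= t, over all grid t."""
    rng = np.random.default_rng(seed)
    grid = np.arange(1, n_cal + 2) / (n_cal + 1)
    mask = (grid >= t_lo) & (grid <= t_hi)  # region where the shape is tuned
    g = grid[mask]
    scale = np.sqrt(m * g * (1 - g))

    # simulate the standardized supremum statistic under the null
    stats = np.empty(n_sim)
    for b in range(n_sim):
        counts = null_rank_counts(n_cal, m, rng)
        stats[b] = np.max((counts[mask] - m * g) / scale)

    # conservative Monte Carlo quantile (order statistic of rank ceil((1-delta)(B+1)))
    k = int(np.ceil((1 - delta) * (n_sim + 1)))
    q = np.sort(stats)[k - 1] if k <= n_sim else np.inf

    bound = np.full(grid.shape, float(m))  # trivial bound outside the tuned region
    bound[mask] = np.minimum(m, m * g + q * scale)
    # V(t) is nondecreasing, so a bound at any s >= t is also a bound at t
    bound = np.minimum.accumulate(bound[::-1])[::-1]
    return grid, np.floor(bound)  # counts are integers, so flooring is safe
```

Narrowing [t_lo, t_hi] concentrates the envelope's tightness on the thresholds of primary interest, at the cost of a trivial bound elsewhere. For example, `null_count_envelope(n_cal=200, m=100)` returns a grid of 201 thresholds together with a nondecreasing bound that ends at 100.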

In practice

Modulate envelope shape for tighter bounds in regions of primary interest.
Apply to outlier detection and conformal selection problems.
Code is available for reproducing numerical experiments.
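
Once an everywhere-valid upper bound on the number of null p-values below each threshold is available, post hoc threshold selection reduces to a few lines. The sketch below is illustrative only: `v_bar` stands for any such bound evaluated on `grid`, and both function names are hypothetical rather than part of the paper's code.

```python
import numpy as np

def fdp_bound_curve(pvals, grid, v_bar):
    """FDP upper bound at every threshold in grid.

    v_bar[k] must upper-bound the number of null p-values <= grid[k],
    simultaneously over k, with high probability.
    """
    pvals = np.sort(np.asarray(pvals, dtype=float))
    grid = np.asarray(grid, dtype=float)
    v_bar = np.asarray(v_bar, dtype=float)
    rejections = np.searchsorted(pvals, grid, side="right")  # R(t) = #{p <= t}
    fdp = np.minimum(1.0, v_bar / np.maximum(rejections, 1))
    fdp[rejections == 0] = 0.0  # no rejections means no false discoveries
    return rejections, fdp

def largest_threshold(grid, rejections, fdp, target):
    """Largest threshold with at least one rejection and FDP bound <= target."""
    ok = (np.asarray(fdp) <= target) & (np.asarray(rejections) > 0)
    if not ok.any():
        return None
    return float(np.asarray(grid)[np.nonzero(ok)[0][-1]])
```

Because the bound holds at all thresholds on the same event, the target can be revised after looking at the curve and the guarantee still applies to the threshold finally chosen.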

Topics

Conformal Inference
False Discovery Proportion
Multiple Testing
Outlier Detection
Conformal Selection
High-Probability Bounds
Empirical CDF

Code references

sza919/everywhere-valid-fdp-bounds-in-conformal-inference

Best for: AI Scientist, Research Scientist, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.