Optimal Experiments for Partial Causal Effect Identification
Summary
This research introduces the "max-potency problem" for selecting cost-constrained experiments that maximally tighten bounds on partially identifiable causal queries from observational data. The problem is formalized and proven NP-hard via a reduction from 0-1 knapsack. The authors develop a general procedure for evaluating "epistemic potency" in discrete settings, building on the polynomial-programming framework of Duarte et al. (2023). To manage the super-exponential search space, two graphical pruning criteria are introduced: a novel linear-time path-interception rule exploiting district structure to certify zero potency, and an identifiability check based on the ID algorithm. These criteria collectively prune 50–88% of candidate experiments on average in experiments on Erdős–Rényi random graphs and 11 bnlearn benchmark networks. The ID-pruned experiments are shown to be combinatorially inert, leading to a super-exponential reduction in the number of subsets evaluated. An end-to-end demonstration is provided using observational NHANES data to select optimal experiments for estimating the effect of physical activity on diabetes.
Key takeaway
For AI Scientists and Research Scientists designing causal experiments under budget constraints, understanding the NP-hard nature of the max-potency problem is crucial. You should integrate the proposed graphical pruning criteria, specifically the path-interception rule and ID algorithm-based checks, into your experimental design workflow. This approach can drastically reduce the computational burden by identifying and eliminating useless experiments ex ante, allowing you to focus resources on experiments with genuine potential to tighten causal bounds.
Key insights
Optimal experimental design for partially identifiable causal queries is NP-hard but can be significantly pruned using graphical criteria.
Principles
- Experiments can only tighten causal bounds, never widen them.
- Zero potency implies an experiment does not change existing bounds.
- Potency is additive across disjoint causal cells.
Method
The method involves formalizing the max-potency problem, evaluating epistemic potency via polynomial programming, and pruning useless experiments using a path-interception rule and an ID algorithm-based identifiability check.
In practice
- Use graphical pruning to reduce the search space for optimal experiments.
- Apply the path-interception rule for linear-time zero-potency certification.
- Leverage ID-pruned experiments to reduce subset search complexity.
Topics
- Causal Effect Identification
- Optimal Experimental Design
- Partial Causal Effects
- Epistemic Potency
- Polynomial Programming
Best for: AI Scientist, Research Scientist
Related on AIssential
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.