Benchmarking Counterfactual Prediction in Epidemic Time Series with Time-Varying Interventions
Summary
EpiCF-Bench is a new large-scale benchmark for counterfactual prediction in epidemic time series under dynamic interventions. It addresses the lack of realistic benchmarks by using a calibrated agent-based model (ABM) that integrates real-world demographic, mobility, epidemiological, and policy data. The benchmark generates factual and counterfactual COVID-19 trajectories for 158 U.S. counties with populations between 20,000 and 200,000, covering a 168-day period from October 26, 2020, to April 11, 2021. It supports both static and time-varying treatments, as well as single-policy and multi-policy intervention scenarios. Evaluation on EpiCF-Bench revealed substantial performance differences among causal inference methods, highlighting how difficult realistic time-series causal reasoning remains. The underlying ABM demonstrated high fidelity to ground truth, achieving a mean normalized root-mean-square error (NRMSE) of 0.1887.
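To make the fidelity figure concrete, here is a minimal sketch of how a mean NRMSE across counties could be computed. The summary does not state which normalization the authors use, so dividing the RMSE by the mean of the observed series is an assumption, and the function names `nrmse` and `mean_nrmse` are hypothetical.

```python
import math


def nrmse(observed, simulated):
    """RMSE divided by the mean of the observed series.

    Assumption: mean normalization. Other conventions divide by the
    range or the standard deviation of the observed series instead.
    """
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mse = sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed)
    mean_obs = sum(observed) / len(observed)
    if mean_obs == 0:
        raise ValueError("mean of observed series is zero; NRMSE is undefined")
    return math.sqrt(mse) / mean_obs


def mean_nrmse(per_county):
    """Average NRMSE over counties.

    per_county maps a county id to an (observed, simulated) pair of series.
    """
    scores = [nrmse(obs, sim) for obs, sim in per_county.values()]
    return sum(scores) / len(scores)
```

A lower value means the simulated trajectories track the observed ones more closely, relative to the typical magnitude of the observed series.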
Key takeaway
Machine learning engineers developing causal inference models for dynamic systems should integrate realistic benchmarks like EpiCF-Bench into their evaluation pipelines. The benchmark provides ground-truth counterfactuals for epidemic time series, enabling rigorous assessment of model performance under complex, time-varying interventions. Use it to check whether a model combines predictive accuracy with genuine causal inference capability, especially in multi-policy scenarios where interactions between policies are critical.
Key insights
The lack of realistic benchmarks with ground-truth counterfactuals constrains progress in time-series causal inference.
Principles
- Realistic benchmarks are crucial for causal inference.
- ABMs can bridge realism and evaluability gaps.
- Policy interventions can have delayed or negative effects.
Method
EpiCF-Bench uses a differentiable agent-based model calibrated with real-world data to simulate factual and counterfactual epidemic trajectories by modifying policy variables through network freezing mechanisms.
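The core idea of holding the simulator fixed and changing only the policy schedule can be illustrated with a toy example. The sketch below is not the benchmark's differentiable ABM and does not model its network freezing mechanism; it swaps in a deterministic discrete-time SIR model, and every name and parameter value (`simulate`, `beta`, `gamma`, `policy_effect`, the day-60 and day-30 switch points) is a hypothetical choice made for illustration.

```python
def simulate(policy, population=100_000, initial_infected=10,
             beta=0.3, gamma=0.1, policy_effect=0.5):
    """Toy discrete-time SIR model with a time-varying policy.

    policy[t] in [0, 1] is the policy strength on day t; it scales the
    transmission rate down by policy_effect * policy[t].
    Returns the number of new infections per day.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    new_cases = []
    for strength in policy:
        beta_t = beta * (1.0 - policy_effect * strength)
        infections = beta_t * s * i / population
        recoveries = gamma * i
        s -= infections
        i += infections - recoveries
        new_cases.append(infections)
    return new_cases


days = 168  # same horizon as the benchmark; everything else is made up

# Factual schedule: policy switched on at day 60.
factual_policy = [0.0] * 60 + [1.0] * (days - 60)
# Counterfactual schedule: the same policy, switched on 30 days earlier.
counterfactual_policy = [0.0] * 30 + [1.0] * (days - 30)

factual = simulate(factual_policy)
counterfactual = simulate(counterfactual_policy)

# Day-by-day effect of the alternative schedule on new infections.
effect = [cf - f for cf, f in zip(counterfactual, factual)]
```

Because the simulator and its parameters are identical in both runs, the two trajectories coincide until the schedules diverge, and any later difference is attributable to the policy change alone. That property is what makes simulated counterfactuals usable as ground truth.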
In practice
- Use EpiCF-Bench to evaluate time-series causal models.
- Explore single- and multi-policy intervention effects.
- Analyze "flattening the curve" trade-offs in simulations.
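As a rough sketch of the evaluation steps listed above, the snippet below scores two hypothetical models against a ground-truth counterfactual trajectory and summarizes curve shape to expose "flattening the curve" trade-offs. It does not use the benchmark's actual data format or API, which this summary does not describe; all trajectories are made-up numbers and the helper names `rmse` and `curve_summary` are hypothetical.

```python
import math


def rmse(truth, predicted):
    """Root-mean-square error between two equal-length series."""
    if len(truth) != len(predicted) or not truth:
        raise ValueError("series must be non-empty and of equal length")
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(truth, predicted)) / len(truth)
    )


def curve_summary(trajectory):
    """Peak height, day of the (first) peak, and cumulative cases."""
    peak = max(trajectory)
    return {
        "peak": peak,
        "peak_day": trajectory.index(peak),
        "total": sum(trajectory),
    }


# Made-up daily case counts, for illustration only.
factual = [2, 10, 30, 20, 8, 4, 2]     # observed under the actual policy
truth_cf = [5, 9, 14, 18, 15, 10, 6]   # ground-truth counterfactual
predictions = {
    "model_a": [5, 8, 13, 17, 16, 11, 6],
    "model_b": [5, 10, 20, 30, 22, 12, 6],
}

# Rank models by error against the ground-truth counterfactual.
scores = {name: rmse(truth_cf, pred) for name, pred in predictions.items()}
best = min(scores, key=scores.get)

# Compare curve shape: a flattened curve has a lower, later peak,
# even when cumulative cases are similar.
factual_shape = curve_summary(factual)
cf_shape = curve_summary(truth_cf)
```

In this toy data the counterfactual curve peaks lower and one day later than the factual curve while total cases are nearly unchanged, which is the kind of trade-off a pointwise error metric alone would not reveal.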
Topics
- Counterfactual Prediction
- Epidemic Time Series
- Causal Inference Benchmarks
- Agent-Based Models
- Dynamic Interventions
- COVID-19 Policies
Best for: AI Scientist, Machine Learning Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.