Prediction-Powered Causal Inference by Automatic Debiased Machine Learning and Semi-Supervised Riesz Regression
Summary
This study introduces Prediction-Powered Causal Inference (PPCI), a framework for semiparametric efficient estimation of causal and structural parameters in a semi-supervised setting. PPCI uses unlabeled auxiliary regressors in addition to labeled observations to achieve smaller asymptotic variances than methods that rely on labeled data alone. The paper first derives the efficient influence function and the efficiency bound, which quantify the variance reduction attainable from the auxiliary regressors. It then proposes DML-PPCI methods, which combine the efficient influence function with the debiased machine learning (DML) framework. Two estimators are detailed, EE-DML-PPCI (estimating-equation) and TMLE-DML-PPCI (targeted-learning), and both attain the derived efficiency bound. A key step is estimating the efficient influence function, which is a Neyman orthogonal score; for this purpose the study develops semi-supervised generalized Riesz regression with convergence rate guarantees.
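For orientation, the textbook automatic-DML setup behind terms such as "Riesz representer" and "Neyman orthogonal score" can be written as below. This is the familiar fully labeled form. The notation (W for regressors, Y for the outcome, m for the functional, g_0 and α_0 for the nuisance functions) is chosen here for illustration, and the paper's semi-supervised efficient influence function may differ in detail.

```latex
\begin{align*}
\theta_0 &= \mathbb{E}\bigl[m(W; g_0)\bigr], \qquad g_0(W) = \mathbb{E}[Y \mid W], \\
\mathbb{E}\bigl[m(W; g)\bigr] &= \mathbb{E}\bigl[\alpha_0(W)\, g(W)\bigr]
  \quad \text{for all square-integrable } g, \\
\psi(W, Y; \theta, g, \alpha) &= m(W; g) - \theta + \alpha(W)\,\bigl(Y - g(W)\bigr).
\end{align*}
```

The second line defines the Riesz representer α_0. The third line is the orthogonal score, whose first term depends on the regressors only and can therefore, in principle, be averaged over unlabeled data as well.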
Key takeaway
Data scientists and econometricians working on causal inference should consider Prediction-Powered Causal Inference (PPCI) when unlabeled regressors are available alongside a labeled sample. Incorporating the unlabeled auxiliary regressors lowers the asymptotic variance of causal and structural parameter estimates. The DML-PPCI estimators, EE-DML-PPCI and TMLE-DML-PPCI, attain the efficiency bound derived in the paper, which is most valuable when labeled data are limited.
Key insights
Prediction-Powered Causal Inference (PPCI) uses unlabeled data to significantly reduce asymptotic variance in causal and structural parameter estimation.
Principles
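- Unlabeled auxiliary regressors reduce asymptotic variance relative to labeled-only estimation.
- The efficient influence function determines the efficiency bound and serves as a Neyman orthogonal score.
- Combining that score with the DML framework yields estimators that match the bound.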
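The first principle can be illustrated with a toy simulation. The sketch below is not the paper's estimator: it estimates a simple mean E[Y] rather than a causal parameter, and the data-generating process, sample sizes, and the name `simulate` are assumptions made for illustration. It compares the labeled-only sample mean with a semi-supervised estimator that averages predictions over all regressors and adds a residual correction from the labeled sample.

```python
import numpy as np

def simulate(n_labeled=200, n_unlabeled=5000, n_reps=500, seed=0):
    """Monte Carlo variance of a labeled-only mean vs. a semi-supervised mean of Y."""
    rng = np.random.default_rng(seed)
    labeled_only, semi_supervised = [], []
    for _ in range(n_reps):
        # Labeled sample: regressor X and outcome Y = 2X + noise (true mean is 0).
        x_lab = rng.normal(size=n_labeled)
        y_lab = 2.0 * x_lab + rng.normal(size=n_labeled)
        # Unlabeled sample: regressor only, no outcome.
        x_unl = rng.normal(size=n_unlabeled)

        # Fit g(x) = a + b * x by least squares on the labeled sample.
        b, a = np.polyfit(x_lab, y_lab, deg=1)
        x_all = np.concatenate([x_lab, x_unl])

        labeled_only.append(y_lab.mean())
        # Predictions averaged over all regressors, plus a labeled-sample
        # residual correction that guards against a misspecified g.
        prediction_part = (a + b * x_all).mean()
        residual_part = (y_lab - (a + b * x_lab)).mean()
        semi_supervised.append(prediction_part + residual_part)
    return np.var(labeled_only), np.var(semi_supervised)

var_labeled, var_semi = simulate()
print(f"labeled-only variance:    {var_labeled:.5f}")
print(f"semi-supervised variance: {var_semi:.5f}")
```

In this design the part of the variance explained by the regressor is averaged over the much larger pooled sample, so the semi-supervised estimator should show a visibly smaller variance. The gain shrinks when the regressors explain little of the outcome.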
Method
DML-PPCI combines efficient influence functions with debiased machine learning. It uses semi-supervised generalized Riesz regression to estimate the Riesz representer for Neyman orthogonal scores.
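A minimal sketch of this idea for the average treatment effect follows, under several assumptions that do not come from the source: the unlabeled sample contains the treatment D and covariate X but not the outcome Y, labels are missing completely at random, the Riesz representer is fit with a squared-loss Riesz regression on a small linear basis (the paper's "generalized" version may use other losses and learners), and cross-fitting is omitted for brevity. The names `basis`, `m_of_basis`, `fit_riesz`, and `semi_supervised_ate` are hypothetical.

```python
import numpy as np

def basis(d, x):
    """Hypothetical basis b(d, x) = (1, d, x, d*x) for a scalar covariate x."""
    return np.column_stack([np.ones_like(x), d, x, d * x])

def m_of_basis(x):
    """ATE functional applied to each basis element: b(1, x) - b(0, x)."""
    return basis(np.ones_like(x), x) - basis(np.zeros_like(x), x)

def fit_riesz(d, x, ridge=1e-6):
    """Minimize mean(alpha^2 - 2 * m(alpha)) over alpha(d, x) = b(d, x) @ rho.

    The loss involves no outcomes, so unlabeled regressors can be pooled in.
    For a linear basis the minimizer solves (B'B / n) rho = mean(m(b)).
    """
    B, M = basis(d, x), m_of_basis(x)
    gram = B.T @ B / len(x) + ridge * np.eye(B.shape[1])
    return np.linalg.solve(gram, M.mean(axis=0))

def semi_supervised_ate(d_lab, x_lab, y_lab, d_unl, x_unl):
    """Estimating-equation style ATE estimate (assumed form, no cross-fitting)."""
    d_all = np.concatenate([d_lab, d_unl])
    x_all = np.concatenate([x_lab, x_unl])
    rho = fit_riesz(d_all, x_all)                       # Riesz representer weights
    B_lab = basis(d_lab, x_lab)
    beta, *_ = np.linalg.lstsq(B_lab, y_lab, rcond=None)  # outcome regression g
    plug_in = (m_of_basis(x_all) @ beta).mean()         # averaged over all regressors
    # Debiasing term on labeled data. With least squares on the same basis it is
    # numerically zero; it matters when g is fit by a flexible learner.
    correction = ((B_lab @ rho) * (y_lab - B_lab @ beta)).mean()
    return plug_in + correction, rho

# Toy data (assumed design): randomized treatment, true ATE equal to 1.
rng = np.random.default_rng(0)
n, N = 300, 5000
d_lab = rng.binomial(1, 0.5, size=n).astype(float)
x_lab = rng.normal(size=n)
y_lab = 1.0 * d_lab + x_lab + rng.normal(size=n)
d_unl = rng.binomial(1, 0.5, size=N).astype(float)
x_unl = rng.normal(size=N)

ate_hat, rho_hat = semi_supervised_ate(d_lab, x_lab, y_lab, d_unl, x_unl)
print("ATE estimate:", ate_hat)
print("Riesz weights:", rho_hat)
```

With a treatment probability of 0.5, the true Riesz representer is 4d - 2, so the fitted weights should be close to (-2, 4, 0, 0). A production version would replace the linear basis with flexible learners and add cross-fitting, as is standard in DML.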
In practice
- Apply DML-PPCI to obtain more precise causal effect estimates.
- Utilize unlabeled data to boost estimation efficiency.
- Implement semi-supervised Riesz regression.
Topics
- Causal Inference
- Debiased Machine Learning
- Semi-Supervised Learning
- Riesz Regression
- Asymptotic Variance
- Econometrics
Best for: AI Scientist, Research Scientist, Data Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.