Cross-Fitting-Free Debiased Machine Learning with Multiway Dependence

Source: stat.ML updates on arXiv.org · Field: Science & Research (Mathematics & Computational Sciences, Research Methodology & Innovation, Machine Learning Theory) · Depth: Expert, extended

Summary

This paper, by Kaicheng Chen and Harold D. Chiang, develops an asymptotic theory for two-step debiased machine learning (DML) estimators in generalized method of moments (GMM) models that does not rely on cross-fitting. Cross-fitting is a common sample-splitting technique, but it can be statistically inefficient and computationally burdensome, especially with complex first-stage learners and multiway clustered dependence. The authors show that valid inference can be achieved by combining Neyman-orthogonal moment conditions with a localization-based empirical process approach, for an arbitrary number of clustering dimensions. The resulting DML-GMM estimators are shown to be asymptotically linear and asymptotically normal under multiway clustered dependence. A key technical contribution is a set of new global and local maximal inequalities for general classes of functions of sums of separately exchangeable arrays; these underpin the theoretical arguments and are of independent interest.

Key takeaway

For AI researchers and research scientists who use DML-GMM with multiway clustered data, this work suggests that cross-fitting is not required for valid inference, and that dropping it can make estimation more efficient and less computationally intensive. Consider the proposed cross-fitting-free DML-GMM framework when nuisance components are high-dimensional or nonparametric and sample splitting is costly. Under the paper's conditions the estimators remain asymptotically linear and normal, and avoiding sample splitting may improve the precision of parameter estimates in finite samples.
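To make the takeaway concrete, here is a minimal sketch of a cross-fitting-free estimate in a partially linear model with two-way clustered toy data. Everything in it is an illustrative assumption rather than the authors' implementation: the data-generating process, the function names (`fit_line`, `simulate`, `dml_no_crossfit`), the use of a least-squares line as a stand-in for a machine learning first stage, and the standard inclusion-exclusion formula for the two-way clustered standard error. It does not check the complexity conditions the paper's theory requires of the first-stage learner.

```python
import math
import random


def fit_line(x, y):
    """Least-squares intercept and slope; a stand-in for an ML first-stage learner."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def simulate(n_i=30, n_j=30, theta=1.5, seed=0):
    """Hypothetical two-way clustered toy data: rows of (i, j, x, d, y)."""
    rng = random.Random(seed)
    # Cluster-level shocks for the treatment residual (v) and outcome error (u).
    av = [rng.gauss(0, 1) for _ in range(n_i)]
    bv = [rng.gauss(0, 1) for _ in range(n_j)]
    au = [rng.gauss(0, 1) for _ in range(n_i)]
    bu = [rng.gauss(0, 1) for _ in range(n_j)]
    rows = []
    for i in range(n_i):
        for j in range(n_j):
            x = rng.gauss(0, 1)
            v = 0.5 * av[i] + 0.5 * bv[j] + rng.gauss(0, 1)
            u = 0.5 * au[i] + 0.5 * bu[j] + rng.gauss(0, 1)
            d = 0.5 * x + v
            y = theta * d + x + u
            rows.append((i, j, x, d, y))
    return rows


def dml_no_crossfit(rows):
    """Full-sample partialling-out estimate with a two-way clustered standard error."""
    x = [r[2] for r in rows]
    d = [r[3] for r in rows]
    y = [r[4] for r in rows]
    # First stage: fit E[D|X] and E[Y|X] on the FULL sample (no folds).
    a_d, b_d = fit_line(x, d)
    a_y, b_y = fit_line(x, y)
    rd = [di - (a_d + b_d * xi) for xi, di in zip(x, d)]
    ry = [yi - (a_y + b_y * xi) for xi, yi in zip(x, y)]
    # Second stage: solve the orthogonal moment sum(rd * (ry - theta * rd)) = 0.
    sdd = sum(r * r for r in rd)
    theta_hat = sum(a * b for a, b in zip(rd, ry)) / sdd
    psi = [a * (b - theta_hat * a) for a, b in zip(rd, ry)]
    # Inclusion-exclusion over the two cluster dimensions (one observation per cell).
    sum_i, sum_j = {}, {}
    for (i, j, *_), p in zip(rows, psi):
        sum_i[i] = sum_i.get(i, 0.0) + p
        sum_j[j] = sum_j.get(j, 0.0) + p
    meat = (sum(s * s for s in sum_i.values())
            + sum(s * s for s in sum_j.values())
            - sum(p * p for p in psi))
    se = math.sqrt(max(meat, 0.0)) / sdd
    return theta_hat, se


theta_hat, se = dml_no_crossfit(simulate())
print(f"theta_hat={theta_hat:.3f}, two-way clustered se={se:.3f}")
```

The design point is that both nuisance fits and the moment equation use every observation once, with no fold bookkeeping. With a flexible learner in place of `fit_line`, whether this remains valid depends on the conditions established in the paper.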

Key insights

Valid DML-GMM inference is possible without cross-fitting, even with multiway clustered data.

Method

The method pairs Neyman-orthogonal moment conditions with a localization-based empirical process argument. New maximal inequalities for separately exchangeable arrays supply the bounds this argument needs, so inference in multiway clustered DML-GMM is valid without sample splitting.
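For readers who want the two ingredients in symbols, the block below states the usual Neyman-orthogonality condition and a full-sample two-step GMM criterion. The notation is assumed for illustration (observation W, target parameter \theta_0, nuisance \eta_0, moment function \psi, weighting matrix \hat\Omega, and K clustering dimensions of sizes N_1, ..., N_K); the paper's exact formulation and regularity conditions may differ.

```latex
% Assumed notation: W is an observation, \theta_0 the target parameter,
% \eta_0 the true nuisance function, \psi the moment (score) function.
\[
\mathbb{E}\big[\psi(W;\theta_0,\eta_0)\big] = 0,
\qquad
\frac{\partial}{\partial r}\,
\mathbb{E}\big[\psi\big(W;\theta_0,\eta_0 + r(\eta-\eta_0)\big)\big]\Big|_{r=0} = 0
\quad \text{for all admissible } \eta .
\]
% Full-sample (no cross-fitting) two-step GMM with K clustering dimensions:
% \hat\eta is estimated once on all observations and plugged into the moments.
\[
\hat g(\theta) = \frac{1}{N_1\cdots N_K}
\sum_{i_1=1}^{N_1}\cdots\sum_{i_K=1}^{N_K}
\psi\big(W_{i_1\ldots i_K};\theta,\hat\eta\big),
\qquad
\hat\theta = \arg\min_{\theta}\ \hat g(\theta)^{\top}\,\hat\Omega\,\hat g(\theta).
\]
```

The first display says that small errors in the nuisance estimate have no first-order effect on the moment condition. The second shows that the same sample is used for both the nuisance estimate and the moment average, which is the step cross-fitting would otherwise split across folds.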

Best for: AI Researcher, AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.