Robust Sequential Experimental Design for A/B Testing
Summary
This paper introduces a robust sequential experimental design framework for A/B testing. It addresses a key limitation of existing designs, which rely on correctly specified models. The proposed approach unifies contextual bandit and dynamic settings and comes with a theoretical guarantee: it bounds the worst-case mean squared error (MSE) of the estimated treatment effect. It has three components: an orthogonalization technique that yields a robust MSE upper bound independent of the specific nonlinear effects; a dynamic programming (DP) algorithm for sequential allocation that optimizes this long-term MSE upper bound; and an extension to dynamic settings with carryover effects, using a hierarchical design that combines DP and reinforcement learning (RL). In numerical experiments on synthetic data and on real-world datasets from a leading technology company, the robust sequential design consistently achieves lower MSE than non-robust, random, switchback, Neyman balanced, and Bayesian biased-coin baselines, particularly in small-sample and misspecified environments.
Key takeaway
Research scientists designing A/B tests in environments prone to model misspecification or finite-sample covariate imbalance should consider this robust sequential experimental design. By combining orthogonalization with dynamic programming, it accounts explicitly for potential nonlinearities and optimizes a long-term objective, which the paper reports leads to more precise Average Treatment Effect (ATE) estimates. The benefit is most relevant in small samples and in dynamic settings with carryover effects.
Key insights
Robust sequential experimental design improves A/B testing efficiency by mitigating model misspecification and finite-sample imbalances.
Principles
- Orthogonalization yields robust MSE bounds.
- Dynamic programming optimizes long-term objectives.
- Hierarchical DP/RL handles carryover effects.
Method
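The method uses orthogonalization to derive an MSE upper bound that does not depend on the specific nonlinear effects, dynamic programming to choose sequential allocations that minimize that bound, and a hierarchical DP/RL scheme for dynamic settings with carryover effects.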
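The summary does not spell out the paper's orthogonalization construction. As a generic illustration of the underlying idea only, the sketch below uses Frisch-Waugh-Lovell partialling-out: residualizing both the outcome and the treatment indicator on the modeled covariates gives the same treatment coefficient as the full regression, so the estimate depends on the treatment only through its component orthogonal to those covariates. The data-generating process, the function names, and the sin(3x) term standing in for an unmodeled nonlinear effect are all assumptions made for this example.

```python
import numpy as np


def partialled_out_effect(y, a, x):
    """Treatment coefficient after residualizing y and a on [1, x] (hypothetical helper)."""
    z = np.column_stack([np.ones(len(y)), x])
    # Residuals after projecting onto the column space of z.
    y_res = y - z @ np.linalg.lstsq(z, y, rcond=None)[0]
    a_res = a - z @ np.linalg.lstsq(z, a, rcond=None)[0]
    return float(a_res @ y_res / (a_res @ a_res))


def full_regression_effect(y, a, x):
    """Treatment coefficient from ordinary least squares of y on [1, a, x]."""
    design = np.column_stack([np.ones(len(y)), a, x])
    coef = np.linalg.lstsq(design, y, rcond=None)[0]
    return float(coef[1])


# Assumed toy data: linear working model plus an unmodeled nonlinear term.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
a = rng.integers(0, 2, size=n).astype(float)
y = 1.0 * a + 0.5 * x + 0.3 * np.sin(3 * x) + rng.normal(scale=0.1, size=n)

tau_fwl = partialled_out_effect(y, a, x)
tau_ols = full_regression_effect(y, a, x)
print(tau_fwl, tau_ols)  # the two estimates agree up to floating-point error
```

In this toy setting, how far the estimate lands from the true effect depends on how the residualized treatment happens to correlate with the unmodeled term, which is the kind of finite-sample imbalance a design can try to control.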
In practice
- Apply orthogonalization to handle unknown nonlinear effects.
- Use DP to optimize A/B test allocation over time.
- Combine DP and RL for complex dynamic experiments.
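To make the DP step concrete, here is a minimal sketch of backward induction on a toy allocation problem. It is not the paper's algorithm. It assumes each arriving unit has a binary covariate x in {-1, +1} with equal probability, that the experimenter sees x before choosing an arm a in {-1, +1}, and that the terminal cost n^2 + d^2 (arm-count imbalance n plus covariate imbalance d) stands in for the MSE upper bound. The name solve_allocation_dp is hypothetical.

```python
from functools import lru_cache


def solve_allocation_dp(horizon):
    """Backward induction for a toy sequential allocation problem.

    State: (t, n, d), where n = sum of a_i (arm-count imbalance) and
    d = sum of a_i * x_i (covariate imbalance) over units seen so far.
    Terminal cost n**2 + d**2 is an assumed proxy, not the paper's bound.
    """

    @lru_cache(maxsize=None)
    def value(t, n, d):
        if t == horizon:
            return float(n * n + d * d)
        expected = 0.0
        for x in (-1, 1):  # covariate of the next unit, each with probability 1/2
            expected += 0.5 * min(
                value(t + 1, n + a, d + a * x) for a in (-1, 1)
            )
        return expected

    def policy(t, n, d, x):
        """Arm (+1 or -1) minimizing expected terminal cost, given the observed x."""
        return min((-1, 1), key=lambda a: value(t + 1, n + a, d + a * x))

    return value, policy
```

With a horizon of 2, value(0, 0, 0) is 2.0, while independent fair coin flips give an expected cost of 4.0 under the same proxy, so planning ahead halves the expected imbalance in this toy case.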
Topics
- A/B Testing
- Sequential Experimental Design
- Model Misspecification
- Robust Design
- Dynamic Programming
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Data Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.