Robust Sequential Experimental Design for A/B Testing

2026-05-14 · Source: stat.ML updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, extended

Summary

This paper introduces a robust sequential experimental design framework for A/B testing. It addresses a critical limitation of existing designs: they rely on correctly specified models. The proposed approach unifies contextual bandit and dynamic settings, and the authors prove that it bounds the worst-case mean squared error (MSE) of the estimated treatment effect. It has three key components: an orthogonalization technique that yields a robust MSE upper bound independent of the specific nonlinear effects; a dynamic programming (DP) algorithm for sequential allocation that optimizes this long-term MSE upper bound; and an extension to dynamic settings with carryover effects, using a hierarchical design that combines DP and reinforcement learning (RL). In numerical experiments on synthetic data and on real-world datasets from a leading technology company, the robust sequential design consistently achieves lower MSE than non-robust, random, switchback, Neyman balanced, and Bayesian biased-coin baselines, particularly in small-sample and misspecified environments.

Key takeaway

Research scientists designing A/B tests in environments prone to model misspecification or finite-sample covariate imbalance should consider this robust sequential experimental design. By combining orthogonalization with dynamic programming, it estimates the Average Treatment Effect (ATE) with lower MSE, because it explicitly accounts for potential nonlinearities and optimizes a long-term objective rather than a one-step one. The hierarchical DP/RL extension carries the same benefit over to dynamic settings with carryover effects.

Key insights

Robust sequential experimental design improves A/B testing efficiency by mitigating model misspecification and finite-sample imbalances.

Principles

Orthogonalization yields robust MSE bounds.
Dynamic programming optimizes long-term objectives.
Hierarchical DP/RL handles carryover effects.
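
To make the first principle concrete, here is a minimal Python sketch of what orthogonalization means in general: a nonlinear outcome component is split into the part a linear working model can absorb and a residual that is orthogonal to the model's regressors. It does not reproduce the paper's construction. The function names (dot, project_out), the covariate grid, and the choice of nonlinear term are all hypothetical and chosen only for illustration.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def project_out(basis, target):
    """Return the part of `target` orthogonal to every vector in `basis`.

    Uses Gram-Schmidt, so the basis vectors must be linearly independent.
    """
    ortho = []
    for b in basis:
        r = list(b)
        for q in ortho:
            c = dot(r, q) / dot(q, q)
            r = [ri - c * qi for ri, qi in zip(r, q)]
        ortho.append(r)
    resid = list(target)
    for q in ortho:
        c = dot(resid, q) / dot(q, q)
        resid = [ri - c * qi for ri, qi in zip(resid, q)]
    return resid


# Hypothetical data: a covariate grid and an unknown nonlinear effect.
x = [i / 10 - 1 for i in range(21)]
intercept = [1.0] * len(x)
nonlinear = [math.sin(3 * xi) + xi ** 2 for xi in x]

# Residual nonlinearity that a linear model in (1, x) cannot explain.
residual = project_out([intercept, x], nonlinear)
```

The residual has zero inner product with both the intercept and the covariate. In this simplified picture, that residual is the component a misspecified linear model leaves unexplained, which is the kind of term a worst-case bound has to cover.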

Method

The method uses orthogonalization for MSE upper bounds, dynamic programming for sequential allocation, and a nested DP/RL scheme for dynamic settings with carryover effects.

In practice

Apply orthogonalization to handle unknown nonlinear effects.
Use DP to optimize A/B test allocation over time.
Combine DP and RL for complex dynamic experiments.
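
The DP step can be illustrated with a toy finite-horizon allocation problem, sketched below in Python. This is not the paper's algorithm. It assumes a binary covariate x in {-1, +1} with equal probability, a treatment arm a in {-1, +1}, and a terminal cost d^2 + s^2 standing in for an MSE upper bound, where d is the arm-size gap and s is the covariate imbalance. The names make_planner, value, and act are hypothetical.

```python
from functools import lru_cache


def make_planner(horizon):
    """Build value and policy functions for a toy sequential allocation.

    Illustrative assumptions (not from the paper): covariate x is +1 or -1
    with probability 1/2 each, the arm is a in {-1, +1}, and the terminal
    cost d**2 + s**2 is a proxy for an MSE upper bound, with d = sum(a)
    and s = sum(a * x) accumulated over assigned units.
    """

    @lru_cache(maxsize=None)
    def value(t, d, s):
        # Expected terminal cost under optimal play, evaluated just
        # before the covariate of unit t is revealed.
        if t == horizon:
            return float(d * d + s * s)
        expected = 0.0
        for x in (-1, 1):
            best = min(value(t + 1, d + a, s + a * x) for a in (-1, 1))
            expected += 0.5 * best
        return expected

    def act(t, d, s, x):
        # Greedy with respect to the cost-to-go, which is optimal here
        # because `value` already accounts for all future decisions.
        return min((-1, 1), key=lambda a: value(t + 1, d + a, s + a * x))

    return value, act


value, act = make_planner(horizon=2)
optimal_cost = value(0, 0, 0)
```

With a horizon of two units, the optimal expected cost in this toy problem is 2.0, while assigning arms by independent fair coin flips gives an expected cost of 4 (2 from the arm-size gap and 2 from the covariate imbalance). The gap shows why planning over the whole horizon can beat purely random allocation in small samples.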

Topics

A/B Testing
Sequential Experimental Design
Model Misspecification
Robust Design
Dynamic Programming

Code references

QianglinSIMON/Robust-Sequential-Experimental-Design-for-AB-Testing

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.