Catching a Moving Subspace: Low-Rank Bandits Beyond Stationarity

· Source: stat.ML updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, extended

Summary

Real-world bandit deployments, such as recommendation and clinical dosing, often involve rewards on a low-dimensional latent subspace that drifts. The paper "Catching a Moving Subspace: Low-Rank Bandits Beyond Stationarity" presents a novel approach to these piecewise-stationary low-rank linear contextual bandits with scalar feedback. The authors establish three necessary and jointly sufficient probe-side conditions for recovering the moving subspace from quadratic reward functionals. They propose SPSC (Single-Play Subspace-Calibrated Optimism), an algorithm that interleaves isotropic probes with windowed projected ridge-UCB exploitation. SPSC achieves a costed dynamic regret of \u00d5(r\u221aT)+\u00d5(T^2/3)+O(W\u2009V_in), replacing the ambient \u00d5(d\u221aT) rate with the intrinsic rank r. Empirical evaluations across eleven benchmarks, including synthetic, UCI/MovieLens, clinical, and ZOZOTOWN production data, demonstrate SPSC's superior performance when d-r\u227d T^1/6, aligning with theoretical predictions.

Key takeaway

For machine learning engineers building high-dimensional sequential decision systems with drifting latent structures, SPSC offers a robust solution. You should consider deploying SPSC when your ambient dimension d significantly exceeds the intrinsic rank r, specifically if d-r \u227d T^1/6. This method provides substantial regret reductions by adapting to changing low-rank subspaces, outperforming traditional non-stationary and fixed low-rank approaches. It's also resilient to mild overestimation of the true rank.

Key insights

Scalar rewards enable quadratic measurements for changing low-rank subspace recovery in non-stationary bandits.

Principles

Method

SPSC interleaves isotropic probes with windowed projected ridge-UCB exploitation within the learned r-dimensional subspace, using a CUSUM-style detector for online change point discovery.

In practice

Topics

Code references

Best for: Research Scientist, AI Scientist, Machine Learning Engineer

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.