High-Dimensional Robust Change-Point Detection via Angular Kernel Statistics
Summary
A new nonparametric change-point detection method, the dimension-averaged angular kernel scan framework, is proposed for high-dimensional, low sample size (HDLSS) data. The framework targets marginal distributional shifts when inference relies on small observation batches and the ambient dimension diverges while the sequence length stays fixed. The statistic is fully nonparametric, hyperparameter-free, and moment-agnostic, so it remains robust to heavy-tailed or contaminated distributions without requiring finite marginal moments. For offline single-change problems, the framework yields an exact population mean factorization and null covariance structure, both valid for any fixed sample size and dimension. An HDLSS multivariate central limit theorem is also established, enabling variance-calibrated, asymptotically distribution-free tests with type-I error control and bounds on power and localization accuracy. The procedure extends to fixed-window sequential monitoring of high-dimensional streaming data, with average run length (ARL) calibration and worst-case (Pollak) expected detection delay (EDD) bounds. Simulation studies confirm its accuracy in challenging HDLSS and streaming environments where traditional moment-based methods are unstable.
Key takeaway
For machine learning engineers analyzing high-dimensional data streams, the angular kernel scan framework offers a robust approach to change-point detection. If your datasets are heavy-tailed, contaminated, or have low sample sizes, the method is designed to detect and localize changes without moment assumptions or hyperparameter tuning. Consider it for HDLSS and streaming environments where traditional moment-based techniques are unstable.
Key insights
A new angular kernel scan framework robustly detects change-points in high-dimensional data with low sample sizes, even when the data are heavy-tailed.
Principles
- Aggregate one-dimensional angular discrepancies.
- Nonparametric, hyperparameter-free, moment-agnostic.
- Exact mean factorization and null covariance valid for any fixed sample size and dimension.
Method
The dimension-averaged angular kernel scan aggregates bounded one-dimensional angular discrepancies across coordinates. The paper derives an exact population mean factorization and the null covariance of the statistic, then extends the procedure to fixed-window sequential monitoring.
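To make the idea of averaging bounded one-dimensional discrepancies concrete, here is a minimal Python sketch of an offline single-change scan. It is an illustration under stated assumptions, not the paper's statistic: this summary does not give the exact kernel, so the sketch swaps in a rank-based distance (in one dimension, the angle subtended at a reference point by two values is either 0 or pi, so averaging over the sample amounts to counting the points that lie between them). The function name `angular_scan`, the `min_seg` parameter, and the `t(n - t)/n^2` weight are hypothetical choices made for this example.

```python
import numpy as np

def angular_scan(X, min_seg=2):
    """Scan candidate split points of an (n, p) data matrix.

    Assumption: the per-coordinate discrepancy is |rank difference| / n,
    a bounded, moment-free stand-in for the paper's angular kernel.
    Returns (t_hat, scores), where t is the size of the first segment.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    # Per-coordinate ranks 0..n-1 (assumes no ties, e.g. continuous data).
    ranks = X.argsort(axis=0).argsort(axis=0)
    # Pairwise discrepancy averaged over the p coordinates; entries lie in [0, 1).
    D = np.abs(ranks[:, None, :] - ranks[None, :, :]).mean(axis=2) / n
    scores = {}
    for t in range(min_seg, n - min_seg + 1):
        cross = D[:t, t:].mean()          # between-segment discrepancy
        within_left = D[:t, :t].mean()    # within first segment
        within_right = D[t:, t:].mean()   # within second segment
        weight = t * (n - t) / n**2       # hypothetical down-weighting of edge splits
        scores[t] = weight * (2 * cross - within_left - within_right)
    t_hat = max(scores, key=scores.get)
    return t_hat, scores
```

Because only ranks enter the score, the sketch is unaffected by heavy tails: calling `angular_scan(X)` on Cauchy-distributed rows gives the same result as on any monotone transformation of each coordinate. It performs no variance calibration, so it returns a location estimate rather than a test with type-I error control.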
In practice
- Detect changes in HDLSS datasets.
- Monitor high-dimensional streaming data.
- Analyze heavy-tailed or contaminated distributions.
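For the streaming use case, a fixed-window monitor can be sketched as follows. This is a simplified illustration, not the paper's procedure: it compares the older and newer halves of a sliding window using a rank-based discrepancy (an assumed stand-in for the angular kernel), and it raises an alarm when a user-supplied `threshold` is exceeded. The names `window_score` and `monitor`, the half-window split, and the default values are all hypothetical. The paper calibrates its threshold to a target ARL, which this sketch does not attempt.

```python
import numpy as np

def window_score(W):
    """Rank-based discrepancy between the older and newer halves of a (w, p) window."""
    w = W.shape[0]
    h = w // 2
    # Per-coordinate ranks within the window (assumes no ties).
    ranks = W.argsort(axis=0).argsort(axis=0)
    # Pairwise discrepancy averaged over coordinates; entries lie in [0, 1).
    D = np.abs(ranks[:, None, :] - ranks[None, :, :]).mean(axis=2) / w
    return 2 * D[:h, h:].mean() - D[:h, :h].mean() - D[h:, h:].mean()

def monitor(stream, window=20, threshold=0.15):
    """Return the 0-based index of the observation at which the alarm fires, or None."""
    buf = []
    for i, x in enumerate(stream):
        buf.append(np.asarray(x, dtype=float))
        if len(buf) > window:
            buf.pop(0)  # keep only the most recent `window` observations
        if len(buf) == window and window_score(np.stack(buf)) > threshold:
            return i
    return None
```

A call such as `monitor(rows, window=20, threshold=0.15)` accepts any iterable of length-p vectors. In practice the threshold would need to be chosen for the window length and the desired false-alarm rate, for example by running the monitor on data known to contain no change.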
Topics
- High-Dimensional Data
- Change-Point Detection
- Nonparametric Statistics
- Angular Kernel Scan
- Streaming Data Analysis
- HDLSS Regime
Best for: AI Scientist, Machine Learning Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.