High-Dimensional Robust Change-Point Detection via Angular Kernel Statistics

Source: stat.ML updates on arXiv.org · Field: Science & Research (Mathematics & Computational Sciences, Artificial Intelligence & Machine Learning) · Depth: Expert, quick

Summary

The paper proposes a nonparametric change-point detection method, the dimension-averaged angular kernel scan framework, for high-dimensional, low-sample-size (HDLSS) data. The framework targets marginal distributional shifts in settings where inference relies on small batches of observations and the ambient dimension diverges while the sequence length stays fixed. The statistic is fully nonparametric, hyperparameter-free, and moment-agnostic: it requires no finite marginal moments, which makes it robust to heavy-tailed or contaminated distributions. For the offline single-change problem, the paper derives an exact factorization of the population mean and the exact null covariance structure, both valid for any fixed sample size and dimension. It also establishes an HDLSS multivariate central limit theorem, which yields variance-calibrated, asymptotically distribution-free tests with type-I error control, together with bounds on power and localization accuracy. The procedure extends to fixed-window sequential monitoring of high-dimensional streaming data, with average run length (ARL) calibration and bounds on Pollak's worst-case expected detection delay (EDD). Simulation studies support its accuracy in HDLSS and streaming settings where traditional moment-based methods are unstable.

Key takeaway

For machine learning engineers analyzing high-dimensional data streams, the angular kernel scan framework offers a robust approach to change-point detection. When data are heavy-tailed, contaminated, or available only in small samples, it is designed to detect and localize changes without moment assumptions or hyperparameter tuning. It is worth considering in HDLSS and streaming settings where moment-based techniques become unstable.

Key insights

A new angular kernel scan framework robustly detects change-points in high-dimensional, low-sample-size data, even when the data are heavy-tailed.

Principles

Method

The dimension-averaged angular kernel scan aggregates bounded one-dimensional angular discrepancies across coordinates. For this statistic, the paper derives an exact population mean factorization and the null covariance structure, then extends the procedure to fixed-window sequential monitoring.
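The idea of averaging bounded per-coordinate discrepancies and scanning over candidate split points can be illustrated with a minimal sketch. This is not the paper's statistic: the summary does not give its formula, so the sketch assumes a stand-in discrepancy, the absolute difference of pooled empirical ranks within each coordinate, which is bounded and needs no moments. The function names (`coordinate_ranks`, `dimension_averaged_discrepancy`, `scan_change_point`), the energy-style contrast, and the `t * (n - t) / n` weighting are all illustrative assumptions.

```python
import numpy as np


def coordinate_ranks(X):
    """Map each column of X (n samples x d coordinates) to scaled ranks in (0, 1]."""
    n = X.shape[0]
    # Double argsort yields 0-based ranks per column (ties broken arbitrarily).
    ranks = np.argsort(np.argsort(X, axis=0), axis=0)
    return (ranks + 1) / n


def dimension_averaged_discrepancy(X):
    """n x n matrix of bounded pairwise discrepancies, averaged over coordinates.

    ASSUMPTION: |rank_i - rank_k| / n per coordinate stands in for the paper's
    one-dimensional angular discrepancy, whose exact form is not in the summary.
    """
    U = coordinate_ranks(X)
    return np.abs(U[:, None, :] - U[None, :, :]).mean(axis=2)


def scan_change_point(X):
    """Scan all splits t = 2, ..., n-2 and return (best split, best score).

    The score is an energy-style contrast (cross-segment discrepancy minus
    within-segment discrepancies), weighted by t * (n - t) / n. Requires n >= 4.
    """
    n = X.shape[0]
    D = dimension_averaged_discrepancy(X)
    splits = np.arange(2, n - 1)
    stats = np.empty(len(splits))
    for idx, t in enumerate(splits):
        m = n - t
        # Diagonal entries of D are zero, so dividing by t*(t-1) averages
        # over distinct pairs only.
        within_left = D[:t, :t].sum() / (t * (t - 1))
        within_right = D[t:, t:].sum() / (m * (m - 1))
        cross = D[:t, t:].mean()
        stats[idx] = (t * m / n) * (2 * cross - within_left - within_right)
    best = int(np.argmax(stats))
    return int(splits[best]), float(stats[best])


# Hypothetical HDLSS example: 40 Cauchy observations in 500 dimensions,
# with a location shift applied to every coordinate after observation 20.
rng = np.random.default_rng(0)
X = rng.standard_cauchy(size=(40, 500))
X[20:] += 2.0
t_hat, score = scan_change_point(X)
print("estimated change-point:", t_hat, "score:", round(score, 3))
```

Because every pairwise discrepancy lies in [0, 1), a single extreme observation cannot dominate the scan, which is the intuition behind the robustness claims. Calibrating a rejection threshold, for example through the variance-calibrated central limit theorem the paper describes, is outside the scope of this sketch.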

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.