Bayesian Nonparametric Detection of Anomalies in Multivariate Functional Data

Source: stat.ML updates on arXiv.org · Field: Science & Research — Mathematics & Computational Sciences, Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, extended

Summary

Daniel Krasnov and David Stephens introduce WICMAD, a Bayesian nonparametric model for semi-supervised anomaly detection in multivariate functional data. The model treats functional observations as draws from an infinite mixture of multi-output Gaussian processes and represents the mean functions with wavelet bases under Besov priors, which yields smooth, sparse estimates. It captures dependence across functional components through the intrinsic coregionalization model (ICM) and uses a Carlin-Chib product space step to select a covariance kernel automatically for each cluster. WICMAD assigns anomalous observations to small mixture components, so it needs no prior knowledge of how many anomalies there are or what type they are. Tested in a semi-supervised setting with 15% labeled normal observations, the model achieved near-perfect median performance on simulated univariate and multivariate datasets and strong results on the real-world Character Trajectories, Asphalt Regularity, and Chinatown datasets.
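To make the intrinsic coregionalization model concrete, the following is a minimal sketch of how an ICM covariance is usually assembled: the covariance between channel i at time s and channel j at time t is B[i][j] * k(s, t), which is the Kronecker product of a coregionalization matrix B with a kernel matrix over time. The function names, the squared-exponential kernel, the lengthscale, and the values in B are illustrative assumptions, not details taken from the paper.

```python
import math

def rbf_kernel(s, t, lengthscale=0.2):
    """Squared-exponential kernel on scalar inputs (an assumed choice of kernel)."""
    return math.exp(-0.5 * ((s - t) / lengthscale) ** 2)

def icm_covariance(times, B, kernel=rbf_kernel):
    """Build the ICM covariance as the Kronecker product of B and the time kernel matrix.

    Row and column index = channel * len(times) + time_index.
    """
    n, p = len(times), len(B)
    K = [[kernel(s, t) for t in times] for s in times]
    size = n * p
    cov = [[0.0] * size for _ in range(size)]
    for i in range(p):              # output channel of the row
        for j in range(p):          # output channel of the column
            for a in range(n):      # time index of the row
                for b in range(n):  # time index of the column
                    cov[i * n + a][j * n + b] = B[i][j] * K[a][b]
    return cov

# Two channels with hypothetical cross-channel covariance 0.8, three time points.
times = [0.0, 0.5, 1.0]
B = [[1.0, 0.8], [0.8, 1.0]]
cov = icm_covariance(times, B)
```

Because B is shared across all time points, a single small matrix controls how strongly the functional components move together, while the time kernel controls smoothness within each component.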

Key takeaway

For research scientists developing functional anomaly detection systems, WICMAD offers a model-based approach that adapts to diverse anomaly types and multivariate dependencies. Consider its semi-supervised framework, which uses a small set of labels to improve performance, especially where high recall is critical. Its automatic, cluster-specific kernel selection reduces sensitivity to kernel misspecification, which makes anomaly identification more reliable in complex functional datasets.

Key insights

WICMAD uses infinite Gaussian process mixtures and adaptive kernel selection for semi-supervised functional anomaly detection.
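As a rough illustration of adaptive kernel selection, the sketch below turns per-kernel log scores into probabilities for a discrete kernel indicator, which could then be sampled within an MCMC sweep. This is a simplification: a full Carlin-Chib product space step also involves pseudo-priors for the parameters of the kernels not currently selected, and those are omitted here. The function name, the candidate kernel names, and the log scores are hypothetical.

```python
import math

def kernel_indicator_probs(log_scores, log_prior=None):
    """Normalize per-kernel log scores into indicator probabilities.

    log_scores maps kernel name -> log score (for example a log likelihood).
    The maximum is subtracted before exponentiating for numerical stability.
    """
    names = list(log_scores)
    if log_prior is None:
        log_prior = {name: 0.0 for name in names}  # uniform prior over kernels
    logits = [log_scores[name] + log_prior[name] for name in names]
    top = max(logits)
    unnorm = [math.exp(x - top) for x in logits]
    total = sum(unnorm)
    return {name: u / total for name, u in zip(names, unnorm)}

# Hypothetical log scores for three candidate kernels within one cluster.
probs = kernel_indicator_probs({"rbf": -120.0, "matern": -118.0, "periodic": -130.0})
```

Sampling the indicator from these probabilities, rather than always taking the best-scoring kernel, keeps kernel uncertainty in the posterior for each cluster.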

Method

WICMAD performs posterior inference via an MCMC algorithm combining Gibbs sampling and Metropolis-Hastings updates for cluster assignments, wavelet means, kernel indicators, and ICM parameters.
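The cluster-assignment update can be sketched for the prior part of a Gibbs step, assuming the infinite mixture is a Dirichlet process in its Chinese restaurant process form (this summary does not confirm which representation the authors use). Each existing cluster attracts an observation in proportion to its current size, and a new cluster opens with weight proportional to a concentration parameter alpha. In a full sampler these weights would be multiplied by each cluster's likelihood for the observation before sampling. The function name and the value of alpha are hypothetical.

```python
from collections import Counter

def crp_prior_weights(assignments, i, alpha=1.0):
    """Prior probabilities for observation i joining each cluster or a new one.

    assignments: list of current cluster labels, one per observation.
    Observation i is removed from the counts before the weights are computed.
    """
    counts = Counter(z for j, z in enumerate(assignments) if j != i)
    total = sum(counts.values()) + alpha
    weights = {label: n / total for label, n in counts.items()}
    weights["new"] = alpha / total
    return weights

# Five observations: three in cluster 0, one in cluster 1, one singleton in cluster 2.
assignments = [0, 0, 0, 1, 2]
weights = crp_prior_weights(assignments, i=4, alpha=1.0)
```

The "new" entry is what lets an unusual curve open its own small component, consistent with the summary's description of anomalies landing in small mixture components.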

Best for: AI Scientist, Research Scientist, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.