Bayesian Nonparametric Detection of Anomalies in Multivariate Functional Data
Summary
Daniel Krasnov and David Stephens introduce WICMAD, a Bayesian nonparametric model for semi-supervised anomaly detection in multivariate functional data. The model treats functional observations as draws from an infinite mixture of multi-output Gaussian processes, with wavelet bases and Besov priors giving a smooth, sparse representation of each mean function. Dependence across the functional components is captured by the intrinsic coregionalization model (ICM), and a Carlin-Chib product space step selects a covariance kernel automatically for each cluster. WICMAD assigns anomalous observations to small mixture components without requiring prior knowledge of the number or type of anomalies. Tested in a semi-supervised setting with 15% labeled normal observations, the model achieved near-perfect median performance on simulated univariate and multivariate datasets, and strong results on the real-world Character Trajectories, Asphalt Regularity, and Chinatown datasets.
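As a rough illustration of the intrinsic coregionalization model mentioned above, the covariance between channel i at input s and channel j at input t factorizes as B[i, j] * k(s, t), which on a shared grid is a Kronecker product. The sketch below is not taken from the paper: the squared-exponential kernel, the rank-1 loadings `A`, the diagonal term `kappa`, and the function names are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(t, lengthscale=0.2):
    """Squared-exponential kernel matrix on a 1-D grid t (illustrative choice)."""
    d = t[:, None] - t[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def icm_covariance(t, A, kappa, lengthscale=0.2):
    """ICM covariance on a shared grid: K = B kron k(t, t).

    B = A A^T + diag(kappa) is the coregionalization matrix. It is positive
    semi-definite by construction when kappa is non-negative.
    """
    B = A @ A.T + np.diag(kappa)
    Kt = rbf_kernel(t, lengthscale)
    # Observations are stacked channel by channel, so B is the left factor.
    return np.kron(B, Kt), B

t = np.linspace(0.0, 1.0, 16)
A = np.array([[1.0], [0.8], [-0.5]])   # hypothetical loadings for 3 channels
kappa = np.array([0.1, 0.1, 0.1])      # hypothetical channel-specific variances
K, B = icm_covariance(t, A, kappa)
```

Under this stacking, the block of `K` linking channels 0 and 1 equals `B[0, 1] * rbf_kernel(t)`, so a single shared kernel is rescaled by the cross-channel covariance.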
Key takeaway
For research scientists developing functional anomaly detection systems, WICMAD offers a model-based approach that adapts to diverse anomaly types and multivariate dependencies. You should consider its semi-supervised framework, which uses a limited set of labels to improve performance, especially where high recall is critical. Because a covariance kernel is selected for each cluster, the method is less sensitive to kernel misspecification and identifies anomalies more reliably in complex functional datasets.
Key insights
WICMAD uses infinite Gaussian process mixtures and adaptive kernel selection for semi-supervised functional anomaly detection.
Principles
- Anomalies emerge as small, distinct mixture components.
- Wavelet bases provide sparse, smooth mean function representations.
- Carlin-Chib product space enables adaptive kernel selection.
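To make the sparsity principle concrete, the following sketch applies a hand-written orthonormal Haar transform to a smooth curve with one local bump and measures how much of the energy the largest 10% of coefficients carry. The Haar family is used only because it fits in a few lines; the wavelet family and the Besov prior used in the paper are not reproduced here, and the test signal is invented.

```python
import numpy as np

def haar_dwt(x):
    """Full orthonormal Haar transform of a signal whose length is a power of two."""
    x = np.asarray(x, dtype=float)
    details = []
    while x.size > 1:
        avg = (x[0::2] + x[1::2]) / np.sqrt(2.0)
        diff = (x[0::2] - x[1::2]) / np.sqrt(2.0)
        details.append(diff)          # stored from finest to coarsest level
        x = avg
    return x, details                 # x is the single coarsest coefficient

def haar_idwt(approx, details):
    """Invert haar_dwt by undoing each level, coarsest first."""
    x = np.asarray(approx, dtype=float)
    for diff in reversed(details):
        out = np.empty(2 * x.size)
        out[0::2] = (x + diff) / np.sqrt(2.0)
        out[1::2] = (x - diff) / np.sqrt(2.0)
        x = out
    return x

# Invented mean function: one sine period plus a narrow local bump.
t = np.arange(256) / 256.0
signal = np.sin(2 * np.pi * t) + 0.5 * np.exp(-((t - 0.7) / 0.02) ** 2)

approx, details = haar_dwt(signal)
coeffs = np.concatenate([approx] + details)

# Fraction of total energy held by the largest 10% of coefficients.
k = coeffs.size // 10
largest = np.sort(coeffs ** 2)[::-1][:k]
energy_kept = largest.sum() / np.sum(coeffs ** 2)
```

Because the transform is orthonormal, coefficient energy equals signal energy, so `energy_kept` directly measures how well a sparse coefficient vector can represent the curve.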
Method
WICMAD performs posterior inference via an MCMC algorithm combining Gibbs sampling and Metropolis-Hastings updates for cluster assignments, wavelet means, kernel indicators, and ICM parameters.
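One of the updates listed above, the kernel indicator, can be sketched in a heavily simplified form. In a Carlin-Chib sampler the parameters of every candidate model are kept in the state, with pseudo-priors for the models not currently selected. The toy version below fixes all kernel hyperparameters, so the pseudo-priors drop out and the indicator's full conditional is just prior times marginal likelihood. The two candidate kernels, the noise variance, the zero-mean assumption, and the function names are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def rbf(t, ell=0.2):
    d = t[:, None] - t[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def exponential(t, ell=0.2):
    d = np.abs(t[:, None] - t[None, :])
    return np.exp(-d / ell)

def gp_log_marginal(y, K, noise_var):
    """log N(y | 0, K + noise_var * I), computed through a Cholesky factor."""
    n = y.size
    L = np.linalg.cholesky(K + noise_var * np.eye(n))
    w = np.linalg.solve(L, y)                      # w = L^{-1} y
    return -0.5 * w @ w - np.log(np.diag(L)).sum() - 0.5 * n * np.log(2 * np.pi)

def kernel_indicator_probs(curves, kernels, log_prior, noise_var):
    """Full conditional of one cluster's kernel indicator, hyperparameters fixed."""
    log_post = np.array([
        lp + sum(gp_log_marginal(y, K, noise_var) for y in curves)
        for lp, K in zip(log_prior, kernels)
    ])
    log_post -= log_post.max()                     # stabilise before exponentiating
    p = np.exp(log_post)
    return p / p.sum()

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 32)
kernels = [rbf(t), exponential(t)]                 # hypothetical candidate set
noise_var = 0.05

# Toy cluster: five zero-mean curves simulated from the RBF model plus noise.
cov = kernels[0] + noise_var * np.eye(t.size)
curves = rng.multivariate_normal(np.zeros(t.size), cov, size=5)

probs = kernel_indicator_probs(curves, kernels, np.log([0.5, 0.5]), noise_var)
new_indicator = rng.choice(len(kernels), p=probs)  # one Gibbs draw of the indicator
```

In a full sampler this draw would alternate with updates of the cluster assignments, the wavelet coefficients of the mean, and the ICM parameters.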
In practice
- Detects magnitude, local behavior, and shape anomalies.
- Evaluated in a semi-supervised setting with 15% labeled normal observations.
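Since anomalies are expected to land in small mixture components, one simple way to turn sampled cluster assignments into anomaly scores is to flag, in each MCMC draw, the observations that sit in clusters below a size threshold and then average over draws. This is a hypothetical post-processing rule, not necessarily the decision rule used in the paper; the threshold `max_fraction` and the function names are assumptions.

```python
import numpy as np

def flag_small_clusters(z, max_fraction):
    """Boolean mask of observations whose cluster holds less than max_fraction of the data."""
    z = np.asarray(z)
    labels, counts = np.unique(z, return_counts=True)
    small = labels[counts / z.size < max_fraction]
    return np.isin(z, small)

def anomaly_probability(z_draws, max_fraction=0.15):
    """Average the per-draw flags over MCMC draws (rows of z_draws)."""
    flags = np.array([flag_small_clusters(z, max_fraction) for z in z_draws])
    return flags.mean(axis=0)

# Two invented posterior draws of cluster labels for ten observations.
z_draws = np.array([
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 2, 1],
])
scores = anomaly_probability(z_draws)
```

Flagging by cluster size rather than by cluster label means the score does not depend on how labels happen to be permuted from one draw to the next.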
Topics
- Bayesian Nonparametrics
- Functional Anomaly Detection
- Gaussian Processes
- Wavelet Analysis
- Intrinsic Coregionalization Model
- Semi-supervised Learning
Best for: AI Scientist, Research Scientist, Data Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.