Infinite-dimensional Mahalanobis Distance with Applications to Kernelized Novelty Detection
Summary
A new framework extends the classical Mahalanobis distance to separable Banach spaces by reinterpreting it as the Cameron-Martin norm associated with a probability measure. This yields a basis-free, data-driven anomaly distance called the variance norm, which can be estimated from empirical measures. The framework generalizes the existing $\mathbb{R}^d$, functional $(L^2[0,1])^d$, and kernelized settings, and it covers non-injective covariance operators. The variance norm is proven invariant under invertible bounded linear transformations, extending prior results that were limited to unitary operators. In Hilbert spaces, the variance norm is connected to the RKHS of the covariance operator, and consistency and convergence are established for empirical estimation with Tikhonov regularization. The authors also introduce a kernelized nearest-neighbour Mahalanobis distance and report that it outperforms the traditional kernelized Mahalanobis distance for multivariate time series novelty detection on 12 real-world datasets, using the signature, global alignment, and Volterra reservoir kernels.
Key takeaway
For research scientists developing anomaly detection systems, the reported results suggest that the kernelized nearest-neighbour variant of the infinite-dimensional Mahalanobis distance can outperform the traditional kernelized Mahalanobis distance for multivariate time series novelty detection. Consider this approach when working with complex time series data or with covariance operators that are not injective.
Key insights
The Mahalanobis distance is extended to infinite-dimensional spaces via a variance norm for enhanced novelty detection.
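For orientation, the two objects being related can be written side by side. This is a sketch using the standard textbook definitions; the notation ($\Sigma$, $E$, $E^*$, $\|\cdot\|_{\mu}$) is chosen here for illustration and is assumed, not confirmed, to match the paper's. In $\mathbb{R}^d$ with an invertible covariance matrix $\Sigma$, the classical distance is

$$
d_M(x, y) = \sqrt{(x - y)^\top \Sigma^{-1} (x - y)} .
$$

For a probability measure $\mu$ on a separable Banach space $E$ with dual $E^*$, the Cameron-Martin norm is usually defined without reference to a basis or an inverse:

$$
\|h\|_{\mu} = \sup \left\{ f(h) \;:\; f \in E^*,\ \operatorname{Var}_{\mu}(f) \le 1 \right\} .
$$

In $\mathbb{R}^d$, taking $f(h) = a^\top h$ gives $\operatorname{Var}_{\mu}(f) = a^\top \Sigma a$, and the Cauchy-Schwarz inequality shows that the supremum equals $\sqrt{h^\top \Sigma^{-1} h}$, so $d_M(x, y) = \|x - y\|_{\mu}$. When $\Sigma$ is not injective the supremum is still well defined: it is $+\infty$ for any $h$ outside the range of $\Sigma$.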
Principles
- The variance norm is invariant under invertible bounded linear transformations, not only unitary ones.
- The Mahalanobis distance is reinterpreted as the Cameron-Martin norm of a probability measure.
- The variance norm can be estimated consistently from empirical measures.
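The invariance principle can be checked numerically in the finite-dimensional case. The sketch below is an illustration in $\mathbb{R}^3$ only, not the paper's proof or code; the data, the matrix `A`, and the helper `mahalanobis` are all made up for the example. It estimates the covariance from samples, applies an invertible but non-orthogonal map to both the samples and the test vector, and compares the two distances.

```python
import numpy as np

def mahalanobis(h, cov):
    """Classical Mahalanobis norm sqrt(h^T cov^{-1} h) for an invertible covariance."""
    return float(np.sqrt(h @ np.linalg.solve(cov, h)))

rng = np.random.default_rng(0)

# Hypothetical correlated sample in R^3 (rows are observations).
mixing = np.array([[2.0, 0.0, 0.0],
                   [0.5, 1.0, 0.0],
                   [0.0, 0.3, 0.1]])
X = rng.normal(size=(200, 3)) @ mixing
cov = np.cov(X, rowvar=False)

# An invertible linear map that is NOT orthogonal (it shears and rescales).
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 0.5]])
cov_A = np.cov(X @ A.T, rowvar=False)  # covariance of the transformed sample

h = X[0] - X[1]
d_original = mahalanobis(h, cov)
d_transformed = mahalanobis(A @ h, cov_A)

print(d_original, d_transformed)  # the two values agree up to rounding error
```

The agreement follows from the identity $(Ah)^\top (A \Sigma A^\top)^{-1} (Ah) = h^\top \Sigma^{-1} h$; a Euclidean distance would change under the same map.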
Method
The method reinterprets Mahalanobis distance as a Cameron-Martin norm, estimates a variance norm using empirical measures with Tikhonov regularization, and applies a kernelized nearest-neighbour approach for novelty detection.
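The estimation and scoring steps can be sketched in plain $\mathbb{R}^d$. The code below is a minimal illustration under stated assumptions, not the authors' estimator: it uses the common ridge form $(\Sigma + \alpha I)^{-1}$ as the Tikhonov regularization, scores a point by its smallest regularized distance to any corpus point, and all function names and the value of `alpha` are hypothetical. The corpus is placed in a plane so that the empirical covariance is singular, the finite-dimensional analogue of a non-injective covariance operator.

```python
import numpy as np

def regularized_norm_sq(h, cov, alpha):
    """h^T (cov + alpha I)^{-1} h, finite even when cov is singular."""
    d = cov.shape[0]
    return float(h @ np.linalg.solve(cov + alpha * np.eye(d), h))

def nn_mahalanobis_score(x, corpus, alpha=1e-3):
    """Smallest regularized Mahalanobis distance from x to any corpus point."""
    cov = np.cov(corpus, rowvar=False, bias=True)  # empirical covariance
    return min(
        float(np.sqrt(regularized_norm_sq(x - y, cov, alpha))) for y in corpus
    )

rng = np.random.default_rng(1)

# Corpus confined to the plane z = 0, so its covariance has a zero eigenvalue.
corpus = np.column_stack(
    [rng.normal(size=100), rng.normal(size=100), np.zeros(100)]
)

inlier = corpus[0] + np.array([0.01, 0.0, 0.0])  # stays in the plane
outlier = np.array([0.0, 0.0, 1.0])              # leaves the plane

score_in = nn_mahalanobis_score(inlier, corpus)
score_out = nn_mahalanobis_score(outlier, corpus)
print(score_in, score_out)
```

A step of size 1 in the zero-variance direction costs at least $1/\sqrt{\alpha}$, so the off-plane point receives a large score while the in-plane point stays close to zero. Without regularization the off-plane distance would be infinite.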
In practice
- Apply the kernelized nearest-neighbour Mahalanobis distance to time series anomaly detection.
- Use signature, global alignment, or Volterra reservoir kernels.
- Use Tikhonov regularization when estimating the variance norm from data.
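A kernelized version of these recommendations can be sketched using only Gram matrices. The code below is a hedged illustration, not the paper's implementation: it assumes the ridge-type regularization $(C + \alpha I)^{-1}$ of the empirical covariance operator $C$ in feature space, rewritten with the Woodbury identity so that only kernel evaluations are needed. An RBF kernel on fixed-length vectors stands in for the time series kernels named above, and every function name and parameter value is hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def linear_kernel(A, B):
    """Plain inner products; recovers the R^d computation."""
    return A @ B.T

def kernel_nn_mahalanobis(x, corpus, kernel, alpha=1e-2):
    """Nearest-neighbour regularized Mahalanobis score of x in feature space."""
    N = corpus.shape[0]
    x = np.atleast_2d(x)
    K = kernel(corpus, corpus)            # (N, N) Gram matrix
    k_x = kernel(corpus, x)[:, 0]         # k(y_i, x) for each corpus point
    k_xx = kernel(x, x)[0, 0]

    H = np.eye(N) - np.ones((N, N)) / N   # centering matrix
    K_c = H @ K @ H                       # Gram matrix of centered features

    # Column j describes h_j = phi(x) - phi(y_j).
    norm_sq = k_xx - 2.0 * k_x + np.diag(K)   # ||h_j||^2
    V = H @ (k_x[:, None] - K)                # <centered phi(y_i), h_j>

    # Woodbury: <h, (C + alpha I)^{-1} h>
    #   = (||h||^2 - v^T (K_c + N alpha I)^{-1} v) / alpha
    quad = np.sum(V * np.linalg.solve(K_c + N * alpha * np.eye(N), V), axis=0)
    dist_sq = (norm_sq - quad) / alpha
    return float(np.sqrt(max(dist_sq.min(), 0.0)))

rng = np.random.default_rng(2)
corpus = rng.normal(size=(50, 4))         # hypothetical "normal" samples
score_near = kernel_nn_mahalanobis(corpus[0] + 0.01, corpus, rbf_kernel)
score_far = kernel_nn_mahalanobis(np.full(4, 6.0), corpus, rbf_kernel)
print(score_near, score_far)
```

To use a time series kernel instead, only the `kernel` argument would need to change, provided it returns a Gram matrix between two collections of series. With `linear_kernel` the function reduces to the ordinary regularized Mahalanobis computation in $\mathbb{R}^d$, which gives a convenient sanity check.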
Topics
- Mahalanobis Distance
- Novelty Detection
- Kernel Methods
- Banach Spaces
- Time Series Analysis
Best for: Research Scientist, AI Researcher, AI Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by JMLR.