Computational-Statistical Trade-off in Kernel Two-Sample Testing with Random Fourier Features

2024-07-12 · Source: stat.ML updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, quick

Summary

A recent paper by Ikjun Choi and Ilmun Kim, arXiv:2407.08976v2, last revised 20 May 2026, investigates the computational-statistical trade-off in kernel two-sample testing using Random Fourier Features (RFF) to approximate the Maximum Mean Discrepancy (MMD) test. The MMD test is effective for complex, high-dimensional data but suffers from quadratic-time complexity. The authors reveal that the approximated MMD test is pointwise consistent in power only when the number of random features approaches infinity. However, they demonstrate that by carefully selecting the number of random features, it is possible to achieve the same minimax separation rates as the full MMD test within sub-quadratic time. This finding is supported by simulation studies and applies under various distributional assumptions, such as densities in a Sobolev ball.

Key takeaway

For AI Scientists or Research Scientists dealing with large-scale two-sample testing or high-dimensional data, you can now achieve the statistical power of the Maximum Mean Discrepancy (MMD) test with significantly reduced computational cost. Carefully determine the optimal number of Random Fourier Features (RFF) to balance efficiency and power. This enables robust testing on datasets previously too large for standard MMD, accelerating machine learning development.

Key insights

Approximated MMD tests with Random Fourier Features can match full MMD power at sub-quadratic time with careful feature selection.

Principles

MMD tests face quadratic complexity.
Infinite random features ensure pointwise consistency.
Optimal feature count balances speed and power.

Method

Revisit approximated MMD with Random Fourier Features; investigate computational-statistical trade-off; study time-power trade-off under minimax testing framework to identify optimal feature count for equivalent power at sub-quadratic time.

In practice

Apply RFF-approximated MMD for large datasets.
Optimize random feature count for efficiency.
Consider Sobolev ball densities for assumptions.

Topics

Kernel Two-Sample Testing
Maximum Mean Discrepancy
Random Fourier Features
Computational Complexity
Statistical Power
Minimax Testing

Best for: AI Scientist, Research Scientist

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.