A nonparametric two-sample test using a parametric integral probability metric

Source: stat.ML updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, quick

Summary

A new nonparametric two-sample test, PReLU-TST, is introduced for detecting distributional differences between two independent samples without assuming a specific parametric distribution. This 45-page study proposes a test statistic, PReLU-IPM, based on an integral probability metric (IPM) whose discriminator class is a specially designed parametric family consisting of a single neural network node. The study establishes theoretical guarantees for PReLU-TST, including consistency and, under regularity conditions, asymptotic equivalence to existing nonparametric IPM-based tests. In experiments on multiple simulated and real benchmark datasets, PReLU-TST either achieves higher power than competing methods across various alternatives or performs comparably to them at finite sample sizes. The work has been accepted for publication in Statistical Analysis and Data Mining.

Key takeaway

For machine learning engineers and data scientists who need to check whether two datasets come from the same distribution, PReLU-TST offers a test that does not assume a specific parametric form for the data. Consider it when sample sizes are limited: in the reported experiments, its finite-sample power was higher than or comparable to that of competing methods. Used in model validation or data quality checks, it can provide statistical evidence that two samples differ.

Key insights

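PReLU-TST is a nonparametric two-sample test with theoretical guarantees (consistency and, under regularity conditions, asymptotic equivalence to existing IPM-based tests) and finite-sample power that is higher than or comparable to that of competing methods.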

Method

The PReLU-TST procedure constructs a test statistic, PReLU-IPM, using an integral probability metric with a single-node neural network as its parametric discriminator class.
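To make the idea concrete, an IPM over a discriminator class F measures the largest gap in expectations, sup over f in F of |E f(X) - E f(Y)|. The sketch below illustrates that idea with a single-node discriminator f(x) = PReLU(w·x + b); it is not the paper's PReLU-IPM estimator or its calibration procedure. Everything specific here is an assumption made for illustration: the parameter ranges (unit-norm w, b in [-1, 1], slope in [0, 1]), the random search used to approximate the supremum, the permutation-based p-value, and all function names.

```python
import numpy as np


def prelu(z, slope):
    """Parametric ReLU: z where z >= 0, slope * z otherwise."""
    return np.where(z >= 0, z, slope * z)


def draw_candidates(dim, k, rng):
    """Draw k hypothetical single-node discriminators (w, b, slope).

    Assumed ranges (not taken from the paper): unit-norm w,
    b in [-1, 1], slope in [0, 1].
    """
    w = rng.normal(size=(k, dim))
    w /= np.linalg.norm(w, axis=1, keepdims=True)
    b = rng.uniform(-1.0, 1.0, size=k)
    slope = rng.uniform(0.0, 1.0, size=k)
    return w, b, slope


def single_node_ipm(x, y, candidates):
    """Approximate sup_f |mean f(x) - mean f(y)| over the drawn candidates."""
    w, b, slope = candidates
    fx = prelu(x @ w.T + b, slope)  # shape (n_x, k): one column per candidate
    fy = prelu(y @ w.T + b, slope)  # shape (n_y, k)
    return float(np.max(np.abs(fx.mean(axis=0) - fy.mean(axis=0))))


def permutation_two_sample_test(x, y, n_candidates=200, n_perm=200, seed=0):
    """Return (statistic, permutation p-value) for H0: same distribution."""
    rng = np.random.default_rng(seed)
    # Candidates are drawn once, independently of the data, so the
    # statistic is a fixed function of the samples under permutation.
    candidates = draw_candidates(x.shape[1], n_candidates, rng)
    observed = single_node_ipm(x, y, candidates)

    pooled = np.vstack([x, y])
    n_x = len(x)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        stat = single_node_ipm(pooled[idx[:n_x]], pooled[idx[n_x:]], candidates)
        exceed += stat >= observed
    # Add-one correction keeps the p-value strictly positive.
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=(100, 3))
    y = rng.normal(loc=2.0, size=(100, 3))  # mean-shifted alternative
    print(permutation_two_sample_test(x, y))
```

Random search only lower-bounds the supremum, so a gradient-based or exact optimizer over the discriminator parameters would give a sharper statistic; the permutation step is a generic way to calibrate any such statistic and stands in for whatever calibration the paper actually uses.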

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.