Distance-Matrix Wasserstein Statistics for Scalable Gromov--Wasserstein Learning

2026-05-14 · Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Mathematics & Computational Sciences · Depth: Expert, quick

Summary

Distance-Matrix Wasserstein (DMW) is a statistical framework for comparing graphs, shapes, and point clouds through the laws of random finite distance matrices. It targets the scalability problem of discrete Gromov--Wasserstein (GW) distances, which require solving nonconvex quadratic optimal transport problems. Instead of optimizing a global point-level alignment, DMW samples n points from each space, records their pairwise distances, and transports the resulting matrix laws. The paper proves that DMW is a relaxation and lower bound of GW, with the GW--DMW gap controlled by the Wasserstein error of approximating each original measure with n samples. Population DMW converges to GW as the sampled subspaces become dense, and the paper gives finite-sample bounds, including intrinsic-dimensional rates. For practical scalability, it introduces sliced and multi-scale DMW variants; the p=1 sliced multi-scale dissimilarity yields positive-definite exponential kernels. Experiments on synthetic metric spaces, scalability benchmarks, graph classification, and two-sample testing support the theory.

Key takeaway

For AI Scientists and Research Scientists who compare structured data, DMW is a scalable, theoretically grounded alternative to Gromov--Wasserstein. Consider it for graph classification or two-sample testing when computational cost is the bottleneck, especially on large datasets. Because it approximates GW with a controlled error even at finite sample sizes, and needs only pairwise distances rather than a common coordinate system, it suits spaces that cannot be embedded in a shared frame.

Key insights

DMW offers a scalable, provable relaxation of Gromov--Wasserstein for comparing structural data via sampled distance matrices.

Principles

DMW is a lower bound of GW.
The GW--DMW gap is controlled by the Wasserstein error of approximating each measure with n samples.
Finite-sample bounds include intrinsic-dimensional rates.

Method

Sample n points from each space, compute their pairwise distances, and transport the resulting distance-matrix laws to compare the two spaces.
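The three steps of the method can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: each space is given as a precomputed distance matrix, the matrix law is approximated by m Monte Carlo draws, the ground cost between two sampled matrices is the mean of entrywise absolute differences raised to the power p, and the transport between the two empirical laws is solved as an assignment problem with SciPy. The function names and the defaults for n, m, and p are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def euclidean_distance_matrix(points):
    """Pairwise Euclidean distances of a point cloud (one way to build the input)."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def sample_distance_matrices(dist, n, m, rng):
    """Draw m i.i.d. n-point samples from the uniform measure on the space and
    return their n x n distance matrices, flattened to the strict upper triangle."""
    idx = rng.integers(0, dist.shape[0], size=(m, n))
    sub = dist[idx[:, :, None], idx[:, None, :]]  # shape (m, n, n)
    iu = np.triu_indices(n, k=1)
    return sub[:, iu[0], iu[1]]  # shape (m, n * (n - 1) / 2)


def empirical_dmw(dist_x, dist_y, n=5, m=200, p=2, seed=0):
    """Monte Carlo estimate of a Wasserstein-p distance between the two
    distance-matrix laws (the ground cost normalisation is an assumption)."""
    rng = np.random.default_rng(seed)
    a = sample_distance_matrices(dist_x, n, m, rng)
    b = sample_distance_matrices(dist_y, n, m, rng)
    # Cost of matching sampled matrix i of X to sampled matrix j of Y.
    cost = (np.abs(a[:, None, :] - b[None, :, :]) ** p).mean(axis=-1)
    # Both empirical laws are uniform on m atoms, so optimal transport
    # reduces to an optimal one-to-one assignment.
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].mean() ** (1.0 / p)
```

Because only distances enter the computation, a rotated copy of a point cloud should score close to zero (up to sampling noise), while a rescaled copy should score clearly higher. For graphs, a shortest-path distance matrix can be passed in place of the Euclidean one.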

In practice

Use DMW for scalable graph classification.
Apply DMW to two-sample testing.
Employ sliced/multi-scale DMW for efficiency.
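A sliced, multi-scale variant with an exponential kernel might look like the sketch below. It rests on several assumptions that the summary does not confirm: "multi-scale" is read as averaging over several matrix sizes n, "sliced" as projecting the flattened matrices onto random unit directions and comparing the one-dimensional projections with a sorted-sample W1, and the kernel as exp(-gamma * d). All names and default values are hypothetical, and the Monte Carlo estimate makes no claim to preserve positive definiteness exactly.

```python
import numpy as np


def matrix_law_sample(dist, n, m, rng):
    """m flattened n x n sub-matrices of `dist`, from i.i.d. uniform index draws."""
    idx = rng.integers(0, dist.shape[0], size=(m, n))
    sub = dist[idx[:, :, None], idx[:, None, :]]  # shape (m, n, n)
    iu = np.triu_indices(n, k=1)
    return sub[:, iu[0], iu[1]]


def sliced_w1(a, b, directions):
    """Average one-dimensional W1 over projections; a and b have equal sample size."""
    pa = np.sort(a @ directions.T, axis=0)  # sorted projections, one column per slice
    pb = np.sort(b @ directions.T, axis=0)
    return np.abs(pa - pb).mean()


def sliced_multiscale_dmw(dist_x, dist_y, scales=(3, 5, 8), m=256, n_slices=64, seed=0):
    """Average the sliced p=1 dissimilarity over several matrix sizes (assumed scales)."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for n in scales:
        dim = n * (n - 1) // 2
        # The same random unit directions are applied to both spaces.
        directions = rng.normal(size=(n_slices, dim))
        directions /= np.linalg.norm(directions, axis=1, keepdims=True)
        a = matrix_law_sample(dist_x, n, m, rng)
        b = matrix_law_sample(dist_y, n, m, rng)
        total += sliced_w1(a, b, directions)
    return total / len(scales)


def exponential_kernel(d, gamma=1.0):
    """Exponential kernel built on a dissimilarity value or array."""
    return np.exp(-gamma * d)
```

Sorting replaces the assignment problem, so the cost per scale is dominated by m log m sorts per slice rather than a cubic solver. The kernel values can be assembled into a Gram matrix and passed to a kernel classifier or a kernel two-sample test.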

Topics

Distance-Matrix Wasserstein
Gromov--Wasserstein
Optimal Transport
Graph Classification
Two-sample Testing

Best for: AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.