Inverse Entropic Optimal Transport Solves Semi-supervised Learning via Data Likelihood Maximization
Summary
A new learning paradigm integrates paired and unpaired data through data likelihood maximization to learn conditional distributions π*(y|x). The approach is closely connected to inverse entropic optimal transport (OT), which yields a lightweight learning algorithm for conditional distributions. The proposed loss function is theoretically equivalent to the inverse entropic OT problem and supports end-to-end learning. Experiments cover a Gaussian-to-Swiss Roll task with P=128 paired and Q=R=1024 unpaired samples, and a real-world weather prediction dataset with 192 paired and 500 unpaired samples. The framework learns conditional distributions even with modest paired data and outperforms baselines such as Conditional Normalizing Flows (CNF), Conditional GAN (CGAN), Unconditional GAN (UGAN)+ℓ², and regression.
Key takeaway
Machine learning engineers developing domain translation models with limited paired data should consider this inverse entropic optimal transport approach. It offers a non-minimax, likelihood-based objective that uses both paired and unpaired samples, mitigating overfitting and improving performance over traditional methods such as CNF or GANs. Because the framework learns conditional distributions robustly even with modest paired data, it is valuable in resource-constrained scenarios.
Key insights
Inverse entropic optimal transport provides a unified likelihood maximization framework for semi-supervised conditional distribution learning using mixed data.
Principles
- Data likelihood maximization can integrate paired and unpaired data.
- Inverse OT is equivalent to a likelihood maximization problem.
- Unpaired data significantly improves learning with limited paired samples.
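To see how one likelihood objective can absorb both kinds of data, it may help to write out the negative log-likelihood under an energy-style parameterization. The block below is a sketch under stated assumptions: the symbols c_θ (cost), f_θ (potential), Z_θ (normalizer), and ε (entropic regularization strength) are notation chosen here for illustration, and the paper's exact formulation may differ.

```latex
% Assumed (illustrative) parameterization of the conditional model:
\pi_\theta(y \mid x)
  = \frac{\exp\!\big( (f_\theta(y) - c_\theta(x, y)) / \varepsilon \big)}{Z_\theta(x)},
\qquad
Z_\theta(x) = \int \exp\!\big( (f_\theta(y) - c_\theta(x, y)) / \varepsilon \big)\, dy .

% Taking the expected negative log of the model under the true joint \pi^*:
-\,\mathbb{E}_{(x, y) \sim \pi^*} \log \pi_\theta(y \mid x)
  = \frac{1}{\varepsilon}\, \mathbb{E}_{(x, y) \sim \pi^*}\big[ c_\theta(x, y) \big]
  - \frac{1}{\varepsilon}\, \mathbb{E}_{y \sim \pi^*_y}\big[ f_\theta(y) \big]
  + \mathbb{E}_{x \sim \pi^*_x}\big[ \log Z_\theta(x) \big] .
```

Under this assumed form, only the first term needs paired samples. The second depends on the marginal of y alone and the third on the marginal of x alone, so each can be estimated from unpaired samples.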
Method
A novel loss function, derived from data likelihood maximization and linked to inverse entropic OT, is optimized using a Gaussian mixture parameterization of the cost and the dual potential. This parameterization gives closed-form expressions for the normalization constant and the conditional distributions.
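As a rough illustration of why a Gaussian mixture form can make normalization tractable, the sketch below computes the log-normalizer of an unnormalized sum of Gaussian bumps in closed form and checks it against a numerical integral in one dimension. The function name `log_normalizer` and all parameter values are hypothetical; this shows only the general idea, not the paper's actual parameterization.

```python
import numpy as np

def log_normalizer(log_a, sigma2, dim):
    """Closed-form log of the integral over y of
    sum_k a_k * exp(-||y - m_k||^2 / (2 * sigma2_k)).

    Each bump integrates to (2 * pi * sigma2_k)^(dim / 2), whatever its
    center m_k is, so the centers do not appear here.
    """
    log_terms = log_a + 0.5 * dim * np.log(2.0 * np.pi * sigma2)
    m = log_terms.max()  # log-sum-exp shift for numerical stability
    return m + np.log(np.exp(log_terms - m).sum())

# Hypothetical 1-D mixture used only to check the formula.
log_a = np.log(np.array([0.5, 2.0, 1.0]))   # log-amplitudes a_k
centers = np.array([-1.0, 0.5, 3.0])        # centers m_k
sigma2 = np.array([0.3, 1.0, 0.5])          # variances sigma2_k

closed = log_normalizer(log_a, sigma2, dim=1)

# Numerical check: Riemann sum of the unnormalized density on a wide, fine grid.
grid = np.linspace(-30.0, 30.0, 200001)
bumps = np.exp(log_a)[:, None] * np.exp(
    -((grid[None, :] - centers[:, None]) ** 2) / (2.0 * sigma2[:, None])
)
numeric = np.log(bumps.sum() * (grid[1] - grid[0]))

print(closed, numeric)
```

The two printed values should agree closely. The point is that no sampling or quadrature is needed at training time when the normalizer has this kind of closed form.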
In practice
- Apply Gaussian mixture parameterization for tractable OT solutions.
- Utilize stochastic gradient descent for empirical loss optimization.
- Sample conditional distributions directly from Gaussian mixtures.
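The last step above, sampling from a conditional Gaussian mixture, can be sketched as follows. This is a minimal example under assumptions: the mixture weights come from a softmax of a linear map of x, the component means are affine in x, and the covariances are isotropic. The names `W`, `A`, `b`, `std`, and both functions are hypothetical and not taken from the paper.

```python
import numpy as np

def conditional_mixture_params(x, W, A, b):
    """Weights and means of an assumed mixture pi(y | x) with K components.

    x: (dx,), W: (dx, K), A: (K, dy, dx), b: (K, dy).
    """
    logits = x @ W                   # (K,)
    logits = logits - logits.max()   # stabilize the softmax
    weights = np.exp(logits)
    weights = weights / weights.sum()
    means = A @ x + b                # (K, dy): one affine map per component
    return weights, means

def sample_conditional(x, W, A, b, std, n, rng):
    """Draw n samples of y given x: pick a component, then add Gaussian noise."""
    weights, means = conditional_mixture_params(x, W, A, b)
    comp = rng.choice(len(weights), size=n, p=weights)   # component indices
    noise = rng.standard_normal((n, means.shape[1]))
    return means[comp] + std[comp, None] * noise

# Usage with random, untrained parameters (illustration only).
rng = np.random.default_rng(0)
dx, dy, K = 2, 3, 4
W = rng.standard_normal((dx, K))
A = rng.standard_normal((K, dy, dx))
b = rng.standard_normal((K, dy))
std = np.full(K, 0.1)                # per-component isotropic std

x = np.array([0.5, -1.0])
weights, means = conditional_mixture_params(x, W, A, b)
samples = sample_conditional(x, W, A, b, std, n=1000, rng=rng)
print(samples.shape)
```

In a trained model the parameters would come from stochastic gradient descent on the empirical loss. Here they are random, so the samples only demonstrate the mechanics of direct sampling.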
Topics
- Semi-supervised Learning
- Optimal Transport
- Conditional Distributions
- Domain Translation
- Likelihood Maximization
- Gaussian Mixture Models
Best for: Research Scientist, AI Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.