OTCHA: Optimal Transport-driven Confidence-aware Latent Hub Alignment for Multi-View Medical Image Classification
Summary
OTCHA, an Optimal Transport-driven Confidence-aware Latent Hub Alignment module, refines patch tokens before fusion in multi-view medical image classification. Methods that fuse per-view representations directly often carry view-specific artifacts and irrelevant background cues into the fused embedding. OTCHA introduces learnable latent hub tokens shared across views and computes an Optimal Transport (OT) plan between patch tokens and hub tokens, with a cost that integrates feature similarity and geometry. The OT formulation includes token-conditional dustbins for partial matching, so irrelevant tokens can be discarded rather than forced onto a hub. The resulting transport plan yields a token-wise matching confidence, which gates hub-mediated message passing and weights a novel OT-based representation alignment loss that stabilizes refinement. Experiments on three multi-view medical image datasets show consistent improvements over competing baselines across diverse anatomies and view configurations. The code is available at https://github.com/labhai/OTCHA; the publication date is listed as 2026-06-18.
Key takeaway
Machine Learning Engineers building multi-view medical image classification systems should consider adding OTCHA's pre-fusion token refinement. Confidence-aware optimal transport and shared latent hub tokens keep irrelevant content out of the fused embedding, which makes models more robust to view-specific artifacts and unregistered inputs and supports more reliable diagnostic assistance. The code at https://github.com/labhai/OTCHA is a starting point for implementing these techniques.
Key insights
OTCHA refines multi-view medical image tokens using confidence-aware optimal transport and shared latent hubs to prevent fusion contamination.
Principles
- Pre-fusion token refinement improves multi-view robustness
- Optimal Transport aligns features and geometry
- Confidence-aware gating enhances message passing
Method
OTCHA computes an Optimal Transport plan between patch and shared latent hub tokens, augmented with dustbins for partial matching. This plan generates confidence scores to gate message passing and weight a representation alignment loss.
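The matching step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a cosine-distance cost only (no geometry term), a single dustbin column with a fixed scalar cost instead of token-conditional dustbins, and a fixed dustbin mass. The names `sinkhorn_with_dustbin`, `dustbin_cost`, `dustbin_mass`, and `eps` are hypothetical.

```python
import numpy as np


def _logsumexp(x, axis):
    """Numerically stable log-sum-exp along one axis."""
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))


def sinkhorn_with_dustbin(patches, hubs, dustbin_cost=0.5, dustbin_mass=0.3,
                          eps=0.1, n_iters=100):
    """Entropic OT between patch tokens (N, d) and hub tokens (K, d).

    Assumption: one extra "dustbin" column absorbs a fixed share of the
    patch mass, so poorly matching patches need not be assigned to a hub.
    Returns the (N, K+1) plan and a per-patch confidence in [0, 1].
    """
    n, k = patches.shape[0], hubs.shape[0]

    # Cosine-distance cost between L2-normalised tokens, in [0, 2].
    p = patches / np.linalg.norm(patches, axis=1, keepdims=True)
    h = hubs / np.linalg.norm(hubs, axis=1, keepdims=True)
    cost = 1.0 - p @ h.T

    # Append the dustbin column (constant cost here; assumed simplification).
    cost = np.concatenate([cost, np.full((n, 1), dustbin_cost)], axis=1)

    # Marginals: uniform over patches; hubs share what the dustbin leaves.
    log_a = np.full(n, -np.log(n))
    b = np.concatenate([np.full(k, (1.0 - dustbin_mass) / k), [dustbin_mass]])
    log_b = np.log(b)

    # Log-domain Sinkhorn iterations on the dual potentials f and g.
    log_kernel = -cost / eps
    f = np.zeros(n)
    g = np.zeros(k + 1)
    for _ in range(n_iters):
        f = log_a - _logsumexp(log_kernel + g[None, :], axis=1)
        g = log_b - _logsumexp(log_kernel + f[:, None], axis=0)

    plan = np.exp(log_kernel + f[:, None] + g[None, :])

    # Confidence: fraction of each patch's mass sent to real hubs.
    confidence = plan[:, :k].sum(axis=1) / plan.sum(axis=1)
    return plan, confidence


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    plan, conf = sinkhorn_with_dustbin(rng.normal(size=(16, 8)),
                                       rng.normal(size=(4, 8)))
    print(plan.shape, conf.round(2))
```

In this sketch a patch whose mass flows mostly into the dustbin column receives a confidence near 0, which is the kind of signal the summary describes as gating message passing and weighting the alignment loss.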
In practice
- Apply OT with dustbins for partial feature matching
- Use shared latent tokens for multi-view alignment
- Gate message passing with token-wise confidence
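As a rough illustration of the last two points, the sketch below passes messages from patches to shared hubs and back, scaling each patch's update by its confidence. The exact update rule in OTCHA is not given in this summary, so the residual form, the averaging over views, and the name `gated_hub_message_passing` are assumptions; the per-view plans and confidences are taken as given inputs.

```python
import numpy as np


def gated_hub_message_passing(views, plans, confidences, hubs):
    """One round of hub-mediated message passing (assumed form).

    views:       list of (N_v, d) patch-token arrays, one per view
    plans:       list of (N_v, K) patch-to-hub transport mass (no dustbin)
    confidences: list of (N_v,) values in [0, 1]
    hubs:        (K, d) hub tokens shared by all views
    """
    tiny = 1e-8

    # Patches -> hubs: each hub takes a plan-weighted average of patches,
    # then the per-view summaries are averaged so hubs stay shared.
    hub_updates = []
    for x, p in zip(views, plans):
        weights = p / (p.sum(axis=0, keepdims=True) + tiny)  # per-hub weights
        hub_updates.append(weights.T @ x)                    # (K, d)
    new_hubs = hubs + np.mean(hub_updates, axis=0)

    # Hubs -> patches: each patch reads a plan-weighted mix of hubs, and
    # the confidence gate scales how much of that message it accepts.
    refined = []
    for x, p, c in zip(views, plans, confidences):
        weights = p / (p.sum(axis=1, keepdims=True) + tiny)  # per-patch weights
        message = weights @ new_hubs                         # (N_v, d)
        refined.append(x + c[:, None] * message)
    return refined, new_hubs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    views = [rng.normal(size=(5, 4)), rng.normal(size=(7, 4))]
    plans = [rng.random((5, 3)), rng.random((7, 3))]
    confs = [rng.random(5), rng.random(7)]
    refined, new_hubs = gated_hub_message_passing(
        views, plans, confs, rng.normal(size=(3, 4)))
    print([r.shape for r in refined], new_hubs.shape)
```

Because the hubs are shared, views with different numbers of patches can exchange information without being registered to each other, and a patch with zero confidence is left unchanged by the gate.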
Topics
- Multi-View Medical Imaging
- Optimal Transport
- Medical Image Classification
- Latent Hub Alignment
- Token Refinement
- Medical Image Analysis
Code references
- https://github.com/labhai/OTCHA
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.