OTCHA: Optimal Transport-driven Confidence-aware Latent Hub Alignment for Multi-View Medical Image Classification

Source: Computer Vision and Pattern Recognition · Field: Technology & Digital, Artificial Intelligence & Machine Learning, Computer Vision · Depth: Expert, quick

Summary

OTCHA (Optimal Transport-driven Confidence-aware Latent Hub Alignment) is a module for multi-view medical image classification that refines patch tokens before fusion. Methods that fuse per-view representations directly often contaminate the fused embedding with view-specific artifacts and irrelevant background cues. OTCHA instead introduces learnable latent hub tokens shared across views and computes an Optimal Transport (OT) plan between patch tokens and hub tokens, with a cost that integrates feature similarity and geometry. The OT formulation includes token-conditional dustbins for partial matching, so irrelevant tokens can be discarded rather than forced onto a hub. The resulting transport plan yields a token-wise matching confidence, which gates hub-mediated message passing and weights a novel OT-based representation alignment loss that stabilizes refinement. Experiments on three multi-view medical image datasets show consistent improvements over competing baselines across diverse anatomies and view configurations. The code is available at https://github.com/labhai/OTCHA; the listed publication date is 2026-06-18.

Key takeaway

Machine learning engineers building multi-view medical image classification systems should consider adding OTCHA's pre-fusion token refinement. By combining confidence-aware optimal transport with shared latent hub tokens, it keeps irrelevant content out of the fused embedding, which makes models more robust to view-specific artifacts and unregistered inputs and may in turn support more reliable diagnostic predictions. The reference code is available at https://github.com/labhai/OTCHA.

Key insights

OTCHA refines multi-view medical image tokens using confidence-aware optimal transport and shared latent hubs to prevent fusion contamination.

Method

OTCHA computes an Optimal Transport plan between patch and shared latent hub tokens, augmented with dustbins for partial matching. This plan generates confidence scores to gate message passing and weight a representation alignment loss.
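
The mechanism described above can be illustrated with a small NumPy sketch. It is not the authors' implementation (see the linked repository for that); it is a minimal example under stated assumptions: cosine similarity is the only matching score (the geometry term is omitted), the dustbin is handled in the augmented-matrix style familiar from partial-matching OT, the hub-side dustbin score is fixed at 0, and the alignment loss is left out. The names `ot_plan_with_dustbin`, `gated_hub_refinement`, and `patch_dustbin` are hypothetical.

```python
import numpy as np

def logsumexp(x, axis):
    """Numerically stable log(sum(exp(x))) along one axis."""
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))

def ot_plan_with_dustbin(patches, hubs, patch_dustbin, eps=0.1, n_iters=100):
    """Return (plan, conf): plan is (N, K) patch-to-hub mass, conf is (N,) in [0, 1]."""
    n, k = patches.shape[0], hubs.shape[0]
    # Cosine similarity as the matching score (assumption: no geometry term).
    p = patches / np.linalg.norm(patches, axis=1, keepdims=True)
    h = hubs / np.linalg.norm(hubs, axis=1, keepdims=True)
    scores = np.zeros((n + 1, k + 1))
    scores[:n, :k] = p @ h.T
    scores[:n, k] = patch_dustbin  # per-token score for discarding a patch
    # Assumption: the hub-side dustbin row and the corner keep a score of 0.
    log_kernel = scores / eps
    # Each patch and each hub carries unit mass; the dustbins absorb the rest.
    log_a = np.log(np.concatenate([np.ones(n), [k]]))
    log_b = np.log(np.concatenate([np.ones(k), [n]]))
    u, v = np.zeros(n + 1), np.zeros(k + 1)
    for _ in range(n_iters):  # log-domain Sinkhorn iterations
        u = log_a - logsumexp(log_kernel + v[None, :], axis=1)
        v = log_b - logsumexp(log_kernel + u[:, None], axis=0)
    full = np.exp(log_kernel + u[:, None] + v[None, :])
    plan = full[:n, :k]
    matched = plan.sum(axis=1)
    # Confidence: share of a patch's mass sent to hubs instead of the dustbin.
    conf = matched / (matched + full[:n, k])
    return plan, conf

def gated_hub_refinement(patches, plan, conf):
    """Hub-mediated message passing, scaled per patch by its matching confidence."""
    tiny = 1e-12
    # Hubs aggregate patch features, weighted by the mass each patch sends them.
    to_hub = plan / (plan.sum(axis=0, keepdims=True) + tiny)
    hub_msg = to_hub.T @ patches                      # (K, D)
    # Patches read the hub summaries back through the same plan.
    to_patch = plan / (plan.sum(axis=1, keepdims=True) + tiny)
    update = to_patch @ hub_msg                       # (N, D)
    return patches + conf[:, None] * update

rng = np.random.default_rng(0)
patches = rng.normal(size=(6, 4))   # 6 patch tokens, feature dimension 4
hubs = rng.normal(size=(3, 4))      # 3 shared latent hub tokens
plan, conf = ot_plan_with_dustbin(patches, hubs, patch_dustbin=np.zeros(6))
refined = gated_hub_refinement(patches, plan, conf)
```

In this sketch, raising a patch's dustbin score pushes more of its mass to the dustbin, which lowers its confidence and shrinks the update it receives from the hubs. In a trained model the dustbin scores and hub tokens would be learned, and the same confidence values could also weight a per-token alignment loss.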

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.