Geometry-Aware Uncertainty Coresets for Robust Visual In-Context Learning in Histopathology

· Source: Computer Vision and Pattern Recognition · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Computer Vision & Pattern Recognition, Health & Medical Research · Depth: Expert, quick

Summary

GAUC (Geometry-Aware Uncertainty Coresets) is a new training-free coreset selection method for improving visual in-context learning (ICL) in computational histopathology with vision-language models (VLMs). VLMs are promising for clinical reasoning, but fine-tuning them is difficult when expert-labeled data is limited, and ICL is sensitive to which examples are selected. GAUC addresses both problems by operating directly in the pre-trained multimodal embedding space. It optimizes for distributional fidelity via Maximum Mean Discrepancy, bounds the performance degradation caused by prompt paraphrases with an Effective Mutual Information Difference regularizer, and suppresses overconfident outputs with a predictive-variance penalty. On the CRC-100K and MHIST datasets, across several open-source VLM architectures, GAUC consistently improved accuracy, calibration, and prompt robustness over existing ICL selection methods and dataset-distillation baselines, all without any gradient updates.

Key takeaway

For AI scientists and research scientists building VLM applications in histopathology, GAUC offers a training-free way to improve in-context learning. In the reported experiments on CRC-100K and MHIST, it improved accuracy, calibration, and robustness to prompt paraphrases over existing ICL selection methods and dataset-distillation baselines, so it is worth evaluating in a VLM workflow before committing to costly fine-tuning on scarce expert data.

Key insights

GAUC improves VLM in-context learning for histopathology by selecting robust coresets without parameter updates.

Method

GAUC jointly optimizes Maximum Mean Discrepancy for distributional fidelity, Effective Mutual Information Difference for prompt robustness, and a predictive-variance penalty for output stability, all within the VLM's pre-trained multimodal embedding space.
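
The summary does not state GAUC's exact objective or search procedure, so the following is only a minimal sketch of how the three terms might be combined. It assumes a greedy search, an RBF kernel, and the standard biased estimator of squared Maximum Mean Discrepancy between the candidate pool and the selected subset. The arrays `robust_pen` and `var_pen` are hypothetical precomputed per-example scores standing in for the Effective Mutual Information Difference regularizer and the predictive-variance penalty. The weights `lam_r` and `lam_v` and the bandwidth `gamma` are likewise illustrative assumptions, not values from the paper.

```python
import numpy as np


def rbf_kernel(X, Y, gamma):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)


def mmd2(K, idx):
    """Biased estimate of squared MMD between the full pool and the subset idx.

    K is the precomputed pool-by-pool kernel matrix.
    """
    idx = np.asarray(idx)
    return K.mean() - 2.0 * K[:, idx].mean() + K[np.ix_(idx, idx)].mean()


def select_coreset(emb, k, robust_pen, var_pen, lam_r=0.1, lam_v=0.1, gamma=1.0):
    """Greedily pick k pool indices that minimize MMD^2 plus weighted penalties.

    emb        : (n, d) embeddings of the candidate pool (assumed precomputed)
    robust_pen : (n,) hypothetical prompt-sensitivity score per example
    var_pen    : (n,) hypothetical predictive-variance score per example
    """
    K = rbf_kernel(emb, emb, gamma)
    chosen = []
    for _ in range(k):
        best_j, best_score = None, np.inf
        for j in range(len(emb)):
            if j in chosen:
                continue
            trial = chosen + [j]
            score = (
                mmd2(K, trial)
                + lam_r * robust_pen[trial].mean()
                + lam_v * var_pen[trial].mean()
            )
            if score < best_score:
                best_j, best_score = j, score
        chosen.append(best_j)
    return chosen
```

Calling `select_coreset(emb, k, robust_pen, var_pen)` returns a list of `k` pool indices to use as in-context examples. No gradients are involved: the only inputs are frozen embeddings and per-example scores, which matches the training-free setting the summary describes, although the actual GAUC optimizer may differ.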

Best for: AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.