In-Context Multiple Instance Learning

2026-06-06 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, extended

Summary

In-Context Multiple Instance Learning (ICMIL) is a framework for Multiple Instance Learning (MIL) in low-label settings, which are common in computational pathology and satellite imagery. It pretrains a Perceiver-style architecture on diverse synthetic bag-structured data, so the model can solve a new MIL task from a handful of labeled bags in a single forward pass, without gradient updates. The authors compared several synthetic data generators (priors). The model pretrained on a mixture of these priors, termed ICMIL, achieved the best average AUROC (84.17) and the best average rank (3.62) across twelve MIL benchmarks. It outperforms supervised baselines that require task-specific training and hyperparameter tuning, and it shows better sample efficiency and lower variance in low-sample regimes. The architecture addresses scalability, task-dependent instance compression, and permutation invariance for hierarchical set inputs (a task is a set of bags, and each bag is a set of instances).
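As a reading aid for the two headline numbers, the sketch below shows one common way to compute AUROC and an average rank across benchmarks. It is an illustration, not the paper's evaluation code: the function names are hypothetical, the toy values are placeholders rather than reported results, and the 84.17 figure is presumably AUROC on a 0 to 100 scale.

```python
import numpy as np

def auroc(labels, scores):
    """AUROC as the probability that a random positive bag scores above
    a random negative bag (ties count as one half)."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]  # all positive/negative pairs
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))

def average_rank(auroc_table):
    """auroc_table has shape (n_benchmarks, n_methods).
    Rank 1 is the best method on a benchmark; ties are broken by column order."""
    order = np.argsort(-auroc_table, axis=1)  # method indices, best first
    ranks = np.empty_like(order)
    rows = np.arange(auroc_table.shape[0])[:, None]
    ranks[rows, order] = np.arange(1, auroc_table.shape[1] + 1)
    return ranks.mean(axis=0)

# Toy placeholder values, NOT results from the paper.
toy_auroc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
toy_table = np.array([[0.90, 0.80, 0.70],
                      [0.60, 0.85, 0.70]])
toy_ranks = average_rank(toy_table)
```

A lower average rank means a method is consistently near the top across benchmarks, which is why it is often reported alongside the mean AUROC.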

Key takeaway

For machine learning engineers working on Multiple Instance Learning problems with scarce labeled data, ICMIL offers an alternative to traditional supervised methods. Consider in-context learning with synthetic-data pretraining to get robust performance while reducing reliance on task-specific training and hyperparameter tuning. In the reported experiments this approach improved sample efficiency and reduced variance, which suits applications where labels are limited.

Key insights

Pretraining a Perceiver-style in-context learner on synthetic MIL data enables few-shot classification without gradient updates.

Principles

Pretraining on diverse synthetic priors captures complementary inductive biases.
Mixing priors yields a robust generalist model for varied MIL tasks.
In-context learning mitigates overfitting and high variance in low-label regimes.

Method

ICMIL uses a Perceiver-style architecture with iterative instance aggregation and inter-bag attention. It is pretrained on synthetic bag-structured data drawn from factorized and joint priors, then classifies the bags of a new task in a single forward pass.
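The instance aggregation step can be illustrated with a minimal NumPy sketch of Perceiver-style cross-attention, in which a small latent array repeatedly attends over the instances of a bag. This is not the ICMIL implementation: the function names, dimensions, single-head attention, shared weights across iterations, and random weights are all assumptions for illustration, and inter-bag attention is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, keys_values, w_q, w_k, w_v):
    """Single-head cross-attention: queries (m, d) attend over keys_values (n, d)."""
    q = queries @ w_q
    k = keys_values @ w_k
    v = keys_values @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def encode_bag(instances, latents, params, n_iters=2):
    """Compress a variable-size bag (n, d) into a fixed-size latent array (m, d).
    Hypothetical helper: the latents re-read the bag n_iters times."""
    z = latents
    for _ in range(n_iters):
        z = z + cross_attend(z, instances, *params)  # residual update
    return z

rng = np.random.default_rng(0)
d, m = 16, 4  # feature width and number of latents (assumed values)
params = tuple(rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3))
latents = rng.normal(size=(m, d))

bag = rng.normal(size=(50, d))  # one bag of 50 instances
z = encode_bag(bag, latents, params)
z_shuffled = encode_bag(bag[rng.permutation(len(bag))], latents, params)
```

Because the softmax-weighted sum runs over the instances, shuffling them leaves the output unchanged up to floating-point error. The output size depends only on the number of latents, so the cost of later layers does not grow with bag size.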

In practice

Design synthetic data generators to align with target MIL task structures.
Combine different prior types (factorized, joint) for broader applicability.
Use Perceiver-style architectures for hierarchical set inputs in MIL.
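The first two points can be sketched as a toy synthetic task sampler that mixes two kinds of bag generator. The summary does not define the factorized and joint priors, so both generators below are hypothetical stand-ins: one labels a bag from independently drawn instances, the other from a bag-level latent shared by all instances. Every name and parameter is an assumption.

```python
import numpy as np

def sample_factorized_bag(rng, n_instances, dim, witness_rate=0.1):
    """Hypothetical 'factorized' generator: instances are drawn independently
    and the bag is positive if at least one instance is a witness."""
    is_witness = rng.random(n_instances) < witness_rate
    x = rng.normal(size=(n_instances, dim))
    x[is_witness, 0] += 3.0  # witnesses are shifted along feature 0
    return x, int(is_witness.any())

def sample_joint_bag(rng, n_instances, dim, threshold=0.5):
    """Hypothetical 'joint' generator: a bag-level latent shifts every instance,
    so the label depends on the bag as a whole rather than on one instance."""
    shift = rng.normal()
    x = rng.normal(size=(n_instances, dim))
    x[:, 0] += shift
    return x, int(x[:, 0].mean() > threshold)

def sample_task(rng, n_bags=8, dim=8, p_factorized=0.5):
    """Draw one synthetic task: pick a generator from the mixture,
    then sample variable-size labeled bags from it."""
    use_factorized = rng.random() < p_factorized
    sampler = sample_factorized_bag if use_factorized else sample_joint_bag
    bags, labels = [], []
    for _ in range(n_bags):
        n = int(rng.integers(5, 30))  # bag sizes in [5, 30)
        x, y = sampler(rng, n, dim)
        bags.append(x)
        labels.append(y)
    return bags, np.array(labels)

rng = np.random.default_rng(0)
bags, labels = sample_task(rng)
```

In a pretraining loop, each call to `sample_task` would yield a fresh task, with some bags shown to the model as labeled context and the rest held out as queries. The mixing weight `p_factorized` controls how often each generator family is seen.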

Topics

Multiple Instance Learning
In-Context Learning
Perceiver Architecture
Synthetic Data Generation
Few-Shot Learning
Low-Label Regime

Code references

injurise/ICMIL

Best for: Research Scientist, Computer Vision Engineer, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.