QG-MIL: A Gated Transformer Aggregator for Domain-Agnostic Multiple Instance Learning in Medical Imaging
Summary
QG-MIL introduces a gated transformer aggregator that targets attention concentration in the Multiple Instance Learning (MIL) aggregators used in medical imaging, a failure mode that often produces overconfident and unstable predictions. The architecture combines four components: RMSNorm-based pre-normalization, per-head QK normalization, fine-grained attention output gating, and SwiGLU-style feed-forward modules. Together they stabilize training and spread attention more evenly across instances without auxiliary losses or multi-stage regularization. Across six benchmarks covering whole-slide pathology and cell-level hematology, QG-MIL variants consistently outperformed leading baselines, with an average improvement of +6.1 mean macro F1 points. The reported results also show consistent cross-domain performance and reduced variance.
Key takeaway
For machine learning engineers building medical imaging diagnostics, QG-MIL addresses a common failure mode of attention-based Multiple Instance Learning. If your current attention-based MIL models produce overconfident or unstable predictions, consider swapping in QG-MIL's gated transformer aggregator. The design is supported by an average improvement of +6.1 mean macro F1 points across six benchmarks, and it yields more evenly distributed attention and more consistent cross-domain performance, both of which matter for diagnostic reliability.
Key insights
QG-MIL stabilizes attention in MIL aggregators for medical imaging, improving prediction stability and cross-domain performance.
Principles
- Attention concentration causes unstable MIL predictions.
- Architectural components can stabilize transformer attention.
- Distributed attention improves cross-domain performance.
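One way to make "attention concentration" concrete is to measure how spread out a bag's attention weights are. The sketch below uses normalized Shannon entropy for this; that metric and the function name `normalized_attention_entropy` are illustrative assumptions, not something taken from the QG-MIL paper. A value near 1 means attention is spread evenly across instances, and a value near 0 means it has collapsed onto a few of them.

```python
import numpy as np

def normalized_attention_entropy(weights):
    """Entropy of an attention distribution divided by log(N), so it lies in [0, 1].

    Hypothetical diagnostic (not from the paper). Expects a 1-D array of
    non-negative weights over N > 1 instances.
    """
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()                  # renormalize so the weights sum to 1
    nonzero = w[w > 0]               # treat 0 * log(0) as 0
    entropy = -np.sum(nonzero * np.log(nonzero))
    return float(entropy / np.log(len(w)))

# Two toy bags of 100 instances each.
uniform = np.full(100, 1 / 100)                   # attention spread evenly
peaked = np.array([0.97] + [0.03 / 99] * 99)      # one instance dominates

print(normalized_attention_entropy(uniform))      # close to 1
print(normalized_attention_entropy(peaked))       # close to 0
```

Tracking a statistic like this per bag during training is one possible way to check whether an aggregator's attention is collapsing.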
Method
QG-MIL integrates RMSNorm pre-normalization, per-head QK normalization, fine-grained attention output gating, and SwiGLU-style feed-forward modules to stabilize attention.
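The four components named above can be sketched as the forward pass of a single transformer block over one bag of instance embeddings. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the gate is assumed to be an elementwise sigmoid computed from the normalized block input, the attention logits are assumed to be scaled by 1/sqrt(head_dim), and all names (`init_params`, `gated_block`, the parameter keys) are hypothetical. Bag-level pooling and the classifier head are omitted because the summary does not describe them.

```python
import numpy as np

def rms_norm(x, gain, eps=1e-6):
    # RMSNorm: rescale by the root mean square of the last axis, no mean-centering.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * gain

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def silu(x):
    return x * sigmoid(x)

def init_params(dim, heads, hidden, rng):
    # Hypothetical parameter layout, chosen only for this illustration.
    head_dim = dim // heads
    w = lambda *shape: rng.normal(0.0, 0.02, size=shape)
    return {
        "heads": heads,
        "norm1": np.ones(dim), "norm2": np.ones(dim),
        "wq": w(dim, dim), "wk": w(dim, dim), "wv": w(dim, dim),
        "wg": w(dim, dim), "wo": w(dim, dim),
        "q_gain": np.ones(head_dim), "k_gain": np.ones(head_dim),
        "w_gate": w(dim, hidden), "w_up": w(dim, hidden), "w_down": w(hidden, dim),
    }

def gated_block(x, p):
    """x: (n_instances, dim) embeddings for one bag. Returns (output, attention)."""
    n, dim = x.shape
    h = p["heads"]
    d = dim // h
    split = lambda t: t.reshape(n, h, d).transpose(1, 0, 2)   # -> (heads, n, head_dim)

    # 1. RMSNorm pre-normalization before the attention sub-layer.
    y = rms_norm(x, p["norm1"])

    # 2. Per-head QK normalization: normalize queries and keys within each head.
    q = rms_norm(split(y @ p["wq"]), p["q_gain"])
    k = rms_norm(split(y @ p["wk"]), p["k_gain"])
    v = split(y @ p["wv"])
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d))     # (heads, n, n)
    out = (attn @ v).transpose(1, 0, 2).reshape(n, dim)

    # 3. Output gating (assumed form): elementwise sigmoid gate on the attention output.
    out = out * sigmoid(y @ p["wg"])
    x = x + out @ p["wo"]                                     # residual connection

    # 4. SwiGLU-style feed-forward: SiLU-gated product of two linear projections.
    z = rms_norm(x, p["norm2"])
    x = x + (silu(z @ p["w_gate"]) * (z @ p["w_up"])) @ p["w_down"]
    return x, attn

rng = np.random.default_rng(0)
params = init_params(dim=32, heads=4, hidden=64, rng=rng)
bag = rng.normal(size=(10, 32))      # toy bag: 10 instances, 32-dim features
out, attn = gated_block(bag, params)
print(out.shape, attn.shape)
```

Because the queries and keys are normalized per head, the attention logits stay bounded regardless of the scale of the input features, which is one plausible reason this family of designs resists attention collapse.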
In practice
- Apply QG-MIL to whole-slide pathology.
- Use QG-MIL for cell-level hematology.
- Replace an attention-based MIL aggregator with QG-MIL when predictions are overconfident or unstable.
Topics
- Multiple Instance Learning
- Medical Imaging
- Transformer Aggregators
- Attention Mechanisms
- Pathology
- Hematology
- Computer Vision
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.