From Point Estimates to Distributions: GMM Pooling for MIL in Preterm Birth Prediction
Summary
A new Gaussian Mixture Model (GMM) pooling method is proposed for Multiple Instance Learning (MIL) in preterm birth (PTB) prediction, addressing the limitation of using single ultrasound frames. This approach formulates PTB prediction as an MIL problem, where each patient is represented by a variable-sized bag of transvaginal ultrasound (TVUS) images. Unlike standard MIL aggregators that produce point estimates, GMM pooling models the feature distribution of all images within a bag, summarizing them into a fixed-length representation to capture intra-patient variability. Evaluated on a private clinical cohort, GMM pooling improved the PR-AUC for PTB prediction from 0.44 to 0.56 over an instance-based model. The method also demonstrated strong performance on a public lymph node metastasis benchmark, achieving a 0.91 F1-score and 0.89 ROC-AUC for classification, and 0.18 MAE for regression. The code is publicly available.
Key takeaway
For AI scientists developing medical diagnostic models from multi-image data, consider implementing GMM pooling in your Multiple Instance Learning pipelines. By modeling the feature distribution across a patient's images, the method captures intra-patient variability; on the private clinical cohort it raised PR-AUC for preterm birth prediction from 0.44 to 0.56 over an instance-based model. It is worth exploring on other variable-sized image datasets to move beyond single-frame analyses.
Key insights
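GMM pooling extends MIL by modeling the distribution of features across a patient's images rather than collapsing them to a point estimate. The reported gains are in PR-AUC for preterm birth prediction and in F1-score, ROC-AUC, and MAE on a lymph node metastasis benchmark.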
Principles
- Modeling intra-patient variability can improve diagnostic models.
- Distributional summaries of a bag can outperform point-estimate aggregation.
- Multiple instance learning suits variable-sized image sets.
Method
GMM pooling summarizes variable-sized bags of TVUS images into fixed-length representations by modeling their feature distribution, moving beyond single-frame point estimates in MIL.
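The idea of turning a variable-sized bag into a fixed-length distributional summary can be illustrated with a minimal sketch. This is not the authors' implementation: the summary above does not say how the mixture is parameterized or trained, so the sketch assumes a diagonal-covariance GMM fitted per bag with plain EM, and a representation formed by concatenating the mixture weights, means, and variances. The function name `gmm_pool` and all of its parameters are hypothetical, and the instance features are random stand-ins for per-image embeddings.

```python
import numpy as np


def gmm_pool(bag, n_components=2, n_iter=50, eps=1e-6, seed=0):
    """Summarize a (n_instances, dim) bag as a fixed-length vector.

    Illustrative assumption: fit a diagonal-covariance GMM with EM and
    concatenate weights, means, and variances. Hypothetical helper,
    not the method's published code.
    """
    bag = np.asarray(bag, dtype=np.float64)
    n, d = bag.shape
    k = n_components
    rng = np.random.default_rng(seed)

    # Initialise means from randomly chosen instances (sampling with
    # replacement only if the bag has fewer instances than components).
    idx = rng.choice(n, size=k, replace=n < k)
    means = bag[idx].copy()
    variances = np.tile(bag.var(axis=0) + eps, (k, 1))
    weights = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: responsibilities of each component for each instance.
        diff = bag[:, None, :] - means[None, :, :]            # (n, k, d)
        log_prob = -0.5 * (
            np.sum(diff ** 2 / variances[None], axis=2)
            + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
        )                                                      # (n, k)
        log_resp = np.log(weights)[None, :] + log_prob
        log_resp -= log_resp.max(axis=1, keepdims=True)        # stabilise exp
        resp = np.exp(log_resp)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-estimate weights, means, and diagonal variances.
        nk = resp.sum(axis=0) + eps                            # (k,)
        weights = nk / nk.sum()
        means = (resp.T @ bag) / nk[:, None]
        second_moment = (resp.T @ bag ** 2) / nk[:, None]
        variances = np.maximum(second_moment - means ** 2, eps)

    # Order components by descending weight so the output depends less
    # on the arbitrary component labelling.
    order = np.argsort(-weights)
    return np.concatenate(
        [weights[order], means[order].ravel(), variances[order].ravel()]
    )


# Two bags of different sizes map to vectors of the same length:
# k + 2 * k * d = 2 + 2 * 2 * 8 = 34 entries.
_rng = np.random.default_rng(0)
v_small = gmm_pool(_rng.normal(size=(5, 8)))
v_large = gmm_pool(_rng.normal(size=(40, 8)))
```

The fixed-length output could then be fed to any downstream classifier or regressor. An end-to-end trainable variant would need a differentiable formulation, which this sketch does not attempt.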
In practice
- Apply GMM pooling to multi-image medical datasets.
- Use for preterm birth prediction from TVUS.
- Implement for lymph node metastasis classification.
Topics
- Gaussian Mixture Models
- Multiple Instance Learning
- Preterm Birth Prediction
- Medical Diagnostics
- Ultrasound Imaging
Best for: Computer Vision Engineer, AI Scientist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.