Mitosis Detection in the Wild: Multi-Tumor and Context-Aware Generalization in the MIDOG 2025 Challenge

2026-06-08 · Source: cs.CV updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Medical Image Analysis · Depth: Expert, extended

Summary

The MItosis DOmain Generalization (MIDOG) 2025 challenge evaluated automated mitotic figure detection and atypical mitotic figure classification across unprecedented biological and contextual diversity. It featured two tracks, utilizing a comprehensive test dataset of 365 cases from 12 distinct human, canine, and feline tumor types, digitized across multiple scanning platforms. Crucially, evaluation extended beyond traditional hand-selected hotspots to include random tissue areas and challenging regions rich in imposters. For Track 1 (detection), 18 teams achieved $F_{1}$ scores up to 0.740. Track 2 (classification) saw 21 submissions with balanced accuracy up to 0.908. Analysis revealed significant performance degradation in challenging regions, with a 208% increase in false positive rates, and variability across tumor types. Ensembling consistently improved $F_{1}$ by 1.5 percentage points and balanced accuracy by 1.3 percentage points, while test-time augmentation showed no relevant improvement.
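To make the reported numbers concrete, here is a minimal sketch of the two metrics. It assumes the standard count-based definitions of $F_{1}$ and balanced accuracy; the summary does not spell out the challenge's exact evaluation code. The function names are illustrative, and the last helper only shows the arithmetic of reading a "208% increase" as a relative increase over a baseline rate, which is our reading of the summary.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1 from detection counts: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of sensitivity and specificity for a binary classifier."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def increase_to_ratio(percent_increase: float) -> float:
    """Convert a relative increase in percent to a multiplicative factor."""
    return 1 + percent_increase / 100


# Toy counts for illustration only (not challenge data).
print(f1_score(tp=8, fp=2, fn=2))
print(balanced_accuracy(tp=9, fn=1, tn=80, fp=20))
# Under the relative-increase reading, 208% is roughly a 3.08x rate.
print(increase_to_ratio(208))
```

Balanced accuracy suits the atypical-versus-normal classification task when one class is much rarer than the other, because it weights both classes equally.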

Key takeaway

AI scientists and machine learning engineers building computational pathology tools should prioritize training data that covers random and challenging tissue regions, not just hotspots. Models in this challenge degraded markedly outside curated areas, with false positive rates rising by 208% in challenging regions, which limits their clinical reliability for whole slide image analysis. Validate across varied contexts and consider ensembling to improve generalization; test-time augmentation showed no relevant improvement in this challenge.

Key insights

Automated mitosis detection models exhibit significant generalization gaps "in the wild" beyond curated hotspot regions.

Principles

Clinical reliability demands multi-contextual evaluation.
Biological diversity exposes model "blind spots."
Ensembling consistently improves model robustness.

Method

The MIDOG 2025 challenge evaluated mitotic figure detection and atypical mitotic figure classification under a multi-contextual framework covering hotspot, random, and challenging regions of interest (ROIs). Teams submitted Docker containers for reproducibility.
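A multi-contextual evaluation of this kind can be sketched as a per-context breakdown of detection counts. The record layout (`context`, `tp`, `fp`, `fn`) and the function name are hypothetical. The sketch also assumes detections have already been matched to ground truth, a step the summary does not describe.

```python
from collections import defaultdict


def f1_by_context(roi_results):
    """Pool TP/FP/FN counts per ROI context and compute F1 for each context.

    roi_results: iterable of dicts with keys "context", "tp", "fp", "fn".
    """
    totals = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for roi in roi_results:
        for key in ("tp", "fp", "fn"):
            totals[roi["context"]][key] += roi[key]

    scores = {}
    for context, c in totals.items():
        denom = 2 * c["tp"] + c["fp"] + c["fn"]
        scores[context] = 2 * c["tp"] / denom if denom else 0.0
    return scores


# Toy counts for illustration only (not challenge data).
example = [
    {"context": "hotspot", "tp": 8, "fp": 2, "fn": 2},
    {"context": "random", "tp": 6, "fp": 3, "fn": 3},
    {"context": "challenging", "tp": 3, "fp": 5, "fn": 1},
    {"context": "challenging", "tp": 2, "fp": 3, "fn": 1},
]
for context, score in f1_by_context(example).items():
    print(f"{context}: F1 = {score:.3f}")
```

Reporting one score per context, rather than a single pooled score, is what exposes a gap between hotspot and non-hotspot performance.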

In practice

Validate models on diverse tissue regions.
Consider ensembling for robustness.
Re-evaluate the utility of test-time augmentation (TTA) in pathology.
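As one possible way to act on the ensembling recommendation, the sketch below averages per-sample probabilities from several classifiers before thresholding. This is an assumption made for illustration: the summary does not say which ensembling scheme the participating teams used, and the function names and the 0.5 threshold are hypothetical.

```python
def ensemble_probabilities(per_model_probs):
    """Average each sample's probability across models.

    per_model_probs: list of equal-length lists, one list per model.
    """
    n_models = len(per_model_probs)
    return [sum(sample) / n_models for sample in zip(*per_model_probs)]


def classify(probs, threshold=0.5):
    """Label a sample 1 (e.g. atypical) when its probability meets the threshold."""
    return [int(p >= threshold) for p in probs]


# Toy probabilities for three samples from three models (illustrative only).
model_a = [0.9, 0.4, 0.2]
model_b = [0.7, 0.6, 0.1]
model_c = [0.8, 0.3, 0.4]

averaged = ensemble_probabilities([model_a, model_b, model_c])
print(classify(averaged))
```

In this toy case the second sample is labelled positive by `model_b` alone, and averaging pulls it back below the threshold; damping single-model errors in this way is the intuition behind the robustness gain.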

Topics

Mitosis Detection
Computational Pathology
Domain Generalization
Atypical Mitotic Figures
Whole Slide Imaging
Machine Learning Challenges

Best for: Computer Vision Engineer, AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.