Struggling with Overfitting on Medical Imaging Task [D]

2026-05-15 · Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Software Development & Engineering · Depth: Advanced, short

Summary

A user reports severe overfitting on a binary classification task (LCA vs. RCA coronary arteries) using 2D X-ray angiograms. The model, an InceptionV3 fine-tuned from ImageNet weights, reaches 95-99% training accuracy, but validation accuracy peaks at 74-79% and then collapses to 30-40%. The dataset is small: ~900 training frames from ~240 unique DICOMs and 227 validation frames from 73 independent DICOMs. The issue persists despite normalization, class weights, dropout (0.3-0.6), weight decay (1e-4), basic augmentations (flips, rotations, translations), and a ReduceLROnPlateau scheduler. Unfreezing only InceptionV3's top Mixed_7 blocks gave the best validation accuracy, 76.65%; both full unfreezing and training only the classifier head performed worse.

Key takeaway

For machine learning engineers building medical imaging classifiers on small datasets: an InceptionV3 fine-tuning setup like this one likely uses a learning rate that is too high for the backbone. Train only the classifier head first, then unfreeze the top backbone layers with a much lower learning rate (e.g., 1e-5 or lower) so that the pre-trained features are not destroyed and overfitting is reduced.

Key insights

Small medical imaging datasets often lead to severe overfitting, especially with high-capacity models such as InceptionV3.

Principles

Training and validation sets must be independent for validation metrics to be meaningful.
Pre-trained features may not transfer well to grayscale medical images.
High learning rates can destroy pre-trained weights during fine-tuning.
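
To make the independence principle concrete, here is a minimal sketch of a split that groups frames by patient before dividing them, so that no patient contributes frames to both sets. The `(frame_id, patient_id)` record layout, the function name `split_by_patient`, and the toy data are assumptions for illustration; in practice the patient identifier would come from the DICOM metadata.

```python
import random
from collections import defaultdict


def split_by_patient(frames, val_fraction=0.2, seed=0):
    """Split (frame_id, patient_id) pairs so no patient appears in both sets."""
    by_patient = defaultdict(list)
    for frame_id, patient_id in frames:
        by_patient[patient_id].append(frame_id)

    # Shuffle patients, not frames, so all frames of a patient stay together.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    n_val = max(1, round(len(patients) * val_fraction))
    val_patients = set(patients[:n_val])

    train = [f for p in patients[n_val:] for f in by_patient[p]]
    val = [f for p in patients[:n_val] for f in by_patient[p]]
    return train, val, val_patients


# Hypothetical data: 10 patients with 3 frames each.
frames = [(f"p{p}_f{i}", f"p{p}") for p in range(10) for i in range(3)]
train_ids, val_ids, val_patients = split_by_patient(frames)
print(len(train_ids), len(val_ids), sorted(val_patients))
```

Splitting at the frame level instead would let near-identical frames from one acquisition land on both sides, which inflates validation accuracy.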

Method

When fine-tuning, first train only the classification layers, then unfreeze the backbone and significantly drop the learning rate to preserve pre-trained weights.
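
The two-stage procedure can be sketched in PyTorch as follows. This is an illustration under stated assumptions, not the original poster's code: `TinyNet` is a hypothetical stand-in for the backbone (so the example runs without downloading weights), the helpers `set_trainable` and `make_optimizer` are names introduced here, and the learning rates are example values apart from the 1e-5 suggested above.

```python
import torch
from torch import nn


def set_trainable(model, prefixes):
    """Freeze all parameters except those whose name starts with a given prefix."""
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(tuple(prefixes))


def make_optimizer(model, head_prefix, head_lr, backbone_lr=1e-5):
    """AdamW with one LR for the head and a lower LR for unfrozen backbone layers."""
    head, backbone = [], []
    for name, param in model.named_parameters():
        if param.requires_grad:
            (head if name.startswith(head_prefix) else backbone).append(param)
    groups = [{"params": head, "lr": head_lr}]
    if backbone:
        groups.append({"params": backbone, "lr": backbone_lr})
    return torch.optim.AdamW(groups, weight_decay=1e-4)


class TinyNet(nn.Module):
    """Hypothetical stand-in: early layers, a top block, and a classifier head."""

    def __init__(self):
        super().__init__()
        self.stem = nn.Linear(8, 8)
        self.Mixed_7 = nn.Linear(8, 8)
        self.fc = nn.Linear(8, 2)

    def forward(self, x):
        return self.fc(torch.relu(self.Mixed_7(torch.relu(self.stem(x)))))


model = TinyNet()

# Stage 1: train only the classifier head.
set_trainable(model, ["fc"])
stage1_opt = make_optimizer(model, "fc", head_lr=1e-3)

# Stage 2: also unfreeze the top block, with a much lower backbone LR.
set_trainable(model, ["fc", "Mixed_7"])
stage2_opt = make_optimizer(model, "fc", head_lr=1e-4, backbone_lr=1e-5)
```

With a torchvision InceptionV3, the same prefixes (`fc` for the head, `Mixed_7` for the top blocks) should match its parameter names, but it is worth printing `named_parameters()` to confirm before relying on them.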

In practice

Ensure validation sets are strictly patient-independent.
Consider self-supervised contrastive pre-training with a framework like SimCLR.
Log gradient magnitudes to detect training instability.
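
For the gradient-logging suggestion, a minimal PyTorch sketch is shown below. The helper name `grad_norms`, the small stand-in model, and the random batch are assumptions for illustration; in a real training loop the call would go right after `loss.backward()` and the values would be sent to whatever logger is in use.

```python
import torch
from torch import nn


def grad_norms(model):
    """Return {parameter_name: L2 norm of its gradient} for parameters with a gradient."""
    return {
        name: param.grad.norm().item()
        for name, param in model.named_parameters()
        if param.grad is not None
    }


torch.manual_seed(0)
model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 2))
inputs = torch.randn(4, 8)           # hypothetical batch of 4 feature vectors
labels = torch.tensor([0, 1, 0, 1])  # hypothetical binary labels

loss = nn.functional.cross_entropy(model(inputs), labels)
loss.backward()

norms = grad_norms(model)
total = sum(v ** 2 for v in norms.values()) ** 0.5  # global gradient norm
for name, value in norms.items():
    print(f"{name}: {value:.4f}")
print(f"total: {total:.4f}")
```

Sudden spikes in the per-layer or global norm around the epoch where validation accuracy collapses would point toward an unstable learning rate rather than plain memorization.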

Topics

Overfitting
Medical Imaging
X-ray Angiography
InceptionV3
Transfer Learning

Best for: Machine Learning Engineer, AI Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.