Cascade Classification of Dermoscopic Images of Skin Neoplasms with Controllable Sensitivity and External Clinical Validation
Summary
The study compared four deep learning architectures (ViT-B/16, Swin-S, ConvNeXt-S, EfficientNetV2-S) and three classification schemes for dermoscopic images of skin neoplasms. Models were initialized with ImageNet-pretrained weights, trained on aggregated ISIC Archive data, and evaluated on an internal test set and on two independent clinical datasets from Russian practice (Melanoscope AI, Sechenov University). On the internal data, the binary classification stage achieved ROC-AUC 0.952-0.966. On the Sechenov University data, ROC-AUC dropped to 0.797-0.893 and sensitivity to 0.53-0.67, while expected calibration error (ECE) rose from 0.02 to 0.27-0.39, indicating a significant generalization gap. The two-stage cascade classification scheme improved macro F1 over single-stage four-class classification, particularly for ViT-B/16, by recovering malignant lesions.
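The calibration figures above are reported as ECE. As a rough illustration of what that metric measures, the sketch below computes a binned ECE for a binary classifier. The equal-width 10-bin variant, the 0.5 decision cut, and the function name are assumptions made for this example; the summary does not state which ECE variant the study used.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE for a binary classifier (illustrative variant, not the study's code).

    probs  : predicted probability of the positive class, one per image
    labels : ground-truth labels (0 or 1)
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)

    # Predicted class and the confidence the model assigns to that class.
    preds = (probs >= 0.5).astype(int)
    conf = np.where(preds == 1, probs, 1.0 - probs)
    correct = (preds == labels).astype(float)

    # Equal-width bins over [0, 1]; each sample falls into exactly one bin.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.clip(np.digitize(conf, edges[1:-1], right=True), 0, n_bins - 1)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # Weight the |accuracy - confidence| gap by the bin's share of samples.
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece
```

A model that is fully confident on every image but right only half of the time would score 0.5 under this definition, which gives a sense of scale for the 0.27-0.39 values reported on the external data.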
Key takeaway
AI scientists developing diagnostic tools for dermoscopic images should expect a significant generalization gap between open datasets and real-world clinical data. Prioritize cascade classification schemes with tunable triage thresholds to achieve controllable sensitivity, and carry out external clinical validation and recalibration before any deployment. This approach aligns better with clinical differential-diagnosis logic and, in the study, improved macro F1 over single-stage four-class classification.
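The summary does not say how recalibration should be done. One common option is temperature scaling, shown below as a minimal sketch: a single scalar T is fitted on held-out logits from the target clinical site so that softmax(logits / T) minimizes negative log-likelihood. The grid search, the grid range, and the function names are assumptions for illustration, not the study's procedure.

```python
import numpy as np

def nll(logits, labels, temperature):
    """Mean negative log-likelihood of softmax(logits / temperature)."""
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels, dtype=int)
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(labels.size), labels].mean()

def fit_temperature(logits, labels, grid=None):
    """Pick the temperature on a grid that minimizes NLL on held-out data."""
    if grid is None:
        grid = np.linspace(0.25, 10.0, 391)  # assumed search range, step 0.025
    losses = [nll(logits, labels, t) for t in grid]
    return float(grid[int(np.argmin(losses))])

# Toy example: very confident logits, but one of four labels disagrees.
val_logits = np.array([[5.0, -5.0]] * 4)
val_labels = np.array([0, 0, 0, 1])
T = fit_temperature(val_logits, val_labels)
```

A fitted T above 1 softens overconfident probabilities. Temperature scaling leaves the ranking of predictions unchanged, so it can lower ECE but cannot by itself close a gap in ROC-AUC or sensitivity.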
Key insights
Cascade classification with a tunable triage threshold gives finer control over sensitivity than single-stage classification and more closely mirrors clinical differential-diagnosis logic.
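One way to make sensitivity "controllable" is to choose the triage threshold on a validation set so that a target share of malignant cases is flagged. The sketch below assumes the first stage outputs a malignancy score per image; the function names and the 0.95 default target are hypothetical and not taken from the study.

```python
import numpy as np

def threshold_for_sensitivity(scores, labels, target_sensitivity=0.95):
    """Largest threshold t such that flagging `score >= t` reaches the target sensitivity."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos = np.sort(scores[labels == 1])  # scores of malignant cases, ascending
    if pos.size == 0:
        raise ValueError("need at least one positive example")
    # Number of positives that must score at or above the threshold.
    k = max(int(np.ceil(target_sensitivity * pos.size)), 1)
    return pos[pos.size - k]

def sensitivity(scores, labels, threshold):
    """Share of positive cases flagged at the given threshold."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    return float((scores[labels == 1] >= threshold).mean())

# Toy validation set: four malignant (1) and three benign (0) lesions.
val_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.8, 0.9]
val_labels = [0, 1, 0, 1, 1, 1, 0]
t = threshold_for_sensitivity(val_scores, val_labels, target_sensitivity=0.75)
```

A threshold chosen this way holds only for data resembling the validation set. Given the drop in sensitivity reported on external data, it would likely need to be re-selected on data from the target clinic.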
Principles
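- External clinical validation is crucial for assessing generalization.
- Models trained on open dermoscopic datasets show a significant generalization gap on real-world clinical data.
- Cascade classification can recover malignant lesions that single-stage classification misclassifies.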
Method
The study compared binary, single-stage four-class, and two-stage cascade classification schemes using ViT-B/16, Swin-S, ConvNeXt-S, and EfficientNetV2-S architectures on dermoscopic images.
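The summary does not describe how the two stages are wired together. As a hedged sketch of one plausible routing, the code below lets a binary first stage flag suspicious lesions at a tunable threshold and passes only the flagged ones to a second-stage classifier. The `NOT_FLAGGED` label, the input shapes, and the function name are assumptions for illustration.

```python
import numpy as np

NOT_FLAGGED = -1  # hypothetical label for lesions that stop at stage 1

def cascade_predict(p_suspicious, stage2_probs, triage_threshold=0.5):
    """Two-stage cascade: binary triage first, fine-grained class second.

    p_suspicious : shape (n,), stage-1 probability that a lesion needs stage 2
    stage2_probs : shape (n, k), stage-2 class probabilities for every lesion
    Returns an int array holding NOT_FLAGGED or a stage-2 class index.
    """
    p_suspicious = np.asarray(p_suspicious, dtype=float)
    stage2_probs = np.asarray(stage2_probs, dtype=float)

    # Lowering the threshold sends more lesions to stage 2 (higher sensitivity).
    flagged = p_suspicious >= triage_threshold

    out = np.full(p_suspicious.shape[0], NOT_FLAGGED, dtype=int)
    out[flagged] = stage2_probs[flagged].argmax(axis=1)
    return out

# Toy example with three lesions and two hypothetical stage-2 classes.
p1 = [0.9, 0.3, 0.6]
p2 = [[0.2, 0.8], [0.7, 0.3], [0.6, 0.4]]
default_out = cascade_predict(p1, p2)                        # lesion 2 stops at stage 1
lenient_out = cascade_predict(p1, p2, triage_threshold=0.2)  # all three reach stage 2
```

In this layout the triage threshold is the single knob that trades sensitivity against the number of lesions sent on for finer classification, which is the kind of control a single-stage four-class argmax does not expose directly.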
In practice
- Implement a two-stage cascade for skin lesion classification.
- Prioritize external clinical validation before model deployment.
- Consider ViT-B/16 with the cascade scheme for improved macro F1.
Topics
- Dermoscopic Image Classification
- Cascade Classification
- Deep Learning Architectures
- Clinical Validation
- Generalization Gap
- Skin Neoplasms
Best for: Computer Vision Engineer, AI Scientist, Machine Learning Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.