Quantifying Explanation Consistency: The C-Score Metric for CAM-Based Explainability in Medical Image Classification
Summary
A new metric, the C-Score (Consistency Score), has been proposed to quantify the intra-class explanation reproducibility of Class Activation Mapping (CAM) methods in medical image classification. Unlike existing evaluation frameworks, which focus on localization fidelity against radiologist annotations, the C-Score is confidence-weighted and annotation-free: it measures whether a model applies consistent spatial reasoning to the same pathology across different patients. The metric is computed as an intensity-emphasized pairwise soft IoU over correctly classified instances. Six CAM techniques (GradCAM, GradCAM++, LayerCAM, EigenCAM, ScoreCAM, MS GradCAM++) were evaluated across three CNN architectures (DenseNet201, InceptionV3, ResNet50V2) over thirty training epochs on the Kermany chest X-ray dataset. The study identified three mechanisms of AUC-consistency dissociation, that is, ways in which classification AUC and explanation consistency can diverge. It also demonstrated that the C-Score can provide an early warning of model instability, detecting ScoreCAM deterioration on ResNet50V2 one full checkpoint before catastrophic AUC collapse.
Key takeaway
For AI scientists developing medical imaging classifiers, the C-Score is worth adding to the evaluation pipeline alongside traditional AUC metrics. It provides an early warning signal for model instability and can inform architecture-specific clinical deployment recommendations based on explanation quality, not just predictive accuracy.
Key insights
The C-Score quantifies explanation consistency in CAM methods, revealing model instability that classification metrics alone can miss.
Principles
- Explanation consistency is distinct from localization fidelity.
- Early detection of model instability is possible via C-Score.
Method
The C-Score quantifies intra-class explanation reproducibility using confidence-weighted, annotation-free, intensity-emphasized pairwise soft IoU across correctly classified instances.
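The summary does not give the exact formula, so the following is only a minimal sketch of what a confidence-weighted, intensity-emphasized pairwise soft IoU could look like. Everything specific here is an assumption rather than the paper's definition: soft IoU is taken as sum(min) / sum(max) over pixels, "intensity emphasis" is modelled as raising max-normalized heatmaps to a power `gamma`, and each pair is weighted by the product of the two prediction confidences. The function names `soft_iou` and `c_score_sketch` are hypothetical.

```python
import numpy as np


def soft_iou(a, b, eps=1e-8):
    """Soft IoU of two non-negative heatmaps: sum of pixelwise min over sum of pixelwise max."""
    inter = np.minimum(a, b).sum()
    union = np.maximum(a, b).sum()
    return float(inter / (union + eps))


def c_score_sketch(heatmaps, confidences, gamma=2.0):
    """Illustrative consistency score for one class (not the paper's exact formula).

    heatmaps:    array of shape (n, H, W), non-negative CAMs for correctly
                 classified images of the same class.
    confidences: length-n predicted probabilities for that class.
    gamma:       assumed emphasis exponent; values > 1 favour high-intensity regions.
    """
    heatmaps = np.asarray(heatmaps, dtype=float)
    confidences = np.asarray(confidences, dtype=float)
    n = len(heatmaps)
    if n < 2:
        raise ValueError("need at least two heatmaps to form a pair")

    # Scale each map to [0, 1] by its own maximum, then apply the emphasis exponent.
    per_map_max = heatmaps.reshape(n, -1).max(axis=1)
    scaled = heatmaps / np.maximum(per_map_max, 1e-8)[:, None, None]
    emphasised = scaled ** gamma

    # Confidence-weighted mean of soft IoU over all unordered pairs.
    weighted_sum, weight_total = 0.0, 0.0
    for i in range(n):
        for j in range(i + 1, n):
            w = confidences[i] * confidences[j]
            weighted_sum += w * soft_iou(emphasised[i], emphasised[j])
            weight_total += w
    return weighted_sum / weight_total if weight_total > 0 else 0.0
```

Under these assumptions, identical heatmaps score close to 1 and heatmaps with no spatial overlap score 0. The pairwise loop is quadratic in the number of images, so a real pipeline would likely subsample or vectorize it.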
In practice
- Apply C-Score to monitor CAM-based model stability.
- Use C-Score for architecture-specific deployment recommendations.
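As a rough illustration of the monitoring idea in the points above, one could track a per-checkpoint consistency score and flag sharp drops for review. This is a sketch under stated assumptions: the helper `flag_consistency_drops` and the 20% relative-drop threshold are hypothetical choices, not taken from the paper, and the example scores are made-up placeholders rather than reported results.

```python
def flag_consistency_drops(c_scores, max_relative_drop=0.2):
    """Return indices of checkpoints whose score fell by more than
    max_relative_drop (as a fraction) relative to the previous checkpoint."""
    flagged = []
    for k in range(1, len(c_scores)):
        prev, cur = c_scores[k - 1], c_scores[k]
        if prev > 0 and (prev - cur) / prev > max_relative_drop:
            flagged.append(k)
    return flagged


# Placeholder scores for five checkpoints (illustrative only, not from the study).
example_scores = [0.60, 0.62, 0.61, 0.40, 0.38]
flagged_checkpoints = flag_consistency_drops(example_scores)
print(flagged_checkpoints)  # prints [3]
```

A flagged checkpoint would then be inspected alongside its AUC before deciding whether to keep training or roll back. The threshold would need tuning per architecture and CAM method.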
Topics
- Class Activation Mapping
- C-Score Metric
- Medical Image Classification
- Explanation Consistency
- CNN Architectures
Best for: AI Scientist, Research Scientist, Computer Vision Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.