Supervised Dimensionality Reduction Revisited: Why LDA on Frozen CNN Features Deserves a Second Look
Summary
A new study investigates the impact of dimensionality reduction on frozen pretrained Convolutional Neural Network (CNN) features for image classification, a common practice when labeled data or computational resources are limited. The research, spanning four backbone architectures (ResNet-18, ResNet-50, MobileNetV3-Small, EfficientNet-B0) and two datasets (CIFAR-100, Tiny ImageNet), systematically compares ten dimensionality reduction strategies. It finds that Linear Discriminant Analysis (LDA) consistently improves classification accuracy by up to 4.6 percentage points over using full-dimensional features, while simultaneously reducing feature dimensionality by 61-95%. This improvement is statistically significant ($p<0.001$) across all eight backbone-dataset configurations. LDA also outperforms unsupervised PCA in 7 of 8 settings and is more efficient than complex alternatives like Local Fisher Discriminant Analysis and Neighbourhood Components Analysis. The study also introduces two lightweight extensions, Residual Discriminant Augmentation (RDA) and Discriminant Subspace Boosting (DSB), which offer marginal additional accuracy gains.
Key takeaway
If you build transfer learning pipelines on frozen CNN features, integrate Linear Discriminant Analysis (LDA) as a standard preprocessing step. In the study, it improved classification accuracy by 0.3-4.6 percentage points and reduced feature dimensionality by 61-95%, which makes classifier training faster and more stable. For best results, make sure the training set has at least 50 samples per class, and project to $C-1$ dimensions, where $C$ is the number of classes.
Key insights
Applying LDA to frozen CNN features consistently boosts classification accuracy and reduces dimensionality.
Principles
- Supervised dimensionality reduction (LDA) outperforms unsupervised reduction (PCA) in 7 of 8 settings.
- LDA acts as an implicit regularizer for downstream classifiers.
- The benefit of LDA scales with the ratio of feature dimensions to classes.
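As a rough illustration of the last principle, the arithmetic behind the reported dimensionality savings can be sketched as follows. The feature widths (512 for ResNet-18, 2048 for ResNet-50) and class counts (100 for CIFAR-100, 200 for Tiny ImageNet) are assumptions based on the standard versions of these models and datasets, not values stated in this summary; `reduction_percent` is a hypothetical helper name.

```python
# Assumed pooled feature widths of the standard backbones (not stated in the summary).
feature_dims = {"ResNet-18": 512, "ResNet-50": 2048}
# Assumed class counts of the standard datasets.
num_classes = {"CIFAR-100": 100, "Tiny ImageNet": 200}


def reduction_percent(d_in, n_classes):
    """Percentage of dimensions removed when projecting d_in features to C-1."""
    d_out = n_classes - 1
    return 100.0 * (1.0 - d_out / d_in)


for backbone, d_in in feature_dims.items():
    for dataset, c in num_classes.items():
        print(
            f"{backbone} on {dataset}: {d_in} -> {c - 1} dims, "
            f"{reduction_percent(d_in, c):.1f}% smaller, D/C = {d_in / c:.2f}"
        )
```

Under these assumptions the smallest saving (512 to 199 dimensions) is about 61% and the largest (2048 to 99 dimensions) is about 95%, which is consistent with the 61-95% range quoted in the summary.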
Method
Extract frozen CNN features, apply LDA to project them to $C-1$ dimensions (where $C$ is the number of classes), then train an $\ell_2$-regularized logistic regression classifier on the reduced features.
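A minimal sketch of this method, assuming scikit-learn (the summary does not name a library) and using synthetic features from `make_classification` as a stand-in for real frozen CNN features. The sample counts, feature width, and `max_iter` value are illustrative choices, not settings from the study.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for frozen CNN features: 2000 samples, 128 dims, 10 classes.
n_classes = 10
X, y = make_classification(
    n_samples=2000,
    n_features=128,
    n_informative=16,
    n_classes=n_classes,
    n_clusters_per_class=1,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Project to C-1 dimensions, then fit logistic regression
# (scikit-learn's LogisticRegression is L2-regularized by default).
lda = LinearDiscriminantAnalysis(n_components=n_classes - 1)
clf = make_pipeline(lda, LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)

Z_test = lda.transform(X_test)  # reduced test features, shape (n_test, C-1)
accuracy = clf.score(X_test, y_test)
print("reduced shape:", Z_test.shape)
print("test accuracy:", accuracy)
```

With real data, `X` would be the pooled activations of the frozen backbone and `y` the image labels; fitting LDA inside the pipeline keeps the projection from seeing test labels.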
In practice
- Use LDA with $d=C-1$ components for frozen CNN features.
- Standardize features after projection but before classification.
- Skip dimensionality reduction if the training set has fewer than 50 samples per class.
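The three recommendations above can be combined into one small helper, sketched here with scikit-learn as an assumed library. `build_classifier` and its `min_per_class` parameter are hypothetical names; the 50-sample threshold comes from the summary, while standardizing the full-dimensional features when LDA is skipped is an assumption of this sketch.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def build_classifier(y_train, min_per_class=50):
    """Return a pipeline that uses LDA only when every class has enough samples."""
    classes, counts = np.unique(y_train, return_counts=True)
    steps = []
    if counts.min() >= min_per_class:
        # Project to C-1 dimensions (assumes the feature width is at least C-1).
        steps.append(("lda", LinearDiscriminantAnalysis(n_components=len(classes) - 1)))
    # Standardize after the (optional) projection and before the classifier.
    steps.append(("scale", StandardScaler()))
    steps.append(("logreg", LogisticRegression(max_iter=1000)))
    return Pipeline(steps)


# Label vectors only: 5 classes with 60 samples each vs. 10 samples each.
y_large = np.repeat(np.arange(5), 60)
y_small = np.repeat(np.arange(5), 10)
steps_large = [name for name, _ in build_classifier(y_large).steps]
steps_small = [name for name, _ in build_classifier(y_small).steps]
print(steps_large)  # ['lda', 'scale', 'logreg']
print(steps_small)  # ['scale', 'logreg']
```

The returned pipeline is unfitted; call `fit` on the extracted training features and labels as usual.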
Topics
- Linear Discriminant Analysis
- Frozen CNN Features
- Transfer Learning
- Supervised Dimensionality Reduction
- Image Classification Benchmarks
Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.