GP-Adapter: Gaussian Process CLIP-Adapter for Few-Shot Out-of-Distribution Detection
Summary
GP-Adapter is a training-free framework that augments CLIP (Contrastive Language-Image Pre-training) with Gaussian Process (GP) uncertainty modeling for few-shot classification and out-of-distribution (OOD) detection. It targets a limitation of CLIP: its similarity scores are deterministic and carry little uncertainty information in low-data or distribution-shifted scenarios. GP-Adapter builds modality-specific, class-wise one-class GPs on frozen CLIP embeddings, using an RBF kernel for image features and a linear kernel for text prompts, then fuses their predictive statistics into a variance-aware confidence score for OOD detection. The method requires no fine-tuning of the CLIP backbone; it operates with a small K-shot cache and lightweight hyperparameter selection, at a memory cost of O(CK^2). Experiments on ImageNet and various OOD benchmarks show competitive few-shot performance and improved OOD detection, particularly when integrated with prompt-learning baselines. The framework was published on 2026-06-05.
Key takeaway
For Machine Learning Engineers building vision systems for low-data or distribution-shifted environments, GP-Adapter adds the uncertainty information that CLIP's deterministic similarity scores lack. Consider layering this training-free framework on your CLIP-based models: it improves few-shot classification and out-of-distribution detection without fine-tuning the CLIP backbone, which makes model outputs more reliable inputs for decisions in real-world applications.
Key insights
GP-Adapter augments CLIP with Gaussian Processes for training-free, uncertainty-aware few-shot OOD detection, improving reliability in low-data settings.
Principles
- Probabilistic inference enhances vision-language model reliability.
- Uncertainty modeling is crucial for distribution shifts.
- GP-based uncertainty complements prompt learning.
Method
Using a small K-shot cache, construct modality-specific, class-wise one-class Gaussian Processes on frozen CLIP embeddings, with an RBF kernel for image features and a linear kernel for text prompts. Then fuse the predictive statistics of the two modalities into a variance-aware confidence score for OOD detection.
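The method can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the standard one-class GP regression formulation (every support point gets target 1 under a zero-mean prior), replaces CLIP embeddings with synthetic unit vectors, and uses a hypothetical fusion rule (`mean - lam * variance`, summed over both modalities), because the summary does not state the actual rule. The names `OneClassGP` and `fused_scores` and the values of `gamma`, `noise`, and `lam` are illustrative assumptions.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit length, as is common for CLIP embeddings."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def rbf_kernel(a, b, gamma=5.0):
    """RBF kernel exp(-gamma * ||a_i - b_j||^2); gamma is an assumed value."""
    sq_dist = (a**2).sum(1)[:, None] + (b**2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def linear_kernel(a, b):
    """Linear kernel: plain inner products between rows of a and b."""
    return a @ b.T

class OneClassGP:
    """One-class GP regression: all support targets are 1, prior mean is 0."""

    def __init__(self, support, kernel, noise=1e-2):
        self.support, self.kernel = support, kernel
        gram = kernel(support, support) + noise * np.eye(len(support))
        self.chol = np.linalg.cholesky(gram)  # one K x K factor per class
        ones = np.ones(len(support))
        # alpha = (K + noise * I)^{-1} 1, computed via the Cholesky factor.
        self.alpha = np.linalg.solve(self.chol.T, np.linalg.solve(self.chol, ones))

    def predict(self, queries):
        k_star = self.kernel(queries, self.support)      # shape (N, K)
        mean = k_star @ self.alpha                       # predictive mean
        v = np.linalg.solve(self.chol, k_star.T)         # shape (K, N)
        prior = np.diag(self.kernel(queries, queries))   # k(x*, x*)
        var = np.maximum(prior - (v**2).sum(0), 0.0)     # predictive variance
        return mean, var

def fused_scores(image_gps, text_gps, queries, lam=1.0):
    """Hypothetical fusion: per-class (mean - lam * variance) over both modalities."""
    per_class = []
    for gp_img, gp_txt in zip(image_gps, text_gps):
        m_img, v_img = gp_img.predict(queries)
        m_txt, v_txt = gp_txt.predict(queries)
        per_class.append(m_img + m_txt - lam * (v_img + v_txt))
    scores = np.stack(per_class, axis=1)                 # shape (N, C)
    return scores.argmax(1), scores.max(1)               # class, confidence

# Synthetic stand-in for frozen CLIP features: C classes, K image shots and
# M text prompts per class, all scattered around orthogonal class centers.
rng = np.random.default_rng(0)
C, K, M, D = 3, 4, 3, 16
centers = np.eye(D)[:C]

def jitter(center, n):
    return l2_normalize(center + 0.05 * rng.normal(size=(n, D)))

image_gps = [OneClassGP(jitter(c, K), rbf_kernel) for c in centers]
text_gps = [OneClassGP(jitter(c, M), linear_kernel) for c in centers]

# Query 0 lies near class 0; query 1 points away from every class center.
queries = np.vstack([jitter(centers[0], 1), np.eye(D)[10:11]])
pred, conf = fused_scores(image_gps, text_gps, queries)
print("predicted classes:", pred)
print("confidence scores:", conf)
```

In this sketch, a query far from every class cache has a near-zero predictive mean and a near-prior variance, so its confidence falls well below that of an in-distribution query; thresholding `conf` then flags OOD inputs. Caching one K x K factor per class for each of the C classes is where a memory term of O(CK^2) would arise in such a design.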
In practice
- Enhance few-shot classification.
- Improve out-of-distribution detection.
- Boost reliability in low-data scenarios.
Topics
- Gaussian Processes
- CLIP
- Out-of-Distribution Detection
- Few-Shot Learning
- Vision-Language Models
- Uncertainty Quantification
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.