ZEBRA: Zero-Shot Entropy-Regularized Prompt Learning for Base-to-Novel Generalization in Audio-Language Models

2026-06-30 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

ZEBRA is a plug-and-play framework that addresses the base-to-novel generalization gap that arises when Audio-Language Models (ALMs) are adapted with prompt learning. Prompt learning raises accuracy on base classes through few-shot supervised adaptation, but it often degrades accuracy on novel classes, sometimes to below the zero-shot baseline. ZEBRA fuses zero-shot logits with prompt-learning logits and applies self-entropy regularization, which is intended to reduce overfitting to base classes. Across multiple audio classification datasets, ZEBRA consistently improves novel-class performance while maintaining strong base accuracy, narrowing the generalization gap relative to standard prompt learning. The framework's code is publicly available.

Key takeaway

For AI scientists developing Audio-Language Models: if prompt learning lowers your accuracy on novel classes, consider ZEBRA. The plug-and-play framework fuses zero-shot and prompt-learning logits and applies self-entropy regularization. In the reported experiments, this improved generalization to unseen categories while preserving base-class accuracy.

Key insights

Prompt learning in ALMs creates a base-to-novel generalization gap; ZEBRA mitigates it by fusing zero-shot and prompt-learning logits and applying self-entropy regularization.

Principles

Prompt learning can degrade novel-class performance, sometimes to below zero-shot accuracy.
Self-entropy regularization is used to reduce overfitting to base classes.

Method

ZEBRA fuses zero-shot and prompt-learning logits, then applies self-entropy regularization to prevent overfitting to base classes, improving novel-class generalization.
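
The two steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation in asif-hanif/zebra: the summary does not give the fusion rule, the sign of the entropy term, or any hyperparameters. The sketch assumes a convex combination of the two logit vectors with a hypothetical weight `alpha`, and a confidence-penalty form of the regularizer (entropy subtracted from the cross-entropy loss, scaled by a hypothetical `lam`). All function and parameter names are illustrative.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def fuse_logits(zero_shot, prompt, alpha=0.5):
    """Assumed fusion rule: convex combination of the two logit vectors.

    alpha is a hypothetical mixing weight; alpha=1.0 recovers the
    zero-shot logits and alpha=0.0 the prompt-learning logits.
    """
    if len(zero_shot) != len(prompt):
        raise ValueError("logit vectors must cover the same classes")
    return [alpha * z + (1.0 - alpha) * p for z, p in zip(zero_shot, prompt)]


def self_entropy(log_probs):
    """Shannon entropy (in nats) of the distribution given by log-probabilities."""
    return -sum(math.exp(lp) * lp for lp in log_probs)


def regularized_loss(zero_shot, prompt, label, alpha=0.5, lam=0.1):
    """Cross-entropy on the fused logits minus lam times the prediction entropy.

    Subtracting the entropy penalizes over-confident predictions on base
    classes. The sign and the weight lam are assumptions of this sketch.
    """
    log_probs = log_softmax(fuse_logits(zero_shot, prompt, alpha))
    cross_entropy = -log_probs[label]
    return cross_entropy - lam * self_entropy(log_probs)


# Toy example: three classes, true label 0.
zero_shot_logits = [2.0, 0.5, 0.1]   # from the frozen model with hand-written prompts
prompt_logits = [0.2, 3.0, 0.1]      # from the learned prompts
fused = fuse_logits(zero_shot_logits, prompt_logits, alpha=0.5)
loss = regularized_loss(zero_shot_logits, prompt_logits, label=0)
```

In this sketch the zero-shot branch is treated as fixed, so only the prompt-learning logits would carry gradients in a real training loop. A practical version would run on batched tensors in a deep learning framework rather than on Python lists.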

In practice

Apply ZEBRA to improve ALM novel-class generalization.
Fuse zero-shot logits with prompt-learning logits instead of relying on prompt-learning logits alone.

Topics

Audio-Language Models
Prompt Learning
Zero-Shot Learning
Generalization Gap
Entropy Regularization
Audio Classification

Code references

asif-hanif/zebra

Best for: Research Scientist, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.