Towards Data-free and Training-free Compression for Speech Foundation Models Using Parameter Clustering

2026-06-10 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

A data-free, training-free compression method for speech foundation models applies channelwise k-means clustering to model parameters, and further explores mixed sparsity pruning in which the number of parameter clusters varies by layer. In experiments on the LibriSpeech dataset, the method clearly outperformed magnitude-based pruning. For HuBERT-large at 50% sparsity, it lowered Word Error Rate (WER) relative to magnitude-based pruning by 27.73% and 18.61% absolute on the test-clean and test-other subsets, respectively, before fine-tuning. After only three epochs of fine-tuning, it still led by 0.19% and 0.79% absolute. For Whisper-large-v3 at 10% sparsity, the corresponding WER reductions were 2.86% and 5.02% absolute, with no significant WER increase compared to the uncompressed baseline.
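
Because the figures above are absolute differences, it helps to spell out the arithmetic. The sketch below computes WER with the standard definition (word-level edit distance divided by reference length) and reports an absolute reduction in percentage points. The sentences and the function names `wer` and `absolute_wer_reduction` are illustrative assumptions, not data or code from the original article.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution (or match)
            ))
        prev = cur
    return prev[-1] / len(ref)


def absolute_wer_reduction(baseline_wer: float, improved_wer: float) -> float:
    """Difference between two WERs, in percentage points."""
    return (baseline_wer - improved_wer) * 100.0


# Hypothetical transcripts, for illustration only.
reference = "the cat sat on the mat"
output_a = "a cat sit on mat"        # stands in for one compressed model
output_b = "the cat sat on mat"      # stands in for another compressed model

wer_a = wer(reference, output_a)
wer_b = wer(reference, output_b)
reduction = absolute_wer_reduction(wer_a, wer_b)
print(f"WER A: {wer_a:.2%}, WER B: {wer_b:.2%}, absolute reduction: {reduction:.2f} points")
```

An absolute reduction subtracts one WER from the other directly, so it is not the same as a relative (percentage-of-baseline) improvement.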

Key takeaway

Machine Learning Engineers deploying large speech foundation models should consider this data-free, training-free compression approach. Compared with magnitude-based pruning at the same sparsity, it lowered Word Error Rate by up to 27.73% absolute on HuBERT-large (50% sparsity, before fine-tuning) and up to 5.02% absolute on Whisper-large-v3 (10% sparsity), without retraining or any data. For Whisper-large-v3 this came with no significant WER increase over the uncompressed baseline, which makes the method a compelling alternative to magnitude-based pruning for reducing model size.

Key insights

Channelwise k-means clustering enables effective data-free, training-free compression for speech foundation models.

Principles

Data-free compression is viable.
Clustering outperforms magnitude pruning.
Mixed sparsity improves results.
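
For reference, the magnitude-based pruning baseline named above can be sketched as follows. This is a minimal, unstructured variant applied to a single weight matrix. The article does not say whether its baseline prunes per layer or globally, so that choice, the function name `magnitude_prune`, and the example weights are all assumptions.

```python
def magnitude_prune(weight, sparsity):
    """Zero out the `sparsity` fraction of entries with the smallest magnitude.

    `weight` is a list of rows (a 2-D matrix). Returns a new matrix and
    leaves the input untouched.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    # Collect (magnitude, row, col) so ties are broken deterministically by position.
    cells = [(abs(w), i, j) for i, row in enumerate(weight) for j, w in enumerate(row)]
    cells.sort()
    n_prune = int(len(cells) * sparsity)
    pruned = [list(row) for row in weight]
    for _, i, j in cells[:n_prune]:
        pruned[i][j] = 0.0
    return pruned


# Hypothetical 2x2 weight matrix, for illustration only.
example_weight = [[0.5, -0.1], [0.05, -2.0]]
example_pruned = magnitude_prune(example_weight, sparsity=0.5)
print(example_pruned)
```

Magnitude pruning looks only at each weight's absolute value, which is the behavior the clustering approach is compared against.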

Method

The method involves channelwise k-means clustering for parameter compression, optionally combined with mixed sparsity pruning using layer-level varying numbers of parameter clusters.
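
One possible reading of channelwise clustering is sketched below: treat each row of a weight matrix as a channel, group the rows with k-means, and replace every row with its cluster centroid so that only k distinct rows remain. The article does not give implementation details, so this interpretation, the deterministic farthest-point initialization, and every name here (`kmeans_rows`, `cluster_channels`) are assumptions rather than the authors' procedure.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_farthest(rows, k):
    """Deterministic seeding: start at row 0, then repeatedly add the row
    farthest from all centroids chosen so far."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids))
        centroids.append(list(far))
    return centroids


def kmeans_rows(rows, k, iters=20):
    """Lloyd's k-means over the rows; returns (centroids, assignments)."""
    centroids = init_farthest(rows, k)
    assign = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: each row goes to its nearest centroid.
        assign = [min(range(k), key=lambda c: sq_dist(r, centroids[c])) for r in rows]
        # Update step: each centroid becomes the mean of its member rows.
        for c in range(k):
            members = [r for r, a in zip(rows, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign


def cluster_channels(weight, k):
    """Replace each channel (row) with the centroid of its cluster."""
    centroids, assign = kmeans_rows(weight, k)
    return [list(centroids[a]) for a in assign], assign


# Hypothetical 4-channel layer with two pairs of near-duplicate channels.
weight = [[1.0, 0.0], [1.1, 0.1], [-1.0, 2.0], [-0.9, 2.1]]
compressed, assign = cluster_channels(weight, k=2)
print(assign)
print(compressed)
```

No data or gradient step is involved: the clustering uses only the stored weights, which is what makes an approach of this kind data-free and training-free.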

In practice

Apply k-means for model compression.
Test mixed sparsity pruning.
Evaluate on HuBERT or Whisper models.
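
When testing mixed sparsity, it is useful to check that a layer-level allocation of cluster counts still meets the overall budget. The helper below is a rough sketch that assumes sparsity is measured as the fraction of channels removed by clustering and that all layers have equally sized channels. The article does not define its sparsity accounting or how per-layer cluster counts are chosen, so the function `overall_sparsity` and the layer sizes are hypothetical.

```python
def overall_sparsity(channels_per_layer, clusters_per_layer):
    """Fraction of channels removed across all layers.

    Assumes every channel has the same number of parameters, so counting
    channels is equivalent to counting parameters.
    """
    if len(channels_per_layer) != len(clusters_per_layer):
        raise ValueError("need one cluster count per layer")
    for n_channels, n_clusters in zip(channels_per_layer, clusters_per_layer):
        if not 1 <= n_clusters <= n_channels:
            raise ValueError("cluster count must be between 1 and the channel count")
    kept = sum(clusters_per_layer)
    total = sum(channels_per_layer)
    return 1.0 - kept / total


# Hypothetical three-layer model with 8 channels per layer.
channels = [8, 8, 8]
uniform = [4, 4, 4]   # same cluster count in every layer
mixed = [6, 4, 2]     # more clusters in some layers, fewer in others

print(overall_sparsity(channels, uniform))
print(overall_sparsity(channels, mixed))
```

Both allocations reach the same overall sparsity, so any WER difference between them on HuBERT or Whisper would be attributable to how the budget is distributed across layers.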

Topics

Speech Foundation Models
Model Compression
Parameter Clustering
Pruning
HuBERT
Whisper

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.