CompreSSM: Compressing State-Space Models During Training with Hankel Singular Values

2026-06-14 · Source: Machine Learning on Medium · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Advanced, short

Summary

CompreSSM, a new approach highlighted by MIT CSAIL News and accepted at ICLR 2026, compresses state-space models during training. Traditional methods prune or distill a model only after it has been fully trained; CompreSSM instead builds pruning into the training loop. It uses Hankel singular values (HSVs), a control-theoretic measure of how much each state contributes to the model's input-output behavior, to identify low-HSV subspaces and prune them iteratively, on the fly. Early benchmarks indicate up to 3x faster training of state-space models on sequence tasks, with no perceptible loss in test accuracy compared to standard training followed by post-hoc pruning. The result is lower computational cost and reduced hardware requirements.

Key takeaway

For Machine Learning Engineers developing state-space models for sequential tasks, CompreSSM shifts compression from a post-training step into training itself. Consider integrating in-training pruning based on Hankel singular values to get up to 3x faster training and a smaller memory footprint without sacrificing model accuracy. Teams can then develop efficient models more quickly, which is particularly useful for resource-constrained deployments and hyperparameter searches.

Key insights

CompreSSM uses Hankel singular values to prune state-space models during training, yielding faster, leaner models with no perceptible accuracy loss.

Principles

Hankel singular values quantify state contribution.
Small HSVs indicate dispensable model components.
Pruning during training shrinks the model early, so the remaining training runs faster.
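
To make the first two principles concrete, here is a minimal sketch of how Hankel singular values can be computed for a small discrete-time linear state-space model x[k+1] = A x[k] + B u[k], y[k] = C x[k]. It is not the CompreSSM implementation. It uses the textbook definition: the HSVs are the square roots of the eigenvalues of P Q, where P and Q are the controllability and observability Gramians. The helper names (gramian, hankel_singular_values) and the example matrices are assumptions made for illustration, and the Kronecker-product solver is only practical for small state sizes.

```python
import numpy as np

def gramian(A, W):
    """Solve the discrete Lyapunov equation A X A^T - X + W = 0 for X.

    Uses Kronecker vectorisation, which costs O(n^6) and is meant only
    for small illustrative systems. Assumes A is stable (|eig| < 1).
    """
    n = A.shape[0]
    lhs = np.eye(n * n) - np.kron(A, A)
    X = np.linalg.solve(lhs, W.reshape(-1)).reshape(n, n)
    return (X + X.T) / 2  # remove tiny asymmetries from round-off

def hankel_singular_values(A, B, C):
    """Return the HSVs of (A, B, C), sorted from largest to smallest."""
    P = gramian(A, B @ B.T)      # controllability Gramian
    Q = gramian(A.T, C.T @ C)    # observability Gramian
    eigs = np.linalg.eigvals(P @ Q).real
    return np.sort(np.sqrt(np.clip(eigs, 0.0, None)))[::-1]

# Hypothetical 4-state diagonal model; the last two states barely
# reach the output, so their HSVs should be comparatively small.
A = np.diag([0.9, 0.6, -0.3, 0.1])
B = np.ones((4, 1))
C = np.array([[1.0, 0.5, 0.05, 0.01]])

hsv = hankel_singular_values(A, B, C)
print(hsv)
```

States associated with the smallest values in hsv are the candidates for removal, since they contribute least to the input-output map.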

Method

CompreSSM computes Hankel singular values at intervals during training, prunes unnecessary state subspaces on the fly, then continues training the reduced model so it can adapt.
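
The pruning step described above can be sketched with balanced truncation, a standard control-theoretic way to drop low-HSV subspaces. Whether CompreSSM uses exactly this procedure is not stated here, so treat the following as an illustration under assumptions: the function names, the 99% cumulative-HSV "energy" rule for choosing how many states to keep, and the example matrices are all hypothetical. In a training loop, a function like balanced_truncate would be called every few hundred steps on each layer's (A, B, C), and optimisation would then continue on the smaller matrices.

```python
import numpy as np

def gramian(A, W):
    """Solve A X A^T - X + W = 0 (small, stable systems only)."""
    n = A.shape[0]
    lhs = np.eye(n * n) - np.kron(A, A)
    X = np.linalg.solve(lhs, W.reshape(-1)).reshape(n, n)
    return (X + X.T) / 2

def balanced_truncate(A, B, C, energy=0.99):
    """Square-root balanced truncation.

    Keeps the smallest number of states r whose HSVs sum to at least
    `energy` of the total (an assumed selection rule). Requires
    positive-definite Gramians, i.e. a controllable, observable model.
    """
    P = gramian(A, B @ B.T)      # controllability Gramian
    Q = gramian(A.T, C.T @ C)    # observability Gramian
    Lp = np.linalg.cholesky(P)
    Lq = np.linalg.cholesky(Q)
    U, hsv, Vt = np.linalg.svd(Lq.T @ Lp)  # singular values are the HSVs

    cum = np.cumsum(hsv) / hsv.sum()
    r = min(int(np.searchsorted(cum, energy)) + 1, len(hsv))

    S = np.diag(hsv[:r] ** -0.5)
    T = Lp @ Vt[:r].T @ S          # n x r: balanced coords -> original
    T_inv = S @ U[:, :r].T @ Lq.T  # r x n: original -> balanced coords
    return T_inv @ A @ T, T_inv @ B, C @ T, hsv, r

def impulse_response(A, B, C, steps):
    """First `steps` Markov parameters C A^k B of a single-input, single-output model."""
    out, x = [], B.copy()
    for _ in range(steps):
        out.append((C @ x).item())
        x = A @ x
    return np.array(out)

# Hypothetical 4-state model with two weakly observed states.
A = np.diag([0.9, 0.6, -0.3, 0.1])
B = np.ones((4, 1))
C = np.array([[1.0, 0.5, 0.05, 0.01]])

A_r, B_r, C_r, hsv, r = balanced_truncate(A, B, C, energy=0.99)
h_full = impulse_response(A, B, C, 50)
h_red = impulse_response(A_r, B_r, C_r, 50)
print("kept states:", r, "of", A.shape[0])
print("max impulse-response error:", np.max(np.abs(h_full - h_red)))
```

Balanced truncation comes with a known guarantee: the worst-case input-output error is at most twice the sum of the discarded HSVs, which is why small HSVs are a safe signal for pruning.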

In practice

Apply to S4, LSSL, and DSS models.
Optimize sequence tasks like speech or time series.
Reduce training costs on limited GPUs.

Topics

State-Space Models
Model Compression
Hankel Singular Values
In-Training Pruning
Control Theory
Sequential Tasks

Best for: NLP Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning on Medium.