SSM Adapters via Hankel Reduced-order Modeling: Injection Site Determines Task Suitability in Long-Context Fine-Tuning
Summary
SSM Adapters via Hankel Reduced-order Modeling (HRM adapters) are a parameter-efficient fine-tuning (PEFT) method designed for tasks that require sequential state accumulation. The adapter is an SSM-based residual module initialized by Balanced Truncation of empirical Hankel Gramians. Because its system matrix $\bar{A}$ is time-invariant, the recurrence can be evaluated with an exact FFT-based parallel scan, which gives computational parity with LoRA across all context lengths. In evaluations on Mistral-7B with 8.4M trainable parameters, HRM consistently outperformed LoRA variants on LongBench tasks, with a +34.8% relative accuracy improvement on QuALITY and a +71.6% relative ROUGE-1 improvement on QMSum. HRM also showed superior performance across 18 configurations of synthetic state tracking (DFA, Parity) and character-level language modeling (enwik8), with gate analysis showing that it modulates recurrence effectively.
Key takeaway
For machine learning engineers fine-tuning large language models on long-context sequential tasks, HRM adapters are worth evaluating as an alternative to LoRA. In the reported results, HRM improved substantially on LoRA on LongBench, including QuALITY and QMSum, while matching LoRA's computational cost. Teams whose applications depend on robust sequential state accumulation should benchmark HRM against their existing low-rank adaptation setup.
Key insights
HRM adapters, an SSM-based PEFT method initialized via Hankel Reduced-order Modeling, outperform LoRA on long-context sequential tasks and learn to modulate recurrence.
Principles
- SSM adapters can enhance PEFT for sequential state accumulation.
- MLP blocks are effective injection sites for SSM adapters.
- Hankel Reduced-order Modeling enables efficient SSM initialization.
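To make the last principle concrete, here is a minimal sketch of Hankel-based order reduction. It is not the paper's procedure: it uses the classical Ho-Kalman / eigensystem-realization construction, a close relative of balanced truncation, on a single-input single-output system. The function name `hankel_reduce` and the assumption that the impulse response (Markov parameters $h_k = C A^k B$) is already available are illustrative only.

```python
import numpy as np

def hankel_reduce(markov, r):
    """Rank-r state-space realization from scalar Markov parameters.

    markov[k] is assumed to hold h_k = C A^k B for k = 0 .. 2m-1.
    Returns (A_r, B_r, C_r, s), where s holds the singular values of
    the Hankel matrix (illustrative helper, not the paper's API).
    """
    markov = np.asarray(markov, dtype=float)
    m = len(markov) // 2
    idx = np.arange(m)[:, None] + np.arange(m)[None, :]
    H = markov[idx]            # H[i, j]       = h_{i+j}
    H_shift = markov[idx + 1]  # H_shift[i, j] = h_{i+j+1}

    # Truncated SVD keeps the r dominant directions of the Hankel matrix.
    U, s, Vt = np.linalg.svd(H)
    U_r, s_r, Vt_r = U[:, :r], s[:r], Vt[:r, :]
    sqrt_s = np.sqrt(s_r)

    # A_r = S^{-1/2} U^T H_shift V S^{-1/2}
    A_r = (U_r.T @ H_shift @ Vt_r.T) / np.outer(sqrt_s, sqrt_s)
    B_r = sqrt_s * Vt_r[:, 0]   # first column of S^{1/2} V^T
    C_r = U_r[0, :] * sqrt_s    # first row of U S^{1/2}
    return A_r, B_r, C_r, s
```

When the singular values in `s` decay quickly, a small `r` already reproduces the impulse response well, which is the sense in which a Hankel-based reduction can provide a compact starting point for an SSM.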
Method
The HRM adapter is an SSM-based residual module initialized via Balanced Truncation of empirical Hankel Gramians. Because its system matrix $\bar{A}$ is time-invariant, the recurrence can be computed with an exact FFT-based parallel scan.
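The link between time-invariance and an FFT-based evaluation can be sketched as follows. With a fixed system matrix, the recurrence $x_t = \bar{A} x_{t-1} + \bar{B} u_t$, $y_t = C x_t$ unrolls into a causal convolution of the input with the kernel $k_j = C \bar{A}^j \bar{B}$, and that convolution can be computed with FFTs. The code below is a toy single-channel illustration under that assumption; the function names and the shapes of `B` and `C` (1-D vectors) are hypothetical and not taken from the paper.

```python
import numpy as np

def ssm_scan_sequential(A, B, C, u):
    """Reference recurrence: x_t = A x_{t-1} + B u_t, y_t = C x_t."""
    x = np.zeros(A.shape[0])
    ys = []
    for u_t in u:
        x = A @ x + B * u_t
        ys.append(C @ x)
    return np.array(ys)

def ssm_scan_fft(A, B, C, u):
    """Same output via causal convolution with kernel k_j = C A^j B."""
    T = len(u)
    kernel = np.empty(T)
    v = B.astype(float)          # v holds A^j B as j increases
    for j in range(T):
        kernel[j] = C @ v
        v = A @ v
    n = 2 * T                    # zero-pad so circular == linear convolution
    y = np.fft.irfft(np.fft.rfft(kernel, n) * np.fft.rfft(u, n), n)
    return y[:T]                 # keep only the causal part

# Usage: both evaluations agree up to floating-point error.
rng = np.random.default_rng(0)
A = 0.9 * np.linalg.qr(rng.standard_normal((4, 4)))[0]  # spectral radius 0.9
B, C = rng.standard_normal(4), rng.standard_normal(4)
u = rng.standard_normal(256)
max_diff = np.max(np.abs(ssm_scan_sequential(A, B, C, u) - ssm_scan_fft(A, B, C, u)))
```

The FFT path is exact rather than approximate because the kernel does not depend on the position in the sequence; an input-dependent (time-varying) system matrix would break this equivalence.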
In practice
- Apply HRM adapters for long-context sequence modeling.
- Consider HRM for tasks like QuALITY and QMSum.
- Use HRM as an alternative to low-rank adaptation.
Topics
- SSM Adapters
- Parameter-Efficient Fine-Tuning
- Hankel Reduced-order Modeling
- Long-Context Models
- Mistral-7B
- Sequence Modeling
Best for: Research Scientist, AI Engineer, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.