Low Kruskal-Rank Adaptation
Summary
Low Kruskal-Rank Adaptation (LoKRA) is a new parameter-efficient fine-tuning (PEFT) method that extends Low-Rank Adaptation (LoRA) by addressing a limitation of LoRA: standard matrix rank does not capture redundancy within the update matrices. LoKRA replaces matrix rank with Kruskal rank (the largest k such that every set of k columns is linearly independent), which is a more informative criterion for update diversity and global independence. The method adds a penalty term to the loss function to optimize the Kruskal rank of the learnable matrices A and B^T. An enhanced variant, LoKRA+, uses the Khatri-Rao product instead of matrix multiplication, which provides a tighter theoretical lower bound on the Kruskal rank. In experiments on LLaMA-7B/13B, LLaMA2-7B, LLaMA3-8B, and Qwen3-8B, LoKRA consistently outperforms LoRA, improving average accuracy by 2.4%-5.0% and surpassing the previous best methods by up to 1.5%. LoKRA+ raises accuracy by a further 0.3%-0.8%. All tests were conducted on a single AMD Instinct MI300 Accelerator. The paper has been accepted to ICML 2026, and the code is available on GitHub.
Key takeaway
For machine learning engineers fine-tuning large language models, LoKRA targets a specific weakness of LoRA: redundancy in the update matrices that standard matrix rank does not detect. In the reported experiments, LoKRA improved average accuracy over LoRA by 2.4%-5.0%, and LoKRA+ added a further 0.3%-0.8%. Consider LoKRA or LoKRA+ when adaptation stability and generalization matter. The reported results were obtained on a single AMD Instinct MI300 Accelerator, and the code is available in a public GitHub repository for integration into PEFT workflows.
Key insights
LoKRA improves PEFT by using Kruskal rank to mitigate redundancy in LoRA update matrices, enhancing LLM adaptation.
Principles
- Standard matrix rank fails to capture update redundancy.
- Kruskal rank reflects global and robust independence.
- Higher Kruskal rank improves stability and generalization.
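To make the difference between the two notions of rank concrete, here is a minimal NumPy sketch. It is not taken from the LoKRA code: the helper name `kruskal_rank` and the example matrices are illustrative, and the brute-force search over column subsets is only practical for tiny matrices.

```python
from itertools import combinations

import numpy as np


def kruskal_rank(M, tol=1e-10):
    """Largest k such that EVERY set of k columns of M is linearly independent.

    Brute force over column subsets (hypothetical helper, for illustration only).
    """
    n_cols = M.shape[1]
    k = 0
    for size in range(1, n_cols + 1):
        for cols in combinations(range(n_cols), size):
            # One dependent subset of this size is enough to stop.
            if np.linalg.matrix_rank(M[:, list(cols)], tol=tol) < size:
                return k
        k = size
    return k


# Columns 0 and 1 are parallel: the matrix still has rank 2,
# but one pair of columns is dependent, so the Kruskal rank is 1.
redundant = np.array([[1.0, 2.0, 0.0],
                      [0.0, 0.0, 1.0],
                      [0.0, 0.0, 0.0]])

# Every pair of columns is independent: rank 2 and Kruskal rank 2.
diverse = np.array([[1.0, 0.0, 1.0],
                    [0.0, 1.0, 1.0]])

print(np.linalg.matrix_rank(redundant), kruskal_rank(redundant))
print(np.linalg.matrix_rank(diverse), kruskal_rank(diverse))
```

Both matrices have the same matrix rank, yet only the Kruskal rank exposes the duplicated direction in the first one.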
Method
LoKRA adds a penalty term to the loss that optimizes the Kruskal rank of the LoRA update matrices A and B^T. LoKRA+ replaces matrix multiplication with the Khatri-Rao product, which gives a tighter theoretical lower bound on the Kruskal rank.
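This summary does not state the form of the penalty, so the sketch below is an assumption and not the authors' implementation. It uses mutual coherence (the largest absolute cosine similarity between two distinct columns) as a stand-in surrogate, since a standard bound says the Kruskal rank is at least 1 divided by the coherence. It also assumes that the update is A @ B with A of shape (m, r) and B of shape (r, n), so that A and B^T both have r columns. The names `mutual_coherence`, `coherence_penalty`, and `khatri_rao` are hypothetical.

```python
import numpy as np


def mutual_coherence(M):
    """Largest absolute cosine similarity between two distinct columns of M."""
    cols = M / np.linalg.norm(M, axis=0, keepdims=True)
    gram = np.abs(cols.T @ cols)
    np.fill_diagonal(gram, 0.0)  # ignore each column's similarity with itself
    return gram.max()


def coherence_penalty(A, B):
    """Assumed surrogate penalty: low coherence of A and B^T favors high Kruskal rank."""
    return mutual_coherence(A) + mutual_coherence(B.T)


def khatri_rao(X, Y):
    """Column-wise Kronecker product: column j equals kron(X[:, j], Y[:, j])."""
    if X.shape[1] != Y.shape[1]:
        raise ValueError("X and Y must have the same number of columns")
    return np.einsum("ir,jr->ijr", X, Y).reshape(-1, X.shape[1])


rng = np.random.default_rng(0)
A = rng.standard_normal((6, 2))   # shape (m, r)
B = rng.standard_normal((2, 5))   # shape (r, n)

penalty = coherence_penalty(A, B)  # would be added to the task loss with a weight
kr = khatri_rao(A, B.T)            # shape (m * n, r)
print(penalty, kr.shape)
```

In a training loop the same quantities would be computed in an autodiff framework and added to the task loss with a tunable weight; NumPy is used here only to show the arithmetic.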
In practice
- Apply LoKRA to improve LLM fine-tuning accuracy.
- Use LoKRA+ for stronger empirical performance.
- Access code on GitHub for implementation.
Topics
- Low-Rank Adaptation
- Kruskal Rank
- Parameter-Efficient Fine-Tuning
- Large Language Models
- AMD Instinct MI300
- ICML 2026
Best for: Research Scientist, AI Engineer, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by AMD ROCm Blogs.