LoRA but with Only 13 Parameters??

2025-07-07 · Source: The Kaitchup – AI on a Budget · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Intermediate, medium

Summary

Researchers from Meta's Fundamental AI Research (FAIR) group and academic collaborators have introduced TinyLoRA, a method that improves an LLM's mathematical reasoning by updating as few as 13 parameters, which is 26 bytes in bfloat16 (2 bytes per parameter). The approach targets the high cost of reinforcement learning (RL) reasoning runs, where even conventional LoRA adapters typically require millions of trainable parameters. TinyLoRA replaces LoRA's small trainable matrix with an even smaller trainable vector that is projected through a fixed random tensor. The vector can also be tied across modules, reducing the trainable parameter count to as few as single digits. In experiments on GSM8K, Qwen2.5-7B-Instruct improves from 88% to ~91% accuracy with only 13 trainable parameters, and at these ultra-tiny update sizes RL proves more effective than Supervised Fine-Tuning (SFT). The method's efficacy is notably architecture-dependent: Qwen models respond significantly better than Llama models.
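
To put the parameter counts in perspective, here is a rough back-of-the-envelope calculation. Only the 13-parameter figure comes from the summary; the model dimensions (hidden size 4096, 32 layers, rank-8 LoRA on two square projections per layer) are hypothetical round numbers chosen for illustration, and `lora_params` is a helper invented for this sketch.

```python
# Back-of-the-envelope parameter counts. All model dimensions below are
# hypothetical round numbers, not taken from the article.

BYTES_PER_BF16 = 2  # bfloat16 stores each value in 16 bits


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters of one standard LoRA adapter.

    LoRA trains A (rank x d_in) and B (d_out x rank) for each adapted module.
    """
    return rank * (d_in + d_out)


# Hypothetical setup: rank-8 LoRA on two square projections per layer.
hidden, layers, modules_per_layer, rank = 4096, 32, 2, 8
lora_total = layers * modules_per_layer * lora_params(hidden, hidden, rank)

tiny_total = 13  # the figure quoted in the summary

print(f"LoRA:     {lora_total:,} params, {lora_total * BYTES_PER_BF16:,} bytes in bf16")
print(f"TinyLoRA: {tiny_total:,} params, {tiny_total * BYTES_PER_BF16:,} bytes in bf16")
```

Under these assumed dimensions, the conventional adapter has about 4.2 million trainable parameters, against 13 for the TinyLoRA configuration described above.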

Key takeaway

For AI Engineers and Research Scientists optimizing LLM performance on a budget, TinyLoRA offers a way to improve mathematical reasoning while training only a handful of parameters. Your teams should investigate applying TinyLoRA, particularly with Qwen models, where the reported GSM8K gain is about three points (88% to ~91% on Qwen2.5-7B-Instruct) from just 13 trainable parameters. This makes fine-tuning for reasoning far more parameter-efficient, which helps when resources are limited.

Key insights

TinyLoRA improves LLM math reasoning while updating as few as 13 parameters, and it relies on RL rather than SFT to do so.

Principles

RL is more effective than SFT for ultra-tiny parameter updates.
Model architecture significantly impacts parameter efficiency.

Method

TinyLoRA replaces LoRA's trainable matrix with a smaller trainable vector that is projected through a fixed random tensor. The same vector can be tied across modules, which reduces the number of trainable parameters even further.
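
The description above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' implementation. In particular, the frozen low-rank factors are taken from a truncated SVD of each frozen weight, which is an assumption, since the summary does not say how they are obtained. The names `frozen_factors`, `tiny_delta`, `adapted_weights`, and the toy dimensions are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def frozen_factors(W, r):
    """Frozen rank-r factors of a frozen weight W.

    Assumption: a truncated SVD is used here; the summary does not specify this.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :r] * S[:r], Vt[:r, :]  # shapes (d_out, r) and (r, d_in)


def tiny_delta(v, P, left, right):
    """Weight update built from a u-dimensional trainable vector v.

    P is a fixed random tensor of shape (u, r, r). Contracting v with P
    gives the small r x r matrix that a trainable matrix would otherwise fill.
    """
    R = np.tensordot(v, P, axes=1)  # (r, r), equal to sum_i v[i] * P[i]
    return left @ R @ right         # (d_out, d_in), rank at most r


# Toy dimensions (hypothetical): two modules, rank 2, 13 trainable values.
d_out, d_in, r, u = 8, 6, 2, 13
weights = [rng.standard_normal((d_out, d_in)) for _ in range(2)]
factors = [frozen_factors(W, r) for W in weights]
projections = [rng.standard_normal((u, r, r)) for _ in weights]  # fixed, never trained

# The only trainable parameters. The same vector is tied across both modules.
v = np.zeros(u)


def adapted_weights(v):
    """Apply the shared vector v to every module's frozen weight."""
    return [
        W + tiny_delta(v, P, left, right)
        for W, P, (left, right) in zip(weights, projections, factors)
    ]


print("trainable parameters:", v.size)
```

Starting `v` at zero leaves every adapted weight equal to its frozen original, so training begins from the base model. Because the vector is shared, adding more modules adds fixed random tensors but no new trainable parameters.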

In practice

Use TinyLoRA for cost-effective LLM reasoning improvements.
Prioritize Qwen models for TinyLoRA applications.

Topics

TinyLoRA
Parameter-Efficient Fine-Tuning
Reinforcement Learning
LLM Quantization
Qwen Models

Best for: AI Scientist, Research Scientist, AI Engineer, AI Researcher, Machine Learning Engineer, Deep Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by The Kaitchup – AI on a Budget.