How LoRA Remembers? A Parametric Memory Law for LLM Finetuning

2026-05-28 · Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

A new study introduces the Parametric Memory Law, a robust power law that quantifies the exact parametric memory capacity of Large Language Models (LLMs) finetuned with Low-Rank Adaptation (LoRA). Where earlier evaluations of finetuning memory were largely qualitative, the researchers use LoRA as a controlled probe of memory capacity in the latent space. Their analysis reveals a deterministic phase transition at the token level: a prediction probability of p > 0.5 is a sufficient condition for verbatim recall under greedy decoding. Building on this, the study proposes MemFT, a threshold-guided optimization strategy that dynamically redistributes the training budget toward sub-threshold tokens. In the paper's empirical evaluations, MemFT improves both memory fidelity and efficiency. Code for this work will be released at https://github.com/zjunlp/ParametricMemoryLaw.

Key takeaway

If you finetune LLMs with LoRA for continuous knowledge updates, the Parametric Memory Law gives you a quantitative account of how much a LoRA adapter can memorize. It also gives you a concrete recall criterion: tokens whose prediction probability exceeds 0.5 are reproduced verbatim under greedy decoding. Consider implementing MemFT, the proposed threshold-guided optimization strategy, which dynamically focuses the training budget on tokens that have not yet crossed that threshold (p ≤ 0.5). In the paper's empirical evaluations, this improved memory fidelity and training efficiency.

Key insights

The Parametric Memory Law quantifies LoRA's exact memory capacity in LLMs, linking loss reduction to effective parameters and sequence length.

Principles

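Loss reduction (ΔL) follows a power law in effective parameters and sequence length.
A prediction probability p > 0.5 on the target token is sufficient for verbatim recall under greedy decoding, because a token holding more than half the probability mass is necessarily the argmax.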
Dynamic training budget redistribution improves memory fidelity.
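
The threshold principle can be checked with a few lines of Python. This is a minimal sketch, not code from the paper: the function names are hypothetical, and the probabilities are assumed to be per-step distributions computed with the correct prefix (teacher forcing). It shows that p > 0.5 on every target token guarantees greedy decoding reproduces the sequence, and that the condition is sufficient but not necessary.

```python
def greedy_token(probs):
    """Index of the highest-probability token (what greedy decoding picks)."""
    return max(range(len(probs)), key=probs.__getitem__)


def recalled_verbatim(step_probs, target_ids):
    """True if greedy decoding picks the target token at every step.

    step_probs[i] is the (assumed) next-token distribution at step i given
    the correct prefix. If every step matches, free-running greedy decoding
    reproduces the same prefix at each step, so the whole sequence is recalled.
    """
    return all(greedy_token(p) == t for p, t in zip(step_probs, target_ids))


def all_above_threshold(step_probs, target_ids, threshold=0.5):
    """True if every target token has probability strictly above threshold."""
    return all(p[t] > threshold for p, t in zip(step_probs, target_ids))


# Every target token holds more than half the mass, so recall is guaranteed.
confident = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1]]
targets = [0, 1]

# The target is still the argmax at p = 0.4, so recall succeeds without the
# threshold being met: the condition is sufficient, not necessary.
borderline = [[0.4, 0.3, 0.3]]
borderline_targets = [0]
```

The guarantee follows from the distribution summing to one: if the target has p > 0.5, every other token has at most 1 - p < 0.5, so no competitor can outrank it.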

Method

LoRA is used as a controlled memory capacity probe in the latent space to systematically quantify exact parametric memory. MemFT then applies the resulting threshold during training, dynamically redistributing the training budget toward sub-threshold tokens (p ≤ 0.5).
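
One way to picture threshold-guided budget redistribution is a loss that only counts tokens still at or below the threshold. The sketch below is an illustration under stated assumptions, not the authors' MemFT implementation: this summary does not specify how the budget is redistributed, so the hard mask, the function name `threshold_masked_loss`, and the averaging scheme are all hypothetical choices.

```python
import math


def threshold_masked_loss(target_probs, threshold=0.5, eps=1e-12):
    """Mean negative log-likelihood over sub-threshold tokens only.

    target_probs: the model's current probability for each target token.
    Tokens with p > threshold are treated as already memorized and are
    dropped from the loss (an assumed hard mask, not the paper's stated rule).
    Returns (loss, number_of_active_tokens).
    """
    active = [p for p in target_probs if p <= threshold]
    if not active:
        # Every token is above threshold: nothing left to train on.
        return 0.0, 0
    # eps guards against log(0) for tokens the model assigns zero probability.
    loss = sum(-math.log(max(p, eps)) for p in active) / len(active)
    return loss, len(active)


# The token at 0.9 is skipped; the tokens at 0.5 and 0.25 share the budget.
loss, n_active = threshold_masked_loss([0.9, 0.5, 0.25])
```

In a real training loop the same mask would be recomputed at each step from the model's current predictions, so tokens drop out of the loss as they cross the threshold and the remaining updates concentrate on the ones that have not.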

In practice

Apply MemFT to improve LoRA finetuning efficiency.
Target tokens with p ≤ 0.5 for focused memory updates.

Topics

Large Language Models
LoRA Finetuning
Parametric Memory Law
MemFT Optimization
Memory Fidelity
Greedy Decoding

Code references

zjunlp/ParametricMemoryLaw

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.