How LoRA Remembers? A Parametric Memory Law for LLM Finetuning

Source: Computation and Language · Field: Technology & Digital, Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

A new study introduces the Parametric Memory Law, a robust power law that quantifies the exact parametric memory capacity of Large Language Models (LLMs) finetuned with Low-Rank Adaptation (LoRA). To address the gap left by purely qualitative evaluations, the researchers use LoRA as a controlled memory capacity probe within the latent space. Their analysis reveals a deterministic phase transition at the token level: a prediction probability of p > 0.5 is a sufficient condition for verbatim recall under greedy decoding. Building on this, the study proposes MemFT, a threshold-guided optimization strategy that dynamically redistributes the training budget toward sub-threshold tokens. In the study's empirical evaluations, MemFT improves both memory fidelity and efficiency. Code for this work will be released at https://github.com/zjunlp/ParametricMemoryLaw.
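The p > 0.5 condition can be checked directly: if the target token holds more than half of the probability mass, no other token can outrank it, so greedy decoding must pick it. The following is a minimal sketch of that check, not the authors' code. It assumes teacher-forced logits of shape (sequence length, vocabulary size), and the helper names `target_probs`, `recall_guaranteed`, and `greedy_matches` are hypothetical.

```python
import numpy as np

def target_probs(logits, targets):
    """Softmax probability assigned to each target token (one row per position)."""
    z = logits - logits.max(axis=-1, keepdims=True)  # stabilise the exponentials
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return probs[np.arange(len(targets)), targets]

def recall_guaranteed(logits, targets, threshold=0.5):
    """True if every target token clears the threshold (sufficient condition)."""
    return bool(np.all(target_probs(logits, targets) > threshold))

def greedy_matches(logits, targets):
    """True if the argmax token at every position equals the target token."""
    return bool(np.array_equal(logits.argmax(axis=-1), targets))

# Every target has p > 0.5, so greedy decoding is guaranteed to match.
confident = np.array([[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]])
# Target is still the argmax but has p < 0.5: recalled, though not guaranteed.
marginal = np.array([[1.0, 0.9, 0.8]])

print(recall_guaranteed(confident, np.array([0, 1])), greedy_matches(confident, np.array([0, 1])))
print(recall_guaranteed(marginal, np.array([0])), greedy_matches(marginal, np.array([0])))
```

The second case illustrates why the condition is sufficient rather than necessary: a token below the threshold can still win the argmax, but nothing guarantees it.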

Key takeaway

If you finetune LLMs with LoRA for continuous knowledge updates, the Parametric Memory Law is worth understanding. It quantifies how much information LoRA can store and shows that a token with prediction probability p > 0.5 is recalled verbatim under greedy decoding. Consider implementing MemFT, the proposed threshold-guided optimization strategy, which dynamically focuses the training budget on less-remembered tokens (p ≤ 0.5). In the study's empirical evaluations, this approach improved memory fidelity and training efficiency.

Key insights

The Parametric Memory Law quantifies LoRA's exact memory capacity in LLMs, linking loss reduction to effective parameters and sequence length.

Method

LoRA serves as a controlled memory capacity probe in the latent space, which makes it possible to quantify exact parametric memory systematically. MemFT then dynamically redistributes the training budget toward sub-threshold tokens, those whose prediction probability has not yet exceeded 0.5.
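One simple way to picture threshold-guided redistribution is a loss that ignores tokens already above the threshold and averages only over the rest. The sketch below is an illustrative assumption, not the MemFT algorithm itself, whose exact weighting scheme is not described here. The function name `sub_threshold_loss` and the hard masking rule are hypothetical.

```python
import numpy as np

def sub_threshold_loss(logits, targets, threshold=0.5):
    """Mean negative log-likelihood over tokens with p <= threshold.

    Assumption for illustration: tokens already above the threshold
    contribute nothing, so all gradient signal goes to the tokens
    that are not yet reliably recalled.
    """
    z = logits - logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))  # log-softmax
    token_logp = log_probs[np.arange(len(targets)), targets]
    mask = np.exp(token_logp) <= threshold  # True where the token is sub-threshold
    if not mask.any():
        return 0.0, mask  # everything is already memorised
    return float(-token_logp[mask].mean()), mask

# Position 0 is well memorised; position 1 is uniform over three tokens (p = 1/3).
logits = np.array([[4.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
targets = np.array([0, 1])
loss, mask = sub_threshold_loss(logits, targets)
print(loss, mask)
```

In a real training loop the same mask would be applied to the per-token cross-entropy of the LoRA-adapted model before backpropagation. A soft reweighting instead of a hard mask is an equally plausible design.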

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.