GRASS: Gradient-based Adaptive Layer-wise Importance Sampling for Memory-efficient Large Language Model Fine-tuning

· Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

GRASS (Gradient-based Adaptive Layer-wise Importance Sampling) is a framework for fine-tuning large language models (LLMs) when full-parameter fine-tuning does not fit in memory. Existing low-rank adaptation methods are memory-efficient but often compromise model expressiveness and performance. Existing layer-wise fine-tuning methods rely on static importance sampling, so they cannot adapt when layer importance varies across tasks and training stages. GRASS instead uses mean gradient norms as a dynamic, task-aware, and training-stage-aware estimate of layer importance. It adjusts layer sampling probabilities adaptively during training and adds a layer-wise optimizer state offloading mechanism to reduce memory usage further. In experiments across multiple models and benchmarks, GRASS improves accuracy by up to 4.38 points and reduces memory usage by up to 19.97% compared to state-of-the-art methods.

Key takeaway

For AI engineers and research scientists limited by GPU memory during LLM fine-tuning, GRASS is worth evaluating. It updates only a sampled subset of layers, chosen by gradient-based importance, and the reported results show gains of up to 4.38 accuracy points with up to 19.97% less memory than state-of-the-art methods. It is most relevant for large models where full-parameter updates are infeasible.

Key insights

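GRASS samples which LLM layers to fine-tune according to their mean gradient norms and updates the sampling probabilities as training progresses. The reported effect is lower memory usage (up to 19.97%) together with higher accuracy (up to 4.38 points) than state-of-the-art methods.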

Method

GRASS uses mean gradient norms to estimate layer importance, adaptively adjusts sampling probabilities, and offloads optimizer states to reduce memory during LLM fine-tuning.
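
The sampling step described above can be illustrated with a minimal sketch. The summary does not give the exact formulas, so the following is an assumption rather than the authors' implementation: a layer's importance is taken as the average L2 norm of its parameter gradients, sampling probabilities are set proportional to that value, and layers are drawn without replacement. All function names are hypothetical, gradients are plain Python lists standing in for tensors, and optimizer state offloading is not covered.

```python
import math
import random
from typing import List, Sequence


def mean_gradient_norm(param_grads: Sequence[Sequence[float]]) -> float:
    """Average L2 norm over one layer's parameter gradients (assumed metric)."""
    norms = [math.sqrt(sum(g * g for g in grad)) for grad in param_grads]
    return sum(norms) / len(norms)


def sampling_probabilities(layer_norms: Sequence[float]) -> List[float]:
    """Normalize per-layer importance into probabilities (assumed: proportional)."""
    total = sum(layer_norms)
    if total <= 0.0:
        # No gradient signal yet: fall back to uniform sampling.
        return [1.0 / len(layer_norms)] * len(layer_norms)
    return [n / total for n in layer_norms]


def sample_layers(probs: Sequence[float], k: int, rng: random.Random) -> List[int]:
    """Draw k distinct layer indices, weighted by probs, without replacement."""
    remaining = list(range(len(probs)))
    # A tiny floor keeps the total weight positive even if every prob is zero.
    weights = [p + 1e-12 for p in probs]
    chosen = []
    for _ in range(min(k, len(remaining))):
        pos = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        chosen.append(remaining.pop(pos))
        weights.pop(pos)
    return sorted(chosen)


# Toy gradients for three layers; each inner list is one flattened parameter gradient.
demo_grads = [
    [[3.0, 4.0]],              # layer 0: norm 5.0
    [[0.0, 1.0], [0.0, 3.0]],  # layer 1: norms 1.0 and 3.0, mean 2.0
    [[0.0, 3.0]],              # layer 2: norm 3.0
]
layer_norms = [mean_gradient_norm(g) for g in demo_grads]
probs = sampling_probabilities(layer_norms)  # [0.5, 0.2, 0.3]

# Layers to unfreeze for the next training interval; the rest stay frozen.
selected = sample_layers(probs, k=2, rng=random.Random(0))
```

In a training loop, the norms would be re-estimated periodically so that the probabilities follow the current task and training stage, which is the adaptive behavior the summary attributes to GRASS. How often to refresh them, and whether to smooth the estimates, are design choices this sketch leaves open.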

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.