SILAGE: Memory-Efficient, Full-Gradient-Free Nonconvex Optimization for Nested Finite Sums

Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Mathematics & Computational Sciences · Depth: Expert, quick

Summary

SILAGE is a new variance-reduced algorithm for memory-efficient, full-gradient-free nonconvex optimization. It targets empirical risk minimization on massive datasets with a nested (double) finite-sum structure: N = nm total samples partitioned into n blocks of size m. Recursive estimators such as PAGE require periodic global full-gradient refreshes over all nm samples, which are computationally expensive; SILAGE eliminates these refreshes by evaluating at most one local group gradient per iteration. It also needs only O(n) memory, whereas single-loop methods such as SILVER need an O(nm) footprint that is impractical at this scale. SILAGE's convergence analysis adapts to data geometry through across-group (δ₁) and within-group (δ₂) heterogeneity, yielding improved bounds over existing methods in several practical scenarios.
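For reference, the nested finite-sum objective described above can be written out as follows. The symbols x, d, f_i and f_{ij} are notation assumed here for illustration; the summary itself only fixes n, m and N = nm.

```latex
% Assumed notation: x is the parameter vector in R^d,
% f_{ij} is the loss on sample j of block i, f_i is the block-i average.
\min_{x \in \mathbb{R}^d} \; f(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x),
\qquad
f_i(x) = \frac{1}{m} \sum_{j=1}^{m} f_{ij}(x),
\qquad
N = nm .
```

Under this notation, a "local group gradient" is \nabla f_i(x) for a single block i, and a "global full gradient" is \nabla f(x) over all nm samples.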

Key takeaway

For machine learning engineers optimizing models on massive, nested datasets, SILAGE offers an alternative to traditional variance-reduced methods. It performs nonconvex optimization with only O(n) memory and without costly global full-gradient refreshes. When data is partitioned into n blocks of size m, this lets training scale without the impractical O(nm) memory overhead of other single-loop approaches.

Key insights

SILAGE optimizes nested finite sums with O(n) memory and no global full-gradient refreshes, adapting to data heterogeneity.

Principles

Method

SILAGE is a variance-reduced algorithm that exploits a double-sum structure, evaluating at most one local group gradient per iteration to avoid global full-gradient refreshes while maintaining O(n) memory.
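To make the cost difference concrete, here is a minimal sketch of the double-sum structure on a toy least-squares problem. It is not SILAGE's update rule, which this summary does not specify; the least-squares loss, the array layout, and the helper names `group_gradient` and `full_gradient` are all assumptions chosen for illustration. It only shows that one group gradient touches m samples, while a global full-gradient refresh touches all nm.

```python
import numpy as np

def group_gradient(x, A, b, i):
    """Gradient of block i's average loss (hypothetical helper).

    Assumes a least-squares loss 0.5 * (a @ x - b)**2 per sample.
    Touches only the m samples of block i.
    """
    m = A.shape[1]
    residual = A[i] @ x - b[i]          # shape (m,)
    return A[i].T @ residual / m        # shape (d,)

def full_gradient(x, A, b):
    """Global full gradient: the average of all n group gradients.

    Touches all n * m samples; this is the refresh a
    full-gradient-free method avoids.
    """
    n = A.shape[0]
    return sum(group_gradient(x, A, b, i) for i in range(n)) / n

rng = np.random.default_rng(0)
n, m, d = 8, 16, 4                      # n blocks of size m, dimension d
A = rng.normal(size=(n, m, d))          # features, grouped by block
b = rng.normal(size=(n, m))             # targets, grouped by block
x = rng.normal(size=d)

# The nested average equals the flat average over all N = n * m samples.
A_flat, b_flat = A.reshape(n * m, d), b.reshape(n * m)
g_flat = A_flat.T @ (A_flat @ x - b_flat) / (n * m)
g_full = full_gradient(x, A, b)

# Sample-gradient evaluations per call.
evals_one_group = m
evals_full_refresh = n * m
```

In this toy setting, a single call to `group_gradient` costs a factor of n fewer sample evaluations than `full_gradient`, which is the per-iteration saving the method description refers to.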

Best for: Research Scientist, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.