Finding Sparse Subnetworks in One Training Cycle via Progressive Magnitude-Based Pruning

2026-06-10 · Source: Takara TLDR - Daily AI Papers · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, medium

Summary

Progressive magnitude-based pruning is a single-cycle method for sparsifying neural networks, avoiding the multiple training cycles required by approaches based on the Lottery Ticket Hypothesis (LTH). The technique raises sparsity gradually during training on a linear schedule and updates the pruning masks from the magnitudes of the weights that are still active. It was evaluated in systematic experiments on CIFAR-10 and MNIST with ResNet, VGG-style, and LeNet architectures. On CIFAR-10, ResNet-18 reached 95.12% accuracy at 72.9% sparsity, compared with LTH's reported 90.5%. At extreme sparsity, a VGG-like architecture reached 93.13% accuracy at 97% sparsity, above SNIP's approximately 92.0%, and VGG-19 reached 93.44% accuracy at 97.97% sparsity, compared with GraSP's 92.19% at 98% sparsity. Accuracy on ResNet-18 stayed within 0.1 percentage points of the dense baseline across 70-85% sparsity.

Key takeaway

For machine learning engineers preparing models for deployment, progressive magnitude-based pruning reaches high sparsity with competitive accuracy in a single training cycle, avoiding the repeated training runs that iterative pruning methods require. That shortens the model development workflow and reduces the training time and compute needed to produce compact networks. It is worth evaluating wherever iterative pruning currently dominates your sparsification cost.

Key insights

Progressive magnitude-based pruning enables effective neural network sparsification within a single training cycle.

Principles

Iterative pruning methods often demand multiple training cycles.
Sparsity can be increased progressively during a single training run.
Weight magnitudes can guide dynamic pruning mask updates.

Method

Sparsity is gradually increased via a linear schedule throughout training, with pruning masks dynamically updated based on the magnitudes of active weights.
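
The method described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `linear_sparsity` and `update_mask`, the starting sparsity of zero, the choice to update the mask at every step, the single global weight tensor, and the rule that pruned weights never regrow are all assumptions made for the example. The summary only states that sparsity follows a linear schedule and that masks are updated from the magnitudes of active weights.

```python
import numpy as np

def linear_sparsity(step, total_steps, final_sparsity, start_sparsity=0.0):
    """Linearly interpolate the target sparsity over training (assumed schedule form)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start_sparsity + (final_sparsity - start_sparsity) * frac

def update_mask(weights, mask, target_sparsity):
    """Prune the smallest-magnitude *active* weights until the target sparsity is met.

    Assumption: once a weight is pruned it stays pruned (no regrowth).
    """
    n_total = weights.size
    n_target_pruned = int(round(target_sparsity * n_total))
    n_already_pruned = int(n_total - mask.sum())
    n_to_prune = n_target_pruned - n_already_pruned
    if n_to_prune <= 0:
        return mask  # already at or above the target sparsity

    active_idx = np.flatnonzero(mask)                 # flat indices of surviving weights
    magnitudes = np.abs(weights.ravel()[active_idx])  # rank only the active weights
    # Positions (within active_idx) of the n_to_prune smallest magnitudes.
    smallest = np.argpartition(magnitudes, n_to_prune - 1)[:n_to_prune]

    new_mask = mask.copy().ravel()
    new_mask[active_idx[smallest]] = False
    return new_mask.reshape(mask.shape)

# Toy loop: the weights are random and never trained; a real run would apply
# an optimizer step where indicated and would typically hold several layers.
rng = np.random.default_rng(0)
weights = rng.normal(size=(64, 64))
mask = np.ones_like(weights, dtype=bool)
total_steps = 100

for step in range(1, total_steps + 1):
    # (optimizer update of `weights` would go here)
    target = linear_sparsity(step, total_steps, final_sparsity=0.9)
    mask = update_mask(weights, mask, target)
    weights = weights * mask  # keep pruned weights at exactly zero

achieved_sparsity = 1.0 - mask.mean()
print(f"achieved sparsity: {achieved_sparsity:.4f}")
```

Because the mask is tightened a little at every step, the network is never asked to absorb a large pruning shock at once, which is the intuition behind a progressive schedule. Whether the paper prunes per layer or globally, and how often it updates the mask, is not stated in this summary.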

In practice

Apply single-cycle pruning to ResNet, VGG, and LeNet architectures.
Target roughly 70-98% sparsity on image classification tasks; on ResNet-18, reported accuracy stayed within 0.1 percentage points of the dense baseline at 70-85% sparsity.
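
To put the reported sparsity levels in perspective, the short sketch below converts a sparsity fraction into the share of weights kept and the corresponding reduction in nonzero weight count. The helper name `compression_ratio` is hypothetical, and the ratio counts parameters only; actual storage and speed gains depend on the sparse format and hardware, which the summary does not cover.

```python
def compression_ratio(sparsity):
    """Dense weight count divided by the number of weights that remain nonzero."""
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    return 1.0 / (1.0 - sparsity)

# Sparsity levels quoted in the summary for ResNet-18, the VGG-like model, and VGG-19.
for s in (0.729, 0.97, 0.9797):
    print(f"{s:.2%} sparsity -> {1 - s:.2%} of weights kept, "
          f"about {compression_ratio(s):.1f}x fewer nonzero weights")
```

The relationship is strongly nonlinear near the top of the range: moving from 97% to 97.97% sparsity removes less than one percentage point of the weights but cuts the remaining nonzero count by roughly a third.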

Topics

Neural Network Pruning
Model Sparsification
Progressive Pruning
Lottery Ticket Hypothesis
Deep Learning Architectures
Model Compression

Best for: Research Scientist, AI Engineer, Computer Vision Engineer, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.