Finding Sparse Subnetworks in One Training Cycle via Progressive Magnitude-Based Pruning

Source: Takara TLDR - Daily AI Papers · Field: Technology & Digital, Artificial Intelligence & Machine Learning · Depth: Expert, medium

Summary

Progressive magnitude-based pruning is a single-cycle method for sparsifying neural networks. It avoids the multiple training cycles required by approaches built on the Lottery Ticket Hypothesis (LTH). Sparsity increases gradually during training along a linear schedule, and the pruning masks are updated based on the magnitudes of the weights that are still active. The method was evaluated systematically on CIFAR-10 and MNIST using ResNet, VGG-style, and LeNet architectures. On CIFAR-10, it achieved 95.12% accuracy on ResNet-18 at 72.9% sparsity, compared with the 90.5% reported for LTH. At extreme sparsity, it reached 93.13% accuracy on a VGG-like architecture at 97% sparsity, compared with approximately 92.0% for SNIP, and 93.44% accuracy on VGG-19 at 97.97% sparsity, compared with 92.19% for GraSP at 98% sparsity. On ResNet-18, accuracy remained within 0.1 percentage points of the dense baseline across the 70-85% sparsity range.

Key takeaway

For machine learning engineers optimizing models for deployment, progressive magnitude-based pruning reaches high sparsity with competitive accuracy in a single training cycle, removing the repeated training runs that iterative pruning methods require. This can shorten the sparsification workflow and reduce the training time and compute needed to produce compact networks. Consider it if iterative pruning is a bottleneck in your pipeline.

Key insights

Progressive magnitude-based pruning enables effective neural network sparsification within a single training cycle.

Method

Sparsity is gradually increased via a linear schedule throughout training, with pruning masks dynamically updated based on the magnitudes of active weights.
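
The schedule and mask update described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the source states only that the schedule is linear and that masks are updated from the magnitudes of active weights. The function names `linear_sparsity` and `update_mask`, the starting sparsity of 0, the ramp spanning all training steps, the per-step update frequency, the single global ranking over one weight tensor, and the choice never to regrow pruned weights are all assumptions made for the example.

```python
import numpy as np


def linear_sparsity(step, total_steps, final_sparsity, start_sparsity=0.0):
    """Target sparsity at `step`, ramping linearly from start to final.

    Assumption: the ramp covers the whole run; the source only says "linear".
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start_sparsity + (final_sparsity - start_sparsity) * frac


def update_mask(weights, mask, target_sparsity):
    """Prune the smallest-magnitude *active* weights to reach the target.

    Weights that are already masked out stay pruned (assumed: no regrowth).
    """
    n_total = weights.size
    n_target_pruned = int(round(target_sparsity * n_total))
    n_already_pruned = int(n_total - mask.sum())
    n_new = n_target_pruned - n_already_pruned
    if n_new <= 0:
        return mask  # target already met; nothing to prune

    # Rank only the weights that are still active.
    active_idx = np.flatnonzero(mask)
    magnitudes = np.abs(weights.ravel()[active_idx])

    # Positions of the n_new smallest magnitudes among the active weights.
    smallest = np.argpartition(magnitudes, n_new - 1)[:n_new]

    new_mask = mask.ravel().copy()
    new_mask[active_idx[smallest]] = False
    return new_mask.reshape(mask.shape)


# Toy loop on a random weight matrix (stands in for one layer of a network).
rng = np.random.default_rng(0)
weights = rng.normal(size=(64, 64))
mask = np.ones_like(weights, dtype=bool)
total_steps = 100

for step in range(1, total_steps + 1):
    # A real training loop would apply a gradient update to the
    # active weights here before the mask is refreshed.
    target = linear_sparsity(step, total_steps, final_sparsity=0.9)
    mask = update_mask(weights, mask, target)
    weights = weights * mask  # keep pruned weights at zero

print(f"final sparsity: {1.0 - mask.mean():.4f}")
```

In a real framework the mask would typically be applied per layer or across all layers, and reapplied after every optimizer step so that pruned weights do not drift away from zero; which of these the method uses is not stated in this summary.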

Best for: Research Scientist, AI Engineer, Computer Vision Engineer, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.