DYNA-PRUNER: Input-Adaptive Data-Model Co-Pruning for Efficient and Scalable Spatio-Temporal Media Prediction

Source: Computer Vision and Pattern Recognition · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Computer Vision & Pattern Recognition) · Depth: Expert, quick

Summary

Dyna-Pruner is an end-to-end framework for input-dependent co-pruning of data and model structure. It targets the high computational cost of modern spatio-temporal prediction models used in applications such as radar/satellite nowcasting and traffic monitoring. These models compute densely over every input even when much of that input is redundant, for example regions of calm sea or clear sky. Dyna-Pruner uses a shared-importance synchronization mechanism to generate coupled masks that prune redundant input regions together with their corresponding computational units, yielding a per-sample sparse sub-network at inference. Evaluated on the WeatherBench, SEVIR, and TaxiBJ datasets, it integrates with CNN, RNN, and Transformer backbones and achieves up to a 70% reduction in FLOPs and a 2.5x speedup on NVIDIA Jetson AGX Orin, with accuracy loss below 1%.

Key takeaway

AI engineers deploying spatio-temporal prediction models on resource-constrained edge devices such as NVIDIA Jetson AGX Orin should consider input-adaptive co-pruning frameworks such as Dyna-Pruner. The reported gains are up to a 70% FLOPs reduction and a 2.5x speedup with less than 1% accuracy loss. Evaluate how it integrates with your existing CNN, RNN, or Transformer backbone to support scalable, real-time applications.

Key insights

Dyna-Pruner co-prunes data and model structure input-adaptively, creating sparse sub-networks for efficient spatio-temporal media prediction.

Principles

Method

Dyna-Pruner uses a shared-importance synchronization mechanism to generate coupled masks. These masks prune redundant input regions and corresponding computational units (e.g., convolutional filters), yielding per-sample sparse sub-networks at inference time.
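
The coupling idea can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the summary does not specify how importance is computed or how the two masks are synchronized. The sketch assumes that patch importance is the mean absolute temporal change, that a fixed threshold selects active regions, and that the fraction of active regions sets how many filters are kept, ranked by L1 norm. The names `patch_importance`, `coupled_masks`, `patch`, and `threshold` are hypothetical.

```python
import numpy as np


def patch_importance(frames, patch=4):
    """Score each spatial patch by its mean absolute temporal change.

    frames: array of shape (T, H, W); H and W must be divisible by `patch`.
    Returns an array of shape (H // patch, W // patch).
    Assumption: temporal change stands in for a learned importance score.
    """
    _, h, w = frames.shape
    change = np.abs(np.diff(frames, axis=0)).mean(axis=0)  # (H, W)
    blocks = change.reshape(h // patch, patch, w // patch, patch)
    return blocks.mean(axis=(1, 3))


def coupled_masks(frames, filters, patch=4, threshold=0.05):
    """Derive a region mask and a filter mask from one shared importance map.

    filters: array of shape (F, C, k, k), hypothetical convolution weights.
    """
    importance = patch_importance(frames, patch)
    region_mask = importance > threshold      # data-side mask
    active_fraction = region_mask.mean()      # shared signal in [0, 1]

    # Model-side mask: keep a share of filters proportional to the share of
    # active regions, ranked by L1 norm (an illustrative stand-in only).
    n_filters = filters.shape[0]
    n_keep = max(1, int(np.ceil(active_fraction * n_filters)))
    l1 = np.abs(filters).reshape(n_filters, -1).sum(axis=1)
    filter_mask = np.zeros(n_filters, dtype=bool)
    filter_mask[np.argsort(l1)[::-1][:n_keep]] = True
    return region_mask, filter_mask


# Toy input: only the top-left quadrant changes over time.
rng = np.random.default_rng(0)
frames = np.zeros((5, 16, 16))
frames[:, :8, :8] = rng.normal(size=(5, 8, 8))
filters = rng.normal(size=(8, 1, 3, 3))

region_mask, filter_mask = coupled_masks(frames, filters)
print(int(region_mask.sum()), int(filter_mask.sum()))
```

In this toy input, 4 of the 16 patches are active, so 2 of the 8 filters are kept. A fully static input would deactivate every region and keep a single filter. Because both masks derive from the same importance map, a quieter input shrinks the data and the model together, which is the property the coupled masks are meant to provide.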

Best for: Computer Vision Engineer, Research Scientist, Machine Learning Engineer, AI Engineer, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.