DYNA-PRUNER: Input-Adaptive Data-Model Co-Pruning for Efficient and Scalable Spatio-Temporal Media Prediction
Summary
Dyna-Pruner is an end-to-end framework for input-dependent co-pruning of data and model structure. It targets the high computational cost of spatio-temporal prediction models used in applications such as radar/satellite nowcasting and traffic monitoring. These models compute densely over every input even when much of that input is redundant, as with calm seas or clear skies. Dyna-Pruner uses a shared-importance synchronization mechanism to generate coupled masks that prune redundant regions together with their corresponding computational units, producing a per-sample sparse sub-network at inference. Evaluated on the WeatherBench, SEVIR, and TaxiBJ datasets, it integrates with CNN, RNN, and Transformer backbones and achieves up to a 70% reduction in FLOPs and a 2.5x speedup on NVIDIA Jetson AGX Orin, with less than 1% accuracy loss.
Key takeaway
AI engineers deploying spatio-temporal prediction models on resource-constrained edge devices such as NVIDIA Jetson AGX Orin should consider input-adaptive co-pruning frameworks such as Dyna-Pruner. The approach delivers up to a 70% FLOPs reduction and a 2.5x speedup with less than 1% accuracy loss. Evaluate its integration with your existing CNN, RNN, or Transformer backbone to enable scalable, real-time applications.
Key insights
Dyna-Pruner co-prunes data and model structure input-adaptively, creating sparse sub-networks for efficient spatio-temporal media prediction.
Principles
- Input-dependent redundancy drives model inefficiency.
- Co-pruning data and model structure is effective.
- Per-sample sparse sub-networks reduce inference cost.
Method
Dyna-Pruner uses a shared-importance synchronization mechanism to generate coupled masks. These masks prune redundant input regions and corresponding computational units (e.g., convolutional filters), yielding per-sample sparse sub-networks at inference time.
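The coupling idea can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The importance proxy (mean absolute frame-to-frame change per patch), the threshold, and the rule that keeps a share of channels proportional to the share of kept patches are all assumptions made for illustration, and the names `patch_importance` and `coupled_masks` are hypothetical.

```python
import numpy as np

def patch_importance(frames, patch):
    """Score each spatial patch by its mean absolute temporal change.

    Assumption: temporal change is a stand-in for the learned importance
    scores; the summary does not say how importance is computed.
    frames has shape (T, H, W) with T >= 2.
    """
    t, h, w = frames.shape
    if t < 2:
        raise ValueError("need at least two frames to measure change")
    change = np.abs(np.diff(frames, axis=0)).mean(axis=0)  # (H, W)
    gh, gw = h // patch, w // patch
    tiles = change[:gh * patch, :gw * patch].reshape(gh, patch, gw, patch)
    return tiles.mean(axis=(1, 3))  # (gh, gw)

def coupled_masks(frames, n_channels, patch=4, threshold=0.05, min_channels=1):
    """Derive a data mask and a channel mask from the same importance scores."""
    scores = patch_importance(frames, patch)
    data_mask = scores > threshold          # which input patches to keep
    keep_frac = float(data_mask.mean())     # shared signal for both masks
    # Hypothetical coupling rule: keep channels in proportion to kept patches.
    n_keep = max(min_channels, int(np.ceil(keep_frac * n_channels)))
    channel_mask = np.zeros(n_channels, dtype=bool)
    channel_mask[:n_keep] = True            # assumes channels are pre-sorted by importance
    return data_mask, channel_mask

# Example: only the top-left quadrant changes over time.
frames = np.zeros((3, 8, 8))
frames[:, :4, :4] = np.arange(3).reshape(3, 1, 1)
data_mask, channel_mask = coupled_masks(frames, n_channels=8, patch=4)
print(data_mask)
print(channel_mask)
```

In this toy setup a mostly static input keeps few patches and few channels, while a busy input keeps most of both, which is the per-sample behavior the method describes.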
In practice
- Deploy efficient spatio-temporal models on edge devices.
- Optimize CNN, RNN, and Transformer backbones.
- Reduce FLOPs by up to 70% for real-time tasks.
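When planning a deployment, it helps to keep the FLOPs figure and the speedup figure separate. The short calculation below shows what a 70% FLOPs reduction would imply if latency scaled perfectly with FLOPs, and compares it with the reported 2.5x. The helper name `ideal_speedup` is hypothetical, and perfect scaling is an idealization, not a claim from the source.

```python
def ideal_speedup(flops_reduction):
    """Speedup if runtime were exactly proportional to remaining FLOPs."""
    if not 0.0 <= flops_reduction < 1.0:
        raise ValueError("flops_reduction must be in [0, 1)")
    return 1.0 / (1.0 - flops_reduction)

ideal = ideal_speedup(0.70)   # about 3.33x
reported = 2.5                # speedup reported on Jetson AGX Orin
realized = reported / ideal   # share of the ideal speedup achieved, about 0.75

print(f"ideal {ideal:.2f}x, reported {reported:.1f}x, realized {realized:.0%}")
```

The reported speedup is roughly three quarters of the idealized bound. The summary does not explain the gap, so measure latency on your own hardware and do not infer it from FLOPs alone.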
Topics
- Model Pruning
- Spatio-Temporal Prediction
- Deep Learning Efficiency
- Edge AI
- Neural Network Optimization
- NVIDIA Jetson AGX Orin
Best for: Computer Vision Engineer, Research Scientist, Machine Learning Engineer, AI Engineer, AI Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.