Sharpness Aware Surrogate Training for Spiking Neural Networks

· Source: cs.NE updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Expert, extended

Summary

Sharpness Aware Surrogate Training (SAST) is a novel method for optimizing Spiking Neural Networks (SNNs) that addresses the "surrogate to hard transfer gap" by applying Sharpness Aware Minimization (SAM) to a smooth surrogate forward SNN. Unlike conventional methods that couple a non-smooth forward model with a biased gradient estimator, SAST optimizes a smooth empirical risk, ensuring an exact training gradient for the auxiliary model. The method provides theoretical guarantees including state stability, input Lipschitz bounds, smoothness of the surrogate objective, and nonconvex convergence for stochastic SAST. Empirically, SAST significantly reduces the transfer gap on datasets like N-MNIST (from 65.7% to 94.7% hard spike accuracy) and DVS Gesture (from 31.8% to 63.3% hard spike accuracy), while maintaining high surrogate forward accuracy and improving corruption robustness. The approach requires careful controls for compute matching and threshold calibration for a comprehensive practical assessment.

Key takeaway

Research Scientists developing Spiking Neural Networks should consider integrating Sharpness Aware Surrogate Training (SAST) into their workflow, especially when aiming for robust hard spike deployment. SAST demonstrably improves hard spike accuracy and reduces the critical surrogate-to-hard transfer gap, which is vital for real-world neuromorphic hardware applications. You should rigorously evaluate SAST against compute-matched and calibration-aware baselines to confirm its efficiency and practical superiority for your specific SNN architectures and datasets.

Key insights

SAST optimizes SNNs by applying SAM to a smooth surrogate forward model, reducing the surrogate-to-hard transfer gap.

Principles

Method

SAST applies SAM to a surrogate forward SNN, using backpropagation through time to compute exact gradients of a smooth objective. It involves two SAM passes with state resets and independent minibatches for convergence guarantees.

In practice

Topics

Best for: Research Scientist, AI Researcher, AI Scientist, Deep Learning Engineer

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.NE updates on arXiv.org.