Selective Attention-Based Network for Robust Infrared Small Target Detection

· Source: cs.CV updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

SANet, a Selective Attention-based Network, is a U-Net-based model for infrared small target detection (IRSTD) that adds two components to address limitations of existing deep learning methods: a Dual-path Semantic-aware Module (DSM) and a Selective Attention Fusion Module (SAFM). The DSM combines standard convolutions, which capture local detail, with pinwheel-shaped convolutions, which provide expanded, direction-sensitive receptive fields, and is enhanced by a Convolutional Block Attention Module (CBAM). The SAFM replaces static skip connections with a spatially adaptive, learnable weighting mechanism, making cross-scale feature fusion context-aware. On the NUAA-SIRST, IRSTD-1K, and NUDT-SIRST benchmarks, SANet outperforms fourteen state-of-the-art methods, with IoU improvements of 1.93%, 4.32%, and 2.21%, respectively, over the second-best approach on each dataset. These results indicate strong generalization across datasets and practical applicability.
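Since the reported gains are stated in IoU, a minimal sketch of that metric for a single image may help. It assumes binary masks given as nested lists; the function name `binary_iou`, the 0.5 threshold, and the convention of returning 1.0 when both masks are empty are illustrative choices, not details taken from the paper. Benchmarks can also differ in how they aggregate IoU across images, which this sketch does not cover.

```python
def binary_iou(pred, target, threshold=0.5):
    """Intersection over union between a predicted map and a ground-truth mask.

    `pred` holds scores in [0, 1]; `target` holds 0/1 labels.
    Both are nested lists of equal shape (hypothetical input format).
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p_bin = p >= threshold   # binarize the prediction
            t_bin = t >= 0.5         # ground truth is already 0/1
            if p_bin and t_bin:
                intersection += 1
            if p_bin or t_bin:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0
    return intersection / union


pred = [[0.9, 0.8], [0.1, 0.2]]
target = [[1, 0], [0, 0]]
print(binary_iou(pred, target))  # one shared pixel out of two covered pixels
```

Because infrared small targets often cover only a handful of pixels, a single misclassified pixel moves this ratio noticeably, which is one reason IoU is a demanding metric for this task.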

Key takeaway

For research scientists building IRSTD systems, SANet targets two persistent problems: low signal-to-clutter ratios and complex backgrounds. Consider integrating its Dual-path Semantic-aware Module to improve fine-grained target perception and its Selective Attention Fusion Module to fuse multi-scale features dynamically; together they may improve detection accuracy and false alarm suppression in your applications.

Key insights

SANet enhances infrared small target detection by dynamically fusing multi-scale features and refining early feature extraction.
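To make the contrast between static and dynamic fusion concrete, here is a minimal sketch of a per-pixel gated blend of two same-sized feature maps. This is an assumption for illustration only: the summary does not specify how SAFM computes its weights, so the sigmoid gate, the convex blend, and the names `gated_fusion` and `gate_logits` are all hypothetical. In a real network the gate logits would be predicted by learnable layers rather than passed in by hand.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_fusion(encoder_map, decoder_map, gate_logits):
    """Blend two feature maps with a per-pixel weight w = sigmoid(gate).

    fused = w * encoder + (1 - w) * decoder
    All three inputs are nested lists of equal shape (hypothetical format).
    """
    fused = []
    for enc_row, dec_row, gate_row in zip(encoder_map, decoder_map, gate_logits):
        fused_row = []
        for e, d, g in zip(enc_row, dec_row, gate_row):
            w = sigmoid(g)  # weight in (0, 1), different at every position
            fused_row.append(w * e + (1.0 - w) * d)
        fused.append(fused_row)
    return fused


encoder_map = [[2.0, 4.0]]   # shallow, detail-rich features
decoder_map = [[0.0, 0.0]]   # deep, semantic features
gate_logits = [[0.0, 50.0]]  # equal mix at pixel 0, encoder-dominated at pixel 1
print(gated_fusion(encoder_map, decoder_map, gate_logits))
```

A static skip connection corresponds to using the same fixed combination at every position; the gated version lets each position choose how much shallow detail to keep.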

Method

SANet uses a Dual-path Semantic-aware Module (DSM) for feature extraction and a Selective Attention Fusion Module (SAFM) for adaptive cross-scale feature fusion within a U-Net architecture, optimized with Soft-IoU loss.
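The Soft-IoU loss named above can be sketched as follows. This uses a common formulation, one minus the ratio of soft intersection to soft union computed on sigmoid probabilities; the smoothing constant `smooth` and the function names are assumptions, and the paper's exact variant may differ.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def soft_iou_loss(logits, targets, smooth=1.0):
    """Soft-IoU loss for one image.

    `logits` are raw network outputs and `targets` are 0/1 labels,
    both nested lists of equal shape (hypothetical input format).
    """
    probs = [sigmoid(z) for row in logits for z in row]
    labels = [float(t) for row in targets for t in row]
    # Soft intersection: probability mass placed on true target pixels.
    intersection = sum(p * y for p, y in zip(probs, labels))
    # Soft union: |P| + |Y| - |P ∩ Y|.
    union = sum(probs) + sum(labels) - intersection
    # `smooth` (assumed value) avoids division by zero on empty masks.
    return 1.0 - (intersection + smooth) / (union + smooth)


targets = [[1, 0], [0, 0]]
good = soft_iou_loss([[10.0, -10.0], [-10.0, -10.0]], targets)
bad = soft_iou_loss([[-10.0, 10.0], [10.0, 10.0]], targets)
print(good < bad)  # a confident correct prediction gives the lower loss
```

Unlike per-pixel cross-entropy, this loss is computed from region overlap, so the large number of easy background pixels does not dominate it when the target occupies only a few pixels.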

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.