DeltaSeg: Tiered Attention and Deep Delta Learning for Multi-Class Structural Defect Segmentation

Source: cs.CV updates on arXiv.org · Field: Technology & Digital - Artificial Intelligence & Machine Learning, Computer Vision · Depth: Advanced, extended

Summary

DeltaSeg is a U-shaped encoder-decoder architecture for automated multi-class structural defect segmentation from visual inspection imagery. It uses a tiered attention strategy: Squeeze-and-Excitation (SE) channel attention in the encoder, Coordinate Attention at the bottleneck and decoder, and a novel Deep Delta Attention (DDA) mechanism in the skip connections. The encoder uses depthwise separable convolutions with dilated stages to expand the receptive field while maintaining spatial resolution, and Atrous Spatial Pyramid Pooling (ASPP) captures multi-scale context at the bottleneck. The DDA module refines skip connections through a dual-path scheme that combines a learned delta operator for nuisance feature suppression with spatial attention gates conditioned on decoder signals. Deep supervision via multi-scale auxiliary heads further strengthens gradient flow. With 7.14M parameters, DeltaSeg outperforms 12 competing architectures, including U-Net and SegFormer, on both the S2DS dataset (7 classes) and the Culvert-Sewer Defect Dataset (CSDD, 9 classes), achieving 70.46% defect mIoU on S2DS and 76.75% on CSDD.
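The encoder design described above rests on two pieces of arithmetic: a depthwise separable convolution needs far fewer weights than a standard one, and dilation widens the receptive field without any downsampling. The sketch below works through both. The channel width (128), kernel size (3), and dilation rates (1, 2, 4) are illustrative assumptions, not values taken from the DeltaSeg paper, and the helper names are hypothetical.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def effective_kernel(k, dilation):
    """Span in pixels of a k-tap kernel whose taps sit `dilation` pixels apart."""
    return dilation * (k - 1) + 1


def receptive_field(layers):
    """Receptive field of a stack of (kernel, stride, dilation) conv layers."""
    rf, jump = 1, 1
    for k, stride, dilation in layers:
        rf += (effective_kernel(k, dilation) - 1) * jump
        jump *= stride
    return rf


def same_padding(k, dilation):
    """Padding that preserves spatial size for odd k at stride 1."""
    return dilation * (k - 1) // 2


def output_size(n, k, stride, dilation, padding):
    """Spatial output size of a convolution along one axis."""
    return (n + 2 * padding - effective_kernel(k, dilation)) // stride + 1


if __name__ == "__main__":
    # Illustrative values only (assumed, not from the paper).
    std = conv_params(128, 128, 3)                 # 147456
    sep = depthwise_separable_params(128, 128, 3)  # 17536
    print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.1f}x")

    plain = [(3, 1, 1)] * 3
    dilated = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]
    print(receptive_field(plain), receptive_field(dilated))  # 7 15

    # Stride stays 1, so a 64-pixel axis is still 64 pixels after a dilated layer.
    print(output_size(64, 3, 1, 4, same_padding(3, 4)))      # 64
```

With these assumed settings, three stride-1 layers see a 15-pixel window instead of 7 while the feature map keeps its size, which is the trade-off the summary attributes to the dilated stages.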

Key takeaway

For research scientists building computer vision models for infrastructure inspection, DeltaSeg's tiered attention and Deep Delta Attention (DDA) module offer a blueprint for handling class imbalance and diverse defect types. Consider a similar strategy in your own segmentation tasks: match each attention type to the stage where it fits (channel attention in the encoder, Coordinate Attention at the bottleneck and decoder) and refine skip connections semantically, to improve accuracy and generalization across varied structural geometries and imaging conditions.

Key insights

Tiered attention and refined skip connections significantly improve structural defect segmentation accuracy and generalization.

Principles

Method

DeltaSeg employs SE attention in the encoder, Coordinate Attention at the bottleneck and decoder, and Deep Delta Attention (DDA) in skip connections, combining nuisance suppression with spatial gating.
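As a concrete reference for the encoder-side attention, here is a minimal sketch of a Squeeze-and-Excitation block in its standard form: global average pooling per channel, a two-layer bottleneck with ReLU and sigmoid, then per-channel rescaling. It is written in plain Python on nested lists so the arithmetic is visible. The weights `w1` and `w2`, the absence of biases, and the toy input are assumptions for illustration; the summary does not give DeltaSeg's reduction ratio or exact SE configuration, and the DDA module is not reproduced here.

```python
import math


def se_channel_attention(x, w1, w2):
    """Standard SE block on a feature map.

    x  : list of C channels, each an H x W list of lists of floats
    w1 : (C // r) x C weight matrix of the reduction layer (hypothetical values)
    w2 : C x (C // r) weight matrix of the expansion layer (hypothetical values)
    Returns (rescaled feature map, per-channel scales).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]

    # Excite: FC -> ReLU -> FC -> sigmoid gives one scale in (0, 1) per channel.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    scales = [
        1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, hidden))))
        for row in w2
    ]

    # Scale: multiply every pixel of a channel by that channel's scale.
    out = [[[s * v for v in row] for row in ch] for ch, s in zip(x, scales)]
    return out, scales


if __name__ == "__main__":
    # Toy input: 2 channels of size 2 x 2 (assumed values).
    x = [
        [[1.0, 1.0], [1.0, 1.0]],
        [[3.0, 3.0], [3.0, 3.0]],
    ]
    w1 = [[0.5, 0.5]]      # reduce 2 channels to 1 hidden unit
    w2 = [[1.0], [-1.0]]   # expand back to 2 channel scales
    out, scales = se_channel_attention(x, w1, w2)
    print([round(s, 3) for s in scales])  # [0.881, 0.119]
```

In this toy case the first channel is kept at roughly 88% of its magnitude and the second is damped to roughly 12%, which shows how a learned gate can emphasize some feature channels over others.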

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.