Attention-Guided Autoencoder Fusion for Insulator Defect Detection Using UAV Transmission-Line Imaging

Source: cs.CV updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems) · Depth: Expert, extended

Summary

AE-YOLO, an Attention-Guided AutoEncoder-Enhanced YOLO framework, addresses challenges in automated high-voltage transmission-line insulator defect detection using UAV imagery. It integrates lightweight bottleneck autoencoders within a Feature Pyramid Network-Path Aggregation Network (FPN-PAN) neck to preserve anomaly-sensitive information during multi-scale feature fusion. Convolutional Block Attention Modules (CBAM) enhance feature discrimination, and a variance-maximizing autoencoder regularization strategy encourages diverse, defect-discriminative latent representations. The network trains with a unified objective combining focal loss, CIoU loss, and autoencoder regularization. During inference, Weighted Boxes Fusion (WBF) combines predictions from YOLOv8, YOLOv10, and YOLO11, with an autoencoder-guided confidence boosting mechanism. On the Insulator-Defect Detection dataset, AE-YOLO with an EfficientNetV2 backbone achieved 95.10% mAP@0.5, 96.40% precision, and 93.80% recall, surpassing the strongest YOLO baseline by 5.0 points in mAP@0.5 and 6.7 points in recall.
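The unified training objective mentioned in the summary can be illustrated with a minimal sketch. The focal and CIoU terms follow their standard published definitions. The autoencoder term is an assumption: the summary does not give its form, so it is written here as reconstruction error minus a weighted latent variance. All function names, the loss weights, and `var_weight` are hypothetical and not taken from the paper.

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Standard binary focal loss for one probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def ciou_loss(box, gt, eps=1e-7):
    """Standard CIoU loss for two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(box[2], gt[2]) - max(box[0], gt[0]))
    ih = max(0.0, min(box[3], gt[3]) - max(box[1], gt[1]))
    inter = iw * ih
    w, h = box[2] - box[0], box[3] - box[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    iou = inter / (w * h + wg * hg - inter + eps)
    # Squared centre distance over squared diagonal of the enclosing box.
    rho2 = ((box[0] + box[2] - gt[0] - gt[2]) / 2) ** 2 \
         + ((box[1] + box[3] - gt[1] - gt[3]) / 2) ** 2
    cw = max(box[2], gt[2]) - min(box[0], gt[0])
    ch = max(box[3], gt[3]) - min(box[1], gt[1])
    c2 = cw ** 2 + ch ** 2 + eps
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(wg / (hg + eps)) - math.atan(w / (h + eps))) ** 2
    a = v / (1.0 - iou + v + eps)
    return 1.0 - iou + rho2 / c2 + a * v

def ae_regularizer(features, recons, latents, var_weight=0.1):
    """ASSUMED form: reconstruction MSE minus weighted mean latent variance."""
    n, dim = len(features), len(features[0])
    mse = sum((f - r) ** 2
              for fv, rv in zip(features, recons)
              for f, r in zip(fv, rv)) / (n * dim)
    var = 0.0
    for d in range(len(latents[0])):
        col = [z[d] for z in latents]
        mean = sum(col) / n
        var += sum((c - mean) ** 2 for c in col) / n
    var /= len(latents[0])
    return mse - var_weight * var

def unified_loss(focal, ciou, ae, w_box=1.0, w_ae=0.1):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return focal + w_box * ciou + w_ae * ae
```

Under this assumed form, a batch whose latent codes are more spread out gets a lower regularizer value, which is one way to read "variance-maximizing". A practical implementation would likely bound the variance term so it cannot dominate the objective.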

Key takeaway

For machine learning engineers building UAV-based inspection systems, AE-YOLO shows that adding attention-guided autoencoders to a YOLO-based detector can raise recall, including on small, rare anomalies such as pollution flashover. On the Insulator-Defect Detection dataset, the reported gain over the strongest YOLO baseline is 6.7 points in recall and 5.0 points in mAP@0.5. Higher sensitivity to subtle defects matters because missed insulator faults can lead to costly infrastructure failures and reduced grid reliability.

Key insights

Integrating autoencoders and attention into YOLO enhances UAV-based insulator defect detection, especially for small, rare anomalies.

Method

AE-YOLO uses a configurable attention-enhanced backbone, an FPN-PAN neck with bottleneck autoencoders for anomaly-aware feature refinement, a multi-scale detection head, and an ensemble inference module that combines autoencoder-guided confidence boosting with Weighted Boxes Fusion (WBF).
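The ensemble inference step can be sketched as follows. This is a simplified, single-class version of Weighted Boxes Fusion: boxes are clustered by IoU, coordinates are averaged with confidence weights, and the fused score is scaled down when fewer models agree. The confidence boost is an assumption, since the summary does not specify its formula. Here a hypothetical `anomaly` score in [0, 1], for example a normalized autoencoder reconstruction error, scales the score up by at most `beta`. All names and defaults are illustrative.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def fuse(cluster):
    """Confidence-weighted mean box and mean score of a cluster."""
    total = sum(score for _, score in cluster)
    box = tuple(sum(b[i] * s for b, s in cluster) / total for i in range(4))
    return box, total / len(cluster)

def weighted_boxes_fusion(per_model, iou_thr=0.55):
    """per_model: one list of (box, score) detections per detector."""
    n_models = len(per_model)
    dets = sorted((d for model in per_model for d in model),
                  key=lambda d: d[1], reverse=True)
    clusters = []
    for det in dets:
        # Attach the detection to the best-overlapping cluster, if any.
        best, best_iou = None, iou_thr
        for cluster in clusters:
            overlap = iou(fuse(cluster)[0], det[0])
            if overlap > best_iou:
                best, best_iou = cluster, overlap
        if best is None:
            clusters.append([det])
        else:
            best.append(det)
    fused = []
    for cluster in clusters:
        box, score = fuse(cluster)
        # Penalize boxes that only some of the models predicted.
        score *= min(len(cluster), n_models) / n_models
        fused.append((box, score))
    return fused

def boost_confidence(score, anomaly, beta=0.5):
    """ASSUMED rule: raise the score in proportion to an anomaly score in [0, 1]."""
    return min(1.0, score * (1.0 + beta * anomaly))
```

With three detectors, a box predicted by only one of them keeps a third of its score after fusion. This is why a separate boost for regions the autoencoder flags as anomalous could help retain rare defects that most ensemble members miss.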

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.