YOLO-AMC: An Improved YOLO Architecture with Attention Mechanisms for Building Crack Detection

Source: Computer Vision and Pattern Recognition · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems) · Depth: Expert, quick

Summary

YOLO-AMC is an improved YOLO-based architecture for automated crack detection in infrastructure inspection and Structural Health Monitoring (SHM). Built on YOLOv11, the model removes the original C2PSA module and integrates attention mechanisms into its multi-scale feature fusion layers: the Global Attention Mechanism (GAM), the Residual Convolutional Block Attention Module (Res-CBAM), or Shuffle Attention (SA). In the reported experiments, YOLO-AMC outperforms the YOLOv11n and YOLOv8n baselines. The GAM variant performed best, with mAP@0.5 = 0.9917 and mAP@0.5:0.95 = 0.9506, compared with 0.9833 / 0.9112 for YOLOv11 and 0.9707 / 0.8921 for YOLOv8. The model has a computational cost of 7.6 GFLOPs and runs at 110.95 FPS on an NVIDIA RTX 4090 and approximately 5 FPS on a Raspberry Pi 5, a favorable balance between accuracy and deployment efficiency.
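To put the reported figures in perspective, the short sketch below converts them into absolute mAP gains (in percentage points) and average per-frame latency. It uses only the numbers quoted in the summary; the dictionary layout and the function names `gain_in_points` and `latency_ms` are illustrative assumptions, not part of the paper.

```python
# Figures as reported in the summary; names and layout are illustrative.
REPORTED = {
    "YOLO-AMC (GAM)": {"map50": 0.9917, "map50_95": 0.9506},
    "YOLOv11": {"map50": 0.9833, "map50_95": 0.9112},
    "YOLOv8": {"map50": 0.9707, "map50_95": 0.8921},
}
# The Raspberry Pi 5 throughput is stated as approximate in the summary.
FPS = {"NVIDIA RTX 4090": 110.95, "Raspberry Pi 5": 5.0}


def gain_in_points(metric, baseline, model="YOLO-AMC (GAM)"):
    """Absolute gain of `model` over `baseline`, in percentage points."""
    return (REPORTED[model][metric] - REPORTED[baseline][metric]) * 100


def latency_ms(fps):
    """Average per-frame latency implied by a throughput in frames per second."""
    return 1000.0 / fps


if __name__ == "__main__":
    for baseline in ("YOLOv11", "YOLOv8"):
        for metric in ("map50", "map50_95"):
            points = gain_in_points(metric, baseline)
            print(f"{metric} gain over {baseline}: {points:.2f} points")
    for device, fps in FPS.items():
        print(f"{device}: about {latency_ms(fps):.1f} ms per frame")
```

On these numbers, the gain is larger on the stricter mAP@0.5:0.95 metric (3.94 points over YOLOv11) than on mAP@0.5 (0.84 points). The throughput figures correspond to roughly 9 ms per frame on the RTX 4090 and roughly 200 ms per frame on the Raspberry Pi 5.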

Key takeaway

For computer vision engineers building automated inspection systems, YOLO-AMC is a strong option for difficult crack detection tasks. Projects that need high accuracy on low-contrast features, particularly in Structural Health Monitoring, can benefit from integrating an attention mechanism such as GAM into a YOLOv11-based model. The approach balances accuracy and efficiency well enough to run on high-end GPUs and, at approximately 5 FPS, on edge devices such as the Raspberry Pi 5.

Key insights

Integrating attention mechanisms into YOLO's multi-scale feature fusion improves crack detection accuracy over the YOLOv11 and YOLOv8 baselines (best mAP@0.5:0.95 of 0.9506, with GAM) while keeping computational cost at 7.6 GFLOPs.

Method

YOLO-AMC modifies YOLOv11 by removing the C2PSA module and embedding GAM, Res-CBAM, or SA into the neck's multi-scale feature fusion layers.
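Of the three attention mechanisms named here, Shuffle Attention is the simplest to illustrate without a deep learning framework. As originally proposed, it splits the channels into groups, applies attention within each group, and then shuffles the channels so that information mixes across groups. The sketch below shows only that final shuffle step, on a plain list of channel labels. The function name `channel_shuffle` and the group count are illustrative assumptions, and this is not the YOLO-AMC authors' code.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (reshape, transpose, flatten)."""
    if groups <= 0 or len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = len(channels) // groups
    # Treat the flat list as a (groups, per_group) grid and read it column
    # by column, so neighbouring outputs come from different groups.
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


if __name__ == "__main__":
    # Two hypothetical groups, "a" and "b", of three channels each.
    labels = ["a0", "a1", "a2", "b0", "b1", "b2"]
    print(channel_shuffle(labels, groups=2))
```

With two groups of three channels, the labels come out interleaved as `a0, b0, a1, b1, a2, b2`. In a tensor framework, the same effect is usually obtained by reshaping the channel axis to `(groups, channels_per_group)`, swapping those two axes, and flattening again.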

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.