Adaptive cropping shallow attention network for defect detection of bridge girder steel using unmanned aerial vehicle images

Mu, Zonghan; Qin, Yong; Yu, Chongchong; Wu, Yunpeng; Wang, Zhipeng; Yang, Huaizhi; Huang, Yonghui

doi:10.1631/jzus.A2200175

Adaptive cropping shallow attention network for defect detection in UAV images of railway bridge steel structures

Research Article
Published: 04 April 2023

Volume 24, pages 243–256, (2023)

Journal of Zhejiang University-SCIENCE A

Zonghan Mu (牟宗涵)¹˒² (ORCID: orcid.org/0000-0003-0198-434X),
Yong Qin (秦勇)¹,
Chongchong Yu (于重重)³,
Yunpeng Wu (吴云鹏)⁴,
Zhipeng Wang (王志鹏)¹,
Huaizhi Yang (杨怀志)⁵ &
…
Yonghui Huang (黄永辉)⁵

Abstract

Bridges are an important part of railway infrastructure and require regular inspection and maintenance. Inspecting railway infrastructure with unmanned aerial vehicles (UAVs) is an active research topic. However, UAV images are large, and changes in flight distance and height cause the object scale to vary dramatically. In addition, the elements of interest on railway bridges, such as bolts and corrosion, are small, dense objects, and the sample data set is severely imbalanced, which makes accurate defect detection challenging. This paper proposes an adaptive cropping shallow attention network (ACSANet), which combines an adaptive cropping strategy for large UAV images with a shallow attention network for small-object detection from limited samples. To improve accuracy and generalization, the shallow attention network integrates a coordinate attention (CA) module and an alpha intersection over union (α-IOU) loss function, and is applied to defect detection on the bolts, steel surfaces, and railings of railway bridges. Test results show that ACSANet outperforms the YOLOv5s model with the adaptive cropping strategy by 5% in total mean average precision (mAP) and by 30% in missing-bolt mAP. Compared with the YOLOv5s model with a common cropping strategy, the total mAP and missing-bolt mAP improve by 10% and 60%, respectively; compared with the YOLOv5s model without any cropping strategy, they improve by 40% and 67%, respectively.
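To make the α-IOU term concrete, the following is a minimal sketch of the basic power-IoU loss, 1 − IoU^α, for axis-aligned boxes. It is an illustration under stated assumptions, not the article's implementation: the box format `(x1, y1, x2, y2)`, the function names, and the default `alpha=3.0` are choices made here, and the article may use a variant with additional penalty terms.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def alpha_iou_loss(pred, target, alpha=3.0):
    """Basic power-IoU loss: 1 - IoU**alpha (alpha = 1 gives the plain IoU loss).

    The default alpha is an assumption for this sketch, not a value from the article.
    """
    return 1.0 - iou(pred, target) ** alpha
```

With `alpha > 1`, a poorly overlapping box is penalized more strongly than under the plain IoU loss, while a perfect match still gives zero loss.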

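Summary

Objective

Bridge steel structures and the high-strength bolts on them are exposed to wind and rain over long periods, so corrosion and missing parts are common, while manual inspection is inefficient, dangerous, and leaves many visual blind spots. This paper aims to use UAV photography to identify and detect the inspection targets in images of railway bridge steel structures (normal nut, normal bolt, missing bolt, missing nut, steel surface corrosion, and steel railing corrosion), in order to improve the accuracy and efficiency of railway bridge inspection.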

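Innovations

1. An adaptive image cropping method is proposed that adjusts the tile size and the overlap area between crops according to each image. It removes the negative effects of variable UAV shooting distance and focal length and improves small-object detection.
2. Based on the characteristics of the objects to be detected on railway bridge steel structures, a shallow attention network is proposed that makes the model focus more on the shallow features of these objects, so that corroded regions are easier to detect.
3. A coordinate attention (CA) module is integrated into the shallow attention network to help the network locate defect regions in wide-area UAV scenes.
4. The alpha intersection over union (α-IOU) loss function is integrated into the shallow attention network to improve training and test accuracy on the small railway bridge steel structure data set.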
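The cropping idea in item 1 can be sketched as overlapping tiling. The code below only shows how a tile size and an overlap translate into crop windows that cover a large image; how the article chooses these two values per image is not stated here, so `tile` and `overlap` are plain inputs, and the function names are hypothetical.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of tiles covering [0, length); neighbours share at least `overlap` pixels."""
    if tile >= length:
        return [0]  # a single tile already covers this axis
    stride = tile - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than the tile size")
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the image border
    return starts


def crop_boxes(width, height, tile, overlap):
    """Crop windows (x1, y1, x2, y2) in row-major order, clipped to the image."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_starts(height, tile, overlap)
        for x in tile_starts(width, tile, overlap)
    ]
```

A common heuristic, assumed here rather than taken from the article, is to set the overlap to at least the largest expected target size so that an object cut by one tile boundary appears whole in a neighbouring tile.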

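Methods

1. An adaptive image cropping strategy is proposed to process large UAV images into smaller images in which the network can detect defect targets more easily.
2. The YOLO network is modified to obtain a shallow attention network that focuses more on shallow features, improving detection accuracy for corrosion and missing parts.
3. The CA attention mechanism and the α-IOU loss function are integrated into the shallow attention network to improve detection accuracy.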
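For orientation, coordinate attention is commonly formulated as below. The notation is an assumption of this sketch and may differ from the article's: x_c is channel c of an H × W feature map, F_1, F_h, and F_w are 1 × 1 convolutions, δ is a nonlinear activation, σ is the sigmoid function, and f is split along the spatial dimension into f^h and f^w.

```latex
\begin{align}
z_c^{h}(h) &= \frac{1}{W}\sum_{0 \le i < W} x_c(h, i), &
z_c^{w}(w) &= \frac{1}{H}\sum_{0 \le j < H} x_c(j, w), \\
f &= \delta\!\left(F_1\!\left([z^{h}, z^{w}]\right)\right), \\
g^{h} &= \sigma\!\left(F_h(f^{h})\right), &
g^{w} &= \sigma\!\left(F_w(f^{w})\right), \\
y_c(i, j) &= x_c(i, j)\, g_c^{h}(i)\, g_c^{w}(j).
\end{align}
```

Because the two pooled descriptors keep one spatial axis each, the resulting weights g^h and g^w encode where along the height and width a response occurs, which is what lets the module point the network toward defect regions in a wide scene.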

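Conclusions

1. In small data sets, the ratio between the input image and the target to be detected clearly affects the final detection result. In the data set used in this study, for image-to-main-target ratios between 20:1 and 80:1, 50:1 acts as a boundary: above 50:1, accuracy changes considerably while training time stays roughly constant; below 50:1, accuracy stays roughly constant while training time changes considerably. There is therefore a critical point at which training efficiency and test results are best.
2. Deeper networks interfere with detection accuracy for small objects with few samples and simple features. Comparing results obtained with the same strategies but different network structures, ACSANet improves missing-bolt accuracy by nearly 10% over ACNet+CA+α-IOU.
3. Because different attention mechanisms attend in different directions, they do not necessarily improve detection accuracy. A suitable attention mechanism and loss function detect targets in UAV images of railway bridge steel structures better, whereas an unsuitable attention mechanism degrades detection.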
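As a rough illustration of conclusion 1, the sketch below derives a tile size from an estimated target size and a desired image-to-target ratio. Everything specific in it is an assumption made for illustration: the ratio is interpreted as a ratio of side lengths in pixels (the text above does not say whether it is linear or by area), and the clamping bounds and the rounding to a multiple of 32 are hypothetical defaults, not values from the article.

```python
def adaptive_tile_size(target_px, ratio=50.0, min_tile=320, max_tile=1280, multiple=32):
    """Tile side length so that tile / target is close to `ratio`.

    target_px: estimated side length of the main target in pixels.
    ratio:     desired image-to-target ratio (assumed linear, default 50:1).
    The result is clamped to [min_tile, max_tile] and rounded to a multiple
    of `multiple`; these bounds are hypothetical, chosen for illustration.
    """
    raw = target_px * ratio
    clamped = max(min_tile, min(max_tile, raw))
    return int(round(clamped / multiple)) * multiple
```

For example, a target estimated at 20 pixels wide yields a raw tile size of 1000 pixels, which this sketch rounds to 992.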

Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 61833002).

Author information

Authors and Affiliations

State Key Lab of Rail Traffic Control & Safety, Beijing Jiaotong University, Beijing, 100091, China
Zonghan Mu (牟宗涵), Yong Qin (秦勇) & Zhipeng Wang (王志鹏)
School of Traffic and Transportation, Beijing Jiaotong University, Beijing, 100091, China
Zonghan Mu (牟宗涵)
School of Artificial Intelligence, Beijing Technology and Business University, Beijing, 100048, China
Chongchong Yu (于重重)
School of Safety Engineering and Emergency Management, Shijiazhuang Tiedao University, Shijiazhuang, 050043, China
Yunpeng Wu (吴云鹏)
Beijing-Shanghai High Speed Railway Co., Ltd., Beijing, 100038, China
Huaizhi Yang (杨怀志) & Yonghui Huang (黄永辉)

Corresponding author

Correspondence to Yong Qin (秦勇).

Additional information

Author contributions

Zonghan MU designed the research and processed the corresponding data. Zhipeng WANG, Huaizhi YANG, and Yonghui HUANG collected the data. Zonghan MU wrote the first draft of the manuscript. Chongchong YU and Yunpeng WU helped organize the manuscript. Yong QIN revised and edited the final version.

Conflict of interest

Zonghan MU, Yong QIN, Chongchong YU, Yunpeng WU, Zhipeng WANG, Huaizhi YANG, and Yonghui HUANG declare that they have no conflict of interest.

About this article

Cite this article

Mu, Z., Qin, Y., Yu, C. et al. Adaptive cropping shallow attention network for defect detection of bridge girder steel using unmanned aerial vehicle images. J. Zhejiang Univ. Sci. A 24, 243–256 (2023). https://doi.org/10.1631/jzus.A2200175

Received: 29 March 2022
Accepted: 13 July 2022
Published: 04 April 2023
Issue Date: March 2023
DOI: https://doi.org/10.1631/jzus.A2200175
