Abstract
Bridges are an important part of railway infrastructure and need regular inspection and maintenance. Using unmanned aerial vehicle (UAV) technology to inspect railway infrastructure is an active research topic. However, because UAV images are large and the flight distance and height vary, object scale changes dramatically. At the same time, the elements of interest on railway bridges, such as bolts and corrosion, are small, dense objects, and the sample data set is seriously unbalanced, which makes accurate defect detection very challenging. In this paper, an adaptive cropping shallow attention network (ACSANet) is proposed, which includes an adaptive cropping strategy for large UAV images and a shallow attention network for small object detection with limited samples. To enhance the accuracy and generalization of the model, the shallow attention network integrates a coordinate attention (CA) mechanism module and an alpha intersection over union (α-IOU) loss function; it is then used for defect detection on the bolts, steel surfaces, and railings of railway bridges. The test results show that the ACSANet model outperforms the YOLOv5s model using the adaptive cropping strategy in terms of total mean average precision (mAP) and missing bolt mAP by 5% and 30%, respectively. Compared with the YOLOv5s model that adopts the common cropping strategy, the total mAP and missing bolt mAP are improved by 10% and 60%, respectively. Compared with the YOLOv5s model without any cropping strategy, the total mAP and missing bolt mAP are improved by 40% and 67%, respectively.
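Summary
Objective
Bridge steel structures and the high-strength bolts on them are exposed to wind and rain over long periods, so corrosion and missing parts are common, while manual inspection is inefficient, dangerous, and leaves many visual blind spots. This work uses UAV photography to identify and detect the targets contained in images of railway bridge steel structures (normal nut, normal bolt, missing bolt, missing nut, steel surface corrosion, and steel railing corrosion), in order to improve the accuracy and efficiency of railway bridge inspection.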
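Innovations
1. An adaptive image cropping method is proposed that adjusts the crop size and the overlap area between crops according to each image, which removes the negative effects of variable UAV shooting distance and focal length and improves small object detection; 2. Based on the characteristics of the objects to be detected on railway bridge steel structures, a shallow attention network is proposed that makes the model focus more on the shallow features of those objects, so that corroded regions are easier to detect; 3. A coordinate attention (CA) mechanism module is integrated into the shallow attention network to help the network locate defect regions in wide-area UAV scenes; 4. The alpha intersection over union (α-IOU) loss function is integrated into the shallow attention network to improve training and test accuracy on the small data set of railway bridge steel structures.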
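The cropping idea in item 1 can be illustrated with a minimal sketch. It is not the authors' algorithm, whose exact rule is not given in this summary. It assumes a hypothetical estimate of the main target's size in pixels (`target_px`), treats the image-to-target ratio as a ratio of side lengths, and sets the overlap equal to the target size so that a target no larger than `target_px` falls entirely inside at least one crop. The function names are illustrative only.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), right and bottom exclusive


def axis_starts(length: int, tile: int, stride: int) -> List[int]:
    """Start offsets along one axis so that the tiles cover [0, length)."""
    if tile >= length:
        return [0]
    starts = list(range(0, length - tile + 1, stride))
    # If the regular stride leaves a strip uncovered, add a tile flush with the border.
    if starts[-1] + tile < length:
        starts.append(length - tile)
    return starts


def crop_windows(width: int, height: int, target_px: int,
                 ratio: float = 50.0) -> List[Box]:
    """Overlapping crop boxes for an image of size width x height.

    Assumptions (not from the paper): the tile side is ratio * target_px,
    clamped to the image size, and the overlap equals target_px.
    """
    if target_px <= 0 or ratio < 2:
        raise ValueError("target_px must be positive and ratio at least 2")
    tile = int(round(ratio * target_px))
    tile_w, tile_h = min(tile, width), min(tile, height)
    overlap = target_px
    xs = axis_starts(width, tile_w, max(1, tile_w - overlap))
    ys = axis_starts(height, tile_h, max(1, tile_h - overlap))
    # Row-major order: left to right, then top to bottom.
    return [(x, y, x + tile_w, y + tile_h) for y in ys for x in xs]


if __name__ == "__main__":
    boxes = crop_windows(4000, 3000, target_px=20)
    print(len(boxes), boxes[0], boxes[-1])
```

For a 4000 × 3000 px image with 20 px targets and the assumed 50:1 ratio, this yields 1000 px tiles arranged in a 5 × 4 grid. Because the tile size follows the target size, images taken from different distances would be cut into crops in which the targets occupy a similar fraction of the frame.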
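Methods
1. An adaptive image cropping strategy is proposed to process large UAV images into smaller images in which the network can detect defect targets more easily; 2. The YOLO network is modified into a shallow attention network that focuses more on shallow features, which improves detection accuracy for corrosion and missing parts; 3. The CA attention mechanism and the α-IOU loss function are integrated into the shallow attention network to improve detection accuracy.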
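To make the loss in item 3 concrete, the following is a minimal sketch of the basic power form of the α-IOU loss, 1 − IoU^α, for two axis-aligned boxes. Which variant ACSANet actually uses (for example, one with additional penalty terms) is not stated in this summary, and the default `alpha=3.0` here is an assumption for illustration rather than a value taken from the paper.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) with x0 <= x1, y0 <= y1


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def alpha_iou_loss(pred: Box, target: Box, alpha: float = 3.0) -> float:
    """Basic power IoU loss: 1 - IoU**alpha (alpha = 1 gives the plain IoU loss)."""
    return 1.0 - iou(pred, target) ** alpha


if __name__ == "__main__":
    pred, target = (0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 3.0, 2.0)
    print(iou(pred, target), alpha_iou_loss(pred, target))
```

With α > 1, the loss changes faster for boxes whose IoU is already high than for poorly overlapping ones, so well-localized predictions carry more weight in the regression; this is the usual motivation given for the power form.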
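Conclusions
1. On a small data set, the ratio of the input image to the target to be detected has a clear effect on the final detection result. In the data set used in this study, for image-to-main-target ratios between 20:1 and 80:1, 50:1 acts as a boundary: above 50:1, accuracy changes considerably while training time stays roughly constant; below 50:1, accuracy stays roughly constant while training time changes considerably. There is therefore a critical point at which training efficiency and test results are both best. 2. A deeper network interferes with detection accuracy for objects that are small, have few samples, and have simple features; comparing results obtained with the same strategies but different network structures, ACSANet improves missing bolt accuracy by nearly 10% over ACNet+CA+α-IOU. 3. Because different attention mechanisms attend in different directions, they do not necessarily improve detection accuracy; a suitable attention mechanism and loss function give better detection of targets in UAV images of railway bridge steel structures, whereas an unsuitable attention mechanism has a negative effect on detection.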
Acknowledgments
This work is supported by the National Natural Science Foundation of China (No. 61833002).
Author contributions
Zonghan MU designed the research and processed the corresponding data. Zhipeng WANG, Huaizhi YANG, and Yonghui HUANG collected the data. Zonghan MU wrote the first draft of the manuscript. Chongchong YU and Yunpeng WU helped organize the manuscript. Yong QIN revised and edited the final version.
Conflict of interest
Zonghan MU, Yong QIN, Chongchong YU, Yunpeng WU, Zhipeng WANG, Huaizhi YANG, and Yonghui HUANG declare that they have no conflict of interest.
Cite this article
Mu, Z., Qin, Y., Yu, C. et al. Adaptive cropping shallow attention network for defect detection of bridge girder steel using unmanned aerial vehicle images. J. Zhejiang Univ. Sci. A 24, 243–256 (2023). https://doi.org/10.1631/jzus.A2200175
DOI: https://doi.org/10.1631/jzus.A2200175