Abstract
Helmet detection in road surveillance images has become increasingly important as accidents involving two-wheeled electric vehicles and motorcycles rise. However, small detection targets and complex road environments limit the effectiveness of traditional helmet detection methods. In this study, we propose an intelligent helmet detection model based on convolutional neural networks. To accurately capture the location of the helmet, we introduce coordinate attention to obtain position information in the model. We then introduce pixel attention to enhance interpixel correlation and pixel-level feature filtering for the input images. These two attention mechanisms are combined to design the CPA module, and multi-CPA groups are constructed in a densely connected manner to obtain improved CPAG dense blocks. The proposed dual-attention mechanism effectively enhances the weight of useful information and suppresses useless information, while the dense block improves the feature extraction ability and avoids information loss in the network. The CPAG dense block is inserted into the convolutional network model to obtain CPAG-Net as the detection network. To complete the system, we add a localization network that locates the upper part of the rider. The localization network is implemented using an improved YOLOv5s model in which we introduce an efficient channel attention mechanism to improve the localization ability for small targets. We compared the performance of the proposed method with those of several other methods. The results indicate that the proposed method is more robust than the other methods and achieves higher accuracy for helmet detection in road surveillance images.
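The dual-attention idea summarized above can be illustrated with a minimal sketch. It is not the authors' CPA module: the learned convolutions of coordinate attention are omitted, the `weights` argument is a hypothetical stand-in for a learned 1x1 convolution, and applying the coordinate gate before the pixel gate is an assumption, since the abstract does not state how the two mechanisms are combined. Feature maps are plain nested lists of shape C x H x W.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def coordinate_gate(x):
    """Simplified coordinate attention on a C x H x W nested list.

    Assumption: each gate is a sigmoid of a directional average; the
    learned transforms of the full mechanism are left out.
    """
    out = []
    for ch in x:
        h, w = len(ch), len(ch[0])
        # Average along the width: one gate per row (vertical position).
        row_gate = [sigmoid(sum(row) / w) for row in ch]
        # Average along the height: one gate per column (horizontal position).
        col_gate = [sigmoid(sum(ch[i][j] for i in range(h)) / h)
                    for j in range(w)]
        out.append([[ch[i][j] * row_gate[i] * col_gate[j] for j in range(w)]
                    for i in range(h)])
    return out


def pixel_gate(x, weights):
    """Simplified pixel attention: a weighted sum across channels at each
    position (a 1x1 convolution with one output channel), then a sigmoid.

    `weights` holds one hypothetical scalar per input channel.
    """
    c, h, w = len(x), len(x[0]), len(x[0][0])
    gate = [[sigmoid(sum(weights[k] * x[k][i][j] for k in range(c)))
             for j in range(w)] for i in range(h)]
    # The same spatial gate rescales every channel at that position.
    return [[[x[k][i][j] * gate[i][j] for j in range(w)] for i in range(h)]
            for k in range(c)]


def cpa_sketch(x, weights):
    """Coordinate gate followed by pixel gate (the ordering is an assumption)."""
    return pixel_gate(coordinate_gate(x), weights)


# Usage: a toy 2-channel, 2 x 2 feature map.
features = [[[1.0, 2.0], [3.0, 4.0]],
            [[0.0, -1.0], [1.0, 0.5]]]
gated = cpa_sketch(features, weights=[0.5, -0.5])
```

Because every gate lies strictly between 0 and 1, the sketch can only attenuate activations; in a trained network the learned parameters decide which positions and pixels are attenuated least.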
Availability of data and materials
The data presented in this article are publicly available in EasyLink at https://easylink.cc/vfq7t3.
Code availability
The code presented in this article is publicly available on GitHub at https://github.com/MiJiang151/Helmet_Detection.git.
Acknowledgements
This research was funded by the National Natural Science Foundation of China (Nos. 42374153, 42374149).
Funding
This work was supported by the National Natural Science Foundation of China (Grant Nos. 42374153 and 42374149).
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection, experimental preparation, and experimental analysis were performed by Jingrui Luo, Haixia Zhao, Xingguo Huang, and Jiang Mi. The first draft of the manuscript was written by Jiang Mi; the review and editing of the manuscript were completed by Jingrui Luo; and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose. We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and that there is no professional or other personal interest of any nature or kind in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled "Improved dense residual network with the coordinate and pixel attention mechanisms for helmet detection in road surveillance images".
Consent to participate
All authors agreed with the content and gave explicit consent to submit, and all authors obtained consent from the responsible authorities at the institute where the work was carried out.
Consent for publication
All authors explicitly agree to the publication of the article.
Ethical approval
Not applicable.
About this article
Cite this article
Mi, J., Luo, J., Zhao, H. et al. Improved dense residual network with the coordinate and pixel attention mechanisms for helmet detection. Int. J. Mach. Learn. & Cyber. (2024). https://doi.org/10.1007/s13042-024-02205-4
DOI: https://doi.org/10.1007/s13042-024-02205-4