Abstract
This paper addresses the problem of motorcyclist helmet detection in single images. Although several previous works have tackled helmet detection, most of them are designed for videos and are not suitable for single images. We therefore propose a novel dual-detection framework with multi-head self-attention for single-image motorcyclist helmet detection. Specifically, two types of detectors are first trained, namely the rider detector and the head-shoulder detector, which are jointly leveraged in the dual-detection scheme. To exploit contextual relevance information, a multi-head self-attention mechanism is incorporated, in which multiple self-attention layers are integrated to capture the complex relationships among the input features and thereby further enhance detection accuracy. A new benchmark dataset, termed the SCAU helmet detection on motorcyclists (SCAU-HDM) dataset, is presented, which consists of 8000 training images and 2000 test images. Extensive experiments on this benchmark dataset demonstrate the effectiveness of the proposed framework. The code is available at https://github.com/LiChunHong2020/SCAUHDM.
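For readers unfamiliar with the mechanism named in the abstract, the following is a minimal sketch of one generic multi-head self-attention layer in the standard scaled dot-product formulation. It is not the authors' implementation: the abstract does not specify where the layers sit in the detectors or how they are configured, so the function names (`multi_head_self_attention`, `random_matrix`), the feature sizes, and the random weights that stand in for learned projections are all illustrative assumptions.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random weights (hypothetical); a trained model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def multi_head_self_attention(x, num_heads, rng):
    """Apply one multi-head self-attention layer to x (n positions by d_model)."""
    d_model = len(x[0])
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads

    head_outputs = []
    head_weights = []
    for _ in range(num_heads):
        # Each head has its own query, key, and value projections.
        w_q = random_matrix(d_model, d_head, rng)
        w_k = random_matrix(d_model, d_head, rng)
        w_v = random_matrix(d_model, d_head, rng)
        q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)

        # Scaled dot-product scores between every pair of positions.
        k_t = [list(col) for col in zip(*k)]
        scores = matmul(q, k_t)
        weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]

        head_outputs.append(matmul(weights, v))
        head_weights.append(weights)

    # Concatenate the heads along the feature axis, then mix them linearly.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(x))]
    w_o = random_matrix(d_model, d_model, rng)
    return matmul(concat, w_o), head_weights


rng = random.Random(0)
features = random_matrix(6, 8, rng)  # 6 positions, 8 channels (made-up sizes)
output, attention = multi_head_self_attention(features, num_heads=2, rng=rng)
print(len(output), len(output[0]))  # same shape as the input: 6 8
```

Each head produces its own attention map over all positions, which is what allows different heads to pick up different relationships among the input features. The output keeps the input's shape, so such a layer can in principle be placed between existing feature-processing stages.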
Data availability
The trained model’s weights and the test set of the SCAU-HDM dataset are available at https://pan.baidu.com/s/1bWGec9Or_EUGPXDdIKMuJg (extraction password: 4udr).
Funding
This work was supported by the National Natural Science Foundation of China (61976097 & 62276277), the Natural Science Foundation of Guangdong Province (2021A1515012203), and the Science and Technology Program of Guangzhou, China (202201010314).
Author information
Contributions
CHL was involved in conceptualization, methodology, and writing (original draft). DH was involved in conceptualization and writing (review and editing). GYZ was involved in methodology and data annotation. JC was involved in optimization and validation.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Consent to participate
Informed consent to participate was obtained from all individual participants included in the study.
Consent for publication
Informed consent for publication was obtained from all individual participants included in the study.
Code availability
The code is available at https://github.com/LiChunHong2020/SCAUHDM.
About this article
Cite this article
Li, CH., Huang, D., Zhang, GY. et al. Motorcyclist helmet detection in single images: a dual-detection framework with multi-head self-attention. Soft Comput 28, 4321–4333 (2024). https://doi.org/10.1007/s00500-023-08723-7
DOI: https://doi.org/10.1007/s00500-023-08723-7