Abstract
Industrial development can bring large economic benefits to a country and its society. However, it is often accompanied by the emission of polluting gases and has caused serious environmental pollution, leading to health problems and medical burdens. Many companies therefore require masks to be worn at work to prevent the inhalation of harmful gases, and quickly detecting whether workers are wearing masks has become an important problem. Existing face detection networks, however, have limited detection accuracy and perform considerably worse on masked faces, and most state-of-the-art face mask detection technologies are based on deep learning. In this study, we developed a new feature extraction module, the Attention Residual Ghost (ARG) module, based on attention mechanisms and a residual structure. To improve performance, we constructed CSP-ARG and ARG-PANet and then fused attention between the two to obtain a new lightweight face mask detection model. An experimental evaluation on the public AIZOO and FMDD datasets showed that the proposed approach achieved mAP values of 93.4% and 89.3%, respectively.
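The results above are reported as mAP (mean average precision). As a reading aid, the following minimal sketch shows one common way this metric is computed: per-class average precision (AP) as the area under the precision-recall curve, averaged over classes. It is not the authors' evaluation code. The all-point interpolation, the function names, the class labels, and the toy detections are all assumptions made for illustration, and the IoU threshold used to decide true positives is not stated in this section.

```python
from typing import Dict, List, Tuple

Detection = Tuple[float, bool]  # (confidence, is_true_positive)


def average_precision(detections: List[Detection], num_ground_truth: int) -> float:
    """AP for one class using all-point interpolation (an assumed protocol).

    Whether a detection is a true positive (for example, IoU above some
    threshold with an unmatched ground-truth box) is decided upstream.
    """
    if num_ground_truth == 0 or not detections:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions: List[float] = []
    recalls: List[float] = []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        true_positives += int(is_tp)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)
    # Replace each precision by the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas wherever recall increases.
    ap = 0.0
    previous_recall = 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_average_precision(
    per_class: Dict[str, List[Detection]], gt_counts: Dict[str, int]
) -> float:
    """Mean of the per-class AP values."""
    aps = [average_precision(dets, gt_counts[name]) for name, dets in per_class.items()]
    return sum(aps) / len(aps) if aps else 0.0


# Toy data with hypothetical class names, not taken from AIZOO or FMDD.
toy_detections = {
    "masked": [(0.9, True), (0.8, False), (0.7, True)],
    "unmasked": [(0.95, True)],
}
toy_gt_counts = {"masked": 2, "unmasked": 1}
toy_map = mean_average_precision(toy_detections, toy_gt_counts)
```

On this toy input the "masked" class is penalized for the false positive ranked second, while the "unmasked" class is detected perfectly, so the mean falls between the two per-class values.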
Data availability
The AIZOO dataset is available at https://github.com/AIZOOTech/FaceMaskDetection, and the FMDD dataset is available at https://www.kaggle.com/datasets/wobotintelligence/face-mask-detection-dataset. The code is available to any interested party.
Funding
We thank the anonymous reviewers, whose constructive and thoughtful comments helped improve the manuscript. The authors thank the National Natural Science Foundation of China (Grant Nos. 42174164 and 41704132), the Key Project of the Science and Technology Department of Sichuan Province (Grant No. 2021YJ0358), and the Key Research and Development Support Projects of the Chengdu Science and Technology Department (Grant No. 2021-YF05-02411-SN) for their financial support.
Author information
Contributions
All authors conceived the presented idea. All authors developed the theory and performed the computations. All authors verified the analytical methods. All authors discussed the results and contributed to the final manuscript.
Ethics declarations
Conflict of interest
All authors declare that they have no competing interests.
Ethical approval
This declaration is “not applicable.”
About this article
Cite this article
Xu, R., Wang, P., Li, X. et al. YOLO-ARGhost: a lightweight face mask detection model. J Supercomput 80, 3162–3182 (2024). https://doi.org/10.1007/s11227-023-05588-3
DOI: https://doi.org/10.1007/s11227-023-05588-3