CTA-FPN: Channel-Target Attention Feature Pyramid Network for Prohibited Object Detection in X-ray Images

Zhang, Yi; Zhuo, Li; Ma, Chunjie; Zhang, Yutong; Li, Jiafeng

doi:10.1007/s11220-023-00416-7

Research
Published: 04 May 2023

Sensing and Imaging, volume 24, article number 14 (2023)

Abstract

Fast and accurate detection of prohibited objects in X-ray images is highly challenging. In this paper, a Channel-Target Attention Feature Pyramid Network (CTA-FPN), built on the YOLOv6 object detection framework, is proposed for prohibited object detection in X-ray images. It includes two key components: a Target Aware Attention Module (TAAM) and a Channel Attention Module (CAM). TAAM generates a target attention map that enhances the features of prohibited object regions and suppresses those of background regions, addressing the problems of object occlusion and cluttered backgrounds in X-ray images. CAM highlights the feature channels that are important to the detection task and suppresses the irrelevant ones. Together, the target-wise and channel-wise feature enhancement can effectively strengthen the feature representation capability of the network. The proposed CTA-FPN is incorporated into the S, M, and L models of YOLOv6, yielding three X-ray prohibited object detection models. Experimental results on two publicly available benchmark datasets, SIXray and CLCXray, show that CTA-FPN can effectively improve the detection performance of YOLOv6. In particular, YOLOv6-CTA-FPN-L achieves state-of-the-art detection accuracy.
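
To make the two kinds of enhancement concrete, the following is a minimal illustrative sketch in plain Python; it is not the authors' TAAM or CAM. It assumes a parameter-free channel gate (global average pooling followed by a sigmoid) in place of a learned channel attention module, and a hand-written spatial map in place of the target attention map that the network would predict. The function names `channel_attention` and `target_attention` and the toy values are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(features):
    """Channel-wise reweighting (assumed, parameter-free stand-in for CAM).

    features: list of C channels, each an H x W nested list of floats.
    Each channel is scaled by sigmoid(mean of that channel).
    Returns (weights, reweighted_features).
    """
    weights = []
    for channel in features:
        total = sum(sum(row) for row in channel)
        count = sum(len(row) for row in channel)
        weights.append(sigmoid(total / count))
    reweighted = [
        [[value * w for value in row] for row in channel]
        for channel, w in zip(features, weights)
    ]
    return weights, reweighted


def target_attention(features, target_map):
    """Target-wise reweighting (assumed stand-in for TAAM).

    target_map: H x W nested list with values in [0, 1]; values near 1 mark
    likely object regions, values near 0 mark background. Every channel is
    multiplied element-wise by the same map.
    """
    return [
        [[value * m for value, m in zip(row, map_row)]
         for row, map_row in zip(channel, target_map)]
        for channel in features
    ]


# Toy input: 2 channels of size 2 x 2 (hypothetical values).
features = [
    [[1.0, 2.0], [3.0, 4.0]],      # channel 0: strong responses
    [[-1.0, -1.0], [-1.0, -1.0]],  # channel 1: weak responses
]
# Hand-written map: object on the main diagonal, background elsewhere.
target_map = [[1.0, 0.0], [0.0, 1.0]]

weights, reweighted = channel_attention(features)
masked = target_attention(features, target_map)
enhanced = target_attention(reweighted, target_map)
```

In this sketch, channel 0 receives a larger weight than channel 1, and positions where the map is 0 are zeroed in every channel, which mirrors the "highlight important channels, suppress background regions" idea described in the abstract.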

Data Availability

The datasets generated and/or analyzed during the current study are available in the SIXray dataset repository (https://github.com/MeioJane/SIXray) and the CLCXray dataset repository (https://github.com/Vill-Lab/2022-TIFS-CLCXray).

Acknowledgements

This work is supported by the R&D Program of Beijing Municipal Education Commission (No. KZ202210005007), the Beijing Natural Science Foundation (No. L211017), and the General Program of Beijing Municipal Education Commission (No. KM202110005027).

Author information

Authors and Affiliations

Beijing Key Laboratory of Computational Intelligence and Intelligent System, and the Faculty of Information, Beijing University of Technology, Beijing, 100124, China
Yi Zhang, Li Zhuo, Chunjie Ma, Yutong Zhang & Jiafeng Li

Contributions

YZ: conceptualization, methodology, software, investigation, writing-original draft preparation. LZ: conceptualization, supervision, writing-reviewing and editing, funding acquisition. CM: software, investigation, validation. YZ: software, investigation. JL: resources, funding acquisition.

Corresponding author

Correspondence to Li Zhuo.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, Y., Zhuo, L., Ma, C. et al. CTA-FPN: Channel-Target Attention Feature Pyramid Network for Prohibited Object Detection in X-ray Images. Sens Imaging 24, 14 (2023). https://doi.org/10.1007/s11220-023-00416-7

Received: 14 February 2023
Revised: 06 March 2023
Accepted: 24 March 2023
Published: 04 May 2023
DOI: https://doi.org/10.1007/s11220-023-00416-7
