Abstract
Camouflaged people, such as soldiers on a battlefield, and camouflaged objects in natural environments are hard to detect because the hidden target strongly resembles its background. Identifying such targets therefore demands a high level of visual perception. To address this problem, we present a new end-to-end framework based on a multi-level attention network. We design a novel inception module that extracts features over multi-scale receptive fields to enhance feature representation. Furthermore, we use a dense feature pyramid to take advantage of multi-scale semantic features. Finally, to better locate the camouflaged target and distinguish it from the background, we develop a multi-attention module that generates more discriminative feature representations and combines semantic information with spatial information from different levels. Experiments on the camouflaged people dataset show that our approach outperforms all state-of-the-art methods.
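The abstract does not say how the multi-attention module computes its weights. As a rough illustration of the general idea only, the sketch below assumes a very simple form of spatial attention: the activations at each pixel are averaged across channels, passed through a sigmoid to give a weight between 0 and 1, and that weight rescales every channel at that pixel. The function names `sigmoid` and `spatial_attention` and the toy feature values are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def spatial_attention(features):
    """Apply a simple, assumed form of spatial attention.

    features: list of C channels, each an H x W list of lists of floats.
    Returns (weighted_features, attention_map).
    """
    channels = len(features)
    height = len(features[0])
    width = len(features[0][0])

    # Attention map: sigmoid of the mean activation across channels per pixel.
    attention = [
        [
            sigmoid(sum(features[c][i][j] for c in range(channels)) / channels)
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Rescale every channel by the shared per-pixel weight.
    weighted = [
        [
            [features[c][i][j] * attention[i][j] for j in range(width)]
            for i in range(height)
        ]
        for c in range(channels)
    ]
    return weighted, attention


# Toy input: 2 channels on a 2 x 2 grid (values are illustrative only).
feats = [
    [[2.0, -2.0], [0.0, 4.0]],
    [[2.0, -2.0], [0.0, 0.0]],
]
weighted, attn = spatial_attention(feats)
```

Pixels with strong average activation receive weights close to 1 and are largely preserved, while weakly activated pixels are suppressed. A learned attention module would replace the fixed channel average with trainable layers.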
Funding
This research received no external funding.
Author information
Contributions
Conceptualization and methodology, V.N.; software, R.H.; formal analysis, V.N. and R.H.; resources, R.H.; data curation, R.H.; writing (original draft preparation), R.H.; writing (review and editing), R.H. and V.N.; visualization, V.N. and R.H.; supervision, V.N.
All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of Interests
The authors declare no conflict of interest.
About this article
Cite this article
Hendaoui, R., Nabiyev, V. An end-to-end neural network for detecting hidden people in images based on multiple attention network. Multimed Tools Appl 81, 18531–18542 (2022). https://doi.org/10.1007/s11042-022-12118-5
DOI: https://doi.org/10.1007/s11042-022-12118-5