Spatiotemporal crowds features extraction of infrared images using neural network

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

Crowds can lead to severe, disastrous consequences, including fatalities. Videos obtained from public cameras or captured by drones flying overhead can be processed by artificial intelligence-based crowd analysis systems. In this area, an active research topic over the past few years, the goal is not only to identify the presence of crowds but also to predict the probability of crowd formation, so that timely warnings can be issued and preventive measures taken. Such systems can significantly reduce the probability of potential disasters. Developing effective systems is challenging, especially because of naturally occurring diverse conditions, variations in the pixel areas occupied by people or background, noise, the behaviors of individuals, the relative amounts, distributions, and directions of crowd movements, and the reasons crowds build up. This paper proposes an infrared video processing system based on a U-Net convolutional neural network for crowd monitoring in infrared video frames, to help estimate whether a crowd of people shows normal or abnormal trends. The proposed U-Net architecture aims to extract crowd features efficiently and to mark up people with sufficient accuracy, remaining competitive while using network configurations optimized in depth and number of filters, which in turn minimizes the number of coefficients. For faster processing, savings in hardware resources and implementation area, and lower power, the optimized network coefficients are represented in canonic signed digit (CSD) form with a minimal number of nonzero (±1) digits, which minimizes the number of underlying shift-add/subtract operations across all multipliers. The significantly reduced computational cost makes the proposed U-Net well suited to resource-constrained and low-power applications.
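The canonic signed digit idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (to_csd, csd_multiply, quantize) and the 8-bit fractional fixed-point format are assumptions chosen only for illustration. The sketch recodes an integer coefficient into digits from {-1, 0, +1} with no two adjacent nonzero digits, then multiplies by that coefficient using only shifts, additions, and subtractions.

```python
def to_csd(n: int) -> list:
    """Return the canonic signed digit form of n, least significant digit first.

    Each digit is -1, 0, or +1, and no two adjacent digits are both nonzero.
    """
    digits = []
    while n != 0:
        if n & 1:
            # n % 4 == 1 gives digit +1; n % 4 == 3 gives digit -1,
            # which leaves n divisible by 4 so the next digit is 0.
            d = 2 - (n % 4)
            n -= d
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits


def csd_multiply(x: int, digits: list) -> int:
    """Multiply x by the coefficient encoded in digits using shift-add/subtract only."""
    acc = 0
    for shift, d in enumerate(digits):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc


def quantize(w: float, frac_bits: int = 8) -> int:
    """Round a real coefficient to a fixed-point integer (assumed format)."""
    return round(w * (1 << frac_bits))


if __name__ == "__main__":
    coeff = quantize(0.09)            # hypothetical network coefficient
    digits = to_csd(coeff)
    nonzero_csd = sum(1 for d in digits if d != 0)
    nonzero_bin = bin(coeff).count("1")
    print(coeff, digits, nonzero_csd, nonzero_bin)
    print(csd_multiply(5, digits) == 5 * coeff)
```

In this sketch the coefficient 23 (binary 10111, four nonzero bits) becomes 32 - 8 - 1, which needs three nonzero digits and therefore one fewer add/subtract operation in a hardware multiplier.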

Author information

Corresponding author

Correspondence to Anas M. Al-Oraiqat.

Ethics declarations

Conflict of interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Al-Oraiqat, A.M., Drieiev, O., Drieieva, H. et al. Spatiotemporal crowds features extraction of infrared images using neural network. J Ambient Intell Human Comput 15, 2543–2556 (2024). https://doi.org/10.1007/s12652-024-04771-5

  • DOI: https://doi.org/10.1007/s12652-024-04771-5
