
HMR-vid: a comparative analytical survey on human motion recognition in video data

  • Published in: Multimedia Tools and Applications

Abstract

With the rapid spread of multimedia data and online viewing by users, research on machine vision and on the analysis and automatic understanding of video content is becoming increasingly important. Human motion recognition in video data is a crucial research subject in machine vision, with many applications, for instance video surveillance, video indexing, robotics, human-computer interfaces, and multimedia retrieval. Despite the large body of research on this topic, a more in-depth understanding, a complete classification, and an evaluation of the existing human motion recognition stages are still needed. The novelty of this paper is a comparative analytical framework with three major parts. First, three stages of human motion recognition are introduced: background subtraction, feature extraction, and machine learning classification. Second, five essential criteria are defined for evaluating the proposed human motion recognition methods. Finally, our comparative analysis of the human motion recognition stages comprises two models. Background subtraction methods are analyzed by applying the criteria for a qualitative comparison. Feature extraction and machine learning classification methods are then examined by specifying their main idea, benefits, and challenges. Our comparative analytical framework can benefit researchers in this field by simplifying the accurate selection and development of human motion recognition methods in future work.
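
To make the three stages named in the abstract concrete, the following is a minimal sketch of such a pipeline. It is not taken from the paper. The specific choices (frame differencing against a static background, blob area and bounding-box aspect ratio as features, a nearest-prototype classifier) and all function names are illustrative assumptions, picked only because they fit in a few lines of plain Python.

```python
# Illustrative three-stage pipeline; every technique here is an assumption,
# not a method proposed or evaluated in the paper.

def subtract_background(frame, background, threshold=30):
    """Stage 1: mark pixels that differ from a static background image."""
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(row, bg_row)]
        for row, bg_row in zip(frame, background)
    ]

def extract_features(mask):
    """Stage 2: describe the foreground by area and bounding-box height/width."""
    points = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not points:
        return (0.0, 0.0)
    rows = [r for r, _ in points]
    cols = [c for _, c in points]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return (float(len(points)), height / width)

def classify(features, prototypes):
    """Stage 3: return the label whose prototype vector is closest."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(prototypes, key=lambda label: squared_distance(features, prototypes[label]))

if __name__ == "__main__":
    background = [[0] * 6 for _ in range(6)]
    frame = [row[:] for row in background]
    for r in range(1, 5):          # a tall, one-pixel-wide bright blob
        frame[r][2] = 200
    # Hypothetical class prototypes: (area, height/width ratio)
    prototypes = {"standing": (4.0, 4.0), "lying": (4.0, 0.25)}
    mask = subtract_background(frame, background)
    print(classify(extract_features(mask), prototypes))
```

In a real system each stage would be replaced by one of the methods the survey compares; the sketch only shows how the output of one stage feeds the next.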


References

  1. Aanaes H, Lindbjerg Dahl A, Pedersen KS (2011) Interesting interest points a comparative study of interest point performance on a unique data set. Int J Comput Vis 97:18–35

    Google Scholar 

  2. Abdulmunem A, Lai YK, Sun X (2016) Saliency guided local and global descriptors for effective action recognition. Comput Vis Media 2(1):97–106

    Google Scholar 

  3. Abidine BM, Fergani L, Fergani B, Oussalah M (2016) The joint use of sequence features combination and modified weighted SVM for improving daily activity recognition. Pattern Anal Applic 21:119–138

    MathSciNet  Google Scholar 

  4. Afsar P, Cortez P, Santos H (2015) Automatic visual detection of human behavior: a review from 2000 to 2014. Expert Syst Appl 42(20):6935–6956

    Google Scholar 

  5. Aggarwal JK, Cai Q (1999) Human motion analysis: a review. Comput Vis Image Underst 73(3):428–440

    Google Scholar 

  6. Aggarwal JK, Ryoo MS (2011) Human activity analysis: a review. ACM Comput Surv 43(3):1–43

    Google Scholar 

  7. Aggarwal JK, Cai Q, Liao W, Sabata B (1994) Articulated and elastic non-rigid motion: a review. IEEE workshop on motion of non-rigid and articulated objects, pp 2-14

  8. Ahmadi M, O’Neil M, Fragala-Pinkham M, Lennon N, Trost S (2018) Machine learning algorithms for activity recognition in ambulant children and adolescents with cerebral palsy. Journal of neuroEngineering and Rehabilitation (JNER) 15(1):105

  9. Alawi MA, Khalifa OO, Islam MDR (2013) Performance comparison of background estimation algorithms for detecting moving vehicle. World Applied Sciences Journal 21 (Mathematical Applications in Engineering), pp 109–114

  10. Ali N, Neagu D, Trundle P (2019) Evaluation of k-nearest neighbour classifier performance for heterogeneous data sets. SN Appl Sci 1:1559

    Google Scholar 

  11. Al-Maadeed S, Boubezari R, Kunhoth S, Bouridane A (2018) Robust feature point detector for car make recognition. Comput Ind 100:129–136

    Google Scholar 

  12. Awad AI, Hassaballah M (2016) Image feature detectors and descriptors foundations and applications. Studies in computational intelligence (Springer), vol(630)

  13. Awad M, Khanna R (2015) Efficient learning machines theories, concepts, and applications for engineers and system designers. Apress Open

  14. Baf FE, Bouwmans Th, Vachon B (2008) A fuzzy approach for background subtraction. 15th IEEE international conference on image processing, pp 2648–2651

  15. Baf FE, Bouwmans Th, Vachon Bertrand (2008) Fuzzy integral for moving object detection. IEEE international conference on fuzzy systems (IEEE World Congress on Computational Intelligence)

  16. Baf FE, Bouwmans T, Vachon B (2008) Type-2 Fuzzy Mixture of Gaussians Model: Application to Background Modeling. International Symposium on Visual Computing, pp772–781

  17. Bello AA, Chiroma H, Gital AY, Gabralla LA, Abdulhamid SM, Shuib L (2020) Machine learning algorithms for improving security on touch screen devices: a survey and new perspectives. Neural Comput & Applic

  18. Benezeth Y, Jodoin PM, Emile B, Laurent H, Rosenberger C (2010) Comparative study of background subtraction algorithms. J Electronic Imaging 19(3)

  19. Bhatia A (2007) Hessian-Laplace feature detector and Haar descriptor for image matching. Postdoctoral Studies Thesis, University of Ottawa

  20. Bobick AF (1997) Movement, activity and action: the role of knowledge in the perception of motion. Philos Trans R Soc Lond Ser B Biol Sci 352(1358):1257–1265

    Google Scholar 

  21. Boghdady R, Salama Ch, Wahba A (2015) GPU-accelerated real-time video background subtraction. 2015 tenth international conference on Computer Engineering & Systems (ICCES), pp 34-39

  22. Bouttefroy PLM, Bouzerdoum A, Phung SL, Beghdadi A (2010) On the analysis of background subtraction techniques using Gaussian mixture models. IEEE International Conference on Acoustics, Speech and Signal Processing,pp 4042–4045

  23. Bouwmans Th (2011) Recent advanced statistical background modeling for foreground detection - a systematic survey. Guachi 4(3), 147–176

  24. Bouwmans T (2014) Traditional and recent approaches in background modeling for foreground detection: an overview. Comput Sci Rev 11-12:31–66

    MATH  Google Scholar 

  25. Braham M, Droogenbroeck MV (2016) Deep background subtraction with scene-specific convolutional neural networks. 2016 international conference on systems, signals and image processing (IWSSIP)

  26. Bux A, Angelov P, Habib Z (2017) Vision based human activity recognition: a review. Advances in computational intelligence systems, pp 341-371

  27. C’ulibrk D, Crnojevic V (2010) GPU-Based Complex-Background Segmentation Using Neural Networks. 21000 Novi Sad, Serbia

  28. Cabani C, MacLean WJ (2006) A proposed pipelined-architecture for FPGA-based affine-invariant feature detectors. 2006 conference on computer vision and pattern recognition workshop (CVPRW'06)

  29. Calvo-Gallego E, Brox P (2014) Low-cost dedicated hardware IP modules for background subtraction in embedded vision systems. J Real-Time Image Proc 12:681–695

    Google Scholar 

  30. Cano A, Yeguas-Bolivar E, Munoz-Salinas R, Medina-Carnicer R, Ventura S (2018) Parallelization strategies for markerless human motion capture. J Real-Time Image Proc 14(2):453–467

    Google Scholar 

  31. Cao D, Masoud OT, Boley D, Papanikolopoulos N (2009) Human motion recognition using support vector machines. Comput Vis Image Underst 113(19):1064–1075

    Google Scholar 

  32. Carr P (2008) GPU accelerated multimodal background subtraction. Digital Image Computing: Techniques and Applications

  33. Caruccio L, Polese G, Tortora G, Lannone D (2019) EDCAR: a knowledge representation framework to enhance automatic video surveillance, human activity analysis: a review. Expert Syst Appl 131:190–207

    Google Scholar 

  34. Cedras C, Shah M (1995) Motion-based recognition: a survey. Image Vis Comput 13(2):129–155

    Google Scholar 

  35. Chacon-Murguia MI, Ramirez-Alonso G (2015) Fuzzy-neural self-adapting background modeling with automaticmotion analysis for dynamic object detection. Appl Soft Comput 36:570–577

    Google Scholar 

  36. Chandrasekhar U, Das TK (2011) A survey of techniques for background subtraction and traffic analysis on surveillance video. Univers J Appl Comput Sci Technol 1(3):107–113

    Google Scholar 

  37. Cheng L, Gong J, Yang X, Fan C, Han P (2008) Robust affine invariant feature extraction for image matching. IEEE Geosci Remote Sens Lett 5(2):246–250

    Google Scholar 

  38. Cheng L, Gong M, Schuurmans D, Caelli T (2011) Real-time discriminative background subtraction. IEEE Trans Image Process 20(5):1401–1414

    MathSciNet  MATH  Google Scholar 

  39. Cheok MJ, Omar Z, Jaward MH (2017) A review of hand gesture and sign language recognition techniques. Int J Mach Learn Cybern 10(1):131–153

    Google Scholar 

  40. Cheung SCS, Kamath C (2004) Robust techniques for background subtraction in urban traffic video. Visu Commun Image Process 5308:881–892

    Google Scholar 

  41. Chiu CC, Ku MY, Liang LW (2009) A robust object segmentation system using a probability-based background extraction algorithm. IEEE Trans Circuits Syste Video Technol 20(4):518–528

    Google Scholar 

  42. Cho SG, Yoshikawa M, Ding M, Takamatsu J, Ogasawara T (2019) Machine-learning-based hand motion recognition system by measuring forearm deformation with a distance sensor array. International Journal of Intelligent Robotics and Applications 3: 418–429

  43. Choudhury SK, Sa PK, Bakshi S, Majhi B (2016) An evaluation of background subtraction. IEEE Access 4:6133–6150

    Google Scholar 

  44. Choudhury SK, Sa PK, Bakshi S, Majhi B (2016) An evaluation of background subtraction for object detection Vis-a-Vis mitigating challenging scenarios. IEEE Access 4:6133–6150

    Google Scholar 

  45. Chuanwei D, Li Z, Chen G, Lei B, Zhicheng L, Hong H, Yusheng L, Xiahua Z (2018) Non-contact human motion recognition based on UWB radar. IEEE J Emerg Sel Topics Circuits Syst 8(2):306–315

    Google Scholar 

  46. Cohignac T, Lopez C, Morel JM (1994) Integral and local affine invariant parameter and application to shape recognition. Proceedings of 12th international conference on pattern recognition, pp 164-168

  47. Cristani M, Farenzena M, Bloisi D, Murino V (2010) Background subtraction for automated multisensor surveillance: a comprehensive review. EURASIP Journal on Advances in Signal Processing (JASP) 2010:1–24

  48. Cucchiara R, Grana C, Piccardi M, Prati A (2003) Detecting moving objects, ghosts, and shadows in video streams. IEEE Trans Pattern Anal Mach Intell 25(10):1337–1342

    Google Scholar 

  49. Dadi HS, Moham Pillutla GK, Makkena ML (2018) Face recognition and human tracking using GMM, HOG and SVM in surveillance videos. Ann Data Sci 5(2):157–179

    Google Scholar 

  50. Dai KX, Li GH, Gan YL (2006) A probabilistic model for surveillance video mining. Proceedings of the fifth international conference on machine learning and cybernetics, pp 1144–1148

  51. Dai C, Liu X, Lai J, Chao HC (2019) Human behavior deep recognition architecture for Smart City applications in the 5G environment. IEEE Netw 33(5):206–211

    Google Scholar 

  52. Darwich A, Hebert PA, Bigand A, Mohanna Y (2018) Background subtraction based on a new fuzzy mixture of Gaussians for moving object detection. J Imaging 4(7)

  53. Dawn DD, Shaikh SH (2015) A comprehensive survey of human action recognition with spatio-temporal interest point. Vis Comput 32:289–306

    Google Scholar 

  54. Debnath S, Roy P (2020) Appearance and shape-based hybrid visual feature extraction: toward audio-visual automatic speech recognition. Signal, Image and Video Processing. https://doi.org/10.1007/s11760-020-01717-0

  55. Dewan A, Caselitz T, Burgard W (2018) Learning a local feature descriptor for 3D LiDAR scans. 2018 IEEE/RSJ international conference on intelligent robots and systems (IROS), pp 4774-4780

  56. Dhome Y, Tronson N, Vacavant A, Chateau Th, Gabard Ch, Goyat Y, Gruyer D (2010) A benchmark for background subtraction algorithms in monocular vision: a comparative study. 2010 2nd international conference on image processing theory, tools and applications

  57. Ding C, Zhu P (2014) Head motion synthesis from speech using deep neural networks. Multimed Tools Appl 74:9871–9888

    Google Scholar 

  58. Domale M, Gaikwad V (2017) Robust pedestrian detection framework using Harris corner detector and Kalman filter. International Journal of Engineering Research & Technology (IJERT) 6(2):227–232

    Google Scholar 

  59. Drosou A, Loannidis D, Moustaks K, Tzovaras D (2012) Spatiotemporal analysis of human activities for biometric authentication. Comput Vis Image Underst 116(3):411–421

    Google Scholar 

  60. Elgammal A, Harwood D, Davis L (2000) Non-parametric model for background subtraction. 6th European conference on computer vision, Dublin

  61. Elhoushi M, Georgy J, Noureldin A, Korenberg MJ (2016) A survey on approaches of motion mode recognition using sensors. IEEE Trans Intell Transp Syst 18(7):1662–1686

    Google Scholar 

  62. Fauske E, Mol L, Bakken RH (2009) A comparison of learning based background subtraction techniques implemented in CUDA. Proceedings of the first Norwegian artificial intelligence symposium 181-192

  63. Fergus R, Perona P, Zisserman A (2003) Object class recognition by unsupervised scale-invariant learning. In computer vision and pattern recognition, pp 264-271

  64. Forsyth DA, Arikan O, Ikemoto L, O’Brien J, Ramanan D (2005) Computational studies of human motion: part 1, tracking and motion synthesis. Found Trends Comput Graph Vis 1(2/3):77–254

    Google Scholar 

  65. Gafurov D (2007) A survey of biometric gait recognition: approaches, security and challenges. Annual Norwegian Computer Science Conference

  66. Gauglitz S, Hollerer T, Turk M (2011) Evaluation of interest point detectors and feature descriptors for visual tracking. Int J Comput Vis 94:335–360

    MATH  Google Scholar 

  67. Gavrila DM (1999) The visual analysis of human movement: a survey. Comput Vis Image Underst 73(1):82–98

    MATH  Google Scholar 

  68. Giannarou S, Visentini-Scarzanella M, Yang GZh (2009) Affine-invariant anisotropic detector for soft tissue tracking in minimally invasive surgery, pp 1059-1062

  69. Gong M, Shu Y (2020) Real-time detection and motion recognition of human moving objects based on deep learning and multi-scale feature fusion in video. IEEE Access pp, 25811–25822

  70. Goyal K, Singhai J (2018) Review of background subtraction methods using Gaussian mixture model for video surveillance systems. Artif Intell Rev 50:241–259

    Google Scholar 

  71. Grimson WEL, Stauffer C, Romano R, Lee L (1998) Using adaptive tracking to classify and monitor activities in a site. 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition

  72. Guachi Guachi de los A (2016) Background Subtraction for moving object detection. Ph.D. thesis, University of Calabria

  73. Guler P, Emeksiz D, Temizel A, Teke M, Temizel TT (2013) Real-time multi-camera video analytics system on GPU. J Real-Time Image Proc 11:457–472

    Google Scholar 

  74. Hall P, Marshall D, Martin R (2000) Merging and splitting eigenspace models. IEEE Trans Pattern Anal Mach Intell 22(9):1042–1049

    Google Scholar 

  75. Han B, Jain R (2007) Real-time subspace-based background modeling using multi-channel data. International Symposium on Visual Computing (ISVC) 4842:162–172

  76. Hao Z, Duan Y, Dang X, Zhang T (2020) CSI-HC: a WiFi-based indoor complex human motion recognition method. Hindawi 2020:20

    Google Scholar 

  77. Hasan H, Abdul-Kareem S (2014) Static hand gesture recognition using neural networks. Artif Intell Rev 41:147–181

    Google Scholar 

  78. Hassaballah M, Kenk MA, El-Henawyy IM (2020) Local binary pattern-based on-road vehicle detection in urban traffic scene. Pattern Anal Applic

  79. Hassan MM, Ullah S, Hossain MS, Alelaiwi A (2020) An end-to-end deep learning model for human activity recognition from highly sparse body sensor data in internet of medical things environment. J Supercomput

  80. Hoferlin B, Zimmermann K (2009) Towards reliable traffic sign recognition. 2009 IEEE Intelligent Vehicles Symposium, pp 324–329

  81. Hongeng S, Nevatia R, Bremond F (2004) Video-based event recognition: activity representation and probabilistic recognition methods. Comput Vis Image Underst 96(2):129–162

    Google Scholar 

  82. Hruz M, Trojanova J, Zelezny M (2011) Local binary pattern based features for sign language recognition. Software and Hardware for Pattern Recognition and Image Analysis 21(3):389–401

    Google Scholar 

  83. Hsieh JW, Chen LC, Chen DY (2014) Symmetrical SURF and its applications to vehicle make and model recognition. IEEE Trans Intell Transp Syst 15(1):6–20

    Google Scholar 

  84. Hsieh CC, Hsih MH, Jiang MK, Cheng YM, Liang EH (2015) Effective semantic features for facial expressions recognition using SVM. Multimed Tools Appl 75:6663–6682

    Google Scholar 

  85. Hussein Ali K, Wang T (2014) Recognition of human action and identification based on SIFT and watermark. International conference on intelligent computing, pp 298-309

  86. Jabri S, Duric Z, Wechsler H, Rosenfeld A (2000) Detection and location of people in video images using adaptive fusion of color and edge information. Proceedings 15th international conference on pattern recognition, pp 627–631

  87. Jalal A, Mahmood M, Hasan AS (2019) Multi-features descriptors for human activity tracking and recognition in indoor-outdoor environments. Proceedings of 2019 16th international Bhurban conference on applied sciences & technology (IBCAST), pp 371-376

  88. Jazayeri A, Cai H, Zheng JY, Tuceryan M (2011) Vehicle detection and tracking in car video based on motion model. IEEE Trans Intell Transp Syst 12(2):583–595

    Google Scholar 

  89. Jhuang H, Serre T, Wolf L, Poggio T (2007) A biologically inspired system for action recognition. IEEE 11th international conference on computer vision, pp 1–8

  90. Ji Sh XW, Yang M, Yu K (2013) 3D convolutional neural networks for human action recognition. IEEE Trans Pattern Anal Mach Intell 35(1):221–231

    Google Scholar 

  91. Ji XF, Wu QQ, Ju ZJ, Wang YY (2014) Study of human action recognition based on improved spatio-temporal features. Int J Autom Comput 11(5):500–509

    Google Scholar 

  92. Jiang Q, Liu M, Wang X, Ge M, Lin L (2016) Human motion segmentation and recognition using machine vision for mechanical assembly operation. SpringerPlus 5(1):1629

    Google Scholar 

  93. Joshi MJ, Chaudhari JP (2020) Comparative study and implementation of background modeling techniques for background subtraction. Int J Adv Sci Technol 29(12)

  94. Joudaki S, Bin Sunar MSh, Kolivand H (2015) Background subtraction methods in video streams: a review. International conference on interactive digital media (ICIDM), pp 1-6

  95. Ju Z, Liu H (2012) Fuzzy Gaussian Mixture Models. Pattern Recogn 45(3):1146–1158

  96. Juan L, Gwun O (2009) A comparison of SIFT, PCA-SIFT and SURF. International Journal of Image Processing (IJIP) 3(4):143–152

    Google Scholar 

  97. Kale GV, Patil VH (2016) A study of vision based human motion recognition and analysis. International Journal of Ambient Computing and Intelligence (IJACI) 7(2):75–92

  98. Kalsotra R, Arora S (2017) Recent trends in background subtraction approach for moving object detection. International Journal of Scientific Research in Computer Science, Engineering and Information Technology, 2(7)

  99. Kavitha P, Vijaya K (2018) Optimal feature-level fusion and layer k-support vector machine for spoofing face detection. Multimed Tools Appl 77:26509–26543

    Google Scholar 

  100. Kellokumpu V, Zhao G, Pietikäinen M (2007) Recognition of human actions using texture descriptors. Mach Vis Appl 22:767–780

    Google Scholar 

  101. Kenan MU, Hui F, Zhao X, Prehofer C (2016) Multiscale edge fusion for vehicle detection based on difference of Gaussian. Optik 127(11):4794–4798

    Google Scholar 

  102. Khalifa AF, Badr E, Elmahdt HN (2019) A survey on human detection surveillance systems for raspberry pi. Image Vis Comput 85:1–13

    Google Scholar 

  103. Khan MA, Javed K, Khan SA, Saba T, Habib U, Khan JA, Abbasi AA (2020) Human action recognition using fusion of multiview and deep features: an application to video surveillance. Multimed Tools Appl

  104. Kim CH, Lee JY, Lee JJ (2003) Feature extraction method for a robot map using neural networks. Artif Life Robots 7:86–90

    Google Scholar 

  105. Kiran VK, Parida P, Dash S (2013) Vehicle detection and classification: a review. Journal of Information Assurance and Security (JIAS) 8:067–093

  106. Krig S (2016) Interest point detector and feature descriptor survey. Computer vision metrics (Springer), pp 187-246

  107. Kumar M, Gupta S, Mohan N (2020) A computational approach for printed document forensics using SURF. Soft Comput 24:13197–13208

    Google Scholar 

  108. Laptev I, Marszalek M, Schmid C, Rozenfeld B (2008) Learning realistic human actions from movies. In Computer Vision and Pattern Recognition CVPR, pp 1–8

  109. Laugraud B, Pierard S, Braham M, Droogenbroeck MV (2015) Simple median-based method for stationary background generation using background subtraction algorithms. International Conference on Image Analysis and Processing, pp 477–484

  110. Lazebnik S, Schmid C, Ponce J (2006) Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2):2169–2178

  111. Lee J, Ko H (2008) Gradient-based local affine invariant feature extraction for mobile robot localization in idoor environment. Pattern Recogn Lett 29:1934–1940

    Google Scholar 

  112. Lee MH, Park IK (2014) Performance evaluation of local descriptors for affine invariant region detector. Asian conference on computer vision (Springer), pp 630-643

  113. Leng C, Zhang H, Li B, Cai G, Pei Z, He AL (2016) Local feature descriptor for image matching: a survey. IEEE Access 4:1–12

    Google Scholar 

  114. Li C, Ma L (2009) A new framework for feature descriptor based on SIFT. Pattern Recogn 30:544–557

    Google Scholar 

  115. Li L, Huang W, Gu IYH, Tian Q (2004) Statistical modeling of complex backgrounds for foreground object detection. IEEE Trans Image Process 13(11):1459–1472

    Google Scholar 

  116. Li C, Kulkarni PR, Prabhakaran B (2007) Segmentation and recognition of motion capture data stream by classification. Multimed Tools Appl 35:55–70

    Google Scholar 

  117. Li J, Ma T, Zhou X, Liu Y, Cheng Sh, Ye Ch, Wang Y (2017) A real-time human motion recognition system using topic model and SVM. IEEE EMBS International Conference on Biomedical & Health Informatics (BHI), pp 173–176

  118. Lindeberg T (1998) Feature detection with automatic scale selection. Int J Comput Vis 30(2)

  119. Liu Z, Song X, Tang Z (2015) Fusing hierarchical multi-scale local binary patterns and virtual mirror samples to perform face recognition. Neural Comput & Applic 26:2013–2026

    Google Scholar 

  120. Liu H, Ju Zh, Ji X, Chan ChS, Khoury M (2017) Human motion sensing and recognition a fuzzy qualitative approach. Studies in computational intelligence (Springer), 675

  121. Lu S, Ma X (2019) Adaptive random-based self-organizing background subtraction for moving detection. Int J Mach Learn Cybern 11:1267–1276

    Google Scholar 

  122. Maale BR, Gurredar R (2019) Survey on human motion recognition. Int J Eng Trends Technol 67(10):17–19

    Google Scholar 

  123. Maddalena L, Petrosino A (2008) A self-organizing approach to background subtraction for visual surveillance applications. IEEE Trans Image Process 17(7):1168–1177

    MathSciNet  Google Scholar 

  124. Maddalena L, Petrosino A (2009) Self organizing and fuzzy modelling for parked vehicles detection. International conference on advanced concepts for intelligent vision systems, pp 422–433

  125. Marinho LB, Souza Junior AHd, Reboucas Filho PP (2016) A new approach to human activity recognition using machine learning techniques. international conference on intelligent systems design and applications, pp 529–538

  126. Matsui YI, Miyoshi Y (2007) Difference-of-Gaussian-like characteristics for optoelectronic visual sensor. IEEE Sens Journal 7(10):1447–1452

    Google Scholar 

  127. Matsuyama T, Wada T, Habe H, Tanahashi K (2006) Background subtraction under varying illumination. Systems and Computers in Japan 37(4):2201–2211

    Google Scholar 

  128. Mayo Z, Tapamo JR (2009) Background subtraction survey for highway surveillance. Twentieth annual symposium of the pattern recognition Association of South Africa, pp 77-82

  129. McKenna SJ, Jabari S, Duric Z, Rosenfeld A, Wechsler H (2000) Tracking groups of people. Comput Vis Image Underst 80(1):42–56

    MATH  Google Scholar 

  130. Medioni G, Cohen I, Bremond F, Hongeng S, Nevatia R (2001) Event detection and analysis from video streams. IEEE Trans Pattern Anal Mach Intell 23(8):873–889

    Google Scholar 

  131. Miao Q, Wang G, Shi C, Lin X, Ruan Z (2011) A new framework for on-line object tracking based on SURF. Pattern Recogn 32:1564–1571

    Google Scholar 

  132. Mikolajczyk K, Schmid C (2004) Scale & Affine Invariant Interest Point Detectors. Int J Comput Vis 60(1):63–86

    Google Scholar 

  133. Mingyang W, Yimin D. Zh Guolong C (2019) Human motion recognition exploiting radar with stacked recurrent neural network. Digital Signal Processing (Elsevier), vol (87): 125:131

  134. Mishra SK, Jtmcoe F, Bhagat KS (2015) A survey on human motion detection and surveillance. International Journal of Advanced Research in Electronics and Communication Engineering (IJARECE) 4(4):1044–1048

    Google Scholar 

  135. Moeslund TB, Granum E (2001) A survey of computer vision-based human motion capture. Comput Vis Image Underst 81:231–268

    MATH  Google Scholar 

  136. Moeslund TB, Hilton A, Kruger V (2006) A survey of advances in vision-based human motion capture and analysis. Comput Vis Image Underst 104:90–126

    Google Scholar 

  137. Mohamed AN, Ali MM (2013) Human motion analysis, recognition and understanding in computer vision: a review. Journal of Engineering Sciences (JES), Assiut University, Faculty of Engineering 41(5):1928–1946

  138. Mohanty A, Shantaiya S (2015) A survey on moving object detection using background subtraction methods in video. Int J Comput Appl 975

  139. Mubarak Sh, Ramesh J (2013) Motion-based recognition. Springer Science and Business Media, vol (9)

  140. Nagel HH (1988) From image sequences towards conceptual descriptions. Image Vis Comput 6(2):59–74

    Google Scholar 

  141. Ng CC, Yap MH, Costen N, Li AB (2015) Wrinkle detection using Hessian line tracking. IEEE Access 3:1079–1088

    Google Scholar 

  142. Niebles JC, Fei-Fei L (2007) A hierarchical model of shape and appearance for human action classification. IEEE conference on computer and pattern recognition. https://doi.org/10.1109/CVPR.2007.383132

  143. Nikolov B, Kostov N (2014) Motion detection using adaptive temporal averaging method. Radioengineering 23(2):652–658

    Google Scholar 

  144. Nurhadiyatna A, Wijayanti R, Fryantoni D (2016) Extended Gaussian mixture model enhanced by hole filling algorithm (GMMHF) utilize GPU acceleration. Information science and applications (ICISA), pp 459-469

  145. Nweke HF, The YW, Mujtaba G, Alo UR, Al-garadi MA (2019) Multi-sensor fusion based on multiple classifier systems for human activity identification. Human-Centric Computing and Information Sciences (HCIS) volume 9(34):1–44

  146. Olivar NM, Rosario B, Pentland AP (2000) A Bayesian computer vision system for modeling human interactions. IEEE Trans Pattern Anal Mach Intell 22(8):831–843

    Google Scholar 

  147. Oliver P, Sapiro G, Tannenbaum A (1999) Affine invariant detection: edge maps, anisotropic diffusion, and active contours. Acta Appl Math 59:45–77

    MathSciNet  MATH  Google Scholar 

  148. Oyallo E, Rabin J (2015) An analysis of the SURF method. Image Processing on Line 5:176–218

    MathSciNet  Google Scholar 

  149. Pal SK, Bhoumik D, Chakraborty DB (2019) Granulated deep learning and Z-numbers in motion detection and object recognition. Neural Comput & Applic

  150. Panahi S, Sheikhi S, Hadadan Sh, Gheissari N (2008) Evaluation of background subtraction methods. Digital Image Computing: Techniques and Applications, pp 357–364

  151. Park S, Aggarwal JK (2004) A hierarchical Bayesian network for event recognition of human actions and interactions. Multimedia Systems 10:164–179

    Google Scholar 

  152. Parks DH, Fels SS (2008) Evaluation of background subtraction algorithms with post-processing. IEEE fifth international conference on advanced video and signal based surveillance, pp 192-199

  153. Patel TP, Panchal SR (2014) Corner detection techniques: an introductory survey. Int J Eng Dev Res 2(4):3680–3686

    Google Scholar 

  154. Pattar SY (2015) Study of corner detection algorithms and evaluation methods. Int J Innov Res Sci Eng Technol 4(5):2780–2787

    Google Scholar 

  155. Phapatanaburi K, Wang L, Sakagami R, Zhang Z, Li X, Iwahashi M (2015) Distant-talking accent recognition by combining GMM and DNN. Multimed Tools Appl 75:5109–5124

    Google Scholar 

  156. Piccardi M (2004) Background subtraction techniques: a review. IEEE International Conference on Systems, Man and Cybernetics (SMC) 4:3099–3104

  157. Pilet J, Strecha Ch, Fua P (2008) Making background subtraction robust to sudden illumination changes. European Conference on Computer Vision ECCV, pp 567–558

  158. Poppe R (2007) Vision-based human motion analysis: An overview. Comput Vis Image Underst 108:4–18

    Google Scholar 

  159. Rahman FYA, Hussain A, Zaki WMDW, Zaman HB, Tahir NM (2013) Enhancement of background subtraction techniques using a second derivative in gradient direction filter. Journal of Electrical and Computer Engineering, pp: 1–12

  160. Ramamurthy SR, Roy N (2018) Recent trends in machine learning for human activity recognition- a survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 8(4)

  161. Rautaray SS, Agrawal A (2015) Vision based hand gesture recognition for human computer interation: a survey. Artif Intell Rev 43:1–54

    Google Scholar 

  162. Reyneke C, Nel A, Robinson P (2014) Comparison of background subtraction techniques under sudden illumination. Conference: Pattern Recognition Association of South Africa (PRASA)

  163. Rodriguez-Moreno I, Martinez-Otzeta JM, Sierra B, Rodriguez I, Jauregi E (2019) Video Activity Recognition: State-of-the-Art Sensors 19(14): 3160

  164. Roshanbin N, Miller J (2016) A comparative study of the performance of local feature-based pattern recognition algorithms. Pattern Anal Applic 20:1145–1156

    MathSciNet  Google Scholar 

  165. Sajid H, Sch SC (2017) Universal multimode background subtraction. IEEE Trans Image Process 26(7):3249–3260

    MathSciNet  MATH  Google Scholar 

  166. Sajid H, Cheung SCS, Jacobs N (2019) Motion and appearance based background subtraction for freely moving cameras. Signal Process Image Commun 75:11–21

    Google Scholar 

  167. Salahat E, Qasaimeh M (2017) Recent Advances in Features Extraction and Description Algorithms: A Comprehensive Survey. IEEE international conference on industrial technology, pp 1059–1063

  168. Santoyo-Morales JE, Hasimoto-Beltran R (2014) Video background subtraction in complex environments. J Appl Res Technol 12(3):527–537

  169. Saremi M, Yaghmaee F (2019) Efficient encoding of video descriptor distribution for action recognition. Multimed Tools Appl 79:6025–6043

  170. Sasirekha K, Thangavel K (2018) Optimization of K-nearest neighbor using particle swarm optimization for face recognition. Neural Comput & Applic 31:7935–7944

  171. Schuldt Ch, Laptev I, Caputo B (2004) Recognizing human actions: a local SVM approach. Proceedings of the 17th international conference on pattern recognition vol(3), pp 32–36

  172. Sehairi K, Fatima C, Meunier J (2017) Comparative study of motion detection methods for video surveillance systems. J Electron Imaging 26(2):023025

  173. Seib V, Kusenbach M, Thierfelder S, Paulus D (2014) Object recognition using Hough-transform clustering of SURF features. Conference: Workshops on Electronical and Computer Engineering Subfields

  174. Seki M, Fujiwara H, Sumi K (2000) A robust background subtraction method for changing background. Proceedings fifth IEEE workshop on applications of computer vision, pp 207–213

  175. Setiono R, Liu H (1998) Feature extraction via neural network. Kluwer Academic Publishers, pp 192–204

  176. Shahbaz A, Hariyono J, Jo KH (2015) Evaluation of background subtraction algorithms for video surveillance. University of Ulsan

  177. Shaikh SH, Saeed Kh, Chaki N (2014) Moving object detection using background subtraction, Springer, pp 15–23

  178. Sharma L, Lohan N (2019) Performance analysis of moving object detection using BGS techniques in visual surveillance. International Journal of Spatio-Temporal Data Science (IJSTDS) 1(1):22–53

  179. Sharma TK, Sarvesh NSB, Mamatha YN (2013) Satellite image feature extraction using neural network technique. Proceedings of ICAdC, AISC 174:101–106

  180. Shi Z, Li H, Cao Q, Ren H, Fan B (2020) An image mosaic method based on convolutional neural network semantic features extraction. J Signal Process Syst 92:435–444

  181. Shirazi MS, Morris BT (2019) Trajectory prediction of vehicles turning at intersections using deep neural networks. Mach Vis Appl 30:1097–1109

  182. Shunzhi Z, Liu L, Si C (2015) Image feature detection algorithm based on the spread of hessian source. Multimedia Systems 23:105–117

  183. Sigari MH, Mozayani N, Pourreza HM (2008) Fuzzy running average and fuzzy background subtraction: concepts and applications. International Journal of Computer Science and Network Security (IJCSNS) 8(2):138–143

  184. Smach F, Lemaitre C, Gauthier JP, Miteran J, Atri M (2008) Generalized Fourier descriptors with applications to objects recognition in SVM context. J Math Imaging Vis 30:43–71

  185. Sobral A, Vacavant A (2014) A comprehensive review of background subtraction algorithms evaluated with synthetic and real videos. Comput Vis Image Underst 122:4–21

  186. Stauffer C, Grimson WEL (2000) Learning patterns of activity using real-time tracking. IEEE Trans Pattern Anal Mach Intell 22(8):747–757

  187. Suhr JK, Jung HG, Li G, Kim J (2011) Mixture of Gaussians-based background subtraction for bayer-pattern image sequences. IEEE Trans Circuits Syst Video Technol 21(3):365–370

  188. Sun L, Sheng W, Liu Y (2015) Background modeling and its evaluation for complex scenes. Multimed Tools Appl 74(11):3947–3966

  189. Sykora P, Kamencay P, Hudec R (2014) Comparison of SIFT and SURF methods for use on hand gesture recognition based on depth map. 2014 AASRI conference on circuits and signal processing (CSP 2014), pp 19–24

  190. Takatoo M, Kitamura T, Kobayashi Y (1998) Vehicle extraction using spatial differentiation and subtraction. Systems and Computers in Japan 29(7):2976–2985

  191. Takhar G, Prakash Ch, Mittal N, Kumar R (2016) Comparative analysis of background subtraction techniques and applications. IEEE international conference on recent advances and innovations in engineering, pp 23-25

  192. Takhar G, Prakash Ch, Mittal N, Kumar R (2016) Comparative analysis of background. IEEE International Conference on Recent Advances and Innovations in Engineering (ICRAIE-2016)

  193. Tang Y, Jiang L, Hou Y, Wang R (2017) Contactless fingerprint image enhancement algorithm based on hessian matrix and SIFT. 2nd international conference on multimedia and image processing, pp 156–160

  194. Thangaraju B, Vennila I, Chinnasamy G (2012) Detection of microcalcification clusters using hessian matrix and foveal segmentation method on multiscale analysis in digital mammograms. J Digit Imaging 25:607–619

  195. Tian Y, Senior A, Lu M (2012) Robust and efficient foreground analysis in complex surveillance videos. Mach Vis Appl 23:967–983

  196. Toyama K, Krumm J, Brumitt B, Meyers B (1999) Principles and practice of background maintenance. Proceedings of the Seventh IEEE International Conference on Computer Vision

  197. Tuncer T, Dogan S, Ertam F (2019) A novel neural network based image descriptor for texture classification. Physica A: Statistical Mechanics and its Applications 526:1–10

  198. Turaga P, Chellappa R, Subrahmanian VS, Udrea O (2008) Machine recognition of human activities: a survey. IEEE Trans Circuits Syst Video Technol 18(11):1473–1488

  199. Turaga P, Chellappa R, Subrahmanian VS, Udrea O (2008) Machine recognition of human activities: a survey. IEEE Trans Circuits Syst Video Technol 18(11):1473–1488

  200. Tuytelaars T, Mikolajczyk K (2008) Local invariant feature detectors: a survey. Found Trends Comput Graph Vis 3(3):177–280

  201. Vacavant A, Chateau Th, Wilhelm A, Lequievre L (2013) A benchmark dataset for outdoor foreground/background extraction. ACCV 2012 Workshops, Part I, LNCS, pp 291–300

  202. Vafadar M, Behrad A (2014) A vision based system for communicating in virtual reality environments by recognizing human hand gestures. Multimed Tools Appl 74:7515–7535

  203. Varkey JP, Pompili D, Walls TA (2011) Human motion recognition using a wireless sensor-based wearable system. Pers Ubiquit Comput 16:897–910

  204. Vishwakarma S, Agrawal A (2012) A survey on activity recognition and behavior understanding in video surveillance. Vis Comput 29:983–1009

  205. Vishwakarma DK, Dhiman C (2018) A unified model for human activity recognition using spatial distribution of gradients and difference of Gaussian kernel. Vis Comput 35:1595–1613

  206. Vosters LPJ, Shan C, Gritti T (2010) Background subtraction under sudden illumination changes. Seventh IEEE international conference on advanced video and signal based surveillance, pp 384–391

  207. Vosters L, Shan C, Gritti T (2012) Real-time robust background subtraction under rapidly changing illumination conditions. Image Vis Comput 30:1004–1015

  208. Wang Z (2012) Manifold adaptive kernel local fisher discriminant analysis for face recognition. J Multimed 7(6):387–433

  209. Wang L, Hu W, Tan T (2003) Recent developments in human motion analysis. Pattern Recogn 36:585–601

  210. Wang H, Ullah MM, Klaser A, Laptev I, Schmid C (2009) Evaluation of local spatio-temporal features for action recognition. Proceedings of the British machine vision conference, pp 1–11

  211. Wang JG, Li J, Yau WY, Sung E (2010) Boosting dense SIFT descriptors and shape contexts of face images for gender recognition. 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition – Workshops, pp 96–102

  212. Wang H, Oneata D, Verbeek J, Schmid C (2015) A robust and efficient video representation for action recognition. Int J Comput Vis 119:219–238

  213. Wang A, Chen G, Wu X, Liu L, An N, Chang CY (2018) Towards human activity recognition: a hierarchical feature selection framework. Sensors 18(11):3629

  214. Wang P, Li W, Ogunbona P, Wan J, Escalera S (2018) RGB-D-based human motion recognition with deep learning: a survey. Comput Vis Image Und 171:118–139

  215. Wang J, Chen Y, Hao S, Peng X, Hu L (2019) Deep learning for sensor-based activity recognition: a survey. Pattern Recogn Lett 119:3–11

  216. Wei H, Peng Q (2018) A block-wise frame difference method for real-time video detection. International Journal of Advanced Robotic Systems, pp 1–13

  217. Wei H, Peng Q (2018) A block-wise frame difference method for real-time video motion detection. Int J Adv Robot Syst 15(4):172988141878363

  218. Wei W, Yunxiao A (2009) Vision-based human motion recognition: a survey. Second international conference on intelligent networks and intelligent systems, pp 386–389

  219. Weinland D, Ronfard R, Boyer E (2011) A survey of vision-based methods for action representation, segmentation and recognition. Comput Vis Image Underst 115(2):224–241

  220. White B, Shah M (2007) Automatically tuning background subtraction parameters using particle swarm optimization. 2007 IEEE international conference on multimedia and expo

  221. Wren CR, Azarbayejani A, Darrell T, Pentland AP (1997) Pfinder: real-time tracking of the human body. IEEE Trans Pattern Anal Mach Intell 19(7):780–785

  222. Xiao JZ, Wang HR, Yang XC, Gao Z (2011) Multiple faults diagnosis in motion system based on SVM. Int J Mach Learn Cybern 3:77–82

  223. Xu W, Miao Z, Zhang Q (2014) Projection transform on spatio-temporal context for action recognition. Multimed Tools Appl 74:7711–7728

  224. Xu Y, Ji H, Zhang W (2019) Coarse-to-fine sample-based background subtraction for moving object detection. Optik 207:1–44

  225. Xue-mei X, Li-chao Z, Qin M, Qiao-yun G (2015) Vehicle detection algorithm based on codebook and local binary patterns algorithms. J Cent South Univ 22:593–600

  226. Yadav Y, Walavalkar R, Sunchak S, Yedurkar A, Gharat S (2017) Comparison of processing time of different size of images and video resolutions for object detection using fuzzy inference system. Int J Sci Technol Res 6(1):191–195

  227. Yao G, Lei T, Zhong J, Jiang P, Jia W (2017) Comparative evaluation of background subtraction algorithms in remote scene videos captured by MWIR sensors. Sensors 17(9)

  228. Yi Y, Wang H, Zhang B (2017) Learning correlations for human action recognition in videos. Multimed Tools Appl 76:18891–18913

  229. Yilmaz A, Javed O, Shah M (2006) Object tracking: a survey. ACM Comput Surv 38(4):1–45

  230. Yu D, Deng L (2014) Automatic speech recognition: a deep learning approach. Springer

  231. Yuan J, Liu Z, Wu Y (2009) Discriminative subvolume search for efficient action detection. In: IEEE CVPR, pp 2442–2449

  232. Zacharatos H, Gatzoulis C, Chrysanthou YL (2014) Automatic emotion recognition based on body movement analysis: a survey. IEEE Comput Graph Appl 34(6):35–45

  233. Zhang Ch, Tabkhi H, Schirner G (2014) A GPU-based algorithm-specific optimization for high-performance background subtraction. 43rd international conference on parallel processing, pp 182–191

  234. Zhang L, Lu J, Wang J, Wu Y, Jiang Z (2016) The improved Harris operator based on steerable filter. In ISME 2016 - information science and management engineering IV 1: 305-311

  235. Zhang HB, Zhang YX, Zhong B, Lei Q, Yang L, Du JX, Chen DS (2019) A comprehensive survey of vision-based human. Sensors 19(5):1–20

  236. Zhang HB, Zhang YX, Zhong B, Lei Q, Yang L, Du JX, Chen DS (2019) A comprehensive survey of vision-based human action recognition methods. Sensors 19(5):1–20

  237. Zhang T, Lin H, Ju Z, Yang C (2020) Hand gesture recognition in complex background based on convolutional pose machine and fuzzy Gaussian mixture models. Int J Fuzzy Syst 22:1330–1341

  238. Zhao X, Zhang Sh (2012) Facial expression recognition using local binary patterns and discriminant kernel locally linear embedding 20: 1–9

  239. Zhao L, Wang Z, Zhang G, Qi Y, Wang X (2017) Eye state recognition based on deep integrated neural network and transfer learning. Multimed Tools Appl 77:19415–19438

  240. Zheng W, Wang K, Wang FY (2019) A novel background subtraction algorithm based on parallel vision and Bayesian GANs. Neurocomputing 394:178–200

  241. Zhou X (2020) Wearable health monitoring system based on human motion state recognition. Comput Commun 150:62–71

  242. Zivkovic Z (2004) Improved adaptive Gaussian mixture model for background subtraction. Proceedings of the 17th international conference on pattern recognition (ICPR’04)

Author information

Corresponding author

Correspondence to Mohammad Reza Keyvanpour.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Keyvanpour, M.R., Vahidian, S. & Ramezani, M. HMR-vid: a comparative analytical survey on human motion recognition in video data. Multimed Tools Appl 79, 31819–31863 (2020). https://doi.org/10.1007/s11042-020-09485-2

  • DOI: https://doi.org/10.1007/s11042-020-09485-2
