A novel model for fall detection and action recognition combined lightweight 3D-CNN and convolutional LSTM networks

  • Theoretical Advances
Pattern Analysis and Applications

Abstract

Three-dimensional convolutional neural networks (3D-CNNs) and fully connected long short-term memory networks (FC-LSTMs) have proven to be powerful non-intrusive approaches to fall detection. However, feature extraction with 3D-CNN-based methods requires a large-scale dataset. Meanwhile, FC-LSTM flattens its input into a one-dimensional vector, which discards spatial information. To address both issues, this paper proposes a novel model that combines a lightweight 3D-CNN with a convolutional long short-term memory (ConvLSTM) network. The lightweight 3D-CNN has only five layers, which helps avoid over-fitting. To extract more discriminative features, channel- and spatial-wise attention modules are adopted in each layer to improve detection performance. In addition, the ConvLSTM extracts long-term spatial–temporal features from the 3D tensors. Finally, we evaluate the model through extensive experiments and comprehensive comparisons on HMDB5, UCF11, URFD, and MCFD. Results on these public benchmarks show that our method outperforms current state-of-the-art single-stream networks, achieving 62.55 ± 7.99% on HMDB5, 97.28 ± 0.36% on UCF11, 98.06 ± 0.32% on URFD, and 94.84 ± 4.64% on MCFD.
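
The contrast the abstract draws between FC-LSTM and ConvLSTM can be illustrated with a small sketch. This is not the authors' implementation: it is a minimal single-channel ConvLSTM step in plain Python, without peephole terms, and every name in it (`conv2d_same`, `convlstm_step`, `run_sequence`, the averaging kernels, and the toy 4 × 4 frames) is a hypothetical chosen for illustration. It shows only that a ConvLSTM hidden state keeps the H × W layout of the input, whereas an FC-LSTM would first flatten each frame into a vector.

```python
import math

def conv2d_same(x, k):
    """Zero-padded 'same' 2D cross-correlation of grid x with an odd square kernel k."""
    h, w = len(x), len(x[0])
    r = len(k) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        out[i][j] += x[ii][jj] * k[di + r][dj + r]
    return out

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

def convlstm_step(x, h_prev, c_prev, kernels, biases):
    """One ConvLSTM step on single-channel H x W grids (no peephole terms).

    kernels[gate] is a pair (kernel applied to x, kernel applied to h_prev)
    for the gates 'i' (input), 'f' (forget), 'o' (output), 'g' (candidate).
    """
    rows, cols = len(x), len(x[0])
    pre = {}
    for gate in ("i", "f", "o", "g"):
        kx, kh = kernels[gate]
        ax, ah = conv2d_same(x, kx), conv2d_same(h_prev, kh)
        pre[gate] = [[ax[r][c] + ah[r][c] + biases[gate] for c in range(cols)]
                     for r in range(rows)]
    h_new, c_new = zeros(rows, cols), zeros(rows, cols)
    for r in range(rows):
        for c in range(cols):
            i = sigmoid(pre["i"][r][c])
            f = sigmoid(pre["f"][r][c])
            o = sigmoid(pre["o"][r][c])
            g = math.tanh(pre["g"][r][c])
            c_new[r][c] = f * c_prev[r][c] + i * g       # cell state stays a grid
            h_new[r][c] = o * math.tanh(c_new[r][c])     # hidden state stays a grid
    return h_new, c_new

def run_sequence(frames, kernels, biases):
    """Feed a list of H x W frames through the cell and return the final hidden grid."""
    rows, cols = len(frames[0]), len(frames[0][0])
    h, c = zeros(rows, cols), zeros(rows, cols)
    for frame in frames:
        h, c = convlstm_step(frame, h, c, kernels, biases)
    return h

# Hypothetical toy parameters: 3x3 averaging kernels and zero biases for every gate.
GATES = ("i", "f", "o", "g")
AVG = [[1.0 / 9.0] * 3 for _ in range(3)]
KERNELS = {gate: (AVG, AVG) for gate in GATES}
BIASES = {gate: 0.0 for gate in GATES}

# Toy "video": one bright pixel moving down column 1 over four 4x4 frames.
FRAMES = []
for t in range(4):
    frame = zeros(4, 4)
    frame[t][1] = 1.0
    FRAMES.append(frame)

H_FINAL = run_sequence(FRAMES, KERNELS, BIASES)   # 4x4 grid, layout preserved
FLAT = [v for row in FRAMES[-1] for v in row]     # what an FC-LSTM would see: 16 values
```

In this toy run the final hidden state is still a 4 × 4 grid, and its response is stronger near the pixel's last position than in the far corner, so location survives across time steps. The flattened vector `FLAT` carries the same values, but a fully connected layer applied to it has no built-in notion of which entries were neighbours in the frame.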

Data availability

Some or all data, models, or code generated or used during the study are available from the corresponding author by request.

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (61962019), the Natural Science Foundation of Jiangxi Province (20224BAB212016), the China Scholarship Council (No. 202106825021), and the Natural Science Foundation of Shaanxi Province (2020NY-175).

Author information

Corresponding author

Correspondence to Deyu Lin.

Ethics declarations

Conflict of interest

All authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Su, C., Wei, J., Lin, D. et al. A novel model for fall detection and action recognition combined lightweight 3D-CNN and convolutional LSTM networks. Pattern Anal Applic 27, 3 (2024). https://doi.org/10.1007/s10044-024-01224-9

  • DOI: https://doi.org/10.1007/s10044-024-01224-9
