A novel model for fall detection and action recognition combined lightweight 3D-CNN and convolutional LSTM networks

Su, Chan; Wei, Jianguo; Lin, Deyu; Kong, Linghe; Guan, Yong Liang

doi:10.1007/s10044-024-01224-9

Theoretical Advances
Published: 15 February 2024

Volume 27, article number 3, (2024)

Pattern Analysis and Applications

Chan Su¹,
Jianguo Wei¹,
Deyu Lin¹,²,³ (ORCID: orcid.org/0000-0003-1400-4769),
Linghe Kong² &
Yong Liang Guan³


Abstract

Three-dimensional convolutional neural networks (3D-CNNs) and fully connected long short-term memory networks (FC-LSTMs) have proven to be powerful non-intrusive approaches to fall detection. However, feature extraction with 3D-CNN-based methods requires a large-scale dataset. Meanwhile, FC-LSTM flattens its input into a one-dimensional vector, which leads to the loss of spatial information. To address these issues, a novel model combining a lightweight 3D-CNN with convolutional long short-term memory (ConvLSTM) networks is proposed in this paper. In this model, a lightweight 3D convolutional neural network with five layers is presented to avoid over-fitting. To further explore discriminative features, channel-wise and spatial-wise attention modules are adopted in each layer to improve detection performance. In addition, a ConvLSTM is used to extract long-term spatial–temporal features from 3D tensors. Finally, we verify our model through extensive experiments and comprehensive comparisons on HMDB5, UCF11, URFD, and MCFD. Experimental results on these public benchmarks demonstrate that our method outperforms current state-of-the-art single-stream networks with 62.55 ± 7.99% on HMDB5, 97.28 ± 0.36% on UCF11, 98.06 ± 0.32% on URFD, and 94.84 ± 4.64% on MCFD.
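
The contrast between FC-LSTM and ConvLSTM drawn in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no layer sizes or code, so every name here (`conv2d_same`, `convlstm_step`, `run_demo`) and every dimension is hypothetical. The sketch also assumes a simplified ConvLSTM cell without peephole terms, written in plain Python on nested lists for readability rather than speed. It shows only that the gates are computed by convolutions, so the hidden state keeps its channel, height and width layout instead of being flattened into a vector.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def zeros(channels, height, width):
    return [[[0.0] * width for _ in range(height)] for _ in range(channels)]


def conv2d_same(x, w):
    """Stride-1, zero-padded cross-correlation.

    x: [C_in][H][W]; w: [C_out][C_in][k][k] with odd k. Returns [C_out][H][W].
    """
    c_in, height, width = len(x), len(x[0]), len(x[0][0])
    k = len(w[0][0])
    pad = k // 2
    out = zeros(len(w), height, width)
    for o, kernel in enumerate(w):
        for r in range(height):
            for s in range(width):
                acc = 0.0
                for ch in range(c_in):
                    for i in range(k):
                        for j in range(k):
                            rr, ss = r + i - pad, s + j - pad
                            # Positions outside the map count as zero padding.
                            if 0 <= rr < height and 0 <= ss < width:
                                acc += kernel[ch][i][j] * x[ch][rr][ss]
                out[o][r][s] = acc
    return out


def convlstm_step(x, h, c, w_x, w_h, bias):
    """One ConvLSTM step (assumption: no peephole terms).

    Gate channels are stacked in the order input, forget, output, candidate.
    """
    hid, height, width = len(h), len(h[0]), len(h[0][0])
    gx, gh = conv2d_same(x, w_x), conv2d_same(h, w_h)
    h_next, c_next = zeros(hid, height, width), zeros(hid, height, width)
    for ch in range(hid):
        for r in range(height):
            for s in range(width):
                pre = []
                for g in range(4):
                    idx = g * hid + ch
                    pre.append(gx[idx][r][s] + gh[idx][r][s] + bias[idx])
                i_g, f_g, o_g = sigmoid(pre[0]), sigmoid(pre[1]), sigmoid(pre[2])
                cand = math.tanh(pre[3])
                c_next[ch][r][s] = f_g * c[ch][r][s] + i_g * cand
                h_next[ch][r][s] = o_g * math.tanh(c_next[ch][r][s])
    return h_next, c_next


def random_kernel(rng, c_out, c_in, k):
    return [[[[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(k)]
             for _ in range(c_in)] for _ in range(c_out)]


def run_demo(steps=3, c_in=2, hid=3, height=4, width=5, k=3, seed=0):
    """Feed a short random sequence of feature maps through the cell."""
    rng = random.Random(seed)
    w_x = random_kernel(rng, 4 * hid, c_in, k)
    w_h = random_kernel(rng, 4 * hid, hid, k)
    bias = [0.0] * (4 * hid)
    h, c = zeros(hid, height, width), zeros(hid, height, width)
    for _ in range(steps):
        x = [[[rng.random() for _ in range(width)] for _ in range(height)]
             for _ in range(c_in)]
        h, c = convlstm_step(x, h, c, w_x, w_h, bias)
    return h


h_final = run_demo()
# The hidden state is still a stack of 2D maps: channels, height, width.
print(len(h_final), len(h_final[0]), len(h_final[0][0]))
```

An FC-LSTM applied to the same input would first have to flatten each 2 × 4 × 5 feature map into a 40-element vector, at which point neighbouring pixels are no longer distinguished from distant ones. In practice a framework layer would replace these loops.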

Data availability

Some or all data, models, or code generated or used during the study are available from the corresponding author by request.

Acknowledgements

This paper is supported in part by the National Natural Science Foundation of China (61962019) and in part by Natural Science Foundation of Jiangxi Province (20224BAB212016), China Scholarship Council (No. 202106825021), and Natural Science Foundation of Shaanxi Province (2020NY-175).

Author information

Authors and Affiliations

School of Software, Nanchang University, Nanchang, China
Chan Su, Jianguo Wei & Deyu Lin
School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China
Deyu Lin & Linghe Kong
School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore
Deyu Lin & Yong Liang Guan

Corresponding author

Correspondence to Deyu Lin.

Ethics declarations

Conflict of interest

All authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Su, C., Wei, J., Lin, D. et al. A novel model for fall detection and action recognition combined lightweight 3D-CNN and convolutional LSTM networks. Pattern Anal Applic 27, 3 (2024). https://doi.org/10.1007/s10044-024-01224-9

Received: 20 February 2023
Accepted: 31 December 2023
Published: 15 February 2024
DOI: https://doi.org/10.1007/s10044-024-01224-9
