learning anomalous human actions using frames of interest and decoderless deep embedded clustering

Javed, Muhammad Hafeez; Yu, Zeng; Li, Tianrui; Anwar, Noreen; Rajeh, Taha M.

doi:10.1007/s13042-023-01851-4

Original Article
Published: 19 May 2023

Volume 14, pages 3575–3589, (2023)

International Journal of Machine Learning and Cybernetics

Muhammad Hafeez Javed¹,
Zeng Yu¹,
Tianrui Li ORCID: orcid.org/0000-0001-7780-104X¹,
Noreen Anwar² &
Taha M. Rajeh¹

Abstract

Inconsistent data and unclear labels make it difficult to learn anomalous behavior from video, and methods based on deep clustering have therefore become popular in this area. A deep clustering strategy usually relies on encoding and reconstruction to facilitate information discovery. However, reconstructing the input serves little purpose once the model’s learning process has concluded. At the same time, different input types carry different features, which may help identify anomalies more accurately. To exploit such assorted features within clustering, we propose the Skeletal Based Autoencoder (SKELBA), which processes the different types of input in parallel. The model includes a spatial graph convolution operator, which convolves the skeletal data more precisely. A decoder-less deep clustering architecture is introduced to enhance the stability of clustering. The relation between reconstruction error and the lower bound of mutual information (MI) motivates the move to decoder-free systems. Combining local–global feature collection with decoder-free encoders yields improved results. Extensive experiments on various benchmark datasets show that the proposed model outperforms recently proposed approaches in the same field.
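The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the SKELBA implementation: it assumes a standard symmetrically normalized graph convolution with self-loops for the skeletal data, and the commonly used Student's t soft assignment with a sharpened target distribution for decoder-free clustering, where the loss is a KL divergence and no reconstruction term appears. All function names, the toy 3-joint skeleton, and the toy embeddings are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def spatial_graph_conv(features, adj, weight):
    """One layer ReLU(A_norm X W); features has one row per joint."""
    mixed = matmul(matmul(normalize_adjacency(adj), features), weight)
    return [[max(0.0, v) for v in row] for row in mixed]


def soft_assign(embeddings, centroids, alpha=1.0):
    """Student's t soft assignment q_ij of embedding i to centroid j."""
    q = []
    for z in embeddings:
        row = []
        for mu in centroids:
            d2 = sum((a - b) ** 2 for a, b in zip(z, mu))
            row.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])
    return q


def target_distribution(q):
    """Sharpened targets: p_ij proportional to q_ij^2 / sum_i q_ij."""
    freq = [sum(col) for col in zip(*q)]
    p = []
    for row in q:
        w = [v * v / f for v, f in zip(row, freq)]
        total = sum(w)
        p.append([v / total for v in w])
    return p


def kl_clustering_loss(p, q):
    """KL(P || Q) summed over samples; no decoder or reconstruction term."""
    return sum(pv * math.log(pv / qv)
               for pr, qr in zip(p, q) for pv, qv in zip(pr, qr))


# Toy skeleton (assumption): 3 joints in a chain 0-1-2, 2-D coordinates.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
joints = [[0.0, 1.0], [0.0, 0.5], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
hidden = spatial_graph_conv(joints, adj, identity)

# Toy embeddings (assumption): two samples near each of two centroids.
embeddings = [[0.0, 0.1], [0.1, 0.0], [2.0, 2.1], [2.1, 2.0]]
centroids = [[0.0, 0.0], [2.0, 2.0]]
q = soft_assign(embeddings, centroids)
p = target_distribution(q)
loss = kl_clustering_loss(p, q)
```

In a trained model the embeddings would come from the encoder and the centroids would be updated by gradient descent on the KL loss; here both are fixed by hand only to show the shape of the computation.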

Data availability

All datasets used in this research are publicly available, and any other information or data will be made available on request.

Acknowledgements

This research was supported by the National Science Foundation of China (Nos. 62176221, 62276215, 62276216).

Author information

Authors and Affiliations

School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, 611756, China
Muhammad Hafeez Javed, Zeng Yu, Tianrui Li & Taha M. Rajeh
The State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Noreen Anwar

Corresponding author

Correspondence to Tianrui Li.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Javed, M.H., Yu, Z., Li, T. et al. learning anomalous human actions using frames of interest and decoderless deep embedded clustering. Int. J. Mach. Learn. & Cyber. 14, 3575–3589 (2023). https://doi.org/10.1007/s13042-023-01851-4

Received: 22 January 2022
Accepted: 25 April 2023
Published: 19 May 2023
Issue Date: October 2023
DOI: https://doi.org/10.1007/s13042-023-01851-4
