Abstract
Video event recognition plays an important role in several research fields, particularly in surveillance systems. Existing work performs it with a deep hierarchical context model, which uses contextual information at the feature, semantic, and priority levels. However, this method may perform poorly as the volume of video grows and may fail to predict events accurately when the contextual features are only weakly interrelated. These outstanding challenges are addressed here with an improved hybridized deep structured model. The three primary types of contextual data are used to discriminate the resulting neighborhood. A hybrid textual perceptual descriptor and concept-based attribute extraction are used for accurate recognition of video events. The extracted interaction context features are grouped using an improved K-means algorithm. In addition, an improved deep structured model that combines convolutional neural networks and conditional random fields is developed to learn middle-level representations and to combine bottom-level features, mid-level semantics, and top-level meanings for event identification. The proposed method is evaluated on the VIRAT data set, with the simulation analysis performed in MATLAB. The overall evaluation indicates that the proposed method provides better event recognition accuracy.
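The grouping step mentioned in the abstract can be illustrated with a minimal sketch. It uses standard Lloyd's K-means, not the authors' improved variant, which the abstract does not specify. The function name `kmeans`, its parameters, and the toy two-dimensional feature vectors are assumptions made purely for illustration.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Standard Lloyd's k-means on a list of equal-length feature vectors.

    Illustrative only: the 'improved' variant named in the abstract is not
    described there, so this is the textbook algorithm.
    """
    rng = random.Random(seed)
    # Initialise centroids from k distinct input vectors.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: attach each vector to its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        # Stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids

# Hypothetical toy "context feature" vectors forming two obvious groups.
features = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centroids = kmeans(features, k=2)
print(labels, centroids)
```

In a real pipeline the input would be the extracted interaction context features rather than toy points, and the number of clusters would have to be chosen for the data.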
Change history
06 June 2022
This article has been retracted. Please see the Retraction Notice for more detail: https://doi.org/10.1007/s12652-022-04072-9
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Kavitha, R., Chitra, D. RETRACTED ARTICLE: An improved hybridized deep structured model for accurate video event recognition. J Ambient Intell Human Comput 12, 6019–6028 (2021). https://doi.org/10.1007/s12652-020-02157-x
DOI: https://doi.org/10.1007/s12652-020-02157-x