Abstract
Robust and accurate visual tracking is challenging because targets undergo significant appearance changes caused by scale variation, occlusion and fast motion. We propose a novel tracking framework called the scalable spatiotemporal visual tracking algorithm (SSVT). First, we construct the Direction Prediction Model (DPM) to predict the spatiotemporal correlation of the target in the next frame, which efficiently narrows the search area and improves the accuracy of spatial localization. Then, we present the Occlusion Detection Algorithm (ODA), which uses the estimated direction and a Kalman filter to prevent erroneous updates stemming from the region of interest (ROI). Finally, we present the multi-scale pyramid kernelized correlation filter (MSPKCF), which adaptively adjusts to the varying scale of the target and to the ROI size during tracking. Extensive experiments on the OTB100 and VOT2016 datasets demonstrate that our tracker performs favorably against state-of-the-art trackers, effectively reducing computational redundancy and improving tracking accuracy.
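To illustrate how a Kalman filter can help place a search ROI from an estimated motion direction, here is a minimal sketch. It is not the SSVT implementation: the constant-velocity state model, the class `ConstantVelocityKalman`, the helper `roi_from_prediction`, and all noise and padding values are assumptions chosen for illustration.

```python
import numpy as np


class ConstantVelocityKalman:
    """Hypothetical 2-D constant-velocity Kalman filter; state is [x, y, vx, vy]."""

    def __init__(self, x, y, process_var=1e-2, meas_var=1.0):
        self.state = np.array([x, y, 0.0, 0.0], dtype=float)
        self.P = np.eye(4) * 10.0  # initial state covariance (assumed value)
        # Transition: position advances by velocity once per frame.
        self.F = np.array([[1, 0, 1, 0],
                           [0, 1, 0, 1],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        # Measurement: only the target center (x, y) is observed.
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)
        self.Q = np.eye(4) * process_var  # process noise (assumed value)
        self.R = np.eye(2) * meas_var     # measurement noise (assumed value)

    def predict(self):
        """Advance the state by one frame and return the predicted center."""
        self.state = self.F @ self.state
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.state[:2].copy()

    def update(self, zx, zy):
        """Correct the state with a measured target center."""
        z = np.array([zx, zy], dtype=float)
        innovation = z - self.H @ self.state
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.state = self.state + K @ innovation
        self.P = (np.eye(4) - K @ self.H) @ self.P


def roi_from_prediction(center, target_size, padding=2.0):
    """Return (left, top, width, height) of a padded ROI around a predicted center."""
    cx, cy = center
    w, h = target_size
    roi_w, roi_h = w * padding, h * padding
    return (cx - roi_w / 2, cy - roi_h / 2, roi_w, roi_h)


# Usage: track a target moving 2 px right and 1 px down per frame.
kf = ConstantVelocityKalman(0.0, 0.0)
for t in range(1, 21):
    kf.predict()
    kf.update(2.0 * t, 1.0 * t)
predicted_center = kf.predict()
roi = roi_from_prediction(predicted_center, target_size=(40, 20))
```

In a tracker of this kind, a large gap between the Kalman prediction and the correlation-filter response could serve as one cue that the target is occluded, in which case the model update would be skipped; the exact criterion used by ODA is not given in the abstract.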
Acknowledgments
The work presented in this paper was supported by the Beijing Natural Science Foundation of China (Grant No. L182033) and the Fund for Beijing University of Posts and Telecommunications (2019PTB-001).
Cite this article
Ming, Y., Zhang, Y. Efficient scalable spatiotemporal visual tracking based on recurrent neural networks. Multimed Tools Appl 79, 2239–2261 (2020). https://doi.org/10.1007/s11042-019-08331-4
DOI: https://doi.org/10.1007/s11042-019-08331-4