Abstract
We describe a novel video object segmentation system based on a conditional random field model with a high-order term that captures longer-range spatial and temporal grouping information. The system segments different moving objects effectively from complex backgrounds by integrating the complementary properties of point trajectories and region trajectories. Although both kinds of trajectory have already been used in video object segmentation, their complementary properties have not been well investigated. In this paper, we propose a scheme that transfers the labels of sparse point trajectories to region trajectories. In particular, for region trajectories with little texture, the scheme automatically predicts label probabilities using a Gaussian mixture model of appearance and motion, given the labels of the point trajectories. We also design a reliability measure for region trajectories based on shape consistency, which helps us build robust high-order potentials for spatially overlapping region trajectories. Our region trajectories are extracted from a hierarchical image over-segmentation, and hence they can capture meaningful regions over time. Additionally, our approach is a streaming process in which object labels are propagated through the video. We validate our approach on challenging public datasets and show that it outperforms competing methods.
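The label-transfer idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each label is modelled by a single diagonal-covariance Gaussian (so the mixture has one component per label), and the feature layout (one appearance value followed by two motion values), the function names `fit_label_gaussians` and `label_probabilities`, and the variance floor `min_var` are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def fit_label_gaussians(features, labels, min_var=1e-3):
    """Fit one diagonal Gaussian per label from labelled point-trajectory features.

    Assumption: one mixture component per label; returns {label: (weight, mean, var)}.
    """
    groups = defaultdict(list)
    for feat, label in zip(features, labels):
        groups[label].append(feat)
    total = len(features)
    model = {}
    for label, rows in groups.items():
        n, dim = len(rows), len(rows[0])
        mean = [sum(r[d] for r in rows) / n for d in range(dim)]
        # Floor the variance so a tight cluster does not produce a degenerate Gaussian.
        var = [max(sum((r[d] - mean[d]) ** 2 for r in rows) / n, min_var)
               for d in range(dim)]
        model[label] = (n / total, mean, var)
    return model


def label_probabilities(model, feature):
    """Posterior label probabilities for one region-trajectory feature vector."""
    log_scores = {}
    for label, (weight, mean, var) in model.items():
        score = math.log(weight)
        for x, m, v in zip(feature, mean, var):
            score += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        log_scores[label] = score
    top = max(log_scores.values())  # subtract the max for numerical stability
    unnorm = {label: math.exp(s - top) for label, s in log_scores.items()}
    z = sum(unnorm.values())
    return {label: u / z for label, u in unnorm.items()}


# Hypothetical features: [mean intensity, flow u, flow v]; label 1 = moving object.
point_features = [[0.20, 1.0, 0.00], [0.25, 1.1, 0.10],
                  [0.80, 0.0, 0.00], [0.75, -0.1, 0.05]]
point_labels = [1, 1, 0, 0]
model = fit_label_gaussians(point_features, point_labels)
probs = label_probabilities(model, [0.22, 0.9, 0.05])
```

Here a textureless region that carries no point trajectories of its own still receives a soft label, because its appearance and motion are compared against the statistics of the labelled points. A fuller treatment would presumably use several components per label and full covariances.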
Acknowledgments
This work was supported in part by the National Basic Research Program of China under Grant No. 2012CB316402 and the National Natural Science Foundation of China under Grant No. 91120006 and No. 61273252.
Cite this article
Zhang, G., Yuan, Z., Liu, Y. et al. Video object segmentation by integrating trajectories from points and regions. Multimed Tools Appl 74, 9665–9696 (2015). https://doi.org/10.1007/s11042-014-2145-5
DOI: https://doi.org/10.1007/s11042-014-2145-5