Temporal video segmentation by event detection: A novelty detection approach

  • Representation, Processing, Analysis and Understanding of Images
Pattern Recognition and Image Analysis

Abstract

Temporal segmentation of videos into meaningful image sequences, each containing a particular activity, is an interesting problem in computer vision. We present a novel algorithm for this semantic video segmentation. Segmentation is accomplished through event detection in a frame-by-frame processing setup. We propose using one-class classification (OCC) techniques to detect the events that indicate a new segment, since they have proven successful in object classification and allow for unsupervised event detection in a natural way. We test and compare various OCC schemes and additionally present an approach based on temporal self-similarity maps (TSSMs). The evaluation was carried out on a challenging, publicly available thermal video dataset. The results are promising and show the suitability of our approaches for temporal video segmentation.
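
To make the frame-by-frame idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes each frame has already been reduced to a numeric feature vector, and it substitutes a simple centroid-and-radius model for the OCC schemes evaluated in the article. The names `self_similarity_map`, `segment_by_novelty`, `warmup`, and `slack` are hypothetical and chosen only for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def self_similarity_map(features):
    """Pairwise frame distances: entry [i][j] compares frame i with frame j."""
    return [[euclidean(fi, fj) for fj in features] for fi in features]


def segment_by_novelty(features, warmup=3, slack=2.0):
    """Return the indices of frames that start a new segment.

    Assumption: a centroid model of the current segment stands in for a
    one-class classifier. A frame counts as novel when its distance to the
    centroid exceeds `slack` times the largest distance observed among the
    frames already assigned to the current segment.
    """
    boundaries = [0]
    current = []
    for t, frame in enumerate(features):
        if len(current) >= warmup:
            dim = len(frame)
            centroid = [sum(v[d] for v in current) / len(current)
                        for d in range(dim)]
            radius = max(euclidean(v, centroid) for v in current)
            # Guard against a zero radius when all frames so far are identical.
            if euclidean(frame, centroid) > slack * max(radius, 1e-9):
                boundaries.append(t)
                current = []  # start modelling the new segment from scratch
        current.append(frame)
    return boundaries


# Toy input: two clusters of 2-D "frame features" with a jump at index 4.
frames = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
          [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1]]
boundaries = segment_by_novelty(frames)
ssm = self_similarity_map(frames)
```

On this toy input the jump between the two clusters is reported as a segment boundary, and the self-similarity map shows two blocks of small distances along its diagonal. A kernel-based one-class model could replace the centroid test without changing the surrounding loop.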

Author information

Corresponding author

Correspondence to Mahesh Venkata Krishna.

Additional information

The article is published in the original.

This article uses the materials of the report submitted at the 4th International Workshop “Image Mining. Theory and Applications”, Barcelona, Spain, February 2013.

Mahesh Venkata Krishna, born in 1984, received a Bachelor's degree in Telecommunications Engineering in 2006 from the Visvesvaraya Technological University, India, and an MSc degree in Communication Engineering from RWTH Aachen in 2011. He currently holds a scholarship from the Graduate Academy for Image Processing of the Free State of Thuringia, Germany, funded by Carl Zeiss AG. He is a member of the Computer Vision Group of Joachim Denzler at the Friedrich Schiller University, Jena. His research interests include video analysis and event rules of a scene based on visual data.

Paul Bodesheim. Born in 1987, received the Diploma degree in Computer Science (“Diplom-Informatiker”) in 2011 from the Friedrich Schiller University Jena, Germany. He is currently a holder of a scholarship from the Graduate Academy of the University Jena partially funded by the Free State of Thuringia, Germany (“Landesgraduiertenstipendium”) and a PhD student in the Computer Vision Group of Joachim Denzler at the University Jena. His research interests are in the field of computer vision and machine learning, especially one-class classification and novelty detection as well as incremental, large-scale, and life-long learning for visual object category recognition.

Marco Körner. Born in 1984, received the Diploma degree in Computer Science (“Diplom-Informatiker”) in 2008 from the Friedrich Schiller University Jena, Germany. He is currently a PhD student at the Computer Vision Group of Joachim Denzler at the University Jena. His research interests are in the field of 3D computer vision and machine learning, especially action recognition in multi-sensor environments.

Joachim Denzler. Earned the degrees “Diplom-Informatiker,” “Dr.-Ing.,” and “Habilitation” from the University of Erlangen in 1992, 1997, and 2003, respectively. He is currently a full professor of computer science and head of the Chair for Computer Vision, Faculty of Mathematics and Informatics, Friedrich-Schiller-University of Jena. His research interests comprise active computer vision, object recognition and tracking, 3D reconstruction, and plenoptic modeling, as well as computer vision for autonomous systems. He is author and coauthor of over 200 journal and conference papers as well as technical articles. He is a member of the IEEE, IEEE Computer Society, DAGM, and GI.

About this article

Cite this article

Krishna, M.V., Bodesheim, P., Körner, M. et al. Temporal video segmentation by event detection: A novelty detection approach. Pattern Recognit. Image Anal. 24, 243–255 (2014). https://doi.org/10.1134/S1054661814020114

  • DOI: https://doi.org/10.1134/S1054661814020114
