Ordering of Visual Descriptors in a Classifier Cascade Towards Improved Video Concept Detection

  • Conference paper
MultiMedia Modeling (MMM 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9516)

Abstract

Concept detection for semantic annotation of video fragments (e.g. keyframes) is a popular and challenging problem. A variety of visual features is typically extracted and combined in order to learn the relation between feature-based keyframe representations and semantic concepts. In recent years the available pool of features has increased rapidly, and features based on deep convolutional neural networks in combination with other visual descriptors have significantly contributed to improved concept detection accuracy. This work proposes an algorithm that dynamically selects, orders and combines many base classifiers, trained independently with different feature-based keyframe representations, in a cascade architecture for video concept detection. The proposed cascade is more accurate and computationally more efficient, in terms of classifier evaluations, than state-of-the-art classifier combination approaches.
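The abstract does not spell out the selection and ordering algorithm, so the following is only a generic sketch of how a classifier cascade can save classifier evaluations through early rejection. It is not the authors' method. The function name `cascade_score`, the per-stage thresholds, and the running-mean fusion rule are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# A base classifier maps a keyframe representation to a confidence score in [0, 1].
Scorer = Callable[[Sequence[float]], float]


def cascade_score(
    keyframe: Sequence[float],
    stages: List[Tuple[Scorer, float]],
) -> Tuple[float, int]:
    """Run a keyframe through an ordered list of (scorer, threshold) stages.

    Hypothetical rule: after each stage the running mean of the scores seen
    so far is compared with that stage's threshold; if it falls below, the
    keyframe is rejected early and later (possibly costlier) classifiers are
    never evaluated. Returns (fused score, number of classifier evaluations).
    """
    if not stages:
        raise ValueError("the cascade needs at least one stage")

    total = 0.0
    evaluations = 0
    for scorer, threshold in stages:
        total += scorer(keyframe)
        evaluations += 1
        mean = total / evaluations
        if mean < threshold:
            # Early rejection: stop here and skip the remaining stages.
            return mean, evaluations
    return total / evaluations, evaluations


# Toy stages: each "classifier" just reads one precomputed score.
stages = [
    (lambda x: x[0], 0.25),  # cheap first stage
    (lambda x: x[1], 0.5),   # second stage, reached only if the first passes
]
rejected = cascade_score([0.125, 0.875], stages)  # stops after one evaluation
accepted = cascade_score([0.75, 0.25], stages)    # passes through both stages
```

In this toy setup the first keyframe costs one classifier evaluation and the second costs two, which is the kind of saving the abstract measures "in terms of classifier evaluations". How stages are chosen and ordered is exactly what the paper addresses and is not modeled here.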

Acknowledgements

This work was supported by the European Commission under contract FP7-600826 ForgetIT.

Corresponding author

Correspondence to Foteini Markatopoulou.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Markatopoulou, F., Mezaris, V., Patras, I. (2016). Ordering of Visual Descriptors in a Classifier Cascade Towards Improved Video Concept Detection. In: Tian, Q., Sebe, N., Qi, GJ., Huet, B., Hong, R., Liu, X. (eds) MultiMedia Modeling. MMM 2016. Lecture Notes in Computer Science, vol 9516. Springer, Cham. https://doi.org/10.1007/978-3-319-27671-7_73

  • DOI: https://doi.org/10.1007/978-3-319-27671-7_73

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27670-0

  • Online ISBN: 978-3-319-27671-7

  • eBook Packages: Computer Science, Computer Science (R0)
