Ordering of Visual Descriptors in a Classifier Cascade Towards Improved Video Concept Detection

Markatopoulou, Foteini; Mezaris, Vasileios; Patras, Ioannis

doi:10.1007/978-3-319-27671-7_73

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9516)

Included in the following conference series: International Conference on Multimedia Modeling

Abstract

Concept detection for semantic annotation of video fragments (e.g. keyframes) is a popular and challenging problem. A variety of visual features is typically extracted and combined in order to learn the relation between feature-based keyframe representations and semantic concepts. In recent years the available pool of features has increased rapidly, and features based on deep convolutional neural networks in combination with other visual descriptors have significantly contributed to improved concept detection accuracy. This work proposes an algorithm that dynamically selects, orders and combines many base classifiers, trained independently with different feature-based keyframe representations, in a cascade architecture for video concept detection. The proposed cascade is more accurate and computationally more efficient, in terms of classifier evaluations, than state-of-the-art classifier combination approaches.
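The abstract does not spell out the algorithm, so the following is only a minimal sketch of a generic early-exit classifier cascade, not the authors' method. It assumes the base classifiers have already been selected and ordered, that each returns a score in [0, 1], and that each stage carries a fixed rejection threshold. The names `Scorer`, `Stage`, `cascade_score`, and the toy `stages` list are hypothetical. The sketch illustrates why a cascade can save classifier evaluations: a keyframe whose running mean score drops below a stage threshold is rejected without evaluating the remaining stages.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical types: a scorer maps a keyframe feature vector to a score in [0, 1].
Scorer = Callable[[Sequence[float]], float]
Stage = Tuple[Scorer, float]  # (base classifier, rejection threshold)


def cascade_score(x: Sequence[float], stages: List[Stage]) -> Tuple[float, int]:
    """Run the stages in order and stop early on rejection.

    Returns the mean score of the stages that were evaluated and the
    number of classifier evaluations that were spent on this keyframe.
    """
    if not stages:
        raise ValueError("cascade needs at least one stage")
    total = 0.0
    for used, (scorer, threshold) in enumerate(stages, start=1):
        total += scorer(x)
        mean = total / used
        if mean < threshold:
            # Assumed rule: reject as soon as the running mean is too low.
            return mean, used
    return mean, used


# Toy stages (assumption): each "classifier" just reads one feature value.
stages: List[Stage] = [
    (lambda x: x[0], 0.3),
    (lambda x: x[1], 0.3),
    (lambda x: x[2], 0.0),
]

print(cascade_score([0.125, 1.0, 1.0], stages))  # (0.125, 1): rejected at stage 1
print(cascade_score([0.75, 0.5, 0.25], stages))  # (0.5, 3): all stages evaluated
```

In this toy run the first keyframe costs one evaluation instead of three. A late-fusion baseline that averages all classifiers would always pay for every stage.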

Acknowledgements

This work was supported by the European Commission under contract FP7-600826 ForgetIT.

Author information

Authors and Affiliations

Information Technologies Institute (ITI), CERTH, 57001, Thermi, Greece
Foteini Markatopoulou & Vasileios Mezaris
Queen Mary University of London, Mile End Campus, London, E1 4NS, UK
Foteini Markatopoulou & Ioannis Patras

Corresponding author

Correspondence to Foteini Markatopoulou.

Editor information

Editors and Affiliations

University of Texas at San Antonio, San Antonio, USA
Qi Tian
Dept. of Information Engineering, University of Trento, Povo, Trento, Italy
Nicu Sebe
EECS, University of Central Florida, Orlando, Florida, USA
Guo-Jun Qi
EURECOM, Sophia-Antipolis, France
Benoit Huet
Hefei University of Technology, Hefei, Anhui, China
Richang Hong
School of Computing and Information, Hefei University of Technology, Hefei, Anhui, China
Xueliang Liu

About this paper

Cite this paper

Markatopoulou, F., Mezaris, V., Patras, I. (2016). Ordering of Visual Descriptors in a Classifier Cascade Towards Improved Video Concept Detection. In: Tian, Q., Sebe, N., Qi, G.-J., Huet, B., Hong, R., Liu, X. (eds) MultiMedia Modeling. MMM 2016. Lecture Notes in Computer Science, vol 9516. Springer, Cham. https://doi.org/10.1007/978-3-319-27671-7_73

DOI: https://doi.org/10.1007/978-3-319-27671-7_73
Published: 03 January 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27670-0
Online ISBN: 978-3-319-27671-7
eBook Packages: Computer Science, Computer Science (R0)