Stacking of SVMs for Classifying Intangible Cultural Heritage Images

Do, Thanh-Nghi; Pham, The-Phi; Pham, Nguyen-Khang; Nguyen, Huu-Hoa; Tabia, Karim; Benferhat, Salem

doi:10.1007/978-3-030-38364-0_17

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1121)

Included in the following conference series:

International Conference on Computer Science, Applied Mathematics and Applications

Abstract

Our investigation aims to classify images of intangible cultural heritage (ICH) in the Mekong Delta, Vietnam. We collect an image dataset covering 17 ICH categories and annotate it manually. We compare support vector machines (SVM) trained on several popular visual representations: handcrafted features, namely the scale-invariant feature transform (SIFT) with the bag-of-words (BoW) model, the histogram of oriented gradients (HOG), and GIST; and invariant features learned automatically by deep networks, namely VGG19, ResNet50, Inception v3, and Xception. Numerical test results on the 17-category ICH dataset show that SVM models learned from Inception v3 and Xception features achieve accuracies of 61.54% and 62.89%, respectively. We propose stacking SVM models trained on different visual features to improve on the classification result of any single model. The triplets (SVM-Xception, SVM-Inception-v3, SVM-VGG19) and (SVM-Xception, SVM-Inception-v3, SVM-SIFT-BoW) both achieve a classification accuracy of 65.32%.
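
The stacking idea described in the abstract can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the authors' implementation: the abstract does not state the kernel, the meta-learner, or how the base-model outputs are combined, so the linear kernels, the SVM meta-learner, and the use of out-of-fold decision values are all assumptions. Synthetic feature blocks stand in for the Xception, Inception v3, and VGG19 features, and the helper names (`split_views`, `fit_stacked_svms`, `predict_stacked`) are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.svm import SVC


def split_views(X):
    """Cut a feature matrix into three column blocks (stand-ins for three descriptors)."""
    return [X[:, :10], X[:, 10:20], X[:, 20:]]


def fit_stacked_svms(views_train, y_train, cv=3):
    """Fit one SVM per feature view, then a meta-level SVM on their scores."""
    base_models, meta_inputs = [], []
    for X in views_train:
        base = SVC(kernel="linear", decision_function_shape="ovr")
        # Out-of-fold decision values: the meta-learner never sees scores
        # that a base model produced on its own training samples.
        scores = cross_val_predict(base, X, y_train, cv=cv,
                                   method="decision_function")
        meta_inputs.append(scores)
        base_models.append(base.fit(X, y_train))
    # Assumption: the meta-learner is itself a linear SVM.
    meta = SVC(kernel="linear").fit(np.column_stack(meta_inputs), y_train)
    return base_models, meta


def predict_stacked(base_models, meta, views):
    """Feed each view to its base SVM and classify the stacked scores."""
    scores = [m.decision_function(X) for m, X in zip(base_models, views)]
    return meta.predict(np.column_stack(scores))


# Synthetic 3-class data; NOT the ICH dataset from the paper.
X, y = make_classification(n_samples=300, n_features=30, n_informative=12,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

base_models, meta = fit_stacked_svms(split_views(X_train), y_train)
y_pred = predict_stacked(base_models, meta, split_views(X_test))
accuracy = float(np.mean(y_pred == y_test))
print(f"stacked accuracy on synthetic data: {accuracy:.3f}")
```

In a real pipeline, each view would be replaced by the feature matrix extracted with one descriptor (for example, the penultimate-layer activations of a pretrained network), with rows aligned across views so that row i always refers to the same image.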

Acknowledgments

This work has received support from the European Project H2020 Marie Sklodowska-Curie Actions (MSCA), Research and Innovation Staff Exchange (RISE): Aniage project (High Dimensional Heterogeneous Data based Animation Techniques for Southeast Asian ICH Digital Content), No: 691215.

Author information

Authors and Affiliations

College of Information Technology, Can Tho University, Can Tho, 92000, Vietnam
Thanh-Nghi Do, The-Phi Pham, Nguyen-Khang Pham & Huu-Hoa Nguyen
UMI UMMISCO 209 (IRD/UPMC), UPMC, Sorbonne University, Pierre and Marie Curie University, Paris 6, France
Thanh-Nghi Do
CRIL UMR 8188, CRIL CNRS and Artois University, Arras, France
Karim Tabia & Salem Benferhat

Corresponding author

Correspondence to Thanh-Nghi Do.

Editor information

Editors and Affiliations

Computer Science and Applications Department LGIPM, University of Lorraine, Metz Cedex 03, France
Hoai An Le Thi
Computer Science and Applications Department LGIPM, University of Lorraine, Metz Cedex 03, France
Hoai Minh Le
Laboratory of Mathematics, National Institute for Applied Sciences, Saint-Étienne-du-Rouvray Cedex, France
Tao Pham Dinh
Department of Information Systems, Wroclaw University of Science and Technology, Wrocław, Poland
Ngoc Thanh Nguyen

About this paper

Cite this paper

Do, TN., Pham, TP., Pham, NK., Nguyen, HH., Tabia, K., Benferhat, S. (2020). Stacking of SVMs for Classifying Intangible Cultural Heritage Images. In: Le Thi, H., Le, H., Pham Dinh, T., Nguyen, N. (eds) Advanced Computational Methods for Knowledge Engineering. ICCSAMA 2019. Advances in Intelligent Systems and Computing, vol 1121. Springer, Cham. https://doi.org/10.1007/978-3-030-38364-0_17

DOI: https://doi.org/10.1007/978-3-030-38364-0_17
Published: 20 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-38363-3
Online ISBN: 978-3-030-38364-0
eBook Packages: Intelligent Technologies and Robotics (R0)
