Learning Bag-of-Words Models Using Sparse Partial Least Squares

  • Conference paper
Foundations of Intelligent Systems

Part of the book series: Advances in Intelligent and Soft Computing (AINSC, volume 122)

Abstract

Representing images with the Bag-of-Words (BOW) model has shown excellent performance in image classification and retrieval. However, the model still has limitations, such as the presence of many noisy visual words and the difficulty of choosing a vocabulary size. To circumvent these drawbacks, this paper concentrates on tuning a compact, robust, and thus efficient BOW model for image representation, even with a universal size. The proposed approach increases expressive power by employing Sparse Partial Least Squares (SPLS) to tune the traditional high-dimensional BOW model and to learn a more discriminative subspace with 10 latent variables. The performance of the learned BOW models for image classification is studied through extensive experiments on the VOC 2006 dataset. Empirical results indicate that the proposed method yields quite stable results and outperforms both traditional BOW models with various vocabulary sizes and PCA with SVM.
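As an illustration of the idea summarised in the abstract, the following is a minimal sketch of projecting high-dimensional BOW histograms onto 10 sparse latent variables. It is not the authors' implementation: the soft-thresholding rule, the `eta` parameter, the function names, and the synthetic histograms are all assumptions made for this example, and the sparse PLS variant shown is a simplified single-response form with NIPALS-style deflation.

```python
import numpy as np

def soft_threshold(w, eta):
    """Shrink entries by eta * max|w|; entries below that level become exactly 0."""
    lam = eta * np.max(np.abs(w))
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

def sparse_pls_fit(X, y, n_components=10, eta=0.5):
    """Simplified sparse PLS for one response (illustrative assumption).

    Returns the feature means, the sparse weight matrix W, and a projection
    matrix R such that (X - mean) @ R gives the latent scores.
    """
    mean = X.mean(axis=0)
    Xk = X - mean                      # working copy, deflated each round
    yc = y - y.mean()
    W, P = [], []
    for _ in range(n_components):
        # Direction of maximal covariance with y, sparsified by thresholding.
        w = soft_threshold(Xk.T @ yc, eta)
        norm = np.linalg.norm(w)
        if norm == 0.0:                # no covariance left to explain
            break
        w = w / norm
        t = Xk @ w                     # latent score for this component
        p = Xk.T @ t / (t @ t)         # loading used for deflation
        Xk = Xk - np.outer(t, p)       # remove the explained part of X
        W.append(w)
        P.append(p)
    W = np.column_stack(W)
    P = np.column_stack(P)
    # Map weights on deflated data back to weights on the original data.
    R = W @ np.linalg.inv(P.T @ W)
    return mean, W, R

def sparse_pls_transform(X, mean, R):
    """Project BOW histograms onto the learned latent variables."""
    return (X - mean) @ R

# Synthetic stand-in for BOW histograms: 200 images, 300 visual words.
rng = np.random.default_rng(0)
counts = rng.poisson(3.0, size=(200, 300)).astype(float)
# Hypothetical labels driven by the first 20 visual words only.
signal = counts[:, :20].sum(axis=1)
y = (signal > np.median(signal)).astype(float)
X = counts / counts.sum(axis=1, keepdims=True)   # L1-normalised histograms

mean, W, R = sparse_pls_fit(X, y, n_components=10, eta=0.5)
scores = sparse_pls_transform(X, mean, R)
print(scores.shape, np.count_nonzero(W[:, 0]))
```

The resulting 10-dimensional scores could then be passed to any classifier. The zero entries in each column of `W` correspond to visual words that the component ignores, which is one way to read the abstract's point about suppressing noisy visual words.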

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Liu, J., Zeng, G. (2011). Learning Bag-of-Words Models Using Sparse Partial Least Squares. In: Wang, Y., Li, T. (eds) Foundations of Intelligent Systems. Advances in Intelligent and Soft Computing, vol 122. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25664-6_51

  • DOI: https://doi.org/10.1007/978-3-642-25664-6_51

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25663-9

  • Online ISBN: 978-3-642-25664-6

  • eBook Packages: Engineering, Engineering (R0)
