Learning a hierarchical image manifold for Web image classification

Journal of Zhejiang University SCIENCE C

Abstract

Image classification is an essential task in content-based image retrieval. However, due to the semantic gap between low-level visual features and high-level semantic concepts, and the diversity of Web images, the performance of traditional classification approaches falls far short of users’ expectations. To reduce the semantic gap and to meet the pressing requirements for dimensionality reduction, high-quality retrieval results, and batch-based processing, we propose a hierarchical image manifold with novel distance measures. Assuming that the images in an image set describe the same or similar object but in various scenes, we formulate two kinds of manifolds, the object manifold and the scene manifold, at different levels of semantic granularity. The object manifold is developed for object-level classification using an algorithm named extended locally linear embedding (ELLE), which is based on intra- and inter-object difference measures. The scene manifold is built for scene-level classification using an algorithm named locally linear submanifold extraction (LLSE), which combines linear perturbation and region growing. Experimental results show that our method is effective in improving the performance of classifying Web images.
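For readers unfamiliar with the base technique, the following is a minimal sketch of standard locally linear embedding (LLE), the algorithm that ELLE is said to extend. It is not the authors' ELLE: the abstract does not specify how the intra- and inter-object difference measures enter the computation, so this sketch uses plain Euclidean neighbourhoods. The function name `lle` and its parameters (`n_neighbors`, `n_components`, `reg`) are illustrative assumptions.

```python
import numpy as np

def lle(X, n_neighbors=5, n_components=2, reg=1e-3):
    """Plain LLE on the rows of X (illustrative; not the paper's ELLE)."""
    n = X.shape[0]

    # Step 1: k nearest neighbours under squared Euclidean distance.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbour
    neighbors = np.argsort(d2, axis=1)[:, :n_neighbors]

    # Step 2: weights that best reconstruct each point from its neighbours,
    # constrained to sum to one.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[neighbors[i]] - X[i]   # neighbours centred on x_i
        G = Z @ Z.T                  # local Gram matrix
        # Regularise so the linear system is well conditioned.
        G = G + reg * (np.trace(G) + 1e-12) * np.eye(n_neighbors)
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, neighbors[i]] = w / w.sum()

    # Step 3: low-dimensional coordinates from the bottom eigenvectors of
    # (I - W)^T (I - W), skipping the first (near-constant) eigenvector.
    I_minus_W = np.eye(n) - W
    M = I_minus_W.T @ I_minus_W
    _, eigvecs = np.linalg.eigh(M)   # eigenvalues in ascending order
    return eigvecs[:, 1:n_components + 1], W
```

Calling `Y, W = lle(X)` on an array `X` of shape `(n_samples, n_features)` returns an embedding `Y` of shape `(n_samples, n_components)` together with the reconstruction weight matrix. A supervised or class-aware variant would change how neighbours are chosen in step 1; how the paper does so is not stated in the abstract.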



Author information

Corresponding author

Correspondence to Rong Zhu.

Additional information

Project supported by the National High-Tech R & D Program (863) of China (No. 2009AA011900), the Zhejiang Provincial Natural Science Foundation of China (No. 2011Y1110960), and the Zhejiang Provincial Nonprofit Technology and Application Research Program of China (Nos. 2011C31045 and 2012C21020).


About this article

Cite this article

Zhu, R., Yao, M., Ye, Lh. et al. Learning a hierarchical image manifold for Web image classification. J. Zhejiang Univ. - Sci. C 13, 719–735 (2012). https://doi.org/10.1631/jzus.C1200032


  • DOI: https://doi.org/10.1631/jzus.C1200032
