Visual graph modeling for scene recognition and mobile robot localization

Pham, Trong-Ton; Mulhem, Philippe; Maisonnasse, Loïc; Gaussier, Eric; Lim, Joo-Hwee

doi:10.1007/s11042-010-0598-8

Published: 14 September 2010

Volume 60, pages 419–441 (2012)

Multimedia Tools and Applications

Abstract

Image retrieval and categorization may need to consider several types of visual features and the spatial information between them (e.g., different points of view of an image). This paper presents a novel approach that extends the language modeling approach from information retrieval to the problem of graph-based image retrieval and categorization. Such a versatile graph model is needed to represent the multiple points of view of images. A language model is defined on these graphs to enable fast graph matching. We report experiments with several instances of the proposed model on two image collections: one composed of 3,849 touristic images and another composed of 3,633 images captured by a mobile robot. Experimental results show that the visual graph model (VGM) improves on the accuracy of the standard language model (LM) and outperforms the Support Vector Machine (SVM) method.
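The abstract does not give the model's formulas, so the following is only a minimal sketch of the general idea, under stated assumptions: each image is reduced to counts of visual concepts and counts of labeled relations between concepts, each class gets a multinomial language model over both, and a query image is assigned to the class with the highest smoothed log-likelihood. Jelinek-Mercer smoothing, the dictionary layout, and all function names are illustrative choices, not taken from the paper.

```python
import math
from collections import Counter


def train_class_model(graphs):
    """Pool concept and relation counts over the training graphs of one class."""
    concepts, relations = Counter(), Counter()
    for g in graphs:
        concepts.update(g["concepts"])
        relations.update(g["relations"])
    return {"concepts": concepts, "relations": relations}


def smoothed_log_prob(counts, model_counts, background_counts, lam):
    """Multinomial log-likelihood of `counts` with Jelinek-Mercer smoothing.

    The background estimate uses add-one smoothing (an assumption made here)
    so that items never seen in training still get a non-zero probability.
    """
    model_total = sum(model_counts.values())
    background_total = sum(background_counts.values())
    vocab_size = len(background_counts) + 1  # +1 slot for unseen items
    log_p = 0.0
    for item, n in counts.items():
        p_model = model_counts[item] / model_total if model_total else 0.0
        p_background = (background_counts[item] + 1) / (background_total + vocab_size)
        log_p += n * math.log((1 - lam) * p_model + lam * p_background)
    return log_p


def score(query, model, background, lam=0.5):
    """Sum the concept and relation log-likelihoods of a query graph."""
    return (
        smoothed_log_prob(query["concepts"], model["concepts"], background["concepts"], lam)
        + smoothed_log_prob(query["relations"], model["relations"], background["relations"], lam)
    )


def classify(query, class_models, background, lam=0.5):
    """Return the class label whose model gives the query the highest score."""
    return max(class_models, key=lambda c: score(query, class_models[c], background, lam))
```

Because scoring only sums over the concepts and relations present in the query, its cost is linear in the size of the query graph, which is one plausible reading of how a language model can make graph matching fast.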

Acknowledgements

This work was supported by the French National Agency of Research (ANR-06-MDCA-002). Pham Trong-Ton would like to thank the Merlion programme of the French Embassy in Singapore for its support during his Ph.D. study.

Author information

Authors and Affiliations

Trong-Ton Pham: Grenoble Institute of Technology, Laboratoire Informatique de Grenoble (LIG), 385 Av. de la Bibliothèque, 38400, Grenoble, France
Philippe Mulhem and Eric Gaussier: Multimedia Information Modeling and Retrieval, Laboratoire Informatique de Grenoble (LIG), 385 Av. de la Bibliothèque, 38400, Grenoble, France
Loïc Maisonnasse: R&D Department, TecKnowMetrix, 4 rue Léon Béridot, Voiron, France
Joo-Hwee Lim: Computer Vision and Image Understanding, Institute for Infocomm Research (I2R), 1 Fusionopolis Way, #21-01, Connexis, 138632, Singapore

Corresponding author

Correspondence to Trong-Ton Pham.

About this article

Cite this article

Pham, TT., Mulhem, P., Maisonnasse, L. et al. Visual graph modeling for scene recognition and mobile robot localization. Multimed Tools Appl 60, 419–441 (2012). https://doi.org/10.1007/s11042-010-0598-8

Issue Date: September 2012