Automatic tag expansion using visual similarity for photo sharing websites

Multimedia Tools and Applications

Abstract

In this paper we present an automatic photo tag expansion method designed for photo sharing websites. The purpose of the method is to suggest tags that are relevant to the visual content of a given photo at upload time. Both textual and visual cues are used in the tag expansion process. When a photo is about to be uploaded, the system asks the user for a couple of initial tags. These initial tags are used to retrieve relevant photos together with their tags, and the retrieved photos are assumed to be potentially related in content to the uploaded target photo. The tag sets of the relevant photos form the candidate tag list, and the visual similarities between the target photo and the relevant photos are used to weight the candidate tags. The tags with the highest weights are suggested to the user. The method is applied to Flickr (http://www.flickr.com). Results show that including visual information in the photo tagging process increases accuracy compared with text-based methods.
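The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the weighting formula or the visual features, so the sketch assumes that a candidate tag's weight is the sum of the visual similarities of the relevant photos carrying it, and it uses cosine similarity over plain feature vectors as a stand-in for the visual similarity measure. The names `cosine_similarity`, `suggest_tags`, and `toy_collection`, and the in-memory collection replacing a Flickr query, are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors.

    Assumption: a stand-in for whatever visual similarity the paper uses.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest_tags(target_features, initial_tags, collection, top_k=5):
    """Suggest up to top_k tags for a target photo.

    collection is a list of dicts with "tags" and "features" keys
    (a hypothetical in-memory substitute for a photo sharing site).
    """
    initial = {t.lower() for t in initial_tags}

    # Step 1: retrieve photos that share at least one initial tag.
    relevant = [p for p in collection if initial & set(p["tags"])]

    # Steps 2 and 3: gather candidate tags from the relevant photos and
    # weight each one by the visual similarity of the photos carrying it
    # (assumed scheme: sum of similarities).
    weights = defaultdict(float)
    for photo in relevant:
        sim = cosine_similarity(target_features, photo["features"])
        for tag in set(photo["tags"]) - initial:
            weights[tag] += sim

    # Step 4: return the highest-weighted tags (ties broken alphabetically).
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:top_k]]


# Toy data: three "beach" photos and one unrelated photo.
toy_collection = [
    {"tags": ["beach", "sea", "sunset"], "features": [0.9, 0.1, 0.0]},
    {"tags": ["beach", "volleyball"], "features": [0.1, 0.9, 0.0]},
    {"tags": ["city", "night"], "features": [0.0, 0.1, 0.9]},
    {"tags": ["beach", "sea", "waves"], "features": [0.8, 0.2, 0.1]},
]

print(suggest_tags([1.0, 0.0, 0.0], ["beach"], toy_collection, top_k=3))
```

In this toy run, tags from photos that are visually close to the target outrank a tag that merely co-occurs with "beach" on a visually dissimilar photo, which is the effect the abstract attributes to adding visual information to text-based retrieval.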

Acknowledgements

We thank Muhammet Bastan for preparing the MPEG-7 visual feature extractor, and all the users who participated in the user study. This research is partially supported by TUBITAK Career grant number 104E065.

Author information

Corresponding author

Correspondence to Sare Gul Sevil.

About this article

Cite this article

Sevil, S.G., Kucuktunc, O., Duygulu, P. et al. Automatic tag expansion using visual similarity for photo sharing websites. Multimed Tools Appl 49, 81–99 (2010). https://doi.org/10.1007/s11042-009-0394-5

  • DOI: https://doi.org/10.1007/s11042-009-0394-5