Abstract
In this paper we present an automatic photo tag expansion method designed for photo sharing websites. The purpose of the method is to suggest, at upload time, tags that are relevant to the visual content of a given photo. Both textual and visual cues are used in the tag expansion process. When a photo is to be uploaded, the system asks the user for a few initial tags. The initial tags are used to retrieve relevant photos together with their tags. These photos are assumed to be potentially related in content to the uploaded target photo. The tag sets of the relevant photos form the candidate tag list, and the visual similarities between the target photo and the relevant photos are used to weight these candidate tags. The tags with the highest weights are suggested to the user. The method is applied to Flickr (http://www.flickr.com). Results show that including visual information in the photo tagging process increases accuracy compared with text-based methods.
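The weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature representation, the similarity measure, or how similarities are combined, so the code below assumes plain numeric feature vectors, cosine similarity, and a candidate tag weight equal to the sum of the similarities of the relevant photos that carry the tag. The function names `cosine_similarity` and `suggest_tags` are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors.

    Assumption: the abstract does not name a similarity measure;
    cosine similarity is used here only for illustration.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest_tags(target_features, initial_tags, relevant_photos, top_k=5):
    """Rank candidate tags by summed visual similarity (assumed weighting).

    relevant_photos is an iterable of (feature_vector, tags) pairs that
    stands in for the photos retrieved with the user's initial tags.
    """
    initial = {t.lower() for t in initial_tags}
    weights = defaultdict(float)
    for features, tags in relevant_photos:
        sim = cosine_similarity(target_features, features)
        # Count each tag once per photo and skip tags the user already gave.
        for tag in {t.lower() for t in tags}:
            if tag not in initial:
                weights[tag] += sim
    # Highest weight first; ties are broken alphabetically for determinism.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:top_k]]
```

For example, with a target vector `[1.0, 0.0]`, the initial tag `beach`, and three retrieved photos, a tag shared by the visually closest photos ranks above a tag that appears only on a dissimilar photo. A text-only baseline would weight every retrieved photo equally, which corresponds to replacing `sim` with a constant.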
Acknowledgements
We thank Muhammet Bastan for preparing the MPEG-7 visual feature extractor, and all the users who participated in the user study. This research is partially supported by TUBITAK Career grant number 104E065.
Cite this article
Sevil, S.G., Kucuktunc, O., Duygulu, P. et al. Automatic tag expansion using visual similarity for photo sharing websites. Multimed Tools Appl 49, 81–99 (2010). https://doi.org/10.1007/s11042-009-0394-5
DOI: https://doi.org/10.1007/s11042-009-0394-5