Result diversification in social image retrieval: a benchmarking framework

Multimedia Tools and Applications

Abstract

This article addresses the diversification of results in image retrieval from social media. It proposes a benchmarking framework together with an annotated dataset and discusses the results achieved during the related task run in the MediaEval 2013 benchmark. The text describes and analyzes 38 multimedia diversification systems and their results; the systems range from graph-based representations, re-ranking, optimization approaches and data clustering to hybrid approaches that include a human in the loop. A comparison of expert and crowdsourced annotations shows that crowdsourcing yields a slightly lower inter-rater agreement but comparable results at a much lower cost than expert annotation. Multimodal approaches achieve the best results in terms of cluster recall. Manual approaches can lead to high precision but often to lower diversity. This detailed analysis of the results offers insights into diversity in image retrieval and into the preparation of new evaluation campaigns in related areas.
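To make the two evaluation notions in the abstract concrete, the following minimal Python sketch computes precision and cluster recall at a cutoff N for one ranked result list. It is an illustration under assumed definitions, not the benchmark's official scoring tool: precision at N is taken as the fraction of the top N images that are relevant, and cluster recall at N as the fraction of the query's ground-truth clusters that appear among the top N images. The function names, the `cluster_of` mapping and the toy image identifiers are hypothetical.

```python
from typing import Dict, Hashable, Optional, Sequence

# Assumed ground-truth format: image id -> cluster label, with None marking
# an image judged non-relevant for the query (hypothetical convention).
GroundTruth = Dict[str, Optional[Hashable]]


def precision_at_n(ranked: Sequence[str], cluster_of: GroundTruth, n: int) -> float:
    """Fraction of the top-n ranked images that are relevant."""
    if n <= 0:
        raise ValueError("n must be positive")
    relevant = sum(1 for img in ranked[:n] if cluster_of.get(img) is not None)
    return relevant / n


def cluster_recall_at_n(ranked: Sequence[str], cluster_of: GroundTruth, n: int) -> float:
    """Fraction of all ground-truth clusters represented in the top-n images."""
    if n <= 0:
        raise ValueError("n must be positive")
    all_clusters = {c for c in cluster_of.values() if c is not None}
    if not all_clusters:
        return 0.0
    seen = {cluster_of.get(img) for img in ranked[:n]} - {None}
    return len(seen) / len(all_clusters)


# Toy example: five images, three ground-truth clusters, one non-relevant image.
ground_truth: GroundTruth = {
    "a": "front view",
    "b": "front view",
    "c": "night view",
    "d": None,
    "e": "interior",
}
ranking = ["a", "b", "d", "c", "e"]

p4 = precision_at_n(ranking, ground_truth, 4)
cr4 = cluster_recall_at_n(ranking, ground_truth, 4)
print(f"P@4 = {p4:.2f}, CR@4 = {cr4:.2f}")
```

In this toy ranking, two near-duplicate "front view" images occupy the first two positions, so precision stays high while cluster recall lags behind. This mirrors the trade-off between precision and diversity noted in the abstract.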

Acknowledgments

This work was supported by the following projects: CUbRIK (http://www.cubrikproject.eu/), PROMISE (http://www.promise-noe.eu/) and MUCKE (http://ifs.tuwien.ac.at/~mucke/). We also acknowledge the MediaEval Benchmarking Initiative for Multimedia Evaluation (http://www.multimediaeval.org/).

Author information

Corresponding author

Correspondence to Bogdan Ionescu.

Cite this article

Ionescu, B., Popescu, A., Radu, AL. et al. Result diversification in social image retrieval: a benchmarking framework. Multimed Tools Appl 75, 1301–1331 (2016). https://doi.org/10.1007/s11042-014-2369-4

  • DOI: https://doi.org/10.1007/s11042-014-2369-4
