WWW Conference Proceedings, DOI: 10.1145/1526709.1526761

An axiomatic approach for result diversification

Published: 20 April 2009

ABSTRACT

Understanding user intent is key to designing an effective ranking system in a search engine. In the absence of any explicit knowledge of user intent, search engines want to diversify results to improve user satisfaction. In such a setting, the approach based on the probability ranking principle, which presents the most relevant results on top, can be sub-optimal, and hence the search engine would like to trade off relevance for diversity in the results.

In analogy to prior work on ranking and clustering systems, we use the axiomatic approach to characterize and design diversification systems. We develop a set of natural axioms that a diversification system is expected to satisfy, and show that no diversification function can satisfy all the axioms simultaneously. We illustrate the use of the axiomatic framework by providing three example diversification objectives that satisfy different subsets of the axioms. We also uncover a rich link to the facility dispersion problem that results in algorithms for a number of diversification objectives. Finally, we propose an evaluation methodology to characterize the objectives and the underlying axioms. We conduct a large-scale evaluation of our objectives based on two data sets: a data set derived from the Wikipedia disambiguation pages and a product database.
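To make the relevance-versus-diversity trade-off concrete, the following is a minimal sketch of a greedy selection heuristic in the spirit of dispersion problems. It is an illustration only and is not taken from the paper: the objective (sum of relevance plus a weighted sum of pairwise distances), the names `objective` and `greedy_diversify`, and the weight `lam` are all assumptions made for this example.

```python
from itertools import combinations


def objective(selected, relevance, distance, lam):
    """Hypothetical set score: total relevance plus lam times total pairwise distance."""
    rel = sum(relevance[d] for d in selected)
    div = sum(distance(a, b) for a, b in combinations(selected, 2))
    return rel + lam * div


def greedy_diversify(candidates, relevance, distance, k, lam=1.0):
    """Pick up to k items, each time adding the item with the largest marginal gain.

    This is a simple heuristic for illustration; it carries no optimality claim.
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        # Marginal gain: own relevance plus weighted distance to items already chosen.
        def gain(doc):
            return relevance[doc] + lam * sum(distance(doc, s) for s in selected)

        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (invented): an ambiguous query with two possible intents.
docs = ["jaguar_car_1", "jaguar_car_2", "jaguar_cat"]
relevance = {"jaguar_car_1": 0.9, "jaguar_car_2": 0.85, "jaguar_cat": 0.6}
topic = {"jaguar_car_1": "car", "jaguar_car_2": "car", "jaguar_cat": "animal"}


def distance(a, b):
    # Assumed distance: 0 for the same topic, 1 otherwise.
    return 0.0 if topic[a] == topic[b] else 1.0


print(greedy_diversify(docs, relevance, distance, k=2, lam=0.0))
print(greedy_diversify(docs, relevance, distance, k=2, lam=1.0))
```

With `lam=0.0` the selection reduces to ranking by relevance alone, so both car pages are returned. With `lam=1.0` the second pick switches to the less relevant page on the other intent, because its distance to the first pick outweighs the relevance gap.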
