DOI: 10.1145/1645953.1646032
Research article

Empirical justification of the gain and discount function for nDCG

Published: 02 November 2009

ABSTRACT

The nDCG measure has proven to be a popular measure of retrieval effectiveness utilizing graded relevance judgments. However, a number of different instantiations of nDCG exist, depending on the arbitrary definition of the gain and discount functions used (1) to dictate the relative value of documents of different relevance grades and (2) to weight the importance of gain values at different ranks, respectively. In this work we discuss how to empirically derive a gain and discount function that optimizes the efficiency or stability of nDCG. First, we describe a variance decomposition analysis framework and an optimization procedure utilized to find the efficiency- or stability-optimal gain and discount functions. Then we use TREC data sets to compare the optimal gain and discount functions to the ones that have appeared in the IR literature with respect to (a) the efficiency of the evaluation, (b) the induced ranking of systems, and (c) the discriminative power of the resulting nDCG measure.
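For concreteness, the following is a minimal sketch (not taken from the paper) of how nDCG is parameterized by a gain function over relevance grades and a discount function over ranks. The linear and exponential gains and the logarithmic discount shown are common instantiations from the IR literature; the function and variable names are illustrative only.

```python
import math

def dcg(grades, gain, discount):
    # Discounted cumulative gain: sum of gain(grade) weighted by discount(rank).
    return sum(gain(g) * discount(r) for r, g in enumerate(grades, start=1))

def ndcg(grades, gain, discount):
    # Normalize by the DCG of the ideal ranking (grades sorted in descending order).
    ideal = dcg(sorted(grades, reverse=True), gain, discount)
    return dcg(grades, gain, discount) / ideal if ideal > 0 else 0.0

# Two common instantiations from the literature:
linear_gain = lambda g: g                        # gain equal to the relevance grade
exp_gain = lambda g: 2 ** g - 1                  # exponential gain
log_discount = lambda r: 1.0 / math.log2(r + 1)  # logarithmic rank discount

# Relevance grades of the documents at ranks 1..4 of a hypothetical ranking.
print(ndcg([2, 3, 0, 1], exp_gain, log_discount))
print(ndcg([2, 3, 0, 1], linear_gain, log_discount))
```

Different choices of the gain and discount functions yield different nDCG instantiations, which is precisely the degree of freedom the paper studies empirically.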


Published in

CIKM '09: Proceedings of the 18th ACM Conference on Information and Knowledge Management
November 2009, 2162 pages
ISBN: 9781605585123
DOI: 10.1145/1645953
Copyright © 2009 ACM


Publisher: Association for Computing Machinery, New York, NY, United States
