DOI: 10.1145/2009916.2010037
Research article

System effectiveness, user models, and user utility: a conceptual framework for investigation

Published: 24 July 2011

ABSTRACT

There is great interest in producing effectiveness measures that model user behavior in order to better capture the utility of a system to its users. These measures are often formulated as a sum over ranks of the product of a rank-based discount function and a gain function mapping relevance assessments to numeric utility values. We develop a conceptual framework for analyzing such effectiveness measures, classifying members of this broad class of measures into four distinct families, each of which reflects a different notion of system utility. Within this framework we can hypothesize about the properties that such a measure should have and test those hypotheses against user and system data. Along the way we present a collection of novel results about specific measures and the relationships between them.
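To make the shared gain-times-discount formulation concrete, here is a minimal Python sketch. It is not taken from the paper: the function names are ours, and the two instantiations simply follow the standard published definitions of DCG (Järvelin and Kekäläinen) and RBP (Moffat and Zobel).

```python
import math

def gain_discount_measure(relevances, gain, discount):
    """Sum over ranks of gain(relevance at rank k) * discount(k).

    relevances -- graded relevance assessments for the ranked list,
                  in rank order (rank 1 first).
    gain       -- maps a relevance assessment to a numeric utility value.
    discount   -- maps a 1-based rank to a weight.
    """
    return sum(gain(rel) * discount(k)
               for k, rel in enumerate(relevances, start=1))

def dcg(relevances):
    # Discounted cumulative gain: exponential gain, logarithmic rank discount.
    return gain_discount_measure(relevances,
                                 gain=lambda rel: 2 ** rel - 1,
                                 discount=lambda k: 1.0 / math.log2(k + 1))

def rbp(relevances, p=0.8):
    # Rank-biased precision: geometric discount derived from a user model in
    # which the user continues from rank k to k + 1 with persistence p.
    return gain_discount_measure(relevances,
                                 gain=lambda rel: rel,
                                 discount=lambda k: (1 - p) * p ** (k - 1))

print(dcg([3, 2, 0, 1]))  # graded assessments at ranks 1..4
print(rbp([1, 1, 0, 1]))  # binary assessments at ranks 1..4
```

Each choice of gain and discount pair yields a different member of the family; it is this degree of freedom that the paper's framework organizes into four families reflecting different notions of system utility.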


Published in

SIGIR '11: Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2011, 1374 pages
ISBN: 9781450307574
DOI: 10.1145/2009916
Copyright © 2011 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 792 of 3,983 submissions, 20%
