DOI: 10.1145/1076034.1076085
Article

Linear discriminant model for information retrieval

Published: 15 August 2005

ABSTRACT

This paper presents a new discriminative model for information retrieval (IR), referred to as the linear discriminant model (LDM), which provides a flexible framework for incorporating arbitrary features. LDM differs from most existing models in that it takes into account a variety of linguistic features derived from the component models of the HMM, which is widely used in language modeling approaches to IR. LDM is therefore a means of melding discriminative and generative models for IR. We present two parameter-learning algorithms for LDM. One optimizes the average precision (AP) directly using an iterative procedure. The other is a perceptron-based algorithm that minimizes the number of discordant document pairs in a ranked list. The effectiveness of our approach has been evaluated on the task of ad hoc retrieval using six English and Chinese TREC test sets. Results show that (1) on most test sets, LDM significantly outperforms the state-of-the-art language modeling approaches and the classical probabilistic retrieval model; (2) it is more appropriate to train LDM using a measure of AP rather than likelihood if the IR system is graded on AP; and (3) linguistic features (e.g., phrases and dependences) are effective for IR if they are incorporated properly.
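The abstract gives no formulas or pseudocode, so the following is only a minimal sketch of the general ideas it names: a linear scoring function over document features, average precision over a ranked list, and a pairwise perceptron update that reduces discordant (relevant, non-relevant) pairs. It is a generic textbook formulation and not the paper's algorithm. The function names, the toy feature vectors, the learning rate, and the assumption that every relevant document appears in the ranked list are all illustrative choices.

```python
from typing import List, Sequence, Tuple

Doc = Tuple[Sequence[float], bool]  # (feature vector, is_relevant); hypothetical layout


def score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear score: dot product of the weight vector and the feature vector."""
    return sum(w * f for w, f in zip(weights, features))


def rank(weights: Sequence[float], docs: List[Doc]) -> List[Doc]:
    """Sort documents by descending linear score."""
    return sorted(docs, key=lambda d: score(weights, d[0]), reverse=True)


def average_precision(ranked_relevance: Sequence[bool]) -> float:
    """AP of one ranked list, assuming all relevant documents are in the list."""
    hits = 0
    total = 0.0
    for position, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / position  # precision at this relevant document
    return total / hits if hits else 0.0


def count_discordant(weights: Sequence[float], docs: List[Doc]) -> int:
    """Count pairs where a relevant document does not outscore a non-relevant one."""
    relevant = [f for f, r in docs if r]
    non_relevant = [f for f, r in docs if not r]
    return sum(
        1
        for fr in relevant
        for fn in non_relevant
        if score(weights, fr) <= score(weights, fn)
    )


def pairwise_perceptron(
    docs: List[Doc], n_features: int, epochs: int = 10, lr: float = 1.0
) -> List[float]:
    """Generic pairwise perceptron: on each discordant pair, move the weights
    toward the relevant document's features and away from the non-relevant one's."""
    weights = [0.0] * n_features
    relevant = [f for f, r in docs if r]
    non_relevant = [f for f, r in docs if not r]
    for _ in range(epochs):
        updated = False
        for fr in relevant:
            for fn in non_relevant:
                if score(weights, fr) <= score(weights, fn):
                    weights = [
                        w + lr * (a - b) for w, a, b in zip(weights, fr, fn)
                    ]
                    updated = True
        if not updated:  # no discordant pairs left on the training data
            break
    return weights
```

A caller might train on one query's judged documents with `pairwise_perceptron(docs, n_features)` and then check `average_precision([r for _, r in rank(weights, docs)])`. In a sketch like this, the pairwise update only indirectly improves AP, which is one reason a direct AP-optimizing procedure can be treated as a separate alternative.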


Published in

SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
August 2005
708 pages
ISBN: 1595930345
DOI: 10.1145/1076034

Copyright © 2005 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 792 of 3,983 submissions, 20%
