Improving Text Rankers by Term Locality Contexts

Liu, Rey-Long; Lin, Zong-Xing

doi:10.1007/978-3-642-04769-5_24

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5839)

Included in the following conference series:

Asia Information Retrieval Symposium

Abstract

When ranking the texts retrieved for a query, the semantics of each term t in the texts is a fundamental basis. These semantics often depend on the locality context of t, that is, the terms neighboring t in the texts. In this paper, we present a technique, CTFA4TR, that improves text rankers by encoding term locality contexts into the assessment of the term frequency (TF) of each term in the texts. The results of the TF assessment can be used directly to improve various kinds of text rankers, without requiring any revision to the rankers' algorithms or development processes. Moreover, CTFA4TR conducts the TF assessment efficiently online, and requires neither a training process nor training data. Empirical evaluation shows that CTFA4TR significantly improves various kinds of text rankers. The contributions are of practical significance: many text rankers have been developed, and any of them that considers TF in ranking may be enhanced by CTFA4TR at no cost to the ranker itself.
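
The abstract does not give the CTFA4TR formula, so the following is only a minimal sketch of the general idea of a context-aware TF, not the authors' method. It assumes a hypothetical scheme in which each occurrence of a query term counts as 1, plus a bonus of 1/distance for every other query term found within a fixed window of tokens. The function name `context_weighted_tf`, the `window` parameter, and the bonus rule are all illustrative assumptions.

```python
from typing import Sequence


def context_weighted_tf(
    doc_tokens: Sequence[str],
    query_terms: Sequence[str],
    term: str,
    window: int = 5,
) -> float:
    """Hypothetical context-aware TF (an assumption, not the CTFA4TR formula).

    Each occurrence of `term` contributes 1.0, plus 1/distance for every
    *other* query term that appears within `window` tokens of it.
    """
    others = set(query_terms) - {term}
    score = 0.0
    for i, tok in enumerate(doc_tokens):
        if tok != term:
            continue
        # Clamp the neighborhood to the document boundaries.
        lo = max(0, i - window)
        hi = min(len(doc_tokens), i + window + 1)
        bonus = 0.0
        for j in range(lo, hi):
            if j != i and doc_tokens[j] in others:
                bonus += 1.0 / abs(j - i)
        score += 1.0 + bonus
    return score


if __name__ == "__main__":
    doc = ("term proximity improves ranking while term frequency "
           "alone ignores proximity").split()
    query = ["term", "proximity"]
    print(doc.count("term"))
    print(context_weighted_tf(doc, query, "term", window=2))
```

In this toy document, the first occurrence of "term" has "proximity" directly beside it and counts 2.0, while the second has no query term within two tokens and counts 1.0, giving 3.0 against a raw TF of 2. Because the output is a drop-in replacement for a TF value, a ranker that consumes TF could use it without any change to its own algorithm, which is the property the abstract emphasizes.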

Author information

Authors and Affiliations

Department of Medical Informatics, Tzu Chi University, Hualien, Taiwan, R.O.C.
Rey-Long Liu & Zong-Xing Lin

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Pohang University of Science and Technology, San 31, Hyoja-dong, Nam-gu, 790-784, Pohang, Korea
Gary Geunbae Lee
School of Computing, The Robert Gordon University, St Andrew Street, AB25 1HG, Aberdeen, UK
Dawei Song
Microsoft Research Asia, 5F Beijing Sigma Center, 49 Zhichun Road, Haidian District, 100190, Beijing, P.R. China
Chin-Yew Lin
National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, 101-8430, Tokyo, Japan
Akiko Aizawa
School of Literature, Shirayuri College, 1-25 Midorigaoka, Chofu-shi, 182-8525, Tokyo, Japan
Kazuko Kuriyama
Graduate School of Information Science and Technology, Hokkaido University, North 14 West 9, Kita-ku, Sapporo-shi, 060-0814, Hokkaido, Japan
Masaharu Yoshioka
Microsoft Research Asia, 5F Beijing Sigma Center, 49 Zhichun Road, Haidian District, 100190, Beijing, P.R. China
Tetsuya Sakai

Cite this paper

Liu, RL., Lin, ZX. (2009). Improving Text Rankers by Term Locality Contexts. In: Lee, G.G., et al. Information Retrieval Technology. AIRS 2009. Lecture Notes in Computer Science, vol 5839. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04769-5_24

DOI: https://doi.org/10.1007/978-3-642-04769-5_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04768-8
Online ISBN: 978-3-642-04769-5
eBook Packages: Computer Science (R0)
