Abstract
When ranking texts retrieved for a query, the semantics of each term t in the texts is fundamental. These semantics often depend on the locality context of t, that is, the terms neighboring t in the texts. In this paper, we present CTFA4TR, a technique that improves text rankers by encoding term locality contexts into the assessment of the term frequency (TF) of each term in the texts. The resulting TF values can be used directly by various kinds of text rankers, without revising the rankers' algorithms or development processes. Moreover, CTFA4TR is efficient enough to perform the TF assessment online, and it requires neither a training process nor training data. Empirical evaluation shows that CTFA4TR significantly improves various kinds of text rankers. The contributions are of practical significance: many text rankers have been developed, and any of them that considers TF in ranking may use CTFA4TR to enhance its performance at no cost to the ranker itself.
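The abstract does not give the CTFA4TR formula, so the following is only a minimal sketch of the general idea of a context-aware TF, not the authors' method. It assumes a hypothetical weighting in which each occurrence of a term counts as 1 plus a bonus for the other query terms found within a fixed window around it. The names `context_weighted_tf`, `raw_tf`, `tf_sum_score`, and the `window` parameter are all illustrative assumptions.

```python
from typing import Iterable, Sequence


def raw_tf(doc_tokens: Sequence[str], term: str) -> float:
    """Plain term frequency: the number of occurrences of `term`."""
    return float(sum(1 for tok in doc_tokens if tok == term))


def context_weighted_tf(
    doc_tokens: Sequence[str],
    term: str,
    query_terms: Sequence[str],
    window: int = 3,  # hypothetical neighborhood size, not from the paper
) -> float:
    """Hypothetical context-aware TF (an assumption, not CTFA4TR itself).

    Each occurrence of `term` contributes 1.0 plus the fraction of the
    other distinct query terms that appear within `window` tokens of it.
    """
    others = {q for q in query_terms if q != term}
    total = 0.0
    for i, tok in enumerate(doc_tokens):
        if tok != term:
            continue
        lo = max(0, i - window)
        hi = min(len(doc_tokens), i + window + 1)
        neighbours = set(doc_tokens[lo:i]) | set(doc_tokens[i + 1:hi])
        bonus = len(others & neighbours) / len(others) if others else 0.0
        total += 1.0 + bonus
    return total


def tf_sum_score(tf_values: Iterable[float]) -> float:
    """Stand-in ranker that only ever sees TF values (illustrative)."""
    return sum(tf_values)


if __name__ == "__main__":
    query = ["term", "proximity"]
    near = "term proximity scoring improves retrieval".split()
    far = "term a b c d e proximity".split()
    for name, doc in (("near", near), ("far", far)):
        plain = tf_sum_score(raw_tf(doc, q) for q in query)
        ctx = tf_sum_score(context_weighted_tf(doc, q, query) for q in query)
        print(name, plain, ctx)
```

In this sketch the stand-in ranker `tf_sum_score` is unchanged between the two runs; only the TF values it receives differ. That mirrors the abstract's claim that context-aware TF values can be supplied to an existing TF-based ranker without revising the ranker.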
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, RL., Lin, ZX. (2009). Improving Text Rankers by Term Locality Contexts. In: Lee, G.G., et al. Information Retrieval Technology. AIRS 2009. Lecture Notes in Computer Science, vol 5839. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04769-5_24
DOI: https://doi.org/10.1007/978-3-642-04769-5_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04768-8
Online ISBN: 978-3-642-04769-5
eBook Packages: Computer Science, Computer Science (R0)