
Axiomatic Analysis of Translation Language Model for Information Retrieval

  • Conference paper
Advances in Information Retrieval (ECIR 2012)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7224)

Abstract

Statistical translation models have been shown to outperform simple document language models, which rely on exact matching of words between the query and documents. A main challenge in applying translation models to ad hoc information retrieval is estimating a translation model without training data. In this paper, we perform an axiomatic analysis of the translation language model for retrieval to gain insight into how to optimize the estimation of translation probabilities. We propose a set of constraints that a reasonable translation language model should satisfy. We check these constraints against the state-of-the-art translation estimation method based on Mutual Information and find that it does not satisfy most of them. We then propose a new estimation method that better satisfies the defined constraints. Experimental results on representative TREC data sets show that the proposed estimation method outperforms the existing Mutual Information-based estimation, suggesting that the proposed constraints are indeed helpful for designing better estimation methods for translation language models.
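
To make the contrast with exact matching concrete, the following is a minimal sketch of how a translation language model can score a document, assuming the standard mixture form p(w|d) = Σ_u p_t(w|u) · p(u|d). It is not the estimation method proposed in the paper. The function name `translation_lm_score`, the toy translation table `p_t`, the Jelinek-Mercer smoothing weight `lam`, and the fallback probability are all illustrative assumptions.

```python
from collections import Counter
from math import log


def translation_lm_score(query, doc, p_t, collection_prob, lam=0.5):
    """Log query likelihood under a translation language model.

    query, doc: lists of tokens.
    p_t: hypothetical translation table, p_t[u][w] = p_t(w | u).
    collection_prob: background probabilities p(w | C).
    lam: Jelinek-Mercer smoothing weight (an assumption of this sketch).
    """
    counts = Counter(doc)
    doc_len = len(doc)
    score = 0.0
    for w in query:
        # p(w|d) = sum over document words u of p_t(w|u) * p_ml(u|d)
        p_trans = 0.0
        if doc_len > 0:
            p_trans = sum(
                p_t.get(u, {}).get(w, 0.0) * c / doc_len
                for u, c in counts.items()
            )
        # Smooth with the collection model so the log is always defined.
        p = (1 - lam) * p_trans + lam * collection_prob.get(w, 1e-9)
        score += log(p)
    return score


# Toy data: "car" translates to "auto" with probability 0.3.
p_t = {
    "car": {"car": 0.7, "auto": 0.3},
    "engine": {"engine": 1.0},
    "tree": {"tree": 1.0},
    "leaf": {"leaf": 1.0},
}
# Identity table: every word translates only to itself (exact matching).
identity = {u: {u: 1.0} for u in p_t}
collection_prob = {"auto": 0.01}

doc_a = ["car", "engine"]
doc_b = ["tree", "leaf"]
query = ["auto"]

score_a = translation_lm_score(query, doc_a, p_t, collection_prob)
score_b = translation_lm_score(query, doc_b, p_t, collection_prob)
exact_a = translation_lm_score(query, doc_a, identity, collection_prob)
exact_b = translation_lm_score(query, doc_b, identity, collection_prob)
```

With the toy table, the query word "auto" gives the document containing "car" a higher score than the unrelated document, whereas the identity table (exact matching) cannot tell the two apart. How the entries of `p_t` should be estimated without training data is the question the paper addresses.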

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Karimzadehgan, M., Zhai, C. (2012). Axiomatic Analysis of Translation Language Model for Information Retrieval. In: Baeza-Yates, R., et al. Advances in Information Retrieval. ECIR 2012. Lecture Notes in Computer Science, vol 7224. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28997-2_23

  • DOI: https://doi.org/10.1007/978-3-642-28997-2_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28996-5

  • Online ISBN: 978-3-642-28997-2

  • eBook Packages: Computer Science, Computer Science (R0)
