Comparing the Retrieval Performance of English and Japanese Text Databases

Natural Language Processing Using Very Large Corpora

Part of the book series: Text, Speech and Language Technology (TLTB, volume 11)

Abstract

Text retrieval systems provide a good test-bed for language processing technologies. Any qualitative or quantitative aspect of a language, namely its lexicon, morphology, syntax, semantics and pragmatics, can be applied to these systems. A query, as a representation of the user’s information need, is entered into a retrieval system, and the system retrieves the relevant documents from a full-text database that may hold gigabytes of text. Information retrieval (IR) relies on the linguistic and statistical characteristics of the text. A comparative study of IR performance between two languages therefore helps in understanding the role of language in the retrieval process.

Copyright information

© 1999 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Fujii, H., Croft, W.B. (1999). Comparing the Retrieval Performance of English and Japanese Text Databases. In: Armstrong, S., Church, K., Isabelle, P., Manzi, S., Tzoukermann, E., Yarowsky, D. (eds) Natural Language Processing Using Very Large Corpora. Text, Speech and Language Technology, vol 11. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2390-9_17

  • DOI: https://doi.org/10.1007/978-94-017-2390-9_17

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-90-481-5349-7

  • Online ISBN: 978-94-017-2390-9

  • eBook Packages: Springer Book Archive
