Abstract
Text retrieval systems provide a good test-bed for language processing technologies. Any qualitative or quantitative aspect of a language, namely its lexicon, morphology, syntax, semantics and pragmatics, can be applied in these systems. A query, as a representation of the user’s information need, is entered into a retrieval system, and the system retrieves the relevant documents from a full-text database (possibly gigabytes in size). Information retrieval (IR) relies on the linguistic and statistical characteristics of the text. A comparative study of IR performance between two languages will help clarify the role of language in the retrieval process.
Copyright information
© 1999 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Fujii, H., Croft, W.B. (1999). Comparing the Retrieval Performance of English and Japanese Text Databases. In: Armstrong, S., Church, K., Isabelle, P., Manzi, S., Tzoukermann, E., Yarowsky, D. (eds) Natural Language Processing Using Very Large Corpora. Text, Speech and Language Technology, vol 11. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2390-9_17
DOI: https://doi.org/10.1007/978-94-017-2390-9_17
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-5349-7
Online ISBN: 978-94-017-2390-9
eBook Packages: Springer Book Archive