Abstract
We present a question answering system that can handle noisy and incomplete natural language data, together with methods and measures for evaluating question answering systems. The system is based on the vector space model and on linguistic analysis of the natural language data. In the evaluation, we test eight different preprocessing schemes for the data and conclude that lemmatization combined with splitting compound words into their constituents gives significantly better results than the baseline. The evaluation process is based on stratified random sampling and bootstrapping. To measure the correctness of an answer, we award partial credit as well as full credit.
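As a rough illustration of the evaluation idea mentioned in the abstract, the sketch below computes a bootstrap percentile interval for the mean answer score, resampling separately within each stratum. The credit values (1.0 for full, 0.5 for partial, 0.0 for none), the stratum names, the function name, and the percentile method are all assumptions made for this example; the abstract does not specify them, and this is not the authors' implementation.

```python
import random
from statistics import mean


def stratified_bootstrap_ci(scores_by_stratum, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean score (hypothetical helper).

    Resampling is done with replacement inside each stratum, so every
    resample keeps the original stratum sizes.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        resample = []
        for scores in scores_by_stratum.values():
            # Draw as many scores as the stratum originally contained.
            resample.extend(rng.choices(scores, k=len(scores)))
        means.append(mean(resample))
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Hypothetical answer scores: 1.0 = full credit, 0.5 = partial credit, 0.0 = wrong.
scores = {
    "stratum_a": [1.0, 0.5, 0.0, 1.0, 1.0, 0.5],
    "stratum_b": [0.0, 0.5, 1.0, 0.0, 0.5, 0.0],
}

low, high = stratified_bootstrap_ci(scores)
print(f"Bootstrap interval for the mean score: [{low:.3f}, {high:.3f}]")
```

Comparing such intervals across preprocessing schemes is one simple way to judge whether a scheme's improvement over a baseline is larger than the sampling noise.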
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Aunimo, L., Heinonen, O., Kuuskoski, R., Makkonen, J., Petit, R., Virtanen, O. (2003). Question Answering System for Incomplete and Noisy Data. In: Sebastiani, F. (ed.) Advances in Information Retrieval. ECIR 2003. Lecture Notes in Computer Science, vol 2633. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36618-0_14
DOI: https://doi.org/10.1007/3-540-36618-0_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-01274-0
Online ISBN: 978-3-540-36618-8
eBook Packages: Springer Book Archive