ABSTRACT
Building an effective information retrieval system requires various design choices, ranging from the weighting scheme to the type of morphological normalization. The combination of runs has become a standard technique to reap the benefits of different run types. Until now, systematic studies of the effectiveness of combination strategies have only been carried out for English. This paper provides an exploratory overview of the effectiveness of combination methods in nine European languages. We demonstrate that the combination of effective information retrieval strategies can lead to significant improvements in retrieval effectiveness. Furthermore, we analyze the relative impact of retrieving more relevant documents and of improved ranking of relevant documents. The experimental evidence is obtained using the 2003 test suite of the Cross-Language Evaluation Forum (CLEF).