Abstract
The term relevance weighting method has been shown to produce optimal information retrieval queries under well-defined conditions. Unfortunately, the parameters needed to generate the term relevance factors cannot be estimated accurately in practice; furthermore, in realistic test situations, it appears difficult to obtain better retrieval results with the term relevance weights than with much simpler term weighting systems such as the inverse document frequency weights. It is shown in this study that the inverse document frequency weights and the term relevance weights are closely related over a wide range of the frequency spectrum. Methods are introduced for estimating the term relevance weights, and experimental results are given comparing the inverse document frequency with the estimated term relevance weights.
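The abstract does not state the two weighting formulas, so the following is only a minimal sketch under stated assumptions. It uses commonly cited textbook forms: an inverse document frequency weight of log(N/n), and a Robertson/Sparck Jones style term relevance weight with a 0.5 smoothing constant. The function names, parameter names, and example counts are hypothetical and are not taken from the paper; the paper's own estimation methods may differ.

```python
import math


def idf_weight(n_docs: int, doc_freq: int) -> float:
    """Inverse document frequency, assumed here to be log(N / n)."""
    return math.log(n_docs / doc_freq)


def term_relevance_weight(
    n_docs: int,
    doc_freq: int,
    n_rel: int,
    rel_doc_freq: int,
    k: float = 0.5,
) -> float:
    """Term relevance weight in an assumed Robertson/Sparck Jones style form.

    n_docs       -- N, number of documents in the collection
    doc_freq     -- n, number of documents containing the term
    n_rel        -- R, number of known relevant documents for the query
    rel_doc_freq -- r, number of relevant documents containing the term
    k            -- smoothing constant (0.5 is a common choice, assumed here)
    """
    # Odds that the term occurs in a relevant document.
    rel_odds = (rel_doc_freq + k) / (n_rel - rel_doc_freq + k)
    # Odds that the term occurs in a nonrelevant document.
    nonrel_odds = (doc_freq - rel_doc_freq + k) / (
        n_docs - doc_freq - n_rel + rel_doc_freq + k
    )
    return math.log(rel_odds / nonrel_odds)


if __name__ == "__main__":
    # Hypothetical collection: 1000 documents, term occurs in 10 of them.
    N, n = 1000, 10
    print("idf:", idf_weight(N, n))
    # With no relevance information (R = r = 0) the two weights are close.
    print("term relevance, no feedback:", term_relevance_weight(N, n, 0, 0))
    # With feedback: 5 relevant documents known, 4 contain the term.
    print("term relevance, feedback:", term_relevance_weight(N, n, 5, 4))
```

Under these assumptions, a rare term with no relevance information receives nearly the same value from both functions, which illustrates the kind of close relationship between the two weights that the abstract describes; the weights diverge once relevance counts concentrate the term in relevant documents.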
A comparison of search term weighting: term relevance vs. inverse document frequency
SIGIR '81: Proceedings of the 4th annual international ACM SIGIR conference on Information storage and retrieval: theoretical issues in information retrieval