Phrase Matching in XML

https://doi.org/10.1016/B978-012722442-8/50024-0

Publisher Summary

Phrase matching is a common information retrieval (IR) technique for searching text and identifying relevant documents in a document collection. Phrase matching in XML presents new challenges because text may be interleaved with arbitrary markup, thwarting search techniques that require strict contiguity or close proximity of keywords. This chapter presents a technique for phrase matching in XML that permits dynamic specification of both the phrase to be matched and the markup to be ignored. It develops an effective algorithm for the technique that uses inverted indices on phrase words and XML tags. It also describes experimental results, comparing the algorithm to an indexed nested-loop algorithm, that illustrate the algorithm's efficiency.
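To make the idea concrete, here is a minimal Python sketch of phrase matching that skips a caller-specified set of tags. It is not the chapter's algorithm: the function names (`tokenize`, `build_index`, `match_phrase`), the regex tokenizer, and the position-based index are all assumptions made for illustration. It indexes only words (not tags), skips individual tags rather than whole elements, and assumes simple well-formed fragments with no comments or CDATA sections.

```python
import re
from collections import defaultdict

# Hypothetical tokenizer: a tag is anything between '<' and '>',
# a word is any run of characters without whitespace or angle brackets.
TOKEN = re.compile(r"<[^>]+>|[^\s<>]+")


def tokenize(xml):
    """Return a list of (kind, value) tokens; kind is 'tag' or 'word'."""
    tokens = []
    for m in TOKEN.finditer(xml):
        text = m.group()
        if text.startswith("<"):
            # Keep only the element name; open and close tags look the same.
            name = text.strip("</>").split()[0].lower()
            tokens.append(("tag", name))
        else:
            word = text.strip(".,;:!?\"'()").lower()
            if word:
                tokens.append(("word", word))
    return tokens


def build_index(tokens):
    """Inverted index: word -> list of token positions where it occurs."""
    index = defaultdict(list)
    for pos, (kind, value) in enumerate(tokens):
        if kind == "word":
            index[value].append(pos)
    return index


def match_phrase(tokens, index, phrase, ignore_tags):
    """Return (start, end) token positions of each phrase occurrence.

    Tags named in ignore_tags may appear between phrase words;
    any other tag or word breaks the match.
    """
    words = phrase.lower().split()
    if not words:
        return []
    ignore = {t.lower() for t in ignore_tags}
    hits = []
    # Candidate starts come from the index entry for the first phrase word.
    for start in index.get(words[0], []):
        pos = start
        matched = True
        for word in words[1:]:
            pos += 1
            # Step over markup the caller asked us to ignore.
            while (pos < len(tokens) and tokens[pos][0] == "tag"
                   and tokens[pos][1] in ignore):
                pos += 1
            if pos >= len(tokens) or tokens[pos] != ("word", word):
                matched = False
                break
        if matched:
            hits.append((start, pos))
    return hits


doc = "<p>The <b>quick</b> brown <i>fox</i> jumps</p>"
tokens = tokenize(doc)
index = build_index(tokens)
print(match_phrase(tokens, index, "quick brown fox", {"b", "i"}))
print(match_phrase(tokens, index, "quick brown fox", {"b"}))
```

Because both the phrase and the ignore set are arguments to `match_phrase`, they can change per query without rebuilding the index, which mirrors the "dynamic specification" described above. In this sketch the second call finds nothing, since `<i>` is not in its ignore set and therefore breaks contiguity.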
