ABSTRACT
Keyword search enables web users to easily access XML data without having to learn a structured query language or study possibly complex data schemas. Existing work has addressed the problem of selecting qualified data nodes that match keywords and connecting them in a meaningful way, in the spirit of inferring a where clause in XQuery. However, how to infer the return clause for keyword search remains an open problem.
To address this challenge, we present an XML keyword search engine, XSeek, which infers the semantics of a search and identifies return nodes effectively. XSeek recognizes possible entities and attributes inherently represented in the data. It also distinguishes between search predicates and return specifications in the keywords. Then, based on an analysis of both XML data structures and keyword match patterns, XSeek generates return nodes. Extensive experimental studies show the effectiveness of XSeek.
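The two inference steps named above can be illustrated with a minimal sketch. It is not XSeek's implementation: the abstract does not state the rules used, so the heuristics here are assumptions made for illustration. A tag that repeats under one parent is treated as an entity, a non-repeating leaf that holds a text value as an attribute, and anything else as a connection node. A keyword that matches a tag name is treated as a return specification, and any other keyword as a search predicate. The sample document and the function names `classify_nodes` and `split_keywords` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data, used only for illustration.
SAMPLE = """<league>
  <team>
    <name>Rockets</name>
    <player><name>Ming</name><position>center</position></player>
    <player><name>Tracy</name><position>forward</position></player>
  </team>
  <team>
    <name>Mavericks</name>
    <player><name>Dirk</name><position>forward</position></player>
  </team>
</league>"""


def classify_nodes(root):
    """Label each tag as 'entity', 'attribute', or 'connection'.

    Assumed heuristic (not taken from the abstract): a tag that occurs
    more than once under the same parent is an entity; a non-repeating
    leaf with a text value is an attribute; the rest are connections.
    """
    labels = {}
    for parent in root.iter():
        counts = Counter(child.tag for child in parent)
        for child in parent:
            if counts[child.tag] > 1:
                # Repetition anywhere in the data is enough to mark an entity.
                labels[child.tag] = "entity"
            elif child.tag not in labels:
                is_leaf = len(child) == 0 and (child.text or "").strip()
                labels[child.tag] = "attribute" if is_leaf else "connection"
    labels.setdefault(root.tag, "connection")
    return labels


def split_keywords(root, keywords):
    """Split keywords into (search predicates, return specifications).

    Assumed heuristic: a keyword equal to some tag name names what the
    user wants returned; every other keyword is a value to match.
    """
    tags = {element.tag.lower() for element in root.iter()}
    returns = [k for k in keywords if k.lower() in tags]
    predicates = [k for k in keywords if k.lower() not in tags]
    return predicates, returns


root = ET.fromstring(SAMPLE)
labels = classify_nodes(root)
predicates, returns = split_keywords(root, ["Rockets", "player"])
```

Under these assumptions, the query "Rockets player" yields "Rockets" as a predicate and "player" as a return specification, and `player` is labeled an entity, which suggests returning player subtrees rather than only the matching `name` node.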