skip to main content
article

Accelerating XPath evaluation in any RDBMS

Published:01 March 2004Publication History
Skip Abstract Section

Abstract

This article is a proposal for a database index structure, the XPath accelerator, that has been specifically designed to support the evaluation of XPath path expressions. As such, the index is capable to support all XPath axes (including ancestor, following, preceding-sibling, descendant-or-self, etc.). This feature lets the index stand out among related work on XML indexing structures which had a focus on the child and descendant axes only. The index has been designed with a close eye on the XPath semantics as well as the desire to engineer its internals so that it can be supported well by existing relational database query processing technology: the index (a) permits set-oriented (or, rather, sequence-oriented) path evaluation, and (b) can be implemented and queried using well-established relational index structures, notably B-trees and R-trees.We discuss the implementation of the XPath accelerator on top of different database backends and show that the index performs well on all levels of the memory hierarchy, including disk-based and main-memory based database systems.

Skip Supplemental Material Section

Supplemental Material

References

  1. Altinel, M. and Franklin, M. J. 2000. Efficient filtering of XML documents for selective dissemination of information. In Proceedings of the 26th International Conference on Very Large Databases (VLDB) (Cairo, Egypt). Morgan-Kaufmann, San Francisco, Calif., 53--64.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Berglund, A., Boag, S., Chamberlin, D., Fernandez, M. F., Kay, M., Robie, J., and Siméon, J. 2002. XML Path Language (XPath) 2.0. Tech. Rep. W3C Working Draft, Version 2.0, World Wide Web Consortium. Aug. http://www.w3.org/TR/xpath20/.]]Google ScholarGoogle Scholar
  3. Boag, S., Chamberlin, D., Fernandez, M., Florescu, D., Robie, J., and Siméon, J. 2002. XQuery 1.0: An XML Query Language. Tech. Rep. W3C Working Draft, World Wide Web Consortium. Aug. http://www.w3.org/TR/xquery.]]Google ScholarGoogle Scholar
  4. Böhm, C., Berchtold, S., Kriegel, H.-P., and Michel, U. 2000. Multidimensional index structures in relational databases. J. Intel. Inf. Syst. (JIIS) 15, 1, 51--70.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Boncz, P. A. 2002. Monet: A next-generation DBMS kernel for query-intensive applications. Ph.D. dissertation. University of Amsterdam, The Netherlands.]]Google ScholarGoogle Scholar
  6. Boncz, P. A. and Kersten, M. L. 1999. MIL primitives for querying a fragmented world. The VLDB J. 8, 2, 101--119.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Chen, Z., Jagadish, H., Korn, F., Koudas, N., Muthukrishnan, S., Ng, R., and Srivastava, D. 2001. Counting twig matches in a tree. In Proceedings of the 17th International Conference on Data Engineering (ICDE) (Heidelberg, Germany). IEEE Computer Society Press, Los Alamitos, Calif., 595--604.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Cohen, E., Kaplan, H., and Milo, T. 2002. Labeling dynamic XML trees. In Proceedings of the 21st ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS) (Madison, Wisc.). ACM, New York, 271--121.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Cooper, B. F., Sample, N., Franklin, M. J., Hjaltason, G. R., and Shadmon, M. 2001. A fast index for semistructured data. In Proceedings of the 27th International Conference on Very Large Data Bases (VLDB) (Rome, Italy). Morgan-Kaufmann, San Francisco, Calif., 341--360.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Dietz, P. F. and Sleator, D. D. 1987. Two algorithms for maintaining order in a list. In Conference Record of the 19th Annual ACM Symposium on Theory of Computing (STOC). ACM, New York, 365--372.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Fernandez, M., Marsh, J., and Nagy, M. 2002. XQuery 1.0 and XPath 2.0 Data Model. Tech. Rep. W3C Working Draft, World Wide Web Consortium. Aug. http://www.w3.org/TR/ query-datamodel.]]Google ScholarGoogle Scholar
  12. Florescu, D. and Kossmann, D. 1999. A performance evaluation of alternative mapping schemes for storing XML data in a relational database. Tech. Rep. 3680. INRIA, Rocquencourt, France. May.]]Google ScholarGoogle Scholar
  13. Goldman, R. and Widom, J. 1997. DataGuides: Enabling query formulation and optimization in semistructured databases. In Proceedings of the 23rd International Conference on Very Large Databases (VLDB) (Athens, Greece). Morgan-Kaufmann, San Francisco, Calif., 436--445.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Gottlob, G., Koch, C., and Pichler, R. 2002. Efficient algorithms for processing XPath queries. In Proceedings of the 28th International Conference on Very Large Data Bases (VLDB) (Hong Kong, China). Morgan-Kaufmann, San Francisco, Calif., 95--106.]]Google ScholarGoogle Scholar
  15. Grust, T. 2002. Accelerating XPath location steps. In Proceedings of the 21st International ACM SIGMOD Conference on Management of Data (Madison, Wisc.). ACM, New York, 109--120.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Grust, T. and van Keulen, M. 2003. Tree awareness for relational database kernels: Staircase join. In Intelligent Search on XML, H. Blanken, H.-J. Schek, and G. Weikum, Eds. Lecture Notes in Computer Science, vol. 2818. Springer-Verlag, Heidelberg, Germany.]]Google ScholarGoogle Scholar
  17. Guttman, A. 1984. R-trees: A dynamic index structure for spatial searching. In SIGMOD 1984, Proceedings of Annual Meeting (Boston, Mass.). ACM, New York, 47--57.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Hellerstein, J. M., Naughton, J. F., and Pfeffer, A. 1995. Generalized search trees for database systems. In Proceedings of the 21st International Conference on Very Large Databases (VLDB) (Zurich, Switzerland). Morgan-Kaufmann, San Francisco, Calif., 562--573.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Kamel, I. and Faloutsos, C. 1993. On packing R-trees. In Proceedings of the 2nd International Conference on Information and Knowledge Management (CIKM) (Washington D.C.). ACM, New York, 490--499.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Kaushik, R., Bohannon, P., Naughton, J. F., and Korth, H. K. 2002. Covering indexes for branching path Queries. In Proceedings of the 21st International ACM SIGMOD Conference on Management of Data (Madison, Wisc.). ACM, New York, 133--144.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Kriegel, H.-P., Pötke, M., and Seidl, T. 2000. Managing intervals efficiently in object-relational databases. In Proceedings of the 26th International Conference on Very Large Databases (VLDB) (Cairo, Egypt). Morgan-Kaufmann, San Francisco, Calif., 407--418.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Li, Q. and Moon, B. 2001. Indexing and querying XML data for regular path expressions. In Proceedings of the 27th International Conference on Very Large Data Bases (VLDB) (Rome, Italy). Morgan-Kaufmann, San Francisco, Calif., 361--370.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Olteanu, D., Meuss, H., Furche, T., and Bry, F. 2001. Symmetry in XPath. Tech. Rep. PMS-FB-2001-16. Institute of Computer Science, University of Munich, Munich, Germany.]]Google ScholarGoogle Scholar
  24. Roussopoulos, N. and Leifker, D. 1985. Direct spatial search on pictorial databases using packed R-trees. In Proceedings of the ACM SIGMOD International Conference on Management of Data (Austin, Tex.). ACM, New York, 17--31.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. SAX (Simple API for XML). http://sax.sourceforge.net/.]]Google ScholarGoogle Scholar
  26. Schmidt, A., Waas, F., Kersten, M., Carey, M. J., Manolescu, I., and Busse, R. 2002. XMark: A Benchmark for XML Data Management. In Proceedings of the 28th International Conference on Very Large Databases (VLDB) (Honk Kong, China). Morgan-Kaufmann, San Francisco, Calif., 974--985.]]Google ScholarGoogle Scholar
  27. Shanmugasundaram, J., Tufte, K., He, G., Zhang, C., DeWitt, D., and Naughton, J. 1999. Relational databases for querying XML documents: Limitations and opportunities. In Proceedings of the 25th International Conference on Very Large Data Bases (VLDB) (Edinburgh, Scotland). Morgan-Kaufmann, San Francisco, Calif., 302--314.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Suciu, D. and Milo, T. 1999. Index structures for path expressions. In Proceedings of the 7th International Conference on Database Theory (ICDT) (Jerusalem, Israel). Lecture Notes in Computer Science, vol. 1540. Springer-Verlag, New York, 277--295.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Wu, Y., Patel, J. M., and Jagadish, H. 2002. Estimating answer sizes for XML queries. In Proceedings of the 8th International Conference on Extending Database Technology (EDBT) (Prague, Czech Republic). Springer-Verlag, New York, 590--608.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Zhang, C., Naughton, J., DeWitt, D., Luo, Q., and Lohman, G. 2001. On supporting containment queries in relational database management systems. In Proceedings of the ACM SIGMOD International Conference on Management of Data (Santa Barbara, Calif.). ACM, New York, 425--436.]] Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Accelerating XPath evaluation in any RDBMS

                    Recommendations

                    Comments

                    Login options

                    Check if you have access through your login credentials or your institution to get full access on this article.

                    Sign in

                    Full Access

                    • Published in

                      cover image ACM Transactions on Database Systems
                      ACM Transactions on Database Systems  Volume 29, Issue 1
                      March 2004
                      232 pages
                      ISSN:0362-5915
                      EISSN:1557-4644
                      DOI:10.1145/974750
                      Issue’s Table of Contents

                      Copyright © 2004 ACM

                      Publisher

                      Association for Computing Machinery

                      New York, NY, United States

                      Publication History

                      • Published: 1 March 2004
                      Published in tods Volume 29, Issue 1

                      Permissions

                      Request permissions about this article.

                      Request Permissions

                      Check for updates

                      Qualifiers

                      • article

                    PDF Format

                    View or Download as a PDF file.

                    PDF

                    eReader

                    View online with eReader.

                    eReader