DOI: 10.1145/2095536.2095542
research-article

A workload-aware approach for optimizing the XML schema design trade-off

Published: 05 December 2011

ABSTRACT

In general, designing XML schemas involves translating conceptual schemas into XML schemas that aim to be (i) normalized and (ii) connected, so that queries perform well. These two requirements conflict: highly connected XML structures admit data redundancy, whereas normalized schemas produce disconnected XML structures. This paper describes a workload-based approach that balances this trade-off when translating conceptual schemas into XML structures. An experimental study on an XML database shows that our XML schemas provide high query performance on the elements relevant to the workload and, at the same time, a low cost of data redundancy on elements that are not relevant for update operations.
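To make the trade-off concrete, here is a minimal Python sketch. It is not the paper's algorithm; the order/customer data, element names, and helper functions are hypothetical and chosen only for illustration. It contrasts a connected (nested) structure, where a query is a single path navigation but the customer is stored redundantly, with a normalized structure, where the customer is stored once but the query needs a join on an identifier.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: two orders placed by the same customer.
# Connected design: the customer is nested (and duplicated) in each order.
NESTED = """
<orders>
  <order id="o1"><customer id="c1"><name>Ana</name></customer></order>
  <order id="o2"><customer id="c1"><name>Ana</name></customer></order>
</orders>
"""

# Normalized design: the customer is stored once and referenced by id.
NORMALIZED = """
<db>
  <customers><customer id="c1"><name>Ana</name></customer></customers>
  <orders>
    <order id="o1" customer="c1"/>
    <order id="o2" customer="c1"/>
  </orders>
</db>
"""


def customer_name_nested(root, order_id):
    # Connected structure: one path navigation, no join.
    return root.find(f"./order[@id='{order_id}']/customer/name").text


def customer_name_normalized(root, order_id):
    # Normalized structure: read the reference, then join on the id.
    ref = root.find(f"./orders/order[@id='{order_id}']").get("customer")
    return root.find(f"./customers/customer[@id='{ref}']/name").text


def copies_to_update(root, customer_id):
    # Number of customer elements an update would have to touch.
    return len(root.findall(f".//customer[@id='{customer_id}']"))


nested = ET.fromstring(NESTED)
normalized = ET.fromstring(NORMALIZED)
```

In this toy setting, renaming the customer touches two elements in the nested document and one in the normalized document, while the name lookup takes one navigation step in the nested document and two in the normalized one. Which side of the trade-off matters more depends on how often each element is queried or updated, which is the kind of information a workload supplies.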


    Published in

      iiWAS '11: Proceedings of the 13th International Conference on Information Integration and Web-based Applications and Services
      December 2011
      572 pages
      ISBN: 9781450307840
      DOI: 10.1145/2095536

      Copyright © 2011 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States
