DOI: 10.1145/1030397.1030441
Article

Towards efficient implementation of XML schema content models

Published: 28 October 2004

ABSTRACT

XML Schema uses an extension of traditional regular expressions to describe the allowed contents of document elements. Iteration is described through the numeric attributes minOccurs and maxOccurs attached to content-describing elements such as sequence, choice, and element. These numeric occurrence indicators are a challenge to standard automata-based solutions. Straightforward solutions require space that is exponential with respect to the length of the expressions. We describe a strategy to implement unambiguous content model expressions as counter automata, which are of linear size only.
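
To make the idea of counting concrete, here is a minimal Python sketch. It is not the construction from the paper; it only illustrates how a counter can stand in for unrolled copies of a repeated particle. It assumes a flat sequence of element particles, each with a minOccurs and maxOccurs bound, and assumes the model is unambiguous in the sense that a greedy left-to-right scan never needs to backtrack. The names `Particle` and `matches` are hypothetical.

```python
from typing import List, Optional, Tuple

# (element name, minOccurs, maxOccurs); maxOccurs=None stands for "unbounded".
Particle = Tuple[str, int, Optional[int]]


def matches(model: List[Particle], children: List[str]) -> bool:
    """Check a list of child element names against a flat sequence model.

    The matcher keeps only two values: the index of the current particle
    (the state) and how many times it has matched so far (the counter).
    """
    i = 0      # index of the current particle
    count = 0  # occurrences of the current particle seen so far
    for name in children:
        while i < len(model):
            pname, lo, hi = model[i]
            if name == pname and (hi is None or count < hi):
                count += 1          # stay in the same state, bump the counter
                break
            if count < lo:
                return False        # cannot leave a particle below minOccurs
            i += 1                  # move to the next particle
            count = 0               # and reset the counter
        else:
            return False            # no particle left to accept this child
    # At end of input, every remaining particle must already be satisfied.
    while i < len(model):
        if count < model[i][1]:
            return False
        i += 1
        count = 0
    return True


# Hypothetical model: a{2,3}, b{0,1}, c{1,unbounded}
model = [("a", 2, 3), ("b", 0, 1), ("c", 1, None)]
print(matches(model, ["a", "a", "c"]))
print(matches(model, ["a", "c"]))
```

In this sketch the model's size is independent of the numeric bounds: a particle with maxOccurs of 1000000 still occupies one entry, whereas expanding it into explicit copies would need on the order of a million states.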

Published in

DocEng '04: Proceedings of the 2004 ACM symposium on Document engineering
October 2004
252 pages
ISBN: 1581139381
DOI: 10.1145/1030397

Copyright © 2004 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States

Qualifiers

Article

Acceptance Rates

Overall Acceptance Rate: 178 of 537 submissions, 33%