ABSTRACT
XML Schema uses an extension of traditional regular expressions to describe the allowed contents of document elements. Iteration is described through the numeric attributes <b>minOccurs</b> and <b>maxOccurs</b>, which are attached to content-describing elements such as <b>sequence</b>, <b>choice</b>, and <b>element</b>. These numeric occurrence indicators pose a challenge to standard automata-based solutions: straightforward solutions require space that is exponential in the length of the expressions. We describe a strategy to implement unambiguous content model expressions as <i>counter automata</i>, which are of linear size only.
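The idea of replacing unrolled states with counters can be illustrated by a minimal sketch. It is not the construction described in the paper. It assumes a much simpler setting: a flat sequence of element particles with pairwise distinct names, each carrying its own minOccurs and maxOccurs. The names `Particle` and `validate` are hypothetical and chosen only for this illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (element name, minOccurs, maxOccurs).
# maxOccurs = None stands for "unbounded".
Particle = Tuple[str, int, Optional[int]]


def validate(particles: List[Particle], children: List[str]) -> bool:
    """Check a list of child element names against a flat sequence of
    particles. Assumes all particle names are distinct, so every input
    symbol determines the next step without lookahead."""
    i = 0      # index of the current particle (the automaton state)
    count = 0  # occurrences of the current particle seen so far

    for name in children:
        while True:
            if i >= len(particles):
                return False  # input continues past the last particle
            pname, lo, hi = particles[i]
            if name == pname and (hi is None or count < hi):
                count += 1    # stay in the same state, bump the counter
                break
            if count < lo:
                return False  # leaving the particle before minOccurs is met
            i += 1            # advance to the next particle
            count = 0         # and reset the counter

    # At end of input every remaining particle must be satisfied.
    while i < len(particles):
        if count < particles[i][1]:
            return False
        i += 1
        count = 0
    return True


model = [("a", 2, 3), ("b", 0, 1), ("c", 1, None)]
print(validate(model, ["a", "a", "c"]))
print(validate(model, ["a", "c"]))
```

In this sketch the number of states equals the number of particles regardless of how large the occurrence bounds are, whereas unrolling `a{2,3}` into plain states would need one state per allowed repetition. Nested and ambiguous expressions need the more careful treatment that the paper addresses.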
Index Terms
- Towards efficient implementation of XML schema content models