Regular Expression Indexing

2002; Chan, Garofalakis, Rastogi

Keywords and Synonyms

Regular expression indexing; Regular expression retrieval

Problem Definition

Regular expressions (REs) provide an expressive and powerful formalism for capturing the structure of messages, events, and documents. Consequently, they have been used extensively in the specification of a number of languages for important application domains, including the XPath pattern language for XML documents [6], and the policy language of the Border Gateway Protocol (BGP) for propagating routing information between autonomous systems in the Internet [12]. Many of these applications have to manage large databases of RE specifications and need to provide an effective matching mechanism that, given an input string, quickly identifies all the REs in the database that match it. This RE retrieval problem is therefore important for a variety of software components in the middleware and networking infrastructure of the Internet.
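The matching task described above can be illustrated with a brute-force baseline that tests the input string against every RE in the database. This is only a minimal sketch under stated assumptions: Python's re syntax stands in for the RE specifications, the names build_database and retrieve_matching and the sample patterns are hypothetical, and a match is taken to mean that the whole input string belongs to the RE's language. It is not the index structure of [4, 5]; it only shows the linear scan whose per-query cost grows with the size of the database.

```python
import re
from typing import Dict, List


def build_database(specs: Dict[str, str]) -> Dict[str, "re.Pattern[str]"]:
    """Compile each RE specification once so repeated lookups skip recompilation."""
    return {name: re.compile(pattern) for name, pattern in specs.items()}


def retrieve_matching(database: Dict[str, "re.Pattern[str]"], text: str) -> List[str]:
    """Return the names of all REs that match the entire input string.

    Assumption: "match" means the whole string is in the RE's language,
    so fullmatch is used rather than search.
    """
    return [name for name, regex in database.items() if regex.fullmatch(text)]


# Hypothetical database of RE specifications (illustrative only).
db = build_database({
    "starts_with_a": r"a(a|b)*",
    "ends_with_b": r"(a|b)*b",
    "only_b": r"b*",
})

print(retrieve_matching(db, "aab"))  # ['starts_with_a', 'ends_with_b']
```

Every query here touches every RE, which is the cost that motivates an index over the RE database.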

The RE retrieval problem can be stated as follows: Given...

Recommended Reading

  1. Altinel, M., Franklin, M.: Efficient filtering of XML documents for selective dissemination of information. In: Proceedings of 26th International Conference on Very Large Data Bases, Cairo, Egypt, pp. 53–64. Morgan Kaufmann, Missouri (2000)

  2. Beckmann, N., Kriegel, H.-P., Schneider, R., Seeger, B.: The R*-Tree: An efficient and robust access method for points and rectangles. In: Proceedings of the ACM International Conference on Management of Data, Atlantic City, New Jersey, pp. 322–331. ACM Press, New York (1990)

  3. Chan, C.-Y., Felber, P., Garofalakis, M., Rastogi, R.: Efficient filtering of XML documents with XPath expressions. In: Proceedings of the 18th International Conference on Data Engineering, San Jose, California, pp. 235–244. IEEE Computer Society, New Jersey (2002)

  4. Chan, C.-Y., Garofalakis, M., Rastogi, R.: RE-Tree: An efficient index structure for regular expressions. In: Proceedings of 28th International Conference on Very Large Data Bases, Hong Kong, China, pp. 251–262. Morgan Kaufmann, Missouri (2002)

  5. Chan, C.-Y., Garofalakis, M., Rastogi, R.: RE-Tree: An efficient index structure for regular expressions. VLDB J. 12(2), 102–119 (2003)

  6. Clark, J., DeRose, S.: XML Path Language (XPath) Version 1.0. W3C Recommendation, http://www.w3.org./TR/xpath, Accessed Nov 1999

  7. Diao, Y., Fischer, P., Franklin, M., To, R.: YFilter: Efficient and scalable filtering of XML documents. In: Proceedings of the 18th International Conference on Data Engineering, San Jose, California, pp. 341–342. IEEE Computer Society, New Jersey (2002)

  8. Guttman, A.: R-Trees: A dynamic index structure for spatial searching. In: Proceedings of the ACM International Conference on Management of Data, Boston, Massachusetts, pp. 47–57. ACM Press, New York (1984)

  9. Hopcroft, J., Ullman, J.: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Massachusetts (1979)

  10. Kannan, S., Sweedyk, Z., Mahaney, S.: Counting and random generation of strings in regular languages. In: Proceedings of the 6th ACM-SIAM Symposium on Discrete Algorithms, San Francisco, California, pp. 551–557. ACM Press, New York (1995)

  11. Rissanen, J.: Modeling by Shortest Data Description. Automatica 14, 465–471 (1978)

  12. Stewart, J.W.: BGP4, Inter-Domain Routing in the Internet. Addison-Wesley, Massachusetts (1998)

Copyright information

© 2008 Springer-Verlag

Cite this entry

Chan, CY., Garofalakis, M., Rastogi, R. (2008). Regular Expression Indexing. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-30162-4_339
