DOI: 10.1145/1544012.1544037
research-article

Extending finite automata to efficiently match Perl-compatible regular expressions

Published: 09 December 2008

ABSTRACT

Regular expression matching is a crucial task in several networking applications. Current implementations are based on one of two types of finite state machines. Non-deterministic finite automata (NFAs) have minimal storage demand but have high memory bandwidth requirements. Deterministic finite automata (DFAs) exhibit low and deterministic memory bandwidth requirements at the cost of increased memory space. It has already been shown how the presence of wildcards and repetitions of large character classes can render DFAs and NFAs impractical. Additionally, recent security-oriented rule-sets include patterns with advanced features, namely back-references, which add to the expressive power of traditional regular expressions and cannot therefore be supported through classical finite automata.
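The space cost of DFAs mentioned above can be illustrated with a small experiment. The sketch below is not taken from the paper: it assumes a two-letter alphabet and the textbook pattern `.*a.{n}` (an `a` followed by exactly `n` arbitrary characters at the end of the input), and the helper names `nfa_step`, `nfa_state_count`, and `count_dfa_states` are hypothetical. It builds the DFA by subset construction and counts its reachable states, which grow exponentially in `n` while the NFA grows linearly.

```python
from collections import deque


def nfa_step(states, ch, n):
    """Advance the NFA for `.*a.{n}` by one input character.

    State 0 carries the leading `.*`; states 1..n count the wildcard
    positions after the `a`; state n + 1 is the accepting state.
    """
    nxt = set()
    for s in states:
        if s == 0:
            nxt.add(0)          # `.*` self-loop on every character
            if ch == "a":
                nxt.add(1)      # this `a` may start the final match
        elif s <= n:
            nxt.add(s + 1)      # consume one of the n wildcard characters
        # state n + 1 has no outgoing transitions
    return frozenset(nxt)


def nfa_state_count(n):
    """Number of NFA states for `.*a.{n}`: states 0 through n + 1."""
    return n + 2


def count_dfa_states(n, alphabet="ab"):
    """Count DFA states reachable via subset construction."""
    start = frozenset({0})
    seen = {start}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        for ch in alphabet:
            nxt = nfa_step(cur, ch, n)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen)


if __name__ == "__main__":
    for n in (1, 3, 8):
        print(n, nfa_state_count(n), count_dfa_states(n))
```

Each DFA state has to remember which of the last `n + 1` characters were `a`, so the count is `2 ** (n + 1)`. The NFA avoids this blow-up but may have up to `n + 2` states active at once, each needing a memory access per input character.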

In this work, we propose and evaluate an extended finite automaton designed to address these shortcomings. First, the automaton provides an alternative approach to handle character repetitions that limits memory space and bandwidth requirements. Second, it supports back-references without the need for back-tracking in the input string. In our discussion of this proposal, we address practical implementation issues and evaluate the automaton on real-world rule-sets. To our knowledge, this is the first high-speed automaton that can accommodate all the Perl-compatible regular expressions present in the Snort network intrusion and detection system.
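To make the idea of matching a back-reference without back-tracking in the input more concrete, here is a minimal sketch. It is not the automaton proposed in the paper. It assumes the single hypothetical pattern `([ab]+)[ab]*;\1` and reads the input once, left to right, keeping a set of active configurations (one per candidate capture length) instead of rewinding. The function name `match_prefix_backref` and the reference pattern are illustrative choices only.

```python
import re

# Reference pattern, evaluated by Python's backtracking engine for comparison.
REFERENCE = re.compile(r"([ab]+)[ab]*;\1")


def match_prefix_backref(text):
    # Single left-to-right pass; intended to agree with REFERENCE.fullmatch.
    field = []          # characters before ';', stored as they are read
    seen_sep = False
    active = set()      # (capture_length, chars_matched_so_far) pairs
    for ch in text:
        if not seen_sep:
            if ch in "ab":
                field.append(ch)
            elif ch == ";" and field:
                seen_sep = True
                # Every non-empty prefix of the field is a candidate capture.
                active = {(k, 0) for k in range(1, len(field) + 1)}
            else:
                return False
        else:
            # Keep only the candidates whose next captured character
            # equals the current input character.
            active = {
                (k, m + 1)
                for (k, m) in active
                if m < k and field[m] == ch
            }
            if not active:
                return False
    # Accept if some candidate capture was reproduced in full.
    return seen_sep and any(m == k for (k, m) in active)


if __name__ == "__main__":
    for sample in ("abab;ab", "ab;ba", "ab;", "a;a"):
        print(sample, match_prefix_backref(sample),
              REFERENCE.fullmatch(sample) is not None)
```

The input pointer never moves backwards. The price is extra state: the captured text is buffered and several candidate captures are tracked in parallel, which is the kind of memory-for-bandwidth trade-off the abstract is concerned with.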


        • Published in

          ACM Conferences
          CoNEXT '08: Proceedings of the 2008 ACM CoNEXT Conference
          December 2008
          526 pages
          ISBN:9781605582108
          DOI:10.1145/1544012

          Copyright © 2008 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 198 of 789 submissions, 25%
