SFADiff: Automated Evasion Attacks and Fingerprinting Using Black-box Differential Automata Learning

ABSTRACT
Finding differences between programs with similar functionality is an important security problem, as such differences can be used for fingerprinting or for creating evasion attacks against security software such as Web Application Firewalls (WAFs), which are designed to detect malicious inputs to web applications. In this paper, we present SFADIFF, a black-box differential testing framework based on Symbolic Finite Automata (SFA) learning. SFADIFF can automatically find differences between a set of programs with comparable functionality. Unlike existing differential testing techniques, instead of searching for each difference individually, SFADIFF infers SFA models of the target programs using black-box queries and systematically enumerates the differences between the inferred SFA models. Each difference between the inferred models is checked against the corresponding programs; any difference between the models that does not correspond to an actual difference between the programs is used as a counterexample to further refine the inferred models. SFADIFF's model-based approach, unlike existing differential testing tools, also supports fully automated root-cause analysis in a domain-independent manner.
We evaluate SFADIFF in three different settings, finding discrepancies between: (i) three TCP implementations, (ii) four WAFs, and (iii) the HTML/JavaScript parsing implementations of WAFs and web browsers. Our results demonstrate that SFADIFF identifies and enumerates differences systematically and efficiently in all three settings. We show that SFADIFF finds differences not only between different WAFs but also between different versions of the same WAF. SFADIFF also discovers three previously unknown differences between the HTML/JavaScript parsers of two popular WAFs (PHPIDS 0.7 and Expose 2.4.0) and the corresponding parsers of Google Chrome, Firefox, Safari, and Internet Explorer. We confirm that all of these differences can be used to evade the WAFs and launch successful cross-site scripting attacks.
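The difference-enumeration step described in the abstract can be illustrated in simplified form. The sketch below (not SFADiff's actual implementation, which learns symbolic finite automata; here plain DFAs with invented toy models stand in) finds the shortest input on which two inferred models disagree via a breadth-first search over their product automaton. In the full approach, such a candidate is then replayed against the real programs: if they also disagree, a true difference has been found; if not, the input serves as a counterexample to refine the models.

```python
# Simplified sketch of model-difference enumeration (toy DFAs, not SFAs).
# All models and the alphabet below are invented for illustration.
from collections import deque

class DFA:
    def __init__(self, start, accepting, delta):
        self.start, self.accepting, self.delta = start, accepting, delta

    def accepts(self, word):
        state = self.start
        for ch in word:
            state = self.delta[state][ch]
        return state in self.accepting

def find_difference(a, b, alphabet):
    """Return the shortest input accepted by exactly one of the two DFAs,
    or None if the models are equivalent (BFS over the product automaton)."""
    seen = {(a.start, b.start)}
    queue = deque([(a.start, b.start, "")])
    while queue:
        sa, sb, word = queue.popleft()
        if (sa in a.accepting) != (sb in b.accepting):
            return word  # the models disagree on this input
        for ch in alphabet:
            nxt = (a.delta[sa][ch], b.delta[sb][ch])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt[0], nxt[1], word + ch))
    return None

# Toy inferred models: model_a accepts words with an even number of 'a's,
# model_b accepts words whose count of 'a's is divisible by 3.
model_a = DFA(0, {0}, {0: {"a": 1, "b": 0},
                       1: {"a": 0, "b": 1}})
model_b = DFA(0, {0}, {0: {"a": 1, "b": 0},
                       1: {"a": 2, "b": 1},
                       2: {"a": 0, "b": 2}})

diff = find_difference(model_a, model_b, "ab")  # shortest disagreement: "aa"
```

In SFADiff this enumeration runs over symbolic automata (with predicates over large alphabets rather than explicit characters), and spurious differences feed back into the learning algorithm as counterexamples, but the product-search idea is the same.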