research-article

Generalized vulnerability extrapolation using abstract syntax trees

Authors:
Fabian Yamaguchi, University of Göttingen, Göttingen, Germany
Markus Lottmann, Technische Universität Berlin, Berlin, Germany
Konrad Rieck, University of Göttingen, Göttingen, Germany

ACSAC '12: Proceedings of the 28th Annual Computer Security Applications Conference, December 2012, Pages 359–368
https://doi.org/10.1145/2420950.2421003

Published: 03 December 2012

ABSTRACT

The discovery of vulnerabilities in source code is key to securing computer systems. While specific types of security flaws can be identified automatically, in the general case the process of finding vulnerabilities cannot be automated, and vulnerabilities are mainly discovered by manual analysis. In this paper, we propose a method for assisting a security analyst during the auditing of source code. Our method proceeds by extracting abstract syntax trees from the code and determining structural patterns in these trees, such that each function in the code can be described as a mixture of these patterns. This representation enables us to decompose a known vulnerability and extrapolate it to a code base, such that functions potentially suffering from the same flaw can be suggested to the analyst. We evaluate our method on the source code of four popular open-source projects: LibTIFF, FFmpeg, Pidgin and Asterisk. For three of these projects, we are able to identify zero-day vulnerabilities by inspecting only a small fraction of the code bases.
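The core idea of the abstract, ranking functions by structural similarity to a known-vulnerable one, can be illustrated with a deliberately simplified sketch. It is not the method of the paper. As assumptions for illustration only, it uses Python's standard `ast` module as a stand-in parser, represents each function by plain counts of AST node types, and skips the step of determining structural patterns and describing functions as mixtures of them. The names `function_vectors`, `cosine`, `extrapolate`, and the `SAMPLE` code are hypothetical.

```python
import ast
import math
from collections import Counter

# Hypothetical toy code base: two structurally similar copy loops and one unrelated function.
SAMPLE = '''
def copy_a(dst, src, n):
    for i in range(n):
        dst[i] = src[i]

def copy_b(out, data, count):
    for j in range(count):
        out[j] = data[j]

def greet(name):
    return "hello " + name
'''

def function_vectors(source):
    """Map each function name to a bag of AST node-type counts (assumed feature set)."""
    tree = ast.parse(source)
    vectors = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            # Count the node types occurring anywhere inside this function.
            vectors[node.name] = Counter(type(n).__name__ for n in ast.walk(node))
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def extrapolate(vectors, known):
    """Rank all other functions by similarity to the known-vulnerable function."""
    reference = vectors[known]
    scores = [(name, cosine(reference, vec))
              for name, vec in vectors.items() if name != known]
    return sorted(scores, key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    # Treat copy_a as the known vulnerability and list candidates for the analyst.
    for name, score in extrapolate(function_vectors(SAMPLE), "copy_a"):
        print(f"{name}: {score:.3f}")
```

In this toy setting the analyst would review the candidates from the top of the ranking down, which mirrors the abstract's goal of inspecting only a small fraction of a code base.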

Published in
ACSAC '12: Proceedings of the 28th Annual Computer Security Applications Conference
December 2012, 464 pages
ISBN: 9781450313124
DOI: 10.1145/2420950
Conference Chair: Robert H'obbes' Zakon, Zakon Group LLC
Copyright © 2012 ACM
Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates
ACSAC '12 paper acceptance rate: 44 of 231 submissions, 19%. Overall acceptance rate: 104 of 497 submissions, 21%.