Abstract
The identification of vulnerabilities is an important element of the software development life cycle for ensuring the security of software. While vulnerability identification based on source code is a well-studied field, identifying vulnerabilities on the basis of a binary executable without the corresponding source code is more challenging. Recent research [1] has shown that such detection can in principle be enabled by deep learning methods, but it covers only a small number of distinct vulnerabilities. We analyse to what extent the identification of a larger variety of vulnerabilities can be covered. To this end, we use a supervised deep learning approach based on recurrent neural networks for vulnerability detection in binary executables. The underlying basis is a dataset with 50,651 samples of vulnerable code in the form of a standardised LLVM Intermediate Representation. The vectorised features of a Word2Vec model are used to train different variations of three basic architectures of recurrent neural networks (GRU, LSTM, SRNN). A binary classification model was established for detecting the presence of an arbitrary vulnerability, and a multi-class model was trained for identifying the exact vulnerability; they achieved out-of-sample accuracies of 88% and 77%, respectively. Differences in the detection of different vulnerabilities were also observed, with non-vulnerable samples being detected with a particularly high precision of over 98%. Thus, our proposed technical approach and methodology enable an accurate detection of 23 vulnerabilities (compared to 4 in [1]).
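The two reported metrics, accuracy for the binary and multi-class settings and per-class precision, can be illustrated with a minimal sketch. The labels (`"none"`, `"CWE-121"`, `"CWE-190"`), the helper names and the toy predictions below are invented for illustration and are not taken from the paper's dataset or results. The paper trains two separate models; collapsing multi-class labels into a binary label, as done here, is only an assumption used to show why a binary setting tends to score higher: confusing one vulnerability with another still counts as a correct "vulnerable" decision.

```python
from collections import Counter

NON_VULNERABLE = "none"  # hypothetical label for samples without a vulnerability


def to_binary(label):
    """Collapse a multi-class label into vulnerable / non-vulnerable."""
    return "non-vulnerable" if label == NON_VULNERABLE else "vulnerable"


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision_per_class(y_true, y_pred):
    """Per class: correct predictions of that class / all predictions of that class."""
    predicted = Counter(y_pred)
    true_pos = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    return {label: true_pos[label] / count for label, count in predicted.items()}


# Toy data, invented for illustration only (not results from the paper).
y_true = ["none", "none", "CWE-121", "CWE-190", "CWE-121", "none"]
y_pred = ["none", "none", "CWE-121", "CWE-121", "CWE-190", "none"]

multi_acc = accuracy(y_true, y_pred)
binary_acc = accuracy([to_binary(t) for t in y_true],
                      [to_binary(p) for p in y_pred])
per_class = precision_per_class(y_true, y_pred)

print(f"multi-class accuracy: {multi_acc:.2f}")
print(f"binary accuracy:      {binary_acc:.2f}")
print(f"precision per class:  {per_class}")
```

In this toy example, two vulnerable samples are assigned the wrong vulnerability class, which lowers the multi-class accuracy but leaves the binary accuracy untouched.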
This work has been funded by the German BMBF under Grant Number 16KIS1403.
References
Zheng, J., Pang, J., Zhang, X., Zhou, X., Li, M., Wang, J.: Recurrent neural network based binary code vulnerability detection. In: Proceedings of the 2019 2nd International Conference on Algorithms, Computing and Artificial Intelligence, ser. ACAI 2019, Sanya, China, pp. 160–165. Association for Computing Machinery (2019). https://doi.org/10.1145/3377713.3377738. ISBN 9781450372619
Li, J., Zhao, B., Zhang, C.: Fuzzing: a survey. Cybersecurity 1(1), 1–13 (2018)
Arakelyan, S., Arasteh, S., Hauser, C., Kline, E., Galstyan, A.: Bin2vec: learning representations of binary executable programs for security tasks. Cybersecurity 4(1), 1–14 (2021)
Li, Z., Zou, D., Xu, S., Jin, H., Qi, H., Hu, J.: VulPecker: an automated vulnerability detection system based on code similarity analysis. In: Proceedings of the 32nd Annual Conference on Computer Security Applications, ser. ACSAC 2016, Los Angeles, California, USA, pp. 201–213. Association for Computing Machinery (2016). https://doi.org/10.1145/2991079.2991102. ISBN 9781450347716
Jang, J., Agrawal, A., Brumley, D.: ReDeBug: finding unpatched code clones in entire OS distributions. In: 2012 IEEE Symposium on Security and Privacy, pp. 48–62 (2012). https://doi.org/10.1109/SP.2012.13
Liu, Z., Wei, Q., Cao, Y.: VFDETECT: a vulnerable code clone detection system based on vulnerability fingerprint. In: 2017 IEEE 3rd Information Technology and Mechatronics Engineering Conference (ITOEC), pp. 548–553 (2017). https://doi.org/10.1109/ITOEC.2017.8122356
Kim, S., Woo, S., Lee, H., Oh, H.: VUDDY: a scalable approach for vulnerable code clone discovery. In: IEEE Symposium on Security and Privacy (SP), pp. 595–614 (2017). https://doi.org/10.1109/SP.2017.62
Li, Z., et al.: VulDeePecker: a deep learning-based system for vulnerability detection. In: Proceedings 2018 Network and Distributed System Security Symposium (2018). https://doi.org/10.14722/ndss.2018.23158
Black, P.E., et al.: SARD: thousands of reference programs for software assurance. J. Cyber Secur. Inf. Syst. Tools Test. Tech. Assur. Softw. Dod Softw. Assur. Community Pract. 2(5) (2017)
Li, Z., Zou, D., Xu, S., Chen, Z., Zhu, Y., Jin, H.: VulDeeLocator: a deep learning-based fine-grained vulnerability detector. IEEE Trans. Dependable Secure Comput. 1 (2021). https://doi.org/10.1109/tdsc.2021.3076142. ISSN 2160-9209
Pewny, J., Garmany, B., Gawlik, R., Rossow, C., Holz, T.: Cross-architecture bug search in binary executables. In: IEEE Symposium on Security and Privacy, pp. 709–724 (2015). https://doi.org/10.1109/SP.2015.49
Eschweiler, S., Yakdan, K., Gerhards-Padilla, E.: DiscovRE: efficient cross-architecture identification of bugs in binary code. In: NDSS, vol. 52, pp. 58–79 (2016)
Dahl, W.A., Erdodi, L., Zennaro, F.M.: Stack-based buffer overflow detection using recurrent neural networks (2020). arXiv:2012.15116 [cs.CR]
Xue, H., Sun, S., Venkataramani, G., Lan, T.: Machine learning-based analysis of program binaries: a comprehensive study (2019). https://doi.org/10.1109/ACCESS.2019.2917668
Gutstein, B., Richardson, A.: Juliet test suite for C/C++ (2019). https://github.com/arichardson/juliet-test-suite-c
Wang, Y., Wu, Z., Wei, Q., Wang, Q.: NeuFuzz: efficient fuzzing with deep neural network. IEEE Access 7, 36340–36352 (2019). https://doi.org/10.1109/ACCESS.2019.2903291
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Schaad, A., Binder, D. (2023). Deep-Learning-Based Vulnerability Detection in Binary Executables. In: Jourdan, GV., Mounier, L., Adams, C., Sèdes, F., Garcia-Alfaro, J. (eds) Foundations and Practice of Security. FPS 2022. Lecture Notes in Computer Science, vol 13877. Springer, Cham. https://doi.org/10.1007/978-3-031-30122-3_28
DOI: https://doi.org/10.1007/978-3-031-30122-3_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30121-6
Online ISBN: 978-3-031-30122-3
eBook Packages: Computer Science, Computer Science (R0)