Abstract
Malware poses a substantial risk to the security of Android applications. It can steal important information and disrupt the economy, social structures, and the financial sector. Because Android applications are constantly connected, they are a frequent target of malicious network traffic. This study develops NMal-Droid, an approach for network-based Android malware detection and classification. First, we design a packet parser algorithm that filters the combination of HTTP traces and TCP flows from PCAP (packet capture) files. Second, we develop a fine-tuned embedding approach that uses a pre-trained word2vec model to analyze feature embeddings in three different ways: random, static, and dynamic. This step learns and extracts feature matrices that capture related meanings. Third, a Convolutional Neural Network (CNN) extracts effective features from the embedded information. Fourth, a Bi-directional Gated Recurrent Unit (Bi-GRU) neural network computes gradients in both the time-forward and time-reversed directions. Finally, a multi-head ensemble of CNN-BiGRU is developed for accurate malware classification and detection. For in-depth investigation, the proposed approach is evaluated with five different activation functions, 100 filters, and kernel sizes ranging from 1 to 5. An explainable AI-based experiment is conducted to interpret and validate the proposed approach. The method is tested on two large Android malware datasets, CIC-AAGM2017 and CICMalDroid 2020, which together comprise 10.2K malware and 3.2K benign samples. The results show that the proposed approach outperforms state-of-the-art methods.
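The first step above, filtering HTTP traces and TCP flows, can be illustrated with a minimal sketch. This is not the authors' packet parser algorithm. It assumes packets have already been decoded from a PCAP file into plain dictionaries; the record layout (`proto`, `src`, `sport`, `dst`, `dport`, `payload`) and the helper names `flow_key`, `http_request_line`, `parse_packets`, and `flow_to_tokens` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Request methods used to recognise an HTTP request line (illustrative subset).
HTTP_METHODS = ("GET", "POST", "PUT", "DELETE", "HEAD", "OPTIONS", "PATCH")


def flow_key(pkt):
    """Direction-independent key so both directions of a connection share one flow."""
    a = (pkt["src"], pkt["sport"])
    b = (pkt["dst"], pkt["dport"])
    return tuple(sorted((a, b)))


def http_request_line(payload):
    """Return the first payload line if it looks like an HTTP request, else None."""
    first_line = payload.split(b"\r\n", 1)[0].decode("latin-1")
    parts = first_line.split(" ")
    if len(parts) == 3 and parts[0] in HTTP_METHODS and parts[2].startswith("HTTP/"):
        return first_line
    return None


def parse_packets(packets):
    """Group TCP packets into flows and collect the HTTP request lines of each flow."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0, "http": []})
    for pkt in packets:
        if pkt.get("proto") != "TCP":
            continue  # keep only TCP traffic
        flow = flows[flow_key(pkt)]
        flow["packets"] += 1
        flow["bytes"] += len(pkt["payload"])
        line = http_request_line(pkt["payload"])
        if line is not None:
            flow["http"].append(line)
    return dict(flows)


def flow_to_tokens(flow):
    """Flatten the HTTP request lines of one flow into tokens for an embedding layer."""
    tokens = []
    for line in flow["http"]:
        tokens.extend(line.split(" "))
    return tokens


# Hypothetical, hand-written packet records standing in for decoded PCAP data.
sample = [
    {"proto": "TCP", "src": "10.0.0.2", "sport": 40000,
     "dst": "93.184.216.34", "dport": 80,
     "payload": b"GET /ads?id=7 HTTP/1.1\r\nHost: example.com\r\n\r\n"},
    {"proto": "TCP", "src": "93.184.216.34", "sport": 80,
     "dst": "10.0.0.2", "dport": 40000,
     "payload": b"HTTP/1.1 200 OK\r\n\r\n"},
    {"proto": "UDP", "src": "10.0.0.2", "sport": 5353,
     "dst": "8.8.8.8", "dport": 53, "payload": b"\x00\x01"},
]
flows = parse_packets(sample)
```

In this sketch the two TCP packets fall into a single flow and the UDP packet is dropped. The tokens returned by `flow_to_tokens` are the kind of text sequence that a word2vec-style embedding layer could consume; how NMal-Droid actually tokenizes its traces is not stated in this section.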
Data availability
The data that support the findings of this study are openly available from the Canadian Institute for Cybersecurity as CIC-AAGM2017 and CICMalDroid 2020 at https://www.unb.ca/cic/datasets/android-adware.html and https://www.unb.ca/cic/datasets/maldroid-2020.html, respectively.
Notes
https://www.unb.ca/cic/datasets/index.html. The first dataset, the Canadian Institute of Cybersecurity Android Adware and General Malware (CICAAGM2017)
Funding
Funding was provided by Natural Sciences and Engineering Research Council of Canada (Grant no. RGPIN-2020-05363).
Author information
Contributions
FU proposed the study, simulated it, and wrote the manuscript. SU helped in writing algorithms and formatting. YZ reviewed the manuscript and made writing suggestions. GS and JC-WL reviewed and analyzed the proposed research. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Ethical approval
This article does not contain any studies with human participants performed by any of the authors.
About this article
Cite this article
Ullah, F., Ullah, S., Srivastava, G. et al. NMal-Droid: network-based android malware detection system using transfer learning and CNN-BiGRU ensemble. Wireless Netw (2023). https://doi.org/10.1007/s11276-023-03414-5
DOI: https://doi.org/10.1007/s11276-023-03414-5