NMal-Droid: network-based android malware detection system using transfer learning and CNN-BiGRU ensemble

Ullah, Farhan; Ullah, Shamsher; Srivastava, Gautam; Lin, Jerry Chun-Wei; Zhao, Yue

doi:10.1007/s11276-023-03414-5

Published: 27 June 2023

Wireless Networks (2023)

Farhan Ullah¹,
Shamsher Ullah²,
Gautam Srivastava³,⁵,⁶ (ORCID: orcid.org/0000-0001-9851-4103),
Jerry Chun-Wei Lin⁴ &
Yue Zhao¹


Abstract

Malware poses a substantial risk to the security of Android applications. It can steal sensitive information and disrupt the economy, social structures, and the financial sector. Because Android applications are constantly connected, they are a frequent target of malicious network traffic. This study develops NMal-Droid, an approach for network-based Android malware detection and classification. First, we design a packet parser algorithm that filters combined HTTP traces and TCP flows from PCAP (packet capture) files. Second, we develop a fine-tuned embedding approach that uses a pre-trained word2vec model to analyze feature embeddings in three ways: random, static, and dynamic. It is used to learn and extract feature matrices with related meanings. Third, a Convolutional Neural Network (CNN) extracts effective features from the embedded information. Fourth, a Bi-directional Gated Recurrent Unit (Bi-GRU) neural network computes gradients in both the time-forward and time-reversed directions. Finally, a multi-head ensemble of CNN-BiGRU is developed for accurate malware classification and detection. The proposed approach is evaluated with five different activation functions, 100 filters, and kernel sizes ranging from 1 to 5 for in-depth investigation. An explainable AI-based experiment is conducted to interpret and validate the proposed approach. The method is tested on two large Android malware datasets, CIC-AAGM2017 and CICMalDroid 2020, which together comprise 10.2K malware and 3.2K benign samples. The results show that the proposed approach outperforms state-of-the-art methods.
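The three embedding settings named in the abstract (random, static, dynamic) can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: that "random" means randomly initialized trainable vectors, "static" means frozen pre-trained word2vec vectors, and "dynamic" means pre-trained vectors that are fine-tuned during training. The function names `tokenize_flow` and `build_embedding`, the tokenization scheme, and the toy vectors are hypothetical and are not taken from the paper.

```python
import random
from typing import Dict, List, Tuple

Vector = List[float]


def tokenize_flow(flow_text: str) -> List[str]:
    """Split a textual HTTP/TCP flow record into lowercase tokens.

    The separator set is an illustrative assumption, not the paper's parser.
    """
    for sep in "/?&=:;,":
        flow_text = flow_text.replace(sep, " ")
    return flow_text.lower().split()


def build_embedding(
    vocab: List[str],
    pretrained: Dict[str, Vector],
    dim: int,
    mode: str,
    seed: int = 0,
) -> Tuple[List[Vector], bool]:
    """Return (embedding matrix, trainable flag) for one embedding mode.

    random  : every row is randomly initialized; rows are trainable.
    static  : rows copied from pre-trained vectors; rows are frozen.
    dynamic : rows copied from pre-trained vectors; rows are trainable.
    Tokens missing from `pretrained` always get a random row.
    """
    if mode not in {"random", "static", "dynamic"}:
        raise ValueError(f"unknown embedding mode: {mode}")
    rng = random.Random(seed)
    matrix: List[Vector] = []
    for token in vocab:
        if mode != "random" and token in pretrained:
            matrix.append(list(pretrained[token]))
        else:
            matrix.append([rng.uniform(-0.25, 0.25) for _ in range(dim)])
    trainable = mode != "static"
    return matrix, trainable


# Toy usage: the request line and the 2-dimensional vectors are made up.
tokens = tokenize_flow("GET /ads/click?id=42 HTTP/1.1")
vocab = sorted(set(tokens))
toy_pretrained = {"get": [0.1, 0.2], "http": [0.3, 0.4]}
for mode in ("random", "static", "dynamic"):
    matrix, trainable = build_embedding(vocab, toy_pretrained, dim=2, mode=mode)
    print(mode, len(matrix), "rows, trainable:", trainable)
```

In a deep learning framework, the returned matrix would typically initialize an embedding layer and the flag would control whether that layer's weights are updated; the downstream CNN and Bi-GRU stages are outside the scope of this sketch.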

Data availability

The data that support the findings of this study are openly available from the Canadian Institute for Cybersecurity as the CIC-AAGM2017 and CICMalDroid2020 datasets at https://www.unb.ca/cic/datasets/android-adware.html and https://www.unb.ca/cic/datasets/maldroid-2020.html, respectively.

Notes

https://www.statista.com/statistics/266210/number-of-available-applications-in-thegoogle-play-store.
https://www.unb.ca/cic/datasets/index.html. The first dataset, the Canadian Institute of Cybersecurity Android Adware and General Malware (CICAAGM2017)

Funding

Funding was provided by Natural Sciences and Engineering Research Council of Canada (Grant no. RGPIN-2020-05363).

Author information

Authors and Affiliations

School of Software, Northwestern Polytechnical University, Xi’an, 710072, Shaanxi, People’s Republic of China
Farhan Ullah & Yue Zhao
School of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518000, People’s Republic of China
Shamsher Ullah
Department of Mathematics and Computer Science, Brandon University, Brandon, MB, R7A 6A9, Canada
Gautam Srivastava
Department of Computer Science, Electrical Engineering and Mathematical Sciences, Western Norway University of Applied Sciences, 5063, Bergen, Norway
Jerry Chun-Wei Lin
Research Centre for Interneural Computing, China Medical University, Taichung, 40402, Taiwan
Gautam Srivastava
Department of Computer Science and Math, Lebanese American University, 1102, Beirut, Lebanon
Gautam Srivastava

Contributions

FU proposed the study, simulated it, and wrote the manuscript. SU helped in writing algorithms and formatting. YZ reviewed the manuscript and made writing suggestions. GS and JC-WL reviewed and analyzed the proposed research. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Gautam Srivastava.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ullah, F., Ullah, S., Srivastava, G. et al. NMal-Droid: network-based android malware detection system using transfer learning and CNN-BiGRU ensemble. Wireless Netw (2023). https://doi.org/10.1007/s11276-023-03414-5

Accepted: 24 May 2023
Published: 27 June 2023
DOI: https://doi.org/10.1007/s11276-023-03414-5
