NMal-Droid: network-based android malware detection system using transfer learning and CNN-BiGRU ensemble

Wireless Networks

Abstract

Malware poses a substantial risk to the security of Android applications. It can steal sensitive information and disrupt the economy, social structures, and the financial sector. Because Android applications are constantly connected, they are a frequent target of malicious network traffic. This study develops NMal-Droid, an approach for network-based Android malware detection and classification. First, we design a packet parser algorithm that filters combinations of HTTP traces and TCP flows from PCAP (packet capture) files. Second, we develop a fine-tuned embedding approach that uses a pre-trained word2vec model to analyze feature embeddings in three ways: random, static, and dynamic. It learns and extracts feature matrices with related meanings. Third, a Convolutional Neural Network (CNN) extracts effective features from the embedded information. Fourth, a Bi-directional Gated Recurrent Unit (Bi-GRU) neural network computes gradients in both the time-forward and time-reversed contexts. Finally, a multi-head ensemble of CNN-BiGRU is developed for accurate malware classification and detection. For in-depth investigation, the proposed approach is evaluated with five different activation functions, 100 filters, and kernel sizes ranging from 1 to 5. An explainable-AI experiment is conducted to interpret and validate the approach. The method is tested on two large Android malware datasets, CIC-AAGM2017 and CICMalDroid 2020, which together comprise 10.2K malware and 3.2K benign samples. The results show that the proposed approach outperforms state-of-the-art methods.
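The abstract does not give implementation details, so the following is only a minimal, dependency-free sketch of the multi-head convolution idea, not the authors' implementation. It takes the 100 filters and kernel sizes 1 to 5 from the abstract; everything else is an assumption made for illustration: the example HTTP trace, the hypothetical tokenizer `tokenize_http_trace`, the embedding dimension of 8, the random (untrained) weights, and the reading of the "random" embedding mode as randomly initialised vectors. The word2vec modes, the Bi-GRU branch, and training are omitted.

```python
import random
import re


def tokenize_http_trace(trace):
    """Split an HTTP trace into lowercase alphanumeric tokens (hypothetical tokenizer)."""
    return re.findall(r"[a-z0-9]+", trace.lower())


def random_embeddings(vocab, dim, seed=0):
    """Assumed 'random' mode: one randomly initialised vector per token."""
    rng = random.Random(seed)
    return {tok: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for tok in vocab}


def conv1d_max_pool(seq, filters):
    """Slide each filter over seq, apply ReLU, and keep the maximum over time."""
    k = len(filters[0])          # kernel size = number of tokens per window
    dim = len(seq[0])
    pooled = []
    for w in filters:
        best = 0.0               # ReLU floor: negative activations become 0
        for start in range(len(seq) - k + 1):
            act = sum(w[i][d] * seq[start + i][d]
                      for i in range(k) for d in range(dim))
            best = max(best, act)
        pooled.append(best)
    return pooled


def multi_head_features(seq, kernel_sizes=(1, 2, 3, 4, 5), n_filters=100, seed=1):
    """One convolution head per kernel size; pooled outputs are concatenated."""
    rng = random.Random(seed)
    dim = len(seq[0])
    features = []
    for k in kernel_sizes:
        # Untrained weights, for shape illustration only.
        filters = [[[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                    for _ in range(k)]
                   for _ in range(n_filters)]
        features.extend(conv1d_max_pool(seq, filters))
    return features


# Made-up trace, not taken from either dataset.
trace = "GET /ads/track?id=42 HTTP/1.1 Host: example.com User-Agent: Dalvik/2.1.0"
tokens = tokenize_http_trace(trace)
table = random_embeddings(sorted(set(tokens)), dim=8)
sequence = [table[t] for t in tokens]
features = multi_head_features(sequence)
print(len(tokens), len(features))
```

With five kernel sizes and 100 filters per head, max-over-time pooling yields a fixed 5 × 100 = 500-dimensional feature vector regardless of how many tokens the trace contains, which is what lets traces of different lengths feed a single classifier.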

Data availability

The data that support the findings of this study are openly available from the Canadian Institute for Cybersecurity: CIC-AAGM2017 at https://www.unb.ca/cic/datasets/android-adware.html and CICMalDroid2020 at https://www.unb.ca/cic/datasets/maldroid-2020.html.

Notes

  1. https://www.statista.com/statistics/266210/number-of-available-applications-in-thegoogle-play-store.

  2. https://www.unb.ca/cic/datasets/index.html.

Funding

Funding was provided by Natural Sciences and Engineering Research Council of Canada (Grant no. RGPIN-2020-05363).

Author information

Contributions

FU proposed the study, simulated it, and wrote the manuscript. SU helped in writing algorithms and formatting. YZ reviewed the manuscript and made writing suggestions. GS and JC-WL reviewed and analyzed the proposed research. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Gautam Srivastava.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ullah, F., Ullah, S., Srivastava, G. et al. NMal-Droid: network-based android malware detection system using transfer learning and CNN-BiGRU ensemble. Wireless Netw (2023). https://doi.org/10.1007/s11276-023-03414-5

  • DOI: https://doi.org/10.1007/s11276-023-03414-5
