A Comparison of Neural Network Architectures for Malware Classification Based on Noriben Operation Sequences

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2022 (ICANN 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13529)

Abstract

Behavior-based machine learning plays a vital role in malware classification, as it can potentially overcome the limitations of signature-based methods. This paper explores the use of dynamic call sequences extracted by the open-source Noriben tool, which performs dynamic analysis in a virtualized environment. Call sequences of up to 5000 operations are generated for a total of 2000 benign and malware samples. Seven malware families are recognized: ransomware, trojan, backdoor, rootkit, virus, miner, and other. An empirical comparison analyzes five different classifiers: fully connected neural networks, GRU and LSTM, Transformer, and two combination approaches. The best-performing approach overall is a concatenation of a GRU with a Transformer architecture, which yields the highest F1-score. This model achieves accuracy and F1-score values of up to 97%.
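The abstract does not describe how operation sequences are turned into network inputs. As a minimal sketch, assuming each sample is a list of operation names, one common way to prepare such sequences is to map every operation to an integer ID and then truncate or pad each sequence to the 5000-operation limit. The function names, the reserved IDs, and the example operation names below are illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, List, Sequence

MAX_LEN = 5000  # maximum sequence length stated in the abstract
PAD_ID = 0      # assumption: ID 0 is reserved for padding
UNK_ID = 1      # assumption: ID 1 marks operations missing from the vocabulary


def build_vocab(sequences: Sequence[Sequence[str]]) -> Dict[str, int]:
    """Assign an integer ID to each distinct operation, in order of first appearance."""
    vocab: Dict[str, int] = {}
    for seq in sequences:
        for op in seq:
            if op not in vocab:
                vocab[op] = len(vocab) + 2  # IDs 0 and 1 are reserved
    return vocab


def encode(seq: Sequence[str], vocab: Dict[str, int], max_len: int = MAX_LEN) -> List[int]:
    """Map operations to IDs, truncate to max_len, and right-pad with PAD_ID."""
    ids = [vocab.get(op, UNK_ID) for op in seq[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


if __name__ == "__main__":
    # Hypothetical operation names, used only for illustration.
    training = [["CreateFile", "RegSetValue", "CreateFile"]]
    vocab = build_vocab(training)
    print(encode(["CreateFile", "WriteFile"], vocab, max_len=4))
```

Fixed-length integer sequences of this kind can be fed to an embedding layer in front of a recurrent or Transformer encoder. Whether the paper pads, truncates from the front or the back, or uses a different encoding is not stated in the abstract.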

Notes

  1. Downloads from VirusShare [8] and VirusSign [18].

  2. Downloads from FileHorse [7].

References

  1. Abadi, M., Agarwal, A., Barham, P., Brevdo, E., et al.: Tensorflow: Large-scale machine learning on heterogeneous distributed systems (2016)

  2. Alibaba: Alitianchi contest (2021). https://tianchi.aliyun.com/competition/introduction.htm?spm=5176.11409106.5678.1.4354684cI0fYC1?raceId=231668s

  3. Athiwaratkun, B., Stokes, J.W.: Malware classification with lstm and gru language models and a character-level cnn. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2482–2486 (2017). https://doi.org/10.1109/ICASSP.2017.7952603

  4. Baskin, B.: Noriben malware analysis sandbox (2015). https://github.com/Rurik/Noriben

  5. Chen, J., Guo, S., Ma, X., Li, H., Guo, J., Chen, M., Pan, Z.: Slam: a malware detection method based on sliding local attention mechanism. Secur. Commun. Networks 2020, 6724513:1–6724513:11 (2020)

  6. Cho, K., van Merrienboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., Bengio, Y.: Learning phrase representations using rnn encoder-decoder for statistical machine translation (2014)

  7. FileHorse: Filehorse, June 2020. https://fileHorse.com

  8. Forensics, C.: Virusshare, June 2020. https://virusshare.com/

  9. Goldberg, Y., Levy, O.: word2vec explained: deriving Mikolov et al.’s negative-sampling word-embedding method (2014). arXiv:1402.3722

  10. Kolosnjaji, B., Zarras, A., Webster, G., Eckert, C.: Deep learning for classification of malware system call sequences. In: Kang, B.H., Bai, Q. (eds.) AI 2016. LNCS (LNAI), vol. 9992, pp. 137–149. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-50127-7_11

  11. Maxwell, K.: Maltrieve: a tool to retrieve malware directly from the source for security researchers (2015). https://github.com/krmaxwell/maltrieve

  12. O’Malley, T., Bursztein, E., Long, J., Chollet, F., Jin, H., Invernizzi, L., et al.: Keras Tuner (2019). https://github.com/keras-team/keras-tuner

  13. Pedregosa, F., et al.: Scikit-learn: machine learning in python. JMLR 12, 2825–2830 (2011)

  14. Pektas, A., Acarman, T.: Malware classification based on api calls and behaviour analysis. IET Inf. Secur. 12, 107–117 (2018)

  15. Qian, Q., Tang, M.: Dynamic api call sequence visualization for malware classification. IET Inf. Secur. 13, October 2018

  16. Saxe, J., Berlin, K.: expose: a character-level convolutional neural network with embeddings for detecting malicious urls, file paths and registry keys (2017)

  17. Vaswani, A., et al.: Attention is all you need (2017)

  18. VirusSign: Virussign, June 2020. https://samples.virussign.com/samples/

  19. VirusTotal: Virustotal, June 2020. http://www.virustotal.com

Author information

Corresponding author

Correspondence to Rajchada Chanajitt.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Chanajitt, R., Pfahringer, B., Gomes, H.M. (2022). A Comparison of Neural Network Architectures for Malware Classification Based on Noriben Operation Sequences. In: Pimenidis, E., Angelov, P., Jayne, C., Papaleonidas, A., Aydin, M. (eds) Artificial Neural Networks and Machine Learning – ICANN 2022. ICANN 2022. Lecture Notes in Computer Science, vol 13529. Springer, Cham. https://doi.org/10.1007/978-3-031-15919-0_36

  • DOI: https://doi.org/10.1007/978-3-031-15919-0_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15918-3

  • Online ISBN: 978-3-031-15919-0

  • eBook Packages: Computer Science, Computer Science (R0)
