Convolutional neural networks and extreme learning machines for malware classification

Jain, Mugdha; Andreopoulos, William; Stamp, Mark

doi:10.1007/s11416-020-00354-y

Original Paper
Published: 04 April 2020

Journal of Computer Virology and Hacking Techniques
Volume 16, pages 229–244 (2020)

Mugdha Jain¹,
William Andreopoulos¹ &
Mark Stamp¹

Abstract

Research in the field of malware classification often relies on machine learning models that are trained on high-level features, such as opcodes, function calls, and control flow graphs. Extracting such features is costly, since disassembly or code execution is generally required. In this paper, we conduct experiments to train and evaluate machine learning models for malware classification, based on features that can be obtained without disassembly or code execution. Specifically, we visualize malware samples as images and employ image analysis techniques using both two-dimensional images and one-dimensional vectors derived from images. We consider two machine learning techniques, namely, convolutional neural networks (CNN) and extreme learning machines (ELM). For images we find that ELMs can achieve accuracies on par with CNNs, yet ELM training requires less than 2% of the time needed to train a comparable CNN. We also find that ELMs and CNNs perform as well when trained on one-dimensional data as when trained on two-dimensional data. In this latter case, ELMs are faster to train than CNNs, but only by a relatively small factor as compared to image-based training.
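The pipeline described in the abstract can be illustrated with a short sketch. This is not the authors' implementation. The image width of 64, the hidden layer size, the tanh activation, the ridge term, and the names `bytes_to_image`, `ELM`, and `make_sample` are all assumptions made for the example, and the data is synthetic random bytes rather than real malware. It shows how raw bytes can become a two-dimensional image or a one-dimensional vector without disassembly, and why ELM training is fast: the hidden weights are random and fixed, so only the output weights are computed, in a single least-squares solve.

```python
import numpy as np

def bytes_to_image(data: bytes, width: int = 64) -> np.ndarray:
    """Interpret raw bytes as an 8-bit grayscale image of fixed width (assumed 64)."""
    arr = np.frombuffer(data, dtype=np.uint8)
    height = len(arr) // width          # trailing bytes that do not fill a row are dropped
    return arr[: height * width].reshape(height, width)

class ELM:
    """Minimal single-hidden-layer extreme learning machine (illustrative only)."""

    def __init__(self, n_hidden: int = 64, ridge: float = 1e-3, seed: int = 0):
        self.n_hidden = n_hidden
        self.ridge = ridge
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X: np.ndarray) -> np.ndarray:
        return np.tanh(X @ self.W + self.b)

    def fit(self, X: np.ndarray, y: np.ndarray) -> "ELM":
        n_features = X.shape[1]
        n_classes = int(y.max()) + 1
        # Input weights and biases are drawn at random and never trained.
        self.W = self.rng.normal(size=(n_features, self.n_hidden)) / np.sqrt(n_features)
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        T = np.eye(n_classes)[y]        # one-hot targets
        # Output weights come from one ridge-regularized least-squares solve.
        A = H.T @ H + self.ridge * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X: np.ndarray) -> np.ndarray:
        return np.argmax(self._hidden(X) @ self.beta, axis=1)

def make_sample(rng, family: int, n_bytes: int = 1024) -> bytes:
    """Synthetic stand-in for a binary: each fake family uses its own byte range."""
    low, high = (0, 64) if family == 0 else (192, 256)
    return rng.integers(low, high, size=n_bytes, dtype=np.uint8).tobytes()

rng = np.random.default_rng(1)
labels = np.array([i % 2 for i in range(240)])
images = [bytes_to_image(make_sample(rng, int(f))) for f in labels]

# One-dimensional view of each image, scaled to [0, 1].
X = np.stack([img.ravel() for img in images]).astype(np.float64) / 255.0

model = ELM(n_hidden=64).fit(X[:160], labels[:160])
accuracy = float(np.mean(model.predict(X[160:]) == labels[160:]))
print(images[0].shape, round(accuracy, 3))
```

In this sketch the ELM consumes the flattened vectors in `X`, while a CNN would take the two-dimensional arrays in `images` directly. The two synthetic families are trivially separable, so the reported accuracy says nothing about performance on real malware.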

Author information

Authors and Affiliations

Department of Computer Science, San Jose State University, San Jose, USA
Mugdha Jain, William Andreopoulos & Mark Stamp

Corresponding author

Correspondence to Mark Stamp.

About this article

Cite this article

Jain, M., Andreopoulos, W. & Stamp, M. Convolutional neural networks and extreme learning machines for malware classification. J Comput Virol Hack Tech 16, 229–244 (2020). https://doi.org/10.1007/s11416-020-00354-y

Received: 01 January 2020
Accepted: 19 March 2020
Published: 04 April 2020
Issue Date: September 2020
DOI: https://doi.org/10.1007/s11416-020-00354-y
