An N-gram Based Deep Learning Method for Network Traffic Classification

  • Conference paper
Artificial Intelligence and Security (ICAIS 2021)

Abstract

Network attacks have become the main threat on the Internet. Traffic classification is the first step in network anomaly detection and in network-based intrusion detection systems, and it plays an important role in network security. As Internet technology develops, the sources of network attacks are multiplying and their complexity is increasing, making it difficult for traditional anomaly detection systems to analyze and identify malicious traffic effectively. In recent years, deep learning has been widely used for traffic recognition because it can identify the characteristics of traffic data automatically. However, because a neural network limits the size of its input, flow data must be trimmed before it is fed into the network, so the network cannot learn the characteristics of the traffic well. In this paper, we propose an N-gram-based data processing method that converts raw traffic data into N-gram features, which carry more information. Our method then uses a detector based on a convolutional neural network (CNN) to classify the data. Our experiments show that detection accuracy with N-gram feature data is higher than with raw traffic, so the method detects malicious traffic more effectively.
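
The conversion from raw traffic to N-gram features can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the value of N, the unit of the grams, or how the feature set is chosen. The sketch assumes byte-level grams, a vocabulary built from the most frequent grams, and normalised counts; the function names `byte_ngrams`, `build_vocabulary` and `ngram_feature_vector` are hypothetical.

```python
from collections import Counter


def byte_ngrams(payload: bytes, n: int = 2) -> list:
    """Return all overlapping byte n-grams of the payload (assumed byte-level grams)."""
    if n <= 0:
        raise ValueError("n must be positive")
    # A payload shorter than n yields an empty list.
    return [payload[i:i + n] for i in range(len(payload) - n + 1)]


def build_vocabulary(payloads, n: int = 2, top_k: int = 256) -> list:
    """Pick the top_k most frequent n-grams across payloads (an assumed selection rule)."""
    counts = Counter()
    for payload in payloads:
        counts.update(byte_ngrams(payload, n))
    return [gram for gram, _ in counts.most_common(top_k)]


def ngram_feature_vector(payload: bytes, vocabulary, n: int = 2) -> list:
    """Map a payload to a fixed-length vector of relative n-gram frequencies."""
    grams = byte_ngrams(payload, n)
    counts = Counter(grams)
    total = max(len(grams), 1)  # avoid division by zero for short payloads
    return [counts[gram] / total for gram in vocabulary]


if __name__ == "__main__":
    flows = [b"GET /index.html", b"GET /login", b"\x16\x03\x01\x02\x00"]
    vocab = build_vocabulary(flows, n=2, top_k=8)
    for flow in flows:
        print(ngram_feature_vector(flow, vocab, n=2))
```

Under these assumptions the vector length is fixed by the vocabulary size rather than by the payload length, so a flow can be summarised without trimming it to a fixed number of bytes before it is passed to a classifier such as a CNN.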

Acknowledgement

We thank the ICN&CAD laboratory of the School of Electronic Engineering, Beijing University of Posts and Telecommunications, for providing the experimental environment, and He Mingshu, Jin Lei and Zhang Yu for their contributions to this work.

Funding

This work was supported by the National Natural Science Foundation of China (62071056).

Ethics declarations

We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Xiaojuan, W., Kacuila, K., Mingshu, H. (2021). An N-gram Based Deep Learning Method for Network Traffic Classification. In: Sun, X., Zhang, X., Xia, Z., Bertino, E. (eds) Artificial Intelligence and Security. ICAIS 2021. Lecture Notes in Computer Science, vol 12737. Springer, Cham. https://doi.org/10.1007/978-3-030-78612-0_24

  • DOI: https://doi.org/10.1007/978-3-030-78612-0_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-78611-3

  • Online ISBN: 978-3-030-78612-0

  • eBook Packages: Computer Science, Computer Science (R0)
