Deep packet: a novel approach for encrypted traffic classification using deep learning

Lotfollahi, Mohammad; Jafari Siavoshani, Mahdi; Shirali Hossein Zade, Ramin; Saberian, Mohammdsadegh

doi:10.1007/s00500-019-04030-2

Methodologies and Application
Published: 13 May 2019

Soft Computing, volume 24, pages 1999–2012 (2020)

Abstract

Network traffic classification has become more important with the rapid growth of the Internet and online applications. Numerous studies on this topic have produced many different approaches, most of which classify network traffic using predefined features extracted by an expert. In contrast, in this study we propose a deep learning-based approach that integrates the feature extraction and classification phases into one system. Our proposed scheme, called “Deep Packet,” can handle both traffic characterization, in which network traffic is categorized into major classes (e.g., FTP and P2P), and application identification, in which the goal is to identify end-user applications (e.g., BitTorrent and Skype). Contrary to most current methods, Deep Packet can identify encrypted traffic and also distinguish between VPN and non-VPN network traffic. The Deep Packet framework employs two deep neural network structures, namely a stacked autoencoder (SAE) and a convolutional neural network (CNN), to classify network traffic. Our experiments show that the best result is achieved when Deep Packet uses the CNN as its classification model, where it achieves a recall of 0.98 in the application identification task and 0.94 in the traffic characterization task. To the best of our knowledge, Deep Packet outperforms all previously proposed classification methods on the UNB ISCX VPN-nonVPN dataset.
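
The results above are reported as recall. As a reminder of what that metric measures, the following is a minimal Python sketch that computes per-class recall (correctly classified samples of a class divided by all samples of that class) and its unweighted mean over classes. The function names and the toy labels are hypothetical, and the abstract does not state how recall was averaged across classes, so the macro average used here is an assumption rather than the authors' evaluation procedure.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: recall}, with recall = TP / (TP + FN) for each true label."""
    support = Counter(y_true)  # number of samples per true class (TP + FN)
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: true_pos[label] / n for label, n in support.items()}


def macro_recall(y_true, y_pred):
    """Unweighted mean of the per-class recalls (assumed averaging scheme)."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


if __name__ == "__main__":
    # Toy labels for illustration only; these are not results from the paper.
    y_true = ["FTP", "FTP", "P2P", "P2P", "P2P", "Chat"]
    y_pred = ["FTP", "P2P", "P2P", "P2P", "P2P", "Chat"]
    print(per_class_recall(y_true, y_pred))
    print(macro_recall(y_true, y_pred))
```

With imbalanced classes, a macro average weights every class equally regardless of how many samples it has, whereas a support-weighted average of per-class recall reduces to overall accuracy.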

Acknowledgements

The authors would like to thank Farzad Ghasemi and Mehdi Kharrazi for their valuable discussions and feedback.

Author information

Authors and Affiliations

Sharif University of Technology, Tehran, Iran
Mohammad Lotfollahi, Mahdi Jafari Siavoshani, Ramin Shirali Hossein Zade & Mohammdsadegh Saberian

Authors

Mohammad Lotfollahi
Mahdi Jafari Siavoshani
Ramin Shirali Hossein Zade
Mohammdsadegh Saberian

Corresponding author

Correspondence to Mahdi Jafari Siavoshani.

Ethics declarations

Conflict of interest

Mohammad Lotfollahi, Mahdi Jafari Siavoshani, Ramin Shirali Hossein Zade, and Mohammdsadegh Saberian declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Hyper-parameters

In this section, we present all of the hyper-parameters of the proposed SAE and CNN for the traffic characterization and application identification tasks. These parameters are given in Tables 9, 10, 11 and 12.

To keep the tables compact, we use the following abbreviations: FC stands for “Fully connected,” NoF for “Number of Features,” DO for “Dropout,” BN for “Batch Normalization,” NL for “Nonlinearity,” Str for “Strides,” and Kl for “Kernel.”

Table 9 SAE detailed architecture for the application identification task

Table 10 SAE detailed architecture for the traffic characterization task

Table 11 CNN detailed architecture for the application identification task

Table 12 CNN detailed architecture for the traffic characterization task

About this article

Cite this article

Lotfollahi, M., Jafari Siavoshani, M., Shirali Hossein Zade, R. et al. Deep packet: a novel approach for encrypted traffic classification using deep learning. Soft Comput 24, 1999–2012 (2020). https://doi.org/10.1007/s00500-019-04030-2

Published: 13 May 2019
Issue Date: February 2020
DOI: https://doi.org/10.1007/s00500-019-04030-2
