An in-depth experimental study of anomaly detection using gradient boosted machine

Tama, Bayu Adhi; Rhee, Kyung-Hyune

doi:10.1007/s00521-017-3128-z

Original Article
Published: 05 July 2017

Volume 31, pages 955–965, (2019)

Neural Computing and Applications

Abstract

This paper aims to improve the detection performance of anomaly-based intrusion detection systems (IDS) using a gradient boosted machine (GBM). The best GBM parameters are obtained by grid search. The performance of GBM is then compared with that of four well-known classifiers, i.e. random forest, deep neural network, support vector machine, and classification and regression tree, in terms of five performance measures, i.e. accuracy, specificity, sensitivity, false positive rate, and area under the receiver operating characteristic curve (AUC). The experimental results show that GBM significantly outperforms recent IDS techniques, i.e. fuzzy classifier, two-tier classifier, GAR-forest, and tree-based classifier ensemble. These results are the highest reported so far on the complete feature sets of three different datasets, i.e. NSL-KDD, UNSW-NB15, and the GPRS dataset, using either tenfold cross-validation or the hold-out method. Moreover, we validate our results with two statistical significance tests, which have yet to be reported in existing IDS research.
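The performance measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `binary_metrics` and `auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, with label 1 standing for an attack and label 0 for normal traffic.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Confusion-matrix measures with label 1 (attack) as the positive class.

    Assumes both classes occur in y_true, so no denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed attacks
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),          # true positive rate
        "specificity": tn / (tn + fp),          # true negative rate
        "false_positive_rate": fp / (fp + tn),  # equals 1 - specificity
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This pairwise form is quadratic in
    the sample size, which is acceptable for a small illustration only.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: four attack records followed by four normal records.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.6, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
area = auc(y_true, scores)
print(metrics, area)
```

Specificity and false positive rate always sum to one, so the two carry the same information. AUC differs from the other measures in that it is computed from the raw scores and does not depend on the chosen threshold.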

Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. NRF-2014R1A2A1A11052981), and partially supported by the MSIP (Ministry of Science, ICT and Future Planning), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2017-2015-0-00403) supervised by the IITP (Institute for Information & communications Technology Promotion). The first author acknowledges the Korean Government for providing a scholarship through KGSP for Graduate 2013–2018.

Author information

Authors and Affiliations

IT Convergence and Application Engineering, Pukyong National University, Daeyon Campus, 45, Yongso-ro, Nam-gu, Busan, 48513, South Korea
Bayu Adhi Tama & Kyung-Hyune Rhee
Faculty of Computer Science, University of Sriwijaya, Jln. Raya Palembang-Prabumulih Km.32, Inderalaya, OI, Sumatera Selatan, Indonesia
Bayu Adhi Tama

Corresponding author

Correspondence to Kyung-Hyune Rhee.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

About this article

Cite this article

Tama, B.A., Rhee, KH. An in-depth experimental study of anomaly detection using gradient boosted machine. Neural Comput & Applic 31, 955–965 (2019). https://doi.org/10.1007/s00521-017-3128-z

Received: 14 October 2016
Accepted: 19 June 2017
Published: 05 July 2017
Issue Date: 01 April 2019
DOI: https://doi.org/10.1007/s00521-017-3128-z
