Towards Evaluating the Robustness of Deep Intrusion Detection Models in Adversarial Environment

A Network Intrusion Detection System (NIDS) is a method utilized to categorize network traffic as malicious or normal. The signature-based method and the anomaly-based method are the traditional approaches used for network intrusion detection. The signature-based approach can only detect familiar attacks, whereas the anomaly-based approach shows promising results in detecting new, unknown attacks. Machine Learning (ML) based approaches have been studied in the past for anomaly-based NIDS. In recent years, Deep Learning (DL) algorithms have been widely utilized for intrusion detection due to their capability to obtain optimal feature representations automatically. Even though DL based approaches improve detection accuracy tremendously, they are prone to adversarial attacks: an attacker can trick the model into wrongly classifying adversarial samples into a particular target class. In this paper, a performance analysis of several ML and DL models is carried out for intrusion detection in both adversarial and non-adversarial environments. The models are trained on the NSLKDD dataset, which contains a total of 148,517 data points, and the robustness of these models against adversarial samples is studied.


Introduction
In today's world, cyber-attacks and threats on Information and Communication Technologies (ICT) systems are growing rapidly. New attacks are invented daily by attackers to bypass current security systems and steal crucial information. To detect and prevent these attacks on ICT systems, we need flexible and reliable integrated network security solutions. Various security structures and methods are used to deal with these malicious attacks, namely firewalls, Intrusion Detection Systems (IDS), software updates, encryption and decryption methods, etc. Among these, the IDS plays a major role in defending the network from all kinds of intrusions and malicious acts, both from outside and inside the network. IDS has been an actively studied area since the 1980s, beginning with the seminal work by [1] on computer security threat monitoring and surveillance. IDS is mainly categorized into two types. The first is the Network IDS (NIDS), which is utilized to monitor and analyze network traffic records to safeguard a system from network-based attacks. The second is the Host-based IDS (HIDS), which monitors the system on which it is installed to detect both internal and external intrusion and misuse, and responds by recording the activities and alerting the authority. A NIDS monitors network traffic and classifies the network records as normal or malicious. Since this is a classification problem, various Machine Learning (ML) and Deep Learning (DL) models are widely used in these detection systems and have achieved good results. However, ML and DL models are prone to adversarial attacks: attackers can fool the detection system by using adversarial samples and make the classifier misclassify those samples [2]. Therefore, it is necessary to check the robustness of the models used in NIDS against adversarial samples. In this paper, several DL and ML models are trained on the openly available NSLKDD dataset for IDS, and their robustness against adversarial samples is studied. The main contributions of this work are the following:
- We have trained several DL and ML models using the NSLKDD dataset in a non-adversarial environment and reported their performance using standard metrics.
- We have studied the robustness of the trained models in an adversarial environment using samples generated by two different adversarial attack techniques.
The rest of the paper is arranged as follows. Section 2 presents the related work. Section 3 includes the background information. Sections 4 and 5 present the description of the dataset and the statistical measures, respectively. Sections 6 and 7 cover the experimental results and the conclusion.

Related Work
Many ML and DL based approaches have been applied to various problems in the field of cyber security, including IDS [3][4][5][6][7]. Tsai et al. utilized Support Vector Machines (SVM), self-organizing maps, Artificial Neural Networks (ANN), Naive Bayes (NB), K-Nearest Neighbor (KNN), genetic algorithms, Decision Trees (DT), fuzzy logic, etc. for detecting intrusions [8]. Buczak and Guven have done a comprehensive survey [9] on ML-based NIDS covering many ML classifiers such as DT, ensemble learning, SVM, clustering, Hidden Markov Models (HMM), NB, etc. Since ML techniques require manually engineered features, DL based approaches were proposed; DL architectures can obtain salient features from the input data automatically. In [10], the authors proposed multiple Deep Neural Network (DNN) models for both network and host-based intrusion detection. They trained models using several benchmark datasets and compared their performance with ML-based approaches. Similar to [10], [11] proposes a DNN based IDS for the Software Defined Networking (SDN) environment; the proposed model takes only 6 basic features out of the 41 features of the NSLKDD dataset. [12] studies the effectiveness of DL networks such as the DNN, the Convolutional Neural Network (CNN), and a hybrid CNN for binary and multiclass classification. [13] compares the performance of many shallow and deep neural networks in detecting intrusions, and [14] proposes a recurrent neural network and its variants for intrusion detection.
ML and DL models are prone to adversarial attacks. This vulnerability, discovered in recent years, limits the application of ML and DL models in various security-critical areas such as IDS, autonomous vehicles, and health care. Szegedy et al. experimented on AlexNet with adversarial sample images [15]; AlexNet [16] is a convolutional neural network designed by Alex Krizhevsky. They showed that by making very small variations in the input image, they could make the model misclassify it. Since then, the profound implications of this vulnerability have led several researchers to develop various adversarial attacks and defenses. Two of the most commonly known attacks are the Jacobian-based Saliency Map Attack (JSMA) [17] and the Fast Gradient Sign Method (FGSM) [18]. In this paper, the effects of adversarial samples generated by [18] and [17] on various ML and DL models are studied.

Adversarial attacks
Fast Gradient Sign Method (FGSM): It is a straightforward method of creating adversarial samples, proposed by Goodfellow et al. In FGSM, a small perturbation is computed in the direction of the gradient of the loss with respect to the input:

p = ε · sign(∇x L(θ, x, y))

where p is the perturbation, ε is a small constant, ∇x L(θ, x, y) is the gradient of the loss function L used for training the model, θ denotes the model parameters, x denotes the input, and y denotes the class of input x. This perturbation p is added to the input data to generate the adversarial sample: x_adv = x + p. FGSM is computationally more efficient than JSMA, but it has a lower rate of success.

Jacobian-based Saliency Map Attack (JSMA): It uses the concept of saliency maps to generate adversarial samples. A saliency map gives insight into which features of the input data are most likely to cause a change to the targeted class; in other words, a saliency map rates how influential each feature is in causing the model to predict a target class. JSMA causes the model to misclassify the resulting adversarial sample into a specific erroneous target class by modifying the high-saliency features. The saliency map can be formulated as:

S(x(i), y) = 0, if ∂F_y(x)/∂x(i) < 0 or Σ_{j≠y} ∂F_j(x)/∂x(i) > 0
S(x(i), y) = (∂F_y(x)/∂x(i)) · |Σ_{j≠y} ∂F_j(x)/∂x(i)|, otherwise

where x(i) is an input feature, y is the target class, and F_j(x) is the model output for class j. The first factor measures the positive correlation of x(i) with class y, and the second measures the negative correlation of x(i) with all other classes; if either condition in the first case holds, the saliency is zero. JSMA can create adversarial samples with a smaller degree of distortion and has a better success rate compared to FGSM.
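To make the FGSM formulation above concrete, the following sketch crafts an adversarial sample against a toy logistic-regression "model" with hand-picked weights (the weights, inputs, and ε value are illustrative, not from the paper). Because the model is a simple sigmoid, the gradient of the binary cross-entropy loss with respect to the input has the closed form (p − y)·w, so no autodiff library is needed:

```python
import numpy as np

def fgsm_perturb(x, y, w, b, eps=0.1):
    """Craft an FGSM adversarial sample against a logistic-regression
    model sigmoid(w.x + b) trained with binary cross-entropy.
    For this loss, the gradient w.r.t. the input is (p - y) * w."""
    p = 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))  # model's predicted probability
    grad_x = (p - y) * w                           # dL/dx for BCE loss
    return x + eps * np.sign(grad_x)               # x_adv = x + eps * sign(grad)

# toy example: a point the model confidently assigns to class 1
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([1.0, 0.5])
y = 1.0  # true label

x_adv = fgsm_perturb(x, y, w, b, eps=0.3)
# each feature is nudged by eps in the direction that increases the loss,
# lowering the model's score for the true class
```

Note that the perturbation magnitude per feature is exactly ε, which is why FGSM is cheap (one gradient evaluation) but produces coarser distortions than JSMA's per-feature selection.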

Intrusion Detection System (IDS)
An IDS is a tool that deals with unauthorized access and threats to systems and information by any type of user or software. Intrusion can be external or internal. External intrusion is when an intruder tries to gain access to a protected internal network. Internal intrusion is when an insider with a motive tries to misuse, attack, or steal information; this is also called an insider threat. The two major categories of IDS are HIDS and NIDS. A HIDS monitors the system on which it is installed to detect both external and internal intrusion and misuse, and responds by recording activities and alerting the authority. A NIDS is utilized to monitor and analyze network traffic to safeguard a system from network-based attacks. Figure 1 shows a model of an intrusion detection system. A signature-based NIDS uses signatures extracted from previously known attacks; signatures are manually generated and stored in the database whenever a new attack is identified, so new attacks will not be detected by this system. An anomaly-based NIDS models the normal behavior of the network and raises an alarm whenever it detects anomalous behavior. A hybrid NIDS uses a combination of the two approaches.

Deep Learning (DL) Models
DL models are used for solving research problems in a wide range of fields such as biomedical applications, speech processing, and natural language processing, since they are capable of extracting salient features automatically with little or no human intervention.

Description of Dataset
One of the most used datasets is KDDCUP 99, which was derived from the DARPA98 dataset. The KDDCUP 99 dataset has several issues that are resolved by a refined version called NSL-KDD [19]: the invalid and redundant connection records are omitted from both the train and test data. Table 1 presents the statistics of the NSLKDD dataset. The dataset contains attacks belonging to four major families: User to Root (U2R), Probing, Denial of Service (DoS), and Remote to Local (R2L). The purpose of a DoS attack is to work against resource availability. U2R attacks represent attempts at privilege escalation. R2L attacks attempt to exploit a vulnerability and gain remote access to a machine. Probe attacks are mainly information-gathering attempts that scan parts of the network. The dataset contains a total of 41 features.
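Before the 41 features can be fed to the numeric models described later, the symbolic features of NSL-KDD (protocol_type, service, and flag) must be converted to numeric form, typically via one-hot encoding. The sketch below illustrates this for protocol_type; the category list is the actual three-value set for that feature, but the encoding helper and the record fragment are illustrative, not the paper's preprocessing code:

```python
# Minimal sketch of encoding one of NSL-KDD's symbolic features.
# protocol_type takes one of three values; a one-hot vector replaces
# the string so that distance- and gradient-based models can use it.

def one_hot(value, categories):
    """Return a one-hot list for `value` over a fixed category list."""
    vec = [0] * len(categories)
    vec[categories.index(value)] = 1
    return vec

PROTOCOLS = ["tcp", "udp", "icmp"]

# one example connection-record fragment with protocol_type = "udp"
encoded = one_hot("udp", PROTOCOLS)  # -> [0, 1, 0]
```

The service and flag columns are handled the same way, which is why the effective input dimensionality after preprocessing is larger than the 41 raw features.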

Statistical Measures
The performance evaluation of the models against adversarial attacks is conducted using several popular performance metrics: accuracy, precision, recall, and F1-score. Accuracy gives an overview of the performance of the classifier, while the F1-score gives the harmonic mean of recall and precision.
- Accuracy: It denotes the total number of correct predictions (TP and TN) over the total number of predictions.

Accuracy = (TP + TN) / (TP + TN + FP + FN)
- Precision: It denotes the number of correct positive predictions over the total number of positive predictions made by the model.

Precision = TP / (TP + FP)
- Recall: It denotes the number of correct positive predictions over the number of all relevant samples.

Recall = TP / (TP + FN)
- F1 score: It denotes the harmonic mean of recall and precision.

F1 = 2 · (Precision · Recall) / (Precision + Recall)
Adversarial attacks reduce the overall performance of a model by tricking it into misclassification. Therefore, the above metrics, which quantify the performance of the system, can also be used to measure the robustness of a model in an adversarial environment.
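The four metrics above can be computed directly from the confusion-matrix counts. The sketch below does exactly that; the TP/FP/TN/FN values are illustrative numbers, not results from the paper:

```python
# Compute accuracy, precision, recall, and F1 from confusion-matrix
# counts, following the definitions given above.

def metrics(tp, fp, tn, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# illustrative counts for a binary intrusion-detection run
acc, prec, rec, f1 = metrics(tp=80, fp=10, tn=90, fn=20)
# acc = 0.85, prec = 80/90, rec = 0.80
```

Note how precision and recall respond to different error types: FP inflates only the precision denominator, FN only the recall denominator, which is why both are reported alongside accuracy.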

Experimental results
The adversarial attacks FGSM and JSMA are implemented using the Adversarial Robustness Toolbox v0.10.0 [8], and the ML and DL models are implemented using the Scikit-Learn and Keras Python libraries respectively. Table 2 presents the performance of the implemented models, namely Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), Deep Neural Network (DNN), Support Vector Machine (SVM), Naive Bayes (NB), K-Nearest Neighbour (KNN), Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), and Adaboost (AB), in the non-adversarial environment. The performance of the trained models is compared with that of the Soft-Max Regression (SMR) classifier [20]. It can be observed from Table 2 that the DNN performed better than all the other models trained in this work. Based on the accuracy metric, the DNN, CNN, and DT are the top three models, with accuracies of 77.39%, 75.37%, and 74.78% respectively; the Adaboost classifier gives the lowest accuracy. In terms of F1-score, both the SMR and DT models performed better than the CNN and LSTM models. All the models trained in this work are also tested on adversarial samples generated by FGSM and JSMA to evaluate how robust they are in an adversarial environment. Tables 3 and 4 present the performance of all the models on the adversarial samples generated by the FGSM and JSMA methods respectively. It can be observed from both tables that the adversarial attacks tremendously reduced the performance of the baseline models trained in the non-adversarial environment, for both the FGSM and JSMA techniques. The top three models most affected by FGSM in terms of accuracy are the DNN, LSTM, and DT. The FGSM attack reduced the accuracy of the DNN from 77.39 to 16.74 (78% reduction), the LSTM from 74.65 to 24.51 (67% reduction), and the DT from 74.78 to 17.65 (76% reduction). The models least affected by the FGSM attack are RBF-SVM (2% reduction), LR (2% reduction), and LSVM (4% reduction). The top three models most affected by JSMA in terms of accuracy are the CNN, DNN, and DT. The JSMA attack reduced the accuracy of the CNN from 75.37 to 10.06 (87% reduction), the DNN from 77.39 to 10.87 (86% reduction), and the DT from 74.78 to 12.21 (84% reduction). The models least affected by the JSMA attack are NB (2% reduction), RBF-SVM (4% reduction), and LSVM (30% reduction).
It can be observed from both tables that FGSM worked better in the case of the LSTM and NB, while JSMA worked better than FGSM in all other cases. RBF-SVM, LSVM, KNN, and NB show more robustness against both adversarial attacks than the rest of the models. The adversarial samples created using the DNN model also generalize well to the other DL and ML models: the attack samples crafted by both FGSM and JSMA with the DNN as the target degrade the performance of the other ML and DL models as well.
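The transferability effect described above can be demonstrated in miniature: adversarial samples crafted with FGSM against one (surrogate) linear model also degrade the accuracy of a second, independently perturbed (target) model. Everything in this sketch is synthetic stand-in data and toy linear "models", not the paper's classifiers:

```python
import numpy as np

# Toy transferability check: FGSM samples are crafted against a
# surrogate model, then evaluated on a *different* target model.

rng = np.random.default_rng(0)
n, d = 200, 10
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = (X @ w_true > 0).astype(float)          # ground-truth labels

w_sur = w_true + 0.1 * rng.normal(size=d)   # surrogate model weights
w_tgt = w_true + 0.1 * rng.normal(size=d)   # independently "trained" target

def acc(w, X, y):
    """Accuracy of the linear classifier sign(X @ w) against labels y."""
    return float(np.mean((X @ w > 0).astype(float) == y))

# FGSM against the surrogate: for BCE loss the input gradient is (p - y) * w_sur
p = 1 / (1 + np.exp(-X @ w_sur))
X_adv = X + 0.5 * np.sign((p - y)[:, None] * w_sur)

clean = acc(w_tgt, X, y)      # target accuracy on clean samples
adv = acc(w_tgt, X_adv, y)    # target accuracy on transferred adversarial samples
```

Because the surrogate and target share most of their decision boundary, the perturbation directions that hurt the surrogate also hurt the target, which is the same mechanism behind the cross-model degradation observed in Tables 3 and 4.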

Conclusion
In this paper, we have observed that adversarial samples can lower the accuracy of many DL and ML classifiers with varying degrees of success. This shows that it is necessary to test the robustness of any DL or ML model against adversarial samples, especially when it is used in security-critical applications. The models trained in this paper did not perform well compared to other state-of-the-art approaches, but their robustness against adversarial attacks has been studied. In the future, we will focus on defense techniques that mitigate such attacks.
The Deep Neural Network (DNN) model used in this work has 5 hidden layers and a total of 1,399,557 trainable parameters. The five hidden layers have 1024, 768, 512, 256, and 128 neurons respectively. The dropout regularization technique is employed to avoid overfitting. The Convolutional Neural Network (CNN) model is widely used in the area of computer vision, as it is capable of extracting location-invariant features automatically. The CNN model used in this work has four convolution

Table 1 .
Statistics of NSLKDD data set.

In the binary classification setting, the true labels versus the predicted labels are represented by a confusion matrix, which contains four terms. The first is True Positive (TP): the number of malicious traffic records correctly predicted as malicious. The second is False Positive (FP): the number of normal traffic records incorrectly predicted as malicious. The third is True Negative (TN): the number of normal traffic records correctly predicted as normal. The final one is False Negative (FN): the number of malicious traffic records incorrectly predicted as normal. Based on these four terms, several metrics can be defined.

Table 2 .
Performance of baseline models for test set.

Table 3 .
Performance of models for the adversarial sample generated by FGSM.

Table 4 .
Performance of models for the adversarial sample generated by JSMA.