Intrusion Detection Using a New Hybrid Feature Selection Model

Intrusion detection is an important topic that aims at protecting computer systems, and feature selection is crucial for increasing the performance of intrusion detection. This paper employs a new hybrid feature selection model for intrusion detection. The implemented model uses the Grey Wolf Optimization (GWO) and Particle Swarm Optimization (PSO) algorithms in a new manner. In addition, this study introduces two new models, called (PSO-GWO-NB) and (PSO-GWO-ANN), for feature selection and intrusion detection. PSO and GWO have shown promising results in feature selection for several purposes and applications, and this paper uses them to select features for the intrusion detection system. Furthermore, a new feature selection method using the intersection of the (PSO and GWO) features is developed. This research also examines the Most frequently Repeated Features from (PSO and GWO), named (MRF). PSO and GWO are run for a specific number of iterations, which the user can define. Each feature selection model runs independently, and the selected feature set is saved. PSO features, GWO features, the intersection of (PSO and GWO) features, and MRF features are tested at the next stage. This research uses the UNSW-NB15 dataset for evaluation purposes. Experiments are implemented using two classifiers: Naïve Bayesian (NB) and Artificial Neural Networks (ANN). The results show that PSO and GWO are highly acceptable for the selection of intrusion detection features. Besides, the intersection of (PSO and GWO) features gives a promising result with a minimum number of features. Moreover, MRF features show highly acceptable results. The evaluation criteria are true positive, false positive, false negative, precision, and recall. The experiments demonstrate that MRF features give a good result in terms of precision and recall. Finally, experiments show that the performance of the (PSO-GWO-NB) classifier is better than that of (PSO-GWO-ANN) for feature selection and intrusion detection.


Introduction
Intrusion detection is a set of procedures and techniques used to identify intrusion activity. As such, an intrusion detection system is any software that can detect or respond to abnormal activity. An intrusion is an unauthorized attempt to access and use a computer system and its resources [1][2][3]. Generally, intrusion detection is classified into two methods: misuse detection and anomaly detection [1,2]. Misuse detection systems use prior knowledge of attacks to search for and identify attack traces, whereas anomaly detection is based on studying the features of normal activity [2]. Furthermore, intrusion detection can be divided into three sub-groups: host-based intrusion detection (HBID), network intrusion detection (NIDS), and hybrid-based intrusion detection (HISD) [4][5][6].
Feature selection is an essential factor for the success of an intrusion detection system. It is necessary for mining highly diverse data, and is a fundamental data processing step in the training phase prior to moving to the next stage (testing) [7]. There are different techniques for selecting features, such as the wrapper, the filter, and the embedded methods. Other methods are bio-inspired metaheuristics [7][8][9]. Determining the best number of features will improve the success rate and performance. The main contributions of this study can be summarized as follows: using PSO and GWO in a new method, selecting the intersection of the (PSO and GWO) features, and then examining the most frequently repeated features, which represent the best set.
Intrusion detection can be implemented through several methods, such as programmed and self-learning methods. Fig. 1 presents several such techniques [10]. Many studies indicate that intrusion detection systems have become one of the most recent cybersecurity research areas [5]. Additionally, recent studies demonstrate that the number of attacks on individuals and organizations is increasing rapidly [5][6][7]. Traditional prevention measures, such as firewalls, encryption, and user authentication, have not entirely succeeded in their mission. In other words, the need for other emerging procedures has become vital. This research focuses on using bio-inspired metaheuristic and machine learning algorithms to develop an efficient feature selection and intrusion detection system. The rest of this paper is organized as follows: Section 2 demonstrates the proposed model. Section 3 presents feature selection. Section 4 explains particle swarm optimization. Section 5 illustrates the grey wolf optimizer. Section 6 describes machine learning algorithms. Section 7 presents related works. Section 8 introduces the dataset used in this work. Section 9 shows the proposed model experiments. Finally, Section 10 presents the conclusion of this research.

The Proposed Model
This paper proposes two hybrid models for feature selection and intrusion detection. The first model is (PSO-GWO-NB), and the second is (PSO-GWO-ANN).
Feature selection is significant for the success of any classification process. As mentioned in the literature, there are several techniques for feature selection. After an intensive study of these techniques, this paper focuses on investigating the bio-inspired techniques. Feature selection in the proposed system is as follows: the original dataset contains 49 features. The dataset used, after reduction and cleaning, has only 45 features. PSO and GWO are used to decrease the number of features. Experiments are repeated several times until the optimal number of features is obtained. Figs. 2 and 3 demonstrate the PSO and GWO feature selection models, respectively.
The process of feature selection is repeated 30 times for PSO and GWO. PSO features, GWO features, the intersection of (PSO and GWO) features, and the Most frequently Repeated Features (MRF) are used for further experiments with NB and ANN. Fig. 4 demonstrates the overall feature selection model. As shown in Fig. 4, each bio-inspired algorithm used for feature selection is treated independently, and the reduced set is used for further experiments. In the next stage, the NB and ANN classifiers are used.
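The intersection step described above amounts to a plain set operation. The following minimal Python sketch illustrates it under stated assumptions: the two feature sets are hypothetical placeholders, not the subsets selected in this study, and the helper name intersect_features is introduced here for illustration only.

```python
# Hypothetical feature subsets returned by one PSO run and one GWO run.
# The names are placeholders for illustration, not the paper's selections.
pso_features = {"dur", "sbytes", "dbytes", "rate", "sttl"}
gwo_features = {"sbytes", "rate", "sttl", "dload", "smean"}


def intersect_features(first, second):
    """Return the features chosen by both selectors, sorted for stable output."""
    return sorted(set(first) & set(second))


common = intersect_features(pso_features, gwo_features)
print(common)
```

Because each selector runs independently and its output is saved, the intersection can be computed afterwards without rerunning either algorithm.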

Feature Selection
Feature selection is a critical factor for the success or failure of any classification system. Many standard techniques exist, such as the Correlation-based Feature Selection method (CFS), Gain Ratio (GR), and Information Gain (IG) [5][6][7][8]. Other types are bio-inspired algorithms, such as genetic algorithms and artificial neural networks [11,12], and newly emerging algorithms, such as GWO and PSO, should be of broad interest [5,13,14]. Selecting an appropriate method can be crucial for the success or failure of the overall process.

Particle Swarm Optimization
The Particle Swarm Optimization (PSO) algorithm is a computational technique that optimizes a problem by iteratively trying to improve a candidate solution. PSO is inspired by the collective behavior of animals such as fish and birds, and aims at solving the problem by maintaining a population of candidate solutions. It can also search very large spaces of potential solutions [15]. PSO is not guaranteed to find the best solution. It is based on a population (called a swarm) of candidate solutions (called particles). Particles move according to a few simple formulas [5], and the swarm travels through the search space in the hope of finding the best solution. In addition, if a better position is discovered, the movements of the swarm are changed accordingly. This process is repeated with the intent of finding the optimal solution [16,17].
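To make the particle movement concrete, the sketch below applies the commonly used PSO velocity and position update to a toy minimization problem. It is an illustration under assumptions, not the implementation used in this study: the sphere objective, the inertia and acceleration coefficients, the swarm size, and the search bounds are all assumed values, and the sketch optimizes continuous positions rather than feature subsets.

```python
import random


def sphere(position):
    """Toy objective to minimize (assumed for illustration): sum of squares."""
    return sum(x * x for x in position)


def pso(objective, dim=5, swarm_size=20, iterations=100,
        inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal PSO sketch; all parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    positions = [[rng.uniform(-5, 5) for _ in range(dim)]
                 for _ in range(swarm_size)]
    velocities = [[0.0] * dim for _ in range(swarm_size)]
    personal_best = [p[:] for p in positions]
    personal_score = [objective(p) for p in positions]
    best_index = min(range(swarm_size), key=personal_score.__getitem__)
    global_best = personal_best[best_index][:]
    global_score = personal_score[best_index]
    initial_score = global_score

    for _ in range(iterations):
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to own best + pull to swarm best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c1 * r1 * (personal_best[i][d] - positions[i][d])
                    + c2 * r2 * (global_best[d] - positions[i][d])
                )
                positions[i][d] += velocities[i][d]
            score = objective(positions[i])
            if score < personal_score[i]:
                personal_best[i] = positions[i][:]
                personal_score[i] = score
                if score < global_score:
                    # A better position changes where the swarm is pulled.
                    global_best = positions[i][:]
                    global_score = score
    return global_best, global_score, initial_score


best_position, best_score, initial_score = pso(sphere)
print(best_score <= initial_score)
```

For feature selection, a position is typically mapped to a subset of features and the objective is a classifier-based score; that mapping is not specified here and is left out of the sketch.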

Grey Wolf Optimizer
The Grey Wolf Optimizer (GWO) algorithm is a swarm intelligence optimization algorithm developed by Mirjalili et al. [18] in 2014, and it emulates the leadership hierarchy and the hunting behavior of grey wolves. Wolves are categorized into four types, as shown in Fig. 6 [18].
The wolf types are alpha, beta, delta, and omega. Alpha is the best group of individuals and is the leader of the wolves [15,18]. The alpha is dominant and is the decision-maker, and its orders and instructions must be followed by the pack. Beta is the second-best group of individuals and acts as subordinate wolves that help alpha in decision-making. Besides, beta wolves are the candidates to become alpha in case of any problem, and they play the role of an adviser to the alpha and a discipliner for the pack. The third-best group of individuals is delta, and the rest of the pack is considered omega [19]. Omega is the lowest level, the last allowed to eat, and plays the role of the scapegoat. A wolf that does not belong to alpha, beta, or omega is called delta. Delta wolves must submit to alpha and beta, but they dominate omega. The first three groups guide the GWO hunting process. In other words, the alpha, beta, and delta groups lead the other wolves to find the best position in the available search space [20,21].
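The way the three leading groups guide the rest of the pack can be sketched with the standard GWO position update, in which every wolf moves toward the average of three estimates derived from alpha, beta, and delta. The code below is a minimal sketch under assumptions: the sphere objective, the pack size, the bounds, and the choice to keep the three best positions found so far as leaders are illustrative, and the sketch works on continuous positions rather than feature subsets.

```python
import random


def sphere(position):
    """Toy objective to minimize (assumed for illustration): sum of squares."""
    return sum(x * x for x in position)


def gwo(objective, dim=5, pack_size=20, iterations=100, seed=0):
    """Minimal GWO sketch; parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(-5, 5) for _ in range(dim)]
              for _ in range(pack_size)]
    # Leaders are the three best positions so far: alpha, beta, delta.
    leaders = [w[:] for w in sorted(wolves, key=objective)[:3]]
    initial_score = objective(leaders[0])

    for t in range(iterations):
        # Coefficient a decreases linearly from 2 toward 0.
        a = 2.0 - 2.0 * t / iterations
        for wolf in wolves:
            for d in range(dim):
                estimates = []
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    distance = abs(C * leader[d] - wolf[d])
                    estimates.append(leader[d] - A * distance)
                # Omega wolves follow the average of the three leaders' pulls.
                wolf[d] = sum(estimates) / 3.0
        # Re-rank, keeping earlier leaders so the alpha never gets worse.
        candidates = leaders + [w[:] for w in wolves]
        leaders = sorted(candidates, key=objective)[:3]
    return leaders[0], objective(leaders[0]), initial_score


alpha_position, alpha_score, initial_alpha_score = gwo(sphere)
print(alpha_score <= initial_alpha_score)
```

Large values of the coefficient a early in the run favor exploration, while small values late in the run make the pack close in on the leaders.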

Machine Learning
Machine learning (ML) algorithms are used for several purposes, such as classification and prediction. Many studies present machine learning algorithms, such as Genetic Algorithm (GA), Artificial Neural Networks (ANN), Naïve Bayesian (NB), Support Vector Machines (SVM), K-Nearest Neighbors (K-NN), and decision trees, in text classification, spam detection, and prediction [5,11,12,17,18,19,20,21]. Without a doubt, ML algorithms can be used in intrusion detection [5]. Fig. 7 shows that the machine learning process is completed in two phases: the training and the testing phases.
This study applies the PSO and GWO algorithms for feature reduction and selection. The dataset used in this research is UNSW-NB15 [22], and the ML algorithms used are NB and ANN. ML algorithms are trained on data before testing. Training is vital and aims to clean and prepare data for testing. Besides, training is used to select the most suitable features, which will be used in the testing phase [23,24].

Naïve Bayes Classifiers
Naïve Bayes (NB) is a probabilistic classifier based on applying Bayes' theorem. It is a simple and powerful algorithm that can determine the included classes using probability theory [25,26]. Furthermore, the major function of the NB classification hypothesis is to determine whether the given data belongs to a specific category. In NB, given a series of x attributes, there are 2x! independence assumptions [27]. Besides, training and data preprocessing are significant in NB, since errors and data noise could result from unsuitable training and data variance [25][26][27]. Finally, the results of NB are often correct.
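As an illustration of how NB assigns a record to a category, the sketch below implements a small categorical Naïve Bayes classifier with add-one (Laplace) smoothing. It is a sketch under assumptions and not the classifier implementation used in the experiments: the two attributes, the six training records, and their labels are hypothetical toy data.

```python
import math
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, attribute index) -> counts
    vocab = defaultdict(set)             # attribute index -> observed values
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(label, j)][value] += 1
            vocab[j].add(value)
    return class_counts, value_counts, vocab


def predict_nb(model, row):
    """Return the class with the highest posterior log-probability."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_log_prob = None, -math.inf
    for label, count in class_counts.items():
        log_prob = math.log(count / total)  # class prior
        for j, value in enumerate(row):
            # Attributes are treated as independent given the class;
            # add-one smoothing avoids zero probabilities.
            numerator = value_counts[(label, j)][value] + 1
            denominator = count + len(vocab[j])
            log_prob += math.log(numerator / denominator)
        if log_prob > best_log_prob:
            best_label, best_log_prob = label, log_prob
    return best_label


# Hypothetical toy records: (protocol, service) with a class label.
rows = [("tcp", "http"), ("tcp", "http"), ("udp", "dns"),
        ("udp", "dns"), ("tcp", "ftp"), ("udp", "dns")]
labels = ["normal", "normal", "attack", "attack", "normal", "attack"]

model = train_nb(rows, labels)
print(predict_nb(model, ("tcp", "http")))
print(predict_nb(model, ("udp", "dns")))
```

Working in log-probabilities avoids numerical underflow when many attributes are multiplied together.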

Artificial Neural Networks Classifiers
Artificial Neural Networks (ANN) are considered one of the powerful learning models; they are inspired by biological neural networks (the nervous system) and emulate the role of the human brain. Many studies have used ANN as a classifier, especially as a text classifier [28][29][30][31]. In addition, an ANN has several components, such as neurons, connections, weights, a propagation function, and organization. ANN also has two paradigms: supervised and unsupervised learning. Besides, many researchers use ANN in intrusion detection [29,30], and numerous studies indicate that intrusion detection performance can be enhanced using neural networks. In ANN models, the system tries to learn the pattern and make predictions based on the learning phase, and during the training process the ANN learns from its errors. Once the neural network has been trained, it can make predictions by recognizing similar patterns [31]. In this research, the Multilayer Perceptron (MLP) is used. The multilayer perceptron is a class of feed-forward ANN and employs a supervised learning method called backpropagation [32][33][34].

Related Studies
This section reviews studies that address intrusion detection using bio-inspired metaheuristic and machine learning algorithms.
AShahri et al. [35] proposed a hybrid model of the genetic algorithm and the support vector machine for intrusion detection. The number of features is reduced to 10 instead of 45. The authors categorize feature priorities into three levels, where the first is the highest priority and the third is the lowest. Çavuşoğlu [37] developed a hybrid and layered intrusion detection system that uses a mixture of machine learning algorithms and feature selection methods to maximize accuracy. The dataset is reduced using two distinct feature selection methods (CfsSubsetEval and WrapperSubsetEval). For all attack types, the proposed system provides 99.7% accuracy, and the dataset used is NSL-KDD.
Buczak et al. [38] reported a survey of data mining and machine learning methods for cybersecurity and intrusion detection. The complexity of machine learning and data mining is addressed, and the crucial aspect of this study for cybersecurity is the importance of a dataset for training and testing. Also, the authors mention that machine learning and data mining cannot work without data representation, and it is difficult and time-consuming to get a dataset. Besides, recommendations on when to use a given method are provided.
Chitrakar et al. [39] proposed a hybrid learning model by joining NB with a k-Medoids based clustering technique. The authors observe that applying k-Medoids clustering, followed by the NB classification method, gives more accurate results. Results demonstrate that the proposed model improves accuracy and false-positive rate.
Yin et al. [40] demonstrated how to model an intrusion detection system based on deep learning techniques, and suggested a new deep learning approach for intrusion detection using a Recurrent Neural Networks Intrusion Detection System (RNN-IDS). Results are compared with J48, artificial neural networks, support vector machine, and random forest, and show that RNN-IDS is very appropriate for modeling systems with high accuracy. Finally, the authors demonstrate that the model can successfully enhance intrusion detection accuracy and the ability to distinguish intrusion types.
Almomani [5] presented several bio-inspired algorithms for feature selection, using the genetic algorithm, particle swarm optimization, grey wolf optimizer, and firefly optimization. Features derived from the bio-inspired models are evaluated using support vector machine and J48 classifiers. All experiments use the UNSW-NB15 dataset and demonstrate promising results related to false positives and accuracy.
Wang et al. [41] suggested a new approach for using artificial neural networks in intrusion detection, called Fuzzy Clustering-Artificial Neural Networks (FC-ANN). The FC-ANN procedure is based on using the fuzzy clustering technique to produce different training subsets. Then, based on these training subsets, different artificial neural networks are trained to form different base models. Finally, the dataset used is KDDCUP99.
Xin et al. [42] reported an important literature survey on machine learning and deep learning methods for intrusion detection. The authors focused on the network security literature of the last three years, and demonstrate that each approach used for intrusion detection has its advantages and disadvantages. The authors state that selecting a dataset is very important for training and testing. Also, they discuss several problems and trends in intrusion detection, such as datasets, hybrid methods, detection speed, and online learning.
Gumus et al. [43] built an online NB classifier to distinguish normal from abnormal activity. The classifier continually updates the mean and standard deviation of the features (IDS variables). Also, the authors compare several machine learning algorithms on the KDD99 dataset, and they mention that the proposed technique is time-efficient.

Dataset
The dataset used in this research is UNSW-NB15, a complete dataset used for intrusion detection systems. One of the significant challenges for all researchers is the availability of a benchmark dataset, and the KDD98, KDDCUP99, and NSL-KDD datasets were generated a decade ago. Moustafa et al. [22] developed the UNSW-NB15 dataset for research purposes; this dataset is hybrid and contains usual and contemporary attack events. Fig. 8 shows the UNSW-NB15 dataset [22]. The UNSW-NB15 dataset contains nine categories of attacks: fuzzers, analysis, generic, reconnaissance, shellcode, backdoors, dos, exploits, and worms. The dataset contains 49 features. The dataset record distribution is shown in Tab. 1. In UNSW-NB15, the number of records in the training set is 175,341, and in the testing set is 82,332. The testing and training datasets contain 45 features. Some features are missing from the training and testing datasets, such as srcip, sport, dstip, dsport, smeansz, dmeansz, res_bdy_len, stime, and ltime. Also, a few features are available in the training and testing datasets but missing from the list of features, such as rate, smean, dmean, and response_body_len. The list of features in UNSW-NB15 is shown in Tab. 2 [22].

Experiments and Results
This section demonstrates the experiments' phases, evaluation metrics, important features, and results.

Experiments Phases
This

Experiments Evaluation Metrics
To accurately test the efficiency of the experiments, several criteria could be used, such as Precision (P), Recall (R), True Positive Rate (TPR), False Positive Rate (FPR), False Negative Rate (FNR), and F-measure [5,27]. See Tab
Recall: the ratio of relevant results that are correctly classified (relevant retrieved) to the total number of relevant results.
F-measure: a single measure of accuracy that balances precision and recall.
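These criteria can be computed directly from the confusion-matrix counts. The sketch below uses the standard definitions, with precision taken as TP / (TP + FP) and F-measure as the harmonic mean of precision and recall. The function name and the example counts are hypothetical and are not results from this study.

```python
def safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluation_metrics(tp, fp, tn, fn):
    """Compute the evaluation criteria from confusion-matrix counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)  # recall is the same quantity as TPR
    fpr = safe_div(fp, fp + tn)
    fnr = safe_div(fn, fn + tp)
    f_measure = safe_div(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "tpr": recall,
        "fpr": fpr,
        "fnr": fnr,
        "f_measure": f_measure,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = evaluation_metrics(tp=80, fp=20, tn=90, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that TPR and FNR always sum to one, so reporting both is mainly a matter of presentation.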

Experiments Important Features
Tab. 4 demonstrates the selected features using PSO, GWO, (PSO∩GWO), and MRF. For the purposes of the experiments, only one set of features will be presented.
The MRF features are selected as follows: PSO feature reduction is repeated 30 times, and all features that appear more than 8 times are selected for the MRF experiments. This process is then repeated with GWO. The resulting MRF set contains 34 features.
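The frequency rule above can be sketched with a simple counter over the saved runs. The code below is illustrative only: the feature names and the 30 synthetic runs per algorithm are hypothetical, and combining the PSO and GWO results with a set union is an assumption, since the text does not state how the two frequency-filtered sets are merged.

```python
from collections import Counter


def most_frequent_features(runs, min_count=8):
    """Return features that appear in more than `min_count` of the runs."""
    counts = Counter(f for selected in runs for f in set(selected))
    return {f for f, c in counts.items() if c > min_count}


# Hypothetical saved results of 30 independent runs per algorithm.
pso_runs = [
    {"sttl"} | ({"rate"} if i % 3 == 0 else set())
             | ({"dur"} if i % 5 == 0 else set())
    for i in range(30)
]
gwo_runs = [
    {"sbytes"} | ({"dur"} if i % 2 == 0 else set())
    for i in range(30)
]

pso_mrf = most_frequent_features(pso_runs)  # "dur" appears only 6 times
gwo_mrf = most_frequent_features(gwo_runs)  # "dur" appears 15 times
mrf = pso_mrf | gwo_mrf                     # union is an assumption
print(sorted(mrf))
```

The threshold is strict, so a feature that appears exactly 8 times in the 30 runs of an algorithm is not selected by that algorithm.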

Conclusion
Feature selection and intrusion detection are important topics, and it is crucial to improve their methods and techniques. Many feature selection and intrusion detection methods are available, yet the traditional prevention methods have not entirely succeeded. Therefore, the need for newly emerging methods is crucial. This research proposes two models for feature selection and intrusion detection in a new manner. The proposed models are (PSO-GWO-NB) and (PSO-GWO-ANN), and their feature selection process is based on the PSO and GWO algorithms. In phase 1 of the proposed model, features are reduced using PSO and GWO, implemented with the Anaconda Python open-source software. The results of phase 1 are PSO features, GWO features, the intersection of (PSO and GWO) features, and the MRF. The number of features for (PSO∩GWO) is only 12, and for MRF is 34. In phase 2, the reduced sets of features are evaluated with the NB and ANN classifiers using the Weka open-source machine learning software. Experiments using (PSO∩GWO) and MRF features are highly acceptable and promising. (PSO-GWO-NB) gives a precision range between 74.5% and 91.2% and a recall range between 65% and 90.4%, and (PSO-GWO-ANN) provides a precision range between 81% and 88.8% and a recall range between 70.6% and 82.4%.