Intrusion Detection System Using FKNN and Improved PSO

Intrusion detection system (IDS) techniques are used in cybersecurity to protect sensitive assets. Growing network security risks can be mitigated by deploying effective IDS methods as a defense mechanism. This research presents an IDS model based on an adaptive fuzzy k-nearest neighbor (FKNN) algorithm. In this method, two parameters, the neighborhood size (k) and the fuzzy strength parameter (m), are determined adaptively using particle swarm optimization (PSO). In addition to optimizing the FKNN parameters, PSO is also used to select the feature subset used for detection. To balance the local and global search abilities of PSO, two control mechanisms, the time-varying inertia weight (TVIW) and time-varying acceleration coefficients (TVAC), were applied. In addition, the continuous and binary PSO algorithms were both executed on a multi-core platform. The proposed IDS model was compared with other state-of-the-art classifiers. The proposed methods achieved the highest classification accuracy, precision, recall, and F-score among the compared conventional algorithms in detecting all attack types in two datasets. Moreover, the proposed method obtained a large number of true positives and true negatives with a minimal number of false positives and false negatives.

Besides choosing the best algorithm, feature selection also plays a vital role in IDS by selecting a subset of attributes from the original attribute set. Feature selection is used to build the best possible learning model and has three main advantages. First, it can improve the prediction performance of the predictors; second, it can provide faster and more cost-effective predictors; and third, it can deliver a better understanding of the underlying process that generated the data [10]. In IDS, genetic algorithms (GAs) [11] are often used to choose the input features [12,13] or to determine suitable hyper-parameter values of the predictor [14,15], typically with the SVM approach [16]. Compared with the GA, the PSO approach [17] has no mutation or crossover operators. It is easy to implement, needs less processing time, and has a low computational cost. In addition, it adjusts the position and velocity of every particle so that the search is balanced between the local best and global best positions. Every particle has a strong search capability, which helps the swarm find the optimal solution.
In practice, both the PSO and GA require considerable computational time, so the computation needs to be accelerated. To speed up the search and optimization process, the binary and continuous PSO approaches can be implemented with OpenMP (Open Multi-Processing), a portable and scalable model that supports the development of parallel applications across platforms [18].
In this paper, we present an adaptive FKNN approach in which PSO determines the neighborhood size (k) and fuzzy strength parameter (m). In addition, binary PSO is used to identify the most significant features. The effectiveness of the proposed IDS model was validated by comparing it with other state-of-the-art classification models on real-life datasets. The experimental results showed that the proposed approach can obtain suitable parameter values and high discriminative power thanks to feature selection. The serial and parallel models were also compared, and we found that the parallel TVPSO-FKNN model can greatly decrease the computational run-time.
The FKNN and time-variant particle swarm optimization (TVPSO) are explained in Section 2. Section 3 discusses relevant studies of intrusion detection based on the k-nearest neighbors and PSO. Section 4 explains the proposed method and the FKNN-TVPSO algorithm. Section 5 provides the experimental outcomes and compares them with other techniques. Section 6 presents the conclusion.

Fuzzy K-Nearest Neighbor Algorithm
The KNN algorithm is one of the oldest and simplest nonparametric classification algorithms: a sample is assigned to the most common class among its k nearest neighbors. The fuzzy form of the KNN was first proposed by Keller in 1985, who integrated fuzzy logic into the KNN technique and named it the "fuzzy-KNN-classifier algorithm (FKNN)." Instead of assigning a sample to a single class, as in the KNN, the FKNN assigns fuzzy memberships of the sample to several classes with the following formulation:

$$u_i(x) = \frac{\sum_{j=1}^{k} u_{ij}\left(1/\lVert x - x_j\rVert^{2/(m-1)}\right)}{\sum_{j=1}^{k}\left(1/\lVert x - x_j\rVert^{2/(m-1)}\right)} \tag{1}$$

where $i = 1, 2, \dots, G$ indexes the classes (groups) and $j = 1, 2, \dots, k$ indexes the nearest neighbors. The fuzzy strength parameter $m$ determines how heavily the distance is weighted when computing each neighbor's contribution to the membership value. Its value is selected as $m \in (1, \infty)$.
$\lVert x - x_j\rVert$ is the distance between $x$ and its $j$th nearest neighbor $x_j$. Several distance metrics, such as the Euclidean, Mahalanobis, and Hamming distances, can be selected to measure $\lVert x - x_j\rVert$. In this research, the Euclidean distance was used. $u_{ij}$ is the membership degree of the pattern $x_j$ from the training set in class $i$, among the k nearest neighbors of $x$. There are two ways to define $u_{ij}$: the crisp membership, in which each training pattern has full membership in its known class and nonmembership in all other classes, and the fuzzy membership, in which memberships are assigned from the k nearest neighbors of each training pattern using the following equation:

$$u_{ij} = \begin{cases} 0.51 + 0.49\,(n_i/k), & \text{if } i \text{ is the labeled class of } x_j \\ 0.49\,(n_i/k), & \text{otherwise} \end{cases} \tag{2}$$

where $n_i$ is the number of the k nearest neighbors of $x_j$ that belong to the $i$th class. The memberships allocated by (2) must fulfill the following conditions, where $n$ is the number of training patterns:

$$\sum_{i=1}^{G} u_{ij} = 1, \qquad 0 < \sum_{j=1}^{n} u_{ij} < n, \qquad u_{ij} \in [0, 1]$$

In this research, the fuzzy membership was found to give excellent classification accuracy. After the memberships of a query point have been calculated, the point is assigned to the class with the highest membership value:

$$C(x) = \arg\max_{i = 1, \dots, G} u_i(x)$$
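To make the membership computation concrete, the following is a minimal sketch of FKNN classification in plain Python. It assumes the Euclidean distance and the crisp initialisation of the training memberships; the function names, the small `eps` guard against zero distances, and the toy data are illustrative assumptions rather than part of the original study.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def crisp_memberships(labels, classes):
    """Crisp initialisation: membership 1 in the labelled class, 0 elsewhere."""
    return [{c: 1.0 if c == y else 0.0 for c in classes} for y in labels]


def fknn_predict(x, train_x, train_u, k=3, m=2.0, eps=1e-12):
    """Return (predicted_class, memberships) for the query point x."""
    if m <= 1.0:
        raise ValueError("m must be greater than 1")
    # Distances to all training samples; keep the k nearest.
    nearest = sorted((euclidean(x, xj), j) for j, xj in enumerate(train_x))[:k]
    exponent = 2.0 / (m - 1.0)
    # Inverse-distance weights; eps (an assumption) guards against d == 0.
    weights = [(1.0 / (d ** exponent + eps), j) for d, j in nearest]
    total = sum(w for w, _ in weights)
    classes = list(train_u[0])
    # Weighted average of the neighbours' memberships for every class.
    u = {c: sum(w * train_u[j][c] for w, j in weights) / total for c in classes}
    # The query is assigned to the class with the highest membership.
    return max(u, key=u.get), u


# Toy data (illustrative only, not taken from NSL-KDD or CICIDS2017).
train_x = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = ["normal", "normal", "attack", "attack"]
train_u = crisp_memberships(labels, ["normal", "attack"])
predicted, memberships = fknn_predict((0.2, 0.3), train_x, train_u, k=3, m=2.0)
```

With a larger m the exponent 2/(m − 1) shrinks, so distant neighbors contribute almost as much as close ones; with m close to 1 the nearest neighbor dominates.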

Time-Variant Particle Swarm Optimization (TVPSO)
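PSO was inspired by the social behavior of organisms, for instance, fish swimming in a school and birds flocking together. It was first developed by Kennedy and Eberhart [19]. In this technique, each candidate solution is treated as a particle (unit) in d-dimensional space with a position and a velocity. The position and velocity vectors of the $i$th particle are symbolized as

$$X_i = (x_{i,1}, x_{i,2}, \dots, x_{i,d}), \qquad V_i = (v_{i,1}, v_{i,2}, \dots, v_{i,d})$$

Both of these are updated using the following equations:

$$v_{i,j}^{t+1} = w\,v_{i,j}^{t} + c_1 r_1 \left(p_{i,j} - x_{i,j}^{t}\right) + c_2 r_2 \left(p_{z,j} - x_{i,j}^{t}\right) \tag{7}$$

$$x_{i,j}^{t+1} = x_{i,j}^{t} + v_{i,j}^{t+1} \tag{8}$$

where $P_i = (p_{i,1}, p_{i,2}, \dots, p_{i,d})$ is the best previous position of the $i$th particle, $P_z = (p_{z,1}, p_{z,2}, \dots, p_{z,d})$ is the position of the best particle among all, $r_1$ and $r_2$ are random numbers in $[0, 1]$, and $v_{i,j}$ is the velocity in the $j$th dimension. The inertia weight $w$ and the acceleration coefficients $c_1$ and $c_2$ are described next.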
The inertia weight ($w$) balances global exploration and local exploitation. A large weight favors the global search, while a small weight favors the local search. During the run, $w$ is updated using (9), which is known as the "time-varying inertia weight":

$$w = w_{\min} + (w_{\max} - w_{\min})\,\frac{t_{\max} - t}{t_{\max}} \tag{9}$$

where $w_{\min}$ and $w_{\max}$ are predefined values of $w$, $t$ is the current iteration, and $t_{\max}$ is the maximum number of iterations.
The acceleration coefficients $c_1$ and $c_2$ define how strongly a particle is pulled towards its own best position and towards the global best position, respectively. The time-varying acceleration coefficients (TVAC), introduced in another study [20] to balance global exploration and local exploitation, were also adopted in this research to ensure a better solution search. TVAC can be represented mathematically by the following equations:

$$c_1 = (c_{1f} - c_{1i})\,\frac{t}{t_{\max}} + c_{1i} \tag{10}$$

$$c_2 = (c_{2f} - c_{2i})\,\frac{t}{t_{\max}} + c_{2i} \tag{11}$$

where $c_{1f}$, $c_{1i}$, $c_{2f}$, and $c_{2i}$ are constants, $t$ is the current iteration, and $t_{\max}$ is the maximum number of iterations.
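The schedules and the continuous update can be sketched together as follows. This is a minimal illustration under stated assumptions: the acceleration constants 2.5, 0.5, 0.5, and 2.5 are the ones reported in the experimental set-up, whereas the inertia bounds `w_min=0.4` and `w_max=0.9` and the velocity clamp `v_max` are common defaults chosen here for illustration and are not values taken from this study.

```python
import random


def tviw(t, t_max, w_min=0.4, w_max=0.9):
    """Time-varying inertia weight: decreases linearly from w_max to w_min."""
    return w_min + (w_max - w_min) * (t_max - t) / t_max


def tvac(t, t_max, c_i, c_f):
    """Time-varying acceleration coefficient: moves linearly from c_i to c_f."""
    return (c_f - c_i) * t / t_max + c_i


def update_particle(x, v, pbest, gbest, t, t_max, v_max=1.0, rng=random):
    """One continuous PSO step for a single particle (velocity, then position)."""
    w = tviw(t, t_max)
    c1 = tvac(t, t_max, 2.5, 0.5)   # cognitive term shrinks over time
    c2 = tvac(t, t_max, 0.5, 2.5)   # social term grows over time
    new_x, new_v = [], []
    for xj, vj, pj, gj in zip(x, v, pbest, gbest):
        r1, r2 = rng.random(), rng.random()
        vj = w * vj + c1 * r1 * (pj - xj) + c2 * r2 * (gj - xj)
        vj = max(-v_max, min(v_max, vj))   # keep velocity inside its bounds
        new_v.append(vj)
        new_x.append(xj + vj)
    return new_x, new_v


# Example step for a two-dimensional particle at the origin.
demo_x, demo_v = update_particle(
    [0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0],
    t=1, t_max=100, rng=random.Random(0),
)
```

Early in the run the large inertia weight and large c1 favor exploration around each particle's own best position; late in the run the small inertia weight and large c2 pull the swarm towards the global best.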
The binary PSO searches a discrete space in which every dimension of a particle's position is restricted to zero or one. A high velocity makes the value 1 more likely, and a low velocity makes the value 0 more likely. To map the velocity from the continuous space to the probability space, the sigmoid function is used (Eq. (12)):

$$\mathrm{sig}(v_{i,j}) = \frac{1}{1 + \exp(-v_{i,j})}, \qquad j = 1, 2, \dots, d \tag{12}$$

The new particle position is updated using the following equation:

$$x_{i,j} = \begin{cases} 1, & \text{if } rnd < \mathrm{sig}(v_{i,j}) \\ 0, & \text{otherwise} \end{cases} \tag{13}$$

where $rnd$ is a uniform random number between 0 and 1.
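As a small illustration of the binary update, the sketch below maps velocities to bit probabilities with the sigmoid and samples a feature mask. The clamp on the velocity inside `sigmoid` is an added safeguard against floating-point overflow and is an assumption of this sketch, as are the function names.

```python
import math
import random


def sigmoid(v):
    """Map a velocity to a probability in (0, 1)."""
    v = max(-60.0, min(60.0, v))   # assumption: clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-v))


def binary_position(velocities, rng=random):
    """Sample a 0/1 position: bit j is 1 with probability sigmoid(v_j)."""
    return [1 if rng.random() < sigmoid(v) else 0 for v in velocities]


# A strongly positive velocity almost surely selects the feature,
# a strongly negative one almost surely discards it.
mask = binary_position([60.0, -60.0, 0.0], rng=random.Random(1))
```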

Literature Review
The KNN has been discussed in detail in the literature [21]. The KNN model has no explicit learning phase, so it is also known as a "lazy learner": it builds no model during training and simply stores the training data. The KNN classifies new data against the already-available data based on similarity measures, typically one of three distance metrics (the Euclidean, Mahalanobis, or Hamming distance), to predict the class of an unseen data point.
A simple KNN algorithm has limitations, and an extended algorithm, the fuzzy KNN technique, was developed to handle the imbalance between normal and intrusion data by building the feature vectors in the FKNN based on clustering and the fuzzy condition of the query point [17]. The experiments showed that the fuzzy KNN model performed better than the conventional TANN (triangle area based nearest neighbors) and CANN (combined cluster centers and nearest neighbors) classifiers. According to the reported outcomes, the accuracy of the FKNN was 98.73%, compared with 98.47% for TANN and 96.07% for CANN. The intrusion detection rates of the FKNN, TANN, and CANN were 96.23%, 94.5%, and 86.05%, respectively. The third reported measure, the false alarm rate, was 0.28%, 0.4%, and 0.75% for the FKNN, TANN, and CANN, respectively.
A semisupervised methodology was implemented to lower the false alarm rate and improve the detection rate in IDS using KNN with hyper-parameter tuning and cross-validation [22]. In this approach, for every unlabeled sample, the k nearest neighbors in the training set are found, and, using the statistics obtained from the KNN hyper-parameter tuning (namely, the number of neighbors from each class, the distance weight, and the metric), the new data point is labeled as attack or normal. The NSL-KDD dataset was employed to analyze the robustness of the model.
A fusion approach was implemented for intrusion detection, combining several classifiers, including the SVM, PSO, and KNN, on the KDD99 dataset [23]. The accuracy of the SVM was 97%, that of the KNN was 98%, and that of the PSO was 99.8% (the highest) for R2L attacks. By comparison, the fusion model had an accuracy of 98.55%.
An IDS was evaluated on the KDD99 dataset using the KNN-ACO and SVM approaches, and the accuracy of the classifiers was observed [5]. The KNN-ACO approach generated fewer false alarms than the other classifiers. Its accuracy was 94.17%, compared with 93% for the backpropagation neural network (BPNN) and 83% for the SVM. The false alarm rates for KNN-ACO, BPNN, and SVM were 5.82%, 6.90%, and 16.90%, respectively.
The CANN and KNN classifiers were explained using k-means clustering on the NSL-KDD dataset [24]. The FKNN approach was used to classify the data and had a high accuracy and detection rate and a low false alarm rate.
The fuzzy c-means approach, distance-weighted KNN approach, and Dempster-Shafer theory were applied to identify unknown attacks in IDS by evaluating the functions and probabilities on the KDD99 dataset [6]. The accuracy, false alarm rate, and detection rate were measured for these KNN-based approaches, and the authors found that the fuzzy KNN performed far better than the other approaches.
A PSO-based KNN approach was implemented for the secure transfer of information from the server station to mobile devices and laptops [25]. The results showed that the PSO-KNN approach increased the accuracy by up to 2%.
A PSO-based approach using the SVM-KNN methodology was implemented for combining classifiers into an ensemble on the KDD99 dataset [26]. The distance weight method was executed using the weighted majority algorithm (WMA) to create the ensemble classifiers. The resulting ensemble classifiers outperformed other traditional algorithms.
A two-layered hybrid classification and detection process was proposed using the KDD99 dataset [27]. The first layer consists of the GBGT approach, which detects DoS attacks, while the second layer consists of a KNN classifier with feature selection, enhanced by the FOA, which separates the non-DoS data into normal, U2R, R2L, and probe traffic. The classifiers were examined based on their accuracy, recall rate, and detection rate. The KNN approach had a high recall rate on DoS attacks such as back, teardrop, smurf, neptune, and pod. The precision for the smurf and neptune attacks was greater than 99%, while that for back and pod was greater than 60%. The authors concluded that the KNN methods perform better than the SVM-ANN methods.
In the KNN-based TSA, a feature selection approach for intrusion detection was implemented to reduce feature redundancy, while the KNN was deployed for classification on the KDD99 dataset to improve the efficiency and accuracy of the IDS [28]. It was experimentally observed that the accuracy of the KNN-PSO was 87.34%, while that of the TSA-KNN was 87.34%.
A hybrid KNN approach was implemented for intrusion detection on the NSL-KDD dataset and evaluated experimentally [29,30]. The KNN approach performed far better than the other algorithms applied.
Based on the literature review above, the PSO approach combined with the fuzzy KNN method was selected because it can obtain better accuracy in IDS against various kinds of cyber attacks.

Proposed Methodology
The methodology implemented in this research is the TVPSO-FKNN [31]. With this model, the FKNN classifier is optimized automatically by tuning the k and m parameters and selecting the most discriminative feature subset. To this end, the continuous and binary PSO approaches were combined for parameter optimization and feature selection. The selected features were taken as the input to the FKNN classifier. In this section, the parameter encoding and the fitness function are discussed first. Then, the serial PSO methodology of the TVPSO-FKNN is elaborated upon, followed by its parallel implementation.

Parameter Encoding
The decision variables (fuzzy logic parameters) were encoded using binary and integer representations within the search bounds. While the search bound of the fuzzy strength is easy to represent with an integer representation, the k-neighborhood set and the feature set are lists of values that can be represented with binary values, as in Eqs. (14) and (15).

Fitness Function
The fitness function was designed to find the optimal fuzzy logic parameters, using cross-validation with 20% of the data held out as the validation set. Machine learning performance metrics (accuracy, precision, recall, and F1 score) were used to evaluate the fuzzy KNN model obtained from each combination of parameters. These metrics were calculated at every training stage on the validation set using the formulas defined in (16)-(19). Since we were concerned with reducing the false positive rate, the accuracy and the F-score were used to design the fitness function, as given by (20).
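For reference, the four metrics can be computed from the confusion counts of one class as in the sketch below, which uses their standard definitions. The function name and the example counts are illustrative assumptions, and the way accuracy and F-score are combined into the fitness value is deliberately left out because it is not restated here.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard accuracy, precision, recall and F1 from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative counts only, not results from the study.
scores = binary_metrics(tp=90, fp=10, tn=95, fn=5)
```

A low false positive count raises precision and therefore the F1 score, which is why the F-score is a natural ingredient when false alarms are a concern.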

Serial Implementation of the Proposed Approach
The serial PSO algorithm of the TVPSO-FKNN approach was constructed with the following steps.
Step 1: Encode each particle with n + 2 dimensions, where the first two dimensions are the continuous values k and m. The remaining n dimensions are the Boolean feature mask: 1 if the feature is selected and 0 if it is discarded.
Step 2: Initialize each particle with random values, and set the upper and lower velocity bounds of the PSO.
Step 3: Train the FKNN with the selected features.
Step 4: Calculate the fitness of each particle. Particles with a higher classification performance and a smaller number of selected features receive higher fitness values. The objective function is built from the accuracy, precision, recall, and F-score in (16)-(19), as given by (20).
Step 5: Increment the iteration counter.
Step 6: Increment the particle index, and update the velocity and position of k, m, and the feature mask of that particle using (7), (8), (12), and (13).
Step 7: Train the FKNN using the feature vector obtained in Step 6 and calculate the fitness value of the particle using (20).
Step 8: Update the personal best fitness value (pfit) and position (pbest) if the new fitness is better than the value stored in memory.
Step 9: When all particles in the population have been processed, move to Step 10; otherwise, repeat the process starting from Step 6.
Step 10: Update the global best fitness (gfit) and position (gbest) by comparing gfit with the pfit values of the whole population. The better value is stored in memory.
Step 11: If the stopping criteria are satisfied, proceed to Step 12; otherwise, repeat from Step 5.
Step 12: Finally, obtain the optimal parameters (k and m) and feature subset from the best particle (gbest).
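The steps above can be sketched as a compact serial loop. This is an illustration under several assumptions, not the implementation used in the study: `toy_fitness` is a placeholder objective standing in for training the FKNN and evaluating (20), the search bounds for k and m, the inertia bounds 0.4 and 0.9, and the velocity clamp of 4.0 are illustrative values, and all function names are hypothetical.

```python
import math
import random

BOUNDS = [(1.0, 15.0), (1.1, 10.0)]   # assumed search ranges for k and m


def decode(particle):
    """Split a particle into (k, m, feature_mask)."""
    return int(round(particle[0])), particle[1], [int(b) for b in particle[2:]]


def toy_fitness(k, m, mask):
    """Placeholder objective (NOT the paper's fitness): peaks at a known target."""
    target = [1, 0, 1, 0]
    match = sum(a == b for a, b in zip(mask, target)) / len(target)
    return match - 0.01 * abs(k - 5) - 0.01 * abs(m - 2.0)


def tvpso_serial(n_features=4, swarm=10, t_max=30, seed=0, fitness=toy_fitness):
    rng = random.Random(seed)
    dim = n_features + 2
    # Steps 1-2: encode and randomly initialise positions and velocities.
    pos = [[rng.uniform(*BOUNDS[0]), rng.uniform(*BOUNDS[1])]
           + [rng.randint(0, 1) for _ in range(n_features)] for _ in range(swarm)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(swarm)]
    # Steps 3-4: initial fitness, personal bests and global best.
    pbest = [p[:] for p in pos]
    pfit = [fitness(*decode(p)) for p in pos]
    g = max(range(swarm), key=pfit.__getitem__)
    gbest, gfit = pbest[g][:], pfit[g]
    for t in range(1, t_max + 1):                      # Step 5
        w = 0.4 + (0.9 - 0.4) * (t_max - t) / t_max    # TVIW
        c1 = (0.5 - 2.5) * t / t_max + 2.5             # TVAC
        c2 = (2.5 - 0.5) * t / t_max + 0.5
        for i in range(swarm):                         # Steps 6-9
            for j in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][j] + c1 * r1 * (pbest[i][j] - pos[i][j])
                     + c2 * r2 * (gbest[j] - pos[i][j]))
                v = max(-4.0, min(4.0, v))
                vel[i][j] = v
                if j < 2:   # continuous update for k and m, kept in bounds
                    lo, hi = BOUNDS[j]
                    pos[i][j] = min(max(pos[i][j] + v, lo), hi)
                else:       # binary update for the feature mask
                    pos[i][j] = 1 if rng.random() < 1 / (1 + math.exp(-v)) else 0
            f = fitness(*decode(pos[i]))               # Step 7
            if f > pfit[i]:                            # Step 8
                pfit[i], pbest[i] = f, pos[i][:]
        g = max(range(swarm), key=pfit.__getitem__)    # Step 10
        if pfit[g] > gfit:
            gfit, gbest = pfit[g], pbest[g][:]
    return decode(gbest), gfit                         # Step 12


(best_k, best_m, best_mask), best_fit = tvpso_serial()
```

In a real run, `toy_fitness` would be replaced by a function that trains the FKNN on the masked features with the decoded k and m and scores it on the validation set.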

Parallel Implementation of the Proposed Approach
To improve the optimization performance and reduce the computational time of the model, TVPSO-FKNN was implemented in parallel on a multicore processor using OpenMP. The following steps were used with the TVPSO-FKNN approach for constructing the parallel PSO.
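As a rough illustration of the parallel structure, the fitness of every particle in one generation can be evaluated concurrently, because the evaluations are independent of one another. The sketch below uses Python's standard `concurrent.futures` thread pool purely as an analogy to an OpenMP parallel loop; it is not the OpenMP code of the study, `evaluate_swarm` and `toy_fitness` are hypothetical names, and for CPU-bound fitness functions in CPython a process pool would normally be needed to obtain a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_fitness(particle):
    """Placeholder fitness (an assumption): stands in for training the FKNN."""
    return -sum((value - 1.0) ** 2 for value in particle)


def evaluate_swarm(particles, fitness, workers=4):
    """Evaluate all particles concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fitness, particles))


swarm = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.5]]
fitness_values = evaluate_swarm(swarm, toy_fitness)
best_index = max(range(len(swarm)), key=fitness_values.__getitem__)
```

Only the fitness evaluation is parallelised here; the comparison against the personal and global bests stays sequential, which keeps the update of the shared best position free of race conditions.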

Pseudocode of the parallel TVPSO-FKNN approach is as follows:
"Initializing parameters"

Dataset Description
The NSL-KDD [32] and CICIDS2017 [33] intrusion detection datasets were used to evaluate the proposed intrusion detection system. These datasets comprise 43 and 78 features, respectively, describing network traffic, and each dataset was optimized separately. The CICIDS2017 dataset contains network traffic captured over a period of five days, from Monday, 3 July 2017 to Friday, 7 July 2017, with six types of attacks plus normal network traffic. The NSL-KDD dataset contains four types of attacks plus normal network traffic. The class labels were balanced so that each class within a dataset had an equal number of samples. We collected 500 samples from the CICIDS2017 dataset and 1000 samples from the NSL-KDD dataset for each type of attack. The proposed methodology was implemented for the effective classification of every type of network traffic.

Experimental Set-Up
The proposed methodology was implemented with OpenMP and MATLAB, using the statistics, machine learning (ML), and neural network toolboxes. The search ranges of the two parameters, k and m, for both datasets were as follows: The values set for c1i, c1f, c2i, and c2f were 2.5, 0.5, 0.5, and 2.5, respectively [20]. The values for wmax and wmin were 0.

Experimental Results and Discussion
The performance of two models (fuzzy KNN-GA and fuzzy KNN-TVPSO) was evaluated on two datasets using the performance metrics defined in (16)-(19). The outcomes were compared with those of traditional ML techniques (PNN, decision tree, SVM, and KNN) [34]. The experimental results for the NSL-KDD and CICIDS2017 datasets are shown in Tabs. 1 and 2, respectively.
Our results show that the proposed model achieved the highest performance scores among the compared algorithms in detecting every type of attack on both datasets. The average performance of the proposed model and the other models was calculated as a single value and is presented in Tabs. 1 and 2. The F1-scores of the PNN, KNN, SVM, and decision tree are 49.35, 89.9, 71.39, and 92.33, respectively. The F-scores of the proposed fuzzy KNN models with TVPSO and GA were 94.25 and 92.46, respectively. The proposed approaches therefore perform better than the conventional methods. The accuracy of the proposed techniques and the other IDS models is listed in Tab. 3. The proposed algorithms (FKNN-GA and FKNN-PSO) have a high accuracy and detection rate and a lower false alarm rate. Tabs. 4 and 5 summarize the results of all the above-mentioned algorithms (PNN, SVM, decision tree, KNN, FKNN-PSO, and FKNN-GA) on the two datasets (NSL-KDD and CICIDS2017). The proposed algorithms offer high precision, recall, and F-scores. Tab. 6 shows the training and testing times on the two datasets for the proposed methods and the other classifiers. FKNN-PSO and FKNN-GA are slower at detection than the other models, which is why the proposed approaches were also implemented in parallel to reduce the processing time. Nevertheless, the proposed approaches achieve a higher performance than the other classification methods.
To analyze how often each algorithm correctly detected an attack, confusion matrices were computed [35]. Tabs. 7 and 8 show the confusion matrices for the NSL-KDD dataset with the PSO and GA optimization approaches, respectively. Tabs. 9 and 10 show the corresponding matrices for the CICIDS2017 dataset.
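As a reading aid for such matrices, the sketch below derives per-class precision and recall from a multi-class confusion matrix. It assumes that rows hold the true classes and columns the predicted classes; the function name and the small example matrix are illustrative and do not reproduce any of the tables.

```python
def per_class_scores(cm, labels):
    """Per-class precision and recall from a square confusion matrix.

    Assumes cm[r][c] counts samples of true class r predicted as class c.
    """
    n = len(labels)
    scores = {}
    for i, label in enumerate(labels):
        tp = cm[i][i]                                  # diagonal entry
        fn = sum(cm[i]) - tp                           # rest of the row
        fp = sum(cm[r][i] for r in range(n)) - tp      # rest of the column
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        scores[label] = {"precision": precision, "recall": recall}
    return scores


# Illustrative two-class matrix, not taken from the paper's tables.
example_cm = [[8, 2],
              [1, 9]]
example_scores = per_class_scores(example_cm, ["normal", "attack"])
```

Large diagonal entries relative to their row and column sums correspond to high recall and precision for that class.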
The diagonal of each matrix shows that the proposed models produced a high number of true positives and true negatives and a small number of false positives and false negatives. Figs. 1 and 2 depict the evolutionary process for fold #1, the best fold of the ten-fold cross-validation, in the FKNN-TVPSO on the NSL-KDD and CICIDS2017 datasets. These results were measured based on the global best position. The fitness of the local best positions on the training sets was measured to obtain the best fitness of the population in every generation. The fitness curves run from iteration 1 to 100 and show no major improvement after iteration 40 on NSL-KDD and after iteration 3 on CICIDS2017. The stopping criterion is 100 iterations. At the beginning of the evolution, the fitness increases rapidly, but after a certain number of iterations the increase slows, and the fitness then remains stable until the stopping criterion is reached. This illustrates that the FKNN-TVPSO can converge quickly towards the global optimum and find the solution efficiently. This behavior demonstrates the value of the TVPSO algorithm in determining the parameters (k and m) and the features of the FKNN-TVPSO.

Conclusion
This research offers a novel approach for IDS. Its main idea is to use the TVPSO algorithm to help the FKNN classifier reach the highest classification performance. The continuous TVPSO is employed to determine the parameters k and m of the FKNN, while the binary TVPSO is used to identify the most discriminative features. Both TVPSO approaches are executed in a parallel environment to decrease the processing time. The experimental outcomes show that the proposed models perform significantly better than the other state-of-the-art classifiers for IDS. The experiments also show that the parallel implementation of the FKNN-TVPSO is a strong feature selection tool for finding the most discriminative features for intrusion detection. In addition, the proposed model has a high computational efficiency.
Hence, it is concluded that the proposed FKNN-TVPSO technique performed best among the compared methods for IDS in a cybersecurity system, and that it handles the evaluated data efficiently. The parallel implementation is expected to bring a greater benefit when applied to larger intrusion detection datasets. Future analysis should focus on assessing the proposed algorithm on larger datasets.
Funding Statement: The authors received no specific funding for this study.

Conflicts of Interest: The authors state that they have no conflicts of interest to report regarding the present study.