Oppositional Coyote Optimization based Feature Selection with Deep Learning Model for Intrusion Detection in Fog-Assisted Wireless Sensor Network

Recently, Wireless Sensor Networks (WSN) and the Internet of Things (IoT) have become widespread in several real-time applications. Because IoT devices generate a huge amount of data, processing that data at the cloud server leads to high delay. To reduce the delay, a fog-assisted WSN can be developed, in which the fog nodes are kept at the edge of the network, nearer to the client. Security, however, remains a challenging issue in fog-assisted WSN, and it can be addressed by using an Intrusion Detection System (IDS). This paper presents an Oppositional Coyote Optimization based feature selection with Cat Swarm Optimization based Bidirectional Gated Recurrent Unit (OCOA-CSBiGRU) for intrusion detection in fog-assisted WSN. The aim of the OCOA-CSBiGRU technique is to identify the occurrence of intrusions in the fog-assisted WSN by the use of feature selection and classification models. The proposed OCOA-CSBiGRU technique first designs a novel OCOA-based feature selection technique for the optimal selection of features. The BiGRU model is then utilized for the detection and classification of intrusions. To improve the detection efficiency of the BiGRU model, the Cat Swarm Optimization (CSO) algorithm is utilized. A comprehensive experimental analysis is carried out on benchmark datasets, and the results indicate better outcomes of the OCOA-CSBiGRU technique over recent methods in terms of different metrics.


Introduction
The Wireless Sensor Network (WSN) is commonly employed in environment monitoring, vehicular communication, and military surveillance systems [Ahmed et al, 2020]. Owing to limited storage and computational ability, processing a considerable amount of sensory data in WSN is a serious problem [Bangui & Buhnova, 2021; Butun et al, 2020]. The Cloud Computing (CC) technique has developed significantly and has injected new vitality into WSN. CC can offer data storage and processing for conventional WSN, bringing many new applications and solutions [Chen et al, 2016; De Souza et al, 2020]. With the ever-growing popularity of the CC technique, an increasing number of users outsource their datasets to the cloud to avoid the overhead of storage and management [Diro & Chilamkurti, 2018]. In cloud storage, information is encrypted before outsourcing. Hence, data privacy is assured and becomes increasingly prominent in various aspects. Sensor-cloud is a scheme that incorporates CC and WSN. In contrast with CC, fog computing is an effective approach for WSN that extends CC to the network edge, consequently enabling new services and applications. Like the cloud, the fog can offer storage, data, application, and computational services to end users [Diro & Chilamkurti, 2018; Djenouri et al, 2019]. The fog layer connects the WSN to the cloud. Usually, the fog layer consists of robust nodes with greater processing and storage ability than normal sensors, namely mobile sinks, mobile nodes, and mobile collectors. The major difference between fog computing and CC is that fog computing can offer mobility and local storage for end users. Fig. 1 illustrates the structure of WSN.
Given the security problems in the virtual world and the new technologies of the WSN, and given the threat of infiltration of such systems [Kaur & Sood, 2019], it is important to offer an effective method to maintain security and detect intrusions. Thus, to handle attackers and intruders on computer networks and systems, various intrusion detection methodologies have been introduced, which take responsibility for monitoring the events that occur in a computer network or system. The Intrusion Detection System (IDS) is applied to detect illegal access to any system or network [Kaushik & Sinha, 202; Kumar et al, 2021]. An IDS is deployed in two main ways. First, it can be deployed at the host level on a node to monitor the operating system running on the node or the activity on its system application files. Here, a node can be a computer device or a system in IoT [Kumar et al, 202; Lawal et al, 2021]. Second, it can be deployed at the network level on a border router or gateway, where it monitors network traffic flow.
The network-level IDS (NIDS) is classified by its deployment framework and its detection method. Based on the deployment framework, the NIDS can use a distributed, hybrid, or centralized deployment approach. In centralized deployment, the NIDS is placed on a router or a dedicated host, where it monitors network traffic flows and transactions between the internet and the inside of its network. In distributed deployment, the NIDS is positioned on the network nodes, and each node monitors every network transaction. Hybrid deployment combines the distributed and centralized frameworks to reduce their shortcomings and leverage the benefits of both strategies [Li et al, 2018; Lynn, 2019]. Based on the detection method, the NIDS is usually categorized into signature-based, anomaly-based, and hybrid models. The signature-based IDS detects attacks or threats via rules on network traffic flow features or by matching stored attack signatures. The anomaly-based system employs protocol-specific, statistical, or machine data to construct a model of legitimate network traffic flow as a benchmark for its process [Maheswari & Karthika, 2021; Malik & Khamparia, 2020].
This paper presents an oppositional coyote optimization-based feature selection with cat swarm optimization-based bidirectional gated recurrent unit (OCOA-CSBiGRU) for intrusion detection in fog-assisted WSN. The aim of the OCOA-CSBiGRU technique is to identify the occurrence of intrusions in the fog-assisted WSN by the use of feature selection and classification models. The proposed OCOA-CSBiGRU technique first designs a novel OCOA-based feature selection technique for the optimal selection of features. The BiGRU model is then utilized for the detection and classification of intrusions. To improve the detection efficiency of the BiGRU model, the cat swarm optimization (CSO) algorithm is utilized. A comprehensive simulation assessment is carried out on benchmark datasets, and the results are assessed under varying aspects.

The Proposed OCOA-CSBiGRU Model
In this study, a novel OCOA-CSBiGRU technique has been developed for the identification of intrusions in the fog-assisted WSN. The proposed OCOA-CSBiGRU technique comprises OCOA-based feature selection, BiGRU-based classification, and CSO-based hyperparameter optimization. Using the OCOA and CSO algorithms helps to maximize intrusion detection outcomes in the fog-assisted WSN. Fig. 2 demonstrates the overall process of the OCOA-CSBiGRU technique.

Design of the OCOA-FS Technique
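First, the network data is fed into the OCOA-FS technique for the selection of optimal feature subsets. The COA is a meta-heuristic optimization approach that belongs to both the evolutionary and swarm families [Omar et al, 2021]. It is inspired by the Canis latrans species. The social behavior of coyote $c$ in group $p$ at time $t$ is taken as a set of design parameters:
$$soc_c^{p,t} = \vec{x} = (x_1, x_2, \ldots, x_D)$$
The coyote's adaptation to the environment is taken as the fitness cost function. Initially, the agents (coyotes) are randomly generated within the search space:
$$soc_{c,j}^{p,t} = lb_j + r_j \, (ub_j - lb_j)$$
where $lb_j$ and $ub_j$ represent the lower and upper limits of the parameter $j$, and $r_j$ denotes an arbitrary value within [0,1].
Initially, the coyotes are arbitrarily assigned to groups, but they occasionally move from one group to another. A coyote leaves its group with a probability $P_e$, as follows:
$$P_e = 0.005 \, N_c^2$$
where $N_c$ denotes the number of coyotes in a group. This mechanism helps to exchange culture among the groups. The leader of the group, or alpha coyote, is the coyote best adapted to the environment:
$$alpha^{p,t} = \{\, soc_c^{p,t} \mid \arg\min_{c \in \{1, \ldots, N_c\}} f(soc_c^{p,t}) \,\}$$
The COA shares information among the coyotes of a group. In this regard, the cultural tendency of the coyotes within the group is computed as:
$$cult_j^{p,t} = \begin{cases} O_{\frac{N_c+1}{2},\,j}^{p,t}, & N_c \text{ is odd} \\ \dfrac{O_{\frac{N_c}{2},\,j}^{p,t} + O_{\frac{N_c}{2}+1,\,j}^{p,t}}{2}, & \text{otherwise} \end{cases}$$
where $O^{p,t}$ represents the ranked social conditions of the coyotes of group $p$ at time $t$ for the parameter $j$. The COA also considers the life cycle of coyotes, namely birth and death:
$$pup_j^{p,t} = \begin{cases} soc_{r_1,j}^{p,t}, & rnd_j < P_s \text{ or } j = j_1 \\ soc_{r_2,j}^{p,t}, & rnd_j \geq P_s + P_a \text{ or } j = j_2 \\ R_j, & \text{otherwise} \end{cases}$$
where $r_1$ and $r_2$ represent arbitrary coyotes from group $p$, $j_1$ and $j_2$ denote arbitrary parameters, $P_s$ and $P_a$ denote the scatter and association probabilities, $rnd_j$ is a random value within [0,1], and $R_j$ is a random value within the bounds of parameter $j$. These probabilities govern the cultural diversity of the coyotes in the group and are defined as follows:
$$P_s = \frac{1}{D}, \qquad P_a = \frac{1 - P_s}{2}$$
where $D$ represents the dimension of the parameters.
The social behavior of a coyote is updated by the alpha coyote and the group influence:
$$new\_soc_c^{p,t} = soc_c^{p,t} + r_1 \, \delta_1 + r_2 \, \delta_2, \qquad \delta_1 = alpha^{p,t} - soc_{cr_1}^{p,t}, \quad \delta_2 = cult^{p,t} - soc_{cr_2}^{p,t}$$
where $cr_1$ and $cr_2$ are randomly chosen coyotes of the group, and $r_1$ and $r_2$ represent arbitrary values within [0,1] that weight the alpha and the group influence. The fitness value of the updated coyote is estimated by the following:
$$new\_fit_c^{p,t} = f(new\_soc_c^{p,t})$$
The new social behavior is kept only when it is superior to the previous one:
$$soc_c^{p,t+1} = \begin{cases} new\_soc_c^{p,t}, & new\_fit_c^{p,t} < fit_c^{p,t} \\ soc_c^{p,t}, & \text{otherwise} \end{cases}$$
Finally, the social behavior that is best adapted to the environment is selected as the optimal solution.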
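The social-behavior update described above (movement toward the alpha coyote and toward the cultural tendency of the group, followed by greedy acceptance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `coa_pack_update`, the use of the per-dimension median as cultural tendency, and the minimization convention are choices made here for illustration.

```python
import numpy as np

def coa_pack_update(soc, fitness_fn, rng):
    """One COA social update of a single group (hypothetical helper).

    soc        : array of shape (n_coyotes, n_dims), one row per coyote
    fitness_fn : maps a 1-D position to a cost (lower is better, assumed)
    rng        : numpy.random.Generator
    """
    n, _ = soc.shape
    fit = np.array([fitness_fn(s) for s in soc])
    alpha = soc[np.argmin(fit)]        # best-adapted coyote of the group
    cult = np.median(soc, axis=0)      # cultural tendency (median per dimension)
    new_soc = soc.copy()
    for c in range(n):
        # Two distinct random coyotes of the same group
        cr1, cr2 = rng.choice(n, size=2, replace=False)
        delta1 = alpha - soc[cr1]      # influence of the alpha
        delta2 = cult - soc[cr2]       # influence of the group
        r1, r2 = rng.random(2)
        candidate = soc[c] + r1 * delta1 + r2 * delta2
        # Keep the new social behavior only if it is better than the old one
        if fitness_fn(candidate) < fit[c]:
            new_soc[c] = candidate
    return new_soc
```

Because of the greedy acceptance step, no coyote's cost can get worse after one call, which is a convenient property to check when experimenting with the update.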
To improve the effectiveness of the COA, population initialization using opposition-based learning (OBL) is adopted in designing the OCOA [Pacheco et al, 2020]. OBL is an optimization strategy used in many studies to improve the quality of the initial population of solutions by diversifying it. The OBL method searches in two directions of the search space: one direction is given by the original solution, and the other by its opposite solution. Finally, the OBL approach keeps the fitter of the two solutions.
Opposite number: let $x$ be a real number on the interval $x \in [a, b]$. The opposite number of $x$ is represented as $\tilde{x}$, and its value is determined using Eq. (15):
$$\tilde{x} = a + b - x \quad (15)$$
Eq. (15) can be generalized to a multi-dimensional search space. To do so, the position of each search agent and its opposite position are represented by Eqs. (16) and (17):
$$x = [x_1, x_2, \ldots, x_D] \quad (16)$$
$$\tilde{x} = [\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_D] \quad (17)$$
The value of each element of $\tilde{x}$ is defined using Eq. (18):
$$\tilde{x}_j = a_j + b_j - x_j, \quad j = 1, 2, \ldots, D \quad (18)$$
Optimization based on the opposite population: in this approach, the fitness function (FF) is $f(\cdot)$. Hence, if the fitness value $f(\tilde{x})$ of the opposite solution is better than $f(x)$ of the original solution $x$, then $x = \tilde{x}$; otherwise $x = x$.
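A minimal sketch of opposition-based initialization follows: each random candidate is compared with its opposite point (lower bound plus upper bound minus the candidate), and the fitter of the two is kept. The function name `obl_initialize` and the minimization convention are assumptions made for this illustration.

```python
import numpy as np

def obl_initialize(pop_size, lower, upper, fitness_fn, rng):
    """Opposition-based population initialization (hypothetical helper).

    lower, upper : per-dimension bounds of the search space
    fitness_fn   : maps a 1-D position to a cost (lower is better, assumed)
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    # Random population inside the bounds
    pop = lower + rng.random((pop_size, lower.size)) * (upper - lower)
    # Opposite population: element-wise a_j + b_j - x_j
    opp = lower + upper - pop
    fit_pop = np.array([fitness_fn(x) for x in pop])
    fit_opp = np.array([fitness_fn(x) for x in opp])
    # Keep the opposite point wherever it is strictly fitter
    keep_opp = fit_opp < fit_pop
    return np.where(keep_opp[:, None], opp, pop)
```

The opposite of a point inside the bounds is also inside the bounds, so no extra clipping is needed after this step.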
The OCOA-FS technique is mathematically formulated as follows. Generally, the data to be classified has a size of $N_S \times N_F$, where $N_S$ and $N_F$ denote the number of samples and features, respectively. The main aim of the FS problem is to choose a feature subset $S$ from the available features ($N_F$), where the size of $S < N_F$. This is accomplished by minimizing the objective function given below:
$$Fit = \lambda \cdot \gamma_S + (1 - \lambda) \cdot \frac{|S|}{N_F}$$
where $\gamma_S$ denotes the classifier error rate obtained by the use of $S$ and $|S|$ represents the number of chosen features. $\lambda$ is used for balancing $|S| / N_F$ and $\gamma_S$.
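The feature selection objective can be illustrated with a short sketch. It assumes the weighted-sum form that is common in wrapper-based feature selection, in which a balancing weight trades the classifier error rate against the fraction of selected features. The function name `fs_fitness` and the default weight of 0.99 are assumptions made here and are not taken from the paper.

```python
def fs_fitness(error_rate, n_selected, n_total, lam=0.99):
    """Weighted feature-selection objective to be minimized (hypothetical helper).

    error_rate : classifier error rate in [0, 1] on the selected subset
    n_selected : number of chosen features
    n_total    : number of available features
    lam        : balancing weight in [0, 1] (0.99 is an assumed default)
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    # Lower error and fewer features both reduce the objective
    return lam * error_rate + (1.0 - lam) * (n_selected / n_total)
```

For two subsets with the same error rate, the one with fewer features gets the lower objective value, which is what steers the search toward compact subsets.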

BiGRU-based Intrusion Detection Model
In the intrusion detection and classification stage, the chosen feature subsets are passed into the BiGRU model. Because of the complex structure of the LSTM unit, it suffers from long training times. The GRU memory unit merges the input gate $i$ and the forget gate $f$ of the LSTM into an update gate $z$, which resolves the long-dependence problem and retains significant features, while its structure is simpler than that of the LSTM [Pierezan & Coelho, 2018]. At time $t$, for an input $x_t$, the hidden layer of the GRU outputs $h_t$ as follows:
$$z_t = \sigma(W_z \cdot [h_{t-1}, x_t])$$
$$r_t = \sigma(W_r \cdot [h_{t-1}, x_t])$$
$$\tilde{h}_t = \tanh(W_h \cdot [r_t \odot h_{t-1}, x_t])$$
$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$$
where $\sigma$ and $\tanh$ denote the activation functions, $W$ represents the weight matrix connecting the two layers, and $z_t$ and $r_t$ indicate the update and reset gates, respectively.
For sequence problems, the typical RNN only exploits the preceding information of the forward input sequence. To address this, the bidirectional RNN (BRNN) was introduced, which memorizes both the preceding and the subsequent information. The fundamental concept is to use two RNNs to process the forward and reverse sequences, respectively. Their outputs are then connected to the same output layer. Thus, bidirectional contextual information for the feature sequence can be captured. The BiGRU is obtained by replacing the hidden layers in the BRNN with GRU memory units. For an $m$-dimensional input $(x_1, x_2, \ldots, x_m)$, the hidden neuron of the BiGRU outputs $h_t$ at time $t$:
$$\overrightarrow{h}_t = \sigma(W_{x\overrightarrow{h}} \, x_t + W_{\overrightarrow{h}\overrightarrow{h}} \, \overrightarrow{h}_{t-1} + b_{\overrightarrow{h}})$$
$$\overleftarrow{h}_t = \sigma(W_{x\overleftarrow{h}} \, x_t + W_{\overleftarrow{h}\overleftarrow{h}} \, \overleftarrow{h}_{t+1} + b_{\overleftarrow{h}})$$
$$h_t = \overrightarrow{h}_t \oplus \overleftarrow{h}_t$$
where $W$ represents the weight matrix connecting the two layers, $b$ denotes the bias vector, $\sigma$ is the activation function, $\overrightarrow{h}_t$ and $\overleftarrow{h}_t$ indicate the outputs of the forward and backward GRU, respectively, and $\oplus$ represents the element-wise sum.
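To make the data flow concrete, the following NumPy sketch runs a GRU over a sequence in both directions and combines the two hidden states by element-wise sum. It is an untrained forward pass under stated assumptions: the helper names (`init_gru`, `gru_step`, `run_gru`, `bigru`), the random weight initialization, the omission of bias terms, and the zero initial state are all choices made for this illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gru(n_in, n_hid, rng):
    """Random weights for the update (z), reset (r), and candidate (h) parts."""
    shape = (n_hid, n_hid + n_in)      # each matrix acts on [h_prev, x_t]
    return {g: rng.normal(0.0, 0.1, shape) for g in ("z", "r", "h")}

def gru_step(p, x_t, h_prev):
    """One GRU step: gates computed from the concatenation [h_prev, x_t]."""
    hx = np.concatenate([h_prev, x_t])
    z = sigmoid(p["z"] @ hx)           # update gate
    r = sigmoid(p["r"] @ hx)           # reset gate
    h_cand = np.tanh(p["h"] @ np.concatenate([r * h_prev, x_t]))
    return (1.0 - z) * h_prev + z * h_cand

def run_gru(p, xs):
    """Run a GRU over xs of shape (T, n_in); returns states of shape (T, n_hid)."""
    h = np.zeros(p["z"].shape[0])
    states = []
    for x_t in xs:
        h = gru_step(p, x_t, h)
        states.append(h)
    return np.stack(states)

def bigru(p_fwd, p_bwd, xs):
    """Forward and backward passes combined by element-wise sum."""
    fwd = run_gru(p_fwd, xs)
    # Process the reversed sequence, then flip back to align time steps
    bwd = run_gru(p_bwd, xs[::-1])[::-1]
    return fwd + bwd
```

In practice a deep learning framework would provide trained GRU layers and a bidirectional wrapper; the sketch only shows how the two directions are aligned and merged.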

Design of CSO-based Hyperparameter Tuning
To optimally tune the hyperparameters of the BiGRU model, the CSO algorithm is employed [Prabavathy et al, 2018]. CSO is an optimization technique from the domain of swarm intelligence (SI) [Rahman et al, 2020]. The CSO technique models the behavior of cats as two modes: the 'seeking mode' and the 'tracing mode'. A swarm consists of an initial population of particles that search the solution space; for example, simulating birds, ants, and bees yields PSO, ACO, and BCO, respectively. In CSO, cats are used as the particles to solve the problem. Each cat has its own position composed of $M$ dimensions, a velocity for each dimension, a fitness value that represents how well the cat fits the FF, and a flag that identifies whether the cat is in seeking or tracing mode. The final solution is the best position of one of the cats. The CSO keeps the best solution until the last iteration [Rath & Misra, 2018]. The two modes are explained below.
The seeking mode models the behavior of cats while they are resting but alert. It is a time for thinking and deciding on the next move. This mode has four important parameters: self-position consideration (SPC), seeking memory pool (SMP), counts of dimension to change (CDC), and seeking range of the selected dimension (SRD). The procedure of the seeking mode is as follows:
Step_1: Generate $j$ copies of the present position of $cat_k$, where $j$ = SMP. If the value of SPC is true, let $j$ = (SMP − 1) and retain the present position as one of the candidates.
Step_2: For each copy, according to CDC, randomly add or subtract SRD percent of the present value and replace the old value.
Step_3: Calculate the fitness values (FS) of all candidate points.
Step_4: If the FS values are not all exactly equal, compute the selecting probability of each candidate point by Eq. (27); otherwise, set the selecting probability of every candidate point to 1.
Step_5: Randomly pick the point to move to from the candidate points, and replace the position of $cat_k$.

$$P_i = \frac{|SSE_i - SSE_{max}|}{SSE_{max} - SSE_{min}} \quad (27)$$
When the aim of the FF is to find the minimal solution, the reference value in the numerator of Eq. (27) is $SSE_{max}$; otherwise, it is $SSE_{min}$. The tracing mode is the second mode of the technique. In this mode, the cat traces targets such as food. The procedure of the tracing mode is as follows:
Step_1: Update the velocity for every dimension based on Eq. (28).
Step_2: Check whether the velocity is within the range of the maximal velocity. If a new velocity is over range, it is set equal to the limit.
Step_3: Update the position of $cat_k$ using the new velocity.
$$v_{k,d} = v_{k,d} + r_1 \, c_1 \, (x_{best,d} - x_{k,d}) \quad (28)$$
$$x_{k,d} = x_{k,d} + v_{k,d}$$
where $x_{best,d}$ refers to the position of the cat with the best fitness value, $x_{k,d}$ signifies the position of $cat_k$, $c_1$ is an acceleration coefficient that extends the velocity of the cat as it moves in the solution space and is generally equal to 2.05, and $r_1$ is a random value uniformly generated in the range [0, 1]. The CSO algorithm derives an FF to attain higher classification performance. It yields a positive value that represents the quality of a candidate solution. Here, the classification error rate to be minimized is taken as the FF in Eq. (31). A good solution has a low error rate, and a poor solution has a high error rate.
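The two CSO modes can be sketched as below. This is an illustrative reading of the seeking and tracing steps rather than the authors' code: the function names, the default parameter values other than the acceleration coefficient 2.05, the use of one random factor per dimension, the multiplicative form of the SRD perturbation, and the minimization convention are all assumptions.

```python
import numpy as np

def seeking_mode(cat, fitness_fn, rng, smp=5, srd=0.2, cdc=0.8, spc=True):
    """Seeking mode for one cat; returns its new position (hypothetical helper)."""
    cat = np.asarray(cat, dtype=float)
    d = cat.size
    n_copies = smp - 1 if spc else smp
    copies = np.tile(cat, (n_copies, 1))
    n_change = max(1, int(round(cdc * d)))     # dimensions mutated per copy
    for c in copies:
        dims = rng.choice(d, size=n_change, replace=False)
        signs = rng.choice([-1.0, 1.0], size=n_change)
        c[dims] *= 1.0 + signs * srd           # add or subtract SRD percent
    if spc:
        copies = np.vstack([copies, cat])      # current position is a candidate
    fs = np.array([fitness_fn(c) for c in copies])
    if fs.max() == fs.min():
        probs = np.ones(len(copies))           # all candidates equally likely
    else:
        # Minimization: the worst candidate gets probability 0, the best gets 1
        probs = np.abs(fs - fs.max()) / (fs.max() - fs.min())
    probs = probs / probs.sum()
    return copies[rng.choice(len(copies), p=probs)]

def tracing_mode(cat, velocity, best, rng, c1=2.05, v_max=1.0):
    """Tracing mode for one cat; returns (new_position, new_velocity)."""
    cat = np.asarray(cat, dtype=float)
    r1 = rng.random(cat.size)                  # one random factor per dimension
    velocity = velocity + r1 * c1 * (np.asarray(best, dtype=float) - cat)
    velocity = np.clip(velocity, -v_max, v_max)  # keep velocity within limits
    return cat + velocity, velocity
```

For hyperparameter tuning, each cat position would encode one candidate set of BiGRU hyperparameters and `fitness_fn` would return the resulting classification error rate; that mapping is not specified in the text and is left open here.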

Performance Evaluation
This section verifies the accuracy and reliability of the proposed scheme through simulation and comparison of the performance with several well-known schemes.

Simulation Environment
This section analyzes the results of the OCOA-CSBiGRU technique against recent approaches under several aspects.
The experimental analysis of the OCOA-CSBiGRU technique is carried out using the KDDCup99 dataset [source], which holds 125973 instances with 41 features and 2 classes. Fig. 3 shows the FS results of the OCOA-FS technique compared with other techniques. From the figure, it is evident that the GA-FS and BGOA-V techniques perform poorly, selecting a higher number of features. The TLBO-FS, BGOA-S, and BGOA techniques select a moderately reduced number of features. However, the OCOA-FS technique attains improved performance with fewer features.

Fig. 4. BC analysis of OCOA-FS technique
Fig. 4 demonstrates the BC examination of the OCOA-FS and existing techniques. The results indicate that the OCOA-FS technique obtains effective outcomes with the least BC of 0.000936, whereas the TLBO-FS, GA-FS, BGOA-S, BGOA-V, and BGOA techniques attain higher BC values of 0.001108, 0.001150, 0.004176, 0.006763, and 0.006530, respectively.

Table 1 Classification results of the OCOA-CSBiGRU Technique under distinct epochs
Table 1 and Fig. 5 offer a detailed result analysis of the OCOA-CSBiGRU technique under distinct epochs and classes. The results demonstrate that the OCOA-CSBiGRU technique achieves enhanced classifier results in terms of different measures. Under 100 epochs, the OCOA-CSBiGRU technique obtains average values of 99.975%, 99.993%, 99.790, 99.996%, and 99.991% on the five measures reported in Table 1. Under 200 epochs, the corresponding average values are 99.989%, 99.987%, 99.616, 99.956%, and 99.541%. Under 300 epochs, they are 99.940%, 99.979%, 99.851, 99.961%, and 99.626%. Under 400 epochs, they are 99.847%, 99.955%, 99.837, 99.933%, and 99.636%.
The accuracy analysis of the OCOA-CSBiGRU approach on the test data is demonstrated in Fig. 8. The results show that the OCOA-CSBiGRU system achieves higher validation accuracy than training accuracy. It is also noticeable that the accuracy values saturate as the number of epochs increases. The loss analysis of the OCOA-CSBiGRU algorithm on the test data is demonstrated in Fig. 9. The results reveal that the OCOA-CSBiGRU method yields a lower validation loss than training loss. It can additionally be observed that the loss values saturate as the number of epochs increases.
To demonstrate the enhanced outcomes of the OCOA-CSBiGRU over other techniques, a detailed accuracy analysis is made in Fig. 10 [Shafi et al, 2018; Vinoth Kumar et al, 2020; Vikram & Sinha, 2021; Yang et al, 2019]. The results show that the CS-PSO algorithm yields the lowest accuracy of 0.7551. The COA-IDS and DNN-SVM techniques attain improved accuracy values of 0.9688 and 0.9203, respectively. The DBN, MLIDS, PSO-SVM, and Behaviour-IDS techniques obtain further improved accuracy values of 0.9996, 0.9993, 0.9910, and 0.9889, respectively. However, the OCOA-CSBiGRU technique accomplishes superior results with the maximum accuracy of 0.9999. From the abovementioned results, it is evident that the OCOA-CSBiGRU technique outperforms the recent methods in several aspects.
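For readers who want to reproduce this kind of comparison, the sketch below shows how accuracy, precision, recall, F-score, and the error rate in percent can be derived from binary predictions. The function name `binary_metrics` and the label encoding (1 for intrusion, 0 for normal traffic) are assumptions made for illustration; the exact set of measures used in the paper's tables is not fully recoverable from the text.

```python
def binary_metrics(y_true, y_pred):
    """Standard measures from binary labels (1 = intrusion, 0 = normal; assumed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        # Misclassified share in percent, usable as a fitness to minimize
        "error_rate_pct": 100.0 * (fp + fn) / total if total else 0.0,
    }
```

The `error_rate_pct` entry is the quantity an optimizer would minimize when classification error is used as the fitness function.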

Conclusions
In this study, a novel OCOA-CSBiGRU technique has been developed to identify intrusions in the fog-assisted WSN. The proposed OCOA-CSBiGRU technique comprises OCOA for the choice of features, a BiGRU classifier, and a CSO hyperparameter optimizer. Using the OCOA and CSO algorithms helps to maximize intrusion detection outcomes in the fog-assisted WSN. A comprehensive result analysis is carried out on benchmark datasets, and the results are assessed under varying aspects. The comparative result analysis indicates better outcomes of the OCOA-CSBiGRU technique over recent methods in terms of different metrics. In the future, outlier removal approaches can be integrated with metaheuristics to improve security and reduce complexity.

Fig. 1. WSN structure
$$fitness(x_i) = ClassifierErrorRate(x_i) = \frac{\text{number of misclassified instances}}{\text{total number of instances}} \times 100 \quad (31)$$
Fig. 6 investigates the performance of the OCOA-CSBiGRU technique against existing ones in terms of precision and recall. The figure shows that the Deep Learning, DPCDBN, and AKNN techniques reach poor results with minimal values of precision and recall. The C4.5, AdaBoost, and T-SID techniques accomplish moderately closer values of precision and recall. Although the DBN model yields near-optimal outcomes with precision and recall of 0.9994 and 0.9997, the presented OCOA-CSBiGRU technique accomplishes better outcomes with the higher precision and recall of 0.9998 and 0.9999.

Fig. 7
Fig. 7 examines the performance of the OCOA-CSBiGRU system against other ones in terms of two further measures. The results show that the Deep Learning, DPCDBN, and AKNN approaches have the worst outcomes, with minimal values on both measures. The C4.5, AdaBoost, and T-SID techniques accomplish reasonably closer values. The DBN system yields near-optimal outcomes with values of 0.9995 and 0.9996, while the presented OCOA-CSBiGRU algorithm attains competitive outcomes with values of 0.9983 and 0.9999.

Table 2 Comparative classifier results of the OCOA-CSBiGRU Technique
Fig. 6. Precision and recall analysis of the OCOA-CSBiGRU technique