A Cuckoo Search Detector Generation-based Negative Selection Algorithm

The negative selection algorithm (NSA) is an adaptive technique inspired by how the biological immune system discriminates self from non-self. It is one of the most important algorithms of the artificial immune system. A key element of the NSA is its heavy dependence on randomly generated detectors to monitor for abnormalities. However, these detectors have limited performance: redundant detectors are generated, which makes it difficult for the detector set to effectively occupy the non-self space. To alleviate this problem, we propose using the nature-inspired metaheuristic cuckoo search (CS), a stochastic global search algorithm, to improve the random generation of detectors in the NSA. Inbuilt characteristics such as mutation, crossover, and selection operators enable the CS to attain global convergence. With the use of Lévy flight and a distance measure, efficient detectors are produced. Experimental results show that integrating CS into the negative selection algorithm elevated the detection performance of the NSA, with an average increase of 3.52% in detection rate on the tested datasets. The proposed method outperforms the other models, achieving detection rates of 98% and 99.29% on Fisher's IRIS and Breast Cancer datasets, respectively. Thus, the highest detection rates and lowest false alarm rates among the compared methods are achieved.


Introduction
The biological immune system (BIS), a unique, powerful, and orchestrated system against the influx of pathogens, viruses, and bacteria, protects the body from being damaged and infected. The BIS handles this process through the recognition and detection of foreign elements (non-self), thereby causing their annihilation. The white blood cells (lymphocytes), that is, the B-cells and T-cells, are responsible for handling these detection and elimination processes in the body. The main purpose is to distinguish what is self from what is non-self [1]. A process in the body known as negative selection depends solely on the maturation of T-cells in the thymus. This occurs by eliminating T-cells that react to self-cells. Nonreactive T-cells circulate through the body to detect foreign cells. An artificial immune system (AIS) algorithm that mimics the negative selection process was proposed and developed by Forrest et al. [2], and is referred to as the negative selection algorithm (NSA). It is based on binary representation. Real-valued representations, namely the real-valued negative selection algorithm (RNSA) with constant-sized detectors [3] and the real-valued negative selection algorithm with variable-sized detectors (V-Detectors) [4], have also been proposed. Their usage has flourished in diverse application areas such as anomaly detection [5,6], data classification and fault diagnosis [7,8], path testing [9], as well as hardware tolerance [10]. Other prominent AIS algorithms are artificial immune network algorithms [11], clonal selection algorithms [12,13], and dendritic cell algorithms [14,15].
Despite the success of the NSA in different application domains, it has deficiencies and drawbacks, which are attributed to its random detectors [16,17]. The issue with randomly generated detectors lies in efficiently generating detectors that effectively occupy the non-self space [18]. Thus, there is no assurance of adequate coverage by the random detectors. This problem prompted the present research, which aims to provide a suitable solution to this issue with the negative selection algorithm. Therefore, the present research focuses on the nature-inspired metaheuristic cuckoo search (CS) algorithm [19,20] for the optimization of the randomly generated detector set of the NSA. CS is integrated with the NSA and makes use of Lévy flight to search effectively and thus increase the NSA's overall performance. The V-Detectors variant is used in this research.
The organization of this article is as follows. Section 2 reviews related improvements on the negative selection algorithm. Detailed in Section 3 is the proposed cuckoo search algorithm as utilized in the optimization of the negative selection algorithm. Experiments are presented in Section 4. A conclusion and directions for future work are provided in Section 5.

Related Improvements
Various distinctive improvements have been proposed to optimize the random detectors of the negative selection algorithm. Completely different solutions for the generation of more robust detectors have also been explored. Fruit fly optimization (FFO) and k-means clustering were used as a replacement for the random detectors of the NSA in email spam detection. The k-means algorithm clusters the self-set, which serves as the initial population of FFO [21], and FFO acts as a mechanism for restructuring the random detectors. A penalty factor (PF) was used to improve the NSA, resulting in the algorithm termed NSAPF [22]. The dangerous malware signatures that would have been discarded through matching with self are instead penalized and kept in a library. NSAPF demonstrates its ability to detect malware with low false positives. PSO-DENSA and MPSO-DENSA were also developed [23]. They are enhancements of the distribution estimation-based negative selection algorithm (DENSA), which depends on the Gaussian mixture model (GMM) to produce flexible and efficient boundaries for self-samples, thereby aiding in the distribution of detectors in the non-self space. Experiments confirm that PSO-DENSA and MPSO-DENSA have good detection capability. For predicting crude oil prices, fuzzy rough set feature selection was combined with the NSA [24]. Moreover, to detect email spam, the random detectors of the NSA were replaced with a PSO detector generation procedure [25]. This led to improved accuracy when a local outlier factor (LOF) was used as a fitness function.
Relying on the NSA, an efficient proactive artificial immune system for anomaly detection and prevention (EPAADPS) was proposed [26]. To generate efficient detectors, a self-tuning of detectors and detector power of the NSA was implemented. The EPAADPS performed better than the NSA in experiments. The subspace density technique was used to improve NSA in generating optimal detectors [27], and a dual NSA algorithm was proposed to produce potent and mature detectors for network anomaly detection [28]. The theory of Delaunay triangulation integrated with the negative selection algorithm (described as ASTC-RNSA) was proposed in Ref. [29]. This algorithm relies on computational geometry in partitioning self-space, and is superior to V-Detectors and RNSA. An improved NSA was proposed also for diagnosing faults in wind turbine gearboxes [30]. Finally, by adopting antigen density clustering, a large reduction in the randomness of NSA detectors was achieved [31], along with better performance compared to other algorithms.

The Proposed Algorithm
The detector generation scheme of the real-valued negative selection algorithm with variable-sized detectors (V-Detectors) will play a crucial role in obtaining adequate performance stability and efficiency. The production of detectors is by random acquisition; however, covering non-self space effectively is not guaranteed. The cuckoo search (CS) algorithm is introduced to improve the quality of the V-Detectors' detectors, which are referred to as CS-V-Detectors. The mutation, crossover, and selection operators enable the CS to attain global convergence and optimality. By undergoing these processes, the best candidate detectors are produced, and ultimately enhance the traditional random generation of detectors. A fitness function is needed to acquire potent detectors. This function is dependent on the Euclidean distance between two overlapping detectors. The implementation of the proposed algorithm is detailed next.

Detector Generation through Cuckoo Search Algorithm
The cuckoo search (CS) algorithm is a population-based stochastic global search algorithm. The main steps of detector generation with CS are enumerated below and summarized in Algorithm 1.
The generation of detectors with CS begins by initializing a random population of detectors using a lower bound and an upper bound based on the design variables. The random detectors are uniformly distributed. The $j$th attribute of the $i$th candidate detector is produced by Eq. (1),
$$x_i^j = x_{LB}^j + \mathrm{rand} \cdot \left(x_{UB}^j - x_{LB}^j\right) \tag{1}$$
where $x_{LB}^j$ is the lower bound of the $j$th attribute, $x_{UB}^j$ is the upper bound of the $j$th attribute, rand is a uniformly distributed random number in the range [0,1], $i = \{1,2,3,\ldots,n\}$ with $n$ as the population size, and $j = \{1,2,3,\ldots,D\}$ with $D$ as the detector's dimension.

Algorithm 1: Cuckoo Search Detector Generation of the V-Detectors
Input: initial population size $n$ and probability $p_a$
Objective function $f(x)$;
…
Keep the optimal detector solutions;
end
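As an illustration of the initialization in Eq. (1), the following minimal Python sketch draws each attribute uniformly between its bounds. The function name `init_detectors` and the unit-hypercube bounds in the example are assumptions made for illustration; they are not taken from the original method description.

```python
import random

def init_detectors(n, lower, upper, rng=None):
    """Draw n candidate detector centers uniformly within per-attribute bounds."""
    rng = rng or random.Random()
    # x_i^j = x_LB^j + rand * (x_UB^j - x_LB^j), with rand ~ U[0, 1)
    return [
        [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
        for _ in range(n)
    ]

# Hypothetical example: 5 two-dimensional detectors in the unit square.
population = init_detectors(5, [0.0, 0.0], [1.0, 1.0], random.Random(0))
```

Passing a seeded `random.Random` instance makes the draw reproducible, which is convenient when comparing detector sets across runs.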
Upon population initialization and identification of the best candidate detector $x_{best}^j$ from the randomly generated detectors via the fitness function, an iterative process based on Lévy flight is used to stochastically generate new candidate detectors (Eq. (2)),
$$x_i^j(g+1) = x_i^j(g) + \alpha \oplus \text{Lévy}(\lambda) \tag{2}$$
where $\alpha > 0$ is the step size, which is related to the scale of the problem of interest, $g$ is the number of the current generation (only one generation is used here), and $\oplus$ denotes entry-wise multiplication. A random walk propels the movement of the Lévy flight, while the step length is drawn from a Lévy distribution. This is illustrated in Eq. (3),
$$\text{Lévy} \sim u = t^{-\lambda}, \quad 1 < \lambda \le 3 \tag{3}$$
The steps form a random walk process with a power-law step length distribution. Mantegna's algorithm [32] is used to calculate the step length $s$, and produces a symmetric Lévy stable distribution (Eq. (4)),
$$s = \frac{u}{|v|^{1/\beta}} \tag{4}$$
where $u$ and $v$ are drawn from normal distributions with zero mean and variances $\sigma_u^2$ and $\sigma_v^2$, respectively. That is,
$$u \sim N(0, \sigma_u^2), \quad v \sim N(0, \sigma_v^2) \tag{5}$$
where
$$\sigma_u = \left\{ \frac{\Gamma(1+\beta)\,\sin(\pi\beta/2)}{\Gamma[(1+\beta)/2]\,\beta\,2^{(\beta-1)/2}} \right\}^{1/\beta}, \quad \sigma_v = 1 \tag{6}$$
Here, $\Gamma$ denotes the gamma function, and $\beta = 3/2$ is used in the conventional configuration of the CS algorithm [33]. The step size $\eta$ is calculated in Eqs. (7) and (8), where $x_i^j$ is the current detector, and $x_{best}^j$ is the best candidate detector obtained from the population. Thus, the new candidate detector with respect to Eq. (2) is now transformed into Eq. (9), where randn(D) is a random vector of the same dimension as the detector $x_i^j(g)$ that follows a uniform distribution. $x_i^j(g)$ and $x_i^j(g+1)$ are compared solely on their fitness function values (Eq. (10)). The better of the two detectors is kept as the final detector.
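The Lévy-flight move described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The step-size form `scale * s * (x - best)` with `scale = 0.01` follows common reference implementations of CS and is assumed here, since Eqs. (7) to (9) are only referenced in the text. The text describes randn(D) as uniform, whereas this sketch draws it from a standard normal as the name suggests; swap the draw if a uniform vector is intended. "Better" is taken to mean a lower fitness value.

```python
import math
import random

def mantegna_sigma_u(beta=1.5):
    """Standard deviation of u in Mantegna's algorithm for exponent beta."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)

def levy_step(beta=1.5, rng=random):
    """Draw one step length s = u / |v|^(1/beta), with sigma_v = 1."""
    u = rng.gauss(0.0, mantegna_sigma_u(beta))
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)

def levy_update(x, best, fitness, scale=0.01, beta=1.5, rng=random):
    """Propose a Levy-flight move for detector x and keep the better detector."""
    candidate = []
    for xj, bj in zip(x, best):
        # Assumed step-size form: scale * s * (current - best).
        step_size = scale * levy_step(beta, rng) * (xj - bj)
        candidate.append(xj + step_size * rng.gauss(0.0, 1.0))
    # Assumption: the fitness is minimized, so the lower value wins.
    return candidate if fitness(candidate) < fitness(x) else x
```

For $\beta = 3/2$, `mantegna_sigma_u` evaluates to roughly 0.6966, which is the value usually quoted for this configuration.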
The crossover operator acts on the detector solution obtained from Eq. (10). It involves a probability $p_a$ that is compared to a random number rand drawn from a uniform distribution on [0,1]. Suppose the final detector solution in Eq. (10) is stored as $x_i^j(g+1)'$. If rand is greater than $p_a$, then $x_i^j(g+1)'$ is modified according to Eq. (11); otherwise, $x_i^j(g+1)'$ is retained.
In Eq. (11), $x_{r1}^j$ and $x_{r2}^j$ are detector solutions chosen at random, and $x_i^j(g+2)$ denotes the modified detector. The fitness value of $x_i^j(g+1)'$ is weighed against that of $x_i^j(g+2)$. If the fitness value of $x_i^j(g+2)$ is greater than that of $x_i^j(g+1)'$, then $x_i^j(g+2)$ replaces $x_i^j(g+1)'$ (Eq. (12)). These processes for CS are repeated for the detector solutions.
Each detector is then matched to the self-samples using the matching rule of Euclidean distance. The training (self) dataset samples are represented in Eq. (13).
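Returning to the crossover step governed by $p_a$, a minimal Python sketch is given below. The update form `current + rand * (x_r1 - x_r2)` is the one found in common CS reference implementations and is assumed here, because Eq. (11) is only referenced in the text. The function name `crossover` is hypothetical, and "better" is again assumed to mean a lower fitness value; flip the comparison if the fitness is maximized.

```python
import random

def crossover(population, i, fitness, p_a=0.25, rng=random):
    """Apply the p_a-controlled crossover to detector i and keep the better one."""
    current = population[i]
    if rng.random() <= p_a:
        return current  # rand is not greater than p_a: detector is retained
    # Two detectors chosen at random from the population.
    r1, r2 = rng.sample(range(len(population)), 2)
    scale = rng.random()
    trial = [
        c + scale * (a - b)
        for c, a, b in zip(current, population[r1], population[r2])
    ]
    # Assumption: the fitness is minimized.
    return trial if fitness(trial) < fitness(current) else current
```

Because the comparison is greedy, the returned detector is never worse than the one passed in, whatever the random draws are.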
The self-sample $X_i$ is normalized in the n-dimensional space (Eq. (14)). We have a generated detector $d = (c_d, r_d)$, where $c_d = \{c_{d1}, c_{d2}, c_{d3}, \ldots, c_{dm}\}$ represents the center of the detector, and $r_d$ is its radius. The Euclidean distance serves as the V-Detectors algorithm's matching rule. The distance between self-sample $X_i$ and the center $c_d$ of detector $d$ is given in Eq. (15),
$$D(X_i, c_d) = \sqrt{\sum_{k=1}^{m} \left(X_{ik} - c_{dk}\right)^2} \tag{15}$$
where $X_{ik}$ denotes the $k$th attribute of $X_i$. The value of the distance $D(X_i, c_d)$ is stored as $\min(dist)$ and checked against the self radius $r_s$. If $\min(dist) < r_s$, the detector matches the self-sample and is therefore eliminated. The detector is stored if $\min(dist) > r_s$, and its radius is recorded. The detector radius $r_d$ can be defined as in Eq. (16).
The detector is checked to ascertain whether it can be detected by previously stored detectors using the Euclidean distance. If the minimum distance between the detector and previously stored detector is less than the radius of the previous detector, the detector is eliminated. Otherwise, it is stored for the detection stage. This continues until the required number of detectors covering the non-self space is reached. These detectors then effectively monitor the system's status during the detection stage.
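The two censoring checks described above (matching against self-samples, then against previously stored detectors) can be sketched as follows. This is a minimal sketch, not the authors' code. The radius rule `min_dist - r_s` is the usual V-Detectors definition and is assumed here because Eq. (16) is only referenced in the text; the function name `try_accept` is hypothetical, and the boundary case where the distance equals $r_s$ is treated as a self match.

```python
import math

def try_accept(center, self_samples, detectors, r_s=0.05):
    """Return (center, radius) if the candidate survives censoring, else None."""
    min_dist = min(math.dist(center, s) for s in self_samples)
    if min_dist <= r_s:
        return None  # candidate matches a self-sample
    for stored_center, stored_radius in detectors:
        if math.dist(center, stored_center) < stored_radius:
            return None  # candidate is already covered by a stored detector
    # Assumed radius rule: distance to the nearest self-sample minus r_s.
    return (center, min_dist - r_s)
```

A generation loop would call `try_accept` on each CS-optimized candidate and append the non-`None` results until the required number of detectors is reached.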
For the detection stage, the test sample dataset is matched with the detectors from the generation stage. Euclidean distance is applied for the matching process. There is a match if the minimum distance between the test sample and the detectors is less than the detector's radius, and thus the sample is labeled as non-self. If the distance is greater than the detector's radius, there is no match and the sample is classified as self.
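The detection stage reduces to a coverage test against the stored detectors. The sketch below assumes detectors are stored as `(center, radius)` pairs; the function name `classify` and the example values are hypothetical.

```python
import math

def classify(sample, detectors):
    """Label a test sample as 'non-self' if any detector covers it, else 'self'."""
    for center, radius in detectors:
        if math.dist(sample, center) < radius:
            return "non-self"  # the sample falls inside this detector
    return "self"  # no detector matched

# Hypothetical detector set with a single detector.
detectors = [([0.9, 0.5], 0.35)]
labels = [classify(s, detectors) for s in ([0.8, 0.5], [0.5, 0.5])]
```

Here the first sample lies 0.1 from the detector center and is flagged, while the second lies 0.4 away, outside the radius of 0.35.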

Computation of the Fitness Function
The goal of the detectors of the V-Detectors is to thoroughly cover the non-self space; however, the detectors overlap. This overlap hinders the detectors' coverage and ultimately has a negative effect on the performance of the V-Detectors algorithm. To tackle the overlapping of detectors, a fitness function is introduced and implemented. The fitness function is posed as a minimization problem. The minimum distance between two detectors $d_1$ and $d_2$ that overlap each other is calculated with the Euclidean distance in Eq. (17). This distance calculation causes the overlapped detectors to move away from each other, thereby enlarging the coverage area of each detector.
In more explicit terms, we are given two detectors $d_1 = (c_{d_1}, r_{d_1})$ and $d_2 = (c_{d_2}, r_{d_2})$ with centers $c_{d_1} = \{c_{d_1 1}, c_{d_1 2}, c_{d_1 3}, \ldots, c_{d_1 m}\}$ and $c_{d_2} = \{c_{d_2 1}, c_{d_2 2}, c_{d_2 3}, \ldots, c_{d_2 m}\}$, and radii $r_{d_1}$ and $r_{d_2}$.
The minimum distance $D(c_{d_1}, c_{d_2})$ between the detectors $d_1$ and $d_2$ based on the centers $c_{d_1}$ and $c_{d_2}$ is calculated as follows:
$$D(c_{d_1}, c_{d_2}) = \sqrt{\sum_{k=1}^{m} \left(c_{d_1 k} - c_{d_2 k}\right)^2} \tag{17}$$
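To make the center-distance computation concrete, the sketch below evaluates the Euclidean distance between two detector centers and flags an overlap. The overlap criterion (center distance smaller than the sum of the radii) is the usual geometric test for hyperspheres and is an assumption here; the text does not state how overlap is decided. Function names are hypothetical.

```python
import math

def center_distance(d1, d2):
    """Euclidean distance between the centers of two (center, radius) detectors."""
    return math.dist(d1[0], d2[0])

def overlaps(d1, d2):
    """Assumed overlap test: the two hyperspheres intersect."""
    return center_distance(d1, d2) < d1[1] + d2[1]

# Hypothetical detectors in two dimensions.
d1 = ([0.0, 0.0], 1.0)
d2 = ([1.5, 0.0], 1.0)
d3 = ([3.0, 0.0], 0.5)
```

With these values, `d1` and `d2` overlap (distance 1.5 against a radius sum of 2.0), whereas `d1` and `d3` do not.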

Experimental Results and Analysis
The aim of the experiments is to assess the detection performance of the proposed CS-V-Detectors algorithm. The UCI repository [34] serves as the benchmark database from which the datasets are retrieved. The datasets are Fisher's IRIS, Breast Cancer Wisconsin, and BUPA Liver Disorders. Parameter values for CS are as follows: population size $n = 1000$ and probability $p_a = 0.25$ (a value reported to work well across optimization problems) [19,35]. For V-Detectors, the parameters are as follows: self-radius $r_s = 0.05$, estimated coverage $c_0 = 99.98\%$, and maximum number of detectors $T_{max} = 1000$ [4,36].

Performance Metrics
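The evaluation criteria for CS-V-Detectors are two standard measures: the detection rate (DR) and the false alarm rate (FAR). The equations for these metrics are:
$$DR = \frac{TP}{TP + FN}, \qquad FAR = \frac{FP}{FP + TN}$$
where $TP$ and $FN$ are the numbers of non-self samples classified as non-self and as self, respectively, and $FP$ and $TN$ are the numbers of self samples classified as non-self and as self, respectively.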
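As a small worked example of the two metrics, the sketch below counts the confusion-matrix entries from label lists and computes both rates using their standard definitions. Non-self is treated as the positive class; the function name `detection_metrics` and the toy labels are hypothetical and not data from the experiments.

```python
def detection_metrics(y_true, y_pred, positive="non-self"):
    """Return (detection rate, false alarm rate) with non-self as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    detection_rate = tp / (tp + fn) if tp + fn else 0.0
    false_alarm_rate = fp / (fp + tn) if fp + tn else 0.0
    return detection_rate, false_alarm_rate

# Toy labels: 4 non-self samples and 2 self samples.
y_true = ["non-self", "non-self", "non-self", "non-self", "self", "self"]
y_pred = ["non-self", "non-self", "non-self", "self", "self", "non-self"]
dr, far = detection_metrics(y_true, y_pred)
```

In this toy case three of four non-self samples are detected and one of two self samples raises a false alarm.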

Results and Discussion
Simulations were performed on a 3.40 GHz Intel Pentium® Core i7 processor with 4 GB of RAM.
The results on Fisher's IRIS dataset are contained in Tab. 1. The detection rate for V-Detectors was 92.16%, and when it was optimized with cuckoo search (CS), an increase of 5.84% was observed (i.e., the detection rate for CS-V-Detectors was 98%).
The ANN came close to CS-V-Detectors, with a detection rate of 97.33%. FuzzyNN was next, with 96.70%, followed by the SVM and Naïve Bayes (both with a 96% detection rate). The k-NN only had a detection rate of 95.30%, and random forest had a rate of 94%. Thus, CS-V-Detectors performed better than other methods on Fisher's IRIS dataset. In regard to false alarm rates, k-NN had the highest (3%), while CS-V-Detectors and V-Detectors both had 0% rates.
It can be seen from Tab. 2 that all the algorithms performed well on the Breast Cancer Wisconsin dataset. However, CS-V-Detectors surpassed all of the other algorithms, with a 99.29% detection rate (compared to V-Detectors' 96.85%). k-NN, ANN, and random forest had detection rates of 95.10%, 95.30%, and 95.70%, respectively. Meanwhile, FuzzyNN, Naïve Bayes, and SVM had detection rates of 96.40%, 96%, and 97%, respectively. k-NN and ANN had the highest false alarm rates: 6.30% and 5.40%, respectively. In contrast, CS-V-Detectors and V-Detectors had false alarm rates of 0%.
As shown in Tab. 3, the results on the BUPA Liver Disorders dataset reveal that detection performances ranged from 58% to 77% across all the algorithms. CS-V-Detectors had the best performance (a 76.71% detection rate); this can be compared to V-Detectors' detection rate of 74.44%.
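The gains of CS-V-Detectors over V-Detectors can be checked with simple arithmetic on the detection rates quoted above. The short sketch below uses only those reported figures; the variable names are illustrative.

```python
# Detection rates (%) as reported in the text for the three datasets.
baseline = {"Fisher's IRIS": 92.16, "Breast Cancer Wisconsin": 96.85, "BUPA Liver Disorders": 74.44}
proposed = {"Fisher's IRIS": 98.00, "Breast Cancer Wisconsin": 99.29, "BUPA Liver Disorders": 76.71}

# Gain in percentage points per dataset, then the average over datasets.
gains = {name: round(proposed[name] - baseline[name], 2) for name in baseline}
average_gain = round(sum(gains.values()) / len(gains), 2)
```

The per-dataset gains come out to 5.84, 2.44, and 2.27 percentage points, and their mean is 3.52, which agrees with the average increase stated in the abstract.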

Receiver Operating Characteristics
The receiver operating characteristic (ROC) curves corresponding to the experiments on Fisher's IRIS and Breast Cancer datasets are presented in Figs. 1 and 2, respectively. In Fig. 1, CS-V-Detectors is clearly better than the other algorithms. It is closely followed by ANN, then FuzzyNN, SVM, and Naïve Bayes; random forest performs the worst. These results are consistent with those in Tab. 1. Turning to the ROC curves for the Breast Cancer dataset plotted in Fig. 2, we observe that CS-V-Detectors again outperformed all the other algorithms. V-Detectors performed second best, while k-NN performed worst. These results also match those in Tab. 2.
The ROC curves for the experiments on the Liver Disorders dataset are shown in Fig. 3. The performances of the algorithms are not as high as on the other datasets. However, once again, CS-V-Detectors surpassed the other algorithms in terms of performance. It was closely followed by V-Detectors, and SVM performed worst.
Algorithm performance can also be compared by looking at the area under the ROC curve (AUC). The AUC reduces the ROC curve to a single scalar with values ranging from 0 to 1. No realistic algorithm should have an AUC below 0.5, which corresponds to random guessing; the best algorithms have AUC values near 1. The AUC values for the algorithms are listed in Tab. 4. On the Fisher's IRIS dataset, random forest had the lowest AUC (0.9550), while CS-V-Detectors had the highest (0.9900). On the Breast Cancer dataset, CS-V-Detectors again had the highest AUC (0.9965), and the lowest AUC was obtained by k-NN (0.9440). A similar trend is seen on the Liver Disorders dataset, with CS-V-Detectors obtaining the highest AUC (0.8836). Thus, the proposed CS-V-Detectors attains the highest AUC on all three datasets.
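When a classifier yields a single operating point, one common way to reduce its ROC to a scalar is the trapezoidal area under the polyline through (0, 0), (FAR, DR), and (1, 1). The sketch below implements that reduction; it is an assumption that the AUC values in Tab. 4 were obtained this way, and the function name is hypothetical.

```python
def single_point_auc(tpr, fpr):
    """Trapezoidal area under the ROC polyline (0,0) -> (fpr,tpr) -> (1,1)."""
    # Triangle under the first segment plus trapezoid under the second
    # simplifies to (1 + tpr - fpr) / 2.
    return (1.0 + tpr - fpr) / 2.0

# Reported CS-V-Detectors rates: detection rate with a 0% false alarm rate.
auc_iris = single_point_auc(0.98, 0.0)
auc_breast = single_point_auc(0.9929, 0.0)
```

With the reported detection rates and 0% false alarm rates, this gives 0.9900 for Fisher's IRIS and about 0.9965 for Breast Cancer, matching the listed values; this suggests, but does not confirm, that a single-point reduction was used.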

Conclusion
This research proposed a detector generation scheme based on cuckoo search (CS) for the negative selection algorithm, with particular focus on the real-valued negative selection algorithm with variable-sized detectors (V-Detectors). It exploits the properties of Lévy flight in attaining global convergence and optimality, which results in the generation of efficient detectors. The proposed algorithm improved the performance of standard V-Detectors and performed better than the other algorithms it was compared with. Hence, it can be concluded that the optimization technique enhances the detection ability and efficiency of the negative selection algorithm. Future work will involve hybridizing cuckoo search with other optimization algorithms for enhanced detection.
Funding Statement: The authors received no specific funding for this study.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.