Dual-Subpopulation as reciprocal optional external archives for differential evolution

Differential Evolution (DE) is a powerful approach to global optimization problems. Among DE algorithms, JADE and its variants, whose mutation strategy is DE/current-to-pbest/1 with optional archive, perform well. A significant feature of this mutation strategy is that one individual used in the difference operation comes from the union of the optional external archive and the population. In existing DE algorithms based on this mutation strategy (JADE and its variants), individuals eliminated from the population are sent to the archive. In this paper, we propose a scheme for managing the optional external archive. Under our scheme, two subpopulations are maintained in the population, and each of them regards the other as its archive. In experiments, our scheme is applied to JADE and two of its variants, SHADE and L-SHADE. Experimental results show that our scheme can enhance JADE and its variants. Moreover, L-SHADE with our scheme performs significantly better than four DE algorithms: CoBiDE, MPEDE, EDEV, and MLCCDE.


Introduction
Differential evolution (DE), a type of Evolutionary Algorithm (EA) for global optimization problems, has been successfully applied in many fields [1]. In each run of DE, a population consisting of individuals (candidate solutions of the problem) must be maintained. Here, individuals are also called target vectors. In the $g$th generation of the population, mutant vectors $\{v_{i,g} = (v_{i,1,g}, v_{i,2,g}, \ldots, v_{i,d,g}) \mid i = 1, 2, \ldots, NP\}$, where $d$ denotes the dimensionality of the problem and $NP$ the population size, are generated through mutation based on the target vectors $\{x_{i,g} = (x_{i,1,g}, x_{i,2,g}, \ldots, x_{i,d,g})\}$. Then, crossover produces trial vectors $\{u_{i,g} = (u_{i,1,g}, u_{i,2,g}, \ldots, u_{i,d,g})\}$ based on $x_{i,g}$ and $v_{i,g}$. After that, each $x_{i,g+1}$ is selected from $x_{i,g}$ and $u_{i,g}$ according to their fitness values, $f(x_{i,g})$ and $f(u_{i,g})$.
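The mutation, crossover and selection cycle described above can be sketched in a few lines. The paragraph does not name a particular strategy, so the sketch below assumes the classic DE/rand/1/bin with fixed F and CR; the function name `de_generation` and its parameters are illustrative only, and minimization is assumed.

```python
import random

def de_generation(pop, fitness, objective, f=0.5, cr=0.9, rng=random):
    """One generation of classic DE/rand/1/bin (minimization, assumed strategy)."""
    n, d = len(pop), len(pop[0])
    new_pop, new_fit = [], []
    for i in range(n):
        # three distinct donors, all different from the target x_i (needs n >= 4)
        r1, r2, r3 = rng.sample([k for k in range(n) if k != i], 3)
        # mutation: v = x_r1 + F * (x_r2 - x_r3)
        v = [pop[r1][j] + f * (pop[r2][j] - pop[r3][j]) for j in range(d)]
        # binomial crossover: component j_rand always comes from the mutant
        j_rand = rng.randrange(d)
        u = [v[j] if (rng.random() < cr or j == j_rand) else pop[i][j]
             for j in range(d)]
        # selection: the trial replaces the target only if it is not worse
        fu = objective(u)
        if fu <= fitness[i]:
            new_pop.append(u)
            new_fit.append(fu)
        else:
            new_pop.append(pop[i])
            new_fit.append(fitness[i])
    return new_pop, new_fit
```

Because selection is one-to-one between a target vector and its trial vector, the fitness of every population slot can only stay the same or improve from one generation to the next.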
In Eq 1, $x_{i,g}$, $x_{r1,g}$ and $x^{p}_{best,g}$ are target vectors from the population $P$. Further, $x^{p}_{best,g}$ is randomly chosen from the best $100p\%$ of individuals in terms of fitness, where $p \in (0, 1]$. Meanwhile, $\tilde{x}_{r2,g}$ is an individual from the union of the optional external archive and the population. In addition, both $x_{r1,g}$ and $\tilde{x}_{r2,g}$ are randomly chosen and differ from $x_{i,g}$.
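For reference, DE/current-to-pbest/1 with optional archive, as introduced with JADE [3], is commonly written in the form below. The symbols follow the explanation above; $F_i$ denotes the scaling factor associated with the $i$th target vector.

```latex
v_{i,g} = x_{i,g}
        + F_i \left( x^{p}_{best,g} - x_{i,g} \right)
        + F_i \left( x_{r1,g} - \tilde{x}_{r2,g} \right)
```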
Mutation in DE is always based on a difference operation between individuals. In the majority of mutation strategies, the individuals used in the difference operation are target vectors in the current generation of the population. In contrast, a significant feature of DE/current-to-pbest/1 with optional archive is that one of the individuals used in the difference operation comes from the union of the archive and the population. That is, this individual is selected from a larger pool than the population alone. According to experimental results in the literature, DE/current-to-pbest/1 with optional archive leads to good algorithm performance.
Here, we give the motivation of this paper. In JADE and its variants, the optional external archive is managed by a simple means, detailed as follows. In every generation, target vectors eliminated in selection are sent to the archive. A sent individual is accepted by the archive only if it differs from every individual already in the archive. That is, redundancy is not allowed in the archive. When there is no free space in the archive, randomly chosen individuals in it are removed to accommodate newcomers. By this means, potentially promising search directions carried by individuals eliminated from the population may still be kept for evolution. Nevertheless, under this management method, individuals in the archive are worse in fitness than target vectors and similar in chromosome to target vectors. Hence, this method of managing the archive is not the best choice for DE/current-to-pbest/1 with optional archive, and how to manage the optional external archive needs further study in order to improve the strategy.
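The conventional archive management described above can be sketched as follows. This is a minimal illustration rather than the reference implementation of JADE: the function name `update_archive`, the tuple representation of individuals, and the choice to evict one random resident just before each insertion (instead of trimming after all insertions) are assumptions.

```python
import random

def update_archive(archive, eliminated, capacity, rng=random):
    """Send eliminated target vectors to the archive (conventional scheme, sketch)."""
    if capacity <= 0:
        return archive
    for vec in eliminated:
        vec = tuple(vec)
        if vec in archive:
            # redundancy is not allowed in the archive
            continue
        if len(archive) >= capacity:
            # no free space: remove a randomly chosen resident (assumed policy)
            archive.pop(rng.randrange(len(archive)))
        archive.append(vec)
    return archive
```

Every archived vector lost a one-to-one selection against its own trial vector, which is why archive members tend to be both worse than and closely related to the current target vectors.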
EAs naturally lend themselves to parallelism, since most of their variation operators can be processed in parallel. Among renowned types of parallel EAs, distributed EAs (DEAs) are the most widely applied for upgrading different EAs [49]. In DEAs, a large population is divided into subpopulations to create segregation. When a predetermined condition is met, migration is executed to exchange individuals among subpopulations. By this means, each subpopulation is provided from time to time with foreign individuals that are similar in fitness level to local individuals but different from them in the building blocks of their chromosomes. Hence, upgrading an EA to a DEA can improve solutions.
Inspired by migration in DEAs, we propose a scheme to manage the optional external archive in this paper. The details are as follows. The population is divided into two subpopulations. The two subpopulations evolve synchronously and independently. Each subpopulation regards the other one as its optional external archive. From the viewpoint of either subpopulation, individuals in the other subpopulation are similar in fitness level to local individuals but different from them in the building blocks of their chromosomes. Therefore, under our proposed scheme, individuals more suitable than before can be provided for the difference operation of mutation.
Although our scheme is inspired by migration in DEAs, it differs from migration significantly. In migration, individuals from the source subpopulation directly replace individuals in the target subpopulation. Under our scheme, however, individuals from one subpopulation never migrate to the other subpopulation; they only participate in the difference operation of mutations occurring in the other subpopulation. Moreover, a DEA is costly since multiple subpopulations need to be maintained in the population, whereas DE with our method needs to maintain only two subpopulations and can therefore be directly compared with existing DE algorithms.
Our experiments are based on the IEEE Congress on Evolutionary Computation 2014 (CEC2014) benchmark test suite (http://www.ntu.edu.sg/home/EPNSugan/index_files/CEC2014/CEC2014.htm). In the first experiment, our scheme is applied to JADE and two of its variants, SHADE and L-SHADE. With function dimensionality set to 30, 50 and 100, results of the DE algorithms with our scheme are compared with results of the original DE algorithms. The experimental results show that our scheme can significantly improve solutions. In the second experiment, the best performer among the three DE algorithms with our method, L-SHADE with our method, is compared with four up-to-date DE algorithms: CoBiDE [6], MPEDE [12], EDEV [21], and MLCCDE [50]. The experimental results show that L-SHADE with our method is competitive in the field of DE.
The rest of this paper is organized as follows. In Section II, related work is presented. First, JADE and its variants, the DE algorithms with optional external archive, are introduced. Then, DE algorithms with subpopulations are introduced. In Section III, our method for managing the optional external archive is given. Experimental results are shown and analyzed in Section IV. Finally, conclusions and prospects are given in Section V.

JADE and its variants
JADE employs DE/current-to-pbest with optional archive as its mutation strategy. When implementing the mutation strategy, individuals eliminated from the population are stored in the optional external archive. Moreover, in JADE, the scaling factor F and the crossover rate CR, the two main parameters of DE, are both adaptively set for each target vector independently. Since details of both DE/current-to-pbest with optional archive and the existing method for managing the optional external archive have been given in the first section, we only introduce the adaptive setting of F and CR here.
As shown in Eq 2, the crossover probability $CR_i$ of each individual is generated according to a normal distribution with mean $\mu_{CR}$ and standard deviation 0.1, and is then truncated to [0, 1].
If $f(u_{i,g}) < f(x_{i,g})$, the value of $CR_i$ is collected into $S_{CR}$. The mean $\mu_{CR}$ is initialized to 0.5 and then updated after each generation according to Eq 3.
In Eq 3, $c$ is a positive constant between 0 and 1 and $\mathrm{mean}_A(\cdot)$ is the usual arithmetic mean. Similarly, as shown in Eq 4, the mutation factor $F_i$ of each individual is independently generated according to a Cauchy distribution with location parameter $\mu_F$ and scale parameter 0.1; it is truncated to 1 if $F_i \geq 1$ and regenerated if $F_i \leq 0$.
If $f(u_{i,g}) < f(x_{i,g})$, the value of $F_i$ is collected into $S_F$. The location parameter $\mu_F$ of the Cauchy distribution is initialized to 0.5 and then updated at the end of each generation according to Eq 5.
In Eq 5, $\mathrm{mean}_L(\cdot)$ is the Lehmer mean. According to [3], JADE outperforms jDE [51], SaDE [52], the classic DE/rand/1/bin and a canonical PSO algorithm [53].
A parameter adaptation technique that uses a historical memory of successful control parameter settings to guide the selection of future control parameter values is proposed in [47] as an enhancement to JADE. The resulting algorithm is named SHADE. According to the experimental results in [47] for the 28 CEC2013 benchmark functions, SHADE outperforms dynNP-jDE [54], SaDE, JADE, EPSDE [55] and CoDE.
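The adaptation of F and CR in JADE, as described for Eqs 2 to 5 above, can be sketched as follows. The function names are illustrative, the default c = 0.1 is an assumed value, and leaving a mean unchanged when no trial vector succeeded in a generation is likewise an assumption; the Lehmer mean is taken in its usual form, the sum of squares divided by the sum.

```python
import math
import random

def sample_cr(mu_cr, rng=random):
    """Eq 2: CR_i drawn from N(mu_cr, 0.1) and truncated to [0, 1]."""
    return min(1.0, max(0.0, rng.gauss(mu_cr, 0.1)))

def sample_f(mu_f, rng=random):
    """Eq 4: F_i drawn from Cauchy(mu_f, 0.1); set to 1 if >= 1, redrawn if <= 0."""
    while True:
        # inverse-CDF sampling of a Cauchy variate
        f = mu_f + 0.1 * math.tan(math.pi * (rng.random() - 0.5))
        if f >= 1.0:
            return 1.0
        if f > 0.0:
            return f

def lehmer_mean(values):
    """Lehmer mean: sum of squares over sum, which favours larger values."""
    return sum(v * v for v in values) / sum(values)

def update_means(mu_cr, mu_f, s_cr, s_f, c=0.1):
    """Eqs 3 and 5: move both means toward the successful parameter values."""
    if s_cr:
        mu_cr = (1 - c) * mu_cr + c * sum(s_cr) / len(s_cr)
    if s_f:
        mu_f = (1 - c) * mu_f + c * lehmer_mean(s_f)
    return mu_cr, mu_f
```

In a full run, `s_cr` and `s_f` would collect the $CR_i$ and $F_i$ values of those target vectors whose trial vectors won selection in the current generation.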
A crossover rate repair technique based on successful parameters is proposed and combined with JADE in [48]. According to the technique, the crossover rate is repaired by using the average number of components taken from the mutant vector. The resulting algorithm is named Rcr-JADE. The experimental results in [48] indicate that Rcr-JADE is able to obtain significantly better solutions than JADE. Moreover, compared with jDE, SaDE, EPSDE-c [56] and CoDE, Rcr-JADE obtains better, or at least comparable, results for the 25 CEC2005 benchmark functions.
L-SHADE, which further extends SHADE with Linear Population Size Reduction (LPSR), is proposed in [45]. LPSR continually decreases population size in runs according to a linear function. Based on the CEC2014 benchmark functions, L-SHADE is compared with dynNP-jDE, SaDE, JADE, EPSDE and CoDE as well as the state-of-the-art restart CMA-ES variants. The experimental results show that L-SHADE is quite competitive with the above evolutionary algorithms.
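The linear reduction of population size can be illustrated with a short function. This is a sketch of the usual LPSR rule, in which the size moves linearly from an initial value to a minimum value as function evaluations are consumed; the name `lpsr_population_size`, the default minimum of 4 and the numbers in the usage note are assumptions for illustration.

```python
def lpsr_population_size(nfe, max_nfe, n_init, n_min=4):
    """Population size after `nfe` of `max_nfe` function evaluations (LPSR sketch)."""
    # linear interpolation between n_init (at nfe = 0) and n_min (at nfe = max_nfe)
    size = round(n_init + (n_min - n_init) * nfe / max_nfe)
    return max(n_min, size)
```

For example, with an assumed initial size of 540 and a budget of 300000 evaluations, the size is 540 at the start, 272 halfway through the run and 4 at the end. Whenever the computed size drops below the current one, the worst individuals would be removed.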
A mechanism called auto-enhanced population diversity (AEPD) is proposed in [1]. This mechanism identifies convergence and stagnation by measuring the distribution of the population in each dimension. Once convergence is detected in a dimension, diversification is executed in that dimension. Similarly, stagnation in a dimension is eliminated as soon as it is found. The AEPD mechanism is incorporated into DE algorithms including JADE and SHADE. The results for the set of 25 CEC2005 benchmark functions show that the mechanism significantly improves the performance of JADE and SHADE. Moreover, AEPD-JADE also shows superior performance in comparison with DE/rand/1/bin [57], JADE, jDE, SaDE, CoDE, Pro DE/rand/1/bin [58], HdDE [59], EPSDE [56], CLPSO [60] and IPOP-CMA-ES [61].
A scheme based on superior-inferior (SI) crossover is proposed in [27]. When the degree of population diversity is small, the SI crossover is performed to improve global search. Otherwise, the superior-superior crossover is used to enhance exploitation. The scheme is applied to four DE algorithms including JADE. Experiments based on 24 functions selected from the IEEE Swarm Intelligence Symposium 2005 and CEC2014 benchmark functions show that JADE-SI (JADE with SI crossover) is significantly better than JADE in the majority of cases.
A modified JADE version with sorting crossover rate (CR) is proposed in [20]. In the proposed algorithm, JADE_sort, a smaller CR value is assigned to an individual with better fitness. Based on the CEC2005 functions, JADE_sort is compared with jDE, SaDE, EPSDE, JADE, CoDE and JADE-SI. The experimental results show that JADE_sort is competitive.
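The idea of giving smaller CR values to fitter individuals can be sketched as a rank-based assignment. This only illustrates the sentence above and is not the exact procedure of [20]; the function name `assign_sorted_cr` and the assumption that lower fitness values are better (minimization) are ours.

```python
def assign_sorted_cr(fitness, cr_values):
    """Assign the smallest CR to the best individual, the next smallest to the second best, etc."""
    # indices of individuals from best (lowest error) to worst
    order = sorted(range(len(fitness)), key=lambda i: fitness[i])
    sorted_cr = sorted(cr_values)
    assigned = [0.0] * len(fitness)
    for rank, idx in enumerate(order):
        assigned[idx] = sorted_cr[rank]
    return assigned
```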
The event-triggered impulsive (ETI) control scheme is introduced in [34]. Two types of impulses, stabilizing impulses and destabilizing impulses, are presented. During a run, the number of individuals subject to impulsive control is decided by an adaptive mechanism. That number of individuals is then chosen by ranking assignment, and the chosen individuals are adaptively modified with the two kinds of impulses. The ETI control scheme is incorporated into ten DE algorithms including JADE and SHADE. According to the experiments on the CEC2014 benchmark functions, ETI-JADE outperforms not only the original JADE but also AEPD-JADE [1]. Also, ETI-SHADE outperforms SHADE and AEPD-SHADE [1].

DE algorithms with subpopulations
In this subsection, we list five DE algorithms with subpopulations. The latest two of them are involved in our experiments for comparison. Although the listed DE algorithms all have more than one subpopulation, they are not DEAs, at least not DEAs in the narrow sense, because the subpopulations in these algorithms differ in operators or settings. Details are given below.
A dual-population differential evolution (DPDE) with coevolution is proposed in [62] for constrained optimization problems (COPs). In this algorithm, a COP is treated as a bi-objective optimization problem where the first objective is the actual cost or reward function to be optimized, while the second objective accounts for the degree of constraint violation. At each generation, the population is divided into two subpopulations based on the feasibility of solutions so that the two objectives are treated separately. Each subpopulation focuses only on optimizing the corresponding objective, which leads to a clear division of work. Furthermore, DPDE makes use of an information-sharing strategy to exchange search information between the subpopulations.
An adaptive multiple-subpopulation-based DE algorithm, MPADE, is designed in [28]. In MPADE, the population is split into three subpopulations based on fitness. Three DE strategies are performed on the three subpopulations, respectively. Furthermore, an adaptive approach is designed for parameter adjustment in the three DE strategies. According to its replacement strategy, a few of the best offspring may replace the worst parents.
In [24], mDE-bES is proposed. In this algorithm, the population is divided into independent subpopulations, each with different mutation and update strategies. A mutation strategy that uses information from either the best individual or a randomly selected one is used. Selection of individuals for some of the tested mutation strategies utilizes fitness-based ranks of these individuals. Function evaluations are divided into epochs. At the end of each epoch, individuals are exchanged between subpopulations.
MPEDE [12] is an ensemble of multiple mutation strategies with adapted F and CR. These mutation strategies are current-to-pbest/1, current-to-rand/1 and rand/1. Each mutation strategy controls an indicator subpopulation. After every pre-defined number of generations, the best-performing mutation strategy is found by a proposed equation. Then a reward subpopulation, which is randomly allocated to a mutation strategy at the beginning, is assigned to the best-performing mutation strategy. In MPEDE, the method to adapt F and CR comes from [3].
EDEV [21] is an ensemble of differential evolution variants and consists of three state-of-the-art DE algorithms: JADE, CoDE and EPSDE. Each constituent DE variant is assigned an indicator subpopulation. According to a mechanism similar to the one in MPEDE, the most efficient constituent DE variant is determined after every pre-defined number of generations. Furthermore, a reward subpopulation is assigned to the currently best-performing constituent DE variant.

Our method for managing the optional external archive
In DE, the more individuals are involved in mutation, or the more individuals can be chosen for mutation, the higher the degree of mutation that may be obtained. It can be seen from Eq 1 that, on the one hand, five individuals are required in the mutation strategy. On the other hand, one individual used in the difference operation is chosen from a pool larger than the population. Hence, compared with other mutation strategies, DE/current-to-pbest/1 with optional archive shows a higher degree of mutation. Although the archive contains individuals as the population does, it is not another population, since no new individual can be produced in it. Therefore, no function evaluation is required for maintaining the optional external archive. In brief, the archive provides additional individuals for mutation without consuming extra function evaluations or leading to high degeneration. That is, the diversity of the population is improved in a reasonable manner. Hence, JADE and its variants show good performance.
In JADE and its variants, individuals in the optional external archive are those eliminated from the population in different generations. Therefore, individuals in the optional external archive are similar in chromosome to current target vectors, since genetic relationships exist between them. Meanwhile, individuals in the archive are worse in fitness than target vectors because they are all losers in selection. If individuals in the archive were very different in chromosome from current target vectors but similar to them in fitness level, DE/current-to-pbest with optional archive might be further enhanced.
In our scheme, two subpopulations need to be maintained in DE. The two subpopulations regard each other as the optional external archive. In this way, individuals in the archive are not only different from current target vectors in the building blocks of their chromosomes, but also similar to them in fitness level. To show the details of our method for managing the optional external archive, we adapt the pseudo-code of JADE. Although our method can also be used in any variant of JADE, expressing it on the basis of the original JADE is more concise than on the basis of one of its variants. The adapted pseudo-code is given in Algorithm 1.
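The core of the scheme can be illustrated with a sketch of the mutation step for one subpopulation. This is not a transcription of Algorithm 1: the function name `mutate_with_reciprocal_archive`, its parameters, the default p = 0.1 and the exact way the candidate pool for the last vector is built are assumptions for illustration, and minimization is assumed.

```python
import random

def mutate_with_reciprocal_archive(i, own, own_fitness, other, f_i, p=0.1, rng=random):
    """DE/current-to-pbest/1 in which the other subpopulation plays the archive role.

    own, other: lists of equal-length float lists (own needs at least 2 members);
    own_fitness: errors of `own` (lower is better).
    """
    n = len(own)
    # x_pbest: random pick among the best 100p% of the own subpopulation
    n_best = max(1, int(round(p * n)))
    ranked = sorted(range(n), key=lambda k: own_fitness[k])
    x_pbest = own[rng.choice(ranked[:n_best])]
    # x_r1: from the own subpopulation, different from x_i
    r1 = rng.choice([k for k in range(n) if k != i])
    # x~_r2: from the union of the own subpopulation and the other one,
    # which is never modified here, only read
    union = [own[k] for k in range(n) if k not in (i, r1)] + list(other)
    x_r2 = rng.choice(union)
    x_i, x_r1 = own[i], own[r1]
    return [xi + f_i * (pb - xi) + f_i * (a - b)
            for xi, pb, a, b in zip(x_i, x_pbest, x_r1, x_r2)]
```

In each generation the function would be called for every target vector of the first subpopulation with the second one passed as `other`, and then with the roles swapped. No individual is ever copied from one subpopulation into the other, which is what distinguishes the scheme from migration.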

Experimental studies
Our experiments are based on the 30 CEC2014 benchmark test functions. In the first experiment, the original versions of JADE and its variants are compared with their versions based on our scheme. Then, in the second experiment, the best performer among the DE algorithms with our scheme is compared with up-to-date DE algorithms.

DE algorithms for experiments
For the first experiment, we need to select variants of JADE besides JADE itself. As mentioned above, SHADE, Rcr-JADE, L-SHADE, AEPD-JADE, JADE-SI, JADE_sort, ETI-JADE and ETI-SHADE are variants of JADE. Among these algorithms, L-SHADE, AEPD-JADE, ETI-JADE and ETI-SHADE have been tested on the CEC2014 benchmark functions in the literature. According to the results in [1,34,45], L-SHADE performs much better on the CEC2014 functions than the other algorithms. Thus, we select L-SHADE for the first experiment. In addition, we also select SHADE, the foundation of L-SHADE and a variant of JADE. In short, our method is employed in three algorithms, JADE, SHADE and L-SHADE, for the first experiment.
For the second experiment, we choose CoBiDE, MPEDE, EDEV, and MLCCDE to compare with the best performer among the DE algorithms with our scheme. CoBiDE is a state-of-the-art DE algorithm that has no relationship with JADE. MPEDE and EDEV are recent DE algorithms with subpopulations and belong to the related work. MLCCDE is one of the most recent DE algorithms.

Settings
Function dimensionality is set to 30, 50 and 100 in the first experiment, and only to 30 in the second experiment. According to the guideline of the CEC 2014 competition, the maximum number of fitness evaluations is set to 10000 × D for all DE algorithms, where D represents function dimensionality. All parameters of the original DE algorithms are given in Table 1 based on [3,6,12,21,45,47,50], respectively. It can be seen from Table 1 that we change the population size NP of the original algorithms in order to arrange two subpopulations for implementing our scheme. In the DE algorithms with our method, each subpopulation is allocated NP/2 individuals and regards the other subpopulation as its archive.

According to Table 4, when function dimensionality is 100, our method significantly improves JADE in 9/30 cases, SHADE in 11/30 cases and L-SHADE in 9/30 cases. Meanwhile, our method significantly deteriorates JADE, SHADE and L-SHADE in two cases each. There is no significant difference in the other cases.

Comparison between DE algorithms with our scheme and their original version
It can be seen from Tables 2-4 that, for some functions, all DE algorithms with our method significantly outperform their original versions. When function dimensionality is 30, our method leads to significant improvement in all cases for F19 and F29. When function dimensionality is 50, our method leads to significant improvement in all cases for F19. When function dimensionality is 100, our method leads to significant improvement in all cases for F1, F9, F19, and F29.
Based on the mean error with respect to the optimum, sampled at intervals, we plot convergence graphs of runs for one function when function dimensionality is 30, 50, and 100, respectively, in Fig 1. As shown in Fig 1, the convergence rate decreases steadily in all runs. In the figure, runs with our scheme converge more slowly at the initial stage than runs without it, but faster in the remaining part. This phenomenon can be explained as follows. The original population size of the DE algorithms, $NP_o$, has been shown to be a suitable value by experiments in the literature. In theory, the size of each subpopulation in the DE algorithms with our method should be set to $NP_o$, that is, the population size should be set to $2 \times NP_o$. However, because the maximum number of fitness evaluations is limited, a large increase in population size means a large decrease in the maximum number of generations. Therefore, the subpopulation size needs to be set to less than $NP_o$ to ensure that enough generations can be executed in a run. At the beginning of a run, DE algorithms with our scheme converge more slowly than the original DE algorithms because each subpopulation has fewer individuals. Nevertheless, as our scheme takes effect, this disadvantage is gradually offset in many cases.
Altogether, our method leads to significant improvement in 88 cases out of 270 and to significant deterioration in 25 cases. In summary, JADE and its variants with our scheme outperform their original versions.
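The trade-off between population size and number of generations can be made concrete with a small calculation. The sketch assumes a fixed population size, so it does not apply directly to L-SHADE, and a cost of one function evaluation per individual per generation; the function name and the value $NP_o = 100$ in the usage note are hypothetical and not taken from Table 1.

```python
def generations_available(dim, pop_size, evals_per_dim=10000):
    """Full generations that fit in a budget of evals_per_dim * dim evaluations."""
    return (evals_per_dim * dim) // pop_size
```

With D = 30 the budget is 300000 evaluations. A hypothetical $NP_o = 100$ then allows 3000 generations, whereas two subpopulations of 100 individuals each (a population of 200) would allow only 1500.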

Comparison between L-SHADE with our scheme and up-to-date DE algorithms
According to Tables 2-4, L-SHADE based on our scheme performs best among the three DE algorithms with our scheme. Thus, we compare L-SHADE based on our scheme with the up-to-date DE algorithms CoBiDE, MPEDE, EDEV, and MLCCDE. The experimental results are listed in Table 5. It can be seen from the table that L-SHADE based on our method significantly outperforms MLCCDE, EDEV, MPEDE and CoBiDE in 9, 13, 14 and 11 cases, respectively. Meanwhile, L-SHADE based on our method loses to MLCCDE, EDEV, MPEDE and CoBiDE in two, three, zero and zero cases, respectively. There is no significant difference in any of the other cases. In summary, L-SHADE with our method is very competitive.

Table 2. Results of DE algorithms with our scheme and original DE algorithms when function dimensionality is set to 30. "+" denotes that the result of a DE algorithm with our method is significantly better than the result of its original DE algorithm in terms of Wilcoxon's rank sum test at a 0.05 significance level, while "−" denotes that it is significantly worse. In addition, "≈" shows that there is no significant difference. Columns: Function; Mean error (standard deviation).
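The "+", "−" and "≈" marks used in the result tables can be reproduced with a rank sum test over the final errors of independent runs. The sketch below is one possible implementation and not the procedure actually used in the experiments: it relies on the normal approximation with a tie correction and no continuity correction, decides the direction from the sign of the z statistic, and uses illustrative function names. It prints an ASCII "-" for the "−" mark.

```python
import math

def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank sum test; returns (z, p) via the normal approximation."""
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    n1, n2 = len(a), len(b)
    n = n1 + n2
    rank_sum_a, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0      # mean of ranks i+1 .. j for this tie group
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    mean = n1 * (n + 1) / 2.0
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var <= 0:
        return 0.0, 1.0                   # all values identical: no evidence of a difference
    z = (rank_sum_a - mean) / math.sqrt(var)
    return z, math.erfc(abs(z) / math.sqrt(2.0))

def label(ours, theirs, alpha=0.05):
    """'+' if our errors are significantly lower, '-' if significantly higher, else '≈'."""
    z, p = rank_sum_test(ours, theirs)
    if p >= alpha:
        return "≈"
    return "+" if z < 0 else "-"
```

Here `ours` and `theirs` would hold the final error values of the two algorithms on one function, one value per run.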

Discussion
In JADE and its variants with our scheme, the two subpopulations evolve independently. Individuals in one subpopulation, compared with individuals in the other subpopulation, are different in chromosome but similar in fitness level. Therefore, regarding the other subpopulation as the optional external archive can provide suitable individuals for the difference operation. Under our scheme, DE/current-to-pbest/1 with optional archive becomes more efficient.

Table 3. Results of DE algorithms with our scheme and original DE algorithms when function dimensionality is set to 50. "+" denotes that the result of a DE algorithm with our method is significantly better than the result of its original DE algorithm in terms of Wilcoxon's rank sum test at a 0.05 significance level, while "−" denotes that it is significantly worse. In addition, "≈" shows that there is no significant difference.

Conclusion
[...] based on the external optional archive. In this paper, we propose a new scheme for managing the archive. Under our scheme, two subpopulations are maintained in the population, and each of them regards the other as its archive. In this way, the individuals in the archive of a subpopulation are similar in fitness level to current target vectors but different from them in the building blocks of their chromosomes. Experiments based on the CEC2014 benchmark functions not only show that our scheme can significantly improve the solutions of JADE and two of its variants, SHADE and L-SHADE, but also demonstrate that L-SHADE with our method performs significantly better than CoBiDE, MPEDE, EDEV, and MLCCDE. As mentioned above, our scheme for managing the archive is inspired by DEAs. Conversely, a new type of distributed DE (a DEA in the field of DE) can be developed based on the work in this paper.
Further investigation remains to be done.

Data curation: Zaichao Wang.
Formal analysis: Yiqun Fan.

Table 5. Results of L-SHADE based on our method, MLCCDE, EDEV, MPEDE and CoBiDE when function dimensionality is set to 30. "+" denotes that the result of L-SHADE based on our method is significantly better than the current result in terms of Wilcoxon's rank sum test at a 0.05 significance level, while "−" denotes that it is significantly worse. Meanwhile, "≈" shows that there is no significant difference. Columns: Function; Mean error (standard deviation).