New Variants of Adaptive Differential Evolution Algorithm with Competing Strategies

New variants of the adaptive competitive differential evolution algorithm are proposed and tested experimentally on the CEC 2013 test suite. In the new variants, the adaptation is based on the competition of several strategies. The current-to-pbest mutation borrowed from JADE is included in the pool of competing strategies of the newly proposed variants. The aim of the experimental comparison is to find out whether the presence of the current-to-pbest mutation strategy increases the efficiency of the differential evolution algorithm, especially on rotated objective functions. The results of the experiments show that the new variants performed better on a few of the test problems, while no benefit is observed on the majority of the test problems.


INTRODUCTION
This paper is an extended version of the conference submission [2]. Compared to [2], two other pools of competing strategies are proposed and all the variants of the algorithm are compared experimentally on the CEC 2013 benchmark suite [5].
Differential evolution (DE), proposed in [9], is a population-based optimization algorithm for single-objective problems with a real-valued objective function. Candidate solutions are represented as vectors with real-number components, x = (x_1, x_2, ..., x_D), where D is the dimension of the problem. The population is placed in the search space Ω = ∏_{j=1}^{D} [a_j, b_j], a_j < b_j, j = 1, 2, ..., D, and evolves during the search toward states of higher fitness. The solution of the problem is the global minimum point x* satisfying the condition f(x*) ≤ f(x) for all x ∈ Ω.
Algorithm 1 Differential evolution algorithm
initialize population P = {x_1, x_2, ..., x_N}
while stopping condition not reached do
  for i = 1, 2, ..., N do
    create a new trial vector y
    insert the fitter of (x_i, y) into new generation Q
  end for
  P ← Q
end while

The population of size N is developed step by step from a generation P to a generation Q by the application of evolutionary operators, i.e. mutation, crossover, and selection. The basic scheme of the DE algorithm, written in pseudo-code, is shown in Algorithm 1. The new trial point y is created from a mutant point u, generated by some kind of mutation, and from the current point of the population by the application of crossover. The fitter point of the pair (x_i, y), based on the value of the objective function, is selected into the new generation Q.
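To make the basic scheme concrete, the following minimal Python sketch implements the classic DE/rand/1/bin variant of Algorithm 1. It is an illustration only; the function and parameter names are ours and the paper's own experiments use a Matlab implementation.

```python
import numpy as np

def de_minimize(f, bounds, N=50, F=0.5, CR=0.8, max_gen=200, seed=1):
    """Minimal DE/rand/1/bin sketch (illustrative, not the authors' code).
    bounds: array of shape (D, 2) with [a_j, b_j] per coordinate."""
    rng = np.random.default_rng(seed)
    a, b = bounds[:, 0], bounds[:, 1]
    D = len(a)
    P = a + rng.random((N, D)) * (b - a)          # uniform random initialization
    fit = np.array([f(x) for x in P])
    for _ in range(max_gen):
        for i in range(N):
            r1, r2, r3 = rng.choice([j for j in range(N) if j != i], 3, replace=False)
            u = P[r1] + F * (P[r2] - P[r3])        # rand/1 mutation
            mask = rng.random(D) < CR
            mask[rng.integers(D)] = True           # at least one component from u
            y = np.where(mask, u, P[i])            # binomial crossover
            fy = f(y)
            if fy <= fit[i]:                       # greedy selection into Q
                P[i], fit[i] = y, fy
    return P[fit.argmin()], float(fit.min())
```

With a smooth unimodal function such as the sphere, this sketch converges quickly even with the default settings.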
The DE algorithm has been studied intensively in recent years. A comprehensive summary of advanced results in DE research is available in [7] and [3]. Several kinds of mutation and crossover have been suggested, as well as some adaptive or self-adaptive DE variants. The main goal of designing adaptive variants of DE is to enable the search to adapt, during the run of the algorithm, to the current problem being solved.
A new test suite of 28 functions was proposed for the special session on Real-Parameter Optimization, a part of the Congress on Evolutionary Computation (CEC) 2013. This session was held as a competition of stochastic single-objective optimization algorithms. The functions are described in the report [5], including the experimental settings required for the competition. The source code of the functions is also available at the web site given in the report. The benchmark functions can be used at several levels of problem dimension, varying from 2 to 100. We can expect that this test suite will become one of the most relevant benchmarks required for publishing new single-objective optimization algorithms.
We took part in the CEC 2013 special session mentioned above with the paper [13], where an adaptive version of differential evolution based on the competition of DE strategies was applied [12]. Our DE variant was ranked in the first half of the 21 compared algorithms with respect to efficiency. This DE variant performs well on problems where the objective function is not rotated, whilst its performance on problems with rotated functions is worse. Similar difficulties with rotated functions occurred in all DE variants taking part in the CEC 2013 competition, including the best-performing DE variant, SHADE [10].
In this paper, novel variants of the competitive DE combining two adaptive approaches are proposed and compared experimentally with the "parental" algorithms [12,18] on the CEC 2013 test suite.
The rest of the paper is organized as follows. In Section 2 the "parental" algorithms are described. Three new adaptive DE variants are proposed in Section 3. The experimental setup is defined in Section 4. The results are presented in Section 5 and the last section concludes the paper.

SOME ADAPTIVE VARIANTS OF DIFFERENTIAL EVOLUTION
It is known that standard DE can be a very efficient optimization algorithm, but its efficiency is strongly dependent on the setting of the control parameters F and CR for the problem being solved. Tuning the control parameters by trial and error is time-consuming. Hence, many adaptive or self-adaptive DE variants have been proposed in the last decade.
Seven adaptive DE variants [1,6,8,12,15,18] were experimentally compared on six standard benchmark functions at three levels of dimension in [14]. It was found that JADE [18] and b6e6rl [12] were the best-performing algorithms in the comparison: JADE was the fastest and the second most reliable on average, while b6e6rl was the most reliable and second in convergence speed. That is why these "parental" algorithms are exploited in the newly proposed variants.

JADE
The JADE variant of adaptive differential evolution [18] extends the original DE concept with three improvements: the current-to-pbest mutation, a new adaptive control of the parameters F and CR, and an archive. The mutant vector u is generated in the following manner:

u = x_i + F (x_pbest − x_i) + F (x_r1 − x_r2),

where x_pbest is randomly chosen from the 100p% best individuals, with the input parameter p = 0.05 recommended in [18]. The vector x_r1 is randomly selected from P (r1 ≠ i), and x_r2 is randomly selected from the union P ∪ A (r2 ∉ {i, r1}) of the current population P and the archive A. In every generation, parent individuals replaced by better offspring are put into the archive, and the archive size is kept at N individuals by randomly dropping surplus individuals. The trial vector is generated from u and x_i using the binomial crossover. CR and F are generated independently for each individual x_i: CR is drawn from the normal distribution with mean µ_CR and standard deviation 0.1, truncated to [0, 1]; F is drawn from the Cauchy distribution with location parameter µ_F and scale parameter 0.1, truncated to 1 if F > 1 or regenerated if F < 0. See [18] for the details of the µ_CR and µ_F adaptation.
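The current-to-pbest mutation described above can be sketched as follows. This is an illustrative Python rendering, not JADE's reference implementation; the function name and the way r2 is drawn from P ∪ A are our assumptions.

```python
import numpy as np

def current_to_pbest(P, A, fit, i, F=0.5, p=0.05, rng=None):
    """JADE-style mutant: u = x_i + F*(x_pbest - x_i) + F*(x_r1 - x_r2).
    P: population (N x D), A: archive (list of vectors), fit: objective values.
    Illustrative sketch; index-distinctness handling is simplified."""
    rng = rng if rng is not None else np.random.default_rng()
    N = len(P)
    top = max(1, int(round(p * N)))
    pbest = P[rng.choice(np.argsort(fit)[:top])]   # random one of the 100p% best
    r1 = rng.choice([j for j in range(N) if j != i])
    union = np.vstack([P] + ([np.asarray(A)] if len(A) else []))
    while True:                                     # x_r2 from P ∪ A, r2 ∉ {i, r1}
        r2 = rng.integers(len(union))
        if r2 != i and r2 != r1:
            break
    return P[i] + F * (pbest - P[i]) + F * (P[r1] - union[r2])
```

Note that with F = 0 the mutant degenerates to x_i, which is a quick sanity check of the formula.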

Competitive DE
Competitive DE uses H strategies with their control-parameter values held in a pool [11,12]. Any of the H strategies can be chosen to create a new trial point y. A strategy is selected randomly with probability q_h, h = 1, 2, ..., H. The probabilities are initialized uniformly, q_h = 1/H, and they are modified according to the success rate in the preceding steps. The h-th strategy is considered successful if it produces a trial vector entering the next generation.
The probability q_h is evaluated as the relative frequency of success according to

q_h = (n_h + n_0) / Σ_{j=1}^{H} (n_j + n_0),

where n_h is the current count of successes of the h-th strategy and n_0 > 0 is an input parameter. The setting n_0 > 1 prevents a dramatic change in q_h caused by one random successful use of the h-th strategy. To avoid degeneration of the search process, the current values of q_h are reset to their starting values if any probability q_h decreases below a given limit δ, δ > 0.
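The probability update and the reset rule can be written in a few lines of Python; this is a minimal sketch of the mechanism just described, with our own function names.

```python
def strategy_probs(successes, n0=2):
    """Competition probabilities q_h = (n_h + n0) / sum_j (n_j + n0)."""
    total = sum(n + n0 for n in successes)
    return [(n + n0) / total for n in successes]

def maybe_reset(successes, H, delta):
    """Reset the success counters (restoring uniform q_h) if any
    q_h drops below delta, which prevents strategy starvation."""
    q = strategy_probs(successes)
    return [0] * H if min(q) < delta else successes
```

For example, with H = 2, n_0 = 2 and success counts (8, 0), the probabilities become 10/12 and 2/12, so the unsuccessful strategy still keeps a nonzero chance of selection.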
We use a variant of competitive DE that proved well-performing and robust in different benchmark tests [12]. In this variant, denoted b6e6rl hereafter, 12 strategies are in competition (H = 12): six of them use the binomial crossover, the rest use the exponential crossover.
The randrl/1 mutation (3) is applied in all the strategies, and two different values of the control parameter F are used, F = 0.5 and F = 0.8:
u = x_r1 + F (x_r2 − x_r3),     (3)

where x_r1, x_r2, and x_r3 are mutually distinct points chosen at random from P and x_r1 is the tournament best among them, i.e. f(x_r1) ≤ f(x_rj), j = 2, 3, as proposed in [4]. The mutation according to (3) can cause a mutant point u to move out of the domain Ω. In such a case, the values of u_j ∉ [a_j, b_j] are reflected back into Ω by the transformation u_j ← 2a_j − u_j or u_j ← 2b_j − u_j for the violated component. The same treatment of mutant points escaping Ω is also used in the newly proposed algorithms.
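A compact Python sketch of the randrl/1 mutation and the reflection repair, assuming the interpretation above (tournament-best base vector, per-component mirroring); the function names are ours.

```python
import numpy as np

def randrl1(P, fit, i, F, rng):
    """randrl/1: like rand/1, but the fittest of the three random
    points r1, r2, r3 is used as the base vector (tournament best)."""
    N = len(P)
    r1, r2, r3 = rng.choice([j for j in range(N) if j != i], 3, replace=False)
    best = min((r1, r2, r3), key=lambda r: fit[r])
    rest = [r for r in (r1, r2, r3) if r != best]
    return P[best] + F * (P[rest[0]] - P[rest[1]])

def reflect_into_domain(u, a, b):
    """Mirror out-of-range components back into [a_j, b_j]:
    u_j <- 2*a_j - u_j (below) or u_j <- 2*b_j - u_j (above)."""
    u = u.copy()
    low, high = u < a, u > b
    u[low] = 2 * a[low] - u[low]
    u[high] = 2 * b[high] - u[high]
    return u
```

The reflection assumes a violation by less than the domain width, which holds for the moderate F values used here.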
The binomial crossover uses three different values of CR, CR ∈ {0, 0.5, 1}. The values of CR for the exponential crossover are evaluated from given values of the mutation probability p_m as the real roots in (0, 1) of the polynomial equation [17]

CR^D − D p_m CR + D p_m − 1 = 0.

The three values of p_m used in this DE variant are set up equidistantly in the interval (1/D, 1). Details of the CR setting for the exponential crossover can be found e.g. in [14].
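Since the expected fraction of mutated components under exponential crossover is (1 − CR^D) / (D (1 − CR)) and this expression increases monotonically in CR, the required CR for a given p_m ∈ (1/D, 1) can be found numerically. A simple bisection sketch (function name is ours):

```python
def cr_for_exponential(pm, D, tol=1e-12):
    """Solve (1 - CR^D) / (D * (1 - CR)) = pm for CR in (0, 1) by bisection.
    pm is the desired per-component mutation probability, pm in (1/D, 1)."""
    def expected_pm(cr):
        return (1.0 - cr**D) / (D * (1.0 - cr))
    lo, hi = 0.0, 1.0 - 1e-12          # expected_pm is increasing on (0, 1)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_pm(mid) < pm:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For instance, for D = 30 and p_m = 0.5 the returned CR reproduces the target mutation probability to high accuracy.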

NEWLY PROPOSED VARIANTS OF COMPETI-TIVE DE
The adaptive mechanism based on the competition of strategies described in Section 2.2 is applied in all the newly proposed adaptive DE variants. The new variants differ only in the combination of DE strategies available in the pools from which the strategies are selected. Some strategies in the pools of competing strategies are derived from JADE; in particular, they exploit the current-to-pbest mutation.

b6e6pbest
This adaptive variant of DE (denoted b6e6pbest hereafter) is similar to b6e6rl, but the randrl/1 mutation is replaced by the current-to-pbest mutation used in JADE. It is expected that the application of the current-to-pbest mutation can help in the solution of rotated functions. The archive from JADE storing old solutions is also applied. The new algorithm is shown in pseudo-code in Algorithm 2.
The population P of size N is initialized randomly, uniformly distributed in the area of possible solutions. In addition, an empty archive A of size N for storing old solutions is also initialized. When a new trial point is inserted into the next generation Q, the old solution x_i is stored in the archive A. If the archive is full, a randomly selected point in A is replaced by the current x_i. The parameters controlling the competition of strategies are set to the recommended values δ = 1/(5 × H) and n_0 = 2, and the control parameter of the mutation is p = 0.05.
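The archive update rule described above fits in a few lines; this minimal sketch (our naming) shows the bounded archive with random replacement when full.

```python
import random

def archive_insert(A, x_old, N, rng=random):
    """Store a replaced parent x_old in archive A of capacity N;
    if A is full, overwrite a randomly selected entry."""
    if len(A) < N:
        A.append(x_old)
    else:
        A[rng.randrange(N)] = x_old
    return A
```

Random replacement keeps the archive an unbiased sample of recently discarded parents, which is what the difference term x_r1 − x_r2 in the current-to-pbest mutation relies on.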

b3e3j6-F05
Twelve DE strategies are included in the competition in this algorithm (denoted b3e3j6-F05 hereafter): six strategies use the randrl/1 mutation like the b6e6rl algorithm, and the other strategies use the current-to-pbest mutation. The mutation parameter F is set to 0.5 in all twelve DE strategies. This setting is expected to support a more intensive search in the neighborhood of the current point. The CR parameters for both types of crossover are set to the same values as in b6e6rl.

b3e3j6-F05F08
As in the algorithms described before, twelve DE strategies are included in the competition in this algorithm, labeled b3e3j6-F05F08 hereafter. Six strategies use the randrl/1 mutation with F = 0.5, three of them in combination with the binomial crossover and three with the exponential crossover. The other strategies use the current-to-pbest mutation with F = 0.8, combined with the binomial and exponential crossover. The higher value of F in half of the competing strategies should keep the population more dispersed compared to b3e3j6-F05. The CR parameters for both types of crossover are also set to the values applied in b6e6rl.
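A strategy pool of this kind can be encoded declaratively. The sketch below is a hypothetical encoding of the b3e3j6-F05F08 pool: the CR values for the binomial crossover follow the paper ({0, 0.5, 1}), while the concrete p_m values for the exponential crossover shown here are placeholders (in the paper they are equidistant in (1/D, 1), hence dimension-dependent).

```python
CR_BIN = [0.0, 0.5, 1.0]     # CR values for binomial crossover (from the paper)
PM_EXP = [0.25, 0.5, 0.75]   # placeholder p_m values for exponential crossover

def build_pool():
    """Hypothetical b3e3j6-F05F08 pool: 6 randrl/1 strategies with F=0.5
    and 6 current-to-pbest strategies with F=0.8, each mutation paired
    with both crossover types."""
    pool = []
    for cr in CR_BIN:
        pool.append({"mutation": "randrl/1", "F": 0.5, "crossover": "bin", "CR": cr})
        pool.append({"mutation": "current-to-pbest", "F": 0.8, "crossover": "bin", "CR": cr})
    for pm in PM_EXP:
        pool.append({"mutation": "randrl/1", "F": 0.5, "crossover": "exp", "pm": pm})
        pool.append({"mutation": "current-to-pbest", "F": 0.8, "crossover": "exp", "pm": pm})
    return pool
```

Keeping the pool as data makes it easy to derive the other variants (b6e6pbest, b3e3j6-F05) by swapping entries rather than changing the algorithm's control flow.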

EXPERIMENTS
The aim of the experiments is to compare the performance of the proposed variants with the parental JADE and b6e6rl algorithms. The algorithms were implemented in Matlab 2010a and this environment was used for the experiments. The experimental setting follows the requirements given in the report [5], where the suite of 28 benchmark minimization problems is also defined. The function values f(x*) are also given in [5]. Thus, the function error f_min − f(x*) can be calculated for each run, where f_min is the minimum function value in the population at the end of the search. The source code of the test functions in C was downloaded from the web page given in [5] and compiled with the Lcc-win32 C 2.4.1 compiler. The search range (domain) for all the test functions is [−100, 100]^D.
The tests were carried out at two levels of dimension, D = 10 and D = 30. For each test problem, 51 repeated runs were performed. A run stops if the prescribed value MaxFES = D × 10^4 of function evaluations is reached or if the minimum function error in the population drops below 10^−8, because such an error is considered a sufficient approximation of the correct solution. Function-error values less than 10^−8 are treated as zero in further processing.
The population size was set to N = 100 for all the algorithms and both problem dimensions. The remaining control parameters of the algorithms were set to the recommended values described in Sections 2 and 3.

RESULTS
The basic characteristics of the experimental comparison of the algorithms are presented in Tables 3-11. The structure of the characteristics follows the requirements given in the report [5]. The values of the characteristics for each problem are computed from 51 repeated runs. Function-error values less than 10^−8 are substituted by zero in all the tables.
The efficiency of the five algorithms, expressed by the function-error values found in each of the 51 runs, was compared statistically by the Kruskal-Wallis non-parametric analysis of variance. Kruskal-Wallis multiple comparison of the algorithms was applied to the results of the problems where a significant difference among the algorithms was found. The results of the comparison are shown in Table 1. If there is an algorithm significantly better than the others, it is evaluated as the winner in the corresponding problem. If two or more algorithms are in the winning position and do not differ significantly, the winning position is shared by all of them, ordered in the decreasing sequence of their performance.
The counts of wins and shared wins across all the problems are summarized in Table 2; the problems with no significant difference are not taken into account. Based on the results in Table 2, we can conclude that JADE is the best-performing algorithm most frequently, but each algorithm in the comparison wins in some problems at least once, and shared wins were obtained several times by each algorithm. Among the newly proposed DE variants, b6e6pbest performs best on average (20 wins or shared wins out of 56 problems). None of the five algorithms copes well with all the test problems. There are problems where the error of the best solution found by the algorithm within the prescribed number of function evaluations has a magnitude of 10^2, see Tables 3-11. Especially for the composition functions (problems F21 to F28), no algorithm tested here is able to find a better solution. This is not surprising, because the composition functions are very difficult tasks for all optimization algorithms. Moreover, the performance of the DE algorithms on some problems with a rotated objective function (F2-F4, F6-F10, F12, F13, F15, F16, F18, F20, F21, F23-F28) is not satisfactory.

CONCLUSIONS
The experimental comparison showed that the newly proposed variants of the competitive DE algorithm do not outperform the "parental" JADE algorithm when average performance on the CEC 2013 suite is taken into account. Among the newly proposed DE variants, b6e6pbest performs best on average (20 wins or shared wins out of 56 test problems), while JADE achieved 38 wins including shared wins.
However, there are optimization problems where some of the newly proposed DE variants performed well. Each of the newly proposed algorithms wins in at least one problem of the CEC 2013 test suite, and shared wins were obtained several times by each algorithm. This indicates that, for a specific optimization problem, a particular combination of competing DE strategies in the pool is more convenient for convergence than other combinations. Such experimentally observed behavior is in agreement with the No-Free-Lunch theorem [16], but its benefit for the solution of real-world optimization problems is limited.
It was found that none of the tested DE variants performs well on most of the problems with a rotated objective function. The inclusion of the current-to-pbest strategy in the competitive adaptive DE does not bring a sufficient enhancement of performance on these problems. Thus, the proposal of an innovative algorithm with a pool of strategies increasing the efficiency of DE on rotated functions remains a challenge for further research.
Algorithm 2 Competitive DE algorithm b6e6pbest
initialize population P = {x_1, x_2, ..., x_N}
initialize empty archive A of size N
initialize probabilities of strategies
while stopping condition not reached do
  for i = 1, 2, ..., N do
    choose a strategy by roulette selection
    create a new trial vector y
  end for
end while

Table 1
Comparison of algorithm performance by Kruskal-Wallis test -best performing algorithms.

Table 2
Counts of the best and the shared best positions according to Kruskal-Wallis multiple comparison.

Table 4
Basic Characteristics of function error, JADE, D = 10.

Table 9
Basic Characteristics of function error, JADE, D = 30.