Second-order DE algorithm

Differential evolution (DE) is a robust, efficient and simple evolutionary algorithm for various optimisation and engineering problems. It has several outstanding features, such as low time complexity, ease of use and stable robustness, so it is becoming increasingly popular and is used in a growing number of applications. However, the critical balance between global exploration and neighbourhood exploitation still deserves consideration: the direction and neighbour information carried by the difference vector of the mutation operator has not been fully exploited. Therefore, a second-order difference vector based DE, SODE, is proposed, which can efficiently utilise the direction information contained in the second-order difference vector. Optimal second-order difference mechanisms are proposed for DE/rand/1 and DE/best/1 to utilise the direction and neighbour information from the difference vector and thereby guide individuals toward potentially more promising areas. Extensive experiments and comprehensive comparisons show that the second-order difference mechanism in SODE clearly outperforms the classical first-order difference mutation strategies 'DE/rand/1' and 'DE/best/1' in terms of convergence and stability.


Introduction
Many kinds of metaheuristic optimisation algorithms have been proposed in recent decades for an ever-increasing variety of optimisation problems, including evolutionary algorithms (EAs) [1], particle swarm optimisation [2], simulated annealing [3], ant colony optimisation [4] and differential evolution (DE) [5]. Although these algorithms perform well on various scientific and engineering problems involving non-linear, multidimensional and non-differentiable objectives, many of them may still become trapped in a local optimum when the current solution is close to a trap or the best found solution is far from the true optimum [6, 7]. This is especially problematic for highly multimodal problems [8].
Storn and Price [5] proposed the DE algorithm, a robust, efficient and simple EA. It has many advantages, such as low computational cost, high robustness and simplicity, which allow it to find approximate or satisfactory solutions for various optimisation problems and applications [9, 10]. DE searches for optimal or satisfactory solutions with three operators: mutation, crossover and selection. Offspring are generated by perturbing solutions with a scaled difference of selected population individuals, followed by a crossover strategy. Moreover, DE has several strategy parameters: the population size (NP), the scaling factor (F) and the crossover probability (CR).
However, according to existing studies [11] and analyses [12], the performance of the classical DE algorithm is highly dependent on the mutation strategy and the control parameters, which may lead to premature convergence and performance degradation. Because the base and/or difference vectors are randomly selected from the same population, DE does not use the neighbourhood structure and/or the available direction information to guide individuals toward promising areas. Such behaviour makes the current solutions more likely to be trapped at a local optimum if they are close to a local trap or the best solution is far from the optimal position [13-15].
To exploit the advantages of DE while alleviating its disadvantages, many ensemble mutation strategies and DE variants have been proposed. For example, a trigonometric mutation operation was presented by Fan and Lampinen to improve the performance of DE. This modification enables the algorithm to keep a good trade-off between global exploration and local exploitation [16], making it possible to increase the convergence speed and thereby obtain an acceptable solution with fewer objective function evaluations. Sun et al. proposed a hybrid DE/estimation of distribution algorithm (EDA) from a new perspective, i.e. jointly utilising local and global information [17]. The local information is obtained by a modified mutation operation, while the global information is acquired from the population's solutions by the proposed model. Three different learning approaches were proposed in [18], one to select the base vector and the others to construct difference vectors. A hybrid of DE and a self-adaptive immune operation was proposed in [19] to escape from local optimal neighbourhoods. Lu et al. [20] combined corpus-based and WordNet-based similarity methods in a DE algorithm and assessed semantic similarity between terms in a continuous vector space to improve the accuracy of similarity computation. Michael et al. [21] illustrated how a simple constrained multi-objective optimisation algorithm, Generalised DE 3, can assist the practical sizing of mechatronic components used in digital displacement fluid power machinery. Shilpi and Karambir [22] implemented DE as an optimisation technique to improve the effectiveness of test cases using the average-percentage-of-fault-detection metric. Du et al. [23] proposed an event-triggered impulsive control scheme to improve the performance of DE.
By introducing impulsive control and an event-triggered mechanism into DE, they aim to improve the search performance of the population by revising the positions of some individuals at certain moments.
This paper proposes a second-order DE (SODE) algorithm. The aim is to utilise the curvature information of the population. Moreover, optimal second-order difference mechanisms are initially proposed for DE/best/1 and DE/rand/1. The major contributions of this paper are the following:
† A second-order difference vector mechanism is proposed: second-order difference information is introduced on top of the first-order mutation strategy, and the effect of the proposed mechanism is analysed experimentally.
† Different optimal second-order difference mechanisms are proposed for DE/best/1 and DE/rand/1: two kinds of difference information from the second-order difference vectors are related to individual solutions and are chosen separately from different classical mutation strategies.
This idea is driven by efficiently utilising the beneficial wide-area direction information of individuals together with different mutation strategies. These strategies support the production of new exploratory moves to promote the detection of promising neighbourhoods.
It is noted that a preliminary version [24] of this paper was initially published in conference proceedings. The research motivation, algorithmic analysis, simulation experiments and comparative analysis have all been expanded relative to the preliminary version.
The organisation of this paper is as follows. Basic concepts of DE are described in Section 2. The proposed new strategies and algorithm are presented in Section 3. Section 4 gives the experimental results and evolutionary behaviour comparison. Conclusions and future research are finally given in Section 5.

Differential evolution
The background knowledge of DE is introduced in this section. DE is a population-based optimisation algorithm built on the principle of natural evolution, in which solutions are encoded as floating-point vectors. The population P at generation G consists of NP individuals X_i^G = (X_{i,1}^G, ..., X_{i,D}^G), i = 1, ..., NP, where X_{i,j}^G is the jth component of the ith solution at the Gth generation. The main operations of DE are initialisation, mutation, crossover and selection.

Initialisation
The population is usually initialised with a uniform distribution in the domain confined by the lower bound X_j^min and the upper bound X_j^max, where D is the dimension. The operation is shown as X_{i,j}^0 = X_j^min + rand(0, 1) · (X_j^max − X_j^min), j = 1, ..., D.
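The uniform initialisation described above can be sketched in Python as follows (an illustrative sketch; the function and parameter names are our own, not from the paper):

```python
import numpy as np

def initialise_population(pop_size, dim, x_min, x_max, rng=None):
    """Uniformly initialise pop_size individuals inside [x_min, x_max]^dim,
    i.e. X^0_{i,j} = X^min_j + rand(0, 1) * (X^max_j - X^min_j)."""
    rng = np.random.default_rng() if rng is None else rng
    return x_min + rng.random((pop_size, dim)) * (x_max - x_min)
```

With scalar bounds the same range is used for every dimension; per-dimension bounds can be passed as length-`dim` arrays thanks to NumPy broadcasting.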

Mutation
On the basis of the parent individual X_i^G, DE uses the mutation operator to generate a mutant vector V_i^G at each generation. The notation 'DE/a/b/c' is used to distinguish these strategies: 'a' denotes the base vector to be mutated, 'b' is the number of difference vectors used, and 'c' denotes the crossover scheme. The most commonly used mutation operations of the DE algorithm, such as 'DE/rand/1', 'DE/best/1' and 'DE/current-to-best/1', are presented in (4)-(9) [23, 24], where the best solution at iteration G is denoted X_best^G. The indices r1, r2, ... are mutually unequal random integers, none of which equals the index i. F is the scaling parameter, typically between 0.4 and 1, and can be used to adjust the search step size. A new individual around the current best solution is generated with (5) and (6) to finely search the current neighbourhood. Two difference vectors are produced with (7)-(9) to enlarge the exploration region, so the population diversity can be maintained and more beneficial heuristic information can be utilised.
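The two classical one-difference strategies discussed throughout this paper can be sketched as follows (a minimal sketch with illustrative helper names; F is the scaling parameter):

```python
import numpy as np

def mutate_rand_1(pop, i, F=0.5, rng=None):
    """DE/rand/1: V_i = X_r1 + F * (X_r2 - X_r3), with r1, r2, r3
    mutually unequal random indices, all different from i."""
    rng = np.random.default_rng() if rng is None else rng
    candidates = [k for k in range(len(pop)) if k != i]
    r1, r2, r3 = rng.choice(candidates, size=3, replace=False)
    return pop[r1] + F * (pop[r2] - pop[r3])

def mutate_best_1(pop, fitness, i, F=0.5, rng=None):
    """DE/best/1: V_i = X_best + F * (X_r1 - X_r2), where X_best is the
    current best individual (minimisation)."""
    rng = np.random.default_rng() if rng is None else rng
    candidates = [k for k in range(len(pop)) if k != i]
    r1, r2 = rng.choice(candidates, size=2, replace=False)
    return pop[np.argmin(fitness)] + F * (pop[r1] - pop[r2])
```

DE/rand/1 draws all vectors at random (stronger exploration), while DE/best/1 perturbs the current best individual (stronger exploitation), which mirrors the exploration/exploitation trade-off discussed in the introduction.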

Crossover
After the mutation operation, each mutant vector V_i^G and its parent individual X_i^G undergo a crossover operation. The parent vector and the mutant vector are mixed to generate a trial vector U_i^G. Uniform (binomial) crossover is used in standard DE: U_{i,j}^G = V_{i,j}^G if rand(0, 1) ≤ CR or j = j_rand, and U_{i,j}^G = X_{i,j}^G otherwise, where j_rand is a random integer between 1 and D. This ensures that at least one component of the trial vector is taken from the mutant vector V_i^G. CR is the crossover probability in [0, 1]; it controls the portion of parameter values copied from the mutant vector.
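The binomial crossover rule can be sketched as follows (illustrative names; the forced index j_rand guarantees at least one mutant component survives):

```python
import numpy as np

def binomial_crossover(x, v, CR=0.5, rng=None):
    """Uniform (binomial) crossover: take V_{i,j} when rand_j <= CR or
    j == j_rand, otherwise keep the parent component X_{i,j}."""
    rng = np.random.default_rng() if rng is None else rng
    dim = len(x)
    j_rand = rng.integers(dim)        # forces at least one mutant component
    mask = rng.random(dim) <= CR
    mask[j_rand] = True
    return np.where(mask, v, x)
```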

Selection
A greedy selection mechanism based on the fitness of the parent individual and the trial vector is used in this operation: the individual with the better fitness is carried over to the next generation. The selection operation is indicated in (11) (for minimisation): X_i^{G+1} = U_i^G if f(U_i^G) ≤ f(X_i^G), and X_i^{G+1} = X_i^G otherwise. The above three operations are repeated until some termination condition is met, and the final solution is returned when the iteration stops.
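The greedy one-to-one selection step amounts to a single comparison (sketch; `f` is the objective function being minimised):

```python
import numpy as np

def greedy_selection(x, u, f):
    """Keep the trial vector u if it is at least as good as the parent x
    under objective f (minimisation); otherwise retain the parent."""
    return u if f(u) <= f(x) else x
```

Because each child only ever replaces its own parent, the population's best objective value is monotonically non-increasing over generations.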

Second-order DE
A novel second-order DE (SODE) optimisation algorithm is proposed. The aim of proposing a second-order difference evolution is to utilise the second-order direction information. It should be noted that the motivation of this research is not to propose a highly competitive DE variant, but rather a distinct SODE algorithm model to support extensive subsequent research.

Second-order difference mechanism
Both classical mutation operations, DE/best/1 and DE/rand/1, are used as the baseline generation strategies, as they are the most widely researched and successful schemes. The beneficial directional information of the second-order difference vectors will now be exploited. The second-order difference vector information is indicated in (12)-(16), which are developed from the two classical mutation strategies, where r1, r2, r3, r4, r5, r6 are mutually different random integers in [1, N_P]. The scaling parameter F is set to 0.5. The difference vectors in (12)-(18) are the same as in the classical DE algorithm. Equation (16) gives the second-order difference vector, which is used to slightly modify the first-order difference vector d^G and construct the mutant vectors as in (17) and (18). These vectors are associated with each individual and can be individually updated according to its current status and the remaining historical information. The second-order difference vector in (16) also aims to enlarge the exploration region and reduce the possibility of being trapped in a local optimum when it is introduced to the first-order difference vector d^G. The parameter l is set to 0.1, which will be discussed in Section 4.3. To study the performance of the proposed mechanism, the classical mutation operations DE/rand/1 as in (17) and DE/best/1 as in (18) will be considered in Sections 3.2 and 3.3.
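Since equations (12)-(18) are not reproduced above, the following Python sketch only illustrates the general idea under an explicit assumption: the second-order difference vector is taken as the difference of two first-order difference vectors, weighted by l and added to the ordinary first-order difference before the usual F-scaling. The index pattern and helper names are our own illustration, not the paper's exact formulation:

```python
import numpy as np

def sode_rand_1(pop, i, F=0.5, lam=0.1, rng=None):
    """Illustrative SODE-style mutant for DE/rand/1 (assumed form):
    V_i = X_r1 + F * (d1 + lam * d2), where d1 is a first-order
    difference and d2 a second-order (difference-of-differences) vector."""
    rng = np.random.default_rng() if rng is None else rng
    candidates = [k for k in range(len(pop)) if k != i]
    r1, r2, r3, r4, r5, r6 = rng.choice(candidates, size=6, replace=False)
    d1 = pop[r2] - pop[r3]                          # first-order difference
    d2 = (pop[r3] - pop[r4]) - (pop[r5] - pop[r6])  # second-order difference
    return pop[r1] + F * (d1 + lam * d2)
```

With lam = 0 the sketch reduces to plain DE/rand/1; the small lam = 0.1 therefore acts as a gentle second-order perturbation of the first-order search direction, consistent with the "slightly modify" wording above.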

Add the second-order difference vector to DE/rand/1
In this section, we introduce the DE/rand/1 strategy with the second-order difference vector mechanism in detail. The mutant vector V_i^G as in (17) contains a difference vector dr^G, which can be constructed in two composed patterns.
The first composed pattern of dr^G is made up by (14) and (15). The second composed pattern of dr^G is made up by (13) and (15). To evaluate the performance of the proposed mechanism for DE/rand/1, a suite of benchmark functions [13, 20, 21] is selected as the test suite. The performance of the second-order difference vector applied to DE/rand/1 will be discussed in Section 4.4.

Add the second-order difference vector to DE/best/1
In this section, we introduce the DE/best/1 strategy with the second-order difference vector mechanism in detail. The mutant vector V_i^G as in (18) contains a difference vector dr^G, which can be constructed in two composed patterns.
The first composed pattern of dr^G is made up by (14) and (15). The second composed pattern of dr^G is made up by (13) and (15). To evaluate the performance of the proposed mechanism for DE/best/1, a suite of benchmark functions [13, 20, 21] is selected as the test suite. The performance of the second-order difference vector applied to DE/best/1 will be discussed in Section 4.5.

Performance comparison and analysis
To discuss the performance of the proposed second-order difference strategy, 20 functions [13, 20, 21] with dimension 30 are used as the test suite, and three DE variants are also adopted.

Benchmark functions
The test suite contains 6 unimodal functions and 14 multimodal functions. Functions f 1 − f 7 , except f 5 , are unimodal, because Rosenbrock's function f 5 is multimodal when its dimension is larger than three; f 8 − f 20 are multimodal functions. Both parameters of classical DE, the scale factor F and the crossover probability CR, are initialised to 0.5 for all algorithms [22, 23]. The parameter l is initialised to 0.1 according to our experiments, as discussed below.

Simulation results of parameter l
The influence of the parameter l is discussed in this section. The parameter l may have an important influence on how the population evolves from the current solutions. Five unimodal functions f 1 , f 2 , f 3 , f 4 , f 6 and three multimodal functions f 5 , f 19 , f 20 are chosen to empirically analyse its effect. To achieve more reliable results and rule out other interference factors, l is set to 0.1, 0.3, 0.5, 0.7 and 0.9 while the conventional parameters are fixed. The comparison of the five choices of l for DE/best/1 is plotted in Fig. 1 and summarised in Table 2, and the comparison for DE/rand/1 is plotted in Fig. 2 and summarised in Table 3. Fig. 1 indicates that the algorithmic performance is very sensitive to the parameter l for DE/best/1. It can also be seen that the convergence curve for l = 0.1 lies lowest for six of the eight functions. At the same time, the numerical results in Table 2 clearly indicate that the algorithm with l = 0.1 performs best among all five choices on the benchmark functions. Therefore, l is set to 0.1 for DE/best/1 in the remainder of this paper. Fig. 2 indicates that the algorithmic performance is likewise very sensitive to l for DE/rand/1, with the convergence curve for l = 0.1 lying lowest for seven of the eight functions. The numerical results in Table 3 present the comparisons for l = 0.1, 0.3, 0.5, 0.7 and 0.9; the first column gives the test functions and the second column gives the five choices. These results sufficiently indicate that the second-order difference vector greatly benefits the optimisation process. In general, SODE21 and SODE22 perform better than DE2, which indicates that the second-order difference vector has a significant influence on the convergence ability and accuracy.
The fact that SODE21 and SODE22 are better than DE2 indicates that the second-order difference vector significantly expands the population's diversity. These phenomena verify the effectiveness of the proposed second-order difference information strategy.
(iii) Online evolving performance comparison and analysis: the evolutionary performance comparison among several DE variants is shown in Fig. 4, which further supports the previous numerical comparison and the related analysis. As observed from Fig. 4, SODE21 and SODE22 based on DE1 outperform their competitors on 17 of the 20 benchmarks in terms of final results. The evolutionary curves of SODE21 and SODE22 decline faster than that of DE2, and they steadily attain even better function values than the classical DE algorithms on all the functions. As shown in the results, SODE21 performs best on 15 functions and SODE22 on 3 functions. What is more, DE1 clearly suffers from premature convergence on several functions. In general, SODE21 and SODE22 present more robust performance and faster convergence when the second-order difference information is considered, which shows the necessity and validity of the proposed strategy.

Conclusion and future work
How to utilise the second-order information from the individual vectors is considered in this paper, and a novel second-order DE mechanism, SODE, is proposed and investigated. It effectively expands the current research scope of the classical DE algorithm, which is first-order in nature. It becomes possible to effectively utilise the second-order curvature information of the population to locate even better solutions and to enhance the adaptability of the DE search mechanism. The curvature information of the search space may trigger many novel, challenging and interesting topics in this research field; in particular, it has distinct advantages in avoiding premature convergence. SODE is verified on classic benchmark functions in comparison with other DE algorithms. The simulation results indicate that its performance is very competitive and better than that of the comparison algorithms, which also demonstrates the effective and cooperative contribution of the curvature information.
Better utilisation of the second-order curvature information remains an interesting topic for future research.