A Population-Based Optimization Method Using Newton Fractal

We propose a deterministic population-based method for global optimization, the Newton particle optimizer (NPO). The algorithm uses the Newton method with a guiding function and drives particles toward the current best positions. The particles' movements are influenced by the fractal nature of the Newton method and are greatly diversified in their approach to the current best optimums. As a result, NPO generates a wide variety of searching paths, achieving a balance between exploration and exploitation. NPO differs from other metaheuristic methods in that it combines an exact mathematical operation with heuristics and is therefore open to more rigorous analysis. The local and global search of the method can be handled separately as properties of an associated multidimensional mapping.


Introduction
The most critical component of a random search algorithm is balancing the trade-off between exploration and exploitation. Exploration ensures that the algorithm examines a wide region of the search space, far from the current optimum, to avoid getting trapped in a local minimum. Exploitation is the ability of the algorithm to search the area near the current optimum in an attempt to improve it. Since these two objectives are to some extent in conflict, finding an algorithm that reconciles both is challenging [1,2].
Population-based methods are characterized by a set of multiple potential solutions to the problem, moving towards a near-optimal solution area. These algorithms increase the capability of finding the global minimum since they search simultaneously in many directions by using a population of temporary solutions. Particle swarm optimization (PSO) is a popular method in the field of heuristic search; it steers each particle between the best known position of the swarm and the particle's own best position [3–6].
In this work, we introduce a new population-based metaheuristic method, the Newton particle optimizer (NPO), which achieves a natural balance between exploration and exploitation in global optimization problems. The algorithm drives the particles (searching agents) in the search space using the Newton method and is therefore deterministic, in contrast to most population-based methods. In those conventional methods, each particle combines the currently available information with a random walk to decide the next movement. Hence, the final results may vary even with an identical initial particle distribution. NPO, on the other hand, uses the Newton method and determines the searching path exactly from the current positions of the particles.
While the searching paths are completely reproducible from the initial distribution, they are very irregular, reflecting the fractal nature of the Newton method. It is this complexity of the paths that enables us to develop an efficient optimization method. The Newton method with a proper guiding function makes the particles search close around the current optimums and intermittently pushes them away so that the particles can explore distant potential optimums. However, every particle eventually converges to one of the optimums. If a better optimum is found during this process, it is reflected in the guiding function accordingly.
The metaheuristic approach in optimization has often been criticized for a lack of scientific rigor [7]. There are many "nature-inspired" methods for global optimization that have been introduced without a detailed study of the mechanics underlying them. Since NPO combines a well-established deterministic operation with heuristics, it allows a more rigorous analysis.

Properties of Newton Paths
The Newton method is a fundamental numerical scheme for finding successively better approximations to the roots of a nonlinear function [8,9]. For a real- or complex-valued function $f$, the corresponding scheme to find its zeros is

$$x \leftarrow x - m\,\frac{f(x)}{f'(x)}, \tag{1}$$

where $m$ is the degree coefficient. If $x$ starts close enough to one of the zeros of $f$, it converges to the zero rapidly. While the sequence generated from (1) eventually converges to the zero, its detailed behaviour can be very sensitive to the initial point and the value of the degree coefficient $m$. For example, if $f: ℂ → ℂ$ is a complex function, then the corresponding Newton iteration exhibits a fractal structure in the complex plane [10]. This implies that the trace of $x$, which we call the Newton path, can be very irregular and can vary widely.
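The fractal structure mentioned above can be made visible with a coarse text rendering of the basins of attraction. The sketch below is only an illustration under stated assumptions: it uses the classic test polynomial $z^3 - 1$ (not a function taken from this paper), the plain Newton step with $m = 1$, and hypothetical helper names (`newton_step`, `basin_index`, `basin_map`).

```python
import cmath

# Roots of z^3 - 1 (cube roots of unity); an assumed textbook example.
ROOTS = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]


def newton_step(z, roots, m=1.0):
    """One step z <- z - m f(z)/f'(z) for f(z) = prod_k (z - p_k)."""
    diffs = [z - p for p in roots]
    if any(d == 0 for d in diffs):
        return z  # z is a root: f(z) = 0, so the step is zero
    s = sum(1 / d for d in diffs)  # equals f'(z)/f(z)
    if s == 0:
        return z  # f'(z) = 0: step undefined, leave z unchanged
    return z - m / s


def basin_index(z0, roots, m=1.0, steps=200, tol=1e-8):
    """Index of the root reached from z0, or -1 if none is reached in `steps`."""
    z = z0
    for _ in range(steps):
        z = newton_step(z, roots, m)
        for k, p in enumerate(roots):
            if abs(z - p) < tol:
                return k
    return -1


def basin_map(roots, lo=-2.0, hi=2.0, size=25):
    """Rows of characters: the digit is the root index reached, '?' if none."""
    rows = []
    for r in range(size):
        y = hi - (hi - lo) * r / (size - 1)
        row = ""
        for c in range(size):
            x = lo + (hi - lo) * c / (size - 1)
            k = basin_index(complex(x, y), roots)
            row += "?" if k < 0 else str(k)
        rows.append(row)
    return rows


for line in basin_map(ROOTS):
    print(line)
```

Neighbouring start points near the basin boundaries end up at different roots, which is the sensitivity to the initial point that the Newton paths inherit.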
Consider a polynomial function

$$f(x) = (x - p_1)(x - p_2)\cdots(x - p_n), \tag{2}$$

where $p_1, p_2, \dots, p_n$ are $n$ locations of interest in $ℝ$. Applying the Newton method in (1) to (2) iteratively yields the dynamics of the particle $x$ wandering around $p_1, p_2, \dots, p_n$. Figure 1 illustrates such dynamics around a root. The behaviour of the sequence depends sensitively on the choice of $m$; see Figure 2. While the sequence of a particle with $m = 1$ rapidly converges to one of the roots, $x = 2$, in Figure 2(a), the sequences with $m = 2.5$ and $m = 4$ in Figures 2(b) and 2(c), respectively, seem to keep wandering around the roots, $x = 0$, $1$, or $2$, along the iterations. The particle with a higher value of $m$ makes a more irregular movement around the target, even taking a jump as high as 60. However, it should be noted that the sequences are bounded and never get away from $x = 2$.
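The one-dimensional dynamics described above can be reproduced with a few lines of code. The following is a minimal sketch, assuming that scheme (1) has the relaxed form $x \leftarrow x - m f(x)/f'(x)$ and that the roots are 0, 1, and 2 as in the discussion of Figure 2. The function name `newton_path` and the start point 5.0 are illustrative choices, not values from the paper.

```python
def newton_path(roots, x0, m, steps=50):
    """Return [x0, x1, ...] for x <- x - m f(x)/f'(x), f(x) = prod_k (x - p_k)."""
    path = [x0]
    x = x0
    for _ in range(steps):
        diffs = [x - p for p in roots]
        if any(d == 0.0 for d in diffs):
            # x sits exactly on a root: f(x) = 0, so the particle stays put.
            path.append(x)
            continue
        # For a factored polynomial, f'(x)/f(x) = sum_k 1/(x - p_k).
        s = sum(1.0 / d for d in diffs)
        if s == 0.0:
            # f'(x) = 0: the step is undefined, so the path ends here.
            break
        x = x - m / s
        path.append(x)
    return path


roots = [0.0, 1.0, 2.0]
for m in (1.0, 2.5, 4.0):
    path = newton_path(roots, x0=5.0, m=m)
    print(m, [round(x, 3) for x in path[:8]])
```

With $m = 1$ the start point 5.0 lies to the right of the largest root, where $f$ is positive and convex, so the iterates decrease monotonically to $x = 2$. For the larger values of $m$ the printed prefixes show the wandering behaviour rather than convergence.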
The dynamics of a particle in the 2-dimensional search space is even more diversified under the guiding function $f = f(z)$, $z ∈ ℂ$. Sensitive dependence of the Newton method on the initial point divides the domain into complicated regions (basins of attraction), according to where the resulting sequence converges; the boundary between these regions is the Julia set. Figure 3 compares three Newton paths with distinct values of $m$ in the complex plane. Three particles are initiated at the same point (5, 5) and are led to the origin under the same guiding function $f(z) = z(z - 2i)(z + 1 - i)$. While the particle with $m = 1$ makes a gradual search toward the origin, the ones with $m = 2.5$ and $m = 3.5$ show rather wild swings around it. It is notable that the movements of the latter two particles are an irregular mix of jumping and mincing. Figure 4 shows that the Newton paths change sensitively with the initial point near (3, 3), while they eventually converge to the same point. These "diversely convergent" searching paths promote a balanced search between exploration and exploitation.

Complexity
It is the complexity of the Newton paths that enables us to develop an efficient optimization method. Suppose $g$ is the fitness function to be optimized. One constructs a guiding function $f$ such that the roots of $f$ coincide with the current best optimums of $g$. Then all particles are driven by (1) along the Newton paths to the zeros of $f$, which are also the candidate optimums of $g$. The particles are likely to move close around the current optimums, or sometimes take a sudden jump away from them. If a better optimum is found during this process, the guiding function $f$ is updated accordingly.

Method
3.1. Algorithm. This section introduces the Newton particle optimizer based on the Newton paths discussed in Section 2.
We begin with 2-dimensional examples to illustrate the algorithm of NPO clearly. Suppose $g: ℂ → ℝ$ is the fitness function, or the cost function to be optimized. We adopt $N$ particles $z_i$, $1 \le i \le N$, as "searching agents," assigning each of them a degree coefficient $m_i ∈ ℂ$. At each iteration of the scheme, the $n$ best fitters with respect to the fitness function $g$ are selected as leading particles. That is, we pick $p_1, \dots, p_n$ such that

$$g(p_1) \le g(p_2) \le \cdots \le g(p_n) \le g(z_j) \quad \text{for all } z_j \notin \{p_1, \dots, p_n\}, \tag{3}$$

where $\{p_1, \dots, p_n\} \subset \{z_1, \dots, z_N\}$. A guiding function $f(z)$ is constructed as a polynomial of degree $n$ whose roots coincide with $p_1, \dots, p_n$, as

$$f(z) = \prod_{k=1}^{n} (z - p_k). \tag{4}$$

Then, we update the particles' positions $z_1, \dots, z_N$ by applying the scheme

$$z_i \leftarrow z_i - m_i\,\frac{f(z_i)}{f'(z_i)}, \quad 1 \le i \le N. \tag{5}$$

Note that all the particles except the leading particles are attracted toward the leading particles. Once the positions of all particles are updated, we examine their fitness to choose the next leading particles $p_1, p_2, \dots, p_n$. Then, we refresh the guiding function $f$ accordingly and reapply (5) to the particles. As long as the guiding function is a polynomial function, the convergence of the algorithm is guaranteed from the convergence of Newton's method and the monotone convergence theorem. However, as with other heuristic optimization methods, convergence here means convergence of the sequence of solutions in which all particles have converged to the positions of the leading particles. In the search space, those points may or may not be the optimum.
The pseudocode of the procedure is as described in Algorithm 1.
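The loop described in this section can be sketched in Python for the complex plane. This is an illustrative reading of the procedure, not the authors' implementation: it assumes the relaxed update $z \leftarrow z - m f(z)/f'(z)$ with $f(z) = \prod_k (z - p_k)$, real-valued $m_i$, a hypothetical two-well fitness function, and hypothetical names (`relaxed_newton_update`, `npo_minimize`, `fitness`).

```python
def relaxed_newton_update(z, m, leaders):
    """One update z <- z - m f(z)/f'(z) with f(z) = prod_k (z - p_k)."""
    diffs = [z - p for p in leaders]
    if any(d == 0 for d in diffs):
        return z  # leaders are zeros of f, so they do not move
    s = sum(1 / d for d in diffs)  # f'(z)/f(z)
    return z if s == 0 else z - m / s


def npo_minimize(g, particles, ms, n_leaders=4, iterations=200):
    """Sketch of the NPO loop in C; returns (best position, best fitness)."""
    z = list(particles)
    for _ in range(iterations):
        # The n best fitters become the roots of the guiding function.
        leaders = sorted(z, key=g)[:n_leaders]
        z = [relaxed_newton_update(zi, mi, leaders) for zi, mi in zip(z, ms)]
    best = min(z, key=g)
    return best, g(best)


def fitness(z):
    """Hypothetical two-well cost: global minimum 0 at 2.3+0.7j."""
    return min(abs(z - (2.3 + 0.7j)), abs(z - (-2 - 2j)) + 0.5)


# Deterministic initial distribution: a 6 x 6 grid and evenly spread m_i.
particles = [complex(x, y) for x in range(-5, 6, 2) for y in range(-5, 6, 2)]
m_max = 4.0  # assumed choice m_max = n for n = 4 leaders
ms = [m_max * (i + 1) / len(particles) for i in range(len(particles))]

best, value = npo_minimize(fitness, particles, ms)
print(best, value)
```

Because the best particle is always a leader and leaders stay in place, the best fitness in this sketch never gets worse from one iteration to the next, and rerunning it with the same initial grid reproduces the same result.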
In $ℝ^d$, $d \ge 3$, we set a guiding function $f(x) = (f_1(x), \dots, f_d(x))^T$, where $x ∈ ℝ^d$ and each $f_i(x)$ is a real-valued function on $ℝ^d$. Note that the update scheme in (5) becomes the following multidimensional Newton scheme:

$$x \leftarrow x - M\,Df|_x^{-1}\,f(x), \tag{6}$$

where $M$ is a $d$-by-$d$ constant matrix and $Df|_x^{-1}$ is the inverse of the Jacobian matrix of $f$ at $x$. The eigenvalues of $M$ play the role of $m$ in determining the local search tendency.
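A single step of the multidimensional scheme can be sketched as follows, assuming a diagonal matrix $M$ and solving the linear system $Df(x)\,\delta = f(x)$ instead of forming the inverse explicitly. The guiding function used here is a hypothetical linear map $f(x) = A(x - p)$ chosen only so that the result is easy to check by hand; it is not the guiding function proposed in this paper, and all names are illustrative.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    d = len(b)
    aug = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("singular Jacobian")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, d):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, d + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * d
    for r in range(d - 1, -1, -1):
        acc = aug[r][d] - sum(aug[r][c] * x[c] for c in range(r + 1, d))
        x[r] = acc / aug[r][r]
    return x


def newton_step_nd(f, jac, x, m_diag):
    """One step x <- x - M Df(x)^{-1} f(x) with M = diag(m_diag)."""
    delta = solve(jac(x), f(x))
    return [xi - mi * di for xi, mi, di in zip(x, m_diag, delta)]


# Hypothetical linear guiding function f(x) = A (x - P) with a single zero P.
A = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
P = [1.0, -2.0, 0.5]


def f(x):
    return [sum(A[i][j] * (x[j] - P[j]) for j in range(3)) for i in range(3)]


def jac(x):
    return [row[:] for row in A]  # the Jacobian of a linear map is constant


print(newton_step_nd(f, jac, [4.0, 4.0, 4.0], [1.0, 1.0, 1.0]))
print(newton_step_nd(f, jac, [4.0, 4.0, 4.0], [0.5, 0.5, 0.5]))
```

For this linear example the full step ($M = I$) lands on the zero $P$ and the half step ($M = 0.5\,I$) stops midway, which shows how the entries of $M$ scale the move toward the target.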
3.2. Criteria for Choice of m and M. In Section 3.1, it is illustrated that the parameter $m$ in (5) determines the diversity of the searching paths of NPO. We can show that particles driven by NPO remain bounded if $m$ satisfies $0 < m < 2n$, where $n$ is the number of leading particles, or equivalently, the degree of the polynomial guiding function $f$. For a small $m$, the movement of particles slows down. On the other hand, it becomes very irregular if $m$ is close to $2n$. We can also find a more practical criterion for $m$ from experiments. In practice, we assign a different value of $m_i$ to each particle to diversify the search. We usually pick them uniformly from the interval $(0, m_{\max})$. Figure 5 shows that the optimal values for $m_{\max}$ tend to be around the middle of the range, that is, $m_{\max} \approx n$. When using fewer particles, one may choose $m_{\max} > n$, so that the scheme generates more irregular searching paths and therefore compensates for a lack of diversity. In the higher dimensional NPO in (6), the eigenvalues of $M$ play a role similar to that of $m$ in deciding the local search tendency. Practically, one can simply use a diagonal matrix whose diagonal entries follow the distribution of $m_i$ mentioned above.
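The bound $0 < m < 2n$ can be made plausible with a short far-field estimate. The following is a heuristic sketch for real $m$ and a factored polynomial guiding function, assuming the relaxed update $x \leftarrow x - m f(x)/f'(x)$; it is not the authors' proof, and the centroid $c$ is a symbol introduced here for the estimate.

```latex
% Far from the roots p_1, ..., p_n, with centroid c = (1/n) \sum_k p_k:
\[
\frac{f'(x)}{f(x)} = \sum_{k=1}^{n} \frac{1}{x - p_k}
  = \frac{n}{x - c} + O\!\left(|x|^{-3}\right)
\quad\Longrightarrow\quad
\frac{f(x)}{f'(x)} = \frac{x - c}{n} + O\!\left(|x|^{-1}\right).
\]
% Hence one relaxed Newton step acts, to leading order, as a scaling about c:
\[
x_{\text{new}} - c = \Bigl(1 - \frac{m}{n}\Bigr)(x - c) + O\!\left(|x|^{-1}\right).
\]
% Distant particles are pulled back toward the roots exactly when
\[
\Bigl|\,1 - \frac{m}{n}\Bigr| < 1
\quad\Longleftrightarrow\quad
0 < m < 2n .
\]
```

The same estimate suggests why $m \approx n$ sits in the middle of the range: the leading factor $1 - m/n$ then vanishes, so a distant particle returns to the neighbourhood of the leading particles in roughly one step.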
From the above observation, we can see that adjusting $m_{\max}$ is one way to control the trade-off between exploitation and exploration in NPO. Let us define an exploration indicator (EI) as

$$EI_1 = \frac{m_{\max}}{2n}. \tag{7}$$

Note that $0 < EI_1 < 1$. If $EI_1$ is small, the particles tend to search near the current optimums. If $EI_1$ is close to 1, they explore widely away from the current optimums along irregular Newton paths. We usually set $EI_1 \approx 0.5$.

3.3. Construction of Guiding Functions.
The essence of NPO is that diversely exploring paths toward the current optimums can be generated easily from a simple deterministic iteration working on a guiding function. Generating a proper guiding function is therefore the main issue in balancing exploration and exploitation. A guiding function is expected to have the following three properties:
(i) The function has zeros at the designated points and no zeros elsewhere.
(ii) The function is symmetric.
(iii) The inverse of its Jacobian is easy to compute.
Condition (i) is necessary for the guiding function to drive particles to a target. Condition (ii) is for an unbiased search. Polynomial functions are likely to satisfy condition (iii) and reduce the computational cost. One can easily confirm that a factored polynomial such as (4) in 1- and 2-dimensional search spaces naturally satisfies (i), (ii), and (iii). Unfortunately, it is well known that such factored polynomials do not exist in spaces of three or more dimensions.
    for each particle do
        Initialize particle $z_i$ with $m_i$.
    end for
    while maximum iterations or minimum error criterion is not attained do
        for each particle do
            Calculate fitness value $g$.
        end for
        Choose $n$ best members $p_1, p_2, \dots, p_n$ in the search domain.
        Set the guiding function $f(z) = (z - p_1)(z - p_2)\cdots(z - p_n)$.
        for each particle do
            Update the position $z_i$ by (5).
        end for
    end while

We therefore need to sacrifice condition (i), at least partly, to construct a polynomial guiding function in $ℝ^d$, $d \ge 3$.
In order to find a proper extension of (4) to a higher dimensional domain, we rewrite (4) in complex form as $f(x + iy) = u(x, y) + i\,v(x, y)$ with $p_k = q_k + i r_k$. Then the corresponding component functions are

$$u(x, y) = \operatorname{Re} \prod_{k=1}^{n} \bigl((x - q_k) + i(y - r_k)\bigr), \qquad v(x, y) = \operatorname{Im} \prod_{k=1}^{n} \bigl((x - q_k) + i(y - r_k)\bigr). \tag{8}$$

Now we propose, as an extension of (8), a general guiding function (9) in $ℝ^d$, defined with circular indexes. Because condition (i) is only partly satisfied, this function can have zeros other than $p_1, \dots, p_n$. However, it is interesting that such extra zeros do not lower the searching performance. Indeed, such "phantom roots" occur irregularly and lead the particles to a more extensive search. This finding naturally brings us to the issue of how to choose the leading particles for more efficient exploration.
3.4. Criteria for Choice of Leading Particles. The choice of leading particles $p_1, \dots, p_n$ is crucial to searching performance in that their relative positions determine the other particles' movement. One immediate idea is to adopt the current best optimums as suggested in (3). That is, we simply choose the top $n$ rank fitters from the population. However, in many cases this may cause overly fast convergence to the current optimums, giving up further exploration.
To avoid being trapped at local minimums too soon and to diversify global searching paths, we select leading particles not only from the best fitters, but also from mediocre ones. These less efficient performers prevent a situation where a few similar local minimums happen to dominate all the other particles. Figure 6 compares two cases of NPO, one of which uses the top 5 rank fitters and the other uses the top 4 rank fitters and 1 mediocre fitter with a low rank as leading particles. Interestingly, while the case with this "idle leader" seems to be slow at first, it eventually becomes more successful in finding the minimum value. This can be explained by a balance between exploitation and exploration: when the best fitters lead other particles to the current best, they may cause overexploitation and make the particles neglect further exploration. However, the existence of the idle leader may distract the particles' attention from the current best and have them search for better optimums. That is, the idle leader enhances the exploration of the search.
From the above observation, we can develop another exploration indicator that is associated with the balance between exploitation and exploration. Let us define a second indicator, $EI_2$, in terms of $\sigma$, the sum of the ranks of the leading particles. One can see that $0 < EI_2 < 1$ and that it becomes close to 0 when all leading particles are chosen from the top best fitters.
Hence, $EI_2 \approx 0$ means low exploration (high exploitation). On the contrary, if all leading particles come from the lowest-rank particles, the indicator increases to 1, implying high exploration (low exploitation). One may adjust $EI_2$ to the middle range for a balanced search. In the examples, $EI_2 = 0.015$ in Figure 6(a) and $EI_2 = 0.366$ in Figure 6(b).
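The leader selection with an "idle leader" and the rank sum $\sigma$ can be sketched as follows. The helper name `select_leaders`, the 1-based rank convention, and the toy fitness values are assumptions made for illustration; the sketch computes only $\sigma$ and does not attempt to reproduce the definition of $EI_2$.

```python
def select_leaders(fitness, n_top, idle_ranks=()):
    """Return (indices of leaders, sigma).

    Ranks are 1-based, rank 1 being the smallest fitness value. The leaders
    are the n_top best fitters plus the particles at the given idle ranks;
    sigma is the sum of the leaders' ranks.
    """
    order = sorted(range(len(fitness)), key=lambda i: fitness[i])
    ranks = list(range(1, n_top + 1)) + list(idle_ranks)
    leaders = [order[r - 1] for r in ranks]
    return leaders, sum(ranks)


# Toy population of 100 particles with distinct fitness values 0..99
# (37 and 100 are coprime, so i -> 37 i mod 100 is a permutation).
fitness = [float((37 * i) % 100) for i in range(100)]

top5, sigma_a = select_leaders(fitness, n_top=5)
mixed, sigma_b = select_leaders(fitness, n_top=4, idle_ranks=(40,))
print(sigma_a, sigma_b)  # 15 50
```

The two calls mirror the two settings of Figure 6: the five best fitters give $\sigma = 1 + 2 + 3 + 4 + 5 = 15$, while four best fitters plus the 40th-rank fitter give $\sigma = 1 + 2 + 3 + 4 + 40 = 50$.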

Numerical Results
This section compares the numerical performance of NPO with that of PSO and the firefly algorithm. The firefly algorithm (FA) is another stochastic population-based method, which has been applied in many areas of optimization [11,12]. We tested the methods on the 28 benchmark functions from the CEC 2013 competition on real-parameter optimization [13]. The functions were 10-dimensional, and 51 optimization runs were performed for each problem. All methods were given the same budget of 100,000 function evaluations. NPO used 4 leading particles, three from the top performers and one from the 50% performer. The matrix $M$ for each particle was chosen such that its eigenvalues lie between 0 and 4. The parameters for PSO were set to the recommended values that are widely used in benchmark tests [14–16]. The parameters for FA were taken from [17,18].

Table 1 shows that the performances of the three schemes are comparable. However, the mean rankings of NPO, PSO, and FA in Table 2 are 1.821, 2.107, and 2.01, respectively, which indicates that NPO performs better on average than PSO and FA on these benchmark functions.

Discussion
NPO takes a deterministic approach based on a mathematical operation, the Newton method. Owing to the inherent fractal properties and strong convergence of the method, NPO seems to enjoy the features of both exploration and exploitation, optimizing effectively over a wide range of functions.
One of the major differences between NPO and other random-walk-based metaheuristic methods is that NPO combines a well-established mathematical operation with heuristics: its local and global search abilities can be controlled as properties of a multidimensional mapping. Since such a mapping can be created easily, like the polynomial function suggested in this paper, the convergent/divergent tendency of the searching paths can also be analysed and finely tuned accordingly. This opens the door to future study of customizing the guiding functions so that they reflect the properties of a given set of problem instances.

Figure 1: Illustration of the particle dynamics under a guiding function in the 1-dimensional search space: a particle searches around a current optimum (boxed point) to find a better one, moving from 1 to 5.

Figure 3: Three Newton paths with different degree coefficients $m$, all initiated from the same point (5, 5). The guiding function is $f(z) = z(z - 2i)(z + 1 - i)$.

Algorithm 1: Newton Particle Optimization (in ℂ).

Figure 5: Performance variation according to $m_{\max}$. 200 particles with 4 leading particles are applied to find the minimum of the Rosenbrock function.

Figure 6: Choice of leading particles. (a) The top 5 rank performers out of 100 particles are adopted as leading particles. (b) The four best fitters and one 40th rank fitter are used. The blue and red dots indicate the cost values of ordinary and leading particles, respectively.

Table 2: Performance comparison of NPO, PSO, and FA in the mean ranking.