Cluster-Based Population Initialization for Differential Evolution Frameworks
Introduction
Engineering and the natural sciences often require the solution of multiple optimization problems, which makes the study of optimization methods extremely important in fields such as design and control engineering. Since only a very limited number of real-world optimization problems can be solved by exact methods, in the vast majority of cases an optimizer that does not require specific hypotheses must be used. Over the past decades, computer scientists have designed a multitude of such algorithms for addressing real-world problems where an exact approach is almost never applicable. These methods, known as metaheuristics, offer no guarantees regarding convergence, but are still capable of detecting high-quality solutions that can be of great interest to engineers and practitioners. The plethora of metaheuristics includes Evolutionary Algorithms (EAs) [26], Swarm Intelligence (SI) [25], and Memetic Computing (MC) [50].
For about two decades, i.e. from the 1970s to the 1990s, computer scientists put much effort into designing metaheuristics with the intention of finding an algorithm that could outperform all others. After the publication of the No Free Lunch (NFL) Theorems [71], the view of scientists and practitioners on optimization underwent a radical change. The NFL Theorems prove that all optimization algorithms, under the hypotheses that they search within a finite set of candidate solutions and never visit the same point/candidate solution twice, display the same performance when averaged over all possible optimization problems. As an immediate consequence, it was clear that it was no longer useful to discuss which algorithm was universally better or worse. Despite the fact that the hypotheses of the NFL Theorems are often not realistic (for example, it is very unlikely that an EA never generates the same point twice during a run), a large portion of the algorithmic design community started to propose algorithms tailored to specific problems, see e.g. [64], [14], [53], instead of trying to propose universally applicable ones. On the other hand, using the non-realism of the NFL hypotheses as an argument, another portion of the optimization community has in recent years attempted to push towards the outer limit of these theorems by proposing relatively flexible algorithmic structures that combine (to some extent) robustness and high performance on various problems. This tendency is especially clear in continuous optimization and for algorithms characterized by adaptively coordinated heterogeneous algorithmic components. For these two sub-fields, the NFL Theorems have been proved not to hold, see e.g. [3], [58], respectively.
Since modern algorithms for continuous optimization are often composed of multiple adaptively coordinated operators, these two sub-fields are not disjoint. For example, in the context of Differential Evolution (DE), the optimizer proposed in [59] combines and coordinates multiple mutation strategies by making use of a learning period and a randomized success-based logic (see also [22], [52]). In [44], another DE-based strategy, namely the ensemble, was presented on the basis of a strategy used in the Evolutionary Programming algorithm of [43]. In the ensemble, multiple mutation and crossover strategies, as well as the related parameters, are encoded within the solutions and evolve with them. Other harmonic self-adaptive combinations of components within the DE framework are proposed in [8], [7]. In the context of Particle Swarm Optimization (PSO), a harmonic coordination of multiple components is also a popular option for enhancing algorithmic robustness over a range of problems. An emblematic example of this strategy is the so-called Frankenstein's PSO [45]. A more elegant algorithm that coordinates, in a simple way, a perturbation logic with a variable decomposition is proposed in [37].
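To make "multiple mutation strategies" concrete, the sketch below shows two classical DE mutation operators that coordination schemes of this kind typically choose between. It is a minimal illustration with illustrative names; it reproduces neither the learning-period logic of [59] nor the ensemble encoding of [44].

```python
import numpy as np

def de_rand_1(pop, F=0.5, rng=None):
    """DE/rand/1 mutation: v = x_r1 + F * (x_r2 - x_r3)."""
    rng = rng or np.random.default_rng()
    r1, r2, r3 = rng.choice(len(pop), 3, replace=False)
    return pop[r1] + F * (pop[r2] - pop[r3])

def de_best_1(pop, fitness, F=0.5, rng=None):
    """DE/best/1 mutation: v = x_best + F * (x_r1 - x_r2)."""
    rng = rng or np.random.default_rng()
    best = pop[np.argmin(fitness)]
    r1, r2 = rng.choice(len(pop), 2, replace=False)
    return best + F * (pop[r1] - pop[r2])
```

An adaptive scheme would record which operator produced successful offspring and bias future choices accordingly; the operators themselves stay as above.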
Some studies focus on coordination techniques in order to obtain a robust algorithmic behavior. Several nomenclatures are used in different contexts to express fairly similar concepts. The term portfolio usually refers to algorithmic frameworks composed of optimizers that are alternately selected at run time. The selection criterion can be a simple schedule or a more sophisticated adaptive system. Some examples in the context of continuous optimization are given in [68], [57]. In the context of combinatorial optimization, and more specifically for the maximum satisfiability problem, a popular portfolio named the SATzilla platform, see [72], [32], has been proposed. The difficulty of finding a trade-off among the search algorithms and the aim of determining an automatic coordination system are studied in [33]. A model of the behavior of optimizers for predicting their run time is presented in [34]. Very closely related to the concept of portfolio, hyper-heuristics are composed of multiple algorithms usually coordinated by a machine learning algorithm that takes a supervisory role. This term is, in the vast majority of cases, applied to combinatorial problems. Famous examples of hyper-heuristics have been proposed in [20], [11] in the field of timetabling and rostering, while in [12] graph coloring heuristics are coupled with a random ordering heuristic. An important concept in hyper-heuristic implementation is the choice function, that is, a criterion that assigns a reward score to the most promising heuristic, see [20]. More sophisticated coordination schemes in the literature make use of reinforcement learning in a stand-alone or combined fashion, see e.g. [11], [23], and memory-based mechanisms, see [10]. Elegant learning schemes coupled with multiple operators (multi-agents) for addressing complex optimization problems are presented in [2], [1].
Closely related to hyper-heuristics and portfolio algorithms, Memetic Algorithms (MAs) are optimization algorithms composed of an evolutionary framework and a set of local searchers activated within the generation cycle, see [46], [30]. In MAs, as in the related algorithmic families, optimization is carried out by multiple components/sub-algorithms, but unlike them, MAs emphasize the global and local search roles of their components. Although one may argue that there is no clear definition of global and local search (e.g. a properly tuned DE can be used as a local searcher), the term MA is broadly used to refer to population-based hybrid algorithms. Moreover, modern MA implementations ignore the original requirement that the population-based framework be evolutionary and also refer to algorithms based on a SI framework as MAs, see e.g. [69], [66]. Recently, the concept of MA has been extended to single-solution algorithms and, more generally, to any algorithm composed of multiple/heterogeneous components. In the latter case the subject is termed, by a part of the computer science community, Memetic Computing (MC) and its implementations MC structures, see e.g. [50], [49], [54], [55].
Regardless of the nomenclature used, an important issue, which is also the focus of this paper, is the generation of the initial population in population-based hybrid algorithms. Nearly all population-based metaheuristics start with the random sampling of a prefixed number of points within the decision space. This choice can be explained by the motivation, "since we have no a priori knowledge of the problem, we give each possible candidate solution the same chance to be in the starting population". Obviously, there is nothing wrong with this way of reasoning. Moreover, this initialization has the undoubted advantage of being computationally cheap, as it requires neither objective function evaluations nor other complex operations. On the other hand, for every problem there likely exist many other strategies that can lead to much better results. Similar in motivation, but very different in implementation, a fully deterministic procedure that spreads the points over the decision space is also possible, see [41]. In the latter case, the motivation can be summarized as "since we have no a priori knowledge of the problem, we try to sample the initial points so that they cover the decision space as much as possible". Besides being computationally expensive, this choice carries an implicit drawback: the minimum number of points necessary to cover the decision space grows exponentially with the dimensionality of the problem. Hence, in high dimensions an unreasonably large number of points is required to obtain a representative coverage of the search space. Some studies on the degree of randomization of the initial population have been reported in the literature, especially for EAs, see [42], [61]. It is shown that in many cases a deterministic initial sampling can lead to a performance deterioration. On the other hand, a random sampling within mapped areas of the decision space, i.e. a quasi-random sampling, leads to a robust algorithmic behavior without excessively jeopardizing the performance with respect to a simple (pseudo-)random sampling.
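The contrast between plain pseudo-random sampling and a quasi-random, mapped-area sampling can be sketched as follows. Latin hypercube sampling is used here as one illustrative stratified scheme; it is not the specific procedure studied in [42] or [61].

```python
import numpy as np

def random_init(n_points, bounds, rng):
    """Plain pseudo-random sampling: every point uniform in the box."""
    lo, hi = bounds
    return rng.uniform(lo, hi, size=(n_points, len(lo)))

def latin_hypercube_init(n_points, bounds, rng):
    """Quasi-random sampling: each axis is split into n_points strata and
    each stratum receives exactly one point, so the projection on every
    variable covers the whole range."""
    lo, hi = np.asarray(bounds[0]), np.asarray(bounds[1])
    n_dim = len(lo)
    # one jittered point per stratum, per dimension
    u = (np.arange(n_points)[:, None] + rng.random((n_points, n_dim))) / n_points
    for d in range(n_dim):  # independent shuffle per dimension decouples the axes
        u[:, d] = u[rng.permutation(n_points), d]
    return lo + u * (hi - lo)
```

Note that the stratification acts per axis, so the cost stays linear in the dimensionality, unlike a full grid coverage.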
Whenever some knowledge about the problem is available, a sampling that exploits this knowledge can enhance the algorithmic performance, see [24]. For example, in control engineering an initial tuning of the control parameters usually allows an estimation of the instability region and a rough estimation of the region of interest. An initial sampling in this region of the decision space can bias the search towards a quick detection of the optimum, see e.g. [14].
Although in the vast majority of cases a priori knowledge of the problem is not available, there is always the possibility of performing a problem characterization at run time in order to extract features to be exploited in the subsequent stages of the optimization, see [18]. A pioneering study in this direction proposed selective sampling [5]. This procedure consists of an initial random initialization containing a large number of points, followed by a tournament selection that shrinks the population to the individuals displaying the best performance.
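A minimal sketch of this oversample-then-shrink idea, in the spirit of [5]; the oversampling factor and binary tournament size are illustrative parameters, not values prescribed by that study.

```python
import numpy as np

def selective_sampling(f, bounds, pop_size, oversample=4, tour_size=2, rng=None):
    """Selective sampling: draw a large random candidate set, then shrink it
    to pop_size individuals via tournament selection on fitness (minimization)."""
    rng = rng or np.random.default_rng()
    lo, hi = bounds
    cand = rng.uniform(lo, hi, size=(pop_size * oversample, len(lo)))
    fit = np.array([f(x) for x in cand])
    winners = []
    for _ in range(pop_size):
        idx = rng.choice(len(cand), tour_size, replace=False)
        winners.append(idx[np.argmin(fit[idx])])  # keep the tournament winner
    return cand[winners]
```

The extra objective function evaluations spent on the oversampled set are the price paid for starting from fitter individuals.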
Within the context of DE, the sampling of extra points according to a central symmetry (opposition-based points [60]) appears to be beneficial to the algorithmic performance. Another approach consists of applying a local search to one or more solutions and then inserting the improved solutions into the initial population of an optimizer. The scheme that improves one solution and inserts it into a DE initial population is termed super-fit and has displayed a very good performance with respect to the same algorithm using a random population and devoting its entire budget to DE, see [15], [35], [19].
This article proposes a novel algorithmic component for pre-processing the initial solutions and generating an initial population for DE algorithms. The proposed component does not require any assumption about the optimization problem (except that it is continuous). More specifically, an initial screening of the problem is implicitly performed in order to detect the most promising regions of the decision space. This result is achieved by a multi-stage procedure. At first, a set of points is sampled at random. Subsequently, two local searchers with very different features are consecutively applied to them with a shallow depth. The resulting points are then clustered. The population of the optimization algorithm is then composed of those individuals belonging to each cluster that display the highest performance and of other points, sampled from their neighborhood according to a probabilistic criterion. Thus, the initial population is composed of points displaying a good performance and spread over different basins of attraction. A graphical representation of the entire framework is presented in Fig. 1.
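The multi-stage procedure can be sketched schematically as follows. A simple greedy hill-climber stands in for the two local searchers, a naive k-means for the clustering stage, and a fixed Gaussian neighborhood for the probabilistic sampling criterion; all three are illustrative stand-ins for the components described in Section 2, not the actual operators.

```python
import numpy as np

def shallow_local_search(f, x, rng, steps=10, sigma=0.1):
    """Stand-in local searcher: a few greedy Gaussian perturbations."""
    fx = f(x)
    for _ in range(steps):
        y = x + rng.normal(0.0, sigma, size=x.shape)
        fy = f(y)
        if fy < fx:
            x, fx = y, fy
    return x

def cbpi_sketch(f, bounds, n_sample, pop_size, n_clusters, rng=None):
    """Schematic CBPI: sample, refine shallowly, cluster, then build the
    population from each cluster's best point plus neighbours around it."""
    rng = rng or np.random.default_rng()
    lo, hi = bounds
    pts = rng.uniform(lo, hi, size=(n_sample, len(lo)))
    pts = np.array([shallow_local_search(f, x, rng) for x in pts])
    # naive k-means (few iterations) as the clustering stage
    centers = pts[rng.choice(n_sample, n_clusters, replace=False)]
    for _ in range(5):
        labels = np.argmin(((pts[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for k in range(n_clusters):
            if (labels == k).any():
                centers[k] = pts[labels == k].mean(axis=0)
    # cluster-best points seed the population; fill up with neighbours
    pop = [min(pts[labels == k], key=f) for k in range(n_clusters) if (labels == k).any()]
    while len(pop) < pop_size:
        seed = pop[rng.integers(len(pop))]
        pop.append(seed + rng.normal(0.0, 0.05, size=seed.shape))
    return np.array(pop[:pop_size])
```

The resulting population mixes cluster-best points with nearby perturbations, which is the intended effect: good individuals spread over several basins of attraction.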
The combination of clustering techniques with the DE framework for global optimization has been investigated in different ways. For example, paper [70] makes use of a clustering technique over the individuals of a DE population in order to prevent diversity loss and premature convergence. Paper [13] uses one-step k-means clustering as a multi-parent crossover. This idea is developed in [39], where k-means clustering is associated with two novel crossover operators. In the context of dynamic optimization problems, the algorithm in [29] uses a multi-population approach where each sub-population covers a different area of the decision space. The number of clustered populations is dynamically varied during the optimization by means of an adaptive logic.
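The one-step k-means idea essentially turns the centroid computation into a multi-parent crossover; a minimal sketch in that spirit (the parent count is an illustrative parameter, and this does not reproduce the exact operator of [13]):

```python
import numpy as np

def centroid_crossover(pop, n_parents=3, rng=None):
    """Multi-parent crossover in the one-step k-means spirit: the offspring
    is the centroid of a few randomly chosen parents."""
    rng = rng or np.random.default_rng()
    idx = rng.choice(len(pop), n_parents, replace=False)
    return pop[idx].mean(axis=0)
```

Because the offspring lies inside the convex hull of its parents, this operator pulls the population towards dense, promising regions.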
The remainder of this article is organized as follows. Section 2 describes the proposed initialization procedure. Section 3 shows, over a large set of problems, the effect of the proposed initialization on multiple and diverse optimizers. Finally, Section 4 gives the conclusions of this work.
Section snippets
The proposed Cluster-Based Population Initialization
Without loss of generality, in order to clarify the notation in this paper, we refer to the minimization problem of an objective function (or fitness) f(x), where the candidate solution x is a vector of n design variables (or genes) in a decision space D. Thus, the optimization problem considered in this paper consists of the detection of that solution x* ∈ D such that f(x*) ≤ f(x), and this is valid ∀x ∈ D. Array variables are highlighted in bold face throughout this paper.
Before entering into the
Numerical results
In order to test the validity and potential of the proposed CBPI, the following testbeds have been taken into account:
- The CEC2013 benchmark described in [38] in 10, 30, and 50 dimensions (28 test problems).
- The BBOB2010 benchmark described in [48] in 100 dimensions (24 test problems).
- The CEC2010 benchmark described in [65] in 1000 dimensions (20 test problems).
In addition, one real-world problem from [21] is also studied. In total, 129 problems over 5 dimensionality values have been considered.
Conclusion
This article proposes a software module that processes a population randomly sampled within a decision space and performs an intelligent sampling to detect the most interesting/promising areas of the domain. This software module is composed of three sub-modules that consecutively act on the sampled points. At first, two local search algorithms characterized by different search logics are applied to each solution with a limited budget. During the second stage, the improved solutions are
Acknowledgement
This research is supported by the Academy of Finland, Akatemiatutkija 130600, “Algorithmic design issues in Memetic Computing”.
References (73)
- et al., A graph-based hyperheuristic for educational timetabling problems, Eur. J. Oper. Res. (2007)
- et al., A clustering-based differential evolution for global optimization, Appl. Soft Comput. (2011)
- et al., Parallel memetic structures, Inf. Sci. (2013)
- et al., An analysis on separability for memetic computing automatic design, Inf. Sci. (2014)
- et al., A simulated annealing based hyperheuristic for determining shipper sizes for storage and transportation, Eur. J. Oper. Res. (2007)
- et al., Algorithm runtime prediction: methods and evaluation, Artif. Intell. (2014)
- et al., A novel clustering-based differential evolution with 2 multi-parent crossovers for global optimization, Appl. Soft Comput. (2012)
- et al., Quasi-random initial population for genetic algorithms, Comput. Math. Appl. (2004)
- et al., Ensemble strategies with adaptive evolutionary programming, Inf. Sci. (2010)
- et al., Differential evolution algorithm with ensemble of parameters and mutation strategies, Appl. Soft Comput. (2011)
- Cluster-based differential evolution with crowding archive for niching in dynamic environments, Inf. Sci.
- Memetic algorithms and memetic computing optimization: a literature review, Swarm Evol. Comput.
- Compact particle swarm optimization, Inf. Sci.
- Random number generators in genetic algorithms for unconstrained and constrained optimization, Nonlinear Anal.: Theory Methods Appl.
- Silhouettes: a graphical aid to the interpretation and validation of cluster analysis, J. Comput. Appl. Math.
- A memetic particle swarm optimization algorithm for multimodal optimization problems, Inf. Sci.
- A dynamic clustering based differential evolution algorithm for global optimization, Eur. J. Oper. Res.
- A multi-agent memetic system for human-based knowledge selection, IEEE Trans. Syst. Man Cybern. – Part A
- Hierarchical optimization of personalized experiences for e-learning systems through evolutionary models, Neural Comput. Appl.
- Continuous lunches are free!
- Nonlinear Programming: Theory and Algorithms
- Self-adapting control parameters in differential evolution: a comparative study on numerical benchmark problems, IEEE Trans. Evol. Comput.
- Differential evolution and differential ant-stigmergy on dynamic optimisation problems, Int. J. Syst. Sci.
- Self-adaptive differential evolution algorithm using population size reduction and three strategies, Soft Comput.
- Population size reduction for the differential evolution algorithm, Appl. Intell.
- A tabu search hyperheuristic for timetabling and rostering, J. Heuristics
- A fast adaptive memetic algorithm for on-line and off-line control design of PMSM drives, IEEE Trans. Syst. Man Cybern. – Part B
- Super-fit control adaptation in memetic differential evolution frameworks, Soft Comput. – Fusion Found. Methodol. Appl.
- The importance of being structured: a comparative study on multi stage memetic approaches
- A hyperheuristic approach to scheduling a sales summit
- Problem Definitions and Evaluation Criteria for CEC 2011 Competition on Testing Evolutionary Algorithms on Real World Optimization Problems
- Differential evolution: a survey of the state-of-the-art, IEEE Trans. Evol. Comput.
- Memetic algorithms, domain knowledge, and financial investing, Memetic Comput.
1. Tel.: +358 14 260 1211; fax: +358 14 260 1021.