Memetic Algorithms with Local Search Chains in R : The Rmalschains Package

Global optimization is an important field of research both in mathematics and computer science. It has applications in nearly all fields of modern science and engineering. Memetic algorithms are powerful problem solvers in the domain of continuous optimization, as they offer a trade-off between exploration of the search space using an evolutionary algorithm scheme, and focused exploitation of promising regions with a local search algorithm. In particular, we describe the memetic algorithms with local search chains (MA-LS-Chains) paradigm, and the R package Rmalschains, which implements it. MA-LS-Chains has proven to be effective compared to other algorithms, especially in high-dimensional problem solving. In an experimental study, we demonstrate the advantages of using Rmalschains for high-dimensional optimization problems in comparison to other optimization methods already available in R.


Introduction
Global optimization, i.e., finding the inputs to a function that yield minimal/maximal output, is an important mathematical problem with applications in nearly all fields of modern science and engineering. Nowadays, as the functions to optimize are often complex and high-dimensional, mathematical analysis may be difficult, costly, or even impossible. In contrast, computational power is steadily growing, and optimization has evolved into an important line of research in computer science. Here, meta-heuristics are developed for general optimization to produce approximations to an exact solution with sufficiently good quality in a reasonable amount of time and with reasonable computational effort (Michalewicz and Fogel 2004). The algorithms treat the target function in a black-box manner, i.e., no preconditions or further information is required, such as continuity, differentiability, or derivatives of the function.
One such meta-heuristic, which has led to a vast number of successful algorithms and implementations in the past years, is the evolutionary algorithm (EA) framework (Bäck, Fogel, and Michalewicz 1997). Here, a population of possible solutions is evolved by altering the solutions (mutation), and by letting them interact with each other (crossover). The candidate solutions are evaluated for their fitness, and newly created solutions replace the solutions with the worst fitness, so that the population converges around the region with the best fitness. Note that in the context of EAs, the term solution is often used to mean a candidate solution, i.e., a parameter configuration to be evaluated (Bäck et al. 1997). A solution is not necessarily a good or an optimal solution.
Using a population allows EAs to explore the search space well, but they are sometimes not capable of exploiting a promising region to reach the local optimum of that region. Consequently, the solutions they obtain are sometimes not accurate.
Local search (LS) methods, on the other hand, can improve a solution very quickly, but they are not able to explore a complex search domain, as they find only local optima, i.e., solutions which are optimal in a certain region of the search space. Only for convex problems are LS methods fully suitable, as in this case a local optimum is also a global optimum (see, e.g., Mullen 2014). Memetic algorithms (MA; Moscato 1999; Krasnogor and Smith 2005) are a hybridization of EA and LS methods, with the objective of taking advantage of both the exploration power of EAs and the exploitation power of the LS, thereby improving the overall results (Goldberg and Voessner 1999). MAs that integrate an LS method within the EA iteration can potentially obtain better results than applying a final LS method after the EA run, because the improvements obtained by the LS can better guide the search towards the best solutions, which can then be selected by the EA. As not all solutions are equally good, MAs obtain the best results if they apply the LS method with a higher intensity, i.e., using more fitness function evaluations (more iterations), on the most promising solutions, which are the ones with the best fitness.
In this paper, we present the package Rmalschains (Bergmeir, Molina, and Benítez 2016) for R (R Core Team 2016), which implements various variants of the memetic algorithm with local search chains paradigm (MA-LS-Chains; Molina, Lozano, García-Martínez, and Herrera 2010). MA-LS-Chains is an MA whose main feature lies in its ability to apply the LS several times on the same solution, using each time a fixed number of iterations/function evaluations. The final state of the LS parameters after each LS application becomes the initial point of a subsequent LS application on the same solution, creating an LS chain. In this way, with varying length of the chain, i.e., varying numbers of times the LS is called on an individual, MA-LS-Chains adapts the intensity of the LS applied to a solution as a function of its quality.
The MA-LS-Chains algorithm family has proven to be very effective in continuous optimization problems in the past. MA-SW-Chains, which employs the Solis-Wets algorithm (SW) for LS, was the competition winner of CEC'2010 for high-dimensional optimization (Tang, Li, and Suganthan 2010). MA-CMA-Chains, which employs the covariance matrix adaptation evolution strategy (CMA-ES) as LS (see Section 2), performed very well in the BBOB'2009 competition (Hansen, Auger, Ros, Finck, and Pošík 2010), and also on the data of the CEC'2005 competition (Deb and Suganthan 2005; García, Molina, Lozano, and Herrera 2009); though it did not take part in the official competition, it was evaluated under the same conditions in Molina et al. (2010). In this package, both LS methods are available, and the user can choose which one will be used (the default LS method is CMA-ES).
There is already a host of choices for continuous optimization methods readily available in R; Section 4 gives an overview. However, an important result in research on optimization is the existence of several "No Free Lunch" theorems, which "mean that if an algorithm does particularly well on average for one class of problems then it must do worse on average over the remaining problems" (Wolpert and Macready 1997). So, a method which takes into account problem-specific knowledge has the potential to perform better than general methods. And though most optimization algorithms do not take problem-specific knowledge into account explicitly, they are usually implicitly better or worse suited for certain types of problems. Taking this into account, together with the good performance of MA-LS-Chains, especially for high-dimensional problems, we find it justified to present another package for optimization to the R community. The Rmalschains package also performed well in a recent study by Burns (2012), and in Section 5, we perform a comparison of our package to other methods, with a focus on high-dimensional problems.
The algorithm is implemented in C++ and encapsulated in a library called librealea (Molina 2012), so that it can also be used outside of R. Rmalschains uses Rcpp (Eddelbuettel and François 2011) to make the functionality of librealea accessible from within R. The package Rmalschains is available from the Comprehensive R Archive Network (CRAN) at http://CRAN.R-project.org/package=Rmalschains, and has a dedicated website at http://sci2s.ugr.es/dicits/software/Rmalschains. Also, the interested reader can find further information on the state of the art of EAs on the thematic web site of our research group on "Evolutionary Algorithms and other Metaheuristics for Continuous Optimization Problems" (http://sci2s.ugr.es/EAMHCO/). The remainder of this paper is structured as follows. Section 2 presents the theory of the MA-LS-Chains algorithm, and Section 3 shows a brief example of the usage of the package. Section 4 gives an overview of the other packages available in R for continuous optimization, and Section 5 shows experiments comparing the methods already present in R with Rmalschains. Section 6 concludes the paper.

The theory of the algorithm
In the following, we briefly describe the general scheme of the MA-LS-Chains algorithm and its main components, i.e., the EA and the LS methods employed. For more details, the reader may refer to Molina et al. (2010).

General scheme
The algorithm was designed with the idea that the LS should be applied with higher intensity on the most promising regions, i.e., the regions where solutions with good fitness are located.
MA-LS-Chains is a steady-state MA that can be combined with different methods for the LS. It uses a steady-state genetic algorithm (SSGA) as EA (Whitley 1989; Smith 2002). In contrast to a generational algorithm, where the genetic operators are applied to large parts of the population simultaneously, in a steady-state EA single individuals are used at a time to generate offspring, which replace other single individuals of the population.
Algorithm 1: MA-LS-Chains
 1: Generate the initial population.
 2: while not termination-condition do
 3:   Perform the SSGA with n_frec evaluations.
 4:   Build the set S_LS of individuals which can be refined by LS.
 5:   Pick the best individual c_LS in S_LS.
 6:   if c_LS belongs to an existing LS chain then
 7:     Initialize the LS operator with the LS state stored with c_LS.
 8:   else
 9:     Initialize the LS operator with the default LS parameters.
10:   end if
11:   Apply the LS algorithm to c_LS with intensity I_str, giving c^r_LS.
12:   Replace c_LS by c^r_LS.
13:   Store the final LS state with c^r_LS.
14: end while

MA-LS-Chains allows for improving the same solution several times, thus creating an LS chain. Also, it uses a mechanism to store the final state of the LS parameters along with the solution, after each LS application. In this way, the final state of an LS application on a solution can be used for the initialization of a subsequent LS application on the same solution, continuing the LS.
The general algorithm is shown in Algorithm 1. After generating the initial population, the following is executed in a loop: The SSGA is run with a certain number of evaluations n_frec. Then, the set S_LS is built with the individuals of the population that have never been improved by the LS, or that have been improved by the LS but with an improvement (in fitness) superior to δ^min_LS, where δ^min_LS is a parameter of the algorithm (by default δ^min_LS = 10^−8). If |S_LS| > 0, the LS is applied with an intensity of I_str to the best individual in S_LS. If S_LS is empty, the whole population is reinitialized, except for the best individual, which is maintained in the population.
With this mechanism, if the SSGA obtains a new best solution, it should be improved by the LS in the following application of the LS method.

The evolutionary algorithm
In MA-LS-Chains, the SSGA applied is specifically designed to promote high population diversity levels by means of the combination of the BLX-α crossover operator (Eshelman and Schaffer 1993) with a high value for its associated parameter (we use a default of α = 0.5) and the negative assortative mating (NAM) strategy (Fernandes and Rosa 2001). Diversity is favored as well by means of the BGA mutation operator. The replacement strategy used is replacement worst (RW) (Goldberg and Deb 1991). The combination NAM-RW produces a high selective pressure.
Crossover. The BLX-α operator (Eshelman and Schaffer 1993) generates each gene of the offspring uniformly at random within the interval spanned by the corresponding genes of the two parents, extended on both sides by α times the distance between them. Larger values of α thus allow offspring to lie further outside the region defined by the parents, which favors exploration.

Negative assortative mating. Assortative mating means that the individuals which are crossed are not chosen fully at random, but depending on their similarity. According to whether crossing of similar or dissimilar individuals is favored, the strategy is called positive or negative assortative mating. We use a mating strategy proposed by Fernandes and Rosa (2001), which favors diversity in the population. The algorithm chooses 4 individuals, and computes the similarity (in the form of the Euclidean distance) between the first one and all others. Then, the first individual and the individual with maximal distance from it are chosen for mating.
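The NAM selection step described above can be sketched in a few lines of R (a toy illustration with a hypothetical helper `nam_select`, not the package's internal code):

```r
# Negative assortative mating: pick 4 random individuals, then mate the
# first one with the most dissimilar (most distant) of the other three.
nam_select <- function(pop) {  # pop: matrix with one individual per row
  idx   <- sample(nrow(pop), 4)
  first <- pop[idx[1], ]
  dists <- apply(pop[idx[-1], , drop = FALSE], 1,
                 function(x) sqrt(sum((x - first)^2)))  # Euclidean distances
  list(parent1 = first, parent2 = pop[idx[-1][which.max(dists)], ])
}
```

Because the most distant of the sampled individuals is selected as the second parent, dissimilar individuals are crossed preferentially, which keeps the population diverse.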
Mutation: The BGA operator. This is the operator of the breeder genetic algorithm (BGA; Mühlenbein and Schlierkamp-Voosen 1993). Its main purpose is to assure diversity in the population. Let c ∈ [a, b] be the value at the ith position of the individual subject to mutation, with a, b ∈ R being the corresponding lower and upper domain bounds. Let r be the mutation range, normally defined as 0.1 · (b − a). Then, a new value c' for the ith position of the chromosome, lying in the interval [c − r, c + r], is computed in the following way:

c' = c ± r · Σ_{k=0}^{15} α_k · 2^{−k},

where addition or subtraction is chosen with a probability of 0.5, and the α_k are chosen as either zero or one, with a probability for one of P(α_k = 1) = 1/16. So, the probability of generating a c' in the neighborhood of c is very high.
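For illustration, the mutation of a single gene according to this scheme can be sketched as follows (a toy sketch of the formula, not the package's internal code; `bga_mutate` is a hypothetical name):

```r
# BGA mutation of a single gene value c_val with domain bounds [a, b].
bga_mutate <- function(c_val, a, b) {
  r     <- 0.1 * (b - a)                        # mutation range
  alpha <- rbinom(16, size = 1, prob = 1 / 16)  # alpha_k = 1 with prob. 1/16
  delta <- r * sum(alpha * 2^-(0:15))
  if (runif(1) < 0.5) c_val + delta else c_val - delta  # +/- with prob. 0.5
}
```

With P(α_k = 1) = 1/16, on average only one α_k is set, so most mutations are small steps close to c.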
The replacement worst strategy. This is a standard replacement strategy, where the worst individuals are replaced by better ones. It generates high selective pressure, so that in combination with the negative assortative mating, many different solutions are generated throughout the search, but only the best ones are kept in the population.

The local search method
Within the MA-LS-Chains paradigm, different methods for the LS can be used, depending on the application. Usually, the CMA-ES strategy works best. But as the CMA-ES algorithm does not scale well with the number of parameters, for high-dimensional problems other LS strategies, such as the Solis-Wets or the Subgrouping Solis-Wets solver, are to be preferred (Molina, Lozano, Sánchez, and Herrera 2011).

CMA-ES. The CMA-ES algorithm (Hansen, Müller, and Koumoutsakos 2003) can be considered the state of the art in continuous optimization. Thanks to the adaptability of its parameters, its convergence is very fast and it obtains very good results. Implementations are available in R in the packages cmaes (Trautmann, Mersmann, and Arnu 2011), adagio (Borchers 2016), and parma.

Solis-Wets algorithm. The algorithm presented by Solis and Wets (1981) is a randomized hill climber with adaptive step size. Starting from the current position c_LS in the search space, two candidate solutions are generated in the following way: Using a multivariate normal distribution that has the same dimension as c_LS and a standard deviation of ρ, a sample is drawn and used as distance d to compute the candidate solutions c_LS + d and c_LS − d. If the better one of the candidate solutions is better than c_LS, c_LS is updated to this new solution and a success is recorded. If c_LS is better than both of the candidate solutions, c_LS is not updated and a failure is recorded. After several successes/failures in a row, ρ is increased/decreased. Furthermore, a bias term is added to direct the search momentum towards regions that are promising. This term is continuously updated using its previous value and d. For details, see Molina et al. (2010). Though this algorithm is rather unsophisticated, it usually yields good results, is fast and easy to compute, scalable, and does not need derivatives to be computed. This makes it suitable to be used in the MA-LS-Chains framework.
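A single iteration of this scheme, leaving out the step-size adaptation and the bias term, can be sketched as follows (`solis_wets_step` is a hypothetical name for illustration, not the library's code):

```r
# One Solis-Wets step: sample a step d ~ N(0, rho^2) per dimension and try
# x + d and x - d; move only if the better candidate improves on x.
solis_wets_step <- function(x, f, rho) {
  d    <- rnorm(length(x), mean = 0, sd = rho)
  cand <- list(x + d, x - d)
  fc   <- vapply(cand, f, numeric(1))
  if (min(fc) < f(x)) {
    list(x = cand[[which.min(fc)]], success = TRUE)   # success: move
  } else {
    list(x = x, success = FALSE)                      # failure: stay
  }
}
```

In the full algorithm, several successes in a row increase ρ, several failures decrease it, and the bias term shifts the sampling towards previously successful directions.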
Subgrouping Solis-Wets. In this adaptation of the Solis-Wets algorithm to high-dimensional data (Molina et al. 2011), a subgroup of the overall set of parameters is chosen randomly and then optimized for a certain number of evaluations (defined by the parameter maxEvalSubset). Then, a new subset is chosen. In the current implementation, the subsets contain 20% of the overall number of variables.
Nelder-Mead downhill simplex. This method, presented by Nelder and Mead (1965) (see also Nelder and Singer 2009, or, e.g., Press, Teukolsky, Vetterling, and Flannery 2007), is a popular standard algorithm for optimization without using derivatives. In R, it is the standard method of the optim function (Venables and Ripley 2002). Also, it is implemented in the packages neldermead (Bihorel and Baudin 2015), dfoptim (Varadhan and Borchers 2016), gsl (Hankin 2006), adagio (Borchers 2016), and nloptr (Ypma, Borchers, and Eddelbuettel 2014). In this algorithm, a simplex is initialized in an n-dimensional parameter space using n + 1 candidate solutions. Then, the simplex is evolved according to the fitness of its vertices.

Figure 2: The 2-dimensional Rastrigin function, in the [−5.12, 5.12] interval of the two input parameters.
In MA-LS-Chains, the simplex is initialized with c_LS and n further candidate solutions which are generated as c_LS + λ_i e_i, i = 1, ..., n, with the e_i being unit vectors and λ_i a scaling factor for the respective parameter dimension, which is usually set to one. Then, within the chaining mechanism, the simplex is stored and reloaded with each individual.
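This simplex initialization can be sketched as follows (`init_simplex` is a hypothetical helper name for illustration):

```r
# Build the initial simplex: c_ls plus n vertices c_ls + lambda_i * e_i.
init_simplex <- function(c_ls, lambda = rep(1, length(c_ls))) {
  n <- length(c_ls)
  vertices <- t(vapply(seq_len(n), function(i) {
    v <- c_ls
    v[i] <- v[i] + lambda[i]   # shift along the i-th unit vector
    v
  }, numeric(n)))
  rbind(c_ls, vertices)        # (n + 1) x n matrix of simplex vertices
}
```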

A simple example
We use minimization of the n-dimensional Rastrigin function as an example, which is a common benchmark in global optimization (Mühlenbein, Schomisch, and Born 1991). The function is defined as follows:

f(x) = 10n + Σ_{i=1}^{n} ( x_i^2 − 10 cos(2π x_i) )

It has a global minimum at f(0) = 0. The cosine causes a lot of local optima, so that this function is considered a difficult optimization problem. Figure 2 shows the 2-dimensional case for the input parameters in the [−5.12, 5.12] interval. In R, the function can be implemented as follows:
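A direct, vectorized implementation of the definition above is:

```r
# n-dimensional Rastrigin function; global minimum f(0, ..., 0) = 0.
rastrigin <- function(x) {
  10 * length(x) + sum(x^2 - 10 * cos(2 * pi * x))
}
```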
The malschains function can then be used to minimize the function (for maximization, the objective function would have to be inverted). For the 30-dimensional Rastrigin function, the optimization is performed, e.g., with the following command:

R> res <- malschains(rastrigin, lower = rep(-5.12, 30),
+      upper = rep(5.12, 30), maxEvals = 200000, verbosity = 0,
+      control = malschains.control(popsize = 50, istep = 300,
+      ls = "cmaes", optimum = 0))

Here, the first parameter is the objective function, and the parameters lower and upper define the lower and upper bounds of the search space; their length also defines the number of parameters that the optimizer assumes the objective function to have. The parameter maxEvals defines the maximal number of function evaluations that are to be used. The parameter verbosity controls the verbosity level of the output, where 0 indicates no output, 1 indicates a summary, and values of 2 and above currently indicate full output. Finally, the control parameter can be used to define parameters of the optimization method itself. In the example, we use the parameter popsize to set the population size of the EA, the parameter istep to set the number of function evaluations within each run of the LS, and the parameter ls to set the type of LS to use. Furthermore, the parameter optimum gives the global optimum of the function that the algorithm will try to achieve.
The solution is a list (an object of class 'malschains') containing the best individual sol, and its fitness. Furthermore, it contains some information about the convergence process, such as the total number of evaluations spent for the EA and the LS, respectively, the ratio of the spent evaluations (also called effort), the ratio of total improvement of the fitness, the percentage of times that application of the EA/LS yielded an improvement, and some timing results in milliseconds. It can be seen that the solution is close to the global optimum of zero, both in all values of the solution and in the fitness.

Other packages in R for continuous optimization
Throughout the last years, a rich variety of packages for optimization in R has been developed. A constantly updated overview is provided in the "CRAN Task View: Optimization and Mathematical Programming" (Theussl and Borchers 2016). In the following, we present methods from the section "General Purpose Continuous Solvers" of that task view, in which our package is also present. Some packages are omitted in the following overview because they are not applicable in our context or because they are very similar to other packages. Some of the available non-population-based methods are:

• The optim function (Venables and Ripley 2002) from the stats package is the standard optimization function in R. It implements a number of LS methods, like the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method, a box-constrained version of it called L-BFGS-B, the Nelder-Mead method, a conjugate-gradient algorithm, and a simulated annealing method. The methods, though still pretty popular, cannot be seen as state of the art by today's standards, and attempts exist to replace them with newer ones (Nash 2014). Also, the first three of the methods are LS methods suitable for convex optimization only, and as one of the reviewers brought to our attention, the simulated annealing implementation has obvious shortcomings, and it also was not competitive in a recent comparison (Mullen 2014).
• The package dfoptim (Varadhan and Borchers 2016) implements derivative-free optimization algorithms. Concretely, the Nelder-Mead and the Hooke-Jeeves algorithms are implemented. The Nelder-Mead algorithm can be used within our package as the LS.
• The package adagio (Borchers 2016) implements several derivative-free global optimization algorithms, such as a scalable Nelder-Mead algorithm, CMA-ES, and Wolfe Line Search.
• Rsolnp (Ghalanos and Theussl 2016) and alabama (Varadhan 2015) have algorithms for objective functions with constraints.In particular, they use nonlinear augmented Lagrange multiplier method solvers, based on sequential quadratic programming (SQP), for optimization.
• The packages cmaes (Trautmann et al. 2011) and parma provide implementations of the CMA-ES algorithm. Further population-based methods used in our study are the differential evolution implementations in the packages DEoptim and RcppDE, and the particle swarm optimization (PSO) implementations in the packages pso and NMOF.

• The package nloptr (Ypma et al. 2014) is an R interface to the popular optimization library NLopt (Johnson 2012). NLopt implements a host of optimization methods, most of them also available from other R packages. We focus on the controlled random search (CRS) algorithm (in particular, the CRS2 variant) with the local mutation modification. This algorithm starts with a random population of points, and randomly evolves these points by heuristic rules.

Experimental study: Comparison with other algorithms
In this section, we compare Rmalschains with the packages discussed in Section 4. The comparison is performed using a benchmark which contains 19 scalable objective functions with different characteristics. Our analysis considers accuracy and execution time of the methods with a fixed number of function evaluations. Accuracy measures directly the quality of the solution, and may be considered the primary criterion to assess the performance of a method. But execution time may be critical as well, in the sense that the application of many methods is not feasible if problem dimension and complexity grow. Experiments are performed in different use cases, with medium- and high-dimensional data, to analyze the behavior of the methods in detail, especially regarding the high-dimensional use case. Results are presented as boxplots, tables, and diagrams, showing accuracy, execution time, and ranking of average accuracy.

Test suite and experimental conditions
We use the well-known benchmark of Lozano, Molina, and Herrera (2011), which is especially suited to test the scalability of the algorithms. This test set is composed of 19 scalable function optimization problems (also see http://sci2s.ugr.es/eamhco/testfunctions-SOCO.pdf):

• 6 Functions (F 1 -F 6 ) of the CEC'2008 test suite. A detailed description may be found in Tang (2008).

• 5 Shifted Functions (F 7 -F 11 ), as defined by the benchmark authors.

• 8 Hybrid Composition Functions (F 12 -F 19 ): These functions are built by combining two functions belonging to the set of functions F 1 -F 11 .

Table 2: Characteristics of the benchmark functions F 1 -F 11 .

All the functions are minimization problems. Table 1 shows the range and global minima of F 1 -F 11 , and Table 2 shows some further characteristics. F 12 -F 19 all have a global minimum at 0, and are non-separable.
In the comparison, we follow the guidelines proposed by the authors of the benchmark. We apply the test suite for dimensions 2, 10, 30, 50, 100, 200, 500, and 1000. We consider the cases with dimensions 2, 10, 30, and 50 as low-dimensional problems, the cases with dimensions 100 and 200 as medium-dimensional problems, and the cases with dimensions 500 and 1000 as high-dimensional problems. Each algorithm is run 25 times for each test function, and the average error, with respect to the known global optimum, is obtained. Each run stops when a maximum number of calls to the fitness function is reached, named MaxFES, which depends on the dimension D in the following way: MaxFES = 5000 · D.
Using MaxFES instead of a maximum number of iterations allows us to make a fair comparison between optimization methods that have very different structures. Unfortunately, for several of the considered packages (rgenoud, cmaes, DEoptim, RcppDE), only the maximum number of iterations can be defined, but not a MaxFES. In these cases, we count the number of fitness evaluations, and we return the best solution after the first MaxFES evaluations. To stop the algorithm, we use the callCC function from the base package.
The experiments are performed on a Sun Grid Engine (SGE) cluster. The experimental executions are parallelized in such a way that each algorithm runs sequentially on the benchmark functions, and the executions of the different algorithms are parallelized. The nodes of the cluster each have an Intel Core i7 CPU with a frequency of 2.80 GHz, and 24 GB of RAM. We establish the following limits: One R session (one algorithm executed with a particular dimension on all benchmarks 25 times) is limited to 6 GB of RAM, and a maximal global execution time of 10 days. This amounts to approximately 30 minutes for every execution of one algorithm, and seems reasonable taking into account that the computation of the fitness within the benchmark functions is considerably less expensive than in usual real-world problems.

Parameters and used methods
The aim of this work is to make a comparison between different packages on CRAN that can be used for optimization. In order to simulate the most common use of the packages, we compare the results obtained using the packages in the simplest and most straightforward way, which is with their default parameters.
The only parameter that we define for the benchmark is the population size, as this parameter usually should be adapted depending on the dimensionality, and keeping it constant for all dimensions could result in a degeneration of the methods' errors. According to the authors of the test suite (Lozano et al. 2011), this parameter is set to min(10 · D, 100) for algorithms involving a population, like DE and PSO.
From package adagio, we use the Nelder-Mead algorithm, denoted as adagio_NM, and from package dfoptim we use the box-constrained Hooke-Jeeves algorithm, which will be denoted as dfoptim_HJKB.Furthermore, we use the CMA-ES implementation available in package parma, denoted as parma_CMAES, and the CRS2 method available from package nloptr, denoted as nloptr_CRS2.The PSO algorithm available from package NMOF will be denoted as NMOF_PSO in the following.
Our function malschains is used with the LS methods CMA-ES (the default method) and SW (the Solis-Wets solver), the latter being a more adequate method for higher-dimensional problems due to its lower computation time. These methods will be called malschains-CMA and malschains-SW in the following.
All other applied methods have the same name as their respective packages/functions, namely DEoptim, RcppDE, and PSO.
From the packages in Section 4, the following are not used in the final comparisons:

• As the methods in optim are either LS methods or are not competitive, we follow the suggestions of one of the reviewers and do not consider these methods in our study.

• The CMA-ES implementations available in the packages cmaes and adagio are not used, as the implementation in the parma package is more sophisticated, because it is a direct translation of the code recommended by the original authors as "productive code."

• The packages Rsolnp and alabama are not included, because our focus is not to solve problems with (nonlinear) constraints. We used them in preliminary experiments, but not surprisingly, their performance was not good on our benchmark problems.
• Finally, the packages rgenoud and GenSA are not included, because their memory requirements prevented obtaining results for dimensions greater than 30.
Some packages require an initial solution. For these algorithms, we generate a solution randomly within the parameter domains (using runif), and pass this initial solution to the respective functions.
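For example, such an initial solution can be generated as follows (a sketch, assuming a 30-dimensional problem with the same [−5.12, 5.12] bounds as in the Rastrigin example):

```r
set.seed(1)                      # for reproducibility
D     <- 30
lower <- rep(-5.12, D)
upper <- rep(5.12, D)
x0 <- runif(D, min = lower, max = upper)  # uniform sample within the bounds
```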

Results in average error
In this section, we study the average error for medium dimensions (D ≤ 100). Results with higher dimensions (up to 1000) are analyzed in Section 5.5. For each function, we rank the algorithms according to the average error (the tabulated results can be found in Appendix A). The algorithms with the best results have the lowest ranks, i.e., the lower the ranking, the better the algorithm.
Figure 3 shows the average ranking for all the algorithms considered for dimension ≤ 100.

Analysis of computation time
Though computation time depends greatly on implementation details (e.g., whether the whole algorithm is implemented in pure R, or whether C or C++ code is used), such an analysis can be very valuable from a user perspective, when a package has to be chosen for a concrete application.
For each package and function, we run the optimization function once and measure its computation time (using the microbenchmark package; Mersmann 2015), removing every I/O operation (using the function capture.output from the utils package). Table 3 shows the average computation time in milliseconds. This is graphically illustrated in Figure 4 (note that in the analysis of computation time, DEoptim and RcppDE are shown separately).
Figure 4 shows that dfoptim_HJKB is the fastest method, and parma_CMAES and adagio_NM are the slowest methods for higher dimensions. PSO has an average computation time, and it is the algorithm whose running time increases the least with the dimension.
The malschains-CMA algorithm shows the same growth rate as parma_CMAES, but it maintains a moderate computation time for dimensions ≤ 100. The malschains-SW method performs better in this regard.

Scalability
Many real problems require the optimization of large numbers of variables. Thus, an important aspect of an algorithm is its scalability. Unfortunately, many algorithms have serious limitations in the number of dimensions they can handle. The main reason is usually the increasing computation time (as seen in Section 5.4), but there are others, such as memory requirements, program errors, or error accumulating with the dimension.
In this section, we study the scalability of the different packages. First, we identify the packages that have scalability problems, considering Table 3:

• The method parma_CMAES with its default parameters is not very scalable, since the computational complexity of the CMA-ES algorithm is O(n^3). From dimension 10 on, it is the slowest algorithm, and it reaches the time limit at dimension 200.

• The package Rmalschains with the CMA-ES method requires a lower computation time than parma_CMAES, but its computation time grows at a similar rate.
In terms of accuracy, Figure 5 shows the ranking for the algorithms that could be executed up to dimension 1000 (the tabulated results can be found in Appendix A). We can observe that malschains-SW obtains the best results for high-dimensional problems.
Results for the execution time are shown in Figure 6. We can observe that the execution times of the majority of the algorithms, except PSO and dfoptim_HJKB, are very similar. dfoptim_HJKB is the algorithm with the lowest computation time, but taking into account its fast increase in computation time at dimension 1000, it may perform similarly to the other methods for dimensions higher than 1000. PSO is the algorithm whose computation time increases the least with the dimension.

Study of accuracy per function and dimension
So far, we have performed graphical analyses of the average results in ranking. In this section, we complement the analysis by showing the results in fitness for several functions, using boxplots. As this analysis is quite verbose, we show only selected results here, namely for the two functions F 4 , Rastrigin's function, and F 18 , as an example of the combined functions. The full results can be found at http://sci2s.ugr.es/dicits/software/Rmalschains.
In Figures 7 and 8 we can see the results for function F4 in medium and high dimensions.
We can observe that there are big differences between the algorithms. malschains-SW and malschains-CMA obtain relatively poor results in lower dimensions, but as the dimensionality increases, their results become more competitive: in dimension 100, only parma_CMAES and dfoptim_HJKB perform better on average, and for higher dimensions, only dfoptim_HJKB obtains better results.
In Figures 9 and 10 we can see the results for function F18 in medium and high dimensions. For this function, malschains-SW obtains better results than dfoptim_HJKB, and both of them maintain the best results overall. We can observe that the algorithms that maintain the best results also have a low variance in the reached fitness, so that they robustly find better solutions than the other algorithms.
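The boxplots summarize repeated runs of each algorithm on a function. A minimal sketch of such an experiment with Rmalschains on Rastrigin's function (the basis of F4) could look as follows; the dimension, bounds, evaluation budget, and number of repetitions are illustrative choices.

```r
library(Rmalschains)

# Rastrigin's function, the basis of benchmark function F4
rastrigin <- function(x) 10 * length(x) + sum(x^2 - 10 * cos(2 * pi * x))

d <- 50
fitnesses <- replicate(5, {
  res <- malschains(rastrigin,
                    lower = rep(-5.12, d), upper = rep(5.12, d),
                    maxEvals = 50000, verbosity = 0,
                    control = malschains.control(ls = "sw"))
  res$fitness
})

# The distribution of reached fitness values over the repetitions
# is what the boxplots visualize
boxplot(fitnesses, main = "malschains-SW on Rastrigin, dimension 50")
```

Running the same loop for each package under comparison, with a common evaluation budget, yields one box per algorithm as in Figures 7-10.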

Conclusions
We have presented the R package Rmalschains. It implements MA-LS-Chains, an algorithmic framework of memetic algorithms with local search chains. The framework combines a steady-state genetic algorithm with a local search (LS) method, and different LS methods are implemented. The algorithm chooses the individual on which to run the LS according to its fitness and its potential to be improved by the LS. The LS is run for a fixed number of iterations and can be continued on the same individual at a later stage of the algorithm. The algorithm performs well, especially on high-dimensional problems, as demonstrated in various optimization competitions and in our experiments.
By presenting an implementation in R, we make the algorithm available to the R community and facilitate its use in general. We performed a rigorous experimental study comparing it to other general-purpose optimizers already available in R, both with respect to the quality of the results and with respect to the execution time. The study showed that, while in lower dimensions there are competitive methods in R, often outperforming Rmalschains, for higher-dimensional problems the algorithm is very effective.

Figure 1: Example of convergence of the CMA-ES algorithm. (1) Solutions are generated from the distribution function. (2) The solutions are evaluated and a subset of the best solutions is created (blue dots). (3) The distribution function is adapted accordingly.

Figure 4: Average computation time for each package (in ms, log scale).

Figure 6: Average computation time for large dimensions (in ms, log scale).

Figure 7: Results obtained by the packages for function F4 for dimensions 50 and 100.

Figure 8: Results obtained by the packages for function F4 for dimensions 500 and 1000.

Figure 9: Results obtained by the packages for function F18 for dimensions 50 and 100.

Figure 10: Results obtained by the packages for function F18 for dimensions 500 and 1000.

Table 1: Range and global minima of the benchmark functions F1-F11.

Table 3: Time (in ms) for each optimization package. T: time limit was reached.
An interesting conclusion we can draw from the figure is that DEoptim (and, with the same results, RcppDE) is initially the algorithm with the best results, but with increasing dimensionality its performance degrades. With respect to our package, we can observe that the Rmalschains methods perform best for dimensions 50 and 100. PSO and parma_CMAES also perform well. The results obtained by package PSO are better than those obtained by NMOF_PSO.

Table 5: Results for each package for dimension 10.

Table 7: Results for each package for dimension 50.

Table 9: Results for each package for dimension 200.

Table 10: Results for dimension 500. "Inf" means that no candidate solution was found.

Table 11: Results for dimension 1000. "Inf" means that no candidate solution was found.