Cultural algorithm with local search evaluated through non-parametric statistical tests

This work compares the performance of the classical Cultural Algorithm (CA) with a new hybrid CA proposal that incorporates one of two local search techniques: Simulated Annealing (SA) or Tabu Search (TS). To diversify the tests, the energy parameter was varied in the CA with SA, and the size of the tabu list was varied in the CA with TS. The algorithms were submitted to two scenarios (scenario 1: basic functions; scenario 2: hybrid functions). The proposed algorithm differs from others found in the literature in the process used to feed the topographic knowledge that guides the search. The analysis was performed using the Friedman, Friedman Aligned and Quade tests, which compare the behavior of a set of algorithms simultaneously.


I INTRODUCTION
For some time now, science has sought to model the natural evolution of living beings in computational systems [1,2]. From the engineering point of view, these models serve as the basis for the development of meta-heuristics to solve problems, mainly in systems optimization [3][4][5]. Research progress has shown that meta-heuristics with different operating mechanisms may be more suitable for problems with certain structures, while other meta-heuristics may work better on other classes of problems [6]. This led to the development of new meta-heuristics based on natural processes other than the evolution of species. As knowledge of the mechanisms that support evolutionary computation has increased, the new Evolutionary Algorithms (EAs) have been moving away from strict biological inspiration and tend to incorporate operations and mechanisms that are not bio-inspired, but rather motivated by mathematical or computational arguments [6]. There are also algorithms inspired by the adaptation and cultural evolution of individuals in a community, called Cultural Algorithms (CAs), which generate or alter their knowledge through the relationships between individuals in the community. CAs were proposed by [7]. Owing to their implicit parallelism and random search, CAs are used to solve traditional complex optimization problems [8]. Techniques that combine meta-heuristics for global search (Genetic Algorithms (GAs), CAs, etc.) with local search (hill climbing, TS, SA, etc.) are commonly referred to as memetic algorithms (MAs) [9], or hybrid algorithms.
Normally, MAs not only have a good exploratory capacity, similar to that of a population-based global search algorithm, but also provide good intensification during the search, similar to that of a local search algorithm. Hybridizing CAs with a local search mechanism that intensively explores the solutions generated by the CAs can greatly improve performance over the algorithms in their classical form. Comparisons of EAs generally rely on the analysis of their results after several executions on a set of benchmark functions. These evaluations are performed using statistical hypothesis tests [10,11].
These can be described as results-based analyses [12], that is, evaluations of the performance of an algorithm on certain benchmark functions. The difficulty lies in comparing several algorithms, since the comparison is usually performed on pairs of algorithms [13]; the number of pairwise comparisons grows with the number of algorithms to be evaluated, and so does the probability of making an error [14]. Interest in nonparametric statistical analysis has recently grown in the field of computational intelligence [10], since it offers a way of comparing evolutionary algorithms, tested on several different problems, with statistical significance. In this proposal, hybridizations of the CA with two forms of local search (TS and SA) are compared with each other and with the classical CA. These three algorithms (pure CA, CA with TS and CA with SA) are used to find the minimum of eight real-variable benchmark functions. The results are evaluated with nonparametric tests: Friedman, Friedman Aligned and Quade.
This article is organized as follows. Section 2 presents the basic concepts on which the work is based. Section 3 presents the materials and methods, including the test scenarios used. Section 4 presents the results of the simulations. Finally, section 5 concludes the paper with observations and comments on the simulations performed.

II META-HEURISTICS FOR OPTIMIZATION
In this section, the global search meta-heuristic 'Cultural Algorithm' and the two meta-heuristics used for local search, 'Simulated Annealing' and 'Tabu Search', are presented. The combination of a global search algorithm with a local search algorithm forms the basis of Memetic Computing [15].

II.1 CULTURAL ALGORITHMS
CAs model the evolution of the cultural component of a computational evolutionary system over time, as it accumulates experience in problem solving [8]. Cultural evolution allows societies to evolve or adapt to their environment at rates that exceed those of biological evolution, which is based only on genetic inheritance [7].
CAs basically consist of a population space, a belief space, communication protocols between the two spaces (the Acceptance and Influence functions) and some auxiliary functions: Initialization, Selection, Update and Evaluation. The structure of the CAs is shown in Figure 1, and their pseudocode is shown in Figure 2. The two spaces are described below. Population space: the set of solutions, which can be modeled using any technique that makes use of a population of individuals. Belief space (group map): the place where the knowledge (experience, or individual maps) acquired throughout the evolutionary process is stored and represented. According to [16], there are five sources of knowledge, which are useful in decision making [16,17]. For example, situational knowledge holds successful and unsuccessful solutions; normative knowledge contains ranges of acceptable behaviors; and topographical knowledge holds spatial patterns of behavior.
The population space and the belief space are linked by a communication protocol. One part of this protocol is the acceptance function, which collects the experience of selected individuals from the population. The other is the influence function, which uses the problem-solving knowledge stored in the belief space to guide the evolution of individuals in the population space. CAs can thus exploit both microevolution and macroevolution: microevolution is the evolution that happens at the population level, and macroevolution is the evolution of the culture itself, that is, of the belief space [18].
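The interaction between the two spaces can be illustrated with a minimal sketch. It is not the implementation evaluated in this work: the sphere objective, the parameter values, the use of only situational and normative knowledge, and the Gaussian mutation scaled by the normative ranges are all assumptions made for illustration.

```python
import random

def sphere(x):
    """Illustrative objective (assumed): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)

def cultural_algorithm(objective, bounds, pop_size=30, generations=100,
                       accept_ratio=0.2, seed=0):
    rng = random.Random(seed)
    dims = len(bounds)
    # Initialization: population space drawn uniformly inside the bounds.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    n_accept = max(2, int(accept_ratio * pop_size))
    belief = {"best": None, "ranges": list(bounds)}

    def update_belief():
        # Acceptance function: the fittest individuals share their experience.
        accepted = sorted(population, key=objective)[:n_accept]
        # Update (macroevolution): situational and normative knowledge.
        if belief["best"] is None or objective(accepted[0]) < objective(belief["best"]):
            belief["best"] = list(accepted[0])
        belief["ranges"] = [(min(ind[d] for ind in accepted),
                             max(ind[d] for ind in accepted))
                            for d in range(dims)]

    update_belief()
    for _ in range(generations):
        # Influence function: mutation steps are scaled by the normative ranges.
        offspring = []
        for parent in population:
            child = []
            for d, (lo, hi) in enumerate(bounds):
                n_lo, n_hi = belief["ranges"][d]
                value = parent[d] + rng.gauss(0.0, 1.0) * (n_hi - n_lo)
                child.append(min(hi, max(lo, value)))  # keep inside the bounds
            offspring.append(child)
        # Selection (microevolution): keep the fittest of parents and offspring.
        population = sorted(population + offspring, key=objective)[:pop_size]
        update_belief()
    return belief["best"], objective(belief["best"])

best, value = cultural_algorithm(sphere, [(-5.0, 5.0)] * 2)
```

Because selection keeps the fittest of parents and offspring, the situational knowledge never gets worse from one generation to the next; whether it reaches the global minimum depends on the assumed parameters.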

II.2 SIMULATED ANNEALING
Simulated Annealing (SA) is a meta-heuristic inspired by the physical process of annealing a solid to obtain low-energy states, studied in condensed matter physics [6]. SA establishes a connection between this type of thermodynamic behavior and the search for global minima of a discrete optimization problem.
In the same way that a solid is cooled slowly to ensure a crystalline structure, the algorithm "cools" the solution slowly so that it reaches the best value of the objective function, while still allowing configurations that do not match the best value found so far (a situation that corresponds to a slight reheating) [18].
For [18], an important feature of SA is the acceptance of configurations with higher energy, which may seem worse: the algorithm can accept a configuration that yields a "worse" value of the objective function, thus avoiding premature convergence to a local minimum. This acceptance is determined by a random number and is controlled by expression (1).
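A common way of writing such an acceptance rule is the Metropolis criterion, in which a worse candidate is accepted with probability exp(-delta / T). The sketch below assumes this form; it is not confirmed to match the expression used in this work. The geometric cooling schedule, the step size and the names `accept`, `t0` and `cooling` are illustrative assumptions, and `t0` is not the Local Search Energy parameter studied here.

```python
import math
import random

def accept(delta, temperature, rng=random):
    """Metropolis rule (assumed): always accept improvements; accept a worse
    candidate with probability exp(-delta / temperature)."""
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)

def simulated_annealing(objective, start, step=0.5, t0=10.0, cooling=0.95,
                        iterations=500, seed=0):
    rng = random.Random(seed)
    current, f_current = list(start), objective(start)
    best, f_best = list(current), f_current
    temperature = t0
    for _ in range(iterations):
        # Neighbor: small uniform perturbation of every variable.
        candidate = [v + rng.uniform(-step, step) for v in current]
        f_candidate = objective(candidate)
        if accept(f_candidate - f_current, temperature, rng):
            current, f_current = candidate, f_candidate
            if f_current < f_best:
                best, f_best = list(current), f_current
        temperature *= cooling  # slow geometric cooling
    return best, f_best

best, f_best = simulated_annealing(lambda x: sum(v * v for v in x), [3.0, -2.0])
```

At high temperature almost any move is accepted, which favors exploration; as the temperature falls, the rule approaches a pure descent.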

II.3 TABU SEARCH
Tabu Search (TS) guides a local search heuristic by using characteristics of the current solution and the search history to explore the solution space. According to [19], in several cases the method provides solutions that are very close to the optimum and is among the most effective, if not the best, approaches to the difficult problems in question. As a local search technique, TS starts from an initial solution and moves through the solution space from one solution to another in its neighborhood [6].
The systematic use of adaptive memory is the property that distinguishes TS from other meta-heuristics. The word "adaptive" means that the memory updates its store of solution elements or complete solutions found during the exploration of the solution space [19].
Intensification is improved by the use of memory structures called tabu lists. At each iteration, the algorithm checks whether the current solution has been visited previously or whether some rule has been violated; if so, the solution is stored in the tabu list and marked "tabu". This procedure avoids so-called cycling, that is, revisiting a solution. With this memory strategy, TS can move beyond a local optimum and reach other regions of the solution space [6]. The strategy relies on the fact that, as the solution space is explored, the oldest solutions are probably "distant" from the region under analysis and therefore have no influence on the choice of the next solution in that region [6]. The size of the tabu list is considered a critical parameter: according to [6], the list cannot be so small that cycling occurs, nor so large that it unnecessarily stores solutions unrelated to the recent history of the search.
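The effect of the tabu list size can be seen in a small sketch. The one-dimensional landscape `values`, the neighborhood function and the function names are hypothetical and chosen only to expose cycling; the fixed-size list is modeled with a bounded deque, so the oldest entries are forgotten automatically.

```python
from collections import deque

def tabu_search(objective, start, neighbors, tabu_size=4, iterations=50):
    current = start
    best, f_best = start, objective(start)
    # Bounded memory: when full, appending drops the oldest tabu solution.
    tabu = deque([start], maxlen=tabu_size)
    for _ in range(iterations):
        candidates = [n for n in neighbors(current) if n not in tabu]
        if not candidates:
            break
        # Move to the best non-tabu neighbor, even if it is worse than the
        # current solution; this is what lets the search leave a local optimum.
        current = min(candidates, key=objective)
        tabu.append(current)
        f_current = objective(current)
        if f_current < f_best:
            best, f_best = current, f_current
    return best, f_best

# Hypothetical landscape: local minimum at index 1, global minimum at index 6.
values = [5, 3, 4, 6, 2, 1, 0, 3]
objective = lambda i: values[i]
neighbors = lambda i: [j for j in (i - 1, i + 1) if 0 <= j < len(values)]

short_memory = tabu_search(objective, 0, neighbors, tabu_size=1)
longer_memory = tabu_search(objective, 0, neighbors, tabu_size=2)
```

With a list of size 1 the search oscillates between indices 1 and 2 and never leaves the local minimum, whereas a list of size 2 is already enough, on this landscape, to climb over the barrier and reach index 6.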

II.4 BEHAVIOR TESTS OF ALGORITHMS: FRIEDMAN, FRIEDMAN ALIGNED AND QUADE
The need to characterize the behavior of algorithms when submitted to problems of different natures has opened a field of research on testing procedures [12,20]. The Friedman test is a multiple comparison test that aims to detect significant differences between the behavior of two or more algorithms [10]. According to [10], the Friedman test is carried out in the following steps: 1. Gather the results of each algorithm/problem pair; 2. For each problem i, rank the values from 1 (best result) to k (worst result), and denote these ranks as $r_i^j$ ($1 \le j \le k$); 3. For each algorithm j, average the ranks obtained over all n problems to obtain the final rank $R_j = \frac{1}{n}\sum_{i} r_i^j$. In this way the algorithms are ranked for each problem separately. As indicated in step 2, the algorithm with the best performance receives rank 1, the second best rank 2, and so on.
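The ranking steps above can be sketched in a few lines. The sketch assumes a minimization setting (lower result is better), assigns tied values their mean rank, and evaluates the statistic in its usual chi-square form, 12n / (k(k+1)) * (sum of R_j^2 - k(k+1)^2 / 4); the function names and the sample data are illustrative, not taken from the experiments.

```python
def average_ranks(values):
    """Rank from 1 (smallest value); tied values share their mean rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def friedman(results):
    """results[i][j]: result of algorithm j on problem i (lower is better)."""
    n, k = len(results), len(results[0])
    rank_rows = [average_ranks(row) for row in results]               # step 2
    mean_ranks = [sum(r[j] for r in rank_rows) / n for j in range(k)]  # step 3
    statistic = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4)
    return mean_ranks, statistic

# Illustrative data: 4 problems (rows), 3 algorithms (columns).
sample = [[1.0, 2.0, 3.0],
          [0.5, 0.7, 0.6],
          [10.0, 20.0, 30.0],
          [2.0, 1.0, 3.0]]
mean_ranks, statistic = friedman(sample)
```

For this sample the first algorithm obtains the lowest average rank (1.25) and is therefore the best ranked; the statistic is then compared against a chi-square distribution with k - 1 degrees of freedom.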
The Friedman statistic is calculated according to equation 2. For the Friedman Aligned test, a location value is first calculated for each problem as the average performance achieved by all algorithms on that problem. The difference between the performance of an algorithm and the location value is then obtained for each combination of algorithm and problem, and these differences (the aligned observations) are ranked jointly across all problems. Equation 3 defines the statistic of the Friedman Aligned test.
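The alignment procedure can be sketched as follows. The statistic is written in the form commonly given in the nonparametric testing literature and is assumed, not confirmed, to match the expression used in this work; the function names and sample data are illustrative.

```python
def average_ranks(values):
    """Rank from 1 (smallest value); tied values share their mean rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def friedman_aligned(results):
    """results[i][j]: result of algorithm j on problem i (lower is better)."""
    n, k = len(results), len(results[0])
    # Aligned observations: subtract each problem's location value (its mean).
    aligned = [v - sum(row) / k for row in results for v in row]
    ranks = average_ranks(aligned)  # ranked jointly from 1 to k * n
    rank_rows = [ranks[i * k:(i + 1) * k] for i in range(n)]
    col_totals = [sum(r[j] for r in rank_rows) for j in range(k)]  # per algorithm
    row_totals = [sum(r) for r in rank_rows]                       # per problem
    numerator = (k - 1) * (sum(t * t for t in col_totals)
                           - (k * n * n / 4) * (k * n + 1) ** 2)
    denominator = (k * n * (k * n + 1) * (2 * k * n + 1) / 6
                   - sum(t * t for t in row_totals) / k)
    mean_ranks = [t / n for t in col_totals]
    return mean_ranks, numerator / denominator

# Illustrative data: 2 problems (rows), 3 algorithms (columns).
sample = [[1.0, 2.0, 6.0],
          [10.0, 50.0, 30.0]]
mean_ranks, statistic = friedman_aligned(sample)
```

Because the ranks are assigned over all aligned observations at once, a large advantage on one problem weighs more than a small one, which the per-problem ranking of the plain Friedman test cannot express.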
The Quade test is the third test used in this work. Unlike the Friedman test, which treats all problems as equally important, the Quade test takes into account the fact that some problems are more difficult or that the differences among the results of the algorithms on them are larger. Therefore, the rankings calculated for each problem can be scaled according to the observed differences in the performance of the algorithms, yielding a weighted ranking analysis of the sample of results [10].
The Quade test statistic is calculated by equation 4, taking into account some definitions presented in [10], together with the terms A and B, given by equations 5 and 6, respectively.
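A minimal sketch of the weighted ranking follows. It assumes the usual formulation of the test: problems are ranked by the range of their results, A is the sum of squares of the range-weighted centred ranks, B is the sum of the squared per-algorithm totals divided by n, and the statistic is (n - 1)B / (A - B). Whether these are exactly the definitions adopted in this work is an assumption, and the function names and sample data are illustrative.

```python
def average_ranks(values):
    """Rank from 1 (smallest value); tied values share their mean rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def quade(results):
    """results[i][j]: result of algorithm j on problem i (lower is better)."""
    n, k = len(results), len(results[0])
    rank_rows = [average_ranks(row) for row in results]
    # Weight of each problem: rank of its sample range (largest spread -> n).
    q = average_ranks([max(row) - min(row) for row in results])
    centred = [[q[i] * (rank_rows[i][j] - (k + 1) / 2) for j in range(k)]
               for i in range(n)]
    s = [sum(centred[i][j] for i in range(n)) for j in range(k)]
    a = sum(v * v for row in centred for v in row)
    b = sum(v * v for v in s) / n
    # Weighted average ranking of each algorithm.
    weighted = [sum(q[i] * rank_rows[i][j] for i in range(n)) / (n * (n + 1) / 2)
                for j in range(k)]
    # The statistic is undefined when a == b (one algorithm always ranks the same).
    return weighted, (n - 1) * b / (a - b)

# Illustrative data: 3 problems (rows), 3 algorithms (columns).
sample = [[1.0, 2.0, 4.0],
          [10.0, 30.0, 20.0],
          [5.0, 6.0, 5.5]]
weighted_ranks, statistic = quade(sample)
```

In this sample the second problem has the largest spread of results, so its ranks count three times as much as those of the third problem when the weighted rankings are formed.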

II.5 ALGORITHMS USED
In the CAs used here, the population space is evolved by a GA. The local search techniques are applied within the mutation function (SA with the local search energy set to 5, 10 and 15; TS with tabu list sizes of 2, 4 and 6). This differs from [17], which uses the concept of a "ball" [18] to define the neighborhood: an area of radius 'r' that should contain the possible solutions. Here, information from the 3 best individuals found in the local search is used to define an area of good behavior that feeds the topographic knowledge. According to the bibliographical research carried out, this approach had not been used before, which shows the relevance of the research. Figure 3 shows the pseudocode of the algorithm used, in which the function Cultural_Algorithm_with_Local_Search() represents the use of SA or TS.
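One possible way of turning the 3 best local-search individuals into an area of good behavior is sketched below. Representing the area as an axis-aligned box spanned by those individuals is an assumption made for illustration, as are the function name `promising_region` and the sample data; the actual representation used in the topographic knowledge is the one given in Figure 3.

```python
def promising_region(visited, top=3):
    """visited: (position, fitness) pairs produced by a local search.
    Returns one (low, high) interval per variable, spanned by the `top`
    positions with the lowest fitness."""
    best = sorted(visited, key=lambda pair: pair[1])[:top]
    dims = len(best[0][0])
    return [(min(pos[d] for pos, _ in best), max(pos[d] for pos, _ in best))
            for d in range(dims)]

# Hypothetical local-search trace on a two-variable problem.
trace = [([0.0, 1.0], 1.0),
         ([2.0, 2.0], 8.0),
         ([0.5, -0.5], 0.5),
         ([-0.2, 0.1], 0.05),
         ([3.0, 3.0], 18.0)]
region = promising_region(trace)
```

The resulting intervals could then be stored in the belief space and used by the influence function to bias new individuals toward that region.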

III.2 SCENARIOS 1 AND 2
Both scenarios use 50 repetitions. Each repetition generates a new random population drawn from a uniform probability density function within the limits of each variable. Table 3 presents the parameters of scenarios 1 and 2, related to basic and hybrid functions respectively. Figure 4 shows, as a flowchart, the steps used to run the tests for this article.
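The repetition scheme can be sketched as follows. The helper names, the population size and the stand-in `algorithm` callable are hypothetical; only the 50 repetitions and the uniform sampling within the variable limits come from the description above.

```python
import random

def random_population(bounds, size, rng):
    """One individual per row, each variable drawn uniformly within its limits."""
    return [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(size)]

def run_repetitions(algorithm, bounds, pop_size, repetitions=50, seed=0):
    """Run `algorithm` once per repetition, each time on a fresh population."""
    rng = random.Random(seed)
    return [algorithm(random_population(bounds, pop_size, rng))
            for _ in range(repetitions)]

# Stand-in for an optimizer: report the best sphere value in the population.
best_of_population = lambda pop: min(sum(v * v for v in ind) for ind in pop)
outcomes = run_repetitions(best_of_population, [(-5.0, 5.0)] * 2, pop_size=20)
```

Collecting one outcome per repetition produces the per-problem samples that are later summarized and ranked by the statistical tests.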

IV RESULTS
The same parameter settings were kept for all the CAs used. Preliminary tests were carried out to define the value of the SA energy parameter and the tabu list size of the TS; after these tests, the three values of each parameter that produced the most differentiated results were selected to create new test scenarios. In the CA with SA, the Local Search Energy parameter took the values 5, 10 and 15. In the CA with TS, the tabu list size took the values 2, 4 and 6. Three new test conditions thus emerge for each hybrid algorithm, which, together with the classical CA, gives seven algorithm variants. Table 4 presents each of the alternatives.
The data in Table 5 were submitted to the tests discussed in this work. The results are shown in Tables 6, 7 and 8. The values in red are the best results in each test, associated with the algorithm that had the best performance for the set of functions.
Table 6 shows the results obtained when only the data set with dimension 10 was used. In this case SA15 was the best-ranked algorithm in the Friedman and Quade tests, while in the Friedman Aligned test the classical CA (AC) obtained the best rank. Table 7 shows the test results for the data set with dimension 30. In this scenario SA15 was the best in all tests. After evaluating the algorithms on each dimension individually, the data sets generated with dimensions 10 and 30 were joined to verify whether the rankings given by the three tests differ greatly from the previous results. Table 8 shows the results of submitting the joined data to the tests. The rankings obtained for the joined data followed those of the dimension 10 simulation.

V CONCLUSIONS
The use of the Friedman, Friedman Aligned and Quade tests made it possible to evaluate the variants of the proposed algorithm without resorting to pairwise comparisons. The data sets with D = 10 and D = 10 + 30 (the union of the results obtained with dimension 10 and with dimension 30) gave the same best algorithms (SA15 and the classical CA, AC) for the functions used. With the data for D = 30, however, the result was unique, pointing to SA15 as the best ranked in the simulations. For the CA with Tabu Search, the best result was found with a tabu list of size 6.
The positions of the 3 best values found in the local searches were added to the CA belief space, specifically to the topographic knowledge, thus creating a region of promising results for CA evaluation. In addition, the hybridization of the CA with local search techniques tends to obtain better results in the solution of multivariate functions. The tests used for this evaluation are easy to implement and robust when little is known about the problem.