Decomposition-Based Multi-Objective Evolutionary Algorithm Design Under Two Algorithm Frameworks

The development of efficient and effective evolutionary multi-objective optimization (EMO) algorithms has been an active research topic in the evolutionary computation community. Over the years, many EMO algorithms have been proposed. The existing EMO algorithms are mainly developed based on the final population framework. In the final population framework, the final population of an EMO algorithm is presented to the decision maker. Thus, it is required that the final population produced by an EMO algorithm is a good solution set. Recently, the use of solution selection framework was suggested for the design of EMO algorithms. This framework has an unbounded external archive to store all the examined solutions. A pre-specified number of solutions are selected from the archive as the final solutions presented to the decision maker. When the solution selection framework is used, EMO algorithms can be designed in a more flexible manner since the final population is not necessarily to be a good solution set. In this paper, we examine the design of MOEA/D under these two frameworks. We use an offline genetic algorithm-based hyper-heuristic method to find the optimal configuration of MOEA/D in each framework. The DTLZ and WFG test suites and their minus versions are used in our experiments. The experimental results suggest the possibility that a more flexible, robust and high-performance MOEA/D algorithm can be obtained when the solution selection framework is used.


I. INTRODUCTION
Multi-objective optimization problems are commonly found in many real-world applications [1]- [4]. The main goal of solving a multi-objective optimization problem is to optimize (either maximize or minimize) several objective functions simultaneously. However, the objective functions of a multi-objective optimization problem are conflicting in nature. Thus, it is not possible to obtain a single optimal solution that optimizes all the objective functions simultaneously. Usually a set of optimal solutions with different tradeoffs is obtained. The set of tradeoff optimal solutions is known as The associate editor coordinating the review of this manuscript and approving it for publication was Pavlos I. Lazaridis .
the Pareto optimal solution set. The Pareto optimal solutions form the Pareto front in the objective space.
Owing to the advantage of being able to search for a set of non-dominated solutions in a single run, evolutionary multi-objective optimization (EMO) algorithms have been a popular approach for solving multi-objective optimization problems. Over the years, various EMO algorithms such as SMS-EMOA, NSGA-III, HyPE, MOEA/D, and MOEA/D-HH have been proposed [5]. Since it is often impractical/difficult to include user preference a priori, most EMO algorithms in the literature are of the posteriori type [6]. That is, the main aim of these EMO algorithms is to find a set of well-distributed non-dominated solutions to approximate the Pareto front. Then, a decision VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ maker chooses a solution from the obtained solution set. Based on generation update mechanisms, EMO algorithms can be classified into two categories. One is non-elitist algorithms (e.g., MOGA [7] and NSGA [8]) and the other is elitist algorithms (e.g., SPEA [9] and NSGA-II [10]). In nonelitist algorithms, the current population is entirely replaced with the offspring population. No solution in the current population can survive to the next generation. As a result, it is very difficult to design an efficient EMO algorithm based on the non-elitist framework.
In elitist algorithms, good solutions (called elite individuals) in the current population survive to the next generation. The most frequently-used mechanism for generation update is the (µ + λ)-selection strategy [10]. In this mechanism, λ solutions are generated from the current population with µ solutions. Then, the best µ solutions are selected from the (µ + λ) solutions for the next generation. Some elitist EMO algorithms have an archive of a pre-specified size to store non-dominated solutions [11]- [13]. Different archiving methods such as the Pareto dominance-based archiving methods, decomposition-based archiving methods, and indicator-based archiving methods can be used in the archiving process [15]. The archive is updated at every generation. In these EMO algorithms, the current population or the archive at the final generation is presented to the decision maker.
Even though the elitist framework is much more efficient than the non-elitist framework, it still has a difficulty. As discussed in [14], good solutions can be deleted during the generation update phase. Since the population size (and the archive size) is pre-specified, some solutions must be discarded during the generation update phase. New solutions cannot be compared with those discarded solutions in previous generations. As a result, solutions in the final population (and the final archive) are not always non-dominated among all the examined solutions [15].
In both the non-elitist and elitist frameworks, solutions in the final generation are presented to the decision maker. Thus, EMO algorithms should be designed to have a set of welldistributed non-dominated solutions in the final generation. In this paper, both the non-elitist and elitist frameworks are referred to as the final population framework (since most of recent algorithms are based on the (µ + λ) selection mechanism).
Recently, it was proposed to use an unbounded external archive in the design of EMO algorithms [14]. The idea is to present a solution set selected from all the examined solutions to the decision maker. In this paper, this algorithm framework is referred to as the solution selection framework. Whereas this framework was used for performance evaluation of existing EMO algorithms in some studies [16], [17], it has not been used for the design of new EMO algorithms. Unlike the final population framework, the final population in the solution selection framework does not have to be a good solution set. The solution selection framework can use an EMO algorithm with a bounded internal archive together with an unbounded external archive (whereas we do not discuss such an EMO algorithm in this paper).
This paper aims to clearly demonstrate the flexibility of the solution selection framework in the design of EMO algorithms. We use Multi-objective Evolutionary Algorithm based on Decomposition (MOEA/D) [18] as a sample to empirically show the advantages of the solution selection framework. MOEA/D is a well-known and frequently-used EMO algorithm especially for many-objective optimization. A number of approaches have been proposed to further improve the performance of MOEA/D (e.g., with respect to the choice of a scalarizing function and the specification of weight vectors). In our former study [19], we demonstrated that the performance of MOEA/D is sensitive to the specification of the reference point and more robust performance is obtained by the solution selection framework.
Motivated by the promising experimental results in our former study about the reference point specification [19], we further investigate other components and parameters in MOEA/D in this paper. Our experiments are conducted using the two algorithm frameworks, as follows: The examined MOEA/D components include a scalarizing function, a crossover operator, and a mutation operator. The examined parameters include the neighborhood size, the normalization parameter, the initial reference point and the final reference point in the dynamic reference point mechanism, the crossover rate, and the mutation rate. Since the parameter space is large and complex, we use an offline genetic algorithm-based hyper-heuristic method to search for the optimal configuration of MOEA/D under each algorithm framework for each test problem. Then, the obtained optimal algorithm configurations are compared and analyzed. As test problems, we use the three-objective DTLZ1-4 [20], WFG1-9 [21], and their minus versions [22]. We search for the best MOEA/D design for each of these 26 test problems under each algorithm framework.
The main contributions of this paper are listed as follows: • A GA-based hyper-heuristic method is used to search for the optimal configurations of MOEA/D under two different algorithm design frameworks, namely the final population framework and the solution selection framework. This paper is organized as follows. First, we briefly explain multi-objective optimization and MOEA/D in Section II. Next, we explain our genetic algorithm-based hyper-heuristic method and the experiment settings in Section III. Then, we compare the experimental results under the two algorithm frameworks in Section IV. Finally, we conclude this paper in Section V.

II. BACKGROUND A. MULTI-OBJECTIVE OPTIMIZATION PROBLEMS
Let us assume that we have the following M -objective minimization problem: Definition 2 (Pareto Optimal Solution): A solution x * is a Pareto optimal solution if and only if it is not dominated by any other solution in X.
The Pareto set (PS) consists of all Pareto optimal solutions. The projection of the Pareto set to the objective space is the Pareto front (PF).

B. MOEA/D
The basic idea of MOEA/D [18] is to decompose a multiobjective optimization problem into N single-objective subproblems using a set of predefined weight vectors W = {w 1 , w 2 , . . . , w N } and a scalarizing function. Then, the N subproblems are evolved in a cooperative manner by exploiting the information from other subproblems during the search process.
A set of uniformly distributed weight vectors W is used in most MOEA/D implementations in the literature. Each weight vector w j = w j 1 , w j 2 , . . . , w j M must fulfil the following relation: where w j i ≥ 0 (i = 1, 2, . . . , M ) and j ∈ {1, 2, . . . , N }. In our study, the Das and Dennis method [23] is used to systematically generate the weight vectors. Since each subproblem j has a single individual x j (j ∈ {1, 2, . . . , N }), the population size equals to the number of subproblems (the number of weight vectors) in MOEA/D.

C. COMPONENTS AND PARAMETERS IN MOEA/D
MOEA/D includes a number of components and parameters which need to be specified. They are explained in this subsection.

1) SCALARIZING FUNCTIONS
It is known that a scalarizing function plays an essential role in MOEA/D. The scalarizing function is used to calculate the fitness value of each individual. In this paper, we consider five scalarizing functions: the weighted sum g WS , Tchebycheff g TCH , modified Tchebycheff (g MTCH ), penalty-based boundary intersection (PBI) g PBI , and inverted penalty-based boundary intersection (IPBI) (g IPBI ) functions. The five scalarizing functions are given as follows: Weighted Sum (WS) [18]:

Tchebycheff (TCH) [18]:
where z * = z * 1 , z * 2 , . . . , z * M is the reference point (which is served as the origin of the weight vectors). In our study, we consider a dynamic reference point mechanism proposed in [24]. In principle, the ideal point is used as the reference point. However, the ideal point is unknown for real-world problems. Thus, the common practice is to specify each element z * i of z * by the minimum value of each objective f i (x) over all solutions examined so far, as shown in (5) and (6).
where i is a non-negative number, and S consists of all examined solutions. The value of i is set to zero in the original MOEA/D. A dynamic reference point mechanism [24] has shown to be useful for improving the performance of MOEA/D. As discussed in [24], in the early stage of evolution, the estimated reference point is usually inaccurate since the population is not close to the Pareto front. The accuracy of the estimated reference point is gradually improved through the evolution. Based on these discussions, the following linearly decreasing formulation was proposed in [24]: where T is the maximum generation number, t is the current generation index (t = 1, 2, . . . , T ), and ini i and end i are the initial and final settings of i , respectively. Modified Tchebycheff (MTCH) [25]: where z * is the reference point. In this paper, we use the formulation in (5)-(7) as in the Tchebycheff function. If w i = 0, w i is set to 10 −6 to avoid division by zero.

Penalty-based Boundary Intersection (PBI) [18]:
Minimize where the penalty parameter θ is a user-definable nonnegative real number. The commonly-used value for θ is 5.
The two distances d 1 and d 2 are defined as where z * is the reference point specified by (5)-(7).

Inverted PBI (IPBI) function [26]:
Maximize where θ is the penalty parameter. The two distances d 1 and d 2 are defined as where

2) NORMALIZATION MECHANISM
In this paper, we consider a simple normalization mechanism [27] in MOEA/D to deal with problems with differently scaled objectives. The estimated ideal point z * (with i = 0 in (5)) and the estimated nadir point z N are used to normalize the objective value z i as follows: where ε (which is referred to as the normalization parameter in this paper) is a positive real number used to prevent the denominator from becoming zero in the case of z N i = z * i .

3) NEIGHBORHOOD STRUCTURES
Neighborhood structure is an important feature of MOEA/D. Each subproblem has its own neighborhood which is defined by the Euclidean distance between weight vectors. For each subproblem, two parents are randomly selected from its neighborhood to generate a new solution. Then, the newly generated solution is compared with all solutions in its neighborhood (including the current solution of the current subproblem). All inferior neighbors are replaced with the newly generated solution. In the original MOEA/D, the same neighborhood is used for both mating and replacement. In [28], the use of two different neighborhood structures for mating and replacement was investigated. For many-objective knapsack problems, it was shown that the search ability of MOEA/D can be improved by using a larger neighborhood structure for replacement than mating. In this paper, we also examine the use of two neighborhood structures.

4) GENETIC OPERATORS
In the literature, many EMO algorithms use the SBX crossover [29] and the polynomial mutation [30] for multi-objective continuous optimization problems. In addition to these commonly-used genetic operators, we also examine some other operators: the whole arithmetic crossover (WAX) [31], the local arithmetic crossover (LAX) [32], the Gaussian mutation [30], and the random mutation [6].

III. HYPER-HEURISTICS USING A GENETIC ALGORITHM
A. ALGORITHM DESIGN SPACE FOR MOEA/D As explained in the previous section, MOEA/D has a number of components and parameters to be specified. In this paper, we search for an optimal configuration of MOEA/D under each algorithm framework (i.e., final population and solution selection) for each test problem. More specifically, we try to find the best combination of the components and parameters in Table 1. The domain of each component/parameter is shown in the column labeled ''Domain'' in Table 1. For example, one of {WS, TCH, PBI, IPBI, MTCH} is selected as a scalarizing function. For the penalty parameter θ in PBI and IPBI, a real number is specified in the closed interval [0, 10]. In this paper, the MOEA/D algorithm design means the choice of a possible value in the domain for each component/parameter in Table 1.
In our experiments, the population size for MOEA/D is fixed to 91, and the distribution index for the SBX crossover and the polynomial mutation is fixed to their default value 20.

B. GENETIC ALGORITHM-BASED HYPER-HEURISTIC
In this paper, we use a genetic algorithm-based hyperheuristic method to find the optimal MOEA/D algorithm design for the final population framework and the solution selection framework.
We use the following parameter specifications in the hyperheuristic method in our computational experiments: Coding: 53-bit binary string, Population size µ: 100, Initial population: Random binary strings, Termination condition: 100 generations, Crossover: Uniform crossover with probability 1, Mutation: Bit-flip mutation with probability 1/53, Selection: Tournament selection with tournament size 3, Generation update: (µ + µ)-selection strategy, Fitness evaluation: Average hypervolume. In our hyper-heuristic method, each component/parameter is represented by a binary substring. For example, the scalarizing function g is represented by a 3-bit substring where the 1024 real numbers in the interval [0, 10] are examined as the penalty parameter θ. Then, an MOEA/D algorithm is represented by a binary string of length 53, where the coding for each component/parameter is listed in the third column of Table 1.
The hypervolume (HV) indicator is used to evaluate each individual (i.e., the binary string of an MOEA/D configuration). The larger HV value indicates the better performance. In order to handle the stochastic nature of the MOEA/D algorithm, each MOEA/D configuration is applied to a test problem for five times. The average HV value over the five runs is used as the fitness value of each individual in the hyper-heuristic method.
In the final population framework, the HV value of the final population of each run of the MOEA/D configuration is calculated. In the solution selection framework, 91 solutions (which is the same as the population size of MOEA/D) are selected from non-dominated solutions among all the examined solutions in each run of the MOEA/D configuration. We use the distance-based solution subset selection method [33].
In the distance-based selection method, first, one of the extreme non-dominated solutions is randomly selected among the M extreme non-dominated solutions as the first solution. The second solution is the non-dominated solution with the largest distance from the first solution. Then, the nondominated solution with the largest distance from the first and second solutions is selected as the third solution. The selection process is repeated until 91 solutions are selected. We use this method since it is computationally efficient (e.g., it is fast). Of course, we can use various HV-based solution subset methods [34]- [36]. Whereas they can find better solution subsets with higher HV values, they need much more computation time than the distance-based method. In our hyper-heuristic method, 10,000 MOEA/D configurations (100 individuals × 100 generations) are applied to the given test problem five times. The solution subset selection is performed 50,000 times in a single run of the hyper-heuristic method. Thus, we use the fast distance-based solution subset selection method in the hyper-heuristic method in this paper. The termination condition for each MOEA/D configuration is 10,000 solution evaluations.
HV calculation needs a reference point. Since it is assumed that we have no knowledge about the true Pareto front of the test problem during the algorithm design by the hyper-heuristic method, we combined all solution sets obtained by the five runs of all MOEA/D configurations in the current population of the hyper-heuristic method. Then, we select only the non-dominated solutions among them to form an approximated Pareto front. The ideal point and the nadir point are calculated using the approximated Pareto front. The objective space is normalized so that the calculated ideal and nadir points become (0, 0, . . . , 0) and (1, 1, . . . , 1). Then, the reference point r for HV calculation is specified as r = (r, r, . . . , r) with r = 1.1 (i.e., a slightly worse point than the calculated nadir point in the normalized objective space).

A. OBTAINED MOEA/D CONFIGURATIONS
Our computational experiments are performed for each of the three-objective DTLZ1-4 [20], WFG1-9 [21], and their minus versions [22] under each algorithm framework. The obtained MOEA/D configurations for the 26 test problems under the final population and solution selection frameworks are shown in Table 2 and Table 3, respectively.
In Table 2 and Table 3, we can see that the optimal MOEA/D configurations are totally different between the final population framework and the solution selection framework. As an example, when the final population framework is used, only the PBI and TCH functions are chosen as the scalarizing function for the DTLZ and WFG test suites (see Table 2). However, when the solution selection framework is used, the WS function is selected for DTLZ2, DTLZ3, WFG2, WFG3, and WFG6 (see Table 3). VOLUME 8, 2020  Another example is the specification of the neighbourhood size. In the final population framework, the mating neighborhood size (T Mate ) is always smaller than or equal to the replacement neighborhood size (T Rep ). However, this is not always the case in the solution selection framework. As an example, for WFG9, the mating neighborhood size (i.e., T Mate = 10% of the population size) is larger than the replacement neighborhood size (i.e., T Rep = 5% of the population size).

B. EVALUATION UNDER THE FINAL POPULATION FRAMEWORK
In this subsection, the performance of each MOEA/D configuration in Table 2 and Table 3  As in [18], [22], and [26], the penalty parameter θ in the standard MOEA/D-PBI and MOEA/D-IPBI are set as 5 and 0.1, respectively. Our computational experiments are performed on the PlatEMO platform [37]. Each MOEA/D algorithm is independently run 31 times on each test problem.
In order to evaluate the performance of the MOEA/D, the HV indicator and the inverted generational distance (IGD) indicator are used. A larger HV value and a smaller IGD value indicate the algorithm has better performance. It should be noted that the HV indicator is also used to evaluate each MOEA/D configuration in the hyper-heuristic method (in Section III.B).
As we have explained in the previous section, a reference point is needed for calculating the HV value. In this section, the true Pareto front information of each test problem is used for evaluating the performance of MOEA/D. The true ideal and nadir points are used to normalize the objective space. The reference point r = (1.1, 1.1, 1.1) is used for the hypervolume calculation. For the IGD calculation, a reference point set is needed. In our experiments, about 10,000 reference points are uniformly sampled over the entire Pareto front of each test problem in PlatEMO [37] as the reference point set for the IGD calculation except for WFG3 (see the footnote of Table 4). Table 4 shows the mean hypervolume value over 31 runs of each MOEA/D configuration for the three-objective test problems. The stopping condition for each MOEA/D 1 Since the reference point set for WFG3 on the PlatEMO does not cover the flag region (the true Pareto front of the three-objective WFG3 has a flag region [38]), we generated the reference point set for the three-objective WFG3 problem by choosing non-dominated solutions from solution sets obtained by different EMO algorithms. We used NSGA-II, NSGA-III, MOEA/D-PBI, SMS-EMOA and SPEA2 with the population size 100 and 100,000 solution evaluations over 31 runs. A total of 7905 non-dominated solutions were obtained, which were used as the reference points for HV and IGD calculation of WFG3.  Table 2 (tuned under the final population framework) and Table 3 (tuned under the solution selection framework), respectively. Since Auto-MOEA/D-FP is tuned under the final population framework, it is expected that Auto-MOEA/D-FP has the best performance among all the algorithms. The Wilcoxon's rank sum test at a significant level of 5% is used to evaluate the statistical difference between Auto-MOEA/D-FP and each of the other six MOEA/D versions. The signs ''+'', ''−'', and ''='' are used to indicate the compared MOEA/D version is statistically better than, worse than, or equivalent to Auto-MOEA/D-FP. The best and worst results among the seven MOEA/D versions are highlighted using yellow color and red color, respectively.
In Table 4, Auto-MOEA/D-FP shows high performance. Auto-MOEA/D-FP outperforms all the other six versions on almost all test problems. For some test problems, Auto-MOEA/D-FP is not the best in Table 4. However, the results obtained by Auto-MOEA/D-FP are very similar to the best results on those problems. From the experimental results, we can also see that the best results for almost all test problems are obtained by Auto-MOEA/D-FP and Auto-MOEA/D-SS. This observation shows that a tuning procedure is beneficial for improving the performance of MOEA/D.
Even though Auto-MOEA/D-SS shows the best performance on some test problems, it also shows the worst performance on some other test problems. For example, it has the worst performance on WFG3. However, Auto-MOEA/D-SS has the best performance when it is evaluated under the solution selection framework (which will be shown in Table 6 in the next subsection). This observation suggests that the best algorithm configuration for the solution selection framework can be totally different from that for the final population framework.   final population framework are shown in Fig. 1. That is, Fig. 1 shows the solutions in the final population of a single run of each algorithm. A single run (over 31 runs) with the median HV value (from Table 4) is selected. We can see that the final population of Auto-MOEA/D-SS is not a good solution set. That is, only a few solutions are obtained since many solutions are overlapping with each other. Fig. 2 shows the solution sets obtained by Auto-MOEA/D-FP and Auto-MOEA/D-SS for WFG3 under the evaluation of the solution selection framework. That is, Fig. 2 shows the selected solutions from all the examined solutions in the same single run as in Fig.1. We can see from the comparison between Fig. 1 and Fig. 2 that better solution sets can be obtained from the solution selection framework. We can also see that the obtained solution set by the solution selection framework in Fig. 2 (b) is the best with respect to the HV indicator among the four solution sets in Fig. 1 and Fig. 2, whereas the final population of the corresponding run in Fig. 1(b) is the worst.
The performance of MOEA/D is also examined using a different performance indicator (i.e., IGD) and a different termination condition (i.e., 50,000 solution evaluations) under the final population framework. Table 5 shows the summary of statistical comparison results between Auto-MOEA/D-FP with the other six MOEA/D versions using the Wilcoxon rank sum test. The signs ''+'', ''−'', and ''='' indicate the number of test instances on which the results of Auto-MOEA/D-FP are significantly better than, worse than, or equivalent to other MOEA/D versions.
We can see from Table 5 that Auto-MOEA/D-FP (which is designed for HV maximization under 10,000 solution evaluations) also shows high performance under different conditions. As an example, when Auto-MOEA/D-FP is evaluated using the HV indicator with 50,000 solution evaluations, its performance is significantly better than each of the other MOEA/D versions for at least 20 (out of 26) test problems. Auto-MOEA/D-FP also shows good performance when the IGD indicator is used for evaluation.

C. EVALUATION UNDER THE SOLUTION SELECTION FRAMEWORK
In this subsection, each MOEA/D configuration is independently run 31 times on each test problem under the Various solution subset selection methods can be used to select pre-specified numbers of solutions from the unbounded external archive [19]. We use greedy HV-based and IGD-based solution subset selection methods in our experiments to evaluate each MOEA/D version in this paper. A recently proposed lazy greedy inclusion technique [39] is used for the speed-up of greedy inclusion solution subset selection. The lazy greedy HV-based inclusion method [39] is used in the solution selection framework when the performance of MOEA/D is evaluated by the HV indicator. Likewise, the lazy greedy IGD-based inclusion method is used when the performance of MOEA/D is evaluated by the IGD indicator. It should be noted that the distance-based greedy solution subset selection method is used for the design of Auto-MOEA/D-SS by the hyper-heuristic method. This is because the solution subset selection method needs to be performed 50,000 times in a single run of the hyper-heuristic method (i.e., a very fast method needs to be used). Table 6 shows the mean HV value over 31 runs of each MOEA/D version for each test problem under the solution selection framework. The termination condition in Table 6 is 10,000 solution evaluations. The Wilcoxon's rank sum test at a significant level of 5% is used to evaluate the statistical difference between Auto-MOEA/D-SS and each of the six versions of MOEA/D. In Table 6, the signs ''+'', ''−'', and ''='' are used to indicate that the compared MOEA/D version is statistically better than, worse than, or equivalent to Auto-MOEA/D-SS. The best and worst results among the seven MOEA/D versions are highlighted using yellow color and red color, respectively.
Auto-MOEA/D-SS shows high performance on many test problems when it is evaluated under the solution selection framework in Table 6. Even though Auto-MOEA/D-SS does not show the best performance on some test problems, it shows very similar performance to the best results obtained on those problems.
Whereas Auto-MOEA/D-SS shows the worst performance on DTLZ2, WFG3, WFG9, Minus-DTLZ2 and Minus-WFG7 in Table 4 (evaluation under the final population framework), it shows the best performance on WFG3, WFG9, and Minus-WFG7, and comparable performance on DTLZ2 and Minus-DTLZ2 (i.e., the HV values are very similar to the best results on these two test problems) in Table 6. These observations clearly show that the optimal algorithm configuration for the solution selection framework can be totally different from that for the final population framework.
Moreover, we can see that better results are obtained in Table 6 than Table 4 for almost all cases. Actually, higher average HV values are obtained in 181 cases (out of 26 test problems × 7 algorithms =182 cases). This observation shows the usefulness of the solution selection framework with an unbounded external archive. Theoretically, the result of the solution selection framework is always better than or equal to that of the final population framework if we can select the best solution set from the examined solutions. Since the HV-based greedy selection method is used (i.e., since the greedy algorithm is not an exact optimization algorithm), better results are obtained in Table 4 than Table 6 for one case (among the 182 cases). This observation suggests that further improvement is needed for solution selection in future studies.
In Table 6, Auto-MOEA/D-SS shows high performance. However, if compared with the high performance of Auto-MOEA/D-FP in Table 4, the performance of Auto-MOEA/D-SS is not so dominant in Table 6. This is because the distance-based solution selection is used in the hyper-heuristic algorithm (for efficient calculation). If the HV-based solution selection method is used, the performance of Auto-MOEA/D-SS will be further improved. However, at the same time, much more computation time is needed.
In Table 6, we can also see that the difference in the average HV values between Auto-MOEA/D-FP and Auto-MOEA/D-SS is very small for all test problems (in comparison with the difference in Table 4) even when there exists a statistically significant difference. This may mean that the search for the best algorithm configuration is not easy under the solution selection framework since different configurations have similar performance. The search ability of the hyper-heuristic method will be improved by increasing the number of runs to evaluate each algorithm configuration. However, this needs more computation time.
The performance of MOEA/D under the solution selection framework is also examined using a different performance indicator (i.e., IGD) and a different termination condition (i.e., 50,000 solution evaluations). Table 7 shows the summary of statistical comparison results between Auto-MOEA/D-SS with each of the other six MOEA/D versions using different evaluation conditions. The signs ''+'', ''−'', and ''='' indicate the number of test instances on which the results of Auto-MOEA/D-SS are significantly better than, worse than, or equivalent to other MOEA/D versions.
In Table 7, Auto-MOEA/D-SS shows similar performance to Auto-MOEA/D-FP on average when the termination condition of 50,000 solution evaluations is used. Although statistical differences are observed in the performance between Auto-MOEA/D-SS and Auto-MOEA/D-FP, their average HV values are very similar as in Table 6. Auto-MOEA/D-SS outperformed the other five MOEA/D versions on average in Table 7 when the HV indicator is used for performance comparison.
When the IGD indicator is used in Table 7, the performance of Auto-MOEA/D-SS is not the best. For more than 15 test problems (out of 26 test problems), Auto-MOEA/D-SS is outperformed by Auto-MOEA/D-FP. One possible reason is that different configurations are needed to obtain good solution sets for different performance indicators. That is, the obtained configurations for the HV indicator are not always good for the IGD indicator. Another possible reason is that the optimization of Auto-MOEA/D-SS has not been fully completed as discussed for Table 6 (i.e., use of the distance-based selection method and similar performance of different configurations).

D. DISCUSSION
In general, offline automated algorithm design methods for evolutionary algorithms have the following limitations:

1) LARGE COMPUTATION LOAD RELATED TO FITNESS EVALUATION
Large computation load for evaluating each solution (i.e., each evolutionary algorithm implementation) are needed. Noisy evaluation of each solution due to the stochastic nature of evolutionary algorithm also poses a difficulty to the offline automated algorithm design methods. In order to handle the noisy evaluation issue, multiple runs are needed. Otherwise, it is very likely that a solution with a lucky run (e.g., lucky random initial solutions) will be chosen as the final solution. That is, the search by the hyper-heuristic algorithm is to choose a lucky solution (instead of a good algorithm implementation). However, it is difficult to perform many runs due to the large computation load.

2) LARGE COMPUTATION LOAD RELATED TO OPTIMIZATION
By using various components and parameters of an evolutionary algorithm as decision variables in an offline automated algorithm design, the optimization problem by hyper-heuristic become small. Thus, it may be easy to find a good solution. However, the obtained good solution is not always a good algorithm implementation since many components and parameters are not optimized by hyperheuristics. Thus, we need to appropriately choose only important (influential) components and parameters, and we need to appropriately specify other components and parameters in the hyper-heuristic algorithm. Both of these two tasks are not easy.
In addition to these difficulties, in the automated design of EMO algorithms, the following are difficult:

3) CHOICE OF A PERFORMANCE INDICATOR IN THE HYPER-HEURISTIC ALGORITHM
In order to evaluate each implementation of an EMO algorithm, a performance indicator is needed. The choice of a performance indicator and its specification (e.g., a reference point for hypervolume calculation) is not easy.

4) CALCULATION OF THE PERFORMANCE INDICATOR
When the hypervolume indicator is selected, its calculation is time-consuming for many-objective problems and large solution sets.
Moreover, when an unbounded external archive is used, the following are difficult:

5) SELECTION OF THE FINAL SOLUTION SET
We may need an efficient and effective solution selection method since the size of the unbounded external archive can be very large.

6) MEMORY SIZE FOR THE UNBOUNDED EXTERNAL ARCHIVE
If an original multi-objective problem has a large number of decision variables (e.g., a large-scale problem with one million decision variables), it may need some trick to store a huge number of solutions.

V. CONCLUSION
In this paper, we empirically demonstrated the usefulness and flexibility of the solution selection framework using the MOEA/D algorithm on 26 test problems. An offline genetic algorithm-based hyper-heuristic method was used to search for the optimal MOEA/D configurations for each test problem under the final population framework and the solution selection framework. The experimental results suggested the optimal configurations can be totally different under the two frameworks. Better solution sets were obtained from the solution selection framework with a greedy solution subset selection method for almost all test problems than the final population framework.
As shown in this paper (and some other studies [14], [19]), better solution sets are usually obtained from the solution selection framework than the final population framework. However, new algorithm design has not been studied under the solution selection framework in the literature. This is clearly a promising future research direction. Another important research direction is the development of a new efficient and effective solution subset selection method to facilitate the solution selection framework.