A mini-review on preference modeling and articulation in multi-objective optimization: current status and challenges

Evolutionary multi-objective optimization aims to provide a representative subset of the Pareto front to decision makers. In practice, however, decision makers are usually interested in only a particular part of the Pareto front of the multi-objective optimization problem. This is particularly true when the number of objectives becomes large. Over the past decade, preference-based multi-objective optimization has attracted increasing attention from both academia and industry due to its significance in both theory and practice. Significant progress has been made in both the evolutionary multi-objective optimization and the multi-criteria decision making communities, although many open issues remain to be addressed. This paper provides a concise review of preference-based multi-objective optimization, including various preference modeling methods and existing preference-based optimization methods, as well as a brief discussion of the main future challenges.


Introduction
Most real-world optimization problems in science, engineering and even daily life need to take into account multiple and often conflicting criteria [1,2]. Such problems are known as multi-objective optimization problems (MOPs), which can be formulated mathematically as follows:

$$\min_{x} \; F(x) = (f_1(x), f_2(x), \ldots, f_m(x)) \quad \text{s.t.} \quad x \in \Omega, \qquad (1)$$

where $m$ is the number of objectives, $x$ is an $n$-dimensional decision vector, and the feasible region is defined by $\Omega$. For two arbitrary solutions $x_1, x_2 \in \Omega$, $x_1$ is said to dominate $x_2$ (denoted as $x_1 \prec x_2$) if $f_i(x_1) \le f_i(x_2)$ for all $i \in \{1, \ldots, m\}$ and $F(x_1) \ne F(x_2)$. A solution $x^* \in \Omega$ that is not dominated by any other feasible solution in $\Omega$ is called a Pareto optimal solution [3]. Typically, a set of Pareto optimal solutions exists for the MOP described in Eq. (1); the set of all $x^*$ is called the Pareto set (PS), and the corresponding objective vectors $F(x^*)$ form the Pareto front (PF).
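The dominance relation can be made concrete with a minimal sketch, assuming minimization of all objectives: one objective vector dominates another when it is no worse in every objective and strictly better in at least one. The function names `dominates` and `non_dominated` and the sample vectors are illustrative assumptions, not part of any cited algorithm.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if objective vector a Pareto-dominates b (minimization)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective vectors of four candidate solutions.
objs = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
front = non_dominated(objs)  # (3.0, 3.0) is dominated by (2.0, 2.0)
```

The quadratic filter is adequate for small sets; it is meant only to illustrate the definition, not to replace the sorting procedures used inside MOEAs.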
It is helpful for decision makers (DMs) to make their decisions if the whole Pareto optimal set is already known, because the whole set can provide an overall picture of the distribution of Pareto optimal solutions. To obtain the entire PF, or to be more exact, a representative subset of the PF, a large number of algorithms and methodologies have been designed in recent decades in both the traditional mathematical programming and the evolutionary computation communities. Traditional mathematical programming methods such as the weighted aggregation methods [4] cannot identify the whole PF in one single run. Evolutionary algorithms (EAs), as population-based search methods, are believed to be well suited for solving MOPs in that they can achieve a set of non-dominated solutions in one run. Multi-objective evolutionary algorithms (MOEAs) [5] have now become a mature tool to solve MOPs. Generally speaking, existing MOEAs can be divided into three categories according to their selection criteria, namely Pareto-, indicator-, and reference-based MOEAs [6][7][8], even though a number of MOEAs might fall into more than one category or employ additional selection criteria.
Pareto-based MOEAs employ Pareto dominance as their main selection criterion for convergence. Different diversity maintenance strategies are adopted in different Pareto-based MOEAs, such as crowding distance in NSGA-II [9] and environmental selection in SPEA2 [10]. However, it was shown that Pareto-based MOEAs fail to solve many-objective optimization problems (MaOPs), which are defined to be MOPs with more than three objectives [11], mainly due to the fact that the dominance comparison becomes less effective when the number of objectives increases for a limited population size [12].
Reference-based MOEAs decompose an MOP into a set of sub-problems according to pre-assigned references, such as weights [19], reference points [20], reference vectors [21], and direction vectors [22,23]. Different aggregation functions have been suggested to convert an MOP into a set of single-objective optimization problems, including the weighted sum [3], the Tchebycheff approach [3], and the penalty-based boundary intersection (PBI) approach [24].
Although a representative subset of the overall PF can be located using most MOEAs for two- or three-objective optimization problems, selecting a few solutions to be implemented is not trivial. The decision-making process becomes much harder for many-objective optimization problems, because human beings are believed to be able to handle up to seven criteria [25][26][27]. Therefore, articulation of preferences is essential for solving MOPs [28], which can guide optimization algorithms to find the most preferred solutions rather than the whole PF. To incorporate preferences into multi-objective optimization algorithms, the modeling and articulation of preferences must be considered. Generally, preferences can be involved in different stages of multi-objective optimization algorithms, and preference-based optimization methods can be classified into three categories: a priori, interactive, and a posteriori methods [28]. However, it is unclear which preferences can be effectively incorporated into MOEAs, and in many cases the user does not have a clear preference when little knowledge about the problem is available.
This paper offers a brief survey on preference modeling and articulation in multi-objective optimization. In section "Preference modeling methods", various preference modeling methods are summarized. Section "Preference-based optimization methods" gives an account of existing preference-based optimization methods. Future challenges in preference modeling and preference-guided multi-objective optimization are discussed in section "Challenges". Section "Conclusion" concludes this paper.

Preference modeling methods
Various preference models have been reported in the literature [29], which can be largely classified into goals, weights, reference vectors, preference relation, utility functions, outranking, and implicit preferences.

Goals
The most straightforward way to articulate preference is to provide goal information [30], as shown in Fig. 1

Weights
DMs can assign different levels of importance to different criteria by using weights $w = (w_1, \ldots, w_m)$, which form a vector in the weight space, as Fig. 2 shows. With the weights, multiple objectives can be converted into a single-objective function using an aggregation function [37][38][39]. The two most popular aggregation functions are the weighted sum, as described in Eq. (2), and the Tchebycheff approach [3], as described in Eq. (3),

$$g(x \mid w) = \sum_{i=1}^{m} w_i f_i(x), \qquad (2)$$

$$g(x \mid w, z^*) = \max_{1 \le i \le m} w_i \, | f_i(x) - z_i^* |, \qquad (3)$$

where $f_i$ is the i-th objective, $w_i$ is the i-th weight, and $z^*$ is the ideal point. In Fig. 2, the dotted line is the contour of the aggregation function $g$, which indicates the convergence tendency of the search of $g$ on the specific weight $w$. The authors of [41,42] have modified the dominance relation by a weight vector. However, similar to goals, it is hard for DMs to provide accurate weights without a full understanding of the characteristics of the problem.

Fig. 1 Modeling preferences in terms of goals. In the figure, the star denotes a goal specified by the DM, whereas the points illustrate the optimal solutions that may be found by an optimization algorithm based on the goal
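The difference between the weighted sum and the Tchebycheff approach can be illustrated with a minimal sketch. It assumes minimization, the commonly used forms `sum_i w_i * f_i` and `max_i w_i * |f_i - z*_i|`, and an ideal point `z_star` at the origin; the three sample points are hypothetical and lie on a concave front.

```python
from typing import Sequence


def weighted_sum(f: Sequence[float], w: Sequence[float]) -> float:
    """Weighted sum aggregation: sum_i w_i * f_i."""
    return sum(wi * fi for wi, fi in zip(w, f))


def tchebycheff(f: Sequence[float], w: Sequence[float],
                z_star: Sequence[float]) -> float:
    """Tchebycheff aggregation: max_i w_i * |f_i - z*_i|."""
    return max(wi * abs(fi - zi) for wi, fi, zi in zip(w, f, z_star))


# Hypothetical non-dominated points on a concave bi-objective front.
front = [(0.0, 1.0), (0.7, 0.7), (1.0, 0.0)]
w = (0.5, 0.5)
z_star = (0.0, 0.0)  # assumed ideal point

# With equal weights the weighted sum prefers an extreme point,
# whereas the Tchebycheff function selects the middle point.
best_ws = min(front, key=lambda f: weighted_sum(f, w))
best_tch = min(front, key=lambda f: tchebycheff(f, w, z_star))
```

On this toy front no choice of positive weights makes the weighted sum select the middle point, which is the usual argument for preferring the Tchebycheff form on non-convex fronts.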

Reference vectors
Reference vectors or points express the DM's expectation of, or the importance attached to, the objectives. Reference vectors and weights are similar in their aggregation functionality, although they have different physical meanings and, consequently, different influences on the search process. Usually, reference vectors represent the directions of the solution vector, whereas weights indicate the importance of different objectives. Reference vectors are in the objective space, whilst weights are in the weight space. Because of the inherent connection between reference vectors and weights, they can be converted into each other. The reference vectors in RVEA [21] and the reference points in NSGA-III [20] are converted from uniformly distributed weights.
Taking the PBI approach [24] in Fig. 3 as an example, which is the fitness function in NSGA-III [20], the relationship between a solution and a reference vector $v$ is described by two distances, where $d_1$ is the projection distance and $d_2$ is the perpendicular distance to the reference vector $v$. With $d_1$ accounting for convergence and $d_2$ promoting diversity, PBI selects solutions based on Eq. (4),

$$g = d_1 + \theta \, d_2, \qquad (4)$$

where $\theta$ is the penalty factor. The recently proposed angle penalized distance (APD) in RVEA [21] adopts the acute angle between the reference vector and solution vectors to replace the Euclidean distance as shown in Eq. (5),

$$g = (1 + p(\alpha)) \, |d|, \qquad (5)$$

where $p(\alpha)$ is a penalty function related to the angle $\alpha$ and $|d|$ is the length of the solution vector. It has been shown that angles provide a more scalable measure for diversity in high-dimensional spaces.

Fig. 3 The PBI approach decomposes the distance |d| into two orthogonal distances d 1 and d 2 , while the APD approach replaces the distance with an angle α
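The two distances and the angle discussed above can be computed as in the following minimal sketch. It assumes the commonly written form `d1 + theta * d2`, objective vectors translated by an ideal point `z_star`, and a default `theta = 5.0` chosen only for illustration. The APD penalty function itself is not reproduced here; only the angle it operates on is shown.

```python
import math
from typing import Sequence, Tuple


def pbi(f: Sequence[float], v: Sequence[float], z_star: Sequence[float],
        theta: float = 5.0) -> Tuple[float, float, float]:
    """Return (d1 + theta * d2, d1, d2) for objective vector f and reference vector v."""
    diff = [fi - zi for fi, zi in zip(f, z_star)]
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    unit = [vi / norm_v for vi in v]
    # d1: length of the projection of (f - z*) onto the reference direction.
    d1 = sum(di * ui for di, ui in zip(diff, unit))
    # d2: perpendicular distance from (f - z*) to the reference direction.
    d2 = math.sqrt(sum((di - d1 * ui) ** 2 for di, ui in zip(diff, unit)))
    return d1 + theta * d2, d1, d2


def angle_to_reference(f: Sequence[float], v: Sequence[float],
                       z_star: Sequence[float]) -> float:
    """Acute angle (radians) between (f - z*) and v; f must differ from z*."""
    diff = [fi - zi for fi, zi in zip(f, z_star)]
    dot = sum(di * vi for di, vi in zip(diff, v))
    norms = math.sqrt(sum(d * d for d in diff)) * math.sqrt(sum(x * x for x in v))
    return math.acos(max(-1.0, min(1.0, dot / norms)))


g, d1, d2 = pbi((3.0, 4.0), (1.0, 0.0), (0.0, 0.0))
alpha = angle_to_reference((3.0, 4.0), (1.0, 0.0), (0.0, 0.0))
```

For this example the projection is 3, the perpendicular distance is 4, and the angle satisfies tan(alpha) = d2 / d1, which shows that the angle carries the same diversity information as d2 without depending on the scale of the solution vector.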
Neither the Tchebycheff nor the PBI method is suited to PFs of all shapes [43]. Recently, different aggregation functions have been proposed for preferences expressed as either reference vectors or weights. For example, adaptive scalarizing methods in [44][45][46] change the aggregation function during the run of the MOEA, the Tchebycheff method is used in a reversed form for a convex PF [47], and the PBI method is inverted based on a nadir point [48].

Preference relation
DMs have different preferences on different objectives; thus, some objectives might not be equally important during the process of decision making [49][50][51]. Table 1 lists the symbol representation of the importance of objectives, and as a result, objectives can be sorted in a preferred order. With that preference relation [52], the search can be narrowed down by converting the relation into weights; the method in [2] is one example with a binary preference. The main disadvantage is that the preference relation cannot handle non-transitivity. During the process of decision making, DMs

Utility functions
Preferences can be characterized by utility functions [54][55][56], where the preference information is implicitly involved in the fitness function to rank solutions [57,58]. Unlike preference relations, the utility function sorts solutions rather than objectives in an order. For example, given N solutions x 1 to x N , DMs are required to input their preferences over those solutions. Then, an imprecisely specified multi-attribute utility theory (ISMAUT) formulation is employed to infer the relative importance of objectives to modify the fitness function. However, utility functions are based on a strong assumption that all attributes of the preferences are independent, thereby being unable to handle non-transitivity [59,60].

Outranking
Outranking [61] is a ranking of objective preferences that, unlike the preference relation, allows non-transitivity [62]. To construct an outranking [63], the preference and indifference thresholds for each objective are specified, as in the preference ranking organization method for enrichment evaluations (PROMETHEE) [64]. Every two solutions are compared according to those thresholds. Then, a preference ranking is obtained for outranking-based methods to search for the preferred solutions [65]. However, the outranking-based methods require too many parameter settings, which is hard for DMs when the number of objectives increases [64].
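The pairwise comparison with indifference and preference thresholds can be sketched as follows, in the style of PROMETHEE II net flows. The linear preference function, the objective weights, and all threshold values are illustrative assumptions rather than the settings used in [63][64]; minimization is assumed.

```python
from typing import List, Sequence


def preference_degree(d: float, q: float, p: float) -> float:
    """Linear preference function with indifference threshold q and preference threshold p (q < p)."""
    if d <= q:
        return 0.0
    if d >= p:
        return 1.0
    return (d - q) / (p - q)


def net_flows(solutions: List[Sequence[float]], weights: Sequence[float],
              q: Sequence[float], p: Sequence[float]) -> List[float]:
    """Net outranking flow of each solution; a larger flow means more preferred."""
    n = len(solutions)

    def pref_index(a: Sequence[float], b: Sequence[float]) -> float:
        # Degree to which a is preferred to b; fb - fa > 0 means a is better.
        return sum(w * preference_degree(fb - fa, qi, pt)
                   for w, fa, fb, qi, pt in zip(weights, a, b, q, p))

    flows = []
    for i in range(n):
        total = sum(pref_index(solutions[i], solutions[j])
                    - pref_index(solutions[j], solutions[i])
                    for j in range(n) if j != i)
        flows.append(total / (n - 1))
    return flows


# Hypothetical objective vectors and thresholds for two objectives.
solutions = [(1.0, 5.0), (2.0, 2.0), (5.0, 1.0)]
flows = net_flows(solutions, weights=(0.5, 0.5), q=(0.5, 0.5), p=(2.0, 2.0))
best = solutions[flows.index(max(flows))]
```

Even this small example needs one weight and two thresholds per objective, which illustrates why the number of parameters becomes a burden for DMs as the number of objectives grows.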

Implicit preferences
In some cases, DMs have little knowledge to articulate any sensible preferences. Nevertheless, there are some solutions on the PF that are naturally preferred, even if no problem-specific preference can be proposed. Those solutions can be detected based on the curvature of the PF [66]. For example, a knee point, around which a small improvement of any objective causes a large degradation of others, is always of interest to DMs as an implicitly preferred solution [67][68][69]. Examples include model selection in machine learning [70,71] and sparse reconstruction [72].
There is no widely accepted definition for knee points, and specifying knee points is notoriously difficult in high-dimensional objective spaces. Existing approaches to identifying knee points can be divided into two categories: angle- and distance-based approaches [68]. Angle-based approaches measure the angle between a solution and its two neighbors and search for the knee point according to the obtained angle [72]. Although angle-based approaches are straightforward, they can be applied to bi-objective optimization problems only. Distance-based approaches, which search for the knee point according to the distance to a pre-defined hyperplane [73], can handle problems with more than two objectives.
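A distance-based search for a knee point can be sketched for the bi-objective case, where the pre-defined hyperplane reduces to the line through the two extreme points of the front. This is a simplified illustration under that assumption, not the procedure of [73]; minimization is assumed and the sample front is hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def knee_point(front: List[Point]) -> Point:
    """Return the point farthest from the line through the two extremes, on the side of the ideal point."""
    pts = sorted(front)                      # ascending in f1, hence descending in f2
    (x1, y1), (x2, y2) = pts[0], pts[-1]     # the two extreme points (must differ)
    # Line a*x + b*y + c = 0 through the extremes.
    a, b = y2 - y1, x1 - x2
    c = -(a * x1 + b * y1)
    norm = math.hypot(a, b)

    def signed_distance(pt: Point) -> float:
        # Positive values lie on the side of the line closer to the ideal point.
        return (a * pt[0] + b * pt[1] + c) / norm

    return max(pts, key=signed_distance)


# Hypothetical convex front with a pronounced bulge towards the origin.
front = [(0.0, 1.0), (0.1, 0.5), (0.2, 0.2), (0.5, 0.1), (1.0, 0.0)]
knee = knee_point(front)
```

With more than two objectives the line is replaced by the hyperplane through the extreme points, and the same signed-distance criterion applies.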
In addition to knee points, extreme points or the nadir point can work as a special form of preferences [74]. Extreme points are the solutions with the worst objective values on the PF. A nadir point is a combination of extreme points. With extreme points or the nadir point, DMs can acquire knowledge of the range of the PF and thus input their preferences more accurately [75][76][77].
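How the range of the PF can be derived from a set of non-dominated solutions is shown in the minimal sketch below. It estimates the ideal and nadir points as the component-wise best and worst objective values of the set (minimization assumed) and uses them to rescale objectives to [0, 1]. The sample front is hypothetical, and on an approximated front the result is only an estimate of the true nadir point.

```python
from typing import List, Sequence, Tuple


def ideal_and_nadir(front: List[Sequence[float]]) -> Tuple[tuple, tuple]:
    """Component-wise best (ideal) and worst (nadir) values over a non-dominated set."""
    m = len(front[0])
    ideal = tuple(min(f[i] for f in front) for i in range(m))
    nadir = tuple(max(f[i] for f in front) for i in range(m))
    return ideal, nadir


def normalize(f: Sequence[float], ideal: Sequence[float],
              nadir: Sequence[float]) -> tuple:
    """Rescale each objective to [0, 1] using the estimated range of the PF."""
    return tuple((fi - lo) / (hi - lo) for fi, lo, hi in zip(f, ideal, nadir))


# Hypothetical front whose objectives have very different scales.
front = [(1.0, 40.0), (2.0, 25.0), (4.0, 10.0)]
ideal, nadir = ideal_and_nadir(front)
scaled = normalize((2.0, 25.0), ideal, nadir)
```

Once the objectives are on a common scale, goals or weights stated by a DM can be interpreted relative to the attainable range rather than to raw objective units.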

Discussions
The above formulations of preferences share several similarities. For example, although weights and reference vectors are different concepts, weights are sometimes used as references, and vice versa. All the existing preference formulations are scalable to many objectives, but their complexity significantly increases. Although different preference models may have very different properties, they all describe the objective importance or priority in their own ways, except that utility functions sort the importance of solutions rather than objectives.
DMs might articulate preferences with uncertainty. To model uncertainty in preferences, small perturbations can be introduced into goal-, weight-, or reference vector-based methods. Fuzzy logic can also be used as a natural means for handling uncertainty in preferences [78,79], for example in reference points [35], weights [80], preference relations [81,82], and outranking [63]. Preference relations, utility functions, and outranking are not strictly based on numerical values of objective importance, which allows uncertainty to a certain degree. DMs might also have inconsistent preferences during the search. In such cases, goal-, weight-, and reference vector-based methods might fail, because they focus too much on the previous preferences and may lose diversity. Also, preference relation- and utility function-based methods cannot handle preference inconsistency. Only outranking allows inconsistency in preferences to some degree. Furthermore, DMs can introduce inappropriate preferences, which might lead to infeasible solutions. There has not been any specific research dedicated to handling inappropriate preferences, and fuzzy preferences might provide a solution to this problem.

Preference-based optimization methods
The existing preference-based optimization methods can be classified into three categories according to the time when preferences are incorporated, i.e., a priori, interactive, and a posteriori methods [28]:
- A priori methods. In these methods, DMs need to input their preferences before optimization starts. The main difficulty lies in the fact that DMs may have limited knowledge about the problem and their preferences may be inaccurate or even misleading.
- A posteriori methods. In a posteriori methods, a set of representative Pareto optimal solutions is obtained using an optimization algorithm, from which DMs choose a small number of solutions according to their preferences. In comparison with the a priori methods, DMs are able to better understand the trade-off relationships between the objectives in the a posteriori methods. Most existing multi-objective evolutionary algorithms (MOEAs) [83] belong to this category. It should be noted, however, that it becomes increasingly hard to obtain a representative solution set as the number of objectives increases [84].
- Interactive methods. Interactive methods [85,86] enable DMs to articulate their preferences in the course of optimization. In interactive methods, DMs are allowed to modify their preferences, typically based on the domain knowledge acquired during the optimization [32,38,87].
With the increasing understanding of the problem as the optimization proceeds, DMs are able to fine tune their preferences according to the obtained solutions in each iteration. With the revised preference, the interactive methods search for new preferred solutions, which usually needs less computational cost compared with the a posteriori methods. In the existing interactive methods, only one single preference model is adopted, such as reference vectors [88][89][90][91], weights [92][93][94][95], preference relation [96][97][98], and utility functions [99].

Non-evolutionary preference-based optimization methods
Traditional multiple criteria decision making (MCDM) methodologies are non-evolutionary and usually involve a certain type of preference information. During the MCDM processes, the following assumptions hold [4][100][101][102]:
- Part of the non-dominated solutions is expected to be found.
- DMs are expected to understand the problem and to be able to provide reasonable preferences.
- Satisfactory optimal solutions are expected to be output finally.
According to [103], classical MCDM approaches can be divided into two types: aggregation procedures and synthesizing criteria.
The aggregation-based MCDM approaches are based on weights [104]. Thus, decision making is mathematically defined by Eq. (2), where m is the number of criteria, w is the weight. For those approaches, DMs need to have a clear idea about how to set the weights. However, it is very hard for human beings to provide precise quantitative importance levels for different objectives. In some cases, good solutions cannot be easily distinguished from the poor solutions by Eq. (2).
Unlike the aggregation-based MCDM approaches, which are based on an explicit mathematical formula as a fixed preference, the synthesizing criterion-based approaches are based on implicit rules. For example, outranking and utility functions are two implicit and flexible preference models. Outranking sorts the objective preferences, and the utility function sorts the solution preferences. So far, the elimination and choice expressing reality (ELECTRE) [105] and the preference ranking organization method for enrichment evaluations (PROMETHEE) [106] are the two main outranking approaches; utilities additives (UTA) methods [107] are utility function-based approaches [108].
In addition to the above mentioned aggregation procedures and synthesizing criteria, fuzzy logic [109], decision rules [110], multi-objective mathematical programming [111], and objective classification [112] have been employed to improve the performance of the MCDM approaches.

Evolutionary preference-based optimization methods
While non-evolutionary methods pay much attention to preference handling, most MOEAs focus on obtaining the whole solution set, as in the a posteriori methods. In this section, we discuss the a priori methods in MOEAs, which embed preferences into their fitness functions to narrow down the selection [113]. So far, goals, weights, references, and utility functions have been used to integrate preferences in MOEAs [29].
As mentioned in section "Preference modeling methods", goals are straightforward preferences for MOEAs. Different formulations have been used to incorporate preferences in existing MOEAs. For example, the algorithm in [114] considers goal preferences as constraints by Eq. (6),

$$f_i(x) \le g_i, \quad i = 1, \ldots, m, \qquad (6)$$

where $g_i$ is the goal for the i-th objective. One issue with this approach is that no solution can be achieved if the goals are set unreasonably by DMs, for example for MOPs with a discontinuous PF. To address this issue, an algorithm is proposed in [115] that divides the constraints into hard and soft constraints according to the priority of objectives.
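Treating goals as constraints, and the failure mode described above, can be illustrated with a minimal sketch. It assumes minimization and that a goal is met when every objective value is no larger than its goal; the function names and the sample front are hypothetical.

```python
from typing import List, Sequence


def meets_goals(f: Sequence[float], goals: Sequence[float]) -> bool:
    """True if every objective value is no worse than its goal (minimization)."""
    return all(fi <= gi for fi, gi in zip(f, goals))


def filter_by_goals(front: List[Sequence[float]],
                    goals: Sequence[float]) -> List[Sequence[float]]:
    """Keep only the solutions that satisfy all goal constraints."""
    return [f for f in front if meets_goals(f, goals)]


# Hypothetical non-dominated objective vectors.
front = [(0.0, 1.0), (0.2, 0.6), (0.5, 0.3), (1.0, 0.0)]

reachable = filter_by_goals(front, goals=(0.6, 0.7))    # goals inside the attainable range
unreachable = filter_by_goals(front, goals=(0.1, 0.1))  # goals no solution can satisfy
```

The second call returns an empty list, which is the situation in which a purely constraint-based treatment of goals gives the DM nothing to choose from and a softer handling of some goals becomes necessary.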
In fact, existing reference-based MOEAs can naturally be seen as preference-based MOEAs, which assign preferences uniformly distributed in the whole objective space and decompose one MOP into a number of single-objective optimization problems. Preferences in those algorithms are presented in different models, such as weights in MOEA/D [19], direction vectors in DVCMOA [22] and MOEA/D-M2M [23], reference vectors in RVEA [21], and reference points in NSGA-III [20]. So far, a majority of preference-based MOEAs are based on weights [37,38][116][117][118][119][120]. The second most widely used preference model in MOEAs is reference vectors. The algorithms reported in [20,21,76] model preferences by reference vectors or points. Most recently, preference articulation methods based on reference points, reference vectors, and weights have been examined and compared on a hybrid electric vehicle control problem [121].
Another popular type of preferences adopted in MOEAs is the achievement scalarizing function (ASF) [28][122][123][124]. The formulation of ASF is shown in Eq. (7),

$$s(x \mid w, z) = \max_{1 \le i \le m} w_i \, ( f_i(x) - z_i ), \qquad (7)$$

where $w$ is a weight and $z$ is a reference point. The light beam search [125,126] projects a beam of light from a reference point onto the PF, resulting in a small neighborhood on the PF. To increase the robustness of the preference-based MOEAs to DMs, the light beam is used [127] to replace the reference point in the achievement scalarizing function [128]. Moreover, the achievement scalarizing function has been employed to approximate hypervolume [34,129,130]. Based on ASF, an interactive MOEA termed I-SIBEA [131] is proposed, which selects new solutions according to a weighted hypervolume.
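The way a reference point steers the search through an ASF can be sketched as follows. The form `max_i w_i * (f_i - z_i)` is one common variant and is an assumption here, since several variants exist in the literature; the front, the weights, and the reference point are hypothetical and minimization is assumed.

```python
from typing import List, Sequence


def asf(f: Sequence[float], w: Sequence[float], z: Sequence[float]) -> float:
    """Achievement scalarizing function: max_i w_i * (f_i - z_i)."""
    return max(wi * (fi - zi) for wi, fi, zi in zip(w, f, z))


def most_preferred(front: List[Sequence[float]], w: Sequence[float],
                   z: Sequence[float]) -> Sequence[float]:
    """Solution with the smallest ASF value with respect to reference point z."""
    return min(front, key=lambda f: asf(f, w, z))


# Hypothetical non-dominated objective vectors.
front = [(0.0, 1.0), (0.2, 0.6), (0.5, 0.3), (1.0, 0.0)]
w = (1.0, 1.0)

# Moving the reference point moves the selected region of the front.
pick_a = most_preferred(front, w, z=(0.4, 0.2))
pick_b = most_preferred(front, w, z=(0.1, 0.5))
```

Unlike the absolute difference in the Tchebycheff function, the signed difference lets the reference point lie on either side of the front, so an unattainable or an overly conservative reference point both yield a sensible selection.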
Several utility function-based MOEAs have also been developed. For example, an algorithm was presented in [58], which might be the first MOEA that implicitly involves the preferences in the fitness function using a utility function. To guide MOEAs toward preferred solutions, robust ordinal regression is employed to approximate the utility in [97].

Challenges
Even though preferences have recently gained increasing attention and have been studied for decades, many issues remain to be addressed in the future.

Preference adaptation for various formulations
As mentioned before, different preference models have been developed and existing preference-based MOEAs are designed according to a specific preference model. However, preferences provided by DMs might be in different forms, and no single MOEA is able to deal with various types of preferences, which makes these algorithms less flexible in practice. Thus, it would be very desirable if various preference models could be converted into a single preference model so that they can be incorporated into a preference-based MOEA. So far, not much work has been reported on converting one preference model into another, with a few exceptions, e.g., preference relations are converted into weights in [2] and fuzzy preferences are turned into weights in [82]. It is therefore necessary to develop a general framework for converting different preference models so that the advantages and disadvantages of the existing methods can be properly compared in terms of their ability to handle uncertainty and conflicts, as well as their robustness in obtaining preferred solutions.

Learning user preferences
Preferences play a very important role in MCDM. Preferences given by DMs are consistent to a certain degree, notwithstanding the fact that DMs might change their preferences when interacting with the optimizer. Thus, the system should be able to learn the preferences of DMs based on historical data. Although there are many mature techniques in machine learning [132] and data mining [133] that can help learn preferences of DMs, little attention has been paid to this research topic, with a few exceptions such as [134], where preferences of DMs are learned by training a single or multiple surrogate models [135] using a semi-supervised learning algorithm. As the work in [134] indicated, a proper learning algorithm should be chosen and attention should be paid to ensuring that the learned preferences can be incorporated into MOEAs.

Handling preference violation
Without sufficient information about the problem, it is likely for DMs to provide less reasonable or even misleading preferences. In some cases, no solutions can be found for some preferences, for example when the Pareto front is discontinuous.
When there is a group of DMs, it should be taken into account that the preferences given by different group members might conflict with each other [136]. As pointed out in [36], priority, independence, and unanimity of individual preferences need to be taken into account in using preferences from multiple DMs.

Psychological study
Decision making can be seen as a psychological construct in the selection among several alternative actions [137]. In some cases, the processing capacity of DMs is limited due to the overwhelming amount of results from decision making systems [138]. To ensure that decision making systems are compatible with the psychology of DMs, attention should be paid to the theory of decision making at the psychological level [139]. Experiments reported in [140] indicate that the forecasting performance can be improved with the help of a psychological model. Therefore, we believe that a further understanding of the psychology of DMs would build a proper bridge between decision making systems and DMs, which can further improve the efficiency of the preference-based methods [25,26,57,112].

Relationship between objectives
The conflict between two objectives means that the improvement on one objective would deteriorate the other. The conflict might be global or local [141][142][143]. For locally conflicting objectives, they are conflicting with each other in some regions but not in other regions. However, the existing research on objective reduction focuses on global redundancy between objectives [144][145][146], but little work has been conducted on locally conflicting objectives. The search on locally redundant objectives wastes computational cost, and the results in [141] indicate that objective reduction approaches for some problems with globally conflicting objectives can still improve the performance of MOEAs on the problems with locally redundant objectives. Therefore, detecting locally conflicting objectives, reducing locally redundant objectives, and analyzing the effects of locally conflicting objectives on the PFs are of great interest. Moreover, analysis of the correlation between objectives can help group objectives into a number of groups to simplify the representation to DMs, because human beings can only handle around seven objectives.
Several approaches can be used to help DMs understand the relationship between objectives. In [147], objectives are divided into five classes to help DMs understand the tradeoff. Self-organizing maps (SOMs) [148] have been shown to be promising in revealing the tradeoff relationships between objectives [149,150]. Correlation is another effective tool for analyzing the relationship between objectives. Different metrics have been proposed to measure the degree of correlation (both linear and non-linear), for instance covariance, mutual information entropy [151], and non-linear correlation information entropy (NCIE) [152,153]. Based on these relations, many mature data mining techniques can be employed to choose a subset of conflicting objectives to simplify the original problem, such as feature selection [146], principal component analysis (PCA) [154], and maximum variance unfolding (MVU) [155]. The Pareto corner search evolutionary algorithm (PCSEA) [145] is a newly proposed objective reduction approach. It searches only the corners of PFs and then uses the obtained solutions to analyze the relationship between objectives and identify a subset of non-correlated objectives.
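A linear correlation analysis of objectives can be sketched as follows, using the Pearson coefficient computed over the objective vectors of a solution set. This is only the simplest of the measures mentioned above (it captures linear relationships only), and the sample data are hypothetical: the second objective conflicts with the first, while the third is redundant with it.

```python
import math
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long, non-constant samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def objective_correlations(objs: List[Sequence[float]]) -> List[List[float]]:
    """Pairwise correlation matrix between the objectives of a solution set."""
    m = len(objs[0])
    cols = [[f[i] for f in objs] for i in range(m)]
    return [[pearson(cols[i], cols[j]) for j in range(m)] for i in range(m)]


# Hypothetical objective vectors (f1, f2, f3) of four solutions.
objs = [(1.0, 4.0, 2.0), (2.0, 3.0, 4.0), (3.0, 2.0, 6.0), (4.0, 1.0, 8.0)]
corr = objective_correlations(objs)
```

A coefficient close to -1 indicates conflicting objectives, whereas a coefficient close to +1 suggests that one of the two objectives is a candidate for removal or grouping.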
Knee points show the degree of conflict and are interesting to DMs if they do not have specific preferences [68]. Knee point detection is based on different definitions for 2- or 3-objective problems [73]. The definition of a knee point in MaOPs is not yet well established, because the degree of conflict might vary with different pairs of objectives. The sensitivity to changes in individual objectives may exist in some particular regions of the PF, which can be considered as partial knee points that are of interest to DMs.

Functional maps from decision variables to objectives
For real-world applications, noise and uncertainties are inevitable. In such situations, DMs prefer solutions that are robust against small changes in decision variables [156][157][158]. There have been some discussions on robust multi-objective optimization [159][160][161][162][163][164], but little research has studied the robustness in decision making, except for measuring attractiveness by a categorical based evaluation technique (MACBETH) [165]. The analysis of the mapping relationship from decision variables to objectives [166] helps in searching for robust solutions in the preference-based methods.
To analyze the mapping relationship from decision variables to objectives, artificial neural network (ANN) [167,168], Bayesian learning [169,170], and the estimation of distribution algorithm (EDA) [171,172] have been employed.

Benchmark design
So far, several MOP test suites have been proposed, such as the ZDT [173], DTLZ [174], and WFG [175] problems. However, no benchmark problems have been proposed to test preference-based optimization methods. Thus, it is very desirable to design MOP test suites tailored for evaluating the performance of preference-based MOEAs. To design preference-based MOP benchmarks, the following aspects need to be considered.
- Preference simulation. It is necessary to simulate the preferences with artificial functions [176], where uncertainty and the response to the algorithm should also be taken into account.
- Objective correlation. Both global and local conflicts should be designed into the benchmark.
- Ground truth. The true optimal solutions should be provided for assessing the performance.

Performance assessment
Several performance indicators for measuring the performance of MOEAs have been proposed, such as generational distance (GD) [177], inverted generational distance (IGD) [173], and hypervolume [15]. However, few performance indicators are dedicated to the evaluation of preference-based methods, with a few exceptions such as [178], which considers both dominance and the distance to the preferences. In addition, an ideal metric for preference-based methods should evaluate whether the obtained solutions truly reflect the preferences, regardless of their preference modeling types.
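The two distance-based indicators can be illustrated with a minimal sketch. It assumes the mean nearest-neighbor distance variant of GD and IGD (other variants use a root of summed powers), Euclidean distance via `math.dist` (Python 3.8 or later), and a hypothetical reference front.

```python
import math
from typing import List, Sequence


def generational_distance(obtained: List[Sequence[float]],
                          reference: List[Sequence[float]]) -> float:
    """Mean distance from each obtained point to its nearest reference point."""
    return sum(min(math.dist(a, r) for r in reference) for a in obtained) / len(obtained)


def inverted_generational_distance(obtained: List[Sequence[float]],
                                   reference: List[Sequence[float]]) -> float:
    """Mean distance from each reference point to its nearest obtained point."""
    return sum(min(math.dist(r, a) for a in obtained) for r in reference) / len(reference)


# Hypothetical reference front and an obtained set that misses its middle part.
reference = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
obtained = [(0.0, 1.0), (1.0, 0.0)]

gd = generational_distance(obtained, reference)
igd = inverted_generational_distance(obtained, reference)
```

Here GD is zero because every obtained point lies on the reference front, while IGD is positive because the middle of the front is not covered. For a preference-based method the reference set would have to be restricted to the preferred region, which is exactly what such general-purpose indicators do not do by themselves.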

Visualization
Visualization plays an important role in interactions between DMs and preference-based optimization methods. When the number of objectives is equal to or larger than four, visualization becomes a challenge. Existing approaches can be divided into three classes, namely parallel coordinate, mapping, and aggregation tree [179]. The approaches based on parallel coordinates provide the visualization of individual solutions by a parallel coordinate system. In that system, there are parallel axes that can describe the values of all objectives. Parallel coordinates [1] use a polyline with vertices on the parallel axes, while heatmaps [180] use color to present the values on the parallel axes. Those approaches can only show the trade-off between two adjacent objectives.
Other approaches include those that adopt dimension reduction techniques that can preserve the Pareto dominance relationship among individuals in both global and local areas, such as Sammon mapping [181], neuroscale [182], radial coordinate visualization (RadViz) [183], SOM [149,184], and Isomap [185]. These approaches are not as straightforward as the parallel coordinate-based approaches in analyzing the tradeoff relationships between the objectives and are time-consuming.
Approaches based on the aggregation tree [186,187] measure the harmony between objectives to visualize the relation between objectives. However, this kind of approach cannot show individual solutions.
Most existing visualization tools are not straightforward for DMs to understand. Ideally, both dominance and preference relationships should be presented in the visualization. Moreover, DMs should be able to zoom in on interesting regions to get more detailed information.

Conclusion
Since preference-based multi-objective optimization is strongly motivated by real-world applications, research interest in this area has increased in recent years. Indeed, preference modelling is also a common need in many areas of artificial intelligence in which decision making is involved [188][189][190]. It thus becomes clear that preference modelling and learning are important not only for decision making and evolutionary optimization, but also for artificial intelligence research.
In this paper, we provide a concise review of research on preference modelling and preference-based optimization methods. We discuss the open issues in preference modelling and preference-based optimization. We emphasize that preference-based multi-objective optimization is of paramount practical significance and that preferences must be incorporated in many-objective optimization, where obtaining a representative subset of the entire Pareto front is less likely.