A Low-Sample-Count, High-Precision Pareto Front Adaptive Sampling Algorithm Based on Multi-Criteria and Voronoi

In this paper, a Pareto front (PF)-based sampling algorithm, the PF-Voronoi sampling method, is proposed to solve medium-scale, computationally expensive multi-objective problems. A Voronoi diagram is introduced to partition the region containing PF prediction points into Pareto front cells (PFCs). Valid PFCs are screened according to the maximum crowding criterion (MCC), the maximum leave-one-out (LOO) error criterion (MLEC), and the maximum mean MSE criterion (MMMSEC), and sampling points are selected within the valid PFCs based on Euclidean distance. The PF-Voronoi sampling method is applied to the coupled Kriging and NSGA-II models, and its validity is verified on the ZDT mathematical cases. The results show that the MCC criterion helps to improve the distribution diversity of the PF, while the MLEC and MMMSEC criteria reduce the number of training samples by 38.9% and 21.7%, respectively. The computational cost of the algorithm is reduced by more than 44.2% compared with the EHVIMOPSO and SAO-MOEA algorithms. The algorithm is applicable to multidisciplinary, multi-objective, and computationally intensive complex systems.


Introduction
With the continuous development of simulation technology and computational power [1], one can build high-fidelity models to predict complex physicochemical phenomena in the real world, such as the interaction of fluids with glider geometry and in-cylinder combustion phenomena in internal combustion engines. At the same time, accurate performance evaluation has driven the development of simulation-based multi-objective optimization for practical engineering [2]. However, the huge cost of high-fidelity analysis has become a bottleneck for simulation-based optimization, especially for complex systems involving multiple disciplines, multiple objectives, and computationally intensive analyses [3].
It has been reported that a single crash simulation at Ford Motor Company takes 36-160 hours to run [4], and for ship flow systems one CFD calculation takes several days [5]. Frequent optimization iterations at such simulation costs are difficult to accept in practice.
The application of surrogate models offers the possibility of reducing the cost of optimization [3]. Surrogate models constructed from a number of training samples are used to approximate computationally expensive functions, and optimization methods then search for optima on the surrogate models. Clearly, such surrogate-model-based optimization design approaches can significantly reduce the number of high-fidelity simulations, reduce time costs, and produce better designs [6]. Ibaraki et al [7] used an artificial neural network (ANN) and a genetic algorithm (GA) to form an optimization system for the multi-objective optimization of a centrifugal compressor impeller. It was shown that the two optimized impellers located on the PF showed significant improvements in efficiency and operating range, respectively, compared with the original impeller. Ekradi et al [8] used the ANN-GA optimization system to study the effect of impeller blade angle on the performance of a centrifugal compressor.
The study showed that the optimized impeller improved the isentropic efficiency at the design point by 0.97%, the total pressure ratio by 0.94%, and the mass flow rate by 0.65%. Wang et al [9] applied the Kriging model to the optimal design of an industrial centrifugal impeller, resulting in a 2.49% improvement in the isentropic efficiency of the impeller. Guo et al [10] applied response surface methodology (RSM) to the optimal design of a small centrifugal compressor. The results showed that the pressure ratio of the optimized compressor was improved by 7.5%.
In the literature above, most of the training sample points required for surrogate model construction are obtained by "one-time" sampling methods [11] such as uniform design [9], Latin hypercube sampling [12], and central composite design [10].
These methods quickly generate all sample points at once from a predefined total sample size. However, in practice it is often difficult for engineers to determine an appropriate sample size [13], which leads to unnecessary computational cost [14] [15].
To achieve good prediction accuracy of surrogate models with a reasonable number of sampling points, several adaptive sampling (also called sequential sampling) methods have been developed in recent years [16]. This class of methods starts with a small sample and improves the accuracy of the surrogate model by iteratively selecting samples throughout the design space. Sasena et al [17] compared five sampling criteria on several mathematical cases: the expected improvement criterion, the probability of improvement criterion, the regional extreme criterion, the mean squared error criterion, and the minimizing surprises criterion. It was shown that criteria emphasizing global search required more iterations to locate the optima and were less accurate than criteria emphasizing local search. Xu et al [16] proposed a sampling criterion, CV-Voronoi, for global surrogate models. This criterion divides the design space into multiple cells based on the current sample points and determines the cell in which the next sample point is located by cross-validation, thus reducing the search cost. Guo et al [18] [21]. Cheng et al [22] proposed a dynamic expected hypervolume improvement sampling algorithm, in which several promising prediction points are added to the training set by comparing the expected hypervolume improvement values of the currently predicted PF points. Fan et al [23] used reference vectors and non-dominated ranks to screen PF prediction points to ensure the diversity and convergence of the selected individuals. Gao et al [24] used the EIM criterion and a distance criterion to screen PF prediction points and proposed an adaptive criterion to balance exploration and exploitation and reduce the sample size.
The aforementioned PF-based sampling methods focus on adding selected PF prediction points to the training sample, and different criteria have been proposed to balance exploitation and exploration. However, it is difficult to determine the accuracy of the PF predicted by the surrogate model [25], which increases the number of invalid samples and the computational cost. Therefore, this paper proposes a new PF-based sampling algorithm, the PF-Voronoi sampling method. Instead of screening the next sampling point from the PF prediction points, the algorithm selects the next sampling point within the region of the design space to which the PF prediction points belong. In other words, the PF prediction points serve only as an indicator of the region where the true Pareto front may exist. The PF-Voronoi sampling method uses a Voronoi diagram [26] to partition the design space and assign the PF prediction points to different regions, and appropriate regions are then selected for sampling by the three sampling criteria proposed in this paper, making full use of the information in the PF prediction points. The PF-Voronoi sampling method adaptively updates the Kriging model and is coupled with NSGA-II to construct an effective multi-objective optimization framework, PFVNSGA-II.
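The partitioning step described above can be illustrated with a short sketch. The paper's exact PFC construction is not reproduced here; the sketch below assumes the common "implicit Voronoi" approach, in which each PF prediction point is assigned to the cell of its nearest training sample under the Euclidean metric, so the Voronoi diagram never has to be built explicitly (function and variable names are illustrative):

```python
import numpy as np

def assign_to_voronoi_cells(samples, pf_points):
    """Assign each PF prediction point to the Voronoi cell of its
    nearest training sample (implicit Voronoi partition, Euclidean metric).

    samples   : (N, n) array of training sample points
    pf_points : (M, n) array of PF prediction points
    returns   : length-M array of cell indices in [0, N)
    """
    # pairwise Euclidean distances between PF points and samples
    d = np.linalg.norm(pf_points[:, None, :] - samples[None, :, :], axis=2)
    return np.argmin(d, axis=1)

# toy example: 3 training samples in 2-D, 4 PF prediction points
samples = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
pf_pts = np.array([[0.1, 0.1], [0.9, 0.1], [0.1, 0.9], [0.4, 0.4]])
cells = assign_to_voronoi_cells(samples, pf_pts)
print(cells)  # [0 1 2 0]
```

Cells that contain at least one PF prediction point would then be the candidate PFCs from which the screening criteria select regions for refinement.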

Pareto front (PF)-Voronoi NSGA-II algorithm
In this paper, we propose a PF-based optimization algorithm, PFVNSGA-II. PFVNSGA-II constructs a Kriging model for every expensive objective function [27] and uses the NSGA-II optimization algorithm to solve the multi-objective optimization (MOO) problem. The original MOO problem is approximated as

$$\min \; \tilde{F}(x) = \left( \tilde{f}_1(x), \tilde{f}_2(x), \ldots, \tilde{f}_m(x) \right), \quad \text{s.t.} \;\; g_j(x) \le 0, \; j = 1, \ldots, t_g, \;\; x \in \mathbb{R}^n$$

where $\tilde{f}_i(x)$ are surrogate models of the objective functions, $m$ is the number of objectives, $n$ is the number of design-variable dimensions, and $t_g$ is the number of constraints. As shown in the flow chart, the complete optimization procedure starts with a small training sample. The initial training samples are generated by optimal Latin hypercube (OLH) sampling [28].

Kriging model
Kriging is an efficient interpolation method that also gives a probabilistic response; its prediction variance depends on the number of available training samples $N$ [29]. For single-objective problems, the Kriging model can be expressed as

$$\hat{y}(x) = \hat{\mu} + r^{T}(x)\, R^{-1} (y - \mathbf{1}\hat{\mu})$$

where $r(x)$ is the correlation vector between $x$ and the sample points, $R$ is the correlation matrix of the sample points, $y$ is the vector of observed responses, and $\hat{\mu}$ is the estimated global mean. A Gaussian correlation function is commonly used,

$$R(x_i, x_j) = \exp\left( -\sum_{k=1}^{n} \theta_k \left( x_{i,k} - x_{j,k} \right)^2 \right)$$

where $\theta = (\theta_1, \ldots, \theta_n)$ are the hyperparameters and $n$ is the dimension of the design variable.
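As an illustration, the ordinary-Kriging predictor can be sketched in a few lines. This is a minimal sketch, not the paper's implementation: the hyperparameters $\theta$ are held fixed rather than tuned by maximum likelihood, and a small nugget is added to the correlation matrix for numerical stability.

```python
import numpy as np

def corr(X1, X2, theta):
    """Gaussian correlation: exp(-sum_k theta_k * (x1_k - x2_k)^2)."""
    d2 = (X1[:, None, :] - X2[None, :, :]) ** 2
    return np.exp(-(d2 * theta).sum(axis=2))

def kriging_fit(X, y, theta):
    """Fit ordinary Kriging with fixed hyperparameters theta (no MLE tuning)."""
    N = X.shape[0]
    R = corr(X, X, theta) + 1e-10 * np.eye(N)   # small nugget for stability
    Ri = np.linalg.inv(R)
    one = np.ones(N)
    mu = (one @ Ri @ y) / (one @ Ri @ one)      # generalized least-squares mean
    sigma2 = (y - mu) @ Ri @ (y - mu) / N       # process variance estimate
    return {"X": X, "theta": theta, "Ri": Ri, "mu": mu,
            "sigma2": sigma2, "w": Ri @ (y - mu)}

def kriging_predict(model, x):
    """BLUP mean and MSE at a single point x."""
    r = corr(x[None, :], model["X"], model["theta"])[0]  # correlation vector r(x)
    yhat = model["mu"] + r @ model["w"]
    one = np.ones(len(r))
    mse = model["sigma2"] * (1.0 - r @ model["Ri"] @ r
          + (1.0 - one @ model["Ri"] @ r) ** 2 / (one @ model["Ri"] @ one))
    return yhat, max(mse, 0.0)

# 1-D demo: at a training point, Kriging interpolates and the MSE is ~0
X = np.array([[0.0], [0.5], [1.0]])
y = np.sin(np.pi * X[:, 0])
model = kriging_fit(X, y, theta=np.array([10.0]))
yhat, mse = kriging_predict(model, np.array([0.5]))
```

The interpolation property shown in the demo (zero predicted MSE at observed points, growing MSE away from them) is what the MMMSEC criterion later exploits.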

GEI sampling algorithm
It is necessary to concatenate an adaptive global sampling algorithm before implementing a PF-based sampling algorithm; the specific reasons are described in detail in Section 3. Since the Kriging model is a surrogate model based on a stochastic process, it can predict both the function value and its uncertainty at unobserved points. Based on this property, Jones et al [30] derived the expected improvement (EI) sampling algorithm to improve the global prediction accuracy of the surrogate model:

$$EI(x) = \left( f_{\min} - \tilde{f}(x) \right) \Phi(z) + s(x)\, \phi(z), \qquad z = \frac{f_{\min} - \tilde{f}(x)}{s(x)}$$

where $\phi(\cdot)$ and $\Phi(\cdot)$ are the standard normal density and distribution functions, $\tilde{f}(x)$ is the function value predicted by the Kriging model, $f_{\min}$ is the current best observed objective value, and $s(x)$ is the standard deviation predicted at point $x$ [27].
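The EI formula above can be evaluated directly from the Kriging mean and standard deviation; a minimal sketch for the single-objective minimization case:

```python
import math

def expected_improvement(y_hat, s, f_min):
    """EI for minimization: (f_min - y_hat)*Phi(z) + s*phi(z), z = (f_min - y_hat)/s."""
    if s <= 0.0:
        return 0.0                                           # no uncertainty, no improvement
    z = (f_min - y_hat) / s
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (f_min - y_hat) * cdf + s * pdf

# at y_hat == f_min the first term vanishes and EI = s * phi(0) ~ 0.3989 * s
print(round(expected_improvement(1.0, 0.5, 1.0), 4))  # 0.1995
```

Note how EI rewards both promising mean values (exploitation) and large predicted uncertainty (exploration), which is what gives the tandem GEI branch its global search capability.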
To apply the EI algorithm to multi-objective optimization problems, this paper uses the generalized expected improvement (GEI) proposed by Jie et al [31]. The GEI sampling branch (branch I) is connected in series with the main program (Fig. 1).

Simulation experiments
In this section, the performance of PFVNSGA-II was verified using the high-dimensional nonlinear mathematical cases ZDT1-ZDT4 and ZDT6. The population size of NSGA-II was 100, the crossover probability was 0.8, and the mutation probability was 0.2. All the code mentioned above was programmed in MATLAB 2016a. The mathematical cases are shown in Table 1, and the other parameter settings are listed in Table 2 [25]. Fig. 5 shows the effect of the tandem GEI on predicting the ZDT6 PF.
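For reference, the ZDT test problems have simple closed forms. A sketch of ZDT1 is given below (the other ZDT cases follow the same pattern with different g and h functions); this is the standard definition, not code from the paper:

```python
import numpy as np

def zdt1(x):
    """ZDT1 test problem (n-dimensional, each x_i in [0, 1]).
    True Pareto front: x_2 = ... = x_n = 0, giving f2 = 1 - sqrt(f1)."""
    x = np.asarray(x, dtype=float)
    f1 = x[0]
    g = 1.0 + 9.0 * x[1:].sum() / (len(x) - 1)
    f2 = g * (1.0 - np.sqrt(f1 / g))
    return float(f1), float(f2)

# a point on the true front: f2 = 1 - sqrt(0.25) = 0.5
print(zdt1([0.25] + [0.0] * 29))  # (0.25, 0.5)
```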
Each algorithm was run 20 times independently. The resulting data were shown in Table 3.
From Fig. 5(a), it can be seen that the IGD of the optimization procedure without the tandem GEI converged slowly, while the tandem GEI helped the optimization procedure converge quickly and reach the threshold value. In Fig. 5(b), the PF predicted by the optimization procedure without GEI was far from the true PF: the global exploration capability of PFVNSGA-II without GEI is weak, so it is easily trapped in a local PF. From Table 3, it can be found that the PFVNSGA-II optimization procedure with GEI successfully predicted the PF in all 20 independent runs, while the prediction success rate of the procedure without GEI was only 25%. Therefore, it is necessary to cascade an adaptive global sampling algorithm to improve the global exploration capability of the surrogate model.
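The IGD metric used to judge convergence above is the average distance from each reference (true-PF) point to its nearest point in the obtained set; a minimal sketch:

```python
import numpy as np

def igd(reference_front, obtained_front):
    """Inverted generational distance: mean Euclidean distance from each
    reference (true) PF point to its nearest point in the obtained set.
    Lower is better; 0 means every reference point is covered exactly."""
    ref = np.asarray(reference_front, dtype=float)
    obt = np.asarray(obtained_front, dtype=float)
    d = np.linalg.norm(ref[:, None, :] - obt[None, :, :], axis=2)
    return d.min(axis=1).mean()

ref = [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]
print(igd(ref, ref))                     # 0.0 -- perfect coverage
print(round(igd(ref, [[0.0, 1.0]]), 4))  # 0.7071 -- poor diversity is penalized
```

Because uncovered stretches of the true front increase IGD, the metric penalizes poor distribution diversity as well as poor convergence, which is why it is paired with the MS metric later.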

PFC screening criterion validation
The ZDT3 mathematical case was selected to verify the effect of the MCC and MLEC criteria on the computational cost of the PFVNSGA-II optimization procedure; ZDT3 was chosen because of the significant effect of MCC on its prediction accuracy. Each algorithm was run 20 times independently, and the resulting data are shown in Table 4. Fig. 6 shows the predictive power of the three optimization procedures for the PF. The non-dominated solution set of PFVNSGA-II without MLEC was small and far from the real PF. Obviously, the MLEC criterion helped the sampling algorithm approach the real PF quickly and sped up the convergence rate.
As can be seen from Table 4, all 20 independent runs of PFVNSGA-II converged successfully within the given MNOS, with an average NOS of 145.8, while PFVNSGA-II without MCC had only an 80% success rate and PFVNSGA-II without MLEC only a 20% success rate. Clearly, the MLEC criterion was crucial in ensuring the successful convergence of the algorithm.
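The MLEC criterion relies on leave-one-out (LOO) errors at the training samples. The paper's exact formulation is not reproduced here; the sketch below shows the generic LOO computation, with a deliberately simple 1-nearest-neighbour surrogate standing in for the Kriging refits (function names are illustrative):

```python
import numpy as np

def loo_errors(X, y, fit_predict):
    """Leave-one-out error at each training sample: refit the surrogate
    without sample i and measure |prediction - observation| at x_i."""
    errs = []
    for i in range(len(X)):
        mask = np.arange(len(X)) != i
        errs.append(abs(fit_predict(X[mask], y[mask], X[i]) - y[i]))
    return np.array(errs)

def nn_predict(X_train, y_train, x):
    """Stand-in surrogate: 1-nearest-neighbour prediction."""
    return y_train[np.argmin(np.linalg.norm(X_train - x, axis=1))]

X = np.array([[0.0], [0.25], [0.5], [0.75], [1.0]])
y = X[:, 0] ** 2
e = loo_errors(X, y, nn_predict)
# an MLEC-style criterion would flag the cell around the sample with the
# largest LOO error as a region where the surrogate is least trustworthy
print(int(np.argmax(e)))  # 4
```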
In summary, the MCC and MLEC criteria showed excellent performance in improving distribution diversity and convergence stability and in reducing time cost.

Comparison with existing algorithms
To examine the performance of the PF-Voronoi sampling algorithm, the CV-Voronoi, GEI, and CE sampling algorithms were selected for comparison.
To ensure that all algorithms were compared under the same conditions, the four sampling algorithms were each coupled with the Kriging model and the NSGA-II algorithm. Although PF-Voronoi (worst) had the worst IGD convergence curve, its predicted optimal frontier was close to the true Pareto frontier overall, with R²_PF = (1, 0.991); the predicted values were close to the actual values. The optimal frontier predicted by EI fitted the true Pareto frontier curve more closely than PF-Voronoi (worst), but with R²_PF = (1, -2.75) its prediction error was larger and the predicted values were not credible. The optimal frontier predicted by PF-Voronoi (medium) fitted only part of the true Pareto frontier curve, with poorer distribution diversity, which was the reason for its higher IGD. In summary, PF-Voronoi was more likely to approximate the true Pareto front than the other three sampling algorithms, and the PFVNSGA-II optimization procedure maintained high prediction accuracy and distribution diversity at a lower computational cost.
Finally, it should be noted that the PFVNSGA-II optimization algorithm had difficulty with the ZDT4 mathematical case. In fact, for ZDT4 there were only 3 independent runs in which the PFVNSGA-II optimization algorithm successfully converged, and 12 independent runs in which the predicted Pareto frontier lay on the true Pareto frontier but performed poorly in terms of distribution diversity (IGD < 0.1 and MS < 0.9), i.e., the IGD converged slowly. This does not mean that the PF-Voronoi sampling algorithm is seriously flawed; rather, it is the price that all PF-based sampling algorithms have to pay: weakening global exploration capacity to reduce computational cost and improve local prediction accuracy. Therefore, the PF-Voronoi sampling algorithm needs to be coupled with a sampling algorithm with better global exploration capability to extend its adaptability, which will be the next research direction.

Conclusion
In this work, a PF-based sampling algorithm, the PF-Voronoi sampling algorithm, was proposed. It was applied to the coupled Kriging and NSGA-II models, and its effectiveness was verified on the ZDT mathematical cases.
The experimental results showed that the MCC criterion was effective in improving the distribution diversity of the optimization procedure, while the MLEC and MMMSEC criteria reduced the number of training samples by 38.9% and 21.7%, respectively.

Funding
Not applicable

Conflicts of interest
The authors declare that they have no conflict of interest.

Informed consent
Informed consent was obtained from all individual participants included in the study.

Availability of data and material
The datasets used or analysed during the current study are available from the corresponding author on reasonable request.