An Ensemble of Adaptive Surrogate Models Based on Local Error Expectations

An ensemble of surrogate models with high robustness and accuracy can effectively avoid the difficult choice of a single surrogate model. However, most existing ensembles of surrogate models are constructed with static sampling methods. In this paper, we propose an ensemble of adaptive surrogate models that applies an adaptive sampling strategy based on expected local errors. In the proposed method, the local error expectations of the surrogate models are calculated. Then, according to these local error expectations, new sample points are added within the dominating radius of the samples. The ensemble of adaptive surrogate models is constructed from the RBF and Kriging models combined with the adaptive sampling strategy. Benchmark test functions and an application problem that deals with the driving arm base of a palletizing robot show that the proposed method can effectively improve the global and local prediction accuracy of the surrogate model.


Introduction
In engineering design problems, computer simulation is usually applied to replace real physical experiments. For complex engineering problems, the performance function is sometimes implicit, or cost and time are limited, so a surrogate model is often applied to approximate the real physical model. Commonly used surrogate models mainly include Kriging [1], artificial neural network [2], radial basis function (RBF) [3], support vector regression (SVR) [4], and polynomial response surface (PRS) [5].
When a surrogate model is applied, finding a suitable one is a difficult task. In order to improve the adaptability of the surrogate model, a reasonable choice is to use a linear weighted combination of different surrogate models, that is, an ensemble of surrogate models. Compared with a single surrogate model, an ensemble of surrogate models saves much of the time otherwise spent screening candidate surrogate models. Many scholars have conducted in-depth research on it and have obtained many good results. Huang [6] found that the ensemble of surrogate models has higher prediction accuracy than the single surrogate model. Yan [7] proposed a new weight function construction method, which has the same accuracy as the optimal submodel and can improve the approximation of the true response distribution. Lu [8] found that the multisurrogate model gives better optimization results than the single surrogate model. Pan [9] applied the ensemble of surrogate models to the lightweight design of the car body and achieved a better optimization effect. Liu [10] established the ensemble of surrogate models to solve the structure optimization of car parts. Xing [11] assigned weights to three single surrogate models by using the adaptive Metropolis-Markov chain Monte Carlo method. Yin [12] compared the application of a single surrogate model and an ensemble of surrogate models in groundwater restoration design optimization problems, and the results showed that the ensemble of surrogate models is more robust. Li [13] proposed a surrogate-assisted particle swarm algorithm, which can effectively balance the global search and local search. Donncha [14] successfully used the ensemble of surrogate models to improve the forecasting system with significant effects. Ouyang [15] used the analysis of variance method to determine the weights of the ensemble of surrogate models. The comparison results show that the proposed method can not only improve the prediction performance of the surrogate model, but also obtain a reliable solution. Chen [16] presented a new ensemble model which combines the advantages of global and local measures. The results show that the proposed ensemble model has satisfactory robustness and accuracy. Zhang [17] proposed a unified ensemble of surrogates with global and local measures for global metamodeling. It is concluded that the proposed model has superior accuracy while keeping comparable robustness and efficiency.
Although some progress has been made in the research of the ensemble of surrogate models, most of the current methods for constructing the ensemble of surrogate models rely on static sampling. The problem with static sampling is that, in order to obtain an ensemble of surrogate models that meets the accuracy requirements, the sample size must be large enough. Adaptive sampling can obtain new samples that benefit the quality of the surrogate model, which can minimize the total sample size. However, current adaptive sampling is often applied to a single surrogate model [18][19][20][21]. Only a few scholars combine the adaptive sampling strategy with the ensemble of surrogate models [22,23].
The remainder of this paper is organized as follows. Section 2 briefly reviews the main steps to establish the ensemble of surrogate models. In Section 3, the ensemble of surrogate models using the adaptive sampling strategy based on local error expectations is described. The proposed method is verified by numerical examples and compared with three classical ensembles of surrogate models in Section 4. Section 5 applies the proposed method to the engineering design problem of the driving arm base of a palletizing robot. Finally, the conclusions are given.

Establishment of the Ensemble of Surrogate Models
There are three main steps to establish the ensemble of surrogate models:
(1) Design of experiment: the experiment design methods are applied to determine the spatial distribution of sample points. Experiment design methods mainly include Central Composite Designs (CCDs) [24], Orthogonal Design [25], and Latin Hypercube Design (LHD) [26]. LHD is the most popular sampling method due to its good spatial uniformity. The experiment design method used in this paper is also LHD.
(2) Establishment of the ensemble of surrogate models: the surrogate models can be divided into two categories. One is interpolation methods, such as RBF and Kriging. For these methods, the prediction errors at the sample points are zero, which gives good unbiasedness. The other is the noninterpolation methods, such as PRS and SVR. The noninterpolation methods have certain fitting capabilities, but the surrogate models do not pass through all sample points. Therefore, enough sample points are needed to ensure high accuracy, and the accuracy of these surrogate models carries extremely high uncertainty. In view of the advantages and disadvantages of different surrogate models, the most commonly used surrogate models are the RBF model and the Kriging model. In this paper, these two surrogate models are combined to establish the ensemble of surrogate models. The expression of the ensemble of surrogate models is as follows [27]:
$$\hat{y}_e = \sum_{i=1}^{N} \omega_i \hat{y}_i$$
where $\hat{y}_e$ is the predicted response value of the ensemble of surrogate models and $N$ is the number of surrogate models. $\omega_i$ is the ith weight coefficient. $\hat{y}_i$ is the predicted response value of the ith surrogate model. Generally speaking, the higher the prediction accuracy, the larger the weight coefficient of the corresponding surrogate model.
(3) Accuracy verification: accuracy verification of the surrogate model mainly includes two aspects: global accuracy and local accuracy. Root mean square error (RMSE) [28] and coefficient of determination ($R^2$) [29] are the two main global accuracy evaluation methods. The corresponding expressions are as follows:
$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
where $y_i$ is the actual response value of the ith test sample and $\hat{y}_i$ is the predicted response value of the surrogate model at the ith test sample. $\bar{y}$ is the mean value of the actual response values, and $n$ is the number of test sample points. For RMSE, the smaller the value, the higher the global prediction accuracy. The value of $R^2$ is not greater than 1 and can be negative if the fitting quality of the surrogate model is extremely low. The closer the value of $R^2$ is to 1, the higher the accuracy of the global approximation of the surrogate model. Although RMSE can evaluate the prediction accuracy of the surrogate model, its value depends strongly on the magnitude of the response in the specific problem, so it is not as intuitive and easy to understand as $R^2$. The global accuracy evaluation method applied in this paper is the coefficient of determination $R^2$. The local prediction accuracy evaluation method is the maximum absolute error (MAE). The expression of MAE is as follows:
$$\text{MAE} = \max_{1 \le i \le n}\left|y_i - \hat{y}_i\right|$$
Similar to RMSE, the smaller the MAE, the higher the local prediction accuracy of the surrogate model. In this paper, MAE is also used to evaluate the local prediction accuracy of the surrogate model.
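The three accuracy measures described in step (3) can be illustrated with a minimal sketch. The function names and the sample values are assumptions introduced here for illustration; they do not come from the paper.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error over the test samples."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination; can be negative for a very poor fit."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def max_abs_error(y_true, y_pred):
    """Maximum absolute error (called MAE in this paper)."""
    return max(abs(t - p) for t, p in zip(y_true, y_pred))

# Illustrative values only: one badly predicted test point out of four.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
print(rmse(y_true, y_pred), r_squared(y_true, y_pred), max_abs_error(y_true, y_pred))
```

In this toy case a single large local error pulls $R^2$ far below 1 and fully determines the MAE, which is why the paper reports a global and a local measure side by side.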

The Ensemble of Adaptive Surrogate Models Based on Local Error Expectations
The existing adaptive sampling strategies are mainly designed for a specific surrogate model, so they have poor versatility. In addition, because the existing adaptive sampling strategies are inconsistent with one another, it is very complicated to combine the ensemble of surrogate models with an adaptive sampling strategy. In this section, a universal adaptive sampling strategy based on local errors is proposed. By combining this new adaptive sampling strategy, the method to construct the ensemble of surrogate models is proposed.

Adaptive Sampling Based on Local Error Expectations.
Since Kriging and RBF models usually provide good accuracy for fitting highly nonlinear behaviors, these two surrogate models are used in general engineering problems. At present, the most commonly used adaptive sampling method is the maximin distance approach proposed by Johnson [30]. Jin and Chen [31] made corresponding improvements and proposed the Maximin Scaled Distance Approach. In this paper, we propose a universal adaptive sampling strategy based on the local error expectations, named the LEE strategy, which applies to different surrogate models and serves the construction of the ensemble of adaptive surrogate models. The process is shown in Figure 1. The following are the main steps of the LEE strategy:
(1) Build an initial surrogate model. First, LHD is used to obtain the initial sample points and their response values. Since high accuracy is not required at the beginning of sampling, for surrogate models of different dimensions, the initial number of sample points can be $5n_d$, $10n_d$, and $20n_d$ ($n_d$ is the number of design variables).
coordinate of the sample point; the expression is as follows: where $n_d$ is the size of the dimension and $x_{j\max}$ and $x_{j\min}$ are the upper and lower bounds of the jth dimension.
prediction uncertainty near the ith sample point is greater than the average prediction uncertainty of the existing sample points. This means that the degree of nonlinearity near the ith sample point is relatively large, so a sample point is added at random, with equal probability, within the dominating radius of the ith sample point. To avoid the added sample point being too close to the existing sample points, a candidate that meets the closeness condition of Formula (6) is not added to the sample database, where $X^*$ stands for the point to be added and $X_{\text{closest}}$ represents the existing sample point closest to $X^*$. Formula (6) means that if $X^*$ and $X_{\text{closest}}$ are too close, they degrade the conditioning of the correlation matrix of the surrogate model, so the candidate point is treated as invalid.
(5) Update the surrogate model and check convergence. The newly acquired sample points are added to the sample database, and their response values are calculated. Then the surrogate model is updated according to the current database of sample points, and the coefficient of determination $R^2$ is calculated. If the value of $R^2$ is greater than the preset value η, the adaptive sampling process ends and the final surrogate model is obtained; otherwise, return to step 2.
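The point-adding step described above can be sketched as follows. This is only an illustration under stated assumptions: the paper's expressions for the dominating radius and for the closeness threshold are not reproduced here, so `radius` and `min_dist` are hypothetical inputs, and the function name `propose_points` is likewise introduced for this sketch.

```python
import math
import random

def propose_points(samples, abs_errors, radius, min_dist, bounds, rng=random):
    """One refinement pass: return new candidate points near high-error samples.

    samples    : list of points (each a list of floats)
    abs_errors : absolute error (AE) assigned to each sample
    radius     : ASSUMED dominating radius, shared by all samples
    min_dist   : ASSUMED rejection threshold for the closeness check
    bounds     : list of (lower, upper) pairs, one per dimension
    """
    mean_ae = sum(abs_errors) / len(abs_errors)  # E[AE]
    dim = len(bounds)
    new_points = []
    for point, ae in zip(samples, abs_errors):
        if ae <= mean_ae:
            continue  # local error not above the expectation: no refinement
        # Uniform draw inside a ball of the given radius around the sample.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(d * d for d in direction)) or 1.0
        step = radius * rng.random() ** (1.0 / dim)
        # Clip the candidate back into the design space.
        candidate = [
            min(max(x + step * d / norm, lo), hi)
            for x, d, (lo, hi) in zip(point, direction, bounds)
        ]
        # Reject candidates that crowd an existing or newly added point.
        closest = min(math.dist(candidate, s) for s in samples + new_points)
        if closest >= min_dist:
            new_points.append(candidate)
    return new_points

# Illustrative 1-D call: only the middle sample exceeds the mean error.
demo = propose_points([[0.1], [0.5], [0.9]], [0.0, 0.3, 0.0],
                      radius=0.2, min_dist=0.01, bounds=[(0.0, 1.0)],
                      rng=random.Random(0))
print(demo)
```

In a full loop, the responses of the returned points would be evaluated, the surrogate refitted, and $R^2$ compared with the preset value η before the next pass.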
In order to illustrate the feasibility of the LEE strategy, the one-dimensional test function in [32] is selected. Figures 2-4 show the initial Kriging model, the absolute errors, and the updated Kriging model. Figure 2 shows that the overall prediction accuracy of the initial Kriging surrogate model is low, and the local errors near point 5 and point 6 are very large. It can be seen from Figure 3 that the errors of sample points 5 and 6 of the initial Kriging model exceed E[AE], so random sample points are added within the dominating radius of points 5 and 6. It can be seen from Figure 4 that the updated Kriging surrogate model has higher prediction accuracy. After adding the sample points, the prediction error in this area is significantly reduced, which demonstrates the effectiveness and feasibility of adaptive sampling based on the LEE strategy.
In order to demonstrate the versatility of the LEE strategy for different surrogate models, the RBF surrogate model is also constructed based on the existing sample points and their response values. Figures 5-7 show the initial RBF model, the absolute errors, and the updated RBF model. It can be seen from Figure 5 that the overall prediction accuracy of the initial RBF surrogate model is low, and the local errors near points 1 and 6 are the largest. It can be seen from Figure 6 that the local errors of sample points 1 and 6 of the initial RBF model exceed E[AE], so random sample points are added within the dominating radius of sample points 1 and 6. It can be seen from Figure 7 that the overall prediction accuracy of the updated RBF surrogate model with two new sample points has been greatly improved, which further demonstrates the feasibility and versatility of adaptive sampling based on the LEE strategy. The proposed LEE strategy is also compared with another adaptive sampling strategy, the Maximin Scaled Distance Approach (MSDA) [31], on classic test functions. The specific information of the test functions is shown in Table 1.
The initial Kriging and RBF surrogate models are each established from a certain number of initial sample points. The proposed LEE strategy and MSDA are applied to improve the accuracy of the surrogate models. The convergence condition is $R^2 > 0.8$. Comparison results of the Kriging and RBF surrogate models are listed in Table 2.
It can be seen from Table 2 that when the numbers of initial sample points of the two methods are the same, the LEE strategy uses fewer total sample points than MSDA. At the same time, except for the CN function, the final values of $R^2$ of the LEE strategy are greater than those of MSDA, which means that surrogate models constructed by the LEE strategy can achieve higher prediction accuracy than those constructed by MSDA.

The Ensemble of Adaptive Surrogate Models.
In this section we construct the ensemble of surrogate models with the LEE strategy. The flowchart is shown in Figure 8. The main steps are as follows:
(1) Build Kriging and RBF surrogate models. Existing studies [8][9][10][11][12] show that, in most cases, interpolation-type surrogate models (Kriging and RBF) are more suitable for engineering problems. Therefore, this paper chooses Kriging and RBF models to form the ensemble of surrogate models. Construct the Kriging and RBF models by using the initial sample points. Then, obtain the prediction error sum of squares (PRESS) [33], MAE, and $R^2$ values of the Kriging and RBF models by applying leave-one-out (LOO) cross-validation (CV). The absolute errors (AEs) of each sample point of the Kriging and RBF models are calculated. Since Forrester [34] has shown that the surrogate model has better predictive ability when the coefficient of determination $R^2$ is greater than 0.8, we use $R^2 > 0.8$ as the convergence condition.
where $y_i$ is the true response value of the ith sample point and $\hat{y}_{-i}$ is the predicted response value of the ith sample point given by the single surrogate model built from all sample points except the ith one. The prediction sum of squares is the sum of the squared prediction errors over all $n$ sample points, as shown in the following formula:
$$\text{PRESS} = \sum_{i=1}^{n}\left(y_i - \hat{y}_{-i}\right)^2$$
The weight coefficient corresponding to each single surrogate model is calculated by inverse proportional averaging:
$$\omega_i = \frac{1/P_i}{\sum_{j=1}^{N} 1/P_j}$$
where $P_i$ is the PRESS value of the ith surrogate model. In this paper, $N$ is equal to 2. Then the final ensemble of adaptive surrogate models is obtained by linearly weighting each surrogate model.
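The leave-one-out PRESS computation and the inverse proportional weighting can be sketched as below. The two fitting functions are deliberately trivial one-dimensional stand-ins, not Kriging or RBF, and all names and data are assumptions made for this illustration. The sketch also assumes every PRESS value is strictly positive.

```python
def press(fit, xs, ys):
    """Leave-one-out PRESS: refit without sample i, then predict sample i."""
    total = 0.0
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - model(xs[i])) ** 2
    return total

def inverse_press_weights(press_values):
    """Weights proportional to 1/PRESS, normalized to sum to 1."""
    inv = [1.0 / p for p in press_values]  # assumes every PRESS > 0
    total = sum(inv)
    return [v / total for v in inv]

def ensemble_predict(models, weights, x):
    """Linear weighted combination of the single-model predictions."""
    return sum(w * m(x) for w, m in zip(weights, models))

# Toy 1-D stand-ins for the two surrogates (NOT Kriging or RBF).
def fit_nearest(xs, ys):
    return lambda x: ys[min(range(len(xs)), key=lambda k: abs(xs[k] - x))]

def fit_mean(xs, ys):
    mean = sum(ys) / len(ys)
    return lambda x: mean

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [x * x for x in xs]          # illustrative response, not from the paper
fitters = [fit_nearest, fit_mean]
press_values = [press(f, xs, ys) for f in fitters]
weights = inverse_press_weights(press_values)
models = [f(xs, ys) for f in fitters]
print(press_values, weights, ensemble_predict(models, weights, 2.5))
```

The model with the smaller PRESS receives the larger weight, and with two models the weights reduce to $\omega_1 = P_2/(P_1+P_2)$ and $\omega_2 = P_1/(P_1+P_2)$.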

Numerical Example Analysis
In order to verify the versatility and effectiveness of the ensemble of adaptive surrogate models based on local error [36]. Among the three most widely used methods for constructing an ensemble of surrogate models, the most classic one uses PRESS as the measure for calculating the weight coefficients. The larger the PRESS value of a surrogate model, the smaller its weight coefficient. This is also known as the inverse proportional averaging method, and its weight coefficient calculation formula is
$$\omega_i = \frac{1/P_i}{\sum_{j=1}^{N} 1/P_j}$$
where $P_i$ is the PRESS value of the ith surrogate model and $N$ is the number of surrogate models. The BestPRESS method selects the single surrogate model with the smallest PRESS value as the final surrogate model, so it is essentially a single surrogate model. Another method is the heuristic weight coefficient algorithm proposed by Goel [36], and its calculation formula is
$$\omega_i = \frac{\omega_i^*}{\sum_{j=1}^{n} \omega_j^*}$$
where $\omega_i^* = (E_i + \alpha E_{avg})^{\beta}$ and $E_{avg} = \left(\sum_{j=1}^{n} E_j\right)/n$. $E_i$ is the PRESS of the ith surrogate model, and $n$ here denotes the number of surrogate models. The recommended parameter values are $\alpha = 0.05$, $\beta = -1$.
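For comparison, the heuristic weighting and the BestPRESS selection described above can be written as a short sketch. The function names and the two PRESS values are illustrative assumptions; only the formula and the default values of α and β follow the text.

```python
def pws_weights(errors, alpha=0.05, beta=-1.0):
    """Heuristic weights: w_i* = (E_i + alpha * E_avg) ** beta, then normalize."""
    e_avg = sum(errors) / len(errors)
    raw = [(e + alpha * e_avg) ** beta for e in errors]
    total = sum(raw)
    return [r / total for r in raw]

def best_press_weights(errors):
    """BestPRESS: all weight on the single model with the smallest PRESS."""
    best = min(range(len(errors)), key=lambda i: errors[i])
    return [1.0 if i == best else 0.0 for i in range(len(errors))]

errors = [2.0, 6.0]  # illustrative PRESS values, not taken from the paper
print(pws_weights(errors), best_press_weights(errors))
```

Setting `alpha=0.0` with `beta=-1.0` recovers plain inverse proportional weights, which makes the relationship between the two weighting schemes easy to check numerically.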

Benchmark Functions.
In this paper, six benchmark functions from low dimension to high dimension are selected. The information of the benchmark functions is shown in Table 3.
The Branin, Hartmann-3, and Hartmann-4 functions are low-dimensional. Latin hypercube sampling with 5n sample points is enough to meet the accuracy requirements. Since the Hartmann-6, Styblinski-Tang8, and Styblinski-Tang10 functions are high-dimensional, Latin hypercube sampling with 20n sample points is used.
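A basic Latin hypercube design of the kind used for the initial samples can be sketched in a few lines. This is a plain stratified version written for illustration, not necessarily the exact LHD variant used by the authors; the function name and the two-dimensional bounds are assumptions.

```python
import random

def latin_hypercube(n_samples, bounds, rng=random):
    """Return n_samples points; each dimension has one point per stratum."""
    columns = []
    for lo, hi in bounds:
        # One uniform draw inside each of n_samples equal-width strata.
        strata = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(strata)  # decouple the strata across dimensions
        columns.append([lo + s * (hi - lo) for s in strata])
    return [list(p) for p in zip(*columns)]

bounds = [(-5.0, 10.0), (0.0, 15.0)]  # illustrative 2-D design space
n_d = len(bounds)
points = latin_hypercube(5 * n_d, bounds, rng=random.Random(1))
print(len(points), points[0])
```

Here the sample count is five times the number of design variables, matching the low-dimensional setting described above; a high-dimensional case would pass `20 * n_d` instead.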

The Analysis of Global Prediction Accuracy.
The global prediction accuracies of different ensembles of surrogate models are compared. The total number of samples is recorded when the EOASM method reaches the convergence condition. For the other three ensembles of surrogate models constructed by the PRESS method, BestPRESS method, and PWS method, the Latin hypercube sampling method is used to generate the same total sample size, so the number of sample points in the four methods is the same. After 20 comparative experiments, the average values of the coefficient of determination $R^2$ of each ensemble of surrogate models are shown in Table 4.
It can be seen from Table 4 that when the total number of sample points is the same, the prediction accuracy of the ensemble of surrogate models constructed by the EOASM method is the highest. For example, for the Branin function, the average value of the coefficient of determination $R^2$ of EOASM is 0.9446. Among the other three ensembles of surrogate models, the PRESS method has the largest average value of $R^2$, which is much lower than that of the EOASM method. The results of the other test functions are similar to those of the Branin function.

The Analysis of Local Prediction Accuracy.
The maximum absolute error (MAE) is used to evaluate the local accuracy. The maximum absolute errors of the ensembles of surrogate models constructed by each method are compared when the number of sample points is the same. Table 5 shows the mean values of MAE of the different ensembles of surrogate models.
It can be seen from the six benchmark functions that the EOASM method has the smallest average value of MAE among the four ensembles of surrogate models, which means that the proposed method has the highest local prediction accuracy among the four methods.

Robustness Analysis.
Robustness is an important indicator for evaluating surrogate models. Robustness refers to the insensitivity of the prediction accuracy of the surrogate model to the random sampling of sample points. In order to compare the robustness of each surrogate model intuitively, 20 sampling experiments are performed for each benchmark function. The distributions of the coefficient of determination $R^2$ are presented as box plots [37], which are shown in Figure 9.
In Figure 9, the box length indicates how strongly the surrogate model's coefficient of determination $R^2$ fluctuates. The smaller the box length, the stronger the robustness of the surrogate model. It can be clearly seen that the box of the ensemble of surrogate models constructed by the EOASM method is the shortest for each benchmark function, which indicates that the EOASM method has the strongest robustness.
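The box length used as the robustness indicator corresponds to the interquartile range of the $R^2$ values over repeated experiments. A minimal sketch follows; the function name and both sets of $R^2$ values are invented for illustration and are not results from the paper, and the quartile convention of a given plotting tool may differ slightly from the one chosen here.

```python
import statistics

def box_length(values):
    """Interquartile range (Q3 - Q1), i.e. the length of the box in a box plot."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Hypothetical R^2 values from repeated sampling experiments.
tight = [0.90, 0.92, 0.94, 0.96, 0.98]
wide = [0.50, 0.70, 0.80, 0.90, 0.99]
print(box_length(tight), box_length(wide))
```

A method whose $R^2$ values give the shorter box is the less sensitive one to the random placement of sample points.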

Engineering Application
In the design of the palletizing robot, the design of the driving arm base plays a key role. The overall assembly of the palletizing robot is shown in Figure 10. The driving arm base bears a large load. When it is assembled with the boom, it deforms to a certain extent, which causes strain and stress. However, these physical quantities are difficult to express using explicit functions. It is often necessary to obtain their data through a large number of simulation tests. The specific material properties are shown in Table 6. The structure of the driving arm base is shown in Figure 11. Considering the assembly relationship of each part, four nonassembly dimensions are selected as design variables, which are shown in Table 7. When the force and torque of the driving arm base reach the maximum, the generated stress is the largest, and fatigue damage is most likely to occur. A simulation is carried out in UG software to obtain the maximum force and torque at the assembly hole of the driving arm base.
8 Mathematical Problems in Engineering
The curves of the force and torque over time are shown in Figure 12. It can be seen that, at 3 seconds, the driving arm base bears the maximum force and the maximum torque.
Since the maximum stress is difficult to calculate directly, it is selected as the objective function, and its true response value is obtained by simulation with Ansys finite element software, as shown in Figure 13.
prediction accuracy is significantly improved. In summary, the EOASM method has good applicability to engineering problems and can greatly reduce the calculation cost of physical experiments.

Conclusion
(1) The adaptive sampling based on the LEE strategy can greatly improve the prediction accuracy of the surrogate model with as few sample points as possible, and it also has strong applicability to different types of surrogate models.
(2) The EOASM method based on the LEE strategy can greatly improve the global prediction accuracy, the local prediction accuracy, and the robustness of the ensemble of surrogate models.
(3) Although the prediction accuracy and robustness of the ensemble of surrogate models constructed by the EOASM method have been improved to some extent, the method still has not escaped the curse of dimensionality. Even when the sample size is already large, the accuracy of the surrogate model may still be extremely low. Therefore, the high-dimensional problem of the surrogate model remains to be solved.

Data Availability
The data used to support the findings of this paper are included within the article (Table 2).

Conflicts of Interest
The authors declare that they have no conflicts of interest.