Recent developments in metamodel based robust black-box simulation optimization: An overview

Abstract: In real-world engineering problems, simulation experiments in the form of computer codes are run to reduce the cost of optimizing physical processes. It is desirable to improve the validity of simulation-optimization results by attending to the sources of variability in the model's output(s). Uncertainty can increase complexity and computational cost in the Design and Analysis of Computer Experiments (DACE). In this state-of-the-art review, a systematic qualitative and quantitative survey of Metamodel-Based Robust Simulation Optimization (MBRSO) for black-box and expensive simulation models under uncertainty is presented. The focus is on the management of uncertainty, particularly from the Taguchi worldview on robust design, and on robust optimization methods in the class of dual response methodology, when simulation optimization is handled by surrogates. Finally, both trends and gaps in the research field are highlighted, and some directions for future research are suggested.


Introduction
Nowadays, developing processes in the engineering world is strongly associated with computer simulation. These computer codes can collect appropriate information about the characteristics of engineering problems before the process is actually run. Computer simulation allows a rapid investigation of various alternative designs, decreasing the time required to improve the system. In addition, most numerical analyses of engineering problems make well-suited use of mathematical programming. Owing to the complexity of many real-world optimization problems, which resists direct mathematical formulation, Simulation-Optimization (SO) has found more interest and popularity than other optimization methods (Dellino et al., 2014). The main goals of simulation are twofold: first, what-if study of the model, or sensitivity analysis; and second, optimization and validation of the model (van Beers & Kleijnen, 2003). The essential benefit of simulation is its ability to cover complex processes, either deterministic or random, while eliminating mathematical sophistication (Figueira & Almada-Lobo, 2014). In general, SO techniques are classified into model-based and metamodel-based (Mohammad Nezhad & Mahlooji, 2013; Viana et al., 2014). In the model-based class, running the simulation is not expensive, and the model output can be used directly in optimization. However, many large-scale and detailed simulation models of complex systems, particularly under uncertainty, are expensive to run in terms of time, computational cost, and resources. To address such a challenge, metamodels need to be derived and combined with robust design optimization.
Cite as: A. Parnianifard, A. Azfanizam, M. Ariffin, M. Ismail, and N. Ale Ebrahim, "Recent developments in metamodel based robust black-box simulation optimization: An overview," Decision Science Letters, vol. 8, no. 1, pp. 17-44, 2019.
The trend of publications on the topic of "simulation optimization" in both the Web of Science and SCOPUS databases confirms the interest in the search term; see Figure 1. In addition, an internet search using the popular tool "Google Scholar" returns over 40,300 pages, mainly containing scientific and technical articles, research reports, conference publications, and academic manuscripts.
In this paper, we review the latest developments in Metamodel-Based Simulation Optimization (MBSO) and, in a wider scope, Metamodel-Based Robust Simulation Optimization (MBRSO), in which the simulation is affected by uncertainty in the model's parameters. MBRSO is applied to complex simulation models under uncertainty when running the simulation is expensive in terms of computational time and/or cost, so that only a limited number of simulation runs is possible.
The rest of this review is organized as follows. Section 2 covers the quantitative analysis and illustrates the survey method, highlighting how articles were gathered and reviewed. Section 3 provides the qualitative analysis, including the relevant basic approaches and methodologies around MBRSO. Section 4 discusses remarkable research findings and provides the main recommendations extracted from reviewing the literature. The paper is concluded in Section 5 with a summary of the important research tips.

Quantitative analysis on metamodel based simulation optimization
SciVal offers quick, easy access to the research performance of 8,500 research institutions and 220 nations around the world (see "About SciVal", Elsevier, 2017). Visualization of Elsevier's SCOPUS data for the selected search terms, "Visibility" and "Citations", is provided by SciVal.
Being the largest abstract and reference database, SCOPUS provides a citation dataset of research literature and quality web sources (Aghaei Chadegani et al., 2013). Figure 2 shows the publication trend on "metamodel" and "simulation optimization" from 1996 to date (12 September 2017). The number of publications on the topic increased from one publication in 2006 to 65 publications in 2016. To forecast the trend of scholarly output in the following years, we fit a polynomial regression over the data, where the predictor is the year and the response is the number of annual documents. In the last six years 659 papers were published, receiving over 10,503 views, 2,465 citations, 147 international collaborations, and a 1.36 Field-Weighted Citation Impact (FWCI). The FWCI is a measure of citation impact that normalizes for differences in citation activity by subject field, article type, and publication year (Jang & Kim, 2014). The world average FWCI is indexed at 1.00; values above 1.00 indicate an above-average citation impact. Specifically, a citation impact of 1.36 indicates that citations are 36% above the average in the same field.
1. Elsevier. (2017). About SciVal. Retrieved from https://www.elsevier.com/solutions/scival

The top 50 key-phrases by relevance for the publications of the past five years (656 publications) are shown in Figure 3. Notably, the phrase "optimization" is the most repeated keyword. The importance of the key phrases "interpolation", "computer simulation", and "global optimization" is also obvious. In addition, the phrases "design of experiments", "multi-objective optimization", and "uncertainty analysis" are among the most repeated keywords in recent publications and have gained growing attention. The trends of publications and the top 50 key-phrases prove the importance of current research on the topic in scholarly publications. Therefore, there is interest in finding alternative ways to improve research on simulation optimization, such as combining design of experiments with evolutionary algorithms, like the expected improvement methodology, which has recently attracted interest in the academic research world; for instance, see (Havinga et al., 2017; Zhang, J. et al., 2017). According to the data obtained from SciVal for the search on metamodel and simulation optimization, the top ten countries, authors, and journals ranked by views count (views source: Scopus data up to 31 Jul 2017) are sketched in Table 1, Table 2, and Table 3, respectively.

Instruction of current research
In the current systematic literature review, the search strategy followed these steps. Common electronic databases (Scopus-indexed) were used in the search process, such as Science Direct, IEEE Xplore, Springer Link, etc. Different keywords and their combinations were used to search the relevant literature in the mentioned electronic databases. The SCOPUS database covers almost two times more journals than the Web of Science (Aghaei Chadegani et al., 2013). Therefore, SCOPUS was selected as the reference source of academic documents.

The particular metamodels on which we concentrate are polynomial regression (also called the Response Surface Method (RSM)) and the Kriging surrogate model. In general, the findings were filtered based on three factors: i) selecting articles associated with our topics of interest (polynomial regression, Kriging, robust design, and SO); ii) preferring recent articles (all articles were published after 2000, and around 60% of them between 2010 and September 2017); iii) attending to the number of citations. Notably, many papers and electronic resources relate to the topic, but we filtered only those resources that cover the basic knowledge needed on simulation-optimization via robust design integrating metamodels.
Table 5 shows the identifiers of the articles, sorted by publishing year. Note that citations were counted from Scopus up to April 2017. In this context, articles were reviewed by examining methodology and scope of applicability, focusing on the methods, techniques, and approaches employed to achieve their relevant goal(s).

Qualitative analysis on MBRSO
Black-box and computationally expensive simulation models are often found in engineering and science disciplines. Expensive simulation runs and expensive analyses of processes are often treated as black-box functions. In general, surrogate models treat the simulation model as a black box (Beers & Kleijnen, 2004; Kleijnen, 2005; Shan & Wang, 2010). In fact, many simulation-optimization approaches depend solely on such input-output data in investigating optimal input settings, while in the black-box setting the simulation only permits evaluation of the objective and constraints for a specific input (Amaran et al., 2016).

Simulation-optimization (SO)
The process of investigating the best values of input variables among all possibilities in a simulation model is Simulation-Optimization (SO), also known as optimization via simulation or simulation-based optimization. The objective of SO is to obtain the optimum value of the output while minimizing the resources spent (Kleijnen, 2015) (Figure 4). With the optimization strategy, feedback on the process is provided by the output of the simulation model (Carson & Maria, 1997). In the modeling part, the method can be used to identify process components and select them to design the simulation model (Banks et al., 2010; Neelamkavil, 1987). SO has attracted the attention of many researchers aiming to improve practical engineering problems via different methods; see, for instance, Azadivar (1999), the books (Dellino & Meloni, 2015; Kleijnen, 2015), and four review papers (Barton, 1992; Carson & Maria, 1997; Li et al., 2010; Simpson, Poplinski et al., 2001). In general, SO models can be divided into two types, stochastic and deterministic (Figure 5). In deterministic models, the response of the model lacks random error; in other words, repeated runs for the same design of input parameters give the same result. Examples of deterministic simulation are models of airplanes, automobiles, TV sets, and computer chips applied in Computer-Aided Engineering (CAE) and Computer-Aided Design (CAD) at Boeing, General Motors, Philips, etc.
(Kleijnen, 2009b). On the other hand, the output in stochastic or random simulation usually follows some probability distribution which may vary over the input space, so running the simulation for the same input combination gives different outputs. Examples are models of logistic and telecommunication systems (Kleijnen, 2009b). This noisy output also enhances the optimization challenge, since it becomes harder to distinguish the best set of input variables, and the validity of deterministic approaches is lost. In SO, the exact (deterministic) solution for a black-box system usually cannot be distinguished, so the mean and the variance are obtained from sampling points (Amaran et al., 2016). Polynomial regression can sufficiently support both deterministic and random simulation, but Kriging has hardly been used in stochastic simulation (van Beers & Kleijnen, 2003). In another classification, Amaran et al. (2016) categorized SO algorithms based on local or global optimal solutions (Figure 6). Barton & Meckesheimer (2006) classified SO approaches depending on the nature of the design variable types. Design variables in simulation models can be either continuous or discrete (see Figure 7). Continuous variables can take any real value within a given range imposed by constraints. In most engineering problems, during optimization with approximation methods (metamodels), the discrete pattern of the input variables is neglected and all variables are allowed to vary continuously, since continuous problems are easier to solve. Based on the optimum design obtained in the continuous setting, the values that are inherently discrete can then be adjusted to the nearest feasible discrete value (Jurecka, 2007).

Applications of simulation-optimization
Various types of problems in engineering design and management have been addressed by applying different SO methods (e.g., production, transportation and logistics, energy management, finance, engineering, and applied sciences). In a real case study, Kleijnen (1993) applied SO methods in production planning to report a practical decision support system in a Dutch company. In Jin et al. (2001), the application of different metamodels (e.g., polynomial regression, multivariate adaptive regression splines, radial basis functions, and Kriging) was studied over 14 test SO problems in engineering design with noisy or smooth behavior. In other work, Kleijnen & Gaury (2003) combined four different techniques, namely simulation, optimization, uncertainty analysis, and bootstrapping, in a real case study in production control. An appropriate review addressing applications of SO in sub-communities such as machine learning, discrete-event systems (queues, operations, and networks), manufacturing, medicine and biology, engineering, computer science, electronics, transportation, and logistics has been done by Amaran et al. (2016).

Figure 7. Simulation optimization strategies based on the nature of design variables.
Recently, work based on metamodels and the Monte Carlo simulation method was done by Li et al. (2016), applied to production planning of a manufacturing system, comparing the proposed model with other approaches (e.g., mathematical programming). The application of robust design hybrid metamodeling in management science and engineering problems has been reviewed by Parnianifard et al. (2018).
The application of SO in inventory management has drawn significant interest in different studies, such as (Dellino et al., 2010a, 2015) and the review papers (Jalali & Van Nieuwenhuyse, 2015; Kleijnen, 2017). In this context, among the reviewed literature, the applications of SO methods to different types of engineering design and management problems were considered; the results are represented in Table 6. Notably, we targeted only SO methods based on surrogates and robust design optimization in black-box and expensive simulation models under uncertainty. For such cases, computer experiments are conducted as the main underpinning of metamodel-based robust simulation optimization.

Uncertainty management in SO via robust design optimization
In practice, most engineering problems are affected by different sources of variation. One of the main challenges of SO is addressing uncertainty in the model, through a variety of approaches such as robust optimization, stochastic programming, stochastic dynamic programming, and fuzzy programming. Uncertainty is undeniable; it affects the accuracy of simulation results by creating variability in them. Under uncertainty, robust SO allows us to define an optimal set point for the input variables that keeps the output as close as possible to the ideal point, with the least variation. Robust design approaches try to make processes insensitive to uncertainty, as the source of variation, by investigating qualified levels of the design input factors. The sources of variation in the output can be divided into two main types: first, variation due to variability in environmental (uncontrollable, or noise) variables (Park & Antony, 2008; Phadke, 1989); and second, fluctuation of the input (design) variables within their tolerance ranges (Anderson et al., 2015; Myers et al., 2016).

Robust optimization in the class of dual response
The dual response surface approach has been successfully applied in robust process optimization. To improve the metamodel over the training set, the average magnitude of repeated runs can be used (Li et al., 2010). If a stochastic simulation model is followed, each input combination $\mathbf{x}_i$ ($i = 1, 2, \ldots, n$) is repeated $m$ times ($j = 1, 2, \ldots, m$), while in random simulation $m$ is the number of uncertainty scenarios. If $\mathbf{y}_i = (y_{i1}, y_{i2}, \ldots, y_{im})$ is the $m$-dimensional vector of the simulation's outputs, then the mean and variance for each input combination can be computed by:

$$\bar{y}_i = \frac{1}{m} \sum_{j=1}^{m} y_{ij} \qquad (1)$$

$$s_i^2 = \frac{1}{m-1} \sum_{j=1}^{m} \left( y_{ij} - \bar{y}_i \right)^2 \qquad (2)$$

Note that in deterministic simulation models the above equations have $m = 1$; for more information see (Dellino et al., 2015; Kleijnen, 2009b, 2015). Both Eqs. (1) and (2) assume that all uncertainty scenarios have the same probability; otherwise, based on the probability of each scenario in the model, the equations for mean and variance can be rewritten as:

$$\bar{y}_i = \sum_{j=1}^{m} p_j \, y_{ij} \qquad (3)$$

$$s_i^2 = \sum_{j=1}^{m} p_j \left( y_{ij} - \bar{y}_i \right)^2 \qquad (4)$$

where $p_j$ denotes the probability of the $j$th scenario of uncertainty. Furthermore, different optimization approaches are available within the dual response methodology, some of which are referenced in (Ardakani & Noorossana, 2008; Beyer & Sendhoff, 2007; Nha et al., 2013; …et al., 2016), so here, for instance, some of the most common robust optimization methods in the class of dual response surface are mentioned:
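As a minimal sketch, the per-combination mean and variance above (with and without scenario probabilities) can be computed in a few lines; the function name `robust_moments` is ours, not from the reviewed literature:

```python
import numpy as np

def robust_moments(outputs, probs=None):
    """Mean and variance of simulation outputs for one input combination.

    outputs: array of the m replications (or uncertainty scenarios)
             y_i1, ..., y_im for input combination i.
    probs:   optional scenario probabilities p_1, ..., p_m; if None,
             all scenarios are taken as equally likely.
    """
    y = np.asarray(outputs, dtype=float)
    if probs is None:
        mean = y.mean()
        var = y.var(ddof=1)          # sample variance, divisor m - 1
    else:
        p = np.asarray(probs, dtype=float)
        mean = np.sum(p * y)                 # probability-weighted mean
        var = np.sum(p * (y - mean) ** 2)    # probability-weighted variance
    return mean, var
```

With equal probabilities this reproduces the ordinary sample mean and variance; with a probability vector it reproduces the weighted forms.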
Relaxing the "variance at target" (Lehman et al., 2004); minimizing the standard deviation of the goal output (Dellino et al., 2015); and distributionally robust formulations, in which $P(e)$ represents the unknown true probability distribution of the uncertain parameter $e$ and optimization is performed over a family of distributions obtained from historical data via a $\phi$-divergence measure. Figure 8 summarizes the strategies for dealing with unknown probability distributions of uncertain parameters.

Designing of computer experiments
In this context, we focus on experiments performed via computer simulation. Recently, the acronym Design and Analysis of Simulation Experiments (DASE) was introduced, inspired by the common acronym for deterministic simulation, Design and Analysis of Computer Experiments (DACE) (Kleijnen, 2015). In practice, most simulation models have many combinations of input factors to run, which may be time-consuming and expensive; e.g., a model with 7 input factors and 10 levels per input requires $10^7$ combinations. In addition, in an uncertain environment, analyzing the uncertain (noise) factors needs extra computational effort. Furthermore, if we wish to analyze all combinations to find the best set of input factors, we need extremely long simulation runs unless appropriate sampling methods are used. As to the number of sample points needed to produce an accurate response surface model, using $1.5p$ to $2.5p$ sample points has been recommended in the literature, where $p$ is the number of coefficients that need to be estimated; see (Giunta et al., 1994; Simpson, Mauery et al., 2001). There are three common types of design based on the aliasing of main or interaction effects in a model. When the effect of one factor depends on the levels of one or more other factors, this is called an interaction of degree two or more between factors. A resolution-III design indicates that main effects may be aliased with two-factor interactions. A resolution-IV design indicates that main effects may be aliased with three-factor interactions, and two-factor interactions may be aliased with other two-factor interactions. A resolution-V (or higher) design assumes that main effects and two-factor interactions can adequately model the factors.
Table 8 shows some DOE methods that have been used in literature to design experimental points through analyzing and improving different engineering design problems.

Latin Hypercube Sampling (LHS)
LHS was first introduced by McKay et al. (1979). It is a strategy to generate random sample points that guarantees all portions of the design space are represented. Generally, LHS is intended to improve results in SO (Kleijnen, 2015). LHS has commonly been used for designing computer experiments based on the space-filling concept (Bartz-Beielstein et al., 2015; Del Castillo, 2007). In general, for $k$ input variables, $n$ sample points are produced randomly over $n$ intervals or scenarios (with equal probability). As a particular instance, an LHS design for $n = 4$, $k = 2$ is shown in Figure 9. The LHS strategy proceeds as follows: i. Each input range is divided into $n$ subranges (intervals) with equal probability, numbered from 1 to $n$. In general, $n$ is larger than the total number of sample points in a CCD (Kleijnen, 2004). ii. In the second step, LHS places a point in each of the $n$ intervals at a random value between the lower and upper bounds of that interval, such that each integer $1, 2, \ldots, n$ appears exactly once in each row and each column of the design space. Note that, within each cell of the design, the exact input value can be sampled from any distribution, e.g., uniform, Gaussian, etc.
Three common choices are available to ensure appropriate space filling of the sample points in an LHS design: Minimax: this design tries to minimize the maximum distance in the design space between any location and its nearest design point.

Maximin:
This design attempts to maximize the minimum distance between any two design points.

Desired correlation function:
Inspired by Iman & Conover (1982), in the case of non-independent multivariate input variables, a desired correlation matrix can be used to produce distribution-free sample points in LHS.
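The LHS construction and the maximin criterion described above can be sketched as follows; this is a minimal NumPy implementation (function names are ours), and the maximin search is a crude random-restart heuristic rather than a formal optimizer:

```python
import numpy as np

def latin_hypercube(n, k, rng=None):
    """One random Latin hypercube: n points in [0, 1]^k, where each of
    the n equal-probability subranges appears exactly once per column."""
    rng = np.random.default_rng(rng)
    u = rng.uniform(size=(n, k))      # random position inside each cell
    # One independent permutation of the interval indices per dimension.
    perms = np.column_stack([rng.permutation(n) for _ in range(k)])
    return (perms + u) / n

def maximin_lhs(n, k, tries=50, rng=None):
    """Draw several random LHS designs and keep the one whose smallest
    pairwise distance is largest (the maximin criterion)."""
    rng = np.random.default_rng(rng)
    best, best_d = None, -np.inf
    for _ in range(tries):
        x = latin_hypercube(n, k, rng)
        dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
        np.fill_diagonal(dist, np.inf)   # ignore zero self-distances
        if dist.min() > best_d:
            best, best_d = x, dist.min()
    return best
```

Scaling each column from [0, 1] to the actual factor range (or pushing it through an inverse CDF for a non-uniform distribution) recovers the "any distribution within each cell" remark above.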

Orthogonal Array (OA)
The OA design can fill the whole design space like LHS, and it has the strength of allocating points in each corner of the design space (Owen, 1992). The OA was adapted to balance discrete experimental factors in a continuous space (Koehler & Owen, 1996). An orthogonal array is a matrix with $N$ rows and $k$ columns, where $N$ is the number of experiments (input combinations) and $k$ is the number of input factors. Each factor is divided into $s$ equal-size ranges (grids), and sample points are allocated to these orthogonal grid spaces. In general, an orthogonal array is denoted by the symbol $L_N(s^k)$ and has the following properties:
- For the input factor in any column, every level occurs $N/s$ times.
- For the input factors in any two columns, every combination of two levels occurs $N/s^2$ times.
- By removing one or more columns of an orthogonal array, the remaining arrays are still orthogonal to each other, so an OA can be employed with a smaller number of factors.

Sequential design:
Most of the time in practice, due to expensive simulations (i.e., a single simulation run is intensively time-consuming), reducing the number of simulation runs (sample points) is of interest. In mathematical statistics it is well known that sequential designs are more efficient than fixed-sample-size designs (Wim et al., 2008). Different types of criteria can be used to sequentially define a candidate set of sample points; most of them are based on the mean squared prediction error (Van Beers & Kleijnen, 2004). Kleijnen & Beers (2004) proposed a customized sequential method based on cross-validation and jackknifing. In SO, Kleijnen (2017) suggested replacing one-shot designs with sequential designs customized for the given simulation model. In similar work, Wim et al. (2008) employed the bootstrapping technique to propose a sequential DOE with a smaller number of sample points, and better results, than alternative designs like LHS. A comparison of different sequential sampling approaches is provided in (Jin, R. et al., 2002). Here, inspired by (Kleijnen & Beers, 2004; Wim et al., 2008), the cross-validation and jackknifing method is followed for four reasons. First, this method is adapted to expensive simulation models: since it uses cross-validation, it does not need extra simulation runs. Second, it evaluates the I/O behavior with the highest estimated variance, which is desirable in a robustness study. Third, it has a smaller prediction error than other sequential designs (Van Beers & Kleijnen, 2004). Fourth, it can be used with different types of metamodels, such as polynomial regression and Kriging.
The following steps show the procedure of sequential design based on cross-validation and jackknifing:
- Expensive simulation runs are implemented for $n_0$ initial sample points, and the approximation metamodel is constructed from the initial results.
- To compare all candidates and select the winner for expensive simulation, a jackknife variance for each candidate is computed separately, and the point with the maximum jackknife variance is selected. To avoid extrapolation, the sample points on the vertices are not dropped; with $k$ samples on vertices, $n_0$ is replaced by $n_c$, where $n_c = n_0 - k$. The pseudo-value for candidate $c$ and deleted point $j$ is

$$\tilde{y}_{c,j} = n_c \, \hat{y}_{c,0} - (n_c - 1) \, \hat{y}_{c,-j} \qquad (j = 1, \ldots, n_c)$$

where $\hat{y}_{c,0}$ is the original prediction for candidate $c$ with the metamodel fitted over the initial sample points, and $\hat{y}_{c,-j}$ is the prediction for candidate $c$ with the metamodel fitted over $n_0 - 1$ points (deleting the $j$th sample point from the $n_0$ set of points).
- The jackknife variance is computed for candidate $c$ from the relevant pseudo-values:

$$s_c^2 = \frac{1}{n_c (n_c - 1)} \sum_{j=1}^{n_c} \left( \tilde{y}_{c,j} - \bar{\tilde{y}}_c \right)^2$$

The candidate with the maximum $s_c^2$, i.e., $\arg\max_c \{s_c^2\}$, is the winner and enters the set of sample points after its response is computed with the original simulation.
- All steps are repeated until the stopping criterion is satisfied. No generally appealing stopping criterion could be found in the literature (see Van Beers & Kleijnen, 2004); it can be defined based on a limitation of computational time or cost.
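The jackknife selection step above can be sketched as follows, with a simple 1-D quadratic regression standing in for the surrogate; the function names are ours, and a real study would additionally exclude vertex points from deletion and use the actual metamodel (e.g., Kriging):

```python
import numpy as np

def jackknife_variance(x_train, y_train, candidate, degree=2):
    """Jackknife variance of the metamodel prediction at one candidate
    point (1-D polynomial regression stands in for the surrogate)."""
    n = len(x_train)
    full = np.polyval(np.polyfit(x_train, y_train, degree), candidate)
    pseudo = []
    for j in range(n):
        mask = np.arange(n) != j                # leave sample j out
        loo = np.polyval(np.polyfit(x_train[mask], y_train[mask], degree),
                         candidate)
        pseudo.append(n * full - (n - 1) * loo)  # jackknife pseudo-value
    pseudo = np.asarray(pseudo)
    return np.sum((pseudo - pseudo.mean()) ** 2) / (n * (n - 1))

def next_point(x_train, y_train, candidates, degree=2):
    """Winner = candidate with the largest jackknife variance; it is
    simulated next and added to the training set."""
    variances = [jackknife_variance(x_train, y_train, c, degree)
                 for c in candidates]
    return candidates[int(np.argmax(variances))]
```

The loop "simulate winner, refit, repeat" then continues until the time or cost budget is exhausted, matching the stopping remark above.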

Metamodeling
Metamodeling techniques have been used to avoid intensive computational and numerical simulation models, which might squander time and resources in estimating the model's parameters. The general structure of a metamodel with uncontrollable noise variables as uncertainty sources is illustrated in Figure 10.

Polynomial regression
Polynomial regression, also called Response Surface Methodology (RSM), is a collection of statistical and mathematical techniques useful for developing, improving, and optimizing processes. The functional applications of RSM in the literature are: i) approximating the relationship between the design variables (independent variables) and the single or multiple responses (dependent variables); ii) investigating and determining the best operating condition for the process, by finding the levels of the design region that satisfy the operating limits; and iii) making the response(s) of the process robust against uncertainty. Barton & Meckesheimer (2006) claimed that RSM has been successfully used in recent decades for stochastic applications. Kleijnen (2017) mentioned that RSM is sequential, since it uses a sequence of local experiments that leads to the optimum input combination; he also claimed that RSM has an appropriate track record in the literature. Some of the initial applications of RSM in SO can be found in (Azadivar, 1999; Biles, 1974). Commonly, the main motivation for a polynomial approximation of the true response function is a Taylor series expansion around a set of design points, see (Myers et al., 2016). The general first-order response surface model is:

$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \varepsilon$$

where $k$ is the number of design variables. Often, the curvature of the response surface is stronger than the first-order model can approximate, even with interaction terms, so a second-order model can be employed:

$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \sum_{i<j} \beta_{ij} x_i x_j + \varepsilon$$

where $\beta_0$, $\beta_i$, $\beta_{ii}$, and $\beta_{ij}$ are unknown regression coefficients and the term $\varepsilon$ is the usual random error (noise) component. The number of terms $p$ in a quadratic polynomial regression model is $p = \frac{1}{2}(k+1)(k+2)$, and in a cubic model $p = \frac{1}{6}(k+1)(k+2)(k+3)$, where $k$ is the number of input variables. To fit a reasonable metamodel by polynomial regression, the sample size should be at least two or three times the number of terms $p$ (Jin et al., 2003).
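A second-order RSM fit via ordinary least squares can be sketched as below (function names are ours). Note that the design matrix has $(k+1)(k+2)/2$ columns, matching the quadratic term count above:

```python
import numpy as np
from itertools import combinations_with_replacement

def quadratic_design_matrix(X):
    """Columns: 1, x_i, and all products x_i * x_j (squares included),
    i.e. (k+1)(k+2)/2 terms for k input variables."""
    n, k = X.shape
    cols = [np.ones(n)]
    for i in range(k):
        cols.append(X[:, i])                      # linear terms
    for i, j in combinations_with_replacement(range(k), 2):
        cols.append(X[:, i] * X[:, j])            # squares and interactions
    return np.column_stack(cols)

def fit_rsm(X, y):
    """Least-squares estimate of the second-order RSM coefficients."""
    F = quadratic_design_matrix(X)
    beta, *_ = np.linalg.lstsq(F, y, rcond=None)
    return beta

def predict_rsm(beta, X):
    return quadratic_design_matrix(X) @ beta
```

With $k = 2$ this gives 6 coefficients, so, following the sample-size rule above, roughly 12 to 18 simulation runs would be advisable.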

Kriging
Since Daniel G. Krige (1951) addressed geostatistics around six decades ago, Kriging (also called Gaussian process) models have become a widespread global approximation technique (Jurecka, 2007). Kriging is an interpolation method that can cover deterministic data in a black-box presentation, and it is highly flexible due to its ability to employ a wide range of correlation functions. In general, Kriging has been used in deterministic simulation models, i.e., Computer-Aided Engineering (CAE); it has hardly any application yet in random simulation (Kleijnen, 2005; Kleijnen, 2015). The higher accuracy of Kriging models over alternatives such as polynomial regression has been confirmed on different numerical cases in the literature (Dellino et al., 2015; Jin et al., 2001; Simpson, Poplinski et al., 2001). The Kriging model is assumed to be the combination of a polynomial model and the realization of a stationary process, of the form:

$$y(\mathbf{x}) = \sum_{j=0}^{p} \beta_j f_j(\mathbf{x}) + Z(\mathbf{x})$$

where the polynomial terms $f_j(\mathbf{x})$ typically form a first- or second-order response surface model and the coefficients $\beta_j$ are regression parameters ($j = 0, 1, \ldots, p$). This type of Kriging approximation is called universal Kriging; in ordinary Kriging, the constant mean $\mu = E(y(\mathbf{x}))$ is used instead of $f(\mathbf{x})$. The term $Z(\mathbf{x})$ captures the approximation error and represents the realization of a stochastic process, most often a normally distributed Gaussian random process with zero mean, variance $\sigma^2$, and non-zero covariance. The covariance of $Z(\mathbf{x})$ is defined by:

$$\mathrm{Cov}\big(Z(\mathbf{x}_i), Z(\mathbf{x}_j)\big) = \sigma^2 R(\mathbf{x}_i, \mathbf{x}_j)$$

where $\sigma^2$ is the process variance and $R(\mathbf{x}_i, \mathbf{x}_j)$ is the correlation function, which can be chosen from the different functions proposed in the literature (e.g., exponential, Gaussian, linear, spherical, cubic, and spline). For instance, the general exponential correlation function is defined as:

$$R(\mathbf{x}_i, \mathbf{x}_j) = \exp\left( -\sum_{l=1}^{d} \theta_l \, \lvert x_{il} - x_{jl} \rvert^{p_l} \right)$$

where $d$ is the dimension of the input variables, $p_l$ determines the smoothness of the correlation function, and $\theta_l$ indicates the importance of the $l$th input factor: a higher $\theta_l$ makes the correlation drop off more quickly in that dimension, i.e., the output is more sensitive to factor $l$. For $p_l = 1$ and $p_l = 2$, the exponential and Gaussian correlation functions are obtained, respectively.
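An ordinary Kriging predictor with the Gaussian correlation ($p_l = 2$) can be sketched as follows; this is a minimal NumPy version with a constant mean and a small nugget added for numerical stability (function names are ours, and maximum-likelihood estimation of $\theta$ is omitted):

```python
import numpy as np

def gaussian_corr(X1, X2, theta):
    """R(x_i, x_j) = exp(-sum_l theta_l * (x_il - x_jl)^2)."""
    d2 = (X1[:, None, :] - X2[None, :, :]) ** 2
    return np.exp(-np.tensordot(d2, theta, axes=([2], [0])))

def ordinary_kriging(X, y, theta, nugget=1e-10):
    """Fit ordinary Kriging (constant mean mu) and return a predictor
    that interpolates the training data (up to the tiny nugget)."""
    R = gaussian_corr(X, X, theta) + nugget * np.eye(len(X))
    Rinv = np.linalg.inv(R)
    ones = np.ones(len(X))
    mu = (ones @ Rinv @ y) / (ones @ Rinv @ ones)  # GLS mean estimate
    w = Rinv @ (y - mu)                            # fixed weight vector
    def predict(Xnew):
        r = gaussian_corr(Xnew, X, theta)          # correlations to data
        return mu + r @ w
    return predict
```

The interpolation property, i.e., the predictor reproducing the training responses exactly, is what distinguishes Kriging from a least-squares polynomial fit.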

Validation of metamodel
In general, techniques to assess the predictor behavior and evaluate the model can be divided into two types based on the set of sampling points. The first type evaluates the model using training data (i.e. the data set used in estimating the model), and the second uses validation data (i.e. a new data set distinct from the data used in estimation). Several methods have been suggested for evaluating a metamodel; among them, four of the most applicable validation methods are discussed below. Notably, the methods based on validation data suit semi-expensive models, since in expensive simulation models they impose extra computational costs due to extra simulation runs. In such a case, methods such as cross-validation or bootstrapping can be used instead, see (Kleijnen, 2015).

R-square Index
This method can be used to compare first-order against second- or higher-order polynomial regression, or RSM against other metamodels such as Kriging. The $R^2$ coefficient is defined as

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

where $\bar{y}$ is the mean of the observed values $y_i$ and $\hat{y}_i$ are the corresponding predicted values.

Adjusted R-square Index
Because the index $R^2$ always increases when terms are added to the model, some regression analysts prefer another statistic called the adjusted R-square:

$$R^2_{adj} = 1 - \frac{n-1}{n-k-1}\,(1 - R^2)$$

where $n$ is the number of observations and $k$ the number of regression terms. By adding variables to the model, the statistic $R^2_{adj}$ will not generally increase; in fact, its value often decreases if unnecessary terms are added to the model.
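Both the $R^2$ index and its adjusted variant are straightforward to compute from observed and predicted outputs; a minimal sketch:

```python
import numpy as np

def r_square(y, y_hat):
    """R^2 = 1 - SS_res / SS_tot."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def adjusted_r_square(y, y_hat, k):
    """Adjusted R^2, penalizing the k regression terms in the model."""
    n = len(y)
    r2 = r_square(y, y_hat)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
```

For a fixed fit quality, the adjusted index is always at most the plain $R^2$, which is what makes it useful for discouraging unnecessary terms.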

Relative Maximum Absolute Error (RMAE)
While a larger magnitude of R-square indicates better overall accuracy, a smaller RMAE indicates a smaller local error. Good overall accuracy does not necessarily signify good local accuracy (Jin et al., 2003).
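One common definition of RMAE (following Jin et al.) divides the maximum absolute prediction error by the sample standard deviation of the observed outputs; other scalings appear in the literature, so the exact normalization below is an assumption:

```python
import numpy as np

def rmae(y, y_hat):
    """Relative Maximum Absolute Error: max_i |y_i - y_hat_i| / std(y).
    One common normalization; other scalings are used in the literature."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.max(np.abs(y - y_hat)) / np.std(y, ddof=1))
```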

Cross-validation
The cross-validation method can be used when collecting new data or further information about the simulation model is costly. Cross-validation uses the existing data and does not require rerunning the expensive simulation. The method is also called leave-$k$-out cross-validation (i.e. in each run, $k$ sample points are removed from the initial training sample) (Kleijnen, 2015). The leave-one-out cross-validation ($k = 1$), which is the most popular variant, is briefly explained as follows:

Step 1: Delete the $i$-th input combination and its corresponding output from the complete set of $n$ combinations ($i = 1, 2, \ldots, n$).
Step 2: Fit the metamodel again using the $n - 1$ remaining rows.
Step 3: Predict the output for the left-out point with the metamodel obtained in Step 2.
Step 4: Repeat the preceding three steps for all input combinations (sample points), computing $n$ predictions $\hat{y}_{-i}$.
Step 5: Compare the prediction results with the associated outputs of the original simulation model. The overall comparison can be done through a scatter plot or by eyeballing to decide whether or not the metamodel is acceptable.
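The five steps above can be sketched as follows, using ordinary least squares as a hypothetical stand-in metamodel (any surrogate could be plugged in through the `fit`/`predict` callables):

```python
import numpy as np

def loo_cross_validation(X, y, fit, predict):
    """Leave-one-out CV: refit the metamodel n times, each time without
    point i, and predict the held-out output (Steps 1-4 above)."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i          # Step 1: delete the i-th combination
        model = fit(X[keep], y[keep])     # Step 2: refit on the n-1 remaining rows
        preds[i] = predict(model, X[i])   # Step 3: predict the left-out point
    return preds                          # Step 4: n predictions, compared to y in Step 5

# Stand-in metamodel: ordinary least squares with an intercept column.
fit = lambda X, y: np.linalg.lstsq(np.c_[np.ones(len(X)), X], y, rcond=None)[0]
predict = lambda b, x: float(np.r_[1.0, x] @ b)
```

For data that the metamodel family can represent exactly (e.g. a linear response fitted by least squares), the leave-one-out predictions reproduce the observed outputs.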

Robust metamodeling in SO
There are many methodologies for optimizing deterministic simulations, but few studies have been conducted on random (stochastic) SO problems under uncertainty and the effect of noise parameters, particularly based on the combination of metamodels and robust optimization; see (Simpson, Poplinski et al., 2001) and two recent review papers (Amaran et al., 2016; Kleijnen, 2017). Table 9 depicts the different combinations of SO methods that have been applied in the reviewed articles. Barton (1992) introduced the Taguchi methods as an alternative to metamodeling strategies. Bates et al. (2006) showed that the Taguchi crossed array was more successful than the dual response designs in their numerical example. On the other side, (Dellino & Meloni, 2015; Kleijnen, 2015; Myers et al., 2016; Vining & Myers, 1990) have combined the Taguchi approach with approximation methods to exploit the advantages of both. Figure 11 illustrates a general procedure for SO under uncertain conditions based on surrogate models and Taguchi terminology (Dellino & Meloni, 2015; Kleijnen, 2015).

Bootstrapping
Consequently, both sensitivity analysis and optimization must be done on metamodels to interpret the observed simulation input/output data (Van Beers & Kleijnen, 2004). Sensitivity analysis can serve the optimization of the simulated system (Kleijnen, 2010). Sensitivity analysis is based on a fixed condition for a system, with the variation of one factor at a time. Kleijnen (2009b) has recommended that the designs for sensitivity analysis and optimization be combined for robust optimization. Parametric bootstrapping has been suggested in such a case, assuming a specific distribution type estimated from the I/O simulation data at hand. The basics of this method are explained in (Kleijnen, 2004; Kleijnen, 2010; Kleijnen, 2009b, 2015). In stochastic simulation, each input combination is replicated a number of times $m \ge 1$. In an expensive simulation, however, the number of replications is small, so good results cannot be expected from this type of bootstrapping (i.e. one can rarely find the exact distribution of the I/O simulation data). Moreover, a simple method for estimating the exact variance of the predictor is distribution-free bootstrapping, see (Dellino & Meloni, 2015).
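A distribution-free bootstrap of the $m$ replicated outputs of one input combination can be sketched as below; this is a minimal illustration of the resampling idea only, whereas the cited papers treat the full I/O design rather than a single combination:

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_variance(replicates, n_boot=1000, stat=np.mean):
    """Distribution-free bootstrap: resample the m replicated simulation
    outputs with replacement and estimate the variance of `stat`."""
    replicates = np.asarray(replicates, float)
    m = len(replicates)
    boot = np.array([stat(rng.choice(replicates, size=m, replace=True))
                     for _ in range(n_boot)])
    return boot.var(ddof=1)
```

Because no distributional form is assumed, the resampling uses only the empirical distribution of the observed replicates, which is exactly why it needs enough replications to be informative.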

Sequential improvement:
Sequential improvement methods have attracted interest in engineering design problems as a way to trade off local and global search via the expected improvement criterion. Abspoel et al. (2001) proposed a sequential approximate approach for solving SO with integer design variables. Sequential improvement has been applied in a range of practical engineering design problems; see, for instance, (Wiebenga et al., 2012) and (Havinga et al., 2017) on improving metal forming processes, (Jurecka et al., 2007) on the 10-bar truss under varying loads, and (Zhang, J et al., 2017) on the optimal design of skyhook control for the suspension of a half-car model.
These methods combine two main parts: a statistical part consisting of DOE and metamodeling techniques, and evolutionary algorithms. Note that this approach relies on capturing the behavior of the overall design space, so global approximation methods (e.g. Kriging) can be covered by it while polynomial regression cannot, due to its local behavior. Sometimes, due to the low correlation between the predicted optimum point (estimated by the metamodel) and other points, the metamodel does not show enough accuracy at the optimal point compared with the original model; for more information see (Havinga et al., 2017; Jurecka, F et al., 2007; Sóbester et al., 2004).
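The expected-improvement criterion mentioned above has a well-known closed form for minimization under a Gaussian predictor; the sketch below uses that standard form, which may differ in detail from the variants used in the cited papers:

```python
from math import erf, sqrt, exp, pi

def expected_improvement(mu, sigma, y_best):
    """Expected improvement for minimization:
    EI = (y_best - mu) * Phi(u) + sigma * phi(u), with u = (y_best - mu) / sigma,
    where Phi/phi are the standard normal CDF/PDF and (mu, sigma) come from
    the metamodel's predictor at a candidate point."""
    if sigma <= 0.0:
        return max(y_best - mu, 0.0)   # no predictor uncertainty at this point
    u = (y_best - mu) / sigma
    Phi = 0.5 * (1.0 + erf(u / sqrt(2.0)))
    phi = exp(-0.5 * u * u) / sqrt(2.0 * pi)
    return (y_best - mu) * Phi + sigma * phi
```

EI is large either where the predicted mean is well below the current best (exploitation) or where the predictor uncertainty is high (exploration), which is precisely the local/global trade-off these sequential methods exploit.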

Discussion and results:
A polynomial regression metamodel can be established in two main consecutive steps: screening and optimization. In the screening step, the significant levels of the input factors of interest are identified, so that in the second step a closer interval of the input factors can be studied (Kleijnen, 2010; Myers et al., 2016). Haftka et al. (2016) have surveyed the latest developments on the global-local approach, in which a rough surrogate model is constructed over the entire design space and then used to zoom in on promising regions. Wang (2003) has developed the adaptive RSM,

which, instead of working over a large-scale region, constructs a new DOE through a central composite design or LHS in a reduced region; for more information see also (Peri & Tinti, 2012). By contrast, a polynomial model is easy to establish, more transparent in sensitivity analysis, and cheap to work with (Wang & Shan, 2007). In addition, the comparison of different metamodel types, such as polynomial regression and Kriging, in practical problems also remains a challenging topic (Beers & Kleijnen, 2004; Kleijnen, 2017; Wang & Shan, 2007).
More computer-software support is currently available for implementing polynomial regression than for other types of metamodels (e.g. Design Expert (V.10), Minitab (V.17), SAS (V.14.2), SPSS (V.23), and so on). For Kriging, the free MATLAB toolbox DACE can be used (Lophaven et al., 2002). In addition, ARCGIS (geostatistical analysis) and Isatis (geographical analysis) can support the Kriging surrogate model, but only while the number of input factors does not exceed three (Kleijnen, 2009b).
However, the literature shows a lack of software packages that cover the Kriging surrogate model in the framework of practical engineering design (Jin et al., 2001; Kleijnen, 2009b); Kriging has been employed more for academic use than for practical real-world problems (Jalali & Van Nieuwenhuyse, 2015). Viana et al. (2014) have listed a number of existing commercial software packages (e.g. BOSS/Quattro, DAKOTA, HyperStudy, IOSO, LS-OPT, OPTIMUS, and VisualDOC), highlighting their metamodeling and optimization capabilities, while claiming that for many of these software systems metamodeling is not a final goal.
In practice, most problems can be supported by polynomial regression, and owing to its lower complexity, process designers have preferred it over other metamodels, see (Forsberg & Nilsson, 2005; Jalali & Van Nieuwenhuyse, 2015; Jin et al., 2003; Simpson et al., 2004).
However, owing to the strengths of Kriging, there has recently been a significant increasing trend toward the application of this common modern metamodel, compared with classical metamodeling techniques such as the different orders of polynomial regression. Kriging is a popular method for metamodeling with simulation data (Havinga et al., 2017). Kriging gives a more accurate and flexible approximation than polynomial regression because it fits globally over the whole design space (Dellino, 2015; Jin et al., 2001; Simpson, Poplinski et al., 2001). The main advantage of Kriging over polynomial regression is that Kriging is an exact interpolator, i.e. the predicted values and the simulated outputs at the set of observed input values are exactly equal (Kleijnen, 2005). In general, however, Kriging predicts very poorly when extrapolating outside the observed sample data (Kleijnen & Beers, 2004).
Most methods mentioned in this context have only been tested in theoretical problem settings, so applying them to practical problems and comparing their performance in depth could be an interesting area for additional research (Dellino, 2009; Jalali & Van Nieuwenhuyse, 2015). Kleijnen (2009b) has emphasized applying metamodels, particularly Kriging, to practical random simulation models, which are more complicated than the academic M/M/1 queueing and (s, S) inventory models.
Another shortcoming revealed by the literature review is that most publications assume a single output variable, whereas in practice simulation models must be covered by multivariate output methodology (Kleijnen, 2009b; Simpson, Mauery et al., 2001; Teleb & Azadivar, 1994). Investigating suitable DOE for multiple outputs remains an interesting topic for researchers (Kleijnen, 2005).
To the best of our knowledge, studies that consider variability in design variables in parallel with noise variables are rare. Most studies consider the variability of environmental factors and its influence on response variability, assuming that design variables can be controlled exactly at their relevant values, while in practice this assumption may fail, see (He et al., 2010; Sanchez, 2000). Often in practice, an important source of uncertainty is the tolerance on the nominal values of the design variables (Myers et al., 2016).

Conclusion:
In this paper, the latest developments on the optimization of complex simulation models under uncertainty have been investigated. Particular attention has been focused on surrogate-based methods hybridized with robust design optimization, especially according to the dual response methodology; these methods belong to the class of MBRSO. The main methodologies used in the reviewed articles have been highlighted and the important methods discussed, noting the outstanding shortcomings and the strong points of each. With respect to the advantages and disadvantages of both metamodels (polynomial regression and Kriging), appropriate methods that integrate both (hybrid surrogate methods) in practical problems have not received enough attention; a low-order polynomial could be used for screening and Kriging for optimizing. Proposing methodologies that cover the gaps mentioned in the discussion and results section would be an interesting topic for future work.

Figure 1
Figure 1 The trend of publications with topic of "simulation optimization" in (a) the Web of Science databases (source: WoS, Data retrieved on August 2017), and (b) SCOPUS (source: Scopus, Data retrieved August 2017).

Figure 2
Figure 2 Trend of publications on "Metamodel and simulation optimization" impact (data from 1996 to date).

Figure 3
Figure 3 The top 50 key-phrases by relevance in the papers of the past five years (656 publications).

Figure 4
Figure 4 An overview of simulation scope.

Figure 5 Figure 6
Figure 5 Deterministic and stochastic simulation models

Jalali & Van Nieuwenhuyse (2015) have reviewed some methods of robust design optimization in the class of dual response applied in simulation optimization, and concluded that the dual response surface approach is the most attended among the techniques on that subject. This model employs two separate metamodels, one for the process mean and another for the process variance. By combining both types of factors in the process, design and noise (uncertain) variables, we can approximate $y = f(x, e)$ as a function of the design factors $x$ and the uncertain factors $e$. Due to the stochastic nature of the simulation model, repeated runs with the same combination of input variables lead to different outputs. When the probability distribution of the noise (uncertain) parameters is unknown but historical data are available, methods such as the $\phi$-divergence approach can be used to estimate the probability distribution (Moghaddam & Mahlooji, 2016). Attending to the probability of occurrence of the uncertain parameters leads to a tighter uncertainty set and a less conservative robust solution. Yanikoglu et al. (2016) have suggested robust optimization in the class of dual response for such problems, when the probability distribution of the uncertain parameters is unknown but historical data are available:

$$\min_{x} \ \sigma^2_y(x) \quad \text{s.t.} \quad \bar{y}(x) \le T, \ \forall e, \ \forall P(e) \qquad (11)$$

Figure 10
Figure 10The viewpoint of metamodel under uncertainty (noise variables) and relationship with a simulation model.

Figure 11
Figure 11 Main steps in MBRSO procedure.

Table 1
Top ten countries by view counts in the field of MBSO.

Table 2
Top ten authors by view counts in the field of MBSO.

Table 3
Top ten journals by view counts in the field of MBSO.
Table 4 illustrates the number of document results from Scopus obtained by employing relevant keywords joined with the conjunction 'AND'. The search was conducted on each article's title, abstract, and keywords. There are many SO methods, and a discussion of most of them is beyond the scope of this context; instead, we focus on simulation optimization under uncertainty employing metamodels and robust optimization.

Table 4
Number of document results based on combination of different keywords (Scopus database).
Robust design optimization is an engineering methodology for improving the performance of a model by minimizing the effects of variation without eliminating its causes, since they are too difficult or too expensive to control. Robust simulation optimization is about solving a simulation model with uncertain data in a computationally tractable way. The main goal of a robustness strategy is to find the best setting of the input factor levels that attains the desired output goal while being insensitive to the variability of the uncertain parameters.

Table 7
Applied different strategies in literature for management of uncertainty.

Table 7 illustrates the different strategies that have been applied in the reviewed literature for the management of uncertainty in SO. Most often, in the robust design approach, the output goal is to attain the minimum distance of the mean from the target point together with the least variance simultaneously. The overall viewpoint of the robust design method for designing and developing processes and products concentrates on three aspects:
iii. Minimizing the deviation between the performance of the product or process and its relevant target point.
The strategies faced with an unknown probability distribution of the uncertain variables.
Figure 8

Table 8
Different strategies of DOE for SO in literature.
Figure 9 An example of an LHS design with two input factors and four intervals.