Robust simulation optimization using φ -divergence

We introduce a new robust simulation optimization method in which the probability of occurrence of uncertain parameters is considered. It is assumed that the probability distributions are unknown but historical data are on hand and using φ -divergence functionality the uncertainty region for the uncertain probability vector is defined. We propose two approaches to formulate the robust counterpart problem for the objective function estimated by Kriging. The first method is a minimax problem and the second method is based on the chance constraint definition. To illustrate the methods and assess their performance, numerical experiments are conducted. Results show that the second method obtains better robust solutions with less simulation runs.


Introduction
Simulation is a powerful and flexible tool to model problems in which either the closed-form of the objective function or the constraints is not in hand.Examples of such problems are car crash model or aerodynamic wing design.For these problems, by applying optimization tools, the optimal combination of input variables that optimize the simulation output could be found.The area, consisting of the techniques that combine the simulation modeling and optimization tools for solving complicated optimization problems is commonly referred to as simulation optimization.Simulation optimization methods are generally classified into two categories, model-based and metamodel-based approaches (Nezhad & Mahlooji, 2013).In a model-based approach, simulation and optimization components interact with each other and outputs of one are inputs to the other (for more details about model-based approaches one can refer to Bhatnagar et al. (2011), Chang (2012), Kim and Nelson (2007) and Lin et al. (2013).In a metamodel-based approach, there exists a third component called metamodel.Metamodel identifies and estimates the relationship between inputs and outputs of the simulation model in the form of an explicit deterministic function.For expensive simulation models, metamodels such as responsesurface methodology, Kriging, radial basis functions, splines or neural networks are viable alternatives to alleviate the magnitude of the computational cost.For more details about metamodel approaches one can refer to Hurrion (1997), Kleijnen (2015), Liu and Maghsoodloo (2011), Simpson et al. (2001) and Yang (2010).In practice, most of the simulation models are uncertain, i.e., by running the simulation model for the same inputs different outputs are generated.For these models the underlying metamodel would be uncertain too.Usually in the literature, deterministic metamodels are fitted to the combination of the inputs and mean of the simulation output (e.g.Ajdari & Mahlooji, 2014;Kleijnen, 2014;Kleijnen et al., 2010;Quan et al., 2013).If we do not consider uncertainty in metamodel parameters, we may lead to non-optimal solutions.
To deal with this uncertainty, different methods have been proposed in the area of robust simulation optimization.For instance, Myers and Montgomery (2003) assumed that the random (noise) variable has 0 and , then the mean and variance of the response variable are estimated theoretically.Some methods use metamodels to estimate the mean and the variance of the response variable, for example, in (Dellino et al., 2009(Dellino et al., , 2010a(Dellino et al., , 2010b(Dellino et al., , 2012;;Dellino & Meloni, 2015;Kovach & Cho, 2009;Lee et al., 2006) two Kriging metamodels were fitted, one for the mean and one for the variance of the response.To formulate the robust optimization problem, they consider minimizing the mean while satisfying a constraint on the standard deviation (or vice versa).
Few research efforts have been reported so far which employ robust optimization techniques in simulation (for a thorough review on robust optimization techniques one can refer to Bertsimas et al. (2011) and Gabrel et al. (2014).For example, in (Zhou & Zhang, 2010) an interesting approach was presented in which a metamodel and an evolutionary optimization algorithm were combined.In (Stinstra & Den Hertog, 2008) by considering three types of error: simulation error (the difference between reality and computer model), metamodel error (the difference between computer model and metamodel), and implementation error the authors represented the robust counterpart problem based on the Ben-Tal and Nemirovski theory for the regression and Kriging metamodels.In (Marzat et al., 2013) an iterative relaxation procedure was combined with the Kriging-based optimization.In (Bertsimas et al., 2010) a robust optimization method was presented that is suitable for unconstrained problems with a nonconvex cost function as well as for problems based on simulations (see also Cramer et al. 2009;Lung & Dumitrescu, 2011).
The main criticism for the robust optimization methods is that they do not consider the occurrence probabilities of the uncertain parameters and just find a solution which is feasible for all the values in the uncertainty set, so the final solution would be an over conservative solution.A remedy for this is to consider the probability of occurrence of uncertain parameters in a problem.However, in practice, the probability distributions of parameters, P, are not known and just historical data may be available.This means that the distributions of the uncertain parameters are uncertain themselves.Considering the uncertainty in the probability distribution, different robust methods have been proposed in the literature.One method is to optimize the objective function based on the worst-case probability distribution (for more details of this method, one can refer to Birge and Wets, (1986), Shapiro and Ahmed (2004) and Shapiro and Kleywegt (2002).Another method aims to construct an uncertainty region for the probability vectors estimated from historical data, in a way that bears a specific confidence level.There are different statistics in the literature for constructing confidence levels for probability vectors.Here we use φdivergence as it generates a tractable robust counterpart formulation that makes it an interesting method in robust optimization (Ben-Tal et al., 2013).
Works related to φ-divergence in the literature are as follows.Klabjan et al. (2013) employed a special case of φ-divergence to construct uncertainty regions from historical data for the stochastic lot-sizing problem using a χ2-statistic.Calafiore (2007) formulated the robust portfolio selection problem by considering Kullback-Leibler divergence which aims at finding robust solutions that are feasible for all allowable distributions of the uncertain parameters with bounded supports.Wang et al. (2009) proposed a robust optimization method for newsvendor problem in which the uncertainty set for the unknown distribution is defined as a "likelihood region".Ben-Tal et al. (2013) proposed a method based on φ-divergence for optimization problems in which the uncertain parameters are probabilities.This is the case when the objective function and/or the constraint functions involve terms with expectations (such as moments of random variables).They applied their method to an investment problem as well as multiitem newsvendor problem.Their results show that uncertainty sets based on φ-divergence were good alternatives for the uncertainty sets such as ellipsoidal, box, and their variations.Yanikoglu and den Hertog (2012) used φ-divergence for safe approximation of chance constraints where the family of distributions, P, is given by a confidence set that is based on historical data.They show that φ-divergence leads to tighter uncertainty sets compared with other existing safe approximation methods.
In this paper, we employ φ-divergence in the simulation optimization context to find robust solutions for problems in which the closed form of the objective function is not in hand.We fit Kriging to the I/O combination of simulation model to estimate the objective function and apply the φ-divergence to the estimated objective function.We assume that the probability distributions of the uncertain parameters are not known but historical data are available, so the probability vector is the uncertain parameter.We propose two methods to employ φ-divergence in robust simulation optimization context .The first method is a minimax problem, based on the work of Ben-Tal et al. (2013).The second one is a chance constraint definition based on the work of Yanikoglu and den Hertog (2012).Our method is different from other robust simulation optimization methods in:  Using the probabilities of occurrence of uncertain parameters (random variables), that leads to tighter uncertainty sets and less conservative solutions,  Employing a distribution-free structure in robust optimization, which does not need any assumptions about the probability distribution of the random variables,  Proposing a method that can be applied to the constrained simulation optimization problems and consider robustness in both optimality and feasibility of a problem,  Employing a new method for taking samples from the random and decision variables that leads to fewer simulation runs in metamodel construction, which is considered an advantage especially for expensive simulation models).
The remainder of this paper is organized as follows.Section 2 elaborates on Kriging metamodel.In section 3, the φ-divergence concept as well as the proposed robust simulation optimization methods are explained in detail.In section 4, the proposed methods are applied to the well-known EOQ problem, the multi-item newsvendor problem, and seven test functions and results are compared with each other and with those reported by Dellino et al. (2012).Section 5 contains the conclusion and ideas for future research.

Kriging metamodel
Kriging gives predictions of unknown values of a random function as the weighted linear combination of the observed values.Kriging has been often applied to deterministic simulation models in engineering.
Recently, Kriging has been extended to stochastic simulation models (Ankenman et al., 2010).Kriging assumes the following model: where is the vector of design points, is the trend model, and is a realization of mean 0 random field which is assumed to exhibit spatial correlation.This means that values and ́ will tend to be similar if and áre close to each other in space.Ordinary Kriging assumes a stationary covariance process for .This assumption implies that the expected values are constant (so can be replaced by ), and the covariance of and depends only on the distance (or lag) .The predictor for the design point, , denoted by Y , is a weighted linear combination of all the (say) observed output data, : To select the weights, , in Eq. ( 2), Kriging minimizes the mean-squared prediction error.Let the 1 vector , , … , , be denoted by , the covariance matrix across all design points , , … , be denoted by , and 1 denote a vector of ones.Then the optimal weights turn out to be: Assuming that is second-order stationary, leads to the following: where can be interpreted as the variance of for all , and is the correlation that depends only on ́ and is a function of an unknown parameter .We require that ;́ θ)→0 as the distance between and ́ goes to infinity, and 0, 1 (for a thorough review on Kriging one can refer to (Kleijnen, 2009)).

Robust counterpart with φ-divergence uncertainty
In this section, we first explain the concept of φ-divergence and some useful properties in subsection 3.1.Then, we will discuss the two proposed methods based on φ-divergence in subsections 3.2 and 3.3, respectively.

Introduction to φ-divergence
Given is a convex function for 0 satisfying 1 0, 1 0, 0 0 ⁄ ≔ lim → ϕ ⁄ , for 0, and 0 0 0 ⁄ : 0, the φ-divergence (distance) between two vectors , … , 0 and , … , 0 is defined as: We consider to be the unknown true probability vector and the empirical estimate of .There are different types of function in the literature.For a good review one can refer to Pardo (2005).Table 1 taken from (Ben-Tal et al., 2013) presents the most common choices of .together with the conjugate function that is defined as follows: * .
If we assume is twice continuously differentiable in the neighborhood of 1 and 1 0, then the test statistic follows a χ2-distribution with (m−1) degrees of freedom (N is the number of historical observations).Using this test statistic, an approximate 1 -confidence set for defines the family of distributions P: where denotes a column vector of ones with the same dimension as p and Hellinger distance 1 √ 1 , 1

Constructing the robust counterpart by minimax method
We assume that the probability distributions of random variables are not known and should be estimated based on historical data.We also assume that there is a fixed number, m, of given scenarios for random variables and in the case of continuous variables we divide the valid range of random variables to m parts and find the center of each range, , for j=1,2,..,m.Using historical data we estimate the probability of occurrence of .We also find n sampling points, , for i=1,2,…,n, from solution space using a sampling design like Latin Hypercube Sampling (LHS).Then, we run the simulation model for every combination of sampling points, , for i=1,2,…,n and random variables, , for j=1,2,..,m and fit one Kriging metamodel for every scenario of uncertain parameters which amounts to m metamodels.The final objective function would be the weighted sum of all the m metamodels in which weights are the estimated probability values for the corresponding scenarios.Such an outlook toward the objective function paves the way to use functions like φ-divergence in simulation optimization context, which is our main contribution.Using φ-divergence leads to robust solutions, which are less conservative than the solutions obtained by common robust optimization methods.We derive the robust reformulation of the optimization problem using Ben-Tal method (Ben-Tal et al., 2013).We describe the steps of the proposed method in more detail below.
Step 0. Determining valid ranges for the random and decision variables: We first determine the random variables (e.g., demand and lead-time are random variables in an inventory problem) and their valid ranges.We assume that the probability distributions of the random variables are not known.Given N historical observations on the random variables, , we partition the range of them into m cells in a way that the number of observations is at least five in every cell.Then we calculate the frequency for cell j using the number of observations, oj, as an estimate of the occurrence probability in every part as follows: where pj denotes the probability of falling into cell j and ̂ denotes the estimate of pj.We should also determine the decision variables (for example economic order quantity is the decision variable in an inventory problem) and the valid ranges for them.We sample points from the solution space by a uniform sampling design like LHS to form the simulation points set.
Step 1. Constructing the objective function: For every combination of design points, xi, for i=1,2,…,n and random variables, , for j=1,2,…,m we run the simulation model one time and show the simulation outputs by wij.To estimate the objective function, Kriging metamodels are fitted to the I/O combination of simulation model for every random variable, , for j=1,2,…,m.The resulting metamodels would be of the form: We define the final objective function as the weighted sum of m fitted metamodels, whose weights are equal to the probability of occurrence of random variables.Therefore, the final model would be as follows: where: The nominal mathematical programming model is as follows, where is the feasible set given by deterministic constraints.
Step 2. Formulating the robust counterpart problem: We write the robust counterpart for Eq. ( 14) as: We can rewrite Eq. ( 15) as the following mathematical programming problem: Step 3. Solving the counterpart problem: The mathematical problem in Eq. ( 16) is a difficult optimization problem that has infinitely many constraints (∀p∈U), and includes quadratic terms in p. Ben-Tal et al. (Ben-Tal et al., 2013) proposed a tractable semidefinite robust counterpart of an optimization problem where the uncertain parameters are probabilities.Let ∈ , ∈ , and ∈ as given parameters, p∈ as the vector of uncertain parameters, and as a function (linear or nonlinear) in x, the following robust constraint is considered: With uncertainty region U given by: where ∈ and ∈ .They proved that a vector x∈ satisfies Eq. ( 17) with uncertainty region U given by Eq. ( 18) if and only if there exist ∈ and ∈ such that (x, , η) satisfies * 0, 0 , where and are the i-th columns of and , respectively and * is the conjugate function given by Eq. ( 6), with * * ⁄ for ≥ 0 and 0 * 0 ⁄ 0 * , which equals 0 if s ≤ 0 and +∞ if s > 0. In (Ben-Tal et al., 2013) it has been shown that Eq. ( 19) is a convex optimization problem and is computationally tractable.In addition, the size of the uncertainty region can be controlled by the confidence level.These two features, make the approach appealing in robust simulation optimization.
Here we use this method to find the approximate robust reformulation for the model obtained in Eq. ( 16).Adopting the formulation of Eq. ( 17) and Eq. ( 18), the approximate robust reformulation of Eq. ( 16) is the following semidefinite optimization problem: where ∑ for j=1,…,m.It is noteworthy that although the final robust counterpart problem given by Eq. ( 19) is convex but as we use Kriging as in eq. ( 19) and Kriging is not necessarily a convex mapping, the final robust counterpart problem (Eq.( 20)) is not convex necessarily.Convexity of the model obtained by Kriging ( depends on the convexity of the I/O function and even for convex functions Kriging may give nonconvex estimates."Kriging may even give metamodels that contradict our prior qualitative (structural) knowledge of the characteristics (e.g., convexity or monotonicity) of the I/O function that is implicitly defined by underlying simulation model" (Kleijnen et al., 2012).We can extend the proposed approach to constrained problems in which the closed forms of some constraints are not in hand.The algorithm is the same and the only difference is that we should fit m Kriging metamodels for every constraint and work with multiple constraints rather than a single one.The approximate robust reformulation is as follows: where h denotes the constraint index ( 0 relates to the objective function) and ∑ for j=1,2,…,m and h=0,1,…,H.

Constructing the robust counterpart by chance constraint method
Here we use the concept of chance constraint to find the robust counterpart for the objective function found in Eq. ( 14) in a different way.Let , be a function of a decision vector, x, and a random vector, , ∈ 0,1 as the prescribed probability bound and P as the known probability distribution of , a chance constraint is given by Unlike robust optimization, in chance-constrained optimization it is assumed that the uncertain parameters are distributed according to a known distribution and if we have partial or no information on the probability distribution, P, it should be estimated.In such a situation, a safe approximation of ambiguous chance constraints is used.Here we use the safe approximation based on φ-divergence proposed by Yanikoglu and den Hertog (2012).They presented an algorithm to find the solution in a chance constraint problem using φ-divergence.Here we employ the algorithm and propose a new algorithm for problems whose objective function is obtained by Kriging metamodel.Following the proposed algorithm is described in more details.
Step 0. Determining valid ranges for the random and decision variables: Like previous section, we first determine the valid ranges for the random variables and divide the space into m cells.Let V = {1, . . .,m} be the set of cell indices.We then calculate (the estimation of occurrence probability) for j=1,2,…,m by Eq. ( 10).We should also determine the decision variables, x, and their valid ranges.We sample points from the solution space for the random and decision variables using a uniform sampling design like LHS and construct the simulation points set.
Step 1. Constructing the objective function: For every point of experimental design set, ( , ), for i=1,2,…,n we run the simulation model one time and show the simulation output by wi.Therefore, we run the simulation model n times in total which is fewer than the runs which is usually needed by simulation optimization methods.To estimate the objective function, a Kriging metamodel is fitted to the I/O combination of simulation model for n simulation points.The resulting metamodel is as follows: where , 's are the coefficients which are functions of the decision and random variables.
Step2.Formulating the robust counterpart problem: We consider the following chance constrained optimization problem for Kriging metamodel: where X is the feasible set given by deterministic constraints and β is the given probability bound.The robust counterpart problem is as follows: where Z is the uncertainty set given by: where Ω is the radius of the ball in the uncertainty region which is set to zero in the first iteration (so is equal to the center point of the random variables space and the problem is the same as the nominal problem) and in the following iterations, it will be increased according to step five.By solving the above programming problem, the robust optimal solution, x * , is found.
For solving Eq. ( 25), which is a nonlinear robust optimization problem (when Ω 0), some methods are proposed in the literature (see e.g.(Ben-Tal et al., 2015;Bertsimas et al., 2010)).Here we apply the method proposed by Bertsimas et al. (2010) which is applicable to problems whose objective functions are nonconvex and given by a numerical simulation-driven model such as response surface and Kriging metamodels.Their proposed method just requires the value and the gradient of the objective function and is analogous to local search techniques such as gradient descent.(for more details about the method one can refer to (Bertsimas et al., 2010)).
Step 3. Constructing the valid set: We identify the set of cells for which the following equation holds: where is the center point of cell j∈ V.If the center point of a cell satisfies the constraint in Eq. ( 27) for a given x * , then we assume that all the realizations in the associated cell are feasible for the uncertain constraint.
Step 4. Calculating γ(S, α): To calculate γ(S, α) the following programming problem should be solved: , min ∑ ∈ . ., , ∑ 1 0 (28) Note that Eq. ( 28) is a convex optimization problem in p since φ-divergence functions are convex.According to the above problem, the probability that the random variable is in the region defined by S, is at least γ(S, α) with a (1-α)-confidence level.If γ(S, α) ≥ β then the region determined by S is selected as the uncertainty set and the algorithm is terminated.Otherwise, we go to Step 5.In (Yanikoglu & den Hertog, 2012) it has been shown that an alternative way of calculating γ(S, α) is using the dual problem of Eq. ( 28) which results in the following convex problem: in which * .
Step 5. Increasing Ω: We increase Ω by the step size ω and go to Step 2.

Applications and results
We apply the proposed methods to the well-known EOQ problem, the multi-item newsvendor problem, and seven test functions.Numerical results are obtained on a CORE 2 duo notebook with 2.53GHZ processor and 3GB RAM.To fit Kriging metamodel, we employ the DACE toolbox of MATLAB.
Kullback-Leibler is used as the φ-convergence function.We use lhsnorm from the MATLAB statistics toolbox to pick points from the interval of decision variables including their extreme values.As the related robust counterpart is a nonlinear optimization problem (NLP), we apply MATLAB's function fmincon to compute the optimum robust solution, say * , for the counterpart problem of Kriging metamodel.

EOQ Inventory Model
Like Dellino et al. (2010b) we apply our methodology to the classic EOQ inventory model.Total costs per time unit C is considered as the objective function, which consists of setup cost per order K, cost per unit purchased c, and holding cost per inventory unit per time unit h.The delivery lead-time is zero, i.e., goods arrive into inventory as soon as an order is placed.Continuous review is assumed, i.e., an order is placed as soon as the inventory level falls below the prescribed reorder point, which is set to zero.Simulation uses 10,000 periods per replicate.In our example we use the parameter values of K = 12000, c = 10, and h = 0.3.To study robust simulation optimization, we assume that the customer demand in each period takes integer values between 6000 and 10000.We set equal to 0.05.Here, the objective is to determine an EOQ value between 15000 45000 to minimize the total cost per period.
The inventory models were set up in ARENA 12.We pick nine equally spaced points; including the extreme points, 15000 and 45000 (see the first column of Table 2).To obtain the frequencies for random variable (demand), we divide the range of the parameter into nine equal intervals of size 500 (the second row of Table 2).Therefore; any distribution on the demand space can be represented by a vector , … , for m=9, with the constraint ∑ 1; 0. Frequencies of the intervals are presented in the first row of Table 2. Running the simulation model, the total costs C (objective function value) are obtained for every combination of order quantities (decision variable) and demand range (random variable) that are listed in Table 2.In the first method, nine Kriging metamodel , for 1,2, … ,9 are fitted to the I/O combinations for every value of demand.The probability of occurrence of every metamodel is equal to the probability of occurrence of the random variable (demand).Employing the proposed method, the counterpart problem is written using ( 16) and is solved according to (20).The optimal solution is * 25000 and the robust optimal objective function * 87590.
For the second method, we consider probability bound 0.95, step size ω=500, and m=9.We set the number of sampling points at n=20 and using LHS we select the sampling points from the valid range of the decision and random variables and run the simulation model for them.We then fit a Kriging metamodel to the combination of I/O of simulation.In Table 3 the number of iterations, It., the value of demand, , the radius of the tight uncertainty region calculated by the algorithm, Ω, the optimal solution, * , the objective function, * , the set of cells, S, and the probability bounds satisfied by the algorithm, γ(S, α), are shown for 3 iterations at 0.9 confidence level.As can be seen, the final optimal robust solution is * 27389 and the robust objective function is equal to * 101070.The first row in Table 3 shows the answer for the nominal problem and the last row corresponds to the optimal robust solution with respect to the defined probability bound.Between the nominal solution and the solution satisfying a bound of 0.95, the optimal objective value increases by 15% while there is about 30% increase in the immunity to uncertainty.We compare the solutions to those obtained by Dellino method from (Dellino et al., 2012).In their method, two Kriging metamodels are fitted one for the mean and one for the variance of the response.
To formulate the robust optimization problem, they consider minimizing the mean of the response as the objective function while satisfying a constraint on the standard deviation.In (Dellino et al., 2012) it is assumed that the demand has a Normal distribution with mean 8000 and standard deviation 800.Here we obtain the robust solution by Dellino's method for Sc=8200.The optimal robust solution is * 32000 and the robust objective function equals to * 92883.
For comparison purposes, we also calculate the reference minimax value for EOQ problem.It is easy to verify that this problem has the following true I/O function, which we shall use to check our results: (30) where is the random variable (demand) and Q is the decision variable (order quantity).We write the counterpart problem as: min max 10 . s.t.15000 45000 5600 10400 We solve the above problem by a numerical simple algorithm to find the exact solution.The optimal solution yields Q * 28844 and C * 112650 which we use as a reference to compare the results of our methods to those proposed by Dellino et al. (2012).Results for the proposed methods and Dellino et al. (2012) method are compared by computing the absolute value of their relative deviation from reference values for the solution and the function value.The results are reported in Table 4.As it is shown the first method leads to less conservative solutions compared with other two methods but relative deviation of the second method is less than other two methods.The relative deviation of the Dellino et al. (2012) method is less than the first method.Moreover, the second method requires the least number of simulation runs which is considered an advantage especially for expensive simulation problems.

The multi-item newsvendor problem
The newsvendor problem is a single period stochastic inventory problem.In its simplest form, there is a single item with a probabilistic demand that can only be sold in one period.Due to the uncertain demand, this newsvendor can face both unsold items and unmet demand.The unsold items will return a loss, and unmet demand generates a cost of lost sales.The objective is to select the replenishment quantity to maximize the expected profit.One extension to the basic newsvendor problem is the multi-item problem that deals with optimizing the replenishment quantity of several items.For each item i, we define the purchase cost , salvage value , selling price , and the cost of lost sales .Moreover, we denote for the order quantity for item i.We assume that the demand for each item i during the period is a random variable that can take on m values, for 1,2, … , and 1,2, … , .We denote for the unknown probability that demand for item i equals .We consider n=3 items, and m=3 scenarios for each item including: =4 (low demand), 8(medium demand), and 10(high demand).The parameter values of the problem, as well as the values of are given in Table 5. points; including the extreme points (4 and 10), for , i=1,2,3.Running the simulation model, the total costs C (objective function value) are obtained for every combination of order quantities (decision variable) and demand range (random variable).In the first method, 27 Kriging metamodel , for 1,2, … ,27 are fitted to the I/O combinations for every value of demand.The probability of occurrence of every metamodel is equal to the probability of occurrence of the random variable (demand).Employing the proposed method, the counterpart problem is written using Eq. ( 16) and is solved according to Eq. ( 20).The optimal solution is * 8.9,9.2,8.9 and the robust optimal objectives function is * 34.37.For the second method, we consider probability bound 0.90, step size ω=2.We set the number of sampling points at n=36 and using LHS we select sampling points from the valid range of the decision and random variables and run the simulation model for them.We then fit a Kriging metamodel to the combination of I/O of simulation.The final optimal robust solution, for 3 iterations and 0.95 confidence level, is * 8.1,7.3,7.2 and the robust objective function equals to * 9.5.We compare the solutions to those obtained by Dellino method.Here we obtain the robust solution by Dellino's method for Sc=8.The optimal robust solution is * 6,6,6 and the robust objective function equals to * 12.346.To calculate the reference minimax value for the problem, we use the following true I/O function: , , min , max 0, max 0, , We write the counterpart problem as: s.t. 4 10 ∈ 4,8,10 We solve the above problem by a numerical simple algorithm to find the exact solution.The optimal solution is Q * 7,7,7 and C * 5 which we use as a reference to compare the results of our methods to those of Dellino et al.'s.Results for the proposed methods and Dellino et al. (2012) method are compared by computing the absolute value of their relative deviation from reference values for the solution and the function value.The results are reported in Table 6.As it is shown the first method leads to less conservative solutions compared to other two methods but relative deviation of the second method is less than the other two methods.The relative deviation of the Dellino et al. method is less than the first method.Moreover, the number of simulation runs of the second method is the least which is considered an advantage especially for expensive simulation problems.

Test Functions
Here we employ seven test functions from Marzat et al. (2013) and Rustem and Howe (2009) to show the performance of the proposed methods.These test functions are taken as black-box objective functions and the proposed method is applied to evaluate the minimax solution.The i-th decision variable is denoted by , and the j-th random variable by .The analytical expressions of functions are: We employ the first method to the test functions and set equal to 0.05.Table 7 shows the test functions, solution space of the decision variables, X, and random variables, E, the number of simulation points, n, the number of ranges for the random variables, m, the optimal robust solution, * , the corresponding value of the objective function, * , and the number of simulation runs, N. In second method, probability bound β is set to 0.9 and ω is set to 1.The first four columns of Table 8 show the test functions, number of iterations, the optimal robust solution, and the corresponding value of the objective function respectively.The fifth column presents the radius of the minimal ball in the uncertainty region, , and the sixth column gives the probability bounds satisfied by the algorithm, γ(S, α).The seventh column corresponds to the number of simulation runs required to obtain the optimal solution.We compare the resulting solutions to those obtained by Dellino et al. from (Dellino et al., 2012).Here we assume that the random variables have a Normal distribution with mean 0 and standard deviation 0.15 , where R is the range of the random variable.We set the number of replications at each design point at 10.The number of sampling points is set to 10d (d is the number of decision variables).We obtain the robust solution by Dellino et al. method for standard deviation Sc equal to the minimum standard deviation obtained from the 20 first simulated design points for every test function.
For comparison purposes we employ the results of Rustem and Howe (2009) for these test functions as reference solutions.In (Rustem & Howe, 2009) three descent methods have been compared on these test functions and lead to similar results.Table 9 shows the results for reference values and Dellino et al. method.We also study the effect of in relative deviation of the objective function and the robust solution for , .As it is shown in Fig. 1, by increasing toward 1, the relative deviation of the objective function and robust solution decreases.This is so by increasing , the second method will become similar to the mini-max method which is employed as reference for evaluation of solutions.It is worth noting that the comparison of the solutions of robust methods is not straightforward as they use different logics to obtain the robust solutions.However, we can mention the following as the main advantages of the proposed methods in comparison to other robust simulation optimization methods:  We have considered the probability of occurrence of random variables and our methods give solutions that are less conservative than those of the worst-case methods. Our proposed methods are based on the distribution-free structure of robust optimization, i.e., unlike the common robust simulation optimization approaches, we do not make any assumptions about the probability distribution of the random variables. The first method leads to tighter uncertainty sets and less conservative solutions.Furthermore, it can be applied to the constrained problems and can consider robustness in feasibility and optimality of problems.While most of the existing methods are suitable for unconstrained problems and can just consider robustness dealing with the objective function. The second method's solutions are close to the minimax solutions (but less conservative) and its level of conservatism can be controlled by the probability bound, .Furthermore, the second method requires fewer points for simulation than common robust simulation optimization methods.x(%) f(x,ε)(%)

Conclusions and ideas for future research
We have proposed two methods based on φ-divergence uncertainty to obtain the robust counterpart for metamodel-based simulation optimization problems (problems that the closed form of the objective functions or constraints are not in hand and are estimated by metamodels).The methods are applicable to problems whose probability distributions of the random variables are unknown but historical data are available.Considering the probability of occurrence of random variables leads to a tighter uncertainty set and less conservative robust solutions.
In the first method, we have found the minimax solution based on Ben-Tal method.In the second method, we use the chance constraint methodology to find the robust solution.We employed Kriging metamodel to estimate the objective function (and constraints) but other metamodels like regression and artificial neural networks can be employed too.
We have applied the proposed methods to the well-known EOQ problem, the multi-item newsvendor problem, and seven test functions and compared the results with each other and with those of Dellino et al. method. Numerical results show that the first method leads to less conservative solutions compared to the two other methods.However, the solutions of the second method are closer to the minimax solutions and its number of simulation runs are fewer than the two other methods.
An interesting subject for further research is employing other goodness-of-fit statistics in the simulation optimization context and investigating the tractability of the corresponding robust counterpart problem.
Studying the applicability of the proposed method to the complicated real life problems where the probability distribution is not in hand and only some historical demand data are given is among other extensions.

Fig. 1 .
Fig. 1.The relative deviations of the objective function and robust solution for different values of β for ,

Table 2
Results of simulation model for EOQ problem

Table 3
The optimal solution and the objective function obtained by the second method for EOQ problem

Table 5
The parameter values of the multi-item newsvendor problem

Table 6
Results and relative deviation from reference for three methods on multi-item newsvendor problem

Table 7
Results of the first method for the test functions

Table 8
Results of the second method for the test functions

Table 9
Dellino et al. (2012)nd results of the Dellino et al. method for the seven test functionsThe results for the proposed methods andDellino et al. (2012)method are compared by computing the absolute value of their relative deviations from reference values for the solution and the function values.The results are reported in Table10.As it is shown, the results of the second method are closer to the reference values than those of the first method and Dellino et al. method.The mean of deviation for the values of the objective function is 1.96 for the second method while this amount is 28.40 for the first method and 71.06 for Dellino et al. method.Moreover, the mean deviation for the values of the robust solution is 2.40 for the second method, 32.97 for the first method and 18.00 forDellino et al. method.Therefore, the second method obtains better minimax solutions than the first method and Dellino et al. method.Moreover, number of simulation runs in second method is significantly fewer than the two other methods.

Table 10
Relative deviations of the two proposed methods and Dellino et al. method from reference value