A survey on probabilistically constrained optimization problems

Probabilistically constrained optimization problems are an important class of stochastic programming problems with wide applications in finance, management and engineering planning. In this paper, we summarize some important solution methods, including convex approximation, the DC approach, the scenario approach and integer programming approaches. We also discuss some future research perspectives on probabilistically constrained optimization problems.

1. Introduction. Consider the following optimization problem:

(P)  min f(x)
     s.t. c_i(x, ξ) ≤ 0, i = 1, …, m,
          x ∈ X,

where f : ℜ^n → ℜ and c_i(·, ξ) : ℜ^n → ℜ (i = 1, …, m) are real-valued functions, ξ ∈ ℜ^s is a parameter vector and X is a closed convex subset of ℜ^n. When ξ is a fixed coefficient vector and f and the c_i are convex functions, problem (P) is a conventional deterministic convex optimization problem which can be solved by many efficient algorithms (see, e.g., [8,27]). However, in many applications of optimization in finance, engineering and management, the parameter vector ξ is uncertain. For example, ξ may represent the returns of stocks, market demands or agricultural production, which can be regarded as random variables with full or partial information about their distributions. A simple way to handle the parameter uncertainty in (P) is to ignore the randomness of ξ by replacing it with an estimate or a guess; for instance, we may use the mean value of ξ, which is an unbiased estimate, to replace ξ in (P). A more reliable way to control the risk of constraint violation is to require the constraints to hold with a prescribed high probability, which leads to the following probabilistically constrained problem:

(PCP)  min f(x)
       s.t. P{c_i(x, ξ) ≤ 0, i = 1, …, m} ≥ 1 − α,
            x ∈ X,

where α ∈ (0, 1) is a given risk tolerance level.

1.1. Two Examples. We now give two examples to show the applications of problem (PCP).
Example 1. (Portfolio selection) Let ξ = (ξ_1, …, ξ_n)^T be the random returns of n risky assets with mean vector μ and covariance matrix Σ. One of the classical mean-variance portfolio selection models of Markowitz is to minimize the variance of the portfolio while requiring that its mean return be equal to or higher than a prescribed return level R. A more reasonable constraint for controlling the random return is the following probabilistic constraint:

P{ξ^T x ≥ R} ≥ 1 − α,

where α ∈ (0, 1) is the risk level. This constraint is also called a Value-at-Risk (VaR) constraint in financial engineering. The VaR-constrained portfolio selection model can be expressed as:

min x^T Σ x
s.t. P{ξ^T x ≥ R} ≥ 1 − α,
     x ∈ X,

where X is a set defined by deterministic linear constraints. VaR-based portfolio selection models have drawn much attention in financial engineering in the last decade (see [1,2,5,7,15]).
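When ξ is additionally assumed to be normally distributed (an assumption not required by the model itself), the VaR constraint admits the closed deterministic form μ^T x − Φ^{−1}(1 − α)·√(x^T Σ x) ≥ R. The following sketch, with hypothetical two-asset data, checks this form:

```python
import math
from statistics import NormalDist

def var_constraint_satisfied(x, mu, Sigma, R, alpha):
    """Check P(xi^T x >= R) >= 1 - alpha for xi ~ N(mu, Sigma), using the
    closed form: mu^T x - Phi^{-1}(1 - alpha) * sqrt(x^T Sigma x) >= R."""
    n = len(x)
    mean = sum(mu[i] * x[i] for i in range(n))
    var = sum(x[i] * Sigma[i][j] * x[j] for i in range(n) for j in range(n))
    return mean - NormalDist().inv_cdf(1 - alpha) * math.sqrt(var) >= R

# hypothetical two-asset data
mu = [0.10, 0.06]
Sigma = [[0.04, 0.01], [0.01, 0.02]]
x = [0.5, 0.5]                      # equal-weight portfolio
print(var_constraint_satisfied(x, mu, Sigma, R=0.0, alpha=0.3))   # True
print(var_constraint_satisfied(x, mu, Sigma, R=0.0, alpha=0.05))  # False
```

Tightening the risk level α shrinks the feasible region, as the two calls illustrate.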
Example 2. (Transportation problem) Suppose that a product has n suppliers and m major clients. The capacity of supplier i is M_i, i = 1, …, n, and the unit shipping cost from supplier i to client j is c_ij. Suppose that the demand vector ξ = (ξ_1, …, ξ_m)^T is random. The transportation problem can be modeled as

min Σ_{i=1}^n Σ_{j=1}^m c_ij x_ij
s.t. P{Σ_{i=1}^n x_ij ≥ ξ_j, j = 1, …, m} ≥ 1 − α,
     Σ_{j=1}^m x_ij ≤ M_i, i = 1, …, n,
     x_ij ≥ 0, i = 1, …, n, j = 1, …, m,

where x_ij is the amount of product shipped from supplier i to client j. This is a stochastic version of the classical transportation problem (see [12,14,24,30]). Similar probabilistically constrained linear programs have been used in inventory/production management in multi-period supply chains (see [22]).
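For a fixed shipping plan and a finite demand distribution, the joint chance constraint above can be checked by direct enumeration of scenarios. A sketch with hypothetical data (all numbers are illustrative, not from the references):

```python
# Sketch (hypothetical data): check the joint chance constraint of a fixed
# shipping plan x[i][j] under discrete demand scenarios for xi = (xi_1, ..., xi_m).
supply = [10.0, 8.0]                       # capacities M_i of n = 2 suppliers
x = [[4.0, 5.0], [3.0, 4.0]]               # shipments x[i][j] to m = 2 clients
scenarios = [([6.0, 8.0], 0.4), ([7.0, 9.0], 0.3), ([8.0, 10.0], 0.3)]
alpha = 0.4

# capacity constraints are deterministic
assert all(sum(x[i]) <= supply[i] for i in range(len(supply)))

# P{ sum_i x[i][j] >= xi_j for all j } >= 1 - alpha ?
delivered = [sum(x[i][j] for i in range(len(x))) for j in range(len(x[0]))]
prob = sum(p for demand, p in scenarios
           if all(delivered[j] >= demand[j] for j in range(len(demand))))
print(delivered, prob >= 1 - alpha)
```

Here the plan delivers (7, 9), which meets demand in the first two scenarios (total probability 0.7 ≥ 1 − α = 0.6), so the chance constraint holds.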
1.2. Two convex cases. We now discuss two special cases where problem (PCP) can be converted into a convex program. For simplicity, we suppose that m = 1 and c(x, ξ) = c_1(x, ξ). We first consider the log-concave distribution case, where
• ξ is a continuous random vector with log-concave density function p(ξ); and
• c(x, ξ) is a quasi-convex function of (x, ξ).
Let W = {(x, ξ) : c(x, ξ) ≤ 0}, which is a convex set, and let 1_W denote its indicator function, which equals 1 on W and 0 otherwise, so that log(1_W) equals 0 on W and −∞ otherwise. Thus 1_W is a log-concave function of (x, ξ). It follows that 1_W(x, ξ)p(ξ) is also log-concave because log(1_W · p(ξ)) = log(1_W) + log(p(ξ)) is a concave function of (x, ξ). By Prékopa's theorem, the marginal function

G(x) = P{c(x, ξ) ≤ 0} = ∫_{ℜ^s} 1_W(x, ξ) p(ξ) dξ

is then log-concave in x. Now, the probabilistic constraint P{c(x, ξ) ≤ 0} ≥ 1 − α can be equivalently expressed as the following convex constraint:

log(1 − α) − log P{c(x, ξ) ≤ 0} ≤ 0.

Thus, the feasible set of (PCP) is convex.
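As a quick numerical sanity check of this log-concavity (a sketch, not part of the original derivation): take c(x, ξ) = ξ − x with ξ ~ N(0, 1), so both assumptions hold and G(x) = P{c(x, ξ) ≤ 0} = Φ(x); midpoint log-concavity of Φ can then be verified on a grid:

```python
import math
from statistics import NormalDist

# With c(x, xi) = xi - x (jointly convex, hence quasi-convex) and xi ~ N(0, 1)
# (log-concave density), G(x) = P{c(x, xi) <= 0} = Phi(x) should be
# log-concave in x; we check midpoint log-concavity numerically on a grid.
Phi = NormalDist().cdf

def logG(x):
    return math.log(Phi(x))

grid = [0.5 * i for i in range(-8, 9)]   # x in [-4, 4]
ok = all(logG((a + b) / 2) >= 0.5 * (logG(a) + logG(b)) - 1e-12
         for a in grid for b in grid)
print(ok)
```

The check confirms log(Φ) is midpoint-concave on the grid, consistent with the theory.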
Another convex case of (PCP) is linear programming with normally distributed parameters. Suppose c(x, ξ) = ξ^T x − b, where ξ ∼ N(ā, Σ). Then, for x ≠ 0, ξ^T x ∼ N(ā^T x, x^T Σ x), and the probability on the left-hand side of P(ξ^T x ≤ b) ≥ 1 − α is simply Φ((b − ā^T x)/‖Σ^{1/2} x‖), where Φ is the cumulative distribution function of a standard Gaussian random variable. Thus, the probabilistic constraint P(ξ^T x ≤ b) ≥ 1 − α is equivalent to

ā^T x + Φ^{−1}(1 − α) ‖Σ^{1/2} x‖ ≤ b,

which is a second-order cone constraint when α ∈ (0, 0.5). Note that Φ^{−1}(1 − α) > 0 when α ∈ (0, 0.5).
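This equivalence can be checked numerically on hypothetical data; the sketch below uses a diagonal Σ (independent components) so that ξ can be sampled with the standard library, and compares the deterministic second-order cone form with a Monte Carlo estimate of the probability:

```python
import math
import random
from statistics import NormalDist

random.seed(0)
a_bar = [1.0, 2.0]        # mean vector of xi (hypothetical data)
sigma = [0.5, 1.0]        # xi has independent components, i.e. diagonal Sigma
x = [1.0, 0.5]
b, alpha = 3.0, 0.1

# deterministic second-order cone form:
# a_bar^T x + Phi^{-1}(1 - alpha) * ||Sigma^{1/2} x|| <= b
lhs = (sum(a * xi for a, xi in zip(a_bar, x))
       + NormalDist().inv_cdf(1 - alpha)
       * math.sqrt(sum((s * xi) ** 2 for s, xi in zip(sigma, x))))
socp_feasible = lhs <= b

# Monte Carlo estimate of P(xi^T x <= b) for comparison
N = 200_000
hits = sum(
    sum(random.gauss(a, s) * xi for a, s, xi in zip(a_bar, sigma, x)) <= b
    for _ in range(N))
prob = hits / N
print(socp_feasible, round(prob, 3))
```

For this data the cone constraint holds and the estimated probability exceeds 1 − α = 0.9, as the equivalence predicts.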
1.3. Major difficulties. The major difficulties of the probabilistically constrained problem are due to the nonconvexity of the feasible set. Except for some special cases such as the two convex cases discussed in the previous subsection, the constraints P{c(x, ξ) ≤ 0} ≥ 1 − α are in general nonconvex (see [16,17,20,28]).
Example 3. Consider a 2-dimensional example in which ξ ∈ ℜ² is a discrete random variable whose distribution is given in Table 1. Figure 1 shows the feasible set S, from which we can see that S is nonconvex.

In this paper, we review some important approaches to and recent progress on probabilistically constrained optimization problems. In particular, we investigate convex approximation methods, the DC approach, integer programming methods and scenario approximation. We also discuss research perspectives and challenging problems in this field.
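The nonconvexity is easy to reproduce: with a hypothetical two-point distribution (not the one of Table 1), the feasible set is a union of half-planes, and the midpoint of two feasible points can be infeasible:

```python
# Hypothetical two-point distribution (not the one in Table 1):
# xi = (1, 0) or (0, 1), each with probability 1/2, and the constraint is
# P{xi^T x <= 1} >= 1 - alpha with alpha = 1/2, so the feasible set is
# {x : x_1 <= 1} union {x : x_2 <= 1}, which is nonconvex.
scenarios = [((1.0, 0.0), 0.5), ((0.0, 1.0), 0.5)]

def feasible(x, alpha=0.5):
    prob = sum(p for xi, p in scenarios if xi[0] * x[0] + xi[1] * x[1] <= 1.0)
    return prob >= 1 - alpha

x1, x2 = (0.0, 5.0), (5.0, 0.0)
mid = ((x1[0] + x2[0]) / 2, (x1[1] + x2[1]) / 2)
print(feasible(x1), feasible(x2), feasible(mid))  # True True False
```

Both endpoints satisfy the chance constraint through a different scenario, while their midpoint violates it in every scenario.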

2. Convex Approximation Approaches. The idea of convex approximation is to construct a convex inner (conservative) approximation of the nonconvex feasible set of (PCP). Nemirovski and Shapiro [25] gave a unified method to derive convex approximations of (PCP). For simplicity, we consider the case m = 1, i.e., c(x, ξ) = c_1(x, ξ) is a real-valued function of x. Also, we assume that c(x, ξ) is a convex function of x for any ξ.
Let ψ : ℜ → ℜ be a nonnegative, convex and nondecreasing function satisfying ψ(z) ≥ ψ(0) = 1 for all z > 0. Then 1_{(0,+∞)}(z) ≤ ψ(z/t) for any t > 0, so for any t > 0 and random variable Z, we have

(1) P(Z > 0) = E[1_{(0,+∞)}(Z)] ≤ E[ψ(t^{−1}Z)].

Letting Z = c(x, ξ) in (1) and multiplying through by t, we obtain that

(2) Ψ(x, t) − tα ≤ 0, where Ψ(x, t) := t E[ψ(t^{−1}c(x, ξ))],

is a convex conservative approximation of the probabilistic constraint for any t > 0. Furthermore, we can take the infimum over t > 0 in (2) to obtain the tightest convex approximation:

(3) inf_{t>0} {Ψ(x, t) − tα} ≤ 0.

Note that the left-hand side is still a convex function of x since Ψ(x, t) is convex in (t, x) for t > 0. Accordingly, the convex approximation to problem (PCP) is

min f(x) s.t. inf_{t>0} {Ψ(x, t) − tα} ≤ 0, x ∈ X.

We now discuss some special cases of (2) and (3) for different choices of ψ(z).
Taking ψ(z) = [1 + z]_+, where [u]_+ := max{u, 0}, constraint (3) becomes, after a change of variable in t,

(4) inf_t { t + (1/α) E[c(x, ξ) − t]_+ } ≤ 0,

where the restriction on t can be relaxed to t ∈ ℜ. Define the Conditional Value-at-Risk (CVaR) of a random variable z (see [29]):

CVaR_{1−α}(z) := inf_{t∈ℜ} { t + (1/α) E[z − t]_+ }.

It is easy to see that (4) is equivalent to CVaR_{1−α}(c(x, ξ)) ≤ 0. Thus (4) is also called the CVaR approximation. It can be proved that CVaR is the best convex approximation to the probabilistic constraint (see [11]).
Assume that ξ has a finite discrete distribution, i.e., ξ takes values ξ^i, i = 1, …, N, with equal probability 1/N. The CVaR approximation to problem (PCP) can then be written as the following convex program:

min f(x)
s.t. t + 1/(αN) Σ_{i=1}^N [c(x, ξ^i) − t]_+ ≤ 0,
     x ∈ X, t ∈ ℜ.

Taking ψ(z) = ([1 + z]_+)² in (1), we obtain the following Chebyshev bound:

P(c(x, ξ) > 0) ≤ t^{−2} E[([c(x, ξ) + t]_+)²].

The corresponding convex approximation to the probabilistic constraint is

(5) E[([c(x, ξ) + t]_+)²] ≤ αt²,

which can be written as (E[([c(x, ξ) + t]_+)²])^{1/2} ≤ √α t. Note that E([u]_+²) ≤ E(u²) for any random variable u. Replacing E[([c(x, ξ) + t]_+)²] with E[(c(x, ξ) + t)²] in (5), we obtain a more conservative approximation:

E[(c(x, ξ) + t)²] ≤ αt²,

which can be written as

2E[c(x, ξ)] + t^{−1}E[c(x, ξ)²] + (1 − α)t ≤ 0.

Minimizing the left-hand side over all t > 0 gives t* = {E[c(x, ξ)²]/(1 − α)}^{1/2}. This t* gives the tightest approximation:

E[c(x, ξ)] + {(1 − α) E[c(x, ξ)²]}^{1/2} ≤ 0.

The above approximation depends only on the first and second moments of c(x, ξ).
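For the equal-probability discrete case, the CVaR constraint can be evaluated directly: the minimand in the CVaR formula is piecewise linear and convex in t, so the minimum is attained at one of the sample values. A small sketch with hypothetical samples, which also shows the conservatism of the approximation (the empirical violation probability meets the level α even though the CVaR test fails):

```python
# Discrete CVaR for equal-probability samples:
#   CVaR_{1-alpha}(Z) = min_t { t + (1/(alpha*N)) * sum_i [z_i - t]_+ },
# where the minimum is attained at one of the sample values.
def cvar(samples, alpha):
    n = len(samples)
    def obj(t):
        return t + sum(max(z - t, 0.0) for z in samples) / (alpha * n)
    return min(obj(t) for t in samples)

# hypothetical sample of c(x, xi^i) values at a fixed x
samples = [-3.0, -2.0, -1.5, -1.0, -0.5, -0.4, -0.3, -0.2, -0.1, 0.5]
alpha = 0.2
viol = sum(z > 0 for z in samples) / len(samples)   # empirical P(c > 0)
print(round(cvar(samples, alpha), 6), viol)
# viol = 0.1 <= alpha, yet CVaR = 0.2 > 0: the CVaR test is conservative
```

This point satisfies the empirical chance constraint but is rejected by the CVaR approximation, illustrating that the approximation is an inner one.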
3. DC Approach. Hong et al. [18] proposed a novel approach to obtain a tighter approximation of the probability function P(c(x, ξ) > 0) using DC (difference of convex) functions. Successive convex approximation methods can then be employed to solve the resulting DC optimization problem.
Recall that P(z > 0) = E[1_{(0,+∞)}(z)]. We thus focus on constructing a DC approximation to the indicator function 1_{(0,+∞)}(z). For t > 0, let

ψ(z, t) := t^{−1}[z + t]_+ and φ(z, t) := t^{−1}[z]_+.

Then, the DC function π(z, t) := ψ(z, t) − φ(z, t) is a DC approximation of the indicator function 1_{(0,+∞)}(z), which is tighter than the CVaR approximation function ψ(z, t). Under some mild conditions, we can prove that (PCP) is actually equivalent to the following problem (see [18]):

min f(x) s.t. inf_{t>0} E[π(c(x, ξ), t)] ≤ α, x ∈ X.

Taking a small t = ε > 0, the above problem can be approximated by the following DC optimization problem:

min f(x) s.t. E[(c(x, ξ) + ε)_+] − E[(c(x, ξ))_+] ≤ εα, x ∈ X.

Combined with a Monte Carlo method to estimate the expectation and gradient, a sequential convex method was proposed in [18] to solve the DC optimization problem.
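The tightness claim can be checked pointwise. A sketch assuming the ε-approximation form of [18], π(z, t) = ([z + t]_+ − [z]_+)/t, verifies the sandwich 1_{(0,+∞)}(z) ≤ π(z, t) ≤ ψ(z, t) on a grid:

```python
# DC approximation of the indicator 1_{(0,+inf)}(z), in the form used by
# Hong et al.-style epsilon-approximations: pi(z,t) = ([z+t]_+ - [z]_+)/t,
# which satisfies 1_{(0,+inf)}(z) <= pi(z,t) <= psi(z,t) := [z+t]_+/t, t > 0.
def pos(z):
    return max(z, 0.0)

def psi(z, t):   # CVaR-type convex majorant
    return pos(z + t) / t

def pi(z, t):    # tighter DC majorant: difference of two convex functions
    return (pos(z + t) - pos(z)) / t

t = 0.5
grid = [i * 0.1 for i in range(-30, 31)]
indicator = lambda z: 1.0 if z > 0 else 0.0
print(all(indicator(z) <= pi(z, t) + 1e-12 <= psi(z, t) + 2e-12 for z in grid))
```

For z > 0 the DC function equals the indicator exactly, while the convex majorant ψ keeps growing, which is why the DC approximation is tighter.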
4. Integer Programming Approach. Suppose that ξ has a finite discrete distribution taking values ξ^j, j = 1, …, N, with equal probability 1/N, and introduce a binary variable z_j for each scenario, where z_j = 1 indicates that the constraints are allowed to be violated in scenario j. Then, problem (PCP) can be reformulated as the following mixed-integer 0-1 program (see [30]):

(MIP) min f(x)
      s.t. c_i(x, ξ^j) ≤ M_j z_j, i = 1, …, m, j = 1, …, N,
           Σ_{j=1}^N z_j ≤ ⌊Nα⌋,
           x ∈ X, z ∈ {0, 1}^N,

where the M_j are sufficiently large (big-M) constants. When f and the c_i(·, ξ^j) are convex functions and X is a convex set, (MIP) is a convex mixed-integer 0-1 programming problem which can be solved by continuous-relaxation-based branch-and-bound or branch-and-cut methods. For instance, when f(x) is a linear or quadratic function of x, c_i(x, ξ) is linear in x for each i, and X is defined by linear constraints, we can use commercial software such as CPLEX to solve (MIP). However, due to the poor quality of the continuous relaxation of (MIP) and its subproblems during the branch-and-bound process, a direct application of the branch-and-bound method to (MIP) is often inefficient and can only handle small-size problems.
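A toy illustration of this scenario-based 0-1 reformulation (hypothetical data; a real solver would use branch-and-bound rather than enumeration): for the one-dimensional problem min x subject to P(ξ ≤ x) ≥ 1 − α with equally likely scenarios, choosing which K = ⌊Nα⌋ scenarios to violate can be brute-forced:

```python
from itertools import combinations
from math import floor

# Toy instance (hypothetical data): minimize x subject to
# P(xi <= x) >= 1 - alpha, with N equally likely scenarios xi^j.
# In the 0-1 reformulation, z_j = 1 allows scenario j to be violated and
# sum_j z_j <= K = floor(N * alpha); here we brute-force all choices of z.
scenarios = [2.0, 7.0, 3.0, 9.0, 4.0, 8.0, 1.0, 6.0, 5.0, 10.0]
alpha = 0.2
N = len(scenarios)
K = floor(N * alpha)     # at most K scenarios may be violated

best = min(
    max(scenarios[j] for j in range(N) if j not in set(dropped))
    for k in range(K + 1)
    for dropped in combinations(range(N), k))
print(best)  # dropping the two largest scenarios (9.0, 10.0) gives 8.0
```

The combinatorial choice of which scenarios to drop is exactly what makes the exact MIP hard at large N.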
Zheng et al. [32] considered a quadratic case of (PCP) in which the objective f is a convex quadratic function and the random constraint is linear in x, of the form ξ^T Bx ≤ b. Suppose that ξ takes values ξ^i, i = 1, …, N, with equal probability. Let l_b^i and u_b^i be lower and upper bounds of (ξ^i)^T Bx over X, i = 1, …, N. Then (PCP) can be written as a mixed-integer 0-1 convex quadratic program (MIQP_0), in which a binary variable z_i and the bounds l_b^i, u_b^i are used to enforce the scenario constraint (ξ^i)^T Bx ≤ b when z_i = 0 and to relax it when z_i = 1, subject to the cardinality constraint e^T z ≤ K, where K = ⌊Nα⌋ and e is the column vector of all ones.
In order to obtain a tighter MIQP reformulation, whose continuous relaxation is tighter than that of (MIQP_0), one can construct a semidefinite programming (SDP) relaxation of (MIQP_0) via Lagrangian decomposition and then derive a new second-order cone programming (SOCP) reformulation of (MIQP_0). Let Θ denote a suitable set of decomposition parameters. For any θ ∈ Θ, consider an equivalent formulation (P_θ) of problem (MIQP_0), obtained by introducing auxiliary variables v_i = (ξ^i)^T Bx with v = (v_1, …, v_N)^T. Note that the objective function in (P_θ) is decomposed as the sum of a convex nonseparable quadratic function of x and a separable quadratic function of v. The constraint v_i = (ξ^i)^T Bx in (P_θ) can be viewed as a link constraint for the ith scenario. Associating a multiplier λ_i with each constraint v_i = (ξ^i)^T Bx, we obtain a Lagrangian relaxation of (P_θ) with dual function d(λ). For any θ ∈ Θ and λ ∈ ℜ^N, we have d(λ) ≤ v(P_θ) = v(P). The dual problem of (P_θ) is (D_θ): max_{λ∈ℜ^N} d(λ). Let (QP) denote the continuous relaxation of (MIQP_0). It is easy to show that for any θ ∈ Θ ∩ ℜ^N_+, it holds that v(D_θ) ≥ v(QP). It was shown in [32] that (D_θ) can be reduced to an SDP problem; moreover, for any fixed θ ∈ Θ with θ ≥ 0, problem (D_θ) can actually be reduced to an SOCP problem. A tighter MIQP reformulation based on this bound is then derived in [32], where different choices of θ are also discussed.
When the functions involved in (PCP) are linear, X is defined by linear constraints, and the probabilistic constraint has the special form P{T x ≥ ξ} ≥ 1 − α, valid inequalities and strengthened mixed-integer linear programming reformulations were presented in [24] and [19].

5. Other Methods. Several other methods have been proposed in the literature for general or special probabilistically constrained optimization problems:
• scenario method ([9,26]);
• sample average method ([23]);
• p-efficient point method ([12,13,14,21]);
• MIP reformulations for the stochastic set covering problem ([6,31]).
We now briefly introduce the first two methods.

5.1. Scenario method. The scenario method directly imposes the constraints at a finite number of samples or scenarios ξ^j (j = 1, …, N), resulting in the deterministic optimization problem

(SP) min f(x) s.t. c(x, ξ^j) ≤ 0, j = 1, …, N, x ∈ X.

The scenario method has two advantages: (1) it makes no assumption on the distribution of ξ; (2) if f, c and X are convex, (SP) is also a convex optimization problem. Therefore, when the number of samples N is not large, problem (SP) can be solved easily. However, the disadvantage of the scenario method is also obvious: since different samples lead to different constraints, problem (SP) itself is random and hence its optimal solution is also random. An important question is how to guarantee that the optimal solution of (SP) is feasible to (PCP) with high probability. It was shown in [9] that if the number of samples N exceeds an explicit bound depending on the dimension n, the risk level α and a confidence parameter δ, then the optimal solution of (SP) is feasible to (PCP) with probability 1 − δ ∈ (0, 1) (see also [26]).
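A minimal numerical illustration of the randomness of the scenario solution (a sketch for the toy problem min x subject to P(ξ ≤ x) ≥ 1 − α with ξ ~ N(0, 1)): the scenario solution is the sample maximum, and its true violation probability can be computed exactly:

```python
import random
from statistics import NormalDist

random.seed(1)
# Toy problem: min x s.t. P(xi <= x) >= 1 - alpha with xi ~ N(0, 1).
# The scenario problem (SP) enforces xi^j <= x for all N sampled scenarios,
# so its optimal solution is simply the sample maximum.
N, alpha = 500, 0.05
x_star = max(random.gauss(0.0, 1.0) for _ in range(N))

# exact violation probability of the (random) scenario solution
violation = 1.0 - NormalDist().cdf(x_star)
print(round(x_star, 3), violation < alpha)
```

With N = 500 samples the violation probability of the scenario solution is far below α = 0.05 with overwhelming probability, consistent with the feasibility guarantees of [9].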

5.2. Sample average method. The sample average method is a simple way to estimate the probability P{c(x, ξ) ≤ 0}. Suppose ξ^1, …, ξ^N are N independent samples of the random vector ξ. For any given ε ∈ (0, 1), consider the following sample average problem:

(SAP) min f(x) s.t. (1/N) Σ_{j=1}^N 1_{(0,+∞)}(c(x, ξ^j)) ≤ ε, x ∈ X.
Obviously, when ε = 0, problem (SAP) is exactly the scenario approximation problem (SP). It was shown in [23] that when ε > α, (SAP) yields a lower bound on the optimal value of the original problem with probability converging to 1 at an exponential rate. When ε < α, under certain conditions, the optimal solution of the sample average problem (SAP) is feasible to the original problem with high probability.
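The effect of the threshold ε can be seen on a toy problem (a sketch: min x subject to P(ξ > x) ≤ α with ξ ~ N(0, 1)), where the (SAP) optimum is simply an order statistic of the sample:

```python
import random

random.seed(2)
# Toy problem: min x s.t. P(xi > x) <= alpha, xi ~ N(0, 1). The sample
# average problem (SAP) allows an empirical violation rate of at most eps,
# so its optimum is an order statistic of the sample.
samples = sorted(random.gauss(0.0, 1.0) for _ in range(1000))
N = len(samples)

def saa_optimal(eps):
    m = int(round(N * eps))     # number of samples allowed to exceed x
    return samples[N - m - 1]   # smallest x violated by at most m samples

alpha = 0.1
v_scenario = saa_optimal(0.0)       # eps = 0: the scenario problem (SP)
v_conservative = saa_optimal(0.05)  # eps < alpha: feasible w.h.p.
v_relaxed = saa_optimal(0.15)       # eps > alpha: lower bound w.h.p.
print(v_relaxed < v_conservative < v_scenario)  # True
```

Larger ε discards more scenarios and yields a smaller (more optimistic) optimal value, matching the lower-bound/feasibility dichotomy described above.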
6. Conclusions and Research Perspectives. Parameter uncertainty or ambiguity arises in many practical applications of optimization models. Optimization under uncertainty has drawn much attention in recent years and is one of the most important and challenging topics in modern optimization and operations research.
Probabilistic (chance) constraints are one of the most popular ways to deal with uncertainty. Probabilistically constrained programming has been successfully used to model optimization problems in many application areas such as finance, engineering and management. However, in contrast to the well-developed theory, methods and software for (deterministic) convex optimization problems, great research efforts are still needed in the study of probabilistically constrained optimization. When the parameter ambiguity is described by a simple uncertainty set, the optimization problem under set uncertainty can be dealt with by robust optimization methods. When the uncertainty sets have special forms such as boxes or ellipsoids, the resulting robust problem can be reduced to a second-order cone programming or semidefinite programming problem, which can be efficiently solved by interior-point methods. The major disadvantage of the robust optimization approach is that the optimal solution of the robust model is often too conservative to be useful in practical situations.
There are many challenging problems in probabilistically constrained optimization. One interesting research topic is the study of approximation methods for mixed-integer programming models with large sample or scenario sizes. When there is no prior knowledge about the distribution of the random variables, it is natural to use historical data or simulated samples to approximate the distribution. The resulting model is a mixed-integer 0-1 program with special structure. Due to the large sample size, it is very difficult, if not impossible, to solve the model to global optimality even with the most powerful commercial mixed-integer programming solvers. A heuristic method based on penalty decomposition and the alternating direction method was proposed in [3] to solve the mixed-integer programming model of the probabilistically constrained problem. Further efforts are needed to design more efficient approximation methods for probabilistically constrained problems arising from real-world applications in finance, management and engineering.