A ladder method for linear semi-infinite programming

This paper presents a new method for linear semi-infinite programming. With the 
introduction of the so-called generalized ladder point, a ladder method for 
linear semi-infinite programming is developed. This work includes the 
generalization of the inclusive cone version of the fundamental theorem of linear 
programming and the extension of a linear programming ladder algorithm. The extended 
ladder algorithm finds a generalized ladder point optimal solution of the linear 
semi-infinite programming problem, which is approximated by a sequence of ladder points. 
Simple convergence properties are provided. The algorithm is tested by solving a number 
of linear semi-infinite programming examples. These numerical results indicate that the 
algorithm is very efficient when compared with other methods.

Successful LSIP methods have to deal with the infinitely many constraints efficiently and should avoid common drawbacks that cause numerical difficulties and slow convergence. These drawbacks include increasingly large subproblem sizes (e.g. cutting plane, discretization), non-linear subproblems (e.g. local reduction, dual parameterization), and feasibility checks over densely sampled index points. Many improvements of various semi-infinite programming (SIP) methods have been made by reducing the subproblem sizes, relaxing the need for a global optimal solution when non-linear subproblems occur, or reducing the number of iterations to speed up the convergence. The exchange methods, the relaxed cutting plane methods ([18]), and the accelerated central cutting plane method ([2]) are examples of efforts to improve numerical performance.
In this paper, we present a new approach to LSIP that is obtained by generalizing a linear programming (LP) method ([12]). We first provide an optimality result for LSIP, which can be considered as an LSIP extension of the inclusive cone version of the fundamental theorem of linear programming. The idea of this optimality result, based on the notion of inclusive cone, naturally leads to the so-called centered climbing algorithm. This algorithm works in the following way. At the beginning of the algorithm, an initial artificial ladder is obtained or constructed. Starting from this initial ladder, a sequence of ladders is constructed iteratively, giving rise to a sequence of approximate solutions which converges to a special optimal solution, a feasible generalized ladder point.
Before giving the problem formulation, we first set up some notational conventions for representing matrices, including vectors in Euclidean spaces. When the contents of a matrix need to be displayed, we will often present them in row form, with elements in the same row separated by commas and different rows separated by semicolons. For example, A = [a, b; c, d] represents the 2 × 2 matrix whose first-row elements are a and b and whose second-row elements are c and d, in the given order. If a_1, a_2, · · · , a_m are n-dimensional row vectors and b_1, b_2, · · · , b_n are m-dimensional column vectors, then A = [a_1; a_2; · · · ; a_m] and B = [b_1, b_2, · · · , b_n] are both m × n matrices. Furthermore, with this notation, the elements of row and column vectors, say a_1 and b_1 above, can be conveniently displayed as a_1 = [a_11, a_12, · · · , a_1n] and b_1 = [b_11; b_21; · · · ; b_m1]. We will use R^n and R_n to represent the n-column and the n-row vector spaces, respectively.
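For readers who wish to experiment numerically, the row-form notation above maps directly onto NumPy array construction. The following is an illustrative sketch (the array names are ours, not from the paper):

```python
import numpy as np

# The paper's row-form notation A = [a, b; c, d] corresponds to a
# nested-list NumPy construction (rows separated by semicolons).
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])          # A = [1, 2; 3, 4]

# Row vectors a_1, ..., a_m stacked as A = [a_1; a_2; ...; a_m]:
a1 = np.array([1.0, 2.0])
a2 = np.array([3.0, 4.0])
assert np.array_equal(np.vstack([a1, a2]), A)

# Column vectors b_1, ..., b_n side by side as B = [b_1, ..., b_n]:
b1 = np.array([[1.0], [3.0]])
b2 = np.array([[2.0], [4.0]])
assert np.array_equal(np.hstack([b1, b2]), A)
```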
In this paper, we consider the following LSIP problem.
Problem P.
Assumption 3. If the feasible region of problem P is non-empty, it contains a Slater point.
The outline of this paper is as follows. Following this introductory section, in Section 2, the concepts of inclusive cone ([12]) and generalized ladder point are introduced for linear semi-infinite programming problems, and an optimality result is given in terms of generalized ladder points. In Section 3, a technique for finding an initial artificial ladder is first discussed. This is followed by the main algorithm and a number of comments on the details of its iterative steps; at the end of the section a simple convergence result is presented. Before a concluding summary that ends the paper, the efficiency and performance of the algorithm are demonstrated by solving a number of numerical examples in Section 4.
2. An optimality result. For linear programming problems, it is shown in [12,13] that the notion of inclusive cone can play an important part in developing the classical linear programming theory including the optimality and the duality results. It can also be used to develop a class of algorithms for solving LP problems numerically, namely the ladder algorithms ( [12]). Similar ideas apply to the LSIP case. We present an optimality result in this section.
Note that the inclusive cone concept applies directly to problem P. The cone generated by n linearly independent normal vectors a^T(t_k), k = 1, 2, · · · , n, where t_k ∈ [α, β], is called an inclusive cone if it contains the vector −c. If the cone generated by a^T(t_k), k = 1, 2, · · · , n, is an inclusive cone, we call the index set I = {t_1, t_2, · · · , t_n} the generator of the inclusive cone and the region L(I) = {x ∈ R^n | a(t_k)x ≤ b(t_k), k = 1, 2, · · · , n} the associated inclusive region or ladder. The index set I = {t_1, t_2, · · · , t_n} is also referred to as the generator of L(I).
The unique solution of the linear system a(t_k)x = b(t_k), k = 1, 2, · · · , n, denoted by x_I, is called the ladder point of L(I). Since the feasible region of an LSIP problem is in general no longer a polytope, inclusive cone based linear programming results may not automatically apply to LSIP. For a proper generalization, we extend the concept of ladder point in the following.
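The two defining computations, checking that a set of index points generates an inclusive cone and solving for the corresponding ladder point, are both small linear-algebra tasks. The following is a minimal sketch under our own toy data (the function names and the 2-D example are ours, not from the paper):

```python
import numpy as np

def is_inclusive(A_I, c, tol=1e-12):
    """A_I: n-by-n matrix whose rows are a(t_k); c: cost vector.
    The cone generated by the columns a^T(t_k) contains -c iff the
    coefficient vector lam solving A_I^T lam = -c is nonnegative."""
    lam = np.linalg.solve(A_I.T, -c)
    return bool(np.all(lam >= -tol))

def ladder_point(A_I, b_I):
    """Unique solution x_I of a(t_k) x = b(t_k), k = 1, ..., n."""
    return np.linalg.solve(A_I, b_I)

# Toy 2-D illustration (hypothetical data): constraints x1 <= 1 and
# x2 <= 1 with cost vector c = [-1, -1], so -c = [1, 1] lies in the
# cone generated by the two constraint normals e1 and e2.
A_I = np.array([[1.0, 0.0], [0.0, 1.0]])
b_I = np.array([1.0, 1.0])
c = np.array([-1.0, -1.0])
assert is_inclusive(A_I, c)
assert np.allclose(ladder_point(A_I, b_I), [1.0, 1.0])
```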
Definition 2.1. A point x ∈ R^n is called a generalized ladder point of problem P if there is a sequence of ladder points x^i, i = 1, 2, · · · , with a corresponding sequence of ladders L(I_i), such that x^i → x as i → ∞.

It is clear that ladder points are special generalized ladder points. In the ladder method for linear programming, ladder points play an important role. The ladder algorithms are designed to obtain an optimal ladder solution, whose existence, in case an optimal solution exists, is guaranteed by the following ladder point version of the fundamental theorem of linear programming.

Lemma 2.2. ([12]) For an arbitrary inequality constrained n-dimensional linear programming problem whose constraint coefficient matrix has rank n, the following statements hold: (1) If the problem has no optimal solution, then it is either infeasible or unbounded. (2) If the problem has an optimal solution, it must have one at a ladder point.
For LSIP problems, the feasible region may no longer contain a ladder point. However, it contains a generalized ladder point if an optimal solution exists, as the following result shows.

Proof. (a) This statement is obviously true, since the objective value at a generalized ladder point is a lower bound of the optimal value, and hence a feasible generalized ladder point must be an optimal solution.

(b) We need only prove the necessity. Assume that x^* is an optimal solution of problem P. Since the feasible region of problem P contains a Slater point, as a direct consequence of the KKT conditions (see e.g. Theorem 6 in [10]), there are t_1, t_2, · · · , t_k ∈ [α, β] for some k, 1 ≤ k ≤ n, such that a(t_i), i = 1, 2, · · · , k, are linearly independent and (3) and (4) hold. According to Assumption 2, we can expand {a(t_1), · · · , a(t_k)} to {a(t_1), · · · , a(t_n)} to form a basis for R_n. Obviously, I = {t_1, t_2, · · · , t_n} generates a ladder L(I). We consider two cases.

Case 1. Problem P has a non-degenerate ladder L(s_1, s_2, · · · , s_n); that is, none of the edges of L(s_1, s_2, · · · , s_n) is orthogonal to c. Let B_k = {α + (i/2^k)(β − α) | i = 0, 1, · · · , 2^k} ∪ {s_1, s_2, · · · , s_n}. Then the problem obtained from problem P by restricting the index set to B_k has an optimal solution, denoted by x^k, for all k = 1, 2, · · · , as it is feasible and has a ladder. Note that the set X in the above expression is compact, so the sequence {x^k} has a converging subsequence with limit x̄, and a(t) and b(t) are continuous. Thus, x̄ is feasible. On the other hand, since c^T x^k ≤ c^T x^*, we have c^T x̄ = c^T x^*. Therefore x̄, as a generalized ladder point, is an optimal solution.

Case 2. All ladders of problem P are degenerate; in particular, L(I) is. Since t_1, t_2, · · · , t_k satisfy (3) and (4), the ladder point x_0 of L(I) satisfies c^T x_0 = c^T x^*.
We consider the linear system (5). As a(t_i), i = 1, 2, · · · , k, are linearly independent, by solving system (5) we can represent k variables of x in terms of its remaining n − k variables, which are denoted by y = [y_1; y_2; · · · ; y_{n−k}]. Thus, we have x = Gy + d, and we consider the following problem.
It is obvious that each feasible point y of problem P̄ corresponds to an optimal solution x = Gy + d of problem P, and each ladder L(s_1, s_2, · · · , s_{n−k}) with ladder point y_0 of problem P̄ corresponds to the ladder L(t_1, t_2, · · · , t_k, s_1, s_2, · · · , s_{n−k}) of problem P with ladder point x_0 = Gy_0 + d.
With c̄ designed above, we see that Ī = {t_{k+1}, t_{k+2}, · · · , t_n} ⊂ I generates a non-degenerate ladder for problem P̄. According to Case 1, problem P̄ has an optimal solution y^*, which is also a generalized ladder point of problem P̄. We denote by L(s^i_1, s^i_2, · · · , s^i_{n−k}), i = 1, 2, · · · , the sequence of ladders of problem P̄ whose corresponding ladder point sequence {y^i} converges to y^*. Then, by the above discussion, x^* = Gy^* + d is an optimal solution of problem P and x^i = Gy^i + d, i = 1, 2, · · · , is a sequence of ladder points with x^i → x^*, which means that x^* is a generalized ladder point of problem P as well.
(c) This is a direct consequence of (b). We note that in the proof of (b) and (c) above, we need only the continuity of a(t) and b(t). Their continuous differentiability is needed only by the algorithm in the next section.
3. Algorithm. In this section, we propose an extended ladder algorithm for solving problem P. The main idea is to obtain a sequence of ladders {L(I i )} whose corresponding sequence of ladder points, {x i }, converges to a generalized ladder point x * which, according to Theorem 2.3, must be an optimal solution. At the start of the solution process, an initial ladder L(I 1 ) with ladder point x 1 is artificially constructed. Then, a sequence of ladders L(I 2 ), L(I 3 ), · · · , with corresponding ladder points x 2 , x 3 , · · · , is iteratively produced in such a way that L(I i+1 ) is obtained from L(I i ) by climbing up (toward the feasible region) from x i along the 'centerline' of L(I i ). The climbing procedure resembles the centered climbing ladder algorithm for LP ( [12]).
3.1. Artificial initial ladder. In order to start the algorithm, we need an initial ladder. It is natural to try to obtain one from the discretized problem P_{T_N}. In fact, in discretization methods (see e.g. [8,16]) it is assumed that, when N is large, the discretized problem has an optimal solution, which implies by Lemma 2.2 that problem P_{T_N} has a ladder, and hence that we can select n vectors from {a(t_i) | i = 1, 2, · · · , N}. This assumption is actually stronger than it appears and can easily be violated, as the following example indicates.
Example 1. Consider the two-dimensional problem given below. For arbitrarily large N, the discretized problem P_{T_N} is unbounded, as it does not include the constraint at t = π/2, and hence it has no optimal solution, even though the original problem has infinitely many solutions. Thus, from [13], P_{T_N} does not have a ladder, and it is impossible to obtain an initial ladder from {a(t) | t ∈ T_N}.
In the following, instead of obtaining an initial ladder for the problem by discretization, we construct an artificial ladder that works for general problems.
Consider problem P. It is clear that, if M is a sufficiently large number, adding the constraint c^T x ≥ −M to problem P changes neither its optimal solutions, if they exist, nor its solvability status if it is unbounded or infeasible. From Assumption 2, we can expand the set {−c} into a basis {−c, a^T(s_{i_1}), a^T(s_{i_2}), · · · , a^T(s_{i_{n−1}})} for R^n. These vectors clearly generate an inclusive cone, and hence a ladder, with the corresponding ladder point given below.

Then, we can conveniently present any ladder generator, including that of the artificial ladder, in the following form, where 1_{n×1} = [1; 1; · · · ; 1] ∈ R^n and A(I) = [a(t_1); a(t_2); · · · ; a(t_n)] if I = {t_1, t_2, · · · , t_n} (in order for A(I) to be well defined, we may consider ladder generators as ordered sets).

The centerline of L(I_k) is defined as the line passing through x^k and parallel to v^k: l(x^k, v^k) : x = x^k + t v^k, −∞ < t < ∞. If x^k is not optimal, the current ladder needs to be updated and a new ladder point is obtained. For LP problems, in order to update L(I_k) to L(I_{k+1}) and x^k to x^{k+1}, we determine a 'pick' (an entering constraint index) which corresponds to the constraint that is violated most along the centerline ([12]). For LSIP problems, the most violated constraint index t^k ∈ [α, β] along the centerline is the global minimizer of problem P(I_k). Geometrically, if the global minimizer t^k exists, then a(t^k)x ≤ b(t^k) is the constraint that is violated most at x^k along the centerline of L(I_k) (see Figure 1). However, problem P(I_k) may not be well posed, as a(t)v^k in its objective function may take the value zero at some positions t. For this reason, we modify the definition of g(t, I_k) as follows. We note that if the parameter τ = 0, we obtain the objective function of problem P(I_k).
If 0 < τ ≤ 1, then g(t, I_k, τ) is well defined and continuously differentiable on [α, β], and hence a global minimum of the following modified problem P(I_k, τ) always exists.
Let t^k be the global minimizer of problem P(I_k, τ) (t^k will be called a 'pick', to be consistent with [12]). Then, if τ = 1, the entering constraint a(t^k)x ≤ b(t^k) is the constraint that is violated most at x^k (see Figure 1). Note that different τ values may lead to different entering constraints, and hence to different ladder point sequences. We may naturally expect some τ values to be better than others. Our experiments show that, for the problems solved in Section 4, the algorithm performs much better for 0.9 ≤ τ ≤ 1 than for 0 < τ ≤ 0.2. In general, the best τ value seems to be problem-dependent and can be anywhere in (0, 1].

Solving global optimization problems is in general a very demanding task. However, for one-dimensional smooth optimization problems, there are reliable and efficient numerical methods available. In this paper, we choose the bridging method given in [14], and we always start the bridging process from t = α (right bridging). In short, the right bridging method minimizes a bridged function initially constructed at the left end of the interval. By minimizing the bridged function, points that are not better than the current solution are skipped and a local minimum, better than the starting point, is found. Then, by updating the bridged function at this local minimum, better local minima are located. The process continues until a global optimal solution is found. For details of the bridging method, we refer interested readers to [14].

After choosing a pick (the global minimizer t^k), obtaining the updated ladder is the same as in the linear programming case [12]. For convenience, when the ladders are updated by choosing a pick as the global minimizer of problem P(I_k, τ) with 0 < τ ≤ 1, we call the updating method a centered climbing method.
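The pick step is a one-dimensional global minimization over [α, β]. As a simple illustrative stand-in for the bridging method of [14] (whose details are beyond this sketch), one can seed with a dense grid and refine the best cell locally; the function g below is a hypothetical multimodal profile of our own, not the paper's g(t, I_k, τ):

```python
import numpy as np

def global_min_1d(g, alpha, beta, n_grid=2000, n_refine=60):
    """Grid-seeded 1-D global minimization: sample g on a dense grid,
    then refine the best sample by golden-section search on the cell
    that brackets it.  A crude stand-in for the bridging method."""
    ts = np.linspace(alpha, beta, n_grid + 1)
    i = int(np.argmin(g(ts)))
    lo = ts[max(i - 1, 0)]
    hi = ts[min(i + 1, n_grid)]
    phi = (np.sqrt(5.0) - 1.0) / 2.0
    for _ in range(n_refine):        # golden-section search on [lo, hi]
        m1 = hi - phi * (hi - lo)
        m2 = lo + phi * (hi - lo)
        if g(m1) <= g(m2):
            hi = m2
        else:
            lo = m1
    t_star = (lo + hi) / 2.0
    return t_star, float(g(t_star))

# Hypothetical multimodal 'violation profile' on [0, pi] (illustration
# only; the paper's g(t, I_k, tau) depends on the problem data):
g = lambda t: np.sin(3.0 * t) + 0.3 * t
t_star, v_star = global_min_1d(g, 0.0, np.pi)
```

For smooth one-dimensional functions this grid-plus-refinement scheme is reliable once the grid resolves every basin, but unlike the bridging method it evaluates g everywhere, which is exactly the cost the bridging method is designed to avoid.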
Step 1. Checking optimality.
1.1 Solve problem P(I_k, τ) for a global minimizer t^k with global optimal value V_k.
1.2 If −V_k < ε, the current ladder point is taken as an optimal solution x^*.
1.3 Otherwise, go to Step 2.
Step 2. Updating.
2.1 Update L(I_k) with t^k entering I_k:
• Use t^k to replace one of I_k's elements, say t^k_{i_1}, so that a new ladder generator I_{k+1} is produced.
• Calculate the updated ladder point according to x^{k+1} = [a(t^{k+1}_1); a(t^{k+1}_2); · · · ; a(t^{k+1}_n)]^{−1} [b(t^{k+1}_1); b(t^{k+1}_2); · · · ; b(t^{k+1}_n)].
2.2 Set k = k + 1 and go to Step 1.

The following are some comments about the above centered climbing algorithm.

(a) In the initialization step, very often we can choose t^0_1 = α − 1 and choose t^0_i ∈ [α, β] randomly for i = 2, 3, · · · , n to obtain an initial ladder. This way of finding an initial ladder, though not rigorous, is very efficient in many cases. A more formal but less efficient method for finding an initial ladder is the following procedure.
• First find n linearly independent constraint normal vectors a(t^0_i), i = 1, 2, · · · , n, where t^0_i ∈ [α, β].
• Let [λ_1; λ_2; · · · ; λ_n] = −[a^T(t^0_1), a^T(t^0_2), · · · , a^T(t^0_n)]^{−1} c.
• If λ_i ≥ 0 for i = 1, 2, · · · , n, set I_0 = {t^0_1, t^0_2, · · · , t^0_n}; then I_0 generates a ladder L(I_0).
• Otherwise, identify any λ_{i_0} ≠ 0 and set t^0_{i_0} = α − 1 (note that a(α − 1) = −c^T); then a(t^0_i), i = 1, 2, · · · , n, are linearly independent, and hence I_0 = {t^0_1, t^0_2, · · · , t^0_n} generates a ladder L(I_0).

Once L(I_0) is obtained, the corresponding ladder point x_0 is calculated by solving the corresponding system of linear equations.

(b) The iterative process of the algorithm (Steps 1 and 2) includes some termination criteria and an updating step. The main computational task is to find a global minimum of the function g(t, I_k, τ) over the interval [α, β] in Step 1. As we have seen earlier, the global minimum value V_k of g(t, I_k, τ) over [α, β] is the negative of the 'maximum violation' of all the constraints at the current ladder point x^k along the parameterized centerline. If the maximum constraint violation −V_k is less than a prescribed accuracy ε > 0, the current ladder point x^k is taken as an approximate optimal solution of problem P. Otherwise, the global minimizer t^k of problem P(I_k, τ) determines the constraint that enters the current ladder. The current ladder is then updated and the new ladder point calculated with ease according to standard formulas (see [12]).

(c) The main advantage of the algorithm is threefold. Firstly, each entering constraint leads to a generally effective improvement of the current solution. Though locating the constraint requires a global minimization, the bridging method does this job very efficiently, as the computational effort involved is comparable to a few line searches.
Secondly, the LSIP is treated like an LP, and the updating, aside from the global minimization, is similar in complexity to a simplex step with fixed dimension. Thirdly, feasibility information can be drawn from the global optimization, so there is no need for an extra feasibility check.

Before ending this section, we provide some simple facts regarding the convergence of the algorithm. (a) Upon termination of the algorithm, either the output x^* is an optimal solution, or the problem is infeasible or unbounded. (b) If {x^k} is an infinite sequence, then the following hold. (2) If the optimal solution x^* is found but the optimal value satisfies c^T x^* = −M, then problem P is unbounded. (3) Assume that c^T x^k → V̄ and max_{t∈T} (a(t)x^k − b(t)) → 0 as k → ∞. Then any converging subsequence of {x^k} converges to an optimal solution of problem P.
Proof. Statements (a), (b.1) and (b.2) are obvious. Statement (b.3) is true because in this situation any limiting point will be a feasible generalized ladder point, and hence is optimal according to Theorem 2.3.
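To make the overall flow of the algorithm concrete, the following schematic sketch runs the climb on a hypothetical toy problem. It is illustrative only and makes two assumed simplifications: a dense-grid most-violated-constraint pick replaces the minimization of g(t, I_k, τ) by the bridging method, and a try-all-replacements rule that keeps the cone inclusive replaces the entering/leaving rule of [12]:

```python
import numpy as np

def centered_climb(a, b, c, alpha, beta, n, M=100.0, eps=1e-8, max_iter=500):
    """Schematic centered climbing iteration (illustrative sketch)."""
    grid = np.linspace(alpha, beta, 4001)
    # The artificial index alpha - 1 encodes the added constraint
    # c^T x >= -M, i.e. a(alpha - 1) = -c^T and b(alpha - 1) = M.
    row = lambda t: -c if t == alpha - 1 else a(t)
    rhs = lambda t: M if t == alpha - 1 else b(t)

    def ladder(I):
        """Return (is_inclusive, ladder point) for generator I."""
        A = np.array([row(t) for t in I])
        try:
            lam = np.linalg.solve(A.T, -c)
        except np.linalg.LinAlgError:
            return False, None
        if np.any(lam < -1e-12):
            return False, None
        return True, np.linalg.solve(A, np.array([rhs(t) for t in I]))

    # Initial artificial ladder: alpha - 1 plus random interior indices.
    rng = np.random.default_rng(0)
    ok = False
    while not ok:
        I = [alpha - 1] + list(rng.uniform(alpha, beta, n - 1))
        ok, x = ladder(I)

    for _ in range(max_iter):
        viol = np.array([a(t) @ x - b(t) for t in grid])
        k = int(np.argmax(viol))
        if viol[k] < eps:            # feasible ladder point => optimal
            return x, I
        t_enter = grid[k]
        # Leaving rule (sketch): among all single replacements keeping
        # an inclusive cone, climb to the largest objective value c^T y.
        best = None
        for j in range(n):
            J = I[:j] + [t_enter] + I[j + 1:]
            ok, y = ladder(J)
            if ok and (best is None or c @ y > c @ best[1]):
                best = (J, y)
        if best is None:             # safeguard for this sketch
            return x, I
        I, x = best
    return x, I

# Toy problem (ours, not from the paper): minimize -x1 - x2 subject to
# x1*cos(t) + x2*sin(t) <= 1 for all t in [0, pi/2]; the optimum is
# x* = (1/sqrt(2), 1/sqrt(2)) with the active index t = pi/4.
a = lambda t: np.array([np.cos(t), np.sin(t)])
b = lambda t: 1.0
c = np.array([-1.0, -1.0])
x_opt, _ = centered_climb(a, b, c, 0.0, np.pi / 2, 2)
```

On this toy problem the iteration behaves exactly as described above: the ladder point starts far up the artificial constraint, and successive picks close in on the single active index t = π/4.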

4. Numerical examples.
In this section, we demonstrate the efficiency of the algorithm by solving three approximation problems. The first example is the one-sided L_1 approximation of the tangent function over [0, 1] by polynomials of limited order. This is a standard LSIP test problem, and its purpose here is to compare the new algorithm with other algorithms. As this problem becomes very sensitive when the order of the approximating polynomial is greater than 4, the algorithm only seems able to cope with polynomial orders up to about 20 while still obtaining solutions with small feasibility and objective errors. To demonstrate the algorithm's ability to deal with problems of larger size, two more examples are solved: the one-sided approximations of a given smooth function by a sine series and by a cosine series of given orders, which, to the authors' knowledge, have not appeared in any published work.
Example 2. The problem can be formulated as the following LSIP problem.
where c = [1; 1/2; · · · ; 1/n], a(t) = −[1, t, · · · , t^{n−1}] and b(t) = − tan(t). This problem has been solved by many authors (see [5], Chapter 11). It is known that, for each n, problem E2 has a unique optimal solution. However, for n ≥ 3 the boundary of its feasible set becomes very flat in the −c direction near the optimal solution, and for n ≥ 6 the numerical solution becomes extremely sensitive ([17]).
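The structure of this data is easy to check numerically: c_i = 1/i is the integral of t^{i−1} over [0, 1], so c^T x is the integral of the approximating polynomial, and the integral of tan over [0, 1] equals −log(cos(1)). A quick sanity check (with our own test coefficients x, not a computed optimum):

```python
import numpy as np

# Data for problem E2 (one-sided L1 approximation of tan on [0, 1]):
# c_i = integral of t^(i-1) over [0, 1] = 1/i, so c^T x equals the
# integral of p(t) = x_1 + x_2 t + ... + x_n t^(n-1) over [0, 1].
n = 5
c = 1.0 / np.arange(1, n + 1)
a = lambda t: -np.array([t**i for i in range(n)])
b = lambda t: -np.tan(t)

def trap(f_vals, ts):
    """Composite trapezoidal rule on a uniform grid."""
    dt = ts[1] - ts[0]
    return float(np.sum(f_vals[1:] + f_vals[:-1]) * dt / 2.0)

ts = np.linspace(0.0, 1.0, 200001)
x = np.array([1.0, -0.5, 2.0, 0.0, 0.3])   # arbitrary test coefficients
p = sum(x[i] * ts**i for i in range(n))
# c^T x reproduces the integral of p ...
assert abs(trap(p, ts) - c @ x) < 1e-8
# ... and the integral of tan over [0, 1] is -log(cos(1)), which is how
# c^T x* is recovered from the optimal L1 distance reported in Table 1.
assert abs(trap(np.tan(ts), ts) + np.log(np.cos(1.0))) < 1e-6
```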
The algorithm is tested for the cases 1 ≤ n ≤ 18. The computational results for n = 6, n = 8, and 10 ≤ n ≤ 18 are listed in Table 1 with different choices of the centerline parameter τ. The number of iterations and the CPU time listed depend not only on the problem dimension but also on the initial ladder, which is constructed randomly in the initial step. Thus, these data may change noticeably between different runs; the data listed for each case are from a particular run. We list the computed optimal L_1 distance ‖p^* − b‖_1 instead of the corresponding optimal value c^T x^*, where p^*(t) = x^*_1 + x^*_2 t + · · · + x^*_n t^{n−1}. The optimal value can be recovered from ‖p^* − b‖_1 according to c^T x^* = ‖p^* − b‖_1 − log(cos(1)). For 1 ≤ n ≤ 16, our computational results are very accurate in terms of both feasibility and objective optimality.

We note that different choices of τ ∈ (0, 1] give different iterative updating rules for the algorithm. Once τ is chosen in the initialization step, it is kept unchanged throughout the computation. Many values of τ have been tested; listed in the table are the results for τ = 1, 0.95 and 0.01. The τ value at which the algorithm achieves its best performance seems to be problem-dependent, but for this example τ = 0.95 is close to the best possible value in terms of the CPU time needed to obtain solutions of the same level of accuracy. The number M in the artificial constraint should be as small as possible, since a larger M generally means more iterations to reach an optimal solution.
We note that, for n = 8, [5] reported computed optimal solutions x^* of accuracy 1.0e−3 by a grid discretization method and of accuracy 1.0e−4 by a cutting plane method. The new algorithm can obtain an optimal solution of accuracy 1.0e−4 with much less CPU time and much higher feasibility accuracy. The numerical optimal solutions reported in [10] record a better accuracy for these cases, but the CPU time was not reported. Tests for larger n have been done by different authors. For example, in [3], the cases n = 10, 20 and 30 were tested using a logarithmic barrier cutting plane method. Even though the computational results show very satisfactory duality gaps for the subproblems, the feasibility of the solutions is unfortunately not guaranteed, as the method solves a discretized subproblem. In comparison, the new algorithm is capable of finding solutions with very accurate objective values and negligible constraint violation for 1 ≤ n ≤ 16. However, for n ≥ 17, the problem seems to involve singularities, and the algorithm encounters increasingly frequent ill-conditioned linear systems in computing the ladder points as n increases.
Example 3. Find the best L_1 approximation to a given smooth function b(t) from above by a Fourier sine series of order ≤ n, of the form f_s(t) = Σ_{i=1}^{n} x_i sin(it), over a given interval [α, β].
Thus, the problem is equivalent to the following LSIP problem.
where c is as above. Suppose the optimal solution to problem E3 is x^*; then the required best L_1 approximation to b(t) over [0, π] is achieved at f^*_s(t) = −a(t)x^*, and the corresponding minimal distance is ‖f^*_s − b‖_1 = c^T x^* + π^3/8.

Problem E3 is solved for 1 ≤ n ≤ 200. Computational results for a number of n values are listed in Table 2, and the computed optimal f_s(t) is plotted against b(t) in Figure 2 for selected n values. The algorithm is very stable in these tested cases. For n close to 200, however, some effort has to be made to find an initial ladder that does not involve singularities (i.e. such that the linear system to be solved for the initial ladder point is not ill-conditioned).

Example 4. Find the best L_1 approximation to a given smooth function b(t) from above by a Fourier cosine series of order ≤ n − 1, f_c(t) = Σ_{i=1}^{n} x_i cos((i − 1)t), over a given interval [α, β].
For this example, we take the same b(t) and the same interval [α, β] = [0, π] as in Example 3. The problem is then equivalent to the following LSIP problem.
where c = [π; 0; · · · ; 0], a(t) = −[1, cos(t), · · · , cos((n − 1)t)] and d(t) = πt − t^2 for t ∈ [0, π]. The problem is solved, as in Example 3, for 1 ≤ n ≤ 200. Some computational results are listed in Table 3, and the corresponding computed optimal functions f^*_c(t) are plotted in Figure 3. The algorithm is very efficient in all these cases, provided that an initial ladder not involving singularities is available to start the algorithm. However, finding such an initial ladder becomes increasingly difficult as n increases; this difficulty is similar to what we encountered in solving Examples 2 and 3.

Table 3. Results for Example 4 with b(t) = −√(πt − t^2), 0 ≤ t ≤ π, and parameters M = 5, ε = 1.0e−7, τ = 1.

5. Summary. In this paper, we introduce the so-called generalized ladder point for linear semi-infinite programming and prove that a semi-infinite programming problem with a continuous linear inequality constraint system has an optimal solution if and only if it has one at a generalized ladder point. Based on this result, a centered climbing ladder algorithm is developed. The algorithm iteratively obtains a sequence of ladder points that converges to a feasible generalized ladder point, a special optimal solution. It combines a linear programming ladder algorithm with a very efficient one-dimensional global minimization method, namely the bridging method. The bridging method finds precisely the index point at which the constraint is violated most at the current solution, and it works faster than searching for the most violated constraint over a set of dense grid index points (as used in discretization and cutting plane methods). Numerical tests show that the algorithm is very efficient.