Finding multiple local solutions to optimal control problems via saddle points and its application to the ascent trajectory of a winged rocket

Complex nonlinear optimal control problems can have more than a single local solution due to the lack of convexity, and widely used gradient-based approaches for optimal control risk producing an inferior local solution. Although global optimization methods for optimal control may overcome this difficulty, existing methods of this kind suffer from high computational complexity. This paper presents a novel numerical algorithm for multi-modal optimal control problems based on a saddle-point search method. Starting from an already-found local solution, a saddle-point search algorithm finds a surrounding saddle point of the constrained optimization problem, from which a new local solution is obtained by applying a gradient-based optimizer. By repeating this procedure, multiple locally optimal solutions are explored exhaustively and systematically. Although the developed method is not rigorously guaranteed to produce the global solution, it is applicable to large-scale general optimal control problems with equality and inequality constraints, which makes it a practically beneficial technique. After numerical experiments on an example problem, the ascent trajectory optimization problem of a winged rocket for maximizing apogee altitude is solved by using the proposed method.


Introduction
Optimal operation of dynamic systems governed by differential equations is desired in diverse fields of engineering and the natural/social sciences. This class of problems has been actively studied under the category of optimal control, or equivalently trajectory optimization, and many mature numerical methods are now available [1]. State-of-the-art techniques for optimal control are categorized into direct methods [2] and indirect methods [3]. Direct methods transcribe the continuous-time optimal control problem into a nonlinear programming (NLP) problem as a finite-dimensional approximation, and the NLP problem is usually solved by a gradient-based optimizer. Direct methods are further divided into direct collocation methods [2], which discretize both state and control variables, and direct shooting methods [4], which parameterize the control variables only. In indirect methods, on the other hand, a two-point boundary value problem is derived from the optimality conditions of the optimal control problem, and its solution is explored with a gradient-based root-finding technique. Since the above methods adopt some sort of gradient-based numerical technique, they give a locally optimal solution near a specified initial guess. Complex nonlinear problems, on the other hand, may have more than a single local solution due to the lack of convexity. Such multi-modal optimal control problems of major practical interest include (1) path planning of robot manipulators [5], (2) obstacle-avoidance control [6], and (3) low-thrust orbit transfer [7]. When the objective function values of different local solutions differ significantly, a solution obtained by a gradient-based method risks being trapped at a markedly inferior solution. In addition, if the collection of locally optimal solutions can be revealed instead of a single solution, it provides diverse design options from which users can make final decisions.
This has motivated recent research activities applying global optimization techniques to the numerical solution of optimal control problems. They include (1) branch-and-bound [8] or branch-and-lift [9] algorithms, (2) semidefinite relaxation [10], and (3) evolutionary computation [11][12][13]. Whereas the first and second methods above may be successfully applied to simple problems, their scalability to complex and larger-scale problems is limited due to high computational complexity. A concern with evolutionary computation is its high computational cost, especially when multiple equality terminal conditions are involved, as observed in [12,13].
Under the above-mentioned background, the authors have recently focused on the combination of a direct collocation method and a numerical algorithm for locating saddle points of functions. Saddle-point search methods are studied mainly in computational physics, in order to identify transition states of a molecular system by searching for saddle points of its energy function [14]. At a saddle point with Morse index m, the function is maximized along m orthogonal directions and minimized along the other directions. As an example, Figure 1 shows the landscape of Himmelblau's function, a well-known test function for optimization, with its stationary points. Steepest-descent pathways converge to different local minima depending on the starting points, and this defines the basins of attraction shown in Figure 2. Since an index-one saddle point lies on the boundary of two basins of attraction, two neighbouring initial points near a saddle point lead to two different local minima. By utilizing this property, a systematic exploration of local solutions is realized in the following steps: (1) The NLP problem arising from an optimal control problem through the direct collocation method is solved by a conventional gradient-based optimizer, and a local minimum is obtained. (2) Surrounding index-one saddle points are found by using a saddle-point search method. (3) New local minima are acquired by running gradient-based optimization with new initial guesses chosen properly around the found saddle points. (4) By repeating this procedure until no additional local minimum is found, locally optimal solutions are expected to be obtained exhaustively.
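The role of basins of attraction in the steps above can be illustrated on Himmelblau's function itself. The following sketch is a simplified illustration (plain steepest descent, not the paper's algorithm): two nearby starting points on opposite sides of a basin boundary converge to two different local minima of the function.

```python
import numpy as np

def himmelblau(p):
    """Himmelblau's test function: four global minima with value 0."""
    x, y = p
    return (x**2 + y - 11)**2 + (x + y**2 - 7)**2

def grad(p):
    """Analytic gradient of Himmelblau's function."""
    x, y = p
    return np.array([4*x*(x**2 + y - 11) + 2*(x + y**2 - 7),
                     2*(x**2 + y - 11) + 4*y*(x + y**2 - 7)])

def steepest_descent(p0, lr=1e-3, iters=20000):
    """Fixed-step steepest descent; converges to the minimum of the basin
    containing the starting point p0."""
    p = np.asarray(p0, dtype=float)
    for _ in range(iters):
        p = p - lr * grad(p)
    return p

# Two starting points that straddle a basin boundary on the right half-plane
m1 = steepest_descent([3.5, 1.5])    # lands near the minimum (3, 2)
m2 = steepest_descent([3.5, -1.5])   # lands near the minimum (3.584, -1.848)
print(m1, m2)
```

Both runs reach a point with function value near zero, yet the two minima are far apart, which is exactly the multi-modality that the saddle-point search is designed to navigate.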
In contrast to energy functions of molecular systems, NLP problems derived from optimal control problems have constraints that originate from dynamic, path, and boundary constraints. An enhanced saddle-point search algorithm for equality-constrained systems was developed in previous research by the authors [15], based on an iterative minimization formulation proposed by Gao et al. [16,17]. Its limitation is that it cannot handle the inequality path and boundary constraints that are common in practical optimal control problems. To overcome this issue, in this paper, the saddle-point search algorithm developed in [15] is extended so that inequality constraints can be considered. In addition, it is applied to the ascent trajectory optimization of a winged rocket.
The remainder of this paper is organized as follows: Section 2 describes the transcription of an optimal control problem into an NLP problem. Then, the procedures for locating saddle points around the current local minimum are explained in Section 3. Section 4 presents an algorithm for successively finding multiple local minima via saddle points. Validity of the developed method will be demonstrated via solving an illustrative example problem in Section 5. In Section 6, the ascent trajectory optimization problem of a winged rocket is solved as a practical application. Finally, Section 7 provides concluding remarks.

Transcription into static optimization problem
Let us consider the following general Bolza-form optimal control problem: where x(t) and u(t) are the state and control variable vectors, respectively. The independent variable, t ∈ [t 0 , t f ], typically denotes time. It is worth noting that t 0 and t f are free in this optimal control problem, although they can be fixed to specified values by using Equation (6) if necessary. One of the advantages of the proposed method is its applicability to a wide range of optimal control problems, including free-final-time problems.

In order to handle inequality path and boundary constraints, the augmented objective function, J̃, is introduced as follows: where P(·) is a penalty function that takes a large value for a positive argument and is approximately zero for a negative argument. w C and w are the penalty weights for the inequality path and boundary constraints, respectively. In this paper, the following hyperbolic smooth penalty function presented in [18] is used: where the parameter τ controls the smoothness, and the parameter λ is related to the intensity of the penalty.

Using the Legendre-Gauss (LG) pseudospectral method [19], which is one of the most efficient direct collocation methods, the optimal control problem given by Equations (1)-(7) can be transcribed into the following NLP problem by collocating it at N LG nodes: where w k is the kth LG weight, D i,k+1 is the (i, k + 1)th entry of the LG pseudospectral differentiation matrix, and t k represents the independent variable at the kth LG node. X k and U k are the discretized state and control variables at t k , respectively. X 0 and X f denote the discretized state variables at t 0 and t f , respectively. The augmented objective function, J̃, is transformed into the following form: It is noted that any direct collocation method (e.g. the Hermite-Simpson method) can be employed instead of the LG pseudospectral method.
Let this NLP problem of Equations (12)-(19) be rearranged for readability as follows: where z indicates all the variables of the NLP problem shown in Equation (12). The transformed augmented objective function, Equation (20), is abbreviated as Ṽ(z).
The problem must be scaled so that the variables approximately lie between zero and one, and so that the constraint functions roughly have a magnitude of one. This makes the tuning parameter values recommended in Sections 3 and 4 applicable to various problems in a robust manner. Refer to [2, Section 4.8] for a scaling method for NLP problems derived from optimal control problems.
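A minimal sketch of such affine variable scaling is given below; the bound arrays z_min and z_max and the example values are illustrative, not taken from the cited reference.

```python
import numpy as np

def scale_vars(z, z_min, z_max):
    """Affinely map each NLP variable to the interval [0, 1]."""
    return (z - z_min) / (z_max - z_min)

def unscale_vars(s, z_min, z_max):
    """Inverse map from scaled variables back to physical units."""
    return z_min + s * (z_max - z_min)

# Example: altitude in metres and velocity in m/s live on very different scales
z = np.array([50_000.0, 1_200.0])
z_min = np.array([0.0, 0.0])
z_max = np.array([100_000.0, 2_000.0])
s = scale_vars(z, z_min, z_max)
print(s)  # both components are now O(1)
```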

Saddle-point search method
An index-one saddle point of a function is defined as a point at which the first derivatives (i.e. the gradient vector) are zero and the second derivatives (i.e. the Hessian matrix) have exactly one negative eigenvalue. The function takes its maximum value at the saddle point along the direction given by the eigenvector corresponding to the negative eigenvalue. Eigenvector-following methods for saddle-point search are based on this property, and they explore a saddle point by gently ascending the function along the eigenvector corresponding to the minimum Hessian eigenvalue. Among the many existing eigenvector-following methods, this paper focuses on the iterative minimization formulation (IMF) [16,17], because of its efficiency and applicability to equality-constrained systems. After the outline of the original IMF is described in Section 3.1, its enhancements are presented in Sections 3.2 and 3.3.

Iterative minimization formulation for index-one saddle points
The original IMF for systems with equality constraints (without inequality constraints) consists of the following steps: (1) Given the current point, z (k) , the rotation step calculates the eigenvector, v (k+1) , corresponding to the minimum Hessian eigenvalue. (2) The translation step determines the next point, z (k+1) , so that it moves towards an index-one saddle point. (3) The process is repeated, while increasing the iteration counter, k, until z (k) converges to a saddle point.
The rotation step is expressed by the following optimization problem in [16]: Equation (22) is equivalent to a constrained eigenvalue problem, and it can be transformed into an unconstrained problem [20]. It can then be solved efficiently, without explicitly evaluating the expensive Hessian ∇ 2 V[z (k) ], by using an appropriate eigenvalue solver [21]. As a result, v (k+1) gives the eigenvector associated with the minimum eigenvalue, E(z), which is calculated by Equation (23).

The translation step is accomplished by solving the following optimization problem [16]: where L[y; z, v] is an auxiliary objective function constructed so that z (k) converges to a saddle point as k increases. Since the required solution of Equation (24) is a local one, it must be solved by a gradient-based nonlinear optimizer with the initial guess given by y = z (k) . In addition, the solution is sought only within a local neighbourhood of z (k) , denoted by U[z (k) ], in order to exclude unintended non-local solutions. One of the possible forms of L discussed in [16,17] is given by Equation (25). Here, ξ(y) : R d → R d is the operator that projects a point, y, onto geodesic curves whose tangents lie in the tangent space of c(z) and are orthogonal to v. For general nonlinear constraints, such an operator cannot be written strictly in a closed form, and this issue is left unsolved in [16,17]. The descent direction of L in Equation (25) corresponds to the gentlest ascent direction of V, and the solution of Equation (24) yields a new point closer to a saddle point than the old one. In this paper, the local neighbourhood, U[z (k) ], is defined by Equation (26), which can be expressed as simple upper and lower bounds on each component of y.
The objective gradient along the tangent space of c(z) must vanish at a saddle point, and the residual of this constrained stationarity condition, R (k) , is defined by Equation (27). P[z (k) ] is the orthogonal projection operator onto the tangent space of c(z) at z (k) , which is calculated by Equation (28). The IMF procedure is terminated when z (k) has sufficiently converged to a saddle point, and its criterion, with the tolerance value ε tol , is written as Equation (29). Under some conditions, z (k) obtained via IMF is guaranteed to converge locally to an index-one saddle point (see Theorem 3.1 of [16] or Theorem 1 of [17]). When z (k) is not near a saddle point and the corresponding minimum eigenvalue, E[z (k) ], becomes positive, on the other hand, the auxiliary objective, L, has no local lower bound [16,17]. Therefore, it is recommended that (1) an initial guess of z is chosen so that the associated minimum eigenvalue is negative, and (2) the minimum eigenvalues at z (k) 's during the IMF iterations are kept negative. This idea motivates the globalization strategies for IMF introduced in the next subsections.
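The projector onto the tangent space of the constraints and the resulting stationarity residual can be computed from the constraint Jacobian. The sketch below uses the standard formula P = I − Jᵀ(JJᵀ)⁻¹J with an illustrative unit-circle constraint (not a problem from the paper), and checks that the projected gradient vanishes at a constrained minimizer.

```python
import numpy as np

def tangent_projector(J):
    """Orthogonal projector onto the tangent space (null space) of the
    constraint Jacobian J: P = I - J^T (J J^T)^{-1} J."""
    d = J.shape[1]
    return np.eye(d) - J.T @ np.linalg.solve(J @ J.T, J)

def stationarity_residual(grad_f, J):
    """Norm of the objective gradient projected onto the tangent space."""
    return np.linalg.norm(tangent_projector(J) @ grad_f)

# Illustrative check: minimize f(z) = z[0] on the circle c(z) = z0^2 + z1^2 - 1.
# At the constrained minimum z = (-1, 0) the residual must vanish.
z_opt = np.array([-1.0, 0.0])
J_opt = np.array([[2*z_opt[0], 2*z_opt[1]]])  # Jacobian of c at z_opt
grad_f = np.array([1.0, 0.0])                 # gradient of f
print(stationarity_residual(grad_f, J_opt))   # ~0 at the constrained optimum
```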

Modified iterative minimization formulation
In this subsection, four modifications to the IMF are proposed to improve its constraint handling and global convergence.
As the first modification, the rotation problem is revised for the effective handling of nonlinear constraints. The modified rotation problem is formulated as follows by using the Riemannian Hessian, which contains second-derivative information of the constraints [22], instead of the combination of the usual Hessian and the linearized constraints [Equation (30)], where ⟨·, ·⟩ denotes an inner product. It is noted that the modified rotation problem is formulated with the augmented objective function, Ṽ, containing the penalty term for inequality constraints. As a result, inequality constraints that are near active contribute to the curvature information through the penalty term. In spite of its lengthy expression, this modified rotation problem can still be handled efficiently by an eigenvalue solver [21]. Here, the explicit evaluations of the second derivatives {i.e. ∇ 2 Ṽ[z (k) ] and ∇ 2 c[z (k) ]} can be avoided by exploiting the following finite-difference approximations: where δ is a step size. After Equation (30) is solved, the minimum eigenvalue of the Riemannian Hessian is obtained, and the corresponding eigenvector is given by v (k+1) .

The second modification is made on the translation problem as follows: By imposing this linearized minimum-eigenvalue condition on the translation problem with a properly chosen threshold, ε eig , it is expected that the minimum eigenvalue at the resulting z (k+1) becomes negative. This can improve the global convergence of IMF, as discussed in Section 3.1.

Algorithm 1 Saddle-point search with trust region.
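The finite-difference idea referenced above can be sketched as a central-difference Hessian-vector product. This is the standard textbook approximation; the exact form used in the paper is not reproduced here, so the formula below is an assumption rather than the authors' specific variant.

```python
import numpy as np

def hess_vec_fd(grad, z, v, delta=1e-5):
    """Central finite-difference approximation of a Hessian-vector product:
    H(z) v ~ [grad(z + delta v) - grad(z - delta v)] / (2 delta).
    Only two gradient evaluations are needed; the Hessian is never formed."""
    return (grad(z + delta * v) - grad(z - delta * v)) / (2.0 * delta)

# Check against a quadratic V(z) = 0.5 z^T A z, whose Hessian is exactly A
A = np.array([[4.0, 1.0], [1.0, 3.0]])
grad = lambda z: A @ z
z = np.array([0.3, -0.7])
v = np.array([1.0, 2.0])
print(hess_vec_fd(grad, z, v))  # ~ A @ v = [6, 7]
```

This matrix-free product is exactly what an iterative eigenvalue solver needs to extract the minimum eigenpair without assembling the Hessian.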
The modified translation problem differs from the original NLP problem derived from the optimal control problem, Equation (21), in that (1) the objective function, V, is replaced by the auxiliary function, L, and (2) a linear constraint related to the minimum eigenvalue is introduced. This means that the translation problems have the same number of variables and only one more linear constraint, compared to the original NLP problem. Although the computational complexity of L is higher than that of V, its effect on the overall computational burden is minor, since the cost of handling c(y) = 0 and c I (y) ≤ 0 within the NLP solver is dominant. It is noted that the dimensions of c(y) and c I (y) are typically dozens or more in practical problems (see Section 6). Therefore, the computational burden of the translation problems is comparable to that of the original NLP problem.
The third modification is the implementation of a trust-region algorithm for adjusting r nei and ε eig , as shown in Algorithm 1. ε eig0 and ε tol are negative and positive tuning parameters, respectively, whose absolute values are recommended to be comparable to or larger than the optimality and feasibility tolerances of the NLP solver. In this paper, ε eig0 = −10 −4 and ε tol = 10 −4 . a, b 1 , b 2 , c 1 , and c 2 are selected as follows based on typical values used in trust-region strategies: a = 2, b 1 = 1.5, b 2 = 0.5, c 1 = 0.5, and c 2 = 1.5. Although these tuning parameter values empirically work well regardless of the characteristics of the problems, the initial size of the trust region (i.e. r nei0 ) may need to be chosen depending on the problem. While too small an r nei0 requires many iterations until convergence to the nearest saddle point, it should be small enough to prevent the solution from converging to other nearby stationary points. When many saddle points are expected to be located densely around the current point, a smaller r nei0 is recommended.
After the modified rotation problem is solved in line 3, the sign of the minimum eigenvalue at the current point, z (k) , is examined. If it is not less than zero with a slight margin (see line 5), the magnitude of ε eig is increased by multiplying it by a, and the modified translation problem is solved again in line 18. This procedure makes it more likely for the next z (k) to have a negative minimum eigenvalue, by enforcing a stronger constraint in the modified translation problem. In addition, if the residual of the constrained stationarity condition increases from the previous iteration by a factor of more than b 1 (see line 10), one of the following two situations is suspected. First, the modified translation problem produced a solution that is different from the intended local solution. Second, the trust region is too large for the auxiliary function, L, to work as expected. In either case, the size of the trust region is reduced by multiplying r nei by c 1 , and the modified translation problem is solved again in line 18. Criteria for reducing the magnitude of ε eig and enlarging r nei are also implemented in lines 7 and 12, respectively. As a result of the iterative procedure with such a trust-region safeguard, an index-one saddle point, z sad , is obtained when the termination condition, Equation (29), is satisfied.
The fourth modification is the implementation of an approximate ξ function for handling general nonlinear equality constraints. The following Newton retraction presented in [23] is adopted due to its stability and low computational cost: The original Newton retraction finds a point that satisfies the constraints, c(z) = 0, via the iterative Newton's method, and Equation (37) is its single-iteration version. This operation maps a point z to the proximity of the nonlinear manifold defined by the constraints.
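A single Newton-retraction step of this kind can be sketched as follows. The least-squares Newton step z − Jᵀ(JJᵀ)⁻¹c(z) is the standard form; the unit-circle constraint below is purely illustrative and not from the paper.

```python
import numpy as np

def newton_retraction(z, c, jac):
    """One Newton step towards the constraint manifold c(z) = 0:
    z_new = z - J^T (J J^T)^{-1} c(z)."""
    J = jac(z)
    cz = np.atleast_1d(c(z))
    return z - J.T @ np.linalg.solve(J @ J.T, cz)

# Illustrative constraint: the unit circle c(z) = z0^2 + z1^2 - 1
c = lambda z: np.array([z[0]**2 + z[1]**2 - 1.0])
jac = lambda z: np.array([[2*z[0], 2*z[1]]])

z0 = np.array([1.1, 0.4])          # a point off the manifold
z1 = newton_retraction(z0, c, jac)
print(abs(c(z0)[0]), abs(c(z1)[0]))  # constraint violation shrinks
```

One step already reduces the violation by roughly an order of magnitude here, which is why a single-iteration version suffices as an approximate projection inside the saddle-point search.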

Finding initial guess for iterative minimization formulation
As mentioned in Section 3.1, it is preferable that an initial guess of z that initiates Algorithm 1 has a negative minimum eigenvalue. In this subsection, an iterative procedure for finding such a point, starting from a point near a local minimum, is constructed. It is noted that, at the local minimum, the Hessian is positive definite, and the minimum eigenvalue is positive. By changing the translation problem and the termination condition in Algorithm 1, a new algorithm that accomplishes this goal is developed, as shown in Algorithm 2. The translation problem of Equation (38) is used instead of Equation (35) during this initialization step, where L′ is a new auxiliary objective function defined by Equation (39). L′ is formulated so that its descent direction corresponds to the gentlest ascent direction of V, similarly to L. The essential difference between L and L′ lies in their landscapes around y = z when z is a local minimum of V: L′[y; z, v] has an almost linear landscape around y = z, while L[y; z, v] has a saddle point at y = z. Since the initial-guess search starts from a local minimum of V, and gradient-based optimizers generally have difficulties near saddle points, L′ is used in the initial-guess search method instead of L. The size of the trust region is not adjusted during this initial-guess search process. The termination condition of Equation (40) is employed instead of Equation (29). It means that the iterations stop when the minimum eigenvalue at the current point, z (k) , is substantially smaller than zero. It is not guaranteed that the termination condition is satisfied within a finite number of iterations by Algorithm 2; the development of a method with proved termination is left for future research.

Successive search of local optimal solutions
As mentioned in Section 1, an index-one saddle point is located on the boundary of two basins of attraction in which steepest descent leads to different local minima. By exploiting this property, it is possible to find multiple local optimal solutions successively and systematically in the following way: (1) A saddle-point solution around the current minimum is searched by using Algorithms 1 and 2. (2) By choosing a proper initial guess near the found saddle point, a new local minimum may be obtained by using a gradient-based optimizer. (3) These two steps are repeated alternately until no new solution is found.

Algorithm 2 Initial-guess search for modified IMF.
1: k ← 0
2: while true do
3: Solve the modified rotation problem, Equation (30).
4: Solve the translation problem for initialization, Equation (38), with r nei .
5: if Stopping condition, Equation (40), is satisfied then
6: break
7: end if
8: k ← k + 1
9: end while
10: z ini ← z (k)

Initial guess for seeking new local minima
Let us describe how to calculate a proper initial guess near the found saddle point in the second step. Suppose that a saddle point, z sad , is obtained by running Algorithm 1 from a starting point, z ini . The Riemannian Hessian of the augmented objective function has one negative eigenvalue at the index-one saddle point, and along the direction defined by the corresponding eigenvector, the augmented objective function takes its maximum value at the saddle point. Therefore, the point given by z sad + δ 2 (z sad − z ini ) with δ 2 > 0 lies in a basin of attraction different from the one to which z ini belongs. This means that a new local solution is expected to be acquired by a gradient-based optimizer when the above-mentioned point is used as its initial guess. A concern is that the theoretical steepest-descent pathway and practical gradient-based optimizers may have different basins of attraction [24]. It is noted that the first-order steepest-descent algorithm usually suffers from slow convergence, and it is not applicable to practical large-scale problems. Figure 3 shows the basins of attraction of a sequential quadratic programming (SQP) solver [25] with the default settings for Himmelblau's function. Compared with Figure 2, the SQP solver produces basins similar to those of steepest descent around the saddle points for this specific function. In order to realize a robust search for new local solutions across various optimal control problems and gradient-based optimizers, it is recommended to try several initial guesses with different δ 2 's.
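The initial-guess rule z sad + δ 2 (z sad − z ini ) can be illustrated on a one-dimensional double well, where the single index-one saddle point (in 1-D, a local maximum) at the origin separates the two basins. This is a toy illustration under the stated rule, not any problem from the paper.

```python
def grad_descent(x0, grad, lr=1e-2, iters=5000):
    """Fixed-step gradient descent in one dimension."""
    x = float(x0)
    for _ in range(iters):
        x -= lr * grad(x)
    return x

# Double well V(x) = (x^2 - 1)^2 with minima at x = -1, +1 and a saddle at x = 0
grad = lambda x: 4.0 * x * (x**2 - 1.0)

z_ini = -0.8            # starting point inside the left basin
z_sad = 0.0             # saddle point separating the two basins
delta2 = 1e-2
guess = z_sad + delta2 * (z_sad - z_ini)  # just past the saddle, right basin

left = grad_descent(z_ini, grad)    # stays in the left basin, -> -1
right = grad_descent(guess, grad)   # reaches the new minimum, -> +1
print(left, right)
```

Stepping only a small multiple δ 2 of the displacement past the saddle is enough to land in the neighbouring basin, so the subsequent descent finds a genuinely new minimum.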

Successive search algorithm
Algorithm 3 shows the whole procedure for the successive search of local optima. δ 1 is a sufficiently small positive number (e.g. δ 1 = 10 −4 ). Δ 2 is a vector of positive numbers containing the trial values of δ 2 , and its dimension, n, is the number of trial initial guesses in solving the original NLP problem. In this paper, Δ 2 = [10 −4 , 10 −3 , 10 −2 ]. ε dup must be substantially larger than the optimality and feasibility tolerances of the NLP solver (e.g. 10 −2 ). Z sols and Z sads are the sets of local optimal solutions and saddle-point solutions, respectively, which are the outputs of this algorithm. Z tmp indicates the set of local optimal solutions whose nearby saddle points have not been explored yet.
In lines 2-4, the original NLP problem is solved with the user-specified initial guess, and the resulting solution is stored in Z sols and Z tmp . Lines 6-10 represent the core procedures consisting of the saddle-point search and the local-minimum search, and they are repeated until Z tmp becomes empty. In line 8, the eigenvector corresponding to the minimum eigenvalue at the current local minimum, z cur , is calculated and denoted by v. Since v and −v correspond to the gentlest ascent directions of the augmented objective function, they are used as promising initial search directions towards surrounding saddle points in lines 9 and 10. After a saddle point is found through lines 16 and 17, the original NLP problem is solved in line 20 with several initial guesses, as discussed in Section 4.1. If a newly found local minimum does not duplicate any solution obtained so far within the tolerance ε dup (see line 21), it is added to Z sols and Z tmp . In addition, the associated saddle point is appended to Z sads .

Algorithm 3 Successive search of local optimal solutions.
8: Solve modified rotation problem, Equation (30), with z (k) = z cur , to obtain v.
9: FindNewSol(z cur + δ 1 v)
10: FindNewSol(z cur − δ 1 v)
11: if Z tmp = {} then
12: break
13: end if
14: end while
15: function FindNewSol(z ini0 )
16: Run Algorithm 2 to obtain z ini .
17: Run Algorithm 1 to obtain z sad .
18: for i = 1:n do
20: Solve original NLP problem, Equation (21), with initial guess, z sad + δ 2 (z sad − z ini ), to obtain z * .
21: if ∄ z ∈ Z sols such that ||z − z * || 2 ≤ ε dup then

The NLP problems are solved by an SQP solver [25]. The optimality and feasibility tolerances are both 10 −4 . In order to calculate the gradient information required in the SQP algorithm accurately and to improve numerical stability, the complex-step derivative approximation [26] is used instead of the usual finite differencing.
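The complex-step derivative approximation mentioned above evaluates the function with a tiny imaginary perturbation; unlike finite differencing, it involves no subtractive cancellation, so the step size can be taken extremely small. The test function below is illustrative.

```python
import numpy as np

def complex_step_derivative(f, x, h=1e-20):
    """First derivative via the complex-step approximation:
    f'(x) ~ Im[f(x + i h)] / h  (no subtractive cancellation, so h can be
    far below what finite differencing tolerates)."""
    return np.imag(f(x + 1j * h)) / h

# Check on f(x) = exp(x) sin(x), whose derivative is exp(x)(sin x + cos x)
f = lambda x: np.exp(x) * np.sin(x)
x = 0.7
exact = np.exp(x) * (np.sin(x) + np.cos(x))
approx = complex_step_derivative(f, x)
print(approx - exact)  # error at machine precision
```

The only requirement is that the function be implemented with operations that extend analytically to complex arguments, which holds for the smooth dynamics and penalty functions used here.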

Characteristics of algorithm
One of the advantages of the developed successive search method over the other global optimization techniques mentioned in Section 1 is its scalability to larger or higher-dimensional problems. When rigorous global optimality is pursued via branch-and-bound methods, auxiliary problems must be solved repeatedly, and a finer discretization of the control variables leads to exponential growth of the computation time [8]. As for semidefinite relaxation, the dimension of the relaxed problem increases significantly when the original problem has strong nonlinearity [10]. The proposed method, on the other hand, explores multiple optimal solutions by repeating local gradient-based searches for saddle points and local minima. As a result, the sets of problems to be solved are (1) the original NLP problem [Equation (21)] arising from the optimal control problem, (2) the modified rotation problems [Equation (30)], and (3) the modified translation problems [Equation (35)]. The rotation problems can be handled by eigenvalue solvers, and they are not computationally expensive. The computational burden of the translation problems is comparable to that of the original NLP problem, as explained in Section 3.2. In addition, it is empirically found that the required number of saddle-point search iterations is around 10, regardless of the size of the problem. Therefore, global trajectory optimization by the developed method is expected to be feasible as long as the local solution to the original problem can be obtained in a stable way.
Saddle-point solutions to optimal control problems are just the byproducts of the systematic search of local optimal solutions, in contrast to the saddle-point search in computational physics. It is noted that a saddle-point solution is not usually a desirable design solution, since it is the most inferior point between two adjacent local optimal solutions. However, saddle points may be useful for discussing the relations between several local optimal solutions, because a saddle point is essentially a branch point between two local minima. Investigations on these issues are the subjects of future research.

Numerical example
In order to investigate the capability and efficiency of the developed method, the following example problem inspired by [10] is considered: After the transcription into an NLP problem with N = 21, Algorithm 3 is applied with r nei0 = 1.0. The penalty weight for Equation (45) is w C = 0.001. The initial guess z ini,u corresponds to x(t) = 0, u(t) = 0, t 0 = 0, and t f = 1. Due to the highly nonlinear objective function, seven locally optimal trajectories connected by six saddle-point trajectories are obtained, as shown in Figure 4. The local optimal solution obtained first at line 2 of Algorithm 3 is the fourth solution (J ≈ −1.48). The other local optima and saddle points are found under the connectivity illustrated by arrows in Figure 4. This extensive exploration of local optima reveals that the best locally optimal trajectory is the fourth solution (J ≈ −1.48). Figure 5 presents the residuals of the constrained stationarity condition, R (k) , as functions of the iteration number, k. During the initial-guess search using Algorithm 2, the residual increases since z (k) moves away from a local minimum. The average value of the computational local order of convergence [27], λ̄, defined as follows, is approximately 1.7: where k f denotes the iteration counter when the termination condition is satisfied. It is suggested in [16,17] that the original IMF has a quadratic convergence rate when the subproblem is solved exactly, and that it may achieve superlinear convergence when neither an exact nor a high-accuracy solution of the subproblem is pursued. The convergence result obtained in this numerical experiment demonstrates that a superlinear convergence rate is also attained by the modified IMF method developed in this paper. In order to investigate the robustness of the proposed method, a saddle-point search method without a trust-region strategy (i.e. the original IMF [16,17]) is also applied to this problem.
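The computational local order of convergence can be estimated from successive residuals. The sketch below uses the standard three-point formula λ k = ln(R (k+1) /R (k) ) / ln(R (k) /R (k−1) ), which is an assumed form since the paper's exact averaging equation is not reproduced here; it recovers the order of a synthetic superlinearly convergent sequence.

```python
import numpy as np

def local_orders(residuals):
    """lambda_k = ln(R_{k+1}/R_k) / ln(R_k/R_{k-1}) for each interior k."""
    R = np.asarray(residuals, dtype=float)
    return np.log(R[2:] / R[1:-1]) / np.log(R[1:-1] / R[:-2])

# Synthetic residual sequence with exact order p: R_{k+1} = R_k^p
p = 1.7
R = [0.5]
for _ in range(6):
    R.append(R[-1]**p)

lam_bar = np.mean(local_orders(R))
print(lam_bar)  # recovers the order, ~1.7
```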
In that case, the fifth saddle point cannot be reached from the fifth local minimum when the fixed value r nei = 1.0 is used. As a consequence, the sixth and seventh locally optimal trajectories are not found. This shows that the trust-region-based adjustment of r nei is useful for improving the robustness of the saddle-point search algorithm.
Let us next examine the validity of the penalty term in the modified rotation problem, Equation (30), for handling inequality constraints. For this purpose, the modified rotation problem with the Riemannian Hessian of the original objective function, V, is tentatively considered, instead of that of the augmented objective, Ṽ. Then, the saddle-point search near the lower and upper bounds of u fails, which implies that the penalty method employed in this paper for constraint handling is certainly effective.

Application to ascent trajectory of winged rocket
In this section, the proposed method is applied to the optimization of the ascent trajectory of a winged rocket. Reusable winged space transportation systems are expected to realize superior cost efficiency, safety, versatility, and operability compared to conventional expendable systems. Among possible vehicle configurations, a vertically launched suborbital vehicle developed under an industry-academia-government collaboration [28] is considered. The latest major specifications of the vehicle used in this section are shown in Table 1. The optimal ascent trajectory of conventional rockets along the one-dimensional vertical line for maximizing attained altitude has been actively studied both analytically [29] and numerically [30]. On the other hand, the optimal ascent trajectory of winged vehicles on the two-dimensional vertical plane, considering the lift force, requires further investigation. Figure 6 shows the schematics of the ascent trajectory optimization problem. In order to derive a numerically tractable but practically viable problem, the following assumptions and approximations are introduced.

Problem formulation
• Two-degree-of-freedom equations of motion of the vehicle as a point mass on the vertical plane are employed.
• The Cartesian coordinate system, [x, h], whose origin is located at the launch point, is used. x and h denote down range and altitude, respectively.
• A uniform gravity field with the magnitude g = 9.8 m/s^2 is considered.
• Aerodynamic coefficients are modeled as functions of Mach number, as shown in Figure 7. Lift-to-drag ratio, L/D, is included in Figure 7 as well.
• The consumption rate of propellant is constant before the engine cut-off, and the resulting thrust force changes according to the ambient pressure.
The ascent flight is divided into the following three phases as shown in Figure 6.
• Lift-off phase: The vehicle is launched with the vertical attitude, and it continues to ascend vertically for the first five seconds after the lift-off, in order to leave the launch facility safely.
• Powered-ascent phase: The vehicle continues to accelerate until the engine thrust is cut off at 128 s after lift-off. The angle of attack during the powered-ascent phase is designed between −10 deg and 10 deg, a reasonable range in which longitudinal trim is typically possible.
• Coasting phase: After the engine cut-off, the coasting ascent continues until the vehicle reaches the highest point (apogee). Aerodynamic forces in this phase are quite small, and they are ignored in the present trajectory optimization problem. In addition, it is imposed that the apogee point is 25 km away from the launch point along the x direction. This requirement comes from ground safety considerations regarding flight termination in abort scenarios and falling debris in explosive accidents.
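Under the coasting-phase assumptions above (no aerodynamic forces, uniform gravity), the apogee position follows in closed form from the state at engine cut-off: the vehicle coasts ballistically and reaches apogee when its vertical velocity vanishes. A minimal sketch (function and variable names are illustrative; the paper's own expressions are its displayed equations):

```python
G0 = 9.8  # uniform gravity, m/s^2, as assumed in the formulation

def apogee_from_terminal_state(x_f, h_f, vx_f, vh_f, g=G0):
    """Apogee position for a ballistic coast under uniform gravity with
    aerodynamic forces ignored. Apogee is reached after t_apo = vh_f / g
    seconds of coasting, during which the horizontal velocity is constant.
    """
    t_apo = vh_f / g
    x_apo = x_f + vx_f * t_apo               # constant horizontal drift
    h_apo = h_f + vh_f**2 / (2.0 * g)        # kinematics of the free coast
    return x_apo, h_apo
```

For instance, cutting off at 60 km altitude with 980 m/s of vertical velocity adds 49 km of coasting altitude, which is consistent in scale with the apogee altitudes reported in the results section.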
The maximization of the apogee altitude is the objective of the trajectory optimization. Based on the assumptions above, only the powered-ascent phase is optimized, as illustrated in Figure 6. The initial conditions at t = t_0 are given by the pre-determined trajectory of the lift-off phase, and the position of the apogee point can be calculated from the terminal position and velocity at t = t_f. As a result, the trajectory optimization problem is formulated as follows, where h, v_x, v_h, and α denote altitude, horizontal velocity, vertical velocity, and angle of attack, respectively. The thrust force, Th, depends on the ambient pressure, p, through a relation whose second term represents the pressure loss of the thrust. x_apo and m denote the down range at the apogee point and the vehicle mass, respectively. Lift, L, and drag, D, are expressed in terms of q, ρ, Ma, and c, which denote the dynamic pressure, air density, Mach number, and speed of sound, respectively. C_L,1(Ma), C_L,0(Ma), C_D,2(Ma), C_D,1(Ma), and C_D,0(Ma) are nonlinear functions of Mach number that approximate the CFD results; they are constructed by using a radial basis function network with multiquadric basis [32]. The atmospheric model is as follows:
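The paper's displayed equations of motion are not reproduced in this excerpt. As a generic sketch of a planar point-mass model consistent with the stated assumptions (thrust acting along the body axis, offset from the velocity direction by the angle of attack; lift perpendicular and drag antiparallel to the velocity), which may differ in detail from the paper's exact formulation:

```python
import numpy as np

def eom_2dof(state, alpha, thrust, lift, drag, m, g=9.8):
    """Generic two-DOF point-mass dynamics on the vertical plane.

    state = (x, h, vx, vh); alpha is the angle of attack (rad).
    An illustrative sketch, not the paper's exact equations.
    """
    x, h, vx, vh = state
    gamma = np.arctan2(vh, vx)              # flight path angle
    theta = gamma + alpha                   # body pitch angle
    ax = (thrust * np.cos(theta)
          - drag * np.cos(gamma)
          - lift * np.sin(gamma)) / m       # horizontal acceleration
    ah = (thrust * np.sin(theta)
          - drag * np.sin(gamma)
          + lift * np.cos(gamma)) / m - g   # vertical acceleration
    return np.array([vx, vh, ax, ah])
```

With all forces zeroed, the model reduces to free fall under uniform gravity, matching the coasting-phase assumption.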

Result and discussion
For numerical stability, the original problem formulated in the previous subsection is scaled so that the resulting variables have magnitudes approximately on the order of unity. This scaled problem is then transcribed into an NLP problem with N = 31. Although the constraint "x_apo − 25 × 10^3 = 0" in Equation (52) is not written in the form of a conventional terminal condition due to Equation (55), it can be converted into an equality constraint by approximating the integration with Gaussian quadrature. The penalty weight for Equation (53) is w_C = 100. The resulting NLP problem is solved by using Algorithm 3 with r_nei0 = 1.0. Two locally optimal solutions and one saddle-point solution between them are obtained, as shown in Figures 8-10, respectively. The total number of iterations of Algorithms 1 and 2 for convergence to the saddle point is 10, and the computational local order of convergence given by Equation (46) is around 1.4. These results imply that realistic and complex optimal control problems can be solved by the proposed methodology in a scalable manner. The apogee altitudes of the three trajectories are 154.8 km, 151.8 km, and 146.1 km, respectively. Since the dynamic pressure during the coasting phase is confirmed to be small, the assumption of zero aerodynamic force introduced in Section 6.1 is valid.
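The quadrature conversion mentioned above replaces an integral of the discretized states with a weighted sum, turning the integral-type condition into an ordinary algebraic equality on the NLP variables. A minimal sketch of the Gauss-Legendre device (the helper name is illustrative):

```python
import numpy as np

def gauss_legendre_integral(f, t0, tf, n=10):
    """Approximate the integral of f over [t0, tf] with n-point
    Gauss-Legendre quadrature: nodes/weights on [-1, 1] are mapped
    to [t0, tf] and combined as a weighted sum of samples of f.
    """
    xi, wi = np.polynomial.legendre.leggauss(n)   # nodes and weights
    t = 0.5 * (tf - t0) * xi + 0.5 * (tf + t0)    # affine map to [t0, tf]
    return 0.5 * (tf - t0) * np.sum(wi * f(t))
```

Because n-point Gauss-Legendre quadrature is exact for polynomials up to degree 2n − 1, the weighted sum is an accurate stand-in for the integral, and its arguments are exactly the NLP variables, so the condition becomes a conventional equality constraint.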
The left parts of Figures 8-10 present the ascent trajectories consisting of the powered-ascent and coasting phases, together with enlarged views around the launch point. The two optimal trajectories differ from each other until around 50 s, during which the angle of attack moves between its upper and lower bounds. It may be counterintuitive that the second optimal trajectory is better than the saddle-point trajectory even though it flies in the negative down-range direction at the beginning of the flight. In a previous design study of a winged suborbital vehicle by the authors [33], a path constraint forcing the time derivative of the flight path angle to be negative was intentionally imposed in order to obtain a seemingly suitable ascent trajectory. However, the present result reveals that such a design approach yields an inferior ascent trajectory that is close to a saddle point rather than to an optimum. The attainable altitude of the saddle-point trajectory is lower than that of the first optimal trajectory by 8.7 km, which is not huge but practically meaningful.
The reason for the different apogee altitudes of the three trajectories can be examined quantitatively in terms of the vertical velocity loss. The ideal vertical velocity increment is the one realized in the absence of thrust loss, aerodynamic forces, attitude control, and gravity. The vertical velocity loss is defined as the difference between this ideal velocity increment and the increment actually obtained, and the losses concerning the thrust, L_t, the aerodynamic forces, L_a, the control, L_c, and the gravity, L_g, are calculated accordingly.
Figure 11. Vertical velocity loss of optimal and saddle-point ascent trajectories of a winged rocket.
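The exact integrands of L_t, L_a, and L_c are given by the paper's displayed equations, which are not reproduced here. Given sampled instantaneous contributions along the trajectory, the losses themselves are just time integrals; a minimal sketch of their numerical accumulation (function names are illustrative):

```python
import numpy as np

def _trapz(y, t):
    """Trapezoidal rule for samples y(t)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

def velocity_losses(t, thrust_loss, aero_loss, ctrl_loss, g=9.8):
    """Accumulate the vertical-velocity losses from sampled instantaneous
    contributions (the integrands plotted in Figure 11). The exact
    integrands are defined by the paper's equations and are passed in
    as arrays; the gravity loss is g times the fixed burn duration.
    """
    L_t = _trapz(thrust_loss, t)     # thrust (pressure) loss
    L_a = _trapz(aero_loss, t)       # aerodynamic loss (negative = gain)
    L_c = _trapz(ctrl_loss, t)       # control (steering) loss
    L_g = g * (t[-1] - t[0])         # gravity loss, identical for all
    return L_t, L_a, L_c, L_g
```

Note that a negative aerodynamic "loss" is a net gain, which is exactly how the lift-exploiting optimal trajectories outperform the saddle-point trajectory in the discussion below.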
Note that all feasible trajectories have the same gravity loss in this case, because the duration of powered flight is fixed. The integrands in L_t, L_a, and L_c represent the instantaneous contributions of the thrust loss, the aerodynamic forces, and the control loss, respectively; they are shown in Figure 11 for the two optimal trajectories and the saddle-point trajectory. The two optimal trajectories have large control losses before 40 s because their thrust vectors are not aligned with the vertical direction. However, these losses are overcome by the positive gains (i.e. negative losses) generated by the aerodynamic forces, more specifically the lift. As a result, the first optimal trajectory has the smallest total loss, followed by the second optimal trajectory. Let us explain why the first optimal trajectory stops flying in the down-range direction and stops exploiting the lift force at around 80 s. When lift utilization is not beneficial, a more vertical trajectory with a lower maximum dynamic pressure (like the present saddle-point trajectory) becomes better, since it reduces drag; in this sense, the dynamic pressure can be regarded as an indicator of the degree of lift utilization. In both the first and second optimal trajectories, the dynamic pressure reaches its maximum at 45 s, where the Mach number is around 0.8. Given that the lift-to-drag ratio deteriorates above Mach 0.8 (see Figure 7), it can be inferred that the lift-assisted ascent is profitable only in the flight regime where the lift-to-drag ratio is sufficiently high.
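The two indicators used in this argument, dynamic pressure and Mach number, follow directly from the flight speed and the atmospheric model (ρ and c as defined in the formulation). A trivial but explicit sketch:

```python
def dynamic_pressure_and_mach(rho, c, vx, vh):
    """Dynamic pressure q = 0.5 * rho * V^2 and Mach number Ma = V / c
    for the planar velocity (vx, vh); rho and c come from the
    atmospheric model of the formulation."""
    V = (vx**2 + vh**2) ** 0.5
    return 0.5 * rho * V**2, V / c
```

Since both q and Ma grow with speed while ρ and c vary with altitude, their joint peak around 45 s locates the regime in which lift utilization is, or is not, worthwhile.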

Conclusion
This paper presented an enhanced iterative minimization formulation for saddle-point search and its application to multi-modal optimal control. After a local minimum is found, two surrounding saddle points are searched for, from which new local minima are expected to be obtained by using a gradient-based optimizer. By repeating this procedure, multiple locally optimal solutions connected by saddle-point solutions are explored systematically. The developed method is applicable to general optimal control problems with equality and inequality constraints. It is especially suitable for finding better local solutions around a currently obtained solution in complex practical problems, when the pursuit of rigorous global optimality is prohibitively expensive. Numerical experiments with an example problem demonstrated that the proposed algorithm, equipped with globalization strategies and efficient constraint handling, has robust search performance and superlinear convergence.
Then, the two-dimensional ascent trajectory optimization of a vertically-launched winged rocket for maximizing the apogee altitude was performed using the present method. The two locally optimal trajectories exploit the lift force before the lift-to-drag ratio deteriorates in the transonic flight regime, and it was revealed that a vertically-ascending trajectory, which is common for conventional non-winged rockets, is actually an inferior saddle-point solution.

Disclosure statement
No potential conflict of interest was reported by the author(s).