Second-Order Necessary/Sufficient Conditions for Optimal Control Problems in the Absence of Linear Structure

Second-order necessary conditions for optimal control problems are considered, where "second-order" is meant in the sense that Pontryagin's maximum principle is viewed as a first-order necessary optimality condition. A sufficient condition for a local minimizer is also given.


Hongwei Lou
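Let us consider a minimizer $x_0$ of a smooth function $f(\cdot)$ on $\bar\Omega$, where $\bar\Omega$ is the closure of a domain $\Omega \subseteq \mathbb{R}^n$. We call a unit vector $\ell$ an admissible direction if there exists a $\delta > 0$ such that $x_0 + s\ell \in \bar\Omega$ for any $s \in [0, \delta]$. If $\ell$ is admissible, we have the following first-order necessary condition:
\[ \langle \nabla f(x_0), \ell \rangle \ge 0. \tag{1.1} \]
If
\[ \langle \nabla f(x_0), \ell \rangle = 0 \tag{1.2} \]
holds, i.e., (1.1) degenerates, then we can further get the second-order necessary condition:
\[ \langle D^2 f(x_0)\ell, \ell \rangle \ge 0, \tag{1.3} \]
where $D^2 f$ is the Hessian matrix of $f$. If (1.2) does not hold, that is, $\langle \nabla f(x_0), \ell \rangle > 0$, then (1.3) does not necessarily hold.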
From the above observations, we see that to yield second-order conditions of a minimizer, linear structure of independent variables is needed and second-order conditions only appear when first-order conditions degenerate.
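The last remark, that the second-order inequality may fail when the first-order inequality is strict, can be checked on a one-dimensional example. The function and domain below are hypothetical choices made only for illustration; they do not come from the paper.

```latex
% Illustrative example (assumed, not from the paper):
% n = 1, \Omega = (0, 1/2), f(x) = x - x^2, x_0 = 0.
\[
  f(x) = x(1 - x) \ge 0 = f(0)
  \quad \text{for all } x \in \bar\Omega = [0, \tfrac12],
\]
% so x_0 = 0 is a minimizer on \bar\Omega, and \ell = 1 is the only
% admissible unit direction at x_0. At this point
\[
  \langle \nabla f(x_0), \ell \rangle = f'(0) = 1 > 0,
  \qquad
  \langle D^2 f(x_0)\ell, \ell \rangle = f''(0) = -2 < 0 .
\]
% The first-order inequality is strict, and the second-order inequality fails.
```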
For an optimal control problem, usually the control domain U need not have linear structure.
Thus, the space $\mathcal{U}_{ad}$ of control functions need not have linear structure either. Pontryagin's maximum principle is a necessary condition satisfied by a minimizer, and many regard it as the first-order necessary condition. However, Pontryagin's maximum principle cannot be obtained directly in a way like (1.2). First, for an optimal control $\bar u(\cdot)$, there is probably no "admissible direction" $v(\cdot)$ such that $\bar u(\cdot) + sv(\cdot)$ is still in $\mathcal{U}_{ad}$. Secondly, even if an "admissible direction" $v(\cdot)$ exists, what we can get from
\[ 0 \le \lim_{s \to 0^+} \frac{J(\bar u(\cdot) + sv(\cdot)) - J(\bar u(\cdot))}{s} \]
is only a corollary of Pontryagin's maximum principle, which looks like (4.4). Here $J(\cdot)$ denotes the cost functional of the optimal control problem.
When linear structure is lacking, could we replace the "admissible direction" by an "admissible path"?
In other words, could we replace $\bar u(\cdot) + sv(\cdot)$ by $u_s(\cdot) \in \mathcal{U}_{ad}$, which is continuous in some sense in $s \in [0, 1]$? Certainly, we can do that. Yet an "admissible path" immediately leaves us puzzled about which conditions are first-order and which are second-order. To see this, let us consider the function $f(\cdot)$ and its minimizer $x_0$ again. Let $\ell$ be an admissible direction such that (1.2) holds. Then, choosing $x(s) = x_0 + \sqrt{s}\,\ell$, we have
\[ 0 \le \lim_{s \to 0^+} \frac{f(x(s)) - f(x_0)}{s} = \frac{1}{2} \langle D^2 f(x_0)\ell, \ell \rangle. \tag{1.4} \]
Should we then call (1.4) a first-order condition? For this reason, we think it is not a good idea to replace the "admissible direction" by an "admissible path". In this paper, we transform the original optimal control problem into a new problem, which is in fact a local relaxation of the original one. In this new problem, the corresponding space of control functions has linear structure, and we can derive Pontryagin's maximum principle in the manner of (1.1) under this linear structure.
Then we can further derive second-order conditions based on Pontryagin's maximum principle.
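The computation behind the path $x(s) = x_0 + \sqrt{s}\,\ell$ can be spelled out with a Taylor expansion. The following is a sketch assuming $f$ is twice continuously differentiable near $x_0$ and $\langle \nabla f(x_0), \ell \rangle = 0$; it shows why a derivative that is "first-order" in the path parameter $s$ carries second-order information about $f$.

```latex
% Sketch, assuming f is C^2 near x_0 and <\nabla f(x_0), \ell> = 0.
\begin{align*}
  f(x_0 + \sqrt{s}\,\ell)
    &= f(x_0) + \sqrt{s}\,\langle \nabla f(x_0), \ell \rangle
       + \frac{s}{2}\,\langle D^2 f(x_0)\ell, \ell \rangle + o(s) \\
    &= f(x_0) + \frac{s}{2}\,\langle D^2 f(x_0)\ell, \ell \rangle + o(s),
       \qquad s \to 0^+ .
\end{align*}
% Dividing by s > 0 and using the minimality of x_0:
\[
  0 \le \lim_{s \to 0^+} \frac{f(x_0 + \sqrt{s}\,\ell) - f(x_0)}{s}
    = \frac{1}{2}\,\langle D^2 f(x_0)\ell, \ell \rangle .
\]
```

The order of a condition obtained along a path therefore depends on how the path is parametrized, which is the ambiguity the text points out.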
To present our idea clearly, we restrict attention to optimal control problems governed by ordinary differential equations.
The rest of the paper is organized as follows. In Section 2, we give a method to linearize the control space near the optimal control. In Section 3, we give a new proof of Pontryagin's maximum principle. Section 4 is devoted to second-order necessary conditions of optimality.
Finally, a sufficient condition for a control being a local minimizer will be given in Section 5.

2. Local Linearization of Optimal Control Problems.
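In this section, we linearize an optimal control problem locally along its minimizer. Let us consider the following controlled system:
\[ \frac{dx(t)}{dt} = f(t, x(t), u(t)), \quad \text{a.e. } t \in [0, T]; \qquad x(0) = x_0, \tag{2.1} \]
together with a cost functional $J(u(\cdot))$, where $T > 0$ and $u(\cdot) \in \mathcal{U}_{ad}$, the space of admissible control functions with values in the control domain $U$. We pose the following assumptions: (S1) The metric space $(U, \rho)$ is separable.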
Now, let $\bar u(\cdot) \in \mathcal{U}_{ad}$ be a minimizer of $J(\cdot)$ over $\mathcal{U}_{ad}$. We linearize $\mathcal{U}_{ad}$ along $\bar u(\cdot)$ in the following manner. Define the set $\mathcal{M}_{ad}$ of measure-valued controls, where $\delta_v$ denotes the Dirac measure at $v$ on $U$. For an element $\sigma(\cdot) \in \mathcal{M}_{ad}$, we can define $x(\cdot) = x(\cdot; \sigma(\cdot))$ as the solution of the corresponding state equation, together with the corresponding cost functional $J(\sigma(\cdot))$.

We can see that $x(\cdot; u(\cdot))$ and $J(u(\cdot))$ coincide with $x(\cdot; \delta_{u(\cdot)})$ and $J(\delta_{u(\cdot)})$, respectively. Thus, $\mathcal{U}_{ad}$ can be viewed as a subset of $\mathcal{M}_{ad}$ by identifying $u(\cdot) \in \mathcal{U}_{ad}$ with $\delta_{u(\cdot)} \in \mathcal{M}_{ad}$. Readers who are familiar with relaxed controls will immediately see that $\mathcal{M}_{ad}$ is a subset of the space of relaxed controls. Yet the elements of $\mathcal{M}_{ad}$ are much simpler than general relaxed controls. This is why we need neither to impose additional assumptions, such as compactness of the control domain as Warga did (cf. [31]), nor to introduce relaxed controls defined by finitely additive probability measures as Fattorini did (cf. [9]).

$\mathcal{M}_{ad}$ has a linear structure at $\bar u(\cdot)$, i.e., it contains all elements of the form $\delta_{\bar u(\cdot)} + \alpha(\delta_{u(\cdot)} - \delta_{\bar u(\cdot)})$ with $\alpha \in [0, 1]$. It can be proved easily that $\delta_{\bar u(\cdot)}$ is a minimizer of $J(\sigma(\cdot))$ over $\mathcal{M}_{ad}$. Using this fact, we can derive Pontryagin's maximum principle from the corresponding first-order inequality. It is easy to prove the following results.
Consequently, by the Arzelà-Ascoli theorem, along a subsequence $\varepsilon \to 0^+$, $x^{\alpha,\varepsilon}(\cdot)$ converges uniformly to some $y(\cdot)$ on $[0, T]$. Thus, using a generalization of the Riemann-Lebesgue theorem (see Ch. II, Theorem 4.15 in [33]), we can easily prove that (2.14) holds along a subsequence $\varepsilon \to 0^+$. It then follows from (2.14) that the corresponding limit relation holds. Passing to the limit as $\varepsilon \to 0^+$ in the equality satisfied by $x^{\alpha,\varepsilon}(\cdot)$, we get from (2.15) that $y(\cdot)$ satisfies the equation defining $x^{\alpha}(\cdot)$, i.e., $y(\cdot) = x^{\alpha}(\cdot)$. Furthermore, we can see that the whole family $x^{\alpha,\varepsilon}(\cdot)$ converges uniformly to $x^{\alpha}(\cdot)$ on $[0, T]$. Combining this with (2.16), we obtain (2.17).

3. Pontryagin's Maximum Principle.
Now, we derive Pontryagin's maximum principle from (2.9). The idea of our proof can be traced back to the works on relaxed controls (cf. [10] and [31], for example). However, the proof given below contains some improvements. Moreover, it can also be applied to problems governed by partial differential equations, even with state constraints (cf. [17]).
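The uniform convergence of $x^{\alpha,\varepsilon}(\cdot)$ to $x^{\alpha}(\cdot)$ used at the end of Section 2 can be illustrated numerically. The sketch below rests on two assumptions that are our reading of the text, not statements from the paper: $x^{\alpha}$ is the state driven by the mixture $(1-\alpha)\delta_{\bar u} + \alpha\delta_{u}$, and $x^{\alpha,\varepsilon}$ is the state of an ordinary control that switches between $u$ and $\bar u$ with period $\varepsilon$. The dynamics `f`, the controls `u_bar` and `u_other`, and all numerical values are hypothetical.

```python
import math

ALPHA = 0.3       # mixing weight alpha in [0, 1] (hypothetical value)
T = 1.0           # time horizon (hypothetical value)
N_STEPS = 20000   # number of explicit Euler steps


def f(x, u):
    """Hypothetical scalar dynamics, nonlinear in the state x."""
    return -x + u * math.cos(x)


def u_bar(t):
    """Stand-in for the reference control."""
    return 0.0


def u_other(t):
    """Stand-in for a competing admissible control."""
    return 1.0


def euler(rhs, x0=0.0):
    """Integrate dx/dt = rhs(t, x) on [0, T] by explicit Euler; return all states."""
    dt = T / N_STEPS
    xs = [x0]
    for k in range(N_STEPS):
        xs.append(xs[-1] + dt * rhs(k * dt, xs[-1]))
    return xs


def relaxed_rhs(t, x):
    """Dynamics averaged against (1 - alpha) * delta_{u_bar} + alpha * delta_{u}."""
    return (1 - ALPHA) * f(x, u_bar(t)) + ALPHA * f(x, u_other(t))


def chattering_rhs(eps):
    """Dynamics under a control using u on the first alpha-fraction of each period eps."""
    def rhs(t, x):
        phase = (t / eps) % 1.0
        u = u_other(t) if phase < ALPHA else u_bar(t)
        return f(x, u)
    return rhs


def sup_error(eps):
    """Sup-norm distance on the time grid between relaxed and chattering states."""
    relaxed = euler(relaxed_rhs)
    chatter = euler(chattering_rhs(eps))
    return max(abs(a - b) for a, b in zip(relaxed, chatter))


for eps in (0.1, 0.01):
    print(f"eps = {eps}: sup-norm gap = {sup_error(eps):.5f}")
```

In this toy setting the gap shrinks as the switching period decreases, which is the behaviour the compactness argument establishes in general. The explicit Euler scheme is chosen only for brevity.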
We keep the notation of §2 and denote $\bar x(\cdot) = x(\cdot; \bar u(\cdot))$. We have (3.1), where $f_x(t, x, u)$ denotes the transpose of the Jacobian matrix of $f$ with respect to $x$. By (3.1), (S2), and the same argument as in the proof of the uniform convergence of $x^{\alpha,\varepsilon}(\cdot)$, we obtain (3.2), in which $X(\cdot)$ solves the variational equation. Now, by introducing the adjoint equation, with solution $\psi(\cdot)$, we get (3.5) from (2.17) and Lebesgue's dominated convergence theorem, where $H$ denotes the Hamiltonian. Then, since $U$ is separable and $H$ is continuous in $u$, it follows from (3.5) and a standard argument that (3.7) holds.

4. Second-Order Necessary Optimality Conditions.
We turn to second-order necessary optimality conditions, where Pontryagin's maximum principle is viewed as a first-order necessary optimality condition. In other words, we give a second-order necessary condition for optimality that discriminates among singular controls in the sense of Pontryagin's maximum principle.
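A standard toy problem shows what a singular control in the sense of Pontryagin's maximum principle looks like. The problem data below are hypothetical, and the Hamiltonian convention $H(t, x, u, \psi) = \langle \psi, f(t, x, u) \rangle - f^0(t, x, u)$, with $f^0$ denoting the running cost, is an assumption made for this sketch. The conclusion does not depend on the sign convention, because the adjoint state vanishes identically.

```latex
% Hypothetical example (not from the paper):
%   dx/dt = u,  x(0) = 0,  U = [-1, 1],  J(u) = \int_0^T x(t)^2 \, dt .
% Since J >= 0 and \bar u \equiv 0 gives \bar x \equiv 0 and J(\bar u) = 0,
% the control \bar u \equiv 0 is a minimizer.
\[
  \frac{d\psi(t)}{dt} = -f_x\,\psi(t) + f^0_x = 2\,\bar x(t) = 0,
  \qquad \psi(T) = 0
  \quad \Longrightarrow \quad \psi \equiv 0 .
\]
% Hence, for every v in U = [-1, 1],
\[
  H(t, \bar x(t), v, \psi(t)) = \psi(t)\, v - \bar x(t)^2 = 0
  = H(t, \bar x(t), \bar u(t), \psi(t)) .
\]
% The maximum condition holds with equality for every admissible value v,
% so it does not discriminate between controls near \bar u.
```

Every measurable control with values in $[-1, 1]$ satisfies the maximum condition along $(\bar x, \psi)$ here, which is the degenerate situation where a second-order condition is needed.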
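One can see that equality holds in (3.7) if and only if $u(t)$ maximizes the Hamiltonian along $(\bar x(\cdot), \psi(\cdot))$. The controls in $\mathcal{U}_{ad}$ for which this equality holds for almost all $t$ are called singular controls in the sense of Pontryagin's maximum principle. If $U$ is an open subset of $\mathbb{R}^m$, the maximum condition yields a stationarity condition in $u$; the controls satisfying it are called singular controls in the classical sense (see Definitions 1 and 2 in [11]).

Now we make the following assumption: (S3) Functions $f$ are twice continuously differentiable in $x$, together with an additional bound on the second derivatives. We mention that (S2) implies a corresponding bound on the first derivatives. The following theorem gives second-order necessary optimality conditions.

Theorem 4.1. Let (S1)-(S3) hold and let $\bar u(\cdot)$ be a minimizer of $J(\cdot)$ over $\mathcal{U}_{ad}$. Let $W(\cdot)$ be the solution of the second-order adjoint equation (4.7), and let $\Phi(\cdot)$ be the solution of the associated matrix equation, where $I$ is the $n \times n$ identity matrix. Then the corresponding second-order inequality holds for any $u(\cdot) \in \mathcal{U}_{ad}$.

Proof. Let $u(\cdot) \in \mathcal{U}_{ad}$. Then by (4.2), for any $\alpha \in (0, 1]$, we obtain an expansion whose terms include
\[ \ldots \langle f^k_x(s, \bar x(s), \bar u(s)), Y_\alpha(s) \rangle \, ds, \qquad k = 1, 2, \ldots, n. \]
Using (S3) and arguing as in the derivation of (3.2), we obtain the convergence of $Y_\alpha(\cdot)$ to $Y(\cdot)$, where $Y(\cdot)$ solves the second variational equation (4.13). Then it follows from (4.10) and Lebesgue's dominated convergence theorem that a corresponding limit inequality holds, where $\operatorname{tr} B$ denotes the trace of a matrix $B$.

Now, we introduce the second-order adjoint equation (4.7). By (S1)-(S3), we can see that (4.7) admits a unique solution $W(\cdot)$. Since $W(\cdot)^\top$ satisfies (4.7) too, $W(\cdot)$ must be symmetric. Since $\operatorname{tr}(AB) = \operatorname{tr}(BA)$ for every $k \times j$ matrix $A$ and $j \times k$ matrix $B$, the trace terms can be rewritten by means of (4.7). Thus, it follows from (4.14) and (4.16) that the resulting expression contains the term
\[ \ldots - H_x(t, \bar x(t), u(t), \psi(t)),\ \Phi(t)\Phi(s)^{-1}\big(f(s, \bar x(s), u(s)) - f(s, \bar x(s), \bar u(s))\big) \rangle \, ds. \]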
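Therefore, combining the above with (4.19), we get (4.20). That is, for any $u(\cdot) \in \mathcal{U}_{ad}$,
\[ \langle F(t, u(t)), G(t, u(t)) \rangle \le 0, \quad \text{a.e. } t \in [0, T]. \tag{4.21} \]
For $k = 1, 2, \ldots$, denote by $T_k$ the corresponding exceptional set. Then $T_k$ is measurable. We claim that $T_k$ has zero measure. Otherwise, for any $t \in T_k$, the associated set $\Gamma(t)$ is a nonempty closed subset of $U$. It is easy to see that $\Gamma(\cdot)$ is measurable, since the function defining it is measurable in $t$ and continuous in $v$. Thus, by Lemma 4.2, there exists a measurable function $\tilde u(\cdot)$ that is a selection of $\Gamma(\cdot)$ on $T_k$.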
This contradicts (4.21). Therefore, $T_k$ has zero measure. Consequently, the union of the sets $T_k$ has zero measure too. That is, the desired pointwise inequality holds for almost all $t \in [0, T]$. This completes the proof.

It is not necessary to suppose that $U$ is a Polish space in order to obtain (4.21). However, in general we cannot get (4.22) from (4.21) if we only suppose that $U$ is a separable metric space. To see this, we introduce the following example.