A Nonsmooth Maximum Principle for Optimal Control Problems with State and Mixed Constraints-Convex Case

Here we derive a nonsmooth maximum principle for optimal control problems with both state and mixed constraints. Crucial to our development is a convexity assumption on the "velocity set". The approach consists of applying known penalization techniques for state constraints together with recent results for mixed constrained problems.

1. Introduction. In this paper we develop a nonsmooth maximum principle for optimal control problems with both pure state and mixed state-control constraints in the presence of a convexity assumption. The problem of interest is

(P)  Minimize l(x(a), x(b))
     subject to
       ẋ(t) = f(t, x(t), u(t))   a.e. t ∈ [a, b],
       h(t, x(t)) ≤ 0            for all t ∈ [a, b],
       (x(t), u(t)) ∈ S(t)       a.e. t ∈ [a, b],
       (x(a), x(b)) ∈ E.

The state x and control u are subject to joint, or mixed, constraints through the condition (x(t), u(t)) ∈ S(t), where t → S(t) ⊂ R^n × R^k is a multifunction. The function f : R × R^n × R^k → R^n describes the system dynamics and h : [a, b] × R^n → R is the function defining the pure state constraint. Furthermore, the closed set E ⊂ R^n × R^n and the function l : R^n × R^n → R specify the endpoint constraints and cost. This problem involves measurable control functions u and absolutely continuous functions x. A pair (x, u) is called an admissible process if it satisfies the constraints of the problem with finite cost.
We say that the process (x̄, ū) is a strong local minimum if, for some ε > 0, it minimizes the cost over admissible processes (x, u) such that |x(t) − x̄(t)| ≤ ε for all t ∈ [a, b]. We assume the following basic hypotheses on the problem data throughout: f is L × B^{n+k} measurable, S is L × B measurable, E is closed and l is locally Lipschitz.
Necessary optimality conditions for nonsmooth problems with pure state constraints have been studied systematically for quite some time (see [22] for details and references therein). On the other hand, problems with mixed state-control constraints, amply studied in a smooth framework (see, for example, [1], [2], [7], [14], [15], [16], [17], [19], [20]), have received little attention in the nonsmooth setting. Attempts to treat mixed constrained problems with nonsmooth data were in general tentative (see, for example, [13], [12]) until quite recently when, in [8], necessary conditions in the form of a nonsmooth maximum principle were developed. However, the literature on nonsmooth maximum principles with both mixed and state constraints remains surprisingly sparse.
In this paper we develop a nonsmooth maximum principle for problem (P) with both pure state and mixed state-control constraints under some convexity assumptions. To achieve our purpose we intertwine established approaches used for state constraints with up-to-date developments for problems with mixed constraints. Indeed, we follow closely the approach developed in [21] (see also [10] and [12]), where necessary conditions for pure state constrained problems are derived. Our proofs differ from those in [10] since we deal not only with state constraints but also with mixed constraints. Accordingly, applications of the nonsmooth maximum principle for mixed constrained problems derived in [8] (instead of those in [9]) play a crucial role in our analysis. There is, however, a price to pay: here we assume that the solution of (P) is a strong minimum, in contrast with [8], where a weaker notion of minimum, that of local minimum of radius R, is used (in this respect see also [6]). We also need to strengthen the hypotheses in comparison with those in [8]. The convexity assumption we impose in this paper may be seen as a major hindrance to some applications. Although this assumption can be removed following the lines of [11], we opt, for the sake of simplicity, to report that work elsewhere together with a discussion of the hypotheses and illustrations of applications.

2. Preliminaries. For g in R^m, inequalities like g ≤ 0 are interpreted componentwise. Here and throughout, B represents the closed unit ball centered at the origin, regardless of the dimension of the underlying space, and | · | denotes the Euclidean norm or the induced matrix norm on R^{p×q}. The Euclidean distance function with respect to a given set A ⊂ R^m is

d_A(x) := inf{|x − y| : y ∈ A}.

We make use of standard concepts from nonsmooth analysis. Let A ⊂ R^k be a closed set with x̄ ∈ A. The proximal normal cone to A at x̄ is denoted by N^P_A(x̄), while N^L_A(x̄) denotes the limiting normal cone and N^C_A(x̄) the Clarke normal cone.
Given a lower semicontinuous function f : R^k → R ∪ {+∞} and a point x̄ ∈ R^k where f(x̄) < +∞, ∂^L f(x̄) denotes the limiting subdifferential of f at x̄. When the function f is Lipschitz continuous near x, the convex hull of the limiting subdifferential, co ∂^L f(x), coincides with the (Clarke) subdifferential ∂^C f(x). For details on such nonsmooth analysis concepts, see for example [4, 5, 18, 22].
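To illustrate the gap between the two subdifferentials, the following worked example (a standard one, not taken from this paper) uses the concave function f(x) = −|x| on R at the point 0. There the limiting subdifferential is not convex, and the Clarke subdifferential is its convex hull.

```latex
% Illustrative example, not from the paper: f(x) = -|x| on R, evaluated at x = 0.
% Away from 0, f is differentiable with f'(x) = -sign(x), so limits of nearby
% gradients produce exactly the two points -1 and +1.
% (\operatorname requires the amsmath package.)
\[
  \partial^{L} f(0) = \{-1,\ 1\},
  \qquad
  \partial^{C} f(0) = \operatorname{co}\,\partial^{L} f(0) = [-1,\ 1].
\]
```

By contrast, for the convex function |x| both sets equal [−1, 1] at the origin.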
3. Auxiliary Results. In this section we present a simplified version of one of the main results in [8] that will be of importance in the forthcoming developments.
Assume for the time being that E ⊂ R^n × R^n and l : R^n × R^n → R. Consider the following problem:

(C) [...]

For some ε > 0 define

[...]

[L_*^ε] There exist constants k_x^φ and k_u^φ such that for almost every t ∈ [a, b] and every [...]

If this assumption is imposed on f, then the Lipschitz constants are denoted by k_x^f and k_u^f. As for S(t), we consider the following bounded slope condition:

[BS_*^ε] There exists a constant k_S such that for almost every t ∈ [a, b] the following condition holds: [...]

The two previous hypotheses are strengthenings of the analogous hypotheses in [8]. For the sake of uniformity and for the analysis in the forthcoming sections, we need to impose an extra hypothesis on the set S_*^ε(t). We assume that:

[CS_*^ε] The set S_*^ε(t) is closed and there exists an integrable function c such that for almost every t ∈ [a, b] the following holds: [...]

We observe that although [CS_*^ε] is a strong assumption, it is nevertheless of importance in our future development. Necessary conditions of optimality for (C) are given by the following theorem:

[...]

4. The Convex Case. We now turn to problem (P). For this problem we derive a nonsmooth maximum principle under the following convexity assumption on the "velocity set":

[C] [...]

Furthermore, we need to impose two more hypotheses on the data of our problem, one related to the state constraint and another to the mixed constraints.
[H1] For all x ∈ x̄(t) + εB the function t → h(t, x) is continuous and there exists a scalar k_h > 0 such that the function [...]

[H2] For almost every t ∈ [a, b] the following condition holds: for all u ∈ S(t, x̄(t)) and every sequence x_n → x̄(t) there exists a sequence u_n ∈ S(t, x_n) such that u_n → u.

In the above, the set S(t, x) is defined as

S(t, x) := {u ∈ R^k : (x, u) ∈ S(t)},

where S(t) is as in (1). For a discussion on the need to impose continuity of t → h see [10]. Hypothesis [H2] asserts the lower semicontinuity of the multifunction x → S(t, x) (for the definition and properties see [3]).
Assume the basic hypotheses. Also suppose that f satisfies [L_*^ε] and that both [BS_*^ε] and [CS_*^ε] hold. Under these assumptions we note for future use that the following conditions are satisfied:

[...] a.e. t, (6)

[...] for all u ∈ S(t, x̄(t)) a.e. t ∈ [a, b],

and there exists an integrable function k such that |f(t, x̄(t), u)| ≤ k(t) for all u ∈ S(t, x̄(t)) a.e. t.
Before proceeding we need to define the following subdifferential:

[...]

We are now in a position to state our main result.

Remark 2. It is also easy to deduce from the proofs that when assumption [H2] is not imposed, a "weaker" version of the necessary conditions for (P) (in the vein of [9]) can be obtained: all the conclusions but (iii) (the Weierstrass condition) hold.
We derive Theorem 4.1 in two main stages. In the first stage we establish the validity of the theorem for the following problem:

(Q) [...]

Problem (Q) is a special case of (P) in which E = {x_a} × E_b and l(x_a, x_b) = l(x_b).
where q is as in (9).
The local minimality of (x̄, ū) provides some ε > 0. By reducing this constant if necessary, we can also rely on the hypotheses. The proof breaks into several steps.
Define the following problem for each i ∈ N:

(Q_i) [...]

where h^+(t, x) := max{0, h(t, x)}. This differs from (Q) by shifting the state constraint into the objective function. Following the approach in [21] (see also [10]), let us temporarily assume that penalization is effective, i.e.,

[IH] [...]

We will justify this assumption later.
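As a rough illustration of the penalization idea, the display below sketches one common form of a penalized cost. The integral form and the name J_i^pen are assumptions made here for illustration only; the precise penalized problem is the one stated as (Q_i) in the paper.

```latex
% Hypothetical sketch: an integral penalty for the state constraint h(t, x(t)) <= 0.
% The integral form and the symbol J_i^{pen} are assumptions, not the paper's display.
% (\text requires amsmath; \mathbb requires amssymb.)
\[
  J_i^{\mathrm{pen}}(x) := l(x(b)) + i \int_a^b h^{+}(t, x(t))\, dt,
  \qquad
  h^{+}(t, x) := \max\{0,\ h(t, x)\}.
\]
% If h(t, x(t)) <= 0 for all t in [a, b], then h^{+}(t, x(t)) = 0 for all t, hence
\[
  J_i^{\mathrm{pen}}(x) = l(x(b)) \quad \text{for every } i \in \mathbb{N}.
\]
```

Under this form, arcs satisfying the state constraint pay no penalty, while any violation is charged at a rate that grows with i.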
Let W denote the set of measurable functions u : [a, b] → R^k for which there exists an absolutely continuous function x such that ẋ(t) = f(t, x(t), u(t)) and (x(t), u(t)) ∈ S(t) for almost every t ∈ [a, b], x(t) ∈ x̄(t) + εB for all t ∈ [a, b], x(a) = x_a and x(b) ∈ E_b. We provide W with the metric ∆(u, v) := ‖u − v‖_{L^1} and define J_i : W → R using the arc x mentioned above:

[...]

It is a simple matter to check that (W, ∆) is a complete metric space in which the functional J_i : W → R is continuous (see [4]). Moreover, problem (Q_i) above is closely related to the abstract problem

(R_i) [...]

Clearly (ū, x̄(b)) is admissible for (R_i), with J_i(ū) = l(x̄(b)) = inf P, since h^+(t, x̄(t)) = 0 for all t ∈ [a, b]. Let ε_i = J_i(ū) − inf P_i. We have ε_i ≥ 0 and, taking into account [IH], ε_i → 0. Ekeland's variational principle (see [22]) applies. It asserts the existence of u_i ∈ W such that

[...]

and u_i minimizes over W the perturbed cost functional

[...]

Let x_i be the trajectory corresponding to u_i.
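For reference, a commonly used form of Ekeland's variational principle is sketched below in the notation of this section. The choice of the constant √ε_i is the usual one and is an assumption here; the exact constants in the paper's own estimates may differ.

```latex
% Sketch of a standard form of Ekeland's variational principle (constants assumed).
% Setting: (W, \Delta) is a complete metric space, J_i : W -> R is lower
% semicontinuous and bounded below, and \bar{u} in W satisfies
%   J_i(\bar{u}) <= \inf_W J_i + \varepsilon_i   with \varepsilon_i > 0.
% (\text requires the amsmath package.)
\[
  \exists\, u_i \in W: \quad
  \Delta(u_i, \bar{u}) \le \sqrt{\varepsilon_i},
  \qquad
  J_i(u_i) \le J_i(u) + \sqrt{\varepsilon_i}\, \Delta(u, u_i)
  \quad \text{for all } u \in W.
\]
% The second inequality says that u_i minimizes the perturbed functional
%   u \mapsto J_i(u) + \sqrt{\varepsilon_i}\, \Delta(u, u_i)   over W.
```

In this form, ε_i → 0 forces ∆(u_i, ū) → 0, which is the mechanism behind the convergence of u_i to ū in the L^1 metric.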
Step 3: Study optimality conditions for the perturbed problem. In control-theoretic notation, our work with Ekeland's theorem shows that the process (x_i, u_i) solves the following optimal control problem:

(D_i) [...]

Since ε_i → 0 (by [IH]), it follows from (16) that u_i → ū strongly. We can then arrange, by subsequence extraction if necessary, that u_i → ū almost everywhere. We can further deduce that x_i → x̄ uniformly. By discarding initial terms of the sequence we can guarantee that (x_i, u_i) is a local minimum for a variant of problem (D_i) obtained by dropping the constraint x(t) ∈ x̄(t) + εB. We now fix our attention on the related subsequence of problems, without relabeling.
Theorem 3.1 applies to (D_i). It provides an absolutely continuous function p_i and a scalar λ_i ≥ 0 such that

[...]

These conditions have consequences we now seek to express in terms of the original problem (Q).
Apply Clarke's sum rule [4] to (19) and take into account the properties of the subdifferentials of the distance function. We deduce that there exist measurable functions ξ_i, ζ_i, γ_i, e_i, φ_i and ϕ_i such that, for almost every t in [a, b],

[...] (27)

To simplify this further, let h_0(t, x) = 0 and h_1(t, x) = h(t, x), so that h^+(t, x) = max{h_0(t, x), h_1(t, x)}. Then for each fixed t, Clarke's max rule [4] says [...]. Clearly ∂^C_{x,u} h_0 ≡ {(0, 0)}, so a typical element of the right side has the form [...]. Tracking these dependencies leads to the following expansion of (26) and (27):

[...]

We now introduce the measure µ_i ∈ C*([a, b]; R): [...] and we have [...] for every Borel set B. Here b_i = p_i(a). Taking (21) into account we have [...]. Since α_i(t) ∈ Σ_i(t), we have µ_i ∈ C^⊕([a, b]; R) and this measure has support in {t ∈ [a, b] : h(t, x_i(t)) ≥ 0}. Since, by (18), b_i and λ_i are not both zero, we may conclude, after rescaling, that [...]

Step 4: Take limits. Our first steps have dealt with fixed i ∈ N. We now consider the case when i → ∞. Recall that the sequence x_i converges uniformly to x̄ and u_i → ū almost everywhere. Under the hypotheses, and appealing to Gronwall's inequality, we deduce the existence of a constant K_1 such that |π_i| ≤ K_1. It follows from (30) and (32) that |p_i(t)| ≤ K_1 + 1. We now deduce from the above that π_i → π weakly* for some measure π. Consequently |π_i| → |π|.
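The max rule used in Step 3 can be spelled out for h^+ = max{0, h} as follows. This is a sketch of the standard rule from nonsmooth analysis, written under the assumption that x → h(t, x) is Lipschitz near the point considered; the multiplier name α is chosen here for illustration.

```latex
% Sketch: Clarke's max rule applied to h^{+}(t, x) = \max\{0, h(t, x)\} for fixed t,
% assuming x -> h(t, x) is Lipschitz near x. The symbol \alpha is illustrative.
% (\text requires the amsmath package.)
\[
  \partial^{C}_{x} h^{+}(t, x) \subset
  \bigl\{ \alpha\, \gamma \;:\; \alpha \in [0, 1],\ \gamma \in \partial^{C}_{x} h(t, x) \bigr\},
\]
% where only the active functions enter the convex hull, so that
\[
  \alpha = 0 \ \text{ if } h(t, x) < 0,
  \qquad
  \alpha = 1 \ \text{ if } h(t, x) > 0,
  \qquad
  \alpha \in [0, 1] \ \text{ if } h(t, x) = 0.
\]
```

Since the multiplier vanishes wherever h(t, x) < 0, only points with h(t, x) ≥ 0 contribute, which is consistent with the support property of µ_i noted in Step 3.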
With the above, and appealing to Lemma 4.3 in [21], we can now conclude that there exists some subsequence such that p_i(t) → q(t) a.e., where q is now a function of bounded variation defined as

[...]

Under the hypotheses we deduce from (22) that |(ξ_i(t), ζ_i(t))| ≤ max{k_x^f, k_u^f} a.e. The Dunford-Pettis theorem (see for example [22, Theorem 2.51]) asserts the existence of a subsequence converging weakly in the L^1 topology to some function (ξ, ζ) such that ξ, ζ ∈ L^1. Taking into account (24) and (25), we deduce in the same way that e_i → e and (φ_i, ϕ_i) → (φ, ϕ) for some e, φ, ϕ ∈ L^1, where the convergence is understood in the weak L^1 topology. Upper semicontinuity properties of the subdifferentials assert that (22)-(25) hold when we remove the indices i.
Observe that ∂^C_x h(t, x) ⊂ ∂̄_x h(t, x) (see (8) for the definition of ∂̄_x h(t, x)) and that ∂̄_x h(t, x) has closed graph. It follows from [21, Lemma 4.3] that there exists a Borel measurable, µ-integrable function γ such that γ(t) ∈ ∂̄_x h(t, x̄(t)) µ-a.e. This is (14) of the proposition.
We now turn to (31). The properties of limiting normal cones and limiting subdifferentials assert that

[...]

We concentrate on the support of the measure µ. Mimicking the arguments in [10], it is a simple matter to see that supp{µ} ⊂ {t ∈ [a, b] : h(t, x̄(t)) = 0}. This is conclusion (14) of the proposition.
Step 5: Show that [C] implies [IH]. We omit the details since the conclusion can be obtained by adapting the arguments in [10].
6. Sketch of the Proof of Theorem 4.1. The proof comprises three stages. We omit the details. We first extend Proposition 1 to problems where x(a) ∈ E_a and E_a is a closed set. This is done following the lines at the end of the proof of Theorem 3.1 in [21]. Thus we obtain necessary conditions when (x(a), x(b)) ∈ E_a × E_b. Next we consider the case when the cost is l = l(x(a), x(b)). This is done using the technique in Step 2 of Section 6 in [12]. Finally, following the approach in Section 6 in [12], we derive necessary conditions when (x(a), x(b)) ∈ E and E is a closed set. This completes the proof.