A stochastic maximum principle with dissipativity conditions

In this paper we prove a version of the maximum principle, in the sense of Pontryagin, for the optimal control of a finite dimensional stochastic differential equation driven by a multidimensional Wiener process. We drop the usual Lipschitz assumption on the drift term and substitute it with dissipativity conditions, allowing polynomial growth. The control enters both the drift and the diffusion term and takes values in a general metric space.


Introduction
The stochastic maximum principle (SMP for brevity) is a standard tool for providing necessary conditions for optimal control problems. After the well-known paper by Peng [14] for finite dimensional systems, there has been a large number of works on this subject. Firstly, X.Y. Zhou simplified Peng's proof in [19] and studied the relationship between the SMP and dynamic programming. A detailed exposition of this work is contained in [18]. Later, a generalization with random coefficients and without L^p-bounds on the control was formulated in [5] for linear equations by Cadenillas and Karatzas. Moreover, other directions have been followed. For example, in [12,16] the authors studied a version of the SMP for a class of noises with jumps, and in [1] the case of non-smooth coefficients of the state equation has also been treated. Regarding the infinite dimensional case there are still open issues. Indeed, most of the results are only concerned with a convex control domain, or with the case in which the diffusion term does not depend on the control [2,10]. Recently, several works have been devoted to the study of a general infinite dimensional SMP; see e.g. [7,8,9,11,17].
In this paper we are interested in formulating another version of the general SMP for the optimal control of a stochastic differential equation in a finite dimensional setting, driven by a multidimensional Wiener process. More precisely, we drop the Lipschitz assumption on the drift term and we replace it with a more natural sign condition, known as dissipativity or monotonicity. This condition is widely studied in the literature, both in finite and infinite dimension. In particular, Peng introduced it in [15], in the study of backward stochastic differential equations (BSDEs in the following) with random terminal time. Let us also mention the contributions of Pardoux [13], Briand et al. [3], and more recently Briand and Confortola [4]. If the state equation is

dx(t) = b(t, x(t), u(t))dt + σ(t, x(t), u(t))dW(t), x(0) = x_0,

with cost functional of the form

J(u(·)) = E [ ∫_0^T f(t, x(t), u(t))dt + h(x(T)) ],

then the dissipativity of b is expressed by the following:

⟨b(t, x, u) − b(t, x′, u), x − x′⟩ ≤ α|x − x′|^2, P-a.s., t ∈ [0, T], u ∈ U, (1)

for every x, x′ ∈ R^n and some constant α ∈ R. The key fact is that the dissipativity condition is inherited by both the first and second variation equations of the state. Indeed, a condition of the following type holds for the drift term, P-a.s., for all y in R^n:

⟨D_x b(t, x, u)y, y⟩ ≤ α|y|^2.

Moreover, the BSDE arising as first adjoint equation also satisfies such a condition, and this fact enables us to prove well-posedness via an existence and uniqueness result provided by P. Briand et al. in [3]. For the second adjoint equation, which is matrix valued, we get the same result by equipping the space with the Hilbert-Schmidt norm.
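Dissipativity without Lipschitz continuity is not a mere technicality: it is what keeps trajectories of superlinear drifts under control. The following numerical sketch is purely illustrative and not part of the paper: the drift b(x) = -x^3 (dissipative with α = 0), the tamed Euler scheme and all parameters are our own choices.

```python
import numpy as np

# Illustrative dissipative SDE: dx = -x^3 dt + dW. The drift is not
# Lipschitz, but <b(x) - b(y), x - y> <= 0, i.e. dissipative with alpha = 0.
rng = np.random.default_rng(0)
T, n_steps, x0 = 1.0, 1000, 5.0
dt = T / n_steps

x = np.empty(n_steps + 1)
x[0] = x0
for i in range(n_steps):
    b = -x[i] ** 3
    # Tamed Euler step: the explicit Euler scheme may explode for
    # superlinear drifts, so the drift increment is capped per step.
    drift = dt * b / (1.0 + dt * abs(b))
    x[i + 1] = x[i] + drift + np.sqrt(dt) * rng.standard_normal()

# The dissipative drift pulls the state toward 0 despite the cubic growth.
print(np.abs(x).max(), abs(x[-1]))
```

Despite the polynomially growing drift, the path never escapes: the sign condition pushes the state back toward the origin, which is exactly the mechanism behind the a priori estimates of Section 2.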
Regarding the diffusion term of the state equation, we impose the usual Lipschitz hypothesis and we assume that both coefficients are regular in x. For the cost functional, we allow polynomial growth for the coefficients. Finally, we stress the fact that, with these assumptions, we are able to treat polynomials of odd degree with strictly negative leading term as drift coefficients, instead of imposing only linear growth. The paper is organized as follows. In Section 2 we fix notations and assumptions, we introduce the adjoint BSDEs and we state some preliminary results on the stochastic differential equation of the state. In Section 3 we give the statement of the main result. Section 4 is devoted to the spike variation technique and to the expansion of the cost. In Section 5 we conclude the proof of the SMP. Finally, Section 6 is devoted to the case in which the domain of the controls is convex and only one adjoint equation is needed. A sufficient condition of optimality is also exhibited.

Notations and Preliminaries
Throughout this paper we let W = {W_1(t), . . . , W_d(t)}_{t≥0} be a standard d-dimensional Brownian motion defined on some complete probability space (Ω, F, P). We denote by (F_t)_{t≥0} the natural filtration associated to W, satisfying the usual conditions. We suppose that all the processes are defined for times t ∈ [0, T]. Then we denote by P the σ-algebra on Ω × [0, T] generated by the progressively measurable processes. For any p ≥ 1 we define L^p_F(0, T; R^n) as the space of progressively measurable processes x(·) such that E ∫_0^T |x(t)|^p dt < ∞. The space of control actions is a general metric space U (except in Section 6), endowed with its Borel σ-algebra B(U). Furthermore, the class of admissible controls is defined by requiring that they are progressively measurable with respect to {F_t}_{t≥0}; more precisely,

U[0, T] := {u : Ω × [0, T] → U : u(·) is (F_t)-progressively measurable}.

Finally, we will denote by |·| the Euclidean norm in R^n.
Here we want to study a finite horizon stochastic control problem with state equation

dx(t) = b(t, x(t), u(t))dt + σ(t, x(t), u(t))dW(t), x(0) = x_0, (3)

and with a cost functional given by

J(u(·)) = E [ ∫_0^T f(t, x(t), u(t))dt + h(x(T)) ]. (4)

If x(·) is a solution of (3) and u(·) ∈ U[0, T], then we call (x(·), u(·)) an admissible pair. The control problem can be formulated as a minimization of the cost over U[0, T]; more precisely, a control ū is optimal if

J(ū(·)) = inf_{u(·) ∈ U[0,T]} J(u(·)). (5)

Hypotheses

1. The map x → b(t, x, u) is continuous and satisfies an α-dissipativity condition, in the sense that there exists a constant α ∈ R such that, P-a.s.,

⟨b(t, x, u) − b(t, x′, u), x − x′⟩ ≤ α|x − x′|^2, t ∈ [0, T], u ∈ U,

for every x, x′ ∈ R^n.
2. The map x → σ(t, x, u) is C^2(R^n; R^{n×d}) and there exists a constant C_1 > 0 such that, P-a.s.,

|σ(t, x, u) − σ(t, x′, u)| ≤ C_1|x − x′|, for all x, x′ ∈ R^n, u ∈ U.
3. It is easy to see that the derivative of b(t, x, u) with respect to x also satisfies a dissipativity condition. Indeed, for all y ∈ R^n, P-a.s.,

⟨D_x b(t, x, u)y, y⟩ ≤ α|y|^2. (7)
This property is crucial in order to guarantee the well-posedness of the first and second variation of the state equation.
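The dissipativity of D_x b follows directly from (1) by a difference-quotient argument; the paper only states the conclusion, so the following short derivation is our reconstruction, in the paper's notation:

```latex
% For y in R^n take x' = x + \varepsilon y in (1):
%   \langle b(t, x+\varepsilon y, u) - b(t,x,u), \varepsilon y \rangle
%     \le \alpha |\varepsilon y|^2 = \alpha \varepsilon^2 |y|^2 .
% Dividing by \varepsilon^2 and letting \varepsilon \to 0,
\langle D_x b(t,x,u)\, y,\; y \rangle
  = \lim_{\varepsilon \to 0}
    \frac{\langle b(t, x+\varepsilon y, u) - b(t,x,u),\, \varepsilon y \rangle}
         {\varepsilon^2}
  \le \alpha |y|^2 , \qquad \mathbb{P}\text{-a.s.}
```

Note that the same constant α appears, which is why the first and second variation equations inherit exactly the dissipativity constant of the state equation.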
The following result is essential for the well-posedness of the optimal control problem and it concerns the existence and uniqueness of a solution to the state equation, for any u ∈ U[0, T].
Although it is known in the literature (cf. e.g. [6]), we provide a sketch of the proof for completeness.
Proposition 2.1. Under the above hypotheses, for every u(·) ∈ U[0, T] the state equation (3) admits a unique solution x(·) ∈ C([0, T]; L^2(Ω; R^n)). Moreover, there exists a constant C = C(T, p), depending on T and p ≥ 1, such that

E sup_{t∈[0,T]} |x(t)|^p ≤ C(1 + |x_0|^p). (8)

Proof. To simplify the notation we drop the dependence on the control; the case of the controlled equation can be treated in exactly the same way. Fixing γ ∈ C([0, T]; L^2(Ω; R^n)), we want to show that the problem with diffusion coefficient frozen along γ admits a unique solution J(γ) which belongs to C([0, T]; L^2(Ω; R^n)). The existence part follows from the fact that the initial problem can easily be reformulated as a differential equation with random coefficients: setting M(t) := ∫_0^t σ(s, γ(s))dW(s), which is well defined thanks to the linear growth implied by the Lipschitz assumption, the process v(t) := x(t) − M(t) solves the random ODE v′(t) = b(t, v(t) + M(t)), v(0) = x_0. Since b(·) is continuous, we know that there is a local solution which can be easily extended to the whole [0, T], by the dissipativity assumptions. Now we have to verify that the operator J : C([0, T_0]; L^2(Ω; R^n)) → C([0, T_0]; L^2(Ω; R^n)) is a contraction if T_0 is small enough. Applying Itô's formula and taking expectations we get, for any γ_1, γ_2 ∈ C([0, T]; L^2(Ω; R^n)), a contraction estimate, where we used the assumptions on the coefficients and the Gronwall lemma. Eventually, a standard iteration covers the whole interval [0, T]. Now we have to deal with the two backward stochastic differential equations arising as adjoint equations in the formulation of the SMP. The first order adjoint equation has the following form:

dp(t) = −[D_x b(t, x̄(t), ū(t))^T p(t) + Σ_{j=1}^d D_x σ^j(t, x̄(t), ū(t))^T q_j(t) − D_x f(t, x̄(t), ū(t))]dt + q(t)dW(t), p(T) = −D_x h(x̄(T)), (11)

where p(·) is the first order adjoint process and x̄(t), ū(t) are the optimal trajectory and the optimal control process, respectively. It is worth noting that the coefficient D_x b(t, x̄(t), ū(t)) in front of p(t) is dissipative, as we observed in Remark 3; this is the key fact in order to check the well-posedness of the equation. Then, as was pointed out in [14], the presence of the control in the diffusion term forces the introduction of a second adjoint process, which can be represented as the solution of a matrix valued BSDE of the form

dP(t) = −[D_x b(t)^T P(t) + P(t)D_x b(t) + Σ_{j=1}^d D_x σ^j(t)^T P(t)D_x σ^j(t) + Σ_{j=1}^d (D_x σ^j(t)^T Q_j(t) + Q_j(t)D_x σ^j(t)) + D_x^2 H(t, x̄(t), ū(t), p(t), q(t))]dt + Σ_{j=1}^d Q_j(t)dW_j(t), P(T) = −D_x^2 h(x̄(T)), (12)

where H is the Hamiltonian, defined by

H(t, x, u, p, q) := ⟨p, b(t, x, u)⟩ + Σ_{j=1}^d ⟨q_j, σ^j(t, x, u)⟩ − f(t, x, u).

Also in this case we have a kind of monotonicity in the first term.
Now we observe that a solution of the first (resp. second) adjoint BSDE is a pair of adapted processes (p(·), q(·)) ∈ L^2_F(0, T; R^n) × L^2_F(0, T; R^{n×d}) (resp. (P(·), Q(·)) with values in S^n × (S^n)^d), where S^n is the space of symmetric n × n matrices. Indeed, the following theorem holds.

Theorem 2.2. Under hypotheses (H1)-(H5) the adjoint equation (11) has a unique adapted solution (p(·), q(·)).

Proof. Thanks to the growth assumptions in hypothesis (H5) and to (8), the data of the equation have the required integrability. Hence, using the result in [3], Theorem 4.1, page 119, the hypotheses of that theorem are satisfied by the BSDE (11) and the proof is complete.
Regarding the second adjoint equation we need to check that the drift term remains dissipative even though the BSDE is matrix valued. To do so we introduce the Hilbert-Schmidt norm and the corresponding scalar product on S^n:

⟨P, Q⟩_HS := Tr(PQ^T), ‖P‖_HS^2 := Tr(PP^T).

Then we can state the following.

Theorem 2.3. Under hypotheses (H1)-(H5) the adjoint equation (12) has a unique adapted solution (P(·), Q(·)).

Proof. Let us consider firstly the scalar product, in the Hilbert-Schmidt sense, between the dissipative part of the drift and P. Decompose PP^T in the following way:

PP^T = Σ_i γ_i c_i c_i^T,

where γ_i ≥ 0 and c_i are the eigenvalues and the (orthonormal) eigenvectors of PP^T, respectively. Then we have

⟨D_x b(t)^T P + P D_x b(t), P⟩_HS = 2 Tr(D_x b(t) PP^T) = 2 Σ_i γ_i ⟨D_x b(t)c_i, c_i⟩ ≤ 2α Σ_i γ_i = 2α‖P‖_HS^2. (14)

That is exactly the dissipativity condition we need. In fact, as in Theorem 2.2, using again the result in [3], Theorem 4.1, and taking into account the dissipativity obtained in (14), we get the required result.
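The trace inequality behind the matrix-valued dissipativity is a purely linear-algebraic fact: if ⟨Ay, y⟩ ≤ α|y|^2 for all y, then Tr(A M) ≤ α Tr(M) for every positive semidefinite M = PP^T. A small numerical sanity check (illustrative only; A stands in for D_x b and the matrices are random, not objects from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n))          # stands in for D_x b(t)
# The best dissipativity constant of A is the largest eigenvalue of its
# symmetric part, since <Ay, y> = <(A + A^T)/2 y, y>.
alpha = np.linalg.eigvalsh((A + A.T) / 2).max()

for _ in range(100):
    P = rng.standard_normal((n, n))
    hs_norm_sq = np.trace(P @ P.T)       # squared Hilbert-Schmidt norm
    lhs = np.trace(A @ P @ P.T)          # <A P, P>_HS
    assert lhs <= alpha * hs_norm_sq + 1e-9
print("Tr(A P P^T) <= alpha * ||P||_HS^2 on 100 random samples")
```

The proof of Theorem 2.3 is exactly this computation, carried out through the spectral decomposition of PP^T.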

Statement of the Theorem
Now we are in a position to state the Pontryagin-type stochastic maximum principle for the optimal control problem (5) associated to the state equation (3).
Theorem 3.1. Let (x̄(·), ū(·)) be an optimal pair for problem (5). Then there exist pairs of processes (p(·), q(·)) and (P(·), Q(·)), solutions to the BSDEs (11) and (12) respectively, such that

H(t, x̄(t), u, p(t), q(t)) − H(t, x̄(t), ū(t), p(t), q(t)) + (1/2) Tr[(σ(t, x̄(t), u) − σ(t, x̄(t), ū(t)))^T P(t)(σ(t, x̄(t), u) − σ(t, x̄(t), ū(t)))] ≤ 0,

for all u ∈ U, a.e. t ∈ [0, T], P-a.s. Here we want to stress that, under our assumptions, we can consider a state equation whose drift coefficient is a polynomial of degree 2m + 1 with strictly negative leading term, with coefficients c_i(·), c_ij(·) satisfying 0 < λ ≤ c_i(·) ≤ C and |c_ij(·)| ≤ C. This is a genuine generalization of the classical Lipschitz case, in which only linear growth is allowed.
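To make the polynomial drift concrete, take n = 1, m = 1 and b(x) = -x^3 + x (our illustrative instance, with the coefficients frozen to constants). One checks by hand that (b(x) - b(y))(x - y) = (x - y)^2 (1 - (x^2 + xy + y^2)) ≤ (x - y)^2, so condition (1) holds with α = 1 even though b is not globally Lipschitz. A numerical verification:

```python
import numpy as np

def b(x):
    # degree-3 polynomial with strictly negative leading term (m = 1)
    return -x ** 3 + x

rng = np.random.default_rng(2)
x = rng.uniform(-100, 100, 10_000)
y = rng.uniform(-100, 100, 10_000)

# alpha-dissipativity (1) with alpha = 1:
#   (b(x) - b(y)) (x - y) <= 1 * (x - y)^2
assert np.all((b(x) - b(y)) * (x - y) <= (x - y) ** 2 + 1e-6)

# ...while the Lipschitz quotient |b(x) - b(y)| / |x - y| is unbounded:
lip = np.abs(b(x) - b(y)) / np.abs(x - y)
print(lip.max())   # grows with the sampling range
```

The dissipativity constant α = 1 is uniform over the whole line, whereas any Lipschitz constant would have to grow like the square of the sampling range; this is precisely the gap between the classical theory and the present setting.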

Spike Variation Technique
In this section we are going to study a Taylor expansion of the state trajectory with respect to a needle perturbation of the control. Let E_ε ⊂ [0, T] be a set of measure ε and ū an optimal control; then, given an arbitrary u(·) ∈ U[0, T], we define the perturbed control as

u^ε(t) := ū(t)χ_{[0,T]\E_ε}(t) + u(t)χ_{E_ε}(t).

Remark. In general U does not have a linear structure, hence a perturbation like ū(t) + εu(t) is meaningless unless U is, for example, a convex set. We will discuss this case later.
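A spike variation in discrete form: on a time grid, the perturbed control equals a fixed action on a window E_ε = [t_0, t_0 + ε) and the optimal control elsewhere. The sketch below is illustrative; the grid, t_0 and the two control values are our own choices.

```python
import numpy as np

T, n = 1.0, 10_000
t = np.linspace(0.0, T, n, endpoint=False)   # left endpoints of the grid cells
dt = T / n

u_bar = np.zeros(n)          # stand-in for the optimal control \bar u
u_spike_value = 1.0          # a fixed action u in U
t0, eps = 0.3, 0.01          # E_eps = [t0, t0 + eps)

in_E_eps = (t >= t0) & (t < t0 + eps)
# u^eps = u on E_eps, \bar u elsewhere
u_eps = np.where(in_E_eps, u_spike_value, u_bar)

# The perturbation is "small in measure", not small in amplitude:
measure_of_difference = np.sum(u_eps != u_bar) * dt
amplitude_of_difference = np.abs(u_eps - u_bar).max()
print(measure_of_difference, amplitude_of_difference)
```

This is the key design choice of the technique: since U has no linear structure, smallness is measured by the size ε of E_ε, while the perturbation may be of order one in amplitude, which is what forces the second adjoint equation into the maximum principle.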
If (x̄(·), ū(·)) is a given optimal pair, let (x^ε(·), u^ε(·)) satisfy the state equation (3) with the perturbed control. Following the notation of Yong and Zhou [18], we will denote by δϕ(t) the quantity ϕ(t, x̄(t), u^ε(t)) − ϕ(t, x̄(t), ū(t)), for a generic function ϕ, and by y^ε(·), z^ε(·) the solutions of the following SDEs:

dy^ε(t) = D_x b(t)y^ε(t)dt + Σ_{j=1}^d [D_x σ^j(t)y^ε(t) + δσ^j(t)χ_{E_ε}(t)]dW_j(t), y^ε(0) = 0, (18)

and

dz^ε(t) = [D_x b(t)z^ε(t) + (1/2)D_x^2 b(t)(y^ε(t), y^ε(t)) + δb(t)χ_{E_ε}(t)]dt + Σ_{j=1}^d [D_x σ^j(t)z^ε(t) + (1/2)D_x^2 σ^j(t)(y^ε(t), y^ε(t)) + δD_x σ^j(t)y^ε(t)χ_{E_ε}(t)]dW_j(t), z^ε(0) = 0, (19)

where D_x b(t) := D_x b(t, x̄(t), ū(t)) and the D_x σ^j(t) := D_x σ^j(t, x̄(t), ū(t)) have values in R^{n×n} for 1 ≤ j ≤ d; we also set ξ^ε(t) := x^ε(t) − x̄(t). Here we want to obtain an a priori estimate for a general linear SDE with stochastic coefficients, in the spirit of Lemma 4.2 of [18], which will be useful in the sequel.
Lemma 4.1. Consider the linear SDE

dy(t) = [A(t)y(t) + α(t)]dt + Σ_{j=1}^d [B_j(t)y(t) + β_j(t)]dW_j(t), y(0) = 0,

where A, B_j : [0, T] × Ω → R^{n×n} and α, β_j : [0, T] × Ω → R^n are (F_t)-progressive. Moreover, suppose that the following conditions hold: there exist c ∈ R, L ≥ 0, k ≥ 1 such that

1. ⟨A(t)y, y⟩ ≤ c|y|^2 for all y ∈ R^n, P-a.s.;
2. |B_j(t)| ≤ L, P-a.s., 1 ≤ j ≤ d;
3. E(∫_0^T |α(t)|dt)^{2k} + Σ_{j=1}^d E(∫_0^T |β_j(t)|^2 dt)^k < ∞.

Then the following a priori estimate holds:

E sup_{t∈[0,T]} |y(t)|^{2k} ≤ C [ E(∫_0^T |α(t)|dt)^{2k} + Σ_{j=1}^d E(∫_0^T |β_j(t)|^2 dt)^k ].

Proof. We exhibit the proof for α and β bounded; the general case follows using the usual approximation argument. We begin by applying Itô's formula to f(y) = |y|^p, with p ≥ 4, and we set p = 2k. The case p ∈ [1, 4) follows from the Hölder inequality.
Then, using the Young inequality twice and the Gronwall inequality, we obtain a first bound. Now, in order to obtain the required estimate under condition 3, we proceed as in [18] and end up with the required result.
Concerning the well-posedness of the stochastic differential equations (18) and (19), the key fact is that in the drift term of both equations we have a dissipativity condition like (7). This enables us to state the following.

Proposition 4.2. Under the hypotheses (H1)-(H5) the stochastic differential equations (18) and (19) admit a unique solution y^ε, z^ε ∈ C([0, T]; L^2(Ω; R^n)).
Proof. The proof is standard and will only be sketched. Let us begin with the first variation equation. As in the proof of Proposition 2.1, we consider for simplicity the case where the SDE is not controlled, so that we reduce to an equation with additive perturbation. For the controlled one, the only difference is that it remains to check that the term δσχ_{E_ε} is integrable, but this is obvious thanks to the growth condition on σ. Now, we fix γ ∈ C([0, T], L^2(Ω, R^n)) and we define the map J as before. Using the boundedness of D_x σ(t), the dissipativity of D_x b(t) and the a priori estimate given in Lemma 4.1, we have again that the map J is a contraction in C([0, T_0], L^2(Ω, R^n)); hence, with the same arguments as before, we obtain existence and uniqueness of a solution. The second variation equation is treated in the same way.
Before exhibiting an expansion of the state with respect to small perturbations of the control process, we provide a Taylor formula in the form of a lemma.

Lemma 4.3. If g ∈ C^2(R^n) then a second-order Taylor equality with integral remainder holds for every x, x̄ ∈ R^n.

The central result of this section is the following proposition, whose estimates hold for k = 1, 2, . . .
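The equality in Lemma 4.3 is the standard second-order Taylor formula with integral remainder; the display below is our reconstruction of the form used, e.g., in [18]:

```latex
g(x) = g(\bar x) + \langle D g(\bar x),\, x - \bar x \rangle
     + \int_0^1 (1-\theta)\,
       \big\langle D^2 g\big(\bar x + \theta (x - \bar x)\big)(x - \bar x),\,
       x - \bar x \big\rangle \, d\theta ,
\qquad x, \bar x \in \mathbb{R}^n .
```

Applied with x = x^ε and x̄ the optimal trajectory, it produces exactly the integral coefficients appearing in the linear SDEs of the proof below.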
Proof. The idea of the proof is to reduce every equation to a linear SDE with appropriate coefficients and to use Lemma 4.1.
Next we apply Lemma 4.1, noting that condition 1 of Lemma 4.1 is verified. Then we obtain the desired estimate thanks to the polynomial growth and (8).
(ii) Using the dissipativity of D_x b and Lemma 4.1, the estimate for y^ε follows in the same way.
(iii) For z^ε we have, proceeding as before, an analogous bound, where we used the Hölder inequality, the estimate obtained in (ii) for y^ε, and a direct estimate of the perturbation terms. (iv) Using the results obtained for ξ^ε and y^ε, we can write the difference as the solution of a linear SDE with suitable coefficients α^ε(·), β^ε(·). We begin by studying α^ε(·): we use the same estimate as in (21) for the first part, and for the second term we apply the Hölder inequality. It then remains to estimate the last term. Here the second term has the same control process in both the integrands. Hence, using a Taylor expansion and the Hölder inequality, we get a bound in which the last inequality follows from point (i), the polynomial growth of D_x^2 b and (8), and we can conclude. Regarding the estimate of β^ε(t), we can proceed in the same way. (v) Also in this case we aim to use Lemma 4.1, combined with Lemma 4.3, in order to get the required estimate. If we write dζ^ε(t) = d(η^ε(t) − ξ^ε(t)), then the corresponding stochastic differential equation is linear, with coefficients of the form ∫_0^1 D_x b(t, x̄(t) + θ(x^ε(t) − x̄(t)), u^ε(t))dθ. Indeed, using the equality in Lemma 4.3 for the drift part, and proceeding in the same way for the diffusion term, we reduce to the setting of Lemma 4.1. Now we estimate α^ε(·): here we use the polynomial growth of b, D_x b, D_x^2 b as well as the a priori estimate (8) for the solution of the state equation. Let us focus on the second term: we want to show that the corresponding quantity tends to zero as ε → 0. Arguing by contradiction, we suppose that there exists a sequence ε_n → 0 along which it stays bounded away from zero; but from point (i) we have that sup_t |ξ^{ε_n}(t)|_{L^{4k}(Ω)} → 0, hence there is a subsequence ε_{n_k} such that ξ^{ε_{n_k}} → 0, that is, x^{ε_{n_k}} → x̄ dP × dt-a.s. Now, using the dominated convergence theorem (recall that D_x^2 b has polynomial growth) and the continuity of D_x^2 b, we reach a contradiction. For β^ε(t) we use the boundedness of the derivatives of σ(·) and, proceeding in the same way, we obtain the analogous estimate. Then, thanks to Lemma 4.1, the desired result follows.
Focusing on the cost functional, we now deduce a Taylor expansion of the cost with respect to the spike variation of the control process, in order to use a duality argument.
Proof. Using Lemma 4.3, Proposition 4.4 and the polynomial growth of f and h, this is a straightforward calculation (see for example [18], Theorem 4.4, page 133).

Proof of Theorem 3.1
In the preceding section we studied how the optimal trajectory varies after a small perturbation of the control process. The goal was to obtain an expansion of the cost and produce a preliminary necessary condition for a given optimal pair; what we obtained is the expansion of Proposition 4.5. Now we are in a position to conclude the proof of the SMP.
Proof of Theorem 3.1. Using the results obtained in the previous section, we are in a position to conclude the proof of the theorem as in [18] for the classical setting. For the sake of completeness we give an outline of it. Using Itô's formula to compute d⟨p(t), y^ε(t)⟩ and d⟨p(t), z^ε(t)⟩, it is easy to derive the duality equalities (27) and (28); hence, recalling that p(T) = −D_x h(x̄(T)) and adding (27) and (28), we get an expression for the first order terms of the expansion. Thanks to the optimality of ū(t), substituting this term in the expression of the cost given in Proposition 4.5, we obtain a variational inequality. Introducing another matrix valued process Y^ε(t) := y^ε(t)y^ε(t)^T, in order to get rid of the second order terms in y^ε(t), we can express everything through the Hamiltonian H(t, x̄(t), ū(t), p(t), q(t)). Then, using the duality relation of [18], Lemma 4.6, page 137, we eventually get the estimate (30). Finally, from the expression (30) we obtain that

H(t, x̄(t), u, p(t), q(t)) − H(t, x̄(t), ū(t), p(t), q(t)) + (1/2)Tr(δσ(t)^T P(t)δσ(t)) ≤ 0, for all u ∈ U, a.e. t ∈ [0, T], P-a.s.,

where δσ(t) := σ(t, x̄(t), u) − σ(t, x̄(t), ū(t)). If we rewrite this in terms of H we get the result.

The Convex Case
As we mentioned at the beginning of Section 2, here we discuss the case where the controls take values in a closed convex subset U of R^n. In the following we obtain a version of the SMP using the convexity of U, and later we derive a sufficient condition of optimality.
Remark. In this section the second derivatives with respect to x are no longer used. So, from now on, when we refer to hypotheses (H2)-(H5) we will assume that all the maps involved are only C^1 with respect to x, in contrast with the previous sections. It is worth noting that we still have a polynomial growth condition on the first derivatives.

Necessary conditions
The convexity assumption allows us to use a perturbation argument instead of a spike variation technique, avoiding the introduction of the second adjoint equation. On the other hand, in order to treat this case we have to make another assumption: HYPOTHESIS (H6) The control domain U is a convex subset of R n . If ϕ = b, σ, f , the maps u → ϕ(t, x, u) are C 1 (U ) and their derivatives satisfy a polynomial growth such as |D u ϕ(t, x, u)| ≤ C(1 + |x| k ), for some k ∈ N.
We have to prove that J(·), considered as a functional on L^1_F(0, T), is Gâteaux differentiable. Then we will write

⟨J′(ū), u(·) − ū(·)⟩ ≥ 0, for all u(·) ∈ U[0, T],

and we will deduce a form of the SMP. If we define a new process y(t) as the solution of the stochastic differential equation

dy(t) = [D_x b(t)y(t) + D_u b(t)u(t)]dt + Σ_{j=1}^d [D_x σ^j(t)y(t) + D_u σ^j(t)u(t)]dW_j(t), y(0) = 0,

we can state the following.

Lemma 6.1. The functional J(·) is Gâteaux differentiable; moreover, the derivative in the direction u(·) has the form

(d/dθ) J(ū(·) + θu(·))|_{θ=0} = E ∫_0^T (⟨D_x f(t), y(t)⟩ + ⟨D_u f(t), u(t)⟩)dt + E⟨D_x h(x̄(T)), y(T)⟩.

Proof. We denote by x^θ the trajectory corresponding to the perturbed control ū(·) + θu(·) and set x̃^θ(t) := (x^θ(t) − x̄(t))/θ − y(t). The idea of the proof is to show that |x̃^θ(t)|^2_{L^2(Ω)} → 0 when θ → 0. In fact, this is crucial in order to show that the difference quotient of the cost converges to the expression above. We start by writing the equation for x̃^θ(t), with x̃^θ(0) = 0 as initial condition. Using the same technique as in the spike variation case, we get the equation

dx̃^θ(t) = ∫_0^1 [D_x b(t, x̄(t) + λθ(y(t) + x̃^θ(t)), ū(t) + λθu(t))(y(t) + x̃^θ(t)) − D_x b(t)y(t)]dλdt + Σ_{j=1}^d ∫_0^1 [D_x σ^j(t, x̄(t) + λθ(y(t) + x̃^θ(t)), ū(t) + λθu(t))(y(t) + x̃^θ(t)) − D_x σ^j(t)y(t)]dλdW_j(t) + ∫_0^1 [D_u b(t, x̄(t) + λθ(y(t) + x̃^θ(t)), ū(t) + λθu(t)) − D_u b(t)]u(t)dλdt + ∫_0^1 [D_u σ(t, x̄(t) + λθ(y(t) + x̃^θ(t)), ū(t) + λθu(t)) − D_u σ(t)]u(t)dλdW(t).
Applying Itô's formula to the function x̃^θ → |x̃^θ|^2 and taking expectations, we get a bound in which we used the Hölder inequality. Passing to the limit as θ → 0 we can conclude. Regarding (ii), the result follows in a similar way.
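The Gâteaux derivative of Lemma 6.1 can be illustrated on a deterministic toy functional: we perturb ū along an admissible direction and compare the difference quotient of J with the expected first-order term. Everything below, including the choice J(u) = ∫_0^1 u(t)^2 dt and the two controls, is an illustrative stand-in, not the paper's cost functional.

```python
import numpy as np

n = 1000
dt = 1.0 / n
t = np.linspace(0.0, 1.0, n, endpoint=False)

def J(u):
    # toy cost functional J(u) = \int_0^1 u(t)^2 dt (illustrative choice)
    return np.sum(u ** 2) * dt

u_bar = np.sin(2 * np.pi * t)   # stand-in for the candidate optimal control
u = np.cos(2 * np.pi * t)       # another admissible control
v = u - u_bar                   # for convex U, u_bar + theta*v stays in U
                                # for every theta in [0, 1]

theta = 1e-6
quotient = (J(u_bar + theta * v) - J(u_bar)) / theta
# analytic Gateaux derivative <J'(u_bar), v> = 2 \int u_bar v dt
gateaux = 2.0 * np.sum(u_bar * v) * dt

print(quotient, gateaux)
```

The quotient matches the analytic derivative up to O(θ), which is the finite-dimensional shadow of the convergence |x̃^θ|^2 → 0 proved above; the optimality condition ⟨J′(ū), u − ū⟩ ≥ 0 then follows by letting θ → 0 along admissible directions.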
Now we can state the maximum principle also in this particular case, where controls assume their values in a convex subset.
Proof. The existence and uniqueness of a solution to the BSDE (11) is guaranteed by Theorem 2.2. Moreover, thanks to (31) and Lemma 6.1, we are able to compute

E[d⟨p(t), y(t)⟩] = E[⟨D_x f(t), y(t)⟩ + ⟨p(t), D_u b(t)u(t)⟩ + Σ_{j=1}^d ⟨q_j(t), D_u σ^j(t)u(t)⟩]dt.

Sufficient conditions
Here we want to remark that also in our framework it is possible to derive a sufficient condition of optimality for a pair (x̄, ū). In particular, unlike in the previous subsection, it is not necessary to require differentiability of the coefficients with respect to the control. Indeed, only a locally Lipschitz assumption is needed, along with some simple properties of Clarke's generalized gradient.
HYPOTHESIS (H7) The control domain U is a convex subset of R n . If φ = b, σ, f , the maps u → φ(t, x, u) are locally Lipschitz in u and their derivatives with respect to x, i.e. D x φ(t, x, u), are continuous in (x, u).
Proof. The key fact of the proof (see [18], Lemma 5.1, page 138) is to show that ∂_u H(t, x̄(t), ū(t), p(t), q(t)) = ∂_u H(t, x̄(t), ū(t)), where ∂_u H is Clarke's generalized gradient of the Hamiltonian. Then the proof proceeds exactly as in [18] (pages 139-140), noting that the first adjoint equation (11) is well posed.