A Stochastic Maximum Principle for Forward-backward Stochastic Control Systems with Quadratic Generators and Sample-wise Constraints

This paper examines the stochastic maximum principle (SMP) for a forward-backward stochastic control system in which the backward state equation is a backward stochastic differential equation (BSDE) with quadratic growth and the forward state at the terminal time is constrained to lie in a convex set with probability one. With the help of the theory of BSDEs with quadratic growth and of bounded mean oscillation (BMO) martingales, we employ the terminal perturbation approach and Ekeland's variational principle to obtain a dynamic stochastic maximum principle. The main result has a wide range of applications in mathematical finance; as an example, we investigate a robust recursive utility maximization problem with bankruptcy prohibition.


Introduction
The class of backward stochastic differential equations (BSDEs) whose generators have quadratic growth in the state variable z has attracted much attention in the past two decades. Besides the increasingly developed and enriched existence and uniqueness theory [1,7,9,19,25,34,38], BSDEs with quadratic growth have found applications in stochastic control and mathematical finance, such as stochastic linear-quadratic control with random coefficients [4] and utility maximization problems [8,17].
In this paper, motivated especially by applications to risk-sensitive optimal control problems [12,26,29], the robust portfolio-consumption optimization model under model uncertainty [32], and the related recursive utility maximization problems [5] with bankruptcy prohibition, we consider the following stochastic recursive optimal control problem involving BSDEs with quadratic growth and sample-wise state constraints (a sample-wise constraint requires that the state at a certain time, or at all times, belong to a prescribed set with probability 1). Denote by U[0, T] the set of all admissible controls. The cost functional is of the following mixed initial-terminal type (see [39]), where h is any given smooth function and X^u(·), Y^u(·) are the solutions to the controlled forward-backward stochastic differential equation (FBSDE, see [18,27,28]), where W is a standard Brownian motion, the coefficients b, σ, f, Φ are deterministic, measurable functions of suitable dimensions, and f has quadratic growth in z. The objective is to find ū(·) ∈ U[0, T] (if it exists) such that the optimality relation below holds, and the terminal state X^u_T of the stochastic differential equation (SDE) in (1.2) is required to take values in a given convex set K ⊆ R^n (n ∈ N_+) with probability one.

On the one hand, when K = R^n, h(x, y) = y, and f(t, x, y, z, u) = (γ/2)|z|² + g(t, x, u) with some γ > 0 and a measurable function g, if the coefficients admit enough integrability, then (1.2)-(1.3) is closely related to the classical risk-sensitive control problems [12,26] via the exponential transformation and Itô's formula. Moon [29] later studied the generalized case where g depends on (y, z) through the dynamic programming approach. On the other hand, under the setting of Brownian filtration and for any given u(·) ∈ U[0, T], when n = 1, K = [0, +∞), h(x, y) = −y, f(t, x, y, z, u) = U(u) − βy − (1/2θ)|z|², and U[0, T] represents the set of consumption-portfolio strategies u(·) feasible for the initial wealth x_0 ≥ 0, it follows from the main result in [32] that −J(u(·)) is optimal for the minimization part of the sup-inf problem proposed in [5], thanks to the method of dual representation (see also [Quenez03]), where θ is the risk-averse parameter and U is the utility function. Furthermore, this means that the objective (1.3) is equivalent to the maximization part of the sup-inf problem in [5].
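To make the link to risk-sensitive control explicit, the exponential transformation mentioned above can be sketched as follows; this is the standard computation for the generator f(t, x, y, z, u) = (γ/2)|z|² + g(t, x, u), written out here for the reader's convenience rather than taken verbatim from the paper.

```latex
% Quadratic BSDE:  dY_t = -\big(\tfrac{\gamma}{2}|Z_t|^2 + g(t, X_t, u_t)\big)\,dt + Z_t\,dW_t .
% Set P_t := e^{\gamma Y_t}. By It\^o's formula,
dP_t = \gamma P_t\,dY_t + \tfrac{\gamma^2}{2}\,P_t\,|Z_t|^2\,dt
     = -\gamma\,P_t\,g(t, X_t, u_t)\,dt + \gamma\,P_t\,Z_t\,dW_t .
% The quadratic term in Z cancels, so P solves a BSDE that is linear in P,
% which is precisely the bridge to the classical risk-sensitive control problem.
```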
Sample-wise constraints of the above type are more the rule than the exception in practice, appearing, for example, in the continuous-time mean-variance portfolio selection problem [2] and in recursive utility maximization problems [14] with bankruptcy prohibition. Another important example is the study of the Neyman-Pearson lemma for hypothesis tests under a class of nonlinear probability measures (g-probabilities) [22], where the setting K = [0, 1] serves as a criterion to exclude tests whose g-probability of Type I error exceeds the given acceptable significance level.
Based on the above motivation, we aim to derive the necessary condition for optimality, namely the stochastic maximum principle (SMP), for problem (1.2)-(1.3). Since Peng [30] obtained the general SMP for classical stochastic control systems, researchers have made progress on the SMP for coupled forward-backward stochastic control systems driven by FBSDEs (see [16,31,37,39] and references therein). Problem (1.2)-(1.3) is well studied [20,21] when f is globally Lipschitz continuous in (x, y, z), and this has been generalized to the fully coupled case [35] and to the mean-field case [36]. In the existing literature, there are two main approaches to obtaining the SMP: one is the pure BSDE approach [14] and the other is based on Ekeland's variational principle [20,21,35]. In particular, adopting the BSDE method, the authors in [5] establish a comparison theorem for specific BSDEs with quadratic growth and derive a dynamic SMP in the semimartingale context. However, that comparison theorem may fail here, since we do not require f to have special structures or convexity/concavity in z; we therefore resort to Ekeland's variational principle to achieve this goal under our framework.
We encountered three difficulties in deducing the SMP for (1.2)-(1.3). The first is that the BSDE in (1.2) is no longer Lipschitz but has quadratic growth in z, which makes the derivative f_z unbounded. The unboundedness of f_z causes considerable trouble in obtaining the following estimate (stated, for example, when f depends only on z) for some p > 1, where Z(·) denotes the optimal trajectory and Z^ε(·) denotes the state trajectory after perturbation; when f is Lipschitz in z, one can deduce (1.4) by the dominated convergence theorem. The second is that, when the family of approximate controls produced by Ekeland's variational principle converges to the optimal one, in which space can we obtain the convergence of the solutions of the approximate variational equations to the solution of the original variational equation? In the classical case, this issue is resolved by the continuous dependence of solutions of Lipschitz BSDEs on their parameters, but under our framework it entails estimating the difference between the solutions of two different linear BSDEs with unbounded coefficients. Moreover, to this end, we first need to ensure that the approximate state trajectories converge to the optimal one, which essentially involves the convergence of solutions of a family of quadratic BSDEs. The third is that the adjoint equation is a linear SDE with unbounded coefficients, again due to the unboundedness of f_z. When we deduce the SMP, its solution appears as a component of the integrands of stochastic integrals with respect to the Brownian motion (see (3.24)-(3.25)). Such stochastic integrals are a priori only local martingales, whose expectation at the terminal time T may not exist, so we must check that all these stochastic integrals are true martingales on [0, T] before taking expectations.
To overcome the aforementioned difficulties, we deduce the desired convergence (1.4) by applying the energy-type inequality for bounded mean oscillation (BMO) martingales together with the generalized dominated convergence theorem. Using the estimate in [6] for linear BSDEs with stochastic Lipschitz coefficients involving BMO martingales, the convergence of both the approximate state trajectories and the approximate variational equations is obtained, the former convergence being stronger than the latter.
To tackle the last difficulty, we note that the solution of the adjoint equation is the Doléans-Dade exponential of a certain BMO martingale, which satisfies the reverse Hölder inequality (R_p) as long as p ∈ [1, p̄) for some p̄ > 1 (see [24], Chapter 3, Definition 3.1). On the other hand, the most complicated term to estimate, among the integrands of those stochastic integrals, is a product involving the solutions of the variational equation and of the adjoint equation. Based on this observation, we choose a proper p ∈ (1, p̄) together with its conjugate p*, such that the (R_p) condition holds and the solution of the variational equation admits a p*-th moment.
Then we apply Hölder's inequality with the pair (p, p*) to the square root of the quadratic variation of the stochastic integral to verify that it is a true martingale.
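Schematically, the martingale verification described above runs as follows, with p(·) the adjoint process, η a generic integrand standing in for the product terms in (3.24)-(3.25), and (p, p*) the conjugate pair just chosen; this is a sketch of the standard argument, not the paper's exact display.

```latex
\mathbb{E}\Big[\Big\langle \textstyle\int_0^{\cdot} p_s\,\eta_s\,dW_s \Big\rangle_T^{1/2}\Big]
 \;\le\; \mathbb{E}\Big[\sup_{t \le T}|p_t|\,\Big(\textstyle\int_0^T |\eta_s|^2\,ds\Big)^{1/2}\Big]
 \;\le\; \Big(\mathbb{E}\sup_{t \le T}|p_t|^{p}\Big)^{1/p}
         \Big(\mathbb{E}\Big(\textstyle\int_0^T |\eta_s|^2\,ds\Big)^{p^*/2}\Big)^{1/p^*} \;<\; \infty .
```

A local martingale whose square-root quadratic variation is integrable is a true martingale by the Burkholder-Davis-Gundy inequality.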
The rest of the paper is organized as follows. In section 2, preliminaries and the formulation of our problem are given. We use a pure backward formulation of (1.2) in which the terminal state X^u_T is regarded as the control variable. Unlike the formulation in [20,22], the reformulated set of admissible controls no longer contains all square-integrable random variables but only random variables with higher-order integrability, because of the quadratic growth in z. In section 3, employing the analysis of BMO martingales, we establish the well-posedness of both the variational equation and the adjoint equation. Then, applying Ekeland's variational principle, we obtain a dynamic SMP that characterizes the optimal terminal state. In section 4, to illustrate the established SMP, we study its application to a robust recursive utility maximization problem with bankruptcy prohibition.

Preliminaries and problem formulation
Let T ∈ (0, +∞), n, d ∈ N_+, and let (Ω, F, P) be a complete probability space on which a standard d-dimensional Brownian motion W = (W_t)_{t∈[0,T]} is defined. For any given p, q ≥ 1, we introduce the following spaces and notation.
In particular, we denote by M p F ([0, T ]; R n ) the above space when where λ denotes the Lebesgue measure on [0, T ].
BMO_p: the space of real-valued, continuous F-martingales M with M_0 = 0 whose BMO_p norm is finite, where the supremum in that norm is taken over all stopping times τ with values in [0, T]. By Corollary 2.1 in [24], M is a BMO_p martingale if and only if it is a BMO_q martingale for every q ≥ 1. Therefore, we simply write BMO instead of BMO_p.
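For reference, the BMO_p norm in question is the standard one:

```latex
\|M\|_{\mathrm{BMO}_p} \;:=\; \sup_{\tau}\,\Big\|\,\mathbb{E}\big[\,|M_T - M_\tau|^p \,\big|\, \mathcal{F}_\tau\,\big]^{1/p}\,\Big\|_{\infty} \;<\; \infty ,
```

the supremum being taken over all stopping times τ with values in [0, T].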
where M ∈ BMO and p̄_M is the positive constant defined by the following function:

Classical formulation
Let p* > 1 be a number that will be determined later. Consider the following forward-backward stochastic control system: minimize the cost functional over the set of admissible controls, subject to the controlled FBSDE and an additional convex constraint, where the coefficients are deterministic, measurable functions. We impose the following assumptions on the coefficients of (2.5).
Assumption 2.1. (i) b, σ, f, h, Φ are continuous in their arguments; Φ is continuously differentiable; b, σ are continuously differentiable in (x, u); f is continuously differentiable in (x, y, z, u); h is continuously differentiable in (x, y).
T, and let p̄ be the constant such that the following holds, where the function Ψ(·) is defined by (2.2). We assign the value p̄(p̄ − 1)^{-1} to p*; obviously, p* is the conjugate exponent of p̄.

Backward formulation
In this subsection, we give an equivalent backward formulation of the stochastic optimal control problem (2.4)-(2.5). To do so we need an additional assumption.

Assumption 2.2. There exists α > 0 such that

Note that Assumptions 2.1 and 2.2 imply that the mapping u ↦ σ(t, x, u) is a bijection from R^{n×d} onto itself for any (t, x). Therefore, let θ = σ(t, x, u) and denote the inverse mapping by u = σ̃(t, x, θ). Then system (2.5) can be rewritten as below, where l(t, x, θ) = −b(t, x, σ̃(t, x, θ)) and g(t, x, y, z, θ) = f(t, x, y, z, σ̃(t, x, θ)). Since u ↦ σ(t, x, u) is a bijection, we may regard θ(·) as the control variable. By the well-posedness of BSDEs with Lipschitz generators, selecting θ(·) is equivalent to selecting the terminal state X_T. We thus obtain the following purely backward control system, where ξ is the control variable to be chosen from the set below. The equivalent cost functional is given next. Thus, the original problem is equivalent to minimizing J(ξ) over U_ad, subject to the controlled system (2.8) and the initial constraint X^ξ_0 = x_0.
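In schematic form, the purely backward system obtained above reads as follows; this is a sketch consistent with the definitions of l and g (the precise display is (2.8)):

```latex
X_t^{\xi} \;=\; \xi + \int_t^T l\big(s, X_s^{\xi}, \theta_s\big)\,ds - \int_t^T \theta_s\,dW_s , \qquad
Y_t^{\xi} \;=\; \Phi(\xi) + \int_t^T g\big(s, X_s^{\xi}, Y_s^{\xi}, Z_s^{\xi}, \theta_s\big)\,ds - \int_t^T Z_s^{\xi}\,dW_s .
```

Given ξ, the pair (X^ξ, θ) is determined as the solution of the first (backward) equation, and the original initial condition becomes the constraint X^ξ_0 = x_0.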
Remark 2.3. According to the definitions of l and g and Assumption 2.1, one can verify that l and g satisfy conditions similar to those in Assumption 2.1.
From the existence result (Proposition 3) in [7] and the uniqueness result (Lemma 2.1) in [19], we have the following.

Theorem 2.4. Let Assumptions 2.1 and 2.2 hold. Then, for any ξ ∈ U_ad, (2.8) admits a unique solution. Furthermore, we have the following estimate, where C depends on T, p*, ∥l_x∥_∞, ∥l_u∥_∞.
Corollary 2.5. From the energy-type inequality ([24], page 26) and the second inequality in (2.10), applying Hölder's inequality yields

Stochastic maximum principle

In this section, applying Ekeland's variational principle, we derive the stochastic maximum principle for the optimization problem (2.8)-(2.9). The proposition below, which will be used frequently, follows from Corollary 9 and Theorem 10 in [6].
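For the reader's convenience, the energy-type inequality invoked in Corollary 2.5 reads, in its standard form (see [24]): for a BMO martingale M and every n ∈ N_+,

```latex
\mathbb{E}\big[\langle M \rangle_T^{\,n}\big] \;\le\; n!\,\|M\|_{\mathrm{BMO}_2}^{2n} .
```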

Variational equation
In this subsection, the constant C may change from line to line in the proofs.
Using the convexity of U_ad and taking an arbitrary ξ ∈ U_ad, we know that, for ε ∈ [0, 1], the perturbed control ξ^ε := ξ̄ + ε(ξ − ξ̄) belongs to U_ad. Let (X^ε(·), Y^ε(·), Z^ε(·)) be the state trajectory of (2.8) associated with ξ^ε. To derive the first-order necessary condition for small ε, we consider the following two BSDEs, where l_θ(t) := l_θ(t, X̄_t, θ̄_t), and similarly for the other coefficients evaluated along the optimal trajectory. From the a priori estimate for BSDEs ([13], Proposition 5.1), we can obtain the following estimates by arguing as in [20].
Lemma 3.3. Let Assumptions 2.1 and 2.2 hold. Then, for any β ∈ (1, 4p*], we have

Then we have

where g^ε_z(t) =

Proof. Under Assumptions 2.1 and 2.2, we get

where the constant C is independent of ε. For the second term in the last inequality of (3.9), by (3.5) and Corollary 2.5, it follows from Hölder's inequality that

Similarly, for the third term in the last inequality of (3.9), we get

Consequently, we have

where C is independent of ε. The proof is complete.
Lemma 3.7. Let Assumptions 2.1 and 2.2 hold. Then there exists a unique solution

Proof. Under Assumption 2.1, we get

where

We estimate only the most difficult terms in (3.12); the other terms are handled similarly.

Variational inequality
In this subsection, we employ Ekeland's variational principle [11] to deal with the initial constraint X^ξ_0 = x_0. Given the optimal ξ̄, we introduce a mapping J_δ : U_ad → R by

where x_0 is the given initial state and δ is an arbitrary positive constant. Let us check that J_δ(ξ_m) → J_δ(ξ) as m → ∞ for any {ξ_m}_{m∈N_+}, ξ in U_ad; for this we need only show that X^ξ_0 and Y^ξ_0 are continuous on U_ad. To do this, for any given ξ_1, ξ_2 ∈ U_ad, let (X_1(·), Y_1(·), Z_1(·)) and (X_2(·), Y_2(·), Z_2(·)) be the corresponding state trajectories, satisfying (2.8). Then, by Proposition 5.1 in [13], we have

which implies the continuity of X^ξ_0, where C is a positive constant depending on T, L, p*. On the other hand, for t ∈ [0, T], note that

where

λ_t = [g(t, X_1(t), Y_1(t), Z_1(t), θ_1(t)) − g(t, X_1(t), Y_2(t), Z_1(t), θ_1(t))] / (Y_1(t) − Y_2(t)),
µ_t = [g(t, X_1(t), Y_2(t), Z_1(t), θ_1(t)) − g(t, X_1(t), Y_2(t), Z_2(t), θ_1(t))] / (Z_1(t) − Z_2(t)).

Under Assumptions 2.1 and 2.2, one can verify that λ(·) is bounded and

where C is a positive constant depending on T, L, ∥g_y∥_∞, p*, β, A. Recall that Φ is Lipschitz continuous and observe that

Using (3.15), similarly to the proof of (3.9), we obtain

which implies the continuity of Y^ξ_0. The proof is complete.
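For completeness, we recall the version of Ekeland's variational principle [11] used in this type of argument. Let (V, d) be a complete metric space and F : V → R ∪ {+∞} be lower semicontinuous and bounded from below, and suppose F(ξ_0) ≤ inf_V F + δ for some δ > 0. Then there exists ξ_δ ∈ V such that

```latex
F(\xi_\delta) \le F(\xi_0), \qquad d(\xi_\delta, \xi_0) \le \sqrt{\delta}, \qquad
F(\xi) \;\ge\; F(\xi_\delta) - \sqrt{\delta}\, d(\xi, \xi_\delta) \quad \text{for all } \xi \in V .
```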

Maximum principle
In this subsection, we derive the stochastic maximum principle. To this end, we introduce the adjoint process (p(·), q(·)) associated with the optimal admissible control ξ̄ for (2.8)-(2.9), which solves the following adjoint system (see [33]).

The equation for q(·) is a linear SDE, so it admits a unique strong solution q(·) up to an evanescent set. Moreover, set

q̃_t = a_0 h_y(ξ̄, Ȳ_0) exp

Then, noting that q(·) is continuous and applying Itô's formula to q̃_t on [0, T], one can verify that q(·) = q̃(·), P-a.s. On the other hand, for any β ∈ (1, p̄), recalling Remark 3.4,

The right-hand side of the above inequality is finite since we can show that, for any given β ∈ (1, p̄) and any β_0 ∈ (β, p̄), by Corollary 2.5 and Hölder's inequality,

where C > 0 depends on L, T, ∥g_y∥_∞, A, a_0, β, β_0. The other term in that inequality can be estimated similarly. The proof is complete.
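For reference, the Doléans-Dade exponential structure behind q(·) is the generic solution formula for a linear SDE; a, b below are generic coefficients (in the adjoint system they come from the derivatives of g along the optimal trajectory):

```latex
dq_t = q_t\,\big(a_t\,dt + b_t\,dW_t\big), \qquad
q_t = q_0 \exp\Big(\int_0^t a_s\,ds\Big)\,\mathcal{E}\Big(\int_0^{\cdot} b_s\,dW_s\Big)_t ,
\quad \text{where } \mathcal{E}(M)_t := \exp\big(M_t - \tfrac{1}{2}\langle M\rangle_t\big).
```

When b • W is a BMO martingale, its stochastic exponential satisfies the reverse Hölder inequality (R_p) for p ∈ [1, p̄), which is exactly what the moment estimates here exploit.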

An application to a robust recursive utility maximization problem with bankruptcy prohibition
There are d + 1 investment instruments in the market. One of them is a bank account (risk-free); the others are stocks. The price processes are described by the following equations, and the investor's wealth process X^π(·) satisfies

dX^π_t = [(r_t X^π_t + π_t^⊺ σ_t φ_t) − c(X^π_t)] dt + π_t^⊺ σ_t dW_t, t ∈ [0, T],

where the consumption function c is nonnegative and continuously differentiable. The recursive utility of the investor's wealth X^π(·) is described by the following BSDE, where g and Φ satisfy Assumption 2.1.
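To make the wealth dynamics concrete, here is a minimal one-dimensional Euler-Maruyama sketch of the SDE for X^π above. All market parameters (the rate r, market price of risk φ, volatility σ, the portfolio π, and the consumption function c) are illustrative constants chosen for the example, not values from the paper.

```python
import numpy as np

def simulate_wealth(x0, r, phi, sigma, pi, c, T=1.0, n_steps=1000, rng=None):
    """Euler-Maruyama scheme for the scalar (d = 1) wealth SDE
        dX = [(r*X + pi*sigma*phi) - c(X)] dt + pi*sigma dW.
    `c` is a callable consumption function; `rng` a numpy Generator."""
    rng = np.random.default_rng(0) if rng is None else rng
    dt = T / n_steps
    x = x0
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt))          # Brownian increment over dt
        drift = (r * x + pi * sigma * phi) - c(x)  # interest + risky excess return - consumption
        x = x + drift * dt + pi * sigma * dw       # one Euler step
    return x

# With pi = 0 and c = 0 the noise and consumption vanish and the scheme
# compounds deterministically: X_N = x0 * (1 + r*dt)^N.
terminal = simulate_wealth(1.0, 0.03, 0.2, 0.15, 0.0, lambda s: 0.0, n_steps=100)
```

In this degenerate check the Euler recursion is exactly geometric compounding, which gives a simple sanity test of the discretization before adding noise or consumption.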
Our problem is that an investor chooses a portfolio π(·) so as to maximize the recursive utility Y^π_0 of his wealth X^π(·) under bankruptcy prohibition. Equivalently, we put h(x, y) = −y, since the control problem in section 3 is to minimize the cost functional; here U is a utility function that has a bounded and continuous derivative. We claim that the optimal terminal wealth can be represented as in (4.4), where

p_T = (a_1 + α ∫_0^T U′(αX̄_s) q_s Λ_s ds) Λ_T^{-1}, q_T = −a_0 exp(−βT − γ

We provide a sketch of the proof.

Case 1: a_0 > 0. In this case, we deduce that q_T < 0. Hence, from (4.3), on the one hand we have, on Ω_0, P-a.s., a_1 + α ∫_0^T U′(αX̄_s) q_s Λ_s ds > 0; on the other hand, on Ω \ Ω_0, P-a.s.,

Case 2: a_0 = 0. In this case, we deduce that q_t = 0, t ∈ [0, T], and that a_1 ≠ 0 (otherwise the pair (p, q) would be trivial). Hence, from (4.3), we have p_T ≥ 0 on Ω_0 and p_T = 0 on Ω \ Ω_0, P-a.s. But a_1 ≠ 0 implies that p_T ≠ 0, so we deduce that ξ̄ = 0, P-a.s., and a_1 > 0.
In summary, for both cases, we have (4.4).
Remark 4.1. In view of stochastic differential utility, the above example is closely related to the robust expected utility model studied in [23]. The generator g(t, x, y, z) = U(c(x)) − βy − (γ/2)|z|² can be interpreted as an intertemporal aggregator, where U(c(x)) − βy corresponds to the standard additive expected utility in continuous time, and γ > 0 is the risk-aversion parameter reflecting the robustness of the portfolio decision (see [32] for more details).
The filtration F = (F_t)_{t∈[0,T]} is the P-augmentation of the natural filtration of W. Denote by R^n the n-dimensional real Euclidean space and by R^{n×d} the set of n × d real matrices. The scalar product (resp. norm) of two n × d matrices A, B is denoted by ⟨A, B⟩ = tr{AB^⊺} (resp. |A| = √(tr{AA^⊺})), where the superscript ⊺ denotes the transpose of vectors or matrices.
where C_0 is a positive constant depending only on p and the BMO_2 norm of M, and (2.3) is called the reverse Hölder inequality.
H • W: H is an F-adapted process and H • W is the stochastic integral of H with respect to W. If H • W ∈ BMO, we simply write p̄_H for p̄_{H•W} and p*_H for p*_{H•W} without ambiguity.
both from U_ad to R, are continuous functionals on U_ad.
and using the reverse Hölder inequality, we obtain E[sup_{t∈[0,T]} |q_t|^β] ≤ C. Now let us focus on the SDE satisfied by p(·). Under Assumptions 2.1 and 2.2, since l_x and l_θ are bounded, it admits a unique strong solution p(·) and, by a standard estimate for SDEs, we get E[sup_{t∈[0,T]} |p_t|^β] ≤ C, where C > 0 depends on L, T, ∥g_y∥_∞, A, a_0, β.