A Quasi-Sure Approach to the Control of Non-Markovian Stochastic Differential Equations

We study stochastic differential equations (SDEs) whose drift and diffusion coefficients are path-dependent and controlled. We construct a value process on the canonical path space, considered simultaneously under a family of singular measures, rather than the usual family of processes indexed by the controls. This value process is characterized by a second order backward SDE, which can be seen as a non-Markovian analogue of the Hamilton-Jacobi-Bellman partial differential equation. Moreover, our value process yields a generalization of the G-expectation to the context of SDEs.


Introduction
We consider a controlled stochastic differential equation (SDE) of the form

$$X_t = x + \int_0^t \mu(r, X, \nu_r)\,dr + \int_0^t \sigma(r, X, \nu_r)\,dW_r, \qquad (1.1)$$

where ν is an adapted control process, W is a Brownian motion and the Lipschitz-continuous coefficients μ(r, X, ν_r) and σ(r, X, ν_r) may depend on the past trajectory {X_s, 0 ≤ s ≤ r} of the solution. Denoting by X^ν the solution corresponding to ν, we are interested in the stochastic optimal control problem

$$\sup_{\nu} E\big[\xi(X^\nu)\big], \qquad (1.2)$$

where ξ is a given functional. The standard approach to such a non-Markovian control problem (cf. [9,10]) is to consider for each control ν the associated value process

$$J^\nu_t = \operatorname*{ess\,sup}_{\tilde\nu\,:\;\tilde\nu = \nu \text{ on } [0,t]} E\big[\xi(X^{\tilde\nu})\,\big|\,\mathcal{F}_t\big], \qquad (1.3)$$

where (F_t) is the given filtration. The dependence on ν reflects the presence of a forward component in the optimization problem. The situation is quite different in Markovian optimal control (cf. [12]), where one uses a single value function which depends on certain state variables but not on a control. This is essential for describing the value function by a differential equation, such as the Hamilton-Jacobi-Bellman PDE, which is the main merit of the dynamic programming approach. It is worth noting that this equation is always backward in time. An analogous description of (1.3) via backward SDEs (BSDEs, cf. [20]) is available for certain popular problems such as utility maximization with power or exponential utility functions (e.g., [13,18]) or drift control (e.g., [10]). However, this relies on a very particular algebraic structure which allows for a separation of J^ν into a backward part independent of ν and a forward part depending on ν.
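For orientation, recall the shape of the dynamic programming equation in the Markovian case: if the state is finite-dimensional and v(t,x) denotes the value function, then v formally solves the fully nonlinear Hamilton-Jacobi-Bellman PDE (a standard fact, recorded here only for comparison):

```latex
-\partial_t v(t,x) - \sup_{u \in U}\Big\{ \mu(t,x,u)^{\top} D_x v(t,x)
  + \tfrac{1}{2}\,\mathrm{Tr}\big[\sigma\sigma^{\top}(t,x,u)\, D_x^{2} v(t,x)\big] \Big\} = 0,
\qquad v(T,x) = \xi(x).
```

The second order backward SDE obtained below plays the role of this equation when the whole path is taken as the state variable.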
In this paper, we consider the problem (1.2) on the canonical space by recasting it as

$$\sup_{\nu} E^{P^\nu}\big[\xi(B)\big], \qquad (1.4)$$

where P^ν is the distribution of X^ν and B is the canonical process, and we describe its dynamic value by a single value process V = {V_t(ω)}. Formally, V corresponds to a value function in the Markovian sense if we see the whole trajectory of the controlled system as a state variable. Even though (1.2) has features of coupled forward-backward type, the value process is defined in a purely backward manner: one may say that by constructing V on the whole canonical space, we essentially calculate the value for all possible outcomes of the forward part. An important ingredient in the same vein is that V is defined "quasi-surely" under the family of mutually singular measures {P^ν}. Rather than forming a family of processes as in (1.3), the necessary information is stored in a single process which is defined on a "large" part of the probability space; indeed, the process V "seen under P^ν" should be thought of as an analogue of J^ν. Clearly, this is a necessary step to obtain a (second order) backward SDE. We remark that [22] considered the same control problem (1.2) and also made a connection to nonlinear expectations. However, in [22], the value process was considered only under the given probability measure.
We first consider a fairly regular functional ξ and define V t (ω) as a conditional version of (1.4). Applying and advancing ideas from [28] and [17], regular conditional probability distributions are used to define V t (ω) for every ω and prove a pathwise dynamic programming principle (Theorem 3.2). In a second step, we enlarge the class of functionals ξ to an L 1 -type space and prove that the value process admits a (quasi-sure) càdlàg modification (Theorem 5.1).
We also show that the value process falls into the class of sublinear expectations studied in [19]. Indeed, if ξ is considered as a random variable on the canonical space, the mapping ξ → V t can be seen as a generalization of the G-expectation [23,24], which, by [7], corresponds to the case µ ≡ 0 and σ(r, X, ν r ) = ν r , where the SDE (1.1) degenerates to a stochastic integral. Moreover, V t can be seen as a variant of the random G-expectation [17]; cf. Remark 6.5.
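To fix ideas, in the one-dimensional G-expectation case just mentioned (µ ≡ 0, σ(r, X, ν_r) = ν_r), with the illustrative choice U = [σ_1, σ_2] for constants 0 < σ_1 ≤ σ_2, the value at t = 0 is the sublinear expectation associated with the function G:

```latex
V_0(\xi) = \sup_{\nu \in \mathcal{U}} E^{P_0}\Big[\, \xi\Big( \int_0^{\cdot} \nu_r \, dW_r \Big) \Big]
 = \mathcal{E}_G(\xi),
\qquad
G(\gamma) = \tfrac{1}{2} \sup_{u \in [\sigma_1, \sigma_2]} u^{2} \gamma
          = \tfrac{1}{2}\big( \sigma_2^{2}\, \gamma^{+} - \sigma_1^{2}\, \gamma^{-} \big).
```

Here the particular parameter set U = [σ_1, σ_2] is our illustrative choice and not part of the general setting.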
Finally, we characterize V by a second order backward SDE (2BSDE) in the spirit of [19]; cf. Theorem 6.4. The second order is clearly necessary since the Hamilton-Jacobi-Bellman PDE for the Markovian case is fully nonlinear, while ordinary BSDEs correspond to semilinear equations. 2BSDEs were introduced in [5], and in [27] for the non-Markovian case. We refer to [27] for the precise relation between 2BSDEs in the quasi-sure formulation and fully nonlinear parabolic PDEs.
We remark that our approach is quite different from the (backward) stochastic partial differential equations studied in [21] for a similar control problem (mainly for uncontrolled volatility) and in [14,15,2,3,4] for so-called pathwise stochastic control problems. The relation to the path-dependent PDEs, introduced very recently in [25], is yet to be explored.
The remainder of the paper is organized as follows. In Section 2 we detail the controlled SDE and its conditional versions, define the value process for the case when ξ is uniformly continuous and establish its regularity. The pathwise dynamic programming principle is proved in Section 3. In Section 4 we extend the value process to a more general class of functionals ξ and state its quasi-sure representation. The càdlàg modification is constructed in Section 5. In the concluding Section 6, we provide the Hamilton-Jacobi-Bellman 2BSDE and interpret the value process as a variant of the random G-expectation.

Construction of the Value Function
In this section, we first introduce the setting and notation. Then, we define the value function V t (ω) for a uniformly continuous reward functional ξ and examine the regularity of V t .

Notation
We fix a constant T > 0 and let Ω := C([0,T]; R^d) be the canonical space of continuous paths equipped with the uniform norm ‖ω‖_T := sup_{0≤s≤T} |ω_s|, where |·| is the Euclidean norm. We denote by B the canonical process B_t(ω) = ω_t, by P_0 the Wiener measure, and by F = {F_t}_{0≤t≤T} the (raw) filtration generated by B. Unless otherwise stated, probabilistic notions requiring a filtration (such as adaptedness) refer to F.
For any probability measure P on Ω and any (t,ω) ∈ [0,T] × Ω, we can construct the corresponding regular conditional probability distribution P^ω_t; cf. [29, Theorem 1.3.4]. We recall that P^ω_t is a probability kernel on F_t × F_T; i.e., P^ω_t is a probability measure on (Ω, F_T) for fixed ω and ω ↦ P^ω_t(A) is F_t-measurable for each A ∈ F_T. Moreover, the expectation under P^ω_t is the conditional expectation under P:

$$E^{P^\omega_t}[\xi] = E^P\big[\xi\,\big|\,\mathcal{F}_t\big](\omega) \quad \text{for } P\text{-a.e. } \omega \in \Omega,$$

whenever ξ is F_T-measurable and bounded. Finally, P^ω_t is concentrated on the set of paths that coincide with ω up to t:

$$P^\omega_t\big\{\omega' \in \Omega : \omega' = \omega \text{ on } [0,t]\big\} = 1. \qquad (2.1)$$

While P^ω_t is not defined uniquely by these properties, we choose and fix one version for each triplet (t,ω,P).
Let t ∈ [0,T]. We denote by Ω^t := {ω ∈ C([t,T]; R^d) : ω_t = 0} the shifted canonical space of paths starting at the origin. For ω ∈ Ω, the shifted path ω^t ∈ Ω^t is defined by ω^t_r := ω_r − ω_t for t ≤ r ≤ T, so that Ω^t = {ω^t : ω ∈ Ω}. Moreover, we denote by P^t_0 the Wiener measure on Ω^t and by F^t = {F^t_r}_{t≤r≤T} the (raw) filtration generated by B^t, which can be identified with the canonical process on Ω^t.
Given a path ω ∈ Ω and a path ω̃ ∈ Ω^t, their concatenation at t is the (continuous) path defined by

$$(\omega \otimes_t \tilde\omega)_r := \omega_r\,\mathbf{1}_{[0,t]}(r) + (\omega_t + \tilde\omega_r)\,\mathbf{1}_{(t,T]}(r), \qquad 0 \le r \le T.$$

Given an F_T-measurable random variable ξ on Ω and ω ∈ Ω, we define the conditioned random variable ξ^{t,ω} on Ω by

$$\xi^{t,\omega}(\tilde\omega) := \xi(\omega \otimes_t \tilde\omega^t), \qquad \tilde\omega \in \Omega.$$

Note that ξ^{t,ω}(ω̃) = ξ^{t,ω}(ω̃^t); in particular, ξ^{t,ω} can also be seen as a random variable on Ω^t. Then ω̃ ↦ ξ^{t,ω}(ω̃) is F^t_T-measurable and moreover, ξ^{t,ω} depends only on the restriction of ω to [0,t]. We note that for an F-progressively measurable process {X_r, r ∈ [s,T]}, the conditioned process {X^{t,ω}_r, r ∈ [t,T]} is F^t-progressively measurable. If P is a probability on Ω, the measure P^{t,ω} on F^t_T defined by

$$P^{t,\omega}(A) := P^\omega_t(\omega \otimes_t A), \quad A \in \mathcal{F}^t_T, \qquad \text{where } \omega \otimes_t A := \{\omega \otimes_t \tilde\omega : \tilde\omega \in A\},$$

is again a probability by (2.1). We then have

$$E^{P^{t,\omega}}\big[\xi^{t,\omega}\big] = E^{P^\omega_t}[\xi] = E^P\big[\xi\,\big|\,\mathcal{F}_t\big](\omega) \quad \text{for } P\text{-a.e. } \omega \in \Omega.$$

Analogous notation will be used when ξ is a random variable on Ω^s and ω ∈ Ω^s, where 0 ≤ s ≤ t ≤ T. We denote by Ω^s_t := {ω|_{[s,t]} : ω ∈ Ω^s} the restriction of Ω^s to [s,t], equipped with ‖ω‖_{[s,t]} := sup_{r∈[s,t]} |ω_r|. Note that Ω^s_t can be identified with {ω ∈ Ω^s : ω_r = ω_t for r ∈ [t,T]}. The meaning of Ω_t is analogous.

The Controlled SDE
Let U be a nonempty Borel subset of R^m for some m ∈ N. We consider two given functions

$$\mu : [0,T] \times \Omega \times U \to \mathbb{R}^d \quad \text{and} \quad \sigma : [0,T] \times \Omega \times U \to \mathbb{R}^{d \times d},$$

the drift and diffusion coefficients, such that (t,ω) ↦ μ(t, X(ω), ν_t(ω)) and (t,ω) ↦ σ(t, X(ω), ν_t(ω)) are progressively measurable for any continuous adapted process X and any U-valued progressively measurable process ν. In particular, μ(t,ω,u) and σ(t,ω,u) depend only on the past trajectory {ω_r, r ∈ [0,t]}, for any u ∈ U. Moreover, we assume that there exists a constant K > 0 such that

$$\big|\mu(t,\omega,u) - \mu(t,\omega',u)\big| + \big|\sigma(t,\omega,u) - \sigma(t,\omega',u)\big| \le K \sup_{0 \le r \le t} |\omega_r - \omega'_r| \qquad (2.2)$$

for all (t,ω,ω′,u) ∈ [0,T] × Ω × Ω × U. We denote by U the set of all U-valued progressively measurable processes ν such that

$$\int_0^T \big|\mu(r, X, \nu_r)\big|\,dr < \infty \quad \text{and} \quad \int_0^T \big|\sigma(r, X, \nu_r)\big|^2\,dr < \infty \qquad (2.3)$$

hold path-by-path for any continuous adapted process X. Given ν ∈ U, the stochastic differential equation

$$X_t = x + \int_0^t \mu(r, X, \nu_r)\,dr + \int_0^t \sigma(r, X, \nu_r)\,dB_r, \qquad 0 \le t \le T,$$

has a P_0-a.s. unique strong solution for any initial condition x ∈ R^d, which we denote by X(0,x,ν). We shall denote by

$$\bar{P}(0,x,\nu) := P_0 \circ X(0,x,\nu)^{-1} \qquad (2.4)$$

the distribution of X(0,x,ν) on Ω and by

$$P(0,x,\nu) := P_0 \circ \big(X(0,x,\nu) - x\big)^{-1}$$

the distribution of the solution which is translated to start at the origin. Note that P(0,x,ν) is concentrated on Ω^0 and can therefore be seen as a probability measure on Ω^0. We shall work under the following nondegeneracy condition.
Assumption 2.1. Throughout this paper, we assume that

$$\mathbb{F} \subseteq \mathbb{F}^X_{P_0}, \qquad X = X(0,x,\nu), \qquad (2.5)$$

where F^X_{P_0} is the P_0-augmentation of the filtration generated by X and (x,ν) varies over R^d × U.
One can construct situations where Assumption 2.1 fails. For example, if x = 0, σ ≡ 1 and µ(r, X, ν r ) = ν r , then (2.5) fails for a suitable choice of ν; see, e.g., [11]. The following is a positive result which covers many applications.
Remark 2.2. Let σ be strictly positive definite and assume that µ(r, X, ν r ) is a progressively measurable functional of X and σ(r, X, ν r ). Then (2.5) holds true.
Proof. Let X = X(0,x,ν). As the quadratic variation of X, the process ∫_0^· σσ^⊤(r,X,ν_r)\,dr is adapted to the filtration generated by X; since σ is positive definite, the same holds for σ(r,X,ν_r) itself. In view of our assumptions, it follows that the local martingale part M := X − x − ∫_0^· μ(r,X,ν_r)\,dr has the same property. Hence B = ∫_0^· σ(r,X,ν_r)^{-1}\,dM_r is again adapted to the filtration generated by X.

Remark 2.3. For some applications, in particular when the SDE is of geometric form, requiring (2.5) to hold for all x ∈ R^d is too strong. One can instead fix the initial condition x throughout the paper; then it suffices to require (2.5) only for that x.
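As a purely numerical illustration of the controlled dynamics introduced above (not part of the paper's argument), the following sketch simulates one path of a path-dependent controlled SDE by the Euler-Maruyama scheme. The coefficients, the feedback control and all names are hypothetical choices made only for this example.

```python
import numpy as np

def simulate_controlled_sde(mu, sigma, control, x0, T=1.0, n_steps=200, seed=0):
    """Euler-Maruyama discretization of the path-dependent controlled SDE
    dX_t = mu(t, X_past, u) dt + sigma(t, X_past, u) dW_t with X_0 = x0,
    where the coefficients may inspect the whole past trajectory."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    X = np.empty(n_steps + 1)
    X[0] = x0
    for k in range(n_steps):
        t = k * dt
        past = X[: k + 1]                  # observed trajectory on [0, t]
        u = control(t, past)               # U-valued feedback control
        dW = rng.normal(0.0, np.sqrt(dt))  # Brownian increment
        X[k + 1] = X[k] + mu(t, past, u) * dt + sigma(t, past, u) * dW
    return X

# Hypothetical coefficients: controlled drift in U = [-1, 1] and a
# diffusion that depends on the running maximum of the past trajectory.
mu = lambda t, past, u: u
sigma = lambda t, past, u: 1.0 + 0.1 * np.max(np.abs(past))
control = lambda t, past: -1.0 if past[-1] > 0 else 1.0

path = simulate_controlled_sde(mu, sigma, control, x0=0.0)
```

Note that the path dependence enters through `past`, mirroring the dependence of µ(r, X, ν_r) and σ(r, X, ν_r) on {X_s, 0 ≤ s ≤ r}.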
Next, we introduce for fixed t ∈ [0,T] an SDE on [t,T] × Ω^t induced by μ and σ. Of course, the second argument of μ and σ requires a path on [0,T], so that it is necessary to specify a "history" for the SDE on [0,t]. This role is played by an arbitrary path η ∈ Ω. Given η, we define the conditioned coefficients

$$\mu^{t,\eta}(r,\omega,u) := \mu(r, \eta \otimes_t \omega, u) \quad \text{and} \quad \sigma^{t,\eta}(r,\omega,u) := \sigma(r, \eta \otimes_t \omega, u).$$

(More precisely, these functions are defined also when ω is a path not necessarily starting at the origin, but clearly their value at (r,ω,u) depends only on ω^t.) We observe that the Lipschitz condition (2.2) is inherited; indeed,

$$\big|\mu^{t,\eta}(r,\omega,u) - \mu^{t,\eta}(r,\omega',u)\big| + \big|\sigma^{t,\eta}(r,\omega,u) - \sigma^{t,\eta}(r,\omega',u)\big| \le K \sup_{t \le s \le r} |\omega_s - \omega'_s|.$$

We denote by U^t the set of all F^t-progressively measurable, U-valued processes ν satisfying the integrability conditions analogous to (2.3). Similarly as above, we define X(t,η,ν) as the P^t_0-a.s. unique strong solution of the SDE

$$X_r = \eta_t + \int_t^r \mu^{t,\eta}(s, X, \nu_s)\,ds + \int_t^r \sigma^{t,\eta}(s, X, \nu_s)\,dB^t_s, \qquad t \le r \le T, \qquad (2.6)$$

and by P(t,η,ν) the distribution of the translated solution X(t,η,ν)^t on Ω^t. Note that this is consistent with the notation P(0,x,ν) if x is seen as a constant path.

The Value Function
We can now define the value function for the case when the reward functional ξ is an element of UC_b(Ω), the space of bounded uniformly continuous functions on Ω:

$$V_t(\omega) := \sup_{\nu \in \mathcal{U}^t} E^{P(t,\omega,\nu)}\big[\xi^{t,\omega}\big], \qquad (t,\omega) \in [0,T] \times \Omega. \qquad (2.7)$$
The function ξ is fixed throughout Sections 2 and 3 and hence often suppressed in the notation. In view of the double dependence on ω in (2.7), the measurability of V t is not obvious. We have the following regularity result.
Proposition 2.5. Let ξ ∈ UC_b(Ω) and t ∈ [0,T]. Then

$$\big|V_t(\omega) - V_t(\omega')\big| \le \rho\big(\|\omega - \omega'\|_{[0,t]}\big) \quad \text{for all } \omega, \omega' \in \Omega,$$

with a modulus of continuity ρ depending only on ξ, the Lipschitz constant K and the time horizon T. In particular, V_t is bounded, F_t-measurable and uniformly continuous.
The first source of regularity for V_t is our assumption that ξ is uniformly continuous; the second is the Lipschitz property of the SDE. Before giving the proof of the proposition, we examine the latter aspect in detail.
Let t ∈ [0,T], ν ∈ U^t, and let ψ be a bounded uniformly continuous function on Ω^t. There exists a modulus of continuity ρ_{K,T,ψ}, depending only on K, T and the minimal modulus of continuity of ψ, such that

$$\big| E^{P(t,\omega,\nu)}[\psi] - E^{P(t,\tilde\omega,\nu)}[\psi] \big| \le \rho_{K,T,\psi}\big(\|\omega - \tilde\omega\|_{[0,t]}\big) \quad \text{for all } \omega, \tilde\omega \in \Omega.$$

Proof. We set E[·] := E^{P^t_0}[·] to alleviate the notation.

(i) Let ω, ω̃ ∈ Ω, set X := X(t,ω,ν) and X̃ := X(t,ω̃,ν), and recall that X and X̃ solve (2.6) with the histories ω and ω̃, respectively; we denote by M, M̃ the corresponding stochastic integral parts and by A, Ã the drift parts. By the Lipschitz property (2.2) of σ,

$$\big|\sigma^{t,\omega}(r, X, \nu_r) - \sigma^{t,\tilde\omega}(r, \tilde{X}, \nu_r)\big| \le K\big(\|\omega - \tilde\omega\|_{[0,t]} + \|X - \tilde{X}\|_{[t,r]}\big).$$

Hence, Doob's maximal inequality implies that

$$E\big[\|M - \tilde{M}\|^2_{[t,s]}\big] \le C\, E\Big[\int_t^s \|\omega - \tilde\omega\|^2_{[0,t]} + \|X - \tilde{X}\|^2_{[t,r]}\,dr\Big].$$

Moreover, using the Lipschitz property (2.2) of μ, we also have that

$$\|A - \tilde{A}\|_{[t,s]} \le K \int_t^s \|\omega - \tilde\omega\|_{[0,t]} + \|X - \tilde{X}\|_{[t,r]}\,dr$$

for all t ≤ s ≤ T and then Jensen's inequality yields that

$$E\big[\|A - \tilde{A}\|^2_{[t,s]}\big] \le C\, E\Big[\int_t^s \|\omega - \tilde\omega\|^2_{[0,t]} + \|X - \tilde{X}\|^2_{[t,r]}\,dr\Big].$$

Hence, we have shown that

$$E\big[\|X - \tilde{X}\|^2_{[t,s]}\big] \le C_0 \Big( \|\omega - \tilde\omega\|^2_{[0,t]} + \int_t^s E\big[\|X - \tilde{X}\|^2_{[t,r]}\big]\,dr \Big), \qquad t \le s \le T,$$

where C_0 depends only on K and T, and we conclude by Gronwall's lemma that

$$E\big[\|X - \tilde{X}\|^2_{[t,T]}\big] \le C_0 e^{C_0 (T-t)}\, \|\omega - \tilde\omega\|^2_{[0,t]}. \qquad (2.8)$$

The above calculation presumes sufficient integrability; by the continuity of their sample paths, there exists a localizing sequence (τ_n)_{n≥1} of stopping times such that M, M̃, A, Ã are bounded on [t,τ_n] for each n. Therefore, monotone convergence and the previous inequality yield (2.8) in general. Since X^t − X̃^t = (X − ω_t) − (X̃ − ω̃_t), we also obtain that

$$E\big[\|X^t - \tilde{X}^t\|^2_{[t,T]}\big] \le C_1\, \|\omega - \tilde\omega\|^2_{[0,t]} \qquad (2.9)$$

with a constant C_1 = C_1(K,T).

(ii) Let ρ̄ be the minimal (nondecreasing) modulus of continuity for ψ, and let ρ be the concave hull of ρ̄. Then ρ is a bounded continuous function satisfying ρ(0) = 0 and ρ ≥ ρ̄. Let P := P(t,ω,ν) and P̃ := P(t,ω̃,ν); then P and P̃ are the distributions of X^t and X̃^t, respectively; therefore,

$$\big| E^P[\psi] - E^{\tilde{P}}[\psi] \big| = \big| E\big[\psi(X^t)\big] - E\big[\psi(\tilde{X}^t)\big] \big| \le E\big[\rho\big(\|X^t - \tilde{X}^t\|_{[t,T]}\big)\big].$$

Moreover, Jensen's inequality (applied to the concave function ρ) and the Cauchy-Schwarz inequality dominate the right hand side by ρ(E[‖X^t − X̃^t‖²_{[t,T]}]^{1/2}). In view of (2.9), we have

$$\big| E^P[\psi] - E^{\tilde{P}}[\psi] \big| \le \rho\big(C_1^{1/2}\,\|\omega - \tilde\omega\|_{[0,t]}\big) =: \rho_{K,T,\psi}\big(\|\omega - \tilde\omega\|_{[0,t]}\big).$$

After these preparations, we can prove the continuity of V_t.

Pathwise Dynamic Programming
In this section, we provide a pathwise dynamic programming principle which is fundamental for the subsequent sections. As we are working in the weak formulation (1.4), the arguments used here are similar to, e.g., [26], while [22] gives a related construction in the strong formulation (i.e., working only under P 0 ). We assume in this section that the following conditional version of Assumption 2.1 holds true; however, we shall see later (Lemma 4.4) that this extended assumption holds automatically outside certain nullsets.
Assumption 3.1. Throughout Section 3, we assume that

$$\mathbb{F}^t \subseteq \mathbb{F}^{X(t,\eta,\nu)}_{P^t_0} \quad \text{for all } t \in [0,T],\ \eta \in \Omega \text{ and } \nu \in \mathcal{U}^t. \qquad (3.1)$$

The main result of this section is the following dynamic programming principle. We shall also provide more general, quasi-sure versions of this result later (the final form being Theorem 5.2).

Theorem 3.2. Let 0 ≤ s ≤ t ≤ T and ω̄ ∈ Ω. Then

$$V_s(\bar\omega) = \sup_{P \in \mathcal{P}(s,\bar\omega)} E^P\big[(V_t)^{s,\bar\omega}\big], \qquad (3.2)$$

where P(s,ω̄) := {P(s,ω̄,ν) : ν ∈ U^s}.
We remark that, in view of Proposition 2.5, we may see ξ ↦ V_r(ξ; ·) as a mapping UC_b(Ω) → UC_b(Ω) and recast (3.2) as the semigroup property

$$V_s(\xi; \bar\omega) = V_s\big(V_t(\xi; \cdot); \bar\omega\big), \qquad 0 \le s \le t \le T,\ \bar\omega \in \Omega. \qquad (3.3)$$

Some auxiliary results are needed for the proof of Theorem 3.2, which is given at the end of this section. We start with the (well known) observation that conditioning the solution of an SDE yields the solution of a suitably conditioned SDE.
Lemma 3.4. Let 0 ≤ s ≤ t ≤ T, ω̄ ∈ Ω and ν ∈ U^s, and let P := P(s,ω̄,ν). Then P^{t,ω} ∈ P(t, ω̄ ⊗_s ω) for P-a.e. ω ∈ Ω^s.

Proof. Let ω ∈ Ω^s and set X̄ := X(s,ω̄,ν). Using the definition and the flow property of X̄, we see that X̄^{t,ω} satisfies the equation obtained by conditioning the dynamics of X̄ at (t,ω). Therefore, recalling that X̄_s = ω̄_s, this can be restated as saying that X̄^{t,ω} solves the SDE (2.6) for the parameters (t, ω̄ ⊗_s ω, ν^{t,ω}). Now the result follows by the uniqueness of the solution to this SDE.
Given t ∈ [0,T] and ω ∈ Ω, we define

$$\mathcal{P}(t,\omega) := \big\{ P(t,\omega,\nu) : \nu \in \mathcal{U}^t \big\}.$$

These sets have the following invariance property.
Proof. As ν̄ = ν on [s,t), we have X(s,ω̄,ν̄) = X on [s,t] and in particular P̂ = P on F^s_t. Let 1 ≤ i ≤ N; we show that P̂^{t,ω} = P(t, ω̄ ⊗_s ω, ν_i) for P-a.e. ω ∈ E_i. Recall from (3.8) that P̂^{t,ω} is given in terms of the conditioned control ν̄^{t,ω̃}, where ω̃ = β_ν̄(ω) is defined as in (3.6). Since both sides of this equality depend only on the restriction of ω to [s,t] and X(s,ω̄,ν̄) = X on [s,t], the same formula holds with ω̃ = β_ν(ω) as defined below (3.6). Note that, by (3.7), this ω̃ depends only on the restriction of ω to [s,t]. In fact, since X is adapted and E_i ∈ F^s_t, we even have that the event {ω ∈ E_i} is determined by this restriction. By the definition of ν̄, we conclude that ν̄^{t,ω̃} = (ν_i)^{t,ω̃} for ω ∈ E_i. In view of (3.9), this yields the claim.
Remark 3.6. In [26] and [17], it was possible to use a pasting of measures as follows: in the notation of Lemma 3.5, one could specify measures P_i on Ω^t, corresponding to certain admissible controls, and paste them with P at time t so as to obtain a measure P̂ on Ω^s which again corresponded to some admissible control and satisfied

$$\hat{P}^{t,\omega} = P_i \quad \text{for } P\text{-a.e. } \omega \in E_i, \qquad (3.10)$$

which was then used in the proof of the dynamic programming principle. This is not possible in our SDE-driven setting. Indeed, suppose that P̂ is of the form P(s,ω̄,ν) for some ω̄ and ν; then we see from (3.8) that, when σ is general, P̂^{t,ω} will depend explicitly on ω, which contradicts (3.10). Therefore, the subsequent proof uses an argument where (3.10) holds only at one specific ω ∈ E_i; on the rest of E_i, we confine ourselves to controlling the error.
We can now show the dynamic programming principle. Apart from the difference remarked above, the basic pattern of the proof is the same as in [26,Proposition 4.7].
Proof of Theorem 3.2. In view of the definitions, the claim reads

$$V_s(\bar\omega) = \sup_{P \in \mathcal{P}(s,\bar\omega)} E^P\big[(V_t)^{s,\bar\omega}\big]. \qquad (3.11)$$

(i) We first show the inequality "≤" in (3.11). Fix ω̄ ∈ Ω and P ∈ P(s,ω̄). Lemma 3.4 shows that P^{t,ω} ∈ P(t, ω̄ ⊗_s ω) for P-a.e. ω ∈ Ω^s and hence that

$$E^{P^{t,\omega}}\big[\xi^{t,\,\bar\omega\otimes_s\omega}\big] \le V_t(\bar\omega \otimes_s \omega) \quad \text{for } P\text{-a.e. } \omega \in \Omega^s.$$

Since V_t is measurable by Proposition 2.5, we can take P(dω)-expectations on both sides to obtain that

$$E^P\big[\xi^{s,\bar\omega}\big] \le E^P\big[(V_t)^{s,\bar\omega}\big].$$

We take the supremum over P ∈ P(s,ω̄) on both sides and obtain the claim.
(ii) We now show the inequality "≥" in (3.11). Fix ε > 0 and let ω̃ ∈ Ω^s. By the definition of V_t(ω̄ ⊗_s ω̃), there exists ν^{(ω̃)} ∈ U^t such that P^{(ω̃)} := P(t, ω̄ ⊗_s ω̃, ν^{(ω̃)}) satisfies

$$E^{P^{(\tilde\omega)}}\big[\xi^{t,\,\bar\omega\otimes_s\tilde\omega}\big] \ge V_t(\bar\omega \otimes_s \tilde\omega) - \varepsilon. \qquad (3.12)$$

Choose a countable partition (E_i)_{i≥1} ⊆ F^s_t of Ω^s into sets of ‖·‖_{[s,t]}-diameter less than ε with centers ω̂_i; replacing the centers if necessary, we may assume that ω̂_i ∈ E_i for i ≥ 1. We set ν_i := ν^{(ω̂_i)} and P_i := P(t, ω̄ ⊗_s ω̂_i, ν_i). Next, we paste the controls ν_i. Fix N ∈ N, let A_N := E_1 ∪ ··· ∪ E_N, define the pasted control ν̄ from ν and ν_1, ..., ν_N as in Lemma 3.5, and let P̂ := P(s,ω̄,ν̄). Then, by Lemma 3.5, we have P̂ = P on F^s_t and P̂^{t,ω} = P(t, ω̄ ⊗_s ω, ν_i) for all ω ∈ Ẽ_i, for some subset Ẽ_i ⊆ E_i of full measure P. Let us assume for the moment that

$$\hat\omega_i \in \tilde{E}_i, \qquad 1 \le i \le N; \qquad (3.13)$$

then we may conclude that

$$\hat{P}^{t,\hat\omega_i} = P_i, \qquad 1 \le i \le N. \qquad (3.14)$$

Recall from Proposition 2.5 that V_t admits a modulus of continuity ρ^{(V_t)}. Moreover, we obtain similarly as in (2.10) that there exists a modulus of continuity ρ^{(ξ)} controlling the ω-dependence of the conditioned reward. Let ω ∈ E_i ⊆ Ω^s for some 1 ≤ i ≤ N; then ‖ω − ω̂_i‖_{[s,t]} < ε. Recall also from (2.12) that the mapping ω ↦ E^{P(t,\,ω̄⊗_sω,\,ν_i)}[ξ^{t,\,ω̄⊗_sω}] is uniformly continuous with a modulus ρ̄ independent of i and N. Together with (3.12) and (3.14), it follows that

$$E^{\hat{P}^{t,\omega}}\big[\xi^{t,\,\bar\omega\otimes_s\omega}\big] \ge V_t(\bar\omega \otimes_s \omega) - \varepsilon - \rho^{(V_t)}(\varepsilon) - \rho^{(\xi)}(\varepsilon) - \bar\rho(\varepsilon)$$

for P̂-a.e. (and thus P-a.e.) ω ∈ E_i. This holds for all 1 ≤ i ≤ N. As P = P̂ on F^s_t, taking P-expectations yields the corresponding integrated inequality on A_N, where we write P̂_N = P̂ to recall the dependence on N.
Letting N → ∞, we conclude from (3.18) that

$$V_s(\bar\omega) \ge E^P\big[(V_t)^{s,\bar\omega}\big] - \big(\varepsilon + \rho^{(V_t)}(\varepsilon) + \rho^{(\xi)}(\varepsilon) + \bar\rho(\varepsilon)\big).$$

Since P ∈ P(s,ω̄) was arbitrary, letting ε → 0 completes the proof of (3.11). It remains to argue that our assumption (3.13) does not entail a loss of generality. Indeed, assume that ω̂_i ∉ Ẽ_i for some i. Then there are two possible cases. The case P(E_i) = 0 is easily seen to be harmless; recall that the measure P was fixed throughout the proof. In the case P(E_i) > 0, we also have P(Ẽ_i) > 0 and in particular Ẽ_i ≠ ∅. Thus we can replace ω̂_i by an arbitrary element of Ẽ_i (which can be chosen independently of N). Using the continuity of the value function (Proposition 2.5) and of the reward function (2.12), we see that the above arguments still apply if we add an additional modulus of continuity in (3.15).

Extension of the Value Function
In this section, we extend the value function ξ → V t (ξ; ·) to an L 1 -type space of random variables ξ, in the spirit of, e.g., [8]. While the construction of V t in the previous section required a precise analysis "ω by ω", we can now move towards a more probabilistic presentation. In particular, we shall often write V t (ξ) for the random variable ω → V t (ξ; ω).
For reasons explained in Remark 4.2 below, we fix from now on an initial condition x ∈ R^d and let

$$\mathcal{P}_x := \big\{ P(0,x,\nu) : \nu \in \mathcal{U} \big\}$$

be the corresponding set of measures at time s = 0. Given a random variable ψ on Ω, we write ψ^x as a shorthand for ψ^{0,x} ≡ ψ(x ⊗_0 ·). We also write V^x_t(ξ) for (V_t(ξ))^x.
Given p ∈ [1,∞), we define L^p_{P_x} to be the space of F_T-measurable random variables X satisfying

$$\|X\|_{L^p_{\mathcal{P}_x}} := \sup_{P \in \mathcal{P}_x} E^P\big[|X|^p\big]^{1/p} < \infty.$$

More precisely, we identify functions which are equal P_x-quasi-surely, so that L^p_{P_x} becomes a Banach space. (Two functions are equal P_x-quasi-surely, P_x-q.s. for short, if they are equal P-a.s. for all P ∈ P_x.) Furthermore, given t ∈ [0,T], we denote by L^p_{P_x}(F_t) the subspace of random variables which are F_t-measurable up to a P_x-polar set. Since any L^p_{P_x}-convergent sequence has a P_x-q.s. convergent subsequence, any element of L^p_{P_x}(F_t) has an F_t-measurable representative. For brevity, we shall often write L^p_{P_x} for L^p_{P_x}(F_T).
Remark 4.1. The space L^p_{P_x} can be described as follows. We say that ξ ∈ L^p_{P_x} is P_x-quasi uniformly continuous if ξ has a representative ξ′ with the property that for all ε > 0 there exists an open set G ⊆ Ω such that P(G) < ε for all P ∈ P_x and such that the restriction ξ′|_{Ω\G} is uniformly continuous. Then the closure of UC_b(Ω) in L^p_{P_x} consists of all ξ ∈ L^p_{P_x} such that ξ is P_x-quasi uniformly continuous and lim_{n→∞} ‖ξ 1_{{|ξ|≥n}}‖_{L^p_{P_x}} = 0. Moreover, if P_x is weakly relatively compact, then this closure contains all bounded continuous functions on Ω. The proof is the same as in [17, Proposition 5.2], which, in turn, followed an argument of [7].
Before extending the value function to L 1 Px , let us explain why we are working under a fixed initial condition x ∈ R d .

Remark 4.2.
There is no fundamental obstruction to writing the theory without fixing the initial condition x; in fact, most of the results would be more elegant if stated using P̄ instead of P_x, where P̄ is the set of all distributions of the form (2.4), with arbitrary initial condition. However, the set P̄ is very large and therefore the corresponding space L^1_{P̄} is very small, which is undesirable for the domain of our extended value function. As an illustration, consider a random variable of the form ξ(ω) = f(ω_T) for a continuous function f: since the supremum in the norm then runs over all initial conditions x ∈ R^d, ξ is in L^1_{P̄} only when f is uniformly bounded. As a second issue in the same vein, it follows from the Arzelà-Ascoli theorem that the set P̄ is never weakly relatively compact. The latter property, which is satisfied by P_x for example when μ and σ are bounded, is sometimes useful in the context of quasi-sure analysis.
Lemma 4.3. Let p ∈ [1,∞) and t ∈ [0,T]. Then

$$\big\|V^x_t(\xi) - V^x_t(\psi)\big\|_{L^p_{\mathcal{P}_x}} \le \|\xi - \psi\|_{L^p_{\mathcal{P}_x}} \quad \text{for all } \xi, \psi \in UC_b(\Omega).$$

As a consequence, V^x_t uniquely extends to a Lipschitz-continuous mapping on L^p_{P_x}.

Proof. The argument is standard and included only for completeness. Note that |ξ − ψ|^p is again in UC_b(Ω). The definition of V^x_t and Jensen's inequality imply that

$$\big|V^x_t(\xi) - V^x_t(\psi)\big|^p \le V^x_t\big(|\xi - \psi|^p\big),$$

and hence

$$\big\|V^x_t(\xi) - V^x_t(\psi)\big\|^p_{L^p_{\mathcal{P}_x}} \le \sup_{P \in \mathcal{P}_x} E^P\big[\big(V_t(|\xi - \psi|^p)\big)^x\big] = \|\xi - \psi\|^p_{L^p_{\mathcal{P}_x}},$$

where the equality is due to (3.2) applied with s = 0. (For the case s = 0, the additional Assumption 3.1 was not used in the previous section.) Recalling from Proposition 2.5 that V^x_t maps UC_b(Ω) to UC_b(Ω_t), it follows that the extension maps L^p_{P_x} to L^p_{P_x}(F_t).

Quasi-Sure Properties of the Extension
In this section, we provide some auxiliary results of a technical nature. The first one will (quasi-surely) allow us to appeal to the results of the previous section without imposing Assumption 3.1. This is desirable since we would like to end up with quasi-sure theorems whose statements do not involve regular conditional probability distributions.

Lemma 4.4. Let t ∈ [0,T]. There exists a P_x-polar set N such that

$$\mathbb{F}^t \subseteq \mathbb{F}^{X(t,\eta,\nu)}_{P^t_0} \quad \text{for all } \nu \in \mathcal{U}^t \text{ and all } \eta \in \Omega \text{ with } \eta_0 = x \text{ and } \eta^0 \notin N.$$
For the proof of this lemma, we shall use the following result.
Lemma 4.5. Let Y and Z be continuous adapted processes, let t ∈ [0,T] and let P be a probability measure on Ω. Then F^Y_P ⊇ F^Z implies that

$$\mathbb{F}^{Y(\omega\otimes_t \cdot)}_{P^{t,\omega}} \supseteq \mathbb{F}^{Z(\omega\otimes_t \cdot)} \quad \text{for } P\text{-a.e. } \omega \in \Omega.$$

Proof. The assumption implies that there exists a progressively measurable transformation β : Ω → Ω such that Z = β(Y) P-a.s. For P-a.e. ω ∈ Ω, it follows that Z(ω ⊗_t ·) = β(Y(ω ⊗_t ·)) P^{t,ω}-a.s., which, in turn, yields the result.
Proof of Lemma 4.4. Let X := X(t,η,ν) with η_0 = x; we have to show that F^X_{P^t_0} ⊇ F^t whenever η^0 lies outside some P_x-polar set. Hence we shall fix an arbitrary P̄ ∈ P_x and show that the result holds on a set of full measure P̄.
Since F ⊆ F^{X̄}_{P_0} by Assumption 2.1, we conclude by using Lemma 4.5 with Z being the canonical process.
The next two results show that (for µ ≡ 0 and σ positive definite) the mapping ξ → V x t (ξ) on L 1 Px falls into the general class of sublinear expectations considered in [19], whose techniques we shall apply in the subsequent section. More precisely, the two lemmas below yield the validity of its main condition [19,Assumption 4.1].
The following property is known as stability under pasting and well known to be important in non-Markovian control. It should not be confused with the pasting discussed in Remark 3.6, where the considered measures correspond to different points in time.
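The precise statement of the stability lemma did not survive here in full; in this literature it typically takes the following form (a reconstruction up to notational details): given P ∈ P_x, t ∈ [0,T], a set Λ ∈ F_t and measures P_1, P_2 ∈ P_x(F_t, P), the pasted measure

```latex
\bar{P}(A) := E^{P}\big[\, P_1(A \mid \mathcal{F}_t)\,\mathbf{1}_{\Lambda}
            + P_2(A \mid \mathcal{F}_t)\,\mathbf{1}_{\Lambda^{c}} \big],
\qquad A \in \mathcal{F}_T,
```

again belongs to P_x. This is the type of condition required in [19, Assumption 4.1].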
The second property is the quasi-sure representation of V^x_t(ξ) on L^1_{P_x}, a result which will be generalized in Theorem 5.2 below.

Lemma 4.7. Let ξ ∈ L^1_{P_x}. Then

$$V^x_t(\xi) = \operatorname*{ess\,sup}^{P}_{P' \in \mathcal{P}_x(\mathcal{F}_t,P)} E^{P'}\big[\xi^x\,\big|\,\mathcal{F}_t\big] \quad P\text{-a.s. for all } P \in \mathcal{P}_x, \qquad (4.3)$$

where P_x(F_t, P) := {P′ ∈ P_x : P′ = P on F_t}.
Proof. Recall that Lemma 4.4 allows us to appeal to the results of Section 3.
(i) We first prove the inequality "≤" for ξ ∈ UC_b(Ω). Fix P ∈ P_x. We use step (ii) of the proof of Theorem 3.2, in particular (3.17), for the special case s = 0, and obtain that for given ε > 0 and N ≥ 1 there exists a measure P̄_N ∈ P_x(F_t, P) whose conditional expectation attains V^x_t(ξ) up to an error which vanishes as ε → 0 and N → ∞. The claim follows by letting ε → 0.
(ii) Next, we show the inequality "≥" in (4.3) for ξ ∈ UC_b(Ω). Fix P, P′ ∈ P_x and recall that (P′)^{t,ω} ∈ P(t, x ⊗_0 ω) for P′-a.e. ω ∈ Ω by Lemma 3.4. Therefore, (3.2) applied with s := t and t := T yields that

$$E^{P'}\big[\xi^x\,\big|\,\mathcal{F}_t\big] \le V^x_t(\xi) \quad P'\text{-a.s.}$$

If P′ ∈ P_x(F_t, P), then P′ = P on F_t and the inequality holds also P-a.s. The claim follows as P′ ∈ P_x(F_t, P) was arbitrary.
(iii) So far, we have proved the result for ξ ∈ UC b (Ω). The general case ξ ∈ L 1 Px can be derived by an approximation argument exploiting the stability under pasting (Lemma 4.6). We omit the details since the proof is exactly the same as in [17,Theorem 5.4].

Path Regularity for the Value Process
In this section, we construct a càdlàg P_x-modification for V^x(ξ); that is, a càdlàg process E^x(ξ) such that E^x_t(ξ) = V^x_t(ξ) P_x-q.s. for every t ∈ [0,T]. (Recall that the initial condition x ∈ R^d has been fixed.) To this end, we extend the raw filtration F as in [19]: we let F_+ = {F_{t+}}_{0≤t≤T} be the minimal right-continuous filtration containing F and we augment F_+ by the collection N_{P_x} of (P_x, F_T)-polar sets to obtain the filtration

$$\mathcal{G} = \{\mathcal{G}_t\}_{0\le t\le T}, \qquad \mathcal{G}_t := \mathcal{F}_{t+} \vee \mathcal{N}_{\mathcal{P}_x}.$$

We note that G depends on x ∈ R^d since N_{P_x} does, but for brevity, we shall not indicate this in the notation. In fact, the dependence on x is not crucial: we could also work with F_+, at the expense of obtaining a modification which is P_x-q.s. equal to a càdlàg process rather than being càdlàg itself.
We recall that in the quasi-sure setting, value processes similar to the one under consideration do not admit càdlàg modifications in general; indeed, while the right limit exists quasi-surely, it need not be a modification (cf. [19]). Both the regularity of ξ ∈ L 1 Px and the regularity induced by the SDE are crucial for the following result.
By a variant of Tietze's extension theorem, cf. [16], this implies that E^x_t(ξ) coincides P_x-q.s. with an element of UC_b(Ω_t). In particular, E^x_t(ξ) ∈ L^1_{P_x}(F_t).

(ii) Next, we show that E^x_t is Lipschitz-continuous. Let ξ, ψ ∈ L^1_{P_x} and t_n ↓ t. Using (5.2), Fatou's lemma and Lemma 4.3, we obtain that

$$\big\|\mathcal{E}^x_t(\xi) - \mathcal{E}^x_t(\psi)\big\|_{L^1_{\mathcal{P}_x}} \le \liminf_{n\to\infty} \big\|V^x_{t_n}(\xi) - V^x_{t_n}(\psi)\big\|_{L^1_{\mathcal{P}_x}} \le \|\xi - \psi\|_{L^1_{\mathcal{P}_x}}.$$

Moreover, the representations (4.3) and (5.1) yield

$$\mathcal{E}^x_t(\xi) \ge \operatorname*{ess\,sup}^{P}_{P' \in \mathcal{P}_x(\mathcal{F}_t,P)} E^{P'}\big[\xi^x\,\big|\,\mathcal{F}_t\big] = V^x_t(\xi) \quad P\text{-a.s. for all } P \in \mathcal{P}_x.$$

We conclude that (5.3) holds true.
Since E x (ξ) is a càdlàg process, its value E x τ (ξ) at a stopping time τ is well defined. The following result states the quasi-sure representation of E x τ (ξ) and the quasi-sure version of the dynamic programming principle in its final form.
Moreover, there exists for each P ∈ P_x a sequence P_n ∈ P_x(G_τ, P) such that

$$\mathcal{E}^x_\tau(\xi) = \lim_{n\to\infty} E^{P_n}\big[\xi^x\,\big|\,\mathcal{G}_\tau\big] \quad P\text{-a.s.},$$

with a P-a.s. increasing limit.
Proof. In view of Lemmata 4.6 and 4.7, the result is derived exactly as in [19,Theorem 4.9].
As in (3.3), the relation (5.4) can be seen as a semigroup property

$$\mathcal{E}^x_\sigma\big(\mathcal{E}^x_\tau(\xi)\big) = \mathcal{E}^x_\sigma(\xi) \quad \text{for stopping times } \sigma \le \tau,$$

provided that E^x_τ(ξ) again belongs to the domain L^1_{P_x} of the operator. The latter is guaranteed by Lemma 4.3 when τ is a deterministic time. However, one cannot expect E^x_τ(ξ) to be quasi uniformly continuous (cf. Remark 4.1) for a general stopping time, for which reason we prefer to express the right hand side as in (5.4).

Hamilton-Jacobi-Bellman 2BSDE
In this section, we characterize the value process E x (ξ) as the solution of a 2BSDE. To this end, we first examine the properties of B under a fixed P ∈ P x . The following result is in the spirit of [28, Section 8].
Proposition 6.1. Let x ∈ R^d, ν ∈ U and P := P(0,x,ν). There exists a progressively measurable transformation β : Ω → Ω (depending on x, ν) such that W := β(B) is a P-Brownian motion and

$$\mathbb{F}^W_P = \mathbb{F}^B_P. \qquad (6.1)$$

Moreover, B is the P-a.s. unique strong solution of the SDE

$$B_t = \int_0^t \mu\big(r, x + B, \nu_r(W)\big)\,dr + \int_0^t \sigma\big(r, x + B, \nu_r(W)\big)\,dW_r.$$

Proof. Let X := X(0,x,ν). As in Lemma 3.4, Assumption 2.1 implies the existence of a progressively measurable transformation β : Ω → Ω such that

$$\beta(X^0) = B \quad P_0\text{-a.s.} \qquad (6.2)$$

Let W := β(B). Then the distribution of (B, X^0) under P_0 coincides with the distribution of (W, B) under P. In particular, W is a P-Brownian motion. Moreover, we have F^{X^0}_{P_0} = F^{β(X^0)}_{P_0} by Assumption 2.1 and therefore F^B_P = F^W_P, which is (6.1). Note that X^0 = X^0(B) satisfies

$$X^0_t = \int_0^t \mu\big(r, x + X^0, \nu_r(B)\big)\,dr + \int_0^t \sigma\big(r, x + X^0, \nu_r(B)\big)\,dB_r \quad \text{under } P_0.$$

Let Y be the (unique, strong) solution of the analogous SDE driven by W under P. Using the definition of P and (6.2), we see that the distribution of (W, Y) under P coincides with that of (B, X^0) under P_0. In view of (6.1), it follows that Y = B holds P-a.s.
In the sequel, we denote by M^{B,P} the local martingale part in the canonical semimartingale decomposition of B under P.

Corollary 6.2. Let P ∈ P_x. Then the filtration F_P is right-continuous. If, in addition, σ is invertible, then (M^{B,P}, P) has the predictable representation property.
The latter statement means that any right-continuous (F_P, P)-local martingale N has a representation N = N_0 + ∫ Z dM^{B,P} under P, for some F_P-predictable process Z.
Proof. We have seen in Proposition 6.1 that F_P is generated by a Brownian motion W, hence right-continuous, and that M^{B,P} = ∫_0^· σ̄_t dW_t for σ̄_t := σ(t, x + B, ν_t(W)), where ν ∈ U. By changing σ̄ on a dt × P-nullset, we may assume that σ̄ is F_P-predictable. Using the Brownian representation theorem and W = ∫ σ̄^{-1} dM^{B,P}, we deduce that M^{B,P} has the representation property.
The following formulation of the 2BSDE is, of course, inspired by [27].

Definition 6.3. Let ξ ∈ L^1_{P_x} and consider a pair (Y, Z) of processes with values in R × R^d such that Y is càdlàg and G-adapted, while Z is G-predictable and ∫_0^T |Z_s|^2 d⟨B⟩_s < ∞ P_x-q.s. Then (Y, Z) is called a solution of the 2BSDE (6.3) if there exists a family (K^P)_{P∈P_x} of F_P-adapted increasing processes satisfying E^P[|K^P_T|] < ∞ such that

$$Y_t = \xi - \int_t^T Z_s\,dM^{B,P}_s + K^P_T - K^P_t, \quad 0 \le t \le T, \quad P\text{-a.s. for all } P \in \mathcal{P}_x \qquad (6.3)$$

and such that the following minimality condition holds for all 0 ≤ t ≤ T:

$$\operatorname*{ess\,inf}^{P}_{P' \in \mathcal{P}_x(\mathcal{G}_t,P)} E^{P'}\big[K^{P'}_T - K^{P'}_t \,\big|\, \mathcal{G}_t\big] = 0 \quad P\text{-a.s. for all } P \in \mathcal{P}_x. \qquad (6.4)$$

Moreover, a càdlàg process Y is said to be of class (D, P_x) if the family {Y_τ}_τ is uniformly integrable under P for all P ∈ P_x, where τ runs through all G-stopping times.

The following is our main result.

Theorem 6.4. Assume that σ is invertible and let ξ ∈ L^1_{P_x}.

(i) There exists a (dt × P_x-q.s. unique) G-predictable process Z^ξ such that

$$Z^\xi = d\langle B, B\rangle_P^{-1}\, d\langle \mathcal{E}^x(\xi), B\rangle_P \quad P\text{-a.s. for all } P \in \mathcal{P}_x. \qquad (6.5)$$

(ii) The pair (E^x(ξ), Z^ξ) is the minimal solution of the 2BSDE (6.3); i.e., if (Y, Z) is another solution, then E^x(ξ) ≤ Y P_x-q.s.

(iii) If, in addition, E^x(ξ) is of class (D, P_x), then it is the unique solution of (6.3) in the class (D, P_x).
In particular, if ξ ∈ L^p_{P_x} for some p > 1, then (E^x(ξ), Z^ξ) is the unique solution of (6.3) in the class (D, P_x).
Proof. Given two processes which are (càdlàg) semimartingales under all P ∈ P x , their quadratic covariation can be defined P x -q.s. by using the integration-by-parts formula and Bichteler's pathwise stochastic integration [1, Theorem 7.14]; therefore, the right hand side of (6.5) can be used as a definition of Z ξ . The details of the argument are as in [19,Proposition 4.10].
Let P ∈ P_x. By Proposition 6.1, B is an Itô process under P; in particular, we have ⟨B, S⟩_P = ⟨M^{B,P}, S⟩_P P-a.s. for any P-semimartingale S. The Doob-Meyer theorem under P and Corollary 6.2 then yield the decomposition

$$\mathcal{E}^x(\xi) = \mathcal{E}^x_0(\xi) + \int Z^\xi\,dM^{B,P} - K^P \quad P\text{-a.s.},$$

and we obtain (ii) and (iii) by following the arguments in [19, Theorem 4.15]. If ξ ∈ L^p_{P_x} for some p ∈ (1,∞), then E^x(ξ) is of class (D, P_x) as a consequence of Jensen's inequality (cf. [19, Lemma 4.14]). Therefore, the last assertion follows from the above.
We conclude by interpreting the canonical process B, seen under the "set of scenarios" P_x, as a model for drift and volatility uncertainty in the Knightian sense.

Remark 6.5. Consider the set-valued process

$$D_t(\omega) := \big\{ \big(\mu(t,\omega,u),\, \sigma(t,\omega,u)\big) : u \in U \big\} \subseteq \mathbb{R}^d \times \mathbb{R}^{d\times d}.$$
In view of Proposition 6.1, each P ∈ P x can be seen as a scenario in which the drift and the volatility (of B) take values in D, P -a.s. Then, the upper expectation E x (ξ) is the corresponding worst-case expectation (see [19] for a connection to superhedging in finance). Note that D is a random process although the coefficients of our controlled SDE are non-random. Indeed, the path-dependence of the SDE translates to an ω-dependence in the weak formulation that we are considering.
In particular, for μ ≡ 0, we have constructed a sublinear expectation similar to the random G-expectation of [17]. While the latter is defined by specifying a set-valued process like D in the first place, we have started here from a controlled SDE under P_0. It seems that the present construction is somewhat less technical than the one in [17]; in particular, we did not work with the process â = d⟨B⟩_t/dt which played an important role in [26] and [17]. However, it seems that the Lipschitz conditions on μ and σ are essential, while [17] merely used a notion of uniform continuity.