Well-posedness of Multidimensional Diffusion Processes with Weakly Differentiable Coefficients

We investigate well-posedness for martingale solutions of stochastic differential equations, under low regularity assumptions on their coefficients, widely extending some results first obtained by A. Figalli. Our main results are a very general equivalence between different descriptions for multidimensional diffusion processes, such as Fokker-Planck equations and martingale problems, under minimal regularity and integrability assumptions, and new existence and uniqueness results for diffusions having weakly differentiable coefficients, by means of energy estimates and commutator inequalities. Our approach relies upon techniques recently developed, jointly with L. Ambrosio, to address well-posedness for ordinary differential equations in metric measure spaces: in particular, we employ in a systematic way new representations and inequalities for commutators between smoothing operators and diffusion generators.


Introduction
The aim of this article is to study well-posedness (i.e., existence, uniqueness and stability) for martingale solutions of stochastic differential equations
dX_t = b(t, X_t) dt + σ(t, X_t) dW_t, t ∈ (0, T), (1)
providing in particular new results under low regularity assumptions on the coefficients b and σ. The classical subject of martingale problems dates back at least to [SV06], where it was first shown that continuous and uniformly elliptic covariances a = σσ* allow for uniqueness results which have no counterpart in the usual (Itô-)Cauchy-Lipschitz theory, provided that the solution to (1) is understood in a sufficiently weak sense. Since then, the theory has been growing, due to its robustness and strong connections with the theory of semigroups and parabolic PDE's, also in abstract (metric) frameworks, see e.g. [EK86].
Our primary goal here is to show that the techniques originally developed in [AT14] can be extended to the stochastic theory as well as specialized to the Euclidean setting, to extend in a systematic way the results established in the seminal paper [Fig08]. Actually, most of these techniques, tailored to study well-posedness problems for ordinary differential equations in metric measure spaces (possibly infinite-dimensional), are also well-suited to the study of diffusions in metric measure spaces, as developed in the author's PhD dissertation [Tre14]. However, in this paper, we deal uniquely with Euclidean spaces: among various motivations, besides the fact that a wider audience may be mainly interested in this setting, this allows us to compare new results and techniques with alternative approaches. Finally, Euclidean spaces are a useful "intermediate" step for the infinite-dimensional theory, e.g. by cylindrical approximations; the theory developed here is also instrumental to the developments in [Tre15].
Therefore, in this article, we adopt the same point of view as in [Fig08], where precise connections between well-posedness of PDE's and martingale problems are established, in particular for a wide class of diffusions whose coefficients are not necessarily continuous nor elliptic, provided that some Sobolev regularity holds. Of course, well-posedness has to be understood "in average" with respect to L^d-a.e. initial condition (here and below, L^d is Lebesgue measure on R^d). More precisely, a formalization akin to that of DiPerna-Lions (see e.g. [AC14] for an account of the deterministic theory) is introduced, the main objects being Stochastic Lagrangian Flows, i.e., Borel families (η(x))_{x∈R^d} of probability measures on C([0, T]; R^d), such that (i) η(x) solves (1), starting from x at t = 0, for L^d-a.e. x ∈ R^d; (ii) the push-forward measures (e_t)♯ ∫ η(x) dL^d(x), where e_t is the evaluation map at t ∈ [0, T], are absolutely continuous with respect to L^d, with uniformly bounded densities.
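In symbols, the two requirements above can be sketched as follows. The density notation ρ_t and the constant C are introduced here only for illustration and do not appear in the original text.

```latex
% (i) martingale solution started at x, for Lebesgue-a.e. x
\[
\eta(x) \text{ solves (1)}, \qquad (e_0)_\sharp \eta(x) = \delta_x ,
\qquad \text{for } \mathcal{L}^d\text{-a.e. } x \in \mathbb{R}^d ;
\]
% (ii) bounded densities for the 1-marginals of the averaged measure
% (rho_t and C are illustrative names)
\[
(e_t)_\sharp \Big( \int_{\mathbb{R}^d} \eta(x)\, d\mathcal{L}^d(x) \Big) = \rho_t \, \mathcal{L}^d ,
\qquad \sup_{t \in [0,T]} \| \rho_t \|_{L^\infty(\mathbb{R}^d)} \le C < \infty .
\]
```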
Let us stress the fact that, as in the deterministic theory, uniqueness is understood for flows, thus in a selection sense: we are not claiming well-posedness for L^d-a.e. initial datum. Moreover, we remark that, although the conditions above might read as perfect analogues of the notion of Regular Lagrangian flows [AC14, Definition 13], Stochastic Lagrangian Flows are not necessarily (nor expected to be) deterministic maps of the initial point only; this is evident when σ = 0 above, since any probability measure concentrated on possibly non-unique solutions to the ODE gives rise to a solution to the martingale problem. Despite this discrepancy, such a theory provides rather efficient tools to study stochastic differential equations under low regularity assumptions, in Euclidean spaces, and, together with [LBL08], which deals with analogous issues from a PDE point of view, has become the starting point for further developments, among which we quote [RZ10, Luo13, FLT10, Zha13]. Before we proceed with a more detailed description of our results and techniques, let us stress the fact that we are concerned uniquely with martingale problems, so we neither address strong solutions of equations under low regularity assumptions on the coefficients nor compare our results with those obtained for them (see the seminal paper [Ver80] and [KR05, DFPR13] for more recent results). Rigorous correspondences between martingale (or weak) and strong solutions may be provided by the classical Yamada-Watanabe theorem [YW71] (and extensions, see e.g. [Kur07]). Moreover, the literature on Fokker-Planck equations for general measures is so vast that we must limit ourselves to a comparison of our results only with those which are strongly related and look similar in techniques and mathematical contents: this is done in Section 3.3.
We proceed with a brief description of our contributions, which can be split into two parts, roughly corresponding to Section 2 (together with Appendix A) and Section 3.
In the first part, we investigate the problem of abstract equivalence between "Eulerian" and "Lagrangian" descriptions for multidimensional diffusion processes, where by the former we mean Fokker-Planck equations, while the latter consists of solutions to martingale problems. Although such a correspondence cannot be considered novel and many ideas can be traced back at least to [Amb04] in the theory of ODE's and DiPerna-Lions flows, as well as [KS98] for càdlàg martingale problems, to our knowledge we provide here for the first time general results, under somewhat minimal integrability assumptions on coefficients as well as on solutions. Moreover, we choose to state and prove our results in such a way that they can be translated with minimal effort to the case of general metric measure spaces, which we address in [Tre15].
In this part, the crucial result is Theorem 2.5, which provides a so-called "superposition principle", i.e., a (non-canonical) way to lift any probability-valued solution of a Fokker-Planck equation to some solution of the corresponding martingale problem. Here, "to lift" means that the 1-marginals of the process which solves the martingale problem coincide with the given solution of the Fokker-Planck equation. Results in a similar spirit appear quite often in the literature (see also the comments just below the statement of Theorem 2.5) and could be traced back to L.C. Young's theory of generalized curves. Technically, one could start from already known results such as [Fig08, Theorem 2.6] or [EK86, Theorem 4.9.17] to provide a slightly shorter proof, but we preferred to give an almost self-contained derivation, deferred to Appendix A: indeed, even if we rely on the results quoted above, it turns out that one has to settle non-trivial technical problems. In particular, an underlying result is Theorem A.2, where we establish an estimate for the modulus of continuity of solutions to martingale problems under somewhat minimal integrability assumptions (based on a refined Lévy-type estimate); an alternative but less effective approach, based on fractional Sobolev spaces, was developed in [Tre14]. Finally, we point out that we exploit a technique originally developed in [AT14, Theorem 7.1], in the case of cylindrical approximations, to move from bounded coefficients to possibly unbounded ones.
In the second part, we address the problem of well-posedness for Fokker-Planck equations, providing sufficient conditions assuming Sobolev regularity of the coefficients. We mainly focus on uniqueness issues, which are settled by means of energy or L^2 estimates, formally satisfied by any weak solution, under suitable bounds on the divergence of the driving coefficients: such an approach can hardly be considered novel, as it was already present in [DL89], for transport equations. However, our main contribution consists in a novel and systematic approach to the estimate of the error terms arising in the approximation procedure, to obtain so-called commutator inequalities: see Section 3.4 for a brief account of the method as well as complete proofs of our crucial results. It turns out that, essentially by means of the same technique, we are able to deal with Sobolev derivations (Lemma 3.4), Sobolev diffusions (Lemma 3.5) as well as with time-dependent elliptic diffusions (Lemma 3.6). Such a technique, which ultimately consists in choosing a Markov semigroup as a smoothing operator and relying on duality arguments as well as an interpolation à la Bakry-Émery, also has the advantage of being completely "Eulerian" and "coordinate free". Let us point out also in this case that it was first developed in [AT14] to deal with an analogous problem for derivations in metric measure spaces.
In conclusion, we state and prove two well-posedness results: Theorem 3.1, for diffusions (1) having possibly degenerate coefficients, assuming first order Sobolev regularity for the drift b and second order Sobolev regularity for the infinitesimal covariance a = σσ* (together with uniform bounds on their divergence); Theorem 3.2, for the bounded elliptic case, i.e. λ|v|^2 ≤ a(v, v) ≤ Λ|v|^2 for every v ∈ R^d, with t ↦ a_t Lipschitz, where (roughly speaking) regularity assumptions can be reduced by one order (i.e., no regularity assumption on b, and first order Sobolev regularity for a). We regard such results as chief examples of the strength and versatility of our techniques for commutator estimates, and we point out that other interesting results could arise in different situations, such as perturbations of elliptic generators which enjoy some ultra- (or hyper-) contractivity features, as well as the case of BV-regular coefficients, which we do not address here.

Definitions and basic facts
Throughout, we use the following notation, for v, w ∈ R d (d ≥ 1) and and the following notation for differential calculus on (0, as well as the notation L d for Lebesgue measure on R d and ∇ * for the distributional adjoint of ∇ (i.e., ∇ * b = − div b on vector fields). We write M (R d ) for the space of signed (real-valued) Borel measures on R d (with finite total variation), M + (R d ) ⊆ M (R d ) for the cone of finite non-negative measures and P(R d ) ⊆ M + (R d ) for the convex set of Borel probability measures on R d . We say that a curve ν = Most of the quantities that we consider below are integrated with respect to the variable t ∈ (0, T ), with respect to L 1 | [0,T ] : when ν = (ν t ) t∈(0,T ) ⊆ M (R d ) is a Borel curve, we write |ν| dt for the Borel measure on (0, is naturally defined and endowed with the Banach norm On the space C([0, T ]; R d ) (naturally endowed with the sup norm and its Borel σ-algebra), we let e t : γ → γ t := γ(t) ∈ R d be the evaluation map at t ∈ [0, T ]. The natural filtration on ) be the space of uniformly bounded (respectively, compactly supported) and continuously differentiable functions, once with respect to t ∈ (0, T ) and twice with respect to x ∈ R d , with uniformly bounded derivatives (as usual, the superscript (1, 2) counts the number of derivatives with respect to (t, x), other superscripts may appear, with natural meaning). We prefer the "abstract" notation A and large parts of the theory can be developed when "test" functions are replaced by other classes (e.g. as developed throughout the monograph [KS98]). We endow A with the norm Notice that, by uniform continuity, any f ∈ A extends to [0, T ] × R d . Throughout, we always let be Borel, where Sym + (R d ) is the space of symmetric, non-negative definite d × d matrices.
We define diffusion operators in R d , measure-valued weak solutions to Fokker-Planck equations and martingale problems on the interval [0, T ]. Most of these notions are classical (for a brief historical account, see e.g. the introduction of [SV06]): for the sake of clarity we provide the definitions and prove some simple facts.
Definition 2.1 (diffusion operator). We let L(= L(a, b)) be the linear differential operator with values in Borel maps on (0, T ) × R d .
We write L t f := (Lf ) t , for t ∈ (0, T ). As usual, the coefficients b, a are referred as the drift of L and the infinitesimal covariance of L. If a = 0, then L reduces to a linear first-order operator, i.e. a derivation, and we say that we are in the deterministic case.
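For orientation, a common convention for such an operator is sketched below. It is consistent with a = σσ* in (1) and with the quadratic variation formula recalled later in this section, but the exact normalization (in particular the factor 1/2) is an assumption here, not a quotation of the definition.

```latex
% Assumed standard form of the diffusion operator (normalization is an assumption)
\[
\mathcal{L} f := b \cdot \nabla f + \tfrac{1}{2}\, a : \nabla^2 f
= \sum_{i=1}^{d} b^{i}\, \partial_i f
+ \frac{1}{2} \sum_{i,j=1}^{d} a^{i,j}\, \partial_{i,j} f .
\]
% Deterministic case a = 0: only the derivation remains
\[
\mathcal{L} f = b \cdot \nabla f .
\]
```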
Given a diffusion operator L, we let the "Eulerian" description of evolution of particles "driven" by L consist of weak solutions of Fokker-Planck (or forward Kolmogorov) equations, in duality with A . Although our main interest lies in solutions to FPE's that are narrowly continuous curves of probability measures, we introduce more general measure valued solutions, as they are useful, e.g. the space of solutions becomes linear.
if it holds and, for every With the notation introduced above, condition (4) can be restated as a, b ∈ L 1 t,x (|ν|). In what follows, we frequently omit to specify the operator L, that we regard as fixed.
Remark 2.3. A density argument akin to [AGS08, Lemma 8.1.2] allows for proving that any solution ν = (ν_t)_{t∈(0,T)} ⊆ P(R^d) to (3) admits a unique narrowly continuous representative ν̃ = (ν̃_t)_{t∈[0,T]}, with ν_t = ν̃_t for L^1-a.e. t ∈ (0, T). Thanks to this fact, we may also say that the solution ν starts from ν_0 (or that ν_0 is the initial law of ν). Moreover, for every f ∈ A , it holds Actually, one first proves that (6) holds for f ∈ A_c and then extends by density to f ∈ A . Since this last step requires the introduction of useful cut-off functions, we sketch it here, for later use. For R ≥ 1, we fix for which we assume that (6) holds. The chain rule entails Letting R → ∞, by dominated convergence, we extend the validity of (6).
and, for every f ∈ A , the process is a martingale with respect to the natural filtration on C([0, T ]; R d ).
Recall the notation η t = (e t ) ♯ η ∈ P(R d ), t ∈ [0, T ], thus η 0 is the initial law of η. As for FPE's, we usually omit to specify L, regarded as fixed. Let us remark that a density argument shows that it makes no difference to require that (8) is a martingale only for f ∈ A c .
For any solution η of the MP, the integrability assumption (7), which is equivalent to a, b ∈ L 1 (η), entails that the process • e s ds is well defined, up to a η-negligible set, as continuous and progressively measurable process. In particular, it belongs to L ∞ loc (η, (F t ) t ), i.e. there exists an increasing sequence of stopping times τ n , η-a.s. converging towards T , such that We prefer throughout not to enlarge the filtration F with η negligible sets. This causes virtually no harm in the exposition, e.g. a martingale M = (M t ) t∈[0,T ] must be understood in the sense that it holds E[M t |F s ] = M s , η-a.s. for every s ≤ t (and the η-negligible set could not belong to F s ).
When a = 0, solutions to the MP reduce to probability measures concentrated on absolutely continuous solutions to the ordinary differential equation Indeed, arguing as in [Fig08, Lemma 3.8], it turns out that the martingale (8) is constant. More generally, the quadratic variation process of (8) is t → t 0 a s (∇f s , ∇f s )ds: this plays a crucial role in estimates for the modulus of continuity of the canonical process, see e.g. Corollary A.4.
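The quadratic variation formula can be checked by a formal Itô computation. The sketch below assumes that X is a (weak) solution of (1) driven by a Brownian motion W, that f is smooth and bounded with bounded derivatives, and that the operator has the form L_s f = b_s · ∇f + (1/2) a_s : ∇²f with a = σσ*; these are assumptions made for illustration.

```latex
% Ito's formula: the process (8), evaluated along X, is a stochastic integral
\[
f(t, X_t) - f(0, X_0) - \int_0^t \big( \partial_s f + \mathcal{L}_s f \big)(s, X_s)\, ds
= \int_0^t \nabla f(s, X_s) \cdot \sigma(s, X_s)\, dW_s .
\]
% Its quadratic variation, using a = sigma sigma^*
\[
\Big\langle \int_0^{\cdot} \nabla f \cdot \sigma \, dW \Big\rangle_t
= \int_0^t \big| \sigma^*(s, X_s)\, \nabla f(s, X_s) \big|^2 ds
= \int_0^t a_s(\nabla f_s, \nabla f_s)(X_s)\, ds .
\]
```

When a = 0 the right-hand side of the first identity vanishes, which matches the statement that the martingale is constant in the deterministic case.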
By integration of (8) with respect to η (i.e., taking expectation) we deduce that any solution η of the MP induces, by means of its 1-marginals (η_t)_{t∈(0,T)}, a narrowly continuous solution of the FPE (3). A converse statement is provided by the following theorem, whose proof is deferred to Appendix A; in the next section, it plays a crucial role in connecting various well-posedness results for FPE's and MP's.
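The passage from the MP to the FPE can be written out in one line. The sketch below assumes that the martingale (8) has the usual form M_t = f_t ∘ e_t − ∫_0^t [(∂_r + L_r) f_r] ∘ e_r dr; this form is an assumption, since it is only referred to by number above.

```latex
% Constant expectation of the martingale, for 0 <= s <= t <= T
\[
\int M_t \, d\eta = \int M_s \, d\eta .
\]
% Rewriting each term via the 1-marginals eta_r = (e_r)_# eta and Fubini's theorem
\[
\int_{\mathbb{R}^d} f_t \, d\eta_t - \int_{\mathbb{R}^d} f_s \, d\eta_s
= \int_s^t \int_{\mathbb{R}^d} \big( \partial_r f_r + \mathcal{L}_r f_r \big)\, d\eta_r \, dr ,
\qquad f \in \mathcal{A} .
\]
```

The second identity is the weak (distributional) formulation of the Fokker-Planck equation tested against f, and Fubini's theorem applies thanks to the integrability assumption (7).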
Theorem 2.5 (superposition principle). Let ν = (ν t ) t∈[0,T ] ⊆ P(R d ) be a narrowly continuous solution of (3). Then, there exists η which is a solution to the MP (associated to the same diffusion operator L) such that, for every t ∈ [0, T ], it holds η t = ν t .
In what follows, we refer to η above as a superposition solution for ν. We refer to this result as the superposition principle for diffusions: the terminology originates in the deterministic literature of ODE's, see [Amb04]: the solution η can be non-trivially distributed among the possibly non-unique solutions to the ODE, thus introducing some "randomness" in an otherwise deterministic setting; these probability measures are nevertheless superpositions of deterministic paths. In the setting of diffusion operators, solutions are already expected to be random, thus the term is justified only by extension, although it would be interesting, at least in some cases, to be able to distinguish between the two "sources of randomness": this would require us to introduce concepts such as strong and weak solutions.
As remarked in the introduction, Theorem 2.5 is a quite general result, only the integrability condition (4) being required, which is in some sense minimal to give meaning to FPE's and MP's (although one may slightly relax it by dealing with local martingale problems). Our result extends [Fig08, Theorem 2.6], where only uniformly bounded coefficients are considered; let us mention that results in a similar spirit (that of L.C. Young's theory of generalized curves) appear quite often in the literature, e.g. Echeverria's theorem [EK86, Theorem 4.9.17] (see [KS98] for extensions) in the framework of martingale problems in spaces of càdlàg paths, or Smirnov's decomposition of 1-currents [Smi93] (see also [PS12] for an alternative approach, valid also in the case of metric currents). Our strategy of proof extends that of [Fig08, Theorem 2.6] and should be regarded as a (non-trivial) counterpart of [AGS08, §8.1 and §8.2] in the setting of multidimensional diffusions: although rather natural, the derivation is not immediate from the available literature (both deterministic and stochastic), due to non-trivial technical points. The major difficulty in our proof is to provide estimates for the modulus of continuity of the canonical process (a problem that would appear also if we wanted to deduce it from Echeverria's theorem).
Next, we investigate some stability properties enjoyed by solutions of MP's and FPE's, with respect to suitable operations: their proofs are straightforward, so we omit them.
Clearly, all the definitions above can be given with respect to any interval [S, T] in place of [0, T] (when it is not mentioned, we always refer to the interval [0, T]): solutions are then well-behaved with respect to the natural restriction map The analogous property for FPE's is obvious: if (ν_t)_{t∈(0,T)} is a solution of (3), its restriction (ν_t)_{t∈(S,T)} is a solution of the FPE on (S, T) × R^d.
Solutions of FPE's and MP's are clearly stable with respect to convex combinations, as a consequence of Fubini's theorem.
Proposition 2.7. Let (Z, A,ν) be a probability space and let (η z ) z∈Z ⊆ P(C[0, T ]; R d ) be a Borel family, such that, forν-a.e. z ∈ Z, η z is a solution of the MP (associated to a fixed diffusion operator L). Moreover, let hold. Then, A → η(A) = η z (A) dν(z) is a solution of the MP (associated to L).
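The role of Fubini's theorem in this statement can be sketched as follows. Here M denotes the process (8) for a fixed f ∈ A, and G is any bounded F_s-measurable function; both names are used only for this illustration, and the integrability needed to exchange the integrals is assumed to be what condition (9) provides.

```latex
% Martingale property under the mixture eta = \int eta_z d\bar\nu(z), for s <= t
\[
\int (M_t - M_s)\, G \, d\eta
= \int_Z \Big( \int (M_t - M_s)\, G \, d\eta_z \Big) d\bar\nu(z)
= 0 ,
\]
% since the inner integral vanishes for \bar\nu-a.e. z, each eta_z solving the MP.
```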
A somewhat converse result, for disintegration with respect to the initial law, is a consequence of stability of martingales under conditional expectations with respect to the σ-algebra F 0 .
Proposition 2.8. Let η be a solution of the MP and let (η(x)) x∈R d be a regular conditional probability for η with respect to e 0 . Then, for η 0 -a.e. x ∈ R d , η(x) is a solution of the MP associated to L, with initial law δ x .
We conclude this section by introducing a suitable notion of flow associated to a diffusion operator, roughly consisting of Borel families of solutions of the MP, for a (large, in some sense) set of initial conditions in R^d. Our aim is to study flows in the DiPerna-Lions sense (as extended by Figalli to MP's); thus, we introduce the concept of "regular flow", where regularity is usually some growth and/or absolute continuity condition on the 1-marginals, providing a selection criterion that yields uniqueness in otherwise ill-posed problems. To study this notion in sufficient generality, we formulate such regularity conditions in terms of some set R := R_{[0,T]} of narrowly continuous (probability curves that are) solutions of (3), which describe the "admissible" class of dynamics. With this notation, we refer to any ν ∈ R as an R-regular solution of (3), and we say that a solution to the MP is R-regular if the curve of its 1-marginals is an R-regular solution of (3). We also let R_0 ⊆ P(R^d) be the set of all initial laws of the solutions belonging to R_{[0,T]}, which we regard as the set of initial distributions of mass that we are allowed to transport.
Definition 2.9 (R-MF). A Borel family (η(x))_{x∈R^d} ⊆ P(C([0, T]; R^d)) is said to be an R-regular martingale flow (R-MF) (associated to L) if the initial law of η(x) is δ_x, for every x ∈ R^d, and, for every ν̄ ∈ R_0, the probability measure ∫ η(x) dν̄(x) is an R-regular solution to the MP (associated to L).
We remark that we are not imposing that, for every x ∈ R d , η(x) is a R-regular solution to the MP associated to L; the requirement is only in average, with respect to every admissible initial densityν ∈ R 0 . Of course, from this condition and Proposition 2.8 we obtain that η(x) is a solution of the MP, forν-a.e. x ∈ R d , for everyν ∈ R 0 . For example, if we let R [0,T ] be the set of all narrowly continuous solutions of (3), then we operate no selection at all, and R-MF's are Borel selections (η(x)) x∈R d of solutions of the MP, with η(x) starting at δ x , for every x ∈ R d . The DiPerna-Lions theory is obtained if we let R be the set of all narrowly We state (without implicitly assuming) some further properties of R-regular solutions of MP's and FPE's that are useful in the next section. The first property is a stability property with respect to pointwise domination: for everyν, ν, narrowly continuous solution of (3) such that, for some C ≥ 0, A useful property is stability with respect to convex combinations, i.e., for anyν ∈ P(Z), if,ν-a.e. z ∈ Z, η z is R-regular and (9) holds, then η z dν(z) is R-regular.
A reasonable converse should be stability with respect to disintegration, but there are several formulations: given any R-regular η, writing (η(x)) x∈R d for a regular conditional probability with respect to e 0 , we may require that for anyν ∈ P(R d ) withν ≤ Cη 0 for some C > 0, then or alternatively that or even that which is a rather strong condition: it formally implies the others whenever (11) holds true. Let us also notice that it does not hold when we deal with the DiPerna-Lions class introduced above, while (12) as well as (13) hold true. Moreover, an application of Theorem 2.5 shows that condition (12) is equivalent to (10). Due to technical reasons, we must introduce a slight extension of all the notions above, taking into account a family (R In particular, for any ν ∈ R [r,T ] , one has ν s ∈ R s . We also accordingly extend the notion of R-MF by considering a family (η(s, x)) s∈[0,T ],x∈R d , where (η(s, x) x∈R d is a R-MF, for every s ∈ [0, T ] (notice that we are not requiring joint measurability of (s, x) → η(s, x)).
Remark 2.10 (Markov property). With the notation introduced above, we can state the Markov property via Chapman-Kolmogorov equations, for a R-MF (η(s, x))_{s∈[0,T], x∈R^d}, We obtain this property as a consequence of uniqueness, arguing e.g. as in [Fig08, Proposition 3.10]. However, let us remark that it could be of independent interest to study regular flows that are also Markov, extending e.g. the approach in [SV06, Chapter 12]. Finally, much less is known about the strong Markov property for DiPerna-Lions flows, i.e., the validity of (16) with stopping times in place of deterministic times; perhaps one has to introduce some notion of "regular" stopping times.
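As a reminder of what a Chapman-Kolmogorov identity looks like for the 1-marginals of a flow, one standard formulation is sketched below. The precise quantifiers on x (for instance, validity only for ν̄-a.e. x with ν̄ ∈ R_s) are not taken from the text and should be read as an assumption.

```latex
% Chapman-Kolmogorov for 1-marginals, in a standard (assumed) form:
% eta(s,x)_t denotes the law at time t of the flow started from x at time s
\[
\eta(s, x)_t = \int_{\mathbb{R}^d} \eta(r, y)_t \; \eta(s, x)_r(dy) ,
\qquad 0 \le s \le r \le t \le T .
\]
```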

Equivalence between FPE's, MP's and flows
The superposition principle provided by Theorem 2.5 allows for establishing a neat correspondence between "Eulerian" and "Lagrangian" descriptions, transferring well-posedness results both ways. Such a connection is firmly established in the deterministic case, see e.g. [AC08,§4], and in the stochastic setting has been investigated e.g. in [Fig08,§2], in case of a DiPerna-Lions theory, and in [EK86,§4], for the classical theory (i.e., not in a selection sense). In this section, we provide a complete equivalence between well-posedness results for R-regular solutions of FPE's and MP's.
FPE's ⇔ MP's. Equivalence between existence results is straightforward: by lifting any solution ν of the FPE, we obtain existence of solutions of the MP, so we focus on uniqueness. A simple result which transfers "uniqueness" is the following one: the non-trivial implication ii) ⇒ i) follows from lifting two different solutions ν^1, ν^2 (see also [Fig08, Theorem 2.3]).
Lemma 2.11. Letν ∈ R 0 . Then, the following conditions are equivalent: A stronger uniqueness result, for processes, can be obtained arguing as in [SV06, Theorem 6.2.3] or [Fig08, Proposition 5.5]. Let us point out that here there appears a small gap with the deterministic literature, since a different argument [AC08, Theorem 9] shows uniqueness for MP's assuming only (10), while we must consider also intermediate s ∈ [0, T ] (since the argument employed therein uses some conditioning which may not preserve the martingale property in general, but it does when the martingale is deterministic). Proof. ii) ⇒ i). As in Lemma 2.11, ν ∈ R [s,T ] with ν s =ν we consider a (R-regular) superposition solution η: the uniqueness assumption entails that its 1-marginals are uniquely identified. i) ⇒ ii). The proof relies (implicitly) on the Markov property. Let s ∈ [0, T ] and η 1 , η 2 be solutions of the MP on [s, T ], with η 1 s = η 2 s . To deduce that η 1 = η 2 , we show that, for every n ≥ 1, the n-marginals of η 1 an η 2 coincide, i.e., for any s ≤ t 1 < . . . < t n ≤ T and We argue by induction on n ≥ 1, the case n = 1 being a consequence of i) ⇒ ii) in Lemma 2.11 and property (15), i.e. we use the fact that (η i t ) t∈[s,T ] for i ∈ {1, 2} are R-regular solutions, with η 1 s = η 2 s . To perform the step from n to n + 1, we argue as follows. For fixed s ≤ t 1 < . . . < t n < t n+1 ≤ T and A 1 , . . . , A n , A n+1 ⊆ R d Borel sets, we let i.e., the density of η 1 conditioned with respect to n i=1 {e t i ∈ A i }. We assume that the denominator above is not null: otherwise there is nothing to prove. Notice also that the inductive assumption gives (e tn ) ♯ (ρη 1 ) = (e tn ) ♯ (ρη 2 ), since it amounts to (17) with A n ∩ B in place of A n , for every B ⊆ R d Borel.
For i ∈ {1, 2}, we let η i ρ be the push-forward of the measure ρη i with respect to the natural restriction from [s, T ] to [t n , T ], and notice that both are R-regular solutions of the MP on [t n , T ], with identical laws at t n , by Lemma 2.6 and (15). By the implication i) ⇒ ii) in Lemma 2.11, we deduce in particular that (η 1 hence we deduce the case n + 1 of (17).
MP's ⇔ flows. In this case, both notions are "Lagrangian", thus there is no need for the superposition principle here: most of the arguments are just consequences of convexity and disintegration of measures.
Although our actual well-posedness results are in the DiPerna-Lions case, where uniqueness is understood up to m-a.e. equivalence, where m is some "reference" σ-finite Borel measure on R d (i.e., m = L d ), for the sake of completeness, we provide a result assuming (14).
Then, it always holds i) ⇒ ii), while ii) ⇒ i) holds true provided that some R-MF exists and both (11) and (14) hold.
Proof. i) ⇒ ii) is straightforward, since regular conditional probabilities are essentially unique (a R-MF is in particular a regular conditional probability of η(x)dν(x) with respect to e 0 ). To show the implication ii) ⇒ i), letν ∈ R 0 and η 1 , η 2 be R-regular solutions of the MP, with initial lawν. By disintegrating with respect to e 0 and (14) we may assume that ν = δx, for somex ∈ R d . Let (η(x)) x∈R d be a R-MF (here we use the existence assumption) and define The result above is rather unsatisfactory in terms of existence of R-MF's, which seems a delicate problem, in general. For example, existence may follow if one assumes the validity of assumption i), (11), (14) and that R 0 is a Borel of probability measures. Then, for every x ∈ R d such that δ x ∈ R 0 , there exists a unique R-regular solution of the MP, and by suitable definition for x not in such a set, we obtain a R-MF (which is then unique in the sense above). An easier existence result follows if we assume the domination condition as in the DiPerna-Lions case. We also assume that m is minimal in the sense that, for every Proposition 2.14. Let (18) hold, and consider the following conditions: i) for everyν ∈ R 0 , there exists a unique R-regular solution ην to the MP, with initial law ν, and the mapν → ην is Borel; If (11) and (12) holds, then i) ⇒ ii). If (11) and (13) are satisfied, then ii) ⇒ i).
Proof. i) ⇒ ii). We have only to settle existence of some R-MF, as uniqueness is trivial. For any probabilityν = um ∈ R 0 we consider the unique R-regular solution of the MP η u with initial lawν and a regular conditional probability with respect to e 0 , (η u (x)) x∈R d . Then, for any vm ∈ R 0 , it holds Indeed, it is sufficient to show that, for every ε > 0 and every ρm probability density, concentrated on {u > ε, v > ε}, with ρ uniformly bounded, it holds This, in turn, follows from uniqueness and (12): both members above are R-regular solutions to the MP, with initial law ρm ≤ ε −1 Cum. Next, we notice that there must exists some um ∈ R 0 equivalent to m, i.e., such that u > 0 m-a.e. in R d , since m is equivalent to the supremum of all the measures in R 0 (appropriately rescaled). Then, we define η(x) := η u (x), for x ∈ R d . To conclude that η(x) is a R-MF, we use (19): given any probability vm ∈ R 0 , it holds To prove ii) ⇒ i), existence of R-regular solutions to the MP, given the existence of a R-MF is trivial, so we focus on uniqueness. We letη, be a R-regular solution of the MP with some initial law and show that it must coincide with the one induced by the (unique) R-MF (η(x)) x∈R d , i.e.,η = η(x)dη 0 (x). To this aim, we let um ∈ R 0 be a probability measure equivalent to m, and consider the measure which is a R-regular solution to the M P by (11), whose initial law is again equivalent to m. By disintegration with respect to e 0 , we obtain a Borel family of probability measures (η(x)) x∈R d , which, by (13), provides a R-MF and so by uniqueness it coincides with η(x), for m-a.e. x ∈ R d , yielding from which we conclude.
We end this section with some remarks on standard consequences of uniqueness: the Markov property and stability with respect to approximation.
As in Remark 2.10, we consider R-regular flows with respect to a family (R [s,T ] ) s∈[0,T ] such that (15) holds.
Proposition 2.15 (Markov property). Assume that uniqueness holds for R-regular MP's, in the sense that, for every s ∈ [0, T], ν̄ ∈ R_s, there exists a unique R-regular solution to the MP on [s, T], with initial law ν̄. Then, for every The proof is straightforward from the following identity between measures on C([s, T]; R^d): which, in turn, holds true because both terms define R-regular solutions of the MP on [s, T], with initial law ∫ η(r, x)_s ν̄(dx): this is obvious for the right hand side, while for the left hand side it is a consequence of Proposition 2.6 and condition (15). Another well understood, but rather technical, property that sometimes follows from existence and uniqueness is a non-quantitative version of stability with respect to approximations, which in this setting would read as follows.
Proposition 2.16 (stability). For n ≥ 1, let a n , b n be Borel maps as in (2), let L n := L(a n , b n ) and let η n solve the MP associated to L n . If i) there exists a unique R-regular solution η of the MP associated to L = L(a, b) with η 0 =ν, ii) it holds η n 0 →ν narrowly, a n → a and b n → b pointwise as n → ∞, iii) for some convex, l.s.c functions Θ 1 , Θ 2 as in Theorem A.2 it holds lim sup Notice that we do not require that η n are R-regular: in general it does not even make sense, since R is a class of solutions to the FPE associated to L, not to L n . A proof of the result above would not be difficult, but it would require us to combine some technical results, such as those established in Section A.2 and [AC14, Lemma 23] to establish that (η n ) is a tight sequence and any limit point provides a R-regular solution to the MP associated to L; the conclusion is then straightforward from uniqueness. If (18) holds, then m-a.e. convergence in place of pointwise convergence of the coefficients is sufficient, if we also restrict to solutions η n whose marginals are absolutely continuous (as done, e.g. in [Fig08,Theorem 3.7]).

Well-posedness results
In this section, we state and prove two results (Theorem 3.1 and Theorem 3.2) about existence and uniqueness for solutions of the FPE (3), belonging to suitable classes of probability measures. In particular, as we are interested in the DiPerna-Lions theory, we deal with solutions that are absolutely continuous with respect to the d-dimensional Lebesgue measure, µ t = u t L d , satisfying some bounds on their density u : [0, T ] × R d → R. Besides such integrability conditions on the solution, we require (Sobolev) regularity assumptions on the coefficients a, b.
In Section 3.1, we give a formal derivation of the energy estimates which eventually lead to well-posedness for FPE's; in Section 3.2, we introduce the notation for Sobolev spaces and related basic facts; in Section 3.3 we state our results and compare them with some (related) existing literature; the technical heart of the matter is developed in Section 3.4, where crucial commutator inequalities are proved; in Section 3.5 we give a proof of our main results.

Energy estimates and renormalized solutions
As in the classical DiPerna-Lions theory (as well as in [Fig08]), we rely on energy inequalities satisfied by an absolutely continuous solution u = (u t ) t∈[0,T ] of (3) (i.e. µ t = u t L d ). Let us briefly sketch a formal derivation, where we assume all the quantities involved (solutions and coefficients) to be smooth.
The main idea is to write the equation satisfied by t → β(u t (x))dx, where β : R → R is a smooth function (from a Lagrangian viewpoint, this amounts to choosing, as a test function, an expression involving the density of the solution u itself). The chain rule gives ∂ t β(u) = β ′ (u)L * (u) and, by linearity, we consider separately the drift and diffusion terms. Straightforward calculus gives For the diffusion part, we first notice the identity (a : where div L := −∇ * b − (∇ * ) 2 a/2. By integrating over R d , (formally L1 = 0), we deduce which is the key inequality we employ to show existence as well as uniqueness, for suitable choices of β. For example, letting β(z) = |z| + , we deduce that if u 0 ≥ 0, then u t ≥ 0 for t ∈ [0, T ] (thus, for simplicity, we assume that u t ≥ 0 in what follows). In particular, to deduce uniqueness for solutions in L ∞ t (L r x ) (for some r > 1), we show that the difference between any two solutions u, v satisfies (21), with β(z) = |z| r , and Gronwall's lemma entails a bound, uniform with respect to t ∈ (0, T ), for the L r x norm of u t − v t . Let us also notice that, with the choice β(z) = |z| 2 , keeping track of the non-negative terms dropped above, we would obtain a bound for the "Sobolev energy" R d a t (∇u t , ∇u t )dL d dt and, for β(z) = |z| r , with r > 2, of the energy , we may still deduce some bound with respect to the energy z → |z| 2 , and again Gronwall's inequality leads to a bound for L 2 x , uniform in t ∈ [0, T ].
Similarly, if r > 2, we use the inequality 2ab ≤ a 2 + b 2 thus, for every ε > 0, the term r u r−1 t b t · ∇u t dL d (assume for simplicity that u is non-negative) is estimated with and letting ε = λ, we may conclude again by a Gronwall argument that Let us finally remark that if we integrate (20) against some function f ∈ A, with f ≥ 0 (instead of f = 1), we would deduce The inequality above is so useful that weak solutions u of the FPE, which also satisfy (23) for every f ∈ A, f ≥ 0, for (many) smooth convex functions β, are called renormalized solutions [Fig08,Definition 4.9]. There are abstract results connecting well-posedness for FPE's and the fact that every weak solution is renormalized, e.g. [Fig08,Lemma 4.10] (but see also [BC06] for a somewhat converse result, in the deterministic framework); here, for brevity, we limit ourselves to a direct proof of uniqueness of FPE's from the validity of (20), e.g. with the special choice β(z) = |z| r .
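The formal computation sketched in this section can be written out as follows. This is only a sketch under stated assumptions, not a quotation of (20) and (21): we assume that a, b and u are smooth and decay at infinity, that L f = (1/2) a : ∇ 2 f + b · ∇f , and that ∇ * denotes the formal adjoint of ∇ (so that ∇ * b = −div b); the exponent r and the convex function β are as in the text.

```latex
% Formal renormalization identity (assumptions: smooth a, b, u;
% L f = (1/2) a : \nabla^2 f + b \cdot \nabla f; \nabla^* b = -\operatorname{div} b).
\begin{align*}
  L^* u &= \tfrac12 \sum_{i,j} \partial^2_{i,j}\bigl(a^{i,j} u\bigr) - \operatorname{div}(b\,u),
  \qquad
  \operatorname{div} L = \operatorname{div} b - \tfrac12 \sum_{i,j} \partial^2_{i,j} a^{i,j} = -L^* 1, \\
  \partial_t \beta(u) = \beta'(u)\, L^* u
  &= L^*\bigl(\beta(u)\bigr)
     - \bigl[\beta'(u)\,u - \beta(u)\bigr] \operatorname{div} L
     - \tfrac12\, \beta''(u)\, a(\nabla u, \nabla u).
\end{align*}
% Integrate over R^d: the term L^*(\beta(u)) integrates to zero (formally L1 = 0).
% For convex \beta with \beta'(z) z - \beta(z) \ge 0 and a \ge 0, drop the last term:
\begin{equation*}
  \frac{d}{dt} \int_{\mathbb{R}^d} \beta(u_t)\, dx
  \le \int_{\mathbb{R}^d} \bigl[\beta'(u_t)\,u_t - \beta(u_t)\bigr] (\operatorname{div} L_t)^-\, dx .
\end{equation*}
% Special choice \beta(z) = |z|^r, for which \beta'(z) z - \beta(z) = (r-1)|z|^r:
\begin{equation*}
  \frac{d}{dt} \|u_t\|_{L^r_x}^r
  \le (r-1)\, \bigl\|(\operatorname{div} L_t)^-\bigr\|_{L^\infty_x} \|u_t\|_{L^r_x}^r
  \quad\Longrightarrow\quad
  \|u_t\|_{L^r_x}^r \le \|u_0\|_{L^r_x}^r
  \exp\Bigl( (r-1) \int_0^t \bigl\|(\operatorname{div} L_s)^-\bigr\|_{L^\infty_x}\, ds \Bigr).
\end{equation*}
```

In particular, under these assumptions the bound depends on the coefficients only through the negative part of div L, and it forces u t = 0 for every t whenever u 0 = 0, which is the mechanism behind uniqueness.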

Sobolev spaces
Before we state and prove our main results, we briefly introduce Sobolev spaces associated to the operators ∂ t and L, together with some useful facts; we use throughout a compact notation extending that in Section 2.1. For x . A standard mollification argument, with respect to the variable t ∈ (0, 1), gives that A is dense in W 1,p t (L q x ), for p, q < ∞ (for a proof of this and the following results, we refer e.g. to [Sho97, §III.1]). In particular, the chain rule for ∂ t extends to W 1,p t (L q x ), thus Another straightforward consequence of the density of A is the fact that any u ∈ W 1,p t (L q x ) enjoys an absolutely continuous representative, i.e. there exists someũ ∈ AC p ([0, T ]; L q (L d )) such thatũ t = u t , for L 1 -a.e. t ∈ (0, T ). In particular the map: T 0 (u) :=ũ 0 (trace at 0) is linear and continuous from W 1,p t (L q x ) to L q x . Moreover, t →ũ t is strongly differentiable at L 1 -a.e. t ∈ (0, T ) and it holds d dtũ = ∂ t u. We associate to the diffusion operator L some "Sobolev spaces". An important role in our deductions is played by D p (L) (for p ∈ [1, ∞)), defined as the abstract completion of A with respect to the norm f D p (L) := f L 1 x , which is well defined whenever a, b ∈ L 1 t L p x (actually, a more consistent notation for D p (L) would be L 1 t (D p (L t ))). Let us remark however that, without further regularity assumptions, the extended operator D p (L) ∋ f → Lf ∈ L 1 t (L p x ) may be multi-valued, but the assumptions on L that we impose in our results entail that the extension is single-valued.
A useful fact is the following: if f ∈ W 1,1 t (L p x ) ∩ D p (L), then one can provide a sequence (f n ) n≥1 ⊆ A converging towards f both in W 1,1 t (L p x ) and D p (L). Indeed, it is sufficient to consider first a sequence (g n ) n≥1 ⊆ A converging towards f in D p (L), let ρ be a smooth probability density on R, and consider the approximation g n,m := g n * ρ m (where we let ρ m (t) = m −1 ρ(t/m), t ∈ R, and we carefully extend g n to a continuous function outside the set [0, T ] × R d ). For (n, m) → ∞, the sequence g n,m converges towards f in D p (L), because g → g * ρ m is a contraction in D p (L), as convolution with respect to t and the operator L commute; for fixed m ≥ 1, the sequence g n,m converges towards f * ρ m , because g → g * ρ m is continuous from L 1 t (L p x ) into W 1,1 t (L p x ), with norm smaller than ρ m ∞ . Moreover, as m → ∞, f * ρ m converges towards f in W 1,1 t (L p x ), since f ∈ W 1,1 t (L p x ) (this is exactly the standard mollification argument providing density of A in W 1,1 t (L p x )). By a diagonal argument, we finally extract a sequence (f n ) n≥1 as required. As a consequence, if u ∈ L ∞ t L r x (r > 1) is a narrowly continuous solution of (3), with a, b ∈ L 1 t L p x , then the weak formulation (5) extends to f ∈ W 1,1 where by f T ∈ L r ′ x and f 0 ∈ L r ′ x we mean the continuous representative of f evaluated at T and 0.
Similarly, we introduce the space D p (L, a∇ ⊗ ∇) as the abstract completion of A with respect to the norm |f | + |Lf | + a(∇f, ∇f ) L 1 t L p x . Clearly, this is a smaller space than D p (L), but it is useful because the following chain rule holds, for f ∈ D p (L, a∇ ⊗ ∇), and γ ∈ C 2 (R), with γ ′ and γ ′′ uniformly bounded: As in the previous case, it might be that f → L(f ) and f → a(∇f, ∇f ) are not single-valued, but the identity above holds true with the natural interpretation (and in our results we introduce assumptions ensuring that these are well-defined functions).
We always consider the divergence div L to be defined in the sense of distributions, i.e., as the linear operator , the inequality ≤ in place of equality above holds, for every f ∈ A, with f ≥ 0. If div L − ∈ L 1 t (L p x ), then we can prove the following inequality, for u ∈ D 1,p (L, a∇ ⊗ ∇), and β ∈ C 3 (R) convex, with bounded derivatives as well as β ′ (z)z − β(z) bounded: Indeed, let ρ be a smooth convolution kernel on R d and consider the diffusion operator L m with smooth coefficients a * ρ m and b * ρ m (where we let ρ m (x) = m −d ρ(x/m)). If we also assume u ∈ A, then the inequality above holds true by the derivation in Section 3.1 above. The general case follows by approximation, letting first m → ∞ and then choosing u n ∈ A converging towards u in D p (L, a∇ ⊗ ∇).
Besides these spaces associated with L, let us recall some features of standard Sobolev spaces and the smoothing properties of the standard heat semigroup (P α ) α≥0 on R d . For p ∈ [1, ∞], we consider spaces endowed with the usual norms. Crucial for our deductions are quantitative inequalities for the smoothing effect of the heat semigroup (P α ) α≥0 , which can be deduced by straightforward computations from the heat kernel in R d . Of course, P α is a contraction semigroup in W 1,p x as well as W 2,p x ; moreover, integration by parts and Hölder inequality give with c depending on p ∈ [1, ∞] only (possibly also on the dimension d). Such inequalities, called L p − Γ in [AT14], play a fundamental role for our approach to continuity equations in metric measure spaces: their validity in abstract setups as well as in Riemannian manifolds follows e.g. from uniform lower bounds on the Ricci curvature.
Arguing similarly, it holds for p ∈ [1, ∞], i, j ∈ {1, . . . d}, Let us also notice that, as α ↓ 0, the left hand sides of the two inequalities above are infinitesimal, since a standard density and uniform boundedness argument applies.
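As an illustration of the "straightforward computations from the heat kernel" mentioned above, the following sketch derives first and second order smoothing bounds of the expected type. It assumes the normalization P α = e α∆ , with Gaussian kernel G α ; the explicit constants are ours and are not claimed to coincide with those in (26) and (27).

```latex
% Assumption: P_\alpha f = G_\alpha * f with
% G_\alpha(x) = (4\pi\alpha)^{-d/2} \exp(-|x|^2/(4\alpha)), i.e. generator \Delta.
\begin{align*}
  \nabla G_\alpha(x) &= -\frac{x}{2\alpha}\, G_\alpha(x),
  \qquad
  \int_{\mathbb{R}^d} |x|^2 G_\alpha(x)\, dx = 2\alpha d, \\
  \|\nabla G_\alpha\|_{L^1_x}
  &= \frac{1}{2\alpha} \int_{\mathbb{R}^d} |x|\, G_\alpha(x)\, dx
  \le \frac{1}{2\alpha} \Bigl( \int_{\mathbb{R}^d} |x|^2 G_\alpha(x)\, dx \Bigr)^{1/2}
  = \sqrt{\frac{d}{2\alpha}} .
\end{align*}
% Young's convolution inequality then gives, for every p \in [1, \infty]:
\begin{equation*}
  \sqrt{\alpha}\, \bigl\| \nabla P_\alpha f \bigr\|_{L^p_x}
  \le \sqrt{d/2}\; \|f\|_{L^p_x} .
\end{equation*}
% Second order: split P_\alpha = P_{\alpha/2} P_{\alpha/2} and apply the bound twice.
\begin{equation*}
  \partial^2_{i,j} P_\alpha f
  = \partial_i P_{\alpha/2} \bigl( \partial_j P_{\alpha/2} f \bigr)
  \quad\Longrightarrow\quad
  \alpha\, \bigl\| \partial^2_{i,j} P_\alpha f \bigr\|_{L^p_x}
  \le d\, \|f\|_{L^p_x} .
\end{equation*}
```

For f ∈ W 1,p x one also has √α ∇P α f = √α P α ∇f , whose norm is at most √α times the norm of ∇f ; together with the uniform bound above, this is the density and uniform boundedness argument showing that the left hand side is infinitesimal as α ↓ 0 (for p < ∞).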
Finally, another property that we occasionally use below is that, for p ∈ (1, ∞), one has W 2,p x := {f ∈ L p x : ∆f ∈ L p x }, because of the L p x -boundedness of the second order Riesz transform f → ∇ 2 ∆ −1 f , see e.g. [GT01].
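In the special case p = 2, the boundedness of the second order Riesz transform can be checked directly on the Fourier side; the short computation below is only meant as an illustration (the Fourier normalization is an assumption, and the general case p ∈ (1, ∞) requires Calderón-Zygmund theory, which is not reproduced here).

```latex
% Assumption: \hat f(\xi) = \int f(x) e^{-i x \cdot \xi} dx, so that
% \widehat{\partial^2_{i,j} f}(\xi) = -\xi_i \xi_j \hat f(\xi) and
% \widehat{\Delta f}(\xi) = -|\xi|^2 \hat f(\xi).
\begin{equation*}
  \widehat{\partial^2_{i,j} f}(\xi)
  = \frac{\xi_i \xi_j}{|\xi|^2}\, \widehat{\Delta f}(\xi),
  \qquad
  \Bigl| \frac{\xi_i \xi_j}{|\xi|^2} \Bigr| \le 1
  \quad\Longrightarrow\quad
  \bigl\| \partial^2_{i,j} f \bigr\|_{L^2_x} \le \|\Delta f\|_{L^2_x}
\end{equation*}
% by Plancherel's identity.
```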

Well-posedness: statement of results
We are in a position to state our main well-posedness results, which we split in two theorems: the first one deals with possibly degenerate diffusions, with Sobolev regular coefficients.
Theorem 3.1 (degenerate case). Let p ∈ (1, ∞], r ≥ 2p/(p − 1), and a, b be as in (2), with Then, for every probability density ū ∈ L r x , there exists a unique narrowly continuous solution Actually, the technique employed provides (existence and) uniqueness even without the assumption that ū is a probability density. As a straightforward consequence of the result above and the equivalence established in the previous section, if we let R be the class of narrowly continuous solutions u of the FPE (3), with u ∈ L ∞ t (L r x ), we deduce existence and uniqueness for R-regular martingale problems as well as for R-regular martingale flows. The unique regular flow satisfies the Chapman-Kolmogorov equations (16).
Our second statement deals with non-degenerate (elliptic) diffusions, i.e. if it holds, for some λ > 0, a(v, v) ≥ λ |v| 2 , for every v ∈ R d , a.e. in (0, T ) × R d . In such a case, we can remove one order of Sobolev regularity assumption from both coefficients, but we introduce Lipschitz regularity for t → a t .
Also in this case, as a straightforward consequence of the equivalence between Eulerian and Lagrangian descriptions, we obtain well-posedness for R-regular martingale problems as well as R-regular flows, with R as in the previous case.
Remark 3.3 (comparison with existing literature). The literature on the subject of Fokker-Planck equations and martingale problems is so vast and growing that we limit ourselves to a direct comparison only with very closely related and recent works. In particular, we stress some aspects which are different from the results appearing in [Fig08], [LBL08].
In [LBL08], the approach is mostly Eulerian, dealing with FPE's in divergence form The main result in [LBL08] provides existence and uniqueness for the equation above, provided that To compare these assumptions, we must notice as in [LBL08, §5.1] that with our notation a = σσ * and the drift is actually b − 1 2 ∇ * a. In view of this correspondence, it might seem that Theorem 3.1 follows from their weaker assumptions: this follows in principle from a result of the type σ := a 1/2 ∈ L 2 t (W 1,2 loc ), if a ∈ L 1 t (W 2,2 loc ), extending the well-known result [SV06, Lemma 3.2.3] that a 1/2 is Lipschitz whenever a ∈ C 2 . However, their conclusions are in fact weaker, and actually insufficient in order to obtain corresponding Lagrangian results: they prove existence and uniqueness in the class of narrowly continuous probability densities u ∈ L ∞ t (L ∞ x ) such that σ∇u ∈ L 2 t (L 2 x ): the latter (weak) regularity condition then prevents a straightforward application of the results in Section 2.2. In conclusion, our result has (apparently) stronger regularity conditions on the coefficients, but draws stronger conclusions and leads directly to well-posedness of regular martingale problems and flows.
The problem arising from the condition σ∇u ∈ L 2 t (L 2 x ), which prevents a Lagrangian theory, is well understood in [Fig08], where much effort is put in showing, for the bounded elliptic case, uniqueness in the class of narrowly continuous probability densities u ∈ L 2 t (L 2 x ) [Fig08, Theorem 4.3]. When compared with the assumptions of Theorem 3.2, an evident difference is that we require a first order condition a ∈ L 1 t (W 1,p x ), while no such requirement appears in [Fig08, Theorem 1.3], besides (with our notation) ∇ * a, div L − ∈ L ∞ t (L ∞ x ). The technique we employ -approximation by the semigroup associated to the Dirichlet form f → a(∇f, ∇f ) -is the same as Figalli's, and in the elliptic case the novelty is more conceptual, providing a much cleaner derivation of commutator estimates, essentially by the same abstract arguments in the elliptic and the degenerate case. However, in the possibly degenerate case, our results are stronger, compare e.g. with [Fig08, Theorem 1.4], as we allow for much more general diffusion coefficients, and possibly unbounded terms -obtaining as well Lagrangian counterparts.
In more recent years, further developments along these research lines appeared in the literature, as well as different techniques (e.g. Crippa-DeLellis' technique [CDL08] was extended to SDE's in [Zha10,RZ10]): of course, novelties and improvements appear in these developments, but to the author's knowledge they address different aspects (such as strong solutions, equations with jumps, quantitative estimates, etc.) and there is no substantial overlap with our two results above.
We also point out that the theory of measure-valued solutions (i.e., not necessarily absolutely continuous) of Fokker-Planck equations, at least in the elliptic case, is well-developed and some results may be compared with ours. For example, [BDPRS07, Proposition 3.1] entails uniqueness if, for some p ≥ d + 2, a ∈ L ∞ t H 1,p x is elliptic, b ∈ L p t L p x and t → a t is Hölder continuous (locally uniformly in x). It is immediate to see that there is no inclusion between such a class of coefficients and that of Theorem 3.2, and in particular the hypotheses of our result are dimension-free (indeed, we are specializing a theory tailored for infinite dimensional spaces). However, the uniqueness class is smaller in our case, since we restrict from the very beginning to absolutely continuous solutions, which is nevertheless sufficient to entail a reasonable Lagrangian theory. Let us point out some recent developments [BDPR08, BRS11, BRS13] and in particular [BRS15] which also contains a survey of known results and methods for the degenerate case. Finally, we point out the monograph in preparation [BRKS15], which contains a detailed study and a vast bibliography on the subject.
Let us briefly discuss some features of the two theorems above and their proof. First, existence of weak solutions in the hypothesis stated above is a much easier task than uniqueness: for example, one can argue by approximation via convolution of the coefficients (and the initial law) with a smooth kernel, so that the estimates on the coefficients are preserved, and one gains enough regularity (e.g. C 2 coefficients) so existence is available even at the Lagrangian level. Then, we have enough regularity so that the deductions which lead to inequality (20) apply and by a Gronwall argument we deduce a bound in L ∞ t (L r x ), in terms of div L − only, and uniform in the approximation (in the elliptic case, we argue with (22) instead). By extracting a weakly convergent sequence and by strong convergence of the approximations of the coefficients, we deduce that any weak limit point in L ∞ t (L r x ) is a weak solution to the FPE (3). In the elliptic case, we deduce as well existence for a solution u ∈ L ∞ t (L r x ) ∩ L 2 t (W 1,2 x ). Let us also recall the approach [Fig08, Theorem 4.3], which is completely Eulerian (i.e., it relies on PDE's techniques only), and has the advantage of yielding easily uniqueness, for solutions belonging to such a (smaller) space, which does not allow for applications of the theory developed in Section 2.2.
In order to establish uniqueness of solutions, our aim is to rigorously establish (23) and (22), for (difference of) solutions u ∈ L ∞ t (L r x ). As already remarked, the main problem is related to the regularity of u, in order to employ the standard calculus rules. Our strategy relies on the well-known smoothing scheme, which dates back at least to [DL89]: for α ∈ (0, 1) we introduce some linear operator P α , acting on functions defined on (0, T ) × R d such that, by defining u α := P α u, we obtain an approximation of u sufficiently regular to rigorously obtain (23). Of course, the price that we pay is that u α , in general, is not a solution of (3) and one has to carefully estimate the "error terms" thus appearing: our novel contribution indeed provides a systematic approach to such inequalities.
To be more precise, in the cases that we consider, the operators (P α ) α≥0 form a strongly continuous Markov symmetric semigroup on L 2 ((0, T ) × R d , L d+1 ), so that, in particular, P α preserves all L p t L q x spaces, for p, q ∈ [1, ∞]. If we also prove that P α maps W 1,1 t (L r ′ ) ∩ D r ′ (L) into itself, we may write, for f belonging to such space, (28) since the weak formulation (24) extends by density of A in W 1,1 t (L r ′ ) ∩ D r ′ (L). The commutator term appears as an algebraic way to highlight the identity as an equation for u α , and the whole issue is to show that it is infinitesimal as α ↓ 0.
Next, we prove that P α has a "smoothing effect", in a sense that we can choose β ′ (u α ) as a test function, and apply the chain rule with respect to ∂ t and (25), so L 1 -a.e. t ∈ (0, T ) and in the sense of distributions on (0, T ). Finally, we let α ↓ 0, and by strong convergence of u α towards u in L 1 t (L r x ), we are able to conclude, provided

Commutator inequalities
In this section, we estimate the "error terms" involving the commutator between P α and ∂ t + L. Our general strategy is a further development of that first introduced in [AT14], in the framework of continuity equations in metric measure spaces, and it is completely "coordinate free" and depends on an interpolation argument à la Bakry-Émery, namely where we let ∆ be the generator of (P α ) α≥0 . It turns out that the commutator between ∆ and ∂ t + L, reflecting the "relative regularity" between the chosen approximation and the target diffusion, depends upon natural quantities such as Sobolev regularity of the coefficients. In principle, this method provides very general results but, for the ease of exposition, we address separately only three cases, which are of particular interest: the case of a commutator between the Euclidean heat semigroup and a Sobolev derivation, which is a specialization of [AT14, Lemma 5.8] to the Euclidean case; that of a commutator between the Euclidean heat semigroup and a second-order Sobolev diffusion, which is apparently novel, and which we settle by performing a "second order" interpolation argument; and finally that of the commutator between ∂ t and a non-degenerate diffusion acting only on the variable x ∈ R d , with t → a t Lipschitz, which provides an alternative approach to Step 2 in [Fig08,Theorem 4.3].
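For orientation, the interpolation argument can be sketched as follows at fixed t ∈ (0, T ), for smooth u and f . The curve F and the sign convention for the commutator are our assumptions (chosen to match the notation F (s), u s , f α−s used in the proofs below), and we only use that P α and its generator ∆ are symmetric with respect to L d .

```latex
% Assumptions: u_s := P_s u, f_{\alpha-s} := P_{\alpha-s} f,
% [P_\alpha, L] := P_\alpha L - L P_\alpha, [\Delta, L] := \Delta L - L \Delta.
\begin{align*}
  F(s) &:= \int_{\mathbb{R}^d} u_s \, L f_{\alpha-s}\, dx, \qquad s \in [0, \alpha], \\
  F(\alpha) - F(0)
  &= \int_{\mathbb{R}^d} (P_\alpha u)\, L f\, dx - \int_{\mathbb{R}^d} u\, L P_\alpha f\, dx
  = \int_{\mathbb{R}^d} u\, [P_\alpha, L] f\, dx, \\
  F'(s)
  &= \int_{\mathbb{R}^d} (\Delta u_s)\, L f_{\alpha-s}\, dx
   - \int_{\mathbb{R}^d} u_s\, L \Delta f_{\alpha-s}\, dx
  = \int_{\mathbb{R}^d} u_s\, [\Delta, L] f_{\alpha-s}\, dx, \\
  \int_{\mathbb{R}^d} u\, [P_\alpha, L] f\, dx
  &= \int_0^\alpha \int_{\mathbb{R}^d} u_s\, [\Delta, L] f_{\alpha-s}\, dx\, ds .
\end{align*}
% Example: for a first order operator L = b \cdot \nabla with smooth b,
\begin{equation*}
  [\Delta, b \cdot \nabla] f
  = 2\, \nabla b : \nabla^2 f + (\Delta b) \cdot \nabla f ,
\end{equation*}
% so the infinitesimal commutator involves derivatives of b, but not b itself.
```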
where c ∈ R is some constant (depending on the dimension d only).
Actually, the proof below shows that ∇b can be replaced with the symmetric part of the derivative (also called deformation) D sym b := (∇b + (∇b) τ )/2, where τ denotes the transpose operator.
As a consequence of (29), the commutator operator L ∞ extends to a linear continuous operator on L ∞ t L s x . Moreover, a standard density and uniform boundedness argument entails that, for f ∈ L ∞ t L s x , Proof. It is sufficient to argue assuming that b, u and f are sufficiently smooth, e.g., u ∈ A c , f ∈ A, as well as b i ∈ A, for i ∈ {1, . . . , d}, as the general inequality will follow by approximation (e.g. by convolution with a smooth kernel). Moreover, we argue at t ∈ (0, T ) fixed and then integrate over the interval (0, T ): thus we omit to specify t ∈ (0, T ) in what follows.
By straightforward integration by parts, we obtain the following alternative expression for the right hand side above: If ∇ * b = 0, the conclusion is immediate, since we may estimate |F (α) − F (0)| ≤ α 0 d ds F (s) ds and, by Hölder inequality, by (26) and using 1 0 (s(1 − s)) −1/2 ds = π. The general case ∇ * b ∈ L q x is slightly more involved: let us first notice that the term (∇ * b)∇u s · ∇f α−s can be estimated as above, adding a contribution π ∇ * b L q x to the inequality. Finally, to estimate the contribution of (∇ * b)u s ∆f α−s we do not put the absolute value inside integration with respect to s ∈ (0, α), but exchange integration with respect to x and s, exploiting the identity Next, to integrate by parts only "half of the derivative" with respect to s, we simply add (∇ * b)u α times the quantity which, once integrated with respect to x ∈ R d , by Hölder inequality and (27) is bounded from above by This settles an analogue of (29) at fixed t ∈ (0, T ), and by integration with respect to t ∈ (0, T ), we obtain (29).
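The role of the integral of (s(1 − s)) −1/2 in the proof above can be made explicit by the following elementary computation, which is only a sketch: the generic constant c stands for the one in the first order smoothing inequality (26), and the exponents r, s, q are assumed to be Hölder conjugate, as in the statement of the lemma.

```latex
% Substitution s = \sin^2\theta, ds = 2 \sin\theta \cos\theta\, d\theta:
\begin{equation*}
  \int_0^1 \frac{ds}{\sqrt{s(1-s)}}
  = \int_0^{\pi/2} \frac{2 \sin\theta \cos\theta}{\sin\theta \cos\theta}\, d\theta
  = \pi,
  \qquad
  \int_0^\alpha \frac{ds}{\sqrt{s(\alpha - s)}} = \pi
  \quad \text{for every } \alpha > 0
\end{equation*}
% (the second identity follows from the first by the scaling s \mapsto \alpha s).
% Hence a term of the form \int_0^\alpha \int |\nabla b|\,|\nabla u_s|\,|\nabla f_{\alpha-s}|\, dx\, ds
% is bounded uniformly in \alpha: by H\"older inequality and the smoothing bounds
% \|\nabla u_s\|_{L^r_x} \le c\, s^{-1/2} \|u\|_{L^r_x},
% \|\nabla f_{\alpha-s}\|_{L^s_x} \le c\, (\alpha-s)^{-1/2} \|f\|_{L^s_x},
\begin{equation*}
  \int_0^\alpha \int_{\mathbb{R}^d} |\nabla b|\, |\nabla u_s|\, |\nabla f_{\alpha-s}|\, dx\, ds
  \le c^2 \pi\, \|\nabla b\|_{L^q_x}\, \|u\|_{L^r_x}\, \|f\|_{L^s_x} .
\end{equation*}
```

The singularities at s = 0 and s = α are integrable precisely because each factor loses only "half a derivative"; this is why the derivatives are split evenly between u s and f α−s .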
The constant c can even be independent of the dimension d of the underlying space, provided that we assume some bound directly on ∇ * b L q x , and use a refined, dimension independent estimate for ∆f α−s L s x : these are the key observations that lead to well-posedness on possibly infinite dimensional spaces, as developed in [AT14].
Proof. To establish (30), the underlying idea is to formally rewrite a : ∇ 2 f = a : (∇ 2 ∆ −1 )∆f and exploit the boundedness of the Riesz transform ∇ 2 ∆ −1 in L s x , together with a second order interpolation along the heat semigroup. To make computations more transparent, we argue on coordinates, i.e., we fix i, j ∈ {1, . . . , d} and consider the commutator As in the proof of the previous lemma, we may also let u ∈ A c , f and a i,j be sufficiently regular, e.g. f , a i,j ∈ C 4 b ((0, T ) × R d ), and argue at fixed t ∈ (0, T ). We consider the curve since the Laplacian and partial derivatives commute. We write h α−s := ∂ 2 i,j f α−s = (∂ 2 i,j f ) α−s (since derivatives and heat semigroup commute), let b := ∇a i,j and integrate by parts, obtaining Differentiating once more, since F ∈ C 2 b (0, α), we obtain We introduce a second order interpolation based on the Taylor expansion and we notice that the left hand side gives, up to integration on (0, T ), the left hand side of (30).
Let us notice first how we would conclude in case ∇ * b = ∆a i,j = 0, and then address the general case. As in the previous lemma, we obtain the identity where c is some constant. Integrating with respect to s ∈ (0, α) and exploiting the factor (α − σ) to compensate the bound on the norm of h α−s , we deduce (30).
To address the general case, we bound separately the terms To deal with the former, we isolate a "leading term" which involves ∇ 2 a i,j and we bound the remaining terms by adding and subtracting suitable quantities, with the only difficulty that we must take into account the second order expansion. Precisely, after arguing as in the case ∆a i,j = 0, we are left with estimating and to this aim we add and subtract where we let R i,j f := ∂ 2 i,j ∆ −1 f be the second-order Riesz transform along the directions i, j. The difference between (33) and (34) is easily bounded and, to conclude, we exploit the identity and use the fact that the latter quantity is uniformly bounded (and that R i,j is a bounded operator).
To estimate the second expression in (32), we notice that thus we integrate by parts with respect to s ∈ (0, α), The first term in the right hand side above is bounded by c ∆a i,j To conclude, we argue once more by adding and subtracting and estimating the differences involved. This settles the validity of (30), for smooth functions and at fixed t ∈ (0, T ). By integration and a density argument, the general case is deduced at once. Next, we prove (31), which follows from the fact that α u[∆, a : ∇ 2 ]u 2α dL d is infinitesimal, as α ↓ 0: indeed, a standard uniform boundedness and density argument gives that the left hand side in (30) is infinitesimal as α ↓ 0. To show it, we initially argue in the case of smooth functions u, f , and for fixed i, j ∈ {1, . . . , d}, we let b = ∇a i,j and integrate by parts Although the intermediate steps require some regularity for u, by the commutator estimate for Sobolev derivations established in the previous lemma, the resulting identity extends by continuity to u ∈ L ∞ t (L r x ), f ∈ L ∞ t (W 2,s x ). Next, we specialize to the case f := u α . By the strong convergence provided by Lemma 3.4 and uniform boundedness of α∂ 2 i,j u α in L ∞ 1 (L r x ), we have Similarly, it holds (recall that the left hand side in (27) is infinitesimal) Finally, in order to handle the term α (∂ 2 i,j u α )(b∇u α )dL d , the choice f = u α and the symmetry of a are crucial: we integrate by parts once, and since b = ∇a i,j , we obtain The first term, when multiplied by α, is clearly bounded and infinitesimal as α ↓ 0, so we focus on the last one. To show that it is bounded, we recall that a is symmetric and we are actually interested in bounds for the whole sum on i, j ∈ {1, . . . d}; thus, by coupling the symmetric terms, it is sufficient to prove that is infinitesimal. 
This symmetric expression can be explicitly rewritten as and at this stage we integrate by parts once more, obtaining a bound in terms of which is sufficient to conclude (recall that the left hand side in (26) is infinitesimal).
Finally, we deal with the bounded elliptic case: if a is bounded and elliptic, then the form L 2 t (W 1,2 x ) ∋ f → a(∇f, ∇f ) is a Dirichlet form, with associated Markov semigroup P α a and (self-adjoint) generator ∆ a f = div(a∇f ), on its "abstract" domain D(∆ a ) (as given by the general theory of Dirichlet forms). When we choose P α a as "smoothing operator", the main difficulty is to prove that it preserves regularity with respect to t ∈ (0, T ), thus we need some estimate for the commutator [P α a , ∂ t ], which we initially define in the following weak sense, for u ∈ A c , f ∈ A: Lemma 3.6. Let a be bounded and elliptic, with ∂ t a ∈ L ∞ t (L ∞ x ). Then, for every α ∈ (0, 1), where c is a constant (depending only on the ellipticity constant λ).
Thanks to this lemma and a density argument, for f ∈ W 1,2 t (L 2 x ), we deduce that P α a f ∈ W 1,2 t (L 2 x ), and the "strong" commutator [P α a , ∂ t ]f := P α a ∂ t f − ∂ t P α a f is well defined and it belongs to L 2 t (L 2 x ). Moreover, the usual uniform boundedness argument shows that, for u ∈ L 2 t (L 2 x ) and any family (f α ) α≥0 ⊆ L 2 t (W 1,2 x ) converging in L 2 t,x , it holds Proof. We provide the following analogue of (35), where ∂ t is replaced by σ −1 (T σ − I), where T σ f (t, x) = f (t + σ, x), and I is the identity operator (we also choose σ ≠ 0 small enough, to avoid boundary terms, thanks to the assumption u ∈ A c ): (with c depending on λ only). Once this is settled, we may let σ → 0 and pass to the limit in the weak formulation. Let us notice that the identity operator plays no role above, and everything reduces to estimating σ −1 u[P α a , T σ ]f dL 1+d . By first-order interpolation along the semigroup P s a , for s ∈ (0, α), it is sufficient to bound the infinitesimal commutator where we performed integration by parts with respect to the variable x ∈ R d and the change of variables t → t + σ in the first integral. We have therefore the bound (we are actually interpolating also along the semigroup σ → T σ ) which by ∂ r T r a = T r ∂ t a gives the thesis, after an application of Hölder inequality and using the smoothing effect in L 2 t (L 2 x ) of P a , i.e. ∇u s x . It would be natural to extend the argument above to more general exponents; the main issue is that a smoothing effect for P a akin to (26) is not ensured by Dirichlet form theory, when the exponent involved is different from 2. It seems plausible however to replace L 2 t (L 2 x ) with L ∞ t (L 2 x ) and require only ∂ t a ∈ L 1 t (L ∞ x ) (as the semigroup acts only fiberwise).
Remark 3.7 (trace semigroup at t = 0). Another consequence of Sobolev regularity of the lemma above is existence of a "trace" semigroup, e.g. at t = 0, defined as follows: for f ∈ L 2 x , consider a constant extension f (t, x) = f (x) for (t, x) ∈ (0, T ) × R d , and let P α 0 f be the trace of the Sobolev function P α a f at t = 0. Alternatively, this can be obtained as the semigroup generated by the bilinear form given by the trace at 0 of a.

Proof of well-posedness results
In this section, we address the proof of Theorem 3.1 and Theorem 3.2. As already remarked, existence is easily settled by approximations, so we focus on uniqueness.
Proof of Theorem 3.1. Let u be the difference between any two narrowly continuous solutions in L ∞ t (L r x ) and let P α be the heat semigroup on R d , extended on (0, T ) × R d by acting on each b with respect to the variable x ∈ R d , for L 1 -a.e. t ∈ (0, T ) (to approximate P α f with functions in A, we argue by convolution with a smooth kernel with respect to t ∈ (0, T )), thus (28) holds true for f in such a space: (36) For α > 0, we also have u α ∈ D r ′ (L): to use u α as a test function, we deduce that u α ∈ W 1,1 t (L r ′ x ), which follows directly from the equation satisfied by u α . Indeed, (36) for f ∈ A c entails that the distributional derivative ∂ t u α coincides with the distribution L * u α , which is represented by a function, namely Therefore, ∂ t u α ∈ L 1 t (L r ′ x ), and u α admits an absolutely continuous continuous representative, which must coincide with the one that we would obtain by acting directly to the narrowly continuous representative u t with the heat semigroup P α , at every t ∈ [0, T ]: it holds in particular u α 0 = 0, since u 0 = 0. Moreover, the curve t → R d (u α t ) 2 dL d is absolutely continuous, with distributional and L 1 -a.e. derivative d dt (u α ) 2 t dL d = 2 (∂ t u α )u α dL d . We are in a position to let u α in the weak formulation (36), to obtain If we choose instead a test function t → f (t)u α t , with f ∈ C 1 c [0, T ) and we apply (25), we eventually deduce the inequality L 1 -a.e. t ∈ (0, T ) and in the sense of distributions on (0, T ). Gronwall lemma gives As a consequence of Lemma 3.5, we deduce u L ∞ t L 2 x = 0. Proof of Theorem 3.2. In our smoothing scheme, we choose P α = P α a be the semigroup associated to the Dirichlet form f → a(∇f, ∇f )dL 1+d , as introduced in the previous section. 
A first step consists in showing that (36) holds true, and we see it as a consequence of the fact that P α a maps W 1,2 then Lemma 3.6 shows that f α ∈ W 1,2 t (L 2 x ) as well; to show f α ∈ D 2 (L), we rely on the assumption a ∈ L 1 t (W 1,p x ), and show that the smooth approximations obtained by means of the standard heat semigroup x ) (this is the only point where we use the first order regularity assumption on a). Such convergence can be seen by the commutator lemma for Sobolev vector fields, Lemma 3.4, noticing that the claimed convergence amounts to showing [L, P s ]f α → 0 in L 1 t (L 2 x ), but since derivatives and the standard heat semigroup commute, it holds x and Lemma 3.4 shows convergence towards 0 in L 1 t L 2 x , as s ↓ 0. As a second step, we notice that we may let u α be a test function in (36): indeed, it holds u α ∈ H 1,2 (L, a(∇ ⊗ ∇)) by what we just proved, while the fact that ∂ t u α is represented by some function in L 1 t L 2 x follows from a duality argument: for a.e. t ∈ (0, T ) the linear functional where we applied (25) only for the diffusion part a : ∇ 2 , as we deal with the drift term separately, using the inequality to bound the contribution of the drift part. To conclude, we apply Gronwall's inequality and finally let α ↓ 0, using (31) to deduce that the commutator term gives no contribution in the limit and uniqueness holds.
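Both proofs above conclude with Gronwall's lemma applied to an energy inequality. Since the displayed inequalities are not reproduced in this text, the following is only a sketch of the standard argument in the form such estimates usually take; the symbols \varphi_\alpha, g and r_\alpha are names introduced here for illustration and are not the paper's notation.

```latex
% Assumed notation (illustrative only):
%   \varphi_\alpha(t) := \int_{\mathbb{R}^d} (u^\alpha_t)^2 \, d\mathcal{L}^d  (absolutely continuous in t),
%   g \in L^1(0,T), g \ge 0, collects the bounds on the coefficients,
%   r_\alpha \in L^1(0,T) is a remainder (e.g. a commutator term).
\[
  \varphi_\alpha(0) = 0, \qquad
  \frac{d}{dt}\varphi_\alpha(t) \le g(t)\,\varphi_\alpha(t) + r_\alpha(t)
  \quad \text{for a.e. } t \in (0,T)
\]
\[
  \Longrightarrow \quad
  \varphi_\alpha(t)
  \le \int_0^t \exp\Big( \int_s^t g(\tau)\, d\tau \Big)\, r_\alpha(s)\, ds
  \le \exp\big( \| g \|_{L^1(0,T)} \big)\, \| r_\alpha^+ \|_{L^1(0,T)},
  \qquad t \in [0,T].
\]
```

In this form, uniqueness follows as soon as the remainder satisfies \| r_\alpha^+ \|_{L^1(0,T)} → 0 as α ↓ 0, which is the role played by the commutator estimates in the argument above.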

A The superposition principle for multidimensional diffusions
To prove Theorem 2.5, we follow a general scheme, whose structure is shared by many proofs of superposition principles appearing in the literature, see e.g. [
Step 1 (approximation). We build from ν a sequence of solutions (ν n ) n of FPE's associated to diffusion operators (L n ) n , for which the superposition principle is already known to hold, thus obtaining a sequence of superposition solutions (η n ) n of MP's. Here, the difficulty is to exhibit a sufficiently good approximation, so that ν n converge towards ν, e.g., narrowly, and L n towards L, in a sense to be made precise, as n → ∞.
Step 2 (tightness). We prove that (η n ) n ⊆ P(C([0, 1]; R d )) is tight, yielding a narrow limit point η. By the Ascoli-Arzelà criterion, this step reduces to showing uniform bounds on the modulus of continuity of the canonical process (e t ) t∈[0,1] with respect to η n .
Step 3 (limit). From convergence ν n → ν, L n → L, as n → ∞, we conclude that η is a superposition solution for ν. Here, the problem is to deal with convergence for possibly non-continuous functions, as they involve the coefficients a, b.

A.1 Approximation
We approximate the limit solution by means of mollification by convolutions or push-forwards via smooth maps (in probabilistic jargon, by conditioning with respect to some observables).
Since conditional expectations reduce norms and the derivatives of π i are uniformly bounded, integral bounds on a, b are naturally transferred to π(a), π(b) (in particular, uniform bounds). However, local integrability conditions might not be preserved.
Mollification by convolutions. This is a more standard technique, already employed e.g. in [AGS08, Theorem 8.2.1] and [Fig08, Theorem 2.6]. Let ρ ≥ 0 be a smooth probability density (with respect to L d ), with full support. Then, the family of measures ν * ρ := (ν t * ρ) t∈[0,1] solves a FPE associated to a suitably defined diffusion operator. Indeed, for since derivatives and convolution commute. We define so (ν t * ρ) t∈[0,1] is a weak solution of the FPE associated to L ρ := L(a ρ , b ρ ), as Integrability and regularity properties of a ρ and b ρ are collected in the following lemma, see [AGS08, Lemma 8.1.10] for a detailed proof.
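The displayed definitions of a ρ and b ρ are not reproduced in this text. The following sketch records the construction that is standard in this setting (in the spirit of [AGS08, Lemma 8.1.10]); it is an assumption that the paper's definition agrees with it. Here \nu_t * \rho denotes both the measure and its density, which is strictly positive because ρ has full support, and \check\rho(x) := \rho(-x) is notation introduced only for this sketch.

```latex
% Mollified coefficients (assumed form), with
%   ((a_t \nu_t) * \rho)(x) := \int a_t(y)\, \rho(x - y)\, d\nu_t(y):
\[
  a^\rho_t := \frac{(a_t \nu_t) * \rho}{\nu_t * \rho},
  \qquad
  b^\rho_t := \frac{(b_t \nu_t) * \rho}{\nu_t * \rho}.
\]
% Since L(a,b) is linear in (a,b) and derivatives commute with convolution,
% for f \in C^{1,2}_c (Fubini):
\[
  \int L(a^\rho_t, b^\rho_t) f \; d(\nu_t * \rho)
  = \int L(a_t, b_t)\,(f * \check\rho) \; d\nu_t .
\]
% Integral bounds are inherited, since |(a_t\nu_t) * \rho| \le (|a_t|\nu_t) * \rho:
\[
  \int |a^\rho_t| \; d(\nu_t * \rho) \le \int |a_t| \; d\nu_t ,
  \qquad
  \int |b^\rho_t| \; d(\nu_t * \rho) \le \int |b_t| \; d\nu_t .
\]
```

The middle identity is what transfers the weak formulation from ν to ν * ρ; the sketch takes for granted that f * \check\rho, which is smooth and bounded with bounded derivatives but not compactly supported, is admissible as a test function.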

A.2 Tightness
We prove a compactness criterion for solutions of martingale problems, under minimal integrability conditions on the coefficients. In the deterministic case, tightness is achieved by estimating the metric velocity of absolutely continuous curves which solve the ODE; in the stochastic case, we rely on analogous results for martingales, using Burkholder-Davis-Gundy inequalities and an argument reminiscent of Lévy's modulus of continuity for the Brownian motion, to estimate the modulus of continuity of the canonical process (yielding in some cases Hölder regularity).
To show that Ψ defined by (39) is coercive, it is sufficient to apply the Ascoli-Arzelà criterion, noticing that γ ∈ {Ψ ≤ m} can be decomposed as the sum of two curves γ 1 + γ 2 , and γ i (i ∈ {1, 2}) admits the following modulus of continuity To show that (38) holds, we assume that the right hand side therein is finite. The assumptions entail therefore that (M t ) t is a P-a.s. continuous local martingale, whose quadratic variation process is t ↦ ∫_0^t α s ds. If we let γ 1 t := ∫_0^t β s ds and γ 2 t = M t , for t ∈ [0, 1], then the left hand side in (38) is smaller than Next, we focus on the addends in the series above, writing for brevity ε in place of 2 −m . For i ∈ {1, 2}, using (40), we have Let us focus on the case i = 1 (thus we write δ = δ 1 , Θ = Θ 1 ). Since |γ s − γ t | ≤ ∫_s^t |β r | dr, we estimate where the last inequality is a consequence of Jensen's inequality and our preliminary choice for δ. Summing over k ∈ {1, . . . , δ −1 }, we conclude that ds, for some constant c ≥ 0 (in this case, the constant does not even depend upon Θ).
To deal with the case i = 2 (again, we omit to specify i in what follows), i.e., the martingale part, for each k ∈ {1, . . . , δ −1 }, we estimate from above, where c Θ is some constant depending on Θ only: indeed, it is sufficient to apply Burkholder-Davis-Gundy inequalities, e.g. in the form [LLP80, Theorem 2.1], to the martingale M s := δ −1/2 M s+(k−1)δ , s ∈ [0, δ] and the convex function with "moderate growth" x → Θ(x 2 ). By Jensen's inequality and our definition of δ ε we conclude that As in the previous case, by summing over k ∈ {1, . . . , δ −1 }, we deduce that and so we deduce the desired bound for E [Ψ(M )].
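To make the two estimates above concrete, here is a sketch in a model case. It assumes a convex nondecreasing Θ for the drift part and, for the martingale part, the power case with exponent p ≥ 2 and a scalar continuous local martingale M with quadratic variation ∫ α; the general case in the text uses the Burkholder-Davis-Gundy inequality for moderate convex functions instead. The constant c_p is the usual BDG constant.

```latex
% Drift part: \gamma_t = \int_0^t \beta_r \, dr, \Theta convex and nondecreasing, s < t.
\[
  \Theta\Big( \frac{|\gamma_t - \gamma_s|}{t - s} \Big)
  \le \Theta\Big( \frac{1}{t-s} \int_s^t |\beta_r| \, dr \Big)
  \le \frac{1}{t-s} \int_s^t \Theta(|\beta_r|) \, dr
  \qquad \text{(Jensen)}.
\]
% Martingale part, model case p \ge 2: Burkholder-Davis-Gundy, then Jensen.
\[
  \mathbb{E}\Big[ \sup_{s \le r \le t} |M_r - M_s|^p \Big]
  \le c_p \, \mathbb{E}\Big[ \Big( \int_s^t \alpha_r \, dr \Big)^{p/2} \Big]
  \le c_p \, (t-s)^{p/2 - 1} \, \mathbb{E}\Big[ \int_s^t \alpha_r^{p/2} \, dr \Big].
\]
```

The factor (t − s)^{p/2 − 1} in the second line is what produces Hölder-type control of the canonical process when p > 2, consistent with the remark at the beginning of this section.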

A.3 Limit
In the third step, we assume that the probability measures (η n ) n , obtained as superposition solutions for a suitable approximating sequence (ν n ) n , narrowly converge in P(C([0, T ]; R d )) towards some limit η. The fact that η provides a superposition solution for ν is not straightforward, since we must deal with a limit in the weak formulation, where terms involving the coefficients a, b appear (in general, they are not continuous). Indeed, η ∈ P (C([0, 1]; R d )) is a solution of the martingale problem associated to L(a, b) if and only if the following property holds: for every s, t ∈ [0, 1] with s ≤ t, for every f ∈ C 1,2 c ([0, 1] × R d ) (with ‖f‖ C 1,2 ≤ 1) and for every bounded continuous and F s -measurable function g on C([0, T ]; R d ) (with ‖g‖ ∞ ≤ 1) it holds As the corresponding identity holds for η n and L n , i.e.
to deduce that η is a solution to the martingale problem associated to L, since f and ∂ t f are bounded and continuous, the crucial limit is whose validity we now investigate, according to the approximations from Section A.1.
Push forward via smooth maps. For n ≥ 1, let π n ∈ C 2 b (R d , R d ) with (π n ) n converging to the identity map locally uniformly and assume that the sequences of first and second derivatives converge pointwise (towards the respective limits) and are uniformly bounded, i.e., ∇π n (x) → Id and ∇ 2 π n (x) → 0, for every x ∈ R d , and there exists some constant C ≥ 0 such that |∇ i π n (x)| ≤ C, for x ∈ R d and i ∈ {1, 2}.
Let ν n = π n ♯ ν, L n = π n (L), and let η n be the corresponding superposition solution. To prove that any narrow limit point η is indeed a superposition solution for ν, with respect to the diffusion operator L, we add and subtract the corresponding term for L̃, where L̃ = L(ã, b̃) is any diffusion operator on R d , whose coefficients ã, b̃ are continuous and compactly supported. The difference terms above are infinitesimal as n → ∞, by narrow convergence of η n , thus we estimate (42), as n → ∞, in terms of Let us focus on the first term above, at fixed n ≥ 1 (for simplicity of notation, we drop the dependence upon n). Recalling the definition of π(L), integration with respect to the push-forward measure gives Being (L̃f ) • π a function of π, up to ν-negligible sets, we have since conditional expectation reduces L 1 (ν)-norms. Writing explicitly the difference and recalling that ‖f‖ C 1,2 ≤ 1, we conclude that Letting n → ∞ (recall that π = π n above), using the assumption on the convergence of π n towards the identity map (in particular, we use Lebesgue dominated convergence w.r.t. the measure ν), we deduce that (43) is bounded from above by twice To conclude, we choose ã, b̃ that minimize the right hand side above: this can be made arbitrarily small, by the density of continuous and compactly supported functions in L 1 (ν).
Mollification by convolution. In this case, the argument is similar, and it is more standard, see e.g. [AGS08, Theorem 8.2.1], thus we only sketch it. Given a sequence ρ n of probability densities on R d such that ρ n L d → δ 0 narrowly as n → ∞, let ν n = ν * ρ n and L n be the diffusion operator introduced in Section A.1. We add and subtract, in (42), the corresponding term for L̃, where L̃ = L(ã, b̃) has continuous and compactly supported coefficients. Let ω be a common (bounded and continuous) modulus of continuity for ã, b̃.
As in the previous case, narrow convergence implies that the absolute value of (42) is bounded from above, as n → ∞, by lim sup First, we prove that lim n→∞ ∫ |L n f − Lf | dν n = 0, where L n has coefficients a n := Indeed, recalling that ‖f‖ C 1,2 ≤ 1, we estimate Thanks to this fact, we write lim sup where in the last step we apply (37). To conclude, it is sufficient to optimize over ã, b̃, by density of continuous and compactly supported functions in L 1 (ν).

A.4 Proof of Theorem 2.5
We argue by iterating the three-step scheme, the base case being that of diffusion operators with smooth and uniformly bounded coefficients. First, we extend the validity to uniformly bounded coefficients (without any regularity assumption), then to locally bounded coefficients, and finally to integrable coefficients. Although everything could be obtained in a single iteration, we think this approach highlights the different roles played by the different approximation procedures. Indeed, our crucial improvement with respect to [Fig08, Theorem 2.6] is to move from uniformly bounded to integrable coefficients, which is rather delicate: by comparison, in the deterministic case, one is able to deal directly with locally smooth coefficients (see e.g. [AGS08, Proposition 8.1.8]), essentially because paths either go to infinity, i.e., the solution explodes in a finite time, or stay in a compact set. Roughly speaking, the source of difficulties in the stochastic case is that we have to deal with "averages" of such behaviours, and moreover the solution to a genuinely stochastic martingale problem is expected to instantaneously "diffuse" over compact sets (of course, with small probability as these sets become larger).
Case of smooth and bounded coefficients. Let a, b be Borel maps as in (2), with Then, the superposition principle holds for every solution ν = (ν t ) t∈(0,T ) ⊆ P(R d ) of the FPE (3). This follows from two well-known facts: existence of solutions to Itô stochastic differential equations and uniqueness for narrowly continuous solutions of FPE's. The existence result is standard, with the possible exception of the integrable bounds with respect to the variable t ∈ [0, T ] (usually, one requires uniform bounds), but in fact such a condition is sufficient for the various applications of Gronwall's inequality. For the sole purpose of establishing a base case for the superposition principle, the usual stronger assumptions on the coefficients, e.g. a, b ∈ C ∞ b ((0, T ) × R d ), would even be sufficient, at the price of introducing an extra mollification step with respect to the variable t ∈ [0, T ].
Theorem A.5. Let a, b be Borel maps as in (2), satisfying (44). Then, for every ν̄ ∈ P(R d ), there exists a solution η of the MP associated to L(a, b), with η 0 = ν̄.

Proof. The assumption a ∈ L 1 t (C 2 b (R d )) entails that the symmetric non-negative square-root of a, i.e. the (essentially unique) map σ : [0, T ] × R d → Sym + (R d ) such that σ 2 t = a t , L 1 -a.e. t ∈ (0, T ), is bounded and Lipschitz with respect to x ∈ R d , with Lipschitz constant integrable w.r.t. t ∈ (0, T ), see e.g. [SV06, Lemma 3.2.3]. Then, it is sufficient to solve by Picard iteration the Itô stochastic differential equation dX t = b(t, X t ) dt + σ(t, X t ) dW t , X 0 = X̄, where X̄ is a r.v. with the prescribed initial law, independent of the d-dimensional Wiener process W . By Itô's formula, the law of X, i.e. X ♯ P, is a solution of the martingale problem associated to L(a, b).
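The last step of the proof can be spelled out as follows. This is a sketch assuming the normalization L_t f = \tfrac12 a_t : \nabla^2 f + b_t \cdot \nabla f for the diffusion operator, which is the one compatible with the choice σ² = a made above; definition (2) is not reproduced in this text, so this normalization is an assumption.

```latex
% X solves dX_t = b(t, X_t)\,dt + \sigma(t, X_t)\,dW_t with \sigma_t symmetric, \sigma_t^2 = a_t,
% so that d[X^i, X^j]_t = (\sigma_t \sigma_t^{T})_{ij}\,dt = a^{ij}_t\,dt.
% Ito's formula for f \in C^{1,2}_c([0,T] \times \mathbb{R}^d), 0 \le s \le t \le T:
\[
  f(t, X_t) - f(s, X_s)
  - \int_s^t \big( \partial_r f + L_r f \big)(r, X_r)\, dr
  = \int_s^t \nabla f(r, X_r) \cdot \sigma(r, X_r)\, dW_r ,
\]
\[
  L_r f := \tfrac12\, a_r : \nabla^2 f + b_r \cdot \nabla f .
\]
% The right-hand side is a martingale, because \nabla f and \sigma are bounded.
```

Taking expectations against bounded functions measurable with respect to the past up to time s then gives exactly the martingale property required of the law of X.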
Of course, the MP is also well-posed, but we need a stronger uniqueness result, for narrowly continuous solutions of FPE's, which is e.g. a consequence of results on backward Kolmogorov equations. We refer e.g. to the expository notes [Kry99] for more details; notice however that, also in this case, the standard literature studies equations of the form For our purposes, we need existence of a solution, together with the following regularity results for the solution f (which entail uniqueness): where z → C(z) denotes some function depending on the dimension d only (the proof gives that C has an exponential behaviour). The proof follows by direct differentiation of the equation, see [SV06, Theorem 3.2.4] for a detailed derivation. Moreover, as a consequence of the maximum principle, if f̄ ≥ 0 and g ≥ 0, then the solution f is non-negative as well.
We are in a position to prove the following result, akin to [AGS08, Proposition 8.1.7]. Again, we provide a slightly stronger statement than what is needed for the superposition principle (e.g., we deduce uniqueness for possibly signed solutions of the FPE).
Proof. Let g ∈ C ∞ c ((0, T ) × R d ), with g ≥ 0: it is sufficient to show that ∫ g dν ≤ 0. Fix R ≥ 1 large enough so that the support of g is contained in (0, T ) × B R (0) and let χ R be a cut-off function, as below Remark 2.3. Notice that letting a R = aχ R and b R = bχ R in place of a, b, condition (44) holds and L R f = Lf on (0, T ) × B R (0), for every f ∈ C 2 b ((0, T ) × R d ). For ε > 0, let a ε R , b ε R be a double mollification with respect to the space and time variables, and define L ε R = L(a ε R , b ε R ), which is a diffusion operator with smooth and bounded coefficients, satisfying (44) uniformly in ε > 0. Let f ε be a solution to the backward Kolmogorov equation ∂ t f ε = −L ε R f ε + g, f ε T = 0, and choose f ε χ R in the weak formulation (5), which is admissible because f ε ∈ C 1,2 b ((0, T ) × R d ). Since f ε ≤ 0 and ν 0 ≤ 0, we have As ε ↓ 0, since a R = a and b R = b on (0, T ) × B R (0), the second integral converges to ∫ [|Lχ R | + |a| |∇χ R |] d|ν|, and sup t∈[0,T ] ‖f ε t ‖ C 2 b is uniformly bounded in ε > 0, by (46). Finally, we let R → ∞ and conclude, since |∇χ R | + |∇ 2 χ R | → 0, pointwise and uniformly bounded.
The superposition principle follows immediately from these facts: since any weak solution ν = (ν t ) t∈(0,T ) admits a narrowly continuous representative ν̃, we let η be a solution of the MP associated to L(a, b), with η 0 = ν̃ 0 (Theorem A.5), and notice that the curve (η t ) t∈[0,T ] is a narrowly continuous solution of the FPE associated to L, with η 0 = ν̃ 0 . By Theorem A.6, we conclude that η t = ν̃ t , for t ∈ [0, T ].
Case of bounded coefficients. We extend the validity of the superposition principle to diffusions with uniformly bounded coefficients: this already provides an extension of [Fig08, Theorem 2.6], as uniform bounds are imposed only with respect to x ∈ R d . Precisely, we assume that the coefficients a, b satisfy
Step 1 (approximation). We argue by convolution with a kernel ρ(x) = c exp(−√(1 + |x| 2 )), where c > 0 is a normalizing constant. For ε ∈ (0, 1), let ρ ε (x) = ε −d ρ(x/ε) and notice that |∇ i ρ ε | ≤ Cε −2 ρ ε , for i ∈ {1, 2}, where C is some absolute constant. Then, ν ε = ν * ρ ε solves a FPE with respect to a diffusion operator with coefficients a ε , b ε satisfying (the counterpart of) (44), as a consequence of the last statement in Lemma A.1. Existence of superposition solutions η ε ∈ P(C([0, T ]; R d )) for the associated martingale problems follows from the smooth case settled above.
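The derivative bound on the kernel used in Step 1 can be checked directly. The computation below is a sketch under the assumption that the kernel is ρ(x) = c exp(−√(1 + |x|²)), with c > 0 a normalizing constant and ρ_ε(x) = ε^{-d} ρ(x/ε); the square root and the scaling exponent are a reading of the formula in the text, and the shorthand ⟨x⟩ is introduced only here.

```latex
% Notation: \langle x \rangle := \sqrt{1 + |x|^2}, so that \nabla \langle x \rangle = x / \langle x \rangle
% and |x| / \langle x \rangle \le 1, \ 1 / \langle x \rangle \le 1.
\[
  \nabla \rho = -\rho\, \frac{x}{\langle x \rangle},
  \qquad
  \nabla^2 \rho = \rho \Big(
      \frac{x \otimes x}{\langle x \rangle^2}
      - \frac{\mathrm{Id}}{\langle x \rangle}
      + \frac{x \otimes x}{\langle x \rangle^3}
  \Big),
\]
\[
  |\nabla \rho| \le \rho, \qquad |\nabla^2 \rho| \le 3 \rho .
\]
% Scaling: \nabla^i \rho_\varepsilon(x) = \varepsilon^{-d-i} (\nabla^i \rho)(x/\varepsilon), hence for \varepsilon \in (0,1):
\[
  |\nabla^i \rho_\varepsilon| \le 3\, \varepsilon^{-i} \rho_\varepsilon \le 3\, \varepsilon^{-2} \rho_\varepsilon,
  \qquad i \in \{1, 2\}.
\]
```

The point of choosing an exponential kernel of this kind, rather than a Gaussian or a compactly supported one, is precisely that its derivatives are controlled pointwise by the kernel itself, so quotients such as ((aν) * ∇ρ_ε)/(ν * ρ_ε) remain bounded whenever a is bounded.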
With such a choice of Θ 1 , Θ 2 (and building θ as in the previous case, for (ν M ) M >0 is tight), we obtain for some coercive functional Ψ such that inequalities akin to (48) and the following ones, with Θ 1 (z), Θ 2 (z) in place of |z| p .
Step 3 (limit). This step is described in Section A.3.
General case. The final step consists of removing the assumption (49).
Step 1 (approximation). We perform once again an approximation via convolution, e.g. as in the case of uniformly bounded coefficients. In this case, however, we only use the fact that (ν ε ) ε are solutions to FPE's associated to diffusion operators whose coefficients are locally bounded (and the bound (4) is preserved).
Step 2 (tightness). We argue exactly as in the previous case, i.e. using the de la Vallée Poussin criterion to provide suitable Θ 1 , Θ 2 .
Step 3 (limit). Again, this step is described in Section A.3.
As already remarked at the beginning of this section, one could combine all the arguments above and prove Theorem 2.5, starting from the "base case", with a single combination of mollification and push-forward approximations. On a technical level, the main difficulty is to obtain the result for locally bounded coefficients, and this is done after we establish the result for uniformly bounded coefficients, regardless of their regularity, essentially because the push-forward approximation may not preserve it.