Stochastic ODEs and stochastic linear PDEs with critical drift: regularity, duality and uniqueness

In this paper linear stochastic transport and continuity equations with drift in critical $L^{p}$ spaces are considered. In this situation noise prevents shocks for the transport equation and singularities in the density for the continuity equation, starting from smooth initial conditions. Specifically, we first prove a result of Sobolev regularity of solutions, which is false for the corresponding deterministic equation. The technique needed to reach the critical case is new and based on parabolic equations satisfied by moments of first derivatives of the solution, in contrast to previous works based on stochastic flows. The approach extends to higher order derivatives under more regularity of the drift term. By a duality approach, these regularity results are then applied to prove uniqueness of weak solutions to linear stochastic continuity and transport equations and certain well-posedness results for the associated stochastic differential equation (sDE) (roughly speaking, existence and uniqueness of flows and their $C^\alpha$ regularity, strong uniqueness for the sDE when the initial datum has diffuse law). Finally, we show two types of examples: on the one hand, we present well-posed sDEs whose corresponding ODEs are ill-posed, and on the other hand, we give a counterexample in the supercritical case.


Introduction
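Let $b : [0,T] \times \mathbb{R}^d \to \mathbb{R}^d$, for $d \in \mathbb{N}$, be a deterministic, time-dependent vector field, which we call the drift. Let $(W_t)_{t \ge 0}$ be a Brownian motion in $\mathbb{R}^d$, defined on a probability space $(\Omega, \mathcal{A}, P)$ with respect to a filtration $(\mathcal{G}_t)_{t \ge 0}$, and let $\sigma$ be a real number. The following three stochastic equations are (at least formally) related:
1. the stochastic differential equation (sDE) $dX_t = b(t, X_t)\,dt + \sigma\,dW_t$, $X_0 = x$, where $x \in \mathbb{R}^d$; the unknown $(X_t)_{t \in [0,T]}$ is a stochastic process in $\mathbb{R}^d$;
2. the stochastic transport equation (sTE) $du + b \cdot \nabla u\,dt + \sigma \nabla u \circ dW_t = 0$, $u|_{t=0} = u_0$, where $\nabla u \circ dW_t$ stands for $\sum_{i=1}^d \partial_{x_i} u \circ dW^i_t$ and Stratonovich multiplication is used (precise definitions will be given below); the unknown $(u(t,x))_{t \in [0,T], x \in \mathbb{R}^d}$ is a scalar random field;
3. the stochastic continuity equation (sCE) $d\mu + \operatorname{div}(b\mu)\,dt + \sigma \operatorname{div}(\mu \circ dW_t) = 0$, $\mu|_{t=0} = \mu_0$, where $\mu_0$ is a measure and $\operatorname{div}(\mu \circ dW_t)$ stands for $\sum_{i=1}^d \partial_{x_i} \mu \circ dW^i_t$; the unknown $(\mu_t)_{t \in [0,T]}$ is a family of random measures on $\mathbb{R}^d$, and thus the differential operations have to be understood in the sense of distributions.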
The aim of this paper is to investigate several questions for these equations in the case when the drift is in a subcritical or even critical space, a case not reached by any approach until now.

Deterministic case σ = 0
For comparison with the results for the stochastic equations presented later on (due to the presence of noise), we first address the deterministic case σ = 0. We start by explaining the link between the three equations and recall some classical results (in the positive and in the negative direction) under various regularity assumptions on the drift b. When b is smooth enough, then: (i) the sDE generates a flow $\Phi_t(x)$ of diffeomorphisms; (ii) the sTE is uniquely solvable in suitable spaces, and for the solution we have the representation formula $u(t,x) = u_0(\Phi_t^{-1}(x))$; (iii) the sCE is uniquely solvable in suitable spaces, and the solution $\mu_t$ is the push forward of $\mu_0$ under $\Phi_t$, i.e. $\mu_t = (\Phi_t)_\# \mu_0$.
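The representation formula in (ii) can be checked numerically in a toy case. The following is a minimal sketch, assuming a hypothetical one-dimensional linear drift b(x) = a·x (not an example taken from this paper), for which the flow is explicit; all function names are illustrative only.

```python
import math

def flow(t, x, a):
    """Flow Phi_t(x) of the ODE x' = a*x (hypothetical linear drift b(x) = a*x)."""
    return x * math.exp(a * t)

def inverse_flow(t, x, a):
    """Inverse flow Phi_t^{-1}(x) for the same linear drift."""
    return x * math.exp(-a * t)

def transport_solution(u0, t, x, a):
    """Candidate solution u(t, x) = u0(Phi_t^{-1}(x)) of the deterministic transport equation."""
    return u0(inverse_flow(t, x, a))

def transport_residual(u0, t, x, a, h=1e-5):
    """Central-difference approximation of d_t u + b(x) d_x u, which should be close to 0."""
    du_dt = (transport_solution(u0, t + h, x, a)
             - transport_solution(u0, t - h, x, a)) / (2 * h)
    du_dx = (transport_solution(u0, t, x + h, a)
             - transport_solution(u0, t, x - h, a)) / (2 * h)
    return du_dt + a * x * du_dx
```

For a smooth initial datum such as `math.sin`, the value of `transport_residual(math.sin, t, x, a)` is zero up to finite-difference error, which is the statement that the initial datum is simply carried along the characteristics.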
These links between the three equations can be either proved a posteriori, after the equations have been solved by their own arguments, or they can be used to solve one equation by means of the other. Well-posedness of the previous equations and links between them have been explored [...] exists. Remarkable is the fact that the flow is obtained by a preliminary solution of the sTE or of the sCE, see [30,2] (later on in [25], similar results have been obtained directly on the sDE). However, when the regularity of b is too poor, several problems arise, of which, at the level of the sDE and its flow, we want to mention two types: 1) non-uniqueness for the sDE, and, more generally, presence of discontinuities in the flow; 2) non-injectivity of the flow (two trajectories can coalesce) and, more generally, mass concentration.
These phenomena have counterparts at the level of the associated sCE and sTE: 1) non-uniqueness for the sDE leads to non-uniqueness for the sCE and sTE; 2) non-injectivity of the flow leads to shocks in the sTE (i.e. absence of continuous solutions, even starting from a continuous initial datum), while mass concentration means that a measure-valued solution of the sCE does not remain distributed.
Elementary examples can be easily constructed by means of continuous drifts in dimension 1; more sophisticated examples in higher dimension, with bounded measurable and divergence free drift, can be found in [1]. Concerning regularity, let us briefly give some details for an easy example. Consider, in dimension d = 1, the drift $b(x) := -\operatorname{sign}(x)|x|^{\alpha}$ for some $\alpha \in (0,1)$. All trajectories of the ODE coalesce at x = 0 in finite time; the solution to the deterministic TE develops a shock (discontinuity) in finite time, at x = 0, from every smooth initial condition $u_0$ such that $u_0(x) \neq u_0(-x)$ for some $x \neq 0$; the deterministic CE concentrates mass at x = 0 in finite time, if the initial mass is not zero.
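The coalescence in this example can be made explicit. For x > 0 one has d/dt x^{1-α} = -(1-α), so a trajectory started at x_0 reaches the origin at time |x_0|^{1-α}/(1-α) and stays there afterwards. The sketch below encodes this closed-form solution; the function names are illustrative and not part of the paper.

```python
def coalescence_time(x0, alpha):
    """Time at which the solution of x' = -sign(x)|x|^alpha, x(0) = x0, reaches 0."""
    return abs(x0) ** (1.0 - alpha) / (1.0 - alpha)

def ode_solution(t, x0, alpha):
    """Explicit solution of x' = -sign(x)|x|^alpha for 0 < alpha < 1.

    Uses |x(t)|^(1-alpha) = |x0|^(1-alpha) - (1-alpha) t until the origin is hit;
    afterwards the trajectory stays at 0.
    """
    remaining = abs(x0) ** (1.0 - alpha) - (1.0 - alpha) * t
    if remaining <= 0.0:
        return 0.0
    sign = 1.0 if x0 > 0 else -1.0
    return sign * remaining ** (1.0 / (1.0 - alpha))
```

Trajectories started at x_0 and at -x_0 reach the origin at the same time and coincide from then on, so the flow is not injective; this is the mechanism behind the shock for the TE and the mass concentration for the CE described above.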
See also Section 7 for similar examples of drift terms leading to non-uniqueness or coalescence of trajectories for the deterministic ODE (which in turn results in non-uniqueness and discontinuities/mass concentration for the PDEs). Notice that the outstanding results of [30,2] (still in the deterministic case) are concerned only with uniqueness of weak solutions. The only results to our knowledge about regularity of solutions with rough drifts are those of [5,Section 3.3] relative to the loss of regularity of solutions to the TE when the vector field satisfies a log-Lipschitz condition, which is a far better situation than those considered in this paper. We shall prove below that these phenomena disappear in the presence of noise. Of course they also disappear in the presence of viscosity, but random perturbations of transport type $\nabla u \circ dW_t$ and viscosity $\Delta u$ are completely different mechanisms. The sTE remains a hyperbolic equation, in the sense that the solution follows the characteristics of the single particles (so we do not expect regularization of an irregular initial datum); on the contrary, the insertion of a viscous term corresponds to some averaging, making the equation of parabolic type. One could interpret transport noise as a turbulent motion of the medium where transport of a passive scalar takes place, see [19], which is different from a dissipative mechanism, although some of the consequences on the passive scalar may have similarities.

Stochastic case σ ≠ 0
In the stochastic case σ ≠ 0, when b is smooth enough, the existence of a stochastic flow of diffeomorphisms Φ for the sDE, the well-posedness of sTE and the relation $u(t,x) = u_0(\Phi_t^{-1}(x))$ [...] case. Notice that we are not talking about the well-known regularization effect of a Laplacian or an expected value. By regularization we mean that some of the pathologies mentioned above about the deterministic case (non-uniqueness and blow-up) may disappear even at the level of a single trajectory ω; we do not address any regularization of solutions in time, i.e. that solutions become more regular than the initial conditions, a fact that is clearly false when we expect relations like $u(t,x) = u_0(\Phi_t^{-1}(x))$.

Aim of this paper
The aim of this work is to prove several results in this direction and develop a sort of comprehensive theory on this topic. The results in this paper are considerably advanced and are obtained by means of new powerful strategies, which give a more complete theory. The list of our main results is described in the Subsections 1.6-1.8; in a few sentences, we are concerned with: (i) regularity for the transport (and continuity) equation; (ii) uniqueness for the continuity (and transport) equation; (iii) uniqueness for the sDE and regularity for the flow.
In the following subsections, we will explain the results in more detail and give precise references to previous works on the topics. Moreover, we will also analyze the crucial regularity assumptions on the drift term (discussing its criticality in a heuristic way and via appropriate examples, which are either classical or elaborated at the end of the paper).

Regularity assumptions on b
As already highlighted before, the key point for the question of existence, uniqueness and regularity of the solutions to the relevant equations is the regularity assumption on the drift b. In particular, we will not work with any kind of differentiability or Hölder condition, but merely with an integrability condition. We say that a vector field $f : [0,T] \times \mathbb{R}^d \to \mathbb{R}^d$ satisfies the Ladyzhenskaya-Prodi-Serrin condition (LPS) with exponents $p, q \in (2,\infty)$ if
$$f \in L^q(0,T; L^p(\mathbb{R}^d,\mathbb{R}^d)), \qquad \frac{d}{p} + \frac{2}{q} \le 1.$$
We shall write $f \in LPS(p,q)$ (the precise definition will be given in Section 2.1), and we use the norm
$$\|f\|_{L^q(0,T;L^p)} := \Big( \int_0^T \Big( \int_{\mathbb{R}^d} |f(t,x)|^p \, dx \Big)^{q/p} dt \Big)^{1/q}.$$
We may extend the definition to the limit case $(p,q) = (\infty,2)$ in the natural way: we say that $f \in LPS(\infty,2)$ if $f \in L^2(0,T; L^\infty(\mathbb{R}^d,\mathbb{R}^d))$ and we use the norm
$$\|f\|_{L^2(0,T;L^\infty)} := \Big( \int_0^T \|f(t,\cdot)\|_{L^\infty}^2 \, dt \Big)^{1/2}.$$
[...]
Roughly speaking, our results will hold for a drift b which is the sum of a Lipschitz function of space (with some integrability in time) plus a vector field of LPS class. In the sequel of the introduction, for simplicity of the exposition, we shall not mention the Lipschitz component anymore, which is however important to avoid that the final results are restricted to drift with integrability (or at least boundedness) at infinity.
Let us note that if $p, q \in (2,\infty)$, then the space $L^q([0,T]; L^p(\mathbb{R}^d,\mathbb{R}^d))$ is the closure, in the topology of the norm $\|\cdot\|_{L^q([0,T];L^p)}$, of smooth functions with compact support. The same is true for $C([0,T]; L^d(\mathbb{R}^d,\mathbb{R}^d))$. In the limit cases $(p,q) = (\infty,2)$ and $(p,q) = (d,\infty)$, using classical mollifiers, there exists a sequence of smooth functions with compact support which converges almost everywhere and is uniformly bounded in the corresponding norm. This fact will allow us to follow an approach of a priori estimates, i.e. perform all computations for solutions to the equation with smooth coefficients, obtain uniform estimates for the associated solutions, and then deduce the statement after passage to the limit.
We further want to comment on the significance of the LPS condition in fluid dynamics. The name LPS comes from the authors Ladyzhenskaya, Prodi and Serrin, who identified this condition as a class where regularity and well-posedness of the 3D Navier-Stokes equations hold, see [53,60,72,75,59,47]. The limit case $(p,q) = (d,\infty)$ generated a lot of research and can be treated almost as the other cases if there is continuity in time or some smallness condition, see for instance [10,61], but the full $L^\infty(0,T; L^d(\mathbb{R}^d,\mathbb{R}^d))$ case is very difficult, see [33] and related works. It has been solved only recently, at the price of a very innovative and complex proof. A similar result for our problem is unknown. The deep connection of the LPS class, especially when $\frac{d}{p} + \frac{2}{q} = 1$, with the theory of 3D Navier-Stokes equations is one of our main motivations to analyze stochastic transport under such conditions.
We finally note that, while preparing the second version of this work (after the first version appeared on arXiv), one article [67] and two preprints [81,65] have appeared on the topic of this paper. In the article [67] pathwise (but not path-by-path) uniqueness is shown for the sCE under Krylov-Röckner conditions in the subcritical case. The preprints [81] and [65] go almost up to the critical case for weak and strong solutions to the sDE, the latter showing also Sobolev regularity of the stochastic flow.
Respectively, [...] where $b_\lambda(t,x) = \lambda^{\alpha-1} b(\lambda^\alpha t, \lambda x)$ is the rescaled drift and $W'(\lambda^\alpha t)$ formally denotes the derivative of W at time $\lambda^\alpha t$. We now want to write the stochastic part in terms of a new Brownian motion. For this purpose, we define a process $(W_\lambda(t))_{t \ge 0}$ via $W_\lambda(t) := \lambda^{-\alpha/2} W(\lambda^\alpha t)$ and notice that $W_\lambda$ is a Brownian motion with $W_\lambda'(t) = \lambda^{\alpha/2} W'(\lambda^\alpha t)$. Thus, the previous equation becomes
$$\partial_t u_\lambda(t,x) + b_\lambda(t,x) \cdot \nabla u_\lambda(t,x) + \lambda^{\alpha/2-1} \nabla u_\lambda(t,x) \circ W_\lambda'(t) = 0.$$
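The rescaled equation can be verified by a formal chain-rule computation. The sketch below assumes σ = 1 and the rescaling $u_\lambda(t,x) := u(\lambda^\alpha t, \lambda x)$; this definition does not survive in the text above and is an assumption here, chosen because it is consistent with the stated formula for $b_\lambda$.

```latex
\begin{align*}
\partial_t u_\lambda(t,x) &= \lambda^{\alpha}\,(\partial_t u)(\lambda^\alpha t, \lambda x),
\qquad
\nabla u_\lambda(t,x) = \lambda\,(\nabla u)(\lambda^\alpha t, \lambda x), \\
\partial_t u_\lambda(t,x)
&= -\lambda^{\alpha}\,\big[\, b \cdot \nabla u + \nabla u \circ W' \,\big](\lambda^\alpha t, \lambda x) \\
&= -\lambda^{\alpha-1}\, b(\lambda^\alpha t, \lambda x) \cdot \nabla u_\lambda(t,x)
   - \lambda^{\alpha-1}\, \nabla u_\lambda(t,x) \circ W'(\lambda^\alpha t) \\
&= -\, b_\lambda(t,x) \cdot \nabla u_\lambda(t,x)
   - \lambda^{\alpha/2-1}\, \nabla u_\lambda(t,x) \circ W_\lambda'(t),
\end{align*}
% last step: W'(\lambda^\alpha t) = \lambda^{-\alpha/2} W_\lambda'(t),
% which follows from W_\lambda(t) = \lambda^{-\alpha/2} W(\lambda^\alpha t).
```

The only probabilistic input is Brownian scaling, which guarantees that $W_\lambda$ is again a Brownian motion; everything else is the chain rule, which Stratonovich calculus respects at this formal level.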
We first choose α = 2, so that the stochastic part $\lambda^{\alpha/2-1} \nabla u_\lambda(t,x) \circ W_\lambda'(t)$ is comparable to the derivative in time $\partial_t u_\lambda$. Notice that this is the parabolic scaling, although sTE is not parabolic (but as we will see below, a basic idea of our approach is that certain expected values of the solution satisfy parabolic equations for which the above scaling is the relevant one). Next we require that, for small λ, the rescaled drift $b_\lambda$ becomes small (or at least controlled) in some suitable norm (in our case, $L^q(0,T; L^p(\mathbb{R}^d,\mathbb{R}^d))$). It is easy to see that
$$\|b_\lambda\|_{L^q(0,T/\lambda^2;L^p)} = \lambda^{1-(2/q+d/p)} \|b\|_{L^q(0,T;L^p)}$$
(here, the exponent d comes from rescaling in space and the exponent 2 from rescaling in time and the choice α = 2). In conclusion, we find that
• if the LPS condition holds with strict inequality, then $\|b_\lambda\|_{L^q(0,T/\lambda^2;L^p)} \to 0$ as λ → 0: the stochastic term dominates and we expect a regularizing effect (subcritical case);
• if the LPS condition holds with equality, then $\|b_\lambda\|_{L^q(0,T/\lambda^2;L^p)} = \|b\|_{L^q(0,T;L^p)}$ remains constant: the deterministic drift and the stochastic forcing are comparable (critical case).
This intuitively explains why the analysis of the critical case is more difficult. Notice that if the LPS condition does not hold, then we expect the drift to dominate, so that a general result for regularization by noise is probably false. In this sense, the LPS condition should be regarded as an optimal condition for expecting regularity of solutions.
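The trichotomy can be summarized by the sign of the scaling exponent 1 - (2/q + d/p). The following is a small sketch with illustrative function names (not from the paper); it only evaluates the exponent and does not check that p and q lie in the admissible ranges of the precise definition.

```python
def lps_exponent(d, p, q):
    """Exponent e such that the rescaled drift norm equals lambda**e times the original norm.

    e = 1 - (2/q + d/p), with alpha = 2 (parabolic scaling).
    Infinite exponents may be passed as float("inf").
    """
    return 1.0 - (2.0 / q + d / p)

def lps_regime(d, p, q, tol=1e-12):
    """Classify (p, q) in dimension d by the sign of the scaling exponent."""
    e = lps_exponent(d, p, q)
    if e > tol:
        return "subcritical"      # rescaled drift vanishes as lambda -> 0
    if e >= -tol:
        return "critical"         # rescaled drift norm is scale invariant
    return "supercritical"        # rescaled drift norm blows up as lambda -> 0
```

For instance, in dimension d = 3 the pair (p, q) = (6, 4) gives exponent zero and is classified as critical, as are the limit pairs (∞, 2) and (3, ∞).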

Regularity results for the sPDEs
Concerning regularity, we proceed in a unified approach to attack the sTE and the sCE simultaneously (but for the sCE we have to assume the LPS condition also on div b).
In fact, we shall treat a generalized stochastic equation of transport type which contains both the sTE and the sCE as special cases. For this equation we prove a regularity result which contains as a particular case the following: [...] technical achievement of this paper (see Section 2.2 for a detailed description of the central ingredients of our method).
We now want to give some details on the precise statements, the regularity assumptions on the drift and the strategy of proof for some known regularity results for the sTE, for the purpose of comparison with the results presented here. The paper [43] deals with the case of Hölder continuous bounded drift and is based on the construction of the stochastic flow from [40]. The paper [36] is concerned with the class called in the sequel the Krylov-Röckner class, after [55], where pathwise uniqueness and other results are proved for the sDE. We say that a vector field $f : [0,T] \times \mathbb{R}^d \to \mathbb{R}^d$ satisfies the Krylov-Röckner (KR) condition if the LPS condition holds with strict inequality $\frac{d}{p} + \frac{2}{q} < 1$, and we shall write $f \in KR(p,q)$. The improvement from $\frac{d}{p} + \frac{2}{q} < 1$ to $\frac{d}{p} + \frac{2}{q} = 1$ appeared also in the theory of 3D Navier-Stokes equations and required new techniques (which in turn opened new research directions on $L^\infty(0,T; L^d(\mathbb{R}^d,\mathbb{R}^d))$ regularity). Also here it requires a completely new approach. Under the condition $\frac{d}{p} + \frac{2}{q} = 1$, we do not know how to solve the sDE directly (see however the recent preprints [65,81] mentioned above); even in a weak sense, by Girsanov theorem, the strict inequality seems to be needed ([55,70,52]). As in [43], the proof of regularity of solutions of the sTE from [36] is based on the construction of stochastic flows for the sDE and their regularity in terms of weak differentiability. Finally, [64] and [68] treat the case of bounded measurable drift and, in [68], fractional Brownian motion (the classical work under this condition on pathwise uniqueness for the sDE is [80]), again starting from a weak differentiability result for stochastic flows, proved however with methods different from [36].
Let us mention that proving that noise prevents blow-up or stabilizes the system (in cases where blow-up or instability phenomena are possible in the deterministic situation) is an intriguing problem that is under investigation also for other equations, different from transport ones, see e.g. [7,12,16,24,27,29,32,42,45,51,74].

Uniqueness results for the sPDEs
The second issue of our work is uniqueness of weak solutions to equations of continuity (and transport) type. More precisely, we prove a path-by-path uniqueness result via a duality approach, which relies on the regularity results described in Section 1.6. When uniqueness is understood in a class of weak solutions, then the adjoint existence result must be in a class of sufficiently regular solutions (which is why the assumption for path-by-path uniqueness for the sCE will be the assumption for regularity for the sTE and vice versa); for this reason this approach cannot be applied in the deterministic case, when b is not sufficiently regular.
By path-by-path uniqueness we mean something stronger than pathwise uniqueness, namely that given ω a.s., the deterministic PDE corresponding to that particular ω has a unique weak solution (note that our sPDE can be reformulated as a random PDE, which then can be read in a proper sense at ω fixed). Instead, pathwise uniqueness means that two processes, hence families indexed by ω, both solutions of the equation, coincide for a.e. ω. We prove: [...] A more precise statement is given in Section 3.4 below. No other method is known to produce such a strong result of uniqueness. This duality method in the stochastic setting is the second important technical achievement of this paper.
The intuitive reason why, by duality, one can prove path-by-path uniqueness (usually so difficult to prove) is the following one. The duality approach gives us an identity of the form
$$\langle \rho_t, \varphi \rangle = \langle \rho_0, u^{t,\varphi}_0 \rangle, \qquad (1.1)$$
where $\rho_t$ is any weak solution of the sCE ($\rho_t$ is the density of $\mu_t$) with initial condition $\rho_0$ and $(u^{t,\varphi}_s)_{s \in [0,t]}$ is any regular solution of the sTE rewritten in backward form with final condition φ at time t. As we shall see below, we use an approximate version of (1.1), but the idea we want to explain here is the same. Identity (1.1) holds a.s. in Ω, for any given φ and t. But taking a dense (in a suitable topology) countable set D of φ's, we have (1.1) for a.e. ω, uniformly on D and thus we may identify $\rho_t$. This is the reason why this approach is so powerful to prove path-by-path results. Of course behind this simple idea, the main technical point is the regularity of the solutions to the sTE, which makes it possible to prove an identity of the form (1.1) for weak solutions $\rho_t$, for all those ω's such that $u^{t,\varphi}_s$ is regular enough.
Concerning other uniqueness results for the sTE with poor drift, let us briefly comment on a few of them. In [41] the case of Hölder continuous bounded drift is treated, by means of the differentiable flow associated to the sDE; [66] extends the result and the approach to drifts in KR class with zero divergence. The paper [17] extends the results to the sTE with Hölder continuous drift but driven by fractional Brownian motion, relying again on the flow; the technique used there for the analysis of the sDE itself is instead different from [41] and leads to path-by-path uniqueness. The paper [4] assumes weakly differentiable drift but relaxes the assumption on the divergence of the drift, with respect to the deterministic works [30,2].
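At a purely formal level, the duality identity (1.1) comes from the fact that the pairing of a solution of the sCE with a solution of the backward sTE is constant in time. The following sketch assumes smooth, sufficiently decaying fields, writes $u_s$ for $u^{t,\varphi}_s$, and uses that Stratonovich calculus obeys the ordinary Leibniz rule; it is a heuristic, not the rigorous argument of the paper, which works with an approximate version of (1.1).

```latex
\begin{align*}
d\langle \rho_s, u_s \rangle
&= \langle d\rho_s, u_s \rangle + \langle \rho_s, du_s \rangle \\
&= -\langle \operatorname{div}(b\rho_s), u_s \rangle\, ds
   - \sigma \langle \operatorname{div}(\rho_s \circ dW_s), u_s \rangle
   - \langle \rho_s, b \cdot \nabla u_s \rangle\, ds
   - \sigma \langle \rho_s, \nabla u_s \circ dW_s \rangle \\
&= \langle \rho_s, b \cdot \nabla u_s \rangle\, ds
   + \sigma \langle \rho_s, \nabla u_s \circ dW_s \rangle
   - \langle \rho_s, b \cdot \nabla u_s \rangle\, ds
   - \sigma \langle \rho_s, \nabla u_s \circ dW_s \rangle
 = 0 .
\end{align*}
% Integration by parts moves div onto u_s in the first two terms.
% Integrating from s = 0 to s = t and using u_t = \varphi gives
% \langle \rho_t, \varphi \rangle = \langle \rho_0, u_0 \rangle .
```

The integration by parts is the step that requires regularity of $u_s$ when $\rho_s$ is only a weak solution, which is why the regularity theory for the sTE is the input to the uniqueness theory for the sCE.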
The papers [62,37] use Wiener chaos expansion techniques to obtain uniqueness for the sTE for drifts close to KR class, see [62], or even beyond, see [37], at the price of uniqueness in a smaller class (namely among solutions adapted to the Brownian filtration). A full solution of the uniqueness problem in the KR class was still open (apart from the paper [67] and the recent preprints [65,81] mentioned above) and this is a by-product of this paper, which solves the problem in a stronger sense in two directions: i) path-by-path uniqueness instead of pathwise uniqueness; ii) drift in the LPS class instead of only KR class.
Let us mention that the approach to uniqueness of [4] shares some technical steps with the results described in Section 1.6: renormalization of solutions (in the sense of [30]), Itô reformulation of the Stratonovich equation and then expected value (a Laplacian arises from this procedure). However, in [4] this approach has been applied directly to uniqueness of weak solutions so the renormalization step required weak differentiability of the drift. Instead, here we deal with regularity of solutions and thus the renormalization is applied to regular solutions of approximate problems and no additional assumption on the drift is needed.
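The appearance of the Laplacian can be seen in a short formal computation. The sketch below assumes a smooth solution $u$ of the sTE in the form $du + b \cdot \nabla u\,dt + \sigma \nabla u \circ dW_t = 0$ and enough integrability for the Itô integral to have zero mean; it illustrates the mechanism only for the first moment $v := E[u]$, whereas the regularity proof described in the text is stated for moments of derivatives of the solution.

```latex
\begin{align*}
\sigma\, \partial_{x_i} u \circ dW^i_t
&= \sigma\, \partial_{x_i} u \, dW^i_t + \tfrac{\sigma}{2}\, d\big[ \partial_{x_i} u, W^i \big]_t
 = \sigma\, \partial_{x_i} u \, dW^i_t - \tfrac{\sigma^2}{2}\, \partial_{x_i}^2 u \, dt , \\
du + b \cdot \nabla u \, dt + \sigma \nabla u \cdot dW_t
&= \tfrac{\sigma^2}{2}\, \Delta u \, dt
\qquad \text{(It\^o form, after summing over } i = 1, \dots, d\text{)} , \\
\partial_t v + b \cdot \nabla v
&= \tfrac{\sigma^2}{2}\, \Delta v ,
\qquad v(t,x) := E[u(t,x)] .
\end{align*}
% The covariation uses d(\partial_{x_i} u) = \dots - \sigma\, \partial_{x_i}\partial_{x_j} u \circ dW^j_t,
% and the last line uses that b is deterministic, so E[b \cdot \nabla u] = b \cdot \nabla v.
```

The equation for $v$ is parabolic even though the sTE itself is not, which is consistent with the parabolic scaling used in the heuristic discussion of criticality.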
Finally, we comment on some related uniqueness results in the nonlinear case. The duality technique has been used in [49], for scalar conservation laws with linear transport noise, and in [50], for nonlinear transport noise, but in a different way and without producing a path-by-path uniqueness result. Other results on uniqueness by noise are available with different techniques and/or different choices of noise, e.g. [11] for a dyadic model of turbulence and [6] for a parabolic model.

Results for the sDE
The last issue of our paper is to provide existence, uniqueness and regularity of stochastic flows for the sDE, imposing merely the LPS condition. The strategy here is to deduce such results from the path-by-path uniqueness result established in Section 1.7. To understand the novelties, let us recall that the most general strong well-posedness result for the sDE is due to [55] under the KR condition on b. To simplify the exposition and unify the discussion of the literature, let us consider the autonomous case b(t, x) = b(x) and an assumption of the form $b \in L^p(\mathbb{R}^d,\mathbb{R}^d)$ (depending on the reference, various locality conditions and behavior at infinity are assumed). The condition p > d seems to be the limit case for solvability in all approaches, see for instance [55,52,70,20,77], whether they are based on Girsanov theorem, Krylov estimates, parabolic theory or Dirichlet forms. There are some results on weak well-posedness for measure-valued drifts, see [9], and distribution-valued drifts, see [44,28,15], but it is unclear whether they apply to the limit case p = d: for example, the result in [9], when restricted to measures with density b with respect to the Lebesgue measure, requires p > d, see [9,Example 2.3]. The present paper is the first one to give information on sDEs in the limit case p = d (apart from [65,81]).
Since path-by-path solvability is another issue related to our results, let us mention the paper [26], where the drift is bounded measurable: for a.e. ω, there exists one and only one solution. New results for several classes of noise and drift have been obtained by [18]. In general, the problem of path-by-path solvability of an sDE with poor drift is extremely difficult, compared to pathwise uniqueness which is already nontrivial. Thus, it is remarkable that the approach by duality developed here gives results in this direction.
Our contribution on the sDE is threefold: existence, uniqueness and regularity of Lagrangian flows, pathwise uniqueness from a diffuse initial datum and path-by-path uniqueness from given initial condition. The following subsections detail these three classes of results.

Lagrangian flows
We prove a well-posedness result among Lagrangian flows (see below for more explanations) under the LPS condition on the drift:
Theorem 1.3. Under the LPS condition, for a.e. ω, there exists a unique Lagrangian flow $\Phi^\omega$ solving the sDE at ω fixed. This flow is, at fixed time, $W^{1,m}_{loc}(\mathbb{R}^d,\mathbb{R}^d)$-regular for every finite m; in particular, it has a $C^\alpha(\mathbb{R}^d,\mathbb{R}^d)$ version (at fixed time) for every α < 1.
Uniqueness will follow from uniqueness of the sCE, regularity from regularity of the solution to the sTE. The result is new because our uniqueness result is path-by-path: for a.e. ω, two Lagrangian flows solving the sDE with that ω fixed must coincide (notice that the sDE has a clear path-by-path meaning). A Lagrangian flow Φ, solving a given ODE, is a generalized flow, in the sense of [30,2]: a measurable map $x \mapsto \Phi_t(x)$ with a certain non-contracting property, such that $t \mapsto \Phi_t(x)$ verifies that ODE for a.e. initial condition $x \in \mathbb{R}^d$. However, in general, we do not construct solutions of the sDE in a classical sense, corresponding to a given initial condition $X_0 = x$. In fact, we do not know whether or not strong solutions exist and are unique under the LPS condition with $\frac{d}{p} + \frac{2}{q} = 1$ (while for $\frac{d}{p} + \frac{2}{q} < 1$ strong solutions do exist, see [55]).

Pathwise uniqueness from a diffuse initial datum
We also prove a (classical) pathwise uniqueness result under the LPS condition, when the initial datum has a diffuse law. This is done by exploiting the regularity result of the sTE and by using a duality technique similar to the one mentioned before.
Theorem 1.4. If X 0 is a diffuse random variable (not a single x) on R d , then pathwise uniqueness holds among solutions having diffuse marginal laws (more precisely, such that the law of X t has a density in L ∞ ([0, T ]; L m (R d )), for a suitable m).
Finally we notice that uniqueness of the law of solutions (or at least of their one-dimensional marginals, namely the solutions of Fokker-Planck equations) may hold true for very irregular drift, i.e. $b \in L^2$, if diffuse initial conditions with suitable density are considered; see [38,13].

Results of path-by-path uniqueness from given initial condition
When the regularity results for the stochastic equation of transport type are improved from $W^{1,p}$ to $C^1$ regularity, then the uniqueness results of Section 1.7 for the sCE hold in the very general class of finite measures and are path-by-path uniqueness results. As a consequence, we get an analogous path-by-path uniqueness result for the sDE with classical given initial conditions, a result competitive with [26] and [18]. The main problem is to find assumptions, as weak as possible, on the drift b which are sufficient to guarantee $C^1$ regularity of the solutions. We describe two cases. The first one, which follows the strategy described in Section 1.1, is when the weak derivatives of b (instead of only b itself) belong to the LPS class, that is $\partial_i b \in LPS(p,q)$ for i = 1, . . . , d. However, since this is a weak differentiability assumption, it is less general than expected. The second case is when b is Hölder continuous (in space) and bounded, but here we have to refer to [41,43] for the proof of the main regularity results.
Theorem 1.5. If Db belongs to the LPS class or if b is Hölder continuous (in space), then, for a.e. ω, for every x in $\mathbb{R}^d$, there exists a unique solution to the sDE, starting from x, at ω fixed.
Notice that the "good subset" of Ω is independent of the initial condition x; this is not obvious from the approaches of [18,26], due to the application of Girsanov transformation for a given initial condition.
Let us mention that in [76], generalized in [71] to the case of Lévy noise, path-by-path uniqueness is shown, from a fixed initial condition, for a Hölder continuous drift, using the regularity of the flow. This approach is the translation at the sDE level of the duality technique for the sPDE.

Summary on uniqueness results
Since the reader might not be acquainted with the various types of uniqueness, we summarize here the possible path-by-path uniqueness results and their links.
• Path-by-path uniqueness among trajectories implies path-by-path uniqueness among flows: Assume path-by-path (or pathwise) uniqueness among trajectories and let Φ, Ψ be two flows solving the sDE. Then, for a.e. x in $\mathbb{R}^d$, Φ(x) and Ψ(x) are solutions to the sDE, starting from x, so, by uniqueness, they must coincide and hence Φ = Ψ a.e.
• Path-by-path uniqueness among trajectories implies pathwise uniqueness for deterministic initial data: Assume path-by-path uniqueness among trajectories and let X, Y be two adapted processes which solve the sDE. Then, for a.e. ω, X(ω) and Y(ω) solve the sDE for that fixed ω, so they must coincide and hence X = Y a.e.
• Pathwise uniqueness for deterministic initial data implies pathwise uniqueness for diffuse initial data: Assume pathwise uniqueness for deterministic initial data and let X, Y be two solutions, on a probability space (Ω, A, P), starting from a diffuse initial datum $X_0$. For x in $\mathbb{R}^d$, define the set $\Omega_x = \{\omega \in \Omega : X_0(\omega) = x\}$. Then, for $(X_0)_\# P$-a.e. x, X and Y, restricted to $\Omega_x$, solve the sDE starting from x, so they must coincide and hence X = Y a.e.
• Path-by-path uniqueness among flows (with non-concentration properties) "implies" pathwise uniqueness for diffuse initial data: The quotation marks are here for two reasons: because the general proof is more complicated than the idea below and because the pathwise uniqueness is not among all the processes (with diffuse initial data), but a restriction is needed to transfer the non-concentration property.
Assume path-by-path uniqueness among flows and let X, Y be two solutions on a probability space (Ω, A, P), starting from a diffuse initial datum $X_0$. We give the idea in the case [...] In this case (which is a model for the general case), for Q-a.e. γ, Φ(γ, ·) and Ψ(γ, ·) are flows solving the sDE for that fixed ω. If they have the required non-concentration properties, then, by uniqueness, they must coincide.
Hence uniqueness holds among processes X, with diffuse X 0 , such that X(γ, ·) has a certain non-concentration property; this is the restriction we need.
We will prove: path-by-path uniqueness among Lagrangian flows, when b is in LPS class; path-by-path uniqueness among solutions starting from a fixed initial point, when b and Db are in LPS class or when b is Hölder continuous (in space). We will develop in detail pathwise uniqueness from a diffuse initial datum in Section 5.1 (where the last implication will be proved) and in Section 5.2 (where a somehow more general result will be given).

Examples
In Section 7 we give several examples of equations with irregular drift of two categories: i) on one side, several examples of drifts which in the deterministic case give rise to non-uniqueness, discontinuity or shocks in the flow, while in the stochastic case our results apply and these problems disappear; ii) on the other side, a counterexample of a drift outside of the LPS class, for which even the sDE is ill-posed.
[...] Section 1.7 for the sCE and sTE by duality; then we deduce the results of Section 1.8 for the sDE from such uniqueness results. The fact that regularity for transport equations (with poor drift) is the starting point marks the difference with the deterministic theory, where such kind of results are absent. Hence, the results and techniques of the present paper are not generalizations of deterministic ones.
The two most innovative technical tools developed in this work are the analytic proof of regularity (as stated in Section 1.6) and the path-by-path duality argument yielding uniqueness in this very strong sense. The generality of the LPS condition seems to be unreachable with more classical tools, based on a direct analysis of the sDE. Moreover, in principle some of the analytic steps of Section 1.6 and the duality argument could be applied to other classes of stochastic equations; however, the renormalization step in the regularity proof is quite peculiar to transport equations.
The noise considered in this work is the simplest one, in the class of multiplicative noise of transport type. The reason for this choice is that it suffices to prove the regularization phenomena and the exposition will not be obscured by unnecessary details. However, for nonlinear problems it seems that more structured noise is needed, see [42,29]. So it is natural to ask whether the results of this paper extend to such noise. Let us briefly discuss this issue. The more general sDE takes the form where σ k : R d → R d and W k are independent Brownian motions, and the associated sTE, sCE are now Concerning the assumptions on σ k , for simplicity, think of the case when they are of class C 4 b with proper summability in k. In order to generalize the regularity theory (Section 1.6) it is necessary to be able to perform parabolic estimates, and thus, the generator associated to this sDE must be strongly elliptic; a simple sufficient condition is that the covariance matrix function $Q(x, y) := \sum_{k=1}^{\infty} \sigma_k(x) \otimes \sigma_k(y)$ of the random field η(t, x) =

and its stochastic flow of diffeomorphisms ψ t (x), and use the transformation v(t, x) := u(t, ψ t (x)). This new random field satisfies

[18], where the concept of (ρ, γ)-irregular paths is given by means of Fourier analysis, and it is shown that such paths provide uniqueness for certain classes of non-Lipschitz drifts b (in particular if W is a sample path of the Brownian motion, uniqueness is shown for Hölder continuous drifts). In contrast to the present paper, the techniques used in [18] are based on Young integration, and the results, when specialized to Brownian sample paths, are mostly concerned with Hölder continuous drifts.
While for a general path it is not easy to verify the (ρ, γ)-irregularity condition, one can prove, see [24, Proposition 1.4], that this condition implies that the path must be irregular (non-Lipschitz in time): this corresponds to the fact that a regular path does not regularize an ill-posed ODE, in general. It would be interesting to compare the (ρ, γ)-irregularity notion with the concept of truly rough paths (e.g. [46]), which also quantifies the irregularity of a path. Another, somewhat more explicit, sufficient condition on deterministic paths is given in [22, equation (3.3)], though it is used for the regularization of scalar conservation laws rather than ODEs. There, the regularization of nonlinear PDEs is achieved by means of a noise (here, the derivative of the regularizing path) which is itself nonlinear and precisely multiplies the nonlinearity; see e.g. [22,24,23], and [48] for other pathwise arguments.
Throughout the paper, the drift b is assumed to be deterministic. In view of applications especially to nonlinear equations, it would be very important to extend the result to random drifts. While we do not see obstacles for the extension of the duality technique, being path-by-path in nature, the first step, namely the proof of regularity of solutions for the sTE, does not allow for such generalization: if the drift were random, then the equations for the moments of the derivative of the solution would not form a closed system. This is not simply a limitation of the techniques: there are in fact simple counterexamples to regularization by noise for general random drifts. Let us mention that, in some cases, it is possible to have regularization by noise even for random drifts, see [18] and related work, assuming a suitable Hölder continuity of the drift, or [31,69], assuming Malliavin differentiability of the drift.
Finally, let us note that throughout the rest of the paper, concerning function spaces, we shall use for simplicity the same notation for scalar-valued and vector-valued functions (but it will always be clear from the context if the functions under consideration have values in R d , like b, X, or Φ, or in R, like c or solutions u, v).

Regularity for sTE and sCE
In order to unify the analysis of the sTE and sCE we introduce the stochastic generalized transport equation (sgTE)

$$du + (b \cdot \nabla u + c\,u)\,dt + \sigma \nabla u \circ dW_t = 0, \qquad u|_{t=0} = u_0,$$

where b, σ, u and u 0 are as above for the sTE, c : [0, T ] × R d → R and W is a Brownian motion with respect to a given filtration (G t ) t . We shall prove regularity results for solutions to (sgTE).

Remark 2.1.
We note that the case c = 0 corresponds to (sTE), while the case c = div b corresponds to (sCE), where u stands for the density of the measure µ t with respect to the Lebesgue measure.

Assumptions
Throughout the paper, we assume that (Ω, A, P ) is a probability space and that (G t ) t is a filtration satisfying the standard assumptions, that is, it is complete and right-continuous. The process W denotes a Brownian motion with respect to (G t ) t , unless otherwise specified. Concerning the general equation (sgTE) we will always assume that we are in the purely stochastic case with σ ≠ 0 and that the coefficients b and c satisfy the following decomposition and regularity condition.
(The expression "b is in a certain class A" must be understood componentwise.)

Remark 2.3. The hypotheses on b (2) and c (2) are slightly stronger than the natural ones: we require L 2 integrability in time instead of L 1 . This is mainly due to a technical point which will appear in Section 3. However, with minor modifications, this assumption could be relaxed to L 1 integrability throughout this section.

(j) , where, for every j, b (j) is a vector field satisfying Condition 2.2 with exponents p j , q j that can depend on j; similarly for c. This extension is easy and we refrain from discussing it explicitly.
Remark 2.5. The sTE is just equation (sgTE) with c = 0, and thus we do not write the assumptions for (sTE) explicitly. The sCE instead corresponds to (sgTE) with c = div b, and for completeness let us note that we hence need to assume for (sCE) that we have require the smallness assumption in Condition 2.2, 1c);

Strategy of proof
In order to prove the regularity results, we follow the approach of a priori estimates: we prove regularity estimates for the smooth solutions of approximate problems with smooth coefficients, taking care to show that the constants in these estimates are independent of the approximation; then we deduce the regularity for the solution of the limit problem by passing to the limit.
The strategy of proof consists of several steps which bear some similarities with computations done in the theoretical physics literature on passive scalars, see for instance [19].
First, we differentiate the sgTE (which is possible because we deal with smooth solutions of regularized problems), with the purpose of estimating the derivatives of the solutions. However, terms like ∂ i b k appear. In the deterministic case, unless b is Lipschitz, these terms spoil any attempt to prove differentiability of solutions by this method. In the stochastic case, we shall integrate these bad terms by parts at the price of a second derivative of the solution, which however will be controlled, as it will be explained below.
Second, we use the very important property of transport type equations of being invariant under certain transformations of the solution. For the classical sTE, the typical transformation is u → β(u) where β ∈ C 1 (R): if u is a solution, then β(u) is (at least formally) again a solution. For regular solutions, as in our case, this can be made rigorous; let us only mention that, for weak solutions, this is a major issue, which gives rise to the concept of renormalized solutions [30] (namely those for which β(u) is again a weak solution) and the so called commutator lemma; we do not meet these problems here, in the framework of regular solutions. Nevertheless, to recall the issue, we shall call this step renormalization, namely that suitable transformations of the solution lead to solutions. In our case, since we consider the differentiated sgTE, we work on the level of derivatives of the solution u and therefore we apply transformations to ∂ i u. In order to find a closed system, we have to consider, as transformations, all possible products of ∂ i u, and u itself. This leads to some complications in the book-keeping of indices, but the essential idea is still the renormalization principle.
Third, we reformulate the sPDE from the Stratonovich to the Itô form. The corrector is a second order differential operator. It is strongly elliptic in itself, but combined with the Itô term (containing first derivatives of solutions), it does not give a parabolic character to the equation. The equation is indeed equivalent to the original, hyperbolic (time-reversible) formulation.
Fourth, we take the expectation. This projection annihilates the Itô term and gives a true parabolic equation. The expected value of powers of ∂ i u (or any product of them) solves a parabolic equation, and, as a system in all possible products, it is a closed system. For other functionals of the solution, as the two-point correlation function E[u(t, x)u(t, y)], the fact that a closed parabolic equation arises was a basic tool in the theory of passive scalars [19]. Finally, on the parabolic equation we perform energy-type estimates. The elliptic term puts into play, on the positive side of the estimates, terms like ∇E[(∂ i u) m ]. They are the key tool to estimate those terms coming from the partial integration of ∂ i b k (see the comments above). The good parabolic terms ∇E[(∂ i u) m ] come from the Stratonovich-Itô corrector, after projection by the expected value. This is the technical difference to the deterministic case.
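As a minimal numerical illustration of the fourth step, consider the simplest special case b = 0, c = 0 in dimension one (an assumption made only for this sketch). Then u(t, x) = u_0(x − σW_t) solves du + σ ∂_x u ∘ dW_t = 0, and its expectation w = E[u] solves the heat equation ∂_t w = (σ²/2) ∂_xx w. With the illustrative choice u_0(y) = exp(−y²/2), this expectation is explicit, so a Monte Carlo average can be compared with it. The function names below are hypothetical.

```python
import math
import random


def exact_mean(t, x, sigma):
    """Heat-equation solution w(t, x) = E[u0(x - sigma * W_t)] for u0(y) = exp(-y^2 / 2)."""
    var = 1.0 + sigma ** 2 * t
    return math.exp(-x ** 2 / (2.0 * var)) / math.sqrt(var)


def monte_carlo_mean(t, x, sigma, n_samples=100_000, seed=0):
    """Average u0(x - sigma * W_t) over independent samples of W_t ~ N(0, t)."""
    rng = random.Random(seed)
    std = math.sqrt(t)
    total = 0.0
    for _ in range(n_samples):
        y = x - sigma * rng.gauss(0.0, std)
        total += math.exp(-0.5 * y * y)
    return total / n_samples


if __name__ == "__main__":
    t, x, sigma = 1.0, 0.5, 1.0
    # The two numbers should agree up to Monte Carlo error.
    print(monte_carlo_mean(t, x, sigma), exact_mean(t, x, sigma))
```

Each sample path of u is merely a translate of u_0 and is not smoothed at all; only the average over the noise is diffused, which is the parabolic effect exploited in the estimates.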

Preparation
The following preliminary lemma is essentially known, although maybe not explicitly written in full detail in the literature; we shall therefore sketch the proof. As explained in the last section, given non-smooth coefficients, we shall approximate them with smooth ones. Their role is only to allow us to perform certain computations on the solutions (such as the Itô formula, finite expected values, finite integrals on R d and so on). More precisely, the outcome of the next lemma is $C^\infty_c$-estimates (infinitely differentiable with compact support in all variables) in space for all times, for the solutions corresponding to the equation with smooth (regularized) coefficients. However, we emphasize that these estimates are not uniform in the approximations, in contrast to our main regularity estimates concerning Sobolev-type regularity established later on in Theorem 2.7.
, then there exists an adapted solution u of equation (sgTE) with paths of class C([0, T ]; C ∞ c (R d )) (where the support of u depends on the path ). We have for every α ≥ 0 and r ≥ 1. Moreover, we have and for every r, R ≥ 1 Proof.
Step 1: Existence of a solution. Under the assumption b ∈ C ∞ c ([0, T ] × R d ), equation (sDE) has a pathwise unique strong solution X x t for every given x ∈ R d . As proved in [57], the random field X x t has a modification Φ t (x) which is a stochastic flow of diffeomorphisms of class C ∞ (since b is infinitely differentiable with bounded derivatives). Moreover, in view of [58, Theorem 6.1.9] we know that, given u 0 ∈ C ∞ c (R d ), the process (which has paths of class C([0, T ]; C ∞ c (R d )) by the properties of Φ −1 t ) is an adapted strong solution to (sgTE). Inequality (2.3) then follows from (2.5).
Step 2: Regularity of the solution. For the flow Φ t (x) we have the simple inequality This bound will be used below. For the derivative of the flow with respect to the initial and thus, since Db is bounded, where C 1 ≥ 1 is a deterministic constant. The same is true for higher derivatives and for the inverse flow. This proves inequality (2.2) for α > 0, while the inequality for α = 0 comes from (2.3).
Concerning the claim (2.4), for α = 0 and t ∈ [0, T ] we have c(s,·) ∞ds . Combined with (2.6) this implies (2.4) for α = 0 since u 0 has compact support. The proof of (2.4) for α = 1, 2 is similar: we first differentiate u by using the explicit formula (2.5) and get several terms, then we control them by means of boundedness of c and its derivatives, boundedness of derivatives of direct and inverse flow, and the change of variable formula used above for α = 0. The computation is lengthy but elementary. For instance, we have Hence, we obtain

Main result on a priori estimates
In the sequel we take the regular solution given by Lemma 2.6 and prove a priori estimates. For the formulation of the result, let us introduce a C 1 -function χ : R d → [0, ∞) satisfying

$$|\nabla \chi(x)| \le C_\chi \frac{\chi(x)}{1 + |x|} \qquad (2.8)$$

for some constant C χ > 0. For example, we might take $\chi(x) = (1 + |x|^2)^{s/2}$, which satisfies $|\nabla \chi(x)| \le 2|s|\chi(x)/(1 + |x|)$ for every s ∈ R (all cases s < 0, s = 0, and s > 0 are of interest). The associated norm $\|u_0\|_{W^{1,r}_\chi}$ where we have used the notation ∂ 0 f = f .
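The bound for the power-type weight follows from |∇χ(x)| = |s| |x| (1 + |x|²)^{s/2 − 1} together with the elementary inequality |x|(1 + |x|) ≤ 2(1 + |x|²). A small numerical sanity check, written in terms of the radius r = |x|, could look as follows; the helper names are hypothetical and the check is only a sketch, not part of the proof.

```python
import math


def chi(r, s):
    """Power-type weight chi(x) = (1 + |x|^2)^(s/2), expressed through r = |x|."""
    return (1.0 + r * r) ** (s / 2.0)


def grad_chi_norm(r, s):
    """|grad chi(x)| = |s| * r * (1 + r^2)^(s/2 - 1) for r = |x|."""
    return abs(s) * r * (1.0 + r * r) ** (s / 2.0 - 1.0)


def weight_bound_holds(r, s):
    """Check |grad chi| <= 2 |s| chi / (1 + r) at radius r."""
    return grad_chi_norm(r, s) <= 2.0 * abs(s) * chi(r, s) / (1.0 + r)


if __name__ == "__main__":
    radii = [0.1 * k for k in range(1001)]
    exponents = [-8, -1, -0.5, 0, 0.5, 1, 3, 8]
    print(all(weight_bound_holds(r, s) for r in radii for s in exponents))
```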
, let m be a positive integer, let σ ≠ 0, and let χ be a function satisfying (2.8). Assume that b and c are a vector field and a scalar field, respectively, such that Moreover, the constant C can be chosen to have continuous dependence on m, d, σ, χ, p, q and on the L q ([0, T ]; L p (R d )) norms of b (1) and c (1) , on the L 1 ([0, T ]; C 1 lin (R d )) norm of b (2) , and on the L 1 ([0, T ]; C 1 b (R d )) norm of c (2) . The result holds also for (p, q) = (d, ∞) with the additional hypothesis that the L ∞ ([0, T ]; L d (R d )) norms of b (1) and c (1) are smaller than δ, see Condition 2.2, 1c) (in this case the continuous dependence of C on these norms is up to δ).

Corollary 2.8. With the same notation as in the previous theorem, if m is an even integer, then for every s ∈ R there exists a constant C depending also on s (in addition to the dependencies from the theorem) such that

Proof. Via Hölder's inequality we have as a consequence of Theorem 2.7 for a suitable constant C > 0.

Remark 2.9. Such power-type weights play a crucial role for later applications. Therefore, let us note that, for every s ∈ R and m ∈ (1, ∞), space. We can show this, for instance, by observing that the dual of L m Hence, the L m spaces with these weights are reflexive, which directly carries over to the weighted Sobolev spaces since they are closed subspaces via the mapping f → (f, the Banach-Alaoglu theorem is at our disposal.

The next subsections are devoted to the proof of the a priori estimate of the theorem. At the end, they will be used to construct a (weaker) solution corresponding to nonsmooth data. Thus, in the sequel, u refers to a smooth solution, with smooth and compactly supported data b, u 0 .

Formal computation
This section serves as a formal explanation of the first main steps of the proof, those based on renormalization, passage from the Stratonovich to the Itô formulation, and taking the expectation. A precise statement and proof are given in Section 2.6.
The aim of the following computations is to write, given any positive integer m, a closed system of parabolic equations for the quantities $E[\prod_{i \in I} \partial_i u]$, where I varies in the finite multi-indices with elements in {0, 1, . . . , d} of length at most m. In principle, we need only the quantities $E[(\partial_i u)^m]$ for i = 1, . . . , d, but they do not form a closed system.

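Equation (sgTE) is formally of the form

$$Lu + cu = 0, \qquad Lf := \partial_t f + b \cdot \nabla f + \sigma \nabla f \circ \dot{W}.$$

Being a first order differential operator, L formally satisfies the Leibniz rule

$$L(fg) = f\,Lg + g\,Lf. \qquad (2.9)$$

This is the step that we call renormalization, following [30]: in the language of that paper, if β : R → R is a C 1 -function and if v is a solution of Lv = 0, then formally Lβ(v) = 0, and solutions which satisfy this rule rigorously are called renormalized solutions. Property (2.9) is a variant of this idea. We apply the renormalization to the first derivatives $v_i := \partial_i u$, i = 1, . . . , d. One has ∂ i (Lu + cu) = 0 and thus

$$L v_i = -(\partial_i b \cdot \nabla u + c\,v_i + u\,\partial_i c), \qquad i = 1, \dots, d.$$

With the notation v 0 = u we also have Lv 0 = −cv 0 .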
In the sequel, I will be a finite multi-index with elements in {0, 1, . . . , d}, namely an element of $\bigcup_{m \in \mathbb{N}} \{0, 1, \dots, d\}^m$. If $I \in \{0, 1, \dots, d\}^m$ we set |I| = m. Given a function h : {0, 1, . . . , d} → R, $\sum_{i \in I} h(i)$ means the sum over all the components of I (counting repetitions), and similarly for the product. The multi-index I \ i means that we drop in I a component of value i; the multi-index I \ i ∪ k means that we substitute in I a component of value i with a component of value k; which component i is dropped or replaced does not matter because we consider only expressions of the form $\sum_{i \in I} h(i)$ and similar ones.
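The multi-index conventions above can be made concrete with a small sketch. Since only symmetric expressions in the components of I are ever used, one possible representation (an assumption of this sketch, not the paper's) is the sorted tuple of components; all function names are hypothetical.

```python
def normalize(I):
    """Represent a multi-index by its sorted tuple, since only symmetric expressions in I matter."""
    return tuple(sorted(I))


def drop(I, i):
    """I \\ i: remove one component of value i."""
    items = list(I)
    items.remove(i)  # raises ValueError if i does not occur in I
    return normalize(items)


def substitute(I, i, k):
    """I \\ i U k: replace one component of value i by a component of value k."""
    return normalize(drop(I, i) + (k,))


def product_over(I, h):
    """Product of h(i) over all components i of I, counting repetitions."""
    result = 1
    for i in I:
        result *= h(i)
    return result


if __name__ == "__main__":
    I = (2, 0, 2, 1)
    print(drop(I, 2))           # one of the two components equal to 2 is removed
    print(substitute(I, 2, 3))  # one component equal to 2 is replaced by 3
```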
in view of the Leibniz rule (2.9). Now, the equations for v i differ depending on whether i = 0 or i ∈ {1, . . . , d}. The term cv i appears in all of them, but not ∂ i b · ∇u + u∂ i c. Hence, we find Next we want to take the expected value. The problem is the Stratonovich term σ∇v I •Ẇ in Lv I . Rewriting it as an Itô term with correction, we get where the brackets [·, ·] denote the joint quadratic variation. Since dv I has as local Taking (formally) the expectation in the equation for v I , we arrive at This is the first half of the proof of Theorem 2.7, which will be carried out rigorously in Section 2.6. The second half is the estimate on w coming from the parabolic nature of this equation, which will be established in Section 2.7.
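The Stratonovich-to-Itô step and the subsequent averaging can be sketched as follows for a generic smooth random field $v$. This is only a formal sketch: the symbol $g$, standing for whatever remaining drift terms are present, is introduced here for illustration and does not appear in the text.

```latex
% Formal sketch; g collects the remaining drift terms (illustrative notation).
\begin{align*}
  & dv + (b \cdot \nabla v + g)\,dt + \sigma \nabla v \circ dW_t = 0, \\
  & \sigma \nabla v \circ dW_t
      = \sigma \nabla v \cdot dW_t + \frac{\sigma}{2} \sum_{k=1}^{d} d[\partial_k v, W^k]_t
      = \sigma \nabla v \cdot dW_t - \frac{\sigma^2}{2} \Delta v \, dt, \\
  & dv + (b \cdot \nabla v + g)\,dt + \sigma \nabla v \cdot dW_t = \frac{\sigma^2}{2} \Delta v \, dt, \\
  & \partial_t E[v] + b \cdot \nabla E[v] + E[g] = \frac{\sigma^2}{2} \Delta E[v].
\end{align*}
```

The second line uses that the martingale part of $d(\partial_k v)$ is $-\sigma \nabla \partial_k v \cdot dW_t$; the last line uses that $b$ is deterministic and that the Itô integral has zero mean. The Laplacian is the corrector term responsible for the parabolic estimates.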

Rigorous proof of (2.10)
We work with the regular solution u given by Lemma 2.6 and we use the notations I, I \ i ∪ k, v i , v I , w I as introduced in the previous section. We observe that, since u is smooth in x, the v i 's and their spatial derivatives are well-defined. Moreover, due to inequality (2.2), also the expected values w I 's are well-defined and smooth in x.

Proof. The solution provided by Lemma 2.6 is a pointwise regular solution to (sgTE), namely it satisfies with probability one the identity By the definition of r i , for the last integral on the left-hand side of (2.11), it holds Furthermore, before taking expectations, we want to pass in (2.11) from the Stratonovich to the Itô formulation. To this end, we first note (again by [57,Theorem 7.10]) that for a bounded process g. Hence, for the stochastic integral in (2.11) we find All other terms have also finite expectation, due to estimate (2.2) of Lemma 2.6. Hence, taking expectation, we have This identity implies that w I (t, x) is continuously differentiable in t and that equation (2.10) holds. The proof of the lemma is complete.

Estimates for the parabolic deterministic equation
The system for the w I 's is a parabolic deterministic linear system of partial differential equations. In this section we will obtain energy estimates for w I which will allow us to obtain the desired a priori bounds. The fact that we have a system instead of a single equation will not affect the estimate (to have an idea of what the final parabolic estimate should be, one could think that w I is independent of I).
For every smooth function χ : R d → [0, ∞) as in the statement of the Theorem 2.7 we multiply the identity (2.10) by χw I and get where, for a shorter notation, we have set b 0 := c.
The term with ∂ i b k would spoil all our efforts of proving estimates depending only on the LPS norm of the coefficients, but fortunately we may integrate by parts that term. This is the first key ingredient of this second half of the proof of Theorem 2.7. The second key ingredient is the presence of the term $\sigma^2 \int_0^t \int_{\mathbb{R}^d} \chi |\nabla w_I|^2 \, dx \, ds$, ultimately coming from the passage from the Stratonovich to the Itô formulation plus taking expectation; this allows us to ask as little as possible on the drift b to close the estimates: we may keep first derivatives of w I on the right-hand side of the previous identity, in contrast to the deterministic case.
Before starting with the estimates, we recall that b = b (1) + b (2) and c = c (1) + c (2) are assumed to be the sum of two smooth vector fields. Since the desired estimates in Theorem 2.7 differ for the rough part b (1) and the regular (but possibly with linear growth) part b (2) , we now split b and c and use the integration by parts formula, in the following way: when a term with ∂ i b (1) appears, we bring the derivative on the other terms; when we have b (2) multiplied by the derivative of some object, we bring the derivative on b (2) . In this way we obtain where we have defined To estimate these terms we essentially use the following consequence of Hölder's inequality for functions f, g, h defined over [0, T ] × R d such that the relevant integrals are welldefined. Moreover, we shall use at several instances the estimate (2.8) on |∇χ|, and we further drop the notation (s, x) inside the integrals. For the first term we now employ inequality (2.12) with f ≡ 1 (the special case of Hölder's inequality), g = √ εχ|∇w I | and h = 2σ 2 ε −1 χ|w I | for an arbitrary positive number ε > 0 to find Similarly for the second term, we have Similarly as for the terms A I,1 and A I,1 , we now proceed for the remaining terms A (2) I,2 and A I,2 , with the main difference that w I eventually needs to be replaced with w I\i∪k .
In this way, we get Given m ∈ N, we abbreviate Collecting the previous estimates and summing over |I| = m, we have proved so far (since every J of length m is counted m(d + 1) times in the previous sum); so here we have C m,d = m(d + 1). We can then continue to estimate (using Hölder inequality for m|c (1) |) and find we end up with the preliminary estimate
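The combinatorial factor m(d + 1) can be confirmed by brute force for small m and d. The sketch below treats multi-indices as ordered tuples in {0, ..., d}^m and enumerates all ways of replacing one component of I by a value k; the function name is hypothetical.

```python
from collections import Counter
from itertools import product


def substitution_multiplicities(m, d):
    """Count how often each ordered multi-index J of length m arises from some I by replacing one component by k."""
    values = range(d + 1)
    counts = Counter()
    for I in product(values, repeat=m):
        for position in range(m):
            for k in values:
                J = I[:position] + (k,) + I[position + 1:]
                counts[J] += 1
    return counts


if __name__ == "__main__":
    m, d = 3, 2
    counts = substitution_multiplicities(m, d)
    # Every J should be hit exactly m * (d + 1) times.
    print(set(counts.values()), m * (d + 1))
```

The count is the same for every J because, for each of the m positions, the replaced component of I is arbitrary (d + 1 choices) while k is forced to equal the corresponding component of J.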

End of the proof of Theorem 2.7
Starting from the previous inequality (2.13), we can now continue to estimate its right-hand side by taking into account the LPS-condition on b and c. To this end, we need to distinguish the three cases (p, q) = (∞, 2), (p, q) ∈ (2, ∞) and (p, q) = (d, ∞). The main difficulty will be to estimate the term From the resulting inequality we can then conclude the proof of Theorem 2.7 via the Gronwall lemma. For the sake of simplicity, let us first restrict ourselves to the important particular case where b (1) and c (1) can be estimated in the L ∞ -topology.
Let us come to the general case. Notice that it is only here, for the first and only time, that the exponents (p, q) of the LPS condition enter. By $\|\cdot\|_{W^{1,2}}$ and $\|\cdot\|_{L^p}$ we denote the usual norms in $W^{1,2}(\mathbb{R}^d)$ and $L^p(\mathbb{R}^d)$ respectively. We first prove the following technical lemma, which will be relevant to continue with the estimate for the terms on the right-hand side of inequality (2.13) for the general case p ≠ ∞.
Lemma 2.11. If p > d ∨ 2, then for every ε > 0 there is a constant C ε > 0, depending only on p, d and ε, such that for all f, with a constant C d > 0 depending only on d.
Proof. Let us start by recalling the Gagliardo-Nirenberg interpolation inequality on R d for d ≠ 2: for every 0 ≤ β ≤ 1 and α ≥ 2 which satisfy the following holds: there exists a constant C > 0 depending only on β and d such that The result is true also for d = 2 but requires the additional condition β < 1. We apply this inequality with β = d/p, α = 2p/(p − 2). The assumptions of the Gagliardo-Nirenberg inequality are satisfied because β ≤ 1 for d ≠ 2 and β < 1 for d = 2. Then

Now we come to the proof of the lemma. We first apply Hölder's inequality with exponents p/2 and p/(p − 2) and then the previous inequality to find and thus, we have found a constant C ε such that (2.15) holds. This concludes the proof.
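The choice of exponents can be checked by hand. The following is a sketch of one standard way to combine the two inequalities, assuming the Gagliardo-Nirenberg inequality in the form $\|g\|_{L^\alpha} \le C \|g\|_{W^{1,2}}^{\beta} \|g\|_{L^2}^{1-\beta}$ with $1/\alpha = 1/2 - \beta/d$; the precise form of (2.15) may differ from the last line below.

```latex
% Exponent bookkeeping for beta = d/p, alpha = 2p/(p-2), with p > max(d, 2).
\begin{align*}
  \frac{1}{\alpha} &= \frac{p-2}{2p} = \frac12 - \frac1p = \frac12 - \frac{\beta}{d}, \\
  \int_{\mathbb{R}^d} |f|^2 |g|^2 \, dx
    &\le \|f\|_{L^p}^2 \, \|g\|_{L^{\alpha}}^2
    && \text{(H\"older with exponents } p/2 \text{ and } p/(p-2)\text{)} \\
    &\le C \, \|f\|_{L^p}^2 \, \|g\|_{W^{1,2}}^{2\beta} \, \|g\|_{L^2}^{2(1-\beta)}
    && \text{(Gagliardo-Nirenberg)} \\
    &\le \varepsilon \, \|g\|_{W^{1,2}}^2 + C_\varepsilon \, \|f\|_{L^p}^{2/(1-\beta)} \, \|g\|_{L^2}^2
    && \text{(Young with exponents } 1/\beta \text{ and } 1/(1-\beta)\text{)}.
\end{align*}
```

Here $2/(1-\beta) = 2p/(p-d)$, which equals $q$ exactly when $2/q + d/p = 1$; this is how the critical LPS relation between $p$ and $q$ enters the time integrability of the coefficient.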
The previous interpolation Lemma 2.11 now allows us to continue with the proof of Theorem 2.7 in the remaining cases.
Proof of Theorem 2.7 in the case (p, q) ∈ (2, ∞). We start by observing for some constant C depending only on χ, d, m. Therefore, the application of inequality (2.15) to the terms of the second line of inequality (2.13) shows for some constant C ε > 0. We use this inequality and the similar one for c (1) in the second line of inequality (2.13) and get, for ε small enough and by means of the Gronwall lemma (applicable because of the inequality (2.17)), a bound of the form (2.14). With the final arguments used above in the case (p, q) = (∞, 2), this completes the proof of Theorem 2.7 in the case p, q ∈ (2, ∞).
Proof of Theorem 2.7 in the case (p, q) = (d, ∞). In this case we apply inequality (2.16) to the terms of the second line of inequality (2.13) to find and an analogous inequality for the term with c (1) . We then estimate ∇θ m 2 L 2 as above If the smallness condition is satisfied, we may again apply the Gronwall lemma and the other computations above to conclude the proof of Theorem 2.7 also in the remaining case (p, q) = (d, ∞).

Existence of global regular solutions for sTE and sCE
In this section we deduce, from the a priori estimates of Theorem 2.7, the existence of global regular solutions for the stochastic generalized transport equation (sgTE) and consequently also for the stochastic transport equation (sTE) and the stochastic continuity equation (sCE). This can be interpreted, at least for the sTE, as a no-blow-up result. Uniqueness will be treated separately in the next section, see also Remark 2.18 below.
In what follows, we assume that the LPS-integrability condition on b, c with exponents p ∈ [d, ∞] and q ∈ [2, ∞] as stated in Section 2.1 is satisfied. We further denote by p' = p/(p − 1) the conjugate exponent of p (with p' = 1 if p = ∞). We now start by defining the notion of solutions of class $L^\theta(W^{1,m}_{\mathrm{loc}})$ of equation (sgTE), for some θ ≥ 2 and m ≥ p'.
To this end, we require first of all some measurability and continuous semimartingale properties for terms appearing in (sgTE) after testing against smooth functions. We say that a map u : Secondly, we need all relevant integrals to be well-defined. Due to the choice θ ≥ 2 we have well-defined stochastic integrals; hence we only need to take care that b(s) · ∇u(s) and c(s)u(s) (2) into (roughly) a vector field of LPS class and a Lipschitz function, we first note that with θ ≥ 2 we also have θ ≥ q/(q − 1) (recalling q ≥ 2). Therefore, s → ⟨b(s) · ∇u(s) − c(s)u(s), ϕ⟩ is integrable according to the choice m ≥ p' (here, the symbol ⟨·, ·⟩ stands for the usual inner product in L 2 (R d )).
These introductory comments now motivate the following definition.
As mentioned above we will now prove the existence of such solutions by exploiting the a priori Sobolev-type estimates for solutions to approximate equations with smooth coefficients. The crucial point is that the estimates only depend on the LPS norms of the coefficients b and c, but not on the approximation itself. Hence, from the regular solutions to these approximate equations we may then pass to a limit function which still has the same Sobolev-type regularity, provided that the approximate coefficients remain bounded in these norms. In a second step we then need to verify that the limit function is indeed a solution to the original equation in the sense of Definition 2.13.
Concerning the approximation of the coefficients, we first observe that, since b and c are assumed to belong to the LPS class (satisfying Condition 2.2), we may choose sequences (b ε ) ε , (c ε ) ε which verify the following assumptions: ε , such that: a.e. and in LP S, in the following sense: if p, q ∈ (2, ∞), then b (1) ) norms are small enough, uniformly in ε (in case of Condition 2.2 1c), this follows from the previous assumption); a.e. and in L 2 (C 1 lin ), in the following sense: b ε ) ε and f + b (2) with (b (2) ε ) ε (analogously c), which in turn ensures that Condition 2.14 is fulfilled, in particular the smallness of the norm of b (1) ε .
Remark 2.16. Notice that, in any case of Condition 2.2 (or also for more general b's), the family (b ε ) ε converges to b in L 1 ([0, T ]; L m (B R )); the same holds for c.

Theorem 2.17. Let m ≥ 2 be an even integer and let s ∈ R. Assume that b, c satisfy Condition 2.2 and let $u_0 \in W^{1,2m}_{(1+|\cdot|)^{2s+d+1}}(\mathbb{R}^d)$. Then there exists a solution u of equation (sgTE) of class $L^m(W^{1,m}_{\mathrm{loc}})$, which further satisfies $u(t, \cdot) \in W^{1,m}_{(1+|\cdot|)^{s}}(\mathbb{R}^d)$ for a.e. (t, ω).
Step 1: Compactness argument. Take (b ε ) ε and (c ε ) ε as in Condition 2.14; take (u ε 0 ) ε as C ∞ c (R d ) approximations of the initial datum u 0 , converging to it a.e. in R d and in W 1,2m (1+|·|) 2s+d+1 (R d ). Let u ε be the regular solution to (sgTE) corresponding to coefficients b ε , c ε instead of b, c, and with initial value u ε 0 , given by Lemma 2.6. From Corollary 2.8 (notice that, in the limit case p = d, b (1) ε is small enough in view of Condition 2.14), by Remark 2.9, we can extract a subsequence (for simplicity the whole sequence), which converges weakly- * to some u in that space; in particular, weak convergence in L m ([0, T ] × Ω; W 1,m (B R )) holds for every R > 0, i.e. Definition 2.13 (i).
Step 3: Passage from Stratonovich to Itô and vice versa. It will be useful to notice that the last two requirements, namely the semimartingale (ii) and the solution property (iii), in Definition 2.13 can be replaced by the following Itô formulation: integrability assumptions (with m, θ ≥ 2), and it is equal to where the brackets [·, ·] again denote the quadratic covariation. The semimartingale decomposition of u(t), ∂ i ϕ is taken from the equation for u (just use ∂ i ϕ instead of ϕ): and so we deduce Definition 2.13 (iii) from (2.21).
Step 4: Verification of the equation. We want to show that u satisfies (sgTE), in the sense of distributions. In view of Step 3, we can use the Itô formulation (2.21). Fix ϕ in C ∞ c (R d ) with support in B R . We already know from Step 2 that ⟨u ε (t), ϕ⟩ − ⟨u ε 0 , ϕ⟩ converges to ⟨u(t), ϕ⟩ − ⟨u 0 , ϕ⟩ weakly in L 2 ([0, T ] × Ω). We will prove that also the other terms in (2.21) converge, weakly in L 1 ([0, T ] × Ω). The idea for the convergence is the following: assume we have a linear continuous map G = G(u) between two Banach spaces and a bilinear map F = F (b, u) mapping from two suitable Banach spaces into a third one; then, if b ε converges to b strongly and u ε converges to u weakly in the associated topologies, G(u ε ) and F (b ε , u ε ) converge weakly to G(u) and F (b, u), respectively.
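The weak-strong convergence principle invoked here rests on a one-line splitting. A sketch, assuming that $F$ is bounded in the sense $\|F(b, u)\| \le C \|b\| \, \|u\|$ and writing $\psi$ for an arbitrary element of the dual of the target space (notation introduced only for this sketch):

```latex
\begin{align*}
  \langle F(b_\varepsilon, u_\varepsilon) - F(b, u), \psi \rangle
    &= \langle F(b_\varepsilon - b, u_\varepsilon), \psi \rangle
     + \langle F(b, u_\varepsilon - u), \psi \rangle, \\
  |\langle F(b_\varepsilon - b, u_\varepsilon), \psi \rangle|
    &\le C \, \|b_\varepsilon - b\| \, \sup_{\varepsilon} \|u_\varepsilon\| \, \|\psi\| \to 0, \\
  \langle F(b, u_\varepsilon - u), \psi \rangle &\to 0 .
\end{align*}
```

The supremum is finite because weakly convergent sequences are bounded, and the last limit holds because $u \mapsto F(b, u)$ is linear and continuous, hence weakly continuous.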
For the term $\int_0^t \langle b(s) \cdot \nabla u(s), \varphi \rangle \, ds$, we take For this purpose, we notice that, by the Fubini-Tonelli theorem, The convergence of the right-hand side now follows easily, since Y is in L 1 ([0, T ]; L m (B R × Ω)) and ∇u ε converges weakly-* to ∇u in L ∞ ([0, T ]; L m (B R × Ω)) (by Step 1), as ε → 0. This finishes the proof of convergence for F , and the convergence of the term $\int_0^t \langle c(s) u(s), \varphi \rangle \, ds$ is established analogously.
For the term $\int_0^t \langle u(s), \partial_i \varphi \rangle \, dW^i_s$, we define G : G is a linear continuous map, hence weakly continuous. Therefore, as a consequence of the weak convergence of ⟨u ε (s), ∂ i ϕ⟩ to ⟨u(s), ∂ i ϕ⟩ in L 2 ([0, T ] × Ω), we find that also $\int_0^t \langle u^\varepsilon(s), \partial_i \varphi \rangle \, dW^i_s$ converges weakly (to the obvious limit) in L 2 ([0, T ] × Ω). The convergence of the last terms in (2.21) is easier. Thus, the limit function u satisfies the identity (2.21), i.e. it is a solution to (sgTE) in the Itô sense, and via Step 3 it then satisfies the properties (ii) and (iii) in Definition 2.13. This concludes the proof of Theorem 2.17.

The previous result holds for (sgTE) and therefore it covers the sTE by taking c = 0. The case of the sCE requires c = div b and therefore it is better to state the assumptions explicitly. The divergence is understood in the sense of distributions.

W 2,m -regularity
In this section we are interested in the existence of solutions to equation (sgTE) of higher regularity, more precisely of local W 2,m -regularity in space. To this end we shall essentially follow the strategy of the local W 1,m -regularity in space presented above.
First, we consider second order derivatives of equation (sgTE) (instead of first ones) for the smooth solutions of approximate problems with smooth coefficients and derive a parabolic (deterministic) equation for averages of second order derivatives. For this reason we have to assume some LPS condition not only on the coefficients b and c, but also on their first space derivatives. Once the parabolic equation is derived, we may proceed analogously to above, that is, via the parabolic theory we establish a priori regularity estimates involving second order derivatives, and finally we pass to the limit to get the regularity statement. Let us now start to clarify the assumptions of this section. As motivated above, we roughly assume that in addition to the coefficients b and c also their first order derivatives ∂ k b and ∂ k c (for k = 1, . . . , d) satisfy the assumptions of Section 2.1. More precisely, we assume Condition 2.20. The coefficients b and c can be written (1) , c (2) , and for every k ∈ {0, 1, . . . , d} each of the decompositions Note that if Condition 2.2 1b) or 1c) applies, then we need to assume in addition d ≥ 3.
We start by deriving, in the smooth setting, suitable a priori estimates involving second order derivatives of the regular solution, following the strategy of Theorem 2.7.

Lemma 2.21. Let $p, q$ be in $(2, \infty)$ satisfying $\frac{2}{q} + \frac{d}{p} \le 1$ or $(p, q) = (\infty, 2)$, let $m$ be a positive integer, let $\sigma \neq 0$, and let $\chi$ be a function satisfying (2.8). Assume that $b$ and $c$ are a vector field and a scalar field, respectively, such that $b = b^{(1)} + b^{(2)}$ and $c = c^{(1)} + c^{(2)}$. Then there exists a constant $C$ such that, for every $u_0$ in $C_c^\infty(\mathbb{R}^d)$, the smooth solution $u$ of equation (sgTE) starting from $u_0$, given by Lemma 2.6, verifies the corresponding a priori estimate for its space derivatives up to second order. Here, the constant $C$ can be chosen similarly as in Theorem 2.7, now depending also on the $L^q([0,T]; L^p(\mathbb{R}^d))$ norms of $\partial_k b^{(1)}$ and $\partial_k c^{(1)}$, on the $L^1([0,T]; C^1_{\mathrm{lin}}(\mathbb{R}^d))$ norms of $\partial_k b^{(2)}$, and on the $L^1([0,T]; C^1_b(\mathbb{R}^d))$ norms of $\partial_k c^{(2)}$, for all $k \in \{0, 1, \ldots, d\}$. The result holds also for $(p, q) = (d, \infty)$, provided that the $L^\infty([0,T]; L^d(\mathbb{R}^d))$ norms of $\partial_k b^{(1)}$ and $\partial_k c^{(1)}$ are sufficiently small (depending only on $m$, $\sigma$ and $d$) for all $k \in \{0, 1, \ldots, d\}$, see Condition 2.2, 1c).
Sketch of proof. Let us start again from the formal computation: using the abbreviations $v_i := \partial_i u$ and $\nu_{ij} := \partial_j \partial_i u$ (thus $\nu_{ij} = \nu_{ji}$) for $i, j \in \{0, 1, \ldots, d\}$ (and with $\partial_0$ the identity operator), we first differentiate (sgTE) once to obtain the equations for the $v_i$. Differentiating once more, we find for $i, j \in \{1, \ldots, d\}$ an identity for $\nu_{ij}$. Hence, setting again $b_0 := c$, we end up with a system of equations for the $\nu_{ij}$.

We next would like to pass to products of the $\nu_{ij}$'s. To this end we consider $K$ to be an element in $\bigcup_{m\in\mathbb{N}} (\{0, 1, \ldots, d\} \times \{0, 1, \ldots, d\})^m$ and set $|K| = m$ if $K \in (\{0, 1, \ldots, d\} \times \{0, 1, \ldots, d\})^m$. Moreover, we may assume $i \ge j$ for every $(i, j) \in K$. As before, $K \setminus (i, j)$ means that we drop one component in $K$ of value $(i, j)$, and similarly $K \setminus (i, j) \cup (k, \ell)$ now means that we substitute in $K$ one component of value $(i, j)$ by one of value $(k, \ell)$ if $k \ge \ell$, or by one of value $(\ell, k)$ otherwise. Again, which component is dropped does not matter because in the end we will only consider expressions which depend on the total number of the single components, but not on their numbering. We now set $\nu_K := \prod_{(i,j)\in K} \nu_{ij}$, and we then infer from the previous equations satisfied by the $\nu_{ij}$, via the Leibniz rule and by distinguishing the cases $j \neq 0$, $i > j = 0$ and $i = j = 0$, an equation for $\nu_K$.

Rewriting the Stratonovich term in $L\nu_K$ via $\sigma \nabla \nu_K \circ dW = \sigma \nabla \nu_K \cdot dW - \frac{\sigma^2}{2} \Delta \nu_K \, dt$ and (formally) taking the expectation, we then obtain that $\omega_K := E[\nu_K]$ satisfies the equation (2.23). This system of equations is of the same structure as the system (2.10) derived for the averages of products of first order space derivatives of the solution $u$, with the only difference that now also second order derivatives of the coefficients appear. Analogously to Section 2.6, one can make the previous computations rigorous for the regular solution of Lemma 2.6 to (sgTE), i.e. the functions $\omega_K(t, x)$ are continuously differentiable in time and satisfy the pointwise equation (2.23). From here on, we can proceed completely analogously to the proof of Theorem 2.7, since the structure of the system is essentially the same, even though there are more terms involved (note that $\chi$ was only introduced after having derived the parabolic equation, hence no second order derivatives of $\chi$ appear in the computations). This finishes the sketch of the proof.
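As a sanity check of the Itô–Stratonovich correction used above, one may look at the simplest toy case of a pure-noise transport equation. The choice $b = 0$, $c = 0$ and a smooth datum $u_0$ is an assumption made here only for illustration and is not the setting of the lemma. In this case the correction turns the Stratonovich equation into an Itô equation with a Laplacian, and taking expectations formally gives the heat equation:

```latex
% Toy case (assumption for illustration only): b = 0, c = 0, u_0 smooth.
\begin{align*}
  du + \sigma \nabla u \circ dW_t &= 0,
    \qquad u(t,x) = u_0(x - \sigma W_t), \\
  du + \sigma \nabla u \cdot dW_t - \tfrac{\sigma^2}{2} \Delta u \, dt &= 0
    && \text{(It\^o form)}, \\
  \partial_t E[u] &= \tfrac{\sigma^2}{2} \Delta E[u]
    && \text{(the It\^o integral has zero mean)}.
\end{align*}
```

The same mechanism, applied to the products $\nu_K$, is what produces the second order term in the parabolic system for the averages $\omega_K$.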
With the previous lemma we can then deduce the existence of a global regular solution for the stochastic generalized equation. To this end, we introduce, in analogy to the first order case, approximations $b_\varepsilon$, $c_\varepsilon$ of the coefficients. We further choose a $C_c^\infty(\mathbb{R}^d)$-approximation $(u_0^\varepsilon)_\varepsilon$ of the initial values $u_0$ with respect to $W^{2,2m}_{(1+|\cdot|)^{2s+d+1}}(\mathbb{R}^d)$ and denote by $u_\varepsilon$ the regular solution to (sgTE) given by Lemma 2.6, corresponding to coefficients $b_\varepsilon$, $c_\varepsilon$ and initial values $u_0^\varepsilon$.
We now take $\chi = (1 + |x|)^{2s+d+1}$ in the previous lemma and then deduce from Hölder's inequality, as in Corollary 2.8, a bound with a constant $C$ which does not depend on the particular approximation, but only on its norms; therefore this bound holds uniformly in $\varepsilon \in (0, 1)$. From this stage we can follow the strategy of the proof of Theorem 2.17. Indeed, the previous inequality yields that the family $(u_\varepsilon)_\varepsilon$ is bounded in $L^\infty([0,T]; L^m(\Omega; W^{2,m}(B_R)))$ for every $R > 0$.

Hence, there exists a subsequence weakly-$*$ convergent to a limit process $u$ in this space.
This yields the asserted Sobolev-type regularity involving derivatives up to second order, while the fact that u is indeed a solution to (sgTE) with coefficients b, c was already established in the proof of Theorem 2.17.

Remark 2.23.
In a similar way one can show higher order Sobolev regularity of type $W^{\ell,m}_{\mathrm{loc}}$, provided that $b$ and $c$ are more regular, in the sense that they can be decomposed into $b^{(1)} + b^{(2)}$ and $c^{(1)} + c^{(2)}$ such that each derivative of these decompositions up to order $\ell - 1$ satisfies Condition 2.2. However, it remains an interesting open question to prove a similar result for fractional Sobolev spaces.

Path-by-path uniqueness for sCE and sTE
The aim of this section is to prove a path-by-path uniqueness result for both the stochastic transport equation (sTE) and the stochastic continuity equation (sCE). Since we deal with weak solutions, where an integration by parts is necessary at the level of the definition, the general stochastic equation (sgTE) is not the most convenient one. Let us consider a similar equation in divergence form

$$du + (\operatorname{div}(bu) + cu)\,dt + \sigma \operatorname{div}(u \circ dW_t) = 0, \qquad u|_{t=0} = u_0. \tag{3.1}$$

(ii) the sCE is included in (3.1), with $u$ as density of the measure $\mu_t$ with respect to the Lebesgue measure; (iii) the sTE is included in (3.1), by formally setting $c = -\operatorname{div} b$ (which then gives rise to a restriction on $\operatorname{div} b$ for this equation).
We recall from the introduction that all path-by-path uniqueness results rely heavily on the regularity results achieved in the previous section. For this reason we will always assume Condition 2.2 of Section 2.1 to be in force, which allows us to decompose the vector fields $b$ and $c$ into rough parts $b^{(1)}$ and $c^{(1)}$ under an LPS condition and more regular parts $b^{(2)}$ and $c^{(2)}$ under an integrability condition in time (only here the $L^2$-integrability in time is required, cp. Remark 2.3). Concerning the LPS condition, we still denote the exponents by $p, q \ge 2$ and the conjugate exponent of $p$ by $p'$. We will consider the purely stochastic case $\sigma \neq 0$ throughout this section.
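To see why equality in the LPS condition $\frac{2}{q} + \frac{d}{p} \le 1$ is called critical, one can use the usual parabolic scaling heuristic. The rescaled drift $b_\lambda$ and the rescaled processes below are notation introduced only for this sketch (with $\sigma = 1$), not objects from the text:

```latex
% Scaling heuristic (standard; notation b_\lambda, X^\lambda, W^\lambda is ours).
% If dX = b(t,X) dt + dW, set X^\lambda_t := \lambda^{-1} X_{\lambda^2 t},
% W^\lambda_t := \lambda^{-1} W_{\lambda^2 t} (again a Brownian motion).
\begin{align*}
  dX^\lambda_t &= b_\lambda(t, X^\lambda_t)\, dt + dW^\lambda_t,
    \qquad b_\lambda(t,x) := \lambda\, b(\lambda^2 t, \lambda x), \\
  \| b_\lambda \|_{L^q([0, T/\lambda^2]; L^p(\mathbb{R}^d))}
    &= \lambda^{\,1 - \left(\frac{2}{q} + \frac{d}{p}\right)}
       \| b \|_{L^q([0,T]; L^p(\mathbb{R}^d))} .
\end{align*}
% Example: d = 3, p = 6, q = 4 gives 2/4 + 3/6 = 1, so the norm is scale invariant.
```

When $\frac{2}{q} + \frac{d}{p} < 1$ the norm of $b_\lambda$ becomes small on small scales ($\lambda \to 0$), so the noise dominates; in the case of equality no smallness is gained by zooming in.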
We can now introduce the concept of a weak solution of the stochastic equation (3.1), in analogy to Definition 2.13 (in particular, it is easily verified that all integrals are well-defined by the integrability assumptions on the vector fields $b$ and $c$ and on the weak solution). We recall that $(\mathcal{G}_t)_{t\in[0,T]}$ is a filtration satisfying the standard assumptions and that $W$ denotes a Brownian motion with respect to $(\mathcal{G}_t)_t$.

(o) it is weakly progressively measurable with respect to $(\mathcal{G}_t)_t$;
(i) it is in $L^m([0,T] \times B_R \times \Omega)$ for every $R > 0$;
(ii) $t \mapsto \langle u(t), \varphi\rangle$ has a modification which is a continuous semimartingale, for every $\varphi$ in $C_c^\infty(\mathbb{R}^d)$;
(iii) for every $\varphi$ in $C_c^\infty(\mathbb{R}^d)$, for this continuous modification (still denoted by $\langle u(t), \varphi\rangle$) the weak form of (3.1) holds, with probability one, for all $t \in [0, T]$.

The previous definition can be given with different degrees of integrability in time and space, namely for solutions of class $L^\theta(L^m_{\mathrm{loc}})$ with $\theta \ge 2$ and $m \ge p$ (cp. Definition 2.13). We take $\theta = m$ only to minimize the notation.
Since our aim is to establish the stronger results of path-by-path uniqueness, we first give a path-by-path formulation of (3.1). Let us recall that we started with a probability space $(\Omega, \mathcal{A}, P)$, a filtration $(\mathcal{G}_t)_{t\ge0}$ (satisfying the standard assumptions), and a Brownian motion $(W_t)_{t\ge0}$. We now choose, without restriction, a version of $W_t$ which is continuous for every $\omega \in \Omega$. Given $\omega \in \Omega$, considered here as a parameter, we define $\tilde b(\omega, t, x) := b(t, x + \sigma W_t(\omega))$ and $\tilde c(\omega, t, x) := c(t, x + \sigma W_t(\omega))$.
We shall sometimes write $\tilde b^\omega$ and $\tilde c^\omega$ for $\tilde b(\omega, \cdot, \cdot)$ and $\tilde c(\omega, \cdot, \cdot)$, respectively, in order to stress the parameter character of $\omega$. With this new notation we now consider the deterministic PDE (3.2), parametrized by $\omega \in \Omega$, in the unknown $\tilde u^\omega \colon [0, T] \times \mathbb{R}^d \to \mathbb{R}$. Notice that we have employed here time-dependent test functions. This is only for technical convenience (we will use such functions in the following), and the definition with autonomous test functions could be shown to be equivalent to Definition 3.3. Equation (3.2) will be considered as the path-by-path formulation of (3.1). The reason is the content of Proposition 3.4.

Proof of Proposition 3.4. The idea of the proof is given by a formal computation, using the Itô formula (in Stratonovich form). Since this does not work rigorously when $u$ is not regular, one could try to apply the change of variable formula on the test function rather than on $u$ itself, i.e. taking $\tilde\varphi(t, x) = \varphi(t, x - \sigma W_t)$ (which is smooth) as test function in equation (3.1) for $u$ and then use a change of variable to get equation (3.2) for $\tilde u$, with $\varphi$ as test function. The problem is that $\tilde\varphi$, besides being time-dependent, is not deterministic (but Definition 3.1 only allows deterministic test functions). Thus, we proceed by approximation. The idea is the following: taking a family $(\rho_\varepsilon)_\varepsilon$ of standard symmetric, compactly supported mollifiers, we first use a shifted version of $\rho_\varepsilon$ as test function, to get an equation for the mollification $u_\varepsilon := u * \rho_\varepsilon$ for fixed $x$; having regularity of $u_\varepsilon$, we can derive a formula for $u_\varepsilon(t, x)\varphi(t, x - \sigma W_t)$. After integrating in $x$, taking the limit $\varepsilon \to 0$ and a change of variable, we finally get an equation for $\tilde u$, still in a weak formulation.
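The formal computation behind the change of variable can be sketched as follows. This is only a heuristic under the assumption that $u$ is smooth in $x$, so that the Stratonovich chain rule applies; the notation $\tilde u(t,x) := u(t, x + \sigma W_t)$, $\tilde b(t,x) := b(t, x + \sigma W_t)$, $\tilde c(t,x) := c(t, x + \sigma W_t)$ is used:

```latex
% Formal computation (assumes u smooth in x; not rigorous for weak solutions).
\begin{align*}
  d\tilde u(t,x)
    &= (du)(t, x + \sigma W_t)
       + \sigma \nabla u(t, x + \sigma W_t) \circ dW_t \\
    &= -\big[\operatorname{div}(bu) + cu\big](t, x + \sigma W_t)\, dt
       && \text{(by (3.1), the noise terms cancel)} \\
    &= -\big[\operatorname{div}(\tilde b\, \tilde u)
       + \tilde c\, \tilde u\big](t,x)\, dt .
\end{align*}
```

In other words, the shift by $\sigma W_t$ absorbs the Stratonovich term and leaves a PDE with random coefficients but no stochastic integral, which is why the path-by-path formulation is deterministic for each fixed $\omega$.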
For simplicity of notation, we set c = 0 and σ = 1, but all the arguments are valid with immediate extension also in the general case.
Step 1: For fixed $\varphi \in C^1([0, T]; C_c^\infty(\mathbb{R}^d))$, the mollifications $u_\varepsilon$ satisfy the identity (3.5) for a.e. $(t, x, \omega)$, and all the addends have modifications that are measurable in $(t, x, \omega)$ (these are the modifications considered in (3.5)). We fix a measurable map $u$ (not an equivalence class), so that by Fubini's theorem convolutions of $u$ are measurable maps in $(t, x, \omega)$. For fixed $x \in \mathbb{R}^d$, we apply Definition 3.1 of a weak solution with test function $\varphi = \rho_\varepsilon(x - \cdot) \in C_0^\infty(\mathbb{R}^d)$, getting an equation, referred to as (3.6), for a modification $u(\rho_\varepsilon(x - \cdot))$ of $u_\varepsilon(x) = u * \rho_\varepsilon(x) = \langle u, \rho_\varepsilon(x - \cdot)\rangle$ (here the notation $\nabla_\cdot \rho_\varepsilon(x - \cdot)$ means the derivative with respect to the $\cdot$ variable, with $x$ fixed), where we also have passed from the Stratonovich to the Itô stochastic integral. Applying Itô's product formula to $u(\rho_\varepsilon(x - \cdot))$ and $\varphi(t, x - W_t)$, we find that $P$-a.s. an identity, referred to as (3.7), holds for every $t \in [0, T]$. Since, for fixed $x$, $u(\rho_\varepsilon(x - \cdot)) = u_\varepsilon(x)$ for a.e. $(t, \omega)$, we can replace $u(\rho_\varepsilon(x - \cdot))$ with $u_\varepsilon(x)$ inside the integrals, which implies (3.5) for all $(t, \omega)$ in a full-measure set $A_x$, possibly depending on $x$. Note that, up to this point, we have not used any measurability in $x$. Finally, (3.5) holds for a.e. $(t, x, \omega)$: if it failed on a set of positive measure, then, its addends being modifications of those of (3.7), also (3.7) would not hold on this set, which is a contradiction, cf. Remark 4.8.
Step 2: For fixed $\varphi \in C^1([0, T]; C_c^\infty(\mathbb{R}^d))$, $\tilde u$ has the solution property (3.3) a.s. We may now integrate, for a.e. $(t, \omega)$, the identity (3.5) with respect to $x$, obtaining the identity (3.8), where we have also used the classical Fubini as well as the stochastic Fubini theorem to exchange the order of integration. Employing once again Fubini's theorem to bring the convolution on $\varphi(t, \cdot - W_t)$, we get, for a.e. $(t, \omega)$, an identity in which the mollifier acts only on the test function. Letting $\varepsilon \to 0$, since $u$, $bu$ are in $L^1([0, T]; L^1_{\mathrm{loc}}(\mathbb{R}^d))$ for a.e. $\omega$, we can pass to the limit for a.e. $(t, \omega)$. By the change of variable $\tilde x = x - W_t$, we therefore end up with the claimed solution property (3.9), for every $(t, \omega)$ in a full measure set $F_\varphi$, which may still depend on $\varphi$.
Step 3: Removal of the dependency on the test function $\varphi$. In order to conclude the proof of the proposition, we need to make the "good" full measure set, where $\tilde u$ satisfies the solution property, independent of $\varphi$. For this purpose, we use a density argument. Let $D$ be a countable set in $C^1([0, T]; C_c^\infty(\mathbb{R}^d))$, which is dense in $C^1([0, T]; C_b^2(\mathbb{R}^d))$, and set $F = \bigcap_{\varphi\in D} F_\varphi$. Then $F$ is a full measure set and the identity (3.9) holds for every $(t, \omega) \in F$ and $\varphi \in D$; after possibly passing to a smaller full-measure set $F$ we can also assume $\tilde u^\omega \in L^m([0, T]; L^m_{\mathrm{loc}}(\mathbb{R}^d))$ (thus fulfilling Definition 3.3 (i)). Now, for a generic test function $\varphi \in C^1([0, T]; C_c^\infty(\mathbb{R}^d))$, we take a sequence $(\varphi_n)_{n\in\mathbb{N}}$ in $D$, satisfying equation (3.9) and converging to $\varphi$ in $C^1([0, T]; C_b^2(\mathbb{R}^d))$; by the dominated convergence theorem, we can pass to the limit in the equation, for $(t, \omega) \in F$, getting (3.9) for $\varphi$. Hence, for a.e. $(t, \omega)$, (3.3) holds and the right-hand side defines the continuous representative.
Since some technical measurability arguments are delicate in the above proof (based mostly on the classical Fubini and stochastic Fubini theorems), we want to give alternative proofs of Step 1 and formula (3.8) at the beginning of Step 2, which rely on a direct exchange-of-integrals formula obtained by continuity of approximations.

Then the family of stochastic integrals $\int_0^t f(r, x)\,dW_r$, parametrized by $x$, admits a modification which is measurable in $(t, x, \omega)$, for every $x$ progressively measurable in $(t, \omega)$, and for a.e. $\omega$ locally Hölder continuous in $(t, x)$. This can be proven by Kolmogorov's continuity criterion in $(t, x)$ for the stochastic integrals (joint measurability is a consequence of progressive measurability and continuity in $(t, x)$). Moreover, for such a modification, we have for a.e. $\omega \in \Omega$ the exchange formula $\int_{\mathbb{R}^d} \int_0^t f(r, x)\,dW_r\,dx = \int_0^t \int_{\mathbb{R}^d} f(r, x)\,dx\,dW_r$ for every $t \in [0, T]$, provided the integrals are well-defined (for example, if $f$ is compactly supported). This is a consequence of the stochastic Fubini theorem but can be proved without it. For this purpose, we first observe that by continuity of the stochastic integrals in $(t, x)$, we can approximate, for fixed $t$, for a.e. $\omega \in \Omega$, the left-hand side $\int_{\mathbb{R}^d} \int_0^t f(r, x)\,dW_r\,dx$ with a finite Riemann sum (in $x$) of stochastic integrals. We then notice that we can approximate the inner integral $\int_{\mathbb{R}^d} f(r, x)\,dx$ in $L^2([0, T] \times \Omega)$ with a finite Riemann sum (in $x$), and as a consequence, we can approximate, for fixed $t$, the right-hand side $\int_0^t \int_{\mathbb{R}^d} f(r, x)\,dx\,dW_r$ in $L^2(\Omega)$ with the stochastic integral of a finite Riemann sum (in $x$).
At the level of these approximating sums we can finally exchange sum and stochastic integral, and passing to the limit we get the equality above.
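The Riemann sum argument can be summarized in one line. The cubes $Q^n_k$ with centers $x^n_k$ forming a partition of the (compact) support of $f$ in $x$ are notation assumed here for the sketch:

```latex
% Sketch: Q^n_k are cubes of a partition with mesh -> 0, x^n_k their centers
% (hypothetical notation). For finite sums, linearity of the It\^o integral gives
\begin{align*}
  \sum_k |Q^n_k| \int_0^t f(r, x^n_k)\, dW_r
    &= \int_0^t \sum_k |Q^n_k|\, f(r, x^n_k)\, dW_r , \\
  \text{left-hand side} &\xrightarrow[n\to\infty]{}
    \int_{\mathbb{R}^d} \int_0^t f(r,x)\, dW_r\, dx
    && \text{a.s. (continuity in } x\text{)}, \\
  \text{right-hand side} &\xrightarrow[n\to\infty]{}
    \int_0^t \int_{\mathbb{R}^d} f(r,x)\, dx\, dW_r
    && \text{in } L^2(\Omega) \text{ (It\^o isometry)} .
\end{align*}
```

Since the two limits are taken in different senses, one identifies them along a subsequence converging almost surely.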
Alternative proof of Step 1 above. As before, we fix a measurable map u (not an equivalence class), so that, by Fubini's theorem, all the convolutions with u are measurable maps in (t, x, ω), regular in x for a.e. (t, ω) fixed. Then, for fixed x ∈ R d , our starting point is the modification u(ρ ε (x − ·)) of u ε (x) = u * ρ ε (x) = u, ρ ε (x − ·) satisfying (3.6) and (3.7). Replacing u(ρ ε (x − ·)) with u ε (x) inside the integrals of (3.7) (as before), we get for a.e. ω, for every t, For the stochastic integrals, the integrands u(s) * ∇ρ ε (x)ϕ(s, x−W s ) and u ε (s, x)∇ϕ(s, x− W s ) are measurable in (t, x, ω), progressively measurable for every fixed x, and they also belong to L 2 ([0, T ] × Ω; C 1 loc (R d )). Therefore, by Step 0, there exist "nice" modifications of the stochastic integrals. Using these modifications, we get for every x, for a.e. (t, ω) (where the exceptional set possibly depends on x) precisely the formula (3.5). Moreover, since all the addends are measurable in (t, x, ω) by construction, this equality is true for a.e. (t, x, ω) (otherwise we would find positive measure sets A x in [0, T ] × Ω, for some x, where the equality above would not hold).
Alternative justification of (3.8). As before we again integrate (3.5) in $x$, for a.e. $(t, \omega)$, but at this stage we may then use Fubini's theorem to exchange the integrals in $ds$ and $dx$, while we may use Step 0 to exchange the integrals in $dW_s$ and $dx$.

Remark 3.6. One can ask why such a change of variable works and if this is simply a trick. Actually this is not the case: as we will see in Section 4, this change of variable corresponds to looking at the random ODE $d\tilde X^\omega = \tilde b^\omega(t, \tilde X^\omega)\,dt$.
A similar change of variable can be done also for more general diffusion coefficients, see the discussion in the Introduction, Subsection 1.9.
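The correspondence between the sDE and the random ODE mentioned in Remark 3.6 is a one-line computation on the integral form of the equation. Here $x$ denotes the initial point and we assume $W_0 = 0$:

```latex
% Change of variable \tilde X_t := X_t - \sigma W_t, for fixed \omega.
\begin{align*}
  X_t &= x + \int_0^t b(s, X_s)\, ds + \sigma W_t \\
  \Longleftrightarrow \quad
  \tilde X_t &= x + \int_0^t b(s, \tilde X_s + \sigma W_s)\, ds
              = x + \int_0^t \tilde b^\omega(s, \tilde X_s)\, ds .
\end{align*}
```

No stochastic integral is left after the shift, so for each fixed path of $W$ the second equation is an ordinary differential equation with a time-dependent, rough vector field.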

The duality approach in the deterministic case
To prove uniqueness for equation (3.2), we shall follow a duality approach. It is convenient to recall the idea in a deterministic case first, especially in view of condition (3.14) further below. For the sake of illustration, we give here a Hilbert space description, even though the duality approach will be developed later in the stochastic case in a more general set-up.
Assume we have a Hilbert space $H$ with inner product $\langle\cdot, \cdot\rangle_H$ and two Hilbert spaces $D_A$, $D_{A^*}$ which are continuously embedded in $H$, $D_A \subset H$ and $D_{A^*} \subset H$.

Let us repeat this scheme (still considering the case $u_0 = 0$), when a regularized version of the dual equation is used. Assume we have a sequence of (smooth) approximations of equation (3.12). Arguing again from (3.11), we obtain the corresponding identity with an additional error term due to the approximation. If, for every $t_f \in [0, T]$ and $v_0 \in D$, the convergence property (3.14) holds, then we again conclude with $u = 0$, which proves uniqueness for equation (3.10).
A property of the form (3.14) will be a basic tool in the sequel.
The difficulty in applying this method rigorously lies in the regularity of $v$ (or in a uniform control of the regularity of $v_\varepsilon$). For deterministic transport and continuity equations with rough drift, one cannot solve the dual equation (3.12) in a sufficiently regular space. Thus, the regularity results of Section 2 are the key point of this approach, specific to the stochastic case.
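The duality scheme described in this subsection can be sketched as follows. The sign conventions, the operators $A(t)$, $A_\varepsilon(t)$ and the dense set $D$ are assumptions made for this illustration and need not coincide with the exact form of (3.10)–(3.14):

```latex
% Illustrative conventions (assumed): forward equation with u(0) = 0,
% backward dual equation with final datum v_0 at time t_f.
\begin{align*}
  &\partial_t u = A(t) u, \quad u(0) = 0;
   \qquad \partial_t v = -A(t)^* v, \quad v(t_f) = v_0; \\
  &\frac{d}{dt} \langle u, v \rangle_H
     = \langle A u, v \rangle_H - \langle u, A^* v \rangle_H = 0
   \;\Longrightarrow\;
   \langle u(t_f), v_0 \rangle_H = \langle u(0), v(0) \rangle_H = 0 . \\[4pt]
  % Regularized dual equation: \partial_t v_\varepsilon = -A_\varepsilon(t)^* v_\varepsilon.
  &\langle u(t_f), v_0 \rangle_H
     = \int_0^{t_f} \big\langle u(s),
       \big(A(s)^* - A_\varepsilon(s)^*\big) v_\varepsilon(s) \big\rangle_H \, ds .
\end{align*}
```

If the right-hand side of the last identity tends to zero as $\varepsilon \to 0$ for every $t_f$ and every $v_0$ in a dense set $D$, then $u(t_f) = 0$; this is the role played by a condition of the type (3.14), and it requires uniform control of $v_\varepsilon$ in a norm strong enough to pair with $u$.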

Dual sPDE and random PDE
Let us recall that we started with a probability space $(\Omega, \mathcal{A}, P)$, a (complete and right-continuous) filtration $(\mathcal{G}_t)_{t\ge0}$ and a Brownian motion $(W_t)_{t\ge0}$. Given $t_f \in [0, T]$ (which will be the final time), we consider the process $B_t := W_t - W_{t_f}$, $t \in [0, t_f]$ (3.15), and the family of $\sigma$-fields, for $t \in [0, t_f]$, $\mathcal{F}_t := \sigma\{B_s : s \in [t, t_f]\} \vee \mathcal{N}$ (3.16), where $\mathcal{N}$ is the set of $P$-null sets in $\mathcal{A}$. The family $(\mathcal{F}_t)_{t\in[0,t_f]}$ is a backward filtration, in the sense that $\mathcal{F}_{t_1} \subset \mathcal{F}_{t_2}$ if $t_1 > t_2$. The process $B$ is a "backward Brownian motion", or a "Brownian motion in the reversed direction of time", with respect to the filtration $(\mathcal{F}_t)_t$:
• $B_{t_f} = 0$ a.s., and $t \mapsto B_t$ is a.s. continuous (in fact, for all $\omega \in \Omega$ by our choice of $W_t$);
• $B_{t-h} - B_t$ has law $N(0, h)$ and is independent of $\mathcal{F}_t$, for every $t \in [0, t_f]$ and $h \in (0, t]$.

For the dual backward stochastic equation we define weak or $W^{1,r}$ solutions in the same way as in the forward case. In fact, instead of this equation, we shall use its regularized version (3.17), where $b_\varepsilon$, $c_\varepsilon$, for $\varepsilon > 0$, and $v_0$ are $C_c^\infty$ functions. First, for every $\varepsilon > 0$, this equation has a smooth solution with the properties described in Lemma 2.6. Second, we have the analog of Theorem 2.7 and Corollary 2.8:

Corollary 3.7. Let $m$ be an even integer and let $s$ be in $\mathbb{R}$. Assume that $b$, $c$ satisfy Condition 2.2, and let $b_\varepsilon$, $c_\varepsilon$ be $C_c^\infty([0, T] \times \mathbb{R}^d)$ functions satisfying Condition 2.14. Then there exists a constant $C$ independent of $\varepsilon$ such that, for every $v_0 \in C_c^\infty(\mathbb{R}^d)$, the smooth solution $v_\varepsilon$ of equation (3.17) verifies the corresponding weighted bound for all $\varepsilon \in (0, 1)$.

Moreover, the analog of Proposition 3.4 holds, but some care is needed with the notation: the result is stated with the substitution $x + \sigma B_t$. To check that this substitution is correct, we should repeat step by step the proof of Proposition 3.4 in the backward case; since this is lengthy, let us only convince ourselves with a formal computation, similar to (3.4), which would be rigorous if $W$ (hence $B$) and $v$ were smooth. In this computation the final datum ends up being evaluated at $x + \sigma W_{t_f}$, and therefore:

Corollary 3.9. With the notation $\tilde v_\varepsilon(t, x) := v_\varepsilon(t, x + \sigma W_t)$, the function $\tilde v_\varepsilon$ satisfies identity (3.19).

So $\tilde v_\varepsilon$ solves the dual equation to (3.2) (more precisely, the regularized version of the dual equation), but with a randomized final condition at time $t_f$. Having in mind the scheme of the previous section, we have found the operator $A^*_\varepsilon(t)$.

Duality formula
The aim of this section is to prove the duality formula (3.20), in order to repeat the ideas described in Section 3.1. Notice that, by the explicit formula (2.5), smooth solutions of equation (sgTE) with smooth, compactly supported initial data, and therefore also the smooth solution $v_\varepsilon(\omega, t, x)$ of the backward stochastic equation (3.17) with smooth, compactly supported final data, are compactly supported in space, with support depending on $(t, \omega)$. The same is true for the function $\tilde v_\varepsilon(\omega, t, x) := v_\varepsilon(\omega, t, x + \sigma W_t(\omega))$ (since we have assumed that $W_t$ is continuous for every $\omega \in \Omega$). We shall write $v^\omega_\varepsilon$ and $\tilde v^\omega_\varepsilon$ for these functions, respectively, for a given $\omega \in \Omega$. Before going on, we need to give a meaning to equation (3.3) for every $t$, for a certain fixed (i.e. independent of $\varphi$) modification of $\tilde u^\omega$ (see Remark 3.5). To this end, we establish the next lemma, in which we denote by $B_b$ the set of bounded Borel functions and $H^{-1}(B_R) := (W^{1,2}_0(B_R))^*$.
Unless otherwise stated, we will use this representative $\tilde u^\omega$ (and the validity of formula (3.3) for every $t \in [0, T]$).
Proof. We will omit the superscript $\omega$ in the following. In order to construct the representative, we fix $t \in [0, T]$ and define the functional $F_t$ on $C_c^\infty(\mathbb{R}^d)$ through the right-hand side of (3.3), evaluated at time $t$ for time-independent test functions $\varphi$. By our integrability assumption on $b$ and $c$, if $\varphi$ has support in $B_R$, then $|\langle F_t, \varphi\rangle|$ is bounded by $C_R \|\varphi\|_{W^{1,2}}$, with a constant $C_R$ which is independent of $t$. Therefore, for any $R > 0$, $F_t$ can be extended to a linear continuous functional on $W^{1,2}_0(B_R)$, with norms uniformly bounded in $t$.
Let us verify that $F$ is a representative of $\tilde u$. By equation (3.3), for every time-independent test function $\varphi$ in $C_c^\infty(\mathbb{R}^d)$, there exists a full $\mathcal{L}^1$-measure set $A_\varphi$ in $[0, T]$ such that, for all $t$ in $A_\varphi$, $\langle F_t, \varphi\rangle$ coincides with $\langle \tilde u_t, \varphi\rangle$. Hence, for a countable dense set $D$ in $C_c^\infty(\mathbb{R}^d)$, $F_t$ and $\tilde u_t$ must coincide for all $t$ in $\bigcap_{\varphi\in D} A_\varphi$, which is still a full measure set in $[0, T]$.
It remains to prove that $F$ satisfies the identity (3.3) for time-dependent test functions $\varphi$. To this end we notice that, since $F$ is a representative of $\tilde u$, it must verify (3.3) for a.e. $t \in [0, T]$. Moreover, $t \mapsto \langle F_t, \varphi(t)\rangle$ is continuous, which follows from the uniform (in time) bound of the $H^{-1}$ norm of $F$: in fact, splitting $\langle F_t, \varphi(t)\rangle - \langle F_s, \varphi(s)\rangle = \langle F_t - F_s, \varphi(t)\rangle + \langle F_s, \varphi(t) - \varphi(s)\rangle$, we see that, when $s \to t$, $|\langle F_t - F_s, \varphi(t)\rangle| \to 0$ as a consequence of the definition of $F$, and also $|\langle F_s, \varphi(t) - \varphi(s)\rangle| \to 0$, since $\varphi(s) \to \varphi(t)$ and $\sup_{s\in[0,T]} \|F_s\|_{H^{-1}(B_R)} \le C_R$ for every $R > 0$. Since the right-hand side of (3.3) is continuous in time as well, we conclude that (3.3) holds in fact for every $t \in [0, T]$, and the proof of the lemma is complete.
With this "weakly continuous" representative, we can now state the duality formula for approximations.
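Assume that $\omega$ is such that $\tilde v^\omega_\varepsilon \in C^1([0, T]; C_c^\infty(\mathbb{R}^d))$ and that identity (3.19) holds for $\tilde v^\omega_\varepsilon$. If $\tilde u^\omega$ is any weak solution of equation (3.2) of class $L^m(L^m_{\mathrm{loc}})$ corresponding to that $\omega$, then we have the duality formula (3.20).

Proof. This follows directly from (3.3) (which can be stated for $t = t_f$ fixed, by the previous Lemma 3.10) for $\varphi = \tilde v^\omega_\varepsilon$ and identity (3.19).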

Path-by-path uniqueness
By linearity of equation (sgTE), in order to prove uniqueness it is sufficient to prove that u 0 = 0 implies u = 0. To this aim, we will combine identity (3.20) and Corollary 3.7, similarly to the idea explained in Section 3.1 for the deterministic case. The problem in the stochastic case, however, is that we have regularity control in average and we want to deduce path-by-path uniqueness. Let us first state the analog of (3.14). Here, we need to impose m > 2 (while the Definition 3.3 of weak L m -solutions requires only m ≥ 2).
Assuming (3.21), we write β = (β + αm /p) − αm /p and apply Hölder's inequality, first in x and ω with exponent p/m > 1, then in time with exponent 1. In this way, we obtain a bound by a product of two terms. Now Corollary 3.7 gives that the second term is uniformly bounded and we get the claim of the lemma in view of (3.21).
For the following uniqueness statement we have to restrict the behavior at infinity of weak L m -solutions (note that in the definition they are just L m loc (R d )). The restriction is not severe: we just need at most polynomial growth at infinity. To be precise, we need that for some α > 0 we have Theorem 3.14. Let m > 2. There exists a full measure set Ω 0 ⊂ Ω such that for all ω ∈ Ω 0 the following property holds: for every u 0 : Proof.
Step 1: Identification of Ω 0 . From Lemma 3.13, given t f ∈ [0, T ] and v 0 ∈ C ∞ c (R d ), there exist a full measure set Ω t f ,v0 ⊂ Ω and a sequence (ε n ) n∈N with ε n → 0 as n → ∞ such that v ω εn belongs to C 1 ([0, T ]; C ∞ c (R d )) and satisfies (3.19), for all n ∈ N, for all ω ∈ Ω t f ,v0 . By a diagonal procedure, given a countable set D in C ∞ c (R d ) which is dense in L 2 (R d ), there exist a full measure set Ω b ⊂ Ω and a sequence (ε n ) n∈N with ε n → 0 as n → ∞ such that properties (3.23) and (3.24) hold for all t f ∈ [0, T ] ∩ Q, v 0 ∈ D and ω ∈ Ω b . Since, for a given ω, there exists a constant C ω > 0 such that for all x ∈ R d and t ∈ [0, T ], we may replace (3.24) by An analogous selection is possible for (1 + | · | α/m )( c ω − c ω εn ) v ω εn , which provides another full measure set Ω c , and Ω 0 is then defined as the intersection Ω b ∩ Ω c .

Existence for (3.1)
So far we have proved that path-by-path uniqueness holds for the stochastic equation (3.1). It remains to prove the existence of a (distributional) solution. The proof is based on a priori estimates and is similar to those of Theorem 2.7 and Theorem 2.17, without the difficulty of taking derivatives. Thus, we will state the result and only sketch the proof.

Proposition 3.15. Let $p, q$ be in $(2, \infty)$ satisfying $\frac{2}{q} + \frac{d}{p} \le 1$ or $(p, q) = (\infty, 2)$. Assume that $b$ and $c$ are a vector field and a scalar field, respectively, such that $b = b^{(1)} + b^{(2)}$ and $c = c^{(1)} + c^{(2)}$. Let $\chi$ be a function satisfying (2.8). Then there exists a constant $C$ such that, for every $u_0$ in $C_c^\infty(\mathbb{R}^d)$, the smooth solution $u$ of equation (3.1) starting from $u_0$, given by Lemma 2.6, verifies the corresponding a priori bound. Moreover, the constant $C$ can be chosen to have continuous dependence on $m$, $d$, $\sigma$, $\chi$, $p$, $q$ and on the $L^q([0, T]; L^p(\mathbb{R}^d))$ norms of $b^{(1)}$ and $c^{(1)}$, on the $L^1([0, T]; C^1_{\mathrm{lin}}(\mathbb{R}^d))$ norm of $b^{(2)}$, and on the $L^1([0, T]; C^1_b(\mathbb{R}^d))$ norm of $c^{(2)}$. The result holds also for $(p, q) = (d, \infty)$ under the additional hypothesis that the $L^\infty([0, T]; L^d(\mathbb{R}^d))$ norms of $b^{(1)}$ and $c^{(1)}$ are smaller than $\delta$, see Condition 2.2, 1c) (in this case the continuous dependence of $C$ on these norms is up to $\delta$).
Proof. We proceed similarly as in the proof of Theorem 2.7, but aiming for a priori estimates for $u$ and not for its derivatives. To this end, we consider the equation for $E[u^m]$, which is a closed parabolic equation. The same method of proof as in Theorem 2.7 can then be applied (without the difficulty of having a system with many indices), which then shows the claim.

For the existence of a weak solution one passes to the limit in the regularized equations, keeping in mind that all the derivatives must be carried over to the test function $\varphi$. Pathwise uniqueness follows from Theorem 3.14: Let $u$, $u^1$ be two solutions to (3.1) satisfying (3.22) on the same filtered probability space $(\Omega, (\mathcal{G}_t)_t, P)$, such that $W$ is a Brownian motion with respect to $(\mathcal{G}_t)_t$. Then, according to Proposition 3.4, the function $\tilde u^1$ given by $\tilde u^1(t, x) = u^1(t, x + \sigma W_t)$ solves the deterministic PDE (3.2) for a.e. $\omega \in \Omega$, so $\tilde u^1$ must coincide with $\tilde u$ for a.e. $(t, x, \omega)$, which implies the claim $u^1 = u$.
Remark 3.17. For solutions to equation (3.1), non-negativity of initial values is preserved, i.e. if $u_0 \ge 0$, then $u(t, x, \omega) \ge 0$ for a.e. $(t, x, \omega)$: this is true in the regular case (for $u$, $b$ and $c$ smooth and compactly supported), thanks to the representation formula (2.5) (where, for the application to equation (3.1), $c$ is replaced by $c + \operatorname{div} b$). This carries over to the general case, since $u$ is constructed as weak-$*$ limit in $L^\infty([0, T]; L^m(\Omega; W^{1,m}(B_R)))$ of solutions with regularized coefficients and initial condition.

Path-by-path uniqueness and regularity of the flow solving the sDE
In this section we want to apply the previous results to study the stochastic differential equation (sDE). We will get existence, strong (even path-by-path) uniqueness and regularity for the stochastic flow solving the sDE, where $b$ is in the LPS class and $\sigma \neq 0$. Once again, we recall that no such result holds in the deterministic case ($\sigma = 0$), which means that in the stochastic case ($\sigma \neq 0$) the evolution of the finite-dimensional system improves due to the additional stochastic term.
In order to state the result, we need to make the formal links between (sDE) and (sCE) precise. This will be done for the deterministic case, in the first subsection, using Ambrosio's theory of Lagrangian flows. Then we will use this link (read in a proper way in the stochastic case) combined with uniqueness and regularity for the stochastic equations to arrive at our result.

The deterministic case
If we integrate this equation with respect to a finite signed measure $\mu_0$ on $\mathbb{R}^d$, we get $\langle\mu_t, \varphi\rangle = \langle\mu_0, \varphi\rangle + \int_0^t \langle\mu_s, f(s, \cdot) \cdot \nabla\varphi\rangle\, ds$ (4.2), where $\mu_t$ is the image measure of $\mu_0$ under $\Phi_t$, i.e. $\int g\, d\mu_t = \int g(\Phi_t)\, d\mu_0$ for every measurable bounded function $g$ on $\mathbb{R}^d$. Equation (4.2) is the continuity equation (CE) for $\mu$ (starting from $\mu_0$), which we have written so far in compact form as $\partial_t \mu + \operatorname{div}(f\mu) = 0$. It is easy to see that the previous passages still hold when $f$ is not regular. Starting from this remark, DiPerna-Lions' and Ambrosio's theory extends the above link between the ODE and the CE (in some generalized sense) to the irregular case, so that one can study the CE in order to study the ODE. We will follow Ambrosio's theory of Lagrangian flows, which allows one to transfer a well-posedness result for the CE to a well-posedness result for the ODE.
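The passage from the ODE to the continuity equation recalled above can be written out in three lines. The sketch assumes $f$ smooth with flow $\Phi_t$, a test function $\varphi \in C_c^\infty(\mathbb{R}^d)$, and uses $\mu_t := (\Phi_t)_\# \mu_0$ for the image measure; this notation is introduced only for the illustration:

```latex
% Sketch for smooth f: \Phi_t(x) solves d/dt \Phi_t(x) = f(t, \Phi_t(x)), \Phi_0(x) = x.
\begin{align*}
  \frac{d}{dt} \varphi(\Phi_t(x))
    &= f(t, \Phi_t(x)) \cdot \nabla\varphi(\Phi_t(x)), \\
  \varphi(\Phi_t(x))
    &= \varphi(x) + \int_0^t \big(f(s,\cdot) \cdot \nabla\varphi\big)(\Phi_s(x))\, ds, \\
  \langle \mu_t, \varphi \rangle
    &= \langle \mu_0, \varphi \rangle
       + \int_0^t \langle \mu_s, f(s,\cdot) \cdot \nabla\varphi \rangle\, ds
    && \text{(integrating in } d\mu_0(x)\text{)} .
\end{align*}
```

The last line is the distributional form of $\partial_t \mu + \operatorname{div}(f\mu) = 0$: all derivatives fall on the test function, which is what makes the formulation meaningful for irregular $f$.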
In the general theory, one considers a convex set $\mathcal{L}_f$ of solutions $\mu = (\mu_t)_t$ to the equation (CE), with values in the set $\mathcal{M}_+(\mathbb{R}^d)$ of finite positive measures on $\mathbb{R}^d$ ("solution of (CE)" is here intended in the sense of distributions). For our purposes, $\mathcal{L}_f$ will be, for some $m$ fixed a priori, a set of solutions with densities with respect to the Lebesgue measure. Sometimes, we will use $u$ to indicate also $\mu$ and vice versa. In this framework, well-posedness of (CE) in $\mathcal{L}_f$ can be transferred to the associated Lagrangian flow (see [30,2]). We will follow this strategy, but for $f$ in LPS class and with noise, using our well-posedness result for (sCE).

Stochastic Lagrangian flow: existence, uniqueness and regularity
We consider the equation (sDE) on $\mathbb{R}^d$. Since we use also here the results of the previous sections, we again assume the same LPS Condition 2.2 on the drift $b$. As before, we consider the purely stochastic case $\sigma \neq 0$, and $W$ is a standard $d$-dimensional Brownian motion, endowed with its natural completed filtration $(\mathcal{F}_t)_t$ (the smallest among all the possible filtrations), which is also right-continuous (see [8, Proposition 2.5]).
With the change of variable $\tilde X_t = X_t - \sigma W_t$, this sDE becomes a family of (random) ODEs, parametrized by $\omega \in \Omega$: $\frac{d}{dt}\tilde X = \tilde b^\omega(t, \tilde X)$ (4.4), where, as usual, $\tilde b^\omega(t, x) = b(t, x + \sigma W_t(\omega))$. More precisely, if $X$ is a progressively measurable process, then $X$ solves (sDE) if and only if $\tilde X$ solves the ODE (4.4) for a.e. $\omega$. For this family of ODEs, the concepts of Lagrangian flow and CE (at $\omega$ fixed) make sense and the CE associated with this ODE is precisely the random PDE (3.2) with $c = 0$, that is the random CE $\partial_t \tilde u + \operatorname{div}(\tilde b \tilde u) = 0$ (4.5). Thus, we can hope to apply our existence and uniqueness result for (sCE) (remembering that, by Proposition 3.4, a solution to the random CE (4.5) is given by $\tilde u(t, x) = u(t, x + \sigma W_t)$, when $u$ solves (sCE)).

• $\Phi$ is progressively measurable, i.e. it is $\mathcal{P} \otimes \mathcal{B}(\mathbb{R}^d)$-measurable, where $\mathcal{P}$ is the progressive $\sigma$-algebra.
Given a certain class A of functions from R d to R d , e.g. W 1,m loc (R d ), the flow is said to be of class A if, for every t ∈ [0, T ], Φ t is in class A with probability one.
Let us now state the main result of this section: Before proceeding to the proof, which is essentially an application of our wellposedness result for the deterministic PDE (3.2), we make some comments on this result.
Remark 4.6. If m > d, we deduce by Sobolev embedding that, for every t ∈ [0, T ], there exists a representative of Φ t which is of class C 0,α loc (R d ) for α = 1 − d/m. However, we are not able to show that this representative is jointly continuous in (t, x) (we actually do not even show joint measurability), though such joint continuity is known to be true in the subcritical case (see [34]).
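As a quick arithmetic illustration of the Hölder exponent in Remark 4.6, Morrey's inequality on a ball $B_R$ gives the following. The numerical values of $d$ and $m$ are chosen here only as examples and are not taken from the paper:

```latex
\[
  m > d \;\Longrightarrow\;
  W^{1,m}(B_R) \hookrightarrow C^{0,\alpha}(\overline{B_R}),
  \qquad \alpha = 1 - \frac{d}{m},
\]
\[
  d = 3,\ m = 6:\quad \alpha = 1 - \tfrac{3}{6} = \tfrac12,
  \qquad
  d = 3,\ m = 4:\quad \alpha = 1 - \tfrac{3}{4} = \tfrac14 .
\]
```

The exponent improves as $m$ grows and degenerates to $0$ as $m \downarrow d$, consistent with the absence of a Hölder statement in the limit case $m = d$.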
Remark 4.7. The existence part gives essentially a family of flows $\Phi^\omega$, parametrized by $\omega \in \Omega$, such that $\Phi(x)$ solves (sDE) for a.e. $x$, while the regularity part gives local weak differentiability of the flow. The uniqueness part implies pathwise uniqueness among stochastic Lagrangian flows: given two stochastic Lagrangian flows $\Phi^1$ and $\Phi^2$ solving (sDE) (even adapted to some filtration larger than $(\mathcal F_t)_t$) and starting from the same initial datum of the form $1_S \mathcal L^d$, they necessarily coincide. Indeed, for a.e. $\omega \in \Omega$, $\tilde\Phi^1(\omega)$ and $\tilde\Phi^2(\omega)$ are Lagrangian flows solving the random ODE (4.4) (with that $\omega$), so $\Phi^1 = \Phi^2$ a.e.
Let us emphasize that path-by-path uniqueness is stronger than pathwise uniqueness: it says that, for each fixed ω in a full P -measure set, any two Lagrangian flows solving (sDE) (interpreted as the random ODE (4.4)) with ω fixed must coincide, without any need for the flows to be adapted. On the other hand, while we can handle flows, we are not able to compare two solutions to the ODE (4.4), at ω fixed, starting from a fixed x, so we have no uniqueness result for (sDE) with x as initial datum. Let us recall, however, that pathwise uniqueness holds for (sDE) (with x fixed) under Krylov-Röckner conditions, see [55].
We also wish to recall a basic argument in measure theory, that we will use quite often: Remark 4.8. Let (E, E, µ), (F, F, ν) be two σ-finite measure spaces and let f : E ×F → R be a map such that, for ν-a.e. z ∈ F , the map y → f (y, z) is E-measurable. Assume that f has a E ⊗ F-measurable version g : E × F → R, i.e. there exist a full measure set F 0 and, for every z ∈ F 0 , a full measure set E z 0 , such that f (y, z) = g(y, z) for all z ∈ F 0 and y ∈ E z 0 . Let BP = BP (a) be a Borel property defined for a ∈ R (in the sense that the subset where BP is true is a Borel set), for example ϕ(a) = 0 for some Borel function ϕ. Assume that, for ν-a.e. z ∈ F , it holds: BP (f (y, z)) for µ-a.e. y ∈ E. Then BP (g(y, z)) holds for (µ × ν)-a.e. (y, z) ∈ E × F . A similar property also holds for more than two variables.
Proof. If this were not true, then the set A = {(y, z) ∈ E × F : ¬BP (g(y, z))}, which is E ⊗ F-measurable (by measurability of BP and g), would be of positive measure. Therefore, by Fubini's theorem, there exists a positive measure set F ¬ ⊂ F such that, for every z ∈ F ¬ , the set E z ¬ := {y ∈ E : ¬BP (g(y, z))} is E-measurable and of positive measure. But g is by assumption a version of f . Therefore, for every z in the positive measure set F ¬ ∩ F 0 , the set {y ∈ E : ¬BP (f (y, z))} contains the positive measure set E z ¬ ∩ E z 0 , which is in contradiction with the assumption on BP (f ).
Theorem 3.14, applied to the random CE (4.5), gives a full P -measure set Ω 0 in Ω such that, for every ω ∈ Ω 0 , for every Borel set S, there exists at most one solutionũ ω to the CE in the class Lb ω , which starts from 1 S (note that Ω 0 is independent of the initial datum). Thus, for every ω ∈ Ω 0 , the first part of Theorem 4.2 gives local uniqueness among Lagrangian flows solving (4.4) at ω fixed.
to conclude. As we are going to take various modifications of the same function, we keep the following convention: we use the notation $\bar\Phi$ for a solution of the sDE which is continuous in time (at $x$ fixed), but not necessarily measurable in $\omega$, the notation $\tilde\Phi$ for a solution of the random ODE and the notation $\bar{\tilde\Phi}$ for a solution of the random ODE which is also continuous in time (again at $x$ fixed). For the solution to the (s)CE, we do not use the "bar" since we consider, unless otherwise stated, versions that are both weakly-* measurable and weakly-* continuous.
In the first step, we get the existence, at $\omega$ fixed, of a Lagrangian flow $\bar{\tilde\Phi}^\omega$ solving the ODE (4.4). We take $S = B_N$ for an arbitrary positive integer $N$. By Theorem 3.16 (applied with $c = 0$), Remark 3.17 and Proposition 3.4, we find a full $P$-measure set $\Omega_0$ in $\Omega$, independent of $N$ (by a diagonal procedure), such that, for every $\omega \in \Omega_0$ and $N$, there exists a (unique) solution $\tilde u^{\omega,N}$ to the CE (4.5) in the class $L_{\tilde b^\omega}$, starting from $1_{B_N}$.
Thus, the second part of Theorem 4.2 gives the claimed existence of a global Lagrangian flow $\bar{\tilde\Phi}^\omega$ solving the ODE (4.4). Now we define $\bar\Phi = \bar{\tilde\Phi} + \sigma W$, which seems at first the natural candidate for the stochastic Lagrangian flow solution to (sDE). The main problem is that $\bar\Phi$ does not have any measurability property in $\omega$. Therefore, in the second step, we find a progressively measurable map $\Phi : [0, T] \times \mathbb R^d \times \Omega \to \mathbb R^d$ such that $P(\Phi(t, x, \omega) = \bar\Phi(t, x, \omega) \text{ for a.e. } (t, x)) = 1$ (keep in mind that this set is not a priori measurable in $\omega$).
To this end, we shall use the link between ODE and CE (at the deterministic level) and the progressive measurability of the solution to (sCE).
In what follows, we denote by $\varphi_n$ functions in $C_c^\infty(\mathbb R^d)$ with $\varphi_n(x) = x$ for $|x| \le n$. By the deterministic theory (Theorem 4.2 and Remark 4.3), we know that, for every $n \in \mathbb N$ and $u_0 \in C_c^\infty(\mathbb R^d)$, the following holds for every $\omega$ in a full measure set $\Omega_{u_0,n}$: for every $t \in [0, T]$,
$$\langle u_0, \varphi_n(\bar{\tilde\Phi}^\omega_t) \rangle = \langle \tilde u^\omega(t), \varphi_n \rangle \quad\text{and so}\quad \langle u_0, \varphi_n(\bar\Phi^\omega_t) \rangle = \langle u^\omega(t), \varphi_n \rangle,$$
where $u^\omega(t)$ is the $H^{-1}$-weakly-* continuous version, as in Remark 3.11, of the solution to (sCE) starting from $u_0$. In particular, the map $(t, \omega) \mapsto \langle u_0, \varphi_n(\bar\Phi^\omega_t) \rangle$ coincides with a progressively measurable map for every $t \in [0, T]$, for a.e. $\omega$ (with the exceptional set independent of $t$), for every $u_0 \in C_c^\infty(\mathbb R^d)$. Hence, up to redefining $\bar\Phi^\omega_t$ on a $P$-null set independent of $t$, $\langle u_0, \varphi_n(\bar\Phi^\omega_t) \rangle$ is progressively measurable for every $u_0$ in $C_c^\infty(\mathbb R^d)$ and thus, by density, also for every $u_0 \in L^2(B_R)$. Therefore, $\varphi_n(\bar\Phi^\omega_t)$ is weakly-* progressively measurable in $L^2(B_R)$. Since $L^2(B_R)$ is a separable reflexive space, the Pettis measurability theorem applies and gives that $\varphi_n(\bar\Phi^\omega_t)$ is strongly progressively measurable with values in $L^2(B_R)$; in particular, there exists $\Phi^n : [0, T] \times B_R \times \Omega \to \mathbb R^d$, $\mathcal P \otimes \mathcal B(\mathbb R^d)$-measurable, which is a version of $\varphi_n(\bar\Phi)$, that is, for a.e. $(t, \omega)$ there holds $\Phi^n(t, x, \omega) = \varphi_n(\bar\Phi^\omega_t(x))$ for a.e. $x \in B_R$ (cf. [63, Proposition A.6]). Using Remark 4.8 and the analogous properties for $\bar\Phi$, one can check that $\Phi^n$ does not depend on $R$ and is eventually constant in $n$ (for a.e. $(t, x, \omega)$), so we get $\Phi$, which is $\mathcal P \otimes \mathcal B(\mathbb R^d)$-measurable and a version of $\bar\Phi$. The second step is complete.
To conclude the proof of existence, we have to prove that $\Phi - \sigma W$ is a (global) Lagrangian flow solving (4.4). However, since $\Phi$ coincides with $\bar\Phi$ only for a.e. $(t, x)$ (for fixed $\omega$), $\Phi(\cdot, x, \omega) - \sigma W(\omega)$ need not be continuous in time and satisfies (4.4) only for a.e. $t \in [0, T]$. In the third step, we prove that there exists a measurable version of $\Phi$, and so of $\bar\Phi$, which is continuous in $t$ for a.e. $(x, \omega)$, and use this version to conclude. The conceptual idea is that, given a path $\gamma$ which has a continuous version, its continuous version can be constructed from $\gamma$ in a measurable way, so that this version is measurable (with respect to some other variable) if $\gamma$ is measurable.
For any $N \in \mathbb N$, we choose a dyadic partition $t^N_j = 2^{-N} j$, we set $I^N_j := [t^N_j, t^N_{j+2})$ and, for $t \in [0, T]$, we define $I^N(t)$ as $I^N_j$ for the minimal $j$ with $t \in I^N_j$. Note that, for a.e. $\omega$, it holds: for a.e. $(t, x)$, $\Phi(t, x, \omega) = \bar\Phi(t, x, \omega)$ (as $\Phi$ is a modification of $\bar\Phi$ and both are measurable in $(t, x)$ for $\omega$ fixed). In particular, for a.e. $\omega$, it holds: for a.e. $x$, The continuity property of $\bar\Phi$ implies that for a.e. $\omega$ the following is true: for a.e. $x$ and every $m \in \mathbb N$, there exists $N \in \mathbb N$ with $\max_j [\operatorname{ess\,sup}_{t \in I^N_j} \bar\Phi(t, x, \omega) - \operatorname{ess\,inf}_{t \in I^N_j} \bar\Phi(t, x, \omega)] < 1/m$. Therefore, by Remark 4.8, the set x, ω) < 1/m has full measure. Then, for a.e. $(x, \omega)$, the limit is well-defined and finite for every $t$. Moreover, the map $A$ (defined to be zero on the exceptional set where the above limit does not exist) is measurable in $(t, x, \omega)$, and continuous in $t$ for a.e. $(x, \omega)$. For a.e. $\omega$ it holds: $A(t, x, \omega) = \bar\Phi(t, x, \omega)$ for a.e. $(t, x)$ (since $\bar\Phi(t, x, \omega) = \lim_{N \to \infty} \operatorname{ess\,sup}_{s \in I^N(t)} \bar\Phi(s, x, \omega)$ for a.e. $(t, x)$). So, again by Remark 4.8, $A = \Phi$ for a.e. $(t, x, \omega)$. With a slight abuse of notation, we will now use $\Phi$ also for its modification which is continuous in $t$.
It remains to show that $\tilde\Phi = \Phi - \sigma W$ is a Lagrangian flow solving (4.4). The integrand $b(\Phi)$ is in $L^1(0, T)$ for a.e. $(x, \omega)$ and the ODE (4.4) is satisfied for a.e. $(t, x, \omega)$: otherwise, since $\Phi$ is a version of $\bar\Phi$, reasoning as in Remark 4.8, for some $\omega$ in a positive measure set, the ODE would not be satisfied even by $\bar\Phi$. The continuity in time implies that, for a.e. $(x, \omega)$, the ODE (4.4) is satisfied for every $t$. Therefore, this $\Phi$ is the desired stochastic Lagrangian flow.
Part 3: W 1,m loc (R d )-regularity of Φ. We prove a stability result, which is interesting in itself.
Proof of Lemma 4.9.
Step 1: Representation formula for fixed time.
as a consequence of the analogous property for $\bar\Phi$ and of Remark 4.8. In particular, taking the $H^{-1}$-valued weakly-* time continuous version for $u$ (cf. Remark 3.11), we get the above formula for every $t$, for every $\omega$ (in a full measure set independent of $t$). Moreover, by Theorem 3.16 extended to every time by weak-* continuity, $\sup_t E[\|u(t, \cdot)\|^m_{L^m_{(1+|\cdot|)^\alpha}(\mathbb R^d)}]$ is finite for every real $\alpha$ (since $u_0$ is bounded and compactly supported); therefore, calling again $\varphi_n$ functions in $C_c^\infty(\mathbb R^d)$ with $\varphi_n(x) = x$ for $|x| \le n$, we have that, for every $t$ fixed: $\langle u_0, \Phi_t \rangle$ is in $L^m(\Omega)$ and
Step 2: Approximation and conclusion.
Fix $t \in [0, T]$ and $u_0 \in C_c^\infty(\mathbb R^d)$. Note that, for every $\varphi \in C_c^\infty(\mathbb R^d)$, $\varphi(\Phi^\varepsilon_t)$ is the solution $v^\varepsilon$, at time 0, to the backward approximated stochastic transport equation, with final time $t$ and final datum $\varphi$. The approximated duality formula (3.20) (for $c = 0$ and with a change of variable to avoid the "tilde"), the approximation Condition 2.14 on $(b^\varepsilon)_\varepsilon$ and equation (4.6) then give On the other hand, Corollary 3.
, uniformly in $\varepsilon$. Since this space is reflexive (see Remark 2.9), $\varphi(\Phi^\varepsilon_t)$ converges weakly, as $\varepsilon \to 0$ and up to the choice of a subsequence, to an ele- for a constant $C$ which is independent of $t$. In particular, for every $u_0 \in C_c^\infty(\mathbb R^d)$ and $F \in L^\infty(\Omega)$, we get Therefore, by (4.7) we find $\varphi(\Phi_t) = \Psi^\varphi_t$ for a.e. $(x, \omega)$. Now, taking $\varphi = \varphi_n$ (with bounded W 1,2m where the constant $C$ is independent of $n$ and $t$. As a consequence, $\varphi_n(\Phi_t)$ converges weakly in $L^m(\Omega; W^{1,m}_{(1+|\cdot|)^{-(d+1+m)}}(\mathbb R^d))$, as $n \to \infty$ and up to the choice of a subsequence. On the other hand, by Step 1 we know that $\langle u_0, \Phi_t - \varphi_n(\Phi_t) \rangle \to 0$ in $L^m(\Omega)$, as $n \to \infty$. So, by an argument similar to the one for $\varepsilon \to 0$, any weak limit of $\varphi_n(\Phi_t)$ has to be $\Phi_t$, and hence $\|\Phi_t\|_{L^m(\Omega; W^{1,m}_{(1+|\cdot|)^{-(d+1+m)}}(\mathbb R^d))} \le C$.

Towards classical pathwise uniqueness
So far we have investigated the problem of path-by-path uniqueness for the equations (sCE) and (sDE). In some sense, this is the strongest type of uniqueness we know. Indeed, we can heuristically recover pathwise uniqueness for (sDE) in this way: given two processes $X$ and $Y$ which are solutions to (sDE) with the same initial datum, then, for a.e. $\omega$, $X(\omega)$ and $Y(\omega)$ solve the sDE at fixed $\omega$ (more precisely, $\tilde X^\omega$ and $\tilde Y^\omega$ solve the random ODE (4.4)), so that, by path-by-path uniqueness, they must coincide. However, since we only deal with flows, we are not able to give a "classical" pathwise uniqueness result (among processes instead of flows) as a direct consequence of Theorem 4.5. Thus, we will now see how to modify the duality argument to get a more classical pathwise uniqueness, though the initial datum still cannot be a single point $x \in \mathbb R^d$: it has to be a suitable diffuse random variable.

The first result
The easiest consequence of Theorem 3.14 (applied to the continuity equation) is pathwise uniqueness among solutions with conditional laws (given the Brownian motion) in a suitable $L^m$ class (see Definition 5.1 below). The relevant concept of solution and the result are stated below, but let us first explain the idea. As already mentioned, we need the initial datum $X_0$ to be diffuse. We could take e.g. the probability space $(C([0, T]; \mathbb R^d) \times B_R(y_0), Q \otimes \mathcal L^d)$ (with the suitable σ-algebra), with $Q$ the Wiener measure, for some $R > 0$, $y_0 \in \mathbb R^d$, and $X_0(\gamma, x) = x$, $W_t(\gamma, x) = \gamma_t$; the filtration can be any filtration $(\mathcal G_t)_t$ (satisfying the standard assumptions) such that $\mathcal G_t$ contains $\sigma\{X_0, W_s \mid s \le t\}$. The solution $X = X(\gamma, x)$ should be thought of as a flow, for fixed $\gamma \in C([0, T]; \mathbb R^d)$, solving the sDE at this fixed $\gamma$. Now we ask: to which class of processes does path-by-path uniqueness apply, so as to imply pathwise uniqueness? We have to require (again heuristically) that, for $Q$-a.e. Brownian trajectory $W = \gamma$, $(X_t(\gamma, \cdot))_\# \mathcal L^d$ is a diffuse measure. This is true in the case above, while for the general case (of a general probability space and general initial datum $X_0$), we must require that $X_0$ has a diffuse law and that "the law of $X_t$ for fixed $\gamma$ is diffuse" too. This law of $X_t$ for fixed $\gamma$ is the conditional law of $X_t$ given the Brownian motion $W$; see e.g. [78, Chapter 1] for a reference on conditional laws.
Definition 5.1. Let $m \ge 1$, $\alpha \in \mathbb R$; let $W$ be a Brownian motion (on a probability space $(\Omega, \mathcal A, P)$), let $(\mathcal F_t)_t$ be its natural completed filtration. An $\mathbb R^d$-valued process $X$ on $\Omega$ is said to have conditional (marginal) laws (given the Brownian motion $W$) in $L^m([0, T], L^m_{(1+|\cdot|)^\alpha}(\mathbb R^d))$ if the conditional law of $X_t$ given $\mathcal F_t$ has a density (with respect to Lebesgue measure) $\rho(t, x, \omega)$ and, for a.e. $\omega \in \Omega$, $\rho(\cdot, \cdot, \omega)$ belongs to $L^m([0, T], L^m_{(1+|\cdot|)^\alpha}(\mathbb R^d))$.
Theorem 5.2. Let $m \ge 4$, $s \in \mathbb R$. Let $W$, $(\mathcal F_t)_t$ be as above and let $X_0$ be a random variable on $\Omega$, independent of $W$, such that the law of $X_0$ has a density (with respect to the Lebesgue measure) in $L^m_{(1+|\cdot|)^{2s+d+1}}(\mathbb R^d)$. Assume Condition 2.2. Then, for every $\alpha \le s$, strong existence and pathwise uniqueness hold for (sDE) with initial datum $X_0$, among solutions with conditional laws in $L^m([0, T], L^m_{(1+|\cdot|)^\alpha}(\mathbb R^d))$. More precisely, if $(\mathcal G_t)_t$ is an admissible filtration (satisfying the standard assumptions) on $\Omega$ (i.e. $X_0$ is $\mathcal G_0$-measurable and $W$ is a Brownian motion with respect to $(\mathcal G_t)_t$), then there exists a unique $\mathcal G$-adapted process solving (sDE), starting from $X_0$ and with conditional laws in $L^m([0, T], L^m_{(1+|\cdot|)^\alpha}(\mathbb R^d))$.
We will not give all the details of the proof, since it is similar to that of Theorem 5.4 below.
Proof. The proof of uniqueness is similar to the one of the first part of Theorem 4.2.
Suppose by contradiction that there exist two different solutions $X$ and $Y$ with the properties above. Then it is possible to find a time $t_0$, two disjoint Borel sets $E$ and $F$ in $\mathbb R^d$ and a measurable set $\Omega'$ in $\Omega$ with $P(\Omega') > 0$ such that $X_{t_0}(\omega)$ belongs to $E$ and $Y_{t_0}(\omega)$ belongs to $F$ for every $\omega$ in $\Omega'$.
On $C([0, T]; \mathbb R^d)$ we denote by $Q$ the Wiener measure and by $\Gamma$ the essential image of $\Omega'$ under the map $W$, i.e. $\Gamma := \{\gamma \in C([0, T]; \mathbb R^d) : P(\Omega' \mid W = \gamma) > 0\}$ (this definition makes sense up to $Q$-negligible sets). Since $Q$ is the image measure of $P$ under $W$, we have $Q(\Gamma) > 0$. For every $t$, we define $\tilde\mu_t$ as the conditional law on $\mathbb R^d$ of $\tilde X_t$, restricted to $\Omega'$, given $W$, i.e., for every $\varphi$ in $C_b(\mathbb R^d)$, We analogously define $\tilde\nu_t$ for $\tilde Y$ instead of $\tilde X$. Then one can show that:
• for $Q$-a.e. $\gamma \in \Gamma$, $\tilde\mu^\gamma$ and $\tilde\nu^\gamma$ are weakly continuous (in time) solutions to the random CE (4.5) at $\gamma$ fixed;
• $\tilde\mu$ and $\tilde\nu$ belong to the $L_{\tilde b}$ class;
• $\tilde\mu$ and $\tilde\nu$ differ at time $t_0$.
So we have found two different $L_{\tilde b^\gamma}$ solutions to the random CE at $\gamma$ fixed, for a non-negligible set of $\gamma$. This is a contradiction, and thus the proof of uniqueness is complete.
Strong existence is a consequence of the existence of a stochastic Lagrangian flow $\Phi$ solving (sDE). Indeed, defining $X_t(\omega) := \Phi^\omega_t(X_0(\omega))$, we observe the following facts:
• Since $\Phi(x)$ solves the sDE with initial datum $x$, for a.e. $x$, and $X_0$ is absolutely continuous (with respect to the Lebesgue measure), $X$ verifies for a.e. ω is the minimal admissible filtration ($\mathcal N$ are the $P$-null sets).
• Let u 0 be the density of the law of X 0 and let u be the solution to the sCE in L ∞ ([0, T ]; L m (Ω; L m (1+|·|) s (R d ))) as in Theorem 3.16, with initial datum u 0 . Then the law of X t has u(t) as conditional density, given W . To prove this, notice that W and Φ are adapted to the Brownian (completed) filtration F and that X 0 is independent of F T , so, for any test function ϕ ∈ C ∞ c (R d ) and any ψ ∈ where in the second passage we used independence (precisely, in the form of Lemma 5.5 in the following paragraph) and the last passage is a consequence of Thus, X is the desired solution and also existence is proved.

The second result
The previous result is somehow limited, at least for uniqueness, by our hypothesis on conditional laws. In this paragraph we prove that actually pathwise uniqueness holds among processes whose marginal laws are diffuse (the precise hypothesis is stated below), with no need to control conditional laws.
To understand the relation with the previous Theorem 5.2, consider again the case discussed at the beginning of the previous paragraph and notice that, given a process X on (C([0, T ]; R d ) × B R (y 0 ), Q ⊗ L d ), the law ρ t of X t is the Q-average, on C([0, T ]; R d ), of the conditional laws ρ γ t of X t (γ, x), given the Brownian trajectory γ. So the fact that the law (that is, the mean of the conditional laws) is diffuse is a weaker condition than the hypothesis on Q-a.e. conditional law. Hence the class of processes whose marginal laws are diffuse is larger than the class used in Theorem 5.2, and in particular, the uniqueness result in the following Theorem 5.4 is morally stronger. Strictly speaking, no implication holds between the two uniqueness results (because of a technicality on the bounds on the densities, see the next definition), but still the idea is that uniqueness is stronger in Theorem 5.4.
for (sDE) with initial datum X 0 , among solutions with laws in L ∞ ([0, T ], L m (1+|·|) α (R d )). More precisely, if (G t ) t is an admissible filtration (satisfying the standard assumptions) on Ω (i.e. X 0 is G 0 -measurable and W is a Brownian motion with respect to (G t ) t ), then there exists a unique G-adapted process solving (sDE), starting from X 0 and with laws in
Proof of uniqueness. First we give the idea of the proof. Let X, Y be two solutions to (sDE) which are adapted to an admissible filtration (G t ) t . Set µ t := δ Xt − δ Yt ; then µ is a random distribution which solves the sCE in the sense of distributions, with µ 0 = 0. We have to prove that µ ≡ 0. We again want to use duality: if v solves the backward sTE with final time t f and final condition v(t f ) = ϕ fixed, then formally it holds ⟨µ t , ϕ⟩ = ⟨µ 0 , v 0 ⟩ = 0. But now we must be careful: expressions like ⟨µ s , b(s) · ∇v(s)⟩, There are two key facts. The first one is where the integrability hypothesis plays a role: if we replace µ s by its average ρ s = E[µ s ] = (X s ) # P − (Y s ) # P , we can estimate (5.1) since the density of ρ is in the correct integrability class for Hölder's inequality. However, taking the expectation, we have to deal with E[µ s ∇v(s)]. Here enters the second key fact, namely that µ s and ∇v(s) are independent, since µ s is G s -measurable, while v(s) (as backward solution) is adapted to the Brownian backward (completed) filtration F s , which is independent of G s . Having this in mind, we come to the rigorous proof of the result.
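The formal duality identity invoked in the idea of the proof can be sketched at fixed $\omega$ for the transformed (tilde) quantities. The sketch assumes, purely for illustration, that $\tilde\mu$ is a distributional solution of the random CE (4.5) and that $\tilde v$ is a smooth, compactly supported solution of the corresponding backward random transport equation with $\tilde\varphi(x) := \varphi(x + \sigma W_{t_f})$; neither regularity assumption is available in the critical case, which is why the rigorous proof works with the approximations $b_\varepsilon$.

```latex
\begin{align*}
&\partial_s \tilde\mu + \operatorname{div}(\tilde b\,\tilde\mu) = 0,
\qquad
\partial_s \tilde v + \tilde b\cdot\nabla \tilde v = 0,
\quad \tilde v(t_f) = \tilde\varphi, \\[2pt]
&\frac{d}{ds}\,\langle \tilde\mu_s, \tilde v(s)\rangle
 = \langle \tilde\mu_s, \tilde b(s)\cdot\nabla\tilde v(s)\rangle
 + \langle \tilde\mu_s, \partial_s \tilde v(s)\rangle = 0, \\[2pt]
&\text{hence}\quad
\langle \tilde\mu_{t_f}, \tilde\varphi\rangle
 = \langle \tilde\mu_0, \tilde v(0)\rangle = 0
\quad\text{when } \tilde\mu_0 = 0 .
\end{align*}
```

The first term in the derivative comes from testing the continuity equation against $\tilde v(s)$, the second from the time dependence of the test function; they cancel exactly because $\tilde v$ solves the transport equation.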
Take $t_f \in [0, T]$ and $\varphi \in C_c^\infty(\mathbb R^d)$. Let $b_\varepsilon$ be as in Condition 2.14, let $v_\varepsilon$ be the solution to the approximated backward transport equation with final time $t_f$ and final datum $v_\varepsilon(t_f) = \varphi$. With the usual notation with tilde ($\tilde v_\varepsilon(s, x) = v_\varepsilon(s, x + \sigma W_s)$, $\tilde X_s = X_s - \sigma W_s$), the chain rule gives and similarly for $Y$. Subtracting the expression for $Y$ from that for $X$, we get We now claim that and similarly for $Y$. Assuming this, we obtain $\varphi(X_{t_f}) = \varphi(Y_{t_f})$ and then, by the arbitrariness of $\varphi$ and $t_f$, also $X \equiv Y$. To prove (5.2), we want to exploit the independence of $\nabla v_\varepsilon(s)$ and $X_s$, for fixed $s \in [0, T]$. To this end, we need the following elementary lemma:
Lemma 5.5. Consider two measurable spaces $(F_1, \mathcal F_1)$, $(F_2, \mathcal F_2)$ and a probability measure $P$ on $(F_2, \mathcal F_2)$. Let $f : F_1 \times F_2 \to \mathbb R$, $Z : F_2 \to F_1$ be two measurable functions and denote by $\rho$ the law of $Z$ on $F_1$. Suppose that there exists a σ-algebra $\mathcal A \subset \mathcal F_2$ such that $f$ is $\mathcal F_1 \otimes \mathcal A$-measurable and $Z$ is independent of $\mathcal A$. Assume also $\int_{F_1} \int_{F_2} |f(y, \omega)| \, P(d\omega) \rho(dy) < \infty$. Then it holds
$$\int_{F_2} f(Z(\omega), \omega) \, P(d\omega) = \int_{F_1} \int_{F_2} f(y, \omega) \, P(d\omega) \rho(dy).$$
Proof. The lemma is clear for f (y, ω) = g(y)h(ω), when g is F 1 -measurable and integrable (with respect to ρ), and h is A-measurable and integrable (with respect to P ). The general case is obtained by approximating f with sums of functions as above.
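For the product case mentioned in the proof, the computation is short. A sketch, with $g$ and $h$ as in the proof and $E$ denoting expectation under $P$:

```latex
\begin{align*}
\int_{F_2} g(Z(\omega))\,h(\omega)\,P(d\omega)
 &= E[\,g(Z)\,h\,]
  = E[\,g(Z)\,]\;E[\,h\,]
  && \text{($Z$ independent of $\mathcal A$, $h$ $\mathcal A$-measurable)} \\
 &= \int_{F_1} g(y)\,\rho(dy)\,\int_{F_2} h(\omega)\,P(d\omega)
  && \text{($\rho$ is the law of $Z$)} \\
 &= \int_{F_1}\int_{F_2} g(y)\,h(\omega)\,P(d\omega)\,\rho(dy).
\end{align*}
```

Linearity then extends the identity to finite sums of such products, which is the class used for the approximation in the general case.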
Applying this lemma with F 1 = R d , F 2 = Ω, f = |(b − b ε ) · ∇v ε | and Z = X with law ρ, for fixed s ∈ [0, t f ], and then integrating over s ∈ [0, t f ], we obtain We would like to use Hölder's inequality to conclude with (5.2). Since the density of ρ belongs to L ∞ ([0, T ]; L m (1+|·|) −α (R d )) by assumption, it is enough to prove that
Proof of existence. As for Theorem 5.2, strong existence is an easy consequence of the existence of a stochastic Lagrangian flow Φ solving (sDE). Indeed, defining X t (ω) := Φ ω t (X 0 (ω)), we observe the following facts.
• Since Φ(x) solves the sDE with initial datum x, for a.e. x, and X 0 is absolutely continuous (with respect to the Lebesgue measure), X verifies for a.e. ω is the minimal admissible filtration (N are the P -null sets).
• Let u 0 be the density of the law of X 0 and let u be the solution to the sCE in L ∞ ([0, T ]; L m (Ω; L m (1+|·|) s (R d ))) as in Theorem 3.16, with initial datum u 0 . Then the law of X t has density given by µ t = E[u(t)]. Indeed Φ and X 0 are independent (which allows us to use Lemma 5.5), so, for any test function ϕ ∈ C ∞ c (R d ), we have where the last step is a consequence of u(t) = (Φ t ) # u 0 . We further have for µ Thus, X is the desired solution and also existence is proved.
Path-by-path results for sDE

Path-by-path uniqueness of individual trajectories
We next consider equation (sDE). Its integral formulation is
$$X_t = x + \int_0^t b(s, X_s)\,ds + \sigma W_t,$$
and therefore, we may give a path-by-path meaning to it. Assume for some constant $C > 0$ that $b : [0, T] \times \mathbb R^d \to \mathbb R^d$ is a measurable locally bounded function (defined for all $(t, x)$, not only a.e.). As before, let us assume that $W$ has continuous trajectories (everywhere). Given $\omega \in \Omega$, hence given the continuous function $t \mapsto W_t(\omega)$, consider all continuous functions $y : [0, T] \to \mathbb R^d$ which satisfy the identity
$$y(t) = x + \int_0^t b(s, y(s))\,ds + \sigma W_t(\omega)$$
and call $C(\omega, x)$ the set of all such functions. Denote by $\operatorname{Card}(C(\omega, x))$ the cardinality of the set $C(\omega, x)$.
Definition 6.2. We say that the sDE satisfies path-by-path uniqueness if
$$P\bigl(\operatorname{Card}(C(\omega, x)) \le 1 \text{ for all } x \in \mathbb R^d\bigr) = 1,$$
namely if for a.e. $\omega \in \Omega$, $C(\omega, x)$ is at most a singleton for every $x$ in $\mathbb R^d$.
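To make the difference in quantifiers explicit, the two notions of uniqueness can be put side by side. This is only a schematic comparison; the phrasing of pathwise uniqueness below is the standard one (two solutions on the same filtered probability space, driven by the same Brownian motion) and is not quoted from the paper:

```latex
\begin{align*}
\text{path-by-path:}\quad
 & P\bigl(\{\omega : \operatorname{Card}(C(\omega,x)) \le 1
   \ \text{for all } x \in \mathbb{R}^d\}\bigr) = 1, \\
\text{pathwise:}\quad
 & P\bigl(X_t = Y_t \ \text{for all } t \in [0,T]\bigr) = 1
 \ \text{ for any two adapted solutions } X, Y \text{ started from the same } x .
\end{align*}
```

In the first line the exceptional null set is chosen before the competing solutions, which are arbitrary continuous functions; in the second line the null set may depend on the pair of adapted processes $(X, Y)$.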
To our knowledge, the only two results on path-by-path uniqueness are [26] and [18]. We present here a new strategy for this kind of results.
We can now prove a central fact. For a bounded function $f$ and a Borel set $E$, denote by $\|f\|_{0,E}$ the supremum of $f$ over $E$; in general, this is not the essential supremum, unless $f$ is continuous.
Then, if $R > 0$ is such that $|y^{(i)}(t)| \le R$ for $t \in [0, T]$ and $i = 1, 2$, we find
$$\bigl|\langle \rho^\omega(t_f), v_0(\cdot + \sigma W_{t_f}(\omega)) \rangle\bigr| \le 2 \int_0^{t_f} \bigl\|(\tilde b^\omega(s) - \tilde b^\omega_{\varepsilon_n}(s)) \cdot \nabla \tilde v^\omega_{\varepsilon_n}(s)\bigr\|_{0,B_R}\,ds,$$
and thus $\langle \rho^\omega(t_f), v_0(\cdot + \sigma W_{t_f}(\omega)) \rangle = 0$ is satisfied due to (6.4). This is equivalent to $\langle \rho^\omega(t_f, \cdot - \sigma W_{t_f}(\omega)), v_0 \rangle = 0$, which implies $\rho^\omega(t_f, \cdot - \sigma W_{t_f}(\omega)) = 0$ since $v_0 \in D$ was arbitrary and $D$ separates points. Consequently, $y^{(1)}(t_f) = y^{(2)}(t_f)$ follows. This holds true for every $t_f \in [0, T] \cap \mathbb Q$, and since $t \mapsto y^{(i)}(t)$ is continuous for $i = 1, 2$, we get $y^{(1)}(t) = y^{(2)}(t)$ for every $t \in [0, T]$. This finishes the proof of the theorem.
Theorem 6.4 is, in a sense, our main result on path-by-path uniqueness, although assumption (6.1) is not explicit in terms of $b$. Roughly speaking, this condition is true when we have a uniform bound (in some probabilistic sense) for $\|\nabla \tilde v^\omega_{\varepsilon_n}\|_{0,B_R}$. It introduces a new approach to the very difficult question of path-by-path uniqueness, which may be easily generalized, for instance, to sDEs in infinite dimensions (which will be treated in separate works). A simple consequence is: where $v_\varepsilon$ is the smooth solution of the backward sPDEs (3.17) corresponding to $b_\varepsilon$ and $v_0$, with $c_\varepsilon = 0$. Then path-by-path uniqueness holds for (sDE).
In Section 2. which implies condition (6.5) of Corollary 6.5. Hence, we have: Corollary 6.6. Under the conditions of Section 2.10 (Db of class LPS ) we have path-by-path uniqueness for (sDE).
Notice that the conditions of Section 2.10 with m > d imply b ∈ C ε loc (R d , R d ) for some ε > 0. Thus, in the case m > d, this result is included in Corollary 6.8 below and already in [26]. However, also the limit case m = d ≥ 3 is included in our statement. Otherwise, we may take estimate (6.6) from [41] in the case b ∈ L ∞ (0, T ; C α b (R d )); (6.7) (essential boundedness in time, with values in C α b (R d ), is actually enough since the measure solutions to the continuity equation are only space-valued). Precisely, the following result is proved in [41]. We give here an independent proof for the sake of completeness.
Taking δ sufficiently small (and thus λ large enough), from Gronwall's lemma we deduce Since r > 1 is arbitrary, we may apply Kolmogorov's regularity criterion (see for instance the quantitative version of [57] for the bound on the moments of supremum norm in x) and entail (6.8), which finishes the proof of the lemma.
As a straightforward consequence of Theorem 6.4 in combination with Lemma 6.7, we obtain: Corollary 6.8. Under condition (6.7) we have path-by-path uniqueness for (sDE).

Examples and counterexamples
In this final section we present some examples of drifts under LPS conditions, which exhibit regularization by noise phenomena (i.e. the ODE is ill-posed, while the sDE is well-posed), and an example outside of the LPS conditions where our results do not hold.

Examples of regularization by noise
Example 7.1. Given a real number $\alpha$, we consider on $\mathbb R^d$ the autonomous vector field
$$b(x) := 1_{0<|x|\le 1}\,|x|^\alpha \hat x + 1_{|x|>1}\,x,$$
where $\hat x = x/|x|$ for $x \ne 0$ (and $\hat 0 = 0$). The vector field $b$ is Lipschitz continuous if and only if $\alpha \ge 1$ and it satisfies the LPS conditions if and only if $\alpha > -1$. In the deterministic case, if $-1 < \alpha < 1$, we see that:
• if $x_0 \ne 0$, then there exists a unique solution $Y$ to the ODE $dX_t/dt = b(t, X_t)$ (that is (sDE) with $\sigma = 0$) starting from $x_0$, namely where $t_1$ is the first time that $|Y| = 1$ ($t_1 = 0$ if $|x_0| > 1$);
• if $x_0 = 0$, then there is an infinite number of solutions to the ODE starting from 0, namely any function of the form
$$Y(t) = 1_{t>t_0}\Bigl[\bigl((1-\alpha)(t-t_0)\bigr)^{1/(1-\alpha)}\,\hat x_a\,1_{t\le t_1} + e^{t-t_1}\,\hat x_a\,1_{t>t_1}\Bigr].$$
Consequently, if $-1 < \alpha < 1$, there cannot exist a continuous flow solving the ODE: continuity fails at $x_0 = 0$. This also implies that non-uniqueness occurs for the transport equation with discontinuous (at 0) initial datum.
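For a starting point $0 < |x_0| \le 1$, the explicit solution can be recovered by a direct computation. The following is a sketch under the assumption $\alpha < 1$, writing $r(t) = |Y(t)|$ (a symbol introduced here only for illustration) and using that the drift is radial, so the direction $\hat x_0$ is preserved; it is a reconstruction by hand, not a quotation of the paper's formula:

```latex
\begin{align*}
\dot r &= r^{\alpha}, \quad r(0) = |x_0|
 \;\Longrightarrow\;
 \frac{d}{dt}\,r^{1-\alpha} = 1-\alpha
 \;\Longrightarrow\;
 r(t) = \bigl(|x_0|^{1-\alpha} + (1-\alpha)\,t\bigr)^{1/(1-\alpha)}
 \quad (t \le t_1), \\
r(t_1) &= 1
 \;\Longrightarrow\;
 t_1 = \frac{1 - |x_0|^{1-\alpha}}{1-\alpha}, \\
\dot r &= r \ \text{ for } t > t_1
 \;\Longrightarrow\;
 r(t) = e^{\,t - t_1}, \qquad Y(t) = r(t)\,\hat x_0 .
\end{align*}
```

Letting $|x_0| \to 0$ gives $r(t) = ((1-\alpha)t)^{1/(1-\alpha)}$, the profile with $t_0 = 0$ in the family of solutions starting from the origin; the limiting direction $\hat x_0$ is arbitrary, which is the source of the discontinuity of the flow at $0$.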
On the contrary, for α > −1 the drift is in the LPS class; hence, all our results apply to this example and show regularization by noise in various ways. In particular, the discontinuity of the flow at the origin is removed ω-wise; moreover, for α > 0 (when b is Hölder continuous), we have proved path-by-path uniqueness starting from x 0 = 0. Let us also recall that pathwise uniqueness holds, starting from 0, by [55].
In this particular example it is possible to get an intuitive idea, "by hand", of what happens. If one considers the ODE without noise starting from 0, any solution Y grows near 0 no faster than t 1/(1−α) ; on the contrary, the Brownian motion W near 0 grows as t 1/2 (up to a logarithmic correction, which does not affect the intuition). Heuristically, we could say that the "speed" of Y near 0 caused by the drift is like t α/(1−α) , while the one caused by W is like t −1/2 . So what we expect to happen is that the Brownian motion moves the particle immediately away from 0, faster than the action of the drift, and this prevents non-uniqueness or the formation of singularities. At least in the one-dimensional case, this can be seen also through speed measure and scale function, see [14].
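The comparison of the two heuristic "speeds" reduces to a comparison of exponents, which can be worked out explicitly for $\alpha < 1$:

```latex
\[
\frac{d}{dt}\,t^{1/(1-\alpha)} = \frac{1}{1-\alpha}\,t^{\alpha/(1-\alpha)},
\qquad
\frac{d}{dt}\,t^{1/2} = \tfrac12\,t^{-1/2},
\]
\[
\text{for } \alpha < 1:\qquad
\frac{\alpha}{1-\alpha} > -\frac12
\iff 2\alpha > -(1-\alpha)
\iff \alpha > -1 .
\]
```

So, as $t \to 0$, the drift-induced speed $t^{\alpha/(1-\alpha)}$ is negligible compared with $t^{-1/2}$ exactly when $\alpha > -1$, which matches the threshold for the LPS conditions in this example.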
Notice that if α < −1 the opposite phenomenon appears (Y is faster than W ), so that we expect ill-posedness. This is also an argument for the optimality of the LPS conditions (even if, in this case, we do not look at critical cases in LPS hypotheses), see also Example 7.4.
In this case, we see that, for every initial $x_0$, there exists a unique solution $Y$ to the ODE, which reaches 0 in finite time and then stays at 0. Thus, concentration happens at 0, so that there does not exist a Lagrangian flow (the image measure of the flow at time $t$ can have a Dirac delta at 0). Moreover the solution to the continuity equation concentrates at 0. Again our results apply when $\alpha > -1$, so these concentration phenomena disappear in the stochastic case.
$$b(x) = 1_{x \in A}\bigl(1_{0<|x|\le 1}\,|x|^\alpha \hat x + 1_{|x|>1}\,x\bigr) + 1_{x \in A^c}\bigl(-1_{0<|x|\le 1}\,|x|^\alpha \hat x - 1_{|x|>1}\,x\bigr),$$
where $A = \{x \in \mathbb R^2 : x_1 > 0 \text{ or } (x_1 = 0, x_2 > 0)\}$. It is easy to see that, for $\alpha < 1$, in the deterministic case one can construct flows with discontinuity, concentration of the mass at 0 or both; in particular, non-uniqueness holds. Again, for $\alpha > -1$, well-posedness (as in Theorem 4.5) is restored.

A counterexample in the supercritical case
Finally let us show that outside the LPS class there are equations and diffuse initial conditions without any solution; in particular the statement of Theorem 5.4 does not hold in this case.
Example 7.4. We now consider equation (sDE) on $\mathbb R^d$, with $\sigma = 1$ and drift $b$ defined as $b(x) := -\beta |x|^{-2} x\, 1_{x \ne 0}$, with $\beta > d/2$. Notice that, for $d \ge 2$, this drift is just outside the LPS class (in the sense that $|x|^{\alpha-1} x$ belongs to that class for any $\alpha > -1$). For this particular sDE, we have: for some $T > 0$ and $M > 0$, if $X_0$ is a random variable, independent of $W$ and uniformly distributed on $[-M, M]$, then there does not exist a weak solution starting from $X_0$.
Proof.
Step 1: (sDE) does not have a weak solution for X 0 = 0 (for any T > 0). Notice that this does not prevent from extending Theorem 5.4 to this case (because the initial datum is concentrated on 0), but it is a first step. The method is taken from [21].
Assume, by contradiction, that a weak solution on $[0, T]$ exists, i.e. there is a filtered probability space $(\Omega, \mathcal A, \mathcal G_t, P)$ (satisfying standard assumptions), a Brownian motion $W$ in $\mathbb R^d$ with respect to $(\mathcal G_t)_t$, and a $(\mathcal G_t)_t$-adapted continuous process $(X_t)_{t \ge 0}$ in $\mathbb R^d$, such that $\int_0^T |b(X_t)|\,dt < \infty$ and, a.s.,
$$X_t = \int_0^t b(X_s)\,ds + W_t, \qquad t \in [0, T].$$
Hence, $X$ is a continuous semimartingale, with quadratic covariation $\langle X^i, X^j \rangle_t = \delta_{ij} t$.
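For a weak solution as just described ($X_0 = 0$, $\sigma = 1$, drift $b(x) = -\beta |x|^{-2} x\,1_{x \ne 0}$), the Itô computation for $|X_t|^2$ can be sketched as follows. This is a reconstruction by hand under the stated assumptions, using only $\langle X^i, X^j \rangle_t = \delta_{ij} t$ and $x \cdot b(x) = -\beta\,1_{x \ne 0}$, and not a quotation of the paper's displays:

```latex
\begin{align*}
d|X_t|^2
 &= 2\,X_t \cdot dX_t + \sum_{i=1}^{d} d\langle X^i, X^i\rangle_t
  = 2\,X_t \cdot b(X_t)\,dt + 2\,X_t \cdot dW_t + d\,dt \\
 &= \bigl(d - 2\beta\,1_{X_t \neq 0}\bigr)\,dt + 2\,X_t \cdot dW_t, \\[2pt]
\text{and, if } \int_0^T 1_{X_s = 0}\,ds = 0 \ \text{a.s.:}\quad
|X_t|^2 &= (d - 2\beta)\,t + 2\int_0^t X_s \cdot dW_s,
\qquad d - 2\beta < 0 \ \text{ since } \beta > d/2 .
\end{align*}
```

The negative finite-variation part is what makes $|X_t|^2$ a nonnegative local supermartingale started at $0$, provided the time spent at the origin has zero Lebesgue measure.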
By the Itô formula, we have We now claim that Therefore, $|X_t|^2$ is a positive local supermartingale, vanishing at $t = 0$. This implies $|X_t|^2 \equiv 0$, hence $X_t \equiv 0$, which contradicts the fact that $\langle X^i, X^j \rangle_t = \delta_{ij} t$. It remains to prove the claim (7.1). Consider the random set $\{t \in [0, T] : X_t = 0\}$. Since it is a subset of $A_1 = \{t \in [0, T] : X^1_t = 0\}$, it is sufficient to prove that $A_1$ is of Lebesgue measure zero, $P$-a.s., and this is equivalent to $P\bigl(\int_0^T 1_{X^1_s = 0}\,ds = 0\bigr) = 1$. Since $X$ is a continuous semimartingale with quadratic covariation $\langle X^i, X^j \rangle_t = \delta_{ij} t$, also $X^1$ is a continuous semimartingale, with quadratic variation $\langle X^1, X^1 \rangle_t = t$. Hence, by the occupation times formula (see [73, Chapter VI, Corollary 1.6])
Sketch of proof. The idea is that the symmetry properties of the given drift $b$ allow one to reduce the solution $X$ of the sDE to a one-dimensional Bessel process, for which the probability of hitting 0 is known. Fix $R > 0$, $\varepsilon > 0$ and $x \in B_R \setminus B_\varepsilon$, and denote by $\tau_\varepsilon(x)$ the exit time from the annulus $B_R \setminus B_\varepsilon$ of the solution $Z(x)$ to (sDE), starting from $x$ (note that $Z$ exists up to $\tau_\varepsilon(x)$ since the drift is regular in the annulus). If we prove that, for every $x \ne 0$, there exist $T > 0$, $\delta > 0$ and $R$ large enough such that, for every $\varepsilon > 0$,
$$P\bigl(\tau_\varepsilon(x) < T,\ |Z(x, \tau_\varepsilon(x))| = \varepsilon\bigr) > \delta, \qquad (7.4)$$
then we have shown the claim (7.2). In order to prove (7.4) we notice that $P(\tau_\varepsilon(x) < T,\ |Z(x, \tau_\varepsilon(x))| = \varepsilon) = u(0, x)$, where $u$ solves the backward parabolic PDE, on $B_R \setminus B_\varepsilon$, $\partial_t u + b \cdot \nabla u + \frac12 \Delta u = 0$, with final and boundary conditions $u(T, \cdot) \equiv 0$ in $B_R \setminus B_\varepsilon$, $u(t, \cdot) \equiv 1$ on $\partial B_\varepsilon$ and $u(t, \cdot) \equiv 0$ on $\partial B_R$, for all $t \in [0, T]$.