A DISCRETIZED VERSION OF KRYLOV’S ESTIMATE AND ITS APPLICATIONS


Discretized Krylov's estimate
Let $(\Omega,\mathcal{F},\mathbb{P};(\mathcal{F}_t)_{t\geq 0})$ be a complete filtered probability space and $(W_t)_{t\geq 0}$ a $d$-dimensional standard $\mathcal{F}_t$-Brownian motion. Let $\xi_t$ be a $d$-dimensional Itô process of the form
$$
\xi_t = \xi_0 + \int_0^t b_s\,\mathrm{d}s + \int_0^t \sigma_s\,\mathrm{d}W_s, \tag{1.1}
$$
where $\xi_0\in\mathcal{F}_0$, and $b_s(\omega):\mathbb{R}_+\times\Omega\to\mathbb{R}^d$ and $\sigma_s(\omega):\mathbb{R}_+\times\Omega\to\mathbb{R}^d\otimes\mathbb{R}^d$ are bounded measurable $\mathcal{F}_t$-adapted processes with common bound $\kappa_0$. Suppose that for some $\kappa_1>0$,
$$
\det\big(\sigma_s(\omega)\sigma^*_s(\omega)\big) \geq \kappa_1,\qquad \forall (s,\omega)\in\mathbb{R}_+\times\Omega,
$$
where the asterisk stands for the transpose of a matrix. It is well known that for any $T>0$ and $p\geq d+1$, there exists a constant $C>0$, depending only on $\kappa_0,\kappa_1,p$ and $d$, such that for all $f\in L^p([0,T]\times\mathbb{R}^d)$,
$$
\mathbb{E}\int_0^T f(s,\xi_s)\,\mathrm{d}s \leq C\|f\|_{L^p([0,T]\times\mathbb{R}^d)}, \tag{1.2}
$$
and for time-independent $f\in L^p(\mathbb{R}^d)$ with $p\geq d$,
$$
\mathbb{E}\int_0^T f(\xi_s)\,\mathrm{d}s \leq C\|f\|_{L^p(\mathbb{R}^d)}. \tag{1.3}
$$
Such estimates were proven by Krylov in [8], and they play a basic role in the study of SDEs with measurable coefficients (see also [19] for some extensions).
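To see what (1.2) gives in concrete terms, one can test it on indicator functions; the following one-line consequence is recorded purely as an illustration (it is not a statement from [8]). For any Borel set $A\subset[0,T]\times\mathbb{R}^d$, taking $f=\mathbf{1}_A$ yields
$$
\mathbb{E}\int_0^T \mathbf{1}_A(s,\xi_s)\,\mathrm{d}s \leq C|A|^{1/p},
$$
so the expected occupation measure of $\xi$ is absolutely continuous with respect to Lebesgue measure, with a density in $L^{p/(p-1)}([0,T]\times\mathbb{R}^d)$ by duality; in particular, $\xi$ spends zero expected time in any Lebesgue-null space-time set.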
In this paper we are interested in proving a discretized version of (1.2). More precisely, for fixed $N\in\mathbb{N}$, we introduce the following discretized Itô process: for $k\in\mathbb{N}$,
$$
\xi^N_k := \xi^N_0 + \frac{1}{N}\sum_{j=0}^{k-1} b_j + \sum_{j=0}^{k-1}\sigma_j\big(W_{(j+1)/N}-W_{j/N}\big), \tag{1.4}
$$
where $\xi^N_0\in\mathcal{F}_0$ and, for each $j\in\mathbb{N}_0:=\mathbb{N}\cup\{0\}$, $b_j\in\mathbb{R}^d$ and $\sigma_j\in\mathbb{R}^d\otimes\mathbb{R}^d$ are $\mathcal{F}_{j/N}$-measurable random variables. We aim to establish a discretized version of Krylov's estimate for $\xi^N_k$ in the following theorem.
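Schematically, and consistently with how the two estimates of Theorem 1.1 are invoked below (for instance with $p>2d+1$ in dimension $2d$ in Section 3), they take the following form: assuming $|b_j|,\|\sigma_j\|\leq\kappa_0$ and $\det(\sigma_j\sigma_j^*)\geq\kappa_1$, for any $T>0$ there is a constant $C=C(\kappa_0,\kappa_1,p,d,T)>0$ such that for nonnegative measurable functions $f_k$ and $p\geq d+1$,
$$
\mathbb{E}\bigg(\frac{1}{N}\sum_{k=0}^{[NT]} f_k(\xi^N_k)\bigg) \leq C\bigg(\frac{1}{N}\sum_{k=0}^{[NT]}\|f_k\|_{L^p(\mathbb{R}^d)}^p\bigg)^{1/p}, \tag{1.5}
$$
while for a single nonnegative $f$ and any $p>d$,
$$
\mathbb{E}\bigg(\frac{1}{N}\sum_{k=0}^{[NT]} f(\xi^N_k)\bigg) \leq C\|f\|_{L^p(\mathbb{R}^d)}. \tag{1.6}
$$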
Let us consider the following general SDE in $\mathbb{R}^d$:
$$
\mathrm{d}X_t = b(t,X_t)\,\mathrm{d}t + \sigma(t,X_t)\,\mathrm{d}W_t, \tag{1.7}
$$
where $b:\mathbb{R}_+\times\mathbb{R}^d\to\mathbb{R}^d$ is a Borel measurable function, and $\sigma:\mathbb{R}_+\times\mathbb{R}^d\to\mathbb{R}^d\otimes\mathbb{R}^d$ is a nondegenerate matrix-valued Borel measurable function which is continuous in $x$. If $b$ and $\sigma$ are of linear growth in $x$ uniformly in $t$, it is well known that SDE (1.7) admits a unique weak solution $X_t$ (cf. [16]). Moreover, if in addition $\sigma$ is Lipschitz continuous in $x$ uniformly in $t$, then SDE (1.7) admits a unique strong solution (cf. [18]).

For $h\in(0,1)$, consider the following Euler approximation of SDE (1.7):
$$
X^h_t = X_0 + \int_0^t b\big(s_h, X^h_{s_h}\big)\,\mathrm{d}s + \int_0^t \sigma\big(s_h, X^h_{s_h}\big)\,\mathrm{d}W_s,\qquad s_h := [s/h]h, \tag{1.9}
$$
or equivalently, for $k=0,1,2,\cdots$ and $t\in[kh,(k+1)h)$,
$$
X^h_t = X^h_{kh} + b\big(kh, X^h_{kh}\big)(t-kh) + \sigma\big(kh, X^h_{kh}\big)\big(W_t - W_{kh}\big). \tag{1.10}
$$
One would ask whether it holds that
$$
X^h \to X \ \text{in probability as}\ h\downarrow 0, \tag{1.11}
$$
where the key point in proving the above limit is to show an estimate like (1.5). Notice that if we take $h=1/N$, then $\xi^N_k := X^{1/N}_{k/N}$ takes exactly the form (1.4). In fact, when $\sigma$ is Hölder continuous, that is, for some $\alpha\in(0,1)$ and $c>0$,
$$
\|\sigma(t,x)-\sigma(t,y)\| \leq c|x-y|^\alpha,\qquad \forall t\geq 0,\ x,y\in\mathbb{R}^d,
$$
Gyöngy and Krylov [4, Theorem 4.2] proved that $X^h_t$ admits a density $\rho^h_t(y)$ satisfying an upper bound with a constant $C=C(d,p,\kappa_0,\kappa_1)>0$; see also [9] for two-sided estimates of $\rho^h_t(x)$. From this, it is easy to derive a discretized Krylov estimate as above for any $p>d/\alpha$. This discretized Krylov estimate plays a key role in [4] for showing (1.11) when $b$ is only bounded measurable. However, by (1.6), such an estimate holds for any $p>d$ without any continuity assumption on $\sigma$; in other words, using (1.6) we can drop the continuity assumption on $\sigma$ in Theorem 2.8 of [4]. It should be noticed that in the remarkable paper [4], under very broad assumptions, Gyöngy and Krylov used Euler's polygonal approximation to construct the strong solution of SDE (1.7). We mention that if $b$ satisfies some monotonicity condition and $\sigma$ is Lipschitz continuous, Gyöngy [3] obtained the rate of almost sure convergence of Euler's scheme. Up to now, there are many works devoted to the study of Euler's approximation for SDEs with irregular coefficients under various assumptions; see, for example, [5, 10, 13, 1] and the references therein.
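For concreteness, here is a minimal simulation sketch of the scheme (1.9)-(1.10); the specific coefficients `b` and `sigma` below are hypothetical placeholders (a bounded discontinuous drift and a constant nondegenerate diffusion), chosen to land in the regime covered by the discretized Krylov estimate, and the code is an illustration rather than part of the analysis.

```python
import numpy as np

def euler_scheme(b, sigma, x0, T=1.0, h=0.01, rng=None):
    """Euler approximation (1.10) of SDE (1.7): on each interval
    [kh, (k+1)h) the coefficients are frozen at the left mesh point kh."""
    rng = np.random.default_rng() if rng is None else rng
    d = len(x0)
    n_steps = int(T / h)
    x = np.empty((n_steps + 1, d))
    x[0] = x0
    for k in range(n_steps):
        t_k = k * h
        dW = rng.normal(scale=np.sqrt(h), size=d)  # increment W_{(k+1)h} - W_{kh}
        x[k + 1] = x[k] + b(t_k, x[k]) * h + sigma(t_k, x[k]) @ dW
    return x

# Hypothetical coefficients: bounded measurable drift (discontinuous at 0)
# and a constant nondegenerate diffusion, so no continuity of sigma is needed.
b = lambda t, x: np.sign(x)
sigma = lambda t, x: np.eye(2)
path = euler_scheme(b, sigma, x0=np.zeros(2), T=1.0, h=1 / 256)
# With h = 1/N, the skeleton path[k] = X^{1/N}_{k/N} is exactly of the form (1.4).
```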

Euler's scheme for DDSDEs
Another goal of this paper is to use Theorem 1.1 to derive the same results as in [4] for mean-field SDEs (also called McKean-Vlasov or distribution-dependent SDEs in the literature) with measurable discontinuous coefficients $b$ and $\sigma$. For $\beta\geq 0$, let $\mathcal{P}_\beta(\mathbb{R}^d)$ be the space of all probability measures on $\mathbb{R}^d$ with finite $\beta$-order moment, endowed with the weak convergence topology. Let $\beta\geq 1$. Consider the following distribution-dependent SDE (abbreviated as DDSDE):
$$
\mathrm{d}X_t = b_t(X_t,\mu_{X_t})\,\mathrm{d}t + \sigma_t(X_t,\mu_{X_t})\,\mathrm{d}W_t, \tag{1.12}
$$
where $\mu_{X_t}$ stands for the law of the random variable $X_t$, and $b:\mathbb{R}_+\times\mathbb{R}^d\times\mathcal{P}_\beta(\mathbb{R}^d)\to\mathbb{R}^d$ and $\sigma:\mathbb{R}_+\times\mathbb{R}^d\times\mathcal{P}_\beta(\mathbb{R}^d)\to\mathbb{R}^d\otimes\mathbb{R}^d$ are Borel measurable functions. Below we make the following assumptions:

$(\mathbf{H}_\beta)$ For each $x,\mu$, $t\mapsto b_t(x,\mu)$ and $t\mapsto\sigma_t(x,\mu)$ are continuous, and for each $t,x$, $\mu\mapsto b_t(x,\mu)$ and $\mu\mapsto\sigma_t(x,\mu)$ are weakly continuous. Moreover, for some $\beta\geq 1$ there is a constant $c_0>0$ such that for all $t\geq 0$, $x\in\mathbb{R}^d$ and $\mu\in\mathcal{P}_\beta(\mathbb{R}^d)$,
$$
|b_t(x,\mu)| + \|\sigma_t(x,\mu)\| \leq c_0\Big(1+|x|+\big(\mu(|\cdot|^\beta)\big)^{1/\beta}\Big),
$$
and the following nondegeneracy condition holds: there is a constant $c_1>0$ such that for all $t\geq 0$, $x\in\mathbb{R}^d$ and $\mu\in\mathcal{P}_\beta(\mathbb{R}^d)$,
$$
\det(\sigma\sigma^*)(t,x,\mu)\geq c_1. \tag{1.13}
$$
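For orientation, a simple class of coefficients satisfying $(\mathbf{H}_\beta)$ is the following (this example is ours, offered only as an illustration): fix a bounded continuous kernel $K:\mathbb{R}^d\to\mathbb{R}^d$ and set
$$
b_t(x,\mu) := \int_{\mathbb{R}^d} K(x-y)\,\mu(\mathrm{d}y),\qquad \sigma_t(x,\mu) := \mathbb{I}_d.
$$
Then $t\mapsto b_t(x,\mu)$ is trivially continuous, $\mu\mapsto b_t(x,\mu)$ is weakly continuous because $K$ is bounded and continuous, $|b_t(x,\mu)|\leq\|K\|_\infty$ gives the growth bound, and (1.13) holds with $c_1=1$. If $K$ is merely bounded measurable, the weak continuity in $\mu$ fails in general, and one is in the situation of $(\widetilde{\mathbf{H}}_\beta)$ below.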
$(\widetilde{\mathbf{H}}_\beta)$ Let $\widetilde b$ and $\widetilde\sigma$ be two Borel measurable functions on $\mathbb{R}_+\times\mathbb{R}^d\times\mathbb{R}^d$ with values in $\mathbb{R}^d$ and $\mathbb{R}^d\otimes\mathbb{R}^d$, respectively, and define
$$
b_t(x,\mu) := \int_{\mathbb{R}^d}\widetilde b_t(x,y)\,\mu(\mathrm{d}y),\qquad \sigma_t(x,\mu) := \int_{\mathbb{R}^d}\widetilde\sigma_t(x,y)\,\mu(\mathrm{d}y).
$$
Assume that for each $x,y\in\mathbb{R}^d$, $t\mapsto\widetilde b_t(x,y),\widetilde\sigma_t(x,y)$ are continuous, and for some $c_0>0$ and all $t\geq 0$, $x,y\in\mathbb{R}^d$,
$$
|\widetilde b_t(x,y)| + \|\widetilde\sigma_t(x,y)\| \leq c_0(1+|x|+|y|),
$$
and we also assume that the nondegeneracy condition (1.13) holds.
The difference between $(\mathbf{H}_\beta)$ and $(\widetilde{\mathbf{H}}_\beta)$ lies in that, in the latter case, $\mu\mapsto b_t(x,\mu)$ and $\mu\mapsto\sigma_t(x,\mu)$ may fail to be continuous with respect to weak convergence. Notice that we do not make any continuity assumptions on $\widetilde b,\widetilde\sigma$ in $x,y$. We now consider the following Euler approximation of DDSDE (1.12): for $h\in(0,1)$,
$$
X^h_t = X_0 + \int_0^t b_{s_h}\big(X^h_{s_h},\mu_{X^h_{s_h}}\big)\,\mathrm{d}s + \int_0^t \sigma_{s_h}\big(X^h_{s_h},\mu_{X^h_{s_h}}\big)\,\mathrm{d}W_s,\qquad s_h:=[s/h]h. \tag{1.14}
$$
The following theorem extends [4, Theorem 2.8] to DDSDEs.

Theorem 1.2. Let $\beta>2$, $\nu\in\mathcal{P}_\beta(\mathbb{R}^d)$, and suppose that one of $(\mathbf{H}_\beta)$ and $(\widetilde{\mathbf{H}}_\beta)$ holds.
(i) Suppose that weak uniqueness holds for DDSDE (1.12). Then there is a unique weak solution $X$ to DDSDE (1.12) with initial law $\mathbb{P}\circ X_0^{-1}=\nu$, and $X^h$ converges to $X$ in distribution. Moreover, for any bounded measurable $f$,
$$
\lim_{h\downarrow 0}\mathbb{E}\int_0^T f(s,X^h_s)\,\mathrm{d}s = \mathbb{E}\int_0^T f(s,X_s)\,\mathrm{d}s. \tag{1.15}
$$
(ii) Suppose that pathwise uniqueness holds for DDSDE (1.12). Then there is a unique strong solution $X$ to DDSDE (1.12) with initial law $\nu$, and $X^h$ converges to $X$ in probability; in particular, for any $T>0$,
$$
\lim_{h\downarrow 0}\mathbb{E}\Big(\sup_{t\in[0,T]}\big|X^h_t - X_t\big|^2\Big)=0. \tag{1.16}
$$
Concerning the weak and strong uniqueness of DDSDE (1.12): by Girsanov's theorem, Li and Min [11] obtained the existence and uniqueness of weak solutions when $b$ is bounded measurable and $\sigma$ is nondegenerate and Lipschitz continuous. Under $(\mathbf{H}_\beta)$ or $(\widetilde{\mathbf{H}}_\beta)$, when $\sigma$ does not depend on $\mu$ and is Lipschitz continuous in $x$, and $b$ is Lipschitz continuous with respect to $\mu$ in the case $(\mathbf{H}_\beta)$, Mishura and Veretennikov [12] showed strong uniqueness. In a recent work of the present author with Röckner [14], we established the well-posedness of DDSDEs (1.12) with singular drifts (see also [6]).

Propagation of chaos for Euler's scheme
Below we fix $h\in(0,1)$ and let $\{\xi_j, j\in\mathbb{N}\}$ be a sequence of i.i.d. random variables in $\mathbb{R}^d$ with common distribution $\nu$, and $\{W^j, j\in\mathbb{N}\}$ a sequence of independent $d$-dimensional standard Brownian motions. For numerical reasons, we also consider the following interacting particle approximation of Euler's scheme: for fixed $N\in\mathbb{N}$, we define, for $j=1,\cdots,N$,
$$
X^{N,j}_t = \xi_j + \int_0^t b_{s_h}\Big(X^{N,j}_{s_h},\tfrac{1}{N}\sum_{i=1}^N\delta_{X^{N,i}_{s_h}}\Big)\mathrm{d}s + \int_0^t \sigma_{s_h}\Big(X^{N,j}_{s_h},\tfrac{1}{N}\sum_{i=1}^N\delta_{X^{N,i}_{s_h}}\Big)\mathrm{d}W^j_s, \tag{1.17}
$$
where $\delta_x$ stands for the Dirac measure concentrated at the point $x$. In the following, for simplicity, we only consider the case $(\widetilde{\mathbf{H}}_\beta)$, and in this case we have
$$
b_s\Big(x,\tfrac{1}{N}\sum_{i=1}^N\delta_{y_i}\Big) = \tfrac{1}{N}\sum_{i=1}^N\widetilde b_s(x,y_i),\qquad
\sigma_s\Big(x,\tfrac{1}{N}\sum_{i=1}^N\delta_{y_i}\Big) = \tfrac{1}{N}\sum_{i=1}^N\widetilde\sigma_s(x,y_i). \tag{1.18}
$$
Let $X^j$ denote the solution of Euler's scheme (1.14) with initial value $\xi_j$ and driving Brownian motion $W^j$. Clearly, $\{X^j_\cdot, j\in\mathbb{N}\}$ is a family of i.i.d. stochastic processes with the same distribution as $X^h_\cdot$.

Theorem 1.3. Let $\beta>2$ and $\nu\in\mathcal{P}_\beta(\mathbb{R}^d)$. Suppose that $(\widetilde{\mathbf{H}}_\beta)$ holds and the initial law $\nu$ has a density $\phi\in L^q_{loc}(\mathbb{R}^d)$ for some $q>1$. Then it holds that for any $T>0$,
$$
\lim_{N\to\infty}\mathbb{E}\big|X^{N,j}_t - X^j_t\big|^2 = 0,\qquad \forall t\in[0,T]. \tag{1.19}
$$
For fixed $h\in(0,1)$ and $N\in\mathbb{N}$, we use $\mathcal{E}_h$ and $\mathcal{P}_N$ to denote the operators of Euler's scheme and of the interacting particle approximation to DDSDE (1.12), respectively. An open question is to establish the corresponding limits as $h\to 0$ and $N\to\infty$ jointly; obviously, the obstacle is to show the propagation of chaos (1.21) under $(\widetilde{\mathbf{H}}_\beta)$. When $b$ and $\sigma$ are Lipschitz continuous in $x$ and $\mu$, the propagation of chaos (1.21) was proven by Sznitman [17]. Recently, Bao and Huang [2] proved (1.20) by Zvonkin's transformation when $b$ and $\sigma$ are Hölder continuous in $x$ and Lipschitz continuous in $\mu$ with respect to the Wasserstein distance. However, under $(\widetilde{\mathbf{H}}_\beta)$, proving (1.21) seems to be a challenging problem.
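Under $(\widetilde{\mathbf{H}}_\beta)$, by (1.18) the measure argument reduces to an average of a kernel over the particles, which is what makes the scheme (1.17) implementable. The following is a minimal sketch; the kernels `b_ker` and `sig_ker` are hypothetical stand-ins for $\widetilde b$ and $\widetilde\sigma$, not taken from the paper.

```python
import numpy as np

def particle_euler(b_ker, sig_ker, xi0, T=1.0, h=0.01, rng=None):
    """Interacting particle Euler scheme (1.17): each particle j is driven by
    its own Brownian motion W^j, and the measure argument of the coefficients
    is the empirical measure of all N particles, frozen at the last mesh point."""
    rng = np.random.default_rng() if rng is None else rng
    N, d = xi0.shape
    n_steps = int(T / h)
    x = xi0.copy()
    for k in range(n_steps):
        t_k = k * h
        # (1.18): b_s(x, (1/N) sum_i delta_{y_i}) = (1/N) sum_i b~_s(x, y_i)
        drift = np.array([b_ker(t_k, x[j], x).mean(axis=0) for j in range(N)])
        diff = np.array([sig_ker(t_k, x[j], x).mean(axis=0) for j in range(N)])
        # independent Brownian increments dW[j] for each particle j
        dW = rng.normal(scale=np.sqrt(h), size=(N, d))
        x = x + drift * h + np.einsum('jab,jb->ja', diff, dW)
    return x

# Hypothetical kernels: b~_t(x, y) = tanh(x - y), sigma~_t(x, y) = I_d.
b_ker = lambda t, x, ys: np.tanh(x - ys)  # shape (N, d)
sig_ker = lambda t, x, ys: np.broadcast_to(np.eye(len(x)), (len(ys), len(x), len(x)))
xN = particle_euler(b_ker, sig_ker, xi0=np.random.default_rng(0).normal(size=(200, 2)))
```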

Plan and notations
This paper is organized as follows: in Section 2, we prove Theorem 1.1; in Section 3, we prove Theorem 1.2; in Section 4, we prove Theorem 1.3. Throughout this paper we use the following conventions:
• For a matrix $\sigma$, we use $\|\sigma\|$ to denote the Hilbert-Schmidt norm of $\sigma$.
• For $R>0$, we use $B_R$ to denote the ball in $\mathbb{R}^d$ with radius $R$ and center $0$.
• We write $A\lesssim B$ if $A\leq CB$ for some unimportant constant $C\geq 1$, whose dependence on the parameters can be traced from the context.

Proof of Theorem 1.1
To prove (1.5), we shall use the classical Krylov estimate (1.2). For this we need to embed $\xi^N_k$ into a continuous Itô process: for $k\in\mathbb{N}_0$ and $t\in[k/N,(k+1)/N)$, define
$$
X^N_t := \xi^N_k + b_k\big(t-\tfrac{k}{N}\big) + \sigma_k\big(W_t - W_{k/N}\big).
$$
In this way, it is easy to see that $X^N_{k/N}=\xi^N_k$, and $X^N$ is a continuous Itô process of the form (1.1) whose coefficients $b_{[tN]},\sigma_{[tN]}$ are piecewise constant in time and obey the same bounds $\kappa_0,\kappa_1$. Similarly, let $(f_k)_{k\in\mathbb{N}_0}$ be a family of nonnegative measurable functions on $\mathbb{R}^d$. If we define
$$
\widetilde f_s(x) := f_{[sN]}(x),
$$
then
$$
\frac{1}{N}\sum_{k=0}^{[tN]} f_k(\xi^N_k) = \int_0^{t^+_N} \widetilde f_s\big(X^N_{[sN]/N}\big)\,\mathrm{d}s,
$$
where $t^+_N := ([tN]+1)/N$. Moreover, by (1.2) we have for any $p\geq d+1$ and $y\in\mathbb{R}^d$,
$$
\mathbb{E}\int_0^{t^+_N} \widetilde f_s\big(X^N_s + y\big)\,\mathrm{d}s \leq C\bigg(\frac{1}{N}\sum_{k=0}^{[tN]}\|f_k\|^p_{L^p(\mathbb{R}^d)}\bigg)^{1/p}, \tag{2.2}
$$
with $C$ independent of $y$, since $X^N+y$ is again an Itô process with the same bounds.

At this moment, we cannot immediately conclude (1.5), because we need to treat $\widetilde f_s$ evaluated at the mesh point $X^N_{[sN]/N}$ rather than at $X^N_s$ itself. Recall that
$$
\varphi_t(y) := (2\pi t)^{-d/2}\mathrm{e}^{-|y|^2/(2t)}
$$
is the distributional density of the Brownian motion $W_t$. Conditioning on $\mathcal{F}_{[sN]/N}$ and writing the increment $W_s - W_{[sN]/N}$ through this Gaussian density, the evaluation at the mesh point reduces to integrals of the shifted process that appear in (2.2). The desired estimate (1.5) then follows from (2.2) once the resulting integral of $\varphi$, taken over a mesh interval of length $1/N$, is shown to be finite; in fact, by the change of variables and the scaling property of $\varphi_t(y)$, it is bounded by a constant $c=c(d,q)>0$. As for (1.6), it follows by using (1.3) instead of (1.2) in the above proof.
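The scaling property invoked above amounts to the following elementary computation, recorded here for convenience (with $q>1$ arbitrary):
$$
\varphi_t(y) = t^{-d/2}\varphi_1\big(y/\sqrt{t}\big),\qquad
\int_{\mathbb{R}^d}\varphi_t(y)^{q}\,\mathrm{d}y = (2\pi t)^{-dq/2}\Big(\frac{2\pi t}{q}\Big)^{d/2} = (2\pi t)^{-d(q-1)/2}\,q^{-d/2},
$$
so that $t\mapsto\|\varphi_t\|^q_{L^q(\mathbb{R}^d)}$ is integrable near $t=0$ precisely when $d(q-1)/2<1$; this is the kind of finiteness required of the last integral in the proof.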

Proof of Theorem 1.2
The following lemma is standard; it follows from Burkholder's and Gronwall's inequalities.

Lemma 3.1. For any $T>0$, there is a constant $C>0$ independent of $h$ such that
$$
\mathbb{E}\Big(\sup_{t\in[0,T]}\big|X^h_t\big|^\beta\Big) \leq C\big(1+\mathbb{E}|X_0|^\beta\big), \tag{3.1}
$$
and for any $s,t\in[0,T]$,
$$
\mathbb{E}\big|X^h_t - X^h_s\big|^\beta \leq C|t-s|^{\beta/2}. \tag{3.2}
$$

Proof. For simplicity, we let $|X^h_t|^* := \sup_{s\in[0,t]}|X^h_s|$. By Burkholder's inequality and the linear growth of $b$ and $\sigma$, we have
$$
\mathbb{E}\big(|X^h_t|^*\big)^\beta \lesssim \mathbb{E}|X_0|^\beta + \int_0^t \Big(1+\mathbb{E}\big(|X^h_s|^*\big)^\beta\Big)\,\mathrm{d}s,
$$
so that (3.1) follows by Gronwall's inequality; (3.2) is proven in the same way.

Let $\mathbb{Q}_h$ be the law of $(X^h_\cdot, W_\cdot)$ in the product space $\mathbb{C}\times\mathbb{C}$, where $\mathbb{C}$ is the space of continuous functions. By (3.2), since $\beta>2$, $(\mathbb{Q}_h)_{h\in(0,1)}$ is tight. Therefore, by Prokhorov's theorem, there are a subsequence $h_n\to 0$ as $n\to\infty$ and $\mathbb{Q}\in\mathcal{P}(\mathbb{C}\times\mathbb{C})$ so that $\mathbb{Q}_n := \mathbb{Q}_{h_n}\to\mathbb{Q}$ weakly. Now, by Skorokhod's representation theorem, there are a probability space $(\widetilde\Omega,\widetilde{\mathcal{F}},\widetilde{\mathbb{P}})$ and random variables $(\widetilde X^n,\widetilde W^n)$ and $(\widetilde X,\widetilde W)$ defined on it such that
$$
(\widetilde X^n,\widetilde W^n) \overset{d}{=} \big(X^{h_n}, W\big)\ \text{for each } n\in\mathbb{N}, \tag{3.3}
$$
and
$$
(\widetilde X^n,\widetilde W^n) \to (\widetilde X,\widetilde W),\quad \widetilde{\mathbb{P}}\text{-a.s.} \tag{3.4}
$$
Define $\widetilde{\mathcal{F}}^n_t := \sigma\{\widetilde X^n_s,\widetilde W^n_s;\ s\leq t\}$; then by (3.3),
$$
\mathbb{E}\Big(\mathrm{e}^{\mathrm{i}\langle z,\,\widetilde W^n_t-\widetilde W^n_s\rangle}\,\Big|\,\widetilde{\mathcal{F}}^n_s\Big) = \mathrm{e}^{-(t-s)|z|^2/2},\qquad \forall z\in\mathbb{R}^d,\ s<t. \tag{3.5}
$$
In other words, $\widetilde W^n_t$ is an $\widetilde{\mathcal{F}}^n_t$-Brownian motion. Thus, by (3.3) and (3.5) we have
$$
\widetilde X^n_t = \widetilde X^n_0 + \int_0^t b_{s_n}\big(\widetilde X^n_{s_n},\mu_{\widetilde X^n_{s_n}}\big)\,\mathrm{d}s + \int_0^t \sigma_{s_n}\big(\widetilde X^n_{s_n},\mu_{\widetilde X^n_{s_n}}\big)\,\mathrm{d}\widetilde W^n_s, \tag{3.6}
$$
where $s_n := [s/h_n]h_n$. Using the above lemma, we can show the following limits by the discretized Krylov estimate.

Lemma 3.3. For every $t>0$, it holds that
$$
\int_0^t \sigma_{s_n}\big(\widetilde X^n_{s_n},\mu_{\widetilde X^n_{s_n}}\big)\,\mathrm{d}\widetilde W^n_s \to \int_0^t \sigma_s\big(\widetilde X_s,\mu_{\widetilde X_s}\big)\,\mathrm{d}\widetilde W_s \ \text{in probability as } n\to\infty,
$$
and the analogous limit holds for the drift term.

Proof. Let $\widetilde\sigma^\varepsilon := \widetilde\sigma * \varrho_\varepsilon$, where $(\varrho_\varepsilon)_{\varepsilon\in(0,1)}$ is a family of mollifiers in $\mathbb{R}^d\times\mathbb{R}^d$ with support in $B_\varepsilon\times B_\varepsilon$, and let $\sigma^\varepsilon_s(x,\mu) := \int_{\mathbb{R}^d}\widetilde\sigma^\varepsilon_s(x,y)\,\mu(\mathrm{d}y)$. For fixed $\varepsilon\in(0,1)$, since $\widetilde\sigma^\varepsilon$ is continuous and of linear growth in $x,y$, by (3.4) and Lemma 3.2 (see also [4, Lemma 3.1]), it is easy to see that
$$
\int_0^t \sigma^\varepsilon_{s_n}\big(\widetilde X^n_{s_n},\mu_{\widetilde X^n_{s_n}}\big)\,\mathrm{d}\widetilde W^n_s \to \int_0^t \sigma^\varepsilon_s\big(\widetilde X_s,\mu_{\widetilde X_s}\big)\,\mathrm{d}\widetilde W_s
$$
in probability as $n\to\infty$. Indeed, it suffices to prove the two limits (3.9) and (3.10): limit (3.9) follows by (3.2) and the continuity of $t\mapsto\sigma^\varepsilon_t(x,y)$, and limit (3.10) follows by (3.4) and Lemma 3.2. Therefore, it remains to prove that
$$
\int_0^t \sigma^\varepsilon_{s_n}\big(\widetilde X^n_{s_n},\mu_{\widetilde X^n_{s_n}}\big)\,\mathrm{d}\widetilde W^n_s \to \int_0^t \sigma_{s_n}\big(\widetilde X^n_{s_n},\mu_{\widetilde X^n_{s_n}}\big)\,\mathrm{d}\widetilde W^n_s \tag{3.11}
$$
in probability, uniformly in $n$, as $\varepsilon\to 0$, together with the corresponding limit (3.12) for the limiting process. We only show (3.11). By Itô's isometry, Jensen's inequality and (3.3), we have
$$
\mathbb{E}\int_0^t \big\|\big(\sigma^\varepsilon_{s_n}-\sigma_{s_n}\big)\big(\widetilde X^n_{s_n},\mu_{\widetilde X^n_{s_n}}\big)\big\|^2\,\mathrm{d}s \leq \int_0^t \mathbb{E}\big\|\big(\widetilde\sigma^\varepsilon_{s_n}-\widetilde\sigma_{s_n}\big)\big(X^n_{s_n},\bar X^n_{s_n}\big)\big\|^2\,\mathrm{d}s =: J^n_\varepsilon(t),
$$
where $X^n := X^{h_n}$ and $\bar X^n_\cdot$ is an independent copy of $X^n_\cdot$. More precisely, $(X^n,\bar X^n)$ solves the following equation (the Euler scheme):
$$
\mathrm{d}X^n_t = b_{t_n}\big(X^n_{t_n},\mu_{X^n_{t_n}}\big)\,\mathrm{d}t + \sigma_{t_n}\big(X^n_{t_n},\mu_{X^n_{t_n}}\big)\,\mathrm{d}W_t,\qquad
\mathrm{d}\bar X^n_t = b_{t_n}\big(\bar X^n_{t_n},\mu_{\bar X^n_{t_n}}\big)\,\mathrm{d}t + \sigma_{t_n}\big(\bar X^n_{t_n},\mu_{\bar X^n_{t_n}}\big)\,\mathrm{d}\bar W_t, \tag{3.13}
$$
where $(W, X^n_0)$ and $(\bar W,\bar X^n_0)$ are independent and have the same distributions. In order to use the discretized Krylov estimate to show
$$
\lim_{\varepsilon\to 0}\sup_n J^n_\varepsilon(t) = 0, \tag{3.14}
$$
we use a standard stopping time technique. For $R>0$, we define the stopping time
$$
\tau^n_R := \inf\big\{t>0 : |X^n_t|\vee|\bar X^n_t| > R\big\},
$$
and make the following decomposition:
$$
J^n_\varepsilon(t) = \int_0^t \mathbb{E}\Big(\mathbf{1}_{t\geq\tau^n_R}\big\|\big(\widetilde\sigma_{s_n}-\widetilde\sigma^\varepsilon_{s_n}\big)\big(X^n_{s_n},\bar X^n_{s_n}\big)\big\|^2\Big)\mathrm{d}s + \int_0^t \mathbb{E}\Big(\mathbf{1}_{t<\tau^n_R}\big\|\big(\widetilde\sigma_{s_n}-\widetilde\sigma^\varepsilon_{s_n}\big)\big(X^n_{s_n},\bar X^n_{s_n}\big)\big\|^2\Big)\mathrm{d}s =: J^{n,1}_{R,\varepsilon}(t) + J^{n,2}_{R,\varepsilon}(t).
$$
For $J^{n,1}_{R,\varepsilon}(t)$, by Hölder's inequality, (3.1) and Chebyshev's inequality we have $\sup_{n,\varepsilon} J^{n,1}_{R,\varepsilon}(t) \lesssim R^{2-\beta}$. For $J^{n,2}_{R,\varepsilon}(t)$, we cannot directly use the discretized Krylov estimate to conclude
$$
\lim_{\varepsilon\to 0}\sup_n J^{n,2}_{R,\varepsilon}(t) = 0,\qquad \forall t,R>0, \tag{3.16}
$$
because the Euler scheme (3.13) has unbounded coefficients; we need to cut off the coefficients. Let $\chi_R(x)$ be a nonnegative smooth cutoff function with $\chi_R(x)=1$ for $|x|<R$ and $\chi_R(x)=0$ for $|x|>R+1$. Let $(X^{n,R},\bar X^{n,R})$ solve the following equation in $\mathbb{R}^{2d}$ (no coupling), obtained from (3.13) by evaluating the coefficients at the truncated positions while keeping the original law arguments:
$$
\mathrm{d}X^{n,R}_t = b_{t_n}\Big(X^{n,R}_{t_n}\chi_R\big(X^{n,R}_{t_n}\big),\mu_{X^n_{t_n}}\Big)\mathrm{d}t + \sigma_{t_n}\Big(X^{n,R}_{t_n}\chi_R\big(X^{n,R}_{t_n}\big),\mu_{X^n_{t_n}}\Big)\mathrm{d}W_t,
$$
and similarly for $\bar X^{n,R}_t$ with $(\bar W,\bar X^n_0)$, where $(W,X^n_0)$ and $(\bar W,\bar X^n_0)$ are the same as in (3.13). From the construction, one sees that
$$
(X^n_t,\bar X^n_t) = (X^{n,R}_t,\bar X^{n,R}_t),\qquad t<\tau^n_R. \tag{3.17}
$$
Moreover, it is easy to see that $(X^{n,R},\bar X^{n,R})$ is a discretized $\mathbb{R}^{2d}$-valued Itô process with coefficients satisfying the assumptions of Theorem 1.1 uniformly in $n$. Thus, for fixed $R>0$ and any $p>2d+1$, by (3.17) and (1.5) we have
$$
J^{n,2}_{R,\varepsilon}(t) = \int_0^t \mathbb{E}\Big(\mathbf{1}_{t<\tau^n_R}\big\|\big(\widetilde\sigma_{s_n}-\widetilde\sigma^\varepsilon_{s_n}\big)\big(X^{n,R}_{s_n},\bar X^{n,R}_{s_n}\big)\big\|^2\Big)\mathrm{d}s \leq \int_0^t \mathbb{E}\Big(\mathbf{1}_{|X^{n,R}_{s_n}|\vee|\bar X^{n,R}_{s_n}|<R}\big\|\big(\widetilde\sigma_{s_n}-\widetilde\sigma^\varepsilon_{s_n}\big)\big(X^{n,R}_{s_n},\bar X^{n,R}_{s_n}\big)\big\|^2\Big)\mathrm{d}s \lesssim I^R_\varepsilon(t) + K^R_n(t),
$$
where $I^R_\varepsilon(t)$ controls the mollification error of $\widetilde\sigma$ on $B_R\times B_R$ via (1.5), $K^R_n(t)$ controls the time-discretization error, $h_n\downarrow 0$ as $n\to\infty$, and the constants contained in $\lesssim$ may depend on $R$. By the dominated convergence theorem and the continuity of $t\mapsto\widetilde\sigma_t(x,y)$, we have
$$
\lim_{\varepsilon\to 0} I^R_\varepsilon(t) = 0,\qquad \lim_{n\to\infty} K^R_n(t) = 0,
$$
which in turn implies the limit (3.16), and hence (3.14). Thus we complete the proof.
Proof of (i) of Theorem 1.2. Using the above lemma and taking limits on both sides of (3.6), one finds that $(\widetilde X,\widetilde W)$ solves the following SDE:
$$
\widetilde X_t = \widetilde X_0 + \int_0^t b_s\big(\widetilde X_s,\mu_{\widetilde X_s}\big)\,\mathrm{d}s + \int_0^t \sigma_s\big(\widetilde X_s,\mu_{\widetilde X_s}\big)\,\mathrm{d}\widetilde W_s. \tag{3.18}
$$
Since weak uniqueness holds for DDSDE (1.12), all weak solutions have the same distribution. Hence, the whole Euler approximation $X^h$ converges in distribution to the unique weak solution $X$. As for (1.15), it follows by Krylov's estimate (1.5).
In order to show (ii) of Theorem 1.2, we need the following important observation from [4, Lemma 1.1], which has its roots in Yamada-Watanabe's theorem.
Lemma 3.4. Let $(Z^h)_{h\in(0,1)}$ be a family of random elements in a Polish space $(E,\rho)$. Then $Z^h$ converges in probability to an $E$-valued random element as $h\to 0$ if and only if for every pair of subsequences $(Z^{h_n}, Z^{\ell_n})_{n\in\mathbb{N}}$, there exists a subsubsequence $(Z^{h_{n(k)}}, Z^{\ell_{n(k)}})_{k\in\mathbb{N}}$ that converges in distribution to a random element in $E\times E$ which is supported on the diagonal $\{(x,y)\in E\times E : x=y\}$.
Proof. We only prove the nontrivial "if" part, arguing by contradiction. Suppose that $Z^h$ does not converge in probability; by the completeness of $(E,\rho)$, $(Z^h)$ is then not Cauchy in probability, so there is an $\varepsilon>0$ such that for any $\delta>0$, there are $h_\delta,\ell_\delta$ less than $\delta$ with
$$
\mathbb{P}\big(\rho(Z^{h_\delta}, Z^{\ell_\delta}) > \varepsilon\big) \geq \varepsilon.
$$
Thus we can choose two subsequences $Z^{h_n}$ and $Z^{\ell_n}$ such that
$$
\inf_{n\in\mathbb{N}} \mathbb{P}\big(\rho(Z^{h_n}, Z^{\ell_n}) > \varepsilon\big) \geq \varepsilon. \tag{3.19}
$$
By the assumption, there is a subsubsequence $(Z^{h_{n(k)}}, Z^{\ell_{n(k)}})_{k\in\mathbb{N}}$ converging in distribution to a random element supported on the diagonal, so that
$$
\lim_{k\to\infty}\mathbb{E}\big(\rho(Z^{h_{n(k)}}, Z^{\ell_{n(k)}}) \wedge 1\big) = 0.
$$
Clearly, this contradicts (3.19). The proof is complete.
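Lemma 3.4 also suggests a practical way to probe convergence in probability of an Euler scheme numerically: drive two step sizes by the same Brownian path and test how much mass the pair puts off the diagonal. The following toy sketch is ours and purely illustrative (the coefficients are hypothetical; it plays no role in the proofs).

```python
import numpy as np

rng = np.random.default_rng(1)

def euler_on_grid(b, sigma, x0, dW_fine, h_fine, m):
    """Scalar Euler scheme with step h = m * h_fine, driven by a pre-sampled
    fine Brownian path; sharing dW_fine between two values of m couples the
    two approximations through the same Brownian motion."""
    n_coarse = len(dW_fine) // m
    x = np.empty(n_coarse + 1)
    x[0] = x0
    h = m * h_fine
    for k in range(n_coarse):
        dW = dW_fine[k * m:(k + 1) * m].sum()  # W_{(k+1)h} - W_{kh}
        x[k + 1] = x[k] + b(k * h, x[k]) * h + sigma(k * h, x[k]) * dW
    return x

b = lambda t, x: np.sign(x)   # bounded measurable, discontinuous at 0
sigma = lambda t, x: 1.0      # nondegenerate

T, h_fine = 1.0, 2.0 ** -12
n_fine = int(T / h_fine)
eps, n_mc, hits = 0.1, 200, 0
for _ in range(n_mc):
    dW_fine = rng.normal(scale=np.sqrt(h_fine), size=n_fine)
    x_coarse = euler_on_grid(b, sigma, 0.0, dW_fine, h_fine, m=16)  # h = 2^-8
    x_fine = euler_on_grid(b, sigma, 0.0, dW_fine, h_fine, m=4)     # h = 2^-10
    hits += np.max(np.abs(x_coarse - x_fine[::4])) > eps            # compare on coarse grid
print("estimated P( sup_t |X^h_t - X^h'_t| > eps ) =", hits / n_mc)
```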

Now we are in a position to complete the proof of Theorem 1.2.
Proof of (ii) of Theorem 1.2. Let $X^{h_n}$ and $X^{\ell_n}$ be two subsequences of $X^h$. Clearly, by Lemma 3.1, the law of $(X^{h_n}, X^{\ell_n}, W)_{n\in\mathbb{N}}$ in $\mathbb{C}\times\mathbb{C}\times\mathbb{C}$ is tight. As above, by Skorokhod's representation theorem, there exist a subsequence $n(k)$ and a probability space $(\widetilde\Omega,\widetilde{\mathcal{F}},\widetilde{\mathbb{P}})$ carrying stochastic processes $(\widetilde X^{h_{n(k)}}, \widetilde X^{\ell_{n(k)}}, \widetilde W^k)$ and $(\widetilde X, \bar X, \widetilde W)$ such that, as $k\to\infty$,
$$
\big(\widetilde X^{h_{n(k)}}, \widetilde X^{\ell_{n(k)}}, \widetilde W^k\big) \to \big(\widetilde X, \bar X, \widetilde W\big),\quad \widetilde{\mathbb{P}}\text{-a.s.},
$$
and for each $k\in\mathbb{N}$, $(\widetilde X^{h_{n(k)}}, \widetilde X^{\ell_{n(k)}}, \widetilde W^k)$ has the same law as $(X^{h_{n(k)}}, X^{\ell_{n(k)}}, W)$.
As in the proof of (3.18), one sees that $(\widetilde X,\widetilde W)$ and $(\bar X,\widetilde W)$ are two solutions of DDSDE (1.12) defined on the same probability space with the same initial values $\widetilde X_0 = \bar X_0$; the latter point is due to the fact that $X^{h_{n(k)}}$ and $X^{\ell_{n(k)}}$ start from the same initial value $X_0$. By the pathwise uniqueness, we obtain $\widetilde X = \bar X$. Thus, by Lemma 3.4, we conclude that $X^h$ converges in probability to a random element $X$ in $\mathbb{C}$ as $h\downarrow 0$. Using Lemma 3.3, one sees that $X$ is a solution of DDSDE (1.12). Moreover, the convergence (1.16) follows by (3.1) and the dominated convergence theorem.

Propagation of chaos: Proof of Theorem 1.3
In this section we use induction to prove Theorem 1.3. First of all, we prepare several lemmas. The following lemma is proven in exactly the same way as Lemma 3.1, and we omit the details.

Lemma 4.1. Under $(\widetilde{\mathbf{H}}_\beta)$, for any $T>0$ there is a constant $C>0$ such that for all $N\in\mathbb{N}$ and $j=1,\cdots,N$,
$$
\mathbb{E}\Big(\sup_{t\in[0,T]}\big|X^{N,j}_t\big|^\beta\Big) \leq C\big(1+\mathbb{E}|\xi_j|^\beta\big).
$$

Let $f^\varepsilon_t(x,y) := f_t * \varrho_\varepsilon(x,y)$ be the mollifying approximation of a measurable function $f$ on $\mathbb{R}_+\times\mathbb{R}^d\times\mathbb{R}^d$, and define $A^b_{N,\varepsilon}(t)$, $\bar A^b_{N,\varepsilon}(t)$, $A^\sigma_{N,\varepsilon}(t)$, $\bar A^\sigma_{N,\varepsilon}(t)$ as the errors caused by replacing $\widetilde b,\widetilde\sigma$ with $\widetilde b^\varepsilon,\widetilde\sigma^\varepsilon$ along $X^{N,j}$ and $X^j$, respectively.

Lemma 4.3. Let $\beta>2$ and $\nu\in\mathcal{P}_\beta(\mathbb{R}^d)$. Suppose that $(\widetilde{\mathbf{H}}_\beta)$ holds and the initial law $\nu$ has a density $\phi\in L^q_{loc}(\mathbb{R}^d)$ for some $q>1$. Then the above mollification errors vanish when one first lets $N\to\infty$ and then $\varepsilon\to 0$.

Proof. We only prove the first limit. For simplicity, we write $I_{N,\varepsilon}$ for $A^b_{N,\varepsilon}(t)$. Without loss of generality we assume $t>h$. By the assumption, we have the decomposition $I_{N,\varepsilon} \leq I^{(11)}_{N,\varepsilon} + I^{(12)}_{N,\varepsilon}$.
For $I^{(11)}_{N,\varepsilon}$, the required smallness is clear. For $I^{(12)}_{N,\varepsilon}$, if we define $\mathbf{B}_R := \{(x,y)\in\mathbb{R}^d\times\mathbb{R}^d : |x|<R,\ |y|<R\}$ for $R>0$, then by Hölder's inequality and $\phi\in L^q_{loc}(\mathbb{R}^d)$ we obtain a bound in which the constant $C$ contained in $\lesssim$ is independent of $N,R,\varepsilon$. By the dominated convergence theorem, first letting $\varepsilon\to 0$ and then $R\to\infty$, we get the desired limit. Combining the above calculations, we obtain the claim for all $t\in[0,T]$, and the proof of the lemma is complete.

Proof of Theorem 1.3. Here we cannot use Gronwall's inequality to derive the result; instead, we shall use the induction method to show (1.19). First of all, we clearly have $\mathbb{E}|X^{N,j}_0-X^j_0|^2=0$. Suppose now that (1.19) has been shown at the grid points $mh$, $m=0,\cdots,k$. For $t\in(kh,(k+1)h]$, cutting off at radius $R$ and mollifying with parameter $\varepsilon$ as in Section 3, we arrive at
$$
\mathbb{E}\big|X^{N,j}_t-X^j_t\big|^2 \lesssim \sum_{m=0}^{k}\mathbb{E}\big|X^{N,j}_{mh}-X^j_{mh}\big|^2 + C/R^{\beta-2} + A^b_{N,\varepsilon}(t) + \bar A^b_{N,\varepsilon}(t) + A^\sigma_{N,\varepsilon}(t) + \bar A^\sigma_{N,\varepsilon}(t).
$$
First letting $N\to\infty$ and then $R\to\infty$ and $\varepsilon\to 0$, by Lemma 4.3 and the induction hypothesis, we obtain (1.19) for $t=(k+1)h$. The proof is complete.