Uniform in Time Interacting Particle Approximations for Nonlinear Equations of Patlak-Keller-Segel type

We study a system of interacting diffusions that models chemotaxis of biological cells or microorganisms (referred to as particles) in a chemical field that is dynamically modified through the collective contributions from the particles. Such systems of reinforced diffusions have been widely studied and their hydrodynamic limits, which are nonlinear non-local partial differential equations, are usually referred to as Patlak-Keller-Segel (PKS) equations. Under the so-called "quasi-stationary hypothesis" on the chemical field, the limit PDE is closely related to granular media equations that have been extensively studied probabilistically in recent years. Solutions of the classical PKS equations may blow up in finite time and much of the PDE literature has focused on understanding this blow-up phenomenon. In this work we study a modified form of the PKS equation for which global existence and uniqueness of solutions hold. Our goal is to analyze the long-time behavior of the particle system approximating the equation. We establish, under suitable conditions, uniform in time convergence of the empirical measure of particle states to the solution of the PDE. We also provide uniform in time exponential concentration bounds for the rate of the above convergence under additional integrability conditions. Finally, we introduce an Euler discretization scheme for the simulation of the interacting particle system and give error bounds showing that the scheme converges uniformly in time and in the size of the particle system as the discretization parameter approaches zero.


Introduction
Consider the following system of nonlinear nonlocal partial differential equations, where (t, x) ∈ (0, ∞) × R^d and α, β, γ, χ are positive constants. The symbols ∇, ∇· and ∆ denote the gradient operator, the divergence operator and the standard Laplacian, respectively. Equations of the above form arise as reinforced diffusion models for chemotaxis of particles representing biological cells or microorganisms, in which the particle diffusions are directed by the gradient of a chemical field which in turn is dynamically modified by the contributions of the particles themselves (cf. [20,15]).
The functions u(t, x) and h(t, x) represent, respectively, the continuum limits of the densities of the biological particles and the particles constituting the chemical field.
The parameters α and β in the second equation model the decay rate of the chemical particles and the rate at which the biological particles contribute to the chemical field, respectively. The function g is the dispersal kernel which models the spread and amount of the chemical produced by the biological particles. A natural form for g is a Gaussian kernel g(x, y) := (2πδ)^{−d/2} exp{−|y − x|^2/(2δ)}, where δ is a small parameter. The first equation describes the collective motion of the biological particles. The dynamics of the individual particles are coupled through the gradient of the chemical field, i.e. ∇h, which defines their drift coefficient (up to a positive constant multiplier χ). Finally, the function V models a confinement potential for the particle motions. Thus the reinforcement mechanism is as follows. Particles are attracted to the chemical and they emit the chemical at a constant rate, resulting in a positive feedback: the more the cells are aggregated, the more concentrated in their vicinity is the chemical they produce, which in turn attracts other cells.
The key feature of the model is the competition between the aggregation resulting from the above reinforcement mechanism and the diffusive effect which spreads out the biological and chemical particles in space. When V = 0 and g(z − x)dz is the Dirac delta measure δ_x, (1.1) becomes the classical Patlak-Keller-Segel (PKS) model [20,15], which has been studied extensively. It is well known that for the 2-d PKS model (i.e. d = 2) there is a critical mass M_c such that (i) the solution to (1.1) blows up in finite time if the initial mass ∫ u(0, x) dx > M_c, and (ii) a smooth solution exists for all time if ∫ u(0, x) dx < M_c. For d ≥ 3, the blow up of the solution is related to the L^{d/2} norm of the initial density, but here the theory is less well developed. We refer the reader to the survey articles [13,14] for references to the large literature on the PKS model and its variants. One line of active research has focused on the prevention of finite time blowup of solutions via various modifications of the classical PKS equation that discourage mass concentration (cf. [10,1,7] and references therein). The replacement of the Dirac delta measure by a smooth density g(z − x)dz, as is considered in the current work, can be regarded as one such natural modification of the PKS model, in which chemicals are dispersed by cells over a region of positive area rather than at a single point. It is easy to see that there are unique global solutions for general initial conditions for the system (1.1) (cf. Proposition 2.3). The focus of this work is instead on the study of the long time behavior of (1.1) and its particle approximations, for which very little is known.
Long time behavior of weakly interacting particle systems of various types has been investigated in many recent works ( [25,8,17,9,4]) and although our work uses many ideas similar to those in these works, one key distinguishing feature and challenge in the model considered here is that the associated weakly interacting particle system is not a Markov process. In particular, Lyapunov function constructions that have been extensively used in the proofs in the above works are not available for the model considered here.
Note that if (u, h) solve (1.1) and ∫ u(0, x) dx = m, then (u/m, h) solves (1.1) with β replaced by βm. Thus we can (and will) assume without loss of generality that ∫ u(0, x) dx = 1. Our starting point is the following probabilistic representation for the solution of (1.1) in terms of a nonlinear diffusion of McKean-Vlasov type.
dX̄_t = (χ∇h(t, X̄_t) − ∇V(X̄_t)) dt + dB_t, L(X̄_t) = µ_t, (1.2)
coupled with the second equation in (1.1) with u(t, ·) replaced by µ_t, where {B_t} is a standard Brownian motion in R^d and L(X̄_t) denotes the probability law of X̄_t. In Proposition 2.3 we will show that the above equation has a unique pathwise solution (X̄_t, h(t, ·)) under natural conditions on the initial data and the kernel g. Furthermore, for t > 0 the measure µ_t admits a density u(t, ·) with respect to the Lebesgue measure. The first equation in (1.1) can be regarded as the Kolmogorov forward equation for the first equation in (1.2). In particular, it is easy to check that the pair (u(t, ·), h(t, ·)) is a solution of (1.1). Along with the nonlinear diffusion (1.2) we will also study a mesoscopic particle model for the chemotaxis phenomenon described above, given through a stochastic system of weakly interacting particles of the following form.
Here the {B^i_t} are independent standard Brownian motions in R^d. Note that the second equation in (1.3) is the same as that in (1.2) with µ_t replaced by the empirical measure
µ^N_t := (1/N) Σ_{i=1}^N δ_{X^{i,N}_t}. (1.4)
In this model a detailed evolution for the biological particles is used, whereas the chemical field is regarded as the continuum limit of much smaller chemical molecules. One can also consider a microscopic model where a detailed evolution equation for chemical particles replaces the second equation in (1.3). Such 'full particle system approximations' of (1.1) for the classical PKS model (i.e. when g is replaced by a Dirac probability measure) were studied in [22], where the convergence of the empirical measures of the biological and chemical particles to the solution of the limit PDE, up to the blow up time of the solutions, was established. Starting from the works of McKean and Vlasov [18], nonlinear diffusions and the associated weakly interacting particle models have been studied by many authors (see, for instance, [23,19,17,16]). One important difference of (1.2) (and similarly (1.3)) from these classical papers is that the right side of the first equation depends not only on µ_t but rather on the full past trajectory of the laws, i.e. {µ_s : 0 ≤ s ≤ t}. In our first main result we will establish, under suitable conditions, uniform in time convergence of µ^N_t to µ_t, in a suitable sense. Such a result is important since it says in particular that the time asymptotic aggregation behavior of the particle system is well captured by the asymptotic density function u(t, ·) as t → ∞.
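To make the structure of the system concrete, the following minimal sketch simulates a system of the form (1.3). It is an illustration only, not the scheme analyzed later in the paper: we assume a Gaussian dispersal kernel g with bandwidth δ, a quadratic confinement V(x) = |x|²/2, h_0 = 0 and γ = 1, and approximate ∇h^N through the variation-of-constants representation, so that the drift at time t depends on the whole history of empirical measures. All names and parameter values here are our own choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameter values (not from the paper)
alpha, beta, chi, delta = 1.0, 1.0, 1.0, 0.5
N, dt, n_steps, d = 30, 0.01, 60, 2

def grad_smoothed_kernel(z, sigma):
    """Gradient of a centred Gaussian density with variance sigma per coordinate,
    standing in for grad(P_t g) when g itself is Gaussian with bandwidth delta."""
    dens = (2 * np.pi * sigma) ** (-d / 2) * np.exp(-np.sum(z ** 2, axis=-1) / (2 * sigma))
    return (-z / sigma) * dens[..., None]

X = rng.normal(size=(N, d))   # initial particle positions
history = [X.copy()]          # the dynamics must remember all past empirical measures

for n in range(1, n_steps + 1):
    t = n * dt
    grad_h = np.zeros((N, d))
    # variation-of-constants formula for grad h^N with h_0 = 0, gamma = 1:
    # a time integral over the past of exponentially discounted, heat-smoothed
    # kernel gradients driven by the empirical measures mu^N_s
    for k, Xk in enumerate(history):
        s = k * dt
        sigma = delta + (t - s)
        diffs = X[:, None, :] - Xk[None, :, :]            # (N, N, d) pairwise differences
        grad_h += dt * beta * np.exp(-alpha * (t - s)) * \
            grad_smoothed_kernel(diffs, sigma).mean(axis=1)
    drift = chi * grad_h - X                              # chi*grad h - grad V, V(x) = |x|^2/2
    X = X + drift * dt + np.sqrt(dt) * rng.normal(size=(N, d))
    history.append(X.copy())
```

The growing `history` list is precisely the non-Markovian feature discussed in the text: the particle positions alone do not form a Markov process, since the drift integrates over all past empirical measures.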
In general one would also like to know how well µ N approximates µ for a fixed value of N . In order to address such questions, in our second result, under stronger integrability conditions, we will provide uniform in time exponential concentration bounds that give estimates on rates of convergence of µ N to µ.
A natural empirical approach to the study of the long time properties of (1.3) is through numerical simulations. For example, under Neumann boundary conditions, a numerical simulation of (1.7) in a square demonstrates a separation of time scales: after an initial short time interval during which particles aggregate to form many crowded subpopulations, the subpopulations merge to form a stationary profile on a much slower time scale. See [11] and Figure 3.9 in [21] for such simulation results. Note however that the system cannot be simulated exactly, and in practice one needs a suitable time discretization. For such simulations to form a reliable basis for mathematical intuition about the long time behavior, it is key that they approximate the system (1.3), or the PDE (1.1), uniformly in time. We will show that under suitable conditions a natural discretization scheme for (1.3) gives a uniform in time convergent approximation to the solution of (1.2) as N → ∞ and the discretization step size tends to zero. Our uniform in time numerical approximations thus offer qualitative insights into the long time dynamical behavior of such systems.

Existing results and some challenges
One of the key challenges in the study of (1.3) is that the Nd-dimensional process X^{(N)} = (X^{1,N}, · · · , X^{N,N}) is not a Markov process, since the right side of the first equation in (1.3) depends on the full past history of the empirical measure, i.e. {µ^N_s}_{0≤s≤t}. In order to obtain a Markovian descriptor one needs to consider the pair (X^{(N)}, h^N), which is an infinite dimensional Markov process. Similar difficulties arise in the study of (1.1), where the form of the coupling between u and h makes the analysis challenging.
These difficulties do not occur for the reduced parabolic-elliptic system (1.5) obtained by formally letting γ → ∞ in (1.1). In the context of chemotaxis, the model in (1.5) corresponds to a quasi-stationary hypothesis for the chemical h, that is, the chemical diffuses on a much faster time scale than the biological particles. Equation (1.5) is mathematically more tractable since here one can solve for h explicitly in terms of u and g, namely h = β G_α ∗ u, where ∗ denotes the standard convolution operator, G_α(z) = ∫_0^∞ e^{−αt} P_t g(z) dt and P_t is the standard heat semigroup, i.e. the semigroup generated by (1/2)∆. Using this expression for h, the system (1.5) can be expressed as a single equation of the form (1.6). Kinetic equations of this form have been well studied in the literature [18,25,23,8,9,17,4], where they are sometimes referred to as granular media equations because of their use in the modeling of granular flows (cf. [2]). An interacting particle approximation for this equation takes the simple form (1.7), where {B_t} is a standard Brownian motion in R^{dN}. Note that in this case X^N is a Markov process given as an Nd-dimensional diffusion with a gradient form drift. Law of large numbers results and propagation of chaos properties for such models over a finite time interval, which rigorously connect the asymptotic behavior of (1.7) as N → ∞ with the equation in (1.6), are classical and go back to the works of McKean [18] and Sznitman [23]. In recent years there has also been significant progress in the study of the time asymptotic behavior of (1.7) and (1.6). Under suitable growth and convexity assumptions on V and G_α, [25] studied the existence and local exponential stability of fixed points of (1.5) by a suitable construction of a Lyapunov function. Similar Lyapunov functions were used in [8,17,9].
for all positive integers k, where for p ≥ 1, W_p is the Wasserstein-p distance (see Section 1.2) on the space of probability measures on R^{dk}, and u is the solution to (1.6).
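For contrast with the non-Markovian system above, the Markovian particle approximation (1.7) for the parabolic-elliptic model can be sketched directly: under the quasi-stationary hypothesis the drift involves only the current empirical measure, through ∇G_α. In the sketch below, g is taken Gaussian (so P_t g is Gaussian with variance δ + t), G_α is approximated by a Riemann sum over its heat-semigroup representation, and all parameter values are illustrative assumptions of ours.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, beta, chi, delta = 1.0, 1.0, 1.0, 0.5
N, d, dt, n_steps = 40, 2, 0.02, 100

# Riemann-sum quadrature for G_alpha(z) = int_0^infty e^{-alpha t} (P_t g)(z) dt;
# for Gaussian g, P_t g is again Gaussian with variance delta + t.
t_nodes = np.linspace(0.05, 6.0, 40)
w = t_nodes[1] - t_nodes[0]

def grad_G_alpha(z):                           # z: (..., d)
    out = np.zeros_like(z)
    for t in t_nodes:
        s = delta + t
        dens = (2 * np.pi * s) ** (-d / 2) * np.exp(-np.sum(z ** 2, -1) / (2 * s))
        out += w * np.exp(-alpha * t) * (-z / s) * dens[..., None]
    return out

X = rng.normal(size=(N, d))
for _ in range(n_steps):
    diffs = X[:, None, :] - X[None, :, :]      # pairwise differences, (N, N, d)
    # Markovian gradient-form drift: -grad V + chi*beta*(grad G_alpha * mu^N),
    # with V(x) = |x|^2 / 2 as an illustrative confinement potential
    drift = chi * beta * grad_G_alpha(diffs).mean(axis=1) - X
    X = X + drift * dt + np.sqrt(dt) * rng.normal(size=(N, d))
```

The state X alone is Markov here, in contrast to the parabolic-parabolic sketch, where the whole trajectory of empirical measures enters the drift.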
Under the same assumptions (1.8) and (1.9), a uniform in time concentration bound was established in [4, Theorem 2.12], where the key convexity requirement is implied by (1.9). The paper [17] proves uniform in time weak convergence of empirical measures constructed from an implicit Euler discretization scheme for the Markovian system (1.7) to the solution of (1.6). As remarked earlier, uniform in time numerical approximations are useful for obtaining qualitative insights into the long time dynamical behavior of such systems. Much less is known for the model (1.1)-(1.3). For the classical parabolic-parabolic Patlak-Keller-Segel PDE, global existence in the subcritical case (i.e. the initial mass is less than 8π) in R^2 is established in [6], and the corresponding uniqueness result is established in [5]. We refer the reader to the references in [5] for recent developments on the parabolic-parabolic Patlak-Keller-Segel PDE. None of these works consider particle approximations or long time behavior (however, see [5] for recent stability results in the plane in a quasi parabolic-elliptic regime). The goal of the current work is to develop the theory for the long time behavior of (1.1)-(1.3), analogous to the one for the parabolic-elliptic model described above. As noted earlier, our approach is inspired by the ideas developed in [25,8,17,9,4]. Our main contributions are as follows.

Contributions of this work
In this work we identify conditions under which the particle system (1.3) converges to the nonlinear process (1.2) uniformly over the infinite time horizon, and construct time stable numerical approximations for (1.3) and (1.2). More precisely, the main contributions of this paper are as follows. 1. We establish a propagation of chaos (POC) result over finite time horizons (Theorem 3.1), which does not require any convexity assumption. 2. Under a suitable convexity condition (Assumption 2.4), we prove uniform in time POC and uniform in time convergence of µ^N_t to µ_t (Theorem 3.4 and Corollary 3.5). 3. Under stronger integrability conditions, we establish uniform in time exponential concentration bounds for µ^N, given in Theorem 3.8. These bounds say that the probability of observing a deviation of µ^N_t from its LLN prediction µ_t is exponentially small, uniformly in t, as N increases.

4. An explicit Euler scheme for (1.3) is constructed, and it is shown to converge to the solution of (1.3) uniformly in time and in N (Theorem 3.10). Together with the POC result in 2, this shows that the Euler scheme gives a uniform in time convergent approximation for the nonlinear process as N → ∞ and the step size goes to 0 (Corollary 3.11).
Our main condition for the uniform in time results in 2, 3 and 4 is Assumption 2.4. This assumption can be regarded as an analog of condition (1.9) used in the study of (1.6)-(1.7).
The paper is organized as follows. In Section 2, we present the basic wellposedness results and introduce our main assumptions. Section 3 contains the main results of this work. Finally Section 4 is devoted to proofs.
Notation: For a Polish space (i.e. a complete separable metric space) S, P(S) denotes the space of all probability measures on S. This space is equipped with the topology of weak convergence. Distance on a metric space S will be denoted as d_S(·, ·), and if S is a normed linear space, the corresponding norm will be denoted as ‖·‖_S. If clear from the context, S will be suppressed from the notation. The space of continuous functions from an interval I ⊂ [0, ∞) to R^d is denoted by C(I : R^d). The space C_T := C([0, T] : R^d) will be equipped with the usual uniform norm, and the Fréchet space C := C([0, ∞) : R^d) will be equipped with a metric consistent with uniform convergence on compact sets. Given metric spaces S_i, i = 1, . . . , k, the distance on the space S_1 × · · · × S_k is taken to be the sum of the k coordinate distances: d(x, y) = Σ_{i=1}^k d_{S_i}(x_i, y_i), x = (x_1, · · · , x_k), y = (y_1, · · · , y_k). The law of an S-valued random variable X (an element of P(S)) is denoted by L(X). A collection of S-valued random variables {X_α} is said to be tight if their laws {L(X_α)} are tight in P(S). For a signed measure µ on S and a µ-integrable function f : S → R, we write ∫ f dµ as ⟨f, µ⟩. For a Polish space S, the Wasserstein-p distance on P(S) is defined as
W_p(µ, ν) := ( inf_π ∫_{S×S} d_S(x, y)^p π(dx, dy) )^{1/p}, (1.11)
where the infimum is taken over all probability measures π ∈ P(S × S) with marginals µ and ν. Let P_p(S) be the set of µ ∈ P(S) having finite p-th moments, where p ∈ [1, ∞).
By the Kantorovich-Rubinstein duality,
W_1(µ, ν) = sup_{f ∈ Lip_1(S)} |⟨f, µ⟩ − ⟨f, ν⟩|, (1.12)
where Lip_1(S) is the space of Lipschitz functions on S whose Lipschitz constant is at most 1.
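In one dimension, the Wasserstein distance between two empirical measures with the same number of atoms can be computed exactly by matching order statistics, which is convenient for numerically illustrating the convergence results that follow. A minimal sketch (the function name and sample sizes are our own choices):

```python
import numpy as np

def wasserstein_p_1d(xs, ys, p=1):
    """W_p between the empirical measures of two equally sized 1-d samples:
    the optimal coupling pairs the sorted samples (order statistics)."""
    assert len(xs) == len(ys)
    return float(np.mean(np.abs(np.sort(xs) - np.sort(ys)) ** p) ** (1.0 / p))

rng = np.random.default_rng(0)
a = rng.normal(size=5000)
b = rng.normal(size=5000) + 0.5      # same law translated by 0.5
print(wasserstein_p_1d(a, b))        # close to 0.5: W_1 of a pure translation
```

For a pure translation the exact W_1 between the underlying laws is the size of the shift, so the printed value is close to 0.5 up to empirical fluctuations of order N^{-1/2}.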
Throughout, (Ω, F, P) will denote a probability space equipped with a filtration (F_t) satisfying the usual conditions. The symbol E denotes expectation with respect to the probability measure P. For a stochastic process X, the notations X_t and X(t) will be used interchangeably.
The space of all bounded continuous functions on S is denoted by C_b(S). The supremum of a function f : S → R is denoted as ‖f‖_∞ := sup_{x∈S} |f(x)|. The space of functions with k continuous (resp. continuous and bounded) derivatives will be denoted as C^k(S) (resp. C^k_b(S)). For a differentiable f, |∇f(x)| = (Σ_i (∂_{x_i} f(x))²)^{1/2}, and for a twice differentiable g, ‖Hess g‖_∞ = sup_{i,j} sup_x |∂_{x_i} ∂_{x_j} g(x)|.

Preliminaries and well-posedness
Note that for a bounded function g : R^d → R, a solution h to (1.2) satisfies, by the variation of constants formula,
h(t, x) = e^{−αt} P_t h_0(x) + β ∫_0^t e^{−α(t−s)} P_{t−s}(g ∗ µ_s)(x) ds. (2.1)
EJP 22 (2017), paper 8.

This section gives the basic wellposedness results for equations (1.2) and (1.3) under
suitable conditions on the dispersal kernel g and the initial chemical field h_0. Lemma 2.1 below gives uniform boundedness and uniform Lipschitz properties for ∇h and ∇h^N.
Its proof is straightforward but is included for completeness in Section 4.1.
where the outside supremum is taken over all m ∈ P(C([0, ∞) : R^d)).
In Section 4.1, using Lemma 2.1 we prove the following proposition which gives the wellposedness of (1.3).
With another application of Lemma 2.1 and straightforward modifications of classical fixed point arguments (cf. [23]), we prove the following proposition in Section 4.1 as well.

Assumptions
The following will be our standing assumptions.
• γ = 1 and ∫_{R^d} g(x) dx = 1. The second assumption can be made without loss of generality by modifying the value of β, whereas the first is for notational convenience; the proofs for a general γ follow similarly.
The above assumptions will be used without further comment. Note that from the last assumption it follows that ∇V (0) = 0.
In addition, for several results the following convexity assumption will be made. This condition plays a role in the study of the long-time properties of the parabolic-parabolic system analogous to that of condition (1.9) for the parabolic-elliptic system. Let v_* be as defined in (2.7).
Note that since ∇V is Lipschitz, |v_*| ≤ L_{∇V}, where the latter is the Lipschitz constant of ∇V. Let

Propagation of chaos
A standard approach to proving POC (see, for instance, [23,17,9]) is via a coupling construction. Using such a coupling we establish the following POC result for any finite time horizon. Note that this result does not require the convexity assumption (i.e. Assumption 2.4).
As an immediate consequence of this result we have the following result on the asymptotic mutual independence of (X^{1,N}, . . . , X^{k,N}) for each fixed k and the convergence of each X^{i,N} to X̄^1 (cf. [23]). The result in particular says that L(X^{1,N}, · · · , X^{N,N}) is L(X̄)-chaotic in the terminology of [23].
Proof. From the definition of the Wasserstein-2 distance, and using the fact that (X^{i,N}, X̄^i) has the same distribution as (X^{1,N}, X̄^1) for i = 1, . . . , N, we obtain the stated bound. Since the random variables are exchangeable, by [23, Proposition 2.2] we have the following process level weak convergence of empirical distributions.
to the deterministic measure L(X̄), in probability, in P(C).

Observe that Corollary 3.2 implies in particular that
for all T ≥ 0. However this result does not give uniform in time convergence of these multidimensional laws. To obtain a uniform in time result, we will impose the stronger convexity condition in Assumption 2.4. The following is the analog of Theorem 3.1 over an infinite time horizon.
As an immediate consequence of the theorem we have the following uniform in time propagation of chaos result and uniform in time convergence of the empirical measures µ N (t).
Proofs of Theorem 3.1, Theorem 3.4 and Corollary 3.5 are given in Section 4.2.

Concentration bounds
In this section, we present our concentration estimates for µ N t in W 1 -distance. As in the previous subsection, we first give a result for finite time horizons (this result will not use Assumption 2.4).
Theorem 3.6. Suppose the initial distribution µ_0 ∈ P(R^d) has a finite square-exponential moment, that is, there is θ_0 > 0 such that ∫_{R^d} e^{θ_0|x|²} µ_0(dx) < ∞. (3.2) We note that the constant K may depend on T but not on ε and d′; also, C and N_0 may depend on T and d′ but not on ε. The main idea in the proof is, as in [4], to (i) bound W_1(µ^N_t, µ_t) in terms of (W_1(ν^N_s, µ_s))_{s∈[0,t]}, where ν^N is the empirical measure of N i.i.d. copies of the nonlinear process (see (3.3)), and (ii) control the tail probabilities of the latter quantities. The first step is accomplished in Subsection 4.3.1 via a coupling argument similar to the one used in the proofs of the results in Section 3.1, while the second step relies on an estimate from [4] for the tail probabilities of empirical measures of i.i.d. random variables, which is based on the equivalence between Talagrand's transportation inequalities (cf. [4]) and the existence of a finite square-exponential moment. The precise result obtained in [4] is as follows. For a, α ∈ (0, ∞), the relevant quantities are defined in the statement of Theorem 3.7. We shall apply this theorem to ν = µ_s. In order to do so, we need µ_s to have a finite square-exponential moment. We will show in Section 4.3.2 that if µ_0 satisfies (3.2), then (3.4) holds. This will allow us to apply Theorem 3.7 in completing step (ii) in the proof of Theorem 3.6. We next show in Theorem 3.8 below that when the convexity property in Assumption 2.4 is satisfied, a uniform in time concentration bound holds. The key step (see Section 4.3.2) is to argue (see Proposition 4.3) that under this assumption, for some θ_∞ > 0, (3.4) holds with [0, T] and θ_T replaced by [0, ∞) and θ_∞ respectively. This, together with another uniform bound established in Section 4.3.1 (Proposition 4.1), will imply the uniform in time concentration bound given in the theorem below. Theorem 3.8. Suppose that µ_0 satisfies (3.2) for some θ_0 > 0. Suppose further that Assumption 2.4 is satisfied. Then there exists K ∈ (0, ∞) such that for any d′ ∈ (d, ∞), there exist C ∈ (0, ∞) and N_0 ∈ (0, ∞) such that the corresponding concentration bound holds uniformly in t ≥ 0. We note that C and N_0 may depend on d′ but not on ε.
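Step (ii) of this program, concentration for empirical measures of i.i.d. samples, can be illustrated numerically. The sketch below (one dimension, Gaussian samples, a large reference sample standing in for µ; all names, values and the quantile proxy are our own assumptions) estimates P(W_1(ν^N, µ) > ε) for increasing N; the decay with N is the qualitative content of the i.i.d. concentration estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
ref = rng.normal(size=100_000)          # large sample standing in for the law mu

def w1_to_reference(sample, ref_q):
    """W_1 between the empirical measure of `sample` and a quantile grid
    approximating mu (one dimension, sorted matching)."""
    return float(np.mean(np.abs(np.sort(sample) - ref_q)))

eps, trials = 0.2, 300
rates = []
for N in (25, 50, 100, 200):
    # mid-point quantiles of the reference sample, one atom per particle
    ref_q = np.quantile(ref, (np.arange(N) + 0.5) / N)
    hits = sum(w1_to_reference(rng.normal(size=N), ref_q) > eps
               for _ in range(trials))
    rates.append(hits / trials)
print(rates)   # estimated tail probabilities P(W_1 > eps), shrinking as N grows
```

The estimated tail probabilities drop quickly with N, consistent with (but of course not a proof of) the exponential concentration asserted in the theorems.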
Proofs of Theorems 3.6 and 3.8 will be given in Section 4.3.

Uniform convergence of Euler scheme
In this section we introduce an Euler approximation for the collection of SDEs in (1.3), which can be used for approximate simulation of the system. We show that the approximation error converges to 0 as the time discretization step converges to 0, uniformly in time. As a consequence it will follow that the empirical measure of the particle states in the approximate system converges to the law of the nonlinear process, uniformly in time, as N → ∞ and the step size ε → 0 (Corollary 3.11). Note that Q_t has transition density q(t, x, y) = e^{−αt} p(t, x, y) with respect to Lebesgue measure, where p(t, x, y) is the standard Gaussian kernel. Using (2.5), we obtain the system of equations governing the particle system. We now define an explicit Euler scheme for (3.5) with step size ε ∈ (0, 1). Note that the integral on the right hand side of (3.8) can be written in terms of quantities of the form ∫_{[kε,(k+1)ε]} G_θ(x, y) dθ; thus, in order to evaluate a typical Euler step, one needs to compute such terms, which can be done using numerical integration.
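To make the structure of a single Euler step concrete, here is a hedged sketch. It is not the precise scheme (3.8): we assume a Gaussian g, quadratic V, h_0 = 0, and replace the exact time integral by a midpoint rule over each interval of length ε; all names and values are ours.

```python
import numpy as np

alpha, beta, chi, delta, eps, d = 1.0, 1.0, 1.0, 0.5, 0.05, 2

def grad_heat_g(z, theta):
    """Gradient of P_theta g at z for Gaussian g: a Gaussian of variance delta + theta."""
    s = delta + theta
    dens = (2 * np.pi * s) ** (-d / 2) * np.exp(-np.sum(z ** 2, -1) / (2 * s))
    return (-z / s) * dens[..., None]

def euler_step(past, rng):
    """past: list [Y_0, ..., Y_n] of (N, d) arrays; returns Y_{n+1}."""
    n = len(past) - 1
    Y = past[-1]
    drift = -Y                          # -grad V for V(x) = |x|^2 / 2
    for k, Yk in enumerate(past):
        # midpoint approximation of the discounted heat-smoothed kernel integral
        # over a time interval of length eps, against the empirical measure at step k
        theta = (n - k + 0.5) * eps
        diffs = Y[:, None, :] - Yk[None, :, :]
        drift = drift + chi * beta * eps * np.exp(-alpha * theta) * \
            grad_heat_g(diffs, theta).mean(axis=1)
    return Y + eps * drift + np.sqrt(eps) * rng.normal(size=Y.shape)

rng = np.random.default_rng(0)
path = [rng.normal(size=(20, d))]
for _ in range(40):
    path.append(euler_step(path, rng))
```

Note that the cost of step n grows linearly in n, since the drift integrates over the whole simulated past; this is the price of the non-Markovian dynamics, and the reason uniform in time error control is non-trivial.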
Our goal is to provide uniform in time estimates on the mean square error of the scheme. For that we begin by establishing moment bounds for the Euler scheme which are uniform in N, in the step size ε, and in the time instant n. Recall v_* introduced in (2.7). Also recall that µ_0 ∈ P_2(R^d).
We now present our main result on the uniform convergence of the Euler scheme. For this result we will make the stronger convexity assumption in Assumption 2.4. It is important that the estimate in (3.9) is uniform not only in the time instant n but also in the size of the system N. As a consequence, in order to control the mean square error for larger systems one does not need smaller time discretization steps. This in particular implies that the Euler scheme provides a good numerical approximation to the nonlinear process, uniformly in time. Namely, we have the following result.

Proofs
We will denote by κ, κ 1 , κ 2 , · · · the constants that appear in various estimates within a proof. These constants only depend on the model parameters or problem data, namely, α, β, χ, g, V, h 0 , d and µ 0 . For estimates on a finite time horizon [0, T ], these constants may also depend on T and in that case we use κ T , κ 1,T , κ 2,T , · · · to denote such constants.
The value of such constants may change from one proof to another.

Wellposedness
Proof of Lemma 2.1. From the definition of h_m in (2.6) and of the semigroup {Q_t}, we have, for all x, x_1, x_2 ∈ R^d and t ≥ 0, the stated uniform bounds. The result is immediate from these inequalities.
Proof of Proposition 2.2. For notational simplicity, we suppress the index N and write X N,i as X i .
Here ‖∇H(s, ·)‖_∞ = sup_x |∇H(s, x)|, and the last inequality follows from Lemma 2.1 applied to the path empirical measure (1/N) Σ_i δ_{X^i} (see (2.5)). From (4.2), the fact that ∇_x p(t, x, y) = −∇_y p(t, x, y), and integration by parts, we obtain the desired estimate. Existence. This is argued by a minor modification of the standard Picard approximation method, as follows. Define a sequence {(X^{(k)}, h^{(k)})} of random variables as follows. Let X^{(1)}(t) = (ξ^{1,N}_0, · · · , ξ^{N,N}_0) and h^{(1)}(t, x) = h_0(x) for all t. We then define (X^{(k)}, h^{(k)}) for k ≥ 2 recursively. By estimates similar to those used to obtain (4.5), we have, for fixed T > 0 and t ∈ [0, T], a geometric decay of successive differences. From this it follows that for every t ∈ [0, T], h^{(k)}(t, ·) converges uniformly to a continuously differentiable function h(t, ·), and ∇h^{(k)}(t, ·) converges uniformly to ∇h(t, ·). Furthermore, the convergence is uniform in t ∈ [0, T]. This establishes the desired existence of solutions.
Proof of Proposition 2.3. The proof uses classical arguments from [23]. Define the map Φ through equation (4.6), where h_m is as in (2.6). From Lemma 2.1, ∇h_m is a Lipschitz map, and by assumption ∇V is Lipschitz as well; thus the equation in (4.6) has a unique solution and consequently the function Φ is well-defined. Observe that (X, h) is a solution of (1.2) over [0, T] if and only if L(X) ∈ P(C_T) is a fixed point of Φ and h is given by the right hand side of (2.1) with µ_t being the law of X_t. We will show that for all m_1, m_2 ∈ P(C_T) the contraction estimate (4.7) holds for some κ_T ∈ (0, ∞), where D_t is the Wasserstein-1 distance on P(C_t); namely, for m_1, m_2 ∈ P(C_t), D_t(m_1, m_2) is given by the right side of (1.11) with p = 1 and S = C_t. Suppose that for i = 1, 2, Z^i solves (4.6) with m replaced by m_i on the right side, where m_i ∈ P(C_T). Then Z^i has law Φ(m_i) and, for any M ∈ P(C_t × C_t) with marginals m_1 and m_2, one obtains a pathwise estimate in which the last step uses the fact that y → ∇_z g(y − z) is Lipschitz. Combining the above estimates with (4.8) and using the Lipschitz property of ∇V, we obtain a bound valid for any M as above. By Gronwall's Lemma it now follows that the corresponding pathwise inequality holds; taking expectations we obtain (4.7). Now by a standard fixed point argument, there exists a unique m* ∈ P(C_T) such that m* = Φ(m*). Let Z* be the unique solution to (4.6) with m replaced by m*. Then (1.2) has a unique pathwise solution (Z*, h*), where h* is given by the right hand side of (2.1) with µ_t(dy) being the law of Z*_t.

Propagation of chaos
For the first term on the right side of (4.9) we have a direct bound. Next note that for any m ∈ P(R^d) a corresponding identity holds. For the second term on the right side of (4.9) we use a suitable decomposition. From the above observations we obtain a differential inequality, interpreted in the integral sense: dφ_t ≤ ψ_t dt means that φ_b − φ_a ≤ ∫_a^b ψ_s ds for all 0 ≤ a ≤ b. The integrand in the second term on the right of (4.10) can be bounded in absolute value. The proofs of Theorems 3.1 and 3.4 will make use of the above calculations.
Setting ϑ(t) = e^{2λ̄t} f(t), we obtain the differential inequality used below. In the rest of the proof we estimate ϑ(t) using this inequality. For this, heuristically, one can set ζ = √ϑ to obtain a simplification. Since we do not have any control on ζ′(t) on the set {ζ = 0}, we instead consider ζ_ε(t) := √(ϑ(t) + ε²), where ε > 0. Then ϑ′ = 2ζ_ε ζ′_ε and √ϑ = √(ζ_ε² − ε²) ≤ ζ_ε. Hence (4.16) implies (4.17). We will now use a comparison result for ordinary differential equations (ODE). Let k_ε be the solution of the ODE (4.18). Note that k_ε also solves the associated integral equation. It is straightforward to verify that the unique solution of (4.18) converges, as ε → 0, uniformly on compacts to the limiting function, where r_2 < 0 < r_1 are the zeros of the characteristic polynomial θ(r) = r² − (λ̄ − α) r − C̄².
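The root configuration invoked here is easy to check numerically. The sketch below uses illustrative values satisfying C̄² < λ̄α (the Assumption 2.4 regime, with `lam`, `a` and `C2` standing for λ̄, α and C̄²) and confirms the ordering r_2 < 0 < r_1 < λ̄.

```python
import numpy as np

def char_roots(lam, a, C2):
    """Roots of the characteristic polynomial theta(r) = r^2 - (lam - a) r - C2."""
    disc = np.sqrt((lam - a) ** 2 + 4.0 * C2)
    return ((lam - a) + disc) / 2.0, ((lam - a) - disc) / 2.0

lam, a, C2 = 2.0, 1.5, 1.0          # illustrative values with C2 < lam * a
r1, r2 = char_roots(lam, a, C2)
assert r2 < 0.0 < r1 < lam          # exactly the ordering used in the comparison argument
```

Since C2 > 0 forces roots of opposite signs, only the inequality r_1 < λ̄ actually uses C2 < lam * a, matching the computation in the proof.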
Setting ϑ(t) = e^{2λ̄t} f(t) as before, we obtain the corresponding bound. Observe that Assumption 2.4 implies that C̄² < λ̄α, from which it follows that √((λ̄ − α)² + 4C̄²) < λ̄ + α. Recalling that r_1, r_2 are the positive and negative roots of (4.21), we now see that under Assumption 2.4, λ̄ > r_1 > 0 > r_2. Thus the right hand side of (4.24) can be bounded as required, and the proof is complete.
We can now complete the proof of Corollary 3.5.
Proof of Corollary 3.5. The first statement in the corollary is immediate from Theorem 3.4, on noting the bound above. For the second statement, note that since µ_0 ∈ P_q̄(R^d), sup_{t≥0} ∫_{R^d} |x|^{q̄} µ_t(dx) < ∞ for some q̄ > 2 (see Remark 4.4). Hence from Theorem 1.1 of [12] we obtain the corresponding estimate. The result now follows on combining the above two displays and using the triangle inequality.

Concentration bounds
In this section we will first provide exponential concentration bounds that are uniform over compact time intervals. Under the stronger property in Assumption 2.4 we will then show that these bounds can be strengthened to be uniform over the infinite time horizon. We begin with an upper bound for W_1(µ^N_t, µ_t) in terms of (W_1(ν^N_s, µ_s))_{s∈[0,t]}, where ν^N is as introduced in (3.3).

Bounds in terms of empirical measures of independent variables
The following proposition is a generalization of Proposition 5.1 in [4]. Recall the constant λ̄ introduced in Section 4.2.
Proof. From (4.10) and (4.11) we have the estimate above. Instead of taking expectations as in Section 4.2, we now bound the third term on the right hand side of (4.25) using an inequality which follows from the Kantorovich-Rubinstein duality (1.12) and the fact that the Lipschitz norm of the function x → ∇_j P_{t−s} g(X^i_t − x) is bounded by d‖Hess g‖_∞ for each j = 1, · · · , d.
Following the same comparison argument as was used to obtain the bound for √ϑ (ϑ was introduced above (4.16)), we let H_ε(t) = √(G(t) + ε²), where ε > 0, and obtain for a.e. t ≥ 0 the corresponding differential inequality. This time we need to solve an inhomogeneous second order ODE with initial conditions K_ε(0) = ε and K′_ε(0) = 0. On solving this ODE, we obtain, as in (4.20) and (4.24), that K_ε converges uniformly on compacts as ε → 0 to K, defined as follows.
Sending ε → 0, we obtain the stated bound. On the other hand, (4.31) gives the complementary estimate. The desired inequality now follows from the triangle inequality. The following corollary is an immediate consequence of the above proposition.

Corollary 4.2.
For every T ∈ (0, ∞), there exists C_T ∈ (0, ∞) such that the stated bound holds; also, a corresponding inequality holds for all t ≥ 0. Recall from Section 4.2 that under Assumption 2.4, (r_1 − λ̄) < 0. This will be key in obtaining a uniform in time bound from the last inequality in the corollary above.

Moment bounds
Let (X̄_t)_{t≥0} be the nonlinear process solving (1.2) and let µ_t be its law at time t. In Proposition 4.3 below we give bounds on the square-exponential moments of X̄_t under appropriate conditions. The first part of the proposition holds under our standing assumptions along with a suitable integrability condition. For the second part we will in addition make an assumption that is weaker than Assumption 2.4, namely v_* > 0, where v_* was defined in (2.7). Let, for t, θ ≥ 0, S_θ(t) := E[e^{θ|X̄_t|²}]. (ii) Suppose that v_* > 0. Then for any θ ∈ (0, θ_0/4 ∧ v_*/8), we have sup_{t≥0} S_θ(t) < ∞.
Finally we verify the claim. Using the estimates −x·∇V(x) ≤ −v_*|x|² and |x| ≤ η + |x|²/(4η) once again, and choosing η such that b = v_*/2 as before, we obtain the stated bound by an application of Itô's formula, where the next to last inequality follows on noting that the relevant stochastic exponential is a supermartingale, and the last inequality uses the preceding estimate. This completes the proof of the claim, and the result follows. (ii) Suppose that instead of assuming that µ_0 has a finite square-exponential moment, we assume that µ_0 ∈ P_{q_0}(R^d) for some q_0 ≥ 2. Then it follows easily that the corresponding moment bound holds for any fixed T. Furthermore, by applying Itô's formula to |x|^q instead of e^{θ|x|²}, one can check the analogous uniform in time bound. Also, analogous statements as in (i) hold with the square-exponential moment replaced by the q-th moment.

Time-regularity
In this section we give some estimates on the moments of the increments of the nonlinear process X. These estimates are needed in order to appeal to results in [4] in the proofs of our concentration bounds.
We start with moment estimates for |X_t − X_s|. By Lemma 2.1 and our assumption on ∇V we have, where C is as in Lemma 2.1.
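As a rough consistency check on the |t − s|^{p/2} scaling that increment estimates of this type produce, the p-th absolute moments of Brownian increments scale exactly as |t − s|^{p/2}. The Monte Carlo sketch below is a toy illustration for the driving noise alone (the constants here are unrelated to those of the text): the ratio E|B_t − B_s|^p / |t − s|^{p/2} should be independent of the gap and equal to the p-th absolute moment of a standard Gaussian.

```python
import math, random

random.seed(1)

def abs_moment(p, sigma):
    """Exact E|N(0, sigma^2)|^p = sigma^p * 2^(p/2) * Gamma((p+1)/2) / sqrt(pi)."""
    return sigma**p * 2 ** (p / 2) * math.gamma((p + 1) / 2) / math.sqrt(math.pi)

p, n_samples = 4, 200000
expected = abs_moment(p, 1.0)          # = 3 for p = 4

ratios = {}
for gap in (0.1, 0.5, 1.0):            # gap = t - s; B_t - B_s ~ N(0, gap)
    sigma = math.sqrt(gap)
    mc = sum(abs(random.gauss(0.0, sigma)) ** p for _ in range(n_samples)) / n_samples
    ratios[gap] = mc / gap ** (p / 2)  # should be close to `expected` for every gap

print(expected, ratios)
```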
Throughout this section we will assume that S_{θ_0}(0) < ∞ for some θ_0 > 0. Taking powers and using Proposition 4.3, we obtain the following result, for some C_p ∈ (0, ∞); moreover, if Assumption 2.4 is satisfied, then for some C̃_p > 0. Next we give an estimate on the exponential moments of the increments.
Proof. By an application of the Cauchy–Schwarz inequality we see that it suffices to show that for some θ_T, C_T ∈ (0, ∞). Note that, from Lemma 2.1, for every m ∈ P(C([0, ∞) : R^d)), ∇h_m ∈ H_C. Given v ∈ H_C and z ∈ R^d, let Y^{v,z} be the solution of the stochastic differential equation. By a standard conditioning argument it suffices to argue that for some θ_T, C_T ∈ (0, ∞). Fix (z, v) ∈ R^d × H_C and suppress it in the notation (i.e. write Y^{v,z} as Y). Let θ : [0, T] → R be a nonnegative continuously differentiable function and write, for t ∈ [0, T], Z_t := exp(θ(t)|Y_t − z|²).
Using Itô's formula we obtain. Integrating, we obtain. By an argument similar to the one in the proof of Proposition 4.3(i), there is a ς > 0 such that. One of the properties of {θ(t)}_{0≤t≤T} chosen below will be that sup_{0≤t≤T} θ(t) < ς/2. With such a choice of {θ(t)}, {M_t} is a martingale and consequently E(M_t) = 0 for all t ≥ 0.
The rest of the argument is similar to [4, Section 5.1], so we only give a sketch. Choose θ(r) to be the solution of the ODE ηθ(r) + 2θ²(r) + θ′(r) = 0 with θ(0) strictly positive and smaller than ς/2. It is easy to see that the solution is decreasing and strictly positive. Thus B_r is identically zero and 0 < θ(T) ≤ θ(t) ≤ ς/2 for every t ∈ [0, T]. As a consequence, E sup. Using the bound in (4.43), it is now checked exactly as in [4] that, for some C̃_T < ∞. This proves (4.40), and thus the result follows.
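The ODE ηθ(r) + 2θ²(r) + θ′(r) = 0 is of Bernoulli type: substituting u = 1/θ gives u′ = ηu + 2, hence u(r) = (1/θ(0) + 2/η)e^{ηr} − 2/η and θ(r) = 1/u(r), which is manifestly strictly positive and decreasing. The sketch below checks this closed form numerically; the values of η and θ(0) are illustrative placeholders, not constants from the proof.

```python
import math

eta, theta0 = 1.5, 0.2   # illustrative parameters; theta0 plays the role of theta(0)

def theta(r):
    """Closed-form solution of theta' = -eta*theta - 2*theta**2, theta(0) = theta0."""
    return 1.0 / ((1.0 / theta0 + 2.0 / eta) * math.exp(eta * r) - 2.0 / eta)

# Check the ODE residual eta*theta + 2*theta^2 + theta' ~ 0 by central differences,
# and that theta is strictly positive and decreasing on [0, T].
h, T = 1e-6, 3.0
grid = [0.01 * k for k in range(1, int(T / 0.01))]
max_residual = max(
    abs(eta * theta(r) + 2 * theta(r) ** 2 + (theta(r + h) - theta(r - h)) / (2 * h))
    for r in grid
)
values = [theta(r) for r in [0.0] + grid]

print(theta(0.0), theta(T), max_residual)
```

The residual is at machine-precision scale, and the sampled values decrease strictly from θ(0) while staying positive, as claimed in the sketch of the proof.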

Proofs for the concentration bounds
This in turn implies P(sup), which is analogous to equation (75) in [4]. Part (i) of Proposition 4.3 guarantees that we can apply Theorem 3.7 to assert that for any d′ ∈ (d, ∞) and θ′ ∈ (0, θ), there exists a positive integer N_0 such that for all ε > 0 and N ≥ N_0 max(ε^{−(d′+2)}, 1). The corresponding estimate in [4] is replaced by part (ii) of Proposition 4.3, which gives a uniform in time estimate for the square exponential moment of µ_t. We omit the details.
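The exponential-in-N concentration mechanism underlying bounds of this type can be isolated in the simplest possible setting: for the empirical average of N i.i.d. indicators, Hoeffding's inequality gives P(|μ̂_N(f) − μ(f)| ≥ ε) ≤ 2 exp(−2Nε²). The check below is deterministic (exact binomial tails against the Hoeffding bound) and is a toy illustration of the exponential rate only, not of the Wasserstein bound of Theorem 3.7.

```python
import math
from math import comb

def binom_tail(N, p, eps):
    """Exact P(|X/N - p| >= eps) for X ~ Binomial(N, p)."""
    return sum(
        comb(N, k) * p**k * (1.0 - p) ** (N - k)
        for k in range(N + 1)
        if abs(k / N - p) >= eps
    )

p, eps = 0.3, 0.1          # illustrative success probability and deviation level
tails, bounds = {}, {}
for N in (50, 100, 200, 400):
    tails[N] = binom_tail(N, p, eps)
    bounds[N] = 2.0 * math.exp(-2.0 * N * eps * eps)   # Hoeffding bound
    print(N, tails[N], bounds[N])
```

The exact tail probabilities sit below the Hoeffding bound for every N and decay exponentially as the sample size grows, which is the qualitative behavior the concentration estimates quantify.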

Uniform convergence of Euler scheme
In the proofs of Lemma 3.9 and Theorem 3.10, we need to solve difference inequalities which are harder to handle than similar differential inequalities that appeared in the proofs of Theorem 3.4 and Proposition 4.1.
Proof of Lemma 3.9. From integration by parts in (3.7) we see that, where κ_4 = κ_3 + 2χ‖∇h_0‖_∞ and ξ^i_n is measurable with respect to F^{B^i}_n.
We can rewrite (4.71) as which is the discrete analogue of a differential inequality similar to (4.15).
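The role played by such difference inequalities can be seen on the model recursion a_{n+1} ≤ (1 − λh)a_n + Ch², with 0 < λh < 1, which is the discrete counterpart of a′ ≤ −λa + Ch: iterating gives a_n ≤ (1 − λh)^n a_0 + Ch/λ, a bound uniform in n. The constants λ, C, h below are illustrative placeholders, not the constants of Lemma 3.9.

```python
lam, C, h = 0.5, 2.0, 0.01    # illustrative constants with 0 < lam*h < 1
a, a0 = 1.0, 1.0

sup_a = a
for n in range(200000):        # iterate a_{n+1} = (1 - lam*h) a_n + C h^2
    a = (1.0 - lam * h) * a + C * h * h
    sup_a = max(sup_a, a)

# Discrete-Gronwall bound, uniform in n: a_n <= a_0 + C*h/lam.
uniform_bound = a0 + C * h / lam
print(sup_a, a, uniform_bound)
```

The iterates contract geometrically toward the fixed point Ch/λ = 0.04, so the running sup never exceeds the n-independent bound; this is exactly the mechanism that yields uniform in time error estimates for the Euler scheme.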
Proof of Corollary 3.11. The first statement in the corollary is immediate from Corollary 3.5 and Theorem 3.10. For the second statement, we have from the triangle inequality W_2(µ^{N,ε}_n, µ_n) ≤ W_2(µ^{N,ε}_n, µ^N_n) + W_2(µ^N_n, µ_n).
Also, from Theorem 3.10. The result now follows on combining the above two displays with Corollary 3.5.
A Proof of (4.81)

To see the first inequality in the claim (4.81), note that. The inequality is now a consequence of the observation that δ_ε^{1/2} → 1 as ε → 0 and, where the last inequality follows from Assumption 2.4. Hence the first estimate holds.