Self-interacting diffusions IV: Rate of convergence

Self-interacting diffusions are processes living on a compact Riemannian manifold, defined by a stochastic differential equation with a drift term depending on the past empirical measure of the process. The asymptotic behaviour of this measure is governed by a deterministic dynamical system, and under certain conditions the measure converges almost surely towards a deterministic limit (see Benaïm, Ledoux and Raimond (2002) and Benaïm and Raimond (2005)). We are interested here in the rate of this convergence. A central limit theorem is proved. In particular, it shows that the stronger the repelling interaction, the faster the convergence.


Self-interacting diffusions
Let M be a smooth compact Riemannian manifold and V : M × M → R a sufficiently smooth mapping. For every finite Borel measure µ, let Vµ : M → R be the smooth function defined by Vµ(x) = ∫_M V(x, y) µ(dy). Let (e_α) be a finite family of vector fields on M such that ∑_α e_α(e_α f)(x) = ∆f(x), where ∆ is the Laplace operator on M and e_α(f) stands for the Lie derivative of f along e_α. Let (B^α) be a family of independent Brownian motions.
A self-interacting diffusion on M associated to V can be defined as the solution to the stochastic differential equation (SDE)

dX_t = ∑_α e_α(X_t) ∘ dB^α_t − ∇(Vµ_t)(X_t) dt,

where µ_t = (1/t) ∫_0^t δ_{X_s} ds is the empirical occupation measure of (X_t). In the absence of drift (i.e. V = 0), (X_t) is just a Brownian motion on M, but in general it defines a non-Markovian process whose behavior at time t depends on its past trajectory through µ_t. This type of process was introduced in Benaïm, Ledoux and Raimond (2002) ([3]) and further analyzed in a series of papers by Benaïm and Raimond (2003, 2005, 2007) ([4], [5] and [6]). We refer the reader to these papers for more details, and especially to [3] for a detailed construction of the process and its elementary properties. For a general overview of processes with reinforcement we refer the reader to the survey paper by Pemantle (2007) ([16]).
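As an illustrative aside (not part of the paper), the SDE can be simulated numerically. The sketch below makes the assumed choices M = S¹ and V(x, y) = a cos(x − y), with a plain Euler–Maruyama scheme; all names and parameters here are hypothetical. Since cos(x − y) = cos x cos y + sin x sin y, the drift depends on the past only through two running Fourier averages of the occupation measure, so no trajectory storage is needed.

```python
import numpy as np

def simulate_sid(T=50.0, dt=0.01, a=1.0, seed=0):
    """Euler-Maruyama sketch of a self-interacting diffusion on the circle
    S^1 with the assumed interaction V(x, y) = a*cos(x - y); for a > 0 the
    drift -grad(V mu_t) pushes X away from its past occupation measure."""
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    x = 0.0
    c_sum, s_sum = 0.0, 0.0  # running integrals of cos(X_s), sin(X_s)
    for k in range(1, n + 1):
        t = k * dt
        c_avg, s_avg = c_sum / max(t, dt), s_sum / max(t, dt)
        # V mu_t(x) = a*(c_avg*cos x + s_avg*sin x), so
        # (d/dx) V mu_t(x) = a*(-c_avg*sin x + s_avg*cos x)
        drift = -a * (-np.sin(x) * c_avg + np.cos(x) * s_avg)
        x = x + drift * dt + np.sqrt(dt) * rng.standard_normal()
        c_sum += np.cos(x) * dt
        s_sum += np.sin(x) * dt
    # first Fourier coefficients of mu_T; both should be small when
    # mu_t approaches the uniform measure on S^1
    return c_sum / T, s_sum / T
```

For a > 0 the drift pushes X_t away from regions its occupation measure already charges, so the returned Fourier coefficients of µ_T should stay small, consistent with convergence of µ_t towards the uniform measure.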

Notation and Background
We let 𝒫(M) denote the space of Borel probability measures on M. Let µ ∈ 𝒫(M) and f : M → R a nonnegative or µ-integrable Borel function. We write µf for ∫ f dµ, and fµ for the measure defined by fµ(A) = ∫_A f dµ. We let L²(µ) denote the space of functions f for which µ|f|² < ∞, equipped with the inner product ⟨f, g⟩_µ = µ(fg) and the norm ‖f‖_µ = (µf²)^{1/2}. We simply write L² for L²(λ).
Of fundamental importance in the analysis of the asymptotics of (µ_t) is the mapping Π : 𝒫(M) → 𝒫(M) defined by Π(µ) = ξ(Vµ)λ, where ξ : C(M) → C(M) is the function defined by

ξ(f)(x) = e^{−f(x)} / ∫_M e^{−f(y)} λ(dy).   (2)

In [3], it is shown that the asymptotics of µ_t can be precisely related to the long-term behavior of a certain semiflow on 𝒫(M) induced by the ordinary differential equation (ODE) on 𝒫(M):

µ̇ = −µ + Π(µ).   (3)

Depending on the nature of V, the dynamics of (3) can be either convergent or nonconvergent, leading to similar behaviors for {µ_t} (see [3]). When V is symmetric, (3) happens to be a quasigradient and the following convergence result holds. In particular, if Fix(Π) = {µ ∈ 𝒫(M) : Π(µ) = µ} is finite, then (µ_t) converges almost surely toward a fixed point of Π. This holds for a generic function V (see [5]). Sufficient conditions ensuring that Fix(Π) is a singleton are as follows: Theorem 1.2 ([5], [6]). Assume that V is symmetric and that one of the two following conditions holds, where K is some positive constant. Here Ric_x stands for the Ricci tensor at x and Hess_{x,y} is the Hessian of V at (x, y).
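To build intuition for the ODE (3), it can be integrated numerically after a crude discretization. The sketch below is an illustration under assumptions not made in the paper: M is replaced by n equally spaced points on the circle, λ by the uniform probability vector, and Π(µ) by the probability vector proportional to e^{−Vµ}λ; the kernel V and all parameters are user-supplied and hypothetical.

```python
import numpy as np

def measure_ode_flow(V, n=50, steps=2000, dt=0.05, mu0=None):
    """Euler scheme for mu' = Pi(mu) - mu on a finite state space, where
    Pi(mu) is proportional to e^{-V mu} * lambda (uniform reference)."""
    xs = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    lam = np.full(n, 1.0 / n)
    K = V(xs[:, None], xs[None, :])   # kernel matrix V(x_i, x_j)
    mu = lam.copy() if mu0 is None else np.asarray(mu0, float) / np.sum(mu0)
    for _ in range(steps):
        w = np.exp(-K @ mu) * lam     # unnormalized density of Pi(mu)
        mu = mu + dt * (w / w.sum() - mu)   # Euler step for mu' = Pi(mu) - mu
    return xs, mu
```

For the symmetric kernel V(x, y) = cos(x − y), the uniform measure is a fixed point of Π, and a perturbed initial measure flows back to it, in line with the convergence discussion above.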
As observed in [6], condition (i) in Theorem 1.2 seems well suited to describing self-repelling diffusions. On the other hand, it is not clearly related to the geometry of M. Condition (ii) has a more geometrical flavor and is robust to smooth perturbations (of M and V). It can be seen as a Bakry–Émery type condition for self-interacting diffusions.
In [5], it is also proved that every stable (for the ODE (3)) fixed point of Π has a positive probability of being a limit point of (µ_t), and that an unstable fixed point cannot be a limit point of (µ_t).
In this paper we study the rate of this convergence. Let ∆_t = e^{t/2}(µ_{e^t} − µ*). It will be shown that, under some conditions to be specified later, for all g = (g_1, . . . , g_n) ∈ C(M)^n the process (∆_s g_1, . . . , ∆_s g_n, V∆_s)_{s≥t} converges in law, as t → ∞, toward a certain stationary Ornstein–Uhlenbeck process (Z^g, Z) on R^n × C(M). This process is defined in Section 2. The main result is stated in Section 3 and some examples are developed; it is in particular observed that a strong repelling interaction gives a faster convergence. Section 4 contains the proofs.
In the following K (respectively C) denotes a positive constant (respectively a positive random constant). These constants may change from line to line.

The Ornstein–Uhlenbeck process (Z^g, Z).
For a more precise definition of Ornstein–Uhlenbeck processes on C(M) and their basic properties, we refer the reader to the appendix (Section 5). Throughout this section we let µ ∈ 𝒫(M).

The operator G_µ
Let h ∈ C(M) and let G_{µ,h} : R × C(M) → R be the linear operator defined by where Cov_µ is the covariance on L²(µ), that is, the bilinear form acting on L² × L² defined by Cov_µ(f, g) = µ(fg) − (µf)(µg). We define the linear operator G_µ : It is easily seen that In particular, G_µ is a bounded operator. Let {e^{−tG_µ}} denote the semigroup acting on C(M) with generator −G_µ. From now on we will assume the following: This limit exists by subadditivity. Then This implies that ‖e^{−tG_µ} f‖_λ ≤ e^{−κt} ‖f‖_λ. Denote by g_t the solution of the differential equation Now, for all t > 1 and f ∈ C(M), This implies that ‖e^{−tG_µ}‖ ≤ K e^{−κt}, which proves the lemma. QED The adjoint of G_µ is the operator on the space of measures defined by the relation for all measures m and f ∈ C(M). It is not hard to verify that

The generator A_µ and its inverse Q_µ
Let H² be the Sobolev space of real-valued functions on M, associated with the norm ‖f‖_{H²}. Since Π(µ) and λ are equivalent measures with continuous Radon–Nikodym derivative, L²(Π(µ)) = L²(λ). We denote by K_µ the projection operator, acting on L²(Π(µ)), defined by We denote by A_µ the operator acting on H² defined by Note that for f and h in H² (denoting ⟨·, ·⟩ the Riemannian inner product on M),
It is shown in [3] that Q_µ f is C¹ and that there exists a constant K such that for all f ∈ C(M) and µ ∈ 𝒫(M), Finally, note that for f and h in L²,

The covariance C^g_µ
We let C_µ denote the bilinear continuous form This form is symmetric (see its expression given by (9)). Note also that, for some constant K depending on µ, We let C^g_µ denote the mapping defined as follows: let M̃ = {1, . . . , n} ∪ M and let C^g_µ : M̃ × M̃ → R be the function defined by Then C_µ and C^g_µ are covariance functions (as defined in subsection 5.2). In the following, when n = 0, M̃ = M and C^g_µ = C_µ. When n ≥ 1, C(M̃) can be identified with R^n × C(M). Proof : Since the arguments are the same for all n ≥ 1, we only treat the case n = 0. Let where the last inequality follows from (8). Then d_{C_µ}(x, y) ≤ K d(x, y). Thus d_{C_µ} satisfies (30) and we can apply Theorem 5.4 of the appendix (Section 5). QED

The process (Z^g, Z)
Let G^g_µ : R^n × C(M) → R^n × C(M) be the operator defined by where I_n is the identity matrix on R^n and A^g_µ : Since G^g_µ is a bounded operator, for any law ν on R^n × C(M) there exists Z̃ = (Z^g, Z), an Ornstein–Uhlenbeck process of covariance C^g_µ and drift −G^g_µ, with initial distribution ν (using Theorem 5.6). More precisely, Z̃ is the unique solution of where Z̃_0 is an R^n × C(M)-valued random variable of law ν and W̃ = (W^g, W) is an R^n × C(M)-valued Brownian motion of covariance C^g_µ independent of Z̃. In particular, Z is an Ornstein–Uhlenbeck process of covariance C_µ and drift −G_µ. Denote by P^{g,µ}_t the semigroup associated to Z̃. Then Proposition 2.4. Assume hypothesis 2.1. Then there exists π_{g,µ}, the law of a centered Gaussian variable in R^n × C(M), with variance Var(π_{g,µ}) where for (u, m) ∈ R^n × 𝒫(M), and where m_t is defined by Moreover, (i) π_{g,µ} is the unique invariant probability measure of P^{g,µ}_t.
(ii) For all bounded continuous functions ϕ on Proof : This is a consequence of Theorem 5.7. To apply it, one can remark that G^g_µ is an operator like the ones given in Example 5.11.
The variance Var(π_{g,µ}) is given by Note that (13) is equivalent to for all f ∈ C(M), with m_0 = m, from which we deduce which implies the formula for m_t given by (12). QED An Ornstein–Uhlenbeck process of covariance C^g_µ and drift −G^g_µ will be called stationary when its initial distribution is π_{g,µ}.
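A scalar sanity check of the stationary variance can be run numerically (a hedged illustration, not the paper's infinite-dimensional setting): for dZ = −gZ dt + dW with E[dW²] = c dt, the invariant law is centered Gaussian with variance ∫_0^∞ e^{−2gs} c ds = c/(2g), the one-dimensional instance of the formula for Var(π_{g,µ}); the constants g and c below are arbitrary choices.

```python
import numpy as np

def ou_stationary_variance_check(g=1.0, c=2.0, dt=0.05, n=200_000, seed=1):
    """Simulate a scalar OU process with its exact one-step transition and
    compare the empirical long-run variance with c / (2 g)."""
    rng = np.random.default_rng(seed)
    a = np.exp(-g * dt)                      # e^{-g dt}
    s = np.sqrt(c * (1 - a * a) / (2 * g))   # exact one-step noise std
    z = np.zeros(n)
    for k in range(1, n):
        z[k] = a * z[k - 1] + s * rng.standard_normal()
    burn = n // 10                            # discard transient
    return z[burn:].var(), c / (2 * g)
```

The empirical variance should match the theoretical value c/(2g) up to Monte Carlo error, mirroring (in one dimension) the uniqueness of the invariant measure π_{g,µ} stated in Proposition 2.4.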

A central limit theorem for µ_t
We state here the main results of this article. We assume that µ* ∈ Fix(Π) satisfies hypotheses 1.3 and 2.1. Set ∆_t = e^{t/2}(µ_{e^t} − µ*), D_t = V∆_t and D_{t+·} = (D_{t+s})_{s≥0}. Then Define C : with (h_t is defined by the same formula, with h in place of f)

Corollary 3.3. ∆ t g converges in law towards a centered Gaussian variable
Proof : This follows from Theorem 3.2 and the computation of Var(π_{g,µ})(u, 0). QED

The case µ* = λ and V symmetric.
Suppose here that µ* = λ and that V is symmetric. We assume, without loss of generality (since λ ∈ Fix(Π) forces Vλ to be constant), that Vλ = 0. Since V, viewed as an operator on L²(λ), is compact and symmetric, there exist an orthonormal basis (e_α)_{α≥0} of L²(λ) and a sequence of reals (λ_α)_{α≥0} such that e_0 is a constant function and V e_α = λ_α e_α. Assume that for all α, 1/2 + λ_α > 0. Then hypothesis 2.1 is satisfied, and the convergence of µ_t towards λ holds with positive probability (see [6]).
Let f ∈ C(M) and let f_t be defined by (15). This, with Corollary 3.3, proves Theorem 3.6. Assume hypothesis 1.3 and that 1/2 + λ_α > 0 for all α. Then for all g ∈ C(M)^n, ∆_t g converges in law toward a centered Gaussian variable. In particular, when all the λ_α are positive, which corresponds to what is named a self-repelling interaction in [6], the rate of convergence of µ_t towards λ is higher than when there is no interaction, and the stronger the interaction (that is, the larger the λ_α's), the faster the convergence.
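The qualitative statement can be made concrete in a small helper (a hypothetical normalization: the constant c below stands in for the covariance term along e_α, whose exact value is not reproduced here). Along the eigenfunction e_α, ∆_t e_α behaves like a one-dimensional OU process with drift −(1/2 + λ_α), so its limiting Gaussian variance scales like c/(2(1/2 + λ_α)) = c/(1 + 2λ_α), which decreases as λ_α grows.

```python
import numpy as np

def limiting_variance(lam, c=1.0):
    """Limiting variance along an eigenfunction with eigenvalue lam,
    up to the unspecified covariance constant c: c / (1 + 2 * lam).
    Requires the standing hypothesis 1/2 + lam > 0."""
    lam = np.asarray(lam, dtype=float)
    if np.any(1.0 + 2.0 * lam <= 0):
        raise ValueError("hypothesis 1/2 + lambda_alpha > 0 violated")
    return c / (1.0 + 2.0 * lam)
```

For λ_α = 0 (no interaction) the variance is c; for larger λ_α it is strictly smaller, illustrating why a stronger self-repelling interaction yields faster convergence.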

Proof of the main results
We assume that hypothesis 1.3 holds and that µ* satisfies hypothesis 2.1. For convenience, we take the constant κ of hypothesis 2.1 to be less than 1/2. Throughout this section, we fix g = (g_1, ..., g_n) ∈ C(M)^n.

A lemma satisfied by Q µ
We consider the space of continuous vector fields on M, and we equip 𝒫(M) and this space of vector fields respectively with the topology of weak convergence and the topology of uniform convergence. Proof : Let µ and ν be in 𝒫(M). Using the fact that (x, y) → ∇V_x(y) is uniformly continuous, the right-hand term of (17) converges towards 0 when d(µ, ν) converges towards 0, d being a distance compatible with the weak convergence. QED

The process ∆
is a continuous process taking its values in C(M), and D_t = e^{t/2}(h_{e^t} − h*).
To simplify the notation, we set Thus

First estimates
We recall the following estimate from [3]: there exists a constant K such that for all f ∈ C(M) and t > 0, This estimate, combined with (8), implies that for f and h in C(M), and that Lemma 4.2. There exists a constant K, depending on ‖V‖_∞, such that for all t ≥ 1 and all f ∈ C(M), which implies that ((∆^1 + ∆^5)_{t+s})_{s≥0} and ((D^1 + D^5)_{t+s})_{s≥0} both converge towards 0 (respectively in (M) and in C(R_+ × M)).
We also have Proof : The first estimate follows from The second estimate follows from the fact that The last estimate follows easily after having remarked that This proves the lemma. QED

The processes ∆̄ and D̄
Then where for all f ∈ C(M ), N f is a martingale. Moreover, for f and h in C(M ), 〈∇Q e s f (X e s ), ∇Q e s h(X e s )〉ds.

Estimation of
(ii) a.s. there exists C with E[C] < ∞ such that for all t ≥ 0, We have ⟨N⟩_t ≤ Kt for some constant K. Then Note that there exists a constant K such that and that (see hypothesis 2.1) We now prove (ii). Fix α > 1. Then there exists a constant K such that Then the Burkholder–Davis–Gundy inequality (BDG inequality in the following) implies that which is finite. This implies the lemma by taking α = 2. QED

Estimation of ‖D_t‖_λ
Note that for all f ∈ C(M), |ε Thus This implies (using lemma 2.2 and the fact that 0 < κ < 1/2) This lemma, together with lemma 4.4 (ii), implies the following. Proof : First note that ‖D_t‖_λ ≤ ‖D̄_t‖_λ + K(1 + t)e^{−t/2}.
Using the expression of D̄_t given by (20), we get Proof : Lemmas 4.6 and 4.7 imply that Since hypothesis 1.3 implies that lim_{s→∞} e^{−s/2} ‖D_s‖_λ = 0, a.s. for all ε > 0 there exists C_ε such that ‖D_t‖_λ ≤ C_ε e^{εt}. Taking ε < 1/4, we get This proves the lemma. QED

Lemma 4.9. a.s. there exists C such that for all f ∈ C(M )
, by lemma 4.8. QED

Estimation of
Proof : Note that (i) is implied by (ii). We prove (ii). We have So to prove the lemma, using (21), it suffices to show that Using hypothesis 2.1 and the definition of G^g_{µ*}, we have that for all positive t, e This implies e Thus the term (24) is dominated by from which we prove (24) as in the previous lemma. QED

Tightness results
We refer the reader to section 5.1 in the appendix (section 5), where tightness criteria for families of C(M )-valued random variables are given. They will be used in this section.

Tightness of
In this section we prove the following lemma, which in particular implies the tightness of (D_t)_{t≥0} and of (D̄_t)_{t≥0}.

Thus, using the expression of
Since µ * is absolutely continuous with respect to λ, we have that (with Lip(A t ) the Lipschitz constant of A t , see (36)).
Therefore (using lemma 4.4 (i) for α = 2), sup_t E[‖A_t‖²_∞] < ∞. To prove this tightness result, we first prove that for all x, It is now an exercise to show that x t ≤ K and so that E[(Z x, y y). Using proposition 5.2, this completes the proof of the tightness of (L^{−1}_{µ*}(M)_t)_t. QED Kunita (1990)), with the estimates given in the proof of this lemma, implies that

Tightness of ((L
Let ∆g be defined by the relation . Then Thus, ∫_0^t e^{s/2} A_s g ds. Using this expression, it is easy to prove that (∆̃_t g)_{t≥0} is bounded in L²(P). This implies, using also lemma 4.11

Convergence in law of
In this section, we denote by E_t the conditional expectation with respect to ℱ_{e^t}. We also set Q = Q_{µ*} and C = C_{µ*}.

Preliminary lemmas.
For

Lemma 4.14. For all f and h in C(
Proof : For z ∈ M and u > 0, set We have Integrating by parts, we get that Let f_1, . . . , f_n be in C(M). Let (t_k) be an increasing sequence converging to ∞ such that the conditional law of M_{n,k} = (N_{f_1,t_k}, . . . , N_{f_n,t_k}) given ℱ_{e^{t_k}} converges towards the law of an R^n-valued process W_n = (W_1, . . . , W_n).

Lemma 4.15. W n is a centered Gaussian process such that for all i and j,
Proof : We first prove that W_n is a martingale. For all k, M_{n,k} is a martingale. For all u ≤ v, the BDG inequality implies that (M_{n,k}(v) − M_{n,k}(u))_k is bounded in L².
Let l ≥ 1, ϕ ∈ C(R^l), 0 ≤ s_1 ≤ · · · ≤ s_l ≤ u and (i_1, . . . , i_l) ∈ {1, . . . , n}^l. Then for all k and i ∈ {1, . . . , n}, the martingale property implies that where Z_k is of the form Using the convergence of the conditional law of M_{n,k} given ℱ_{e^{t_k}} towards the law of W_n, and since (M This implies that W_n is a martingale. We now prove that for (i, j) ∈ {1, . . . , n}² (with C = C_{µ*}), where Z_k is of the form (25). Using the convergence in law and the fact that ((M_{n,k}(v) − M_{n,k}(u))²)_k is bounded in L² (still using the BDG inequality), we prove that, as k → ∞, shows that the first term converges towards 0. The convergence of the conditional law of M_{n,k} with respect to ℱ_{e^{t_k}} towards the law of W_n shows that the second term converges towards 0. Thus We conclude using Lévy's theorem. QED

Convergence in law of M_{t+·} − M_t
In this section, we denote by 𝓛_t the conditional law of M_{t+·} − M_t given ℱ_{e^t}. Then 𝓛_t is a probability measure on C(R_+ × M). Proof : In the following, we will denote M_{t+·} − M_t by M^t. We first prove that Proof : For all x ∈ M, t and u in R_+, This implies that for all u ∈ R_+ and x ∈ M, (M^t_u(x))_{t≥0} is tight. Let α > 0. We fix T > 0. Then for (u, x) and (v, y) in [0, T] × M, using the BDG inequality, where K_α is a positive constant depending only on α, ‖V‖_∞ and Lip(V), the Lipschitz constant of V.
We now let D_T be the distance on [0, T] × M defined by Let W^g = (W^g_1, . . . , W^g_n), and let (W^g, W) denote the process (W^g_t, (W_t(x))_{x∈M})_{t≥0}. Proposition 4.19. As t goes to ∞, 𝓛^g_t converges weakly towards the law of (W^g, W).
Proof : We first prove that {𝓛^g_t : t ≥ 0} is tight. This is a straightforward consequence of the tightness of {𝓛_t} and of the fact that for all α > 0, there exists K_α such that for all nonnegative u and v, Let (t_k) be an increasing sequence converging to ∞ and (Ñ^g, M̃) an R^n × C(M)-valued random process (or a C(R_+ × (M ∪ {1, . . . , n}))-valued random variable) such that 𝓛^g_{t_k} converges towards the law of (Ñ^g, M̃). Then lemmas 4.14 and 4.15 imply that (Ñ^g, M̃) has the same law as (W^g, W). Since {𝓛^g_t} is tight, 𝓛^g_t converges towards the law of (W^g, W). QED

Convergence in law of
We have Since (using lemma 4.9) is an Ornstein–Uhlenbeck process of covariance C_{µ*} and drift −G_{µ*} started from 0, we have Theorem 4.20. The conditional law of (D_{t+s} − e^{−sG_{µ*}} D_t)_{s≥0} given ℱ_{e^t} converges weakly towards an Ornstein–Uhlenbeck process of covariance C_{µ*} and drift −G_{µ*} started from 0.

Convergence in law of D_{t+·}
We can now prove Theorem 3.1. We denote here by P_t the semigroup of an Ornstein–Uhlenbeck process of covariance C_{µ*} and drift −G_{µ*}, and by π its invariant probability measure.
Since (D_t)_{t≥0} is tight, there exist ν ∈ 𝒫(C(M)) and an increasing sequence (t_n) converging towards ∞ such that D_{t_n} converges in law towards ν. Then D_{t_n+·} converges in law towards (L^{−1}_{µ*}(W)_s + e^{−sG_{µ*}} Z_0)_{s≥0}, with Z_0 independent of W and distributed as ν. This proves that D_{t_n+·} converges in law towards an Ornstein–Uhlenbeck process of covariance C_{µ*} and drift −G_{µ*}.
We now fix t > 0. Let (s_n) be a subsequence of (t_n) such that D_{s_n−t+·} converges in law. Then D_{s_n−t} converges towards a law we denote by ν_t, and D_{s_n−t+·} converges in law towards an Ornstein–Uhlenbeck process of covariance C_{µ*} and drift −G_{µ*}. Since D_{s_n} = D_{s_n−t+t}, D_{s_n} converges in law towards ν_t P_t. On the other hand, D_{s_n} converges in law towards ν. Thus ν_t P_t = ν.
Let ϕ be a bounded Lipschitz function on C(M). Then where the second term converges towards 0 (using proposition 2.4 (ii) or theorem 5.7 (ii)) and the first term is dominated by (using lemma 5.8) Using the estimates (19), the proof of lemma 4.10 and remark 4.12, it is easy to check that Taking the limit in (28), we get νϕ = πϕ for all bounded Lipschitz functions ϕ on C(M). This implies ν = π, which proves the theorem. QED

Convergence in law of D^g
The norm of the second term of the right-hand side (using the proof of lemma 4.10) is dominated by The next proposition gives a useful criterion for a class of random variables to be tight. It follows directly from [15] (Corollary 11.7, p. 307, and the remark following Theorem 11.2). A function ψ : R_+ → R_+ is a Young function if it is convex, increasing and ψ(0) = 0. If Z is a real-valued random variable, we let For ε > 0, we denote by N(M, d; ε) the covering number of M by balls of radius less than ε (i.e. the minimal number of balls of radius less than ε that cover M), and by D the diameter of M.

Proposition 5.1. Let (F t ) t∈I be a family of C(M )-valued random variables and ψ a Young function.
Assume that (i) there exists x ∈ M such that (F_t(x))_{t∈I} is tight; Then (F_t)_{t∈I} is tight.

Proposition 5.2. Suppose M is a compact finite dimensional manifold of dimension r, d is the Riemannian distance, and
for some α > r. Then conditions (ii) and (iii) of Proposition 5.1 hold true.
Proof : One has N(M, d; ε) of order ε^{−r}; and for ψ(x) = x^α, ‖·‖_ψ is the L^α norm. Hence the result. QED
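For concreteness, here is a minimal instance of the covering-number estimate (an illustration under the assumption M = S¹ with arc-length distance, so r = 1; not part of the paper).

```python
import math

def covering_number_circle(eps):
    """Covering number of the unit circle (circumference 2*pi) by balls of
    radius eps: a ball is an arc of length 2*eps, so
    N(S^1, d; eps) = ceil(2*pi / (2*eps)) = ceil(pi / eps),
    which is of order eps^{-r} with r = 1, as in Proposition 5.2."""
    return math.ceil(math.pi / eps)
```

As ε shrinks by a factor of 10, the covering number grows by the same factor, matching the stated order ε^{−r} for a one-dimensional manifold.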

Brownian motions on C(M ).
Let C : M × M → R be a covariance function, that is, a continuous symmetric function such that ∑_{i,j} a_i a_j C(x_i, x_j) ≥ 0 for every finite sequence (a_i, x_i) with a_i ∈ R and x_i ∈ M. A Brownian motion on C(M) with covariance C is a continuous C(M)-valued stochastic process W = {W_t}_{t≥0} such that W_0 = 0 and for every finite subset S ⊂ R_+ × M, {W_t(x)}_{(t,x)∈S} is a centered Gaussian random vector with For d a pseudo-distance on M and for ε > 0, let Then N(M, d; ω_C(ε)) ≥ N(M, d; ε). We will consider the following hypothesis, which d may or may not satisfy: Let d_C be the pseudo-distance on M defined by When d = d_C, the function ω defined by (29) will be denoted ω_C. Proof : By Mercer's theorem (see e.g. [11]) there exists a countable family of functions Ψ_i ∈ C(M), i ∈ N, such that C(x, y) = ∑_i Ψ_i(x)Ψ_i(y), the convergence being uniform. Let B^i, i ∈ N, be a family of independent standard Brownian motions. Set W^n_t(x) = ∑_{i≤n} B^i_t Ψ_i(x), n ≥ 0. Then, for each (t, x) ∈ R_+ × M, the sequence (W^n_t(x))_{n≥1} is a martingale. It is furthermore bounded in L². Hence by Doob's convergence theorem one may define W_t(x) = ∑_{i≥0} B^i_t Ψ_i(x). Let now S ⊂ R_+ × M be a countable dense set. It is easily checked that the family (W_t(x))_{(t,x)∈S} is a centered Gaussian family with covariance given by This latter bound, combined with classical results on Gaussian processes (see e.g. Theorem 11.17 in [15]), implies that (t, x) → W_t(x) admits a version uniformly continuous over S_T = {(t, x) ∈ S : t ≤ T}. By density it can be extended to a continuous (in (t, x)) process W = (W_t(x))_{(t,x)∈R_+×M}. QED An Ornstein–Uhlenbeck process with drift A, covariance C and initial condition F_0 = f ∈ C(M) is defined to be a continuous C(M)-valued stochastic process such that
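The Mercer construction in this proof can be carried out explicitly in a toy case (an illustration under the assumption M = S¹ and C(x, y) = cos(x − y), for which the expansion is finite: Ψ_1 = cos and Ψ_2 = sin, since cos(x − y) = cos x cos y + sin x sin y).

```python
import numpy as np

def brownian_on_CM(ts, xs, n_paths=20_000, seed=2):
    """Sample paths of W_t(x) = B^1_t cos(x) + B^2_t sin(x), a C(M)-valued
    Brownian motion with covariance E[W_t(x) W_s(y)] = min(t, s) cos(x - y).
    `ts` are sorted observation times, `xs` points on the circle."""
    rng = np.random.default_rng(seed)
    ts = np.asarray(ts, float)
    dts = np.diff(np.concatenate([[0.0], ts]))          # time increments
    inc = rng.standard_normal((n_paths, 2, len(ts))) * np.sqrt(dts)
    B = np.cumsum(inc, axis=2)                          # (paths, 2, times)
    xs = np.asarray(xs, float)
    basis = np.stack([np.cos(xs), np.sin(xs)])          # Psi_i(x), (2, points)
    return np.einsum('pit,ix->ptx', B, basis)           # W[path, time, point]
```

Monte Carlo averages of the sampled paths recover the claimed covariance structure, e.g. E[W_1(0)²] = 1·C(0, 0) = 1 and E[W_1(0) W_2(π/2)] = 1·C(0, π/2) = 0.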

Ornstein–Uhlenbeck processes
We let (e tA ) t∈R denote the linear flow induced by A. For each t, e tA is a bounded operator on C(M ).
Proof : Observe that L_A(f) = 0 implies that f_t = e^{tA} f_0. Hence L_A restricted to C_0(R_+ × M) is injective. Let g ∈ C_0(R_+ × M) and let f_t be given by the right-hand side of (33). Then It is easily seen that h is differentiable and that (d/dt) h_t = 0. This proves that h_t = h_0 = 0. QED This lemma implies that for all f ∈ C(M), Note that L^{−1}_A(W)_t is Gaussian and that its variance Var F_t(µ) := E[⟨µ, F_t⟩²] (with µ a measure on M) is given by ∫_0^t ⟨µ, e^{sA} C e^{sA*} µ⟩ ds.
We refer to [10] for the calculation of Var F_t. Note that the results given in Theorem 5.6 are not included in [10].

Asymptotic Behaviour
Let λ(A) = lim_{t→∞} (1/t) log ‖e^{tA}‖. Denote by P_t the semigroup associated to an Ornstein–Uhlenbeck process of covariance C and drift A. Then for all bounded measurable ϕ : C(M) → R and f ∈ C(M), where F_t is the solution to (31) with F_0 = f. Proof : The fact that λ(A) < 0 implies that lim_{t→∞} Var F_t(µ) = V(µ) < ∞. Let ν_t denote the law of F_t, where F_t is the solution to (31) with F_0 = f. Since F_t is Gaussian, every limit point of {ν_t} (for the weak* topology) is the law of a C(M)-valued Gaussian variable with variance V. The proof then reduces to showing that (ν_t) is relatively compact, or equivalently that {F_t} is tight. We use Proposition 5.1. The first condition is clearly satisfied. Let ψ(x) = e^{x²} − 1. It is easily verified that for any real-valued Gaussian random variable Z with variance σ², ‖Z‖_ψ = σ√(8/3). Hence ‖F_t(x) − F_t(y)‖_ψ ≤ 2 d_V(x, y), so that condition (ii) holds with d_V. Denoting ω (defined by (29) To conclude this section, we give a set of simple sufficient conditions ensuring that d_V satisfies (30).
For f ∈ C(M) we let A map f is said to be Lipschitz provided Lip(f) < ∞. (ii) C is Lipschitz; (iii) there exists K > 0 such that Lip(Af) ≤ K(Lip(f) + ‖f‖_∞); (iv) λ(A) < 0.
Then d C and d V satisfy (30).
Note that (i) holds when M is a finite dimensional manifold. We first prove