Speed of convergence to equilibrium in Wasserstein metrics for Kac-like kinetic equations

This work deals with a class of one-dimensional measure-valued kinetic equations which extend the Kac caricature. It is known that if the initial datum belongs to the domain of normal attraction of an α-stable law, the solution of the equation converges weakly to a suitable scale mixture of centered α-stable laws. In this paper we present explicit exponential rates for the convergence to equilibrium in Kantorovich-Wasserstein distances of order p > α, under the natural assumption that the distance between the initial datum and the limit distribution is finite. For α = 2 this assumption reduces to the finiteness of the absolute moment of order p of the initial datum. When α < 2, by contrast, the situation is more delicate, since both the limit distribution and the initial datum have infinite absolute moments of every order p > α. For this case we provide sufficient conditions for the finiteness of the Kantorovich-Wasserstein distance.


1. Introduction
This paper is concerned with the study of the speed of convergence to equilibrium − with respect to Wasserstein distances − of the solution of the one-dimensional kinetic equation

(1) ∂_t µ_t + µ_t = Q^+(µ_t, µ_t).

The solution µ_t = µ_t(·) is a time-dependent probability measure on B(R), the Borel σ-field of R. Following [3,10] we assume that Q^+ is a suitable smoothing transformation. More precisely, the probability measure Q^+(µ, µ) is characterized by

(2) ∫_R g(v) Q^+(µ, µ)(dv) = E[ ∫_R ∫_R g(Lv + Rw) µ(dv) µ(dw) ]

for all bounded and continuous test functions g ∈ C_b(R), where (L, R) is a random vector of R² defined on a probability space (Ω, F, P) and E denotes the expectation with respect to P. For suitable choices of (L, R), equation (1)-(2) reduces to well-known simplified models for a spatially homogeneous gas, in which particles move only in one spatial direction. The basic assumption is that particles change their velocities only because of binary collisions. When two particles collide, their velocities change from v and w, respectively, to

v' = L_1 v + R_1 w,   w' = L_2 v + R_2 w,

where (L_1, R_1) and (L_2, R_2) are two identically distributed random vectors with the same law as (L, R). A fundamental hypothesis on (L, R) in this kind of equation is that there exists an α in (0, 2] such that

(3) E[ |L|^α + |R|^α ] = 1.

The first model of type (1)-(2) was introduced by Kac [22], with collisional parameters L = sin θ and R = cos θ for a random angle θ uniformly distributed on [0, 2π). The inelastic Kac equation, introduced in [29] to describe gases with inelastically colliding molecules, corresponds to (1)-(2) with L = |sin θ|^d sin θ and R = |cos θ|^d cos θ, where d > 0 is the parameter of inelasticity. In this case, (3) holds with α = 2/(d + 1).
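To fix ideas, (3) can be verified directly in the two classical examples just recalled: for the Kac model, |L|² + |R|² = sin²θ + cos²θ = 1 pointwise, so (3) holds with α = 2, while for the inelastic Kac model, with α = 2/(d + 1),

E[ |L|^α + |R|^α ] = E[ |sin θ|^{(d+1)α} + |cos θ|^{(d+1)α} ] = E[ sin²θ + cos²θ ] = 1.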
A less standard application of equations of type (1)-(2) concerns the construction of kinetic models for conservative economies. These models describe the evolution of the wealth distribution in a market of agents who interact through binary trades; see for example [5,7,24,27].
Finally, we mention that, using results in [9], it can be shown that the isotropic solutions of the multidimensional inelastic homogeneous Boltzmann equation [8] are functions of one-dimensional solutions µ_t of equation (1)-(2), for a suitable choice of (L, R) and of the initial datum µ_0.
In [3] it is proved that, if L and R are positive random variables such that (3) holds true for α ∈ (0, 1) ∪ (1, 2], E[L^p + R^p] < 1 for some p > α, and µ_0 belongs to the domain of normal attraction of an α-stable law (µ_0 being centered if α > 1), then the solution µ_t converges weakly to a probability measure µ_∞ which is a mixture of centered α-stable distributions. Some extra conditions are needed in the case α = 1, but the result is essentially of the same type. For a precise statement of these results, see Theorems 2.2 and 2.3 in Section 2.4. As for the limit distribution, it is easy to see that µ_∞ is a steady state, that is, a fixed point of the smoothing transformation Q^+. Moreover, it has been proved that the mixing distribution is itself a fixed point of another smoothing transformation. For more information on fixed points of smoothing transformations see [16]; see also the very recent paper [1] and the references therein.
In addition to the problem of finding sufficient (and possibly necessary, see e.g. [20]) conditions for the relaxation to the steady state, an important problem is to determine explicit rates of convergence to equilibrium with respect to suitable probability metrics.
In the case of the Kac equation, which has the Gaussian distribution as steady state, rates of convergence with respect to Kolmogorov's uniform metric, weighted χ-metrics of order p ≥ 2, Wasserstein metrics of order 1 and 2, and the total variation distance have been proved; see [14,15,19]. As for the inelastic Kac equation, in [4] rates of convergence to equilibrium with respect to Kolmogorov's uniform metric and weighted χ-metrics have been derived. For the solutions of the general model

2. Preliminary results
The following assumption will be needed throughout the paper.

2.1. Probabilistic representation of the solution.
In this paper we shall use the Fourier formulation of (1). We say that µ_t is a (weak) solution of (1), with initial condition µ_0, if its Fourier-Stieltjes transform µ̂_t(ξ) = ∫_R e^{iξv} µ_t(dv) obeys the equation

(8) ∂_t µ̂_t(ξ) + µ̂_t(ξ) = Q̂^+(µ̂_t, µ̂_t)(ξ),

where

Q̂^+(f, g)(ξ) := E[ f(Lξ) g(Rξ) ]

for any couple of characteristic functions (f, g).
As in the case of the Kac equation, it is easy to see that (8) admits a unique solution µ̂_t (in the class of the Fourier-Stieltjes transforms), which can be written as a Wild series [33]

µ̂_t(ξ) = Σ_{n≥1} e^{−t} (1 − e^{−t})^{n−1} q_{n−1}(ξ),

where q_0(ξ) := µ̂_0(ξ) and, for n ≥ 1,

(11) q_n(ξ) := (1/n) Σ_{j=0}^{n−1} Q̂^+(q_j, q_{n−1−j})(ξ).
In [3] it has been shown that the solution of (1) is related to a suitable stochastic process. More precisely, the unique solution µ_t of (1) with initial datum µ_0 is the law of the weighted random sum

(10) V_t := Σ_{j=1}^{N_t} β_{j,N_t} X_j,

with the following elements defined on a sufficiently large probability space (Ω, F, P):
• a sequence (X_j)_{j≥1} of i.i.d. random variables with distribution µ_0;
• a stochastic process (N_t)_{t≥0} which takes values in N and with P{N_t = n} = e^{−t}(1 − e^{−t})^{n−1} for every n ≥ 1 and t ≥ 0;
• a random array of weights (β_{j,n} : j = 1, . . . , n)_{n≥1} recursively defined by β_{1,1} := 1 and
(β_{1,n+1}, . . . , β_{n+1,n+1}) := (β_{1,n}, . . . , β_{I_n−1,n}, L_n β_{I_n,n}, R_n β_{I_n,n}, β_{I_n+1,n}, . . . , β_{n,n}),
where (L_n, R_n)_{n≥1} is a sequence of independent and identically distributed (i.i.d., for short) random vectors with the same distribution as (L, R), and (I_n)_{n≥1} is a sequence of independent random variables such that I_n is uniformly distributed on {1, . . . , n} for every n ≥ 1;
• (X_j)_{j≥1}, (N_t)_{t≥0}, (L_n, R_n)_{n≥1}, (I_n)_{n≥1} are stochastically independent.
As a matter of fact, it is possible to prove that, for every n ≥ 1, q_{n−1} − defined in (11) − is the characteristic function of the random variable

(12) W_n := Σ_{j=1}^n β_{j,n} X_j.
See the proof of Proposition 1 in [3]. Since V_t = W_{N_t}, from (10) it follows that µ_t is the law of V_t.
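The representation (10) translates into a straightforward sampling scheme. The sketch below, which is not part of the original paper, simulates V_t in the Kac case (L, R) = (sin θ, cos θ), with a standard Gaussian initial datum; the function names are illustrative assumptions.

```python
import numpy as np

def sample_V_t(t, sample_initial, rng):
    """One draw of V_t via the weighted random sum (10), Kac model."""
    # P{N_t = n} = e^{-t}(1 - e^{-t})^{n-1}: geometric on {1, 2, ...}
    # with success probability e^{-t}.
    n_t = rng.geometric(np.exp(-t))
    beta = np.array([1.0])          # beta_{1,1} = 1
    for n in range(1, n_t):         # grow the weight array from n to n+1
        i = rng.integers(0, n)      # I_n uniform on {1, ..., n} (0-based)
        theta = rng.uniform(0.0, 2.0 * np.pi)
        L, R = np.sin(theta), np.cos(theta)
        # replace beta_{I_n,n} by (L_n beta_{I_n,n}, R_n beta_{I_n,n})
        beta = np.concatenate([beta[:i], [L * beta[i], R * beta[i]], beta[i + 1:]])
    X = sample_initial(n_t, rng)    # i.i.d. draws from mu_0
    return float(np.dot(beta, X))

rng = np.random.default_rng(0)
draws = [sample_V_t(2.0, lambda n, r: r.standard_normal(n), rng)
         for _ in range(10000)]
# For the Kac model sum_j beta_{j,n}^2 = 1 exactly, so a standard Gaussian
# mu_0 gives exactly standard Gaussian draws at every t.
print(np.mean(draws), np.var(draws))
```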

2.2. Martingale of weights and fixed point equations for distributions.
It is easy to prove that, under (H_0), Σ_{j=1}^n β^α_{j,n} is a (positive) martingale and hence it converges a.s. (as n → +∞) to a random variable M^{(α)}_∞, which satisfies the fixed point equation

(13) M^{(α)}_∞ =_d L^α M′ + R^α M″.

In (13), M′ and M″ are independent copies of M^{(α)}_∞, the vector (L, R) is independent of (M′, M″), and Z_1 =_d Z_2 means that the random variables Z_1 and Z_2 have the same distribution. For a proof of these facts see Proposition 2 in [3].
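A concrete illustration: in the Kac model L² + R² = sin²θ + cos²θ = 1 pointwise, so the martingale Σ_{j=1}^n β²_{j,n} is constant and M^{(2)}_∞ ≡ 1; accordingly, the mixing distribution is degenerate and the steady state is a pure Gaussian law rather than a genuine mixture. When L^α + R^α is truly random, M^{(α)}_∞ is in general nondegenerate.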

2.3. Stable laws.
Recall that a probability distribution g_α is said to be a centered stable law of exponent α (with 0 < α ≤ 2) and real parameters (λ, β), λ > 0 and |β| ≤ 1, if its Fourier-Stieltjes transform ĝ_α(ξ) = ∫_R e^{iξv} g_α(dv) has the form

(15) ĝ_α(ξ) = exp{ −λ|ξ|^α ( 1 − iβ tan(πα/2) sgn(ξ) ) }   for α ≠ 1,
     ĝ_1(ξ) = exp{ −λ|ξ| ( 1 + iβ (2/π) sgn(ξ) log |ξ| ) }.

By definition, a probability measure µ_0 belongs to the domain of normal attraction of a stable law of exponent α if for any sequence of i.i.d. real-valued random variables (X_n)_{n≥1} with common distribution µ_0 there exists a sequence of real numbers (c_n)_{n≥1} such that the law of n^{−1/α} Σ_{i=1}^n X_i − c_n converges weakly to a stable law of exponent α.
It is well known that, provided α ≠ 2, a probability measure µ_0 belongs to the domain of normal attraction of an α-stable law if and only if its distribution function F_0(x) := µ_0((−∞, x]) satisfies

(16) lim_{x→+∞} x^α (1 − F_0(x)) = c_0^+   and   lim_{x→−∞} |x|^α F_0(x) = c_0^−

for suitable constants c_0^+ ≥ 0 and c_0^− ≥ 0. Typically, one also requires that c_0^+ + c_0^− > 0 in order to exclude convergence to the probability measure concentrated in 0, but here we shall include the situation c_0^+ = c_0^− = 0 as a special case. The parameters λ and β of the associated stable law in (15) are related to c_0^+ and c_0^− by

(17) β = (c_0^+ − c_0^−)/(c_0^+ + c_0^−),   λ = (c_0^+ + c_0^−) Γ(1 − α) cos(πα/2),

with the convention β := 0 when c_0^+ = c_0^− = 0. In contrast, if α = 2, F_0 belongs to the domain of normal attraction of a Gaussian law if and only if it has finite variance σ²; the parameter λ of the associated Gaussian law in (15) is given by λ = σ²/2. See, for example, Chapter 17 of [18] and Chapter 2 of [21].

2.4. Convergence to steady states.
We are ready to state the results concerning the convergence of µ_t to a steady state, that is, a probability measure µ_∞ such that Q^+(µ_∞, µ_∞) = µ_∞.

Theorem 2.2 ([3]). Assume that (H_0) holds true with α ≠ 1 and that F_0 satisfies (16). In addition, assume that ∫_R v µ_0(dv) = 0 if α > 1. If p < α, then µ_t converges weakly to the degenerate probability measure δ_0, while, if p > α, then µ_t converges weakly to a steady state µ_∞ with Fourier-Stieltjes transform

(18) µ̂_∞(ξ) = ∫_{[0,+∞)} ĝ_α(m^{1/α} ξ) ν_α(dm),

where ν_α is the same as in Proposition 2.1 and the parameters λ and β of ĝ_α are defined in (17) for α < 2 and (λ, β) = (σ²/2, 0) for α = 2.
We conclude this section by considering the case α = 1, stating a slight variant of Theorem 4 in [3].

Theorem 2.3. Assume that (H_0) holds true with α = 1, that F_0 satisfies (16) with c_0^+ = c_0^− =: c_0 ∈ [0, +∞), and suppose, in addition, that the limit γ_0 := lim_{T→+∞} ∫_{[−T,T]} v µ_0(dv) exists and is finite. If p < 1, then µ_t converges weakly to the degenerate probability measure δ_0, while, if p > 1, then µ_t converges weakly, as t → +∞, to a steady state µ_∞ with Fourier-Stieltjes transform

(21) µ̂_∞(ξ) = ∫_{[0,+∞)} exp{ iγ_0 m ξ − πc_0 m |ξ| } ν_1(dm),

where ν_1 is the same as in Proposition 2.1.
This theorem can be proved in a way very similar to Theorem 1 of [3]; for the sake of completeness, a sketch of the proof is given in Appendix B.
Remark 1. It is worth noticing that the steady states µ_∞ described in Theorems 2.2-2.3 are the only possible fixed points of Q^+; see Theorems 2.1 and 2.2 in [1]. Necessary conditions for the convergence of µ_t to a steady state µ_∞ are investigated in [28].

3. Rates of convergence in Wasserstein distances
The minimal L^p-metric − or Kantorovich-Wasserstein distance of order p − (p > 0) between two probability measures µ_1 and µ_2 on B(R) is defined by

(22) d_p(µ_1, µ_2) := inf_{m ∈ M(µ_1, µ_2)} ( ∫_{R²} |x − y|^p m(dx, dy) )^{min(1/p, 1)},

where M(µ_1, µ_2) is the class of all the probability measures on B(R²) with marginals µ_1 and µ_2, that is, the probability measures m such that m(· × R) = µ_1(·) and m(R × ·) = µ_2(·). In general, the infimum in (22) may be infinite; a sufficient (but not necessary) condition for having finite distance between µ_1 and µ_2 is that both ∫_R |v|^p µ_1(dv) < +∞ and ∫_R |v|^p µ_2(dv) < +∞. An important property of the Kantorovich-Wasserstein distance is its close connection with weak convergence of probability measures; namely, if (ν_t)_{t≥0} is a family of probability measures such that ∫_R |v|^p ν_t(dv) < +∞ for every t ≥ 0 and ν_∞ is a probability measure such that ∫_R |v|^p ν_∞(dv) < +∞, then d_p(ν_t, ν_∞) → 0, as t → +∞, if and only if ν_t converges weakly to ν_∞ and ∫_R |v|^p ν_t(dv) → ∫_R |v|^p ν_∞(dv). See, e.g., Lemma 8.4.35 in [30]. Recall also that d_p(ν_t, ν_∞) → 0, as t → +∞, yields the weak convergence of ν_t to ν_∞, even if ∫_R |v|^p ν_t(dv) = +∞ for every t ≥ 0.
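Since all the measures involved here live on B(R), it may help to recall the classical quantile-coupling identity, which realizes the infimum in (22): for p ≥ 1,

d_p^p(µ_1, µ_2) = ∫_0^1 | F_1^{−1}(u) − F_2^{−1}(u) |^p du,

where F_i^{−1}(u) := inf{x : F_i(x) ≥ u} denotes the quantile function of µ_i. The same coupling is optimal simultaneously for every order p ≥ 1, a fact exploited below in the proof of Proposition 5.3.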
In the rest of the section we deal with the problem of providing an upper bound for d_p(µ_t, µ_∞), where µ_t is the solution of (1) with initial condition µ_0 and µ_∞ is the corresponding steady state.
When α ≠ 1, 2, taking advantage of the probabilistic representation of the solution recalled in Section 2.1, it is relatively easy to get an upper bound for d_p(µ_t, µ_∞) whenever p ≤ 2. The reason for the restriction to p ≤ 2 is that a key point in proving this kind of estimate is the von Bahr-Esseen inequality for sums of independent random variables − see (41) − which holds only if p ≤ 2. In order to state these rates of convergence we recall that the so-called spectral function, introduced in [10], is the function ϕ : (0, +∞) → R̄ := R ∪ {−∞, +∞} defined by

(23) ϕ(q) := S(q)/q,

where S(q) := E[L^q + R^q] − 1.
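As an illustration (not needed in the sequel), in the Kac model one finds, averaging |sin θ|^q and |cos θ|^q over the uniform angle,

S(q) = 2 Γ((q + 1)/2) / ( √π Γ(q/2 + 1) ) − 1,

so that S(2) = 0, consistently with (3) holding for α = 2, while S(q) < 0, and hence ϕ(q) < 0, for every q > 2.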
3.1. Statement of the main results for α < 2.
In this section we state two results which provide (exponential) rates of convergence to equilibrium for the solution of (1) with respect to the Wasserstein distances of any order. The proofs of these statements rest on the probabilistic representation of the solution of (1) and on an inductive argument inspired by a technique developed in [17]. This inductive argument makes use of rates of convergence to equilibrium with respect to Wasserstein distances of order p ≤ 2; thus, it is crucial to have estimates for d_p(µ_t, µ_∞) when p ≤ 2. Theorem 3.1 fulfills our need if α ≠ 1, while, when α = 1, we have to prove an estimate that enables us to proceed with the inductive argument. This key step is provided by the following theorem.
In order to introduce the generalizations of Theorems 3.1 and 3.2 to Kantorovich-Wasserstein metrics of higher order, we define, for i = 1, 2 and every q ≥ i,

We are now in a position to state the aforementioned exponential rates of convergence, which are divided into two different theorems according to the value of α.

Theorem 3.3 (0 < α < 1). Assume that (H_0) holds true with 0 < α < 1 and p > 1. Assume also that µ_0 satisfies the hypotheses of Theorem 2.2 and that d_p(µ_0, µ_∞) < +∞. Then there exists a constant C_p = C_p(µ_0) < +∞ such that for every t ≥ 0.
Here we give a criterion ensuring the finiteness of d_p(µ_0, µ_∞) when p > α. The main result of this section is contained in Theorem 3.7, which extends Lemma 1 of [3]. Let us start by noticing that (18) can be immediately rewritten in terms of random variables as follows: under the hypotheses of Proposition 2.1 and Theorem 2.2, let M^{(α)}_∞ be the unique solution of equation (13), consider an α-stable random variable S_α of parameters (λ, β) given by (17), and assume that M^{(α)}_∞ and S_α are stochastically independent. Finally, let V_∞ be a random variable whose probability distribution is µ_∞. Then (18) becomes

V_∞ =_d (M^{(α)}_∞)^{1/α} S_α.

Note that, in the same way, (21) becomes

V_∞ =_d M^{(1)}_∞ C_{λ,γ_0},

where C_{λ,γ_0} is a Cauchy random variable of scale parameter λ = πc_0 and position parameter γ_0, independent of M^{(1)}_∞, and S_1 = C_{λ,0}. In other words, for every α ∈ (0, 2], V_∞ is an α-stable random variable randomly rescaled by (M^{(α)}_∞)^{1/α}. It is useful to observe that, in order to obtain sufficient conditions for the finiteness of d_p(µ_0, µ_∞), when α = 1 we can suppose, without loss of generality, that γ_0 = 0. This fact is justified by the next lemma.
Proposition 3.6. Let 0 < α < 2. If α ≠ 1, let the same assumptions of Theorem 2.2 hold with c_0^+ + c_0^− > 0, while if α = 1 let the same hypotheses of Theorem 2.3 be in force with γ_0 = 0 and c_0 > 0. Let F_∞ be the distribution function of the steady state µ_∞ described in Theorem 2.2 and Theorem 2.3, respectively. Then:
(i) If α ≠ 1, |β| ≠ 1 and S(α(k + δ)) < 0 for some integer k ≥ 1 and some δ ∈ (0, 1], then
(ii) If β = −1 [β = 1, resp.] and S(α(k + δ)) < 0, then (31) holds and

For the proof of this proposition the reader is referred to Appendix A. It is worth noticing that − with the exception of a few cases, see e.g. [6] − in general there is no analytical expression for the law of M^{(α)}_∞; nevertheless, starting from E[M^{(α)}_∞] = 1, its moments of integer order k ≥ 2 can be computed recursively whenever S(αk) < 0:

E[ (M^{(α)}_∞)^k ] = (−S(αk))^{−1} Σ_{j=1}^{k−1} C(k, j) E[ L^{αj} R^{α(k−j)} ] E[ (M^{(α)}_∞)^j ] E[ (M^{(α)}_∞)^{k−j} ],

where C(k, j) denotes the binomial coefficient. This recursive formula can be easily obtained using (13) and the Newton binomial formula. The next theorem provides the announced sufficient conditions on the initial datum µ_0 that ensure the finiteness of d_p(µ_0, µ_∞). Essentially, d_p(µ_0, µ_∞) is finite whenever the tails of F_0 are close enough to the tails of F_∞.
Theorem 3.7. If α ≠ 1, let the same assumptions of Theorem 2.2 hold, while if α = 1 let the same hypotheses of Theorem 2.3 be in force with γ_0 = 0 and c_0 > 0. Let p > α and set k := ⌊1 + (p − α)/(pα)⌋, where c̃_0^−, . . . , c̃_{k−1}^−, c̃_0^+, . . . , c̃_{k−1}^+ are given in Proposition 3.6 and ζ : (0, +∞) → R^+ is a continuous function, monotone decreasing on [B, +∞), such that

Remark 3. A simple example of function ζ is ζ(x) := |x|^{−ε} for some ε > 0, but one can also take functions that decay at infinity more slowly than any power. Hence, in this case k = ⌊1 + (p − α)/(pα)⌋ = 1. This means that (33)-(34) are similar to the conditions that describe the so-called strong domain of attraction of an α-stable law.
3.3. Some estimates for α = 2.
In this section we assume that (H_0) holds true with α = 2 and we provide some estimates for the rate of convergence to equilibrium with respect to Wasserstein distances of order p > 2. To do so, we employ the same inductive argument on the order p used in the proofs of Theorems 3.3 and 3.4. The first obstacle in this procedure is that, to the best of our knowledge, when α = 2 there is no result comparable to those of Theorems 3.1 and 3.2; the only exception is the Kac model, for which rates of convergence both in d_1 and in d_2 are known [19]. It would be useful to prove a result similar to Theorems 3.1 and 3.2 for α = 2, in order to get estimates for d_p(µ_t, µ_∞) − with 1 ≤ p ≤ 2 − and use them as the first step of the inductive argument. The main problem is that we are not able to give nontrivial upper bounds for d_p(µ_t, µ_∞) with 1 < p ≤ 2. Indeed, the only explicit estimate that we are able to provide is

d_p(µ_t, µ_∞) ≤ Γ_2

for some positive constant Γ_2, for every t ≥ 0 and for every 1 < p ≤ 2. This trivial inequality follows since, by Jensen's inequality,

d_p(µ_t, µ_∞) ≤ d_2(µ_t, µ_∞) ≤ ( ∫_R x² µ_t(dx) )^{1/2} + ( ∫_R x² µ_∞(dx) )^{1/2}.

The convergence to zero of d_2(µ_t, µ_∞) is a consequence of the weak convergence of µ_t to µ_∞ supplemented by the fact that, when µ_0 satisfies the assumptions of Theorem 2.2 (i.e. it has zero mean and finite variance), one has ∫_R x² µ_t(dx) = ∫_R x² µ_∞(dx) for every t ≥ 0.
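The second-moment identity invoked above can be checked directly from the representation (10): since the X_j are centered and independent of the weights, and E[ Σ_{j=1}^{N_t} β²_{j,N_t} ] = 1 under (H_0) with α = 2 (the sum of squared weights is a mean-one martingale, see Section 2.2),

∫_R x² µ_t(dx) = E[V_t²] = E[ Σ_{j=1}^{N_t} β²_{j,N_t} ] ∫_R x² µ_0(dx) = ∫_R x² µ_0(dx)

for every t ≥ 0, and the common value is inherited by µ_∞ in the limit.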
As for d_1, we obtain a nontrivial bound passing through Fourier distances. Recall that for every s > 0 the Fourier distance χ_s (also known as weighted χ-metric of order s) between two probability measures µ_1 and µ_2 on B(R) is defined as

χ_s(µ_1, µ_2) := sup_{ξ≠0} |µ̂_1(ξ) − µ̂_2(ξ)| / |ξ|^s,

where µ̂_i(ξ) := ∫_R e^{iξv} µ_i(dv) for every ξ ∈ R and i = 1, 2. These distances are very useful in order to easily obtain rates of convergence to equilibrium for every α ∈ (0, 2]. Indeed, one can plainly prove the following:

Proposition 3.8. Assume that (H_0) holds true with α ∈ (0, 2] and p > α. If α ≠ 1 suppose that µ_0 satisfies the hypotheses of Theorem 2.2, while if α = 1 suppose that µ_0 satisfies the hypotheses of Theorem 2.3. If χ_p(µ_0, µ_∞) < +∞, one has, for every t ≥ 0,

χ_p(µ_t, µ_∞) ≤ e^{tS(p)} χ_p(µ_0, µ_∞).

In Section 6 we will prove that, for a suitable δ > 0, the Fourier distance of order 2 + δ can be used as an upper bound for the Wasserstein distance of order 1. Combining this fact with Proposition 3.8 for α = 2, we will prove the following:

Theorem 3.9. Assume that (H_0) holds true with α = 2 and p > 2, and that µ_0 satisfies the hypotheses of Theorem 2.2. Then, for every δ ∈ (0, 1) such that 2 + δ ≤ p and ∫_R |x|^{2+δ} µ_0(dx) < +∞, there exists a constant 0 < C < +∞ such that

The next theorem provides some estimates for the rate of convergence to equilibrium with respect to Wasserstein distances of order higher than 2.
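The effectiveness of Fourier metrics for equation (1) rests on a one-step contraction property of the smoothing transformation, which can be verified in one line: for characteristic functions |µ̂|, |ν̂| ≤ 1,

| E[ µ̂(Lξ) µ̂(Rξ) ] − E[ ν̂(Lξ) ν̂(Rξ) ] | ≤ E[ |µ̂(Lξ) − ν̂(Lξ)| ] + E[ |µ̂(Rξ) − ν̂(Rξ)| ] ≤ E[ L^s + R^s ] |ξ|^s χ_s(µ, ν),

whence χ_s(Q^+(µ, µ), Q^+(ν, ν)) ≤ (S(s) + 1) χ_s(µ, ν). This is the mechanism behind the exponential factor in Proposition 3.8.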

4. Proofs of Theorem 3.2 and Lemma 3.5
We start with some useful remarks related to the probabilistic representation of the solution. Here and in the rest of the paper L(Z) denotes the law of a random variable Z.

4.1.
Proof of Lemma 3.5. We begin by proving a simple lemma.
Lemma 4.1. Consider two probability measures µ_1 and µ_2 on B(R) such that d_p(µ_1, µ_2) < +∞ for some p ≥ 1. Let µ̃_1 be a probability measure on B(R²) such that L(U · V) = µ_1 when (U, V) is distributed according to µ̃_1. Then there exists a random vector (X_11, X_12, X_2) such that the law of (X_11, X_12) is µ̃_1, the law of X_2 is µ_2 and E|X_11 X_12 − X_2|^p = d_p^p(µ_1, µ_2).

Proof. Let (X_1, X_2) be an optimal coupling for (µ_1, µ_2). If µ_{2|1} denotes the conditional law of X_2 given X_1, then the Disintegration Theorem leads to the conclusion, taking (X_11, X_12, X_2) to be a random vector whose probability distribution is µ̃_1(dx_11, dx_12) µ_{2|1}(dx_2 | x_11 · x_12).

Thanks to the previous lemma, we can prove Lemma 3.5.

4.2.
Proof of Theorem 3.2. As already anticipated in the introduction of Section 3, the von Bahr-Esseen inequality plays an important role in proving rates of convergence to equilibrium with respect to Wasserstein metrics of order p ≤ 2 in the cases in which α ≠ 1 (i.e. Theorem 3.1). For the reader's convenience we recall the statement of the von Bahr-Esseen inequality [32]: let Z_1, . . . , Z_n be independent (real-valued) random variables such that E[Z_j] = 0 and E|Z_j|^p < +∞ for some 1 ≤ p ≤ 2 and every j = 1, . . . , n; then

(41) E| Σ_{j=1}^n Z_j |^p ≤ 2 Σ_{j=1}^n E|Z_j|^p.

In this section we establish the upper bound (24) employing once again the von Bahr-Esseen inequality. To do this we need to prove the existence of a random vector (X_0, V_0) with marginal laws µ_0 and µ_∞, respectively, and such that X_0 − V_0 has finite p-th absolute moment and zero mean. These properties will be proved in Lemma 4.2, which constitutes the main tool for the proof of Theorem 3.2.
On the other hand, if −F^{−1}

Thanks to (42), the first integrals on the right-hand side of both (43) and (44) converge to zero when n → +∞. As concerns the second integrals, recall that for every n ≥ n̄ one has the required lower bound; the positivity can be obtained by further increasing n̄, if needed. Thus, in order to prove that the second integrals in (43) and (44) converge to zero, it suffices to show that the corresponding integrals ∫ x dF*(x) (over ranges depending on ε and n) converge to zero as n → +∞. By partial integration, and using the estimates of F* with G_1 and G_2, we get the desired bound; thanks to the arbitrariness of δ > 0, this entails that the second integrals in (43) and (44) converge to zero as n → +∞ and hence lim_{n→+∞} A(n) = 0. This implies (iii) and concludes the proof.
Proof of Theorem 3.2. Let (X_0, V_0) be the random vector given by Lemma 4.2. Consider a sequence (X_j, V_j)_{j≥1} of i.i.d. random vectors with the same distribution as (X_0, V_0) and such that (X_j, V_j)_{j≥1} is stochastically independent of B := σ{(β_{j,n} : j = 1, . . . , n)_{n≥1}}. By (38) we already know that Σ_{j=1}^{N_t} β_{j,N_t} V_j has probability distribution µ_∞. Now, for every n ≥ 0, denote by µ_n the law of the random variable W_{n+1}, defined in (12). Hence, by convexity,

d_p^p(µ_t, µ_∞) ≤ Σ_{n≥1} e^{−t}(1 − e^{−t})^{n−1} E| Σ_{j=1}^n β_{j,n} (X_j − V_j) |^p.

Since E|X_j − V_j|^p < +∞ and E(X_j − V_j) = 0, we can make use of the von Bahr-Esseen inequality (41) − conditionally on B − and get

E| Σ_{j=1}^n β_{j,n} (X_j − V_j) |^p ≤ 2 E[ Σ_{j=1}^n β^p_{j,n} ] E|X_0 − V_0|^p.

From Lemma 2 in [3], one has

(45) E[ Σ_{j=1}^n β^p_{j,n} ] = Γ(n + S(p)) / ( Γ(n) Γ(S(p) + 1) ),

and hence we can conclude by recalling that, for every γ > −1 and 0 < u < 1,

(46) Σ_{n≥1} Γ(n + γ) / ( Γ(n) Γ(γ + 1) ) u^{n−1} = (1 − u)^{−γ−1}.
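To make explicit how (45) and (46) combine with the law of N_t − this is the computation that closes the argument − note that, since P{N_t = n} = e^{−t}(1 − e^{−t})^{n−1},

E[ Σ_{j=1}^{N_t} β^p_{j,N_t} ] = Σ_{n≥1} e^{−t}(1 − e^{−t})^{n−1} Γ(n + S(p)) / ( Γ(n) Γ(S(p) + 1) ) = e^{−t} (e^{−t})^{−S(p)−1} = e^{tS(p)},

which, inserted in the previous bounds, yields d_p^p(µ_t, µ_∞) ≤ 2 E|X_0 − V_0|^p e^{tS(p)}, in accordance with the upper bound (24).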

Proof of Theorems 3.3 and 3.4
In this section we prove the exponential rates of convergence to equilibrium presented in Section 3.1. We develop in detail only the proof of Theorem 3.3, since Theorem 3.4 can be proved in a very similar way with slight adaptations. As already anticipated, both Theorems 3.3 and 3.4 descend from an inductive argument − applied to the order of the Wasserstein distance − supplemented by the probabilistic representation of the solution of (1) briefly recalled in Section 2.1. Recall that for every n ≥ 0, µ_n is the law of the random variable W_{n+1} introduced in (12).
We start by proving two simple lemmata.

Lemma 5.1. Assume that d_p(µ_0, µ_∞) < +∞ for some p ≥ 1. Then, for every 1 ≤ s ≤ p, the function

(47) t ↦ σ_t(s) := Σ_{n≥0} e^{−t} (1 − e^{−t})^n d_s^s(µ_n, µ_∞)

is continuous and bounded on every interval [0, T].
Proof. For every fixed t ∈ [0, T], we have to show that the series in (47) converges. In view of the hypothesis d_p(µ_0, µ_∞) < +∞, there exists a random vector (X_0, V_∞) with marginal laws µ_0 and µ_∞ and E|X_0 − V_∞|^p = d_p^p(µ_0, µ_∞). Consider a sequence (X_j, V_j)_{j≥1} of i.i.d. random vectors distributed as (X_0, V_∞) and independent of (β_{j,n} : j = 1, . . . , n)_{n≥1}. By (38), we have that d_p^p(µ_n, µ_∞) ≤ E| Σ_{j=1}^{n+1} β_{j,n+1}(X_j − V_j) |^p; by (45), we conclude that the series in (47) converges.

Proof. First of all, by the dominated convergence theorem, one proves that q → S(q) is continuous on its domain. Moreover, one can easily show that, for every q belonging to the interior of the domain of S,

(48) d/dq S(q) = E[ L^q log L + R^q log R ]

and

(49) d²/dq² S(q) = E[ L^q (log L)² + R^q (log R)² ].

Now consider ϕ on the interval (0, p); this interval is obviously included in the interior of the domain of S and, therefore, ϕ is differentiable on (0, p), with ϕ′(q) = (S′(q) q − S(q))/q². Now we claim that there is at most one point p_0 ∈ (0, p) such that ϕ′(p_0) = 0, i.e. S′(p_0) p_0 − S(p_0) = 0. Computing the derivative one gets

d/dq [ S′(q) q − S(q) ] = S″(q) q,

which, from (49), is strictly positive since P{(L, R) ∈ {0, 1}²} < 1 (see (7)). Thus, q → S′(q)q − S(q) is a strictly increasing function and the claim follows if we show that lim_{q→0+} [ S′(q)q − S(q) ] < 0. To this end, fix q ∈ (0, α] and note that q L^q |log L| ≤ max{ e^{−1}, α L^α log L } and that the right-hand side is integrable; an analogous fact obviously holds for R. Then, by the dominated convergence theorem, lim_{q→0+} q S′(q) = 0, while lim_{q→0+} S(q) = P{L > 0} + P{R > 0} − 1, which, by hypothesis (4), is strictly positive; hence lim_{q→0+} [ S′(q)q − S(q) ] = − lim_{q→0+} S(q) < 0.
Here we prove a proposition that provides the fundamental tools for the inductive argument used in the proofs of Theorems 3.3, 3.4 and 3.10.
Proposition 5.3. Assume that the assumptions of Lemma 5.1 are in force. Consider the function σ_t defined in (47) and a real number q such that 1 < q ≤ p. Then (51) and (52) hold for a suitable constant B_q. Moreover, for every s ≥ 1 one has

(53) d_s^s(µ_t, µ_∞) ≤ σ_t(s).

Proof. Statement (53) is trivial since, by Jensen's inequality, one has

d_s^s(µ_t, µ_∞) ≤ Σ_{n≥0} e^{−t}(1 − e^{−t})^n d_s^s(µ_n, µ_∞) = σ_t(s).

Now we prove (51) and (52). Consider two stochastically independent sequences ((W′_k, V′_k))_{k≥1} and ((W″_k, V″_k))_{k≥1} which are stochastically independent of ((L_n, R_n))_{n≥1}, (I_n)_{n≥1}, (N_t)_{t≥0} and such that, for every k ≥ 1, (W′_k, V′_k) and (W″_k, V″_k) are optimal couplings for d_s(µ_{k−1}, µ_∞) for every s ≥ 1. Let us specify that we can always find such random variables: the quantile couplings, built from the quantile functions evaluated at random variables uniformly distributed on (0, 1), are optimal for d_s simultaneously for every s ≥ 1. Recall also the following fact: if a, b ∈ R^+ and q > 1, then

(54) (a + b)^q ≤ a^q + b^q + c_q ( a b^{q−1} + a^{q−1} b ),

with c_q := q if q ∈ [2, 3] and c_q := q 2^{q−3} otherwise; see, e.g., Lemma 3.1 in [24]. Now put ∆_{n+1}(s) := d_s^s(µ_n, µ_∞) for every n ≥ 0 and s ≥ 1. Thanks to the independence of (W′_k, V′_k) and (W″_k, V″_k), (37) and (54) lead to a recursive estimate which, recalling that (W′_k, V′_k) and (W″_k, V″_k) have been defined as optimal couplings for d_s(µ_{k−1}, µ_∞) for every s ≥ 1, and putting λ_q := E(L^q + R^q) = S(q) + 1 and B_q := c_q E(L^{q−1} R + L R^{q−1}), gives (55). At this stage, we have to distinguish two different situations, i.e. 1 < q < 2 or q ≥ 2 (the latter being possible if p ≥ 2). The reason for this distinction lies in the fact that if q ≥ 2 then q − 1 ≥ 1 and hence, by definition of (W′_k, V′_k), one has E|W′_k − V′_k|^{q−1} = ∆_k(q − 1), which is no longer true when q < 2. We begin by considering the case q ≥ 2: as already noticed, E|W′_k − V′_k|^{q−1} = ∆_k(q − 1) and hence, from (55), one obtains (56). Thanks to the Gronwall Lemma (whose applicability is guaranteed by Lemma 5.1), it follows that, for any q ≥ 2, the announced estimate holds, which gives (52). On the other hand, if 1 < q < 2 then, by Jensen's inequality, E|W′_k − V′_k|^{q−1} ≤ (∆_k(q))^{(q−1)/q}, and, with the same technique used to get (56) from (55), one can easily obtain (51).
We are now ready to prove Theorem 3.3 and Theorem 3.4.
Proof of Theorem 3.3. From the hypotheses one knows that S(α) = 0, S(p) < 0 and p > 1; hence, thanks to the convexity of S, it is clear that S(1) < 0. Thus, from the proof of Theorem 5 in [3] we have that

(57) σ_t(1) ≤ C_1 e^{tS(1)}.
Proof of Theorem 3.4. The proof follows the same argument as that of Theorem 3.3; in particular, we proceed by mathematical induction on the order of the Wasserstein distance. Since p ≥ 2, we use (52) as the fundamental tool for the induction. From the proofs of Theorem 3.1 (when α ≠ 1) and Theorem 3.2 (when α = 1) one has

(68) σ_t(2) ≤ C²_2 e^{tS(2)}.

By Lyapunov's and Jensen's inequalities one gets, for every 0 < ε < 1, estimates which, combined with (68), give (69); using (69) in (52), it follows that (70) holds. Noticing that the first step of the induction is i = 2, one can follow the same steps of the proof of Theorem 3.3, using (69) in place of (57), (70) in place of (61), and ϕ(2) = S(2)/2 in place of ϕ(1) = S(1).
6. Proofs of Proposition 3.8 and Theorems 3.9, 3.10

We start by proving Proposition 3.8, which provides rates of convergence in Fourier metrics of suitable orders for any α ∈ (0, 2].
Proof of Proposition 3.8. Set δ := p − α > 0. By convexity of the Fourier distance we know that

χ_{α+δ}(µ_t, µ_∞) ≤ Σ_{n≥0} e^{−t}(1 − e^{−t})^n χ_{α+δ}(µ_n, µ_∞),

so we need a bound for χ_{α+δ}(µ_n, µ_∞). By (38) one gets (71). Now recall that, for every n ≥ 1, if z_1, . . . , z_n, w_1, . . . , w_n are complex numbers such that |z_i| ≤ 1 and |w_i| ≤ 1 for every i = 1, . . . , n, then

| Π_{i=1}^n z_i − Π_{i=1}^n w_i | ≤ Σ_{i=1}^n |z_i − w_i|.

Using this inequality and (71) one obtains

χ_{α+δ}(µ_n, µ_∞) ≤ χ_{α+δ}(µ_0, µ_∞) E[ Σ_{j=1}^{n+1} β^{α+δ}_{j,n+1} ].

So, by (45), one can write the resulting series explicitly and therefore, using (46), conclude that χ_{α+δ}(µ_t, µ_∞) ≤ e^{tS(α+δ)} χ_{α+δ}(µ_0, µ_∞).

In order to prove Theorem 3.9 we need the following proposition.

Proposition 6.1. For any two probability measures µ_1, µ_2 on R such that ∫_R x² µ_1(dx) < +∞, ∫_R x² µ_2(dx) < +∞ and χ_{2+δ}(µ_1, µ_2) < +∞, the distance d_1(µ_1, µ_2) can be bounded from above in terms of χ_{2+δ}(µ_1, µ_2) and of the second moments of µ_1 and µ_2.

The proof of this proposition can be carried out following, with slight changes, the same argument as in the proof of Theorem 2.21 of [11].
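For completeness: the product inequality used above follows from the telescoping identity

Π_{i=1}^n z_i − Π_{i=1}^n w_i = Σ_{k=1}^n ( Π_{i<k} z_i ) (z_k − w_k) ( Π_{i>k} w_i ),

whose prefactors all have modulus at most 1 under the stated assumptions.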
We are now ready to prove Theorem 3.10.
7. Proof of Theorem 3.7 The proof of this theorem is inspired by the proof of Lemma 3.19 in [13]. See also Lemma 3.1 in [31].
Proof of Part (ii). Suppose that β = −1 (the case β = 1 can be treated in an analogous way). We start as in the proof of Part (i), writing