On Stein's Method for Multivariate Self-Decomposable Laws With Finite First Moment

We develop a multidimensional Stein methodology for non-degenerate self-decomposable random vectors in $\mathbb{R}^d$ having finite first moment. Building on previous univariate findings, we solve an integro-partial differential Stein equation by a mixture of semigroup and Fourier analytic methods. Then, under a second moment assumption, we introduce a notion of Stein kernel and an associated Stein discrepancy specifically designed for infinitely divisible distributions. Combining these new tools, we obtain quantitative bounds on smooth-Wasserstein distances between a probability measure in $\mathbb{R}^d$ and a non-degenerate self-decomposable target law with finite second moment. Finally, under an appropriate spectral gap assumption, we investigate, via variational methods, the existence of Stein kernels. In particular, this leads to quantitative versions of classical results on characterizations of probability distributions by variational functionals.


Introduction
Stein's method is a powerful device to quantify proximity in law between random variables. It has proven particularly useful for computing explicit rates of convergence in several limit theorems of probability theory (from the standard central limit theorem to more complex probabilistic models with specific asymptotic behavior). Moreover, it has been successfully implemented for a large collection of one-dimensional target limiting laws (see [44,45,13,42] for standard references on the subject and [26] for a more recent survey). All the previously mentioned works essentially focus on the unidimensional setting, and related multidimensional results are relatively sparse in the literature. Indeed, the multidimensional Stein's method has mainly been developed for multivariate normal laws (see e.g. [2,19,20,39,38,11,40,30,33,41,34]) and for invariant measures of multidimensional diffusions ([28,18]). In particular, the work [18] proposes a general Stein's method framework for target probability measures $\mu$ on $\mathbb{R}^d$, $d \geq 1$, which satisfy the following set of assumptions: $\mu$ has finite mean, is absolutely continuous with respect to the $d$-dimensional Lebesgue measure, and its density is continuously differentiable with support the whole of $\mathbb{R}^d$.
Below, we introduce and develop a multidimensional Stein's methodology for a specific class of probability measures on $\mathbb{R}^d$, namely non-degenerate self-decomposable laws with finite first moment (see (2.4) in Section 2 for a definition). This class of probability measures, introduced by Paul Lévy in [24], is rather natural in the context of limit theorems for sums of independent summands and has been thoroughly studied in several classical books (see e.g. [22,24,23,27,37,43]). Nevertheless, although these laws are very classical in the context of limit theorems, no systematic Stein's method has been implemented for multivariate non-degenerate self-decomposable distributions. (The class of non-degenerate self-decomposable laws with finite first moment differs from, but intersects, the class of target probability measures considered in [18] and covered by their methodology. Indeed, non-degenerate self-decomposable laws with finite first moment admit a Lebesgue density, which might not be differentiable on $\mathbb{R}^d$, and whose support might be a half-space of $\mathbb{R}^d$.) Finally, many classical probability measures on $\mathbb{R}^d$ are self-decomposable (see [43,46] and Section 3 below for some examples).
Building on our previous univariate work [1], the multidimensional Stein's method we implement is a generalization of the semigroup method "à la Barbour" ([2]). Thanks to the particular structure of self-decomposable characteristic functions, this semigroup approach relies heavily on Fourier analytic tools. Moreover, the generator of the aforementioned semigroup is an integro-differential operator reflecting the infinite divisibility of the target law, and it can be seen as a direct consequence of a characterization identity originating in [21] and further developed and analyzed in [1]. The resulting Stein equation is a non-local partial differential equation, in contrast with the usual second-order partial differential equations associated with the multivariate Gaussian distribution or with the invariant measures of Itô diffusions.
Then, we apply our Stein methodology to quantify proximity, in smooth Wasserstein distances of orders 1 and 2, between an appropriate probability measure on $\mathbb{R}^d$ and a non-degenerate self-decomposable law with finite second moment. Key quantities used in our analysis are relevant versions of Stein kernels and of Stein discrepancies in this infinitely divisible setting (see Definition 4.1). Stein kernels and Stein discrepancies are concepts which have mostly been developed in the Gaussian setting and have recently gained momentum in connection with random matrices ([9]), Malliavin calculus ([31,32]), functional inequalities ([25,16]), optimal transport ([17]) and rates of convergence for multidimensional central limit theorems ([34]). In particular, the work [16] investigates the existence of a Gaussian Stein kernel for probability measures satisfying a Poincaré inequality or a converse weighted Poincaré inequality (see e.g. [3] for a definition). Thanks to earlier work on characterizing functionals of infinitely divisible distributions [14], we introduce in the last section of the present manuscript the relevant variational setting which ensures the existence of a Stein kernel and implies manageable upper bounds on the Stein discrepancy. In particular, Theorem 4.3 is a quantitative version of the characterizing results contained in [14].
Let us further describe the content of these notes. In the next section, we introduce the notation used throughout this work. In Section 3, we develop the multidimensional Stein methodology for non-degenerate self-decomposable random vectors with finite first moment, extending our univariate approach ([1]). In Section 4, we introduce the infinitely divisible version of the Stein kernel (and of the Stein discrepancy) and study the existence of Stein kernels under an appropriate version of the Poincaré inequality. We end that section by providing quantitative upper bounds on the smooth Wasserstein distance of order two in terms of Poincaré constants and of the second moment of the Lévy measure of the target self-decomposable distribution. A technical appendix finishes these notes.

Notations
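Throughout, let $\|\cdot\|$ and $\langle \cdot\,; \cdot \rangle$ be respectively the Euclidean norm and the inner product in $\mathbb{R}^d$, $d \geq 1$. Let also $\mathcal{S}(\mathbb{R}^d)$ be the Schwartz space of infinitely differentiable rapidly decreasing real-valued functions defined on $\mathbb{R}^d$, and let $\mathcal{F}$ be the Fourier transform operator given, for $f \in \mathcal{S}(\mathbb{R}^d)$, by
$$\mathcal{F}(f)(\xi) = \int_{\mathbb{R}^d} f(x)\, e^{-i \langle x; \xi \rangle}\, dx, \quad \xi \in \mathbb{R}^d.$$
On $\mathcal{S}(\mathbb{R}^d)$, the Fourier transform is an isomorphism and the following inversion formula holds
$$f(x) = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} \mathcal{F}(f)(\xi)\, e^{i \langle \xi; x \rangle}\, d\xi, \quad x \in \mathbb{R}^d.$$
For any bounded linear operator $T$ from a Banach space $(\mathcal{X}, \|\cdot\|_{\mathcal{X}})$ to another Banach space $(\mathcal{Y}, \|\cdot\|_{\mathcal{Y}})$, the operator norm is, as usual,
$$\|T\|_{op} = \sup_{f \in \mathcal{X},\, \|f\|_{\mathcal{X}} \neq 0} \frac{\|T(f)\|_{\mathcal{Y}}}{\|f\|_{\mathcal{X}}}.$$
More generally, for any $r$-multilinear form $F$ from $(\mathbb{R}^d)^r$, $r \geq 1$, to $\mathbb{R}$, the operator norm of $F$ is
$$\|F\|_{op} = \sup \left\{ |F(v_1, \dots, v_r)| : v_j \in \mathbb{R}^d,\ \|v_j\| = 1,\ 1 \leq j \leq r \right\}.$$
Throughout, a Lévy measure is a positive Borel measure $\nu$ on $\mathbb{R}^d$ such that $\nu(\{0\}) = 0$ and $\int_{\mathbb{R}^d} (1 \wedge \|u\|^2)\, \nu(du) < +\infty$. An $\mathbb{R}^d$-valued random vector $X$ is infinitely divisible with triplet $(b, A, \nu)$ (written $X \sim ID(b, A, \nu)$), if its characteristic function $\varphi$ writes, for all $\xi \in \mathbb{R}^d$, as
$$\varphi(\xi) = \exp\left( i \langle b; \xi \rangle - \frac{1}{2} \langle \xi; A \xi \rangle + \int_{\mathbb{R}^d} \left( e^{i \langle \xi; u \rangle} - 1 - i \langle \xi; u \rangle 1_D(u) \right) \nu(du) \right),$$
with $b \in \mathbb{R}^d$, $A$ a symmetric positive semi-definite $d \times d$ matrix, $\nu$ a Lévy measure and $D$ the closed Euclidean unit ball. A random vector $X$ is self-decomposable if, for every $\gamma \in (0, 1)$, there exists a probability measure $\mu_\gamma$ on $\mathbb{R}^d$, with characteristic function $\varphi_\gamma$, such that, for all $\xi \in \mathbb{R}^d$,
$$\varphi(\xi) = \varphi(\gamma \xi)\, \varphi_\gamma(\xi). \quad (2.4)$$
The Lévy measure of a self-decomposable law admits the polar decomposition
$$\nu(B) = \int_{S^{d-1}} \lambda(dx) \int_0^{+\infty} 1_B(r x)\, k_x(r)\, \frac{dr}{r}, \quad B \text{ Borel set of } \mathbb{R}^d, \quad (2.5)$$
with $\lambda$ a finite positive measure on the Euclidean unit sphere $S^{d-1}$ and $k_x(r)$ a nonnegative function (Lebesgue) measurable in $x \in S^{d-1}$ and decreasing in $r > 0$ (namely, $k_x(s) \leq k_x(r)$, for $0 < r \leq s$).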
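Thanks to [43, Remark 15.12 (iii)], since $\nu \neq 0$, let us assume that $\lambda(S^{d-1}) = 1$, that $\int_{(0,+\infty)} (r^2 \wedge 1)\, k_x(r)\, dr/r$ is finite and independent of $x$, and that $k_x(r)$ is right-continuous in $r > 0$. Finally, since they satisfy the divergence condition (see e.g. [43, Theorem 27.13]), non-degenerate self-decomposable laws on $\mathbb{R}^d$ are absolutely continuous with respect to the $d$-dimensional Lebesgue measure.

We end this section by introducing some natural distances between probability measures on $\mathbb{R}^d$. For $p \geq 1$, the Wasserstein-$p$ distance between two probability measures $\mu_X$ and $\mu_Y$ with finite $p$-th moment is
$$W_p(\mu_X, \mu_Y) = \left( \inf_{\pi \in \Gamma(\mu_X, \mu_Y)} \int_{\mathbb{R}^d \times \mathbb{R}^d} \|x - y\|^p\, \pi(dx, dy) \right)^{1/p},$$
where $\Gamma(\mu_X, \mu_Y)$ is the collection of probability measures on $\mathbb{R}^d \times \mathbb{R}^d$ with respective first and last $d$-dimensional marginals given by $\mu_X$ and $\mu_Y$. By Hölder's inequality, for $1 \leq p \leq q$,
$$W_p(\mu_X, \mu_Y) \leq W_q(\mu_X, \mu_Y),$$
while, by duality,
$$W_1(\mu_X, \mu_Y) = \sup_{h \in Lip,\ \|h\|_{Lip} \leq 1} \left| E\, h(X) - E\, h(Y) \right|,$$
where $Lip$ is the space of Lipschitz functions on $\mathbb{R}^d$ endowed with the seminorm
$$\|h\|_{Lip} = \sup_{x \neq y} \frac{|h(x) - h(y)|}{\|x - y\|}.$$
Moreover, for any $r$-times continuously differentiable function $h$ on $\mathbb{R}^d$, viewing its $\ell$th derivative $D^\ell(h)$ as an $\ell$-multilinear form, for $1 \leq \ell \leq r$, we introduce the following quantity
$$M_\ell(h) = \sup_{x \in \mathbb{R}^d} \|D^\ell(h)(x)\|_{op}. \quad (2.10)$$
For $r \geq 0$, $\mathcal{H}_r$ is the space of bounded continuous functions defined on $\mathbb{R}^d$ which are continuously differentiable up to (and including) the order $r$ and such that, for any such function $f$,
$$\|f\|_\infty \leq 1, \qquad \max_{1 \leq \ell \leq r} M_\ell(f) \leq 1.$$
Therefore, the space $\mathcal{H}_r$ is a subspace of the set of bounded functions which are $r$-times continuously differentiable on $\mathbb{R}^d$ such that $\|D^\alpha(f)\|_\infty \leq 1$ for all $\alpha \in \mathbb{N}^d$ with $0 \leq |\alpha| \leq r$. Then, the smooth Wasserstein distance of order $r$ between two random vectors $X$ and $Y$ having respective laws $\mu_X$ and $\mu_Y$ is defined by
$$d_{W_r}(\mu_X, \mu_Y) = \sup_{h \in \mathcal{H}_r} \left| E\, h(X) - E\, h(Y) \right|.$$
Moreover, the smooth Wasserstein distances of order $r \geq 1$ admit the following representation
$$d_{W_r}(\mu_X, \mu_Y) = \sup_{h \in C_c^\infty(\mathbb{R}^d) \cap \mathcal{H}_r} \left| E\, h(X) - E\, h(Y) \right|,$$
where $C_c^\infty(\mathbb{R}^d)$ is the space of infinitely differentiable compactly supported functions on $\mathbb{R}^d$. In particular, for $p \geq 1$ and $r \geq 1$,
$$d_{W_r}(\mu_X, \mu_Y) \leq d_{W_1}(\mu_X, \mu_Y) \leq W_1(\mu_X, \mu_Y) \leq W_p(\mu_X, \mu_Y).$$
Finally, as usual, for two probability measures $\mu_1$ and $\mu_2$ on $\mathbb{R}^d$, $\mu_1$ is said to be absolutely continuous with respect to $\mu_2$, denoted by $\mu_1 \ll \mu_2$, if $\mu_1(B) = 0$ for any Borel set $B$ such that $\mu_2(B) = 0$.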

Stein's Equation for SD Laws by Semigroup Methods
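Let $X$ be a non-degenerate self-decomposable random vector with values in $\mathbb{R}^d$, without Gaussian component, and law $\mu_X$. By non-degenerate, we mean that the support of the law of $X$ is not contained in some $(d-1)$-dimensional subspace of $\mathbb{R}^d$. Denote by $X_i$, for $i = 1, \dots, d$, its coordinates and assume that $E|X_i| < \infty$, for all $i = 1, \dots, d$. Its characteristic function $\varphi$ is given, for all $\xi \in \mathbb{R}^d$, by
$$\varphi(\xi) = \exp\left( i \langle \xi; EX \rangle + \int_{\mathbb{R}^d} \left( e^{i \langle \xi; u \rangle} - 1 - i \langle \xi; u \rangle \right) \nu(du) \right) = \exp\left( i \langle \xi; EX \rangle + \int_{S^{d-1}} \int_0^{+\infty} \left( e^{i \langle \xi; r x \rangle} - 1 - i \langle \xi; r x \rangle \right) \frac{k_x(r)}{r}\, dr\, \lambda(dx) \right), \quad (3.1)$$
where $\nu$ is the Lévy measure of $X$, while $k_x$ and $\lambda$ are given in (2.5). Further, assume that, for any $0 < a < b < +\infty$, the functions $k_x(\cdot)$ are such that
$$\sup_{x \in S^{d-1}} \sup_{r \in (a, b)} k_x(r) < +\infty.$$
Since the function $k_x(\cdot)$ is non-increasing in $r > 0$, the previous condition boils down to
$$\sup_{x \in S^{d-1}} k_x(a^+) < +\infty, \quad (3.2)$$
where $k_x(a^+) = \lim_{r \to a^+} k_x(r)$, for all $x \in S^{d-1}$. In (3.2), the supremum over $x$ in $S^{d-1}$ has to be understood as the $\lambda$-essential supremum of the function $k_x(r)$ in the $x$ variable. In the univariate case, $d = 1$, the polar decomposition of the Lévy measure $\nu$ boils down to $\nu(du) = k(u)\, du/|u|$ where $k$ is non-negative, non-decreasing on $(-\infty, 0)$ and non-increasing on $(0, +\infty)$. Thus, the condition (3.2) is automatically satisfied for $d = 1$. For $d \geq 2$, the polar decomposition of the Lévy measure associated with a stable distribution of index $\alpha \in (1, 2)$ is given by
$$\nu(B) = \int_{S^{d-1}} \lambda(dx) \int_0^{+\infty} 1_B(r x)\, \frac{dr}{r^{1+\alpha}},$$
for some finite positive measure $\lambda$ on the Euclidean unit sphere (see [43, Theorem 14.3]). Then, $k_x(r) = 1/r^\alpha$, for all $r > 0$, and condition (3.2) is automatically satisfied (see below for more examples).

Next, define a family of operators $(P^\nu_t)_{t \geq 0}$, for all $f \in \mathcal{S}(\mathbb{R}^d)$, all $x \in \mathbb{R}^d$ and all $t \geq 0$, via
$$P^\nu_t(f)(x) = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} \mathcal{F}(f)(\xi)\, e^{i \langle \xi; x \rangle e^{-t}}\, \frac{\varphi(\xi)}{\varphi(e^{-t} \xi)}\, d\xi. \quad (3.3)$$
Denoting by $(\mu_t)_{t \geq 0}$ the family of probability measures given by (2.4) with $\gamma = e^{-t}$ and using Fourier inversion in $\mathcal{S}(\mathbb{R}^d)$, then
$$P^\nu_t(f)(x) = \int_{\mathbb{R}^d} f(u + e^{-t} x)\, \mu_t(du). \quad (3.4)$$
For all $t \geq 0$, the probability measure $\mu_t$ is infinitely divisible with finite first moment and its characteristic function $\varphi_t$ admits, for all $\xi \in \mathbb{R}^d$, the following representation
$$\varphi_t(\xi) = \frac{\varphi(\xi)}{\varphi(e^{-t} \xi)} = \exp\left( i (1 - e^{-t}) \langle \xi; EX \rangle + \int_{S^{d-1}} \int_0^{+\infty} \left( e^{i \langle \xi; r x \rangle} - 1 - i \langle \xi; r x \rangle \right) \frac{k_x(r) - k_x(e^t r)}{r}\, dr\, \lambda(dx) \right). \quad (3.5)$$
The next lemma asserts that the family of operators $(P^\nu_t)_{t \geq 0}$ is a $C_0$-semigroup on the space $L^1(\mu_X)$.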
Its proof is very similar to the one dimensional case (see [1,Proposition 5.1]) thanks to the polar decomposition (2.5).
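The representation of $\varphi_t$ and the semigroup property both rest on a one-line change of variables, which the following sketch spells out. It assumes that (3.1) is the Lévy-Khintchine representation with drift $EX$ and polar Lévy measure $k_x(r)\,dr/r\,\lambda(dx)$, and that $\varphi_t(\xi) = \varphi(\xi)/\varphi(e^{-t}\xi)$; the integrand shorthand $g_\xi$ is introduced here for illustration only.

```latex
% Sketch under the assumptions stated in the lead-in (not a quotation of the text).
% Shorthand (hypothetical name): g_\xi(r,x) = e^{i\langle \xi; r x\rangle} - 1 - i\langle \xi; r x\rangle.
\begin{align*}
\log \varphi(e^{-t}\xi)
  &= i e^{-t}\langle \xi; EX\rangle
     + \int_{S^{d-1}}\int_0^{+\infty} g_\xi(e^{-t} r, x)\, \frac{k_x(r)}{r}\, dr\, \lambda(dx) \\
  &= i e^{-t}\langle \xi; EX\rangle
     + \int_{S^{d-1}}\int_0^{+\infty} g_\xi(s, x)\, \frac{k_x(e^{t} s)}{s}\, ds\, \lambda(dx)
     \qquad (s = e^{-t} r), \\
\log \varphi_t(\xi)
  &= \log\varphi(\xi) - \log\varphi(e^{-t}\xi) \\
  &= i (1 - e^{-t})\langle \xi; EX\rangle
     + \int_{S^{d-1}}\int_0^{+\infty} g_\xi(r, x)\, \frac{k_x(r) - k_x(e^{t} r)}{r}\, dr\, \lambda(dx).
\end{align*}
% Since k_x is non-increasing, k_x(r) - k_x(e^{t} r) \ge 0, so the last integral is
% taken against a genuine Levy measure.
%
% Cocycle identity behind the semigroup property:
\begin{equation*}
\varphi_t(\xi)\, \varphi_s(e^{-t}\xi)
  = \frac{\varphi(\xi)}{\varphi(e^{-t}\xi)} \cdot \frac{\varphi(e^{-t}\xi)}{\varphi(e^{-(s+t)}\xi)}
  = \varphi_{s+t}(\xi).
\end{equation*}
```

The cocycle identity says that if $X_t \sim \mu_t$ and $X_s \sim \mu_s$ are independent, then $X_t + e^{-t} X_s \sim \mu_{s+t}$, which is the probabilistic content of $P^\nu_s P^\nu_t = P^\nu_{s+t}$.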
Lemma 3.1. Let X be a non-degenerate self-decomposable random vector without Gaussian component, with law µ X , Lévy measure ν and such that E X < ∞, with moreover the functions k x given by (2.5) satisfying (3.2). Let ϕ be its characteristic function and let (P ν t ) t≥0 be the family of operators defined by (3.3). Then, (P ν t ) t≥0 is a C 0 -semigroup on the space L 1 (µ X ) and its generator A is defined, for all f ∈ S(R d ) and for all x ∈ R d , by Proof. Let f ∈ C b (R d ). First, by (3.4), for all s, t ≥ 0 and for all x ∈ R d , Moreover, for all s, t ≥ 0, Let ψ s,t,x be the measurable function defined by ψ s,t, Then, from (3.7), for all s, t ≥ 0 and for all x ∈ R d , Let us now compute the characteristic function of the probability measure where ϕ s,t,x (u) = e −(s+t) x + u, for all x ∈ R d , s ≥ 0 and t ≥ 0. This implies that and so the semigroup property is verified on Setting ω t (u, x) = u + e −t x, for all u ∈ R d and all x ∈ R d . (3.8) then becomes The characteristic function of the probability measure Hence, for all t ∈ R d , and so the probability measure µ X is invariant for the family of operators (P ν t ) t≥0 . One can further check as well, by Fourier arguments, that, for all x ∈ R d , Finally, by Jensen inequality and the invariance property, Then, by the density of , we can extend the family of operators (P ν t ) t≥0 to functions in L 1 (µ X ). Moreover, this extension still denoted again by (P ν t ) t≥0 is a C 0 -semigroup on L 1 (µ X ). To end the proof of the lemma, let us compute the generator of this semigroup on S(R d ). Let f ∈ S(R d ). By Fourier inversion, for all x ∈ R d and for all t > 0, First, for all x ∈ R d and for all ξ ∈ R d , which is a well-defined limit since X has finite first moment. Now, from the representation (3.1),

Moreover, by Lemma
for some C 1 , C 2 > 0 independent of t, ξ and x. Then, by the dominated convergence theorem, which is equal, by standard Fourier arguments, to This concludes the proof of (3.6) and of the lemma.
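For orientation, the formal computation behind the generator can be sketched as follows. The sketch assumes the Fourier convention $\mathcal{F}(f)(\xi) = \int f(x) e^{-i\langle x;\xi\rangle}dx$, the representation $P^\nu_t(f)(x) = (2\pi)^{-d}\int \mathcal{F}(f)(\xi)\, e^{i e^{-t}\langle\xi;x\rangle}\varphi(\xi)/\varphi(e^{-t}\xi)\,d\xi$, and a Lévy-Khintchine exponent with drift $EX$. These are assumptions about the displayed formulas, and the justification of the limit (dominated convergence) is the content of the proof above.

```latex
% Formal derivation; all interchanges of limits and integrals are taken for granted here.
\begin{align*}
\frac{d}{dt}\Big|_{t=0} e^{i e^{-t}\langle \xi; x\rangle}
  &= -i\langle \xi; x\rangle\, e^{i\langle \xi; x\rangle}, \\
\frac{d}{dt}\Big|_{t=0} \frac{\varphi(\xi)}{\varphi(e^{-t}\xi)}
  &= \langle \xi; \nabla \log\varphi(\xi)\rangle
   = i\langle \xi; EX\rangle
     + \int_{\mathbb{R}^d} i\langle \xi; u\rangle \left(e^{i\langle \xi; u\rangle} - 1\right)\nu(du).
\end{align*}
% With \mathcal{F}(\partial_j f)(\xi) = i\xi_j \mathcal{F}(f)(\xi) and Fourier inversion at x and x+u:
\begin{align*}
\mathcal{A}(f)(x)
  &= \frac{1}{(2\pi)^d}\int_{\mathbb{R}^d} \mathcal{F}(f)(\xi)\, e^{i\langle \xi; x\rangle}
     \Big( i\langle \xi; EX - x\rangle
     + \int_{\mathbb{R}^d} i\langle \xi; u\rangle \left(e^{i\langle \xi; u\rangle} - 1\right)\nu(du) \Big)\, d\xi \\
  &= \langle EX - x; \nabla f(x)\rangle
     + \int_{\mathbb{R}^d} \langle \nabla f(x+u) - \nabla f(x); u\rangle\, \nu(du).
\end{align*}
```

The inner $\nu$-integral converges because $|e^{i\langle\xi;u\rangle} - 1| \leq \min(2, \|\xi\|\|u\|)$, so the integrand is of order $\|u\|^2$ near the origin and of order $\|u\|$ at infinity, where the finite first moment of $X$ is used.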
The aim of this section is to solve, for all $x \in \mathbb{R}^d$, the following integro-partial differential equation,
$$\langle EX - x; \nabla f(x) \rangle + \int_{\mathbb{R}^d} \langle \nabla f(x+u) - \nabla f(x); u \rangle\, \nu(du) = h(x) - E\, h(X), \quad (3.12)$$
which will serve as the fundamental equation in our Stein's methodology for non-degenerate self-decomposable laws. As done in the one-dimensional case in [1], we first introduce a potential candidate solution for this equation, then study its regularity and finally prove that it actually solves the equation (3.12). The following proposition is concerned with the existence and the regularity of the candidate solution.
Proposition 3.1. Let X be a non-degenerate self-decomposable random vector in R d without Gaussian component, with law µ X , characteristic function ϕ and such that E X < ∞, with moreover the functions k x given by (2.5) satisfying (3.2). Let (P ν t ) t≥0 be the semigroup of operators as in Lemma 3.1. Then, for any h ∈ H 2 , the function f h given, for all x ∈ R d , by is well defined and twice continuously differentiable function on R d . Moreover, for any α ∈ N d such that |α| = 1, and, for any α ∈ N d such that |α| = 2, and so the function is well defined for all x ∈ R d . Moreover, by (3.4) and the regularity of h ∈ H 2 , for all 1 ≤ j ≤ d and for all x ∈ R d , which implies that, for all 1 ≤ j ≤ d and for all x ∈ R d , Similarly for all i, j ∈ {1, .., d} and for all x ∈ R d , which implies that and concludes the proof of the proposition.
Proposition 3.2. Let X be a non-degenerate self-decomposable random vector in R d without Gaussian component, with law µ X , characteristic function ϕ, and such that E X < ∞ with moreover the function k x given by (2.5) satisfying (3.2). Further, let (X t ) t≥0 be the collection of random vectors such that, for all t ≥ 0, X t has law µ t given by (2.4) with γ = e −t . For each t > 0, let µ t be absolutely continuous with respect to the d-dimensional Lebesgue measure and let its Radon-Nikodym derivative, denoted by q t , be continuously differentiable on R d and such that, for all Let h ∈ H 1 and (P ν t ) t≥0 be the semigroup of operators as in Lemma 3.1. Then, the function f h given, for all x ∈ R d , by is a well defined twice continuously differentiable function on R d . Moreover, for any α ∈ N d such that |α| = 1

and, for any α ∈ N d such that |α| = 2 for some C d > 0 only depending on d.
Proof. Let h ∈ H 1 . By By (3.4) and and so the function is well defined for all x ∈ R d . Moreover, by (3.4) and the regularity of h ∈ H 1 , for all 1 ≤ j ≤ d and for all x ∈ R d , Let us fix 1 ≤ i ≤ d and x ∈ R d . By definition and an integration by parts (thanks to (3.16)), Thus, for all 1 ≤ i, j ≤ d and x ∈ R d , This representation together with the third condition in (3.16) ensures that f h is twice continuously differentiable on R d and that, for all x ∈ R d and for all 1 Finally, for all x ∈ R d and all 1 ≤ i, j ≤ d, This concludes the proof of the proposition.
Before showing that $f_h$, as given in the two previous propositions, is a solution to the integro-partial differential equation (3.12), we provide some examples of non-degenerate self-decomposable random vectors whose law $\mu_X$ satisfies the assumptions of the aforementioned propositions.

Some Examples
Rotationally invariant α-stable random vector in R d . Let α ∈ (1, 2) and let X be an α-stable random vector whose law is rotationally invariant. Then, its characteristic function ϕ is, for all for some constant C α,d > 0 depending on α and d. Hence, the characteristic function of µ t is given, for all ξ ∈ R d , and t ≥ 0 by Thus, for all t > 0, µ t is absolutely continuous with respect to the Lebesgue measure and its density q t is given for all x ∈ R d , by the Fourier inversion formula, Moreover, the characteristic function ϕ t is linked to the Fourier transform of the probability transition density of a rotationally invariant d-dimensional α-stable process after the time change is a rotationally invariant d-dimensional α-stable Lévy process, then its characteristic function at time t is given, for all t ≥ 0 and all ξ ∈ R d , by Thus, for all ξ ∈ R d and all t ≥ 0 Finally, Lemma 2.2 of [15] implies that the density q t and its gradient satisfy the following inequality, for some C ′ α,d , C ′′ α,d > 0 only depending on α and d. It follows that q t satisfies the conditions (3.16).
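The time change mentioned in this example can be made explicit with a short computation. The sketch below assumes the normalization $\varphi(\xi) = \exp(-c\|\xi\|^\alpha)$ with a hypothetical constant $c > 0$ (the constant in the text may be normalized differently), and writes $q$ for the density of $X$ and $(Z_s)_{s \geq 0}$ for a Lévy process with $E\, e^{i\langle \xi; Z_s\rangle} = \exp(-s c \|\xi\|^\alpha)$.

```latex
% Assumed normalization: \varphi(\xi) = \exp(-c\|\xi\|^\alpha), c > 0 hypothetical.
\begin{align*}
\varphi_t(\xi) = \frac{\varphi(\xi)}{\varphi(e^{-t}\xi)}
  &= \exp\left(-c\|\xi\|^\alpha + c\, e^{-\alpha t}\|\xi\|^\alpha\right)
   = \exp\left(-c\,(1 - e^{-\alpha t})\,\|\xi\|^\alpha\right) \\
  &= \varphi\left((1 - e^{-\alpha t})^{1/\alpha}\, \xi\right)
   = E\, e^{i\langle \xi;\, Z_{1 - e^{-\alpha t}}\rangle}.
\end{align*}
% Hence X_t has the law of (1 - e^{-\alpha t})^{1/\alpha} X, and for t > 0 its density is
\begin{equation*}
q_t(x) = (1 - e^{-\alpha t})^{-d/\alpha}\; q\left((1 - e^{-\alpha t})^{-1/\alpha}\, x\right),
\qquad x \in \mathbb{R}^d .
\end{equation*}
```

In this form, estimates on $q_t$ and $\nabla q_t$ reduce to estimates on the stable density $q$ and its gradient, rescaled by powers of $1 - e^{-\alpha t}$, which behaves like $\alpha t$ for small $t$.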
Symmetric α-stable random vector in R d . Let α ∈ (1, 2) and let X be a symmetric α-stable random vector on R d . By [43,Theorem 14.13], the characteristic function of X is given by, for all ξ ∈ R d with ξ ≠ 0 where λ 1 is a symmetric positive finite measure on S d−1 . Then, for all ξ ∈ R d and all t ≥ 0, Moreover, let us assume that there exists c 0 > 0 such that for any Then, µ t is absolutely continuous with respect to the d-dimensional Lebesgue measure and its density q t is given, for all t > 0 and all x ∈ R d , by

Takano distribution [48]. Let α ∈ (0, +∞) and let µ α,d be the probability measure on R d given by where c α,d > 0 is a normalizing constant. As shown in [48], such a probability measure is self-decomposable. Moreover, its characteristic function is given, for all ξ ∈ R d , by ([48, Theorem II]) where K (d−2)/2 denotes the modified Bessel function of order (d − 2)/2 while g α (w) = 2/(π 2 w) , for all v > 0. This implies the following polar decomposition for ν where σ is the uniform measure on S d−1 . Hence, condition (3.2) is automatically satisfied. So, our methodology applies as soon as α > 1/2 (which ensures that $\int_{\mathbb{R}^d} \|x\|\, \mu_{\alpha,d}(dx) < +\infty$).
Takano distribution [47]: Let µ be the probability measure on R d given by where C > 0 is a normalizing constant. Thanks to [47,Result 1], its characteristic function ϕ is given by Hence, µ is an infinitely divisible probability measure on R d . Moreover, the function M admits the following representation (see the last formula page 64 of [47]) which is non-negative and non-increasing on (0, +∞) (and C d > 0). Thus, µ is self-decomposable and its Lévy measure admits the following polar decomposition where σ is the uniform measure on the Euclidean unit sphere. Finally, the associated k x functions satisfy (3.2) and the probability measure µ admits moments of any orders.
Multivariate gamma distributions. Let (α 1 , . . . , α d ) ∈ (0, +∞) d and let X = (X 1 , . . . , X d ) be a random vector whose independent coordinates are distributed according to gamma laws with parameters (α i , 1), 1 ≤ i ≤ d. Namely, the characteristic function of X is given by For any b > 1, there exists ρ b , a probability measure on R d , such that, Then, one can apply our methodology to these multivariate gamma distributions. Moreover, for all ξ ∈ R d and all t > 0, Let us note that other types of self-decomposable multivariate gamma distributions have been considered in the literature (see [36]). In particular, the authors of [36] considered infinitely divisible multivariate gamma distributions whose Lévy measures have the following polar decomposition, where α, β are positive real numbers and λ is a finite positive measure on the d-dimensional unit sphere. Therefore, k x (r) = αe −βr , r > 0 and so (3.2) is again satisfied. Moreover, such infinitely divisible multivariate gamma distributions are self-decomposable and admit absolute moment of any orders.
Step 1 : Let us prove that for all t > 0 and all Since h belongs to C ∞ c (R d ), by Fourier inversion, for all t > 0 and for all x ∈ R d which will be used to compute d/dt(P ν t (h)(x)). First, note that, for all x ∈ R d , ξ ∈ R d and t > 0, Moreover, for all x ∈ R d , for all ξ ∈ R d and all t > 0, To conclude the first step, let us precisely compute the right-hand side of the previous equality. First, where we have used that e −t P ν t (∂ j (h))(x) = ∂ j (P ν t (h))(x), for all t ≥ 0, all x ∈ R d and all 1 ≤ j ≤ d. Moreover, by Fubini Theorem and Fourier arguments, Thus, for all x ∈ R d and all t > 0, which gives (3.21) and finishes the proof of the first step.
Step 2 : Let 0 < b < +∞. Integrating out the equality (3.21) gives, then, letting b → +∞ and using Lemma 3.1 lead to: Next, let us show that +∞ 0 |A(P ν t (h))(x)| dt < +∞, for all x ∈ R d . To do so, we need to estimate the quantities ∇(P ν t (h))(x) and ∇(P ν t (h))(x + u) − ∇(P ν t (h))(x) , for all x ∈ R d , all u ∈ R d and all t ≥ 0. Using that ∂ j (P ν t (h)) (x) = e −t P ν t (∂ j (h))(x) and that h ∈ H 2 , Then, by the very definitions of A and P ν t and standard inequalities, for all x ∈ R d and all t ≥ 0 To conclude, one needs to prove that, for all but this follows from standard arguments as well as from Proposition 3.1.
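The structure of the two steps can be summarized in a short formal computation. It assumes the sign convention $f_h(x) = -\int_0^{+\infty} (P^\nu_t(h)(x) - E\,h(X))\,dt$ for the candidate solution and reads Step 1 as the identity $\frac{d}{dt} P^\nu_t(h)(x) = \mathcal{A}(P^\nu_t(h))(x)$; both are assumptions about the displayed formulas. The interchange of $\mathcal{A}$ with the time integral is exactly what the integrability estimate of Step 2 is for.

```latex
% Formal verification that f_h solves A(f_h) = h - E h(X), under the stated conventions.
\begin{align*}
\mathcal{A}(f_h)(x)
  &= -\int_0^{+\infty} \mathcal{A}\big(P^\nu_t(h)\big)(x)\, dt
   = -\lim_{b \to +\infty} \int_0^{b} \frac{d}{dt} P^\nu_t(h)(x)\, dt \\
  &= -\lim_{b \to +\infty} \big( P^\nu_b(h)(x) - P^\nu_0(h)(x) \big)
   = h(x) - E\, h(X).
\end{align*}
% Here P^\nu_0(h) = h, and P^\nu_b(h)(x) = \int h(u + e^{-b}x)\,\mu_b(du) \to E\,h(X) as b \to +\infty,
% since \varphi_b(\xi) = \varphi(\xi)/\varphi(e^{-b}\xi) \to \varphi(\xi) and e^{-b}x \to 0.
```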
We end this section with regularity estimates for the solution f h of the Stein's equation under the assumptions of Proposition 3.1. Similar estimates hold true under the assumptions of Proposition 3.2. In particular, under these latter assumptions, it is sufficient to have h ∈ H 1 to obtain a bound on M 2 (f h ), and this is in line with the Gaussian case (see [38,11]).
Proposition 3.4. Let X be a non-degenerate self-decomposable random vector in R d without Gaussian component, with law µ X , characteristic function ϕ and such that E X < ∞, with moreover the functions k x given by (2.5) satisfying (3.2). Let h ∈ H 2 and (P ν t ) t≥0 be the semigroup of operators as in Lemma 3.1. Then, f h given, for all x ∈ R d , by is such that Proof. By definition, Let u ∈ R d with u = 1. Then, by the Cauchy-Schwarz inequality, for all

Now, thanks to the commutation relation
The bound for M 2 (f h ) follows similarly using the commutation relation twice and the fact that h ∈ H 2 .
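One way to see where such bounds come from is the following sketch. It uses only the commutation relation $\partial_j(P^\nu_t(h)) = e^{-t} P^\nu_t(\partial_j(h))$ stated above, the fact that $P^\nu_t$ is an average against a probability measure (so that $|P^\nu_t(g)(x)| \leq \|g\|_\infty$), and differentiation under the time integral as in Proposition 3.1. The constants $1$ and $1/2$ are what this argument yields for $h \in \mathcal{H}_2$ and are stated as a consequence of the sketch, not as a quotation of the proposition.

```latex
% Sketch: u, v are unit vectors in R^d, h \in H_2, f_h = \pm\int_0^\infty (P_t h - E h(X)) dt.
\begin{align*}
|\langle \nabla f_h(x); u\rangle|
  &\le \int_0^{+\infty} \big|\langle \nabla P^\nu_t(h)(x); u\rangle\big|\, dt
   = \int_0^{+\infty} e^{-t}\, \big|P^\nu_t\big(\langle \nabla h; u\rangle\big)(x)\big|\, dt \\
  &\le M_1(h)\int_0^{+\infty} e^{-t}\, dt = M_1(h) \le 1, \\
|D^2 f_h(x)(u, v)|
  &\le \int_0^{+\infty} e^{-2t}\, \big|P^\nu_t\big(D^2 h(\cdot)(u, v)\big)(x)\big|\, dt
   \le M_2(h)\int_0^{+\infty} e^{-2t}\, dt = \frac{M_2(h)}{2} \le \frac{1}{2}.
\end{align*}
% Taking the supremum over x and over unit vectors gives M_1(f_h) \le 1 and M_2(f_h) \le 1/2.
```

The factor $e^{-2t}$ in the second line comes from applying the commutation relation twice, once per derivative.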

Stein Kernels for SD Laws With Finite Second Moment
In the Gaussian setting, a major finding in the context of Stein's method is the introduction of the notion of Stein kernel (see e.g. [45,7,8,9,31,10,32,34,25,16]). Recall that γ, a centered Gaussian measure on R d , satisfies the following integration by parts formula, for all smooth enough, R d -valued function f = (f 1 , ..., f d ) and where div(f (x)) = d j=1 ∂ j (f j )(x). For a centered probability measure ρ on R d , the Gaussian Stein kernel of ρ is the measurable function τ ρ , from R d to M d×d (R), the space of d × d real matrices, such that, for all smooth enough where A; B HS = Tr A t B , for A, B ∈ M d×d (R). Recall, also from the previous section, that a non-degenerate self-decomposable random vector X without Gaussian component, with law µ X and with finite first moment satisfies, for f smooth enough, the following characterizing equation, Then, quite naturally, in the infinitely divisible framework, let us introduce the following definitions of Stein's kernels and of the Stein's discrepancy.
Definition 4.1. Let X be a centered non-degenerate infinitely divisible random vector without Gaussian component, with law µ X , with Lévy measure ν and with E X 2 < ∞. Let Y be a centered random vector with law µ Y and with E Y 2 < ∞. Then, a Stein kernel of Y with respect to X is a measurable function τ Y from R d to R d such that, for all R d -valued test function f for which both sides of the previous equality are well defined. Moreover, the Stein's discrepancy of µ Y with respect to µ X is given by where the infimum is taken over all Stein kernels of Y with respect to X, and is equal to +∞ if no such Stein kernel exists.
The next result ensures that the Stein's discrepancy provides a good control on classical metrics between probability measures on R d .
Theorem 4.1. Let X be a centered non-degenerate self-decomposable random vector without Gaussian component, with law µ X , Lévy measure ν, such that E X 2 < +∞ and let also the functions k x given by (2.5) satisfy (3.2). Let Y be a centered non-degenerate random vector with law µ Y , such that E Y 2 < +∞, and for which a Stein's kernel with respect to X exists. Then, and thus, Now, since Y admits a Stein kernel with respect to X, Taking the absolute values and applying the Cauchy-Schwarz inequality, Now, by the very definition of M 2 (f h ) and by the Cauchy-Schwarz inequality (applied twice), the following bound holds true To conclude use the definition of the Stein's discrepancy and Proposition 3.4.
In the sequel, we wish to discuss sufficient conditions for the existence of Stein kernels as defined above. For this purpose, let us recall some definition and results from [12,14] regarding Poincaré inequalities in an infinitely divisible setting. First, if X is a non-degenerate infinitely divisible random vector in R d without Gaussian component, with law µ X and with Lévy measure ν and if f : [12,Theorem 4.1] gives characterizes the proximity in law of Y to a centered infinitely divisible random vector with finite second moment and Lévy measure ν. Indeed, [14, Theorem 2.1] gives the following: U (Y, ν) ≥ 1 and U (Y, ν) = 1 if and only if the characteristic function of Y is given by In the Gaussian case, the existence of a Stein kernel for multivariate distributions has been investigated with the help of variational methods. Indeed, in [16], under a spectral gap assumption, the existence of a Gaussian Stein kernel has been ensured thanks to the classical Lax-Milgram Theorem. Then, in view of (4.2) and the associated characterization, it is natural to introduce the following variational setting: let Y be a centered non-degenerate random vector with finite second moment and with law µ Y and let ν be a Lévy measure on R d such that u ≥1 u 2 ν(du) < +∞. Moreover, assume that ν * µ Y << µ Y , with ν * µ Y denoting the convolution of the two positive measures ν and µ Y . Now, let H ν (µ Y ) be the vector space of Borel measurable R d -valued functions (Two functions f and g of H ν (µ Y ) are identified as soon as f = g µ Y -almost everywhere.) Then, let us assume that Y satisfies a Poincaré inequality of the following type: there exists a positive and finite constant U Y such that, In particular, note that if Y satisfies the Poincaré inequality (4.2), then, for all so that Y satisfies a Poincaré inequality in the sense of the Inequality (4.3) with U Y = U (Y, ν). 
Moreover, let A be the bilinear functional defined, for all test functions f and g, by and let L be the linear functional defined, for all test functions f , by Before solving the variational problem associated with A, L and H ν (µ Y ), we need the following technical lemma.
Lemma 4.1. The vector space H ν (µ Y ) endowed with the bilinear functional is a Hilbert space. Moreover, A, defined by (4.4), is continuous on Proof. First, it is clear that the bilinear symmetric functional ·; · Hν (µ Y ) is an inner product on H ν (µ Y ). Then, let · Hν (µ Y ) be the induced norm defined via f 2 , for all f ∈ H ν (µ Y ). Let us prove that H ν (µ Y ) endowed with this norm is complete. Let (f n ) n≥1 be a Cauchy sequence in H ν (µ Y ). Therefore (f n ) n≥1 is a Cauchy sequence in L 2 (µ Y ), and there exists f ∈ L 2 (µ Y ) such that f n → f , as n → +∞ in L 2 (µ Y ). Now, pick a subsequence (f n k ) k≥1 such that f n k → f , µ Y -almost everywhere, as k → +∞. Fatou's lemma together with the assumption that ν * µ Y << µ Y and the fact that (f n ) n≥1 is a Cauchy sequence in H ν (µ Y ) (thus is bounded), imply that (4.7) Hence, f belongs to H ν (µ Y ). Another application of Fatou's lemma together with the fact that . Now, by the Cauchy-Schwarz inequality, for all f, g ∈ H ν (µ Y ), Moreover, since Y satisfies the Poincaré inequality (4.3), for all f ∈ H ν,0 (µ Y ) for 2C Y = min (1, 1/(U Y )) > 0. Finally, the continuity property of the linear functional L on H ν (µ Y ) follows from the Cauchy-Schwarz inequality, from E Y 2 < +∞, and from the continuous Note that since H ν,0 (µ Y ) is a closed subspace of H ν (µ Y ), it is as well a Hilbert space with the inner product .; . Hν (µ Y ) . Based on Lemma 4.1, a direct application of the Lax-Milgram Theorem ensures the existence of a Stein kernel in the sense of Definition 4.1 for probability measures µ Y which satisfy the Poincaré inequality (4.3). This is the content of the next theorem.
Theorem 4.2. Let Y be a centered non-degenerate random vector with finite second moment and with law µ Y and let ν be a Lévy measure on R d such that u ≥1 u 2 ν(du) < +∞. Assume that Y satisfies the Poincaré inequality (4.3) for some 0 < U Y < +∞ and that ν * µ Y << µ Y . Then, there exists a unique τ Y ∈ H ν,0 (µ Y ), such that, for all f ∈ H ν,0 (µ Y ) Moreover, Proof. The first part of the theorem is a direct application of the Lax-Milgram Theorem with A, L and H ν,0 (µ Y ). To obtain the inequality (4.9), note that thanks to (4.8) with f = τ Y , (4.10) Finally, the Poincaré inequality (4.3) combined with the previous inequality implies which concludes the proof.
The next theorem is the main result of this section.
Theorem 4.3. Let X be a centered non-degenerate self-decomposable random vector without Gaussian component, with law µ X , with Lévy measure ν, such that E X 2 < +∞ and let also the functions k x given by (2.5) satisfy (3.2). Let Y be a centered non-degenerate random vector with law µ Y , with E Y 2 < +∞ and such that Y satisfies the Poincaré inequality (4.3) with 1 ≤ U Y < +∞ and that ν * µ Y << µ Y . Then, Proof. Let us start with the proof of (4.12). First, note that since with f h given by Proposition 3.3. Thus, using EY = 0, by Theorem 4.1, We continue by estimating . By the Pythagorean Theorem, Definition 4.1 and the fact that Moreover, (4.9) implies that (4.14) Combining (4.13) and (4.14) concludes the proof of the theorem. The proof of (4.11) follows in a completely similar manner.
for all 1 ≤ j ≤ d. Therefore, it follows that U Y = 1.
(iii) The following inequality on the Stein discrepancy is a direct byproduct of the proof of the previous theorem (iv) All the above results should be compared with the analogous Gaussian ones obtained in [16] (see [16,Theorem 2.4 and Corollary 2.5]).
As a straightforward corollary to Theorem 4.3, the following convergence result holds true.
Corollary 4.1. Let $X$ be a centered non-degenerate self-decomposable random vector without Gaussian component, with law $\mu_X$, Lévy measure $\nu$, such that $E\|X\|^2 < +\infty$ and let also the functions $k_x$ given by (2.5) satisfy (3.2). Let $(Y_n)_{n \geq 1}$ be a sequence of centered square-integrable non-degenerate random vectors with laws $(\mu_n)_{n \geq 1}$, such that $\nu * \mu_n \ll \mu_n$, for all $n \geq 1$, and such that $Y_n$ satisfies the Poincaré inequality (4.3) with $1 \leq U_n < +\infty$, for all $n \geq 1$. If $E\|Y_n\|^2 \to \int_{\mathbb{R}^d} \|u\|^2\, \nu(du)$ and $U_n \to 1$, as $n$ tends to $+\infty$, then $(Y_n)_{n \geq 1}$ converges in distribution towards $X$.
To end this section, we briefly discuss the condition $\nu * \mu_Y \ll \mu_Y$ appearing in Theorems 4.2 and 4.3. For this purpose, let $\nu$ be the Lévy measure of a non-degenerate infinitely divisible random vector $X$ in $\mathbb{R}^d$ with law $\mu_X$. Now, let $P(\nu)$ be the set of probability measures $\mu$ on $\mathbb{R}^d$ such that $\nu * \mu \ll \mu$. First of all, thanks to [12, Lemma 4.1], the set $P(\nu)$ is not empty and contains the probability measure $\mu_X$. Moreover, it is clearly a convex set. Now, let us describe some further non-trivial examples of probability measures belonging to $P(\nu)$. For this purpose, we say that two probability measures $\mu_1$ and $\mu_2$ on $\mathbb{R}^d$ are equivalent (denoted by $\mu_1 \sim \mu_2$) if, for any Borel set $B$ of $\mathbb{R}^d$, $\mu_1(B) = 0$ if and only if $\mu_2(B) = 0$.
Proposition 4.1. Let $X$ be a non-degenerate infinitely divisible random vector in $\mathbb{R}^d$ with law $\mu_X$ and Lévy measure $\nu$, and let $P(\nu)$ be the set of probability measures $\mu$ on $\mathbb{R}^d$ such that $\nu * \mu \ll \mu$. Let $Y$ be a non-degenerate random vector in $\mathbb{R}^d$ with law $\mu_Y$ such that $\mu_Y \sim \mu_X$. Then, $\mu_Y \in P(\nu)$.
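The reasoning behind Proposition 4.1 can be sketched in a few lines. The sketch assumes the convention $(\nu * \mu)(B) = \int_{\mathbb{R}^d} \mu(B - u)\,\nu(du)$ for the convolution (a convention not spelled out in this section) and uses the fact, recalled above, that $\mu_X \in P(\nu)$. For a Borel set $B$:

```latex
% Sketch only; assumes (\nu * \mu)(B) = \int_{\mathbb{R}^d} \mu(B - u)\,\nu(du).
\begin{align*}
\mu_Y(B) = 0
 &\;\Longrightarrow\; \mu_X(B) = 0
   && (\mu_Y \sim \mu_X) \\
 &\;\Longrightarrow\; \int_{\mathbb{R}^d} \mu_X(B - u)\,\nu(du) = 0
   && (\nu * \mu_X \ll \mu_X) \\
 &\;\Longrightarrow\; \mu_X(B - u) = 0 \ \text{for $\nu$-a.e. } u
   && (\text{nonnegative integrand}) \\
 &\;\Longrightarrow\; \mu_Y(B - u) = 0 \ \text{for $\nu$-a.e. } u
   && (\mu_Y \sim \mu_X) \\
 &\;\Longrightarrow\; (\nu * \mu_Y)(B) = 0 .
\end{align*}
```

Only the null sets of $\mu_Y$ enter the argument, which is why equivalence with $\mu_X$ suffices.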
As a further straightforward corollary, the following result holds true.
Corollary 4.2. Let $X$ be a non-degenerate infinitely divisible random vector in $\mathbb{R}^d$ without Gaussian component, with law $\mu_X$, Lévy measure $\nu_X$ and parameter $b_X \in \mathbb{R}^d$, and let $P(\nu_X)$ be the set of probability measures $\mu$ on $\mathbb{R}^d$ such that $\nu_X * \mu \ll \mu$. Let $Y$ be a non-degenerate infinitely divisible random vector in $\mathbb{R}^d$ without Gaussian component, with law $\mu_Y$, Lévy measure $\nu_Y$ and parameter $b_Y \in \mathbb{R}^d$. Assume that $\nu_X \sim \nu_Y$ and that
Proof. This is a direct application of Proposition 4.1 together with [43, Theorem 33.1].

A Appendix
The aim of this section is to provide technical results (often multivariate versions of univariate ones proved in [1]) which are used throughout the previous sections.
Lemma A.1. Let $X$ be a non-degenerate self-decomposable random vector in $\mathbb{R}^d$, without Gaussian component, with law $\mu_X$, characteristic function $\varphi$ and such that $\mathbb{E}\|X\| < \infty$. Assume further that, for any $0 < a < b < +\infty$, the functions $k_x$ given by (2.5) satisfy the following condition Let $X_t$, $t \geq 0$, be the random vectors each with characteristic function $\varphi_t$ given, for all $\xi \in \mathbb{R}^d$, by $\varphi_t(\xi) = \varphi(\xi)/\varphi(e^{-t}\xi)$. Then, and, (ii) for all $\xi \in \mathbb{R}^d$ and all $t \in (0, 1)$, for some $C > 0$ independent of $\xi$ and $t$.
Proof. Let us start with the proof of (i). First note that, for all $t > 0$ where $Y_t$ and $Z_t$ are independent, with, for all $\xi \in \mathbb{R}^d$ with $\nu_t$ the Lévy measure of $X_t$. Then, for all $t > 0$, Using [29, Lemma 1.1], Now, thanks to the representation (3.5), Thus,
To prove (ii), first note that, for all $\xi \in \mathbb{R}^d$ with $\xi \neq 0$ and all $t > 0$ and thus Noting that, for all $\xi \in \mathbb{R}^d$ and all $1 \leq j \leq d$, it follows that Then, the polar decomposition of $\nu_t$ allows one to bound the two terms $\int_{\|u\| \leq 1} \|u\|^2 \nu_t(du)$ and $\int_{\|u\| \geq 1} \|u\| \nu_t(du)$. Let us first deal with the term $\int_{\|u\| \geq 1} \|u\| \nu_t(du)$. By (3.5), $\leq (e^t - 1) \sup$ which is finite in view of (A.1). For the term $\int_{\|u\| \leq 1} \|u\|^2 \nu_t(du)$, This concludes the proof of the lemma.
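To make the family $(X_t)_{t > 0}$ concrete, here is a worked example for the characteristic functions $\varphi_t(\xi) = \varphi(\xi)/\varphi(e^{-t}\xi)$ of (A.14). The target is an assumption chosen for illustration only: a rotationally invariant $\alpha$-stable vector with $\alpha \in (1, 2)$, which is self-decomposable, has no Gaussian component and has finite first moment. The condition on the functions $k_x$ is not checked here.

```latex
% Illustration only: rotationally invariant alpha-stable target, 1 < alpha < 2.
\begin{align*}
\varphi(\xi) &= \exp\bigl(-\|\xi\|^{\alpha}\bigr), \\
\varphi_t(\xi) &= \frac{\varphi(\xi)}{\varphi(e^{-t}\xi)}
  = \exp\bigl(-\|\xi\|^{\alpha} + e^{-\alpha t}\|\xi\|^{\alpha}\bigr)
  = \exp\bigl(-(1 - e^{-\alpha t})\|\xi\|^{\alpha}\bigr), \\
X_t &\overset{d}{=} (1 - e^{-\alpha t})^{1/\alpha}\, X, \qquad
\mathbb{E}\|X_t\| = (1 - e^{-\alpha t})^{1/\alpha}\,\mathbb{E}\|X\| \leq \mathbb{E}\|X\| .
\end{align*}
```

In this example the first moments of $X_t$ are bounded uniformly in $t$, and $\varphi_t(\xi) \to \varphi(\xi)$ as $t \to +\infty$ for every $\xi$.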
Lemma A.2. Let $X$, $Y$ be two random vectors in $\mathbb{R}^d$ with respective laws $\mu_X$ and $\mu_Y$. Let $r \geq 1$. Then, Now, let $h \in H_r$ and let $(h_\varepsilon)_{\varepsilon > 0}$ be the regularization of $h$ with the Gaussian kernel, namely, for all $x \in \mathbb{R}^d$ and all $\varepsilon > 0$ Next, let $\Psi$ be a compactly supported infinitely differentiable function with values in $[0, 1]$ such that $\operatorname{supp}(\Psi) \subseteq D(0, 2)$ and such that $\Psi(x) = 1$, for all $x \in D$. Then, for any $R \geq 1$ and any $\varepsilon > 0$, set, for all Then, for $X$ and $Y$ two random vectors on $\mathbb{R}^d$ with respective laws $\mu_X$ and $\mu_Y$, To continue, one needs to estimate the quantities $M_\ell(h_{\varepsilon,R})$, for all $0 \leq \ell \leq r$. First, since $h \in H_r$ Then, for all $R \geq 1$ and all $\varepsilon > 0$ By a similar reasoning, it follows that $M_\ell(h_{\varepsilon,R}) \leq C_{\ell,\Psi} \sum_{k=1}^{\ell} R^{-k} + 1$, for all $1 \leq \ell \leq r$ and for some $C_{\ell,\Psi} > 0$ only depending on $\ell$ and $\Psi$. Then, the function $h_{\varepsilon,R}$ defined, for all $x \in \mathbb{R}^d$, by Letting first $R$ tend to $+\infty$ and then $\varepsilon$ tend to $0^+$ concludes the proof of the lemma.
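The objective of Theorem A.1 below is to prove that the $d_{W_1}$ distance between the law of $X$ and the law of $X_t$ decreases exponentially fast as $t$ tends to $+\infty$. For this purpose, for any $r \geq 1$ and any random vectors $X$ and $Y$, let
$$\widetilde{d}_{W_r}(X, Y) = \sup_{h \in \widetilde{H}_r} |\mathbb{E}h(X) - \mathbb{E}h(Y)|, \tag{A.9}$$
where $\widetilde{H}_r$ is the set of functions which are $r$-times continuously differentiable on $\mathbb{R}^d$ and such that $\|D^\alpha(f)\|_\infty \leq 1$, for all $\alpha \in \mathbb{N}^d$ with $0 \leq |\alpha| \leq r$. Since, for any $r \geq 1$, $H_r \subset \widetilde{H}_r$,
$$d_{W_r}(X, Y) \leq \widetilde{d}_{W_r}(X, Y). \tag{A.10}$$
The next lemma shows that smooth compactly supported functions in $\widetilde{H}_r$, $r \geq 1$, are enough in (A.9).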
Lemma A.3. Let $X$, $Y$ be two random vectors in $\mathbb{R}^d$ with respective laws $\mu_X$ and $\mu_Y$. Let $r \geq 1$. Then,
Proof. Let $r \geq 1$. By definition, Let $h \in \widetilde{H}_r$ and let $(h_\varepsilon)_{\varepsilon > 0}$ be a regularization by convolution of $h$, such that $h_\varepsilon \in C^\infty(\mathbb{R}^d)$ and Clearly, by construction, $h_{M,\varepsilon} \in C_c^\infty(\mathbb{R}^d)$. Then, for all $M \geq 1$ and $\varepsilon > 0$, Choosing $M \geq 1$ large enough, Next, by the very definition of $h_{M,\varepsilon}$, $\|h_{M,\varepsilon}\|_\infty \leq 1$, and, moreover, by the Leibniz formula, for all $\alpha \in \mathbb{N}^d$ with $1 \leq |\alpha| \leq r$ and $x \in \mathbb{R}^d$ $|\mathbb{E}h(X) - \mathbb{E}h(Y)| + (2d + 2)\varepsilon$ for some $C_{d,r} > 0$ only depending on $d$, $r$ and $\psi$. The conclusion follows by first taking $M \to +\infty$, and then $\varepsilon \to 0^+$.
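The Leibniz step in the proof above can be illustrated on a generic cutoff. This is a sketch under assumptions: the cutoff $\psi_M(x) = \psi(x/M)$, with $\psi$ smooth, compactly supported and valued in $[0, 1]$, is a hypothetical choice (the exact definition of $h_{M,\varepsilon}$ is not reproduced here), and $g$ denotes any smooth function with $\|D^\beta g\|_\infty \leq 1$ for all $|\beta| \leq r$. For $1 \leq |\alpha| \leq r$ and $M \geq 1$:

```latex
% Sketch: psi_M(x) = psi(x / M) is an assumed cutoff, not the paper's definition.
\begin{align*}
D^{\alpha}(g\,\psi_M)(x)
 &= \sum_{\beta \leq \alpha} \binom{\alpha}{\beta}\,
    D^{\beta} g(x)\; M^{-|\alpha - \beta|}\,
    \bigl(D^{\alpha - \beta}\psi\bigr)(x/M), \\
\|D^{\alpha}(g\,\psi_M)\|_{\infty}
 &\leq \underbrace{\|D^{\alpha} g\|_{\infty}\,\|\psi_M\|_{\infty}}_{\leq\, 1}
  \;+\; \frac{1}{M} \sum_{\beta < \alpha} \binom{\alpha}{\beta}\,
    \|D^{\alpha - \beta}\psi\|_{\infty} .
\end{align*}
```

The term with $\beta = \alpha$ is bounded by $1$, and every other term carries at least one factor $1/M$, so the bound takes the form $1 + C/M$ with $C$ depending only on $d$, $r$ and $\psi$. Letting $M \to +\infty$ therefore removes the excess.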
Theorem A.1. Let $X$ be a non-degenerate self-decomposable random vector in $\mathbb{R}^d$, without Gaussian component, with law $\mu_X$, characteristic function $\varphi$ and such that $\mathbb{E}\|X\| < \infty$. Assume further that, for any $0 < a < b < +\infty$, the functions $k_x$ given by (2.5) satisfy the following condition Let $X_t$, $t > 0$, be random vectors each with law $\mu_{X_t}$ and characteristic function $\varphi_t$ given, for all $\xi \in \mathbb{R}^d$, by
$$\varphi_t(\xi) = \frac{\varphi(\xi)}{\varphi(e^{-t}\xi)}. \tag{A.14}$$
Then, for $t > 0$ for some $C_d > 0$ independent of $t$.
Proof.
For $\alpha \in \mathbb{N}^d$ such that $|\alpha| = r$, let us estimate $\|D^\alpha(h_\varepsilon)\|_\infty$. By definition, for all $x \in \mathbb{R}^d$, Now, by the Rodrigues formula, for all $j \in \{1, \dots, d\}$, where $H_{\alpha_j}$ is the Hermite polynomial of degree $\alpha_j$. Thus, for all $\alpha \in \mathbb{N}^d$ and $x \in \mathbb{R}^d$, where $H_\alpha(x) = \prod_{j=1}^{d} H_{\alpha_j}(x_j)$. Then, for all $\alpha \in \mathbb{N}^d$ such that $|\alpha| = r$, for all $x \in \mathbb{R}^d$ and for some $\beta \in \mathbb{N}^d$ such that $|\beta| = r - 1$ and $\alpha - \beta \geq 0$ Thus, for some $C_\alpha, C_r > 0$ depending only on $\alpha$, on $d$ and on $r$. Let $Z$ and $Y$ be two random vectors with respective laws $\mu_Z$ and $\mu_Y$ such that $d_{W_r}(Z, Y) < 1$. Then, Choosing $\varepsilon \in (0, C_r)$, Taking $\varepsilon \leq C_r/(1 + C_r)\, d_{W_r}(Z, Y)$ yields, for some $\widetilde{C}_r > 0$ only depending on $r$ and on $d$. Now, let $Z$ and $Y$ be two random vectors such that $d_{W_2}(Z, Y) < 1$. Then, thanks to (2.15), $d_{W_m}(Z, Y) < 1$, for all $2 \leq m \leq r$. By induction, we get (A.16) for some $C_r > 0$ only depending on $r$ and on $d$.
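The role of the Rodrigues formula in the estimate above can be made explicit in one dimension. This is a sketch under assumptions: the regularization is taken to be $h_\varepsilon = h * p_\varepsilon$ with $p_\varepsilon$ the centered Gaussian density of variance $\varepsilon^2$ (the scaling used in the proof may differ), and $H_n$ denotes the probabilists' Hermite polynomial, $H_n(x) = (-1)^n e^{x^2/2} \frac{d^n}{dx^n} e^{-x^2/2}$.

```latex
% Sketch: p_eps is the N(0, eps^2) density; this normalization is an assumption.
\begin{align*}
p_{\varepsilon}(y) &= \frac{1}{\sqrt{2\pi}\,\varepsilon}\, e^{-y^{2}/(2\varepsilon^{2})},
\qquad
p_{\varepsilon}^{(n)}(y) = (-1)^{n}\,\varepsilon^{-n}\, H_{n}(y/\varepsilon)\, p_{\varepsilon}(y), \\
\int_{\mathbb{R}} \bigl|p_{\varepsilon}'(y)\bigr|\,dy
 &= \frac{1}{\varepsilon}\,\mathbb{E}\,|H_{1}(Z)|
  = \frac{1}{\varepsilon}\,\mathbb{E}\,|Z|
  = \frac{1}{\varepsilon}\sqrt{\frac{2}{\pi}}, \qquad Z \sim N(0,1), \\
\|D^{\alpha} h_{\varepsilon}\|_{\infty}
 &= \bigl\|(D^{\beta} h) * \partial_{j} p_{\varepsilon}\bigr\|_{\infty}
 \leq \frac{1}{\varepsilon}\sqrt{\frac{2}{\pi}}\;\|D^{\beta} h\|_{\infty},
 \qquad \alpha = \beta + e_{j} .
\end{align*}
```

In the last line $p_\varepsilon$ stands for the product density on $\mathbb{R}^d$. Moving a single derivative onto the kernel costs one factor $1/\varepsilon$, which is the trade-off that the choice of $\varepsilon$ in terms of $d_{W_r}(Z, Y)$ balances.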
Step 2: Let $g$ be an infinitely differentiable function with compact support contained in the Euclidean ball centered at the origin of radius $R + 1$, for some $R > 0$. Then, by Fourier inversion and Fubini's theorem, for all $t > 0$, Moreover, for all $p \geq 2$, $\sup_{\xi \in \mathbb{R}^d} |\mathcal{F}(g)(\xi)|(1 + \|\xi\|^p) \leq C_d (R + 1)^d \bigl( \|g\|_\infty + \max_{1 \leq j \leq d} \|\partial_j^p(g)\|_\infty \bigr)$, for some $C_d > 0$ depending on the dimension $d$ only. Thus, for all $t > 0$ (A.17)
Step 3: Let $h \in C_c^\infty(\mathbb{R}^d) \cap H_{d+2}$. Let $\Psi_R$ be a compactly supported infinitely differentiable function on $\mathbb{R}^d$ whose support is contained in the Euclidean ball centered at the origin of radius $R + 1$, with values in $[0, 1]$ and such that $\Psi_R(x) = 1$, for all $x$ such that $\|x\| \leq R$. Then, for all $t > 0$, $|\mathbb{E}h(X) - \mathbb{E}h(X_t)| \leq |\mathbb{E}h(X)\Psi_R(X) - \mathbb{E}h(X_t)\Psi_R(X_t)| + |\mathbb{E}h(X)(1 - \Psi_R(X))| + |\mathbb{E}h(X_t)(1 - \Psi_R(X_t))|$.
using Lemma A.1. A similar bound holds true for $|\mathbb{E}h(X)(1 - \Psi_R(X))|$. Moreover, from (A.17), for some constant $C_d$ depending on $d$. Now, $\|h\Psi_R\|_\infty \leq 1$, and, by taking for $\Psi_R$ an appropriate tensorization of one-dimensional bump functions $\psi_R$, for some $D > 0$ independent of $R$ and $h$. Then, Choosing $R = e^{t/(d+1)}$, for all $t > 0$, it follows that for some $\widetilde{C}_d > 0$. Using (A.16) with $r = d + 2$, Using the inequality (A.10) concludes the proof of the theorem.
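The choice $R = e^{t/(d+1)}$ can be understood as balancing two competing terms. The computation below is a sketch under an assumed generic form of the bound: a truncated term of order $(R + 1)^d e^{-t}$ and tail terms of order $1/R$. The constants $A$ and $B$ are hypothetical placeholders, not the constants of the proof.

```latex
% Sketch: assumes a bound of the form A (R+1)^d e^{-t} + B / R, with R >= 1.
\begin{align*}
A\,(R+1)^{d}\,e^{-t} + \frac{B}{R}
 &\leq 2^{d} A\,R^{d}\,e^{-t} + \frac{B}{R}
   && (R \geq 1) \\
 &= 2^{d} A\, e^{\frac{d\,t}{d+1} - t} + B\, e^{-\frac{t}{d+1}}
   && \bigl(R = e^{t/(d+1)}\bigr) \\
 &= \bigl(2^{d} A + B\bigr)\, e^{-\frac{t}{d+1}} .
\end{align*}
```

With this choice both terms decay at the common rate $e^{-t/(d+1)}$. A larger $R$ would slow the first term, and a smaller $R$ would slow the second.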