Malliavin-Stein method for Variance-Gamma approximation on Wiener space

We combine Malliavin calculus with Stein's method to derive bounds for the Variance-Gamma approximation of functionals of isonormal Gaussian processes, in particular of random variables living inside a fixed Wiener chaos induced by such a process. The bounds are presented in terms of Malliavin operators and norms of contractions. We show that a sequence of distributions of random variables in the second Wiener chaos converges to a Variance-Gamma distribution if and only if their moments of order two to six converge to those of a Variance-Gamma distributed random variable (a six moment theorem). Moreover, simplified versions for Laplace or symmetrized Gamma distributions are presented. Multivariate extensions and a universality result for homogeneous sums are also considered.


Introduction
Let X = {X(h) : h ∈ H} be an isonormal Gaussian process, defined on a probability space (Ω, F, P), over some real separable Hilbert space H, fix an integer q ≥ 2 and let {F_n}_{n∈N} be a sequence of random variables belonging to the qth Wiener chaos induced by X (precise definitions follow in Section 2 below). Denote by H^{⊗q} and H^{⊙q} the qth tensor product and the qth symmetric tensor product of H, respectively, and let I_q be the isometry between H^{⊙q} (equipped with the modified norm √(q!) ‖·‖_{H^{⊗q}}) and the qth Wiener chaos of X. If H in particular is an L²-space of some σ-finite measure space without atoms, then a random variable I_q(h) with h ∈ H^{⊙q} has the form of a multiple Wiener-Itô integral of order q.
In recent years, many efforts have been made to characterize those sequences {F_n}_{n∈N} belonging to a Wiener chaos of fixed order which verify a central limit theorem in the sense that F_n converges in distribution, as n → ∞, to a centred Gaussian random variable N of unit variance (compare with the book [14] for an overview). The celebrated fourth moment theorem of Nualart and Peccati [21] asserts that if E[F_n²] = 1 for all n ≥ 1, then, as n → ∞, F_n converges in distribution to N if and only if E[F_n⁴] converges to 3, the fourth moment of the Gaussian random variable N. This can be seen as a drastic simplification of the classical method of moments, which ensures convergence in distribution of F_n to N provided that all moments of F_n converge to those of N.
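For random variables in the second Wiener chaos the fourth moment criterion is easy to illustrate numerically. The sketch below uses the standard fact that an element F = Σ_i λ_i (Z_i² − 1) of the second chaos, with {Z_i} i.i.d. standard Gaussian, has cumulants κ_j(F) = 2^{j−1} (j−1)! Σ_i λ_i^j for j ≥ 2; the eigenvalue sequence chosen here (a normalized chi-square sum) is our own illustrative example, not one from the text.

```python
from math import factorial

def second_chaos_cumulant(eigs, j):
    """kappa_j of F = sum_i eigs[i] * (Z_i**2 - 1), Z_i i.i.d. N(0,1).

    Standard second-chaos formula: kappa_j = 2**(j-1) * (j-1)! * sum_i eigs[i]**j.
    """
    return 2 ** (j - 1) * factorial(j - 1) * sum(l ** j for l in eigs)

# Normalized chi-square sums F_n = (2n)^{-1/2} * sum_{i<=n} (Z_i^2 - 1):
# kappa_2 stays 1 while kappa_4 = 12/n -> 0, so the fourth moment theorem
# yields the central limit theorem for this sequence.
for n in (10, 100, 1000):
    eigs = [1.0 / (2.0 * n) ** 0.5] * n
    k2 = second_chaos_cumulant(eigs, 2)
    k4 = second_chaos_cumulant(eigs, 4)
    print(n, round(k2, 6), round(k4, 6))
```

The vanishing of κ₄ alone (with κ₂ fixed) is exactly the content of the fourth moment theorem quoted above.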
Only a few years later, Nourdin and Peccati [12] combined Stein's method for normal approximation with the Malliavin calculus of variations on the Wiener space to prove explicit bounds on the total variation distance d_TV(F_n, N) := sup_{A∈B(R)} |P(F_n ∈ A) − P(N ∈ A)|, where the supremum runs over all Borel sets A ⊂ R (in fact, also other notions of distance have been considered in [12], but we restrict ourselves here to the total variation distance). They showed that for a sequence {F_n}_{n∈N} of random variables of the form F_n = I_q(h_n) with h_n ∈ H^{⊙q} and E[F_n²] = 1 it holds that

d_TV(F_n, N) ≤ 2 √((q − 1)/(3q)) √(κ_4(F_n)),    (1.1)

where, for integers j ≥ 1, we write κ_j(X) for the jth cumulant (semi-invariant) of a random variable X. More recently, in [15], exact rates of convergence for the total variation distance have been found. Namely, if F_n converges in distribution to N, as n → ∞, then there exist two constants 0 < c < C < ∞ (possibly depending on q and on {F_n}, but not on n), such that

c max(|κ_3(F_n)|, κ_4(F_n)) ≤ d_TV(F_n, N) ≤ C max(|κ_3(F_n)|, κ_4(F_n)),    (1.2)

where κ_3(F_n) and κ_4(F_n) are the third and the fourth cumulant of F_n, respectively. In other words, this means that the rate provided by (1.1) is suboptimal by a square-root factor. This, however, seems unavoidable using the Malliavin-Stein technique for normal approximation, which is based on the analysis of fourth moments, while the proof of (1.2) uses more refined arguments.
The main goal of this paper is to study non-central limit theorems (i.e., limit theorems with a non-Gaussian limiting distribution) for sequences {F_n}_{n∈N} belonging to a fixed Wiener chaos of order q ≥ 2, as above. A first step in this direction is the paper [11] by Nourdin and Peccati, in which conditions on the sequence {F_n}_{n∈N} have been derived under which convergence towards a centred Gamma distribution takes place. Moreover, in [12] rates of convergence for such Gamma approximations were considered, again by applying Stein's method. It is an interesting fact that if q is an odd integer, there is no sequence of chaotic random variables with bounded variances converging in distribution to a centred Gamma distribution. This is a consequence of the fact that a random variable I_q(h) with h ∈ H^{⊙q} and q odd has third moment equal to zero, while the third moment of a centred Gamma distribution is strictly positive. Beyond the normal and Gamma approximation results in [12], to the best of our knowledge there are no other quantitative limit theorems for chaotic sequences so far. Our paper is an attempt to fill this gap in the case of the broad class of so-called Variance-Gamma distributions. This is a three-parameter family of continuous probability distributions on the real line defined as variance-mean mixtures of Gaussian random variables with a Gamma mixing distribution. We emphasize that Variance-Gamma distributions are widely used in financial mathematics, for example. Particular examples of Variance-Gamma distributions are the Laplace distribution or, more generally, symmetrized Gamma distributions, but also the classical normal and Gamma distributions, which show up as limiting cases. It is interesting to see that in our set-up, no parity condition on q is necessary in general.
It is also worth mentioning the recent work [9] of Kusuoka and Tudor, where it has been shown that within the so-called Pearson family of probability distributions on the real line, only the normal and the Gamma distribution can appear as limit laws for a sequence {F_n}_{n∈N} of chaotic random variables. The Laplace distribution, the symmetrized Gamma distribution and, more generally, the Variance-Gamma distributions are not members of the Pearson family, and in this way our study goes beyond the theory developed in [9].
Besides the goal of identifying new limiting distributions for the sequence {F_n}_{n∈N} as considered above, another source of motivation for our paper comes from the area of free probability. In [6], Deya and Nourdin studied the convergence of a sequence of multiple stochastic integrals with respect to a free Brownian motion to what they call the tetilla law, which can be regarded as the commutator of the well-known Marchenko-Pastur distribution. Our aim here is to identify the non-free analogue of this distribution and to prove a related limit theorem for multiple stochastic integrals with respect to the classical Brownian motion. We will see that the Laplace distribution with parameter √2, which, as already mentioned above, is contained in the class of Variance-Gamma distributions, can be seen as such an analogue; see Remark 5.4 below for a more detailed description.
Let us describe our results and the structure of our paper in some more detail. In Section 2 we collect some background material related to isonormal Gaussian processes and the Malliavin calculus of variations on the Wiener space. We recall in particular the definitions of the so-called Γ-operators, which are central for our further investigations. Essential elements of Stein's method for Variance-Gamma distributions are reviewed in Section 3. In particular, we state bounds on the solution of the Stein equation and introduce some particular subclasses and limiting cases of Variance-Gamma distributions, which are of special interest. An abstract bound for the Wasserstein distance d_W(F, Y) between a (suitably regular) functional F of an isonormal Gaussian process and a Variance-Gamma distributed random variable Y in the spirit of the Malliavin-Stein method is derived in Section 4. We will see that the bound is expressed in terms of the Γ-operators mentioned above. This general bound is specialized in Section 5.1 to the case of elements living inside a fixed Wiener chaos of order q ≥ 2. We derive a sufficient analytic criterion in terms of contractions for such a sequence to converge in distribution to a Variance-Gamma distributed random variable. In this context, we also recover the fourth moment theorem discussed above together with a rate of convergence for the Wasserstein distance, which improves (1.1), but is still not optimal in view of (1.2). Our general bound is specialized in Section 5.2 to the case q = 2, which is of particular interest in view of the theory of quadratic forms, for example. We show that in this case the previously derived sufficient criterion for convergence turns out to be necessary and, moreover, equivalent to a simple moment condition involving moments of up to order six only.
For example, we show that a sequence {F_n}_{n∈N} of elements belonging to a second Wiener chaos converges in distribution to a random variable Y having a Laplace distribution with parameter b > 0 if and only if the moment condition (1.3), requiring that the moments of F_n of order two to six converge to those of Y, holds, as n → ∞. In this case, we also have the bound (1.4) on the Wasserstein distance, expressed in terms of the cumulants of F_n, where, recall, κ_j(F_n) stands for the jth cumulant of F_n and where C_1, C_2 > 0 are constants only depending on the parameter b. We like to emphasize the following interesting observation. Namely, although the third cumulant κ_3(F_n) shows up in the bound (1.4), it automatically vanishes in the limit, as n → ∞, under the moment condition (1.3). Our result can be seen as a six moment theorem for the convergence to a Laplace distribution on the second Wiener chaos. An analogue for general Variance-Gamma distributions is one of the main achievements of this paper. We mention that this is also closely connected to the work [17] (see also the erratum [18]) of Nourdin and Poly, who characterize convergence of sequences of random elements inside the second Wiener chaos associated with the ordinary (and the free) Brownian motion in terms of conditions on a sequence of consecutive moments. However, their results do not allow one to derive rates of convergence. In the final Section 5.3 we deal with a universality question for so-called homogeneous sums with respect to Variance-Gamma convergence as well as with some multivariate extensions of the previously derived results.
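The six moment phenomenon for the Laplace limit can be checked in an exact (non-asymptotic) toy case. A standard fact about the second chaos is that F = Σ_i λ_i (Z_i² − 1) has cumulants κ_j(F) = 2^{j−1} (j−1)! Σ_i λ_i^j; the particular eigenvalue configuration below, which realizes a Laplace law exactly, is our own illustrative choice.

```python
from math import factorial

def chaos2_cumulant(eigs, j):
    """kappa_j of F = sum_i eigs[i] * (Z_i**2 - 1) (standard second-chaos formula)."""
    return 2 ** (j - 1) * factorial(j - 1) * sum(l ** j for l in eigs)

# Eigenvalues (b/2, b/2, -b/2, -b/2) give a second-chaos element whose
# cumulants of order 2..6 coincide with those of a Laplace law with parameter b
# (kappa_2 = 2 b^2, kappa_4 = 12 b^4, kappa_6 = 240 b^6, odd cumulants zero).
b = 0.7  # arbitrary Laplace parameter for the check
eigs = [b / 2, b / 2, -b / 2, -b / 2]
laplace = {2: 2 * b ** 2, 3: 0.0, 4: 12 * b ** 4, 5: 0.0, 6: 240 * b ** 6}
for j in range(2, 7):
    assert abs(chaos2_cumulant(eigs, j) - laplace[j]) < 1e-12
```

In fact all higher cumulants match as well in this configuration, consistent with the classical representation of a Laplace random variable as Z₁Z₂ + Z₃Z₄ with independent standard Gaussians.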
The results of our paper complement those obtained in the recent study of Azmoodeh, Peccati and Poly [2], which has independently been conducted in parallel with our paper. They derive necessary and sufficient conditions under which a sequence {F n } n∈N as above converges to a limiting random variable, whose distribution is a finite linear combination of centred χ 2 -distributions. However, these limit theorems are not quantitative in the sense that they just state the convergence in distribution without giving upper bounds on the rate of convergence. On the other hand, the results are for sequences living inside a Wiener chaos of arbitrary order.

Elements of Gaussian analysis and Malliavin calculus
Isonormal Gaussian processes. Here we collect the essentials of Gaussian analysis and Malliavin calculus that are used in the paper, see the books [20] and [14] for further details.
For a real separable Hilbert space H and q ≥ 1, we write H^{⊗q} and H^{⊙q} to indicate, respectively, the qth tensor power and the qth symmetric tensor power of H, with the convention H^{⊗0} = H^{⊙0} = R. We denote by X = {X(h) : h ∈ H} an isonormal Gaussian process over H, i.e., X is a centred Gaussian family, defined on some probability space (Ω, F, P) and indexed by H, with covariance structure given by the relation E[X(h)X(g)] = ⟨h, g⟩_H. We assume that F = σ(X). For q ≥ 1, the symbol H_q denotes the qth Wiener chaos of X, defined as the closed linear subspace of L²(Ω, F, P) =: L²(Ω) generated by the family {H_q(X(h)) : h ∈ H, ‖h‖_H = 1}, where H_q(x) = (−1)^q e^{x²/2} (d^q/dx^q) e^{−x²/2} is the qth Hermite polynomial. For any q ≥ 1 the mapping I_q(h^{⊗q}) = H_q(X(h)) can be extended to a linear isometry between H^{⊙q} and the qth Wiener chaos H_q. For q = 0 we write I_0(c) = c, c ∈ R. When H = L²(A, A, µ) =: L²(µ) with µ a non-atomic σ-finite measure on the measurable space (A, A), then for every f ∈ H^{⊙q} = L²_s(µ^q) the random variable I_q(f) coincides with the q-fold multiple Wiener-Itô integral of f with respect to the centred Gaussian measure canonically generated by X, see [20, Section 1.1.2]. Here, L²_s(µ^q) stands for the subspace of L²(µ^q) composed of symmetric functions. It is well known that L²(Ω) can be decomposed into the infinite orthogonal sum of the spaces H_q. Hence every F ∈ L²(Ω) admits a Wiener-Itô chaotic expansion

F = E[F] + Σ_{q≥1} I_q(f_q),    (2.1)

with kernels f_q ∈ H^{⊙q}, q ≥ 1, uniquely determined by F. Let {e_n}_{n∈N} be a complete orthonormal system in H. For f ∈ H^{⊙p} and g ∈ H^{⊙q} and every r = 0, . . . , p ∧ q, the contraction of f and g of order r is the element of H^{⊗(p+q−2r)} defined by

f ⊗_r g = Σ_{i_1,...,i_r=1}^∞ ⟨f, e_{i_1} ⊗ · · · ⊗ e_{i_r}⟩_{H^{⊗r}} ⊗ ⟨g, e_{i_1} ⊗ · · · ⊗ e_{i_r}⟩_{H^{⊗r}}.
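The Hermite polynomials that span the Wiener chaoses satisfy the classical three-term recurrence H_{q+1}(x) = x H_q(x) − q H_{q−1}(x), with H_0 = 1 and H_1(x) = x, which is equivalent to the Rodrigues-type definition above. A minimal sketch (the function name is ours):

```python
def hermite(q, x):
    """Probabilists' Hermite polynomial H_q(x), computed via the recurrence
    H_{q+1}(x) = x * H_q(x) - q * H_{q-1}(x), with H_0 = 1 and H_1 = x."""
    if q == 0:
        return 1.0
    h_prev, h = 1.0, x  # H_0 and H_1
    for k in range(1, q):
        h_prev, h = h, x * h - k * h_prev
    return h

# First few polynomials: H_2(x) = x^2 - 1, H_3(x) = x^3 - 3x, H_4(x) = x^4 - 6x^2 + 3.
print(hermite(2, 2.0), hermite(3, 2.0), hermite(4, 2.0))
```

For instance, H_2(X(h)) = X(h)² − 1 for a unit vector h is the prototypical element of the second Wiener chaos.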
It is important to notice that the definition of f ⊗_r g does not depend on the particular choice of {e_n}_{n∈N}, and that f ⊗_r g is not necessarily symmetric. We denote its canonical symmetrization by f ⊗̃_r g ∈ H^{⊙(p+q−2r)}. Clearly, f ⊗_0 g = f ⊗ g and, for p = q, f ⊗_q g = ⟨f, g⟩_{H^{⊗q}}. Moreover, when H = L²(µ) and r = 1, . . . , p ∧ q, the contraction f ⊗_r g is the element of L²(µ^{p+q−2r}) given by

(f ⊗_r g)(x_1, . . . , x_{p+q−2r}) = ∫_{A^r} f(x_1, . . . , x_{p−r}, a_1, . . . , a_r) g(x_{p−r+1}, . . . , x_{p+q−2r}, a_1, . . . , a_r) dµ^r(a_1, . . . , a_r).
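For H = L²(µ) with µ the counting measure on a finite set and p = q = 2, kernels are matrices, the order-1 contraction is a matrix product over the shared argument, and the order-2 contraction is the Hilbert-Schmidt inner product. A toy sketch (the example matrices are ours):

```python
def contract1(f, g):
    """(f ⊗_1 g)(x, y) = sum_a f(x, a) * g(y, a): order-1 contraction of two
    kernels on a finite space with counting measure (a matrix product)."""
    n = len(f)
    return [[sum(f[x][a] * g[y][a] for a in range(n)) for y in range(n)]
            for x in range(n)]

def contract2(f, g):
    """f ⊗_2 g = <f, g>_{H⊗2} = sum_{x,a} f(x, a) * g(x, a) (a scalar)."""
    n = len(f)
    return sum(f[x][a] * g[x][a] for x in range(n) for a in range(n))

f = [[1, 2], [2, 0]]   # symmetric kernels of "order" 2
g = [[0, 1], [1, 3]]
print(contract1(f, g))  # generally NOT symmetric, as noted in the text
print(contract2(f, g))  # scalar inner product
```

Note that contract1(f, g) fails to be a symmetric matrix even though f and g are, which is exactly why the symmetrization f ⊗̃_r g is introduced.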
We will intensively use the isometry property and the product formula for multiple integrals, i.e. for elements of a fixed Wiener chaos. Namely, if f ∈ H^{⊙p} and g ∈ H^{⊙q}, then

E[I_p(f) I_q(g)] = δ_{p,q} p! ⟨f, g⟩_{H^{⊗p}}    (2.2)

and, for 1 ≤ q ≤ p,

I_p(f) I_q(g) = Σ_{r=0}^{q} r! (p choose r) (q choose r) I_{p+q−2r}(f ⊗̃_r g).    (2.3)

Malliavin operators. Let X be an isonormal Gaussian process and let S be the set of random variables of the form F = g(X(φ_1), . . . , X(φ_n)) with n ≥ 1, φ_1, . . . , φ_n ∈ H and g : R^n → R an infinitely differentiable function whose partial derivatives have polynomial growth. The Malliavin derivative of F with respect to X is the element of L²(Ω, H) defined as

DF = Σ_{i=1}^{n} ∂_i g(X(φ_1), . . . , X(φ_n)) φ_i.

Hence DX(h) = h for h ∈ H. By iteration, the mth derivative D^m F is an element of L²(Ω, H^{⊙m}) for every m ≥ 2. For m ≥ 1 and p ≥ 1, D^{m,p} denotes the closure of S with respect to the norm

‖F‖_{m,p}^p = E[|F|^p] + Σ_{i=1}^{m} E[‖D^i F‖_{H^{⊗i}}^p].

We use the notation D^∞ := ∩_{m≥1} ∩_{p≥1} D^{m,p}. Every finite linear combination of multiple Wiener-Itô integrals is an element of D^∞ and its law admits a density with respect to the Lebesgue measure on the real line. The Malliavin derivative satisfies the following chain rule. If ϕ : R^n → R is continuously differentiable with bounded partial derivatives and if F = (F_1, . . . , F_n) is a vector of elements of D^{1,2}, then ϕ(F) ∈ D^{1,2} and

Dϕ(F) = Σ_{i=1}^{n} ∂_i ϕ(F) DF_i.

If H = L²(A, A, µ) with µ σ-finite and non-atomic, then the derivative of a random variable F with chaotic expansion as in (2.1) can be identified with the element of L²(A × Ω) given by

D_x F = Σ_{q≥1} q I_{q−1}(f_q(·, x)), x ∈ A,

where f_q(·, x) stands for the function f_q with one of its arguments fixed to be x. The adjoint of the operator D is denoted by δ and called the divergence operator. A random element u ∈ L²(Ω, H) belongs to the domain of δ (Dom(δ)) if and only if it verifies |E[⟨DF, u⟩_H]| ≤ c_u ‖F‖_{L²(Ω)} for any F ∈ D^{1,2}, where c_u is a constant depending only on u. For u ∈ Dom(δ) the random variable δ(u) is defined by the integration-by-parts formula

E[F δ(u)] = E[⟨DF, u⟩_H],    (2.6)

which holds for every F ∈ D^{1,2}. The infinitesimal generator of the Ornstein-Uhlenbeck semigroup is given by L = Σ_{q=0}^∞ −q J_q, where J_q(F) := I_q(f_q) for every F as in (2.1). The domain of L is D^{2,2}.
A random variable F belongs to D^{2,2} if and only if F ∈ Dom(δD) (i.e., F ∈ D^{1,2} and DF ∈ Dom(δ)), and in this case

LF = −δDF.    (2.7)

For any F ∈ L²(Ω) we define L^{−1}F = Σ_{q=1}^∞ −(1/q) J_q(F). The operator L^{−1} is called the pseudo-inverse of L. For any F ∈ L²(Ω) one has that L^{−1}F ∈ Dom L = D^{2,2}, and

LL^{−1}F = F − E[F].    (2.8)

The following result is used frequently throughout this paper. (1) Suppose that F ∈ D^{1,2} is centred. Then, for every G ∈ D^{1,2}, E[FG] = E[⟨DG, −DL^{−1}F⟩_H]. (2) Suppose that F = I_q(f) with q ≥ 2 and f ∈ H^{⊙q}. Then for every s ≥ 0 we have P_s F = e^{−qs} F, where P_s = e^{sL} denotes the Ornstein-Uhlenbeck semigroup. Proof. (1) By (2.7) and (2.8) we observe that E[FG] = E[(LL^{−1}F) G] = E[−δ(DL^{−1}F) G]. The result is obtained by using the integration-by-parts formula (2.6).
Cumulants and Γ-operators. Let F be a real-valued random variable such that E[|F|^m] < ∞ for some integer m ≥ 1 and define φ_F(t) = E[e^{itF}], t ∈ R, to be the characteristic function of F. Then, for j = 1, . . . , m, the jth cumulant of F, denoted by κ_j(F), is given by

κ_j(F) = (−i)^j (d^j/dt^j) log φ_F(t) |_{t=0}.

There is a well-known relation between cumulants and moments. In this paper, such a relation is needed for cumulants and moments up to order six, and only if E[F] = 0. In this case, we have

κ_2(F) = E[F²], κ_3(F) = E[F³], κ_4(F) = E[F⁴] − 3E[F²]²,
κ_5(F) = E[F⁵] − 10E[F³]E[F²], κ_6(F) = E[F⁶] − 15E[F⁴]E[F²] − 10E[F³]² + 30E[F²]³.

The cumulants can be characterized in terms of Malliavin operators. For this, we need to introduce the so-called Γ-operators Γ_j, j ≥ 1. For F ∈ D^∞ we define Γ_1(F) = F and, for every j ≥ 2,

Γ_j(F) = ⟨DF, −DL^{−1}Γ_{j−1}(F)⟩_H.    (2.9)

Each Γ_j(F) is well-defined and an element of D^∞, since F is assumed to be in D^∞, see [13, Lemma 4.2]. According to [13, Theorem 4.3], there is an explicit relation between Γ_j(F) and the jth cumulant of F. Namely, if F ∈ D^∞, then F has finite moments of all orders and for each integer j ≥ 1 it holds that

κ_{j+1}(F) = j! E[Γ_j(F)].

The relation continues to hold under weaker assumptions on the regularity of F, see [13, Theorem 4.3].
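The moment-to-cumulant relations for a centred random variable are purely algebraic and easy to sanity-check; the sketch below encodes the classical identities up to order six and verifies them on the standard Gaussian (all cumulants beyond the second vanish) and on the Laplace distribution with parameter b = 1.

```python
def cumulants_from_moments(m):
    """Cumulants kappa_2..kappa_6 of a centred random variable from its
    moments m[j] = E[F^j] (classical moment-cumulant relations)."""
    k = {2: m[2], 3: m[3]}
    k[4] = m[4] - 3 * m[2] ** 2
    k[5] = m[5] - 10 * m[3] * m[2]
    k[6] = m[6] - 15 * m[4] * m[2] - 10 * m[3] ** 2 + 30 * m[2] ** 3
    return k

# Standard Gaussian: moments 1, 0, 3, 0, 15 for orders 2..6.
gauss = cumulants_from_moments({2: 1.0, 3: 0.0, 4: 3.0, 5: 0.0, 6: 15.0})
# Laplace with b = 1: even moments (2n)! b^(2n), i.e. 2, 24, 720.
lap = cumulants_from_moments({2: 2.0, 3: 0.0, 4: 24.0, 5: 0.0, 6: 720.0})
print(gauss)  # kappa_4 and kappa_6 vanish for the Gaussian
print(lap)    # kappa_4 = 12, kappa_6 = 240 for the Laplace law
```

These Laplace values reappear in the six moment theorem of Section 5.2.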
If F belongs to a fixed Wiener chaos (i.e., if F has the form of a multiple integral when H = L²(µ) as discussed above), there is a more explicit representation for Γ_j(F), see Formula (5.25) in [13]. To state it, let q ≥ 2 and F = I_q(f) with f ∈ H^{⊙q}. Then for any j ≥ 1, applying the product formula (2.3), one can represent Γ_j(F) as a linear combination of multiple integrals of iterated contractions of f, where the constants c_q(r_1, . . . , r_j) appearing in this representation are defined by an explicit recursion, see [13].

Elements of Stein's method
Wasserstein distance and the standard normal distribution. Stein's method is a set of techniques for evaluating distances between probability measures. In the present paper, we focus on the Wasserstein distance (L¹-distance). For any two real-valued random variables X and Y it is defined as

d_W(X, Y) := sup_{h∈L} |E[h(X)] − E[h(Y)]|,

where L denotes the class of Lipschitz functions h : R → R with Lipschitz constant at most 1. We will make use of the fact that the elements in L are exactly those absolutely continuous functions whose derivatives are a.e. bounded by 1 in absolute value. We notice that d_W(X_n, Y) → 0 as n → ∞ for a sequence of random variables {X_n}_{n∈N} implies convergence of X_n to Y in distribution (the converse is not necessarily true). A standard Gaussian random variable Z is characterized by the fact that for every absolutely continuous function f with E[|f′(Z)|] < ∞,

E[f′(Z)] = E[Z f(Z)].    (3.1)

This together with the definition of the Wasserstein distance is the motivation to study the Stein equation

f′(x) − x f(x) = h(x) − E[h(Z)], h ∈ L.

With (3.1), we obtain that for every h ∈ H such that ‖h‖_H = 1 we have, for smooth functions f, that E[f′(X(h))] = E[X(h) f(X(h))]; this identity is also a particular case of the consequence of Lemma 2.1(1).

Symmetric Gamma distributions. The main goal of our paper is to consider probabilistic approximations by Variance-Gamma random variables. To motivate the right choice of a Stein equation, let us first consider the case of the Laplace distribution or, more generally, the symmetrized Gamma distribution. The Lebesgue density of a Laplace distribution with parameter b > 0 is given by

p_b(x) = (2b)^{−1} e^{−|x|/b}, x ∈ R,    (3.3)

while the Lebesgue density of a symmetrized Gamma distribution with parameters λ > 0 and r > 0 equals

p_{λ,r}(x) = (λ^{r+1/2} / (√π Γ(r) 2^{r−1/2})) |x|^{r−1/2} K_{r−1/2}(λ|x|), x ∈ R,

where K_ν denotes a modified Bessel function of the second kind. In what follows we shall indicate the distribution with density p_{λ,r} by Γ_s(λ, r), and by Γ(λ, r) we denote the non-symmetric (i.e., classical) Gamma distribution. Note that the choice r = 1 and λ = 1/b leads to the Laplace distribution with density as at (3.3). A first-order Stein operator for a random variable with density p_b can be obtained by the so-called density approach, see [27].
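For two empirical distributions of equal size on the real line, the Wasserstein distance reduces to the mean absolute difference of the order statistics; this standard fact (not used in the paper itself, and sketched here only as an illustration) makes d_W easy to estimate from samples.

```python
def wasserstein1(xs, ys):
    """Empirical 1-Wasserstein (L^1) distance between two samples of equal
    size: in dimension one it is the mean absolute difference of the sorted
    samples (order statistics)."""
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

# Translating a sample by a constant c moves it by exactly c in d_W.
print(wasserstein1([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))
```

In particular, d_W between a sample and itself is 0, and a deterministic shift by c has distance exactly c, matching the Lipschitz test-function definition above.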
The resulting identity holds for all absolutely continuous f for which the expectation exists; however, the first-order operator obtained in this way is not well adapted to our purposes. Summarizing, we obtain that if Y has a Laplace distribution with parameter b, and f and f′ are absolutely continuous functions such that E[f(Y)] exists, then a second-order characterizing identity holds, see [25, Lemma 1]. A major disadvantage of the first-order characterization is that the machinery of Malliavin calculus usually enters through the first derivative only. Lemma 3.1 suggests the Stein equation (3.7) for the Γ_s(λ, r)-distribution. The following lemma collects bounds on the solution f_h of (3.7) and its first and second derivatives, see [8, Theorem 3.6] for a proof. In what follows, we denote by g^{(j)} the jth derivative of a function g : R → R.
If h ∈ L, r ∈ Z_+ and λ > 0, then the solution f_h of the Stein equation (3.7) and its derivatives up to order two satisfy explicit uniform bounds in terms of λ and r. The Stein-type characterization (3.6) for the Γ_s(λ, r)-distribution also allows a neat computation of its moments and cumulants. We state the result here only for the moments and cumulants of order 2, 4 and 6, as they will play a major role later in this paper.
Lemma 3.3. Let Y have the Γ_s(λ, r)-distribution. Then all odd moments and cumulants of Y are identically zero, while E[Y²] = 2r/λ² and E[Y⁴] = 12r(r+1)/λ⁴; with the choice f(x) = x⁵ we obtain from (3.6) that E[Y⁶] = E[Y⁴](10r + 20)/λ² = 120r(r+1)(r+2)/λ⁶. Moreover, κ_2(Y) = 2r/λ², κ_4(Y) = 12r/λ⁴ and κ_6(Y) = 240r/λ⁶. The formulas for the cumulants follow from the relation between moments and cumulants stated in Section 2.
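The even cumulants of the symmetrized Gamma distribution can be recovered directly from those of the classical Gamma distribution, since Γ_s(λ, r) is the law of X₁ − X₂ with X₁, X₂ independent Γ(λ, r)-distributed: odd cumulants cancel and even ones double. A minimal sketch (using the classical Gamma cumulants κ_j = r (j−1)!/λ^j, a standard fact):

```python
from math import factorial

def sym_gamma_cumulant(lam, r, j):
    """kappa_j of Gamma_s(lam, r), i.e. of X1 - X2 with X1, X2 independent
    Gamma(shape r, rate lam): odd cumulants cancel, even ones double,
    giving kappa_j = 2 * r * (j-1)! / lam**j for even j."""
    if j % 2 == 1:
        return 0.0
    return 2.0 * r * factorial(j - 1) / lam ** j

# Laplace with parameter b corresponds to lam = 1/b and r = 1:
# kappa_2 = 2 b^2, kappa_4 = 12 b^4, kappa_6 = 240 b^6.
print(sym_gamma_cumulant(1.0, 1.0, 2))  # 2.0
print(sym_gamma_cumulant(1.0, 1.0, 4))  # 12.0
print(sym_gamma_cumulant(1.0, 1.0, 6))  # 240.0
```

These values reproduce the cumulant formulas of the lemma above, and the moment formulas follow via the moment-cumulant relations of Section 2.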
Variance-Gamma distributions. A random variable Y is said to have a Variance-Gamma distribution with parameters r > 0, θ ∈ R, σ > 0 and µ ∈ R if and only if its Lebesgue density is p(x; r, θ, σ, µ). Here, K_ν(x) denotes a modified Bessel function of the second kind (see [8, Appendix B] and references there). In what follows we write VG(r, θ, σ, µ) for such a Variance-Gamma distribution. It is known that E[Y] = µ + rθ and Var[Y] = r(σ² + 2θ²). We will mostly consider the centred case E[Y] = 0, i.e. µ = −rθ, and write VG_c(r, θ, σ) for the corresponding distribution. Note that the symmetrized Gamma distribution considered in the previous paragraph corresponds to VG_c(2r, 0, 1/λ). Variance-Gamma distributions are widely used in finance modelling and contain the normal, Gamma and Laplace distributions as special or limiting cases. In particular, for certain parameter values, the Variance-Gamma distribution has semi-heavy tails that decay more slowly than those of the normal distribution, see [7, 8].
The parameter r is known to be the scale parameter: as r increases, the distribution becomes more rounded around its peak value. The parameter σ is called the tail parameter: as σ decreases, the tails drop off more steeply. Finally, the parameter θ is the asymmetry parameter; for non-zero θ the distribution becomes skewed, that is, asymmetric, see Figure 1. In [8] a Stein equation for the VG(r, θ, σ, µ)-distribution was established, from which the Stein equation (3.8) for the VG_c(r, θ, σ)-distribution follows. The next lemma presents bounds for the solution f_h of (3.8) and its first and second derivatives. It is interesting to note that, in contrast to the case θ = 0, uniform bounds are much harder to obtain if θ ≠ 0. In a first step these bounds can be expressed in terms of expressions involving modified Bessel functions, see Lemma 3.17 in [7]. The following lemma follows from this representation.
If h ∈ L and r > 0, θ ∈ R, σ > 0, then the solution f_h of the Stein equation (3.8) and its derivatives up to order two are bounded; that is, there exists a constant C = C(r, θ, σ) bounding f_h, f_h^{(1)} and f_h^{(2)} uniformly. Remark 3.5. In contrast to the symmetric case discussed above, if θ ≠ 0, it seems rather difficult to express the constant C appearing in Lemma 3.4 explicitly in terms of the parameters r, θ and σ.
With the same proof as for Lemma 3.3 we can compute the first six moments or cumulants of a centred Variance-Gamma random variable, which will be needed later.
Moreover, the first six cumulants of Y are κ_1(Y) = 0,

κ_2(Y) = r(σ² + 2θ²), κ_3(Y) = 2rθ(3σ² + 4θ²), κ_4(Y) = 6r(σ⁴ + 8σ²θ² + 8θ⁴),
κ_5(Y) = 24rθ(5σ⁴ + 20σ²θ² + 16θ⁴), κ_6(Y) = 120r(σ⁶ + 18σ⁴θ² + 48σ²θ⁴ + 32θ⁶).

Let us collect some distributions which are of particular interest and belong to the class of Variance-Gamma distributions, see [8, Proposition 1.2]: • A VG_c(2r, 0, 1/λ)-distributed random variable has the symmetrized Gamma distribution Γ_s(λ, r); in particular, VG_c(2, 0, b) corresponds to a Laplace distribution with parameter b.
• Suppose that (X, Y) has a bivariate Gamma distribution with marginals X ∼ Γ(λ_1, r) and Y ∼ Γ(λ_2, r). Then the difference X − Y follows a VG_c(2r, θ, σ)-distribution whose parameters θ and σ are explicit functions of λ_1, λ_2 and the correlation of (X, Y), see [8, Proposition 1.2].
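The variance-mean mixture structure makes Variance-Gamma laws easy to simulate. The sketch below is a Monte Carlo sanity check under the mixture representation we assume here, namely Y = θ(V − r) + σ√V Z with V ∼ Gamma(shape r/2, scale 2) and Z standard Gaussian (this particular parameterization is our assumption; it is consistent with the variance r(σ² + 2θ²) used throughout the text).

```python
import random

def sample_vg_centred(r, theta, sigma, rng):
    """One draw from the centred Variance-Gamma law VG_c(r, theta, sigma),
    under the variance-mean mixture representation ASSUMED here:
    Y = theta*(V - r) + sigma*sqrt(V)*Z, V ~ Gamma(shape r/2, scale 2)."""
    v = rng.gammavariate(r / 2.0, 2.0)   # E[V] = r, Var[V] = 2r
    return theta * (v - r) + sigma * v ** 0.5 * rng.gauss(0.0, 1.0)

rng = random.Random(1)
r, theta, sigma = 2.0, 0.5, 1.0
ys = [sample_vg_centred(r, theta, sigma, rng) for _ in range(100_000)]
mean = sum(ys) / len(ys)
var = sum(y * y for y in ys) / len(ys) - mean ** 2
# Targets: mean 0 and variance r*(sigma^2 + 2*theta^2) = 3.
print(round(mean, 3), round(var, 3))
```

With θ = 0 and σ = 1/λ this sampler produces (under the same assumption) the symmetrized Gamma law Γ_s(λ, r) of the first bullet point.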

A Malliavin-Stein bound for the Wasserstein distance
Our first result provides explicit bounds for the V G c (r, θ, σ)-approximation of general functionals of an isonormal Gaussian process X. Recall the definition of the Γ-operators Γ j (F ) given in (2.9).
Since κ_2(Y) = r(σ² + 2θ²), the second term in our bound (4.1) measures the distance between the variances of Y and F. The interpretation of the L²-distance between σ²(F + rθ) + 2θΓ_2(F) and the Γ_3(F)-term on the right-hand side of (4.2) is less obvious and will be discussed for F belonging to the qth Wiener chaos in Section 5 below.
We will now derive two consequences of Theorem 4.1. The first one deals with two special Variance-Gamma distributions, the symmetric Gamma distribution Γ_s(λ, r) and the distribution of the difference X − Y of two random variables X and Y having a Γ(λ_1, r)- and a Γ(λ_2, r)-distribution, respectively. (a) Let Y be a VG_c(2r, 0, 1/λ) = Γ_s(λ, r)-distributed random variable for some λ, r > 0. Then the bound of Theorem 4.1 specializes accordingly, with constants C_1, C_2 > 0 only depending on λ and r.
(b) Fix r, λ_1, λ_2 > 0 and let Z denote a real-valued random variable with the VG_c(2r, θ, σ)-distribution of the difference considered above. Then the bound of Theorem 4.1 specializes accordingly, with constants C_1, C_2 > 0 depending only on r, λ_1 and λ_2. Our next result deals with two limiting cases of Variance-Gamma distributions, namely the normal and the (non-symmetrized) Gamma distribution. As discussed in the introduction, this has previously been considered in [12]. More precisely, Theorems 3.1 and 3.11 there show that if F ∈ D^{1,2} is a centred functional of an isonormal Gaussian process and if Z ∼ N(0, σ²) for some σ² > 0 and Y ∼ Γ(λ, r) for some λ, r > 0, then bounds on d_W(F, Z) and d_W(F, Y) hold with a constant C > 0 only depending on r and λ. In our context, we can derive another bound for d_W(F, Z) and d_W(F, Y) in terms of the Gamma-operator Γ_3. We will see below that in the case of multiple stochastic integrals this is closely related to some of the results recently derived in [1]. (a) Let Z denote a centred Gaussian random variable with variance σ² > 0. Then there exist constants C_1, C_2 > 0 only depending on σ such that the corresponding bound holds. (b) Let Y be a Γ(λ, r)-distributed random variable with parameters λ > 0 and r > 0. Then there exist constants C_1, C_2 > 0 depending only on r and λ such that the corresponding bound holds. Proof. We apply Theorem 4.1 and use the fact that lim_{r→∞} VG_c(r, 0, σ/√r) = N(0, σ²), together with an analogous limiting relation for the Gamma distribution. Note that the corresponding bound in [14] reads (V[Γ_2(F)])^{1/2}. As explained earlier, the appearance of Γ_3 in our bounds comes from the fact that we consider the much larger class of Variance-Gamma distributions, which is based on a second-order differential equation. This also implies that the stronger condition F ∈ D^{2,4} is needed.
Explicit bounds on a fixed Wiener chaos

The general case q ≥ 2

Fix q ≥ 2 and consider F_n = I_q(f_n), n ≥ 1, a sequence of random variables belonging to the qth chaos of an isonormal Gaussian process X, and assume that E[F_n²] = q! ‖f_n‖²_{H⊗q} → r(σ² + 2θ²) with r > 0, σ > 0 and θ ∈ R. The sequence {F_n}_{n∈N} converges in distribution to Y ∼ VG_c(r, θ, σ) if and only if for every j ≥ 3, E[F_n^j] → E[Y^j], as n → ∞, or equivalently if κ_j(F_n) → κ_j(Y) for every j ≥ 3, as n → ∞. This follows from the classical method of moments or cumulants, since the law VG_c(r, θ, σ) is determined by its moments (compare with Proposition 5.2.2 in [14]).
One of our main results is that the method of moments and cumulants for VG_c(r, θ, σ)-approximation boils down to a sixth-moment method inside the second Wiener chaos, see Section 5.2. For general q ≥ 2 the next result provides an expression for the first term of the bound in Theorem 4.1 in terms of contraction operators. Note that if q ≥ 3 is an odd integer and θ ≠ 0, then there is no sequence {F_n}_{n∈N} = {I_q(f_n)}_{n∈N} with bounded variances such that F_n converges in distribution to a random variable Y with a VG_c(r, θ, σ)-distribution, as n → ∞. This is a consequence of the fact that an element of a Wiener chaos of odd order has third moment equal to zero, while the third moment of Y is non-zero for θ ≠ 0. Theorem 5.1. Let q ≥ 2 be an even integer and let F = I_q(f), where f ∈ H^{⊙q}. Then the first term of the bound of Theorem 4.1 admits an explicit estimate in terms of norms of contractions of f. In case q = 2 the last two sums appearing in this estimate are empty and have to be interpreted as 0.
Let us have a closer look at the first summand appearing in the expression provided by Theorem 5.1. Using Lemma 3.6 we see that, under the moment assumption, this summand converges to zero, as n → ∞. Note moreover that the other contraction operators do not depend on r; the dependence on r is completely encoded in the moment assumption. Next we consider separately the particularly attractive case θ = 0, corresponding to the symmetrized Gamma distributions. As explained earlier, in this case no restriction on the parity of q is necessary.
Theorem 5.2. Let q ≥ 2 be an integer and let F = I_q(f) with f ∈ H^{⊙q}. Then for q even we obtain a contraction bound as in Theorem 5.1, whereas for q odd we set p = q − 1 and obtain an analogous bound. Proof. For q even the result follows directly from Theorem 5.1. The case when q ≥ 3 is odd is similar. Here, we put p := q − 1 and denote by C_{2k+1} the set of those integers r ∈ {1, . . . , q − 1} for which the double contraction (f ⊗_r f) ⊗_{3q/2+1−k−r} f is well defined. We skip the details. Remark 5.4. Comparing the contraction conditions implied by Theorems 4.1 and 5.2 for the symmetric Gamma distribution with those of Theorem 1.1 (ii) in [6] for the tetilla law arising in free probability, we see that our condition in the case of Γ_s(1/√2, 1) coincides almost exactly with that in [6]. The only difference are the coefficients c_q(r, q − r), which arise as a consequence of the product formula (2.3). In contrast, these coefficients are all equal to 1 in the free set-up (compare with Equation (2.6) in [6], for example). In this way, we may identify the Laplace distribution with parameter √2 as the non-free analogue of the tetilla law.
A particularly interesting question is whether the bounds derived in Theorems 5.1 and 5.2 are tight with respect to convergence in distribution towards a Variance-Gamma distribution, in the sense that these bounds converge to zero if and only if a normalized sequence {F_n}_{n∈N}, living inside a fixed Wiener chaos, converges in distribution to a VG_c(r, θ, σ)-distributed random variable. Fix q ≥ 2 and consider a sequence {F_n : n ≥ 1} such that F_n = I_q(f_n), n ≥ 1, where f_n ∈ H^{⊙q}, and suppose that E[F_n²] = q! ‖f_n‖²_{H⊗q} → 2r/λ². Moreover, denote by Y a random variable with Γ_s(λ, r)-distribution. We conjecture that for the symmetric Variance-Gamma distributions (corresponding to θ = 0) the sequence {F_n}_{n∈N} (i) converges in distribution to Y if and only if (ii) the moment condition E[F_n^j] → E[Y^j] holds for j = 3, 4, 5, 6, which in turn is equivalent to (iii) the contraction conditions of Theorem 5.2, where r = 1, . . . , q − 1 and r′ is such that r + 2r′ ≤ 2q and r + r′ = q. Our conjecture for θ ≠ 0 reads similarly. Namely, we conjecture that a sequence {F_n}_{n∈N} with F_n = I_q(f_n), n ≥ 1, where f_n ∈ H^{⊙q} and E[F_n²] = q! ‖f_n‖²_{H⊗q} → r(σ² + 2θ²), (i) converges in distribution to a VG_c(r, θ, σ)-distributed random variable if and only if (ii) the moment condition E[F_n^j] → E[Y^j] is satisfied for j = 3, 4, 5, 6, or if and only if (iii) the contraction conditions ‖(f_n ⊗_l f_n) ⊗̃_{3q/2−k−l} f_n‖ → 0 hold for every l = 1, . . . , 3q/2 − k − 1 and k = q, . . . , 3q/2 − 2. The technically sophisticated step in both situations is to show that (ii) implies (iii). The main difficulty is to deal with the involved combinatorial structure transmitted from the product formula to the collection of double contractions. In Section 5.2 below, we will obtain a positive answer to both of the above stated conjectures in the particular case q = 2, while the general case remains open, because for general q we were not able to express (or to estimate from above) the bounds of Theorems 5.1 or 5.2 in terms of the first six moments of the involved chaotic random variables.
The following discussion concerns the symmetric Gamma approximation of a finite sum of Wiener chaoses. Without loss of generality we discuss a sum of two Wiener chaoses. Consider two integers 2 ≤ q_1 < q_2 and a sequence of the form Z_n = I_{q_1}(f_n^1) + I_{q_2}(f_n^2), where f_n^i ∈ H^{⊙q_i}. In order to bound the second summand on the right-hand side of (4.1) we have to compute E[Γ_2(Z_n)], which can be done by means of the product formula (2.3). Hence, to ensure convergence of E[Γ_2(Z_n)], it is not necessary that each of the summands converges individually. Without loss of generality, we can assume that X is an isonormal process over a Hilbert space of the type L²(A, A, µ). For every b ∈ A the relevant partial Malliavin derivatives are immediately checked, and therefore, by the product formula, Γ_3(Z_n) can be written as a sum of terms T(q_i, q_j, q_k, r, s, f_n^1, f_n^2). Now, we consider the two summands with i = j = k = 1 and i = j = k = 2 and choose s = q_i − r.
We observe that these summands can be represented as Σ_{r=1}^{q_l − 1} c_{q_l}(r, q_l − r) I_{q_l}( f^l_n ⊗_{q_l − r} (f^l_n ⊗_r f^l_n) ) for l = 1, 2. Summarizing, we have to deal with these terms and, whenever i = j = k, with the choices r = q_i and s = q_i − r.
By using the inequality (a_1 + a_2)^2 ≤ 2(a_1^2 + a_2^2) and the isometry property (2.2) we obtain: Proposition 5.5. Consider two integers 2 ≤ q_1 < q_2 and a sequence of the form Z_n = I_{q_1}(f^1_n) + I_{q_2}(f^2_n), where f^i_n ∈ H^{⊙q_i}. Then a corresponding bound holds for every λ > 0. Using Proposition 5.5, it is in principle also possible to deduce bounds for the Variance-Gamma approximation of random variables living inside an infinite sum of Wiener chaoses.
We finally turn in this section to the case of normal approximation and recover the celebrated fourth moment theorem. Moreover, our more general framework yields the following result, which gives a better rate of convergence (namely exponent 3/2 instead of 1) than, for example, [14, Theorem 5.2.7]. However, our rate is still not optimal, as shown by the main result in [15].
Proposition 5.6. Fix q ≥ 2 and consider a sequence {F_n}_{n∈N} such that the following assumptions hold. Then the sequence {F_n}_{n∈N} satisfies a central limit theorem and we have the following bound for the Wasserstein distance, where C > 0 is a constant depending only on σ and where Z ∼ N(0, σ^2). Moreover, we have the stated moment bound. Proof. That the sequence {F_n}_{n∈N} satisfies a central limit theorem under our assumptions is ensured by Corollary 4.4 (a). Moreover, using the multiplication formula (2.3), we obtain an expression for E[Γ_3(F_n)^2]. Hence a sufficient condition for a central limit theorem to hold is that, for every r = 1, . . . , q − 1 and every r' such that r + 2r' ≤ 2q, the corresponding double contractions vanish as n → ∞. Now, the double contractions are dominated by the usual (single) contractions, see [3, Equation (4.10)]. This proves the first part of the result. As shown above, the sequence {F_n}_{n∈N} satisfies a central limit theorem provided that E[Γ_3(F_n)^2] → 0. By the fourth moment theorem [14, Theorem 5.2.7], the central limit theorem for {F_n}_{n∈N} is equivalent to the condition that κ_4(F_n) → 0. This proves the second part of the result.
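The fourth moment criterion can be illustrated numerically in the second chaos, again using the standard cumulant formula κ_p(I_2(f)) = 2^{p−1}(p−1)! Σ_j λ_j^p as an assumed ingredient. Taking n equal eigenvalues 1/√(2n) (an illustrative choice) gives E[F_n^2] = 1 for every n, κ_4(F_n) = 12/n, and hence E[F_n^4] = κ_4(F_n) + 3κ_2(F_n)^2 = 3 + 12/n → 3, the fourth moment of the standard Gaussian:

```python
import math

def kappa(eigs, p):
    # second-chaos cumulant formula: kappa_p = 2^(p-1) * (p-1)! * sum lambda^p
    return 2 ** (p - 1) * math.factorial(p - 1) * sum(l ** p for l in eigs)

def fourth_moment(eigs):
    # E[F^4] = kappa_4 + 3 * kappa_2^2 for a centred random variable
    return kappa(eigs, 4) + 3 * kappa(eigs, 2) ** 2

for n in (1, 10, 100, 1000):
    eigs = [1 / math.sqrt(2 * n)] * n   # E[F_n^2] = kappa_2 = 1 for every n
    print(n, fourth_moment(eigs))       # equals 3 + 12/n, decreasing to 3
```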

The case of the second Wiener chaos
The goal of this subsection is to confirm the two conjectures spelled out in the previous subsection for elements of the second Wiener chaos (i.e., for double stochastic integrals). That is, we consider a sequence of elements of the second Wiener chaos of an isonormal process X, i.e., a sequence of random variables of the type F_n = I_2(f_n) with f_n ∈ H^{⊙2} for each n ∈ N. For symmetric Variance-Gamma distributions (θ = 0) our result reads as follows.
Theorem 5.7. Let Y be a Γ_s(λ, r)-distributed random variable with r, λ > 0 and suppose that E[F_n^2] = 2r/λ^2. Then, as n → ∞, the following assertions are equivalent: In the general asymmetric case θ ≠ 0, stronger moment or contraction conditions are necessary in order to ensure convergence in distribution of F_n to a Variance-Gamma distributed random variable.
Theorem 5.8. Let Y be a VG_c(r, θ, σ)-distributed random variable with r, σ > 0 and θ ∈ R, and suppose that E[F_n^2] = r(σ^2 + 2θ^2). Then, as n → ∞, the following assertions are equivalent: Before entering the proofs of Theorems 5.7 and 5.8, we collect some general facts about random variables of the type F = I_2(f), f ∈ H^{⊙2}, belonging to the second Wiener chaos H_2, and introduce some notation. First recall that the law of F is determined by its moments or, equivalently, by its cumulants. The latter are given by κ_p(F) = 2^{p−1}(p−1)! ⟨f ⊗_1^{(p−1)} f, f⟩_{H^{⊗2}}, thanks to relation (2.10). Here, the contraction powers {f ⊗_1^{(p)} f} are defined recursively by f ⊗_1^{(1)} f = f and, for p ≥ 2, f ⊗_1^{(p)} f = f ⊗_1 (f ⊗_1^{(p−1)} f). Let F_n = I_2(f_n) with f_n ∈ H^{⊙2}, n ≥ 1. Theorem 4.1 for q = 2 leads to (5.10). For θ = 0, σ = 1/λ, with (5.3) we obtain the corresponding simplification. We represent the left-hand side of (5.10) in terms of moments and cumulants of F_n in order to check that if the six moment condition on F_n (condition (b) in Theorem 5.8) is satisfied, then condition (c) for the contractions follows. The left-hand side of (5.10) consists of six terms. Identity (5.8) gives the first term, and by (5.9) we obtain κ_6(F_n) = 2^5 · 5! ⟨f_n ⊗_1^{(5)} f_n, f_n⟩. Next, with (5.2) we get the second term, using that κ_4(F_n) = 48 ‖f_n ⊗_1 f_n‖^2_{H^{⊗2}}, see (5.9). For the third term we have E[σ^4 (F_n + rθ)^2] = σ^4 E[F_n^2] + r^2 θ^2 σ^4. Applying part (1) of Lemma 2.1 with s = 1 we obtain the fourth term, and part (2) yields E[I_q(f) Γ_3(I_q(f))] = (1/6) κ_4(I_q(f)), so that the fifth term can be represented accordingly. Finally, we have to compute −4θ E[Γ_2(F_n) Γ_3(F_n)], which follows from (5.1), (5.2) and (5.9). Summarizing, the left-hand side of (5.10) is equal to (5.12). Using now the moment assumption (b) together with Lemma 3.6, we see that the term in (5.12) converges to zero as n → ∞, and hence the contraction condition (c) follows, see (5.10). This completes the proof.
Proof of Theorem 5.7. As in the asymmetric case, it suffices to show that (b) implies (c). In our case θ = 0, and putting σ = 1/λ we obtain the expression (κ_3(F_n))^2 + (1/λ^4) κ_2(F_n) from (5.10) and (5.12). Hence with (5.11) and (5.13) we get the corresponding identity. Into the last identity we plug the well-known relationships between moments and cumulants stated in Section 2; then a simple calculation completes the argument. Suppose now that E[F_n^2] = q! ‖f_n‖^2_{H^{⊗q}} → r(σ^2 + 2θ^2). Here we list the different forms of conditions on contraction operators which are equivalent to convergence in distribution to a member of VG_c(r, θ, σ).
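The "well-known relationships between moments and cumulants" invoked above are the classical recursion m_p = Σ_{j=1}^{p} C(p−1, j−1) κ_j m_{p−j}, with m_0 = 1. A minimal sketch, with the standard normal (cumulants 0, 1, 0, 0, ...) as a check case:

```python
import math

def moments_from_cumulants(kappas):
    # m_p = sum_{j=1}^{p} C(p-1, j-1) * kappa_j * m_{p-j},  m_0 = 1
    m = [1.0]
    for p in range(1, len(kappas) + 1):
        m.append(sum(math.comb(p - 1, j - 1) * kappas[j - 1] * m[p - j]
                     for j in range(1, p + 1)))
    return m[1:]

# standard normal: cumulants (0, 1, 0, 0, 0, 0) give moments 0, 1, 0, 3, 0, 15
print(moments_from_cumulants([0, 1, 0, 0, 0, 0]))
```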
(e) An example of case (d) is the convergence to VG_c(1, ρ, √(1 − ρ^2)), which can be interpreted as the distribution of the product of two correlated standard normally distributed random variables X and Y with correlation ρ. We obtain that F_n converges in distribution to XY. When ρ → 0, case (c) appears with λ = r = 1.
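The identification of XY with a VG_c(1, ρ, √(1 − ρ^2)) variable can be checked on the level of cumulants. Writing XY = a Z_1^2 − b Z_2^2 with a = (1+ρ)/2, b = (1−ρ)/2 and independent standard normals Z_1, Z_2 places XY, after centring, in the second chaos with eigenvalues {a, −b}, while the VG cumulants can be read off from the cumulant generating function K(t) = −(r/2) log(1 − 2θt − σ^2 t^2) (one common parametrisation, assumed here; it reproduces κ_1 = rθ and κ_2 = r(σ^2 + 2θ^2)). A sketch comparing the two for ρ = 0.5:

```python
import math

rho = 0.5
a, b = (1 + rho) / 2, (1 - rho) / 2   # XY = a*Z1^2 - b*Z2^2

def chaos_cumulant(p):
    # second-chaos cumulants from the eigenvalues {a, -b}; kappa_1 is the mean
    if p == 1:
        return a - b
    return 2 ** (p - 1) * math.factorial(p - 1) * (a ** p + (-b) ** p)

def vg_cumulants(r, theta, sigma, pmax):
    # expand K(t) = -(r/2)*log(1 - 2*theta*t - sigma^2*t^2) as a power series;
    # then kappa_p = p! * [t^p] K(t)
    u = [0.0, 2 * theta, sigma ** 2]           # coefficients of t^0, t^1, t^2
    K = [0.0] * (pmax + 1)
    upow = [1.0]                               # running powers of u
    for k in range(1, pmax + 1):
        new = [0.0] * (pmax + 1)
        for i, ci in enumerate(upow):
            for j, cj in enumerate(u):
                if i + j <= pmax:
                    new[i + j] += ci * cj
        upow = new
        for deg in range(pmax + 1):
            K[deg] += (r / 2) * upow[deg] / k  # -(r/2)log(1-u) = (r/2)sum u^k/k
    return [math.factorial(p) * K[p] for p in range(1, pmax + 1)]

kv = vg_cumulants(1, rho, math.sqrt(1 - rho ** 2), 6)
for p in range(1, 7):
    print(p, chaos_cumulant(p), kv[p - 1])    # the two columns agree
```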
After having characterized convergence in distribution of an element belonging to the second Wiener chaos H_2, we now turn to quantitative bounds for the Wasserstein distance. In contrast to the bounds that follow from the results presented in Sections 4 and 5.1, we seek upper bounds in terms of moments. In view of Theorems 5.7 and 5.8 we can expect these bounds to involve only moments up to order six. Our next theorem presents bounds in terms of the first six cumulants, as these have a more compact form.
It is known (see [14, Section 2.7.4]) that the series Σ_{j≥1} λ^p_{f,j} converges for all p ≥ 2, and that f admits the expansion (in H^{⊙2}) (5.14) f = Σ_{j≥1} λ_{f,j} e_{f,j} ⊗ e_{f,j}.
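The spectral expansion (5.14) links the contraction calculus to traces: Tr(A_f^p) = Σ_{j≥1} λ^p_{f,j}. This relation can be checked mechanically in a finite-rank model, where a symmetric matrix with prescribed eigenvalues stands in for the kernel f (the eigenvalues and the rotation Q below are arbitrary illustrative choices):

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# finite-rank stand-in for f: eigenvalues d, hidden by a rotation Q
d = [0.5, -0.5, 0.1]
c, s = math.cos(0.3), math.sin(0.3)
Q = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
D = [[d[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
Qt = [list(row) for row in zip(*Q)]   # Q is orthogonal, so Qt = Q^{-1}
A = matmul(matmul(Q, D), Qt)

P = A
for p in range(1, 7):
    # Tr(A_f^p) versus the power sum of the eigenvalues
    print(p, trace(P), sum(x ** p for x in d))
    P = matmul(P, A)
```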
We notice that for the trace of the pth power of A_f one has the relation Tr(A_f^p) = Σ_{j≥1} λ^p_{f,j}. Then the following two conditions are equivalent to the conditions stated in Theorem 5.7: (a) As n → ∞, Σ_{j≥1} ((1/λ^2) λ_{f_n,j} − 4λ^3_{f_n,j})^2 → 0 and Σ_{j≥1} λ^3_{f_n,j} → 0, where, for each n ≥ 1, {λ_{f_n,j}}_{j≥1} stands for the sequence of eigenvalues of the Hilbert-Schmidt operator A_{f_n}.
(b) As n → ∞, Σ_{j≥1} λ^3_{f_n,j} → 0 and the stated condition holds for every q ≥ 2. Proof. To prove the equivalence of (a) to (c) in Theorem 5.7, we use the expansion (5.14). Next we show that (a) is equivalent to (b).
One can represent H_n(G, q) as a multiple stochastic integral of order q, i.e., H_n(G, q) = I_q(f_n) with f_n ∈ H^{⊙q}. Moreover, a number of combinatorial tools are available to control the moments of such integrals; see [22] for details. The universality phenomenon for homogeneous sums has been addressed by Rotar [26] and later also by Nourdin, Peccati and Reinert [16], who in particular consider multivariate extensions in the case of normal and Gamma limiting distributions by means of Stein's method and Malliavin calculus. Using the results obtained in the previous sections, we can reduce a corresponding limit theorem to a simple moment condition in the case q = 2 when the limiting distribution belongs to the broad class of Variance-Gamma distributions.
Proposition 5.16. Suppose that E[H_n(G, q)^2] → r(σ^2 + 2θ^2) as n → ∞, and let Y be a random variable having a VG_c(r, θ, σ)-distribution with parameters r, σ > 0 and θ ∈ R. Then, as n → ∞, the following assertions are equivalent: Proof. The first part of the claim is a reformulation of a special case of Proposition 1 in [26]. The second part is a direct consequence of Theorems 5.7 and 5.8.
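For q = 2 the objects in Proposition 5.16 are quadratic homogeneous sums, and their second moment can be computed exactly: with symmetric coefficients vanishing on the diagonal, E[H_n(G, 2)^2] = 2 Σ_{i≠j} a(i, j)^2, irrespective of whether the i.i.d. standardized inputs are Gaussian or Rademacher. A small exhaustive check over Rademacher signs (the coefficient matrix is an arbitrary illustrative choice):

```python
import itertools

# hypothetical order-2 homogeneous sum: symmetric coefficients a[i][j],
# vanishing on the diagonal; inputs are i.i.d. Rademacher signs
a = [[0, 0.3, -0.2],
     [0.3, 0, 0.5],
     [-0.2, 0.5, 0]]
d = len(a)

def H(x):
    # H(x) = sum_{i != j} a[i][j] * x_i * x_j
    return sum(a[i][j] * x[i] * x[j]
               for i in range(d) for j in range(d) if i != j)

# exact second moment by enumerating all 2^d sign patterns
m2 = sum(H(x) ** 2 for x in itertools.product([-1, 1], repeat=d)) / 2 ** d
# matches 2 * sum_{i != j} a[i][j]^2, the Gaussian-input value
print(m2)
```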
A distance d(F_n, Y) between the random vectors F_n and Y is measured by taking the supremum over all functions φ : R^d → R possessing partial derivatives of order one and two which are uniformly bounded in absolute value by 1. The distance d(·, ·) is our multivariate version of the Wasserstein distance used in the one-dimensional situation. The proof of the next result closely follows the lines of the proof of Lemma 4.4 in [23], which in turn was inspired by the methods in [4]. To keep the paper reasonably self-contained, we nevertheless present the basic idea.
Proposition 5.17. There are constants C_1 > 0 and C_2 > 0 depending only on d and the parameters r_j, θ_j and σ_j, j = 1, . . . , d, such that the stated bound holds. Proof. To simplify the notation and to keep the argument more transparent, we restrict to the bivariate case d = 2. Thus, what we have to show is that (5.18) d(F_n, Y) ≤ C_1 (A_n(1) + A_n(2)) + C_2 (B_n(1, 2) + B_n(2, 1)), n ∈ N, with constants C_1, C_2 > 0 depending only on the parameters r_1, r_2, θ_1, θ_2 and σ_1, σ_2. We start by writing, for an admissible test function φ : R^2 → R, the usual decomposition into two terms T_1 and T_2. Conditioning on the first component Y_1 of Y in T_2 leads to a one-dimensional situation, which has already been considered in the proof of Theorem 4.1. This contributes the term A_n(2) to the bound (5.18). Let us turn to T_1. Writing L_X for the distribution of an arbitrary random element X, we rewrite T_1 in the form (5.19).
The term in brackets is now interpreted as the left-hand side of a Stein equation for Y 1 , i.e.
Here, for fixed y, h_y(x) stands for a solution of this equation for the test function x → φ(x, y). Also put h(x, y) := h_y(x), understood as a bivariate function. Using the smoothness properties of the test function φ together with the smoothness properties of h_y(x) (again taken from Lemma 3.17 in [7]), we see that (i) the mappings x → h(x, y) and y → h(x, y) are twice differentiable on R, and (ii) there is a constant C > 0 depending only on r_1, θ_1, σ_1 such that all partial derivatives up to order two of the mappings in (i) are bounded by C (compare with the proof of Lemma 4.4 in [23] for a similar argument). In terms of h(x, y), the representation (5.19) of T_1 can be rewritten as (5.20) T_1 = E[ σ_1^2 (F_{n,1} + r_1 θ_1) ∂_{xx} h(F_{n,1}, F_{n,2}) + (σ_1^2 r_1 + 2θ_1 (F_{n,1} + r_1 θ_1)) ∂_x h(F_{n,1}, F_{n,2}) − F_{n,1} h(F_{n,1}, F_{n,2}) ], where ∂_x and ∂_{xx} indicate the first and second partial derivatives in the first coordinate (similarly, we write ∂_y and ∂_{yy} for those in the second coordinate). Applying the integration-by-parts formula (2.6) together with the chain rule (2.4), we see that E[F_{n,1} h(F_{n,1}, F_{n,2})] = E[⟨Dh(F_{n,1}, F_{n,2}), −DL^{−1} F_{n,1}⟩_H] = E[∂_x h(F_{n,1}, F_{n,2}) ⟨DF_{n,1}, −DL^{−1} F_{n,1}⟩_H + ∂_y h(F_{n,1}, F_{n,2}) ⟨DF_{n,2}, −DL^{−1} F_{n,1}⟩_H].
Combining this with (5.20) and arguing as in the proof of Theorem 4.1, we see that this contributes the terms A n (1) and B n (2, 1) to (5.18). Interchanging the role of F n,1 and F n,2 leads to a term B n (1, 2) and completes the argument.
We now apply Proposition 5.17 to sequences of vectors belonging to fixed Wiener chaoses, i.e., we assume from now on that F_{n,j} = I_{q_j}(f_{n,j}) with f_{n,j} ∈ H^{⊙q_j}, where q_1, . . . , q_d ≥ 2. The next result ensures that convergence in distribution of the components of F_n towards the components of Y already implies convergence in distribution of the whole random vector. This can be regarded as a quantitative version, for Variance-Gamma distributions, of the strong asymptotic independence properties on the Wiener chaos (see Remark 5.19 below for further discussion).
Proposition 5.18. Suppose that for each j = 1, . . . , d, F_{n,j} converges in distribution to Y_j and that Cov(F^2_{n,i}, F^2_{n,j}) → 0, as n → ∞, for all i ≠ j, i, j = 1, . . . , d. Then F_n converges in distribution to Y, and the bound of Proposition 5.17 holds with A_n(j) given by (5.16) and constants C_1, C_2 > 0 as in Proposition 5.17.
Proof. In view of Proposition 5.17 and Theorem 4.1, it only remains to show that B_n(i, j) is dominated by Cov(F^2_{n,i}, F^2_{n,j}) up to a constant factor. However, this is known from Step 2 of the proof of [19, Theorem 4.3]; see also identity (6.2.3) in [14]. Remark 5.19. Without a rate of convergence, Proposition 5.18 is also a consequence of the strong asymptotic independence properties inside the Wiener chaos. In particular, the result is a consequence of Theorem 1.4 in [10] and the fact that the distribution of each Y_j, j = 1, . . . , d, is determined by its moments (alternatively, one can apply Theorem 3.1 in [19]).