Approximation of Hilbert-Valued Gaussians on Dirichlet structures

We introduce a framework to derive quantitative central limit theorems for non-linear approximations of Gaussian random variables taking values in a separable Hilbert space. In particular, our method provides an alternative to the usual (non-quantitative) finite-dimensional distribution convergence and tightness argument for proving functional convergence of stochastic processes. We also derive fourth moment bounds for Hilbert-valued random variables with possibly infinite chaos expansion, which include, as special cases, all finite-dimensional fourth moment results for Gaussian approximation in a diffusive context proved earlier by various authors. Our main ingredient is a combination of an infinite-dimensional version of Stein's method, as developed by Shih, and the so-called Gamma calculus. As an application, rates of convergence for the functional Breuer-Major theorem are established.


Introduction
Random variables taking values in Hilbert spaces play an important role in many fields of mathematics and statistics, both at a theoretical and applied level. For example, they arise naturally in statistics, in particular in the field of functional data analysis or machine learning (for example in the context of Reproducing Kernel Hilbert Spaces). An important and classical topic is the asymptotic analysis of sequences of such random variables.
In the linear case, i.e., when looking at normalized sums of i.i.d. random variables, the asymptotic behaviour is very well understood, with central limit theorems including error bounds being available in Banach or more general infinite-dimensional spaces (see [1]). Here, (separable) Hilbert spaces have the distinguished property of being the only infinite-dimensional Banach spaces for which convergence of such sums is equivalent to finite variances (square integrability of the norms) of the components.
In the non-linear case, where the sum is replaced by a general transformation, much less is known, except when the dimension of the Hilbert space is finite. In this case, Nourdin and Peccati [30] have introduced the very powerful combination of Stein's method and Malliavin calculus, which yields quantitative central limit theorems for a very wide class of square integrable real-valued transformations of arbitrary Gaussian processes. Since its inception, this approach, which is now known as the Malliavin-Stein method, has had a very substantial impact with numerous generalizations and applications. We refer to the monograph [31] for an overview.
In this paper, we lift the theory to infinite dimensions, thus obtaining quantitative central limit theorems for square-integrable, Hilbert-valued random variables. The setting we will be working in is that of a diffusive Markov generator L, acting on L²(Ω; K), where K is a real separable Hilbert space. Our main result (see Section 3 for unexplained definitions and Theorem 3.2 for a precise statement) then states that for random variables F in the domain of the associated carré du champ operator Γ and centered, non-degenerate Gaussians Z on K with covariance operator S, one has
d(F, Z) ≤ (1/2) ‖Γ(F, −L⁻¹F) − S‖_{L²(Ω;HS)}. (1.1)
Here, ‖·‖_HS denotes the Hilbert-Schmidt norm, L⁻¹ the pseudo-inverse of the generator L and d is a probability metric generating a topology which is stronger than convergence in distribution.
Some examples of random variables F fitting our framework are homogeneous sums of i.i.d. Gaussians with Hilbert-valued coefficients (or, more generally, a polynomial chaos with distributions coming from a diffusion generator), stochastic integrals of the form F_t = ∫₀^∞ u_{t,s} dB_s, where B is a Brownian motion and the kernel u is such that the trajectories of F are Hölder-continuous of order less than one half, or multiple Wiener-Itô integrals.
Proceeding from the general bound (1.1), we generalize and refine the two most important results of the finite-dimensional Malliavin-Stein framework. The first results are quantifications of so-called Fourth Moment Theorems (first discovered in [39] and substantially generalized in [24,2,7]), which state that for a sequence of eigenfunctions of the carré du champ operator satisfying a chaotic property, convergence in distribution to a Gaussian is equivalent to convergence of the second and fourth moments. We prove that such quantitative Fourth Moment Theorems continue to hold in infinite dimensions, i.e., that if F is a chaotic eigenfunction of the carré du champ operator and Z is a Gaussian having the same covariance operator as F, then a corresponding bound holds (see Section 3.2 for precise statements).

Throughout, E(·) denotes Lebesgue or Bochner integration with respect to P. For p ≥ 1, we denote by L^p(Ω; K) the Banach space of all equivalence classes (under almost sure equality) of K-valued random variables X with finite p-th moment, i.e., such that ‖X‖_{L^p(Ω;K)} = (E(‖X‖_K^p))^{1/p} < ∞.
Note that for all X ∈ L^p(Ω; K), the Bochner integral E(X) ∈ K exists. If X ∈ L²(Ω; K), the covariance operator S : K → K of X is defined by
Su = E(⟨X, u⟩_K X). (2.1)
It is a positive, self-adjoint trace-class operator and verifies the identity
tr S = E(‖X‖²_K). (2.2)
We denote by S₁(K) the Banach space of all trace-class operators on K with norm ‖T‖_{S₁(K)} = tr |T|, where |T| = (T*T)^{1/2}. The subspace of Hilbert-Schmidt operators will be denoted by HS(K), its inner product and associated norm by ⟨·, ·⟩_{HS(K)} and ‖·‖_{HS(K)}, respectively. Recall that ‖·‖_op ≤ ‖·‖_{HS(K)} ≤ ‖·‖_{S₁(K)}, where ‖·‖_op denotes the operator norm.
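In finite dimensions, identity (2.2) admits a quick numerical sanity check. The following sketch (K = R⁵; the covariance and sample size are illustrative choices, not taken from the paper) verifies that the trace of the empirical covariance operator equals the empirical mean of ‖X‖²_K:

```python
import numpy as np

# Finite-dimensional sketch of (2.2): for a centered random vector X with
# covariance S, one has tr S = E ||X||^2. All concrete choices are illustrative.
rng = np.random.default_rng(0)
d = 5
A = rng.standard_normal((d, d))
S_true = A @ A.T                                  # a positive, self-adjoint covariance
X = rng.multivariate_normal(np.zeros(d), S_true, size=100_000)

S_hat = X.T @ X / len(X)                          # empirical covariance operator
mean_sq_norm = np.mean(np.sum(X**2, axis=1))      # empirical E ||X||^2

# tr S_hat and the empirical second moment agree (exactly, up to rounding)
assert abs(np.trace(S_hat) - mean_sq_norm) < 1e-8
```

Both quantities also approximate tr S_true, with the usual Monte Carlo error of order n^{−1/2}.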
When there is no ambiguity about what Hilbert space K underlies ⟨·, ·⟩_K, ‖·‖_K, S₁(K) or HS(K), we will drop the K dependency and just write ⟨·, ·⟩, ‖·‖, S₁, HS, and so on.

Gaussian measures and Stein's method on abstract Wiener spaces
In this section, we introduce Gaussian measures, the associated abstract Wiener spaces and Stein's method. We present the theory in a Banach space setting, as specializing to Hilbert spaces brings no significant advantages at this point. Standard references for Gaussian measures and abstract Wiener spaces are the books [5,23]; Stein's method on abstract Wiener spaces was introduced by Shih in [41].

Abstract Wiener spaces
Let H be a real separable Hilbert space with inner product ⟨·, ·⟩_H and define a norm ‖·‖ on H (not necessarily induced by another inner product) that is weaker than ‖·‖_H. Denote by B the Banach space obtained as the completion of H with respect to the norm ‖·‖ (note that if the norm ‖·‖ happens to be induced by an inner product, then B is actually a Hilbert space), and define i to be the canonical embedding of H into B. The associated abstract Wiener measure is the unique probability measure μ on B whose characteristic functional satisfies
∫_B exp(i ⟨x, η⟩_{B,B*}) μ(dx) = exp(−‖i*(η)‖²_H / 2), for any η ∈ B*,
where ⟨·, ·⟩_{B,B*} denotes the dual pairing in B.

Gaussian measures on Banach and Hilbert spaces
For a Banach space B, we denote by B(B) its family of Borel sets.

Definition 2.1. Let B be a real separable Banach space. A Gaussian measure ν is a probability measure on (B, B(B)) such that every linear functional x ∈ B*, considered as a random variable on (B, B(B), ν), has a Gaussian distribution (on (R, B(R))). The Gaussian measure ν is called centered if the distribution of every x ∈ B* is centered, and non-degenerate if the distribution of every non-zero x ∈ B* is non-degenerate.
We see from the definition that every abstract Wiener measure is a Gaussian measure and, conversely, for any Gaussian measure ν on a separable Banach space B, there exists a Hilbert space H such that the triple (i, H, B) is an abstract Wiener space with associated abstract Wiener measure ν (see [22,Lemma 2.1]). The space H is called the Cameron-Martin space.

Stein characterization of abstract Wiener measures
Let B be a real separable Banach space with norm ‖·‖ and let Z be a B-valued random variable on some probability space (Ω, F, P) such that the distribution μ_Z of Z is a non-degenerate Gaussian measure on B with zero mean. Let (i, H, B) be the abstract Wiener space associated to the Wiener measure μ_Z, as described in the previous subsection. Let {P_t : t ≥ 0} denote the Ornstein-Uhlenbeck semigroup associated with μ_Z and defined, for any B(B)-measurable function f and x ∈ B, by
P_t f(x) = ∫_B f(e^{−t} x + √(1 − e^{−2t}) y) μ_Z(dy),
provided such an integral exists. We have the following Stein lemma for abstract Wiener measures (see [41, Theorem 3.1]).
Theorem 2.2. A B-valued random variable X has distribution μ_Z if and only if
E(⟨X, ∇f(X)⟩_{B,B*}) = E(∆_G f(X))
for any twice differentiable function f on B such that E(‖∇²f(Z)‖_{S₁(H)}) < ∞. Here, ∆_G f = tr_H ∇²f denotes the Gross Laplacian of f.
The notion of an H-derivative appearing in Theorem 2.2 was introduced by Gross in [16] and is defined as follows. A function f : U → R, defined on an open subset U of B, is called H-differentiable at x ∈ U if the map h ↦ f(x + i(h)) is Fréchet differentiable at 0 ∈ H; its derivative is denoted by ∇f(x). The k-th order H-derivatives of f at x can be defined inductively and are denoted by ∇^k f(x) for k ≥ 2, provided they exist. If f is scalar-valued, ∇f(x) ∈ H* ≈ H and ∇²f(x) is regarded as a bounded linear operator from H into H* for any x ∈ U, and the notation ⟨∇²f(x)h, k⟩_H stands for the action of the linear form ∇²f(x)h on k ∈ H.

Remark 2.3 (On the relation between Fréchet and H-derivatives). An H-derivative ∇f(x) at x ∈ B determines an element in B* if there is a constant C > 0 such that |⟨∇f(x), h⟩_H| ≤ C‖h‖ for any h ∈ H. Then, ∇f(x) extends to an element of B* by continuity and we denote this element by ∇f(x) as well. Now, if f is also twice Fréchet differentiable on B, then ∇f(x) coincides with the first-order Fréchet derivative f′(x) at x ∈ B and is automatically in B*. Furthermore, ∇²f(x) coincides with the restriction of the second-order Fréchet derivative f″(x) to H × H at x ∈ B. In this framework, since for any x ∈ B, f″(x) is a bounded linear operator from B into B*, Goodman's theorem (see [23, Chapter 1, Theorem 4.6]) implies that ∇²f(x) is a trace-class operator on H and that, consequently, the Gross Laplacian ∆_G f(x) is well-defined. Twice Fréchet differentiability hence constitutes a sufficient condition for the existence of the Gross Laplacian.
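In the scalar case B = H = R with Z a standard Gaussian, the characterization behind Theorem 2.2 reduces to the classical Stein identity E[Z f′(Z)] = E[f″(Z)]. A quick Gauss-Hermite quadrature check (the test function f below is an illustrative choice):

```python
import numpy as np

# Scalar Stein identity E[Z f'(Z)] = E[f''(Z)] for Z ~ N(0,1), checked by
# Gauss-Hermite quadrature. Here f(x) = exp(x) + x^3 (an arbitrary smooth f).
nodes, weights = np.polynomial.hermite_e.hermegauss(60)  # weight exp(-x^2/2)
weights = weights / np.sqrt(2 * np.pi)                   # normalize to N(0,1)

f1 = lambda x: np.exp(x) + 3 * x**2   # f'
f2 = lambda x: np.exp(x) + 6 * x      # f''

lhs = np.sum(weights * nodes * f1(nodes))  # E[Z f'(Z)]
rhs = np.sum(weights * f2(nodes))          # E[f''(Z)]
assert abs(lhs - rhs) < 1e-8               # both equal exp(1/2)
```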

Stein's equation and its solutions for abstract Wiener measures
In view of the above Stein lemma (Theorem 2.2), the associated Stein equation is given by
⟨x, ∇f(x)⟩ − ∆_G f(x) = h(x) − E(h(Z)), x ∈ B, (2.4)
where h is given in some class of test functionals. Shih showed in [41] that a solution of (2.4) is given by
f_h(x) = ∫₀^∞ (P_u h(x) − E(h(Z))) du. (2.5)
In what follows, we will consider test functions from the space C^k_b(K) of real-valued, k-times Fréchet differentiable functions on a separable Hilbert space K with bounded derivatives up to order k. A function h thus belongs to C^k_b(K) whenever sup_{x∈K} |h(x)| + Σ_{j=1}^k sup_{x∈K} ‖∇^j h(x)‖ < ∞. The following lemma collects some properties of the Stein solution f_h for a given function h ∈ C^k_b(K).

Lemma 2.4. Let K be a separable Hilbert space, k ≥ 1 and h ∈ C^k_b(K). Then the Stein solution f_h defined in (2.5) also belongs to C^k_b(K) and furthermore one has, for any j = 1, . . . , k,
sup_{x∈K} ‖∇^j f_h(x)‖ ≤ (1/j) sup_{x∈K} ‖∇^j h(x)‖.

Proof. Differentiating under the integral sign in f_h(x) = ∫₀^∞ (P_u h(x) − E(h(Z))) du, we have, for any j = 1, . . . , k, ∇^j f_h(x) = ∫₀^∞ ∇^j P_u h(x) du. Using the property of the semigroup P that ∇^j P_u h(x) = e^{−ju} P_u ∇^j h(x), and the fact that P_u is contractive, yields the claimed estimate, as ∫₀^∞ e^{−ju} du = 1/j. The bound (2.6) can be derived similarly.

Dirichlet structures
In this section, a Dirichlet structure for Hilbert-valued random variables is introduced, which will be the framework we work in. We start by recalling the well-known definition in the case of real-valued random variables (full details can for example be found in [6,14,26,3], where the latter reference emphasizes the equivalent notion of a Markov triple). Given a probability space (Ω, F, P), a Dirichlet structure (D, E) on L²(Ω; R) with associated carré du champ operator Γ consists of a Dirichlet domain D, which is a dense subset of L²(Ω; R), and a carré du champ operator Γ : D × D → L¹(Ω; R) characterized by the following properties.
- Γ is bilinear, symmetric (Γ(f, g) = Γ(g, f)) and positive (Γ(f, f) ≥ 0),
- for all m, n ∈ N, all Lipschitz and continuously differentiable functions ϕ : R^m → R and ψ : R^n → R and all f = (f₁, . . . , f_m) ∈ D^m, g = (g₁, . . . , g_n) ∈ D^n, it holds that
Γ(ϕ(f), ψ(g)) = Σ_{i=1}^m Σ_{j=1}^n ∂_i ϕ(f) ∂_j ψ(g) Γ(f_i, g_j), (2.7)
- the associated form E(f, g) = E(Γ(f, g)) is closed in L²(Ω; R), i.e., D is complete when equipped with the norm ‖f‖_D = (‖f‖²_{L²(Ω;R)} + E(f, f))^{1/2}.
Here and in the following, E(·) denotes expectation on (Ω, F) with respect to P. The form f ↦ E(f, f) is called a Dirichlet form, and, as is customary, we will write E(f) for E(f, f). Every Dirichlet form gives rise to a strongly continuous semigroup {P_t : t ≥ 0} on L²(Ω; R) and an associated symmetric Markov generator −L, defined on a dense subset dom(−L) ⊆ D. We will often switch between −L and L, as these two operators only differ by sign. There are two important relations between Γ and L. The first one is the integration by parts formula
E(Γ(f, g)) = −E(f Lg) = −E(g Lf),
valid whenever f, g ∈ D, the second one is the relation
Γ(f, g) = (1/2) (L(fg) − f Lg − g Lf).
It follows in particular that E(Lf) = 0 for all f ∈ dom(L). Consider now such a Dirichlet structure on L²(Ω; R) with diagonalizable generator as given, i.e., −L = Σ_p λ_p π_p, where 0 = λ_0 < λ_1 < λ_2 < · · · are the eigenvalues of −L and π_p denotes the orthogonal projection onto the eigenspace ker(L + λ_p Id), and denote the Dirichlet domain, Dirichlet form, carré du champ operator, its associated infinitesimal generator and pseudo-inverse by D, E, Γ, L and L⁻¹, respectively, in order to distinguish these objects from their extensions to the Hilbert-valued setting to be introduced below.
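The relation Γ(f, g) = (L(fg) − f Lg − g Lf)/2 can be verified symbolically in the model one-dimensional case of the Ornstein-Uhlenbeck generator Lu = u″ − xu′, for which the carré du champ is Γ(f, g) = f′g′. The particular f and g below are illustrative choices:

```python
import sympy as sp

# One-dimensional Ornstein-Uhlenbeck generator L u = u'' - x u' and its
# carre du champ Gamma(f, g) = f' g'; check Gamma = (L(fg) - f Lg - g Lf)/2.
x = sp.symbols('x')
L = lambda u: sp.diff(u, x, 2) - x * sp.diff(u, x)

f = x**3
g = sp.exp(x)

gamma = sp.diff(f, x) * sp.diff(g, x)
rhs = sp.expand((L(f * g) - f * L(g) - g * L(f)) / 2)
assert sp.simplify(gamma - rhs) == 0
```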
Given a separable Hilbert space K, one has that L 2 (Ω; K) is isomorphic to L 2 (Ω; R) ⊗ K. The Dirichlet structure on L 2 (Ω; R) can therefore be extended to L 2 (Ω; K) via a tensorization procedure as follows.
For F = Σ_i f_i k_i and G = Σ_j g_j k_j in the algebraic tensor product A = D ⊗ K, where f_i, g_j ∈ D and {k_i : i ∈ N} is an orthonormal basis of K, define a bilinear and positive operator Γ by
Γ(F, G) = Σ_{i,j} Γ(f_i, g_j) ⟨·, k_i⟩_K k_j,
and a bilinear, positive and symmetric form E by E(F, G) = E(tr Γ(F, G)). The generator is extended by L(Σ_i f_i k_i) = Σ_i (Lf_i) k_i, and similarly for L⁻¹. We identify Γ(F, G) with a random operator on K, whose action is given by
⟨Γ(F, G)u, v⟩_K = Γ(⟨F, u⟩_K, ⟨G, v⟩_K), u, v ∈ K.
For all F, G ∈ A, the operator Γ(F, G) is then of trace class and an element of L¹(Ω; S₁).
It is standard to verify that the definitions of L, L −1 and Γ do not depend on the choice of the basis of K. Furthermore, from the well-known results for L, Γ and E, we can extend them as follows.
Proposition 2.5. The operators L, L⁻¹, E and Γ introduced above can be extended to dom(L), dom(L⁻¹) and dom(Γ) = dom(E) = D × D, given by
dom(L) = {F ∈ L²(Ω; K) : Σ_p λ_p² ‖π_p F‖²_{L²(Ω;K)} < ∞},
dom(L⁻¹) = L²(Ω; K), with L⁻¹F = −Σ_{p≥1} λ_p⁻¹ π_p F,
D = {F ∈ L²(Ω; K) : Σ_p λ_p ‖π_p F‖²_{L²(Ω;K)} < ∞},
respectively, where π_p denotes the orthogonal projection onto the p-th eigenspace ker(L + λ_p Id) of L²(Ω; K). In particular, one has
dom(L) ⊆ D ⊆ L²(Ω; K),
where all inclusions are dense.
Throughout this article, the extensions of L, L⁻¹ and Γ to their maximal domains will still be denoted by the same symbols. The operators just defined yield a Dirichlet structure (D, Γ) on L²(Ω; K), which is a natural counterpart to the given real-valued structure on L²(Ω; R). The following theorem summarizes its main features.
Theorem 2.6. For a Dirichlet structure (D, Γ) on L 2 (Ω; K), consisting of a dense subspace D of L 2 (Ω; K) and a carré du champ operator Γ : D × D → L 1 (Ω; S 1 ) as introduced above, the following is true.
(i) Γ is bilinear, almost surely positive (i.e., Γ(F, F) ≥ 0 as an operator on K), symmetric in its arguments and self-adjoint.
(ii) The Dirichlet domain D, endowed with the norm ‖F‖_D = (‖F‖²_{L²(Ω;K)} + E(F, F))^{1/2}, is complete, so that Γ is closed.
(iii) For all Lipschitz and Fréchet differentiable maps ϕ, ψ : K → K and F, G ∈ D, one has that ϕ(F), ψ(G) ∈ D and the diffusion identity
Γ(ϕ(F), ψ(G)) = ∇ψ(G) Γ(F, G) ∇ϕ(F)* (2.10)
holds, where ∇ϕ(F) and ∇ψ(G) denote the Fréchet derivatives of ϕ and ψ at F and G, respectively, and ∇ϕ(F)*, ∇ψ(G)* are their adjoints in K.
(iv) The associated generator −L acting on L 2 (Ω; K) is positive, symmetric, densely defined and has the same spectrum as − L.
(v) There exists a compact pseudo-inverse L⁻¹ of L such that for all F ∈ L²(Ω; K),
L L⁻¹ F = F − E(F),
where the expectation on the right is a Bochner integral (well defined as F ∈ L²(Ω; K)).

(vi) The integration by parts formula
E(⟨F, −LG⟩_K) = E(tr Γ(F, G)) (2.11)
is satisfied for all F, G ∈ dom(−L).
(vii) The generator L and its one-dimensional counterpart are connected through ⟨LF, u⟩_K = L(⟨F, u⟩_K) for all F ∈ dom(L) and u ∈ K.
(viii) The identity
⟨Γ(F, G)u, v⟩_K = Γ(⟨F, u⟩_K, ⟨G, v⟩_K), (2.12)
connecting Γ and its one-dimensional counterpart, is valid for all F, G ∈ D and all u, v ∈ K.
Proof. Parts (i)-(ii) and (iv)-(viii) are straightforward to verify. In order to prove (iii), write F = Σ_p f_p and G = Σ_p g_p, where the f_p and g_p are eigenfunctions of L with eigenvalue −λ_p, and let {k_i : i ∈ N} be an orthonormal basis of K. Let K_n = span {k_i : 1 ≤ i ≤ n} and ρ_n be the orthogonal projection onto L²(Ω; K_n), so that ρ_n(F) → F and ρ_n(G) → G in L²(Ω; K) as n → ∞. Denote by i_n : K_n → R^n the canonical isometric isomorphism mapping K_n to R^n, so that ξ_n = i_n ∘ ρ_n(F) ∈ R^n and υ_n = i_n ∘ ρ_n(G) ∈ R^n. Let ϕ_n = ϕ ∘ i_n⁻¹ and ψ_n = ψ ∘ i_n⁻¹. Then ϕ_n : R^n → K is Lipschitz and Fréchet differentiable, with Fréchet derivative given by ∇ϕ_n(x)y = ∇ϕ(i_n⁻¹(x)) i_n⁻¹(y) for all x, y ∈ R^n, and an analogous result is true for ∇ψ_n. Therefore, via Γ(ϕ(ρ_n(F)), ψ(ρ_n(G))) = Γ(ϕ_n(ξ_n), ψ_n(υ_n)) and identity (2.12), the assertion can be transformed into an equivalent assertion for the one-dimensional Γ, which can then be verified by tedious but straightforward calculations, using the diffusion property (2.7) and then letting n → ∞.
The most important example in our context is the Dirichlet structure given by the generator of a Hilbert-valued Ornstein-Uhlenbeck semigroup. Here, −L = δD, where D and δ denote the Malliavin derivative and divergence operator, and the carré du champ operator is given by Γ(X, Y) = ⟨DX, DY⟩_H, where H is the Hilbert space associated to the underlying isonormal Gaussian process (see Section 4 for full details). The corresponding eigenspaces are known as Wiener chaoses and are spanned by infinite-dimensional Hermite polynomials. In the same way, one can obtain Jacobi, Laguerre or other polynomial chaoses (see for example [2] for the real-valued case). We refer to the monographs quoted at the beginning of this section for numerous further examples.

Approximation of Hilbert-valued Gaussians
In this section, we combine Stein's method introduced in Section 2.2 with the Dirichlet structure defined in Section 2.3 in order to derive bounds on a probabilistic metric between the laws of square integrable random variables and a Gaussian, both taking values in some separable Hilbert space.
Throughout the whole section, this separable Hilbert space will be denoted by K, and we furthermore assume as given a Dirichlet structure on L 2 (Ω; K) as introduced in the previous section, with Dirichlet domain D, carré du champ operator Γ and associated generator L.
The probabilistic distance we use is the well-known d₂-metric, given by
d₂(F, G) = sup {|E(h(F)) − E(h(G))| : h ∈ C²_b(K), sup_x ‖∇h(x)‖ ≤ 1, sup_x ‖∇²h(x)‖ ≤ 1},
the supremum being taken over test functions with uniformly bounded first and second derivatives (see Section 2.2.4). In an infinite-dimensional context, this distance has already been used in [12] and, in a weakened form, also in [4]. As already observed in [12], it metrizes convergence in distribution.

Lemma 3.1. If d₂(F_n, F) → 0 as n → ∞, then E(h(F_n)) → E(h(F)) as n → ∞, for all bounded, real-valued and continuous functions h on K.
Proof. The proof given in [12, Lemma 4.1] for K = ℓ²(N) continues to work without any modification.

An abstract carré du champ bound
The following general bound between the laws of a square integrable K-valued random variable in the Dirichlet domain D and an arbitrary Gaussian random variable holds.
Theorem 3.2. Let Z be a centered, non-degenerate Gaussian random variable on K with covariance operator S and let F ∈ D. Then
d₂(F, Z) ≤ (1/2) ‖Γ(F, −L⁻¹F) − S‖_{L²(Ω;HS)}.
If, in addition, K is of finite dimension d, the same quantity, multiplied by a constant C_{S,d}, bounds the Wasserstein distance d_W(F, Z), where d_W denotes the Wasserstein distance.

Proof. Let h ∈ C²_b(K) and let f_h be the Stein solution given by (2.5). Using Stein's equation (see (2.4)), we have
|E(⟨F, ∇f_h(F)⟩_K) − E(tr(S ∇²f_h(F)))| = |E(h(F)) − E(h(Z))|, (3.4)
so that the assertion follows after bounding the left hand side of (3.4) and taking the supremum over h. Identifying K* with K, using the integration by parts formula (2.11) and the diffusion property (2.10) for the carré du champ, we can write
E(⟨F, ∇f_h(F)⟩_K) = E(⟨Γ(F, −L⁻¹F), ∇²f_h(F)⟩_HS).
Now let H be the Cameron-Martin space associated to Z as introduced in Section 2.2.
As the covariance operator S of Z is compact and one-to-one, it holds that S = Σ_{i∈N} λ_i ⟨·, e_i⟩_K e_i for some λ_i > 0 and an orthonormal basis {e_i : i ∈ N} of K consisting of eigenvectors of S. Since H = √S(K), the vectors h_i = √λ_i e_i, i ∈ N, form an orthonormal basis of H. It thus follows that
E(tr(S ∇²f_h(F))) = E(⟨S, ∇²f_h(F)⟩_HS).
Combining the last two calculations yields that
E(h(F)) − E(h(Z)) = E(⟨Γ(F, −L⁻¹F) − S, ∇²f_h(F)⟩_HS),
and, taking absolute values and applying Hölder's inequality for the Schatten norms, we get
|E(h(F)) − E(h(Z))| ≤ ‖∇²f_h(F)‖_{L²(Ω;HS(K))} ‖Γ(F, −L⁻¹F) − S‖_{L²(Ω;HS(K))}.
The Wasserstein distance is given by
d_W(F, Z) = sup {|E(h(F)) − E(h(Z))| : ‖h‖_Lip ≤ 1},
where ‖·‖_Lip denotes the Lipschitz norm. The Wasserstein bound is then obtained by approximating Lipschitz functions by functions in C²_b(K) (for example by convolving with a Gaussian kernel).
If Z is a K-valued Gaussian random variable with covariance operator S, then, taking L to be the Ornstein-Uhlenbeck generator (see the forthcoming Section 4), one has that Γ(Z, −L⁻¹Z) = S. Therefore, taking F to be Gaussian in Theorem 3.2 yields a bound on the distance between two Gaussians Z₁, Z₂ in terms of the Hilbert-Schmidt norm of the difference of their covariance operators S₁, S₂. We state this as a corollary.

Corollary 3.3. Let Z₁, Z₂ be two centered, non-degenerate Gaussian random variables on K with covariance operators S₁, S₂, respectively. Then, it holds that
d₂(Z₁, Z₂) ≤ (1/2) ‖S₁ − S₂‖_HS.
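As a finite-dimensional consistency check of Corollary 3.3, one can take K = R³ and the test function h(x) = cos(⟨a, x⟩) with ‖a‖ ≤ 1, which has first and second derivatives bounded by 1 and satisfies E(h(Z)) = exp(−⟨a, Sa⟩/2) for a centered Gaussian Z with covariance S. All concrete choices in the sketch are illustrative:

```python
import numpy as np

# For h(x) = cos(<a, x>) with ||a|| <= 1 and centered Gaussians Z1, Z2 on R^3,
# |E h(Z1) - E h(Z2)| should not exceed (1/2) ||S1 - S2||_HS (Corollary 3.3).
rng = np.random.default_rng(2)
A1 = rng.standard_normal((3, 3)); S1 = A1 @ A1.T
A2 = rng.standard_normal((3, 3)); S2 = A2 @ A2.T
a = rng.standard_normal(3)
a /= 2 * np.linalg.norm(a)                  # ||a|| = 1/2 <= 1

lhs = abs(np.exp(-a @ S1 @ a / 2) - np.exp(-a @ S2 @ a / 2))  # |E h(Z1) - E h(Z2)|
rhs = 0.5 * np.linalg.norm(S1 - S2, 'fro')  # (1/2) ||S1 - S2||_HS
assert lhs <= rhs
```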

We continue with some remarks on Theorem 3.2.

Remark 3.4.
(i) Note that the proof of Theorem 3.2 does not use diagonalizability of L, so that this assumption can be replaced by weaker conditions guaranteeing that a pseudoinverse can still be defined (in a finite-dimensional context, this has been done in [7]). However, we will not need this level of generality.
(ii) While Γ(F, −L −1 F ) − S HS is almost surely finite for any F ∈ D, it might be that Γ(F, −L −1 F ) − S L 2 (Ω;HS) is infinite. A simple sufficient condition for finiteness of the latter norm is that F has finite chaos decomposition (see Section 3.2). In the case of an infinite decomposition, some control on the tail is needed.
(iii) In principle, Theorem 3.2 can also be used to prove weak convergence in a Banach space setting. Starting from a Gaussian random variable on a separable Banach space B, it is always possible (see [22,Lemma 2.1]) to densely embed B in a separable Hilbert space K such that the Borel sets of B are generated by the inner product of K. Then, by applying our methods, one obtains weak convergence in K, which in turn implies weak convergence in B.

Fourth Moment bounds via chaos expansions
In this section, we show how the carré du champ bounds obtained in Theorem 3.2 can be further estimated by the first four moments of the approximating random variable or sequence. For this, we need to assume that the generator satisfies the following, generalized version of an abstract polynomial chaos property first stated in [2] for the finite-dimensional case. Prime examples of chaotic generators are those whose eigenspaces consist of (closures of) multivariate polynomials, such as the Hilbert-valued Ornstein-Uhlenbeck generator, Laguerre or Jacobi generators, in finite or infinite dimension. The Ornstein-Uhlenbeck case will be covered in depth in Section 4, precise definitions for the other two generators can for example be found in [2].
We will also make use of a covariance condition, (3.6), required to hold for any two orthonormal vectors u, v ∈ K.

It will be proved later that both the covariance condition and the chaotic property are satisfied whenever F is an eigenfunction of the Ornstein-Uhlenbeck generator.

Now we can state the main result of this section.
Theorem 3.7. Let F ∈ D with chaos expansion F = Σ_{p=1}^∞ F_p, where LF_p = −λ_p F_p, and assume that L is chaotic and its eigenfunctions verify the covariance assumption (3.6). Denote the covariance operators of the F_p by S_p, so that F has covariance operator S = Σ_{p=1}^∞ S_p. Then
‖Γ(F, −L⁻¹F) − S‖_{L²(Ω;HS)} ≤ M(F) + C(F),
where M(F) and C(F) are given by (3.8) and (3.9), and the constants a_{p,q} and c_{p,q} appearing there are given by a_{p,q} = (λ_p + λ_q)/(2λ_q) and its counterpart from the proof, respectively.
Before proving Theorem 3.7, let us give the following restatement of M in terms of fourth moments only.

Proposition 3.8.
In the setting of Theorem 3.7, M(F) can equivalently be expressed in terms of fourth moments only, namely by formula (3.10), where the Z_p are centered Gaussian random variables with the same covariance operators as the F_p.
Proof. Using similar arguments as in the finite-dimensional setting, this follows from a direct computation.

Proof of Theorem 3.7. The idea of the proof is to transfer the Dirichlet structure from L²(Ω; K) to L²(Ω; R) by expanding in an orthonormal basis and working on the coefficients, afterwards reassembling everything again. To this end, let {e_i : i ∈ N} be an orthonormal basis of K and denote F_i = ⟨F, e_i⟩ and F_{p,i} = ⟨F_p, e_i⟩; here, Γ and L also stand for the real-valued counterparts of Γ and L (see Section 2.3). To improve readability, we will not make any notational distinction between the real-valued and Hilbert-valued case and therefore denote Γ and L by the symbols Γ and L as well throughout the proof. The meaning can always unambiguously be inferred from the context, depending on whether the arguments are K- or R-valued. Define the cross-covariance operators C_{p,q} : K → K via the identity
E(⟨F_p, k⟩⟨F_q, l⟩) = ⟨C_{p,q} k, l⟩, k, l ∈ K.

Then, C_{p,p} = S_p and, by orthogonality of the eigenspaces, C_{p,q} = 0 if p ≠ q. Therefore, the proof reduces to controlling the variances
Var(Γ(F_{p,i}, −L⁻¹F_{q,j})), (3.12)
summed over i, j, p, q. Note that all carré du champ operators appearing in the double sum (3.12) are acting on real-valued random variables, so that known results from the finite-dimensional theory can be applied.

Together with (3.14), we thus obtain, for p ≠ q, a bound on the sum Σ_{i,j=1}^∞ Var(Γ(F_{p,i}, −L⁻¹F_{q,j})), from which the asserted bound follows.
Inspecting the proof of Theorem 3.7, it becomes apparent that for the case where F = F p is a chaotic eigenfunction, we can remove one square root. In other words, the following holds.
Combining Theorems 3.7 and 3.2, the following moment bound is obtained.
Theorem 3.10. Let Z be a centered, non-degenerate Gaussian random variable on K, assume that L is chaotic and let F ∈ L²(Ω; K) with chaos expansion F = Σ_{p=1}^∞ F_p, where LF_p = −λ_p F_p. Denote the covariance operators of Z, F and F_p by S, T and S_p, respectively. Then the following two statements are true.
(i) If F_p satisfies the covariance condition (3.6) for all p ∈ N, then
d₂(F, Z) ≤ (1/2) (‖T − S‖_HS + M(F) + C(F)),
where the quantities M(F) and C(F) are given by (3.8) (or equivalently (3.10)) and (3.9), respectively.
(ii) If F = F_p for some eigenfunction F_p ∈ ker(L + λ_p Id), then the analogous bound holds with one square root removed.

Remark 3.11. For F with infinite chaos expansion, one can apply Theorem 3.10 to the truncation G_N = Σ_{p=1}^N F_p, so that the expressions M(G_N) and C(G_N) are no longer infinite series but finite sums. To handle the additional term d₂(G_N, F), one then needs control on the tails E(‖F − G_N‖). Of course, in the setting of Theorem 3.10, if K is assumed to have finite dimension d, then the right hand side of (3.16) also bounds the Wasserstein distance d_W(F, Z) (with the constant 1/2 replaced by the constant C_{S,d} of Theorem 3.2).

Let us now state two central limit theorems which are direct consequences of Theorem 3.10. The first one is an abstract Fourth Moment Theorem.

Theorem 3.12 (Abstract fourth moment theorem). Let Z be a centered, non-degenerate Gaussian random variable on K and {F_n : n ∈ N} be a sequence of K-valued chaotic eigenfunctions such that E(‖F_n‖²) → E(‖Z‖²). Consider the following two asymptotic relations, as n → ∞:
(i) F_n converges in distribution to Z;
(ii) E(‖F_n‖⁴) → E(‖Z‖⁴).
Then, (ii) implies (i), and the converse implication holds whenever the moment sequence {‖F_n‖⁴ : n ≥ 1} is uniformly integrable.
Proof. Denote the covariance operators of Z and the F n by S and S n , respectively. Then by assumption tr(S n − S) → 0. The fact that (ii) implies (i) is a direct consequence of Theorem 3.10. The converse implication follows immediately if the additional uniform integrability condition is assumed to hold.
Remark 3.13. (i) As is well known, a sufficient condition for uniform integrability of the sequence {‖F_n‖⁴ : n ≥ 1} is given by sup_{n≥1} E(‖F_n‖^{4+ε}) < ∞ for some ε > 0.

(ii) Theorem 3.12 is a Hilbert-valued generalization of the Gaussian Fourth Moment Theorems derived in [2] (K = R) and [10] (K = R^d with Euclidean inner product). As further special cases, taking L to be the Ornstein-Uhlenbeck generator on L²(Ω; R), the classical Fourth Moment Theorem of [39] (K = R) and Theorem 4.2 of [35] (K = R^d with Euclidean inner product) are included. Further details on these latter two cases will be provided in Section 4.2.
For functionals with infinite chaos expansions, the corresponding limit theorem reads as follows. Again, the proof is a straightforward application of Theorem 3.10.
Theorem 3.14. Let Z be a centered, non-degenerate Gaussian random variable on K with covariance operator S and let {F_n : n ∈ N} be a sequence of square integrable, K-valued random variables with chaos decomposition F_n = Σ_{p=1}^∞ F_{p,n}, where, for each n, p ≥ 1, F_{p,n} is a chaotic eigenfunction associated to the eigenvalue −λ_p (of the operator −L) and verifying the covariance condition (3.6). For n, p ∈ N, let S_n and S_{p,n} be the covariance operators of F_n and F_{p,n}, respectively. Suppose that the following two conditions (i) and (ii) hold. Then F_n converges in distribution to Z as n → ∞.
Proof. For N ∈ N, define F_{n,N} = Σ_{p=1}^N F_{p,n}, R_{n,N} = F_n − F_{n,N} = Σ_{p=N+1}^∞ F_{p,n} and let Z_N be a centered Gaussian random variable on K with covariance operator Σ_{p=1}^N S_p. Now let ε > 0 and note that E(‖R_{n,N}‖²) = Σ_{p=N+1}^∞ tr(S_{p,n}). Similarly, E(‖Z − Z_N‖²) = Σ_{p=N+1}^∞ tr(S_p). The above two calculations, together with assumption (i), yield the existence of N ∈ N, not depending on n, such that both tails are at most ε. By assumption (ii) and Theorem 3.10, we also have that d₂(F_{n,N}, Z_N) → 0 as n → ∞, so that, in view of (3.18), lim sup_{n→∞} d₂(F_n, Z) is bounded by a constant multiple of ε. The assertion follows as ε was arbitrary.
Although we stated Theorems 3.12 and 3.14 in a qualitative way, it should be clear that the convergences in both results are actually quantified by Theorem 3.10.

Hilbert-valued Wiener structures
In this section, we apply our general results to the special Dirichlet structure induced by the Ornstein-Uhlenbeck generator. This leads to Hilbert-valued Wiener chaos and a carré du champ operator given in terms of Hilbert-valued Malliavin derivatives. The eigenfunctions are multiple Wiener-Itô integrals with Hilbert-valued deterministic kernels. This additional structure allows us to express the moment bounds of the previous sections in terms of kernel contractions, which in the finite-dimensional case have already proved very useful in applications, due to their comparatively easy computability when compared to moments.

The Malliavin derivative and divergence operators
Let {W(h) : h ∈ H} be an isonormal Gaussian process with underlying separable Hilbert space H, that is, {W(h) : h ∈ H} is a centered family of Gaussian random variables, defined on a complete probability space (Ω, F, P), satisfying E(W(h)W(g)) = ⟨h, g⟩_H for all h, g ∈ H. We assume that the σ-algebra F is generated by W. Let K be another separable Hilbert space and denote by S ⊗ K the class of smooth K-valued random variables of the form
F = f(W(h₁), . . . , W(h_n)) v, f ∈ C_b^∞(R^n), h₁, . . . , h_n ∈ H, v ∈ K,
and linear combinations thereof. S ⊗ K is dense in L²(Ω; K) and, for F ∈ S ⊗ K as above, define the Malliavin derivative DF of F as the H ⊗ K-valued random variable given by
DF = Σ_{i=1}^n ∂_i f(W(h₁), . . . , W(h_n)) h_i ⊗ v. (4.1)
It can be shown that D is a closable operator from L²(Ω; K) into L²(Ω; H ⊗ K), and from now on we continue to use the symbol D to denote the closure. The domain of D, denoted by D^{1,2}(K), is the closure of S ⊗ K with respect to the Sobolev norm ‖F‖²_{D^{1,2}(K)} = ‖F‖²_{L²(Ω;K)} + ‖DF‖²_{L²(Ω;H⊗K)}. Similarly, for k ≥ 2, let D^{k,2}(K) denote the closure of S ⊗ K with respect to the Sobolev norm ‖F‖²_{D^{k,2}(K)} = ‖F‖²_{L²(Ω;K)} + Σ_{j=1}^k ‖D^j F‖²_{L²(Ω;H^{⊗j}⊗K)}. For any k ≥ 2, the operator D^k can be interpreted as the iteration of the Malliavin derivative operator defined in (4.1). As D is a closed linear operator from D^{1,2}(K) to L²(Ω; H ⊗ K), it has an adjoint operator, denoted by δ, which maps a subspace of L²(Ω; H ⊗ K) into L²(Ω; K) through the duality relation
E(⟨F, δ(η)⟩_K) = E(⟨DF, η⟩_{H⊗K})
for any F ∈ D^{1,2}(K) and η ∈ dom(δ). The domain of δ, denoted by dom(δ), is the subset of random variables η ∈ L²(Ω; H ⊗ K) such that |E(⟨DF, η⟩_{H⊗K})| ≤ C_η ‖F‖_{L²(Ω;K)} for all F ∈ D^{1,2}(K), where C_η is a positive constant depending only on η. Since D is a form of gradient, its adjoint δ should be interpreted as a divergence, so that it is referred to as the divergence operator. Similarly, for any k ≥ 2, we denote by δ^k the adjoint of D^k as an operator from L²(Ω; H^{⊗k} ⊗ K) to L²(Ω; K) with domain dom(δ^k).
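In a toy finite-dimensional situation the derivative rule (4.1) can be checked symbolically: with H = R² and W(h) = ⟨h, ξ⟩ for a vector ξ of symbols, the chain-rule expression Σᵢ ∂ᵢf(W(h₁), W(h₂)) hᵢ must agree with the ordinary ξ-gradient of F (here K = R; the vectors h₁, h₂ and the function f are illustrative choices):

```python
import sympy as sp

# Chain-rule form of the Malliavin derivative versus the direct gradient,
# in the finite-dimensional model H = R^2, W(h) = <h, xi>.
xi1, xi2 = sp.symbols('xi1 xi2')
h1, h2 = sp.Matrix([1, 2]), sp.Matrix([0, 1])
W = lambda h: h[0] * xi1 + h[1] * xi2

u1, u2 = sp.symbols('u1 u2')
g = u1**2 * sp.exp(u2)                       # the outer function f
F = g.subs({u1: W(h1), u2: W(h2)})           # F = f(W(h1), W(h2))

# DF = d_1 f(W(h1), W(h2)) h1 + d_2 f(W(h1), W(h2)) h2
DF_chain = (sp.diff(g, u1).subs({u1: W(h1), u2: W(h2)}) * h1
            + sp.diff(g, u2).subs({u1: W(h1), u2: W(h2)}) * h2)

DF_direct = sp.Matrix([sp.diff(F, xi1), sp.diff(F, xi2)])
assert sp.simplify(DF_chain - DF_direct) == sp.zeros(2, 1)
```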

Multiple integrals and chaos decomposition
Any K-valued random variable F ∈ L²(Ω; K) can be decomposed as
F = Σ_{n=0}^∞ δ^n(f_n), (4.2)
where the kernels f_n ∈ H^{⊙n} ⊗ K are uniquely determined by F, and H^{⊙n} denotes the n-fold symmetric tensor product of H. The representation (4.2) is called the chaos decomposition of F, and for each n ≥ 0, δ^n(f_n) is an element of the closure of H_n ⊗ K with respect to the norm on L²(Ω; K), where the so-called n-th Wiener chaos H_n is defined to be the closed linear subspace of L²(Ω) generated by the random variables {H_n(W(h)) : h ∈ H, ‖h‖_H = 1}, with H_n the n-th Hermite polynomial given by H_n(x) = (−1)^n e^{x²/2} (d/dx)^n e^{−x²/2} (recall that H_0 is identified with R). For any n ≥ 0, the K-valued random variable δ^n(f_n) is usually denoted by I_n(f_n) and called the (K-valued) multiple Wiener integral of order n of f_n. In the particular case where K = R, these integrals coincide with the ones defined in [38]. Denote by J_n the linear operator on L²(Ω) given by the orthogonal projection onto H_n, and by J_n^K the extension of J_n ⊗ Id_K to L²(Ω; K). Then, it holds that J_n^K F = I_n(f_n). Let {e_k : k ≥ 0} be an orthonormal basis of H. Given f ∈ H^{⊙n} and g ∈ H^{⊙m}, for every r = 0, . . . , n ∧ m, the r-th contraction of f and g is the element of H^{⊗(n+m−2r)} defined as
f ⊗_r g = Σ_{i₁,...,i_r=0}^∞ ⟨f, e_{i₁} ⊗ · · · ⊗ e_{i_r}⟩_{H^{⊗r}} ⊗ ⟨g, e_{i₁} ⊗ · · · ⊗ e_{i_r}⟩_{H^{⊗r}}.
We denote by f ⊗̃_r g the symmetrization (average over all permutations of the arguments) of f ⊗_r g. Given an orthonormal basis {v_k : k ≥ 0} of K, the following multiplication formula is satisfied by K-valued multiple Wiener integrals: for two arbitrary basis elements v_i, v_j of K and for f ∈ H^{⊙n} ⊗ K and g ∈ H^{⊙m} ⊗ K, define f_i = ⟨f, v_i⟩_K and g_j = ⟨g, v_j⟩_K. Then
I_n(f_i) I_m(g_j) = Σ_{r=0}^{n∧m} r! (n choose r)(m choose r) I_{n+m−2r}(f_i ⊗̃_r g_j).
(4.4) Finally, the action of the Malliavin derivative operator on a K-valued multiple Wiener integral of the form I n (f ) ∈ L 2 (Ω; K), where f ∈ H n ⊗ K, is given by DI n (f ) = nI n−1 (f (·)) ∈ L 2 (Ω; H ⊗ K).
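In the same finite-dimensional toy model (H = R^d, K = R; all names and choices ours), the Hermite recurrence, the r-th contraction (computable with np.tensordot) and the lowest-order case of the multiplication formula (4.4) can be checked directly: for n = m = 1 the formula reduces to I_1(f) I_1(g) = I_2(sym(f ⊗ g)) + ⟨f, g⟩_H, and this identity even holds pathwise.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4
Z = rng.standard_normal(d)               # one realization of the underlying Gaussian (H = R^d)

def hermite(n, x):
    """Probabilists' Hermite polynomials via H_{n+1}(x) = x H_n(x) - n H_{n-1}(x)."""
    h0, h1 = np.ones_like(x), x
    if n == 0:
        return h0
    for k in range(1, n):
        h0, h1 = h1, x * h1 - k * h0
    return h1

assert np.allclose(hermite(3, np.array([2.0])), 2.0**3 - 3 * 2.0)   # H_3(x) = x^3 - 3x

def contract(f, g, r):
    """r-th contraction f (x)_r g: pair r coordinates of f with r coordinates of g."""
    return np.tensordot(f, g, axes=r)

I1 = lambda h: h @ Z                                  # first-order integral I_1(h)
I2 = lambda B: Z @ B @ Z - np.trace(B)                # second-order integral I_2(B)

# link between chaoses and Hermite polynomials: I_2(h (x) h) = H_2(W(h)) for |h| = 1
u = np.array([1.0, 0.0, 0.0, 0.0])
assert np.isclose(I2(np.outer(u, u)), hermite(2, u @ Z))

# multiplication formula (4.4) for n = m = 1:
#   I_1(f) I_1(g) = I_2(sym(f (x) g)) + 1! * <f, g>_H, pathwise
f, g = rng.standard_normal(d), rng.standard_normal(d)
A = 0.5 * (np.outer(f, g) + np.outer(g, f))           # symmetrized tensor product
assert np.isclose(I1(f) * I1(g), I2(A) + contract(f, g, 1))
```

Here the r = 1 contraction of two vectors is simply their inner product, matching the additive term in the product formula.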

Fourth moment and contraction bounds
In this section, we are going to apply our abstract results to the Dirichlet structure given by the Ornstein-Uhlenbeck generator, acting on L²(Ω; K), where K is a real, separable Hilbert space and the σ-algebra of the underlying probability space is generated by an isonormal Gaussian process W, indexed by a real, separable Hilbert space H. The Ornstein-Uhlenbeck generator, commonly denoted by −L in this context, is then defined as −L = δD. Its spectrum is given by the non-negative integers, and the eigenspace associated to the eigenvalue p ∈ N₀ consists of the K-valued multiple Wiener-Itô integrals of order p. The product formula (4.4) furthermore shows that each of these eigenfunctions has finite moments of all orders. Using this concrete structure, our bounds can be expressed in terms of kernel contractions. In applications, such contractions have proven to be very useful, as they are typically easier to evaluate than moments (see, among many others, [30, 32] in the context of Breuer-Major theorems, for instance).
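The spectral picture has a classical one-dimensional counterpart: on L²(R, γ), the Ornstein-Uhlenbeck generator acts on polynomials as L f(x) = f''(x) − x f'(x) and satisfies L H_p = −p H_p. A short check, a sketch using NumPy's probabilists' Hermite module (the function names and setup are ours), confirms the eigenvalue relation:

```python
import numpy as np
from numpy.polynomial import hermite_e as He   # probabilists' Hermite polynomials He_p = H_p

def ou_generator(c):
    """He-basis coefficients of L f = f'' - x f', the one-dimensional
    Ornstein-Uhlenbeck generator, for f given by its He-basis coefficients c."""
    return He.hermesub(He.hermeder(c, 2), He.hermemulx(He.hermeder(c, 1)))

# -L has spectrum {0, 1, 2, ...}; the eigenfunction for eigenvalue p is H_p:
for p in range(1, 8):
    c = np.zeros(p + 1)
    c[p] = 1.0                                 # coefficient vector of H_p
    residual = He.hermesub(ou_generator(c), -p * c)
    assert np.allclose(residual, 0.0)          # L H_p = -p H_p, exactly
print("L H_p = -p H_p verified for p = 1, ..., 7")
```

Working in the He-coefficient basis keeps the computation exact (no sampling or quadrature), which is why the residual vanishes to machine precision.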
Throughout the rest of this section, we assume that a Dirichlet structure as introduced in the preceding paragraph is given.
We start by proving that the covariance condition (3.6) always holds in the present context.
Proof. Let {e_i : i ∈ N} be an orthonormal basis of K and abbreviate the inner products ⟨F_p, e_i⟩ and ⟨f_p, e_i⟩ by F_{p,i} and f_{p,i}, respectively. Arguing as in the proof of Theorem 3.7 for each pair of indices (i, j), the assertion follows after summing over i and j.
Combined with Theorem 3.2, the contraction bound just obtained yields the following result. As special cases for K = R, Theorem 4.3 includes the main results of [13] and [33] (as usual in finite dimensions, d_2 can be replaced by the Wasserstein distance; see the proof of Theorem 3.2).
Let us now show how the results proved in Section 3.2 can be refined in the Wiener chaos setting. We start with the Fourth Moment Theorem.

Theorem 4.4 (Infinite-dimensional Fourth Moment Theorem). Let Z be a centered Gaussian random variable on K with covariance operator S and, for p ≥ 1, let {F_n : n ∈ N} = {I_p(f_n) : n ∈ N} be a sequence of K-valued multiple integrals with covariance operators S_n such that tr(S_n − S) → 0 as n → ∞. Then, as n → ∞, the following assertions are equivalent.
(i) F_n converges in distribution to Z;
(ii) E‖F_n‖⁴_K → E‖Z‖⁴_K;
(iii) for all r = 1, . . . , p − 1, ‖f_n ⊗_r f_n‖_{H^{⊗(2p−2r)} ⊗ K^{⊗2}} → 0;
(iv) for all r = 1, . . . , p − 1, ‖f_n ⊗̃_r f_n‖_{H^{⊗(2p−2r)} ⊗ K^{⊗2}} → 0;
(v) d_2(F_n, Z) → 0.

Proof. As tr(S_n − S) → 0 as n → ∞, hypercontractivity of the Wiener chaos implies that sup_n E[‖F_n‖^r] < ∞ for any r ≥ 2, which yields that (i) implies (ii) by uniform integrability. Summing (4.5) over i and j and using (3.11) yields the implication (ii) ⇒ (iii) (and also (ii) ⇒ (iv)). The fact that

‖f_n ⊗̃_r f_n‖_{H^{⊗(2p−2r)} ⊗ K^{⊗2}} ≤ ‖f_n ⊗_r f_n‖_{H^{⊗(2p−2r)} ⊗ K^{⊗2}}

gives (iii) ⇒ (iv), and the implication (iv) ⇒ (v) follows by summing (4.8) over i and j. Finally, the implication (v) ⇒ (i) is immediate.

The corresponding Fourth Moment Theorem for random variables with infinite chaos expansion (Theorem 3.14 in Section 3.2) can be expressed using contractions as follows. Let

F_n = Σ_{p=1}^∞ I_p(f_{p,n}),     (4.9)

where, for each n, p ≥ 1, f_{p,n} ∈ H^{⊙p} ⊗ K. Suppose that:

(i) for every p ≥ 1, there exists f_p ∈ H^{⊙p} ⊗ K such that f_{p,n} → f_p in H^{⊗p} ⊗ K as n → ∞;
(ii) for all p ∈ N and r = 1, . . . , p − 1, it holds that ‖f_{p,n} ⊗_r f_{p,n}‖_{H^{⊗2(p−r)} ⊗ K^{⊗2}} → 0.
Then F_n converges in distribution to a centered Gaussian Z with covariance operator S given by

S = Σ_{p=1}^∞ p! ‖f_p‖²_{H^{⊗p}},

where, with some slight abuse of notation, ‖f_p‖²_{H^{⊗p}} ∈ K ⊗ K ≅ L(K, K) denotes the covariance operator of I_p(f_p).
Proof. For p, n ∈ N, let S_p and S_{p,n} be the covariance operators of I_p(f_p) and I_p(f_{p,n}), respectively. Then S_{p,n} − S_p tends to zero as n → ∞ by assumption (i). As tr(S_{p,n}) = E‖I_p(f_{p,n})‖²_K, the rest of the proof can now be done as in Theorem 3.14, using the bound provided by Theorem 4.2.
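The role of the contractions in Theorem 4.4 can be made concrete in the simplest finite-dimensional case K = R, p = 2, H = R^d: there F = I_2(f) for a symmetric matrix f, and the fourth cumulant of F equals 48 tr(f⁴) = 48 ‖f ⊗_1 f‖²_HS, so the fourth moment is controlled exactly by the contraction norm. The following sketch (dimension, seed and quadrature degree are our choices) verifies both identities, computing the moments exactly by Gauss-Hermite quadrature rather than by simulation:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

rng = np.random.default_rng(2)
d = 3
M = rng.standard_normal((d, d))
f = 0.5 * (M + M.T)                                      # symmetric kernel f, H = R^d

# contraction identity: ||f (x)_1 f||^2_HS = tr(f^4)
c1 = np.tensordot(f, f, axes=1)                          # f (x)_1 f (matrix product here)
assert np.isclose(np.sum(c1**2), np.trace(np.linalg.matrix_power(f, 4)))

# F = I_2(f) = Z^T f Z - tr(f); compute E[F^2], E[F^4] exactly by quadrature
nodes, weights = hermegauss(8)                           # exact for polynomials of degree < 16
X = np.stack(np.meshgrid(*([nodes] * d), indexing="ij"), axis=-1).reshape(-1, d)
Wt = np.prod(np.stack(np.meshgrid(*([weights] * d), indexing="ij"),
                      axis=-1).reshape(-1, d), axis=1)
Wt /= (2 * np.pi) ** (d / 2)                             # weights integrate exp(-|x|^2/2)
F = np.einsum("ki,ij,kj->k", X, f, X) - np.trace(f)
m2 = Wt @ F**2
m4 = Wt @ F**4
assert np.isclose(m2, 2 * np.sum(f**2))                  # E[F^2] = 2 ||f||^2_HS
# fourth cumulant equals 48 tr(f^4) = 48 ||f (x)_1 f||^2_HS:
assert np.isclose(m4 - 3 * m2**2, 48 * np.trace(np.linalg.matrix_power(f, 4)))
```

In particular, the fourth cumulant vanishes along a sequence of kernels exactly when the r = 1 contraction norm does, which is the K = R, p = 2 instance of the equivalence (ii) ⇔ (iii) above.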

Quantifying the functional Breuer-Major Theorem
In this section, we will give rates of convergence for a functional version of the seminal Breuer-Major Theorem. To introduce the setting, let X = {X_t : t ≥ 0} be a centered, stationary Gaussian process and define ρ(k) = E[X_0 X_k], so that E[X_s X_t] = ρ(t − s) = ρ(s − t). Assume ρ(0) = 1, denote the standard Gaussian measure on R by γ and let ϕ ∈ L²(R, γ) be of Hermite rank d ≥ 1, so that ϕ can be expanded in the form

ϕ = Σ_{p=d}^∞ a_p H_p,  with a_d ≠ 0.

After its discovery by Breuer and Major (see [8]), it took more than twenty years until progress was made towards quantifying this result. Taking X to be the normalized increment process of a fractional Brownian motion, Nourdin and Peccati ([30]), as an illustration of the Malliavin-Stein method introduced in the same reference, were able to associate rates to the normal convergence of the chaotic projections of the coordinate sequences of U_n, i.e., to the random sequence

(1/√n) Σ_{k=0}^{n−1} H_p(B^H_{k+1} − B^H_k),  n ∈ N,     (5.4)

where H_p denotes the p-th Hermite polynomial and B^H is a fractional Brownian motion with Hurst index H. Note that the random variables defined in (5.4) can be represented as multiple integrals of order p and therefore are elements of the p-th Wiener chaos.
Recently, the Breuer-Major Theorem has been intensively studied, and very strong results have been obtained concerning the coordinate sequence, providing rates of convergence in total variation distance for general functions ϕ under rather weak assumptions (see [37, 29, 34]). Turning to infinite dimensions, it has also been proved recently in [9] and [28] that the process U_n converges in distribution towards a scaled Brownian motion in the Skorohod space or in the space of continuous functions (replacing the Gauss brackets in the sum by a linear interpolation). In this section, it will be shown how, using our bounds, one can associate rates to the aforementioned functional convergences, taking place in a suitable Hilbert space K containing D([0, 1]) and C_0([0, 1]), respectively. The rates are obtained through the contraction bounds established in the previous section, which allow a natural and straightforward lifting of the one-dimensional results. We illustrate this method on [32, Example 2.5], where ϕ = H_p and ρ(k) = |k|^α l(|k|) for some α < 0 and a slowly varying function l. This latter assumption on ρ includes, for example, the case where X is the increment process of a fractional Brownian motion. Also, for simplicity, we set K = L²([0, 1]). Our results also allow the analysis of more general functions ϕ and smaller Hilbert spaces K with finer topologies, such as the Besov-Liouville spaces (see [40] for definitions and [12] for proofs of related functional limit theorems in this space) or other reproducing kernel Hilbert spaces, but as the calculations are more involved and also quite lengthy and technical, we decided to focus on the general picture in this article and will provide full details on this topic in a dedicated follow-up work.
The statement is as follows.
Theorem 5.1. Let {U_n(t) : t ∈ [0, 1]} be the stochastic process defined in (5.2), considered as a sequence of random variables taking values in L²([0, 1]). Assume that ϕ = H_p for some p ∈ N and that the covariance function ρ of the underlying centered, stationary Gaussian process is of the form ρ(k) = |k|^α l(|k|), where α < −1/p and l is a slowly varying function. Then there exists a constant C > 0 such that

d_2(U_n, σW) ≤ C r_α(n),     (5.5)

where σ is defined in (5.3), W denotes a standard Brownian motion on L²([0, 1]) and r_α denotes the associated rate function. The proof concludes with the kernel estimate

|k_n(s, t)| ≤ C (n^{−1} + n^{αp+1} l(n)),

as asserted.
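As a numerical sanity check of the variance normalization behind Theorem 5.1, one can compute, for ϕ = H_p and fractional Gaussian noise, the exact variance of U_n(1) and watch it approach the limiting variance σ² = p! Σ_k ρ(k)^p. The sketch below assumes, as is standard for this theorem, U_n(t) = n^{−1/2} Σ_{k=0}^{⌊nt⌋−1} ϕ(X_k); the choices p = 2, H = 0.6 (hence α = 2H − 2 = −0.8 < −1/p) and the truncation level are ours. The speed of this variance convergence is part of what the rate in (5.5) quantifies.

```python
import numpy as np
from math import factorial

H, p = 0.6, 2                    # fractional Gaussian noise, phi = H_p
assert 2 * H - 2 < -1 / p        # Breuer-Major condition alpha = 2H - 2 < -1/p

def rho(k):
    """Covariance of fractional Gaussian noise (unit increments of fBm)."""
    k = np.abs(k).astype(float)
    return 0.5 * (np.abs(k + 1) ** (2 * H) + np.abs(k - 1) ** (2 * H) - 2 * k ** (2 * H))

# limiting variance sigma^2 = p! * sum_k rho(k)^p, tail truncated at K
K = 100_000
sigma2 = factorial(p) * (rho(np.array([0]))[0] ** p
                         + 2 * np.sum(rho(np.arange(1, K)) ** p))

def var_Un1(n):
    """Exact variance of U_n(1) = n^{-1/2} sum_{k<n} H_p(X_k), using
    E[H_p(X_i) H_p(X_j)] = p! rho(i - j)^p."""
    k = np.arange(-(n - 1), n)
    return factorial(p) / n * np.sum((n - np.abs(k)) * rho(k) ** p)

errs = [abs(var_Un1(n) - sigma2) for n in (100, 1000, 10_000)]
print(errs)                      # decreasing with n: the convergence the CLT rate quantifies
assert errs[2] < errs[1] < errs[0]
```

Replacing ρ by a general covariance of the form |k|^α l(|k|) with α < −1/p leaves the computation unchanged, since only the summability of ρ^p enters.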