New Berry-Esseen bounds for non-linear functionals of Poisson random measures

This paper deals with the quantitative normal approximation of non-linear functionals of Poisson random measures, where the quality is measured by the Kolmogorov distance. Combining Stein's method with the Malliavin calculus of variations on the Poisson space, we derive a bound, which is strictly smaller than what is available in the literature. This is applied to sequences of multiple integrals and sequences of Poisson functionals having a finite chaotic expansion. This leads to new Berry-Esseen bounds in de Jong's theorem for degenerate U-statistics. Moreover, geometric functionals of intersection processes of Poisson $k$-flats, random graph statistics of the Boolean model and non-linear functionals of Ornstein-Uhlenbeck-L\'evy processes are considered.


Introduction
Combining Stein's method with the Malliavin calculus of variations in order to deduce quantitative limit theorems for non-linear functionals of random measures has become an active direction of research in recent years. Results in this area usually deal either with functionals of a Gaussian random measure or with functionals of a Poisson random measure. Findings for the Gaussian case have notable applications in the theory and statistics of Gaussian random processes [3,4] (most prominently the fractional Brownian motion [15]), spherical random fields [1,14], random matrix theory [17] and universality questions [19], whereas the findings for Poisson random measures have found applications in stochastic geometry [10,13,36,37], non-parametric Bayesian survival analysis [7,23] and the theory of U-statistics [32,33,35].
The present paper deals with quantitative central limit theorems for Poisson functionals (that is, functionals of a Poisson random measure). Whereas most of the existing literature, such as [24,25,32,33], relies on smooth distances, such as the Wasserstein distance or distances based on twice or thrice differentiable test functions, to measure the quality of the probabilistic approximation, our results deal with the non-smooth Kolmogorov distance. This is the maximal deviation of the involved distribution functions, which we consider more intuitive and informative; let us agree to call a quantitative central limit theorem using the Kolmogorov distance a Berry-Esseen bound or theorem in what follows. Similar results for the Kolmogorov distance have previously appeared in [35]. Whereas in that paper a bound is derived using a case-by-case study around the non-differentiability point of the solution of the Stein equation and an analysis of the second-order derivative, we use a version of Stein's method which circumvents such a case distinction completely and avoids the use of second-order derivatives. This provides new bounds, which differ in parts from those in [35]. In particular, our bounds are strictly smaller and also improve the constants appearing in [35].
Our general result, Theorem 3.1 below, is applied to sequences of (compensated) multiple integrals, which are the basic building blocks of the so-called Wiener-Itô chaos associated with a Poisson random measure. We provide new Berry-Esseen bounds for the normal approximation of such sequences. Besides our plug-in result, the main technical tool we use is an isometric formula for the Skorohod integral on the Poisson space. In the context of normal approximation this approach is new, although it has previously been applied in [32] for studying approximations by Gamma random variables. In a next step, this is applied to derive a new quantitative version of de Jong's theorem for degenerate U-statistics based on a Poisson measure. As far as we know, this is the first Berry-Esseen-type version of de Jong's theorem. In a particular case we shall show that the speed of convergence of the quotient of the fourth moment and the squared variance to 3 (which is de Jong's original condition ensuring a central limit theorem) also controls the rate of convergence measured by the Kolmogorov distance. As a second main application, we shall consider Poisson functionals having a finite chaotic expansion. Examples of such functionals are provided by non-degenerate U-statistics. We then study the normal approximation of such (suitably normalized) functionals and provide concrete applications to geometric functionals of intersection processes of Poisson k-flats and random graph statistics of the Boolean model, as considered in stochastic geometry, and to empirical means and second-order moments of Ornstein-Uhlenbeck-Lévy processes. In this context, our new bound simplifies considerably the necessary computations and avoids a subtle technical issue present in [35]. One of the main technical tools is again an isometric formula for the Skorohod integral.
Our text is structured as follows: In Section 2 we introduce the necessary notions and notation and recall some important background material in order to make the paper self-contained. Our general bound for the normal approximation of Poisson functionals is the content of Section 3. Our applications are presented in Section 4. In particular, Section 4.1 deals with multiple Poisson integrals, Section 4.2 with de Jong's theorem for degenerate U-statistics and Section 4.3 with non-degenerate U-statistics and general Poisson functionals having a finite chaotic expansion, as well as with our concrete applications to stochastic geometry and Lévy processes.

Preliminaries
Poisson random measures. Let (Z, Z) be a standard Borel space, which is equipped with a σ-finite measure µ. By η we denote a Poisson (random) measure on Z with control µ, which is defined on an underlying probability space (Ω, F, P). That is, η = {η(B) : B ∈ Z_0} is a collection of random variables indexed by the elements of Z_0 = {B ∈ Z : µ(B) < ∞} such that η(B) is Poisson distributed with mean µ(B) for each B ∈ Z_0, and for all n ∈ N, the random variables η(B_1), ..., η(B_n) are independent whenever B_1, ..., B_n are disjoint sets from Z_0 (the second property follows automatically from the first one if the measure µ does not have atoms, cf. [6, Theorem VI.5.16] or [34, Corollary 3.2.2]). The distribution of η (on the space of σ-finite counting measures on Z) will be denoted by P_η. For more details see [6, Chapter VI] and [34, Chapter 3].
L^1- and L^2-spaces. For n ∈ N let us denote by L^1(µ^n) and L^2(µ^n) the spaces of integrable and square-integrable functions with respect to µ^n, respectively. The scalar product and the norm in L^2(µ^n) are denoted by ⟨·, ·⟩_n and ‖·‖_n, respectively. From now on, we will omit the index n, as it will always be clear from the context. Moreover, let us denote by L^2(P_η) the space of square-integrable functionals of a Poisson random measure η. Finally, we denote by L^2(P, L^2(µ)) the space of jointly measurable mappings h : Ω × Z → R such that \int_Ω \int_Z h(ω, z)^2 µ(dz) P(dω) < ∞ (recall that (Ω, F, P) is our underlying probability space).
Chaos expansion. It is a crucial feature of a Poisson measure η that any F ∈ L^2(P_η) can be written as

F = E F + \sum_{n=1}^\infty I_n(f_n),   (2.1)

where the sum converges in the L^2-sense, see [12, Theorem 1.3]. Here, I_n stands for the n-fold Wiener-Itô integral (sometimes called Poisson multiple integral) with respect to the compensated Poisson measure η − µ and for each n ∈ N, f_n is a uniquely determined symmetric function in L^2(µ^n) (depending, of course, on F). In particular, the multiple integrals are centred random variables and orthogonal in the sense that

E[I_{q_1}(f_1) I_{q_2}(f_2)] = 1(q_1 = q_2) q_1! ⟨f_1, f_2⟩   (2.2)

for all integers q_1, q_2 ≥ 1 and symmetric functions f_1 ∈ L^2(µ^{q_1}) and f_2 ∈ L^2(µ^{q_2}). The representation (2.1) is called the chaotic expansion of F and we say that F has a finite chaotic expansion if only finitely many of the functions f_n are non-vanishing. In particular, (2.1) together with the orthogonality of multiple stochastic integrals leads to the variance formula

var(F) = \sum_{n=1}^\infty n! ‖f_n‖^2.   (2.3)

Malliavin operators. For a functional F = F(η) of a Poisson measure η let us introduce the difference operator D_z F by putting

D_z F := F(η + δ_z) − F(η),  z ∈ Z,

where δ_z stands for the unit-mass Dirac measure at z. D_z F is also called the add-one-cost operator, as it measures the effect on F of adding the point z ∈ Z to η. If F has a chaotic representation as at (2.1) such that \sum_{n=1}^\infty n · n! ‖f_n‖^2 < ∞ (we write F ∈ dom(D) in this case), then D_z F can alternatively be characterized as

D_z F = \sum_{n=1}^\infty n I_{n−1}(f_n(z, ·)),

where f_n(z, ·) is the function f_n with one of its arguments fixed to be z. We remark that DF is an element of L^2(P, L^2(µ)). Besides D, let us also introduce three other Malliavin operators L, L^{−1} and δ. If F satisfies \sum_{n=1}^\infty n^2 · n! ‖f_n‖^2 < ∞ (we write F ∈ dom(L)), the Ornstein-Uhlenbeck generator is defined by

L F = −\sum_{n=1}^\infty n I_n(f_n)

and its inverse is denoted by L^{−1}. In terms of the chaos expansion of a centred random variable F ∈ L^2(P_η), i.e. E(F) = 0, it is given by

L^{−1} F = −\sum_{n=1}^\infty (1/n) I_n(f_n).

Finally, if z ↦ h(z) is a random function on Z with chaos expansion h(z) = h_0(z) + \sum_{n=1}^\infty I_n(h_n(z, ·)) with symmetric functions h_n(z, ·) ∈ L^2(µ^n) such that \sum_{n=0}^\infty (n+1)! ‖\tilde h_n‖^2 < ∞ (let us write h ∈ dom(δ) if this is satisfied), the Skorohod integral δ(h) of h is defined as

δ(h) := \sum_{n=0}^\infty I_{n+1}(\tilde h_n),

where \tilde h_n is the canonical symmetrization of h_n as a function of n + 1 variables. The next lemma summarizes the relationships between the operators D, δ and L, the classical and a modified integration-by-parts formula (taken from [35, Lemma 2.3]) as well as an isometric formula for Skorohod integrals, which is Proposition 6.5.4 in [31].
Lemma 2.1. (i) For every F ∈ dom(L) it holds that F ∈ dom(D) and DF ∈ dom(δ), and

δ(D F) = −L F.   (2.4)

(ii) We have the integration-by-parts formula

E[F δ(h)] = E[⟨D F, h⟩]   (2.5)

for every F ∈ dom(D) and h ∈ dom(δ).
(iii) Suppose that F ∈ L^2(P_η) (not necessarily assuming that F belongs to the domain of D), that h ∈ dom(δ) has a finite chaotic expansion and that D_z 1(F > x) h(z) ≥ 0 for any x ∈ R and µ-almost all z ∈ Z. Then

E[1(F > x) δ(h)] = E[⟨D 1(F > x), h⟩].   (2.6)

(iv) For every h ∈ dom(δ) with a finite chaotic expansion we have the isometric formula

E[δ(h)^2] = E \int_Z h(z)^2 µ(dz) + E \int_Z \int_Z (D_y h(z)) (D_z h(y)) µ(dy) µ(dz).   (2.7)

We refer the reader to [21] or [24] for more details and background material concerning the Malliavin formalism on the Poisson space. Moreover, we refer to [12] for a pathwise interpretation of the Skorohod integral.
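To illustrate these objects in the simplest possible case, consider F = η(B) for some B ∈ Z_0; the following one-line computation is standard and is included only for orientation:

```latex
% Chaos expansion of a Poisson variable: \eta(B) = \mu(B) + I_1(\mathbf{1}_B),
% i.e. f_1 = \mathbf{1}_B and f_n = 0 for all n \ge 2. Consequently,
D_z \eta(B) = (\eta + \delta_z)(B) - \eta(B) = \mathbf{1}_B(z),
\qquad
L^{-1}\bigl(\eta(B) - \mu(B)\bigr) = -I_1(\mathbf{1}_B),
```

so that −D_z L^{−1}(η(B) − µ(B)) = 1_B(z) and ⟨D F, −D L^{−1} F⟩_{L^2(µ)} = \int_Z 1_B(z)^2 µ(dz) = µ(B) = var(η(B)).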
Product formula. Let q_1, q_2 ≥ 1 be integers and f_1 ∈ L^2(µ^{q_1}) and f_2 ∈ L^2(µ^{q_2}) be symmetric functions. In terms of the contractions f_1 ⋆_r^ℓ f_2 of f_1 and f_2, one can express the product of I_{q_1}(f_1) and I_{q_2}(f_2) as follows:

I_{q_1}(f_1) I_{q_2}(f_2) = \sum_{r=0}^{min(q_1, q_2)} r! \binom{q_1}{r} \binom{q_2}{r} \sum_{ℓ=0}^{r} \binom{r}{ℓ} I_{q_1+q_2−r−ℓ}(\widetilde{f_1 ⋆_r^ℓ f_2}),   (2.8)

where \widetilde{f_1 ⋆_r^ℓ f_2} denotes the canonical symmetrization of f_1 ⋆_r^ℓ f_2; see [26, Proposition 6.5.1].
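For the reader's convenience, we also record the standard definition of the contraction kernels (in the notation of [26]): for symmetric f ∈ L^2(µ^p), g ∈ L^2(µ^q) and integers 0 ≤ ℓ ≤ r ≤ min(p, q), the contraction f ⋆_r^ℓ g identifies r of the arguments of f and g and integrates ℓ of them out:

```latex
(f \star_r^\ell g)(y_1,\dots,y_{r-\ell},\, s_1,\dots,s_{p-r},\, t_1,\dots,t_{q-r})
  := \int_{Z^\ell} f(z_1,\dots,z_\ell,\, y_1,\dots,y_{r-\ell},\, s_1,\dots,s_{p-r})
     \, g(z_1,\dots,z_\ell,\, y_1,\dots,y_{r-\ell},\, t_1,\dots,t_{q-r})
     \; \mu^\ell\bigl(d(z_1,\dots,z_\ell)\bigr).
```

In particular, f ⋆_p^0 f = f^2 (all arguments identified, none integrated out) and f ⋆_p^p f = ‖f‖^2.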
Technical assumptions. Whenever we deal with a multiple stochastic integral, a sequence F_n = I_q(f_n) or a finite sum F_n = \sum_{i=1}^k I_{q_i}(f_n^{(i)}) of such integrals with integers k ≥ 1, q_i ≥ 1 for i = 1, ..., k, and symmetric functions f_n ∈ L^2(µ^q) or f_n^{(i)} ∈ L^2(µ^{q_i}), we will (implicitly) assume that the following technical conditions are satisfied (for sequences of single integrals, the upper index has to be ignored):
i) for any i ∈ {1, ..., k} and any r ∈ {1, ..., q_i} and ℓ ∈ {1, ..., r}, the contraction f_n^{(i)} ⋆_r^ℓ f_n^{(i)} is an element of L^2(µ^{2q_i−r−ℓ});
ii) for any r ∈ {1, ..., q_i}, ℓ ∈ {1, ..., r} and (z_1, ..., z_{2q_i−r−ℓ}) ∈ Z^{2q_i−r−ℓ} we have that (|f_n^{(i)}| ⋆_r^ℓ |f_n^{(i)}|)(z_1, ..., z_{2q_i−r−ℓ}) is well defined and finite;
iii) for any i, j ∈ {1, ..., k}, any m ∈ {max(|q_i − q_j|, 1), ..., q_i + q_j − 2} and any r and ℓ satisfying q_i + q_j − r − ℓ = m, the kernel f_n^{(i)} ⋆_r^ℓ f_n^{(j)} arising in the product formula (2.8) is well defined and square-integrable.
For a detailed explanation of the rôle of these conditions we refer to [10] or [24], but we remark that these technical assumptions ensure in particular that F_n^2 is an element of L^2(P_η), such that {E F_n^4 : n ∈ N} is a bounded sequence. We finally note that (iii) is automatically satisfied if the control measure µ of the Poisson measure η is finite (just apply the Cauchy-Schwarz inequality).
Probability metrics. To measure the distance between the distributions of two random variables X and Y defined on a common probability space (Ω, F, P), one often uses distances of the form

d_H(X, Y) = \sup_{h ∈ H} |E[h(X)] − E[h(Y)]|,

where H is a suitable class of real-valued test functions (note that we slightly abuse notation by writing d(X, Y) instead of d(law(X), law(Y))). Prominent examples are the class H_W of Lipschitz functions with Lipschitz constant bounded by one and the class H_K of indicator functions of intervals (−∞, x] with x ∈ R. The resulting distances d_W := d_{H_W} and d_K := d_{H_K} are usually called the Wasserstein and the Kolmogorov distance, respectively. We notice that d_W(X_n, Y) → 0 or d_K(X_n, Y) → 0 as n → ∞ for a sequence of random variables X_n implies convergence of X_n to Y in distribution (the converse is not necessarily true, but holds for the Kolmogorov distance if the target random variable Y has a density with respect to the Lebesgue measure on R).
Stein's method. A standard Gaussian random variable Z is characterized by the fact that for every absolutely continuous function f : R → R with E|f′(Z)| < ∞ it holds that

E[f′(Z)] = E[Z f(Z)].

This together with the definition of the Kolmogorov distance is the motivation to study the Stein equation

f′(w) − w f(w) = 1(w ≤ x) − Φ(x),  w ∈ R,   (2.9)

in which x ∈ R is fixed and Φ(x) = P(Z ≤ x) denotes the distribution function of Z. A solution of the Stein equation is a function f_x, depending on x, which satisfies (2.9). The bounded solution of the Stein equation is given by

f_x(w) = e^{w^2/2} \int_{−∞}^{w} (1(s ≤ x) − Φ(x)) e^{−s^2/2} ds

and satisfies 0 < f_x(w) ≤ \sqrt{2π}/4 for all w ∈ R. Moreover, we observe that f_x is continuous on R, infinitely differentiable on R \ {x}, but not differentiable at x. However, interpreting the derivative of f_x at x as 1 − Φ(x) + x f_x(x) in view of (2.9), we have

‖f_x′‖_∞ ≤ 1   (2.10)

according to [5, Lemma 2.3]. Moreover, let us recall from the same result that f_x satisfies the bound

|(w + u) f_x(w + u) − w f_x(w)| ≤ (|w| + \sqrt{2π}/4) |u|,  w, u ∈ R.   (2.11)

If we replace w by a random variable W and take expectations in the Stein equation (2.9), we infer that

E[f_x′(W)] − E[W f_x(W)] = P(W ≤ x) − Φ(x).   (2.12)

Taking the supremum over x ∈ R of the absolute value, we identify the quantity on the left-hand side of (2.12) as the Kolmogorov distance between (the laws of) W and the standard Gaussian variable Z.
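The Gaussian characterization invoked above follows from a one-line integration by parts against the standard normal density φ(z) = (2π)^{−1/2} e^{−z^2/2}, which satisfies φ′(z) = −z φ(z):

```latex
\mathbb{E}[Z f(Z)]
  = \int_{\mathbb{R}} f(z)\, z\, \varphi(z)\, dz
  = -\int_{\mathbb{R}} f(z)\, \varphi'(z)\, dz
  = \int_{\mathbb{R}} f'(z)\, \varphi(z)\, dz
  = \mathbb{E}[f'(Z)],
```

where the boundary terms vanish because f(z)φ(z) → 0 as |z| → ∞ for the functions under consideration.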

General Malliavin-Stein bounds
Our first contribution in this paper is a new bound for the Kolmogorov distance d K (F, Z) between a Poisson functional F and a standard Gaussian random variable Z. Our bound involves the Malliavin operators D and L −1 as introduced in the previous section. Moreover, our set-up is that (Z, Z ) is a standard Borel space which is equipped with a σ-finite measure µ and that η is a Poisson measure on Z with control µ.
Theorem 3.1. Let F ∈ L^2(P_η) be such that E F = 0 and F ∈ dom(D), and denote by Z a standard Gaussian random variable. Then

d_K(F, Z) ≤ E|1 − ⟨D F, −D L^{−1} F⟩| + (\sqrt{2π}/8) E⟨(D F)^2, |D L^{−1} F|⟩ + (1/2) E⟨(D F)^2, |F · D L^{−1} F|⟩ + \sup_{x ∈ R} E⟨(D F) D 1(F > x), |D L^{−1} F|⟩,

where we use the standard notation that

E⟨(D F)^2, |D L^{−1} F|⟩ := E \int_Z (D_z F)^2 |D_z L^{−1} F| µ(dz)

(and similarly for the other terms).

Remark 3.2.
• Comparing our bound with that for d_K(F, Z) from [35], we see that the result in [35] involves the additional term E⟨(D F)^2, |D F · D L^{−1} F|⟩, implying that the bound in [35] is strictly larger than ours. In addition, Theorem 3.1 improves the constants ibidem.
• Our bound should be compared with a similar bound from [24, Theorem 3.1] for the Wasserstein distance between F and Z. Namely, if F ∈ dom(D) and E F = 0, then Theorem 3.1 in [24] states that

d_W(F, Z) ≤ E|1 − ⟨D F, −D L^{−1} F⟩| + E \int_Z (D_z F)^2 |D_z L^{−1} F| µ(dz).

Our bound involves additional terms, reflecting the effect that our test functions are indicator functions of intervals (−∞, x], x ∈ R, in contrast to Lipschitz functions with Lipschitz constant bounded by one.
• It is well known that the Wasserstein and the Kolmogorov distance are related by

d_K(F, Z) ≤ 2 \sqrt{d_W(F, Z)}.   (3.1)

However, this inequality leads to bounds for d_K(F, Z) which are systematically larger than the bounds obtained from Theorem 3.1. For instance, if the control of η is given by nµ with an integer n ≥ 1 and if we denote the Poisson functional by F_n in order to indicate its dependence on n, then we often have that d_W(F_n, Z) ≤ c_W n^{−1/2} and d_K(F_n, Z) ≤ c_K n^{−1/2} for constants c_W, c_K > 0, whereas (3.1) would deliver the suboptimal rate n^{−1/4} for d_K(F_n, Z) only (see Examples 4.12, 4.13 and 4.15 for instance).
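The square-root loss in this comparison stems from the usual smoothing argument; the following sketch of one of the two one-sided estimates (with h_ε a piecewise-linear smoothing of the indicator, introduced here only for illustration) shows where it comes from:

```latex
% Let h_\varepsilon equal 1 on (-\infty, x], 0 on [x + \varepsilon, \infty),
% and linear in between, so that h_\varepsilon is (1/\varepsilon)-Lipschitz. Then
\mathbb{P}(F \le x) - \Phi(x)
  \le \mathbb{E}[h_\varepsilon(F)] - \mathbb{E}[h_\varepsilon(Z)]
      + \mathbb{E}[h_\varepsilon(Z) - \mathbf{1}(Z \le x)]
  \le \frac{1}{\varepsilon}\, d_W(F, Z) + \frac{\varepsilon}{\sqrt{2\pi}},
% since h_\varepsilon(Z) - \mathbf{1}(Z \le x) \le \mathbf{1}(x < Z \le x + \varepsilon)
% and the standard Gaussian density is bounded by 1/\sqrt{2\pi}.
```

Optimizing over ε > 0 gives a bound of order \sqrt{d_W(F, Z)}, which turns a rate n^{−1/2} for the Wasserstein distance into only n^{−1/4} for the Kolmogorov distance.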
• Other examples for bounds between the law of a Poisson functional and some target random variable in the spirit of Theorem 3.1 are the paper [27] dealing with the multivariate normal approximation (with applications in [13]), the paper [32] considering the approximation by a Gamma random variable as well as [22], in which the Chen-Stein method for Poisson approximation has been investigated (see also [36,37] for applications of this result).
• We finally remark that a counterpart of Theorem 3.1 holds if F is a functional of a Gaussian random measure on Z with control µ.

Proof of Theorem 3.1. Let f := f_x be the bounded solution of the Stein equation (2.9) (note that the argument below is not influenced by the fact that f′ only exists as a left- or right-sided derivative at t = x). Next, applying (2.4) and the integration-by-parts formula (2.5) in this order yields

E[F f(F)] = E[⟨D f(F), −D L^{−1} F⟩].   (3.3)

We now replace w by F in the Stein equation (2.9), take expectations and use (3.2) as well as (3.3) to see that

P(F ≤ x) − Φ(x) = E[f′(F)(1 − ⟨D F, −D L^{−1} F⟩)] − E \int_Z (f(F + D_z F) − f(F) − (D_z F) f′(F)) (−D_z L^{−1} F) µ(dz).   (3.4)

Let us consider for fixed z ∈ Z the integral in the second term. Since f is a solution of the Stein equation (2.9), we have that

f(F + D_z F) − f(F) − (D_z F) f′(F) = \int_0^{D_z F} (f′(F + t) − f′(F)) dt = I_1 + I_2   (3.5)

with

I_1 := \int_0^{D_z F} ((F + t) f(F + t) − F f(F)) dt  and  I_2 := \int_0^{D_z F} (1(F + t ≤ x) − 1(F ≤ x)) dt.

Now, the integrand in I_1 can be bounded by means of (2.11), which yields

|I_1| ≤ (1/2) (|F| + \sqrt{2π}/4) (D_z F)^2.

To bound I_2, we consider the cases D_z F < 0 and D_z F ≥ 0 separately and write I_2 = I_{2,<0} + I_{2,≥0} accordingly. For the first term, we arrive at the following estimate for I_{2,<0}:

|I_{2,<0}| ≤ |D_z F| 1(F + D_z F ≤ x < F),

where the equality in the last line follows by considering the cases F ≤ x and F + D_z F ≤ x < F separately (note that the remaining cases cannot contribute). For the second case, similar arguments lead to the upper bound

|I_{2,≥0}| ≤ |D_z F| 1(F ≤ x < F + D_z F).

Thus, for I_2 = I_{2,<0} + I_{2,≥0} we have that

|I_2| ≤ (D_z F) (D_z 1(F > x)).

Together with the bound for I_1 and the fact that |f′(w)| ≤ 1 for all w ∈ R, we conclude from (3.4) the asserted bound. The final result follows in view of (2.12) by taking the supremum over all x ∈ R.
Let us draw a consequence of Theorem 3.1, which will in our applications below serve as a kind of plug-in theorem. It provides a more convenient form of the bound for the Kolmogorov distance, which will be applied in the context of Theorem 4.1 and Theorem 4.8.
Proof. Since \sqrt{2π}/8 < 1/2, the result follows immediately by applying the Cauchy-Schwarz inequality twice and the Minkowski inequality, and then by using the bound provided by Theorem 3.1.
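The kind of estimate meant here can be sketched as follows (a computation under the stated assumptions): for fixed z ∈ Z, two applications of the Cauchy-Schwarz inequality give

```latex
\mathbb{E}\bigl[(D_z F)^2\, |D_z L^{-1} F|\, (|F| + 1)\bigr]
  \le \bigl(\mathbb{E}[(D_z F)^4]\bigr)^{1/2}
      \bigl(\mathbb{E}[(D_z L^{-1} F)^2 (|F| + 1)^2]\bigr)^{1/2}
  \le \bigl(\mathbb{E}[(D_z F)^4]\bigr)^{1/2}
      \bigl(\mathbb{E}[(D_z L^{-1} F)^4]\bigr)^{1/4}
      \bigl(\mathbb{E}[(|F| + 1)^4]\bigr)^{1/4},
```

and (E(|F| + 1)^4)^{1/4} ≤ (E F^4)^{1/4} + 1 by the Minkowski inequality; this is the origin of the factor (E F^4)^{1/4} + 1 appearing in the proof of Theorem 4.8 below.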
As a particular case, let us consider a multiple integral F = I_q(f) of arbitrary order q ≥ 1 with a symmetric function f ∈ L^2(µ^q). In this case we have D L^{−1} F = −q^{−1} D F, so that the second term of the bound in Theorem 3.1 reads as

(\sqrt{2π}/(8q)) E \int_Z |D_z F|^3 µ(dz).

Hence, the second term of the bound in Corollary 3.3 can be estimated from above accordingly. This set-up will further be exploited in Section 4.1 below. We refer the reader to [24, Theorem 3.1] for a similar bound for the Wasserstein distance between I_q(f) and Z.

Multiple integrals
In this section we consider a sequence of multiple integrals F n := I q (f n ) for a fixed integer q ≥ 2 and with functions f n ∈ L 2 (µ q ) satisfying the technical assumptions presented in Section 2. Moreover, we shall assume that for each n ∈ N, η n is a Poisson random measure on (Z, Z ) with control µ n , where for each n ∈ N, µ n is a σ-finite measure on Z. In what follows, norms and scalar products involving functions f n are always taken with respect to µ n .
Theorem 4.1. Suppose that

\lim_{n→∞} q! ‖f_n‖^2 = 1  and  \lim_{n→∞} ‖f_n ⋆_r^ℓ f_n‖ = 0   (4.1)

for all r and ℓ as specified below. Then F_n converges in distribution to a standard Gaussian random variable Z and for any n ∈ N we have the following bound on the Kolmogorov distance between F_n and Z:

d_K(F_n, Z) ≤ C max{ |1 − q! ‖f_n‖^2|, ‖f_n ⋆_r^ℓ f_n‖, ‖f_n ⋆_r^ℓ f_n‖^{3/2} }

with a constant C > 0 only depending on q and where the maximum runs over all r and ℓ such that either r = q and ℓ = 0 or r ∈ {1, ..., q} and ℓ ∈ {1, ..., min(r, q − 1)}.

Remark 4.2.
• Note that the assumption and the estimate for the Kolmogorov distance in Theorem 4.1 involve the contraction kernel f_n ⋆_q^0 f_n = f_n^2. In particular, the condition that ‖f_n ⋆_q^0 f_n‖ = ‖f_n^2‖ → 0 as n → ∞ is actually a condition on the L^4-norm of f_n.
• Under condition (4.1) we have that ‖f_n ⋆_r^ℓ f_n‖^{3/2} is smaller than ‖f_n ⋆_r^ℓ f_n‖ for sufficiently large indices n, so that d_K(F_n, Z) is asymptotically dominated by ‖f_n ⋆_r^ℓ f_n‖ or the variance difference |var(Z) − var(F_n)| = |1 − q! ‖f_n‖^2|.
• It is worth comparing our bound with the one from [24, Theorem 4.2] for the Wasserstein distance.
• A bound for d_K(F_n, Z) with F_n = I_q(f_n) as in Theorem 4.1 could in principle also be derived using the techniques provided by [35]. However, this leads to an expression which is systematically larger than ours, as it involves contractions of the absolute value of f_n.
• Similar statements for sequences of multiple integrals with respect to a Gaussian random measure can be found in [16, Proposition 3.2] for instance. In this case, it is sufficient that q! ‖f_n‖^2 → 1 and that ‖f_n ⋆_r^r f_n‖ → 0, as n → ∞, to conclude a central limit theorem. Note that in the Poisson case, assumption (4.1) also involves contractions f_n ⋆_r^ℓ f_n with r ≠ ℓ, which for general f_n seems unavoidable (see however [28] for the case of so-called homogeneous sums, where it suffices to control f_n ⋆_r^r f_n).
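The identification in the first remark above can be made explicit: taking r = q and ℓ = 0 in the contraction f_n ⋆_r^ℓ f_n identifies all q arguments and integrates none of them out, so that

```latex
(f_n \star_q^0 f_n)(z_1,\dots,z_q) = f_n(z_1,\dots,z_q)^2,
\qquad
\|f_n \star_q^0 f_n\| = \Bigl(\int_{Z^q} f_n^4\, d\mu_n^q\Bigr)^{1/2}
                      = \|f_n\|_{L^4(\mu_n^q)}^2,
```

so that ‖f_n ⋆_q^0 f_n‖ → 0 is precisely the convergence of the L^4-norm of f_n to zero.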
Proof of Theorem 4.1. Let us introduce the sequences where here and below F n stands for I q (f n ) (recall that in our set-up the norms and scalar products are with respect to µ n ). Then Corollary 3.3 delivers the bound Thus, we shall show that A 1 (F n ), A 2 (F n ) × A 3 (F n ), and A 4 (F n ) vanish asymptotically, as n → ∞.
For A_1(F_n), we use Theorem 4.2 in [24], in particular Equation (4.14) ibidem. Next, for A_2(F_n) we use that, by definition, D L^{−1} F_n = −q^{−1} D F_n (see Example 3.4). Hence, we can again apply Theorem 4.2 in [24], this time Equations (4.32) and (4.18) ibidem, to deduce the desired bound. Concerning A_3(F_n), let us first rewrite it in a convenient form.

Now, use Jensen's inequality to see that
Hence, we conclude that A_3(F_n) is a bounded sequence, since the functions f_n satisfy the technical assumptions. Finally, let us consider the sequence A_4(F_n). We will adapt in parts the strategy of the proof of Proposition 2.3 in [32] to derive a bound for A_4(F_n). First, define the mapping u ↦ Ξ(u) := u|u| from R to R and observe that it satisfies the estimate |Ξ(u) − Ξ(v)| ≤ (|u| + |v|) |u − v| for all u, v ∈ R. To apply the modified integration-by-parts formula (2.6) we need to check that D_z 1(F_n > x) Ξ(D_z F_n) |D L^{−1} F_n| ≥ 0. In view of the definition of Ξ, it is sufficient to show that (D_z 1(F_n > x)) (D_z F_n) ≥ 0. To prove this, consider the two cases F_n ≤ D_z F_n + F_n and F_n > D_z F_n + F_n separately. In the first case we have D_z F_n ≥ 0 and D_z 1(F_n > x) ∈ {0, 1}, whereas in the second case it holds that D_z F_n < 0 along with D_z 1(F_n > x) ∈ {−1, 0}. Thus, (D_z 1(F_n > x)) (D_z F_n) ≥ 0, and hence D_z 1(F_n > x) Ξ(D_z F_n) |D L^{−1} F_n| ≥ 0. This allows us to apply the modified integration-by-parts formula (2.6) to A_4(F_n). Now, the Skorohod isometric formula (2.7) yields

Clearly, A_4^{(1)}(F_n) = q A_2(F_n), recall (4.2). As seen in the proof of Lemma 4.3 in [32], the term A_4^{(2)}(F_n) can be estimated by means of the Cauchy-Schwarz inequality. This leaves the term A_4^{(3)}(F_n). Arguing again as at [32, Page 549], one infers that A_4^{(3)}(F_n) is bounded by linear combinations of ‖f_n ⋆_a^b f_n‖^2 with a and b as above. Consequently, our assumptions (4.1) imply that d_K(F_n, Z) → 0, as n → ∞. This yields the desired convergence in distribution of F_n to Z. The precise bound for d_K(F_n, Z) follows implicitly from the computations performed above.

Corollary 4.3. Fix an integer q ≥ 2 and assume that {f_n : n ∈ N} is a sequence of non-negative, symmetric functions in L^2(µ^q) which satisfy the technical assumptions. In addition, suppose that E I_q^2(f_n) = 1 for all n ∈ N. Then there is a constant C > 0, only depending on q, such that

d_K(I_q(f_n), Z) ≤ C \sqrt{E I_q^4(f_n) − 3}   (4.6)

for sufficiently large n, where Z is a standard Gaussian random variable. Moreover, if the sequence {I_q(f_n) : n ∈ N} is uniformly integrable, then I_q(f_n) converges in distribution to a standard Gaussian random variable if and only if E I_q^4(f_n) converges to 3.
Proof. The first part follows directly by combining Proposition 3.8 in [10] with Theorem 4.1. The second part is Theorem 3.12 (3) in [10].

Remark 4.4.
• Corollary 4.3 should be compared with the following result from [20]: Let, for some integer q ≥ 2, I_q^G(f_n) be a sequence of multiple integrals with respect to a Gaussian random measure on Z such that for each n ∈ N, f_n ∈ L^2(µ^q) is symmetric (but not necessarily non-negative). In addition, suppose that E[I_q^G(f_n)^2] = 1. Then the convergence in distribution of I_q^G(f_n) to a standard Gaussian random variable is equivalent to the convergence of E[I_q^G(f_n)^4] to 3.
• The fourth moment criterion stated in Corollary 4.3 is in the spirit of fourth moment criteria for central limit theorems of Gaussian multiple integrals first obtained in [20] and recalled above. They have attracted considerable interest in recent times and we refer to the webpage http://www.iecn.u-nancy.fr/∼nourdin/steinmalliavin.htm for an exhaustive collection of works in this direction.
• Inequality (4.6) with Kolmogorov distance d K replaced by Wasserstein distance d W has been proved in [10], see Equation (3.9) ibidem.
• If we have EI 2 q (f n ) → 1, as n → ∞, instead of EI 2 q (f n ) = 1, then (4.6) has to be replaced by However, this generalization will not be needed in our applications below.
• The second assertion of Corollary 4.3 remains true without the assumption that the functions f_n are non-negative in the case of double Poisson integrals (i.e. if q = 2). This is the main result in [25]. Because of the involved structure of the fourth moment of a Poisson multiple integral (resulting from a highly technical so-called diagram formula, see [26]), it is not clear whether a similar result should also be expected for q > 2.

A quantitative version of de Jong's theorem
Let Y = {Y_i : i ∈ N} be a sequence of i.i.d. random variables in R^d for some d ≥ 1 whose distribution has a Lebesgue density p(x). Moreover, let, independently of Y, {N_n : n ∈ N} be a sequence of random variables such that each member N_n follows a Poisson distribution with mean n. Then, for each n ∈ N,

η_n := \sum_{i=1}^{N_n} δ_{Y_i}

is a Poisson random measure on Z = R^d (equipped with the standard Borel σ-field) with (finite) control µ_n(dx) = n p(x) dx (where dx stands for the infinitesimal element of the Lebesgue measure on R^d). For convenience we put µ := µ_1. Next, let for each n ∈ N, h_n : R^{2d} → R be a non-zero, symmetric function which is integrable with respect to µ^2. By a sequence of (bivariate) U-statistics (sometimes called Poissonized U-statistics) based on these data we understand a sequence {U_n : n ∈ N} of Poisson functionals of the form

U_n = \sum_{(x, y) ∈ η_{n,≠}^2} h_n(x, y),

where η_{n,≠}^2 is the set of all pairs of distinct points of η_n; h_n is called the kernel function of U_n. We shall assume that these U-statistics are completely degenerate in the sense that for any n ∈ N,

\int_{R^d} h_n(x, y) µ(dy) = 0  for µ-almost all x ∈ R^d.

It is well known that a completely degenerate U_n can be represented as U_n = I_2(f_{2,n}) with f_{2,n} = h_n. It is a direct consequence of (2.2) that

var(U_n) = 2 n^2 E h_n^2,

where the expectation E is the integral with respect to µ^2. Let us also introduce the normalized U-statistic F_n := U_n / \sqrt{var(U_n)}. Our main result in this section is a quantitative version of de Jong's theorem [9] for such U-statistics.
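The normalization just introduced follows in one line from the orthogonality relation (2.2) together with µ_n = nµ:

```latex
\operatorname{var}(U_n) = \mathbb{E}\bigl[I_2(h_n)^2\bigr]
  = 2!\, \|h_n\|_{L^2(\mu_n^2)}^2
  = 2 n^2 \int_{\mathbb{R}^d \times \mathbb{R}^d} h_n(x, y)^2\, p(x)\, p(y)\, dx\, dy.
```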
Theorem 4.5. Let {h_n : n ≥ 1} be as above and suppose that h_n ∈ L^4(µ^2) as well as condition (4.7). Then the fourth moment condition

\lim_{n→∞} E F_n^4 = 3   (4.8)

implies that F_n converges in distribution to a standard Gaussian random variable Z. Moreover, there exists a universal constant C > 0 such that for all n, d_K(F_n, Z) satisfies the bound given in (4.9) below.

Remark 4.6.
• The set-up of this section fits into our general framework by taking Z = R d and Z as its Borel σ-field.
• The first assertion of Theorem 4.5 corresponds to de Jong's theorem in [9]. Whereas the original proof is long and technical, our proof is more transparent and deals directly with the fourth moment. It is a slightly corrected version of the proof taken from [32]. On the other hand, the technique in [9] also allows one to deal with U-statistics whose kernel functions h_n are not necessarily symmetric.
• Theorem 4.5 is a generalization of (the corrected form of) Theorem 2.13 (A) in [32], which deals with the Wasserstein distance between F n and Z. In fact, the bound for d W (F n , Z) coincides -up to a constant multiple -with the bound for d K (F n , Z). To the best of our knowledge, Theorem 4.5 is the first quantitative version of de Jong's theorem, which deals with the Kolmogorov distance.
• The paper [32] also contains a quantitative version of de Jong's theorem, where the target random variable follows a Gamma distribution instead of a standard Gaussian distribution. In this case, the probability metric is based on the class of thrice differentiable test functions.
Proof of Theorem 4.5. Since U_n is completely degenerate, we can represent the normalized U-statistic F_n as I_2(f_n) with f_n = h_n / \sqrt{var(U_n)} (note that the double Poisson integral is taken with respect to the compensated Poisson measure η_n − µ_n). The estimate for d_K(F_n, Z) is thus a consequence of the bound (4.9), where, as usual in this section, norms and contractions are with respect to µ_n (observe that to verify these computations, assumption (4.7) is essential). For some more details on how to obtain this relation we refer the reader to [32, Formulae (4.12) and (4.13)]. Since var(F_n) = 2 ‖f_n‖^2 = 1 by construction, we clearly have 3 (2 ‖f_n‖^2)^2 = 3 for all n ∈ N. Thus, if the fourth moment condition (4.8) is satisfied, the other (non-negative) terms in (4.9) must vanish asymptotically, as n → ∞. Consequently, d_K(F_n, Z) tends to zero, as n → ∞.
Let us finally present in this section a version of de Jong's theorem where the speed of convergence in the fourth moment condition (4.8) also controls the rate of convergence of F_n towards a standard Gaussian random variable.
Corollary 4.7. Assume the same set-up as in Theorem 4.5 and suppose in addition that h_n is non-negative for each n ∈ N. Then there is a universal constant C > 0 such that for sufficiently large n,

d_K(F_n, Z) ≤ C \sqrt{E F_n^4 − 3},

where Z is a standard Gaussian random variable.
Proof. This is a consequence of Theorem 4.5 and Corollary 4.3. Note that the assumption E F_n^2 = 1 for each n ∈ N is automatically fulfilled by construction.

Functionals with finite chaotic expansion
Let us assume that (Z, Z) is a standard Borel space, {µ_n : n ∈ N} is a sequence of σ-finite measures on Z and for each n ∈ N, η_n is a Poisson random measure with control µ_n. In this section we deal with a sequence {F_n : n ∈ N} of Poisson functionals such that for each n ∈ N, F_n admits the representation

F_n = \sum_{i=1}^k I_{q_i}(f_n^{(i)})   (4.10)

with integers 1 = q_1 < ... < q_k (k ∈ N) and symmetric functions f_n^{(i)} ∈ L^2(µ_n^{q_i}). Note that each of the multiple integrals I_{q_i}, i ∈ {1, ..., k}, is taken with respect to η_n − µ_n. We shall assume that for all n ∈ N and i ∈ {1, ..., k}, the functions f_n^{(i)} satisfy the technical assumptions and are such that f_n^{(1)} > 0. In particular, this implies that F_n ∈ L^2(P_{η_n}). A particularly interesting class of such functionals are non-degenerate U-statistics of Poisson random measures. To define them, let, as above, k ≥ 1 be a fixed integer and h ∈ L^1(µ_n^k) be a symmetric function. Then a U-statistic based on η_n and h is given by

U_n = \sum_{(z_1, ..., z_k) ∈ η_{n,≠}^k} h(z_1, ..., z_k),

where the symbol η_{n,≠}^k indicates the class of all k-dimensional vectors (z_1, ..., z_k) such that z_i ∈ η_n and z_i ≠ z_j for every 1 ≤ i ≠ j ≤ k. We always assume that U_n ∈ L^2(P_{η_n}) (then necessarily h ∈ L^2(µ_n^k)), in which case U_n can be re-written as

U_n = E U_n + \sum_{i=1}^k I_i(g_n^{(i)})  with  g_n^{(i)}(z_1, ..., z_i) = \binom{k}{i} \int_{Z^{k−i}} h(z_1, ..., z_i, y_1, ..., y_{k−i}) µ_n^{k−i}(d(y_1, ..., y_{k−i}))

for every i = 1, ..., k, see [33, Lemma 3.5]. Moreover, the multivariate Mecke formula [34, Corollary 3.2.3] implies that

E U_n = \int_{Z^k} h dµ_n^k.

One should note that this chaotic representation follows from an application of the results proved in [12]. In contrast to the situation considered in the previous section around de Jong's theorem, and in order to ensure the non-degeneracy of the U-statistics, we will assume that g_n^{(1)} > 0 for all n ∈ N. Let us finally introduce (by slight abuse of notation) the normalized U-statistics F_n by F_n := (U_n − E U_n) / \sqrt{var(U_n)}. It is easy to see that the so-defined F_n has the chaotic expansion (4.10) with f_n^{(i)} = g_n^{(i)} / \sqrt{var(U_n)} for i ∈ {1, ..., k}.
Theorem 4.8. Consider a Poisson functional as at (4.10) in the set-up described above, suppose that for all n ∈ N and i ∈ {1, ..., k} the functions f_n^{(i)} satisfy the technical assumptions, and assume that lim_{n→∞} var(F_n) = 1. Moreover, let Z be a standard Gaussian random variable. Then there is a universal constant C > 0 such that for all n,

d_K(F_n, Z) ≤ C ( |1 − var(F_n)| + max ‖f_n^{(i)} ⋆_r^ℓ f_n^{(i)}‖ + max ‖f_n^{(i)} ⋆_r^ℓ f_n^{(j)}‖ ),

where the first maximum is taken over all i ∈ {1, ..., k} and pairs (r, ℓ) such that either r = q_i and ℓ = 0 or r ∈ {1, ..., q_i} and ℓ ∈ {1, ..., min(r, q_i − 1)}, whereas the second maximum is taken over all i, j ∈ {1, ..., k} with i < j and pairs (r, ℓ) satisfying r ∈ {1, ..., q_i} and ℓ ∈ {1, ..., r}.
• The statement of Theorem 4.8 remains true if we replace the Kolmogorov distance by the Wasserstein distance, see [10, Theorem 3.5].
• A bound for d_K(F_n, Z) in the case of U-statistics has also been derived in [35], see Theorem 4.2 there. That bound, however, is systematically larger than ours, and not only because of an additional term in Theorem 3.1 (recall Remark 3.2): after re-writing the terms M_{ij} there in our language, it also involves contractions of the absolute values of the functions f_n^{(i)}. This goes hand in hand with the observation that the notion of absolute convergence of U-statistics introduced and used in [33,35] can be avoided in our framework.
Corollary 4.10. We assume the same framework as in Theorem 4.8 and suppose in addition that f_n^{(i)} ≥ 0 for all n ∈ N and i ∈ {1, …, k}. Then there is a constant C > 0 such that for sufficiently large n,
$$d_K(F_n, Z) \le C \sqrt{E F_n^4 - 3\big(E F_n^2\big)^2},$$
where Z is a standard Gaussian random variable. Moreover, if the sequence {F_n : n ∈ N} is uniformly integrable, convergence in distribution of F_n to Z is equivalent to convergence of EF_n^4 − 3(EF_n^2)^2 to 0.
Proof. This is a consequence of Theorem 4.8 and Proposition 3.8 and Theorem 3.12 in [10].
• Corollary 4.10 is a direct generalization of Corollary 4.3, which deals with sequences of single multiple integrals.
• With d_K replaced by d_W, the fourth moment bound has already been stated in [10]. Moreover, in the special case q_1 = 1, …, q_k = k, corresponding to a U-statistic, the bound for the Kolmogorov distance also appears in [35].
• For Corollary 4.10 to be true, the assumption that the functions f_n^{(i)} are non-negative is essential. It is an open problem whether this can be relaxed.
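Corollary 4.10 reduces normal approximation to a moment computation, which is easy to probe numerically. A Monte Carlo sketch under illustrative assumptions (a non-negative pair-counting kernel at scale δ on [0,1]², not taken from the text; the criterion value is only an empirical estimate):

```python
import numpy as np

rng = np.random.default_rng(1)

def pair_count(n, delta=0.1):
    """Non-negative order-two U-statistic: number of ordered pairs of distinct
    points of a Poisson process on [0,1]^2 (intensity n) within distance delta."""
    N = rng.poisson(n)
    if N < 2:
        return 0.0
    pts = rng.random((N, 2))
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
    return float((d2 <= delta ** 2).sum() - N)   # subtract the diagonal

def fourth_moment_gap(n, reps=2000):
    """Empirical estimate of E F_n^4 - 3 (E F_n^2)^2 for the standardized
    statistic; values close to 0 indicate approximate normality."""
    u = np.array([pair_count(n) for _ in range(reps)])
    f = (u - u.mean()) / u.std()
    return float((f ** 4).mean() - 3.0 * ((f ** 2).mean()) ** 2)
```

By Corollary 4.10, for non-negative kernels this fourth-moment gap controls the Kolmogorov distance up to a constant, so watching it shrink as n grows is a practical diagnostic.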
Proof of Theorem 4.8. We use Corollary 3.3 to see that d_K(F_n, Z) is bounded by a sum of terms A_1(F_n), …, A_4(F_n), where norms and scalar products are always taken with respect to µ_n. To bound A_1(F_n) we use the first part of Theorem 3.5 in [10] (which is a consequence of Proposition 5.5 in [27]), which yields, for some constant C_1 > 0, a bound in terms of the contraction norms ‖f_n^{(i)} ⋆_r^ℓ f_n^{(j)}‖. Here, the first maximum is taken over all i ∈ {1, …, k} and pairs (r, ℓ) such that r ∈ {1, …, q_i} and ℓ ∈ {1, …, min(r, q_i − 1)} (note that the case r = q_i and ℓ = 0 is excluded here), whereas the second maximum is taken over all i, j ∈ {1, …, k} with i < j and pairs (r, ℓ) satisfying r ∈ {1, …, q_i} and ℓ ∈ {0, …, r}. To bound A_2(F_n), we first expand the scalar product defining it. The product formula (2.8) then allows us to re-write I_{q_i−1}(f_n^{(i)}(z, ·))^2 and I_{q_j−1}(f_n^{(j)}(z, ·))^2 as sums of multiple integrals. The orthogonality of these integrals implies that A_2(F_n)^2 is bounded by a linear combination of terms of the form ‖f_n^{(i)} ⋆_r^ℓ f_n^{(j)}‖^2 with i, j, r and ℓ as in the statement of the theorem. The sequence A_3(F_n) can be bounded as follows. Using our technical assumptions, the factor (EF_n^4)^{1/4} + 1 is bounded, and the remaining factor is controlled by Jensen's inequality. Therefore A_3(F_n) is also bounded by a linear combination of contraction norms of the above form. It remains to bound A_4(F_n), and we shall follow the strategy of the proof of Theorem 4.1 to derive an estimate for it. To do this, recall that F_n is of the form (4.10). Using the modified integration-by-parts formula (2.6) together with the fact that 1(F_n > x) ≤ 1, similarly as in the proof of Theorem 4.1, and then the isometric formula for Skorohod integrals (2.7), we arrive at two terms: the first is just A_2(F_n), whereas the second term in brackets is bounded by a linear combination of quantities involving the kernels f_n^{(i)} and f_n^{(j)}, where i and j range from 1 to k.
To analyse them, let us define the function Ψ(x, y) := x|y| for x, y ∈ R and observe that, as a consequence of a multivariate Taylor expansion, we can find finite constants c_1 > 0 and c_2 > 0 bounding the increments of Ψ at all a, b, c, d ∈ R (as in the proof of Theorem 4.1, the constants c_1 and c_2 can be determined explicitly, but are not important for our purposes here). Combining this estimate with the Cauchy-Schwarz inequality, we conclude that the quantities above are bounded by sequences A^{(1)}_{i,j}(F_n), A^{(2)}_{i,j}(F_n) and A^{(3)}_{i,j}(F_n) with i, j ∈ {1, …, k}. In the proof of Theorem 4.1 we have shown that the first two sequences, A^{(1)}_{i,j}(F_n) and A^{(2)}_{i,j}(F_n), are bounded by linear combinations of squared norms of contractions of f_n^{(i)} and f_n^{(j)}. Turning to the last term A^{(3)}_{i,j}(F_n), we apply once more the Cauchy-Schwarz inequality; the resulting terms in brackets can then be bounded as in the proof of Theorem 4.1.
We now present three concrete applications of Theorem 4.8. The first one deals with a certain class of functionals arising in stochastic geometry [34] and generalizes the results developed in [33]. In particular, we consider a much wider class of geometric functionals, inspired by the findings in [13]. In our second example we consider random graph statistics of the Boolean model, which have previously been studied in [11, Section 8.1]. Our third application deals with non-linear functionals of a certain class of Lévy processes and generalizes results in [24,25,27]. We emphasize that this example does not fit within the class of U-statistics and hence lies outside the domain of the applications considered in [35]. Other examples to which our theory could directly be applied are counting statistics for random geometric graphs [10,29] and random simplicial complexes [8], or proximity functionals of non-intersecting flats [36,37].
Here, Q is a probability measure on the space G(d, k) of k-dimensional linear subspaces of R^d and H^{d−k} stands for the (d − k)-dimensional Hausdorff measure. By η_n we denote a Poisson random measure on A(d, k) with intensity measure µ_n.
Let m ∈ N be such that d − m(d − k) ≥ 0. Then the intersection process η_n^{[m]} of order m of η_n arises as the collection of all subspaces E_1 ∩ … ∩ E_m, where (E_1, …, E_m) ∈ η^m_{n,≠} are in general position. In the terminology of [34], η_n^{[m]} is a translation-invariant process of (d − m(d − k))-dimensional subspaces of R^d. Both η_n and η_n^{[m]} are classical objects of study in stochastic geometry, and we refer to [34,37] for more details.
We call every non-negative measurable function ϕ on the space of convex subsets of R^d a geometric functional provided that ϕ(∅) = 0 and |ϕ(K ∩ E_1 ∩ … ∩ E_m)| ≤ c(K) for µ_n^m-almost all (E_1, …, E_m) and all compact convex subsets K ⊂ R^d, where c(K) is a constant only depending on K. Examples of geometric functionals are
• the (d − m(d − k))-dimensional Hausdorff measure,
• the counting functional ϕ(K) = 1(K ≠ ∅),
• the intrinsic volume V_i of order i ∈ {0, …, d} (cf. [34, Chapter 14]),
• generalized chord-power integrals V_i( · )^α with i ∈ {0, …, d} and α ≥ 0, where V_i is the intrinsic volume of order i (note that this functional is not additive),
• integrals with respect to support measures (or generalized curvature measures) as considered in convex geometry (cf. [34, Chapter 14] and the references cited therein).
Given a geometric functional ϕ and the Poisson random measure η_n together with its intersection process η_n^{[m]} as above, we consider for a compact convex subset K ⊂ R^d the (non-degenerate) U-statistic
$$U_n(K) := \sum_{(E_1,\ldots,E_m) \in \eta^m_{n,\neq}} \varphi(K \cap E_1 \cap \ldots \cap E_m).$$
Writing g_n^{(i)}, i ∈ {1, …, m}, for the kernels of the chaotic expansion of U_n(K), it follows directly from the definition of the contraction operator that ‖g_n^{(i)} ⋆_r^ℓ g_n^{(j)}‖ is proportional to n^{(m−i)+(m−j)+ℓ}. So, the maximal exponent is realized if i = j = 1 and ℓ = 0. Consequently, denoting by F_n(K) := (U_n(K) − EU_n(K))/\sqrt{var(U_n(K))} the normalized U-statistic, we obtain from Theorem 4.8 the Berry-Esseen bound
$$d_K(F_n(K), Z) \le C\, \frac{n^{2m-2}}{n^{2m-1}}$$
with a constant C > 0 only depending on ϕ, m and K.
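For the planar case d = 2, k = 1, m = 2 with the counting functional ϕ = 1(· ≠ ∅) and K the unit disk, the U-statistic U_n(K) can be simulated directly. The sketch below is an illustration only; the parametrization of lines by signed distance and normal direction, and the sampling law, are simplifying assumptions, not the measure µ_n of the text:

```python
import numpy as np

rng = np.random.default_rng(2)

def intersections_in_disk(n):
    """Count points of the intersection process of order m = 2 of a Poisson
    line process in the plane (d = 2, k = 1) that fall into the unit disk K,
    i.e. evaluate the U-statistic with the counting functional over
    unordered pairs of lines."""
    N = rng.poisson(n)
    p = rng.uniform(-1.0, 1.0, N)         # signed distance of each line to 0
    theta = rng.uniform(0.0, np.pi, N)    # direction of the unit normal
    count = 0
    for i in range(N):
        for j in range(i + 1, N):
            a1, b1 = np.cos(theta[i]), np.sin(theta[i])
            a2, b2 = np.cos(theta[j]), np.sin(theta[j])
            det = a1 * b2 - a2 * b1
            if abs(det) < 1e-12:          # parallel lines (a null event)
                continue
            # solve a1*x + b1*y = p_i and a2*x + b2*y = p_j
            x = (p[i] * b2 - p[j] * b1) / det
            y = (a1 * p[j] - a2 * p[i]) / det
            if x * x + y * y <= 1.0:
                count += 1
    return count
```

The function counts unordered pairs; the sum over η^2_{n,≠} in the text runs over ordered pairs and is simply twice this value.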
Example 4.13. Let 𝒦^d be the space of compact convex subsets of R^d (for some d ≥ 2), for each K ∈ 𝒦^d denote by m(K) the center of the smallest circumscribed ball (called the midpoint of K in the sequel), and define 𝒦^d_0 := {K ∈ 𝒦^d : m(K) = 0} as the subspace of compact convex subsets of R^d with midpoint at the origin. This allows us to identify 𝒦^d with the product space 𝒦^d_0 × R^d by identifying each K ∈ 𝒦^d with the pair (K − m(K), m(K)). Now, let µ_0 be a probability measure on 𝒦^d_0 and for each n ∈ N let η_n be a Poisson random measure on 𝒦^d with control µ_n given, under the above identification, by λ times the product of µ_0 and the restriction of the Lebesgue measure to the window [−n^{1/d}, n^{1/d}]^d, where dx stands for the infinitesimal element of the Lebesgue measure on R^d and λ > 0 is a fixed intensity parameter. The union set ⋃_{K ∈ η_n} K is the so-called Boolean model associated with η_n, cf. [34]; it is a random closed set in the sense of [34] under a suitable integrability assumption on µ_0. The random graph statistics U_n we consider are U-statistics of order two built from a kernel h. To ensure that the U_n have finite second-order moments, we assume that h^2 is integrable over any compact subset of R^d and, moreover, that ∫_{[−n^{1/d}, n^{1/d}]^d} h(x) dx ≠ 0 for all n ∈ N. Standard examples are h ≡ 1 or h(x − y) = dist(x, y)^α with α > 0, where dist(x, y) stands for the Euclidean distance of x and y. We see that U_n is a non-negative and non-degenerate U-statistic of order two in the sense of this section, and the variance formula (2.2) shows that var(U_n) behaves like a constant times n as n → ∞. In addition, the multivariate Mecke formula [34, Corollary 3.2.3] implies that EU_n also grows linearly in n. One thus obtains Berry-Esseen bounds with constants C_M > 0, C_S > 0 and C_V > 0 only depending on the parameter λ, or on λ and h, respectively, where Z is a standard Gaussian random variable.
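A matching simulation sketch for this example, under illustrative assumptions not taken from the text (grains are balls with i.i.d. uniform radii standing in for µ_0, and h ≡ 1, so the statistic counts ordered pairs of distinct intersecting grains):

```python
import numpy as np

rng = np.random.default_rng(3)

def intersecting_pairs(n, lam=1.0, d=2, rmax=0.5):
    """Order-two U-statistic of a Boolean model of balls: grain centers form
    a Poisson process of intensity lam on the window [-n^(1/d), n^(1/d)]^d,
    radii are i.i.d. uniform on [0, rmax] (a stand-in for mu_0), and we count
    ordered pairs of distinct grains whose balls intersect (kernel h = 1)."""
    side = 2.0 * n ** (1.0 / d)
    N = rng.poisson(lam * side ** d)
    if N < 2:
        return 0
    centers = (rng.random((N, d)) - 0.5) * side
    radii = rng.uniform(0.0, rmax, N)
    dist = np.sqrt(((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1))
    # two balls intersect iff their centers are within the sum of the radii;
    # the diagonal (each grain with itself) is removed from the count
    touch = dist <= radii[:, None] + radii[None, :]
    return int(touch.sum() - N)
```

Since var(U_n) grows linearly in n here, replications of (U_n − EU_n)/√var(U_n) should look increasingly Gaussian as n grows, in line with the Berry-Esseen bounds above.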