Multivariate central limit theorems for Rademacher functionals with applications

Quantitative multivariate central limit theorems for general functionals of possibly non-symmetric and non-homogeneous infinite Rademacher sequences are proved by combining discrete Malliavin calculus with the smart path method for normal approximation. In particular, a discrete multivariate second-order Poincaré inequality is developed. As a first application, the normal approximation of vectors of subgraph counting statistics in the Erdős-Rényi random graph is considered. In this context, we further specialize to the normal approximation of vectors of vertex degrees. In a second application we prove a quantitative multivariate central limit theorem for vectors of intrinsic volumes induced by random cubical complexes.


Introduction
Suppose that X = (X k ) k∈N is a Rademacher sequence, that is, a sequence of independent random variables satisfying, for all k ∈ N, P(X k = 1) = p k and P(X k = −1) = q k = 1 − p k for some p k ∈ (0, 1). Further, fix a dimension parameter d ∈ N and let F 1 = F 1 (X), . . ., F d = F d (X) be d random variables depending on possibly infinitely many members of the Rademacher sequence X. We shall refer to such random variables as Rademacher functionals in what follows. The goal of this paper is to derive handy conditions under which the random vector F = (F 1 , . . ., F d ) consisting of d Rademacher functionals is close in distribution to a d-dimensional Gaussian random vector. In our paper the distributional closeness will be measured by means of a multivariate probability metric based on four times partially differentiable test functions. We will provide two versions of such a result. One is in the spirit of the Malliavin-Stein method and expresses the distributional closeness in terms of so-called discrete Malliavin operators. The second one is a multivariate discrete second-order Poincaré inequality, a bound which only involves the first- and second-order discrete Malliavin derivatives of the Rademacher functionals F 1 , . . ., F d , or, more precisely, their moments up to order four. More formally, if F = F(X) is a Rademacher functional, the discrete Malliavin derivative D k F in direction k ∈ N is defined as D k F := √(p k q k ) (F + k − F − k ), where F ± k is the Rademacher functional for which the kth coordinate X k of the Rademacher sequence X is conditioned to be ±1. The second-order discrete derivative is iteratively given by D k D ℓ F for k, ℓ ∈ N. Such a bound is particularly attractive for concrete applications, as demonstrated in the present text.
Let us describe the purpose and the content of our paper in some more detail.
(i) First of all, our aim is to provide a multivariate quantitative central limit theorem for vectors of Rademacher functionals by bringing together the discrete Malliavin calculus of variations with the so-called smart path method for normal approximation. This leads to a limit theorem in the spirit of the Malliavin-Stein method and generalizes an earlier result from [9], where the underlying Rademacher sequence has been assumed to be homogeneous and symmetric, meaning that p k = q k = 1/2 for all k ∈ N in the above notation.
(ii) From this result, a further aim of this text is to develop a discrete multivariate second-order Poincaré inequality, that is, a bound for the multivariate normal approximation that only involves the first- and second-order discrete Malliavin derivatives, or, more precisely, their moments up to order four. Such a result can be regarded as the multivariate analogue of the main theorem obtained in [10].
(iii) Finally, we want to demonstrate the flexibility and applicability of our discrete multivariate second-order Poincaré inequality by means of examples from the theory of random graphs and random topology. First, we are going to provide a bound of order O(n −1 ) for the multivariate normal approximation of a vector of subgraph counts in the classical Erdős-Rényi random graph. This generalizes (in a different probability metric) a result of Reinert and Röllin [18], where vectors of the numbers of edges, 2-stars and triangles have been considered, and adds a rate of convergence to the related central limit theorem in the paper of Janson and Nowicki [6]. Moreover, for the same model we also provide a multivariate central limit theorem for the random vector of vertex degrees with a rate of convergence of order O(n −1/2 ). This can be seen as a version of the result of Goldstein and Rinott [4] and is the multivariate analogue of a related Berry-Esseen bound proved by Goldstein [3] and Krokowski, Reichenbachs and Thäle [10]. Second, we consider the vector of intrinsic volumes determined by different models of random cubical complexes in R d and derive bounds of order O(n −d/2 ) on the error in their normal approximation. This constitutes a multivariate extension of the central limit theorem provided by Werman and Wright [20] and is in line with recent developments in the active field of random topology, see [1,2,7,11] as well as the references cited therein.
Our results continue a recent line of research concerning limit theorems for Rademacher functionals. The field was opened by Nourdin, Peccati and Reinert [12], who proved first limit theorems for a class of smooth probability metrics. Later, Krokowski, Reichenbachs and Thäle [9,10] considered Berry-Esseen bounds and provided a first univariate discrete second-order Poincaré inequality. Zheng [21] has obtained a refined bound for the Wasserstein distance and also proved almost sure central limit theorems. Moreover, Privault and Torrisi [16] as well as Krokowski [8] also derived bounds for the Poisson approximation of Rademacher functionals.
This text is organized as follows. In Section 2 we briefly recall the basics of discrete Malliavin calculus in order to keep the paper reasonably self-contained. A first quantitative multivariate central limit theorem for functionals of a possibly non-symmetric and non-homogeneous infinite Rademacher sequence based on the discrete Malliavin-Stein method is presented in Section 3.1, while Section 3.2 contains the discrete multivariate second-order Poincaré inequality. The applications to subgraph and vertex degree counts in the Erdős-Rényi random graph and to the intrinsic volumes of random cubical complexes are discussed in the final Section 4.

Discrete Malliavin calculus
In this section we briefly recall the basics of discrete Malliavin calculus. We refer to the monograph [15] as well as to the papers [9,10,12] for details, proofs and further references.
Rademacher sequences. Let p := (p k ) k∈N be a sequence of success probabilities 0 < p k < 1 and put q := (q k ) k∈N with q k := 1 − p k . Furthermore, let (Ω, F , P) be the following probability space: Ω := {−1, +1} N , F := power({−1, +1}) ⊗N and P := ⊗ k∈N (p k δ +1 + q k δ −1 ), where power( • ) denotes the power set of the argument set and δ ±1 is the unit-mass Dirac measure at ±1. We let X := (X k ) k∈N be a sequence of independent random variables defined on (Ω, F , P) by X k (ω) := ω k , for every k ∈ N and ω := (ω k ) k∈N ∈ Ω. We refer to such a sequence X as a (possibly non-symmetric and non-homogeneous infinite) Rademacher sequence. We also define the standardized sequence Y := (Y k ) k∈N by Y k := (X k − p k + q k )/(2√(p k q k )), so that E[Y k ] = 0 and E[Y k 2 ] = 1. We speak of a homogeneous and symmetric Rademacher sequence if p k = q k = 1/2, for all k ∈ N, in which case Y k = X k .
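The standardization above can be sanity-checked numerically. The following minimal sketch (ours, not part of the paper; function names are hypothetical) samples a non-homogeneous Rademacher sequence and verifies, by exact computation over the two possible values, that the standardized variables Y k have mean 0 and variance 1.

```python
import math
import random

def rademacher_sample(p_seq, rng):
    """Sample X_k = +1 with probability p_k and X_k = -1 with probability q_k = 1 - p_k."""
    return [1 if rng.random() < p else -1 for p in p_seq]

def standardize(x, p):
    """Y_k = (X_k - p_k + q_k) / (2 sqrt(p_k q_k)): centred and of unit variance."""
    q = 1.0 - p
    return (x - p + q) / (2.0 * math.sqrt(p * q))

p_seq = [0.3, 0.5, 0.8]
rng = random.Random(0)
x = rademacher_sample(p_seq, rng)
assert all(v in (-1, 1) for v in x)

# exact moments of Y_k: E[Y_k] = 0 and E[Y_k^2] = 1 for every p_k
for p in p_seq:
    q = 1 - p
    mean = p * standardize(1, p) + q * standardize(-1, p)
    var = p * standardize(1, p) ** 2 + q * standardize(-1, p) ** 2
    assert abs(mean) < 1e-12 and abs(var - 1.0) < 1e-12
```

In the homogeneous symmetric case p_k = 1/2 the map `standardize` is the identity, matching the remark above.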

Discrete multiple stochastic integrals and chaos decomposition.
Let us denote by κ the counting measure on N. We put ℓ 2 (N) ⊗n := L 2 (N n , power(N) ⊗n , κ ⊗n ) for every n ∈ N and refer to the elements of that space as kernels. Let ℓ 2 (N) •n denote the subset of ℓ 2 (N) ⊗n consisting of symmetric kernels and let ℓ 2 0 (N) ⊗n be the subset of kernels vanishing on diagonals, that is, vanishing on the complement of the set Δ n := {(i 1 , . . ., i n ) ∈ N n : i j ≠ i k for j ≠ k}. Writing ℓ 2 0 (N) •n := ℓ 2 (N) •n ∩ ℓ 2 0 (N) ⊗n , for f ∈ ℓ 2 0 (N) •n we define the discrete multiple stochastic integral of order n of f by J n (f) := Σ (i 1 ,...,i n )∈Δ n f(i 1 , . . ., i n ) Y i 1 · · · Y i n . In addition, we put ℓ 2 (N) ⊗0 := R and J 0 (c) := c, for every c ∈ R.
It is an important fact that every F ∈ L 2 (Ω) admits a chaos decomposition of the form F = E[F] + Σ n≥1 J n (f n ) (1) with uniquely determined kernels f n ∈ ℓ 2 0 (N) •n , n ∈ N.
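As a simple illustration (ours, not from the paper; it assumes the homogeneous symmetric case p k = q k = 1/2, in which Y k = X k ), the functional F = X 1 X 2 consists of a single second-order chaos:

```latex
F \;=\; X_1 X_2 \;=\; J_2(f_2),
\qquad
f_2(i,j) \;=\; \tfrac12\Bigl(\mathbf{1}\{(i,j)=(1,2)\} + \mathbf{1}\{(i,j)=(2,1)\}\Bigr),
```

since J 2 (f 2 ) = Σ i≠j f 2 (i, j) Y i Y j = Y 1 Y 2 = X 1 X 2 and E[F] = 0, so all other kernels in the decomposition vanish.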

Discrete Malliavin derivative.
For every ω = (ω 1 , ω 2 , . . . ) ∈ Ω and k ∈ N we define the two sequences ω k + := (ω 1 , . . ., ω k−1 , +1, ω k+1 , . . .) and ω k − := (ω 1 , . . ., ω k−1 , −1, ω k+1 , . . .), and for a Rademacher functional F = F(X) we put F ± k := F(ω k ± ). For such an F the discrete Malliavin derivative is defined by D k F := √(p k q k ) (F + k − F − k ). (2) Note that it immediately follows from (2) that, for every k ∈ N, D k F is independent of X k . In the following we state a product formula for the discrete Malliavin derivative: for Rademacher functionals F and G, D k (FG) = F D k G + G D k F − (X k /√(p k q k )) (D k F)(D k G). (3)
For m ∈ N let us further define the iterated discrete Malliavin derivative of order m of F by D k 1 · · · D k m F, k 1 , . . ., k m ∈ N. For F ∈ L 2 (Ω) with chaos decomposition (1) and m ∈ N, we will say that F ∈ dom(D m ), provided that Σ n≥1 n m n! ‖f n ‖ 2 ℓ 2 (N) ⊗n < ∞. If F ∈ dom(D) with chaos decomposition (1), D k F can P-almost surely be identified with the random variable given by D k F = Σ n≥1 n J n−1 (f n ( • , k)), where f n ( • , k) stands for the kernel f n with one of its variables fixed to be k (which one is irrelevant, since the kernels are symmetric).
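To make the definitions concrete, here is a minimal numerical sketch (ours, not from the paper; the toy functional is hypothetical): it evaluates D k F = √(p k q k )(F + k − F − k ) for a functional of two Rademacher variables, checks that D k F does not depend on X k , and verifies the discrete Poincaré inequality Var(F) ≤ Σ k E[(D k F) 2 ] by exact enumeration over {−1, +1} 2 .

```python
import math
from itertools import product

p = [0.3, 0.6]  # success probabilities p_k

def D(k, F, omega):
    """Discrete Malliavin derivative D_k F = sqrt(p_k q_k) * (F_k^+ - F_k^-)."""
    wp = list(omega); wp[k] = +1
    wm = list(omega); wm[k] = -1
    return math.sqrt(p[k] * (1 - p[k])) * (F(wp) - F(wm))

# toy Rademacher functional F(X) = X_1 X_2 + X_2 (coordinates 0 and 1)
F = lambda w: w[0] * w[1] + w[1]

# D_k F is independent of the k-th coordinate itself
assert D(0, F, [+1, +1]) == D(0, F, [-1, +1])

def prob(w):
    """Exact probability of the configuration w under the product measure."""
    return math.prod(p[k] if w[k] == +1 else 1 - p[k] for k in range(2))

# discrete Poincare inequality Var(F) <= sum_k E[(D_k F)^2], checked exactly
states = [list(w) for w in product([-1, +1], repeat=2)]
mean = sum(prob(w) * F(w) for w in states)
var = sum(prob(w) * (F(w) - mean) ** 2 for w in states)
energy = sum(prob(w) * sum(D(k, F, w) ** 2 for k in range(2)) for w in states)
assert var <= energy + 1e-12
```

The enumeration replaces expectations by exact finite sums, so no Monte Carlo error enters the check.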

Discrete divergence.
We will now define the discrete divergence operator δ and its domain dom(δ). For a sequence u := (u k ) k∈N of the form u k = Σ n≥0 J n (f n+1 ( • , k)) with kernels f n+1 ( • , k) ∈ ℓ 2 0 (N) •n , for every n ∈ N, we say that u ∈ dom(δ) provided that Σ n≥0 (n + 1)! ‖ f̃ n+1 ‖ 2 < ∞, where f̃ n+1 denotes the canonical symmetrization of f n+1 . For u ∈ dom(δ), the discrete divergence operator δ is then defined by δ(u) := Σ n≥0 J n+1 ( f̃ n+1 ). One can interpret δ as the operator that is adjoint to the discrete Malliavin derivative. Namely, if F ∈ dom(D) and u ∈ dom(δ), then E[F δ(u)] = E[⟨DF, u⟩ ℓ 2 (N) ]. (4)

Discrete Ornstein-Uhlenbeck operator and its inverse.
Next, we define the discrete Ornstein-Uhlenbeck operator L and its (pseudo-)inverse L −1 . For F ∈ dom(L), the discrete Ornstein-Uhlenbeck operator L is then defined by LF := −Σ n≥1 n J n (f n ). For centred F ∈ L 2 (Ω), its (pseudo-)inverse is given as follows: L −1 F := −Σ n≥1 (1/n) J n (f n ).

Discrete Ornstein-Uhlenbeck semigroup.
Finally, we introduce the semigroup associated with the discrete Ornstein-Uhlenbeck operator L. The discrete Ornstein-Uhlenbeck semigroup (P t ) t≥0 is defined by P t F := Σ n≥0 e −nt J n (f n ), F ∈ L 2 (Ω). The process associated with the discrete Ornstein-Uhlenbeck semigroup is given as follows. For every k ∈ N, let X * k be an independent copy of X k . Furthermore, let (Z k ) k∈N be a sequence of independent and exponentially distributed random variables with mean 1, where Z k is independent of X k and X * k , for every k ∈ N. For every real t ≥ 0, let X t := (X t k ) k∈N with X t k := X k 1{Z k > t} + X * k 1{Z k ≤ t}. Then, (X t ) t≥0 is the discrete Ornstein-Uhlenbeck process associated with the Ornstein-Uhlenbeck semigroup (P t ) t≥0 . The relation between the Ornstein-Uhlenbeck semigroup and the process is exhibited in the following formula, known as Mehler's formula. If F ∈ L 2 (Ω), then it P-almost surely holds that P t F = E[F(X t ) | X].

Integration by parts, integrated Mehler's formula and Poincaré inequality.
We notice that the discrete Malliavin operators D, δ and L are related by the identity L = −δD. Moreover, the following discrete integration by parts formula is valid. If F, G ∈ dom(D), then E[(F − E[F])G] = E[⟨DG, −DL −1 (F − E[F])⟩ ℓ 2 (N) ]. (6) Indeed, the relation L = −δD and the adjointness of D and δ in (4) yield E[(F − E[F])G] = E[(LL −1 (F − E[F]))G] = −E[(δDL −1 (F − E[F]))G] = E[⟨DG, −DL −1 (F − E[F])⟩ ℓ 2 (N) ]. The following identity can be seen as an integrated version of Mehler's formula. If m, k 1 , . . ., k m ∈ N and F ∈ dom(D m ) with E[F] = 0, then it P-almost surely holds that D k 1 · · · D k m (−L −1 )F = ∫ 0 ∞ e −mt P t D k 1 · · · D k m F dt. (7) From this, one can immediately deduce the following important inequality. If m, k 1 , . . ., k m ∈ N, α ≥ 1 and F ∈ dom(D m ) with E[F] = 0, then E[|D k 1 · · · D k m (−L −1 )F| α ] ≤ m −α E[|D k 1 · · · D k m F| α ]. (8) Finally, let us recall a discrete version of the classical Poincaré inequality. For every F ∈ L 1 (Ω), it holds that Var(F) ≤ E[‖DF‖ 2 ℓ 2 (N) ] = Σ k∈N E[(D k F) 2 ]. (9)

3 Multivariate central limit theorems

A discrete Malliavin-Stein bound
In the following, we will prove a bound on the error in the multivariate normal approximation of vectors of general functionals of possibly non-symmetric and non-homogeneous infinite Rademacher sequences. This way we generalize Theorem 5.1 in [9], where only functionals of symmetric Rademacher sequences have been considered. The proof proceeds along the lines of [9], but there are a number of subtleties arising in the more general case here that were not present before. In particular, in the non-symmetric case a new summand in the error bound becomes visible, as further discussed in Remark 3.2 below. To make this and other phenomena transparent, we include the full details.
The distance between the law of a vector of Rademacher functionals and a multivariate normal distribution will be measured by the so-called d 4 -distance, defined by d 4 (F, N) := sup |E[g(F)] − E[g(N)]|, where the supremum runs over all four times partially differentiable functions g : R d → R with M i (g) ≤ 1 for i = 1, . . ., 4, and where M i (g) denotes the supremum norm of the ith-order partial derivatives of g.

Remark 3.2. A comparison of Theorem 3.1 with Theorem 5.1 in [9] shows that the extension to vectors of general functionals of possibly non-symmetric and non-homogeneous infinite Rademacher sequences comes at the cost of an additional summand in the bound.
However, restricting to the case where the underlying Rademacher sequence is symmetric, i.e., if p k = q k = 1/2, for every k ∈ N, this additional summand vanishes and our bound in Theorem 3.1 coincides with the one from [9], with an improvement by a factor 1/2 on the constant in front of the third term.
The proof of Theorem 3.1 relies on two multivariate integration by parts formulae, a Gaussian one and an approximate one from Malliavin calculus, which combines (6) with a multivariate chain rule for the discrete gradient operator. We start by recalling the multivariate Gaussian integration by parts formula from Equation (A.41) in [19].
The following lemma contains a multivariate chain rule for the discrete gradient operator, which is a generalization of Proposition 2.1 in [16] to the d-dimensional case. Also note that it not only generalizes Lemma 5.1 in [9] to the case where the underlying Rademacher sequence is non-symmetric and non-homogeneous, but also improves on the constants in the bound for the remainder term. For these reasons, we include a detailed proof.
Lemma 3.4. Let F be a random vector of Rademacher functionals as in Theorem 3.1. Furthermore, let f : R d → R be a thrice partially differentiable function. Then, for every k ∈ N, with for every k ∈ N.
Proof. Fix k ∈ N and observe that Now, a Taylor series expansion of f at F yields that, for every x := (x 1 , . . ., with some θ := θ(x, F) ∈ (0, 1). By re-writing each of the quantities where from the identities and it follows that, for every k ∈ N, and Again, by virtue of (14) and (15), for every k ∈ N, the second summand on the right hand side of (13) can be rewritten as Another Taylor series expansion of ∂ 2 /∂x i ∂x j f at F + k and F − k , respectively, yields that, for every i, j, k ∈ N, and where . This adds up to for every i, j, k ∈ N, and thus, it follows from (18) that, for every k ∈ N, where by the fact that |X k | ≤ 1 for every k ∈ N and another application of (14) and (15) it holds that and Combining (13) and (19) finally yields that, for every k ∈ N, where because of (16), (17), (20) and (21) we have that The proof is thus complete.
Let us now turn to the already announced multivariate approximate integration by parts formula. The next result not only generalizes Lemma 5.2 in [9] to the case in which the underlying Rademacher sequence is allowed to be non-symmetric and non-homogeneous, but also improves the constants in the bound for the remainder term. We emphasize that Lemma 3.5 is the first instance where the additional boundary term discussed in Remark 3.2 shows up.
Lemma 3.5. Let F be a vector of Rademacher functionals as in Theorem 3.1. Furthermore, let f : R d → R be a thrice partially differentiable function with bounded partial derivatives. Then, for every i = 1, . . ., d, with a remainder R(F) that satisfies the estimate

Proof. Fix i = 1, . . ., d. By the integration by parts formula (6) we have that Here, we implicitly used the fact that f (F) ∈ dom(D), which can be verified as follows. At first, by the mean value theorem it holds that, for every k ∈ N, where θ ∈ (0, 1). Thus, an application of the Cauchy-Schwarz inequality yields that and finiteness of the right hand side in (24) follows from the assumptions that, for every i = 1, . . ., d, ∂/∂x i f is bounded and F i ∈ dom(D). Now, by plugging (10) into (23) we immediately get with F + := (F + k ) k∈N and F − := (F − k ) k∈N as well as a remainder R 1 (F) which by (11) fulfils the estimate As a consequence, we only need to further bound the second term in (25). By virtue of the Cauchy-Schwarz inequality and (8) we see that, for every j, ℓ ∈ N, and finiteness of this expression follows from the assumptions that F i ∈ dom(D) and ‖(pq) −1/4 DF i ‖ ℓ 2 (N) < ∞ for every i = 1, . . ., d. Thus, an exchange of expectation and summation is valid due to the Fubini-Tonelli theorem, and the independence of X k and ( ∂ 2 By plugging (27) into (25) we then get with a remainder term Finally, the assertion follows from (28) upon putting R(F) := R 1 (F) − R 2 (F) and using the bounds in (26) and (29).
Remark 3.6. In the symmetric case where the underlying Rademacher sequence satisfies p k = q k = 1/2 for every k ∈ N, the bound for the remainder term in (22) simplifies, since p k − q k = 0, for every k ∈ N.
With both integration by parts formulae at hand we can now turn to the proof of Theorem 3.1. We will use an interpolation technique that is known as the 'smart path method' in the literature, cf. Section 2.4 in [19]. This method has already found applications within the Malliavin-Stein method for multivariate normal approximation in the framework of non-linear functionals of Gaussian [13] and Poisson random measures [14]. We follow the lines of the proof of Theorem 4.2 in [14] until we have to use our discrete integration by parts formula developed in Lemma 3.5. Without loss of generality, we can and will from now on assume that F and N are independent.
Proof of Theorem 3.1. Let g : R d → R be a four times partially differentiable function with bounded partial derivatives satisfying M 1 (g), M 2 (g), M 3 (g), M 4 (g) ≤ 1. Consider the function Ψ : R → R given by Then, by the mean value theorem we have that where, for every t ∈ (0, 1), Ψ ′ is given by with Now, by independence of F and N as well as by Fubini's theorem we have that, for every t ∈ (0, 1), where E F and E N denote the expectations with respect to the distributions of F and N, respectively. Using the functions f i,t : R d → R given by can be rewritten as Thus, by the integration by parts formula in Lemma 3.3 we deduce that, for every t ∈ (0, 1), where the exchange of differentiation and expectation in the last step of (34) is valid since ∂/∂x i g is bounded. Again, using the independence of F and N together with Fubini's theorem yields that, for every t ∈ (0, 1), Similarly as above, we deduce for the quantity B t that, for every t ∈ (0, 1), This time, defining the functions h i,t : R d → R by and using the integration by parts formula in Lemma 3.5, we see that, for every t ∈ (0, 1), where we recall that the exchange of differentiation and expectation in the last step is valid since ∂/∂x i g is bounded for every i = 1, . . ., d, and R i,t (F) is a remainder which by (22) fulfils, for every i = 1, . . ., d and t ∈ (0, 1), Going back to (36), another application of the independence of F and N together with Fubini's theorem yields that, for every t ∈ (0, 1), Hence, by combining (35) and (38) with (32) we deduce that, for every t ∈ (0, 1), and by using (37) as well as the fact that M 2 (g), M 3 (g), M 4 (g) ≤ 1 we thus conclude that, for every t ∈ (0, 1), .
Plugging this into (31) and taking the supremum over all t ∈ (0, 1) completes the argument.

A multivariate discrete second-order Poincaré inequality
In this section we use Theorem 3.1 to develop a discrete second-order Poincaré inequality for the normal approximation of vectors of Rademacher functionals. In comparison with Theorem 3.1 it has the advantage that it expresses the bound for d 4 (F, N) only in terms of discrete first- and second-order Malliavin derivatives and does not involve the operator L −1 . This in turn allows one to apply the bound without specifying the chaos decomposition of the component random variables of the random vector F. Our result can be seen as the natural multivariate extension of the main result from [10], where a univariate discrete second-order Poincaré inequality has been obtained for the Kolmogorov distance (see also Remark 3.2 in [21] for a closely related bound for the Wasserstein distance).
Theorem 3.7. Let the conditions of Theorem 3.1 prevail and assume additionally that F i ∈ dom(D 2 ) for all i = 1, . . ., d. For i, j = 1, . . ., d define Then,

Proof. Let A 1 , A 2 and A 3 be the three terms defined in Theorem 3.1. We start with A 1 . An application of the triangle inequality yields Let us further consider the second summand on the right hand side of (39). By the integration by parts formula in (6) we see that and thus, by the Cauchy-Schwarz inequality and the Poincaré inequality in (9) we have that By the product formula for the discrete Malliavin derivative in (3) and the triangle inequality we get that, for every k, ℓ ∈ N, Plugging this into (40) and using the Cauchy-Schwarz inequality then yields with Each of these quantities will now be further bounded from above. Considering T 1 , an application of (7) and (5) as well as the triangle inequality yields that, for every ℓ ∈ N, Furthermore, by virtue of the monotone convergence theorem we get that, for every ℓ ∈ N, By using Jensen's inequality as well as the Cauchy-Schwarz inequality we then conclude that, for every Thus, another application of the Cauchy-Schwarz inequality leads to the bound The quantities T 2 and T 3 can be treated in the same manner as T 1 , and thus, it holds that Therefore, combining the bounds for T 1 , T 2 and T 3 with (41) and (39) yields Turning to the term A 2 , by several applications of the Cauchy-Schwarz inequality and due to (8) we see that For the third and last term A 3 we similarly see, by using Hölder's inequality with Hölder conjugates 4 and 4/3 as well as (8), that This completes the argument.
4 Applications to random graphs and random cubical complexes

Subgraph and degree counts in the Erdős-Rényi random graph
In this section we consider an application of the discrete second-order Poincaré inequality developed above to subgraph counts in the Erdős-Rényi random graph. To describe the model, let K n be the complete graph on n ∈ N vertices and fix p ∈ (0, 1). In what follows we implicitly assume that n is sufficiently large. We number the ( n 2 ) edges of K n in a fixed but arbitrary way and denote them by e 1 , . . ., e ( n 2 ) . Now, to each edge e k of K n a Rademacher random variable X k with success probability p is assigned and we remove e k from K n if X k = −1 and keep e k otherwise. This gives rise to the Erdős-Rényi random graph denoted by G(n, p), which has n vertices and a binomially distributed number of edges with parameters ( n 2 ) and p. In what follows we assume p to be independent of n. Let Γ be a fixed (finite, simple) graph and denote by X Γ the number of subgraphs of G(n, p) that are isomorphic to Γ (we assume here that all graphs we consider have at least one edge). To represent this counting statistic formally, we denote by v Γ the number of vertices and by e Γ the number of edges of Γ. Moreover, we shall denote by aut(Γ) the (finite) group of graph-automorphisms of Γ and by |aut(Γ)| its cardinality. Using this notation, X Γ may be written as X Γ = (1/|aut(Γ)|) Σ 1{Γ(i 1 , . . ., i v Γ ) ⊂ G(n, p)}, (42) where the sum is running over all v Γ -tuples (i 1 , . . ., i v Γ ) of distinct vertices of K n and Γ(i 1 , . . ., i v Γ ) denotes the copy of Γ placed on these vertices. To proceed, we also need information about the covariance between X Γ and X Φ for two graphs Γ and Φ. Before we state the result, let us introduce our asymptotic notation. We shall write a n ≍ b n for two positive sequences (a n ) n∈N and (b n ) n∈N provided that there exist constants 0 < c ≤ C < ∞ with c b n ≤ a n ≤ C b n for all sufficiently large n.

Lemma 4.1. Let Γ and Φ be two graphs and define X Γ and X Φ as above. Then, cov(X Γ , X Φ ) = c(Γ, Φ) n v Γ +v Φ −2 + O(n v Γ +v Φ −3 ) with a constant c(Γ, Φ) > 0 only depending on Γ and Φ.
Proof. Recalling (42), we see that cov By the independence properties of the construction of G(n, p) we have that cov(1{Γ ′ ⊂ G(n, p)}, 1{Φ ′ ⊂ G(n, p)}) ≠ 0 if and only if Γ ′ and Φ ′ have at least one common edge. In what follows we shall write e Γ ′ ,Φ ′ for the number of edges that Γ ′ and Φ ′ have in common. Thus, Now, we notice that the second sum is running over By choosing i = 1 in the first sum (a choice that leads to the asymptotically dominating term), we see that the third sum is running over ≍ n v Φ −2 |aut(Φ)| terms, since Γ ′ and Φ ′ have precisely one edge in common and there are ≍ n v Φ −2 possible choices for the v Φ − 2 missing vertices to build a copy Φ ′ of Φ in G(n, p). Moreover, taking into account all possible choices and orientations for this common edge gives rise to another factor 2e Γ e Φ . Summarizing, the term with i = 1 yields the asymptotic contribution Choosing i = 2 we see that there are two possible situations. Namely, the two common edges of Γ ′ and Φ ′ can or cannot have a common vertex. In the first situation and by the same reasoning as above, the asymptotic contribution is ≍ c 1 (Γ, Φ) n v Γ +v Φ −3 p e Γ +e Φ −2 (1 − p 2 ), while in the second case we have the asymptotic contribution ≍ c 2 (Γ, Φ) n v Γ +v Φ −4 p e Γ +e Φ −2 (1 − p 2 ) with constants c 1 (Γ, Φ), c 2 (Γ, Φ) > 0 only depending on Γ and Φ. Moreover, it is clear from this discussion that for all i ≥ 3 the corresponding terms in the above sum are of order O(n v Γ +v Φ −3 ). This proves the claim.

Now, let us turn to the multivariate central limit theorem for the subgraph counting statistics X Γ . For this, fix some d ∈ N and let Γ 1 , . . ., Γ d be d fixed (finite, simple) graphs with associated counting statistics X Γ 1 , . . ., X Γ d . For i ∈ {1, . . ., d} define the normalized random variables Our next result is the announced multivariate central limit theorem for X Γ , which adds a rate of convergence to the related result in the paper of Janson and Nowicki [6].
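For orientation, the counting statistic (42) can be illustrated with Γ a triangle; the following sketch (ours, with hypothetical helper names) counts triangles by iterating over unordered vertex triples, which absorbs the factor |aut(Γ)| = 6 of (42) directly.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, rng):
    """Keep each edge of K_n independently with probability p."""
    return {frozenset(e) for e in combinations(range(n), 2) if rng.random() < p}

def triangle_count(edges, n):
    """X_Gamma for Gamma a triangle: the number of vertex triples whose
    three connecting edges are all present in the graph."""
    return sum(
        all(frozenset(e) in edges for e in combinations(t, 2))
        for t in combinations(range(n), 3)
    )

# deterministic check: K_4 with one edge removed contains exactly two triangles
edges = {frozenset(e) for e in combinations(range(4), 2)} - {frozenset((0, 1))}
assert triangle_count(edges, 4) == 2

# a random instance, just to exercise the sampler
g = erdos_renyi(20, 0.3, random.Random(1))
assert 0 <= triangle_count(g, 20) <= 1140  # 1140 = C(20, 3)
```

Counting ordered vertex tuples instead and dividing by |aut(Γ)| would give the same value, matching the normalization in (42).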
Theorem 4.2. Let Σ = (Σ ij ) d i,j=1 be the matrix given by

Proof. It readily follows from Lemma 4.1 and the definition of the constants σ i in the statement of the theorem that, for all i, j = 1, . . ., d, cov( To evaluate the other terms in the bound provided by the discrete second-order Poincaré inequality in Theorem 3.7, we first consider for each i = 1, . . ., d and k, ℓ = 1, . . ., ( n 2 ) the first- and second-order discrete Malliavin derivatives D k X Γ i and D k D ℓ X Γ i . From the very definition it follows that and the difference ( k is just the number of copies of Γ i that contain edge e k . Since there are O(n v Γ i −2 ) possible choices for the v Γ i − 2 missing vertices to build such a copy, it follows that (X For the same reason we conclude that where |e k ∩ e ℓ | denotes the number of vertices that e k and e ℓ have in common. We can now start to bound, for each i, j = 1, . . ., d, the term B 1 (i, j). Using the Cauchy-Schwarz inequality, it first follows that Now, we have to distinguish different cases that are illustrated in Figure 1 (up to permutation of the indices k, ℓ and m). In case (i), we have O(( n 2 )) = O(n 2 ) possibilities to choose each of the three edges and by (43) each first-order discrete Malliavin derivative contributes O(n −1 ), while each second-order derivative contributes O(n −3 ) according to (44). Thus, in case (i) the sum is of order In case (ii), we have O(n 2 ) possibilities to choose each of the edges e k and e m , while there are only O(n) possibilities for e ℓ . Moreover, in view of (43) and (44) the first-order discrete Malliavin derivatives contribute again O(n −2 ), but the second-order derivatives contribute only O(n −5 ). Thus, the terms corresponding to case (ii) in the above sum are of order O(n .
The same behaviour is also valid for cases (iii), (iv), (v) and (vi), which shows that B 1 (i, j) 2 We are thus left with the terms B 3 (i, j) and B 4 (i, j) given by In B 3 (i, j) there are ( n 2 ) ≍ n 2 choices for k and the first-order discrete Malliavin derivatives are of order O(n −1 ) by (43), which shows that B 3 (i, j) = O(n −1 ) for all choices of i and j. Finally, in B 4 (i, j) there are again ( n 2 ) ≍ n 2 choices for k and once again by (43) the derivatives are of order O(n −1 ). Hence, B 4 (i, j) = O(n −2 ) for all possible choices of i and j. Summarizing, we conclude that where the constant hidden in the O-notation only depends on p and the graphs Γ 1 , . . ., Γ d . This completes the proof of the theorem.
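The order bound (43) on the first-order derivatives can be made tangible for triangles: flipping a single edge e k changes the triangle count by exactly the number of triangles through e k , i.e. by the number of common neighbours of its two endpoints. A small self-contained sketch (ours, not from the paper):

```python
from itertools import combinations

def triangles(edges, n):
    """Number of vertex triples whose three connecting edges are all present."""
    return sum(all(frozenset(e) in edges for e in combinations(t, 2))
               for t in combinations(range(n), 3))

n = 5
full = {frozenset(e) for e in combinations(range(n), 2)}
base = full - {frozenset((0, 1))}      # K_5 without the edge {0, 1}
e_k = frozenset((0, 1))

plus = triangles(base | {e_k}, n)      # count with X_k conditioned to +1
minus = triangles(base, n)             # count with X_k conditioned to -1

# the difference equals the number of common neighbours of 0 and 1
common = sum(1 for v in range(2, n)
             if frozenset((0, v)) in base and frozenset((1, v)) in base)
assert plus - minus == common == 3
```

Since the common-neighbour count is O(n^(v_Γ − 2)) = O(n) for triangles, normalizing by σ_i ≍ n^2 yields the O(n −1 ) behaviour used in the proof.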
Remark 4.3. The structure of the asymptotic covariance matrix Σ in the previous theorem implies that Σ has rank 1. Thus, Σ cannot be positive definite, but it clearly is positive semidefinite.
Remark 4.4. We believe that there are also other methods available in the existing literature that allow one to prove results similar to Theorem 4.2. For example, the multivariate exchangeable pairs approach used in [18] might be generalized to subgraph counts of arbitrary graphs. On the other hand, this might require serious technical effort, while our proof of the quantitative multivariate central limit theorem for subgraph counts basically only requires simple (asymptotic) counting arguments. A similar comment also applies to the random cubical complexes treated in the next section.
We continue our study of the Erdős-Rényi random graph G(n, p) by establishing a central limit theorem for the vertex degree statistic in the case that p = θ/(n − 1) for some θ ∈ (0, 1). Although the number of vertices of a given degree is a special case of a subgraph counting statistic as considered above, the significant difference here is that we allow the success probability p to vary with n.
For i ≥ 0 we denote by V i the number of vertices of degree i in the Erdős-Rényi random graph G(n, p) for a p ∈ (0, 1), where we assume that n is sufficiently large so that all quantities we deal with are well-defined. More formally, if we denote by v 1 , . . ., v n the n vertices of the complete graph K n , then The covariance between V i and V j for i, j ≥ 0 under the choice p = θ/(n − 1) for a θ ∈ (0, 1) has been investigated in [4] and we recall from Theorem 4.2 there that We define the normalized random variables F i , fix indices (i 1 , . . ., i d ) and define the random vector F D := (F i 1 , . . ., F i d ). From now on and for the rest of this subsection, we assume that the success probability p is of the form p = θ/(n − 1) for some fixed θ ∈ (0, 1). Then, it is easily seen from the expression for cov( as n → ∞. Our next result is a multivariate central limit theorem for the vertex degree vector F D . Let N Σ be a d-dimensional centred Gaussian random vector with covariance matrix Σ. Then, there exists a constant c = c(i 1 , . . ., i d , θ) > 0 only depending on i 1 , . . ., i d and θ such that d 4 (F D , N Σ ) ≤ c n −1/2 for all sufficiently large n.
Proof. From (46) we infer that cov(F i , F j ) → Σ ij , as n → ∞. Moreover, from the structure of the covariance (45) we also conclude that |cov( Next, we fix i ∈ {1, . . ., d} and k, ℓ ∈ {1, . . ., ( n 2 )}. As in the proof of Theorem 1.3 in [10] we notice that adding or removing an edge from G(n, p) results in a change of at most 2 for the number of vertices of degree i. In other words, For the second-order discrete Malliavin derivative we observe that D k D ℓ F i is zero whenever k = ℓ or the edges e k and e ℓ corresponding to k and ℓ, respectively, do not have a common vertex. Thus, it follows that We can now evaluate the terms B 1 (i, j) to B 4 (i, j) in Theorem 3.7. Using the Cauchy-Schwarz inequality we first conclude that Similarly, we have that B 2 (i, j) 2 = O(n −1 ). For the remaining terms B 3 (i, j) and B 4 (i, j) we see that By Theorem 3.7 we have thus proved the result.
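The key estimate in the proof above, namely that flipping a single edge changes each vertex-degree count V i by at most 2, can be checked mechanically; a minimal sketch (ours, with a hypothetical small graph):

```python
from itertools import combinations

def degree_counts(edges, n):
    """Return {i: number of vertices of degree i} for a graph on n vertices."""
    deg = [0] * n
    for e in edges:
        u, v = tuple(e)
        deg[u] += 1
        deg[v] += 1
    counts = {}
    for d in deg:
        counts[d] = counts.get(d, 0) + 1
    return counts

# flipping one edge only moves its two endpoints between degree classes,
# so each V_i changes by at most 2
n = 6
base = {frozenset((0, 1)), frozenset((1, 2)), frozenset((3, 4))}
e = frozenset((0, 2))
with_e = degree_counts(base | {e}, n)
without_e = degree_counts(base - {e}, n)
for i in range(n):
    assert abs(with_e.get(i, 0) - without_e.get(i, 0)) <= 2
```

Running the same check over all edges of all graphs on a few vertices gives the uniform bound used for the first-order derivatives D k F i .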

Intrinsic volumes of random cubical complexes
Fix a space dimension d ≥ 1, an integer n ≥ 3 and consider the lattice To avoid boundary effects, we identify opposite faces in L, a convention which supplies L with the topology of a d-dimensional torus. Now, we number the cubes in L in a fixed but arbitrary way and assign to each cube C k ∈ L a Rademacher random variable X k such that P(X k = 1) = p and P(X k = −1) = 1 − p =: q for some fixed parameter p ∈ (0, 1). Following the paper of Werman and Wright [20], the voxel model for a so-called random cubical complex C arises from L by removing each cube C k for which X k = −1, see Figure 2. It should be clear that the random cubical complexes arising in this way may be represented as finite unions of disjoint open cubes of dimensions 0, 1, . . ., d, corresponding to the vertices, edges, etc. We are interested in the intrinsic volumes V j (C) of the random cubical complex C for all j ∈ {0, 1, . . ., d}.
To define these quantities formally one can follow approach from [5], where Groemer introduced a way to define intrinsic volumes for the relative interior of a convex body.Since we are interested only in finite unions of cubes, we go the more direct way also used in [20].Namely, for a δ ∈ N, by a closed δ-cube we understand any translate of [0, 1] δ , while an open δ-cube refers to a translate of (0, 1) δ .The intrinsic volume V j (C) of order j ∈ {0, 1, . . ., i} of a closed δ-cube C is given by V j (C) = ( δ j ), while the jth intrinsic volume of an open δ-cube D is V j (D) = (−1) δ−j ( δ j ) =: V j (δ).Finally, for the random cubical complex C as defined above we have the following representation for V j (C) from [20]: From this representation it readily follows that where N δ = ( d δ )n d denotes the number of δ-cubes in L and P δ = 1 − q 2 d−δ is the probability that an arbitrary δ-cube is included in C, see [20].Although the variance of V j (C) has been computed in [20], in our context we will also need information about the covariance structure between V i (C) and V j (C).This is provided by the next lemma.Lemma 4.6.Let i, j ∈ {0, 1, . . ., d}.Then, Proof.We first notice that for two open cubes D and D ′ in L (possibly having different dimensions) the random variables ξ D and ξ D ′ are independent whenever D and D ′ are not faces of a common d-dimensional cube from L. Thus, using (47) we conclude that with the sum running over all open cubes D, D ′ in L that are faces of a common d-cube.To evaluate this sum, we observe that for each pair of cubes D, D ′ there is a unique cube C(D, D ′ ) of which D and D ′ are common faces and which has the smallest dimension among all such cubes (in fact, the existence of such a cube is the reason why n ≥ 3 is assumed in this section).On the contrary, if C is a cube of dimension δ ∈ {0, 1, . . 
., d}, we let N_{a,b,δ} be the number of pairs of open cubes D and D′ of dimensions a and b, respectively, for which C(D, D′) = C. We notice that the value of N_{a,b,δ} does not depend on the particular choice of C and is given by Equation (18) in [20]; in particular, N_{a,b,δ} is independent of n. Moreover, following Equation (20) in [20], we denote by P_{a,b,δ} the probability that both D and D′ are included in the cubical complex C. By the translation invariance of the torus, the covariance thus decomposes into n^d identically structured local contributions, each expressed in terms of the quantities N_{a,b,δ}, P_{a,b,δ}, P_a and P_b. Evaluating the resulting expectations according to the discussion above, the proof is complete.

Now, define for j ∈ {0, 1, . . ., d} the centred and normalized random variables Ṽ_j(C) := n^{−d/2}(V_j(C) − E[V_j(C)]) as well as the random vector V := (Ṽ_0(C), Ṽ_1(C), . . ., Ṽ_d(C)). Our next theorem provides a bound for the multivariate normal approximation of V and in this way extends Theorem 4 in [20].
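The local-dependence structure used in the proof, namely that ξ_D and ξ_{D′} are independent unless D and D′ are faces of a common d-cube, can be verified exhaustively in a small case. The following check is our own, for d = 2 and n = 3, with our own indexing of edges by a base square and a spanned direction; it enumerates all 2^9 keep-configurations exactly.

```python
import itertools

# Exhaustive check of the local dependence for d = 2, n = 3: an open edge is
# present in C iff at least one of its two adjacent squares is kept.
n, p = 3, 0.3
squares = list(itertools.product(range(n), repeat=2))

def adj_squares(z, axis):
    # the two torus squares adjacent to the open edge based at z and
    # spanning coordinate direction `axis` (our own indexing convention)
    x, y = z
    return [(x, y), (x, (y - 1) % n)] if axis == 0 else [(x, y), ((x - 1) % n, y)]

def prob(event):
    # exact probability of `event`, a predicate on keep-configurations
    total = 0.0
    for bits in itertools.product((0, 1), repeat=len(squares)):
        kept = dict(zip(squares, bits))
        weight = 1.0
        for b in bits:
            weight *= p if b else 1.0 - p
        if event(kept):
            total += weight
    return total

def present(kept, z, axis):
    return any(kept[s] for s in adj_squares(z, axis))

p_edge = prob(lambda k: present(k, (0, 0), 0))  # marginal = 1 - (1 - p)^2
joint_far = prob(lambda k: present(k, (0, 0), 0) and present(k, (1, 1), 0))
joint_near = prob(lambda k: present(k, (0, 0), 0) and present(k, (0, 1), 0))
```

With p = 0.3 the marginal edge probability is 1 − 0.7² = 0.51; for the pair of edges with disjoint adjacent squares the joint probability factorizes exactly, while the pair sharing the square (0, 0) exhibits strictly positive covariance.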
Theorem 4.7. Let Σ := (Σ_{ij})_{i,j=0}^{d} be the matrix with entries Σ_{ij} := c(i, j), where the constants c(i, j) are given by Lemma 4.6. Then, there exists a constant C = C(p, d), depending only on p and d, such that d_4(V, N_Σ) ≤ C n^{−d/2}, where N_Σ is a (d + 1)-dimensional centred Gaussian vector with covariance matrix Σ.
Proof. By Lemma 4.6 it follows that cov(Ṽ_i(C), Ṽ_j(C)) = Σ_{ij} for all i, j ∈ {0, 1, . . ., d}. Thus, it only remains to bound the terms B_1(i, j) to B_4(i, j) in Theorem 3.7. To this end, we need appropriate estimates for the first- and second-order discrete Malliavin derivatives D_k Ṽ_i(C) and D_k D_ℓ Ṽ_i(C) for all i ∈ {0, 1, . . ., d}. For this, we recall the representation (47) and observe that for each k ∈ {1, . . ., n^d}, D_k Ṽ_i(C) can be written as √(pq)\, n^{−d/2} times a sum of at most 6^d summands, each of which is bounded independently of n.
Here, 6^d ≥ 2^{d−δ} · 3^d for any δ ∈ {0, 1, . . ., d}, where 2^{d−δ} is the number of d-dimensional cubes of which a fixed δ-dimensional cube is a face, while 3^d = \sum_{\delta=0}^{d} \binom{d}{\delta} 2^{d−δ} is the total number of faces of a d-dimensional cube. As a consequence, we find that |D_k Ṽ_i(C)| is of order n^{−d/2}, and by the triangle inequality the same holds for |D_k D_ℓ Ṽ_i(C)|, where the hidden constants only depend on d and on p. Now, it is crucial to observe that for any fixed k ∈ {1, . . ., n^d} the second-order discrete Malliavin derivative D_k D_ℓ Ṽ_i(C) is even identically zero whenever the cubes corresponding to k and ℓ are not neighbours of each other. Since any cube in L has only a finite number of neighbours, independently of n, we conclude that in the term B_1(i, j) provided by the multivariate discrete second-order Poincaré inequality in Theorem 3.7 there are exactly n^d choices for m and only a constant number of choices for k and ℓ for which the corresponding summand is non-vanishing. As a consequence, B_1(i, j)^2 is of order n^{−d} for any choice of i, j ∈ {0, 1, . . ., d}. Since the same behaviour can also be observed for the remaining terms B_2(i, j), B_3(i, j) and B_4(i, j), the claim follows.
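The two counting facts used here, that a d-cube has 3^d faces in total and that each δ-face of the lattice lies in 2^{d−δ} of its d-cubes, combine into the stated domination by 6^d since 2^{d−δ} ≤ 2^d. A quick numerical check (our own):

```python
from math import comb

for d in range(1, 9):
    # total number of faces of a d-cube, summed over face dimensions delta
    assert sum(comb(d, delta) * 2 ** (d - delta) for delta in range(d + 1)) == 3 ** d
    # 2^(d - delta) <= 2^d, hence 2^(d - delta) * 3^d <= 6^d for every delta
    assert all(2 ** (d - delta) * 3 ** d <= 6 ** d for delta in range(d + 1))
```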
Besides the voxel model, the authors of [20] also consider three further models for random cubical complexes: the plaquette model, the closed faces model and the independent faces model. For each of these models our method can be used to derive a multivariate central limit theorem for the random vector of their intrinsic volumes and to obtain bounds on the d_4-distance of order O(n^{−d/2}) in each case. We present the result only for the plaquette model, since it is close in spirit to the celebrated random simplicial complexes introduced by Linial and Meshulam [11], which have been the object of intensive study. The plaquette construction, introduced formally below, gives rise to a random set P and, as in the case of the voxel model C, we are interested in its intrinsic volumes V_j(P), j ∈ {0, 1, . . ., d}. Using the same notation as in the previous example, we formally have that V_j(P) = \sum_D ξ_D with the sum running over all open cubes in P, and hence E[V_j(P)] = \sum_{\delta=j}^{d} N_\delta P_\delta V_j(\delta). However, in the plaquette model the probabilities P_0, P_1, . . ., P_d satisfy P_d = p and P_δ = 1 for δ ∈ {0, . . ., d − 1}, which yields an explicit formula for E[V_j(P)] after some simplifications. Now, we define the centred and normalized random variables Ṽ_j(P) := n^{−d/2}(V_j(P) − E[V_j(P)]) and the (d + 1)-dimensional random vector W := (Ṽ_0(P), Ṽ_1(P), . . ., Ṽ_d(P)). The next result, Theorem 4.9, is a multivariate central limit theorem for the random vector W, comparing it with a (d + 1)-dimensional centred Gaussian random vector N_Σ having covariance matrix Σ. Since the arguments are the same as in the proof of Theorem 4.7, we have decided not to present the details.

Remark 4.10. As for subgraph counting statistics, it follows from the structure of the asymptotic covariance matrices Σ in Theorems 4.7 and 4.9 that Σ only has rank 1 and is hence only positive semidefinite rather than positive definite.
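The simplification of E[V_j(P)] presumably rests on the alternating-sum identity \sum_{\delta=j}^{d} (−1)^{δ−j} \binom{d}{\delta}\binom{\delta}{j} = \mathbf{1}\{j = d\}, which follows from \binom{d}{\delta}\binom{\delta}{j} = \binom{d}{j}\binom{d−j}{\delta−j} and the binomial theorem; since P_δ = 1 for δ < d, only the top-dimensional term carries a non-trivial probability. A quick numerical check of the identity (our own):

```python
from math import comb

def alt_sum(d, j):
    # sum_{delta=j}^{d} (-1)^(delta - j) * binom(d, delta) * binom(delta, j)
    return sum((-1) ** (delta - j) * comb(d, delta) * comb(delta, j)
               for delta in range(j, d + 1))

# the alternating sum collapses to 1 if j = d and to 0 otherwise
assert all(alt_sum(d, j) == (1 if j == d else 0)
           for d in range(8) for j in range(d + 1))
```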

Figure 1: The different cases arising in the proof of Theorem 4.2. We distinguish according to the number of vertices that the edge e_m has in common with e_k and e_ℓ, respectively. For example, the illustration in case (ii) means that |e_k ∩ e_m| = 0 and |e_ℓ ∩ e_m| = 1 or, vice versa, |e_k ∩ e_m| = 1 and |e_ℓ ∩ e_m| = 0. This allows both situations, |e_k ∩ e_ℓ| = 0 or |e_k ∩ e_ℓ| = 1 with |e_k ∩ e_ℓ ∩ e_m| = 0.

Figure 2: Illustrations of the voxel model C of a random cubical complex with d = 2 and n = 4 for increasing values of p.

Figure 3: Illustrations of the plaquette model P of a random cubical complex with d = 2 and n = 4 for increasing values of p. The grey cubes are included, while the white cubes are not included in P.

Theorem 4.9. Let Σ := (Σ_{ij})_{i,j=0}^{d} be the matrix given by Σ_{ij} = \binom{d}{i}\binom{d}{j} p(1 − p). Then there exists a constant C = C(p, d), depending only on p and d, such that d_4(W, N_Σ) ≤ C n^{−d/2}.

Fix d ∈ N and, for n ∈ N, let M_n(g) denote the supremum of the absolute values of the nth-order partial derivatives of an n times partially differentiable function g : R^d → R. The distance d_4 between (the distributions of) two d-dimensional random vectors X and Y is then defined as d_4(X, Y) := sup |E[g(X)] − E[g(Y)]|, where the supremum runs over all four times partially differentiable functions g : R^d → R with bounded partial derivatives fulfilling M_1(g), M_2(g), M_3(g), M_4(g) ≤ 1.

Lemma 3.3. Fix d ∈ N and let N := (N_1, . . ., N_d) be a centred Gaussian random vector with symmetric and positive semidefinite covariance matrix Σ := (Σ_{ij})_{i,j=1}^{d}. Furthermore, let g : R^d → R be a partially differentiable function with bounded partial derivatives and E[|N_i g(N)|] < ∞ for all i ∈ {1, . . ., d}. Then E[N_i g(N)] = \sum_{j=1}^{d} Σ_{ij} E[∂_j g(N)] for all i ∈ {1, . . ., d}.

To introduce the plaquette model formally, we fix d ≥ 1 and n ≥ 3, and define the set G := {∂[0, 1]^d + z : z ∈ {0, . . ., n − 1}^d}, where ∂[0, 1]^d stands for the boundary of the unit d-cube [0, 1]^d. The open cubes C_1, . . ., C_{n^d} in {(0, 1)^d + z : z ∈ {0, . . ., n − 1}^d} are assumed to be numbered in a fixed but arbitrary way, and we assign to each cube C_k a Rademacher random variable X_k with P(X_k = 1) = p and P(X_k = −1) = 1 − p =: q. The plaquette model now arises by joining to the set G those open cubes C_k for which the associated Rademacher random variable X_k takes the value 1, see Figure 3.
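For d = 2 the plaquette model is particularly transparent: all n² vertices and 2n² edges of G are always present, while each open square is added independently with probability p, so V_j(P) = n² V_j(0) + 2n² V_j(1) + K V_j(2) with K ∼ Bin(n², p). The following toy sketch is our own derivation from this observation, not code from [20]; the check that V_0(P) = 0 when every square is present reflects the Euler characteristic of the torus.

```python
import random
from math import comb

def V(j, delta):
    # j-th intrinsic volume of an open delta-cube
    return (-1) ** (delta - j) * comb(delta, j)

def plaquette_intrinsic_volumes(n, p, seed=0):
    """d = 2 plaquette model on the torus: the full 1-skeleton G is kept
    and each of the n^2 open squares is added independently with prob. p."""
    rng = random.Random(seed)
    K = sum(rng.random() < p for _ in range(n * n))  # number of squares added
    return [n * n * V(j, 0) + 2 * n * n * V(j, 1) + K * V(j, 2) for j in (0, 1, 2)]
```

With p = 1 every square is present and one finds (V_0, V_1, V_2) = (0, 0, n²), while with p = 0 only the 1-skeleton remains and (V_0, V_1, V_2) = (−n², 2n², 0).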