The needlets bispectrum

The purpose of this paper is to join two different threads of the recent literature on random fields on the sphere, namely the statistical analysis of higher order angular power spectra on one hand, and the construction of second-generation wavelets on the sphere on the other. To this aim, we introduce the needlets bispectrum and we derive a number of convergence results. Here, the limit theory is developed in the high resolution sense. The leading motivation of these results is the need for statistical procedures for searching non-Gaussianity in Cosmic Microwave Background radiation.


Introduction
The purpose of this paper is to join two different threads of the recent literature on random fields on the sphere, namely the statistical analysis of higher order angular power spectra on one hand, and the construction of second-generation wavelets on the sphere on the other hand. More precisely, we shall be concerned with zero-mean, mean square continuous and isotropic random fields on the sphere, for which the following spectral representation holds [1,4]: for x ∈ S 2 , where {a lm } l,m , m = 1, . . . , l is a triangular array of zero-mean, orthogonal, complex-valued random variables with variance E|a lm | 2 = C l , the angular power spectrum of the random field. For m < 0 we have a lm = (−1) m a l−m , whereas a l0 is real with the same mean and variance. (1)  where we have moved to spherical coordinates x = (ϑ, ϕ), 0 ≤ ϑ ≤ π and 0 ≤ ϕ < 2π. It is a well-known result that the spherical harmonics provide a complete orthonormal systems for L 2 (S 2 ) [36].
The analysis of random fields on the sphere has recently gained very strong physical motivations, due to the overwhelming amount of data which is becoming available on Cosmic Microwave Background radiation (hereafter (CMB)). As detailed elsewhere [15], to the first order we can view CMB data as maps of the Universe in the immediate adjacency of the Big Bang. The first of these maps were provided by the satellite experiment COBE in 1993, and in view of this G. Smoot and J. Mather were awarded the Nobel prize for Physics in 2006. Much more refined maps have been made available by another NASA satellite experiments, WMAP (http://wmap.gsfc.nasa.gov/); still more refined data are expected from the ESA mission Planck, scheduled to be launched in October 2008. These huge collaborations involve hundred of scientists and are expected to provide invaluable information on Physics and Cosmology. At the same time, these massive data sets have called for huge statistical challenges, ranging from power spectrum estimation to outlier detection, testing for isotropy, efficient denoising and map-making, handling with missing data, testing for non-Gaussianity and many others.
Among these issues, particular interest has been driven by efficient testing for non-Gaussianity. This is due to strong physical motivations (the leading paradigm for the Big Bang dynamics predicts (very close to) Gaussian fluctuations) and difficulties in finding a proper statistical procedure. There is by now a wide consensus that the most efficient procedures to probe non-Gaussianity are based upon the bispectrum, in the idealistic circumstances where the spherical random field is fully observed: see for instance [8,11,20,23,24,37]. However, the properties of the bispectrum are also known to deteriorate dramatically in the presence of missing observations, see again [11]. To handle the latter problem, a general approach is to focus on wavelet, rather than Fourier transforms. The construction of spherical wavelets has recently drawn an enormous amount of attention in the literature, see for instance [2,27,38] and the references therein. In our view, a particularly convenient tight frame construction on the sphere is provided by so-called needlets, which were introduced in [30,31]; their applications to spherical random fields is due to [6,7]. Needlets enjoy two properties which seem especially worth recalling: they are quasi-exponentially localized in the real domain and compactly supported in the harmonic domain. A further, quite unexpected property is as follows: random needlets coefficients are asymptotically uncorrelated at the highest frequencies and hence, in the Gaussian case, independent, see again [6]. This latter feature is rather surprising in a compact domain and makes asymptotic theory possible even in the presence of a single realization of a spherical random field.
Our aim in this paper is to borrow ideas from the bispectrum and the needlets literature to propose and analyze a needlets bispectrum, where the random coefficients in the needlets expansion are combined in a similar way to the bispectrum construction. The aim is to obtain a procedure which mimicks the ability of the bispectrum to search for non-Gaussianity at the most efficient combination of frequencies, at the same time providing a much more robust construction in the presence of missing data, as typical of the needlets. The plan of the paper is as follows: in Section 2 we review some background material on spherical random fields, the bispectrum and the needlets construction. In Section 3 we introduce the needlets bispectrum and we establish a central limit theorem, in the high resolution sense. In Section 4 we go on to establish a functional central limit theorem for the integrated needlets bispectrum; in Section 5 we provide some preliminary discussion on the behaviour under non-Gaussian assumptions and discuss some possibilities for applications and further research.

A Review on Needlets
In this subsection we shall briefly recall the main features of the needlets construction. As mentioned above, needlets were first introduced in the Functional Analysis literature by [30,31], whereas the investigation of their properties from a stochastic point of view is due to [5,6] and [7]; see also [19,17]. We need first to introduce some notation and definitions, which are largely identical to those in [6,31].
Given any two positive sequences The standard (open and closed) balls in S 2 are given as always by B(a, α) = {x, d(a, x) ≤ α}, B • (a, α) = {x, d(a, x) < α}. For a general A ⊂ S 2 we will denote by |A| the spherical measure of A. Now fix ǫ > 0 and For all x i ∈ Ξ ǫ , the associated family of Voronoi cells is defined by: It is proved in [7] that there are at most 6π 2 adjacent cells to any given cell.
For the construction of needlets, we should first start to define K l as the space of the restrictions to the sphere S 2 of polynomials of degree less than l. The next ingredient are the set of cubature points and cubature weights; indeed, it is now a standard result (see for instance [29]) that for all j ∈ N, there exists a finite subset X j of S 2 and positive real numbers λ jk > 0, indexed by the elements of X j , such that Given a fixed B > 1, we shall denote by {ξ jk } the cubature points corresponding to the space K [3B j+1 ] , where [.] represents as usual integer part. It is known that {X j } ∞ j=0 can be taken s.t. the cubature points for each j are almost uniformly ǫ j −distributed with ǫ j := κB −j , and the coefficients {λ jk } are such that λ jk ≈ cB −2j , card {X j } ≈ B 2j . Now let φ be a C ∞ function supported in |ξ| ≤ 1, such that 0 ≤ φ(ξ) ≤ 1 and φ(ξ) = 1 if |ξ| ≤ B −1 , B > 1. Following again [30,31], we define It is immediate to verify that b(ξ) = 0 only if 1 B ≤ |ξ| ≤ B. The needlets frame {ϕ jk } is then constructed as The main localization property of {ϕ jk } is established in [30], where it is shown that for any M ∈ N there exists a constant c M > 0 s.t., for every ξ ∈ S 2 : More explicitly, needlets are almost exponentially localized around any cubature point, which motivates their name. In the stochastic case, the (random) spherical needlet coefficients are then defined as We have immediately i.e., the (weighted) sample mean of the needlets coefficients is identically zero at all levels j. The proof is trivial, because The variance of the needlets coefficients is given by Note that we have σ 2 jk ≈: σ 2 j uniformly over k, where From now on, we shall typically focus on the normalized needlets coefficients, defined as β jk := β jk /σ j .
To investigate the correlation, we introduce now the same, mild regularity conditions on the angular power spectrum C l of the random field T (x) as in [5,6].
Condition A The random field T (x) is Gaussian and isotropic with angular power spectrum such that, for all B > 1, there exist α > 2, and {g j (.)} j=1,2,... a sequence of functions such that where Condition 8 entails a weak smoothness requirement on the behaviour of the angular power spectrum, which is trivially satisfied by some cosmologically relevant models (where the angular power spectrum usually behaves as an inverse polynomial, see again [15], pp. 243-244). For instance, considering positive constants d j , j = 1, . . . , p, p > 2, for all B > 1 condition A holds for Under Condition 8 a crucial and rather unexpected property of the random needlets coefficients is established in [6], namely the correlation bound where d(ξ jk , ξ jk ′ ) = arccos( ξ jk , ξ jk ′ ). In words, (9) is stating that as the frequency increases, spherical needlets coefficients are asymptotically uncorrelated (and hence, in the Gaussian case, independent), for any given angular distance. This property is of course of the greatest importance when investigating the asymptotic behaviour of statistical procedures: in some sense, it states that it is possible to derive an infinitely growing array of asymptotically independent "observations" (the needlets coefficients) out of a single realization of a continuous random field on a compact domain. It should be stressed that this property is not by any means a consequence of the localization properties of the needlets frame. As a counterexample, it is easy to construct spherical frames having bounded support in real space, whereas the corresponding random coefficients are not at all uncorrelated (recall the angular correlation function can be taken to be bounded from below at any distance on the sphere). We recall briefly two further results that we shall exploit often in the sequel; more precisely, ( [6], Lemma 10 ) we note that for M > 2, j ∈ N, ∀ k, k ′ , Also ( [31], Lemma 4.8 ), for some C ′ M depending only on M we have the inequality From the computational point of view, we should stress that needlets are not only feasible, but indeed extremely convenient. The implementation can be performed with a minimal effort by means of standard packages for the analysis of spherical random fields such as HEALPIX or GLESP ( [18] and [16]), see for details [26], where plots and numerical evidence on localization and uncorrelation are also provided.

Diagram Formula
To complete our background, we need a quick review on the diagram formula. This is material which can now be found at a textbook level (see for instance [34]); nevertheless, we need a brief overview to fix notation. Denote by H q the q−th order Hermite polynomials, defined as We now introduce diagrams, which are basically mnemonic devices for computing the moments and cumulants of polynomial forms in Gaussian random variables. Our notation is the same as for instance in [23,24]. Let p and l j , j = 1, . . . , p be given integers. A diagram γ of order (l 1 , . . . , l p ) is a set of points {(j, l) : 1 ≤ j ≤ p, 1 ≤ l ≤ l j } called vertices, viewed as a table W = − → l 1 ⊗· · ·⊗ − → l p and a partition of these points into pairs called edges. We denote by Γ(W ) the set of diagrams of order (l 1 , . . . , l p ). If the order is l 1 = · · · = l p = q, for simplicity, we also write Γ(p, q) instead of Γ(W ). We say that: a) A diagram has a flat edge if there is at least one pair {(i, j)(i ′ , j ′ )} such that i = i ′ ; we write Γ F for the set of diagrams that has at least one flat edge, and Γ F otherwise. b) A diagram is connected if it is not possible to partition the rows − → l 1 · · · − → l p of the table W into two parts, i.e. one cannot find a partition K 1 ∪ K 2 = {1, . . . , p} that, for each member V k of the set of edges (V 1 , . . . , V r ) in a diagram γ, either V k ∈ ∪ j∈K1 − → l j , or V k ∈ ∪ j∈K2 − → l j holds; we write Γ C for connected diagrams, and Γ C otherwise.
c) A diagram is paired if, considering any two sets of edges {(i 1 , j 1 )(i 2 , j 2 )} {(i 3 , j 3 )(i 4 , j 4 )}, then i 1 = i 3 implies i 2 = i 4 ; in words, the rows are completely coupled two by two. We write Γ P for the set of diagrams for paired diagrams, and Γ P otherwise.
where, for each diagram G, η ij (G) is the number of edges between rows l i , l j and We have now all the preliminary material to define our needlets bispectrum on S 2 , as explained in the following Section.

The needlets bispectrum
As mentioned in the introduction, the recent literature suggests that the most powerful statistic to search for non-Gaussianity in fully observed spherical random fields is the (normalized) angular bispectrum, defined as where the symbol in brackets represents the so-called Wigner's 3j coefficients, which are meant to ensure the statistics is rotationally invariant. Wigner's 3j coefficients arise in many different instances, especially in the quantum theory of angular momentum (see [36], where explicit expressions are also provided). Up to a normalization factors, they are equivalent to the so-called Clebsch-Gordan coefficients, which play an important role in representation theory for the group of rotations SO(3), see [25] for a much more detailed discussion and probabilistic applications. Following [20,37], an alternative definition of the (normalized) bispectrum can be considered, namely where i.e. we focus on the Fourier projections of the random fields at multipoles (l 1 , l 2 , l 3 ). Both versions of the bispectrum have been shown to be extremely powerful against non-Gaussian alternatives, indeed there is now a widespread consensus that they make up the most efficient statistical procedures to search for non-Gaussianity, at least in the presence of fully observed spherical maps (see for instance [11,21,28]). On the other hand, it is also well-known that the performance of Fourier methods in general, and the bispectrum in particular, deteriorates quite clearly in the presence of missing observations/partially observed maps ( [20,22,11]). A natural idea is then to explore the localization properties of the needlets in harmonic domain, together with their robustness in the presence of gaps, in order to devise a statistic which might mimick the positive features of the bispectrum, at the same time coping with the difficulties brought in by missing observations.
To this aim, we shall consider the (normalized) needlets bispectrum where I(.) denotes the indicator function and V jk is the Voronoi Cell that corresponds to ξ jk ; note that for j 2 = j 3 we have h j1j2j3 ≈ λ j2k2 . It is immediate to see that (13) can be seen as a natural development of (12), where the convolution with the orthonormal projector operator L l is replaced by the (discretized) convolution with the frame operator projection λ jk l b(l/B j )L l . Of course, in practice (12) is unfeasible and requires discretization to be implemented. In words, we are considering a version of the bispectrum where the exact identification of the multipoles is blurred by a form of suitable smoothing, with the purpose of a better robustness against missing observations. The summation convention in (13) needs some further discussion. In practice, for applications the Voronoi tessellation is chosen in such a way to be nested across different scales (this is the case, for instance, for HEALPix [18], which is the standard package for CMB applications). Under such circumstances, our procedure can be described more explicitly as follows: we take where In other words, the "finest grid" X j3 is the one which leads the summation, whereas smaller frequency terms are identified with those centres whose corresponding Voronoi cells include the points being summed. Note, however, that in the general non-nested case the centre of V j1k1 needs not belong to X j3 . In the sequel, for notational simplicity we write k 1 , k 2 rather than k 1 (k 3 ), k 2 (k 3 ), when no ambiguity is possible.
To investigate the asymptotic behaviour of the needlets bispectrum, we shall make an extensive use of the Diagram Formula which was introduced in the previous section. A crucial point, of course is the determination of the frequencies where the needlets bispectrum is evaluated. We distinguish three cases, namely: Case 1 corresponds to the situation where all three frequencies are different. Case 2) is basically the "squeezed" or collapsed configuration which is considered by [3,23], and many other cosmological references; in words, one frequency is (much) smaller than the other two. It has been widely argued in the physical literature that this configuration corresponds to the highest power region for so-called local models of non-Gaussianity. Case 3 corresponds to so-called equilateral configurations; this case, however, can be largely investigated by means of results in [6] and we report it only for completeness, omitting many details in the proof. It should be noted that for case 1) and 2) we focus on frequencies that are at least two steps apart, in order to exploit the semiorthogonality properties of the needlets systems. Relaxing this assumption implies no new ideas and would only make the paper notationally more complicated.
In each of the three cases we have trivially EI j1j2j3 = 0. We now focus on the asymptotic behaviour; here, asymptotics must be understood in the high resolution sense, i.e. we focus on a single realization of an isotropic random field, and we investigate the behaviour of our statistics at higher and higher bands. The first task is to ensure the statistics are non-degenerate and do not exhibit an explosive behaviour; this is the aim of the next Lemma. While the bound from above is quite straightforward, the lower bound is much more complicated and settles a question which was left open in [5], where the lower limit was simply assumed to be strictly larger than zero even for the simple case where j 1 = j 2 = j 3 . As it is evident from the proof, the integer K depends on the choice of cubature points and of kernel function b(.); more explicit expressions can be found below. Note also that unless the three bands are equal, condition b) cannot be satisfied for B = 2; indeed in CMB applications values of order B ≃ 1.2, 1.3 are likely to be favoured.
The bounds are uniform with respect to j 1 , j 2 .
Proof. (a) In the sequel, we shall use the fact that the set of the cubature points of polynomial spaces with degree less than B j are a κB −j −net; we also define ρ := max j,k {B −2j λ j,k }. The proof of all three cases are similar; we shall focus on case 2) j 1 + 1 < j 2 = j 3 .
Since λ jk ≃ B −2j for every ξ jk , and #{k 2 , k 2 ∈ V j1k1 ∩ X j2 } ≃ B 2j2−2j1 , the sum can be readily bounded by in view of (10). This completes the proof of part a).
(b) The proof of the lower bound on the variance is considerably more delicate. We recall the correlation of the needlets coefficient is provided by where θ = arccos ξ jk , ξ jk ′ . The idea of our argument is to replace the needlets coefficients in the coarsest grid X j1 by coefficients with the same resolution but evaluated over the pixels X j2 ∩ V j1 (ξ j1k1 ). This will allow us to circumvent the fact that the cubature weights at the smaller frequencies stay constant over a Voronoi cell, whereas those corresponding to the highest j's may vary. We shall hence consider Let us now recall a few properties of Legendre polynomials, that we shall use extensively soon (see [36] for details); we have sup θ∈[0,π] P l (cos θ) = P l (cos 0) = 1, and sup As a consequence, for any positive ǫ < 1, there exists a δ > 0, s.t. if 0 ≤ θ < δ ≤ ǫ/ (3l) , then, Now fix an integer K such that K ≥ log B 6κ/ǫ; if we let {ξ j,k } j,k be the cubature points of polynomial space with degree less than B j+K+1 , (note that we can assume all of these sets are κB −j −nets), then for any

It follows that
where C is a constant, C = C (κ) (see [29]). We are now in the position to establish our lower bound. By simple algebraic manipulations, we have In view of (15), (10) and standard manipulations, the second and third components can be made arbitrarily small. To bound the first component, we recall first two facts involving spherical harmonics (see again ( [36,23] for details), namely the so-called Gaunt integral where for the Wigner's 3j coefficients introduced on the right-hand side we recall the properties Using the fact that {λ jk } are the cubature weights corresponding to the space K 3B j+1 , easy manipulations yield In the previous argument, we have taken c := inf j1j2j3 c j1j2j3 , where It is simple to show that c > 0. Indeed, under (4), we have The same argument as before could be used to establish lower bounds when j 1 = j 2 < j 3 or j 1 = j 2 = j 3 . To conclude our proof, we consider the case when j 1 < j 2 = j 3 . We obtain , the two summands corresponding to the cases where k 2 , k ′ 2 belong to the same or to different Voronoi cells, respectively. The first part is equal to while the second part is equal to where c 1 = C(κ), c 2 = C (κ, ρ, C M ) , and ǫ ≤ 1 2 c/C M , the argument is completed. The proof for case (3) is similar (actually slightly easier).
The following weak convergence theorem is the main result of this Section. We stress that the statement could be easily extended to a multivariate Central Limit Theorem; however, because this extension would not entail any substantial novelty, at the same time making the notation much more cumbersome, we prefer to stick to the univariate case.
Theorem 2. Let T (x) be a zero-mean, mean square continuous and isotropic Gaussian random field, with angular power spectrum that satisfies Condition A. As j 1 → ∞, we have I j1j2j3 Proof. In view of the results in [32], to establish a Central Limit Theorem for a multilinear form in Gaussian random variables, it is enough to investigate the asymptotic behaviour of fourth order moments (or equivalently cumulants), see also [14]. Our aim is then to show that, as j 1 → ∞, By the diagram formula we have, for all index sets I: Similarly to [23], we define where Γ C is the set of all connected diagrams. To conclude our argument, we only need to show that and is an immediate consequence of the definition of EI 2 j1j2j3 and trivial combinatorial manipulations(see [34]). The result in (18) is proved by splitting connected diagrams into those with or without flat edges. Diagrams with flat edges are dealt with in Lemma 3, while those without are dealt with in Lemma 4. We stress that, on the contrary of what is often the case when the diagram formula is applied, terms with flat edges do not vanish, due to correlation among different locations in the spherical needlets coefficients. This completes the proof of the main result.
Lemma 3. For a connected diagram with flat edges, γ ∈ Γ CF (4, 3), we have ..,4,b=1,2,3 for the elements in our diagram, a and b denoting the row and column indexes, respectively. We recall also that k 2 = k 2 (k 3 ), k 1 = k 1 (k 3 ), as explained earlier. For Case 1), i.e. j 1 < j 2 − 1 < j 3 − 2, since E{β j3k3 β j2k2 } = 0 for every k 2 ∈ X j2 , k 3 ∈ X j3 ,it is easy to see that ρ(γ; j 1 , j 2 , j 3 ) ≡ 0. For Case 2), i.e. j 1 + 1 < j 2 = j 3 , we assume (without loss of generality) that a flat edge is present in the first row of the diagram, i.e. we let By the same argument as in (7) we obtain immediately On the other hand, under j 1 = j 2 < j 3 − 1, again we assume a flat edge 1 , j 2 )} ∈ γ; by necessity, there should exist another flat edge in the graph, and w.l.o.g. we take it be in the fourth row, i.e. {(k , and the resulting term can be bounded by 3 ) 3 ) Finally for Case 3), the argument is analogous; more precisely, components with diagrams in Γ CF (4, 3) can be bounded by where in the third equation we used (9). Thus the proof is completed.
Lemma 4. For a connected diagram without a flat edge, γ ∈ Γ CF (4, 3), we have Proof. Connected diagrams without flat edges and with four nodes can be partitioned into two classes, i.e. so-called cliques, where each vertex is connected to all three others, and terms with loops of order 2. We focus on the former class; without loss of generality, we can express ρ(γ; j 1 , j 2 , j 3 ) by 2 )ρ j2 (k 3 )ρ j3 (k 3 ) By means of (9) we readily obtain the bound 2 )) M × 1 Consider first Case 2), with j 1 + 1 < j 2 = j 3 . Using inequality (11), (20) can be replaced by Likewise, for j 1 = j 2 < j 3 − 1, we have the bound This concludes the proof for Case 2); the proof for Case 3) could be implemented along identical lines. The analysis for case 1) j 1 + 1 < j 2 < j 3 − 1, i.e. the case where all three frequencies differ, requires considerably more care. As before, let V ju (k u ∈ X ju . Our idea is to partition (20) into four elements, as follows: The first three summands are easy to bound, indeed it is enough to notice that 3 )) M × 1 by taking r = B −j1/2 , compare Lemma 10 in [6]. The argument for the second and third term is analogous. Concerning the last summand, we recall that for every k ∈ X j1/2 . Now denote where r = κB − 1 2 j1 . Heuristically, the idea is to split Ω(k) into regions where (k 1 , k 2 , k 3 , k 4 ) are each "close" to all three others, and regions where they are close two by two but the two pairs need not belong to the same neighbourhood. More precisely, we take Ω(k; We can hence define In the region Ω 1 (k; j 3 ), the idea is to keep k fixed and then proceed by evaluating the cardinality of B 6r (k); in view of (21), this leads to k∈X j 1 /2 (k1,k2,k3,k4)∈Ω1(k;j3), by (9) and Lemma 4.8 in [31]. On the other hand, in the region Ω 2 (k), we exploit the fact that d(k 1 , k 2 ), d(k 3 , k 4 ) ≥ 3r = 3κB −j1/2 , so that we obtain the upper bound (for some C > 0) 2 )) M × 1 The proof for the remaining terms is very similar and hence omitted for brevity's sake.

Unknown angular power spectrum
As the final result of this Section, we wish to focus on the case where the variance of the needlets coefficients is unknown and estimated from the data. A natural estimator for σ j is provided by where as before N j = card {X j } ≈ B 2j . We define our studentized statistics as β jk := β jk / σ j and we then consider Our next result shows that this studentization procedure has no effect on asymptotic behaviour. N (0, 1).
Proof. By a standard application of Slutzki's theorem, the weak convergence result follows immediately from the consistency of the variance estimator. We provide a proof for the three cases separately.
For case 2), we focus on the case where j 1 = j 2 < j 3 ; the remaining part of the proof is nearly identical. First note that By (9) and [31] (Lemma 4.8), we have Hence the variance is bounded by The proof for case 3) is an easy consequence of results by [6,7].

Convergence to multiparameter Gaussian processes
Our aim in this Section is to extend the previous results to functional convergence theorems. The motivation for such an extension can be easily explained.
Indeed, from the applications points of view, practitioners are typically interested not only at the possible existence of non-Gaussianity and/or other features, but also to their location in frequency space. If we focus for instance on cosmological applications, which are the main driving rationale behind our work, it is important to recall that the existence of possible non-Gaussianities takes a very different meaning according to the scales where they are located, so that a suitable statistical procedure should provide information not only on their existence, but also on their position in the frequency domain. As an example, a huge debate has arisen in the Cosmological literature on the possible existence of a non-Gaussian "Cold Spot" in CMB data, much of the related literature concerning the determination of the angular scale of such features, see for instance [12,13]. Concerning this feature, it may even be of interest to test for Gaussianity only on a subspace of the sphere (this is indeed what happens in practice, because of missing observations). The modification of (13) under these circumstances is straightforward: we would simply restrict our sum to a subset of the cubature points. Our following discussion would be asymptotically unaltered.
In [11,23], it was proposed to build alternative forms of partial sum process from the bispectrum B l1l2l3 , and to use them as a probe of non-Gaussianity at various scales. All different proposals were univariate, in the following sense. Assume the resolution of the experiment is such that frequencies up to l = 1, . . . , L are observed; the partial sums were then run only on a subset of configurations with cardinality L, whereas the number of multipole combinations (l 1 , l 2 , l 3 ) which would be available for statistical analysis is in the order of L 3 . One of the reasons for this restriction had to do with computational complexity: the evaluation of even a single bispectrum statistic is extremely time consuming, so that the exploration of all possible configurations is likely to be unfeasible even on the greatest supercomputing facilities. On the contrary, needlets are extremely convenient from a computational point of view, and there is no obstacle in considering larger frequency configurations, provided of course that the triangle conditions are satisfied.
We shall hence focus on two possible partial sums processes, which correspond broadly to cases 1 and 2 of the previous section; more precisely , K ≥ 0 is some integer satisfying the constraint determined in Lemma 1, where W (.) is two-dimensional Brownian sheet, i.e. the zero mean Gaussian process with covariance function EW (r 1 , r 2 )W (s 1 , where W (.) is standard Brownian Motion.
Here, ⇒ denotes weak convergence in the sense of [9], D[0, 1] p , p ∈ N is the usual multidimensional Skorohod space.
Proof. We start from (24); as usual, we need to establish convergence of the finite-dimensional distributions and tightness. Obviously, To establish Gaussianity, we can again rely on the results by [32] and proceed with the bounds for the fourth-order cumulants. As the computations are very much the same as in the previous Section, we omit the details for brevity's sake.
To consider tightness, we use the classical criteria given for instance in [35]. Define first the two-dimensional increments It is again a standard computation to show that, as L → ∞, We can then establish tightness by showing that j1=[Ls1] [Lt2] For the first part it is easy to see that it is bounded by 16(r 1 − s 1 ) 2 (r 2 − t 2 )(t 2 − s 2 ); for the second part, for j where Γ 1C denotes the graphs with cliques (all nodes connected with all others), where Γ 2C refers to graphs with loops of order two; these two disjoint classes cover all possible connected graphs with four nodes. We have from which we obtain Similarly, we can get the same result for 3) This concludes the proof of (24). For (25), we start again from the convergence of the finite-dimensional distributions; for notational simplicity, we stick to the univariate case. It is obvious that EJ 2L (r) = 0; on the other hand, For Gaussianity, we analyze once again fourth-order cumulants, i.e. the connected components in the expansion of the fourth moment. As before, we need only focus on connected diagrams with four nodes, which can be partitioned into two classes: the cliques, where all nodes are connected with all three others, and diagrams with a loop of order 2. As before, these terms can be bounded by 3 ,k 3 ,k 3 ,k 2 )) M
To sum up which is enough to conclude the proof for the finite-dimensional distributions, in view of the standard argument from [32] that we used before.
To conclude the proof, we need only to consider tightness in D([0, 1]). Note that E[Î j1,j2,j3Îj ′  Thus we finished the proof of tightness.

Behaviour under non-Gaussianity
In this final Section, we shall provide some quick and informal discussion on the behaviour of our statistics under non-Gaussianity; see [33,26] for other applications of the needlets to cosmological data analysis. There exist of course a huge variety of non-Gaussian models for spherical random fields, and we shall delay a much more detailed treatment to future work. Our purpose here is different, i.e. we want to provide some heuristic discussion on the expected behaviour of our procedures for physically motivated non-Gaussian models. This will provide some guidance to practitioners for applications to CMB data, which are currently under way in a separate work. We start from the expected value of the needlets bispectrum, which is provided by × l 1 l 2 l 3 m 1 m 2 m 3 l 1 l 2 l 3 0 0 0 (2l 1 + 1)(2l 2 + 1)(2l 3 + 1) 4π Here, we recall that b l1l2l3 is the so-called reduced bispectrum (see for instance [20,23]), which collects the non-Gaussian component in the third order moment