Testing multivariate uniformity based on random geometric graphs

We present new families of goodness-of-fit tests of uniformity on a full-dimensional set $W\subset\mathbb{R}^d$ based on statistics related to edge lengths of random geometric graphs. Asymptotic normality of these statistics is proven under the null hypothesis as well as under fixed alternatives. The derived tests are consistent and their behaviour for some contiguous alternatives can be controlled. A simulation study suggests that the procedures can compete with or are better than established goodness-of-fit tests.

the observation window. We want to test the null hypothesis
$$H_0\colon\ X \sim U(W), \qquad (1)$$
where $X$ is an independent copy of $X_1$ and $U(W)$ denotes the uniform distribution on $W$, against general alternatives. Further applications are testing pseudo-random number generators, see e.g. [19, Section 3.3], or testing whether i.i.d. random vectors in $\mathbb{R}^d$ follow a given absolutely continuous distribution, which is, by the Rosenblatt transformation, see [28], theoretically equivalent to testing uniformity on the $d$-dimensional unit cube $[0,1]^d$, although this transformation is hard to compute in many cases. The problem of testing uniformity has been investigated in classical papers in the univariate case, see [24] for a survey and [3] for a recent article, and, hitherto far less studied, in the multivariate setting, see [4,5,7,12,18,22,30,31,33], for which an empirical study was conducted in [26]. The cited methods include classical goodness-of-fit testing approaches such as the Kolmogorov-Smirnov test, see [18], nearest neighbour concepts, see [12] and the references therein, the distances of the data points to the boundary of the observation window, see [7], or the volume of the largest ball that can be placed in the observation window and does not cover any data point, see [5]. The related problem of testing for complete spatial randomness of a point pattern (i.e., the points are a realisation of a homogeneous Poisson point process) is also of ongoing interest, see e.g. monographs like [2,10] or the recent publications [11,15].
We approach the testing problem (1) by examining the local properties of the data by means of random graphs. Using random graphs for testing uniformity is a known but not widely used concept, see [14,20,26]. Our new approach is to consider statistics of the random geometric graph $RGG(X_n, r_n)$, $r_n > 0$: It has the realisations of the random vectors in $X_n$ as vertices, and any two distinct vertices $x, y \in X_n$ are connected by an edge if $\|x - y\| \le r_n$, where $\|\cdot\|$ stands for the Euclidean norm. This random graph model was introduced by Gilbert for an underlying Poisson point process in [13] and is thus also called the Gilbert graph. For further details see [25] and the references cited therein. Our test statistics are related to the edge lengths of $RGG(X_n, r_n)$ and are defined by
$$L_n(\beta) := \frac{1}{2} \sum_{(x,y) \in X^2_{n,\neq}} \mathbf{1}\{\|x - y\| \le r_n\}\, \|x - y\|^{\beta}, \qquad \beta \in \mathbb{R}.$$
Here $\sum_{(x,y) \in X^2_{n,\neq}}$ stands for the sum over all pairs of distinct points of $X_n$ (such sums are called $U$-statistics), and $\mathbf{1}\{\cdot\}$ is the indicator function. Notice that $L_n(0)$ counts the number of edges and $L_n(1)$ is the total edge length of $RGG(X_n, r_n)$. These statistics differ from nearest neighbour methods, see e.g. [8,12] and the references therein, in that they rely on all interpoint distances not exceeding $r_n$, whereas nearest neighbour methods take only distances between points and their $k$ nearest neighbours into account. An extensive theory of properties and the asymptotic behaviour of $L_n(\beta)$ in the complete spatial randomness setting can be found in [27]. Figure 1 provides a visualisation of different point models and selected random geometric graphs. For definitions of the CLU and CON alternatives we refer to Section 5.
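As a concrete illustration, $L_n(\beta)$ can be computed directly from the pairwise distances. The following sketch (the function name `L_n` is ours) uses a plain double loop over unordered pairs, which equals the sum over ordered pairs divided by two:

```python
import numpy as np

def L_n(points, r, beta):
    """Edge-length statistic of the random geometric graph RGG(X_n, r):
    sum of ||x - y||**beta over unordered pairs of distinct points
    with 0 < ||x - y|| <= r (distances of 0 are excluded so that
    negative beta does not produce a division by zero)."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dist = np.linalg.norm(points[i] - points[j])
            if 0.0 < dist <= r:
                total += dist ** beta
    return total
```

For $\beta = 0$ this returns the number of edges of $RGG(X_n, r_n)$; for $\beta = 1$ it returns the total edge length.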
Based on the asymptotically standardised statistics $L_n(\beta)$, we propose the test statistics $T_{e,n}(\beta)$ and $T_{a,n}(\beta)$, where $\beta > -d/2$, and rejection of $H_0$ will be for large values of $T_{j,n}(\beta)$, $j \in \{e, a\}$.
In order to derive distributional limit theorems for $L_n(\beta)$, $T_{e,n}(\beta)$ and $T_{a,n}(\beta)$, we apply a central limit theorem from [17] for triangular schemes of $U$-statistics. For $\beta = 0$ the statistic $L_n(\beta)$ was considered as an application in [17]. Here, we generalise these findings to $\beta \in (-d/2, \infty)$, which is technical for $\beta \in (-d/2, 0)$, and present them in more detail. Moreover, the focus of the present paper is on statistical tests based on $L_n(\beta)$ and their properties, which even for $\beta = 0$ go clearly beyond what was studied in [17].
In [33] some $U$-statistics based on interpoint distances are proposed as test statistics for uniformity on the unit cube (besides two other statistics based on data depth and normal quantiles). In contrast to $L_n(\beta)$, these $U$-statistics take all interpoint distances into account and not only the small ones, whence their kernels do not depend on $n$ (i.e., the summand associated with two given points from the sample is the same for all $n \in \mathbb{N}$). The tests for multivariate uniformity studied in [4,30] are also based on $U$-statistics with fixed kernels, which are more involved to compute than the distances between the sample points. For $U$-statistics with fixed kernels as considered in [4,30,33], the asymptotic behaviour is much easier to analyse than for $L_n(\beta)$, where the kernels depend on the parameters $n$ and $r_n$ and their interplay. This paper is organised as follows. In Section 2 we derive the theory for $L_n(\beta)$, including formulae for the mean and the variance as well as central limit theorems, in a general setting.
The two families of test statistics T j,n (β), j ∈ {e, a}, are discussed in Section 3, and their limiting behaviour is given under H 0 and under fixed alternatives. The behaviour for some contiguous alternatives is studied in Section 4. Section 5 provides a simulation study and a comparison to existing methods. We finish the paper with comments on open problems and research perspectives in Section 6.
2 Properties of $L_n(\beta)$

Let $X_n = \{X_1, \ldots, X_n\}$, where $n \ge 2$ and $X_1, \ldots, X_n$ are i.i.d. random vectors distributed according to a density $f$, whose support is contained in a measurable set $W \subset \mathbb{R}^d$ of positive finite volume. In the following, we assume without loss of generality that $\operatorname{Vol}(W) = 1$, i.e., $W$ has volume one. For some of our results we need the additional assumption that
$$\limsup_{r \to 0} \frac{1}{r}\, \operatorname{Vol}(\{x \in W : d(x, \partial W) \le r\}) < \infty. \qquad (2)$$
Here, we use the notation $d(x, A) := \inf\{\|x - y\| : y \in A\}$ for $x \in \mathbb{R}^d$ and $A \subset \mathbb{R}^d$, and $\partial W$ denotes the boundary of $W$. The assumption (2) requires that the volume of the set of points in $W$ that are in the $r$-neighbourhood of the boundary of $W$ is at most of order $r$, and seems to be no significant restriction. For many sets $W$, for example all compact and convex $W$, the limit superior in (2) equals the surface area of $W$. The expression in (2) is related to the so-called (outer) Minkowski content. For a definition as well as some results on its finiteness we refer to [1].
Let $(r_n)$ be a sequence of positive real numbers such that $r_n \to 0$ as $n \to \infty$. In the following $B^d(x, r)$ stands for the $d$-dimensional closed ball with centre $x \in \mathbb{R}^d$ and radius $r > 0$, and $\kappa_d := \pi^{d/2}/\Gamma(d/2 + 1)$ is the volume of the $d$-dimensional unit ball $B^d(0, 1)$. We denote by $L^2(W)$ the space of all square-integrable functions on $W$. For the special case $\beta = 0$ the formulae of the following theorem can also be found in [17, Equations (4.2) and (4.3)].
Theorem 2.1 For $\beta > -d$ and $f \in L^2(W)$,
$$\mathbb{E}L_n(\beta) = \frac{(n)_2}{2} \int_W \int_W \mathbf{1}\{\|x - y\| \le r_n\}\, \|x - y\|^{\beta} f(x) f(y)\, \mathrm{d}x\, \mathrm{d}y \qquad (3)$$
and
$$\lim_{n \to \infty} \frac{\mathbb{E}L_n(\beta)}{n^2 r_n^{\beta + d}} = \frac{d \kappa_d}{2(\beta + d)} \int_W f(x)^2\, \mathrm{d}x. \qquad (4)$$
Proof: Equation (3) follows from
$$\mathbb{E}L_n(\beta) = \frac{(n)_2}{2}\, \mathbb{E}\big[\mathbf{1}\{\|X - Y\| \le r_n\}\, \|X - Y\|^{\beta}\big],$$
where $X$ and $Y$ are independent random vectors distributed according to the density $f$. Notice that, by the inequality of arithmetic and geometric means (i.e. $f(x)f(y) \le (f(x)^2 + f(y)^2)/2$) and spherical coordinates,
$$\int_W \int_W \mathbf{1}\{\|x - y\| \le r_n\}\, \|x - y\|^{\beta} f(x) f(y)\, \mathrm{d}y\, \mathrm{d}x \le \frac{d \kappa_d r_n^{\beta + d}}{\beta + d} \int_W f(x)^2\, \mathrm{d}x.$$
This yields the required bound. In the following we use the shorthand notation $f_C(x) := \min\{f(x), C\}$ for $C > 0$ and $x \in W$.
It follows from Lemma A.1 that, for any $C > 0$, for almost all $x \in W$. Now the dominated convergence theorem yields Together with Now letting $C \to \infty$ and the monotone convergence theorem yield Combining this with (5) proves (4). ◻ Theorem 2.1 states exact formulae for the mean and easy-to-compute asymptotic approximations under fairly general assumptions, including the behaviour of $\mathbb{E}L_n(\beta)$ under $H_0$, which is a direct consequence. We write $g \equiv h$ to indicate that two functions $g, h\colon W \to \mathbb{R}$ are identical almost everywhere.
Recall that the degree of a vertex in a graph is the number of edges emanating from it.
The average degree $\bar{D}_n$ of the vertices in $RGG(X_n, r_n)$ is given by $\bar{D}_n = 2 L_n(0)/n$. Thus, it follows from Theorem 2.1 that $\mathbb{E}\bar{D}_n$ is of the same order as $n r_n^d$ as $n \to \infty$. In the next theorem we present exact and asymptotic formulae for the variance of $L_n(\beta)$, which generalise the findings from [17, Section 4] for $\beta = 0$.
Notice that the orders of the two terms in the denominator in (8) differ by $n r_n^d$, which is the order of the expected average degree. For $\sigma^{(1)}_{\beta,f}, \sigma^{(2)}_{\beta,f} > 0$ this means that the first (second) term dominates if $\mathbb{E}\bar{D}_n \to 0$ ($\mathbb{E}\bar{D}_n \to \infty$) as $n \to \infty$, while both terms contribute to the limit if $\mathbb{E}\bar{D}_n \to c \in (0, \infty)$ as $n \to \infty$. For all choices of $f \in L^3(W)$ we have $\int_W f(x)^3\, \mathrm{d}x \ge \big(\int_W f(x)^2\, \mathrm{d}x\big)^2$, with equality if and only if $f \equiv 1_W$. So $\sigma^{(2)}_{\beta,f} \ge 0$ with equality if and only if $f \equiv 1_W$. The formula (9) coincides with (8) for $f \equiv 1_W$. Nevertheless, for (9) we have to impose additional conditions on the boundary of $W$ and on the sequence $(r_n)$. They ensure that the sum of the second and the third term in (7) does not have an asymptotic order that is less than $n^3 r_n^{2\beta + 2d}$ but still larger than $n^2 r_n^{2\beta + d}$. The following example shows that this can happen due to boundary effects (see also [17, Section 4]). For $W = [0, 1]$, $f \equiv 1_W$, $\beta = 0$ and $r_n < 1/2$, we have Thus, the sum of the second and the third term in (7) equals If $n r_n^2 \to \infty$ as $n \to \infty$, this is of a higher order than the first term in (7). Theorem 3.3 in [27] states asymptotic variances for the same statistics $L_n(\beta)$ with an underlying homogeneous Poisson point process of intensity $n$ (i.e., $f \equiv 1_W$ and the number of points is Poisson distributed with mean $n$). In contrast to (9), these formulae show the same phase transition depending on the behaviour of $n r_n^d$ as we have in (8) for $f \equiv 1_W$.

Proof of Theorem 2.3: A straightforward computation shows that
Here, $X_1, \ldots, X_4$ are independent random vectors with density $f$ and $(\cdot)_k$ denotes the $k$th descending factorial. Combining this with (3) yields (7).
Observe that the asymptotic behaviour of the first and the third term in (7) follows immediately from Theorem 2.1. By the inequality of arithmetic and geometric means and spherical coordinates, we obtain On the other hand, Lemma A.1 and the dominated convergence theorem imply (recall $f_C(x) := \min\{f(x), C\}$ for $x \in W$). Now letting $C \to \infty$ and the monotone convergence theorem yield This, together with (10) and the observation that $\sigma_{\beta,f} > 0$, completes the proof of (8). For the proof of (9) we define $W_{-r_n} := \{x \in W : d(x, \partial W) \ge r_n\}$. Now straightforward computations yield and It follows from (2) that there exists a constant $C_W \in (0, \infty)$ such that Together with $\operatorname{Vol}(W) = 1$ this means that the absolute value of the sum of the second and the third term in (7) can be bounded by Together with the asymptotic order of the first term in (7), which is as in the proof of (8), this proves (9). ◻ In the following we use the abbreviation $\sigma_{\beta,f,n}$ as in Theorem 2.3 for $\beta > -d/2$ and $n \in \mathbb{N}$. Moreover, we write $\overset{D}{\to}$ for convergence in distribution and $N_m(\mu, \Sigma)$ for an $m$-dimensional Gaussian random vector with mean vector $\mu \in \mathbb{R}^m$ and positive semidefinite covariance matrix $\Sigma \in \mathbb{R}^{m \times m}$. In the univariate case the index $m$ is omitted.
For $\beta = 0$ a central limit theorem like Theorem 2.4 is established in [17, Section 4]; see also [32] and the references therein. In [25, Section 3.5] central limit theorems for subgraph counts of random geometric graphs are derived, which include the number of edges $L_n(0)$ as a special case. Notice that $n^2 r_n^d \to \infty$ as $n \to \infty$ means that the expected number of edges goes to infinity as $n \to \infty$ (see Theorem 2.1), which is a reasonable assumption for a central limit theorem involving edge lengths. The additional assumptions for $f \equiv 1_W$ are the same as in Theorem 2.3(c) and are used to ensure that the rescaled variances converge to one. Theorem 2.4 is proved after the following corollary concerning the behaviour under $H_0$.
(a) If $n^2 r_n^d \to \infty$ and $n r_n^{d+1} \to 0$ as $n \to \infty$, then It can be seen from Corollary 2.2 that in part (a) of the previous corollary $L_n(\beta)$ is centred with its exact expectation, while in (b) the asymptotic expectation is used. In the latter situation, the assumptions on $(r_n)$ are stricter. For the statistics $L_n(\beta)$ with respect to an underlying Together with (11), which is valid because we assume (2), and $\operatorname{Vol}(W) = 1$ this yields Hence, the assertion of (b) follows from (a). ◻ We prepare the proof of Theorem 2.4 by several lemmas, which are formulated for the following more general setting, required later: We assume that the underlying points of $X_n$ are distributed according to some density $f_n \in L^3(W)$ and that For $n \in \mathbb{N}$ we define and let $\hat{X}_1, \ldots, \hat{X}_n$ be independent and uniformly distributed points in $W_{f_n}$. We denote the collection of these points by $\hat{X}_n$. For a point $\hat{x} \in W_{f_n}$ we often use the decomposition $\hat{x} = (x, m)$ with $x \in W$ and $m \in [0, f_n(x)]$. Observe that the first components of $\hat{X}_1, \ldots, \hat{X}_n$ are distributed according to the density $f_n$ in $W$. For $\beta \in \mathbb{R}$ we define If $f_n = f$, $\hat{L}_n(\beta)$ has the same distribution as $L_n(\beta)$. For $M > 0$ and $a \ge 0$ we define and $\hat{L}_{n,a,M}(\beta) :=$ Moreover, we use the abbreviations Proof: By definition, we have that Now a similar computation as in the proof of Theorem 2.3(a) yields that Note that $I_1$ and $I_2$ correspond to the first two terms in (7), whereas the third term in (7) was omitted since it is non-positive. Now short computations show that Since $n^2 r_n^d \to \infty$ as $n \to \infty$ and $2\beta/d + 1 > 0$, this provides (15). Now (16) follows from combining (15) and $\lim_{n\to\infty} \operatorname{Var}\hat{L}_{n,\cdot}$ Proof: Denoting by Thus, (17) follows from Theorem B.1. Combining the $L^2$-convergence in (15) with (17) yields (18). ◻ In the following we use the abbreviation $f^{n,M}(x) := \max\{f_n(x) - M, 0\}$ for $x \in W$ and $M \ge 0$.
Proof: By definition we have From similar arguments as in the proofs of Theorem 2.3(a) and Lemma 2.6, it follows that For $I_1$ we obtain the bound Because of we have.
This implies that, if (2) holds and $n r_n^{d+1} \to 0$ as $n \to \infty$, then the assertion can be proved as in Theorem 2.3(b). ◻ Proof of Theorem 2.4: We consider the same setting as in the previous lemmas with $f_n = f$ for $n \in \mathbb{N}$, so that $L_n(\beta)$ has the same distribution as $\hat{L}_n(\beta)$, which we study throughout this proof. For $f \equiv 1_W$ the assertion follows from (18) in Lemma 2.7 because, for $M \ge 1$, $\hat{L}_n(\beta)$ has the same distribution as $\hat{L}_{n,M}(\beta)$, $\sigma_{\beta,f_M,n} = \sigma_{\beta,f,n}$ and Lemma 2.9 guarantees that the variance condition in Lemma 2.7 is satisfied. So we assume $f \not\equiv 1_W$ in the sequel.
Let $h\colon \mathbb{R} \to \mathbb{R}$ be a bounded Lipschitz function whose Lipschitz constant is at most one and let $\varepsilon > 0$. In the following we show which yields the assertion.
For $M \ge 1$ the triangle inequality implies It follows from Lemma 2.7 (notice that the variance condition is satisfied because of Lemma 2.9) that $R_{3,n,M}$ vanishes for any $M \ge 1$ as $n \to \infty$. The Lipschitz property of $h$, the Cauchy-Schwarz inequality and Lemma 2.8 imply that Here the terms depending on $n$ can be bounded by some constants. The dominated convergence theorem with the upper bounds $2f^2$ and $3f^3$ leads to Hence, there exists an $M_1 \ge 1$ such that $\lim_{n\to\infty} R_{1,n,M} \le \varepsilon/2$ for $M > M_1$.
A short computation using the Lipschitz continuity of $h$ and the Cauchy-Schwarz inequality shows that By the monotone convergence theorem and the assumption $f \not\equiv 1_W$, we have the required convergence of the variances. The test statistics $T_{e,n}(\beta)$ and $T_{a,n}(\beta)$ depend on $\beta > -d/2$ and $r_n \in (0, \infty)$. The choice of the sequence $(r_n)$ is discussed in Section 5, where we introduce a parameter $k$. The indices $e$ and $a$ are abbreviations for 'exact' and 'asymptotic', and they point out that $T_{e,n}(\beta)$ involves $\mathbb{E}L_n(\beta)$, which can be difficult to compute depending on the shape of the observation window $W$, while $T_{a,n}(\beta)$ uses a simple asymptotic approximation of $\mathbb{E}L_n(\beta)$. Rejection of $H_0$ will be for large values of $T_{j,n}(\beta)$, $j \in \{a, e\}$. Empirical critical values for $W = [0, 1]^d$ can be found in Tables 9 to 12. Here $\chi^2_1$ denotes a random variable having a chi-squared distribution with one degree of freedom. In the following theorem we consider the asymptotic behaviour of $T_{e,n}(\beta)$ and $T_{a,n}(\beta)$ under fixed alternatives. We write $\overset{P}{\to}$ for convergence in probability. Proof: Throughout this proof we denote the terms that are squared in (21) and (22) by $L_{e,n}(\beta)$ and $L_{a,n}(\beta)$, respectively. In the following we will show that for $j \in \{a, e\}$, which implies the assertion.
Using the same arguments as in the proof of Theorem 2.1, one can show that By the Cauchy-Schwarz inequality, we have for $j \in \{e, a\}$; this shows that $S_{2,e,n}$ and $S_{2,a,n}$ are at least of order $n$ as $n \to \infty$. From the Chebyshev inequality and Lemma 2.9 it follows that which implies (23) for $j \in \{a, e\}$. ◻ Theorem 3.1 yields consistency of $T_{e,n}(\beta)$ and $T_{a,n}(\beta)$ against each fixed alternative $f \not\equiv 1_W$.
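Since the exact normalising constants of $T_{e,n}(\beta)$ and $T_{a,n}(\beta)$ in (21) and (22) are not reproduced here, the following sketch mimics the tests via simulated critical values, in the spirit of the simulation study: the observed $L_n(\beta)$ on $[0,1]^d$ is standardised by the simulated null mean and standard deviation, squared, and compared with an empirical quantile of the squared standardised null replicates. All function names are ours; this is a Monte Carlo analogue, not the paper's exact statistic.

```python
import numpy as np

rng = np.random.default_rng(0)

def L_n(points, r, beta):
    """Vectorised L_n(beta): sum of dist**beta over unordered pairs
    of points at distance at most r."""
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    iu = np.triu_indices(len(points), k=1)
    dist = np.sqrt(d2[iu])
    return np.sum(dist[dist <= r] ** beta)

def monte_carlo_uniformity_test(sample, r, beta, n_sim=999, alpha=0.05):
    """Monte Carlo analogue of the tests based on L_n(beta) for
    W = [0, 1]^d: the null distribution of L_n(beta) under U([0,1]^d)
    is simulated, the observed statistic is standardised and squared
    (mimicking T_{e,n}(beta)), and compared with the empirical
    (1 - alpha)-quantile of the squared standardised null replicates."""
    n, d = sample.shape
    null = np.array([L_n(rng.random((n, d)), r, beta) for _ in range(n_sim)])
    mu, sd = null.mean(), null.std(ddof=1)
    t_obs = ((L_n(sample, r, beta) - mu) / sd) ** 2
    t_null = ((null - mu) / sd) ** 2
    crit = np.quantile(t_null, 1 - alpha)
    return t_obs, crit, t_obs > crit
```

A strongly clustered sample produces far more short edges than the null replicates, so the squared standardised statistic is large and the test rejects.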

4 Behaviour under contiguous alternatives
Let $g \in L^3(W)$ be such that $g \not\equiv 0$ and $\int_W g(x)\, \mathrm{d}x = 0$, and let $(a_n)$ be a positive sequence such that $a_n \to 0$ as $n \to \infty$. In the following we always tacitly assume that $1 + a_n g(x) \ge 0$ for all $x \in W$ and $n \in \mathbb{N}$. This guarantees that $1_W + a_n g$ is a density. In the sequel we denote by $\tilde{T}_{e,n}(\beta)$ and $\tilde{T}_{a,n}(\beta)$ our test statistics in (21) and (22) computed on $n$ i.i.d. points $\tilde{X}_1, \ldots, \tilde{X}_n$ distributed according to the density $1_W + a_n g$ (i.e., we have a triangular scheme).
The condition (24) requires that the fluctuations of g in an r-neighbourhood of the boundary of W are at most of order r. Because we assume (2), this is always the case if g is bounded.
We prepare the proofs of Theorem 4.1 and Theorem 4.2 with several lemmas. By $\tilde{L}_n(\beta)$ we denote the statistic $L_n(\beta)$ with respect to i.i.d. points $\tilde{X}_1, \ldots, \tilde{X}_n$ distributed according to the density $1_W + a_n g$, while $L_n(\beta)$ is with respect to $n$ i.i.d. points uniformly distributed in $W$.

Lemma 4.3
Assume that $W$ and $g$ satisfy (24) and let $n \ge 2$. Then, for any $\beta > -d$, Moreover, for any $\beta > -d/2$,
$$\big|\operatorname{Var}\tilde{L}_n(\beta) - \operatorname{Var}L_n(\beta)\big| \le C\big(n^2 r_n^{2\beta+d} a_n(a_n + r_n) + n^3 r_n^{2\beta+2d} a_n(a_n + r_n + a_n^2 + a_n^3 + a_n^2 r_n)\big) \qquad (26)$$
with some constant $C \in (0, \infty)$ depending on $\beta$, $d$, $C_{W,g}$ and $g$.
Proof: It follows from (3) in Theorem 2.1 that We have Here, the first term is zero since $\int_W g(x)\, \mathrm{d}x = 0$. By (24), the absolute value of the second term can be bounded by which proves (25).
By similar arguments as for the second term in (27), one obtains Moreover, one can show the inequality Summarising, it follows that $R_{2,n} \le C_2 n^3 r_n^{2\beta+2d} a_n(a_n + r_n + a_n^2)$ with some constant $C_2 \in (0, \infty)$ depending on $\beta$, $d$, $C_{W,g}$ and $g$. Combining the estimates for $R_{1,n}$, $R_{2,n}$ and $R_{3,n}$ completes the proof of (26). ◻ Lemma 4.4 Let $\beta > -d/2$ and assume that $W$ satisfies (2), that $n^2 r_n^d \to \infty$ and $\max\{n r_n^{d+1}, n r_n^d a_n^3\} \to 0$ as $n \to \infty$ and that We prepare the proof of Lemma 4.4 with the following inequality.
Lemma 4.5 For $p, q > 0$, $v \in L^{p+q}(W)$ and $a > 0$, which is the desired inequality. ◻ Proof of Lemma 4.4: In the following we consider the framework from Lemmas 2.6, 2.7 and 2.8 with $f \equiv 1_W$ and $f_n := 1_W + a_n g$, $n \in \mathbb{N}$. Then $\tilde{L}_n(\beta)$ has the same distribution as $\hat{L}_n(\beta)$. For the latter we will prove convergence to $N(0, 1)$ after an appropriate rescaling.
It follows from (28) It follows from Lemma 4.5 (with $p = 1$, $q = 2$ and $p = 2$, $q = 1$, respectively) that which vanishes as $n \to \infty$. This means that Together with (29) we see that It follows from Lemma 2.7, where the variance condition is satisfied because of (32) and Because of the $L^2$-convergence in (31) this yields which completes the proof. ◻ Proof of Theorem 4.1: By Lemma 4.3 we have that with and a remainder term $R_n$ satisfying As in the proof of Theorem 2.1 one can show that For $\gamma = 0$ one obtains $\lim_{n\to\infty} T_n = 0$ and $\lim_{n\to\infty} R_n = 0$. The latter follows from the assumption $\min\{n r_n^{d/2+1} a_n, r_n/a_n\} \to 0$ as $n \to \infty$, whence, by (34), $R_n$ vanishes directly or is of a lower order than $T_n$ and, thus, also vanishes.
For $\gamma > 0$ or $n r_n^{d/2} a_n^2 \to \infty$ as $n \to \infty$, we have that $\lim_{n\to\infty} r_n/a_n = 0$. Indeed, if there was a subsequence $(n_m)$ such that $r_{n_m}/a_{n_m} \ge c$ for some $c > 0$, then $\min\{n_m r_{n_m}^{d/2+1} a_{n_m}, r_{n_m}/a_{n_m}\}$ would not converge to $0$ as $m \to \infty$, which is a contradiction. Because of (34) and (35) it follows from $\lim_{n\to\infty} r_n/a_n = 0$ that $\lim_{n\to\infty} R_n/T_n = 0$, whence $T_n$ is the leading summand in (33).
Assume that $n r_n^{d/2} a_n^2 \to \gamma$ as $n \to \infty$. It follows from (26) that
$$\lim_{n\to\infty} \frac{\big|\operatorname{Var}\tilde{L}_n(\beta) - \operatorname{Var}L_n(\beta)\big|}{n^2 r_n^{2\beta+d}} \le C \lim_{n\to\infty}\big(a_n(a_n + r_n) + n r_n^d a_n(a_n + r_n + a_n^2 + a_n^3 + a_n^2 r_n)\big) = 0,$$
where we also used that $a_n, r_n, n r_n^{d+1} \to 0$ as $n \to \infty$. Now Lemma 4.4 implies This together with (33) and the above analysis of the asymptotic behaviour of $T_n$ and $R_n$ yields the limit distribution of $\tilde{L}_n(\beta)$. Now (a) follows from the continuous mapping theorem.
Next we show part (b). It follows from (26) that
$$\frac{\operatorname{Var}\tilde{L}_n(\beta)}{(n^2 r_n^{\beta+d} a_n^2)^2} \le C\, \frac{a_n(a_n + r_n) + n r_n^d a_n(a_n + r_n + a_n^2 + a_n^3 + a_n^2 r_n)}{(n r_n^{d/2} a_n^2)^2} + \frac{\operatorname{Var}L_n(\beta)}{(n^2 r_n^{\beta+d} a_n^2)^2}.$$
The first term on the right-hand side vanishes as $n \to \infty$ since $a_n, r_n, n r_n^{d+1} \to 0$ and $n r_n^{d/2} a_n^2 \to \infty$ as $n \to \infty$. Because $\operatorname{Var}L_n(\beta)$ behaves as $n^2 r_n^{2\beta+d}$, the second term is of order $1/(n r_n^{d/2} a_n^2)^2$ and converges to zero as $n \to \infty$. We thus have
$$\lim_{n\to\infty} \frac{\operatorname{Var}\tilde{L}_n(\beta)}{(n^2 r_n^{\beta+d} a_n^2)^2} = 0 \quad\text{and}\quad \frac{\tilde{L}_n(\beta) - \mathbb{E}\tilde{L}_n(\beta)}{n^2 r_n^{\beta+d} a_n^2} \overset{P}{\to} 0 \quad\text{as } n \to \infty.$$
Together with the fact that $T_n$ is the dominating term in (33) and (35), this means that Because of $n r_n^{d/2} a_n^2 \to \infty$ as $n \to \infty$, the assertion of (b) follows. Part (c) follows from (12) in the proof of Corollary 2.5.
◻ Proof of Theorem 4.2: Without loss of generality we can assume that $r_n < d(\operatorname{supp} g, \partial W)$ for each $n$. Consequently, the assumption (24) is satisfied with $C_{W,g} = 0$ for $r = r_n$. Now the proof of Theorem 4.1 works without the additional assumption that $\min\{n r_n^{d/2+1} a_n, r_n/a_n\} \to 0$ as $n \to \infty$ because $R_n = 0$. ◻ From Theorem 4.1 and Theorem 4.2, we conclude that under the stated assumptions the tests based on $\tilde{T}_{a,n}(\beta)$ and $\tilde{T}_{e,n}(\beta)$ are able to detect alternatives which converge to the uniform distribution at rate $a_n$. Moreover, the theorems could be the foundation for establishing local optimality of the tests by applying the third Le Cam lemma; see Section 5.2 of [21] for a short review of the needed methodology.

5 Simulation
In this section we compare the finite-sample power performance of the test statistics $T_{e,n}(\beta)$ and $T_{a,n}(\beta)$, $\beta > -d/2$, $n \in \mathbb{N}$, with that of some competitors. Since the $d$-dimensional hypercube $[0, 1]^d$ is the most commonly used observation window, we restrict our simulation study to this case with $d \in \{2, 3\}$. Particular interest will be given to the influence of $\beta$ and $r_n$ on the finite-sample power in dependence of the chosen alternatives. In each scenario, we consider the sample sizes $n \in \{50, 100, 200, 500\}$ and set the nominal level of significance to $0.05$. Since the test statistics depend on the parameter $\beta$ and the choice of $r_n$, and the empirical finite-sample quantile is in some cases far away from the quantile $\chi^2_{1,0.95} \approx 3.8415$ of the limiting distribution, we simulated critical values for $T_{e,n}(\beta)$ and $T_{a,n}(\beta)$ with 100 000 replications, see Tables 9 to 12. Each stated empirical power of the tests in Tables 4 to 8 is based on 10 000 replications, and the asterisk * denotes a rejection rate of 100%.
Since there is a vast variety of ways to choose the parameters $\beta$ and $r_n$, we chose the parameter configurations to fit the limiting regimes of Corollary 2.5 as well as the following additional property: From (6) we know that the expectation of the average degree $\bar{D}_n$ behaves as $\kappa_d n r_n^d$ for $n \to \infty$ under $H_0$. This observation motivates the following choice of the radius $r_n$ for $T_{e,n}(\beta)$, namely
$$r_n = \left(\frac{k}{n \kappa_d}\right)^{1/d},$$
so that the expected average degree is approximately $k$ under $H_0$. As competitors to the new test statistics we consider the distance-to-boundary test (DB-test), see [7], the maximal spacing test (MS-test), see [5,16], the nearest neighbour type test (NN-test) of [12] as well as the Bickel-Rosenblatt test (BR-test) presented in [31]. We follow the descriptions of the DB- and MS-tests given in [12].
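The radius choice above can be sketched with a small helper (function names are ours), using the closed form $\kappa_d = \pi^{d/2}/\Gamma(d/2+1)$ from Section 2:

```python
import math

def kappa(d):
    """Volume of the d-dimensional unit ball: pi**(d/2) / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def radius(n, d, k):
    """Radius r_n = (k / (n * kappa_d))**(1/d), chosen so that the expected
    average degree of RGG(X_n, r_n) under H_0 is approximately k
    (since E D_n behaves as kappa_d * n * r_n**d)."""
    return (k / (n * kappa(d))) ** (1.0 / d)
```

By construction, $\kappa_d n r_n^d = k$ exactly for this choice of $r_n$.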
To avoid boundary problems in the computation of the NN-test, we used the same toroidal metric in the simulation as in [12]. Since the rejection rates depend crucially on the power $\beta$ and the number of neighbours $J$ taken into account, we chose different values of $\beta$ and $J$ for the two alternatives, where the choice was motivated by Table 2 in [12]. Notice that this test is consistent, but one has to be careful to choose the correct rejection region, which depends on the choice of $\beta$.
As a further competitor we consider the fixed-bandwidth Bickel-Rosenblatt test (BR-test) on the unit cube, studied in [31]. Using the notation of [31], the corresponding test statistic is $I_n^2(h)$, where $h > 0$ is a fixed bandwidth. For the sake of completeness we restate the following abbreviations. The convolution product operator is denoted by $\star$, $U = 1_{[0,1]^d}$ is the density of the uniform distribution over the unit hypercube $[0, 1]^d$, and for any function $g$ we define its rescaled version at $u = (u_1, \ldots, u_d) \in \mathbb{R}^d$ with a kernel $k$ on $\mathbb{R}$ (so $k$ is bounded and integrable). Using the arguments and techniques in [31], direct calculations with $\varphi(x) := \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)$, $x \in \mathbb{R}$, being the standard normal density function, give an explicit expression for $h > 0$, where $\Phi$ is the standard normal distribution function and $X_{i,j}$ denotes the $j$th component of the random vector $X_i$, with $i \in \{1, \ldots, n\}$ and $j \in \{1, 2\}$.
The BR-test rejects the null hypothesis for large values of $I_n^2(h)$. Notice that the asymptotic distribution of $I_n^2(h)$ is known, see [31], but not in a closed form. Following the studies in [6,12], we simulated a contamination and a clustering model as alternatives to the uniform distribution. The contamination alternative (CON) is given by the mixture of the uniform distribution $U([0,1]^d)$ and a multivariate normal distribution with covariance matrix proportional to $I_d$, under the condition that all simulated points are located in $[0, 1]^d$. Here, $I_d \in \mathbb{R}^{d \times d}$ denotes the identity matrix of order $d$. The chosen parameters are given in Table 3, where $\Phi^{-1}(p)$, $p \in (0, 1)$, denotes the $p$-quantile of the standard normal distribution. See Figure 1.

Table 3: Parameter configuration of the CON-alternatives

The clustering alternative (CLU) is motivated by a fixed-number-of-points version of the Matérn cluster process, see Section 12.3 in [2], and is designed to destroy the independence.
One first chooses a radius $r_{\mathrm{clu}}$ and simulates $n/5$ random points with the uniform distribution $U([-r_{\mathrm{clu}}, 1 + r_{\mathrm{clu}}]^d)$, which act as centres of clusters. These points will not be part of the final sample. In a second step, one generates 5 points around each centre in a ball with radius $r_{\mathrm{clu}}$.
These points are generated independently of each other and follow uniform distributions on the mentioned balls. If a point falls outside $[0, 1]^d$, it is replaced by a point that follows a $U([0, 1]^d)$ distribution. In the following we set $r_{\mathrm{clu}} = 0.1$; a realisation of this model can be found in Figure 1, third row. The clustering alternative is not included in the framework of our theoretical results since the points are, by construction, not independent. Nevertheless it is interesting to see how the test statistics behave for such alternatives, which were also considered in the simulation study in [7].
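The two alternatives can be simulated along the lines of the description above. The following sketch uses the stated construction for CLU; for CON the parameters `eps`, `mu` and `sigma` are illustrative placeholders, not the actual configurations of Table 3:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_clu(n, d=2, r_clu=0.1):
    """Clustering alternative (CLU): n/5 cluster centres drawn uniformly on
    [-r_clu, 1 + r_clu]^d, five points placed uniformly in the ball of
    radius r_clu around each centre, and any point outside [0, 1]^d
    replaced by a U([0, 1]^d) point.  Assumes n is a multiple of 5."""
    centres = rng.uniform(-r_clu, 1 + r_clu, size=(n // 5, d))
    pts = []
    for c in centres:
        for _ in range(5):
            while True:  # uniform point in the d-ball via rejection sampling
                u = rng.uniform(-r_clu, r_clu, size=d)
                if np.sum(u ** 2) <= r_clu ** 2:
                    break
            p = c + u
            if np.any(p < 0) or np.any(p > 1):
                p = rng.random(d)  # replacement rule for points outside the cube
            pts.append(p)
    return np.array(pts)

def simulate_con(n, d=2, eps=0.25, mu=0.5, sigma=0.1):
    """Contamination alternative (CON): with probability eps a point is drawn
    from a normal N_d(mu * 1, sigma^2 * I_d) conditioned on [0, 1]^d, and
    with probability 1 - eps from U([0, 1]^d)."""
    pts = np.empty((n, d))
    for i in range(n):
        if rng.random() < eps:
            while True:  # condition the normal component on the cube
                p = rng.normal(mu, sigma, size=d)
                if np.all((0 <= p) & (p <= 1)):
                    break
        else:
            p = rng.random(d)
        pts[i] = p
    return pts
```

Both generators return $n$ points in $[0, 1]^d$ and can be fed directly into any of the test statistics above.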
We now present the simulation results for $d = 2$, beginning with Table 4. Comparison with $T_{e,n}(\beta)$ for $\beta = -0.5$ (see Table 5) shows that the presented new methods are for sample sizes of $n = 100, 200, 500$ as good as, and for $n = 50$ nearly as good as, the best competitor $I_n^2(0.1)$. As one can see throughout Tables 5 and 6, $T_{e,n}(\beta)$ dominates $T_{a,n}(\beta)$ for small sample sizes, while the power is similar to the best competitors. In case of the CLU alternative, $T_{e,n}(\beta)$ gives the overall highest performance for $\beta = -0.5$ over the small sample sizes $n = 50, 100, 200$, while the only procedure that is better for $n = 500$ is again $I_n^2(0.1)$. Notice that the asymptotic version $T_{a,n}(\beta)$ might even achieve higher performance if one considers bigger radii, since it attains the highest rates for the biggest values of $k$. A closer look at these tables reveals the dependency of the new tests on the choice of $\beta$ and $k$.
Interestingly, the highest performance is attained, for both alternatives and both $T_{j,n}(\beta)$, $j \in \{a, e\}$, at the choice $\beta = -0.5$. The best choice of $k$ obviously depends on the sample size.
Observe that the simulation results for $d = 3$ in Tables 7 and 8 show higher rejection rates for $T_{j,n}(\beta)$ than in the bivariate setting. Since the other methods were too time-consuming to implement or to simulate, we restrict the comparison to the DB-test. As can be seen in Table 7, the new tests dominate the DB-method for $\beta = -0.5$ and nearly every value of $k$.

6 Conclusions and open problems
We have introduced two new families of consistent goodness-of-fit tests of uniformity based on random geometric graphs. As the simulation section shows, the presented methods are serious competitors to existing methods, even dominating them for the right choices of the parameters $\beta$ and $r_n$ (or $k$). Clearly, a natural question is how to find (data-dependent) optimal choices of these parameters.
Another obvious extension of the presented methods would be to find tests of uniformity on (lower-dimensional) manifolds, including special cases of directional statistics such as the circle or the sphere (for existing methods see Chapter 6 of either [21] or [23]). Section 4 invites further investigation of concepts of locally optimal tests. Since the approach is fairly general, another extension would be testing the fit of $X_1, \ldots, X_n$ to some parametric family $\{f(\cdot, \vartheta) : \vartheta \in \Theta\}$ of densities for a specific parameter space $\Theta$ (the procedures would presumably use a suitable estimator $\hat{\vartheta}_n$ of $\vartheta$). In view of the special interest in the case of unknown support of the data, see [5,6], we point out that the definition of $T_{a,n}(\beta)$ does not depend on the shape of the underlying observation window and is therefore applicable in this setting (as long as the observation window has volume one).
For each $n \in \mathbb{N}$ let $Y_1^{(n)}, \ldots, Y_n^{(n)}$ be i.i.d. random vectors in $\mathbb{R}^d$, whose distribution may depend on $n$. We use the shorthand notation $Y_n = \{Y_1^{(n)}, \ldots, Y_n^{(n)}\}$, $n \in \mathbb{N}$, in the sequel. For $n \in \mathbb{N}$ let $h_n\colon \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ be a bounded, symmetric and measurable function and let
$$S_n := \frac{1}{2} \sum_{(y_1, y_2) \in Y^2_{n,\neq}} h_n(y_1, y_2).$$
The random variables S n , n ∈ N, are so-called second order U -statistics. The following theorem provides a sufficient criterion for the convergence of (S n ), after rescaling, to a standard Gaussian random variable.
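A second-order $U$-statistic of this form can be sketched as follows (summing a symmetric kernel over unordered pairs, which equals the sum over ordered pairs divided by two; with $h_n(x, y) = \mathbf{1}\{\|x - y\| \le r_n\}\|x - y\|^{\beta}$ one recovers $L_n(\beta)$). The function name is ours:

```python
def u_statistic(points, h):
    """Second-order U-statistic S_n for a symmetric kernel h:
    sum of h(y_i, y_j) over unordered pairs of distinct indices,
    which equals (1/2) * sum over ordered pairs."""
    n = len(points)
    return sum(h(points[i], points[j]) for i in range(n) for j in range(i + 1, n))
```

For example, with the kernel $h(a, b) = |a - b|$ on three real observations the statistic is simply the sum of the three pairwise gaps.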
Since the assumptions of the theorem are satisfied for the $U$-statistics $(S_n)$, they must also hold for the $U$-statistics $(\tilde{S}_n)$. As the underlying random variables of $(\tilde{S}_n)$ are identically distributed, we are in the previously discussed special case for which the central limit theorem holds. This completes the proof.