Testing goodness of fit for point processes via topological data analysis

We introduce tests for the goodness of fit of point patterns via methods from topological data analysis. More precisely, the persistent Betti numbers give rise to a bivariate functional summary statistic for observed point patterns that is asymptotically Gaussian in large observation windows. We analyze the power of tests derived from this statistic on simulated point patterns and compare their performance with that of global envelope tests. Finally, we apply the tests to a point pattern from an application context in neuroscience. As the main methodological contribution, we derive sufficient conditions for a functional central limit theorem on bounded persistent Betti numbers of point processes with exponential decay of correlations.


Introduction
Topological data analysis (TDA) provides insights into a variety of datasets by capturing their most salient properties via refined topological features. Since the mathematical field of topology specializes in describing invariants of objects independently of the choice of a precise metric, these features are robust against small perturbations or different embeddings of the object [11,12]. Among the most classical topological invariants are the Betti numbers. Loosely speaking, they capture the number of k-dimensional holes of the investigated structure. TDA refines this idea substantially by constructing filtrations and tracing when topological features appear and disappear. In point pattern analysis, simplicial complexes are built so that they are topologically equivalent to a union of balls with the same radius and centered at the data points, see the first three panels of Figure 1. As the radius increases, a sequence of simplicial complexes is then defined. Examples of such complexes are the basic Čech complex or the more elaborate α-complex, which is based on the Delaunay triangulation, see [18]. In that framework, 1-dimensional features correspond to loops in the simplicial complexes while 0-dimensional features correspond to connected components. When moving up in the filtration, additional edges appear and at some point create new loops. On the other hand, more and more triangles also appear, thereby causing completely filled loops to disappear. Usually, the filtration is indexed by time, and we refer to the appearance and disappearance of features as births and deaths. We refer the reader to [18] for a detailed presentation of these concepts. The persistence diagram visualizes the time points when the features are born and die, see the bottom-right panel in Figure 1. Persistent Betti numbers count the number of events in upper-left blocks of the persistence diagram and are also illustrated in the figure.
In this paper, we leverage persistent Betti numbers to derive goodness-of-fit tests for planar point processes. In this setting, the abstract general definition of persistent Betti numbers gives way to a clear geometric intuition induced by a picture of growing disks centered at the points of the pattern, all having radius r, corresponding to the index of the filtration. Features of dimension 0 correspond to connected components in the union of balls, interpreted as point clusters, whereas boundaries of the complement set can be considered as the loops forming the 1-dimensional features. Since the notion of clusters in the sense of connected components lies at the heart of persistent Betti numbers in degree 0, they become highly attractive as a tool to detect clustering in point patterns. Our tests are based on a novel functional central limit theorem for persistent Betti numbers, and our work ties into two active strands of research. First, now that TDA has become widely adopted, the community is vigorously working towards putting the approach on a firm statistical foundation, paving the way for hypothesis testing. On the one hand, this encompasses large-sample Monte Carlo tests when working on a fixed domain [6,9,13]. Although these tests are highly flexible, the test statistics under the null hypothesis must be re-computed whenever observations in a different window are tested. In large domains, this becomes time-consuming. On the other hand, there has been substantial progress towards establishing CLTs in large domains for functionals related to persistent Betti numbers [25,29,35,39,40]. However, these results are restricted to the null hypothesis of complete spatial randomness, i.e., the Poisson point process, and establish asymptotic Gaussianity on a multivariate, but not on a functional level. Our proof of a functional CLT is based on recently developed stabilization techniques for point processes with exponential decay of correlations [8].
As explained in the final section of [10], the main technical step towards a functional CLT is to establish bounds on the cumulants.
Second, the introduction of global rank envelope tests has led to a novel surge of research activity in goodness-of-fit tests for point processes [34]. One of the reasons for their popularity is that they rely on functional summary statistics rather than scalar quantities. Thus, they reveal a substantially more fine-grained picture of the underlying point pattern. In the overwhelming majority of cases, variants of the K-function are used as a functional summary statistic, thereby essentially capturing the relative density of point pairs at different distances. Here, the persistent Betti numbers offer an opportunity to augment the basic second-order information by more refined characteristics of the data. Still, even for classical summary statistics, rigorous limit theorems in large domains remain scarce. For instance, a functional central limit theorem for the estimated K-function is proven in detail only for the Poisson point process in [22], and an extension to α-determinantal point processes is outlined in [23].
The rest of the manuscript is organized as follows. First, in Section 2, we introduce the concepts of M-bounded persistence diagrams and M-bounded persistent Betti numbers. Next, in Section 3, we state the two main results of the paper: a CLT for the M-bounded persistence diagram and a functional CLT for the M-bounded persistent Betti numbers. In Section 4, we provide specific examples of point processes satisfying the conditions of the main results. Sections 5 and 6 explore TDA-based tests for simulated and real datasets, respectively. Finally, Section 7 summarizes the findings and points to possible avenues of future research. The proofs of the main results are deferred to Sections 8 and 9 of the appendix.

M-bounded persistent Betti numbers
For a locally finite point set X ⊂ R², persistent Betti numbers provide refined measures for the amount of clusters and voids on varying length scales. More precisely, we let U_r(X) = ∪_{x ∈ X} B_r(x) denote the union of closed disks B_r(x) of radius r ≥ 0 centered at the points of X. A 0-dimensional topological feature is a connected component of this union, corresponding to a cluster of points in X, while a 1-dimensional feature can be thought of as a bounded connected component of the background space, often identified with its boundary loop, and describes a vacant area in the plane. As the disks grow, new features arise and vanish; we say that they are born and die. The persistent Betti numbers quantify this evolution of clusters and loops. Henceforth, we consider the persistence diagram only up to a fixed deterministic radius r_f ≥ 0.
As r approaches the critical radius for continuum percolation, long-range phenomena emerge [31]. Thus, determining whether two points are connected could require exploring large regions in space. While useful quantitative bounds on cluster sizes are known for Poisson point processes [1], for more general classes of point processes the picture remains opaque and research is currently at a very early stage [7,27]. Recently, a central limit theorem for persistent Betti numbers has been established in the Poisson setting [25,29], but for general point processes the long-range interactions pose a formidable obstacle towards proving a fully-fledged functional CLT.
From a more practical point of view, these long-range dependencies are of less concern. Although large features can carry interesting information, we expect that spatially bounded topological features already provide a versatile tool for the statistical analysis of both simulated point patterns and real datasets. For that purpose, we concentrate on features whose spatial diameter does not exceed a large deterministic threshold M.
To define these M-bounded features, we introduce the Gilbert graph G_r(X) on the vertex set X: two points of X are connected by an edge if the distance between them is at most 2r or, equivalently, if the two disks of radius r centered at the points intersect.
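Since the Gilbert graph underlies all subsequent constructions, a minimal sketch may help. The paper's experiments use R (spatstat, TDA), but the following Python/numpy function (the name `gilbert_edges` is ours, not from the paper) computes the edge set directly from the 2r-rule:

```python
import numpy as np

def gilbert_edges(points, r):
    """Edge set of the Gilbert graph G_r: pairs of points at distance <= 2r,
    i.e. pairs whose closed disks of radius r intersect."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if dist[i, j] <= 2 * r]
```

The quadratic pairwise-distance computation is fine for the pattern sizes considered here; a k-d tree would be preferable for much larger windows.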
2.1. M-bounded clusters. The 0-dimensional M-bounded features alive at time r > 0 are the connected components of G_r(X) with diameter at most M. Starting at r = 0, all points belong to separate connected components that merge into larger clusters as r increases. We thus say that all components are born at time 0.
To define the death time of a component, let C_r(x) denote the connected component of x ∈ X in G_r(X). The components of x, y ∈ X meet at time R(x, y) = inf{r > 0 : C_r(x) = C_r(y)}.
Then, the death time of x ∈ X is the smallest R(x, y) such that the spatial diameter of C_r(x) exceeds M or such that P_x is lexicographically larger than P_y, where P_x, P_y are the points of C_r(x) ∩ X and C_r(y) ∩ X whose associated disks meet at time R(x, y). This ordering determines which component dies when two of them meet.
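Concretely, the death times of the 0-dimensional features arise from a single-linkage (Kruskal-type) sweep: two components merge when the disks of radius d/2 around their closest points touch. The sketch below (ours; it deliberately ignores the M-bound and the lexicographic tie-breaking rule described above) returns the death radii of all clusters except the last survivor:

```python
import numpy as np

def cluster_death_times(points):
    """Death radii of the 0-dimensional features: a component dies when it
    merges into another one, at half the corresponding inter-point distance.
    Sketch only: the M-bound and the lexicographic rule of Section 2.1 are
    ignored, and the last surviving component never dies."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    parent = list(range(n))

    def find(a):
        # path-halving union-find
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Kruskal sweep over all pairs, sorted by distance
    pairs = sorted(
        (float(np.linalg.norm(pts[i] - pts[j])), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    deaths = []
    for d, i, j in pairs:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(d / 2.0)  # disks of radius d/2 touch
    return deaths
```

For three collinear points at 0, 1 and 3, the merges happen at radii 0.5 and 1.0, matching the picture of growing disks.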
2.2. M-bounded loops. Next, we introduce 1-dimensional features. At time r > 0, these correspond to holes, i.e., bounded connected components of the vacant phase V_r(X) = R² \ U_r(X). In contrast to the clusters, there are no holes at time 0, so that both birth and death times must be specified. Moreover, it needs to be defined how holes at different radii r are related.
The death time of a hole H_s in V_s(X) is the first time r > s at which the hole is completely covered by disks, i.e., H_s ⊆ U_r(X). We identify a hole H with the point p(H) that is covered last. Thus, holes H_s in V_s(X) and H_r in V_r(X) are identified if p(H_r) = p(H_s).
New holes in V_r(X) can only appear when two balls merge, which corresponds to the inclusion of a new edge in G_r(X). A new hole can be formed in two ways: either a finite component of the vacant phase is separated from the infinite component, or an existing hole is split in two. In both cases, we define the size of the newly created hole(s) as follows. Let x_1, ..., x_k ∈ X be the points whose disks of radius r intersect the boundary of the hole H in V_r(X). Then, the size of H is the diameter of the set {x_1, ..., x_k}. The size remains unchanged until the next time the hole is split into smaller pieces; then the size is recomputed for both new holes. This definition ensures that the size decreases as the balls grow and only changes when a new edge is added to G_r(X).
The birth time of a hole H is the minimal s such that there is a hole H_s in V_s(X) with p(H) = p(H_s) and size smaller than M. By an M-bounded loop, we mean a loop of size smaller than M.
2.3. The persistence diagram. We now adapt the definition of the persistence diagram in [25] to include only M-bounded features. That is, we define the qth M-bounded persistence diagram, q ∈ {0, 1}, as the empirical measure

PD_{M,q}(X) = Σ_{i ∈ I_{M,q}(X)} δ_{(b_i, d_i)},

where I_{M,q}(X) is an index set over all M-bounded q-dimensional features that die before time r_f, and b_i, d_i denote the birth and death time of the ith feature.

Main results
Henceforth, P denotes a simple stationary point process in R². We think of P as a random variable taking values in the space N of locally finite subsets of R², endowed with the smallest σ-algebra such that the number of points in any given Borel set is measurable. Throughout the manuscript, we assume that the factorial moment measures exist and are absolutely continuous. In particular, writing x = (x_1, ..., x_p) ∈ R^{2p}, the pth factorial moment density ρ^{(p)} is determined via the identity

E[P(A_1) ⋯ P(A_p)] = ∫_{A_1 × ⋯ × A_p} ρ^{(p)}(x) dx

for any pairwise disjoint bounded Borel sets A_1, ..., A_p ⊂ R², where P(A_i) denotes the number of points of P in A_i. Moreover, as we rely on the framework of [8], we also require that P exhibits exponential decay of correlations. Loosely speaking, this expresses an approximate factorization of the factorial moment densities and is made precise in Section 4 below. Many of the most prominent examples of point processes appearing in spatial statistics exhibit exponential decay of correlations [8, Section 2.2]. Our first main result is a CLT for the persistence diagram built on the restriction P_n = P ∩ W_n of the point process P to a large observation window W_n = [−√n/2, √n/2]². With a slight abuse of notation, we write P ∪ x = P ∪ {x_1, ..., x_p}. To prove the CLT, we impose an additional condition concerning moments under the reduced p-point Palm distribution P^!_x.
We recall that this distribution is determined via

E[ Σ_{x ∈ P^p_≠} f(x, P \ {x_1, ..., x_p}) ] = ∫_{R^{2p}} E^!_x[f(x, P)] ρ^{(p)}(x) dx

for any bounded measurable f : R^{2p} × N → R, where P^p_≠ denotes the set of p-tuples of pairwise distinct points of P. Then, we impose the following moment condition.
(M) For every p ≥ 1,

sup_{l ≤ p} sup_{x ∈ R^{2l}} E^!_x[P(W_1)^p] < ∞,

where W_1 = [−1/2, 1/2]². To state the CLT for the persistence diagram precisely, we consider the M-bounded persistence diagram of P_n integrated against a bounded measurable test function. Theorem 3.1 asserts that, after centering and rescaling, this quantity converges in distribution to a standard normal random variable as n → ∞.
In order to derive a functional CLT for the persistent Betti numbers, we add a further constraint on P, which is needed to establish a lower bound on the variance via a conditioning argument in the vein of [38,Lemma 4.3]. For this purpose, we consider a random measure Λ, which is jointly stationary with P and which we think of as capturing additional useful information on the dependence structure of P. For instance, if P is a Cox point process, we choose Λ to be the random intensity measure. If P is a Poisson cluster process, then Λ would describe the cluster centers. If the dependence structure is exceptionally simple, it is also possible to take Λ = 0. The idea of using additional information is motivated from conditioning on the spatially refined information coming from the clan-of-ancestors construction in Gibbsian point processes [38].
The point process P is conditionally m-dependent if P ∩ A and P ∩ A′ are conditionally independent given σ(Λ, P ∩ A″) for any bounded Borel sets A, A′, A″ ⊂ R² such that the distance between A and A′ is larger than some m > 0. Here, σ(Λ, P ∩ A″) denotes the σ-algebra generated by Λ and P ∩ A″.
Finally, we impose an absolute continuity-type assumption on the Poisson point process in a fixed box with respect to P when conditioned on Λ and the outside points. More precisely, we demand that there exists r_AC > 6M with the following property, where Q denotes a homogeneous Poisson point process in the window W_{r_AC²}.
(AC) Let E_1, E_2 ∈ N be such that min_{i ∈ {1,2}} P(Q ∈ E_i) > 0. Then, almost surely, the conditional probabilities P(P ∩ W_{r_AC²} ∈ E_i | Λ, P ∩ (W_{r_AC²})^c), i ∈ {1, 2}, are also positive. Although (AC) appears technical, Section 4 illustrates that it is tractable for many commonly used point processes.
Since the persistent Betti numbers exhibit jumps at the birth and death times of features, we work in the Skorokhod topology [5, Section 14].
Theorem 3.2 (Functional CLT for persistent Betti numbers). Let M > 0 and let P be a conditionally m-dependent point process with exponential decay of correlations satisfying conditions (M) and (AC). Then, the following convergence statements hold true.
q = 0. The centered and rescaled one-dimensional process (β^{*,0}_{0,d}(P_n))_{d ≤ r_f} converges weakly in the Skorokhod topology to a centered Gaussian process.
q = 1. The centered and rescaled two-dimensional process (β^{*,1}_{b,d}(P_n))_{b ≤ d ≤ r_f} converges weakly in the Skorokhod topology to a centered Gaussian process.
Additionally, [8, Theorem 1.12] implies convergence of the rescaled variances. While Theorem 3.1 is an adaptation of [8], Theorem 3.2 is much more delicate. As an application of Theorem 3.2, we obtain a functional CLT (Corollary 3.3) for two characteristics that are modified variants of the accumulated persistence function from [6]: the number of M-bounded cluster deaths up to time r, and the accumulated life times of the M-bounded loops born up to time r.

Examples of point processes
In this section, we give examples of point processes which satisfy the assumptions of our main theorems. More precisely, we show that log-Gaussian Cox processes with compactly supported covariance functions and Matérn cluster processes both satisfy the conditions of Theorems 3.1 and 3.2. We also show that the Ginibre point process satisfies the conditions of Theorem 3.1.
Conversely, we do not expect hard-core point processes to satisfy the functional central limit theorem in the generality of Theorem 3.2. Indeed, hard-core conditions put a strict lower bound on the death times of clusters and the birth times of loops. However, we believe that suitable repulsive point processes, where hard-core conditions only need to be imposed with a certain probability, can be embedded in the framework of Theorem 3.2.
4.1. Log-Gaussian Cox process. Let Y = {Y(x)}_{x ∈ R²} be a stationary Gaussian process with mean μ ∈ R and covariance function c(x, x′) = c(x − x′). Then, the random measure on R² defined by Λ(B) = ∫_B exp(Y(x)) dx for any Borel subset B ⊂ R² has moments of any order. Let P be a Cox process with random intensity measure Λ, referred to as a log-Gaussian Cox process. By [15, Equation (7)], the factorial moment densities of P are given by

ρ^{(p)}(x_1, ..., x_p) = exp( pμ + (p/2) c(0) + Σ_{1 ≤ i < j ≤ p} c(x_i − x_j) ).

To apply Theorems 3.1 and 3.2, we assume that c is bounded and of compact support, which ensures that P exhibits exponential decay of correlations. We show below that condition (M) is satisfied. Let x = (x_1, ..., x_l) ∈ R^{2l}. According to [16, Theorem 1], the log-Gaussian Cox process P under the reduced Palm distribution is again a log-Gaussian Cox process P_x, whose underlying Gaussian process has shifted mean function μ + Σ_{i ≤ l} c(· − x_i) and the same covariance function c. According to [17, Equation (5.4.5)], the moments E^!_x[P(W_1)^p] can be expressed in terms of the factorial moment densities ρ^{(j)}_x(u_1, ..., u_j) of P_x. Therefore, it is enough to prove that these densities are bounded uniformly in x for all j, l ≥ 1. Now, Equation (8) in [15] gives an explicit expression for ρ^{(j)}_x whose right-hand side is bounded independently of x, as μ and c are bounded. This verifies condition (M).
Since, conditionally on Λ, the point process P is a Poisson point process, the conditional m-dependence property holds with the auxiliary random measure taken to be Λ itself.
It remains to verify condition (AC). By [33, Equation (6.2)], conditionally on Λ, the distribution of the point process P ∩ W_{r_AC²} admits the density

f_Λ(φ) = exp( |W_{r_AC²}| − Λ(W_{r_AC²}) ) ∏_{x ∈ φ} exp(Y(x)), φ ∈ N,

with respect to a homogeneous Poisson point process Q with intensity 1 in W_{r_AC²}. In particular, f_Λ(φ) is strictly positive for all φ. Therefore, if E_1, E_2 are two events such that min_{i ∈ {1,2}} P(Q ∈ E_i) > 0, then P(P ∩ W_{r_AC²} ∈ E_i | Λ) > 0. This verifies condition (AC).
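For intuition, sampling a log-Gaussian Cox process can be sketched on a grid. The following Python code is ours (the paper's experiments use R); for illustration it assumes a squared-exponential covariance c(h) = c0·exp(−|h|²/(2ℓ²)), even though the theorems above require a compactly supported c, and it approximates the Gaussian field as piecewise constant on grid cells:

```python
import numpy as np

def lgcp_sample(mu, c0, ell, side, n_grid, rng):
    """Grid-based sketch of a log-Gaussian Cox process in [0, side]^2.
    Illustrative assumptions: squared-exponential covariance (NOT compactly
    supported, unlike in Theorems 3.1/3.2) and a piecewise-constant field."""
    xs = (np.arange(n_grid) + 0.5) * side / n_grid
    gx, gy = np.meshgrid(xs, xs)
    cells = np.c_[gx.ravel(), gy.ravel()]
    h = np.linalg.norm(cells[:, None, :] - cells[None, :, :], axis=-1)
    cov = c0 * np.exp(-h ** 2 / (2 * ell ** 2)) + 1e-8 * np.eye(len(cells))
    # one draw of the Gaussian field Y on the cell centers
    y = mu + np.linalg.cholesky(cov) @ rng.standard_normal(len(cells))
    area = (side / n_grid) ** 2
    counts = rng.poisson(np.exp(y) * area)  # Poisson counts given the intensity
    pts = []
    for c, k in zip(cells, counts):
        # place points uniformly within each cell
        pts.append(c + rng.uniform(-0.5, 0.5, (k, 2)) * side / n_grid)
    return np.vstack(pts)
```

A finer grid (larger `n_grid`) reduces the discretization error of the piecewise-constant approximation at the cost of a larger Cholesky factorization.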

4.2. Matérn cluster process. Let η be a homogeneous Poisson point process in R² with intensity γ > 0. Given a realization of η, we define a family of independent point processes (Φ_x)_{x ∈ η}, where Φ_x, x ∈ η, is a homogeneous Poisson point process with intensity 1 in the disk B_R(x) of radius R > 0 centered at x ∈ R². The point process P = ∪_{x ∈ η} Φ_x is referred to as a Matérn cluster process. Since P is 2R-dependent, it exhibits exponential decay of correlations.
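A Matérn cluster process is straightforward to simulate. The sketch below (ours; the paper uses R's spatstat) parametrizes the offspring by their mean number μ per parent, matching the MatC(γ, R, μ) notation of the simulation study rather than the unit-intensity convention above, and ignores edge effects from parents just outside the window:

```python
import numpy as np

def matern_cluster(gamma, R, mu, side, rng):
    """Sample a Matern cluster process in [0, side]^2: Poisson(gamma) parents,
    each with a Poisson(mu) number of offspring placed uniformly in the disk
    B_R(parent). Sketch only: parents outside the window are ignored (a
    slight edge-effect simplification) and offspring may leave the window."""
    n_parents = rng.poisson(gamma * side * side)
    parents = rng.uniform(0.0, side, size=(n_parents, 2))
    pts = []
    for p in parents:
        k = rng.poisson(mu)
        theta = rng.uniform(0.0, 2.0 * np.pi, k)
        rad = R * np.sqrt(rng.uniform(0.0, 1.0, k))  # uniform in the disk
        pts.append(p + np.c_[rad * np.cos(theta), rad * np.sin(theta)])
    return np.vstack(pts) if pts else np.empty((0, 2))
```

With γ = 2, R = 0.1, μ = 1 on a 10 × 10 window, this corresponds to the MatC(2, 0.1, 1) alternative of Section 5.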
Next, we verify condition (M). For this purpose, we deduce from [16, Section 5.3.2] that a Matérn cluster process is a Cox process whose random intensity measure Λ has density given by the random field (λ(x))_{x ∈ R²} with

λ(x) = η(B_R(x)).

Now, let x = (x_1, ..., x_l) ∈ R^{2l} and p ≥ 1 be fixed. From [16, Equations (19) and (20)], we obtain a bound on E^!_x[P(W_1)^p] in terms of the ratio of E[P(W_1)^p ∏_{i ≤ l} λ(x_i)] and E[∏_{i ≤ l} λ(x_i)]. Since η(B_R(x)) is increasing in η for every x ∈ R² in the sense of [30], the Harris-FKG inequality [30, Theorem 20.4] bounds the denominator from below by ∏_{i ≤ l} E[λ(x_i)], where each λ(x_i) = η(B_R(x_i)) is a Poisson random variable with parameter γπR². In order to bound E[P(W_1)^p ∏_{i ≤ l} λ(x_i)], we first apply the Hölder inequality and stationarity to arrive at an expression involving moments of #Φ_0, where Φ_0 is a homogeneous Poisson point process of intensity 1 in the disk B_R(o) [30, Theorem 3.9]. Again, since #Φ_0 is a Poisson random variable with parameter πR², the latter expression is finite. Taking the supremum over all x and all l ≤ p verifies condition (M). The point process P is also conditionally m-dependent, by taking m = 2R and Λ = η. It remains to prove (AC). By [33, Equation (6.2)], conditional on Λ = η, the distribution of P ∩ W_{r_AC²} admits a density with respect to the distribution of a homogeneous Poisson point process. Now, on a suitable event E for η, this density is strictly positive; since E occurs with positive probability, this proves condition (AC).

4.3. Ginibre point process. The Ginibre point process is a determinantal point process with kernel

K(z_1, z_2) = π^{-1} exp( z_1 z̄_2 − (|z_1|² + |z_2|²)/2 ), z_1, z_2 ∈ C.

As mentioned in [8, p. 19], this point process exhibits exponential decay of correlations. According to [21, Theorem 2], the reduced Palm process is again determinantal and its number of points in W_1 is stochastically dominated by P(W_1), so that E^!_x[P(W_1)^p] ≤ E[P(W_1)^p], where the right-hand side is finite by [26, Lemma 4.2.6]. Hence, we obtain an upper bound for E^!_x[P(W_1)^p] that is independent of x, thereby verifying condition (M).

Simulation study
In a simulation study, we elucidate how cluster- and loop-based test statistics derived from Theorem 3.2 can detect deviations from complete spatial randomness. The simulations are carried out on top of the R packages spatstat and TDA [2,20].
For the entire simulation study, the null model Poi(2) is a Poisson point process with intensity 2 in a 10 × 10 observation window. Moreover, we fix M = √2 · 10, which is so large that the M-bound encompasses the entire sampling window, and therefore suppress M in the notation. Although the proof of Theorem 3.2 relies on the M-boundedness, the simulation study illustrates that this condition is not critical when testing hypotheses on common point patterns.

Deviation tests.
As a first step, we derive scalar cluster- and loop-based test statistics.

Definition of test statistics.
As a test statistic based on clusters, we use the integral of the number of cluster deaths over a time interval [0, r_C] with r_C ≤ r_f, i.e.,

T_C = ∫_0^{r_C} #{clusters with death time at most r} dr.

After subtracting the mean, this test statistic is reminiscent of the classical Cramér-von Mises statistic, except that we do not consider squared deviations. Although squaring would make it easier to detect two-sided deviations, it would also require knowledge of quantiles of the squared integral of a centered Gaussian process. Albeit possible, this incurs substantial computational expenses. Our simpler alternative has the appeal that, as an integral of a Gaussian process, T_C is asymptotically normal and therefore characterized by its mean and variance. As a test statistic based on loops, we use the accumulated persistence function, which aggregates the life times of all loops with birth times in a time interval [0, r_L] with r_L ≤ r_f, i.e.,

T_L = Σ_{i : b_i ≤ r_L} (d_i − b_i),

where the sum runs over all loops with birth time b_i and death time d_i. By Corollary 3.3, after centering and rescaling, the statistic T_L converges in the large-volume limit to a normal random variable.
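Both statistics are elementary to compute from a persistence diagram. A sketch (ours) that evaluates T_C exactly via the identity ∫_0^{r_C} 1{d ≤ r} dr = max(0, r_C − d):

```python
import numpy as np

def T_C(cluster_deaths, r_C):
    """Integral over [0, r_C] of the number of cluster deaths up to time r.
    Uses the identity  int_0^{r_C} 1{d <= r} dr = max(0, r_C - d)."""
    d = np.asarray(cluster_deaths, dtype=float)
    return float(np.clip(r_C - d, 0.0, None).sum())

def T_L(births, deaths, r_L):
    """Accumulated life times of all loops born in the interval [0, r_L]."""
    b = np.asarray(births, dtype=float)
    d = np.asarray(deaths, dtype=float)
    mask = b <= r_L
    return float((d[mask] - b[mask]).sum())
```

For instance, a single cluster dying at time 0.5 contributes 0.5 to T_C with r_C = 1, since its death indicator is active on [0.5, 1].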
The statistics T_C and T_L are specific possibilities for defining scalar characteristics from the persistence diagram. Depending on the application context, other choices, such as APF_0 instead of T_C, could be useful. However, in the simulation study below, we found the weighting by life times of clusters to be detrimental.

5.1.2. Exploratory analysis. As alternatives to the Poisson null hypothesis, we consider the attractive Matérn cluster and the repulsive Strauss process. More precisely, the Matérn cluster process MatC(2, 0.1, 1) features a Poisson parent process with intensity 2 and generates a Poi(1) number of offspring uniformly in a disk of radius 0.1 around each parent. The Strauss process Str(4.5, 0.1, 0.35) has interaction parameter 0.1 and interaction radius 0.35. The intensity parameter 4.5 was tuned to approximately match the intensity of the null model. Figure 2 shows realizations of the null model and the alternatives.
In a first step, in Figure 3, we plot the persistence diagrams of samples from the null model and from the alternatives. From the cluster-based diagrams, it becomes apparent that, in comparison to the null model, features in the Matérn cluster process can also die at rather late times, whereas this happens very rarely in the Strauss process. When analyzing loops, we see that loops with long life times can appear earlier in the null model than in the Matérn cluster process. Conversely, while some loops with substantial life times emerge at later times in the null model, there are very few such cases in the Strauss model.

5.1.3.
Mean and variance under the null model. Now, we determine the mean and variance of T_C and T_L under the null model with r_f = 1.5. For this purpose, we compute the number of cluster deaths and the accumulated loop life times for 10,000 independent draws of the null model.
Comparing the mean curves for the number of cluster deaths in the null model with those of the alternatives matches up nicely with the intuition about attraction and repulsion. For late times, they all approach a common value, namely the expected number of points in the observation window. However, Figure 4 shows that for the Matérn model, the slope is far steeper for early times, caused by the merging of components of points within a cluster. In contrast, for the Strauss process the increase is at first much less pronounced than in the Poisson model, thereby reflecting the repulsive nature of the Gibbs potential. For the loops, a radically different picture emerges. Here, the curve for the Strauss process lies above the accumulated loop life times of the null model. The Strauss model spawns substantially more loops than the Poisson model, although most of them live for a shorter period. Still, taken together these competing effects lead to a net increase of the accumulated loop life times in the Strauss model.

5.1.4. Type I and II errors. By Theorem 3.2, the statistics T_C and T_L are asymptotically normal, so that knowing the mean and variance allows us to construct a deviation test whose nominal confidence level is asymptotically exact. For the loops, we can choose the entire relevant time range, so that r_L = 0.5. For the cluster features, this choice would be unreasonable, as for late times we simply obtain the number of points in the observation window, which is not discriminative. Hence, we set r_C = 0.1. We stress that in situations with no a priori knowledge of a good choice of r_C, the test power can degrade substantially.
To analyze the type I and II errors, we draw 1,000 realizations from the null model and from the alternatives, respectively. Table 1 shows the rejection rates of this test setup. Under the null model, the rejection rates are close to the nominal 5%-level, thereby illustrating that already for moderately large point patterns the approximation by the Gaussian limit is accurate. Using the mean and variance from the null model, we now compute the test powers for the alternatives. Already T_C leads to a test power of approximately 60% for both alternatives. When considering T_L, we obtain a type I error rate of 4.8%, so that the confidence level is kept. Moreover, the power analysis reveals that, in the present simulation set-up, T_L is better at detecting deviations from the null hypothesis than T_C.
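The deviation test itself is a plain two-sided z-test against Monte Carlo estimates of the null mean and standard deviation; a sketch (the function name is ours):

```python
import numpy as np

def deviation_test(t_obs, null_samples, z_crit=1.959963984540054):
    """Two-sided deviation test at the 5% level: standardize the observed
    statistic with the Monte Carlo mean and standard deviation estimated
    under the null model, then compare |z| with the 97.5% standard normal
    quantile (hard-coded to stay numpy-only). Relies on the asymptotic
    normality of the statistic (Theorem 3.2)."""
    s = np.asarray(null_samples, dtype=float)
    z = float((t_obs - s.mean()) / s.std(ddof=1))
    return abs(z) > z_crit, z
```

Here `null_samples` would be the 10,000 draws of T_C or T_L simulated under Poi(2), and `t_obs` the statistic of the observed pattern.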

Envelope tests.
Leveraging Theorem 3.2 shows that the deviation statistics T C and T L are asymptotically normal. Using a simulation-based estimate for the asymptotic mean and variance under the null model allowed us to construct a deviation test whose confidence level is asymptotically precise. A caveat of the above analysis is that the magnitude of clustering and repulsion is strong and clearly visible in the samples.

        Poi(2)   MatC    Str
T_C      5.1%   59.3%   60.7%
T_L      4.8%   94.7%   71.4%

Table 1. Rejection rates for the test statistics T_C and T_L under the null model and the alternatives.
Recently, global envelope tests have gained widespread popularity, because they are both powerful and provide graphical insights into why a null hypothesis is rejected [34]. Global envelope tests are fundamentally Monte Carlo-based tests and therefore do not relate directly to the large-volume CLT. However, they also rely on a functional summary statistic as input. Most applications in spatial statistics use a distance-based second-order functional such as Ripley's L-function. In this section, we compare such classical choices with cluster- and loop-based statistics.

5.2.1. Alternatives. Since envelope tests excel at detecting subtle changes from the null model, we now consider a new parameter set-up to compare the L-function with the cluster- and loop-based statistics. Here, both the Matérn cluster and the Strauss process are substantially more similar to the Poisson point process. Hence, for the alternatives, we again use Matérn cluster and Strauss processes, but choose different parameters.
We found that the cluster- and loop-based statistics were particularly powerful in situations involving small interaction radii. Hence, as alternatives we choose the MatC(20, 0.1, 0.1) process and the Str(2.1, 0.1, 0.1) process, see Figure 5. The interaction parameter of the Strauss process was again tuned to approximately match the intensity of the null model.

5.2.2. Power analysis.
To analyze the power of the envelope test, we generate s = 4,999 realizations of the null model and 1,000 realizations of the alternatives. Then, we perform the global envelope test from [34] with three functional summary statistics. The first is Ripley's L-function [33, Definition 4.6]. Second, we consider the number of cluster deaths, as illustrated in Figure 4. Third, for the loops, we use a two-dimensional functional statistic derived from the persistent Betti numbers {β^{*,1}_{b,l}}_{b,l}, associated with life times l rather than death times d, in order to expand the support of the statistic to the entire first quadrant. More precisely, β^{*,1}_{b,l} counts the number of loops born before time b and with life time at least l.
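The two-parameter summary β^{*,1}_{b,l} can be evaluated on a grid directly from the birth and death times of the loops; a sketch (function name and grid convention are ours):

```python
import numpy as np

def betti_star_1(births, deaths, b_grid, l_grid):
    """Evaluate beta^{*,1}_{b,l} on a grid: the number of loops born up to
    time b whose life time is at least l, i.e. the two-parameter summary
    fed into the global envelope test."""
    birth = np.asarray(births, dtype=float)
    life = np.asarray(deaths, dtype=float) - birth
    out = np.empty((len(b_grid), len(l_grid)), dtype=int)
    for i, b in enumerate(b_grid):
        for j, l in enumerate(l_grid):
            out[i, j] = int(((birth <= b) & (life >= l)).sum())
    return out
```

The resulting matrix, vectorized over the grid, is what the global rank envelope procedure compares across the observed pattern and the null-model simulations.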
The rejection rates in Table 2 illustrate that, for the alternatives described above, the cluster-based test yields a test power similar to that of the L-function-based test for the Matérn cluster process and a substantially higher test power for the Strauss process. Moreover, the loop-based test performs even better in the Strauss case, but substantially worse for the Matérn alternative.

Analysis of the minicolumn dataset
In this section, we explore to what extent the deviation tests from Section 5 provide insights when dealing with real data. For this purpose, we analyze the minicolumn dataset provided by scientists at the Centre for Stochastic Geometry and Advanced Bioimaging.
As it should serve only to illustrate the application of Theorem 3.2, the present analysis is very limited in scope, and we refer to [14] for a far more encompassing study. For instance, that work considers two datasets and investigates 3D data together with marks for the directions attached to the neurons.

6.1. Exploratory analysis. The minicolumn dataset consists of 634 points emerging as two-dimensional projections of a three-dimensional point pattern of neurons. The projections are taken along the z-axis, since neuroscientists expect the neurons to arrange in vertical columns, so that the projected pattern should exhibit clustering, see [32,37]. A visual inspection of the point pattern in Figure 6 supports this hypothesis.
As a first step, we explore whether the purported clustering already manifests in the persistence diagram. Comparing the loop-based persistence diagram of the minicolumn data with that of a homogeneous Poisson point process in Figure 7 shows that loops with substantial life times tend to be born later in the minicolumn data. This suggests clustering, since loops formed by points within a cluster typically disappear rapidly. Next, we explore whether the impressions from the persistence diagrams are reflected in the summary statistics from Section 5. When comparing in Figure 8 (left) the number of cluster deaths at different points in time, we note that until time 35, the curve for the observed data runs a bit above the curve for the null model. This provides a first indication of clustering. We then proceed to the loop-based features. As shown in Figure 8 (right), the curve for the observed pattern runs substantially below that of the null model. This reflects a property already visible in the persistence diagram: loops with substantial life times tend to be born earlier in the null model, thereby leading to a steeper increase of the accumulated life times.

6.2. Test for complete spatial randomness. Under the impression of the previous visualizations, we now test the minicolumn pattern against the null model. As in Section 5, we deduce from Theorem 3.2 that the statistics are asymptotically normal under the null model, so that we only need to determine means and variances.
A subtle issue concerns the choice of the integration interval. The simplest option would be to take the whole intervals shown in Figure 8. For instance, for the loop-based features, this means r_L = r_f = 120. However, for the cluster-based features the choice of the interval is less clear, since taking the whole interval is not discriminatory. The experience from the simulation study indicates that the test is most powerful for early death times. Therefore, we choose r_C = 10.
With these choices, both the cluster-based and the loop-based test reject the null hypothesis at the 5% level, with corresponding p-values of 1.7% and 1.2%. However, the tests are sensitive to the choice of the integration bound. This is especially true for the cluster-based test, where going from r_C = 10 to r_C = 15 results in a p-value of 5.2%, so that the null hypothesis is no longer rejected. On the other hand, the loop-based test still yields a p-value of 1.9% for r_L = 75. However, when reducing further to r_L = 50, the p-value increases sharply to 34.1%, so that the null hypothesis is no longer rejected.
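The testing procedure described above can be sketched in a few lines. The following is our own minimal illustration, not the implementation used in the paper: the mean and variance of the asymptotically Gaussian statistic are calibrated from simulated null samples, and a two-sided p-value is obtained from the normal approximation; all function names are hypothetical.

```python
import math

def apf_statistic(diagram, r_max):
    """Accumulated persistence: total lifetime of features born by time r_max."""
    return sum(d - b for (b, d) in diagram if b <= r_max)

def gaussian_test(observed, null_samples):
    """Two-sided test based on the asymptotic Gaussianity of the statistic.
    Mean and variance are calibrated from simulations of the null model."""
    n = len(null_samples)
    mean = sum(null_samples) / n
    var = sum((s - mean) ** 2 for s in null_samples) / (n - 1)
    z = (observed - mean) / math.sqrt(var)
    # two-sided p-value from the standard normal distribution
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return z, p
```

With an integration bound such as r_C or r_L, `apf_statistic` plays the role of the accumulated summary; in practice, the null samples would come from simulating the fitted Poisson model in the observation window.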

Discussion
In this paper, we elucidated how to apply tools from TDA to derive goodness-of-fit tests for planar point patterns. For this purpose, we derived sufficient conditions for a large-domain functional CLT for the M-bounded persistent Betti numbers of point processes exhibiting exponential decay of correlations. Following the framework developed in [8], the main difficulty arose from a detailed analysis of geometric configurations when bounding higher-order cumulants.
A simulation study revealed that the asymptotic Gaussianity is already accurate for patterns consisting of a few hundred data points. Additionally, as functional summary statistics, the persistent Betti numbers can also be used in the context of global envelope tests. Here, our finding is that TDA-based statistics can provide helpful additional information for point patterns with small interaction radii.
Finally, we applied the TDA-based tests to a point pattern from a neuroscientific dataset. As conjectured from the application context, the functional summary statistics indicate a clustering of points, and the tests reject the Poisson null model. However, the analysis also reveals a sensitivity to the range of birth times considered in the statistics.
In future work, we plan to extend the present analysis to dimensions larger than 2. On a technical level, the definition of higher-dimensional features requires a deeper understanding of persistent homology groups. Additionally, when thinking of broader application scenarios, a further step is to extend the testing framework from mere point patterns to random closed sets involving a richer geometric structure.
Acknowledgements
This work was partially supported by the Danish Council for Independent Research – Natural Sciences, grant DFF 7014-00074 Statistics for point processes in space and beyond, and by the Centre for Stochastic Geometry and Advanced Bioimaging, funded by grant 8721 from the Villum Foundation. NS was partially supported by the French ANR grant ASPAG (ANR-17-CE40-0017).

Proof of Theorem 3.1
The main tool to prove Theorem 3.1 is the general CLT of [8, Theorem 1.14]. To be in that framework, we need to express the statistic in the form Σ_{x∈P_n} ξ(x, P_n) for a suitable score function ξ(x, P_n).
In other words, we need to transform the indexing over features into an indexing over the points of the point process P n . We achieve this goal by assigning to each feature a point x ∈ P n that either kills or gives birth to this feature, depending on whether q = 0 or q = 1.
First, the death of a cluster at time r > 0 is always caused by the merging of two points x, x′ ∈ P_n at distance 2r. Indeed, a jump in the size of a component can only occur by merging with another component. If C_r(x) dies by this merging, we say that x kills C_r(x). This ensures that if two components both die when they merge, their deaths are caused by different points.
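The bookkeeping behind this assignment can be illustrated concretely. In the sketch below (our own illustration, not part of the proof), the death times of the 0-dimensional features are recovered as half the merger distances, computed by Kruskal's algorithm with a union-find structure:

```python
import itertools
import math

def cluster_death_times(points):
    """Death times of 0-dimensional features: every merger of two components
    at point distance 2r kills exactly one component at filtration time r.
    A minimum spanning tree (Kruskal with union-find) yields all mergers."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    deaths = []
    pairs = sorted(itertools.combinations(range(len(points)), 2),
                   key=lambda ij: math.dist(points[ij[0]], points[ij[1]]))
    for i, j in pairs:
        ri, rj = find(i), find(j)
        if ri != rj:                       # two components merge ...
            parent[ri] = rj
            deaths.append(math.dist(points[i], points[j]) / 2)  # ... one dies
    return deaths                          # n - 1 deaths; one component survives
```

Since only merger edges contribute, the list of deaths coincides with half the edge lengths of a minimum spanning tree of the point set.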
Similarly, if q = 1, then the birth of a hole at time r > 0 is caused by two points x, x′ ∈ P_n at distance 2r whose connection creates a new hole. If only one feature is born at time r, we choose the lexicographic minimum of x and x′ and say that it gives birth to this hole. However, if a large hole is split into two M-bounded holes, it can happen that two holes H, H′ are born at the same time. In this situation, we assign one hole to each of x and x′. We define the score functions accordingly.

Belonging to class (A2) itself involves three conditions. The first is exponential decay of correlations, one of our standing assumptions on the point process P. The second asks for an exponentially decaying radius of stabilization. Since we work with M-bounded features, this radius is finite. Finally, we need to verify that the power-growth condition of [8] holds for every r > 0, locally finite X ⊂ R^2 and x ∈ X. To achieve this goal, we note that in the worst case x can be responsible for the death of all other points of X. Similarly, it can give birth to at most X(W_r(x)) − 1 holes. Hence, the power-growth condition holds.

Finally, we verify the pth moment condition [8, Equation (1.19)]. That is, we prove that for every p > 0 there exists M_p > 0 such that the pth moments of the scores are uniformly bounded. We explain in detail how this is achieved if q = 0, noting that the case q = 1 can be deduced after minor modifications. If x ∈ P is responsible for the death of a component at time r, then there exists x′ ∈ P_n at distance 2r from x. Since each ball grows for time at most r_f, we see that |ξ_0(x, P_n)| ≤ |f|_∞ (P_n(B_{2r_f}(x)) + p), and an application of condition (M) concludes the proof.

Proofs of Theorem 3.2 and Corollary 3.3
In the following, we assume q = 1, since the proofs for q = 0 are similar but easier. Hence, to simplify notation, we write β_{b,d}(P_n) for β^{M,1}_{b,d}(P_n).

Proof of Corollary 3.3. Note that if (X(s))_{s≤r_f} is a Gaussian process, then the process (∫_0^r X(s) ds)_{r≤r_f} is also Gaussian. As mentioned above, the plan is to start from Theorem 3.2 and then apply the continuous mapping theorem [28, Theorem 4.27]. To this end, we show that {APF^{M,1}_r(P_n)}_{r≤r_f} is a continuous functional of the persistent Betti numbers {β_{b,d}(P_n)}_{b,d≤r_f}. We assert that

APF^{M,1}_r(P_n) = ∫_0^r β_{b,0}(P_n) db + ∫_0^{r_f} β_{r,t}(P_n) dt − r β_{r,0}(P_n).   (10)
The remainder of the proof proceeds in two steps. First, we verify identity (10). Second, we show that the right-hand side is continuous in β with respect to the Skorokhod topology.
To prove identity (10), linearity allows us to reduce the claim to the case where the persistence diagram consists of a single δ-measure at a point (B_0, D_0) for some D_0 > B_0 > 0. If B_0 > r, then both sides vanish. If B_0 ≤ r, then β_{b,0} = 1{b ≥ B_0} and β_{r,t} = 1{t ≤ D_0}, so that the right-hand side of (10) gives the asserted (r − B_0) + D_0 − r = D_0 − B_0.

According to (10), it is sufficient to prove that the function Φ_r: D([0, r_f]^2, R) → D([0, r_f], R), β ↦ (Φ_r(β))_{r≤r_f} is continuous with respect to the Skorokhod topology. We prove this for the first integral; the arguments for the second are similar. Let β: [0, r_f]^2 → R be càdlàg and λ: [0, r_f] → [0, r_f] be an increasing continuous bijection. Then, the difference of the integrals splits into three expressions. If β′ approaches β in the Skorokhod metric, then by definition of this metric, we can choose λ such that the first two expressions become arbitrarily small. Moreover, since β itself is càdlàg, also the third expression tends to 0.
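Identity (10) can also be checked numerically on toy persistence diagrams. In the following sketch (our own illustration, not code from the paper), β_{b,d} counts features born by time b that are still alive at time d, and the right-hand side of (10) is evaluated by midpoint Riemann sums:

```python
def betti(diagram, b, d):
    """Persistent Betti number: features born by time b still alive at time d."""
    return sum(1 for (B, D) in diagram if B <= b and d <= D)

def apf(diagram, r):
    """Accumulated persistence: total lifetime of features born by time r."""
    return sum(D - B for (B, D) in diagram if B <= r)

def apf_via_betti(diagram, r, r_f, steps=100000):
    """Right-hand side of identity (10), evaluated by midpoint Riemann sums."""
    db, dt = r / steps, r_f / steps
    term1 = sum(betti(diagram, (k + 0.5) * db, 0.0) * db for k in range(steps))
    term2 = sum(betti(diagram, r, (k + 0.5) * dt) * dt for k in range(steps))
    return term1 + term2 - r * betti(diagram, r, 0.0)
```

On any finite diagram with death times below r_f, the two computations agree up to the discretization error of the Riemann sums.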

The proof of Theorem 3.2 decomposes into two steps: lower and upper variance bounds and an upper bound on fourth-order cumulants. In what follows, for a block E = (b_−, b_+] × (d_−, d_+], we write β(E, P_n) for the associated increment of the persistent Betti numbers.
Notice that this is minus the measure PD^{M,q}(P_n) from (2) evaluated at the block E. Moreover, β(E, P_n) is the number of holes with birth time before b_+ and death time between d_− and d_+ minus the number of holes with birth time before b_− and death time between d_− and d_+. Following [4], two blocks E, E′ ⊂ [0, r_f]^2 are neighboring if they share a common side. We postpone the proofs of Propositions 9.1–9.3 to Sections 9.1–9.3, respectively. To deduce Theorem 3.2 from these auxiliary results, we write β̄(E, P_n) = β(E, P_n) − E[β(E, P_n)] for the centered persistent Betti numbers. Let a_1, . . . , a_k ≠ 0 and (b_1, d_1), . . . , (b_k, d_k) ∈ [0, r_f]^2 be pairwise distinct, and let X_n denote the corresponding linear combination.

Proof of Theorem 3.2.
Then, after suitable regrouping of terms, we can express X_n in the required form for centered random variables X, Y. Hence, by Propositions 9.2 and 9.3,

E[n^{−2} β(E, P_n)^2 β(E′, P_n)^2] ≤ (C_0/n + 3C_0^2) |E|^{1/2+ε_0} |E′|^{1/2+ε_0}

for some ε_0 > 0. In particular, the process (n^{−1/2} β_{b,d}(P_n))_{b,d≤r_f} is tight in the Skorokhod topology [24, Lemma 3]. In this context, we note that condition (8.4) of [24, Lemma 3] follows from the variance upper bound derived in Proposition 9.2 and that, similarly as in (2.18) of [22], we have replaced the equality in (8.5) of [24, Lemma 3] by an inequality. Combining this property with the convergence of finite-dimensional distributions yields the asserted weak convergence.

9.1. Proof of Proposition 9.1. To show the variance lower bound, we adapt a conditioning argument that has already been successfully applied in the setting of Gibbsian point processes [38]. More precisely, we subdivide the window W_n into blocks of a fixed size and use the law of conditional variance to obtain a lower bound of the order of the number of blocks.
Associate with the jth feature H_j in PD^{M,1}(P_n) a center point y_j ∈ W_n, for instance by taking the point p(H_j) as defined in Section 2.2. Then, the corresponding weighted sum of Dirac measures at the points y_j defines a signed measure ν_n of total mass ν_n(W_n) = Σ_{i≤k} a_i β(E_i, P_n).
In the vein of [38], the key to proving a lower bound on the variance is the following non-degeneracy property, where r_AC is introduced in Section 3. Before proving Lemma 9.4, we explain how it implies Proposition 9.1. In essence, the proof follows along the lines of [38, Lemma 4.3]. Nevertheless, since the details of the conditioning argument differ slightly from the corresponding picture for Gibbs processes, we explain how to adapt the main steps from [38, Lemma 4.3] to the present setting.
Proof of Proposition 9.1. The idea of the proof is to consider a family of well-separated blocks in W_n. Then, we leverage the conditional m-dependence of the point process and the M-boundedness of the features to decompose the variance of their contributions as the sum of the variances. More precisely, we apply the assumption of conditional m-dependence with the conditioning set chosen as the complement of the union of well-separated blocks of side length ρ = m ∨ r_AC. Then, the law of total variance yields a lower bound in terms of a sum of conditional variances. Moreover, since ρ > M, the statistic ν_n((A′)^−) in the smaller domain is measurable with respect to P ∩ A′. Thanks to the conditional m-dependence, the conditional variance factorizes over the blocks. Now, the number of 6ρ-blocks contained in W_n is of order n, and we conclude by noting that Lemma 9.4 and ρ > r_AC imply that each of the contributions is bounded away from 0.
To verify non-degeneracy, we rely on the techniques introduced in [38]. In particular, we make use of [38,Lemma 2.3], which we restate below to render the presentation self-contained.
Lemma 9.5. Let Y be a real random variable and A_1, A_2 be Borel sets of R. Then, the variance of Y is bounded below in terms of P(Y ∈ A_1) and P(Y ∈ A_2).

Proof of Lemma 9.4. Write F_1 = {P(W_{r_AC^2/9}) = 0} and F = {P(W_{r_AC^2} \ W_{r_AC^2/9}) = 0} for the events that there are no points in W_{r_AC^2/9} and in W_{r_AC^2} \ W_{r_AC^2/9}, respectively. Next, let F_2 = {β(E_1, P_{r_AC^2/9}) = 1} ∩ {β(E_2, P_{r_AC^2/9}) = · · · = β(E_k, P_{r_AC^2/9}) = 0} denote the event that all but the first of the considered persistent Betti numbers vanish. Now, let I_0 denote the indices of all features that are entirely contained in the inner window. Then, by Lemma 9.5 with A_1 = [a_1, ∞) and a suitable A_2, it remains to show that the right-hand side is non-zero. Since E_1, . . . , E_k are pairwise disjoint and contain points above the diagonal, [25, Example 1.8] shows that under the homogeneous Poisson point process the event F ∩ F_2 has positive probability. Also F ∩ F_1 is of positive probability. Hence, an application of condition (AC) concludes the proof.

9.2. Proof of Proposition 9.2. For a block E = (b_−, b_+] × (d_−, d_+],
we let ξ_E denote the score function associated with β(E, P_n). That is, ξ_E(x, P_n) = #{i : (B^M_i, D^M_i) ∈ E and x gives birth to the ith hole} is the number of holes born at x with birth and death times in E. Note that if x gives birth to the ith hole, then it gets in contact with another point at time B^M_i ∈ (b_−, b_+]. In particular, P contains a point in the annulus A_{2b_−,2b_+}(x). Moreover, if the ith hole dies at time D^M_i ∈ (d_−, d_+], then a previously vacant component is covered completely, which is caused by three disks centered at points in P meeting at a single point in the plane. The three center points of the disks must form a triangle with no obtuse angle. Otherwise, two of the disks would meet for the first time in the interior of the third, and hence no connected component in the background would be covered by the merging. This could be interpreted as a feature that is born and dies at the same time, but we chose to exclude such features in our definition of 1-features.

Henceforth, let B^±_d(x, y) ⊂ R^2 denote the two disks of radius d > 0 whose boundary passes through x, y ∈ R^2. If |x − y|/2 > d, we let B^±_d(x, y) be empty. The points in B^+_d(x, y) ∪ B^−_d(x, y) are exactly the points z such that the time when the boundaries of the three disks around x, y, and z meet in one point is at most d. For d_+ > d_− ≥ 0, we let D_{d_−,d_+}(x, y) = (B^+_{d_+}(x, y) ∪ B^−_{d_+}(x, y)) \ (B^+_{d_− ∨ a}(x, y) ∪ B^−_{d_− ∨ a}(x, y)), where a = |x − y|/2. This set consists of all points z such that the boundaries of the three disks around x, y and z meet at time r with d_− < r ≤ d_+. Some z ∈ D_{d_−,d_+}(x, y) may still form a triangle having an obtuse angle with x and y, that is, the disks around x, y, and z already met earlier in an interior point of one of the disks. However, all z that can cause the death of a hole in E together with x and y must be contained in D_{d_−,d_+}(x, y).
Now, we bound ξ_E in terms of the event E_x, where E_x denotes the event that for some P′ ⊂ P with x ∈ P′ the event E_{x,b}(P′) ∩ E_{x,d}(P′) occurs, encoding the birth and death configurations described above. Here, we say that y_1, y_2, y_3 ∈ P kill the hole H if the disks around the points meet for the first time at p(H). In particular, any three points can kill at most one hole. Similarly, for a block E′, we let E_{x,x′} denote the event that for some P′ ⊂ P with x, x′ ∈ P′ the corresponding event occurs.
Using this notation, the proof of the variance upper bound is based on the following pivotal geometric moment bound. In the following, E_x denotes expectation with respect to the unreduced Palm measure P_x, applied to any non-negative measurable f : N → [0, ∞). We recall from (3) that ρ^{(p)} denotes the pth factorial moment density. In the following, we adhere to the convention ∫_{B_0} f(x) dz = f(x).

Lemma 9.6 (Moment bound). Let P be a stationary point process having fast decay of correlations and satisfying condition (M). Let p ≥ 0 and K_0 > 0. Then, there exist ε > 0 and C_g > 0 such that the bounds (1)–(3) hold for all n > 0, any ball B ⊂ R^2 of radius K > K_0, and all neighboring blocks E, E′ ⊂ [0, r_f]^2.

The proof of Lemma 9.6 relies on a delicate geometric analysis that we defer to Section 9.4. We now prove Proposition 9.2. As in [8, Equation (1.6)], for x = (x_1, . . . , x_p) ∈ R^{2p} and k_1, . . . , k_p ≥ 0, we introduce the mixed ξ_E-moments m^{(k_1,...,k_p)}_n. In the rest of the manuscript, we freely use that exponential decay of correlations implies boundedness of the factorial moment densities [8, Inequality (1.11)].
Proof of Proposition 9.2. To lighten notation, we write ξ instead of ξ_E. To make the paper more pleasant to read, we have not attempted to optimize the exponents occurring in the course of this proof. Proceeding as in [8], we split the variance into two summands and derive bounds for them separately. By stationarity, (11) and Hölder's inequality, the first expression reduces to a Palm moment, so that Lemma 9.6(1) with p = 0 yields the asserted upper bound.
To deal with the double integral in (13), we recall that ξ is a local score function and that P exhibits exponential decay of correlations. Hence, as in [8, Equation (3.26)], the factorial moment measure expansion shows that |m^{(1,1)}_n(x, y) − m^{(1)}_n(x) m^{(1)}_n(y)| ≤ c φ(|x − y|/2) for some c > 0. In particular, choosing a cut-off K = |E|^{−1/128}, we see that the contribution from pairs at distance larger than K is controlled by a suitable C > 0, and it suffices to derive an upper bound for the remaining integral. For the second summand, we can argue similarly as in (14), so that it remains to bound the integral involving m^{(1,1)}_n(x, y). Here, we set z = y − x, note that ρ^{(2)}(x, y) = ρ^{(2)}(o, z) and combine (11) with Hölder's inequality; the resulting factor E_{o,z}[P(B_M(z))^{16}] is bounded thanks to condition (M). Finally, by Jensen's inequality applied to the uniform distribution on B_K(o), applying Lemma 9.6(1) with p = 1 shows that the right-hand side is of order at most (|E|^{3/4} |B_K(o)|)^{7/8} = |E|^{7·47/512}, thereby concluding the proof.

The proof of Proposition 9.3 involves two different score functions. However, this artificial difficulty can be overcome by formally attaching {1, 2}-valued marks to P_n. Taking up the notation from [19], we let Ř^2 = R^2 × {1, 2} and P̌_n = P_n × {1, 2} denote the correspondingly marked space and point process. Writing Ě = (E, E′), we define an augmented score function ξ_Ě, where points with mark 1 are evaluated with the first score function and points with mark 2 are evaluated with the second score function. We take the concise proof of Proposition 9.2 as a blueprint for the strategy in the more involved setting of Proposition 9.3. In particular, we need to address two main steps: bounds for mixed moments and a reduction of the integral to the diagonal.
In order to reduce to the diagonal, we decompose the cumulant measure into semi-cluster measures as in [3, Section 5.1] and [19, Section 3.2]. For the convenience of the reader, we reproduce the basic definitions. First, the kth moment measure M^k(µ̌_n) acts on functions f = f_1 ⊗ · · · ⊗ f_k that are non-negative and measurable with each f_i defined on Ř^2, where µ̌_n = µ̌_{Ě,n} = n^{−1} Σ_{x̌∈P̌_n} ξ_Ě(x̌, P̌_n) δ_{x̌} denotes the empirical measure associated with ξ_Ě and P̌_n. In terms of mixed ξ-moments, with x̌_{T_i} the projection of x̌ to the coordinates in T_i, the moment measure expands over partitions {T_1, . . . , T_p}, where the dx̌_{T_i} are the singular differentials determined in [19, Section 3.1]. As in (12), for k_1, . . . , k_p ≥ 0, the mixed ξ_Ě-moments are defined for every x̌ = ((x_1, τ_1), . . . , (x_k, τ_k)) ∈ Ř^{2k}. Similarly, the kth cumulant measure c^k_n = c^k(µ̌_n) satisfies ⟨f, c^k_n⟩ = c^k(⟨f_1, µ̌_n⟩, . . . , ⟨f_k, µ̌_n⟩), and it decomposes into products of moment measures M^{T_i} with coordinates in T_i. Next, the space W̌^4_n decomposes into a union of subsets σ(S, T) according to which coordinates are most distant from the diagonal [19, Lemma 3.1]. More precisely, we write D(x̌_S, x̌_T) for the maximal separation of x̌_S and x̌_T, where dist(x̌_S, x̌_T) = dist(x_S, x_T). Here, the marks are ignored for the diagonal ∆ ⊂ W̌^4_n. We also write W̌^{(1,2)}_n for the corresponding off-diagonal domain. Before proving Lemma 9.7, we elucidate how to deduce Proposition 9.3.
Proof of Proposition 9.3. First, integration over the cumulant measure decomposes into a diagonal and an off-diagonal part [19, Equation (3.28)], where the off-diagonal part carries the indicator function of the domain W̌^{(1,2)}_n and the sum is over all nontrivial partitions S, T. By Lemma 9.7, the off-diagonal contributions in this decomposition are bounded above by Σ_{S,T} C_{S,T} |E|^{1/2+ε_{S,T}} |E′|^{1/2+ε_{S,T}}.
Next, when integrating over the diagonal, we leverage that in the decomposition (17), only p = 1 contributes [19, Lemma 3.1]. Hence, applying Lemma 9.6(2) with p = 1 and noting the convention preceding that result concludes the proof.
To prove Lemma 9.7, we decompose the cumulant measures into semi-cluster measures [3, Lemma 5.1]. More precisely, as in [3, 19], any two disjoint non-empty subsets S′, T′ ⊂ {1, 2, 3, 4} induce a cluster measure, where the sum runs over all partitions whose elements are non-empty subsets of S′ and T′, respectively [19, Lemma 3.2]. Equipped with these ingredients, we now prove Lemma 9.7. Since the basic structure of the proof parallels that of Proposition 9.2, we only provide details for the steps that are substantially different.
Proof of Lemma 9.7.
For this purpose, we decompose the moment measures dM^{S′∪T′}, dM^{S′} and dM^{T′} according to (16). Hence, we need bounds for the absolute value of differences of mixed ξ-moments as in (19), where {S_1, . . . , S_p} and {T_1, . . . , T_r} are partitions of S′ and T′, respectively. Since we are working on the set σ(S, T), as in the proof of Proposition 9.2, the fast decay of ξ-correlations bounds (19) by c φ(D(x̌_{S′∪T′})/2) for a suitable c > 0. Next, as in [19, Section 3.1], the singular differentials occurring in the expansion (16) of the moment measure M^k can be grouped into a single object. More precisely, we write dx̌ for the measure that equals dx̌_{T_1} · · · dx̌_{T_p} on the subset of Ř^{2k} consisting of all x̌ = (x̌_1, . . . , x̌_k) such that x̌_i = x̌_j if i, j ∈ T_r for some r ≤ p and x̌_i ≠ x̌_j otherwise.
In the setting of the present proof, we note that the bounds on the mixed moments from (19) only involve coordinates with indices in the set S′ ∪ T′. Hence, we need to consider singular differentials only with respect to these coordinates, i.e., integrate with respect to dx̌_{S′∪T′}. In particular, we arrive at a corresponding bound. Now, setting K = |E|^{−ε/128} |E′|^{−ε/128}, the exponential decay assumption on the function φ gives control of one integral over the window, while the integrals with respect to the remaining variables are controlled by the volume of balls. Then, a repeated application of Hölder's inequality provides suitable bounds on the moment measures for some C > 0. Hence, it suffices to provide upper bounds for the remaining terms, where {T_1, . . . , T_p} is an arbitrary partition of {1, 2, 3, 4}. We explain how to proceed for p = 1, noting that for p > 1 the arguments are similar but easier. We claim a corresponding bound for some C > 0. To prove this claim, decompose M^{{1,2,3,4}} according to (16) and let {T_1, . . . , T_p} be an arbitrary partition of {1, 2, 3, 4}. As in the proof of Proposition 9.2, a repeated use of Hölder's inequality allows us to bound (15) by invoking Lemmas 9.6(2) and 9.6(3). As an illustration, consider the setting where p = 4 and i = 2. Then, we set z = x_2 − x_1, z_3 = x_3 − x_1 and z_4 = x_4 − x_1, and combine Jensen's inequality with Lemma 9.6(3). Hence, inserting the definition of K concludes the proof.

9.4. Proof of Lemma 9.6. We now turn to the proof of Lemma 9.6. The proof is based on the following four lemmas that are used to bound the probability with which certain point configurations occur. Throughout, we use the notation δ_b = b_+ − b_−, δ_d = d_+ − d_− and similarly δ_{b′}, δ_{d′} for the block E′. The proofs make use of elementary inequalities involving some constant C_0 > 0. Moreover, we repeatedly use that the area of an annulus A_{r_1,r_2}(x) equals π(r_2^2 − r_1^2).

Lemma 9.8. Let x, y ∈ R^2 and a = |x − y|/2. There is a constant C > 0 such that the following bound on |D_{d_−,d_+}(x, y)| holds for all 0 ≤ d_− < d_+ ≤ r_f.

Proof. The line through x and y cuts the disk B^+_d(x, y) into two parts.
The area of the larger part admits an explicit formula, and D_{d_−,d_+}(x, y) is the union of two such sets of radius d_+ from which we remove two sets of the same type with radius d_− ∨ a from the interior. This yields the formula for the area. The asserted inequality then follows using (22) and (23).

Lemma 9.9. Let 0 ≤ b_− < b_+ ≤ r_f and 0 ≤ d_− < d_+ ≤ r_f and let B_M be a disk of radius M. Then, there is a constant C > 0 such that the corresponding integral bound holds.

Proof. Integration with respect to y_3 yields the area |D_{d_−,d_+}(y_1, y_2)|. When δ_b ≤ δ_d, the claim follows directly from Lemma 9.8. Otherwise, letting a = |y_1 − y_2|/2, we split the integral into the two terms (24) and (25), according to whether a < d_− or a ≥ d_−. To bound (24), we apply the mean value theorem to the area formula of Lemma 9.8 and perform the integration. To bound (25), we bound the integrand using Lemma 9.8 and integrate. This proves the claim when δ_d ≤ δ_b.

Lemma 9.10. Let B_M be a disk of radius M > 0. There is a constant C > 0 such that the corresponding integral bound holds.

Proof. We may assume d_+ > 3δ_d ∨ 8 r_f (δ_b + δ_{b′}). Indeed, if d_+ ≤ 3δ_d, we can show the claim by first integrating with respect to y_1, then applying Lemma 9.9, and finally integrating with respect to x_2 and x_3 to provide a factor |B_M| δ_b δ_{b′}. If d_+ ≤ 8 r_f (δ_b + δ_{b′}), we first integrate with respect to x_1, which yields the area of A_{2b_−,2b_+}(x_2) ∩ A_{2b′_−,2b′_+}(x_3). This is bounded by C_2 (δ_b ∧ δ_{b′}), and by Lemma 9.9 the remaining integral is bounded appropriately. We write the integral as a sum of three terms corresponding to whether I: a < d_+/4, II: d_+/4 ≤ a ≤ b_− ∧ b′_−, or III: a > b_− ∧ b′_−.

Term I: We first integrate with respect to y_1. Since a < d_+/4, the mean value theorem applied to the formula in Lemma 9.8 implies that |D_{d_−,d_+}(x_2, x_3)| ≤ C_4 δ_d. We then integrate with respect to x_2 and x_3 to obtain the bound C_5 |B_M| δ_b δ_{b′} δ_d.
Term II: When d_+/4 ≤ a ≤ b_− ∧ b′_−, we first integrate with respect to x_1 to obtain the area of A_{2b_−,2b_+}(x_2) ∩ A_{2b′_−,2b′_+}(x_3). To bound term II, we need to compute this area explicitly. For this, we first compute the area A_a(b_1, b_2) of the intersection of the disks B_{2b_1}(x_2) and B_{2b_2}(x_3). By the assumption on d_+, condition (26) holds. This ensures that the line containing the two points where the boundaries of the disks B_{2b_1}(x_2) and B_{2b_2}(x_3) meet separates x_2 and x_3. The area of B_{2b_1}(x_2) ∩ B_{2b_2}(x_3) then admits an explicit expression. It is a straightforward computation to see that ∂²A_a(b_1, b_2)/∂b_1∂b_2 is uniformly bounded by C_6/d_+^2 on the set of a, b_1, b_2 ≤ r_f satisfying (26) and d_+/4 ≤ a ≤ b_1 ∧ b_2. In particular, (26) guarantees that arccos and x ↦ √(1 − x²) have bounded derivatives for the relevant values of x. It follows that (27) is bounded by C_7 δ_b δ_{b′}/d_+^2. The remaining integral is of order |B_M| d_+^2 δ_d by Lemma 9.9, which yields the appropriate bound.
Term III: In this case, we first integrate with respect to x_1, providing a factor δ_b ∧ δ_{b′}. The remaining integral is bounded using Lemma 9.9.
The fourth lemma allows us to analyze which point configurations can cause the birth and death of M-bounded features. To state it, we recall the α-complex associated with a locally finite point set X ⊆ R^2, see e.g. [18, Sec. III.4] for details. It is built from the Delaunay triangulation, which is a triangulation of the plane with vertex set X. For r > 0, α_r(X) is the union of all edges in the Delaunay triangulation with length at most 2r and all triangles such that the three balls of radius r centered at its vertices cover the triangle. Then α_r(X) ⊆ U_r(X) and the inclusion is a homotopy equivalence, i.e., it preserves the topology.

Proof. The analogous statements hold for unbounded loops by the homotopy equivalence between the α-complex and the union of balls. Claim (i) follows because any M-bounded loop is also an unbounded loop. An M-bounded feature is either born in the same way as the corresponding unbounded feature or when two balls meet to split off a component. In both cases, some unbounded loop is born by the merging, and hence an edge is added to the α-complex. This shows (ii). When an M-bounded loop dies, so does the corresponding unbounded loop, hence (iii) is clear.
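The filtration values implicit in this definition can be sketched numerically: an edge enters α_r(X) at half its length, and a triangle enters at the smallest r for which the three grown balls cover it. Below is our own illustration (not code from the paper); the covering condition is evaluated approximately on a barycentric grid, and the Delaunay triangulation itself is assumed to be given:

```python
import math

def edge_value(p, q):
    """A Delaunay edge pq enters alpha_r once the two balls touch: r = |pq| / 2."""
    return math.dist(p, q) / 2.0

def covering_radius(p, q, r, n=400):
    """Smallest radius at which the three balls centered at the vertices cover
    the triangle pqr, approximated on a barycentric grid (error ~ diameter/n).
    For a triangle with no obtuse angle this is the circumradius."""
    worst = 0.0
    for i in range(n + 1):
        for j in range(n + 1 - i):
            k = n - i - j
            z = ((i * p[0] + j * q[0] + k * r[0]) / n,
                 (i * p[1] + j * q[1] + k * r[1]) / n)
            worst = max(worst, min(math.dist(z, v) for v in (p, q, r)))
    return worst
```

Filtering the Delaunay edges and triangles by these values for increasing r produces the nested family of complexes α_r(X) described above.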
We are now ready to prove Lemma 9.6.
Proof of Lemma 9.6. Proof of (1). Stationarity and Equation (4) yield a Palm moment bound. In the following, we let y = (y_1, y_2, y_3) to lighten notation. By definition of E_x, (28) is bounded by an integral over the birth and death configurations. Here, we have used that g(x_1, x_2, y) is symmetric in x_1 and x_2 and in y_1, y_2, and y_3. Applying (4) again, we may bound the last term in (29) by summing over translations of W_1 covering B, whose number is at most C_1 |B| for some C_1 independent of K (for instance using that B_K ⊆ W_{4K^2}). Then, the moment condition (M) provides a uniform bound for the Palm means. We apply this in (30) together with Lemma 9.9. Since each ρ^{(k)} is bounded according to the assumption of fast decay of correlations, we obtain the bound C_4 |B|^{p+1} |E|^{1/2+ε}.

Proof of (2). Note that since the blocks E and E′ are neighboring, the features in E and E′ are different. Putting x = (x_1, x_2, x_3), we now expand as in (28). The condition x_2 ≠ x_3 comes from the fact that x_1 can give birth to at most one feature when connecting to another point, and since E and E′ are neighboring, x_2 and x_3 correspond to different features. Similarly, y ≠ y′ comes from the fact that a triangle can kill at most one feature.
The event A excludes certain point configurations that are not possible. If the triangles formed by y and y′ share an edge, and the vertices of this edge coincide with x_2 and x_3, then |x_2 − x_3| > 2(b_+ ∨ b′_+) is not allowed. Indeed, it follows from Lemma 9.11 that the triangles correspond to the same feature in the α-complex until x_2 and x_3 are joined. Thus, this must happen before both triangles are born, that is, at the latest at time b_+ ∨ b′_+. Moreover, if the two triangles share an edge, then the two points in y, y′ not lying on this edge cannot be equal to x_1 and x_2 or to x_1 and x_3, as this would lead to crossing edges in the α-complex by Lemma 9.11 (since the triangles formed by y, y′ cannot have any obtuse angles).
We now write the sum in (31) as a sum where each term is a sum over P^k_≠, 4 ≤ k ≤ 9, as in (29). Each such term comes from grouping x, y, y′ into sets of equal points. Consider for illustration the term corresponding to the situation x_2 = y_1, x_1 = y_2 = y′_2, x_3 = y_3 = y′_3. The sum is handled as in the proof of Lemma 9.6(1) by applying (4) and bounding the involved Palm means. For this special point configuration, it is sufficient to bound 1_A by 1, leaving an integral involving P(B^+_{x_1})^p and the functions g and g′. Now, we apply the Hölder inequality with 1/q_1 + 1/q_2 = 1 to obtain a product of two factors. In the first integral, we first integrate with respect to x_2 and then apply Lemma 9.9, while in the second integral we first integrate with respect to y_1 and use the bound in Lemma 9.8, and then apply Lemma 9.9 again. Next, we use that E and E′ are neighboring blocks, so that either δ_b = δ_{b′} or δ_d = δ_{d′}. When δ_b = δ_{b′}, we obtain a bound for which we take 1/q_1 > 1/4 and 1/q_2 > 2/3. When δ_d = δ_{d′}, we use Lemma 9.9 to obtain a bound for which we take 1/q_1 > 1/2 and 1/q_2 > 1/3. For a general term, note that there are at least four different points among y, y′, so one of them, say y_1, cannot be equal to any of x. We consider two cases: I: y_1 is not among y′_1, y′_2, y′_3. II: y_1 = y′_1, y_2 = y′_2, and y_3 = x_2 and y′_3 = x_3. Since we no longer keep track of which edge kills which triangle, all possible point configurations allowed by A fall into one of the above cases after possibly renaming the variables.
In particular, if y_1 = y′_1 and the points y_2, y_3, y′_2, y′_3 are all different, one of them cannot be any of x_1, x_2, x_3, and we could have taken this as y_1 and be in Case I. If y_1 = y′_1, y_2 = y′_2 and, say, y_3 is not any of x_1, x_2, x_3, we could have chosen y_3 as y_1 and be in Case I.
We further divide the Case I configurations allowed by A into two sub-cases that have to be treated separately, according to whether Ia: x_3 is not any of y′_2, y′_3, or Ib: it is.
Case Ia: We apply the Hölder inequality to the relevant integral, (35), obtaining a product of two factors. The first factor is integrated with respect to x_3 and the remaining integral is bounded using Lemma 9.9. The second factor is first integrated with respect to y_1, the result is bounded using Lemma 9.8, and the remaining integral is bounded using Lemma 9.9. The rest of the argument proceeds as in the special case treated above.
Case Ib: The claim follows by applying the Hölder inequality to (35) and arguing as in Case Ia, using Lemma 9.10 to bound the first integral.
Case II: We apply the Hölder inequality exactly as in (35) and argue as in Case Ia, except that the second integral is first integrated with respect to y_3 rather than y_1.
Proof of (3). The set Ã consists of tuples of points (x_1, x_2, x_3, x_4, y, y′) ∈ R^{20} and, similarly to A, it excludes certain configurations of the points (x_1, x_2, x_3, x_4, y, y′) that are not allowed by Lemma 9.11. If the triangles formed by y and y′ share an edge, then the length of this edge must be at most 2(b_+ ∨ b′_+). Moreover, if the two triangles share an edge, then the two points in y, y′ not lying on this edge cannot be equal to x_1 and x_3 or to x_2 and x_4. The contribution from the cases where two of the four birth points are identical is bounded by a sum over x, y, y′ ∈ P^3_≠ ∩ B^3_{2M+2} with y ≠ y′ of P(B^+_{x_1})^p g(x_1, x_2, y) g′(x_1, x_3, y′) 1_Ã(x_1, x_2, x_1, x_3, y, y′), which is handled exactly as in the proof of Lemma 9.6(2). Thus, it remains to treat the terms where the four birth points are all different. Therefore, if we put x = (x_1, x_2, x_3, x_4), we must bound the corresponding sum involving g(x_1, x_3, y) g′(x_2, x_4, y′) 1_Ã(x, y, y′).
The rest of the proof proceeds as the proof of Lemma 9.6(2) by suitable applications of the Hölder inequality. We divide into two cases according to whether all points in y, y′ are among x or not. After renaming the variables, we may assume I: y_1 = y′_1 = x_1, y_2 = y′_2 = x_2, y_3 = x_3, and y′_3 = x_4, or II: y_1 is not any of x. Notice that in Case I we exclude the configuration y_1 = y′_1 = x_1, y_2 = y′_2 = x_3, y_3 = x_2, and y′_3 = x_4, because it was excluded by the definition of Ã. After renaming variables, Case II is divided into IIa: y_1 is not any of x or y′, and x_1 is not any of y′_2, y′_3. IIb: y_1 = y′_1 and y_1 is not any of x, y_2 = y′_2 = x_3, y_3 = x_1. IIc: y_1 = y′_1, y_2 = x_2, y_3 = x_4, y′_2 = x_1, y′_3 = x_3. IId: y_1 = y′_1, y_2 = x_1, y_3 = x_2, y′_2 = x_3, y′_3 = x_4. In Case IIa, y_1 is not one of y′, while in Cases IIb, IIc, and IId it is. Case IIb corresponds to the situation in which the triangles formed by y, y′ share an edge, while in Cases IIc and IId they share only one vertex. In Case IIc, each triangle contains one of the edges joining x_1 to x_3 and x_2 to x_4, while in Case IId they do not.