Lower large deviations for geometric functionals

This work develops a methodology for analyzing large-deviation lower tails associated with geometric functionals computed on a homogeneous Poisson point process. The technique applies to characteristics expressed in terms of stabilizing score functions exhibiting suitable monotonicity properties. We apply our results to clique counts in the random geometric graph, intrinsic volumes of Poisson-Voronoi cells, as well as power-weighted edge lengths in the random geometric, $k$-nearest neighbor and relative neighborhood graph.


Introduction and main results
In the theory of random graphs, there is a subtle difference between how well upper and lower tails are understood in the large-deviation regime. For instance, for the triangle count in the Erdős-Rényi graph, the probability of observing atypically few triangles is described accurately by very general Poisson-approximation results [Jan90,JW16]. On the other hand, the probability of observing atypically many triangles requires a substantially more specialized and refined analysis [Cha12].
This raises the question of whether a similar dichotomy also arises in the large-deviation analysis of functionals that are of a geometric rather than a combinatorial nature. For instance, Figure 1.1 shows a typical realization of the random geometric graph in comparison to a realization with an atypically small number of edges. In geometric probability, elaborate results are available for large and moderate deviations of geometric functionals exhibiting a similar behavior in the upper and the lower tails [SY01,SY05,ERS15]. However, they notably do not cover the edge count in the random geometric graph, whose upper tails have been understood only recently [CH14].
In the present work, we provide three general results, Theorems 1.1, 1.2 and 1.3, tailored to studying large-deviation lower tails of geometric functionals. For the proofs, we resort to a method inspired by the idea of sprinkling [ACC+83]. We perform small changes in those parts of the domain where the underlying point process exhibits highly pathological configurations. After this procedure, we can compare the resulting functionals to approximations that are then amenable to the point-process based large-deviation theory from [GZ93] or [SY01,SY05].
Among the examples covered by our method are clique counts in the random geometric graph, intrinsic volumes of Poisson-Voronoi cells and power-weighted edge lengths in the random geometric, $k$-nearest neighbor and relative neighborhood graphs.
In the rest of this section, we set up the notation and state the main results. Then, Section 2 illustrates those results through the examples. Finally, Section 3 contains the proofs.
We study functionals on a homogeneous Poisson point process $X = \{X_i\}_{i \ge 1} \subset \mathbb{R}^d$ with intensity 1, whose distribution on the space $\mathbf{N}$ of locally finite configurations will be denoted by $\mathbb{P}$. Following the framework of [SY01], these functionals are realized as averages of scores associated to the points of $X$. More precisely, a score function is any bounded measurable function $\xi \colon \mathbb{R}^d \times \mathbf{N} \to [0, \infty)$. To simplify notation, we shift the coordinate system to the considered point and write $\xi(X - X_i) = \xi(X_i, X)$. In this notation, $\varphi \mapsto \xi(\varphi)$ acts on configurations $\varphi \in \mathbf{N}_o$, the family of locally finite point configurations with a distinguished node at the origin $o \in \mathbb{R}^d$.
We then consider lower tails of functionals of the form
$$H_n = H_n^\xi(X) = \frac{1}{n^d} \sum_{X_i \in X \cap Q_n} \xi(X - X_i), \tag{1.1}$$
i.e., averages of the score function over all points in the box $Q_n = [-n/2, n/2]^d$ of side length $n \ge 1$ centered at the origin. In a first step, we derive upper bounds for the lower tail probabilities. To that end, we work with approximating score functions $\xi^r$ that are $r$-dependent for some $r > 0$. That is, $\xi^r(\varphi) = \xi^r(\varphi \cap B_r)$ for every $\varphi \in \mathbf{N}_o$, where $B_r$ denotes the Euclidean ball of radius $r$ centered at the origin.
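To make the averaged functional concrete, the following minimal sketch simulates a unit-intensity Poisson point process in the box and evaluates the average for one particular $r$-dependent score, namely the number of other points within distance $r$. The choice of score and all function names are illustrative assumptions and do not come from the text. Points outside the box are not simulated, so scores near the boundary are slightly underestimated.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson(mean) variate by counting unit-rate exponential arrivals."""
    count, total = 0, rng.expovariate(1.0)
    while total <= mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def sample_points(n, d, rng):
    """Unit-intensity Poisson points in the box Q_n = [-n/2, n/2]^d."""
    return [tuple(rng.uniform(-n / 2, n / 2) for _ in range(d))
            for _ in range(sample_poisson(float(n) ** d, rng))]


def degree_score(center, points, r):
    """Assumed example of an r-dependent score: other points within distance r."""
    return sum(1 for p in points if p != center and math.dist(p, center) <= r)


def functional_H(points, n, d, r):
    """n^{-d} times the sum of the scores of all points in the box."""
    return sum(degree_score(p, points, r) for p in points) / float(n) ** d


rng = random.Random(0)
points = sample_points(n=10, d=2, rng=rng)
print(functional_H(points, n=10, d=2, r=1.0))
```

The lower tail concerns the event that this average falls below a threshold $a$ that is smaller than its typical value.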
To state the main results, we resort to the entropy-based formulation of the large-deviation rate function. We write
$$h(\mathbb{Q}) = \lim_{n \uparrow \infty} \frac{1}{n^d} \int \mathrm{d}\mathbb{Q}_n \log \frac{\mathrm{d}\mathbb{Q}_n}{\mathrm{d}\mathbb{P}_n}$$
for the specific relative entropy of a stationary point process $\mathbb{Q}$, where $\mathbb{Q}_n$ and $\mathbb{P}_n$ denote the restrictions of $\mathbb{Q}$ and $\mathbb{P}$ to the box $Q_n$, respectively. If $\mathbb{Q}_n$ is not absolutely continuous with respect to the restricted Poisson point process, we adhere to the convention that the above integral is infinite. Further, $\mathbb{Q}^o[\xi]$ is the expectation of $\xi$ with respect to the Palm version $\mathbb{Q}^o$ of $\mathbb{Q}$, see [GZ93] for details. Here is our first main theorem.
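As a sanity check on the specific relative entropy, consider the special case in which $\mathbb{Q}$ is itself a homogeneous Poisson point process with intensity $\lambda$. A direct computation with the Poisson density then gives $h(\mathbb{Q}) = 1 - \lambda + \lambda \log \lambda$. The sketch below evaluates this expression. The function name is hypothetical, and the example is a standard illustration that is not taken from the text.

```python
import math


def poisson_specific_entropy(lam):
    """Specific relative entropy of a homogeneous Poisson process with
    intensity lam relative to the unit-intensity Poisson process."""
    if lam < 0:
        raise ValueError("intensity must be non-negative")
    if lam == 0:
        return 1.0  # limit of 1 - lam + lam * log(lam) as lam -> 0
    return 1.0 - lam + lam * math.log(lam)


for lam in (0.25, 0.5, 1.0, 2.0):
    print(lam, poisson_specific_entropy(lam))
```

The value vanishes only at $\lambda = 1$, which reflects that any change of intensity carries an exponential cost on the volume scale $n^d$.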
Theorem 1.1 (Upper bound). Let $a > 0$ and assume the score function $\xi$ to be the pointwise increasing limit of a family $\{\xi^r\}_{r \ge 1}$ of $r$-dependent score functions. Then,
(1.2)
For the lower bound, we give two sets of conditions. The first deals with score functions $\xi$ that are increasing in the sense that $\xi(\varphi) \le \xi(\psi)$ for every $\varphi \subset \psi$. This applies for instance to clique counts and power-weighted edge lengths in the random geometric graph.
Theorem 1.2 (Lower bound for bounded-range scores). Let $a > 0$ and assume the score function $\xi$ to be increasing and $r$-dependent for some $r > 0$. Moreover, assume that for every
(1.3)
However, many score functions are neither $r$-dependent nor increasing, or not even monotone. A prime example is the sum of power-weighted edge lengths in the $k$-nearest neighbor graph, see Section 2. Still, this example and many other score functions are stabilizing, $R$-bounded and weakly decreasing in the following sense.
First, a score function $\xi$ is stabilizing if there exists a $\mathbb{P}^o$-almost surely finite measurable stabilization radius $R \colon \mathbf{N}_o \to [0, \infty]$, such that $\{R(X) \le r\}$ is measurable with respect to $X \cap B_r$ for every $r \ge 0$ and
In words, $\xi(X)$ does not depend on the configuration outside the ball $B_{R(X)}$. We call $R$ decreasing if
Second, $\xi$ is $R$-bounded if for every $\delta > 0$ and sufficiently large $M = M(\delta) \ge 1$,
Loosely speaking, the score function is negligible compared to the $d$th power of the stabilization radius.
Third, $\xi$ is weakly decreasing if
holds for some $k \ge 1$. In words, for all but at most $k$ points of a configuration, adding a new point to the configuration decreases the score function value of the point.
Finally, we need to ensure that sprinkling a sparse configuration of Poisson points yields control on the stabilization radii of the points in a box. More precisely, we assume that the stabilization radius is regular in the following sense. Let $X^{+,M}$ denote a Poisson point process with intensity $M^{-d}$ that is independent of $X$. Then, we assume that there exists $K_0 > 0$ with the following property. For every $\delta > 0$ there exist $M_0 = M_0(\delta) \ge 1$ and $n_0 = n_0(\delta) \ge 1$ such that for all $M \ge M_0$ and $n \ge n_0$,
holds almost surely. Here, for $\varphi \in \mathbf{N}$ and any measurable subset $A \subset \mathbb{R}^d$, we write $\varphi(A) = \#\{x \in \varphi : x \in A\}$ for the number of points of $\varphi$ contained in $A$, and $E_n^{M,+}$ denotes the event that after the sprinkling, the stabilization radii of all points in $Q_n$ are at most $M$. Here is the corresponding main result.
Theorem 1.3 (Lower bound for stabilizing scores). Let a > 0 and ξ be a weakly-decreasing R-bounded score function with a decreasing and regular radius of stabilization. Then, (1.3) remains true.

Examples
In this section, we discuss how to apply the results announced in Section 1 to a variety of examples arising in geometric probability. More precisely, Sections 2.1, 2.2 and 2.3 are devoted to characteristics of the random geometric graph, of the Voronoi tessellation, and of $k$-nearest neighbor and relative neighborhood graphs, respectively.
2.1. Clique counts and power-weighted edge lengths in random geometric graphs. As a first simple application of our results, consider the set of $k$-cliques associated to the origin in the geometric graph on $\varphi \in \mathbf{N}_o$ with connectivity radius $t > 0$. Then, for $k \ge 2$ and $\alpha \ge 0$, the score functions
count the number of $k$-cliques containing the origin and the power-weighted edge lengths at the origin, respectively. Note that $\xi_k$ and $\xi_\alpha$ are $t$-dependent and increasing. Additionally, if
Further examples arise in the context of topological data analysis. More precisely, the number of $k$-cliques containing the origin is precisely the number of $k$-simplices of the Vietoris-Rips complex containing the origin. Similar arguments also apply to the Čech complex, the second central simplicial complex in topological data analysis. We refer the reader to [BCY18, Section 2.5] for precise definitions and further properties.
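The two scores can be sketched in a few lines for a finite configuration. The code below assumes that a $k$-clique consists of $k$ points with all pairwise distances at most $t$, and that the edge-length score is the plain sum of $|x|^\alpha$ over the neighbors $x$ of the origin; whether the original definition carries an additional normalization (such as a factor $1/2$) is not recoverable here, so this form is an assumption. All function names are hypothetical.

```python
import itertools
import math


def neighbors_of_origin(points, t):
    """Points of the configuration (origin excluded) within distance t of the origin."""
    return [p for p in points if 0 < math.hypot(*p) <= t]


def clique_score(points, k, t):
    """Number of k-cliques (k vertices, pairwise distances <= t) containing the origin."""
    nbrs = neighbors_of_origin(points, t)
    return sum(
        1
        for others in itertools.combinations(nbrs, k - 1)
        if all(math.dist(p, q) <= t for p, q in itertools.combinations(others, 2))
    )


def edge_length_score(points, alpha, t):
    """Sum of |x|^alpha over the edges joining the origin to its neighbors (assumed form)."""
    return sum(math.hypot(*p) ** alpha for p in neighbors_of_origin(points, t))


config = [(0.5, 0.0), (0.0, 0.5), (2.0, 2.0)]
print(clique_score(config, k=3, t=1.0), edge_length_score(config, alpha=1.0, t=1.0))
```

Both functions only inspect points within distance $t$ of the origin, and adding a point can only create additional cliques or edges, which matches the $t$-dependence and monotonicity noted above.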

2.2. Intrinsic volumes of Voronoi cells.
Recall the definition of the Voronoi cell at the origin of a locally finite configuration $\varphi \in \mathbf{N}_o$, i.e.,
$$C_o(\varphi) = \{x \in \mathbb{R}^d : |x| \le \inf_{y \in \varphi} |x - y|\}.$$
The intrinsic volumes $v_0, \dots, v_d$ are key characteristics of a convex set, e.g., $v_1$, $v_{d-1}$ and $v_d$ are proportional to the mean width, the surface area and the volume, respectively. We refer the reader to [SW08, Section 14.2] for a precise definition and further properties. In particular, considering $v_1$ in dimension $d = 2$, the associated characteristic $n^d H_n$ becomes the total edge length of the Voronoi graph, so that we obtain a link to the setting studied in [SY05, Section 2.4.1]. Due to the intricate geometry, deriving a full large deviation principle even for a strictly concave function of the edge length was only achieved for a Poisson point process that is restricted to a lattice instead of living in the entire Euclidean space. This example illustrates that even in situations where understanding the large-deviation upper tails requires a delicate geometric analysis, the lower tails may be more accessible.
More precisely, consider the score functions $\xi_k(\varphi) = v_k(C_o(\varphi))$ and note that $\xi_k^r(\varphi) = v_k(C_o(\varphi) \cap B_r)$ is a $4r$-dependent, pointwise increasing approximation of $\xi_k(\varphi)$. Hence, the upper bound of Theorem 1.1 applies.
For the lower bound, the conditions of Theorem 1.3 can be satisfied using the following definitions. The radius of stabilization is described in [Pen07, Section 6.3]: Take any collection $\{S_i\}_{i \in I}$ of cones with apex at the origin and angular radius $\pi/12$ whose union covers $\mathbb{R}^d$, where $I = I(d) \in \mathbb{N}$. Let $S_i^+$ denote the cone that has the same apex and symmetry hyperplane as $S_i$ and has the larger angular radius $\pi/6$. Then, we define the stabilization radius
$$R(\varphi) = 2 \max_{i \in I} \min_{x \in \varphi \cap S_i^+} |x| \tag{2.1}$$
as twice the radius at which the origin has a neighbor in every extended cone. In particular, both $R$ and $\xi_k$ are decreasing. Since $C_o(\varphi) \subset B_{R(\varphi)}$, the monotonicity and homogeneity of the intrinsic volumes imply that
$$\xi_k(\varphi) \le v_k(B_{R(\varphi)}) = v_k(B_1) R(\varphi)^k.$$
In particular, $\xi_k$ is $R$-bounded for $k < d$.
To verify the regularity, let $A_n^M$ denote the event that $X^{+,M}$ has precisely one point in each sub-box from an $M/L$-partition of the box $Q_{2n}$, for some $L \ge 1$. It follows from the definition of $R$ that the event $E_n^{M,+}$ occurs whenever $A_n^M$ occurs, provided that $L$ is chosen sufficiently large. Moreover, setting $K_0 = (2L)^d$, we deduce that $X^{+,M}(Q_n) \le K_0 (n/M)^d$ under $A_n^M$. Hence, it remains to establish the asserted lower bound on the probability $\mathbb{P}(A_n^M)$. Fixing $\delta > 0$ and invoking the independence property of the Poisson point process yields that
is sufficiently large. Summarizing the above findings, we deduce that Theorem 1.3 can be applied to get the lower bound on the rate function.
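The probability of the one-point-per-sub-box event can be worked out from the definitions alone, and the following sketch does so. It assumes that $2nL/M$ is an integer, so that the box $Q_{2n}$ splits exactly into $(2nL/M)^d$ sub-boxes of side length $M/L$. Each sub-box then receives a Poisson number of sprinkled points with mean $M^{-d}(M/L)^d = L^{-d}$, and the sub-boxes are independent. The function names are hypothetical.

```python
import math


def log_prob_one_point_per_subbox(n, M, L, d):
    """log-probability that a Poisson process of intensity M^{-d} puts exactly one
    point in each of the (2nL/M)^d sub-boxes of side M/L partitioning Q_{2n}."""
    mean_per_box = float(L) ** (-d)  # intensity times volume: M^{-d} * (M/L)^d
    log_p_single = math.log(mean_per_box) - mean_per_box  # log(mu * exp(-mu))
    num_boxes = (2.0 * n * L / M) ** d
    return num_boxes * log_p_single


def rate_per_volume(M, L, d):
    """The log-probability divided by n^d, which does not depend on n."""
    return log_prob_one_point_per_subbox(1.0, M, L, d)


for M in (10, 100, 1000):
    print(M, rate_per_volume(M, L=2, d=2))
```

For fixed $L$, the cost per unit volume equals $-(2L/M)^d (d \log L + L^{-d})$ and therefore tends to zero as $M$ grows, so it eventually drops below any prescribed $\delta > 0$.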
2.3. Power-weighted edge counts in k-nearest neighbor graphs and relative neighborhood graphs. Finally, we elucidate how to apply Theorem 1.3 to the power-weighted edge count of two central graphs in computational geometry, namely the k-nearest neighbor graph and the relative neighborhood graph. As we shall see, in contrast to the Voronoi example presented in Section 2.2, we encounter here score functions that are weakly decreasing but not decreasing. A full large deviation principle for the total edge length of the k-nearest neighbor graph is described in [SY05, Section 2.3], and we believe that the proof should extend to power-weighted edge lengths with a power strictly less than d. Nevertheless, we apply here our approach towards the large-deviation lower tails as it can be directly adapted to the bidirectional k-nearest neighbor graph, the relative neighborhood graph and possibly further graphs.
In the undirected $k$-nearest neighbor graph, $\xi$ sums the powers of the distances between the origin and every point such that at least one of the two belongs to the set of $k$ nearest neighbors of the other one. To be more precise, let $R_k(\varphi)$ denote the $k$-nearest neighbor radius of $o$ in $\varphi \in \mathbf{N}_o$, i.e., the distance from $o$ to its $k$th closest point in $\varphi$. Then, for some $\alpha \ge 0$, the score function $\xi_{k,\alpha}$ corresponding to the sum of power-weighted edge lengths of the $k$-nearest neighbor graph is defined via
In particular, we recover the number of edges by setting $\alpha = 0$. As noted in [Pen07, Section 6.3], to construct a radius of stabilization we can proceed as in (2.1) except for replacing $\min_{x \in \varphi \cap S_i^+} |x|$ by the distance of the $k$th closest point from the origin in $\varphi \cap S_i^+$. Hence, $\xi_{k,\alpha}$ becomes stabilizing with a decreasing stabilization radius. In the same vein, a minor adaptation of the arguments in Section 2.2 yields the regularity and $R$-boundedness for $\alpha < d$.
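The cone-based radius of stabilization is easy to compute in the plane, and the sketch below covers both the case $k = 1$ from Section 2.2 and the $k$th-closest-point variant just described. It assumes $d = 2$ and the specific choice of twelve cones of angular radius $\pi/12$ with equally spaced axes, which is one admissible covering among many. The function name and its parameters are hypothetical.

```python
import math


def cone_stabilization_radius(points, k=1, num_cones=12):
    """Twice the largest, over all extended cones, distance from the origin to
    its k-th closest point in that cone (planar case, origin excluded)."""
    half_width = math.pi / num_cones   # angular radius pi/12 of each cone
    ext_half_width = 2 * half_width    # angular radius pi/6 of each extended cone
    radius = 0.0
    for i in range(num_cones):
        axis = (2 * i + 1) * half_width  # direction of the cone's symmetry axis
        dists = []
        for (x, y) in points:
            if x == 0 and y == 0:
                continue
            # Angular distance between the point and the axis, in [0, pi].
            gap = abs((math.atan2(y, x) - axis + math.pi) % (2 * math.pi) - math.pi)
            if gap <= ext_half_width:
                dists.append(math.hypot(x, y))
        if len(dists) < k:
            return math.inf  # some extended cone holds fewer than k points
        radius = max(radius, sorted(dists)[k - 1])
    return 2 * radius


ring = [(math.cos((2 * i + 1) * math.pi / 12), math.sin((2 * i + 1) * math.pi / 12))
        for i in range(12)]
print(cone_stabilization_radius(ring))
```

Adding a point can only enlarge the set of points in each extended cone, so the $k$th smallest distance in each cone cannot grow. This is the sense in which the radius is decreasing.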
In order to apply Theorem 1.3 for the lower bound, it remains to verify the following.
Lemma 2.1. The score function $\xi_{k,\alpha}$ is weakly decreasing.
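Proof. Let us call $\varphi \in \mathbf{N}$ nonequidistant if for all $y, z, v, w \in \varphi$, $|y - z| = |v - w| > 0$ implies $\{y, z\} = \{v, w\}$. First note that for any $x \in \mathbb{R}^d$, under $\mathbb{P}$, almost all configurations $\varphi \cup \{x\}$ are nonequidistant. We claim that for any nonequidistant configuration $\varphi \cup \{x\}$, we have for all but at most $k$ points $y \in \varphi$ that
$$\xi(\varphi \cup \{x\} - y) \le \xi(\varphi - y). \tag{2.4}$$
Indeed, for $y \in \varphi$, let us define the set of $k$ nearest neighbors of $y$ in $\varphi$ as follows
$$\mathrm{kNN}(\varphi, y) = \{z \in \varphi \setminus \{y\} : |z - y| \le R_k(\varphi - y)\}.$$
We claim that (2.4) holds for every $y \in \varphi$ with $y \notin \mathrm{kNN}(\varphi \cup \{x\}, x)$. Since $\mathrm{kNN}(\varphi \cup \{x\}, x)$ contains at most $k$ points, this yields the assertion. Indeed, if $y \notin \mathrm{kNN}(\varphi \cup \{x\}, x)$, then there are two possibilities. If $x \in \mathrm{kNN}(\varphi \cup \{x\}, y)$, then $x$ replaces precisely one neighbor $z$ of $y$ and is closer to $y$ than $z$. More precisely, note that $|x - y| \le R_k(\varphi \cup \{x\} - y) \le R_k(\varphi - y)$. Hence, there exists $z \in \mathrm{kNN}(\varphi, y)$ such that $|z - y| = R_k(\varphi - y)$ and $z \notin \mathrm{kNN}(\varphi \cup \{x\}, y)$, the neighbor of $y$ that is replaced by $x$. Additionally, every $w \in \mathrm{kNN}(\varphi, y) \setminus \{z\}$ also satisfies $w \in \mathrm{kNN}(\varphi \cup \{x\}, y)$. Further, for any $v \in \varphi$ such that $y \in \mathrm{kNN}(\varphi \cup \{x\}, v)$ we also have $y \in \mathrm{kNN}(\varphi, v)$. Hence,
which is (2.4). The other possibility is that $x \notin \mathrm{kNN}(\varphi \cup \{x\}, y)$. Then the addition of $x$ can only remove edges that were present due to the fact that some other point had $y$ as a neighbor. In this case, $\xi(\varphi \cup \{x\} - y) = \xi(\varphi - y)$ unless there exists $z \in \varphi$ such that $y \in \mathrm{kNN}(\varphi, z)$ but $y \notin \mathrm{kNN}(\varphi \cup \{x\}, z)$, which must be due to the property that $x \in \mathrm{kNN}(\varphi \cup \{x\}, z)$. So again, the addition of $x$ can only remove such an edge and hence again (2.4) holds for $y$.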
Note that the approach presented above also applies to further graphs studied in computational geometry. The most immediate adaptation concerns the bidirectional $k$-nearest neighbor graph, see [BB13], where in the definition of the score function, we replace
Not only can we take the same radius of stabilization, but also Lemma 2.1 remains valid.
As a third example, we showcase the relative neighborhood graph. Here, for $\alpha \ge 0$ and $\varphi \in \mathbf{N}_o$ the score function is given by
The relative neighborhood graph is a sub-graph of the Delaunay tessellation, and in fact we can reuse the radius of stabilization from Section 2.2. Finally, proving the analog of Lemma 2.1 reduces to the observation that the degree of every node in the relative neighborhood graph is bounded by a constant $K = K(d)$, see [JT92, Section IV]. What remains to be verified is that $\xi_{\mathrm{RN}}$ is weakly decreasing.

Proofs
In this section we provide the proofs of the main theorems.
3.1. Proof of Theorem 1.1. Replacing $\xi^r$ by $\xi^r \wedge r$ if necessary, we may assume that $\xi^r$ is bounded above by $r$. Then, $\xi^r$ is a bounded local observable, so that by the contraction principle [
Let $\mathbb{Q}^*$ be a subsequential limit of $\{\mathbb{Q}^k\}_{k \ge 1}$. To simplify the presentation, we may assume $\mathbb{Q}^*$ to be the limit of $\{\mathbb{Q}^k\}_{k \ge 1}$. Then, by monotone convergence,
Since the specific relative entropy $h$ is lower semicontinuous, we arrive at
as asserted.
3.2. Proof of Theorem 1.2. To prove Theorem 1.2, we consider the truncation $\xi^M = \xi \wedge M$ of the original increasing and $r$-dependent score function $\xi$ at a large threshold $M > 1$ and write $H_n^M = H_n^{\xi^M}$. In comparison to the arguments in Section 3.1, the proof of the lower bound is more involved, since we can no longer replace $\mathbb{P}(H_n \le a)$ by $\mathbb{P}(H_n^M \le a)$. Instead, we rely on a sprinkling approach. For this method to work, we need that the total number of points in pathological areas is small with high probability. More precisely, we say that a point $X_i \in X$ is $b$-dense if $X(Q_r(X_i)) > b$ and write
$$N_{b,n} = \#\{X_i \in X \cap Q_n : X_i \text{ is } b\text{-dense}\}$$
for the total number of $b$-dense points in $Q_n$. Then, $b$-dense points are indeed rare.
In the second step, we remove all $b$-dense points through the coupling. That is, we let $X^{-,\varepsilon}$ be an independent thinning of $X$ with survival probability $1 - \varepsilon$. Furthermore, we let $X^{+,\varepsilon}$ be an independent Poisson point process with intensity $\varepsilon > 0$. Then, the coupled process $X^\varepsilon = X^{-,\varepsilon} \cup X^{+,\varepsilon}$ is again a Poisson point process with intensity 1. Now, let $E_{b,n} = \{X^{+,\varepsilon} \cap Q_n = \emptyset\} \cap \{X^{-,\varepsilon} \cap Q_n \text{ has no } b\text{-dense points}\}$ be the event that $X^{+,\varepsilon}$ has no points in $Q_n$ and that $X^{-,\varepsilon}$ does not contain any $b$-dense points in $Q_n$.
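The ingredients of the coupling can be sketched for a finite configuration as follows. The code counts $b$-dense points, thins a configuration with survival probability $1 - \varepsilon$, and sprinkles an independent Poisson configuration of intensity $\varepsilon$ in the box. It assumes that $Q_r(x)$ denotes the box of side length $r$ centered at $x$ and that the point $x$ itself is included in the count; the function names are hypothetical.

```python
import random


def count_dense_points(points, n, r, b):
    """Number of points in Q_n whose box of side r centered at the point
    contains more than b points of the configuration (the point included)."""
    def in_box(p, center, side):
        return all(abs(pc - cc) <= side / 2 for pc, cc in zip(p, center))

    origin = (0.0,) * len(points[0]) if points else ()
    return sum(
        1
        for x in points
        if in_box(x, origin, n) and sum(in_box(p, x, r) for p in points) > b
    )


def thin(points, eps, rng):
    """Independent thinning: each point survives with probability 1 - eps."""
    return [p for p in points if rng.random() >= eps]


def sprinkle(n, d, eps, rng):
    """Independent Poisson points of intensity eps in the box Q_n."""
    count, total = 0, rng.expovariate(1.0)
    while total <= eps * float(n) ** d:
        count += 1
        total += rng.expovariate(1.0)
    return [tuple(rng.uniform(-n / 2, n / 2) for _ in range(d)) for _ in range(count)]


rng = random.Random(1)
config = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (3.0, 3.0)]
coupled = thin(config, eps=0.1, rng=rng) + sprinkle(n=10, d=2, eps=0.1, rng=rng)
print(count_dense_points(config, n=10, r=1.0, b=2), len(coupled))
```

Thinning can only lower the number of points in any box, so it never creates new $b$-dense points. This is why it suffices to remove the $b$-dense points of the original configuration.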
Thus, by Lemma 3.2,
Since $X$ and $X^\varepsilon$ share the same distribution, Lemma 3.1 allows us to choose $b = b(\delta) > 0$ sufficiently large such that
Hence, sending $\varepsilon \downarrow 0$, $\delta \downarrow 0$, and $b \uparrow \infty$ concludes the proof of (3.1).
Proof of Lemma 3.1. Consider a subdivision of $Q_n$, for sufficiently large $n \ge 1$, into sub-boxes $Q_a(z_i) = z_i + Q_a$ of side length $a > r$ where $z_i \in a\mathbb{Z}^d$. Let $N_i = X(Q_a(z_i))$ be the number of points in the $i$th sub-box and $N_i' = X(Q_{3a}(z_i))$ be the number of points in the $i$th sub-box plus its adjacent sub-boxes. Then, $N_{b,n} \le N_{b,n}'$, where
so that by the exponential Markov inequality, for all $t > 0$,
Since the random variables $N_i' 1\{N_i' > b\}$ and $N_j' 1\{N_j' > b\}$ are independent whenever $|z_i - z_j|_\infty \ge 3a$, we have $3^d$ regular sub-grids of $a\mathbb{Z}^d$ containing independent random variables
Since $t > 0$ was arbitrary, we conclude the proof.
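Proof of Lemma 3.2. First, since $X^{+,\varepsilon}$ and $X^{-,\varepsilon}$ are independent, it suffices to compute $\mathbb{P}(X^{+,\varepsilon} \cap Q_n = \emptyset \mid X)$ and $\mathbb{P}(X^{-,\varepsilon} \cap Q_n \text{ has no } b\text{-dense points} \mid X)$ separately. The void probabilities for a Poisson point process give that $\mathbb{P}(X^{+,\varepsilon} \cap Q_n = \emptyset \mid X) = \exp(-\varepsilon n^d)$. Next, since $X^{-,\varepsilon}$ is an independent thinning of $X$ in which every point is removed with probability $\varepsilon$, all $N_{b,n}$ of the $b$-dense points of $X$ in $Q_n$ are removed with probability $\varepsilon^{N_{b,n}}$. As thinning cannot create new $b$-dense points, we arrive at $\mathbb{P}(X^{-,\varepsilon} \cap Q_n \text{ has no } b\text{-dense points} \mid X) \ge \varepsilon^{N_{b,n}}$, as asserted.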
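Multiplying the two conditional probabilities from the proof gives the lower bound $\exp(-\varepsilon n^d)\, \varepsilon^{N_{b,n}}$ for the conditional probability of $E_{b,n}$. The sketch below evaluates its logarithm and the resulting cost per unit volume under the assumption that $N_{b,n} \le \delta n^d$. The function names are hypothetical, and the reading of $\delta$ as a bound on the density of $b$-dense points is an assumption based on the surrounding argument.

```python
import math


def log_sprinkling_lower_bound(eps, n, d, num_dense):
    """log of exp(-eps * n^d) * eps^num_dense, the product of the two factors above."""
    return -eps * float(n) ** d + num_dense * math.log(eps)


def rate_cost(eps, delta):
    """Cost per unit volume when at most delta * n^d points are b-dense:
    eps + delta * log(1 / eps)."""
    return eps + delta * math.log(1.0 / eps)


print(log_sprinkling_lower_bound(eps=0.1, n=10, d=2, num_dense=5))
print(rate_cost(eps=0.01, delta=1e-4))
```

The cost vanishes when $\delta \log(1/\varepsilon)$ and $\varepsilon$ both tend to zero, for instance when $\delta$ is sent to zero for fixed $\varepsilon$ and $\varepsilon$ is sent to zero afterwards.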
3.3. Proof of Theorem 1.3. In order to prove the lower bound for stabilizing score functions, we use sprinkling to regularize sub-regions that are not sufficiently stabilized. Let us define the approximation

E-mail address: tobias@math.tu-berlin.de