Joint large deviation result for empirical measures of the coloured random geometric graphs

We prove a joint large deviation principle for the empirical pair measure and the empirical locality measure of the near intermediate coloured random geometric graph models on n points picked uniformly in a d-dimensional torus of unit circumference. From this result we obtain large deviation principles for the number of edges per vertex, the degree distribution and the proportion of isolated vertices for the near intermediate random geometric graph models.

Independence testing Consider the CGRG as a model for a wireless sensor network (WSN), viewed as a very big dataset comprising the typed sites and the bonds between sites. One interesting question is how many bits are required to code the $n$ sites and the bonds between them with high probability. An asymptotic equipartition property (AEP) for the WSN answers this question, and our LDP for the empirical measures of the CGRG will play a crucial role in the proof of the AEP. Further, we can test whether a received codeword $y^n$ of the WSN is jointly typical with a candidate sent codeword $x^n$ of the WSN. The probability that two independent sequences $(x^n, y^n)$ ($x^n$ being a codeword other than the one sent when $y^n$ was received) actually appear as dependent is bounded asymptotically by $2^{-nI}$, where the AEP is used to obtain the bound. See Doku-Amponsah (2016) for more on this application.
Hypothesis testing One of the standard problems in statistics is to decide between two alternative explanations for the observed data. For example, in a communication system a transmitter sends information on the WSN bit by bit. There are two possible cases for each transmission: either bit 0 of the WSN data is sent (denoted as event $H_0$) or bit 1 of the WSN data is sent (denoted as event $H_1$). On the receiver side, the bit $y$ is received as either 0 or 1. Based on the received bit $y$, we decide between the hypothesis that $H_0$ happened (bit 0 was sent at the transmitter) and the hypothesis that $H_1$ happened (bit 1 was sent). Of course, we may misjudge, for instance by decoding bit 0 when bit 1 was actually sent. We need to make the probability of error in hypothesis testing as low as possible, and the LDPs for CGRG models can help us specify this probability of error.
In the remainder of the paper we state and prove our LDP results. In "Statement of the results" section we state our LDPs: Theorem 1, Corollary 2, Corollary 3, Theorem 4 and Corollary 5. In "Proof of Theorem 1" section we present the proof of Theorem 4. In "Proof of Theorem 1" section we combine Theorem 2.1 and Doku-Amponsah (2014, Theorem 2.1) to obtain Theorem 1, using the setup and result of Biggins (2004) to 'mix' the LDPs. The paper concludes with the proofs of Corollaries 2, 3 and 5, which are given in "Proof of Corollaries 2, 3, and 5" section.

Statement of the results
The joint LDP for empirical pair measure and empirical locality measure of CGRG In this subsection we look at a more general model of random geometric graphs, the CGRG, in which the connectivity radius depends on the type (or colour, symbol or spin) of the nodes. The empirical pair measure and the empirical locality measure are our main objects of study.
Given a probability measure $\nu$ on the colour set $\mathcal{X}$ and a function $r_n\colon \mathcal{X}\times\mathcal{X}\to(0,1]$, we may define the randomly coloured random geometric graph, or simply coloured random geometric graph, $G$ with $n$ vertices as follows: Pick vertices $x_1,\ldots,x_n$ at random independently according to the uniform distribution on $[0,1]^d$, $d\in\mathbb{N}$. Assign to each vertex $x_j$ a colour $\sigma(x_j)$ independently according to the colour law $\nu$. Given the colours, we join any two vertices $x_i, x_j$ ($i\neq j$) by an edge, independently of everything else, if
\[ \|x_i-x_j\| \le r_n\big(\sigma(x_i),\sigma(x_j)\big). \]
In this article we shall refer to $r_n(a,b)$, for $a,b\in\mathcal{X}$, as a connection radius, and always consider $G$ under the joint law of graph and colour. We interpret $G$ as a coloured RGG with vertices $x_1,\ldots,x_n$ chosen at random uniformly and independently from the vertex space $[0,1]^d$. For the purposes of this study we restrict ourselves to the near intermediate cases, i.e. the connection radius $r_n$ satisfies the condition $n r_n^d(a,b)\to C_d(a,b)$ for all $a,b\in\mathcal{X}$, where $C_d\colon \mathcal{X}^2\to[0,\infty)$ is a symmetric function which is not identically equal to zero.
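The construction above can be sketched in Python as follows. This is only an illustration under stated assumptions: distances are Euclidean with wrap-around on the unit torus (as suggested by the abstract), the radius is taken to be exactly $r_n(a,b) = (C_d(a,b)/n)^{1/d}$ so that $n r_n^d(a,b) = C_d(a,b)$, and the names `torus_distance` and `sample_cgrg`, the two colours and the values of `C` are hypothetical choices made for the example.

```python
import math
import random


def torus_distance(p, q):
    """Euclidean distance on the unit torus [0,1]^d (each coordinate wraps around)."""
    return math.sqrt(sum(min(abs(a - b), 1.0 - abs(a - b)) ** 2 for a, b in zip(p, q)))


def sample_cgrg(n, d, colour_law, C, rng):
    """Sample a coloured random geometric graph on n uniform points.

    colour_law: dict colour -> probability (the colour law nu).
    C: dict (a, b) -> C_d(a, b), assumed symmetric. The radius is the assumed
       choice r_n(a, b) = (C_d(a, b) / n) ** (1 / d).
    Returns (points, colours, edges) with edges as pairs (i, j), i < j.
    """
    palette = list(colour_law)
    weights = [colour_law[a] for a in palette]
    points = [tuple(rng.random() for _ in range(d)) for _ in range(n)]
    colours = rng.choices(palette, weights=weights, k=n)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            radius = (C[(colours[i], colours[j])] / n) ** (1.0 / d)
            if torus_distance(points[i], points[j]) <= radius:
                edges.append((i, j))
    return points, colours, edges


rng = random.Random(0)  # fixed seed so the sketch is reproducible
nu = {"red": 0.5, "blue": 0.5}
C = {("red", "red"): 1.0, ("red", "blue"): 0.5,
     ("blue", "red"): 0.5, ("blue", "blue"): 0.25}
points, colours, edges = sample_cgrg(200, 2, nu, C, rng)
print("edges per vertex:", len(edges) / 200)
```

The quadratic loop over pairs keeps the sketch short; for large $n$ a spatial grid would be the natural replacement.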
For any finite or countable set $\mathcal{X}$ we denote by $\mathcal{P}(\mathcal{X})$ the space of probability measures, and by $\tilde{\mathcal{P}}(\mathcal{X})$ the space of finite measures on $\mathcal{X}$, both endowed with the weak topology. By convention we write $\mathbb{N}=\{0,1,2,\ldots\}$.
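We associate with any coloured graph $G$ a probability measure, the empirical colour measure $L^1\in\mathcal{P}(\mathcal{X})$, by
\[ L^1(a) := \frac{1}{n}\sum_{j=1}^{n}\delta_{\sigma(x_j)}(a), \quad a\in\mathcal{X}, \]
and a symmetric finite measure, the empirical pair measure $L^2\in\tilde{\mathcal{P}}_*(\mathcal{X}^2)$, by
\[ L^2(a,b) := \frac{1}{n}\sum_{(i,j)\in E}\big[\delta_{(\sigma(x_i),\sigma(x_j))}+\delta_{(\sigma(x_j),\sigma(x_i))}\big](a,b), \quad a,b\in\mathcal{X}. \]
Note that the total mass of the empirical pair measure is $2|E|/n$. Finally we define a further probability measure, the empirical neighbourhood measure $M\in\mathcal{P}(\mathcal{X}\times\mathbb{N}^{\mathcal{X}})$, by
\[ M(a,\ell) := \frac{1}{n}\sum_{j=1}^{n}\delta_{(\sigma(x_j),L(x_j))}(a,\ell), \]
where $L(x_j)=(l_{x_j}(b),\, b\in\mathcal{X})$ and $l_{x_j}(b)$ is the number of vertices of colour $b$ connected to vertex $x_j$.

For any $\eta\in\mathcal{P}(\mathcal{X}\times\mathbb{N}^{\mathcal{X}})$ we denote by $\eta_1$ the $\mathcal{X}$-marginal of $\eta$ and, for every $(b,a)\in\mathcal{X}\times\mathcal{X}$, let $\eta_2$ be the law of the pair $(a,l(b))$ under the measure $\eta$. Define the finite measure $\langle\eta(\cdot,\ell),l(\cdot)\rangle\in\tilde{\mathcal{P}}(\mathcal{X}\times\mathcal{X})$ by
\[ \mathcal{H}_2(\eta)(a,b) := \sum_{l(b)\in\mathbb{N}}\eta_2(a,l(b))\,l(b), \quad a,b\in\mathcal{X}, \]
and write $\mathcal{H}_1(\eta)=\eta_1$. We define the function $\mathcal{H}$ by $\mathcal{H}(\eta)=(\mathcal{H}_1(\eta),\mathcal{H}_2(\eta))$. Observe that $\mathcal{H}_1$ is a continuous function but $\mathcal{H}_2$ is discontinuous in the weak topology. In particular, in the summation $\sum_{l(b)\in\mathbb{N}}\eta_2(a,l(b))\,l(b)$ the function $l(b)$ may be unbounded, and so the functional $\eta\mapsto\mathcal{H}_2(\eta)$ is not continuous in the weak topology. We call a pair of measures $(\omega,\eta)$ consistent if equality holds in (1).

For a measure $\omega\in\tilde{\mathcal{P}}_*(\mathcal{X}^2)$ and a measure $\rho\in\mathcal{P}(\mathcal{X})$, we recall from Doku-Amponsah and Mörters (2010) the rate function

for $a,b\in\mathcal{X}$. It is not hard to see that $H_1(\omega\,\|\,\rho)\ge 0$ and equality holds if and only if $\omega=C_d\,\rho\otimes\rho$.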
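To make the three empirical measures concrete, here is a minimal Python sketch. It assumes the standard normalisations from the coloured random graph literature: each vertex carries mass $1/n$ in the colour and neighbourhood measures, and each edge contributes mass $1/n$ to both orderings of its colour pair, so the pair measure has total mass $2|E|/n$. The function names and the four-vertex example are hypothetical. The helper `second_marginal` computes $\sum_{\ell}\eta(a,\ell)\,\ell(b)$, which for an empirical neighbourhood measure should reproduce the empirical pair measure.

```python
from collections import Counter


def empirical_measures(colours, edges):
    """Return (L1, L2, M, palette) for a coloured graph on vertices 0..n-1."""
    n = len(colours)
    palette = sorted(set(colours))
    # Empirical colour measure: fraction of vertices of each colour.
    L1 = {a: count / n for a, count in Counter(colours).items()}
    # Empirical pair measure: each edge adds 1/n to both colour orderings.
    L2 = Counter()
    neighbour_counts = [Counter() for _ in range(n)]
    for i, j in edges:
        L2[(colours[i], colours[j])] += 1.0 / n
        L2[(colours[j], colours[i])] += 1.0 / n
        neighbour_counts[i][colours[j]] += 1
        neighbour_counts[j][colours[i]] += 1
    # Empirical neighbourhood measure: law of (own colour, neighbour colour counts).
    M = Counter()
    for j in range(n):
        ell = tuple(neighbour_counts[j][b] for b in palette)
        M[(colours[j], ell)] += 1.0 / n
    return L1, dict(L2), dict(M), palette


def second_marginal(M, palette):
    """Compute sum over ell of M(a, ell) * ell(b) for every colour pair (a, b)."""
    out = Counter()
    for (a, ell), mass in M.items():
        for b, count in zip(palette, ell):
            out[(a, b)] += mass * count
    return {pair: mass for pair, mass in out.items() if mass > 0}


# Hypothetical example: a path 0 - 1 - 2 - 3 with alternating colours.
colours = ["a", "b", "a", "b"]
edges = [(0, 1), (1, 2), (2, 3)]
L1, L2, M, palette = empirical_measures(colours, edges)
H2 = second_marginal(M, palette)
print("total mass of L2:", sum(L2.values()), "and 2|E|/n:", 2 * len(edges) / len(colours))
print("pair measure recovered from M:", H2 == L2)
```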
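For every $(\omega,\eta)\in\tilde{\mathcal{P}}_*(\mathcal{X}\times\mathcal{X})\times\mathcal{P}(\mathcal{X}\times\mathbb{N})$ define a probability measure $Q^{(\omega,\eta)}_{\mathrm{poi}}$ on $\mathcal{X}\times\mathbb{N}$ by

We assume $d\in\mathbb{N}$ and write
\[ \lambda(d) := \frac{\pi^{d/2}}{\Gamma\big(\tfrac{d+2}{2}\big)}, \]
where $\Gamma$ is the gamma function; this is the volume of the $d$-dimensional unit ball. We now state the principal theorem in this section, the LDP for the empirical pair measure and the empirical locality measure.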
Theorem 1 Suppose that $G$ is a CGRG with colour law $\nu$ and connection radii $r_n\colon\mathcal{X}\times\mathcal{X}\to[0,1]$ satisfying $n r_n^d(a,b)\to C_d(a,b)$ for some symmetric function $C_d\colon\mathcal{X}^2\to[0,\infty)$

for $a\in\mathcal{X}$, $\ell\in\mathbb{N}$.
Remark 1 Note that the first three terms of the rate function are the same as the rate function of Doku-Amponsah and Mörters (2010, Theorem 2.1). Additionally, the extra

We write

Corollary 2 Suppose $D$ is the degree distribution of the random graph $G(n,r_n)$, where the connectivity radius $r_n\in(0,1]$ satisfies $n r_n^d\to c\in(0,\infty)$. Then, as $n\to\infty$, $D$ satisfies an LDP in the space $\mathcal{P}(\mathbb{N}\cup\{0\})$ with good rate function

where $q_k$ is a Poisson distribution with parameter $k$, and $\langle\delta\rangle := \sum_{m=0}^{\infty} m\,\delta(m)$. This rate function compares very well with the rate function of Doku-Amponsah and Mörters (2010, Corollary 2.2), with an extra term accounting for the geometric effect on the CGRG model.
Next we give a result similar to that of O'Connell (1998), namely the LDP for the proportion of isolated vertices of the RGG.
Corollary 3 Suppose $D$ is the degree distribution of the random graph $G(n,r_n)$, where the connectivity radius $r_n\in(0,1]$ satisfies $n r_n^d\to c\in(0,\infty)$. Then, as $n\to\infty$, the proportion of isolated vertices, $D(0)$, satisfies an LDP in $[0,1]$ with good rate function

From Corollary 3 we deduce that in a typical random geometric graph the number of isolated vertices grows like $n e^{-\lambda(d)c}$, where $\lambda(d)$ is the volume of the $d$-dimensional unit ball. That is, as $n\to\infty$, the proportion of isolated vertices converges to $e^{-\lambda(d)c}$ in probability. Again, the rate function $\xi_2$ above compares very well with the result of O'Connell (1998), with the extra term $\xi_1$ accounting for the influence of the geometric space $[0,1]^d$ on the model.
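As a rough numerical illustration of the typical behaviour described after Corollary 3, the following Python sketch samples an uncoloured random geometric graph and compares the observed proportion of isolated vertices with the predicted limit. It rests on assumptions that are not taken from the text: Euclidean distance with wrap-around on the unit torus, the exact radius $r_n=(c/n)^{1/d}$, and the constant in the exponent being $c$ times the volume $\pi^{d/2}/\Gamma(d/2+1)$ of the $d$-dimensional unit ball. The function names are hypothetical.

```python
import math
import random


def unit_ball_volume(d):
    """Volume of the d-dimensional Euclidean unit ball."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def isolated_fraction(n, d, c, rng):
    """Proportion of degree-zero vertices in one sampled graph with n * r^d = c."""
    radius_sq = (c / n) ** (2.0 / d)
    points = [[rng.random() for _ in range(d)] for _ in range(n)]
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            dist_sq = 0.0
            for a, b in zip(points[i], points[j]):
                diff = abs(a - b)
                diff = min(diff, 1.0 - diff)  # wrap-around on the torus
                dist_sq += diff * diff
            if dist_sq <= radius_sq:
                degree[i] += 1
                degree[j] += 1
    return sum(1 for k in degree if k == 0) / n


rng = random.Random(1)  # fixed seed for reproducibility
n, d, c = 400, 2, 0.5
observed = isolated_fraction(n, d, c, rng)
predicted = math.exp(-unit_ball_volume(d) * c)
print("observed proportion:", observed)
print("predicted limit:   ", predicted)
```

A single sample at moderate $n$ only lands near the limit; the LDP is what quantifies how unlikely larger deviations are.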

The joint LDP for the empirical colour measure and empirical pair measure of CGRG
Theorem 4 Suppose that $G$ is a CGRG with colour law $\nu$ and connection radii

Further, we state a corollary of Theorem 4 below.

Corollary 5 Suppose that $G$ is a CGRG with colour law $\nu$ and connection
Remark 2 By taking $C_d(a,b)=c$ one obtains $\psi(y)=0$ for $y=\tfrac{\lambda(d)}{2}c$, and $\psi(y)=\infty$ otherwise, which establishes that $|E|/n$ obeys an LDP in $[0,\infty)$ with good rate function

where $\lambda(d)c=y$.

Change-of-measure
For any two points $U_1$ and $U_2$ chosen uniformly and independently from the space $[0,1]^d$ write

Further, given a function $f\colon\mathcal{X}\to\mathbb{R}$ and a symmetric function $g\colon\mathcal{X}^2\to\mathbb{R}$, we define the constant $U_f$ by
\[ U_f = \log\sum_{a\in\mathcal{X}} e^{f(a)}\nu(a), \]
and the function $h_n\colon\mathcal{X}^2\to\mathbb{R}$ by

for $a,b\in\mathcal{X}$. We use $f$ and $g$ to define (for sufficiently large $n$) a new coloured random graph as follows:
• To the $n$ points $x_1,x_2,\ldots,x_n$ picked independently and uniformly in $[0,1]^d$ we assign colours from $\mathcal{X}$ independently and identically according to the colour law $\tilde\nu$ defined by
\[ \tilde\nu(a) = e^{f(a)-U_f}\,\nu(a). \]
• Given any two points $x_u, x_v$, with $x_u$ carrying colour $a$ and $x_v$ carrying colour $b$, we connect vertex $x_u$ to vertex $x_v$ with probability

We denote the transformed law by $\tilde{\mathbb{P}}$. We observe that $\tilde\nu$ is a probability measure and that $\tilde{\mathbb{P}}$ is absolutely continuous with respect to $\mathbb{P}$ as, for any coloured graph

We write $\langle g,\omega\rangle := \sum_{a,b\in\mathcal{X}} g(a,b)\,\omega(a,b)$ for $\omega\in\tilde{\mathcal{P}}(\mathcal{X}^2)$, and $\langle f,\rho\rangle := \sum_{a\in\mathcal{X}} f(a)\,\rho(a)$ for $\rho\in\mathcal{P}(\mathcal{X})$, and note that $F(r_n(a,b))=\lambda(d)\,r_n^d(a,b)$ for all $(a,b)\in\mathcal{X}^2$, i.e. the volume of a $d$-dimensional (hyper)sphere with radius $r_n(a,b)$ satisfying $n r_n^d(a,b)\to C_d(a,b)$. Here $\lambda(d)$ denotes the volume of the $d$-dimensional unit ball.
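The colour part of this change of measure can be checked numerically. The sketch below assumes the tilted colour law has the usual exponential form $\tilde\nu(a)=e^{f(a)-U_f}\nu(a)$, which is the natural reading of the constant $U_f$ but is an assumption here; the edge part of the density is not covered. The function names, the two colours and the values of `f` are hypothetical. The code verifies that the tilted law sums to one and that the log-density of a colour configuration equals $n(\langle f, L^1\rangle - U_f)$, where $L^1$ is the empirical colour measure.

```python
import math


def tilt_colour_law(nu, f):
    """Return the exponentially tilted colour law and the normalising constant U_f."""
    U_f = math.log(sum(math.exp(f[a]) * mass for a, mass in nu.items()))
    nu_tilde = {a: math.exp(f[a] - U_f) * mass for a, mass in nu.items()}
    return nu_tilde, U_f


def colour_log_density(colours, nu, f):
    """Log-likelihood ratio of the colour configuration under the tilted law vs. nu."""
    nu_tilde, _ = tilt_colour_law(nu, f)
    return sum(math.log(nu_tilde[a] / nu[a]) for a in colours)


nu = {"red": 0.7, "blue": 0.3}
f = {"red": -0.5, "blue": 1.0}
nu_tilde, U_f = tilt_colour_law(nu, f)

colours = ["red", "blue", "blue", "red", "red"]
n = len(colours)
pairing = sum(f[a] for a in colours) / n  # <f, L^1> for the empirical colour measure

print("total mass of tilted law:", sum(nu_tilde.values()))
print("log-density:", colour_log_density(colours, nu, f))
print("n(<f, L^1> - U_f):", n * (pairing - U_f))
```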
The following lemmas will be useful in the proofs of the main lemmas.
Lemma 1 (Euler's lemma) If $n r_n^d(a,b)\to C_d(a,b)$ for every $a,b\in\mathcal{X}$, then

Proof Observe that, for any $\varepsilon>0$ and for large $n$, we have

by the pointwise convergence. Hence by the sandwich theorem and Euler's formula we get (6). We write

Lemma 2 The family of measures $(P_n\colon n\in\mathbb{N})$ is exponentially tight on $\mathcal{P}(\mathcal{X})$.
Proof We use a coupling argument (see the proof of Doku-Amponsah and Mörters (2010, Lemma 5.1)) to show that, for every $\theta>0$, there exists $N\in\mathbb{N}$ such that

To begin, let $c(d)>\max_{a,b\in\mathcal{X}} C_d(a,b)>0$ and $n r_n^d(c)\to c(d)$. Using a coupling argument similar to that in the proof of Doku-Amponsah and Mörters (2010, Lemma 5.1), we can define, for all sufficiently large $n$, a coloured random graph $\tilde X$ with vertices $x_1,\ldots,x_n$ chosen uniformly from the vertex space $[0,1]^d$, colour law $\eta$ and connectivity probability $p_n=\mathbb{P}\{\|x_i-x_j\|\le r_n(c)\}=\lambda(d)\,r_n^d$, for all $i\neq j$, such that any edge present in $G$ is also present in $\tilde X$. Let $|\tilde E|$ be the number of edges of $\tilde X$. Using the binomial formula and Euler's formula, we have that

where we used $n p_n=\lambda(d)\,n r_n^d\to\lambda(d)c$ in the last step. Now given $\theta>0$ choose $N\in\mathbb{N}$ such that $N>\theta+\lambda(d)c(e-1)$ and observe that, for sufficiently large $n$,

which implies the statement.
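The mechanism behind this proof can be illustrated numerically. The sketch below assumes that the omitted estimates are the usual exponential Chebyshev (Chernoff) bound, namely $\mathbb{P}\{|\tilde E|\ge nN\}\le e^{-nN}\,\mathbb{E}\,e^{|\tilde E|}$ with $|\tilde E|$ binomial with parameters $n(n-1)/2$ and $p_n$. This reading is an assumption and not a quotation of the missing displays. It also takes $p_n$ to be exactly $c$ times the unit-ball volume divided by $n$; the function names are hypothetical. With $N>\theta+\lambda(d)c(e-1)$ the logarithm of the bound should fall below $-n\theta$.

```python
import math


def unit_ball_volume(d):
    """Volume of the d-dimensional Euclidean unit ball."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def log_chernoff_bound(n, d, c, N):
    """log( e^{-nN} * E[e^{|E~|}] ) for |E~| ~ Binomial(n(n-1)/2, p_n).

    Uses the binomial moment generating function E[e^X] = (1 + p(e - 1))^m,
    with the assumed connection probability p_n = unit_ball_volume(d) * c / n.
    """
    p_n = unit_ball_volume(d) * c / n
    pairs = n * (n - 1) // 2
    log_mgf = pairs * math.log1p(p_n * (math.e - 1.0))
    return -n * N + log_mgf


d, c, theta = 2, 1.0, 1.0
# Smallest integer strictly above theta + lambda(d) * c * (e - 1).
N = math.floor(theta + unit_ball_volume(d) * c * (math.e - 1.0)) + 1
for n in (50, 100, 1000):
    print(n, log_chernoff_bound(n, d, c, N), -n * theta)
```

Since $\log(1+x)\le x$, the log moment generating function is at most $\tfrac{n}{2}\lambda(d)c(e-1)$, which is why the stated choice of $N$ leaves room to spare.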

Consequently,
We may now use (8) and (9) to obtain, for all sufficiently small $\varepsilon>0$,

Taking $\varepsilon\downarrow 0$ we get the desired statement.

Proof of the lower bound in Theorem 4
We obtain the lower bound of Theorem 4 from the upper bound as follows:

and note that $\beta_\omega(a,b)=\lim_{n\to\infty} h_{\omega,n}(a,b)$ for all $a,b\in\mathcal{X}$, where
where a is the positive solution of our equation. We obtain the form of ξ in Corollary 3 by observing that

Conclusion
In this work, we have proved a joint large deviation principle for the empirical pair measure and the empirical locality measure of the near intermediate CGRG models. From this result we have obtained asymptotic results about useful graph quantities such as the number of edges per vertex, the degree distribution and the proportion of isolated vertices for the near intermediate CGRG models. The rate functions of all these large deviation principles compare very well with the rate functions of the results for coloured random graph models by Doku-Amponsah and Mörters (2010), with some extra terms accounting for the geometric effect in the CGRG models. An important future research direction is to formulate and prove an asymptotic equipartition property for networked data structures modelled as the CGRG, and then to develop possible coding or approximate pattern matching algorithms for such networks. One could also investigate statistical mechanics on the CGRG.