Degree distributions in AB random geometric graphs

In this paper, we provide degree distributions for $AB$ random geometric graphs, in which points of type $A$ connect to the closest $k$ points of type $B$. The motivating example is 5G wireless networks with multi-connectivity, where users connect to their closest $k$ base stations. It is important to know how many users a particular base station serves, which is the degree of that base station. To obtain these degree distributions, we investigate the distribution of area sizes of the $k$-th order Voronoi cells of $B$-points. Assuming that the $A$-points are Poisson distributed, we investigate the number of users connected to a certain $B$-point, which is equal to the degree of this point. In the simple case where the $B$-points are placed in a hexagonal grid, we show that all $k$-th order Voronoi areas are equal and thus all degrees follow a Poisson distribution. However, this observation does not hold for Poisson distributed $B$-points, for which we show that the degree distribution follows a compound Poisson-Erlang distribution in the 1-dimensional case. We then approximate the degree distribution in the 2-dimensional case with a compound Poisson-Gamma degree distribution and show that this one-parameter fit performs well for different values of $k$. Moreover, we show that for increasing $k$, these degree distributions become more concentrated around the mean. This means that $k$-connected $AB$ random graphs balance the loads of $B$-type nodes more evenly as $k$ increases. Finally, we provide a case study on real data of base stations. We show that with little shadowing in the distances between users and base stations, the Poisson distribution does not capture the degree distribution of these data, especially for $k>1$. However, under strong shadowing, our degree approximations perform quite well even for these non-Poissonian location data.


Introduction
Spatial point processes have many applications, ranging from the distribution of stars in the Milky Way [3] to the dispersal of biological species [20,24]. One application of spatial processes which has received significant interest is in wireless networks [13]. In the typical setting, a wireless network consists of base stations and users that are distributed according to some spatial process, and users connect to the nearest base station.
In 5G networks, the new concept of multi-connectivity is introduced. In multi-connected networks, users connect to the k > 1 nearest base stations. Having multiple connections can make the internet connection faster and more reliable [27]. In multi-connectivity, the size distribution of k-th order areas is a quantity of importance. Indeed, from the area size distribution, one can obtain the distribution of the number of users that connect to a particular base station. This quantity is necessary to derive analytical expressions for important network statistics such as the network capacity and outage probabilities.
In this paper we therefore investigate the degree distribution in AB random geometric graphs [21], a random graph model for multi-connected networks in which points of type A connect to the k closest points of type B. To derive the degree distribution of B-points, we are interested in the size of the area in which a given B-point is the k-th closest to a given A-point. Here we assume that A-points are distributed as a Poisson point process. Besides multi-connected networks, another application is the $G_{n,k}$ random graph, where n points connect to their closest k neighbors. For these types of random graphs, only high-level characteristics are known, such as the parameters for which the resulting graph is connected [4]. Results on the area in which a given point is k-th closest would make it possible to derive the degree distribution of these random graphs as well, which could give more insight into the behavior of these graphs under dynamic processes such as epidemics or cascading failures. Other applications of such areas are in k-nearest neighbor classification [25] and in plant ecology [17].
A property of Poisson processes is that the size of an area determines the distribution of the number of points in that area. Thus, rather than analyzing the degree distribution directly, in this paper, we first derive expressions for the size distributions of the areas in which B-points are k-th closest to A-points.
This degree distribution depends on the spatial distribution of the B-points, as different spatial distributions give different areas in which B-points are k-th closest. We focus on two popular location models: a hexagonal grid model and a Poisson point process. The Poisson point process is one of the most popular spatial processes, due to its mathematical properties that make it relatively easy to analyze. Examples of spatial processes that are often modeled by Poisson processes are wireless networks [13], the dispersal of biological species [20] and spatial patterns in forestry [26]. The Poisson point process has proven useful to obtain several quantities of interest analytically. For example, in single-connected wireless networks, the Poisson process makes it possible to derive the probability of network outages or the capacity that each user in the network receives [13]. The hexagonal grid is a simpler spatial process model, which has been used in many applications such as modeling wireless networks [19], ecology [6] and agent-based modeling [10]. The hexagonal grid is easier to analyse in terms of Voronoi area sizes, but other quantities of interest may be more difficult to derive, as the locations of points in a hexagonal grid are dependent, in contrast to the Poisson point process. In wireless networks, the hexagonal grid model often overestimates network performance measures such as outage probabilities compared to real data, while the Poisson point process slightly underestimates them [16].
For both location models, we start by investigating the area sizes in the 1-dimensional case, after which we turn to the 2-dimensional case. For k = 1, the problem reduces to finding the size distribution of Voronoi cells, which indicate the region in which a given point is the closest of all points, for the grid and the Poisson point process. Unfortunately, no exact characterisation of the sizes of these Poisson-Voronoi cells exists, although several approximations based on numerical simulations exist [14,28].
For k > 1, the area in which a user connects to a given point can no longer be found by standard Voronoi diagrams. For such settings, higher-order Voronoi diagrams or k-th order Voronoi diagrams exist [12]. For example, in a second order Voronoi diagram, every cell corresponds to a pair of points $(i_1, i_2)$, such that $i_1$ is the closest, and $i_2$ is the second closest in that area. In a k-th order Voronoi diagram, every cell represents the area where a given point is k-th closest. These cells are in general neither convex nor connected, making analysis of k-th order Voronoi cells complex [12]. Several results on fast algorithms to construct higher-order Voronoi diagrams exist [2,9,29], as well as results on the complexity of their cells [8] and on the cell shapes [12,18]. However, to our knowledge, no results on the cell sizes of higher-order Voronoi cells or k-th order Voronoi cells exist under any type of underlying point process.
In this paper, we investigate the sizes of k-th order Voronoi areas for k > 1 in order to find the degree distributions for the Poisson process and the hexagonal grid. In the 1-dimensional setting, we obtain an exact result for the distributions of the regions where given points are the k-th closest. Interestingly, these regions are equal in distribution for all k. We also derive an exact expression for the degree distribution in the 1-dimensional setting. We show that the degree distribution becomes more concentrated when k grows. Thus, increasing k balances the load in terms of connections more evenly among the B-points.
In the 2-dimensional setting, we provide exact results for the areas and degree distributions under the hexagonal grid model. Unfortunately, for the Poisson point process no exact results on the Voronoi-area sizes exist even for k = 1. We therefore turn to numerical simulations instead. We provide one-parameter fitted distributions similar to the well-accepted approximation for k = 1 to approximate the distribution of the areas where a given point is k-th closest in Poisson-Voronoi cells. With these parameter fits, we find a compound Poisson-Gamma degree distribution. Moreover, we show that the 1-dimensional Poisson case, for which we found an exact degree distribution, also approximates the 2-dimensional degree distribution well, especially when k becomes large. In both the Poisson point process and the hexagonal grid, we show that the coefficient of variation of the degree distributions decreases when k increases, which means that the load of the network gets more evenly balanced among the B-points.
Finally, we investigate a case study of real data of base stations in the Netherlands. Interestingly, while these base stations are not distributed according to a Poisson point process, we show that when some randomness is present in the random graph connections, in the form of shadowing, the degree distribution of these non-Poissonian data is well approximated by our results for the 1- or 2-dimensional Poisson case for k > 1. When this randomness is not present, the degree distribution obtained from the Poisson point process still fits reasonably well for k = 1, but the fit significantly deteriorates for larger k. This also indicates that for some data a Poisson point process may be a suitable model for k-connected AB-graphs when k = 1, but not for larger values of k.
In Section 2, we derive analytical results for the degree distribution for the hexagonal grid in 1 and 2 dimensions and for the 1-dimensional Poisson setting. Moreover, we provide an approximate one-parameter fit for the 2-dimensional setting. We then show the quality of this fit, and derive the degree distribution for Poisson distributed points. Then, in Section 3, we investigate a case study on non-Poissonian real data of base station locations. We show that while these real base station locations are not distributed as a Poisson process, under high shadowing, our approximations still work quite well to predict the degree distributions of these base stations. Furthermore, we show that without shadowing, the degree distribution is still predicted quite well for k = 1, but for higher values of k, it becomes worse. This indicates that while the frequently made Poisson point process assumption may be justified for k = 1, the correlations between the presence of different points in non-Poissonian data can make the Poisson assumption unjustified for higher-order connectivity levels.

Degree distribution in k-connectivity
We model k-connectivity in an AB random geometric graph [22] consisting of points of type A and points of type B. Type A points are distributed according to a homogeneous Poisson point process with density $\lambda_A$, while type B points have a general spatial distribution. Every point in A connects to the nearest k points in B. An example of 2-connectivity is given in Figure 1.
We are interested in the degree distribution of the B-points, which we obtain by using the fact that every A-point in the j-th order Voronoi cell of a random point in B connects to this B-point if j ≤ k. Thus, if the area of a certain cell is known, so is the distribution of the number of A-points ($N_A$) in this area:
$$ P(N_A = n \mid X = x) = \frac{(\lambda_A x)^n}{n!} e^{-\lambda_A x}, \qquad (1) $$
where x denotes the size of this area. In order to find the degree distribution of a random point in B, denoted by $D_B$, we then simply need to find the sum of the sizes of the j-th order Voronoi cells for 1 ≤ j ≤ k, which we call $X_{\le k}$:
$$ P(D_B = n) = \int_0^\infty \frac{(\lambda_A x)^n}{n!} e^{-\lambda_A x} f_{X_{\le k}}(x)\, dx, \qquad (2) $$
for area distribution $f_{X_{\le k}}(x)$.
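As a hedged illustration of how the Poisson count on a fixed area and the area distribution combine, the following Python sketch averages the conditional Poisson probability over a list of sampled area sizes. This is a Monte Carlo stand-in for integrating against the area density; the function names and the idea of passing sampled areas are assumptions made for this example and do not come from the paper.

```python
import math

def degree_pmf_given_area(n, lam_a, x):
    """P(N_A = n) for a Poisson process of density lam_a on an area of size x."""
    mean = lam_a * x
    if mean == 0.0:
        return 1.0 if n == 0 else 0.0
    # Work in log space so that large n does not overflow.
    return math.exp(-mean + n * math.log(mean) - math.lgamma(n + 1))

def degree_pmf_mixture(n, lam_a, areas):
    """Approximate P(D_B = n) by averaging over sampled area sizes (hypothetical input)."""
    return sum(degree_pmf_given_area(n, lam_a, x) for x in areas) / len(areas)

if __name__ == "__main__":
    # Illustrative area samples only; real samples would come from a simulation.
    example_areas = [0.8, 1.0, 1.3, 0.9]
    print([round(degree_pmf_mixture(n, 2.0, example_areas), 4) for n in range(5)])
```

With a single fixed area the mixture reduces to the plain Poisson probability, which is the situation of the regular grid.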
In the following sections, we derive expressions for the degree distribution of B-points for two spatial distributions of the B-points. In Section 2.1, we assume that the points in B are placed in a hexagonal grid, and then in Section 2.2, we investigate the setting in which they are distributed as a Poisson point process.
We now provide some definitions that will be used throughout this paper:
Definition 1 (k-th order area). The area on which point i ∈ B is the k-th point if sorted by increasing distance is denoted by $X_k(i)$.
Definition 2 (≤ k-th order area). The area on which point i ∈ B is the j-th point, where 1 ≤ j ≤ k, if sorted by increasing distance is denoted by $X_{\le k}(i)$.

Hexagonal grid
We first investigate the degree distribution and thus area sizes in the regular grid. For one dimension, this means that points B are placed on a line with equal spacing, and for two dimensions we investigate the hexagonal grid.
Proof. In this 1-dimensional case, the distance between two consecutive points is $d$. An example of the 1-dimensional grid is given in Figure 2. From this picture, it can be seen that $X_k(i) = X_{k+1}(i)$ for all i ∈ B and k, as every $X_k(i)$ consists of two areas of size $d/2$. Hence $X_{\le k}(i) = kd$ and, by (1), the degrees $D_B$ are Poisson distributed with parameter $\lambda_A k d$.
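The equal-area argument can be checked numerically. The sketch below is a brute-force check, not the authors' method: it places grid points on a circle (to avoid boundary effects, an assumption of this example), scans locations with a midpoint rule, and measures the length on which grid point 0 is among the k nearest points. The function name and default arguments are hypothetical.

```python
def leq_k_area_1d_grid(n_points, k, spacing=1.0, samples=20000):
    """Estimate the length on which grid point 0 is among the k nearest points.

    Points sit at 0, spacing, 2*spacing, ... on a circle of length
    n_points * spacing. Requires k to be well below n_points / 2.
    """
    total = n_points * spacing
    hits = 0
    for s in range(samples):
        y = (s + 0.5) * total / samples  # midpoint-rule sample location
        dists = []
        for j in range(n_points):
            d = abs(y - j * spacing)
            dists.append((min(d, total - d), j))  # circular distance
        dists.sort()
        if any(j == 0 for _, j in dists[:k]):
            hits += 1
    return hits * total / samples

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, leq_k_area_1d_grid(12, k))
```

The estimate should be close to k times the spacing, in line with every k-th order area consisting of two pieces of half the spacing.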

2-dimensional case
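We now turn to the 2-dimensional setting. The following theorem shows that in the 2-dimensional setting (Figure 3), the degree distribution is again Poisson, but with another parameter:
Theorem 2. In the 2-dimensional hexagonal grid, the degree $D_B$ of a B-point is Poisson distributed with parameter $\lambda_A k A_{tot}/N$, where $A_{tot}$ is the total area of the grid and N is the number of B-points.
To prove this theorem, we need to know the distribution of the area sizes, for which we introduce the following lemma.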
Lemma 1 (Equal areas). In a regular lattice, the k-th order area for every grid point i ∈ B is equal:
$$ X_k(i) = X_k(j) = \frac{A_{tot}}{N} \quad \text{for all } i, j \in B \text{ and all } k. $$
Proof. Let us assume we have a lattice on a torus with N = |B| points and a total area $A_{tot}$. Because we have a regular grid, we know that for all i and j,
$$ X_1(i) = X_1(j). $$
Since all B-points are equal and symmetric, it holds that for all i, j ∈ B and k:
$$ X_k(i) = X_k(j). $$
Furthermore, the k-th order areas of the points in the grid sum up to $A_{tot}$, since every location in the grid has exactly one of the B-points as its k-th closest point:
$$ \sum_{i \in B} X_k(i) = A_{tot}. $$
This shows that $X_k(i) = X_k(j) = A_{tot}/N$ for all i, j ∈ B and k.
Now, we prove Theorem 2 using Lemma 1.
Proof. Since $X_s(i)$ is equal for every s ≥ 1, the area $X_{\le k}(i) = k \cdot X_1(i) = k A_{tot}/N$, where $A_{tot}$ is the total area of the grid and N is the total number of points in that grid. We can now fill in the area in (1), as we assumed the A-points are Poisson distributed. Therefore, the degree distribution of a randomly chosen B-point is:
$$ P(D_B = n) = \frac{(\lambda_A k A_{tot}/N)^n}{n!} e^{-\lambda_A k A_{tot}/N}. $$
This means that $D_B \sim \text{Poisson}(\lambda_A k A_{tot}/N)$.
We would like to compare the different degree distributions $D_B$ for different values of k. We therefore compute the coefficient of variation of $D_B$:
$$ c_V = \frac{\sqrt{\mathrm{Var}(D_B)}}{E(D_B)} = \frac{1}{\sqrt{\lambda_A k A_{tot}/N}}. \qquad (10) $$
This shows that the coefficient of variation decreases as $1/\sqrt{k}$ for increasing k. Thus, when increasing the connectivity in an AB random graph, the degree distribution of the B-points becomes more concentrated.
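As a small worked example of this scaling, the sketch below evaluates the mean and the coefficient of variation of a Poisson degree with parameter $\lambda_A k A_{tot}/N$. The function names and the numerical values are illustrative assumptions, not values taken from the paper.

```python
import math

def grid_degree_mean(lam_a, k, a_tot, n_points):
    """Mean degree of a B-point in the regular grid: lam_a * k * a_tot / n_points."""
    return lam_a * k * a_tot / n_points

def grid_degree_cv(lam_a, k, a_tot, n_points):
    """Coefficient of variation of a Poisson variable: 1 / sqrt(mean)."""
    return 1.0 / math.sqrt(grid_degree_mean(lam_a, k, a_tot, n_points))

if __name__ == "__main__":
    # Hypothetical parameters: 100 grid points on an area of size 100.
    for k in (1, 4, 16):
        print(k, grid_degree_cv(lam_a=2.0, k=k, a_tot=100.0, n_points=100))
```

Quadrupling k halves the coefficient of variation, which is the $1/\sqrt{k}$ behaviour stated above.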

Poisson Point Process
Now, let us investigate the case where B-points are distributed as a Poisson point process. Again, we first find an expression for the degree distribution in one dimension, where the points are placed on a line according to a Poisson process with parameter $\lambda_B$ (Figure 4). We then focus on the 2-dimensional problem, where points are distributed as a homogeneous Poisson point process with parameter $\lambda_B$ (Figure 5).

1-dimensional case
The following theorem provides the degree distribution in the 1-dimensional Poisson case:
Theorem 3. The degree $D_B$ of a randomly chosen B-point in the 1-dimensional Poisson point process has the following distribution function:
$$ P(D_B = n) = \binom{n+2k-1}{n} \left(\frac{\lambda}{\lambda+2}\right)^{n} \left(\frac{2}{\lambda+2}\right)^{2k}, \qquad (13) $$
where $\lambda = \lambda_A/\lambda_B$. We call this distribution the compound Poisson-Erlang distribution.
Figure 4: $X_k(i)$ in a 1-dimensional Poisson process.
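To prove this, we are again interested in the distribution of the area $X_k(i)$ given in Definition 1.
Lemma 2. In the 1-dimensional Poisson point process, the areas $X_k(i)$ are equal in distribution for all k.
Proof. We express $X_k(i)$ in terms of the distances $D_j$ between the j-th and the (j+1)-th B-point on the line:
$$ X_k(i) = \tfrac{1}{2} D_{i-k} + \tfrac{1}{2} D_{i+k-1} \overset{d}{=} \tfrac{1}{2}\,(D_1 + D_2), \qquad (12) $$
where the last step follows from the fact that the random variables $D_j$ are iid by definition.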
As we now know the size distributions of the areas, we can prove Theorem 3.
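Proof. Lemma 2 shows that for every n ≠ m, $X_n(i) \overset{d}{=} X_m(i)$. As follows from (12), $X_k(i)$ is half the sum of two independent exponential random variables with parameter $\lambda_B$, that is, the sum of two independent exponential random variables with parameter $2\lambda_B$, which means that $X_k(i) \sim \text{Erlang}(2, 2\lambda_B)$. By the additivity property of the Erlang distribution, this means that $X_{\le k}(i) = \sum_{j=1}^{k} X_j(i) \sim \text{Erlang}(2k, 2\lambda_B)$. With this distribution for $X_{\le k}(i)$ we can find the degree distribution, by using (2):
$$ P(D_B = n) = \int_0^\infty \frac{(\lambda_A x)^n}{n!} e^{-\lambda_A x} \, \frac{(2\lambda_B)^{2k} x^{2k-1}}{(2k-1)!} e^{-2\lambda_B x} \, dx = \binom{n+2k-1}{n} \frac{\lambda_A^{n} (2\lambda_B)^{2k}}{(\lambda_A + 2\lambda_B)^{n+2k}}. $$
Again, to compare the degree distributions for different values of k, we derive the coefficient of variation of $D_B$:
$$ c_V = \frac{\sqrt{\mathrm{Var}(D_B)}}{E(D_B)} = \sqrt{\frac{2+\lambda}{2\lambda k}}, \qquad (17) $$
for $\lambda = \lambda_A/\lambda_B$. Interestingly, this coefficient of variation again decreases for increasing values of k, again as $1/\sqrt{k}$, similar to (10). Thus, the degree distribution of B-points becomes more concentrated with the same rate in k as for the hexagonal grid.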
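To make the compound Poisson-Erlang distribution concrete, the following Python sketch evaluates its probabilities and, separately, samples degrees by first drawing an Erlang$(2k, 2\lambda_B)$ area and then counting Poisson arrivals of rate $\lambda_A$ inside it. It assumes the negative-binomial closed form that follows from mixing a Poisson count over an Erlang area; the function names and parameter values are hypothetical.

```python
import math
import random

def poisson_erlang_pmf(n, k, lam):
    """P(D_B = n) for an Erlang(2k, 2*lam_b) area, with lam = lam_a / lam_b > 0."""
    r = 2 * k
    p = lam / (lam + 2.0)
    log_pmf = (math.lgamma(n + r) - math.lgamma(n + 1) - math.lgamma(r)
               + n * math.log(p) + r * math.log(1.0 - p))
    return math.exp(log_pmf)

def sample_degree(k, lam_a, lam_b, rng):
    """Draw one degree: Erlang area first, then a Poisson count on that area."""
    # Erlang(2k, 2*lam_b) as a sum of 2k exponentials with rate 2*lam_b.
    area = sum(rng.expovariate(2.0 * lam_b) for _ in range(2 * k))
    # Count arrivals of a rate-lam_a Poisson process on [0, area).
    count = 0
    position = rng.expovariate(lam_a)
    while position < area:
        count += 1
        position += rng.expovariate(lam_a)
    return count

if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_degree(2, 3.0, 1.0, rng) for _ in range(10000)]
    print("empirical mean:", sum(draws) / len(draws), "expected:", 2 * 3.0)
```

The sampler is a simple way to cross-check the closed form: the empirical mean should be close to $k\lambda$.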

2-dimensional case
We now investigate the setting where B-points are distributed as a 2-dimensional Poisson point process. In Figure 5, we plotted the k-th order areas of a random point in the Poisson point process. This figure shows that these areas are not equal and therefore we need to find a different expression for the degree distribution. First, we find an approximation of the area sizes in this setting, and then we show the approximated degree distribution of the 2-dimensional Poisson point process.
Even for k = 1, no exact expression for the size distribution of Poisson-Voronoi cells is known, although several approximations exist [11,14,28]. For the k-th order areas, we therefore resort to approximations. As in [14], we use a gamma distribution and fit the parameters of this distribution with a simulation, as the authors show that a simple 2-parameter gamma distribution is a fair approximation. We extend this parameter fit to higher-order Poisson-Voronoi areas.
In order to find the degree distribution, we need to find the sum of the sizes of the first to k-th order areas, as this is the area on which a B-point is among the k closest B-points, denoted by $X_{\le k}$. We fit the parameters $a_k$ and $b_k$ in the two-parameter gamma distribution:
$$ f_{X_{\le k}}(x) = \frac{b_k^{a_k}}{\Gamma(a_k)} x^{a_k - 1} e^{-b_k x}. \qquad (20) $$
We assume that the expected area $E(X_{\le k})$ is k for every k so that we can simplify (20) with $a_k = k \cdot b_k$. This simplifies the fit to only one parameter:
$$ f_{X_{\le k}}(x) = \frac{(a_k/k)^{a_k}}{\Gamma(a_k)} x^{a_k - 1} e^{-a_k x / k}. \qquad (21) $$
Table 1: Parameters $a_k = k \cdot b_k$ for the size distribution of $X_{\le k}$.
We simulated n Poisson-distributed points on a $\sqrt{n} \times \sqrt{n}$ square and obtained the k-th order area of every point in this square by fitting R-trees [5] with a given precision.
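The area simulation can be sketched in a few lines of Python. The version below is a brute-force Monte Carlo estimate and deliberately not the R-tree procedure of the paper: it samples uniform locations, finds the k nearest B-points for each location, and converts hit counts into area estimates. Using a torus to avoid boundary effects, a fixed number of points, and the function name are all assumptions of this sketch.

```python
import random

def estimate_leq_k_areas(points, side, k, n_samples, rng):
    """Estimate, for every point, the area on which it is among the k nearest.

    points: list of (x, y) tuples on a side x side torus; requires k <= len(points).
    """
    counts = [0] * len(points)
    for _ in range(n_samples):
        ux, uy = rng.uniform(0.0, side), rng.uniform(0.0, side)
        dists = []
        for idx, (px, py) in enumerate(points):
            dx = abs(ux - px)
            dx = min(dx, side - dx)  # wrap around horizontally
            dy = abs(uy - py)
            dy = min(dy, side - dy)  # wrap around vertically
            dists.append((dx * dx + dy * dy, idx))
        dists.sort()
        for _, idx in dists[:k]:
            counts[idx] += 1
    cell = side * side / n_samples  # area represented by one sample location
    return [c * cell for c in counts]

if __name__ == "__main__":
    rng = random.Random(0)
    n, side = 100, 10.0  # unit density, so the mean area should be about k
    pts = [(rng.uniform(0.0, side), rng.uniform(0.0, side)) for _ in range(n)]
    areas = estimate_leq_k_areas(pts, side, k=2, n_samples=5000, rng=rng)
    print("mean area:", sum(areas) / n)
```

Because every sample location contributes to exactly k points, the estimated areas always sum to k times the total area, which matches the assumption that the expected area is k at unit density.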
For the parameter fit, we did m = 0.7 million iterations of this algorithm with n = 100 points and a precision parameter of 0.1, which gives a sample of 7 million points. We used the $\chi^2$ goodness-of-fit test to find the best parameters $a_k = k \cdot b_k$ for the Gamma distribution of $X_{\le k}$, shown in Table 1. Figure 6 shows the goodness of fit of the area distribution. Considering we only used a single-parameter fit, the approximation gives an excellent fit for every k. With this area distribution, it is possible to find the degree distribution of B-points from (2). Since (21) assumes a unit density of B-points, so that $E(X_{\le k}) = k$, while in general the expected area equals k times the total area divided by the number of base stations, $k A_{tot}/|B| = k/\lambda_B$, we rescale x by $\lambda_B$ in (2) to get the desired expected value and thus the desired distribution:
$$ P(D_B = n) = \frac{\Gamma(n + a_k)}{n!\,\Gamma(a_k)} \left(\frac{\lambda}{\lambda + b_k}\right)^{n} \left(\frac{b_k}{\lambda + b_k}\right)^{a_k}, \qquad (23) $$
where $\lambda = \lambda_A/\lambda_B$ and $b_k = a_k/k$. We call this distribution the compound Poisson-Gamma distribution. The coefficient of variation of $D_B$ is as follows:
$$ c_V = \sqrt{\frac{1}{\lambda k} + \frac{1}{a_k}}, \qquad (27) $$
which again decreases for larger values of k, as $a_k$ also increases for larger values of k (see Table 1). Thus, the degree distribution of B-points concentrates at a higher rate in k than the 1-dimensional Poisson process and the hexagonal grid, for which the coefficient of variation decreases as $1/\sqrt{k}$.
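A minimal Python sketch of the compound Poisson-Gamma distribution is given below, assuming the shape-rate parametrisation with shape $a_k$ and rate $b_k = a_k/k$, and the rescaling $\lambda = \lambda_A/\lambda_B$. The value of $a_k$ used in the demo is purely illustrative and is not one of the fitted values of Table 1; the function names are hypothetical.

```python
import math

def poisson_gamma_pmf(n, k, a_k, lam):
    """P(D_B = n) for a Gamma(shape a_k, rate a_k / k) area at unit B-density."""
    b_k = a_k / k
    p = lam / (lam + b_k)
    log_pmf = (math.lgamma(n + a_k) - math.lgamma(n + 1) - math.lgamma(a_k)
               + n * math.log(p) + a_k * math.log(1.0 - p))
    return math.exp(log_pmf)

def poisson_gamma_cv(k, a_k, lam):
    """Coefficient of variation sqrt(1 / (lam * k) + 1 / a_k)."""
    return math.sqrt(1.0 / (lam * k) + 1.0 / a_k)

if __name__ == "__main__":
    a_demo = 3.5  # illustrative shape parameter, NOT a fitted value from Table 1
    print([round(poisson_gamma_pmf(n, 1, a_demo, 10.0), 4) for n in range(5)])
    print("c_V:", poisson_gamma_cv(1, a_demo, 10.0))
```

Setting $a_k = 2k$ recovers the 1-dimensional compound Poisson-Erlang case, which gives a convenient consistency check between the two distributions.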

Numerical results on the degree distributions
In this section, we compare our analytical results and approximations of the degree distributions of the hexagonal and Poisson grid against simulations.

Regular lattice
For the hexagonal grid, Theorem 2 shows that the degrees are Poisson distributed with parameter $\lambda_A k A_{tot}/N$ (7), where N is the number of points and $A_{tot}$ is the total area of the grid. In Figure 7, we plotted the simulated results together with the analytical Poisson degree distribution. This figure shows that the simulations follow this degree distribution well for all values of k.

Poisson point process
In Figure 8, we plotted the simulated degree distributions of the 2-dimensional Poisson process for k = 1, 5 and 50 with the one-parameter fit degree distribution for k = 1 and k = 5 as given in (23) and the analytical compound Poisson-Erlang distribution given in Theorem 3. This figure shows that the one-parameter fit for k = 1 and k = 5 fits well. Moreover, the compound Poisson-Erlang degree distribution, which was derived for the 1-dimensional Poisson process, fits reasonably well for the 2-dimensional Poisson process, especially for larger values of k. For large values of k, the simple 1-dimensional result can also be used instead of the more extensive Gamma distribution with the fitted parameters.

Real data
The final case we investigated is the area and degree distribution in a real-world network. We used base station data from OpenCelliD [1] from the Netherlands and focused on the city centre of Enschede (Figure 9). While these locations are clearly not distributed as a Poisson point process, we investigate to what extent our approximations for the degree distributions are valid under such non-Poissonian data.

Figure 9: 599 base stations in Enschede
Previous research showed that while base stations are non-Poissonian, investigating the network based on Poisson data can still work under so-called shadowing [7,23]. Shadowing in the path-loss model can cause perturbations in the observed signal at the user [23], which causes users to connect to the k base stations with the strongest signal instead of the k closest base stations. Therefore, we incorporate shadowing into this real, non-Poissonian data to investigate the quality of our degree distributions. We use log-normal shadowing [7], which gives a distance after shadowing $d^*(x, y)$ for base station x and user y, defined in terms of the real distance $d(x, y)$ between base station x and user y and a log-normal random variable $S_x(y)$ with mean 1. Then, users connect to the k base stations that have the lowest value of $d^*(x, y)$. In this real data setting, we simulated users with a Poisson process and calculated the degree of every base station. We show results for two different values of the shadowing variance: σ = 0.1 (weak shadowing) and σ = 1 (strong shadowing) in Figures 10 and 11. We plotted the simulations together with the compound Poisson-Erlang degree distribution (13) and, for k = 1 and k = 5, the fitted compound Poisson-Gamma degree distribution given in (23). Under weak shadowing (Figure 10), these Poisson-based degree distributions still fit reasonably well for k = 1, but the fit deteriorates significantly for larger k (Figures 10b-10c). We can conclude from this that assuming that base stations are distributed by a Poisson point process works well when users only connect to 1 base station, but quickly loses accuracy when users are connected to multiple base stations. However, when strong shadowing is taking place, as is the case in Figure 11, the compound Poisson degree distributions seem to fit better, not only for k = 1. This result implies that our results for the Poisson process can be accurate for a very wide range of spatial processes, when a source of randomness is present as well.
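The shadowed connection rule can be sketched as follows. This is an illustration under explicit assumptions: the multiplicative form $d^* = d \cdot S$ is assumed here for concreteness and may differ from the expression used in the paper, σ is interpreted as the standard deviation on the log scale, and all function names and coordinates are hypothetical. The log-normal variable is given mean 1 by choosing its log-scale mean as $-\sigma^2/2$.

```python
import math
import random

def shadowed_distance(d, sigma, rng):
    """Distance after shadowing, ASSUMING the multiplicative form d * S."""
    if sigma == 0.0:
        return d  # no shadowing: keep the real distance
    # Log-normal S with E[S] = exp(mu + sigma^2 / 2) = 1 for mu = -sigma^2 / 2.
    s = rng.lognormvariate(-0.5 * sigma * sigma, sigma)
    return d * s

def connect_with_shadowing(user, stations, k, sigma, rng):
    """Indices of the k stations with the lowest shadowed distance to the user."""
    scored = []
    for idx, (bx, by) in enumerate(stations):
        d = math.hypot(user[0] - bx, user[1] - by)
        scored.append((shadowed_distance(d, sigma, rng), idx))
    scored.sort()
    return [idx for _, idx in scored[:k]]

def station_degrees(users, stations, k, sigma, rng):
    """Degree of every station when each user connects to k stations."""
    degrees = [0] * len(stations)
    for user in users:
        for idx in connect_with_shadowing(user, stations, k, sigma, rng):
            degrees[idx] += 1
    return degrees

if __name__ == "__main__":
    rng = random.Random(0)
    stations = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(20)]
    users = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(200)]
    print(station_degrees(users, stations, k=2, sigma=1.0, rng=rng))
```

With σ = 0 the rule reduces to connecting to the k closest stations, so the same code covers both the shadowed and the unshadowed setting.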
The coefficients of variation for the degree distribution of the Poisson point process, the hexagonal grid as well as the real data are shown in Figure 12 for $\lambda_B = N/A_{tot} \approx 0.01$ and $\lambda_A = 0.1$. For the compound Poisson-Gamma distribution, we plotted the values of $c_V$ in (27) with linearly extrapolated $a_k$ for k > 5, using the values of Table 1. The behaviour of the coefficient of variation for the real grid with shadowing is similar to that of the Poisson point process, and decreases rapidly in k. The compound Poisson-Gamma distribution approximates this coefficient of variation closely, but the compound Poisson-Erlang distribution also works reasonably well as it only slightly overestimates $c_V$.
The behaviour of the coefficient of variation for the real grid without shadowing is different from the other three $c_V$'s depicted in Figure 12 and cannot be approximated by any of the three expressions given in (10), (17) and (27). Again, the coefficient of variation does slowly decrease, but it is still significantly larger than the other three $c_V$'s. This implies that for non-Poissonian data, larger values of k will still result in a more concentrated degree distribution, but not as concentrated as in the Poissonian data. This observation is also visible in Figure 10, as the degree distribution for k = 50 does not resemble a concentrated distribution.
In general, this plot shows that the coefficient of variation always decreases for larger values of k, which means that the load of all connections becomes more evenly balanced among all B-points. In the context of wireless networks, this can imply that all A-points, the users, will receive a more similar throughput, and the B-points, the base stations, will have similar degrees, which results in a fairer distribution of the resources over all users and base stations.

Conclusion
In this paper, we have derived degree distributions for AB random geometric graphs for different spatial distributions of B-points and Poisson-distributed A-points. In the case where B-points are placed in a hexagonal grid, we showed that the areas in which a B-point is k-th closest are equal for every point i ∈ B and every value of k in both one and two dimensions. With this observation, we derived the degree distribution of the B-points in one and two dimensions. In the case where B-points are distributed as a 1-dimensional Poisson point process, we derived the analytical size distribution of the areas in which a B-point is k-th closest, which resulted in a compound Poisson-Erlang degree distribution (13) for the degrees of all B-points. We fitted a one-parameter Gamma distribution to obtain the k-th closest area distribution in the case where B-points are distributed as a 2-dimensional Poisson point process for k ∈ {1, 2, 3, 4, 5}. This results in a compound Poisson-Gamma distribution for the degree distribution of B-points. For larger k, we showed that the simpler compound Poisson-Erlang degree distribution works well as an approximation for the degree distribution of B-points.
Moreover, we have shown that the coefficient of variation of the degree distributions for both the hexagonal grid model and the Poisson point process rapidly decreases for larger values of k. Therefore the degrees become more concentrated around the mean as k increases. This can have important implications for applications of AB random graphs. For example, for multi-connected cellular networks, this means that for large k, the load in the network becomes more evenly distributed (fairness). Investigating the extent to which fairness increases with increasing k is therefore an interesting topic for further research.
In a case study with real data of base station locations, we have shown that with strong shadowing, which introduces a source of randomness in the observed distance (Figure 11), our derived degree distributions for the 1-dimensional and the 2-dimensional Poisson point process approximate the real degree distribution well, even though these data are not distributed according to a Poisson point process.
This is in line with [7,15], in which the authors found that wireless networks appear to be Poisson under strong shadowing. Moreover, it can also be seen that for almost no shadowing (Figure 10), the degree distribution in the data behaves significantly differently from the Poisson case, especially for larger values of k. A reason for this could be that real base stations are not independently distributed over the area, whereas independence is a key property of the Poisson point process. This mismatch becomes more visible for larger degrees of multi-connectivity, as in this case more base stations and thus more dependencies need to be taken into account. This means that especially when one wants to investigate multi-connectivity in a network with little to no shadowing, it is important to investigate whether the Poisson point process is a suitable model for distributing the B-points. Even if it seems to fit well for k = 1, the fit for larger values of k may be significantly worse, as a comparison of Figures 10b-10c with Figures 8b-8c shows.
In this research, we have assumed that A-points are always distributed as a Poisson point process. However, in many application areas the locations of the A-points can also depend on the locations of the B-points. For example, A-points may follow a heterogeneous distribution instead of a homogeneous Poisson distribution, depending on the dense and less dense parts of the B-points in the spatial process. Deriving degree distributions and results on load balancing for those types of AB random graphs is an interesting topic for further research.