Letter

Isolation statistics in temporal spatial networks

Carl P. Dettmann and Orestis Georgiou

Published 5 October 2017 Copyright © EPLA, 2017
Citation: Carl P. Dettmann and Orestis Georgiou 2017 EPL 119 28002, DOI 10.1209/0295-5075/119/28002

0295-5075/119/2/28002

Abstract

The reliability and robustness of infrastructure networks are important problems requiring network models with nodes at fixed locations and links that break and reform with time. These temporal spatial networks are however difficult to analyse and understand due to the coexistence of short- and long-range links and inherent temporal correlations. We provide a mathematically tractable framework to analytically study the isolation statistics responsible for disconnecting spatial networks. Small-world effects and temporal correlations are also incorporated in our framework as we investigate the distribution of the time needed for information packets to be able to reach the whole network.


Introduction

Many kinds of complex networks such as transport, power, social and neuronal networks are spatial in character [1], that is, the nodes and perhaps also the links have a physical location. Geometry can structure the network in that the probability of a link between two nodes is related to their mutual distance. Longer links are typically more expensive to build, maintain, or operate (e.g., motorways) than shorter ones, yet offer fast information flow or transport through different parts of the network. Consequently, spatially embedded networks often exhibit interesting topological features such as clustering [2], modularity [3], or small worldness [4]. Spatial structure is also particularly important in many multiplex [5] and/or dynamic [6] networks. This is of great importance when considering infrastructure networks (power, road, rail, telecommunications, water, and gas) which we rely on, today more than ever, to be reliable, efficient, and robust against random failures and attacks [7,8].

This letter focuses on temporal [9] and small-world [10] spatial networks and, unlike studies which are data or algorithmically driven, derives deep analytical results relating to the connectivity. We analytically obtain statistics (all moments and the density) for the probability of isolated (i.e., disconnected) nodes in networks with local or small-world connection models, static or with temporally uncorrelated or correlated links. Isolation statistics are vital for understanding bottlenecks in transporting people, resources or information throughout the network. The enabling approach leverages tools from stochastic geometry [11] and statistical mechanics via the probability generating functional to extract spatial averages and Ising lattices to model time-correlated links.

Consider spatial networks in which the nodes remain fixed, but for which the links break and reconnect, thus forming a temporal network. For example, this is a good model of a wireless ad hoc network where nodes (devices) communicate directly with each other rather than with a central router and where their locations may be considered random; examples include smart-grid [12], sensor [13] and vehicular [14] networks, the Internet of Things [15], or smartphone networks interconnected via Wi-Fi Direct (see applications such as FireChat [16]). In most of these examples, the link success probability decreases with the distance between nodes (see eq. (1) below). Moreover, the wireless communication channel exhibits rapid fading, so that some time later, the state of the system has the same distance-dependent link probabilities, independently or in a time-correlated manner. Note that the rapid fading is on a much shorter timescale than any mobility of the nodes, which we thus regard as fixed in space.

As an extension to the local structure displayed by ad hoc networks with distance-dependent link success probabilities, we consider networks with non-local connections. Drawing inspiration from the zoo of wireless systems, Hybrid and Device-to-Device (D2D) cellular networks [17] are two examples exhibiting such non-local structure. Here, wireless devices may employ multiple technologies; for instance, most smartphones today have both Wi-Fi Direct and cellular capabilities. The latter technology has a much larger range than the former, but is mediated by base stations and a wired infrastructure. Therefore, two devices separated by much larger distances can communicate indirectly via the cellular network to establish a very non-local connection, essentially forming a small-world network [10] where link success probabilities have both a local and a non-local component (see eq. (2) below). Such non-local connections are often not cheap (due to the service operator) and therefore Hybrid and D2D cellular network devices typically try to balance or optimise their connectivity subject to various technical constraints.

In the absence of available data from network operators, software simulators like ns3 [18], Cooja or Qualnet can be useful towards better understanding these large-scale networks. Indeed, wireless simulators are supported by active open-source communities or even network operators themselves and are therefore programmed to mimic network layer protocols, infrastructure topologies, channel fading effects, and even user mobility patterns for both wireless and wired networks. Some wireless data is available for scientific research (e.g., ref. [19]), however it often does not include spatial information of the network infrastructure, or of the mobile devices, thus making it difficult to extract the spatio-temporal connectivity information that we are interested in; obtaining and incorporating spatial information in future data sets would be of great value, to this and other spatial network research.

Our analytical approach to temporal, spatial, and small-world networks, whilst being inspired by wireless systems, is more general and can be applied to many other spatial networks with a variety of (non-)local link probabilities and time correlations such as power networks [7] and social networks [20]. The supplementary material animated_graph.gif contains an animation, of which the final frame is shown in fig. 1. We are primarily interested in how the link probability and temporal correlations affect the time required to distribute information packets throughout the network, limited by the isolated nodes.


Fig. 1: (Colour online) The supplementary file animated_graph.gif shows a temporal spatial network with no temporal correlations in a disk of radius 6, node density $\rho=2$ and path loss exponent $\eta=2$ . The network on the left is the local model $\wp=0$ whilst the one on the right has some small-world links, $\wp=0.1$ . When not connected, colours indicate connected clusters. We observe that despite the same overall isolation probabilities, eq. (15), the small-world variant is more often connected.


Network model

We distribute nodes in space according to the following spatial network model [21]: Place nodes in space according to a Poisson Point Process (PPP) with intensity measure Λ in d-dimensional space $\mathbb{R}^d$ . This means that the number of nodes in a bounded set $A\subset\mathbb{R}^d$ is Poisson distributed with mean $\Lambda(A)$ and independent of the number of nodes in any set B disjoint with A. The average number of nodes is $\bar{N}=\Lambda(\mathbb{R}^d)$ , possibly infinite. In uniform measures, $\Lambda(A)=\rho \text{Vol}(A)$ , where ρ is the (constant) density and $\text{Vol}(A)$ is the volume of A. In this case, we often replace $\mathbb{R}^d$ by a cube $[0,L]^d$ with opposite faces identified (a flat torus) and $\bar{N}=\rho L^d$ . Another geometry where boundaries are absent is the surface of a sphere. Boundary effects are discussed in the conclusion.
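As a minimal sketch of the uniform case, the following Python samples a PPP of constant density on the flat torus using only the standard library. The function names, the exponential-arrival Poisson sampler and the seed are illustrative choices and are not taken from the original text.

```python
import math
import random

def sample_poisson(mean, rng):
    """Draw a Poisson variate by counting unit-rate exponential arrivals in [0, mean]."""
    count, total = 0, rng.expovariate(1.0)
    while total <= mean:
        count += 1
        total += rng.expovariate(1.0)
    return count

def sample_ppp_torus(rho, L, d, rng):
    """Uniform PPP of density rho on the cube [0, L]^d (opposite faces identified)."""
    n = sample_poisson(rho * L**d, rng)  # Poisson number of nodes with mean rho * L^d
    return [tuple(rng.uniform(0.0, L) for _ in range(d)) for _ in range(n)]

def torus_distance(a, b, L):
    """Euclidean distance on the flat torus: each coordinate wraps with period L."""
    return math.sqrt(sum(min(abs(x - y), L - abs(x - y)) ** 2 for x, y in zip(a, b)))

rng = random.Random(1)  # fixed seed so the sketch is reproducible
nodes = sample_ppp_torus(rho=2.0, L=6.0, d=2, rng=rng)
```

The parameters `rho=2.0` and `L=6.0` loosely echo the scale of fig. 1, which uses a disk rather than a torus; the torus is used here only because it avoids boundaries.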

Links between each pair of nodes with locations ξ and $\xi'$ form independently with probability $\phi(\xi,\xi')$ . Here we consider random connection models (RCM), for which $\phi(\xi,\xi')=H(|\xi-\xi'|)$ where $|.|$ denotes the Euclidean (or in general some other) length and $H{:}~[0,\infty)\to[0,1]$ is called the connection function. It is possible with node locations and links to synthesise a connection function for any spatial network, and thus model it as an RCM, assuming link independence. If the link probability is one for $|\xi-\xi'|\leq 1$ and zero otherwise, the only source of randomness is in the node locations. This is the original Random Geometric Graph (RGG) model [22]. On the other hand, if ϕ lies strictly between 0 and 1, there are two sources of randomness, in the node locations and the links. Here, we fix the node locations ("quenched disorder"), and study the randomness due to the links, as in the above temporal wireless network application. This system has also been considered using graph entropy [23].

In the case of wireless communication networks, there are physical theories of the communication channel leading to a variety of connection functions; see refs. [24–26]. We start with a simple model and later adapt it to include small-world features: Assume diffuse scattering of the wireless signal (Rayleigh fading), which leads to an exponentially distributed channel gain $|h|^2$ . The signal power decays as $r^{-\eta}$ where $\eta\in[2,6]$ is called the path loss exponent. Free propagation gives the inverse square law $\eta=2$ , whilst more cluttered environments have a faster decay of the signal (larger η). A link may be made if the signal-to-noise ratio, proportional to $|h|^2 r^{-\eta}$ , reaches a given threshold, leading to the connection probability

$$H(r)=\exp\left[-(r/r_0)^{\eta}\right] \qquad (1)$$

for some constant $r_0$ that determines the lengthscale; we measure the length in these units and so take $r_0=1$ hereafter. The $\eta\to\infty$ limit gives the RGG model [22].

We construct a "small-world" version of this or other short-ranged models in a finite geometry (such as the flat torus) by randomly rewiring links with a probability $\wp$, as in the original Watts-Strogatz model [10]. This is well approximated by modifying the connection function to

$$H(r)=(1-\wp)H_0(r)+\frac{\wp\mu}{L^d} \qquad (2)$$

where $H_0$ is the original short-ranged connection function and we assume $L\gg r_0$ . The constant μ is chosen to keep the average number of links fixed; for constant density ρ we have $\mu\approx S_d\int_0^\infty H_0(r)r^{d-1}\text{d}r$ , and ρμ is the mean degree. Here, $S_d$ is the total (solid) angle in d dimensions, so $S_1=2$ , $S_2=2\pi$ and $S_3=4\pi$ .

When $\wp=0$ we recover the original short-ranged RCM models, whilst for $\wp=1$ we obtain an Erdős-Rényi (ER) random graph. In both cases, links are independent, however in the former there is a strong sense of spatial structure and thus correlation between the links of nearby nodes. Therefore, through eq. (2) we can interpolate between local RCMs and completely non-spatial ER graphs.
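The interpolation between the local model and the ER graph can be sketched in a few lines of Python. The sketch assumes the small-world connection function has the form $(1-\wp)H_0(r)+\wp\mu/L^d$, which is consistent with the limits just described and with the quantities α and β defined later in the text; the function names and parameter values are hypothetical.

```python
import math

def rayleigh_H0(r, eta):
    """Local Rayleigh-fading connection function in units where r_0 = 1."""
    return math.exp(-r**eta)

def solid_angle(d):
    """Total solid angle S_d = 2 pi^(d/2) / Gamma(d/2): S_1 = 2, S_2 = 2 pi, S_3 = 4 pi."""
    return 2.0 * math.pi ** (d / 2) / math.gamma(d / 2)

def connectivity_mass(d, eta):
    """mu = S_d * Gamma(d/eta) / eta, the integral of the Rayleigh H_0 over space."""
    return solid_angle(d) * math.gamma(d / eta) / eta

def small_world_H(r, eta, d, L, p):
    """Assumed small-world form: weight 1 - p on local links plus a uniform term p*mu/L^d."""
    return (1.0 - p) * rayleigh_H0(r, eta) + p * connectivity_mass(d, eta) / L**d

# Illustrative parameters (not from the paper): d = 2, eta = 2, L = 20, p = 0.1
d, eta, L, p, r = 2, 2.0, 20.0, 0.1, 0.8
mu = connectivity_mass(d, eta)
alpha = 1.0 - p * mu / L**d
beta = (1.0 - p) / alpha
# Probability of *no* link, written two ways: 1 - H and alpha * (1 - beta * H_0)
no_link_direct = 1.0 - small_world_H(r, eta, d, L, p)
no_link_factored = alpha * (1.0 - beta * rayleigh_H0(r, eta))
```

The factored form is what makes the later generalisation of the Roman harmonic number pick up a factor $\beta^j$.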

Moments of the isolation probability

To understand temporal networks, we first analyse the instantaneous probability that a node is isolated, that is, it has no links. This depends on the locations of nearby nodes (see fig. 2). Considering all nodes together, there is a distribution of isolation probabilities. The isolation probability of a node at ξ in configuration X of the PPP is

$$P_{iso}(\xi,X)=\prod_{\xi'\in X}\left[1-H(|\xi-\xi'|)\right] \qquad (3)$$
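For a fixed configuration of nodes, the isolation probability of each node is a product over all other nodes of the probability that the corresponding link is absent. A minimal sketch, assuming periodic (torus) distances and a user-supplied connection function `H`; the three-node configuration is purely illustrative:

```python
import math

def isolation_probabilities(nodes, L, H):
    """P_iso for each node: product over all other nodes of 1 - H(torus distance)."""
    def dist(a, b):
        return math.sqrt(sum(min(abs(x - y), L - abs(x - y)) ** 2 for x, y in zip(a, b)))
    probs = []
    for i, a in enumerate(nodes):
        p = 1.0
        for j, b in enumerate(nodes):
            if i != j:
                p *= 1.0 - H(dist(a, b))  # link (i, j) must be absent
        probs.append(p)
    return probs

def H(r):
    """Rayleigh connection function with eta = 2 and r_0 = 1."""
    return math.exp(-r**2)

nodes = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
probs = isolation_probabilities(nodes, L=10.0, H=H)
```

The double loop costs $O(N^2)$ evaluations, which is adequate for configurations of the size shown in the figures.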

Note that in a PPP, the distribution of points found by conditioning on one node's position is unaffected (the Palm distribution of a PPP is the same PPP); see ref. [11] for the theory of PPPs. To find the distribution of $P_{iso}$, we use the probability generating functional (PGF)

$$G[u]=\mathbb{E}\left(\prod_{\xi\in X}u(\xi)\right)=\exp\left(-\int\left[1-u(\xi)\right]\Lambda(\text{d}\xi)\right) \qquad (4)$$

for the arbitrary function $u(\xi)$ where the first equality is the definition and the second follows for a PPP. The function needs to satisfy some mild conditions, for example a) $\bar{N}<\infty$ , or b) $u\in[0,1]$ and $\int|\log u(\xi)|\Lambda(\text{d}\xi)<\infty$ as the case here. We obtain the ν-th moment for $\nu>0$ :

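$$\mathbb{E}(P_{iso}^\nu)=\exp\left(-\int\left[1-\left(1-H(|\xi-\xi'|)\right)^\nu\right]\Lambda(\text{d}\xi')\right) \qquad (5)$$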

For $\nu=1$ the exponent is just the integral of H, referred to as the connectivity mass, and important for understanding the (multihop) connection probability of a wireless ad hoc network when $d\geq 2$  [25].


Fig. 2: (Colour online) A Poisson Point Process with density $\rho = 2$ . Nodes are coloured by isolation probability, using the Rayleigh connection function, eq. (1) with $\eta = 2$ and periodic boundary conditions.


In the case of a constant density ρ this simplifies to

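$$\mathbb{E}(P_{iso}^\nu)=\exp\left(-\rho S_d\int_0^\infty\left[1-(1-H(r))^\nu\right]r^{d-1}\text{d}r\right) \qquad (6)$$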

which is independent of the location of the node. Now, for Rayleigh fading (eq. (1)), $\mu=S_d\Gamma(d/\eta)/\eta$ and

$$\mathbb{E}(P_{iso}^\nu)=\exp\left(-\rho\mu H_\nu^{(d/\eta)}\right) \qquad (7)$$

where $H_\nu^{(s)}=\frac{1}{\Gamma(s)}\int_0^\infty(1-(1-e^{-x})^\nu)x^{s-1}\text{d}x$ . For integer $\nu=n$ we can expand the parentheses to yield a finite sum $H_n^{(s)}=\sum_{j=1}^n(-1)^{j-1} \binom{n}{j} j^{-s}$ which is known as the Roman harmonic number [27]. For general ν, numerical integration is efficient and stable, whilst asymptotically expanding the integral for large ν gives

Equation (8)

where $\gamma\approx 0.5772$ is the Euler constant. Thus, we have

Equation (9)

where $V_d=S_d/d$ is the volume of the unit ball in d dimensions. When $\eta=d$ , we have a further simplification

Equation (10)

using now the standard expansion of the digamma function $\psi(x)=\frac{\Gamma'(x)}{\Gamma(x)}$ . For integer moments $\nu=n$ , $H_n^{(1)}=H_n=\sum_{j=1}^n j^{-1}$ , the usual harmonic number.
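The integer moments can be evaluated directly from the finite sum. The sketch below assumes that, for Rayleigh fading at constant density, the moments take the form $\exp(-\rho\mu H_n^{(d/\eta)})$, which follows from substituting $x=r^\eta$ in the radial integral; the function names are hypothetical.

```python
import math

def roman_harmonic(n, s):
    """H_n^{(s)} = sum_{j=1}^n (-1)^(j-1) * C(n, j) * j^(-s) for a positive integer n."""
    return sum((-1) ** (j - 1) * math.comb(n, j) * j ** (-s) for j in range(1, n + 1))

def moment_rayleigh(n, rho, d, eta):
    """Assumed n-th moment of P_iso: exp(-rho * mu * H_n^{(d/eta)})."""
    S_d = 2.0 * math.pi ** (d / 2) / math.gamma(d / 2)
    mu = S_d * math.gamma(d / eta) / eta
    return math.exp(-rho * mu * roman_harmonic(n, d / eta))

# For s = 1 the alternating sum reduces to the usual harmonic number
h5_alternating = roman_harmonic(5, 1)
h5_direct = sum(1.0 / j for j in range(1, 6))
```

In floating point the alternating binomial sum loses accuracy as n grows because of cancellation, so for large n exact rational or high-precision arithmetic would be the safer choice.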

Generalizing the above calculations for the small-world model (eq. (2)) with $\wp>0$ on a large torus of size L, we integrate the full connection function over a large ball of radius $R<L/2$ , and a constant for the rest of the torus

Equation (11)

where

Equation (12)

and we have defined $V(R)=(1-\alpha^\nu)(L^d-V_dR^d)$ , $\alpha=1-\frac{\wp\mu}{L^d}$ and $\beta=\frac{1-\wp}{\alpha}$ . For ν an integer n

Equation (13)

Now, in the limit of large R (and hence L), the approximation in (11) becomes exact, the ratio of gamma functions tends to unity, leading to

Equation (14)

which is now independent of R. The Roman harmonic number has been generalised to include a factor $\beta^j$ .

If we take $\nu=1$ , the result is simply

$$\mathbb{E}(P_{iso})=\exp(-\rho\mu) \qquad (15)$$

which is independent of $\wp$, showing that the overall isolation probability is unaffected by the rewiring.

Conversely, the large ν limit gives $\mathbb{E}(P_{iso}^\nu)\approx\exp(-\rho (1-\alpha^\nu)L^d)$ , physically that high moments (corresponding to the most isolated nodes) are dominated by the constant term in eq. (2), not the local environment. All these approximations prove useful below.

Distribution of the isolation probability

We now turn to the probability density function (pdf) of $P_{iso}$, which we denote f(x) with $x\in(0,1)$ . In principle this is an inverse Mellin transform

Equation (16)

but is intractable both analytically and numerically, even for $\wp=0$ and $\eta=d$ , eq. (10).

Extracting a general distribution on a finite interval from integer moments $\mathbb{E}(P_{iso}^n)$ is called the Hausdorff moment problem, and the solution is unique if it exists. For an effective numerical approach we follow Mnatsakanov [28], who gives an approximation depending on a positive integer parameter χ:

$$f_\chi(x)=\frac{\Gamma(\chi+2)}{\Gamma(a+1)}\sum_{m=0}^{\chi-a}\frac{(-1)^m\,\mathbb{E}(P_{iso}^{m+a})}{m!\,(\chi-a-m)!} \qquad (17)$$

The function depends on x only through $a=\lfloor \chi x\rfloor$ , and is hence piecewise constant for any fixed χ. It converges to the correct function as $\chi\to\infty$. It is possible to use this to get a numerical approximation to f(x), using high-precision arithmetic to overcome cancellations, see fig. 3.
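A moment-recovery estimate of this kind can be sketched with exact rational arithmetic, which sidesteps the cancellation problem entirely when the moments are rational. The sketch assumes the estimator $\frac{\Gamma(\chi+2)}{\Gamma(a+1)}\sum_{m=0}^{\chi-a}\frac{(-1)^m M_{m+a}}{m!(\chi-a-m)!}$ with $M_n$ the n-th moment, which is the expectation of a beta-type kernel peaked near $a/\chi$; it is tested on the uniform distribution, whose moments $1/(n+1)$ are chosen only as a check and are not from the paper.

```python
from fractions import Fraction
from math import factorial

def moment_recovered_pdf(moments, chi, x):
    """Piecewise-constant density estimate at x from moments M_0, ..., M_chi (exact Fractions)."""
    a = int(chi * x)  # floor(chi * x) for x in [0, 1]
    total = sum(
        Fraction((-1) ** m) * moments[m + a] / (factorial(m) * factorial(chi - a - m))
        for m in range(chi - a + 1)
    )
    # Gamma(chi + 2) / Gamma(a + 1) = (chi + 1)! / a!
    return Fraction(factorial(chi + 1), factorial(a)) * total

chi = 20
uniform_moments = [Fraction(1, n + 1) for n in range(chi + 1)]
# Evaluate at bin centres so that floor(chi * x) is unambiguous in floating point
estimate = [moment_recovered_pdf(uniform_moments, chi, (k + 0.5) / chi) for k in range(chi)]
```

For the isolation problem the moments are exponentials rather than rationals, so a high-precision floating-point type would replace `Fraction`; the structure of the sum is unchanged.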


Fig. 3: (Colour online) The pdf of $P_{iso}$, by direct simulation in d = 2 (points) and using eqs. (7), (17) with $\chi=10^3$ (black solid lines). The parameters are $\rho=0.5$ , $\eta=2$ and $\wp=0$ , with each varied in turn. In the upper plot for x > 0.6, the first term of eq. (19) is shown as a coloured line. For $\wp=1$ (Erdős-Rényi model), the isolation probability is a function only of the number of nodes, and hence is discrete with the appropriate normalisation; this requires $\chi=10^4$ due to slower convergence. The insets show the behaviour near x = 0 and x = 1 on a log-log scale.


We see expected trends as a function of the various parameters, in particular, isolation probabilities are decreased with density ρ, and increased with path loss exponent η. For $\wp=0$ we see that f(x) is singular at x = 0 or x = 1 or both; when $\eta=d=2$ it is almost symmetrical at $\rho=0.22$ . It is never quite symmetrical: For $\mathbb{E}(P_{iso})=1/2$ we must have $\rho=\frac{\ln 2}{\pi}\approx 0.220636$ and then the third central moment is $2^{-11/6}-3\times 2^{-5/2}+2^{-2}\approx 0.000285\neq 0$ . For $\wp>0$ the rewiring concentrates the distribution of isolation probabilities toward that of an Erdős-Rényi model, whilst fixing the overall isolation probability, eq. (15).
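The near-symmetry claim can be checked by a short calculation. The steps below assume the integer moments for $\eta=d=2$ and $\wp=0$ take the form $\exp(-\rho\mu H_n)$ with $\mu=\pi$ and $H_n$ the usual harmonic number; the shorthand $m_n$ is introduced here for convenience.

```latex
\begin{align*}
\rho\mu &= \rho\pi = \ln 2, \qquad
m_n \equiv \mathbb{E}(P_{iso}^n) = e^{-\rho\mu H_n} = 2^{-H_n},\\
H_1 &= 1,\quad H_2 = \tfrac{3}{2},\quad H_3 = \tfrac{11}{6}
\;\;\Longrightarrow\;\;
m_1 = 2^{-1},\quad m_2 = 2^{-3/2},\quad m_3 = 2^{-11/6},\\
\mathbb{E}\big[(P_{iso}-m_1)^3\big]
 &= m_3 - 3m_1m_2 + 2m_1^3
 = 2^{-11/6} - 3\times 2^{-5/2} + 2^{-2}.
\end{align*}
```

Since this combination does not vanish, the distribution with mean $1/2$ retains a small positive skewness.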

For $\wp=0$ and $\eta=d$ , the large ν asymptotics from eq. (9) gives information on f(x) near x = 1, the distribution of highly isolated nodes. Making an ansatz $f(1-\epsilon)=\sum_{i=0}^\infty g_i \epsilon^{\delta+i}$ , multiplying by $(1-\epsilon)^\nu\approx e^{-\nu\epsilon}$ for $\nu\gg1$ , and integrating gives

Equation (18)

which by comparison with eq. (9) yields

Equation (19)

We find that the first term already gives an accurate approximation, and is illustrated in fig. 3.

Still for $\wp=0$ , we can take $d\to\infty$. If η is proportional to d, we see from eq. (7) that the only effect is to change the density ρ. If η is constant, we find $H_{\nu}^{(d/\eta)}\approx\nu$ , so that $\mathbb{E}(P_{iso}^\nu)\approx x_*^\nu$ with $x_*=e^{-\rho\mu}$ (cf. eq. (15)) and so it corresponds to a distribution that is sharply peaked at $x=x_*$ , similar to the effect of rewiring $(\wp>0)$ .

For general $\wp$ we can also consider the RGG limit of ${\eta\to\infty}$ , in which $\mu\to V_d$ . For $\wp=0$ , eq. (9) gives $\mathbb{E}(P_{iso}^\nu)=\exp[-V_d\rho]$ as expected; it is independent of the moment ν since the isolation probability of any node is either zero or one, and the exponential is simply the probability of no nodes in a volume $V_d$ in a uniform PPP. For general $\wp$, we have from eq. (2) that there are $N_c\sim Po(L^d\rho v)$ close nodes with link probability $P_c=1-\wp(1-v)$ with $v=V_d/L^d$ , and $N_d\sim Po(L^d\rho(1-v))$ distant nodes with link probability $P_d=\wp v$ . Thus, $P_{iso}$ has a discrete distribution given by

Equation (20)

Taking $\wp=1$ we find the ER result

Equation (21)

which applies generally (independently of the original connection model), and is illustrated in the lower panel of fig. 3.

Isolation in temporal networks

Having analysed the pdf of isolation for RCMs, we return to the temporal network problem, assuming $\wp=0$ , d > 1, constant density and neglecting boundary effects. Here, we consider a discrete time model. This is natural for wireless applications using Time Division Multiple Access (TDMA) protocols, or more generally determined by the timescale used to send a message over a link. First, consider links that are independent in time: The network chooses links anew each time step. To ensure information can reach every node, we require that no node is isolated for a time interval of T time steps. At each time step, the isolation probabilities of the nodes are a PPP on $[0,1]$ with intensity $\text{d}\Lambda=\bar{N}f(x)\text{d}x$ , neglecting local correlations. The probability that a node with $P_{iso}=x$ is isolated for T consecutive time steps is simply $x^T$.

More generally and realistically, we also consider time-correlated links. This is natural if the message passing timescale is of the same order of, or much less than, the timescale over which the link exists. Taking the time step small, we can also use time-correlated links to attain the limit of continuous time processes. Following ref. [29] we treat the links as independent one-dimensional Ising models at equilibrium, where the (large but finite) Ising lattice corresponds to time. The Ising system with nearest-neighbour interactions is a well-studied model in statistical physics with Hamiltonian

$$\mathcal{H}=-J\sum_k\sigma_k\sigma_{k+1}-h\sum_k\sigma_k \qquad (22)$$

where J is the interaction strength, h the magnetic field, and the spins are $\sigma_k=\pm 1$ . At finite temperature the correlation length is finite, and the thermodynamic limit does not depend on the boundary conditions. This model is easily solvable; the probability of a spin being −ve is

Equation (23)

and conditional on this, that its neighbour is −ve is

Equation (24)

Here $K=\beta J$ and $B=\beta h$ with β the inverse temperature as usual. These two probabilities completely specify the equilibrium state, which consists of +ve and −ve domains with independent exponentially distributed lengths.
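The two probabilities can be computed numerically from the standard transfer-matrix solution of the one-dimensional Ising chain. The sketch below assumes the usual convention of a Hamiltonian $-J\sum_k\sigma_k\sigma_{k+1}-h\sum_k\sigma_k$ and works in the thermodynamic limit; the function name is hypothetical, and the closed forms in the text may be written differently while being equivalent.

```python
import math

def ising_link_probabilities(K, B):
    """Return (p_minus, p_minus_minus) for the 1D Ising chain at equilibrium.

    Uses the symmetric 2x2 transfer matrix with entries exp(K*s*s' + B*(s + s')/2).
    """
    t_pp = math.exp(K + B)   # entry for neighbouring spins (+, +)
    t_mm = math.exp(K - B)   # entry for neighbouring spins (-, -)
    t_pm = math.exp(-K)      # entry for (+, -) and (-, +)
    # Largest eigenvalue of [[t_pp, t_pm], [t_pm, t_mm]]
    lam = 0.5 * (t_pp + t_mm) + math.sqrt(0.25 * (t_pp - t_mm) ** 2 + t_pm ** 2)
    # Corresponding unnormalised eigenvector (v_plus, v_minus)
    v_plus, v_minus = t_pm, lam - t_pp
    # Single-site probability is the squared, normalised eigenvector component
    p_minus = v_minus ** 2 / (v_plus ** 2 + v_minus ** 2)
    # Conditional probability that the next spin is - given that this one is -
    p_minus_minus = t_mm / lam
    return p_minus, p_minus_minus

p_m, p_mm = ising_link_probabilities(K=0.7, B=-0.2)   # correlated links
p0, p00 = ising_link_probabilities(K=0.0, B=-0.3)     # uncorrelated links
```

With `K=0.0` the two returned values coincide, reflecting links that are independent in time, while a positive `K` makes an absent link more likely to stay absent at the next step.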

In the network context, $\sigma_k$ indicates the presence $(+)$ or absence $(-)$ of a link at time k. We fix B to be small and negative, then inverting eq. (23) yields

Equation (25)

where we have suppressed the dependence of K and $p_-$ on the nodes i and j forming the link. The highly isolated limit then corresponds to $p_-\to1\ (K\to\infty)$ , where

Equation (26)

Equation (27)

Multiplying over all nodes, the isolation probability of node i is $x_i=\prod_{j\neq i}p_-(i,j)$ and so the probability that node i is isolated for T consecutive time steps is $x_i^\tau$ with

Equation (28)

for $T\geq1$ . For the independent case we have K = 0 which gives $p_{--}=p_-$ and hence $\tau=T$ as before.
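For fixed node locations, the probability that no node stays isolated throughout the window is the product over nodes of $1-x_i^\tau$. A minimal sketch for the independent case $\tau=T$, with hypothetical function names and an illustrative two-node list of isolation probabilities:

```python
def prob_no_node_isolated(iso_probs, tau):
    """Product over nodes of (1 - x_i^tau), for fixed node locations."""
    prob = 1.0
    for x in iso_probs:
        prob *= 1.0 - x ** tau
    return prob

def steps_needed(iso_probs, target):
    """Smallest T with product >= target, assuming independent links (tau = T).

    Requires every x_i < 1 and target < 1, otherwise the loop would not terminate.
    """
    T = 1
    while prob_no_node_isolated(iso_probs, T) < target:
        T += 1
    return T

iso_probs = [0.5, 0.5]
p_two_steps = prob_no_node_isolated(iso_probs, 2)
T_half = steps_needed(iso_probs, 0.5)
```

Because each factor approaches one only as fast as its $x_i^T$ decays, the node with the largest isolation probability dominates the required number of steps.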

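Denoting the event that no node is isolated for T time steps by $C_T$, we can use the PGF (noting that $\bar{N}$ is finite). We write $\mathbb{P}(C_T|X)=\prod_i(1-x_i^\tau)$ and then average over configurations X of the PPP,

$$\mathbb{P}(C_T)=\mathbb{E}\left(\prod_i(1-x_i^\tau)\right)=\exp\left(-\bar{N}\int_0^1x^\tau f(x)\,\text{d}x\right) \qquad (29)$$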

But the integral in (29) is just $\mathbb{E}(P_{iso}^\tau)$ . For general $\wp$ we can use eq. (14), which exhibits a variety of behaviours. It is instructive to consider the two extreme cases: For $\wp=0$ (local connections) we have eq. (9):

Equation (30)

from which we find that the time T to ensure all nodes are connected at least once is given by eq. (28) with

Equation (31)

Equation (30) has been confirmed by numerical simulation; see fig. 4, which shows that long times are required to have high confidence in all nodes being connected, particularly at low density ρ. Thus, for low density the required time T grows as a stretched exponential, controlled by the path loss exponent η. When $\eta=d$ , it reduces to simply $\tau\approx(\rho L^d)^{1/(V_d\rho)}$ , with the probability distribution determined by extreme behaviour due to the highly isolated nodes as in eq. (19).
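The scale $\tau\approx(\rho L^d)^{1/(V_d\rho)}$ quoted for $\eta=d$ is easy to evaluate, which makes the sensitivity to density concrete. The sketch below is an order-of-magnitude illustration only; the function name and the sample densities are assumptions, with $L=20$ and $d=2$ taken from the parameters of fig. 4.

```python
import math

def tau_scale(rho, L, d):
    """Characteristic tau ~ (rho * L^d)^(1 / (V_d * rho)) for eta = d and no rewiring."""
    V_d = math.pi ** (d / 2) / math.gamma(d / 2 + 1)  # volume of the unit ball
    return (rho * L**d) ** (1.0 / (V_d * rho))

tau_low = tau_scale(rho=0.2, L=20.0, d=2)    # sparse network
tau_high = tau_scale(rho=0.5, L=20.0, d=2)   # denser network
```

The exponent $1/(V_d\rho)$ grows as the density falls, so halving the density increases the characteristic time by far more than a factor of two.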


Fig. 4: (Colour online) Probability that no nodes will remain isolated for time T. Here, $\wp=0$ , $d=\eta=2$ , L = 20, $10^4$ configurations: points. Theoretical curves, eq. (30).


The other interesting special case is that of $\wp=1$ , the ER graphs. Irrespectively of the original connection function, the link probability is $H(r)=\mu/L^d$ , independent of r. Here, the isolation probability depends only on the total number of nodes N, a global variable, and the PGF fails. We have instead

Equation (32)

from which the connection time is

Equation (33)

Thus, we have developed simple analytic expressions which characterise the probability that a node will remain isolated in this temporal network.

Strictly speaking, all our results for isolation probabilities also apply to one-dimensional networks. However, in this case, transmission of information is limited by large gaps, rather than isolated nodes. In the RGG, transmission can occur if and only if there are no gaps larger than the link range $r_0$, see ref. [30]. For RCMs, it is quite likely that links may be made between nodes that are not directly adjacent to the gaps, and estimating even the instantaneous connection probability remains an open problem.

Conclusion

We have analytically investigated the distribution of isolated nodes in temporal spatial networks. Here, node locations were random but frozen in time (quenched disorder), and links were formed or broken dynamically according to a distance-dependent probability. We obtained explicit formulas for the probability that no node will be isolated for T consecutive time steps, with good numerical agreement. In contrast to networks with mobile nodes, the transmission of information is greatly hindered by extremes of the quenched disorder, namely highly isolated nodes. Relaxing the distance-dependent link formation probability (2), we can probe small-world–like spatial networks obtained via random rewiring of the links. The result is that the isolation probability becomes concentrated around its mean, thus moderating the system's extreme behaviour due to highly isolated nodes (at low densities) or rarely isolated nodes (at high densities). This reduces the expected time before all nodes receive an information packet, but also degrades the reliability of the most connected nodes.

The analytical expressions and parameter dependence of the numerical results can be used for the efficient design and management of smart-grid and other mesh and D2D networks. Namely, we have shown increased reliability (fewer highly isolated nodes, and hence shorter times to connection) with decreased path loss exponent η, that is, less cluttered environments. Similarly, network reliability is improved by increasing the density ρ; this also increases cost and energy consumption, so the optimum value can be found using our analytical formulae together with specified constraints on reliability, cost and energy efficiency. Finally, the optimum rewiring parameter $\wp$ can be found given constraints on highly isolated nodes, rarely isolated nodes and again, energy efficiency.

Boundary effects may be incorporated by returning to eq. (5) and applying the methods of ref. [25]; depending on the density, the relevant isolated nodes are likely to occur in the bulk, edges, or corners, as determined by a trade-off between the number of nodes in these locations and their isolation probabilities. Equation (5) also applies to non-uniform node distributions, a more challenging problem. Both generalisations are however important for quantitative agreement with empirical network data and would be interesting to explore using software wireless simulations [18].

Acknowledgments

The authors would like to thank the directors of Toshiba Telecommunications Research Laboratory and the EPSRC (grant EP/N002458/1) for their support. They are grateful to Justin Coon, Peter Sollich and Claudio Tessone for helpful discussions.
