The Influence of Communication Range on Connectivity for Resilient Wireless Sensor Networks Using a Probabilistic Approach
International Journal of Distributed Sensor Networks

Wireless sensor networks (WSNs) consist of thousands of nodes that need to communicate with each other. However, some nodes may be isolated from the others because of their limited communication range. This paper focuses on the influence of communication range on the probability that all nodes are connected under two conditions: (1) all nodes have the same communication range, and (2) the communication range of each node is a random variable. In the former case, this work proves that, for $0 < \varepsilon < e^{-1}$, if the probability of the network being connected is $0.36\varepsilon$, then increasing the communication range by a constant $C(\varepsilon)$ raises the probability of the network being connected to at least $1 - \varepsilon$. An explicit function $C(\varepsilon)$ is given. It turns out that, once the network is connected, the WSN is also resilient against node failures. In the latter case, this paper proposes modeling the network connection probability as a Cox process. The change of the network connection probability with respect to the distribution parameters and the resilience performance is presented. Finally, a method is developed to decide the distribution parameters of the node communication range so that a given network connection probability is satisfied.


Introduction
Wireless sensor networks (WSNs) [1,2] are a promising technology. WSNs are increasingly used in numerous applications, such as forest monitoring, disaster management, space exploration, factory automation, secure installation, border protection, and battlefield surveillance. WSN technology is the basis of the future "Internet of Things" (IoT) [3], which offers a vision where anyone can interact with any addressable node (thing or object), such as RFID tags, sensors, and mobile phones, anywhere and anytime. "Anywhere" suggests that any object is reachable from any location. From the network topology point of view, every node in a WSN should be able to connect to any other node, either directly or through a limited number of intermediate nodes. This kind of network is called a "connected network." If the network is still connected after removing at most $k - 1$ nodes, it is called a $k$-connected network, where $k = 1, 2, 3, \ldots$. A $k$-connected network guarantees that at least $k$ different paths are available for transmitting signals from one node to any other node.
However, a $k$-connected network is not always possible. In WSNs, sensor nodes are usually deployed in the areas of interest either randomly or according to a predefined distribution. In this case, it is likely that some nodes are isolated from other nodes. Therefore, the network connection is characterized by a probability. On the other hand, the resilience problem, which concerns fault-tolerance capability in the presence of node failures, is also important in a probabilistic network. Our concern in this paper is the probability that a WSN is a connected network and the network's resilience against node failures.
Most earlier studies focus on the model where every node in a network is the same and, for example, has the same communication range. However, WSN nodes are usually heterogeneous. The communication range of a WSN node may vary from one node to another, and even the communication range of the same node may change over time. For instance, in a wireless network, the transmission power required for a node to reach another node is proportional to $r^{\alpha}$, where $r$ is the transmission radius and $\alpha$ is the loss constant, which depends on the wireless medium, typically lies between 2 and 4, and may vary from device to device [4].
Depending on the wireless communication technology, communication range may vary from tens to thousands of meters, such as IEEE 802.11 (25-600 m), Bluetooth (10-100 m), ZigBee (10-75 m), HomeRF (50 m), UWB (10 m), and WiMAX (1-50 km). The residual energy of battery-powered devices decreases the longer the nodes work, so a node may shorten its communication range in order to save energy. The environment where nodes are deployed, for example, indoor or outdoor, with or without obstacles, also leads to quite different communication ranges due to interference, shadowing, fading, and path loss [5].
This work concentrates on the WSN connection probability for both heterogeneous and homogeneous networks in terms of communication range. Assuming WSN nodes are randomly and uniformly distributed, two problems are addressed in this paper. Given a network where all nodes have the same communication range, how does the connection probability change as the communication range increases? In the case that the communication range is a random variable, what is the network connection probability?
Through analysis, this work finds that, for $0 < \varepsilon < e^{-1}$ and a sufficiently large number of nodes, if the original network connection probability is $0.36\varepsilon$, then increasing the communication range by a constant $C(\varepsilon)$ raises the probability of the network being connected from $0.36\varepsilon$ to $1 - \varepsilon$. An explicit function $C(\varepsilon)$ is given in this paper. It turns out that, when a network is connected, it is also almost surely $(\log n + c)$-connected (where $n$ is the total number of nodes deployed and $c$ is a constant greater than 1), which is important for the resilience of WSNs against node failures. Afterwards, the connection probability problem with random communication range, which is often the real case in WSNs, is studied. The model is reformulated as a Cox process, and the connection probability is analyzed by simulation. A method for determining the distribution function parameters for a given connection probability is developed.
Our main contributions are as follows: first, this paper employs an effective and novel approach to obtain analytical results for homogeneous WSN connectivity, some of which have been validated by previous studies; second, we propose that the Cox process can be used to model heterogeneous WSNs, and simulations are performed to reveal the relations between the network connection probability and its distribution parameters.
The rest of this paper is organized as follows. Section 2 introduces the basic concepts of the network model and the problem to be addressed. In Section 3, the derivation and verification for the case in which all network nodes have the same communication range are presented. In Section 4, the communication range is modeled as a random variable. A brief introduction of related works is provided in Section 5, while Section 6 concludes our work.

Network Model and Problem Statement
Usually, there are three methods to create links between nodes, as presented in Figure 1. The first is the k-nearest neighbor model. In this model, the network is formed by each node connecting to its k nearest neighbors; for example, in Figure 1(a), each node has 2 neighbors. The second is the disc model. Each node is modeled as a disk with communication radius $r$. Node $i$ is linked to node $j$ if the Euclidean distance between $i$ and $j$ is less than $r$; for example, in Figure 1(b), node 3 cannot connect to node 2 or node 1 because they are out of the communication range of node 3. The last one is the Erdős-Rényi random graph, which connects any two nodes with the same probability and is therefore inappropriate for WSNs (Figure 1(c)). The disc model is more plausible for WSNs because obtaining $k$ neighbors is not always feasible. For instance, in a wireless environment, some nodes may be unable to connect to the required number of neighbors because of their limited communication range.
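As a minimal sketch of the disc model described above (not the authors' implementation), the following Python code links two nodes when their Euclidean distance is below a common radius and then checks connectivity with a breadth-first search. The function names `random_deployment`, `disc_graph`, and `is_connected`, the seed, and the sample radius are assumptions made for illustration only.

```python
import math
import random
from collections import deque


def random_deployment(n, side, seed=None):
    """Place n nodes uniformly at random in a side x side square."""
    rng = random.Random(seed)
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]


def disc_graph(points, r):
    """Adjacency lists: i and j are linked when their distance is < r."""
    n = len(points)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < r:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def is_connected(adj):
    """Breadth-first search from node 0; connected if all nodes are reached."""
    if not adj:
        return True
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(adj)


if __name__ == "__main__":
    side = math.sqrt(500)  # 500 nodes at unit density (illustrative choice)
    pts = random_deployment(500, side, seed=0)
    print(is_connected(disc_graph(pts, r=2.0)))
```

Repeating the last three lines over many seeds gives an empirical estimate of the connection probability for a chosen radius.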
The notations and basic network definitions that will be used throughout the paper are now introduced. Additional terminology can be found in [6]:

$n$: total number of nodes deployed in the target field, with $n \gg 1$,
$A$: area in which the nodes are deployed,
$\rho$: node density, defined as $n/A$,
$\rho \pi r^2$: expected number of neighbors of a node.

Note that in this paper "log" means the natural logarithm (base $e$). Next, the main definitions are introduced.

Definition 1. A node's communication range is defined as the area where other nodes can receive its signal.
For a disk, the communication range is the circle with radius $r$. However, the communication range is not necessarily modeled as a disk. The communication range of a radio is highly probabilistic and irregular [7,8]. Figures 2(a) and 2(b) illustrate the ideal disk communication model and the irregular communication model, respectively. More importantly, the communication range of each node may not be the same. Note that the analysis in this section assumes a disk, but it also applies to the irregular communication model.

Definition 2. $G_{n,r}$ denotes a network following the disc model. More specifically, the network is formed by $n$ nodes randomly and uniformly deployed in area $A$. Each node is modeled as a disk with radius $r$.

This paper focuses on the probability of network $G_{n,r}$ being $k$-connected. A $k$-connected network implies that there are still $k - 1$ alternative path(s) if one path fails; therefore, a higher $k$ indicates that the network is more resilient against failures. In this paper, $k$ is used to evaluate the resilience of WSNs. This property depends on many factors, such as communication range, node density $\rho$, node processing capability, node energy, and deployment environment. This paper is only interested in the impact of communication range on the connection probability. The problem can be stated as follows.
"Given a WSN $G_{n,r}$ with fixed node density $\rho$, in the cases in which the node communication ranges are the same and different, how do the network connection probability and the resilience performance change as the node communication ranges vary?"

Homogeneous Node Deployment in WSNs
This section considers the case in which each node in the network $G_{n,r}$ has the same communication range. First, the mathematical model that will be used is presented. Based on this model, theoretical results are proved and validated by an example and simulations. The situation where the communication range of each node is a random variable is discussed in Section 4.

Network Connection Probability Analysis.
For uniformly distributed nodes with density $\rho$, the number of nodes in the area $\pi r^2$ has a Poisson distribution [9]; therefore the probability of a node having $k$ neighbor nodes is

$$P(k) = \frac{(\rho \pi r^2)^k}{k!} e^{-\rho \pi r^2}. \qquad (1)$$

The number of neighbors of a node is also called the node's degree. The minimal degree over all nodes is called the network degree. If the network has $n$ nodes, the probability that network $G_{n,r}$ is $k$-connected is given by the following well-known formula [9]:

$$P(G_{n,r} \text{ is } k\text{-connected}) = \left(1 - \sum_{i=0}^{k-1} \frac{(\rho \pi r^2)^i}{i!} e^{-\rho \pi r^2}\right)^{n}. \qquad (2)$$

Let $S = \pi r^2$. Note that $S$ indicates the communication range of a node, but, if $\rho = 1$, $S$ is actually the expected number of neighbors a node has.

Without loss of generality, assume that $\rho = 1$. For a real network, the node density $\rho = 1$ indicates that the average number of nodes in a unit area is one. However, whether $\rho$ is equal to 1 is irrelevant in this model, because if $\rho$ is not 1, say $\rho'$, then letting $r' = r/\sqrt{\rho'}$ gives the same results. To simplify notation, let $P(S, k)$ denote the probability that the network is $k$-connected. Then (2) can be rewritten as

$$P(S, k) = \left(1 - \sum_{i=0}^{k-1} \frac{S^i}{i!} e^{-S}\right)^{n},$$

and its first derivative with respect to $S$ is

$$\frac{\partial P(S, k)}{\partial S} = n \left(1 - \sum_{i=0}^{k-1} \frac{S^i}{i!} e^{-S}\right)^{n-1} \frac{S^{k-1}}{(k-1)!} e^{-S}.$$

The function $P(S, k)$ can be written as $P(S) = (1 - e^{-S})^n$ when $k = 1$. In this section, the properties of $P(S)$, namely, the 1-connected network, are analyzed. Two points are identified where $P(S)$ almost starts and almost stops growing, in order to show how the connection probability increases from near 0 to near 1.
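The connection probability with unit density can be evaluated directly. The sketch below is an illustration under the assumption that formula (2) has the usual form $(1 - \sum_{i<k} S^i e^{-S}/i!)^n$ with $S$ the expected number of neighbors; the function names `node_prob_at_least_k` and `k_connection_probability` are hypothetical.

```python
import math


def node_prob_at_least_k(S, k):
    """P(a node has at least k neighbors) when its degree is Poisson with mean S."""
    return 1.0 - sum(S**i * math.exp(-S) / math.factorial(i) for i in range(k))


def k_connection_probability(S, k, n):
    """Probability that all n nodes have at least k neighbors (assumed form of (2))."""
    return node_prob_at_least_k(S, k) ** n


if __name__ == "__main__":
    n = 500
    for c in (-2, 0, 2, 5):
        S = math.log(n) + c
        print(c, k_connection_probability(S, 1, n))
```

Sweeping `c` in this way shows how quickly the probability moves from near 0 to near 1 once $S$ passes $\log n$.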

Remark 7. This theorem shows that, as $c \to +\infty$, the network connection probability tends to 1 and the network has degree $\log n + c$. The author in [10] proves that if a network does not have any links at the beginning, and links are later added to connect nodes, the resulting network becomes $k$-connected as soon as the network degree is $k$. Therefore, this theorem shows that once the network becomes connected, it turns out to be $(\log n + c)$-connected with high probability. This conclusion is consistent with the result in [11]: by increasing $k$, the network becomes $s$-connected very shortly after it becomes connected, for $s = o(\log n)$. A $(\log n + c)$-connected network makes WSNs more resilient against node failures because there are $\log n + c$ distinct paths from one node to any other node.

Proof.
Consider

Remark 10. This conclusion is the same as that in [12] and has a similar form in the Erdős-Rényi random graph [13].

This section addresses one question. If the current communication range of a node is known, the connection probability can be calculated using (2). If the network connection probability is very low, one may want to increase the node communication range to obtain a higher network connection probability. Equation (2) can be used again to calculate the required communication range. Surprisingly, the corollary proved in this section shows that the increment of communication range needed to obtain a high connection probability is a constant for any network size.

Validation Results
This section validates the previous results by an example and simulations. In the example, 500 nodes with equal communication range are deployed in a field of $\sqrt{500} \times \sqrt{500}$ m$^2$.
Figure 5 shows the connection probability $P = P(S)$ when $n$ = 500, 1000, 10000, 100000. For the same values of $n$, Table 1 lists the communication ranges at the two operating points, the constant $C(\varepsilon)$, and the corresponding connection probabilities $P(S_1)$ and $P(S_2)$. For any $n$ in the table, the obtained value is $S_2 - S_1 \approx C(\varepsilon) = 5.60944$. Of course, in a real network, the number of neighbors is an integer, so 6 neighbors are needed. This example implies that, regardless of network size (provided the number of nodes is big enough), if the network connection probability is 0.66%, then increasing the communication range until each node obtains 6 more neighbors (namely, increasing the communication range by 6 m$^2$) raises the network connection probability to at least 98.17%. Meanwhile, the network will be at least 10-connected.
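The numbers in this example can be checked with a few lines of Python. The sketch assumes the closed form $C(\varepsilon) = \log\big((1 - \log \varepsilon)/\varepsilon\big)$, which is inferred here from the quoted value 5.60944 at $\varepsilon = e^{-4}$ rather than copied from the statement of the result, and it assumes $P(S) = (1 - e^{-S})^n$ for the 1-connected case. The names `C` and `connection_probability` are illustrative.

```python
import math


def C(eps):
    """Assumed closed form of the range increment: log((1 - log eps) / eps)."""
    return math.log((1.0 - math.log(eps)) / eps)


def connection_probability(S, n):
    """P(S) = (1 - exp(-S))**n for the 1-connected case with unit density."""
    return (1.0 - math.exp(-S)) ** n


if __name__ == "__main__":
    eps = math.exp(-4)  # so that 1 - eps is about 98.17%
    for n in (500, 1000, 10000, 100000):
        S1 = math.log(n) - math.log(1.0 - math.log(eps))  # low-probability point
        S2 = S1 + C(eps)                                  # after the increment
        print(n, round(S1, 3), round(S2, 3),
              connection_probability(S1, n), connection_probability(S2, n))
```

Under these assumptions the increment is the same for every `n`, while the two probabilities stay close to 0.66% and above $1 - \varepsilon$.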
In order to validate Theorems 6 and 8, this paper calculates the error between the theoretical results and the approximation values for different parameter values, as shown in Figure 6. The error of Theorem 6 is defined as $\left(1 - e^{-c}/n\right)^n - \left(1 - e^{-c}\right)$, and the error of Theorem 8 is defined as $\left(1 - (c + 1)/n\right)^n - 0.36 e^{-c}$. The errors for both theorems are very small, which indicates that both are good approximations.
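A short sketch of this error calculation follows. It assumes the two error expressions read $(1 - e^{-c}/n)^n - (1 - e^{-c})$ and $(1 - (c+1)/n)^n - 0.36e^{-c}$; the symbol $c$ and the function names are reconstructions for illustration, not taken verbatim from the paper.

```python
import math


def theorem6_error(n, c):
    """(1 - exp(-c)/n)**n minus the bound 1 - exp(-c) (assumed form)."""
    return (1.0 - math.exp(-c) / n) ** n - (1.0 - math.exp(-c))


def theorem8_error(n, c):
    """(1 - (c + 1)/n)**n minus the approximation 0.36 * exp(-c) (assumed form)."""
    return (1.0 - (c + 1.0) / n) ** n - 0.36 * math.exp(-c)


if __name__ == "__main__":
    for n in (500, 1000, 10000, 100000):
        print(n, theorem6_error(n, 4.0), theorem8_error(n, 4.0))
```

The sample value `c = 4.0` corresponds to the example with $\varepsilon = e^{-4}$; other values can be swept in the same loop.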

Heterogeneous Node Deployment in WSNs
In the last section, the asymptotic results were obtained under the assumption that each node has the same communication range, which is often not the case in practice. This section presents the connection probability when the node communication range follows a normal distribution, that is, $S \sim N(\mu, \sigma^2)$. Formally, the network model is reformulated as follows: $n$ nodes are randomly and uniformly deployed in area $A$ with density $\rho = 1$. The communication range of node $i$, denoted as $S_i$, is an i.i.d. random variable with normal distribution $S_i \sim N(\mu, \sigma^2)$. Hence, the number of neighbors of node $i$, denoted as $N_i$, is a Poisson random variable conditioned on the parameter $S$, where $S \sim N(\mu, \sigma^2)$. This model is analogous to the so-called Cox process, a Poisson process whose intensity is itself a stochastic process. The Cox process is widely used in economics, for example, [14].

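Connection Probability for Random Communication Range.
$E[V]$ denotes the expected value of a random variable $V$; therefore the expected number of neighbors of a node is

$$E[N_i] = E\big[E[N_i \mid S_i]\big] = E[S_i] = \mu.$$

In what follows, the connection probability itself is studied. For $k \ge 1$, the probability of node $i$ having at least $k$ neighbors is given by

$$P_{i,k} = 1 - \sum_{j=0}^{k-1} \frac{S_i^{\,j}}{j!} e^{-S_i}. \qquad (22)$$

For $k = 1$, $P_{i,1} = 1 - 1/e^{S_i}$ is the probability that node $i$ is not isolated. $e^{S_i}$ has a log-normal distribution. Therefore, the expected value $E[P_{i,1}]$ and variance $\mathrm{Var}[P_{i,1}]$ can be obtained via the standard method:

$$E[P_{i,1}] = 1 - e^{-\mu + \sigma^2/2}, \qquad \mathrm{Var}[P_{i,1}] = \left(e^{\sigma^2} - 1\right) e^{-2\mu + \sigma^2}.$$

For $k > 1$ neighbors, the distribution of $P_{i,k}$ does not have a closed-form expression.
If $n$ is big enough, the probability of the network being $k$-connected is

$$P_k = \prod_{i=1}^{n} P_{i,k}.$$

Since the parameter $S_i$ is a random variable, $P_k$ is a random variable as well. Letting $P_{\min} = \min\{P_{1,k}, P_{2,k}, \ldots, P_{n,k}\}$, because $0 \le P_{i,k} \le 1$,

$$P_k \le P_{\min}.$$

Therefore the bottleneck for the connection probability of the entire network is the node that has the minimal communication range.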
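The product form and the bound by the weakest node can be illustrated with one random draw. This is a sketch under stated assumptions: ranges are drawn from $N(\mu, \sigma^2)$ with the illustrative values $\mu = 11.2$, $\sigma = 2$, $n = 500$; negative draws are clipped to zero (a choice made here, not described in the text); and the names `node_prob` and `network_prob` are hypothetical.

```python
import math
import random


def node_prob(S, k):
    """P(node has at least k neighbors) for Poisson mean S; S < 0 is clipped to 0."""
    S = max(S, 0.0)  # assumption: a negative range is treated as zero range
    return 1.0 - sum(S**j * math.exp(-S) / math.factorial(j) for j in range(k))


def network_prob(ranges, k):
    """Product of the per-node probabilities over all nodes."""
    p = 1.0
    for S in ranges:
        p *= node_prob(S, k)
    return p


rng = random.Random(1)  # fixed seed only to make the sketch repeatable
ranges = [rng.gauss(11.2, 2.0) for _ in range(500)]
P_k = network_prob(ranges, 1)
P_min = min(node_prob(S, 1) for S in ranges)
print(P_k, P_min)
```

Because every factor lies in $[0, 1]$, the product can never exceed its smallest factor, which is why the node with the smallest range dominates.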
$P_k$ is affected by several parameters: $n$, $\mu$, $\sigma$, and $k$. Theorem 6 is used to decide $\mu$. According to Theorem 6, the probability of the network being connected is at least 99.33% when $c = 5$. Let $\rho = 1$ and take $\log n + 5$ as the mean $\mu$ of the communication range; for instance, if $n = 500$, then $\mu = 11.2$. In other words, 500 nodes whose communication range follows the normal distribution $S \sim N(11.2, \sigma^2)$ are deployed.
Our major concerns are the parameter $\sigma$, which indicates the difference in communication range, and $k$, which shows the resilience capability. In order to study the changes of the connection probability $P_k$ as the parameters vary, the following simulations are performed: (1) the cumulative distribution function (CDF) of $P_k$ is calculated after 500 runs with various $\sigma$ and $k$, as shown in Figures 7 and 8; (2) given $\sigma$ and $k$, the probability of the network being $k$-connected as the number of deployed nodes grows is estimated by computing the average of $P_k$ after 500 runs for a given number of nodes, as illustrated in Figures 9 and 10; (3) how to choose the parameters in order to obtain the required connection probability is discussed in Section 4.3.
Figures 7 and 8 show the CDF of $P_k$ when $\sigma$ and $k$ change. The network connection probability is sensitive to the standard deviation. As mentioned earlier, a single node that has a small communication range can cause the connection probability of the whole network to be low. For instance, in Figure 8, when $\sigma = 3$ and $k = 2$, the probability of the network being connected is almost surely less than 40%. Figure 9 illustrates the connection probability as $n$ nodes are deployed in the network when $\sigma = 2$ and $k = 1, 2, 3, 4$. Figure 10 shows the changes when $\sigma = 1, 2, 3$ and $k = 1$. Both figures show that the average of $P_k$ is a decreasing function of $n$, $\sigma$, and $k$. The behavior of the network connection probability as the network size grows is predictable. For instance, Figure 10 shows that the average network connection probability for $\sigma = 3$ is about 73% when the network has 250 nodes, but the probability falls to 55% if the network size is doubled. Figure 9 shows how the resilience performance decreases when the network size grows, or how the probability decreases if higher resilience performance is required. For example, for networks that have 200 nodes, the probabilities that the network can tolerate 1, 2, and 3 node failures (i.e., $k = 2, 3, 4$) are about 83%, 50%, and 10%, respectively.
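For $k = 1$ the trend in the averages can be cross-checked without simulation. The sketch assumes that the node ranges are independent, that $E[e^{-S}] = e^{-\mu + \sigma^2/2}$ for $S \sim N(\mu, \sigma^2)$, and that $\mu = 11.2$ is held fixed while $n$ changes; under those assumptions the mean of the product is $(1 - e^{-\mu + \sigma^2/2})^n$. The function name is hypothetical.

```python
import math


def expected_connection_probability(n, mu, sigma):
    """Mean of prod_i (1 - exp(-S_i)) for independent S_i ~ N(mu, sigma^2), k = 1."""
    per_node = 1.0 - math.exp(-mu + sigma**2 / 2.0)
    return per_node ** n


if __name__ == "__main__":
    for sigma in (1.0, 2.0, 3.0):
        for n in (250, 500):
            print(sigma, n, round(expected_connection_probability(n, 11.2, sigma), 3))
```

With $\sigma = 3$ this calculation lands close to the averages quoted from Figure 10 for 250 and 500 nodes, and it makes the exponential sensitivity to $\sigma^2$ explicit.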

Choose Distribution Parameter.
The simulations in Section 4.2 show $P_k$ for different parameters. This section addresses which distribution parameter(s) can maintain a given $P_k$. This is helpful for choosing appropriate parameters when a network simulator is used to simulate real networks.
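According to (6), (22) is a monotonically increasing function of $S_i \in [0, +\infty)$, and its inverse function, written here as $g_k^{-1}$, gives

$$S_i = g_k^{-1}(P_{i,k}).$$

Letting $P^{(0)}_{i,k}$ be an instance of $P_{i,k}$, thus $S^{(0)}_{i,k} = g_k^{-1}\big(P^{(0)}_{i,k}\big)$. The probability of $P_{i,k}$ being greater than $P^{(0)}_{i,k}$ is given by

$$\Pr\big\{P_{i,k} > P^{(0)}_{i,k}\big\} = \Pr\big\{S_i > S^{(0)}_{i,k}\big\} = \int_{S^{(0)}_{i,k}}^{+\infty} f_S(s)\, ds,$$

where $f_S(s)$ is the probability density function of $S$. If the probability required to keep the network connected is at least $P_0$, the corresponding probability for each node is at least

$$P^{(0)}_{i,k} = P_0^{1/n}.$$

With formulas (26)-(28), the required density function parameter of the communication range for a given $P_0$ can be calculated.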
For example, 500 nodes are deployed in $\sqrt{500} \times \sqrt{500}$ m$^2$; the communication area is $S \sim N(\mu, \sigma^2)$ with mean $\mu = 10$. Suppose the desired probability of the whole network being connected, that is, 1-connected, is at least 90%; the standard deviation of this distribution is then evaluated. For $k = 1$, according to (26) and (28), the corresponding minimal range is $S^{(0)}_{i,1} = -\log\left(1 - 0.9^{1/500}\right) \approx 8.46$. To make the probability that $P_{i,1}$ exceeds $P^{(0)}_{i,1}$ high, for example, at least 95%, according to (27), $S^{(0)}_{i,1} = \mu - 1.65\sigma$, where 1.65 is the one-sided 95% quantile of the standard normal distribution. Therefore the corresponding standard deviation $\sigma$ should be no more than $(10 - 8.46)/1.65 \approx 0.93$. This is useful when a network simulator is used to choose appropriate parameters for designing networks that are connected with high probability. Figure 11 shows the required $\sigma$ to make the network connection probability at least 90% for different numbers of nodes. Note that the node density is always 1.
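The worked example can be reproduced in a few lines. This is a sketch for $k = 1$ only, assuming the per-node target is $P_0^{1/n}$, the inverse of $1 - e^{-S}$ is $-\log(1 - P)$, and the 95% requirement is a one-sided normal quantile; the function names `min_range_for_target` and `max_sigma` are illustrative.

```python
import math
from statistics import NormalDist


def min_range_for_target(P0, n):
    """Smallest S with (1 - exp(-S))**n >= P0, i.e. -log(1 - P0**(1/n))."""
    per_node_miss = -math.expm1(math.log(P0) / n)  # equals 1 - P0**(1/n), computed stably
    return -math.log(per_node_miss)


def max_sigma(mu, S0, confidence):
    """Largest sigma with P(S > S0) >= confidence for S ~ N(mu, sigma^2)."""
    z = NormalDist().inv_cdf(confidence)  # about 1.645 for 0.95
    return (mu - S0) / z


S0 = min_range_for_target(0.90, 500)
sigma = max_sigma(10.0, S0, 0.95)
print(round(S0, 3), round(sigma, 3))
```

Running the same two calls for other values of `n` gives the kind of curve described for Figure 11, under the same assumptions.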

Related Works
Extensive studies have been done on the connection problem of networks. Many of them focus on how many neighbors or what network density is needed so that a network is connected with high probability, such as [15]; some construct networks to satisfy connectivity [16,17]; some works try to develop algorithms to preserve network connectivity or coverage, for example, [18-20], while some other works study other aspects of network connectivity, such as [21], which evaluates the quality of connectivity by measuring the reliability of links; it shows that the largest eigenvalue of the probabilistic connectivity matrix can serve as a good measure of the quality of network connectivity. For the case in which all the nodes of a region fail, [22] measures the number of connected components. This paper studies the connection probability when the network nodes are randomly deployed.
When nodes are randomly deployed, asymptotic upper and lower bounds of the connection probability for both the k-nearest neighbor model and the disk model have been studied [12]. For the k-nearest neighbor model, [23] concludes that, as $n \to \infty$, if each node is connected to fewer than $0.074 \log n$ neighbors, the network is disconnected with probability one, while, if each node is connected to more than $5.1774 \log n$ neighbors, the network is connected with probability one. Reference [24] finds that if $k \le 0.3043 \log n$, the network is not connected with high probability, and if $k \ge 0.5139 \log n$, the network is connected with high probability as $n \to \infty$. For the directed network, however, the lower and upper bounds are $0.7209 \log n$ and $0.9967 \log n$, respectively. Reference [25] improves the upper bound to $0.4125 \log n$. For the disk model, [26] states that an average of 6 to 10 neighbors almost ensures that the network is fully connected, no matter how many nodes there are in the network. In [27], if the communication range satisfies $\pi r^2 = \log n + c$, then the network connection probability tends to $e^{-e^{-c}}$. Compared with [26], Table 1 in this paper shows that, when $n = 10000$, at least 13 neighbors are needed to make sure that the network is connected with high probability. Besides, a result presented in our paper (Theorem 9) is the same as that in [27] but is obtained with a totally different approach.
Reference [11] shows that, in the k-nearest neighbor model, by increasing $k$, the network becomes $s$-connected very shortly after it becomes connected, where $s = o(\log n)$. Reference [28] proves one conjecture in [24] that, in the k-nearest neighbor model, for every $0 < \varepsilon < 1$ and $n$ sufficiently large, there exists $C = C(\varepsilon)$ such that, if the network formed with $k$ nearest neighbors is connected with probability $\varepsilon$, then the network formed with $k + C$ nearest neighbors is connected with probability greater than $1 - \varepsilon$. This paper improves the results in [11], obtaining an explicit expression for the disk model, that is, $k = \log n + c$, where $c > 1$. The corollary in this paper proves that the result for the disc model has a form similar to that presented in [28].
The assumption that all nodes have the same communication range usually does not hold in reality. In order to make the model more accurate, [8,29] utilize irregular radio ranges to model real nodes. The connectivity of heterogeneous networks has been well studied; for example, [16,30] investigate the relay node placement problem such that the network is $k$-connected. The authors in [31] assume that the communication radius $r_i$ of node $i$ is an i.i.d. random variable with normal probability density $r_i \sim N(\mu, \sigma^2)$. Reference [32] adopts a model in which the Poisson intensity is given by a normal distribution and then obtains the asymptotic bound of the range within which all nodes in this area are connected to the origin. Reference [33] considers nodes that are placed according to a shot-noise Cox process rather than uniform deployment. This paper employs stochastic methods to characterize heterogeneous networks. In this paper the density is kept constant, but the node communication range follows a normal distribution.

Conclusion and Future Works
When deploying many WSN nodes, one of the key problems is whether all nodes in the network are connected to other nodes. Isolated nodes are useless for applications. This paper presents results on how the network connection probability changes as the communication range varies in randomly and uniformly distributed homogeneous and heterogeneous WSNs. For a network in which all nodes have the same communication range, this paper proves through theoretical derivation and validation that, regardless of network size, the network connection probability increases from $0.36\varepsilon$ to $1 - \varepsilon$ when the communication range of each node is increased by a constant. As the example in Section 3.2 shows, regardless of network size, if the network connection probability is 0.66%, then increasing the communication range until each node obtains 6 more neighbors raises the network connection probability to at least 98.17%. On the other hand, this paper shows that, once the network is connected, it also becomes $(\log n + c)$-connected with high probability, which makes the network resilient against node failures because there are $\log n + c$ alternative paths between any two distinct nodes.
For the case in which each node's communication range is an i.i.d. random variable with normal distribution, this paper analyzes the connection probability by simulation. This paper shows that the network connection probability is determined by the distribution parameters and the network size and is especially sensitive to the standard deviation $\sigma$. The reason is that the network connection probability depends on the node that has the minimal communication range. This implies that the nodes with minimal communication range need particular care because they are the bottleneck of the whole network. The network will become disconnected if they fail. With the same configuration, the resilience capability decreases when the network size grows. Besides, given the required connection probability, this paper develops a method to decide the distribution parameter of the communication range. This method can be used to choose an appropriate distribution parameter of the communication range for network simulators or real deployments.
In some circumstances, a fully connected network is impractical and not necessary. One would be more interested in the giant connected component, which contains most nodes of the entire network. More specifically, the relation between the giant connected component and the communication range distribution is of interest. It is a percolation problem with random communication range. Percolation occurs when a node belongs to an infinite component with nonzero probability. The critical intensity $\lambda_c$ is defined as the minimum intensity at which percolation occurs. For the disk model, the bound for the critical intensity is known (e.g., [34]), but for variable radius it is unknown. Therefore, studying the percolation problem with i.i.d. communication range (or radius) will be our future work. On the other hand, the degree of a node obeys a Poisson distribution in this paper. It has been found that many networks, such as the World Wide Web, the Internet, airline connection networks, some biological systems, and the international ownership network, have a power-law degree distribution with an exponent that ranges between 2 and 3 [35]. Our future work will center on the connection probability with a more accurate model.

Figure 1: Different methods to connect nodes: (a) each node connects to its 2 nearest neighbors; (b) each node connects to the other nodes within its communication range; (c) each node connects to other nodes with the same probability $p$.

Figure 2: (a) Disk communication model and (b) irregular communication model.

Table 1: Connection probability with different sizes of network.