A Dynamic and Energy-Efficient Clustering Algorithm in Large-Scale Mobile Sensor Networks

Random mobility and energy constraints are the two main factors affecting system performance in mobile sensor networks, and they cause many difficulties in system design. High-efficiency algorithms and protocols are needed so that mobile sensor networks can adapt to a dynamic network environment and to energy limitations. In this paper, a new clustering algorithm based on the residual energy difference ratio is presented to improve system performance. Firstly, it is an energy-efficient algorithm. The residual energy of the sensor nodes and the average residual energy of the system are both considered in the residual energy difference ratio, which effectively prevents nodes with low residual energy from being selected as cluster heads. An energy-optimal scheme is used in the cluster formation phase to minimize energy consumption. Secondly, it is a dynamic algorithm. The system dynamically clusters the sensor nodes according to the data transmission delays, which lets the whole system adapt to the random mobility of the sensor nodes. The new clustering algorithm is simulated with the NS2 software, and the simulation experiments verify the validity of the proposed theory.


Introduction
The appearance of wireless sensor networks (WSNs) has been significantly changing various kinds of remote sensing applications. WSNs contain hundreds or thousands of sensor nodes, which can communicate either with one another or directly with the remote destination node (sink) [1]. The nodes rely on batteries and have a limited power supply. If sensor nodes use up their energy, the network is locally or totally paralyzed, which shortens the lifetime of the system.
Research on WSNs has grown rapidly, and new techniques have been developed for data-gathering and routing protocols. Among these techniques the multihop method is preferred, in which information is routed cluster by cluster. All the sensor nodes in a WSN are divided into clusters in which cluster heads (CHs) receive, data-fuse, and forward the traffic originated by cluster members to the sink. This kind of hierarchical clustering topology is easily managed and has good scalability. The aim is to prolong the network's lifetime by minimizing the transmission power. However, many clustering protocols mainly focus on static sensor nodes or on controlled mobility for hop-count reduction in data collection. Mobile sensor nodes are needed in applications where sensors are deployed on randomly moving objects for monitoring, tracking, and other purposes [2]. For example, wireless sensor devices have been attached to bikes, vehicles, and animals. Other applications, which involve humans as participants, include flu-virus tracking and air-quality monitoring.
In recent years, mobile sensor networks (MSNs) have received increasing attention. Real-time clustering and routing protocols for MSNs are exciting areas of research because messages in the network are delivered according to their end-to-end deadlines (packet lifetime) while sensor nodes are mobile [3]. However, MSNs are not free of constraints such as power, computational capacity, and memory. The multihop idea is also suitable for MSNs.
Generally speaking, the design of clustering protocols is a greater challenge in MSNs than in WSNs, for the following reasons. Firstly, the mobility of sensor nodes is random and the topology of MSNs changes frequently. These properties bring more difficulties and challenges, such as lower energy efficiency, shorter lifetime, and higher data delivery delays compared with WSNs [4]. Mobility also increases frame errors and the bit error rate (BER) due to a low signal-to-noise ratio (SNR) [2]. Secondly, sensor nodes are tightly constrained in terms of energy, processing, and storage capacities. Thus, they require effective resource management policies, especially efficient energy management, to increase the overall lifetime of MSNs.
From the above discussion we can conclude that two main factors affect the system performance in MSNs: random mobility and energy constraints. It is necessary to develop energy-efficient clustering algorithms which can effectively utilize the constrained energy and increase the overall lifetime of MSNs. In addition, the communication protocols in MSNs need to be scalable and robust enough to endure dynamic environments, so that the MSN nodes can recognize topology changes and update communication links rapidly [5].
In this work, a new clustering algorithm based on the residual energy difference ratio is presented to improve the system performance of MSNs. The residual energy of the sensor nodes and the average residual energy of the system are considered in the CH selection phase, which effectively prevents nodes with low residual energy from being selected as CHs. An energy-optimal scheme is used in the cluster formation phase. Furthermore, the sink dynamically clusters the sensor nodes according to the data transmission delays, which lets the whole system adapt to the random mobility of MSNs. The new clustering algorithm is dynamic and energy efficient, and it improves the lifetime, throughput rate, and energy efficiency of MSNs.
The rest of this paper is organized as follows. Section 2 presents some related works on MSNs. The network model is described in Section 3. The delay bound and the energy-optimal scheme of MSNs are introduced in Section 4. Section 5 presents a clustering algorithm based on the residual energy difference ratio. A theoretical analysis of the performance of MSNs is given in Section 6. The simulation results are given in Section 7. Finally, the conclusions are drawn in Section 8.

Related Works
The existing data-gathering protocols fall into two categories: hierarchy protocols and nonhierarchy protocols [6]. The hierarchy protocols have been well accepted as an effective way to make the network more scalable and to reduce the energy consumption of WSNs. Clustering algorithms are used in hierarchical (tree-based) sensor routing and topology control, where CHs receive, data-fuse, and forward the traffic originated by cluster members to the sink [7]. Many clustering and routing protocols have been proposed for data aggregation in static WSNs, such as LEACH [8], HEED [9], PEGASIS [10], APTEEN [11], HIT [12], and ACW [13].
In recent years, some research has addressed controlled mobility in the data collection of WSNs. These studies separately used mobile sensors [14], mobile sinks [15], and mobile data collectors [16] to design related protocols and improve the network performance of WSNs. Yang et al. classified the mechanisms into three categories [17]: mechanisms using mobile sinks, mechanisms using redeployment of mobile sensor nodes, and mechanisms using mobile relays.
However, the mobility of sensor nodes in MSNs is random and uncontrolled, which imposes many difficulties on system design. The algorithms and protocols focusing on static sensor nodes or controlled mobility cannot be directly applied to MSNs. The random mobility should be considered when designing related protocols for MSNs.
A data-rate-control scheme for MSNs was proposed in [18]. Two data rates were chosen according to the buffer size of the sensor nodes and the status of the channel. However, more than two data rates are needed in MSNs because of the complex channel conditions. A routing algorithm based on color theory was presented for MSNs in [1]; it considered the energy efficiency of MSNs when designing the routing protocol. A neighbor discovery protocol for MSNs was given in [5]; collaboration with neighbor nodes was discussed in order to recognize topology changes of MSNs. The communication characteristics were not addressed in the above studies. Liu et al. considered a distributed clustering algorithm, a tree-based architecture with two layers, for data gathering in MSNs [6]. However, the distribution of the selected CHs is uneven, which causes uneven energy consumption and instability of MSNs.
Considering the above disadvantages, a new clustering algorithm based on the residual energy difference ratio is proposed in this paper to improve the system performance of MSNs. The CHs are selected according to the residual energy difference ratio, which guarantees that sensor nodes with more residual energy are more likely to be selected as CHs. In the cluster formation phase, the characteristic distance is introduced to optimize the power and balance the energy consumption. In addition, the sink dynamically clusters the sensor nodes according to the data transmission delays, which lets the whole system adapt to the dynamic network environment of MSNs. Therefore, the clustering algorithm can effectively resolve the problems brought by random mobility and energy constraints. It also improves the overall lifetime, throughput rate, and energy efficiency of MSNs.

Network Model
In this paper, assumptions are made as follows.
(i) There is only one sink in MSNs. The sink is located far away from the sensor nodes and has no energy or communication constraints.
(ii) Initially, each sensor node has the same energy capacity. All the sensor nodes in MSNs are mobile, homogeneous, and power limited.
The network topology is modeled as follows. $G = (V, E)$ is used to represent MSNs, where $V$ is the set of sensor nodes and $E$ is the set of links connecting the sensor nodes. There are $n$ sensor nodes $V_1, V_2, \ldots, V_n$ and one sink $V_{n+1}$ in the MSNs. We also assume that there are $k$ clusters in total and that each cluster $j$ has $n_j$ nodes ($\sum_{j=1}^{k} n_j = n$, $j = 1, \ldots, k$). Many research studies of MSNs referred to the mobility models of other mobile networks, which have some limitations in moving directions or speeds. Considering the complexity of random mobility in MSNs, we assume that the network model has the following properties.
(i) Sensor nodes can move in any direction.
(ii) The movement of nodes does not cause the topology to change too rapidly.

DCA algorithm (opening steps only): $m = 0$, where $m$ counts the number of sensor nodes that directly transmit data to the sink; for $i = 1$ to $n$, $h = 0$, where $h$ counts the number of link hops between $V_i$ and $V_{n+1}$.

Delay Bound and Energy-Optimal Scheme
In this paper, we introduce an ACM scheme [19, 20] into the MSNs system to choose the channel's data rate according to the SNR under a BER bound and an average power constraint. Under the scheme, $R_{\min}$ and $R_{\max}$ are the minimum and the maximum data rates, respectively. The maximum delay of a link $l$, denoted as $\tau_l^{\max}$, can be derived from [21]. Therefore, the maximum delay on a path $p$ is
$$\tau_p^{\max} = h\,\tau_l^{\max}, \tag{1}$$
where $h$ is the number of hops on path $p$.
A matrix is used to express the upstream communication relations. In this case we only consider the transmitting process from the sensor nodes to the sink.
$\tau^{\max}_{i,n+1}$ is the maximum delay between node $V_i$ and the sink $V_{n+1}$ ($i = 0, \ldots, n$). $m$ is the number of sensor nodes that directly transmit data to the sink.
Next we discuss the energy model of MSNs, which is adopted from [8, 22]. The energy consumed per second by a node that transmits a data packet over $d$ meters, denoted as $P_t(d)$, is
$$P_t(d) = (\alpha_{11} + \alpha_2 d^{\gamma})\,R, \tag{3}$$
and the power of a node that receives a data packet, denoted as $P_r$, is
$$P_r = \alpha_{12}\,R, \tag{4}$$
where $R$ is the data rate, $\gamma$ is the path loss exponent, $\alpha_{11}$ ($\alpha_{12}$) is the power to run the transmitter (receiver) circuitry, and $\alpha_2$ is the power for the transmit amplifier to achieve an acceptable SNR. The total power consumption of a relay node is
$$P_{\mathrm{relay}}(d) = (\alpha_1 + \alpha_2 d^{\gamma})\,R, \tag{5}$$
where $\alpha_1 = \alpha_{11} + \alpha_{12}$ and $d$ is the distance over which the relay node transmits data.
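The energy model can be illustrated with a short sketch. It assumes the first-order radio form that is common for models of this kind (transmit power $(\alpha_{11} + \alpha_2 d^{\gamma})R$, receive power $\alpha_{12}R$); the class name, method names, and all numeric coefficients below are illustrative assumptions and are not taken from this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RadioModel:
    """First-order radio model. All coefficient values used below are assumed."""
    a11: float    # transmitter circuitry cost per bit (J/bit)
    a12: float    # receiver circuitry cost per bit (J/bit)
    a2: float     # amplifier cost per bit per m**gamma
    gamma: float  # path loss exponent

    def p_tx(self, d: float, rate: float) -> float:
        """Power (W) needed to transmit at `rate` bit/s over `d` metres."""
        return (self.a11 + self.a2 * d ** self.gamma) * rate

    def p_rx(self, rate: float) -> float:
        """Power (W) needed to receive at `rate` bit/s."""
        return self.a12 * rate

    def p_relay(self, d: float, rate: float) -> float:
        """A relay both receives and forwards, so the two costs add."""
        return self.p_tx(d, rate) + self.p_rx(rate)


# Hypothetical parameter values, chosen only to exercise the functions.
model = RadioModel(a11=50e-9, a12=50e-9, a2=100e-12, gamma=2.0)
relay_power = model.p_relay(d=50.0, rate=250e3)
```

Keeping the transmit and receive terms separate makes it easy to drop the receive cost for the sink, which the later bounds rely on.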
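In WSNs, a data link can be established between a transmitter node $A$ and a receiver node $B$ separated by distance $D$. Insert $K - 1$ relays $C_i$ ($i = 1, \ldots, K - 1$) between $A$ and $B$.
According to [23], an optimal scheme to consume minimum power is given as follows.
Theorem 1 (see [23]). Given $D$ and the number of intervening relays ($K - 1$), $P_{\mathrm{Link}}(D)$ is minimized when all the hop distances are made equal to $D/K$. The optimal number of hops $K_{\mathrm{Opt}}$ is $[D/d_{\mathrm{Char}}]$, where $d_{\mathrm{Char}}$ is called the characteristic distance and is given by
$$d_{\mathrm{Char}} = \sqrt[\gamma]{\frac{\alpha_1}{\alpha_2(\gamma - 1)}}. \tag{6}$$
Corollary 2. $P_{\mathrm{Link}}(D)$ is bounded as
$$P_{\mathrm{Link}}(D) \ge \alpha_1 \frac{\gamma}{\gamma - 1}\,\frac{D}{d_{\mathrm{Char}}}\,R, \tag{7}$$
with equality if and only if $D$ is an integral multiple of $d_{\mathrm{Char}}$.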

International Journal of Distributed Sensor Networks
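Proof. From Theorem 1, when every hop distance $d = d_{\mathrm{Char}}$, $P_{\mathrm{Link}}(D)$ is minimized; this minimum is denoted as $P_{\min}(D)$. Hence, combining (3)-(6), we have
$$P_{\mathrm{Link}}(D) \ge P_{\min}(D) = \frac{D}{d_{\mathrm{Char}}}\,P_{\mathrm{relay}}(d_{\mathrm{Char}}) = \alpha_1 \frac{\gamma}{\gamma - 1}\,\frac{D}{d_{\mathrm{Char}}}\,R, \tag{8}$$
with equality if and only if $D$ is an integral multiple of $d_{\mathrm{Char}}$.
Remark 3. When node $B$ is the sink, its energy consumption for receiving data is not included in the total energy of WSNs [23]. In this case, $P_{\mathrm{Link}}(D)$ is bounded as
$$P_{\mathrm{Link}}(D) \ge \left(\alpha_1 \frac{\gamma}{\gamma - 1}\,\frac{D}{d_{\mathrm{Char}}} - \alpha_{12}\right) R. \tag{9}$$
Based on Theorem 1, when the per-hop distance approximately equals $d_{\mathrm{Char}}$ and the data rate $R$ is $R_{\min}$, $P_{\mathrm{Link}}(D)$ can approximately be regarded as $P_{\min}(D)$.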
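The role of the characteristic distance can be checked numerically. The sketch below assumes the usual closed form $d_{\mathrm{Char}} = (\alpha_1 / (\alpha_2(\gamma - 1)))^{1/\gamma}$ and the relay power $(\alpha_1 + \alpha_2 d^{\gamma})R$ per hop; the function names and the coefficient values in the usage lines are hypothetical.

```python
import math


def characteristic_distance(a1: float, a2: float, gamma: float) -> float:
    """Assumed closed form: d_char = (a1 / (a2 * (gamma - 1))) ** (1 / gamma)."""
    if gamma <= 1:
        raise ValueError("the path loss exponent must exceed 1")
    return (a1 / (a2 * (gamma - 1))) ** (1.0 / gamma)


def link_power(a1: float, a2: float, gamma: float,
               total_dist: float, hops: int, rate: float) -> float:
    """Power of a link split into `hops` equal hops, each costing relay power."""
    hop = total_dist / hops
    return hops * (a1 + a2 * hop ** gamma) * rate


def link_power_bound(a1: float, a2: float, gamma: float,
                     total_dist: float, rate: float) -> float:
    """Lower bound a1 * gamma / (gamma - 1) * D / d_char * rate."""
    d_char = characteristic_distance(a1, a2, gamma)
    return a1 * gamma / (gamma - 1) * total_dist / d_char * rate


# Hypothetical coefficients: a1 = 100 nJ/bit, a2 = 100 pJ/bit/m^2, gamma = 2.
A1, A2, GAMMA = 100e-9, 100e-12, 2.0
D_CHAR = characteristic_distance(A1, A2, GAMMA)
# With D = 3 * d_char, three equal hops meet the bound; two or four hops cost more.
costs = {hops: link_power(A1, A2, GAMMA, 3 * D_CHAR, hops, 1.0) for hops in (2, 3, 4)}
```

The comparison across hop counts is the point of the example: the bound is reached only when each hop equals the characteristic distance.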

Clustering Algorithm Based on Residual Energy Difference Ratio
In the CH election of the LEACH algorithm [8], a threshold is used as the probability of each sensor node becoming a CH. The threshold is as follows:
$$T(i) = \begin{cases} \dfrac{k}{n - k\,(r \bmod n/k)}, & i \in S_N, \\ 0, & \text{otherwise}, \end{cases} \tag{10}$$
where $n$ is the number of nodes, $i$ ($i = 1, \ldots, n$) denotes each sensor node, $k$ is the expected number of CHs, $r$ is the number of rounds that have passed, $k\,(r \bmod n/k)$ is the number of nodes that have been elected as CHs after $r$ rounds, and $S_N$ is the set of sensor nodes which have not been selected as CHs yet. Firstly, each sensor node generates a random number between 0 and 1. Then, according to (10), the node is selected as a CH if its random number is smaller than the threshold. The effect of the energy constraint and the residual energy is not considered in (10), so nodes with low residual energy may be selected as CHs. Such CHs soon use up their energy, which shortens the lifetime of the network.
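A minimal sketch of this election rule follows, assuming the threshold has the standard LEACH form $k / (n - k\,(r \bmod n/k))$ for nodes that have not yet served as CH and 0 for all others. The function names are illustrative and not taken from [8].

```python
import random


def leach_threshold(n: int, k: int, r: int, eligible: bool) -> float:
    """Assumed LEACH threshold: k / (n - k * (r mod n/k)) if eligible, else 0."""
    if not eligible:
        return 0.0
    return k / (n - k * (r % (n / k)))


def elect_cluster_heads(n: int, k: int, r: int, eligible_nodes, rng: random.Random):
    """Each eligible node draws a number in [0, 1) and becomes a CH if it is below the threshold."""
    threshold = leach_threshold(n, k, r, True)
    return [i for i in sorted(eligible_nodes) if rng.random() < threshold]


# With n = 100 and k = 5 the threshold grows from 0.05 in round 0 to 1.0 in
# round 19, so every node that is still eligible is elected by the end of a cycle.
heads = elect_cluster_heads(100, 5, 19, {3, 7, 42}, random.Random(0))
```

Note that the rule depends only on the round counter, which is exactly the limitation described above: a nearly depleted node is as likely to be elected as a fresh one.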
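In order to prolong the lifetime of MSNs, a residual energy difference ratio is designed in this paper to improve (10), and we get the following threshold:
$$\mathrm{Diff}(i) = \begin{cases} \dfrac{k}{n - k\,(r \bmod n/k)} \cdot \dfrac{E_i - E_i^{\mathrm{CH}}}{\left| E_{\mathrm{total}}/n_{\mathrm{alive}} - E_i^{\mathrm{CH}} \right|}, & i \in S_N, \\ 0, & \text{otherwise}, \end{cases} \tag{11}$$
where $E_i$ and $E_i^{\mathrm{CH}}$ are the residual energies of node $i$ and its CH, $E_{\mathrm{total}}$ is the total energy of the nodes, and $n_{\mathrm{alive}}$ is the number of nodes still alive. A node with more residual energy than most sensor nodes obtains a larger threshold and is therefore more likely to be selected as a CH. In the CH selection phase, each sensor node $i$ generates a random number $\mathrm{rand}(i)$ between 0 and 1. Then, according to (11), the node is selected as a CH if its random number is smaller than the threshold $\mathrm{Diff}(i)$. Using the residual energy difference ratio efficiently prevents nodes with low residual energy from being selected as CHs. It also guarantees that, within a cluster, the nodes with more energy have a greater probability of being selected as CH. Therefore, it prolongs the lifetime of MSNs.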
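The effect of the residual energy difference ratio can be sketched as follows. The code assumes that Diff(i) is the LEACH-style base threshold multiplied by (E_i - E_CH) / |E_total / n_alive - E_CH|; the function name, the argument names, and the `eps` guard against a zero denominator are assumptions added for illustration.

```python
def diff_threshold(n: int, k: int, r: int, e_node: float, e_ch: float,
                   e_total: float, n_alive: int,
                   eligible: bool = True, eps: float = 1e-12) -> float:
    """Assumed form of Diff(i): base threshold times the energy difference ratio."""
    if not eligible or n_alive == 0:
        return 0.0
    base = k / (n - k * (r % (n / k)))
    average = e_total / n_alive
    # eps is an assumption: the text does not say what happens when the
    # average residual energy equals the residual energy of the CH.
    ratio = (e_node - e_ch) / max(abs(average - e_ch), eps)
    return base * ratio


# A node above its CH's energy gets a boosted threshold ...
strong = diff_threshold(100, 5, 0, e_node=0.9, e_ch=0.5, e_total=60.0, n_alive=100)
# ... while a node below its CH's energy gets a negative one and is never elected,
# because a random number in [0, 1) cannot be smaller than a negative threshold.
weak = diff_threshold(100, 5, 0, e_node=0.3, e_ch=0.5, e_total=60.0, n_alive=100)
```

In this example the base threshold is 0.05 and the strong node's ratio is about 4, so its threshold is about four times that of plain LEACH.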
Furthermore, Liu et al. proposed the ACE-C algorithm for data gathering in MSNs [6]. The algorithm makes each sensor node a CH in turn, which guarantees that at least one CH is elected and that the number of CHs in each round is equal. However, the distribution of the selected CHs is uneven, which causes uneven energy consumption. In MSNs with a large sensing area in particular, the ACE-C algorithm easily causes instability and a short lifetime.
Considering the above disadvantages, an improved clustering algorithm is presented in this paper. Firstly, a CH is selected according to (11). Then the CH sends a message to the nodes that are within $d_{\mathrm{Char}}/2$ of it. A node that receives the message joins this cluster and transmits data to the CH. $d_{\mathrm{Char}}$ is the characteristic distance in Theorem 1. Thus, the area of a cluster is approximately equal to $A_0 = \pi d_{\mathrm{Char}}^2/4$, and in every cluster the distance between common nodes and their CH varies from 0 to $d_{\mathrm{Char}}/2$. Using the same method, the next CH is selected from the remaining nodes and forms its own cluster. We assume that the sensor network area is $A$. Therefore, the expected number of CHs is $k = A/A_0$. After all the CHs are selected, the distance between neighbor CHs is approximately equal to the characteristic distance $d_{\mathrm{Char}}$. All the sensor nodes transmit data to their own CHs. Then each CH fuses the collected data and uploads it to its higher level neighbor CH along the upstream direction towards the sink. In this way, the data is transmitted level by level. The clustering algorithm is a tree-based architecture with multiple layers for data gathering. According to the energy-optimal scheme in Theorem 1, the energy consumption is optimal and the distribution of CHs is even. The energy consumption is also balanced.
Algorithm 2 takes as input $n$, the total number of sensor nodes, and $k$, the expected number of CHs. $N_{\mathrm{CH}}$ is the number of CHs, $S_{\mathrm{CH}}$ is the set of CHs, and $S_N$ is the set of non-CHs, with $S_{\mathrm{CH}} \cup S_N = V$. Initially $N_{\mathrm{CH}} = 0$; for each node $i = 1, \ldots, n$, a node with $i \notin S_N$ is skipped and a node $i \in S_N$ is evaluated according to (11).
The clustering algorithm is a dynamic process in which the MSNs system dynamically clusters the sensor nodes according to the data transmission delays. From the DCA algorithm in Section 4, we can get the maximum delay $\tau^{\max}_{i,n+1}$ between node $V_i$ ($i = 0, \ldots, n$) and the sink $V_{n+1}$. In every clustering period, the sink reads the delay $\tau$ of every received packet whose data source is node $V_i$ and compares it with the maximum delay $\tau^{\max}_{i,n+1}$. If $\tau > \tau^{\max}_{i,n+1}$, the communication condition of this path has deteriorated because of the mobility of the sensor nodes. The MSNs then run the clustering algorithm to cluster the sensor nodes again, so that the system can adapt to the change of the network's topology. In the DCA algorithm, $m$ is derived to count the number of sensor nodes that directly transmit data to the sink. $k = A/A_0$ is the expected number of CHs of MSNs, where $A$ is the sensor network area and $A_0$ is the area of one cluster.
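The delay check performed by the sink can be sketched as below. The sketch assumes that the per-node delay bound is the hop count times the maximum link delay; the function names and the data layout (a list of `(source, delay)` pairs and a dictionary of hop counts) are hypothetical.

```python
def path_delay_bound(hops: int, max_link_delay: float) -> float:
    """Assumed bound: the number of hops times the maximum delay of one link."""
    return hops * max_link_delay


def needs_reclustering(packets, hop_counts, max_link_delay: float) -> bool:
    """Return True if any received packet exceeded the bound for its source node.

    packets: iterable of (source_id, measured_delay_seconds)
    hop_counts: dict mapping source_id to its hop count towards the sink
    """
    for source, delay in packets:
        if delay > path_delay_bound(hop_counts[source], max_link_delay):
            return True
    return False


hop_counts = {1: 2, 2: 3}
on_time = needs_reclustering([(1, 0.015), (2, 0.029)], hop_counts, 0.01)
too_slow = needs_reclustering([(1, 0.021)], hop_counts, 0.01)
```

A single late packet triggers reclustering here; a deployment could equally require several violations before reacting, which the text does not specify.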
The clustering algorithm based on residual energy difference ratio (CAREDR) is shown in Algorithm 2.
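One clustering round can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the threshold is taken to be the LEACH-style base multiplied by the residual energy difference ratio, cluster members are the unassigned nodes within `d_char / 2` of a new CH, and `ch_energy[i]` (the residual energy of node i's current CH) is assumed to be known to node i. All names are hypothetical.

```python
import math


def caredr_round(positions, energy, ch_energy, k, r, d_char, rng):
    """Return {ch_index: [member_indices]} for one clustering round."""
    n = len(positions)
    alive = [i for i in range(n) if energy[i] > 0]
    if not alive:
        return {}
    average = sum(energy[i] for i in alive) / len(alive)
    base = k / (n - k * (r % (n / k)))
    non_ch = set(alive)   # nodes that are neither CHs nor cluster members yet
    clusters = {}
    for i in alive:
        if i not in non_ch:
            continue
        # The 1e-12 guard against a zero denominator is an assumption.
        denom = max(abs(average - ch_energy[i]), 1e-12)
        diff = base * (energy[i] - ch_energy[i]) / denom
        if rng.random() < diff:
            members = sorted(
                j for j in non_ch
                if j != i and math.dist(positions[i], positions[j]) <= d_char / 2
            )
            clusters[i] = members
            non_ch.difference_update(members)
            non_ch.discard(i)
            if len(clusters) == k:
                break
    return clusters
```

Any object with a `random()` method returning a value in [0, 1) can be passed as `rng`, for example `random.Random(seed)`. Nodes that are neither elected nor within range of a CH stay unassigned in this sketch; how such nodes are handled is not specified in the text.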

Theoretical Analysis of Power and Lifetime in MSNs
Random mobility and energy constraints are the two main factors affecting system performance in MSNs, and they cause many difficulties in system design. In order to improve the lifetime and energy efficiency, this paper proposes a new clustering algorithm based on the residual energy difference ratio. The theoretical analysis of the power and lifetime of MSNs is given in the following theorems.
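Theorem 4. Considering cluster $j$ with $n_j$ sensor nodes and data packet transmission rate $R$, the average power of such a cluster is
$$\bar{P}_{\mathrm{cluster}} = (n_j - 1)\left(\alpha_1 + \frac{\alpha_2}{\gamma + 1}\left(\frac{d_{\mathrm{Char}}}{2}\right)^{\gamma}\right)\frac{R_{\min} + R_{\max}}{2}. \tag{12}$$
Proof. According to (3)-(4), the power of such a cluster is
$$P_{\mathrm{cluster}} = (n_j - 1)\,\bigl(P_t(d) + P_r\bigr) = (n_j - 1)(\alpha_1 + \alpha_2 d^{\gamma})\,R. \tag{13}$$
In (13), $d$ is a random variable denoting the distance between a regular node and its CH, and $R$ is the random variable denoting the data rate. The two-dimensional random variable $(d, R)$ follows a uniform distribution over the area $0 \le d \le d_{\mathrm{Char}}/2$, $R_{\min} \le R \le R_{\max}$. We have the average power of a cluster as follows:
$$\bar{P}_{\mathrm{cluster}} = \int_{R_{\min}}^{R_{\max}} \int_{0}^{d_{\mathrm{Char}}/2} \frac{2\,(n_j - 1)(\alpha_1 + \alpha_2 x^{\gamma})\,y}{d_{\mathrm{Char}}\,(R_{\max} - R_{\min})}\,dx\,dy = (n_j - 1)\left(\alpha_1 + \frac{\alpha_2}{\gamma + 1}\left(\frac{d_{\mathrm{Char}}}{2}\right)^{\gamma}\right)\frac{R_{\min} + R_{\max}}{2}. \tag{14}$$
Corollary 5. Considering cluster $j$ with $n_j$ sensor nodes and data packet transmission rate $R$, the average minimum power of such a cluster is
$$\bar{P}^{\min}_{\mathrm{cluster}} = (n_j - 1)\left(\alpha_1 + \frac{\alpha_2}{\gamma + 1}\left(\frac{d_{\mathrm{Char}}}{2}\right)^{\gamma}\right) R_{\min}. \tag{15}$$
Proof. According to (3)-(4), the minimum power of such a cluster is
$$P^{\min}_{\mathrm{cluster}} = (n_j - 1)(\alpha_1 + \alpha_2 d^{\gamma})\,R_{\min}. \tag{16}$$
In (16), $d$ is the random variable denoting the distance between a regular node and its CH, and $R = R_{\min}$ is the minimum data rate. The random variable $d$ follows a uniform distribution over the interval $0 \le d \le d_{\mathrm{Char}}/2$. We have the average minimum power of a cluster as follows:
$$\bar{P}^{\min}_{\mathrm{cluster}} = \int_{0}^{d_{\mathrm{Char}}/2} \frac{2\,(n_j - 1)(\alpha_1 + \alpha_2 x^{\gamma})\,R_{\min}}{d_{\mathrm{Char}}}\,dx = (n_j - 1)\left(\alpha_1 + \frac{\alpha_2}{\gamma + 1}\left(\frac{d_{\mathrm{Char}}}{2}\right)^{\gamma}\right) R_{\min}. \tag{17}$$
Theorem 6. Considering MSNs with $n$ sensor nodes, the lifetime of MSNs is bounded as
$$T_{\mathrm{MSNs}} \le \frac{n E_0}{\Psi R_{\min}}, \tag{18}$$
where $E_0$ is the initial energy of each sensor node and $\Psi = \left(n + \dfrac{k}{\gamma - 1}\right)\alpha_1 + (n - k)\,\dfrac{(d_{\mathrm{Char}}/2)^{\gamma}}{\gamma + 1}\,\alpha_2 - m\,\alpha_{12}$.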
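Proof. Clearly, the total energy consumed in MSNs is not greater than the total energy available in the beginning, that is,
$$T_{\mathrm{MSNs}}\,P_{\mathrm{MSNs}} \le \sum_{i=1}^{n} E_i(0), \tag{19}$$
where $E_i(0)$ is the initial energy of sensor node $i$ and $P_{\mathrm{MSNs}}$ is the energy consumption per second by MSNs, which is composed of two parts: the energy consumption per second by all clusters and the energy consumption per second by all links. From the discussions above, there are $k$ clusters in total in MSNs and each cluster $j$ has $n_j$ nodes ($\sum_{j=1}^{k} n_j = n$, $j = 1, \ldots, k$). Among the $k$ clusters there are $m$ CHs that directly transmit data to the sink and $k - m$ CHs that transmit data to their higher level neighbor CHs. $m$ can be derived from the DCA algorithm in Section 4. There are thus $m$ links on which CHs directly transmit data to the sink, and the energy consumption for receiving data on these links is not included in the total energy of MSNs. Furthermore, our clustering algorithm makes the transmitting distance of all links approximately equal to $d_{\mathrm{Char}}$. In this paper, we assume that the initial energy of all the sensor nodes is the same, that is, $E_i(0) = E_0$ ($i = 1, \ldots, n$). From (19), we have
$$T_{\mathrm{MSNs}} \le \frac{n E_0}{P_{\mathrm{MSNs}}}. \tag{20}$$
From (7), (9), and (15), we conclude
$$P_{\mathrm{MSNs}} \ge \sum_{j=1}^{k} (n_j - 1)\left(\alpha_1 + \frac{\alpha_2}{\gamma + 1}\left(\frac{d_{\mathrm{Char}}}{2}\right)^{\gamma}\right) R_{\min} + (k - m)\,\alpha_1 \frac{\gamma}{\gamma - 1}\,R_{\min} + m\left(\alpha_1 \frac{\gamma}{\gamma - 1} - \alpha_{12}\right) R_{\min} = \Psi R_{\min}. \tag{21}$$
Then we have
$$T_{\mathrm{MSNs}} \le \frac{n E_0}{\Psi R_{\min}}, \tag{22}$$
where $\Psi = \left(n + \dfrac{k}{\gamma - 1}\right)\alpha_1 + (n - k)\,\dfrac{(d_{\mathrm{Char}}/2)^{\gamma}}{\gamma + 1}\,\alpha_2 - m\,\alpha_{12}$.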
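The quantities in this section can be evaluated with a short sketch. It assumes the closed forms read from the text: a regular node's mean power is $(\alpha_1 + \alpha_2 (d_{\mathrm{Char}}/2)^{\gamma}/(\gamma + 1))R$ when its distance to the CH is uniform on $[0, d_{\mathrm{Char}}/2]$, and the lifetime bound is $nE_0/(\Psi R_{\min})$ with $\Psi$ as printed in Theorem 6. The function names and every numeric value in the usage lines are hypothetical.

```python
def mean_member_power(a1, a2, gamma, d_char, rate):
    """Mean power of one regular node, with distance uniform on [0, d_char / 2]."""
    return (a1 + a2 * (d_char / 2) ** gamma / (gamma + 1)) * rate


def avg_cluster_power(n_j, a1, a2, gamma, d_char, r_min, r_max):
    """Average power of a cluster with n_j nodes (one CH, n_j - 1 regular nodes).

    The data rate is assumed uniform on [r_min, r_max], so its mean is the midpoint.
    """
    return (n_j - 1) * mean_member_power(a1, a2, gamma, d_char, (r_min + r_max) / 2)


def psi(n, k, m, a1, a12, a2, gamma, d_char):
    """Psi from Theorem 6, as reconstructed from the text (an assumption)."""
    return ((n + k / (gamma - 1)) * a1
            + (n - k) * ((d_char / 2) ** gamma / (gamma + 1)) * a2
            - m * a12)


def lifetime_upper_bound(n, k, m, e0, a1, a12, a2, gamma, d_char, r_min):
    """Upper bound n * E0 / (Psi * R_min) on the network lifetime, in seconds."""
    return n * e0 / (psi(n, k, m, a1, a12, a2, gamma, d_char) * r_min)


# Hypothetical inputs: 500 nodes, 31 CHs, 4 of them adjacent to the sink.
bound = lifetime_upper_bound(n=500, k=31, m=4, e0=2.0, a1=100e-9, a12=50e-9,
                             a2=100e-12, gamma=2.0, d_char=101.0, r_min=250e3)
```

Because the bound is linear in the initial energy, doubling `e0` doubles the result, which gives a quick sanity check on any implementation.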

Simulations
In this paper, we use the NS2 simulation tools [24-26] to simulate the clustering algorithm CAREDR. The simulation environment is set up as follows. The sensor network area considered in the simulation is 500 m × 500 m. There are 500 sensor nodes and one sink. The simulation parameters are shown in Table 1.
In this experiment, the throughput, energy consumption, and lifetime of the network are simulated. From (6) in Theorem 1, we get $d_{\mathrm{Char}} = 101$. The expected number of cluster heads $k$ can be calculated from the corresponding formula in Section 5: $k = 31$. The average power of each cluster and its lower bound, discussed in Section 6, help to calculate $\mathrm{Diff}(i)$ and select suitable CHs. The upper bound on the lifetime in Theorem 6 offers a theoretical reference for comparing system performance. We compare the results obtained by CAREDR with those of LEACH and ACE-C under the same circumstances. As shown in Figure 1, the throughput of CAREDR is higher than that of LEACH and ACE-C, by approximately 49.06% and 300.32%, respectively. Figure 2 shows that the new algorithm has lower energy consumption than LEACH and ACE-C, by approximately 7.92% and 11.94%, respectively. Figure 3 shows the death rate of the nodes, from which we can conclude that the new algorithm has a longer lifetime than LEACH and ACE-C, by approximately 5.82% and 13.18%, respectively. The CAREDR algorithm therefore effectively improves the performance of MSNs.
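The reported value $k = 31$ can be checked with simple arithmetic, assuming (as the description in Section 5 suggests) that a cluster covers a disc of diameter $d_{\mathrm{Char}}$ and that $k$ is the network area divided by the cluster area. The function names are illustrative.

```python
import math


def cluster_area(d_char: float) -> float:
    """Area of a disc with radius d_char / 2."""
    return math.pi * d_char ** 2 / 4


def expected_cluster_heads(network_area: float, d_char: float) -> float:
    """Expected number of CHs: network area divided by the area of one cluster."""
    return network_area / cluster_area(d_char)


# Simulation setting from this section: 500 m x 500 m area, d_Char = 101.
k_estimate = expected_cluster_heads(500 * 500, 101)
k_rounded = round(k_estimate)
```

The estimate is about 31.2, which rounds to the 31 cluster heads used in the simulation.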

Conclusions
In order to improve the performance of MSNs, a new clustering algorithm based on the residual energy difference ratio is presented in this paper. It is a dynamic and energy-efficient algorithm which improves the lifetime, throughput rate, and energy efficiency of MSNs. The residual energy of the sensor nodes and the average residual energy of the system are considered in the CH selection phase, which effectively prevents nodes with low residual energy from being selected as CHs. An energy-optimal scheme is used in the cluster formation phase. Furthermore, the sink dynamically clusters the sensor nodes according to the data transmission delays, which lets the whole system adapt to the random mobility of MSNs.
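Input:
  n: the total number of sensor nodes
  k: the expected number of CHs
/* N_CH: the number of CHs. S_CH is the set of CHs. S_N is the set of non-CHs. S_CH ∪ S_N = V. */
N_CH = 0;
For (i = 1; i <= n; i++) {
  If node i ∉ S_N then continue;
  For node i ∈ S_N, according to (11) judge the value of Diff(i) with the random number rand(i) of node i;
  If rand(i) < Diff(i) then {
    i ∈ S_CH; N_CH = N_CH + 1;
    Form its cluster and remove all the nodes in this cluster from S_N
  };
  If N_CH = k then break
}
End
Algorithm 2: The clustering algorithm based on residual energy difference ratio (CAREDR).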

Figure 2: Comparison of the energy consumption.

Figure 3: Comparison of the lifetime.
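Notation for the threshold in (11), $\mathrm{Diff}(i) = \dfrac{k}{n - k\,(r \bmod n/k)} \cdot \dfrac{E_i - E_i^{\mathrm{CH}}}{\left| E_{\mathrm{total}}/n_{\mathrm{alive}} - E_i^{\mathrm{CH}} \right|}$: $E_i$ and $E_i^{\mathrm{CH}}$ are the residual energies of node $i$ and its CH, $E_{\mathrm{total}}$ is the whole energy of the nodes, $n_{\mathrm{alive}}$ is the number of nodes still alive, and $E_{\mathrm{total}}/n_{\mathrm{alive}}$ represents the average residual energy. $E_i - E_i^{\mathrm{CH}}$ is the difference between the residual energy of node $i$ and that of its CH. $\left| E_{\mathrm{total}}/n_{\mathrm{alive}} - E_i^{\mathrm{CH}} \right|$ is the absolute value of the difference between the average residual energy and the residual energy of the CH. $(E_i - E_i^{\mathrm{CH}})/\left| E_{\mathrm{total}}/n_{\mathrm{alive}} - E_i^{\mathrm{CH}} \right|$ is the residual energy difference ratio. When $E_i$ is greater than $E_i^{\mathrm{CH}}$ and $\left| E_{\mathrm{total}}/n_{\mathrm{alive}} - E_i^{\mathrm{CH}} \right|$ is smaller than $E_i - E_i^{\mathrm{CH}}$, a larger threshold $\mathrm{Diff}(i)$ is generated. This indicates that node $i$ has more residual energy than most sensor nodes.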