Energy-Efficient Collaborative Communication for Optimization Cluster Heads Selection Based on Genetic Algorithms in Wireless Sensor Networks

To solve the energy constraint problem caused by the neighborhood of the sink, which is burdened with heavy relay traffic via multihop communication and tends to die earlier, a new energy-efficient collaborative communication model is proposed based on genetic algorithms in wireless sensor networks (WSNs). By setting the threshold value for a new generation judgment function, the proposed algorithm would be capable of judging whether the sensor nodes can be a cluster head. Then, the genetic algorithms will filter out some nodes from these temporary cluster heads to get the final cluster heads. Simulation results show that the proposed algorithm can allocate energy to each node of WSNs and postpone the death of the first node. In this manner, the lifetime of WSNs is effectively prolonged.


Introduction
Due to limited power of sensor nodes, an energy-efficient transmission has become a key requirement for wireless sensor networks [1,2].Typically, a sensor network consists of a number of autonomous sensor nodes, which are selforganized and cooperative to perform a particular task [3].The objective of WSNs is to transmit sensed data to sink nodes or to the base station locating in a further destination of most nodes [4].There usually exists an energy constraint problem; for example, sensors may not be recharged and replaced.Since the energy supply of a sensor node is always limited, energy optimization should be considered as a key aspect when the performance of WSNs is studied, and the design of data gathering protocols would play an important role in the research of WSNs [5].For such kind of problems, collaborative communication has proven to be an effective important energy-saving solution [6].
As WSNs are networks with limited energy, if there is excessive energy consumption on some of the network nodes, the node will fail in a short time, which would lead to the situation that the effective sensing data area becomes smaller, and, in such case, the performance of the whole network would deteriorate.As the transmission becomes larger, more hops are needed to realize multihop transmission, which leads to an increase of overheads in implementing the multihop routing [7].Meanwhile, if transmission errors occur, data would be retransmitted from the source node along the multihop path.This may result in significant increases in the total energy consumption of nodes along the path [8].Collaborative communication would produce high power gain and significantly mitigates the fading if frequency and phase synchronization are realized simultaneously.
With an ultimate objective of reaching reduction and balance in energy consumption for each node in the network, this paper proposes a novel energy-efficient collaborative communication algorithm (EECCA) for the optimization of cluster heads' selection based on genetic algorithms.The proposed algorithm would show its advantage in prolonging the network's lifespan, by dynamically adapting the network topology within cluster and reducing the energy consumed in communication.By setting the threshold value for a new generation judgment function, the EECCA determines whether the sensor node becomes a cluster head.Then, it filters some nodes via genetic algorithms from these temporary cluster heads to get the final cluster heads.Simulation results show that the EECCA can allocate energy to each node of WSNs more efficiently and regularly and postpone the occurrence of a node's death in WSNs; thus, it can effectively prolong the lifetime in WSNs.
The rest of this paper is organized as follows: a brief overview of the related works is presented in Section 2. Section 3 describes the radio energy dissipation model.A novel energy-efficient model for the optimization of cluster heads' selection is proposed in Section 4. The simulation results and performance analysis are presented in Section 5. Finally, Section 6 concludes the paper and discusses some future research directions.

Related Works
The collaborative communication has attracted much attention of many domestic and international researchers interested in WSNs.This section gives a brief review about the various existing cluster-based routing protocols in WSNs.
Although the LEACH (low energy adaptive clustering hierarchy) routing protocol has many obvious advantages in WSNs' cluster organization, it has not considered the residual energy of candidate nodes.Therefore, it is possible that a candidate node of low residual energy be selected as a CH [9].LEACH protocol does not take into account the residual energy of node during the period of cluster head selection.Additionally, random cluster head selection may lead to instability in the distribution of the rounds that produce cluster heads and cannot guarantee a uniform distribution of cluster heads in the network [10].LEACH-C [11] is an improved LEACH using centralized clustering control, which is the same protocol as LEACH, as far as the steady-state phase characteristics were concerned.In the set-up phase of LEACH-C, each node transmits information about its current position and its residual energy level to the BS.
Tan et al. proposed an LEACH-DE (LEACH-distance energy) routing protocol to decrease energy consumption and prolong network lifetime [12], in which the cluster head node is selected not only by considering whether residual energy of the node is greater than the average residual energy level of nodes in network, but also by examining the geometric distance between the candidate node and the BS.Wang et al. established a Markov random model for LEACH protocol and indicated that the number of cluster heads produced by LEACH protocol is a binomial random variable, whose value spreads over a large range inside the target, instead of concentrating in the vicinity of the target [13].Wang and Xiong made use of the Monte Carlo simulation to analyze the statistical properties of the number of cluster heads for LEACH [14].Zhang et al. proposed a novel LEACH-NOC routing protocol (low energy adaptive clustering hierarchy new optimal cluster) [15].The proposed approach can optimize the network structure and solve the problem of optimal cluster head, given the cluster distribution.
The density of actual network' node is often uneven, and the member of the cluster will be larger in the relatively dense area.For larger clusters, the cluster head is responsible for more nodes and has a higher transmission burden, which is bound to cause premature depletion of energy.In a relatively sparse region, the situation becomes the opposite.In [16], an energy-efficient heterogeneous clustered scheme was proposed based on weighted election probabilities of each node, which are finally selected as CHs according to the residual energy.Bajaber and Awan proposed an adaptive decentralized reclustering protocol (ADRP) [17], in which CHs and next heads were selected according to the residual energy of each node and the average energy of each cluster.Chamam and Pierre proposed a novel distributed clustering algorithm where CHs were selected under the constraint of a three-way message exchange mechanism between each sensor and its neighbors [18].In [19], a novel ECCP (energyefficient cluster-chain) clustering protocol for wireless sensor networks was proposed, in which a hybrid clustering approach was used to divide sensor nodes into clusters by using multiple metrics and a chain among the sensor nodes within each cluster was constructed so that each node is capable of receiving data from a neighbor while transmitting data to another neighbor.
Due to the high distribution density of nodes in wireless sensor networks, two optimal coverage control schemes were proposed, based on weighted genetic algorithm and constrained genetic algorithm, respectively [20].However, the proposed algorithm used a random manner in nodes deployment, which resulted in energy consumption imbalance.Bajaber and Awan proposed a CDC (centralized dynamic clustering) algorithm for wireless sensor networks [21].In each cluster, cluster head collected the data from all the cluster members, aggregated the data, transmitted aggregated data to the base station, and selected new cluster head for next round.When one sensor node dies, the cluster head would either send messages to the station forming the clusters or use the residual energy to select a new cluster head for the next round.Zarei et al. proposed a distributed and energy-efficient protocol, called CBRP, for data gathering in wireless sensor networks [22].CBRP clustered the network using new factors and then constructed a spanning tree for sending aggregated data to the base station.The main drawback of CBRP was communication overhead due to the large number of nondata messages exchanged between sensor nodes.Younis and Fahmy proposed a hybrid energy-efficient distributed (HEED) clustering approach for ad hoc sensor networks [23].HEED selected cluster heads randomly based on probability but it distributed cluster heads more uniformly across the sensor network by multiple iterations and smaller cluster ranges.The approach set the probability of cluster heads' selection by each node's residual energy at the first iteration of each round.

Radio Energy Dissipation Model
Radio energy dissipation model is widely used for energy analysis of wireless sensor networks [7], which is given as follows: where  elec denotes the electronics energy and depends on the digital coding, modulation, filtering, and spreading of the signal.   2 and  amp  4 are the amplifier energy, which depends on the distance between the sender and the receiver and the acceptable bit-error rate.
To continuously monitor the phenomenon, the sensor nodes are assumed to be deployed randomly in a region of  2 nodes, and the network is assumed to have the following properties.
(2) All nodes have similar communication and processing capabilities and equal significance.
(3) Nodes are deployed in random scenarios.Each node is left unattended after deployment and it is impossible to recharge or replace the batteries.
(4) Each node has limited transmission power level and thus limited radio coverage.
All nodes are organized with each one belonging to some cluster.Each node randomly decides to become a cluster head.Once a node was selected as a cluster head, it aggregates the data from various nodes inside the cluster and then sends the data to the base station.To avoid frequent changes in network topology, it is assumed that nodes are stationary or moving slightly.
Definition 1 (lifetime).A network's lifetime refers to the time span from the very beginning of the network's formation to the time when the death of one node or a certain amount of nodes occurs for the first time.
Definition 2 (data aggregation).Data aggregation refers to the process during which data is collected and then aggregated by the cluster head (CH) and finally sent to the base station.
It is assumed that each cluster head compresses the data with the ratio  after receiving the data from all members.Definition 3 (length of message).The length of message between the nodes is uniformly set to be  for cluster head selection and cluster set-up phase.

Energy-Efficient Collaborative Communication Model
Collaborative communication has been proven to be an important energy-saving method.The traditional LEACH algorithm depends on the random number generated by  sensor node, and the instability of random number leads to the instability of the number of the sensor nodes [24].
In order to ensure the stability of the cluster head distribution, the cluster head selection phase is implemented in the following steps.
(1) Threshold function formulation: the node   adopts a novel threshold function to obtain a random value, which decides whether this node will become a temporary cluster head.The threshold function can effectively guarantee that the number of temporary cluster heads  tmp is larger than the optimal number  opt .
(2) Cluster heads selection: if  tmp >  opt , then  opt cluster heads will be selected from the temporary cluster heads according to the residual energy and relatively uniform spatial distribution.If  tmp <  opt , the normal nodes will generate a random value repeatedly until more temporary cluster heads are produced.
The operation of wireless sensor network consists of several rounds, as shown in Figure 1.There are a set-up phase and a steady-state phase in the main parts within each round.The set-up phase consists of four parts: a temporary cluster head selection and advertisement period, optimal cluster heads selection period, member nodes association and cluster formation period, and the schedule creation period [25].

Optimal Numbers of Cluster Heads.
During the selection phase, there are  tmp nodes that broadcast the message of being temporary cluster head with a communication radius .Since the amount of data produced in local interactive communication is relatively small, the energy consumed in this process can be neglected.In the following analysis on the optimal clusters, let  denote the real number of the cluster heads for calculation.
In the process of cluster formation, each cluster head receives the entire join-in message from its members and broadcasts TDMA sequence.In the whole network, the number of member nodes of all clusters is  − , and the distance between non-cluster-head node and cluster-head node satisfies  toCH <  0 .Due to the relatively small distance between the member nodes and their cluster head, the energy consumption between them satisfies the free-space channel model.Therefore, the energy consumption in the stage of cluster formation is In data transmission stage, there are  cluster heads receiving the monitored data from their members and then aggregating these received pieces of data with their own.Assume that each message occupies  bit, and then the energy consumption can be expressed as where  DA is the energy consumption coefficient.The distance between cluster-head node and base station node satisfies  toBS >  0 , and the energy consumed during the cluster head sending messages to the base station is defined as where  is the compression ratio.Therefore, the total energy consumption in each round can be defined as From ( 2) to (5), one can obtain Assume that the probability density distribution of nodes in the cluster is (, ), and the cluster head locates in the centric point of the cluster.Then, the [ 2  toCH ] is defined as follows: In order to minimize the average energy consumption, the relationship between the average distance  toBS and the radius  of each node must be formulated.According to the research of Dr. Zhu [26], there is Given the partial derivative  total with respect to , and the partial derivative is given to 0, the optimal number of cluster heads can be obtained as

Improvement of Cluster Head Election Threshold Function.
To balance the overall energy consumption across the network, the role of cluster head is rotated among all sensors.
During the period of cluster head selection, each candidate node selects a random number between 0 and 1 (e.g., 0.05) and compares it with a calculation threshold value ().If the random number is lower than the threshold value, then that candidate node will become the CH in the current round.The threshold function is where  is the expected probability that a candidate node becomes a CH,  indicates the current round in the network,  is the set of nodes, node  was not a CH in the most recent 1/ rounds, and () is the threshold probability value for candidate node  to become a CH at round  in time .
Given that the number of temporary cluster heads is greater than  opt , that is,  tmp >  opt , there would be  opt formal cluster heads that can be sifted out from these nodes with enough choices.However, the number of temporary cluster heads cannot exceed  opt too much, or the energy consumption of the network will definitely increase due to more broadcast messages.For this reason, there would be  = √ tmp / and  tmp >  opt .The formula can improve each node's possibility of being a cluster head in the network.
Since LEACH does not guarantee a uniform cluster heads distribution, given the network's load and energy distribution balanced during cluster head selection, setting the scale of each cluster should be based on the density of nodes.In this manner, the threshold function can be further improved.
Let Neighbor (  ) denote the set of live neighboring nodes for node   , termed as Neighbor (  ) = {  |  , ≤ ,   ∈ }; then the density of   can be defined as It can be seen from the above expression that node density will change dynamically with the continuous operation of the network.In order to increase the number of cluster heads, we can raise the probability of becoming a cluster head for the node in dense regions and vice versa.
Therefore, (10) can be modified and the following expression can be derived: In the formula, the numerator that the neighboring nodes of   subtract 1/√ tmp / denotes an incentive mechanism.For example, when the surviving neighbors surrounding node   are greater than the average size of all clusters, it shows that node   is regarded to be in dense areas, and its probability of becoming a cluster head should be raised accordingly.Otherwise, its probability should be reduced.As far as nodes density was concerned, the cluster head selection is no longer a matter of combining a single node but a comprehensive consideration of surrounding nodes.

The Option of Formal Cluster Heads.
In order to select  opt formal cluster heads from the temporary cluster head set, effective different factors should be considered, such as the extent of coverage for all cluster heads and the degree of residual energy balance from their adjacent members.
Considering the network coverage mode, disk sensing model [27] is currently the most widely used perceptual model for simplicity and efficiency, and it can be used to analyze the coverage of wireless sensor networks.
Assume that the coverage transmission area of each node is a circle with a radius ; then if the target node locates in this area, the monitoring capacity of the target node is considered to be 1 or otherwise to be 0.
The two-dimensional disc perception model can be expressed by the following formula: where   (  ) denotes the monitoring capacity of the target node   for node   .(  ,   ) is the Euclidean distance between   and   , and Hence, the coverage model function can be expressed by   = ∑  =1   (  ).Let  1 (),  2 (), . . .,   () denote the coverage ratio of the point  by nodes  1 ,  2 , . . .,   , respectively, then the total coverage ratio of this point is The monitoring area can be divided into (/) × (/) square cells with a side length of .By summarizing from (14) and calculating the average value, we can obtain the total coverage ratio of the whole network as approximately Next, taking into account the regional energy balance, we present another optimization objective of energy equilibrium factor.The energy equilibrium function can be expressed as where   residual denotes the residual energy of node  and   0 denotes the initial energy of node .Therefore, the overall objective function can be expressed as where  1 or  2 is the weight of the subobjective functions  1 and  2 , respectively, and The actual value of the objective function  is between 0 and 1; the greater the value of the objective function, the better the optimization performance.
GA (genetic algorithm) is based on the principles of natural selection and the genetic processes associated with biological organisms, which is a popular technique used for searching large solution spaces [28,29].According to the random distribution and the dormant and active characteristics of wireless sensor network nodes, the proposed protocol uses binary encoding.  = { 1 ,  2 , . . .,   , . . .,   } denotes a temporary cluster head node selected as the official cluster head, and the node  is selected as the working node when there is   = 1.The initial population composed of  ×  individuals is termed as The proposed algorithm uses a traditional roulette wheel selection method.Once the fitness of each individual is derived, the individual fitness value of the entire group can be normalized.In order to ensure that genes are retained within the offspring of parents as much as possible, the algorithm is equipped with a cyclic recombination mechanism.The proposed algorithm uses the basic bit mutation.This is done by chromosomes randomly assigned to one mutation probabilities value for a few gene mutation operations.
The repetition of this process would not stop until a termination condition is reached.Common termination conditions include the following.(1) A solution is found satisfying the minimum criteria; (2) a fixed number of generations are reached; (3) the highest ranking solution's fitness approaches a plateau such that subsequent iterations no longer produce better results.For this consideration, the termination condition can be regarded as being satisfied  when the network coverage reaches 95%.At the end of the genetic algorithms, a chromosome with the largest fitness would be derived as the optimal solution, and the corresponding decoder would be of the expected resolution.

Simulation and Results Analysis
In order to compare the performance and lifetime of the proposed algorithm with that of LEACH [7], CBRP [22], and TLCP [30], NS2 and the MIT uAMPS LEACH NS extensions are used in simulation.A simulation scenario is set up with the parameters presented in Table 1.
Figure 2 shows a typical random distribution of 200 nodes deployed in an 100 × 100 m 2 area.Figure 3 shows the variation tendency of the number of cluster heads obtained using the LEACH protocol.It is clear that the number of cluster heads produced by LEACH protocol is distributed from 0 to 35; however, the proportion of the rounds, when the number is equal to the optimal value, is less than 25%, and in the worst case, the number of cluster heads is 0.
Figures 4 and 5 show the distribution of cluster heads in a different round of LEACH and the proposed model, respectively.It is seen that the distribution of cluster heads in LEACH is uneven and many cluster heads located in the marginal region of the network.As the distance between a cluster head and member nodes may be large, the high intracluster energy consumption may affect the lifetime of the network.In EECCA, the objective function formulates an optimization process of selecting better nodes, and the optimization results depend on the overall distribution of these nodes, which guarantee the stability of the number of cluster heads.In such manner, the concentration of nodes in a local area can be avoided and the energy consumption can be effectively reduced.The problem of selecting  opt cluster heads from the temporary cluster heads can be solved by utilizing genetic algorithm.Figures 6 and 8 give the covering sensor sets in the 100th and 200th rounds, respectively.It can be seen that full coverage is almost achieved for the whole network.Specifically, the iterations of the algorithm corresponding to Figures 6 and 8 are implemented less than 30 times, which can be seen in Figures 7 and 9, respectively.After every round of iteration, the coverage ratio of the whole network can be calculated using the cluster heads selected, and the iterations would not stop until the area coverage rate reaches 95%. Figure 10 shows the relationship between the percentage of cluster heads and the variation of the node's transmission radius.It can be seen from Figure 10 that the variation tendency of cluster heads is inversely proportional to that of the transmission radius.This is to say, as the transmission radius of a cluster head increases, the coverage area of a single node will be expanded.Therefore, to get a large transmission radius, a relatively small amount of cluster heads should be selected for the whole network.
Figure 11 shows the average energy consumption in each round, in which it is clear that when the number of cluster heads is less than 14, the average energy consumption in each round will decrease rapidly as the number of the cluster heads increases.The reason for this phenomenon is that increased cluster heads distribute the energy loads evenly in the networks.However, as the number of cluster heads approaches 14, the total energy consumption reaches its minimal value 0.049 J.
As the scale of the networks becomes larger, the number of temporary cluster heads increases, for there is a subtle relation between the lifetime and the number of  tmp .When  tmp is relatively small, the number of temporary cluster heads is less than  opt .This means that the network would run in the condition of nonoptimal number of clusters and lead to more energy consumption.If there exists  tmp >  opt , the optimal cluster heads will be chosen and reduce the total energy consumption, as seen in Figure 12.Although too many temporary cluster heads will generate certain broadcast packets, the energy of these processes is very limited and the messages of packets for clustering can be considered negligible.
Figure 13 gives the relationships between the distribution of the number of live sensor nodes and the number of rounds, with respect to each algorithm, in which the durations from the time when the first node dies to that when the last node dies, corresponding to different routing protocols, are also included.It can be observed that the proposed model has better performance than the others.From the perspectives of balancing energy consumption for all nodes in the networks, the proposed model indubitably outperforms LEACH, CBRP, and TLCP.
Figure 14 simulates the network's energy consumption.It can be observed that the optimal model uses less energy, compared with LEACH, CBRP, and TLCP in each round.The above results show that this model can extend network lifetime and stability period, balance energy, and reduce energy consumption and instability period.

Conclusion
To solve the energy constraint problem caused by the cluster heads close to the sink, an energy-efficient model for the optimization of cluster heads' selection is proposed based on genetic algorithms in wireless sensor networks.Simulation results show that the EECCA can allocate energy to each node regularly with more efficiency, postpone the occurrence of a node's death in WSNs, and effectively prolong the lifetime of WSNs.In the future, the proposed EECCA algorithm would be improved to deal with the development of heuristic algorithms for the application of a superclustering on the cluster head and the integration of other parameters in the clustering process.

Figure 5 :Figure 6 :
Figure 5: The distribution of cluster heads in EECCA.

Figure 14 :
Figure 14: Total energy consumed during the simulation runs.

Table 1 :
Detailed parameters for simulations.