CAPNet: An Enhanced Load Balancing Clustering Algorithm for Prolonging Network Lifetime in WSNs

A wireless sensor network consisting of resource-, size-, and cost-limited sensors is used in many military and civil applications. This paper proposes an energy-efficient clustering algorithm that extends the lifetime of sensor networks. The proposed clustering algorithm is an extended hierarchical clustering protocol that minimizes the overall amount of consumed energy in the network. The proposed approach dynamically updates clusters and distributes the load on the heavily loaded cluster heads among different nodes. It also balances the remaining energy on nodes in the network, which leads to prolonged network lifetime. The performance is evaluated in terms of network lifetime, average energy consumption, and standard deviation of residual energy.


Introduction
With the advance of wireless communication, sensor network technology is increasingly used to monitor ambient conditions (e.g., temperature, humidity, and pressure) in hostile environments, where access is risky and costly [1]. For many applications, we envision sensor networks composed of hundreds to myriads of sensor nodes [2]. Although sensors are inexpensive, they have limited computation capability and battery capacity. Sensor nodes in Wireless Sensor Networks (WSNs) collect sensed data and cooperatively send their data through routing paths to a destination; one of the main issues in WSNs is energy efficiency, which allows a longer network lifetime [3][4][5]. The network lifetime can be defined in many different ways (e.g., the time elapsed until the first (or last) node dies) [6,7]. To extend the network lifetime, routing protocols that minimize the amount of energy for transmitting data to a destination are needed. Most energy is consumed in communication between nodes, so routing protocols are designed to conserve the limited energy resources of the sensors (e.g., avoiding long-distance transmissions via communication with closer nodes) [8].
Clustering is an important mechanism in energy-efficient routing protocols for WSNs [9]. In sensor network clustering, the network is divided into several clusters based on certain criteria, with each cluster managed by a cluster head (CH). Sensor nodes in a cluster transmit sensed data to their CH [10]. The CH relays the data to a destination or an upper cluster in a hierarchy of clusters with possible aggregation and fusion operations [11]. A clustering scheme increases energy efficiency by avoiding long-distance transmissions, using CHs as intermediate nodes [12]. In addition, in-network data operations such as data aggregation and fusion eliminate redundancy, thereby reducing the total energy consumption [6].
One of the important topics for increasing the network lifetime is load balancing [13]. Clustering schemes give rise to an uneven energy consumption problem: some nodes deplete their energy and die much faster than others. This occurs because the formation of clusters of unequal size or different geographical conditions puts uneven loads on sensor nodes in the network [4, 13, 14, 15]. In addition, some nodes are burdened with heavier loads, leading to the so-called "hot spot problem" [16]. Several existing clustering protocols that improve energy efficiency by providing optimized cluster structures fail to address the significant reduction in network lifetime caused by the unbalanced residual energy of nodes [17].
International Journal of Distributed Sensor Networks
Combining efficient energy consumption with load balance control is a challenging issue for extending the lifetime of WSNs. In this paper, we propose and evaluate an enhanced load balancing clustering algorithm for prolonging network lifetime. The proposed method has two features for extending the network's lifetime: (1) the routing path and data aggregation are optimized using a multihop clustering scheme, and (2) network reconfiguration is determined based on the residual energy of the nodes that consume high energy. This balances the remaining energy on nodes by rotating the role of nodes based on their current residual energy level. The proposed approach achieves increased network lifetime by minimizing the overall consumed energy in the network and providing load-balanced clustering.
The rest of the paper is organized as follows. Section 2 analyzes related work. Section 3 describes the proposed enhanced load balancing clustering algorithm for prolonging network lifetime. Section 4 evaluates the performance of the proposed approach. Finally, conclusions and future research directions are given in Section 5.

Related Work
Research on cluster-based sensor networks for energy efficiency is quite extensive. Some clustering algorithms [9,18] utilize location information without considering the energy model. They propose approaches which make the minimal number of distinct clusters with a static communication distance. Park et al. [9] also introduced methods which can maintain overlapped clusters. These methods have a limitation, however, because they do not consider the energy model and cannot measure actual energy efficiency.
Many researchers have proposed static clustering approaches based on an energy model in order to organize optimized clusters with several factors (e.g., geographical condition and design goal). These approaches do not change the cluster formation that was organized first. Instead, the role of CH is shared in rotation by the nodes in the cluster to reduce the load of the nodes. Reference [19] was concerned with global information for efficient sensor network clustering. The balanced clustering approach takes into account a virtual partition which is calculated by mathematical approximation of the regional residual energy. Reference [20] proposed the distributed hierarchical agglomerative clustering algorithm for finding efficient clusters. It provides an agglomerative hierarchical clustering method that computes resemblance coefficients from quantitative (e.g., location) or qualitative (e.g., connectivity) information.
On the other hand, many studies presented various dynamic clustering approaches which elect new CHs in every round to control energy consumption efficiency and load balance and thereby extend the sensor network lifetime. Reference [21] produced a pioneering work with a balanced energy dissipation approach for sensor networks, which randomly selected a few sensor nodes as CHs and rotated this role to evenly distribute the energy load among the sensors in the network. This scheme is able to equalize the energy consumption of sensor nodes by uniformly spreading CHs throughout the network.
Later studies on clustering proposed CH election algorithms which are random rotation based probability models with energy levels, node density, and overlapping areas as parameters. Reference [22] optimized cluster formation using secondary parameters such as node degrees in addition to existing residual energy levels to equalize the load among nodes. Reference [23] achieved uniform distribution of the CH nodes by favoring nodes deployed in densely populated network areas as better candidates for CH nodes. Reference [24] proposed a balanced clustering algorithm that exploits the redundancy properties and the energy level of multiple temporary clusters.
The one-dimensional clustering algorithm aims to reduce energy consumption through data fusion and aggregation inside the cluster. To further reduce the energy consumption required for delivering messages to the base station, studies have suggested routing path optimization techniques using the multihop clustering technique. Reference [11] proposed a chessboard clustering scheme with a number of ordinary sensors named low-end sensors and a few powerful sensors called high-end sensors in order to maintain a balance of the node energy consumption. Reference [10] proposed a multihop clustering protocol which has two different roles for CHs for prolonging the network's lifetime. Reference [17] composed a cluster with consideration given to coverage, overlapping, and connectivity conditions and sent messages by k-hops through a routing table for managing adjacent clusters and boundaries.
The multihop clustering technique, however, has a hot spot problem during the process of optimizing the communication path. To overcome this problem, unequal clustering protocols which can achieve load balance have recently been proposed. Reference [25] proposed an unequal clustering size model where the network was organized into clusters of unequal size according to the distance to the destination. This scheme created more clusters in the area close to the base station so as to deal with the problem in the conventional multihop scheme that assigns a heavier load on the nodes near the base station. Reference [26] performed multihop clustering using relay nodes with high residual energy levels after creating clusters with unbalanced sizes by considering their distance to the base station. Reference [13] determined the cluster size by considering the locations of CHs relative to the base station. The created CHs send messages using multihop forwarding. However, most unequal clustering protocols focused on load balance within each cluster. Reference [7] proposed a hybrid intercluster routing strategy in which multiple chains were created for message transmission. To solve the hot spot problem that may result from this, the base station and CHs temporarily communicated with each other directly.

Cluster head election stage.
(2) The nodes elected as CHs broadcast a message notifying their location and residual energy.
(3) Each CH calculates the distance factor based on the received messages.
(4) The determinant node factor is calculated by dividing the residual energy by the calculated distance factor.

An Enhanced Load Balancing Clustering Algorithm
In this section, the proposed enhanced load balancing clustering approach is described. The algorithm reduces the energy consumed in communication with the base station (BS) by using a multihop clustering scheme, and thereby contributes to prolonging the lifetime of the network. It also distributes roles among nodes according to their residual energy and the distance factor. It can maintain load balance in homogeneous networks, in which all sensor nodes are identical, so that the lifetime of the entire network is increased. Beyond CH election, the proposed approach is implemented in two separate phases: the first phase, where a determinant node (DN) is elected, and the second phase, where network reconfiguration is determined (Table 1).

Cluster Head Election Stage (Flexible Phase). A probability-based clustering scheme [21,24] is used to divide the network into many clusters. The CHs, which are the representative nodes of the clusters elected in this stage, become the candidate group for the DN and serve the role of gathering information from their affiliated nodes and transmitting it to the DN. Whether to implement this phase is determined by the amount of residual energy of the cluster heads. This method avoids unnecessary changes in the network composition and thereby reduces the imbalance in residual energy between nodes; namely, a CH that has enough residual energy maintains its role as a CH. As a result, the coexistence time of all nodes can be extended in the random-based clustering scheme.

Determinant Node Election Stage. The DN receives messages from CHs, gathers these messages, and transmits them to the BS. Therefore, to consider the energy consumption of the entire network, both the energy consumed when CHs send messages to the DN and the energy consumed when the DN communicates with the BS should be considered. The energy necessary for CHs to communicate with the DN follows the free space model (fs). If the BS is located sufficiently far from the network, the radio power should be sufficiently amplified in order to transmit messages. Therefore, in this case, the multipath fading model (mp) is applied to calculate the distance factor in (1), whose inputs are the length of the message for transmission and the distance [12]. An effective DN should have high residual energy and consume little energy, because it is the only node in the network that communicates with the BS. Thus, a node that has high residual energy and consumes low energy in communication should be chosen among the CHs. The determinant node factor of (2), which is the residual energy divided by the distance factor, is delivered to each CH with a broadcast message. Each CH should perceive itself as the DN if its determinant node factor is the highest and should broadcast an election message to the entire network.

Table 2: Assumptions for estimating the residual energy of the entire network.
(1) Nodes are deployed evenly over the entire network.
(2) A cluster quartering the entire network is assumed.
(3) The role of DN is rotated between four CHs.
(4) A perfect data aggregation is assumed.
This process is implemented again when the DN elects complete replacement of the network configuration.This process is not implemented if the DN is maintained as it is without any change or replaced by one of existing CHs.
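The DN election described above can be illustrated with a minimal Python sketch. Since the exact form of the distance factor in (1) is not reproduced in this text, the function `distance_factor` below is an assumption: it sums the free space amplifier energy of the other CHs sending to the candidate and the multipath amplifier energy of the candidate sending to the BS. The class `ClusterHead` and all function names are hypothetical; the amplifier constants are those given in the experimental setup.

```python
import math
from dataclasses import dataclass

# Amplifier constants from the experimental setup (Section 4.1).
EPS_FS = 10e-12      # free space model, J/bit/m^2
EPS_MP = 0.0013e-12  # multipath model, J/bit/m^4


@dataclass
class ClusterHead:
    node_id: int
    x: float
    y: float
    residual_energy: float  # joules


def distance_factor(candidate, heads, bs, bits):
    """Assumed distance factor: amplifier energy spent in one round if
    `candidate` is the DN (CH-to-DN links use fs, DN-to-BS uses mp)."""
    to_dn = sum(
        bits * EPS_FS * math.dist((h.x, h.y), (candidate.x, candidate.y)) ** 2
        for h in heads
        if h.node_id != candidate.node_id
    )
    to_bs = bits * EPS_MP * math.dist((candidate.x, candidate.y), bs) ** 4
    return to_dn + to_bs


def dn_factor(candidate, heads, bs, bits):
    """Determinant node factor: residual energy / distance factor."""
    return candidate.residual_energy / distance_factor(candidate, heads, bs, bits)


def elect_dn(heads, bs, bits=4000):
    """Return the CH with the highest determinant node factor."""
    return max(heads, key=lambda h: dn_factor(h, heads, bs, bits))


if __name__ == "__main__":
    heads = [
        ClusterHead(0, 25, 25, 2.0),
        ClusterHead(1, 75, 25, 2.0),
        ClusterHead(2, 25, 75, 2.0),
        ClusterHead(3, 75, 75, 2.0),
    ]
    print(elect_dn(heads, bs=(50, 175)).node_id)
```

With equal residual energy, the sketch favors a CH close to the BS; a distant CH wins only if its residual energy is high enough to outweigh its larger distance factor, which is the trade-off the determinant node factor is meant to capture.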

Determination to Reconfigure Networks Stage.
The existing approaches based on the random rotation model form a new cluster whenever a message transfer finishes at the base station [10,13,17,21,23,24,26]. However, unnecessary changes in cluster formation make balanced control more difficult because even nodes with a low energy level can be selected as the CH. As a result, some nodes may consume all of their energy and die early despite the fact that residual energy in the entire network is sufficient. To solve this problem, the DN should be flexibly elected based on the amounts of residual energy. However, it is difficult to determine this while newly electing the DN at each round. Therefore, maintaining the network configuration when the DN has sufficient energy is more effective in maintaining network load balance.
This section introduces a flexible network reconfiguration technique to resolve the unbalanced residual energy among nodes and to control the network load balance. The network configuration is determined in one of two ways. First, when the average residual energy of the CHs is not less than the average residual energy of all nodes, the existing network configuration is maintained and only the DN is elected from the existing CHs. The average residual energy of the existing CHs ($E_{\text{avg CH}}$) can be determined by (3), which is obtained from the DN selection process in Section 3.2. The CH with the highest DN factor value in (2) is elected as the DN. Otherwise, all nodes in the network reelect CHs in order to reorganize the network configuration.
Meanwhile, determining the total energy of the network requires high energy consumption because communication must occur with all nodes. Thus, each node estimates the energy consumption without communicating with other nodes, expecting that the estimate will converge to the average after many rounds. In other words, the energy consumption arising from the error of the estimated value has an insignificant effect on the network lifetime. Although this does not affect the network lifetime, the error increases as the number of dead nodes increases because a constant value is used. Table 2 shows the assumptions for estimating the residual energy of the entire network.
Equations (4)-(6) are used to calculate the energy consumed by each node in a sensor network to perform its role as an ordinary node, CH node, or determinant node of each cluster. As shown in (4), an ordinary node consumes energy only to send data to its CH; this energy depends on the number of bits in each message and on the distance to the CH ($d_{\text{toCH}}$). The role of the CH is to receive data sent from ordinary nodes, perform data aggregation (DA), and transmit the aggregated data to the DN. Equation (5) calculates the energy needed to perform the role of the CH; it depends on the distance to the DN ($d_{\text{toDetNode}}$), the number of optimal clusters, and the total number of nodes in the network.
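Because (4)-(6) are not reproduced in this text, the following Python sketch shows one plausible form of the three role costs under the first-order radio model, using the constants from the experimental setup. It is an illustration under stated assumptions, not the paper's exact equations: transmitter electronics are assumed to cost the same 50 nJ/bit as reception, clusters are assumed to be of equal size, and the DN is assumed to carry its own CH duties in addition to relaying to the BS. All function and parameter names are hypothetical.

```python
# Radio constants from the experimental setup (Section 4.1).
E_ELEC = 50e-9       # electronics energy per bit, J/bit (assumed equal for TX and RX)
E_DA = 5e-9          # data aggregation energy, J/bit/signal
EPS_FS = 10e-12      # free space amplifier, J/bit/m^2
EPS_MP = 0.0013e-12  # multipath amplifier, J/bit/m^4


def ordinary_energy(bits, d_to_ch):
    """Ordinary node: one free space transmission to its CH."""
    return bits * E_ELEC + bits * EPS_FS * d_to_ch ** 2


def ch_energy(bits, d_to_dn, n_nodes, n_clusters):
    """CH: receive from members, aggregate, send the result to the DN."""
    members = n_nodes / n_clusters - 1          # equal-size clusters assumed
    receive = bits * E_ELEC * members
    aggregate = bits * E_DA * (members + 1)     # members plus the CH's own data
    transmit = bits * E_ELEC + bits * EPS_FS * d_to_dn ** 2
    return receive + aggregate + transmit


def dn_energy(bits, d_to_bs, n_nodes, n_clusters):
    """DN: its own cluster duties, plus relaying all CH data to the BS."""
    members = n_nodes / n_clusters - 1
    own_cluster = bits * E_ELEC * members + bits * E_DA * (members + 1)
    from_chs = bits * E_ELEC * (n_clusters - 1) + bits * E_DA * n_clusters
    transmit = bits * E_ELEC + bits * EPS_MP * d_to_bs ** 4
    return own_cluster + from_chs + transmit


if __name__ == "__main__":
    print(ordinary_energy(4000, 10))
    print(ch_energy(4000, 50, 200, 4))
    print(dn_energy(4000, 100, 200, 4))
```

Under these assumptions, the receive and aggregation terms dominate the CH and DN costs when clusters are large, which is consistent with rotating these roles among nodes with high residual energy.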
The DN receives message data sent from CHs, aggregates the received data, and transmits the aggregated data to the base station. Equation (6) is used to estimate the energy spent by the DN to perform its designated role. In the equation, $d_{\text{toBS}}$ denotes the distance to the base station.
The DN determines whether to maintain or change clusters after comparing the average residual energy of CHs (avg CH) with the average residual energy of the entire network (avg Entire) in (7). The value of avg CH is collected from the notifying messages of CHs. All the network nodes receive the determinant result (det) through a broadcast message:

$$\mathrm{det} = \begin{cases} 1, & E_{\text{avg CH}} \ge E_{\text{avg Entire}}, \\ 0, & E_{\text{avg CH}} < E_{\text{avg Entire}}. \end{cases} \tag{7}$$
The starting point of the new round is determined by the value of the determinant result (det). If the value is 0, the residual energy of the CHs is not sufficient and the first phase is implemented in order to elect new CHs. Otherwise, to avoid unnecessary network reconfigurations, the existing configuration is maintained, and whether to replace the DN is determined in the second stage using the existing CHs as the candidate group. If the DN still has the highest determinant node factor value, it keeps its role; if not, the node with the highest value is elected from the candidate group. This approach makes full use of nodes with large residual energy so as to keep their residual energy in balance with that of the other nodes. As a result, the working time of all nodes can be extended.
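The decision rule in (7) and the resulting choice of starting stage can be sketched in Python as follows. This is a minimal illustration: the function names and the stage labels are hypothetical, and the average residual energy of the entire network is taken as an input because the paper estimates it locally under the assumptions of Table 2.

```python
from statistics import mean


def reconfiguration_decision(ch_residuals, avg_entire):
    """Return det = 1 if the CHs' average residual energy is at least the
    (estimated) network-wide average, else det = 0, following (7)."""
    return 1 if mean(ch_residuals) >= avg_entire else 0


def next_round_start(det):
    """Map det to the stage that opens the next round (labels are hypothetical)."""
    # det == 1: keep the clusters and only re-evaluate the DN among existing CHs.
    # det == 0: CH residual energy is insufficient, so rerun CH election first.
    return "dn_election" if det == 1 else "ch_election"


if __name__ == "__main__":
    det = reconfiguration_decision([1.5, 2.0, 1.75, 1.75], avg_entire=1.6)
    print(det, next_round_start(det))
```

Keeping the comparison separate from the stage selection makes it easy to swap in a different estimator for the network-wide average without touching the round logic.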
When the third stage has been completed, the DN broadcasts an election message to all nodes. On receipt of the message, the CHs reply to the DN with a "Join Message" so that the order in which they send messages can be determined. The DN then broadcasts the resulting TDMA schedule to the network to complete the network configuration.

Data Transmission Stage.
When the network configuration stage has been completed, the data transmission stage begins. To minimize energy consumption, ordinary nodes are in the active mode only during their allotted time in the TDMA schedule and enter sleep mode otherwise. Although only the minimum necessary energy is consumed during the sleep mode, no message can be exchanged. When they switch to the active mode, ordinary nodes collect predefined information and transmit it to the CHs to which they belong.
Meanwhile, individual CHs remain switched on, aggregate the information from ordinary nodes, and transmit it to the DN. The DN gathers the information received from the CHs and communicates with the BS. The DN, which communicates with the sufficiently distant BS, consumes the greatest energy in the network, and the CHs, which remain active throughout this stage, also consume large amounts of energy.

Experiment
4.1. Experimental Setup. This section compares the proposed approach with previous routing schemes for performance evaluation. In this experiment, a sensor node was assumed to consume 50 nJ per bit of received data. When sending data, extra energy for amplification is needed. If the transmitting distance is less than a certain threshold, the free space model ($\varepsilon_{fs}$ = 10 pJ/bit/m^2) is applied. Otherwise, the multipath model ($\varepsilon_{mp}$ = 0.0013 pJ/bit/m^4) is applied. The energy required to aggregate data in CHs is 5 nJ/bit/signal [14].
For this experiment, 200 sensors with an initial energy of 2 J were deployed over an area of 100 m × 100 m.A base station was located at the coordinate of (50, 175), provided that it was far enough away from the configured sensor network.The message data transmitted at one time was 4000 bits (i.e., 500 bytes).
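The setup above can be expressed as a short Python sketch of the radio model and node deployment. Two points are assumptions rather than statements from the paper: the threshold distance is taken as the crossover point where the free space and multipath amplifier terms are equal, and transmitter electronics are assumed to cost the same 50 nJ/bit as reception. The function names and the random seed are hypothetical.

```python
import math
import random

E_ELEC = 50e-9       # electronics energy, J/bit
EPS_FS = 10e-12      # free space amplifier, J/bit/m^2
EPS_MP = 0.0013e-12  # multipath amplifier, J/bit/m^4

# Assumed threshold: distance at which EPS_FS * d^2 equals EPS_MP * d^4.
D0 = math.sqrt(EPS_FS / EPS_MP)

BASE_STATION = (50.0, 175.0)
MESSAGE_BITS = 4000
INITIAL_ENERGY = 2.0  # joules


def tx_energy(bits, d):
    """Energy to transmit `bits` over distance `d` metres."""
    amp = EPS_FS * d ** 2 if d < D0 else EPS_MP * d ** 4
    return bits * (E_ELEC + amp)


def rx_energy(bits):
    """Energy to receive `bits`."""
    return bits * E_ELEC


def deploy(n=200, side=100.0, seed=0):
    """Place n nodes uniformly at random in a side x side square."""
    rng = random.Random(seed)
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]


if __name__ == "__main__":
    nodes = deploy()
    nearest = min(math.dist(p, BASE_STATION) for p in nodes)
    print(f"threshold distance: {D0:.1f} m")
    print(f"nearest node to BS: {nearest:.1f} m")
    print(f"direct 4000-bit send from nearest node: {tx_energy(MESSAGE_BITS, nearest):.2e} J")
```

Since the base station lies at least 75 m from every point of the deployment area, most direct transmissions to it fall at or beyond the assumed threshold, which is why a single DN relaying to the BS is attractive in this setup.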
The experimental results were compared to previous schemes of LEACH [21], OABC [24], and MESH [10], in terms of the network lifetime, the average energy consumption, and the standard deviations of residual energy.Furthermore, to evaluate performance by communication distance, the changes in the lifetime of the algorithms were compared  according to the location of the base station.In these experiments, the proposed approach was based on the OABC algorithm (CAPNet-OABC).This means that in the CH election stage of the proposed approach, the OABC algorithm is used to organize clusters and elect CHs.

Experimental Results.
In Figure 1, the network lifetime of the proposed approach was compared to that of the previous three algorithms, LEACH [21], MESH [10], and OABC [24]. (As noted in Section 1, the network lifetime here designates the time during which every node in the network was alive, that is, the time until the first node died.) CAPNet-OABC increased the network lifetime by 62% compared to OABC, and its lifetime was 2.2 times that of MESH and 3 times that of LEACH. In Table 3, we can confirm the points where the percentage of available nodes of the network is 80% (T40), 60% (T80), 40% (T120), and 0% (T200) (i.e., T200 means that all nodes are dead). At T40, the proposed approach achieved 2.4 times, 2 times, and 1.23 times the values of LEACH, MESH, and OABC, respectively. With the proposed approach, however, the decline in the number of available nodes accelerates rapidly over time. As a result, the time when there are no available nodes in the entire network (T200) is 1.93 times, 1.49 times, and 1.12 times that of the other three algorithms, respectively.
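The lifetime markers used here can be computed from a simulation log as in the following Python sketch, which assumes that the round in which each node died has been recorded. The function names and the input format are hypothetical; T_k is read as the round at which the k-th node death occurs, so for 200 nodes T40 corresponds to 80% of nodes still available.

```python
def time_to_k_deaths(death_rounds, k):
    """Return T_k: the round in which the k-th node death occurs."""
    if not 1 <= k <= len(death_rounds):
        raise ValueError("k must be between 1 and the number of nodes")
    return sorted(death_rounds)[k - 1]


def lifetime_summary(death_rounds, marks=(40, 80, 120, 200)):
    """First-node-death lifetime plus the T_k markers used in Table 3."""
    summary = {"first_death": time_to_k_deaths(death_rounds, 1)}
    for k in marks:
        summary[f"T{k}"] = time_to_k_deaths(death_rounds, k)
    return summary


if __name__ == "__main__":
    # Toy log for 4 nodes; the marks are scaled down accordingly.
    print(lifetime_summary([5, 3, 9, 7], marks=(2, 4)))
```

A small gap between the first-death round and the last marker indicates that nodes die close together, which is the pattern expected when residual energy is well balanced.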
Even though the network lifetime was generally improved through the fusion and aggregation of the transmission data, the optimization of the communication path, and the cluster organization, the rate at which available nodes were lost increased over time. The reason for this appears to be that as the number of available nodes decreases, the influence of the load balance decreases and only the effect of the optimization of communication cost remains. This is also due to the fact that there is no large difference in the residual energy of all nodes because the algorithm maintains the load balance of all nodes. It implies that CAPNet-OABC increased the network lifetime by evenly distributing energy dissipation among the nodes. Hence, the nodes in the CAPNet-OABC applied network ran out of their energy rapidly after the first node died. These points were also observed in the next experimental result.
Figure 2 shows the average energy dissipation of each node in a communication round, which indicates the energy efficiency of each algorithm. For transmitting messages, CAPNet-OABC required 48% and 58% of the energy required by LEACH and MESH, respectively. However, compared to OABC, CAPNet improved the efficiency of energy consumption by only 13.6%. The increase in energy consumption efficiency of the proposed approach falls far short of its lifetime gains over LEACH, MESH, and OABC in Figure 1. This tendency is similar to the increase in the time at which all the nodes of the network die. Thus, we can deduce that while the communication cost optimization aims to increase the network lifetime through improvement of the life cycle of each node, the load balance control of the entire network has a greater influence on the extension of network lifetime.
Figure 3 shows the number of cluster formation changes of the proposed approach in a network. In this graph, frequent cluster formation changes occur for approximately 900 rounds; after this, however, the number of cluster formation changes increases only slowly until approximately 2300 rounds. Initially, since most nodes have sufficient energy, cluster formation changes frequently to maintain the load balance of the network. As time passes, however, the uneven energy consumption problem of the network deepens. Thus, nodes with sufficient residual energy are elected as CHs. In addition, the frequency with which they play the role of determinant node in rotation could increase. As a result, the number of network formation changes decreases gradually, as shown in Figure 3. In other words, the cluster maintenance approach maintains load balance because the CHs with high residual energy take turns playing the role of determinant node, which requires high energy consumption.
To understand the cause of the improved network performance, we need to examine the balance of the residual energy of all nodes after frequent cluster maintenance. Figure 4 shows the standard deviation of the energy load among nodes for each applied algorithm. Efficiency at the network level was the notable factor in load balance. The standard deviation of LEACH increased rapidly up to 0.266 until the first node ran out of energy in the 800th communication round. Compared to the other algorithms, LEACH had a poor network lifetime due to its large standard deviation and high average energy consumption. Although the standard deviation of OABC rose more slowly than that of LEACH, it still increased up to approximately 0.25 until the death of the first node. In the case of CAPNet-OABC, the standard deviation was approximately 70% lower than that of OABC at 1,000 rounds. Furthermore, the standard deviation did not exceed 0.2, even upon the death of the first node. These results suggest that the proposed approach improves energy efficiency, as compared to the previous passive algorithms, by actively controlling the load balance. This experiment shows that the proposed approach balances the load between nodes better than the other algorithms. It means that load balance plays a primary role in prolonging the network lifetime. In addition to the similarity of the residual energy levels of the nodes due to the effective management of the load balance, the gaps between the deaths of the first and last nodes in Figure 1 were greatly reduced.
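The load balance metric discussed here can be computed per round as in the following Python sketch. Whether the paper uses the population or the sample standard deviation is not stated; the population form (`statistics.pstdev`) is assumed, and the function name is hypothetical.

```python
from statistics import pstdev


def residual_energy_spread(residuals):
    """Population standard deviation of the nodes' residual energy (joules).

    A smaller value means the remaining energy is more evenly balanced.
    """
    return pstdev(residuals)


if __name__ == "__main__":
    balanced = [1.0, 1.0, 1.0, 1.0]
    unbalanced = [0.5, 1.5, 0.5, 1.5]
    print(residual_energy_spread(balanced))
    print(residual_energy_spread(unbalanced))
```

Tracking this value at every round, for example up to the death of the first node, gives the kind of curve compared across algorithms in Figure 4.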
Figure 5 illustrates the trend of network lifetime with the density of nodes. We compared the lifetime of three different algorithms by varying the number of sensor nodes among 100, 200, and 300. The lifetime of all algorithms gradually increased as node density increased. In particular, when the number of sensor nodes was 300, the lifetimes of the LEACH and OABC algorithms increased by 15-25% as compared to 200 nodes, whereas the lifetime of CAPNet-OABC increased by approximately 35%. This experiment shows that CAPNet steadily improves the energy efficiency, regardless of the density of nodes, and also achieves higher performance when the number of nodes is increased.
This section compared the performance of the proposed approach with other algorithms by measuring the network lifetime, average energy consumption, standard deviation of the residual energy, and the changes in lifetime based on the location of the base station. As a result, the proposed approach steadily increased the network lifetime through effective energy transfer and control of the network load balance.

Conclusion
This paper has presented a load-balanced and energy-efficient cluster maintenance approach for efficient and evenly distributed energy consumption in sensor networks. To improve the energy efficiency at the node level, the proposed approach applies the multihop clustering technique and reduces communication cost. Furthermore, to maintain the load balance at the network level, the approach controls cluster formation using the residual energy of the CHs. As a result, the proposed approach effectively increased the network lifetime by combining the optimum route, efficient cluster formation, and load balance control.
The proposed approach can be used as a multihop message transmission mechanism, concurrently with the previous clustering schemes, to determine clusters. Although we have only provided algorithms based on a two-level hierarchy, extending the approach to a multilevel hierarchy for message transmission remains future research. The algorithm is also expected to be applicable to conventional unbalanced multihop clustering. Currently, the proposed approach does not consider situations where the clustering hierarchy must be maintained, for example, the addition of new nodes and existing node failures. Furthermore, we assumed a few restrictions for calculating the energy consumption of the entire network. Due to these restrictions, it is difficult to deal with severe changes in network configuration. Thus, we plan to explore flexibility in node deployment and dynamic traffic load, as well as the scalability of the network hierarchy.

(5) Elects the DN, which has the highest determinant node factor value.
Determination to reconfigure networks stage.
(6) The elected DN compares the average of the CHs' residual energy to the average residual energy of the entire network.
(A) If Det ≥ 1, the current cluster formation is used for the next round. The CH election stage is omitted.
(B) If Det < 1, a new cluster formation is created for the next round.
(7) The elected DN broadcasts a DN election message to the entire network.
(8) CHs that receive the DN election message reply with a "Join Message" to the DN.
(9) The DN broadcasts the TDMA scheduling of the CHs.

Figure 1: Lifetime of each approach.

Figure 2: Average energy dissipation of each node in the network.

Figure 3: Number of cluster formation changes of the proposed approach.

Figure 4: Standard deviation of the energy load among nodes in each algorithm.

Table 1: The proposed cluster maintenance approach.

Table 3: Results of a network available time comparison of CAPNet with the other three algorithms (rounds).