An Efficient Cluster Head Selection Approach for Collaborative Data Processing in Wireless Sensor Networks

Since wireless sensor networks (WSNs) consist of nodes with limited battery power, collaborative data processing and balanced energy consumption are key issues. This paper proposes an efficient cluster head selection approach for collaborative data processing in WSNs. The proposed algorithm uses an energy-efficient model to select the optimal cluster heads fairly among all nodes, which helps to reduce the impact of the monitoring scheme on the network lifetime. Experimental results show that the proposed protocol reduces energy consumption, achieves higher efficiency, and prolongs the network lifetime compared with several existing cluster-based routing protocols.


Introduction
In recent years, considerable attention and research have been devoted to the deployment of sensors for distributed management, collaborative information processing, and multihop communication [1]. Wireless sensor networks integrate advanced technologies such as sensor technology, embedded computing, wireless communications, and distributed information processing. Through a variety of microsensors that monitor and collect environmental information collaboratively, the data required by the target user can be transmitted to the embedded information system in time [2]. Therefore, collaborative data processing and balanced energy consumption should be considered key issues in WSNs. However, collaborative data processing has received little attention until recently.
WSNs are formed by a large number of sensor nodes, which are deployed in a monitored area and together form a multihop self-organizing network [3]. With sensors that are spatially distributed and work cooperatively, we can gather and process data from the environment (e.g., mechanical, thermal, biological, chemical, and optical readings) [4]. Since the sensors have limited battery power that cannot be replenished, balanced energy consumption should be considered the crucial issue for energy-efficient management [5]. In recent years, monitoring of the residual energy level has become a hot research topic.
Existing cluster-based routing protocols only consider balanced energy consumption of nodes, while ignoring collaborative data processing between nodes. The goal of this paper is to achieve collaborative data processing in WSNs, thus prolonging the network lifetime. Therefore, we propose an efficient cluster head selection approach for collaborative data processing (CHSCDP) in WSNs, which measures the residual energy of all candidates for further classification in the competition. Firstly, our approach elects cluster heads with more residual energy through local radio communication to achieve an even distribution of cluster heads. Secondly, a normal node situated in an area covered by several cluster heads selects the optimal cluster head, namely, the one that has the most residual energy. Thirdly, in order to reduce the energy consumption, interference, and transmission latency of the collaborative communication, we adopt OVSF (orthogonal variable spreading factor) rather than TDMA (time division multiple access) encoding in the intracluster communication. In addition, since multihop routing is more efficient than single-hop routing in terms of reducing energy consumption, cluster heads that are far from the base station forward their data through cluster heads that are nearer to it.
Because they both collect data within their own clusters and forward data from other clusters, the cluster heads near the base station consume more energy and die early, which shortens the life cycle of the whole network [6]. In this paper, an unequal clustering method is proposed to balance the energy consumption among the cluster heads. The proposed algorithm offers a framework for collaborative data processing. By adjusting the competition radius of the different candidate cluster heads, the clusters near the base station are made comparatively smaller in order to compensate for the unbalanced communication overhead caused by intercluster communication.
The specific contributions of this paper include the following.
(i) A literature survey of various existing cluster-based routing protocols and an analysis of their advantages and disadvantages are presented.
(ii) An effective energy-efficient optimization model for solving the unequal clustering in WSNs is proposed.
(iii) The proposed protocol adopts the idea of energy grading to select the cluster heads, so that the competition process converges better and incurs lower message overhead.
The rest of this paper is organized as follows: a brief survey is given in Section 2. We study the CHSCDP protocol and formalize it in Section 3. Experimental results and comparisons with existing cluster-based routing protocols are presented in Section 4. Finally, Section 5 concludes the paper and discusses some future research directions.

Related Works
The collaborative data processing has attracted much attention of many domestic and international researchers in WSNs. In this section, we focus our discussion on the related works on collaborative data processing and energy optimization in WSNs.
In order to achieve high energy efficiency and increase the network scalability, the nodes can be organized into clusters, with a random selection method applied for cluster head election in each epoch [7]. After collecting all the data of its members, the cluster head transfers them to the base station. During the clustering phase, the cluster head plays an important role in providing efficient data communication between the nodes and the base station. This method can improve the energy efficiency of the sensor network and provide network scalability.
Low-energy adaptive clustering hierarchy (LEACH) is a typical clustering-based protocol [8]. In order to save energy and enhance network lifetime, hierarchical routing is adopted. Younis and Fahmy [9] proposed hybrid energy-efficient distributed clustering (HEED), which periodically selects cluster heads according to a hybrid of the node residual energy and a secondary parameter, such as node proximity to its neighbors or node degree. Hsin and Liu [10] proposed a new technique to select the cluster heads in every round, which depends both on the current state probability and on the general probability. Kar and Banerjee [11] proposed a distributive energy-efficient adaptive clustering (DEEAC) protocol, which accounts for spatiotemporal variations in data reporting rates across different regions. DEEAC selects a node to be a cluster head depending upon its hotness value and residual energy. Li et al. proposed a dynamic stochastic distributed energy-efficient clustering method in which the cluster head election probability is more efficient [12]. Moreover, it uses a stochastic detection scheme to extend the network lifetime.
Sim et al. [13] proposed an energy-efficient cluster header selection algorithm (ECS) which selects cluster head by utilizing only its information to extend network lifetime and minimize additional overheads in energy limited sensor networks. M. C. M. Thein and T. Thein [14] proposed a modification of the LEACH's stochastic cluster head selection algorithm by considering the additional parameters, the residual energy of a node relative to the residual energy of the network for adapting clusters, and rotating cluster head positions to evenly distribute the energy load among all the nodes. Chen et al. [15] proposed an energy consumption mode of clustering protocol in the condition that nodes followed the Poisson distribution and analyzed the network performance which was impacted by Poisson distribution density of the nodes.
Attea and Khalil proposed a new evolutionary based routing protocol for clustered heterogeneous WSNs [16]. Ching et al. proposed a novel hierarchical routing protocol algorithm (NHRPA) for WSNs [17]. Joe-Air et al. presented a QoS-guaranteed coverage precedence routing algorithm [18]. Jiang et al. [19] proposed a distributed energy-balanced unequal clustering (DEBUC) routing protocol, in which an unequal clustering mechanism is adopted for intercluster multihop routing.

Collaborative Data Processing and CHSCDP Protocol
In the LEACH protocol, the cluster heads are selected randomly in a cyclic manner, so that the energy load of the whole network is evenly distributed among the nodes. This can reduce network energy consumption and improve the overall survival time of the network. The LEACH protocol has many advantages, such as its hierarchical structure, local dynamic data fusion, and cluster head selection [20]. Although the LEACH protocol can effectively prevent data loss caused by redundant data transmission, it also has obvious shortcomings, such as data redundancy, energy imbalance, and poor stability [21]. Therefore, this paper proposes an improved cluster head selection approach based on LEACH, which focuses on the residual energy of the candidate nodes and on the energy consumption of intercluster communication.
In order to design an efficient clustering algorithm, we need a comprehensive analysis of the performance indicators as follows.
Firstly, energy efficiency means avoiding the end of the life cycle of the network caused by the early death of some nodes as well as reducing the energy consumption of formation and maintenance of clusters.
Secondly, the stability of cluster structure can reduce the additional overhead caused by frequent clustering.
Thirdly, the cluster heads are always in the center of the cluster and have more powerful radios in order to communicate with all adjacent cluster heads. The appropriate size of a cluster should therefore be taken into account in multihop data transmission.
The CHSCDP protocol is divided into several stages in the application process, including radio channel and energy dissipation, cluster head election, collaborative communication, and routing. The implementation of CHSCDP protocol will be described in detail in the following section.

Radio Channel and Energy Dissipation.
We assume an effective energy-efficient model for the radio channel and energy dissipation; the transmitter dissipates energy to run the radio electronics and the power amplifier, and the receiver dissipates energy to run the radio electronics. The effective radio energy dissipation model is shown in Figure 1.
In the effective radio energy dissipation model, both the free space ($d^2$ power loss) and the multipath fading ($d^4$ power loss) channel models are used, depending on the distance $d$ between the transmitter and the receiver. If the distance is less than a threshold $d_0$, the free space model is used; otherwise, the multipath model is used. Therefore, the energy consumption for transmitting an $l$-bit message over a distance $d$ can be formulated as
$$E_{Tx}(l, d) = \begin{cases} l E_{elec} + l \varepsilon_{fs} d^{2}, & d < d_0, \\ l E_{elec} + l \varepsilon_{amp} d^{4}, & d \ge d_0, \end{cases} \quad (1)$$
where $\varepsilon_{fs}$ is the amplifier power consumption for free space propagation, $\varepsilon_{amp}$ is the amplifier power consumption for multipath propagation, $E_{elec}$ is the energy dissipated per bit to run the radio electronics, and $d_0 = \sqrt{\varepsilon_{fs} / \varepsilon_{amp}}$. To receive $l$ bits of information, the radio expends
$$E_{Rx}(l) = l \times E_{elec}. \quad (2)$$
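The radio model described above can be sketched in a few lines of Python. The numeric parameter values below are assumptions (values commonly used with this first-order radio model in the LEACH literature), not values taken from this paper, and the function names are illustrative.

```python
import math

# Assumed parameter values, typical for the first-order radio model;
# they are NOT taken from this paper.
E_ELEC = 50e-9        # J/bit, energy to run the radio electronics
EPS_FS = 10e-12       # J/bit/m^2, free space amplifier coefficient
EPS_AMP = 0.0013e-12  # J/bit/m^4, multipath amplifier coefficient
D0 = math.sqrt(EPS_FS / EPS_AMP)  # distance threshold between the two models


def tx_energy(bits: int, distance: float) -> float:
    """Energy (J) to transmit `bits` over `distance` metres."""
    if distance < D0:
        # Free space model: amplifier cost grows with distance squared.
        return bits * E_ELEC + bits * EPS_FS * distance ** 2
    # Multipath model: amplifier cost grows with distance to the fourth power.
    return bits * E_ELEC + bits * EPS_AMP * distance ** 4


def rx_energy(bits: int) -> float:
    """Energy (J) to receive `bits`; only the electronics are involved."""
    return bits * E_ELEC
```

With these assumed values the threshold is roughly 88 m, so a 4000-bit packet sent over 50 m falls under the free space branch.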

Cluster Head Selection.
At the beginning of each round of cluster head selection, the base station collects the residual energy information of all nodes. From these statistics, the minimum energy $E_{min}$ and the maximum energy $E_{max}$ are obtained, and the energy level of each node is divided into four categories by thresholds, where $E_{avg} = (E_{max} + E_{min})/2$.
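As an illustration of the energy grading idea, the following sketch assigns a node to one of four levels. The exact thresholds are an assumption: the sketch places them at the midpoints between the minimum, average, and maximum energy, which may differ from the thresholds used in the paper. The function name `energy_level` is hypothetical.

```python
def energy_level(e_res: float, e_min: float, e_max: float) -> int:
    """Return an energy grade from 0 (lowest) to 3 (highest).

    Assumption: the three thresholds sit at the midpoints between
    E_min, E_avg and E_max; the paper's thresholds may differ.
    """
    e_avg = (e_max + e_min) / 2
    low_mid = (e_min + e_avg) / 2
    high_mid = (e_avg + e_max) / 2
    if e_res < low_mid:
        return 0
    if e_res < e_avg:
        return 1
    if e_res < high_mid:
        return 2
    return 3
```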

Figure 1: The effective radio energy dissipation model (block labels: transmission electronics, receive electronics).
The probability $P_{CH}$ that a node is elected as cluster head is defined in terms of the node residual energy $E_{res}$, the node initial energy $E_0$, and a parameter of energy attenuation. In order to improve the convergence of the cluster head election, we set a threshold $P_{min}$, which is the minimum probability; it depends on a constant and on an indicator of whether the node has been a cluster head in the most recent rounds. If the node has been a cluster head, the indicator equals 0.
In the phase of cluster formation, each node proceeds as follows according to its own value of $P_{CH}$.
(i) If $P_{CH} \ge 1$, the node broadcasts a message announcing itself as a candidate cluster head to its neighbors and waits for JOIN messages. A normal node may receive messages from several cluster heads and must decide which one to join, based on indicators such as stability, the appropriate number of cluster heads, and the intracluster communication overhead.
(ii) If $0 \le P_{CH} < 1$ and the node does not receive a message from any cluster head, the value of $P_{CH}$ will be multiplied by itself and the node steps into the next iteration. If the node receives a message sent by a cluster head, it enters the process of selecting among multiple candidates.
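The two cases above amount to a small decision rule, sketched below. The function name and the returned labels are hypothetical, and the sketch deliberately does not model how the election probability is updated between iterations.

```python
def formation_step(p_ch: float, heard_head: bool) -> str:
    """Decide what a node does in one iteration of cluster formation.

    p_ch       -- the node's current election probability
    heard_head -- True if the node received a candidate cluster head message
    """
    if p_ch >= 1:
        # Case (i): announce candidacy and wait for JOIN messages.
        return "announce"
    if heard_head:
        # Case (ii), second part: pick among the announced candidates.
        return "choose_among_candidates"
    # Case (ii), first part: update the probability and iterate again.
    return "iterate"
```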
The nodes with higher residual energy should have an advantage over other nodes in the cluster head election process. Nodes with approximately equal residual energy may belong to the same level. Therefore, some other factors are considered to determine which nodes should be chosen as the cluster heads. We use the critical factors affecting energy depletion as indicators, including the distance to the base station and the number of consecutive rounds in which a node has been elected as cluster head.
As for intercluster communication, the multihop approach is more energy efficient than the single-hop approach. In this paper, multihop forwarding between the cluster heads and the base station is discussed. Since the cluster heads near the base station consume more energy, we use an unequal approach for clustering.
Because multihop forwarding consumes more energy near the base station, the cluster heads that are closer to the base station should have smaller clusters than those far away. Therefore, a node's competition radius should decrease as its distance to the base station decreases. Based on the above analysis, the competition radius $R_{comp}$ is defined in terms of $R_0$, the predefined maximum competition radius; $d(\cdot, \mathrm{BS})$, the distance from the node to the base station; and $d_{max}$ and $d_{min}$, the maximum and minimum distances between the nodes and the base station, respectively.
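To make the idea of a distance-dependent competition radius concrete, the sketch below uses a linear form known from earlier unequal clustering work (the EEUC-style rule). This form and the constant `c` are assumptions used for illustration only; they are not necessarily the expression defined in this paper.

```python
def competition_radius(d_bs: float, d_min: float, d_max: float,
                       r0: float, c: float = 0.5) -> float:
    """Competition radius that shrinks as a node gets closer to the base station.

    Assumed EEUC-style linear rule: the farthest node gets the full radius r0,
    the nearest node gets (1 - c) * r0. `c` in [0, 1] is a hypothetical constant.
    """
    if d_max == d_min:
        return r0  # all nodes are equally far away; no scaling possible
    return (1 - c * (d_max - d_bs) / (d_max - d_min)) * r0
```

For example, with `r0 = 90`, `c = 0.5`, and node distances ranging from 20 m to 220 m, the nearest node gets a radius of 45 m and the farthest node keeps the full 90 m.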

Selection from Candidate Cluster Heads.
The proposed method performs clustering in the initial network environment and selects cluster heads considering the distance to the base station, the number of times a node has already been selected, and the residual energy. Here cluster heads are selected by comparing their critical values, instead of the conventional method of selecting candidate cluster heads in the course of cluster head selection. To optimize the selection of cluster heads, we define a cost function over the set of candidate cluster heads near a normal node. It involves the distance from the normal node to each candidate cluster head, the mathematical expectation of the distance from all the candidate cluster heads to the node, the remaining energy $E_{res}$ of each candidate cluster head, and the mathematical expectation of the residual energy of all the candidate cluster heads. Three weights, each in the interval (0, 1), describe the cost proportion.
According to (7), the candidate for which the cost function attains its minimum value is the optimal cluster head. The weight values have a great impact on multiattribute decision-making. Generally speaking, the smaller the differences among the values of an attribute are, the less that attribute influences the decision. Such attributes can be given lower weights, and vice versa.
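The selection among candidates can be illustrated with a small weighted-cost sketch. The concrete form of the cost (a weighted sum of relative distance, inverse relative residual energy, and relative number of past elections) and the default weights are assumptions made for illustration; the `Candidate` type and `choose_head` function are hypothetical names, and all distances and energies are assumed to be positive.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Candidate:
    name: str
    distance: float         # distance from the normal node to this candidate (m)
    residual_energy: float  # remaining energy of the candidate (J)
    times_elected: int      # rounds already served as cluster head


def choose_head(candidates, weights=(0.4, 0.4, 0.2)):
    """Return the candidate with the minimum (assumed) cost."""
    w1, w2, w3 = weights
    mean_d = mean(c.distance for c in candidates)
    mean_e = mean(c.residual_energy for c in candidates)
    mean_t = mean(c.times_elected for c in candidates) or 1.0  # avoid dividing by 0

    def cost(c):
        return (w1 * c.distance / mean_d            # closer is cheaper
                + w2 * mean_e / c.residual_energy   # more energy is cheaper
                + w3 * c.times_elected / mean_t)    # fewer past elections is cheaper

    return min(candidates, key=cost)
```

In this sketch a nearby candidate with very little energy loses to a slightly more distant candidate with a healthy energy reserve, which matches the intent of favoring heads with more residual energy.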
In this paper, we use the standard deviation and the mean deviation to measure the difference of attributes, where $N(0, 1)$ denotes the standard normal distribution and the mean of an attribute is the arithmetic average of its values. An objective function is defined on these two measures. Thus, determining the weights is converted into a single nonlinear programming problem, and from its solution (11) we obtain the optimal weight vector of the three weights.
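As a rough illustration of deviation-based weighting, the sketch below gives each attribute a weight proportional to the standard deviation of its mean-normalized values, so that attributes whose values barely differ receive low weights. This is a simplified stand-in, not the nonlinear programming formulation of the paper; the function name is hypothetical and attribute values are assumed to be positive.

```python
from statistics import mean, pstdev


def deviation_weights(columns):
    """Weights proportional to the spread of each attribute.

    columns -- one list of values per attribute (all values assumed positive).
    """
    spreads = []
    for col in columns:
        m = mean(col)
        normalized = [v / m for v in col]  # make attributes scale-free
        spreads.append(pstdev(normalized))
    total = sum(spreads)
    if total == 0:
        # No attribute discriminates between candidates: fall back to equal weights.
        return [1 / len(columns)] * len(columns)
    return [s / total for s in spreads]
```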

Collaborative Communication and Routing.
Conventionally, at the stage of intracluster communication, each cluster head transmits a time table to its cluster members using TDMA, which divides time into many cyclical frames. Although generally used in wireless network protocols, TDMA has some disadvantages. Firstly, in every round, all the members must send data to their cluster head, which causes high energy consumption. Secondly, each user is allowed to transmit only within specified time intervals, which increases the total delay of the network. Thirdly, time synchronization is required at the beginning of each round, which results in many unnecessary message exchanges. Therefore, we use an improved mechanism based on OVSF. Firstly, the cluster head sends the OVSF matrix to its cluster members. Then, the nodes add a group of OVSF codes ahead of every datagram. After receiving all messages, the cluster head multiplies the received data by the same OVSF matrix, so that the orthogonality and incoherence of the codes of the individual nodes can be used not only to separate the simultaneous transmissions, but also to greatly reduce the delay and improve the efficiency of energy consumption.
Cluster routing is an energy-efficient routing model as compared with direct routing and multihop routing. In LEACH, the sensing nodes sense the environment and transmit the data to the cluster head, which aggregates them and transmits the result to the base station in a single hop. The problem with this mechanism is that a cluster head far from the base station spends a large amount of energy and dies quickly, which leaves part of the network disconnected.
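The orthogonality property that OVSF relies on can be demonstrated with a short sketch. It builds the codes with the standard recursive code-tree construction and uses conventional direct-sequence spreading and correlation; how the paper attaches the codes to datagrams may differ, so the `spread` and `despread` helpers are assumptions for illustration.

```python
def ovsf_codes(length: int) -> list[list[int]]:
    """Return `length` mutually orthogonal OVSF codes, each of that length.

    `length` must be a power of two.
    """
    if length < 1 or length & (length - 1):
        raise ValueError("length must be a power of two")
    codes = [[1]]
    while len(codes[0]) < length:
        next_codes = []
        for c in codes:
            next_codes.append(c + c)                # child (c, c)
            next_codes.append(c + [-x for x in c])  # child (c, -c)
        codes = next_codes
    return codes


def spread(symbols, code):
    """Spread each +1/-1 data symbol over the chips of one node's code."""
    return [s * chip for s in symbols for chip in code]


def despread(received, code):
    """Correlate the superimposed signal with one code to recover that node's symbols."""
    n = len(code)
    return [
        sum(r * chip for r, chip in zip(received[i:i + n], code)) // n
        for i in range(0, len(received), n)
    ]
```

If two members spread their symbols with different codes and the cluster head receives the chip-wise sum, correlating with each code in turn recovers each member's symbols, because the cross-correlation of distinct codes is zero.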
A technique for intercluster communication is presented for WSNs, in which each cluster head sends its data to its neighbor cluster head which is nearer to the base station, to achieve load balancing in network. We assume the data redundancy is limited and the intermediate cluster head only forwards the data to the next hop node instead of doing data fusion.
In the process of establishing the intercluster communication channel, each cluster head chooses a cluster head in a neighboring cluster as its next hop according to the relay cost function. The candidate set $RS_{CH}$ of a cluster head contains every cluster head whose distance to the base station, $d(\cdot, \mathrm{BS})$, is smaller than its own. The cost function depends on $E_{res}$, the current residual energy of the candidate cluster head, on the number of members in its cluster, and on an energy error variable. In order to incur less overhead, the cluster head with high residual energy and relatively few members becomes the next-hop intermediate node.
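The next-hop choice can be sketched as follows. The score used here (residual energy divided by the number of members plus one) is a hypothetical stand-in for the relay cost function, and the fallback of returning `None` when no closer cluster head exists is likewise an assumption.

```python
def choose_relay(sender, heads):
    """Pick the next-hop cluster head for `sender`.

    heads -- dict mapping head name to a tuple
             (distance_to_bs, residual_energy, member_count)
    """
    d_sender = heads[sender][0]
    # Candidate set: only heads strictly closer to the base station.
    closer = {name: h for name, h in heads.items() if h[0] < d_sender}
    if not closer:
        return None  # assumed fallback: no relay available
    # Hypothetical score: prefer high residual energy and few members.
    return max(closer, key=lambda name: closer[name][1] / (closer[name][2] + 1))
```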

Simulation Experiment
In this section, we evaluate the performance of our protocol implemented with MATLAB. We assume that the probability of signal collision and interference in the wireless channel is negligible. Cluster-based routing algorithms have different configuration parameters, which may affect the experimental results. To compare the algorithms fairly, this paper uses the same configuration parameters as in [8]. The specific experimental parameters are shown in Table 1. Figure 2 shows how the minimum distance between the cluster heads changes over the first 50 rounds. As can be seen from the result, the minimum distance between the cluster heads is about 30 m in the LEACH protocol, while it is about 60 m in our protocol. This means that in LEACH the cluster heads tend to be concentrated in some areas of the network. In CHSCDP, the competition radius is set reasonably to guarantee an even distribution of cluster heads.
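The metric plotted in Figure 2, the minimum distance between any two cluster heads in a round, can be computed as in the sketch below. The paper's experiments were run in MATLAB; this Python helper and its name are only an illustration of the metric.

```python
import itertools
import math


def min_head_distance(heads):
    """Smallest Euclidean distance between any two cluster heads.

    heads -- list of (x, y) coordinates, at least two entries.
    """
    return min(math.dist(a, b) for a, b in itertools.combinations(heads, 2))
```

A larger value indicates that the cluster heads are spread more evenly over the monitored area.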
Furthermore, we compare the average energy consumption of the cluster heads. Figure 3 shows the average energy consumption of cluster heads for the two protocols. As shown in Figure 3, the average energy consumption of cluster heads in LEACH fluctuates around 0.33 J, which is higher than that of CHSCDP. This is mainly due to the superiority of multihop transmission, which saves a great deal of energy in comparison with single-hop transmission. Figure 4 compares the energy consumption in the clustering phase for the two protocols, without considering the energy loss in the stable communication phase. It can be observed that the total energy consumption of CHSCDP during cluster formation is slightly larger than that of LEACH. The proposed CHSCDP gives the network a more uniform distribution of cluster heads. As the candidate nodes need to compete within a limited area, they consume more energy. However, ADV is a small message, so the power consumption is very small. Although the energy consumption of CHSCDP increases in the clustering phase, it does not affect the overall efficiency of the protocol.
Lifetime is the main criterion for evaluating the performance of sensor networks. In the simulation, we measure the life cycle in rounds, and it is defined as the total amount of time before the first sensor node runs out of power. [...] cluster head of LEACH, LEACH-M, and BECCS is 311, 334, and 402 rounds, respectively. It can be observed that BECCS protocol can quickly converge when the network failed. Since WSNs have high fault tolerance, self-organization, and other characteristics, the failure of some nodes does not affect the overall network performance. But when most of the nodes have failed, the network no longer serves its purpose, and therefore this protocol can be more suitable for WSNs. Figures 6 and 7 show the number of times each node had been elected as cluster head when the first node died, for LEACH and CHSCDP. As we can see in Figure 6, the number of elections as cluster head of each node exhibits a narrow range of fluctuation between 6 and 7. However, LEACH uses uniform clustering and single-hop communication between cluster heads and base station, which causes fast energy consumption for cluster heads far away from the base station and results in their premature death. In CHSCDP, because extra cluster heads are employed to carry the multihop forwarding traffic in the areas closer to the base station, the nodes in these areas usually have a higher chance of being selected as cluster heads. Figure 8 shows the residual energy distribution of the whole network. As can be seen from Figure 8, the residual energy distribution of the nodes in CHSCDP fluctuates less than in LEACH, and the average residual energy of the nodes in CHSCDP is much higher. This is because CHSCDP uses multihop communication and an unequal clustering strategy, which makes the clusters near the base station relatively small, with relatively few members.
Finally, we compare the effects of parameter values on the network's overall energy consumption and the average delay. Figure 9 shows the residual energy of the whole network with different parameter values. It can be seen that CHSCDP is better than the traditional LEACH algorithm in terms of energy balance. The multihop routing and the use of OVSF coding contribute to reducing the energy consumption. From the experimental result in Figure 9, we can also observe that the value of the parameter is inversely related to the node's communication radius, thus influencing the energy consumption.
The transmission delay is the amount of time from when the information is submitted to the network until it is received by the destination, and it is measured as the average delay for all nodes in a certain period. We compare the network delay between LEACH and CHSCDP under the same simulation environment. In the analysis, the greater the value of the parameter is, the smaller the communication radius of each node will be. Figure 10 shows the average delay for the two protocols with different expected numbers of clusters. It can be observed that the average transmission delay varies inversely with the number of clusters, and the average transmission delay of CHSCDP is slightly larger than that of LEACH. The reason is that multihop routing inevitably introduces a certain delay. It can be seen that as the expected number of clusters increases, the average transmission delays of the two routing algorithms gradually converge and the difference tends to lessen.
The above experimental results show that CHSCDP reduces energy consumption, achieves higher efficiency, and prolongs the network lifetime compared with several existing cluster-based routing protocols. The proposed CHSCDP protocol is well balanced between exploration and exploitation and has better stability and scalability.

Conclusions
In this paper, we propose an efficient cluster head selection approach for collaborative data processing in WSNs. The energy grading concept is applied to select the cluster heads, so that the competition process converges better and incurs lower message overhead. Furthermore, for the non-cluster-head nodes located in an overlapping area covered by several cluster heads, we propose a novel approach to evaluate the optimal cluster head according to factors such as residual energy, distance, and the number of rounds for which a candidate has been selected. The approach also produces unequal clusters to balance the load among cluster heads. CHSCDP is fully distributed and more energy efficient. In the future, we will improve the proposed protocol by minimizing the communication cost and increasing the reliability of the network to make it more practical.