Multihop-Based Optimal Cluster Heads Numbers Considering Relay Node in Transmission Range of Sensor Nodes in Wireless Sensor Networks

The data transmission of sensor nodes that detect events may be affected by neighbor nodes located within their communication range. We therefore analyze the energy consumption of sensor networks as a function of the number of cluster heads under two conditions: multihop-based communication, and the presence of a neighbor node in the transmission range that can relay the data. This analysis helps to determine an energy-efficient number of cluster heads in a more practical manner. We also examine how restricting the local cluster size affects the number of cluster heads elected.


Introduction
Generally, wireless sensor networks (WSNs) are organized with sensor nodes and a wireless module in a specific area [1]. These sensor nodes have four basic components: a wireless network component, a sensor component to detect the environment, a power component for the electricity supply, and a processing component for data processing. To adapt them for a specific purpose, a location finding system such as GPS, a mobility device, and a power generator can be added [2]. In WSNs, the sensor nodes have to establish a self-organizing network to collect data anywhere. To provide this, we have to consider the limitations of the sensor nodes. Because a sensor node is a lightweight, small, and low-power device, energy-efficient power consumption is an important issue [3]. Most of the energy consumption in WSNs occurs when the sensor node sends or receives event data.
The routing algorithms of sensor networks can be divided into flat- and hierarchical-based routing [4]. In flat-based routing, all of the nodes are on the same level in the network. Thus, global information is used for routing, and all of the nodes have access to this global routing information. On the other hand, in hierarchical routing, local information is used for routing. Therefore, all of the nodes have to keep the local routing information, which is bound to a specific zone or event area. Hierarchical-based routing has several advantages over flat routing. The data transmission delay in hierarchical routing is lower than that in flat routing, which uses contention for scheduling, because hierarchical routing can reserve the transmission time in and out of the cluster area. For the same reason, in hierarchical routing, it is easier to synchronize information on the network. Because the reservation process allows hierarchical routing to guarantee channel assignment, it can use collision avoidance. Thus, hierarchical routing makes it possible to achieve stable transmission. From the viewpoint of energy consumption, hierarchical routing can maintain steady energy consumption regardless of the traffic pattern, because of the reservation of the transmission time. Hierarchical routing can also prevent data duplication, because a cluster head aggregates similar or identical data. Due to these merits, hierarchical-based routing can achieve data load balancing between the nodes, distribute the energy consumption in the network, and increase the node and network lifetime [5,6].

International Journal of Distributed Sensor Networks
Generally, a local cluster in a clustering mechanism is formed around a cluster head, and the cluster head is elected either by a sink node or among the nodes scattered in the network. Previous proposals [7][8][9][10] based on LEACH [7] assume that a cluster head (CH) can communicate with a sink node via a single hop, and they try to select as CHs the nodes whose energy consumption is the lowest in their node groups, in proportion to the number of CHs and member nodes (MNs) in the network. In practice, however, this approach may have a scalability problem as the network grows, because its transmission scope is always assumed to be a single hop, whereas the scope may need to be extended depending on the case. To overcome this limitation, multihop routing-based optimization of the number of cluster heads (MROCH) [11] proposed a CH selection algorithm that finds the optimal number of CHs in a sensor network with multihop transmission, considering the size of the nodes' communication scope. However, this proposal still has a limitation: it does not consider the relay node, which accounts for the additional energy consumption and data transmission cost between an MN and a CH in the cluster initiation phase. The practical data transmission in cluster-based sensor networks (PDTC) [12] algorithm is designed to select the CHs and to define the cluster space according to the CH location, like the others. That is, the local cluster covered by a CH is automatically established with the CH at its center: whenever a new CH is selected, a new local cluster is also built. This algorithm can also determine the number of CHs by measuring the energy consumed in cluster initiation.
However, PDTC requires additional energy to build a local cluster whenever a new CH is defined, and it calculates the communication cost among local clusters simply from the distance of the communication scopes and the detour path, so it needs to be revised to obtain better performance. Given these limitations, obtaining the optimal number of CHs in a sensor network first requires establishing the local clusters through the CH selection process. Determining the number of CHs must also take into account the additional communication costs among nodes. Therefore, this paper focuses on extendable sensor network clusters based on multihop communication, and we propose a new CH selection algorithm that considers the additional communication cost among relay nodes during CH selection.
The structure of this paper is as follows. In Section 2, we describe the existing clustering algorithms. In Section 3, we set up the equations for cluster modeling. In Section 4, we describe the performance evaluation and analysis of the proposed method. Finally, in Section 5, we conclude this paper.

Related Work
To collect information in an area without a network infrastructure, the sensor nodes need to create an ad hoc wireless network. Existing routing mechanisms, however, are not suitable for WSNs because of the features of the sensor nodes. Thus, WSNs need an improved ad hoc routing algorithm that considers self-organizing networks, data-centric communication, the restricted capability of the nodes, and so on. Generally, adjacent nodes in WSNs have similar environmental information. These nodes require a clustering mechanism to aggregate the event data from member nodes and to prevent redundant data communication in a local cluster. In WSNs, the representative clustering algorithm is LEACH [7]. The goal of LEACH is to distribute the energy consumption of the cluster heads across all of the sensor nodes. To achieve this, LEACH rotates the cluster head role and elects the cluster head randomly. Revised and expanded clustering mechanisms based on LEACH have been studied by many researchers. Increasing the distance between a cluster head and a member node increases the energy consumption. The distance depends on the location of the cluster head in a local cluster: when the cluster head is located at the center of a local cluster, the distance is the shortest. The algorithm which moves a cluster head to the center of a local cluster is LEACH-C [7]. Handy et al. [8] proposed an improved LEACH algorithm, called LEACH-DCHS, which selects CHs considering the remaining energy of the nodes in the network. However, it has a critical limitation: the quality of service degrades as the CH selection process takes longer, because the remaining energy of the nodes has to be calculated. To solve this problem, another improved algorithm, LEACH-DCHS CM (LEACH-DCHS cluster maintenance) [9], was proposed, but the same problem remains.
Also, the advanced low energy adaptive clustering hierarchy (ALEACH) [10] algorithm, which calculates the remaining energy of the nodes based on energy information, was proposed to overcome this problem, but, like the others, it does not easily yield the optimal number of CHs in the network. A cluster head consumes more energy than the member nodes, because it has to aggregate the data from the member nodes. So, HEED [13] selects a cluster head considering the remaining energy of the sensor nodes. These algorithms are based on single-hop communication between a cluster head and member nodes and between a cluster head and a sink node. The linked cluster algorithm (LCA) [14] assumes the intracluster connection between a cluster head and member nodes to be a single hop. Adaptive clustering [15] makes the same assumption. Unlike LCA and adaptive clustering, CLUBS [16] uses multihop-based communication within a cluster. The communication range of the sensor nodes, however, is based on IEEE 802.15.4 (LR-WPAN), which is one of the transmission standards for WSNs. IEEE 802.15.4 typically extends up to 10 m in all directions [17]. Thus, WSNs have to use a multihop-based clustering mechanism, because sensor nodes with a restricted transmission range cannot reach one another by single-hop communication [18]. In the multihop-based clustering mechanism, the node energy consumption is affected by the local cluster size. This means that as the size of the local cluster increases, the nodes need more relay nodes to send the event data to a sink node or a cluster head, and the cluster size depends on the number of cluster heads, as the local cluster is formed by a cluster head. So, it is important to determine the number of cluster heads in multihop-based clustering algorithms. The practical data transmission algorithm [12] is a method that determines the number of cluster heads based on Voronoi tessellation.
This means that if there is one spot in a specific area, the local cluster area is determined by the location of this spot, and if there is another spot around it, the area is divided into two equal spaces containing these two spots. Whenever a new spot is added, the area can again be divided into equal spaces by iterating this process. Multihop routing-based optimization [11] optimizes the energy consumption of the nodes as a function of the number of cluster heads. To achieve this, it models the distance between the nodes in the intracluster and intercluster parts. Through this model, it can determine the effect on the energy consumption of the local cluster and the whole network as a function of the number of cluster heads. Though these two mechanisms use multihop-based clustering inside and outside of a local cluster, they do not consider which nodes in the transmission range of the original node can relay the event data. In other words, they do not account for the detour caused by the relay node. This detour can be affected by the location of the nodes. The original node should not select all of the nodes in the transmission area as relay nodes but should just treat them as candidate nodes for detours. Among the candidate nodes, the node which is closest to the cluster head or sink node can be selected as the relay node. Then, the original node can set up the direction to the cluster head or sink.
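The relay choice described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `select_relay`, the use of planar coordinate tuples, and the rule that a candidate must be strictly closer to the target than the sending node are assumptions made for the example. The default 10 m range follows the IEEE 802.15.4 figure quoted earlier.

```python
import math


def select_relay(node, neighbors, target, tx_range=10.0):
    """Pick the relay for `node` among `neighbors` (all (x, y) tuples).

    Candidates are neighbors inside the transmission range that are
    strictly closer to `target` (cluster head or sink) than `node` is.
    This strictness rule is an assumption of the sketch.
    Returns the candidate closest to `target`, or None if there is none.
    """
    own_distance = math.dist(node, target)
    candidates = [
        n for n in neighbors
        if math.dist(node, n) <= tx_range and math.dist(n, target) < own_distance
    ]
    # min() with default=None covers the "no candidate, detour needed" case.
    return min(candidates, key=lambda n: math.dist(n, target), default=None)
```

A `None` result corresponds to the situation in which no neighbor in range makes progress toward the target, so the node would have to fall back on a detour.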

Cluster Modeling of WSNs
In WSNs, the process of clustering starts by electing the cluster heads. The node elected as a cluster head sends its cluster head information to its neighbor nodes or member nodes in its transmission range, and the member nodes that receive this information also send it to their neighbor nodes. This continues until the information reaches a member node that belongs to another cluster head. Through these processes, each local cluster is established. As the number of local clusters is the same as the number of cluster heads, determining the number of cluster heads is the same as determining the size of the local cluster. The greater the size of the local cluster, the greater the number of member nodes which a cluster head has, and the energy consumption of the cluster head increases accordingly. On the other hand, the smaller the size of a local cluster, the lower the number of member nodes, and the energy consumption of the cluster head decreases. However, increasing the number of cluster heads increases the cost of transmission to a sink node in the intercluster part. Thus, to determine the optimal number of cluster heads, WSNs need to know the relation between the size of the local cluster and the number of cluster heads. In addition, to minimize the data path, the nodes should send their event data to those neighbor nodes that are closer to a sink node or a cluster head. To achieve this, the nodes have to set up the possible transmission area based on the location of the nodes. Finally, WSNs have to know the size of the local cluster by limiting its maximum size.

Local Cluster Modeling.
To model a local cluster, we assume the following: the network size is $M \times M$, the number of nodes is $N$, and the nodes are equally distributed in the network area. The transmission range of the nodes is restricted to 10 meters. The data transmission of the nodes is performed to collect and aggregate the event data or relay the data and to send the aggregated data to the neighbor nodes. The original node transmits the same amount of data in the same period of time. A cluster head is not any different from the other nodes, except that it can aggregate the event data from its member nodes and send it to a sink node by multihop-based communication. Figure 1 shows the model of a local cluster. The location of the cluster head is at the center. The longest distance from the cluster head to a member node is $R$, which is equal to the radius of the local cluster. The maximum transmission range is $r$, which corresponds to 1 hop. So, a member node which is $i$ hops away is located in the $i$th radius of the local cluster. As the network size is $M^2$, if there are $k$ local clusters, then $M^2 = k \times \pi R^2$. This means that the network size is the same as the total area of the $k$ local clusters. So, $R$ can be described by (1):

$$R = \frac{M}{\sqrt{k\pi}}. \quad (1)$$
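As a quick numerical check of the relation between the network size and the cluster radius, the following Python sketch solves the area identity for the radius. It assumes circular local clusters whose total area equals the square network area, as in the model above; the names `cluster_radius`, `side_m`, and `k` are chosen here for illustration only.

```python
import math


def cluster_radius(side_m, k):
    """Radius of one local cluster when k circular clusters of equal size
    cover a square network of side `side_m` (side_m**2 == k * pi * R**2).
    """
    if k <= 0:
        raise ValueError("the number of clusters k must be positive")
    return side_m / math.sqrt(k * math.pi)


if __name__ == "__main__":
    # 100 m x 100 m network, as in the simulation setup of this paper.
    for k in (1, 4, 9):
        print(k, round(cluster_radius(100.0, k), 2))
```

The radius shrinks with the square root of the number of clusters, which is why adding cluster heads quickly reduces the number of hops needed inside a cluster.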

Intracluster Modeling.
In Figure 1, the minimum number of hops between a cluster head and a member node can be described as $H_{min} = R / r$, where $R$ is the radius of the local cluster and $r$ is the transmission range. If the neighbor nodes, or member nodes, are located at the maximum transmission range, the event data can be transmitted over the minimum transmission distance, as shown in Figure 2. Even when the relay node is located at the $i$th radius, as shown in Figure 2, the hop count from the $i$th radius to the cluster head can differ. In Figure 3, when node "d" transmits the event data to the cluster head, the data path is "d-c-b-a," which is the shortest path with the minimum hop count. On the other hand, node "1" can transmit the event data only through a detour path such as "1-2-3-4-5-6-7." Node "1" has no choice but to select node "2," which is closer to the cluster head than itself, as the relay node, because there are no nodes at the $(i-1)$th radius, that is, with a hop count of $i-1$. After repeating this process, node "1," whose minimum hop count is 4, transmits the event data to the cluster head over a data path with a hop count of 7. Thus, the data path or distance can be increased by the location of the nodes. To limit this increase, we should select a relay node within the transmission range. The process of doing this is as follows.
To determine the number of nodes located in the $i$th area, which is the area between the $i$th and $(i-1)$th radius, we assume the ratio of the $i$th area, $Ar_{i\text{th}}$, to be as given in (2). The number of nodes in the $i$th area is obtained by multiplying the total number of nodes $N$ by the ratio of the $i$th area, $Ar_{i\text{th}}$, as in (3). As described above, the data path can be lengthened depending on the location of the nodes within the $i$th area. Therefore, the distance and hop counts of the data path should be calculated considering the best and worst node locations. In Figure 4, the transmission range of node "A" in the $(i-1)$th area can cover from the $i$th area to the $(i-2)$th area. In this case, the optimal relay node of node "A" is the node located at the $i$th radius, as shown in Figure 3. Even if there are no nodes located at the $i$th radius, node "A" can select nodes in the $(i-1)$th area as the relay node. They are better than the nodes in the $(i-2)$th area because their data path is shorter. Thus, the data transmission range for selecting the relay node is at least half of the transmission range, as expressed in (4). Node "B" is located at the center between the $(i-1)$th area and the $(i-2)$th area. It is better for node "B" to pick a node in the $(i-1)$th area than a node in the $(i-2)$th area as the relay node. Therefore, the data relay area can be limited to the black spot, $Ar_{relay}$, in Figure 4.
This area can be calculated by taking the formula for $Ar_{i\text{th}}$ in the $(i-2)$th area minus the area of the black spot not included in the $(i-2)$th area, as described by (5). The number of relay nodes in the $(i-1)$th area is obtained by multiplying the number of nodes in the $(i-1)$th area, $C_{i\text{th}}$, by the ratio of the black spot area, $Ar_{relay}$, to $Ar_{i\text{th}}$. Therefore, the number of relay nodes in the $(i-1)$th area in the intracluster part is given by (6).
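The ring-based node count used in this subsection can be illustrated with a short Python sketch. It is written under a stated assumption: the ratio of a ring is taken as the area between two consecutive hop radii divided by the area of a circular cluster, which may differ in detail from the paper's own expression. The function names and the clipping of the outermost ring at the cluster radius are likewise choices made for this example.

```python
import math


def ring_ratio(i, tx_range, radius):
    """Share of the cluster area lying between hop radius i-1 and hop radius i.

    Assumes a circular cluster of the given radius; the outermost ring is
    clipped at the cluster boundary.
    """
    outer = min(i * tx_range, radius)
    inner = min((i - 1) * tx_range, radius)
    return (outer ** 2 - inner ** 2) / radius ** 2


def nodes_per_ring(n_nodes, tx_range, radius):
    """Expected node count per ring for n_nodes spread uniformly in the cluster."""
    rings = math.ceil(radius / tx_range)
    return [n_nodes * ring_ratio(i, tx_range, radius) for i in range(1, rings + 1)]


if __name__ == "__main__":
    # Cluster of radius 30 m with a 10 m range: three rings.
    print(nodes_per_ring(100, 10.0, 30.0))
```

Under this assumption the outer rings hold most of the nodes, so the availability of relay nodes in the inner rings becomes the limiting factor for short data paths.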

Intercluster Modeling.
The aggregated data should be transmitted to a sink node by a cluster head in the intercluster part. The number of relay nodes which send the aggregated data to a sink node is affected by the number of cluster heads. When a cluster head sends the aggregated data to a sink node, the data transmission range should also be considered, as in the intracluster case. As shown in Figure 5, a cluster head beyond the one-hop range has to send the aggregated data to a sink node using multihop communication. In this case, the cluster head also selects nodes in the black spot which are closer to the sink node and sends the data to the sink node through them. This is the same as the method used by member nodes to send the event data toward the cluster head in the intracluster part. However, the transmission area of a cluster head is not affected by the $i$th area. In the intercluster part, the ratio of the $i$th area is always given by (7). So, the number of relay nodes in the $i$th area is calculated by (8).

Cluster Depth Modeling.
In the above intra- and intercluster modeling, we found that the number of cluster heads is related to the size of a local cluster. The cluster size is the radius of a local cluster. It can be presented as the hop count.
If the depth of a local cluster is $h$, other local clusters are located outside of the area defined by $h$. This means that the number of local clusters is determined by $h$. As the cluster radius $R$ can be presented in terms of the hop count, $h$ can be given as $R / r$, where $r$ is the transmission range. Therefore, the number of clusters $k$ can be redefined by (9), where $M$ is the side length of the network area. Using (9), the number of clusters can be calculated:

$$k = \frac{M^2}{\pi (h r)^2}. \quad (9)$$
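The relation between cluster depth and the number of clusters can be checked numerically. The sketch below assumes circular clusters whose radius equals the depth in hops times the transmission range and whose total area equals the square network area; the function and parameter names are illustrative. With the 100 m × 100 m network and 10 m range used in this paper, it gives about 3.5 clusters for a depth of 3 hops and about 1.3 for a depth of 5 hops, which agrees with the values quoted in the performance analysis.

```python
import math


def clusters_for_depth(side_m, depth_hops, tx_range_m):
    """Number of circular clusters of radius depth_hops * tx_range_m whose
    total area equals a square network of side side_m."""
    radius = depth_hops * tx_range_m
    return side_m ** 2 / (math.pi * radius ** 2)


if __name__ == "__main__":
    for depth in (1, 3, 5):
        print(depth, round(clusters_for_depth(100.0, depth, 10.0), 1))
```

The result is generally not an integer, so in practice it would be read as a target around which the number of cluster heads is chosen.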

Network Configuration.
An ns-2 simulator [19] is used to evaluate the algorithm proposed above [20]. The network configuration is as follows. The network size is 100 m × 100 m. The transmission range of a sensor node, $r$, is 10 meters. The total number of sensor nodes is 100. They are scattered uniformly in the network. The sink node is located at a fixed distance from the network. The size of a data packet is 525 bytes. The receive energy and transmit energy are the same as in the existing research [11]. After the process of clustering, the operation of clustering is divided into two parts: intraclustering and interclustering. In the intraclustering process, the member nodes send the event data toward their cluster head once per specific period of time. After the cluster heads collect and aggregate the event data from all of the member nodes, the interclustering process starts, and the cluster head sends the aggregated data to a sink node. To examine how the transmission range depends on node position, the nodes in the $i$th area are placed at various positions: line, center, left, and right (see Figure 6).

Performance Analysis.
The network can be divided into local clusters according to the number of cluster heads. The size of each local cluster is affected by the number of cluster heads. In a local cluster, the number of relay nodes which can transmit the data to the neighbor nodes is shown in Figure 7.
As shown in Figure 7, the node located on the line, node line, has more relay nodes than the nodes in the other locations. On the other hand, the node located on the right near the $i$th line, node right, has fewer relay nodes. The number of member nodes of each local cluster decreases with an increasing number of cluster heads, because the size of a local cluster becomes smaller. For this reason, the number of relay nodes decreases as the number of cluster heads increases. As shown in Figure 7, in the case where there is one cluster head in the WSN, the number of relay nodes of the node line is about 5. In the case where there are 15 cluster heads in the WSN, the number of relay nodes is about 2. In this case, the node line can still transmit the event data to the neighbor nodes, though the number of relay nodes is decreased. However, the node line is the best-case scenario for transmitting the event data. In the case of 15 cluster heads, the other nodes have fewer than 1 relay node in their own transmission area. This means that they have to select a detour. Figure 8 shows the energy consumption of a local cluster. If the number of cluster heads is increased, the transmission energy of a local cluster is decreased, because the relay distance between the cluster head and its member nodes is shortened. As the number of cluster heads is increased, the amount of energy for a given location type is decreased, because the cluster size becomes smaller. The energy consumption of a local cluster would be the same in any location when the cluster size is lower than 1 hop.
In the intercluster operation, the number of relay nodes is not related to the local cluster size, unlike in the intracluster operation. It is related to the transmission range of the sensor nodes which play the role of relay nodes. As the transmission range of a cluster head is the same as that of the ordinary nodes, the energy consumption in the intercluster operation is only related to the number of relay nodes within the transmission range, the distance between the cluster head and a sink node, and the number of cluster heads. Figure 9 shows the average number of hops in the case where the distance between a cluster head and a sink node is set to the shortest path. As the number of cluster heads increases, the distance between a cluster head and a sink node increases. Figure 10 shows the energy consumption as a function of the number of cluster heads in the intercluster operation. The energy consumption increases rapidly with an increasing number of cluster heads, because the number of relay nodes increases with the distance.
In this paper, our selection algorithm determines the optimal number of CH nodes before the clusters are formed. The energy consumed in cluster formation therefore depends on the number of nodes selected as CHs. In other words, the energy consumed in forming a cluster decreases as the cluster becomes smaller. As Figure 11 shows, the energy used to form local clusters decreases as the number of CHs increases. While the previously reviewed algorithms, MROCH [11] and PDTC [12], show the energy consumption of cluster formation decreasing consistently with the number of CHs, our proposal shows the energy consumption increasing once the proportion of CHs exceeds eight percent, because the energy consumption of detouring is added. From this chart, we can see that our algorithm is affected more strongly than the others in local cluster formation, since the detouring energy consumption in the intercluster part is much greater than that in the intracluster part.
The total energy consumption is obtained by adding the intracluster energy consumption to the intercluster energy consumption. Figure 12 shows the total energy consumption as a function of the number of cluster heads. In all locations, the energy decreases when the number of cluster heads is 2, 3, or 4, because of the decrease in the number of relay nodes in a local cluster. When there are more than 5 cluster heads, the total network energy increases rapidly, as the intercluster energy becomes higher than the intracluster energy. Thus, the total energy consumption is lowest, under 0.04 Joule, when the proportion of cluster heads is from 2% to 5%. Figure 13 shows the number of relay nodes (RNs) for the case in which a node keeps the same hop count but receives more data transfer requests from other nodes than in previous steps of the data transmission, so that the node selects a detouring node (DN) for its data transmission path. The earlier MROCH [11] algorithm shows a stable (or fixed) number of probable RNs, between 5 and 14, because it does not consider DNs, whereas the number in our proposal keeps increasing once the proportion of CHs reaches 3%. This means that the increase in the number of RNs caused by detour paths can affect the energy consumption of the network. Figure 14 illustrates this effect by showing the proportion of RNs that arise from detouring (DNs) in the network. From this figure, we can also see that the proportion of DNs is over 15% when the proportion of CHs is between 2% and 5%. To sum up, as Figure 12 indicates, the total energy consumption can be affected by the detour path when the proportion of CHs is around 2% to 5% of the whole network, which is the case of minimum energy consumption. This means that it is not easy to determine a suitable number of CHs exactly from the total energy consumption of the network without considering the detour path (or DN).
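The way the total energy is formed and minimized over the number of cluster heads can be sketched as follows. This is only a structural illustration: the function `best_cluster_head_count` and the two toy curves are hypothetical placeholders introduced here, and their shapes and values are not taken from the paper's equations, figures, or simulation results.

```python
def best_cluster_head_count(intra_energy, inter_energy, candidates):
    """Return (best_k, totals), where totals[k] = intra_energy(k) + inter_energy(k)
    and best_k is the candidate with the smallest total."""
    totals = {k: intra_energy(k) + inter_energy(k) for k in candidates}
    best_k = min(totals, key=totals.get)
    return best_k, totals


# Placeholder curves, NOT values from the paper: the intracluster cost falls
# as clusters get smaller, while the intercluster cost grows with the number
# of cluster heads.
def toy_intra(k):
    return 1.0 / k


def toy_inter(k):
    return 0.1 * k


if __name__ == "__main__":
    best_k, totals = best_cluster_head_count(toy_intra, toy_inter, range(1, 11))
    print(best_k, totals[best_k])
```

Any pair of energy models can be passed in, so the same helper could be reused with curves that include or exclude the detour cost in order to compare where the minimum falls.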
To compare the variation of the total energy consumption in the network between our proposal and the other proposals, we first need to analyze our cluster formation algorithm and theirs. Since LEACH assumes a communication scope of only a single hop, we compared the calculation of the optimal number of CHs with MROCH and PDTC using these mathematical equations. As a result, Figure 15 shows the variation of the energy consumption for each algorithm and indicates that our proposal achieves suitable energy consumption when the proportion of CHs is around 2% to 5%, whereas MROCH does so between 4% and 8% and PDTC at around 5% to 9%. In the ns-2 network simulation, however, the optimal energy consumption is obtained when the proportion of CHs in the network is around 2% to 4%. This means that our proposal, by considering the additional energy consumption of detouring, can calculate the optimal number of CHs more accurately than the other proposals.
If the local cluster size is less than or equal to 1 hop, there are no relay nodes. Therefore, when the size of a local cluster is "1," the number of relay nodes is "0" in Figure 16. The greater the size of a local cluster, the greater the number of relay nodes.
If the size of a local cluster is 1 hop, or $r$, the intracluster energy consumption is almost equal to the total network energy consumption when the number of local clusters is 30. On the other hand, if the size of a local cluster is increased, the intracluster energy consumption increases with the increasing number of relay nodes, and the intercluster energy consumption decreases with the decreasing number of cluster heads. As the intercluster energy consumption decreases rapidly with the decreasing number of cluster heads in this case, the total energy consumption decreases. As Figure 17 shows, the energy consumption continues to change in this way until the size of a local cluster reaches 5 hops. However, when the size of a local cluster exceeds 5 hops, the number of relay nodes increases rapidly with the increasing distance between the cluster head and member nodes. Thus, the optimum depth of a local cluster from the viewpoint of energy consumption is between 3 and 5 hops. In the case of a depth of 3, that is, 3 hops, the number of cluster heads is 3.5. In the case of a depth of 5, it is 1.3. Thus, the optimum number of cluster heads for energy efficiency is from 1.3 to 3.5.

Conclusion
Wireless sensor networks are networks used for monitoring and detecting environmental information using tiny sensor nodes with restricted capability in a specific area. WSNs have to use multihop-based communication, because the sensor nodes have a limited transmission range, and have to support energy efficiency mechanisms, as it is not easy to supply energy to a sensor node. Generally, sensor nodes tend to detect similar or identical event data. However, transmitting redundant data to other nodes is not energy efficient. To prevent this, a clustering mechanism is used. The clustering mechanism of sensor networks can reduce duplicated data, as the cluster heads collect similar data from their neighbor nodes, and thereby reduce the energy consumption. The clustering mechanism can collect the required data from a local cluster and make it possible to transmit the event data rapidly, as it forms local clusters based on the event features. The energy consumption associated with clustering is affected by the number of cluster heads and the size of a local cluster. The size of a local cluster is related to the number of cluster heads. The number of cluster heads affects the number of relay nodes used for intra- and intercluster data transmission. Therefore, it affects the total energy consumption in WSNs. When the event node which detects the required data in the monitoring area selects a relay node, it is better to select a node which is closer to a cluster head or a sink node within its own transmission range in order to set up the shortest path. In this way, the path will be shortened. To achieve this, in this paper, we propose a method of selecting, within the transmission range, a node with fewer hops than the selecting node itself. The locations of the nodes affecting the data path are also considered. Through equations, we determined the number of relay nodes and the energy consumption in the intra- and intercluster operations by means of the above considerations.
We determined the variation of the energy consumption with the number of cluster heads and the size of a local cluster. By determining the energy consumption, we were able to determine the optimal size of a local cluster for a given number of cluster heads. Thus, we determined the optimal number of cluster heads in the clustering of WSNs.