Energy-efficient chain-cluster based intelligent routing technique for wireless sensor networks

With the increasing acclaim of Wireless Sensor Networks and their diverse applications, research has been directed into optimising and prolonging the network lifetime. Energy efficiency has been a critical factor due to the energy resource impediment of batteries in sensor nodes. The proposed routing algorithm therefore aims at extending the lifetime of sensors by enhancing load distribution in the network. The scheme is based on the chain-based routing technique of the PEGASIS (Power-Efficient GAthering in Sensor Information Systems) protocol and uses Ant Colony Optimisation to obtain the optimal chain. The contribution of the proposed work is the integration of the clustering method into PEGASIS with Ant Colony Optimisation to reduce redundancy of data, neighbour node distance and the transmission delay associated with long links, and the employment of an appropriate cluster head selection method. Simulation results indicate the proposed method's superiority in terms of residual energy, along with considerable improvement in network lifetime and a significant reduction in delay, when compared with the existing PEGASIS protocol and the optimised PEG-ACO chain respectively.


Introduction
A Wireless Sensor Network (WSN) is an infrastructure-based wireless network made up of multiple tiny Sensor Nodes (SNs) that sense information from their specific environment and, by cooperating with each other, transmit it to a Base Station (BS) sited in a particular location. Therefore, a WSN can be described as a coalition of locally distributed sensors that work as a team to achieve a common purpose, which is usually application dependent.
The SNs possess the ability to sense, process and communicate. Since they are battery operated, small and affordable, they can be practical and useful for a number of different applications. Figure 1 illustrates the components of a sensor node.

The SNs are scattered to collect useful information from a specific geographical location and send it to a BS. The collected data can either be routed using single hop or multi-hop paradigms, depending on the specific routing protocol. SNs communicate information through wireless media such as radio signals and infra-red [2,3].
WSN is widely considered as one of the most important technologies for the twenty-first century and is used in many applications like home automation, healthcare, traffic control and environment monitoring.
For the SNs to sustain long-term sensing capability over a large coverage set, they should conserve as much energy as possible, since their battery life is limited. Replacement of a node battery is infeasible due to the harsh environments in which the SNs are usually deployed [4]. Hence, it becomes imperative to prioritise energy efficiency in order to maximise the network lifetime and performance of a WSN. In time-critical applications, routing delay also becomes a pressing factor. This paper proposes two routing schemes with the following contributions:
1. Combine the benefits of chain-based and cluster-based architectures to reduce overall chain length and decrease transmission distance.
2. Reduce network latency by using simultaneously operating clusters.
3. Achieve proper load balancing in the network by adopting a suitable cluster head selection technique.
4. Employ an intelligent routing method to obtain the globally optimal chain of shortest distance.
The rest of the paper is organised as follows: Section 2 describes various existing chain-based and cluster-based routing protocols. Section 3 follows with the details of the proposed routing techniques. The results are displayed and analysed in Section 4. Finally, Section 5 concludes the paper and suggests some future works.

Related works
Routing protocols applied to WSNs can be categorised as [5,6]:
Location-based protocols
Data-centric protocols
QoS-based protocols
Hierarchical protocols
Hierarchical protocols stand out in terms of scalability, by maintaining a tiered framework in the network, energy minimisation and network lifetime maximisation. Two leading hierarchical routing techniques are chain-based and cluster-based.
LEACH [7] is the most basic cluster-based protocol, proposed as an improvement of Direct Transmission, where packets are sent from nodes to the BS directly. The operation of LEACH is split into communication rounds and each round is further divided into the set-up phase and the steady-state phase. During each round of operation, all sensed information is sent to the BS. In the set-up phase, a Cluster Head (CH) is elected in each cluster, based on the desired CH percentage and on how often the node has already been chosen as CH. Each node contends for the position of CH by choosing a random number in [0,1]; if the number is less than T(n), the node becomes CH. T(n) is determined using Eq. (1) [7]:
$$T(n) = \begin{cases} \dfrac{P}{1 - P\left(r \bmod \frac{1}{P}\right)}, & \text{if } n \in G \\ 0, & \text{otherwise} \end{cases} \tag{1}$$
where P = desired CH percentage; r = current round; G is the set of nodes that have not been CHs in the last 1/P rounds; n = node number.
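The LEACH election rule can be illustrated with a short sketch. It assumes the usual LEACH form of the threshold, T(n) = P / (1 - P (r mod 1/P)) for nodes in G and 0 otherwise; the function names `leach_threshold` and `elect_cluster_heads` are hypothetical and chosen only for this illustration.

```python
import random


def leach_threshold(p, r, in_g):
    """Threshold T(n) for one node in round r.

    p    : desired fraction of cluster heads (e.g. 0.1)
    r    : current round number (0-based)
    in_g : True if the node has not been CH in the last 1/p rounds
    """
    if not in_g:
        return 0.0
    period = round(1 / p)  # number of rounds in one rotation cycle
    return p / (1 - p * (r % period))


def elect_cluster_heads(node_ids, p, r, eligible, rng=random):
    """Return the nodes whose random draw in [0, 1) falls below T(n)."""
    return [n for n in node_ids
            if rng.random() < leach_threshold(p, r, n in eligible)]
```

The threshold grows as the cycle advances, so that every node still in G is elected by the last round of the cycle, while nodes outside G can never be elected.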
In the steady-state phase, nodes from each cluster communicate information to their respective CH using the single hop technique, and the CHs from all clusters fuse collected data with their own and transmit the data to the BS directly [7].
However, extra overhead is created due to the dynamic clustering nature of LEACH since clustering is performed during each round. Furthermore, CHs further away from BS tend to drain energy at a faster rate.
Heinzelman et al. [8] presented LEACH-C, which uses a central control algorithm for proper spreading of CHs over the entire network during the set-up phase. The BS then selects CHs and non-CHs and organises clusters based on this information. The BS ensures proper load balancing by calculating the average residual energy of nodes in each round to act as a threshold, and selecting only nodes having energy higher than the threshold. An optimisation algorithm called Simulated Annealing, which gives a near-optimal approximate solution, is used to find the optimal clusters on the basis of the distance between non-CHs and CHs. The same steady-state phase as traditional LEACH is then adopted. Even though a superior scheme was applied for cluster formation, the drawback of long transmission distance between CH and BS still prevails.
PEGASIS is one of the most popular chain-based protocols, presented as an extension of LEACH in which a chain is formed instead of clusters. In PEGASIS, chain formation starts with the node furthest from the BS. The chain continues to form by connecting the current node with its closest neighbour until all nodes are linked on the chain. Data is transmitted and aggregated from node to node along the chain until a randomly nominated Chain Leader (CL) is reached. The CL then transmits information directly to the BS. This communication process is repeated every round. The objective of PEGASIS is to increase network lifetime by decreasing transmission distance between nodes [9].
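The greedy chain construction described above can be sketched as follows. This is a minimal illustration, assuming nodes are given as (x, y) tuples; `greedy_chain` and `chain_length` are hypothetical names, not part of PEGASIS itself.

```python
import math


def _dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def greedy_chain(nodes, bs):
    """Build a PEGASIS-style chain and return node indices in chain order.

    The chain starts at the node furthest from the base station `bs`
    and repeatedly appends the closest node not yet on the chain.
    """
    remaining = set(range(len(nodes)))
    current = max(remaining, key=lambda i: _dist(nodes[i], bs))
    chain = [current]
    remaining.remove(current)
    while remaining:
        # Nearest unlinked neighbour of the current chain end.
        nxt = min(remaining, key=lambda i: _dist(nodes[current], nodes[i]))
        chain.append(nxt)
        remaining.remove(nxt)
        current = nxt
    return chain


def chain_length(nodes, chain):
    """Total length of the links between consecutive chain nodes."""
    return sum(_dist(nodes[a], nodes[b]) for a, b in zip(chain, chain[1:]))
```

Because each step only looks at the nearest remaining node, the last links of the chain can become long, which is the weakness that globally optimised chains aim to remove.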
However, chain-based routing suffers from large delay and communication overhead. Some emerging protocols combine chain and cluster routing techniques to reduce the individual downsides of these two methods or unite their benefits.

Linping et al. [10] have proposed a novel way of increasing the efficiency of PEGASIS by selecting two CHs and dividing the network into different data levels. An efficient chain formation scheme was devised by Mishra et al. [11] to improve the network lifetime of PEGASIS. The scheme elects nodes that are in the proximity of the BS as potential leaders, unlike the random selection of the CL in PEGASIS, and hence reduces transmission distance from CL to BS. Additionally, a multiple overlapped chain formation technique is applied for transmission of information to the BS.
Other authors [12,13] have divided the network of nodes into concentric clusters, each of which consists of a PEGASIS chain. The chain length is considerably reduced and hence, transmission energy is also reduced. However, the authors still used the greedy method of PEGASIS, which leads to increased communication distance as nodes die. Moreover, an appropriate method for CH selection was not used, since the communication distance of the CH was not considered. PEGASIS has also been improved by using Ant Colony Optimisation to identify the globally optimal chain, instead of the greedy method, to ensure that the chain with the smallest overall distance is found [14,15].

Ant Colony Optimisation
Ant Colony Optimisation (ACO) is a nature-inspired meta-heuristic optimisation technique based on the foraging nature of ants to find food. Ants wander haphazardly around the colony in search of food. If an ant locates a food source, it returns to the colony almost directly, leaving behind a trail of pheromones. The closest ants will be attracted to these pheromones and follow the path taken by the previous ant. When these ants return to the colony, they fortify the pheromone trail created by the first ant, which will attract more ants, and the process is repeated. With time, the pheromone evaporates and the longest trails will eventually disappear since the ants adhered to these tracks will not have enough time to return back to the colony and strengthen the trail. Hence, the pheromone trail left at the end will be that of the shortest length and will be globally optimal [16].
In ACO, simulated ants are considered as mobile agents that work jointly and communicate with each other through artificial pheromone trails. Each ant leaves the colony (source node) and travels over the network randomly until every node has been visited. At each node p, the transition probability with which an ant k chooses the next node q is as follows [15,17]:
$$P_k(p,q) = \begin{cases} \dfrac{[\tau(p,q)]^{\alpha}\,[\eta(p,q)]^{\beta}}{\sum_{u \in R_q} [\tau(p,u)]^{\alpha}\,[\eta(p,u)]^{\beta}}, & \text{if } q \notin M_k \\ 0, & \text{otherwise} \end{cases} \tag{2}$$
where τ(p, q) is the pheromone created by the backward ant, η(p, q) is the projected heuristic function for energy and distance, and R_q is the set of recipient nodes. For node p, M_k is the list of earlier visited nodes. α and β are the pheromone exponential weight and heuristic exponential weight respectively. The Roulette Wheel Selection method [18,19] is used to obtain an optimal solution based on the transition probabilities given in Eq. (2).
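The transition rule and the roulette wheel step can be sketched in a few lines. This is an illustration under stated assumptions: pheromone and heuristic values are held in square lists `tau` and `eta`, every unvisited node is treated as a possible recipient, and the function names are hypothetical.

```python
import random


def transition_probabilities(p, tau, eta, visited, alpha=1.0, beta=1.0):
    """Probability of moving from node p to each node q.

    Visited nodes (the list M_k, which should include p itself) get
    probability 0; the others are weighted by tau^alpha * eta^beta and
    normalised so that they sum to 1.
    """
    n = len(tau)
    weights = [0.0 if q in visited
               else (tau[p][q] ** alpha) * (eta[p][q] ** beta)
               for q in range(n)]
    total = sum(weights)
    if total == 0.0:
        return weights  # no admissible next node
    return [w / total for w in weights]


def roulette_wheel(probs, rng=random):
    """Pick an index with probability proportional to probs."""
    r = rng.random()
    cumulative = 0.0
    last_positive = None
    for q, pr in enumerate(probs):
        if pr > 0.0:
            last_positive = q
        cumulative += pr
        if r < cumulative:
            return q
    # Guard against rounding when the cumulative sum ends just below 1.
    return last_positive
```

A larger α makes ants follow existing trails more closely, while a larger β favours the heuristic (for example, short links), so the two exponents trade exploitation against exploration.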
The evaporation mechanism of pheromone is executed using the following equation to control the quantity of pheromone and favour the finding of new paths:
$$\tau_{pq} \leftarrow (1 - \rho)\,\tau_{pq} \tag{3}$$
where 0 < ρ ≤ 1 is the evaporation rate and τ_pq is the pheromone level between node p and node q.

ACI 16,1/2
The retention of pheromone on a trail when an ant moves backward on the return journey is given by:
$$\tau^{k} = \frac{Q}{L_k} \tag{4}$$
where τ^k is the amount of pheromone retained, L_k is the distance travelled by ant k in terms of visited hops and Q is the weighting coefficient. Hence, the pheromone level of the trail changes according to the following equation:
$$\tau_{pq} \leftarrow \tau_{pq} + \tau^{k} \tag{5}$$
After all ants have completed their journey, the optimal route, that is, the route with the best cost (the shortest path), is recorded. This process is repeated for multiple rounds in order to find a better path each time, until the globally optimal chain is obtained.
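As an illustration of the two pheromone operations, the sketch below applies evaporation to every link and then deposits pheromone along one ant's tour. It assumes the common Ant System form, in which each ant deposits Q / L_k on the links it used, and it assumes symmetric links; `evaporate` and `deposit` are hypothetical names.

```python
def evaporate(tau, rho):
    """Scale every pheromone level by (1 - rho), with 0 < rho <= 1."""
    for row in tau:
        for j in range(len(row)):
            row[j] *= (1.0 - rho)


def deposit(tau, tour, cost, q_coeff=1.0):
    """Add q_coeff / cost to every link used by one ant.

    tour : list of node indices in visiting order
    cost : L_k, the cost of the tour (assumed here to be supplied
           by the caller, e.g. hop count or chain length)
    Links are treated as symmetric, which is an assumption.
    """
    delta = q_coeff / cost
    for a, b in zip(tour, tour[1:]):
        tau[a][b] += delta
        tau[b][a] += delta
    return delta
```

Short tours deposit more pheromone per link than long ones, so over repeated rounds the links of good chains are reinforced while unused links fade through evaporation.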

Proposed algorithm
In this paper, the proposed techniques eliminate the increase in neighbour distance that arises from the greedy algorithm of PEGASIS as nodes get connected on the chain. Ant Colony Optimisation (ACO) is used to obtain the globally optimal chain of shortest distance; it has proven its superiority over other optimisation algorithms, such as Genetic Algorithm and Simulated Annealing, in solving the Travelling Salesman Problem, where the shortest path connecting multiple points needs to be found [20,21]. The globally optimal chain was found by choosing the best cost function calculated from multiple ACO rounds. This eliminates the problem of long distances between neighbour nodes at the end of the PEGASIS chain. The network latency incurred with long chains is also reduced by the clustering methods, and an efficient CH selection for each cluster is adopted by taking into account the distance to destination and the residual energy of nodes. First, Horizontal PEG-ACO Clustering will be discussed, followed by Vertical PEG-ACO Clustering. The difference between these two methods lies in the way the network is clustered and in the determination of the upper level cluster, which includes the CH responsible for the energy-exhaustive task of transmitting to the BS.

Network and radio model
Some assumptions are made for the network model used:
The BS is fixed, located very far from the sensing area, has high processing power, and an endless energy resource.
All the nodes are energy-constrained.
Nodes are energy-homogeneous and hence, are allocated the same initial energy.
Nodes are location aware through GPS and hence, can identify location of their neighbour as determined by the Base Station using ACO.
Nodes are immobile once positioned.
"Time-driven" sensing of information is considered, whereby each SN senses the environment at a fixed rate and always has information to transmit to the BS.
Most of the node energy is depleted during transmission of data, which is dependent on transmission distance.
Nodes are randomly deployed over the whole network area.

Our methods make use of the free space and multi-path radio models to simulate energy dissipation in the network [22]. In this model, the energy required for transmission and reception of an L-bit packet is described as:
$$E_{TX}(L,d) = \begin{cases} L\,E_{elec} + L\,E_{fs}\,d^{2}, & d < d_0 \\ L\,E_{elec} + L\,E_{mp}\,d^{4}, & d \ge d_0 \end{cases} \tag{6}$$
$$E_{RX}(L) = L\,E_{elec} \tag{7}$$
where E_TX is the energy dissipated by a sensor node while transmitting a packet to its neighbour, E_RX is the energy dissipated when a sensor node receives a packet, E_fs is the dissipated energy per bit for the transmitter amplifier in the free space model, E_mp is that of the multipath model, d is the transmission distance and E_elec is the energy per bit to run the transmitter or receiver circuits.
Energy spent during fusion of packets by a node is given by:
$$E_{fusion} = m\,L\,E_{DA} \tag{8}$$
where E_DA is the data aggregation energy per bit spent by a node when it fuses received data with its own and m is the number of packets to be fused. The communication distance, d_0, is the threshold between the free space and multipath models, and is calculated by:
$$d_0 = \sqrt{\frac{E_{fs}}{E_{mp}}} \tag{9}$$
The coordinates of the BS are set so that its minimum distance from the upper level CH is greater than the threshold distance, d_0. Hence, the free space model is used for inter-cluster and intra-cluster communications while the multi-path model is used for transmission to the BS.
Network latency can be defined as the time lapse between the generation of the data packet by the first node in the chain and the reception of all packets by the BS, illustrated by [1,23,24]:
$$T_{delay} = (T_q + T_p + T_d) \times hops \tag{10}$$
where T_q is the queue delay per intermediate forwarder, T_p is the propagation delay, T_d is the transmission delay and hops is the total hop count from the first node to the BS, which can also be considered as the number of nodes acting as both intermediate forwarders and transmitters.
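A small sketch of this first-order radio model may help to make the two regimes concrete. The constants below are values commonly used with this model in the literature and are assumptions for illustration, not the simulation parameters of this paper; the function names are likewise hypothetical.

```python
import math

# Assumed, commonly used parameter values (not taken from this paper).
E_ELEC = 50e-9      # J/bit, transmitter/receiver electronics
E_FS = 10e-12       # J/bit/m^2, free space amplifier
E_MP = 0.0013e-12   # J/bit/m^4, multi-path amplifier
E_DA = 5e-9         # J/bit, data aggregation

# Threshold distance between the free space and multi-path models.
D0 = math.sqrt(E_FS / E_MP)


def tx_energy(bits, d):
    """Energy to transmit `bits` over distance d (metres)."""
    if d < D0:
        return bits * E_ELEC + bits * E_FS * d ** 2
    return bits * E_ELEC + bits * E_MP * d ** 4


def rx_energy(bits):
    """Energy to receive `bits`."""
    return bits * E_ELEC


def fusion_energy(bits, m):
    """Energy to fuse m packets of `bits` bits each."""
    return m * bits * E_DA
```

With these assumed constants the threshold distance is roughly 88 m, so short neighbour-to-neighbour links fall under the d² law while a distant BS falls under the much costlier d⁴ law.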

PEG-ACO Clustering
The issue with long chains is the increase in transmission delay and communication overhead. Therefore, the clustering method is used to mitigate these issues. Multiple smaller chains are obtained to reduce network delay by allowing the intra-cluster operations to run concurrently. Static clusters are implemented, instead of dynamic clusters as in LEACH, to decrease the overhead caused by recurrent clustering. Moreover, the most significant energy consumption occurs during transmission of a packet from CH to BS, since the transmission distance is largest, implying that transmission energy is highest. Clustering also helps in reducing this distance.
In the initialisation phase, the network is divided into equal static clusters and each contains the PEG-ACO chain to ensure only neighbour to neighbour communication as shown in Figure 2.

The number of rounds for the ACO algorithm has been determined by testing the ACO in a particular cluster for multiple rounds. The round at which the cost function, taken as the overall chain length obtained, becomes stable is used as the minimum total number of ACO rounds needed for optimal chain finding. This test has been repeated for other clusters in the network, with different node numbers, for reliability of the result.
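One way to read "becomes stable" is that the best chain length stops improving for a number of consecutive rounds. The sketch below encodes that reading; the `patience` and `tol` parameters and the function name are assumptions introduced for illustration only.

```python
def rounds_until_stable(best_cost_per_round, patience=10, tol=1e-9):
    """Return the last round (1-based) at which the cost improved,
    once no further improvement has been seen for `patience` rounds.

    Returns None if the sequence ends before stability is confirmed.
    """
    best = float("inf")
    last_improvement = 0
    for rnd, cost in enumerate(best_cost_per_round, start=1):
        if cost < best - tol:
            best = cost
            last_improvement = rnd
        elif rnd - last_improvement >= patience:
            return last_improvement
    return None
```

Running this check on several clusters with different node counts, and taking the largest value returned, would give a conservative round budget in the spirit of the test described above.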
Clusters are formed based on y-coordinates (Horizontal Clustering) and x-coordinates (Vertical Clustering). This is carried out in a centralised manner, controlled by the BS, as in LEACH-C and HEEP [7,25]. CHs carry the charge of inter-cluster communications and the higher level CH transmits to the BS. CH selection is performed by resorting to a trade-off between the residual energy of a node and its communication distance. This selection is executed in each cluster using the fitness function of Eq. (11), where the node with the highest fitness value acts as leader for its cluster. In Eq. (11), w_1 and w_2 are weights assigned to each parameter, whose sum is equal to 1, E_residual(CH) is the residual energy of the CH, and d_dest is the distance from the destination. The weights are used to set a priority between energy and communication distance.
The mode of data transmission and aggregation in each chain is the same as in PEGASIS and occurs in the steady-state phase. The end nodes in each cluster are the first to transmit, and a packet is generated then fused from SN to next SN along the chain until the CH is reached. Once all the CHs in the network have received packets from their neighbours, they fuse the data with their own into a single packet. Another chain is used to connect the CHs by considering communication distance and energy metrics for efficient data transmission to the BS.
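The trade-off between residual energy and distance can be illustrated with a weighted fitness score. The normalised form used below, w1 · (E / E_max) + w2 · (1 - d / d_max), is an assumption made for this sketch and is not claimed to be the paper's Eq. (11); the data layout and function names are also hypothetical.

```python
import math


def fitness(e_residual, d_dest, e_max, d_max, w1=0.5, w2=0.5):
    """Assumed fitness: more energy and a shorter distance score higher.

    Both terms are normalised to [0, 1] so that the weights w1 + w2 = 1
    set the priority between energy and distance.
    """
    return w1 * (e_residual / e_max) + w2 * (1.0 - d_dest / d_max)


def select_cluster_head(cluster, dest, w1=0.5, w2=0.5):
    """Return the id of the fittest alive node in one cluster.

    cluster : list of dicts with keys 'id', 'pos' (x, y) and 'energy'
    dest    : (x, y) of the destination (the BS or a higher level CH)
    """
    dists = {n["id"]: math.dist(n["pos"], dest) for n in cluster}
    e_max = max(n["energy"] for n in cluster) or 1.0  # avoid division by 0
    d_max = max(dists.values()) or 1.0
    best = max(cluster, key=lambda n: fitness(
        n["energy"], dists[n["id"]], e_max, d_max, w1, w2))
    return best["id"]
```

Setting w2 close to 1 always favours the node nearest the destination, which drains it quickly; a balanced setting rotates the role towards nodes that still hold energy.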
After each communication round, other CHs will be chosen in each of the clusters for even energy distribution of nodes. This ensures that the rigorous load of long distance transmission is fairly delegated among nodes. When one or more nodes die, the PEG-ACO chain is recreated by bypassing the dead nodes.
Intra-cluster communication occurs using a Time Division Multiple Access (TDMA) schedule created by each CH, which allocates time-slots to each SN in its cluster so as to safeguard transmission from collisions. In TDMA technology, the channel is divided into frames, having different time-slots, which are individually allocated to all the nodes by the BS. Nodes are only allowed to transmit within their time-slots.
To reduce inter-cluster interference, Code Division Multiple Access (CDMA) is used, where each cluster uses its own spreading code for communication. Signal is received by checking for a correlation with the specific spreading code used for transmission, and following de-correlation, all other signals are regarded as noise. Thus, interference is reduced during signal transmission. After all intra-cluster communications are over, inter-cluster communication and transmission to BS are performed using another TDMA schedule.
3.2.1 Horizontal PEG-ACO clustering. In horizontal clustering of PEG-ACO, the network is divided equally by considering the y-coordinates of nodes. For an (L × L) m² network, the size of one cluster would be (L × L/k) m², where k is the desired number of clusters.
CH selection begins with the upper level cluster (with the highest y-coordinates), where nodes closest to the BS are situated. Using Eq. (11), the fittest node is nominated as the CH assigned for transmission to the BS (destination). Designation of the next CH is performed in the following cluster using the same selection method, with the highest level CH now being the destination. This process is repeated until the lower level cluster (with the lowest y-coordinates) is reached. All CHs are then connected together on a chain. Figure 3 demonstrates the chain and cluster method of Horizontal PEG-ACO Clustering.
3.2.2 Vertical PEG-ACO clustering. Cluster formation in this method is the same as Horizontal PEG-ACO Clustering but instead, the x-coordinates of nodes are considered as shown in Figure 4.
For this approach, the fitness values of all alive nodes are computed using Eq. (11) again. The node with the highest fitness value takes on the role of transmitting to the BS, and its cluster then becomes the upper level cluster. The neighbour cluster is next to perform CH selection, with the higher level CH becoming the destination. This operation is repeated until all clusters have elected a CH with a connection to other close CHs.
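The equal static partition used by both variants can be sketched as a split of the field into k bands along one axis. This is a minimal illustration assuming a square field with its origin at (0, 0); `static_clusters` is a hypothetical name.

```python
def static_clusters(nodes, side, k, axis="y"):
    """Split a side x side field into k equal bands.

    nodes : list of (x, y) positions
    axis  : 'y' for horizontal clustering, 'x' for vertical clustering
    Returns a list of k lists of node indices, ordered from the lowest
    band to the highest along the chosen axis.
    """
    idx = 1 if axis == "y" else 0
    band = side / k
    clusters = [[] for _ in range(k)]
    for i, pos in enumerate(nodes):
        # Nodes lying exactly on the far edge go into the last band.
        c = min(int(pos[idx] // band), k - 1)
        clusters[c].append(i)
    return clusters
```

Since nodes are deployed randomly, bands of equal area do not necessarily hold equal numbers of nodes, and a band may even be empty for sparse deployments.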


Simulation results and discussions
The Matlab simulation tool has been used for demonstration of the routing techniques proposed. The simulations have been performed and analysed using performance metrics such as network lifetime, number of alive nodes, residual energy of network, average energy consumed per node, delay and percentage of dead nodes.
The proposed routing methods are simulated using different network environments to test their effectiveness. Networks of 100 m × 100 m and 200 m × 200 m are used, with the BS at (50, 300) and (100, 400) respectively.
Other common simulation parameters used are given in Table 1. Figures 5-7 show the models used for our simulations. Changes in node locations may vary simulation results slightly. The proposed clustering methods have been tested and compared with the existing PEGASIS protocol and PEG-ACO. For reliability of results, the same parameters and network were used for all routing methods.

Network lifetime and load balancing
Network performance starts degrading as soon as nodes start dying, since it has been assumed that all nodes have information to transmit to the BS. Therefore, network lifetime is measured in terms of First Node Die (FND), the transmission round at which the first node dies in the network. The percentage of residual energy left after FND gives an indication of how load was balanced in the network. Load distribution refers to how energy is being depleted in the network. Most energy is consumed by CHs and hence, appropriate CH selection induces better load balancing by ensuring an even energy distribution across the network. A lower percentage of energy after FND signifies a better load distribution over the whole network, since all nodes have been alive for a longer period. This test is performed for a network of 100 nodes with 0.5 J initial node energy scattered in a 200 m × 200 m area. A bar chart of FND is plotted to show the round at which node death begins for each method, as shown in Figure 8. Table 2 shows the percentage of network residual energy after FND. Figure 8 clearly shows that the proposed techniques have significantly improved the FND of the existing chain-based methods by decreasing transmission distance during each round. FND is improved for both proposed methods since better load distribution has been accomplished compared to the chain-based techniques. Fairness of CH selection has also been achieved, reducing the energy depletion rate of nodes.
It can be deduced from Table 2 that Vertical PEG-ACO Clustering achieves better load balancing than all other routing methods, which is a direct result of its high performance in terms of FND. This is accomplished by the mode of CH selection used, since the whole network is considered for selection of the fittest node to perform long-haul transmission to the BS.

Alive nodes
Figures 9 and 10 show the number of alive nodes left in the network after each communication round for different coverage areas. Nodes that have not yet depleted their energy completely and possess sufficient energy for transmission are considered as alive. For a 100 m × 100 m node deployment region, it can be observed from Figure 9 that Vertical PEG-ACO Clustering is superior in terms of FND. Therefore, the stability period of the network, where all nodes are still alive, is maintained for a longer period. A sudden decrease in the number of alive nodes then occurs, owing to the way CHs are selected to ensure proper load distribution: a steep decrease is seen because the remaining nodes all have little energy left.
When comparing results in terms of 50% nodes dead and last node dead (LND), it can be seen that these occur after a longer period in Horizontal PEG-ACO Clustering.
The graph of Horizontal PEG-ACO Clustering obtained contains five steps, each representing the moment a cluster dies, that is, when all nodes present in a cluster are dead. As soon as a cluster dies, a horizontal line can be noticed. This is due to the uneven load balancing in the network. The highest level cluster will be the one to die first since transmission from network to the BS will be done by nodes chosen in that particular cluster only. As a result, energy depletion will be greatest in that cluster due to longer transmission distance. Once a cluster is dead, a stable period is achieved since the nodes in the next higher level cluster still contain a relatively large amount of energy due to short inter-cluster and intra-cluster distances. Nevertheless, the nodes will deplete energy rapidly, due to the burden of long distance transmission to the BS of nodes in the highest level cluster. This is repeated until all clusters are dead.
Thus, for these particular network parameters, Vertical PEG-ACO Clustering is considered the most efficient with reference to number of alive nodes left in the network.

Moreover, it can be seen that there is an even distribution of load in the network since node death is gradual as shown from the steepness of the graph.
The results in Figure 10 show that the proposed clustering methods substantially improved the network performance with respect to both FND and 50% of nodes dead.
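The lifetime milestones discussed in this section can be extracted from a per-round count of alive nodes. The sketch below assumes such a list is available from the simulation; the label "HND" for the 50% nodes dead point and the function name are introduced here for illustration only.

```python
def lifetime_milestones(alive_per_round, total_nodes):
    """Return the rounds of first, half and last node death.

    alive_per_round[i] is the number of alive nodes after round i + 1.
    A milestone that is never reached is reported as None.
    """
    def first_round(predicate):
        for rnd, alive in enumerate(alive_per_round, start=1):
            if predicate(alive):
                return rnd
        return None

    return {
        "FND": first_round(lambda a: a < total_nodes),
        "HND": first_round(lambda a: a <= total_nodes / 2),
        "LND": first_round(lambda a: a == 0),
    }
```

A long gap between FND and LND with a flat, stepped curve indicates uneven load, whereas a late FND followed by a steep drop indicates that nodes were drained at a similar rate.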

Network residual energy
The total residual energy of nodes is indicative of the remaining overall energy of the network. It is the sum of the energies of all alive nodes and is measured during each communication round. Figure 11 shows the simulation results obtained for residual energy for a network of 200 m × 200 m containing 100 nodes.
As demonstrated by the results in Figure 11, both clustering methods outperform PEG-ACO with regard to residual energy of the network, but Horizontal PEG-ACO Clustering has higher residual energy throughout the entire network lifetime. This is due to minimisation of the transmission distance to the BS by using nodes present in the highest level cluster. There is a gradual increase in the distance from the highest level CH to the BS when a cluster dies, since the next highest level cluster takes charge of electing the CH that transmits to the BS. Due to this progressive increase in distance, transmission energy increases at a lower rate than in Vertical PEG-ACO Clustering. As a result, residual energy of the network per round is highest with Horizontal PEG-ACO Clustering.
Nevertheless, Vertical PEG-ACO Clustering still outperforms PEGASIS and PEG-ACO for this parameter from start of transmission until around when 5% alive nodes are left. Under 5%, results can be neglected since network performance has already been heavily degraded due to formation of network partitions.

Adaptability to node density variations
The routing schemes should also be resilient to changes in node density, since this parameter may need to be changed to adapt to the required application. The tests are performed in a 100 m × 100 m network. The results are tabulated in Tables 3 and 4. Vertical PEG-ACO Clustering outperforms both chain-based protocols during the first 50% of the entire network lifetime, as seen in Tables 3 and 4. As nodes continue to die, network stability decreases, making information sent to the BS unreliable. Hence, network lifetime above 50% can be disregarded. This performance remains unaffected by changes in node density. However, Horizontal PEG-ACO Clustering offers poor performance, when subjected to changes in node density, for the first 50% of the whole network lifetime, but succeeds at keeping the overall network active for a longer period.
Table 3. Comparison of rounds at which specific percentages of nodes are dead for 50 randomly deployed nodes.

Average energy consumed per node
From Figure 12, it is clear that the clustering methods have reduced the average energy consumed, with Vertical PEG-ACO Clustering giving the best results. Horizontal PEG-ACO Clustering gives the lowest average energy consumed per node, but a drastic increase in energy consumed is observed because of long distance transmission to the BS as higher level clusters die.

Network throughput
The network throughput can be measured as the number of messages received at the BS for a given period. The following test is performed for a network of 100 nodes dispersed in a 100 m × 100 m field.
As shown in Figure 13, a higher throughput is achieved with both proposed methods, which is mainly due to the extension in network lifetime based on the last node to die. If network lifetime is increased, more messages will be received at the BS.

Maximum network delay
Network latency can be defined as the time lapse between the generation of the data packet by the first node on the chain and the reception of all packets by the BS.
Maximum delay can be obtained by determining the maximum number of nodes in clusters. In the case of the simulated chain-based protocols, a maximum delay of 100 ms is thus obtained for a network of 100 nodes. Delay for the proposed chain-clustering schemes is lower since packet transmission is performed in all clusters simultaneously. The maximum delay for each routing method is shown in the bar chart of Figure 14.
Maximum network delay has been considerably reduced as a direct result of breaking down the long chain into smaller ones. Fewer hops are needed for data transmission to the BS and, since delay is proportional to the total hop count, latency is improved.
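The effect of splitting one long chain into parallel clusters can be illustrated with a simple hop count. The sketch assumes a per-hop delay of 1 ms, inferred from the 100 ms figure quoted for a 100-node chain, and assumes a worst case in which a packet crosses the whole of the largest cluster, then the whole CH chain, then one hop to the BS; the function names are hypothetical.

```python
def chain_delay_ms(hops, t_queue=0.0, t_prop=0.0, t_tx=1.0):
    """Delay of a path: per-hop delay components times hop count."""
    return (t_queue + t_prop + t_tx) * hops


def clustered_max_hops(cluster_sizes):
    """Worst-case hop count when all clusters operate in parallel.

    (largest cluster size - 1) hops to reach its CH,
    (number of clusters - 1) hops along the CH chain,
    and 1 final hop from the upper level CH to the BS.
    """
    return (max(cluster_sizes) - 1) + (len(cluster_sizes) - 1) + 1


# Example: 100 nodes in one chain versus five clusters of 20 nodes.
single_chain = chain_delay_ms(100)
clustered = chain_delay_ms(clustered_max_hops([20] * 5))
```

Under these assumptions the single chain takes 100 ms while the five-cluster layout takes 24 ms in the worst case, which shows why delay scales with the largest cluster rather than with the whole network.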

Conclusion and future works
This paper has proposed a novel method of combining the clustering technique with PEGASIS and Ant Colony Optimisation for delay reduction and optimal path finding, where transmission distance is minimised. A multi-hop paradigm is applied for both intra-cluster and inter-cluster communications to ensure that minimum transmission energy is expended. An appropriate CH selection scheme has been implemented for proper load balancing. Finally, simulations have shown the superiority of the proposed methods in terms of alive nodes left, network residual energy, latency, throughput and load balancing.
A more effective mode for clustering could be achieved by using other custom optimisation algorithms like Artificial Bee Colony optimisation, and using specific parameters for determination of their objective functions. Robustness of method and fault tolerance could be investigated. During node failure, transmission is stopped and packets are lost. This could be prevented by devising an appropriate method for relaying information successfully during a fault.