A Fast and Reliable Hybrid Data Delivery Protocol for Large-Scale Heterogeneous Sensor Networks

We propose a hybrid data delivery method for the large-scale heterogeneous sensor networks, which is a fast and reliable delivery protocol for the aggregated data from the sinks to the GW. We develop a new multicriteria-ranking algorithm which determines multiple forwarders for each hop by ranking neighbor nodes. To rank the nodes, we compute the fitness value using features for each node such as the received signal strength, nodal delay, and hop distance. We determine the time of sending among forwarders using the waiting time assignment algorithm. In the experimental section, we show that our method outperforms conventional data delivery protocols in terms of data delivery ratio and end-to-end delay.


Introduction
Wireless sensor networks (WSNs) are multihop wireless networks which main purpose is to deliver sensed data collected from multiple sensors to one or more datacollecting devices (or sinks). Its inherent power limitation [1], wireless connectivity and vast number of applications have been attracting much attention from a good number of researchers in the recent years [2,3]. WSNs are used in acquiring information from inaccessible or perilous areas. Sensed data collected at a sensor are delivered to a remote monitoring center through an intermediate system.
Large-scale heterogeneous sensor network, one type of WSN, has a number of heterogeneous wireless nodes deployed over a wide geographical region. The WSN is commonly made up of two subnetworks: a sensor network and a distribution network. Sensors in the sensor network collect or sense data and deliver them to the nearest sink. In general, sensors are static and densely deployed so they can be easily connected to a sink. The distribution network on the other hand is responsible for delivering data gathered from sinks to the remote monitoring center through a gateway (GW). The GW in turn is connected to an external network like the Internet where the remote monitoring center gets the data. Issues on the deployment of nodes and data collection in the sensor network have already been well addressed in a number of literatures [4][5][6][7]. In general, data delivery from sensors to the nearest sink is assumed to be loss tolerant due to the sheer amount of correlated data [8]. However, the aggregated data in the distribution network are not. Furthermore, the significant distance between the monitoring center and the GW is also a challenge. But since they are connected through a high-speed wired network, we focus on the data delivery between the sinks and the GW. In addition, it is desirable for the protocol to be relatively fast because a remote monitoring center may need to take appropriate actions based on the information provided by the distribution network.
Hence in this paper, we develop a fast and reliable delivery protocol for the aggregated data from the sinks to the GW. We obtain good data delivery ratio and endto-end delay by solving the drawbacks of conventional protocols. We also present a new multicriteria ranking paradigm and present a waiting time assignment algorithm. In the experimental section, we show that our protocol outperforms conventional flooding and relaying techniques. This paper is organized as follows. Section 2 presents the drawbacks of related works. In Section 3, we describe 2 International Journal of Distributed Sensor Networks in detail our proposed protocol followed by the analysis of experimental results in Section 4. We conclude in Section 5.

Related Works
In general, wireless data delivery protocols can be divided into two major categories, (1) relaying and (2) flooding approaches [9]. Relay methods send messages once to a single forwarder. Usually, these methods make use of a multi-criteria fitness function to determine the link cost of its neighbors [10,11]. Such criteria are usually extracted through a network discovery process commonly done by sending control messages. The more the number of control messages sent, the better the decision process of choosing the next forwarder reflects the state of the network. However, too many of these messages might cause data delivery ratio to fall due to increase in the probability of collision. And since packets are sent only once to a single forwarder, the probability of packet lost is also high. Moreover, the delivery time may suffer from the overhead of the network discovery process.
Flooding techniques on the other hand, in their purest sense, broadcast messages epidemically by forwarding the data to all of its neighbors [12]. In flooding techniques, data is delivered relatively fast because there is minimal or no overhead [13]. However, flooding protocols are notorious for causing broadcast storms which lowers deliver ratio due to high chance of collision between messages. Usual attempts to solve the problem like RBVT [13,14], MHVB [12,15] and RSB [16] assign neighboring nodes a waiting time before sending. It has a value commonly dependent on the forward progress given by d SD − d NiD , where d SD is the distance between the sink S and the destination D and d NiD is the distance between the receiving neighbor N i and D [12][13][14][15][16]. The best neighbor, through computation of its own waiting time, should in principle have the least value. By this, the best node broadcasts first. The other neighbors cancel their timers and suppress rebroadcasting or reset their timer to a random value [16], after receiving the duplicate broadcast from the winning node. Such technique can be generally referred to as the election approach [14]. The approach has also an advantage of implicitly having multiple backup forwarders.
The following section explains how to solve the drawbacks of both relay and flooding approaches for the WSN's distribution network.

Multicriteria Sender-Side Election Protocol
To explain our proposed protocol named multi-criteria sender-side election (MCSS) protocol, we discuss how we model the local network using neighbor information then present how it is used to rank the neighboring nodes. We present how we assign justifiable waiting time to the best k ranking sinks. Lastly, we discuss how the idea fits into the whole delivery protocol. Table. Instead of building a path from the sink to the GW, our protocol selects its next forwarder on each hop in order to improve data delivery ratio and end-to-end delay. For each hop, the forwarding node ranks its one-hop neighbors with respect to how much they are able to receive the message. We call it the fitness value of a neighboring node. The fitness value reflects the communication environment of the node. To do this, the Neighbor Table (NT) contains three fields: received signal strength (RSS), nodal delay of messages received from the neighbor, and hop distance from the GW.

Neighbor
The RSS of a message measures the signal strength between the sink and the neighbor. The higher the RSS, the better the chance that broadcasts are successful. The nodal delay describes the amount of time it takes for a message to be delivered from the sink to the neighbor. It includes the processing delay, the queuing delay, the transmission delay, and the propagation delay. The less is the nodal delay of messages received, the less is the end-to-end delay. Since packet loss probability in a wireless multi-hop communication environment increases with the number of hops [17], we choose the hop distance between the neighbor and the GW as one of the criteria. To infer the hop distance of each node, we periodically flood a PROBE message in the distribution network.
The sinks update their own NTs for every message received. The three fields of the received messages are averaged using the Exponential Moving Average (EMA). We think of the NT as an ordered set of the ith neighbor vector v i = c i, j , i = 1, 2, . . . , n and j = 1, 2, 3 with c i,1 as the RSS, c i,2 as the nodal delay, and c i,3 as the hop distance of neighbor i from the GW. Each value is updated as follows: where c i, j (t) and c i, j (t − 1) are the new and old average values at time t, respectively. c i, j (t) is measured value at time t and the constant α is a smoothing factor.

Selection of Forwarders.
The conventional schemes selecting a single forwarder can increase the chance of data delivery failure because the forwarder can miss the message in dynamic communication environment. Thus, we choose more than one forwarder by ranking neighbor nodes. To rank the nodes, we compute the fitness value using the three fields in NT for each neighbor. Note that the number of forwarders depends on the degrees of connection. In our experiments, we set it the half of the number of sink's neighbors.
In general, the larger is the value of RSS and the smaller are the nodal delay and hop distance, the better is the performance. For the consideration, we inverse the nodal delay and hop distance using International Journal of Distributed Sensor Networks   where λ is a very small positive number used to avoid division by zero. The fitness value F i of node i is given as where Γ j (c i, j ) is the min-max normalization function of c i, j . The weight w i, j can be thought of as the importance of field j to the overall evaluation and it is calculated as The top k ranked neighbors are chosen as forwarders. To illustrate our method, we use the node setup in Figure 1 and NT in Tables 1-3. Here, the sink S sends a data to the GW and ranks its neighbors A and B. The final ranking makes B the overall best neighbor with a fitness value equal to 0.95. If k = 1, then only B is considered as a forwarder. The responsibility of computing the waiting time is moved from the receiving neighbors to the sink. After choosing the forwarders, the sink assigns each forwarder with a waiting time [18]. We assign the best ranked forwarder with zero waiting time. For the waiting time gaps between adjacently ranked forwarders, we use the nodal delays between each of them.  The waiting time assigned to a forwarder with rank r = 1, 2, . . . , k is given as where β is a small random number. The waiting time table as shown in Table 4 is appended to and sent together with the data.

The Delivery
Protocol. This section discusses the overall data delivery protocol from a sink to the GW. When a sink collects and aggregates sensed data from the sensors in its service area, it sends the aggregated data to the GW. We refer to the aggregated data as the message. We use the hybrid sender-side election approach discussed above. Additionally, each sink node keeps a record of the messages and sends and forwards in a message table (MT). An entry in the MT contains a message ID and the last known time-tolive (L TTL). An ACK field is also included to serve as a crossing-out mechanism for successfully acknowledged onehop broadcasts and is initially set to NO ACK. The sink node computes its waiting time and attaches it to the message. The message is broadcasted to its neighbors. The sink enters waiting mode for the message by starting a timer initially set to the mean of its forwarders' nodal delay. And the sink creates an entry in the MT with L TTL equals to the maximum TTL. If a duplicate message is received during the time and the TTL is less than L TTL, the sink stops its timer and exits waiting mode. The message is acknowledged by setting ACK to YES ACK and ignores future duplicates. In other words, the received message is already a rebroadcast from one of its forwarders and has already progressed by one hop. We refer to the condition as the suppression condition. Otherwise, if the timer expires, the message is rebroadcasted. Figure 2 illustrates the data delivery process.

Experimental Results and Analysis
In this section, we present our experiments done using the NS-2 simulator [19]. We used the shadowing propagation model [20] with path loss exponent β = 3.0 and shadowing deviation σ dB = 5.0 to model a noisy outdoor environment.

International Journal of Distributed Sensor Networks
Sink node S collects the sensed data from the sensors in its area.
S broadcasts to its neighbors, creates an entry on its MT, then enters waiting mode. However both B and C fail to receive the message.  Sinks are placed in a 2000 m by 2000 m area and a transmission range of each node is 250 m. A sink sends a message to the GW every second at constant bit rate (CBR) for 100 seconds. Additionally, we use IEEE 802.11 [21] as the MAC layer.
We first experimented on both conventional flooding and relay approaches to serve as basis for comparison with our protocol. For the relay approach, we investigated four combinations of criteria setups. In all setups, nodes send a control message every 100 milliseconds. For the flooding approach, we experimented on two weight combinations of the RBVT protocol [13]. This protocol uses forward progress, RSS and optimal transmission area given by where d opt = 100 m and d max = 250 m are the optimal and the estimated maximum transmission ranges, respectively, and d trans = 150 m is the translation distance. We evaluated the performance of the protocols through the following performance factors.
(1) Data delivery ratio-the value computed by dividing the number of data packet successfully received at the GW by the total number of data packets sent by sinks.
(2) End-to-end delay-the time needed to forward data from a sink to the GW.
In Figure 3, we can see that the relay approach using only RSS performs best in terms of delivery ratio compared to other conventional setups. Its performance falls with the increase of neighbor density. This is due to collisions from the intense number of control messages. On the other hand, though RBVT does not send control messages, its delivery ratio might have suffered from huge waiting time values being directly equal to the sum of the three criteria. In Figure 4, relay approach using only RSS is still the best performing protocol in terms of end-to-end delay. Though RBVT does not have a discovery process, the distance-dependent waiting time makes it perform worst. Additionally, it is also important to note that considering RSS International Journal of Distributed Sensor Networks more makes the overall performance of the relay approach better and somewhat less spiky. This gives us an idea on how important the assignment of weights is.
Using the best setups for both conventional approaches, we compared the performance of our MCSS protocol using two weight combinations. In Figure 5, we can see that our protocol achieves an increase of about 20% in data delivery ratio. It is caused by the reduction of control messages. Additionally, the assignment of zero waiting time value for the best forwarders avoids unnecessary pending for message. Moreover, the improved performance can also be attributed to having the k − 1 ranking forwarders as backup resenders.
In Figure 6, we can see that MCSS decreases the end-toend delay of relay approach and RBVT protocol by about 100% and 500%, respectively. It is due to the waiting time values being well gapped, that is, nodal delay dependent and sender-side assigned. Additionally, note that varying the weights in our protocol has minor effect on its overall performance. Hence MCSS is resilient to poor weight assignments with respect to the protocols experimented upon.
We therefore conclude from the above experimental results that our MCSS protocol improves the quality of data delivery between the sinks and the GW in terms of end-toend delivery ratio and delay.

Conclusion
We developed a fast and reliable data delivery protocol for large-scale heterogeneous sensor networks. The summary and contributions of this paper are as follows.
(1) We identified the data delivery problem in the distribution network of large-scale heterogeneous WSNs.
(2) We analyzed conventional wireless data delivery protocols and enumerated their weaknesses.
(3) We developed a selection algorithm of forwarders.
(4) We developed a sender-side waiting time assignment algorithm for forwarders.
(5) We developed a general hybrid multi-criteria senderside election protocol which utilizes the two developed ideas.
(6) We applied our approach to the data delivery process between the sinks and the GW in the distribution network.
(7) We showed that our protocol outperforms conventional data delivery protocols in terms of data delivery ratio and end-to-end delay.
In our approach, we used three criteria to model WSN. As a future work, we will do research on more possible criteria and on ways to assign weights dynamically to correspond to a better performance.