Dubhe: A Reliable and Low-Latency Data Dissemination Mechanism for VANETs

Many attractive applications over vehicular ad hoc networks (VANETs) need data to be transmitted to the remote destinations through multihop data forwarding, but some unique characteristics of VANETs (i.e., high node mobility, dynamic topology changes with frequent link breakage, and unstable quality of wireless transmission) incur unstable data delivery performance. In order to reliably and quickly disseminate the data, we present Dubhe which includes a delay model and an improved greedy broadcast algorithm embedded with a coverage elimination rule. The former is used for making decisions for path selection with the aim of minimizing the transmission latency, while the latter focuses on boosting the reliability of one-hop data transmission. We also analyze the necessity and effectiveness of Dubhe and the retransmission overheads theoretically. It is shown from the experiments that Dubhe can achieve high-reliability and low-latency data delivery in comparison with the epidemic-based protocol and the static-node-assisted adaptive data delivery protocol.


Introduction
As a mobile and self-organizing network, a VANET is built over two types of nodes on an as-needed basis, that is, the mobile nodes equipped with on-board units (OBUs) and the stationary wireless stations deployed along the roads, known as roadside units (RSUs). Both OBUs and RSUs have wireless interfaces to connect to each other or to other equipment such as traffic lights. The communication over a VANET can be done by means of vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communications. Based on these communications and with the help of enabling technologies (e.g., vehicular positioning, digital maps, and sensors), many promising applications over VANETs can be envisioned, such as active safety applications, intelligent transportation applications, and various convenience applications for drivers and passengers. More specifically, VANETs can also serve as underlying networks that provide access to the Internet, as well as data transmission in case of emergency (earthquakes, traffic jams, etc.) [1].
In this paper, we focus on reliable and low-latency long-distance data transmission in urban scenarios, which is indispensable to many applications. For example, a moving vehicle may initiate an inquiry for available parking lots of a business center several miles away to decide the next travel destination; the hazardous traffic situations collected cooperatively by vehicles need to be disseminated in a timely manner to the infostations deployed at adjacent blocks so as to prompt drivers to reroute. Although RSUs connected to the backbone networks can provide a stable link for the data transmission, their high cost restricts the deployment of RSUs on a large scale, especially in the initial phase of deploying a VANET. A common idea is to place RSUs at intersections first in urban areas, where the RSUs can take on multiple roles: storing the received data temporarily, making the decisions on the data routing, and forwarding the data to the passing vehicles if necessary. Meanwhile, vehicles have to cooperate with each other to deliver the data. That is, a vehicle will forward the data whenever it has a chance to establish a wireless link with other vehicles; otherwise, it will carry the data by itself. The wireless connections among vehicles are extremely uncertain owing to the rapid movements of vehicles. When vehicles are sparse, the VANET degenerates into a disruption-prone network [2]. Conversely, a dense vehicle flow may lead to severe channel contention due to the broadcast nature of VANETs. Therefore, the key issue of the data delivery over a VANET is how a vehicle node on a road segment determines its action (carrying, forwarding, or dropping the received packets) and how a node (a vehicle or an RSU) at an intersection chooses the optimal paths for the packets to be delivered.
As far as the selection of relay nodes is concerned, some algorithms including the broadcast suppression techniques [3] have been developed to assist in the selection of rebroadcasting nodes, but they lack a theoretical analysis of the reliability of data delivery. As for the routing path selection, a variety of data delivery protocols have been proposed. Typically, when a data packet is transmitted to a node at an intersection, the optimal path aiming at minimizing the transmission delay is expected. Some protocols require nodes to maintain the routing information [4,5]; thus, a considerable number of packets have to be exchanged to keep routing tables up to date. Other protocols (e.g., VADD [6]) exploit the historical traffic data to decide the path for packet forwarding. Obviously, the historical data are almost certainly inconsistent with the constantly changing traffic situations. So far, the lack of an accurate model reflecting the time-varying connectivity of a VANET has become the biggest obstacle to improving the performance of data delivery, where the connectivity refers to the probability that a node (i.e., a vehicle or an RSU) can communicate with any other nodes through wireless communication.
Motivated by the limitations of the existing protocols and noting that the vehicles in a typical city scenario are restricted in grid roads, we present Dubhe, a reliable and low-latency data delivery mechanism for transmitting data to stationary roadside infrastructures several miles away from source vehicles by unicast communication. Dubhe comprises a delay model, a path choosing algorithm, and an improved greedy broadcast algorithm. The delay model is responsible for estimating the transmitting delay between the adjacent intersections with RSUs deployed, the path choosing algorithm is used for determining the optimal path for the packet arriving at RSUs, and the broadcast algorithm concentrates on improving the reliability of one-hop data transmission. The contributions of this paper include the following aspects.
(1) We propose a delay model which takes account of the road topology and the interference of the traffic signals in an urban scenario. We also give our path choosing algorithm on the basis of the delay model.
(2) We present an improved greedy broadcast algorithm merged with a coverage elimination rule. The multicandidate strategy in the algorithm can enhance the reliability of one-hop data transmission.
(3) We analyze the necessity and feasibility of the multicandidate strategy for 1-hop data transmission and the corresponding retransmission ratio theoretically.
(4) We evaluate the performance of Dubhe by conducting extensive experiments to examine the performance and cost metrics (i.e., delivery ratio, transmission delay, and message overhead) over three independent factors (i.e., vehicle density, traffic signal cycle, and connectivity estimation period).
The rest of this paper is organized as follows. In Section 2, we introduce the existing research related to data delivery mechanisms over VANETs. In Section 3, we describe our delay model and path choosing algorithm. Then, in Section 4, we present our greedy broadcast algorithm and analyze the reliability and the retransmission ratio. In Section 5, we conduct the performance evaluation. Finally, we summarize our work in Section 6.

Related Work
Rapid changes of network topology, lack of central coordination, and limited bandwidth are stumbling blocks to efficient data delivery over VANETs [7]. Besides, radio channels are error-prone. Wireless signals interact with vehicles in unpredictable ways, and the fading phenomenon (multiple copies of the same wireless signal traveling along different paths and interfering with each other) deteriorates the channel quality even when no neighbor is trying to transmit simultaneously. All of these lead to apparently nondeterministic packet reception.
As is known, the best-effort manner is commonly used for data delivery over a MANET. For example, following the epidemic based data delivery (EBDD) protocol [8], a node exchanges digests of all data packets with other nodes as long as they can communicate with each other. Then they transmit the data packets that do not appear in the packet lists of counterparts in turn. However, the performance degrades quickly while facing a high vehicle density owing to the excessive repeated packet exchanging.
Due to the broadcast nature of wireless signals, a wireless transmission targeted at a particular destination vehicle can be overheard by all neighboring vehicles, so choosing an appropriate relay vehicle from multiple candidates is a tough task. In [9], a decentralized broadcast algorithm is presented to achieve a good broadcast coverage in a VANET. Besides the broadcast suppression techniques introduced in [3], a connected dominating set (CDS) built on the basis of local beacon messages is proposed in [10] with the aim of reducing the overhead of broadcast. Thus, only the nodes within a CDS have a chance to transmit the received packets. Although broadcast messages are partially reduced, the CDS suffers from the overhead of constant CDS reconstruction owing to the fast movements of nodes. Obviously, their goal is different from ours.
Further, in order to find an efficient path for forwarding the packet, some studies propose to propagate the vehicle density of roads among vehicles to acquire the route decision-making information; for example, periodical probing packets are used in [4] to collect the vehicle density proactively and are then exchanged among neighbors. Obviously, packets for probing and propagating route information induce massive overheads. By contrast, [5,11] employ a reactive route discovery approach with the help of the relay nodes. However, the routing tables become out of date quickly because of the mobility of the relay nodes. Since the historical data of traffic flow are an important basis for deciding the optimal path, vehicle-assisted data delivery (VADD) [6] exploits the historical records of the vehicle density marked on a digital map so that a node can decide the minimum-delay paths for the arriving packets. But the time-varying vehicle density may deviate far from the one marked on the map. To remedy this weakness, [12] presents a Static-node-Assisted adaptive Data delivery protocol for Vehicular networks (SADV), which suggests placing static nodes at intersections to assist packet forwarding. Each static node has the capability of storing the received packets until they can be forwarded to a vehicle travelling toward the optimal forwarding road. Meanwhile, the transmission delay is piggybacked in data packets to help forwarding decisions. Nevertheless, SADV is designed without any consideration for the impact of the spatial distribution of vehicles on the connectivity of VANETs. Similarly, [13] suggests buffering data in the parked vehicles along the roads if no appropriate route for data forwarding is available in urban areas. But the efficiency of data delivery is unstable, depending on the number of vehicles which are willing to relay messages as volunteers.
International Journal of Distributed Sensor Networks
To enhance the adaptation and scalability of routing protocols, [14] proposes to select the intermediate nodes according to the group mobility patterns, importance of the messages, destination, current location, and speed of the vehicles. Reference [15] periodically evaluates the SIR (signal-to-interference ratio) values on the available transmission channels and chooses the channel with the best perceived SIR to deliver the data. Reference [16] presents a hierarchical routing scheme for data forwarding decisions, which makes use of peer servers to discover the trajectory from a source to a destination. However, the overhead of electing peer servers and maintaining the information on the peer servers is not negligible. To enhance the reliability of data transmission over VANETs, [17] proposes to utilize black-burst signals to prevent the receivers from concurrent rebroadcasting. At the same time, [17] employs acknowledgement packets to improve the reliability and a mechanism similar to the RTS/CTS handshake to decrease the effects of the hidden nodes. But the overheads incurred by the black-burst and the acknowledgement messages restrict its scalability.
The network connectivity is crucial to efficiently deliver data over VANETs. Usually, one-hop connectivity between node $i$ and node $j$ can be specified as $P(d(i, j) < R)$, where $d(i, j)$ is the Euclidean distance between the two nodes and $R$ denotes the radius of wireless transmission (Table 2). Reference [18] examines the connectivity of VANETs in an urban scenario by experiments, and the results show that the probability of an uninterrupted multi-hop link decreases exponentially with the increasing length of the road. More concretely, when the vehicle density is 4 vehicles/250 m, the probability of forming a 1000 m wireless link is just 68%, and it drops to 42% as the road length is increased to 2000 m. When the vehicle density reduces to 2 vehicles/250 m, the corresponding probabilities decline to 21% and 7%, respectively. Hence, some studies believe that it is infeasible to achieve reliable and efficient data delivery without the support of RSUs [18]. Further, the experimental results in [19] also indicate that once the vehicle density is below a specified threshold (e.g., 60 vehicles/km$^2$), the quality of service of data transmission cannot be satisfied if it relies solely on V2V communications.
Although the connectivity models are of great importance to measure the data transmission delay in VANETs, most existing studies only focus on highway scenarios [20]. Reference [21] employs discrete-time and discrete-space Markov chains to estimate the changes of vehicle velocities so as to infer whether a network partition would occur. Unfortunately, the computational complexity of the estimation methods prevents them from being put into practice for routing decisions. Considering the significant impact of the traffic signals on the speeds and spacing distribution of the vehicles, [22] derives the connectivity model of a single road segment in urban environments. However, [22] needs traffic flow data of all entrances of the road segments to calculate the connectivity, so it is not suitable to be used directly for routing decisions.
In summary, the frequent disruptions of traffic flow in urban roads seriously affect the connectivity of VANETs. But the existing data delivery protocols still fail to adapt to the time-varying vehicle density resulting from the stochastic vehicle mobility and periodic interference of the traffic signals [23].

Connectivity-Aware Data Delivery
We propose deploying RSUs at major intersections to collect data for calculating the vehicle arrival ratios and then exchanging them with the neighboring RSUs periodically through connectivity messages. Next, the RSUs can employ the arrival ratios to estimate the real-time connectivity and exchange the transmitting delay with the adjacent RSUs by connectivity messages. Then, the RSUs can choose a forwarding path for the arriving packets in terms of the minimum delay.
In addition, each vehicle and each RSU is equipped with a GPS device and a digital map on which the positions of the RSUs are marked.

Estimating Arrival Ratios of Vehicles.
The vehicles are required to broadcast beacon messages periodically, so an RSU can be aware of the approaching vehicles and then estimate the arrival ratios of the different directions of an intersection for every time interval $T$. The arrival ratios calculated by an RSU $I_i$ (denoted as $\lambda_i$; see Table 1) are sent to the neighboring $n$-hop RSUs (here, a 1-hop RSU is defined as an RSU immediately adjacent to the RSU $I_i$) with the assistance of the passing vehicles. If a neighboring RSU (denoted as $I_j$) does not receive the reports from $I_i$ within a given interval, then $I_j$ forecasts them based on the historical arrival ratios of $I_i$ recorded in the local storage by applying the regression technique (see the following paragraph). If the delayed arrival ratios of $I_i$ reach $I_j$ later, then $I_j$ replaces the estimated ones with the incoming arrival ratios.

Table 1: Notations.
$D_{ij}(t)$: the transmission delay of road $(i, j)$ at time $t$.
$E(N_{ij}(t))$: the expected number of vehicles of road $(i, j)$ at time $t$.
$N_s(t)$: the vehicles that go straight through the intersection into the road $(i, j)$.
$N_p(t)$: the vehicles that enter road $(i, j)$ from the perpendicular direction during the green phase.
$A_0$: the region without vehicle, marked with red lines.
$d$: the distance from $v_0$ to the horizontal transmission boundary of $v_s$.
$\theta$: the horizontal angle.
In order to predict the missing arrival ratios of vehicles at neighboring intersections that should have been reported, the $k$-nearest neighbor ($k$NN) algorithm [24] is used to capture the traffic patterns from a time series of data, given that the traffic flow has periodical patterns [25]. Thus, a historical arrival ratio record becomes a state vector in $k$NN, which contains the arrival ratios together with several attributes describing them, that is, the time and the date attributes (e.g., the hour of the day and the day of the week), and it is stored locally in the RSUs.
The estimation process is activated when a missing current arrival ratio is detected. To simplify the calculation, the historical records are stored in a KD-tree and the records retrieved from the local storage are restricted to the latest four items, represented by $h_0$, $h_1$, $h_2$, and $h_3$. Then, the missing arrival ratio of vehicles at a neighboring intersection can be estimated as $\hat{\lambda} = \sum_{i=0}^{k-1} w_i h_i$, where $k = 4$ and $w_i = d_i^{-1} / \sum_{j=0}^{k-1} d_j^{-1}$. Here $d_i$ denotes the dissimilarity distance between the subvector from the $i$th historical record and the current state and is measured by the Euclidean distance; that is, $d_i = \sqrt{\sum_m (x_{i,m} - y_m)^2}$, where $x_{i,m}$ is the $m$th element of the historical record under consideration and $y_m$ is the $m$th element of the current state.
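The inverse-distance weighting described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `estimate_arrival_ratio` and the record layout (a numeric state vector paired with an arrival ratio) are assumptions, a linear scan stands in for the KD-tree, and the sketch picks the closest records rather than the latest four.

```python
import math

def estimate_arrival_ratio(history, current_state, k=4):
    """Estimate a missing arrival ratio by inverse-distance weighting.

    history: list of (state_vector, arrival_ratio) pairs (assumed layout).
    current_state: numeric vector with the same dimension as the stored ones.
    """
    if not history:
        raise ValueError("no historical records available")
    # Euclidean dissimilarity between each stored state and the current state;
    # keep the k closest records (linear scan instead of a KD-tree).
    scored = sorted(
        (math.dist(state, current_state), ratio) for state, ratio in history
    )[:k]
    # An exact match would give an infinite weight, so return it directly.
    for dist, ratio in scored:
        if dist == 0.0:
            return ratio
    inverse = [1.0 / dist for dist, _ in scored]
    total = sum(inverse)
    # Weighted average with weights proportional to 1 / distance.
    return sum(w / total * ratio for w, (_, ratio) in zip(inverse, scored))
```

Records that resemble the current state (small distance) dominate the estimate, which is the intent of the weights in the text.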

Delay Model.
All RSUs are assumed to be standalone; they determine the forwarding paths for the arriving packets with the goal of minimizing the transmission delay. Note that the transmission delay depends heavily on the wireless connectivity of the VANETs on the roads, so the delay model is derived through the real-time estimation of the expected number of vehicles and the vehicle distribution between adjacent RSUs.
An RSU works as follows to decide the optimal paths for the arriving packets.
(1) The RSU builds a Markov model for the vehicle queue length at intersections. At time $t_0$, the RSU estimates the queue length of the neighboring intersections at time $t_0 + T$ in light of the current arrival ratios of the neighbor RSUs.
(2) The RSU estimates the transmission delays in terms of the real-time varying vehicle number and distribution between the adjacent RSUs.
(3) The RSU calculates a minimum-delay forwarding path to a virtual destination $V$ for a packet with an improved Dijkstra algorithm, where $V$ refers to the RSU that is nearest to the original destination of the packet and can exchange arrival ratios with the intended forwarding RSU $I_i$. All the destinations refer to $V$ in the rest of the paper.
In the following, we elaborate the vehicle queue length prediction, delay estimation, and path choosing, respectively.

Vehicle Queue Length Prediction.
The arrival of vehicles at an intersection can be viewed as a stochastic process following the Poisson distribution [26,27]. As a result, the queue length of vehicles at an intersection at time $t$ can be modelled with a random variable $Q(t)$:
$$Q(t) = \max\{Q(t - \tau) + A_\tau - D_\tau,\ 0\}, \tag{1}$$
where $\tau$ is a small time interval by which a green phase of the traffic signal is divided into equal episodes and $A_\tau$ is a random variable standing for the number of vehicles arriving during $[t - \tau, t]$. Meanwhile, $D_\tau$ represents the number of vehicles departing from the intersection in one direction; we assume that the departure ratio is a constant during the green phase. Equation (1) points out that the queue length at time $t$ depends only on the queue length at time $t - \tau$ and on the vehicles arriving and departing during $\tau$; therefore, it is a renewal process. So the RSU can predict the queue length at $(k + 1)T$ with the arrival ratios at time $kT$ ($k$ and $T$ are positive integers, $0 < \tau < T$) in an iterative manner. Note that $T$ is also the period of connectivity messages. The prediction is based on the transition probability $p_{ij}$, which is the probability that the queue length changes from $i$ to $j$ in a small interval $\tau$ ($T = m \times \tau$, $m$ is a positive integer):
$$p_{ij} = P(A_\tau = j - i + D_\tau), \tag{2}$$
where $i, j > 0$. Consequently, the probability that the queue length changes from $i$ to 0 can be calculated by the following equation:
$$p_{i0} = \sum_{a=0}^{D_\tau - i} P(A_\tau = a). \tag{3}$$
At any interval $\tau$, the probability of finding $a$ vehicles arriving is written as
$$P(A_\tau = a) = \frac{(\lambda \tau)^a}{a!} e^{-\lambda \tau}, \tag{4}$$
where $\lambda$ is the arrival ratio during $\tau$. Note that the probability distribution of a queue length at time $t$ during a green phase can be denoted as $\pi(0) = [\pi_0(0), \pi_1(0), \ldots, \pi_n(0)]$ (where $\pi_i(0)$ denotes the probability that the queue length equals $i$ and $\pi(0)$ is randomly initialized) [26]; then the probability distribution at $t + \tau$ can be estimated by $\pi(1) = \pi(0) P(\tau)$. Then after $m$ iterations, the probability distribution of the queue length is
$$\pi(m) = \pi(0) P(\tau)^m. \tag{5}$$
With the queue length at time $kT$, the expected queue length at time $(k + 1)T$ is calculated with
$$E(Q((k + 1)T)) = \sum_{i=0}^{n} i\, \pi_i(m), \tag{6}$$
where $n$ is the dimension of the transition probability matrix.
During the red phase of the traffic signal, (6) is still valid as long as the parameter $D_\tau$ in (2) is set to 0.
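The iterative queue prediction can be sketched in Python as follows. This is a hedged illustration under stated assumptions rather than the paper's code: arrivals per small interval are Poisson with a given mean, a fixed integer number of vehicles departs per interval (0 during the red phase), and the queue is truncated at a hypothetical `max_queue`, with all overflow lumped into the last state so that each row sums to 1. The function names are invented for the example.

```python
import math

def poisson_pmf(count, mean):
    """Probability that exactly `count` vehicles arrive in one small interval."""
    return mean ** count * math.exp(-mean) / math.factorial(count)

def transition_matrix(mean_arrivals, departures, max_queue):
    """Queue-length transition matrix for one small interval (max_queue >= 1).

    departures: vehicles leaving per interval (use 0 for the red phase).
    """
    size = max_queue + 1
    matrix = [[0.0] * size for _ in range(size)]
    for i in range(size):
        # Queue empties when arrivals do not exceed departures - i.
        matrix[i][0] = sum(
            poisson_pmf(a, mean_arrivals) for a in range(0, departures - i + 1)
        )
        # Moving from i to j > 0 needs exactly j - i + departures arrivals.
        for j in range(1, max_queue):
            arrivals = j - i + departures
            if arrivals >= 0:
                matrix[i][j] = poisson_pmf(arrivals, mean_arrivals)
        # Truncation assumption: remaining probability goes to the last state.
        matrix[i][max_queue] = max(0.0, 1.0 - sum(matrix[i][:max_queue]))
    return matrix

def expected_queue_length(initial, matrix, steps):
    """Propagate the distribution `steps` intervals and return its mean."""
    dist = list(initial)
    n = len(dist)
    for _ in range(steps):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return sum(length * prob for length, prob in enumerate(dist))
```

For instance, starting from a queue of exactly five vehicles with a mean of 0.5 arrivals and two departures per interval, one step yields an expected length close to 3.5, in line with the balance of arrivals and departures.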

Delay Estimation.
With the prediction of the changes of the queue length, we then estimate the transmission delay of a specific road segment. To simplify the derivation, we suppose that all roads have two lanes and that the traffic signals are synchronized. The beginning time of the green phase of the traffic signal is denoted as time 0. So the shortest time to form the wireless links from RSU $I_i$ to $I_j$ is $t_c = (L_{ij} - R)/(2 \times v)$ after the green phase has started, with $L_{ij}$ the length of road $(i, j)$ and $v$ the vehicle speed; it is reasonable to suppose that the vehicles on the road follow the Poisson distribution [26] after $t_c$, so that their spacing is exponentially distributed. Then, the one-hop connectivity probability between vehicles at time $t$ can be specified as
$$P(x \le R) = 1 - e^{-\rho(t) R}, \tag{7}$$
where $x$ represents the spacing between vehicles and $\rho(t)$ is the vehicle density. Notice that $t \ge t_c$ and $t = k\tau$; that is, the estimation of the transmission delay is only executed at an integral multiple of the time interval $\tau$.
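As a small numeric illustration of the two quantities in this paragraph, the sketch below assumes that vehicles enter the road from both ends at the same speed (so the gap closes at twice that speed) and that vehicle spacing is exponential with rate equal to the density. The function and parameter names are hypothetical.

```python
import math

def link_formation_time(road_length, radio_range, speed):
    """Shortest time after the green phase starts until both ends can connect."""
    return max(0.0, (road_length - radio_range) / (2.0 * speed))

def one_hop_connectivity(density, radio_range):
    """Probability that the spacing to the next vehicle is within radio range,
    assuming exponentially distributed spacing with the given density."""
    return 1.0 - math.exp(-density * radio_range)
```

With a 1000 m road, a 250 m radio range, and a speed of 12.5 m/s, `link_formation_time` gives 30 s; a density of 4 vehicles per 250 m gives a one-hop connectivity of about 0.98.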
Since $E(N_s(t))$ during the interval $\tau$ can be estimated by
$$E(N_s(t)) = \beta\, E(A_i(t)), \tag{8}$$
where $E(A_i(t))$ denotes the number of vehicles arriving at intersection $i$ during $\tau$ and $\beta$ is the ratio of the vehicles going straight through intersection $i$ into the road $(i, j)$, the number of vehicles on the road $(i, j)$ at time $t$ can be given by
$$E(N_{ij}(t)) = E(N_{ij}(t - \tau)) + E(N_s(t)) - E(A_j(t)). \tag{9}$$
Equation (9) implies that the number of vehicles of road $(i, j)$ at time $t$ depends only on the number of vehicles at time $t - \tau$, the number of vehicles going straight through the intersection $i$ toward intersection $j$, and the number of vehicles arriving at intersection $j$. Note that $E(A_j(t))$ needs to be calculated only after $t > L_{ij}/v$, in that no vehicle in $N_s(t)$ could arrive at intersection $j$ before $L_{ij}/v$. Thus, the vehicle density during the green phase is given by (10). During the red phase of the traffic signal, the wireless links may still be connected before $t_c$ has elapsed. Considering that some vehicles (denoted by $N_p(t)$) on the road entered from the perpendicular direction during the green phase, $E(N_p(t))$ is proportional to the number of vehicles that passed in the perpendicular direction of the intersection (the ratio is $1 - \beta$). So the number of vehicles on the road can be estimated as
$$E(N_{ij}(t)) = E(N_{ij}(0)) - E(A_j(t)) + E(N_p(t)). \tag{11}$$
From (11) we can observe that the expected number of vehicles on the road at time $t$ equals the number of vehicles at the beginning of the red phase, minus the number of vehicles arriving at intersection $j$ after the beginning of the red phase, plus the vehicles entering the road from the perpendicular direction. In particular, a wireless link formed by $N_s(t)$ is possible only after $t_c$ has passed in the green phase; meanwhile, $N_p(t)$ contributes to the connectivity of the road during a red phase. Therefore, the vehicles on the road can be regarded as $N_s(t)$ superimposed with $N_p(t)$.
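The bookkeeping described in this paragraph can be written as a short Python sketch. It only restates the verbal description (vehicles going straight are a fixed fraction of arrivals at the upstream intersection, vehicles reaching the downstream intersection leave the road, and turning vehicles are the complementary fraction of perpendicular traffic); the function names and the clamping at zero are assumptions of the sketch.

```python
def vehicles_on_road_green(previous_count, arrivals_upstream, straight_ratio,
                           arrivals_downstream):
    """Expected vehicles on the road after one interval of the green phase."""
    entering = straight_ratio * arrivals_upstream
    return max(0.0, previous_count + entering - arrivals_downstream)

def turning_vehicles(perpendicular_passed, straight_ratio):
    """Vehicles that turn into the road from the perpendicular direction."""
    return (1.0 - straight_ratio) * perpendicular_passed

def vehicles_on_road_red(count_at_red_start, arrivals_downstream, turning_in):
    """Expected vehicles on the road during the red phase."""
    return max(0.0, count_at_red_start - arrivals_downstream + turning_in)
```

For example, 10 vehicles on the road, 6 upstream arrivals with a straight ratio of 0.5, and 2 downstream arrivals leave 11 vehicles after one green interval.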
Thus, the transmission delay $D_{ij}(t)$ of road $(i, j)$ at time $t$ of a green phase is given by (12). Equation (12) indicates that, before a wireless link is formed, the packets are carried or transmitted by the vehicles $N_s(t)$ that entered during the green phase together with the vehicles $N_p(t)$ that turned into the road during the last red phase. $D^{(2)}(t)$ is the delay resulting from $N_p(t)$; the vehicle density of $N_p(t)$ is estimated from the numbers of vehicles passing in the perpendicular direction during the green phase of intersections $i$ and $j$, that is, $\rho^{(2)}(t) = (E(N_p^{i}(t)) + E(N_p^{j}(t)))/(L_{ij} - 2vt)$, so $D^{(2)}(t)$ is given by (13), where $e^{-\rho^{(2)}(t) R}$ is the probability that the spacing of vehicles is larger than $R$. If so, the packets have to be carried by the vehicles. As suggested in (13), the transmission delay relies heavily on the vehicle density; with a high vehicle density, only a small portion of the data delivery is completed by vehicle movement. $D^{(1)}(t)$ is the delay caused by $N_s(t)$ and is given by (14). Similar to (10), the vehicle density of $N_s(t)$ is $\rho^{(1)}(t) = (E(N_s^{i}(t)) + E(N_s^{j}(t)))/(2vt)$. When the spacing of vehicles satisfies the exponential distribution after $t_c$ has elapsed, the data delivery is performed mainly by $N_s(t)$ and the transmission delay is given by (15). Equation (15) indicates that the total delay is composed of the delay caused by vehicles carrying the packets, with a proportion of $e^{-\rho(t) R}$, and the delay incurred by wireless transmission when the intervehicle distances are smaller than $R$. It must be emphasized that the calculation of the expected transmission delay between adjacent RSUs takes into account both the impact of the traffic signal cycle and the conditions under which the exponential spacing distribution holds.
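The mix of carrying and wireless forwarding described at the end of this paragraph can be illustrated with a simplified sketch. It is not a reconstruction of (12)-(15): it assumes a single uniform density over the whole segment, a carried fraction equal to the probability that the spacing exceeds the radio range, and a hypothetical constant per-hop forwarding time `hop_time`.

```python
import math

def expected_segment_delay(length, density, radio_range, speed, hop_time):
    """Simplified expected delay over one road segment.

    A fraction exp(-density * radio_range) of the segment is covered by
    carrying the packet at vehicle speed; the rest is covered by hop-by-hop
    wireless forwarding with `hop_time` seconds per hop (assumed constant).
    """
    carried_fraction = math.exp(-density * radio_range)
    carry_delay = length / speed
    forward_delay = (length / radio_range) * hop_time
    return carried_fraction * carry_delay + (1.0 - carried_fraction) * forward_delay
```

With no vehicles on the road the delay equals the driving time of the segment, and as the density grows it approaches the pure forwarding time, which matches the qualitative behaviour stated in the text.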

Path Choosing.
When the RSU at an intersection $i$ receives a packet with $V$ as its destination, it calculates the minimum delay from intersection $i$ to $V$ through an improved Dijkstra algorithm. Considering that the traffic flow is uneven, we need to scale up the weight of the delay $D_{k,k+1}$ on the roads nearer to the intersection $i$ so as to avoid being deceived by the short latencies of the roads far away from the intersection $i$. Therefore, on the basis of the original Dijkstra algorithm, we introduce (16) to calculate the minimum forwarding delay $D_{i,V}$, in which $w_{k,k+1} \in (0, 1)$ is the weight of the delay $D_{k,k+1}$ and $h$ is the searching depth from $i$ to $V$ in the algorithm. The parameter $w_{k,k+1}$ depends on the average vehicle density on the road $(k, k + 1)$. The higher the vehicle density is, the greater the value assigned to $w_{k,k+1}$. On that account, $w_{k,k+1}$ is decided by (17), in which the offset $c$ represents the least number of vehicles in the transmission range needed to form a successive wireless connection [18]; $c$ is set to 6 in the experiments.
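A weighted shortest-path search of the kind described here can be sketched in Python with the standard `heapq` module. The sketch is a plain Dijkstra search in which each road carries an estimated delay and a multiplier in (0, 1); how the multiplier is derived from the vehicle density (the paper's (16) and (17)) is not reproduced, so the caller supplies it. The graph layout and the function name `min_delay_path` are assumptions.

```python
import heapq

def min_delay_path(graph, source, target):
    """Return (path, cost) minimizing the sum of weight * delay.

    graph: {node: {neighbor: (delay, weight)}} with positive delays and weights.
    Returns (None, inf) if the target cannot be reached.
    """
    best = {source: 0.0}
    previous = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, (delay, weight) in graph.get(node, {}).items():
            new_cost = cost + weight * delay
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    if target not in best:
        return None, float("inf")
    # Walk back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1], best[target]
```

Because all weighted delays are positive, the usual correctness argument for Dijkstra's algorithm still applies to the weighted costs.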

Reliable Data Delivery
The propagation of wireless signals in urban environments is interfered with by the buildings, the facilities along the roads, and the rapid vehicle movements in different directions, which may give rise to the loss of transmitted data. Therefore, improving reliability is an imperative requirement for the data delivery. Due to the broadcast nature of the wireless medium in VANETs, a single packet often evokes multiple receptions. Thus, it is crucial to determine, in a distributed way, the appropriate neighboring vehicles to rebroadcast the data while guaranteeing the reliability.

Improved Greedy Broadcast Algorithm.
Dubhe provides an improved greedy broadcast algorithm to choose the proper relay vehicles. In detail, the sending vehicle broadcasts a packet which contains the payload to be sent, the packet ID, the list of 1-hop neighboring IDs acquired by beacon messages, the location of the last RSU passed by, and the distance from the sending vehicle to the passed RSU. If the vehicle finds that there is no neighbor in the direction of the next RSU within the transmission range, it carries the packet. On receiving a packet, a vehicle applies the coverage elimination rule; that is, it checks whether all its neighbors are covered by the neighbor list of the packet. If they are covered, the recipient discards the packet; otherwise, it becomes a forwarding candidate. Such a coverage elimination rule can effectively reduce the number of candidates. Being a candidate implies that the vehicle needs to start a timer according to the distance from the sending node in preparation for the packet rebroadcasting. The longer the distance is, the shorter the timer is set. In this way, the vehicle furthest away from the sender has the highest priority to rebroadcast the packet. If the furthest one fails, the secondary vehicles can rebroadcast the packet after a little while. Therefore, a multicandidate situation is formed.
In Dubhe, the packet may be rebroadcast by the vehicles moving in the opposite direction, which can improve the connectivity and increase the efficiency of packet forwarding. But in order to prevent the packet from being propagated in the inverse direction, the vehicle updates the distance from the last passed RSU in the packet before it rebroadcasts the packet. The improved greedy broadcast algorithm deployed on vehicles is shown in Algorithm 1.
Algorithm 1: An improved greedy broadcast algorithm with coverage elimination.
// Definition: v denotes a vehicle in a VANET, N(v) denotes a neighbor list of v;
// I denotes the last intersection passed by; dist(v, I) denotes the length from v to I;
// R denotes the wireless transmission radius;
// p denotes the data packet which needs to be delivered, and consists of ID, data, sender node v_s, a neighbor list N(v_s) of v_s, the coordinates of v_s, and the distance between v_s and I
On receiving a data packet p at node v
    Extract dist(v_s, I), coordinates of v_s and N(v_s) from p;
    If p has been received before or v ∉ N(v_s) then
        Drop data packet p;
        If exists any defer timer on p then
            Cancel all the timers;
        return;
    If dist(v, I) ≤ dist(v_s, I) then
        Drop data packet p;
    If N(v) ⊆ N(v_s) then
        Drop data packet p;
    Set defer timer 1 time to T_max;
On expiration of deferred timer 1
    Broadcast data packet p;
    Set t equals to (R − (dist(v, v_s) mod R)) * T_max / R;
    Set defer timer 2 with t;
On expiration of deferred timer 2
    Update p: replace N(v_s) with N(v), dist(v_s, I) with dist(v, I)
    Broadcast data packet p

It is clear that the maximum waiting time $T_{max}$ is a critical parameter of Algorithm 1, whose value depends closely on the one-hop wireless transmitting time, the vehicle density, the wireless communication range, and the queue length of the packets in the MAC layer. A similar deferring policy is utilized in [3], which advises setting $T_{max}$ to no less than 5 ms when the vehicle density is 100 vehicles/km/lane. Taking all factors into consideration, $T_{max}$ is set to 20 ms in our experiments. Additionally, in order to keep the neighbor list in the packet from growing when the vehicle density is high, a probabilistic data structure (e.g., a Bloom filter) can be used. Meanwhile, the interval of beacon messages can be dynamically adjusted according to the vehicle density to reduce the communication overheads [11].
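The candidate test and the distance-based deferral of Algorithm 1 can be sketched in Python as follows. This is an illustrative reading, not the authors' code: the `Packet` fields and function names are hypothetical, the sender itself is ignored when checking whether the receiver's neighbors are covered (an assumption of the sketch), and timers and radio transmission are left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    packet_id: int
    sender: str
    sender_neighbors: frozenset   # 1-hop neighbor IDs listed by the sender
    sender_dist_to_rsu: float     # distance from the sender to the last RSU

def should_become_candidate(vehicle_id, neighbors, dist_to_rsu, packet, seen_ids):
    """Apply the duplicate, direction, and coverage elimination checks."""
    if packet.packet_id in seen_ids or vehicle_id not in packet.sender_neighbors:
        return False  # duplicate, or the sender does not know this vehicle
    if dist_to_rsu <= packet.sender_dist_to_rsu:
        return False  # would propagate the packet backwards
    # Coverage elimination: nothing to add if every neighbor is already covered.
    uncovered = set(neighbors) - {packet.sender} - packet.sender_neighbors
    return bool(uncovered)

def defer_time(dist_to_sender, radio_range, t_max=0.020):
    """Waiting time before rebroadcasting: farther receivers wait less."""
    return (radio_range - (dist_to_sender % radio_range)) * t_max / radio_range
```

With a 250 m radio range and the 20 ms maximum waiting time used in the experiments, a receiver 200 m from the sender waits 4 ms, whereas one 50 m away waits 16 ms, so the farther vehicle rebroadcasts first.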

Algorithm Analysis.
Algorithm 1 attempts to choose the furthest vehicle to forward the packets and meanwhile lets multiple neighboring vehicles be the alternatives in case of the failure of the furthest node. Such a design aims to provide robust packet transmission.
The following subsections will check the reliability of our algorithm by theoretically analyzing the effects of rapid moving of vehicles on the reliability of one-hop packet transmission and discussing the treatment of the failures of RSUs.

Reliability of One-Hop Data Transmission.
As we know, the vehicles at the edge of the transmission range of the message sender may move out of range in a very short time owing to their fast movement. When this situation occurs in Dubhe, the furthest one-hop neighboring vehicle of the sender may not receive the message because it is no longer within the transmission range of the sender when the sender broadcasts the message. This subsection examines the possibility of such a situation caused by rapid vehicle mobility, given the ideal wireless signal model (i.e., the unit disk model). Figure 1 illustrates the distribution of vehicles on a 4-lane road; the packets are propagated from right to left by vehicle $v_s$. Note that whether the rebroadcasting vehicles can run away from the sender is decided by the location of the candidates when they send beacons to $v_s$, the relative velocity between the sender and the receivers, and the time elapsed between receiving the beacons and sending the packet. Suppose that the vehicle $v_0$ reports its position by a beacon message at time $t_0$ and that $v_s$ sends a packet at time $t'_0$; the condition that vehicle $v_0$ can run away from $v_s$ at $t'_0$ is then defined by (18). Since the relative velocity between vehicles travelling in opposite directions is more likely to be greater than that between vehicles travelling in the same direction, and such vehicles are more likely to move out of each other's range, we focus on the case of opposite directions.
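The three factors named in this paragraph can be combined into a simple one-dimensional check, shown below as a hedged Python sketch. It is a simplification, not the paper's condition (18): it assumes straight-line motion along the road, a constant relative speed, and a known gap between the candidate and the sender's transmission boundary at beacon time. All names are hypothetical.

```python
def can_escape(gap_to_boundary, relative_speed, elapsed):
    """True if the candidate has left the sender's range when the packet is sent.

    gap_to_boundary: distance (m) from the candidate to the range boundary
                     at the time of its last beacon.
    relative_speed:  closing/opening speed (m/s) between the two vehicles.
    elapsed:         time (s) between the beacon and the packet transmission.
    """
    return relative_speed * elapsed > gap_to_boundary
```

For two vehicles moving in opposite directions at 15 m/s each (30 m/s relative speed), a candidate 10 m inside the boundary escapes if more than a third of a second passes between its beacon and the sender's broadcast.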
To begin with, we specify the probability that a candidate vehicle can escape from the sender at a given time instant. This probability, given in (19), is an integral involving three densities: $f(v)$, the probability density function (pdf) of the relative velocity; the pdf of the distance between the candidates and the sender; and the pdf of the packet transmission time of the sender.
(1) Pdf of the relative velocity. We combine the fluid model with the stochastic model to characterize the average vehicle density and the random behavior of individual vehicles [22]. Following the Greenshields model in traffic flow theory, for a node at location $x$ at time $t$, the relationship between the velocity $v(x,t)$ of the node and the vehicle density is
$$v(x,t) = v_f \left(1 - \frac{k(x,t)}{k_{jam}}\right),$$
where $v_f$ denotes the free-flow vehicle velocity, $k_{jam}$ represents the jamming vehicle density with $k_{jam} = 1/l$, in which $l$ is the vehicle length, and $k(x,t)$ is the vehicle density at location $x$ at time $t$. Taking the random behavior of vehicles into account, the vehicle velocity can be redefined as $V = v(x,t) + \varepsilon$, in which $\varepsilon$ is a random variable following the normal distribution $N(0, \sigma^2)$. Since the maximum relative velocity of vehicles moving in opposite directions is $2v_{max}$ and the sum of two independent $N(0, \sigma^2)$ variables follows $N(0, 2\sigma^2)$, the pdf $f(v)$ of the relative velocity is given by (21).
(2) Pdf of vehicle distances. The exponential distribution has turned out to be appropriate for describing the distances between vehicles travelling along a 1-lane road; therefore, the number of vehicles on a 1-lane road can be estimated by a Poisson distribution. However, the vehicles in Figure 1 are located in an approximately rectangular region formed by the 4-lane road intersecting the wireless coverage area of the sender. In view of this, we consider that the vehicles are located randomly in the rectangular region, which is modelled by a 2-dimensional Poisson distribution. To simplify the analysis, we turn to the distribution of the distance between the furthest vehicle and the sender, since the furthest vehicle has the greatest opportunity to escape from the sender. The cumulative distribution function (cdf) of this distance is specified in (22) in terms of the region that contains no vehicle, marked with red dotted lines in Figure 1. The angle $\theta$ is uniformly distributed in $[0, \pi/2]$; that is, $f_\Theta(\theta) = 2/\pi$.
Generally, the conditional probability that vehicles are located in the region marked in Figure 1, conditioned on a specific angle, is given by (23). In particular, the area of this region can be calculated approximately by (24), which involves the width of the road; hence (22) is rewritten as (25). By differentiating (25), we obtain the pdf of the distance between the furthest vehicle and the sender, given in (26).
(3) Pdf of the packet sending time. It is a reasonable assumption that the time at which the sender sends packets follows a uniform distribution, which yields (27).
Substituting (21) and (26)-(27) into (19), the theoretical probability that the furthest vehicle can escape from the sender (i.e., the packet loss probability) can be calculated easily. Figures 2 and 3 show the results under different conditions. The packet loss ratio goes up with an increase in the beacon period, which is the direct result of the integral of (27) in (19). Moreover, increasing the vehicle density also increases the packet loss ratio slightly: the more vehicles are located in the forwarding area of the sender, the smaller the vehicle-free region becomes, which results in an increase in the integral of (26).
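The structure of this calculation can be illustrated numerically with a small Monte Carlo sketch. It is a simplified stand-in for (19), not a reproduction of it: the 2-dimensional distance distribution is replaced by a 1-dimensional assumption in which the gap between the furthest candidate and the edge of the range is exponentially distributed, and all parameter values (250 m range, 20 m/s free-flow speed, 5 m vehicle length, noise standard deviation of 2 m/s) are illustrative assumptions rather than the values of Table 3. The function names are hypothetical.

```python
import random


def greenshields_speed(density, free_flow_speed, jam_density):
    """Mean speed under the Greenshields model (density in vehicles per metre)."""
    return free_flow_speed * max(0.0, 1.0 - density / jam_density)


def estimate_escape_probability(density, beacon_period, tx_range=250.0,
                                free_flow_speed=20.0, vehicle_length=5.0,
                                sigma=2.0, trials=100_000, seed=1):
    """Estimate the chance that the furthest candidate leaves the sender's range."""
    rng = random.Random(seed)
    jam_density = 1.0 / vehicle_length
    mean_speed = greenshields_speed(density, free_flow_speed, jam_density)
    escapes = 0
    for _ in range(trials):
        # Opposite directions: the relative velocity is the sum of two noisy speeds.
        v_rel = (mean_speed + rng.gauss(0.0, sigma)) + (mean_speed + rng.gauss(0.0, sigma))
        v_rel = max(v_rel, 0.0)
        # Assumption: exponential gap between the furthest candidate and the range edge.
        distance = max(0.0, tx_range - rng.expovariate(density))
        # Sending time is uniform over one beacon period.
        elapsed = rng.uniform(0.0, beacon_period)
        if distance + v_rel * elapsed > tx_range:
            escapes += 1
    return escapes / trials
```

Under these assumptions, calling `estimate_escape_probability(0.05, 0.1)` and `estimate_escape_probability(0.05, 1.0)` shows the same qualitative trend as the analysis: a longer beacon period gives the candidate more time to leave the range.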
Next, we derive the success broadcast probability ($P_{succ}$) [28] to justify the multi-candidate strategy as a response to the rapid movement of vehicles. First of all, recall that the number of vehicles in the forwarding area of the sender follows a Poisson distribution. Therefore, the probability that a given number of potential vehicles can take part in rebroadcasting the received packet is given by (28). Further, assuming that at most a fixed number of candidates participate in forwarding a packet, the success probability can be found from (29). The former part of (29) corresponds to the case in which the number of potential candidate vehicles is less than this limit, while the latter part corresponds to the case in which more forwarding vehicles are available than the limit, although only the permitted number of them are involved in rebroadcasting the packet. By employing the complementary relationship, the probability that there are more potential candidate vehicles than the limit is easily given by (30). Finally, the success broadcast probability can be expressed in two different forms as (31), where a candidate limit of 0 implies that the packet is successfully transmitted without the help of the candidates, while a limit of at least 1 means that one or more candidate vehicles participate in forwarding the packet. The probability that there is at least one vehicle in the forwarding area of the sender is given by (32). Combining (28)-(30) and (32) with (31), we can obtain the success transmission probability easily. Given the parameter values in Table 3, the analytical results are shown in Figures 4 and 5. As can be seen, the success transmission probability without the auxiliary vehicle decreases as the vehicle velocity rises. In comparison, the success probability increases significantly when one auxiliary vehicle is involved in the packet relay, which justifies the necessity and effectiveness of our strategy.
In particular, with an increase in the vehicle density, the success transmission probability declines slightly for the same reason as the packet loss ratio.
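The benefit of additional candidates can be illustrated with a deliberately simplified model; it is a sketch under stated assumptions and does not reproduce (28)-(32). It assumes that the number of vehicles in the forwarding area is Poisson distributed, that each forwarder independently misses the packet with the same loss probability, and that at most `max_forwarders` vehicles take part in the relay. The function names and the truncation point `n_max` are hypothetical.

```python
import math


def poisson_pmf(n, mean):
    """Probability of exactly n vehicles when the count is Poisson with the given mean."""
    return math.exp(-mean) * mean ** n / math.factorial(n)


def success_probability(mean_vehicles, loss_prob, max_forwarders, n_max=100):
    """Chance that at least one of the participating forwarders receives the packet."""
    total = 0.0
    for n in range(1, n_max + 1):
        # Only min(n, max_forwarders) vehicles are involved in the relay.
        used = min(n, max_forwarders)
        # Assumption: losses are independent across forwarders.
        total += poisson_pmf(n, mean_vehicles) * (1.0 - loss_prob ** used)
    return total
```

In this model, raising `max_forwarders` from 1 to 2 replaces the per-packet loss probability by its square whenever at least two vehicles are present, which is the intuition behind keeping an auxiliary candidate.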

Retransmission Ratio.
Due to the relative movements of vehicles, the sender may not receive the message rebroadcast by the furthest candidate (the sender takes this rebroadcast as an acknowledgment message). The sender then regards the transmission as failed and rebroadcasts the packet, which results in a retransmission. However, the theoretical analysis (Figure 6) shows that the retransmission probability is acceptable under different vehicle densities. The analysis procedure is similar to that in Section 4.2.1, except that the upper limit of the integral over the sending time in (19) is replaced with Ψ, which is defined in terms of the wait time before the packet is rebroadcast by the furthest candidate, the maximum processing time in that candidate, and the one-hop wireless signal transmission time. Equation (34) comes from line 19 of Algorithm 1; therefore, the retransmission probability can be obtained by rewriting (19), where the former integral indicates that the retransmission probability is mainly the result of the processing time and the wireless signal transmission time, given that the other parts of the integral are known. The corresponding pdf can be described by a uniformly distributed random variable. According to (34), the defer time of the timer in the furthest candidate is an implicit random variable determined by the distance distribution.
As can be seen from Figure 6, the retransmission probability caused by vehicle movements is less than 1.2%, so the message overheads incurred are very limited. Moreover, our theoretical results are expected to provide an upper bound, since the configured parameter values correspond to a relatively extreme case.

Failures of RSUs.
The RSUs play important roles in pinpointing the forwarding paths, so Dubhe must be able to tolerate RSU failures. In Dubhe, the arrival-ratio messages from adjacent RSUs act as heartbeats that maintain the neighboring relationship between RSUs. If no arrival-ratio message has been received from a neighboring RSU within a specified interval (in units of an exchanging cycle) while vehicles are still arriving from that RSU, then the RSU is regarded as failed. When a vehicle finds that the RSU it is approaching has failed, it tries to forward the packet to a vehicle travelling in the same direction. If that is not possible, it selects a random path and tries to forward the packet.
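The failure rule above can be sketched as a small heartbeat monitor. The class `NeighborRsuMonitor`, its method names, and the threshold of three missed exchanging cycles are hypothetical; the text only states that the interval is specified in units of an exchanging cycle. The sketch also assumes that "vehicles coming from the RSU" means a vehicle was observed from that direction within the same interval.

```python
from dataclasses import dataclass, field


@dataclass
class NeighborRsuMonitor:
    exchange_cycle: float            # seconds between arrival-ratio messages
    max_missed_cycles: int = 3       # hypothetical threshold, not from the paper
    last_report: dict = field(default_factory=dict)
    last_vehicle_seen: dict = field(default_factory=dict)

    def on_arrival_ratio(self, rsu_id, now):
        """Record an arrival-ratio message, which doubles as a heartbeat."""
        self.last_report[rsu_id] = now

    def on_vehicle_from(self, rsu_id, now):
        """Record that a vehicle arrived from the direction of the given RSU."""
        self.last_vehicle_seen[rsu_id] = now

    def is_failed(self, rsu_id, now):
        timeout = self.exchange_cycle * self.max_missed_cycles
        silent = now - self.last_report.get(rsu_id, float("-inf")) > timeout
        vehicles_arriving = (
            now - self.last_vehicle_seen.get(rsu_id, float("-inf")) <= timeout
        )
        # Silence alone is not enough: without traffic, no report may be expected.
        return silent and vehicles_arriving
```

Requiring both conditions avoids declaring an RSU failed on a road segment that simply carries no traffic.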

Evaluation
We conduct simulation experiments to investigate the data delivery performance and message overheads over three independent factors (i.e., the vehicle density, the traffic signal cycle and the connectivity estimation period). The experimental results show that our mechanism can achieve reliable and low-latency data delivery in comparison with the epidemic-based data dissemination protocol (EBDD) [8] and the static-node-assisted adaptive data delivery protocol (SADV) [12].  Figure 7 shows the road topology, where the black dots indicate the deployed RSUs.
We use NCTUns 6.0 [29] to conduct the experiments, and the MAC protocol follows IEEE 802.11b. In order to simulate roadside buildings in urban environments, barrier walls are set along the roadsides. The Intelligent Driver Model (IDM) is chosen as the mobility model to reflect the features of vehicle movements in urban scenarios. In the experiments, all vehicles (100-1035 vehicles) are initially placed at random positions, and vehicles make turns at the intersections with a certain probability (the probabilities of going straight, turning left, and turning right are 0.6, 0.2, and 0.2, resp.). The green phase of the traffic signal equals the red phase, with 50 s as the default. Each simulation lasts for one hour.
Data are generated as follows: 10 percent of vehicles generate a data request with a random destination (SADV and Dubhe use RSUs as destinations while EBDD uses vehicles as destinations) per second. Table 4 outlines the experimental parameters.

Results and Analyses.
We conduct three groups of experiments to evaluate the influence of the traffic densities, the traffic signal cycle and the connectivity estimation period/delay update period on the performance and overheads.
(1) Vehicle Density. The vehicle density is one of the most important factors affecting the connectivity of a VANET. The first group of experiments examines the performance and overheads under different traffic densities. Here, the number of vehicles represents the vehicle density and is increased from 100 to 1035 across the experiments. As shown in Figure 8, the delivery ratios grow substantially as the number of vehicles increases. EBDD is more sensitive to the vehicle density than Dubhe and SADV: its packet delivery ratio peaks at 635 vehicles and then declines as the vehicle density increases further. This behavior is rooted in the best-effort operation of EBDD, in which packets are exchanged whenever vehicles enter the transmission range of each other, leading to severe channel conflicts under heavy traffic. In contrast, the delivery ratios of SADV and Dubhe improve steadily as the vehicle density increases. Furthermore, Dubhe shows a significant advantage over SADV in terms of transmission delay and delivery ratio. It is worth mentioning that the packet loss in Dubhe is mainly due to the expiration of the TTL values of packets.
Figure 9 indicates that the delay of EBDD is lower than that of SADV and Dubhe when the vehicle density is relatively low (less than 635 vehicles) but grows gradually as the number of vehicles continues to increase. Benefiting from the exchange of the transmitting delay between adjacent intersections, the transmission delay of SADV and Dubhe drops rapidly as the vehicle density increases. The delay of Dubhe is nevertheless shorter than that of SADV, because Dubhe employs a more accurate delay model than SADV for packet forwarding decisions.
Figure 10 shows the experimental results of message overheads. Here, the message overhead of a vehicle is defined as the message traffic that it receives and sends, excluding the payload of data packets.
It can be observed that SADV consumes fewer messages than Dubhe and EBDD because no beacon messages are sent in SADV. The message overheads of EBDD increase dramatically with the number of vehicles. This results from the exchange of packet digest lists whenever vehicles can communicate with each other; the lengths of these lists rise sharply as the vehicle density increases. In contrast, the overheads of Dubhe remain relatively stable, and the majority of the messages of Dubhe are beacon messages. Considering that the beacon is an essential mechanism for safety-related applications, we claim that Dubhe does not add substantial message overheads in comparison with SADV.
(2) Traffic Signal Cycle Length. The traffic signals at intersections interrupt the traffic flow and affect the VANET connectivity significantly. Therefore, the second group of experiments investigates the influence of the traffic signal cycle length on the performance of data delivery. Figures 11-16 show the delivery ratios, the transmission delay, and the overheads under distinct traffic signal cycle lengths for scenarios with different numbers of vehicles (200 vehicles and 1000 vehicles, resp.).
As shown in Figure 11, the delivery ratios of SADV and Dubhe decrease by about 5% and the delay rises by about 10% as the length of the traffic signal cycle increases from 40 s to 120 s in the 200-vehicle scenario, while the delivery ratio of EBDD rises by about 10%. The latter may arise from the fact that in EBDD a long traffic signal cycle gives vehicles queuing at intersections more chances to infect each other.
From Figures 12 and 14, in the 1000-vehicle scenario, EBDD is significantly inferior to SADV and Dubhe in terms of the packet delivery ratio and the transmission delay. Meanwhile, the transmission delay of Dubhe reaches its minimum when the traffic signal cycle lasts 80 seconds, where it is 8% lower than the delay with 40 seconds as the cycle length. Moreover, the transmission delay of SADV is 20% longer than that of Dubhe as a whole. The reason is that the estimation of the transmitting delay in SADV takes little account of the interference of traffic signals, while the delay model of Dubhe distinguishes between the time required for establishing a wireless link during the red phase and during the green phase, and also considers the changing vehicle density.
It is worth noting that short traffic signal cycles correspond to high transmission delay when the vehicle density is high. This phenomenon can be explained by the fact that the multi-hop wireless link between adjacent intersections cannot be established during a short green phase, which increases the proportion of the transmission time during which the packet is carried by vehicles.
In addition, the data in Figures 15 and 16 imply that the traffic signal cycles have little impact on the message overheads of SADV and Dubhe. However, the messages consumed by EBDD are more sensitive to the length of the traffic signal cycle, especially in the 1000-vehicle scenario, where the number of messages rises by at most about 10% as the length of the traffic signal cycle increases.
(3) Connectivity Estimation Period/Delay Update Period. The RSUs deployed at intersections help to store the packets when no suitable vehicle can relay them. In Dubhe, an RSU periodically estimates the change of the queue lengths of vehicles at intersections; the period of this estimation is called the connectivity estimation period. In SADV, the static nodes at adjacent intersections periodically exchange the data transmitting delay on roads, which is used to decide the packet forwarding paths. Because the connectivity estimation period and the delay update period of static nodes have similar functions, we design a group of experiments to examine their effects on the data delivery performance.
As shown in Figures 17 and 18, as the connectivity estimation period and the delay update period increase, both delivery ratios decrease slightly, that is, by 1% for Dubhe and 2% for SADV. Meanwhile, we find that the connectivity estimation period in Dubhe has more impact on the transmission delay than on the delivery ratio: from Figures 19 and 20, the transmission delay increases as this period gets longer. In detail, when the period changes from 5 s to 20 s, the transmission delay grows by about 5.6% in the 1000-vehicle scenario and by 2% or so in the 200-vehicle scenario. The increased delay is due to the deteriorated accuracy of connectivity estimation.
Similarly, as the delay update period becomes longer, the transmission delay of SADV goes up by about 9% in the 1000-vehicle scenario and by 4% in the low-density case. The period thus has more effect on the performance of SADV than on that of Dubhe. The reasons are twofold: on the one hand, the error rates of the actual delay information go up as the delay update period increases. On the other hand, Dubhe can forecast the lost arrival ratios of vehicles at neighboring intersections from historical data, while SADV has no such remedy strategy.
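How a lost arrival ratio might be forecast from historical data can be sketched as follows. The forecasting method is not specified in this section, so the sketch substitutes a plain exponentially weighted moving average; the class name `ArrivalRatioEstimator` and the smoothing factor `alpha` are assumptions made for illustration only.

```python
class ArrivalRatioEstimator:
    """Tracks a neighbor's arrival ratio and bridges lost reports (illustrative)."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha     # hypothetical smoothing factor
        self.estimate = None   # no history until the first report arrives

    def update(self, reported_ratio=None):
        """Fold in a received ratio, or return the forecast if the report was lost."""
        if reported_ratio is not None:
            if self.estimate is None:
                self.estimate = reported_ratio
            else:
                self.estimate = (self.alpha * reported_ratio
                                 + (1 - self.alpha) * self.estimate)
        return self.estimate
```

Calling `update()` without an argument in a cycle where no message arrived returns the last smoothed value, so path selection can continue with a slightly stale estimate instead of no estimate at all.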
No significant changes in the message overheads are observed in the experiments with different traffic densities, as shown in Figures 21 and 22. Since the two examined periods only affect the frequency of exchanging the arrival ratios and the actual delay information, the messages incurred are negligible relative to the total overheads.

Conclusions
Efficient and reliable data delivery lays the foundation for many VANET applications. We present Dubhe, which reduces the data transmission delay and improves the data delivery reliability of long-distance data transmission in urban scenarios. In Dubhe, the RSUs deployed at main intersections are exploited to collect the traffic densities, estimate the real-time transmitting delay between the RSUs, and then choose a minimum-delay path for the packets. Meanwhile, an improved greedy broadcast algorithm is proposed to enhance the reliability of one-hop data transmission, and we theoretically analyze its reliability and retransmission ratio. The experimental results show that the packet delivery ratio of Dubhe is 10% higher than that of SADV and that its transmitting delay is 5-12% lower than that of SADV. Moreover, in comparison with EBDD, our mechanism exhibits stable performance, low message overheads, and high scalability.