A Multihoming Clustering Algorithm for Vehicular Ad Hoc Networks

Clustering in vehicular ad hoc networks is an effective approach to make dynamic wireless vehicular sensor networks more manageable and stable. To make vehicle clustering applicable everywhere regardless of the provided infrastructure, vehicles must rely only on themselves and must not take any supporting services, such as location or external communication services, for granted. In this paper, we propose a new clustering metric and a clustering algorithm with multihoming support. It relies only on the vehicle's ability to send and receive wireless packets which identify the vehicle relationship. Clusters are created with redundant connections between nodes to increase the communication reliability in case of topological changes and the cluster creation process is also inverted compared to other algorithms. The presented solution is verified and compared to MOBIC with the use of ns-3 and SUMO simulation tools. Simulation results have confirmed the expected behavior and show that our algorithm achieves better node connectivity and cluster stability than the former.


Introduction
Vehicular communications play an important role in the emerging Intelligent Transportation Systems (ITS). It is anticipated that communication between vehicles will significantly change the way traffic and transportation work. The main motivation comes from the field of safety, where it is expected that many accidents and injuries will be prevented. Other benefits are also predicted, for example, more effective traffic flow with regard to both time and environmental footprint, driver support in finding free parking spaces, and so forth.
Deploying road communication infrastructure everywhere is not feasible, and the public communication infrastructure, such as GSM or LTE, also has some dead spots. The only type of communication that will be possible at all times is direct vehicle to vehicle (V2V) communication without support from outside network entities. Such a design also makes the network more robust, as the communication will not be hindered by electricity outages or similar problems. Another advantage of V2V communication is that it relieves the public communication infrastructure, which lowers the deployment costs and saves bandwidth for other uses. A very important aspect of vehicular networks is geographical locality, as many types of data are only of local importance and can be disseminated more effectively in that way.
But to make V2V communication reliable and useful, some technical challenges have to be resolved. One of the hardest is the logical organization of the network, as the vehicles move quite fast and in opposite directions. The relative speed can range up to and even beyond 300 km/h (for two vehicles on a highway moving in opposite directions), and this limits the communication time between them to only a few seconds in the 5.9 GHz band, which is reserved for Dedicated Short Range Communication (DSRC).
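The "few seconds" figure can be checked with a rough back-of-the-envelope sketch. It assumes an idealized circular radio range, two vehicles passing head-on on a straight road, and the 100 m, 200 m, and 300 m ranges used later in the simulation; the function name is only illustrative.

```python
def contact_time_s(relative_speed_kmh: float, radio_range_m: float) -> float:
    """Time two vehicles passing each other stay within radio range.

    Assumption: the link exists while the distance between the vehicles
    is below radio_range_m, i.e. over a stretch of 2 * radio_range_m.
    """
    relative_speed_ms = relative_speed_kmh / 3.6  # km/h -> m/s
    return 2.0 * radio_range_m / relative_speed_ms


if __name__ == "__main__":
    for radio_range in (100, 200, 300):
        seconds = contact_time_s(300, radio_range)
        print(f"range {radio_range} m: {seconds:.1f} s of contact")
```

Under these assumptions the contact window at 300 km/h relative speed stays below 10 s for all three ranges, which is consistent with the estimate above.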
Because there is no outside entity that would help organize or prepare the network, vehicles are left to organize themselves. As they are all equal from the network perspective, they form a Vehicular Ad hoc Network (VANET). Clustering is important because it makes the network look less dynamic, which simplifies upper layer protocols and directly supports other functionalities such as trust management, sensor data aggregation, and so forth.
Clustering is an already known form of network organization in mobile and stationary wireless sensor networks, but it differs greatly from vehicular clustering. The main limitation of these networks is the energy [1] and processing power, so clustering is optimized for low resource usage. Vehicles, on the other hand, are not only resource rich but also highly mobile. As a result, clustering algorithms for mobile and stationary wireless sensor networks are not effective in VANETs and new solutions have to be developed. Many clustering solutions have recently been proposed for VANETs, but the high mobility of nodes (vehicles) and the requirement for clustering to work in different environments make effective clustering a challenging task. As wireless vehicular sensors are just a subset of more general wireless vehicular communications, the clustering solutions are focused on providing their services for all usage scenarios, not just sensors.
We believe that a VANET clustering solution that is reliable and effective and works well in different environments is the key to quick and successful ITS deployment. To achieve ubiquitous operation [2] of the VANET in numerous different working scenarios, the vehicles cannot rely on anything but themselves, so no external services such as a location service should be used.
This paper offers three original contributions. Firstly, we propose a new clustering algorithm with redundant cluster head connections, enabling support for multihoming. Secondly, the algorithm does not depend on a location service but still provides grouping of related vehicles and fast response to topology changes. Thirdly, the algorithm is simulated with realistic vehicle mobility and network connectivity using the SUMO [3] mobility simulator and the ns-3 [4] network simulator.

Related Work
Seamless connectivity [5] is what will enable the vehicles to communicate with distinct networks and use the services they provide. Different communication technologies will be used, depending on the type of service and other requirements. VANET communication is only one of the possibilities.
Many papers on clustering have been published in recent years, which makes it a hot research topic in academia. A brief overview of clustering solutions for VANETs has been published in [6] and covers the vast majority of known clustering solutions, but none of them is focused on wireless vehicular sensors.
The majority of proposed clustering algorithms for VANETs depend on the Global Positioning System (GPS) [6], but this might not be the best option [7,8]. Positioning and location services and vehicle to vehicle communications are both a fundamental part of ITS and should be, on the low level, independent from each other. It is a known fact that positioning services are not available everywhere, and, even if they are, their accuracy can vary significantly. The problematic areas are especially parking structures and tunnels, bridges, city centers with narrow streets and high buildings, natural obstacles, and so forth. An inaccuracy of a few tens of meters, which can be tolerated by many other services (e.g., navigation), or unavailability presents a threat for position service based clustering, as it could severely affect the cluster stability and even lead to communication failure, both of which are unacceptable. Cooperative positioning [9], which could bridge this accuracy and availability gap, also cannot be used because it depends on vehicle to vehicle communications itself. Dependence on positioning services also contributes to the complexity of the whole system and as a consequence lowers its reliability. But even in cases when positioning services are not available or not reliable, vehicles should be able to communicate.
Most of the clustering algorithms are focused on minimal network overhead, cluster stability, minimal number of cluster heads, and so forth. We believe that all this is of secondary importance and that the primary goal should be the connectivity between vehicles, even at the cost of higher network overhead. Once there is an emergency situation with lives at stake, no time should be lost on network organization, as the messages have to be passed around as quickly as possible to prevent injuries and damage from happening. This goal can be achieved only if ubiquitous connectivity with redundant connections is assured.
A recently published paper [10] proposes a clustering algorithm that imitates the behavior of magnetic particles and the forces that pull them together or push them apart. Because it mimics a physical system, the solution can be easily and effectively simulated with physics simulation tools, which simplifies the testing. But like many other clustering solutions, this one also relies on the use of GPS. Another novel clustering solution is presented in [11]. It tries to identify and cluster together vehicles with similar mobility patterns and allows for overlapping clusters. The clustering process is always started from the slowest or fastest clusterless vehicle in the area, which should prevent the creation of unnecessary and unstable clusters. VMaSC [12] is also a new concept with multihop based clustering of vehicles that exhibit similar mobility patterns. Other interesting location service based clustering solutions with new ideas and concepts include APROVE [13], Passive Clustering [14], and VWCA [15].
Popular clustering solutions that are frequently mentioned in research papers include MOBIC [16] and Lowest ID, which are basically the same algorithm with different metrics (radio signal quality versus numerical ID). It is a very simple algorithm built for mobile ad hoc networks, so it does not scale well in VANETs, but it is usually used as a performance reference for other clustering algorithms.
An interesting algorithm that is not location service based but is based on wireless signal propagation is HCA [17]. Its specialty is multihop clustering, which is quite rare, but it might also influence the cluster stability. Another similar radio signal propagation solution is presented in [18]; it uses the relative mobility of nodes as its metric.
In [19] the authors study the QoS performance of different routing schemes for vehicular networks. They propose a double cluster head routing solution as a compromise between end to end delay, packet loss, and energy consumption.

Vehicle Interconnection Metric
The vehicle interconnection metric is a simple yet effective foundation for VANET clustering. It captures similarities between two vehicles' movement patterns over time, allowing the identification of vehicles that are able to communicate, travel on the same route, in the same direction, and with similar speed, and are in geographical proximity. These are the key properties that identify vehicles which are good candidates to be clustered together to form stable and long living clusters.
Roads, with their number of lanes and segment lengths, limit the free movement of vehicles. Vehicles are unable to split unpredictably and abruptly and are bound to travel close together for at least some time. This is the characteristic our metric is based upon. According to data from the Slovenian highways, the average length of a highway segment is around 4.1 km. A vehicle traveling at the maximum allowed speed of 130 km/h requires about 113 s for an average segment. Since all vehicles on the highway aim for a similar speed, they stay in touch for a long enough time period to communicate and use clustering.
A prerequisite for our metric is the beacon frames. Each vehicle needs to send periodic beacon frames containing at least a vehicle's id (e.g., MAC address, but it can be any other identification) which unambiguously identifies the vehicle in the area. The beacon period has to be known to all the vehicles. It is possible to use an adaptive beacon rate to account for different movement patterns (e.g., congestion with a lower beacon rate), but then the information about the beacon interval also needs to be present in the beacon frame. Instead of explicit beacon frames, any other frame type that implicitly denotes good radio connectivity between two vehicles could also be used. For better understanding, we use explicit beacon frames and assume a constant beacon rate in the explanation of our algorithm.
The metric is based on processing the received beacons from surrounding vehicles. Each correct reception is considered a reward to the metric, and each missed or incorrectly received beacon from a previously known vehicle is regarded as a penalty.
The value of the metric is represented as an 8-bit unsigned non-overflowing integer counter with the initial value of zero. Values closer to zero denote lower vehicle interconnection, whereas values closer to the maximum value (255) denote higher, better vehicle interconnection. Each vehicle keeps a separate counter for each neighboring vehicle in its vicinity. A one-bit status flag denoting a correctly received beacon frame in the last beacon period is also kept for each vehicle.
The algorithm for generating and updating the metric is quite simple. On each correctly received beacon, the sending vehicle is looked up in the known neighboring vehicle list of the receiving vehicle. If it is found, then the beacon counter value is increased by one. But if the vehicle is not found on the list, then it is added and its beacon counter value is set to one. In either case, the beacon reception flag for the sending vehicle is also set.
After the beacon period timer expires, the vehicle performs its beacon sending and metric penalization routine.
Each vehicle from the known neighboring vehicle list is processed. Its beacon reception flag is checked and, if the beacon was not received, the beacon counter is penalized by dividing its value by two. If the value of the counter reaches zero, the vehicle is removed from the known vehicle list. In either case, the beacon reception flag is cleared for all the vehicles and prepared for the reception of beacons in the next beacon period.
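The reward and penalty rules described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class and method names are hypothetical, vehicle ids are plain strings, and the caller is assumed to invoke `on_period_expired` once per beacon period.

```python
from dataclasses import dataclass

MAX_METRIC = 255  # 8-bit unsigned counter that saturates instead of overflowing


@dataclass
class Neighbor:
    metric: int = 0         # interconnection counter, 0..255
    received: bool = False  # beacon correctly received in the current period


class InterconnectionMetric:
    """Per-vehicle table of neighbor counters (hypothetical helper class)."""

    def __init__(self) -> None:
        self.neighbors: dict[str, Neighbor] = {}

    def on_beacon(self, sender_id: str) -> None:
        """Reward: called for every correctly received beacon."""
        # An unknown sender is added with counter 0 and then rewarded to 1.
        entry = self.neighbors.setdefault(sender_id, Neighbor())
        entry.metric = min(entry.metric + 1, MAX_METRIC)
        entry.received = True

    def on_period_expired(self) -> None:
        """Penalty: called when the beacon period timer expires."""
        for vehicle_id in list(self.neighbors):
            entry = self.neighbors[vehicle_id]
            if not entry.received:
                entry.metric //= 2  # integer division by two
            if entry.metric == 0:
                del self.neighbors[vehicle_id]  # forget lost neighbors
            else:
                entry.received = False  # prepare for the next period
```

Each vehicle would hold one such table, so the values kept by vehicle A for B and by B for A evolve independently.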
The presented metric is not symmetric. Vehicle A does not necessarily have the same value for vehicle B as vehicle B has for vehicle A. This is part of the design and exposes the vehicle interconnections in both directions. It can be used as an additional input variable for the designed clustering algorithm.
The metric is designed to build up slowly: it takes 255 consecutive correct receptions of the beacon frame to reach its maximum value. With a beacon period of 1 second, it takes more than 4 minutes to reach the maximum, which is a very long time from the perspective of vehicle movement. Thus, we can be very certain that two vehicles move quite similarly if the metric reaches its maximum value. On the other hand, the metric is penalized by integer division, which takes at most 8 steps to bring the metric down from its maximum to its minimum value. So the metric adapts very quickly in the case of missed beacon frames and allows for the appropriate measures to be taken almost instantly.
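The asymmetry between the slow build-up and the fast decay can be verified with a few lines of arithmetic. The helper names below are illustrative only, and a constant 1 s beacon period is assumed.

```python
MAX_METRIC = 255  # maximum of the 8-bit counter


def periods_to_saturate(start: int = 0) -> int:
    """Consecutive correct receptions needed to reach the maximum."""
    return MAX_METRIC - start


def halving_steps(value: int = MAX_METRIC) -> list[int]:
    """Counter values after each consecutive missed beacon."""
    steps = []
    while value > 0:
        value //= 2
        steps.append(value)
    return steps


if __name__ == "__main__":
    print(periods_to_saturate())   # 255 periods, i.e. 4.25 minutes at 1 s
    print(halving_steps())         # [127, 63, 31, 15, 7, 3, 1, 0]
    print(len(halving_steps()))    # 8
```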

The Clustering Algorithm
The VANET clustering algorithm we designed differs significantly from the majority of other clustering algorithms. In contrast to them, superior connectivity has been set as the main goal, whereas their goals are usually cluster stability, the smallest number of cluster heads, minimal protocol overhead, and the like. We achieve this goal with the use of the multihoming principle known from the Stream Control Transmission Protocol (SCTP), where redundant connections are established to increase the reliability of the host connectivity [20]. In our scenario, connections to multiple cluster heads (CH) are used by the cluster nodes (CN) to provide the redundancy and increase the connection reliability. We aim to prevent connectionless states when clusters are reorganizing, as it is much more likely for a vehicle to stay connected to at least one CH while this process is taking place. The improved connectivity is important for safety and emergency applications, where low latency is required for prompt responses that prevent or minimize injuries and damage. Apart from that, better connectivity also positively affects time critical infotainment services such as VoIP or online gaming.
Multihoming opens up new possibilities in VANET communications and sets up foundations for routing protocols or other services that exploit this feature to their advantage. Two of the possible usage scenarios are increased communication reliability and link load balancing, which can contribute to QoS and QoE improvement. An example is illustrated in Figure 1, where node A is sending the data to node B.
The other specialty of our algorithm is the cluster building principle. In contrast to other VANET clustering algorithms, which build the clusters by electing a CH among the nodes, our solution reverses this procedure. By definition, each node's initial state is the CH state. As the nodes connect and the network grows, excess nodes give up their CH role and change to CN. But when the node density drops and not enough CHs are available, a node switches its state back to CH. To the best of our knowledge, this is the first VANET clustering algorithm designed in such a fashion.
We use commonly known terms like cluster head and cluster node throughout the paper for simplicity and better understandability, although technically a more correct formulation would be active and passive cluster head. According to the algorithm design, all nodes work in the same way all the time, except that CNs do not forward data packets, as this is the CH's task. So the metrics and other clustering data are updated continuously on all the nodes, making the switchover from CN to CH a no delay event.
The presented clustering algorithm is an extension of the vehicle interconnection metric and as such is very similar to the algorithm presented in the previous section. Only some minor upgrades have been added to allow switching from CH to CN and vice versa.
The beacon frame structure has been slightly extended with an additional status field. This field is used by a node to signal its role, which is either CH or CN. When processing a correctly received beacon frame, as explained in Algorithm 1, this role status is extracted from the frame and added to the known neighboring vehicle list. Each vehicle now knows the role of all the neighboring vehicles and can change its role accordingly.
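As an illustration only, an extended beacon could be serialized as shown below. The paper does not specify a wire format, so the 7-byte layout (a 6-byte MAC address as the vehicle id followed by a 1-byte role field), the `Role` values, and the function names are all assumptions made for this sketch.

```python
import struct
from enum import IntEnum


class Role(IntEnum):
    CN = 0  # cluster node (passive cluster head)
    CH = 1  # cluster head (active cluster head)


# Hypothetical layout: 6-byte vehicle id + 1-byte role, network byte order.
_BEACON = struct.Struct("!6sB")


def encode_beacon(vehicle_id: bytes, role: Role) -> bytes:
    """Build a beacon payload carrying the sender id and its current role."""
    if len(vehicle_id) != 6:
        raise ValueError("vehicle_id must be a 6-byte MAC address")
    return _BEACON.pack(vehicle_id, int(role))


def decode_beacon(frame: bytes) -> tuple[bytes, Role]:
    """Extract the sender id and role from a received beacon payload."""
    vehicle_id, role = _BEACON.unpack(frame)
    return vehicle_id, Role(role)
```

With an adaptive beacon rate, the beacon interval would have to be added as a further field, as noted in the metric description.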
The following are the parameters used in our algorithm.
(i) numCH is the number of CHs a node has to be connected to before it is allowed to switch to the CN state, which in turn affects the connectivity. The greater the number is, the more CHs are in the network and the more redundant connections exist (which is a prerequisite for multihoming), but the clustering overhead also increases. A value of 2 provides minimal clustering overhead while still providing increased connectivity and connection redundancy. For multihoming support, a value of at least 3 is recommended. The choice of the value depends on the intended use of the algorithm: either higher connectivity or lower clustering overhead.
(ii) metricTreshold sets the threshold above which a node can be considered a valid candidate for a CH from the perspective of the current vehicle. It represents a compromise between the communication duration and the clustering speed. A higher value imposes clustering with better interconnected vehicles, but it also delays the clustering process for new nodes whose metric value is low. A metricTreshold that is too high could severely impair clustering, so the value should be set as low as possible but high enough to exclude unsuitable vehicles. As a rule of thumb, a communication time of at least 5 seconds should be considered.
(iii) numVehicles is the number of best performing vehicles to be included in the CH consideration. It allows the vehicle to investigate only the highly interconnected vehicles for the CH status and forces the clusters to be formed with the best performing vehicles only. This improves the cluster stability. The value depends on the number of vehicles in the vicinity, which relates to the vehicle density and communication range, so dynamic adaptation could be used. By logical consideration, values between 15 and 50 should be a good starting point.
(iv) switchProbability is the probability of a CH switching its role to a CN. If two neighboring nodes switched their states at the same time, a lack of CHs could arise and force other nodes to switch to the CH state. This would result in network oscillations, and switchProbability is used to lower the number of such cases. For optimum performance, the value should be around but below 0.5, so that the role switch takes place with a great enough probability but the switching does not happen on every attempt. A value too close to 1 would reduce the effectiveness of this feature and a value too close to 0 would limit the role switching too much.
When the beacon period timer expires, all the vehicles are processed as explained in the previous section. Additionally, the value of the metric and the role of each vehicle are also checked. If the vehicle itself is a CH, then it tries to change its role to a CN according to the following criterion: the role can be switched only if there are at least numCH CHs with the metric above metricTreshold among the numVehicles top ranked vehicles and the vehicle did not change its role to CH in the previous beacon period. To prevent oscillations in the network, the vehicle changes its role with probability switchProbability. But if the above criterion is not met, then the vehicle stays a CH. The algorithm is presented in Algorithm 2.
The above criterion is also valid for the reverse switch from CN to CH, with the exception that the role can be changed regardless of a possible role change in the previous beacon period. This allows the network to quickly provide additional CHs if needed.
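The role decision can be sketched as a small pure function. This is one reading of the criterion rather than the authors' Algorithm 2: the names are hypothetical, the default parameter values are the ones used in the simulation section, and it is assumed that a CN returns to the CH state immediately (without the probabilistic damping) as soon as the criterion is no longer met.

```python
import random
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class NeighborInfo:
    metric: int  # interconnection metric kept for this neighbor
    is_ch: bool  # role announced in the neighbor's last beacon


def count_qualified_chs(neighbors: Sequence[NeighborInfo],
                        num_vehicles: int, metric_threshold: int) -> int:
    """CHs above the threshold among the num_vehicles top ranked neighbors."""
    top = sorted(neighbors, key=lambda n: n.metric, reverse=True)[:num_vehicles]
    return sum(1 for n in top if n.is_ch and n.metric > metric_threshold)


def next_role_is_ch(is_ch: bool, became_ch_last_period: bool,
                    neighbors: Sequence[NeighborInfo],
                    num_ch: int = 2, metric_threshold: int = 10,
                    num_vehicles: int = 25, switch_probability: float = 0.4,
                    rng: Callable[[], float] = random.random) -> bool:
    """Return True if the node should act as CH in the next beacon period."""
    enough = count_qualified_chs(neighbors, num_vehicles,
                                 metric_threshold) >= num_ch
    if is_ch:
        # CH -> CN only if enough CHs are reachable, the node did not just
        # become a CH, and the random draw allows it (damps oscillations).
        if enough and not became_ch_last_period and rng() < switch_probability:
            return False
        return True
    # CN -> CH as soon as too few qualified CHs remain (assumed immediate).
    return not enough
```

Passing `rng` as a parameter keeps the decision deterministic in tests while defaulting to `random.random` in normal use.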
In this algorithm, CHs and CNs are loosely coupled, as there is no explicit notification between the two about their dependence. Although the metric is not symmetric, it is still expected to be quite similar in both directions, so if a CN relies on a CH, then the CH should know about it from its own metric measurement.
To implement tight coupling between CHs and CNs, explicit notifications between nodes are needed. They have to be periodic, as the roles might change, but they are not necessarily sent on every beacon period. However, they have to be sent on every role change to notify surrounding nodes about the change. The notification messages differ between node types: CNs send only the data about who their CHs are, so the data structure is quite small. In contrast, a CH sends the whole list of nodes that it serves, including both CNs and CHs. Nodes in the communication range that do not meet the clustering criterion should be silently ignored. This implicitly allows for overlapping clusters.
Even with the tight coupling, there are no explicit confirmation messages for the notifications. The only confirmations are the notifications themselves.

Simulation and Test Scenarios
For the simulation, the Ovnis [21] platform has been used with its Kirchberg simulation setup. Ovnis is an integration of the ns-3 [4] network simulator with the traffic microsimulator SUMO [3] and allows for the control of the traffic flow in real time from within ns-3. Due to recent API changes in both ns-3 and SUMO, a significant part of the code was modified and updated.
MOBIC is most frequently used as a de facto reference for comparing VANET clustering algorithms [13, 22-25], so it was a logical choice to use it in our case as well. This allows a performance comparison with other clustering algorithms that also use MOBIC as a reference. Although it would be prudent to use a more recent VANET clustering algorithm as a reference, this was not possible. To the best of our knowledge, none of the VANET clustering algorithms known to us provides enough details in the published papers to implement the algorithm on our own or to replicate the same simulation scenario for the testing, but one of the two is required for making a credible comparison.
The ns-3 network simulator is configured to use the log-distance propagation model with default parameters and a variable maximum distance that changes according to the test runs. We also use the constant speed propagation delay and the IEEE 802.11p control channel (CCH) with 10 MHz bandwidth and 27 Mbps constant rate. The beacon period is set to 1 second and the messages are sent using the IPv4/UDP protocol.
Test scenarios are configured to simulate the described loosely coupled variant of the clustering algorithm with varying maximum distances of 100 m, 200 m, and 300 m. This covers the city scenario with variable signal propagation limitations. The other varying parameter for the simulation is the number of CHs per node (numCH), which defines the degree of connectivity redundancy and is set to 2, 3, and 4. The remaining parameters were set to fixed values: switchProbability to 0.4, numVehicles to 25, and metricTreshold to 10. Accordingly, the first 10 seconds of each vehicle's lifetime were ignored so that they would not influence the results, as the vehicle could not have been connected to any CH at that time.
The SUMO vehicle movement simulation configuration for Kirchberg was split into separate time slots of 5000 s to 6000 s, 6000 s to 7000 s, and so forth, until 9000 s to 10000 s, so there are 5 different vehicular traffic scenarios, each a 1000 s long run. The number of vehicles was lowered by using only every 4th vehicle, allowing for a more realistic simulation, as the original was meant to congest the roads. Each scenario is left running for 200 s so the network stabilizes before making the measurements, which cover the remaining 800 s of each run. The Kirchberg road network that was simulated spans an area of 7960 m × 10575 m and is shown in Figure 2.
During the test runs, the following variables were measured for both algorithms: the number of CHs a node is connected to, the number of role switches performed during the whole simulation run, and the average number of role switches per vehicle.

Results and Evaluation
For each pair of maximum communication distance and number of demanded CHs per node, 5 similar traffic scenarios were run and then averaged. On average, the movement of 395 vehicles has been simulated in each of those runs. On average, 58922 beacons were sent in each scenario, which amounts to the total run time in seconds for all the vehicles together. According to that, the average lifetime of a vehicle in the simulation was 149.17 seconds. The other measured variable was the total number of role switches in a simulation, but this is different for each run because it is affected by the maximum communication range and the number of wanted CHs per node.
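The average lifetime follows directly from the beacon count, since each vehicle sends one beacon per second while it is present in the simulation. A short check, with illustrative variable names:

```python
BEACON_PERIOD_S = 1.0    # beacon period used in the simulation
avg_beacons_sent = 58922  # average number of beacons per scenario
avg_vehicles = 395        # average number of vehicles per scenario

# Total vehicle-seconds divided by the number of vehicles.
avg_lifetime_s = avg_beacons_sent * BEACON_PERIOD_S / avg_vehicles

if __name__ == "__main__":
    print(f"{avg_lifetime_s:.2f} s")  # 149.17 s
```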
When sending a beacon (once a second), each node reported the number of cluster heads it was connected to, excluding itself. These reports were then counted over all the nodes, so the results show for how many seconds in the simulation the vehicles had no cluster heads to connect to, for how many seconds they were connected to only one cluster head, and so forth. The result is shown as a percentage of total time in the graphs so that the numbers from different runs are comparable.
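The aggregation described above amounts to a normalized histogram. A minimal sketch, assuming the per-beacon reports are available as a flat sequence of integers (the function name is hypothetical):

```python
from collections import Counter
from typing import Iterable


def connectivity_share(reports: Iterable[int]) -> dict[int, float]:
    """Percentage of total vehicle time spent connected to k cluster heads.

    reports holds one entry per node per beacon period: the number of
    cluster heads the node was connected to, excluding itself.
    """
    counts = Counter(reports)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {k: 100.0 * n / total for k, n in sorted(counts.items())}


if __name__ == "__main__":
    # Four vehicle-seconds: one with no CH, two with one CH, one with two CHs.
    print(connectivity_share([0, 1, 1, 2]))  # {0: 25.0, 1: 50.0, 2: 25.0}
```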
Figures 3 and 4 show the results obtained with fixed maximum communication ranges of 100 m and 200 m and different numbers of demanded cluster heads per node. It is clearly visible from the graphs that the algorithm tries to fulfill this requirement and that the distribution peaks very close to the demanded number. The share of nodes to the left of the demanded value shows the cases in which it was not possible to fulfill this requirement, as there were not enough nodes in the communication range. The increase of connectivity that is observable to the right of the demanded value, when comparing the graphs, is a side effect of the algorithm. As there are more cluster heads in the network to fulfill the requirement, some of the nodes happen to be in the communication range of more cluster heads than needed. It has been observed that the results for the 200 m and 300 m maximum communication ranges are almost identical for both algorithms and all simulation runs, so the graphs for 300 m are not presented. This is, in our opinion, related to the road network layout in the particular simulation scenario.
The other measured variable was the average number of all cluster head changes in the simulation run, whose results are presented in Table 1. The results confirm our expectations: in contrast to MOBIC, whose nodes switch the cluster head less often if the clusters are geographically larger, our algorithm does the opposite. This is a consequence of the cluster head elimination concept.
Figures 5, 6, and 7 show the results with a fixed demand of 2, 3, and 4 cluster heads per node and different maximum communication ranges. It is visible that the range of 100 m is not able to satisfy the demand as effectively as the other two ranges. Also, the peak at the demanded number in each graph is clearly observable. The results for ranges of 200 m and 300 m are almost identical and their graphs overlap. This implies that the critical communication range for the city road network in our scenario lies between 100 m and 200 m and that higher communication ranges do not produce any significant gain.
A special explanation is needed for the fact that, on occasion, the nodes were connected to fewer than the desired number of cluster heads. This is the expected behavior when a node is out of range of other nodes and cannot connect to any of them. The same applies to all the cases when the number of nodes in the communication range is less than desired. The second reason comes from the fact that the node does not count itself as a cluster head although it is functioning as one.

Conclusion and Future Work
In this paper, a new vehicle interconnection metric and clustering algorithm have been proposed and presented. Their key advantage lies in their implicit identification and exposure of vehicles with better long term connectivity regardless of their exact geographical location or moving direction. There is no dependence on any location service and its reliability and accuracy. The metric also effectively handles different signal propagation problems, obstacles, wireless medium congestion, and other unwanted effects. They are all implicitly covered in the metric, so the clustering algorithm can be much simpler, as it does not have to consider any of these limitations.
The given clustering algorithm is focused on providing ubiquitous and reliable connectivity between vehicles by using redundant links. Due to the fast convergence of the metric after network changes, the clustering algorithm quickly adapts to and recovers from topological changes, providing high reliability and good usability. It requires fewer cluster head changes than MOBIC and achieves higher connectivity between clustered vehicles.
Simulation results have also shown that there is no performance increase in VANET clustering when using a 300 m communication range instead of 200 m in a city scenario. So the observed optimal maximum communication distance is between 100 m and 200 m, which makes clustering suitable even for dense road networks with short road segments.
Our future plans include some extensions and optimizations of the algorithm. The area showing the most potential for improvement is the CH role switching, which is open for more sophisticated algorithms.
We also focus our research on the field of cluster usage, especially on distributed trust management between vehicles, where geographical locality has been shown to be an advantage.

Figure 2: The Kirchberg road network as used in the simulation.

Figure 3: Results of varying number of cluster heads demanded and maximum communication range of 100 m.

Figure 5: Results of varying maximum communication range and demanded 2 cluster heads.

Figure 6: Results of varying maximum communication range and demanded 3 cluster heads.
Algorithm 2: The metric penalization and clustering routine before beacon sending.

Table 1: Average number of cluster head changes.