Genetic Algorithm-Based Cooperative Coding and Caching Data Dissemination Scheme in Multi-UAV-Enabled Internet of Vehicles

Unmanned Aerial Vehicles (UAVs) have emerged as efficient tools in disaster-stricken areas, facilitating efficient data dissemination for post-disaster rescue operations. However, the limited onboard energy of UAVs imposes significant constraints on their operational lifespan, thereby presenting substantial challenges for efficient data dissemination. Therefore, this work investigates a data dissemination scheme to enhance the UAVs’ bandwidth efficiency in multi-UAV-enabled Internet of Vehicles, thereby reducing UAVs’ energy consumption and improving overall system performance when UAVs hover along designated flight trajectories for data dissemination. Specifically, first, we present a software-defined network-based framework for data dissemination in multi-UAV-enabled IoV. According to this framework, we formulate a problem called C2BS (Coding-based Cooperative Broadcast Scheduling) that focuses on optimizing the UAVs’ bandwidth efficiency by leveraging the combined benefits of coding and caching. Furthermore, we demonstrate the NP-hardness of the C2BS problem by employing a polynomial time reduction technique on the simultaneous matrix completion problem. Then, inspired by the benefits offered by genetic algorithms, we propose a novel approach called the Genetic algorithm-based Cooperative Scheduling (GCS) algorithm to address the C2BS problem. This approach encompasses a coding scheme for representing individuals, a fitness function for assessing individuals, operators (i.e., crossover and mutation) for generating offspring, a local search technique to enhance search performance, and a repair operator employed to rectify infeasible solutions. Additionally, we present an analysis of the time complexity for the GCS algorithm. Finally, we present a simulation model to evaluate the performance. Experimental findings provide evidence of the excellence of the proposed scheme.


Introduction
With the rapid advancement of communication technologies and the proliferation of mobile applications, Internet of Vehicles (IoV) [1,2] has emerged as a promising concept for achieving Intelligent Transportation Systems (ITSs), encompassing advancements in improving safety and security [3], enhancing driving experience [4], and providing entertainment services [5].Data dissemination through Vehicle-to-Vehicle (V2V) and Infrastructure-to-Vehicle (I2V) communications plays a pivotal role in diverse ITSs through Vehicle-to-Vehicle (V2V) and Infrastructure-to-Vehicle (I2V) communications.However, the performance of data dissemination is expected to decline because of the highly dynamic characteristics of vehicles [1] in IoV, particularly in complex environments resulting from natural disasters such as wildfires, storms, and earthquakes.On the one hand, infrastructures for wireless communication, including Base Stations (BSs) and RoadSide Units (RSUs), will be unable to perform normally to provide data service via I2V communication, since they may be destroyed in natural disasters.Poor network connectivity in highly dynamic communication environments may hinder data dissemination through V2V communication, resulting in intermittent communication connections and high end-to-end latency.On the other hand, for natural disasters, effective post-disaster rescue heavily relies on fast response to the disaster with low-latency data dissemination [6].Hence, data dissemination in disaster scenarios, where the existing infrastructure is unavailable or incapacitated, poses significant challenges that hinder effective support for post-disaster rescue efforts.
Unmanned Aerial Vehicles (UAVs) have been employed to enhance communication connectivity among vehicles and establish UAV-enabled IoV to improve network performance [7,8].The primary motivation behind this stems from their inherent characteristics, which include exceptional flexibility and cost-effectiveness in deployment [9].Inspired by this concept, UAVs have the potential to serve as airborne base stations or relay nodes for transmitting rescue messages to vehicles in areas impacted by natural disasters.This capability empowers UAVs to disseminate scheduling commands for post-disaster rescue operations to ground vehicles.However, the limited onboard energy of UAVs necessitates special consideration regarding the energy issue in UAV-enabled IoV.The energy consumption of UAVs primarily comprises propulsion (i.e., hovering and flying) energy and communication energy [10].The bandwidth efficiency of UAVs for data dissemination becomes crucial in influencing the overall energy consumption of UAVs when they hover along designated flight trajectories for a specific duration.For instance, if UAVs broadcast with constant power consumption, improving the bandwidth efficiency of UAVs can effectively reduce the number of broadcasting and service latency, thereby mitigating energy consumption aspects such as hovering and communication.Therefore, optimizing the bandwidth efficiency of UAVs for data dissemination facilitates efficient utilization of energy resources, thereby enhancing the endurance capabilities of UAVs and the overall system performance.Although the detailed methods for UAV deployment are beyond the scope of this research, they can be referenced from previous studies [7,11].As a result, this work aims to develop an effective data dissemination strategy that optimizes bandwidth efficiency for UAVs, enhancing UAV endurance capabilities and overall system performance in post-disaster rescue operations.
Relevant efforts have been dedicated to enhancing the system performance of wireless communication.On the one hand, the potential of network coding can greatly improve data dissemination performance, such as data rate and bandwidth efficiency [12][13][14].Due to sufficient storage of vehicles to cache data items, it is possible to improve the gain of network coding by coordinating the strategy of coding and caching.Compared to a system with only one UAV, where service capacity is constrained, multi-UAV cooperative service can significantly improve network performance.To achieve this, this work focuses on investigating and developing a coding-based scheduling algorithm for UAVs, aiming to optimize bandwidth efficiency by harnessing the combined advantages of coding and caching.On the other hand, great interest has been aroused in research on the service architecture by integrating Software-Defined Network (SDN) into IoV since SDN has shown capacities of flexible resource allocation and service management [15,16].The SDN framework aims at separating the network into control and data planes.Leveraging the global view of the entire network, the SDN controller facilitates flexible management for making scheduling decisions to improve network performance through its logically centralized control capabilities.
With the above motivation, this study proposes an SDN-based data dissemination architecture in multi-UAVs-enabled IoV to facilitate scheduling decisions for UAVs in a logically centralized manner, which lays the foundation for leveraging coding and caching to provide efficient data service in IoV.Moreover, inspired by the advantages of genetic algorithms for dealing with complicated optimization issues, this study presents a genetic algorithm-based scheduling scheme for data dissemination implemented at the SDN controller, aiming to enhance UAVs' bandwidth efficiency and optimize the overall per-formance of data dissemination.The primary contributions of this work are outlined as follows.

•
We propose an SDN-based data dissemination framework in multi-UAV-enabled IoV.Specifically, the SDN controller leverages the knowledge from the entire network, encompassing motion information, requested information, and caching information of vehicles, to generate coding-based scheduling strategies for UAVs.Then, UAVs execute these scheduling decisions to facilitate efficient data dissemination for vehicles.We develop a simulation model employing the proposed system framework, and perform an extensive performance evaluation by implementing the GPS data of taxis.Experimental findings show the effectiveness of the proposed scheme.
The remaining sections of this paper are organized as follows.The literature review is conducted in Section 2. We propose the system model in Section 3. Section 4 formulates the C2BS problem and establishes the NP-hardness of this problem.The GCS algorithm is proposed in Section 5, where we also analyze its time complexity.In Section 6, we perform the performance evaluation to validate the proposed scheme.we provide the conclusion in Section 7.

Related Work
Extensive studies have been conducted to explore the implementation of UAVs to enhance network performance.Zhan et al. [18] examined a communication system involving a UAV serving as a mobile BS to receive data from ground facilities.They proposed an optimization algorithm that utilizes path discretization and successive convex approximation to optimize the throughput of data uploading.An architecture aiming to optimize vehicular computation offloading using SDN and UAV assistance was proposed by Zhao et al. [19], wherein the system cost is minimized through deploying a UAV as a relay node.Yang et al. [20] investigated a system that combines UAVs with mobile edge computing, enabling a single UAV to cater to several energy-harvesting devices.To optimize energy consumption and task processing rate, they developed an innovative online algorithm that incorporates perturbed Lyapunov optimization.Dai et al. [21] studied a framework for offloading vehicular tasks with the assistance of UAVs, aiming to tackle the overloaded issue in vehicular edge computing.To reduce delays in vehicular tasks, they introduced an online algorithm for task offloading that utilizes Markov approximation optimization while considering energy limitations imposed on the UAV.Yan et al. [22] introduced a UAV-assisted mobile edge computing system for the IoV, and proposed a task offloading algorithm that integrates deep reinforcement learning into vehicular edge computing to minimize delays in the system.The above works consider scenarios involving only one single UAV.However, for complicated post-disaster rescue operations, the requirements of large-scale servers cannot be fulfilled solely by relying on a single UAV.Therefore, this work aims to enhance data dissemination performance in challenging scenarios by leveraging the capabilities of multiple UAVs.
Considerable efforts have been devoted to investigating the utilization of multiple UAVs in UAV-enabled IoV.Sami et al. [23] proposed a framework that employs UAVs for vehicular fog computing, aiming to enhance cluster management efficiency and ser-vice restoration, thereby improving the quality of service in the automotive domain.Ning et al. [24] studied the management of distributed multi-UAV trajectory control within UAV-assisted mobile edge computing networks, aiming to optimize users' short-term computational cost and minimize UAVs' long-term computational cost while delivering differentiated services.Liu et al. [25] proposed a mobile IoV framework that utilizes multiple UAVs to offer tracking services for vehicles.These UAVs transmit downlink information during flight to optimize system throughput by jointly optimizing vehicle communication scheduling, UAV power allocation, and UAV trajectory.Due to the absence of collaborative service-oriented modeling in distributed network settings, Kumar et al. [26] introduces a framework for ensuring quality of service in a UAV-supported aerial ad hoc network environment to devote to reliable aerial communication.In the context of emergency communication with the assistance of UAVs, Wang et al. [27] presented an authentication and key agreement protocol, which utilizes three components, namely smart cards, biometrics, and passwords along with the physically unclonable function to ensure the security of UAV-assisted emergency communication.Nevertheless, in UAV-enabled IoV, the energy constraints of UAVs pose significant challenges to the performance of data dissemination.Enhancing the UAVs' bandwidth efficiency becomes imperative for improving their endurance capabilities.Consequently, this work focuses on devising a data dissemination scheme for multiple UAVs to maximize bandwidth efficiency and optimize overall energy consumption.
Many studies have focused on exploiting network coding to enhance the overall performance of IoV.To enhance spectrum efficiency in IoV, Pazhoor et al. [12] proposed an index-coded Non-Orthogonal Multiple Access (NOMA) technique, which effectively minimizes the transmission count by integrating NOMA techniques with index-coded data.The practical implementation of DNN-PNC, which is based on Physical-layer Network Coding (PNC) and utilizes Deep Neural Network (DNN), was investigated by Wang et al. [13].They used the universal software radio peripheral platform to demonstrate the feasibility of employing DNN-based PNC for IoV.The frequent transmission errors lead to underutilization of channel resources, Pan et al. [14] studied a Ggraph-Theoretic method-based Network Coding (GTNC) approach to enhance resilience against transmission errors and proposed a GTNC-based cooperative concurrent distribution solution to coordinate V2V and I2V communications.To leverage the benefits of network coding, Liu et al. [28] explored the integration of fog computing into a software-defined heterogeneous IoV architecture and proposed a scheduling algorithm that utilizes clique-searching to tackle the cooperative service problem in fog-assisted environments, to minimize service delay.Bhatia et al. [29] presented a server architecture that utilizes network coding and incorporates multi-generation-mixing capabilities to improve the reliability and security of data transmission.Huang et al. [30] investigated the efficient allocation of system resources in an IoV system with multiple users by utilizing network coding.However, the performance of the coding-based scheme is significantly influenced by the crucial incorporation of caching.Therefore, this study aims to investigate a scheduling strategy that integrates coding and caching to enhance the data dissemination performance of UAVs in UAV-enabled IoV.
Distinguishing from prior research, this work focuses on exploiting multiple UAVs for data dissemination in post-disaster rescue operations.Considering UAVs have limited energy capacities and hover along designated flight trajectories for data dissemination, we are committed to maximizing the bandwidth efficiency of UAVs to enhance their endurance capabilities.Consequently, we propose an SDN-based data dissemination architecture in multi-UAV-enabled IoV and develop a customized scheduling algorithm for UAVs that leverages the combined benefits of coding and caching aiming to optimize the bandwidth resources of UAVs, thereby enhancing the overall performance of the system.

System Model
The SDN-based data dissemination framework in multi-UAV-enabled IoV is illustrated in Figure 1.In this framework, vehicles, and UAVs are regarded as switches in the data plane.The airship, equipped with a cellular network interface, serves as the SDN controller responsible for establishing the control plane to gather vehicle and UAV status data and oversee the entire network.Then, the SDN controller leverages the global view of the entire network to formulate and implement scheduling decisions encompassing resource allocation, routing determinations, and data scheduling.Data nodes perform scheduling decisions forwarded from the control plane.We consider a disaster-stricken area, where infrastructure is unavailable.Vehicles request data services for collective interests in post-disaster rescue operations, such as disaster information, road conditions, and weather updates.UAVs facilitate the provision of this information to vehicles through wireless communication.Note that a portion of the requested data might have been cached by vehicles before they entered the scenario.Based on this, UAVs store the requested data and are deployed and maintained in a hovering state for a specific duration to facilitate data dissemination for vehicles.It should be noted that the deployment methods of UAVs can be referenced from the literature mentioned in Section 1.We are dedicated to investigating a scheduling strategy that maximizes UAVs' bandwidth efficiency during the hovering state for data dissemination to enhance overall system performance.Subsequent redeployment of UAVs can follow predetermined trajectories.Additionally, vehicles in the coverage of UAV (indicated by the dashed circular shape shown in Figure 1) can obtain data via the equipped cellular network interfaces.To this goal, we propose a framework for UAV data dissemination that utilizes SDN to exploit centralized control and flexible scheduling.
The operational flow of the system is delineated as follows.First, vehicles and UAVs periodically transmit their status information, including location, velocity, direction, requested contents, and cached contents to the SDN controller via control messages upon entering the designated area.Furthermore, the SDN controller with a perspective of the entire network can predict vehicle trajectories, make informed scheduling decisions, and transmit control messages to UAVs.Consequently, UAVs possess the capacity to make scheduling decisions and provide data dissemination services for vehicles.A scheduling strategy is designed and implemented in the SDN controller, which utilizes network coding to efficiently facilitate data dissemination for vehicles.Note that detailed prediction solutions are beyond the scope of this study, they can be referenced from previous works [31,32].
The above framework serves as the fundamental basis for implementing network coding through informed scheduling decisions to enhance the performance of data dissemination.Given this, the SDN controller utilizes a technology based on bitwise exclusive-or (⊕) coding.The reason is as follows.First, the utilization of XOR coding-based operators incurs negligible overhead, thereby facilitating handling by UAVs.Moreover, the size of the encoded packet remains unchanged compared to that of the original one.For instance, given a packet p = d 1 ⊕ d 2 , where data items d 1 and d 2 are used for encoding, a vehicle cached one data item (e.g., d 1 ) can decode the other data item (e.g., d 2 ) in packet p by performing the computation of d 2 = d 1 ⊕ p.By scheduling an encoded packet, it becomes feasible to cater to multiple requested data items from diverse vehicles.Additionally, in comparison to traditional wireless sensor networks, vehicles have a higher capability for data caching and processing.Consequently, the encoded packets that cannot be immediately decoded can be cached by these vehicles.
We illustrate the proposed framework and challenges of scheduling algorithm design through an exemplification.As shown in Figure 1, suppose the database consists of d 1 , d 2 , d 3 , and d 4 , two UAVs, namely UAV 1 and UAV 2 , are deployed to facilitate data dissemination for a fleet of six vehicles (denoted as v 1 , v 2 , v 3 , v 4 , v 5 , and v 6 ).Furthermore, assuming a packet transmission time of 1-time slot and a maximum scheduling period of 3-time slots, it can be inferred that each UAV can broadcast up to three packets within every scheduling period.In addition, the gray square shows the cached data item.As shown in Figure 1, at time t 0 , the current cached packets are d  1 presents the results of vehicle trajectory prediction.
Given the above settings, we analyze the performance of three scheduling strategies on average service delay.The average delay is defined by the total time elapsed by all vehicles to obtain all data items in the database, divided by the total count of vehicles.Table 2 presents scheduling strategies and results.The detailed explanation of the scheduling process is elaborated upon in the following manner.where each element a(p) i is a binary variable.For instance, the element a(p) i in a(p) will be assigned a value of 1 if the data item d i (d i ∈ D) is included in the encoding process of p, and 0 otherwise.Similarly, a non-encoded data item d i (d i ∈ D) can alternatively be transformed into a coefficient vector a(p) of dimension |D|, where only the corresponding element a(p) i is set to 1 while all other elements are set to 0. Hence, in this paper, the term 'packet' can be used interchangeably to refer to both encoded and non-encoded packets without causing any confusion.In addition, we define the cached packet set of a vehicle (say denotes the total count of packets cached by vehicle v k .The essential symbols in this paper are outlined in Table 3. The C2BS problem is formulated based on the aforementioned illustration.Considering the set of vehicles V, the set of data items D, the set of packets cached by each vehicle C(v k )(v k ∈ V), and the set of UAVs U, the C2BS problem aims at maximizing UAVs' bandwidth efficiency by identifying a minimal group of packets, denoted as P, that can cater to all vehicles.

V
The set of vehicles D The set of data items in the database U The set of UAVs p An encoded or non-encoded packet C(v k ) The set of packets cached by vehicle v k A k The coefficient matrix of vehicle v k constructed by C(v k ) T The maximum scheduling period V j The set of vehicles in the radio coverage range of UAV u j (u j ∈ U) Λ g The g-th generation population M The total count of individuals in the population An individual in the g-th generation population λ The total count of D-dimensional coefficient vectors r(v k ) The rank of matrix The fitness function for evaluating individual The maximum count of generations

NP-Hardness Proof
In this section, we prove that the C2BS problem is an NP-hard problem by employing a polynomial-time reduction technique, which is applied to the simultaneous matrix completion problem known to be an NP-complete problem over GF (2).First, we provide an overview of the basic idea behind the problem of simultaneous matrix completion.The matrices in question consist of a mix of numerical values and symbolic variables, where each symbolic variable is exclusive to one matrix but may appear in multiple matrices.Then, this problem aims to simultaneously attain complete rank for all matrices by determining and assigning values for these symbolic variables.The subsequent section provides a comprehensive exposition of the proof process.
Theorem 1.The C2BS problem is an NP-hard problem.
Proof of Theorem 1. First, we examine a particular case of the C2BS problem, referred to as C2BS*.Furthermore, it is assumed that a single UAV is deployed in this scenario to facilitate coding-based data dissemination for all vehicles throughout the service period, thereby eliminating the need for coordination between multiple UAVs.Based on the above assumptions, the optimal solution to C2BS* is proven to be generated if and only if finding the vectors with the minimum number to achieve full rank simultaneously for all resulting matrices.
The cached packets per vehicle in the set V are employed for constructing a collection of matrices represented as A. In particular, let Moreover, it is important to perform Gaussian elimination technology on matrix A k to transform it into its reduced row echelon form, as the vectors in A k may show linear dependence.Then, the transformed matrix, denoted by )}.Furthermore, to address the C2BS* problem, the UAV schedules these packets in P * to serve all vehicles.Specifically, we append all coefficient vectors in Ω to each matrix in A ′ , resulting in the construction of a collection of matrices denoted by Then, suppose there exists a vehicle v α (v α ∈ V) that does not possess all the data elements.In this case, the matrix A * α (A * α ∈ A * ) does not have full rank, indicating the need for additional vectors to construct a matrix with full rank for A * α .To accommodate v α , it is necessary to schedule more packets.This goes against the previous assumption that P * represents the optimal solution, suggesting that all vehicles can efficiently attain all data items by scheduling these packets in P * .It is worth noting that a(p) is a coefficient vector of dimension |D| over GF (2).Consequently, it is equivalent for the C2BS* problem to identifying a minimal collection of coefficient vectors, ensuring that all matrices in A * simultaneously attain full rank.
The above evidence substantiates the NP-hardness of the C2BS* problem, while Theorem 1 establishes a specific case of the C2BS problem as proof.

Algorithm Design
With the development of optimization algorithms, it is promising to design a dedicated algorithm to solve the complex data dissemination problem in IoV.The genetic algorithm is an optimization algorithm used to simulate the evolution of living things over generations, in which a population of individuals (referred to as candidate solutions ) undergoes evolution towards better individuals.The genetic algorithm has shown competitiveness in solving combinational optimization problems, such as knapsack problem [33], task scheduling [34], and feature selection [35].Note that the C2BS problem entails the identification and scheduling of a minimal set of packets, wherein each packet can be transformed into a coefficient vector.Hence, the C2BS problem can be considered as a discrete combinatorial optimization problem.In light of this, we propose a GCS algorithm to address the C2BS problem.The following presents the brief procedure of the GCS algorithm.
The GCS algorithm commences with the initialization of a group of individuals, where each individual represents a candidate solution.Furthermore, parent selection is performed based on a dedicated fitness function to determine suitable parents for executing crossover and mutation operators to generate offspring individuals.Moreover, a local search technique is designed to enhance global exploration by leveraging local knowledge, while an effective repair operator is devised to handle infeasible solutions during evolution.Additionally, the offspring population is derived from both parents and the generated offspring solutions through population replacement.The above procedure is iteratively carried out until the predefined termination condition is reached.To better understand the process, we priorly give an overview of the GCS algorithm as a flow chart in Figure 2. The following are the presented steps of the GCS algorithm in detail.

Generation of Initial Population
Initialize population by Algorithm 1

Individual Evaluation
Evaluate fitness value of each by Equation 1Offspring Generation

Evaluate and sort the individuals in
Select the the top-M individuals as Λ !"#

Output:
the individual with the highest fitness value in

Generation of Initial Population
A population of individuals is initialized and denoted by Λ 1 = {λ 1 1 , λ 1 2 , . . ., λ 1 M }, where M is the size of population.In particular, as shown in Figure 3, an individual , which represents a candidate solution and is denoted by a binary matrix of size |U| × N, where |U| and N denote the overall count of UAVs and the overall count of the randomly generated |D|-dimensional coefficient vectors, respectively.Note that given a coefficient vector means that a k is selected and scheduled by UAV u j .Otherwise, λ 1 i [j][k] = 0. Additionally, the scheduling period allows for a maximum capacity of T packets to be scheduled.Given a feasible solution The process of population initialization is illustrated within Algorithm 1.

Fitness Function
The coefficient vector a(p) is appended to the coefficient matrix for each vehicle located within the radio coverage range of UAV u j when packet p is scheduled by UAV u j .Note that the C2BS problem aims to identify and arrange a minimal collection of packets P to cater to all vehicles.Then, the following fitness function is designed to evaluate solutions.Specifically, given a solution λ g i (λ ) represent the ranks of original and updated matrices, respectively.After a scheduling period, the average improvement in matrix rank for these vehicles can be calculated by Based on this, we define the following fitness function to evaluate solutions.

Offspring Generation
The operators with parent selection, crossover, and mutation are performed to evolve the generation.Specifically, first, the tournament selection scheme [36] is implemented for determining the two parents to allow them to pass their good genes to successive generations.Hence, given the g-th generation population Λ g , the underlying principle of the tournament selection scheme is to prioritize two individuals, denoted as λ g x and λ g y (λ g x , λ g y ∈ Λ g ), who exhibit the highest levels of fitness.The parent selection is shown in lines 2∼4 in Algorithm 2. Furthermore, crossover sites are randomly chosen from λ g x and λ g y .Then, a completely new offspring is created by exchanging the genes at these crossover sites.The process of the crossover operator is depicted in lines 5∼10 within Algorithm 2. Additionally, to preserve the variety of the population, the new population undergoes bit mutation with a predetermined probability ρ.The process of mutation operator is illustrated in lines 11∼18 within Algorithm 2.

Local Search
Local search is carried out on each solution in the population to search its neighborhood by steps for a better one.Specifically, given a solution λ g i (λ g i ∈ Λ g ), a local search is performed on λ g i from the first to the last dimension by conducting a sequential value flipping.Then, through sequential evaluation and comparison of new individuals, only the individual exhibiting the highest fitness value is accepted and saved in the population.The local search is shown in lines 24∼38 within Algorithm 2.

5.
Repair Operator It is worth noting that an individual may become an infeasible solution in evolution (i.e., crossover operator, mutation operator, and local search).In each scheduling period, the maximum number of packets broadcast by a UAV is T.Then, for a solution Hence, we design a repair operator to adjust an infeasible solution.In particular, given a solution λ g i , we traverse each row of

Algorithm 1 Generation of Initial Population
Input: M, U, N and T Output: for j = 1 : |U| do  end for 10: end for Algorithm 3 provides a detailed process of the GCS algorithm.The GCS algorithm concludes when it reaches the predefined maximum number of generations G max , which is a widely adopted approach in genetic algorithms [37,38].Consequently, the algorithm outputs the individual λ * with the highest fitness value in G max -th generation Λ G max .

Algorithm 2 Offspring Generation
Choose q individuals in Λ g randomly 3: Sort them by their fitness value in descending order for k = 1 : N do 6: end for 10: end for //Mutation Operator 11: for k = 1 : N do 13: for j = 1 : |U| do 14: Generate a number r within the range of [0, 1] in an arbitrary manner. 15: if r < ρ then 16: end for //Repair Operator perform repair operator on λ ′ temp ← λ ′ l 27: end for perform repair operator on temp Evaluate temp by Equation ( 1) end for

38:
end for 39: end for Algorithm 3 The GCS algorithm Sort the individuals in Λ g ∪ Λ ′ by their fitness value in descending order 8: g ← g + 1 10: end while 11: Output the individual with the highest fitness value in Λ G max

Time Complexity Analysis
The time complexity of the GCS algorithm is analyzed according to its procedural steps.For the population initialization, since a solution Based on the analysis provided above, it can be inferred that the GCS algorithm exhibits a time complexity of ).The parameters |U|, N, M, and G max are treated as constants in the GCS algorithm, with their values independent of the system's scale.Then, the time complexity has been decreased to O(|V| × |D| 2 ).Additionally, considering |D| as a constant or small scale in specific scenarios, such as post-disaster rescue, is justifiable due to its representation of the database size.The time complexity of the GCS algorithm exhibits a robust linear correlation with the number of vehicles in the service scenario, thereby rendering it practically applicable for addressing the C2BS problem in terms of time complexity.

Simulation Setup
We conduct a simulation model utilizing the system architecture outlined in Section 3. In particular, we extract taxi trajectories from a 3 km × 3 km region located in Haidian District, Beijing, China, during the time frame of 23:50 to 23:55 on 13 November 2015 to accurately mimic real-world scenarios, as shown in Figure 4.The extracted trajectories consist of a total of 277 vehicles with an average residence time of 213.6 s and a standard deviation in residence time of 10 s.Additionally, the average vehicle number per second is 198, with a standard deviation of 13.3 in terms of the number of vehicles.Moreover, four UAVs are deployed in simulation for data dissemination, while the algorithm is implemented using MATLAB.
To facilitate performance comparison, we implement three competitive algorithms.The first algorithm, MRF [39], employs a non-coded scheduling strategy, aiming to prioritize the data item with the highest count in the pending request queue.The second one, MRF_C [40], utilizes an encoded packet scheduling approach to broadcast pending requests with the maximum number.The last one, Round-Robin [41], adopts a sequential broadcasting scheme to transmit all data items in rotation.The default settings for the GCS algorithm comprise a population size of 100, assigning a total number of solution vectors N as 30, limiting the maximum number of iterations G max to 100, fixing the mutation probability ρ at 0.01, and restricting the maximum scheduling period T to 5. Additionally, the transmission rate of the UAV is configured to be 6 Mb/s, while the communication radius is set at 400 m.The database size is defined as 60, with each data item having a size of 18 Mb.Upon entering the simulation area, vehicles may possess cached data items that adhere to a Zipf distribution with a skewness parameter of 0.8 and a cache ratio set at 0.6 (referring to the ratio between total cached data items and those in the database).Unless otherwise specified, all simulations are conducted under these default settings.
The performance of algorithms is quantitatively evaluated by establishing metrics as follows.

•
Average Service Delay (ASD): The delay in service for vehicle v k (v k ∈ V) (denoted by t k ) is indicated as the duration between entering the simulated area and obtaining all data items in the database.The ASD is computed as the sum of service delays for all vehicles divided by the overall count of vehicles, as follows: • Average Service Productivity (ASP): Let r k and r ′ k denote the ranks of the original and new coefficient matrices, respectively, for vehicle v k ∈ V, while n k represents the number of scheduling packets during its service.The ASP is calculated as the sum of the improved rank for each vehicle per scheduling packet divided by the overall count of vehicles, as follows: • Average Service Ratio (ASR): After the simulation, let s denote the total count of vehicles that successfully retrieve all data items.Then, the SR is calculated as follows: The performance evaluation of algorithms across varying cache ratios is illustrated in Figures 5-7.Specifically, the ASD of four algorithms across varying cache ratios is depicted in Figure 5.According to Figure 5, an increase in the cache ratio results in a decrease in the ASD of all algorithms.The higher cache ratio results in more packets being cached by vehicles, reducing the overall system workload.The GCS exhibits superior performance across all ranges, thereby demonstrating the efficacy of its approach by leveraging the advantages of coding and caching in multi-UAV data dissemination scenarios.Figure 6 compares the ASP of four algorithms for various cache ratios.A higher ASP indicates on average more vehicles are served through per scheduling.As shown in Figure 6, the GCS consistently endeavors to uphold the upper limit of the ASP across various scenarios.Furthermore, it has been noted that the MRF and the Round-Robin underperform in terms of the ASP.As the cache ratio increases, vehicles cache a larger number of packets, resulting in a reduction in the number of vehicles served with nonencoded packets.Additionally, with the cache ratio gradually increasing, the difference between GCS and MRF_C is becoming less significant.The reason lies in the benefits of network coding.With an encoded packet, more vehicles can be served.This result confirms findings in [40].The SR of four algorithms across varying cache ratios is depicted in Figure 7.When the system works with a higher cached ratio, vehicles cache more packets, and all the algorithms should have better performance on SR.As illustrated in Figure 7, when the cache ratio is increased, the SR of all algorithms is raised.Moreover, the comparison results indicate that MRF_C outperforms MRF, thereby demonstrating the potential of MRF_C to enhance bandwidth efficiency in UAVs through network coding.In addition, comparing the GCS with the MRF_C, the result shows the superiority of the GCS by exploiting the benefit of coding and caching.

Effect of Database Sizes
Figures 8-10 show the performance of algorithms across varying database sizes.Specifically, the ASD of four algorithms is compared with different database sizes in Figure 8.
As depicted in Figure 8, the ASD of all algorithms exhibits an increase when the database size expands.The expansion of the database size leads to an increase in the overall system workload.Nevertheless, the GCS consistently exhibits the lowest ASD compared to other algorithms across all scenarios.The ASP of four algorithms across varying database sizes is illustrated in Figure 9.An algorithm with higher ASP states the better performance on improving the bandwidth efficiency.As illustrated in Figure 9, GCS consistently outperforms other algorithms across all scenarios.The MRF_C outperforms the MRF and the Round_Robin by leveraging coding-based scheduling decisions.
Figure 10 compares the SR of four algorithms across varying database sizes from 40 to 80.According to Figure 10, the decrease in the SR of all algorithms is observed with an increase in the database size, owing to the necessity of scheduling more packets to serve all vehicles when dealing with a large-scale database.However, the GCS consistently demonstrates superior performance in terms of the SR across all ranges, which further demonstrates the effectiveness of the GCS by making full use of coding and caching.

Effect of Data Sizes
Figures 11-13 evaluate the performance of four algorithms across varying data sizes.Specifically, Figure 11 compares the ASD of four algorithms across varying data sizes.Due to the fixed transmission rate, the increase in data size leads to a proportional escalation of transmission delay, thereby imposing a higher burden on the system workload.As shown in Figure 11, when the data size is increased, the ASD gets higher.However, the GCS still exhibits the lowest ASD. Figure 12 exhibits the ASP of four algorithms across varying data sizes.As shown in Figure 12, the heavy system workload prevents four algorithms from completing service before vehicles leave the scenario.However, the GCS demonstrates superior performance in terms of optimizing bandwidth efficiency compared to other algorithms.Moreover, when the data size continues to increase, the disparity between the GCS and the MRF_C becomes more pronounced, thereby highlighting the superiority of the GCS.
Figure 13 shows the SR of four algorithms across varying data sizes.As shown in Figure 13, the SR of four algorithms is decreased when the data size becomes bigger.However, the GCS consistently maintains the highest success rate across all scenarios, thereby further substantiating its efficacy and scalability.

Conclusions
This work investigated a data dissemination framework integrated SDN in multi-UAVenabled IoV.Based on this framework, the C2BS problem was formulated to maximize UAVs' bandwidth efficiency for enhancing overall system performance.Then, we proved that the C2BS problem is an NP-hard problem and proposed a scheduling algorithm based on the genetic algorithm to address the C2BS problem.Additionally, the time complexity of the GCS algorithm was analyzed in order to demonstrate its practicality.Finally, the performance of the GCS algorithm was evaluated in comparison with alternative algorithms.The simulation findings show the effective performance of the proposed scheme in enhancing UAVs' bandwidth efficiency and optimizing the performance of the system.
As future work, our research can be extended by considering collaboration between UAVs and existing infrastructures (e.g., BSs and RSUs) to enhance scalability in UAVenabled IoV scenarios.Moreover, we will investigate efficient algorithms by considering the features and constraints of UAVs to enhance the performance of multi-UAV enable IoV.
1 and d 2 for v 1 , d 2 and d 4 for v 2 , d 2 and d 3 for v 3 , d 3 and d 4 for v 4 , d 2 and d 4 for v 5 , and d 1 and the set of vehicles, where |V| denotes the overall count of vehicles.The set of UAVs is symbolized as U = {u 1 , u 2 , . . ., u |U| }, where |U| denotes the total count of UAVs.Let V j = {v j 1 , v j 2 , . . ., v j |V j | } (V j ⊆ V) denote the set of vehicles that fall within the radio coverage range of UAV u j (u j ∈ U).The set of data items is designed as D = {d 1 , d 2 , . . ., d |D| }, where |D| represents the overall count of data items in the database.In the XOR-based coding scheme, the encoded packet p can be represented by a coefficient vector a(p) of dimension |D|, denoted by a(p) = {a(p) 1 , a(p) 2 , . . ., a(p) |D| },

Figure 2 .
Figure 2. Flow chart of the GCS algorithm.
a k is attached to each vehicle's coefficient matrix.In a scheduling period, the overall count of packets scheduled by UAV u j is denoted byb j = ∑ N k=1 λ g i [j][k].As shown in Figure 3, for the solution λ i , the total number of scheduling is represented by b num = max{b 1 , b 2 , . . ., b |U| }.Then, according to the solution λ g i , for each vehicle v j k (v j k ∈ V j ) in the coverage of the UAV u j , let r(v j k ) and r ′ (v j k nonzero elements and set them to 0. Consequently, we have ∑ N k=1 λ g i [j][k] = T by performing the repair operator.6. Population Replacement Population replacement is to update the population and keep the population size steady.To achieve this, new individuals are inserted into the current population.Then, we sort the individuals by their fitness values in descending order.Finally, the top-M individuals are selected to form the new population for the next evaluation.The process of population replacement is illustrated in lines 7∼8 within Algorithm 3.

Figure 3 .
Figure 3. Illustration of the individual λ 1 i in the 1st generation Λ 1 .

do 25 :
for each offspring individual λ ′ l ∈ {λ ′ 2i−1 , λ ′ 2i } for k = 1 : N do 26: traverses |U| × N bits to construct.The time overhead for initializing the population is O(|U| × N × M), with M representing the size of the population.In offspring generation, for the parent selection operator, we randomly choose k individuals and sort them by their fitness values in descending order.According to Algorithm 2, the operation is required for matrices of |V| vehicles, with each matrix having a maximum size of |D| × |D|.The crossover operator requires traversing two offspring individuals with a time overhead of 2 × |U| × N.For the mutation operator, it is also required to traverse two offspring individuals.For the repair operator, at worst N times repair operator is required to be implemented for each offspring individual.Based on this, the time overhead for generating offspring isO(k × |V| × |D| 2 + 2 × N + 2 × N + 2 × |U| × N 2 ),which can be further simplified to O(|V| × |D| 2 + |U| × N 2 ).The local search requires operating on each bit in the two offspring individuals, applying the repair operator to them, and evaluating their fitness.Consequently, the time overhead of the local search can be expressed as O(2 × N + 2 × |U| × N 2 + 2 × N × |V| × |D| 2 ), which can be further simplified to O(N × |V| × |D| 2 + |U| × N 2 ).The loop termination condition is determined by G max .

Table 2 ,
UAV 1 broadcasts d 1 ⊕ d 4 and d 2 ⊕ d 3 at times t 0 and t 1 , respectively, while UAV 2 broadcasts d 1 ⊕ d 2 , d 3 ⊕ d 4 and d 1 ⊕ d 2 at times t 0 , t 1 and t 2 , respectively.Taking v 3 as an exemplification, at time t 0 , because the encoded packet d 1 ⊕ d 4 broadcast by UAV 1 cannot be decoded, v 3 cannot be served.Then, it can decode out d 4 from d 3 ⊕ d 4 broadcast by UAV 2 at time t 1 .To completely serve v 3 and v 4 , UAV 2 has to broadcast d 1 ⊕ d 2 at time t 2 .3 will enter into the coverage area of UAV 2 at t 1 , and the d 1 ⊕ d 2 will be broadcast by UAV 2 .Then, v 3 can retrieve d 1 when d 1 ⊕ d 2 is broadcast by UAV 2 .Meanwhile, v 3 can obtain d 4 by decoding previously cached d 1 ⊕ d 4 .All vehicles receive service within 2 time slots, with a delay of 2 time slots per vehicle.Consequently, the average delay for the CCS is merely 2 time slots.
•Independent Scheduling (IS): In this scenario, UAVs independently provide data services without considering network coding.As illustrated in Table 2, UAV 1 broadcasts d 1 , d 3 at times t 0 , t 1 , respectively, while UAV 2 broadcasts d 1 , d 2 and d 4 at times t 0 , t 1 and t 2 , respectively.Taking v 5 as an exemplification, it receives d 1 broadcast by UAV 2 at t 0 .UAV 2 broadcasts d 2 and d 4 at time t 1 and t 2 , respectively.Except for vehicle v 5 , all other vehicles can be completely served, requiring an additional time slot to broadcast d 3 for serving v 5 .The vehicles experience service delays of 3, 2, 3, 2, 4, and 3 time slots.As a result, the average delay for the IS is approximately 2.8 time slots • Coding-based Individual Scheduling (CIS): In this scenario, UAVs broadcast individually by implementing network coding.As shown in 1 broadcasts d 1 ⊕ d 4 for v 1 , v 2 , v 3 , v 4 at time t 0 and d 3 for v 1 , v 2 at time t 1 , respectively.UAV 2 broadcasts d 3 ⊕ d 4 for v 5 , v 6 at time t 0 and d 1 ⊕ d 2 for v 3 , v 4 , v 5 , v 6 at time t 1 , respectively.The scheduling procedure is outlined in Table 2. Taking v 3 as an exemplification, d 1 ⊕ d 4 broadcast by UAV 1 cannot be immediately decoded at time t 0 , but d 1 ⊕ d 4 will be stored by v 3 since with the vehicle trajectory prediction, v

Table 1 .
The results of vehicle trajectory prediction

Table 2 .
The different scheduling strategies

Table 3 .
Overview of symbols.
vectors with dimensions of |D|.In particular, the condition |A ′ k | == |D| suggests that matrix A ′ k has full rank and can be converted into an identity matrix by using Gaussian elimination approach.If |A ′ k | ≤ |D|, to ensure the acquisition of all data items for vehicle v k , it is necessary to append at least |D| − |A ′ k | coefficient vectors with dimensions of |D| to construct a matrix with full rank.As stated in Section 4.1, a coefficient vector of dimension |D| represents either an encoded or non-encoded packet.The minimum number of packets, denoted as |D| − |A ′ k |, are determined and scheduled to ensure that v k can retrieve all the data items.Then, we define the smallest number of coefficient vectors in A ′ as γ = min ({|A ′ 1 |, |A ′ 2 |, . . ., |A ′ |V| |}).Based on the above analysis, the scheduling of a minimum number of |D| − γ coefficient vectors is necessary to ensure that all matrices in A ′ attain maximum rank.Consequently, the C2BS* problem aims to identify a minimal packet set that can accommodate all vehicles.Conversely, the optimal solution for the C2BS problem, denoted as P * = {p * 1 , p * 2 , . . ., p * |P * | }, is assumed to consist of a set of |P * | packets.The set Ω is constructed basen on P * , which consists of |P * | coefficient vectors with dimensions of |D| and is denoted as G max Output: λ * with the highest fitness value in G max -th generation Λ G max //Generation of initial population 1: g = 1 2: Λ 1 ← Generate a initial population by Algorithm 1 3: Evaluate fitness value f (λ 1 i ) of each λ 1 i ∈ Λ 1 by Equation (1) 4: while g < G max do //Offspring Generation Evaluate fitness value of each individuals in Λ g ∪ Λ ′ by Equation (1) 7: