Dependency-aware online task offloading based on deep reinforcement learning for IoV

The convergence of artificial intelligence and in-vehicle wireless communication technologies promises to fulfill the pressing communication needs of the Internet of Vehicles (IoV) while promoting the development of vehicle applications. However, making real-time dependency-aware task offloading decisions is difficult due to the high mobility of vehicles and the dynamic nature of the network environment. This leads to additional application computation time and energy consumption, increasing the risk of offloading failures for computationally intensive and latency-sensitive applications. In this paper, an offloading strategy for vehicle applications that jointly considers latency and energy consumption in the base station cooperative computing model is proposed. Firstly, we establish a collaborative offloading model involving multiple vehicles, multiple base stations


Introduction
5G mobile communication technology enables the interconnection of all things by establishing a more efficient, intelligent, and reliable heterogeneous communication network, achieving ultra-low latency and high-speed data transmission [1][2][3][4][5]. With the support of intelligent network technology based on AI, edge computing, and the Internet of Things [6], IoV technology deeply integrates cars with the network. It enables intelligent interconnection among vehicles and between vehicles and road networks [7][8][9]. This brings users a more intelligent, convenient, and secure travel experience. With the rapid growth in the number of intelligent vehicles, a large number of vehicular applications have appeared. Novel vehicle applications continue to evolve, such as driver assistance, autonomous driving, collision warning, and high-precision mapping. This development not only brings new business models and growth opportunities to the automotive industry, but also enhances convenience and enjoyment in people's travel and daily lives [10,11]. At the same time, novel vehicle applications impose higher requirements on the IoV system, including computational capacity, processing latency, and energy consumption. More computational resources are necessary to ensure the high reliability and efficiency of data processing [12]. As edge data in the IoV grows, the system's limited capacity cannot process all of it in time. Cloud computing architectures suffer from high latency and low reliability due to long-distance transmission. Meeting the quality of service requirements of real-time vehicular applications is therefore a challenge [13]. Edge computing provides cloud services at the edge through wireless access networks. This extends cloud computing capabilities closer to the user side and significantly reduces the transmission latency of user service requests [14,15]. Vehicle edge computing is a new computing paradigm that has been gaining attention in the field of intelligent transportation systems in recent years [16]. Offloading vehicle applications from resource-constrained vehicles to lightweight, ubiquitous edge servers can enhance vehicle performance. Application offloading decision-making is primarily about how to offload vehicle applications and where to offload them. The main optimization objectives of offloading decisions include latency, energy consumption, and trade-offs between latency and energy consumption [17].
However, the rapid expansion of intelligent vehicles has led to the emergence of a plethora of vehicle applications in the IoV. Edge servers are at risk of overload in computation, storage, and bandwidth resources. Offloading all vehicle applications to a single edge server may cause overload and limit performance gains. When performing edge collaborative offloading of vehicle applications, it is essential to consider the impact of task dependencies on application latency [18]. For example, there are dependencies between the components of a vehicle augmented reality system, such as target tracking, model mapping, target recognition, perspective transformation, and fusion processing. Considering task dependencies in the task offloading method helps ensure that multiple vehicle applications are completed promptly.
Due to the high mobility of vehicles, it is crucial for the system to make offloading decisions for vehicle applications in a timely manner and to complete the applications within extremely low latency. Otherwise, there is a risk of offloading failure for vehicle applications. Additionally, making offloading decisions is difficult because of the dynamically changing vehicular network environment and the rapidly changing wireless channel conditions.
In this paper, we propose a Dependency-aware Online Task Offloading method to address the above problems. This method takes the interdependencies among multiple tasks into consideration. Vehicles offload applications consisting of tasks with multiple dependencies to the edge servers of base stations. Edge servers deploy the dependency-aware online task offloading method to autonomously make offloading decisions in a dynamic and uncertain IoV. An edge server decides whether to execute a dependent task itself or to transfer it to another edge server for execution. By leveraging the computational resources of nearby edge servers to promptly process and fulfill emergency information or services for vehicles, we can enhance the quality of service for vehicles and reduce the energy consumption of IoV systems.
The main contributions of this study are as follows:
1. We establish a model for edge collaborative offloading. The model consists of multiple vehicles, multiple base stations (BS), and multiple edge servers (ES). The model divides complex applications into multiple tasks and uses a DAG to model the data dependencies among tasks.
2. We propose a dependency-aware online DAG task offloading scheme that handles the data dependencies present in task offloading. The scheme defines application and task priorities and models the data dependencies between tasks. We formulate the offloading problem as a Markov Decision Process (MDP) that treats the DAG tasks as part of the environment state to obtain a suitable offloading decision.
3. We conduct extensive experiments to demonstrate the effectiveness of the method in reducing the completion time of multiple applications and the total energy consumption of edge servers. The proposed DAG-DQN is also superior in improving the completion rate of applications and the efficiency of system decision making.
The remainder of this paper is organized as follows: the Related work section summarizes related work; the System architecture section describes the system architecture and model in detail; the Dependency-aware task offloading method section presents the dependency-aware task offloading method; and the Experimental evaluation section presents the experimental evaluation. Finally, the Conclusion and future work section concludes this paper and discusses future work.

Related work
This section discusses the current state of research on task offloading in edge computing. Works on latency, energy consumption, and task dependency are then presented, and related works on task offloading are compared in Table 1.
Task latency is the main optimization objective for executing task offloading in IoV [29]. Fan et al. [19] proposed a collaborative resource management scheme to minimize the processing latency of vehicle tasks. This scheme implements task offloading, channel assignment, and allocation of computational resources to vehicles and roadside devices. Zhou et al. [20] devised a risk-sensitive reinforcement learning algorithm for online task offloading decisions, aimed at reducing the latency of computational tasks. Deng et al. [21] introduced an autonomous partial offloading system for latency-sensitive computational tasks, providing users with minimum-latency offloading services. Although these schemes effectively reduce task latency, they ignore the fact that edge devices consume large amounts of energy while performing highly complex computational tasks.
Reducing energy consumption is critical to improving IoV system performance, and it is another key optimization objective for offloading decisions. Energy consumption typically encompasses computational and communication energy costs. Reducing the computational and communication energy costs of edge servers has significant implications for offloading decisions [30]. Many schemes jointly consider latency and energy consumption as optimization objectives for offloading decisions. Shinde et al. [22] proposed using a value iteration algorithm to seek the optimal strategy in uncertain vehicular environments, jointly optimizing total latency and energy consumption during vehicle task processing. Ning et al. [23] proposed an efficient and energy-saving task offloading framework that supports mobile edge computing, jointly considering task offloading between servers and the downlink energy consumption of roadside units. This framework aims to minimize the energy consumption of roadside units under task latency constraints. Maleki et al. [24] considered energy consumption, transmission delay, and processing delay during offloading as offloading costs. A switchable offloading strategy is proposed to improve service quality and user experience. Zhao et al. [25] proposed a task offloading method based on deep Q-networks to achieve an optimal balance between task execution latency, processing speed, and energy consumption.
While these works consider reducing task latency and the energy consumption of edge devices, they do not consider the impact of task dependencies on offloading decisions. They treat applications as independent tasks and do not divide applications into multiple tasks. Complex applications can be divided into multiple tasks that can be executed independently on different edge servers [31]. Dependencies between tasks determine the execution order of tasks. Therefore, offloading methods should consider task dependencies. Liu et al. [26] considered task dependency requirements between multiple tasks of the same application and application completion time constraints to prioritize multiple applications and tasks. They defined the task offloading problem as an NP-hard optimization problem. Qi et al. [27] proposed a knowledge-driven decision framework for service offloading. Complex services are considered as multiple data-dependent tasks to reduce task response latency. He et al. [28] proposed a task replication-based DAG task offloading method that uses a DAG to model an application with a set of distributed tasks. Assigning tasks to different computing devices ensures that DAG tasks can be accomplished with lower latency.
In conclusion, these works do not consider the impact of data dependency requirements between tasks on offloading decisions, nor the impact of task dependencies on application completion time and edge device energy consumption. Therefore, they cannot be directly applied to vehicular application offloading in base station cooperative computing. In addition, prioritization must be performed before application processing, with fine-grained prioritization of tasks with data dependencies. Application completion time and energy consumption are optimized using collaborative offloading between edge servers.
To improve the ability to solve the dependent task offloading problem in scenarios involving multiple vehicles, base stations, and edge servers, we propose a vehicle application offloading method for base station cooperative computing scenarios that integrates application completion time and energy consumption. The dependencies between tasks are represented as directed acyclic graphs. A deep reinforcement learning algorithm that incorporates DAG task coding is used to enable the agent to perceive task dependencies. This method solves the dependent task offloading problem and reduces the risk of vehicle application offloading failure.

System architecture
The system architecture is introduced first. Subsequently, the relevant models and problem formulations are defined. Details of the symbols used in the paper are shown in Table 2 for reference.

System architecture
Vehicles in IoV can use different types of resources and can offload tasks to edge servers for fast processing. However, in the event of a sharp increase in workload, the task completion time will also escalate. If additional edge servers are hastily deployed at the base station, improper configuration in the short term will require additional human and material resources, resulting in wasted computing resources. Therefore, we propose a task offloading strategy in the inter-base station cooperative computing model to assist edge servers in managing the tasks offloaded by vehicles.
Figure 1 illustrates the system architecture of vehicular edge computing, composed of vehicles, BSs, and ESs. Suppose a crossroad is covered by K BSs located along the road, each with a fixed communication range. Each BS is equipped with an ES of varying performance and limited computational resources. The BSs located on either side of the road can communicate with each other, collectively forming an undirected topological structure.
Suppose M vehicles located at different positions are traveling at a constant speed v. Each requesting vehicle has a computationally intensive and latency-sensitive application that needs to be offloaded within strict deadlines. Each requesting vehicle performs offloading only once. Different vehicles have different applications to offload, and these applications are independent of each other. Offloading and execution do not interfere with each other. The requesting vehicle offloads its application to the ES for execution via the BS. The ES receives the application data from the requesting vehicle, performs the appropriate computational tasks, and returns the results to the vehicle. The application is denoted as $APP_m$. Its description includes $D$, the amount of data to be returned to the requesting vehicle after the ES processes application $m$, and $T_m^{max}$, the maximum deadline allowed for completing application $m$. If the application exceeds the maximum deadline, its processing fails. A failed application becomes worthless, so it must be completed on the edge node within the given deadline.
Within the system architecture, each base station is equipped with K sets of transceiver antennas, utilizing K × K channels for communication. The communication process is regarded as a multiple input multiple output (MIMO) system [32]. The transmission rate S of the MIMO architecture is:

$$S = B \log_2 \det\left(I_K + \frac{SNR}{K} X X'\right) \quad (1)$$

where $K$ denotes the number of transmit and receive antennas, $B$ denotes the channel bandwidth, $SNR$ denotes the average signal-to-noise ratio, and $I_K$ denotes the $K \times K$ identity matrix. $X$ represents a normalized channel matrix composed of $x_{i,j}$ $(1 \le x_{i,j} \le K)$, with its conjugate transpose denoted by $X'$ [33].

Table 2 Symbols used in the paper
$T^{app}$: Application processing time
$T^{total}$: Application completion time
$E_m$: Total energy consumption spent to complete application m
$\alpha, \beta$: Weighting factors
Further entries: transfer time of application m offloaded to BS; transmission time between task i and task j; the time it takes application m to return results.
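As a quick sanity check of the rate expression, consider the idealized case in which the normalized channel matrix is the identity, $X = I_K$. This choice is an assumption made here purely for illustration and is not a setting taken from the experiments. Assuming the standard MIMO rate form $S = B \log_2 \det(I_K + \frac{SNR}{K} X X')$, the determinant factorizes and the rate reduces to that of $K$ parallel single-antenna channels, each receiving a $1/K$ share of the signal-to-noise ratio:

```latex
\begin{aligned}
S &= B \log_2 \det\!\left(I_K + \frac{SNR}{K}\, I_K I_K'\right) \\
  &= B \log_2 \det\!\left(\left(1 + \frac{SNR}{K}\right) I_K\right) \\
  &= B \log_2 \left(1 + \frac{SNR}{K}\right)^{K} \\
  &= K B \log_2\!\left(1 + \frac{SNR}{K}\right).
\end{aligned}
```

For $K = 1$ this collapses to the familiar single-link expression $B \log_2(1 + SNR)$.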

Task dependency model
Each vehicle application can be divided into multiple tasks, which are interconnected by data dependencies.
Some tasks require the output data of other tasks as input. As illustrated in Fig. 2, the data dependencies among the tasks of a real-time navigation application in vehicles can be represented visually: data is transferred between different tasks to achieve the real-time navigation functionality. Considering tasks as computational units, each application can be modeled as a directed acyclic graph (DAG), with $N_i$ representing a task and $E_{i,j} \in E$ a data edge, where $E$ is the set of data edges. A data edge $E_{i,j}$ indicates that the execution of task $N_j$ depends on the results of task $N_i$. In the DAG, nodes represent tasks, and the value on node $i$ signifies the amount of data processing required to complete task $N_i$, denoted as $c_i$. The edges between nodes represent data dependencies between two tasks, and the weight of an edge signifies the amount of data transmitted between the two dependent tasks, denoted as $d_{i,j}$, the data size transferred from task $i$ to task $j$. Figure 2 depicts the process of modeling the vehicle navigation application as a DAG of tasks. To maintain uniformity in the structure of the DAG task model, if an application has multiple initial or terminal tasks, a virtual task is defined as a new initial or terminal task. In Fig. 2b, the "Obtain Vehicle Location Information" and "Map Data" modules correspond to tasks $N_2$ and $N_3$ in Fig. 2c. Since these two tasks only send data without receiving any, a virtual task $N_1$ is defined to connect tasks $N_2$ and $N_3$. The virtual task $N_1$ does not require any data processing or transmission. In this scenario, the virtual task $N_1$ serves as the new initial task, while $N_6$ acts as the terminal task. Tasks $N_2$ and $N_3$ are predecessors of task $N_4$, with task $N_4$ depending on the execution results of $N_2$ and $N_3$. Task $N_4$ is a predecessor of tasks $N_5$ and $N_6$, and task $N_5$ is a predecessor of task $N_6$. Each task must wait for its predecessors to complete before it can commence execution. Tasks connected in series cannot be executed in parallel.
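The rule that a task may start only after all of its predecessors have finished can be illustrated with a short Python sketch. It encodes the six-task navigation DAG described above as a hypothetical edge list (the names `EDGES` and `topological_order` are illustrative, not taken from the paper) and derives one valid execution order with Kahn's algorithm.

```python
from collections import deque

# Hypothetical encoding of the navigation DAG: (predecessor, successor) pairs.
# N1 is the virtual initial task, N6 the terminal task.
EDGES = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (4, 6), (5, 6)]


def topological_order(num_tasks, edges):
    """Return one execution order that respects every data dependency."""
    successors = {i: [] for i in range(1, num_tasks + 1)}
    indegree = {i: 0 for i in range(1, num_tasks + 1)}
    for pre, suc in edges:
        successors[pre].append(suc)
        indegree[suc] += 1

    # Tasks without unfinished predecessors are ready to run.
    ready = deque(i for i in range(1, num_tasks + 1) if indegree[i] == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for suc in successors[task]:
            indegree[suc] -= 1
            if indegree[suc] == 0:
                ready.append(suc)

    if len(order) != num_tasks:
        raise ValueError("dependency graph contains a cycle")
    return order


ORDER = topological_order(6, EDGES)
print(ORDER)
```

Tasks $N_2$ and $N_3$ have no mutual dependency, so either may run first or both may run in parallel on different servers, whereas $N_4$, $N_5$, and $N_6$ form a chain that must run in sequence.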

Application completion time
The completion time of each application consists of three parts: (1) the wireless transmission time to upload the application to the BS; (2) the processing time of the application on the ES; and (3) the wireless transmission time for the BS to send the processed results of the application back to the requesting vehicle. The wireless transmission time incurred when application $m$ is offloaded from the requesting vehicle to the nearby BS is given by Eq. (2). The processing time of an application in turn consists of three parts: task computation time, task-to-task data transmission time, and task waiting time. The computation time of task $i$ depends on $f_k$, the computational capability allocated to the task by ES $k$.
For two tasks on the same ES, there is no need to consider data transmission time. However, tasks located on different ESs require data transmission. Define $\delta_{i,j}$ to indicate whether tasks $i$ and $j$ under the same application are offloaded to the same ES.
When the amount of data transmitted from the current task $i$ to the subsequent task $j$ is $d_{i,j}$, the transmission time follows from $d_{i,j}$ and $\delta_{i,j}$. Each ES maintains a task queue in which tasks awaiting processing are stored. An ES can only compute one task at a time and processes the tasks in its queue on a First-Come-First-Serve basis. The waiting time for task $i$ is defined as $T^{wait}_i$. Let $t^{start}_i$ and $t^{end}_i$ denote the actual start time and actual end time of task $i$, while $pre(i)$ and $sub(i)$ represent the sets of predecessor tasks and successor tasks of task $i$, respectively. If a task has no predecessor tasks, it is referred to as a starting task and denoted as $N_{in}$; if a task has no successor tasks, it is considered an ending task and denoted as $N_{out}$. For the starting task $N_{in}$, it holds that $t^{start}_{N_{in}} = 0$. The actual start time of any task cannot precede the completion time of any of its predecessor tasks, which determines the start time $t^{start}_i$ and end time $t^{end}_i$ of task $i$. The processing time of the entire application is given by Eq. (8). After the ES completes processing application $m$, the BS returns the processed data results to the requesting vehicle; the transmission time required for returning the results is given by Eq. (9). Finally, combining Eqs. (2), (8), and (9) yields the completion time of the application.
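The interplay of computation, transmission, and waiting time can be made concrete with a minimal scheduling sketch. It rests on several assumptions that are not stated in the paper: computation time is taken as $c_i / f_k$, transmission time as $d_{i,j}$ divided by a single ES-to-ES rate and zero when both tasks share an ES, and each ES serves tasks in the order in which they are visited. The function name `schedule` and the numbers in the example are hypothetical.

```python
def schedule(order, edges, compute, assign, capacity, rate):
    """Return {task: (start, end)} for tasks visited in dependency order.

    edges:    {(i, j): d_ij}  data volume sent from task i to task j
    compute:  {i: c_i}        data volume task i has to process
    assign:   {i: k}          ES chosen for task i
    capacity: {k: f_k}        processing capability of ES k
    rate:     ES-to-ES transmission rate (assumed identical for all links)
    """
    es_free = {k: 0.0 for k in capacity}  # time at which each ES becomes idle
    times = {}
    for j in order:
        ready = 0.0  # a starting task is ready at time 0
        for (i, succ), d_ij in edges.items():
            if succ != j:
                continue
            # Transmission time is only paid when the tasks sit on different ESs.
            transfer = 0.0 if assign[i] == assign[j] else d_ij / rate
            ready = max(ready, times[i][1] + transfer)
        k = assign[j]
        start = max(ready, es_free[k])  # wait while the ES is still busy
        end = start + compute[j] / capacity[k]
        es_free[k] = end
        times[j] = (start, end)
    return times


# Hypothetical diamond-shaped application: 1 -> {2, 3} -> 4.
TIMES = schedule(
    order=[1, 2, 3, 4],
    edges={(1, 2): 10, (1, 3): 10, (2, 4): 20, (3, 4): 20},
    compute={1: 50, 2: 100, 3: 100, 4: 50},
    assign={1: "A", 2: "A", 3: "B", 4: "A"},
    capacity={"A": 100, "B": 50},
    rate=10,
)
PROCESSING_TIME = max(end for _, end in TIMES.values())
print(TIMES, PROCESSING_TIME)
```

In this example task 4 cannot start until the slower branch through ES B has finished and its result has been transferred, which shows how a poor placement of one dependent task can dominate the processing time of the whole application.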

Energy consumption model
To evaluate the energy consumption of an ES during task processing, three types of power are defined: (1) computing power, the power consumed by the ES when performing computational tasks; (2) transmission power, the power consumed by the ES during data transmission; and (3) standby power, the power consumed by the ES when in an active state but not processing tasks or transmitting data.
When ES $k$ processes task $i$ and needs to communicate with the multiple ESs on which the subsequent tasks of task $i$ are located, the data transmission during this communication process is parallel: ES $k$ simultaneously transmits the data generated by task $i$ to the ESs hosting the subsequent tasks. This determines the energy consumed by ES $k$ during data transmission. When the task queue of an ES is empty and the ES is not transmitting data to other ESs, the ES is considered to be in an idle state. The idle time of ES $k$ is defined with respect to the set of pending tasks in the task queue of ES $k$.
Combining Eqs. (11) and (12) gives $E_m$, the total energy consumption of all ESs in the system when handling application $m$.
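A minimal sketch of this energy accounting is given below. It assumes the common power-times-time form, in which each of the three power types is multiplied by the time the ES spends in the corresponding state; the exact expressions of Eqs. (11) to (13) may differ, and the class `EsUsage` and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EsUsage:
    p_cal: float     # computing power (W)
    p_trans: float   # transmission power (W)
    p_static: float  # standby power (W)
    t_cal: float     # time spent computing (s)
    t_trans: float   # time spent transmitting (s)
    t_idle: float    # time spent idle (s)


def es_energy(usage):
    """Energy of one ES, assuming energy = power x time per state."""
    return (usage.p_cal * usage.t_cal
            + usage.p_trans * usage.t_trans
            + usage.p_static * usage.t_idle)


def total_energy(usages):
    """Total energy of all ESs involved in handling one application."""
    return sum(es_energy(u) for u in usages)


USAGES = [
    EsUsage(p_cal=10.0, p_trans=2.0, p_static=1.0, t_cal=3.0, t_trans=1.0, t_idle=4.0),
    EsUsage(p_cal=8.0, p_trans=2.0, p_static=1.0, t_cal=0.0, t_trans=0.0, t_idle=8.0),
]
TOTAL = total_energy(USAGES)
print(TOTAL)
```

The second server in the example executes nothing yet still contributes standby energy, which is why the idle time of every ES enters the total.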

Problem formulation
This paper defines the completion rate of application offloading to assess whether an application can be completed before the deadline or before the vehicle exits the communication range, and uses it to evaluate the performance of the DAG-DQN method. The distance between vehicle $m$ and the point where vehicle $m$ exits the signal coverage of the base station is denoted by $L_m^k$. The completion time of application $m$ must satisfy the condition in Eq. (14). Equation (14) indicates that application $m$ has been successfully processed within the deadline or before the requesting vehicle exits the communication range, and that the requesting vehicle has received the processing results of application $m$. Meeting these conditions constitutes successful processing, denoted by $\chi_m = 1$. If these conditions are not met, it is considered a processing failure, denoted by $\chi_m = 0$. The completion rate of all applications offloaded to the base station at the same time is computed from these indicators. The optimization objective is defined as simultaneously minimizing the completion time and total energy consumption of all applications in the system. The completion rate of the applications is used as a constraint on the optimization objective. In the resulting optimization problem, $\alpha$ and $\beta$ are weighting factors representing the importance of total application completion time and total energy consumption, respectively. In the Internet of Vehicles, there is a high demand for low-latency offloading of tasks, with the primary goal being to reduce total application completion time rather than total energy consumption. Therefore, more weight should be assigned to $\sum_{i=1}^{M} T_i$ than to $\sum_{i=1}^{M} E_i$, implying that $\alpha$ should be greater than $\beta$.
The above optimization problem belongs to the class of assignment problems in 0-1 Integer Linear Programming and is NP-hard, so no polynomial-time algorithm is known that solves it to optimality. Next, we design a method that solves this problem efficiently and obtains a suboptimal solution.
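The quantities in this formulation can be sketched in a few lines of Python. The sketch assumes that the completion condition compares the completion time with the smaller of the deadline and the time the vehicle remains in coverage ($L_m^k / v$), and that the objective is the weighted sum $\alpha \sum T_i + \beta \sum E_i$; both forms are inferred from the description rather than copied from the paper, and all names and numbers are hypothetical.

```python
def is_completed(t_total, t_max, distance_left, speed):
    """Assumed completion test: finish before the deadline and before
    the vehicle leaves the coverage of the base station."""
    return t_total <= min(t_max, distance_left / speed)


def completion_rate(flags):
    """Fraction of applications with chi_m = 1."""
    return sum(1 for ok in flags if ok) / len(flags)


def objective(times, energies, alpha, beta):
    """Assumed weighted-sum objective over all M applications."""
    return alpha * sum(times) + beta * sum(energies)


# Hypothetical applications: (T_total in s, T_max in s, L in m, v in m/s).
APPS = [
    (0.8, 1.0, 200.0, 20.0),  # meets the deadline
    (1.2, 1.0, 200.0, 20.0),  # misses the deadline
    (0.5, 1.0, 5.0, 20.0),    # vehicle leaves coverage after 0.25 s
    (0.9, 2.0, 100.0, 20.0),  # meets both limits
]
FLAGS = [is_completed(*app) for app in APPS]
RATE = completion_rate(FLAGS)
COST = objective([a[0] for a in APPS], [10.0, 12.0, 8.0, 10.0], alpha=0.7, beta=0.3)
print(FLAGS, RATE, COST)
```

Choosing `alpha` larger than `beta`, as in the example, reflects the stated preference for reducing completion time over reducing energy consumption.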

Dependency-aware task offloading method
In this section, we design a task offloading method that jointly optimizes latency and energy consumption based on Deep Reinforcement Learning for Task Dependent Quotient Offloading, or DAG-DQN for short, to provide a suboptimal solution with low computational complexity. The task offloading process is as follows. When multiple vehicles offload multiple applications to the ES at the same time, the ES divides the applications into DAG tasks in order of application priority. Multiple tasks under the same application are stored in the task queue according to task priority. DAG-DQN is a reinforcement learning method based on task dependency awareness. It aims to jointly optimize latency and energy consumption. ESs deploying the DAG-DQN approach collaborate with other ESs to process the tasks of vehicles.
The DAG-DQN method is responsible for making the best decision, which relies on the agent's exploratory learning of task offloading rules from interaction with the environment. As shown in Fig. 3, the agent learns optimal task offloading rules and patterns by perceiving task dependencies. The agent perceives the task offloading environment and task state, takes offloading actions to interact with the environment in real time, and uses the Q network as its strategy judgment criterion. States, actions, rewards, and next states are stored in the experience replay pool. The Q network is trained repeatedly using the Q-learning algorithm over many iterations, and the network parameters are constantly updated to approximate the optimal Q function. The trained DAG-DQN method is evaluated in a simulated environment, and the evaluation metrics include application completion time, edge server energy consumption, and application completion rate. Based on the evaluation results, we decide whether to deploy the DAG-DQN method to the real environment.

Building application and task priorities
By prioritizing applications and tasks, the system can better manage and schedule tasks, so that the applications of requesting vehicles are processed promptly within their completion deadlines and tasks do not fail because of communication delays or the limited communication range of the base station. This prioritization helps to improve the efficiency and performance of the system while enhancing the quality of service and user experience of the vehicle. The application priority $A_m$ ensures that the application of a requesting vehicle is completed within the application completion deadline and while the requesting vehicle is still within the communication range of the BS. $AQ$ is an application priority queue that stores multiple applications in ascending order of priority $A_m$.
Construct a task queue $TQ_m$ for the multiple dependent tasks of application $m$. The starting task is stored at the head of the queue, and the succeeding tasks are stored in $TQ_m$ in descending order of the task priority $T_{m,i}$. Repeat this process so that all tasks are stored in the task queue and the terminating tasks are stored at the end of the task queue. The task priority $T_{m,i}$ involves a trade-off factor $\sigma \in (0, 1)$.
Fig. 3 DAG-DQN task dependency-aware offloading method

DAG-DQN task dependency awareness
Applications and tasks are stored in queues according to their priority. Data dependencies between tasks are modeled as directed acyclic graphs that become part of the environment state perceived by the agent. The agent takes different actions to interact with the environment, and the resulting state, action, reward, and next state are stored in the experience replay pool. The reward function in the Q-network is redefined, and the Q-network is trained iteratively with the Q-learning algorithm to obtain the optimal offloading strategy. The DAG-DQN method is summarized in Algorithm 1.
State Space: At moment $t$, the requesting vehicle uploads an application to the ES within communication range, and the ES observes an environment state with two components, a continuous data state $s^c_t$ and a discrete data state $s^d_t$. The state includes the set of computational data volumes of all tasks under the application and the set of data volumes transferred between tasks. $F_t = [f_1, f_2, \dots, f_K]$ and $P_t' = [P_1, P_2, \dots, P_K]$ denote the idle computational capacity and power consumption of all edge servers in the system, where $P_k = (P^{cal}_k, P^{trans}_k, P^{static}_k)$, $k \in K$. Some work predicts user movement trajectories and uses them as decision factors [34]. However, in this paper, the requesting vehicle can offload the task in a short time (within 1 s), and the movement of the vehicle can be estimated based on the position and instantaneous speed of the requesting vehicle. Therefore, the offloading decision in this paper does not consider the vehicle's trajectory. We take the vehicle's position information and velocity as inputs to the offloading decision.
It is worth noting that, unlike continuous data, discrete data in neural networks is usually passed through an embedding layer that converts discrete variables into continuous vectors, which are fed into the neural network together with the other continuous data. Processing the discrete data in this way lets the neural network pay more attention to it, which highlights the dependency between tasks and faithfully reflects the environment state before the task is offloaded. In the discrete data state, $X_t = [x_1, x_2, \dots, x_K]$ records edge server assignments: the value of $x_i$ is an edge server number, so $x_i = 1$ means that task $i$ is offloaded to the edge server numbered 1. $X_t$ thus denotes the offloading scheme for all tasks under an application.
The dependencies between all tasks under an application are represented by a 0-1 encoded two-dimensional matrix $G$ with entries $g_i^j \in \{0, 1\}$. When the element in the $i$th row and $j$th column has a value of 1, task $j$ is a successor task of task $i$; equivalently, task $i$ is a predecessor task of task $j$. For example, $g_2^5 = 1$ indicates that task 5 is a successor task of task 2 and task 2 is a predecessor task of task 5, whereas $g_2^5 = 0$ indicates that task 5 does not depend on task 2.
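The 0-1 encoding of task dependencies can be sketched as follows. The example reuses the six-task navigation DAG as a hypothetical edge list, with rows indexing predecessors and columns indexing successors; the function names are illustrative only.

```python
def dependency_matrix(num_tasks, edges):
    """Build the 0-1 matrix G where G[i-1][j-1] = 1 means task j
    is a successor of task i (tasks are numbered from 1)."""
    g = [[0] * num_tasks for _ in range(num_tasks)]
    for pre, suc in edges:
        g[pre - 1][suc - 1] = 1
    return g


def predecessors(g, j):
    """Tasks whose output task j needs: the non-zero entries of column j."""
    return [i + 1 for i in range(len(g)) if g[i][j - 1] == 1]


def successors(g, i):
    """Tasks that consume the output of task i: the non-zero entries of row i."""
    return [j + 1 for j in range(len(g)) if g[i - 1][j] == 1]


# Hypothetical edge list of the navigation DAG.
EDGES = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (4, 6), (5, 6)]
G = dependency_matrix(6, EDGES)
for row in G:
    print(row)
```

Reading a row gives the successors of a task and reading a column gives its predecessors, so the single matrix carries both directions of the dependency relation that the agent needs to perceive.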
Action Space: The action of the agent is to select whether the task is offloaded to the ES under the current BS or transferred to the ES under another BS. When the agent receives the current environment state $s_t$, it quickly makes the task offloading action $a_t$, a vector of $K$ components in which each component takes a value in $\{0, 1\}$ and the components sum to 1. $a_t$ denotes the offloading location of the task, and the reward value is calculated through the reward function. We adopt a single-step updating strategy, where the deep neural network returns a single task offloading action $a_t$, which represents the offloading scheme of one task; the offloading scheme of all tasks in an application is composed of these single-task actions.
Reward Function: In reinforcement learning, the goal of the agent is to maximize the cumulative reward over time by learning to adjust its strategy. The cumulative reward is represented by the following equation:

$$R_t = r_t + \gamma r_{t+1} + \gamma^2 r_{t+2} + \dots + \gamma^{T-t} r_T \quad (22)$$

Equation (22) indicates that at moment $t$, the agent takes a series of actions from the current state to obtain the corresponding immediate rewards $r_t, r_{t+1}, r_{t+2}, \dots, r_{T-1}, r_T$. They are weighted and summed to obtain the cumulative reward $R_t$. The discount factor $\gamma$ represents the importance of future rewards, which decreases over time. Future rewards are more uncertain than current rewards and therefore need to be discounted.
The goal of DAG-DQN is to maximize the cumulative reward by training for the optimal task offloading method. If the agent offloads task $i$ to the ES at time $t$, it immediately obtains the reward $r_t$. The normalized reward function uses balancing factors, including $\mu$, and depends on the set of all tasks preceding task $k$ in the task queue $TQ_m$. The reward obtained is inversely proportional to the task completion time and energy consumption: the reward value is larger when the task completion time and the energy consumption of the edge server are smaller.
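The one-hot action encoding and the discounted cumulative reward can be illustrated with a small sketch. It assumes the standard discounted-return form, in which the reward received $k$ steps in the future is weighted by $\gamma^k$; the function names and the reward values are hypothetical.

```python
def one_hot_action(server_index, num_servers):
    """Encode 'offload to ES server_index' as a 0-1 vector that sums to 1."""
    if not 0 <= server_index < num_servers:
        raise ValueError("server index out of range")
    return [1 if k == server_index else 0 for k in range(num_servers)]


def discounted_return(rewards, gamma):
    """Sum of rewards r_t, r_{t+1}, ..., r_T weighted by powers of gamma.

    Evaluated backwards so that each step is R = r + gamma * R_next.
    """
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


ACTION = one_hot_action(2, 4)
RETURN = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
print(ACTION, RETURN)
```

With `gamma` close to 0 the agent values almost only the immediate reward of the current offloading decision, while values close to 1 make it account for the effect of that decision on the tasks that follow in the queue.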
Updating Network Parameters: In DAG-DQN method, parameter update of the deep neural network is mainly realized by using the mean square error loss function.A batch of random samples S t = s i , a i , r i , s ′ |i = 1, 2, . . ., H of size H is randomly drawn from the empirical replay pool.This batch of empirical samples is fed into the deep neural network.Each sample is computed with an estimated Q-value and a target Q-value, denoted as Q(s t , a t ; θ) and y t , respectively, with θ denoting the parameters of the deep neural network.The target Q-value is calculated by the equation: Where θ − denotes the parameter of the target network, which is fixed for a period of time and then periodically updated from θ − to θ .The purpose of this is to make the target Q-value more stable and reduce the oscillation of the algorithm.
Next, the loss value is calculated based on the mean square error loss function:

$$L(\theta) = \frac{1}{H} \sum_{j=1}^{H} \left( y_j - Q(s_j, a_j; \theta) \right)^2$$

The network parameters $\theta$ are then updated by the gradient descent algorithm:

$$\theta_{i+1} = \theta_i - \alpha \nabla L(\theta_i)$$

where $\alpha$ is the learning rate, which controls the step size of each update, and $\nabla L(\theta_i)$ is the gradient vector of the loss function $L(\theta)$ at the parameter point $\theta_i$, which represents the rate of change of the loss function with respect to each parameter. Through iterative updating, the gradient descent algorithm moves the parameters toward a minimum of the loss function.

Offline training of the deep neural network model in DAG-DQN requires the agent to constantly interact with the environment. The agent learns the execution patterns and rules for task offloading from the environment state and quickly adjusts the parameters of the deep learning network. The trained model then quickly formulates the task offloading method according to the environment state.
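To make the update rule concrete, the sketch below applies one gradient descent step to a mean square error loss. It assumes a linear Q-function (one weight row per action) in place of the deep neural network, so that the gradient can be written by hand; all names are illustrative.

```python
def q_value(theta, state, action):
    """Linear stand-in for the network: Q(s, a; theta) = theta[a] . s."""
    return sum(w * x for w, x in zip(theta[action], state))


def mse_loss(theta, batch, targets):
    """Mean square error between targets y and the estimates Q(s, a; theta)."""
    errors = [y - q_value(theta, s, a) for (s, a), y in zip(batch, targets)]
    return sum(e * e for e in errors) / len(batch)


def gradient_step(theta, batch, targets, alpha):
    """Return theta - alpha * grad L(theta) for the linear Q-function."""
    h = len(batch)
    grad = [[0.0] * len(row) for row in theta]
    for (s, a), y in zip(batch, targets):
        err = y - q_value(theta, s, a)
        for j, x in enumerate(s):
            # d/d theta[a][j] of (y - Q)^2 / H is -2 * err * x / H.
            grad[a][j] += -2.0 * err * x / h
    return [[w - alpha * g for w, g in zip(row, g_row)]
            for row, g_row in zip(theta, grad)]


theta = [[0.0, 0.0], [0.0, 0.0]]
batch = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]   # (state features, action) pairs
targets = [1.0, 2.0]
new_theta = gradient_step(theta, batch, targets, alpha=0.1)
print(mse_loss(theta, batch, targets), mse_loss(new_theta, batch, targets))
```

With a sufficiently small learning rate the loss decreases after the step; too large a value of `alpha` can overshoot the minimum.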

Experimental evaluation
This section is structured as follows. First, we introduce the experimental simulation conditions. Second, we analyze the performance of the DAG-DQN method in terms of application completion time, energy consumption, and completion rate through the experimental simulation results. Finally, we analyze the time complexity and the execution time of the method.

Experimental settings
We evaluate DAG-DQN task offloading through simulation-based computational experiments. Communication methods, legal and institutional constraints, and real-world uncertainties in IoV systems are considered [35]. We use a truncated Gaussian distribution to generate vehicle mobility because of the traffic speed limits imposed on urban highways [36]. The network environment in the experiment is established under ideal conditions, without considering abnormalities such as packet loss or interference.
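One simple way to draw speeds from a truncated Gaussian distribution is rejection sampling, sketched below. The mean, standard deviation, and speed limits are hypothetical placeholders, since the text does not state the values used in the experiments.

```python
import random


def truncated_gauss(mean, std, low, high, rng):
    """Sample from a Gaussian restricted to [low, high] by rejection."""
    if low > high:
        raise ValueError("low must not exceed high")
    while True:
        value = rng.gauss(mean, std)
        if low <= value <= high:
            return value


rng = random.Random(42)
# Hypothetical parameters: speeds in km/h bounded by assumed speed limits.
speeds = [truncated_gauss(60.0, 15.0, 30.0, 90.0, rng) for _ in range(5)]
print(speeds)
```

Rejection sampling is adequate when the bounds cover most of the probability mass; for very narrow bounds an inverse-CDF method would be more efficient.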
We wrote the simulation platform EdgeSim in Python using the PyCharm IDE. The environment consists of 20 base stations, each equipped with an edge server; each edge server has a bandwidth randomly chosen from [10, 50] Mbps and a processing capacity randomly chosen from $[0.5 \times 10^5, 10^5]$ KB per second. The coverage range of each base station is 1000 m. Each vehicle requests to offload one application, and each vehicle application consists of 7 tasks. The amount of data for the application is randomly chosen from the interval [100, 300] KB, and the amount of data to be computed for each task is randomly chosen from [50, 100] KB. The amount of data transferred between tasks is randomly chosen from the interval [10, 20] KB. Specific information about the tasks and data involved in the application during the simulation experiment is given in Table 3. The experimental parameters are summarized in Table 4.
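The parameter ranges above can be turned into a small setup generator such as the following sketch. It is not the EdgeSim code: uniform sampling is assumed for every "randomly chosen" value, and the tasks are assumed to form a simple chain so that there is one transfer between consecutive tasks.

```python
import random


def make_environment(num_bs=20, tasks_per_app=7, seed=0):
    """Sample one simulation setup from the ranges stated in the text."""
    rng = random.Random(seed)
    servers = [
        {
            "bandwidth_mbps": rng.uniform(10, 50),
            "capacity_kb_per_s": rng.uniform(0.5e5, 1e5),
        }
        for _ in range(num_bs)
    ]
    application = {
        "data_kb": rng.uniform(100, 300),
        "task_kb": [rng.uniform(50, 100) for _ in range(tasks_per_app)],
        # Assumption: a chain-shaped DAG, hence n - 1 inter-task transfers.
        "transfer_kb": [rng.uniform(10, 20) for _ in range(tasks_per_app - 1)],
    }
    return servers, application


servers, application = make_environment()
print(len(servers), len(application["task_kb"]))
```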

Evaluation of the DAG-DQN method
To evaluate its effectiveness, the DAG-DQN method was compared with three commonly used methods, which are described below.

Randomized Selection Method (RS): RS is a probability-based method. It tends to find an approximate solution in minimal time when dealing with NP-hard problems. By comparing RS with DAG-DQN, the efficiency and accuracy of the DAG-DQN model in offloading decisions can be evaluated.
Optimal Greedy-based Method (OGOM) [37]: OGOM is a local optimization method. At each step, it selects the decision that seems optimal at the moment without considering future effects. Therefore, it can obtain a near-optimal solution quickly. By comparing OGOM with DAG-DQN, the performance of DAG-DQN can be evaluated in terms of considering long-term effects and global optimization.

GA-based Sustainable Mobile Edge Server Selection (GAME) [38]: GAME is an improved genetic method. The authors used tournament selection to select the right number of solutions, and crossover and mutation of chromosomes (solutions) to obtain the best solution to the edge server selection problem. By comparing GAME with DAG-DQN, it is possible to assess the performance of DAG-DQN when dealing with complex problems and to understand its ability to adapt to a larger search space.
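The difference between the random and greedy baselines can be illustrated with the sketch below. It is not the RS or OGOM implementation used in the experiments: the cost model (current queue time plus task size divided by processing capacity) and the field names are assumptions made for illustration.

```python
import random


def estimated_finish_s(task_kb, server):
    """Hypothetical cost model: waiting time plus processing time."""
    return server["queue_s"] + task_kb / server["capacity_kb_per_s"]


def random_selection(task_kb, servers, rng):
    """RS-style choice: pick any server with equal probability."""
    return rng.randrange(len(servers))


def greedy_selection(task_kb, servers):
    """Greedy choice: pick the server that looks best right now."""
    return min(range(len(servers)),
               key=lambda i: estimated_finish_s(task_kb, servers[i]))


servers = [
    {"queue_s": 1.0, "capacity_kb_per_s": 1.0e5},   # fast but busy
    {"queue_s": 0.0, "capacity_kb_per_s": 0.5e5},   # slower but idle
]
print(random_selection(100.0, servers, random.Random(0)))
print(greedy_selection(100.0, servers))
```

The greedy rule ignores how its choice affects later tasks in the queue, which is the limitation that a method optimizing the cumulative reward is meant to address.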
The specific experimental design is as follows. First, we keep the number of edge servers constant and increase the number of applications, observing the changes in the total application completion time and the total energy consumption of the edge servers. Second, we keep the number of applications constant and increase the number of edge servers, observing the same two metrics. The performance of DAG-DQN is evaluated in terms of reducing the total application completion time and total energy consumption compared to three methods, namely RS, OGOM, and GAME. In the simulation environment, two sets of experiments were conducted. In the first set of experiments, the number of applications was set to 5, 10, 15, and 20. In the second set of experiments, the number of edge servers was set to 10, 20, 30, and 40. We observe the impact of the number of applications and the number of edge servers on the task offloading decisions generated by the four methods.
The results of the first set of experimental comparisons are shown in Figs. 4 and 5. Increasing the number of applications increases the amount of data that the ESs need to compute, and therefore the total energy consumption and total application completion time gradually increase. Due to the limited computing resources of edge servers, multiple applications are queued for processing on a few edge servers. This increases the queuing latency of the applications and results in some edge servers being under high load for a long time. For the same number of applications, the offloading decisions given by the DAG-DQN method have a lower total application completion time and total energy consumption than the offloading decisions given by the RS and OGOM methods. This demonstrates the superiority of the DAG-DQN method in optimizing both objectives. However, it is slightly inferior to GAME. This is because GAME is a heuristic search method that can search for the global optimal solution in the entire search space and thus obtain better results. However, its time complexity is high, its convergence is slow, and it cannot make offloading decisions in real time. In contrast, DAG-DQN can learn the optimal policy through online interaction with the environment with better convergence and stability, and can make offloading decisions in real time.
In the second set of experiments, the results of comparing the total application completion time and energy consumption of RS, GAME, OGOM, and DAG-DQN are shown in Figs. 6 and 7. An increase in the number of edge servers gives applications a larger number of edge servers to which they can offload. Since the solution space expands as the number of edge servers increases, new offloading solutions may be generated, so the total application completion time decreases or remains the same. As the number of edge servers increases, the total

Application completion rate
Equation (15) defines the completion rate of the application, which evaluates whether the application can be processed within the time limit or before the vehicle moves out of communication range. The magnitude of the completion rate indicates whether the application can be successfully processed within the constraints.
The results of the completion rate comparison are shown in Figs. 8 and 9. In Fig. 8, as the number of applications increases, the application completion rate decreases for all methods. Some requesting vehicles are located at the edge of the base station's communication range when they offload their application. Such a vehicle uploads the application to the base station and then drives out of the communication range of the base station. This situation may result in the application not being completed, causing a decrease in the application completion rate. The advantages of GAME and DAG-DQN are demonstrated by comparing the completion rates for the same number of applications. In Fig. 9, the completion rate of the application gradually increases with the number of edge servers. With the same number of edge servers, DAG-DQN and GAME have higher completion rates than RS and OGOM.
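The notion of completion rate discussed above can be sketched as a simple ratio. Equation (15) itself is not reproduced here, so the rule below (an application counts as completed only if it finishes before both its deadline and the moment the vehicle leaves coverage) and the field names are assumptions for illustration.

```python
def completion_rate(applications):
    """Fraction of applications finished within both constraints."""
    if not applications:
        return 0.0
    completed = sum(
        1
        for app in applications
        if app["finish_s"] <= app["deadline_s"]
        and app["finish_s"] <= app["leave_coverage_s"]
    )
    return completed / len(applications)


apps = [
    {"finish_s": 0.8, "deadline_s": 1.0, "leave_coverage_s": 5.0},  # completed
    {"finish_s": 1.2, "deadline_s": 1.0, "leave_coverage_s": 5.0},  # too late
    {"finish_s": 0.9, "deadline_s": 1.0, "leave_coverage_s": 0.5},  # left range
    {"finish_s": 0.4, "deadline_s": 1.0, "leave_coverage_s": 2.0},  # completed
]
print(completion_rate(apps))
```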

Method execution efficiency
The experiment was repeated several times under identical experimental conditions. The execution time required for each method to make the offloading decision was recorded each time and then averaged. The experimental results are shown in Table 5.
n denotes the number of tasks of the application. m denotes the number of edge servers. p and g denote the population size and the number of iterations in the GAME method, respectively. As shown in Table 5, DAG-DQN spends much less execution time than GAME to formulate an offloading decision, with an execution time about four times shorter. Although GAME works better in terms of solving capability, it has high time complexity and a long decision time, so it cannot fulfill the requirement of real-time application offloading. In autonomous driving, a delay of 1477 ms may prevent the vehicle from responding promptly to changes in road conditions and to other vehicles, thus compromising the safety of the vehicle. In vehicle safety applications, a delay of 1477 ms may prevent the system from determining in time whether the driver is fatigued or distracted, thus affecting the safety and stability of the vehicle. The experimental results and the method execution times show that DAG-DQN can make offloading decisions efficiently and quickly, and meets the real-time application offloading requirements in IoV.

A task offloading method should generate offloading decisions in a short time to meet the offloading requirements of computationally intensive and latency-sensitive vehicular applications. It should also ensure that performance metrics such as latency and energy consumption are optimized. RS and OGOM have the same advantage as DAG-DQN in terms of execution time. However, they perform poorly in reducing application completion time and energy consumption, and they cannot guarantee the completion rate of vehicle applications. In contrast, GAME excels in reducing application completion time and energy consumption with a high completion rate, but its high complexity and long execution time make it unsuitable for offloading decisions of real-time vehicle applications in IoV. It is concluded that DAG-DQN is more suitable for task offloading in IoV. DAG-DQN has a low execution time for fast decision making, effectively reduces the application completion time and energy consumption, and ensures the application completion rate.
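The averaging of decision times over repeated runs can be sketched as follows. The helper name `average_decision_time_ms` and the toy decision function are hypothetical; they only show one way such a measurement could be taken.

```python
import time


def average_decision_time_ms(decide, state, repeats=10):
    """Average wall-clock time (ms) of `decide(state)` over repeated runs."""
    if repeats <= 0:
        raise ValueError("repeats must be positive")
    total_s = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        decide(state)
        total_s += time.perf_counter() - start
    return 1000.0 * total_s / repeats


def toy_decide(state):
    """Hypothetical stand-in for an offloading method: pick the least-loaded ES."""
    return min(range(len(state)), key=state.__getitem__)


print(average_decision_time_ms(toy_decide, [0.7, 0.2, 0.9], repeats=100))
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock intended for measuring short durations.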

Conclusion and future work
In this study, we propose a dependency-aware online task offloading method for base station cooperative computing scenarios. It effectively reduces the total application completion time and total energy consumption, and lowers the risk of application offloading failure and the system overhead. We model applications as DAG tasks and define the priority of applications and tasks. The ES stores applications and tasks in an execution queue based on priority. An ES using the DAG-DQN method senses DAG tasks in the network environment and takes latency and energy reduction as the goal of its task offloading decisions, thereby realizing real-time task offloading in IoV. Experimental results validate the superiority of the DAG-DQN task offloading method.
In the future, we aim to refine and bolster the performance of our task offloading methodology. This will entail the accumulation of extensive datasets pertaining to vehicular edge computing, enabling comprehensive validation of our approach across diverse real-world scenarios. In addition, we will improve our methodology to accommodate the dynamic and heterogeneous nature of vehicular environments, while ensuring seamless integration within existing IoV infrastructures. By addressing these key issues, we aspire to fortify the practical applicability and efficacy of our approach.

$m \in M$ represents the amount of data to be uploaded to the ES for offloading application $m$, $D^{down}_m$

Fig. 4 Impact of the number of applications on application completion time

Fig. 7 Impact of the number of ESs on the energy consumption of ESs

Fig. 8 Impact of the number of applications on the completion rate

Table 1
Comparison of the characteristics of task offloading work

Table 2
Summary of notations

Table 4
Parameter setting

Table 5
Experimental result