A dynamic service migration strategy based on mobility prediction in edge computing

Mobile edge computing is a new computing paradigm that pushes cloud computing capabilities away from the centralized cloud to the network edge to satisfy the increasing number of low-latency tasks. However, it also raises challenges such as service interruption caused by user mobility. To address this problem, in this article we first propose a multiple service placement algorithm, which initializes the placement of each service according to the users' initial locations and their service requests. Furthermore, we build a network model and propose a service migration method based on Lyapunov optimization with long-term cost constraints. Given the importance of user mobility, we use the Kalman filter to correct the user's location and thereby improve the success rate of communication between the device and the server. Extensive simulation results show that, compared with traditional schemes, the dynamic service migration strategy can effectively improve the service efficiency of mobile edge computing in mobile scenarios, reduce the delay of requesting terminal nodes, and reduce the service interruptions caused by frequent user movement.


Introduction
With the rapid development of mobile Internet, mobile data traffic is increasing at an alarming rate. According to a report by Cisco VNI, 1 the total amount of mobile data will increase sevenfold from 2016 to 2021. The huge amount of data has increased the burden on the central network and caused higher transmission delay. Traditional cloud centers are overwhelmed and cannot meet the demand for new technologies such as virtual reality (VR), WIT120, and connected cars. In order to cope with the increasing mobile traffic, great progress has been made in recent years in providing closer cloud services to users.
Compared with traditional cloud computing models, edge computing can better support mobile computing and Internet of things (IoT) applications. According to Cisco's statistics and forecast data, 2 most cloud data are temporary data that do not need long-term storage. Migrating the computing and storage capabilities of servers to the network edge means that such temporary data can be processed at the edge instead of traveling to the core network. Short-distance edge services can effectively reduce network jitter and delay and enhance the responsiveness of services. At the same time, in view of cloud computing's inability to provide fine-grained control over user data access and use, edge computing can improve data security by providing dedicated usage and storage infrastructure for critical private data and by enforcing operational restrictions.
Researchers have proposed many conceptual models, including micro cloud, 3 Micro Data Center, 4 Cloudlet, 5 Fog Computing, 6,7,2 and Follow-Me Cloud (FMC). 8 The core of these models is to run applications and related processing tasks near mobile users, reduce network congestion, and improve service experience. 9 With the standardization of various architectures in 5G network, the concept of mobile cloud computing has gradually developed into the latest mobile edge computing (MEC) standard architecture. In 2014, the European Telecommunications Standards Institute (ETSI) established the MEC Industry Standards Group (ISG) to focus on the standardization of MEC in wireless access networks, which is depicted in Figure 1.
MEC has become a key technology for realizing the bright vision of future network development. However, the contradiction between the limited geographic coverage of edge servers and the mobility of user terminals leads to significant network performance degradation, which in turn causes a sharp decline in quality of service (QoS) and even service interruption. Therefore, for an ongoing edge service, it is difficult to ensure service continuity.
In order to ensure continuity, the following decision should be made without interrupting the ongoing edge service: when a user moves out of the service area of the currently connected edge server, whether and where to migrate the ongoing edge service. 10 We can (1) continue to run the service on the current edge server and exchange data with the mobile user through the core network or other edge servers or (2) migrate the service to another edge server covering the new area. 9 The migration decision involves issues such as when to migrate and whether the migration destination is optimal.
In this article, a Dynamic awareness Service placement and Migration scheme in edge networks (DSM) is proposed, in which the distribution and set coverage of users are used to obtain an initial placement of services that meets the requirements of multiple users; this multiple service placement algorithm (MSPA) reduces the repeated placement of services as much as possible while meeting the delay constraints of most users. As the user moves, DSM corrects the user's location and uses Lyapunov optimization to obtain a suitable service migration strategy, achieving a balance between user-perceived delay and long-term cost.
The rest of this article is organized as follows. The motivation and the contributions of this article are presented in section ''Motivations and contribution.'' In section ''System architecture and workflow,'' we introduce the system model and workflow in this article. In section ''Place service initially,'' a multi-user service initial placement algorithm is proposed. In section ''Dynamic service migration,'' we introduce the dynamic service migration strategy based on Lyapunov and we propose a mobility-based service migration strategy. Section ''Simulation and evaluation'' describes simulations and results analysis of this article. The work of this article and prospects for possible future research are summarized in section ''Conclusion.''

Motivations and contribution
MEC is now one of the key technologies supporting 5G operations. Services need to be dynamically placed in the edge network to ensure user service experience.
However, a new challenge is that the resources (processing power, storage space, etc.) of an MEC server are much smaller than those of a traditional data center. Due to cost considerations, it is therefore impossible to replicate the services of each data center on every edge server. In other words, the goal is to find the minimum set of replicas that satisfies the latency constraints of the users requesting the same service.
However, user mobility brings challenges to service deployment. Note that the delay perceived by the user is jointly determined by the communication delay and the computation delay. 11,12 Therefore, if the service profile of each user is actively placed on the nearest MEC node, some MEC nodes may become overloaded, increasing the computation delay. Conversely, if services never migrate (i.e. each service stays on the first server to which it was allocated), the distance between the service and the user grows as the user moves, degrading performance. In summary, making appropriate migration decisions and shortening the service migration time are essential to improving service continuity and service quality.
In this article, we focus on the service migration strategy in MEC, aiming to reduce service interruptions caused by user movement and to preserve user QoS by reducing transmission delay and improving service efficiency. The main contributions of this article are as follows: (1) an algorithm for the initial placement of services is proposed, which reduces the repeated placement of services while meeting the delay constraints of most users; (2) we build QoS and migration cost models and design a Lyapunov-based service migration strategy, which yields the migration decision set with the minimum system delay while keeping the long-term cost stable, adapts better to mobile environments, reduces network latency, and improves service efficiency; and (3) we propose a method that uses user mobility to decide to which edge server the service should migrate, modeling the mobility of each mobile node with a Kalman filter to ensure service continuity.

Related work
It has been demonstrated that user movement in a cellular network under the FMC scenario can be modeled as a Markov chain. 11 Through an analysis of the average delay and migration cost for users to obtain services after virtual machine migration, it was shown that the cost of migrating everything is greater than the cost of partial migration, and that the migration cost gap decreases as the distance between the user and the optimal server increases.
The service migration problem in mobile cloud computing has also been modeled as a one-dimensional Markov decision process (MDP). In Sharma et al., 12 the distance between the mobile user and the micro cloud was defined as the state function, and the network cost as the reward function. The existence of a so-called migration-optimal threshold was also proved: once the user reaches the threshold, the benefit of migrating the service exceeds that of not migrating, so a migration decision is made. Some researchers further optimized this model, simplified the solution of the MDP, and achieved good results. 13 In Zhang et al., 14 the authors proposed a cloud service migration decision system and established an MDP model to cope with server performance degradation in mobile edge networks by actively migrating services. Through a state collection module, the system senses and monitors changes in the response time of each server in the network and predicts QoS from the collected data. Specifically, if the predicted QoS exceeds a given QoS threshold, the QoS prediction module triggers a search for a target server so that the service can be migrated to another edge cloud providing the best QoS. To reduce the complex computation and statistics in the above MDP migration schemes, researchers at IBM (Rahul et al. 15 ) designed a new MDP-based service migration solution that uses the decoupling property of the MDP to transform the constrained MDP problem into a simple deterministic optimization problem, which simplifies the calculation.
In addition to traditional heuristic algorithms, deep reinforcement learning has also been applied to edge computing in recent years. In Feng et al., 16 a cooperative computation offloading and resource allocation framework for blockchain-enabled MEC systems is designed, and A3C is used to solve the joint offloading and resource allocation problem. Zhang et al. 17 proposed a deep Q-network for task migration in MEC systems, which can learn the optimal task migration policy from previous experience without acquiring information about users' mobility patterns in advance. Florian et al. 18 propose a general framework for optimizing edge service migration based on reinforcement learning techniques; with their solution, migration policies can be learned for a large variety of optimization goals. Gao et al. 19 designed a reinforcement learning-based framework for a single-user edge computing service migration system; they take many requirements into account, but not user movement and link changes. Chen et al. 20 proposed a service migration mechanism in Edge Cognitive Computing, in which the service is migrated based on the behavioral cognition of a mobile car; besides the movement of cars, their work focused on service awareness, including emotion detection and video streaming.
Some researchers have also considered personalization factors. Huang 21 investigates service migration in MEC by taking the risk of location privacy leakage into account, formulates the service migration problem as an MDP, and proposes an efficient algorithm to find the optimal solution that minimizes the long-term total cost. In Varasteh et al., 22 a power- and latency-aware optimum cloudlet selection strategy was proposed for a multi-cloudlet environment with a proxy server; power consumption and latency are reduced by approximately 29%-32% and 33%-36%, respectively.
Our study of service migration strategies in mobile edge networks in recent years shows that most existing strategies use MDPs to model and solve the migration problem, but some of those scenarios are difficult to extend to 5G applications. In addition, researchers often consider only some of the factors in the migration decision while ignoring the impact of user mobility on the performance of the migration strategy. We therefore propose MSPA and DSM to make up for these shortcomings.

System architecture and workflow
As shown in Figure 2, we consider a backend cloud connected to a set M = {1, 2, ..., M} of MEC nodes, with the network controller placed in the backend cloud.
The network control center collects all user request information, delivers services to the edge nodes, and provides services to users through the edge servers. After a requested service is deployed to an edge node, the system monitors the user's status as the user moves, and the movement behavior determines whether to migrate the service to a specified server.
The environment in which the applications run on each mobile device is assigned to a dedicated virtual machine or container. To better capture user mobility, the system is assumed to operate in a slotted structure, and its timeline is discretized into time slots t ∈ T = {0, 1, 2, ..., T}. At each discrete time slot, each mobile user sends a service request to the local MEC node; the network controller then responds to the request and determines the optimal node to which the service should migrate in order to serve the user.

Place service initially
In the process of users requesting services, service resources are deployed at many locations in the edge network, which places resources closer to users, reduces bandwidth costs, and increases availability. Faced with the massive amount of request data, edge networks 9 often need to place service copies in multiple locations while guaranteeing user service quality. Therefore, reducing the total cost of service deployment while ensuring load balancing in the data centers is an important issue, yet there are few studies in this area.

MSPA algorithm
In this problem, we assume that the network graph is G = (U, ES, W). Let the set of nodes U represent the users, ES be the set of edge servers, and W record the locations of users and edge servers.
Assume that each edge server has capacity C and that the total number of replicas is R. The binary variables p_{i,j}, r_{i,j} ∈ {0, 1} indicate the initial placement: if the service s_i of user u_i is placed on edge server es_j, then p_{i,j} = 1, and if a copy of s_i is allocated to es_j, then r_{i,j} = 1. We use cost to represent the total storage cost, defined as

cost = Σ_i Σ_j u_i (p_{i,j} + r_{i,j})

where u_i represents the storage cost of service i.
To ensure the user experience, we must also bound the user-perceived latency L_{i,j} between user u_i and edge server es_j while optimizing the cost; the tolerable delay threshold is L. The service placement problem in the edge cloud can then be stated as

minimize cost
s.t. (4) Σ_j p_{i,j} = 1, for all i
     (5) Σ_i u_i (p_{i,j} + r_{i,j}) ≤ C, for all j
     (6) Σ_i Σ_j r_{i,j} ≤ R
     (7) L_{i,j} ≤ L whenever p_{i,j} = 1 or r_{i,j} = 1

Equation (4) ensures that each service is placed on exactly one server; equation (5) ensures that each data center respects its capacity limit when saving users' data; equation (6) ensures that the total number of replicas cannot exceed the predefined limit; and equation (7) requires that the delay between the user and the service does not exceed the given threshold.
To solve this problem, we use a latency-constrained matrix to describe the coverage relationship between edge servers and users, defined as M_{i,j} = 1 if L_{i,j} ≤ L and M_{i,j} = 0 otherwise. We then adopt a heuristic algorithm to obtain the minimum set of servers and thus optimize the storage cost of the edge network. Algorithm 1 presents the calculation of the service placement scheme; its input includes the set of users, the tolerable latency, and the |ES| edge servers.
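The greedy heuristic described above can be sketched as follows. This is a minimal illustration, not the paper's Algorithm 1: the function name, data layout (`coverage[j]` as the set of users within the latency threshold of server j, i.e. the rows of the latency-constrained matrix), and tie-breaking are our assumptions.

```python
from typing import Dict, List, Set

def greedy_placement(coverage: Dict[int, Set[int]], users: Set[int],
                     capacity: int, max_replicas: int) -> List[int]:
    """Greedy set-cover sketch: repeatedly pick the edge server whose
    latency-constrained coverage set contains the most still-uncovered
    users, until all users are covered or the replica budget runs out.
    coverage[j] = set of users within the delay threshold of server j."""
    uncovered = set(users)
    chosen: List[int] = []
    while uncovered and len(chosen) < max_replicas:
        # Pick the server covering the most uncovered users.
        best = max(coverage, key=lambda j: len(coverage[j] & uncovered))
        gain = coverage[best] & uncovered
        if not gain:
            break  # remaining users cannot be covered within the threshold
        served = set(list(gain)[:capacity])  # respect server capacity C
        uncovered -= served
        chosen.append(best)
    return chosen
```

For example, with three servers covering user sets {1, 2, 3}, {3, 4}, and {4, 5}, the sketch selects servers 0 and 2, covering all five users with two replicas.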

Algorithm complexity
The algorithm is divided into three steps: step one establishes a discernibility matrix, step two performs a greedy search, and step three reduces the set of data centers. Each step has complexity O(n^2), so the overall time complexity of the algorithm is O(n^2).

Dynamic migration decision
Relying on edge computing technology, we can reduce the latency of network tasks and improve user QoS. However, frequent service interruptions caused by the movement of mobile nodes are still a bottleneck affecting network performance. Therefore, in order to improve service efficiency, reduce the request delay of mobile nodes, and minimize service interruption in a dynamic environment, a cost-efficiency trade-off must be considered.
In this section, the network architecture, latency model, and cost model of MEC are introduced. We use a_k(t) to indicate the migration decision taken by node ES_k for service k during state-detection time slot t. When a_k(t) = 1, ES_k decides to migrate service k at this time; when a_k(t) = 0, ES_k decides not to migrate the service and continues to run it locally.

QoS model
Communication latency. In this article, we use the response delay to represent QoS. The response time between a mobile device and an MEC node generally comprises the network propagation delay and the data transmission delay.
Since the network condition and the data transmission information in the current time slot are available from the system-level perspective, we can extend the above cases into a general model on which we impose no structural assumptions. Given the service request information as well as the current location of the user, the communication delay to MEC node i can be characterized by a general model D_i(t).
Computing latency. Each MEC node may provide computing resources to multiple users, so the computing latency of MEC nodes cannot be ignored and must be included in the system model. We use C_i(t) to denote the computing delay of MEC node i at time slot t, and C_i(t − 1) the delay at time slot t − 1.
When considering the service migration decision a_k(t), the latency experienced by the user can be further expressed as the sum of the communication delay and the computing delay of the serving node, T_k(t) = D_i(t) + C_i(t).

Cost model. In order to achieve satisfactory QoS, it is often necessary to migrate services among edges to follow the user's mobility. However, service migration across edge servers brings additional operational costs. To model these costs, we use E_k(t) to denote the cost of migrating service k from the source MEC to the optimum MEC. Given the migration decision a_k(t), the migration cost incurred by user k can be expressed as a_k(t)E_k(t).

When to migrate
Our goal is to minimize user latency under a long-term average cost constraint to obtain a performance-cost trade-off, which is difficult to solve directly with traditional methods. We therefore adopt Lyapunov theory 25 to obtain the service migration decisions in edge computing. We first convert the original problem into a queue stability control problem. The input queue Q(t) is defined as

Q(t + 1) = max[Q(t) + a_k(t)E_k(t) − E_a, 0]

where E_a is the long-term time-averaged cost budget and Q(t) is the queue length at time t. The virtual queue can be regarded as a constraint on the migration costs: when the queue is stable, the migration cost meets the preset budget.
Proof. We first rewrite the queue update as

Q(t + 1) ≥ Q(t) + a_k(t)E_k(t) − E_a

Summing from time index 0 to T − 1, dividing by T, and letting T go to infinity, we have

lim_{T→∞} (1/T)[Q(T) − Q(0)] ≥ lim_{T→∞} (1/T) Σ_{t=0}^{T−1} a_k(t)E_k(t) − E_a

If the virtual queue Q(T) is rate stable, that is, lim_{T→∞} Q(T)/T = 0, then

lim_{T→∞} (1/T) Σ_{t=0}^{T−1} a_k(t)E_k(t) ≤ E_a

This completes the proof. The quadratic Lyapunov function is defined as L(Q(t)) = (1/2)Q(t)^2, and the Lyapunov drift function is Δ(t) = L(Q(t + 1)) − L(Q(t)). The drift can therefore be bounded as

Δ(t) ≤ (1/2)[a_k(t)E_k(t) − E_a]^2 + Q(t)[a_k(t)E_k(t) − E_a]

After constructing the input queue, our goal is to find a migration policy that coordinates the perceived latency and the migration costs, so a drift-plus-penalty function is defined as

Δ(t) + V·T_k(t)

where T_k(t) denotes the user-perceived latency at slot t and the parameter V ≥ 0 is the penalty weight, representing the importance of the objective function relative to the constraint at every time slot. Our goal is to minimize an upper bound of this expression. To simplify the problem, we omit the terms that are fixed at a given moment, such as (E_a)^2 and Q(t)E_a. The problem then reduces to minimizing

X(t) = V·T_k(t) + Q(t)a_k(t)E_k(t)

We can calculate the value of X(t) for each candidate action and choose the migration decision a_k(t) that minimizes it. Note that the optimal action a_k(t) is not static: the objective involves the real-time QoS, which captures the dynamic link states, so the optimal action is updated as the real-time QoS arrives. The resulting dynamic decisions maintain a long-term balance between QoS and costs.
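One slot of the drift-plus-penalty decision rule described above can be sketched as follows. This is an illustrative sketch under our own assumptions (the function name and the use of separate stay/migrate latencies are ours): it compares X(t) for the two actions a_k(t) ∈ {0, 1} and then applies the virtual-queue update Q(t+1) = max[Q(t) + a_k(t)E_k(t) − E_a, 0].

```python
def migration_decision(Q, V, delay_stay, delay_migrate, E_k, E_a):
    """One slot of the drift-plus-penalty rule (sketch).
    Compares X(t) = V*latency + Q(t)*migration_cost for a_k(t) in {0, 1}
    and returns the chosen action plus the updated virtual queue length."""
    x_stay = V * delay_stay                  # a_k(t) = 0: no migration cost
    x_migrate = V * delay_migrate + Q * E_k  # a_k(t) = 1: pay E_k(t)
    a = 1 if x_migrate < x_stay else 0
    # Virtual-queue update: Q(t+1) = max(Q(t) + a_k(t)*E_k(t) - E_a, 0)
    Q_next = max(Q + a * E_k - E_a, 0.0)
    return a, Q_next
```

Note how the queue length acts as a self-tuning price on migration: with an empty queue the cheaper-latency action wins, while a long queue (accumulated cost overrun) suppresses further migrations.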

Select the server based on user mobility prediction
With the presence of user mobility, it is intuitive that to ensure a desirable level of QoS, the service should be actively migrated to follow the user. However, frequent migration incurs excessive operational costs in return. A natural question is then how to navigate this performance-cost trade-off in a cost-efficient manner, which is why the dynamic service migration strategy is introduced in section ''Dynamic service migration.'' In this article, we use the Kalman filter to predict user mobility and decide where to migrate the service so as to ensure connectivity between the terminal and the ES. A Kalman filter 26,27 is a set of mathematical equations that efficiently estimates the state of a linear system by minimizing the estimated error covariance. The user state vector x = [p_x, v_x, p_y, v_y] describes the user's movement at time t, where p_x and v_x correspond to the latitude position and velocity, respectively, and p_y and v_y are the corresponding longitude parameters. Because the user's movement can be regarded as linear over a short time, we use a uniform linear motion model to measure the change in the user's state within Δt. The state vector x_k and the observation z_k then evolve as

x_{k+1} = A x_k + w_k,  z_k = H x_k + v_k

where w_k and v_k represent zero-mean white Gaussian process and measurement noise, respectively. The Kalman filter works recursively in two steps at each time step Δt: a time update that predicts the next estimate of the current state, and a measurement update that adjusts the current state estimate with the actual measurement at time t.

Predict step
The a priori estimate at time k + 1 is produced by multiplying the state by the state model A:

x⁻_{k+1} = A x_k,  P⁻_{k+1} = A P_k Aᵀ + Q

where P⁻_{k+1} is the a priori estimate error covariance matrix and Q is the process noise covariance.

Correct step
In this step, the a posteriori state estimate x_{k+1} and the a posteriori estimate error covariance P_{k+1} are obtained from the measurement z_{k+1}:

K_{k+1} = P⁻_{k+1} Hᵀ (H P⁻_{k+1} Hᵀ + R)⁻¹
x_{k+1} = x⁻_{k+1} + K_{k+1}(z_{k+1} − H x⁻_{k+1})
P_{k+1} = (I − K_{k+1} H) P⁻_{k+1}

where R represents the measurement noise covariance. In this article, the connectivity between the terminal and the ES is determined using the outage probability: the link is considered broken when the received signal-to-noise ratio (SNR), determined by the transmission power P_TX, the path loss L(d), and the noise power N_0, falls below the SNR threshold G.
The proposed ES selection proceeds as follows: when the user requests a task held on an ES and a migration has to be performed, predict the user's location and select the closest ES among all connectable ESs.
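The predict/correct/select steps above can be sketched as follows. This is an illustrative constant-velocity sketch, not the paper's exact implementation: the noise magnitudes `q` and `r` and the simple in-radius reachability test (standing in for the outage-probability check) are our assumptions.

```python
import numpy as np

def kalman_predict(x, P, dt, q=1e-3):
    """Predict step for the constant-velocity model on state
    x = [p_x, v_x, p_y, v_y]: x- = A x, P- = A P A^T + Q."""
    A = np.array([[1, dt, 0, 0],
                  [0, 1,  0, 0],
                  [0, 0,  1, dt],
                  [0, 0,  0, 1]], dtype=float)
    Qn = q * np.eye(4)  # process-noise covariance (assumed magnitude)
    return A @ x, A @ P @ A.T + Qn

def kalman_correct(x_pred, P_pred, z, r=1e-2):
    """Measurement update with H observing the positions [p_x, p_y]."""
    H = np.array([[1, 0, 0, 0],
                  [0, 0, 1, 0]], dtype=float)
    R = r * np.eye(2)   # measurement-noise covariance (assumed magnitude)
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(4) - K @ H) @ P_pred
    return x_new, P_new

def select_es(x_pred, servers, radius=200.0):
    """Pick the closest ES whose coverage radius contains the predicted
    position (a stand-in for the outage-probability connectivity check)."""
    p = np.array([x_pred[0], x_pred[2]])
    reachable = [s for s in servers if np.linalg.norm(p - s) <= radius]
    return min(reachable, key=lambda s: float(np.linalg.norm(p - s)), default=None)
```

A user moving along the x-axis at 10 units per slot, for instance, is predicted one slot ahead before the nearest in-range server is chosen.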
Thus, our mobility-aware service migration can be given by Algorithm 2.

Simulation setting
We implemented our simulation in MATLAB and Python. To realistically simulate an urban scene, specific node movement trajectories and scenes were constructed based on the movement trajectory data set 28 obtained from global positioning system (GPS) tracking of taxis in the San Francisco bay area by the CRAWDAD community. When setting up the simulation environment, the base stations are approximately evenly distributed; referring to mobile operators' base station deployment and area coverage, we set the coverage of each ES to 200 m and the overlapping area of adjacent ESs to 20 m. The hop distance between two MEC nodes is calculated by the Manhattan distance. To simplify the problem, we assume that the transmission delay between an ES and a user follows a uniform distribution within [0.9, 1.1] of the distance, at 1 unit per hop. We simulate 100 time slots for our system. The network parameters used in the DSM algorithm, such as service request, packet size, bandwidth, and delay, are shown in Table 1.
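The distance and delay assumptions above can be sketched in a few lines. This is a minimal illustration of the stated setting (Manhattan hop distance; per-transfer delay uniform in [0.9, 1.1] of the distance at 1 unit per hop); the function names and grid-coordinate representation are our assumptions.

```python
import random

def hop_distance(node_a, node_b):
    """Manhattan distance between two MEC grid nodes, used as the hop count."""
    return abs(node_a[0] - node_b[0]) + abs(node_a[1] - node_b[1])

def transmission_delay(distance, rng=random):
    """Per the setting above: delay drawn uniformly from [0.9, 1.1] of the
    distance, with 1 delay unit per hop."""
    return distance * rng.uniform(0.9, 1.1)
```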
In this section, we compare our algorithm with four other methods: (1) Service Offloading in SDN (SO), 23 in which services migrate to the nearest MEC node at every opportunity over a long period of time; (2) a migration policy based on deep Q-learning (DQN); 19 (3) never migrating the service (non-migration); and (4) a genetic algorithm (GA) used to solve the data placement problem in the cloud for online social networks. 24

Results
Average response time. The average task response time is defined as ART_i = (1/i) Σ_{j=1}^{i} Time_j, where ART_i is the average response time over the first i seconds and Time_j is the response time at second j. The performance of each algorithm can be assessed by comparing the average task response time when sending packets of different sizes.
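The ART formula is just a running mean of the per-second response times; a small sketch (function name ours) makes the computation explicit:

```python
def average_response_time(times):
    """ART_i = (1/i) * sum_{j=1}^{i} Time_j: running mean of response times,
    returning the list [ART_1, ART_2, ..., ART_n]."""
    art = []
    total = 0.0
    for i, t in enumerate(times, start=1):
        total += t
        art.append(total / i)
    return art
```

For example, response times of 2, 4, and 6 yield ART values 2.0, 3.0, and 4.0.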
In the simulation experiment, the data packet sizes are tested at 2M and 16M (Figures 3 and 4), respectively, and the task response time under different strategies is compared for comprehensive analysis and discussion.
As can be seen from Figures 5 and 6, the latency gradually decreases and becomes stable. The response time of the non-migration strategy is significantly larger than that of the other two strategies. We believe the reason is as follows: since the mobile node keeps moving, ES handovers (base station handovers) occur continually, and each handover implies a service migration, which significantly increases the task transmission delay. The SO and DSM processes, by their respective designs, do not have a serious impact on network traffic.
The average response time of DSM is better than that of the SO scheme. Compared with SO, DSM predicts the user's movement trajectory and takes the user's mobility into account when the service is requested, so the user obtains the service response from a closer location and the user-perceived delay is reduced.

Algorithm 2. Dynamic service migration strategy based on user mobility.
Input: D_i(t), E_k(t), network situation
Output: service migration decision
Initialization: set the cost queue Q(0) = 0
1. for each time slot t = 1, 2, ...
2.   predict the user location using the Kalman filter and select the closest server as the optimum server
3.   calculate X(t) for a_k(t) ∈ {0, 1}
4.   if a_k(t) = 0 minimizes X(t)
5.   then
6.     do not migrate the service
7.   else
8.     migrate the service to the optimum server selected in step 2
9.   end if
10.  update the virtual queue:
11.  Q(t + 1) = max[Q(t) + a_k(t)E_k(t) − E_a, 0]
12. end for
Connection success probability. Figure 5(a)-(d) shows the success probability of the task execution connection when the user moves at 10, 20, 40, and 100 km/h. There is almost no difference in the success rate between the proposed method and the compared method at 10 km/h: when the user moves slowly, the distance covered while the selected ES processes the service is short, so the connectivity established at request time can be maintained. When the user speed is 20 km/h, the connection success probability of DSM is better than that of SO, with an average 15% advantage. Moreover, when the user speed reaches 100 km/h, the success rate of the SO solution drops sharply, while that of the DSM solution remains stable. In general, the connection success probability decreases as speed increases; SO is more sensitive to movement speed, while DSM is more stable. The connectivity also becomes higher as the number of ESs increases, because a denser ES deployment shortens the distance between the terminal and the ES used by the proposed method. Conversely, the longer the processing time, the farther the terminal moves, so the connection between the user and the ES used by the compared method breaks more easily.
Service initial placement costs with different latency constraints. The cost of each algorithm under four constraints of latency is shown in Figure 6.
We compared the effectiveness of the initial placement algorithms under different delay constraints. The full placement strategy places the service on the server closest to the user for each service request; the random placement strategy randomly selects servers for placement; the GA scheme uses a genetic algorithm to select the optimal servers for the initial placement. When the delay threshold is 200 ms, the cost of the DSM strategy is close to, though slightly larger than, that of the GA scheme, and much smaller than that of the full placement strategy; when the delay threshold is 175 ms, the cost of the DSM algorithm is 1% less than that of the GA scheme. As the delay constraint shrinks, the advantage of the DSM algorithm becomes more obvious.
Package loss rate with different package sizes. In the process of calculating the packet loss rate, we conducted 250 simulation experiments, each lasting 100 time slots. We then aggregated all the simulation data, counted the lost packets, and calculated the packet loss rate for packet sizes of 1M, 2M, 8M, 16M, 32M, and 64M.
In general, we can find that with the increase in data packets, the packet loss rate gradually increases. This is because the increase in data packets will inevitably increase network traffic, and network transmission will be affected, causing packet loss in severe cases.
It can be seen from Figure 7 that the packet loss rate of SO is the highest. When the data packet is large, the packet loss rate of the proposed algorithm is 1% lower than that of the DQN algorithm. The SO migration strategy takes the user leaving the service coverage area as the trigger condition for migration: whenever the user leaves the current server's coverage, the services the user needs are migrated to the new ES. Such frequent migration transfers a large amount of data, and the resulting surge in network traffic may cause network congestion and packet loss and increase the probability of data transmission failure. Both DQN and DSM take server load, transmission delay, and other factors into account; however, the algorithm proposed in this article additionally predicts and corrects the user's request location, which copes better with the dynamic changes of the network.
Delay and migration costs with different Vs. Figure 8(a) shows that a large value of V makes the system care more about latency. The user latency of DSM decreases as V increases and gradually approaches the minimum latency, while that of SO remains stable. When V is small, the delay carries little weight in the decision and is therefore high; as V increases, the delay keeps decreasing toward the minimum, but the migration cost grows accordingly.
It is shown in Figure 8(b) that user latency decreases with long-term costs increasing and gradually approaches minimum latency in DSM. When the value of long-term costs is small, the delay is high.

Conclusion
In this article, we focus on the service migration strategy and propose a service migration algorithm based on Lyapunov optimization under migration cost constraints. The DSM algorithm achieves fast responses to edge service requests, reduces the request delay of user nodes, and improves the overall service efficiency of the network. At the same time, the mobility-aware service migration process is effective: the probability of service interruptions caused by node motion is reduced, and the continuity of mobile services is better guaranteed. In future research, we will continue to study how to balance user-perceived latency and migration costs under multi-user conditions while satisfying load balancing as much as possible.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The work is supported by National Key R&D Program of China (2020YFB1807802).