Mining Travel Time of Airport Ferry Network Based on Historical Trajectory Data

An airport ferry vehicle is a ground service vehicle used to transfer passengers between the far apron and the terminal. The travel time of ferry tasks in the airport ferry network is an important decision-making basis for ferry vehicle scheduling. This paper presents a graph-based method to mine the travel time between nodes in the airport ferry network. Firstly, combining map and trajectory information, the method takes the terminal boarding gates, parking lots, and remote stands as road network nodes to build a complete airport ferry road network. Then, big data processing technology is used to identify the travel time between regional connection nodes by fusing the flight schedule with ferry vehicle GPS trajectories through their temporal and spatial relationship. Finally, the Floyd shortest path algorithm from graph theory is used to obtain the shortest path and travel time between all origin-destination (OD) points. The experimental results show that the ferry times calculated by the proposed method reflect the actual driving situation well. This method saves the manpower, material resources, and time cost of on-site investigation and lays a foundation for the scheduling of ferry vehicles.


Introduction
When an aircraft is parked at a far apron, passengers need ferry vehicles to transfer between the aircraft and the terminal. After the airport operation center obtains the flight schedule and gate assignment [1,2] results for a period of time, it considers the travel time consumed by each ferry task in the road network and arranges ferry vehicles for each flight parked at a remote stand. Only a reasonable ferry scheduling plan can ensure the normal implementation of the flight schedule. At present, the scheduling of ferry vehicles in many airports mainly depends on experienced staff. The efficiency of the scheduling results is difficult to ensure, which easily leads to flight delays. Therefore, how to establish an optimal scheduling model of ferry vehicles by scientific means and improve the service level of airport ferry vehicles has attracted the attention of some scholars. The premise of ferry vehicle scheduling modeling is to know the travel time between OD points in the ferry network. However, because the airport flight area is a closed environment, non-staff cannot enter it for field measurement. With the rapid development of smart airports, in order to digitize airport management, make it more intelligent, and meet the needs of airport safety management, some airports have established GPS real-time monitoring and management systems for ground service vehicles to realize vehicle positioning, real-time tracking, track playback, cross-border alarm, and other functions. The historical GPS track data of ferry vehicles recorded by such a system reflect the time and spatial information of vehicles in the process of performing flight support tasks. Therefore, using the trajectory data to mine and extract the travel time of the airport ferry network is a new idea.
This paper proposes a novel graph-based method to mine the travel time in the airport ferry network. The research work is divided into three parts. Firstly, combining map and trajectory information, this paper takes the terminal boarding gates, parking lots, and remote stands as road network nodes to build a complete airport ferry road network. Then, the nodes are divided into different regions, and the travel time between regional connection nodes is identified by fusing the flight schedule with ferry vehicle GPS trajectories through their temporal and spatial relationship. Finally, the Floyd shortest path algorithm from graph theory is used to obtain the shortest path and travel time between all OD points. The research results of this paper can save the manpower, material resources, and time cost of on-site investigation and provide a decision-making basis for the optimal scheduling model of ferry vehicles. The rest of this paper is organized as follows. In Section 2, the literature on the optimal scheduling of airport ground support vehicles and GPS trajectory mining is reviewed. Section 3 takes Kunming Changshui International Airport in China as the research object and introduces the method of constructing the airport ferry network. In Section 4, the calculation method of OD point travel time in the road network is designed. The example verification and results are given in Section 5. Finally, conclusions and future research directions are presented in Section 6.

Literature Review
With the rapid development of civil aviation, the disadvantages of airport management and operation relying only on manual experience are becoming more and more prominent. In order to ensure the normal operation of flights, how to improve the service level of airport ground support services through scientific means has attracted more and more attention from scholars. From the optimal operation of special vehicles providing various ground services to the overall optimization of joint scheduling of multiple vehicles, many achievements have been made. This paper summarizes the research literature on different ground service vehicles as a reference. Although their service content differs from that of ferry vehicles, there are many similarities in the optimization objectives and constraints of their scheduling.
Norin et al. [3] constructed an airport deicing vehicle optimal scheduling model aiming at minimizing the delay time and the minimum number of deicing vehicles and designed the corresponding solution algorithm based on a greedy randomized adaptive search algorithm. Du et al. [4] studied the trailer scheduling problem in flight transit service, described the problem as VRPTW, and constructed an integer programming model with minimizing the trailer operation cost as the optimization objective and the vehicle operation restriction as the constraint. Starting from the flexibility of vehicle service, Wang et al. [5] proposed a scheduling algorithm based on a greedy strategy to deal with the dynamic scheduling problem of airport refueling vehicles. With the diversification of airport ground service and considering the interaction between various vehicle services, many studies begin to pay attention to the joint scheduling of airport ground service vehicles. Padron et al. [6] constructed a double objective optimization model to minimize waiting time and turnaround time for the joint scheduling of airport service equipment. Xu and Shao [7] evaluated the fluctuation of service equipment operation time by analyzing the airport historical data and proposed an optimization model of ground service support equipment with uncertain operation time. Fei and Shu'an [8] studied the optimization problem of airport service vehicle scheduling in peak hours and constructed a joint scheduling model of ground service vehicles to minimize the purchase cost and operation cost of service equipment. In terms of ferry vehicles, Zhao et al. [9] established an integer programming model with the goal of minimizing the number of ferry vehicles required in peak hours or a certain period and constructed a ferry vehicle sharing network, which transformed the model into the problem of maximum network flow. Han et al. 
[10] also proposed a ferry capacity network model, in which a directed edge indicates that the two associated nodes may be served consecutively by the same ferry vehicle. The model aims to minimize the number of ferry vehicles required and is solved by graph methods. However, their research only roughly estimated the travel time between the terminal and the far apron area in the process of building the ferry network. For large airports, the terminal covers a wide area and different boarding gates are far away from each other. A unified estimate will therefore cause large errors in practical application.
With the development of positioning technology, many vehicles are equipped with GPS receivers. While moving, the vehicle continuously collects real-time information, including position, motion parameters, and positioning time, and transmits it to the data center. This type of data is called floating car data [11,12]. The emergence of these data makes it possible to mine rich knowledge from GPS trajectory data by using big data analysis technology [13]. Palma et al. and Bhattacharya et al. [14,15] proposed methods of analyzing motion characteristics (velocity, azimuth, acceleration, etc.) to mine important places related to people and objects. Zheng and Xie [16] proposed a trajectory mining algorithm to analyze users' GPS trajectory data, so as to recommend personalized tourist attractions. In addition, Li et al. [17] analyzed the typical characteristics of floating vehicle data in parking lots and used the DBSCAN algorithm to extract parking lot locations. Wang et al. [18] proposed the identification of key nodes and sections of urban road networks based on GPS trajectory data. In terms of path extraction, Schroedl et al. and Li et al. [19,20] proposed two methods to extract a high-precision roadmap from GPS trajectories, which are applicable to high and low sampling frequencies of positioning data, respectively. Unlike them, Cao and Krumm [21] proposed a new gravity model to transform the original GPS tracks into a road network that can guide path selection. Tang et al. [22] proposed a lane-level road network information mining method based on lane number and turning rules. In addition to region of interest and path recognition, scholars have also done a lot of research on spatiotemporal pattern extraction based on trajectory data, for example, the method of automatically extracting passenger train operation information from historical track data [23]. Dong et al. [24] studied the temporal and spatial change of traffic accessibility under public health emergencies based on GPS trajectories. Lei et al. and Zhao et al. [25,26] proposed spatiotemporal analysis models to capture the motion mode of an object. Different from those, Lu et al. [27] proposed a visual analysis method to study the behavior of vehicles on a certain route. Some scholars have also studied periodic pattern recognition, such as the probabilistic periodic detection method of moving objects [28]. In general, most GPS data mining and analysis research is based on the road traffic network, and few studies use the GPS tracks of airport ferry vehicles to extract the road network structure and travel time of the flight area. Therefore, the research content of this paper is innovative.

Construction of Airport Ferry Network
This paper takes the real data of Kunming Changshui International Airport in China as the research object. The airport terminal covers an area of 548,300 square meters, with 78 boarding gates, 104 remote stands, 2 entry ports, and 3 parking lots. Figure 1 shows the spatial layout of the airport flight area. The boarding gates and entry ports are distributed around the terminal, while the parking lots and remote stands are distributed at the far apron.
When a flight is assigned to an apron, the ferry vehicle needs to arrive at the corresponding stand in advance for the arriving flight. After the plane arrives, the ferry vehicle drives along the planned route to deliver the passengers to the entry port of the terminal. For departure flights, according to the flight schedule, the ferry vehicle needs to arrive at the boarding gate before ticket check-in. After the gate is opened, passengers are taken to the designated parking stand. There is also a necessary transfer time between two consecutive services of the ferry vehicle, that is, the travel time required for the ferry vehicle from the end of the last service to the start of the next service. According to the arrival and departure attributes of two adjacent service flights, the transfer between two services can be divided into four categories, as shown in Table 1. After a ferry task, when the buffer time is sufficient, the ferry vehicle can go to the parking lot first and then go to the starting place of the next service when the next task is approaching. When the buffer time is insufficient, in order to avoid flight delays, the ferry vehicle can go directly to the starting point of the next service. The support efficiency of ferry vehicles is mainly affected by the transfer time. Flights with a remote stand at the airport usually change the boarding gate flexibly according to the situation. For large airports, the terminal covers a wide area and different boarding gates are far away from each other. Therefore, calculating the transfer path and time between all OD points in the whole airport is the prerequisite for the optimal scheduling of ferry vehicles.
If the boarding gates and remote stands are regarded as road network nodes, directly calculating the travel time between any two points in the road network from the coordinate information and the travel tracks of ferry vehicles causes a large workload. In order to simplify the airport road network structure, the terminal boarding gates and the aprons are first divided into several relatively independent areas according to adjacency, and the spatial transfer of ferry vehicles between service points is regarded as a transfer between regions. According to the number of remote stands, apron 3 is divided into parts 1 to 4, apron 5 into parts 1 to 4, and apron 7 into parts 1 to 2. The boarding gate area of the terminal is divided into parts 1 to 4 for the convenience of the road network structure. Figure 2 shows the historical tracks of ferry vehicles. Combined with the actual road network distribution of the airport, the connection relationship between regions can be obtained, as shown in Figure 3. The arrangement of service points within one region is relatively simple and is either in series or in parallel. As shown in Figure 4, there are two rows of remote stands in area 5-2, which are distributed in parallel. Area 3-4 has only a single row of stands, which are distributed in series. Therefore, combined with the road network structure between regions, a complete road network structure of airport service points can be obtained, as shown in Figure 5. Once the spatial connection relationship between all service points is obtained from the road network diagram, any service point can be reached from any other service point through the road network, and when combined with the distance information on the edges, the travel time between any two points can be calculated. Therefore, the next section explores the travel time of each edge of the road network.
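The four transfer categories follow from where each kind of service starts and ends. Table 1 is not reproduced in this text, so the sketch below only infers the mapping from the service description above (an arrival service runs from a remote stand to an entry port, a departure service from a boarding gate to a remote stand); the names `FlightType`, `SERVICE_ENDPOINTS`, and `transfer_leg` are hypothetical.

```python
from enum import Enum


class FlightType(Enum):
    ARRIVAL = "arrival"
    DEPARTURE = "departure"


# (start location, end location) of one ferry service, inferred from the text:
# arrival:   wait at the remote stand, deliver passengers to the entry port
# departure: pick up at the boarding gate, deliver passengers to the remote stand
SERVICE_ENDPOINTS = {
    FlightType.ARRIVAL: ("remote stand", "entry port"),
    FlightType.DEPARTURE: ("boarding gate", "remote stand"),
}


def transfer_leg(previous, following):
    """Return (from, to) for the empty transfer between two consecutive services."""
    _, previous_end = SERVICE_ENDPOINTS[previous]
    following_start, _ = SERVICE_ENDPOINTS[following]
    return previous_end, following_start


if __name__ == "__main__":
    for prev in FlightType:
        for nxt in FlightType:
            print(prev.value, "->", nxt.value, ":", transfer_leg(prev, nxt))
```

Enumerating both flight types for the previous and the following service gives exactly four combinations, which is consistent with the four categories mentioned above.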

Calculation Method of Travel Time
According to the ferry road network established in the previous section, this section uses the historical GPS track data of ferry vehicles to obtain the weight of each edge. For convenience of description, the following definitions are made: all nodes in the ferry network are represented by $V = [D_1, D_2, \ldots, D_i, \ldots, D_n]$, where $D_i$ represents a region. The nodes in one region are divided into internal nodes and connection nodes: an internal node $P_{im}$ is connected only with nodes in its own region, whereas a connection node $Q_{ik}$ can be connected with nodes in other regions. The GPS tracks of all ferry vehicles are rep-

Travel Time between Terminal and Far Apron.
During the flight support task, the ferry vehicle travels between the boarding gate area and the far stand area. Therefore, the path and time of the connecting edge between the gate area and the far stand area can be mined according to the temporal and spatial correlation between the flight schedule and the GPS track of the ferry vehicle. For arrival flights, the service flow of ferry vehicles is shown in Figure 6. $t_i^{ETA}$ indicates the estimated arrival time of the flight. The ferry service starts when the vehicle arrives at the remote stand in advance, and the service start time $TS_i^A$ can be calculated according to equation (1). After the flight arrives at the remote stand, it takes time $T_{wait}$ to wait for passengers to board. The ferry service ends after the passengers get off at the port, and the service end time $TE_i^A$ can be calculated according to equation (2).
Journal of Advanced Transportation
Similarly, for departure flights, according to the flight schedule, the ferry vehicle arrives at the boarding gate before ticket check-in. After passengers get off at the parking stand, the service of the ferry vehicle ends. Figure 7 shows the ferry vehicle service process of departure flights. Assuming that the boarding gate opens 40 minutes before the departure time $t_i^{ETD}$ of the flight (according to the service standards of China's civil transport airports), the start time of the ferry vehicle for the departure flight is calculated by equation (3) and the end time of the service is calculated by equation (4).
According to the ferry service process, the algorithm steps for mining the travel time $T_i^{ferry}$ between the boarding gate area and the apron by integrating GPS track data and flight data are as follows.
Step 1. According to the road network structure, find the connection nodes between the boarding gate area and the far apron area to form OD pairs $[Q_{ik}, Q_{jl}]$. Search the flight set $F$ and find each flight $i$ whose ferry task starts and ends at $[Q_{ik}, Q_{jl}]$.
Step 2. If the flight $i$ found is a departure flight, obtain the departure time $t_i^{ETD}$ from the flight information, take time $t$ and distance $m$ as the search range parameters, and form a set $Ferry_{start}$ of all ferry vehicles that appear at the boarding gate $Q_{ik}$ in the period [$t_i^{ETD}$ …
Step 3. Take the ferry car $G_i$ that exists in both $Ferry_{start}$ and $Ferry_{end}$ as the ferry car serving flight $i$, and calculate the travel time $T^{i}_{end} - T^{i}_{start}$ of the ferry car as the travel time from the boarding gate to the stand.
To judge whether a ferry car has entered the search range, take the boarding gate or stand as the center of a circle, calculate the distance between the ferry car and the center from their longitudes and latitudes, and compare this distance with the reference distance $m$. If the calculated distance is less than the reference distance, the ferry car is judged to have entered the search range.
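The range test and the travel time difference in Step 3 can be sketched as follows. This is a minimal illustration and not the paper's implementation: it assumes the haversine formula for the longitude and latitude distance, a track given as time-sorted `(timestamp_s, lat, lon)` tuples, and that the travel time runs from the last GPS fix inside the origin circle to the first later fix inside the destination circle. The time-window filtering of Step 2 is omitted, and all function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed value)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def in_search_range(point, center, m=20.0):
    """True if the GPS fix lies within m metres of the gate or stand."""
    return haversine_m(point[0], point[1], center[0], center[1]) < m


def travel_time_s(track, origin, destination, m=20.0):
    """Seconds from the last fix inside the origin circle to the first
    subsequent fix inside the destination circle; None if no such pair exists."""
    t_start = None
    for ts, lat, lon in track:
        if in_search_range((lat, lon), origin, m):
            t_start = ts  # keep updating while the vehicle is still at the origin
        elif t_start is not None and in_search_range((lat, lon), destination, m):
            return ts - t_start
    return None


if __name__ == "__main__":
    gate = (25.10, 102.93)   # illustrative coordinates, not real airport data
    stand = (25.11, 102.93)
    track = [
        (0, 25.10, 102.93),
        (30, 25.10005, 102.93),
        (60, 25.105, 102.93),
        (120, 25.11, 102.93),
    ]
    print(travel_time_s(track, gate, stand))
```

Using the last fix at the origin rather than the first excludes the waiting time at the gate from the travel time, which matches the separation of waiting and travel time described in the text.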

Travel Time between Different Far Aprons.
The starting and ending points of a flight service are a boarding gate and a stand, so the travel time for an OD pair in which both points are stands cannot be obtained through flight information. There are two situations for the travel of a ferry car between two target stands. In the first, the ferry car travels directly from the starting stand to the ending stand, and the travel time reflects the distance between the two stands. In the second, the ferry car drives from the starting stand to other places and then to the target stand, and the travel time is obviously greater than that implied by the distance between the two stands. Therefore, without a time range to guide the mining, we can only intercept the GPS track of the ferry vehicle through the coordinate values of the connecting nodes in the two far aprons, collect a large number of travel time values, and select a smaller value as the travel time between the aprons. In this way, the distance from the parking lot to other service points can also be obtained.
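One way to "select a smaller value" from many observed travel times is to take a low quantile of the sample, which ignores detours without relying on a single, possibly erroneous, minimum. The text does not state the exact selection rule, so the quantile choice below is an assumption, and `representative_travel_time` is a hypothetical helper.

```python
def representative_travel_time(samples_min, quantile=0.25):
    """Pick a low-quantile travel time (minutes) from observed samples.

    Detours inflate some samples, so a low quantile approximates the direct
    trip. The default quantile of 0.25 is an illustrative assumption.
    """
    if not samples_min:
        raise ValueError("no travel time samples")
    if not 0.0 <= quantile <= 1.0:
        raise ValueError("quantile must be between 0 and 1")
    ordered = sorted(samples_min)
    index = int(quantile * (len(ordered) - 1))  # nearest lower rank
    return ordered[index]


if __name__ == "__main__":
    # Illustrative samples: direct trips near 1 minute plus a few detours.
    samples = [1.0, 1.1, 0.9, 5.0, 7.5, 1.2, 1.0, 12.0, 0.95, 1.05]
    print(representative_travel_time(samples))
```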

Travel Time between Any Two Nodes.
After mining the travel time information of edges in the road network through the above method, the nodes, edges, and edge weights of the road network have been obtained. The travel time between any two nodes can be calculated by using a shortest path algorithm. Calculating the shortest path is a classical problem in graph theory, and the research on it is very mature. The commonly used shortest path algorithms include Floyd [29], Dijkstra [30], and SPFA [31]. It is necessary to select the appropriate algorithm according to the use scenario. In terms of algorithm characteristics, the Floyd algorithm is suitable for finding shortest paths from multiple sources, but its high time complexity makes it unsuitable for road network maps with many nodes; the Dijkstra algorithm is the basis of all basic shortest path algorithms and is the most stable, but it finds the shortest paths from a single source and needs to be run repeatedly to obtain the shortest paths between all points; the SPFA algorithm is a queue optimization of the Bellman-Ford algorithm and is a BFS-based shortest path algorithm. It handles negative edge weights, which the Dijkstra algorithm cannot, but its implementation is the most troublesome. The road network constructed in this paper has a total of 187 service points. Considering the scale of the road network, the positive edge weights, and the need to obtain the path information between all points, the Floyd algorithm is selected to solve the shortest path.
The Floyd algorithm uses the idea of dynamic programming to find the shortest paths between multiple source points in a given weighted graph. The algorithm has a clear structure and is quick to implement. Its core idea is as follows: for each pair of vertices u and v, if there is a vertex w such that the path from u to w and then to v is shorter than the known path, update it. Considering that the nodes have been divided into different regions, this paper first uses the Floyd algorithm to calculate the shortest path between different regions and then calculates the shortest path between the nodes in a region and its connecting nodes. The algorithm flow of finding the shortest path between the OD points of a ferry task is given in Algorithm 1. After the shortest path between regions is obtained by this algorithm, it is combined with the distance from the internal node to the connecting node in the region to obtain the shortest travel time between any two points. An edge inside an area mainly represents the distance between two gates, for which the fixed value $C_{gate}$ can be adopted. If $n$ stands separate the internal node from the connecting node, the distance is $n \cdot C_{gate}$.
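To give a sense of scale for the complexity remark, the Floyd algorithm performs on the order of $n^3$ update steps for $n$ nodes. As a rough illustration, assuming it were run directly over all 187 service points:

```latex
\[
n^{3} = 187^{3} = 6{,}539{,}203 \approx 6.5 \times 10^{6}
\]
```

A few million elementary updates is a small workload for a one-off computation, which is consistent with the choice of the Floyd algorithm for a network of this size.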
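The update rule described above can be sketched as a standard Floyd-Warshall implementation with path reconstruction. This is a generic textbook version, not the authors' code; the four-node weight matrix in the usage example is invented for illustration and does not come from the airport data.

```python
import math

INF = math.inf


def floyd_warshall(weights):
    """All-pairs shortest travel times.

    weights[i][j] is the direct travel time from i to j (INF if not connected,
    0 on the diagonal). Returns (dist, nxt), where nxt[i][j] is the node that
    follows i on a shortest path from i to j.
    """
    n = len(weights)
    dist = [row[:] for row in weights]
    nxt = [[j if weights[i][j] != INF else None for j in range(n)] for i in range(n)]
    for k in range(n):          # allow node k as an intermediate stop
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
                    nxt[i][j] = nxt[i][k]
    return dist, nxt


def shortest_path(nxt, i, j):
    """Rebuild the node sequence from i to j; empty list if unreachable."""
    if nxt[i][j] is None:
        return []
    path = [i]
    while i != j:
        i = nxt[i][j]
        path.append(i)
    return path


if __name__ == "__main__":
    # Hypothetical travel times in minutes between four connection nodes.
    w = [
        [0, 2, INF, 7],
        [2, 0, 3, INF],
        [INF, 3, 0, 1],
        [7, INF, 1, 0],
    ]
    dist, nxt = floyd_warshall(w)
    print(dist[0][3], shortest_path(nxt, 0, 3))
```

Keeping the `nxt` matrix alongside the distances is what allows the path itself, and not only its travel time, to be reported for every OD pair.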

Experimental Results and Analysis
In order to verify the proposed travel time mining method of the ferry network, this paper uses the actual data of Changshui International Airport to verify the effectiveness and practicability of the method. Flight data from January to June 2019, historical GPS track data of ferry vehicles, and coordinate information of each parking stand and boarding gate of the airport are collected. The data samples are shown in Tables 2 to 4.
Figure 7: Ferry vehicle service process of departure flight.
Algorithm 1:
Input: region ferry network $G = (Q, E)$, where $Q$ is the set of connection nodes representing each region and $E$ is the set of edges, weighted by the travel time between regions obtained by data mining.
(1) Initialize the matrix $M$, where $m_{ij}$ represents the travel time between vertices $i$ and $j$. If $i$ and $j$ are not directly connected, the weight is set to $\infty$. If $i$ and $j$ come from the same region, the weight is set to 0.
(2) Starting from the first node $q_1$, calculate $m^{(1)}_{ij} = \min\left(m_{ij},\, m_{i1} + m_{1j}\right)$, the shortest path from $i$ to $j$ that is only allowed to pass through $q_1$ as an intermediate node, and update the matrix $M$.
(3) For each subsequent node $q_k$, calculate $m^{(k)}_{ij} = \min\left(m^{(k-1)}_{ij},\, m^{(k-1)}_{ik} + m^{(k-1)}_{kj}\right)$, the shortest path from $i$ to $j$ that is only allowed to pass through $q_1, q_2, \ldots, q_k$ as intermediate nodes, and update the matrix $M$.
Since the airport does not record the daily scheduling history of ferry vehicles, we do not know which vehicle performed each task. Therefore, we need to identify them with the data fusion method proposed in Section 4 to obtain the specific ferry vehicle corresponding to each task and the times at which it enters and leaves the starting and ending points. Following the method in Section 4, we first mine the distance between the boarding gate area and the far apron area. Ferry vehicles will not stop too far away from the aircraft or stay at the task site for too long, so we set the time scaling parameter $t = 10$ min and the space distance scaling parameter $m = 20$ m. Taking boarding gate area B-1 and far apron 5-2 as an example, the corresponding starting waiting time, travel time, and terminal waiting time can be calculated from the identification results, and the GPS track of the vehicle can be checked back according to the identified times at the starting and ending points to obtain the support path, as shown in Figure 8. The yellow circle indicates the search range, and the time difference of these seven GPS data can be calculated as the travel time between connection nodes gate B48 and stand 521L. The results of each such calculation are collected, and the mean value of all results is taken as the transfer time between the two. For the distance between different far apron areas, we count a large number of travel times between two connection nodes. Figure 9 shows the distance distribution characteristics from gate 48 to the P-1 parking lot connection point p5|p6 and from stand 526L to stand 531. The corresponding distance is represented by selecting a smaller value, which can be taken as 0.5 and 1.0 minutes, respectively. Through the above two methods, the transfer time between regional connection nodes in the ferry network can be calculated, as shown in Table 5. With the travel time between connection nodes, the shortest travel time between any two regions can be calculated according to the Floyd shortest path algorithm. The distance between aircraft stands within an area is $C_{gate} = 40$ m according to the actual situation of Changshui airport. When the maximum allowable speed of the ferry vehicle in the flight area is 25 km/h, the travel time of an inner edge of the area can be calculated as about 0.1 minutes. Therefore, the distance between any two nodes in the airport ferry network is equal to the distance between regions plus the distances between these two points and the connecting nodes of their regions. The calculation results are shown in Table 6.
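The inner-edge travel time quoted above can be checked with a short calculation, and the composition of an OD travel time can be written out explicitly. The symbols $v$, $t_{gate}$, $T_{region}$, $n_a$, and $n_b$ are introduced here only for illustration: $v$ is the speed limit, $T_{region}$ is the shortest time between the two regions' connection nodes, and $n_a$, $n_b$ are the numbers of stand spacings separating each endpoint from its connection node.

```latex
\[
t_{gate} = \frac{C_{gate}}{v}
         = \frac{0.04\ \text{km}}{25\ \text{km/h}}
         = 0.0016\ \text{h}
         = 0.096\ \text{min}
         \approx 0.1\ \text{min}
\]
\[
T(a, b) \approx T_{region} + (n_a + n_b)\, t_{gate}
\]
```

For example, under these assumptions two nodes that each sit three stand spacings away from their connection nodes would add about $6 \times 0.1 = 0.6$ minutes to the inter-region time.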
Figure 9: Histogram of OD distance distribution from b48 to p5|p6 (a) and 526L to 531 (b).
In order to check the accuracy of the distance information mined in this paper, we directly calculate the distance between the internal nodes of two randomly selected areas as the real value through the above distance calculation method and compare it with the corresponding distance in the distance matrix obtained by the Floyd algorithm. To better measure this difference, this paper introduces the absolute error (Error), mean absolute error (MAE), and root mean square error (RMSE) as evaluation indexes to test the accuracy of the distance matrix, as shown in equations (5) to (7).
The error is shown in Figure 10, and the MAE and RMSE calculation results are shown in Table 7. The results show that the mean error between the distance calculated from the road network and that obtained from the GPS track stays at about zero, and the deviation stays at about 1 minute. Considering that there are many random factors in the actual driving of the ferry car, the current results reflect the actual driving situation well and meet the data accuracy needs of modeling.
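Equations (5) to (7) are not reproduced in this text. The sketch below uses the standard definitions of MAE and RMSE and assumes that Error is the signed difference between the network-based estimate and the GPS-based value; the function names and the sample numbers are illustrative only.

```python
import math


def errors(estimated, observed):
    """Per-pair error: estimated minus observed (signed, assumed convention)."""
    if len(estimated) != len(observed) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    return [e - o for e, o in zip(estimated, observed)]


def mae(estimated, observed):
    """Mean absolute error."""
    errs = errors(estimated, observed)
    return sum(abs(x) for x in errs) / len(errs)


def rmse(estimated, observed):
    """Root mean square error."""
    errs = errors(estimated, observed)
    return math.sqrt(sum(x * x for x in errs) / len(errs))


if __name__ == "__main__":
    # Hypothetical travel times in minutes, not values from the paper.
    from_network = [10.0, 12.0, 8.0]
    from_gps = [9.0, 12.5, 9.5]
    print(errors(from_network, from_gps), mae(from_network, from_gps), rmse(from_network, from_gps))
```

RMSE penalizes large individual deviations more heavily than MAE, so reporting both indicates whether the error is spread evenly or dominated by a few OD pairs.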

Conclusion
This paper studies the ferry vehicles in the airport flight area. Based on the analysis and mining of ferry vehicle trajectory data, the ferry road network map is established, and a new method is proposed to extract the travel time between OD points in the ferry network. The experimental results show that the shape of the ferry road network constructed in this paper and the calculated travel times reflect the real operation to a great extent. The ultimate purpose of this research is to solve the problem of dynamic scheduling of ferry vehicles. With the travel time between OD points of the ferry network obtained in this paper, combined with the real-time GPS data of ferry vehicles, future work can study how to predict the task state of vehicles and schedule them dynamically and reasonably.
Data Availability
The flight data and ferry vehicle GPS track data used in this paper are provided by Kunming Changshui Airport, China. The coordinates of each boarding gate and remote stand of the airport are obtained from Google satellite maps.

Conflicts of Interest
The authors declare that there are no conflicts of interest.