Segmented Trajectory Clustering-Based Destination Prediction in IoVs

Location-based services have important applications in IoVs, and destination-related applications in particular have attracted increasing attention. Due to privacy considerations or operational convenience, people hesitate to share their destinations publicly. Thus, these applications need to predict the destinations of moving vehicles in order to provide better services. Some existing works on destination prediction suffer from the dataset sparsity problem or the model inaccuracy problem. To overcome these problems, a Segmented Trajectory Clustering-Based Destination Prediction mechanism is proposed in this paper. First, each original trajectory is segmented into several key sub-trajectories with a Douglas-Peucker (DP)-based trajectory segmentation algorithm. Then, all the sub-trajectories are clustered based on the average nearest point pair distance to reveal common characteristics or similar tracks. Finally, a deep neural network-based model is utilized to predict destinations from the historical trajectories. Extensive simulations are conducted for destination prediction. Simulation results show that the proposed method can predict destinations with acceptable average errors and outperforms other methods in most cases.


I. INTRODUCTION
The location-based services in the Internet of Vehicles (IoVs) have many attractive applications [1]-[3], such as navigation, accident alarm, and advertisement publishing services. In location-based services, the location information is the core part of the service and should be well collected. With the help of Global Positioning System (GPS) devices, the location information of vehicles can be acquired and then forwarded to the central service providers via wireless communication techniques, including WAVE, LTE-V, or cellular services [4]. The central service providers generate suggestions according to their services and the vehicle locations, and send them to the requesting vehicles as recommendations [5]. Among the numerous location-based services in IoVs, destination-related applications play important roles and provide much convenience to people [6]. However, how to obtain the destination of each vehicle's trip is a significant problem that must be solved to serve those applications.
The associate editor coordinating the review of this manuscript and approving it for publication was Chao Chen.
From the perspective of service providers, it is hoped that each vehicle can publish the destination of its current trip before departure, so that better services can be provided. However, due to privacy protection [7] or operational convenience, most vehicle users hesitate to share their destination information. Thus, predicting the final destination of each vehicle becomes an effective alternative, which can help the service providers analyze vehicle behaviors. First, transportation officials can manage road traffic better, for example through traffic congestion control. Second, based on the destination prediction service, recommendation systems such as advertisement delivery systems can be implemented. For example, significant changes have taken place in the field of taxi dispatch. The dispatch system must decide which taxi to notify to pick up a passenger. To achieve this purpose, the destination of each moving vehicle should be predicted. This problem has been one of the Kaggle challenges [8].
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Among existing works on destination prediction, Markov models and Hidden Markov Models (HMM) [9], [10] are widely used. Although these models work to a certain extent, vehicle trajectories do not always conform to a Markov process, so the prediction result is sometimes not good enough, especially when the trajectory is very long. Besides, methods that match the current trajectory against historical trajectories have also been proposed. However, the dataset sparsity problem is a significant obstacle to achieving an accurate prediction result [11]. In addition, some works [12], [13] predict the final destination by clustering. Once all the trajectories have been clustered, the destination of each new trajectory can be predicted by assigning it to one of the clusters. However, these methods are usually based on the shape of trajectories and neglect the distance between trajectories, which may lead to worse prediction accuracy.
In this paper, we incorporate some of the above ideas and propose a segmented trajectory clustering-based destination prediction algorithm. The algorithm contains three parts, namely trajectory segmentation, sub-trajectory clustering, and destination prediction. In order to avoid the disadvantages of clustering entire trajectories, a Douglas-Peucker (DP)-based method of segmenting trajectories into common sub-trajectories is utilized. Once trajectories are segmented, some key sub-trajectories may be very similar or have common characteristics. We then cluster the sub-trajectories to group the similar ones. Here, the similarity of two sub-trajectories is defined as the average nearest point pair distance, detailed in the following section. Finally, a GRU-based deep learning network is proposed, with which the destinations of moving vehicles can be predicted. Extensive simulations are conducted to evaluate the effectiveness of the proposed method. The results show that the proposed method predicts the destinations of moving vehicles with relatively smaller prediction errors than three existing methods.
The main contributions of this paper are listed as follows:
1) A DP-based Trajectory Segmentation Algorithm is proposed for segmenting trajectories into several key sub-trajectories.
2) A Sub-trajectory Clustering Algorithm is defined, which can classify similar sub-trajectories to reveal common characteristics.
3) A deep learning network, GRU, is used to predict the traveling destination based on the segmented trajectories and clustered sub-trajectories.
The rest of this paper is organized as follows. Related works are discussed in section II. The problem is formulated in section III. Section IV elaborates on the details of destination prediction. Extensive simulations are conducted in section V. Finally, the conclusions and discussion are stated in section VI.

II. RELATED WORK
Destination prediction is of great importance, and many works have already addressed it. Overall, there are four classes of methods: Markov-based, historical trajectory matching-based, clustering-based, and deep learning-based methods.
The Markov model and its variant, the Hidden Markov Model, are statistical models with many applications, and they can also be utilized for destination prediction, as in [9], [10], [14]. Paper [9] uses Markov chains to predict the remaining traveling time for moving vehicles on freeways. With low-cost GPS sensors and a map database, an HMM between traveling routes and destinations is built in [10], which can predict drivers' destinations.
Another common solution to destination prediction is to derive the movement probability from historical trajectories. However, this method relies on a dataset composed of all the historical traveling trajectories, which should cover all the traveling cases. Thus, it may happen that no match can be found in the dataset of historical trajectories, which is called the trajectory sparsity problem. Some works [11], [15], [16] have been done to solve this problem. Paper [11] proposes the Sub-Trajectory Synthesis (SubSyn) algorithm for the data sparsity problem, which decomposes historical trajectories into sub-trajectories and connects these sub-trajectories into "synthesized" trajectories. In this way, the trajectory dataset used for matching is greatly enlarged and a better prediction effect is achieved. Compared to [11], paper [16] focuses on the behavior characteristics of queried trajectories. A method called MGDPre is proposed, which works better on sparse datasets.
Besides, some works predict the destinations of moving objects by clustering. A data-driven framework, DestPre, is proposed in [12]. For efficiently retrieving similar trajectories, DestPre uses an index based on the Bucket PR Quadtree and Minwise hashing. In order to predict vehicle destinations, it applies a clustering method. Paper [13] clusters trajectories and models the main traffic flow patterns with a mixture of 2-D Gaussian distributions. For any new trajectory, it first assigns the trajectory to the cluster to which it most likely belongs. Then it uses the characteristics of that cluster to predict the final destination.
Currently, some works use modern deep learning techniques or extra information together with trajectories to predict destinations. Different from most previous approaches, paper [17] considers additional information such as the departure time and the driver ID. It uses neural networks to predict the destination, and a fixed-length output can be obtained from a variable-length input sequence. The Destination Prediction Model (DPM) is proposed in [18] to estimate a user's future destination. It combines filtering and contextual knowledge to increase accuracy and reduce mistakes in historical and contextual knowledge. An efficient data embedding method, circular fuzzy embedding (CFE), is proposed in [19] for pre-processing time-related features; combined with the ensemble learning model (ELM), it achieves better destination prediction accuracy. Based on the most recent partial trajectories and additional contextual data, a Long Short-Term Memory (LSTM)-based model is proposed in [20]. A novel prediction algorithm, T-CONV, is proposed in [21], which treats trajectories as two-dimensional images rather than one-dimensional sequences and obtains higher accuracy with a convolutional neural network (CNN).
Most of the previous works do well in some domains. However, few works consider the above mechanisms together to combine their advantages. In this paper, we combine several of the ideas above to propose a new destination prediction mechanism.

III. PROBLEM FORMULATION
In this paper, we focus on the problem of destination prediction in the IoVs. We first list the definitions used in this paper.
Definition 1: In an IoV, there are $N$ vehicles, denoted as $v_1, v_2, \ldots, v_N$. Each vehicle $v_n$ produces a sequence of locations called a trajectory while traveling, $T_{v_n} = \{l^{t_1}_{v_n}, l^{t_2}_{v_n}, \ldots, l^{t_J}_{v_n}\}$, which is composed of a series of spatio-temporal data. The first point $l^{t_1}_{v_n}$ is the starting point and the last point $l^{t_J}_{v_n}$ is the endpoint or destination of a complete trajectory. We use $d_{v_n}$ to represent the destination of the vehicle $v_n$, which is the same as $l^{t_J}_{v_n}$. An example of a trajectory sequence is illustrated by the blue line in Fig. 1.
When the sampling rate is constant, the collected points in some areas may be dense due to traffic jams or other reasons. These dense points contribute little to our problem and generate a large amount of redundant data in trajectories. Some works such as [22] and [23] have studied this problem with movement pattern mining. However, they keep only the movement pattern information rather than the location information in trajectories. Therefore, in order to avoid the data redundancy problem and keep the location information in trajectories, the key or representative locations should be extracted from trajectories. The processed trajectory not only retains the original location changes, but also becomes a series of key track segments, which is easier to handle. The key locations of a trajectory are defined as follows.
Definition 3: The key locations of trajectory $T_{v_n}$ can be represented as a sequence $kl_1, kl_2, \ldots$, where each $kl_m$ is a key location in the trajectory. Take the trajectory in Fig. 1 as an example, which is presented by the blue line. In this trajectory, locations $l^{t_1}_{v_n}$, $l^{t_4}_{v_n}$, $l^{t_9}_{v_n}$, and $l^{t_{15}}_{v_n}$ are the key locations. The method to obtain the key locations will be detailed in section IV-A. Based on these key locations, the segments or linear representations of trajectory $T_{v_n}$ can be defined as in Definition 4.

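Definition 4: The trajectory of vehicle $v_n$ can be represented as several segments or piecewise linear trajectories, denoted $ST_{v_n}$, where each segment connects two consecutive key locations $kl_m$ and $kl_{m+1}$.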
In order to avoid information loss, we fill points in each segment, such as the segment from $kl_m$ to $kl_{m+1}$, and the number of filled points is the same as the original number of locations between the two key locations $kl_m$ and $kl_{m+1}$. The filled points are evenly distributed along the segment. The segmented trajectories of the original trajectory indicated by the blue line in Fig. 1 are presented by the red lines, each of which contains several points.
Some trajectories may have common or similar sub-trajectories. In order to find common characteristics of trajectories, the sub-trajectories need to be clustered. Clustering sub-trajectories has two advantages. First, some vehicles usually travel on similar routes, which contain common sub-trajectories that can be clustered. Second, due to the errors of GPS devices, some sub-trajectories on the same road may be regarded as different routes, which should be corrected. A very intuitive way to cluster is to match the points collected by GPS devices to the map to determine where the vehicle is traveling. However, this method does not always work well [24], [25], as the quality of the GPS sampling seriously affects the map matching accuracy.
It is common that the key trajectories on the same road are similar in location and differ little in shape. This inspires us to find the common trajectories of the same road. A clustering method based on trajectory shape and distance is effective at finding similar features or attributes. We will introduce the details of clustering in section IV-B. After all the key sub-trajectories are clustered, each of them uniquely belongs to a certain cluster. Note that if successive sub-trajectories in $ST_{v_n}$ belong to the same cluster $c_i$, there will be duplicate items in $CT_{v_n}$. In this case, the duplicated items are deleted and only one is kept. This reduces the redundant data while keeping the movement patterns of vehicles.
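The de-duplication of successive identical clusters described above can be sketched as follows. This is a minimal illustration, assuming each cluster is identified by an integer ID; the function name `collapse_consecutive` and the sample IDs are hypothetical and do not come from the paper.

```python
from itertools import groupby


def collapse_consecutive(cluster_ids):
    """Keep a single item from every run of identical consecutive cluster IDs."""
    return [cid for cid, _ in groupby(cluster_ids)]


# Hypothetical cluster IDs of the successive sub-trajectories of one trip.
raw_sequence = [7, 7, 3, 3, 3, 9, 7]
print(collapse_consecutive(raw_sequence))  # [7, 3, 9, 7]
```

Only consecutive repeats are merged, so a vehicle that returns to an earlier cluster later in the trip (the final `7` here) still keeps that visit in its sequence.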
So far, all the original trajectories recorded as longitude and latitude pairs have become sequences of clusters after the segmentation and clustering processes. Next, our destination prediction process is to predict the final cluster in $CT_{v_n}$. As each cluster sequence follows chronological order, a recurrent neural network (RNN) can be used to train the prediction model. In this paper, we use the GRU model to train the prediction model. All the clustered trajectories compose the training set, and the destination cluster of each trajectory is regarded as the label. Based on different completion rates of vehicle traveling, several prediction models can be obtained, with which the final destination can be predicted. In section IV-C, we will introduce how to predict destinations in detail. The overall flow chart of destination prediction is shown in Figure 2. Details of each operation in the chart are described in section IV.

IV. PROBLEM IMPLEMENTATION
A. TRAJECTORY SEGMENTATION
In order to segment trajectories, an intuitive method is to find the inflection points in trajectories, as in [26], [27], where vehicles turn or the track has a large corner. Intuitively, an angle-based segmentation method is an effective way to find the inflection positions and segment the trajectory at those positions. However, if trajectories are split by this method directly, a short trajectory may be divided into many sub-trajectories, which is not feasible. Thus, we use the idea of the Douglas-Peucker (DP) algorithm [28] to segment trajectories. DP is a commonly used graphic compression algorithm whose purpose is to compress a large number of redundant graphic nodes or to extract the necessary data points from the original data. In this paper, the DP algorithm is used to find the key locations, namely the inflection points of trajectories, and then the piecewise linear trajectories can be obtained based on the key locations.

Algorithm 1 DP-Based Trajectory Segmentation Algorithm
Input: ...
  ...
  end for
  if $d_{max} \le TH$ then
    ...
    num++
    ...
  else
    index = arg($d_{max}$)
    ...
  ...
  for n ← 1 to N do
    add ($kl_s + \overrightarrow{Piece} \times n$) to Seg
  end for
  add $kl_e$ to Seg

Here, the segmentation threshold $TH$ is important, as its value determines the segmentation effect of trajectories. Based on our previous descriptions, the segmentation point of a trajectory should be the location where a vehicle turns. The minimum curve radius of roads is designed to ensure vehicles' safety while moving at a specified driving speed [29]. As shown in Fig. 3, when a vehicle is driving at speed $v$, the curve radius must be greater than or equal to $r$; otherwise, it is unsafe for passengers, because the road cannot provide sufficient centripetal force $F$ to prevent moving vehicles from slipping. The radius $r$ can be calculated with Equation 3, where $\mu$ is the lateral force coefficient and $i_h$ is the superelevation slope, both of which are used to measure road conditions.
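The body of Equation 3 is not legible in this copy. As a hedged reconstruction, the force-balance argument above leads to the standard highway-design expression below. The constant 127 assumes $v$ in km/h, $r$ in meters, and $g \approx 9.8$ m/s$^2$; the small-slope simplification (lateral resistance approximated by $mg(\mu + i_h)$) is an assumption of this sketch and is not stated in the text.

```latex
\begin{align*}
  \frac{m v^2}{r} &\le m g\,(\mu + i_h)
    && \text{centripetal demand vs. lateral friction plus superelevation} \\
  r &\ge \frac{v^2}{g\,(\mu + i_h)}
    && v \text{ in m/s} \\
  r &= \frac{v^2}{3.6^2 \cdot 9.8\,(\mu + i_h)} \approx \frac{v^2}{127\,(\mu + i_h)}
    && v \text{ in km/h}
\end{align*}
```

The factor $3.6^2 \cdot 9.8 \approx 127$ only converts km/h to m/s and absorbs $g$, so the last two lines describe the same bound in different units.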
Therefore, we can let $r$ be the segmentation threshold $TH$, which means a vehicle's normal turning radius will not be greater than $r$, and the original trajectory should be segmented into several sub-trajectories at arg($d_{max}$) if the distance $d_{max}$ is greater than $r$.
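Since most of the Algorithm 1 listing is not recoverable here, the following is a minimal sketch of a DP-style segmentation with uniform point filling, written from the prose description rather than from the authors' listing. It assumes planar coordinates in meters (GPS points would first need to be projected), and the names `point_line_distance`, `key_location_indices`, and `segment_trajectory` are hypothetical.

```python
import math


def point_line_distance(p, a, b):
    """Perpendicular distance from p to the line through a and b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:  # a and b coincide: fall back to point-to-point distance
        return math.hypot(px - ax, py - ay)
    return abs(dx * (py - ay) - dy * (px - ax)) / length


def key_location_indices(points, th, lo=0, hi=None):
    """Douglas-Peucker recursion returning the indices of the key locations."""
    if not points:
        return []
    if hi is None:
        hi = len(points) - 1
    if hi - lo < 2:  # no interior point left to test
        return [lo, hi] if hi > lo else [lo]
    dists = [point_line_distance(points[i], points[lo], points[hi])
             for i in range(lo + 1, hi)]
    d_max = max(dists)
    if d_max <= th:  # the chord is a good enough linear representation
        return [lo, hi]
    index = lo + 1 + dists.index(d_max)  # split at the farthest point
    left = key_location_indices(points, th, lo, index)
    right = key_location_indices(points, th, index, hi)
    return left[:-1] + right  # drop the duplicated split index


def segment_trajectory(points, th):
    """Split a trajectory at its key locations and refill each segment uniformly."""
    keys = key_location_indices(points, th)
    segments = []
    for s, e in zip(keys, keys[1:]):
        n_inner = e - s - 1  # same number of points as originally between the keys
        (sx, sy), (ex, ey) = points[s], points[e]
        step_x = (ex - sx) / (n_inner + 1)
        step_y = (ey - sy) / (n_inner + 1)
        seg = [(sx + step_x * n, sy + step_y * n) for n in range(n_inner + 1)]
        seg.append((ex, ey))
        segments.append(seg)
    return segments


# Hypothetical L-shaped track with a small GPS wobble on the first leg.
track = [(0, 0), (10, 0), (20, 1), (30, 0), (30, 10), (30, 20)]
for seg in segment_trajectory(track, th=5.0):
    print(seg)
```

With the illustrative threshold of 5 m the wobble at `(20, 1)` is smoothed away, while the corner at `(30, 0)` is kept as a key location.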

B. SUB-TRAJECTORY CLUSTERING
Many clustering algorithms have been proposed, and it is necessary to choose a suitable one in order to ensure an acceptable clustering result. Some existing clustering algorithms, such as K-means and the Gaussian Mixture Model (GMM), need the number of clusters to be fixed before clustering. These algorithms cannot be used in our case, where the final number of clusters cannot be fixed in advance. Another class of clustering algorithms, such as Density-Based Spatial Clustering of Applications with Noise (DBSCAN), does not need the number of clusters to be fixed in advance. The clustering process of DBSCAN constantly adds neighbors according to the distance information among items, without considering the shape of each item itself. Therefore, it is unsuitable for our scenario. Besides, Affinity Propagation (AP) is a clustering algorithm proposed in Science in 2007, but its time complexity is $O(N^3)$, which is not suitable for clustering millions of trajectories.
Based on the above observations, the Single-pass Clustering Algorithm, which is suited to processing streaming data or large amounts of text data [30], is utilized in our paper to cluster millions of trajectories. The idea of the algorithm is sketched as follows. At first, let the first item in the dataset be a new class. Next, traverse the other items in the dataset and compare each of them with the existing classes. An item whose distance to an existing class is less than a threshold is added to that class. Otherwise, if its distances to all the existing classes are greater than the threshold, the item is treated as a new class. Based on this idea, we design our sub-trajectory clustering algorithm, and the first task is to define the similarity measure on sub-trajectories. We define the Average Nearest Point Pair Distance (ANPPD) for measuring the similarity of two sub-trajectories.
The distance in the definition can be the Euclidean distance or the Manhattan distance, and here we choose the Euclidean distance. $|s_i|$ denotes the length of the sub-trajectory $s_i$. Besides, when $|s_i| \neq |s_j|$, ANPPD is calculated from the shorter sub-trajectory to the longer one. An example of the calculation process is illustrated in Fig. 4.
The purpose of clustering is to find sub-trajectories on the same road or in the same direction with a small distance between them, so their ANPPDs should be very small. The threshold of ANPPD is defined for tuning the effect of clustering: if the ANPPD of two sub-trajectories is less than the threshold, they should be in the same cluster. However, when two sub-trajectories are perpendicular to each other or the angle between them is relatively large, and one sub-trajectory has only a few points while the other contains many, their ANPPD may also satisfy the condition. Therefore, we have to consider the angle between sub-trajectories while clustering. A single pass-based Sub-Trajectory Clustering Algorithm is then proposed as Algorithm 2.
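The formal ANPPD definition is not legible in this copy, so the sketch below is one reading of the surrounding prose, not the authors' formula. It assumes that every point of the shorter sub-trajectory is paired with its nearest point on the other sub-trajectory, that the Euclidean distances of those pairs are averaged, and that "length" means the number of points. The function name `anppd` is hypothetical.

```python
import math


def anppd(s_i, s_j):
    """Average nearest point pair distance, measured from the shorter sub-trajectory."""
    short, long_ = (s_i, s_j) if len(s_i) <= len(s_j) else (s_j, s_i)
    total = 0.0
    for p in short:
        # Distance from p to its nearest point on the longer sub-trajectory.
        total += min(math.dist(p, q) for q in long_)
    return total / len(short)


# Two hypothetical parallel sub-trajectories 3 m apart (planar coordinates in meters).
a = [(0, 0), (10, 0), (20, 0)]
b = [(0, 3), (5, 3), (10, 3), (15, 3), (20, 3)]
print(anppd(a, b))  # 3.0
```

Always measuring from the shorter side makes the value independent of argument order, which is convenient when it is compared against a single threshold.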
The input parameters are the sub-trajectory set $S$, the angle threshold, and the ANPPD threshold between two sub-trajectories. The clustering result $C$ is the output. The main idea of Algorithm 2 is sketched as follows.
At first, the first item in $S$ is added to the cluster set $C$ as the first cluster to initialize all parameters (line 1 to line 2). Next, iterate over the other sub-trajectories to calculate their ANPPDs and angles to the existing clusters (line 4 to line 8). Based on the above, the minimal ANPPD in DIS can be found, which means that the current sub-trajectory is closest to the cluster with the minimal ANPPD (line 9 to line 10). If this ANPPD is less than the ANPPD threshold and the angle is less than the angle threshold, the sub-trajectory is added to the cluster and the center line of the cluster is recalculated (line 11 to line 13). Otherwise, it is created as a new cluster and added to the cluster set (line 14 to line 16).
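The Algorithm 2 listing is not present in this copy, so the following is a minimal sketch of the single-pass procedure as described in the prose. It rests on several assumptions: ANPPD is computed from the shorter sub-trajectory, the angle is taken between start-to-end vectors, and, as a simplification, the first member of a cluster stays its center instead of the center line being recalculated. All function names are hypothetical.

```python
import math


def anppd(a, b):
    """Average nearest point pair distance from the shorter to the longer sub-trajectory."""
    short, long_ = (a, b) if len(a) <= len(b) else (b, a)
    return sum(min(math.dist(p, q) for q in long_) for p in short) / len(short)


def direction_angle(a, b):
    """Angle in degrees between the start-to-end vectors of two sub-trajectories."""
    ax, ay = a[-1][0] - a[0][0], a[-1][1] - a[0][1]
    bx, by = b[-1][0] - b[0][0], b[-1][1] - b[0][1]
    norm = math.hypot(ax, ay) * math.hypot(bx, by)
    if norm == 0.0:
        return 0.0
    cos = max(-1.0, min(1.0, (ax * bx + ay * by) / norm))
    return math.degrees(math.acos(cos))


def single_pass_cluster(subs, dist_th, angle_th):
    """Assign each sub-trajectory to its nearest cluster or open a new one."""
    clusters = []  # each cluster: {"center": sub-trajectory, "members": [...]}
    for s in subs:
        best_d, best_c = None, None
        for c in clusters:
            d = anppd(s, c["center"])
            if best_d is None or d < best_d:
                best_d, best_c = d, c
        if (best_c is not None and best_d < dist_th
                and direction_angle(s, best_c["center"]) < angle_th):
            best_c["members"].append(s)
            # Simplification: the center is NOT recalculated in this sketch.
        else:
            clusters.append({"center": s, "members": [s]})
    return clusters


subs = [
    [(0, 0), (10, 0), (20, 0)],      # opens cluster 0
    [(0, 2), (10, 2), (20, 2)],      # close and parallel: joins cluster 0
    [(10, -10), (10, 0), (10, 10)],  # close but perpendicular: new cluster
    [(0, 500), (20, 500)],           # far away: new cluster
]
result = single_pass_cluster(subs, dist_th=80.0, angle_th=30.0)
print([len(c["members"]) for c in result])  # [2, 1, 1]
```

The third sub-trajectory shows why the angle check matters: its ANPPD to the first cluster is under 7 m, yet it crosses that cluster at a right angle.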
To calculate the cluster center, we first determine the vector direction of the cluster center. We denote the sub-trajectories in a cluster $c$ as $s_1, s_2, s_3, \ldots, s_n$. Then each sub-trajectory can be denoted as a vector, from which the direction of the cluster center is determined. Once the direction of the cluster center has been fixed, the two endpoints of the center line need to be fixed. The starting point of the center vector is the center of all the start points of the sub-trajectories in this cluster. The endpoint of the center vector is obtained by projecting the center of all the end points of the sub-trajectories onto the center vector. In order to avoid a large number of redundant points on the cluster center, we fill the center line uniformly between the start point and the endpoint. The number of filled points is the average number of points per sub-trajectory in the cluster. An example of calculating the cluster center is shown in Fig. 5.
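The center-line construction can be sketched as below. The formula for the center direction is not legible in this copy, so the sketch assumes the direction is the sum of the members' unit start-to-end vectors; that choice, the rounding of the average point count, and the name `cluster_center` are assumptions for illustration only.

```python
import math


def cluster_center(members):
    """Center line of a cluster of sub-trajectories (lists of planar (x, y) points)."""
    n = len(members)
    # Assumed direction: sum of the unit start-to-end vectors of all members.
    dx = dy = 0.0
    for m in members:
        vx, vy = m[-1][0] - m[0][0], m[-1][1] - m[0][1]
        length = math.hypot(vx, vy)
        if length > 0.0:
            dx += vx / length
            dy += vy / length
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        raise ValueError("cluster has no well-defined direction")
    ux, uy = dx / norm, dy / norm
    # Start point: centroid of all member start points.
    sx = sum(m[0][0] for m in members) / n
    sy = sum(m[0][1] for m in members) / n
    # End point: centroid of member end points projected onto the center direction.
    mx = sum(m[-1][0] for m in members) / n
    my = sum(m[-1][1] for m in members) / n
    t = (mx - sx) * ux + (my - sy) * uy
    ex, ey = sx + t * ux, sy + t * uy
    # Fill uniformly with the average number of points per member (at least 2).
    k = max(2, round(sum(len(m) for m in members) / n))
    return [(sx + (ex - sx) * i / (k - 1), sy + (ey - sy) * i / (k - 1))
            for i in range(k)]


members = [
    [(0, 0), (5, 0), (10, 0)],
    [(0, 2), (2.5, 2), (5, 2), (7.5, 2), (10, 2)],
]
print(cluster_center(members))
```

For these two parallel members the center line runs from (0, 1) to (10, 1) and carries four points, the average of three and five.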
Once the clustering result is obtained, each sub-trajectory can be categorized. When a new sub-trajectory needs to be categorized, the ANPPD value to each cluster should be calculated. Then the cluster with the minimum value is chosen as the category. Once the process is finished, the original trajectory data set becomes a set of sequences of clusters.

C. DESTINATION PREDICTION
Neural network models such as the CNN and the recurrent neural network (RNN) have shown great advantages on sequence prediction problems, and deep learning-based approaches are needed for large-scale prediction problems. However, the problem of exploding or vanishing gradients often appears in such models. Although adjusting some parameters appropriately can bring some improvement, this problem cannot be fully solved, because these models retain only short-term memory. Some solutions have been proposed to overcome that problem, such as the LSTM model and the Gated Recurrent Unit (GRU) model. Compared to LSTM, GRU has fewer parameters and converges more easily, with almost the same effect as LSTM. Therefore, in this paper, we choose the GRU network as the prediction model.
In the training process of deep learning networks, the selection of features and labels is essential. All the trajectories compose the training set, and each trajectory can be represented as a sequence of clusters, as given in Definition 5. The items except the last one in each trajectory are regarded as the feature sequence, and the last one is the label while training models. Take $CT_{v_i} = \{c_1, c_2, \ldots, c_{I-1}, c_I\}$ as an example.
$\{c_1, c_2, \ldots, c_{I-1}\}$ is the feature sequence and $c_I$ is the label, which is the cluster of the destination. The feature set and label set are obtained by performing this feature and label extraction on all the trajectories. The GRU network is then trained on the training set, which yields the destination prediction models. In the prediction process, for a new trajectory, we first perform the trajectory segmentation operation as stated in IV-A. Next, we classify the output sub-trajectories into the corresponding clusters to form a sequence of clusters. The cluster of the destination can be predicted based on this sequence and the prediction model. However, the prediction process is not yet complete, as the above steps give only the area of the destination, not the final location point. We define that the final location lies on the segment of the cluster center. Besides, since vehicles move closer to their destinations most of the time, we select the location on the segment that is closest to the last known location of the current trajectory as the final destination location. In this way, we can roughly estimate the destination of each vehicle.
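The feature and label extraction and the final location choice can be sketched as follows. This is a minimal illustration, assuming cluster IDs are integers and the cluster center is available as a list of filled planar points; searching only those filled points (rather than the continuous segment) is a simplification, and both function names are hypothetical.

```python
import math


def split_feature_label(cluster_seq):
    """All clusters but the last form the feature sequence; the last one is the label."""
    return cluster_seq[:-1], cluster_seq[-1]


def estimate_destination(center_points, last_known):
    """Pick the center-line point closest to the last known location."""
    return min(center_points, key=lambda p: math.dist(p, last_known))


# Hypothetical cluster sequence of one historical trip.
features, label = split_feature_label([4, 9, 2, 7])
print(features, label)  # [4, 9, 2] 7

# Hypothetical center line of the predicted destination cluster (meters).
center_line = [(0, 0), (10, 0), (20, 0)]
print(estimate_destination(center_line, last_known=(12, 5)))  # (10, 0)
```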

V. SIMULATIONS
A. DATA SET DESCRIPTION
In this paper, the taxi trajectories of the city of Porto, Portugal, are used as the experiment data, which are publicly available as the dataset of the Kaggle challenge "ECML/PKDD 15: Taxi Trajectory Prediction (I)" [31]. There are 1710670 trajectories in the original dataset, and the sampling rate of the dataset is four times per minute. The trajectories with between 12 and 600 sampling locations (namely, the traveling time lasts for 3 minutes to 1.5 hours) are chosen. Besides, the trajectories whose longitudes are outside of the range [−8.65, −8.57], or whose latitudes are outside of the range [41.10, 41.20], are filtered out. Finally, 30000 of the remaining trajectories are selected for our simulation. A snapshot of the dataset is shown in Fig. 6.
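The filtering rules above can be sketched as a simple predicate. This is an illustration under two assumptions that the text does not state: each trajectory is a list of (longitude, latitude) pairs, and a trajectory is dropped as soon as any single point falls outside the bounding box. The name `keep_trajectory` is hypothetical.

```python
def keep_trajectory(points, min_pts=12, max_pts=600,
                    lon_range=(-8.65, -8.57), lat_range=(41.10, 41.20)):
    """Return True if a trajectory passes the length and bounding-box filters."""
    if not (min_pts <= len(points) <= max_pts):
        return False
    return all(lon_range[0] <= lon <= lon_range[1]
               and lat_range[0] <= lat <= lat_range[1]
               for lon, lat in points)


# Hypothetical trajectories given as (longitude, latitude) pairs.
inside = [(-8.60, 41.15)] * 12
too_short = [(-8.60, 41.15)] * 11
leaves_box = [(-8.60, 41.15)] * 11 + [(-8.70, 41.15)]
print(keep_trajectory(inside), keep_trajectory(too_short), keep_trajectory(leaves_box))
```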

B. TRAJECTORY SEGMENTATION
As described in IV-A, the segmentation threshold should be fixed first. Generally, the lateral force coefficient $\mu$ is about 0.06 and the superelevation slope $i_h$ is about 0.08 for an ordinary highway. According to Equation 3, the turning radius is related to the vehicle speed. The speed range is usually between 0 and 120 km/h. If the vehicle speed is fixed to 120 km/h, many corners on roads may be ignored, so that some inflection points cannot be identified. Conversely, if the vehicle speed is set very small, most of the points are regarded as inflection points. Therefore, we set the vehicle speed to the average value of 60 km/h in this paper, which determines the threshold. The trajectories can be segmented based on the threshold, and several segmentation examples are shown in Fig. 7. The blue lines and the red lines represent the original trajectories and the segmented trajectories respectively. Here, we can see that the inflection points are identified and each sub-trajectory is uniformly filled.
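As a worked example of the threshold calculation, the sketch below evaluates the standard highway-design form $r = v^2 / (127(\mu + i_h))$ with $v$ in km/h and $r$ in meters. The body of Equation 3 is not legible in this copy, so using this form is an assumption, and the function name `min_curve_radius_m` is hypothetical.

```python
def min_curve_radius_m(speed_kmh, mu, i_h):
    """Minimum curve radius in meters, assuming r = v^2 / (127 * (mu + i_h))."""
    return speed_kmh ** 2 / (127.0 * (mu + i_h))


# Parameter values quoted in the text: 60 km/h, mu = 0.06, i_h = 0.08.
radius = min_curve_radius_m(60, 0.06, 0.08)
print(f"{radius:.1f} m")
```

Under this assumed form the radius comes out at roughly 202 m for 60 km/h and roughly 810 m for 120 km/h, which illustrates why a high speed setting would hide many corners.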

C. CLUSTERING
According to the inputs of Algorithm 2, all the sub-trajectories segmented by Algorithm 1 are selected as inputs. In order to choose an optimal angle threshold, we try the values 15°, 30°, and 45°. Similarly, we try the values 40 m, 80 m, 120 m, 160 m, and 200 m to select the optimal ANPPD threshold. Two objectives are used to measure the performance of the two thresholds. First, the sum of ANPPD distances between each element of a cluster and its cluster center is calculated, which is represented by Dis. The smaller Dis is, the better the performance is. Second, the total number of clusters, Num, is another metric, which should be minimized; a smaller Num means that a cluster includes more sub-trajectories. The values of the two objectives for different angle and ANPPD thresholds are shown in Table 1.
In order to select the optimal angle and ANPPD thresholds, a multi-objective optimization method is used, whose formulas are given as Equations 5 and 6, in which $f_{min}(X)$ and $f_{max}(X)$ are the minimum and maximum values of $X$ respectively, and $f_{Cur}(X)$ is the current value of $X$. Here, the value of $\eta_X$ denotes how close $f_{Cur}(X)$ is to the best value.
According to Equation 7, we choose the pair of angle and ANPPD thresholds with the largest $\eta$ value. Finally, the optimal value is obtained when the angle threshold is set to 30° and the ANPPD threshold is set to 80 m. Several clustering examples are shown in Fig. 8, where the cluster centers are indicated by red lines and the other lines are the sub-trajectories in the same cluster.
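The threshold selection can be illustrated with a small sketch. Equations 5 to 7 are not legible in this copy, so the code assumes a min-max normalized closeness $\eta_X = (f_{max}(X) - f_{Cur}(X)) / (f_{max}(X) - f_{min}(X))$ for each objective and a plain sum of the two closeness values as the combined score; both are assumptions. The numbers and setting names are invented for illustration and are not the values of Table 1.

```python
def closeness(cur, lo, hi):
    """1.0 at the best (smallest) value, 0.0 at the worst (largest) value."""
    return 1.0 if hi == lo else (hi - cur) / (hi - lo)


def best_setting(results):
    """results maps a setting name to its (Dis, Num) pair; both are minimized."""
    dis_vals = [dis for dis, _ in results.values()]
    num_vals = [num for _, num in results.values()]

    def eta(key):
        dis, num = results[key]
        return (closeness(dis, min(dis_vals), max(dis_vals))
                + closeness(num, min(num_vals), max(num_vals)))

    return max(results, key=eta)


# Invented (Dis, Num) values for three hypothetical threshold settings.
toy_results = {
    "setting_a": (100.0, 900),  # tight clusters, but many of them
    "setting_b": (140.0, 500),  # balanced
    "setting_c": (400.0, 300),  # few clusters, but loose
}
print(best_setting(toy_results))  # setting_b
```

The balanced setting wins because it is near the best value on both objectives, while each extreme setting scores zero on one of them.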

D. DESTINATION PREDICTION
Mean Haversine Distance is used to evaluate the final prediction accuracy in this paper. It is commonly used in navigation to measure the distance between two points on a sphere based on their latitudes and longitudes, and it is also the evaluation metric in the Kaggle competition. The Haversine Distance between two locations can be computed as follows:

$$a = \sin^2\left(\frac{\phi_2 - \phi_1}{2}\right) + \cos(\phi_1)\cos(\phi_2)\sin^2\left(\frac{\lambda_2 - \lambda_1}{2}\right) \quad (8)$$

$$d = 2r \arctan\left(\sqrt{\frac{a}{1 - a}}\right) \quad (9)$$

where $\phi$ is the latitude and $\lambda$ is the longitude in Equation 8, $d$ is the distance between two points in Equation 9, and $r$ is the sphere's radius, whose value is Earth's radius in our case (i.e., 6371 kilometers).
In order to tune the hyperparameters of the GRU network to improve prediction accuracy, several simulations are conducted by changing the number of layers and the number of neurons per layer in the GRU network under different trajectory completion rates. The average MSEs of the different models are shown in Table 2, where CR is the ratio between the traveled distance and the total length of the trajectory, L is the number of layers in the GRU network, and N is the number of neurons in each layer. We vary the number of layers from 1 to 5 and the number of neurons over 16, 32, 64, 128, and 256. Due to space limitations, the data for 4 and 5 layers are not listed in Table 2. The activation function and loss function are set to 'tanh' and 'mse' respectively in our simulation. Dropout and early stopping are used to prevent overfitting, where the value of 'dropout' is 0.2 and the value of 'patience' in early stopping is 5. We compare the MSE of each model based on the results in Table 2.
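The evaluation metric can be sketched as follows, using the standard haversine formula with Earth's radius of 6371 km as stated above. Points are assumed to be (latitude, longitude) pairs in degrees, and the names `haversine_km` and `mean_haversine_km` are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # atan2(sqrt(a), sqrt(1 - a)) equals arctan(sqrt(a / (1 - a))) and is safe at a = 1.
    return 2 * EARTH_RADIUS_KM * math.atan2(math.sqrt(a), math.sqrt(1 - a))


def mean_haversine_km(predicted, actual):
    """Mean haversine distance over paired (lat, lon) predictions and ground truth."""
    distances = [haversine_km(p[0], p[1], t[0], t[1])
                 for p, t in zip(predicted, actual)]
    return sum(distances) / len(distances)


# Hypothetical predicted and true destinations inside the Porto bounding box.
pred = [(41.15, -8.61), (41.16, -8.60)]
true = [(41.15, -8.61), (41.17, -8.60)]
print(f"{mean_haversine_km(pred, true):.3f} km")
```

One degree of latitude corresponds to about 111.2 km with this radius, which gives a quick sanity check for the implementation.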
As the trajectory completion rate increases, the overall average MSE declines. When the trajectory completion rate is 30%, the average MSE is minimal with 16 neurons in each layer. For the other trajectory completion rates, the average MSE is minimal when the number of neurons in each layer is 64. Furthermore, the 2-layer model performs best at a completion rate of 40%, and the 1-layer model performs best at the other completion rates. Therefore, in the following simulations, we select the 2-layer GRU model with 64 neurons in each layer for the 40% completion rate and the 1-layer model with 16 neurons for the 30% completion rate; for the other completion rates, we choose the 1-layer model with 64 neurons.
The training batch size also affects the prediction accuracy of the models. Thus, models with different batch sizes are trained so that the best prediction model for each trajectory completion rate can be found. The mean prediction errors of those models under different batch sizes and completion rates are shown in Table 3. The smallest prediction errors are shown in bold in the table, and the corresponding models serve as our final prediction models. Although the batch size affects the prediction accuracy, the trend that the mean prediction error declines as the trajectory completion rate increases does not change. Moreover, the mean prediction errors are quite small. In particular, when the completion rate is 90%, the minimum error is less than 390 m.
The distributions of distances between the predicted destinations and the true destinations under different trajectory completion rates are shown in Fig. 9. As the trajectory completion rate increases, the proportion of destination prediction errors lower than 1 km also increases. In addition, the proportion of prediction errors lower than 100 m increases too, and the same holds for the other error levels. When the completion rate is 90%, the proportion of trajectories with a prediction error of less than 1 km almost reaches 95%, which demonstrates the effectiveness of our proposed method.
In order to better illustrate the effectiveness of our proposed method, we compare it with three other methods [19]-[21] that do well in the trajectory prediction area. An ensemble learning model (ELM) based on support vector regression (SVR) and deep learning is proposed in [19], and we use 'ELM' to represent this method. A novel prediction algorithm, T-CONV, is proposed in [21] to predict the destination of partial trajectories. In [20], the destination is predicted based on an LSTM model, and we use 'LSTM' to represent it. The comparison result is shown in Fig. 10, where we can see that our proposed method outperforms the three other methods under most trajectory completion rates.

VI. CONCLUSIONS AND DISCUSSION
A segmented trajectory clustering-based destination prediction method is proposed in this paper, including trajectory segmentation, sub-trajectory clustering, and destination prediction. First, in the trajectory segmentation, a DP-based trajectory segmentation algorithm is used, with which trajectories are split into sequences of key sub-trajectories. Second, we propose a sub-trajectory clustering algorithm to classify sub-trajectories, which reveals similar characteristics among sub-trajectories. Here, the average nearest point pair distance is defined to measure the similarity of any two key sub-trajectories while clustering. Finally, a GRU neural network is applied to predict the final destination. Extensive simulations have been conducted on the Porto dataset to tune the hyperparameters and train the neural network model. Simulation results show that our proposed method can predict the final destination with acceptable errors and outperforms the other methods in most cases.
In this paper, the number of clusters is still high, which may affect the prediction accuracy of the GRU network. In order to reduce the number of clusters, other clustering algorithms, such as AP, can also be tried for clustering the key sub-trajectories. However, the efficiency of AP is very low with so many sub-trajectories. Our future work is to improve its efficiency and decrease the number of clusters while utilizing the algorithm. Additionally, the proposed method is tested only on the Porto dataset, which is a limitation. In the future, we will test it on other datasets to check its robustness, such as the vehicle traveling dataset of Cologne and the taxi traveling dataset of Beijing.