GCN-CNVPS: Novel Method for Cooperative Neighboring Vehicle Positioning System Based on Graph Convolution Network

To provide coordinate information for the use of intelligent transportation systems (ITSs) and autonomous vehicles (AVs), the global positioning system (GPS) is commonly used in vehicle localization as a cheap and easily accessible solution for global positioning. However, several factors contribute to GPS errors, decreasing the safety and precision of AV and ITS applications, respectively. Extensive research has been conducted to address this problem. More specifically, several optimization-based cooperative vehicle localization algorithms have been developed to improve the localization results by exchanging information with neighboring vehicles to acquire additional information. Nevertheless, existing optimization-based algorithms still suffer from an unacceptable performance and poor scalability. In this study, we investigated the development of deep learning (DL) based cooperative vehicle localization algorithms to provide GPS refinement solutions with low complexity, high performance, and flexibility. Specifically, we propose three DL models to address the problem of interest by emphasizing the temporal and spatial correlations of the extra given information. The simulation results confirm that the developed algorithms outperform existing optimization-based algorithms in terms of refined error statistics. Moreover, a comparison of the three proposed algorithms also demonstrates that the proposed graph convolution network-based cooperative vehicle localization algorithm can effectively utilize temporal and spatial correlations in the extra information, leading to a better performance and lower training overhead.

to road users. However, accurate localization is one of the most vital prerequisites for implementing the aforementioned applications [2]. Although several localization methods, such as map matching, fingerprinting, and image/video localization, can provide the coordinate information of vehicles, the global positioning system (GPS) remains the most common choice. Two reasons prevent the wide use of the alternative methods. First, they require expensive sensors, such as cameras and video recorders, to be installed in the target vehicle to provide the information needed for matching or fingerprinting. Sensors that deliver a sufficiently high-resolution or high-quality output for accurate localization are costly, which hinders the widespread use of such methods. Second, effective matching or fingerprinting requires a database containing sufficient reference samples to be built in advance, and this predefined database also limits the area in which effective localization is possible. By contrast, GPS offers a cheap and easily accessible solution for global positioning and is therefore the most commonly used localization system for vehicle applications [3].
Although GPS offers an easy and accessible way to conduct localization, its precision still leaves room for improvement. More specifically, GPS is affected by several factors (e.g., receiver noise and the multipath effect), so that the received GPS coordinates can deviate substantially from the actual coordinates of the vehicle, thereby posing a threat to the safety of AVs or the precision of ITS applications. To address this drawback through GPS error refinement, vehicular ad-hoc networks (VANETs) [4] have recently been introduced to the automotive research community, where vehicles can communicate with each other to improve their location awareness [2], [5]-[7]. By integrating vehicle-to-vehicle (V2V) communication, an effective "cooperative driving" network can be established to share information for GPS refinement [1]. Several studies have already focused on incorporating GPS with auxiliary information (e.g., ranging measurements and reference points) through optimization-based algorithms to enhance the system performance [2]. In [5], the authors proposed a direction of arrival (DOA)-based cooperative localization method, incorporating GPS with radar to improve the localization. Furthermore, because the information coming from each sensor has its own limitations, the concept of data fusion has been introduced to refine the GPS results based on the information acquired from multiple sensors. In [7], the authors proposed a cooperative neighboring vehicle positioning system (CNVPS), incorporating GPS with various sensors using a weighted average to improve the localization. However, the approach in [7] employs only a linear function for sensor data fusion, thereby leaving room for further performance improvements.
Although more powerful algorithms should be developed to improve the performance of existing CNVPS algorithms, the design of advanced optimization-based CNVPS algorithms is not trivial. First, the design is highly dependent on the precise domain knowledge of different information sources (i.e., the error distribution of extra sensors), which may not be easily available under real scenarios. Moreover, the optimization problem should be redesigned if different types of sensors are employed. Second, to achieve accurate results, multiple iterations or complex matrix operations are often employed in the optimization process. Finally, although the problem of interest can be considered time-series data (i.e., multi-time-slot data), existing optimization-based methods only focus on the scenario of a single time-slot data fusion. As a result, the modeling of an optimization problem for extracting the correlation in multiple time-slot data and further improving the performance remains an open problem.
Differing from existing optimization-based algorithms, our idea is to develop CNVPS algorithms based on a deep learning (DL) algorithm owing to such advantages as a low complexity, high performance, and design flexibility. Specifically, given a sufficient training dataset, even without a precise mathematical system model, the DL model can be used to construct a nonlinear function and automatically solve the optimization problem of interest. Moreover, during the online testing stage, only simple matrix operations are executed when generating GPS refinement results, matching the computational limitations of on-board vehicle units. Furthermore, the DL model can extract hidden features (i.e., correlation of a multiple time-slot scenario) to further improve the GPS refinement results, making it almost impossible for optimization-based algorithms to do the same. Based on the aforementioned motivations, we decided to focus on the development of DL-based CNVPS GPS refinement algorithms. As a result, we can not only fully fuse various sensors but also integrate multiple time-slot data by introducing flexible characteristics of DL algorithms. For a single time-slot GPS refinement, we propose a multi-layer perceptron-cooperative neighboring vehicle positioning system (MLP-CNVPS) to achieve DL-based GPS refinement. Considering a multiple time-slot GPS refinement, we propose a long short-term memory-cooperative neighboring vehicle positioning system (LSTM-CNVPS) for obtaining superior results by considering the temporal correlation in multiple time-slot data. Moreover, a graph convolution network-cooperative neighboring vehicle positioning system (GCN-CNVPS) was further developed to better utilize both temporal and spatial correlations and achieve an efficient GPS refinement, leading to an even better performance compared to the aforementioned DL-based CNVPS algorithms. 
Although more complex graph neural network architectures, such as GraphSAGE [8], [9], can also be used to implement cooperative neighboring vehicle positioning systems, the main idea of this paper is to provide a way for on-board computing units to achieve fast localization enhancement with outstanding performance. As a result, we consider GCN a more suitable candidate than other graph neural network architectures owing to the relatively straightforward and stable graph structure of the considered problem [10] and the real-time demand of vehicle applications.
Simulation results show that the proposed DL-based localization approach outperforms existing optimization-based localization algorithms, improving the GPS error mean from between 4 and 20 m to between 2 and 4 m. This improvement is comparable to the current V2I localization performance [11], thus verifying the contribution of this study. Moreover, as reported in [12], an error mean of approximately 6 to 8 m is sufficient for most vehicular applications using V2V ranging. Furthermore, regarding implementation, recent studies [9], [13] suggest that the proposed GCN structure can be implemented in on-board computing units to provide real-time GPS enhancement owing to its low complexity. Considering both performance and complexity, our approach is a strong candidate for V2V applications, with the advantages of low-cost hardware, a fast and simple method, and accurate and stable performance.
The remainder of this paper is organized as follows. Section II describes the system model and sensor configurations. Section III illustrates the proposed method in detail, including the framework of the deep learning-based localization approach and the operations in both the online and offline stages. The simulation results are presented in Section IV, followed by some concluding remarks in Section V.

A. SYSTEM SETUP
As shown in Fig. 1, a cooperative group consisting of N vehicles is considered to enable a CNVPS-aided localization refinement [7]. With the support of vehicle-to-vehicle (V2V) communications, a target vehicle can use extra information from the GPS localization results of neighboring vehicles in the group and from its on-board sensors to refine its own GPS localization result. Specifically, the GPS receiver installed in each vehicle is used to estimate the vehicle coordinates. Moreover, a commonly used omnidirectional radar is employed in each vehicle to measure the relative distance and angle of the surrounding vehicles. We further assume that these measurements can be matched to the correct surrounding vehicle owing to the matching ability of the sensors and GPS [7]. We also assume that each vehicle in the group can communicate with neighboring vehicles through V2V, which is a common setting in related studies [2], [5]-[7]. In particular, the basic safety message (BSM) [14] and its optional part, which are supported by both dedicated short-range communication [7], [15] and cellular vehicle-to-everything standards [16], can satisfy our need to exchange information between vehicles in a group. Overall, in each BSM frame (i.e., time slot), each vehicle can acquire the GPS coordinates of all surrounding vehicles (from GPS and V2V exchanges), the relative distance and angle to other vehicles (from the on-board omnidirectional radar), and the received signal strength indication (RSSI) of other vehicles (from the V2V exchange) to achieve a CNVPS-aided refinement of the GPS localization result.

B. SENSOR CONFIGURATION
In this section, we introduce the sensor configurations used in this study, including GPS, radar, and RSSI.

1) GPS
In this study, it is assumed that the GPS error follows a Gaussian distribution, as several independent sources, such as satellite clock bias, atmospheric delay, code acquisition noise, and the multipath effect, contribute to the GPS error in practice [17]-[19]. In each case, the relationship between the GPS localization result $\hat{G} \in \mathbb{R}^2$ and the real position $G \in \mathbb{R}^2$ of a vehicle can be expressed as follows:

$$\hat{G} = G + n, \tag{1}$$

where $n = [\mathrm{Re}\{|a|e^{j\theta}\}, \mathrm{Im}\{|a|e^{j\theta}\}]^T$ denotes the error term with $a \sim \mathcal{N}(\mu, \sigma^2)$ and $\theta \sim u(0, 2\pi)$. According to the existing literature [20], because the level of the multipath effect is site-dependent, the statistical properties of the Gaussian distribution should be set differently to reflect the GPS error in different environments. In this study, we simulated three different environments to validate the robustness of the different CNVPS-aided refinement algorithms. Specifically, the freespace (e.g., highway), suburban, and urban scenarios were considered and set as $\mathcal{N}(4.7, 4.84)$, $\mathcal{N}(14.8, 49)$, and $\mathcal{N}(20.5, 72.25)$, respectively [2], to reflect the GPS estimation behavior.
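As an illustration, the GPS error model described above can be simulated with the following minimal sketch. The function name `simulate_gps` and the `ENVIRONMENTS` table are hypothetical names introduced here; only the $(\mu, \sigma^2)$ pairs are taken from the text, and the sketch is not the simulation code used in this study.

```python
import math
import random

# (mu, sigma^2) of the error magnitude a, in metres, per environment (values from the text).
ENVIRONMENTS = {
    "freespace": (4.7, 4.84),
    "suburban": (14.8, 49.0),
    "urban": (20.5, 72.25),
}

def simulate_gps(true_xy, env, rng=random):
    """Return a noisy GPS fix G_hat = G + n for a true position G (hypothetical helper)."""
    mu, var = ENVIRONMENTS[env]
    a = abs(rng.gauss(mu, math.sqrt(var)))    # error magnitude |a|, with a ~ N(mu, sigma^2)
    theta = rng.uniform(0.0, 2.0 * math.pi)   # error direction, theta ~ u(0, 2*pi)
    # Re{|a| e^{j theta}} and Im{|a| e^{j theta}} are the x and y components of the error.
    return (true_xy[0] + a * math.cos(theta), true_xy[1] + a * math.sin(theta))
```

Passing a seeded `random.Random` instance as `rng` makes the generated measurements reproducible.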

2) Radars
In this study, radar is used to provide the relative distance and angle of the surrounding vehicles as an information source for GPS refinement. Although the ghost target effect may appear in the radar sensing stage, we assume that the raw radar data can be processed ideally for the following reasons. First, the physical characteristics and the detection results from different time slots can be utilized to aid ghost target removal. Second, because only short-range radars are employed in the considered scenario and the number of targets to be detected can be regarded as a priori knowledge given the predefined size of the considered vehicle group, several advanced ghost target removal algorithms [21], [22] can be used to provide correctly processed radar data. As a result, we assume that ideally processed radar data are available to the proposed algorithms.
To simulate the radar measurements, the relationship between the radar distance measurement $\hat{D} \in \mathbb{R}$ and the real distance $D \in \mathbb{R}$ of a vehicle can be expressed as follows:

$$\hat{D} = D + \check{n}_D, \tag{2}$$

where $\check{n}_D \sim u(-0.025D, 0.025D)$ describes the longitudinal uncertainty of the measurement relative to the true distance. Note that the uncertainty increases with the real distance, which matches realistic radar measurement behavior. Moreover, the relationship between the angle measurement $\hat{\Theta} \in \mathbb{R}$ and the real angle $\Theta \in \mathbb{R}$ of a vehicle can be expressed as follows:

$$\hat{\Theta} = \Theta + \check{n}_\Theta, \tag{3}$$

where $\check{n}_\Theta \sim u(-0.5, 0.5)$ describes the lateral uncertainty of the radar measurement with an angular error of $0.5^\circ$ according to [23].
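The radar model above amounts to two independent uniform perturbations, which the following short sketch illustrates. The helper name `simulate_radar` is an assumption made for illustration, and angles are taken to be in degrees, matching the stated angular error.

```python
import random

def simulate_radar(true_dist_m, true_angle_deg, rng=random):
    """Noisy radar reading (D_hat, Theta_hat) for one neighbour (hypothetical helper)."""
    # Range error is uniform within +/-2.5% of the true distance, so it grows with range.
    dist = true_dist_m + rng.uniform(-0.025 * true_dist_m, 0.025 * true_dist_m)
    # Angular error is uniform within +/-0.5 degrees, independent of range.
    angle = true_angle_deg + rng.uniform(-0.5, 0.5)
    return dist, angle
```

For a neighbor at 8 m, for instance, the range reading stays within 0.2 m of the true value under this model.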

3) RSSI
In addition to the radar measurements, because we adopt the BSM for V2V communication, the RSSI information can also be acquired in each BSM frame to provide another distance measurement for the GPS refinement [24], [25]. To model the RSSI measurements, the practical path loss can be described as a log-normally distributed random variable with a distance-dependent location parameter [26], [27]. That is,

$$P_r(d) = P_0(d_0) - 10\, n_p \log_{10}\!\left(\frac{d}{d_0}\right) + \tilde{n}, \tag{4}$$

where $P_r(d)$ denotes the received signal strength measured in decibel milliwatts (dBm) at the transmitter-receiver distance $d$ (in meters), $P_0(d_0)$ denotes the reference power (in dBm) at a reference distance $d_0$ (in meters), $n_p$ is the channel path loss exponent, and $\tilde{n}$ is the effect of channel fading. To conduct the simulation in this study, we set the aforementioned parameters as $P_0(d_0) = -34$, $n_p = 2.1$, and $\tilde{n} \sim \mathcal{N}(0, 5.5^2)$.
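To make the RSSI model concrete, the sketch below generates an RSSI sample with the standard log-distance path loss model and inverts the noise-free model to obtain a rough distance. The reference distance $d_0$ is not stated in the text, so `D0_M = 1.0` is an assumption, and the function names are hypothetical.

```python
import math
import random

P0_DBM = -34.0   # reference power P_0(d_0), from the text
N_P = 2.1        # path loss exponent n_p, from the text
SIGMA_DB = 5.5   # fading standard deviation in dB, from the text
D0_M = 1.0       # ASSUMPTION: the reference distance d_0 is not given in the text

def simulate_rssi(dist_m, rng=random):
    """Received power in dBm at distance dist_m under log-normal fading."""
    return P0_DBM - 10.0 * N_P * math.log10(dist_m / D0_M) + rng.gauss(0.0, SIGMA_DB)

def rssi_to_distance(rssi_dbm):
    """Invert the noise-free path loss model to get a rough distance in metres."""
    return D0_M * 10.0 ** ((P0_DBM - rssi_dbm) / (10.0 * N_P))
```

With these parameters, a one-sigma fading error of 5.5 dB scales the inverted distance by a factor of about 1.8, which suggests why RSSI is better treated as a complementary measurement than as a standalone ranging source.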

C. PROBLEM STATEMENT
The GPS refinement problem can be considered as improving the GPS localization based on the aforementioned extra information. Hence, the goal of this study is to design a function that takes the extra information and the original GPS estimations as input and returns the refined GPS estimation, minimizing the difference between the predicted result and the real position of the target vehicle by means of data fusion. Specifically, as shown in Fig. 1, through a V2V information exchange, the extra information $X_{t,i}$ is available in each vehicle $i$ at time slot $t$ for the refinement of the GPS localization result. Based on the aforementioned sensor configurations, the extra information can be expressed as $X_{t,i} = [\hat{G}_{t,i}, \hat{D}_{t,i}, \hat{\Theta}_{t,i}, P_{r;t,i}] \in \mathbb{R}^5$, which contains the GPS coordinates, the relative distance to the target vehicle, the relative angle to the target vehicle, and the RSSI. Hence, for a target vehicle in the group with GPS measurements $\hat{G}_T$, the problem we address can be expressed as a function design problem. That is, we seek a function $f$ whose output

$$f(\hat{G}_T, X_{T_c,1}, X_{T_c,2}, \ldots, X_{T_c,N-1}) \tag{5}$$

is the refined GPS estimate, where $X_{T_c,1}, X_{T_c,2}, \ldots, X_{T_c,N-1}$ are the extra information from the other $N-1$ vehicles in the group at the current time slot $T_c$. Note that all algorithms in previous studies [2], [5], [7], [17], [18] can also be regarded as designing the prediction function through an optimization process. Furthermore, a cooperative group usually exists for several BSM transmission periods (i.e., multiple time slots). Although the different time slots could be handled independently using single time-slot CNVPS solutions, we try to further improve the GPS localization result by using the correlation between multi-time-slot measurements. In light of this, we first consider the GPS refinement problem under the multiple time-slot scenario in this study.
In particular, the multiple time-slot problem we considered can be expressed as designing a function $f$ whose output

$$f(\hat{G}_T, X_1, X_2, \ldots, X_{T_c}) \tag{6}$$

is the refined GPS estimate, where $X_1, X_2, \ldots, X_{T_c}$ are the features of the time slots up to the current one, each of which can be further expressed as $X_t = [X_{t,1}, X_{t,2}, \ldots, X_{t,N-1}]$. Note that Eq. (5) can be regarded as a special case of Eq. (6) when the information of only one time slot is provided for GPS refinement.

III. DEVELOPMENT OF DL-BASED LOCALIZATION ALGORITHMS

A. OVERVIEW
Our idea is to develop CNVPS algorithms based on DL algorithms owing to their low complexity, high performance, and design flexibility. More specifically, for a single time-slot GPS refinement, we propose a multi-layer perceptron-cooperative neighboring vehicle positioning system (MLP-CNVPS) to achieve a DL-based GPS refinement. For a multiple time-slot GPS refinement, to obtain superior results by considering the temporal correlation in multi-time-slot data, we propose a long short-term memory-cooperative neighboring vehicle positioning system (LSTM-CNVPS). Moreover, a graph convolution network-cooperative neighboring vehicle positioning system (GCN-CNVPS) was further developed to better utilize both temporal and spatial correlations for achieving an efficient GPS refinement, leading to an even better performance compared to the aforementioned DL-based CNVPS algorithms. In the remainder of this section, the motivations and details of the DL-based CNVPS GPS refinement algorithms with different structures are introduced first, followed by the training specifics at the end of the section.

B. ARCHITECTURE OF THE PROPOSED MLP-CNVPS
As shown in Fig. 2(a), we propose MLP-CNVPS based on a conventional MLP DL model. Under the multiple time-slot scenario, all extra information and the original GPS measurements of the target vehicle are fed directly into the MLP-CNVPS without preprocessing. As a result, the input layer of MLP-CNVPS can be expressed as a vector of size $5(N-1)T + 2$, where $T$ is the number of time slots, according to the considered system model (i.e., the extra information from the $N-1$ vehicles of the same cooperative group and the GPS measurements of the target vehicle itself). Following the input layer, we constructed four fully connected layers as hidden layers to process the input data. The numbers of neurons in these layers were set to 256, 128, 64, and 32, respectively. After each hidden layer, the parametric rectified linear unit (PReLU) [28] is employed as the activation function to provide nonlinearity. Then, the output of the last hidden layer is fed into the output layer with two neurons, generating the refined localization of the target vehicle. As a mathematical expression, the MLP-CNVPS model for multiple time-slot scenarios can be described as follows:

$$f(X_{\mathrm{MLP}}; \Theta_{\mathrm{MLP}}) = W_{\mathrm{out}}\,\Gamma\big(W_4\,\Gamma\big(W_3\,\Gamma\big(W_2\,\Gamma(W_1 X_{\mathrm{MLP}} + b_1) + b_2\big) + b_3\big) + b_4\big) + b_{\mathrm{out}}, \tag{7}$$

where $W_i$ and $b_i$ represent the weights and bias of the $i$th layer, respectively, and $\Gamma$ is the employed PReLU activation function. In addition, $\Theta_{\mathrm{MLP}} = \{W_1, b_1, \ldots, W_4, b_4, W_{\mathrm{out}}, b_{\mathrm{out}}\}$ represents all trainable parameters, and $X_{\mathrm{MLP}} = \mathrm{Vec}(\hat{G}_T, X_1, X_2, \cdots, X_{T_c})$, where $\mathrm{Vec}(\cdot)$ is a vectorization operation. Note that the MLP-CNVPS under a single time-slot scenario can be considered as a special case of Eq. (7) when $T_c = 1$. With the aforementioned DL model architecture, MLP-CNVPS can construct a nonlinear function and extract hidden information in the given input to refine the GPS estimation results automatically. The simulation results confirm that MLP-CNVPS outperforms the existing CNVPS solutions.
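The layer structure described above (four PReLU hidden layers of 256, 128, 64, and 32 neurons and a two-neuron linear output) can be sketched as a plain NumPy forward pass. This is only an illustrative sketch: the weight initialization, the single PReLU slope of 0.25 per layer, and the example values of N and T are assumptions, not settings reported in this paper.

```python
import numpy as np

def prelu(x, alpha):
    """PReLU: identity for positive inputs, slope alpha for negative inputs."""
    return np.where(x > 0, x, alpha * x)

def init_mlp(in_dim, hidden=(256, 128, 64, 32), out_dim=2, seed=0):
    """Random initial parameters (ASSUMPTION: the init scheme is not from the paper)."""
    rng = np.random.default_rng(seed)
    dims = (in_dim, *hidden, out_dim)
    return [
        {"W": rng.normal(0.0, np.sqrt(2.0 / m), size=(n, m)), "b": np.zeros(n), "alpha": 0.25}
        for m, n in zip(dims[:-1], dims[1:])
    ]

def mlp_forward(x, params):
    """Apply the four PReLU hidden layers, then the linear two-neuron output layer."""
    h = x
    for layer in params[:-1]:
        h = prelu(layer["W"] @ h + layer["b"], layer["alpha"])
    out = params[-1]                      # the output layer uses no activation
    return out["W"] @ h + out["b"]

N, T = 5, 3                               # example group size and number of time slots
x_mlp = np.zeros(5 * (N - 1) * T + 2)     # stands in for Vec(G_hat_T, X_1, ..., X_T)
refined = mlp_forward(x_mlp, init_mlp(x_mlp.size))
```

The input length follows directly from the system model: five values for each of the N-1 neighbors in each of the T time slots, plus the two GPS coordinates of the target vehicle.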

C. ARCHITECTURE OF THE PROPOSED LSTM-CNVPS
Although MLP-CNVPS can extract hidden information in multiple time-slot measurements and provide an improved GPS refinement, the performance can be further improved by exploiting the correlation of multi-time-slot measurements along the temporal axis. For the considered problem, multiple time-slot measurements are in fact time-series data. However, all features from different time slots are fed into MLP-CNVPS simultaneously, which fails to emphasize the correlation of the measurements along the temporal axis. As an alternative, we further propose LSTM-CNVPS to better utilize the temporal correlation in multiple time-slot measurements. More specifically, the proposed LSTM-CNVPS, as shown in Fig. 2(b), is based on LSTM to leverage its ability to extract hidden information in multiple time-slot measurements. In time slot $t$, the operation of the LSTM-CNVPS follows the standard LSTM update and can be expressed as follows:

$$\begin{aligned}
f_t &= \sigma(W_f X_{\mathrm{LSTM},t} + U_f h_{t-1} + b_f),\\
i_t &= \sigma(W_i X_{\mathrm{LSTM},t} + U_i h_{t-1} + b_i),\\
o_t &= \sigma(W_o X_{\mathrm{LSTM},t} + U_o h_{t-1} + b_o),\\
\tilde{c}_t &= \tanh(W_c X_{\mathrm{LSTM},t} + U_c h_{t-1} + b_c),\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t,\\
h_t &= o_t \odot \tanh(c_t),
\end{aligned} \tag{8}$$

where $h_t$ is the state vector, $c_t$ is the cell state, $f_t$, $i_t$, and $o_t$ are the forget, input, and output gates, the matrices $W$, $U$ and vectors $b$ with the corresponding subscripts are trainable weights and biases, and $X_{\mathrm{LSTM},t} = \mathrm{Vec}(\hat{G}_T, X_t)$ is the input of time slot $t$. In addition, $\sigma(x) = \frac{1}{1+e^{-x}}$ is the sigmoid function, and $\tanh(x) = \frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}$ is the hyperbolic tangent function. The operation $\odot$ represents an element-wise multiplication. It is noteworthy that in every time slot, a state vector $h$ is generated and considered as an input in the next time slot. By doing so, important information from the measurements of the previous time slot is extracted and kept in this state vector, thereby affecting the operation of LSTM-CNVPS in the next time slot and better utilizing the correlation along the temporal axis compared to MLP-CNVPS.

Fig. 2 (caption, partial): Using the special kernel design of a GCN, the adjacency matrix is used to describe the spatial correlation of input data under the graph structure, and to achieve a superior performance, the temporal and spatial correlation in the input feature matrix will be simultaneously emphasized.
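For the model structure, we constructed an LSTM-CNVPS with two LSTM layers, and the numbers of LSTM units in the two layers were 44 and 88, respectively. The output is fed into a fully connected layer, which contains two neurons and whose output is regarded as the refined location estimation. The full LSTM-CNVPS model can be described as follows:

$$f(X_{\mathrm{LSTM},1}, \ldots, X_{\mathrm{LSTM},T_c}; \Theta_{\mathrm{LSTM}}) = W_{\mathrm{out}}\, h^{(2)} + b_{\mathrm{out}}, \tag{9}$$

where $h^{(2)}$ denotes the output of the second LSTM layer, and $\Theta_{\mathrm{LSTM}}$ is the set of all trainable weights and biases, which is further expressed as

$$\Theta_{\mathrm{LSTM}} = \{\Theta^{(1)}, \Theta^{(2)}, W_{\mathrm{out}}, b_{\mathrm{out}}\}, \tag{10}$$

with $\Theta^{(l)}$ collecting the weights and biases of the $l$th LSTM layer. Finally, it is worth noting that when we employ LSTM-CNVPS to solve the GPS refinement problem in a single time-slot scenario, there is only one time slot in (8), so the special mechanism of LSTM used to extract the temporal correlation no longer takes effect, and the behavior of LSTM-CNVPS degenerates to that of MLP-CNVPS.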
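The recurrence described above, in which the state vector produced in one time slot is fed into the next, can be illustrated with a minimal NumPy LSTM cell. The sketch uses the conventional LSTM formulation; the stacked gate layout `[f, i, o, g]`, the names `W`, `U`, `b`, and the helper `run_lstm` are assumptions made for illustration rather than the implementation used in this study.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. W: (4H, I), U: (4H, H), b: (4H,), gates stacked as [f, i, o, g]."""
    H = h_prev.size
    z = W @ x_t + U @ h_prev + b
    f = sigmoid(z[0:H])              # forget gate
    i = sigmoid(z[H:2 * H])          # input gate
    o = sigmoid(z[2 * H:3 * H])      # output gate
    g = np.tanh(z[3 * H:4 * H])      # candidate cell state
    c = f * c_prev + i * g           # element-wise cell update
    h = o * np.tanh(c)               # state vector handed to the next time slot
    return h, c

def run_lstm(xs, W, U, b):
    """Feed the time slots in temporal order, carrying (h, c) from one slot to the next."""
    H = U.shape[1]
    h, c = np.zeros(H), np.zeros(H)
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, W, U, b)
    return h
```

With a single time slot, `run_lstm` performs exactly one step from a zero state, so no information is carried across slots, which mirrors the remark that the temporal mechanism has no effect in the single time-slot scenario.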

D. ARCHITECTURE OF THE PROPOSED GCN-CNVPS
Although LSTM-CNVPS can better utilize the temporal correlation to improve on the performance of MLP-CNVPS, neither MLP-CNVPS nor LSTM-CNVPS considers the spatial correlation of neighboring vehicles. More specifically, the extra information provided by different neighboring vehicles should be given different weights or levels of confidence according to the relative distance in an adaptive manner; however, MLP-CNVPS and LSTM-CNVPS cannot support such a delicate design. To further improve the performance of vehicle localization, we propose GCN-CNVPS to simultaneously consider both temporal and spatial correlations. If the input data belonged to Euclidean space (e.g., image data), a convolutional neural network (CNN) could satisfy the need to consider both temporal and spatial correlations simultaneously based on its special kernel design and the consequent convolution operations. However, the input data of the considered problem has a graph representation, limiting the usage of the CNN model. As an alternative, Fig. 2(c) presents the architecture of the proposed GCN-CNVPS. The input of the GCN-CNVPS is represented by an adjacency matrix $A$ and a feature matrix $X_{\mathrm{GCN}}$. Specifically, the adjacency matrix describes the graph structure of interest, allowing the GCN to utilize the spatial correlation in the considered graph structure. We define this structure to reflect the fact that the target vehicle is able to communicate with the neighboring $N-1$ vehicles, whereas these $N-1$ vehicles have no connections with each other. With the target vehicle indexed as node $N$, the resulting adjacency matrix of size $N \times N$ can be defined as follows:

$$A_{ij} = \begin{cases} 1, & \text{if } i \neq j \text{ and } (i = N \text{ or } j = N),\\ 0, & \text{otherwise.} \end{cases} \tag{11}$$

However, directly employing the adjacency matrix $A$ in GCN-CNVPS causes numerical problems (exploding and vanishing gradients) [29] during the training process, preventing the weights from converging to their optimal values. As a result, a normalized adjacency matrix is adopted to prevent the aforementioned issue.
In particular, $\hat{A} = \tilde{D}^{-\frac{1}{2}}(A + I_N)\tilde{D}^{-\frac{1}{2}}$ is the normalized adjacency matrix with added self-connections, where $I_N$ is the identity matrix, $\tilde{D} = D + I_N$ is the degree matrix, and $D = \mathrm{diag}(\sum_j A_{ij}) \in \mathbb{R}^{N \times N}$. The first $\tilde{D}^{-\frac{1}{2}}$ represents the normalization for each row, and the second represents that for each column. By using the normalized adjacency matrix, the numerical problem during the GCN model training process can be solved. For the model input, the feature matrix $X_{\mathrm{GCN}}$ can be represented as a matrix with $N$ rows and $(3 \times (N-1) + 2) \times T$ columns, representing $N$ vehicles and $(3 \times (N-1) + 2) \times T$ measurements for $T$ time slots, respectively. To utilize the spatial correlation effectively, the information of each surrounding vehicle should be assigned and placed carefully so that it is processed separately. Specifically, for the $i$th row ($i \in [1, N-1]$ for the neighboring vehicles), the measurements of time slot $t$ are $[\hat{G}_{t,i}, 0, \ldots, \hat{D}_{t,i}, 0, \ldots, \hat{\Theta}_{t,i}, 0, \ldots, P_{r;t,i}] \in \mathbb{R}^{3 \times (N-1) + 2}$. That is, each neighboring vehicle only holds the observations of itself and has no information about other vehicles, so the corresponding values are set to 0. By contrast, for the $N$th row (the target vehicle), the measurements of time slot $t$ can be expressed as $[\hat{G}_{t,N}, \hat{D}_{t,1}, \ldots, \hat{D}_{t,N-1}, \hat{\Theta}_{t,1}, \ldots, \hat{\Theta}_{t,N-1}, P_{r;t,1}, \ldots, P_{r;t,N-1}] \in \mathbb{R}^{3 \times (N-1) + 2}$. In other words, the target vehicle possesses its own GPS measurement as well as the radar observations and RSSI obtained from the neighboring $N-1$ vehicles through V2V communications.
Note that the dimensions of the input differ slightly from those of the previous models because the information acquired by each vehicle is separated into different rows. However, the overall amount of information remains the same.
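The graph inputs described above can be assembled as in the following sketch, which builds the star-shaped adjacency matrix (target vehicle as the last node), its symmetric normalization with self-connections, and the feature matrix of one time slot. The function names are hypothetical, and the exact column in which each neighbor's non-zero entries are placed is an assumption: here they are aligned with the corresponding columns of the target row.

```python
import numpy as np

def star_adjacency(n):
    """Adjacency of a star graph whose hub (the target vehicle) is the last node."""
    A = np.zeros((n, n))
    A[-1, :-1] = 1.0
    A[:-1, -1] = 1.0
    return A

def normalize_adjacency(A):
    """A_hat = D_tilde^{-1/2} (A + I) D_tilde^{-1/2}, with D_tilde the degree matrix of A + I."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def build_features(gps, dist, angle, rssi):
    """Feature matrix of one time slot: gps is (N, 2); dist, angle, rssi are (N-1,)."""
    n = gps.shape[0]
    X = np.zeros((n, 3 * (n - 1) + 2))
    X[:, :2] = gps
    for i in range(n - 1):
        # Neighbour i only holds its own measurements; all other entries stay zero.
        X[i, 2 + i] = dist[i]
        X[i, 2 + (n - 1) + i] = angle[i]
        X[i, 2 + 2 * (n - 1) + i] = rssi[i]
    X[-1, 2:] = np.concatenate([dist, angle, rssi])   # the target row holds everything
    return X
```

For multiple time slots, the per-slot matrices would be concatenated along the column axis, giving the $(3 \times (N-1) + 2) \times T$ columns stated above.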
Finally, for the structure of GCN-CNVPS, to exploit the spatial dependence in the input features, we employed two GCN layers [30], [31], with the numbers of neurons in the two layers being 32 and 16, respectively. Specifically, the convolution function of the GCN layer can be expressed as follows:

$$H = \Gamma(\hat{A} X_{\mathrm{GCN}} W + b), \tag{12}$$

where $H$ is the output of the layer and $\Gamma$ is the employed PReLU activation function. Here, $W$ and $b$ represent the trainable weight matrix and bias matrix, respectively. In (12), $\hat{A} X_{\mathrm{GCN}} W$ aggregates all features of neighboring nodes with trainable weights for each node. According to [32]-[34], the operation is analogous to the function of the convolutional kernels in convolutional neural networks (CNNs) and is therefore capable of extracting spatial characteristics in a graph. The output of the last GCN layer was flattened and fed into a fully connected layer containing two neurons. The complete operations of GCN-CNVPS can be formulated as follows:

$$f(X_{\mathrm{GCN}}, \hat{A}; \Theta_{\mathrm{GCN}}) = W_{\mathrm{out}}\,\mathrm{Vec}\big(\Gamma\big(\hat{A}\,\Gamma(\hat{A} X_{\mathrm{GCN}} W_1 + b_1)\, W_2 + b_2\big)\big) + b_{\mathrm{out}}, \tag{13}$$

where $\mathrm{Vec}(\cdot)$ flattens its argument into a vector, and $\Theta_{\mathrm{GCN}}$ is the set of all trainable weights and biases, which can be represented as

$$\Theta_{\mathrm{GCN}} = \{W_1, b_1, W_2, b_2, W_{\mathrm{out}}, b_{\mathrm{out}}\}. \tag{14}$$
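A forward pass through the structure described above (two graph convolution layers with 32 and 16 neurons, flattening, and a two-neuron dense layer) can be sketched in NumPy as follows. The parameter names, the random initialization, the fixed PReLU slope of 0.25, and the example values of N and T are assumptions made for illustration; this is not the trained model of this study.

```python
import numpy as np

def prelu(x, alpha=0.25):
    return np.where(x > 0, x, alpha * x)

def gcn_layer(A_hat, X, W, b, alpha=0.25):
    """Graph convolution Gamma(A_hat X W + b): transform features, then aggregate over the graph."""
    return prelu(A_hat @ X @ W + b, alpha)

def gcn_cnvps_forward(A_hat, X, p):
    h = gcn_layer(A_hat, X, p["W1"], p["b1"])        # (N, 32)
    h = gcn_layer(A_hat, h, p["W2"], p["b2"])        # (N, 16)
    return p["W_out"] @ h.reshape(-1) + p["b_out"]   # flatten, then two-neuron dense layer

N, T = 5, 3                                          # example group size and time slots
F = (3 * (N - 1) + 2) * T                            # feature columns per node
rng = np.random.default_rng(0)

# Normalized star adjacency with self-connections; the target vehicle is the last node.
A_tilde = np.eye(N)
A_tilde[-1, :-1] = A_tilde[:-1, -1] = 1.0
d = 1.0 / np.sqrt(A_tilde.sum(axis=1))
A_hat = A_tilde * d[:, None] * d[None, :]

params = {
    "W1": rng.normal(0.0, 0.1, (F, 32)), "b1": np.zeros(32),
    "W2": rng.normal(0.0, 0.1, (32, 16)), "b2": np.zeros(16),
    "W_out": rng.normal(0.0, 0.1, (2, N * 16)), "b_out": np.zeros(2),
}
refined = gcn_cnvps_forward(A_hat, rng.normal(size=(N, F)), params)
```

Because the online stage consists only of a few small matrix products such as these, the per-sample cost stays modest, which is consistent with the low-complexity motivation given for choosing GCN.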

E. TRAINING METHOD
To train the aforementioned DL-based models, supervised learning was adopted, and the mean square error was employed as the loss function as follows:

$$\mathcal{L}(\Theta) = \frac{1}{D} \sum_{i=1}^{D} \left\| f(X_i; \Theta) - \phi_i \right\|_2^2, \tag{15}$$

where $f(X_i; \Theta)$ is the DL-based model that estimates the result corresponding to a sample $X_i$ with trainable weights $\Theta$, $\phi_i$ is the true location, and $D$ is the total number of samples in the training dataset. Adam [35], a popular gradient descent-based optimizer, was employed to iteratively reduce the loss in each epoch through the backpropagation algorithm during the training process. For MLP-CNVPS, the initial learning rate was set to 0.00001, and the batch size was set to 128. After 1000 epochs, the trained weights of MLP-CNVPS were recorded, and the offline training process was completed. For LSTM-CNVPS, we set the initial learning rate to 0.00005 and the batch size to 128. After 1000 epochs, the trained weights of the LSTM-CNVPS were recorded, and the LSTM training was completed. For GCN-CNVPS, we set the initial learning rate to 0.0001 and the batch size to 128. After 750 epochs, the trained weights and biases of the GCN-CNVPS were recorded, and the GCN training was completed. Once the offline training process is completed, the trained DL model can be used during the online testing process to provide vehicle localization estimation results without any further operations.
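The training objective and the optimizer update can be illustrated with the following NumPy sketch of the mean square error loss and a single Adam step. The hyperparameters `beta1`, `beta2`, and `eps` are the commonly used Adam defaults and are assumptions here, as is the toy problem of learning a constant offset; the actual models are trained with the learning rates and batch sizes stated above.

```python
import numpy as np

def mse_loss(pred, target):
    """Mean over samples of the squared Euclidean localization error."""
    return np.mean(np.sum((pred - target) ** 2, axis=1))

def adam_step(theta, grad, state, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; `state` holds the step count and the two moment estimates."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2
    m_hat = state["m"] / (1 - beta1 ** state["t"])   # bias-corrected first moment
    v_hat = state["v"] / (1 - beta2 ** state["t"])   # bias-corrected second moment
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps)

# Toy usage: learn a constant 2-D offset that maps biased fixes onto the true positions.
rng = np.random.default_rng(0)
truth = rng.uniform(0.0, 100.0, size=(128, 2))
noisy = truth + np.array([3.0, -2.0])
theta = np.zeros(2)
state = {"t": 0, "m": np.zeros(2), "v": np.zeros(2)}
for _ in range(2000):
    grad = 2.0 * np.mean(noisy + theta - truth, axis=0)   # gradient of the loss w.r.t. theta
    theta = adam_step(theta, grad, state, lr=0.01)
final_loss = mse_loss(noisy + theta, truth)
```

In a full implementation, the gradient would come from backpropagation through the network rather than from the closed form used in this toy problem.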

IV. SIMULATION RESULTS AND DISCUSSION
In this section, the three proposed DL-based CNVPS algorithms, MLP-CNVPS, LSTM-CNVPS, and GCN-CNVPS, are evaluated and compared to three existing optimization-based CNVPS algorithms. More specifically, the centroid location (CL) algorithm [36], the DOA-based location algorithm [5], and the optimization-based CNVPS algorithm [7] are implemented in this study as benchmarks. Without the assistance of extra sensors, the CL algorithm simply averages the GPS coordinates of neighboring vehicles to estimate the location of the target vehicle, which reduces the variance of the GPS estimations. With the assistance of radar, the DOA-based location algorithm employs the direction of arrival information of neighboring vehicles to estimate the position of the target vehicle. However, this algorithm cannot exploit additional sensors to further improve the performance. As in the previous study, CNVPS utilizes various sensors to estimate the coordinates of the target vehicle and conducts weighted average localization using prior knowledge of the standard deviation of each extra sensor. However, because CNVPS only employs a linear function to exploit the information from extra sensors, the achieved performance is limited. Moreover, because the weights of the different sensors are predefined according to the statistics of the sensors and remain fixed, CNVPS cannot adjust the weights adaptively according to different inputs to achieve better performance. Furthermore, CNVPS cannot be employed in multiple time-slot scenarios to further improve the performance. In contrast to existing algorithms, DL-based algorithms provide a way to design an adaptive nonlinear function that better utilizes the information from extra sensors by extracting temporal and spatial correlations. Moreover, multiple time-slot scenarios can be supported to provide a superior performance compared to a single time-slot scenario. In this section, we first introduce the process of data generation and then compare different algorithms in different scenarios to validate the superiority of DL-based CNVPS algorithms.
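To illustrate what a fixed, linear weighted-average fusion looks like, the following sketch combines several position estimates of the target vehicle with inverse-variance weights. This is a generic textbook fusion rule shown for intuition only; it is not the exact benchmark algorithm of [7], and the function name `fuse_estimates` is hypothetical.

```python
def fuse_estimates(estimates, sigmas):
    """Inverse-variance weighted average of 2-D position estimates.

    estimates: list of (x, y) candidate positions of the target vehicle.
    sigmas:    list of standard deviations, one per estimate (assumed known a priori).
    """
    weights = [1.0 / (s * s) for s in sigmas]
    total = sum(weights)
    x = sum(w * e[0] for w, e in zip(weights, estimates)) / total
    y = sum(w * e[1] for w, e in zip(weights, estimates)) / total
    return (x, y)
```

With equal standard deviations, the rule reduces to a plain average of the estimates. Because the weights depend only on the predefined standard deviations and never on the inputs themselves, such a rule cannot adapt to individual measurements, which is the limitation discussed above.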

A. DATA GENERATION
To obtain the dataset for model training and testing, we first generate the coordinates of the target vehicle $G_T$ and then generate the neighboring vehicles $G_i$ subject to the distance constraint $\|G_T - G_i\|_2 < 10$ (unit: meters). Subsequently, we obtain the measurements according to the sensor configurations described in Sec. II. Specifically, MATLAB is used to generate virtual measurements for our simulations. We followed the aforementioned sensor settings and created a scenario as depicted in Fig. 1.
In addition, we collected some real data on campus to validate the sensor configuration settings employed in this paper, and the results show the same tendency as the data generated from the system model of this work. To generate multiple time-slot measurements, we specify the vehicle mobility by setting the horizontal velocity $V_h$ and vertical velocity $V_v$ of each vehicle. Moreover, we define two driving modes, the straight-through mode and the lane-change mode, to model the driver behavior. In the straight-through mode, the velocities are set as $V_h = 0$ and $V_v \sim \mathcal{U}(10, 15)$ m/s; in the lane-change mode, they are set as $V_h \sim \mathcal{N}(0, 1.5^2)$ and $V_v \sim \mathcal{U}(10, 15)$ m/s. Based on the aforementioned settings, we set the number of samples for the training, validation, and testing datasets to 100000 for each of the three scenarios under the different driving modes. We then compute the mean and standard deviation of the estimation error of the different algorithms to report the error statistics.
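The geometric part of this procedure can be sketched as follows. This is an illustrative Python sketch rather than the MATLAB code used for the simulations: the function names, the rejection-sampling strategy, the constant velocity within a sample, and the slot duration `slot_seconds` are assumptions, and the sensor measurement models of Sec. II are omitted.

```python
import math
import random


def sample_neighbor(target, max_dist, rng):
    """Rejection-sample a neighbor G_i with ||G_T - G_i||_2 < max_dist."""
    while True:
        dx = rng.uniform(-max_dist, max_dist)
        dy = rng.uniform(-max_dist, max_dist)
        if math.hypot(dx, dy) < max_dist:
            return (target[0] + dx, target[1] + dy)


def sample_velocity(mode, rng):
    """Return (V_h, V_v) in m/s for one vehicle under a driving mode."""
    v_v = rng.uniform(10.0, 15.0)
    if mode == "straight_through":
        v_h = 0.0
    elif mode == "lane_change":
        v_h = rng.gauss(0.0, 1.5)  # standard deviation 1.5, i.e. N(0, 1.5^2)
    else:
        raise ValueError(f"unknown driving mode: {mode}")
    return (v_h, v_v)


def trajectory(start, velocity, num_slots, slot_seconds=1.0):
    """Constant-velocity positions of one vehicle over num_slots time slots."""
    return [(start[0] + velocity[0] * t * slot_seconds,
             start[1] + velocity[1] * t * slot_seconds)
            for t in range(num_slots)]


rng = random.Random(0)
target = (0.0, 0.0)
# A cooperative group of five vehicles: the target plus four neighbors.
group = [target] + [sample_neighbor(target, 10.0, rng) for _ in range(4)]
tracks = [trajectory(p, sample_velocity("lane_change", rng), num_slots=3)
          for p in group]
```

Each entry of `tracks` holds the ground-truth positions of one vehicle, to which sensor noise would then be added to form the measurements.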

B. PERFORMANCE ANALYSIS WITH SINGLE TIME-SLOT MEASUREMENTS
In this section, we simulate and discuss the behavior of the different CNVPS algorithms in three practical scenarios, namely freespace, suburban, and urban, under a single time-slot measurement condition. Fig. 3 shows the achieved mean and standard deviation of the estimation error of the different algorithms under the three scenarios with a cooperative group size of five. Although the optimization-based algorithms reduce the GPS estimation error, the DL-based algorithms further refine the GPS measurements, achieving a lower mean and standard deviation in all scenarios. Specifically, regardless of the severity of the original GPS estimation error, the DL-based CNVPS algorithms refine the GPS estimations and provide a certain level of improvement. It is also worth noting that GCN-CNVPS slightly outperforms MLP-CNVPS because the spatial correlation is emphasized and better utilized through the GCN operations.
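The error statistics reported in the figures can be computed as in the following minimal sketch. It assumes that the estimation error of a sample is the Euclidean distance between the refined and the true 2-D position and that the population standard deviation is used; neither choice is stated explicitly here, and the function name is hypothetical.

```python
import math


def error_statistics(estimates, truths):
    """Mean and population standard deviation of Euclidean position errors."""
    errors = [math.hypot(ex - tx, ey - ty)
              for (ex, ey), (tx, ty) in zip(estimates, truths)]
    mean = sum(errors) / len(errors)
    variance = sum((e - mean) ** 2 for e in errors) / len(errors)
    return mean, math.sqrt(variance)


# Two test samples: one estimate is 5 m off, the other is exact.
mean_err, std_err = error_statistics([(3.0, 4.0), (1.0, 1.0)],
                                     [(0.0, 0.0), (1.0, 1.0)])
print(f"mean = {mean_err:.2f} m, std = {std_err:.2f} m")
```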

C. PERFORMANCE ANALYSIS WITH MULTIPLE TIME-SLOT MEASUREMENTS
In this section, we discuss the performance of the different algorithms under the three scenarios with multiple time-slot measurements. Because none of the existing optimization-based algorithms can be extended to multiple time-slot conditions, we instead apply the single time-slot CNVPS algorithm, which showed the best results among the optimization-based algorithms in the previous simulation, in each time slot as a benchmark. Fig. 4 illustrates the error statistics of the different algorithms under the three scenarios when the driving mode is set to the straight-through mode in a cooperative group containing five vehicles. Unlike CNVPS, which cannot utilize the information from multiple time-slot measurements, all three DL-based CNVPS algorithms gain from the extra information as the number of time slots increases and achieve a better performance than under single time-slot measurement conditions. More specifically, LSTM-CNVPS outperforms MLP-CNVPS because of its mechanism for emphasizing the temporal correlation. Furthermore, GCN-CNVPS offers a better performance than LSTM-CNVPS because temporal and spatial correlations are both considered through the convolution operations of the GCN model. We can also note that the improvement in the urban case is the most pronounced because the GPS error in this scenario leaves more room for improvement. Nevertheless, even under the freespace scenario, the DL-based CNVPS algorithms still improve the original GPS estimation results. Fig. 5 shows the error statistics of the different algorithms under the same three scenarios when the driving mode is set to the lane-change mode in a cooperative group containing five vehicles. We observe the same behavior, with the three DL-based algorithms showing considerable improvements over the results of the CNVPS algorithm.
Note that the straight-through mode is easier than the lane-change mode because of the fewer variations in direction and the higher correlation of the locations at different time slots. Hence, all methods perform worse in the lane-change mode than in the straight-through mode. However, GCN-CNVPS always achieves the best performance in both modes. These results suggest that GCN-CNVPS enhances the performance of GPS by extracting temporal and spatial relationships from historical measurements, confirming our motivation for the design of GCN-CNVPS.
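To make the notion of a graph convolution over a cooperative group concrete, the sketch below implements one propagation step of the widely used rule $H' = \mathrm{ReLU}(\hat{D}^{-1/2}(A + I)\hat{D}^{-1/2} H W)$. It is a generic illustration and not necessarily the layer used in GCN-CNVPS: the fully connected vehicle graph, the feature layout, and all function names are assumptions.

```python
import math


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for an adjacency with zero diagonal."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj_norm, features, weight):
    """One propagation step: ReLU(adj_norm @ features @ weight)."""
    z = matmul(matmul(adj_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in z]


# Three vehicles, fully connected (hypothetical graph). Each row of
# `features` is one vehicle; stacking measurements from several time slots
# along the feature axis is one assumed way to expose temporal information.
adj = [[0.0, 1.0, 1.0],
       [1.0, 0.0, 1.0],
       [1.0, 1.0, 0.0]]
features = [[1.0], [2.0], [3.0]]
weight = [[1.0]]  # shared by all vehicles
hidden = gcn_layer(normalized_adjacency(adj), features, weight)
```

Because every vehicle aggregates the features of its neighbors before the shared weight matrix is applied, the spatial structure of the group enters the model directly rather than having to be learned from a flat input vector.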

D. SCALABILITY OF PROPOSED ALGORITHMS
In this section, we further verify the scalability of the proposed algorithms by extending them to a vehicular scenario consisting of ten cars. Figs. 6 and 7 present the error statistics of the different algorithms in the three scenarios under the different driving modes in a cooperative group containing ten vehicles. With the additional information provided by the larger number of surrounding vehicles, the performance of all DL-based CNVPS algorithms improves compared to the previous simulations. However, because MLP-CNVPS and LSTM-CNVPS cannot utilize the spatial correlation well, the performance of these two algorithms saturates quickly under this scenario. By contrast, GCN-CNVPS can handle and utilize the complex spatial correlation among the ten vehicles and offers an even better performance than with a smaller group of cooperative vehicles, demonstrating the scalability of GCN-CNVPS. Fig. 8 shows the number of trainable parameters of the different DL-based CNVPS algorithms. Because the dimension of the input layer of MLP-CNVPS increases with the number of time slots, its number of trainable parameters also increases with this number. For LSTM-CNVPS, the number of trainable parameters remains the same because LSTM-CNVPS uses the same trainable parameters to process the data from different time slots. Moreover, we can observe that the numbers of trainable parameters of MLP-CNVPS and LSTM-CNVPS are comparable. For GCN-CNVPS, the number of trainable parameters also increases with the number of time slots because the dimension of its input layer grows accordingly, but the rate of increase is fairly flat compared to that of MLP-CNVPS. Finally, we can observe that the training overhead of GCN-CNVPS is far less than that of MLP-CNVPS and LSTM-CNVPS. Note that GCN-CNVPS also significantly outperforms MLP-CNVPS and LSTM-CNVPS.
Based on the aforementioned observations, we conclude that GCN-CNVPS is an efficient CNVPS solution with a high performance and low training overhead because both temporal and spatial correlations are well utilized during the GCN operations, making GCN-CNVPS a promising solution for assisting GPS refinement in practice.
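The scaling trends of the trainable parameters can be reproduced qualitatively by counting the parameters of the first layer of each model type. The sketch below does not reproduce the values in Fig. 8: the feature size per vehicle, the hidden width, and the assumptions that the MLP and LSTM inputs concatenate the features of all vehicles while the graph convolution shares one weight matrix across vehicles are hypothetical. The LSTM count uses one bias vector per gate; some frameworks store two.

```python
def dense_params(in_dim, out_dim):
    """Weights plus biases of a fully connected layer."""
    return (in_dim + 1) * out_dim


def lstm_params(in_dim, hidden):
    """Single LSTM layer: four gates, one bias vector per gate."""
    return 4 * ((in_dim + hidden) * hidden + hidden)


def first_layer_counts(num_slots, feat=4, vehicles=5, hidden=64):
    """First-layer parameter counts under the stated assumptions."""
    return {
        # The MLP sees all vehicles and all time slots as one flat vector.
        "mlp": dense_params(feat * vehicles * num_slots, hidden),
        # The LSTM reuses the same weights at every time slot.
        "lstm": lstm_params(feat * vehicles, hidden),
        # The graph convolution shares its weights across vehicles.
        "gcn": dense_params(feat * num_slots, hidden),
    }


for slots in (1, 3, 5):
    print(slots, first_layer_counts(slots))
```

Under these assumptions, the MLP count grows by `feat * vehicles * hidden` per additional time slot, the LSTM count is constant, and the graph convolution grows by only `feat * hidden`, which matches the qualitative behavior described above.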

V. CONCLUSION
In this study, we proposed several DL-based cooperative vehicle localization approaches to provide precise location estimation results. Specifically, MLP-CNVPS applies an effective data fusion to aid GPS refinement. LSTM-CNVPS was developed by further considering the temporal correlation hidden in the multiple time-slot data. Finally, GCN-CNVPS was developed to consider temporal and spatial correlations simultaneously, offering a higher performance and lower training overhead than the aforementioned algorithms. Moreover, extensive simulations confirmed the scalability and robustness of the proposed algorithms, making them potential candidates for GPS refinement in practice. Encouraged by the performance achieved in this work, we will seek industry partners to test our algorithms in over-the-air scenarios in the future. We also hope that this study will encourage researchers to introduce GCN-based algorithms for efficient vehicular applications.