Congestion Attack Detection in Intelligent Traffic Signal System: Combining Empirical and Analytical Methods

The intelligent traffic signal (I-SIG) system aims to perform automatic and optimal signal control based on traffic situation awareness by leveraging connected vehicle (CV) technology. However, the current signal control algorithm is highly vulnerable to CV data spoofing attacks. These vulnerabilities can be exploited to create congestion in an intersection and even trigger a cascade failure in the traffic network. To avoid this issue, timely and accurate congestion attack detection and identification are essential. This work proposes a congestion attack detection approach by combining empirical prediction and analytical verification. First, we collect a range of traffic images that correspond to specific traffic snapshots which are vulnerable to potential data spoofing attacks. Based on these traffic images, an improved generative adversarial network is trained to predict whether a forthcoming attack will cause congestion with a high probability. Meanwhile, we define a group of traffic flow features. After exploring features and conducting a thorough analysis, a TGRU (tree-regularized gated recurrent unit)-based approach is proposed to verify whether congestion occurs. When we find a possible attack that can cause congestion with high probability and subsequent traffic flows also prove congestion, we can say there is a congestion attack. Thus, we can realize timely and accurate congestion attack detection by integrating empirical prediction and analytical verification. Extensive experiments demonstrate that our approach performs well in congestion attack detection accuracy and timeliness.


Introduction
Connected vehicle (CV) technology [1,2] empowers vehicles to communicate with the surrounding environment (roadside units and traffic signal control infrastructure) and is now transforming today's transportation systems. As one key component, the intelligent traffic signal (I-SIG) system [3] is responsible for performing dynamic and optimal signal control. It is based on automatic traffic situation awareness by leveraging the emerging communication infrastructure of the space-air-ground integrated network (SAGIN) [4,5] with the advantages of coverage, flexibility, and so on. For instance, since September 2016, a series of I-SIG systems have been deployed in California, Florida, and New York by the U.S. Department of Transportation (USDOT) as a CV Pilot Program [1]. These systems are currently under testing and not yet widespread.
Unfortunately, such dramatically increased connectivity also opens a new door for cyberattacks. Recently, a vulnerability was exposed in the controlled optimization of phases (COP) algorithm used by I-SIG [6,7]. Attackers can compromise the on-board units on their vehicles and send malicious messages (such as those containing speed and location) to influence the traffic control decisions at specific times, thus causing unexpected heavy traffic congestion. Prior results show that a single attack vehicle can cause a total delay 11 times greater than the total delay before the attack [8], posing a significant barrier to the development and deployment of I-SIG systems on a wide scale in the future.
Previous research [8] reveals such congestion attacks on the COP algorithm, analyzes how congestion attacks affect the COP algorithm decisions, and explains how to launch an attack using data spoofing in SAGIN. However, developers may still lack a deep understanding of such I-SIG attacks and defenses, raising some pressing concerns: (1) What is the effect of different phases where the attack vehicle is located? The different phases of the attack vehicle can cause different congestion effects. (2) What is the quantified correlation between the attack and congestion degree? The quantified correlation refers to the potential relationship between the attack and congestion degree; once identified, we can infer whether the attack occurred according to the congestion degree. (3) Are there any potential features to be utilized for revealing the above correlation? To solve these issues, it is first necessary to analyze the congestion attack mechanism. The challenges include how to automatically explore multiple multidimensional features that quantify the traffic flow characteristics with and without a congestion attack and how to analyze the correlation between attack features and attack effects.
Thus, demystifying the congestion attack based on the COP mechanism through quantified features and exploring new analysis methods will benefit all stakeholders for I-SIG, including transportation, SAGIN, and security specialists.
We demystify the attack and corresponding congestion from a machine learning perspective by exploring and utilizing quantified features. We deeply analyze data spoofing in SAGIN and the COP algorithm vulnerability under two different attack strategies. To explore the effect of different phases of the attack vehicle, we consider utilizing high-level image features and design a novel analysis model based on the cycle generative adversarial network (CycleGAN) [9] to reflect the relation between the attack and the congestion caused by the attack. Thus, we can predict whether a forthcoming attack will cause congestion and the congestion effect according to the traffic image at a specific moment. To explore the quantified correlation between the attack and congestion degree, we utilize traffic flow features and the TGRU classification model [10] (an explainable gated recurrent unit-based model [11] with tree regularization) to verify whether a congestion attack occurs based on all vehicles' trajectory data in an intersection. Following analysis, we also give some promising suggestions for defending I-SIG systems against a congestion attack.
We implement the I-SIG and experiment through visualized simulation in VISSIM [12]. The experiment shows the effectiveness of our approach. We find that feature-based machine learning can reflect the correlation between the attack and congestion degree well.
Through deep learning-based training, the CycleGAN-based approach outputs visualized results with satisfactory prediction compared with real values: the MAE and RMSE of the capacity ratio are near 0.02 and 0.03, respectively, and the MAE and RMSE of the congestion degree are near 0.94 and 1.14, respectively. TGRU has a precision of 0.84 and a recall of 0.79 in predicting the spoofing attack based on 30 features. Generally, for defenses, we suggest improving the estimation of vehicle location and speed (EVLS) [7] algorithm of I-SIG if the cost must stay limited, since this requires fewer authentication mechanisms and SAGIN reinforcement efforts.
We summarize our contributions as follows:
(1) We perform the study to demystify the attack on I-SIG and the corresponding congestion from a machine learning perspective by exploring different kinds of features through supervised learning and unsupervised learning.
(2) For predicting the spoofing congestion attack, we automatically explore the image feature to quantify the traffic flow characteristics with and without a congestion attack. We propose a CycleGAN-based approach to analyze the potential relationship between the congestion attack and the corresponding results two stages later based on the image feature.
(3) For verifying the spoofing congestion attack, we propose a TGRU-based approach to explore the underlying relationship between the congestion attack and the traffic flow at the current moment based on the traffic flow features, which are first defined in this work.
(4) We evaluate our approach empirically with the real COP algorithm through VISSIM. We collect 4476 high-quality image samples and 3600 traffic flow data for the experiment, which enables us to demonstrate the effectiveness of our approach compared with ground truth.
Figure 1 presents the basic architecture for the space-air-ground integrated network of I-SIG, in which two main segments are included: a space segment and a ground segment. The I-SIG of the CV environment is located in the network-based ground segment. There are three main components within the ground segment: on-board units (OBUs), roadside units (RSUs), and signal planning units. These refer to the devices installed in vehicles, roadside servers, and traffic lights, respectively. Both vehicle-to-vehicle (V2V) [13] communication and vehicle-to-infrastructure (V2I, e.g., roadside servers) [14] communication adopt the dedicated short-range communications (DSRC) [15,16].
By providing two-way communication between vehicles and traffic signals, NTCIP is specially designed to achieve interoperability and interchangeability between computers and electronic traffic control equipment from different manufacturers, and it is therefore increasingly used in smart city initiatives.

I-SIG Data Flow.
The data flow of the I-SIG system is shown in Figure 2. Each OBU of a vehicle sends BSMs to the RSU for real-time trajectory collection. Then, the data are preprocessed to form an arrival table (Table 1) to be used as input for signal planning, which contains the COP and EVLS algorithms. If the penetration rate (PR) of OBUs is less than 95%, the arrival table will be sent to EVLS for an update. Otherwise, it will be directly sent to the COP algorithm for planning. According to the results of the COP algorithm, a downward signaling command will be transferred to the phase signal controller. After each stage of signal control, the status of the signal will be returned as feedback for continuous COP planning.
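The routing decision in this data flow can be sketched as follows. This is a minimal illustration under stated assumptions, not the I-SIG implementation: the function name `route_arrival_table` and the step labels are hypothetical, and only the 95% threshold is taken from the text.

```python
PR_THRESHOLD = 0.95  # penetration-rate threshold stated in the text


def route_arrival_table(penetration_rate: float) -> list[str]:
    """Return the ordered processing steps for one arrival table.

    Hypothetical helper: below the threshold the table is first updated
    by EVLS; otherwise it goes straight to COP planning.
    """
    if not 0.0 <= penetration_rate <= 1.0:
        raise ValueError("penetration_rate must be in [0, 1]")
    steps = []
    if penetration_rate < PR_THRESHOLD:
        steps.append("EVLS")  # estimate unequipped vehicles first
    steps.append("COP")  # signal planning always runs
    return steps
```

For example, `route_arrival_table(0.5)` yields the EVLS step followed by COP, whereas a fully deployed intersection (`0.95` or above) goes to COP only.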
There are 8 traffic signals in I-SIG, as shown in Figure 3, called phases; odd numbers are for left-turn lanes, and even numbers are for through lanes. Table 1 is the arrival table, which is sent to the signal planning model. In Table 1, $T_i$ denotes the time to arrive at the stop bar from the current location. I-SIG sets $M = 130$ seconds, covering a BSM statistic of over two minutes. $N_{ij}$ ($i \in [0, M]$, $j \in [1, 8]$) means that in phase $j$, there will be $N_{ij}$ vehicles that are going to reach the stop bar within $T_i$ seconds. Here, the stop bar is set in front of the traffic light, as it is marked in real road intersections.
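As an illustration of the arrival table layout, the sketch below builds an (M + 1) × 8 table of vehicle counts from per-vehicle estimates. The binning rule (each vehicle counted once, in the row of its arrival time rounded down) is an assumption made for this example and is one plausible reading of Table 1; the function name `build_arrival_table` is hypothetical.

```python
import math

M = 130         # planning horizon in seconds (from the text)
NUM_PHASES = 8  # phases 1..8 (from the text)


def build_arrival_table(vehicles):
    """Build an arrival table from (arrival_time_seconds, phase) pairs.

    table[i][j - 1] counts vehicles of phase j whose estimated arrival
    time rounds down to i seconds (assumed binning rule). Vehicles with
    a negative arrival time or beyond the horizon M are ignored.
    """
    table = [[0] * NUM_PHASES for _ in range(M + 1)]
    for arrival_time, phase in vehicles:
        if not 1 <= phase <= NUM_PHASES:
            raise ValueError(f"invalid phase {phase}")
        i = math.floor(arrival_time)
        if 0 <= i <= M:
            table[i][phase - 1] += 1
    return table
```

Under this reading, row 0 holds the vehicles that are already queued at the stop bar, which is the row that the queue estimation discussed next would update.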
The EVLS is based on Wiedemann's car-following model and is used to fill the blank monitoring area of the monitoring segment and insert vehicle data between OBU-equipped vehicles.
The key is to estimate the number of queued vehicles. Because it is assumed that a queue always begins at the stop bar, the last vehicle in the queue needs to be found to determine the queue length.
First, the historical distances to the stop bar and stop times of the last stopped connected vehicle and the second-to-the-last stopped connected vehicle in the queue are calculated; these are denoted as $L_{q1}$, $T_{q1}$, $L_{q2}$, and $T_{q2}$, respectively. The current time is $T_c$, and the estimated queue length is $L_{es}$. Assuming that the queue propagation speed $v_q$ is constant, we have
$$v_q = \frac{L_{q1} - L_{q2}}{T_{q1} - T_{q2}}. \tag{1}$$
Then,
$$L_{es} = L_{q1} + v_q \,(T_c - T_{q1}). \tag{2}$$
If the average vehicle length is $C$, the number $N_{0i}$ of vehicles in the queue is then calculated as follows:
$$N_{0i} = \frac{L_{es}}{C}. \tag{3}$$
Although such estimation provides effective support for a low PR, it also introduces a new threat of data spoofing attack to the COP algorithm.
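The queue estimate can be sketched in a few lines. The sketch assumes that the constant propagation speed is obtained from the two most recent stopped connected vehicles and that the queue length is extrapolated linearly to the current time; the function and parameter names are hypothetical, and the real EVLS implementation may differ in detail.

```python
def estimate_queue(l_q1, t_q1, l_q2, t_q2, t_c, avg_vehicle_length):
    """Estimate queue propagation speed, queue length, and queued vehicles.

    l_q1, t_q1: distance to stop bar (m) and stop time (s) of the last
                stopped connected vehicle
    l_q2, t_q2: the same for the second-to-last stopped connected vehicle
    t_c:        current time (s)
    avg_vehicle_length: average vehicle length C (m)
    """
    if t_q1 == t_q2:
        raise ValueError("stop times must differ to estimate a speed")
    if avg_vehicle_length <= 0:
        raise ValueError("avg_vehicle_length must be positive")
    v_q = (l_q1 - l_q2) / (t_q1 - t_q2)   # constant propagation speed
    l_es = l_q1 + v_q * (t_c - t_q1)      # linear extrapolation to t_c
    n_queued = l_es / avg_vehicle_length  # vehicles in the queue
    return v_q, l_es, n_queued


# Worked example with made-up numbers: the queue grew from 28 m to 35 m
# between t = 8 s and t = 10 s, and the current time is 12 s.
v_q, l_es, n_queued = estimate_queue(35, 10, 28, 8, 12, 7)
```

With these illustrative inputs the propagation speed is 3.5 m/s, the estimated queue length is 42 m, and a 7 m average vehicle length gives 6 queued vehicles. Because the result depends on only two reported positions, a single falsified report can shift it substantially.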

Data Spoofing Threat.
There are two data spoofing attack strategies proposed in I-SIG (Figure 4). The first one is a direct attack on the arrival table without considering PR; the second one is an indirect attack on EVLS when the PR is less than 95%. The first strategy is for arrival time and phase spoofing, for both the full deployment period (PR ≥ 95%) and the transition period (PR < 95%). The attacker changes the location and speed information in vehicle BSMs to alter the vehicle's arrival time and requested phase; thus, the corresponding arrival table elements in Table 1 are changed. This attack strategy can directly attack the input data flow no matter what the PR is. As shown in Figure 4(a), the attacker adds a spoofed vehicle into the original vehicle queue at any location. The insertion of a spoofed vehicle makes the queue longer. As a result, the COP algorithm allocates a longer green light to the current phase, which delays the next green start time of all phases and thus increases the delay for vehicles to pass through the intersection. The second strategy is for queue-length spoofing, for the transition period only. This strategy aims to extend the queue length estimated by the EVLS algorithm by changing the location and speed values in BSMs. Figure 4(b) shows that the attacker adds a stopped vehicle with the farthest distance to the stop bar. Because the EVLS algorithm estimates the queue length based on the location of the last stopped connected vehicle, this attack causes the estimated queue length $L_{es}$ calculated by equation (2) to increase. Therefore, the number of vehicles in the queue $N_{0i}$ calculated by equation (3) increases as well.
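The effect of the second strategy can be illustrated with a toy calculation. The sketch assumes the queue length is extrapolated linearly from the two farthest stopped-vehicle reports; the function name, the report format, and all numbers are hypothetical and chosen only to show the direction of the effect.

```python
def estimate_queue_length(stopped_reports, t_c):
    """Estimate queue length from stopped connected-vehicle reports.

    stopped_reports: list of (distance_to_stop_bar_m, stop_time_s) pairs
    t_c: current time in seconds
    Uses the two reports farthest from the stop bar (assumed rule).
    """
    if len(stopped_reports) < 2:
        raise ValueError("need at least two stopped-vehicle reports")
    ordered = sorted(stopped_reports)  # ascending distance to stop bar
    (l_q2, t_q2), (l_q1, t_q1) = ordered[-2], ordered[-1]
    if t_q1 == t_q2:
        raise ValueError("stop times must differ to estimate a speed")
    v_q = (l_q1 - l_q2) / (t_q1 - t_q2)
    return l_q1 + v_q * (t_c - t_q1)


honest = [(7, 2), (14, 4), (21, 6)]
spoofed = honest + [(70, 7)]  # attacker reports a stopped vehicle far upstream

honest_length = estimate_queue_length(honest, t_c=8)    # 28.0 m
spoofed_length = estimate_queue_length(spoofed, t_c=8)  # 119.0 m
```

In this made-up scenario one spoofed report raises the estimated queue from 28 m to 119 m, which would inflate the queued-vehicle count passed to the COP algorithm by the same factor.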

Planning-Level Congestion Analysis.
The COP algorithm is responsible for traffic signal planning; thus, it is essential for planning-level congestion analysis of I-SIG. Through reading the published COP-related papers [6,7] and analyzing the implementation code, we reveal a more complete and detailed COP algorithm for the first time (Algorithms 1 and 2). The authors in [6] first proposed a COP algorithm that allows optimization of various performance indices, including delay, stops, and queue lengths, for the optimal control of a single intersection. However, it did not support flexible or dual ring and phase sequences, and it is difficult for most readers to understand because the algorithm flow is not given. Based on the COP algorithm, the authors in [7] presented a real-time adaptive traffic control algorithm that utilizes data from connected vehicles to optimize the phase sequence. However, they did not provide the details of the algorithm. Compared with [6,7], Algorithm 1 is the first algorithm that provides a complete and detailed flow of signal planning.
In Table 2, we list the meanings of the mathematical symbols that appear in the two algorithms. The design of the COP algorithm uses the collaboration of two-stage planning and operation. The COP algorithm plans signals for the next stage based on the vehicle estimation, and the planned signal durations are executed at the next-stage signal control time. Thus, this is a continuous alternating process in a fixed phase sequence, which means that the I-SIG system cannot change the order and duration of phases in the current stage, since these were set in the previous stage. While bringing foresight to planning, such a design also opens the door to attacks on signal planning that continuously affect next-stage operation. The spoofing of the arrival table affects the variables $A_{t,k}$ and $planP_{r,p}$ and the later calculation of $Delay_r$ in line 19 of Algorithm 1. The change in $Delay_r$ causes the variables $optV_r$ in line 21, $optG_{r,0}$ in line 22, and $optG_{r,1}$ in line 23 of Algorithm 1 to change as well. Finally, the outputs $planP_{r,p}$, $optG_{r,p}$, $x^*_j$, and $v_j$ are changed.

High-Level Image Feature-Based Congestion Attack Prediction.
In this subsection, we employ an image feature-based CycleGAN to explain the relationship between the phase where the spoofed vehicle is located and the congestion image features two stages later. As mentioned in the Data Spoofing Threat section, there are two data spoofing attack strategies, but either attack will cause congestion in a period. Different phases of spoofed vehicles lead to different congestion effects. Therefore, the image features of intersection congestion are also different.
The CycleGAN model can mine the potential relationship between two different types (X and Y) of images. Through training, CycleGAN can generate the corresponding images Y according to X and generate the related images X according to Y. Therefore, we utilize the CycleGAN model to predict the congestion effects according to the phases of spoofed vehicles, which are considered the attack feature, in order to reveal the relationship between the phase of the spoofed vehicle and the resulting congestion image feature.
The CycleGAN architecture is illustrated in Figure 5. One training sample is a pair of images $(x_i, y_i)$, with $x_i \in X$ and $y_i \in Y$. Here, $x_i$ refers to the processed traffic image at the spoofing time, and $y_i$ is the processed traffic congestion image two stages later. The image processing consists of three steps: (1) filter out the environment background; (2) extract four images, each of which covers 2 phases at one intersection; and (3) join these four images into a single sample image.
There are four neural networks in the CycleGAN architecture: two generative networks ($G$ and $F$) and two discriminator networks ($D_X$ and $D_Y$). The generator $G$ generates a fake image $\hat{y}$, which is similar to $y$, given real image $x$, i.e., $G: X \to Y$. Meanwhile, $F$ generates a fake image $\hat{x}$, which is similar to $x$, given real image $y$, i.e., $F: Y \to X$. The adversarial discriminator $D_X$ aims to distinguish whether the input image is $x$ and outputs probability $P(x)$. Similarly, $D_Y$ aims to discriminate whether the input image is $y$ and outputs probability $P(y)$.
For each image $x$, the cycle $x \to G(x) \to F(G(x)) \approx x$ is called forward cycle consistency; similarly, $y \to F(y) \to G(F(y)) \approx y$ is a cycle called backward cycle consistency. There are two kinds of losses: adversarial loss and cycle consistency loss. Adversarial loss can only guarantee that the samples generated by the generator follow the same distribution as the real samples, but we want the images between the corresponding domains to correspond one to one. That is, an image translated from X to Y should also be translatable back to X. So, forward cycle consistency and backward cycle consistency are used so that the samples generated by the two generators do not contradict each other.

Adversarial Loss.
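This refers to the difference in distribution between generated images and the corresponding real images. For discriminators $D_X$ and $D_Y$, the closer the output value is to 1, the smaller the loss is. The losses of $G$ and $F$ can be calculated as $L_G$ and $L_F$, respectively; in the standard CycleGAN formulation [9], they are
$$L_G = \mathbb{E}_{y}\big[\log D_Y(y)\big] + \mathbb{E}_{x}\big[\log\big(1 - D_Y(G(x))\big)\big],$$
$$L_F = \mathbb{E}_{x}\big[\log D_X(x)\big] + \mathbb{E}_{y}\big[\log\big(1 - D_X(F(y))\big)\big].$$

Cycle Consistency Loss.
This prevents the learned mappings $G$ and $F$ from contradicting each other, making $F(G(x)) \approx x$ and $G(F(y)) \approx y$. The loss $L_{cyc}(G, F)$ is calculated by the following equation:
$$L_{cyc}(G, F) = \mathbb{E}_{x}\big[\lVert F(G(x)) - x \rVert_1\big] + \mathbb{E}_{y}\big[\lVert G(F(y)) - y \rVert_1\big].$$
The total loss for CycleGAN is
$$L(G, F, D_X, D_Y) = L_G + L_F + \lambda L_{cyc}(G, F),$$
in which $\lambda$ is a weight that balances the adversarial and cycle consistency terms. Then, the objective function of the CycleGAN is defined as follows:
$$G^*, F^* = \arg\min_{G, F} \max_{D_X, D_Y} L(G, F, D_X, D_Y).$$
The detailed implementation of neural networks in CycleGAN will be described in the following experiment setup.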
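Cycle consistency can be illustrated without any deep learning framework. The sketch below treats an image as a flat list of pixel values and the two generators as plain functions; it uses a mean absolute (L1-style) difference, which is an assumption made for the example, and all names are hypothetical.

```python
def mean_abs_difference(a, b):
    """Mean absolute per-pixel difference between two equally sized images."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def cycle_consistency_loss(G, F, xs, ys):
    """Forward (x -> G -> F) plus backward (y -> F -> G) reconstruction error."""
    forward = sum(mean_abs_difference(F(G(x)), x) for x in xs) / len(xs)
    backward = sum(mean_abs_difference(G(F(y)), y) for y in ys) / len(ys)
    return forward + backward


# Toy "generators": G brightens every pixel by 1, and F_inverse undoes it.
def G(img):
    return [p + 1 for p in img]


def F_inverse(img):
    return [p - 1 for p in img]


def F_identity(img):  # a mapping that does not undo G
    return list(img)


xs = [[0, 1, 2], [3, 4, 5]]
ys = [[1, 2, 3], [4, 5, 6]]

consistent = cycle_consistency_loss(G, F_inverse, xs, ys)     # 0.0
inconsistent = cycle_consistency_loss(G, F_identity, xs, ys)  # 2.0
```

The loss is zero only when the two mappings invert each other, which is the property that ties each attack-time image to one specific predicted congestion image.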

Traffic Flow Feature-Based Congestion Attack Verification.
In this subsection, we use a deep learning-based decision tree model, TGRU, to explain the relationship between the traffic flow features and the congestion attack.
This relationship can then be used to verify whether congestion is occurring. The TGRU model is an interpretable deep time-series model, which is well suited to intersection traffic flow features with time characteristics. At the same time, interpretability helps to better analyze the relationship between traffic flow features and the congestion attack. The input of the congestion prediction is the traffic image feature of the intersection, and the prediction model outputs the congestion effects two stages later according to the image feature, which indicates whether congestion will occur. After the congestion prediction, the verification model is used to verify whether the congestion attack is occurring. The verification input is the defined traffic flow feature, which is calculated from the vehicles' information. When we find a possible attack that can cause congestion with high probability and subsequent traffic flows also verify congestion, we can conclude that a congestion attack exists. Thus, we can realize timely and accurate congestion attack detection by integrating empirical prediction and analytical verification.
Feature Definition Based on Traffic Flow. To measure the congestion effects caused by spoofed vehicles, we propose the capacity ratio and congestion degree, as well as an attack acceleration and an attack amplification ratio based on the capacity ratio and congestion degree. We define the features as follows:
(1) Vehicle Capacity Ratio (CR). $C^{\max}_k$ is the maximum vehicle capacity of the $k$th phase, and the vehicle capacity of all 8 phases is computed as $C^{\max}_{\mathrm{total}} = \sum_{k=1}^{8} C^{\max}_k$. Then, the vehicle CR can be denoted by $CR = \sum_{k=1}^{8} N_k / C^{\max}_{\mathrm{total}}$, where $N_k$ is the number of vehicles in the $k$th phase.
(2) Congestion Degree (CD). The number of vehicles queuing in the $k$th phase is denoted as $Q_k$. $Q_{\mathrm{normal}}$ is the number of vehicles during normal queuing and is a constant. Then, the CD of the $k$th phase can be computed by $PCD_k = Q_k / Q_{\mathrm{normal}}$, and the global CD for an intersection is $ICD = \sum_{k=1}^{8} PCD_k$.
(3) Attack Acceleration. Let $t_0$ be the start time of the data spoofing attack. Then, the accelerations of CR, $PCD_k$, and ICD at time $t$ are denoted by $\alpha_{CR}(t)$, $\alpha_{PCD}(t, k)$, and $\alpha_{ICD}(t)$, respectively.
(4) Attack Amplification Ratio. Let $t_0$ be the start time of the data spoofing attack. Then, the amplification ratios of CR, $PCD_k$, and ICD at time $t$ are, respectively, calculated by $\beta_{CR}(t) = CR(t)/CR(t_0)$, $\beta_{PCD}(t, k) = PCD(t, k)/PCD(t_0, k)$, and $\beta_{ICD}(t) = ICD(t)/ICD(t_0)$.
Features are divided into macrofeatures and microfeatures for the sake of discussing interpretability, depending on whether they are a feature of the whole intersection or a specific phase (Table 3). Macrofeatures measure the congestion characteristics of the whole intersection, and microfeatures measure the congestion characteristics of a single signal phase. Unlike the traditional traffic flow characteristics, such as traffic flow, traffic density, and speed, the traffic flow features we define are related to attacks and are divided into the features for all single signal phases and the features of the whole intersection. For a traffic flow of 1800 seconds, we only sample the first 10 seconds (the flow head) and the last 10 seconds (the flow tail). For the flow head, we choose macrofeatures, microfeatures, or both, and then we choose the same features from the flow tail. Therefore, the number of features is from 20 to 600. We use the Z-score as a standardization to adjust feature values. For values $(x_1, x_2, \ldots, x_n)$ of one feature in all samples, the new value is computed by $x_i' = (x_i - \bar{x})/s$, in which $s$ is the standard deviation and $\bar{x}$ is the mean value of $(x_1, x_2, \ldots, x_n)$.
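The feature definitions translate directly into code. The following sketch computes CR, the per-phase and intersection congestion degrees, an amplification ratio, and the Z-score; the function names and the sample numbers are hypothetical, and the use of the population standard deviation is an assumption because the text does not say which variant is used.

```python
import statistics


def capacity_ratio(vehicles_per_phase, max_capacity_per_phase):
    """CR: total vehicles over total maximum capacity of all phases."""
    return sum(vehicles_per_phase) / sum(max_capacity_per_phase)


def phase_congestion_degrees(queued_per_phase, q_normal):
    """PCD_k = Q_k / Q_normal for every phase k."""
    return [q / q_normal for q in queued_per_phase]


def intersection_congestion_degree(queued_per_phase, q_normal):
    """ICD: sum of the per-phase congestion degrees."""
    return sum(phase_congestion_degrees(queued_per_phase, q_normal))


def amplification_ratio(value_t, value_t0):
    """Ratio of a feature at time t to its value at the attack start t0."""
    return value_t / value_t0


def z_score(values):
    """Standardize one feature across samples (population std assumed)."""
    mean = statistics.mean(values)
    s = statistics.pstdev(values)
    if s == 0:
        raise ValueError("standard deviation is zero")
    return [(v - mean) / s for v in values]


# Made-up snapshot of an 8-phase intersection.
vehicles = [4, 10, 2, 8, 4, 10, 2, 8]
capacity = [12] * 8
queued = [2, 6, 0, 4, 2, 8, 1, 5]

cr = capacity_ratio(vehicles, capacity)                       # 0.5
icd = intersection_congestion_degree(queued, q_normal=4)      # 7.0
beta_icd = amplification_ratio(icd, 3.5)                      # 2.0
```

In this illustrative snapshot half of the intersection capacity is occupied, and the intersection congestion degree has doubled relative to an assumed value of 3.5 at the attack start.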
TGRU Model. We exhaustively try data spoofing using the last vehicle and collect time-sequence samples. For such data, we use TGRU, a time-series model with decision tree regularization, for interpretability. Figure 6 shows the TGRU architecture for end-to-end calculation.
There are four main operators: sigm, tanh, plus, and Hadamard product. Sigm refers to the sigmoid function, and tanh refers to the hyperbolic tangent function. The objective function is as follows:
$$\min_{W} \; \lambda\,\Omega(W) + \sum_{n=1}^{N} \sum_{t=1}^{T} \mathrm{loss}\big(y_{nt}, \hat{y}_{nt}(x_{nt}, W)\big),$$
where $\lambda$ ($\lambda > 0$) is the regularization strength, $W$ is the whole parameter space, $N$ is the sample number, and $T$ denotes the number of time steps sampled in one series. The logistic loss function is binary cross entropy. Next, a single binary decision tree that accurately reproduces the network's thresholded binary predictions $\hat{y}_n$ given input $x_n$ is found. Then, the complexity of this decision tree is measured as the output of $\Omega(W)$. The complexity is measured by the average decision path length, i.e., the average number of decision nodes that must be touched to make a prediction for an input example $x_n$. Because this quantity is not differentiable, a surrogate function $\hat{\Omega}(W)$ is used to map $W$ to an estimate of the average path length and is implemented by a multilayer perceptron (MLP) approximator. Then, tree regularization is conducted, and, following [10], its objective function is defined as follows:
$$\min_{\xi} \; \sum_{j=1}^{J} \big(\Omega(W_j) - \hat{\Omega}(W_j; \xi)\big)^2 + \epsilon \lVert \xi \rVert_2^2,$$
where $J$ is the size of the candidate dataset of $W$, the vector $\xi$ denotes the parameters of the chosen MLP approximator, and $\epsilon$ is a small regularization weight on $\xi$.
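The complexity measure used by tree regularization, the average decision path length, can be illustrated on a hand-written tree. The tuple-based tree encoding and the function names below are hypothetical choices for this sketch; in TGRU the tree is fitted to the network's predictions rather than written by hand.

```python
# A node is either a leaf label (0 or 1) or a tuple
# (feature_index, threshold, left_subtree, right_subtree);
# inputs with x[feature_index] <= threshold go left.


def path_length(tree, x):
    """Number of decision nodes visited to classify input x."""
    depth = 0
    node = tree
    while isinstance(node, tuple):
        feature, threshold, left, right = node
        node = left if x[feature] <= threshold else right
        depth += 1
    return depth


def average_path_length(tree, inputs):
    """Average decision path length over a set of inputs."""
    if not inputs:
        raise ValueError("inputs must not be empty")
    return sum(path_length(tree, x) for x in inputs) / len(inputs)


# Toy tree: split on feature 0 first, then on feature 1 on the right branch.
toy_tree = (0, 0.5, 0, (1, 0.2, 0, 1))
toy_inputs = [[0.1, 0.9], [0.9, 0.1], [0.9, 0.9], [0.3, 0.0]]

omega = average_path_length(toy_tree, toy_inputs)  # 1.5
```

Two of the four toy inputs stop after one decision and two need a second one, so the average is 1.5. A larger penalty weight on this quantity pushes the model toward predictions that a shallower tree can reproduce.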

Setup.
The platform and experimental environment configuration are shown in Table 4. We use a PC to run the COP algorithm and VISSIM for real-time traffic flow signal control and corresponding traffic simulation. We use another GPU server for both TGRU and CycleGAN training.
VISSIM, the traffic simulation platform, can capture and display the changes of traffic signal and traffic flow planned by the COP algorithm in real time, as shown in Figure 7. Table 5 shows the sample datasets for TGRU and CycleGAN. In TGRU, we train a 3-layer MLP with 100 first-layer nodes, 100 second-layer nodes, and 10 third-layer nodes. In the CycleGAN, the generator contains encoding, transformation, and decoding. Encoding includes one 7 × 7 Convolution-InstanceNorm-ReLU layer with stride 1 and two 3 × 3 Convolution-InstanceNorm-ReLU layers with stride 2. Transformation includes 9 residual blocks for 256 × 256 images, each with two 3 × 3 convolutional layers with the same number of filters on both layers. Finally, decoding includes two 3 × 3 fractionally strided Convolution-InstanceNorm-ReLU layers with stride 2 and one 7 × 7 Convolution-InstanceNorm-ReLU layer with stride 1. In the discriminator networks, we use 70 × 70 PatchGANs [17], and the discriminator architecture includes four 4 × 4 Convolution-InstanceNorm-Leaky-ReLU layers with stride 2. The last layer contains a convolution to produce a 1-dimensional output.

Congestion Attack Prediction and Visualized Analysis.
We evaluate the performance of the CycleGAN model based on image features. Evaluation Metric. For testing on $N$ samples, we evaluate the CR, PCD, and ICD based on the mean absolute error (MAE) and root mean squared error (RMSE). The MAE and RMSE of CR are expressed as follows:
$$MAE_{CR} = \frac{1}{N} \sum_{i=1}^{N} \big| CR_i - \widehat{CR}_i \big|, \qquad RMSE_{CR} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \big( CR_i - \widehat{CR}_i \big)^2},$$
where $CR_i$ is the real value and $\widehat{CR}_i$ is the estimated value. Similarly, we have $MAE_{PCD_k}$, $RMSE_{PCD_k}$, $MAE_{ICD}$, and $RMSE_{ICD}$.
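Both metrics are short to compute. The sketch below uses the standard definitions of MAE and RMSE; the function names and the sample values are illustrative only and are not taken from the experiments.

```python
import math


def mae(real, estimated):
    """Mean absolute error between real and estimated values."""
    if len(real) != len(estimated) or not real:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(r - e) for r, e in zip(real, estimated)) / len(real)


def rmse(real, estimated):
    """Root mean squared error between real and estimated values."""
    if len(real) != len(estimated) or not real:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((r - e) ** 2 for r, e in zip(real, estimated)) / len(real))


# Made-up capacity ratios: three exact predictions and one large miss.
real_cr = [0.5, 0.25, 0.75, 1.0]
estimated_cr = [0.5, 0.25, 0.75, 0.0]

mae_cr = mae(real_cr, estimated_cr)    # 0.25
rmse_cr = rmse(real_cr, estimated_cr)  # 0.5
```

RMSE is never smaller than MAE and grows faster when a few samples have large errors, which is why the two are reported together.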
Visualized Results and Quantitative Analysis. In Figure 8, the first column is the original image $x$, the second is the output image $G(x)$ produced by CycleGAN, and the third column gives the real image $y$ with congestion. Our approach has a satisfactory generator and can predict the future result of congestion attacks, providing a visualization for better human understanding. The CR and CD measurements of one intersection also show a satisfactory prediction compared with the ground truth. As shown in Table 6, in 10-fold cross validation, our CycleGAN performs well in CR prediction, with small MAE and RMSE values of 0.0205 and 0.0225, respectively, which are better than those obtained with 4-fold cross validation.
For ICD (Table 7), the 4-fold cross validation results of MAE and RMSE are better than those of 10-fold cross validation, reaching 0.8100 and 0.9987, respectively. We present the detailed values of each phase for MAE and RMSE of congestion degree in Table 8. Comparing the values on the training set with those under cross validation shows that our CycleGAN-based model does not overfit. The best results are at $k = 3$, where we have the lowest values of $MAE_{PCD_k}$ and $RMSE_{PCD_k}$ (0.2250, 0.2050, 0.2050, 0.2617, 0.2519, and 0.2360) compared with the values of other phases. However, the errors at $k = 5$ increase considerably, which is why $MAE_{ICD}$ and $RMSE_{ICD}$ approach 1. This is because the fewer the vehicles, the better the prediction effect of the model. In our setting, the attack vehicle is at phase 3, which has the fewest queued vehicles, while the congestion occurs at phase 5, which has the most queued vehicles. Therefore, the prediction for phase 3 is the best of the 8 phases, with the lowest MAE and RMSE values at $k = 3$, while the prediction errors at phase 5 increase considerably.
We present bar charts for MAE and RMSE of 8-phase congestion degree in Figures 9 and 10, respectively. In Figure 9, the best average value of MAE is based on 10-fold cross validation (with a value of 0.3844), and the worst average value is based on 4-fold cross validation (with a value of 0.4481). In Figure 10, we have similar results for RMSE; the best and the worst are 0.4471 and 0.5491, respectively. Both average values indicate that the CycleGAN is robust and that our approach captures the features well.
In addition, we compare the performance of the CycleGAN model with that of pix2pix [17], another GAN-based model, by quantitatively analyzing experimental results from the whole intersection and the specific phases perspective, respectively. Here, we use the experimental results under 4-fold cross validation. For the measurements of the whole intersection, as shown in Table 9, we have $MAE_{CR} = 0.0213$, $RMSE_{CR} = 0.0256$, $MAE_{ICD} = 0.8100$, and $RMSE_{ICD} = 0.9987$ for CycleGAN and $MAE_{CR} = 0.1167$, $RMSE_{CR} = 0.1297$, $MAE_{ICD} = 3.7917$, and $RMSE_{ICD} = 3.8500$ for pix2pix. We can see that CycleGAN has lower MAE and RMSE values than pix2pix. Therefore, the CycleGAN model has a better performance than the pix2pix model on the measurements of the whole intersection.
We further compare the model performance for the specific phases. Table 10 shows the detailed MAE and RMSE values of each phase for CycleGAN and pix2pix. The lowest MAE and RMSE values are at $k = 3$ for both models: 0.2050 and 0.2538 for CycleGAN and 0.2519 and 0.9830 for pix2pix. For all phases, the MAE and RMSE values of CycleGAN are lower than those of pix2pix. Therefore, the CycleGAN model also has a better performance than the pix2pix model on the measurements of all phases.
To sum up, in the CycleGAN-based prediction model, we extract four-direction road images of the intersection and perform phase-based composition for generating a new sample image to quantify the traffic flow characteristics. Based on the image feature, the CycleGAN-based approach analyzes the potential relationship between the congestion attack and the corresponding congestion effect two stages later. Also, the model is used to analyze the congestion effects that different phases of the attack vehicle caused. Meanwhile, we can obtain the visualized results based on the image feature. The experimental results on the CycleGAN-based model and the comparison experiments with the pix2pix model demonstrate the superiority of the CycleGAN-based model.

Congestion Attack Verification.
Here, we evaluate the performance of the TGRU model based on traffic flow features. We use the confusion matrix, accuracy, AUC value, precision, recall, and F1-score. The TGRU model is trained to distinguish whether the intersection is under a spoofing attack based on traffic flow features. We collect time-series traffic flow data for 3600 seconds under both the normal state and the attack state. We consider 1-second intervals as time steps. Each data vector $x_{nt}$ has 30 features, as defined in Section 3.4. Each outcome $y_{nt}$ is a binary label marking whether the intersection is under a spoofing attack. The sequence length is set to 20 seconds, considering that the maximum green time of each signal is 20 seconds. Hence, 360 samples are obtained in total: 180 samples, which contain 3600 traffic flow data, are used for training, and the other 180 samples are used for testing. We apply the model to the test set and calculate its AUC value, accuracy, precision, recall, and F1-score; these values are reported in Table 11. We see that for different parameter settings, the TGRU model with our defined traffic features achieves good prediction quality. The AUC values are all approximately 0.8, and the accuracy values are 0.79, 0.79, and 0.75 under the different parameter settings. Furthermore, the average values of precision, recall, and F1-score are all close to 0.8. Figure 11 shows the three ROC [18] curves of TGRU, together with the corresponding AUC values. We can see that these curves are similar, and their AUC values (0.82, 0.85, and 0.78) are all around 0.8. Moreover, the TGRU model has similar performance with different parameter settings; this indicates that our defined classification features are effective and that the different parameter settings have little effect on the TGRU model's performance. The decision tree generated by Graphviz [19] is shown in Figure 12. For 3600 traffic flows, this tree has 9 levels.
From top to bottom, according to each feature value, the flow data can be grouped into different classes step by step. For example, when X[13] ≤ 0.068, 32 of 57 traffic flows are correctly predicted as the class of spoofing attack 1; this indicates the importance of the 13th dimension feature, i.e., the congestion degree of the 8th phase $PCD_8$, in predicting the class of spoofing attack 1.
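The sample construction and the reported metrics can be sketched as follows. The code splits a 1-second series into non-overlapping 20-step sequences (an assumption consistent with the sample counts in the text) and computes precision, recall, and F1-score from binary labels; the function names and the toy labels are hypothetical.

```python
def make_sequences(series, seq_len=20):
    """Split a time series into consecutive, non-overlapping sequences."""
    if seq_len <= 0 or len(series) % seq_len != 0:
        raise ValueError("series length must be a positive multiple of seq_len")
    return [series[i:i + seq_len] for i in range(0, len(series), seq_len)]


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1-score for binary labels (1 = attack)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# 3600 one-second records of one state give 180 sequences of 20 steps.
sequences = make_sequences(list(range(3600)), seq_len=20)

# Toy labels: three true positives, one false positive, one false negative.
precision, recall, f1 = precision_recall_f1([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 1, 0])
```

Applied to the normal state and the attack state separately, this windowing yields 2 × 180 = 360 sequences, matching the total sample count given above.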
We also compare the TGRU model with a time-series prediction method, the seasonal autoregressive integrated moving average (SARIMA) model. Here, the task is to detect whether congestion occurs based on traffic flow features. We carry out experiments for both approaches under different traffic flow feature sets. The first feature set, FS_1, uses the primary traffic flow data as features. The second feature set, FS_2, is shown in Table 3. Both feature sets are constructed from the traffic flow data we collect. As shown in Table 12, the accuracy values of SARIMA and TGRU are 0.744 and 0.772, respectively, on FS_1 and 0.784 and 0.790, respectively, on FS_2. TGRU is therefore more accurate than SARIMA on both feature sets, and the TGRU model based on our defined traffic flow features (FS_2) achieves the highest accuracy of the four combinations.
In conclusion, in the TGRU-based verification model, we propose several timing characteristics, including capacity ratio, congestion degree, attack acceleration, and attack amplification ratio, to measure the congestion effects based on traffic flow. Based on the defined traffic flow features, the TGRU-based model is used to analyze the underlying relationship between the congestion attack and traffic flow features at the current moment. Meanwhile, the decision tree helps interpret the relationship between traffic flow features and the congestion attack. The experimental results of the TGRU-based model and the comparison with the SARIMA model demonstrate the superiority of the TGRU-based approach.

Defense Suggestions
To proactively address the congestion attack of the I-SIG system, this section discusses how to defend against the attacks assessed above.
EVLS Improvement for COP Reinforcement. As estimated by the USDOT [20], I-SIG may take 25-30 years to reach a 95% PR for intelligent transportation systems. Thus, under the low PR expected in practice, I-SIG needs to adopt an EVLS algorithm to estimate the location and speed of non-OBU-equipped vehicles. In the current I-SIG system design, the congestion attack on the COP algorithm exploits the nonrobust estimation of EVLS. However, it is possible to improve EVLS and thus reinforce the COP algorithm. For global positioning system (GPS) spoofing by a single vehicle, we can introduce more collaboration mechanisms from the transportation field, such as the car-following model. A natural way to accomplish this is to significantly improve queue-length prediction. In the existing EVLS, this could be realized by adding a new software module that interacts with the COP algorithm. Such an implementation has a low cost and requires little change to the original COP algorithm.
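As one illustration of how a car-following argument could make queue-length estimation more robust, consider the sketch below. It is not the EVLS or COP algorithm; the class CvReport, both function names, and the speed thresholds are assumptions made for this example. The idea is that a standing queue cannot extend past a vehicle that is clearly moving in the same lane, so a "stopped" report located behind such a vehicle should not extend the estimated queue.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CvReport:
    """One connected-vehicle report for a single lane (hypothetical format)."""
    vehicle_id: str
    dist_to_stop_bar: float  # metres, measured upstream from the stop bar
    speed: float             # metres per second


def naive_queue_length(reports: List[CvReport], stop_speed: float = 0.5) -> float:
    """Queue tail taken as the farthest reported stopped vehicle."""
    stopped = [r.dist_to_stop_bar for r in reports if r.speed <= stop_speed]
    return max(stopped, default=0.0)


def consistent_queue_length(
    reports: List[CvReport],
    stop_speed: float = 0.5,    # assumed threshold for "stopped"
    moving_speed: float = 3.0,  # assumed threshold for "clearly moving"
) -> float:
    """Queue tail bounded by the nearest clearly moving vehicle.

    Under car-following behaviour, a vehicle inside a standing queue
    cannot be moving freely, so the queue must end before the closest
    moving vehicle. Stopped reports beyond that point are ignored.
    """
    moving = [r.dist_to_stop_bar for r in reports if r.speed >= moving_speed]
    bound = min(moving, default=float("inf"))
    stopped = [
        r.dist_to_stop_bar
        for r in reports
        if r.speed <= stop_speed and r.dist_to_stop_bar < bound
    ]
    return max(stopped, default=0.0)


# Illustrative data: three queued vehicles, one moving vehicle at 80 m,
# and one spoofed "stopped" report at 150 m.
lane = [
    CvReport("a", 5.0, 0.0),
    CvReport("b", 12.0, 0.0),
    CvReport("c", 19.0, 0.2),
    CvReport("d", 80.0, 12.0),
    CvReport("spoofed", 150.0, 0.0),
]
naive = naive_queue_length(lane)
robust = consistent_queue_length(lane)
```

In this example the naive estimate follows the spoofed report, while the bounded estimate stays at the real queue tail. The check only helps when an equipped, moving vehicle happens to lie between the real queue and the spoofed position, so its benefit shrinks as the PR decreases.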
Another problem is the strong dependence of security on the PR, which needs to be reduced. In the current design, the smaller the PR, the more significant the impact of the attack on the system, because the system cannot accurately obtain the queue length from fewer data. We do not suggest providing two alternative versions of EVLS (i.e., one for high PR and one for low PR). Although we analyzed a car-following model in [21], we believe that a more useful model with a collaboration mechanism should be studied; this would make the EVLS estimation more accurate and the COP algorithm more robust against attacks.
Authentication and Anomaly Detection. In the current design, authentication is realized through communication between the OBU and the RSU. However, the attack vehicle might not be a newly joining or unauthenticated vehicle; in fact, it can be a normal vehicle with legitimate authentication. Thus, although authentication reinforcement alone is not the solution, it can be used to aid anomaly detection. The idea is that a vehicle cannot suddenly appear somewhere; starting from the initial authentication, we should analyze the time-series trajectory data to discover any anomalous behavior. This requires a powerful RSU with more computing and storage capacity. In addition to an anomaly detection algorithm, the implementation needs the support of a collaboration mechanism among multiple I-SIGs; this is a complex global design problem for intelligent transportation and has not been realized yet. We believe this is critical work that must be accomplished before wide I-SIG deployment.
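The observation that a vehicle cannot suddenly appear somewhere suggests a simple plausibility check on consecutive position reports. The sketch below is only an assumed illustration of that idea, not a mechanism from the I-SIG design: the Report format, the function name implausible_jumps, and the 40 m/s speed bound are all hypothetical choices.

```python
from dataclasses import dataclass
from math import hypot
from typing import List


@dataclass
class Report:
    """One timestamped position report from an authenticated vehicle."""
    t: float  # seconds since the vehicle was authenticated
    x: float  # metres, local planar coordinates
    y: float  # metres, local planar coordinates


def implausible_jumps(track: List[Report], max_speed: float = 40.0) -> List[int]:
    """Return the indices of reports that imply an impossible movement.

    Index i is flagged when travelling from track[i - 1] to track[i]
    would require a speed above max_speed (metres per second), or when
    the timestamps do not increase.
    """
    flagged: List[int] = []
    for i in range(1, len(track)):
        prev, cur = track[i - 1], track[i]
        dt = cur.t - prev.t
        if dt <= 0:
            flagged.append(i)
            continue
        speed = hypot(cur.x - prev.x, cur.y - prev.y) / dt
        if speed > max_speed:
            flagged.append(i)
    return flagged


# Illustrative track: steady 10 m/s motion, then a 200 m jump in one second.
track = [
    Report(0.0, 0.0, 0.0),
    Report(1.0, 10.0, 0.0),
    Report(2.0, 20.0, 0.0),
    Report(3.0, 220.0, 0.0),
    Report(4.0, 230.0, 0.0),
]
suspicious = implausible_jumps(track)
```

A check of this kind catches only abrupt jumps. An attacker who spoofs a smooth but false trajectory would pass it, which is why the text above also calls for collaboration among multiple I-SIGs.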
Prevent Cold-Start Attack. Essentially, the congestion attack is a type of insider attack. Thus, it is challenging to perform anomaly detection for such an attack in a pretty       transportation system. After that, any data of one node have to be verified by all other nodes. This would leave nearly no chance for spoofed data to be accepted. However, the cost of rebuilding the system is obviously enormous, and more attention should be paid to lightweight blockchains to examine the trade-off between efficiency and security. Regardless, we still believe that this is a promising direction for I-SIG security defense.

Related Work
Data Spoofing Attacks in SAGIN. The SAGIN has a heterogeneous structure, including vehicle nodes, roadside infrastructure, mobile terminal users, drones, airships, and other stratospheric nodes, as well as high-altitude satellite nodes; this brings security challenges [22], such as various attacks on authenticity, identity, confidentiality, data integrity, and privacy [23]. As a SAGIN-based intelligent transportation system deployed in California, Florida, and New York by the USDOT, I-SIG is exposed to data spoofing attacks [8], which can cause heavy congestion. Such an attack is a position-faking attack based on GPS spoofing but differs from a tunnel attack. In a tunnel attack, each vehicle of a Vehicular Ad hoc NETwork (VANET) [24,25] is equipped with a positioning system (receiver). The attack can be achieved using a transmitter that generates localization signals stronger than those generated by the real satellites [26,27]. The victim could be waiting for a GPS signal after leaving a physical tunnel or a jammed-up area. In comparison, the position spoofing attack on I-SIG refers to an authenticated vehicle that only sends a wrong position to affect the COP algorithm, which has a lower attack cost and is easier to implement. In such an attack, the data spoofing is just one factor, while the mechanism of the COP algorithm is the key factor. Furthermore, with respect to the GPS spoofing attack, our work focuses on algorithm-level security analysis under a spoofing attack.

Congestion Attack Analysis. The previous work [8] reveals the existence of such congestion attacks on the COP algorithm. It analyzes how congestion attacks affect COP decisions and explains how to execute an attack using data spoofing in SAGIN. However, it does not consider the potential features or the quantified correlation between the attack and the congestion degree. In comparison, we demystify the attack on I-SIG and the corresponding congestion from a machine learning perspective by exploring different kinds of features based on both supervised and unsupervised learning. In addition, as the first work to use both traffic flow features and image features, our work can inspire all stakeholders of I-SIG, including experts in transportation, SAGIN, and security.

Conclusions
A congestion attack that exploits data spoofing in connected vehicle technology and the SAGIN has been revealed on the COP algorithm of I-SIG, which performs dynamic and optimal signal control based on automatic traffic situation awareness. Because quantified feature-level analysis has been lacking, we demystify the attack on I-SIG and the corresponding congestion using both supervised and unsupervised learning. We propose a CycleGAN-based approach to analyze the potential relations between the congestion attack and the corresponding results two stages later. We also present a TGRU-based approach to explore the relations between the congestion attack and traffic flow features at a certain moment. In our experiment, we collect 4476 high-quality image samples and 3600 attack-oriented traffic flow data. We then evaluate our approach empirically using the COP algorithm and VISSIM, and our results show the effectiveness of our approach compared with the ground truth.
This work is expected to inspire a series of follow-up studies on the security of CV-based I-SIG, including but not limited to (1) more machine learning-based approaches, (2) more concrete defense implementations on SAGIN-based I-SIG, and (3) more feature fusion for attack and defense analysis.

Data Availability
All data generated or analyzed during this study are owned by all the authors and will be used in our further research. The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.