Abstract

Short-term traffic prediction under corrupted or missing data for large-scale transportation networks has become an important and challenging topic in recent decades. Since critical roads have predictive power over their adjacent roads, this paper proposes a novel hybrid short-term traffic state prediction method based on critical road selection optimization. First, a utility function of the quality of service (QoS) for the critical roads in a large-scale road network is proposed based on the coverage and the data score. Then, the critical road selection optimization model for transportation networks is presented, which selects an appropriate set of critical roads, subject to a maximum proportion of the total calculation resources, to maximize the utility value of the QoS. In addition, an innovative critical road selection method is introduced that considers the topological structure and the mobility of the urban road network. Subsequently, the traffic speed of the critical roads is used as the input of a convolutional long short-term memory neural network to predict the future traffic states of the entire network. Experimental results on the Beijing traffic network indicate that the proposed method outperforms prevailing deep learning (DL) approaches when critical road sections are considered.

1. Introduction

Real-time traffic state prediction plays a vital role in traffic management and public service. By predicting the evolution of traffic timely and accurately, governments and travelers could react to the traffic congestion ahead of time. For instance, intelligent transportation systems, advanced traffic management systems, and traveler information systems depend on real-time traffic state prediction.

In the past decades, there have been numerous studies on this topic [1]. These studies can be divided into mathematical models, statistical models, and data-driven methods [2]. Compared with data-driven methods, mathematical or statistical models derived from macroscopic and microscopic theories of traffic flow struggle to handle unstable traffic conditions and complex road settings because of their strong hypotheses and assumptions [3].

Data-driven methods have achieved promising results because they are better suited to complex nonlinear problems [4]. These methods include the support vector machine (SVM), Bayesian networks, and neural networks. Among them, deep learning approaches have proven effective in traffic state prediction, since they can exploit much deeper architectures and process high-dimensional sets of explanatory variables [5].

However, most deep learning approaches are built on the entire dataset, and the attributes of the dataset influence the prediction performance. In other words, deep learning approaches make data quality central to short-term traffic state prediction [5]. Because of corrupted or missing data, only limited high-quality real-time or historical data can be obtained, so the available data have low predictive quality [6]. These setbacks weaken predictive accuracy and efficiency, limiting the capacity to provide practical and reliable forecasts.

Recently, researchers have found that some critical roads significantly affect the traffic states of their adjacent roads in a given road network [7, 8]. This finding indicates that the traffic state of one road can be predicted from the states of its neighbors [6].

There have been numerous studies on this topic, but only a small fraction of them have addressed prediction based on critical road selection optimization. If we extract the traffic state of the critical road sections with the most dominant predictive power, we can characterize the spatiotemporal features of traffic flow and predict the future traffic state of the overall network.

In this paper, we propose a novel hybrid short-term traffic state prediction method based on critical road selection optimization. Different scores are given to data collected on different road segments: the higher the criticality of the road segment, the higher the score of its data. The critical road selection problem is abstracted as a multiobjective optimization problem that maximizes the sensing coverage and the data scores. In other words, our objective is to select the most suitable critical roads to maximize the quality of service (QoS) under limited consumption of network resources.

Then, a novel critical road selection method is proposed that considers the topological structure and the mobility of the urban road network. Subsequently, the traffic speed of the critical roads is used as the input of a convolutional long short-term memory neural network to predict the future traffic states of the entire network. Finally, to demonstrate the effectiveness of the proposed method, numerical experiments are conducted using traffic states derived from GPS trajectory data in Beijing. In addition, other traditional machine-learning models are compared to demonstrate the advantage of the proposed method. The experimental results show that the proposed method can accurately predict future traffic network states.

The rest of the paper is organized as follows. Section 2 reviews the literature on short-term traffic state prediction and the existing approaches to prediction based on critical roads. Section 3 presents the critical road selection optimization model, which selects an appropriate set of critical roads, subject to a maximum proportion of the total calculation resources, to maximize the utility value of the QoS, and introduces an innovative critical road selection method. Section 4 describes the numerical experiments using traffic states derived from GPS trajectory data in Beijing. The last section concludes the study and discusses future work.

2. Literature Review

Over the past decades, a considerable number of short-term traffic forecasting models have emerged to predict future traffic states over horizons ranging from a few seconds to a few hours. These approaches can be generally categorized into mathematical approaches, statistical approaches, and data-driven approaches.

Mathematical approaches predetermine the model structure through theoretical assumptions and simulate the evolution of the traffic state with theoretical mathematical models [9]. Because of their strong dependence on theoretical considerations, these approaches often fail to match actual traffic conditions [10]. Many well-known mathematical models have been developed over the past decades, such as the Greenshields model, the car-following model, and the Lighthill–Whitham–Richards (LWR) model. The future traffic state of the road network can be calculated directly by these models [11, 12]. Most traffic simulators, such as Q-Paramics, VISSIM, DynaMIT, and DynaSMART-X, predict traffic states based on traffic flow models, car-following models, and dynamic traffic assignment models [13–16].

Statistical approaches usually rely on statistical assumptions. The autoregressive integrated moving average (ARIMA) and its variants are the most common statistical approaches. Hamed et al. [17] proposed a simple ARIMA model to predict the traffic state of the road network. Williams et al. developed a seasonal ARIMA (S-ARIMA) to predict the traffic state of urban freeways, and the results showed that the S-ARIMA model achieved better predictive performance [18, 19]. Ding et al. proposed a space-time ARIMA to predict the traffic state of the road network [20].

In addition, the Markov chain, the Kalman filter (KF), KF-based approaches, and other approaches also have important applications in short-term traffic prediction [21–24]. However, these approaches fail to produce favorable results under unstable traffic conditions, such as unexpected events [25].

Unlike mathematical and statistical approaches, data-driven approaches rely on a sufficient mass of traffic data. In recent years, owing to the abundant data from extensive traffic sensors and advanced big data technology, data-driven approaches have developed rapidly. Numerous data-driven approaches have been put forward for short-term traffic state prediction, such as SVM [26–28], neural networks [29–32], and hybrid methods [33, 34].

Among these data-driven approaches, deep learning methods have become extremely popular and successful because of their powerful ability to handle nonlinear, high-dimensional problems. Huang et al. employed a deep belief network (DBN) with multitask learning for traffic flow prediction [4]. Lv et al. proposed a stacked autoencoder (SAE) model to predict traffic flow, whose performance was superior to that of other methods at different prediction horizons [35]. Furthermore, numerous efforts have been devoted to capturing the temporal characteristics and spatial dependencies in prediction [2]. Ma et al. predicted the traffic speed of a large-scale transportation network using an LSTM neural network and a CNN [36, 37]. Wu and Tan extracted spatial and temporal features with a CNN and an LSTM, respectively, and predicted traffic volume by combining the two approaches [38]. Wang et al. proposed an eRCNN model and trained the recurrent CNN by reducing predictive feedback errors [39].

In summary, among all these traffic state prediction approaches, deep learning methods stand out as the most potent alternative. However, few prediction methods achieve satisfactory accuracy because of the missing data problem in practice. Because critical roads significantly affect the traffic states of their adjacent roads in a road network, we can extract the traffic state of the critical road sections and predict the future traffic state of the overall network.

3. Methodology

This research focuses on utilizing data from critical road sections to predict future traffic conditions of the overall urban transportation network. In this section, the utility function of the QoS for the critical roads in a large-scale road network is proposed based on the coverage and the data score. Then, the critical road selection optimization in the transportation networks is presented by selecting an appropriate set of critical roads with the maximum proportion of the total calculation resources to maximize the utility value of the QoS. Specifically, the critical road sections are selected by an innovative critical road selection method. Finally, the traffic speed of the critical roads is regarded as the input of the convolutional long short-term memory neural network to predict the future traffic states of the entire network.

3.1. Road Network Topology Analysis

In a practical urban road network, road sections are connected at intersections and eventually form a complex network. When performing sensing tasks in urban road network areas, the topological characteristics of the road network need to be considered. Numerous studies have shown that the topology of urban road networks is complex and diverse. A connectivity failure in some sections can affect traffic operation in the whole network. However, road connectivity will inevitably be interrupted at times by bad weather, technical failures, or serious accidents. Such interruptions lead to local traffic congestion and paralysis, with harmful economic and social impacts. The more important a road section is in the network under given spatial and temporal conditions, the more severely its failure damages the integrity of road network connectivity. Therefore, to avoid serious impacts on road network connectivity and vehicle movements, priority should be given to critically important road sections when predicting future traffic states.

To study the structural properties of real road networks, the road network must be reasonably abstracted into a topological graph consisting of points and lines. The primal approach and the dual approach are two methods for abstracting a road network into a complex network model. The primal approach uses the roads as edges and the intersections as nodes of the abstract network. The dual approach models the intersections as edges and the roads as nodes, thus showing the interconnections between roads in a highly abstract way. The primal approach is intuitive and simple and retains the layout characteristics of the road network. The dual approach ignores some geographic attributes of network entities, such as the location, length, and width of roads, and is therefore more suitable for analyzing abstract network structures. In this paper, we pay more attention to the characteristics of road connectivity, so we use the primal approach to model the road network.
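As a minimal sketch of the primal approach described above, the following Python snippet builds an undirected graph with intersections as nodes and roads as edges. The road list, the intersection labels, and the function name `build_primal_graph` are illustrative assumptions, not data or code from this study.

```python
from collections import defaultdict


def build_primal_graph(roads):
    """Build an undirected primal graph: intersections are nodes, roads are edges.

    `roads` is an iterable of (road_id, from_intersection, to_intersection).
    """
    adjacency = defaultdict(set)
    for _road_id, u, v in roads:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


# Hypothetical toy network with four intersections and four roads.
roads = [("r1", "A", "B"), ("r2", "B", "C"), ("r3", "C", "A"), ("r4", "C", "D")]
graph = build_primal_graph(roads)

# The degree of a node is the number of intersections directly linked to it.
degree = {node: len(neighbors) for node, neighbors in graph.items()}
```

In this toy network, intersection C has the highest degree, which is the kind of local connectivity information the later spatial analysis relies on.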

3.2. QoS for the Critical Roads

The QoS is the standard for measuring the performance of the critical roads in a prediction task. The greater the value of QoS, the better the prediction performance of the critical roads. Here, we propose a utility function to calculate the QoS for the critical roads.

Coverage of the selected roads is the main concern when predicting the traffic state of the whole road network, so we choose coverage as our first metric. Coverage indicates how much of the entire road network the selected roads cover. To calculate the coverage, we divide the road network into small grids, and the grid is used as the basic measuring element. As shown in Figure 1, the pink curves represent roads, and the grids that roads pass through are painted green. In this paper, the size of a grid is 0.0005° × 0.0005° in longitude and latitude, where 0.0005° of longitude (latitude) is about 50 m.
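The grid discretization can be sketched in Python as follows. The 0.0005° grid size comes from the text; the GPS points, the conversion constant of about 111,320 m per degree of latitude, and the coverage measure (the share of road-bearing grids that are selected) are assumptions made for illustration and may differ from the definition used in this study.

```python
import math

GRID_SIZE_DEG = 0.0005          # grid size from the text, in degrees
METERS_PER_DEGREE = 111_320.0   # approximate length of one degree of latitude


def grid_index(lon, lat, size=GRID_SIZE_DEG):
    """Map a longitude/latitude pair to integer (column, row) grid indices."""
    return (math.floor(lon / size), math.floor(lat / size))


def grid_size_m(lat, size=GRID_SIZE_DEG):
    """Approximate (east-west, north-south) extent of one grid in meters."""
    north_south = size * METERS_PER_DEGREE
    east_west = north_south * math.cos(math.radians(lat))
    return east_west, north_south


def coverage_ratio(selected, road_grids):
    """Share of road-bearing grids that are selected (assumed coverage measure)."""
    return len(selected & road_grids) / len(road_grids)


# Hypothetical GPS points along one road near central Beijing.
points = [(116.35012, 39.90013), (116.35031, 39.90013), (116.35062, 39.90013)]
selected = {grid_index(lon, lat) for lon, lat in points}

# Assume one more road-bearing grid exists that was not selected.
road_grids = selected | {(232702, 79800)}
width_m, height_m = grid_size_m(39.9)
```

At Beijing's latitude one grid is roughly 43 m east-west by 56 m north-south under this approximation, which is consistent with the "about 50 m" figure above.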

Each single grid in the urban road network is indexed by $j$ and has a binary coverage state: the state equals 1 if grid $j$ is selected during the prediction task time and 0 otherwise. In this paper, we try to predict the traffic state of the whole network. If grid $j$ belongs to road section $l$, its road factor is calculated from the number of grids in road section $l$. The coverage of the selected grids is given by equation (1).

On the other hand, we introduce the concept of data score to represent the criticality level of the roads. Specifically, we give a higher score to the data collected from the critical roads than to the data collected from the ordinary roads. Here, we use the correlations among road sections in both space and time to calculate the data score.

The spatial weights matrix represents the spatial dependency among road sections in traffic networks. According to graph theory, the local connectivity of a node can be calculated from the connections between the node and its adjacent nodes, which is represented by the degree of node $i$. Suppose there exists a network represented as $G = (V, E)$, where $V$ is the set of nodes in the network and $E$ is the set of edges, representing the connections between nodes. The network has $|V|$ nodes and $|E|$ edges and can be represented by an adjacency matrix. If nodes $i$ and $j$ are connected directly or indirectly, we call these two nodes “kth-order neighbors,” and their adjacency relationship is expressed by equation (2). The spatial weights matrix $P$ is defined as the sum of the kth-order spatial weights matrices; element $p_{ij}$ of $P$ is calculated as shown in equation (3), where $K$ is the highest order of the spatial weights matrix.
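One way to realize kth-order neighbors is sketched below in Python. It assumes that two nodes are kth-order neighbors when their shortest path has exactly k edges, that each kth-order matrix is binary, and that the combined matrix is the plain sum of the orders 1 to K; the actual weighting in equations (2) and (3) may differ. The function names and the toy chain network are hypothetical.

```python
from collections import deque


def hop_distances(adjacency, source):
    """Breadth-first search returning the hop count from `source` to each node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def spatial_weights(adjacency, max_order):
    """Sum of binary k-th order neighbor matrices for k = 1..max_order.

    Each pair of nodes has a single hop distance, so the summed entry is 1
    when that distance lies between 1 and max_order, and 0 otherwise.
    """
    nodes = sorted(adjacency)
    weights = {i: {j: 0 for j in nodes} for i in nodes}
    for i in nodes:
        for j, d in hop_distances(adjacency, i).items():
            if 1 <= d <= max_order:
                weights[i][j] = 1
    return weights


# Hypothetical chain network A - B - C - D.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
P = spatial_weights(adjacency, max_order=2)
```

With the highest order set to 2, node A is linked to B and C but not to D, which is three hops away.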

To consider the spatial correlation of the road network together with the temporal correlation, the spatial weights matrix is introduced as a spatial indicator to improve the initial correlation distance. Let $v_i(t)$ be the traffic speed in road section $i$ at time $t$. Then, the integrated speed values of the adjacent road sections are computed by equation (4), and the data score between road section $i$ and its adjacent road sections is calculated by equation (5), where $s$ ($s > 0$) is the time lag between the speeds of road section $i$ and its adjacent road sections, and the means of the traffic speed in road section $i$ and of the integrated speed are taken over the time duration $T$. Based on the previous analysis, the data score $D$ is defined by equation (6).

The utility function of the QoS, $U$, is defined by equation (7), in which a tuning parameter sets the relative weight of data coverage and data score. To make $U$ more sensitive in equation (7), we apply the logarithm of the coverage and of the data score instead of their raw values. Since taking the logarithm is a common way to aggregate data, this process can reduce the heteroscedasticity of data in the QoS function.
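The data score and the utility described in this subsection can be illustrated with the following Python sketch. Everything specific here is an assumption: the integrated speed is taken as a weighted mean of neighbor speeds, the data score as a Pearson correlation between a road's speed and the lagged integrated speed (with means taken over the overlapping window), and the utility as a weighted sum of log-coverage and log-data-score with a hypothetical weight `alpha`. The exact forms of equations (4) to (7) may differ.

```python
import math


def integrated_speed(neighbor_speeds, weights):
    """Weighted mean of the neighbor speed series at each time step."""
    total = sum(weights.values())
    steps = len(next(iter(neighbor_speeds.values())))
    return [sum(weights[j] * neighbor_speeds[j][t] for j in weights) / total
            for t in range(steps)]


def lagged_correlation(x, y, lag):
    """Pearson correlation between x(t) and y(t + lag), for lag > 0."""
    a, b = x[:len(x) - lag], y[lag:]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def qos_utility(coverage, data_score, alpha=0.5):
    """Weighted sum of log terms (assumed form); both inputs must be positive."""
    return alpha * math.log(coverage) + (1 - alpha) * math.log(data_score)


# Hypothetical speeds (km/h) for one road and two adjacent roads.
road = [30.0, 32.0, 35.0, 31.0, 28.0, 27.0]
neighbors = {"n1": [29.0, 31.0, 33.0, 36.0, 30.0, 29.0],
             "n2": [31.0, 29.0, 31.0, 34.0, 32.0, 27.0]}
blend = integrated_speed(neighbors, {"n1": 1, "n2": 1})
score = lagged_correlation(road, blend, lag=1)
utility = qos_utility(coverage=0.8, data_score=score)
```

The toy series are constructed so that the neighbors echo the road's speed one step later, which yields a correlation of essentially 1 at lag 1.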

3.3. The Critical Road Selection Optimization
3.3.1. Definition of the Critical Road Selection Problem

The goal of the critical road selection model is to maximize the QoS of the urban traffic prediction system with a limited number of critical roads. The critical road selection problem is therefore defined as an optimization problem in which the function $U$ calculates the utility value of the QoS, a binary variable for each grid equals 1 if the grid has been selected and 0 otherwise, $F$ is the set of critical roads, the coverage and the data score between road section $i$ and its adjacent road sections enter the utility, and a threshold bounds the selection.

3.3.2. The Critical Road Selection Method

Based on the above analysis, a critical road selection model for traffic state prediction is constructed. In this paper, because the spatiotemporal state of the road network changes, the selected roads need to be updated periodically. An improved greedy algorithm is proposed to solve the critical road selection problem. A greedy algorithm always makes the best choice for the current situation when solving an optimization problem; in other words, it considers only the local situation rather than the overall situation. The pseudocode of the proposed algorithm is shown in Algorithm 1.

Input: number of roads: n
  Number of prediction tasks: m
  Set of critical roads: F
  QoS for the critical roads: U
  Output: set of finally selected roads: R
(1),
(2)for j from 1 to m do
(3)
(4) for i from 1 to n do
(5)  
(6)  
(7)   if then
(8)    
(9)   else
(10)    
(11)   end if
(12) end for
(13)end for
(14)return R;
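The greedy idea behind Algorithm 1 can be sketched in Python as follows. This is a generic greedy selection under a budget, not a transcription of Algorithm 1: the function `greedy_select`, the budget argument, and the toy utility (the number of distinct grids covered) are assumptions for illustration.

```python
def greedy_select(candidates, utility, budget):
    """Repeatedly add the road with the largest utility gain until the budget is met."""
    selected = []
    remaining = sorted(candidates)      # fixed order makes ties deterministic
    current = utility(selected)
    while remaining and len(selected) < budget:
        best = max(remaining, key=lambda road: utility(selected + [road]))
        gain = utility(selected + [best]) - current
        if gain <= 0:                   # no remaining road improves the utility
            break
        selected.append(best)
        remaining.remove(best)
        current += gain
    return selected


# Hypothetical mapping from road id to the grid cells the road passes through.
road_grids = {"r1": {1, 2, 3}, "r2": {3, 4}, "r3": {5}, "r4": {1, 2}}


def covered(selected):
    """Toy utility: number of distinct grid cells covered by the selected roads."""
    return len(set().union(*(road_grids[r] for r in selected)))


chosen = greedy_select(road_grids, covered, budget=2)
```

Each step only looks at the immediate gain, which is the local decision rule described above; the result is therefore not guaranteed to be globally optimal.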
3.4. Traffic State Prediction Using the Deep Learning Approach

In this paper, a spatiotemporal recurrent convolutional network (STRCN) is proposed for the prediction. The proposed STRCN inherits the advantages of deep convolutional neural networks (DCNN) and long short-term memory (LSTM) neural networks: the spatial dependencies of network-wide traffic are captured by the CNN, and the temporal dynamics are learned by the LSTM.

3.4.1. Capturing Spatial Features by CNN

CNNs have been successfully applied to traffic prediction because of their great potential for extracting features through multiple layers. A typical CNN mainly comprises multiple convolution layers and pooling layers. The former help mine the spatial dependencies of road sections, since every layer retrieves a distinct feature using different filters. The latter help reduce the number of parameters required for training the CNN while preserving prediction accuracy. Given that the input of the CNN can be intuitively regarded as an image in which each pixel value represents a traffic state during a certain time, a 2D CNN is naturally used to extract spatial features between road sections. Figure 2 illustrates the structure of the CNN, including the input layer, convolution layer, pooling layer, fully connected layer, and output layer. Each part plays a unique and vital role, and the details are briefly explained below.

Suppose that we need to predict the future traffic speed of the network over a prediction horizon $a$, where $m$ is the number of road sections. The input of the CNN is the historical traffic speed of the critical road sections from time $t - n$ onward, where $n$ is the look-back step, and the number of critical road sections sets the input size. Then, the spatial features of the input are captured by the convolutional and pooling layers. Let $O^{l}$ be the output of the $l$th convolutional and pooling layer with $r$ filters, and let the weights and bias of the $l$th layer be $(W^{l}, b^{l})$. Then, $O^{l}$ is calculated as

$$O^{l} = \mathrm{pool}\left(f\left(W^{l} * O^{l-1} + b^{l}\right)\right),$$

where $O^{l-1}$ is the output of the previous layer, $O^{0}$ is exactly the input layer, $*$ denotes convolution, $f$ is a nonlinear activation function, and pool denotes the pooling procedure.
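A single convolution and pooling layer can be sketched in plain Python as below. The sketch assumes one filter, a ReLU activation, "valid" convolution without kernel flipping (as is common in deep learning libraries), and 2 × 2 max pooling; the filter values and the toy input are hypothetical and are not the settings used in this study.

```python
def conv2d_valid(image, kernel, bias=0.0):
    """Slide the kernel over the image without padding and add the bias."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[bias + sum(image[r + i][c + j] * kernel[i][j]
                        for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(feature_map):
    """Nonlinear activation f applied element-wise."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling with a square window."""
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(feature_map[0]) - size + 1, size)]
            for r in range(0, len(feature_map) - size + 1, size)]


# Toy input: rows stand for road sections, columns for time steps.
image = [[float(5 * r + c) for c in range(5)] for r in range(5)]
kernel = [[0.25, 0.25], [0.25, 0.25]]   # hypothetical averaging filter

pooled = max_pool(relu(conv2d_valid(image, kernel)))
```

The 5 × 5 input becomes a 4 × 4 feature map after the 2 × 2 convolution and a 2 × 2 map after pooling, which shows how pooling shrinks the representation passed to later layers.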

3.4.2. Capturing Temporal Features by LSTM

Intuitively, traffic states at successive moments follow a strict temporal order rather than being isolated from each other, which makes RNNs especially suitable for capturing the temporal evolution of traffic flow. However, it is difficult for a traditional RNN to capture temporal dependencies between distant time steps. LSTM, a specific form of RNN, was proposed to tackle this issue by adding memory cells to the hidden layers. As shown in Figure 3, four main parts (an input gate, a neuron with a self-recurrent connection, a forget gate, and an output gate) work together to alleviate the vanishing and exploding gradient problems of traditional RNNs.

In our model, the LSTM follows the CNN and takes the output of the CNN as its input to predict the future traffic states, where $q$ is the number of hidden units of the output layer. Each memory cell has an input state and an output state, together with the states of the input, forget, and output gates. The temporal features are calculated iteratively by the gate and cell-state update equations, in which weight matrices $W$ and bias vectors $b$ connect the input layer, the output layer, and the memory cell, $\circ$ denotes the element-wise product of two vectors, and $\sigma$ represents the standard logistic sigmoid function, $\sigma(x) = 1 / (1 + e^{-x})$.
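One LSTM step can be sketched in plain Python as follows. This is the common LSTM formulation without peephole connections, which is an assumption about the variant used here; the parameter layout, the names `affine` and `lstm_step`, and the zero-valued weights are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def sigmoid(x):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, U, b, x, h):
    """Compute W x + U h + b for list-based matrices and vectors."""
    return [sum(w * xi for w, xi in zip(w_row, x))
            + sum(u * hi for u, hi in zip(u_row, h)) + bias
            for w_row, u_row, bias in zip(W, U, b)]


def lstm_step(x, h_prev, c_prev, params):
    """One time step: gates, candidate state, new cell state, new hidden state."""
    i = [sigmoid(v) for v in affine(*params["input"], x, h_prev)]
    f = [sigmoid(v) for v in affine(*params["forget"], x, h_prev)]
    o = [sigmoid(v) for v in affine(*params["output"], x, h_prev)]
    g = [math.tanh(v) for v in affine(*params["cell"], x, h_prev)]
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


# Hypothetical parameters: two inputs, one hidden unit, all weights zero.
zero_gate = ([[0.0, 0.0]], [[0.0]], [0.0])
params = {name: zero_gate for name in ("input", "forget", "output", "cell")}
h, c = lstm_step(x=[1.0, 2.0], h_prev=[0.0], c_prev=[2.0], params=params)
```

With zero weights every gate evaluates to 0.5, so the cell keeps half of its previous state; trained weights instead let the forget gate decide how much history to retain.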

3.4.3. Training with STRCN

Integrating the advantages of the CNN and the LSTM, the STRCN is used to predict future traffic states by fully exploiting the spatiotemporal characteristics of the data. Finally, a fully connected layer predicts the future speed by taking the output of the LSTM as its input, using the weight and bias of the hidden layer. The model is trained end to end, and the predicted speeds are the output of the entire model. Several hyperparameters of the model are set and elaborated in the experiment section. Note that the input size changes with the number of critical road sections, which depends on the extracting rate, and hence several hyperparameters change too.

4. Case Study

4.1. Data Used

In this section, a case study is conducted to evaluate the performance of the proposed critical road selection optimization model and the traffic prediction method. The main urban roads of a subnetwork of the Beijing transportation network near the West Second Ring Road are selected as the study area, as shown in Figure 4. The network comprises 278 road sections across several road hierarchies, such as freeways, arterials, secondary roads, and collectors. The total length of all the roads is approximately 24.53 km, and the network covers an area of around 0.6 km².

Data collected by taxis equipped with GPS devices from June 1, 2015, to August 31, 2015 (92 days) are used to train the proposed model and predict the future traffic speed of the network. The data are updated every 2 minutes, and the period from 6:00:00 to 23:00:00 is considered because high travel demand is repeatedly observed during it. Since the traffic state is recorded at every time interval, we observe 511 traffic states per day.
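The figures of 92 days and 511 traffic states per day can be checked with a short Python calculation. The only assumption is that both the 6:00:00 and the 23:00:00 time stamps are counted, which is what yields 511 rather than 510.

```python
from datetime import date, datetime, timedelta

start = datetime(2015, 6, 1, 6, 0, 0)
end = datetime(2015, 6, 1, 23, 0, 0)
step = timedelta(minutes=2)

# 17 hours contain 510 two-minute intervals, hence 511 time stamps.
states_per_day = (end - start) // step + 1

# June 1 through August 31, 2015, counting both end dates.
days = (date(2015, 8, 31) - date(2015, 6, 1)).days + 1

total_states = states_per_day * days
```

Under this counting the full study period contains 47,012 time stamps per road section before any missing data are removed.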

4.2. Critical Road Selections

The proposed QoS-based critical road selection method and two other road selection methods (the random method and the coverage-based method) were used to select the road sections for prediction in the road network. In the random method, roads are selected randomly during the prediction process. In the coverage-based method, roads are selected based on coverage, which means that roads with higher coverage are preferred. The three critical road selection methods are applied under different extracting rates (i.e., the proportion of critical road sections to all roads).

The correspondence between the extracting rate and the number of critical road sections is listed in Table 1. For example, an extracting rate of 0.5 means that we select 139 roads as critical road sections and subsequently use them to predict the traffic states of all 278 roads.
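The relation between the extracting rate and the number of critical road sections, together with the random baseline, can be sketched in Python as below. Rounding to the nearest integer, the fixed seed, and the integer road identifiers are assumptions for illustration; the exact counts in Table 1 may follow a different rounding rule.

```python
import random


def critical_road_count(total_roads, rate):
    """Number of critical road sections at a given extracting rate (assumed rounding)."""
    return round(total_roads * rate)


def random_selection(road_ids, rate, seed=0):
    """Random baseline: draw the required number of roads without replacement."""
    rng = random.Random(seed)
    return rng.sample(road_ids, critical_road_count(len(road_ids), rate))


road_ids = list(range(278))          # hypothetical identifiers for the 278 road sections
selected = random_selection(road_ids, rate=0.5)
```

A fixed seed keeps the random baseline reproducible, so that differences between selection methods are not confounded by a different random draw in each run.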

4.3. Results and Comparison
4.3.1. Performance between Different Critical Road Selection Methods

The root mean squared error (RMSE) and the root mean squared error proportional (RMSEP) are employed to evaluate the performance of all the models. In both metrics, $y_i$ is the $i$th ground-truth value and $\hat{y}_i$ is the $i$th predicted value, and the averages are taken over the number of critical road sections at the given extracting rate and the total number of traffic states.
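The two metrics can be computed as in the following Python sketch. The RMSE follows its standard definition. The RMSEP is assumed here to be the RMSE divided by the mean of the ground-truth values, which is one common convention; the definition used in this study may differ. The sample speeds are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between ground truth and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def rmsep(y_true, y_pred):
    """RMSE relative to the mean ground-truth value (assumed definition)."""
    return rmse(y_true, y_pred) / (sum(y_true) / len(y_true))


# Hypothetical speeds in km/h.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 32.0, 38.0]

error = rmse(y_true, y_pred)
relative_error = rmsep(y_true, y_pred)
```

A relative measure such as the RMSEP makes errors comparable across roads whose typical speeds differ, whereas the RMSE is expressed in the unit of speed.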

4.3.2. Performance Using the Different Critical Road Selection Method

In order to evaluate the performance of the proposed critical road selection optimization model, we train and test the STRCN model using different road selection methods. The RMSEs and RMSEPs in the context of different extracting rates are listed in Table 2.

In the range of 0.8 to 1.0, the performance of the STRCN model is almost the same for the different road selection methods, because the road sections finally selected by the different methods differ little at high extracting rates. However, when the extracting rate is between 0.5 and 0.8, the QoS-based selection method is slightly superior to the overall prediction model. In general, the decrease in accuracy is reasonable and within acceptable limits, which demonstrates the validity and generalization of the approach and the fact that some road sections indeed contribute less to prediction.

Additionally, when the extracting rate drops to 0.5, the predictive performance gradually becomes unstable, probably because too many roads are omitted. In particular, when the extracting rate is between 0.0 and 0.2, the performance is basically the same, because no matter which method is used, there are too many missing road sections to predict. Figure 5 shows that prediction accuracy generally declines as the extracting rate decreases for the different critical road selection methods.

4.3.3. Performance between Several DL Algorithms

As introduced before, corrupted or missing data commonly arise from monitoring equipment failure, extreme weather, data transmission errors, etc., which weakens the effectiveness of the prediction model or even disables it. To test the performance of our model under random structural missing data, we randomly extract a subset of road sections at each extracting rate, with the rates kept the same as those mentioned above.

Four popular deep learning-based algorithms are selected for comparison: ANN, CNN, LSTM, and SAE. The ANN adopts a plain and shallow structure to handle multidimensional and nonlinear problems. The parameters of the SAE are set according to [35], which achieved high accuracy in predicting traffic flow. We take 30 min of historical traffic speed as the input to predict the overall traffic states 2 min ahead.
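With data updated every 2 min, a 30 min history corresponds to 15 look-back steps and a 2 min horizon to one step ahead. The sliding-window construction can be sketched in Python as follows. For brevity the sketch uses a single series in which each element stands for one time step; in the setting described here the input would hold the critical road sections and the target all road sections. The function name `make_windows` is hypothetical.

```python
def make_windows(series, look_back, horizon):
    """Pair each window of `look_back` steps with the value `horizon` steps ahead."""
    samples = []
    for end in range(look_back, len(series) - horizon + 1):
        history = series[end - look_back:end]
        target = series[end + horizon - 1]
        samples.append((history, target))
    return samples


UPDATE_MINUTES = 2
look_back = 30 // UPDATE_MINUTES     # 15 steps of history
horizon = 2 // UPDATE_MINUTES        # predict one step (2 min) ahead

one_day = list(range(511))           # stand-in for the 511 daily traffic states
samples = make_windows(one_day, look_back, horizon)
```

Building windows within each day separately avoids pairing the last states of one evening with the first states of the next morning, since no data are used between 23:00 and 6:00.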

Table 3 presents the quantitative results of Q-STRCN, ANN, CNN, and LSTM. Q-STRCN outperforms the other models, indicating that our model can effectively mine the spatiotemporal features of the data and make relatively accurate predictions. Among the rival algorithms, LSTM performs best, probably because the temporal features of the time-series data are especially prominent. The results of ANN and SAE demonstrate that these two models fail to extract the spatiotemporal characteristics that have vital impacts on prediction. Figure 6 shows the quantitative results of the different approaches under different extracting rates.

5. Summary and Conclusions

Structural missing data usually has a strong negative effect on short-term traffic state prediction. In this paper, a novel hybrid short-term traffic state prediction method based on critical road selection optimization is proposed. First, a utility function of the quality of service (QoS) for the critical roads in a large-scale road network is proposed based on the coverage and the data score. Then, the critical road selection optimization model for transportation networks is presented, which selects an appropriate set of critical roads, subject to a maximum proportion of the total calculation resources, to maximize the utility value of the QoS. In addition, an innovative critical road selection method that considers the topological structure and the mobility of the urban road network is introduced. Subsequently, the traffic speed of the critical roads is used as the input of a convolutional long short-term memory neural network to predict the future traffic states of the entire network. Experimental results on the Beijing traffic network indicate that the proposed method outperforms prevailing DL approaches when critical road sections are considered.

However, although the case study showed that the proposed method could significantly improve the QoS of traffic prediction, the method is still far from practical application. In future studies, the inherent attributes of roads should be taken into account when calculating the QoS for the road network. In addition, traffic accidents, temperature, weather, and other external factors affect traffic prediction accuracy. These issues are left for future research.

Data Availability

All data, models, and code generated or used during the study are available in the submitted article.

Conflicts of Interest

The authors declare that they have no conflicts of interest regarding the publication of this study.

Acknowledgments

This research was funded partially by the National Science Foundation of China under Grant no. 51908018 and Beijing Science and Technology Commission Deep Computing Project under Grant no. Z191100002519001.