A Multifeatures Spatial-Temporal-Based Neural Network Model for Truck Flow Prediction

The majority of studies on road traffic flow prediction have focused on the flow of passenger cars or the flow of traffic as a whole, ignoring the significant impact of trucks with different sizes and operational characteristics on traffic flow efficiency. Therefore, in this paper, we focus on truck traffic flow and propose a Multifeatures Spatial-Temporal-Based Neural Network model (M-BiCNNGRU) to improve its prediction. The proposed model not only comprises conventional temporal characteristics and spatial relationships but also includes a range of multifeatures. These multifeatures include policy limit, optimal time delay, road resistance, and traffic congestion state. The impacts of upstream and downstream road sections on the spatial relationship are considered by using a Convolutional Neural Network (CNN). A Bidirectional Gated Recurrent Unit (Bi-GRU) is employed to account for the temporal characteristics. To evaluate the proposed model, traffic flow data were collected from a major expressway in Beijing and the results were compared with those derived from existing models. The results show that the prediction accuracy of the BiCNNGRU model, with spatial-temporal characteristics, and the M-BiGRU model, with multifeatures and temporal characteristics, is, respectively, 4.13% and 2.15% greater than that of the Bi-GRU model, with temporal characteristics only. The prediction accuracy of the proposed M-BiCNNGRU model is 92.86%, which is 7.12% greater than that of the Bi-GRU model and 13.83% greater than that of the Support Vector Regression (SVR) model. In general, therefore, the proposed M-BiCNNGRU model, which combines multifeatures, temporal characteristics, and spatial relationships, can significantly improve accuracy in predicting truck traffic flow.


Introduction
It is generally acknowledged that the efficient operation of modern supply chains relies heavily on the use of trucks. This is particularly the case in China, which has seen a growth of over 40% in truck road traffic during the last decade [1]. This increased transportation dependency has reinforced the importance of trucks, but given their poor dynamic performance and large sizes, the net effect of increased truck traffic is potentially more congestion and reduced traffic capacity overall [2]. Consequently, this problem is now attracting the attention of academia, especially concerning the development of an accurate prediction model for truck traffic flow, with the aim of finding solutions that can reduce the congestion caused primarily by the increasing numbers of trucks on roads.
Despite the increasing congestion problems caused by trucks, most current research related to congestion is associated with cars [3] and mainly relies on historical traffic data, seeking rules from time characteristics for traffic prediction [4][5][6][7]. Li et al. [8] proposed a model based on ensemble empirical mode decomposition and a random vector functional link network to predict travel time in mixed traffic flow. There are relatively few studies on the influence of truck flow, even though it is a significant component of overall road traffic. Several studies have predicted traffic flow from the perspective of temporal characteristics and spatial relationships and obtained better prediction results, indicating that adding the spatial relationship benefits prediction accuracy [9][10][11]. Compared with regular passenger cars, however, trucks have a range of operational features or constraints. For example, there are policies that restrict truck operation on specific road sections and time periods. It is therefore necessary to analyze the features that affect truck traffic flow and explore whether these features can effectively improve the accuracy of truck traffic flow prediction.
In terms of traffic flow prediction, the models currently employed can be classified as either parametric models or nonparametric models [12]. The parametric models mainly include ARIMA, subset ARIMA, and Kalman filtering [13][14][15][16][17][18]. However, a major drawback of the parametric models is their inability to reflect nonlinear traffic flow data with strong randomness, leading to traffic flow prediction inaccuracies. To address this problem and improve prediction accuracy, nonparametric models have been proposed, which are more adept at capturing the nonlinear characteristics of traffic flow. The nonparametric models include K-Nearest Neighbor (KNN), Support Vector Regression (SVR), Back Propagation Neural Network (BPN), and Fuzzy Neural Networks [16,19][20][21][22].
In recent years, not least with improvements in computer performance, the field of deep learning has developed rapidly. Of particular relevance, deep learning methods can capture the characteristics of nonlinear traffic data and achieve excellent results. However, each deep learning model has its own strengths and fields of application. For example, the Convolutional Neural Network (CNN) performs well in processing spatial data, and the Recurrent Neural Network (RNN) has advantages in capturing temporal features [23]. It has also been found that both historical time data and the road network spatial relationship have an impact on traffic flow, and this finding led to joint deep learning models that combine spatial and temporal advantages being applied to traffic flow prediction [24]. Asadi and Regan [25] proposed a CNN-LSTM deep learning framework for spatiotemporal forecasting problems. LSTM (Long Short Term Memory) is a variant of RNN [26], which mitigates the exploding and vanishing gradient problems of the RNN by adding memory modules, and has achieved positive performance in time series prediction [27,28]. Based on temporal-spatial features, Li et al. [29] combined an advanced multiobjective particle swarm optimization algorithm and deep belief networks to forecast traffic flow for the next day. Yang et al. [28] proposed a hybrid deep learning model to predict traffic flow, in which the LSTM and a CNN component were applied to learn temporal and spatial features, separately. A CNN-LSTM model was proposed by Cheng et al. [30] to capture traffic flow evolution in a transportation network. A CNN layer was applied to both downstream and upstream road sections for spatial characteristic mining, followed by the incorporation of an LSTM layer. It is worth noting that the prediction performance of the LSTM method depends on a large number of parameters and training data and that its computing ability is limited by computer bandwidth and memory.
The Gated Recurrent Unit (GRU) was proposed to address the deficiencies of LSTM; it has relatively few parameters and converges more easily [31]. Fu et al. [3] applied the GRU method to predict car traffic flow for the first time, and it performed better than the ARIMA model. Given the advantages of GRU and CNN in capturing temporal and spatial features, the two can be combined to capture the temporal and spatial characteristics of truck traffic flow and improve the prediction accuracy.
Compared with passenger cars, trucks are required to adhere to special operational policies, mainly in terms of driving time and space. For example, the transportation department stipulates that trucks are prohibited from passing on certain urban roads around the clock, or that trucks are prohibited from passing through certain areas from 7:00-8:30 and 16:30-18:30 during passenger car rush hours. In addition, traffic condition, the impedance of the road, and the mixed flow rate of trucks can all affect traffic efficiency, but there are relatively few studies of the influence of these factors on truck traffic flow prediction; most rely instead on single-source data from historical traffic flow records.
In general, to improve the prediction accuracy of truck traffic flow, additional factors that affect such flow will be analyzed from the perspective of policy, traffic status, and road impedance. Within this approach, multiple factors are defined as variables to form multiple features, and the input variables are, therefore, changed from one-dimensional traffic flow data to multidimensional feature data. Finally, the spatial relationship between upstream and downstream road sections is added to form a multifeatures spatial-temporal-based joint deep learning model to predict truck traffic flow and evaluate the prediction results. The overall aim and main contributions of this study are summarized as follows:
(1) The primary objective of the study is to improve the prediction accuracy of truck traffic flow. Although passenger car traffic flow forecasting has been widely studied, the characteristics of trucks and passenger cars are different, so it is not feasible to apply the forecasting method developed for passenger cars to trucks. For this reason, it is necessary to incorporate truck characteristics into the method for predicting truck traffic flow.
(2) A Multifeatures Spatial-Temporal-Based Neural Network model (M-BiCNNGRU) is proposed in this study to improve the prediction accuracy of truck traffic flow. In the proposed model, three types of factors are considered, namely, "spatial relationship," "temporal series," and "road operation." The impacts of upstream and downstream road sections on the spatial relationship are considered using a CNN. A Bidirectional Gated Recurrent Unit (Bi-GRU) is employed to account for the temporal characteristics. The road operation factor includes a range of multifeatures. These multifeatures include "policy limit," "optimal time delay," "road resistance," and "traffic congestion state." These multifeatures are also passed through the CNN model to explore the relationship between the truck flow and the multifeatures.
(3) The prediction performance of the M-BiCNNGRU model is explored. Several baseline models are also trained and used for prediction, including (i) the BiCNNGRU model, which considers spatial relationships and temporal characteristics; (ii) the M-BiGRU model, which considers multifeatures and temporal characteristics; (iii) the Bi-GRU model, which considers only temporal characteristics; (iv) the SVR model, which again considers only temporal characteristics; and (v) the ARIMA model, which likewise considers only temporal characteristics.
This paper is organized as follows. Section 2 introduces the methodology of the multifeatures spatial-temporal-based truck traffic flow prediction method. In Section 3, the properties of road sections are introduced, and the spatial relationship and congestion coefficient associated with road sections are analyzed. The results of the proposed model are analyzed in Section 4. Section 5 discusses the prediction performance during peak and low peak periods. Section 6 summarizes the conclusions.

Methodologies
In previous models, traffic flow predictions associated with passenger cars were only inferred from historical traffic flow data. However, other than historical traffic flow, there are multiple other features that affect truck traffic flow, such as current road conditions, road resistance, traffic congestion index, and traffic restriction policies. To consider these additional features, in this paper, we propose a Multifeatures Spatial-Temporal-Based Neural Network prediction method for truck traffic flow, which is termed M-BiCNNGRU. The CNN model is used for multifeatures and spatial feature learning, and the Bi-GRU model is employed for temporal characteristics learning. The proposed M-BiCNNGRU model structure is shown in Figure 1.
(1) Truck traffic flow: Truck traffic flow has a self-correlation, which means that historical truck traffic flow will have an impact on future truck traffic. The truck traffic flow q refers to the number of trucks passing through a road section in a certain time interval and is defined as a continuous variable. A truck is defined as a vehicle that, in terms of design and technical characteristics, is mainly used for carrying goods, including special vehicles whose main purpose is to carry goods (GA 802-2014). In this paper, the authors regard all truck types as the research object of truck traffic flow prediction. To unify the different truck types, the authors converted the traffic flow of each truck type to the Passenger Car Equivalent (PCE). The classification of truck types and the PCE conversion factors are shown in Table 1.
(2) Policy limit: For different road sections and different periods, truck operation will be restricted by policy, which will cause changes in truck traffic flow. The policy limit variable is defined as a discrete variable. If trucks are allowed to operate during a period, it is defined as 1; if not, it is defined as 0.
(3) Optimal time delay: In this paper, the road section where traffic flow is to be predicted is called the target section. When truck flow "upstream" of the target section changes, a period of time elapses before the change propagates to the target section. Therefore, it is necessary to find a time delay difference $r$, which represents the time taken for the upstream and downstream sections to transmit truck flow characteristics. The optimal time delay $r_b$ refers to the delay which provides the best match between two road segments. The time delay diagram of upstream and downstream sections is shown in Figure 2. $n$ is defined as the number of truck flow time series. In this paper, the Pearson correlation coefficient (PCC) is used to calculate the correlation of truck flow series under different time delays $r$, where the time delay under the maximum PCC is the optimal time delay $r_b$. Moreover, $r_b$ is a discrete variable whose value is equal to the optimal time delay value.
(4) Road resistance: In previous models, the BPR function was often used to calculate the road resistance of passenger cars. However, for truck traffic, the traditional BPR function is inadequate, which led to the development of an improved truck road resistance function, equation (1),
where $l_a^t$ is the length of road section $a$ at time $t$. $v_0^t$ represents the speed of the truck when the traffic flow of the road section is zero at time $t$. $q_a^t$ represents the truck traffic flow on road section $a$. $e^t$ represents the coefficient for the conversion of trucks into equivalent passenger cars; its respective values are 1.5, 2, and 3 for small, medium, and heavy trucks. $m_a^t$ represents the proportion of trucks in the overall traffic flow. $c_a$ represents the traffic capacity of road section $a$. For the improved truck road resistance function, the length of the road section, the current truck traffic flow, the truck traffic flow rate in the mixed traffic flow, and the truck equivalent are taken into account to express the road operation more accurately.
(5) Congestion coefficient: At present, the standards for measuring congestion indicators are not uniform. Due to the complex and diverse traffic operating environment of urban roads, the classification of traffic operating status is often not accurate enough and has a certain degree of ambiguity [32,33]. For this reason, the fuzzy evaluation method is used to judge the traffic state. This paper applies the truck traffic flow fuzzy comprehensive evaluation method to calculate the congestion coefficient on the road section. The function of the congestion coefficient (CC) is given in equation (2). When $q_a^t/m_a$ reaches the design flow limit, the road section is in a serious congestion state, where the CC is equal to 10; however, when $q_a^t/m_a$ is equal to 0, the road section is in a clear state where the CC is close to 0.
For the fuzzy comprehensive evaluation method, the threshold range according to the congestion degree needs to be defined first. The larger the CC, the more congested the road section is. The CC values of a road network are ranked in descending order and divided into five levels according to a certain proportion. In this paper, the definition of congestion is obtained with reference to Beijing's "Evaluation Index System for Urban Road Traffic Operation" (EISU) published on April 28, 2011. EISU proposed the recommended conversion relationship between the Road Network Operation Level (RNOL) and Traffic Performance Index (TPI). When the TPI is between the values [0, 2], [2, 4], [4, 6], [6, 8], and [8, 10], the RNOL is defined as "unblocked," "basically smooth," "light congestion," "moderate congestion," and "severe congestion," respectively. Moreover, the respective variables associated with these RNOL definitions are 1, 2, 3, 4, and 5. Further details of the above variable definitions can be found in Table 2. To unify the discrete and continuous variables, the authors apply one-hot encoding to the discrete variables. Figure 3. The feature variables of the upstream section, the downstream section, and the target section are defined as

Construct
The input feature vector X of the proposed model is, therefore, constructed as shown in equation (3).
If the target road section has only one upstream and one downstream road section, as shown in Figure 2(a), it is relatively easy to compose the input feature vector. However, a target road section often has multiple upstream road sections and multiple downstream road sections, as shown in Figure 3(b). If the feature variables of multiple road sections are all simply applied to the input feature vector, there will be inconsistencies in the dimensions of the input feature vectors. If, instead, one of the multiple upstream sections is simply selected as being representative of the upstream section, errors may also be introduced, because the representative section may not accurately reflect the truck traffic flow status of all upstream sections. It should be noted that the multiple scenarios linking the target road section with the upstream and downstream road sections are often ignored in previous papers. Therefore, when there are multiple upstream road sections, this paper proposes aggregating the feature variables of multiple upstream sections into one upstream section. The aggregation rules are as follows:
(1) The truck traffic flow variable is formed by summing the flows of the multiple upstream sections.
(2) The policy limit is set to the policy of the road section with the highest level of truck traffic flow.
(3) The optimal time delay, CC, and road resistance are recalculated according to the sum of the truck traffic flow in the multiple upstream sections.
The rule for the downstream road sections is the same as for the upstream sections above. Through the above processing, multiple scenarios linking the target section with the upstream and downstream sections are aggregated into a unified dimension of the input feature vector.
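The aggregation rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout, the function name `aggregate_sections`, and the sample values are hypothetical, and "highest level of truck traffic flow" is interpreted here as the largest total flow over the series. Rule (3) is not implemented, because it requires the time delay, CC, and road resistance functions to be re-evaluated on the summed flow.

```python
def aggregate_sections(sections):
    """Aggregate several upstream (or downstream) sections into one.

    Each section is a dict with equally long hourly lists:
    'flow' (truck traffic flow) and 'policy' (1 = allowed, 0 = restricted).
    """
    hours = len(sections[0]["flow"])
    # Rule (1): sum the truck traffic flow over all sections, hour by hour.
    flow = [sum(s["flow"][t] for s in sections) for t in range(hours)]
    # Rule (2): take the policy limit of the section carrying the most trucks.
    # Assumption: "highest flow" is judged on the total over the whole series.
    busiest = max(sections, key=lambda s: sum(s["flow"]))
    return {"flow": flow, "policy": list(busiest["policy"])}


# Hypothetical example with three upstream sections and three hours of data.
upstream_sections = [
    {"flow": [120, 150, 90], "policy": [1, 1, 0]},
    {"flow": [30, 20, 40], "policy": [1, 0, 0]},
    {"flow": [60, 80, 70], "policy": [0, 1, 1]},
]
merged = aggregate_sections(upstream_sections)
print(merged)
```

Whatever the number of upstream sections, the result has the same shape as a single section, which is what keeps the input feature vector at a fixed dimension.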

Spatial Relationships.
To mine the spatial relationship between the upstream and downstream sections and the target section, the CNN was chosen. This is because the CNN has advantages in mining spatially local information and has achieved significant results in other fields such as natural language processing. The conventional 1D CNN is applied to learn the spatial relationship with the locality structure of key features. The CNN contains a convolutional layer, an activation layer, and a pooling layer. The feature maps of the 1D CNN are calculated by performing 1D convolution on the input feature vector X, where $h_i$ is the weight vector of the $i$-th feature, $b_i$ is the bias, $k_i$ is the nonlinear activation, $*$ represents convolution, and pool denotes the pooling function. To avoid overfitting, the depth of the 1D CNN is set to 3. The ReLU is applied for nonlinear activation.
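As an illustration of the convolution, activation, and pooling sequence described above, the following minimal sketch computes one feature map in plain Python. It is not the authors' implementation: the kernel values, the bias, the use of max pooling with window size 2, and the function names are assumptions made for the example.

```python
def conv1d_valid(x, kernel, bias):
    """1D 'valid' convolution as used in CNN layers (cross-correlation, no flip)."""
    k = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(k)) + bias
            for i in range(len(x) - k + 1)]


def relu(values):
    """ReLU activation applied element-wise."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


def feature_map(x, kernel, bias, pool_size=2):
    """pool(activation(kernel * x + bias)) for a single kernel."""
    return max_pool1d(relu(conv1d_valid(x, kernel, bias)), pool_size)


x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]          # hypothetical input feature vector
fmap = feature_map(x, kernel=[1.0, -1.0], bias=0.5)
print(fmap)  # [1.5, 1.5]
```

In a trained network the kernel and bias are learned, and several such kernels run in parallel to produce multiple feature maps.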

Temporal Characteristics.
Truck traffic flow has a long-term temporal dependency. In this paper, a Bi-GRU is introduced to learn the temporal characteristics associated with truck traffic flow, because the Bi-GRU has advantages in capturing time characteristics. The LSTM method has recently been applied to numerous problems, as it generally performs well in terms of prediction accuracy; compared with LSTM, however, the Bi-GRU contains fewer parameters and has achieved better performance in the field of natural language processing. For truck traffic flow, few scholars have studied its potential benefits. Specifically, in the Bi-GRU model, the reset gate and update gate are applied to control the update status of the time series. In the calculation of the reset gate $R_i$ and the update gate $U_i$, $S_i$ is the input to the Bi-GRU, which is also the output of the 1D CNN, $\sigma$ is the activation, $W_r$ and $W_u$ are the weight vectors, $b_r$ and $b_z$ are biases, and $H_{i-1}$ is the hidden layer output value. The structure of the Bi-GRU is composed of the forward and reverse sequences. The forward sequence is from start to end, and the reverse sequence is from end to start.
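The gate mechanism can be made concrete with a small sketch of a standard GRU step and a bidirectional pass. It follows the commonly used GRU formulation rather than equations taken from this paper; the parameter names (`W_r`, `W_u`, `W_h`, `b_r`, `b_u`, `b_h`), the state-update convention, and the toy weights are assumptions for illustration.

```python
from math import exp, tanh


def sigmoid(v):
    return 1.0 / (1.0 + exp(-v))


def affine(W, vec, b):
    """Return W @ vec + b for a matrix W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + bi for row, bi in zip(W, b)]


def gru_step(s, h_prev, p):
    """One GRU step: s is the input S_i, h_prev is H_{i-1}, p holds the weights."""
    concat = h_prev + s
    r = [sigmoid(v) for v in affine(p["W_r"], concat, p["b_r"])]  # reset gate R_i
    u = [sigmoid(v) for v in affine(p["W_u"], concat, p["b_u"])]  # update gate U_i
    gated = [ri * hi for ri, hi in zip(r, h_prev)] + s
    cand = [tanh(v) for v in affine(p["W_h"], gated, p["b_h"])]   # candidate state
    # Convention assumed here: U_i weights the candidate, 1 - U_i the old state.
    return [(1 - ui) * hi + ui * ci for ui, hi, ci in zip(u, h_prev, cand)]


def bi_gru(sequence, p_fwd, p_bwd, hidden_size):
    """Concatenate the final states of a forward pass and a reverse pass."""
    h_f = [0.0] * hidden_size
    for s in sequence:
        h_f = gru_step(s, h_f, p_fwd)
    h_b = [0.0] * hidden_size
    for s in reversed(sequence):
        h_b = gru_step(s, h_b, p_bwd)
    return h_f + h_b


# Toy parameters: one hidden unit, one input feature (weights are hypothetical).
params = {"W_r": [[0.0, 0.0]], "b_r": [0.0],
          "W_u": [[0.0, 0.0]], "b_u": [0.0],
          "W_h": [[0.0, 1.0]], "b_h": [0.0]}
out = bi_gru([[1.0], [2.0]], params, params, hidden_size=1)
print(out)
```

Even with identical weights in both directions, the forward and reverse states differ because the sequence is read in opposite orders.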
In summary, the Multifeatures Spatial-Temporal-Based Neural Network prediction method, namely, the M-BiCNNGRU model, is a new joint neural network model. Compared with the CNN model, the proposed M-BiCNNGRU model adds the structure of mining temporal characteristics from the Bi-GRU neural network. As far as the authors are aware, this study is the first to combine the Bi-GRU and CNN neural networks for truck traffic flow tasks. The proposed model has three advantages. Firstly, the impact of the truck traffic flow in the upstream and downstream sections on the target road section can be explored using the spatial relationship. Therefore, the road conditions of the upstream and downstream sections can be well represented in the proposed model. Secondly, the Bi-GRU neural network is well capable of mining the temporal characteristics. Through the fully connected layer, the output of the spatial relationship is passed to the Bi-GRU model, which can then continue to explore the time relationship. Thirdly, the proposed model is more capable of expressing the characteristics of truck flow from the traffic perspective.

Properties of the Road Section.
The data for the period June 1st to June 5th, 2019, were provided by the traffic survey in Beijing, China, and contain truck traffic flow as well as passenger car traffic flow information. The data were collected by manual counting methods. The data were provided by Beijing Transport Institute, which invited professional survey companies to conduct data collection work, and a total of 30 investigators participated in this work. The time interval of truck traffic flow is one hour, and the size of the data in the CSV format is 3.48 GB. As an important part of Beijing's northwest freight corridor, the Sixth Loop Expressway is selected as the research object and is divided into 1,293 road sections and numbered accordingly. To consider different scenarios, six road sections (S1-S6) were randomly selected as target road sections, as shown in Figure 4.
From Figure 4, it can be seen that S3 and S6 have three upstream sections and one downstream section, whereas S1, S2, S4, and S5 each have one upstream section and one downstream section. The average daily truck traffic flow of the six selected target sections is shown in Figure 5. The hourly distribution of daily truck traffic flow differs from the morning and evening peak distribution of passenger cars. The peak period of the truck traffic flow is 11:00-23:00 and the low peak period is 0:00-11:00. The descriptive statistics of the target sections are given in Table 3. In this context, mean refers to the mean truck traffic flow in 1 hour. Std refers to the mean standard deviation. The mixed ratio refers to the ratio of trucks to the overall traffic flow. Peak mean and peak std refer to the mean truck traffic flow and standard deviation in peak hours (11:00-23:00). Low peak mean and low peak std refer to the mean truck traffic flow and standard deviation in low peak hours (0:00-10:00).

Spatial Relationships.
To show that the upstream and downstream sections have a spatial relationship, the PCC is usually applied to study the interaction between sections. The PCC is defined as the covariance of the truck traffic flows of two road sections divided by the product of their standard deviations. The value of the PCC is between −1 and 1. The closer it is to 1, the stronger the positive correlation between the road sections. The PCC with the upstream and downstream sections of the six selected target sections is shown in Figure 6.
As can be seen from Figure 6, the PCC value is close to 1, indicating that there is a strong spatial correlation between the upstream and downstream sections and the target section. Therefore, there is a certain basis for improving the accuracy of predicting truck traffic by using the spatial relationship between upstream and downstream sections.
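The PCC used here is the same quantity that drives the optimal time delay in the methodology: the upstream series is shifted by each candidate delay $r$, and the delay with the largest PCC is taken as $r_b$. A minimal sketch follows; the function names, the `max_delay` search bound, and the sample series are hypothetical, and a constant series is assumed to have zero correlation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant series as uncorrelated
    return cov / sqrt(var_x * var_y)


def optimal_time_delay(upstream, target, max_delay):
    """Return (r_b, pcc): the delay r in 0..max_delay with the largest PCC."""
    best_delay, best_pcc = 0, float("-inf")
    for r in range(max_delay + 1):
        # Pair upstream[t] with target[t + r].
        up = upstream[: len(upstream) - r] if r else upstream
        tg = target[r:]
        pcc = pearson(up, tg)
        if pcc > best_pcc:
            best_delay, best_pcc = r, pcc
    return best_delay, best_pcc


upstream = [10, 30, 20, 50, 40, 60, 35, 45]
target = [0, 0] + upstream[:-2]   # target repeats upstream two intervals later
r_b, pcc = optimal_time_delay(upstream, target, max_delay=3)
print(r_b, round(pcc, 6))
```

With hourly data, each unit of $r$ corresponds to one hour of propagation time between the two sections.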

CC.
Using the fuzzy evaluation method, the TPI of six road sections is obtained as shown in Figure 7. The congestion situation in the upstream and downstream sections will affect the flow of the target road section. By calculating the CC, the operating status of the road section can be well understood, which is helpful for the prediction of truck flow. Similarly, the CCs of the upstream and downstream sections of the six sections are also calculated.
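Once a TPI value is available, turning it into the discrete RNOL variable and its one-hot encoding is mechanical. The sketch below uses the five 2-point-wide bands quoted in the methodology; because the quoted intervals share their end points, it assumes that a boundary value (2, 4, 6, or 8) falls into the upper band. The function names are hypothetical.

```python
RNOL_LABELS = ["unblocked", "basically smooth", "light congestion",
               "moderate congestion", "severe congestion"]


def rnol_level(tpi):
    """Map a TPI in [0, 10] to a level 1..5 using 2-point-wide bands."""
    if not 0 <= tpi <= 10:
        raise ValueError("TPI must lie in [0, 10]")
    # Assumption: a value on a shared boundary goes to the upper band.
    return min(int(tpi // 2) + 1, 5)


def one_hot(level, n_levels=5):
    """One-hot encode a level in 1..n_levels."""
    return [1 if i == level - 1 else 0 for i in range(n_levels)]


level = rnol_level(3.2)
print(level, RNOL_LABELS[level - 1], one_hot(level))
# 2 basically smooth [0, 1, 0, 0, 0]
```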

Evaluation.
To analyze the performance of the proposed model, model training and evaluation of the truck traffic in the six selected road sections were carried out. The first four days of data were used for training purposes, and data from the last day were used for verification, within which the Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Root Mean Square Error (RMSE) were applied as parameters for evaluation.
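For reference, the three evaluation measures can be written out with their standard definitions as below. The sample values are hypothetical, and the MAPE function assumes that no observed flow is zero.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent; assumes no actual value is 0."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


actual = [100.0, 200.0, 400.0]       # hypothetical observed truck flows
predicted = [110.0, 180.0, 400.0]    # hypothetical model outputs
print(mae(actual, predicted), mape(actual, predicted), rmse(actual, predicted))
```

MAE and RMSE are in the units of the flow itself, whereas MAPE is scale-free, which is why the comparison across sections with different traffic volumes is reported mainly in terms of MAPE.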

Results
To explore the prediction performance of the M-BiCNNGRU model, several baseline models were also trained and used in predicting truck traffic flows. These models include the following: (i) BiCNNGRU, which considers spatial relationships and temporal characteristics. (ii) M-BiGRU, which considers multifeatures and temporal characteristics. (iii) Bi-GRU, which considers temporal characteristics. (iv) SVR, which considers temporal characteristics. (v) ARIMA, which considers temporal characteristics.
For the deep learning algorithm, several hyperparameters that affect the performance of the proposed model need to be discussed; these hyperparameters include batch size, hidden unit size, hidden layer number, time steps, and epochs.
The hyperparameter values at which the evaluation measures MAPE, MAE, and RMSE reach their minimum are selected as optimal. The learning rate in the study is set to 0.001; the batch size is set to 20. Epochs are 100, 230, 130, 240, 160, and 210, respectively, for the six sections S1, S2, S3, S4, S5, and S6, as determined from multiple iterations of training to find the optimal parameters.
(1) For the training environment, Keras and TensorFlow provided the deep learning packages, and Python 3.6 was used as a general-purpose programming language. These were hosted on a desktop computer with an Intel Xeon(R) E5-2640 2.5 GHz CPU and 32 GB memory.
(2) For choosing the optimal hidden units, the evaluation values with different hidden units were compared. Firstly, hidden units from [5, 10, 15, 30, 50, 100] were tested. The results indicated that the best of these values is 5. Then hidden units from [1, 2, 3, 4, 5] were used in further testing, where it was found that the respective MAPE values for road section S1 are 8.32%, 9.31%, 8.59%, 7.62%, and 9.65%. A hidden unit size of 4, which gives the lowest MAPE (7.62%), is therefore the optimal parameter.
(3) For choosing the optimal time step, time steps from [4, 8, 12, 16, 20, 24] were tested, and the corresponding evaluation distributions are shown in Figure 8.
From Figure 8, it can be seen that when the time step is gradually increased from 4 to 24, the distributions of MAPE, MAE, and RMSE for road sections S1, S2, S3, S4, S5, and S6 all show a gradually increasing trend. When the time step is 4, the error values of MAPE, MAE, and RMSE are at their lowest, suggesting that the proposed model has the best performance and that time step 4 is the optimal parameter. That is to say, the best model fit is obtained by learning the flow of the first 4 hours to predict the flow of the fifth hour. The evaluation results of the above six models are shown in Table 4 and Figure 9.
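A time step of 4 means that each training sample pairs four consecutive hourly values with the value of the following hour. A minimal sketch of this windowing is given below; the function name and the flow values are hypothetical.

```python
def make_windows(series, time_step=4):
    """Split a series into (inputs, target) pairs: time_step values -> next value."""
    return [(series[i:i + time_step], series[i + time_step])
            for i in range(len(series) - time_step)]


hourly_flow = [210, 190, 175, 160, 180, 240, 310]   # hypothetical hourly flows
windows = make_windows(hourly_flow, time_step=4)
for inputs, target in windows:
    print(inputs, "->", target)
# [210, 190, 175, 160] -> 180
# [190, 175, 160, 180] -> 240
# [175, 160, 180, 240] -> 310
```

A series of length N yields N minus time_step samples, so a larger time step leaves fewer training samples from the same four days of hourly data.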
It can be seen in Table 4 and Figure 9 that the M-BiCNNGRU model, which combines multifeatures, spatial relationships, and temporal characteristics, performs better than the baseline models. The average MAPE of the six sections is 7.14%, corresponding to a prediction accuracy of 92.86%; this MAPE is 2.99, 4.97, 7.12, 13.83, and 13.14 percentage points lower than that of BiCNNGRU, M-BiGRU, Bi-GRU, SVR, and ARIMA, respectively. The improved performance in prediction accuracy of the proposed M-BiCNNGRU model indicates that the spatial relationships, temporal characteristics, and operational factors with multifeatures are significantly helpful in predicting truck traffic flow accurately. To further analyze how multifeatures, spatial relationships, and temporal characteristics contribute to the accuracy improvement, the authors designed several models with and without single factors to evaluate the prediction performance.
The BiCNNGRU model was designed by considering the factors of spatial relationships and temporal characteristics; its purpose is to explore whether the road operation factors of multifeatures play a positive role in improving the prediction accuracy of truck traffic flow. The average MAPE value of the BiCNNGRU truck traffic flow prediction model considering spatial and temporal factors is 10.13, which is 2.99 higher than that of the proposed M-BiCNNGRU model. That is, without the road operation factors of multifeatures, the prediction accuracy reduces by 2.99%; hence, the results show that the operation factors of multifeatures are effective in improving the prediction accuracy of truck flow.
The M-BiGRU model was designed by considering the factors of temporal characteristics and the road operation factors of multifeatures; its purpose is to explore whether spatial relationships of upstream and downstream sections play a positive role in improving prediction accuracy. The average MAPE value of the M-BiGRU model is 12.11, which is 4.97 higher than that of the proposed M-BiCNNGRU model. The result shows that, without the factor of spatial relationships, the prediction accuracy is reduced by 4.97%. The accuracy reduction of 4.97% from not considering spatial relationships is larger than the reduction of 2.99% from not considering multifeatures. This result indicates that, compared with multifeatures, the consideration of spatial relationships makes a larger contribution to accuracy.
The Bi-GRU model was designed by only considering the factor of temporal characteristics. It is a deep learning model which has been widely studied and applied as a result of its good performance in the mining of temporal characteristics. However, does this good performance also apply in truck traffic flow prediction? How much of an increase in truck flow prediction accuracy can be achieved using the Bi-GRU model to mine the temporal characteristics compared to the parameter-type model ARIMA and machine learning model SVR?
The average MAPE of the Bi-GRU model is 14.26, which is 6.71 and 6.02 lower than that of the SVR model and the ARIMA model, respectively, showing that the Bi-GRU model selected in this paper to capture time characteristics has good prediction performance and performs better than the parameter-type and machine learning models. It also indicates that the Bi-GRU model can play a fundamental role in obtaining a high prediction accuracy in the constructed M-BiCNNGRU model. Furthermore, the average MAPE of the M-BiGRU model considering multifeatures and temporal characteristics is 12.11, which is 2.15 lower than that of the Bi-GRU model only considering temporal characteristics. It also indicates that the operation factor of multifeatures is helpful in improving truck flow prediction accuracy, and the improved prediction accuracy value is 2.15%. The average MAPE of the BiCNNGRU model considering the spatial relationship and temporal characteristics is 4.13 lower than that of the Bi-GRU model only considering temporal characteristics. It also indicates that the factor of spatial relationship is conducive to improving the truck flow prediction accuracy, and the improved prediction accuracy value is 4.13%, which is higher than the improved prediction accuracy value of 2.15% for multifeatures. The result shows that the contribution of the spatial relationship factor to improving the prediction accuracy is greater than that of the operation factor of multifeatures.
In summary, the proposed M-BiCNNGRU model, which considers three types of factors (spatial relationship, temporal characteristics, and road operation multifeatures), outperforms the other models in predicting truck traffic flow. In terms of contribution to prediction accuracy, the temporal characteristics contribute the most, followed by the spatial relationship and then the road operation factor of multifeatures.
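The comparisons above reduce to simple differences between average MAPE values, and the short Python sketch below reproduces that arithmetic. It assumes the convention implied by the reported figures, namely that prediction accuracy equals 100 minus MAPE. The dictionary name `MAPE` and the helpers `accuracy` and `gain` are illustrative names, not part of the original study; values marked "derived" are not quoted directly in the text but follow from the differences and accuracies it reports.

```python
# Average MAPE (%) per model. "derived" values are computed from figures
# reported in the paper rather than quoted directly in this section.
MAPE = {
    "M-BiCNNGRU": 7.14,   # derived: 100 - 92.86 (reported accuracy)
    "BiCNNGRU": 10.13,    # derived: 100 - 89.87 (reported accuracy)
    "M-BiGRU": 12.11,     # reported
    "Bi-GRU": 14.26,      # reported
    "ARIMA": 20.28,       # derived: 14.26 + 6.02
    "SVR": 20.97,         # derived: 14.26 + 6.71
}


def accuracy(model):
    """Prediction accuracy (%), assuming accuracy = 100 - MAPE."""
    return round(100.0 - MAPE[model], 2)


def gain(better, worse):
    """MAPE reduction (percentage points) of `better` relative to `worse`."""
    return round(MAPE[worse] - MAPE[better], 2)


# Contribution of each factor, measured as the MAPE reduction it brings.
contributions = {
    "temporal (Bi-GRU vs. SVR)": gain("Bi-GRU", "SVR"),
    "spatial (BiCNNGRU vs. Bi-GRU)": gain("BiCNNGRU", "Bi-GRU"),
    "multifeatures (M-BiGRU vs. Bi-GRU)": gain("M-BiGRU", "Bi-GRU"),
}

# Print the factors from largest to smallest contribution.
for name, value in sorted(contributions.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.2f}")

print("M-BiCNNGRU accuracy:", accuracy("M-BiCNNGRU"))
print("Loss without spatial relationships:", gain("M-BiCNNGRU", "M-BiGRU"))
print("Loss without multifeatures:", gain("M-BiCNNGRU", "BiCNNGRU"))
```

Sorting the contributions in this way gives the same ordering as the summary: temporal characteristics first, then the spatial relationship, then the multifeatures.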

Discussion
To further discuss the performance of the M-BiCNNGRU model, the original truck traffic flow is compared with the predicted truck traffic flow values. At the same time, this paper divides the day into two periods, low peak period (4:00-10:00) and peak period (11:00-22:00), and compares the prediction results of truck traffic flow in the two periods. The comparison results are shown in Figure 10.
For an improved quantitative analysis of the truck traffic flow prediction model in both peak and low peak periods, MAE, MAPE, and RMSE indexes were also used for evaluation. The results are shown in Table 5.
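As an illustration of this per-period evaluation, the following Python sketch groups hourly observations into the two periods and computes MAE, MAPE, and RMSE for each. It is a minimal sketch under stated assumptions, not the authors' implementation: the metrics use their standard textbook definitions, the hour boundaries are treated as inclusive, hours outside both periods are ignored, and the function names and sample values are hypothetical.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error (%); assumes no actual value is zero."""
    return 100.0 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def period_of(hour):
    """Map an hour of day to a period (boundaries assumed inclusive)."""
    if 4 <= hour <= 10:
        return "low_peak"
    if 11 <= hour <= 22:
        return "peak"
    return None  # hours outside both periods are not evaluated


def metrics_by_period(hours, actual, predicted):
    """Compute MAE, MAPE, and RMSE separately for each period."""
    groups = {"low_peak": ([], []), "peak": ([], [])}
    for h, a, p in zip(hours, actual, predicted):
        period = period_of(h)
        if period is not None:
            groups[period][0].append(a)
            groups[period][1].append(p)
    return {
        period: {"MAE": mae(a, p), "MAPE": mape(a, p), "RMSE": rmse(a, p)}
        for period, (a, p) in groups.items()
        if a  # skip periods with no observations
    }


# Made-up hourly truck flows, used only to exercise the functions.
hours = [5, 9, 12, 20]
actual = [100, 200, 400, 500]
predicted = [90, 220, 380, 510]
results = metrics_by_period(hours, actual, predicted)
for period, scores in results.items():
    print(period, {name: round(value, 2) for name, value in scores.items()})
```

Because MAPE is relative to the actual flow, the same absolute error yields a lower MAPE when the flow is larger, which is worth keeping in mind when comparing peak and low peak periods.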
It can be seen from Figure 10 and Table 4 that the overall average MAPE of truck flow in the peak period is 2.27 lower than that in the low peak period. The MAPE value of the S1, S3, S4, S5, and S6 road sections in the peak period is 6.18, 1.42, 5.59, and 0.94, respectively. This means that the prediction results are more accurate during peak periods than they are in low peak periods. The comparison of the evaluation parameters of the six sections in peak and low peak periods is shown in Figure 11.

Conclusions
The paper proposes a Multifeatures Spatial-Temporal-Based model (M-BiCNNGRU) for the accurate forecasting of truck traffic flow. The multifeatures include factors such as truck operational policy, road resistance, truck traffic flow rate, traffic congestion state, and optimal time delay, which were all incorporated into the model. The novelty of the model is that it includes the impact factors of multifeatures, as well as temporal characteristics and spatial relationships. The model was trained using data from six road sections of a major expressway network in China collected over a five-day period; the final day's data were excluded from training and used instead to verify the performance of the model. To gain further insight into the prediction results of the M-BiCNNGRU model, five baseline models, namely, BiCNNGRU, M-BiGRU, Bi-GRU, SVR, and ARIMA, were also applied. The results show that the prediction accuracy of the BiCNNGRU model is 89.87%, which is 4.13%, 10.84%, and 10.15% higher than that of the Bi-GRU, SVR, and ARIMA models, respectively. This indicates that the prediction performance of truck traffic flow can be improved by incorporating the spatial relationship between upstream and downstream sections. The prediction accuracy of the M-BiGRU model is 87.89%, which is 2.15%, 8.86%, and 8.17% higher than that of the Bi-GRU, SVR, and ARIMA models, respectively. This indicates that the incorporation of multifeatures can help to improve prediction accuracy. The prediction accuracy of the M-BiCNNGRU model is 92.86%, which is 13.83% and 13.14% higher than that of the SVR and ARIMA models, respectively, which consider only temporal characteristics. In summary, the proposed model based on multifeatures, temporal characteristics, and spatial relationships outperforms the baseline algorithms, as demonstrated in this paper. However, the contribution of each individual feature to the overall prediction effectiveness was not specifically analyzed in this paper; a future study could determine the influence of each feature in terms of weighting factors.

Data Availability
Some data and code used during the study are available in a repository in accordance with funder data retention policies (https://github.com/uubest/-LSTM-and-GRU).

Conflicts of Interest
The authors declare that they have no conflicts of interest.