Speed Distribution Prediction of Freight Vehicles on Mountainous Freeway Using Deep Learning Methods

Driving speed is one of the most critical indicators in safety evaluation and network monitoring in freight transportation. Speed prediction models are an efficient way to obtain driving speed data. Current speed prediction models mostly focus on operating speed, which can hardly reveal the overall condition of driving speed on a road section. Moreover, these models were mostly developed with regression methods, which are inconsistent with the natural driving process. The recurrent neural network (RNN) is a distinctive type of deep learning method that captures temporal dependency in behavioral research. The aim of this paper is to apply deep learning methods to predict the general condition of driving speed in consideration of road geometry and temporal evolution. 3D mobile mapping was applied to obtain road geometry information with high precision, and a driving simulation experiment was then conducted with the help of the road geometry data. Driving speed was characterized by a bimodal Gaussian mixture model. RNN and its variants, including long short-term memory (LSTM) and gated recurrent units (GRUs), were utilized to predict the speed distribution in a spatial-temporal dimension, with KL divergence as the loss function. The results support the applicability of the models in speed distribution prediction of freight vehicles, and LSTM performs best with an input sequence length of 400 m. This result can be related to the threshold of drivers' information processing on mountainous freeways. Multiple linear regression models were constructed as a contrast to the LSTM model, and the results showed that LSTM was superior to the regression models in terms of model accuracy and interpretability of the driving process and the formation of vehicle speed.
This study may help to understand the speed change behavior of freight vehicles on mountainous freeways, while providing a feasible method for safety evaluation or network efficiency analysis.


Introduction
In the driving process of freight vehicles, driving speed is one of the most important indicators in safety evaluation and efficiency analysis. Currently, the speed prediction model plays an important role in obtaining the driving speed of vehicles [1]. On mountainous freeways, poor terrain conditions lead to relatively unfavorable road geometry design, and combined vertical and horizontal alignments are not uncommon, which directly produces many accident-prone road sections [2]. Meanwhile, freight vehicles are always among the most critical elements in the cause of accidents. Therefore, it is essential to study the regularity of speed change behavior of freight vehicles.
On the selection of the input variables of the speed prediction model, parameters are mostly related to environmental conditions (e.g., road geometry, roadside vegetation, guardrail, and delineator) [3,4]. On mountainous freeways in China, the unfavorable terrain condition leads to road alignment design with a relatively lower standard. Combined vertical and horizontal alignments are not uncommon. As a tradeoff for acceptable cost, such alignments may force drivers to handle high-speed gradients, which can produce higher crash frequency [5][6][7]. Typical road alignment features include horizontal and vertical curvature, curve length, guardrails, roadside vegetation, and road delineators [8].
The previous studies agree that road geometry plays a dominant role in affecting operating speed under most circumstances.
Considering the possible difference between the two-dimensional road design method and the three-dimensional road scenario, speed prediction on three-dimensional freeway alignment has attracted the attention of some researchers in recent years [9,10]. In order to fully characterize the features of road geometry, objective characteristics of the road alignment and drivers' subjective perception in a 3D real space should both be considered [11,12]. Driving speed serves as the most representative variable due to a clear correlation between drivers' perception of the road and behavior change [13][14][15][16]. Meanwhile, studies which adopted sight distance as the input of the speed prediction model found that the index is highly correlated with drivers' operating speed, as well as driving safety [17][18][19].
On the selection of the output of the speed prediction models, operating speed or another representative value of speed was selected to be the predicted value of the model [20][21][22][23]. In practice, drivers choose their driving speed according to the road condition, including road geometry, roadside vegetation, guardrails, and traffic condition in the environment. The design speed of the road does not necessarily equal the actual driving speed. Operating speed, on the other hand, is a statistical value of the speed dataset. The value can be representative under most circumstances, but the detailed information of the whole speed dataset remains unrevealed. Numerous studies have also supported the view that driving speed follows a normal distribution or even multimodal distributions, depending on road conditions and sample drivers. Some researchers tried to fit the distribution of speed by multimodal Gaussian or Bayesian distributions and reached favorable fitting performance [24,25].
On the selection of speed prediction models, regression is the most frequently used method in driving behavior prediction [26][27][28][29][30]. Various regression models such as linear regression, log-linear regression, nonparametric multivariate adaptive regression splines, and parametric logistic regression have all shown their adaptability to different practical problems [4][5][6]. However, the majority of existing studies on speed prediction modeling treat the input as static variables, instead of considering the speed changing process as a time series. Although a few studies took the geometric features of previous and oncoming elements into consideration [31][32][33], the models were not able to match the actual driving process when negotiating the complexity of the road environment. Some researchers also conducted valuable comparative studies based on the classification of road geometric features and road scenarios [12,13], commonly with good fitting performance, but without being able to mimic the actual driving process. For instance, in a cloud model based on a fuzzy neural logic system, the input time sequence data were regarded as isolated variables, so the problem of correlations between input sequences remains [15]. Currently, the study of speed prediction models targeting freight vehicles is still insufficient.
Recently, powerful deep learning methods have been applied to transportation studies. Recurrent neural networks (RNNs), such as long short-term memory (LSTM), were developed based on bionics to capture long-term temporal dependency [34,35], which makes them effective in transportation studies [36]. Connections are created among hidden layers to transmit information. The application of RNN and its variants in traffic flow prediction and trajectory prediction also proved their advantages in processing time series data in other fields [37,38]. In the driving behavior field, LSTM has been applied to the prediction of vehicle trajectory based on naturalistic driving data and reached an accuracy of 96.83% in a 10 s prediction horizon [39]. A study on vehicle speed estimation was conducted based on in-vehicle accelerometer and gyroscope data. The study compared the robustness of LSTM models with various numbers of layers, and the absolute error reached 1.61 km/h [40]. There also exist relevant studies such as identifying behavioral change among drivers using LSTM [41,42]. However, the complex structure of LSTM usually requires a longer training time. As a relatively new model first introduced in 2014, gated recurrent units (GRUs) are derived from LSTMs, with a simpler structure and a faster training speed, and are easier to train. So far, GRUs have not been employed for operating speed prediction.
In the study of freight transportation, travel time, break-taking behaviors, emission, etc., are the most popular topics [47][48][49]. Some studies have taken the freight transportation process as a consecutive sequence, and uncertainty factors in the transportation process were taken into consideration as well [50][51][52].
These current studies mainly focus on the macroscopic scale; there are few research studies on the microscopic behavior of freight vehicles such as speeding, accelerating and decelerating, speed change, etc. Meanwhile, the microscopic behavior is the ultimate cause of the macroscopic features of freight transportation. Driving speed is one of the most critical indicators in safety evaluation and network monitoring in freight transportation. It is essential to conduct relevant studies to explore the mechanism of the microscopic features of speed change behavior of freight vehicles.
In summary, the speed prediction model connects the complex road environment and driving speed. Road geometry is always the representative variable in the road environment and holds the dominant place in speed prediction. The output of the prediction model varies around driving speed, but the most appropriate way to characterize driving speed is as a statistical distribution. On the selection of the prediction model, the continuity of the driving process should be taken into consideration, which means that the model should allow the input variables to be sequence data, while being able to handle consecutive information at the same time.
Therefore, the objectives of this paper are summarized as follows:
(1) Road geometry information should be obtained with high precision to provide input variables for the speed distribution model
(2) A deep learning network should be applied to predict speed changes of freight vehicles in consideration of the spatial dependencies and temporal evolutions of road geometry
(3) Driving speed should be obtained and characterized in the form of a statistical distribution
This paper is organized as follows. In Section 2, 3D mobile mapping is introduced as a method to obtain road geometry information, driving simulation and the Gaussian mixture model are introduced, and RNN and its variants are presented. In Section 3, an empirical study is presented, including the driving simulation experiment and the construction and training of the models. In Section 4, the results are presented and analyzed.
The last part presents concluding remarks and the scope of future studies.

Methodology
2.1. 3D Mobile Mapping. In order to provide road geometry data as the input variables of the speed prediction model, information including road alignment condition, terrain condition, and roadside vegetation is needed with high precision. 3D mobile mapping then becomes an ideal solution.
The 3D mobile mapping system utilized in the study was the TOPCON IP-S2, which combines 3 SICK 511 laser scanners, 1 high-precision GNSS antenna, 1 IMU module, and a Ladybug panoramic camera.
The laser scanners provide LiDAR point clouds and images of the road environment with a sampling frequency of 75 Hz. The GLONASS and GPS signals received by the GNSS antenna provide the spatial coordinates of the point cloud. The IMU module guarantees the continuity of the GNSS data.
The panoramic camera provides consecutive panoramic pictures of the road environment with a sampling interval of 5 m. Registration of panoramic pictures and point clouds was made by the mainframe, giving RGB information of the pixels to the points. The facility and a sample of the true color point cloud are presented in Figure 1.
In order to obtain adequate information about road geometry, road alignment and 3D available sight distance were selected as the main indexes. Road alignment information includes curvature and curve length on horizontal, vertical, and cross sections, respectively. Fitting of road alignment on the horizontal and vertical sections requires a series of points or a multisegment line along the road for reference. Considering that laser scanners provide the reflection intensity of target points, lane markings can be easily distinguished in the point cloud by their higher reflection intensity. Therefore, lane markings are selected as the reference in the fitting of road alignment. The sampling step on the lane marking was selected to be 5 m to guarantee sample density, as shown in Figure 2.
The fitting of the horizontal and vertical alignment was conducted in Civil 3D, as shown in Figure 3.
The fitting result shows that the average bias between the fitted alignment and the source multisegment line was no more than 0.052 m, which is acceptable in practice. The information of the cross section includes cross section width and cross slope. It is then essential to extract the points on the road cross section from the massive point cloud. The point cloud consists of consecutive scanlines, which are parallel to road cross sections. Meanwhile, the points on the road are always the lowest points in altitude compared with the surroundings. Therefore, traversing the scanlines for the consecutive lowest points on the road surface could provide the road alignment information on the cross section. The extraction work was completed in Civil 3D, as shown in Figure 4, where blue points are the points on the road surface, and the sampling interval was set to 5 m. The width of the cross section could be measured easily by the length of the road surface in the scanline. But the calculation of cross slope needs the spatial attitude of the road surface. Considering the high density of the point cloud, the points on a scanline are not strictly a line but a narrow surface. Therefore, plane fitting was conducted by the least squares method, and the tangent vector [...]
Available sight distance (ASD) refers to the longest distance drivers can see from their normal eye height during normal driving, which reflects the visual field provided by the road geometry. But the measurement of ASD is always challenging since the value of ASD is affected by road alignment design, roadside vegetation, terrain condition, etc. In order to measure ASD accurately, observers and targets need to be set in the real space of the point cloud model. According to the Chinese Design Standard for Road Alignment, the height of the target obstacle was set to 0.1 m. Considering the height of the target four-axle freight vehicles, the height of observers was set to 2.0 m. Observers and obstacle targets are placed every 5 m along the road. Spatial sight lines are then constructed between observers and targets. The work was completed in the Spatial Factory software, as shown in Figure 5.
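The least-squares plane fit mentioned above for the cross slope can be illustrated with a minimal sketch. It assumes a local coordinate frame in which x runs across the road and y along it; the function name `fit_plane_cross_slope` and the synthetic points are hypothetical and are not taken from the study, which performed the extraction in Civil 3D.

```python
import numpy as np

def fit_plane_cross_slope(points):
    """Fit z = a*x + b*y + c to scanline points by least squares.

    points: (N, 3) array with x across the road, y along the road,
    z elevation (assumed local frame). Returns (a, b, c); a is the
    cross slope as a ratio.
    """
    pts = np.asarray(points, dtype=float)
    design = np.column_stack([pts[:, 0], pts[:, 1], np.ones(len(pts))])
    coeffs, *_ = np.linalg.lstsq(design, pts[:, 2], rcond=None)
    return tuple(coeffs)

# Synthetic narrow scanline strip with a 2% cross slope and 1 mm noise.
rng = np.random.default_rng(0)
x = rng.uniform(-5.0, 5.0, 200)
y = rng.uniform(0.0, 0.2, 200)
z = 0.02 * x + 100.0 + rng.normal(0.0, 0.001, 200)
a, b, c = fit_plane_cross_slope(np.column_stack([x, y, z]))
cross_slope_percent = 100.0 * a
```

Because the strip is very narrow along the road, only the across-road coefficient `a` is well determined here; the along-road coefficient `b` should not be interpreted.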
The measurement of ASD was conducted based on the actual road alignment. In order to detect collisions between sight lines and road infrastructure or roadside vegetation, the "Rigidbody" command in Spatial Factory was used to transform the points in the point cloud into cubic rigid bodies with a side length of 0.1 mm so that collisions can be detected. The sight lines were then traversed until a collision occurred. The index of that sight line was then recorded in Spatial Factory. The ASD could then be measured by the "Measurement" command in Spatial Factory, as shown in Figure 6.
An example of a 3D ASD measurement result is shown in Figure 7.
The measuring method mentioned above was applied to a 900 m road segment in China, a horizontal curve on a mountainous freeway.
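The traversal of sight lines until the first collision can be sketched in a strongly simplified form. The sketch below assumes a two-dimensional vertical profile (station, elevation) instead of the 3D point cloud, so it only captures occlusion by the road surface itself; the function name and the synthetic profiles are hypothetical, while the 2.0 m observer height, 0.1 m target height, and 5 m spacing follow the text.

```python
import numpy as np

def available_sight_distance(stations, elevations, i, h_obs=2.0, h_tgt=0.1):
    """Longest distance ahead of station index i with a clear sight line.

    Targets are tested in order of increasing distance; the traversal
    stops at the first sight line that is blocked by the profile.
    """
    s = np.asarray(stations, dtype=float)
    z = np.asarray(elevations, dtype=float)
    eye = z[i] + h_obs
    asd = 0.0
    for j in range(i + 1, len(s)):
        target = z[j] + h_tgt
        mid = slice(i + 1, j)  # stations strictly between observer and target
        t = (s[mid] - s[i]) / (s[j] - s[i])
        sight_line = eye + t * (target - eye)
        if np.any(z[mid] > sight_line):
            break  # collision with the road surface
        asd = s[j] - s[i]
    return asd

stations = np.arange(0.0, 905.0, 5.0)            # 900 m segment, 5 m spacing
flat = np.zeros_like(stations)                   # level road
crest = -(stations - 200.0) ** 2 / 6000.0        # synthetic crest curve
flat_asd = available_sight_distance(stations, flat, 0)
crest_asd = available_sight_distance(stations, crest, 0)
```

On the level profile the whole segment is visible, while the crest cuts the sight distance short; a full 3D implementation would additionally test the sight line against roadside objects.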

Driving Simulation.
Traditional methods of obtaining driving behavior data include naturalistic driving experiments, video surveillance, and driving simulation experiments. A naturalistic driving experiment provides the opportunity to observe the actual driving process. But the result is affected by unexpected traffic conditions, and the cost of the experiment is higher than that of the other methods. Video surveillance provides a large quantity of data, but the continuity of the data sequence is difficult to guarantee. A driving simulation experiment enables participants to drive in a simulated environment with the help of a driving simulator. Driving behavior data are easy to collect, the variables in the road environment are easy to control, and the cost of the experiment is relatively low. At the same time, the simulated environment achieves high fidelity with the help of 3D mobile mapping. Therefore, a driving simulation experiment was utilized to obtain driving behavior data.
Firstly, the logical layer of the road environment model was constructed in HintCAD with the road alignment information extracted from 3D mobile mapping, as shown in Figure 8.
Then, the model layer was built in 3ds Max and Google SketchUp, and road infrastructures were imported together into the driving simulation software OKTAL SCANER, as shown in Figure 9.
Finally, scripting of data acquisition and model control was completed in the SCANER software. The steering performance of the vehicle was calibrated by the participants of the experiments so as to ensure that the driving experience is consistent with the actual environment. In order to keep the driving behavior data as continuous as possible, the sampling frequency was set to 10 Hz. Traffic flow was set to 0 in the script so that the influence of road conditions on driving behavior could be observed without disturbance.

Speed Distribution Model.
Numerous studies have found that the distribution of driving speed approximately follows a normal distribution or a lognormal distribution. But it is difficult for a single normal distribution to describe the speed distribution when there is a large speed difference on a road section, and the speed difference is more significant for freight vehicles since the performance of the vehicles varies from type to type. The actual distribution of driving speed might be bimodal or even multimodal. The Gaussian mixture model (GMM) applies multiple Gaussian probability density functions to quantify any distribution with precision. The probability density function of the GMM is described in the following formula:

$$p(x) = \sum_{k=1}^{K} \lambda_k \, \mathcal{N}(x \mid \mu_k, \sigma_k^2), \qquad \mathcal{N}(x \mid \mu_k, \sigma_k^2) = \frac{1}{\sqrt{2\pi}\,\sigma_k} \exp\left(-\frac{(x - \mu_k)^2}{2\sigma_k^2}\right),$$

where $K$ is the number of Gaussian distributions in the GMM and $\lambda_k$, $\mu_k$, $\sigma_k$ are the weight, mean value, and standard deviation of the $k$th distribution, respectively. Theoretically, when $K$ approaches infinity, the Gaussian mixture model can describe any kind of statistical distribution. In order to describe the distribution of freight vehicle speed with precision and to avoid underfitting by any single distribution form, the GMM is utilized to characterize the speed distribution of freight vehicles.
In practice, a GMM with three or more normal distributions may include one or more distributions with very small weights, which may cause massive unnecessary calculation in the fitting process. Generally, a bimodal GMM is adequate for fitting the speed distribution with considerable accuracy and a limited number of variables to be determined. The function of the bimodal GMM can be written as

$$p(x) = \lambda_1 \, \mathcal{N}(x \mid \mu_1, \sigma_1^2) + \lambda_2 \, \mathcal{N}(x \mid \mu_2, \sigma_2^2), \qquad \lambda_2 = 1 - \lambda_1,$$

where $\mathcal{N}(x \mid \mu, \sigma^2)$ denotes the Gaussian probability density with mean $\mu$ and standard deviation $\sigma$. Considering that there are two distributions in the bimodal GMM, it is difficult to determine which distribution each sample point in the training dataset belongs to. Direct maximum likelihood estimation is therefore no longer suitable for estimating the parameters of the model, and the EM (expectation-maximization) algorithm is applied to fit the bimodal GMM. The algorithm first estimates the membership of the sample points (E-step), and then an iteration is conducted based on the maximum of the likelihood function (M-step). The E-step is shown in formula (4), where the iteration index $k$ is set to 0 in the first step, $\mu_1$, $\mu_2$, $\sigma_1$, $\sigma_2$, and $\lambda_1$ are initialized to random values, $\lambda_2$ is a dependent variable with $\lambda_2 = 1 - \lambda_1$, $N$ is the number of samples in the dataset, and $m$ is the number of distributions in the GMM. In the fitting of the bimodal GMM, the value of $m$ was set to 2.
Then, the M-step is shown in formulas (5)-(7). The iteration ends when the inequality in formula (8) holds. In order to avoid local optimum solutions, the solution should also satisfy the inequalities in formulas (8) and (9). $\varepsilon_\lambda$, $\varepsilon_\mu$, $\varepsilon_\sigma$, and $\varepsilon_p$ are the allowable error ranges of the weight, mean value, standard deviation, and maximum likelihood function, and $k_{max}$ is the maximum iteration number. Generally, the error benchmark could be set to 1%, and a reasonable value of $k_{max}$ could be 100.
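The E-step and M-step described above can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: it uses the standard EM updates for a two-component Gaussian mixture, it initializes the means at the 25th and 75th percentiles instead of random values (an assumption made for reproducibility), and it stops on a 1% relative change of the parameters or after `k_max = 100` iterations without checking the likelihood criterion. The function names and the synthetic speeds are hypothetical.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Gaussian probability density with mean mu and std sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def fit_bimodal_gmm(x, k_max=100, tol=0.01):
    """Fit a two-component GMM by EM; returns (lam1, mu, sigma)."""
    x = np.asarray(x, dtype=float)
    lam1 = 0.5
    mu = np.percentile(x, [25, 75])      # assumed deterministic start
    sigma = np.full(2, x.std())
    for _ in range(k_max):
        # E-step: posterior probability that each sample belongs to component 1.
        w1 = lam1 * normal_pdf(x, mu[0], sigma[0])
        w2 = (1.0 - lam1) * normal_pdf(x, mu[1], sigma[1])
        r1 = w1 / (w1 + w2 + 1e-300)
        r2 = 1.0 - r1
        # M-step: responsibility-weighted weight, means, and standard deviations.
        new_lam1 = r1.mean()
        new_mu = np.array([np.sum(r1 * x) / r1.sum(), np.sum(r2 * x) / r2.sum()])
        new_sigma = np.sqrt(np.array([
            np.sum(r1 * (x - new_mu[0]) ** 2) / r1.sum(),
            np.sum(r2 * (x - new_mu[1]) ** 2) / r2.sum(),
        ]))
        new_sigma = np.maximum(new_sigma, 1e-6)  # guard against collapse
        old = np.concatenate([[lam1], mu, sigma])
        new = np.concatenate([[new_lam1], new_mu, new_sigma])
        lam1, mu, sigma = new_lam1, new_mu, new_sigma
        # Stop when every parameter changed by at most tol (relative).
        if np.all(np.abs(new - old) <= tol * np.abs(old)):
            break
    return lam1, mu, sigma

# Synthetic speeds (km/h) drawn from two well-separated groups.
rng = np.random.default_rng(1)
speeds = np.concatenate([rng.normal(60.0, 4.0, 300), rng.normal(85.0, 5.0, 300)])
lam1, mu, sigma = fit_bimodal_gmm(speeds)
```

With only 40 speed samples per group, as in the experiment described later, the estimates would be considerably noisier than in this synthetic example.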

Deep Learning Methods.
Given the weaknesses of traditional regression models discussed above, RNN (recurrent neural network) and its variants are ideal candidates because of their capability in handling sequential input data. The basic structure of RNN is shown in Figure 10. The output of the former time step is set to be the input of the latter one, and information is transmitted between hidden layers in the time domain. Memory units were constructed based on the transmission functions shown in formulas (10) and (11).
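The recurrence described above can be sketched as a forward pass in NumPy. The sketch assumes the standard Elman-type form, with a tanh hidden state and a linear output, which may differ in detail from formulas (10) and (11); the weight names are hypothetical. The dimensions (7 inputs, 25 hidden nodes, 5 outputs) and the uniform initialization in (−1, 1) follow the settings given later in the paper.

```python
import numpy as np

def rnn_forward(X, Wxh, Whh, bh, Why, by):
    """Run a simple RNN over sequence X of shape (T, D); return (y, h_T)."""
    h = np.zeros(Whh.shape[0])
    for x_t in X:
        # The hidden state of the former step feeds into the current step.
        h = np.tanh(Wxh @ x_t + Whh @ h + bh)
    y = Why @ h + by  # output computed from the last hidden state
    return y, h

rng = np.random.default_rng(0)
D, H, O, T = 7, 25, 5, 80
Wxh = rng.uniform(-1, 1, (H, D))
Whh = rng.uniform(-1, 1, (H, H))
bh = rng.uniform(-1, 1, H)
Why = rng.uniform(-1, 1, (O, H))
by = rng.uniform(-1, 1, O)
X = rng.normal(size=(T, D))       # placeholder input sequence
y, h = rnn_forward(X, Wxh, Whh, bh, Why, by)
```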
However, as the data scale increases, gradient vanishing or explosion becomes a common problem in RNN, and information may get lost. As a special variant of RNN, long short-term memory (LSTM) improves RNN by adding an "input gate," an "output gate," and a "forget gate" to the memory unit to ensure accurate information transmission among the hidden layers.
The structure of the memory cell is shown in Figure 11. The transmission functions among the gates are shown in formulas (12)-(18),
where $i_t$ is the status of the input gate, which controls the information to update from the former memory unit, $f_t$ is the status of the forget gate, which controls the information to forget from the former memory unit, and $o_t$ is the status of the output gate, which controls the information to transmit to the next status. The delicately designed structure of LSTM can handle the problem of gradient vanishing or explosion. But the complex structure may also increase the burden of training, and LSTM is more prone to overfitting.
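The roles of the three gates can be illustrated with one step of a standard LSTM cell. This is a sketch under the assumption that the usual formulation applies (sigmoid gates, a tanh candidate, and an additive cell state); the stacked weight layout and the names `W`, `U`, and `b` are hypothetical conveniences and are not the notation of formulas (12)-(18).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W: (4H, D), U: (4H, H), b: (4H,), stacked as i, f, o, g."""
    H = h_prev.size
    z = W @ x_t + U @ h_prev + b
    i_t = sigmoid(z[:H])           # input gate: how much new content to write
    f_t = sigmoid(z[H:2 * H])      # forget gate: how much old cell state to keep
    o_t = sigmoid(z[2 * H:3 * H])  # output gate: how much cell state to expose
    g_t = np.tanh(z[3 * H:])       # candidate cell content
    c_t = f_t * c_prev + i_t * g_t
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

rng = np.random.default_rng(0)
D, H, T = 7, 25, 80
W = rng.uniform(-1, 1, (4 * H, D))
U = rng.uniform(-1, 1, (4 * H, H))
b = rng.uniform(-1, 1, 4 * H)
h, c = np.zeros(H), np.zeros(H)
for x_t in rng.normal(size=(T, D)):   # placeholder input sequence
    h, c = lstm_step(x_t, h, c, W, U, b)
```

The additive update of `c_t` is what lets gradients pass through many steps with less attenuation than in the plain RNN.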
Another variant of RNN is the GRU (gated recurrent unit). This variant simplifies the structure of the memory unit by coupling the input gate and forget gate of LSTM into an update gate.
The reset gate serves as the output gate to control the recurrent connections, as shown in Figure 12.
The transmission functions among the gates are shown in formulas (19)-(22),
where $r_t$ is the status of the reset gate, which decides the information to update from the former memory unit, and $z_t$ is the status of the update gate, which controls the information to transmit to the next status.
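For comparison with LSTM, one step of a GRU cell can be sketched in the same way. The sketch assumes the commonly used formulation in which the update gate interpolates between the previous hidden state and a candidate state; the sign convention of the interpolation and the weight names are assumptions and may differ from formulas (19)-(22).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, W, U, b):
    """One GRU step. W: (3H, D), U: (3H, H), b: (3H,), stacked as z, r, candidate."""
    H = h_prev.size
    Wz, Wr, Wh = W[:H], W[H:2 * H], W[2 * H:]
    Uz, Ur, Uh = U[:H], U[H:2 * H], U[2 * H:]
    bz, br, bh = b[:H], b[H:2 * H], b[2 * H:]
    z_t = sigmoid(Wz @ x_t + Uz @ h_prev + bz)   # update gate
    r_t = sigmoid(Wr @ x_t + Ur @ h_prev + br)   # reset gate
    h_cand = np.tanh(Wh @ x_t + Uh @ (r_t * h_prev) + bh)
    # Interpolate between the previous state and the candidate state.
    return (1.0 - z_t) * h_prev + z_t * h_cand

rng = np.random.default_rng(0)
D, H, T = 7, 25, 80
W = rng.uniform(-1, 1, (3 * H, D))
U = rng.uniform(-1, 1, (3 * H, H))
b = rng.uniform(-1, 1, 3 * H)
h = np.zeros(H)
for x_t in rng.normal(size=(T, D)):   # placeholder input sequence
    h = gru_step(x_t, h, W, U, b)
```

The GRU keeps three weight blocks per layer where the LSTM keeps four, which is the source of its shorter training time.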

Empirical Analysis
3.1. Experimental Settings. The experiment was conducted on a section of the G15 freeway from Xinchang to Baihe in Zhejiang Province, China. The 30 km test segment is a 2-lane mountainous freeway with a speed limit of 100 km/h, as shown in Figure 13. Firstly, 3D mobile mapping was applied to the road section to obtain information on road geometry and 3D ASD. Secondly, independent repeated driving simulation experiments were designed to reflect the characteristics of the driver group. In order to obtain sufficient training data for the fitting of the speed distribution on any road section, a total of 40 freight vehicle drivers were recruited to take part in the experiment. The information of the participants is shown in Table 1. Considering that there are a certain number of female drivers of freight vehicles in China, the sample group contained 7 female participants to represent the real condition of the driver group of freight vehicles. The road environment model was built in SCANER based on the point cloud model. The vehicle type was set to be a four-axle freight vehicle.
Each participant was asked to drive through the road section once (a total of 40 independent repeated experiments).
The participants were required to drive at a safe, reasonable, and comfortable speed according to the road environment on the SILAB driving simulator of Tongji University. In total, 3000 groups of data were collected, with 40 sample points of driving speed in each group.
Thirdly, bimodal GMM fitting was conducted on each group of data. An example of bimodal GMM fitting of the speed distribution is shown in Figure 14.
As shown in Figure 14, the histogram represents the observed frequency of driving speed. The probability density curve in the solid line represents the fitted bimodal GMM, while the two dotted lines represent the two component distributions in the GMM. The parameters of the fitted bimodal GMM are shown in Table 2.

Construction and Training of Deep Learning Models.
Considering the stochastic characteristic of neural networks, optimized structure does not necessarily mean better performance.Tests are needed to verify the applicability of the three mentioned models in the prediction of driving speed.
The three models were constructed with the same structure in Python, and their performance was then compared on the same training dataset.
For the basic structure of the model, a fully connected model was selected to improve the potential performance of the model while avoiding the limitations of artificial intervention.
For the input variables of the models, a total of 7 variables were selected to form the input vector, including road alignment features and 3D ASD, as shown in Table 3.
For the output variables of the model, considering that driving speed was described in terms of a bimodal GMM, the output variables were set to be the parameters of the GMM, as shown in Table 4.
Since $\lambda_2 = 1 - \lambda_1$, there is no need for $\lambda_2$ to take an additional spot in the output vector.
For the selection of the number of nodes in the hidden layer, there is no universal method for determining the number of nodes in a neural network. Generally, the number of samples in the training dataset should be 2-10 times the number of connection weights in the hidden layers. Considering the number of nodes in the input and output layers, it is appropriate to set the number of nodes in the hidden layer to 25.
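The rule of thumb above can be checked with a short calculation. The sketch assumes one possible reading of the rule, in which only the input-to-hidden and hidden-to-output connection weights are counted and the recurrent weights and biases are ignored; the function name is hypothetical. The 7 inputs, 25 hidden nodes, and 2700 training groups are the values reported in this paper, and the 5 outputs correspond to the five GMM parameters.

```python
def samples_per_weight(n_train, n_in, n_hidden, n_out):
    """Return (number of feed-forward weights, training samples per weight)."""
    n_weights = n_in * n_hidden + n_hidden * n_out
    return n_weights, n_train / n_weights

n_weights, ratio = samples_per_weight(n_train=2700, n_in=7, n_hidden=25, n_out=5)
print(n_weights, ratio)  # 300 9.0
```

Under this reading, 25 hidden nodes give 300 weights and a ratio of 9, which lies inside the recommended range of 2-10. Counting the recurrent and gate weights of LSTM or GRU would give a much smaller ratio.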
For the selection of the number of hidden layers, it has been shown that more hidden layers could improve the accuracy of the model, but the generalization ability would decrease at the same time. Generally, the benefit of increasing the number of nodes is larger than the benefit of increasing the number of hidden layers. Therefore, the number of hidden layers is set to 1 in this study.
For the selection of loss function, mean relative error (MRE), mean square error (MSE), and mean absolute error (MAE) are the representative ones to measure the accuracy of NN models.However, the aim of the models applied in this study is to predict an accurate distribution of speed.Prediction accuracy of any isolated parameters in the GMM will be less significant to reveal the performance of the model.It will be more appropriate to measure the precision of the predicted distribution by measuring its divergence with the actual distribution.
Therefore, the Kullback-Leibler divergence (KL divergence) was selected as the loss function considering its physical significance in distribution comparison. The loss function is shown in formulas (23) and (24).
The value of the function represents the difference between the distributions P(x) and Q(x).
Since KL(P ‖ Q) ≠ KL(Q ‖ P), P(x) was set to be the actual speed distribution and Q(x) the predicted speed distribution.
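Since the KL divergence between two Gaussian mixtures has no closed form, a numerical evaluation is one way to illustrate the loss. The sketch below uses the definition $KL(P \| Q) = \int P(x) \ln\frac{P(x)}{Q(x)}\,dx$ with trapezoidal integration on a speed grid from 0 to 160 km/h; the grid, the parameter ordering `(lam1, mu1, sigma1, mu2, sigma2)`, and the example values are assumptions for illustration, not results from the study.

```python
import numpy as np

def gmm_pdf(x, lam1, mu1, sigma1, mu2, sigma2):
    """Bimodal GMM density with weights lam1 and 1 - lam1."""
    def normal(x, mu, sigma):
        return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return lam1 * normal(x, mu1, sigma1) + (1.0 - lam1) * normal(x, mu2, sigma2)

def kl_divergence(p_params, q_params, lo=0.0, hi=160.0, n=4001):
    """Numerically approximate KL(P || Q) for two bimodal GMMs."""
    x = np.linspace(lo, hi, n)
    p = gmm_pdf(x, *p_params)
    q = gmm_pdf(x, *q_params)
    eps = 1e-300  # avoids log(0) where a density underflows
    integrand = p * (np.log(p + eps) - np.log(q + eps))
    dx = x[1] - x[0]
    return np.sum(0.5 * (integrand[:-1] + integrand[1:])) * dx

actual = (0.4, 60.0, 4.0, 85.0, 5.0)      # hypothetical actual distribution P
predicted = (0.5, 62.0, 5.0, 83.0, 6.0)   # hypothetical predicted distribution Q
kl_pq = kl_divergence(actual, predicted)
kl_qp = kl_divergence(predicted, actual)
kl_pp = kl_divergence(actual, actual)
```

The two directions give different values, which is why the text fixes P as the actual and Q as the predicted distribution. Using this quantity as a training loss would additionally require an implementation in a framework with automatic differentiation.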
For the selection of the length of the input sequence, freight vehicle drivers' perception and memorization should be taken into consideration. A short input sequence may not be able to cover all the necessary information, while a longer input sequence may cause redundancy. It has been shown that short-term memory lasts for no longer than 20 s. Assuming a driving speed of 90 km/h (25 m/s) on a mountainous freeway, drivers will remember the information of the previous 500 m of road. In order to find an appropriate length of the input sequence, five different lengths, namely 100 m, 200 m, 300 m, 400 m, and 500 m, were tested. The bias vectors and weight matrices are initialized randomly in (−1, 1), and the optimization of the weight matrices is completed by backpropagation through time (BPTT).
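The relation between memory span and sequence length, and the construction of input sequences of a given length, can be sketched as follows. The 5 m step between consecutive road samples is an assumption borrowed from the sampling interval of the road geometry data; the function name and the random placeholder arrays are hypothetical.

```python
import numpy as np

speed_ms = 90.0 * 1000.0 / 3600.0   # 90 km/h expressed in m/s
memory_span_m = 20.0 * speed_ms     # distance covered in 20 s

def make_windows(features, targets, window_m, step_m=5.0):
    """Build sliding windows of road features ending at each prediction point.

    features: (N, D) array sampled every step_m metres along the road.
    targets:  (N, K) array of GMM parameters at the same locations.
    Returns X of shape (M, T, D) and y of shape (M, K), T = window_m / step_m.
    """
    n_steps = int(round(window_m / step_m))
    X = np.stack([features[i - n_steps:i] for i in range(n_steps, len(features) + 1)])
    y = targets[n_steps - 1:]
    return X, y

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 7))   # placeholder: 7 input variables
targets = rng.normal(size=(200, 5))    # placeholder: 5 GMM parameters
X, y = make_windows(features, targets, window_m=400.0)
```

Under this assumption a 400 m input sequence corresponds to 80 time steps per sample.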
According to the convention of the training of neural networks, the proportions of the training set could be 60-90% considering the scale of the dataset [53].Considering that there are totally 3000 groups of data in the dataset, 2700 groups of data were used for the training and establishment of the model, and the remaining 300 groups were used to test the performance and validate the model.

Results
The comparison of the three models with different lengths of input sequence is shown in Table 5; the values in the table refer to the average KL divergence between the predicted distribution and the actual distribution. An example of the distribution predicted by LSTM with an input sequence length of 400 m, together with the actual distribution, is shown in Figure 15. The parameters of the actual distribution and the predicted distribution are shown in Table 6.
It can be found that each type of network is applicable to fitting the speed prediction model, with the minimum value of the loss function being no more than 0.32. The best performance belongs to LSTM with an input sequence of 400 m, with its KL divergence reaching 0.11. It is also worth noticing that LSTM and GRU always outperformed RNN in accuracy for the same length of input sequence. It can be inferred that the additional gates added to the RNN could partly solve the problem of time dependency. Among the three, LSTM holds the best performance throughout. The simplified structure of GRU resulted in its poorer performance compared with LSTM.
Besides, supposing the driving speed to be 90 km/h (25 m/s), a length of 400 m corresponds to a memorization span of 16 s, which means that drivers are able to remember the information of road geometry for about 16 s on a mountainous freeway. The result is highly correlated with the terrain condition on mountainous freeways. The character of the road geometry is likely to change within 400 m on a mountainous freeway, which is a relatively large burden for freight vehicle drivers. So, it can be explained that drivers are likely to adapt to the oncoming road condition and forget the previous information if there is a significant difference between adjacent sections.
The length of the input sequence and the time span may change under different road conditions.
In order to validate the proposed LSTM model, the study applied multiple linear regression to make a comparative study. Since there are five parameters in the bimodal GMM, a separate multiple linear regression model was constructed for each parameter, and the adjusted $R^2$ values of the models are shown in Table 7. The regression models can still help to understand the mechanism of the road environment affecting vehicle speed. Taking the multiple linear regression model predicting $\mu_1$ as an example, as shown in Table 8, at the significance level of 0.05, horizontal curvature, horizontal curve length, and 3D available sight distance have a significant influence on one of the mean values of the GMM. The coefficients of the three variables can be interpreted as follows: an increase in horizontal curvature, horizontal curve length, or 3D available sight distance leads to a higher speed, and horizontal curvature has the most significant impact among the three variables.
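The regression baseline and its adjusted $R^2$ can be sketched with ordinary least squares in NumPy. The sketch uses the standard definition $\bar{R}^2 = 1 - (1 - R^2)\frac{n - 1}{n - p - 1}$ for $n$ samples and $p$ predictors; the function name and the synthetic data are hypothetical and do not reproduce the values in Tables 7 and 8.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Ordinary least squares with intercept; returns (coefficients, adjusted R^2)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = residuals @ residuals
    ss_tot = np.sum((y - y.mean()) ** 2)
    n, p = X.shape
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return beta, adj_r2

# Synthetic example: one informative predictor out of three.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 3))
y_demo = 2.0 + 3.0 * X_demo[:, 0] + rng.normal(0.0, 0.01, 100)
beta, adj_r2 = fit_linear_regression(X_demo, y_demo)
```

In the study, one such model would be fitted per GMM parameter, each using the road geometry variables at a single location as predictors, which is why the sequence information is not available to the regression.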

Conclusion
Targeting the weaknesses of traditional speed prediction models, this study applies several types of deep learning methods based on RNN to predict speed distribution on mountainous freeways. RNN, LSTM, and GRU were trained and tested, respectively, based on a driving simulation dataset. The results suggest that
(1) It is reasonable to apply a bimodal GMM to characterize the speed distribution of freight vehicles, which improves on the traditional method of using a single normal distribution or lognormal distribution.
(2) RNN, LSTM, and GRU are all applicable to speed prediction of freight vehicles in consideration of road geometry. LSTM outperforms RNN and GRU on most tasks, and GRU shows a slight advantage over RNN on this specific issue.
(3) The length of the input sequence has a significant impact on the training results of the models, which demonstrates the necessity of applying RNN and its variants. Drivers of freight vehicles are shown to be affected by the consecutive road condition in their driving process. On mountainous freeways, drivers are likely to update their memorization of the road condition with a threshold of 16 s, which is 400 m in driving distance. The change of road geometry may be the main reason for the time span. The value of the [...] This could also explain why mountainous freeways always have accident-prone road sections with a heavy information load for freight vehicle drivers.
(4) The accurate road geometry information provided by 3D mobile mapping may help to improve the reliability of driving simulation studies and the results of the speed prediction model.
(5) LSTM is clearly superior to the regression models in terms of model accuracy and interpretability of the driving process and the formation of vehicle speed. Further research on regression models could be conducted targeting the mechanism of the road environment affecting vehicle speed.
The findings contribute to a better understanding of freight vehicle drivers' driving behavior on mountainous freeways.
The study also provides important insights into road geometry design and the development of transportation safety strategies.
In future work, more elements of the road environment and driving behavior could be included in the model. Effort could be put into freight vehicle drivers' memorizing mechanism of the road environment, and an analysis of the information load for drivers could also be made.

Figure 8: Logical layer of road environment.

Figure 9: Model layer of road environment.

Figure 12: Structure of the memory unit in GRU.

Figure 15: An example of the predicted distribution and actual distribution.
Figure 10: Structure of RNN and the memory unit.

Table 2: Parameters in the bimodal GMM fitting example.

Table 3: Classification and explanation of input variables.

Table 4: Classification and explanation of output variables.

Table 1: Information of participants.

Table 5: KL divergence of the deep learning models with different lengths of input sequence.

Table 7 shows that multiple linear regression is weaker in model accuracy and interpretability than the LSTM model: the adjusted $R^2$ values of the models are all smaller than 0.3, which means that it is difficult for the models to predict the parameters with precision. Meanwhile, the structure of the model is inferior to LSTM in interpreting the driving process, since the isolated input variables fail to reflect the sequence characteristics of the driving process.

Table 6: Parameters in the predicted distribution and the actual distribution.

Table 7: Adjusted $R^2$ of the multiple linear regression models.

Table 8: Coefficient table of the multiple linear regression model predicting $\mu_1$.