Quantification of Rear-End Crash Risk and Analysis of Its Influencing Factors Based on a New Surrogate Safety Measure

The Key Laboratory of Road and Traffic Engineering, Ministry of Education, Tongji University, Shanghai 201804, China; College of Transportation Engineering, Tongji University, 4800 Cao’an Highway, Shanghai 201804, China; Investment and Development Department, China Shandong International Economic & Technical Cooperation Group Ltd., 1822A Shandong Hi-speed Group Mansion, Jinan 250098, Shandong, China


Introduction
Statistics from the World Health Organization (WHO) show that road traffic crashes cause about 1.35 million deaths each year, ranking eighth among all causes of death [1]. The serious consequences of traffic crashes have driven researchers to investigate their causes. Among the many causes, driving behavior has been found to be a crucial one. For example, a study conducted by the National Highway Traffic Safety Administration (NHTSA) found that driver-related factors account for 94% of the critical reasons for crashes, and most studies indicate that traffic crashes largely result from risky driving behaviors [2][3][4]. To reduce casualties and mitigate injuries from traffic crashes, understanding and identifying crash risk is essential.
Among different types of traffic crashes, rear-end crashes are recognized as one of the most common [5,6]. Statistics from the NHTSA show that rear-end crashes account for 32.4% of all crash types that cause personal injury [7]. Since most rear-end crashes occur in car-following situations, it is crucial to identify rear-end crash risk during the car-following process and to explore its influencing factors [8][9][10].
Despite the efforts on rear-end crash risk identification and analysis, several research gaps still exist. Measures including time-to-collision (TTC), stop distance index (SDI), deceleration rate to avoid crash (DRAC), and others have been proposed to study driving risks [11][12][13]. However, TTC, which is based on a constant-velocity assumption, ignores the response of the following vehicle (FV) and the changes in the states of the vehicle pair. Therefore, measures that take the mechanism of driver response and the development of the crash into account should be developed to better represent risk during car-following events. Besides, the other traditional surrogate measures of safety (SMoS) cannot fully reflect crash probability and crash severity at the same time. In addition, driving risks change with the driver's personal characteristics and environmental factors [3][14][15][16]. In-depth research on rear-end crash risk and its influencing factors is essential to formulate effective countermeasures that reduce this risk.
Judging from the previous studies, the traditional SMoS either do not fully consider the crash mechanism or fail to reflect crash probability and crash severity at the same time. In addition, only a few previous studies have used large-scale naturalistic driving data and comprehensively considered driver heterogeneity, behavioral characteristics, and environmental characteristics to study the influencing factors of rear-end crash risk from the perspective of microscopic car-following behavior. Therefore, this study aims to propose a reliable measure to quantify driving risk in the car-following process and to investigate the impact of various influencing factors on rear-end crash risk while considering driver heterogeneity. For that purpose, a new rear-end crash risk index (RCRI) was introduced, which fully considers the crash mechanism and integrates crash probability and crash severity. A total of 16,905 car-following events were extracted from the Shanghai Naturalistic Driving Study (SH-NDS). Crash risks under different influencing factors were analyzed and compared based on the proposed RCRI. Then, a mixed-effects linear regression model was employed to study the impact of behavioral characteristics and environmental factors on rear-end crash risk.

Overview of Studies on Surrogate Measures of Safety (SMoS).
High-risk car-following behaviors, such as driving close to the leading vehicle (LV), may lead to a high probability of an accident [17]. Based on temporal and spatial proximity, various rear-end crash risk indexes have been proposed to evaluate driving risks. Among time-based surrogate measures, time-to-collision (TTC) has been widely used in practice [11]. Meanwhile, the risk of rear-end crash also depends on the driver's crash avoidance behavior: the crash can be avoided if the FV brakes in time. Accordingly, the deceleration rate to avoid crash (DRAC) was introduced to evaluate the braking requirement during vehicle conflicts and thereby quantify the risk of rear-end crash [13]. Besides, maintaining a safe distance between the LV and the FV is the key to avoiding rear-end crashes. Therefore, the stop distance index (SDI), which is based on the concept of the safe stopping distance of the FV, was proposed by Oh et al. [12]. To mitigate the risk of rear-end crash, the FV should maintain a safe headway distance from the LV.
Although the traditional SMoS have been widely used to quantify the risk of rear-end crash, they also have some limitations. Kuang et al. [18] summarized three main limitations of the SMoS presented above: (i) the driver's response characteristics when experiencing a conflict are not considered; (ii) due to the boundary conditions, any situation where the speed of the FV is lower than the speed of the LV is considered safe, which may be unreasonable when the vehicles are traveling at similar speeds with a small headway distance; and (iii) the arbitrary selection of the index threshold can make the result inaccurate.
Researchers have also proposed some new surrogate measures to address the abovementioned problems. Xie et al. [19] proposed the time-to-collision with disturbance (TTCD) to quantify the risk of rear-end crash. This indicator solves the problem of inaccurate risk identification when the speed of the FV is less than the speed of the LV. Besides, Shi et al. [20] derived a new hybrid indicator named key risk indicator (KRI), which integrates time-integrated time-to-collision (TIT), crash potential index (CPI), and stopping sight distance (PSD). However, these surrogate measures do not overcome all three limitations above. In view of this, Kuang et al. [18] proposed a tree-structured surrogate measure for evaluating rear-end crash risk, namely, the aggregated crash index (ACI), which considers many major factors in rear-end crashes, including the disturbance of the LV, the driver's reaction characteristics, and the available braking capacity of the FV. However, its complicated leaf nodes make this method inconvenient to apply. In addition, this measure only addresses the occurrence probability of a rear-end crash without considering the severity of the crash. Therefore, it is crucial to propose a new SMoS that addresses the above limitations and better quantifies the risk of rear-end crash.

Overview of Studies on Driving Risk-Related Factors.
Driving risks are often considered to be related to multiple factors, including the driver's individual characteristics and environmental factors [3][14][15][16]. Several studies have explored the interaction between driving risks and various influencing factors. Table 1 summarizes these previous studies, which fall into three main categories based on experimental methods: survey-based studies, driving simulation studies, and naturalistic driving studies.
Most of the initial research is based on reported survey data. However, the main problem is that the sample size is limited and the data may be highly subjective. In addition, survey-based research pays more attention to the influence of driver characteristics on driving risks and ignores driving environmental factors. Alternatively, owing to their high safety, controllability, and comprehensive data acquisition, driving simulators are used to study driver behavior characteristics to improve driving safety, especially in safety-critical conditions. However, studies based on driving simulation experiments are mostly limited to specific behavior in a limited number of safety-critical scenarios and cannot truly reflect the external driving environment [21]. Research based on naturalistic driving data represents real-world driving situations. It is possible to extract significant driving behavior parameters from naturalistic driving data, such as speed, acceleration, relative position with surrounding vehicles, and environmental conditions, to study the influencing factors of driving risk [22]. The collected data are more comprehensive and effective and may provide more valid results.

Table 1 (entries as recovered from the source)
Objective: To investigate the influence of driver's personality characteristics on risky driving behavior. Method: Regression analyses. Findings: Risky driving has a greater correlation with emotion recognition and expression levels, and less correlation with age.
Harbeck et al. [24]. Objective: To examine risky driving in relation to psychological variables. Method: Statistical analyses. Findings: The results of the proposed model show that young drivers' risky driving is related to risk perception, response cost, and rewards.
Ulleberg and Rundmo [25]. Objective: To understand the underlying mechanism of risky driving behavior through the combination of personality traits and social cognitive methods.
Method: Survival analysis. Findings: The results show that lower visibility leads to higher rear-end crash risk, and road alignment has a significant impact on crash risk.
Precht et al. [28]. Objective: To identify the main influencing factors contributing to driving risks. Data: Naturalistic driving data, 108 trip segments. Method: Generalized linear mixed models (GLMMs). Findings: Driving violations are related to anger, the presence of passengers, and personal differences. In addition, secondary tasks that distract the driver's visual attention and complex driving tasks are associated with high driving risks.
Pnina et al. [14]. Objective: To investigate the interactions between driving context and their associations with risky driving behaviors of young novice drivers. Method: Poisson regression analyses. Findings: Driving one's own vehicle is associated with a higher rate of risky driving than driving a shared vehicle, and driving during the day has a higher risky driving rate than driving at night.
Chen et al. [29]. Objective: To explore the contributing factors to crash risk during lane-changing process.

Brief Introduction of the Shanghai Naturalistic Driving Study (SH-NDS).
The real-world driving data used in this paper were collected by the SH-NDS, jointly conducted by Tongji University, General Motors (GM), and the Virginia Tech Transportation Institute (VTTI) from 2012 to 2016 [30,31]. The 60 drivers participating in this naturalistic driving experiment were aged between 35 and 50 years, and all had more than five years of driving experience. The total mileage driven before participating in the experiment was more than 20,000 kilometers, and the average daily mileage was not less than 40 kilometers. Each driver drove the assigned experimental vehicle on the open road network, and the driving route was selected according to the driver's daily travel needs. The video data of the SH-NDS were mainly recorded by four cameras installed in hidden locations that are not easy to observe. The video image is shown in Figure 1; it is composed of the front and rear views of the vehicle, the driver's facial state, and the hand operation image.

Car-Following Event Extraction.
In this study, the car-following events in the SH-NDS were extracted to analyze the influencing factors of rear-end crash risk. The SH-NDS data cover all daily trips of the participating drivers. In total, data from 18,242 trips were collected. In the SH-NDS, data were automatically collected by the data acquisition system, triggered by the ignition switch of the vehicle. Therefore, the database inevitably contains a large number of trip records unrelated to this research (vehicle activities including fueling, car washing, maintenance, and other types of short-distance trip were all recorded). In addition, missing values and outliers exist in the SH-NDS database. Data processing mainly includes the following four steps:
(i) Step 1: eliminate invalid record files. The SH-NDS database contains a large number of short-distance trip records. Considering the distance between adjacent entrances and exits on urban expressways, a valid trip includes at least entering the urban expressway, driving on it, and exiting it, so trip files with a travel time of less than 5 minutes were removed.
(ii) Step 2: remove driving data under discontinuous traffic flow. To extract car-following events under continuous traffic flow, a point-to-point map matching algorithm was used to match the driving trajectory measured by GPS with the electronic map road data and thereby find the trip records on the urban expressway [32]. Then, each trip was verified through the camera video recording to ensure the validity of the driving data selected for analysis. The weather and light conditions were also determined during the verification process.
(iii) Step 3: handle missing values and outliers. Data preprocessing is required to obtain the vehicle's speed and acceleration, the relative space-time distribution of surrounding vehicles, and the traffic environment conditions during the car-following process. Linear interpolation was applied to deal with missing values. Then, outliers were eliminated based on the Pauta criterion (the 3σ rule), and the data were smoothed using a moving average filter.
(iv) Step 4: extract data of car-following events. Car-following events were extracted by applying the automatic extraction algorithm proposed by Zhu et al. [33], which uses the following criteria: (1) radar target's identification number of the LV > 0 and remained constant, guaranteeing that the FV was preceded by the same LV; (2) 7 m < longitudinal distance between the FV and the LV < 120 m, eliminating the congested-flow conditions; (3) lateral distance between the FV and the LV < 2 m, ensuring that the FV and the LV were in the same lane; (4) duration of the car-following event > 15 s, guaranteeing that each car-following event has enough data for analysis.
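The preprocessing in Step 3 can be illustrated with a minimal Python sketch. It is not the SH-NDS pipeline: the function names, the window length of 5 samples, and the choice to re-interpolate flagged outliers are assumptions made for illustration only.

```python
import math
from statistics import fmean, pstdev

def interpolate_missing(values):
    """Fill None/NaN samples by linear interpolation between valid neighbours.
    Leading or trailing gaps take the nearest valid value (an assumption)."""
    xs = [None if v is None or math.isnan(v) else float(v) for v in values]
    valid = [i for i, v in enumerate(xs) if v is not None]
    if not valid:
        raise ValueError("no valid samples")
    out = []
    for i, v in enumerate(xs):
        if v is not None:
            out.append(v)
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            out.append(xs[right])
        elif right is None:
            out.append(xs[left])
        else:
            w = (i - left) / (right - left)
            out.append(xs[left] + w * (xs[right] - xs[left]))
    return out

def remove_outliers_pauta(values, k=3.0):
    """Flag samples farther than k standard deviations from the mean
    (Pauta / 3-sigma rule) and replace them by interpolation."""
    mu, sigma = fmean(values), pstdev(values)
    marked = [None if sigma > 0 and abs(v - mu) > k * sigma else v for v in values]
    return interpolate_missing(marked)

def moving_average(values, window=5):
    """Centered moving average; the window shrinks near the two ends."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    n = len(values)
    return [fmean(values[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]

def preprocess(values, window=5):
    """Interpolate gaps, remove 3-sigma outliers, then smooth."""
    return moving_average(remove_outliers_pauta(interpolate_missing(values)), window)
```

Note that the 3σ rule only flags a point when the series is long enough; with very few samples a single spike inflates the standard deviation too much to be detected.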
After the application of the above steps, 16,905 car-following events (about 135 h of total event duration) were extracted from 1,197 trips. Figure 2 shows the histogram of the duration of car-following events.
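The four extraction criteria can be sketched as a simple scan over time-ordered radar records. This is an illustrative reading of the criteria, not the algorithm of Zhu et al. [33]; the `Record` fields, the 0.1 s sampling interval, and the function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Record:
    target_id: int    # radar target ID of the LV (0 means no target); hypothetical field
    gap_m: float      # longitudinal distance between FV and LV, in metres
    lateral_m: float  # lateral distance between FV and LV, in metres

def is_following(r):
    """Criteria (1) to (3) evaluated on a single record."""
    return r.target_id > 0 and 7.0 < r.gap_m < 120.0 and abs(r.lateral_m) < 2.0

def extract_events(records, dt=0.1, min_duration_s=15.0):
    """Return (start, end) index pairs, end exclusive, of car-following events.
    An event ends when a criterion fails or the LV target ID changes."""
    min_samples = round(min_duration_s / dt)
    events, start = [], None
    for i, r in enumerate(records):
        ok = is_following(r)
        if start is None:
            if ok:
                start = i
            continue
        same_lv = r.target_id == records[start].target_id
        if not (ok and same_lv):
            if i - start > min_samples:  # criterion (4): duration > 15 s
                events.append((start, i))
            start = i if ok else None    # a new LV may start a new event
    if start is not None and len(records) - start > min_samples:
        events.append((start, len(records)))
    return events
```

Counting samples rather than multiplying by `dt` avoids floating-point edge cases at exactly 15 s.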

Methodology
The methodology of this research mainly includes three parts: (i) derivation of a new surrogate measure for rear-end crash risk, (ii) identification of influencing factors for rear-end crash risk, and (iii) mixed-effects linear regression for rear-end crash risk modeling and factor analysis.

Derivation of New Surrogate Measure for Rear-End Crash Risk
During a car-following maneuver, the driver chooses an appropriate speed and a safe headway distance according to the movement status of the LV. The determination of the safe headway distance needs to consider the driver's reaction time and the vehicle's deceleration process; otherwise, a headway distance that is too small may easily lead to a rear-end crash. In this research, we imposed a hypothetical disturbance on the LV, assuming that the LV decelerates at a certain deceleration rate. As can be seen in Figure 3, after the reaction time the FV takes appropriate evasive action based on the initial driving condition and the disturbance to avoid the crash. The crash outcome can be identified by evaluating the initial conditions, the disturbance, the driver's reaction characteristics, and the degree of evasive action [18,34].

Rear-End Crash Risk Index (RCRI).
Risk is the product of the possibility that a hazard event will occur and the consequence of the event. To address the limitations of SMoS mentioned above, we propose a new SMoS named RCRI, which considers the crash probability and crash severity at the same time, to quantify the risk of rear-end crash. According to the changes in the speed and distance of the LV and the FV before the crash, the process of rear-end crash can be divided into four categories, as shown in Figure 4 [35,36].
Then, considering the characteristics of a rear-end crash, namely that the crash is plastic and the two vehicles tend to move together after the crash, the momentum theorem was used to calculate the common speed of the two vehicles after the rear-end crash, that is,

$$m_l v_l^{pre} + m_f v_f^{pre} = (m_l + m_f)\, v_c,$$

where $m_l$ is the mass of the LV, $m_f$ is the mass of the FV, $v_l^{pre}$ is the speed of the LV when the crash occurs, $v_f^{pre}$ is the speed of the FV when the crash occurs, and $v_c$ is the speed of the LV and FV after the crash. The energy loss $\Delta E$ of the two vehicles in the rear-end crash follows from the law of conservation of energy:

$$\Delta E = \frac{1}{2} m_l \left(v_l^{pre}\right)^2 + \frac{1}{2} m_f \left(v_f^{pre}\right)^2 - \frac{1}{2}(m_l + m_f)\, v_c^2 = \frac{m_l m_f}{2(m_l + m_f)} \left(v_f^{pre} - v_l^{pre}\right)^2.$$

Because the energy loss is proportional to the squared speed difference of the two vehicles, this study uses the square of the absolute speed difference (SASD) at the time of the rear-end crash to express the severity of the crash.
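The link between energy loss and the squared speed difference can be checked numerically. The short sketch below evaluates the energy loss in two ways, directly from the pre- and post-crash kinetic energies and from the closed form $m_l m_f (v_f^{pre} - v_l^{pre})^2 / (2(m_l + m_f))$. The vehicle masses and speeds are hypothetical values chosen only for illustration.

```python
def post_crash_speed(m_l, m_f, v_l, v_f):
    """Common speed after a perfectly plastic crash (momentum conservation)."""
    return (m_l * v_l + m_f * v_f) / (m_l + m_f)

def energy_loss(m_l, m_f, v_l, v_f):
    """Kinetic energy before the crash minus kinetic energy after it."""
    v_c = post_crash_speed(m_l, m_f, v_l, v_f)
    before = 0.5 * m_l * v_l**2 + 0.5 * m_f * v_f**2
    after = 0.5 * (m_l + m_f) * v_c**2
    return before - after

def energy_loss_closed_form(m_l, m_f, v_l, v_f):
    """Same quantity written in terms of the squared speed difference."""
    return m_l * m_f / (2.0 * (m_l + m_f)) * (v_f - v_l) ** 2

# Hypothetical example: two 1500 kg cars, LV at 10 m/s, FV at 15 m/s.
loss = energy_loss(1500.0, 1500.0, 10.0, 15.0)
```

For fixed masses, doubling the speed difference quadruples the energy loss, which is why SASD serves as a severity proxy.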
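To simplify the rear-end crash avoidance process, the model adopts two assumptions: (i) the braking process of each vehicle is regarded as a uniform deceleration process and (ii) the FV only brakes to avoid the crash. Therefore, the distances traveled by the two vehicles before the crash can be obtained. While the LV is still moving, that is, when $0 \le t_c \le v_l/a_l$,

$$s_l = v_l t_c - \frac{1}{2} a_l t_c^2,$$

where $s_l$ is the braking distance of the LV, $v_l$ is the speed of the LV before braking, $t_c$ is the time of crash, and $a_l$ is the deceleration rate of the LV. When $t_c \ge v_l/a_l$, the LV has completely stopped, and the corresponding braking stop distance of the LV is

$$s_l = \frac{v_l^2}{2 a_l}.$$

During the entire conflict process, the FV maintains a constant speed $v_f$ during the reaction time $t_R$ and then decelerates at a constant deceleration rate $a_f$. When $0 \le t_c \le t_R$, the braking distance of the FV can be represented as

$$s_f = v_f t_c,$$

where $s_f$ is the braking distance of the FV. When $t_R \le t_c \le t_R + v_f/a_f$, the braking stop distance of the FV is

$$s_f = v_f t_c - \frac{1}{2} a_f (t_c - t_R)^2.$$

Therefore, if the longitudinal distance between the FV and the LV is reduced to zero before the FV has completely stopped, a crash will definitely occur, namely,

$$s_f - s_l = l \quad \text{for some } t_c \le t_R + \frac{v_f}{a_f},$$

where $l$ is the initial gap between the LV and FV.

For scenario 1, where $0 \le t_{c1} \le t_R$ and $0 \le t_{c1} \le v_l/a_l$, when the crash occurs,

$$v_f t_{c1} - \left(v_l t_{c1} - \frac{1}{2} a_l t_{c1}^2\right) = l, \qquad t_{c1} = \frac{-(v_f - v_l) + \sqrt{(v_f - v_l)^2 + 2 a_l l}}{a_l}.$$

For scenario 2, where $v_l/a_l \le t_{c2} \le t_R$, when the crash occurs,

$$v_f t_{c2} - \frac{v_l^2}{2 a_l} = l, \qquad t_{c2} = \frac{2 a_l l + v_l^2}{2 a_l v_f}.$$

For scenario 3, where $0 \le t_{c3} \le v_l/a_l$ and $t_R \le t_{c3} \le t_R + v_f/a_f$, when the crash occurs,

$$v_f t_{c3} - \frac{1}{2} a_f (t_{c3} - t_R)^2 - \left(v_l t_{c3} - \frac{1}{2} a_l t_{c3}^2\right) = l.$$

If $a_l \ne a_f$, the solution (the earliest root) is

$$t_{c3} = \frac{-(v_f - v_l + a_f t_R) + \sqrt{(v_f - v_l + a_f t_R)^2 + (a_l - a_f)(a_f t_R^2 + 2l)}}{a_l - a_f}.$$

If $a_l = a_f$, the solution is

$$t_{c3} = \frac{a_f t_R^2 + 2l}{2\,(v_f - v_l + a_f t_R)}.$$

For scenario 4, where $t_{c4} \ge v_l/a_l$ and $t_R \le t_{c4} \le t_R + v_f/a_f$, when the crash occurs,

$$v_f t_{c4} - \frac{1}{2} a_f (t_{c4} - t_R)^2 - \frac{v_l^2}{2 a_l} = l.$$

The solution is

$$t_{c4} = t_R + \frac{v_f}{a_f} - \frac{1}{a_f}\sqrt{v_f^2 + 2 a_f v_f t_R - \frac{a_f v_l^2}{a_l} - 2 a_f l}.$$

In each scenario, a crash occurs only if the corresponding solution is real and lies within the stated time range. According to the previous studies, the deceleration rate taken by the LV follows a shifted gamma distribution (17.315, 0.128, 0.657), which was suggested and calibrated by Kuang et al. [18]. The reaction time of the FV follows a lognormal distribution (0.17, 0.44), and the braking coordination time is 0.175 s [37].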
The maximum available deceleration rate (MADR) was assumed to follow a truncated normal distribution with a mean of 8.45 m/s² and a variance of 1.4 m/s², truncated between 4.23 m/s² and 12.68 m/s² [38].
The Monte Carlo simulation method was used to randomly draw the deceleration rate taken by the LV, the reaction time of the driver, and the deceleration rate of the FV from the distribution of each parameter given above. According to the initial states of the LV and FV, the crash time ($t$) and the crash consequence (represented by SASD, as discussed before) were calculated based on the above equations. Then, by integrating the possibility and consequences of the crash under the current car-following conditions, the new SMoS named RCRI can be calculated. Because SASD is dimensional, it needs to be normalized to a value between 0 and 1 before calculating the RCRI; the calculated RCRI therefore also ranges between 0 and 1:

$$RCRI_i = \frac{1}{N} \sum_{j=1}^{N} crash_{ij} \cdot SASD_{ij},$$

where $RCRI_i$ represents the risk of rear-end crash at the $i$-th moment (0.1 s) in the car-following scenario; $N = 10{,}000$ is the number of random samples generated by the Monte Carlo simulation; and $SASD_{ij}$ is the normalized SASD of the $j$-th sample. When the crash time $t$ has a solution, that is, a crash occurs, $crash_{ij} = 1$; otherwise, $crash_{ij} = 0$.
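The Monte Carlo procedure described above can be sketched in Python as follows. This is an illustrative sketch under several assumptions that the text does not confirm: the shifted gamma parameters are read as (shape, scale, shift); the lognormal parameters as the mean and standard deviation of the log reaction time; the braking coordination time is added to the reaction time; the spread of 1.4 m/s² is used as a standard deviation; and SASD is normalized by the square of a hypothetical reference speed `v_ref` and capped at 1. The crash time is found by solving the piecewise-quadratic gap equation on each phase interval rather than by the scenario-specific closed forms.

```python
import math
import random

def _first_root(A, B, C, lo, hi):
    """Smallest root of A*t^2 + B*t + C = 0 inside [lo, hi], else None."""
    roots = []
    if abs(A) < 1e-12:
        if abs(B) > 1e-12:
            roots.append(-C / B)
    else:
        disc = B * B - 4.0 * A * C
        if disc >= 0.0:
            sq = math.sqrt(disc)
            roots += [(-B - sq) / (2.0 * A), (-B + sq) / (2.0 * A)]
    inside = [t for t in roots if lo - 1e-9 <= t <= hi + 1e-9]
    return min(inside) if inside else None

def crash_outcome(gap, v_l, v_f, a_l, a_f, t_r):
    """Return (t_c, SASD) at the first time the gap closes, or None if no crash.
    a_l and a_f are positive deceleration magnitudes; gap is assumed positive."""
    t_stop_l = v_l / a_l
    t_stop_f = t_r + v_f / a_f
    extra = {t_stop_l} if t_stop_l < t_stop_f else set()
    points = sorted({0.0, t_r, t_stop_f} | extra)
    for lo, hi in zip(points, points[1:]):
        mid = 0.5 * (lo + hi)
        # Distance polynomials (t^2, t, constant) valid on this interval.
        lv = (-0.5 * a_l, v_l, 0.0) if mid < t_stop_l else (0.0, 0.0, v_l**2 / (2.0 * a_l))
        fv = (0.0, v_f, 0.0) if mid < t_r else (-0.5 * a_f, v_f + a_f * t_r, -0.5 * a_f * t_r**2)
        # Remaining gap: gap + s_l(t) - s_f(t).
        t_c = _first_root(lv[0] - fv[0], lv[1] - fv[1], gap + lv[2] - fv[2], lo, hi)
        if t_c is not None:
            v_l_c = max(v_l - a_l * t_c, 0.0)
            v_f_c = v_f if t_c <= t_r else max(v_f - a_f * (t_c - t_r), 0.0)
            return t_c, (v_f_c - v_l_c) ** 2
    return None

def sample_madr(rng, mean=8.45, sd=1.4, lo=4.23, hi=12.68):
    """Truncated normal draw by rejection sampling (sd is an assumption)."""
    while True:
        x = rng.gauss(mean, sd)
        if lo <= x <= hi:
            return x

def rcri(gap, v_l, v_f, n=10_000, v_ref=33.3, seed=0):
    """Mean of crash indicator times normalized SASD over n random draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        a_l = rng.gammavariate(17.315, 0.128) + 0.657   # LV disturbance
        t_r = rng.lognormvariate(0.17, 0.44) + 0.175    # reaction + coordination
        a_f = sample_madr(rng)                          # FV braking capacity
        outcome = crash_outcome(gap, v_l, v_f, a_l, a_f, t_r)
        if outcome is not None:
            total += min(outcome[1] / v_ref**2, 1.0)
    return total / n
```

Calling `rcri(gap, v_l, v_f)` once per 0.1 s record of an event and averaging the results would give the event-level mean used later as the dependent variable.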

Identification of the Influencing Factors for Rear-End Crash Risk

Definition of the Variables.
As discussed in previous studies, driving risk is mainly affected by the driver's operational characteristics and the external driving environment [3][14][15][16]. To quantify the various influencing factors for risk in the car-following process, this study extracted three categories of variables: behavioral variables, temporal variables, and environmental variables, as shown in Table 2. The variables describing the driver's car-following behavior cover the duration of the car-following event, the time headway, and the driving speed and acceleration of the LV and FV. To eliminate the influence of the absolute value of speed on the modeling, this study adopted as the speed indicator the average speed difference (ASD) between the LV and FV, that is, the difference between $v_l(t)$ and $v_f(t)$ averaged over the records of a car-following event, where $v_l(t)$ and $v_f(t)$ are the instantaneous speeds of the LV and FV at the $t$-th record, $t \in T$, and $T$ is the duration of the car-following event.
Acceleration difference ratio (ADR) refers to the ratio of the standard deviation of the acceleration of the FV to that of the LV. The ADR of one car-following event can be calculated as

$$ADR = \frac{\sigma_f}{\sigma_l},$$

where $\sigma_f$ and $\sigma_l$ are the standard deviations of the acceleration of the FV and LV during one car-following period.

Peak hours increase the likelihood of congestion, resulting in a shortage of driving space, which in turn breeds risky driving behaviors. To consider the possible impact of the peak period, the car-following events within a day are divided into three categories: morning peak, evening peak, and off peak. The morning peak refers to 7:00 to 9:00, while the evening peak refers to 17:00 to 19:00. Besides, the traffic density was determined based on the speed of the FV and the forward camera video recording, as described in Yang et al. [39].
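The behavioral and temporal variables defined above can be computed per event with a few lines of Python. The sketch makes assumptions that the text leaves open: the ASD is taken as the signed mean of LV speed minus FV speed, the population standard deviation is used for the ADR, and the peak windows include their start hour but not their end hour. All function names are illustrative.

```python
from statistics import fmean, pstdev

def average_speed_difference(v_lv, v_fv):
    """Mean of (LV speed - FV speed) over the records of one event.
    The sign convention is an assumption, not taken from the source."""
    return fmean([vl - vf for vl, vf in zip(v_lv, v_fv)])

def acceleration_difference_ratio(a_lv, a_fv):
    """Standard deviation of FV acceleration divided by that of LV acceleration."""
    return pstdev(a_fv) / pstdev(a_lv)

def time_of_day(hour):
    """Classify an event by its start hour (boundary handling is an assumption)."""
    if 7 <= hour < 9:
        return "morning peak"
    if 17 <= hour < 19:
        return "evening peak"
    return "off peak"
```

An ADR above 1 means the FV's acceleration fluctuates more than the LV's over the event.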
To examine the effect of these independent variables on rear-end crash risk, the mean RCRI of a single car-following event was used as the dependent variable in this study.

Statistical Description of the Variables.
As mentioned, we extracted 16,905 car-following events from 1,197 trips. Table 3 presents the descriptive statistics of the continuous variables and frequency information of the discrete independent variables in this study.

Mixed-Effects Linear Regression for Rear-End Crash Risk Modeling and Factor Analysis.
In this research, all drivers participated in multiple car-following events, and car-following behavior variables were repeatedly collected from each participant. Therefore, repeated observations from the same driver are correlated (within-cluster correlation), and this problem can be solved by applying a mixed-effects linear regression model [30,40]. The mixed-effects linear regression model is an extension of the linear regression model that includes both fixed effects and random effects. Compared with the ordinary linear regression model, the mixed-effects linear regression model can control for the influence of the driver's personal factors. The results of the model reflect the commonality of drivers while accounting for the internal correlation between samples, which makes the model more suitable for the research problem.
In this paper, the fixed effects are the independent variables that the research focuses on. Besides, the drivers were treated as random effects to address the problem of within-cluster correlation.
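Formally, the mixed-effects linear regression model can be written as

$$y = X\beta + Z\mu + \varepsilon,$$

where $y \in \mathbb{R}^{N \times 1}$ is the dependent variable; $X \in \mathbb{R}^{N \times p}$ is a matrix of the independent variables; $\beta \in \mathbb{R}^{p \times 1}$ is the vector of coefficients of the fixed effects; $Z \in \mathbb{R}^{N \times q}$ is the matrix for random effects; $\mu \in \mathbb{R}^{q \times 1}$ is the vector of coefficients of the random effects; and $\varepsilon \in \mathbb{R}^{N \times 1}$ is a column vector of the residuals. To recap,

$$\underbrace{y}_{N \times 1} = \underbrace{X}_{N \times p}\,\underbrace{\beta}_{p \times 1} + \underbrace{Z}_{N \times q}\,\underbrace{\mu}_{q \times 1} + \underbrace{\varepsilon}_{N \times 1}.$$

To better understand the structure of the model, here we provide an example where 16,905 ($N$) car-following events were collected from 58 ($q$) drivers. Our outcome $y$ is the risk of a car-following event. As mentioned above, we have 11 fixed-effect predictors. With $p = 11$, the vectors and matrices in the previous equations have the following dimensions:

$$y \in \mathbb{R}^{16905 \times 1}, \quad X \in \mathbb{R}^{16905 \times 11}, \quad \beta \in \mathbb{R}^{11 \times 1}, \quad Z \in \mathbb{R}^{16905 \times 58}, \quad \mu \in \mathbb{R}^{58 \times 1}, \quad \varepsilon \in \mathbb{R}^{16905 \times 1}.$$

The random effects $\mu$ in the regression model form a column vector containing random intercepts. However, it is not necessary to estimate $\mu$ in actual regression modeling. Instead, the model assumes that $\mu$ follows a normal distribution with a mean of zero and a variance of $\sigma^2$:

$$\mu \sim \mathcal{N}(0, \sigma^2).$$

Parameters of all components were estimated using the mixed procedure in Stata/MP 16.0. The statistical significance level was set at 0.05.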
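To make the roles of $X$, $Z$, and $\mu$ concrete, the sketch below builds the random-intercept indicator matrix from driver labels and simulates outcomes from the model form $y = X\beta + Z\mu + \varepsilon$. It only illustrates the structure; it does not fit the model and is not the Stata procedure used in the study. The driver labels, coefficient values, and standard deviations are hypothetical.

```python
import random

def random_intercept_design(driver_ids):
    """Return the sorted driver list and the N x q indicator matrix Z,
    where Z[i][j] = 1 if event i belongs to driver j and 0 otherwise."""
    drivers = sorted(set(driver_ids))
    col = {d: j for j, d in enumerate(drivers)}
    Z = [[1.0 if col[d] == j else 0.0 for j in range(len(drivers))]
         for d in driver_ids]
    return drivers, Z

def simulate(X, beta, Z, sigma_u, sigma_e, seed=0):
    """Draw y = X beta + Z mu + eps with mu ~ N(0, sigma_u^2), eps ~ N(0, sigma_e^2)."""
    rng = random.Random(seed)
    mu = [rng.gauss(0.0, sigma_u) for _ in range(len(Z[0]))]
    y = []
    for x_row, z_row in zip(X, Z):
        fixed = sum(x * b for x, b in zip(x_row, beta))
        intercept = sum(z * u for z, u in zip(z_row, mu))  # picks one driver's mu
        y.append(fixed + intercept + rng.gauss(0.0, sigma_e))
    return y, mu

# Hypothetical toy data: 5 events from 3 drivers, 2 fixed-effect predictors.
ids = ["d1", "d1", "d2", "d3", "d2"]
drivers, Z = random_intercept_design(ids)
X = [[1.2, 0.0], [0.8, 1.0], [1.5, 0.0], [2.0, 1.0], [1.1, 1.0]]
y, mu = simulate(X, beta=[-0.001, 0.0005], Z=Z, sigma_u=0.001, sigma_e=0.0005)
```

Events from the same driver share one entry of `mu`, which is what induces the within-cluster correlation that the random intercept is meant to absorb.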

Rear-End Crash Risk Identification Using SH-NDS Data and RCRI.
To better illustrate the rear-end crash risk identification measure proposed in this study, TTC, DRAC, SDI, and RCRI were all employed to quantify the risk for one car-following event. According to the previous studies, the thresholds of each SMoS are chosen as follows. The threshold of TTC is normally chosen as 3 s [41], and the threshold of DRAC is 3.4 m/s² [42]. In the SDI calculation, it is assumed that the deceleration rate is 3.3 m/s² and the reaction time is 1.0 s [20]. A portion of the vehicle movement data during one car-following event is presented in Table 4. The risks identified at intervals of 0.1 s are shown in Figure 5.
As mentioned in the methodology, the RCRI is calculated based on the assumed disturbance. This measure can be used to quantify the risk in any scenario, even when the LV's speed is greater than that of the FV. Besides, the RCRI takes into account the most critical variables in crash mechanisms, such as driver reaction characteristics and vehicle braking performance, and comprehensively considers the crash probability and consequences. Moreover, the RCRI is a continuous variable, so it has better flexibility to quantify the real-time change of rear-end crash risk. In contrast, the risk quantification results based on TTC, DRAC, and SDI are binary (dummy) variables. Therefore, the RCRI can more accurately represent the risk and has wider applicability.
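The binary nature of the threshold-based measures can be seen in a small sketch of TTC and DRAC. The formulas are the commonly used constant-velocity forms, stated here as an assumption because the text does not spell them out; the gap is taken as the bumper-to-bumper distance, and the thresholds are the 3 s and 3.4 m/s² values quoted above.

```python
import math

def ttc(gap_m, v_f, v_l):
    """Time-to-collision under constant speeds; infinite when the FV is not closing."""
    closing = v_f - v_l
    return gap_m / closing if closing > 0 else math.inf

def drac(gap_m, v_f, v_l):
    """Deceleration rate to avoid crash, in one common form: closing^2 / (2 * gap)."""
    closing = v_f - v_l
    return closing**2 / (2.0 * gap_m) if closing > 0 else 0.0

def risky_flags(gap_m, v_f, v_l, ttc_thr=3.0, drac_thr=3.4):
    """Binary risk flags obtained by comparing each measure with its threshold."""
    return {
        "TTC": ttc(gap_m, v_f, v_l) < ttc_thr,
        "DRAC": drac(gap_m, v_f, v_l) > drac_thr,
    }
```

With equal speeds, for example `risky_flags(8.0, 20.0, 20.0)`, both flags are false even though the gap is small, which is the boundary-condition limitation discussed earlier.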

Comparative Analysis of Rear-End Crash Risk under Different Influencing Factors.
Before the significance test, the Kolmogorov-Smirnov (K-S) test was employed to verify the distribution of rear-end crash risks under different influencing factors. The results of the K-S test show that rear-end crash risk under different influencing factors meets the requirements of homogeneity of variance and normal distribution. Therefore, the analysis of variance (ANOVA) was applied to test the significance of differences in rear-end crash risk under different influencing factors. Table 5 provides the summary statistics of the driving risk under the significant influencing factors. It can be seen from Table 5 that the rear-end crash risk differs significantly across most influencing factors, such as day-of-week, time-of-day, light condition, and traffic density. Figure 6 shows the comparison of rear-end crash risk under different influencing factors. Specifically, for day-of-week, driving risk was higher on workdays (by 14.3%, from 0.0028 to 0.0032). In addition, the driving risks are 0.0030, 0.0034, and 0.0032 for off peak, morning peak, and evening peak, respectively. The significant differences in risk across these temporal variables indicate that driving risks differ for different travel purposes. From daytime to nighttime, the driving risk decreased slightly but significantly, by 0.0002 (6.3%). The driving risk decreased from 0.0049 in high traffic density to 0.0021 (by 57.1%) in low traffic density, indicating that traffic congestion leads to a decrease in driving safety.

Results of Mixed-Effects Linear Regression Model.
Based on the car-following behavior variables and environmental variables obtained from the SH-NDS, the influencing factors of rear-end crash risk are investigated. Table 6 presents the results of the mixed-effects linear regression model. As shown in Table 6, the chi-squared goodness-of-fit results indicate that the mixed-effects regression model fits well (χ²(11) = 11198.24, Prob > χ²(11) = 0.00). Except for the morning peak in temporal variables and weather conditions, all the variables listed in Table 6 are significant at the 95% confidence level. As shown in Table 6, all selected behavioral variables affect the driving risk. The longer the duration of the car-following event, the lower the driving risk. This can be understood as the risk of driving increasing when the LV changes frequently. Clearly, the larger the time headway, the lower the driving risk. This result is consistent with Duan et al. [9], who evaluated risk perception in the car-following process. In addition, based on the SH-NDS data, Zhu et al. [43] concluded that aggressive drivers have a shorter time gap than conservative drivers. Existing studies found evidence that speed dispersion is also an important factor in determining crash risk [44][45][46]. A larger speed difference between the LV and FV is associated with a higher crash rate, which is generally consistent with our findings. In addition to the speed difference, this paper investigates the impact of acceleration difference on driving risk. The results show that the higher the acceleration difference ratio, the higher the risk, which indicates that the driving risk increases when the FV accelerates and decelerates more frequently than the LV during car-following.
Qin et al. [47] suggested that, due to the different travel purposes (to/from work) of drivers, the probability of a crash differs between working days and nonworking days, and the probability of crashes on working days is higher. This finding is consistent with the results of the regression model in this paper; that is, working days lead to higher driving risks. Furthermore, from the regression coefficients of this study, it can be concluded that the crash risk is higher for the morning peak than for other times of the day, and the evening peak is the lowest-risk period of the day. These results further confirm that there is a significant correlation between driving risk and driving purpose. The driving risk of drivers on the way to work is higher than the risk when leaving work.
As for the environmental variables, compared with sunny days, the risk of rear-end crash is higher for drivers on rainy days, which is consistent with Das et al. [48] and Jung et al. [49]. From the obtained results, we can conclude that traffic density has a greater impact on rear-end crash risk [50]. The medium-density and low-density variables show negative coefficients (β = −0.002 for medium density and β = −0.005 for low density), indicating that driving risk decreases in lower-density traffic. High-density traffic flow increases the uncertainty of traffic flow and thereby the driving risk. This result is consistent with Huang et al. [22], who investigated the driving risks under different conditions using a naturalistic driving study and a driver attitude questionnaire.

Conclusions
This study proposes a new SMoS to quantify driving risks in car-following situations and investigates the impact of different influencing factors (behavioral factors, temporal factors, and environmental factors) on rear-end crash risk considering driver heterogeneity. A total of 16,905 car-following events were extracted from the SH-NDS database. Risks of rear-end crash under different influencing factors were compared. In addition, a mixed-effects linear regression model was applied to investigate the relationship between rear-end crash risk and various influencing factors.
Several key conclusions can be drawn:
(i) Different from TTC, DRAC, SDI, and other indicators, the surrogate measure RCRI was proposed based on the crash mechanism and comprehensively considers the crash probability and consequences. This measure can be applied in any car-following situation, even when the speed of the LV is greater than the speed of the FV. The RCRI proposed in this study is a continuous variable, so it can quantify the risk of rear-end crash more flexibly.
(ii) Among different temporal variables, workday and morning peak hour had the highest mean value of driving risk. For different light conditions, the crash risk was higher for daytime than for nighttime. As for different traffic density, the driving risks corresponding to low-density traffic flows are significantly lower than those corresponding to high-density and medium-density traffic flows.
(iii) The mixed-effects linear regression model performed well in quantitatively evaluating the impact of various influencing factors on rear-end crash risk. The developed models demonstrated that duration of car-following event, mean time headway, average speed difference, acceleration difference ratio, day-of-week, time-of-day, weather condition, and traffic density had significant effects on rear-end crash risk. Workday and morning peak negatively affected driver safety. As for environmental variables, rainy weather and high-density traffic decreased driver safety.
As the main contribution, this paper utilizes a new SMoS and naturalistic driving data to quantify rear-end crash risk and identify the impacts of different influencing factors on crash risk. The research was conducted based on naturalistic driving data, which objectively reflect the real operation of drivers. The new SMoS can be used not only to investigate the driving safety of drivers under different driving environments but also for driving risk evaluation and real-time risk prediction. Results from the mixed-effects linear regression model can be used to improve driving safety by adopting appropriate countermeasures. For example, traffic safety management can be strengthened during working days and morning peak hours to ensure safe driving.
Still, limitations exist in this study. The proposed indicator mainly focuses on the risk of rear-end crash and cannot comprehensively consider other types of crash risk. In addition, no crash data that could be used to verify the effectiveness of the RCRI were available from the SH-NDS database. In future work, the effectiveness of the RCRI will be further validated based on crash data. Moreover, the RCRI will be used to predict the real-time change of driving risk and to explore the impact of risky driving.
Data Availability
The data used in this paper are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.