Insights on Crash Injury Severity Control from Novice and Experienced Drivers: A Bivariate Random-Effects Probit Analysis

This study investigates crash injury severity from the perspectives of novice and experienced drivers. To achieve this objective, a bivariate panel data probit model was proposed to account for the correlation in both time-specific and individual-specific error terms. The geocrash data of the Las Vegas metropolitan area from 2014 to 2017 were collected. To estimate two (seemingly unrelated) nonlinear processes and to control for interrelations between the unobservables, a bivariate random-effects probit model was built, in which the injury severity levels of novice and experienced drivers were addressed simultaneously by a bivariate (seemingly unrelated) probit, and the interrelations between the unobservables (i.e., heterogeneity issue) were accommodated by the bivariate random-effects model. Results revealed that crash types, vehicle types of minor responsibility, pedestrians, and motorcyclists were potentially significant factors of injury severity for novice drivers, while crash types, driver condition of minor responsibility, first harm, and highway factor were significant for experienced drivers. The findings provide useful insights for practitioners to improve the traffic safety of novice and experienced drivers.


Introduction
According to the National Center for Statistics and Analysis (NCSA), an estimated 25,000 people are killed in motor vehicle crashes each year, and the trend points to a 9% increase. Novice drivers account for a large proportion of these crashes; their risk per mile driven is nearly 3 times greater than that of experienced drivers, especially in the first 6 months after licenses are issued [1]. Furthermore, novice drivers are easily influenced by a variety of factors, such as distractions along the roadways (e.g., neon lights and billboards), smartphones, and online chatting, which lead to more severe injuries than those of experienced drivers. A variety of factors affect injury severity, including human factors (drivers, pedestrians, and bicyclists), vehicles, roadway, and environment, but novice and experienced drivers may be associated with different injury severity levels because of their personal features. However, the determinants of injury severity among novice and experienced drivers have not been uniformly recognized, although various scholars have explored different aspects of novice drivers, experienced drivers, or both. Moreover, there may exist interrelations between the unobservables of the two groups' injury severity levels; thus, estimating the two (seemingly unrelated) injury severity levels simultaneously while controlling for interrelations between their unobservables may be challenging. Therefore, it is necessary to investigate and determine the influencing factors of injury severity for novice and experienced drivers so that some general consensus can be reached and the interrelations can be addressed.
Since the beginning of this century, the crash analysis of novice drivers has been very active. Ferrante et al. [2] explored the relationship among novice drink drivers, recidivism, and crash involvement using multivariate survival analysis. The results showed that if a driver's first drink driving offense occurred at a younger age, he/she was significantly more likely to drink, drive, and crash again. After that, Simons-Morton et al. [3] described the effects of the Checkpoints Program on parent limits on novice teen driving through six months after licensure. It was found that it was possible to foster modest increases in parental restrictions on teen driving during the first 6 months of licensure, but the level of restriction was not sufficient to protect against violations and crashes. Continuing with the crashes of novice teenage drivers, Braitman et al. [4] identified the characteristics and contributing factors leading to crashes of novice 16-year-old drivers in Connecticut. The results revealed that three-fourths of the crash-involved teenagers were at fault, while more than half of the at-fault crashes of newly licensed novice drivers involved more than one contributing factor, including speed, loss of control, and slippery roads. From the perspective of simulation, Ivers et al. [5] explored the risky behaviors and risk perceptions of young novice drivers and sought to determine the relation with crash risk. A detailed questionnaire was conducted, and Poisson regression was employed to explore crash risk. The results showed that self-reported risky driving behaviors among novice drivers were associated with a 50% increased risk of a crash and that novice driver policies need to be strengthened. A similar study by McDonald et al. [6] developed simulator scenarios for assessing novice driver performance with crash data. Chapman et al. [7] evaluated crash and traffic violation rates before and after licensure for novice California drivers subject to different driver licensing requirements. Plots and Poisson regression were employed to compare overall rates and subtypes of crashes and traffic violations among novice drivers. It was found that the highest crash rates of novice 16- and 17-year-old drivers occurred almost immediately after they were licensed, and their peak traffic violation rates were delayed until around the age of 18. A recent study by Curry et al. [8] employed Poisson regression to compare crash rates of older and younger novice drivers. It was found that older novice drivers experienced much less steep crash reductions over the first year of licensure than younger novice drivers. Moreover, early night crash rates of novice drivers under age 21 declined rapidly, while changes in late-night crashes were much smaller.
Compared to novice drivers, experienced drivers perform much better owing to their personal status, driving experience, decision-making ability, and so on, so few studies concentrate on experienced drivers alone. However, to verify this, a variety of studies have focused on the comparison between novice and experienced drivers. From the ergonomics aspect, Underwood et al. [9] recorded eye fixations while novice and experienced drivers drove along different types of roadways. Differences in sequences of fixations were found between novice and experienced drivers on three types of roads; experienced drivers showed greater sensitivity overall, while the novice drivers revealed some stereotypical transitions in visual attention. From the perspective of driving skill, Craen et al. [10] questioned whether novice drivers overestimate their driving skills more than experienced drivers. Questionnaires were designed, and the results showed that when the novice drivers compared themselves to average and peer drivers, they were not as optimistic about their driving skills, but when their self-assessment was compared with actual behavior, they overestimated their driving skills. Mitchell et al. [11] compared the crash circumstances of common crash types for novice and experienced drivers in New South Wales, Australia. Correspondence analysis revealed that the crash characteristics of novice and experienced drivers were similar, but vehicle speed, fatigue, and alcohol were risk factors in novice driver crashes. Crundall [12] tested hazard prediction in isolation to assess whether it discriminates between novice and experienced drivers. Based on the Situation Awareness Global Assessment Technique, the results suggested that experienced drivers found hazard prediction less effortful, while response time measures can discriminate between novice and experienced drivers.
Simulation plays an important role in the crash risk analysis of novice and experienced drivers. By simulating different scenarios, Lee et al. [13] examined the detection of road hazards by novice teen and experienced adult drivers. The results indicated that a large portion of teen drivers failed to disengage from peripheral task engagement in the presence of hazards, while the adult drivers observed hazards and demonstrated overt recognition of hazards more frequently than the teen drivers did. A similar study by Smith et al. [14], from the perspective of the sleepiness effect, investigated hazard perception in novice and experienced drivers. Based on a video test, the results indicated that the hazard perception skills of the more experienced drivers were relatively unaffected by mild increases in sleepiness, but the novice drivers were significantly slowed. Ohlhauser et al. [15] compared the driving performance of novice teenage drivers and experienced drivers over the span of six monthly simulator sessions. It was found that novice drivers' perception response times (PRT) to the braking events were significantly longer than those of the experienced drivers. Focusing on visual information search in simulated junction negotiation, Scott et al. [16] compared gaze transitions of novice and experienced drivers. The results revealed that when scanning the junction, young experienced drivers distributed their gaze more evenly across all areas, whereas older and novice drivers made more sweeping transitions, bypassing adjacent areas. A similar study by Alberti et al. [17] examined the impact of a restricted field of view on visual search and hazard perception by comparing novice and experienced driver performance in a driving simulator. The results showed that all drivers were more likely to avoid the hazards when presented with a wide view, but gaze movement recording revealed that only experienced drivers made overt use of wider eccentricities. Seacrist et al. [1] compared crash rates and rear-end striking crashes among novice teens and experienced adults using a driving study. The results identified significantly more crashes and rear-end striking crashes among the teen group than the adult group, which conformed to the previous findings.
Another important approach to determining the influencing factors of crash rate/injury is econometric modeling. Simons-Morton et al. [18] compared rates of risky driving and elevated g-force events among novice adolescent drivers and adult drivers using Poisson regression with random effects. The findings revealed that elevated g-force events among novice drivers may have contributed to crash and near-crash rates that remained much higher than adult levels after 18 months of driving. Chapman et al. [7] employed plots and Poisson regression to compare overall rates and subtypes of crashes and traffic violations among novice drivers.
During the last decade, a variety of approaches and perspectives [19][20][21][22] have been presented in safety evaluation, among which multivariate regression analysis has been considered a critical method for dealing with two or more dependent variables with correlation and heterogeneity issues. At an early stage, Yamamoto and Shankar [23] developed a bivariate ordered-response probit model of driver's and passenger's injury severities in collisions with fixed objects. The results revealed that the driver's characteristics, vehicle attributes, types of objects, and environmental conditions had an effect on both driver and passenger injury severity. After that, Dong et al. [24] analyzed injury crashes and proposed a random-parameter bivariate zero-inflated negative binomial regression model. A Bayesian approach was employed as the estimation method, and the results showed that the proposed model outperformed the other investigated ones. The model provided new insights into how crash occurrences were influenced by risk factors. Focusing on temporary disability and permanent motor injuries, Ayuso et al. [25] introduced a bivariate copula-based regression model for the joint analysis. The findings illustrated that the conditional distribution function of injury severities may be estimated. A similar study by Wali et al. [26] applied copula-based bivariate ordinal models to investigate the degree of injury severity sustained by drivers involved in head-on collisions. Chen et al. [27] developed a random-parameter bivariate ordered probit model to examine the factors influencing the two drivers involved in the same rear-end crash between passenger cars. Taking both within-crash correlation and unobserved heterogeneity into consideration, the proposed model outperformed the individual ordered probit models with fixed parameters, which provides the foundation for this study. A recent study by Besharati et al. [28] extended this line of work to a bivariate spatial negative binomial Bayesian model with random effects for traffic fatalities and injuries across the provinces of Iran. Unobserved heterogeneity and spatial correlation were addressed, and the results helped to prioritize area-wide safety initiatives and programs. Besides bivariate regression models, different multivariate regression models, for example, multivariate tobit analysis [29][30][31][32], Bayesian multivariate approaches [33,34], multivariate spatial and/or temporal models [35][36][37][38][39], and mixtures of the abovementioned models, have been presented to address correlation and unobserved heterogeneity among injury severities.
As summarized from the literature above, there have been various methods and comparisons in the crash risk analysis of novice and experienced drivers. However, most of the studies address the crash risk of novice and experienced drivers separately, and there may exist interrelations between the unobservables, which can be accommodated by multivariate regression models. Therefore, the purpose of this study is to estimate the two (seemingly unrelated) nonlinear injury severity levels and to control for interrelations between their unobservables with bivariate random-effects probit models, which can address the injury severity levels simultaneously and accommodate the interrelations between the unobservables (i.e., heterogeneity issue).

Methodology
This study attempts to jointly model the injury severity of novice drivers and experienced drivers. Since the injury severity levels could be interdependent, there may be an interrelation between the unobservable factors influencing the injury severity of novice drivers and experienced drivers. To address this issue, the bivariate probit model is proposed, in which the injury severity levels of novice and experienced drivers depend on the set of independent variables, and the interrelation between the two error terms is treated as an auxiliary parameter. The bivariate probit model is selected because, whether or not the interrelation is significantly different from zero, the model does not require exclusion restrictions to provide meaningful estimates, particularly of the interrelation coefficient $\rho$ [40]. More importantly, the bivariate panel data probit model can estimate two (seemingly unrelated) nonlinear processes and control for interrelations between the unobservables, accounting for the correlation in both time-specific and individual-specific error terms. Specifically, the model includes two equations, one for the binary injury severity of novice drivers ($y_{1it}$) with main responsibility, Property Damage Only (PDO) (0) or injury and fatality (1), and the other for the binary injury severity of experienced drivers ($y_{2it}$) with main responsibility, PDO (0) or injury and fatality (1). Therefore, the equations can be expressed as follows:

$$
\begin{aligned}
y_{1it} &= \mathbf{1}\left(x_{1it}'\beta_1 + v_{1it} > 0\right),\\
y_{2it} &= \mathbf{1}\left(x_{2it}'\beta_2 + v_{2it} > 0\right),
\end{aligned}
\tag{1}
$$

where $i$ represents the panel variable (here, the individual observation) with $i = 1, \ldots, N$, and $t$ denotes the time point (here, the year) with $t = 1, \ldots, T$. The dependent variables $y_{1it}$ and $y_{2it}$ are explained by the independent variables $x_{1it}$ and $x_{2it}$, respectively. $\beta_1$ and $\beta_2$ are coefficients, and $v_{jit}$ refers to the process-specific error terms with $j \in \{1, 2\}$. Here, $v_{jit}$ includes two parts, an individual-specific time-invariant error term $\alpha_{ji}$ and a time-specific idiosyncratic shock $u_{jit}$; that is, $v_{jit} = \alpha_{ji} + u_{jit}$; thus, equation (1) can be described as

$$
\begin{aligned}
y_{1it} &= \mathbf{1}\left(x_{1it}'\beta_1 + \alpha_{1i} + u_{1it} > 0\right),\\
y_{2it} &= \mathbf{1}\left(x_{2it}'\beta_2 + \alpha_{2i} + u_{2it} > 0\right).
\end{aligned}
$$

Owing to the normalization of the error terms, the error terms $\alpha_j$ are assumed to be normally distributed, $\alpha_j \sim N(0, \sigma_{\alpha j}^2)$, and the idiosyncratic shocks $u_j$ are assumed to be standard normally distributed, $u_j \sim N(0, 1)$. The ratio of the time-constant individual-specific error term to the composite error term is calculated as

$$
\operatorname{corr}\left(v_{jit}, v_{jis}\right) = \frac{\sigma_{\alpha j}^2}{\sigma_{\alpha j}^2 + 1},
$$

where $t \neq s$; if $t = s$, the correlation of the error terms across the two equations can be calculated as

$$
\operatorname{corr}\left(v_{1it}, v_{2it}\right) = \frac{\rho_\alpha \sigma_{\alpha 1}\sigma_{\alpha 2} + \rho}{\sqrt{\left(\sigma_{\alpha 1}^2 + 1\right)\left(\sigma_{\alpha 2}^2 + 1\right)}},
$$

where $\rho$ refers to the correlation of the idiosyncratic error terms $u_{1it}$ and $u_{2it}$, and $\rho_\alpha$ refers to the correlation of the random-effects error terms. The individual likelihood function $L_i$ can be obtained by integrating the product of the joint probability of the observed binary outcome variable $P_i(\alpha_1, \alpha_2)$ and the joint density of the random-effects error terms $f_2(\alpha_1, \alpha_2; \mu_\alpha)$ over $\alpha_1$ and $\alpha_2$, where $\mu_\alpha$ refers to the covariance of the random-effects error terms ($\mu_\alpha = \rho_\alpha \sigma_{\alpha 1} \sigma_{\alpha 2}$). Since the joint density of the random-effects error terms is assumed to follow a bivariate normal distribution, the joint probability of the observed binary outcome variable is expressed as

$$
P_i(\alpha_1, \alpha_2) = \prod_{t=1}^{T} \Phi_2\left[q_{1it}\left(x_{1it}'\beta_1 + \alpha_{1i}\right),\; q_{2it}\left(x_{2it}'\beta_2 + \alpha_{2i}\right);\; q_{1it}\, q_{2it}\, \rho\right],
$$

where $\Phi_2[\cdot]$ refers to the bivariate normal cumulative distribution function and $q_{jit} = 2y_{jit} - 1$ is a sign indicator. According to Greene [41], the bivariate normal cumulative distribution function can be described in the following form:

$$
\Phi_2(x_1, x_2, \rho) = \int_{-\infty}^{x_2}\int_{-\infty}^{x_1} \phi_2(z_1, z_2, \rho)\, dz_1\, dz_2,
$$

while the density takes the form

$$
\phi_2(z_1, z_2, \rho) = \frac{\exp\left[-\dfrac{z_1^2 + z_2^2 - 2\rho z_1 z_2}{2\left(1 - \rho^2\right)}\right]}{2\pi\sqrt{1 - \rho^2}}.
$$

Thus, the likelihood of the sample can be expressed as follows:

$$
L = \prod_{i=1}^{N} L_i = \prod_{i=1}^{N} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} P_i(\alpha_1, \alpha_2)\, f_2(\alpha_1, \alpha_2; \mu_\alpha)\, d\alpha_1\, d\alpha_2.
$$

The estimation makes use of quasirandom numbers (Halton draws) and maximum simulated likelihood to estimate the correlation between the error terms of both processes. For more details about the bivariate probit and random-effects probit models, refer to Yildirim et al. [40] and Plum [42].
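To make the simulated-likelihood step more concrete, the individual likelihood described above can be approximated by averaging the joint probability of the observed outcomes over quasirandom draws of the two random effects. The following Python sketch is only an illustration under stated assumptions and is not the Stata routine used in this study: the function names, the argument names, the sign indicator q = 2y - 1, and the use of an unscrambled Halton sequence with its first point dropped are all choices made for this example.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm
from scipy.stats.qmc import Halton


def bvn_cdf(a, b, rho):
    """Bivariate standard normal CDF with correlation rho (|rho| < 1)."""
    cov = [[1.0, rho], [rho, 1.0]]
    return float(multivariate_normal(mean=[0.0, 0.0], cov=cov).cdf([a, b]))


def simulated_individual_likelihood(y1, y2, xb1, xb2, sigma1, sigma2,
                                    rho_alpha, rho, n_draws=50):
    """Approximate L_i by averaging P_i(alpha_1, alpha_2) over Halton draws.

    y1, y2   : 0/1 outcomes of one individual over T periods
    xb1, xb2 : linear indices x'beta for the same T periods
    sigma1, sigma2, rho_alpha : std. devs. and correlation of the random effects
    rho      : correlation of the idiosyncratic shocks
    """
    y1, y2 = np.asarray(y1), np.asarray(y2)
    xb1, xb2 = np.asarray(xb1, dtype=float), np.asarray(xb2, dtype=float)
    q1, q2 = 2 * y1 - 1, 2 * y2 - 1  # sign indicators: +1 if y = 1, -1 if y = 0

    # Unscrambled Halton points start at the origin, so the first point is
    # dropped to keep the inverse normal CDF finite.
    u = Halton(d=2, scramble=False).random(n_draws + 1)[1:]
    z = norm.ppf(u)  # independent standard normal draws

    # The Cholesky factor maps z to correlated random effects.
    cov_alpha = np.array([[sigma1 ** 2, rho_alpha * sigma1 * sigma2],
                          [rho_alpha * sigma1 * sigma2, sigma2 ** 2]])
    alpha = z @ np.linalg.cholesky(cov_alpha).T

    total = 0.0
    for a1, a2 in alpha:
        p = 1.0
        for t in range(len(y1)):
            p *= bvn_cdf(q1[t] * (xb1[t] + a1),
                         q2[t] * (xb2[t] + a2),
                         q1[t] * q2[t] * rho)
        total += p
    return total / n_draws


# Hypothetical inputs for one individual observed in two years.
L_i = simulated_individual_likelihood(
    y1=[1, 0], y2=[1, 1], xb1=[0.3, 0.1], xb2=[-0.2, 0.4],
    sigma1=0.8, sigma2=0.6, rho_alpha=0.4, rho=0.3)
```

The sample log-likelihood would then be the sum of the logarithms of such terms over all individuals, maximized over the coefficients and the variance and correlation parameters. The default of 50 draws mirrors the number of Halton draws reported with Table 1; all numerical inputs in the example are made up.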

Data Description
Similar to the dataset adopted by Xiao et al. [43], the ArcGIS open data site maintained by the Nevada Department of Transportation (NDOT) from 2014 to 2017 was used as the data source. "Similar" here denotes that both datasets are drawn from the same open dataset by NDOT, but the variables and modeling employed in this study are different; that is, the dataset is a different subset from that of Xiao et al. A total of 27 major and minor arterials in the metropolitan Las Vegas area, which includes the City of Las Vegas, the City of North Las Vegas, the City of Henderson, and Clark County, were selected as the target population in this study. Four main aspects were collected and considered: the crash status, the vehicle features, roadway characteristics, and environment.
As shown in Figure 1, 27 arterials with 1999 injuries involving both novice drivers and experienced drivers were considered. Consistent with Seacrist et al. [1], the novice and experienced drivers are selected here among 16-19-year-old and 35-54-year-old drivers, respectively. In Nevada, injury severity is classified into three types: PDO, injury, and fatality. Since fatalities only accounted for 0.5% and were quite similar to injuries, the injury and fatality categories were merged into one type, which is unlikely to affect the inference. Therefore, the dependent variables in the proposed model were treated as binary injury severity, in which PDO was regarded as one category (0), while injury and fatality were treated as the other (1), finally forming a binary probit model.
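The grouping and outcome coding described above can be sketched in a few lines of Python. This is a minimal illustration rather than the actual preprocessing of the NDOT data: the record layout, the field names "age" and "severity", and the severity labels are hypothetical.

```python
def driver_group(age):
    """Return 'novice' for ages 16-19, 'experienced' for 35-54, else None."""
    if 16 <= age <= 19:
        return "novice"
    if 35 <= age <= 54:
        return "experienced"
    return None


def binary_severity(severity):
    """Code PDO as 0 and injury or fatality as 1."""
    label = severity.strip().lower()
    if label == "pdo":
        return 0
    if label in {"injury", "fatality"}:
        return 1
    raise ValueError(f"unrecognized severity label: {severity!r}")


# Hypothetical records; the third falls outside both age bands and is dropped.
records = [
    {"age": 17, "severity": "PDO"},
    {"age": 42, "severity": "Injury"},
    {"age": 28, "severity": "Fatality"},
]
coded = [
    (driver_group(r["age"]), binary_severity(r["severity"]))
    for r in records
    if driver_group(r["age"]) is not None
]
```

Merging injury and fatality into a single category keeps the outcome binary, which is what the probit specification requires.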
In terms of the vehicle profiles, the explanatory variables include the total vehicle, vehicle types, vehicle direction, vehicle action (e.g., changing lanes, making a U-turn, and passing other vehicles), vehicle conditions (e.g., hit and run, mechanical defects, and driving too fast), and the driver's age and conditions (e.g., normal, fatigue, physical impairment, and distracted), whereas pedestrians, pedal cyclists, and motorcyclists are also considered. In this study, according to the classification by NDOT, when a crash involves two or more vehicles, the vehicle with the main responsibility is considered as vehicle 1, and the rest, with minor responsibility, are considered as vehicle 2. After the dataset was cleaned, crashes involving two vehicles accounted for 87% of injuries, which supports the reasonableness of this classification. In this study, the selected injury severity involves both novice and experienced drivers, so that the same injury can be addressed simultaneously.
The roadway characteristics involve the number of vehicle lanes and roadway conditions (e.g., dry, wet, ice, and snow), and the crash environment covers the weather, lighting conditions, and first harm (e.g., median, fence, and pedestrian).
In order to evaluate the proposed models in Stata software, the categorical variables are numerically coded, and all the variables collected are summarized in Appendixes A and B for novice and experienced drivers, respectively.

Results
Based on the typical variables selected, the characteristics of the crashes and the correlation among the main factors could be examined. In this study, Stata software was used to analyze the data. The correlation test was conducted to avoid collinearity among the independent variables. In this study, crash type is highly related to total vehicle, while vehicle 2 action, vehicle 2 type, and vehicle 2 driver condition are highly correlated with each other; thus, in the final results, highly correlated variables may not appear at the same time. The bivariate random-effects probit and bivariate probit models are proposed to assess the injury likelihood of novice and experienced drivers. The final results are presented in Table 1 with 50 Halton draws. To make the comparison, the same number of observations is used for both models.
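A correlation screen of this kind can be sketched as follows. The example is an assumption-laden illustration and not the Stata procedure used in this study: the function name, the threshold of 0.7, the variable names, and the toy data are all hypothetical, and a Pearson correlation on numerically coded categories is only a rough screening device.

```python
import numpy as np


def highly_correlated_pairs(X, names, threshold=0.7):
    """List variable pairs whose absolute Pearson correlation meets the threshold."""
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs


# Toy data: the first two columns move together, the third does not.
names = ["total_vehicle", "crash_type", "lighting"]
X = np.column_stack([
    [1, 2, 3, 4, 5, 6],
    [2, 4, 6, 8, 10, 13],
    [5, 1, 4, 2, 6, 3],
])
flagged = highly_correlated_pairs(X, names)
```

Variables flagged together would then not be entered into the same model specification.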
As shown in Table 1, in the novice driver injury model, crash type, vehicle 2 type, pedestrian, and motorcyclist are significant for both the bivariate probit model and the bivariate random-effects probit model, while in the experienced driver injury model, crash type, vehicle 2 driver condition, first harm, and highway factor are significant. The correlation parameters ρ of both models are not equal to 0, implying that correlation does exist between the injury severity levels of novice and experienced drivers, although the correlation is lower than 0.5. The log-likelihood values at convergence (−978.069) and zero (−1894.337) from the bivariate random-effects probit model are a little smaller than those (−925.747 and −1809.635) from the bivariate probit model, respectively. It can be found that the goodness of fit of the proposed bivariate random-effects probit model performs better than that of the bivariate probit model; thus, the following explanation concentrates on the proposed model. Table 1 demonstrates the effects on the injury severity of novice and experienced drivers. For novice drivers, crash type and vehicle 2 type are negatively related to injury severity, while pedestrian and motorcyclist notably increase the likelihood of injury. Compared to unknown crash types, the injury severity is reduced as the crash type changes from angle to noncollision, which is understandable. Among all the crash types, angle and rear-end crashes occur frequently, accounting for about 85% and leading to different injury severities, as verified by Xu et al. [44] and Hosseinpour et al. [45]. As the crash type changes from angle to noncollision, the injury severity of novice drivers is reduced by about 110%.
Vehicle 2 type is negatively related to the injury severity of novice drivers, indicating that the injury of cars and trucks is less than that of motorcycles, which is in line with the studies by Quddus et al. [46], Zambon and Hasselberg [47], and Chang et al. [48]. Since motorcyclists are exposed, even the drivers with minor responsibility (vehicle 2) may still suffer severe injuries. Computed from the marginal effect, the injury severity of cars and trucks may be about 4.9% lower than that of motorcycles.
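For readers less familiar with probit marginal effects, the following Python sketch shows how such effects are commonly computed for a single probit equation. It is a simplified illustration and not the computation behind Table 1: it evaluates the probability at a random effect of zero, the function names are introduced only for this example, and the index and coefficient values are made up.

```python
from scipy.stats import norm


def discrete_effect(index_without, beta_k):
    """Change in P(y = 1) when a 0/1 regressor switches from 0 to 1.

    index_without is the linear index x'beta with that regressor set to 0.
    """
    return float(norm.cdf(index_without + beta_k) - norm.cdf(index_without))


def continuous_marginal_effect(index, beta_k):
    """Derivative of P(y = 1) = Phi(x'beta) with respect to a continuous regressor."""
    return float(norm.pdf(index) * beta_k)


# Hypothetical values: baseline index 0.0 and a coefficient of -0.5.
effect = discrete_effect(0.0, -0.5)
slope = continuous_marginal_effect(0.0, 1.0)
```

Because the normal density is largest at an index of zero, the same coefficient implies a smaller change in probability for observations whose baseline probability is already close to 0 or 1.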
Pedestrians play a positive and significant role in the injury severity of novice drivers, meaning that the more pedestrians are involved, the more severe the injury of novice drivers. This finding is consistent with Oh [49], and the possibility may go up by 139% if pedestrians are increased by onefold. The reason is that the driving skills of novice drivers are inadequate and they may become nervous when more pedestrians show up; thus, the possibility of injury is raised.
Similarly, motorcyclists have a positive association with the injury severity of novice drivers, implying that more motorcyclists may increase the injury severity. It can be calculated that the possibility may rise by 167% if motorcyclists are increased by onefold, which is in agreement with common sense. More motorcyclists may easily produce disordered traffic and cause more conflicts, especially for novice drivers, since they are not very skilled and may drive erratically, thus leading to more chances of crashes.
For experienced drivers, crash type, first harm, and highway factor are negatively related to injury severity, while vehicle 2 driver condition is positively related to injury severity. Similar to novice drivers, the injury severity is reduced as the crash type changes from angle to noncollision, compared to unknown crash types, and the possibility is reduced by about 5.6%. It can be seen that both novice and experienced drivers can be influenced by various crash types.
Different from the novice drivers, the driver conditions of vehicle 2 are positively significant to the injury severity of experienced drivers, indicating that, compared to the unknown condition, the apparently normal condition causes less severe injury. This is in line with Weber et al. [50], and the possibility increases by about 1.6% as the driver condition varies from normal to unknown. Although most crashes happen under apparently normal conditions, the injury may be more severe under unknown conditions because the unknown makes the driving condition unpredictable.
The last two negatively significant variables are the first harm and highway factor. With the variation of first harm from cross median/centerline to "no data," the injury severity is decreased. Since first harm mainly includes motor vehicle in transport, slow/stopped vehicle, and "no data," the injury severity of motor vehicle in transport is the worst, which makes sense. Because motor vehicles in transport have more chances of running into conflicts, the possibility of injury is about 11% lower for the other categories.
Compared to no highway factor, injury severity in the active work zone is the worst, which is broadly consistent with Wong et al. [51] and Sze and Song [52]. In the active work zone, speeding happens frequently, as does stop-and-go traffic, thus causing more chances of conflicts and leading to more severe injury.

Discussion
So far, there have been various approaches and comparisons in the crash injury analysis of novice and experienced drivers. However, most of the studies address the crash injury severity of novice and experienced drivers separately, and there may exist interrelations between the unobservables. In this study, to estimate the two (seemingly unrelated) nonlinear injury severity levels and to control for interrelations between their unobservables, the bivariate random-effects probit models are proposed, which can address the injury severity levels simultaneously and accommodate the interrelations between the unobservables (i.e., heterogeneity issue).
As shown in Table 1, a closer examination of the estimated results reveals some similarities and differences between novice and experienced drivers. First, the similarity is that, among all the influencing variables, crash types are significant for the injury severity of both novice and experienced drivers. This indicates that certain crash types lead to specific injury severity and need more attention for both novice and experienced drivers. Second, the difference is that the significant variables for novice drivers relate more to moving objects, especially vulnerable road users, that is, pedestrians and motorcyclists, since novice drivers' skills are not mature enough and they still need more time to become accustomed to driving situations, while for experienced drivers, the injury severity derives more from static facilities and the environment. This implies that, after a certain driving period, experienced drivers have become used to moving objects while paying less attention to static ones.
According to the results obtained, from an empirical point of view, novice drivers need more education and training hours before they are qualified to drive on the roadways safely, while pedestrians and motorcyclists should be given more attention through clear warning/crossing signs and helmets, respectively. As for experienced drivers, more alternative facilities should be designed to avoid the first harm; because the presence of active work zones increases the injury severity, one way of improving safety is to organize the traffic flow efficiently to avoid conflicts between vehicles, so that the injury severity levels may be decreased.

Conclusions
In this study, a bivariate random-effects probit model was proposed to investigate the injury severity among novice and experienced drivers, in which both injury severity levels were addressed simultaneously by a bivariate (seemingly unrelated) probit, and the interrelations between the unobservables (i.e., heterogeneity issue) were accommodated by the random-effects model. The results showed that crash types, vehicle 2 types, pedestrians, and motorcyclists were potentially significant factors of injury severity for novice drivers, while crash types, vehicle 2 driver condition, first harm, and highway factor were significant for experienced drivers. Two main findings can be drawn from the results of the study. First, there indeed exists a correlation between novice drivers and experienced drivers in injury severity, although the correlation is not strong. Second, the bivariate random-effects probit model can address the injury severity levels simultaneously and accommodate the interrelations between the unobservables (i.e., heterogeneity issue), which extends the range of bivariate probit analysis.
Some drawbacks still exist in this study. One is that the division of novice and experienced drivers is based on age, as the dataset provides, whereas the preferred division would depend on a criterion that reflects actual driving experience, that is, the number of years with a valid driver's license or the number of miles driven. Moreover, since the results of the study are based on the dataset from Las Vegas, it is worthwhile to try out different data sources in future studies to confirm the findings and their transferability. Further study may try other types of modeling, such as a bivariate random-parameter probit model or a bivariate spatial probit model, so that spatial and temporal issues can be addressed efficiently.
Note. Unknown category in the dataset has no actual data.