Comparison of Factors Affecting Crash Severities in Hit-and-Run and Non-Hit-and-Run Crashes

A hit-and-run (HR) crash occurs when the driver of the offending vehicle flees the crash scene without reporting it or aiding the victims. The current study aimed to contribute to the existing literature by comparing factors that might affect crash severity in HR and non-hit-and-run (NHR) crashes. The data were extracted from police-reported crash data from September 2017 to August 2018 within the City of Chicago. Two multinomial logistic regression models were established for the HR and NHR crash data, respectively. The odds ratio (OR) of each variable was used to quantify the impact of that variable on crash severity. In both models, property damage only (PDO) crashes were selected as the reference group, and injury and fatal crashes were chosen as the comparison groups. When injury crashes were taken as the comparison group, 12 variables were found to contribute to crash severity in both the HR and NHR models. The average percentage deviation of OR for these 12 variables was 34%, indicating that, compared with property damage, HR crashes were 34% more likely to result in injuries than NHR crashes on average. When fatal crashes were chosen as the comparison group, 2 variables were found to be statistically significant in both the HR and NHR models. The average percentage deviation of OR for these 2 variables was 127%, indicating that, compared with property damage, HR crashes were 127% more likely to result in fatalities than NHR crashes on average.


Introduction
Injuries and fatalities caused by traffic crashes are serious problems in most countries. According to the World Health Organization (WHO), the global number of road traffic deaths reached 1.25 million in 2013, and an additional 20-50 million people were injured or disabled [1]. Among the various crash types, hit-and-run (HR) crashes have drawn increasing attention from both the general public and academia over the past few years. A hit-and-run crash occurs when the driver of the striking vehicle leaves the crash scene without reporting it to the authorities or aiding the victim [2]. According to a recent AAA (American Automobile Association) report, both HR crashes and HR fatalities were increasing. It was estimated that more than one HR crash happened in the US every minute in 2015 [3]. Although the penalties for HR crashes vary from state to state in the US, most states consider it a felony if the crash leads to injury or fatality. HR crashes may not be completely eliminated in the short term, but it is possible to alleviate the severe harm they cause to the public. To do so, it is crucial to understand the factors affecting the severity of HR crashes. Potential engineering and administrative countermeasures could then be scheduled and prioritized.
The current paper aimed to contribute to the existing literature by explicitly analyzing factors affecting the severity of HR crashes. Moreover, the factors affecting the severity of HR crashes and of non-hit-and-run (NHR) crashes were quantitatively compared.

Literature Review
The current literature on HR crashes generally falls into two categories: identifying vehicles involved in HR crashes and identifying factors affecting the decision to flee the crash scene. The identification of HR vehicles has been an area of interest in various fields, such as forensics, law, and insurance [4]. For instance, Teresiński and Madro [5] examined knee joint injuries of 357 fatal pedestrian victims of traffic crashes in order to determine the type of vehicle involved in HR crashes. Baucom [6] applied clinical forensic skills to identify the truck in an HR crash. Karger et al. [7] examined various fragments from the crash scene to determine whether there was a preceding collision while the pedestrian was in an erect position in HR crashes.
With regard to factors contributing to drivers' decisions to flee crash scenes, various factors have been examined in previous studies. Based on crash data from Singapore, Tay [4] developed a binary logistic regression model to identify factors which might affect the probability of HR crashes. This study found that drivers were more likely to flee when the crash occurred at night, on a straight road, and near shop houses. A crash was also more likely to be an HR crash if the driver was male, a minority, and aged between 45 and 69. Another study carried out by Tay et al. [8] aimed at identifying factors contributing to HR decisions in fatal crashes. According to the model results, the speed limit, traffic control device, lighting condition, roadway functional class, and roadway alignment were among the most important parameters affecting the occurrence of HR in fatal crashes. Traffic engineers could target these parameters to potentially minimize HR crashes. Regarding the most dangerous type of HR crash, the pedestrian HR crash, Aidoo et al. [9] conducted a study to explore the effect of road and environmental characteristics on pedestrian HR crashes. Based on the results of a binary logit model, this study indicated that unclear weather, dark conditions, straight roads without medians, and intersections could significantly increase the likelihood of HR decisions. MacLeod et al. [10] also explored the factors associated with HR pedestrian fatalities through logistic regression analysis. Among all factors which could increase the risk of HR, alcohol use and early morning hours were identified as the leading factors. The authors concluded that pedestrian fatalities could be substantially reduced by reducing alcohol-related crashes. Aiming at comparing contributing factors in HR crashes with distracted and nondistracted drivers, Roshandeh et al. [2] conducted a comprehensive analysis based on crash data within Cook County, Illinois. Driver distraction was classified into 5 different groups based on the distraction source. The results of this study indicated that nondistracted drivers were 27% less likely to flee the crash scene than distracted drivers. Unlike most previous studies, which were based on data from developed countries, Zhang et al. [11] explored factors contributing to HR crashes using data from Guangdong Province in China. This study found that drivers who were middle-aged, male, and without valid driver's licenses were more likely to flee the scene after a crash. Most recently, Xie et al. [12] employed real-time traffic data to investigate factors associated with HR crashes and the severity levels of HR. In addition to factors identified in previous studies, the average occupancy and speed from the upstream detector were found to have a positive correlation with the occurrence of HR crashes.
As described above, most studies regarding HR crashes focused on identifying fleeing vehicles or factors contributing to the occurrence of HR crashes, whereas relatively few discussed factors affecting the severity of HR crashes. However, studies on the statistical analysis of general crash severity are relatively extensive. Savolainen et al. [13] conducted a thorough review and assessment of methodologies for highway crash injury severity analysis. A wide variety of methodological approaches were discussed, including binary outcome models, ordered discrete outcome models, unordered multinomial discrete outcome models, and other data mining approaches. This study also provided directions for future research. Considering that traffic crash data are generally characterized by underreporting, Patil et al. [14] applied a weighted conditional maximum likelihood estimator to address this problem in crash severity analysis. This new estimator was demonstrated to improve the estimation quality. Targeting injury severity analysis for a mountainous freeway section, Yu et al. [15] incorporated real-time traffic and weather data into the modelling. Three models were developed separately: a fixed parameter logit model, a support vector machine, and a random parameter logit model. Findings from this study demonstrated that crash injury severity could be substantially influenced by real-time traffic and weather variables. Ye et al. [16] conducted one of the first studies on the sample size requirements for crash severity modelling. Three different models were examined: the multinomial logit model, the ordered probit model, and the mixed logit model. Results of this study confirmed that a small sample size could significantly affect the quality of crash severity models. Pedestrian safety has always been a major concern in this field. Haleem et al. [17] applied the mixed logit model to identify factors affecting pedestrian crash severity at intersections. Study results revealed that speed limit, percentage of trucks, rainy weather, and at-fault pedestrians were associated with more severe crashes at signalized intersections. A hybrid approach combining multinomial logit models and Bayesian network methods was proposed to analyze contributing factors of injury severity in rear-end crashes [18]. Based on the modelling results, several factors could significantly increase injury severity in rear-end crashes, including truck involvement, lighting condition, windy weather, and number of vehicles involved. Naik et al. [19] investigated the relationship between weather conditions and single-vehicle truck crash severity. This study provided a practical method to combine comprehensive 15-min weather data with crash data. Similar to pedestrians, bicyclists have also been considered vulnerable road users. Behnood et al. [20] explored factors contributing to the injury severity of bicyclists in motor-vehicle/bicycle crashes. Based on the results of a random parameters multinomial logit model, the following risk factors were identified: driver race and gender, alcohol consumption, riding on the wrong side of the road, not wearing a helmet, and so on. Another study conducted by Behnood et al. [21] aimed at investigating the effects of passengers on driver-injury severities in single-vehicle crashes. Based on the estimation results of a random parameters logit model, the age and gender of passengers could significantly affect driver injury severities. Zeng et al. [22] proposed a generalized nonlinear model-based mixed multinomial logit approach to identify risk factors contributing to crash severity. The results indicated that the new approach could fit the observed crash data better than the standard mixed multinomial logit model. Most recently, Jeong et al. [23] proposed a hybrid approach for classifying injury severities with imbalanced crash data. The geometric mean was used to evaluate the classification performance. The results indicated that the effect of treatments for imbalanced data was maximized when undersampling was combined with the bagging training-testing method.
As mentioned before, few studies have discussed factors affecting the severity of HR crashes, let alone compared them with factors contributing to crash severity in NHR crashes. To fill this gap, the current study aimed to contribute to the existing literature by comparing factors associated with the severity level of HR and NHR crashes. The rest of this paper is organized as follows. Section 3 describes the dataset and variables used in this paper. Section 4 focuses on the methodology. Section 5 discusses the model results. Section 6 concludes the study and puts forward the study limitations.

Data Collection and Processing
Data used in the current study were extracted from police-reported crash data from September 2017 to August 2018 within the City of Chicago [24]. The data contain detailed information regarding a series of crash attributes, such as "crash severity," "weather condition," and "crash type." The original dataset contained 117,253 crashes, of which 30,655 were identified as HR crashes, accounting for 26.14% of total crashes. To prepare the dataset used in the models, the original dataset was first divided into two groups: HR crashes and NHR crashes. The crash severity was classified into 3 categories: fatal crash, injury crash, and property damage only (PDO) crash. As for the independent variables, 56 variables were selected for modelling. These variables fell into 11 categories: traffic control device, device condition, weather condition, lighting condition, crash type, trafficway type, roadway surface condition, crash damage in dollar value, number of units involved (a unit refers to a motor vehicle, a pedestrian, a bicyclist, or another roadway user), crash hour, and day of week. Please refer to Table 1 for the detailed summary statistics of the variables. It should be noted that some of the variables had a defined category "other/unknown," which might influence the model results. Crash entries with this category were eliminated from the dataset before modelling. After cleaning, the dataset contained 93,371 crashes, of which 23,332 were identified as HR crashes.
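The data preparation described above can be illustrated with a minimal sketch. The records and field names below (`hit_and_run`, `severity`, `weather`) are hypothetical placeholders rather than the actual schema of the Chicago crash data; only the crash counts in the final lines come from the text.

```python
# Hypothetical records; field names are illustrative, not the real data schema.
crashes = [
    {"hit_and_run": True,  "severity": "PDO",    "weather": "clear"},
    {"hit_and_run": False, "severity": "injury", "weather": "rain"},
    {"hit_and_run": True,  "severity": "injury", "weather": "other/unknown"},
    {"hit_and_run": False, "severity": "PDO",    "weather": "clear"},
]

def is_complete(record):
    """Keep a record only if no attribute is coded 'other/unknown'."""
    return all(value != "other/unknown" for value in record.values())

# Drop entries with an 'other/unknown' category, then split into HR and NHR groups.
cleaned = [r for r in crashes if is_complete(r)]
hr = [r for r in cleaned if r["hit_and_run"]]
nhr = [r for r in cleaned if not r["hit_and_run"]]

# Share of HR crashes before and after cleaning, using the counts reported in the text.
hr_share_raw = 30655 / 117253
hr_share_clean = 23332 / 93371
print(f"{hr_share_raw:.2%} of raw crashes and {hr_share_clean:.2%} of cleaned crashes were HR")
```

The share of HR crashes changes only slightly after cleaning, which suggests that removing "other/unknown" entries did not affect the two groups very differently.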

Methodology
In the current paper, the main objective was to identify factors which might affect the crash severity of HR crashes and NHR crashes. Over the years, various models have been proposed for crash severity analysis based on the nature of the dependent and independent variables. In the context of this study, the dependent variable (severity of traffic crashes) was classified into three categories: property damage only (PDO) crashes, injury crashes, and fatal crashes. Because the dependent variable has three discrete outcomes, multinomial logistic regression (MNL) was selected for modelling, as adopted by many previous studies in this field [25][26][27][28].
As a traditional discrete outcome model, the MNL model is suitable for analyzing the relationship between potential contributing factors and multiple crash severity outcomes. As described in [29], the probability of crash $n$ having injury severity outcome $i$ can be written as
$$P_n(i) = P\left(O_{in} \ge O_{i'n}\right), \quad \forall\, i' \in I,\ i' \ne i \tag{1}$$
where $O_{in}$ is a function that determines the severity of crash $n$ and $I$ stands for the set of possible severity outcomes. Assuming $O_{in}$ has a linear-in-parameters form, (1) can be rewritten as
$$P_n(i) = P\left(\beta_i X_{in} + \varepsilon_{in} \ge \beta_{i'} X_{i'n} + \varepsilon_{i'n}\right), \quad \forall\, i' \in I,\ i' \ne i \tag{2}$$
where $\beta_i$ stands for the estimated coefficients for severity outcome $i$ and $X_{in}$ represents the explanatory variables which might affect crash severity $i$ for crash $n$. $\varepsilon_{in}$ is a disturbance term which accounts for the unobserved influences on crash severity $i$. If $\varepsilon_{in}$ is assumed to be independently and identically distributed following a generalized extreme value distribution, the standard MNL model can be expressed as follows:
$$P_n(i) = \frac{\exp\left(\beta_i X_{in}\right)}{\sum_{i' \in I} \exp\left(\beta_{i'} X_{i'n}\right)} \tag{3}$$
Although the MNL model does not impose a monotonic effect between independent variables and dependent variables, it does require careful consideration of the correlation between the dependent variable and each independent variable. Additionally, possible multicollinearity among independent variables needs to be taken into consideration. Before modelling, Pearson's chi-square test ($\chi^2$) was conducted to evaluate the relationship between the dependent variable and each independent variable [30]. As a nonparametric test, Pearson's chi-square test can be used to test whether two categorical variables are independent of each other. The test applies a contingency table to analyze the data. To evaluate independence, the test statistic was calculated as follows:
$$\chi^2 = \sum_{i=1}^{R} \sum_{j=1}^{C} \frac{\left(O_{ij} - E_{ij}\right)^2}{E_{ij}} \tag{4}$$
where $O_{ij}$ and $E_{ij}$ are the observed and expected cell counts in the $i$th row and $j$th column of the table, respectively. $R$ and $C$ are the total numbers of rows and columns in the contingency table. The $\chi^2$ value can be used to calculate the p value. Independent variables with p values greater than 0.05 were omitted from the subsequent modelling process.
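As an illustration of the standard MNL form, the following sketch computes outcome probabilities from the systematic part of the severity function for each outcome. The numeric values are hypothetical and are not estimates from this study; PDO is given a value of zero to mirror its role as the reference group.

```python
import math

def mnl_probabilities(utilities):
    """Return MNL outcome probabilities from the systematic utilities of each outcome."""
    # Subtracting the maximum keeps the exponentials numerically stable
    # without changing the resulting probabilities.
    largest = max(utilities.values())
    exp_u = {outcome: math.exp(u - largest) for outcome, u in utilities.items()}
    total = sum(exp_u.values())
    return {outcome: value / total for outcome, value in exp_u.items()}

# Hypothetical systematic utilities for one crash (assumed values, for illustration only).
utilities = {"PDO": 0.0, "injury": -1.2, "fatal": -6.0}
probs = mnl_probabilities(utilities)

# The odds of injury relative to PDO depend only on the utility difference.
odds_injury_vs_pdo = probs["injury"] / probs["PDO"]
print(probs, odds_injury_vs_pdo)
```

Because the odds between any outcome and the reference group reduce to the exponential of a utility difference, exponentiated coefficients can be read as odds ratios relative to PDO.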
To address the possible multicollinearity problem, the forward selection (likelihood) stepwise method was adopted in the analysis. Variables were added to the model one at a time based on the significance of the score statistic, and removal testing was based on the probability of a likelihood ratio statistic.
In the current paper, two multinomial logistic regression models were established for the HR and NHR crash data, respectively. SPSS 20 was employed for modelling.

Model Results and Discussions
Pearson's chi-square test results for the HR and NHR models are presented in Table 2. For the HR model, the p value for "Day of Week" was greater than 0.05, so this variable was eliminated before modelling. For the NHR model, all variables were kept. The estimation results of the multinomial logistic models for HR and NHR crashes are reported in Tables 3 and 4.
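The screening step can be sketched as follows. The contingency tables are hypothetical counts, not values from Table 2. To stay within the standard library, the p value is computed only for the special case of 2 degrees of freedom (a binary indicator against three severity levels), where the chi-square survival function equals exp(-x/2); other table sizes would need a statistics library.

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

def p_value_two_dof(stat):
    """Chi-square p value; this closed form is valid only for 2 degrees of freedom."""
    return math.exp(-stat / 2.0)

# Hypothetical counts: rows = indicator absent/present, columns = PDO, injury, fatal.
unrelated = [[700, 280, 20], [350, 140, 10]]   # identical severity mix in both rows
related = [[700, 280, 20], [300, 180, 20]]     # severity mix differs between rows

for name, table in (("unrelated", unrelated), ("related", related)):
    stat, dof = chi_square_statistic(table)
    p = p_value_two_dof(stat)
    decision = "keep" if p <= 0.05 else "omit"
    print(name, round(stat, 2), dof, decision)
```

Under the 0.05 threshold used in this study, the first hypothetical indicator would be omitted and the second would be kept.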
Both models fitted the data very well. The goodness-of-fit test results indicated that the deviance p values for both models were greater than 0.05. The McFadden R² values for the HR model and the NHR model were 0.358 and 0.320, respectively.
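For readers unfamiliar with the McFadden R², it compares the log-likelihood of the fitted model with that of an intercept-only model. The sketch below uses hypothetical log-likelihood values, chosen only so that the result matches the 0.358 reported for the HR model; the actual log-likelihoods are not given in the text.

```python
def mcfadden_r2(ll_model, ll_null):
    """McFadden pseudo R-squared: 1 - LL(fitted model) / LL(intercept-only model)."""
    return 1.0 - ll_model / ll_null

# Hypothetical log-likelihoods (assumed for illustration, not taken from the study).
ll_null = -1000.0
ll_model = -642.0
print(round(mcfadden_r2(ll_model, ll_null), 3))
```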
For both the HR model and the NHR model, PDO crashes were selected as the reference group, and the other two crash severity outcomes (injury and fatal crashes) were analyzed relative to PDO crashes. To quantitatively analyze the impact of each variable on the crash severity outcomes, the odds ratio (OR) of each statistically significant variable was calculated. The OR of an independent variable indicates the impact of a one-unit change in this variable on the odds of the comparison group relative to the reference group, given that other variables remain constant [31]. It can be obtained by exponentiating the coefficient of the variable. OR ranges between zero and positive infinity. In the following, the ORs for variables of the HR model and the NHR model are denoted as OR_HR and OR_NHR, respectively.
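The relationship between a model coefficient and its OR can be shown with a short sketch. The coefficients here are back-calculated from ORs reported later in this section (47.808 and 0.327); they are implied values for illustration and are not copied from Tables 3 and 4.

```python
import math

def odds_ratio(coefficient):
    """Odds ratio of a variable: the exponential of its MNL coefficient."""
    return math.exp(coefficient)

# Coefficients implied by two reported ORs (back-calculated, not read from the tables).
beta_pedestrian = math.log(47.808)   # OR > 1 corresponds to a positive coefficient
beta_sideswipe = math.log(0.327)     # OR < 1 corresponds to a negative coefficient

print(round(beta_pedestrian, 3), round(odds_ratio(beta_pedestrian), 3))
print(round(beta_sideswipe, 3), round(odds_ratio(beta_sideswipe), 3))
```

An OR above 1 therefore shifts the odds towards the comparison group (injury or fatal), while an OR below 1 shifts them towards PDO.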

Injury Crashes vs. PDO Crashes.
As mentioned before, PDO crashes were selected as the reference group for both the HR model and the NHR model. When injury crashes were taken as the comparison group, 14 and 18 variables were found to be statistically significant in the calibrated HR model and NHR model, respectively. Among these variables, 12 contributed to crash severity in both the HR and NHR models. To quantitatively compare the effects of the same variable on crash severity in the HR and NHR models, the percentage deviation of OR was calculated as follows:
$$\text{Percentage deviation} = \frac{OR_{HR} - OR_{NHR}}{OR_{NHR}} \times 100\% \tag{5}$$
Please refer to Figure 1 for the detailed results.
As can be seen from Figure 1, among the 12 variables, 7 had a larger OR in the HR model than in the NHR model. Our findings indicated that the ORs of sideswipe same direction crashes in both the HR and NHR models were less than 1, indicating that these crashes were more likely to be PDO crashes (OR_HR = 0.327 and OR_NHR = 0.129). However, OR_HR was 153% larger than OR_NHR, implying that HR behavior would increase the possibility of injury even in less risky crashes. The risk of injuries would be significantly increased for crashes involving pedestrians in both the HR and NHR models (OR_HR = 47.808 and OR_NHR = 22.037). It should be noted that OR_HR was 117% larger than OR_NHR, indicating that pedestrians were much more likely to be injured in HR crashes than in NHR crashes. Crashes involving pedal cyclists were more likely to be injury crashes than PDO crashes (OR_HR = 26.562 and OR_NHR = 12.247). Similar to pedestrians, pedal cyclists, as vulnerable road users, were more likely to be injured if the offending drivers decided to flee. Crashes occurring in parking lots were more prone to be PDO crashes in both models (OR_HR = 0.535 and OR_NHR = 0.305). This could be due to the relatively low speeds and small number of people in parking lots. But again, if the drivers of the offending vehicles decided to flee, the risk of injuries would be increased by 75%. Crashes involving more than 2 units showed a greater propensity towards injuries than PDO (OR_HR = 6.626 and OR_NHR = 4.279), and HR would increase the possibility of injuries by 55%. Lighting conditions might affect crash severity in both the HR and NHR models. Crashes on dark roads with street lights were more likely to be injury crashes (OR_HR = 1.377 and OR_NHR = 1.333), and the risk of injuries would be slightly increased if offending drivers fled the scene. For crashes occurring on one-way roads, both the HR and NHR model results indicated that these crashes were more likely to be PDO crashes (OR_HR = 0.792 and OR_NHR = 0.789). This could be due to the absence of opposite-direction traffic on these roads. Besides, most one-way roads are in downtown Chicago and are strictly regulated by traffic signals. This could provide further protection to other road users, such as pedestrians.
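The percentage deviations quoted in this paragraph can be reproduced from the reported ORs. The sketch below assumes the percentage deviation is the relative difference of OR_HR from OR_NHR, which is consistent with every figure quoted above; the function name is illustrative.

```python
def percentage_deviation(or_hr, or_nhr):
    """Relative difference of the HR odds ratio from the NHR odds ratio, in percent."""
    return (or_hr - or_nhr) / or_nhr * 100.0

# (OR_HR, OR_NHR) pairs as reported in the text for injury vs. PDO crashes.
reported = {
    "sideswipe same direction": (0.327, 0.129),
    "pedestrian": (47.808, 22.037),
    "pedal cyclist": (26.562, 12.247),
    "parking lot": (0.535, 0.305),
    "more than 2 units": (6.626, 4.279),
    "darkness, lighted road": (1.377, 1.333),
    "one-way road": (0.792, 0.789),
}

deviations = {name: percentage_deviation(*pair) for name, pair in reported.items()}
for name, value in deviations.items():
    print(f"{name}: {value:.1f}%")
```

The results round to 153%, 117%, 117%, 75%, 55%, 3%, and 0.4%, matching the values discussed above.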
Among the 12 variables which contributed to crash severity in both the HR and NHR models, 5 had a smaller OR_HR than OR_NHR. As shown in Figure 1, the percentage deviation ranged from -2% to -71%. It should be pointed out that this did not mean HR crashes were "safer" than NHR crashes in certain conditions. For instance, the percentage deviation of OR for "parked motor vehicle" was -71%. But both ORs were much smaller than 1 (OR_HR = 0.056 and OR_NHR = 0.192), indicating that crashes involving parked motor vehicles were highly unlikely to result in injuries in both HR and NHR crashes.
The average percentage deviation of OR for the above 12 variables was 34%. This suggested that, compared with PDO crashes, HR crashes were 34% more likely to result in injuries than NHR crashes on average.
Additionally, two variables were found to be statistically significant only in the HR model: dawn and wet road surface. HR crashes that occurred at dawn were more prone to be injury crashes (OR_HR = 1.522). This might be because fewer witnesses were expected at dawn, which might delay any necessary emergency medical service after the offending driver fled. HR crashes on wet roads were more likely to result in PDO (OR_HR = 0.767). Drivers tend to slow down on wet road surfaces, which might alleviate the severity of HR crashes. Six variables were found to contribute to crash severity only in the NHR model. Among these, 4 variables were more likely to result in PDO crashes: "Sideswipe opposite direction" (OR_NHR = 0.399), "Angle" (OR_NHR = 0.618), "Turning" (OR_NHR = 0.446), and "Rear end" (OR_NHR = 0.276). The remaining 2 variables were more likely to lead to injury crashes: "Darkness" (OR_NHR = 1.171) and "Weekend" (OR_NHR = 1.128).

Fatal Crashes vs. PDO Crashes.
When fatal crashes were taken as the comparison group, our findings indicated that 4 and 11 variables were statistically significant in the HR and NHR models, respectively. Among these, 2 had potential impacts on crash severity in both models. Similar to the previous analysis, the percentage deviations of OR for these 2 variables are presented in Figure 2.
As shown in Figure 2, both variables had a significantly larger OR_HR than OR_NHR. When the crash damage value was over $1500, the possibility of fatalities would be increased considerably relative to PDO (OR_HR = 29.024 and OR_NHR = 10.494). This made sense, as high damage values are normally associated with more severe crashes. It should be noted that OR_HR was 177% larger than OR_NHR, that is, nearly 2.8 times as large, indicating that HR crashes were considerably riskier than NHR crashes in this case. Similar to the case of injury crashes versus PDO crashes, the lighting condition could affect the likelihood of fatalities in both HR and NHR crashes. Crashes on dark roads with street lights showed a greater propensity towards fatalities than PDO (OR_HR = 4.609 and OR_NHR = 2.59). Again, OR_HR was 78% larger than OR_NHR, suggesting that the possibility of fatality could be increased by 78% if the perpetrators fled the scene.
The average percentage deviation of OR for the above 2 variables was 127%. This suggested that, compared with PDO crashes, HR crashes were 127% more likely to result in fatalities than NHR crashes on average.
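The 127% average can be checked directly from the two pairs of ORs reported for fatal versus PDO crashes. As a sketch, the code below again assumes the percentage deviation is the relative difference of OR_HR from OR_NHR, and it also prints the plain ratio OR_HR / OR_NHR to show how a deviation of 177% corresponds to an OR roughly 2.8 times as large.

```python
def percentage_deviation(or_hr, or_nhr):
    """Relative difference of the HR odds ratio from the NHR odds ratio, in percent."""
    return (or_hr - or_nhr) / or_nhr * 100.0

# (OR_HR, OR_NHR) pairs as reported in the text for fatal vs. PDO crashes.
fatal_pairs = {
    "crash damage over $1500": (29.024, 10.494),
    "darkness, lighted road": (4.609, 2.59),
}

fatal_deviations = {k: percentage_deviation(*v) for k, v in fatal_pairs.items()}
average_deviation = sum(fatal_deviations.values()) / len(fatal_deviations)

for name, (or_hr, or_nhr) in fatal_pairs.items():
    print(f"{name}: {fatal_deviations[name]:.0f}% larger, ratio {or_hr / or_nhr:.2f}")
print(f"average deviation: {average_deviation:.0f}%")
```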
Besides, 2 variables were found to affect crash severity only in the HR model: "Dusk" (OR_HR = 11.102) and "Divided with median barrier" (OR_HR = 4.970). Both OR values were greater than one, indicating that the presence of these 2 variables would considerably increase the possibility of fatalities in HR crashes. In particular, if HR crashes occurred at dusk, the possibility of fatalities could be increased by a factor of 11.102.
On the other hand, 9 variables were found to be statistically significant only in the NHR model, including "Darkness" (OR_NHR = 2.516), "More than 2 units" (OR_NHR = 5.678), "Pedestrian" (OR_NHR = 7.288), "Angle" (OR_NHR = 0.077), "Turning" (OR_NHR = 0.057), "Sideswipe same direction" (OR_NHR = 0.021), "Parked motor vehicle" (OR_NHR = 0.146), "Rear end" (OR_NHR = 0.021), and "PM peak hour" (OR_NHR = 0.333). Most of these variables had ORs smaller than one and would therefore increase the possibility of PDO crashes relative to fatal crashes. Nevertheless, the results revealed that if NHR crashes occurred in a dark environment or involved more than 2 units or pedestrians, the possibility of fatality would be increased considerably.

Conclusions and Recommendations
Hit-and-run (HR) crashes are those in which the driver of the offending vehicle flees the crash scene without reporting it to the authorities or aiding the victims. Despite the severe punishment for HR drivers, HR crashes and fatalities are still increasing in the United States [3]. In order to alleviate crash severity, it is crucial to identify factors which might contribute to crash severity levels in HR crashes. Additionally, comparing factors affecting crash severity levels in HR and NHR crashes would help engineers, decision-makers, and the public to improve their understanding of HR crashes from a quantitative point of view.
In the current paper, the multinomial logistic regression (MNL) model was adopted to analyze police-reported crash data from September 2017 to August 2018 within the City of Chicago, Illinois. Two MNL models were established for the HR and NHR crash data, respectively. In both models, PDO crashes were selected as the reference group, and injury crashes and fatal crashes were taken as the comparison groups.
When injury crashes were taken as the comparison group, it was found that 12 variables contributed to crash severity in both the HR and the NHR model. Among these, 7 variables had a larger OR in the HR model than in the NHR model: sideswipe same direction, pedestrian, pedal cyclist, parking lot, more than 2 units, darkness lighted road, and one-way road. The percentage deviation of OR for these 7 variables ranged from 0.4% to 153%. The other 5 variables had a smaller OR in the HR model than in the NHR model, with percentage deviations ranging from -2% to -71%. On average, the percentage deviation for these 12 variables was 34%, indicating that, compared with PDO crashes, HR crashes were 34% more likely to result in injuries than NHR crashes.
On the other hand, when fatal crashes were chosen as the comparison group, 2 variables were found to be statistically significant in both the HR and the NHR model: crash damage value over $1500 and darkness lighted road. Both variables had a considerably larger OR in the HR model than in the NHR model. The percentage deviations of OR for these two variables were 177% and 78%, respectively, indicating that, compared with PDO, the risk of fatality could be significantly increased if the offending driver decided to flee.
The results of the current study could help stakeholders to alleviate HR crash severity from both the engineering and the administrative perspectives. For instance, driving on dark roads with street lights could significantly increase the likelihood of fatality. Moreover, if the driver decided to flee, the likelihood of fatality could be further increased by 78%. Therefore, traffic law enforcement against HR should be strengthened on these particular roads. Additionally, traffic safety education could be improved based on the results of the current study. For example, drivers should be aware that the risk of injury could be increased by 117% if they hit a pedestrian or pedal cyclist and decided to flee, which would substantially aggravate the punishment.
Despite the contribution of the current study, the analysis could be further enhanced in the following respects. The MNL model used in the current study does not impose a monotonic effect between independent and dependent variables, but it does omit the possible unobserved effects shared from one severity level to the next. Future studies could benefit from applying different statistical models (such as the nested logit model or the mixed logit model) and comparing the results with the current study. Additionally, only one year of crash data was used in this study. The limited dataset size might affect the prediction accuracy, especially for the HR model. Future studies should try to use a more comprehensive dataset to improve the prediction accuracy. To further capture factors affecting the severity of HR and NHR crashes, the effect of additional variables should also be examined, such as alcohol consumption, driver distraction, and crash location.

Figure 1: Percentage deviation of OR for statistically significant variables in HR and NHR Model (injury vs. PDO crashes).

Figure 2: Percentage deviation of OR for statistically significant variables in HR and NHR Model (fatal vs. PDO crashes).

Table 1: Variables description and percentage distribution for HR and NHR crashes.

Table 2: Results of Pearson's chi-square test.

Table 3: HR multinomial logistic regression model results.

Table 4: NHR multinomial logistic regression model results.