Collective Risk Ranking of Highway Segments on the Basis of Severity-Weighted Crash Rates

0is study is intended to focus on the major factors affecting traffic crash rates and severity levels, in addition to identifying crashprone locations (i.e., black spots) based on the two indicators. 0e available crash data for different road segments used for the analysis were obtained from the Washington state database provided by the Highway Safety Information System (HSIS) for the years 2006 to 2011. A Random Forest (RF) classifier was used to predict the outcome level of crash severity, while crash rates were predicted by applying RF regressor. Certain features were selected for each model besides the abstraction of new features to check if there are unobserved correlations affecting the independent variables, such as accounting for the number and weight of crashes within 1 km area by implementing the Getis-OrdGi∗ index. Moreover, to calculate the collective risk (CR) score, crash rates were adjusted to incorporate crash severity weights (cost per severity type) and regression-to-the-mean (RTM) bias via Empirical Bayes (EB) method. Finally, segments were ranked according to their CR score.


Introduction
In recent years, crash rate and crash severity were the two major indicators used to assess roadway's traffic safety, as well as identifying crash-prone locations. Reducing crash rates and severity requires different strategic approaches and policies that aim at reducing the exposure for traffic accidents. Both reliable historical crash data and well-developed prediction models are essential in the process of estimating the impact of human and environmental-based features on the frequency and outcome of a crash incident. A weighted risk index or a combination of crash frequency and severity should prove an advantage over having only crash rate as an index of traffic hazards, without any regard to severity level or vice versa (e.g., having an urban road segment with high crash rate but lower levels of crash severity, and the opposite on a rural highway segment). Predictions should also account for fluctuations and trends in crash rates. Furthermore, unobserved features and correlations (e.g., spatial correlations between adjacent segments that can amplify a hazardous impact) should be considered. Observed or predicted indicators can be utilized to rank road facilities according to their overall risk score, allowing traffic safety agencies to distinguish which facilities have the priority in future crash countermeasure policies.
is article hereby introduces a collective risk (CR) score that combines both crash rate and severity into a single index, and in the same time, gives an account of spatial correlations between adjacent road segments using the Getis-Ord G i * statistic and provides control over bias in predictions of hazardous segments using the Empirical Bayes (EB) method. So as to illustrate the process mentioned above, Section 2 provides a brief review of related literature, while Section 3 presents the implemented methodology to process the given data, Section 4 lists the results with discussion, and finally, Section 5 concludes the findings of this article. crash severity. Aside from the observed features, there are other points that should be taken into consideration when analysing crash data, these factors include (a) underreporting and source of data, as police reports count about 50% of actual property-damage-only (PDO) crashes; (b) ordinal nature of severity data and the correlation among neighbouring severity levels; (c) fixed parameters that restrict the effects of explanatory variables to be the same across individual observations, ignoring unobserved heterogeneity such as in risk-taking behavior; (d) fixed severity thresholds that lead to biased estimations of features (e.g., airbags implementation), affecting the likelihood of certain severity levels due to a constrained shift in thresholds; (e) endogeneity bias that occurs when explanatory variables (e.g., the amount and frequency of warning signs) are affected by the dependent variable (e.g., the crash rate); (f ) within-crash and crash type correlations (e.g., number of crash-involved personnel, collision angle, cause, and responsibility, etc.) [1][2][3].
In addition to the combined effect of crash frequency and severity, spatial and temporal correlations between crash observations were also a major focus. Ma and Kockelman [4] performed a study on speed limit's effect on traffic safety by clustering homogenous highway segments from the state of Washington into crash counts by severity. Random-effects linear model and ordered probit regression were used for estimating crash rates and severity, respectively. On the other hand, Soltani and Askari [5] used spatial autocorrelation methods (Moran's I and Getis-Ord G i * index) for clustering hotspots in Iran. Clusters were compared in three attributes; time of the crash, severity index, and location. In comparison, Bao et al. [6] utilized the RF technique to identify factors affecting crash severity levels with the purpose of studying unobserved travel patterns and spatial correlations related to crash rate distribution for large-scale taxi GPS data from the city of New York.
Efforts were also made to introduce a combined index used for ranking potential hotspots, providing a reference to impose safety countermeasures. Da Costa et al. [7] incorporated crash severity in a collective risk ranking of road segments in Australia. e severity-weighted frequencies per crash type (based on crash severity costs in Australia) were adjusted for a potential RTM bias via the EB method. Similarly, Afghari et al. [8] employed the EB method to predict hotspots in Australia besides a joint risk model of both crash rate and severity, stating that the latter has an advantage over traditional count models. Table 1 lists additional articles related to crash risk analysis.

Dataset and Methodology
e dataset from the HSIS database (years 2006 to 2011), which covers 38 counties of Washington state, contains accident data for each year, including crash scene characteristics and conditions. Different data logs were merged using a Visual Basic for Applications (VBA) automated MS Excel worksheet. e data were then split into two groups, the first group for building the models (years 2006 to 2009) and the other was used as a testing sample (years 2010 and 2011). Crash data (including time, weather, and severity level) referenced a single point on the roadway and only account for the most severe observation, while crash scene characteristics were collected from the homogenous segment of the roadway where the crash occurred [18]. Data went through a preprocessing stage which involved new categorization based on ordinal ranking for each feature, in addition to the elimination of redundant features and observations, and finally adding new features (such as hotspots).

Hotspots.
To verify whether a crash scene or a road segment is a part of a larger cluster of hazardous locations, the Getis-Ord statistic G i * was calculated. Additional data for milepost coordinates were obtained from the Washington State Department of Transportation (WSDOT) geospatial database [19]. Using crash scene milepost and the related coordinates of that milepost, and by implementing a VBA automated Excel worksheet, all crash observations from the years 2006 to 2009 were incorporated in the G i * statistic calculation process. e collected crash data categorized severity into 8 classes. However, to conform with the crash cost data for the state of Washington [20], crash severity categories were transformed into 4 severity classes based on the KABCO ranking as shown in Table 2.
In order to incorporate crash counts into the calculation process, all crash observations were weighted according to their cost value (C � Cost/Cost PDO ). e study area (about 580 km × 380 km) was divided into 1 km 2 cells and each crash observation was assigned to its cell centroid. e G family of statistics can be used as measures of spatial correlations in many aspects, yet G i and G i * (i.e., local statistics) can detect pockets of spatial links within the global study area. In this article, the standardized local statistic G i * was chosen over the other global and local statistics for the sake of both simplicity and efficiency in finding local clusters since it can detect correlations within the same local area (e.g., in the case of adjacent segments or an intersection that are located within the same grid unit i).
where w(d ij ) is the distance weight function between the centroid of interest i and the adjacent centroid j, and x j is the weighted value of crashes in j (i.e., C j ), while n is the total number of cells (or centroids) including centroid i [21]. e distance was assigned to be a 1.5 km radius around centroid i to reach centroids in corner cells (Queen neighborhoods), while the distance function produces only two outputs, 1 for d < 1.5 (including d ii � 0) and 0 for d ≥ 1.5. e G * i value is already a z-statistic and can be directly utilized to identify clusters within the grid according to the p-value of the cell. Figure 1 shows the clusters within the study area as cells with p < 0.01 were considered significant crash-prone areas [22]. A hotspot class was given for each crash observation based on their clusters. Hotspot classes were incorporated in the modeling process as an explanatory variable.

Crash Severity.
Out of 183,400 training data points (years 2006 to 2009) and 85,160 testing observations (years 2010 and 2011), only 157,321 and 43,623 data points were selected, respectively. Preprocessing included removing data points with missing data, as well as the elimination of features with low variance, Spearman Correlation ranking, and recursive feature elimination and cross-validated selection (RFECV). Figure 2 provides a description of the topranked features with regard to their importance to crash severity prediction.
Two of the most popular machine learning algorithms supported by Python were used to fit the observed crash severity levels and the related crash features.

Random Forests.
Random forest is an ensemble of K tree-structured estimators (h k ) that split according to the values of a random vector Θ k sampled independently and with the same distribution for all trees in the forest. Classification accuracy results from growing the trees and collecting their votes for the most popular class y at input x. A margin function mg(xy) measures the extent to which the average number of votes at x, y for the right class exceeds the average vote for any other class. e larger the margin, the more confidence in the classification.
where I(·) is the indicator function and PE * is the generalization error [23]. Nonetheless, given the convenience of the Scikit-learn ensemble package [24], a random forest of 1,000 estimators was used to classify the severity levels.

Ordered Logit.
Discrete choice models, such as the multinomial logit and the mixed logit, were used extensively in the transportation domain (e.g., route choice and travel behavior) [25,26]. However, as mentioned in Section 2, the nature of severity data requires an ordinal classifier. Based on the application of the generalized linear models (GLM), the ordered logit model (OLM) scales the response values into J ordered classes, and each response z j is confined within θ j thresholds (θ j-1 < z j < θ j ). To estimate the optimal slope Author Domain Methodology Chiou and fu [9] Joint models Multinomial-generalized Poisson joint model with error components. Sugiyanto et al. [10] Collective risk Upper Control Limit (UCL) for hotspots with equivalent accident number.
Ouni and Belloumi [11] Risk factors Multinomial logit model to identify factors affecting severity. Ripley's K-function and kernel density estimation (KDE) for vulnerable road users (VRUs) clusters.
Xu et al. [12] Risk factors crash severity A sequential logistic regression model with random parameters for multiple collision angles.
Feng et al. [13] Risk factors Real-time experiment on the psychological and behavioral changes of drivers in longitudinal underpass segments.
Wen et al. [14] Spatial correlation crash rate A Poisson-based model with spatial conditional autoregressive priors (CAR) for crashes within adjacent road segments.
Zeng et al. [15] Spatial correlation crash severity Bayesian generalized ordered logit model with CAR prior spatial term, fixed parameters, and flexible thresholds.
Ulak et al. [16] Spatial correlation crash rate Getis-Ord G i * and local Moran's I distance-based spatial weights with crash prediction accuracy index (CPAI).
Wang et al. [17] Risk factors A questionnaire-based study on risk perception among cyclists, and the major influencing factors in the prediction of risky behaviors. vector parameters w j and thresholds θ j , both prediction errors and the sum of all penalties (for crossing each threshold), given by the all-thresholds loss function, should be minimized in all classes.
where f(·) is a margin penalty function, y is the predicted range, and s(j, y) is positive if j ≥ y and negative otherwise. e data can be fitted using the posterior probability P(y ≤ j | x), and the cumulative probability of the ordinal class follows a logistic function [27].
is procedure is implemented in the MORD library under the all-threshold variant classifier [28].

Crash Rate.
e 157,321 observations from the years 2006 to 2009 and the 43,623 observations from the years 2010 and 2011 (i.e., data used in Section 3.2.) went through further preprocessing to be implemented in the crash rate prediction process. Crash observations were clustered for each homogenous road segment, with a length of no less than 0.16 km for convenience [29]. Since the HSIS dataset only provides the most severe case for a single crash observation, grouping data into segments for each year makes temporal variables (e.g., weather conditions) redundant. 19,490 segments from the years 2006 to 2009 and 7,321 from 2010 to 2011 were selected for training and testing the model. e features were filtered by standardized coefficients and Spearman Correlation rankings, as well as the embedded RFECV method (See Figure 3).

Random Forest
Regressor. Similar to RF classifiers, RF regressors are formed by growing trees depending on the random vector Θ. e output of the tree predictors h k is numerical and the training set is independently drawn from the distribution of the random vector y, x. An accurate regression forest has a low correlation between residuals and low error trees. e mean squared generalization error for the forest E x,y (y − avg k h(x, Θ k )) 2 is formed by taking the average random predictor over k trees [23]. Scikit-learn's regressor was used with 1,000 estimators.   Journal of Advanced Transportation

Bayesian Ridge Regression.
Bayesian inference techniques are considered popular machine learning tools in the transportation field (e.g., intelligent transportation systems) [30]. Ridge regression is a form of linear regression that penalizes the coefficients w with the ridge coefficient α ≥ 0, hence minimizing the residual sum of squares.
e larger the value of α, the greater the amount of shrinkage. Similarly, the Bayesian Ridge (BR) includes regularization parameters during the estimation procedure, where w has a prior p(w|λ) given by a spherical Gaussian N(w|0, λ −1 I p ). e numerical output y of the probabilistic model p(y|x, w, α) � N(y|xw, α) is assumed to follow a Gaussian distribution and both priors for α and λ are randomly chosen from gamma distributions. w, α, and λ are estimated during the fit of the model by maximizing the log marginal likelihood [24]. Scikit-learn's BayesianRidge was used with default initial parameters.

Empirical Bayes.
Traffic crashes are random events, thus predicting the rates have some limitations due to the inherent characteristics of the data itself, regardless of the methods used to collect it. ese limitations can introduce bias and affect crash data reliability, especially in the long term. Statistically, a low crash rate period will likely be followed by a high rate period, and vice versa. is tendency is known as regression-to-the-mean (RTM). Applying shortterm predictions for longer periods can cause a bias called RTM bias (See Figure 4). e EB method is used to estimate the expected average crash rate N E of an individual site (road segment or intersection) for a given time period T (in years) on the condition of unchanged site features (e.g., geometric design). e estimate relies on predictive models N P (modified for similar characteristics or control function) combined with observed crash rates N O .
An overdispersion parameter (i.e., a variance of classification) is needed to provide an indication of the statistical reliability of N P . e lower is the overdispersion parameter, the more statistically reliable is the model. is parameter is used in the EB method to provide the weights w and 1 − w to N P and N O, respectively [29]. As N P increases, N E becomes more dependable on the N O value.
Since the crash rate model already incorporated various variables (e.g., road features) in the fitting stage, in addition for providing more simplicity, N P values for the study period (i.e., years 2006 to 2009) are taken directly from the model output without further modifications regarding these features. Aiming to give more importance to segments with low rates but high severity levels (see Figure 5), both N P and N O should be converted to severity-weighted rates S P and S O , respectively.
where P itj is the probability of crash severity j occurring on segment i in year t, C j is the cost for severity level j, and M it is the total number of crash predictions or observations in segment i and year t. When S P is calculated, P is taken directly from the classifier probability predictions, while for S O , P is simply taken as the ratio m j /N O .  Figure 4: Rtm effect on crash rate prediction [29]. Grade percentage.
Degree of horizontal curvature.
Absolute change in speed limit between adjacent segments. * Blue indicates positive correlation.
Median width in feet.
Lane width in feet. Journal of Advanced Transportation 5 Finally, N E is modified to estimate the collective risk (CR) by predicting the weighted crash rate S F for future year t * (i.e., the year 2010 or 2011) [7].

Results and Discussion
Crash severity was found to be correlated to some unobserved random features such as weather conditions and collision points. ese features were needed to be drawn from a suitable distribution to be used in the model's testing stage. Both weather and road surface conditions follow a Gaussian distribution, while the collision point follows a Beta distribution (with α � 0.2 and β � 0.4). AADT was taken as an average from the years 2006 to 2009. Crash severity results are summarized in Table 3. It can be seen that both methods could not produce accurate results. Furthermore, predictions were biased towards no Injury/PDO severity class (largest number of observations). However, the RF method generated more correct predictions in the higher severity classes; therefore, class probabilities shall be taken from the RF output. Incorporating the random features did not affect the overall performance of the model; hence, only AADT and median width can be used to predict severity class. Figure 6 illustrates the results of crash rate models.
RF regressor gave more accurate predictions for crash rates, while the BR model had underdispersed results. us, RF predictions are selected to be used as N P and N F values, respectively. EB weights for S P indicate that predicted values have less importance to the estimated crash rate (see Figure 7).
Since not all segments had crash observations during both periods (i.e., years 2006-2009 and 2010-2011), only 3,649 and 2,647 segments from the years 2010 and 2011, respectively, were ranked according to their CR values. e top 5 ranked segments are listed in Table 4.
CR ranking indicates a local cluster on route 5 in the city of Seattle. Nevertheless, only 9 segments were included in the top 100 ranks of both actual and predicted rankings. e predictions of the year 2011 showed a slight improvement in accuracy for the top 1000 segments (see Figure 8). is can be understandable since the EB method is more effective in long-term estimates rather than the short-term. Severity weights also have a role in some bias towards crash clusters.

Conclusion
Machine learning algorithms introduce simple alternatives to the conventional regression models by providing both reasonable estimations and a user-friendly interface. In other words, it does not require comprehensive statistical knowledge. Random Forest accuracy for regression and classification outperformed traditional models (i.e., linear and logistic estimators). Yet, a multistage algorithm that combines RF and other models might produce better results and should be further investigated. Annual average daily traffic was a major factor in predicting both crash severity and frequency among some road features. On the other hand, random and temporal variables had a very small impact. e Empirical Bayes method demonstrated significant control over predicted crash rates, keeping it in a reasonable range. Although CR rank predictions were relatively poor in terms of reliability, having a collective risk estimate exhibits more potential to identify hotspots or localized clusters of safety hazards, thanks to the combination of frequency and severity of crashes. Attention should be paid to severity weights (i.e., considering an alternative to  Journal of Advanced Transportation the cost index C) and the screening method (e.g., using road segments instead of grid cells) to avoid allocating large weights on outliers, especially when reported data points are related to the most severe case and there is only one observation per segment.
Data Availability e traffic crash datasets used to support the findings of this study were supplied by HSIS. Requests for access to these data for research and educational purposes should be made to HSIS staff at https://www.hsisinfo.org/datarequest.cfm. Other types of data are cited at relevant places within the text as references.

Conflicts of Interest
e authors declare that there are no conflicts of interest regarding the publication of this paper.