Evaluating Highway Traffic Safety: An Integrated Approach

This paper presents a novel methodology for determining the overall highway safety level by integrating statistical analysis and the analytic network process (ANP) with set pair analysis (SPA), which is applied here to the evaluation of overall highway safety for the first time. The methodology accounts for both quantitative and qualitative factors that contribute to traffic safety. The statistical analysis uses crash, alignment, intersection, and other data to determine the significant indices (variables) that affect safety. These indices are then combined with the planning (qualitative) indices, and the weights of all indices are determined from expert opinions using ANP. Finally, the overall safety level of the highway is determined using SPA. The methodology is illustrated using data collected from two highways in China. The results demonstrate that the proposed methodology is sound and reliable. The methodology is applicable to existing or new highways and can help to effectively evaluate the overall safety of a highway and develop long-term strategies for safety improvements.


Introduction
With the increased mobility on highways, traffic crashes have substantially increased and safety assessment has become increasingly important [1]. Traffic safety has been among the cutting-edge topics for scientific research [2]. For example, Theofilatos [3] adopts Bayesian and finite mixture logit models to predict the likelihood and severity of road accidents on urban arterials. Wang and Huang [2] apply a Bayesian hierarchical joint model to evaluate road network safety. Christoforou et al. [4] use multivariate probit models to examine the relationships between a variety of traffic factors and the type of crash. Siegrist [5] presents a method for the ex ante estimation of a potential road safety program. Nghiem et al. [6] adopt a state-space time-series model to investigate the determinants of accidents. Kweon [7] uses regression analysis with correction for serial correlations to identify factors affecting the changes in traffic safety. These studies are mainly based on "hard data" or quantitative indices and therefore do not consider the full spectrum of indices that may affect highway safety. Several advanced analytical methods that consider both quantitative and qualitative indices have been developed in the literature, including the analytic network process (ANP) and set pair analysis (SPA). Yet few attempts have been made to integrate these methods for evaluating overall highway traffic safety so as to aid the planning decisions of local and national government agencies for improving traffic safety.
To address the research gap, the current paper proposes a novel methodology that integrates three analysis methods, i.e., regression analysis, ANP, and SPA, for evaluating highway traffic safety. SPA adopts a system theory using a connection number to process the uncertainty caused by fuzzy, random, and incomplete information, as proposed by the Chinese scholar Keqin Zhao in 1989 [8]. SPA is little known outside China, despite the popularity it enjoys in its home country. In contrast, ANP has been extensively applied in a variety of contexts. In the area of safety, the analytic hierarchy process (AHP) has been implemented to evaluate railway traffic safety, ship traffic safety, road traffic safety, and work zone safety [9]. ANP integrates expert judgments into the evaluation process and considers the interdependencies and interactions among evaluation elements [10,11]. This important feature has attracted many decision-makers and planners in the fields of urban planning, logistics, supply chain management, and transportation [12].
In this integrated model, the SPA is used to determine the overall safety level of the highway by analyzing the features of a set pair of indices from the aspects of identity, discrepancy, and contrary. The method requires two basic inputs: safety indices (or criteria) and their weights. It was decided in this study to develop an objective process for determining these two inputs. The significant safety indices were determined using regression analysis and their weights were determined using the ANP. The three methods are used sequentially, where the output of one method is used as input to the next method. As such, the final decision drawn from the three methods is more objective than one drawn from a single method. In this integrated model, first, regression analysis is used to identify the statistically significant alignment, intersection, and general indices. Second, the algorithm-based ANP is used to determine the weights of the evaluation indices of highway traffic safety. Finally, SPA is used to determine the grade of the highway based on fuzzy membership functions of the assessed indices and specified criterion grades.
The following sections describe the methodology and its integrated components and illustrate its application using actual data from a highway in Fujian Province, China. The main conclusions of the study are then presented.

Proposed Methodology
The proposed methodology integrates three steps (regression analysis, ANP, and SPA) that are performed sequentially, where the output of one step is used as input to the next step, as shown in Figure 1. Regression analysis uses the quantitative alignment, intersection, and general indices to determine the statistically significant ones.
The ANP uses the significant indices and the qualitative planning indices to determine the index weights. The SPA uses the weights to determine the overall highway safety. Note that the ANP and SPA steps each involve the use of expert opinions. The first step is a traditional regression analysis to determine the statistically significant indices that contribute to traffic crashes; it was performed using STATA software, described in detail in Gutierrez [13]. The methodology presented in the paper is concerned with determining the overall safety of a highway section, rather than that of individual elements as in the traditional approach. In this respect, it is beneficial for highway safety policy measures that the safety indices be as aggregate as possible.
The analytic network process integrates expert judgments into the evaluation process and considers the interdependencies and interactions among evaluation elements, as pioneered by Saaty [14] and Saaty and Vargas [11]. This important feature has attracted many decision-makers and planners in the fields of urban planning, logistics, supply chain management, and transportation [12]. In the transportation field, the AHP has been implemented in numerous applications. The applications include the impacts of transit priority on signal coordination [15], project selection and prioritization of pavement preservation [16], prioritizing network level maintenance of pavement segments [17], assessing options to enhance bicycle and transit integration [18], selection and prioritization of intelligent transportation system user service [19], integration multiple criteria decision making to prioritize transportation [20], assessing asphalt pavement construction quality control [21], analytic minimum impedance surface [22], prioritizing traffic calming projects [23], and determining low volume road standards, long-term needs, and environmental risks [24]. In the area of safety, AHP has been implemented to evaluate railway traffic safety, ship traffic safety, road traffic safety, and work zone safety [9,25].
The set pair analysis adopts a system theory using a connection number to process the uncertainty caused by fuzzy, random, and incomplete information as proposed by a Chinese scholar Keqin Zhao in 1989 [8]. The method has been extensively applied in the field of water resource management and agricultural science. For example, Feng et al. [26] used  [29] constructed a comprehensive evaluation model based on improved SPA that better serves reservoir dispatching. Application of SPA in transportation engineering, however, has been rather limited. The specifics of the two steps involving the implementation of the ANP and SPA are described next.

Analytic Network Process.
The analytic network process is a decision-making tool that is suited for nonindependent hierarchy structures, as described by [30]. The ANP structure is made up of clusters and nodes. The method involves carrying out pairwise comparisons by an expert who judges how important an index i is when compared to another index j with respect to the overall goal. The verbal judgments of the experts are then transformed into numerical values using a nine-level Likert scale, as shown in Table 1. The odd numbers represent the primary importance weights and the even numbers represent intermediate importance weights. The scale is used to compare input parameters in pairs to determine how much more or less important one parameter is than another. Based on this comparison, an n by n evaluation matrix $A = (a_{ij})$ is established, where n is the number of parameters involved in the decision and $a_{ij}$ is the relative value of index i to index j. Mathematically speaking, the evaluation matrix is defined as

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}, \tag{1}$$

$$a_{ii} = 1, \qquad a_{ji} = \frac{1}{a_{ij}}. \tag{2}$$

The evaluation matrix is related to the eigenvector (weights) and eigenvalue by

$$A w = \lambda_{\max} w,$$

where $A$ is the matrix with elements $a_{ij}$, $w$ is the eigenvector of matrix $A$ with elements $w_i$, $i = 1, 2, \ldots, n$, which represent the weights of the indices, and $\lambda_{\max}$ is the maximum eigenvalue of matrix $A$. The weights are then normalized to obtain the final weights of the indices as follows:

$$W_i = \frac{w_i}{\sum_{k=1}^{n} w_k}, \qquad i = 1, 2, \ldots, n.$$

To assess the consistency of the decision-maker in the assignment of the importance weights, a consistency ratio is computed as follows:

$$CR = \frac{CI}{RI}, \qquad CI = \frac{\lambda_{\max} - n}{n - 1},$$

where CR is the consistency ratio, CI is the consistency index, RI is the random consistency index, $\lambda_{\max}$ is the maximum eigenvalue, and n is the size of the matrix. The value of RI has been defined in the literature based on matrix size, where it ranges from 0 to 1.49 for n = 2 to 10, respectively [31]. The consistency ratio helps to identify possible errors in judgment and actual consistency in the judgment itself.
It has been suggested that CR should be less than 0.1, which means that the method allows up to 10% error in human judgment during the paired comparison process. If the error is greater than 10%, the experts are asked to revise the pairwise comparisons. To calculate the weights of the parameters in this study, the Super Decisions software developed by the Creative Decisions Foundation was used [32].
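The weight and consistency calculations described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Super Decisions implementation used in the study: the function name `priority_weights`, the 3 by 3 judgment matrix, and the use of NumPy are assumptions made for the example, and the RI values are the commonly tabulated ones (0 for n = 2 up to 1.49 for n = 10).

```python
import numpy as np

# Commonly tabulated random consistency index (RI) values by matrix size n.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def priority_weights(A):
    """Return (normalized weights W, lambda_max, CR) for a judgment matrix A."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)            # position of the maximum eigenvalue
    lam_max = eigvals[k].real
    w = np.abs(eigvecs[:, k].real)         # principal eigenvector (sign removed)
    W = w / w.sum()                        # normalized weights sum to 1
    ci = (lam_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0        # CR is taken as 0 when RI is 0
    return W, lam_max, cr

# Hypothetical, perfectly consistent judgments for three indices.
A = [[1.0, 2.0, 4.0],
     [1 / 2, 1.0, 2.0],
     [1 / 4, 1 / 2, 1.0]]
weights, lam_max, cr = priority_weights(A)
print(weights, lam_max, cr)
```

For this hypothetical matrix the judgments are fully consistent, so the maximum eigenvalue equals n and CR is zero; with real questionnaire data, a CR of 0.1 or more would send the comparisons back to the experts for revision.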

Set Pair Analysis.
The set pair analysis is a modified uncertainty theory that considers both certain and uncertain indices as an integrated system and depicts the certainty and uncertainty systematically in terms of three aspects: identity, discrepancy, and contrary. The SPA refers to a pair that consists of two interrelated sets and uses connection degrees to handle, in a unified manner, the uncertainty caused by fuzzy, random, mediation, and incomplete information. The main steps of SPA are as follows: (a) structure the sets in view of the problem, (b) analyze the features of the two sets, and (c) set up a connection-degree formula for the two sets including identity, discrepancy, and contrary degrees.
Let N be the total number of features of the two sets, $\mu$ the connection degree of the set pair, S the number of identity features, P the number of contrary features, and F the number of features of the two sets that are neither identity nor contrary, denoted as discrepancy features (so that F = N - S - P). Let the ratios S/N, F/N, and P/N represent the identity, discrepancy, and contrary degrees of the two sets, respectively, and j be the coefficient of the contrary degree (specified as -1). The coefficient of the discrepancy degree i is an uncertain value between -1 and 1; that is, $i \in [-1, 1]$. The uncertainty of the discrepancy degree of the two sets is eliminated when i equals -1 or 1 and increases as i approaches zero. Therefore, the connection degree of the set pair (A, B), $\mu(A, B)$, is defined as

$$\mu(A, B) = \frac{S}{N} + \frac{F}{N}\, i + \frac{P}{N}\, j.$$

Let a = S/N, b = F/N, and c = P/N; then

$$\mu(A, B) = a + b\, i + c\, j, \tag{7}$$

where a + b + c = 1.

For the evaluation, the domain of highway traffic safety is divided into five grades, V = {Excellent, Very Good, Good, Average, Poor}, referred to as Grades I to V. Grades I to IV cover the ranges 80-100, 60-80, 40-60, and 20-40, respectively, and Grade V covers scores up to 20. Note that the lower limits of Grades I to IV start from values just greater than those shown in the above grade ranges. The corresponding index values are defined as set B = {100, 80, 60, 40, 20}. According to Feng et al. [33], if the values of the indices are within the specified boundaries of the grade, they are considered identity; if they are within separated boundaries of the grade, they are considered contrary; and if they are within the adjacent boundaries of the grade, they are considered discrepancy.
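As a small worked illustration of the connection degree of a set pair, the following Python sketch evaluates a + b i + c j from feature counts. The function name `connection_degree` and the counts used in the example are hypothetical; the default j = -1 follows the text, while the value of i is left to the caller because the text treats it as uncertain within [-1, 1].

```python
def connection_degree(S, F, P, i=0.0, j=-1.0):
    """Connection degree mu = a + b*i + c*j for a set pair.

    S, F, and P are the numbers of identity, discrepancy, and contrary
    features; N = S + F + P is the total number of features.
    """
    if not -1.0 <= i <= 1.0:
        raise ValueError("discrepancy coefficient i must lie in [-1, 1]")
    N = S + F + P
    if N <= 0:
        raise ValueError("the set pair needs at least one feature")
    a, b, c = S / N, F / N, P / N   # identity, discrepancy, contrary degrees
    return a + b * i + c * j

# Hypothetical counts: 6 identity, 3 discrepancy, and 1 contrary feature.
mu = connection_degree(S=6, F=3, P=1, i=0.5)
print(round(mu, 4))
```

With these hypothetical counts, a = 0.6, b = 0.3, and c = 0.1, so the choice of i shifts the result between 0.2 (i = -1) and 0.8 (i = 1).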
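To illustrate this, consider $\mu(A, B_1)$, for example. Assume that there are S indices falling in Grade I whose weights are $u_1, u_2, \ldots, u_S$, F indices falling in the adjacent grade whose weights are $t_1, t_2, \ldots, t_F$, and P indices falling in the separated grades whose weights are $v_1, v_2, \ldots, v_P$. Then

$$\mu(A, B_1) = \sum_{k=1}^{S} u_k + \sum_{k=1}^{F} t_k\, i_k + \sum_{k=1}^{P} v_k\, j, \tag{9}$$

where $i_k$ reflects the discrepancy degree of set $(A, B_1)$ about the evaluation index $x_k$ whose weight is $t_k$ (that is, $i_k$ is the fuzzy connection degree between the evaluation index $x_k$ and the Grade II standard value $b_{1k}$). Then, $i_k$ is defined as

$$i_k = a_k + b_k\, i + c_k\, j, \tag{10}$$

and the connection degrees of the set pair $\{x_k, B_1\}$ are defined in (11)-(13), where $a_k$, $b_k$, and $c_k$ are the identity, discrepancy, and contrary degrees, respectively. Note that the identity degree is the approximate degree between $x_k$ and $S_{1k}$, while the contrary degree represents the approximate degree between $x_k$ and $S_{2k}$. The terms $S_{1k}$ and $S_{2k}$ represent the upper and lower limits of Grade II standard values, respectively.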
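The weighted form of the connection degree can be sketched as follows, under the assumption that it sums the weights of identity indices, the weights of discrepancy indices multiplied by their fuzzy connection degrees, and the weights of contrary indices multiplied by j. The function name, the index weights, the index grades, and the fuzzy connection degrees below are all hypothetical; in particular, the fuzzy connection degrees are simply passed in (and set to zero here) rather than computed from grade limits.

```python
def weighted_connection_degree(weights, grades, target, i_values, j=-1.0):
    """Weighted connection degree between a set of indices and one grade.

    weights  : index weights (assumed to sum to 1)
    grades   : assessed grade of each index (1 = Grade I, ..., 5 = Grade V)
    target   : grade against which the indices are compared
    i_values : fuzzy connection degree of each index, used only when the
               index falls in a grade adjacent to the target
    """
    mu = 0.0
    for w, g, i_k in zip(weights, grades, i_values):
        gap = abs(g - target)
        if gap == 0:        # identity: index lies in the target grade
            mu += w
        elif gap == 1:      # discrepancy: index lies in an adjacent grade
            mu += w * i_k
        else:               # contrary: index lies in a separated grade
            mu += w * j
    return mu

# Hypothetical data for four indices.
weights = [0.4, 0.3, 0.2, 0.1]
grades = [1, 2, 2, 4]
i_values = [0.0, 0.0, 0.0, 0.0]   # simplifying assumption for the sketch

degrees = {t: weighted_connection_degree(weights, grades, t, i_values)
           for t in range(1, 6)}
best_grade = max(degrees, key=degrees.get)
print(degrees, best_grade)
```

The grade with the largest connection degree is taken as the overall grade; with these hypothetical inputs that is Grade II.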

Empirical Application
The proposed methodology was applied to assess the traffic safety level on an actual highway in China. Details on data collection, implementation of the three basic steps of the methodology, and the respective results are described in this section.

Data Collection.
The data collection involved three main tasks: (1) selection and definition of evaluation indices, (2) collection of indices data to calibrate the methodology, and (3) selection of the highway section to be evaluated. For the first task, various indices that affect traffic safety were divided into four categories: planning indices, alignment indices, intersection indices, and general indices (see Table 2).
To ensure that the evaluation methodology is effective, the indices selected were rational, simple, and comprehensive. However, only the significant indices are determined and used for evaluating the highway safety level. For the second task, the indices data were collected for Xia Rong expressway from BK111 + 800 to BK107 (longitude = 116.88°, latitude = 25.17°) and G324 Line Fortress arterial road (longitude = 118.66°, latitude = 24.94°), as shown in Figure 2. The crash statistics (2010-2013) were obtained from the traffic police corps of Fujian Province. There were 117 crashes on the expressway and 150 crashes on the arterial road. Then, the crash data were organized to establish the panel data and perform the regression analysis. Four planning indices that affect highway safety were identified: functional grade, highway classification, land use, and service level. The ANP was subsequently used to analyze these indices along with the significant indices obtained from the statistical analysis.
For the third task, a section of Horizontal Five (HF) highway that has similar characteristics to the highways for which the methodology was calibrated was selected for evaluation. The section extends from Taiding (Figure 3). The HF highway is a second-class highway with a design speed of  The section contains tangents, horizontal curves, and vertical curves. The desirable minimum horizontal curve radius is 100 m, while the absolute minimum radius is 60 m. The recommended and ultimate maximum longitudinal slopes are both 7%. The section passes through numerous villages with intersection density up to five intersections per kilometer. This section is considered suitable for a comprehensive evaluation using the proposed methodology.

Results of Regression Analysis.
Regression analysis was performed to establish the significant alignment, intersection, and general indices (independent variables) that affect traffic safety. The crash was used as the dependent variable, where the index takes a value of 1 if it affects traffic safety and takes a value of zero otherwise. The analysis involved 13 independent indices: five alignment indices, three intersection indices, and four general indices. A sample of the panel data of these indices is shown in Table 3. The table shows crash data involving four crashes on the expressway (1, 3, 116, and 117) and four crashes at the intersections (118-120, 265). As previously mentioned, the total number of crashes used in the regression analysis was 117 crashes. The results of regression analysis are shown in Table 4. The indices that have p values less than or equal to 0.05 are considered statistically significant. As noted, the significant indices are grade length, horizontal curve radius, sight distance, superelevation, intersection density, and intersection sight distance. The p values of the preceding indices are 0.002, 0.018, 0.022, 0.001, 0.040, and 0.003, respectively. The impacts of other indices on traffic crashes were not significant at the 95% confidence level. The preceding significant indices are considered further along with the planning indices in the ANP step.

Table 3: Panel data of 13 indices related to selected crashes on the expressway and intersections (total number of crashes is 117).

Building ANP Traffic Safety Model.
Based on the relationships among the indices of highway traffic safety, an ANP decision model with inner dependencies was built, in which each cluster was linked to itself through a loop link, as shown in Figure 4. According to this model, traffic safety is affected by alignment indices, intersection indices, and planning indices. The significant alignment indices (GL, R, SD, and e) and intersection indices (D, S), which are quantitative, were previously determined using regression analysis. The planning indices included FG, HC, LU, and SRL.

Computing Weights.
Since the indices affect highway safety differently, their relative weights were determined using questionnaires completed by 14 experts. The experts were from local and provincial governments and consulting companies, along with college professors. The panel composition covered the broad categories of indices, including highway planning, geometric design, and road safety. Each expert was asked to compare all ten indices and rank them based on their experience and knowledge of the HF highway. Most experts were designers who have been involved in the design and construction of numerous highway facilities in the province. The ANP software was then used to help the respondents evaluate those indices. The software allows the respondents to judge key indices by using a pairwise comparison. In this method, the expert compared two indices at a time, decided which one was more important in promoting highway safety, and set the preference level of the selected index. The scale for pairwise comparison describing the intensity of judgments was presented previously in Table 1. Note that, based on (2), if index i has one of the numbers in Table 1 assigned to it when compared with index j, then index j has the reciprocal value when compared with i.
The average of the judgment scores was then used to establish the judgment matrix for comparison of the clusters and the indices within each cluster. The ANP involved two main tasks. First, the judgment matrix is used to establish the respective weights for clusters (Table 5) and indices (Tables 6-12). Note that SD and SL were not included in the evaluations conducted in Tables 9 and 10, respectively, since they were considered to have little impact on GL and FG. For each pairwise comparison, the consistency ratio (CR) is shown. As noted, the CR values of all comparisons are less than 0.1, which indicates that all comparisons pass the consistency test.
To illustrate the calculations of the weights in Tables 6-12 (last column), consider Table 6. The judgment matrix A is formed from the averaged pairwise comparison scores, and its normalized principal eigenvector gives the values shown in the last column of Table 6. Second, the normalized weights in Tables 6-12 make up the unweighted index matrix, which should be modified to reflect the weights of the cluster level. Thus, the components of the unweighted matrix were multiplied by the corresponding cluster weights to obtain the weighted, normalized matrix. The final weights of the indices are shown in the last column of Table 13.
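The cluster-weighting step can be sketched as a block-wise scaling of the unweighted matrix followed by column normalization. The sketch below is an assumption about how that step might be coded, not a reproduction of the software used in the study: the function name `weight_supermatrix`, the two-cluster layout, and all numerical values are hypothetical.

```python
import numpy as np

def weight_supermatrix(unweighted, cluster_weights, cluster_sizes):
    """Scale each block of the unweighted matrix by its cluster weight.

    unweighted      : square matrix of index weights, ordered by cluster
    cluster_weights : matrix whose (p, q) entry weights block (p, q)
    cluster_sizes   : number of indices in each cluster, in order
    """
    U = np.asarray(unweighted, dtype=float)
    C = np.asarray(cluster_weights, dtype=float)
    bounds = np.concatenate(([0], np.cumsum(cluster_sizes)))
    W = np.zeros_like(U)
    for p in range(len(cluster_sizes)):
        for q in range(len(cluster_sizes)):
            rows = slice(bounds[p], bounds[p + 1])
            cols = slice(bounds[q], bounds[q + 1])
            W[rows, cols] = C[p, q] * U[rows, cols]
    col_sums = W.sum(axis=0)
    col_sums[col_sums == 0] = 1.0      # leave all-zero columns unchanged
    return W / col_sums                # each nonzero column now sums to 1

# Hypothetical example: two clusters with two indices each.
U = [[0.6, 0.3, 0.5, 0.2],
     [0.4, 0.7, 0.5, 0.8],
     [0.1, 0.5, 0.7, 0.4],
     [0.9, 0.5, 0.3, 0.6]]
C = [[0.7, 0.4],
     [0.3, 0.6]]
weighted = weight_supermatrix(U, C, cluster_sizes=[2, 2])
print(weighted.round(3))
```

Because every block of the hypothetical matrix is already column-normalized and each column of the cluster matrix sums to 1, the final normalization leaves the scaled values unchanged; it matters only when some blocks are empty.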

Results of Set Pair Analysis.
To establish the standard values for assessment grade, the evaluation domain of highway traffic safety was divided into five grades, as previously described. To establish the grade of each index, 30 experts were invited to grade the ten indices, 25 responses were received, and 2 incomplete responses were discarded (the panel was different from that used in the ANP). The assessment results (scores and grades) as well as the weights from the ANP are shown in Table 13. The scores in this table are calculated as $\text{score} = \sum_{i=1}^{5} M_i U_i / 23$, where $M_i$ is the number of responses for Grade i and $U_i$ is the upper limit of Grade i. For example, in Table 13 the score of FG is calculated as (13 x 100 + 7 x 80 + 3 x 60 + 0 x 40 + 0 x 20) / 23 = 88.7%. Hence, the index FG has Grade I since 88.7% lies between 80% and 100%.
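The score calculation for a single index can be checked with a short Python sketch. The function names are hypothetical, and the grade boundaries are an assumption based on the stated upper limits (100, 80, 60, 40, 20) with each lower limit starting just above the next grade's upper limit.

```python
GRADE_UPPER_LIMITS = [100, 80, 60, 40, 20]   # Grades I to V

def index_score(responses):
    """Weighted average of grade upper limits over the valid responses.

    responses[i] is the number of experts who assigned Grade i + 1.
    """
    total = sum(responses)
    return sum(m * u for m, u in zip(responses, GRADE_UPPER_LIMITS)) / total

def grade_of(score):
    """Return the grade (1 to 5) whose range contains the score."""
    for grade, upper in enumerate(GRADE_UPPER_LIMITS, start=1):
        if score > upper - 20:     # lower limit is exclusive
            return grade
    return 5                       # a score of exactly 0 falls in Grade V

# Responses for FG as given in the text: 13, 7, 3, 0, and 0 for Grades I to V.
fg_score = index_score([13, 7, 3, 0, 0])
print(round(fg_score, 1), grade_of(fg_score))
```

The 23 valid responses reproduce the FG score of 88.7 reported in the text, which falls in Grade I.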
To compute the connection degrees, the 10 indices under consideration were denoted as set A to form a set pair with the different standard levels of set B. Using (9)-(13) and the results of Table 13, the connection degrees for each grade were calculated. They were then compared, and the grade corresponding to the maximum connection degree represented the grade of highway traffic safety. The results are shown in Tables 14-16. Note that in the equation of fuzzy connection degree (see (10)), i is the coefficient of the discrepancy degree and is specified as 0, and j is the coefficient of the contrary degree and is specified as -1. In Table 14, the sum of $u_k$ equals 0.2817, the sum of $t_k i_k$ equals -0.011, and the sum of $v_k j$ equals -0.1095. Therefore, based on (9), $\mu(A, B_1)$ = 0.162. The respective calculations are shown as a footnote for each table. Thus, the order of the connection degrees (from large to small) is $\mu(A, B_2)$ > $\mu$(A,

Discussion and Conclusions
This paper proposes an advanced methodology for determining the overall highway safety level that integrates quantitative and qualitative methods (statistical analysis, ANP, and SPA) into a single algorithmic method for evaluating highway geometric design. The logic of the study design is, first, to use the three analyses (regression, ANP, and SPA) to generate an objective evaluation method and, second, to use the empirical dataset to test the reliability of the integrated method against the traditional subjective evaluation. The statistical analysis used crash, alignment, intersection, and other data to determine the significant indices (variables) that affect safety. These indices were then combined with the planning (qualitative) indices to determine the weights of all indices based on expert opinions using ANP theory. Finally, the overall safety level of the highway was determined using SPA. The methodology was illustrated using data collected on two highways in China.
The power of the ANP is that it can account for qualitative factors and their interdependencies in the evaluation process. The reason for the success of the ANP is that it elicits judgments and uses measurement to derive ratio scales. Weights as ratio scales are amenable to the basic arithmetic operations of adding within the same scale and meaningfully multiplying different scales. In addition, SPA also involves expert judgments. The judgment in both methods is formulated in a systematic way. Expert knowledge improves the derived results and makes the evaluation more accurate. The feasibility of the proposed methodology is illustrated using the HF highway in Fujian Province, China. Regression analysis was used to analyze 13 indices related to alignment, intersection, and general features, and six indices were found to be statistically significant. These indices were then combined with four planning indices and their weights were determined using the ANP, which involved the use of expert judgments. Based on these weights, the overall traffic safety level of the highway was determined using SPA as Grade II (Very Good). The methodology presented in the paper focuses on the overall safety of a highway section, rather than individual elements, with the aim of providing practical guidance for evaluating the current safety status of highways and for designing new ones. In the proposed methodology, the performance indices are as aggregate as possible. For example, instead of considering the individual elements of combined horizontal and vertical alignments, sight distance was used as a surrogate measure of numerous factors including combined alignments. Similarly, intersection density was used instead of individual intersection characteristics, such as signal features, signal phasing, and intersection geometry. In China, highway classification is considered a key element of road safety.
Although different classes may be designed according to standards, lower highway classes are normally associated with more crashes due to the nature of the traffic mix. Unlike freeways, traffic in lower highway classes includes not only vehicle traffic but also pedestrians, motorcycles, and bicycles. In addition, traffic behavior and the level of adherence to traffic regulations are different. For these reasons, highway classification was considered as one of the planning indices, and the results of the ANP showed that it was an important index.
The computational results demonstrate that the proposed methodology is reasonable, reliable, and applicable for highway safety evaluation, and the evaluation results have significant traffic safety policy implications. Although in this study the methodology was presented in the context of highway safety evaluation, it can be applied to other contexts such as railway, water, and air transportation. This study has some limitations which can be addressed in future research. For example, the weights of the indices have a direct and substantial influence on the assessment results where different weights may lead to a different safety level of the highway. The method used for determining the indices weights is subjective, and the appropriate number of experts used to establish the indices weights could be further explored. Nonetheless, it is hoped that the proposed methodology will be of interest to transportation policy makers, researchers, and practitioners involved in developing long-term strategies for safety improvements.

Appendix Notation
A: Evaluation (judgment) matrix
CI: Consistency index
a, b, and c: Coefficients that equal S/N, F/N, and P/N, respectively
$a_k$, $b_k$, and $c_k$: Identity, discrepancy, and contrary degrees, respectively
F: Number of the discrepancy features of two sets
i: Coefficient of the discrepancy degree, $i \in [-1, 1]$
$i_k$: Fuzzy connection degree between $x_k$ and Grade II standard value $b_{1k}$
j: Coefficient of the contrary degree (-1)
k: Index for evaluation indices
n: Number of significant indices
N: Total number of features of the two sets
P: Number of contrary features of two sets
RI: Random consistency index
S: Number of identity features of two sets
$S_{1k}$, $S_{2k}$: Upper and lower limits of Grade II standard values, respectively
$t_1, t_2, \ldots, t_F$: Weights of the F indices
$u_1, u_2, \ldots, u_S$: Weights of the S indices
$v_1, v_2, \ldots, v_P$: Weights of the P indices
V: Grade domain {Excellent, Very Good, Good, Average, Poor}
w: Weights of indices 1 to n (eigenvector of matrix A)
W: Normalized weights of indices 1 to n
$x_k$: Evaluation index k
$\lambda_{\max}$: Maximum eigenvalue of matrix A
$\mu$: Connection degree of two sets.