Investigating the Impact of Various Risk Factors on Victims of Traffic Accidents

In this study, our goal was to determine the impact of various risk factors on traffic accidents in the city of Shenyang, China, and to discuss the factors common to pedestrian and non-motor vehicle accidents. A total of 1227 traffic accidents from 2015 to 2017 were analyzed, of which 733 involved pedestrians and 494 involved non-motor vehicles. In all of these accidents, the pedestrians and non-motor vehicle users bore either minor or no responsibility. Sixteen influencing factors, covering main responsible party attributes, pedestrian/non-motor vehicle user attributes, time attributes, space attributes, and environmental attributes, were analyzed with regard to their impact on accidents using the binary logistic regression (BLR) model and the classification and regression tree (CART) model. Age, administrative division, and season were the three factors common to pedestrian and non-motor vehicle accidents. For pedestrian accidents, the personal influencing factors of the main responsible party were illegal acts while driving and hit-and-run behavior. The factors affecting pedestrian and non-motor vehicle accidents also differed in their order of importance.


Introduction
Traffic accidents pose a considerable threat to public health, and deaths resulting from traffic accidents have become a serious concern for society. World Health Organization statistics show that road traffic accidents have been a leading cause of death over the past twenty years [1]. Of all parties involved in traffic accidents, pedestrians and cyclists are the most vulnerable because they have less protection than motor vehicle drivers. Therefore, it is vital to study road traffic accidents involving both non-motor vehicles and pedestrians to make our world safer and more sustainable.
Many scholars have studied traffic accidents involving both pedestrians [2] and non-motor vehicles [3]. By examining the behavior mechanisms of pedestrians, non-motor vehicle users, and motor vehicle users, researchers found that pedestrians and non-motor vehicle users often suffer disproportionately in traffic accidents because they are more vulnerable [2,3]. Accident data provided by the police department have also shown that pedestrians and non-motor vehicle users are more likely to bear either minor or no responsibility for an accident. Therefore, analyzing the factors that affect the severity of traffic accidents is of great significance for protecting the relatively vulnerable groups of pedestrians and non-motor vehicle users [4].
In this article, the traffic accident data were restricted to accidents in which pedestrians and non-motor vehicle users bore either minor or no responsibility. The factors affecting the impact on accidents for non-motor vehicle users and pedestrians were analyzed, and the similarities and differences between pedestrian and non-motor vehicle accidents were determined. The city of Shenyang was taken as the sample location for analysis. Shenyang is the capital of Liaoning province in the northeastern part of China. Its population is 8.3 million, and it had a total GDP of 629 billion yuan (89 billion US dollars) in 2018 [5]. The climate in Shenyang is characterized by a five-month winter.

Related Factors Impacting Traffic Accidents
Many scholars have researched pedestrians and non-motor vehicle users involved in accidents, and a wide range of influencing factors have been identified. Factors affecting the severity of traffic accidents include aggressive driving behavior [6], weather [7], time of year [8], administrative division [9], cold regions [10], and driving circumstances [11]. These factors are, therefore, the focus of our study.
Various methods and models have been used to analyze the factors affecting traffic accidents, including hypothesis testing [12], binary logistic regression (BLR) [13], multinomial logistic regression [14], ordinal logistic regression [15], the mixed logit model [16], tree-based models [17] such as classification and regression trees (CART), the clustering approach [18], XGBoost [19], and the Bayesian network [20]. Among these methods and models, the binary logistic regression model is a natural choice when the interpreted variable has only two values. CART is often used to rank the importance of different factors.

Non-Motor Vehicle Accidents
Due to their limited external protective devices, cyclists and motorcyclists are prone to serious injury in road accidents. Analyzing the level of recovery and the differences between these groups of road users is an important step towards determining the burden of road trauma and using the information to optimize prevention [21]. Illegally occupying motor vehicle lanes, speeding, running red lights, illegally carrying passengers, and riding in the wrong direction are the main dangerous behaviors of non-motor vehicle users [4]. Kaplan et al. [22] found that improving the quality of bicycle paths and bicycle lanes is important for the safety of non-motor vehicle users.

Pedestrian Accidents
Pedestrians travel at the lowest speed of all traffic participants and have the weakest protection, which often results in more severe injuries for them in traffic accidents. The protection of pedestrians has therefore become an important topic [2]. Behnood et al. [23] considered a series of variables that may affect the severity of pedestrian injuries, including time, location, environmental conditions, pedestrian characteristics, and collision characteristics. Kovaceva et al. [24] developed a driver assistance system (automatic emergency braking and steering) to analyze the behavior characteristics of cyclists and pedestrians and then proposed a framework for quantifying the safety benefit of driver assistance systems. Wang [25] found that older pedestrians were more likely to be involved in severe accidents in Singapore.

Data and Methods
The data and methods are described in the following sections. Two methods, binary logistic regression and classification and regression trees, were used to analyze the data.

Data
Based on 2015-2017 data from the Shenyang public security department, 1227 traffic accidents in Shenyang in which pedestrians or non-motor vehicle users were victims (bearing minor or no responsibility) were used for the analysis. Of these, 733 involved pedestrian victims and 494 involved non-motor vehicle victims. The impact on accidents was set as the interpreted variable, $Y$. The influencing factors were set as the explanatory variables, $X_i$. The 16 influencing factors fall into five groups: main responsible party attributes, pedestrian/non-motor vehicle attributes, time attributes, space attributes, and environment attributes. The main responsible party attributes describe the party that caused harm to the victims in our study. The influencing factors in each group are as follows:
(1) Main responsible party attributes: hit-and-run behavior of the main responsible party ($X_1$), illegal acts in driving of the main responsible party ($X_2$);
(2) Pedestrian/non-motor vehicle attributes: gender ($X_3$), age ($X_4$), accident responsibility ($X_5$);
(3) Time attributes: season ($X_6$), rush hour ($X_7$), weekday or weekend ($X_8$);
(4) Space attributes: intersection ($X_9$), administrative division ($X_{10}$), location of road ($X_{11}$), pavement condition ($X_{12}$), vehicle isolation ($X_{13}$), central isolation of road ($X_{14}$), physical isolation of road ($X_{15}$);
(5) Environment attributes: weather ($X_{16}$).
The definition of variables for pedestrian and non-motor vehicle accidents is shown in Table 1 and the proportion of different variables for accidents is shown in Table 2.

Table 1. Definition of the variables for pedestrian and non-motor vehicle accidents.

Interpreted Variable       Definition
Impact on accidents (Y)    0 = Zero death; 1 = Accidents with death

In Table 1, Spring means April and May; Summer comprises June, July, and August; Autumn comprises September and October; and Winter comprises the remaining months of a year. Hit-and-run behavior means the act of escaping from legal investigation after a traffic accident.
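The season definition in the note to Table 1 can be written as a small lookup. The following is only an illustrative sketch: the function name `season_of_month` and the integer codes 1 to 4 (spring to winter) are assumptions made here, chosen to match the ordering implied in the Discussion, and are not taken from the authors' data files.

```python
# Hypothetical helper: map a calendar month (1-12) to the season groups
# described in the note to Table 1. The integer codes are an assumption.
SEASON_CODES = {"spring": 1, "summer": 2, "autumn": 3, "winter": 4}


def season_of_month(month: int) -> str:
    """Return the season name for a month number, per the Table 1 note."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    if month in (4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    if month in (9, 10):
        return "autumn"
    return "winter"  # the remaining months: November through March


winter_months = [m for m in range(1, 13) if season_of_month(m) == "winter"]
print(winter_months)  # [1, 2, 3, 11, 12]
```

Under this definition winter spans five months, which agrees with the five-month winter mentioned in the Introduction.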

Binary Logistic Regression Analysis
In binary logistic regression (BLR) analysis, the impact on accidents was set as the interpreted variable, $Y$, with two possible values (0, 1). Each factor that affects the impact on accidents is named $X_i$, and the relationship between $Y$ and $X_i$ is shown in Equation (1):

$$P(Y = 1) = \frac{\exp\left(b + \sum_i a_i X_i\right)}{1 + \exp\left(b + \sum_i a_i X_i\right)} \tag{1}$$

In Equation (1), $b$ is the constant term and $a_i$ is the regression coefficient of each influencing factor. Equation (1) is changed to Equation (2) via a logit transformation:

$$\ln\frac{P(Y = 1)}{1 - P(Y = 1)} = b + \sum_i a_i X_i \tag{2}$$

By estimating $a_i$, the regression coefficient of each influencing factor is obtained, and Exp(B), that is $e^{a_i}$, is the odds ratio of the influencing factor, indicating how much the odds of $Y = 1$ change in comparison with the reference value.
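The link between a coefficient and Exp(B) can be checked numerically. The sketch below assumes the standard logistic form and uses made-up coefficients (`a`, `b`) that are not estimates from this study; the helper names are hypothetical.

```python
import math


def fatal_probability(x, a, b):
    """P(Y = 1) under a logistic model with coefficients a and constant b."""
    z = b + sum(ai * xi for ai, xi in zip(a, x))
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    """Convert a probability into odds p / (1 - p)."""
    return p / (1.0 - p)


# Illustrative numbers only (not estimates from the paper).
a = [0.55, -1.20]  # hypothetical coefficients a_1, a_2
b = -0.80          # hypothetical constant

p_ref = fatal_probability([0, 1], a, b)  # reference level of X_1
p_alt = fatal_probability([1, 1], a, b)  # X_1 switched from 0 to 1

odds_ratio = odds(p_alt) / odds(p_ref)
print(round(odds_ratio, 6), round(math.exp(a[0]), 6))  # the two values agree
```

The ratio of the two odds equals exp(a_1) regardless of the values of the other variables, which is why Exp(B) is read as a multiplier on the odds rather than on the probability itself.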

Classification and Regression Tree Analysis
The classification and regression tree (CART) model is built in two steps: decision tree generation and decision tree pruning. In the generation step, a decision tree is grown on the training dataset and should be made large enough. In the pruning step, the tree is pruned according to the criterion of minimizing the loss function.
Supposing that the probability of the $k$th category is $P_k$, the Gini coefficient of the probability distribution is shown in Equation (3):

$$\mathrm{Gini}(P) = \sum_{k=1}^{K} P_k (1 - P_k) = 1 - \sum_{k=1}^{K} P_k^2 \tag{3}$$

For sample $D$, the number of observations is $|D|$. For the supposed $K$ categories, the number of observations in the $k$th category is $|C_k|$. Then, the Gini coefficient is shown in Equation (4):

$$\mathrm{Gini}(D) = 1 - \sum_{k=1}^{K} \left(\frac{|C_k|}{|D|}\right)^2 \tag{4}$$

Under condition $A$, sample $D$ is divided into two data subsets, $D_1$ and $D_2$, and the Gini coefficient is shown in Equation (5):

$$\mathrm{Gini}(D, A) = \frac{|D_1|}{|D|}\,\mathrm{Gini}(D_1) + \frac{|D_2|}{|D|}\,\mathrm{Gini}(D_2) \tag{5}$$

The Gini coefficient indicates the uncertainty of the provided sample data. The importance of each variable is converted into a relative ratio with respect to the most important factor, which is assigned a value of 100%.
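The Gini calculation and the rescaling of importance to 100% can be sketched in a few lines. This is a minimal illustration using the standard CART definitions, not the authors' software; the function names and the toy labels are assumptions made here.

```python
from collections import Counter


def gini(labels):
    """Gini index of a sample: 1 - sum_k (|C_k| / |D|)^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(left, right):
    """Weighted Gini index after splitting D into D_1 (left) and D_2 (right)."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def normalize_importance(raw):
    """Rescale importances so that the largest one becomes 100%."""
    top = max(raw.values())
    return {name: 100.0 * value / top for name, value in raw.items()}


# Toy sample: 1 = fatal, 0 = non-fatal (made-up labels, not study data).
d1 = [1, 1, 1, 0]
d2 = [0, 0, 0, 0]
print(gini(d1 + d2))       # impurity before the split
print(gini_split(d1, d2))  # impurity after the split (lower is better)
print(normalize_importance({"factor_a": 0.08, "factor_b": 0.02}))
```

A split is preferred when it lowers the weighted Gini index; here the split isolates a pure subset `d2`, so the impurity drops.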

Results

Binary Logistic Regression Analysis
The results of the binary logistic regression are shown in Table 3, and they are categorized into two parts: pedestrian accidents and non-motor vehicle accidents.

Pedestrian Accidents
Out of 16 influencing factors considered, 5 were significant (p < 0.05). The results of the significant influencing factors are shown in Figure 1.
For administrative division, the significance value was 0.002. The odds of a fatal accident in administrative divisions 1 and 2 were, respectively, 0.550 and 0.505 times those in administrative division 3. Division 3 therefore had the highest odds of a fatal accident.
For season, the significance value was below 0.001 (reported as 0.000). The odds of a fatal accident in seasons 1, 2, and 3 were 0.590, 0.980, and 3.123 times those in season 4, respectively. Season 3 had the highest odds of a fatal accident.
For illegal acts in driving of the main responsible party, the significance value was 0.019. The odds of a fatal accident for value 1 were 0.639 times those for value 2, so value 2 was associated with higher odds of death.
For hit-and-run behavior of the main responsible party, the significance value was 0.007. The odds of a fatal accident for value 1 were 1.730 times those for value 2, so value 1 was associated with higher odds of death.
For age, the significance value was below 0.001 (reported as 0.000). The odds of a fatal accident for age groups 1, 2, and 3 were, respectively, 0.139, 0.608, and 0.913 times those for age group 4. Age group 4 had the highest odds of a fatal accident.

Non-Motor Vehicle Accidents
Among the 16 influencing factors taken into consideration, 5 were significant (p < 0.05). The results of the significant influencing factors are shown in Figure 2.
For administrative division, the significance value was 0.001. The odds of a fatal accident in administrative divisions 1 and 2 were 0.351 and 0.452 times those in administrative division 3, respectively. Division 3 had the highest odds of a fatal accident.
For season, the significance value was 0.017. The odds of a fatal accident in seasons 1, 2, and 3 were 0.476, 0.533, and 0.798 times those in season 4, respectively. Season 4 had the highest odds of a fatal accident.
For age, the significance value was 0.031. The odds of a fatal accident for age groups 1, 2, and 3 were 0.116, 0.520, and 0.815 times those for age group 4, respectively. Age group 4 had the highest odds of a fatal accident.
For gender, the significance value was 0.037. The odds of a fatal accident for value 1 were 1.573 times those for value 2, so gender value 1 was associated with higher odds of death.
For accident responsibility, the significance value was 0.027. The odds of a fatal accident for value 1 were 1.600 times those for value 2, so accident responsibility value 1 was associated with higher odds of death.
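Because Exp(B) is an odds ratio, each multiplier quoted above can be restated as a percentage change in the odds, or inverted to swap the reference category. The sketch below is only a reading aid; the helper names are hypothetical, and the two input values are the administrative division ratios quoted for non-motor vehicle accidents.

```python
def percent_change_in_odds(odds_ratio):
    """Percent change in the odds relative to the reference category."""
    return (odds_ratio - 1.0) * 100.0


def swap_reference(odds_ratio):
    """Odds ratio when the comparison and reference categories are swapped."""
    return 1.0 / odds_ratio


# Odds ratios quoted in the text for non-motor vehicle accidents
# (administrative divisions 1 and 2 versus division 3).
reported = {"division 1 vs 3": 0.351, "division 2 vs 3": 0.452}

for label, ratio in reported.items():
    print(f"{label}: {percent_change_in_odds(ratio):+.1f}% odds, "
          f"inverse {swap_reference(ratio):.3f}")
```

A ratio below 1 means lower odds than the reference category, and its inverse gives the same comparison seen from the reference category's side.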

Classification and Regression Tree Analysis
Using classification and regression tree analysis, we obtained the normalized importance of the influencing factors for pedestrian and non-motor vehicle accidents. The results are listed in Table 4.

Table 4. The normalized importance of pedestrian and non-motor vehicle accidents.

Variables                                                         Pedestrian Accidents    Non-Motor Vehicle Accidents
Hit-and-run behavior of the main responsible party ($X_1$)        7.3%                    -
Illegal act in driving of the main responsible party ($X_2$)      0.2%                    -
Age ($X_4$)                                                       39.7%                   4.4%
Season ($X_6$)                                                    100.0%                  42.8%
Administrative division ($X_{10}$)                                33.1%                   100.0%
Gender ($X_3$)                                                    -                       52.7%
Accident responsibility ($X_5$)                                   -                       0

For pedestrian accidents, season had the highest normalized importance (100.0%) of all the significant factors. It was followed by age (39.7%), administrative division (33.1%), hit-and-run behavior of the main responsible party (7.3%), and illegal act in driving of the main responsible party (0.2%).
For non-motor vehicle accidents, administrative division had the highest normalized importance (100.0%) of all the significant factors. It was followed by gender (52.7%), season (42.8%), age (4.4%), and accident responsibility (0).
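The two rankings can be reproduced directly from the percentages quoted above. The sketch below only sorts those values; the shortened factor labels and the `rank` helper are conveniences introduced here.

```python
# Normalized importance values (%) as quoted in the text around Table 4.
pedestrian = {
    "season": 100.0,
    "age": 39.7,
    "administrative division": 33.1,
    "hit-and-run behavior": 7.3,
    "illegal act in driving": 0.2,
}
non_motor_vehicle = {
    "administrative division": 100.0,
    "gender": 52.7,
    "season": 42.8,
    "age": 4.4,
    "accident responsibility": 0.0,
}


def rank(importance):
    """Return factor names sorted from most to least important."""
    return sorted(importance, key=importance.get, reverse=True)


print(rank(pedestrian))
print(rank(non_motor_vehicle))
```

Placing the two lists side by side makes the difference in ordering visible: season leads for pedestrians, whereas administrative division leads for non-motor vehicle users.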

Discussion
Based on the results from the calculations using binary logistic regression and classification and regression tree analysis, the influencing factors can be classified into three types: common influencing factors, different influencing factors, and no-influence factors. They are detailed as follows.

Common Influencing Factors
Age, administrative division, and season were the three common influencing factors for both pedestrian accidents and non-motor vehicle accidents.

Age
For both pedestrian and non-motor vehicle accidents, age group 4 had the highest likelihood of a fatal outcome and age group 1 had the lowest.
Elderly people aged 60 or above are less alert to their surroundings than younger people. When they are the victim of a traffic accident in which they bear minor or no responsibility, they are more likely to sustain severe injuries than younger people. Therefore, people over 60 years old are at the highest risk of all age groups in traffic accidents, whereas children and teenagers are more sheltered under their parents' supervision and are therefore the group least likely to die in traffic accidents.

Administrative Division
Administrative division showed different results for pedestrian accidents and non-motor vehicle accidents. For pedestrian accidents, value 3 had the highest likelihood of a fatal outcome and value 2 had the lowest. For non-motor vehicle accidents, value 3 and value 1 had the highest and lowest likelihood, respectively. Hence, for both pedestrian and non-motor vehicle accidents, the most attention should be paid to the outer suburban area, where fatal outcomes were most likely. A potential reason is the relatively poor infrastructure in outer suburban areas compared to urban areas; the former lack effective protection measures for pedestrians and non-motor vehicles. It is therefore necessary to focus on road infrastructure construction in the suburbs to reduce the death rate of road traffic accidents.

Season
Season showed different results for pedestrian accidents and non-motor vehicle accidents. For pedestrian accidents, value 3 had the highest likelihood of a fatal outcome, and value 1 had the lowest. For non-motor vehicle accidents, value 4 had the highest likelihood, and value 1 had the lowest. Thus, autumn and winter are the two seasons that deserve the most attention in order to reduce fatalities. Extreme conditions are more frequent in the cold weather of autumn and winter [10], and visibility is low in rain and snow. Automobile tires also often have low pressure in these two seasons due to cold contraction, making autumn and winter relatively more dangerous. Deaths from accidents in these two seasons could be reduced by increasing the number of traffic police on duty and imposing seasonal speed limits.

Pedestrian Accidents
For pedestrian accidents, the personalized influencing factors were illegal acts in driving of the main responsible party and hit-and-run behavior of the main responsible party. Both were significant influencing factors for pedestrian accidents, but not for non-motor vehicle accidents.
For illegal acts in driving of the main responsible party, value 2 had a higher likelihood of a fatal outcome than value 1. Therefore, more attention should be paid to non-illegal acts in driving, including irregular acts and others. Punishment measures for these behaviors are not included in the current road traffic laws, and without legal regulation, irregular driving may lead to a higher mortality rate.
For hit-and-run behavior of the main responsible party, value 1 had a higher likelihood of a fatal outcome than value 2. Thus, more attention should be paid to hit-and-run behavior. In terms of urban infrastructure, more cameras could be installed in places with high frequencies of road accidents to record the traffic situation and deter hit-and-run behavior.

Non-Motor Vehicle Accidents
For non-motor vehicle accidents, the personalized influencing factors were gender and accident responsibility. Both were significant influencing factors for non-motor vehicle accidents, but were not significant for pedestrian accidents.
For gender, value 1 had a higher likelihood of a fatal outcome than value 2. Therefore, male users of non-motor vehicles should pay more attention than female users while riding.
For accident responsibility, value 1 had a higher likelihood of a fatal outcome than value 2. Therefore, more attention should be paid to non-motor vehicle accidents in which the victim bears no responsibility than to those in which the victim bears minor responsibility. It is necessary to improve the level of protection measures for non-motor vehicle users to reduce the occurrence of death from accidents.

No-Influence Factors
The no-influence factors were rush hour, weekday or weekend, intersection, location of road cross section, pavement condition, vehicle isolation, central isolation of road, physical isolation of road, and weather. All of these are time, space, or environment attributes rather than driver attributes. A possible explanation is that driver attributes contribute more to traffic accidents.
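The three-way grouping used in this discussion follows from simple set operations on the factors reported as significant in each regression. The sketch below is illustrative; the dictionary and variable names are introduced here, while the indices and the significant sets are those stated in the text.

```python
# The 16 factors, keyed by the X index used in the Data section.
FACTORS = {
    1: "hit-and-run behavior", 2: "illegal act in driving", 3: "gender",
    4: "age", 5: "accident responsibility", 6: "season", 7: "rush hour",
    8: "weekday or weekend", 9: "intersection", 10: "administrative division",
    11: "location of road", 12: "pavement condition", 13: "vehicle isolation",
    14: "central isolation of road", 15: "physical isolation of road",
    16: "weather",
}

# Factors reported as significant (p < 0.05) in each regression.
pedestrian_sig = {1, 2, 4, 6, 10}
non_motor_sig = {3, 4, 5, 6, 10}

common = pedestrian_sig & non_motor_sig
pedestrian_only = pedestrian_sig - non_motor_sig
non_motor_only = non_motor_sig - pedestrian_sig
no_influence = set(FACTORS) - (pedestrian_sig | non_motor_sig)

for title, group in [("common", common),
                     ("pedestrian only", pedestrian_only),
                     ("non-motor vehicle only", non_motor_only),
                     ("no influence", no_influence)]:
    print(title, [FACTORS[i] for i in sorted(group)])
```

The remaining set contains nine factors, which matches the nine no-influence factors listed above.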

Conclusions
Taking the city of Shenyang as an example, 1227 traffic accidents from 2015 to 2017 were analyzed, of which 733 were pedestrian accidents and 494 were non-motor vehicle accidents. Sixteen influencing factors covering main responsible party attributes, pedestrian/non-motor vehicle attributes, time attributes, space attributes, and environment attributes were examined for their relationship with accidents, using binary logistic regression (BLR) and classification and regression tree (CART) methods. Age, administrative division, and season were the common influencing factors. Illegal acts in driving and hit-and-run behavior of the main responsible party were the personalized factors for pedestrian accidents. Gender and accident responsibility were the personalized factors for non-motor vehicle accidents. The other factors did not show a significant effect on accidents.
The normalized importance values for pedestrian accidents and non-motor vehicle accidents were different. For pedestrian accidents, season was the most important influencing factor, and for non-motor vehicle accidents, administrative division was the most important influencing factor. Of all the influencing factors, driver attributes had more effect than time attributes.
Further studies can be conducted discussing the effects of new factors to determine whether there are more significant factors affecting accidents within the aspects of time and environment.

Conflicts of Interest:
The authors declare no conflicts of interest.