Road Accident Prediction Model for the Roads of National Significance of Lithuania

This is a summary of the author's PhD thesis, defended on 20 December 2012 at Vilnius Gediminas Technical University. The thesis is written in Lithuanian and is available from the author upon request. Chapter 1 analyses road infrastructure safety management procedures and their implementation. Chapter 2 gives an overview of accident prediction models and the principles of their development. Chapter 3 presents the accident prediction algorithm designed for the roads of national significance of Lithuania, the mathematical accident prediction models developed for homogenous groups of roads and junctions, the network safety ranking that was carried out, and the road sections identified as having a potentially high accident concentration. Chapter 4 describes the testing and analysis of the software intended for implementing the accident prediction algorithm.


Topicality of the problem
Improving road safety remains a priority both in Lithuania and in other European Union (EU) countries. The basic data indicating driving conditions and the safety of roads are the number of accidents and their severity. According to the World Health Organization, more than one million people are killed on roads worldwide every year, and almost 40 thousand people are killed and 1.7 million are injured on the roads of the EU. Accident losses are estimated to amount to 1-2% of the Gross Domestic Product (Elvik 2000). In Lithuania, about 4-5 thousand road accidents in which people are killed or injured are recorded every year, which causes large social losses. The annual road accidents cost the national economy about 1.5 billion Litas (3.45 Lt = 1 EUR). Since every member of society is a road user, road safety is a universal problem.
The road and its infrastructure, being one of the constituent parts of the road safety system, are very important in reducing the risk of road accidents. If an accident occurs despite preventive measures, whether road users are killed or injured, and how severe the accident is, depends mainly on the safety of the vehicles and of the road. Engineering improvements can protect road users from injuries and can shape road users' behaviour in a way that prevents road accidents.
In recent years, safety improvement measures on the roads of Lithuania have been implemented mainly on pre-determined black spots, i.e. sites where road accidents had already occurred and people had already been killed (4 injury and fatal accidents in a 500 m section within a four-year period). To prevent accidents rather than wait until they occur and a black spot forms, it is necessary to use accident prediction models and to implement safety improvement measures on potentially dangerous road sections, thus preventing the formation of new high accident concentration sections.
The object of research is the roads of national significance of the Republic of Lithuania.
The aim of the dissertation is to develop and introduce an accident prediction model for the roads of national significance of Lithuania, making use of the best practices of foreign countries.
The tasks of the thesis. The following tasks were solved to achieve the aim of the research:
1. To systemize and analyse scientific works and legal acts aimed at the implementation of infrastructure safety management procedures.
2. To carry out the analysis of accident prediction methods.
3. To design an accident prediction algorithm for the roads of national significance of the Republic of Lithuania.
4. To develop mathematical accident prediction models for homogenous groups of roads and junctions.
5. To develop a methodology for the road network safety ranking.
6. To implement the accident prediction model in computer software, to test it and to make test calculations.

Methodology of research
The research methodologies used in this work are based on an analysis of works in this field by foreign scientists. The following research methods were used: statistical analysis, data comparison, grouping and detailing. The dissertation is based on scientific publications by Lithuanian and foreign authors and on scientific and information publications by academic institutions.

Scientific novelty
The proven practices of foreign countries made it possible to develop an accident prediction model for the roads of national significance of Lithuania by applying the empirical Bayes method.
For the first time in Lithuania, a methodology for predicting road accidents was developed, implemented and tested by calculations. Based on 2006-2010 data on road accidents, road geometrical parameters and traffic volume, mathematical accident prediction models were developed for homogenous groups of roads and junctions.
For the first time in Lithuania, the road network safety ranking was carried out according to the predicted number of accidents, i.e. the potentially dangerous road sections were determined where a higher number of road accidents is expected than on other road sections with a similar environment. Identifying these sections and implementing appropriate safety improvement measures on them will make it possible to avoid road accidents or to mitigate their severity.

Practical value
The suggested tools for implementing road infrastructure management procedures (road safety impact assessment and road network safety ranking) will make it possible to predict in advance the number of road accidents on the roads of national significance of Lithuania, to implement preventive safety improvement measures and to avoid black spots, i.e. high accident concentration sites.
The use of the dissertation results will contribute to reducing road accidents and the damage they cause on the roads of Lithuania.

Analysis of the road infrastructure management procedures
In 2008, the European Parliament and the Council adopted Directive 2008/96/EC on Road Infrastructure Safety Management, which established four procedures of road infrastructure safety management: road safety audit, road safety inspections, road safety impact assessment and network safety ranking, as well as classification of high accident concentration sections. These procedures fall into the two groups of road safety activities already established in the EU countries: proactive and reactive. The aim of the proactive procedures is to detect and eliminate causes which may lead to a road accident. Reactive activities are based on information about accidents that have already occurred; they differ in the scale of the research object, from short road segments to groups of roads of different types. Implementing road infrastructure safety management procedures ensures safety improvement during the whole service life of the road, from planning to operation.
At present, improvement of road infrastructure and implementation of safety improvement measures in Lithuania are carried out mainly on black spots, i.e. sites where road accidents have already occurred and road users were killed. Following the principle "prevention is better than cure", implementation of the road infrastructure management procedures (road safety impact assessment and road network safety management) shall be based on the prediction of road accidents.
To use the limited financial resources for road safety rationally, safety improvement measures shall be implemented on the potentially dangerous sections of the road network, i.e. those sections where the largest number of accidents is predicted and those where the largest reduction in the number of accidents can be achieved at the lowest cost. For this purpose, when new roads are designed, the solutions chosen for road infrastructure parameters and engineering safety improvement measures should prevent road accidents or reduce their number as much as possible; on roads undergoing reconstruction, they should reduce the number of accidents and mitigate their severity. To solve these problems, accident prediction models should be used which make it possible to determine the potentially dangerous road sections, to predict the number of accidents if no engineering safety improvement measures are implemented, and to determine the predicted number of accidents after a selected measure is implemented.
Accident prediction models most often predict only the number of accidents or the number of fatal accidents, not the number of people killed or injured, since the number of people killed and injured in many cases depends on other factors as well, e.g. the number of passengers, vehicle safety, the driver's experience, etc.
Many scientists point out that the empirical Bayes method is well developed and widely used in the field of road safety (Elvik 2007; Hauer 1995; Hauer et al. 2002; Cheng, Washington 2005; Persaud et al. 1999; Persaud, Lyon 2007; Persaud et al. 2010). This method is based on the assumption that in a similar environment with similar prevailing traffic conditions the risk of accidents is similar. Using the empirical Bayes method (Fig. 1), the expected number of accidents is determined by combining two information sources: 1) the number of historic accidents on a specific road element, and 2) a mathematical accident prediction model describing accident risk on road elements with a similar environment.
When using the empirical Bayes method, the expected number of accidents at a specific location is calculated by weighting the registered number of accidents at the location against the general expected number of accidents for similar sites calculated by accident prediction models. The method is illustrated by the following formulas (Sørensen, Elvik 2008):

$$E(\lambda \mid r) = \alpha \cdot \lambda + (1 - \alpha) \cdot r, \quad (1)$$

with the weighting coefficient

$$\alpha = \frac{1}{1 + \lambda / k}, \quad (2)$$

where $E(\lambda \mid r)$ is the predicted number of accidents on a specific road section/junction; $\lambda$ is the general expected number of accidents for the whole group of homogenous sections, determined with the help of mathematical accident prediction models; $r$ is the number of historic accidents on a specific section; $k$ is the inverse value of the overdispersion parameter. The parameter $\alpha$ is the weight given to the mathematical accident prediction model of the homogenous group of roads or junctions when it is combined with the number of historic accidents.
Different mathematical accident prediction models are used for road sections and for junctions. The model used for road sections is based on the number of accidents per vehicle distance travelled, whereas the model for junctions is based on the number of accidents per entering vehicle. It should be noted that a mathematical accident prediction model calculates the predicted number of accidents on a road element having certain similar properties. For this reason, the road network shall be divided into groups having similar properties, depending on the selected independent variables. Accident prediction models are not able to assess all the factors influencing the occurrence of accidents (Caliendo et al. 2007), so the main factors having the largest influence shall be distinguished.
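The weighting described above can be sketched in a few lines of Python. This is a minimal illustration, not the thesis implementation: it assumes the commonly used form of the empirical Bayes estimate, in which the weight is alpha = 1 / (1 + lambda / k) and the estimate is alpha * lambda + (1 - alpha) * r. The function names and the input numbers are hypothetical.

```python
def eb_weight(lam: float, k: float) -> float:
    """Weight given to the group model (assumed form: 1 / (1 + lam / k)).

    lam -- expected number of accidents from the group model
    k   -- inverse value of the overdispersion parameter
    """
    return 1.0 / (1.0 + lam / k)


def eb_estimate(lam: float, r: float, k: float) -> float:
    """Empirical Bayes estimate: weighted mix of the model value and the record.

    r -- number of historic accidents recorded on the specific site
    """
    alpha = eb_weight(lam, k)
    return alpha * lam + (1.0 - alpha) * r


# Hypothetical inputs: the model expects 2 accidents, 5 were recorded, k = 2.
lam, r, k = 2.0, 5.0, 2.0
print(eb_weight(lam, k))       # 0.5
print(eb_estimate(lam, r, k))  # 3.5
```

With these inputs the estimate (3.5) lies between the model value (2) and the recorded count (5). A larger k moves the weight towards the group model, a smaller k towards the site's own accident history.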
Most mathematical models rely on data already used by state institutions (road accident register, road bank, vehicle register, and the like), which ensures that the data needed for prediction are available. Prediction of road accidents requires information on historic road accidents, road infrastructure and traffic conditions. Accident modelling is usually based on data from a 3-5 year period. This period is recommended for two reasons: 1. A higher number of accidents gives more reliable modelling results. 2. No general trends or changes take place yet within such a period.
Fig. 1. Example of the use of the empirical Bayes method to calculate the predicted number of accidents (Sørensen, Elvik 2008)

Development of accident prediction model for the roads of national significance of the Republic of Lithuania
Based on the analysis of the development of accident prediction models, the algorithm of the accident prediction model for the roads of national significance of Lithuania was designed (Fig. 2). The model was developed on the basis of the empirical Bayes method, where the predicted number of accidents is determined by combining the two information sources described in Chapter 2.
Selection of independent variables is a very important stage in the development of a prediction model, since the variables determine which factors the model assesses. The selection also determines what information will be needed to make predictions. It must therefore be considered whether the required information is gathered on a national scale and whether its availability will be guaranteed.
The probability of being involved in a road accident differs between road sections because of different road geometric parameters, traffic conditions, road environment and other factors.
The accident prediction model was developed from 5 years of observation data. This period was selected for two reasons. Firstly, the use of 3-5 years of data for prediction purposes is suggested in the scientific literature. Secondly, a larger amount of observation data makes it possible to assess data dynamics and to make a more reliable prediction.
For the analysis of Lithuania's road network and the development of mathematical models, the 2006-2010 data stored in the Lithuanian Road Information System (LAKIS) were used: technical road categories, road cross sections, junctions, speed restrictions, average annual daily traffic (AADT), road accidents, etc.
Different mathematical prediction models are developed for different types of road elements. Taking this into consideration, the road network of Lithuania was classified into homogenous groups of road sections and junctions. A road section is a part of a road between junctions. The length of road sections is not constant; it depends on the road parameters that influence the number of road accidents. A junction zone covers the part of a junction within a 200 m distance on all sides of the junction. A junction zone is a spot object (section) having no length.
Homogenous groups of road sections were classified by the following 4 criteria: 1. Road significance (1. Roads with a median. 2. Main roads. 3. National and regional roads. 4. Roads crossing the built-up areas). The road network of national significance of Lithuania (21 268.40 km in total) was classified into 34 homogenous road groups consisting of 13 254 homogenous road sections. The average length of one homogenous road section is 2.31 km. The largest group of homogenous road sections is group 3, comprising the roads of national and regional significance as well as gravel roads. The total length of the roads of group 3 is 16 266.99 km; these roads were divided into 7770 individual homogenous sections.
Homogenous groups of junctions were classified by the following 3 criteria: 1. Type of junction (1. Three-leg junctions. 2. Four-leg junctions. 3. Roundabouts. 4. Grade-separated junctions). 2. Road significance. Based on this criterion, the junctions are grouped according to the road significance type of the major road of the junction. 3. Traffic volume at the junction. Based on this criterion, the junctions are grouped according to the proportion of vehicles entering the junction from a minor road to all vehicles entering the junction. The junctions of the road network of national significance of Lithuania were classified into 14 homogenous groups comprising 1454 junctions.
Each group of roads/junctions contains n road sections/junctions. The comprehensive information gathered about each of them (number of accidents, length, AADT, etc.) makes it possible to develop the mathematical accident prediction model. The mathematical accident prediction model is a constant for each homogenous group, classified according to the independent variables selected in the first modelling stage, and is equal to the average accident rate of the group.
Mathematical accident prediction models were developed for each homogenous group by formulas (3) and (4), where $\lambda_j$ is the mathematical accident prediction model for the homogenous group j; $A_j$ is the number of accidents during the study period in the homogenous group j; $L_j$ is, for the groups of road sections, the total length of sections of the homogenous group j, km, and, for the groups of junctions, a length that depends on the number of roads crossing at the junction and is calculated by multiplying the number of crossing roads by 0.2, km; $m$ is the study period, years; $AADT_j$ is, for the groups of road sections, the AADT during the study period, vpd, and, for the groups of junctions, the AADT of vehicles entering the junction during the study period, vpd; $e$ is the elasticity coefficient showing the degree of dependence of the accident rate on the traffic volume, on the change in land purpose, etc. When developing the mathematical accident prediction models, the road sections where at least one of the variables (number of road accidents, AADT) was equal to zero were eliminated.
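As an illustration of a group model that equals the average accident rate of the group, the sketch below pools accidents and exposure over the usable sections of one group. It is only a sketch under stated assumptions: the rate is expressed per 100 million vehicle-kilometres, the elasticity coefficient is taken as 1, and the pooled rate stands in for the group average. None of these choices is confirmed by the text, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Section:
    accidents: int    # accidents recorded during the study period
    length_km: float  # section length, km
    aadt: float       # average annual daily traffic, vehicles per day


def junction_length_km(crossing_roads: int) -> float:
    """Junction 'length' as stated in the text: 0.2 km per crossing road."""
    return 0.2 * crossing_roads


def group_accident_rate(sections: Iterable[Section], years: float,
                        per_vehicle_km: float = 1e8) -> float:
    """Pooled accident rate of one homogenous group.

    Sections with zero accidents or zero AADT are dropped, as in the text.
    The scaling (per 1e8 vehicle-km) and elasticity e = 1 are assumptions.
    """
    usable = [s for s in sections if s.accidents > 0 and s.aadt > 0]
    exposure = sum(365.0 * years * s.length_km * s.aadt for s in usable)
    if exposure == 0:
        raise ValueError("no usable sections in this group")
    return per_vehicle_km * sum(s.accidents for s in usable) / exposure


# Hypothetical group of three sections observed for 5 years.
group = [
    Section(accidents=3, length_km=2.0, aadt=1000.0),
    Section(accidents=0, length_km=5.0, aadt=500.0),   # eliminated
    Section(accidents=2, length_km=3.0, aadt=2000.0),
]
print(round(group_accident_rate(group, years=5), 2))
```

In this example the second section is excluded because it has no recorded accidents, so the rate is computed from 5 accidents over 14.6 million vehicle-kilometres.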
Mathematical accident prediction models have been developed for three types of road accidents. Lithuania distinguishes seven types of road accidents, which are grouped into three groups: 1. Vehicle-involved accidents. 2. Accidents involving pedestrians and cyclists. 3. Animal-involved accidents.
Grouping of accident types is necessary because the impact coefficients of safety improvement measures, used in assessing the effect of the safety measures implemented on a specific road, differ depending on the type of accident.
Classification of the road network of national significance of Lithuania into homogenous road sections and development of mathematical accident prediction models make it possible to predict the number of accidents for each homogenous road section by using the empirical Bayes method:

$$E(\lambda_j \mid r_i) = \alpha \cdot \lambda_j + (1 - \alpha) \cdot r_i, \quad (5)$$

where $E(\lambda_j \mid r_i)$ is the predicted number of accidents on the road section i; $\alpha$ is the weighting coefficient; $\lambda_j$ is the mathematical accident prediction model for the homogenous group j which includes the road section i; $r_i$ is the number of historic accidents on the road section i. The weighting coefficient is calculated by formula (2). The values of the mathematical accident prediction models and their conformity values for the groups of homogenous road sections and junctions were calculated based on 2006-2010 data. The model conformity values were calculated using the SPSS software package.
Using 2006-2010 data and formula (5), the predicted road accidents were calculated for the road network of national significance.
This prediction method makes it possible to identify, within the whole road network, the road sections that are potentially dangerous in respect of road safety, where the predicted number of accidents is higher than on other road sections with a similar environment. Potentially dangerous road sections are those sections where the value of predicted accidents is higher than the critical value of predicted accidents of the homogenous group.
The critical value of predicted accidents in the homogenous group j is calculated by formula (6), given in the PIARC Road Safety Manual, from the following quantities: the average value of predicted accidents in the homogenous group j; the constant K, equal to 1.036 at the reliability level of 85%, 1.282 at 90%, 1.645 at 95% and 2.326 at 99% (the recommended constant is 1.645); m, the time period used for prediction purposes, years; $L_j$, the length of the homogenous group j, km; $AADT_j$, the average annual daily traffic of the homogenous group j during the study period, vpd.
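The four values of the constant K listed above coincide, to three decimals, with the one-sided quantiles of the standard normal distribution at the stated reliability levels. A short check with the Python standard library (the function name is hypothetical):

```python
from statistics import NormalDist


def k_constant(reliability: float) -> float:
    """One-sided standard normal quantile for a reliability level in (0, 1)."""
    return NormalDist().inv_cdf(reliability)


for level in (0.85, 0.90, 0.95, 0.99):
    print(f"{level:.2f} -> {k_constant(level):.3f}")
# 0.85 -> 1.036
# 0.90 -> 1.282
# 0.95 -> 1.645
# 0.99 -> 2.326
```

This makes it straightforward to obtain K for a reliability level other than the four tabulated ones.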
Based on the formula (6) the critical value of predicted accidents for each homogenous group was calculated, accident dispersion was presented and the number of potentially dangerous road sections was determined for each group. Fig. 3 gives an example of the dispersion of 111 predicted accidents of homogenous road group. Road sections situated above the critical accident prediction value are considered to be the potentially dangerous road sections in respect of road safety.
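The ranking step can be sketched as follows. The critical value here uses the classical rate quality control form (average rate, plus K times the square root of the rate divided by exposure, plus a continuity correction of 1 / (2 x exposure)). That the thesis formula has exactly this form and scaling is an assumption; the function names, section identifiers and numbers are hypothetical.

```python
import math
from typing import Dict, List


def exposure(years: float, length_km: float, aadt: float,
             unit: float = 1e8) -> float:
    """Exposure of a group in units of `unit` vehicle-km (assumed: 1e8)."""
    return 365.0 * years * length_km * aadt / unit


def critical_rate(avg_rate: float, exp: float, k: float = 1.645) -> float:
    """Critical value in rate quality control form (an assumed form)."""
    return avg_rate + k * math.sqrt(avg_rate / exp) + 1.0 / (2.0 * exp)


def flag_sections(rates: Dict[str, float], critical: float) -> List[str]:
    """Return the sections whose predicted value exceeds the critical value."""
    return [sid for sid, rate in rates.items() if rate > critical]


# Hypothetical group: 100 km, AADT 2000 vpd, 5 years, average rate 20.
exp = exposure(years=5, length_km=100.0, aadt=2000.0)  # 3.65
crit = critical_rate(avg_rate=20.0, exp=exp)
predicted = {"A": 18.0, "B": 30.0, "C": 23.0}
print(flag_sections(predicted, crit))  # ['B']
```

Only section B lies above the critical value in this example. Section C is above the group average but within the margin that the chosen reliability level attributes to chance.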
The list of potentially dangerous road sections consists of 1071 road sections with a total length of 5345.98 km, i.e. 25.14% of the total road network of Lithuania. The largest number of dangerous sections belongs to road group 3, i.e. the road group of the network of national significance with the largest risk of being involved in a road accident. Since the number of such sections is rather large, it was suggested to select the most dangerous sections from the list of dangerous sections by using the same principle of the critical value. In the list of dangerous road sections, the critical accident prediction value is equal to 0.89. There are 389 road sections above this value, with a total length of 2668.99 km. The top of the list of potentially dangerous road sections is taken by the roads of national significance crossing built-up areas. The most potentially dangerous junctions are the junctions of type X located on the main roads and having a large number of entering vehicles.

Software for the realization of accident prediction model
Predicting road accidents requires a very large amount of data for a given period on road geometrical parameters, AADT, historic accidents, and the like. Taking this into consideration, software was developed which allows the user, by performing simple actions and without entering additional information, to predict the expected number of accidents on a specific road or road section.
As a result of cooperation between specialists of the Department of Roads of Vilnius Gediminas Technical University, the State Enterprise Transport and Road Research Institute, the Technical Research Centre of Finland VTT and the Finnish computer software company Simsoft Oy, the computer software Tarva LT was developed. It makes it possible to calculate the expected number of accidents on a road or road section and to assess the effect of safety improvement measures on the safety situation.
Tarva LT is computer software intended:
1. To carry out safety ranking of the road network.
2. To provide comprehensive information about the road sections/junctions in order to make their assessment.
3. To select the most suitable safety improvement measures.
4. To assess the effect of the suggested safety improvement measures.
5. To assess the change in the number of road accidents and their severity after safety improvement measures are implemented.
6. To calculate accident cost savings.
The test calculations showed that the software Tarva LT calculates the predicted number of accidents on the selected road section and assesses the effect of the suggested safety improvement measures. It should be emphasized that this software makes it possible, with little time expenditure, to prepare several alternative sets of planned safety improvement measures and to select those measures which require the lowest investment to avoid one road accident. When selecting safety improvement measures, it is very important to analyse the traffic conditions on the road section, the types of historic accidents and their causes, as well as the road environment.
Selection of safety improvement measures for road building and reconstruction projects depends not only on the planned effect of the measure but also on the funds required for its implementation. Safety measures whose implementation is predicted to increase the number of accidents shall be rejected and not considered.
In the software Tarva LT, the economic effect of a safety improvement measure is assessed by the investment required for the reduction of one accident. It is calculated from the following quantities: the cost of implementing the safety improvement measure, Lt; EIS, the reduction in the number of accidents due to the implemented safety improvement measure; m, the duration (life cycle) of the effect of the safety improvement measure, years.
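A comparison of alternatives by investment per avoided accident might look like the sketch below. It is not the Tarva LT formula: it assumes that the reduction in accidents is an annual figure, so that the total number of accidents avoided is the annual reduction multiplied by the life cycle. The function name, measure names and numbers are hypothetical.

```python
from typing import Optional


def cost_per_accident_avoided(cost_lt: float, annual_reduction: float,
                              lifetime_years: float) -> Optional[float]:
    """Investment (Lt) needed to avoid one accident over the measure's life.

    Returns None for measures that do not reduce accidents; the text says
    measures with a predicted increase in accidents are rejected.
    Assumes `annual_reduction` is accidents avoided per year.
    """
    if annual_reduction <= 0:
        return None
    return cost_lt / (annual_reduction * lifetime_years)


# Hypothetical alternatives: (cost in Lt, accidents avoided per year, years).
alternatives = {
    "measure A": (200_000.0, 0.5, 20.0),
    "measure B": (150_000.0, 0.3, 15.0),
    "measure C": (50_000.0, -0.1, 10.0),  # predicted increase: rejected
}

costs = {name: cost_per_accident_avoided(*args)
         for name, args in alternatives.items()}
accepted = {name: c for name, c in costs.items() if c is not None}
best = min(accepted, key=accepted.get)
print(best, round(accepted[best]))
```

Measure C is the cheapest to build but is excluded because it is predicted to increase the number of accidents. Of the remaining two, measure A needs the lower investment per avoided accident.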
The software described above makes it possible to compose alternative sets of safety improvement measures planned for a newly built or reconstructed road and to compare the accident cost savings achieved by one measure or another. It thus makes it possible to prepare an optimal project that takes into account the funds available for project implementation and the desired level of road safety.

General conclusions
Road infrastructure safety management procedures are an indispensable tool for ensuring road safety over the whole service life of a road, from planning and design to operation. To reduce the number of accidents and to mitigate accident severity, it is necessary to carry out road network safety ranking, to determine the road sections that are potentially dangerous in respect of road safety, and to implement on those sections the safety improvement measures giving the highest effect. To avoid the occurrence of new high accident concentration sections, the safety ranking and road safety impact assessment procedures shall be based not on historic accidents but on accident prediction.
The analysis of worldwide practice shows that accident prediction models have been developed using four basic methods: multivariate analysis, the empirical Bayes method, fuzzy logic and neural networks. The empirical Bayes method is the most recommended method for predicting the number of accidents on road sections/junctions with similar traffic conditions and a similar environment. When using this method, the homogenous groups of roads and junctions shall be determined, the road network shall be classified into homogenous sections and mathematical accident prediction models shall be developed for each homogenous group.
Using the empirical Bayes method, the algorithm of the accident prediction model was designed for the roads of national significance of Lithuania. Implementation of this model makes it possible to carry out the safety ranking of the road network and to determine the potentially dangerous road sections.
To realize the accident prediction model, the computer software Tarva LT has been developed and tested by calculations. It makes it possible to calculate the expected number of accidents on the roads of national significance, to determine the potentially dangerous road sections, to assess the effect of safety improvement measures on the predicted number of accidents and to select the most efficient safety measures from the road safety and financial points of view. The software is recommended for use in reducing the number of road accidents.
The Tarva LT database requires annual updating. For effective accident prediction, annual changes in the road network shall be taken into consideration, since they may change the structure of the homogenous groups. As a result, new mathematical accident prediction models shall be developed for each homogenous group using the historic accident data of the most recent calendar year.