Traffic Accidents Detection using Geographic Information Systems (GIS): Spatial Correlation of Traffic Accidents in the City of Amman, Jordan

Abstract: Reducing the number and severity of traffic accidents has become a dominant target of road traffic safety management worldwide. The main objective of this work is to analyze traffic accidents in the capital city of Amman in temporal and spatial frameworks and to identify hotspot zones in the study area. Several statistical analyses are conducted using SQL to give insight into the temporal distribution of accidents and to identify the most prominent accident groups based on attributes such as the year of the accident, severity, road type, and lighting environment, which enables further investigation of the more frequent accidents. GIS-based statistical and spatial analysis tools are used to examine the spatial pattern of accident distribution in the study area for three successive years, and hotspots are identified for clusters of high concentration. The Nearest Neighbor Index (NNI) is used to analyze the pattern of traffic accident distribution based on selected parameters. Hotspot zones are then identified for the groups that showed clustering using the optimized hotspot analysis tool. Experimental results showed clustering for all tested groups, and hotspots were therefore detected for these accidents in the study area. The importance of this work lies in providing a spatial understanding of accident distribution in Amman, which can help road safety policymakers set out efficient strategies for traffic safety management and find suitable solutions for the factors causing such accidents.


I. INTRODUCTION
Traffic accidents are a major public health concern worldwide. Statistics show that road accidents cause about 1.3 million deaths and 50 million injuries each year [1].
Traffic authorities in most countries try to reduce traffic accidents using several techniques, such as optimizing traffic-light management, improving cycling infrastructure, enforcing existing road traffic laws, improving perceptions of buses, extending residents' parking zones, using CCTV to monitor road conditions, charging for workplace parking, and improving bus services.
The impact of road accidents on both the economy and society is massive. The cost of these accidents is estimated at about 518 billion USD worldwide [1,2]. In developing countries such as the Kingdom of Jordan, people are more vulnerable to road accidents. According to the global status report on road safety issued by the World Health Organization (WHO) [3], Jordan is among the worst-performing countries regarding road safety. The report revealed that the rate of deaths caused by road accidents is about 24.4 per 100,000 population.
Jordan's population grew over the last decade to reach roughly 9.9 million [4], leading to growth in road accidents. Road crashes are considered the second leading cause of death in Jordan [5]. Most of the casualties of these accidents are passengers and pedestrians, who represented 62.3% of the fatalities resulting from road accidents [3]. The region of interest of this study is shown in the corresponding figure. Road and traffic safety planning is a basic requirement in all societies. Safety information is essential for addressing road and traffic safety needs. Informed decision-making on road safety issues is based on registered crash information. The accident information was collected from the police traffic department. The collected data are then used to analyze and identify hazardous regions on the road network, after which the pattern of accidents can be identified by applying engineering studies. The results of the analysis are used to implement the required improvements and road planning.
Geographic information system (GIS) tools are used to advance safety planning [6]. Applications of GIS in accident analysis and traffic safety management were introduced in 1990. Several GIS-based accident information systems are devoted to managing and tackling traffic flow. These GIS-aided systems aim to decrease the number of accidents by identifying hazardous locations, which can be achieved using the spatial data analysis and statistical analysis methods provided in GIS software Booth [7]. GIS software includes a variety of analysis tools, such as density, proximity, cluster, pattern, and spatial query analysis, in addition to the ability to build customized models using the model builder technique.
A GIS-based accident information system surpasses a tabular database information system in enabling the user to identify relationships that are very difficult to find using tabular database systems, which makes the GIS-based information system a powerful spatial information system. The main components of a spatial information system are a tabular database representing accident data, a GIS that connects the database to the corresponding locations, and a set of tools to analyze the attribute and spatial data. The results of the analysis aid in enhancing road and traffic safety planning. Identifying accident hotspots is a major part of most safety analysis studies, where locations with the highest proportions of total accidents are identified. There are two main categories for identifying hotspots: observed accident rates and expected accident rates. However, hotspots can be identified efficiently by using methods that incorporate both categories, because accidents are considered random events that can change in time and location [11].
Several methods have been introduced for identifying hot zones [8][9][10]. Each has advantages and drawbacks and is designed to address hotspot identification in a specific environment. The most widely used approaches for hotspot identification [11] are the kernel density estimation (KDE) method [12,13], the nearest neighbor hierarchical (NNH) method [14], and the local indicator of spatial association (LISA) [15]. These methods indicate whether the spatial pattern of events is clustered, random, or dispersed. Once the spatial pattern is identified, hotspots are identified using tools available in GIS software or other software designed for this task.
The quality of results obtained from accident analysis is affected by several factors. One important factor contributing to the success of spatial and statistical analysis is the accuracy and reliability of the spatial databases extracted from accident reports. A study in [16] of accident report forms used in many countries around the world revealed 99 different types of information, depending on the environment of the accidents.
Only a few studies have examined traffic accidents in a spatial framework in the city of Amman, Jordan. This study aims to examine accident data in both temporal and spatial frameworks based on several categories and consequently reveal the main zones of accident concentration for categories that showed a high proportion of total accidents. The objectives are addressed by first conducting a temporal statistical analysis of the accident data using SQL based on categories such as year of accident, severity, road type, and lighting condition; this highlights which categories account for the most recurrent accidents. GIS-based tools are then used to determine the spatial pattern of these accidents and to identify the hotspot zones for patterns showing clustering. This can provide a better understanding of the geography of traffic accidents and the spatial correlation between these accidents in the study area and, as a result, help interested parties set precautions and find suitable solutions to safety problems.
The remaining sections are organized as follows: Section 2 provides an extended literature review of recent research on hotspot analysis and the main causes of accidents. Section 3 describes the dataset used to conduct the experiments and evaluations. Section 4 presents the authors' approach to analyzing accident data based on various attributes, examining the distribution of accidents in the study area, and identifying hotspots. Finally, Section 5 concludes this work and suggests future work.

II. LITERATURE REVIEW
This section reviews literature from different sources on traffic and road safety studies that focus on the factors affecting safety. Several factors contribute to road safety deficiencies, including the structure of the road, geographical and weather conditions, and deficiencies in the road lighting system [17]. Spatial databases are manipulated using GIS technologies, which makes it possible to identify the relationships between these spatial phenomena.
Another accident analysis study was introduced in [7]. Its objective was to build a GIS-based system for identifying hotspots in Afyonkarahisar, Turkey, and the factors causing accidents in these areas using statistical analysis methods, so that road safety specialists can apply suitable solutions. A Safety Evaluation Method for Local Area Traffic Management (SELATM) was also introduced in [7]. This system is mainly based on GIS technology for analyzing accident patterns.
Another traffic accident analysis system was introduced in [18]. The developed system can manage and analyze the accident database, road structures, and road accessory facilities. As a result, accident rates and frequencies can be identified. The work in [19] introduced a GIS-based system for accident analysis. The system is designed to identify the accident location as well as the rank of the accident. It enables the user to enter and retrieve accident-related data and to perform statistical analysis on a specific accident location.
A GIS-aided model for identifying hotspot locations of road accidents was developed in [20]. The data used for the analysis and for determining hotspot locations are the XY coordinates of road accidents obtained from the traffic police. The developed model, along with three other models (one for scheduling and operating the enforcement camera system, one for controlling and balancing the traffic load in the courts, and one for analyzing videos), contributes to reducing the number of accidents and fatalities.
A GIS-based analysis to identify the spatiotemporal patterns of road accidents in Sri Racha, Chon Buri, Thailand was developed in [21]. The analysis applies kernel density estimation (KDE) and Ripley's K-function tools in ArcGIS to determine the distribution and pattern of the accidents. The database used in this study is the accident data from 2017 to 2020. Several scales were used for clustering the spatiotemporal pattern of accidents, and the spatial distribution of different accident types was clustered at different distances. Experimental results show three main areas with a high density of accidents in the studied area.
A spatial autocorrelation-based method for identifying road accident hot zones was presented in [22]. The method, built on ArcGIS software and a spatial autocorrelation algorithm, identifies accident hotspots taking into consideration both the properties and attributes of the accident. The proposed method is applied to road sections divided into 100-meter segments; as a result, the location of the accident is identified regardless of the accident rate at that location. However, further work can be done to classify the hot zones based on different rules, such as accident severity.
A study identifying accident hot zones and analyzing the relationship between these accidents and land use was introduced in [11]. The first step, examining the hot zones, was performed using ArcGIS tools. The hot zone analysis is applied to three categories: a severity category with two groups, causes of crash occurrence with seven dominant causes, and the three major types of accidents that frequently happen in the study area of Dammam city, Kingdom of Saudi Arabia (KSA). The next step applied a GIS-based geographically weighted regression (GWR) method to identify the relation between accidents and the population density and land use type in the crash area. Although this study contributes to analyzing the complex relationship between crashes and land use in the crash neighborhood, more detailed features of the neighborhood environment could be considered to advance the existing analysis tools.
A GIS-based system incorporating fuzzy logic to predict accident hot zones was proposed in [23]. Spatial-temporal analysis was applied to examine fatality and injury groups in the context of accident severity. The prediction is performed by applying the Fuzzy Overlay Method (FOM) and the Weighted Overlay Method (WOM). The results are verified using the point density tool. The proposed algorithm is evaluated using data from 2013 to 2015 for the city of Irbid, Jordan. Results show eight hot zones, five main road intersections and three road sections, which were investigated to identify the factors causing these accidents.
A GIS-based system incorporating the Firefly Clustering algorithm was presented in [24] to identify accident hot zones. Spatial analysis tools in ArcGIS were used to find distances between accident points, while characteristics of accidents were identified by applying the Firefly Clustering algorithm. The performance of the method was evaluated by comparing distances calculated using the GIS origin-destination (OD) cost tool and the Euclidean distance tool. Results show that the number of hotspots is overestimated when Euclidean distance is used, particularly at intersection zones.
A system combining GPS and ArcGIS was proposed in [25] to identify black zones in the city of Pristina, Kosovo. In this study, maps were created using GPS technology and compared with traditional methods of collecting accident data on location. The evaluation reveals that using GPS technology to collect accident data gives more accurate black spot identification than conventional methods of recording accident data.
A spatial accident information system was introduced in [26]. The system processes the attributes of accidents obtained from accident reports using a set of spatial analysis tools provided in GIS. Crash hotspots are identified by applying the CrimeStat program, based on the nearest neighbor hierarchical clustering technique [27]. Accidents along roadways were analyzed, and interpolation of crashes was applied using CrimeStat, which showed the presence of high-risk accident hotspots. The advantage of this system over others is that it can be used both for usual types of accident analysis and for difficult types that cannot be performed with a tabular system, such as spatial selection. However, more accurate spatial information is required to improve the performance of the system.
A study identifying crash hotspots using GIS methods in Muscat, the capital of Oman, was introduced in [28]. Various methods were employed to identify the crash hotspots, including Kernel Density Estimation (KDE), network-based K-function, network-based Nearest Neighbor Distance (Net-NND), spatiotemporal hot-zone analysis, and the Random Forest (RF) algorithm. Results confirm that road intersections influence road accidents more than other geometric features of the road. A cell-grid model was proposed in [29] for bicyclist risk maps in Manhattan, New York City. A Bayesian framework was used to develop a random parameter model that correlates the cost of bicycle accidents with land use, transportation, and sociodemographic data. Findings confirm that the proposed method is superior to the Tobit model.
A prediction of heavy vehicle accident hotspot clusters was introduced in [30]. The clustering is based on three criteria and is achieved using Moran's I spatial autocorrelation. The risk along the network was estimated using the Getis-Ord Gi* statistic; findings reveal 22 segments marked with heavy vehicle risk.
This paper presents a statistical analysis of traffic accidents in the city of Amman, Jordan for three successive years (2017, 2018, and 2019) based on attributes such as severity, the type of location where accidents occurred, and the lighting environment of the accident locations. The distribution of accidents is examined using GIS spatial tools. Hotspots are also identified using GIS tools for all traffic accidents in each studied year and for accidents causing fatalities and serious injuries in each year. Moreover, hotspots are identified for the road types that registered the highest proportion of accidents. A new contribution of this paper is an extra attribute added to the collected data that indicates whether or not the road lies on a transport road. Hence, hotspots can be analyzed by correlating this road status parameter with the other parameters provided with the data and described in the following section.

III. DATA
The accident data for the years 2017, 2018, and 2019 were obtained from the traffic institution in Amman, the capital city of Jordan, and were converted into SQL Server format. This makes it possible to build relational algebra expressions that represent the intended analyses.
The attributes of the data are divided into 12 categories as follows:
• The date of the accidents.
• The location (latitude, longitude) of the accidents.
• The number of accidents, counted for each accident location.
• Minor injuries.
• Intermediate injuries.
• Major injuries.
• The direction of lanes (one-way or bidirectional).
• Driving license type.
• Lighted road (yes or no).
• Types of mistakes that cause accidents.
• Type of vehicle (passenger or shipping).
• Type of accident (vehicle crash or pedestrian crash).
The roads of interest can be selected based on the transport map of the city of Amman, as shown in Fig. 2.
Also, the data collected in [31] and described in [reference missing] were considered and updated from the main sources, the Jordan Traffic Institute and the Traffic Department.

A. Accidents Temporal Distributions
This section presents accident temporal variations in the studied region based on several categories: years, accident location, severity, and lighting environment.
The corresponding figure shows the number of injuries and deaths for the years 2017, 2018, and 2019. A total of 628,006 crashes were recorded, resulting in 33,439 injuries and 897 deaths. The figure also shows that 2019 had the highest number of accidents, injuries, and deaths. Consequently, an analysis of the data was required to determine the main causes and to provide recommendations for further action by the road safety authorities. The number of accidents by accident location is shown in a separate figure. In this category, five major groups of accident locations are taken into account (a road with two lanes separated by a central island, a road with two lanes not separated by a central island, a one-lane road, a public square, and inside a park). Accidents on roads with two lanes separated by a central island constitute the highest proportion of the total, exceeding 350,000 accidents. This is followed by 200,000 accidents recorded on roads with two lanes not separated by an island. The three remaining groups represent a small proportion of the total number of accidents.
Another analysis combined the accident location category with the year category, calculating the number of accidents by location for each year separately. The results are shown in the corresponding figure. They indicate a serious problem with two-lane roads, especially those whose lanes are separated by a central island, which are more vulnerable to traffic accidents. This high number of accidents prompted further investigation to find the contributing factors and to take preventive measures.
The results of a statistical analysis based on the severity category are shown in the corresponding figure. In this category, four groups are considered (simple injuries, intermediate injuries, serious injuries, and fatalities). The total number of casualties is nearly the same in each studied period. However, 2017 witnessed the smallest number of intermediate injuries, while 2019 witnessed the largest.
The statistical results based on the lighting category are likewise shown in a figure. In this category, several groups are studied, covering various parts of the day and lighting environments (day, sunrise, sunset, darkness, etc.). Results show that most accidents occurred during the daytime in the three studied years. The second-largest proportion of accidents occurred at night under adequate lighting conditions. A small proportion occurred at night with insufficient lighting and at sunset. These results are consistent with the pattern of life in Amman, where most roads carry heavy traffic during the day due to the high population density in the capital city of the kingdom.
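The temporal breakdowns described in this subsection amount to grouped counts over the accident records. A minimal sketch of such queries follows, assuming a hypothetical, simplified `accidents` table; the column names and the toy rows are illustrative only and are not the study's schema or data. Python's built-in `sqlite3` module is used here in place of the SQL Server database mentioned in the text, and `strftime` is SQLite-specific.

```python
import sqlite3

# Hypothetical, simplified schema: column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE accidents (
        accident_date TEXT,    -- ISO date, e.g. '2018-05-14'
        road_type     TEXT,    -- e.g. 'two_lanes_separated'
        lighting      TEXT,    -- e.g. 'day', 'night_lit', 'night_dark'
        injuries      INTEGER,
        deaths        INTEGER
    )
    """
)

# Toy rows for illustration only; these are NOT the study's records.
rows = [
    ("2017-03-01", "two_lanes_separated", "day", 1, 0),
    ("2017-07-19", "two_lanes_not_separated", "night_lit", 0, 0),
    ("2018-01-05", "two_lanes_separated", "day", 2, 1),
    ("2018-11-23", "one_lane", "sunset", 0, 0),
    ("2019-02-10", "two_lanes_separated", "day", 0, 0),
    ("2019-09-30", "two_lanes_separated", "night_dark", 3, 0),
]
conn.executemany("INSERT INTO accidents VALUES (?, ?, ?, ?, ?)", rows)


def accidents_per_year(connection):
    """Return (year, accidents, injuries, deaths) tuples, one per year."""
    query = """
        SELECT strftime('%Y', accident_date) AS yr,
               COUNT(*), SUM(injuries), SUM(deaths)
        FROM accidents
        GROUP BY yr
        ORDER BY yr
    """
    return connection.execute(query).fetchall()


def accidents_by(connection, column):
    """Return (category, count) tuples for one categorical column, largest first."""
    # Only whitelisted column names are interpolated into the SQL text.
    if column not in {"road_type", "lighting"}:
        raise ValueError(f"unsupported column: {column}")
    query = (
        f"SELECT {column}, COUNT(*) AS n FROM accidents "
        f"GROUP BY {column} ORDER BY n DESC, {column}"
    )
    return connection.execute(query).fetchall()


if __name__ == "__main__":
    print(accidents_per_year(conn))
    print(accidents_by(conn, "road_type"))
    print(accidents_by(conn, "lighting"))
```

Grouping by two columns at once (for example, year and road type) would give the per-year breakdown by accident location in the same way.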

B. Identifying Hot Spots of Accidents Analysis
Although the accidents have been thoroughly examined based on various attributes using SQL, a spatial framework makes it possible to identify the pattern of accident distribution, which provides a better understanding of road safety. Based on the type of spatial pattern, hotspots can then be identified. A hotspot can be defined as a concentration of accidents at or near a specific location. Analysis can be applied to investigate clusters of events that occur near each other. Hotspots can then be identified using spatial analysis tools provided in GIS software, where the crashes occurring at a given location are counted. An alternative way to identify crash hotspots is to use the CrimeStat program, based on nearest neighbor hierarchical clustering [32].
In this work, the nearest neighbor index (NNI) is used to analyze the pattern of traffic accident distribution. The index was calculated using the Average Nearest Neighbor tool provided by ArcGIS v.10.8. The nearest neighbor index is one of the popular methods for identifying point patterns and is calculated using equation (1):

$$\mathrm{NNI} = \frac{\bar{d}_{obs}}{\bar{d}_{exp}} \tag{1}$$

where $\bar{d}_{obs}$ represents the average distance between each accident location and the nearest accident location,

$$\bar{d}_{obs} = \frac{\sum_{i=1}^{n} d_i}{n} \tag{2}$$

and $\bar{d}_{exp}$ represents the expected distance based on the hypothesis that the accidents are distributed randomly in the studied area,

$$\bar{d}_{exp} = \frac{0.5}{\sqrt{n/A}} \tag{3}$$

In equations (2) and (3), $n$ represents the number of accidents in the studied area, $A$ is the area of the studied region, and $d_i$ is the distance from accident $i$ to its nearest neighboring accident.
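As an illustration of the index, the sketch below computes the ratio of the observed mean nearest-neighbor distance to the distance expected under a random pattern, 0.5 / sqrt(n / A). It is a minimal brute-force version under stated assumptions, not the ArcGIS implementation: the function name is hypothetical, coordinates are assumed to be in a projected system (for example meters, not latitude and longitude), and the z-score uses the standard error 0.26136 / sqrt(n² / A) of the usual average nearest neighbor statistic, which the text above does not state explicitly.

```python
import math


def nearest_neighbor_index(points, area):
    """Return (nni, z_score) for 2-D points inside a study region of the given area.

    points: list of (x, y) tuples in projected units (e.g. meters).
    area:   area of the study region in the squared units of the coordinates.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are required")

    # Observed mean distance to the nearest neighbor (brute force, O(n^2)).
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    d_obs = total / n

    # Expected mean distance if the points were randomly distributed.
    d_exp = 0.5 / math.sqrt(n / area)

    # Standard error of the average nearest neighbor statistic (assumption:
    # the commonly used formula, not taken from the text).
    std_err = 0.26136 / math.sqrt(n * n / area)

    return d_obs / d_exp, (d_obs - d_exp) / std_err


if __name__ == "__main__":
    # Four points packed into one corner of a 100 x 100 region: NNI < 1.
    clustered = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
    print(nearest_neighbor_index(clustered, area=100.0 * 100.0))

    # Four points spread on a regular grid over the same region: NNI > 1.
    dispersed = [(25.0, 25.0), (25.0, 75.0), (75.0, 25.0), (75.0, 75.0)]
    print(nearest_neighbor_index(dispersed, area=100.0 * 100.0))
```

An index below one with a strongly negative z-score points to clustering, an index near one to a random pattern, and an index above one to dispersion.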
In this work, the spatial patterns of traffic accidents in 2017, 2018, and 2019 are analyzed. Results show that the accidents are clustered in each studied year, with an NNI of less than one for each year. Since the priority of this work is to mitigate the fatalities and casualties of road accidents, the nearest neighbor analysis is also conducted for two severity groups (fatalities and serious injuries) for each studied year. The analysis showed that accidents causing fatalities are clustered as well, with an NNI of less than one for all studied years, as shown in Table II.
For the serious injuries group, the NNI also indicates clustering of accidents. However, the clustering is not significant for 2017 and 2019, while accidents resulting in serious injuries in 2018 are clustered at a low significance level.
As shown in the previous section, the majority of traffic accidents occurred on roads with two lanes separated by a central island, followed by accidents on roads with two lanes not separated by an island. These results require further spatial analysis to find the locations of these roads. The nearest neighbor analysis is performed on accidents occurring on these two types of roads. Results show that the accidents are clustered in the studied area. Results are summarized in Table II.
Since the results of all the examined accidents showed clustering, hotspots are identified for the accidents in each year, for the two severity groups, and for the two types of roads. Hotspot zones are identified using the optimized hotspot analysis tool provided in ArcGIS. This technique is based on creating a fishnet of polygon cells, each with a side length of 250 m. Results show that the hotspot areas were mapped with more than a 90% confidence level.
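To make the fishnet step concrete, the sketch below bins accident points into 250 m square cells and scores each cell with the Getis-Ord Gi* statistic, the statistic on which ArcGIS hot spot analysis is based. It is a simplified illustration, not the ArcGIS tool: the function names are hypothetical, the neighborhood is assumed to be the cell plus its eight adjacent cells with binary weights, coordinates are assumed to be in meters measured from the lower-left corner of the grid, and the tool's automatic scale selection and multiple-testing correction are omitted.

```python
import math
from collections import Counter


def fishnet_counts(points, cell_size, n_cols, n_rows):
    """Count points per square cell; returns grid[row][col] of counts."""
    counts = Counter()
    for x, y in points:
        col, row = int(x // cell_size), int(y // cell_size)
        if 0 <= col < n_cols and 0 <= row < n_rows:
            counts[(col, row)] += 1
    return [[counts[(c, r)] for c in range(n_cols)] for r in range(n_rows)]


def getis_ord_gi_star(grid):
    """Return a grid of Gi* z-scores using binary weights over each 3 x 3 block."""
    n_rows, n_cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    if s == 0:
        raise ValueError("all cells have the same count; Gi* is undefined")

    z = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            # The cell itself plus its adjacent cells (clipped at the grid edge).
            block = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(n_rows, r + 2))
                for cc in range(max(0, c - 1), min(n_cols, c + 2))
            ]
            w = len(block)  # with binary weights, sum(w) == sum(w^2) == w
            denom = s * math.sqrt((n * w - w * w) / (n - 1))
            z[r][c] = (sum(block) - mean * w) / denom if denom > 0 else 0.0
    return z


if __name__ == "__main__":
    # Toy data: ten accidents in one 250 m cell of a 2 km x 2 km study area.
    toy_points = [(600.0, 600.0)] * 10
    grid = fishnet_counts(toy_points, cell_size=250.0, n_cols=8, n_rows=8)
    scores = getis_ord_gi_star(grid)
    # Cells with z above about 1.645 would be flagged at the 90% confidence level.
    print(round(scores[2][2], 3), round(scores[7][7], 3))
```

In this reading, a cell is a hotspot when its neighborhood holds significantly more accidents than the study-area average would predict, which is why isolated single accidents do not produce hotspots.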
Hotspot maps were produced for all accidents occurring in each studied year (2017, 2018, and 2019), as shown in Figure 8 [a, b, and c]. The findings clearly pinpoint the concentration of accidents in the same zones for each studied year, mainly in and around the central regions of the study area. Other hotspot maps were produced for accidents causing fatalities in each studied year, as shown in Figure 9 [a, b, and c]. The results are similar in the distribution of accident concentration regions, with hot zones mainly distributed in and around the center of the study area. However, the concentration of accidents causing fatalities in 2018 is noticeably lower than in 2017 and 2019.
For the group of accidents resulting in serious injuries, the findings reveal no significant hotspots in 2017 and 2019 and a small presence of accident hotspots in 2018, as shown in Figure 10. Since accidents located on two-lane roads show highly significant clustering, hotspot maps were produced for this category as well. As shown in Figure 11 [a, b, and c] and Figure 12 [a, b, and c], the hotspot zone distributions for both groups of accidents in this category are similar to each other, to the distribution of accidents in the severity category, and to the general distribution of hotspot zones across the studied years.
The spatial analysis highlights evidence of spatial clustering and recurrence of traffic accidents in the central regions of the studied network. The findings confirm that roads leading to the center of the studied network are more vulnerable to traffic accidents than roads located away from the center of the study area. This confirms that the geography of the road and its neighborhood elevate the risk of accident occurrence. The results show that 85.31% of accidents occurred on public transport roads. The number of fatalities is also higher on this type of road. The accidents are concentrated in the center of Amman, which indicates that the more cars there are, the more accidents occur. Roads with insufficient lighting are not the main cause of accidents; the statistics indicate that a lack of road lighting increases drivers' attention while driving, leading them to reduce speed. The recommendations of this research to the traffic control authorities are to set up special lanes for buses, enhance the highway roads and provide more safety signals, assign special lanes for emergency vehicles such as ambulances and fire engines, and install CCTV cameras in hotspot areas.

VI. CONCLUSION AND FUTURE WORKS
This study aimed to examine accident data for the capital city of Amman, Jordan, determine the temporal and spatial distribution of these accidents, and identify the hotspots in the study area.
SQL was used to perform the temporal analysis based on several categories. Results showed that significantly more accidents occurred on two-lane roads than on other types of roads. Moreover, this work aimed to examine accidents in the context of severity and the accidents occurring each year; GIS-based tools were therefore used to further examine these categories in a spatial framework.
Results highlight evidence of spatial clustering and recurrence of traffic accidents in the study area. Hotspot-based hazard maps were produced at the zonal level in the study area for the accidents of 2017, 2018, and 2019, for two severity groups (fatalities and serious injuries), and for two road type groups (roads with two lanes separated by a central island and roads with two lanes not separated by a central island). The examined groups of accidents were found significant (>90%) for hotspot mapping.
Findings confirm that hotspots for all examined groups were mainly concentrated in commercial/residential and industrial zones located in and around the central regions of the study area. This is explained by the high density of population and public facilities in such regions, which results in high traffic volume and low speed limits.
More importantly, the findings contribute to a more systematic understanding of the spatial factor of traffic accidents in the study area and of the influence of road geography and neighborhood, in addition to other geometric features of roads, on traffic accidents. This could help in modelling and predicting high-risk zones and, as a result, in taking precautions and developing suitable countermeasures for the identified hotspot zones.
It is worth mentioning that the accuracy of accident analysis depends on the accuracy of the original data. The accuracy of accident data can be improved by continuously updating the records system used in the traffic department. This could be achieved by having the traffic police use Global Positioning System (GPS) receivers to obtain accurate information about accident locations. The availability of such accurate information enables efficient safety analysis. An accident diagram along with pattern analysis can be produced using special software.
Future work can take into consideration more accident attributes, such as the reasons for accidents, identify the relationship between traffic accidents and the key factors contributing to them, and apply machine learning to predict hotspots for other areas.

Fig. 2. The Transport Map of a Region in Amman.
occurring in these two types of roads. Results show that the accidents are clustered in the studied area. Results are summarized in Table II.

Fig. 10. The Distribution of Hotspots of Accidents that Caused Serious Injuries in 2018.

Fig. 11. Distribution of Hotspots of Accidents that Occurred on Roads with Two Lanes Separated by a Central Island in 2017 (a), 2018 (b) and 2019 (c).

Fig. 12. Distribution of Hotspots of Accidents that Occurred on Roads with Two Lanes not Separated by a Central Island in 2017, 2018 and 2019 from Left to Right respectively.

V. CRASH ANALYSIS OF TRANSPORT ROADS
One of the main objectives of this study is the analysis of safety along the various transport roads, which are expected to carry more people (drivers and passengers). A new database schema was created to describe the collected data, as shown in [reference missing]. The parameters that are

TABLE II. NEAREST NEIGHBOR INDEX (NNI) VALUES AND THE NUMBER OF ACCIDENTS EACH YEAR

TABLE III. HAZARDOUS STREETS IN AMMAN, WHERE #A, #I, #F AND PTR ARE THE NUMBERS OF ACCIDENTS, INJURIES, FATALITIES AND ON A TRANSPORT ROAD RESPECTIVELY