A comprehensive analysis of the relationships between the built environment and traffic safety in the Dutch urban areas

Built-environment factors potentially alleviate or aggravate traffic safety problems in urban areas. This paper aims to investigate the relationships of these factors with vehicle-bicycle and vehicle-vehicle property damage only (PDO) and killed and severe injury (KSI) crashes in urban areas. For this purpose, an area-level analysis using 100x100m 2 cells, along with a Spatial Hurdle Negative Binomial regression model were employed. The study area is composed of a selection of municipalities in the Netherlands-Randstad Area where major land-use developments have occurred since the 1970s. The study was conducted by developing a rich dataset composed of various national and local databases. The findings reveal that built-environment factors and land-use policies have substantial impacts on safety, which cannot be neglected. The factors explaining the land-use density and diversity in the area (e.g., urbanity and function mixing levels), as well as the land-use design characteristics (indicated by average age of the neighborhoods), traffic and road network characteristics, and proximity to different destinations influence the probability, frequency, and severity of crashes in urban areas. Furthermore, low socioeconomic levels are associated with a higher frequency of traffic crashes.


Introduction
Traffic safety policy and research have mostly focused on road network measures and road design, vehicle safety, and road user behavior in the past decades. However, the role of built-environment factors in traffic safety has not yet been fully examined. For the Netherlands, this is rather surprising given the importance attributed to urban spatial planning and it being one of the most planned countries in the world. Despite the considerable improvements that have taken place in traffic safety in urban areas over the decades, one-third of road fatalities still occur in the built-up areas of the Netherlands (SWOV, 2021). Considering the aim of construction of on average 75,000 new homes per year until 2025 (MinIenW, 2018), understanding the relationships between the built-environment factors and traffic safety is invaluable to improving traffic safety.
Built-environment characteristics are usually represented by indicators called "5Ds", namely (population) density, (land-use) diversity, (land-use) design, distance to transit, and accessibility to destinations (Ewing & Cervero, 2010). Different built-environment factors can increase or decrease traffic crashes as they contribute to variations in exposure, speeds, and conflicts between road users (Ewing & Dumbaugh, 2009;Merlin et al., 2020b;Saha et al., 2020). Previous studies about the effects of built-environment factors on traffic crashes have mainly accounted for the impacts of the road network characteristics and a limited number of other built-environment factors. The studies have resulted in mixed and sometimes contradicting findings on the magnitude and direction of the effects of the built-environment factors on traffic crashes. Some studies, for example, show an increase in population and employment density is associated with a higher frequency of vehicle and bicycle crashes (even after controlling for exposure) but a lower frequency of severe crashes (Obelheiro et al., 2020;Osama & Sayed, 2017). However, at the same time, higher residential density areas promote walking and cycling and reduce car-use which can lead to a decline in all types of traffic crashes including both vehicle only and bicycle involved crashes (Osama & Sayed, 2017;Saha et al., 2020;Schepers et al., 2019). Furthermore, some land-use characteristics such as commercial and educational zones have been reported to negatively affect cyclist safety (Mukoko & Pulugurtha, 2019;Osama & Sayed, 2017). High land-use diversity (i.e., mixed land-use index (MXI)) was shown to decrease (KSI) crash frequency in vehicle-only and vehicle-bicycle crashes (Chen & Shen, 2016). This is because higher MXI leads to shorter travel distances which results in reduced car-use and consequently risk exposure specifically for the cyclists (Chen & Shen, 2016;Schepers et al., 2019). Proximity to facilities such as transit stops, commercial and service locations, on the other hand, were shown to increase the risk of crashes that involve vehicles and/or bicycles (Kim et al., 2010;Osama & Sayed, 2017;Schepers, 2021b;Vandenbulcke et al., 2014;Wei & Lovegrove, 2013). However, Kim et al. (2011) suggested that increased proximity to roads is associated with reduced crash severity.
This paper analyses the impact of a comprehensive set of builtenvironment variables on vehicle-only and bicycle-involved types of crashes. Most literature focuses on determining the impact of a specific of limited set of built-environment factors on different crash types. Only a few studies have examined the impact of a comprehensive set of builtenvironment factors on traffic safety. For instance, Obelheiro et al. (2020) analyzed the impact of built-environment factors on the number of crashes at newly developed traffic safety zones. However, the authors focused only on the "road network and infrastructure design" indicated by the density of different types of intersections, traffic signals, and the proportion of different types of roads. Similarly, Yu and Xu (2018) employed the "nonmotorized infrastructure design" features indicated by the ratio of sidewalk/bike lane length to the street length. Ukkusuri et al. (2012) accounted for different built-environment factors at the census tracts. However, only a limited number of land-use factors such as land-use density were utilized. Nonetheless, none of these studies examined the effects of the "land-use design" characteristics on traffic safety. As an exception, Xie et al. (2019) studied the effects of land-use changes and showed that conversions of the business, commercial and residential land-use classes could increase the severe crash frequency in urban areas.
Disregarding the above-mentioned collective effects of builtenvironment factors can potentially lead to an under/overestimation of their importance due to the unobserved effects of missing variables. Moreover, these factors do not have the same effect on different types of crashes (e.g., motor-vehicle vs. bicycle crashes). However, literature falls short of identifying and comparing the variations in the magnitude of these effects. For example, it is still not clear how different levels of MXI affect vehicle-bicycle crashes compared to vehicle-vehicle crashes. Similarly, the effects of proximity to different facilities on the severity level of crashes remain unclear.
From another perspective, built-environment characteristics change over time and space as a result of different land-use and transport policies. These changes naturally affect travel behavior and, in turn, traffic safety. For example, the Netherlands underwent major developments during the 1970s and 1990s, with policies such as "Dutch New Towns"  and "VINEX Areas" (1993). Safer road network designs, accessible public transport, and short distances to services (e.g., supermarkets) were some of the goals for the development of these newer areas. These spatial policies and developments together with transport planning have played a role in the declining numbers of fatal crashes (Schepers, 2021a;Schepers et al., 2019). Nonetheless, these changes add to the complexity of the problem of understanding the effect of the built-environment factors on safety because the homogeneity across different areas is lost.
To address the issues summarized in the preceding paragraphs, this study developed and utilized a wide-ranging set of built-environment, land-use, road network, and socioeconomic variables that have an effect on traffic safety. Moreover, the study accounts for real traffic exposure and transportation infrastructure characteristics in the analysis. To be specific, property damage only (PDO) and fatality/severe injury (KSI) Vehicle-Bicycle (V&B) and Vehicle-Vehicle (V&V) crashes were investigated. For this purpose, a rich dataset was compiled by integrating various national and local databases covering several domains ranging from built-environment characteristics to traffic features. A square grid-based macro-level analysis was adopted. The spatial contiguity effects of the built-environment factors were particularly considered in the analysis to address the spatial heterogeneities and dependencies. The compiled rich dataset was analyzed using a Hurdle Negative Binomial model which can handle count datasets involving an excessive number of zero outcomes (a common issue in crash count data) (Hosseinpour et al., 2013). The analysis was conducted for a case study area covering a selection of municipalities in the most urbanized part of the Netherlands, the "Randstad Area" which has experienced several changes in the built-environment over the decades.
The contribution of this paper to the literature is threefold. Firstly, the study examines the collective effects of a comprehensive set of builtenvironment factors on traffic safety, exploiting a rich dataset compiled from different national and local databases. Secondly, the study compares the contribution of similar built-environment characteristics on both V&B and V&V crashes in PDO and KSI types. Thirdly, the study presents a spatial-statistical modeling approach that can address prevalent area-based modeling issues such as crash assignment (Kocatepe et al., 2019), spatial dependency and heterogeneity (Ziakopoulos & Yannis, 2020), and an excessive number of zeros in the crash counts (Lord & Mannering, 2010).

Study Area
This study particularly focuses on the urban parts of the cities that are demarcated as built-up and recreational areas by the Central Bureau of Statistics in the Netherlands (CBS, 2015) ( Fig. 1-a). Furthermore, as this study only focuses on urban traffic crashes, motorways (i.e., intercity roads and trunk links) and crashes that happened on these roads were excluded from the analysis.
The case study was conducted in the Randstad Area which is the most urbanised area in the Netherlands and includes the four major cities in the country, i.e., Amsterdam, Rotterdam, The Hague, and Utrecht ( Fig. 1-b). These cities include new urban centers and housing developments that have been expanded from the city center towards the city boundaries since the 1970 s. Several new towns (built in the 1970 s), such as Almere and Houten were also included in the analysis ( Fig. 1-c). The built-environment characteristics of these new urban areas (built after the 1970 s) mainly differ from historic city centers in terms of destination accessibility, distance to transit and infrastructure design with seperated road space for different urban transport modes (Bach et al., 2006).

Geographical units of analysis
Analyzing traffic crash data requires specific considerations regarding the level and scale of the analysis, as well as the specification of the crash data. From the existing methods adopted to analyze the impacts of the "built-environment factors" on safety the area-level analysis is one of the most promising approaches. Depending on the study approach and availability of data, alternative units of analysis such as traffic analysis zones (TAZs) (Obelheiro et al., 2020;Osama & Sayed, 2017), administrative areas (e.g., census tracts or postcode areas (Abdel-Aty et al., 2013;Osama & Sayed, 2017;Saha et al., 2020)), or uniform formats (e.g., hexagons (Cui et al., 2021) and square grids (Gladhill & Monsere, 2012;Kim et al., 2010)) have been selected in the literature.
The uniform formats such as square grids are more advantageous than administrative areas considering that boundaries of these areas are demarcated by the roads where crashes occur. Therefore, there is always a problem of "crash assignment" when administrative areas are used (Gladhill & Monsere, 2012;Kocatepe et al., 2019). Utilizing the square grids as the unit of analysis allows the inclusion of all crashes without the need to give special consideration to crashes on "boundaries". Moreover, the uniform units allow the researcher to control for the unit size. Due to the practicality of this method, the availability of 100 × 100m 2 grid cells (100-cells), and the availability of extensive land-use and socioeconomic and demographic data in this format (CBS, 2017a) we divided the study area into 100-cells ( Fig. 1-a).

Modeling spatial spillover effects
A major concern in traffic safety analysis is related to spatial dependency and heterogeneity biases. These biases affect the performance of crash analysis due to unobserved factors associated with the locations of crashes (Xu & Huang, 2015). To alleviate these issues, two alternative approaches are proposed in the literature (Cai et al., 2016): 1) spatial autocorrelation effects: accounting for the effects of the unobserved exogenous variables by involving spatial autocorrelation among the units of analysis; 2) spatial spillover effects: accounting for the effects of the observed exogenous variables (e.g., built-environment factors) influencing the dependent variable (e.g., crash frequency) at the targeted and neighboring areas. The spatial spillover effect approach is appealing for macro-level studies due to the ease of modeling and its superior interpretability and communicability features (Cai et al., 2016).
To test the spatial dependency and heterogeneity, the Global Moran's I-tests (Ziakopoulos & Yannis, 2020) were conducted. The total number of PDO and KSI crashes in V&B and V&V types were tested. The tests results revealed that spatial autocorrelations exist between the crash frequencies of the neighboring analysis units. Therefore, we accounted for the spillover effects in the models. For this purpose, "queen contiguity" was used to identify the neighboring cells for each cell (eight neighbors in the case of square cells - Fig. 2). Then, the average values of identified "spillover variables" for these eight neighboring cells were computed and assigned to each cell. These neighboring variables were included as independent variables in the statistical models.

Hurdle Negative Binomial model
Several counts/continuous modeling techniques have been applied to analyze the crash frequencies and rates (Lord & Mannering, 2010). Crash data commonly include an excessive number of zeros due to the lack of crashes in the majority of the entities (e.g., roadway segments). The use of data with "excessive zeros" in modeling which does not address this issue can result in incorrect and invalid parameter estimations (Lord & Mannering, 2010). To address this problem, models such as the Tobit model for continuous dependent variables (e.g., crash rates) (Anastasopoulos et al., 2012;Chen et al., 2021;Ulak et al., 2018), as well as Hurdle and zero-inflated (ZI) models for discrete dependent variables (e.g., crash counts) (Cai et al., 2016;Chen et al., 2022;Katrakazas et al., 2021), have been proposed. Additionally, novel resampling data analysis methods are also introduced for handling unbalanced data in accident analysis studies. These methods are applied for developing predictive models using synthetic data for uncommon crash types (e.g., bus-involved crashes) and in the presence of excessive zeros (Ariannezhad et al., 2021;Chen et al., 2022;Morris & Yang, 2021;Yahaya et al., 2019).
The superiority of the dual-state models compared to the conventional Poisson and NB models has been previously considered for both area-based and network-based analysis (Cai et al., 2016;Hosseinpour et al., 2013). The major difference between the ZI and the Hurdle models is that the latter only accounts for the sampling (i.e., random) zeros, whilst the ZI model accounts for both sampling zeros and structural zeros (Hosseinpour et al., 2013). The structural zeros indicate an "impossibility" of crash occurrence (e.g., when there is no traffic), which disregards the fact that traffic crashes can happen on any road regardless of the safety level (Hosseinpour et al., 2013;Yu et al., 2019). Therefore, the Hurdle model is considered the more appropriate approach in this study.
The Hurdle models are modified two-part count models of which the first part is a binary model that handles whether the dependent variable crosses the "hurdle" (i.e., whether a crash happens in a cell) and the second part consists of a count model which is truncated at zero (Mullahy, 1986). Alternative probability distributions such as Poisson, NB, and lognormal can be used in the count part of the model depending on the goodness-of-fit to the data (Cai et al., 2016;He et al., 2019;Ma et al., 2015). Hurdle model was utilized in traffic safety analysis by Hosseinpour et al. (2013) to investigate the effects of the road network characteristics on pedestrian-vehicle crash frequencies on the roads. Even though, they showed that the HP models offer better modeling performance compared to the HNB models. In this study, we used Hurdle NB Regression models, as not only the NB distribution can be generalized to Poisson distribution but also comparisons of the performance of the Hurdle Poisson (HP) and Hurdle NB (HNB) models showed a better goodness-of-fit for the HNB model. The HNB model can be described as Eq. (1): where p i specifies the probability of crash occurrence in the analysis unit (0 < p i < 1); p i can have Logit (or Probit) distribution and y i indicates the number of events in the units of analysis; g ( y i ) is the negative Binomial distribution with mean μ i and NB dispersion parameter α (Eq. (2)). The model parameters are estimated by the Maximum Likelihood method, where the specification of the likelihood function has the advantage of separately maximizing both the count and the hurdle components of the model (Zeileis et al., 2008). In this study, we used R software, "pscl" package (Jackman et al., 2015) for estimating the model parameters.

Data processing approach
For the analysis, a rich dataset was compiled using several sources. The final set of variables in this dataset and summary statistics of these variables are listed in Table 1. The variables in this table are grouped into seven categories based on the characteristics of the variables. In each cell, crashes (based on the type) were summed up to calculate the total number of crashes (Fig. 3). In addition, line and polygon features intersecting with each cell were also aggregated (Figs. 4 and 5). That is, for example, roadways that fall into a cell were combined to find the length of each roadway type (e.g., residential, etc.) in that cell. Then, the proportion of different types of roads in each cell was calculated and assigned to the cell. For this purpose, Eq. (3) was used.
where R r i indicates the proportion of road type r in the corresponding cell i; l r ij is the length of road segment j with type r in cell i. For polygon features, such as the area of different land-use classes, a simple proportioning method (Eq. (4)) was used.
where S l i is the proportion of the polygon feature S type l in the cell i; S l ij is the area of the S l i in building block j; B j is the total area of the building block j, and a ij is the area of the building block j that intersects with cell i.
These values were then used to estimate the mixed land-use index (MXI) of the cells, which indicates the ratio of housing function areas to all other functions (Van Den Hoek, 2008). It is worth noting that the utilized MXI uses a floor-space area of the land-use functions provided in the RUDIFUN database (Leefomgeving, 2019) rather than the groundspace area. This results in more accurate MXI values (Harbers et al., 2019).

Motor-vehicle and cycling volumes data preparation
One major issue in traffic safety studies is the lack of reliable exposure variables, e.g., traffic volume, data for the minor roads (Schepers, 2021b). Previous studies usually utilized "proxy variables" such as the trip production measures (i.e., population or employment), (bicycle) lane-kilometer, mode choice, or accessibility to train to account for the traffic exposure (Chen, 2015;Merlin et al., 2020b;Obelheiro et al., 2020;Wang et al., 2019). This means that traffic and cycling exposures were not explicitly controlled for whilst analyzing the effects of builtenvironment factors. Although it is clear that such proxy variables can lead to less accurate estimates of these effects, use of these variables can give a fair indication of the traffic/cycling volumes in the absence of high quality traffic volume data.
In this research, we used databases from different sources such as Open Transport Map (OTM) (Jackman et al., 2015) and Open Street Map (OSM) (Geofabrik, 2019). Furthermore, traffic count data were obtained from the traffic models of municipalities of Amsterdam, the metropolitan region of Rotterdam-The Hague, the Province of Utrecht, and the city of Almere. However, the traffic models had two shortcomings: 1) the traffic models provide weekday counts and do not model weekend counts, and 2) the traffic models mainly provide the vehicle counts on the major urban roads (arterials/distributor roads) and do not cover minor roads (e.g., residential or access roads), which constitute a major portion of the road network in urban areas.
To remedy these shortcomings, we used regression models (Table 2) based on the data for the selected cities from a database called "INtensities on WEgVAkken" (INWEVA) provided by the ministry of infrastructure and the environment (Rijkswaterstraat, 2019a). This database consists of traffic intensities on major roads both for weekdays and weekends based on different vehicle categories. These models were used to estimate the vehicle counts at the weekend (WE n ) based on the weekday (WD n ) data. Estimated traffic volumes were then assigned to the minor roads. Finally, traffic volumes were used to calculate the total vehicle-kilometre-travelled (VKT) values in the cell i, using Eq. (5).
where VKT i is the total VKT in cell i; v ij is the average daily vehicle count on the road link j in the cell i; l ij is the length of the road link j (in km) located in cell i. For bicycle volume and average cycling speed, this study used data including the number of registered bicycle rides per road section during the entire Bicycle Counting Week (BCW) in 2016. This project had almost 51,000 participants in the Netherlands and was carried out from the 19th to 25th September 2016 (Nationale Fietstelweek, 2016). To check the validity of the BCW data, Van Petegem et al. (2021) conducted an analysis using data from manual bicycle counts, detector loops, and the BCW data in Amsterdam. The statistical tests found a high correlation between BCW data and manual counts (Pearson correlation = 0.78).   Moreover, Uijtdewilligen et al. (Accepted for publication) found that the ratio between the detector loops and BCW data is equal to 41.12. Thus, we calibrated the daily bicycle counts obtained from BCW by using a factor of 41.12. Similar to the VKT, the total Bicycle-Kilometer-Traveled (BKT) in the cells were calculated after this calibration.

Distance variables indicating the proximity to facilities
Proximity to facilities was evaluated using variables showing the distance to certain facilities. Information on the distance of each cell to the main roads was of interest in the V&V crash analysis because more motorized vehicles can be expected in the cells with a shorter distance to the main roads. In addition, the average distances to supermarkets and food-product shops were included as previous research in the Netherlands showed that these facilities are one of the main car/bicycle-trip destinations (Veenstra et al., 2010). The average distance to different types of (secondary) schools and train stations was also included in the V&B crash analysis as cycling is especially prevalent in the Netherlands for trips to education facilities (Mobiliteitsbeleid, 2017b) and access/egress to train stations (Mobiliteitsbeleid, 2017a).

Crash data
This study uses data obtained from the database with crashes reported by the police (BRON, 2015(BRON, -2019. Note that crash data commonly suffers from underreporting of minor injuries and PDO crashes and BRON crash dataset is no exception to this (SWOV, 2016). In general, severe crashes involving motor vehicles have a higher registration rate than less severe crashes and crashes which do not involve motor vehicles. Therefore, KSI crashes, defined as crashes with fatalities and/or injuries leading to hospitalization, are analyzed separately.
It is also worth mentioning that there can be inaccuracies in crash locations depending on the record type. In BRON data, crash locations are registered at four accuracy levels: 1) exact coordinates, 2) intersection level, 3) street level, and 4) municipality level (the least accurate level) (Please see (Rijkswaterstaat-CIV, 2021) for more details). In the BRON (2015-2019) data used in this study area, only 2.3% of crashes were registered at the municipality level. However, this part of the data was not discarded, due to its negligible share.
Despite these potential inaccuracies, the BRON dataset is still a highquality dataset and the only available source of traffic crashes in the Netherlands. Of the final set of 100-cells included in the analysis, one cell had 491 registered V&V-PDO crashes and was discarded after a manual inspection due to erroneous records.

Crash types and descriptive statistics
A total of 7,622 V & B crashes including 5,149 PDO and 2,473 KSI types were analyzed in this study. 26,443 PDO and 2,066 KSI crashes  involving motor-vehicles only (V&V) were included in the analysis. Fig. 6 shows that the majority of cells included in the analysis had zero crashes. Figure 7 shows that the maximum number of V&B-PDO and V&B-KSI in 2015-2019 are 15 and 9 crashes respectively. For V&V crashes, the maximum number for PDO and KSI crashes are 81 and 8 respectively.

Modeling results
Spatial Hurdle Negative Binomial models were conducted to analyze the traffic crashes. Tables 3 and 4 present regression coefficients (β), standard error (SE), and the p-value of parameters of the count and zerohurdle parts of the models as well as the goodness-of-fit (i.e., Log-Likelihood) of the models. In terms of comparing the modeling performances, we should note that the KSI crash models (AIC KSI-V&B = 15,366.10and AIC KSI-V&V = 13,917.21) had a better performance than the PDO crash models (AIC PDO-V&B = 24,888.37and AIC PDO-V&V = 69,450.99). Figures 8 and 9 show the "standardized coefficients" (Siegel, 2016) for significant variables (95% CI). Standardized coefficients were chosen because they are scale-less and unit-less, making them more useful than regular coefficients for comparing the direction and specifically the magnitude of effects (Zhao et al., 2021).
In Figs. 8 and 9, the plain and diagonal shaded blue bars reflect the impacts of the factors on the PDO-crash "frequency" and "probability" respectively. "Frequency" refers to the expected number of crashes in the 100-cells and "probability" refers to the expected chance of observing one or more crashes in the 100-cells within a 5-year period. The orange bars, on the other hand, show the same values for KSIcrashes. Fig. 8 shows that a greater frequency of V&B crashes is expected in the areas with a larger number of old housing (built before 1900). A comparison between the size of the blue and orange bars reveals that the magnitude of this impact was smaller for PDO crashes than KSI ones. The age of the buildings implies policy changes towards landuse and roadway design, rather than the effects of the buildings themselves. That is, the weighted average year of build for residential properties in the 100-cells can be treated as a proxy variable reflecting the variety in the design of the built-environment, which affects traffic safety.
In Fig. 8, it is also clear that the size of the plain orange bar corresponding to the 30 km/h variable is larger than the same bar for the 50 km/h. This indicates that 50 km/h roads have a larger impact on KSI crashes than 30 km/h roads. Whereas 30 km/h is more influential on PDO crashes than 50 km/h roads. These figures provide a clear illustration of the scale-free effects of variables on crashes and a comparison of these effects. However, the reader should refer to Tables 3 and 4 to see the unstandardized coefficients of these variables.

Effects of built-environment characteristics
The results show that fewer V&B crashes are likely in areas with a higher proportion of households with children (Fig. 8). This result was expected as the speed limit of the access roads in residential areas with housing for families is set at 30 km/h (15 km/h in home zones where the cyclists and pedestrians have priority over cars which makes the areas safer for pedestrians and cyclists). Moreover, relatively many families with children live in the VINEX areas that are designed (and also found) to be safer locations for cyclists.
Three groups of land-use characteristics affect traffic safety, land-use density, land-use mixture, and land-use classes (Xu et al., 2020). Our findings reveal that increasing the level of urbanization (indicated by the density of the registered addresses in the cells) raises the probability and frequency of traffic crashes. This might be due to the higher population and the conflicts caused by an increased rate of trip attractions/ generations in these areas. In contrast, a reduced crash probability was expected and found at a higher level of MXI (representing land-use diversity). This relationship was also found in the literature (Chen & Shen, 2016). This relationship can be explained by the fact that locations (i.e., origins and destinations) are closer to each other in areas with larger MXI, which encourages the residents to walk or cycle.
Furthermore, the results show that the ages of the built-up areas used as proxy variables for various land-use designs impact traffic safety. These variables also indicate the implementation of different land-use policies which is very intriguing. As the results illustrate, areas built after 1950 are safer for cyclists. Also, the results imply that the older the area (i.e., built between 1900 and 1970), the higher the frequency of V&V-KSI crashes. These findings verify the results of Schepers (2021b) and Schepers et al. (2019) who found that the development of the Dutch new towns (in the 1970 s) and VINEX areas (in the 1990s) has had a positive impact in reducing road deaths.
Higher proximity to grocery stores increases the frequency and/or probability of KSI crashes in the area. In addition, the shorter the distance to the lower-ranked 50 km/h municipal roads (indicated by "distance to the main road" (Leeuwen & Venema, 2021)) the lower the probability of KSI crash occurrence. This result is in line with the findings of Merlin et al. (2020a) who showed that cumulative higher accessibility to traffic volume on arterials and freeways within a 30-minutes travel time thresholds had a positive correlation with crash frequency of all (injury and modes) types. The short distance to the main roads may indicate a better-designed infrastructure using new policies which consider traffic safety. Higher proximity to the main roads also indicates a longer distance to the city center. In these areas, the road network is less dense and modes other than motor-vehicles are less frequent. All these characteristics might lower traffic conflicts and crashes. Figure 8 also illustrates that high proximity to schools increases the frequency of severe V&B crashes. These findings are consistent with studies in the Netherlands (Schepers, 2021b) and other regions (Hadayeghi et al., 2007) indicating that severe V&B crashes were more probable in the areas proximal to schools. One reason for this might be the tendency of secondary school students to cycle in groups (i.e., bunch cycling). Such behavior has been found to increase the risk of sports cycling crashes in the Netherlands (Wijlhuizen et al., 2016). Also, shorter distances to schools are common in areas with older buildings which were found to be riskier areas for cyclists.

Effects of traffic and infrastructure characteristics
The major role of exposure variables (i.e., VKT and BKT) in increasing the frequency and probability of crashes can be seen in Figs. 8 and 9. These findings broadly support the work of previous studies on traffic crashes (Mukoko & Pulugurtha, 2019;Tagar & Pulugurtha, 2021). Comparisons between the coefficients of BKT and VKT (Fig. 8 for KSI crashes) reflect "the safety-in-number effect". The ratio of these coefficients (VKT/BKT = 2.10/1.67) is fairly compatible with what was found by Elvik and Bjørnskau (2017) (motor vehicle volume/bicycle volume = 0.5/0.43). However, as Table 3 shows, the actual numerical estimates substantially vary in the Hurdle and count parts of the models.
The primary source of such variation in the safety-in-numbers effect in these models could be the models' specifications and utilized explanatory variables (Elvik & Goel, 2019). The results also show that the impacts of exposure variables on KSI crashes were considerably larger than their impacts on PDO crashes.
Based on Fig. 8, KSI crashes are more frequent (and probable) in areas with 50 km/h speed limit roads compared to areas with 30 km/h roads. Such results were consistent with the fact that almost 80% of road deaths in Dutch cities occurred on roads with a 50 km/h (or 70 km/h) speed limit (Schepers, 2021a). Nonetheless, Fig. 9 shows that severe V&V crashes are less likely to occur in areas with high-speed (>50 km/ h) roads. A similar result was found in the literature (Bao et al., 2021) indicating the possibility of drivers' violence from the posted speed limits at the lower speeds as well as the complexity of driving conditions because of mixed traffic and more access points at these roads (Tignor & Warren, 1990). In line with previous studies (e.g., Adams & Aldred, 2020;Obelheiro et al., 2020;Schepers, 2021b) we found that a higher proportion of arterial and secondary roads is associated with an increased number/probability of the crashes. Moreover, the figures reflect that a higher density of residential and minor roads in the areas leads to a higher crash frequency in all types.
Interestingly, high cycling speeds were (on average) observed to reduce the frequency of V&B-PDO crashes. This relationship can be reasoned as higher cycling speed is possible on cycling paths where there are no conflicts with motorized traffic. Also, the average cycling speeds should be lower at intersections where more conflicts and consequently more crashes are expected.
Regarding the effects of the cycling facilities, the results confirm what was found in other studies indicating that V&B crashes are concentrated on the arterial roads that are adjacent to the high volume and high-speed roads (Schepers et al., 2013;Schepers et al., 2011). In this study, these facilities are characterized by suggested cycling lanes and other cycling roads (i.e., mixed traffic roads). Moreover, based on Fig. 8 the separated cycle paths may negatively affect the V&B crashes compared to the suggested cycle lanes and mixed traffic roads. This result seems to contradict the findings of Van Petegem et al. (2021) and other previous studies showing that separated cycle paths are usually safer than cycle lanes adjacent to 50 km/h streets and mixed traffic streets (Adams & Aldred, 2020;Aldred et al., 2021). Although the separated cycle paths prevent conflict with motor-vehicles, the presence of busy side roads, parked vehicles besides the cycle paths, as well as large intersections that need to be crossed can increase the probability and frequency of V&B crashes (Vandenbulcke et al., 2014).

Effects of socioeconomic and demographic characteristics
To identify the effects of socioeconomic levels on safety, variables including the percentage of rental properties, average of housing values, and the percentage of the high-income population were used in the model. Figs. 8 and 9 show the effects of these variables indicating that areas with high socioeconomic levels are safer places, confirming previous findings (Najaf et al., 2018;Osama & Sayed, 2017;Xie et al., 2019).
Our results show that severe V&B crashes are less likely in areas with a higher proportion of households with children. However, the associations between the number of households in the area and the analyzed crashes remained unclear in our study. It is also good to note that, we excluded demographic variables such as gender and age as they were highly correlated with the total number of households and the percentage of households with children in the study area.

Effects of characteristics of the neighboring areas (spillover effects)
As for the role of neighboring variables which reflect spatial spillover effects, the results suggest that an increase in VKT in neighboring cells increases the probability (and to some extent the frequency) of PDO crashes. In contrast, high vehicle/bicycle volumes in neighboring cells were associated with a reduction in the frequency of severe V&B crashes and an increase in PDO-V&V crashes. This is rather counterintuitive but could indicate that the targeted areas are less likely to have high bicycle exposure. It is also worth noting that a higher proportion of shopping areas in the surrounding cells may raise the number of KSI crashes.

New insights
Previous studies about the effects of built-environment factors on traffic crashes have shown mixed and contradicting outcomes on the magnitude of the effects of the built-environment factors. This study examined the effects of a comprehensive set of built-environment factors on vehicle-only and bicycle-vehicle crashes in the built-up areas. The Table 3 Results of the SHNBR Model on V&B crashes; Count model coefficients (truncated NB with log link) and Zero hurdle model coefficients (binomial with logit link). 1. Land use variables are often highly correlated which can result in biased effects on traffic crashes. By including land-use variables indicating the urbanity level, we observed that the proportion of different land-use types,e.g., accommodation, shop, etc., become less influential on the crash frequency and probability. Furthermore, we did not find significant associations between crash frequency (and probability) and the proportion of meeting and sports areas (which can be considered as leisure areas). 2. The effects of building density outweigh the effects of population density. This indicates that the contribution of number and location of human activities (i.e., buildings) and therefore land-use policies in traffic safety is greater than the number of residents in an area. 3. The majority of the built-environment factors have similar impacts in terms of increasing or decreasing V&B and V&V crashes. However, this study gives new insights about the magnitude of these impacts that vary based on the severity level of the crash and also the vulnerability level of the involved parties. For example, a greater level of MXI reduces the number of both types of crashes. In this study, we also observed that this impact on V&B crashes 4. A greater proportion of residential roads and separated cycling paths increases the probability (and frequency) of V&B crashes in Dutch cities. This result differs from findings of previous studies in other countries. E.g., Osama and Sayed (2017) found that an increase in the proportion of local roads and separate bicycle paths had reduced the number of V&B crashes in Vancouver, Canada. Findings for the Netherlands are likely to differ given the design and density of the cycling facilities and high bicycle volumes in Dutch cities (Uijtdewilligen et al., Accepted for publication).

Discussion
The newly designed built-up areas, namely New Towns and VINEX areas developed in the 1970s and 1990s respectively, have improved the safety of cyclists in Dutch cities. Nonetheless, it is worth mentioning that another traffic safety policy, the Sustainable Safety Policy, was adopted after the 1990s, focusing on designing safer roadways (SWOV, 2018) and improving traffic safety (Weijermars & Wegman, 2011). As mentioned, we found some associations between traffic safety and the proxy variables indicating the implementation of different spatial and transport policies. This reveals that distinguishing the effects of such policies on traffic safety would be an interesting subject for future research. Contrary to cyclist safety, the above-mentioned urban policies might be less effective in reducing the frequency of serious car crashes. Interestingly, older neighborhoods (e.g., built before 1900) appear to be safer than other areas. This is probably because these areas are usually located in the city centers (i.e., historic places); hence interactions between vehicles and bikes are fewer and less severe. Newly designed areas were also devised to increase the accessibility of neighborhoods. In practice, accessibility is improved by shortening travel distances to various facilities, such as attraction locations and transport systems facilities. Although better accessibility is always desirable, our findings revealed that higher proximity to some "attractions" such as grocery stores and schools aggravate traffic safety. Whilst, Merlin et al. (2020a) found a negative significant relationship between increased proximity to job locations from the residential location of the crash victim and vehicle crash frequency (per capita), only within a 10minutes threshold (i.e., cumulative job-accessibility). They explained such a relationship based on the positive impacts of reduced distances to work on declined VKT. This raises a valid concern that needs to be addressed by future research investigating the reasons behind such a deterioration in safety. Future studies should also investigate the impacts of accessibility indicators on traffic safety (and vice versa) to provide greater insight into the relationships between accessibility and traffic safety.

Limitations and future directions
This study adopted square grid cells to evaluate the impacts of the built-environment and land-use factors on traffic safety. The potential effects of this choice (i.e., square cells) on the results were not investigated as such an inquiry is beyond the scope of this paper. For example, the hexagon units could be a better alternative as they have the minimum total perimeter length which results in the least boundary effects Fig. 9. Comparison between standardized Count/Zero-Model coefficients of the V&V crashes. (Cui et al., 2021). Future research could focus on identifying the effects of using alternative spatial units such as honeycomb and hexagon cells, building blocks, neighborhoods, or postcode areas.
A limitation in this study was the crash data, which were compiled based on police reports which are known to underreport traffic crashes with minor injuries and to have inaccuracies in reporting the location of crashes. To alleviate this problem, safety officials are currently working on compiling crash data based on ambulance reports (Rijkswaterstraat, 2019b). Future traffic safety research in the Netherlands could benefit from such more accurate data in terms of the severity level of the crashes. This data is planned to become available in the coming years at the national level, based on a national action plan on traffic safety (Rijkswaterstraat, 2019b).
From the methodological perspective, the scope of this study is limited in terms of evaluating the unobserved spatial heterogeneity, as well as the spatial and temporal autocorrelations. Therefore, future research could take advantage of using Spatial Error and Random Parameter modeling approaches as used in previous studies (e.g., (Cui et al., 2021;Huang et al., 2018;Ulak et al., 2018;Wang et al., 2019;Xu et al., 2020;P. Xu & Huang, 2015;Ziakopoulos & Yannis, 2020)). Moreover, future studies may investigate the built-environment and traffic related factors that can contribute to the temporal heterogeneity in traffic crash frequency. Some examples for such time-varying explanatory factors are variation in traffic volumes and operation speeds (R. Yu et al., 2019), as well as variation in the role of commercial and residential areas during the hours of the day and days of a week.
Data processing and methodology approaches introduced in this study can be utilized for analyzing traffic crashes in different cities. However, the findings of this research are based on the data in the Netherlands. Thus, the results of this study cannot be directly generalized to other regions, such as urban areas in the US. Because the land-use and road infrastructure characteristics in the selected study area are different from other areas, especially in terms of the availability and design of the cycling infrastructure. Moreover, differences in these characteristics lead to differences in travel behavior and traffic/cycling volumes.

Conclusions
Various built-environment factors contribute to traffic safety in urban areas. However, there is a scarcity of literature on the effects of these factors on the severity and number of traffic crashes in the Netherlands. This study exploits a rich dataset to investigate how and to what extent different built-environment factors are associated with frequency and probability of vehicle-vehicle and bicycle-vehicle crashes. An area-level analysis utilizing 100 × 100 m 2 cells along with Hurdle NB regression models was adopted to analyze PDO and KSI crash types. To account for the observed spatial correlation effects, the spillover effects of neighboring cells were included in the analysis.
The results of the HNB models show that in addition to road network and traffic characteristics, built-environment factors play a role in the number of crashes (i.e., crash frequency) as well as the chance of occurring a crash (i.e., crash probability), in the analysis units. It is noticeable that impacts of the built-environment factors are more dominant in KSI crashes compared to PDO crashes. Moreover, we observed that newly developed urban areas (built after 1990), that have different spatial designs, are safer for road-users (particularly cyclists). Increased land-use diversity (represented by MXI) is significantly associated with its positive impact on improving traffic safety. Whilst, a higher land-use density (represented by urbanity level) has negative effects on traffic safety. Population density, on the other hand, does not have significant effects although it was expected to have an impact on frequency and/or probability of traffic crashes in the analysis units. The modeling approach also addresses spatial dependency and heterogeneity through spatial spillover effects. Results indicate that traffic and land-use characteristics in surrounding areas (e.g., vehicle/bicycle volumes, proportion of shopping areas) are associated with traffic safety in the targeted analysis unit.
One of the major contributions of this study is analyzing the collective impacts of a comprehensive set of built-environment variables along with the traffic and infrastructure-related factors. This comprehensive variable set became the cornerstone to examining a cumbersome topic in traffic safety: identifying the relationship between traffic safety and the built-environment.