A cellular-automata model for assessing the sensitivity of the street network to natural terrain

ABSTRACT Natural and human-made features are not exclusive in settlements but interact across time and space, placing the context in constant evolution. The purpose of this paper is to search for the influence of terrain, a natural feature, on the configuration of the street network, a human-made feature, by analysing the results of two transition states of a cellular automata model used to simulate street networks. This work uses data from open-source projects and open-source applications. The first transition state models the street network from the neighbourhood rules and randomness, assuming that the natural terrain and the street network are independent of each other. The second transition state models the street network as the product of characteristics of the terrain, neighbourhood rules, and randomness, thus assuming that the natural terrain and the street network interact with one another. The model is run thirteen times for each of four cities by varying the terrain characteristics and is calibrated by comparing the simulated street maps with recent street maps. A comparison of the results shows that the CA model with the second transition state yields better simulations than the model with the first transition state. In one of the four cities studied, the first transition state results are similar to a specific state of the second transition state, indicating a weak inter-connectedness between the terrain and the street network in the mega-city. Further research can reveal whether the amount of inter-connectedness is specific to the city’s terrain or size. Recognizing the inter-connectedness of roads and terrain can help plan for resilient human settlements.


Introduction
The similarities between geometries found in nature, such as trees, and geometries of cities, particularly the organic ones, are striking. These forms and structures emerge because of the constraints of physical space governed by the laws of nature. The street network dictates cities' geometry (Marshall 2005, chap. 4;Southworth and Ben-Joseph 2003, chap. 1). The streets are expected to follow the territory's natural structure, especially when the terrain is rough, as they represent the direction of least resistance (Bejan and Zane 2012;Guerreiro 2010, 22). This work's overarching theme explores this expectation by modelling the street network using the natural terrain.
There is not much research on the configuration of transportation networks in light of terrain influences. Mohajeri (Mohajeri 2012;Mohajeri, French, and Batty 2013b;Mohajeri, French, and Gudmundsson 2013a;Mohajeri and Gudmundsson 2014a, 2014b) uses several quantitative measures to show the influence of landform constraints on the street network with regard to its overall geometry. However, an explicit characterization of the influence of the territory's structuring lines on the street network is yet to be explored thoroughly, with only a few works (Aburas et al. 2016;Levinson, 2012) considering the context of topography while analysing the street network variables. This paper explores this concept by modelling the relationship between the territory's structuring lines and the street.
The topography is the first condition in any settlement's genesis (Urban morphology 2016, chap. 2). The natural territory's structuring lines are the ridgelines and the drainage lines that connect the highest and lowest elevation points, respectively. Using examples for cities in Portugal, Guerreiro suggests that these lines are the path of least resistance. Hence, in different human settlements, 'the first paths follow these structuring lines of territory strengthening its geometry and configuration (Urban morphology 2016, chap. 2)'. However, an in-depth analysis of cities is pending. The ridgelines and the drainage lines are together referred to in this work as the stream channel. Thanks to the progress in remote sensing techniques and open-access datasets, it is now possible to access remotely sensed terrain surface models to extract the drainage lines and ridgelines for cities around the world (Maidment and Morehouse 2002).
Traditional urban models have sought to model the economic and size relationship (Alonso, 1964;Berry 1961;Christaller and Baskin 1966;Zipf 1949, chap. 9) or the city's social and economic patterns (Jacobs 1961, chap. 8;Pacione 2009, chap. 7). These models largely ignore the physical characteristics of urban expansion. In the early nineties, Batty et al. used a dynamics system model called diffusion-limited aggregation (DLA) to model urban expansion and demonstrated the effect of transportation routes on growth (Batty and Longley 1994). The DLA model is similar to cellular automata techniques (Clarke, Hoppen, and Gaydos 1997;Wolfram 2002, 5:994).
Cellular automata (C.A.) is a simple and easily automated method for generating simulations. CA originated from the 'theory of self-reproducing automata' developed in the late 1940s by John von Neumann and Stanislaw Ulam (Codd 2014, 14, 15). The relationship of C.A. models with the real-world phenomenon is highly abstracted but still gives rise to the radical idea that the real-world is C.A. (Hagen-Zanker 2008, 26). Researchers such as Conway, Wolfram, and Bak (Bak, Tang, and Wiesenfeld 1988;Gardener 1970;Wolfram 1986) further developed the CA model, independently. There are a few early examples of geographical models that are CA-like (Schelling 1971;Tobler 1970, 1979). It was applied in the urban context for the first time by White and Engelen (White and Engelen 1993) and soon after by Batty and Xie (Batty and Xie 1994). Currently, several urban models use the classical C.A. model and its variations, including the use of GIS, road network, and socio-economic data (Aburas et al. 2016, sec. 1.1.1;Carter 2018;Hagen-Zanker 2008;Hewitt, Van Delden, and Escobar 2014;Lagarias 2012;Leao, Bishop, and Evans 2004). The various methods and research works on cellular automata models for simulation and prediction of the spatiotemporal urban growth trends are detailed in a review paper by Aburas et al. (Aburas et al. 2016). Thus, the CA model is ideal for investigating the relationship between micro-level behaviour and macro-level patterns.
This work uses a C.A. model implemented in R known as SIMLANDER (Hewitt, Díaz-Pacheco, and Moya-Gómez 2013). R (R core team 2017) is an open-source programming language for statistical computing and graphics. In SIMLANDER, a single land use is modelled at a time (Hewitt, Díaz-Pacheco, and Moya-Gómez 2013;Roodposhti, Hewitt, and Bryan 2020, 2), which suits this work's objective where the focus is on simulating only the 'street' cells. SIMLANDER, though a simple land-use model like SLEUTH, is helpful for rapid exploration and simulation of urban growth, pattern, and form (Clarke, Hoppen, and Gaydos 1997;Feng et al. 2018a;Feng, Liu, and Tong 2018b;Feng and Tong 2019;Xian and Crane 2005).
The street network defines the urban spatial structure. The street network configuration can indicate whether a city is a planned or a self-organized one (Marshall 2005, chap. 4;Southworth and Ben-Joseph 2003, chap. 1). The organic or self-organized cities have grown through individualistic decisions at the local level. In contrast, the planned cities are formed through 'top-down' decisions and often exhibit geometric regularity in the urban form. Organic cities resemble the pattern of a growing cell, following the terrain and looking like a spider web: irregular but not disordered ("Video Lectures | Theory of City Form | Architecture | MIT OpenCourseWare," 2013). This work is limited to cities with self-organized urban patterns, such as star, branch, ring, or radial forms, rather than self-imposed forms of strict geometric regularity such as grids.
Cities and consequently street networks are complex systems. Complex systems are sensitive to initial conditions and exhibit 'path dependence' (Arthur 1988). Path dependence refers to the idea that historical states influence the system's future states. Thus, this study assumes that the past is the best predictor of the future. Additionally, even though the complex and dynamical processes of road network development are the outcome of geomorphology and of technical, managerial, economic, socio-cultural, and political policies (Marshall 2005, chap. 4;Southworth and Ben-Joseph 2003, chap. 1;Xie and Levinson 2007), this work is limited to the geomorphology, following the overarching theme of the study.
In this paper, we attempt to understand how street patterns emerge in the context of the natural terrain. This study aims to use a C.A. street network growth model to ascertain the inter-connectedness between streets and natural terrain. The proposed analysis was accomplished by systematically running the model with varying transition rules, one having and the other not having the terrain characteristics, and then assessing the model output differences. While comparing the outputs (simulated maps), the Jaccard index and coincidence matrix provided quantitative descriptions.
This study showcases how high-resolution topographic data, street network information, and simulation models can help grasp the essence of complex geographical systems and understand the interaction between physical entities. This research uses open-source materials. The CA model is developed and run on the R-platform, and the publicly accessible data on streets and terrain forms the raw data for the research.
The rest of the paper has two parts. The first part describes the C.A. model, calibration process, and approach for analysing the model results. The second part outlines the results, discussions, and conclusions.

Study area and data collection
This work uses street maps of Singapore in Southeast Asia, Caracas in Latin America, Fukuoka in Japan, and Ahmedabad in India. All these cities have a population of over one million and have a typical self-organized pattern. In terms of network properties, all these cities have a network size of 100 or more. The network size is the number of nodes in a network. The road data set is converted to a graph format in its 'primal' form (Barthélemy 2011;Cardillo et al. 2006;Porta, Crucitti, and Latora 2006) to find the network size. Ahmedabad is a river city in the floodplains. Singapore's terrain ranges from plain to hilly, while Fukuoka is a coastal city with hilly terrain, and Caracas is a city with high ranges. Thus, the cities represent different landscapes, from floodplains to high ranges. Moreover, all the cities have different economic, historical, cultural, functional, and geo-climatic conditions.
The 'Atlas of Urban Expansion' is a 'multi-phase research effort to monitor the quantitative and qualitative aspects of global urban expansion' (http://www.atlasofurbanexpansion.org/about [WWW Document] 2016). The project provides the digital street maps of the cities, which are publicly accessible. The street maps with arterial roads are sufficient for the study because the arterial roads form the skeletal frame of the urban fabric. The vector maps are converted to raster format with two cell types, street and non-street cells, for use in the model.
A binary C.A. model for predicting the street network's growth is designed in R taking cues from SIMLANDER (Hewitt, Díaz-Pacheco, and Moya-Gómez 2013). The binary C.A. model simplifies the analysis process and is suitable for the study aim. The core of the C.A. model is the neighbourhood effect. According to the neighbourhood effect, an individual cell's susceptibility to change state at each time step is a function of the neighbouring cells' influence. The methodology followed is graphically represented in Figure 1, and the specific characteristics of the C.A. model are now discussed.

Influencing factors and variations
The CA model is developed by taking cues from SIMLANDER (Hewitt, Díaz-Pacheco, and Moya-Gómez 2013). As explained by Hewitt et al., 'SIMLANDER' stands for 'SIMulation of LAND use changE using R'. The SIMLANDER code calculates a 'transition potential' to evaluate the demand for a particular land use at each time step. By the end of each time step, the output is a new land use map. The code is available at https://simlander.wordpress.com/about/.
In this C.A. model, the 'transition potential' has two rules. The first transition rule, henceforth referred to as 'TR_1', is a function of distance from road network (R.D.), neighbourhood transition potential (N.P.), and stochasticity or factor of randomness (R.P.).
For each time step, the transition rule of 'TR_1' is:
TR_1 = f(R.D., N.P., R.P.)
The second transition rule, henceforth referred to as 'TR_2', is a function of neighbourhood rules (N.P.), stochasticity (R.P.), distance from road network (R.D.), stream channels (STR), and slope map (S.M.). For each time step, the transition rule of 'TR_2' is:
TR_2 = f(N.P., R.P., R.D., STR, S.M.)
In the C.A. model, the distance from road network (R.D.), neighbourhood rules (N.P.), stochasticity (R.P.), stream channels (STR), and slope map (S.M.) are the influencing factors. The stream channels (STR) and the slope map (S.M.) are the environmental (or precisely the terrain) characteristics obtained from the Global Multi-resolution Terrain Elevation Data 2010 (GMTED2010) (Danielson and Gesch, 2010). The GMTED2010 contains seven elevation products for 30-, 15-, and 7.5-arc-second spatial resolution: minimum elevation, maximum elevation, mean elevation, median elevation, standard deviation elevation, systematic subsample, and breakline emphasis. This research uses the publicly available mean elevation raster product for 7.5-arc-second spatial resolution, around 250 m (EarthExplorer - Home [WWW Document] 2019).
The GMTED2010 at a scale of 1:1,000,000 is used to get the terrain elevation data for extracting the stream channels for each city. The threshold value of '5' for the flow accumulation gives a detailed stream network. The stream network extracted from the DEM is the 'drainage lines'. The 'drainage lines' follow the depressions in the DEM. The lines along the raised areas form the network of ridgelines. An inverted image of the DEM gives the ridgelines. The DEM is multiplied by negative one (−1) to get the inverted image. The functions in R and GRASS GIS platforms can extract the stream channels from the DEM. The map of stream channels ('STR'), distance from road network ('R.D.'), and slope map ('S.M.') are initial conditions that remain constant.
The neighbourhood transition potential map ('N.P.') and the factor of randomness ('R.P.') are calculated anew at the end of each time step. The following paragraphs describe the variables used in the function. The results are dependent on the values of these variables.
N.P. A neighbourhood rule is applied to the initial map at each time step to get the neighbourhood potential map by following the steps given in SIMLANDER (Hewitt, Díaz-Pacheco, and Moya-Gómez 2013, 7). A moving window filter of dimension three cells by three cells (3×3) applies weights to a cell's existing state and its neighbourhood to determine its transition potential. The final value is re-scaled into a range of [0.25, 2.75].
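The moving-window step can be illustrated with a minimal Python sketch (the study itself works in R). The uniform window weights, the zero padding at the raster edge, and the min-max re-scaling are assumptions made here for illustration; the SIMLANDER steps cited above define the weights actually used. The names neighbourhood_sum and rescale are hypothetical.

```python
# Illustrative sketch only: uniform 3x3 weights and min-max re-scaling are
# assumptions, not the weights defined in SIMLANDER.

def neighbourhood_sum(grid):
    """Sum each cell's 3x3 window; cells outside the raster count as 0."""
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += grid[rr][cc]
            out[r][c] = float(total)
    return out


def rescale(grid, lo, hi):
    """Linearly map the raster values onto [lo, hi]; a constant raster maps to lo."""
    flat = [v for row in grid for v in row]
    vmin, vmax = min(flat), max(flat)
    if vmax == vmin:
        return [[lo for _ in row] for row in grid]
    scale = (hi - lo) / (vmax - vmin)
    return [[lo + (v - vmin) * scale for v in row] for row in grid]


# Binary street raster: 1 = street cell, 0 = non-street cell.
streets = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
np_map = rescale(neighbourhood_sum(streets), 0.25, 2.75)
```

The same rescale helper could map a raster of random numbers onto [0.025, 0.25] for the factor of randomness.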
R.P. For the factor of randomness, the steps as given by Hewitt (Hewitt, Díaz-Pacheco, and Moya-Gómez 2013, 9) generate the randomness map, and the final values are re-scaled into a range of [0.025, 0.25]. The factor of randomness has the least weightage.
R.D. The road distance map is a rasterized image of the road network in which each non-road cell holds its distance to the nearest road cell. The reclassification gives a higher weightage to a cell near an existing 'road' cell to change into a 'road' cell. The value of '10' is given to the 20% of cells closest to the existing road cells, while the rest are re-valued as '0.001'. The given weights for N.P. and R.D. make sure that a cell closer to an existing road cell has more chances of being converted to a new 'road' cell than any random cell. Meanwhile, the value of '0.001' for the cells that are far from an existing road cell is much lower than the lower bound of R.P.
S.M. The GMTED2010 gives a slope map of the terrain. The slope map values are reclassified, keeping in mind that urbanization often occurs in terrain with a slope of less than 5%. The average slope of a large dataset of global cities is 3.3%, and 75% of cities have a slope less than 3.9% (Appendix 2). The reclassification of slope values gives higher weightage to cells with lower slopes. The areas with a slope of less than 5% have a value of '5', those with 5% to 20% have a value of '0.75', and the rest have a value of '0.1'.
STR The drainage and ridge networks compose the natural terrain's geomorphology and are referred to as stream channels (Urban morphology 2016, 11). Various ordering schemes can represent the stream network's hierarchical distribution (Haggett and Chorley 1969, 9). In this analysis, the Strahler order is used. It provides a generalized description of the hierarchy in stream channels and differentiates the segments of stream channels from one another quantitatively.
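A minimal Python sketch of the road-distance reclassification follows. It assumes Euclidean distance measured in cell units, at least one road cell in the raster, and an arbitrary (row-major) choice among cells that tie at the 20% cut-off; none of these details are stated in the text. The function name road_distance_weights is hypothetical.

```python
import math


def road_distance_weights(grid, near_share=0.2, near_w=10.0, far_w=0.001):
    """Give near_w to the nearest `near_share` of non-road cells and far_w to the rest.

    Assumes the raster contains at least one road cell (value 1).
    """
    rows, cols = len(grid), len(grid[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    roads = [cell for cell in cells if grid[cell[0]][cell[1]] == 1]
    others = [cell for cell in cells if grid[cell[0]][cell[1]] == 0]
    # Distance (in cell units) from each non-road cell to its nearest road cell.
    dist = {
        (r, c): min(math.hypot(r - rr, c - rc) for rr, rc in roads)
        for r, c in others
    }
    # Stable sort: tied cells keep their row-major order (an assumption).
    ranked = sorted(others, key=dist.get)
    n_near = round(near_share * len(ranked))
    near = set(ranked[:n_near])
    return {cell: (near_w if cell in near else far_w) for cell in others}


# A 5x5 raster with a single north-south road in the first column.
grid = [[1, 0, 0, 0, 0] for _ in range(5)]
weights = road_distance_weights(grid)
```

With 20 non-road cells, four of the cells adjacent to the road receive the weight 10 and all remaining cells receive 0.001.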
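The slope reclassification can be written as a small lookup. In this Python sketch, the treatment of the boundary values (exactly 5% and exactly 20% both fall in the middle class) is an assumption, and the function name slope_weight is hypothetical.

```python
def slope_weight(slope_percent):
    """Reclassify a slope value (in percent) into the weight used for S.M."""
    if slope_percent < 5:
        return 5.0    # gentle terrain, where urbanization often occurs
    if slope_percent <= 20:
        return 0.75   # moderate slopes
    return 0.1        # steep terrain


# Apply the lookup cell by cell to a small slope raster (values in percent).
slope_raster = [
    [1.2, 3.3, 6.0],
    [4.9, 18.0, 25.0],
]
sm_map = [[slope_weight(v) for v in row] for row in slope_raster]
```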
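As an illustration of Strahler ordering, the following Python sketch treats a stream network as a tree in which each segment lists the segments that flow into it. The dictionary representation and the segment names are hypothetical; in practice the order would come from the GIS tools used to extract the channels.

```python
def strahler(segment, upstream):
    """Return the Strahler order of a segment.

    `upstream` maps a segment name to the list of segments that drain into it.
    A headwater segment has order 1. Where two or more tributaries of the
    highest order meet, the order increases by one; otherwise the highest
    tributary order carries through.
    """
    tributaries = upstream.get(segment, [])
    if not tributaries:
        return 1
    orders = [strahler(t, upstream) for t in tributaries]
    top = max(orders)
    return top + 1 if orders.count(top) >= 2 else top


# Hypothetical network: two order-2 branches join into "m", which then
# receives a single headwater stream "c" before reaching the outlet.
upstream = {
    "outlet": ["m", "c"],
    "m": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1", "b2"],
}
orders = {name: strahler(name, upstream) for name in ("c", "a", "m", "outlet")}
```

Here the outlet keeps order 3 because joining an order-3 segment with an order-1 segment does not raise the order.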

Varying the elements
In this study, the two transition rules, TR_1 and TR_2, were used to evaluate the effects of terrain on the street map simulation. The stream channels and slope map characterize the terrain. For each city, the C.A. model is run once for 'TR_1' (represented as '5'). For 'TR_2', the C.A. model is run for three variations of the stream channels (STR), as seen in Table 1: (1) the 'complete' stream channel, in which both the ridge and drainage channels are taken (represented as 'B'), (2) only the drainage channels (represented as 'D'), and (3) only the ridge channels (represented as 'R'). Each variation is run in four settings: (1) only the higher orders of the stream channels, represented by '2', (2) higher orders of the stream channels along with their neighbourhood, represented by '4', (3) higher and lower orders of the stream channels, represented by '3', and (4) higher and lower orders of the stream channels along with their neighbourhood, represented by '1'. This yields 12 simulated maps for each city. A moving window filter of dimensions three by three (3×3) gives the stream channels' neighbourhood. Thus, each city is run for 13 simulations, once for 'TR_1' and 12 times for 'TR_2'. Next, the Jaccard index and coincidence matrices evaluate the effects of terrain on the street map's simulation.
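The 13 runs per city can be enumerated as in the Python sketch below. The tuple layout and the function name simulation_runs are hypothetical; only the codes 'B', 'D', 'R' and '1' to '5' come from the text.

```python
from itertools import product

# Stream-channel variations used with TR_2.
CHANNEL_SETS = {
    "B": "ridge and drainage channels",
    "D": "drainage channels only",
    "R": "ridge channels only",
}

# Settings applied to each variation (codes as used in the figures).
SETTINGS = {
    "1": "higher and lower orders, with neighbourhood",
    "2": "higher orders only",
    "3": "higher and lower orders",
    "4": "higher orders, with neighbourhood",
}


def simulation_runs(city):
    """List (map name, setting code, transition rule) for every run of a city."""
    runs = [
        (f"{city}{channel}", setting, "TR_2")
        for channel, setting in product(CHANNEL_SETS, SETTINGS)
    ]
    runs.append((city, "5", "TR_1"))  # the single run without terrain
    return runs


runs = simulation_runs("Fukuoka")
```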

Characterizing the model
The simulations start with the historical street map of 1990. The reference map is the actual street map of 2015. The historic street map of 1990 and the actual street map of 2015 of the cities are available in the repository (http://www.atlasofurbanexpansion.org/data). All the raster files have the same extent and a resolution of 0.00074 × 0.00057 degrees (around 80 × 65 m). The transition potential is a function of all the influencing factors. It determines the susceptibility of a cell to change its state at the end of each time step. Here the cells can only be in the state '0' or '1'. The state '0' is for the cells with no road segment and '1' for cells with a road segment.
A 'for' loop runs for all the time steps. The number of cells transformed at each time step is the total number of cells that changed state from 0 to 1 between the two time periods divided by the number of years. The historic street map used inside the 'for' loop is buffered by 250 m so that new 'road' cells form away from the core of the city, where the road density is high and there is hence less scope for new arterial roads to appear. The cells in the state '1' at each step are masked from further changes in state, because a new road segment is often a permanent change. The growth rate in the number of road pixels is assumed to be constant across the years.
The transition potential is a function of the influencing factors. At each time step, the 'n' cells with the highest transition potential are selected and converted to state '1', where 'n' is the total number of cells that changed state from '0' to '1' between the two time periods divided by the number of years. The pressure for a road pixel to appear on a non-road pixel depends on the transition rule.
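The time-step logic described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the transition potential is passed in as a fixed raster rather than recomputed each step, the 250 m buffer is omitted, integer division is used for 'n', and ties are broken in row-major order. The function names are hypothetical.

```python
def cells_per_step(initial, final, years):
    """Number of cells to convert per step: total 0-to-1 changes divided by years."""
    changed = sum(
        1
        for row_i, row_f in zip(initial, final)
        for a, b in zip(row_i, row_f)
        if a == 0 and b == 1
    )
    return changed // years  # integer division is an assumption


def grow_one_step(current, potential, n):
    """Convert the n non-road cells with the highest potential to road cells."""
    candidates = [
        (potential[r][c], r, c)
        for r in range(len(current))
        for c in range(len(current[0]))
        if current[r][c] == 0  # existing road cells are masked from change
    ]
    candidates.sort(key=lambda item: item[0], reverse=True)
    grown = [row[:] for row in current]
    for _, r, c in candidates[:n]:
        grown[r][c] = 1
    return grown


initial = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
reference = [[1, 1, 1], [0, 0, 1], [0, 0, 1]]
# Fixed, made-up potential raster; in the model it depends on the transition rule.
potential = [[0.0, 0.9, 0.8], [0.1, 0.2, 0.7], [0.1, 0.1, 0.6]]

years = 2
n = cells_per_step(initial, reference, years)
state = initial
for _ in range(years):
    state = grow_one_step(state, potential, n)
```

In this toy case the potential raster was chosen so that the simulated map reproduces the reference map exactly; real simulations are far less precise.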

Model Validation
In this work, the similarity between the complete simulated and reference maps will be high because most of the simulated map cells also belong to the reference map. Thus, any statistic used to measure the similarity of the complete maps will give a high value, leading to false conclusions. Therefore, in this study, it is essential to evaluate the similarity between the newly generated street cells of the simulated map and the street cells of the actual map that are not present in the historic map. Here, the statistics will give a low value, as the simulated street map is only an abstracted image of the actual road map. The authors expect that an accurate simulation of the street network configuration is not possible because the streets belonging to different terrains may intersect the structuring lines of the territory at various angles rather than merely following them. Therefore, the similarity between the map representing the neighbourhood of the 'original change map' and the 'simulated change map' is assessed. A 'change' map shows only the street cells that are not present in the historic map. The 'original change map' is the latest street map minus the historic street map, while the 'simulated change map' is the 'simulated street map' minus the historic street map. A moving window can buffer the cells to get a neighbourhood map of the 'original change map' with actual and neighbouring cells. The neighbourhood of the 'original change map' is compared to the 'simulated change map'. The Jaccard index can measure the similarity of maps with discrete values. Here, the maps have only two values, '1' for road cells and '0' for non-road cells. Since the focus is on the overlap of the road cells in two maps, the Jaccard index (Jaccard 1912) is suitable. The Jaccard index is the intersection between the rasters divided by the union between the rasters.
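The comparison of change maps can be sketched in Python as below. The 3×3 buffer size for the neighbourhood and the decision to leave historic road cells inside the buffered map are assumptions, as are the function names and the toy rasters.

```python
def change_map(later, earlier):
    """Cells that are road (1) in the later map but not in the earlier one."""
    return [
        [1 if new == 1 and old == 0 else 0 for new, old in zip(row_l, row_e)]
        for row_l, row_e in zip(later, earlier)
    ]


def buffer_3x3(grid):
    """Mark every cell within a 3x3 window of a road cell (binary dilation)."""
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1:
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < rows and 0 <= cc < cols:
                            out[rr][cc] = 1
    return out


def jaccard(a, b):
    """Intersection of road cells divided by their union; 0.0 if both are empty."""
    inter = union = 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            inter += 1 if (x == 1 and y == 1) else 0
            union += 1 if (x == 1 or y == 1) else 0
    return inter / union if union else 0.0


historic = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
actual = [[1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
simulated = [[1, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0]]

original_nbhd = buffer_3x3(change_map(actual, historic))
simulated_change = change_map(simulated, historic)
score = jaccard(original_nbhd, simulated_change)
```

In this toy case two of the three simulated cells fall inside the eight-cell neighbourhood, so the index is 2/9.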
The number of cells in the 'original change map' and its neighbourhood that coincide with the 'simulated change map', obtained from the coincidence or confusion matrix, gives the degree of resemblance in terms of 'true positives'. The coincidence matrix gives the variables to calculate the sensitivity and specificity (Irizarry 2019, 496). The sensitivity is the true positives (T.P.) divided by the sum of true positives and false negatives (F.N.). The specificity, quantified here as precision, is the proportion of outcomes called positives (T.P. + F.P.) that are 'true positives' (T.P.) (Irizarry 2019, chap. 28). If the statistics yield unacceptable results, calibrations of the model parameters can further improve the simulations.
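The measures derived from the coincidence matrix can be sketched in Python as follows. The toy rasters are made up for illustration, and the function names are hypothetical.

```python
def confusion_counts(reference, simulated):
    """Count true/false positives and negatives for two binary rasters."""
    tp = fp = fn = tn = 0
    for row_ref, row_sim in zip(reference, simulated):
        for ref, sim in zip(row_ref, row_sim):
            if sim == 1 and ref == 1:
                tp += 1
            elif sim == 1 and ref == 0:
                fp += 1
            elif sim == 0 and ref == 1:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def sensitivity(tp, fn):
    """True positives divided by all actual positives (T.P. + F.N.)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp, fp):
    """True positives divided by all outcomes called positive (T.P. + F.P.)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Neighbourhood of the original change map (reference) and a simulated change map.
original_nbhd = [[1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
simulated_change = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0]]

tp, fp, fn, tn = confusion_counts(original_nbhd, simulated_change)
sens = sensitivity(tp, fn)
prec = precision(tp, fp)
```

Because the neighbourhood map is much larger than the simulated change map, the sensitivity is low (0.25 here) even when most simulated cells are correctly placed (precision 2/3).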

Analysis
The paper aims to identify the effect of varying an influencing factor or input parameter, the terrain, on a street map simulation. The authors have analysed the results by measuring the model output changes relative to changes in input parameters. The proposed method uses the Jaccard index and the coincidence or confusion matrices to compare the model results without and with the terrain characteristics, specifically the slope and the different conditions of the stream channels.

Results
An R-based binary C.A. model simulates the four cities' street networks for a temporal interval of 25 years using a one-year time step. Figure 2 shows a map of the ridge, drainage, and road network. The highlighted parts are the stretches of road network overlapping with the ridge and drainage network. The transition state parameters are varied to get 13 simulations for each city. The Jaccard index value and the coincidence matrix calculated for each simulation compare the simulation outcomes. Figure 3 shows the latest road network and simulated maps using 'TR_1' and 'TR_2' overlaid on the initial road network.

Sensitivity on transition rule change from 'TR_1' to 'TR_2'
The Jaccard index and coincidence matrix compare each city's simulation results from the C.A. model run with 'TR_1' against the actual street map. In Figure 4 and Figure 5, the value '5' represents the results of the first transition rule, 'TR_1'. Similarly, we compare the actual street map and the simulated results of the C.A. model run with 'TR_2' for three variations in four settings. In Figure 4 and Figure 5, labels '1' to '4' show the results of the model runs with the second transition rule, 'TR_2', for each city. The three variations, namely, the 'complete' stream channel consisting of both the drainage and ridge channels, only the drainage channels, and only the ridge channels, are represented, respectively, by the letters 'B', 'D' and 'R' appended to the end of the name of the city. The 'change' maps assess the resemblance between the simulated maps and the actual street maps. The Jaccard index compares the 'original change map with its neighbourhood' and the 'simulated change map'. The column titled 'Jaccard Index' in the Appendix has the Jaccard index for all the model runs. Figure 4 shows that for all cities, the Jaccard index of at least one C.A. model run with TR_2 is higher than the Jaccard index of maps simulated using TR_1. However, the Jaccard indices of TR_1 and setting '1' of TR_2 are similar for Singapore, signalling a weak influence of terrain on the street layout. The Jaccard indices of TR_1 and setting '1' of TR_2 vary widely for the other three cities, signalling a more substantial influence of terrain on the street network's configuration. Therefore, the dependence between the street and terrain is high in hilly cities like Caracas and Fukuoka, medium in cities with flat terrain like Ahmedabad, and negligible in mega-sized cities like Singapore. The simulation results are further analysed using a confusion matrix.
The coincidence matrix gives the true positives, true negatives, false positives, and false negatives. The Appendix shows these values for the model runs. The number of true positives is the number of 'street' pixels in both the 'original change map with its neighbourhood' and the 'simulated change' map.

Discussion
This study aimed to evaluate the relationship between natural terrain and the street network. It examined the effect of changes in the transition rule parameters on the generated model outcomes using the Jaccard index and coincidence matrix. The results indicate that adding the terrain characteristics to the transition rule has a significant impact on the street network simulation. Moreover, the results come closer to the 'actual' street network when the stream channels' setting in the transition rule is changed. Thus, the terrain does influence the configuration of the street network.
All the obtained simulations corroborate the assertion that the simulated CA model outputs are sensitive to the terrain characteristics' inclusion, specifically the stream channels and slope map. The 'setting' with all the stream channels' orders and neighbouring cells gives the best simulation results. This observation may indicate that in all terrains, all orders of the stream channels influence the configuration of the street network and are not limited to the higher orders or just the ridge or drainage network.
Innumerable geographic and cultural factors influence the urban street layout (Southworth and Ben-Joseph 2003, chap. 1). This research considers only two geographic factors: proximity to the existing street and physical terrain. Therefore, the simulated maps are poor representations of the 'real' road network. Although the simulated maps weakly resemble the 'actual' street network, the CA model helps interpret the terrain's influence on the street. This work compares the changes in simulation results on varying the transition rules for the limited variations in the influencing factors. However, the results will change depending on the weights used for the factors and the neighbour configurations. Further research can assess this CA model's sensitivity to cell size, neighbour configuration, and weights of the variables for possible improvements. Furthermore, extending this study to perform spatial analysis on a large dataset of cities of different sizes and belonging to different terrains can confirm whether the influence of terrain on streets depends on the landform or the size of the city.
Finally, open geospatial data and open modelling form an increasingly important paradigm that provides opportunities to accelerate innovations and discoveries. These are useful to gain knowledge about the 'science of cities' to plan for better cities. This work uses open-access data and open-source software, and is therefore easily reproducible by researchers. The open geospatial data and open-source technologies are publicly available at no cost. These technologies can be scaled up and upgraded easily to suit other contexts.

Conclusion
In summary, the C.A. model described in this work abstractly illustrates the influence of natural terrain on the street network configuration. The simulation result of a specific 'setting' of the stream channels in the transition rule of 'TR_2' yields a comparatively better simulation of the actual street network than 'TR_1'.
Additionally, several factors stimulate changes in the road network configurations, such as socio-cultural, economic, ecological, political, and governance factors. The model leaves these factors out for the sake of simplicity.
The research is open: it was performed using publicly available data and open-source tools for the modelling. The freely available resources have stimulated scientific research. The methodology described is easily replicable in other cities to verify the results. Further research can include additional parameters to study their influence on the street network configuration.