A Bottom Up Approach to Modeling Habitat Connectivity Dynamics Through Networks Analysis

© 2012 Morgado et al., licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Introduction
Urbanization processes conquer more and more land through sprawl. Cities unite into large functional regions, leaving species other than humans short of habitat areas. Biodiversity is in decline worldwide [1], but does it have to be? City development does not happen monotonously, as an ever-spreading wave of sprawl, as was believed for some time. Rather, cities expand by leapfrogging [2] and have a fractal form [3]. This type of process leaves vacant "open spaces" in between urbanized areas.
Urban planners would normally address urban development from a human-centered perspective, treating open spaces as a void left out by the city. As long as this void was plentiful, nature could cope with this approach. But as open spaces become scarcer, biodiversity declines due to habitat depletion and fragmentation, and biodiversity-centered conservation landscape planning can no longer be ignored. A typical planning process would involve trying to impose restrictive policies on development. Do restrictive policies really work if there are strong economic incentives to develop? In Israel, the existing land-use planning system has very limited impact on real-life land-cover configurations, though the resulting total area allocations mostly follow the guidelines [9]. On the other hand, knowledge that allows educated guidelines for future development might let sprawl processes be directed and designed in a less destructive way: that is, knowledge about the spatial configuration nature requires in order to thrive, as opposed to the minimal requirements to survive.
Open spaces are becoming more fragmented as a result of human activities and infrastructure. Open space fragmentation has an important ecological effect, due to its contribution to habitat depletion and degradation and to subsequent biodiversity loss. According to the EEA [4], fragmentation results in the inability of different species to access resources, in a reduction of the amount and quality of habitat areas, and in their isolation, among other effects. Fragmentation is one of the major problems in the EU nowadays, and it contradicts the principles of sustainability, especially concerning biodiversity conservation and land-use policies [5,4]. A wide range of literature on connectivity and landscape fragmentation assumes that an increase in the fragmentation of open space is associated with a decrease in connectivity [5,6]. Therefore spatial planning and environmental policies should argue for increasing connectivity between green spaces as a way to enhance habitat preservation. Unfortunately, some of the planning tools and policies have a poor scientific basis, neglect or ignore spatial influence, and state more aims than measures to achieve them.
This chapter offers a summarized overview of the methods and models, skills and tools required for educated decision making regarding environmental management of open spaces, and then proposes a bottom-up approach to modeling complex landscape phenomena involving habitat connectivity. To pursue this, several concepts, technologies and methods are assembled, such as GIS, Multi-Agent modeling (MA) and Complex Networks analysis (CN). Finally, a spatially explicit model is set up in order to analyze the evolution of habitat connectivity and deliver some insights about environmental spatial planning. This model is based on the understanding that different species of animals have different dispersal characteristics and operate in the environment at different scales, meaning that there is a need for suitable linkages between natural spaces at a scale relevant to each species [5,6]. Hence a generic spatially explicit model, constructed for analyzing habitat fragmentation of a landscape, should be implemented at a range of species-specific scales. In order to serve decision makers as a tool, this type of analysis should take into account several focal species (e.g. small and large mammals, small birds, insects and plants) and then compile the results into a landscape metric. The model is intended as a tool for spatial analysis that can be applied to a sequence of temporal data and thus used for understanding the spatio-temporal dynamics of open space connectivity.
This work's purpose is thus threefold. First, we introduce the importance of spatially explicit models for environmental studies (e.g. biodiversity conservation planning, environmental impacts of urbanization and urban planning) and, more particularly, explain the fitness of bottom-up models for analyzing the evolution of habitat connectivity, based on the state of the art in published work. Second, we give a step-by-step explanation of the modeling methods, so that it can be used as a benchmark by others devoted to this kind of study. Third, rather than drawing conclusions, we prefer to make final remarks on further developments and define promising applications based on new techniques and methods.
The methodology is developed in the following stages: 1) definition of the conceptual model regarding the integration of GIS, multi-agent systems and complex networks; 2) model implementation; 3) network analysis. The methodology thereby contributes to the discussion on the relevance and suitability of multi-agent systems, GIS and CN for modeling landscape fragmentation, and to improving spatial analysis techniques using connectivity algorithms.

Explicit spatial models for environmental studies
Longley et al. say that "There is something special about spatial" [7].
We cannot agree more. Considering that most events occur in space, geography should be a key variable in the equation of any model. In fact, geography is both a constraint and a condition of life throughout a territory. Our research focuses on species interactions with the territory as a basis for landscape conservation prioritization and land-use decision making. Of course, many models ignore geography. But as experience shows, even the most elegant and simple models are sometimes a poor proxy for the complex spatial reality.
Therefore, it is fundamental for the model to capture the complex spatial reality, including connectivity and fragmentation (e.g. RAMAS [8]). Location (Where?) is crucial to understanding relationships (What? and How?), to supporting the resulting decision making, and to building better tools for spatial planning.

Spatial environmental models
Past studies analyzed the functioning of open space through landscape modeling using a network-theoretic approach. A complex landscape can be conceptualized as a mosaic of habitat patches, connected by corridors and surrounded by a hostile matrix [12]. Spatial models usually take a snapshot in time and explore spatial aspects of ecosystems: pattern analysis, habitat fragmentation, patch structure, corridors and connectivity, and least-cost paths through a landscape [11]. The theory of island biogeography [13], together with meta-population theory [14] [15], was seminal for the development of spatial models in ecology; levels of species richness were compared across different patches.
Graph-based approaches to landscape modeling can yield estimates of the value of individual patches and corridors for the whole system. The approach is typical when tracking population dynamics and when detailed biological or demographic processes are not needed, or simply when such information is not available in practice [16]. Another type of study ranked the dispersal-obstruction effect of land and applied this index as a cost surface to the landscape [17]. A more detailed approach considers the land mosaic in its functional landscape heterogeneity rather than as a binary habitat-matrix. By creating a map of functional land-cover types, the impact of landscape heterogeneity on biodiversity is assessed [18]. Cantwell and Forman [19] explored the topology of graphs defined by the juxtaposition of different patch types. Links could represent potential dispersal routes, functional connections between different patches, or trophic or mutualistic interactions, weighted by the strength of interaction or by flow rate and direction [20].
Another method utilizes scenario examination modeling. A moderately complex implementation framework consisting of modeling future habitat state, simulating persistence of individual surrogate entities and then projecting it across multiple real entities was developed by Ferrier and Drielsma [21] and has been employed widely in various whole-landscape planning processes throughout New South Wales, Australia.

GIS-based modeling
Although modern GIS became popular in the 1980s, various lineages can be found in articles on the history of GIS. Many works anticipated some GIS functionality even before Roger Tomlinson coined the GIS acronym in the 1960s. Some of these pre-GIS works were done by landscape architects and urban planners [22], alongside geographers and computer scientists, and they highlight the need for well-informed environmental spatial decisions concerning land-use planning and land resource management. Two of the major GIS software producers on the market at the beginning of GIS computer automation were ESRI (with the ArcInfo software) and INTERGRAPH (with the MGE software). Both were built for environmental spatial modeling purposes, as testified by the names adopted: Environmental Systems Research Institute (ESRI) and Modular GIS Environment (MGE). This only emphasizes the importance of GIS tools and methods for environmental modeling and assessment.

Biodiversity and land use change
Biodiversity can be measured using different metrics. Species richness (alpha diversity) is the most intuitive way to understand it. Ecosystem (or beta) diversity is another metric. Different ecosystems are inhabited by different species, so the two metrics are correlated in a landscape-wide context [10]. Land-use changes that follow urbanization processes impact nature in various ways. Agricultural land use captures space that belonged to nature and had its own biodiversity indices. After the change, both diversity metrics tend to decline. Looking at the dynamic process, some species will die out as more and more land is captured by agriculture, while several new ones will flourish (e.g. the cultivated species themselves and their parasites and predators). Every agricultural patch of land has some area of ecological influence outside its borders, depending on its type: intensive or extensive agriculture, husbandry, fish or bird farming, greenhouses, etc. Urban areas contain different land uses. Built areas can be classified by population density, type of activity, etc. Different configurations of urban fabric can have different ecological footprints. One way to classify urbanization is by the percentage of impervious surfaces, which is expected to correlate with population density (residential land cover) and/or economic activity levels (commercial or industrial land cover). Each urbanized patch has an ecological influence that extends outside its borders: e.g. residential areas have garbage collection points and other sources of food for some species, and industry has pollution effects with different distances of influence (water, soil, noise and light pollution, magnetic fields, etc.). Urbanization also brings intensification of development, and more and more transport infrastructure is added.
Roads, rails and navigable rivers are linear elements of the landscape that function as movement barriers, with an impedance that depends on species characteristics and the level of traffic. Therefore the spatial elements of the ecological impacts of urbanization processes should be modeled using GIS. Spatial modeling is done as a snapshot in time, which masks the dynamic nature of the processes modeled. For instance, a relevant aspect is the history of each patch. If the patch was part of a large natural area but recently became isolated by surrounding urban fabric, its biodiversity has probably not declined yet, because there is a time lag in ecosystem response to environmental disturbance. Another patch of similar configuration could have very different characteristics because it was used for agriculture sometime in the past. Land-use legacies of the patches themselves and of their neighbors can affect ecological systems for long periods. Thus a temporal perspective on the land-use history of patches is important [11] and should be implemented in spatial modeling using GIS.

Networks and graph-theory
Graph theory is a branch of mathematics dedicated to network topology. Leonhard Euler founded graph theory in 1736, but until the 1950s it remained almost solely a subject of mathematics and physics research. Owing to the GIS revolution and improvements in computing performance between the late 1980s and the 1990s, graph theory has undergone explosive growth in many disciplines, including geography and, of course, landscape ecology and conservation biology [22]. During that period we also witnessed the emergence of a new type of science: the science of complex networks [23]. Landscape and geographic networks can be considered a subclass of complex networks in which nodes are located in a Euclidean two- or three-dimensional space and arcs express real physical relationships instead of abstract ones [24].
There is a natural affinity between graph theory and geography. In fact, looking at the emergence of quantitative analysis and mathematics in geography in the 1960s, we find that geographic network analysis (GNA) is not new [29]. Topology also echoes Tobler's groundbreaking work on spatial dependence and interaction models. For these reasons, and because topology is about connectivity, the need to incorporate graph modeling into the analysis of landscape habitat connectivity becomes obvious.
Graph theory and complex network metrics are useful not only for giving a simplified visualization of system structure, but also for describing and analyzing its accessibility, resilience and robustness, and for evaluating and comparing its evolution through time. This approach allows us to understand how species interact, what the implications of their spatial organization are, and how they react to human activity inside and around their habitat [5]. Graph-theoretic and complex network metrics can be divided into two types: connectivity metrics and accessibility metrics. Connectivity metrics allow us to compare the structural complexity of networks and to measure the degree of network connectivity. Accessibility metrics measure the accessibility of individual nodes and can be considered advanced network analysis. Using these metrics we can identify individual changes in the network and the causes of global structural changes.

Multi-agent systems
Our conceptual data model uses GIS, multi-agent systems and network analysis. Why combine these three approaches? The aim is to benefit from the advantages of these different methods, which have proved suitable for analyzing complex systems, namely environmental spatial systems.

Model conceptualization
According to Longley et al. (2001), a conceptual model is a "…Human oriented, partially structured, model of selected objects and processes that are thought relevant …". Bearing in mind our central issue, how to measure habitat connectivity from one point to another, our first step is to design the diagram that portrays this conceptual model. This first step is essential for defining the data to be used and the subsequent operative processes. The conceptual model is developed in the following stages (Fig. 1): 1) data acquisition; 2) GIS integration; 3) computational modeling; 4) multi-agent system and network simulation; 5) validation of the model. Here we present a brief review of these topics.
1) About data acquisition. When acquiring data to integrate in a GIS environment we need to consider the following principles: the data must be georeferenced (in this case using a local datum, to provide spatial accuracy), have a scale detailed enough to produce comprehensive information, and preserve topological relationships. The analysis of habitat connectivity and its evolution considers land-use data for a test region. It uses land use/cover data from the Corine Land Cover (CLC) database of the EEA (European Environment Agency). The CLC data is obtained from satellite images and is available for Europe in vector format for three different periods: 1990, 2000, and 2006. It uses a pixel size of 30 meters and the smallest mapping unit area is 25 hectares. We use the 1990 and 2006 periods as a toy example to analyze the evolution of habitat connectivity, and we extract five types of uses/covers: urban (urb), industry and commerce (wrk), agriculture (agr), forest (gre) and wetlands (wet). The scale of the data (1:100 000) is more appropriate for global/regional phenomena and, in this case, not so adequate for studying habitat fragmentation. Despite that, we use this data for our toy model because it is readily accessible and enables temporal analysis, and is thus suitable for our test purposes.
Further testing with other types of data is planned.
2) About Geographic Information Systems data integration. GIS are a special class of information systems that keep track of the location of phenomena [7]. The integration of georeferenced information in our computation process uses a specific NetLogo component, the GIS extension. This extension provides the ability to load vector and raster GIS data into NetLogo, thus preserving the geographical integrity of the data.
3) About the computational modeling. The computational modeling combines a multi-agent system and network analysis, using both NetLogo and Gephi. First we model the connectivity graphs between land-use patches based on rules defined from the literature using species-specific parameters. Then we save the results (nodes and edges) with their coordinate system in order to explore them in network analysis software while maintaining the spatial perspective of the data.
4) About the multi-agent system (MAS) and network analysis. The MAS makes it possible to build connectivity graphs between the different land uses. These graphs can then be analyzed through network analysis. The metrics are of two types: 1) global, i.e. representing the whole structure of the network (e.g. diameter, density, etc.), and 2) individual, i.e. representing the centrality and accessibility of each node (for instance the eigenvector, betweenness and closeness centrality metrics).

5) About the validation and calibration of the model.
During the development of the model, some problems were taken into consideration: the scale is too global, so we need to test the model with data at a more detailed scale; the pixel size is too large, so we need to decrease it from 30x30 m to 5x5 m in order to gain accuracy. We are only using one data type, generalized land uses. We need to combine it with other relevant spatial data such as infrastructure networks, urbanization densities, and areas with restrictions on urban development.

Model implementation
The Unicorn model has been implemented in the NetLogo multi-agent system, in four steps.
First, it was necessary to find a way to load the data files from GIS into the NetLogo multi-agent system and to display them properly (without distortion, etc.) on the small patches that constitute the NetLogo world [1]. Then, neighborhood graphs were created between adjacent patches with the same land use. After this step, further connections were made between these neighborhoods, symbolizing all the possible paths between them found according to specific rules based on neighborhood distance. Finally, these connections were saved and exported in a format that can be used by Gephi [2], an open-source software package for network visualization and analysis.

Multi-agent system: NetLogo
NetLogo is a software program and a programming language that is part of the Logo languages family, designed in a process of active learning and constructivist teaching for educational purposes (strong notion of play, testing and experimentation).

It is both:
- An agent-oriented programming language widely used to create, among various other things, virtual worlds in which thousands of entities (agents) can be created and simulated, each with their own processes and their own rules. Because of its Logo ancestry, this language is extremely easy to understand and learn;
- An application that offers a complete and flexible graphical interface. It has multiple tools to visualize and study models (agent monitors, command center, viewing window, plots, etc.). In NetLogo it is possible to start and stop simulations or even directly manipulate some variables throughout the process (using graphical items such as sliders, buttons, menus, etc.).
The NetLogo language does not include many advanced or complex features in its basic use, compared to other multi-agent systems (such as some widely used pedestrian behavior models or graph-creating features), since it must stay easy to learn. However, in addition to the advantages mentioned above, the system has a plethora of useful features that make it an excellent communication tool and that enable designers to focus on implementing their models. Thus, it is possible:
- To export the results, displayed at any time through the graphical interface, as images, movies or CSV (comma-separated values) files, useful for further data analysis in spreadsheet software;
- To extend the language by creating extra functions oneself (written in Java or Scala) or just by using extensions created by other programmers;
- To integrate the whole system into another application;
- To replicate the same simulations or even to run entire experimental designs, when the studied model requires more than one run (such as stochastic models), etc.
These possibilities led NetLogo to become a tool used to program models in many different sectors, such as urban planning, biology, chemistry, social sciences, small games, etc. The implemented models belong to a wide range of complexity, from those which are called "simple" (such as the two-body problem) to much more complex systems (such as the simulation of pedestrian behavior during building evacuation).
Finally, it could be noted that the representation of agents in NetLogo is unusual. There are in fact only four different kinds of agents:


- The turtles are the entities closest to the commonly used definition of agents: moving entities that can change state according to rules;
- The patches are in fact small pieces of the "floor" of the model world. The viewing window is modeled as a grid of patches, each representing a square land area. Turtles are located on patches. As opposed to the mobile turtles, the patches are stationary;
- The links are a type of "linear" agent. They connect two turtles together;
- The observer is a unique agent and is the "master" of the world. It is through the observer that the modeler creates and manipulates other agents. It has no location or visual representation.
In the case of the Unicorn model, all types of agents are used: patches are used to store the land use data from files generated with the GIS application; turtles and links are used to create connectivity graphs, first between adjacent patches of the same land use value, then between non-neighboring patches of the same land use.

Loading land use data into patches
An extension of NetLogo, logically called "GIS extension", can be used to load topological data, raster or vector, contained in files created with a GIS (shape files or ASCII files).
In our case, we imported the land-use GIS data into the model. The data consisted of simple numerical values. The map was rasterized and each patch (or "cell") contained a numeric code representing a particular type of land use. For instance, "111" meant that the current patch was part of an urban area, "121" corresponded to industrial land use, "211" to agriculture, "311" to semi-natural and natural landscape elements (woods, grasslands, etc.), "411" to water bodies (lakes, ponds, etc.) and, finally, negative numbers to unspecified spaces (patches with no data or outside the studied map).
NetLogo automatically matched the raster pixel grid contained in the imported file to the grid formed by the patches. The data was then stored in a patches-own variable. The accuracy of this mapping could be decreased or increased by adjusting the number of patches in the model. If there were fewer patches than values in the source file, each patch would acquire the majority value of the source pixels falling inside its square.
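The majority rule described above can be illustrated outside NetLogo. The following is a minimal Python sketch, not the GIS extension's actual resampling code: the function name `resample_majority`, the integer `factor` and the sample grid are assumptions made for illustration only.

```python
from collections import Counter

def resample_majority(raster, factor):
    """Aggregate a 2D grid of land-use codes into coarser cells.

    Each output cell covers a factor x factor block of source pixels and
    takes the most frequent code in that block (ties go to the code met
    first when scanning the block row by row).
    """
    rows, cols = len(raster), len(raster[0])
    coarse = []
    for r0 in range(0, rows, factor):
        coarse_row = []
        for c0 in range(0, cols, factor):
            block = [raster[r][c]
                     for r in range(r0, min(r0 + factor, rows))
                     for c in range(c0, min(c0 + factor, cols))]
            coarse_row.append(Counter(block).most_common(1)[0][0])
        coarse.append(coarse_row)
    return coarse

# Hypothetical 4x4 source raster reduced to 2x2 patches.
raster = [
    [111, 111, 211, 211],
    [111, 311, 211, 211],
    [311, 311, 411, 411],
    [311, 311, 411, 211],
]
coarse = resample_majority(raster, 2)
print(coarse)
```

A smaller `factor` keeps more of the source detail, which corresponds to increasing the number of patches in the model.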
The translation of these values was then performed in the model: each patch received the land use it represented as a string, and a fill color for visualization (see Figure 1).
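A translation step of this kind can be sketched as a simple lookup table. This is an illustrative Python sketch rather than the NetLogo code of the model: pairing the numeric codes with the abbreviations urb, wrk, agr, gre and wet is an assumption based on the class descriptions in the text, and the color names are hypothetical.

```python
# Assumed pairing of numeric codes with land-use labels and display colors.
LAND_USE = {
    111: ("urb", "red"),     # urban area
    121: ("wrk", "violet"),  # industry and commerce
    211: ("agr", "yellow"),  # agriculture
    311: ("gre", "green"),   # natural and semi-natural elements
    411: ("wet", "blue"),    # water bodies / wetlands
}
NO_DATA = ("none", "black")  # negative or unknown codes

def classify(code):
    """Return (label, color) for one raster code."""
    if code < 0:
        return NO_DATA
    return LAND_USE.get(code, NO_DATA)

codes = [[111, 211, -9999],
         [311, 311, 411]]
labels = [[classify(c)[0] for c in row] for row in codes]
print(labels)
```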

Land use neighborhoods
After each patch received its land use value, it was necessary to classify them in groups belonging to the same area.
An area was defined as a group of patches sharing the same land use value and having at least one of their 4-neighbors (patches with one side in common with the current patch: patches directly to the north, south, to the east and west) following the same rule.
A node (turtle) was created in the center of each patch with a certain land use value. Next, each node was connected (through a link) to the nodes present on the four neighboring patches with the same land use value. Then, we used a propagation algorithm to retrieve the sets of interconnected turtles (see Algorithm 1). The sets of links belonging to each set of turtles were stored in another list, in the same order. All interconnected nodes belonged to one and the same group. At the end of the process, we obtained two lists containing seven items.
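The grouping of patches into areas amounts to finding connected components under the 4-neighbor rule. Since Algorithm 1 is not reproduced here, the sketch below uses a standard breadth-first flood fill as a stand-in; the function name `label_areas` and the small sample grid are assumptions for illustration.

```python
from collections import deque

def label_areas(grid):
    """Group cells into areas: same land use, connected through 4-neighbors.

    Returns a list of (land_use, cells) pairs, where cells is a list of
    (row, col) tuples. Cells holding None are treated as no-data.
    """
    rows, cols = len(grid), len(grid[0])
    seen = set()
    areas = []
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen or grid[r][c] is None:
                continue
            value = grid[r][c]
            queue = deque([(r, c)])
            seen.add((r, c))
            cells = []
            while queue:
                cr, cc = queue.popleft()
                cells.append((cr, cc))
                # Propagate to the north, south, west and east neighbors.
                for nr, nc in ((cr - 1, cc), (cr + 1, cc),
                               (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and (nr, nc) not in seen
                            and grid[nr][nc] == value):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            areas.append((value, cells))
    return areas

grid = [
    ["gre", "gre", "urb"],
    ["agr", "gre", "urb"],
    ["gre", "agr", "agr"],
]
areas = label_areas(grid)
for land_use, cells in areas:
    print(land_use, cells)
```

Note that the "gre" cell in the bottom-left corner forms its own area: it touches the larger "gre" area only diagonally, which the 4-neighbor rule does not count.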

Building connections between same land-use neighborhoods
We considered the distance between neighborhoods with the same land use. The algorithm searched for all existing paths between their patches, up to a certain maximum threshold distance (see scheme 2). Only turtle-to-turtle displacements through links were allowed, so we used the Manhattan distance (also called the taxicab norm) to find the paths, instead of the Euclidean distance. When animals disperse from one patch to another they have a general destination, but they do not move in a straight line. Rather, they exhibit foraging behavior, exploring the area and searching for food on the way. In scheme 2, the Paths(start, ends) procedure was used to compute all paths between a turtle on the edge of an area (the start of the search) and turtles on the edge of another area that were not too far from the first one (ends). A path is a list of turtles. We used a simple graph search algorithm (see Algorithm 3). With these procedures, the paths were found twice: from an area A to an area B and, in the reverse direction, from area B to area A.
Applied to the example shown in Fig. 3 with a maximum distance of 3 patches, we obtained the result illustrated in Fig. 4.
The results were then saved into a file to be read by Gephi. For each area, the model computed and stored the polygon centroid (average of coordinates of the patches) instead of center (average of the coordinates), the number of patches that composed it, its land use value and the year of the studied data. Then, for each connection between areas, the starting and ending areas were stored, together with the number of paths found between the two.
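A bounded path search of this kind can be sketched as follows. This is not the authors' Paths procedure or Algorithm 3, which are not reproduced here: it is a plain depth-first enumeration of simple 4-neighbor paths, and the signature `paths(start, ends, max_dist, rows, cols)` as well as the rule that any cell of the grid may be crossed are assumptions made for illustration.

```python
def manhattan(a, b):
    """Manhattan (taxicab) distance between two (row, col) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def paths(start, ends, max_dist, rows, cols):
    """All simple 4-neighbor paths from start to any cell in ends,
    using at most max_dist steps. A path is a list of (row, col) cells."""
    ends = set(ends)
    found = []
    if not ends:
        return found

    def extend(path):
        cell = path[-1]
        if cell in ends:
            found.append(list(path))
            return
        steps_left = max_dist - (len(path) - 1)
        if steps_left == 0:
            return
        r, c = cell
        for nxt in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            if nxt in path:
                continue  # keep paths simple (no revisits)
            # Prune: after this step, some end must still be reachable.
            if min(manhattan(nxt, e) for e in ends) > steps_left - 1:
                continue
            path.append(nxt)
            extend(path)
            path.pop()

    extend([start])
    return found

# Two routes of length 2 around a 2x2 block.
print(paths((0, 0), [(1, 1)], max_dist=2, rows=2, cols=2))
# A straight corridor: reachable in 3 steps, not in 2.
print(len(paths((0, 0), [(0, 3)], 3, 1, 4)), len(paths((0, 0), [(0, 3)], 2, 1, 4)))
```

Because each step moves to one of the four neighbors, a path of k steps can cover a Manhattan distance of at most k, which is why the threshold is expressed in that metric. Running the search from A to B and again from B to A yields each path twice, as noted in the text.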
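The export step can be sketched with two CSV tables, one for nodes and one for edges. The exact file format used by the model is not stated in the text, so everything here is an assumption: the column names follow the convention commonly used for Gephi's spreadsheet import (Id, Label for nodes; Source, Target, Weight for edges), and the function names and sample data are hypothetical.

```python
import csv
import io

def centroid(cells):
    """Average (row, col) of the cells that make up an area."""
    n = len(cells)
    return (sum(r for r, _ in cells) / n, sum(c for _, c in cells) / n)

def write_nodes(areas, year, out):
    """One row per area: position, size, land use and year."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Id", "Label", "x", "y", "size", "landuse", "year"])
    for i, (land_use, cells) in enumerate(areas):
        row_avg, col_avg = centroid(cells)
        writer.writerow([i, f"{land_use}{i}", col_avg, row_avg,
                         len(cells), land_use, year])

def write_edges(connections, out):
    """One row per connected pair of areas; weight = number of paths."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (a, b), n_paths in sorted(connections.items()):
        writer.writerow([a, b, n_paths])

areas = [("gre", [(0, 0), (0, 1)]), ("gre", [(3, 3)])]
connections = {(0, 1): 4}  # hypothetical: 4 paths found between areas 0 and 1

nodes_out, edges_out = io.StringIO(), io.StringIO()
write_nodes(areas, 1990, nodes_out)
write_edges(connections, edges_out)
print(nodes_out.getvalue())
print(edges_out.getvalue())
```

Keeping the coordinates as node attributes is what lets the network software lay the graph out in its geographic position.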

Network analysis
Complexity dictates the end of reductionism: to comprehend nature we can no longer only decipher its components first, but must instead decipher the relations taking place among them. Nowadays, scientists recognize that nothing happens in isolation and that most events and phenomena are connected to, caused by, and interacting with each other [26].
We have already subscribed to Longley et al.'s phrase that there is something special about spatial, and we have also argued that graph theory and complex network metrics are useful tools not only for network visualization but also for describing and analyzing network structure, its resilience and robustness, as well as the accessibility of its components, and for evaluating and comparing its evolution through time. Taking that into account, we aimed for software that allows us to do spatio-temporal network analysis and to uncover information that has remained hidden under classic linear analysis methodologies, which focus on the phenomena themselves rather than on the dialectic between them. We therefore turned to Gephi, an open-source software package for graph and network analysis that also provides easy and broad access to network data and allows for spatializing, filtering, navigating, manipulating and clustering [28].
The network metrics we consider fall into two groups: global metrics and individual metrics. Global metrics analyze the network as a whole, its connectivity and structure: the density (which tells us how close a graph is to being complete), the diameter (how far apart the two most distant nodes are), modularity (community detection) and the timeline (which allows us to compare the evolution of connectivity). Individual metrics reveal the accessibility of each node and help us better understand the role and importance of the node in the network: the degree of the node or eigenvector centrality (which measures the importance of a node based on its connections), betweenness (which measures how often a node appears on shortest paths between nodes in the network) and closeness (the average distance from a given starting node to all other nodes in the network) [27,28].
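Some of these metrics are simple enough to compute by hand on a small graph. The Python sketch below follows the definitions given in the text for density, diameter, degree and closeness (as an average distance); it is not Gephi's implementation, whose conventions (for example normalization, or the handling of disconnected graphs) may differ. The adjacency-dictionary representation and the four-node example are assumptions.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path length (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def density(adj):
    """Share of possible undirected edges that are present (1.0 = complete)."""
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) / 2
    return 0.0 if n < 2 else 2 * m / (n * (n - 1))

def diameter(adj):
    """Longest shortest path among pairs of nodes that can reach each other."""
    return max(max(bfs_distances(adj, u).values()) for u in adj)

def degree(adj, node):
    return len(adj[node])

def closeness(adj, node):
    """Average distance from node to the other nodes it can reach."""
    dist = bfs_distances(adj, node)
    others = [d for k, d in dist.items() if k != node]
    return sum(others) / len(others) if others else 0.0

# Hypothetical chain of four habitat areas: A - B - C - D.
adj = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
print(density(adj), diameter(adj), degree(adj, "B"), closeness(adj, "A"))
```

In this chain, half of the possible edges are present, and the end node A is on average farther from the rest of the network than the inner node B.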

Results and discussion
The global metrics revealed, for both time periods (1990 and 2006), the presence of poorly connected networks (Table 1). The lower the density and the higher the diameter of the network, the lower the connectivity and, consequently, the more fragmented the landscape, i.e. the more difficult it is for species to travel through patches (average path length). This reality is highlighted by the existence of more isolated communities.
The following figure (Fig. 6) is also very helpful for understanding network evolution through time.
As we can see, from 1990 to 2006 there is a loss of connections between nodes (habitats) in the NW part of the network, which leads to the creation of more sub-graphs and consequently to a loss of connectivity.

Conclusions and further research
We can conclude that our spatio-temporal bottom-up modeling approach agrees with empirical knowledge on habitat analysis, so it can uncover some realities, such as habitat fragmentation in space and species in danger of isolation and extinction under a business-as-usual trend in spatial policy.
However, we consider that the model has to be refined and made more robust in order to better fit reality and to support better spatial planning decisions. The fact that we needed to migrate data between different software packages is a major drawback. Therefore, our further research will consider either creating a user-friendly graphical interface that could run as a plugin for some GIS software, or building our own open-source software based on GIS, multi-agent systems and complex networks.
Also, on a shorter timetable, we are already experimenting with and parameterizing some individual metrics of node accessibility as an upgrade to the connectivity metrics. We consider those metrics helpful for understanding the resilience and robustness of the network, i.e. what could happen to network connectivity if we lose this or that particular node? Which nodes (habitats) lose their connectivity to the network? Which nodes form the core or the periphery of the network? Which nodes belong to a cluster or act as a hub of the network? These are only some of the questions we can already answer, but we believe that, at this research stage of our experimental project, other algorithms can easily be created and tested.
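One way to explore the node-loss question is to remove each node in turn and count how many additional sub-graphs appear. The Python sketch below is only an illustration of that idea, not one of the metrics used in this work; the function names and the small example network are hypothetical.

```python
def components(adj):
    """Connected components of an undirected graph given as {node: neighbors}."""
    seen = set()
    comps = []
    for start in adj:
        if start in seen:
            continue
        stack = [start]
        seen.add(start)
        comp = {start}
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    comp.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps

def remove_node(adj, node):
    """Copy of the graph without node and without the edges touching it."""
    return {u: {v for v in nb if v != node}
            for u, nb in adj.items() if u != node}

def removal_impact(adj):
    """For each node, the number of extra components created by its loss."""
    base = len(components(adj))
    return {n: len(components(remove_node(adj, n))) - base for n in adj}

# Hypothetical network in which habitat B links three others together.
adj = {"A": {"B"}, "B": {"A", "C", "D"}, "C": {"B"}, "D": {"B"}}
impact = removal_impact(adj)
print(impact)
```

Here the loss of B splits the network into three isolated habitats, while the loss of any peripheral node leaves the rest connected.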