A global mapping template for natural and modified habitat across terrestrial Earth



Introduction
Natural systems are being eroded by large scale increases in infrastructure, human land uses and pollution (extensive literature summarised in IPBES (2019)). One of the five key interventions proposed by the recent IPBES Global Assessment (IPBES, 2019) was "taking preemptive and precautionary actions in regulatory and management institutions and businesses to avoid, mitigate and remedy the deterioration of nature, and monitoring their outcomes". To act on this intervention, there is a clear need to support businesses and policy makers in governments to take action on these findings.
One highly influential and widely adopted performance standard for business use is the International Finance Corporation's Performance Standard 6 (IFC PS6) (IFC, 2019), which is used by investors to make decisions on project financing. IFC PS6 has three main objectives that it recognises are fundamental to sustainable development: protecting and conserving biodiversity, maintaining ecosystem services, and sustainably managing living natural resources (see Brauneder et al. (2018) for more information on IFC PS6). To achieve these objectives, IFC PS6 requires projects to identify risks and impacts to Critical Habitat, Natural Habitat and Modified Habitat (with Critical Habitat being a subset of Modified and Natural Habitats). IFC PS6 requires different levels of mitigation action to be implemented depending on the identified habitat(s) a project has the potential to impact (IFC, 2019).
'Critical Habitat' represents areas of high biodiversity value that are of significant importance to threatened, endemic, congregatory and migratory species, threatened or unique ecosystems, and key evolutionary processes (IFC, 2019). Global screening layers for Critical Habitat across the terrestrial and marine realms have already been produced (Brauneder et al., 2018; Martin et al., 2015) and are used by businesses at early assessment stages of projects. There is, however, no global layer for identifying the broad state of habitats (i.e. natural or modified). Landcover datasets provide an idea of the general habitats present but do not include any information on the pressures these habitats face and are blind to their ecological value. Data on species (i.e. the IUCN Red List (IUCN, 2019)), while crucial in some decisions, are not at the right resolution to support all private sector screening (Di Marco et al., 2017), and there is a lack of data on specific ecosystems and habitats globally.
As a result of this lack of data, the reality is that site screening and policy decisions fall back on datasets such as protected areas (WDPA: UNEP-WCMC and IUCN (2020)) and Key Biodiversity Areas (KBAs: BirdLife International (2020)). Inevitably it means that other aspects of biodiversity are ignored, or in the case of businesses, are considered much later in project development. This undermines the application of the precautionary principle (which is fundamental to processes around IFC), which aims to avoid environmental impacts by preventing activities unless there is evidence that there will be little impact. Datasets such as the Critical Habitat layer are useful but fail to give the true impression of the value of biodiversity outside of already identified sensitive areas (e.g. protected areas and KBAs). Many companies are seeking to align with IFC PS6 (Silva et al., 2019), but do not have the screening data necessary to determine how to apply the requirements outside of these sensitive areas, as many 'non-sensitive' places still harbour significant biodiversity values.
Of the total terrestrial area analysed for the Critical Habitat layer, 15.2% was classified as either likely or potential Critical Habitat (Brauneder et al., 2018). This does not mean that the remaining 84.8% of terrestrial Earth has no value. Here we aim to fill this gap with a global terrestrial map that considers habitat 'state' (i.e. natural or modified). It is aligned with IFC PS6, so can be used by businesses for the initial assessment stages of projects screening for Natural and Modified habitats, but can also be useful for national and international policy. It is designed to be used alongside the Critical Habitat layer. A similar classification scheme for datasets was used as in the Critical Habitat analysis, and the global layers are produced at the same spatial resolution. Our intention is that the common alignment and compatibility with both the definitions and requirements of IFC PS6, and the approach taken to produce the global screening layer for Critical Habitat, will increase the utility and likely uptake of our new layer by the private sector.

Natural and modified habitat definition
Here we define habitat using the same definition as IFC, namely a terrestrial geographical unit "that supports assemblages of living organisms and their interactions with the non-living environment" (IFC, 2019). IFC PS6 defines natural habitats as "areas composed of viable assemblages of plant and/or animal species of largely native origin, and/or where human activity has not essentially modified an area's primary ecological functions and species composition". It defines modified habitats as the opposite of this: "areas that may contain a large proportion of plant and/or animal species of non-native origin, and/or where human activity has substantially modified an area's primary ecological functions and species composition". However, in reality a given area will often fall between these two definitions on a continuum that ranges from largely untouched, wilderness areas to intensively managed, human modified habitats (IFC, 2019).
These definitions were used to classify datasets. However, no data on ecological function and species composition are available globally that are both at a fine enough scale and not modelled. Thus, we use data on human pressure as a proxy for the loss of ecological function and species composition.

Data screening and classification
Data screening and classification followed a process very similar to that used for the terrestrial Critical Habitat layer (Brauneder et al., 2018). Relevant spatial datasets were identified and classified through consultation among the authors and other experts based on the following criteria. Datasets were only considered if they: 1) were global in extent; 2) displayed data from within the past decade (regarded as sufficiently recent to inform current and future policy; Joppa et al., 2016); 3) represented the best available/most up to date data for the feature of interest; and 4) were available for use by the private sector.
Selected datasets were classified as supporting screening for either 'likely' or 'potential' Natural Habitat or 'likely' or 'potential' Modified Habitat based on two variables: 1) alignment to the IFC PS6 definitions of Natural/Modified Habitats and 2) spatial resolution of the dataset indicating presence on the ground (i.e. accuracy of the data) (Fig. 1). Datasets with features that aligned strongly with the IFC PS6 definitions of Natural/Modified habitats and had a high spatial resolution (≤1 km or vector data) were classified as supporting screening for likely Natural/Modified Habitat. Datasets with features that aligned strongly with the IFC PS6 definitions of Natural/Modified Habitats but had a lower spatial resolution (> 1 km) or vice versa were classified as supporting screening for potential Natural/Modified Habitat. Where alignment of features to the definitions of Natural/Modified Habitats was less strong and spatial resolution was lower, datasets were not included.
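The two-variable rule described above can be sketched in code. This is a minimal illustration rather than the procedure used in the study: it assumes that alignment with the IFC PS6 definitions can be reduced to a yes/no flag (in practice it was judged through expert consultation), and the function name `screening_class` and its parameters are hypothetical.

```python
from typing import Optional


def screening_class(strong_alignment: bool,
                    resolution_km: Optional[float]) -> Optional[str]:
    """Return 'likely', 'potential', or None (dataset not included).

    resolution_km is the raster cell size in km; None stands for vector data.
    strong_alignment is an assumed yes/no simplification of expert judgement.
    """
    # High spatial resolution: <= 1 km raster, or vector data.
    high_resolution = resolution_km is None or resolution_km <= 1.0

    if strong_alignment and high_resolution:
        return "likely"
    if strong_alignment or high_resolution:
        # Strong alignment with coarse data, or vice versa.
        return "potential"
    # Weaker alignment and coarser resolution: dataset excluded.
    return None
```

Under these assumptions, a polygon dataset with strong alignment would be returned as 'likely', and the same dataset at 5 km resolution as 'potential'.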
The datasets selected are the most up to date data available which align with the IFC PS6 definitions of Natural and Modified Habitat, which is needed to make reliable decisions, but they do not cover the whole land surface. Once combined, the datasets selected covered 62.5% of the global land surface (not including Antarctica). Instead of leaving the remaining areas as 'unknown', which cannot be used for site screening, we filled in these areas using a categorised version of the updated Human Footprint Layer, a cumulative pressure map, which uses the same methods as Venter et al. (2016) but with datasets centred on the year 2013 as opposed to 2009. For these regions a Human Footprint value of < 4 was categorised as likely Natural; 4-6 as potential Natural; 7-9 as potential Modified; and 10 or greater as likely Modified. These categories were based on the experience of the co-authors working with the Human Footprint and previous categorizations of the Human Footprint. A Human Footprint value of less than four is often considered "natural" or "low disturbance" (Mokany et al., 2020; Watson et al., 2016) and 10 or greater as "very high pressure" or "highly modified" (Mokany et al., 2020; Venter et al., 2016); these are our likely Natural and likely Modified categories. A Human Footprint value between 4 and 6 is considered "moderate pressure" and a value between 6 and 10 as "high pressure"; these are our potential Natural and potential Modified categories.
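The Human Footprint thresholds stated above translate directly into a lookup. The sketch below assumes integer Human Footprint scores, since the category bounds are given as whole numbers; the function name `classify_human_footprint` is hypothetical.

```python
def classify_human_footprint(score: int) -> str:
    """Map an integer Human Footprint score to one of the four categories.

    Thresholds follow the text: < 4 likely Natural; 4-6 potential Natural;
    7-9 potential Modified; 10 or greater likely Modified.
    """
    if score < 0:
        raise ValueError("Human Footprint scores are assumed non-negative")
    if score < 4:
        return "likely Natural"
    if score <= 6:
        return "potential Natural"
    if score <= 9:
        return "potential Modified"
    return "likely Modified"
```

This classification applies only to grid cells not covered by any of the selected datasets.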
As we wanted to produce a layer which was as up to date as possible, we did not use this classified version over the entire world to identify Natural and Modified Habitat. The most recent version of the Human Footprint Layer uses data for 2013, whereas all but one of the datasets we selected are more recent than this. We also wanted a layer that could be easily updated when new datasets are published.

Data processing and spatial analysis
Data processing and analysis were undertaken in Google Earth Engine, which means the layer is easily updated when new datasets become available. The data screening process retained data in both raster and vector formats. All datasets were converted to raster layers with the resolution matched to the Critical Habitat layer (~1 km). When converting higher resolution raster datasets and polygon data to a 1 km grid cell size, we assigned a value to a grid cell only if more than 50% of that cell was covered by the dataset being converted. For polyline datasets, any cell a line passed through was allocated a value (see supplementary material Table A1 for details on individual dataset processing).
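The two conversion rules can be illustrated on a small grid. The sketch below is plain Python rather than Google Earth Engine code and is not the script used in the study; it assumes the fine-resolution data have already been reduced to a binary presence grid whose dimensions are an exact multiple of the aggregation factor. The function names are hypothetical.

```python
def _block_sum(fine, row, col, factor):
    """Count the covered fine cells inside one coarse cell."""
    return sum(fine[row + i][col + j]
               for i in range(factor) for j in range(factor))


def aggregate_majority(fine, factor):
    """Coarse cell is 1 only if more than 50% of its fine cells are covered."""
    rows, cols = len(fine), len(fine[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    return [[1 if _block_sum(fine, r, c, factor) / factor ** 2 > 0.5 else 0
             for c in range(0, cols, factor)]
            for r in range(0, rows, factor)]


def aggregate_any(fine, factor):
    """Coarse cell is 1 if any fine cell is covered (rule used for polylines)."""
    rows, cols = len(fine), len(fine[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    return [[1 if _block_sum(fine, r, c, factor) > 0 else 0
             for c in range(0, cols, factor)]
            for r in range(0, rows, factor)]
```

Note that a cell covered by exactly half is not assigned a value under the majority rule, because the threshold is strictly greater than 50%.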
The final layer is a composite of the underlying data layers combined following the precautionary principle. First, all data layers for each category were merged, resulting in four binary layers: likely Natural, potential Natural, potential Modified, and likely Modified. These four layers were then combined in a hierarchical order, with likely grid cells being retained over potential grid cells. Likely Natural and likely Modified grid cells that overlapped were given a potential Natural value. Potential Natural grid cells were retained over potential Modified grid cells (Fig. 2). This method of combining layers ensures the final screening layer follows the precautionary principle, by first relying on better quality data and otherwise retaining Natural values where there is disagreement between a Natural and Modified dataset for a given grid cell. This effort ensured that Natural Habitat was not falsely identified as Modified.
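The hierarchy can be written as a per-cell decision rule. The following sketch reflects one reading of the order described above and is not the authors' script: it takes the four binary layer values for a single grid cell and returns the retained category, or `None` where no dataset applies. The function name `combine_cell` is hypothetical.

```python
from typing import Optional


def combine_cell(likely_natural: bool, potential_natural: bool,
                 potential_modified: bool,
                 likely_modified: bool) -> Optional[str]:
    """Combine four binary layer values for one grid cell."""
    # Conflicting high-quality evidence is resolved towards Natural,
    # but downgraded from likely to potential.
    if likely_natural and likely_modified:
        return "potential Natural"
    # Likely cells are retained over potential cells.
    if likely_natural:
        return "likely Natural"
    if likely_modified:
        return "likely Modified"
    # Among potential cells, Natural is retained over Modified.
    if potential_natural:
        return "potential Natural"
    if potential_modified:
        return "potential Modified"
    # No dataset covers the cell.
    return None
```

Cells that return `None` here correspond to the areas that were subsequently filled using the classified Human Footprint.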
This combination of underlying data layers results in a final raster layer with each grid cell classified into one of the four categories: likely Natural, potential Natural, potential Modified, and likely Modified. A given grid cell receives its category for one of three reasons: 1) more than half the cell is covered by an underlying dataset that has been classified as that category; 2) more than one underlying dataset, classified as different categories, covered more than half the cell, and the highest ranking category was retained (as detailed above); or 3) no datasets covered more than half the cell, meaning the classified Human Footprint was used.

Validation
Validation was undertaken using the same methods as the Earth's remaining Low human Impact Areas (LIA) dataset (Jacobson et al., 2019). This involves using existing global validation data from the Human Footprint Layer. These validation points were produced by visually interpreting human pressures in 3114 1 km² plots using high-resolution satellite imagery. The imagery had a median resolution of 0.5 m and a median acquisition year of 2010. The visual interpretation resulted in a visual score which we used to validate our layer. A visual score of less than one was classified as Natural (low impact) and one or more as Modified (high impact), the same threshold used by Venter et al. (2016) and Jacobson et al. (2019).
We performed this validation on the final layer (version 1) which includes the Human Footprint Layer, as well as on a version that covers 62.5% of the global land surface which excludes the Human Footprint Layer (version 2). Of the 3114 validation points, 30 fell within NoData areas of version 1 which resulted in 3084 validation points being used for this version. As version 2 does not cover the entire land surface, 1255 points fell within NoData areas, leaving 1859 validation points for this version. We calculated the overall accuracy (the percentage of validation points that were correctly classified) and the Cohen kappa statistic for each version. The kappa statistic measures the agreement between the screening layers and the validation points, taking into account expected agreement by chance (Viera and Garrett, 2005).
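Both statistics can be computed from paired lists of labels. The sketch below uses the standard definition of Cohen's kappa, (observed agreement − chance agreement) / (1 − chance agreement), together with the visual score threshold given above. It assumes the map categories have already been collapsed to the same Natural/Modified labels as the validation points; the function names are hypothetical.

```python
from collections import Counter


def label_from_visual_score(score):
    """Visual score below one is Natural; one or more is Modified."""
    return "Natural" if score < 1 else "Modified"


def accuracy_and_kappa(mapped, reference):
    """Return (overall accuracy, Cohen's kappa) for paired label lists."""
    if not mapped or len(mapped) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(mapped)
    # Observed agreement: share of points where map and reference match.
    observed = sum(m == r for m, r in zip(mapped, reference)) / n
    # Chance agreement from the marginal label frequencies.
    mapped_counts = Counter(mapped)
    reference_counts = Counter(reference)
    expected = sum(mapped_counts[c] * reference_counts[c]
                   for c in mapped_counts) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa
```

Points falling in NoData areas would need to be dropped from both lists before calling the function, as was done for the 30 and 1255 points noted above.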

Selected datasets
A total of 11 datasets were selected (Table 1) from a total of 24 that were reviewed for their suitability (see supplementary material Table A2). Of these 11, five related to Natural Habitats and six to Modified Habitats.

Global coverage of natural and modified habitat
Of the total global terrestrial area (excluding Antarctica and waterbodies), 36.7% is classified as likely Natural, 24.9% as potential Natural, 16.6% as potential Modified, and 21.8% as likely Modified (Fig. 3). This means that there is the possibility that 61.6% of the global terrestrial habitat remains in a natural state according to the interpretation within IFC PS6.

Validation
The overall accuracy of the screening layer is 77%, and this increases to 83% when excluding the Human Footprint Layer. When only considering the likely Natural and likely Modified pixels, the user accuracy indicates a high level of accuracy at 91% and 87% respectively. This is lower when only considering the potential Natural and potential Modified pixels at 63% and 55% respectively. The Kappa statistic for the version of the screening layer that does not include the Human Footprint Layer was 0.655, also indicating a good agreement with the validation points. When you only consider the likely Natural and likely Modified pixels (1139 validation points), this rises to 0.902. The final version which includes the Human Footprint Layer has a slightly worse agreement with the validation points, with a Kappa statistic of 0.526, but this still indicates agreement. This also rises when you only consider the likely Natural and likely Modified pixels (1818 validation points) to 0.779. Considering only potential Natural and potential Modified pixels, the Kappa statistic drops dramatically for both the versions without and with the Human Footprint Layer to 0.293 and 0.175 respectively (see supplementary material Table A3).
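User accuracy for a map category is, by its usual definition, the share of validation points mapped as that category whose reference label agrees. A minimal sketch follows, assuming each of the four map categories is compared against the binary Natural/Modified reference label it implies; the function name and arguments are hypothetical.

```python
def user_accuracy(mapped, reference, category, expected_reference):
    """Share of points mapped as `category` whose reference label matches.

    mapped: map category per validation point (e.g. 'likely Natural').
    reference: reference label per point ('Natural' or 'Modified').
    expected_reference: the reference label that counts as correct
    for `category` (an assumption made explicit by the caller).
    """
    hits = 0
    total = 0
    for map_label, ref_label in zip(mapped, reference):
        if map_label == category:
            total += 1
            if ref_label == expected_reference:
                hits += 1
    if total == 0:
        raise ValueError("no validation points fall in this category")
    return hits / total
```

For example, `user_accuracy(mapped, reference, "likely Natural", "Natural")` gives the proportion of likely Natural validation points that were visually interpreted as Natural.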

Discussion
Here we present a novel approach to classifying global habitat state to inform private and financial sector decision making. By assigning each grid cell to one of four categories, this layer is in a format that is usable for non-mapping specialists to make decisions around the state of habitats, and is compatible with existing approaches to screening for Critical Habitat (Brauneder et al., 2018; Martin et al., 2015). Other efforts that use cumulative mapping approaches, such as Venter et al. (2016) and Kennedy et al. (2019), map human pressures, but do not identify the state of habitats. While these are important for quantifying human pressures on the landscape and what this means for conservation interventions (Allan et al., 2017a, 2017b; Jones et al., 2018) or species vulnerability assessments (Di Marco et al., 2018), they are not suitable for businesses and other decision makers for identifying Natural Habitat. Given that natural habitats may still contain some form of human pressure, using human pressure scores on their own, without classifying these pressures, may overlook areas of natural habitat. For business decision making, this could lead to incorrect screening of new sites, potentially weakening safeguards, and ultimately resulting in project delays or impacts to biodiversity.
A global layer of habitat state, which is aligned with the definitions of Natural and Modified Habitats according to IFC PS6, can be used by businesses in the early stages of project development, by highlighting areas of potential or likely Natural and Modified Habitat. It can be used at a landscape scale, due to the resolution and precision of the underlying data. It does not remove the need for more detailed ground surveys at a site level, but provides an overview of the state of habitat in the surrounding area.
Although the IFC PS6 definition of Natural and Modified Habitat is based on ecological functions and species composition, suitable data on this is not available globally. For this reason, data on human pressure and habitat is used as a proxy for the loss and intactness of ecological functions and species composition. It is important to note that this screening layer may overestimate the amount of remaining Natural Habitat for two reasons. The first is that we took a precautionary approach when designating pixels a Natural or Modified value. Where there was disagreement between a Natural and Modified dataset for a given pixel, the precautionary approach was to designate it as a Natural pixel (depending on whether the datasets were classified as likely or potential).
The second is that not all aspects of human modification could be included because of data limitations. A prime example is hunting, which is a major cause of biodiversity loss (Maxwell et al., 2016) and therefore has large impacts on the ecological function and species composition of habitats, but for which no data are available globally. The buffers we used around roads are a good proxy for hunting in some habitats, such as forests, but the impacts of hunting will vary based on terrain and may extend further in non-forested areas (Wu et al., 2017). In addition, as the Human Footprint Layer is a pressure map that includes the indirect effects of access into natural areas, it does have a relationship with human pressures such as hunting and the introduction of invasive species.
Of the areas classified as likely or potential Natural in our screening layer, some areas may not be intact in terms of ecological function and species composition due to hunting and other anthropogenic pressures for which data are not available. For example, two thirds of Intact Forest Landscapes overlap with an area where a species has gone extinct in the past 500 years (Plumptre et al., 2019). And around 9% of tropical Intact Forest Landscapes and 11% of tropical Wilderness areas have lost at least 10% of their mammal abundance due to hunting (Benítez-López et al., 2019). When only considering large-bodied mammal assemblages, these figures go up to over 50% (Benítez-López et al., 2019).
Having said this, it does not mean that these areas do not still contain biodiversity values which would meet the definition of natural according to IFC PS6. They can still be important for conservation and could even be returned to an intact state with effective management or reintroductions. As this is primarily a screening layer for businesses, it is essential to retain areas that have lost some degree of ecological function and species composition as Natural Habitat for screening purposes. This is because in reality naturalness is a continuum with most places on earth no longer in a completely natural state. This aligns with IFC PS6 guidance which states that "natural habitats are not to be interpreted as untouched or pristine habitats. It is likely that the majority of habitats designated as natural will have undergone some degree of historical or recent anthropogenic impact" (IFC, 2019). Operating in these areas could also still cause significant adverse effects to biodiversity. There may seem to be a discrepancy between our findings and those of the IPBES Global Assessment, which stated that 75% of the terrestrial environment has been altered by human actions (IPBES, 2019). However, as we have mentioned above, including only pristine lands in our Natural categories would not align with the IFC PS6 guidance and could cause near-natural areas of habitat to be overlooked.

J. Gosling, et al. Biological Conservation 250 (2020) 108674

Table 1
Likely
Intact forest landscapes (Potapov et al., 2008) 2016 Polygon Likely They are defined as "a seamless mosaic of forests and associated natural treeless ecosystems that exhibit no remotely detected signs of human activity or habitat fragmentation and are large enough to maintain all native biological diversity, including viable populations of wide-ranging species" (Potapov et al., 2017).
Likely
Potential natural
Global Distribution of Saltmarshes (Mcowen et al., 2017) 1973-2015 Polygon Likely This dataset displays the extent of saltmarshes globally, however the presence of saltmarsh does not necessarily mean that it is in a natural state. For this reason it has been classified as potential Natural.
Potential
Global Mangrove Watch (Bunting et al., 2018) 2016 Polygon Likely This dataset displays the extent of mangroves globally, however the presence of mangrove does not necessarily mean that it is in a natural state. For this reason it has been classified as potential Natural.
Likely
GRIP (Global Roads Inventory Project) major roads (Meijer et al., 2018) 1997-2015 Polyline Likely Roads are a large driver of habitat conversion and fragmentation, mortality, and also provide access for hunting and other nature uses (Ibisch et al., 2016; Laurance et al., 2009; Taylor and Goldingay, 2010; Trombulak and Frissell, 2000).
Likely
Hansen Global Forest Change (forest loss) (Hansen et al., 2013) 2018 30 m Likely Forest loss has major impacts on wildlife, hydrology and climate (Laurance et al., 2000).
Likely
OpenStreetMap quarries
1997-2015 Polyline Likely Minor roads may have less of an impact in terms of mortality and fragmentation, but can still cause considerable modification (Goosem, 2007; Taylor and Goldingay, 2010). Railways can cause habitat modification and fragmentation, but their impacts differ from those of roads. Passengers rarely disembark in places other than rail stations, meaning they do not provide access to the habitat they cross.
Potential
As with all approaches, our layer does not answer every question that the private sector should be considering. Using it alongside the Critical Habitat layer will help to answer many questions which might arise during project or portfolio screening, however there will still be aspects that need consideration at more local scales. For example, examining biodiversity distinctiveness, level of threat, and levels of habitat connectivity will need more local oriented datasets. Processes such as Environmental Impact Assessments and Biodiversity Action Plans should be tailored to these more local needs, although a key constraint on their ability to resolve detailed, localised questions is a lack of available data (UN Environment, 2018).
Although we used the best available datasets, no global dataset is perfect, and this may lead to some inaccuracies in our layer. For example, global road datasets are not able to map every road present in the world. They may miss unofficial or unplanned roads particularly in relatively natural areas such as the Amazon and Congo basins (Meijer et al., 2018), or could miss areas that have not been mapped well (Hughes, 2017). Although we used the best available global roads dataset, which combines national datasets with crowdsourced OpenStreetMap data (Meijer et al., 2018), there will still be roads missing and therefore some areas may be falsely identified as likely or potential Natural in our layer.
Pasture lands are another human pressure that is not mapped well globally, and which we therefore did not include in our analysis. Although they cover a large proportion of the Earth's land surface (Ramankutty et al., 2008), their impact in different areas can vary greatly. For example, pasture lands in many parts of Europe are generally highly modified areas, whereas pasture land in many areas of Africa can still be fairly natural. This may also mean that some of these areas are falsely identified as likely or potential Natural in our layer. But we note that these limitations are still in keeping with the precautionary principle. There will be some variation of habitat state within each grid cell, but we have tried to minimise this with the thresholds we have chosen. When converting polygon datasets to raster, and when resampling finer raster datasets to 1 km, we required that more than half the 1 km grid cell be covered by the dataset for it to be assigned a value. When finer scale datasets become available, we will be able to produce a layer at a higher resolution.

Other similar layers
The Three Global Conditions for Biodiversity Conservation and Sustainable Use (3Cs) map (Locke et al., 2019) is in some aspects similar to the screening layer we present here. However, it classifies everything that is not large wild areas or cities and farms into a single shared lands group. Large wild areas are clearly natural, and cities and farms are modified, but shared lands (which covers 55.7% of the global land surface) are too broad a categorization for private sector decision making. It is also designed for a very different use. The 3Cs map is intended to provide a framework for actions by countries to address global targets for the Post-2020 Strategic Plan for Biodiversity. It is not a habitat state map, nor is it proposed to be used by businesses to screen for Natural and Modified Habitat.
The Earth's remaining Low human Impact Areas (LIA) dataset (Jacobson et al., 2019) is also similar, as it maps areas that have had low impacts from humans. However, it is not aligned with IFC PS6 so is not ideal for use by the private sector. For example, the impact of human population and livestock density will vary between ecosystems, which the LIA dataset does try to account for. But other factors, such as how strictly laws are enforced, will also cause differences in impacts from human population and livestock density, which are not accounted for. Therefore, the inclusion of human population and livestock density may result in areas being falsely identified as modified, which doesn't follow the precautionary principle. Nor does it align with the IFC PS6 guidance for areas to be "considered a natural habitat regardless of some degree of degradation and/or the presence of some invasive alien species, secondary forest, human habitation, or other human-induced alteration" (IFC, 2019). The Human Footprint Layer also includes human population density and pastures, but the thresholds we have chosen to classify it into the four categories allow there to be a certain amount of degradation in Natural areas.
The Biodiversity Intactness Index (BII) (Newbold et al., 2016) is based on species abundance and composition at sites across varying land use and land use intensities. It provides an intactness value between 0 and 100% for each grid cell. However, it would be difficult to align BII with the IFC PS6 definition of Natural Habitat, as a threshold of intactness would have to be chosen above which a grid cell is said to be 'natural' and below which it is 'modified'.

Validation
The validation indicates that the likely Natural and likely Modified categories are very accurate and the potential categories are less so. This is exactly what we would expect as the data underlying these categories were selected, in part, due to their accuracy. This highlights the importance of treating this as a screening layer, with more detailed ground surveys needed, especially in potential areas. It is also important to note that the validation points have a median acquisition year of 2010, whereas the datasets underlying our screening layer have a median year of 2016. Therefore, it is likely that some of the validation points classed as natural were no longer natural in 2016. As this layer is designed to take into account the precautionary principle and still has a strong overall accuracy and Kappa statistic value, we do not consider this an issue.

Use of the screening layer in business decisions
This layer is intended to be used as part of a larger screening exercise. It can be used to support and direct more detailed assessments. A screening process to identify potential impacts to biodiversity and ecosystem services must be undertaken by companies applying PS6. This "may take the form of an initial desktop analysis and literature review, including a review of regional studies and assessments, and the use of global or regional screening tools" (IFC, 2019). It is in this early stage where a screening layer for Natural and Modified Habitat can be used to indicate the presence of likely and potential Natural and Modified Habitat.
The layer is designed to support five main use cases: 1) compliance with IFC PS6, in particular augmenting the existing Critical Habitat screening layer to provide a more holistic and complete early screening of projects and investment opportunities at the landscape scale; 2) decisions on the location of new operations for companies who are not in receipt of funding from IFC or the Equator Principles Financial Institutions but are looking to adopt an international good practice approach; 3) portfolio-level analysis of existing operations by companies or financial institutions to understand the scale of their presence in natural or modified habitats; 4) supply chain analysis of sourcing regions to understand indirect impact on natural and modified habitats; and 5) supporting action by businesses to protect and enhance existing biodiversity values and to contribute to appropriate habitat restoration.
This globally consistent screening layer will fill gaps in areas that data typically used for screening, such as protected areas and Key Biodiversity Areas, do not cover. With protected areas data covering just 15.1% of terrestrial surface area (UNEP-WCMC and IUCN, 2020), and Key Biodiversity Areas currently only covering 8.8% (unpublished data), there are significant gaps and weaknesses in screening approaches which cannot otherwise differentiate the rest of the world. The screening layer will also support more globally consistent and representative analysis of exposure of portfolios and supply chains, which can similarly suffer from overreliance on protected areas and Key Biodiversity Areas.

Conclusion
Here we provide a methodology for combining a number of global datasets to identify areas of Natural and Modified Habitat. This methodology is aligned to the standards (IFC PS6) that companies are complying with. The map we have produced, which is freely available for anyone to use, is much needed, as currently businesses do not have a way of identifying Natural and Modified Habitats on a global scale.
It is vital that companies have the most up to date data available for accurate decision making. The scripts we have produced, which run on Google Earth Engine, are easily adapted to include updated and additional datasets, as well as producing outputs at higher resolutions. This means that this screening layer can continue evolving as new data becomes available, allowing it to stay up to date. We are not suggesting that our layer should be used on its own to make decisions, and we emphasise that it does not remove the need for ground surveys. However, using this layer alongside the Critical Habitat layer will provide insights into both the state and value of habitat. It will give companies a much clearer idea of habitat in the early development stages of projects, potentially saving them time and money.
The Natural and Modified Habitat Screening Layer can be viewed and downloaded at https://doi.org/10.34892/4q5v-gf37.