Environmental heterogeneity and sampling relevance areas in an Atlantic forest endemism region



Introduction
The lack of studies on species distribution and the large number of undescribed taxa have hampered biodiversity conservation worldwide (Hortal et al., 2008, 2015). Paradoxically, investments in biodiversity characterization have been higher in temperate habitats than in the tropical regions that concentrate the main global biodiversity hotspots (Collen et al., 2008). Further, within the tropics, available information can be highly geographically biased: site accessibility, distance from research institutes, and the proximity of protected areas to human settlements determine which areas are best surveyed (Sastre and Lobo, 2009). The negative consequences of uneven sampling include: (i) inefficient design of conservation units, (ii) the lack of parameterization for predictive distribution models, and (iii) a reduced probability of describing new taxa, many of which will become extinct before being known to science (Bini et al., 2006; Brito, 2010; Hortal et al., 2008, 2015; Pontes et al., 2016). Given the limited funding for conservation and the growing rates of biodiversity loss, optimizing sampling efforts is urgently needed, especially in the developing countries that retain much of the world's biodiversity.
An efficient way to mitigate spatial survey bias is to incorporate regional habitat heterogeneity into sampling design (Funk et al., 2005). This approach relies on the assumption that environmentally distinct areas may harbor communities with different species composition. Thus, areas that are environmentally distinct from those already studied may be more likely to have new species (Schmidt et al., 2020). Including environmental gradients is also important because it allows the whole set of conditions in which target species can occur to be investigated, improving the performance of predictive models and of biodiversity mapping (Hortal et al., 2008, 2015). Landscape features and bioclimatic variables have been recognized as biodiversity surrogates (Lindenmayer et al., 2008; Williams et al., 2002), and they can be useful to represent the environmental heterogeneity necessary to identify areas of high species survey relevance. In a recent example, Schmidt et al. (2020) identified poorly sampled, environmentally distinct areas for Amazon forest ant communities using environmental maps of soil, temperature, and precipitation. Although this approach has some limitations, such as its dependence on the availability of environmental data that influence species communities, it can be particularly useful for planning species inventories in areas that are environmentally heterogeneous and poorly studied.
The Atlantic forest is a biodiversity hotspot that has been severely impacted by habitat loss and fragmentation (Ribeiro et al., 2009). Over the past millions of years it has been repeatedly connected to and disconnected from the Amazon, the main forest formation in South America, and it is presently isolated in eastern South America by a diagonal of dry formations composed of the Cerrado, Caatinga and Chaco (Silva and Casteleti, 2003). Repeated connections and disconnections with other biomes, altitudinal and latitudinal gradients, and isolation resulted in a unique biota composed of more than 20,000 species of plants, 321 species of mammals, 861 species of birds, 300 species of reptiles, and 625 species of amphibians (Monteiro-Filho and Conte, 2017; Silva and Casteleti, 2003). The great environmental heterogeneity of the Atlantic forest (i.e. variations in relief and rainfall regimes) also contributed to the high species diversity and levels of endemism (Tabarelli et al., 2010). This biome has been subdivided into five centers of endemism, based on the distribution of butterflies, birds, and mammals: Brejos Nordestinos, Diamantina, Pernambuco, Bahia, and Serra do Mar (Silva and Casteleti, 2003).
The Pernambuco Endemism Center (PEC) is the portion of the Atlantic forest located in northeastern Brazil, north of the São Francisco River, distributed across the states of Alagoas, Pernambuco, Paraíba, and Rio Grande do Norte. Of the five centers of endemism, PEC is the most fragmented, which together with its remarkably high species richness has led it to be considered a hotspot within a hotspot (Pontes et al., 2016). In this region, less than 6% of the original forest cover remains and large continuous forest fragments no longer exist: only 23 fragments exceed 1000 ha (Pontes et al., 2016), and none is larger than 10,000 ha (Ribeiro et al., 2009). Unlike other Atlantic forest regions, PEC is characterized by a low percentage of protected areas (only about 1%; Ribeiro et al., 2009) and by a limited number of public conservation units. This highlights the importance of private conservation units, designated by Brazilian legislation as Private Reserves of Natural Heritage (hereafter RPPNs). The RPPNs in PEC are small conservation units, but they are more abundant and more evenly distributed across the landscape than the public conservation units and, therefore, have great potential to maintain species in this fragmented landscape.
Despite the high rates of habitat fragmentation and local species extinctions, new and endemic species are still being described in PEC (Peixoto et al., 2003; Pontes et al., 2013; Silva et al., 2004), and others have been considered extinct even before their scientific description (Pontes et al., 2016). Thus, the identification of areas of high species survey relevance is urgently needed. Here, we characterized the environmental heterogeneity of the Pernambuco Endemism Center in terms of vegetation, soil, drainage density, altitude, and climatic variables. Then, we assessed whether private reserves are preserving the environmental heterogeneity of the PEC region and how fragmented the landscapes around these reserves are. Finally, we identified the most relevant regions in PEC for sampling vertebrates in terms of environmental dissimilarity. These results will help to plan new species surveys that support conservation actions.

Environmental heterogeneity in PEC areas
To evaluate the environmental heterogeneity in the PEC, we used variables related to climate, soil type, altitude, drainage density, land use, and vegetation type. The limits of the PEC area were taken from Ribeiro et al. (2009) (Fig. 1). For climatic variables, we retrieved data (annual mean temperature, maximum annual temperature, minimum annual temperature, annual precipitation, precipitation of wettest quarter, and precipitation of driest quarter) from WorldClim (Fick and Hijmans, 2017). Data for soil type, altitude, drainage density and vegetation types were taken from the AmbData repository (http://www.dpi.inpe.br/Ambdata/) and land use maps from MapBiomas 2018 version 4.1 (Souza et al., 2020). All these variables were transformed into raster layers with a spatial resolution of 30 arc-seconds (approximately 1 km²). The number of cells (approximately 1 km² each) in each class or value of each environmental variable in the PEC area was counted using the function values from the raster R package (Hijmans and Etten, 2012).
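The cell-counting step can be illustrated with a minimal sketch. It is written in plain Python rather than with the raster R package used in the study, and the small grid, the class names, and the helper functions `count_cells` and `class_proportions` are hypothetical; it only shows the idea of tabulating how many valid cells fall in each class of a categorical layer.

```python
from collections import Counter

# Hypothetical 3 x 4 categorical layer (vegetation classes).
# None marks cells outside the study area (NoData).
vegetation = [
    ["semideciduous", "semideciduous", "open_ombrophilous", None],
    ["semideciduous", "dense_ombrophilous", "open_ombrophilous", None],
    [None, "dense_ombrophilous", "semideciduous", "transition"],
]


def count_cells(grid):
    """Count cells per class, skipping NoData cells."""
    return Counter(v for row in grid for v in row if v is not None)


def class_proportions(grid):
    """Return the share of valid cells that falls in each class."""
    counts = count_cells(grid)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


counts = count_cells(vegetation)
proportions = class_proportions(vegetation)
```

With cells of roughly 1 km², the counts can be read directly as approximate areas per class.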

Private reserves
To evaluate how much of the environmental heterogeneity has been preserved in private reserves, we first searched for all federal and state private reserves of natural heritage (RPPNs) located in the PEC area (Fig. 1). The coordinates of private reserves were taken from the repositories of federal and state environmental agencies (Tables S1 and S2). Then, we extracted the environmental variables at the private reserve coordinates using the function values from the raster R package (Hijmans and Etten, 2012). In addition, we estimated the percentage of land use classes (forest formation, pasture, agriculture, annual and perennial crop, mosaic of agriculture and pasture) and the isolation degree within a 2 km buffer around each private reserve coordinate. In this case, we used the land use map from MapBiomas (see above), which originally has a spatial resolution of 30 m. Isolation degree was estimated as the mean Euclidean nearest-neighbor distance among patches of the forest land use class, using the LecoS plugin (https://github.com/Martin-Jung/LecoS) for QGIS (www.qgis.org).
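The two landscape metrics described above can be sketched as follows. This is an illustration in plain Python, not the LecoS implementation. The function names and the example coordinates are hypothetical, and coordinates are assumed to be projected (in metres). Each forest patch is also reduced to a single centroid point, which simplifies the patch-based nearest-neighbor distances a landscape-metrics tool would normally compute.

```python
import math
from collections import Counter


def land_use_percentages(cells, center, radius_m):
    """Percentage of each land-use class among the cells whose centre
    lies within radius_m of the reserve coordinate."""
    cx, cy = center
    inside = [cls for (x, y, cls) in cells
              if math.hypot(x - cx, y - cy) <= radius_m]
    if not inside:
        return {}
    counts = Counter(inside)
    return {cls: 100.0 * n / len(inside) for cls, n in counts.items()}


def mean_nearest_neighbor_distance(points):
    """Mean Euclidean distance from each point to its nearest other point."""
    if len(points) < 2:
        return float("nan")
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(min(math.hypot(xi - xj, yi - yj)
                           for j, (xj, yj) in enumerate(points) if j != i))
    return sum(nearest) / len(nearest)


# Hypothetical land-use cells (x, y, class) around a reserve at (0, 0).
cells = [
    (0.0, 0.0, "forest"),
    (1000.0, 0.0, "pasture"),
    (0.0, 1500.0, "pasture"),
    (-500.0, -500.0, "agriculture"),
    (3000.0, 0.0, "forest"),  # outside the 2 km buffer
]
percentages = land_use_percentages(cells, center=(0.0, 0.0), radius_m=2000.0)

# Hypothetical forest patch centroids used for the isolation estimate.
forest_patches = [(0.0, 0.0), (3000.0, 0.0), (0.0, 4000.0)]
isolation = mean_nearest_neighbor_distance(forest_patches)
```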

Maps of sampling relevance
We estimated sampling relevance in the PEC, that is, how relevant each cell of a 1 km² grid covering the PEC is for further survey studies. Sampling relevance was based on the environmental dissimilarity of each 1 km² cell in relation to sites already sampled in the literature, and the result is a raster containing a gradient of sampling relevance for the PEC. To evaluate sampling relevance, we first searched for studies of terrestrial vertebrates in the region and then estimated the sampling relevance of each 1 km² cell of the PEC following Schmidt et al. (2020). We used the data available in data papers for amphibians, birds, mammals, and camera traps (Bovendorp et al., 2017; Culot et al., 2019; Hasui et al., 2018; Lima et al., 2017; Muylaert et al., 2017; Souza et al., 2019; Vancine et al., 2018) (Fig. 1). These are the most complete datasets published so far and include published (peer-reviewed papers, books, chapters, theses, technical documentation, and scientific conferences) and unpublished data. The authors of these datasets searched for data in the following sources: (i) online academic databases (e.g., ISI Web of Knowledge, Google Scholar, Scielo, Scopus, JSTOR), (ii) digital libraries of state and federal Brazilian universities, (iii) references cited in the literature, and (iv) email contacts with experts and organizations that have conducted studies with vertebrate groups. In addition, these datasets were compiled by experts in each taxonomic group and all data were checked for correct taxonomy. Considering the whole Atlantic forest distribution, the data papers account for 1163 sites for amphibians, 4122 for birds, 205 for bats, 700 for primates, 300 for small mammals, 244 for medium and large-sized mammals, and 144 for camera trap studies. Camera traps comprise mainly records of medium and large mammals, and a few opportunistic records of birds, bats, primates, and small mammals.
Camera traps have become a major advance for monitoring terrestrial mammals in biodiversity-rich ecosystems because they allow the recording of species that are otherwise difficult to observe and detect (Lima et al., 2017). Small mammals include marsupials and small rodents (i.e. families Caviidae, Cricetidae, Ctenomyidae, Echimyidae, and Sciuridae; Bovendorp et al., 2017). Medium and large-sized mammals include non-volant terrestrial mammal species over 1 kg. Unfortunately, no data paper published so far covers reptile and fish communities in PEC areas; thus, these vertebrate groups were not included in our analysis. We did not use occurrence data from GBIF because of high rates of error in the coordinates and incomplete inventories of species occupying a survey location (Troia and McManamay, 2016).
Based on the coordinates provided by the studies performed with terrestrial vertebrates (hereafter, survey sites), we assessed the relevance of further terrestrial vertebrate sampling in PEC. The sampling relevance was estimated as the environmental dissimilarity between each cell of the 1 km² grid covering the PEC and the survey sites, considering eight uncorrelated environmental variables at once: vegetation type, soil type, altitude, drainage density, maximum annual temperature, minimum annual temperature, precipitation of wettest quarter and precipitation of driest quarter. Uncorrelated environmental variables were selected by calculating the variance inflation factor (VIF) for all environmental variables and excluding the highly correlated ones from the set through a stepwise procedure. Continuous variables were first standardized by z-score using the R function scale. Then, for each 1 km² cell, we calculated the environmental dissimilarity between the cell and each survey site using Gower distance (Legendre and Legendre, 2012). Next, these dissimilarity values were averaged to obtain a single value of sampling relevance for each 1 km² cell. The environmental dissimilarity was estimated using the function vegdist from the vegan R package (Dixon, 2003), which calculates a single environmental dissimilarity among sites based on several environmental variables. We chose Gower distance because it is appropriate for measuring dissimilarities between two sites described by mixed numeric and non-numeric data. Finally, we normalized all values from 0 to 1 by Min-Max normalization (Patro and Sahu, 2015), such that values close to 1 represent areas environmentally different from areas where groups of vertebrates have already been sampled.
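The three steps described above (Gower dissimilarity between each cell and every survey site, averaging over sites, and Min-Max normalization) can be sketched as follows. This is a minimal plain-Python illustration, not the vegan-based R workflow used in the study. The function names, variable names, and example values are hypothetical. The sketch also assumes that the range of each numeric variable is taken over the pooled cells and survey sites, and that each categorical variable contributes 0 when two records match and 1 otherwise.

```python
def variable_ranges(records, numeric_keys, categorical_keys):
    """Range (max - min) of each numeric variable; None flags a categorical one."""
    ranges = {}
    for key in numeric_keys:
        values = [r[key] for r in records]
        ranges[key] = max(values) - min(values)
    for key in categorical_keys:
        ranges[key] = None
    return ranges


def gower_distance(a, b, ranges):
    """Gower dissimilarity: mean per-variable difference, each in [0, 1]."""
    total = 0.0
    for key, rng in ranges.items():
        if rng is None:                  # categorical: simple mismatch
            total += 0.0 if a[key] == b[key] else 1.0
        elif rng > 0:                    # numeric: range-scaled difference
            total += abs(a[key] - b[key]) / rng
        # rng == 0 means the variable is constant and contributes 0
    return total / len(ranges)


def sampling_relevance(cells, sites, numeric_keys, categorical_keys):
    """Mean Gower distance from each cell to all sites, rescaled to [0, 1]."""
    ranges = variable_ranges(cells + sites, numeric_keys, categorical_keys)
    mean_d = [sum(gower_distance(c, s, ranges) for s in sites) / len(sites)
              for c in cells]
    lo, hi = min(mean_d), max(mean_d)
    if hi == lo:                         # no variation among cells
        return [0.0 for _ in mean_d]
    return [(d - lo) / (hi - lo) for d in mean_d]


# Hypothetical grid cells and survey sites described by three variables.
cells = [
    {"vegetation": "semideciduous", "altitude": 100.0, "precip_driest": 150.0},
    {"vegetation": "semideciduous", "altitude": 150.0, "precip_driest": 120.0},
    {"vegetation": "transition", "altitude": 600.0, "precip_driest": 20.0},
]
sites = [
    {"vegetation": "semideciduous", "altitude": 100.0, "precip_driest": 150.0},
    {"vegetation": "open_ombrophilous", "altitude": 200.0, "precip_driest": 200.0},
]
relevance = sampling_relevance(
    cells, sites,
    numeric_keys=["altitude", "precip_driest"],
    categorical_keys=["vegetation"],
)
```

In this toy example the dry, high-altitude transition cell differs most from the surveyed sites and receives a relevance of 1, while the cell that matches a surveyed site exactly receives 0. Because each numeric difference is divided by the variable's range, the result of this sketch does not change if the numeric variables are z-scored beforehand.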

Environmental heterogeneity
Most of the PEC areas are characterized by seasonal semideciduous forest, open ombrophilous forest and dense ombrophilous forest (Fig. S1A). The other types of forest are transition zones between steppe and savanna vegetation or zones of marine influence (Fig. S1A). The PEC areas present mainly yellow oxisol and red-yellow argisol soils, which are characterized by low fertility (Fig. S1B). Pastures and croplands predominate, with less than 15% of the pixels consisting of forests, savanna and mangrove (Fig. S1C). The region has high heterogeneity in drainage (Fig. S1D) and most areas are up to 200 m in altitude (Fig. S1E). The annual mean temperature varies from 21 to 27 °C (Fig. S2A), with a maximum temperature of 32 °C (Fig. S2B) and a minimum temperature of 15 °C (Fig. S2C). Among sites, a maximum of 7 °C of temperature variation was observed. Annual precipitation varies widely, from 500 mm to 2145 mm (Fig. S2D). The precipitation in the wettest quarter varies from 300 mm to 1000 mm (Fig. S2E), and in the driest quarter from 20 mm to 200 mm (Fig. S2F).
In general, the private reserves preserve high environmental heterogeneity. There are private reserves in all main forest formations (semideciduous, open ombrophilous, and dense ombrophilous forest), and also in the transition zones between vegetation formations (Fig. S1A). The proportions of vegetation and soil types, drainage density, altitude, and climatic variables in private reserves followed the same proportions found for the whole PEC area (Figs. S1 and S2). However, many soil formations that occur in low proportion throughout the PEC region are not present in the private reserves (Fig. S1B). Most private reserves are in highly fragmented landscapes (forest formation below 50% and at least 1000 m to the nearest forest fragment) surrounded by pasture and agricultural fields (Fig. S3).

Sampling relevance
Except for bats and birds, which were mainly surveyed in ombrophilous forest, most of the vertebrate surveys were carried out in seasonal semideciduous forests and in low altitude areas (Figs. S4-S17). Notably, no surveys were conducted in the driest areas (Figs. S4-S17). For most vertebrate groups, the westernmost portion of the PEC presents the highest sampling relevance in terms of environmental dissimilarity (Figs. 2 and 3). This area is mainly in seasonal semideciduous forest and in transition zones between this type of forest and steppe vegetation. In the case of large mammals, sites of highest sampling relevance extend to all portions of the PEC, except for the south-central portion (Fig. 2). The coastal and northwest regions of PEC present high sampling relevance (>0.75) for a maximum of two vertebrate groups, usually for terrestrial mammals or non-volant mammals (Fig. 3). Correlations among the sampling relevance values showed that bats have similar patterns to amphibians, birds and primates, and that medium and large mammals have the most distinct pattern (Fig. 3A). Terrestrial mammals or non-volant mammals are the vertebrate groups presenting more areas of high sampling relevance, while primates, bats, amphibians, and birds are the ones with more areas of low sampling relevance (Figs. 2 and S18). Most sites of high sampling relevance (>0.75) are in fragmented areas, with an average forest cover of 8% and isolation of 1500 m (Fig. 4). The sampling relevance of the private reserves is presented in Table S2.
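The count of vertebrate groups exceeding the 0.75 threshold in each cell can be illustrated with a short sketch. The group names, the relevance values, and the function `count_high_relevance_groups` are hypothetical, and the sketch assumes that each group's relevance values are stored in the same cell order.

```python
def count_high_relevance_groups(relevance_by_group, threshold=0.75):
    """For each cell, count the groups whose relevance exceeds the threshold."""
    n_cells = len(next(iter(relevance_by_group.values())))
    return [
        sum(1 for values in relevance_by_group.values() if values[i] > threshold)
        for i in range(n_cells)
    ]


# Hypothetical normalized sampling relevance for three cells and three groups.
relevance_by_group = {
    "amphibians": [0.20, 0.80, 0.50],
    "birds": [0.10, 0.90, 0.76],
    "large_mammals": [0.80, 0.95, 0.75],
}
groups_per_cell = count_high_relevance_groups(relevance_by_group)
```

Here the comparison is strict, so a value of exactly 0.75 is not counted as high relevance.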

Discussion
The Pernambuco Endemism Center shows high environmental heterogeneity, mainly in relation to forest and soil types, drainage density and levels of precipitation, while temperature and altitude vary only slightly in this region. In general, private reserves preserve the environmental heterogeneity found in the PEC; however, they are in landscapes dominated by an agriculture and pasture matrix in which natural vegetation is very fragmented and isolated. Few sites have been surveyed in the PEC, and mammals are, in general, the least studied vertebrate group. Because of the high environmental heterogeneity, we found many sites of high sampling relevance for all vertebrate groups, but in general, the western region of the PEC presents the highest sampling relevance in terms of environmental dissimilarity. For all vertebrate groups, the sites with the highest sampling relevance are threatened by fragmentation, and sampling efforts must be allocated to these areas before they are completely converted into agricultural fields and pasturelands.
PEC represents the narrowest Atlantic forest region in terms of longitude and shares extensive borders in the west with the driest Brazilian biome, the Caatinga, and in the east with the Atlantic Ocean. This causes the PEC to present a wide range of precipitation with low temperature variation. Precipitation is one of the most important selective pressures for species diversification worldwide, because different physiological adaptations are needed, especially for species surviving in harsh dry environments (Dewar and Richard, 2007; Irl et al., 2015). This hypothesis still needs to be tested for the PEC, which can be done using landscape genomics tools (Carvalho et al., 2020). Private reserves are in areas with different precipitation regimes; thus, if the above idea is applicable to the PEC, these areas can be crucial to preserve species and populations adapted to different environmental conditions.
Private reserves are the main areas for biodiversity protection in the PEC. Although we have shown that the private reserves maintain areas with high environmental heterogeneity, they are in isolated and fragmented landscapes. For example, we showed that most private reserves are isolated by at least 1 km from other forest fragments, and they are placed in landscapes with less than 30% of forest cover. Many studies have shown that more than 30% of forest cover is needed to maintain species richness in degraded landscapes because species loss is more dramatic below this threshold level (Banks-Leite et al., 2014; Muylaert et al., 2016). In addition, the isolation of the remaining populations can increase inbreeding rates, leading to genetic erosion and compromising the health of the populations in the long term. Thus, the main protected areas in the PEC may not be sufficient to protect all species in this region, and more conservation effort is needed to encourage the creation of more private reserves. Moreover, population genetic studies are urgently needed to assess the conservation status of the remaining populations and, when necessary, promote genetic management to increase their genetic diversity.
Amphibians, primates, and birds are the vertebrate groups with the most sampled sites in PEC, and new species have still been described recently for these groups (Peixoto et al., 2003; Silva et al., 2004). This indicates that, if more sites with known data deficiency are sampled, more species are likely to be discovered in this region (Bini et al., 2006; Brito, 2010). Small and large mammals, on the other hand, were the least studied vertebrates in terms of number of study sites, and it has been estimated that at least half of them have gone locally extinct in the PEC (Pontes et al., 2016). In addition to mammals, several birds have already become extinct or are threatened with extinction in this region (Pereira et al., 2014). Thus, to prevent more species from becoming extinct, it is necessary to know where these species still occur. Few studies were performed in the driest regions (low precipitation), which comprise the westernmost region of the PEC. These uneven records, in addition to preventing new species from being discovered, can lead to errors in species distribution maps and impair management plans, mainly because most of these maps are based on climatic data (Hortal et al., 2008, 2015). Finally, most of the sites of highest sampling relevance are in very isolated and fragmented areas, which indicates the urgency of studying these areas to prevent species from becoming extinct even before they are discovered.
In conclusion, PEC is one of the least studied regions in the Atlantic forest biome, and the characterization of environmental variation showed that this region urgently needs to be studied. Because the survey studies in the PEC are spatially biased, additional surveys are necessary to improve the spatial and environmental coverage of the region. These additional surveys can help to improve ecological niche modeling, which can be used to propose areas of potential relevance for conservation. Moreover, based on these additional surveys, it will be possible to assess the importance of the private reserves for conservation. This type of study is not yet possible in the PEC because of the few species surveys in this region. Surveying species and collecting data in the field, however, are expensive and time-consuming endeavors. Thus, efforts must be made to use funds allocated to this task in the most efficient manner. Here we identified the regions and environments with high sampling relevance based on their environmental dissimilarity to sites already sampled. For this task, we used vertebrate groups, which are among the most studied taxa worldwide. Despite that, many regions in the PEC still need to be studied to generate a database useful for conservation decisions and management planning. Our findings highlight the importance of the existing private reserves in the PEC, and are potentially helpful to improve the design of new conservation units, boost the performance of predictive distribution models, and increase the probability of describing new taxa in an important endemism area within the Atlantic forest.

Conflict of interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.