Deforestation-induced surface warming is influenced by the fragmentation and spatial extent of forest loss in Maritime Southeast Asia

Deforestation in the tropics causes warming which contributes to regional climate change. Forest loss occurs over a broad range of spatial scales, producing a variety of spatial patterns of cleared and forested land. Whether the spatial attributes of these patterns influence the resulting temperature change remains largely unknown. We adopted a differences-in-differences approach to analyse remotely-sensed forest loss and land surface temperature (LST) data in maritime Southeast Asia. We found that deforestation increased LST, as expected, but that the temperature increases were smaller when forest loss produced more fragmented landscapes in which non-forest and forest edges were heavily interlaced. Temperature increases were greater where the forest loss was more extensive. Warming also extended beyond the location of forest removal, so that forest loss increased temperatures in undisturbed locations up to 6 km away. Different spatial patterns of land clearing, for example, as might be produced by small-holder agriculture as opposed to large-scale deforestation, would therefore have different impacts on the local climate. Conserving forests within 4 km of farmland, urban areas or other sensitive environments may help to avoid temperature increases that reduce land productivity and worsen human health.


Introduction
Among the many negative environmental consequences of tropical deforestation (Laurance 2004, Barlow et al 2007, Foley et al 2011, Gibson et al 2011, Hansen et al 2013, increases in temperature pose immediate threats to crop productivity, food security Roberts 2009, Schlenker andLobell 2010), and human health (Abram et al 2014, Wolff et al 2018. Forest conversion to other land uses in the tropics decreases evapotranspiration, increases albedo, reduces aerodynamic roughness and thus vertical mixing of air, leading to warming relative to undisturbed forest (Lawton et al 2001, Ray et al 2006  . Typically, this warming increases daytime land surface temperature (LST) by 1 • C-2 • C. For example, 50% tropical forest loss in a 5 × 5 km area has been estimated to cause average LST to increase by 1.53 • C ± 0.08 • C (Alkama and Cescatti 2016) and 1.08 • C ± 0.25 • C (Prevedello et al 2019). Pantropical, site-level (<1 ha) temperature data show that local LSTs are consistently higher outside of primary forest regions, with particularly pronounced warming following conversion of forest to agricultural land (minimum +1.6 • C, maximum +13.6 • C; Senior et al 2017). For example, a study in Sabah, Malaysia, found primary forest was up to 2.5 • C cooler than selectively logged production forest and up to 6.5 • C cooler than oil palm plantations (Hardwick et al 2015).
There remains, however, considerable local variation in the temperature increase due to forest loss. This variation is influenced by the land cover replacing forest (Bright et al 2017), the condition of the remaining forest (Blonder et al 2018), its elevation (Zeng et al 2018), proximity to water bodies (e.g. Cai et al 2018), and the spatial pattern of forest cover (e.g. edge effects arising from landscape fragmentation; Mendes and Prevedello 2020). Identifying which factors modify temperature change following forest loss would help inform forest management and conservation policies (Cohn et al 2019, Masuda et al 2019. Here we focus on four spatial properties of forest loss conceptually shown in figure 2 and discussed further below. Specifically, we ask whether temperature increase following loss is sensitive to how much forest remains (remaining forest cover), and how fragmented it is. We measure the degree of forest fragmentation as the density of forest/non-forest edges following forest loss (edge density). We consider also how extensive forest loss is-for instance, whether losing 40% forest over a radius of 1 km causes the same warming on (spatial) average as losing 40% of forest cover over a largere.g. 4 km-radius. Finally, we ask whether the effects of warming can propagate beyond the site of forest loss and influence the temperature of neighbouring areas.
These spatial properties have not formed the subject of focused research to date. For example, most prior studies estimating the temperature effect of forest loss (e.g. Li et al 2015, Alkama and Cescatti 2016, Cohn et al 2019, Prevedello et al 2019 did not control for variations in the remaining forest cover fraction following forest loss. This approach assumes, for instance, that the temperature effect of reducing forest cover from 100% to 50% is identical to that of reducing forest cover from 50% to 0%. Similarly, previous studies did not control for differences in the fragmentation of the remaining forest. Yet areas within 100 m of forest edges are typically hotter, drier, and subject to more variable thermal conditions than forest interiors (Williams-Linera 1990, Matlack 1993, Young and Mitchell 1994, Chen et al 1995, Murcia 1995, Pohlman et al 2007, Yan et al 2007, Didham and Ewers 2014, Magnago et al 2015, Latimer and Zuckerberg 2017, Bernaschini et al 2019. Temperature changes observed at larger scales than this must reflect the aggregate effect of these edge-interior gradients, and should therefore be sensitive to the spatial pattern of forest fragmentation (Arroyo-Rodríguez et al 2017, Mendes and Prevedello 2020). For example, Mendes and Prevedello (2020) found that fragmentation of a single forest patch into 100 patches in a 5 × 5 km pixel (without reducing overall forest cover) reduced mean daily LST by 0.26 • C (Mendes and Prevedello 2020). While numerous measures of landscape heterogeneity could be considered (Cale and Hobbs 1994, Weibull et al 2000, Wu et al 2000, Fahrig et al 2011, Fauset et al 2017, most correlate with the length of forest 'edges' , which therefore formed our metric of fragmentation (see figure 2(A)). Similarly, most previous studies evaluate the temperature impacts of forest loss at only one spatial scale of resolution-e.g. on a 5 × 5 km grid (Alkama andCescatti 2016, Prevedello et al 2019). This could be problematic, since other local climate responses to deforestation, such as precipitation, behave differently across different spatial scales Vandecar 2015, Khanna et al 2017). In a recent study, Zeppetello et al (2020) found more extreme warming when forest loss patches were between 33 and 100 km 2 in area than in forest loss on smaller spatial scales, suggesting a need to better quantify how the spatial extent of forest loss influences temperature change. Finally, several studies indicate that warming due to land cover change (Bassett et al 2016, Cosgrove andBerkelhammer 2018, Cohn et al 2019) may spread beyond the location where that change occurred. For example, Cohn et al (2019) observed significant effects on temperature due to forest losses occurring at sites up to 50 km away in the Brazilian cerrado. Nonlocal temperature increases imply that conserving remaining forest can 'cool' adjacent areas, providing a potentially important rationale for forest conservation. Yet to date, only Cohn et al (2019) has characterised these nonlocal effects in systems experiencing deforestation.
This study is therefore structured around three analyses. The first addresses the sensitivity of forest loss-induced temperature changes to (i) the remaining forest cover fraction and (ii) the density of forest edges associated with it (Research Topic 1). The second addresses the sensitivity of forest loss-induced temperature changes to variation in the spatial extent over which forest loss occurs (Research Topic 2). The last addresses whether locations can warm due to forest loss occurring in nearby but spatially distinct areas (Research Topic 3).
Two main challenges arise in undertaking these analyses. Firstly, the temperature change caused by forest loss must be separated from other potential drivers of temperature variation, such as variability in the El Niño-Southern Oscillation, Indian Ocean Dipole, or Madden-Julian oscillation (Trenberth et al 2002, Abood et al 2015, Islam et al 2018. To isolate the effects of forest loss on temperature, we used a 'differences in differences' (DiD) methodology. In a DiD analysis, a treatment or exposure occurs for one group (the 'treatment' group) across the study time period, while another group remains unaffected (the 'control' group) (Angrist and Pischke 2008). The difference in outcome relative to a baseline condition is assessed for each group; and the differences in these differences between groups measure the effect of the treatment. The DiD approach enables a robust attribution of temperature change to deforestation (the treatment variable) by controlling for temperature change in sites where no deforestation occurs (the control variable). Past global studies, such as Alkama and Cescatti (2016)    pixel that has experienced forest loss may contain a complex spatial pattern of remaining forest cover (dark green), and non-forest areas (pale green). This spatial pattern is summarised in this analysis based on its 'remaining forest cover' FC (given by the proportion of the dark green areas within the 1 km assessment pixel), and by its 'edge density' η (given by the fraction of the pixel that lies at a dark-pale green interface). (B) The same fractional forest loss shown over two different spatial extents, denoted by their radii R, with yellow-shaded areas representing forest loss areas and unshaded areas showing remaining forest cover. The box at the center of the circle corresponds to the 1 km pixel shown in panel (A). (C) Temperature effects of forest loss could also be experienced in neighbouring areas. To unambiguously assess this, we consider temperature changes within pixels that do not experience forest loss as a function of their distance R in to the nearest forest loss area (in annuli between R in and Rout). relied on DiD approaches to isolate the effects of forest loss on temperature change. The other challenge in the analysis is accounting for the high correlation between different spatial properties of forest loss (e.g. see figure 3), as well as the different values these properties adopt through time. To isolate the effects of the variable of interest on the temperature response to forest loss, we relied on careful sampling and balancing of the datasets used for analysis.
The study was conducted using the Maritime Continent (MC), comprised of Brunei, Indonesia, Malaysia, the Philippines, Singapore, and East Timor as a case study region. Figure 1 shows a map of the study area, overlaid with forest loss from 2001 to 2019, forest gain from 2001 to 2012, and remaining forest cover in 2019. The MC has experienced rapid forest loss in recent decades due to logging, establishment of palm oil plantations, and small-scale clearing (Gaveau et al 2016, Austin et al 2017a, 2017b. The climate and temperature effects of forest loss in the region have received less attention in regional modeling and remote sensing studies compared to, for example, the Amazon (notable studies are Mabuchi 2011, Tölle et al 2017, Takahashi et al 2017, Chen et al 2019, Tölle 2020. The MC offers, however, a broad range of contexts and spatial scales over which forest loss has occurred (e.g. from small holdings to large scale plantations; Austin et al 2017a), providing an opportunity to assess their effects on temperature change. The data and analyses were all conducted on 1 × 1 km spatial scales, reflecting the resolution of the MODIS LST product, which formed the response variable in the study.

Study area
The six nations and numerous islands comprising the MC are climatically and ecologically similar. The MC lies along the equator; its temperature is moderated by the surrounding oceans and experiences limited seasonality (Sari et al 2007, Gosling et al 2011. The well-developed regional monsoon climate means that rainfall is the main seasonal climatic feature (Cavendish 2008). The wet season occurs between October and May and the dry between June and September (McBride 1998, Chang et al 2005, Bowman et al 2010. The precise timing of the wet season varies in space and from year to year, depending on the migration of the Indo-Australian Monsoon. In 2000, forest cover in the MC represented 5.6% of the global forest area. At that time, 74% of Indonesia, 62% of Malaysia and 92% of Papua New Guinea were naturally forested (Hansen et al 2013). Extensive forest loss has occurred across the region since 2000 (Brookfield and Byron 1990, Curran et al 2004, 2016: from 2000 to 2019, Indonesia lost 26.8 Mha (17% of year 2000 forest cover) and Malaysia 8.12 Mha (28% of year 2000 forest cover). Other MC countries are less impacted; for example, Papua New Guinea has lost 1.49 Mha (3.5%) and the Philippines lost 1.23 Mha (6.6%) of forest cover since 2000.

Datasets
Remote sensing observations of LST were used to characterise temperature change due to forest loss. Human and agricultural outcomes are more closely related to near surface air temperature (AT). However, AT cannot be determined for the parts of the MC undergoing forest loss, due to the limited number of weather stations in forested areas in the region (De Frenne and Verheyen 2016). Thus, satelliteobservable LST provides a consistent measure of temperature response. LST closely correlates with AT (Li et al 2016, Heft-Neal et al 2017, and is the temperature measure used in many previous forest loss studies (Peng et  Similar to other contemporary deforestation studies, we measured temperature and its changes using MODIS satellite data (Alkama and Cescatti 2016, Prevedello et al 2019, Zeppetello et al 2020. Specifically, we used the 1 × 1 km 8-day MODIS Terra LST dataset (MOD11A), which spans the period from March 2000 to present (Wan et al 2015). Each pixel in the 8-day LST observations is an average of all the corresponding LST measurements collected within that 8 day period, made at the 10:30 AM overpass time. We repeated the analysis with the MODIS Aqua satellite, which differs from Terra primarily in its overpass time of 1:30 PM. Similar results to those obtained with Aqua were obtained, and are included in the supplementary information (available online at stacks.iop.org/ERL/16/114018/ mmedia). The raw MODIS datasets contain observations made with zenith angles between 0 and 65 degrees which can produce uncertainty in the spatial footprint (Townshend et al 2000, Campagnolo andMontano 2014). To minimise such uncertainty, only observations flagged as 'good quality' and with view zenith angles between −10% and 10% were used.
Forest cover was obtained from the Landsatderived global forest cover (GFC) dataset (Hansen et al 2013), which presents, at 30 m spatial resolution, global data for (1) percent forest cover in the year 2000, including all natural or planted vegetation greater than 5 m in height, and (2) the year during which a stand-replacing disturbance occurred, if any. Following a stand-replacing disturbance, the forest cover in the 30 m pixel is set to zero. We converted the GFC dataset into yearly presence-absence data: for a given year, pixels were defined as 'forest' if they contained >75% forest cover in 2000 (indicating forests of high structural integrity; Hansen et al 2020), and had not undergone a stand-replacing disturbance. From the presence-absence data, we defined locations that represented forest 'edges' , consisting of nonforest pixels adjacent to forest pixels (see figure 2(A)). These edge pixels were identified with the Google Earth Engine Canny edge detection algorithm (Canny 1986). As a metric of landscape fragmentation, 'edge density' measures the fraction of landscape made up of forest boundaries.
The resulting MODIS and GFC-derived datasets have different spatial and temporal resolutions. To enable inter-comparison of these datasets, we aggregated the 8-day MODIS data by computing the annual (calendar year) average LST, and upscaled the binary GFC presence-absence and edge data to the 1 × 1 km MODIS grid. For each 1 × 1 km MODIS pixel and year, the forest cover fraction was computed as the average number of 'forest' 30 × 30 m pixels it contained. Similarly, the 'edge density' within each 1 × 1 km pixel was computed as the fraction of 30 × 30 m pixels that were classified as forest edges. Finally, forest loss in a given year was computed as the fraction of forest 30 × 30 m pixels experiencing a stand-replacing disturbance in that MODIS pixel (sensu Hansen et al 2013) and year.
Thus, following our initial manipulation of the remote sensing products, we obtained four data layers for the MC at annual, 1 × 1 km resolution in MODIS projection: land surface temperature LST(t), forest cover FC(t), forest loss L(t) and edge density η(t) for each year t in the dataset.
The layers describing forest cover and loss-FC(t), L(t) and η(t)-covary within the dataset, as illustrated in figure 3. The scatter plots in figure 3 illustrate this covariation using 10 000 randomly sampled points, showing L versus FC (panel (A)), L versus η (panel (B)), and FC versus η (panel (C)). The marginal distributions of each covariate are also shown, aligned with the relevant axis.

Identifying temperature changes attributable to forest loss
To construct a differences-in-differences metric, we first considered the fractional forest loss L(t loss , j) occurring in a 1 km MODIS pixel j, for a given year t loss , with t loss ranging from 2002-2019. Due to this forest loss, the temperature in the following year, LST(t loss + 1, j), should be different than it would have been had no forest loss occurred. By contrast, nearby pixels that were unaffected by forest loss will also have different annual mean temperatures in t loss + 1 compared to previous years, but this will be attributable to climatic variation between the years (caused, for example, by the monsoon progression or sea surface temperature anomalies). The difference in the temperature change experienced in these sites and the deforested sites measures the temperature change due to forest loss.
We thus compute the temperature change over time at sites affected by forest loss ∆LST(t loss , j), between years t loss + 1 and t loss − 1. We did not use data from year t loss , because the timing of forest loss within year t loss is not identified in the GFC dataset, and t loss + 1 therefore offers a less ambiguous time from which to evaluate change: For a given t loss , we computed this metric for all pixels located within 10 km of any pixel in which 50% or more forest loss occurred in t loss . These regions contained a range of clearing and smaller scale Figure 4. Illustration of the criteria used to designate 'assessment' and 'control' areas. The assessment area-shown in pale yellow-includes pixels that experienced ⩾50% forest loss in a given year, and all pixels within 10 km of those pixels. Pale green shows undisturbed forest areas that could provide a control for the assessment area, and dark green shows the control area for a specific pixel j. Control pixels are defined as described in section 2.3 and table 1. The differences-in-differences metric ξ is computed as the difference between ∆LST(t loss , j) in pixel j and ∆LST averaged over the relevant control pixels. disturbance (i.e. selective logging or road construction at the frontiers of forest loss; Gaveau et al 2014, and are referred to as 'assessment areas' . We removed from the assessment areas all pixels identified as having more than 1% surface water occurrence in the Joint Research Centre Global Surface Water (Pekel et al 2016) product, restricting the assessment pixels to those with minimal surface water.
Next, for each pixel j in the assessment areas, control pixels were designated that were both near enough to experience the same background climate conditions and far enough to be unaffected by forest loss. Control pixels were selected to have high forest cover, minimal local forest loss between the years 2000 and t loss + 1, minimal forest loss within a 10 km radius, and to lie within 25 km of the pixel they controlled for. This process is illustrated schematically in figure 4, and the quantitative criteria used to Table 1. Criteria used in the data acquisition, filtering and processing steps.

Criteria Reasoning
Data acquisition: assessment criteria All pixels within 10 km of pixels with L(t loss ) ⩾ 0.5 Select pixels that experience significant loss in t loss , and all pixels within a 10 km neighborhood of these ⩾0.5 loss pixels. Data acquisition: control criteria FC > 0.9 Select pixels with high forest cover.
Select pixels with minimal loss.
Select pixels with minimal loss within 10 km. Distance from assessment pixel is <25 km Select control and assessment pixels that experience similar climate conditions. Data: filtering 10 km minimum distance between pixels Minimise spatial autocorrelation.
Research question 1 filter L(t loss ) > 0.1 Select pixels that experienced forest loss in the year t loss . L(t loss ± 1) < 0.02 Select pixels with minimal loss in t loss ± 1. Research topic 2 filter L R−10 < 0.05 Select pixels with minimal forest loss outside of the radius of interest. L R (t loss ± 1) < 0.05 Select pixels with minimal forest loss in years prior to or following t loss . Research topic 3 filter L R in −Rout in range 0.1-0.2 Select pixels with 'significant' neighborhood forest loss, i.e. in annuli between R in and Rout. L < 0.02 and L R in < 0.02 Select pixels with minimal loss both locally and within radius R in , to isolate the effects of loss occurring at R > R in .
Select pixels with minimal loss in t loss ± 1. FC > 0.9 Select pixels with high local forest cover.
specify the control pixels are specified in table 1. For each pixel j and loss year t loss , applying these criteria produced a variable set of control pixels, k(t loss , j). ∆LST(t loss , k(t loss , j)) was computed and spatially averaged for each set of control pixels k(t loss , j), giving ⟨∆LST(t loss , k(t loss , j))⟩. This average was subtracted from ∆LST(t loss , j) to form the DiD estimate of temperature change due to forest loss at pixel j in year t loss (see figure 4): To address the second research topic (whether the spatial extent of loss influences the spatially averaged temperature change due to loss), we also computed spatial averages of ξ over circles with radius R ranging from 1 to 10 km, designating these averages as ξ R (t loss , j).
Forest gain was not included in the analysis, as the GFC dataset includes forest gain for the 2001-2012 period, but does not cover the period after 2012. To minimise the influence of forest gain or regrowth on temperature change, the analysis was limited to temperature change in the year following that in which the considered loss occurred. This makes two reasonable assumptions: (1) that in the year immediately after forest loss, any forest regrowth is minimal, and (2) that any change in vegetation within pixels not experiencing forest loss is too small, on one-year timescales, to influence the 1 km LST.

Covariates: factors modulating temperature changes due to forest loss
We additionally produced data layers defining the forest loss and edge density over different radii R surrounding pixel j in year t loss : L R (t loss , j) and η R (t loss , j). These were computed by defining concentric circles, with radii R = 1, 2, 4, 6, 8 and 10 km, centered on each pixel j. Within each circle and for each loss year, the spatially averaged forest loss, L R , and edge density, η R , were computed (see figure 2(B)). For example, L 10 (t loss , j) is the mean forest loss occurring within 10 km of pixel j. These radially averaged layers are used in the analysis of spatial extent (Research Topic 2).
The radially averaged layers were also used to produce spatial averages over annuli of fixed width (1 and 2 km), denoted by their inner and outer radii, R in and R out : L R in −Rout (t loss , j). The average forest loss was computed within annuli with inner and outer radii corresponding to sequential values of R, by taking the areally-weighted difference in average forest loss between these circles. For example, L 2−4 (t loss , j) represents the mean forest loss in year t loss within a 2-4 km annulus centered on pixel j. These annular averages are used in the analysis of nonlocal warming These covariates were computed for the years t loss − 1, t loss and t loss + 1, for all pixels in the assessment area for year t loss . For clarity, the index j is not written out beyond this point, and the year is only specified if it is different from t loss (e.g. for t loss ± 1).

Methods
A hypothesis testing approach was used to assess the sensitivity of ξ to FC, η, L R and L R in −Rout . Pixels from all years were pooled to form a single dataset, and the analysis of change was centered in time around the relevant t loss for each pixel, as illustrated in figure 5. The differences in differences approach controls for interannual climate variability in figure 5. That is, prior to aligning the data around t loss , the differences in LST between years in the control areas (i.e. temperature changes due to interannual climatic variability) were subtracted from the differences in the target pixels. This minimises the effects of interannual climate variability, allowing loss years to be aggregated as shown in figure 5.
Hypothesis tests were specifically related to differences in the mean value of the DiD metric-written asξ, where the overbar denotes the mean across all samples-between subsets of the data characterised by differences in the spatial properties of forest loss. To address the covariation illustrated in figure 3, we adopted a sampling approach to balance the distributions of covariates prior to each hypothesis test. We also required at least 10 km distance between samples, to minimise spatial autocorrelation.

Research topic 1: how spatial forest characteristics modulate the temperature effects of forest loss
In this analysis, we ran hypothesis tests that compared ξ between pixels with high and low values of FC or η. These groups of pixels are referred to as 'partitions' . However, to understand the effects of FC or η on ξ, the structure and correlation between L, FC and η revealed in figure 3 needs to be be addressed. To do so, once the data was partitioned-for example, with respect to FC-we further sampled within each partition to enforce similarity in the other spatial covariates-i.e. L and η.
The details of the sampling and balancing process are illustrated in figure 6, using the example of determining how FC affects ξ. Firstly, quantiles of FC were used to divide the data into 5 different FC partitions (figure 6(A)). Figure 6(A) shows the comparison between P 1 , the partition containing the 0-20th FC percentile range, and P 5 , the partition containing the 80-100th FC percentile range. Next, the distributions of forest loss L were compared between FC partitions (figure 6(B)). L was distributed differently in each FC partition. To have balanced data for the analysis, we sub-sampled the data to ensure that the partitions had similar distributions of L: we sorted the data in each partition into bins defined by L, and sub-sampled the same number of pixels per bin between partitions. This resulted in the same number of pixels per bin in each partition, as illustrated in figure 6(C). Enforcing this similarity in the L distributions enabled us to control for the effects of L when comparing the mean ξ between FC partitions, that is, to understand the 'real-world' outcomes produced when FC is altered. This approach does not, however, separate the effects of concurrent changes in FC and η on ξ, nor does it distinguish between η and FC as the driver of the observed temperature changes. As FC is more observable than η, and the correlation between FC and η is ubiquitous and unavoidable (see figure 3), it is nevertheless useful to understand the sensitivity of the temperature responses to FC and η jointly in the dataset.
To attribute changes in ξ to FC and η independently of each other, we altered the balancing step so that the binning was done in two dimensions. To isolate the effects of FC on ξ, the partitions were balanced with respect to both L and η; likewise, to isolate the effects of η on ξ, the partitions were balanced with respect to both L and FC. The mean difference between partitions, ∆ξ, was then computed from the balanced data as: where h and l subscripts denote the high and low FC partitions, and the overbar indicates the mean of ξ Figure 6. Illustration of the approach used to address Research Question 1, using the covariate FC as an example. Panel (A) shows the distribution of FC with vertical lines indicating FC quantiles, which were used to delineate five partitions in the dataset, P1-P5. In this illustration, the hypothesis test will compare partitions Q1 and Q5. Panel (B) shows the corresponding distributions of forest loss L. These panels show that the distributions of L differ within the tested partitions, a difference that could bias the results of the hypothesis test. The subsets are therefore sub-sampled to 'balance' L between them, resulting in similar L distributions (panel (C)). The distributions of FC for the balanced subsets (panel (D)) are similar to those of the unbalanced data, but now contain similar distributions of loss. Finally, panel (E) illustrates the response variable, the mean difference ∆ξ between high and low FC partitions.
within each partition. Confidence intervals for ∆ξ were bootstrapped to assess the significance of the differences in means. ∆ξ was used to test against the null hypothesis that the difference is zero, i.e. that there is no effect of the covariate on the observed warming.

Research topic 2: how temperature change after forest loss is affected by the spatial extent of loss
This analysis examined the sensitivity of the spatially averaged temperature change caused by forest loss, ξ R , to the spatially averaged forest loss, L R , over a range of averaging lengthscales R. Our expectation was that, if all warming occurred only in response to the forest loss within individual pixels (and was not influenced by loss occurring in neighboring areas), these spatial averages would be equal regardless of the value of R. Any variation in ξ R with respect to R would indicate that other spatial interactions were taking place; therefore, ξ R was estimated for each of three fractional loss ranges representing 'low' (0.1 < L R < 0.2), 'moderate' (0.2 < L R < 0.3) and 'high' (0.3 < L R < 0.4) forest losses, and for 5 radii: R = 1, 2, 4, 6, 8, and 10 km. The 1 km 2 pixel scale was also included, and labelled as R = 0. In all cases, we required that forest loss outside of R was minor (L R−10 < 0.05) and that L R (t loss ± 1) < 0.05, so that the effects measured could be attributed to changes within the tested loss extent R during year t loss . For each case of 'low' , 'medium' and 'high' loss, the mean ξ R was computed for each corresponding subset and radius R. Confidence intervals on the mean ξ R were bootstrapped. Sufficient data to enable comparisons for 'moderate' and 'high' forest loss were only available for some of the radii tested, as there are fewer high L R samples for larger R.
Several modifications of the procedure above were tested by: lowering the 5% L R (t loss ± 1) threshold to 2% (reported in the supplementary information); adding a filter L 10 (t loss ± 1) < 0.05; and balancing the data with respect to η R . All these modifications reduced the amount of data in the hypothesis tests, generating similar results but with greater uncertainty. The results presented use the most permissive (unmodified) criteria and maximise the amount of data represented in the analysis.

Research topic 3: how locations warm due to nearby but nonlocal forest loss
The sensitivity of ξ to nonlocal forest loss was assessed for undisturbed pixels with high forest cover (with L < 0.02 and FC > 0.9). This simplifies the analysis by ensuring that changes in temperature were not caused Figure 7. Box plots (blue) for the DiD metric as a function of forest loss L for two spatial scales of averaging: the 1 × 1 km pixel scale, ξ (panel (A)), and loss averaged over circles of radius R = 2 km, ξ2 (panel (B)). To facilitate comparisons to previously published studies, the adjacent red dots show the mean for each range, along with the bootstrapped 95% confidence intervals on each mean. Panel (B) omits the L2 = 0.8-1.0 data point due to insufficient data. Box plots show the median (central line), the interquartile range (IQR) of the data (box limits), and whiskers extend 1.5 IQR past the boxes. In all cases, the data shown only include samples with minimal loss (<5%) outside the assessment pixel/averaging area. by local land cover change. We considered temperature changes only at assessment pixels that experienced significant forest loss across annuli of 1-2, 2-4, 4-6, 6-8 or 8-10 km from the undisturbed pixel. 'Significant' loss in this case refers to a spatially averaged loss in the range of (0.1-0.2). The apparently low average losses reflect the limited number of undisturbed locations that also experience high loss nearby-for example the maximum annular loss (averaged across all annuli) was 0.12 (see table 4). Forest loss effects were isolated to the annulus of interest by requiring that forest loss within the inner radius was less than 2%: L R in < 0.02.
No further balancing of FC or η was needed in this analysis, because the range of FC and η was small in undisturbed assessment pixels. For each R in , the mean temperature responseξ for the sampled pixels was computed, and confidence intervals on the mean were bootstrapped.

Influence of forest loss fraction on LST change
The initial dataset comprised 2.6 million pixels. Approximately 23 000 pixels experienced ⩾50% loss in any given year (2002 ⩽ t loss ⩽ 2019), and approximately 400 000 pixels experienced >10% loss. A summary of how the pixels were distributed across the MC, and the distribution of forest loss fraction in the dataset, is provided in supplementary information table 1.
An initial exploration of the data reveals that forest loss produces warming, and that the magnitude of warming increases with increasing forest loss. Box plots in figure 7 illustrate this warming trend over different spatial extents (a 1 × 1 km pixel and a circle with a radius of 2 km). To facilitate comparisons with previous studies reporting the mean warming due to Table 2. The spatially averaged DiD metric,ξR, for subsets with varying radii R = 0, 1, 2 km, and proportional loss ranges LR. The data was restricted to L R−10 < 0.05, so that any observed warming is primarily in response to loss within radius R. Column 'N' shows the number of samples. The same information is displayed in the red points in figure 7. forest loss, figure 7 also shows the mean of ξ for each loss range and bootstrapped 95% confidence intervals on the mean. This same information is summarised in table 2, which also includes ξ at the R = 1 km lengthscale.
For 40%-60% loss, we estimated a warming effect of 0.6 • C ± 0.11 • C for L, 1.67 • C ± 0.12 • C for L 1 and 1.67 ± 0.22 • C for L 2 , as shown in figure 7 and table 2. The L 1 and L 2 estimates are similar to the ≈1.5 • C of warming found by Alkama and Cescatti (2016) for 50% forest loss, and bound the spatial scale in that analysis (R = 1 km corresponds to a spatial extent of ≈3.1 km 2 , and R = 2 km to ≈12.6 km 2 )-a remarkable level of agreement considering the differences in study area, temporal scale, and definition of controls used in the differences-in-differences analyses in each study. Figure 7 shows that many of the samples are negative, suggesting deforestation-induced cooling. This is most likely due to a combination of noise in the remote sensing data, and limitations of the methodology (for instance, not accounting for vegetation changes in non-forested areas). The 8-day averaging period of the MODIS LST product could also produce negative DiD values in samples where the overpass days flagged as 'good quality' (i.e. no cloud cover) differed between control and target pixels. That is, because the MODIS LST data reflects the average of all days within an 8 day period, differences in cloud free days between control and assessment pixels within that period could bias the results.

Research topic 1: how spatial forest characteristics modulate the temperature effects of forest loss
Warming due to forest loss in the MC changes with the spatial properties of the remaining forest cover. This effect is illustrated in figure 8, which shows how the temperature change following forest loss, ξ, differs between partitions of the data, with this difference between partitions denoted as ∆ξ. The variable used to partition the data is the remaining forest cover (panels (A)-(C)) or the remaining edge density (panel (D)). Thus, in panels (A)-(C), a positive ∆ξ indicates more warming in locations with higher remaining forest cover, and a negative ∆ξ indicates less warming in locations with higher remaining forest cover. Where ∆ξ = 0 (or equivalently, the 95% confidence intervals on the mean include 0), there is no significant effect of remaining forest cover on the temperature change caused by forest loss. To account for the possibility that remaining forest cover only influences the temperature change above some threshold value, the ∆ξ values are plotted as a function of the mean remaining forest cover in the lower partition, FC l . Panel (D) shows a similar approach, except that the partitions are now based on remaining edge density, such that ∆ξ shows the differences in warming due to difference in remaining edge density, ∆η, between the tested partitions.
In figure 8 panels (A) and (B), the estimated ∆ξdifferences in warming due to differences in remaining forest cover-was computed without controlling for the effects of edge density. Provided the remaining forest cover in the lower partition is above 10% (i.e. FC l ⩾ 10%), these estimates suggest that warming due to forest loss is independent of the remaining forest cover (i.e. ∆ξ is not significantly different from zero). Differences in warming between low and high remaining forest cover partitions appear to be substantially independent of the size of the difference in remaining forest cover between partitions, ∆FC. This is shown by the lack of any clear association between ∆ξ and the color of the symbols (indicating the difference in mean forest cover between partitions ∆FC) in panel (A). Samples with FC l < 0.1 have negative ∆ξ, indicating that warming due to forest loss is amplified when the clearing is nearly complete. Panel (B) shows the same data as panel (A), with the marker colors now indicating the difference in mean edge density between the tested partitions, ∆η. In this panel, it is clear that cases with smaller FC l (x axis) are associated with higher ∆η, as a result of the association between FC and η illustrated in figure 3. Panel (B) suggests that the sensitivity of warming to remaining forest cover cannot be separated, in this analysis, from the sensitivity to remaining edge density. Panels (C) and (D) present the isolated effects of remaining forest cover and edge density. Panel (C) shows the differences in warming due to differences in remaining forest cover, now controlling for the effects of edge density in addition to forest loss (obtained by balancing the data with respect to both variables). For all tested forest cover partitions, the ∆ξ values are effectively zero (confidence intervals for all tests include zero), suggesting that remaining forest cover does not independently influence the magnitude of temperature changes caused by forest loss. Conversely, panel (D) shows the differences in warming between partitions with low and high remaining edge density, with both forest loss and remaining forest cover controlled for. Here, a clear trend emerges, in which the least warming occurs when the remaining forest cover has the highest edge density. For the most extreme η separation between partitions, where ∆η = 0.28, the 'edgier' landscape warmed by nearly 1 • C less than the less 'edgy' landscape (∆ξ = −0.89 • C ± 0.25 • C).
Supplementary information tables 2-4 provide further detail about the data in figure 8, including ∆ξ for each inter-quantile comparison.

Research topics 2 and 3: how temperatures respond to loss extent and to nonlocal forest loss
To illustrate the sensitivity of warming to the spatial extent of forest loss, figure 9(A) plotsξ R -the average temperature change due to forest loss in a circle of radius R-as a function of R. The same information is summarised in table 3. The analysis shows that, for a given proportional loss L R , the temperature responsē ξ R was greater for larger values of R. For example, considering 30%-40% loss (dark blue markers in figure 9(A), the average temperature increase over a 1 × 1 km pixel was 0.36 • C ± 0.8 • C, while in a circle of radius 1 km, it was 0.98 • C ± 0.07 • C. For 30%-40% forest loss occurring over a circle with radius 4 km, the average temperature increase was 1.65 ± 0.34 • C. The confidence intervals onξ R widen as both R and L R increase. Thus, the increase inξ R with R becomes Figure 8. This figure shows the difference in the temperature change caused by forest loss, ∆ξ, between partitions with high and low remaining forest cover FC (panels (A)-(C)) and high and low remaining edge density η (panel (D)). Panels (A)-(C) show ∆ξ as a function of the mean remaining forest cover in the lower partition, FC l , for each FC quantile pair. In these panels, positive ∆ξ values indicate more warming in locations with more remaining forest cover, and negative values indicate more warming in locations with less remaining forest cover. Panels (A) and (B) show data sampled to control for differences in L between partitions, and panel (C) shows data sampled to control for both L and η. Panel (D) shows the sensitivity of ∆ξ to the difference in mean edge density between partitions, ∆η, with both L and FC controlled for. Marker colors show, in panels (A) and (C), the difference in mean forest cover between partitions, ∆FC; in panel (B), the difference in mean edge density between partitions, ∆η; and in panel (D), the mean edge density in the lower partition, η l . Figure 9. (A) The spatially averaged change in temperature due to forest loss in a circle of radius R,ξR, increases with increasing R, for varying loss fractions, LR. Only samples with L R−10 < 0.05 were analysed, to avoid any influence of nonlocal forest loss on ξR, (B) Mean temperature increases relative to control sites,ξ, at undisturbed pixels that experienced significant forest loss in their neighborhoods. Data are plotted for sequential 1 or 2 km wide annuli denoted by their inner radius R in , for which the average annular loss was in the range of (0.1-0.2). To isolate the effects of forest loss in each annulus, the data was filtered to contain only pixels with LR in < 0.02. more uncertain as R ≳ 4 km. Additionally, no sensitivity to extent was observed for 'low' forest loss (0.1 < L R < 0.2), with theξ R confidence intervals overlapping for all lengthscales except R = 0. Sample size and data availability make it challenging to unravel the dependence ofξ R to R across all magnitudes of loss. This difficulty reflects the infrequent occurrence in the dataset of large areas with high proportional loss that meet the other sampling criteria.
The analysis depicted in figure 9(B) shows increased ξ in undisturbed pixels in response to forest loss of >10% in neighboring locations. This increase is greatest when the forest loss occurs immediately adjacent to the undisturbed site (i.e. R in = 1 km), and remains significant at the 95% level for forest loss occurring in the range of 4-6 km of the undisturbed site. The information is also summarised in table 4. Table 3.ξR shows the mean warming, with bootrapped confidence intervals on the mean in parentheses, for subsets with varying radii R and proportional loss ranges LR. The columnLR shows the mean proportional loss within the subsets. The data was restricted to L R−10 < 0.05, so that any observed warming is primarily in response to loss within radius R. The edge density (ηR) and remaining forest cover (FCR) are averaged over radii R surrounding the target pixels (coinciding with the assessment regions forξR).

Discussion
The analyses demonstrate how forest loss causes LST increases in the MC, which are comparable in magnitude to previous pan-tropical estimates of deforestation-induced warming (Alkama and Cescatti 2016) and increase with increasing fractional forest loss. Deforestation-induced warming decreases as the fragmentation (edge density) of the remaining forest cover increases, with, in the most extreme case in the analysis, a reduction in warming of ≈ 0.9 • C attributable to a difference in edge density of ∆η = 0.29. This effect is similar to that identified by Mendes and Prevedello (2020), who found lower temperatures in more fragmented forest landscapes. Methodological differences, however, prevent a direct comparison of the temperature-fragmentation relationship between the studies. The warming associated with forest loss does not change because of remaining forest cover (that is, LST increases do not depend on the remaining forest cover fraction). However, forest cover fraction covaries with forest edge density, which does alter the magnitude of warming due to loss. This covariation means that, in practice, the remaining forest cover following loss could influence the LST change (figure 8, panels (A) and (B)). Pragmatically, the observation of greater temperature increases in the MC when FC < 10% suggests that maintaining a minimum of 10% forest cover at 1 × 1 km scales could help mitigate temperature increase due to land cover conversion.
Forest loss occurring over larger areas appears to cause more warming than equivalent fractional forest loss occurring over smaller areas. Limited sample sizes for areas with both high fractional loss and large loss areas means that the analysis could not ascertain the upper limit of the scales over which this area-dependence may apply. Increasing warming with increasing loss extent, however, is visible up to radii of 6 km, and is quite evident at smaller scales (e.g. when comparing warming between 1 × 1 km pixel and R = 1 km scales). This finding is inconsistent with a previous study (Zeppetello et al 2020), which showed that temperature changes due to deforestation were related to the area of deforestation in tropical regions at large scales only. Specifically, no scale effects were found for areas >13 km 2 , including the range of scales over which significant sensitivity to loss extent appears in the present study. This discrepancy is unsurprising given methodological differences between the studies. In particular, Zeppetello et al (2020) analysed LST changes following deforestation, without using nearby intact forest areas to control for background climate variability, as was done in the present study. Using a DiD approach would likely increase the sensitivity of the analysis by reducing noise in the response variable. The spatial metrics of loss extent also differ between the studies (connected loss areas in Zeppetello et al (2020) versus average loss over circles of varying R presented here).
The last finding is that LST increases associated with forest loss are not confined to the location where forest removal occurs. Instead, areas located at distances within 6 km from forest loss are warmed by the nonlocal removal of trees. This finding is broadly consistent with previous work in the Brazilian Amazon-Cerrado transition area (Cohn et al 2019), which estimated that 10% forest loss would increase surface ATs by 0.06 • C ± 0.01 • C at undisturbed sites within 2-4 km of the loss. In the present study, 10% forest loss over the same 2-4 km annulus caused LST increases of 0.24 • C ± 0.06 • C. Although the temperature metrics (AT versus LST) and study areas (Brazil versus the MC) differ between the studies, the findings are broadly similar and suggest that forest loss in the tropics results in nonlocal warming.
Overall, the findings suggest the importance of spatial pattern and scale in considering the temperature change implications of forest loss. The mechanisms responsible for these spatial effects, however, remain unclear, with several mechanisms hypothesised to be relevant. Forest edges shade neighboring non-forested land (Zhou et al 2011, Li et al 2012. Forest edges are drier and warmer than forest interiors (Smit et al 2013, Arroyo-Rodríguez et al 2017, representing thermal gradients which are likely to generate lateral energy fluxes. If these fluxes occur over large enough scales, they could potentially generate secondary air circulation leading to larger-scale cooling (the 'vegetation breeze' effect; Mendes and Prevedello 2020). We note, however, that the lengthscales associated with the fragmentation metric η (<1 km) are much smaller than those previously observed to trigger such mesoscale circulations (Avissar and Liu 1996). Alternatively, the heterogeneity generated by a fragmented forest landscape might increase the effective aerodynamic roughness of the land surface, increasing the turbulent transfer of momentum and energy between the surface and atmosphere (Spracklen et al 2018). Modeling studies, for example, show enhanced surface sensible and latent heat fluxes caused by spatial heterogeneities with lengthscales of ≈10-10 3 m (Schmid and Bünzli 1995), enhancing surface cooling.
More simply, nonlocal warming presumably reflects advection of warm air from sites of forest loss to neighbouring areas-a phenomenon that has been better characterised in urban settings (Bassett et al 2016, Cosgrove andBerkelhammer 2018). Larger areas of forest loss (i.e. greater loss extent) could amplify the effects of many of these processes: for example, by increasing the connectivity of the landscape for advective airflow, reducing the ratio of forest edges to areas of loss, or by compounding local and nonlocal warming. While the LST changes observed in the MC cannot be attributed to these mechanisms based on the analyses presented here, there is meaningful scope for future work to do so. For example, testing how lengthscales of fragmentation affect temperature responses could discriminate between the aerodynamic effects of heterogeneity, emerging on 10-10 3 m scales (Schmid and Bünzli 1995) and mesoscale circulations arising on~10 5 m scales (Avissar andLiu 1996, Cochrane andLaurance 2008).
Recent studies have indicated that the elevation at which deforestation occurs influences the temperature response of the land surface (Zeng et al 2020), which may become increasingly important as deforestation increasingly occurs at higher elevations in Southeast Asia (Feng et al 2021). Variations in elevation did not influence the results of the present study, however, as 90% of sites were located at elevations of <250 m. However, the DiD methodology employed here could be valuable for further exploration of the effects of elevation on deforestation-induced temperature change.
The DiD and hypothesis testing approaches used here enable causal attribution of LST change to spatial properties of forest loss. While this methodology is a strong point of the study, it required careful efforts to control for confounding and covarying spatial properties (e.g. when separating effects of FC from η). Consequently, although the initial dataset of forest loss and temperature change was very large, data limitations nonetheless influenced the final analyses. Additionally, the methodology does not account for forest gain, vegetation regrowth following forest loss, or changes in vegetation in non-forested areas. For the short (1 year) assessment period considered here, the influence of these factors on surface temperature is likely small relative to forest loss, given that forest loss constitutes a step change whereas vegetation growth is more gradual. Nevertheless, more precise estimates of deforestation-induces surface warming would be obtained if these factors were accounted for.
Other study limitations include the use of LST rather than near surface AT, given that AT is more relevant to crop productivity and human health (Wolff et al 2018, Masuda et al 2019. Nonetheless, LST is strongly correlated to AT across regions and landcover types (Mutiibwa et al 2015, Alkama and Cescatti 2016, Pepin et al 2016, and LST has been shown to respond to land cover changes similarly to AT, albeit with different magnitudes Alkama and Cescatti (2016). The dataset used consists of cloud-free observations, which presumably have warmer temperatures than cloudy days-a potential source of bias that may or may not be controlled for by the DiD approach. The use of annually averaged temperatures means that the analysis does not directly shed light on how temperature extremes respond to deforestation and its spatial properties. These extremes are presumably the occurrences where forest cover derived cooling might be most important. Temperature extremes scale nonlinearly with mean temperature changes caused by global warming (Lewis et al 2017), and modeling studies have suggested that impacts of deforestation on seasonal maximum ATs in the MC could be in the range of 4 • C-12 • C-many times larger than the changes detected here (Avila et al 2012). However, we are unaware of any prognostic relationship between warming in annual average temperature and the resulting increase in extremes which could allow us to estimate changes in extremes from the present results.

Conclusion
Forest loss in the MC causes LSTs to rise both at the site of deforestation and in neighboring areas. The magnitude of the temperature rise depends on the spatial properties of forest loss and remaining forest cover, with greater mean warming associated with (i) forest loss resulting in less than 10% remaining forest cover and (ii) more extensive forest loss than loss occurring over smaller scales. These findings are largely attributable to the role of forest edges in mitigating warming, although the mechanistic underpinnings of this relationship require further exploration.
The study results suggest two pragmatic outcomes for forest conservation and land management policies: firstly, while fragmentation has been long identified as undesirable for forests, extending the length of forest-non-forest boundary (a side-effect of fragmentation) reduces temperature increase. As the results show, even small amounts of remnant forest cover provide a cooling effect (>10%); landscape management approaches that balance multiple competing uses while preserving some tree cover (e.g. agroforestry, multi-functional landscapes) would provide some mitigation of warming. This is important given the risk warming climates pose to many economic activities that use lands adjacent to forests. Secondly, this study corroborates a previous observation of nonlocal propagation of warming from deforested sites to surrounding areas. These nonlocal effects create an opportunity to value the 'climate stabilisation services' offered by forests. Seen through this lens, forest conservation can avert temperature increase in neighboring lands that would be caused by forest loss. This averted warming can be valued in terms of the avoidance of heat-related harms-for example, loss of agricultural productivity or human health impacts.

Data availability statement
The JavaScript code used to extract the data in Google Earth Engine (GEE), the dataset exported from GEE, and the Python code to analyse it are available through the Center for Open Science at: https://osf.io/38q7g (DOI 10.17605/OSF.IO/X2YZJ).
The data that support the findings of this study are openly available at the following URL/DOI: https://osf.io/38q7g.