Aboveground carbon emissions from gold mining in the Peruvian Amazon

In the Peruvian Amazon, high-biodiversity tropical forest is underlain by gold-enriched subsurface alluvium deposited from the Andes, which has generated a clash between short-term earnings for miners and long-term environmental damage. Tropical forests sequester important amounts of carbon, but deforestation and forest degradation continue to spread in Madre de Dios, releasing carbon to the atmosphere. Updated spatially explicit quantification of aboveground carbon emissions caused by gold mining is needed to further motivate conservation efforts and to understand the effects of illegal mining on greenhouse gases. We used satellite remote sensing, airborne LiDAR, and deep learning models to create high-resolution, spatially explicit estimates of aboveground carbon stocks and emissions from gold mining in 2017 and 2018. For an area of ∼750 000 ha, we found high variation in aboveground carbon density (ACD), with a mean ACD of 84.6 (±36.4 standard deviation) Mg C ha−1 and 83.9 (±36.0) Mg C ha−1 for 2017 and 2018, respectively. An alarming 1.12 Tg C of emissions occurred in a single year, affecting 23 613 hectares, including in protected zones and their ecological buffers. Our methods and findings are preparatory steps for the creation of an automated, high-resolution forest carbon emission monitoring system that will track near real-time changes and will support actions to reduce the environmental impacts of gold mining and other destructive forest activities.


Introduction
The Amazonian rainforest is under the chainsaw of fast-growing economies and expanding populations. Among these threats, gold mining is one of the most environmentally damaging. The Peruvian Amazon, the sixth largest producer of gold with 155 metric tons removed in 2017 (Ober 2018), harbors gold-rich alluvium deposited by erosion in layers of sediments in the lowland ecosystems, including river floodplain forests (Asner et al 2013). This has attracted gold miners to the Madre de Dios region of the Peruvian Amazon, which is responsible for 70% of Peru's gold production (Swenson et al 2011).
Gold mining in Madre de Dios has proliferated as a gold rush responsible for 100 000 hectares of deforestation between 1984 and 2017, with 10% of this deforestation occurring in 2017 alone and 53% occurring since 2011 (Caballero Espejo et al 2018). Multiple factors have contributed to this intensification, including an increase in international gold prices, increased accessibility following paving of the Interoceanic Highway (Moreno-Brush et al 2016), the lack of formalization of artisanal and small-scale mining activities (Salo et al 2016), and the hope of high short-term earnings from mining and selling the gold (Sanguinetti 2018), which also led to a migration of people from nearby regions to Madre de Dios (Asner and Tupayachi 2017). The three largest mines, Huepetuhe, Delta-1, and Guacamayo (now called La Pampa), used to dominate the landscape, with Huepetuhe accounting for 76% of region-wide mining in 1999 (Asner et al 2013). However, in 2012, small, clandestine mining operations accounted for 51% of total mining activities in Madre de Dios (Asner et al 2013). The environmental damage produced by artisanal miners is visible from space (Swenson et al 2011, Asner et al 2013) and consists of large areas of deforestation and forest degradation, bare soil, and water pools used in the mining process.
Earth Observation satellites have become widely used to report deforestation and improve intervention. An analysis of time series of free Landsat satellite images at 30 m spatial resolution showed that forest loss caused by gold mining averaged 4437 ha yr−1 between 1999 and 2016, with the estimated gold mining area increasing by 40% between 2012 and 2016 (Asner and Tupayachi 2017). Landsat data were also used to identify ∼15 500 ha of mining areas for the three largest mines by 2009 (Swenson et al 2011) and to show that ∼65% of all artisanal small-scale mining occurred outside the legal mining concessions (Elmes et al 2014), including in protected areas such as the Tambopata National Reserve and its buffer zone (Asner and Tupayachi 2017). Asner et al (2013) combined Landsat data with airborne mapping and field surveys to find a 400% increase in gold mining between 1999 and 2012, with forest loss tripling in 2008. Aboveground carbon stocks and emissions were mapped at 0.1 ha resolution over 4.3 million ha in Madre de Dios using satellite imaging, airborne LiDAR and field plots, totaling 395 Tg C (million metric tons of carbon) (Asner et al 2010). Gold mining areas had the lowest mean carbon density (16.7 Mg C ha−1) when compared to forest degradation and deforestation (35.6 and 27.8 Mg C ha−1, respectively), while deforestation caused by gold mining and logging concessions accounted for carbon emissions of 0.42 Tg C yr−1 between 2006 and 2009 (Asner et al 2010). However, because land-use conversion in Madre de Dios happens quickly and mostly illegally, satellite images with high spatial and temporal resolution are needed for ecosystem monitoring (Boyle et al 2014).
Planet Inc. satellite data address both challenges in terms of spatial and temporal resolution, offering daily 3.7 m spatial resolution Dove images with four spectral bands (Planet Team 2018). Until now, Dove images have mostly been used in combination with other geospatial datasets to manually digitize land-use changes in the Peruvian Amazon region by conservation projects like the Monitoring of the Andean Amazon Project (MAAP) (Finer and Mamani 2018). Planet Dove images have been successfully used to map coral reefs or agricultural environments (Aragon et al 2018, Houborg and McCabe 2018), but have not been used to assess gold mining effects using robust and automated approaches.
We developed high-resolution estimates of aboveground carbon density (ACD) and emissions for the region most affected by gold mining activities in Madre de Dios for 2017 and 2018 by combining hundreds of Planet Dove images, Sentinel-1, topography and airborne LiDAR measurements using a deep learning regression framework. We seek to build the pillars for an automated monitoring system that will help track the changes in near real-time and act towards reducing the environmental impacts of gold mining activities.

Study area
Madre de Dios is a biodiversity hotspot lying at the foothills of the Andes, whose Holocene alluvium deposited important subsurface gold resources (Asner et al 2013). Our study area is bounded by the Andes
To account for the different acquisition times of the LiDAR and Planet Dove images, we used visual interpretation to remove the LiDAR patches that had changed between acquisitions. Subtracting the digital terrain model from the digital surface model, derived from the last and first returns, respectively, resulted in a top-of-canopy height (TCH) model at 2 m spatial resolution, covering 164 563 hectares of our ∼750 000 ha study area. TCH has been shown to be a very good estimator of ACD, and we transformed the TCH into estimates of ACD using the equation proposed by Asner et al (2014), developed by correlating extensive field-based estimates of ACD with TCH at the national level (equation (1)).

Satellite data for regional upscaling
To upscale the LiDAR ACD measurements to our entire study area, we used a combination of different remotely sensed geospatial datasets, including spectral bands, radar, and topography.
Planet Dove images are acquired daily by an orbiting constellation of more than 180 CubeSat satellites, using multispectral imagers with four spectral channels at a spatial resolution of 3.7 m, namely blue, green, red, and near-infrared bands (Planet Team 2018, 2017). We created two normalized seamless mosaics with a spatial resolution of 2.5 m for the dry season (July-September) of 2017 and 2018 by transforming hundreds of Dove surface reflectance scenes using a linear fit of each scene to co-registered Landsat images. We retained only the green, red and near-infrared bands for the analysis and removed the blue band due to its sensitivity to atmospheric conditions.
We derived seven vegetation indices using different combinations of the three Dove spectral bands, as follows: Simple Ratio (SR) (Jordan 1969), Normalized Difference Vegetation Index (NDVI) (Rouse et al 1974), Green NDVI (GNDVI) (Gitelson et al 1996), Transformed Vegetation Index (TVI) (Tucker 1979), Soil Adjusted Vegetation Index (SAVI) (Huete 1988), Optimized SAVI (OSAVI) (Rondeaux et al 1996), and Enhanced Vegetation Index (EVI) (Jiang et al 2008). These vegetation indices improve the differentiation between vegetation and non-vegetation and emphasize variation in canopy biomass. Because most of these indices are highly correlated, we performed a standardized principal component analysis (PCA) to obtain a set of linearly uncorrelated variables. We retained the first principal component, which described 97.6% and 96.9% of the variation for 2017 and 2018, respectively.
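As a minimal sketch of how such indices are derived from the three retained bands, the snippet below computes five of them for a single pixel using their commonly published forms. The reflectance values and the function name `vegetation_indices` are illustrative assumptions, the soil factor of 0.5 for SAVI and the constant 0.16 for OSAVI are the usual defaults rather than values stated here, and TVI and EVI are left out because the exact variants used are not specified in the text.

```python
def vegetation_indices(green, red, nir, soil_factor=0.5):
    """Return five common vegetation indices for one pixel.

    Inputs are surface reflectances on a 0-1 scale. The formulas are the
    commonly published forms; soil_factor=0.5 and the OSAVI constant 0.16
    are the usual defaults, assumed here for illustration.
    """
    return {
        "SR": nir / red,
        "NDVI": (nir - red) / (nir + red),
        "GNDVI": (nir - green) / (nir + green),
        "SAVI": (1 + soil_factor) * (nir - red) / (nir + red + soil_factor),
        "OSAVI": (nir - red) / (nir + red + 0.16),
    }


# Hypothetical reflectances: dense canopy versus exposed mining soil.
forest = vegetation_indices(green=0.05, red=0.03, nir=0.40)
bare_soil = vegetation_indices(green=0.15, red=0.20, nir=0.25)

for name in forest:
    print(f"{name}: forest={forest[name]:.3f}, bare soil={bare_soil[name]:.3f}")
```

Every index is higher for the forest pixel than for the bare-soil pixel, which is the contrast that makes these layers useful predictors and also why they are strongly correlated with one another.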
Sentinel-1 is a Synthetic Aperture Radar C-band satellite that acquires images in an interferometric wide swath mode with VH (Vertical transmit-Horizontal receive) and VV (Vertical transmit-Vertical receive) polarizations (Torres et al 2012). We created two mosaics of the Sentinel-1 VH and VV data overlapping our Planet data for the dry seasons of 2017 and 2018 using the Google Earth Engine (Gorelick et al 2017), at 10 m spatial resolution. These scenes were pre-processed for thermal noise removal, radiometric calibration and terrain correction. To account for the influence of elevation on ACD distribution, we used the ALOS DSM.
In total, the seven variables used as predictors (green, red, near-infrared bands, PCA of vegetation indices, Sentinel-1 VH and VV, and ALOS DSM) and the LiDAR ACD were resampled to 1 ha resolution to create a perfectly overlapping stack of layers to be used as input in the deep learning estimation of ACD.
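The resampling of finer layers to a common 1 ha grid can be illustrated with a simple block average. This is only a sketch under stated assumptions: the text does not say which resampling method was used, so mean aggregation is assumed, and the function name `aggregate_mean` and the toy grid are hypothetical. With 10 m pixels, a factor of 10 yields 100 m × 100 m (1 ha) cells.

```python
def aggregate_mean(grid, factor):
    """Average non-overlapping factor x factor blocks of a 2D list.

    Rows and columns that do not fill a complete block are dropped.
    Mean aggregation is an assumption made for this sketch.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [
                grid[i][j]
                for i in range(r, r + factor)
                for j in range(c, c + factor)
            ]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Toy 20 x 20 grid of 10 m pixels: left half 0.0, right half 1.0.
fine = [[float(c >= 10) for c in range(20)] for _ in range(20)]
coarse = aggregate_mean(fine, factor=10)  # 2 x 2 grid of 1 ha cells
print(coarse)
```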

Carbon stocks, emissions and uncertainty by upscaling LiDAR data
Many machine learning regression algorithms are used to estimate ACD from remotely sensed data (Mascaro et al 2014). We used a deep learning neural network since it has been shown to outperform many popular machine learning models (Asner et al 2018). Deep learning is a powerful supervised approach capable of learning complex nonlinear relationships by using neurons to map the input features to the response variable (LiDAR ACD) (LeCun et al 2015). We built and trained our deep learning models with the high-level API Keras (Chollet 2015), using TensorFlow-specific functionality (Abadi et al 2016). After tuning the hyper-parameters of the neural network, we created a wide and deep model using five layers, with three similar hidden layers of 250 neurons each. An activation function based on the rectified linear unit was used for the input and hidden layers, and a linear activation function for the output layer. We used the mean absolute error as a loss function with an Adam optimizer (Kingma and Ba 2014). To ensure a stable estimation of ACD, the final results of the neural network were obtained by averaging results from 10 iterations.
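Two ingredients of this setup, the mean absolute error loss and the averaging of repeated runs, can be sketched without any deep learning library. The snippet below is not the Keras model itself: the function names and the three toy "runs" standing in for the 10 iterations are hypothetical, and the ACD values are made up for illustration.

```python
def mean_absolute_error(observed, predicted):
    """Mean absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def average_runs(runs):
    """Average per-pixel predictions across repeated model runs."""
    n_runs = len(runs)
    return [sum(values) / n_runs for values in zip(*runs)]


# Hypothetical LiDAR ACD (Mg C ha-1) and predictions from three runs.
lidar_acd = [90.0, 40.0, 5.0]
runs = [
    [88.0, 45.0, 9.0],
    [94.0, 37.0, 3.0],
    [91.0, 41.0, 6.0],
]

ensemble = average_runs(runs)
for index, run in enumerate(runs, start=1):
    print(f"run {index} MAE: {mean_absolute_error(lidar_acd, run):.2f}")
print(f"ensemble MAE: {mean_absolute_error(lidar_acd, ensemble):.2f}")
```

In this toy case the averaged prediction is at least as close to the LiDAR values as any single run, which illustrates why averaging is a common way to stabilize estimates from randomly initialized networks.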
The LiDAR ACD samples were split so that 80% were used to build the neural network and 20% were kept for the final validation of ACD estimates. The 80% used as input for the neural network was further split into 80% for training and 20% for testing the network. Since a neural network regression is sensitive to the range of values of input variables, all the environmental variables were normalized beforehand using a min-max normalization to a range between 0 and 1.
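The nested split and the min-max scaling can be sketched as follows. The sample count, the random seed and the helper names `split` and `min_max` are assumptions chosen for illustration, and random shuffling is assumed because the sampling scheme is not described in the text.

```python
import random


def split(samples, first_fraction, rng):
    """Shuffle a copy of the samples and split it into two parts."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * first_fraction))
    return shuffled[:cut], shuffled[cut:]


def min_max(values):
    """Rescale values linearly to the range 0-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant layer: avoid division by zero
    return [(v - lo) / (hi - lo) for v in values]


rng = random.Random(42)            # fixed seed, for a reproducible sketch
samples = list(range(1000))        # stand-ins for 1 ha LiDAR ACD cells

model_set, validation = split(samples, 0.8, rng)   # 80% / 20% held out
training, testing = split(model_set, 0.8, rng)     # 80% / 20% of the 80%

print(len(training), len(testing), len(validation))
print(min_max([120.0, 250.0, 380.0]))  # e.g. elevations in metres
```

With this scheme, 64% of all samples train the network, 16% test it during training, and 20% remain untouched for the final validation.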
The uncertainty of our ACD estimations was estimated following the procedure described in Asner et al (2014). The ACD results were binned into 10 classes by natural breaks and, for each bin, the RMSE was computed. A function was fitted to the 10 RMSE values to obtain the absolute RMSE values (in Mg C ha−1) for each pixel of our estimated ACD map. The annual spatially explicit aboveground carbon emissions were then calculated using the stock-difference method, as suggested by the IPCC (2006) guidelines, at 1 ha resolution.
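The per-bin RMSE step can be sketched as below. For brevity this sketch uses equal-width bins rather than natural breaks, which is a deliberate simplification and not the procedure described above; the function name `rmse_by_bin` and the sample values are hypothetical, and the subsequent curve fitting is not shown.

```python
import math


def rmse_by_bin(predicted, observed, n_bins=10):
    """RMSE of predictions within equal-width bins of predicted ACD.

    Equal-width bins stand in for natural breaks (a simplification).
    Bins without samples are reported as None.
    """
    lo, hi = min(predicted), max(predicted)
    width = (hi - lo) / n_bins or 1.0  # guard against a zero-width range
    squared_errors = [[] for _ in range(n_bins)]
    for p, o in zip(predicted, observed):
        index = min(int((p - lo) / width), n_bins - 1)
        squared_errors[index].append((p - o) ** 2)
    return [
        math.sqrt(sum(errors) / len(errors)) if errors else None
        for errors in squared_errors
    ]


# Hypothetical predicted and LiDAR-observed ACD (Mg C ha-1), two bins.
predicted = [0.0, 10.0, 90.0, 100.0]
observed = [3.0, 14.0, 84.0, 108.0]
bins = rmse_by_bin(predicted, observed, n_bins=2)
print([round(value, 2) for value in bins])
```

Dividing each bin's RMSE by a representative ACD for that bin gives a relative uncertainty, which is why a similar absolute error translates into a much larger percentage at low ACD.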

Accessibility influence on aboveground carbon emissions
We analyzed how the aboveground carbon emissions vary in relation to the distance to roads and rivers, the two main transportation routes in the area, as well as to elevation. For this, we created a distance map based on the roads and waterways layers available from OpenStreetMap, which maps our study area well (OpenStreetMap Contributors 2017), while the ALOS DSM was used for the elevation analysis.

Results
The performance of deep learning estimation of ACD
The final averaged ACD estimations for both years were evaluated against the 32 913 ha of LiDAR ACD not used in the deep learning process. For 2017, the satellite-based ACD estimates had an R² of 0.69 and an RMSE of 21.5 Mg C ha−1, while for 2018, our satellite-derived ACD had an R² of 0.67 and an RMSE of 22.2 Mg C ha−1 (figure 2). During the deep learning training and testing iterations, the averaged RMSE was 22.2 Mg C ha−1 for 2017 and 23.0 Mg C ha−1 for 2018. The combined RMSE for 2017 and 2018, computed as the square root of the sum of the two squared errors and important for assessing changes between the two maps, was 30.9 Mg C ha−1. Considering that we used only seven simple predictors, we consider these results very good and a starting point for future exploration.
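The combined error can be checked with a short calculation. The sketch below adds the two validation RMSE values in quadrature, which implicitly assumes that the errors of the two yearly maps are independent; the function name `combined_rmse` is hypothetical.

```python
import math


def combined_rmse(rmse_a, rmse_b):
    """Square root of the sum of two squared errors."""
    return math.sqrt(rmse_a ** 2 + rmse_b ** 2)


# Validation RMSE for 2017 and 2018 (Mg C ha-1), as reported above.
combined = combined_rmse(21.5, 22.2)
print(round(combined, 1))  # 30.9
```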

ACD stocks and uncertainties
We estimated that our study region contained 63. In our study region, the heavily mined Colorado, Inambari and Malinowski rivers, together with the Madre de Dios river and the Interoceanic Highway, shape the spatial distribution of human-caused forest disturbance and ACD loss (figure 3). Gold mining areas show near-zero values of ACD and below-average ACD in surrounding degraded forests. In contrast, high ACD values are found in protected areas and in areas that are relatively far away from the road network or rivers. In addition to the highly concentrated areas of low ACD, there are thousands of smaller low-ACD areas mainly attributable to gold mining activities and deforestation along and perpendicular to the Interoceanic Highway (figure 3).
The uncertainties in our ACD estimations, expressed as estimated absolute percent uncertainty, are under 20% for the highest ACD values (>100 Mg C ha−1) and between 20 and 40% for ACD values between 65 and 100 Mg C ha−1, and they increase further as the ACD approaches 0 Mg C ha−1 (figure 3). Considering that the highest ACD values correspond to tall and healthy forest, uncertainties below 20% are viewed as very good and similar to the error rates of field-based ACD estimation.

Aboveground carbon emissions from 2017 to 2018
The fine-scale spatial variations from high to low ACD values indicate that new deforestation and forest degradation occur mostly in intact forests that harbor more than 100 Mg C ha−1. The large gold mining areas show highly variable carbon storage, such as in Huepetuhe, the Upper Malinowski and the La Pampa mining area, situated in the Tambopata and Bahuaja-Sonene buffer zones (figure 4). We took the combined RMSE of 30.9 Mg C ha−1 for the two ACD maps as a threshold and calculated the emissions above it, which reached an alarming 1.12 Tg C in a single year, with 23 613 ha affected.
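A minimal sketch of the stock-difference calculation with this threshold is given below. It assumes 1 ha cells, so a loss in Mg C ha−1 equals a loss in Mg C per cell; the function name `stock_difference_emissions` and the four sample cells are hypothetical.

```python
THRESHOLD = 30.9  # combined RMSE of the two ACD maps, Mg C ha-1


def stock_difference_emissions(acd_before, acd_after,
                               threshold=THRESHOLD, cell_area_ha=1.0):
    """Sum ACD losses that exceed the threshold.

    Returns (total emissions in Mg C, affected area in ha). Losses at or
    below the threshold are treated as indistinguishable from model error.
    """
    total_mg_c = 0.0
    affected_cells = 0
    for before, after in zip(acd_before, acd_after):
        loss = before - after
        if loss > threshold:
            total_mg_c += loss * cell_area_ha
            affected_cells += 1
    return total_mg_c, affected_cells * cell_area_ha


# Hypothetical ACD values (Mg C ha-1) for four 1 ha cells.
acd_2017 = [110.0, 95.0, 60.0, 20.0]
acd_2018 = [5.0, 80.0, 10.0, 25.0]

emitted, area = stock_difference_emissions(acd_2017, acd_2018)
print(f"{emitted:.1f} Mg C over {area:.0f} ha")  # 1 Tg C = 1 000 000 Mg C
```

Only the first and third cells exceed the threshold here; the 15 Mg C ha−1 loss in the second cell and the apparent gain in the fourth are ignored, which makes the resulting total a conservative estimate.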
While the protection regimes of Tambopata, Bahuaja-Sonene, and Amarakaeri seem to maintain relatively stable ACD stocks, there are visible signs of mining intrusions inside the protected areas, mainly in the Tambopata National Reserve (figure 5). The newest mining advancement is represented by the La Pampa mining area between the Malinowski river and the Interoceanic Highway, in the buffer zone of protected areas. This area is rapidly expanding towards the southeast, parallel to the river, adding about 5.5 km per year in length while widening by roughly 1.0 km (figure 5(a)). The northwest side of the La Pampa mining area is expanding southwest in two lobes, advancing into the intact forest by 3.8 km per year in a more compact form and by 4.5 km per year in a sparser form, while the mining area that already existed in 2017 is widening by 0.2 to 1.0 km (figure 5(b)).
The Upper Malinowski river has high carbon emissions from expansive mining activities in areas where more than 100 Mg C ha−1 was present (figure 5(c)). Following the river path, these mining scars now measure between 0.6 and 1.3 km and represent the common starting points of fast-growing, large-scale gold mining. This is happening in the buffer zone of the Bahuaja-Sonene National Park, although the mining is still 4.5 km from the edge of the national park (figure 5(c)). Numerous small-scale carbon emission areas that are constantly expanding are attributable to the easy access granted by the Interoceanic Highway, with forest clearings both parallel and perpendicular to it (figure 5(d)).
We refrain from detailing the carbon emissions in the buffer zone of Amarakaeri Communal Reserve, since cloud-related artifacts were present in the Planet Dove 2018 mosaic, in the western part of Huepetuhe mining area. These areas were removed when computing the 1.12 Tg C emissions. However, there are visible signs that this mining area is expanding southwest and east from the Colorado river while expanding also internally by clearing the last forest patches already surrounded by bare soil and mining water pools (figure 4).

Accessibility influence on aboveground carbon emissions
From the total of 1.12 Tg C of aboveground carbon emissions, 0.49 Tg C occurred within 1 km of a road or river, rising cumulatively to 0.70 Tg C within 2 km, and to 0.86, 0.98 and 1.04 Tg C within 3, 4 and 5 km, respectively (figure 6). Only 0.08 Tg C of emissions occurred between 5 and 16 km from a road or river. Mining activities in our study area happen at a relatively low and accessible elevation, with 0.85 Tg C of emissions occurring between 200 and 300 m and 1.04 Tg C between 200 and 400 m, from a total of 1.12 Tg C (figure 6). Although not shown here, evidence of gold mining at higher altitudes was found southwest of our study area, in Quincemil, situated in the Andean foothills (Asner et al 2013).
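The distance figures above are cumulative, since 1.04 Tg C within 5 km plus 0.08 Tg C beyond it gives the 1.12 Tg C total. The short calculation below converts them into per-kilometre contributions; the variable names are illustrative only.

```python
# Cumulative emissions (Tg C) within 1, 2, 3, 4 and 5 km of a road or river.
cumulative = [0.49, 0.70, 0.86, 0.98, 1.04]
total = 1.12

# Emissions falling inside each successive 1 km band.
per_km = [
    round(outer - inner, 2)
    for inner, outer in zip([0.0] + cumulative[:-1], cumulative)
]
beyond_5km = round(total - cumulative[-1], 2)
share_first_km = round(100 * cumulative[0] / total)  # whole percent

print(per_km)          # [0.49, 0.21, 0.16, 0.12, 0.06]
print(beyond_5km)      # 0.08
print(share_first_km)  # 44
```

Read this way, roughly 44% of all emissions fall within the first kilometre of a road or river, and each additional kilometre contributes progressively less.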

Discussion
The Peruvian Amazon and its Madre de Dios subregion are a well-known tropical biodiversity hotspot (Stewart 1988). For more than a decade, there has been an ongoing clash between the economic benefits of gold mining and the resulting environmental and health issues in this region. Because the majority of deforestation and small-scale mining happens in remote places and is difficult to monitor (Elmes et al 2014), we fused multiple passive and active remote sensing datasets to estimate aboveground carbon stocks and emissions in 2017 and 2018. For a study region of ∼750 000 ha in Madre de Dios, we found 1.12 Tg C of emissions spread over 23 613 hectares, caused by gold mining activities, deforestation and human expansion along the Interoceanic Highway. This observed trend of carbon emissions is in line with other studies reporting increasing deforestation rates through time in this region (Asner and Tupayachi 2017, Caballero Espejo et al 2018).
The migration of miners is facilitated by the existence of rivers and roads, which act as the main routes for transportation. The high degree of fragmentation and accessibility has a major impact on the high biological diversity of the Madre de Dios region, where a multitude of floral and faunal species can be found within a single hectare (Gentry 1988, Asner et al 2013). Besides deforestation and forest degradation, another threat is the mercury used to amalgamate fine gold particles extracted from the river sediments, which is polluting the waters (Castello and Macedo 2016) and affecting the entire food web dynamics, including nearby communities and wildlife (Alvarez-Berríos et al 2016). The lack of education for miners is leading to unsustainable mining practices and high exposure to mercury for the upstream and downstream communities (Kahhat et al 2019).
An accurate monitoring system with high spatial and temporal resolution will greatly benefit the actors responsible for combating illegal gold mining. We estimated that in 2018 more than 58 000 ha had ACD values between 0 and 10 Mg C ha−1 (7.76% of the study area), and such enormous losses continue to challenge efforts to stem the impacts of gold mining in the region. The first line of defense, the buffer zones of protected areas, are exposed to illegal mining activities because of a lack of coordination between responsible agencies, corruption problems and inadequate funding (Gardner 2012, Weisse and Naughton-Treves 2016). There are ongoing community-based efforts to monitor and mitigate the impacts of gold mining, including land restoration and reforestation of degraded forests (Sanguinetti 2018). However, reforesting abandoned gold mines with native species is challenging because of poor soil quality and mercury contamination, slow tree growth rates, but acceptable survivorship of the
Deep learning approaches are becoming essential tools for large-scale spatial analyses (Brodrick et al 2019). To our knowledge, the use of deep learning regression workflows to estimate ACD in tropical forests is rather limited (Asner et al 2018). We selected a deep learning approach because, for 2017, it resulted in a lower RMSE for ACD estimation than Random Forest regression (Breiman 2001), at 23.2 Mg C ha−1, or XGBoost, a scalable tree boosting system (Chen and Guestrin 2016), at 24.9 Mg C ha−1. However, the selection of numerous deep learning model hyperparameters and the extensive computational time remain two of the main challenges when applying such a workflow. Our approach is not dependent on specific input data, so alternatives to commercial Planet imagery can be used, such as Sentinel-2, with lower spatial and temporal resolution but higher spectral resolution, or the Landsat missions for a deeper retrospective analysis.
Madre de Dios is not the only region affected by aggressive gold mining activities (Hilson 2002). In many regions of the world, like northern Amazonian countries (Kalamandeen et al 2018), Africa (Snapir et al 2017) or Southeast Asia (Alonzo et al 2016), gold mining is leaving a spatial footprint of deforestation and land degradation. The United Nations Framework Convention on Climate Change created an initiative to reduce forest degradation and deforestation (REDD+) (Corbera and Schroeder 2011) by paying incentives to developing countries that demonstrate reduced carbon emissions. While most countries were found to report considerably less forest loss than actually occurred, owing to differing methods and limited datasets (Nomura et al 2019), we see our methodology as a future workflow with high spatial and temporal resolution for accurately quantifying aboveground carbon emissions. In this way, national and regional authorities will have an objective, cost-effective and frequently updated monitoring system of carbon emissions to better control illegal activities and improve conservation efforts.

Data availability statement
The data that support the findings of this study are available from the corresponding author upon reasonable request.