Quantification of land cover and land use within the rural complex of the Democratic Republic of Congo

The rural complex is the inhabited agricultural land cover mosaic found along the network of rivers and roads in the forest of the Democratic Republic of Congo. It is a product of traditional small-holder shifting cultivation. To date, owing to its distinction from primary forest, this area has been mapped as relatively homogeneous, leaving the proportions of land cover heterogeneity within it unknown. However, the success of strategies for sustainable development, including land use planning and payment for ecosystem services, such as Reduced Emissions from Deforestation and Degradation, depends on the accurate characterization of the impacts of land use on natural resources, including within the rural complex. We photo-interpreted a simple random sample of 1000 points in the established rural complex, using 3106 high resolution satellite images obtained from the National Geospatial-Intelligence Agency, together with 406 images from Google Earth, spanning the period 2008–2016. Results indicate that nationally the established rural complex includes 5% clearings, 10% active fields, 26% fallows, 34% secondary forest, 2% wetland forest, 11% primary forest, 6% grasslands, 3% roads and settlements and 2% commercial plantations. Only a small proportion of sample points were plantations, while other commercial dynamics, such as logging and mining, were not detected in the sample. The area of current shifting cultivation accounts for 76% of the established rural complex. Added to primary forest (11%), this means that 87% of the rural complex is available for shifting cultivation. At the current clearing rate, it would take ~18 years for a complete rotation of the rural complex to occur. Additional pressure on land results in either the cultivation of non-preferred land types within the rural complex (such as wetland forest), or expansion of agriculture into nearby primary forests, with attendant impacts on emissions, habitat loss and other ecosystem services.


Introduction
The rural complex is a distinctive agricultural land cover mosaic surrounding the network of inhabited areas found along rivers and roads in the forest block of the Democratic Republic of Congo (DRC). It contains paths, settlements, grassy and bare communal areas as well as the constituent land cover components of various land uses, primarily those of traditional small-holder livelihood shifting cultivation: cleared land, active fields, fallow fields, secondary forest and a permeable interface area with primary forest (Lebrun and Gilbert 1954, Vandenput 1981, Mayaux et al 1999, Russell et al 2011). The human footprint in the forest block of the DRC is represented by the rural complex: both its established core and the areas of expansion along its edges, as well as more distant and isolated forest disturbances (forest perforations) (Molinario et al 2015). Areas less impacted by the human footprint are therefore wilder: less disturbed and fragmented, and containing more core forest (Sanderson et al 2002, Potapov et …).
Figure caption (fragment): … (Molinario et al 2015) was used to select the rural complex target region to be sampled. The AOI was created using rules based on the classification from Forêts d'Afrique Centrale Évaluées par Télédétection (FACET) (Potapov et al 2012) and the DRC wetland forest map (Bwangoy et al 2010).
The land cover components of the rural complex occur in variable proportions, depending on several factors that modulate the demand for food production and other natural resources, the availability of land and the social and economic cost/benefit of cropping in a given area (Miracle 1967, Ruthenberg et al 1971, Mayaux et al 1999).
Previously, satellite data of coarse spatial resolution were used to map the rural complex as a single, homogeneous class clearly distinguishable from the surrounding primary forest (Hansen et al 2008). However, thanks to the opening of the NASA/USGS Landsat archive, forest can now be mapped globally at 30 m spatial resolution and yearly intervals (figure 1). The increased spatial resolution allows for forest monitoring at local scales over regional extents and for holistic modelling of the human footprint, tracking its expansion and the consequent fragmentation of forest (Gustafson 1998, Parent et al 2007, Vogt et al 2007, Molinario et al 2015). Accurate mapping of the shifting cultivation mosaic is a necessary prerequisite for effective land use planning and land management. In Reduced Emissions from Deforestation and Degradation (REDD+), land cover and land use maps inform baseline emissions scenarios. Monitoring, reporting and verification (MRV) systems are equally dependent on, and sensitive to, land use activity maps. Small-holder agriculture has been shown to be the main source of forest degradation and forest cover loss in the DRC (Mayaux et al 2004, Defourny et al 2011, Potapov et al 2012, DeWasseige et al 2014). However, in a small-holder livelihood agricultural system, not all shifting cultivation necessarily has negative impacts on the forest ecosystem, with attendant negative repercussions on long-term sustainability (Ickowitz 2006, van Vliet et al 2012, Moonen et al 2016). For example, treating shifting cultivation univocally, without accounting for local variation and the successional land cover types derived from it, could overestimate the decrease in carbon stocks from deforestation by 46% (Akkermans et al 2013) and misinform development strategies.
Shifting cultivation's role in sustainable development has been the object of continuous debate (Ickowitz et al 2015, Van Hecken et al 2015), and previous work has shown quantitatively that shifting cultivation has varying impacts on the forest ecosystem of the DRC, ranging from areas of minimal impact to hotspot areas of forest loss and degradation (Molinario et al 2015, Harris et al 2017). The complexity of the relationship between shifting cultivation and the forest ecosystem goes beyond a binary causality of shifting cultivation with 'deforestation' (Molinario et al 2015), and the longer-term impacts of shifting cultivation on forest cover and degradation have therefore not been adequately understood or included in development schemes.
The goal of this study is to leverage recent advances in data availability to increase the quantitative understanding of the shifting cultivation cycle and the established rural complex of the DRC. We observed the constituent land cover components within this rural complex area, including the successional regrowth that is part of the shifting cultivation cycle, and inferred from these observations the mean temporal rotation rate of shifting cultivation at the national level. To date, this rate has been established anecdotally or determined empirically for a limited number of case studies (Lebrun and Gilbert 1954, Conklin 1961, Miracle 1967, Ickowitz 2006, SPIAF 2007, Akkermans et al 2013).
Figure caption (fragment): … that were reclassified from the forest fragmentation maps (a), both published by Molinario et al (2015) and augmented with GFC data from Hansen et al (2013). In this example the target region is represented by the grey area in (b).

Study area and data
The established rural complex area of the DRC is the target region from which a simple random sample of points was selected to identify land cover using high resolution satellite imagery (figure 2). More precisely, the established rural complex is the part of the rural complex that did not expand during the period 2000–2014. This target region was created by updating the rural complex map produced by Molinario et al (2015) with forest cover loss data through 2014 from the Global Forest Change (GFC) data of Hansen et al (2013). A sample of 1000 points was then randomly selected within the target region with a uniform probability density for selection over the entire region (figure 3). A number of steps were involved in acquiring and preparing the satellite imagery used for photo-interpretation of the sample points. High resolution imagery was acquired at no cost, under the conditions of the NextView license, from the Commercial Imagery (CI) archive of the National Geospatial-Intelligence Agency (NGA) through the NASA Goddard Space Flight Center (GSFC) (Neigh et al 2013). The entire archive acquired for the DRC is composed of 31 224 images. One of the challenges of working with this archive is that it is composed of data from various sensors, with heterogeneous naming conventions, metadata, number and characteristics of bands, and spatial and temporal resolutions. The archive was composed mainly of DigitalGlobe imagery: WorldView 1, WorldView 2, WorldView 3 and Orbview 5 (a.k.a. GeoEye 1). The spatial resolution of the imagery varied from 0.5 meter pixels (WorldView 1) to 1.65 meter pixels (Orbview 5). The sample intersected 3106 images, or 17% of the entire archive acquired (figure 4); 95% of the images used spanned the 2008–2016 period (table 1).
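The uniform-probability point selection described above can be illustrated with a minimal sketch. It assumes the target region is a single polygon in an equal-area projection and uses rejection sampling; the L-shaped `region`, the function names and the seed are hypothetical, and the actual study region is a far more complex map-derived area.

```python
import random

def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon,
    given as a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) towards +x cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def simple_random_sample(polygon, n_points, seed=0):
    """Draw n_points uniformly over the polygon by rejection sampling:
    propose points in the bounding box and keep those inside."""
    rng = random.Random(seed)
    xs = [v[0] for v in polygon]
    ys = [v[1] for v in polygon]
    min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)
    points = []
    while len(points) < n_points:
        x = rng.uniform(min_x, max_x)
        y = rng.uniform(min_y, max_y)
        if point_in_polygon(x, y, polygon):
            points.append((x, y))
    return points

# Hypothetical L-shaped target region (coordinates in arbitrary planar units).
region = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
sample = simple_random_sample(region, 1000)
```

Because every accepted point is drawn uniformly from the bounding box, each unit of area inside the region has the same chance of selection, which is the property the estimators for simple random sampling rely on.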
Google Earth imagery was used to supplement photo-interpretation when NGA imagery was available, and was used in lieu of it when it was not. Google Earth imagery varied in spatial resolution, with only Landsat-resolution imagery covering many of the sample points; in those cases the sample point was flagged as no-data as photo-interpretation was not possible.

Methods
The images were footprinted, intersected with the sample points and photo-interpreted. The intersecting imagery was orthorectified using code developed by the Polar Geospatial Center (PGC) of the University of Minnesota (PGC 2017). A digital elevation model (DEM) created using data from the Shuttle Radar Topography Mission (SRTM) (SRTM 2014) was used, with no-data pixels filled with ASTER DEM data (ASTER 2009). Using the same PGC code, the imagery was converted from the native national imagery transmission format (NITF) to tagged image file format (TIFF). The imagery was then manually photo-interpreted using ENVI software. The sample points did not overlap any of the available NGA imagery in 39% of the cases. In other cases, while NGA imagery was available, the sample point was covered by cloud, cloud shadow, or bad-data artefacts. In both situations, Google Earth imagery was used instead. However, as high resolution imagery was not always available in Google Earth either, ultimately 19% of the sample points were flagged as no-data (192 sample points). The remaining n = 808 sample points (81% of the original sample) were photo-interpreted to determine the land cover of the pixel containing the sample point. In the rare cases in which a sample point had high resolution images available from multiple dates, the most recent one was used to interpret its latest land cover.
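The image selection rules in this paragraph (NGA imagery first, Google Earth as the fallback, the most recent usable image when several dates exist, and a no-data flag otherwise) can be sketched as follows. The record layout, the `usable` flag, the source labels and the sample point identifiers are all hypothetical and only illustrate the logic; the study itself applied these rules during manual interpretation.

```python
from datetime import date

# Hypothetical source labels, in order of preference.
SOURCE_PRIORITY = ("NGA", "GoogleEarth")

def select_image(candidates):
    """Return the image to interpret for one sample point, or None (no-data).

    Each candidate is a dict with keys 'source', 'acquired' (a date) and
    'usable' (True when the image is high resolution and the point is free
    of cloud, cloud shadow and bad-data artefacts)."""
    for source in SOURCE_PRIORITY:
        usable = [c for c in candidates
                  if c["source"] == source and c["usable"]]
        if usable:
            # Several usable dates: keep the most recent acquisition.
            return max(usable, key=lambda c: c["acquired"])
    return None

candidates_by_point = {
    "p1": [{"source": "NGA", "acquired": date(2012, 3, 1), "usable": True},
           {"source": "NGA", "acquired": date(2015, 7, 9), "usable": True}],
    "p2": [{"source": "NGA", "acquired": date(2014, 1, 5), "usable": False},
           {"source": "GoogleEarth", "acquired": date(2013, 6, 2), "usable": True}],
    "p3": [],  # no imagery at all
}

selected = {pid: select_image(c) for pid, c in candidates_by_point.items()}
no_data = [pid for pid, img in selected.items() if img is None]
```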
The sample points were assigned a land cover label according to a 14-class legend (table 2) developed empirically from preliminary photo-interpretation, literature review and expert opinion based on observable land cover characteristics (Styger et al 2007, Lebamba et al 2009, Akkermans et al 2013). The sample points were also assigned a rating for interpretation confidence of high, medium or low (table 3). The confidence of the photo-interpretation varied as a function of several characteristics of the observed land cover, for example the spatial structure of the canopy (primary forest having larger crowns, greater tree height as evidenced by adjacent shadows, and a more differentiated canopy than secondary forest) as well as the color of the canopy (primary forest having darker foliage than secondary forest). The photo-interpreter used expert judgment, informed by two field campaigns in the DRC forest, to interpret the imagery and assign confidence flags to the photo-interpretations, as illustrated in table 2 and further addressed in the Discussion (section 5). Examples of the photo-interpreted imagery can be seen in figure 6, figure 7 and figure 8. In photo-interpreting the sample, we employed the legend of table 2, which conforms with the class definitions of Potapov et al (2012), in which forest is defined as land with ≥ 30% canopy cover for trees ≥ 5 meters tall, woodlands have between 30% and 60% tree cover, and primary and secondary forest have more than 60% canopy cover.
The sample-based estimate of the proportion of area of land cover class $i$ within the rural complex is

$$\hat{p}_i = \frac{n_i}{n},$$

where $n_i$ = number of sample points identified as class $i$ and $n$ = sample size. The estimated area of class $i$ is given by the following equation:

$$\hat{A}_i = A_{tot}\,\hat{p}_i,$$

where $A_{tot}$ = 127 796 km$^2$ is the total area of the established rural complex, as illustrated by figure 2(b).
Both the estimator of the proportion of area ($\hat{p}_i$) and the estimator of the area ($\hat{A}_i$) of each class are unbiased estimators (Cochran 1977, chapter 3). The formula for estimating the variance of the estimated proportion is the following:

$$\hat{V}(\hat{p}_i) = \frac{\hat{p}_i\,(1 - \hat{p}_i)}{n - 1}.$$

The standard error formula for the estimated proportion of area of class $i$ is

$$SE(\hat{p}_i) = \sqrt{\hat{V}(\hat{p}_i)}.$$

The standard error for the estimated area of land cover class $i$ is

$$SE(\hat{A}_i) = A_{tot}\,SE(\hat{p}_i).$$

The standard errors quantify the uncertainty or precision of the sample-based estimates. The standard error decreases as a function of the square root of the sample size $n$ (e.g. a four-fold increase in sample size will halve the standard error), and it also depends on $\hat{p}_i$. The sample size of $n$ = 1000 was based on the expectation that for $p$ = 0.10 (10% of the area represented by a class), the standard error would be approximately 0.01 (or 1% of the area represented), and this standard error was deemed acceptably small for our purposes.
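As a worked illustration of these estimators, the sketch below assumes the usual simple random sampling forms: proportion n_i / n, variance p(1 - p) / (n - 1), and area and area standard error obtained by scaling with the total area. The function name and the class count of 100 points are hypothetical; the total area and the planning case (p = 0.10, n = 1000) are taken from the text.

```python
import math

A_TOT_KM2 = 127_796  # total area of the established rural complex (from the text)

def estimate_class(n_i, n, a_tot_km2=A_TOT_KM2):
    """Simple random sampling estimates for one land cover class.

    n_i: number of sample points labelled as the class
    n:   number of sample points interpreted
    Assumes the variance estimator p(1 - p) / (n - 1)."""
    p_hat = n_i / n
    se_p = math.sqrt(p_hat * (1.0 - p_hat) / (n - 1))
    return {
        "p_hat": p_hat,
        "se_p": se_p,
        "area_km2": a_tot_km2 * p_hat,
        "se_area_km2": a_tot_km2 * se_p,
    }

# Planning case from the text: a class covering 10% of the area, n = 1000.
planning = estimate_class(n_i=100, n=1000)
print(round(planning["se_p"], 4))    # standard error of the proportion
print(round(planning["area_km2"]))   # estimated area in km^2
```

Under these assumptions the planning case gives a standard error just below 0.01, which matches the design target stated above; with fewer interpreted points the standard errors grow roughly in proportion to the inverse square root of the sample size.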

Results
Based on the sample data, an estimated 76% of the established rural complex is composed of a land cover mosaic that is the product of current and past shifting cultivation: clearings, active fields, fallow fields and secondary forest. Together with primary forest (11%), this means that 87% of the established rural complex is available to be farmed in future shifting cultivation (figure 9). The sample points were categorized into three confidence groups: 40% of all sample points were high confidence, 49% were medium confidence, and 11% were low confidence. Since 192 sample points were no-data, the results are tabulated on the remaining subset of 808 points. The disaggregation of each land cover type by confidence group (table 3) shows which land cover types are potentially more prone to confusion in their photo-interpretation. For example, 85% of settlement and road sample points are high confidence, whereas only 22% of active agriculture sample points are high confidence, 64% are medium confidence and 15% are low confidence.
While the majority of non-forest land cover sample points within the rural complex were anthropogenic, in some cases they were part of natural non-forest land cover such as river banks, landslides and even an inselberg in the Ituri forest. In other cases, such as in grasslands, the non-forest land cover may have been part of a historic anthropogenic land cover modification.

Distance sub-population estimates
Estimates of the proportion of area represented by the land cover classes were produced for each of three subregions defined by distance to the edge of the established rural complex. The subregions were defined so that all three represented approximately equal areas (i.e. each subregion had approximately the same number of sample points). This resulted in the first subregion being defined as within 180 m of the rural complex edge, the second subregion covering the area between 180 m and 725 m from the edge, and the third subregion being the area beyond 725 m from the edge (table 5). There is a clear trend towards more clearings and fallow fields in the interior of the rural complex (>725 m distance subregion), further from the interface with primary forest. Active agriculture and secondary forest, however, have similar proportions throughout the rural complex, while the proportion of mature secondary forest is lower in the innermost region (>725 m). The proportion of primary forest in the subregion within 180 m of the edge of the established rural complex is almost double that in the other two subregions, indicating its prominence in the land cover mosaic in this permeable interface area.
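The construction of distance subregions with equal numbers of sample points, and the per-subregion class proportions, can be sketched as below. This assumes the breakpoints were obtained by ordering the interpreted points by distance to the edge and cutting the list into thirds; the distances and labels in `toy_points` are invented for illustration and are not the study data.

```python
from collections import Counter

def split_into_terciles(points):
    """Split (distance_m, land_cover) records into three groups of
    near-equal size, ordered by distance to the rural complex edge."""
    ordered = sorted(points, key=lambda p: p[0])
    n = len(ordered)
    cut1, cut2 = n // 3, 2 * n // 3
    return ordered[:cut1], ordered[cut1:cut2], ordered[cut2:]

def class_proportions(points):
    """Proportion of sample points per land cover class within one group."""
    counts = Counter(label for _, label in points)
    total = len(points)
    return {label: count / total for label, count in counts.items()}

# Invented example records: (distance to edge in metres, land cover label).
toy_points = [
    (50, "primary forest"), (120, "secondary forest"), (170, "primary forest"),
    (200, "fallow"), (400, "active field"), (700, "secondary forest"),
    (800, "fallow"), (1500, "clearing"), (3000, "fallow"),
]

near, middle, far = split_into_terciles(toy_points)
for name, group in (("near", near), ("middle", middle), ("far", far)):
    print(name, class_proportions(group))
```

Each subregion's proportions can then be passed through the same simple random sampling formulas as the national estimates, with the subregion's sample size in place of n.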

Discussion
The footprint of the rural complex needs to be accurately mapped as it is an area of high carbon dynamics. Tyukavina et al (2013) estimated that aboveground carbon loss of secondary forests was greater than that of primary forests in DRC. As the rural complex is the site of rotational clearing and secondary regrowth, it should be a focus of any PES program aimed at avoiding deforestation and degradation, such as REDD+. To this end, previous work modelling and mapping the spatial patterns of the rural complex in the DRC (Molinario et al 2015) leveraged novel wall-to-wall per-pixel satellite remote sensing-based inputs (Hansen et al 2013) and indicated that higher resolution imagery was necessary to quantify the proportions of the constituent land cover and land use components of the rural complex in all its areas: the stable, established region that is investigated here, as well as the areas actively expanding into primary forest (Molinario et al 2015). Using the data and methods outlined, we showed how a simple random sample of the rural complex can be used to produce an enhanced, quantitative estimation of the heterogeneous land cover and land use components of the established rural complex.
Several challenges arose when photo-interpreting the sample points. One of the difficulties was that 19% of the sample points had no available imagery to allow an interpretation. In some cases imagery was available neither in the NGA archive nor on Google Earth, while in other cases the available imagery was too coarse, had bad-data artefacts, or was obscured by cloud cover and cloud shadow. Cloud cover, in particular, is a consistent problem in tropical environments (Ju and Roy 2008, Kovalskyy and Roy 2013), and it can only be overcome by mosaicking all available cloud-free observations from different dates (Potapov et al 2012) or by having more frequent observations from optical sensors. Integrating cloud-piercing synthetic aperture radar (SAR) data such as Sentinel-1 would aid in the land cover mapping of tropical areas (Wulder et al 2008, Malenovský et al 2012), but would need to be demonstrated as suitable for this application. A key issue with high resolution satellite imagery is that these data remain rare despite improvements in the availability of imagery. Planet's acquisition of Terra Bella and RapidEye and concurrent launch of 88 cubesats (Butler 2014, Planet 2017a, 2017b), combined with the free availability of DigitalGlobe's archive to United States Government-affiliated researchers (Neigh et al 2013), offers the possibility of enhanced earth observation monitoring at higher spatial resolutions, yet the operational capacity of these systems needs to be demonstrated. The temporal resolution of high spatial resolution data is pledged to be on par with the daily return rates of medium resolution imagery (e.g. MODIS) after the operationalization of Planet's constellation of ∼100 microsatellites. For now, land cover mapping and monitoring within reasonable timeframes (e.g. annually) is feasible mostly with medium spatial resolution sensors such as Landsat and Sentinel-2 (NASA 2017, ESA 2017). The larger data volumes of wall-to-wall, cloud-free, high resolution imagery for the tropics would also pose significant limitations. Costs are also a limitation, compared to publicly free global imaging systems such as Landsat (Wulder et al 2012). The future accessibility of very high spatial resolution data, given that such data exist exclusively within a commercial model, is not guaranteed, with implications for the ability of governments and civil society to operationally monitor land resources.
Figure caption: Distribution of area of land cover classes within the rural complex. An estimated 76% of the rural complex is already part of the cycle of shifting cultivation, and of the remaining area an estimated 11% is primary forest (see table 3). This means that 87% of the rural complex is the estimated area available for agriculture ('mtr' = mature, 'yng' = young).
For these reasons, high resolution imagery remains limited for wall-to-wall scientific research in the tropics, lending itself primarily to sample-based or localized case studies. In this study, two-thirds of the available NGA data were acquired between 2014 and 2016; however, all available data were used (table 4). We take this period to represent a snapshot in time of the established rural complex for circa 2015. If we were to repeat the study at decadal time-scales, for example, we would be able to document changes to the land use components of the rural complex, with implications for the sustainability of DRC's swidden agricultural system.
Another issue was the confusion in the photo-interpretation of certain land cover types, despite the use of the highest resolution data currently available (50 cm). This interpretation confusion was most prominent among certain land cover types; for example, active fields were often flagged as low confidence, because mature crops in the shifting cultivation cycle (generally second or third year crops) were hard to distinguish from young abandoned fallows. Similarly, mature secondary forest was often flagged as low confidence because it was sometimes difficult to distinguish from primary forest. In some cases, a tell-tale spatial pattern, showing an area of 'brighter' forest close to the edge of an active agricultural area, helped us classify the sample point as secondary forest. In most cases, we did not have the luxury of having both panchromatic and multispectral high resolution imagery for the same sample point. When we did, we leveraged the spatial resolution of the panchromatic image to assess the spatial pattern of the sample point, and the spectral information of the multispectral image to interpret the land cover of the sample point. We used false-color band combinations when we had multispectral imagery in the NGA archive for sample points that had confusion. All the multispectral imagery in Google Earth was true color imagery, available only in the visible spectrum bands.
All results should be interpreted with these caveats in mind. There is important variability within the national averages reported. While our findings offer the best and most up-to-date quantification of the constituent land cover components of the rural complex, they are not intended to replace the higher accuracies of in-depth localized case studies, in which one could address the highly nuanced qualitative socio-economic factors that drive the specific mixture of land cover and land use in a specific area (Rudel and Roper 1996, Mather et al 1998, Geist and Lambin 2002). Nevertheless, our estimate of a 5% clearing proportion of the established rural complex nationally echoes the estimate of 4.9% bare soil reported in a Landsat-based investigation of the successional cycle in the area of Kisangani (Akkermans et al 2013).
The estimates produced for the subregions defined by distance to the established rural complex edge showed that quantitatively the composition of land cover within these subregions fit the spatial and temporal model of the shifting cultivation cycle in the DRC previously published in Molinario et al (2015), which was based on extensive literature review, expert opinions and field-work. The shifting cultivation dynamic at the edge of the rural complex is dependent on the appropriation of primary forest, whereas in the interior of the rural complex, primary forest is no longer readily available, and fallows and secondary forest must be reused.

Conclusion
This study contributes quantitative estimates of the land use and land cover proportions within the rural complex of the DRC that were not previously available at the country level. At the current annual clearing rate (5%), it takes ∼18 years for all the land available for shifting cultivation in the established rural complex to be cleared at least once. When land use pressure increases, the increased demand for food and other resources can only lead to either intensification or extensification of the extraction of natural resources. For agriculture, intensification is possible only through fertilization or mechanization, while extensification takes the form of expansion of shifting cultivation into less-preferred areas of the rural complex (such as wetland forest), or expansion outside of the rural complex, either at its edges or in more isolated forest perforations (Molinario et al 2015). Our results, and the discussion of their limitations and caveats, indicate that the land cover and land use maps used to establish baseline emissions in projects such as REDD+ need to map carefully the boundaries of the established 'permanent' agricultural areas, as these are permeable, shifting, and contain a variety of successional stages of forest regrowth and forest degradation within them.
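The rotation time quoted above follows from dividing the share of land available for shifting cultivation by the share cleared each year. The sketch below uses the rounded percentages reported in the text (87% available, 5% clearings, read here as an annual rate); the function name is hypothetical, and the assumption that the reported ∼18 years was computed from unrounded estimates is ours.

```python
def rotation_years(available_share, annual_clearing_share):
    """Years needed to clear all available land once at a constant
    annual clearing rate (both arguments are shares of the same area)."""
    if annual_clearing_share <= 0:
        raise ValueError("annual clearing share must be positive")
    return available_share / annual_clearing_share

# Rounded shares reported in the text.
years = rotation_years(available_share=0.87, annual_clearing_share=0.05)
print(round(years, 1))
```

With these rounded inputs the quotient is a little over 17 years, in line with the ∼18 years reported. The result is sensitive to small changes in the clearing share, so it is best read as an order-of-magnitude rotation length.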
In a small number of cases, the sample points were clearly part of a commercial land cover change dynamic, such as plantations (2%). However, neither logging nor mining was found in our sample, which suggests that logging and mining dynamics are largely absent in the established rural complex. It should be expected that plantations, logging and mining occur further from the historically inhabited agricultural core area of the rural complex, where the extractive resource base is located or where competition for land is less intense. Our results pertain only to the land uses that are discernible in the photo-interpretation of high resolution satellite imagery. Whether the observation of a clearing or an active field is linked to land uses such as palm oil plantations, logging concessions, or mines, or to the production of foodstuffs and gathering of fuel for distal populations in urban areas, was not determined, but should be the object of future research.