Forest harvest dataset for northern Colorado Rocky Mountains (1984–2015) generated from a Landsat time series and existing forest harvest records

This dataset provides a shapefile containing approximately 3500 polygons with the location, extent, size, and year of clearcut harvest events occurring between 1984 and 2015 in forested areas of northern Colorado within Landsat WRS-2 scene Path 034, Row 032. Harvest events were modeled and mapped using a 32-year time series of Landsat imagery, the LandTrendr algorithm, and ancillary datasets. The dataset also conveys information on the elevation, aspect, ownership, distance to roads, and the watershed where each harvest event occurred.


Type of data
Analysis of forest management trends over the last 30 years and connections with policy or disturbance events.
Contributes to further study of forest carbon dynamics, forest regeneration processes, and ecological comparisons of harvested versus unharvested systems.

Data
This forest harvest history dataset was derived from a 32-year time series of Landsat imagery using advanced image normalization and disturbance detection methods. Historical harvest spatial extents were modelled by integrating limited existing forest harvest records with time series data in a boosted regression trees classification algorithm.

Landsat data acquisitions and processing
We acquired Landsat surface reflectance higher-level data products for Landsat 4 and 5 Thematic Mapper (TM), Landsat 7 Enhanced Thematic Mapper Plus (ETM+), and Landsat 8 Operational Land Imager (OLI) for the months of July, August, and September between 1984 and 2015 for WRS-2 Path 034, Row 032 via the USGS Earth Resources Observation and Science (EROS) Center Science Processing Architecture. All available images with less than 50% cloud cover were selected for our analysis, totaling 182 individual scenes across the 32-year study period.
We processed these Landsat scenes using LandsatLinkr (LLR) [1], an R [2] package that creates annual, cloud-free, spectrally-consistent tasseled cap [3] composites for use in change detection analyses. The LLR package unpacks Landsat surface reflectance products, masks each scene for clouds, water, snow, and ice, and calculates tasseled cap brightness, greenness, wetness, and angle using standardized coefficients [3]. In years where both Landsat 7 and 8 are available, spectral calibration is performed by the LLR tool using near-date imagery to create an aggregate tasseled cap model that can be applied to all Landsat 8 OLI imagery to provide spectral consistency between TM/ETM+ and OLI sensors [4]. The tasseled cap images produced for each scene are then composited using the mean pixel value of all available images for each year to create annual, cloud-free tasseled cap composites.
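The compositing step described above can be illustrated with a minimal sketch. This is not LandsatLinkr code (LLR is an R package); the function name `annual_mean_composite`, the use of NaN to mark masked pixels, and the toy 2 x 2 grids are all assumptions made for illustration.

```python
import math

def annual_mean_composite(scenes):
    """Per-pixel mean over same-year scenes, skipping masked pixels.

    scenes: list of 2-D grids (lists of rows) with identical shape.
    Masked pixels (cloud, water, snow, ice) are assumed to be stored
    as math.nan; this convention is hypothetical, not LLR's own.
    """
    n_rows, n_cols = len(scenes[0]), len(scenes[0][0])
    composite = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Keep only the unmasked observations for this pixel.
            valid = [s[r][c] for s in scenes if not math.isnan(s[r][c])]
            # A pixel masked in every scene stays masked in the composite.
            row.append(sum(valid) / len(valid) if valid else math.nan)
        composite.append(row)
    return composite

# Two hypothetical tasseled cap images from the same year.
nan = math.nan
july = [[0.10, nan], [0.30, nan]]
august = [[0.20, 0.40], [nan, nan]]
composite_1990 = annual_mean_composite([july, august])
```

A pixel observed in both scenes receives the average of the two values, a pixel observed once keeps that single value, and a pixel masked in every scene remains masked.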

Disturbance detection
We used LandTrendr [5] to delineate all spectrally-detectable disturbance events. We implemented a modified version of LandTrendr (LLR-LandTrendr) [6] that uses tasseled cap composites produced through LLR. LandTrendr is an advanced Landsat-based change detection algorithm that tracks changes in spectral trajectories at the pixel level to characterize disturbance events. We used tasseled cap wetness as our index for spectral segmentation and used default segmentation and trajectory-fitting parameters with no cover model. This resulted in multiple LandTrendr products, including greatest disturbance magnitude, duration, and pre-disturbance event spectral vertex values, which were used as predictors of harvest presence. Pixels identified as disturbed in LandTrendr outputs were used to limit the study area because the products represented spectrally-detectable disturbances of all magnitudes.
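To give a rough sense of what a "greatest disturbance" product summarizes for one pixel, the sketch below finds the largest year-to-year drop in a wetness time series. It is deliberately much simpler than LandTrendr, which fits multi-year segments to the trajectory rather than comparing adjacent years. The function name `greatest_decline`, the sample values, and the assumption that a decline in wetness marks a disturbance are all illustrative.

```python
def greatest_decline(years, wetness):
    """Return (year, magnitude) of the largest year-to-year wetness drop.

    The returned year is the later year of the pair. If the series never
    declines, the result is (None, 0.0). This is a simplified stand-in,
    not the LandTrendr segmentation algorithm.
    """
    best_year, best_mag = None, 0.0
    for i in range(1, len(wetness)):
        drop = wetness[i - 1] - wetness[i]  # positive when wetness falls
        if drop > best_mag:
            best_year, best_mag = years[i], drop
    return best_year, best_mag

# Hypothetical annual tasseled cap wetness values for a single pixel.
years = [1990, 1991, 1992, 1993, 1994]
wetness = [-100, -110, -400, -350, -300]
year, magnitude = greatest_decline(years, wetness)
```

For this invented series the sharpest decline falls between 1991 and 1992, followed by a gradual recovery, which is the general shape a clearcut would be expected to produce.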

Modelling and mapping forest harvest events
Ancillary vector data containing forest management records were compiled from the 2014 LANDFIRE Public Events Geodatabase [7] and previously-validated harvest polygons (Woodward et al., unpublished data). Harvest polygons were visually inspected and validated using high-resolution (≤ 1 m²) aerial imagery. We buffered the validated harvest polygons (n = 354) by 30 m to remove mixed pixels at the edge of harvests. Spatially-balanced points within the buffered harvest polygons (n = 1510) and background points (n = 4408) were generated to train the models described below.
Modelling was performed using the Software for Assisted Habitat Modeling (SAHM) module package for VisTrails, an open-source provenance management and scientific workflow system [8]. Predictor variables included greatest disturbance magnitude, duration, and pre-event vertex values, topography (elevation, slope, aspect, and compound topographic index) and distance to roads. We evaluated five classification models: random forests (RF), multivariate adaptive regression splines (MARS), generalized linear model (GLM), boosted regression trees (BRT), and maximum entropy (MaxEnt).
We compared evaluation metrics from a ten-fold cross-validation to select the final model. We assessed model performance with Area Under Curve (AUC), Cohen's Kappa, True Skill Statistic (TSS), sensitivity, specificity, and percent correctly classified (PCC), rather than relying on a single statistic, to allow for a better overall model evaluation [9]. We selected the BRT model (AUC = 0.98, Cohen's Kappa = 0.86, TSS = 0.89, sensitivity = 0.93, specificity = 0.95, PCC = 94.8) to generate the final maps.
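For readers unfamiliar with these statistics, the sketch below computes the threshold-dependent metrics from a binary confusion matrix using their standard definitions. The function name `classification_metrics` and the counts are hypothetical and are not the study's results. AUC is omitted because it requires the continuous model scores rather than a single confusion matrix.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard metrics from a binary confusion matrix.

    tp/fn: harvest points predicted as harvest / as background.
    tn/fp: background points predicted as background / as harvest.
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    tss = sensitivity + specificity - 1.0
    pcc = 100.0 * (tp + tn) / n    # percent correctly classified
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = (tp + tn) / n
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (p_observed - p_expected) / (1.0 - p_expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "tss": tss,
        "pcc": pcc,
        "kappa": kappa,
    }

# Hypothetical counts for illustration only.
m = classification_metrics(tp=90, fp=20, fn=10, tn=180)
```

Because background points outnumber harvest points in this invented example, PCC alone would look favorable even for a weak model. Kappa and TSS account for that imbalance, which is one reason to report several metrics together.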
Final outputs (Fig. 1) were refined by filtering out disturbance patches smaller than 11 contiguous pixels [5]. Outputs were converted from raster to vector format and additional information was appended to each polygon: year of harvest derived from LandTrendr outputs, elevation, aspect, slope,