Forest structures across Europe

Pan‐European gridded datasets derived from a single methodology to inform researchers, policy makers and conservationists on the state of forest structures would improve our ability to study forests independent of political boundaries and along various gradients. Although National Forest Inventory (NFI) data provide information on the characteristics of forests, including carbon content, volume, height, and age, such spatial data is not available across Europe. Before this study, remotely sensed data covering all of Europe had not been utilized to produce multiple European‐focused forest structure datasets over the entire continent that are methodologically consistent and comparable. We used existing European data to develop a pan‐European dataset covering tons of live tree carbon and volume per hectare, mean tree height and mean tree age by integrating remotely sensed and harmonized NFI data from 13 different European countries. We produced a pan‐European map for each of the four key variables on a 0.133° grid representing the time period 2000–2010.


Introduction
European forests are locally and globally important as they provide innumerable ecosystem services to Europeans, contain approximately one-third of the world's temperate forest biomass and are estimated to include over one-third of the world's temperate forest carbon sink (Pan et al., 2011). To assess the impact climate change will have on European forests it's imperative to spatially understand the current state of forest characteristics (Cramer et al., 2001). Continental forest structure information is typically quantified by forest inventories and is used to assess biodiversity, habitat suitability, timber market value, among many other ecosystem services (Cohen et al., 1995). There are 28 member countries in the European Union and even more forest inventories (Tomppo et al., 2008;Tomppo et al., 2010). Each forest inventory is conducted and distributed independently from one another with some countries (e.g. Italy) applying different inventory systems for each province, most of which are not publicly available. Thus, collecting and analyzing NFI data across Europe without a centralized system is challenging and hinders the effectiveness Europe's climate change mitigation and impact monitoring schemes (Kollmuss et al., 2015).
The United Nations Food and Agricultural Organization (FAO) publishes the Global Forest Resource Assessment (FRA) report to collate National Forest Inventory data (NFI). This report provides countrylevel inventory information on variables such as total carbon and volume (FAO 2010). Data, however, are not provided below country-level totals and averages prohibiting analysis that transcend political boundaries.
Researchers have previously created a collated NFI dataset focusing on Europe (Schelhaas et al., 2006). The European Forest Information Scenario Model (EFIScen) project gathered data from several European countries and created a freely available database. The EFIScen database is grouped into jurisdictions, or site types whose spatial boundaries are unknown and vary by country prohibiting spatial analysis by those outside of the project. Within the EFIScen project, this data was used to create various gridded datasets (Gallaun et al., 2010;Brus et al., 2011;Vil en et al., 2012). These gridded datasets, however, have limited or no accessibility, provide only one forest attribute each and are developed using different methods.
Other projects have also harmonized NFI datasets throughout Europe but without providing the data spatially (Tomppo et al., 2010;Kempeneers et al., 2013). Spatially explicit data on forest structures would allow analyses that cross political borders that cannot be done with country-level statistics; such as, studying forests across climate, elevational, species or human access gradients.
On the global scale, Simard et al. (2011) created a map of canopy height and Crowther et al. (2015) mapped tree density. The local applicability of global datasets is diminished by a lack of enough spatially explicit input data to accurately represent smaller areas such as individual European countries or regions, such as the Alps. Input data from European forests in the global tree density map are from 6,000 research plots with each plot covering less than 1% of the land area of the cells they represent (http://icp-forests.net/; Crowther et al., 2015). Crowther et al. (2015) states that there is high confidence in tree density on the global scale and low confidence at smaller scales. Instead of NFI data, Simard et al. (2011) used previously produced space based LiDAR data and topographical maps (Lefsky et al., 2005) to create a gap-filled 1 9 1 km map of canopy height. Global datasets are designed to be used on global scales. At smaller scales the accuracy of these datasets diminishes (Gemmell, 1995). Spatially explicit country or regional level maps of forest data have also been developed (Ruiz-Labourdette et al., 2012;Maselli et al., 2014). These datasets, however, may not be directly comparable to similar datasets from nearby areas as methods and assumptions may differ. Herein lies the tradeoff between scale and accuracy. Global datasets cover a large area but become less accurate on smaller scales, and local or regional-level datasets are accurate at local-scales but may be spurious to combine for studying larger areas. Prior to this study, no pan-European spatially explicit gridded dataset existed based upon common input data and one methodology for various forest structures.
We created pan-European spatially explicit gridded datasets to allow a consistent assessment of forest resources in Europe. Such data will support forest policy and management decisions across countries by assessing the role of forests for the growing bio-economy sector, quantifying the mitigation potential of European forests, and analyzing the ecosystem services provided by European forests transcending political borders.

Input data
We utilized both terrestrial observations (NFI data, Biogeographic Regions) and remotely sensed data (MODIS Land Cover, MODIS NPP, MODIS NPP Trend, Tree Canopy Height) ( Table 1). All datasets used in this study represent the time period 2000-2010. Only the raw NFI data is not publicly available, which we did not use in this study.

National Forest Inventories (NFI)
We used harmonized, processed and aggregated plotlevel National Forest Inventory (NFI) data from Neumann et al. (2016b). The original NFI data came from 13 countries in Europe: Austria, Belgium, the Czech Republic, Estonia, France, Finland, Germany, Italy, Norway, Poland, Romania, Sweden and Spain. Each country provided plot-level data from their own inventory systems ( Table 2). The data was collected between 2000 and 2010 and consists of 261 465 plots with falsified locations within a 0.0083°cellmeaning that the exact location of a plot can be anywhere within a known 0.0083°grid cell.
We aggregated the Neumann et al. (2016b) plotlevel forest variables (tons of live tree carbon/ha, volume/ha, mean tree height, and mean tree age) to a 0.133°resolution (grid-cell area range: 75-175 km 2 ) by averaging all of the plot values within a 0.133°g rid-cell to make gridded terrestrial data. To account for different sampling densities along country borders we weighted each plot by the forest area it represents using the country's particular grid design. This aggregation was done to maximize the confidence in the gridded terrestrial data (Figure 1, methods section; Moreno et al., 2016b).
Carbon data is given as tons of live tree carbon per hectare which includes carbon in the stem, branches, foliage, coarse and fine roots as well as foliage but does not include soil or dead-wood carbon (Neumann et al., 2016a,b). Volume is given as m 3 per hectare and has different definitions by country. Some countries define volume as the merchantable timber or growing stock while others define it as the total tree volume including branches; some measure volume over bark and others under. These definitions then affected field measurements; which cannot be changed. Carbon and volume estimates were provided to Neumann et al. (2016b) by each individual country using their own local volume calculation method but using the same carbon calculation method dictated by Neumann et al. (2016a,b).
Tree age is given as age classes. Each country has different age classes. We harmonized to eight age classes: 0-20, 21-40, 41-60, 61-80, 81-100, 101-120, 120-140, >140(years). Height is calculated as the mean height of all trees within a grid-cell as opposed to the dominant tree height or other measures traditionally used for plot-level height measurements. All of these measures only reflect trees that meet the minimum diameter and height requirements, as defined by each individual country, to be included in an NFI . These calculations were performed by individual countries and given to Neumann et al. (2016b), preventing access to tree level data.

MODIS land cover
The MODerate resolution Imaging Spectroradiometer (MODIS) is a sensor on the Terra and Aqua satellites. MOD12 land cover is given at a 0.0083°resolution (in Europe cell area range: 0.27-0.71 km 2 , approximately 1 km 2 at the equator) globally and includes 11 vegetative cover types: evergreen broad leaf, deciduous broad leaf, evergreen needle leaf, deciduous needle leaf, woody savanna, savanna, crops, open shrub lands, closed shrub lands, grassland, and mixed forest . The land cover data is produced by implementing the MODIS 12 Q1 collection four land cover classification algorithm. This algorithm is a supervised classification that gap-fills the International Geosphere-Biosphere Program Data and Information System (IGBP).

Biogeographical regions
Biogeographical regions were collated from individual EU member countries' reports and are given on a 1 9 1 km resolution. Biogeographical regions include: Alpine, Arctic, Black Sea, Continental, Mediterranean, Pannonian, Steppic, Atlantic, and Boreal (European Environment Agency, 2012). We aggregated bioregions to produce three different bio-region maps, thereby creating clustering options for our A basal area factor is required for countries with ACS. Plot area is required for the countries with FAP. Min. DBH is the minimum diameter at breast height required for each tree to be included in the sample. The arrangement of sample plots indicates whether the plots are arranged as single plots or within clusters. Some countries have varying distances between plots/clusters depending on location.
parameterization process described in the methods section. One map has 6 regions: Alpine, Continental, Mediterranean, Pannonian, Boreal (includes: Norwegian Alpine, and Arctic regions), and Atlantic. Another map includes three regions: Northern Europe (includes: Arctic, Norwegian Alpine, and Boreal regions), Central Europe (includes: Atlantic, Continental, Pannonian, and Steppic regions) and Southern Europe (includes: Mediterranean region). The third map has no bio-region designations and is only a mask of Europe.

MODIS NPP and MODIS NPP trend
Net Primary Production (NPP) was derived using the MOD17 algorithm. This algorithm, using MODIS fraction of Photosynthetically Active Radiation (FPAR), Leaf Area Index (LAI), land cover, climate data and a biome-property lookup table, calculates global NPP data every 8 days (Running et al., 2004). The MODIS NPP dataset we used in this study is an improved dataset of NPP for European Forests (MODIS_EURO) . MODIS_EURO was derived using the original MOD17 algorithm and a European regional climate dataset (Moreno and Hasenauer, 2015). This NPP dataset is freely available at ftp://pala ntir.boku.ac.at/Public/MODIS_EURO. The NPP trend data results from a linear regression fit line for annual values by cell for the original 0.00833°resolution from 2000 to 2010. The trend is then given as the slope of the linear regression line. We then aggregated to the 0.133°resolution by finding the average trend within the larger cell area.

Canopy height
Global tree canopy height data on a 0.0083°was produced by gap-filling Geoscience Laser Altimeter System (GLAS) data from the Ice, Cloud, and Land Elevation Satellite (ICESat) with 65 m footprint vertical profiles on north-south transects (Simard et al., 2011). This data has large areas without empirical observations. Simard et al. (2011) gap-filled GLAS data using annual mean precipitation, precipitation seasonality, annual mean temperature, temperature seasonality, elevation, tree cover percentage, and protection status as co-variates. They defined canopy height as the height of the tallest tree or the average height of the three tallest trees. The canopy height in this dataset is measured as the difference in elevation between the lowest ground and highest canopy points within a cell.

Climate limitation index
We define our climate limitation index is the product of three normalized gridded datasets: average growing season length, average annual short wave solar radiation (SWRad), and average annual vapour pressure deficit (VPD). Average growing season length is defined as the average time between the onset of increasing MODIS Leaf Area Index (LAI) in the spring and the end of decreasing autumn LAI (Myneni et al., 2002). LAI measures the one-sided green leaf area per unit ground surface area and shows seasonality for all tree species that can be encountered throughout Europe (Tian, 2004). The result is the relative growing season length in a cell compared to other locations in Europe.
SWRad and VPD are both calculated using the MtClim algorithm with climate data and the GTOPO30 DEM (data available from the U.S. Geological Survey) as inputs (Thornton and Running, 1999;Moreno and Hasenauer, 2015). This algorithm uses east and west horizons, aspect and slope along with climate data and day length to calculate the SWRad and VPD by calculating how the geometry of daily solar radiation and climate affect these two parameters. These datasets are then normalized. The climate limitation index is the product of these normalized maps. Temperature is not explicitly used because throughout Europe forests behave differently in response to temperature depending on the bio-region. Growing season length, however, is used to assess the temperature limitation on ecosystems.

Gap-filling algorithm
To derive data sets of forest structure everywhere in Europe, and not only where we have NFI plot data we gap-filled areas where we did not have NFI data. We developed a gap-filling algorithm using a two-step process: cluster analyses and kNN nearest neighbour to create a pan-European dataset of forest structures (Figure 2).
Clustering is the process in which cells are grouped together based on similarity defined by the user. Clustering can be done by predefined groups or by using a N Figure 1. Gridded terrestrial data based on NFI data. The colour indicates the number of NFI plots within a given gridded terrestrial cell. Note: Estonia is the only country in our data where the NFI is based on a random inventory grid design resulting in the number of plots per grid-cell being disproportionally high.
k-means machine learning principle (Hartigan and Wong, 1979). The k nearest neighbours algorithm (kNN) is a method that, for every point, finds its nearest neighbour based on co-variates representing different aspects of the points (Manning and Schutze, 1999).
We gap-filled every cell in Europe that does not have gridded terrestrial data (based off of NFI data) with similar cells that do have gridded terrestrial data based on the co-variates: NPP, NPP trend, canopy height and our climate limitation index (Figure 2 Coverage with MODIS_EURO data  was the limiting factor in spatial extent (Figure 2: step 1). We processed NFI data to produce a corresponding gridded terrestrial dataset with plots averaged under aggregated MODIS grid cells (from 0.0083°to 0.133°resolution) (Figure 2: step 1, Figure 1). Aggregation is necessary because a minimum number of NFI plots within a gridded terrestrial cell are needed to have sufficient statistical confidence in the derived cell value . At 0.0083°resolution (in Europe cell area range: 0.27-0.71 km 2 ) there is typically only one plot within a cell, whose exact location is unknown, which gives an undefined confidence interval. At 0.133°resolution (in Europe cell area range: 75-175 km 2 ) the average number of NFI plots per grid-cell is 27 throughout Europe ( Figure 1) . For noncategorical results derived from NFI data (carbon, volume, height) the aggregated cell value is an arithmetic mean of the plots within the cell. Age class was assigned using the most frequent plot age class within a cell. All other datasets were then aggregated to this 0.133°resolution (approximately 75-175 km 2 ) (Figure 2: step 2, Table 2).

Parameterization (Figure 2: step 3)
Parameterization was performed using an iterative algorithm that allowed assessment of hundreds of  combinations of internal parameters to find the best combination as defined by the following statistical measures: Mean Bias Error (MBE), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Shift, Kurtosis, and Variance (Willmott and Matsuura, 2006). This was done by using a 'leave one out' cross-validation approach to test the gap-filling algorithm estimate versus the gridded terrestrial data (Kohavi, 1995). We evaluated the cell-level error, indicated by the MBE, MAE and RMSE as well as the overall distribution error, given by the three moments: variance, skew and kurtosis. Analyzing both the overall distribution and celllevel error informs us on the biasvariance trade off inherent to a kNN algorithm giving us the ability to optimize cell-level accuracy or match continental-scale distribution, but not both as they are mutually exclusive (Hero et al., 1996). Our gap-filling algorithm has five parameters that must be optimized to achieve the desired output: bio-regions, cover types, k-means clusters, nearest neighbours and the strength of the inverse distance weighting. Bio-regions refer to the number of regions we used for clustering cells whether it be 6, 3 or no bio-regions. Cover types could consist of 11 cover types (as defined by MODIS) or no cover types. We also determined how many k-means clusters to use within each bio-region/cover type group. The number of clusters had no limit. We then determined the number of nearest neighboursthe number of which had no limitwhich defines the number of gridded terrestrial cells that would be used to calculate the gap-filling estimation based on Euclidean distance. Finally, the strength of the inverse distance weighting determines how the 'distance' in co-variate space affects the weighting of each nearest neighbour on the gapfilling estimate. We used a nearest neighbour approach without regression modelling for gap-filling and with this approach, no pre-defined segregation of data was used for parameterization but rather an iterative process of data exclusion was performed using a cross-validation approach (Reese et al., 2003;Tomppo et al., 2008;Kempeneers et al., 2011).

Gap-filling (Figure 2: step 4)
We created a pan-European gridded dataset of various forest structure variables by gap-filling all areas where we have no gridded terrestrial data, i.e. where we lack NFI data (Figure 1). Our gap-filling algorithm first clustered similar forests and then performed a nearest neighbour algorithm (kNN) on cells without gridded terrestrial data with cells that have gridded terrestrial data (Figure 1). The cells are first clustered by cover type, bio-region and k-means clustering (Figure 2: step 4a and 4b). This clustering is done to group forests with the same biophysical and site properties. The kNN was then performed within each cluster (Figure 2: step 4c). The kNN is based on our 4-dimensional covariate space (Table 1). kNN finds the nearest cells within our co-variate space to our target cell. For every 0.133°cell without gridded terrestrial data within a cluster we performed the kNN process. The nearest neighbours are the closest cells that have gridded terrestrial data based on the Euclidian distance ( Figure 2: step 4d). The neighbours are then combined using an inverse distance weighting formula (Figure 2: step 4e) to produce a gap-filling estimate (Figure 2: step 4f). Carbon and volume are per hectare values allowing users to combine our data with the forest cover data of their choice to produce total estimates, e.g. converting tC/ha to tons of C (Figure 2: step 4 g and step 5).

Algorithm development and parameterization
Our automated parameterization process determined the combination of parameters that resulted in the least bias and error and best distribution fit. Although it is possible to parameterize each variable individually we optimized accuracy for tons of live tree carbon/ha and used these parameters for every other variable to maintain comparability and consistency throughout each dataset. We used 6 bio-regions, 11 cover types and 2 k-means clusters and 1 nearest neighbour. The reason we used one nearest neighbour is several fold; we preferred continental scale distribution accuracy over cell-level accuracy, the cells are already averages of numerous plots (on average 27 plots), and the MAE of each variable is still within the standard deviation. By using the same parameters for each variable, we forced each data set to make estimates using the same cell so that the different variables will align with one another.
Input co-variates must delineate forest structure and correlate as little as possible to avoid inaccurate and biased results. The co-variates we used in our algorithm were chosen to represent different aspects of a given forest area. The co-variates we used are: NPP, climate limitation index, NPP trend and canopy height.
NPP values represent points along a productivity curve (Zhao et al., 2005). Given the same forest type, the productivity curve fluctuates depending on the limiting factors to growth (Jolly et al., 2005). The climate limitation index acts as a site quality index to match forests with similar productivity curves. Once the productivity curve is determined, NPP data in conjunction with the height data give us an indication where along a productivity curve a forest lies (Monserud, 1984). An absolute NPP value is not sufficient to determine placement on the productivity curve because NPP curves have a maxima followed by a decrease and eventual plateau (Cramer et al., 1999). This results in multiple forest development stages having the same absolute NPP value. Height in this instance acts as a proxy for forest development stage.
NPP trend gives us two relevant pieces of information. First, it tells us if the forest is reaching a plateau or if it is increasing or decreasing in productivity rate and it also gives us an indication of the presence of management. No management history was provided by the NFI data and no spatially explicit dataset of forest management currently exists. Forests with similar ages and NPPs can have different forest structures based on previous management practices (Brown, 1981;Harmon et al., 2009). Two stands planted at the same time with one being unmanaged and another managed may have similar absolute NPP values but the NPP trend in one may be plateauing while the other continues increasing (Aber et al., 1978). Therefore, the NPP trend map is an attempt to delineate forest structures that occur under the forest canopy and thus cannot be captured directly by remote sensing data. Elevation alone was not used as a co-variate because it was used in the calculation of VPD and solar radiation which are parts of our climate limitation index.

Gridded data
We used our algorithm to derive gridded pan-European forest characteristics to analyze the spatial distribution of forest resources across Europe. We focused this study on four forest structure variables: tons of live tree carbon/ha, volume/ha, mean tree height and mean tree age (Figure 3) representing the time period 2000-2010. Using our data, we can analyze the spatial differences in forest resources across countries throughout Europe. All four forest characteristics were developed to be forest area independent. Carbon and volume are given as per hectare values which can be used along with any forest area dataset to create total carbon and volume estimates. Height and age units are given as metres and age class, respectively, both area independent, i.e. values are irrespective of the amount of forest within a cell. Grid cells with little forest cover can still have a few tall and/or old trees reflecting the original NFI input data. For example, agricultural and dry-lands in Spain, where little gap-filling was performed, because we have NFI data for most of this area, still has height and age values in areas considered nonforest.
We created a mean tree height dataset as opposed to the canopy height or tallest tree height because at a 0.133°resolution there will always be at least three tall trees. Consequently, canopy or dominant tree height, which is meaningful at the plot-level, gives little insight into the forest on such a resolution as used in this study. Mean tree height, however, informs us as to how tall the forest is on average within a cell. The mean tree age data gives a landscape level age profile of a forest. It provides information as to how old a forest is on the landscape level rather than the age of trees in the dominant or first layer traditionally used in forest growth research. This allows us to understand forest age structure over large areas without focusing on plot-level variance and may provide insight into rotation lengths. This information can be useful for regional level planning and understanding how larger scale policies and management have affected the forest age structure across counties, states and countries.

Error analysis
We used cross-validation methods to assess the accuracy of our gap-filling algorithm (Tomppo et al., 2008). Cross-validation refers to iteratively taking out a portion of input data (gridded terrestrial data), estimating that data, and comparing the estimate with the removed data values. We performed a 'leave-one-out' crossvalidation where we estimated each cell using data regardless as to from which country that data comes resulting in a European-wide accuracy/error estimate. The same was also done for a 'country-wise' cross-validation. In this validation, we removed an entire country's data and estimated every point within the country only using data from other countries.
A leave-one-out cross-validation gives a more accurate and unbiased assessment of accuracy given large Table 4. Cross-validation values by country. Cross-validation was done using a leave-one-out cross-validation and a countrywise cross-validation where we removed entire country data sets and estimated every cell within that country using data from other countries. Tons of live tree carbon/ha, volume (m 3 /ha), and height (m) are given as the median of the % relative difference to the input gridded terrestrial data value. Age class is given as the mean of the absolute age class difference between cross-validation and gridded terrestrial values. Negative values indicate an underestimation in the algorithm. Leave-one-out is the cross-validation method by which input data cell is removed and estimated using other cells regardless as to from which country they are. Country-wise is the same as leave-one-out, except cells used for estimation must be from different countries than the original input data cell. Standard deviation (SD), Mean bias error (MBE), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE). CI is the average confidence interval of the NFI data that makes each gridded terrestrial data cell. N is the number of 0.133°cells evaluated.
stratified data as compared to sectional cross-validation, i.e. cross-validation by country or region (Kohavi, 1995). The country-wise cross-validation is a test of the robustness of our input data because the countries we obtained for input data were selected to cover the latitudinal gradient of Europe. In some bio-regions, if a country is removed then there is little data for estimation. For example, in southern Europe we have data from Spain and parts of Italy. Estimating Spanish cells using only the few Italian points will result in higher error values in Spain whereas this effect is not present in central Europe where we have data from more countries. The statistics for validation are defined by Willmott and Matsuura (2006). To place the validation error in the context of the random error associated with our input gridded terrestrial data we evaluated the confidence interval of the NFI data within each gridded terrestrial data cell. Confidence interval is defined as: where r is the standard deviation of a cell and N is the number of plots within a cell. We used a critical value of 1.96 as our constant (a = 0.05). We can use the confidence interval at the European scale because each country samples at different distances, some including clusters or actual random samples, fulfilling the assumption of randomness. We provide the mean values for the original gridded terrestrial data and the cross-validated data and the corresponding standard deviations. We measured the bias via mean bias error (MBE) and the accuracy with the mean absolute error (MAE) and the root mean squared error (RMSE). Some countries did not provide certain variables which are reflected in the number of cells used in our analysis (Table 3). To further understand the error and the accuracy we examined the error spatially ( Figure 4) and by country aggregates (Table 4). The MAEs for all variables in the leaveone-out cross-validation are all within the standard deviation and the height estimates are also within the confidence interval. MAE and RMSE in the countrywise cross-validation are largely driven by southern Europe and the Czech Republic because in the south we only have data from two countries and the Czech Republic has exceptionally high tons of live tree carbon and volume per hectare values.

Dataset Location and Format
Data is provided as GeoTiffs with a WGS84 projection and the resolution is 0.1333°. File sizes are 373 290 bytes. The code used to produce this data is given as a python script. Data and code is archived at figshare (https://dx.doi.org/10.6084/m9.figshare.c.3463902) and at ftp://palantir.boku.ac.at/Public/ForestCharacter istics. All data has a cc-by license and the code is under a MIT agreement.

Data Use and Reuse
We derived methodologically consistent and comparable datasets on various forest structures permitting research that transcends political boundaries allowing spatially explicit studies across any gradient. These datasets can also be combined to study how different forest structures interact under various conditions. Various forest growth models also require input data similar to the data presented here, allowing such modelling over the entire European continent in a spatially explicit manner. If such datasets are produced in the future than the effect that a changing climate has on forest structure on the continental scale can also be studied. Additionally, our algorithm can produce other forest structure datasets in the same manner to be used in conjunction with the four variables presented in this study.