Using Spatial Regression Models in Identifying the Drivers of Forest Structure in the Hyrcanian Forests of Iran


 Background: Understanding the relationships between forest structure, in particular attainable height, and the environment is important for sustainable forest management. Similarly, modeling structural attributes improve our understanding of forest growth dynamics and may identify key drivers of long-term changes in the forest ecosystem. Due to the inherent complexity of these relationships, quantification of some drivers of forest growth is often not available, resulting in spatially auto-correlated errors of the regression model. Methods: To explore the tree height-environment relationships of oriental beech we compared the performance of a standard regression model (multiple linear regression, MLR) to those accommodating a spatial correlation structure, specifically a Generalized Least Squares model with exponential correlation structure (GLS) and three variations of the Simultaneous Autoregressive Model (SAR): the spatial lag model (SLM), the spatial Durbin model (SDM) and the spatial error model (SEM). Across 127 0.1 ha circular sample plots in the primeval World Heritage Hyrcanian Forests of Iran, we collected data on tree height and edaphic and topographic. Within each plot, the height of all trees with DBH ≥ 7 cm was measured. Results: The results showed that SAR and GLS models reduced spatial autocorrelation of model residuals and improved model fit, with both SDM and SEM slightly superior to the SLM in removing spatial autocorrelation in the model residuals. SDM performs better than SEM in terms of RMSE and adjusted R2. Conclusions: Although SAR-based models performed marginally better than GLS, we still recommend GLS for spatial analyses due to their easier implementation and ease-of-use compared to SAR models. However, when the computation time is a concern, SAR-based models can be more useful because of faster execution. Keywords: spatial autocorrelation; Hyrcanian forests; multiple linear regression model; simultaneous autoregressive model; generalized least squares


Background
In primeval forests, tree height is one of the most important forest structural parameters for both forest ecology and forest utilization: forest tree height, forest productivity and carbon sequestration are strongly related (Lou et al. 2016). And tree height also correlates positively with terrestrial plant diversity at various spatial scales (Lindenmayer et  Sustainable forest management requires an understanding of the relationships between forest structure attributes such as tree height and environmental parameters for multiple reasons. Modeling the structural attributes and environment can be used to understand the drivers behind variation in forest structure and to identify key indicators for monitoring long-term changes in the forest ecosystem. Understanding the variations in tree growth is also important for sustainable forest management, in order to tailor harvesting to growth dynamics (Coomes and Allen 2007;Pretzsch 2009). A large suite of non-linear, interacting factors, including climate, topography, soil conditions and competition for resources, in uence the growth of forest tree species (Oliver and Larson 1996). The topography is a key driving factor in forest structure and composition, constraining the local nutrient and hydraulic conditions under which trees grow (Jucker et al. 2018). These effects are similarly re ected in forest structure as well (Tateno and Takeda 2003).
The variation in topography constitutes a resource gradient and thus provides an opportunity to study the relative importance of soil nutrients availability in the forest structure development through resource competition between tree species (Tateno and Takeda 2003).
A variety of regression techniques have been used to model the complex relationships between biological response variables (e.g. the growth of individuals or the structure of ecological communities) and environmental predictors (in particular machine learning algorithms), but for interpretable results, multiple linear regression (MLR) regression is still the go-to method (e.g. Currie, 1991;Kerr and Packer, 1997; Rahbek and Graves, 2001). One critical point for all of them is the violation of the assumption of independence of data points, as spatial autocorrelation (SAC) is not taken into account (Keitt et al. 2002).
In recent years, spatial regression analysis has become an important part both in forestry and ecology to improve the description of spatial patterns in plant or other organism groups or communities (e.g. Beale . Several extrinsic factors and intrinsic processes may lead to spatial structure in forest stands (Moeur 1993;Rouvinen and Kuuluvainen 1997). Spatial autocorrelation describes the pattern in which observations becoming less similar as they are further apart in space (Fortin and Dale 2005). The presence of spatial autocorrelation in model residuals violates the independence assumption and can in ate type I errors (Kühn 2007). This, in turn, can result in the selection of unimportant predictors and poorly estimated regression coe cients in species distribution studies (Lennon, 2000;Dormann et al., 2007). Spatial autocorrelation is an important issue in ecology because it is a general property of ecological variables which is measured over the geographic space (Legendre and Legendre 2012).
There are two types of causes of spatial autocorrelation, depending on which kind of process generates the spatial structure: endogenous or exogenous (Fortin and Dale 2005; Legendre and Legendre 2012). In the case of endogenous processes, factors which are related to the biology of the species under consideration (for example, conspeci c attraction, dispersal limitation, demography, interspeci c interactions, colonial breeding, home-range size, host availability, predation or parasitization risk, and so forth) generate the spatial pattern (Lichstein et al., 2002;Dormann, 2007). On the other hand, exogenous processes which are spatially autocorrelated themselves (e.g. topography varies smoothly and is hence autocorrelated) that drive the response variable of interest imprint their spatial autocorrelation onto the response, in what is called (induced) spatial dependence (Fortin and Dale 2005). Many studies have shown the importance of spatial autocorrelation in studying the species-environment relationships (Rahbek and Graves 2001;Dormann et al. 2007;Miller et al. 2007;Kissling and Carl 2008;Beguería and Pueyo 2009;Beale et al. 2010;Dahlin et al. 2014;Teng et al. 2018). Analyses accounting for spatial autocorrelation provide a more detailed description of spatial structure in species performance data and lead to a better understanding of the underlying ecological processes (Diniz-Filho et al. 2003).
Northern forests of Iran, called Hyrcanian or Caspian forests, are important sources of genetic variation, biodiversity, commercial woody products, and various environmental services (Ahmadi et al. 2017). These forests inscribed as the World Heritage in 2019 cover an area of about 1.85 million ha. Hyrcanian forests account for 15% of the total Iranian forests and 1.1% of the country's area. These forests range from sea level up to an altitude of 2800 m and comprise various forest types, harboring approximately 80 woody species (trees and shrubs). Oriental beech (Fagus orientalis Lipsky) forests in the Hyrcanian ecoregion occupy about 18% of the total forest area, 30% of the standing volume and 24% of the stem number As one of the main timber species in the Hyrcanian forests, our overall goal in this paper was to explore the oriental beech tree height-environment relationships. We hypothesize that statistical models considering the spatial information provide both better description and a sounder statistical framework than using non-spatial models. We compare the performance of multiple linear regression (MLR), as benchmark, to alternative models with spatial correlation structure for modelling the tree height in relation to edaphic and topographic predictors in the study area.
The study area is about 450 ha ranging from 1000 m to 1500 m a.s.l. The minimum temperature in December is 6.6 °C and the maximum temperature of 25 °C occurs in June. The mean annual precipitation of the study area is 1500 mm at the Nowshahr city metrological station, which is located 40 km away from the study area. The bedrock is limestone -dolomite, leading to soils with silty-clay-loam soil texture (Ahmadi et al. 2013). The forests of the study area are mixed and uneven-aged and are dominated by Fagus associated with other species such as Carpinus betulus, Acer velutinum, Parottia persica and Quercus castaneifolia. There is no history of harvesting in these forests and managed as the protected area.

Data Collection
Data were collected from a total of 127 0.1 ha circular temporary sample plots by using a randomsystematic network laid out in the eld. Speci cally, we set up a 0.1 km x 0.1 km grid across the study region and selected 130 grid intersection plots at random. The sample plots were established in sites with no evidence of disturbances to minimise the noise in the response variable, removing three plots from the initial selection. Within each plot, the diameter of all tree species with DBH > 7 cm and the total height of all beech trees were measured by using caliper and Vertex IV (Haglöf, Sweden), respectively.
Since environmental changes to the soil are more strongly re ected in the topsoil, ve soil samples of the top 10 cm below the litter were randomly taken within each plot using core soil sampler. The soil samples were then mixed and analyzed in the laboratory. Roots, shoots and stones were separated by hand and discarded and the air-dried soil samples were then sieved at 2 mm mesh size. Soil organic matter was determined using the Walkey-Black method (Allison 1975). Total nitrogen was measured in the laboratory by the Kjeldahl method (Bremner and Mulvaney 1982). The available K was determined by a ame atomic absorption spectrophotometer (AA500F, PG Instruments Ltd, China). The available P was determined by using the Olsen method (Homer and Pratt 1961). The Bouyoucos hydrometer method was used for determining the soil texture (Bouyoucos 1962). Soil pH and bulk density (at air-dried moisture content) were determined using pH-meter and Plaster (1985) method, respectively. Site factors such as altitude, slope percent and aspect were recorded at each sample location. Aspect, as the azimuth measured from true north, was then converted to a topographic radiation index using the following equation TRASP = [1 -cos((π/180)(θ -30))]/2 (Alavi et al. 2019). Environmental variables collected as a basis for modelling are summarized in Table 1.

Statistical analysis
Collinearity among environmental predictors was tested by hierarchical cluster analysis using squared Spearman correlations with the Hmisc package (Harrell et al., 2018) in the statistical software R (R Core Team 2018). The variables percentage sand, percentage carbon, percentage organic matter and percentage saturation were hence removed from the set of predictors due to their high correlation with altitude, slope, TRASP, clay, silt, nitrogen, phosphorus, K, C-N-ratio, bulk density, pH as they were deemed ecologically least relevant based on the authors' expert knowledge. The linear model with quadratic and rst-order interaction terms was simpli ed using backward stepwise and Bayesian Information Criterion (BIC), which considers both the goodness-of-t and model complexity and penalizes model complexity more than the AIC does (Burnham and Anderson 2002). The model selection step was carried out on the MLR to identify a minimal adequate model structure for all model types. The residuals of the MLR were normally distributed (for details see supplementary information).
Simultaneous autoregressive (SAR) models assume that the response variable at each location i, conditional on the value of explanatory variables at i, depends on the other response variables at neighboring locations j (Haining 2003). SAR models enhance the linear regression model with an additional term that combines the spatial autocorrelation structure of observations in data. In simultaneous autoregressive models, the neighborhood relationship is formally expressed in an n × n matrix of spatial weights (W), in which elements (w ij ) represent a measure of the connection between locations i and j. In the present study, we used three different SAR models including the spatial error model (SEM), spatial lagged model (SLM) and spatial Durbin model (SDM = spatial mixed model). The SEM models assume that the autoregressive process is in the error term. This is most probably in cases when spatial autocorrelation is not completely explained by the predictor variables (Diniz-Filho et al. 2003). The SAR spatial error model takes the form where λ is the spatial autoregression coe cient, W is the spatial weights matrix, β is a vector representing the slopes associated with the explanatory variables in the original predictor matrix X, and ε represents the (spatially) independent errors.
The SLM models suppose that the autoregressive process affects only the response variable. It takes the form Y = Xβ + ρWY + ε where ρ is the autoregression parameter, and the remaining terms are as above.
If spatial autocorrelation can affect both response and explanatory variables, SDM takes the form (Anselin and Gri th 1988): Here a new term (WXγ) appears in the model, which represents the autoregression coe cient (γ) of the spatially lagged explanatory variables (WX).
In all SAR models, the neighborhood needs to be provided as input, which we consider as a room for arbitrary decisions (as discussed in Bauman et al. 2018 for eigenvector approaches), particularly compared to the generalized least squares model (GLS). GLS is, in principle, just the name for the algorithm used for tting regression models with pre-speci ed error covariances and hence also used to t the SAR models (Pinheiro and Bates 2000). Typically it is referred to, however, as an approach where the spatial covariance structure is usually modelled assuming a simple distance-decay, e.g. exponential or linear (Dormann et al., 2007).
All approaches, MLR, SAR and GLS, were tted with the free software R. The three SAR model types (SEM, SLM, and SDM) are implemented in the 'spdep' package (Bivand 2011). Determine the neighborhood distances is an important issue for using the SAR function. After that spatial weight matrix calculated by weighting the neighbors with a certain coding scheme. For specifying the best neighborhood distances, we looped through 20-50 m distance settings and the best neighborhood distance was selected as the one with the lowest AIC. The nal Simultaneous Autoregressive Regressions were run with a spatial weights matrix based on a neighbourhood distance of 150 m and a row standardized coding scheme 'W'. For the GLS model, three different correlation structures (corExp, corGaus, and corSpher) were speci ed using the nlme package (Pinheiro et al. 2019).
In order to test the spatial autocorrelation in the model residuals, we employed Moran's I, which quanti es the variance among residuals as a

Mlr Model
Ten of the 14 explanatory variables were signi cant and entered the model in MLR regression analysis ( Table 3). The non-spatial multiple linear regression (MLR) explained 56.6% of the variance in the dataset. Using standardized regression coe cients as a criterion for determining the relative importance of explanatory variables (Kabacoff 2015) showed that phosphorous, altitude and nitrogen were the most important variables for the observed tree heights. Carbon-to-nitrogen ratio was the least important predictor, followed by silt, pH and TRASP.
The spatial correlogram (Fig. 2) representing the spatial autocorrelation for the model indicated that at short distances there was positive autocorrelation, as was also indicated by a highly signi cant Moran's I score of 0.17 (P < 0.001). Tables 2 and 3 show the statistical results of the SAR and GLS models tted by maximum likelihood. The spatial autoregressive parameter for SEM (λ = 0.533) is signi cant, as showed by the p-value < 0.001 in an asymptotic t-test as well as likelihood-ratio test (LR) (P < 0.001), clearly indicating the in uence of neighboring observations. The estimate for the spatial autoregressive coe cient ρ = 0.155 for SLM is also highly signi cant (P < 0.01). For both the SLM and the SDM, the Likelihood Ratio test (LR) on ρ was also signi cant (P < 0.01) and was a bit higher for the SDM (0.196 vs. 0.155).

SAR model
Across all models, the structurally most exible approach, the SDM, achieved the best t in terms of adjusted R 2 and RMSE. However, it did not rank best on the AIC, due to its larger number of spatial parameters. Instead, the SEM had the smallest AIC, followed by SLM and the GLS with "spherical" correlation structure. GLS with Gaussian correlation structure, SLM, SDM and GLS with exponential correlation structure had larger AICs, but the differences among those models were small (638.14 to 643.39). This group was also not substantially different in AIC from the non-spatial MLR.
For the Moran's I of model residuals (Fig. 3), SEM had the smallest value (Moran's I = 0.027, P = 0.608), while for the GLS Gau it was the largest (Moran's I = 0.160, p-value = 0.72) and nearly indistinguishable from the non-spatial MLR (I = 0.169, P < 0.001). The results of this study showed that the residual autocorrelation was removed by all spatial models but not by the non-spatial MLR.
There were differences in the estimates and signi cance level of the variables. For some variables such as altitude and bulk density, P-values changed as well (Table 3). The intercept coe cients β 0 of SEM and GLS spatial models were close to that of MLR and were highly signi cant, but in the other spatial models (SLM and SDM), the intercept coe cients β0 were very different. There are considerable differences in the slope coe cients β of SLM, SDM, SEM and GLS models with MLR (Table 3 and Figure of parameter estimates in the appendix). The standard errors of β0 and β1 of SLM, SDM, SEM and GLS models were nearly equal to or lower than those of MLR (Tables 3).  Table 3 Parameter estimates of all models (and their standard error). Note that all predictors were standardised before the analysis so that (absolute) estimates serve as an indicator of variable importance. ***, **, * and n.s. refer to signi cance levels of the model term at P < 0.001, < 0.01, < 0.05 and > 0.05, respectively. One important effect of including spatial autocorrelation is a much larger uncertainty of the estimates in all spatial models.  In this study, three spatial autoregressive models and three GLS model structures were used to evaluate the relationships between beech tree height and environmental variables. All spatial models had a better performance than the nonspatial multiple regression. In general, the results of SDM and SEM were signi cantly better than SLM, suggesting that the spatial error was largely due to environmental, not to endogenous, ecological processes. Having high potential for reducing the spatial pattern of model residuals, spatial autoregressive and generalized least square models help to meet the assumption of independence in regression models. In the modeling process, we seek to select the best model based on comparing the evaluation criteria of the models such as R 2 and AIC. The results showed that when the spatial weight matrix is added in SAR models, the adjusted R 2 value increased from 59% for the MLR model to 71% for the SAR model.
The GLS with spherical correlation was the best-tting model from the GLS set, and the mixed SAR and error SAR for the SAR set. Kissling and Carl (2008)  . Some authors assume that spatial models always provide better parameter estimates than the MLR technique, but Kissling and Carl (2008) suggested that researchers must be cautious about this assumption. In Lu and Zhang (2011) study, SEM commonly provided coe cient estimates similar to the MLR model. We found that all spatial models had better performance compared to the MLR model, on the other hand, there are considerable differences in parameter estimation between MLR and spatial models which are inconsistent with Lu and Zhang (2011). More important than bias may be the fact that spatial models always had larger uncertainty for the estimates, resulting also in a wider error margin for model predictions.
Both SAR and GLS rely on generalized least squares regression, therefore they are mathematically very similar, although GLS shows more exibility in the way spatial autocorrelation is accounted for ). Different settings of the distance parameters in the SAR-based models were compared by using AIC and the best con guration of the correlation matrix was found. Comparing the performance of models here suggests that GLS with spherical correlation structure and SEM provide more accurate results than other models. This result is attributed to the fact that spatial autocorrelation is considered at all scales in the GLS model, therefore, we do not need a priori knowledge of the distance range of spatial in uence. By including the correlation structure derived from a semivariogram, an improvement to SAR models could be achievable (Beguería and Pueyo 2009 division, and growth of meristem tissues, its limitation is associated with a sharp decline in tree growth. As a result, phosphorus de ciency will slow down or stop the growth of above-and underground parts of the forest trees. In temperate forests, the main limiting nutrient factor for plant growth is generally the available nitrogen ). It is expected that when tree nitrogen requirements are satis ed, some other nutrient may become the growth-limiting factor (Taylor 1934  The performance of species along elevation gradient is governed by a series of interacting biological, climatic and historical factors (Colwell and Lees, 2000). As such, elevation represents a complex gradient along which many environmental variables change simultaneously (Austin et al., 1996). Beech tree has the best performance at lower altitudes (ca. 1100 m), which is consistent with (Mohadjer 2005). It seems these altitude ranges have optimum humidity conditions and resource availability, yielding high productivity (Rahbek 1995;Rosenzweig 1995). The major decline in beech performance at higher altitudes could be due in part to ecophysiological constrains, such as reduced growing season length, low temperatures and hence low ecosystem productivity at high elevation (Körner 1998).

Conclusions
In this study, three spatial autoregressive models and three GLS models were used to model the relationships of beech tree height and environmental variables, with MLR as a benchmark. The three spatial autoregressive models had a better performance to the MLR model. In general, the results of SDM and SEM were signi cantly better than SLM. SEM was better than SDM based on the AIC evaluation criterion and spatial correlogram. SDM performs better than SEM in terms of RMSE and adjusted R 2 .
SDM has the advantage of analyzing the spatial variation of micro-environmental conditions in forest stands and competition among individual trees simultaneously. However, if the complexity of the model structure is important, SEM is de nitely a reasonable choice over SDM because makes the understanding of the model much easier. Although SAR-based models have better performance than the GLS model, we recommend using the GLS model for modeling the height of trees, because the GLS is easier to than SARbased models. However, when the computation time is a concern, SAR-based models can be more useful because of faster execution.