Hierarchical change-point regression models including random effects to estimate empirical critical loads for nitrogen using Bayesian Regression Models (brms) and JAGS

Highlights • Hierarchical change-point regression models are suitable for estimating critical empirical loads.• The Bayesian framework of these models provides the inclusion of the current critical load and various confounding or modifying variables.• Here we present two ways of implementing hierarchical data sets in Bayesian change-point regression models using JAGS and brms.• The used model data are foliar N:P ratios of 10 different conifer tree species from 88 European forest sites and 9 different countries covering 22 years (1995-2017).

The concept of critical loads is used in the framework of the Convention on Long-range Transboundary Air Pollution (UNECE) to define thresholds below which no damaging effects on habitats occur based on the latest scientific knowledge. Change-point regression models applied in a Bayesian framework are useful statistical tools to estimate critical empirical loads. While hierarchical study designs are common in ecological research, previous methods to estimate critical loads using change-point regression did not allow to analyse data collected under such a design. This method update provides an implementation of hierarchical data structure by including random effects such as study sites or as in this example tree species within the Bayesian approach of changepoint regression models using two different approaches. The example data set is an European wide gradient study of the impact of climate change and air pollution on forest tree health assessed by foliar nutrient status of nitrogen (N) to phosphorus (P) from 10 different conifer tree species originated from 88 forest sites and 9 countries covering 22 years (1995-2017). Both modelling approaches using JAGS and Bayesian Regression Models using 'Stan' (brms) resulted in reasonable and similar estimations of the critical empirical load for nitrogen (CL emp N) for temperate forests. These methodological examples of using different approaches of Bayesian changepoint regression models dealing with random effects could prove useful to infer CL emp N for other ecosystems and long-term data sets.
• Hierarchical change-point regression models are suitable for estimating critical empirical loads. • The Bayesian framework of these models provides the inclusion of the current critical load and various confounding or modifying variables. • Here we present two ways of implementing hierarchical data sets in Bayesian change-point regression models using JAGS and brms.

Subject area Environmental Science
More specific subject area Forestry research Method name Revised Bayesian change-point regression models including random effects using brms and JAGS Name and reference of original method Change-point models applied in a Bayesian context by [1] Resource availability Model data can be found in [2] . Codes including models and graphics are written in R [3] and given as an RMarkdown [4] output. The MCMC simulations of the change-point modelswere conducted using JAGS, version 4.3.0 [5] , executed in R using rjags [6] and in STAN with brms [7] .

Bayesian change-point regression model settings
The revised change-point models using either JAGS or brms are based on the current method of using Bayesian change-point models as in the example given by [8] . The Bayesian analysis is based on MCMC methods [9] with a similar setting used by [8] conducted using JAGS (version 4.3.0 [5] and executed in R using rjags [6] . Posterior distributions were based on parallel chains ( t.n.chains = 2 ) with 10 0 0 0 0 iterations each ( t.n.iter = 10 0 0 0 0 ), discarding the first 50 0 0 0 values ( t.n.burnin = 50 0 0 0 ) and thinning the remainder by 2 ( t.n.thin = 2 ). The two parallel chains were used to assess convergence with Gelman and Rubin's diagnostics [10] using trace and marginal density plots ( Fig. 2 ) with the R function plot{coda} and scale reduction factors with the R function gelman.diag{coda} using the R package coda [11] . We are presenting the results as the mean (point estimate) and the 2.5% and 97.5% quantiles (95% credible intervals (CI), [12] ) of the posterior distribution following [13] .

Model data set
The used data set is a European wide gradient study of the impact of climate change and air pollution (N deposition) on forest tree health assessed by foliar nutrient status N:P of different different conif er species (n = 10) from the years 1995-2017 [2] . N deposition are based on the EMEP MSC-W model with a 0.1 °spatial resolution [14] . Mean annual temperatures (MAT) are based on the E-OBS climate data set with a 0.25 °spatial resolution [2] .

Prior settings
We used the approved critical load for coniferous woodland, which is currently set at 5-15 kg N ha -1 a -1 [15] , to construct an informative prior (CL emp N) assuming a normal distribution with the approved critical load as its mean and half the range as its standard deviation (Normal(mean = 10, sd = 5)) and vague priors with mean = 0 and sd = 2 for fixed effects such as the mean annual temperature (MAT) in this example.

Bayesian change-point regression model using JAGS (BCR_JAGS)
According to the current Bayesian framework of change-point regression models by [1] , we used a linear mixed effect model (c.f. data distribution in Fig. 1 ) with identity-link function. We described the variation of measured foliar N:P ratio ( NP i ) between i = 1 , ..., N sampling sites using a Normal distribution with expected foliar N:P ratio λ i : NP i ∼ Normal(λ i ) and the expected foliar N:P ratio λ i as identity link function expressed as: β k X i,k + α i with β 0 as intercept and β k as linear slopes for k = 1 , ..., K confounding variables at sampling site i with covariate value X i,k [16] . According to [1] we added α i as the effect of N deposition N i on the expected foliar N:P ratio λ i : with the intercept β 0 ,s that varies between conifer tree species in this example.
The differences between the tree species is modelled using a normal distribution, which is the definition of a random effect.

BCR_JAGS model diagnostics
The convergence diagnostics using Gelman and Rubin's [10] trace and marginal density plots ( Fig. 2 ) showed a good convergence of the two chains. Also the smoothed density plots showed rather balanced histograms of the trace plot values for this model.

Estimated change-point using JAGS
The effect plot of the estimated change-point including the random effect of conifer tree species is shown in Fig. 3 . The change-point, using the underlying data on foliar N:P ratio and the Bayesian change-point regression model BCR_JAGS, is 5.7 ±4.8 (Tab. 1 ).

Bayesian change-point regression model using brms (BCR_brms)
With the package "brms" [7] we fitted a Bayesian non-linear multivariate multilevel model with random intercept ( β 0 ) and different CL emp N between the tree species:   ( Table 1 ) of N including the corresponding 95% credible intervals as dotted lines. Points are measurements from [2] and are coloured according to conifer tree species. The black line is the estimated change-point regression including 95% credible intervals.

BCR_brms model diagnostics
The MCMC diagnostics showed a good convergence of the two chains. The posterior distributions are centred around one peak value ( Fig. 4 ).

Estimated change-point using brms
The estimated change-point using the model BCR_JAGS is 9.0 ±4.3 ( Table 1 ) with an estimated Bayesian R 2 = 0.42 [17] . The effect plot of the change-point is shown in Fig. 5 .

Method validation
The two presented ways of modelling change-point regressions including random effects, with JAGS (BCR_JAGS) and with the brm function from the brms [7] package (BCR_brms), resulted in similar estimated CL emp N ( Table 1 ), latter being slightly higher with the model (BCR_brms). The estimated CL from the model (BCR_brms) including the non-linearity and the random intercepts of β 0 made the outcome more realistic compared to the underlying data ( Fig. 5 ).
In addition, a simple boxplot comparison (pinpoint method applied by [15] ) of different groups of N deposition is suggesting a change in foliar N:P ratio between the first group of 0-5 kg ha -1 a -1 and the second group of 5-10 kg ha -1 a -1 ( Fig. 6 ), which verifies the estimated CL emp N of both models.

Introduction to critical empirical loads and change-point regression models
The deposition of atmospheric nitrogen (N) is a major threat to biodiversity and ecosystem functioning globally [18][19][20][21][22] . The accumulation of N over time is one of the main drivers of changes in species distribution across many habitats, especially for habitats with low management regimes such as natural forests [15] . The impacts of N deposition are known for certain terrestrial habitats, but many still remains uncertain [15] . Critical loads of N has been developed in the framework of the Convention on Long-range Transboundary Air Pollution [15,23] . These critical loads are defined as thresholds below which damaging effects on specific habitats do not occur based on the latest scientific knowledge [24] . Critical loads of N are often defined based on negative effects on plant diversity [20] , as they are often lower compared to critical loads of N from soil processes such as soil acidification or N leaching [25] . Empirical critical loads can be determined with the use of N addition or reduction experiments or in gradient studies covering a gradient of N deposition. Latter is a more holistic approach taking into account the variability of natural habitats. However, gradient studies can also have various limitations for instance they should use a sufficient spatial scale of the air pollutant assessed and they should take into account the most important modifying factors [26] . In addition, an appropriate framework of regression model should be selected, suitable for the selected study design. Bayesian change-point regression models have been shown to overcome some difficulties of large spatial variation found in natural habitats by taking into account confounding factors such as climatic  [15] ) highlights a suggested change in foliar N:P ratio between the first (0-5 kg ha -1 a -1 ) and the second group (5-10 kg ha -1 a -1 ).
variation or different soil types [23,27] , allowing a more accurate estimation of the critical load [28] . Therefore, change-point regression models applied in a Bayesian framework are useful statistical tools in estimating critical empirical loads [1] .

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments
This paper is a collaborative piece of work between ST and SB from the Institute for Applied Plant Biology providing the idea for the adaption of the current change-point method in the frame work of the revision of the critical empirical load for nitrogen and TR from Hintermann & Weber AG providing the statistical back ground. ED provided the data and edited the manuscript. This project was supported by the Swiss Federal Office for the Environment (FOEN).

Supplementary material
Supplementary material associated with this article can be found, in the online version, at doi: 10. 1016/j.mex.2022.101902