Enhanced effects of biotic interactions on predicting multispecies spatial distribution of submerged macrophytes after eutrophication

Abstract Water eutrophication creates unfavorable environmental conditions for submerged macrophytes. In these situations, biotic interactions may be particularly important for explaining and predicting the submerged macrophytes occurrence. Here, we evaluate the roles of biotic interactions in predicting spatial occurrence of submerged macrophytes in 1959 and 2009 for Dianshan Lake in eastern China, which became eutrophic since the 1980s. For the four common species occurred in 1959 and 2009, null species distribution models based on abiotic variables and full models based on both abiotic and biotic variables were developed using generalized linear model (GLM) and boosted regression trees (BRT) to determine whether the biotic variables improved the model performance. Hierarchical Bayesian‐based joint species distribution models capable of detecting paired biotic interactions were established for each species in both periods to evaluate the changes in the biotic interactions. In most of the GLM and BRT models, the full models showed better performance than the null models in predicting the species presence/absence, and the relative importance of the biotic variables in the full models increased from less than 50% in 1959 to more than 50% in 2009 for each species. Moreover, co‐occurrence correlation of each paired species interaction was higher in 2009 than that in 1959. The findings suggest biotic interactions that tend to be positive play more important roles in the spatial distribution of multispecies assemblages of macrophytes and should be included in prediction models to improve prediction accuracy when forecasting macrophytes’ distribution under eutrophication stress.

to evaluate the changes in the biotic interactions. In most of the GLM and BRT models, the full models showed better performance than the null models in predicting the species presence/absence, and the relative importance of the biotic variables in the full models increased from less than 50% in 1959 to more than 50% in 2009 for each species. Moreover, co-occurrence correlation of each paired species interaction was higher in 2009 than that in 1959. The findings suggest biotic interactions that tend to be positive play more important roles in the spatial distribution of multispecies assemblages of macrophytes and should be included in prediction models to improve prediction accuracy when forecasting macrophytes' distribution under eutrophication stress.

K E Y W O R D S
aquatic plants, facilitation, freshwater lakes, species distribution model

| INTRODUCTION
Nutrient enrichment in aquatic systems is a widespread environmental problem caused by the massive conversion of the earth's land surface to agriculture or urban land as well as global climate change and nitrogen deposition (Jeppesen et al., 2010). As a result, regime shifts in shallow lake ecosystems have been widely documented in temperate and tropical regions (Mesters, 1995;Sand-Jensen et al., 2008;Scheffer, Hosper, Meijer, Moss, & Jeppesen, 1993;Zhang et al., 2016).
Within these aquatic ecosystems, submerged macrophytes dominate in clear oligotrophic water and phytoplankton dominates in turbid eutrophic water. In certain cases, the dominance of floating plants may represent as a third state (Mesters, 1995;Scheffer et al., 2003). In the case of both phytoplankton dominance and floating plant dominance, phytoplankton blooms or dense mats of free-floating plants reduces the depth to which light penetrates and increases anoxic conditions, leading to an unfavorable environment for submerged macrophytes (Bornette & Puijalon, 2011). As a result, suitable habitats for submerged macrophytes are reduced and only occur near the lakeshore, even disappear in nutrient-rich situations (Declerck et al., 2005;Kosten, Kamarainen, et al., 2009;Scheffer et al., 1993).
In unfavorable environments, the reduction in habitat can lead to intense competition among species occupying overlapping niches (Hao, Wu, Shi, Liu, & Xing, 2013;Maestre, Callaway, Valladares, & Lortie, 2009;Sand-Jensen et al., 2008). In contrast, positive species interactions may arise as an important mechanism to maintain diversity according to the prediction of the stress-gradient hypothesis (McIntire & Fajardo, 2014). In both cases, biotic interactions, including competition and facilitation, play an important role in explaining species coexistence. Empirical studies have revealed that macrophytes affect many key processes that promoted water clarity (Kéfi, Holmgren, & Scheffer, 2016;Scheffer et al., 1993). This vegetation-turbidity interaction ensures a low chlorophyll-a concentration and high Secchi depths in eutrophic shallow lakes that are vegetated (Kosten, Lacerot, et al., 2009). Therefore, submerged species that are adapted to growing in nutrient-rich, turbid water may have positive effects on species that prefer nutrient-poor, clear water habitats, which is a type of indirect facilitation (Le Bagousse-Pinguet, Liancourt, Gross, & Straile, 2012).
Abiotic factors are commonly used to predict the spatial distribution of submerged macrophytes: Among which transparency and depth, both related to light conditions, are often shown to be most important variables, followed by chemical contents, hydrographic parameters, substrate conditions, and other factors (Lehmann, 1998;Schmieder & Lehmann, 2004;Snickars et al., 2014;Wang et al., 2005;Zhang et al., 2016). The impacts of vertical interactions such as herbivory pressure or algae cover occasionally affect these predictions (Snickars et al., 2014;Viana et al., 2016). Few studies have considered horizontal interactions (i.e., the relation between submerged macrophytes) as explanatory factors. Recent studies have demonstrated that the incorporation of horizontal biotic interactions improves the accuracy of predictions on both a plot scale (Nylén, Le Roux, & Luoto, 2013;le Roux, Lenoir, Pellissier, Wisz, & Luoto, 2013) and a regional scale (Latimer, Banerjee, Sang, Mosher, & Silander, 2009) and have indicated that the interactions among plants play an important role in the spatial distribution of multispecies assemblages (Nylén et al., 2013;le Roux et al., 2013).
Here, we evaluated the relative importance of biotic interactions in a spatial assemblage of submerged macrophytes and how the biotic interactions between species pairs changed due to regime shifts using species distribution models. To this end, we analyzed the spatial presence/absence data of four common submerged macrophytes during two periods (1959 and 2009) in Dianshan Lake, which became eutrophic since the 1980s. We hypothesized that (H1) models with both abiotic and biotic variables would more accurately predict spatial distributions than models including only abiotic variables in each alternative regime state (Nylén et al., 2013;le Roux et al., 2013), and the improvement effects could enhance in eutrophication state.
Moreover, we expected (H2) that the biotic interactions would be more positive (facilitation) due to the regime shift caused by eutrophication (Brooker et al., 2008;Le Bagousse-Pinguet et al., 2012;McIntire & Fajardo, 2014). By testing the two hypotheses, we would provide a field evidence of how the role of biotic interaction changes in governing submerged macrophytes community assemblage before and after eutrophication.

| Study site
The study was carried out in Dianshan Lake (31°04′N-31°12′N and 120°54′E-121°01′E), which is at the western margin of Shanghai and is bounded by Jiangsu and Zhejiang Provinces ( Figure 1). Dianshan Lake belongs to the Taihu Lake drainage in eastern China and receives the upstream water from the Taihu Lake. Downstream, the Huangpu River is the largest river flowing through the metropolitan area of Shanghai and merges into the Yangtze River just before entering the East China Sea. The area of the lake is 63.7 km 2 , and the mean depth is ~2.1 m (Kung & Ying, 1991;Zhang, Zhang, & Zhong, 2011).
Since the early 1980s, in association with rapid economic development and urban expansion, the water quality of Dianshan Lake has deteriorated, and the lake has experienced cyanobacterial algal blooms since the end of the last century (Cheng & Li, 2010). The trophic state of Dianshan Lake transitioned from eutrophic to hypereutrophic in 1999-2000 (Cheng & Li, 2010). As a result, the aquatic ecosystem has gradually become degraded and has switched from a macrophytedominated clear water system to a turbid, phytoplankton-dominated system (Shi, Liu, & Da, 2011). The aquatic vegetation, which was present throughout the entire lake in 1959, covered no more than 30% of the lake in 2009 (Shi et al., 2011). Although the overall richness of the vascular aquatic plants has changed little, ten native species have disappeared and non-native species have increased from 1959 to 2009 (Shi et al., 2011). In response, the fish richness has declined markedly, particularly for herbivores and piscivores (Hu et al., 2014).

| Species distribution data
The species distribution data for the two alternative regimes were col- The 1959 data were taken from spatial species distribution maps in a historical report (Freshwater Aquaculture Lab, 1960). The historical survey aimed to document species distribution across the whole lake.
The macrophyte data were collected from 55 sampling sites that were evenly distributed across the lake, and each sampling site represented an area of slightly more than 1 km 2 . In each site, macrophytes samples were randomly taken in 3-10 sampling points using the bottom trawling method. The presence/absence of macrophytes was recorded at a total of 375 sampling points.
Four species-Potamogeton malaianus, Myriophyllum spicatum, Vallisneria natans, and Hydrilla verticillata-were chosen to model the spatial distribution of submerged macrophytes because they were the dominant submerged macrophytes species during the both periods, and the other submerged macrophytes were too rare for spatial distribution modeling. For each species, the spatial point data were transferred to 117 raster cells with 30-s resolution with area ~0.75 km 2 and the presence/absence was determined for each cell (Fig. S1). Given that the different sampling density can cause bias in biotic interaction estimation, we compared the sampling density in area vegetated by rooted macrophytes in the two periods. The sampling point density in both periods was around four sampling sites per cell unit. Therefore, the cell-based data were considered to be comparable between two periods as they had similar sampling density for area vegetated by rooted macrophytes (Appendix S1).  (Table S1). The first two variables are commonly used as proxies for light conditions and are considered to be the most important abiotic variables for submerged macrophytes (Sand-Jensen et al., 2008;Wang et al., 2005;Zhang et al., 2016). The other chemical variables are considered to be linked to the physiology, occurrence, life-history traits, and community dynamics of submerged macrophytes (Bornette & Puijalon, 2011;Snickars et al., 2014).

| Environmental data
In 2009, Z M and Z SD were measured at 113 monitoring points. F I G U R E 1 The location of Dianshan Lake, eastern China Z M and Z SD were measured with a calibrated stick and a Secchi disc, respectively. W pH was measured in the field with a pHep ® 5 pH/ Temperature Tester, HI 98128 (Hanna Instruments, Texas, USA) during 2009 and was determined using a colorimetric method in 1959. DO was determined using the iodometric method. NO 3 − -N was determined by a spectrophotometric method using phenol disulfonic acid. PO 4 3− -P was determined according to an ammonium molybdate spectrophotometric method. TP was digested with alkaline K 2 S 2 O 8 and determined according to the ammonium molybdate spectrophotometric method. TN was digested with alkaline K 2 S 2 O 8 and determined using the ultraviolet spectrophotometric method. All the analytical methods were following Chinese Water Analysis Methods Standards (Huang et al., 1999).
For each abiotic variable, the mean value of three collections was calculated for each sampling site. Then, the inverse distance weighting and ordinary kriging interpolation methods were applied to estimate the predicted values for the cell-based data of species occurrence.
Because there was a high correlation between the interpolated results of the inverse distance weighting method and the ordinary kriging method for each abiotic variable (r ranged from 0.942 to 0.988), we used only the predicted results from the inverse distance weighting method for statistical analysis.

| Test for hypothesis one (H1)
To test whether the presence of other submerged macrophytes (biotic variables) could improve the model predictions of species occurrence, two models were run for each species as follows: a model with only abiotic variables (null model) and a model with both abiotic and biotic variables (full model): Principal components analysis was applied to the abiotic variables to avoid multicollinearity, which artificially increases the amount of variation explained (Legendre & Legendre, 2012).The generated component scores were used in the models for abiotic variables. The occurrences of the other three species were used as the biotic variables for each species.
To account for potential differences related to statistic methodologies, generalized linear models (GLMs) and boosted regression trees (BRTs) were used to model variation in the occurrence of the four macrophytes (Elith, Leathwick, & Hastie, 2008;Franklin, 2009). For the GLMs, species occurrence was fitted with a binomial family and logit link function. To limit model complexity, statistical interaction terms were not included in the GLMs due to a lack of a priori knowledge of contingencies among the predictor variables. The relative importance of each predictor variable was calculated using the change in deviance after the exclusion of that variable from a GLM model containing all variables and then scaling it to 100, with higher values indicating a stronger influence on the response variable (le Roux et al., 2013).
BRTs are a machine learning method that combines the strengths of two algorithms-regression trees and boosting-with the advantages of handling different types of predictor variables with no need for a priori specification of a data model (Elith et al., 2008;Schibalski, Lehtonen, & Schröder, 2014). BRTs automatically incorporate interactions between predictors and are capable of fitting complex nonlinear relationships, which can cover the shortcomings that the GLM models were without interaction terms. The measure of variable importance in this method is based on both the frequency with which a variable is selected and the improvement resulting from the inclusion of the variable, with higher values (ranging from 0-100) indicating a stronger influence (Elith et al., 2008;le Roux et al., 2013). All BRT models were fitted assuming a Bernoulli distribution and setting the tree complexity to 5, the learning rate to 0.005 (reduced to 0.001 for species for which the calculated trees were inadequate) and the bagging fraction to 0.75.
To quantify model performance, repeated fourfold cross-validation was used with 999 repeats due to our small data set. For each repeat, fourfold data partitioning with random assignment was used to divide the data into a training group and a testing group for model validation.
The pairwise distance sampling method was then used to remove any spatial sorting bias (Hijmans, 2012). The area under the receiver operating characteristic curve (AUC) (Fielding & Bell, 1997) was calculated for each model. The predicted occurrence probabilities were then converted to binary presence/absence predictions using a threshold that maximized the sensitivity and specificity of the model during crossvalidation, from which the true skill statistic (TSS) and kappa were calculated (Allouche, Tsoar, & Kadmon, 2006). The differences in the predictive power between null models and full models were tested using a Mann-Whitney test (Nylén et al., 2013;le Roux et al., 2013).

| Test for hypothesis two (H2)
To test whether a positive shift (facilitation) of biotic interactions could arise among submerged macrophytes due to eutrophication, the co-occurrence pattern of the four submerged macrophyte species was fitted using joint species distribution models for both periods (1959 and 2009) (Pollock et al., 2014). This method is distinct from the species distribution model, which only includes the unidirectional influence of one or a few species; using error matrices (i.e., Σ in eqn1) in hierarchical multivariate probit regression models allows species interactions involving multiple species to be modeled explicitly using spatial co-occurrence data (Kissling et al., 2012). In addition, this joint modeling method has ability to decompose the species coexistence into shared environments caused coexistence and biotic interaction caused coexistence (Pollock et al., 2014).
In each model, the response is the species occurrence matrix (with element Y ij ) arranged as n sites by m species. The probability of occurrence was defined as the probability density of a latent variable (Z ij ) greater than zero (eqn. 1). The row vectors of the latent variable matrix, Z i , follow an m-dimensional multivariate normal distribution with a mean vector μ i and an m × m variance/covariance matrix Σ. μ i is predicted with an environmental data matrix (X) with the dimensions n sites by K predictors, which included one column of intercept terms set as one and K − 1 columns of principal components of biotic variables that were centered on zero and scaled by their standard deviations. β is the m×K matrix of regression coefficients, which were drawn from the normal distribution with mean μ k and standard deviation σ k for each column.
The inverse Wishart distribution was used as the prior for the variance/covariance matrix Σ. After model fits, the covariance terms were divided by the standard deviations of the variance/covariance matrix (Σ) to generate a correlation matrix in which all standard deviations were equal to one. The element in the rescaled variance/covariance matrix presents pairwise correlation coefficient between two species, an indicator of biotic interaction (Pollock et al., 2014).
To fit the models, sampling from the posterior distribution of all parameters was performed using the Markov chain Monte Carlo method. All analyses were conducted in R.3.2.2 (R Core Team, 2015); we used "gstat" package (version 1.1-3) to interpolate environmental variables, and "gbm" package (version 2.1.1) and "dismo" package (version 1.0-12) for BRT models fit, and "ncf" package (version 1.1-5) to compute spline correlograms. JAGS (version 3.4.0) was run through package "R2jags" (version 0.05-01) to fit the joint species distribution models using the code of Pollock et al. (2014).

| RESULTS
In most cases, the full models with both abiotic and biotic variables performed better than the null models only with abiotic variables (Table 1) (Figure 3). Furthermore, the three species pairs that showed a shift from neutral interaction to positive interaction had a significantly higher trait dissimilarity than the other three species pairs (one-way ANOVA, df = 5, p < 0.05) (Appendix S2). Adding biotic variables within trophic levels, such as chlorophyll-a content (as a proxy of algal abundance), richness of emergent macrophytes, and richness (1)

| DISCUSSION
Changes in biotic interactions have been widely recorded during global changes, particularly biological interactions in response to climatic change (Brooker et al., 2008;Tylianakis, Didham, Bascompte, & Wardle, 2008). In addition to global warming's impacts on aquatic vegetation (Alahuhta, Heino, & Luoto, 2011;Kosten, Kamarainen, et al., 2009;Netten, Van Zuidam, Kosten, & Peeters, 2011), eutrophication is another, and perhaps more serious, global problem for inland waters (Davidson et al., 2015). Although there are a few controlled ex- and the full model (with both abiotic and biotic variables) for generalized linear models and boosted regression trees, respectively. Area under the receiver operating characteristic curve (AUC), true skill statistic (TSS), and kappa were used to measure model performance. The significant differences in model performance between null models and full models (Full-Null) (Cheng & Li, 2010). Based on an empirical study, the maximum depth should be <1.8 times the Secchi depth to ensure a net biomass accumulation during the growth season in Yangtze lakes (Wang et al., 2005). That value translates to 0.9 m, which is much shallower than the mean depth (2.1 m) of this lake. As a result, the stress of low light intensity caused the submerged macrophytes to squeeze together in the shallow water along lakeshore. In this situation, it has been suggested that indirect facilitation by submerged macrophytes, which maintain clear water conditions by competing with phytoplankton for light and nutrients, is the underlying mechanism promoting species coexistence (Le Bagousse-Pinguet et al., 2012). This is supported by the evidence that the mean value of transparency was significantly higher in macrophytes beds ( (Table 1). Our findings are consistent with the suggestion that trait divergence can switch the competition-facilitation balance (Beltrán et al., 2012) and conversely imply that facilitation increases trait dispersion at the community level (McIntire & Fajardo, 2014).
Many studies have concluded that biotic interactions have an important role in improving species distribution predictions (Araújo & Luoto, 2007;Kissling et al., 2012;Nylén et al., 2013;le Roux et al., 2013); our results support this conclusion for macrophytes, particularly in the case of eutrophication that increased indirect facilitation among macrophytes. This finding suggests that the biotic interaction among macrophytes could be critical in predicting their occurrence. Moreover, our study also suggests that integrating biotic interactions into species distribution models can promote the prediction accuracy even beyond the community scale. In contrast to experiments at a fine grain of no more than several square meters (Hao et al., 2013;Le Bagousse-Pinguet et al., 2012), our results were based on a relatively larger grain of approximately 0.75 km 2 .
Both of simulation study (Araújo & Rozenfeld, 2014) and empirical analysis (Belmaker et al., 2015) indicated that positive interactions may decrease with increase in grains but can remain important even at coarse grains.
In this study, the application of joint species distribution models is novel to aquatic system, which provides a feasible way to explore how  (Kissling et al., 2012;Pollock et al., 2014). Although major environmental factors for submerged macrophytes in the shallow water of eastern China has been covered in the models (Wang et al., 2005;Zhang et al., 2016), other important abiotic factors, such as sediment characteristics and hydrometeorology variables, could change the distribution of macrophytes. And the different nutrients indices used in two periods could also bring bias. Second, biotic interaction among trophic levels, that is, herbivory stress (Hu et al., 2014), may also alter species interaction, in spite of the fact that other biotic factors, that is, algae, emergent macrophytes, free-floating macrophytes, had few effects on inference of interaction coefficients (Fig. S2).
Thirdly, we only used one growth season data in each period nearly 50 years apart, which might ignore the effort of historical accidental events, such as human disturbances (Shi et al., 2011), and did not consider seasonal variation of macrophytes composition and environment factors. Therefore, this modeling method should be applied widely to different aquatic ecosystem with long-term observation data to further verify its performance.
Additionally, there are two key questions that must be addressed to improve the modeling of biotic interactions in multispecies assemblages of macrophytes. First, as with many joint species distribution models (Leach, Montgomery, & Reid, 2016;Pollock et al., 2014;Wisz et al., 2013), the pairwise interactions were assumed to have the same interaction type and strength in our study due to little prior knowledge. the strength of the interaction may be dependent on the species' relative abundances. In the same control experiment (Hao et al., 2013), the facilitation effect was stronger when the ratio of biomass density between M. spicatum and P. macckianus was 1:3, rather than 2:2 or 3:1.
Second, changes in the strength and type of biotic interactions among macrophytes in different alterative ecosystem states highlight the risk of including biotic interaction factors in the prediction of species distribution. This challenge confirms that problems in the transferability of species distribution models may not only be caused by environmental heterogeneity (Schibalski, Lehtonen, & Schröder, 2014) but also by variable biotic interactions over space and time (Kissling et al., 2012;Tylianakis et al., 2008;Wisz et al., 2013).
To untangle these Gordian knots, we must move toward a greater integration of observed natural environmental gradients and multifactorial controlled experiments to learn how biotic interactions among macrophytes respond to eutrophication and its interaction with climate change (Jeppesen et al., 2010) and to integrate the findings into species distribution models with spatially explicit abundance data . To avoid the complexity of pairwise interactions, this prior knowledge learning process should pay particular attention to communities composed of a small number of species interacting strongly (Gilman et al., 2010), which can be inferred based on plant trait similarities. As we have shown, species with greater trait divergence exhibit a greater change in the type and strength of biotic interactions and can be considered as community module worthy of further study.

| CONCLUSION
The study demonstrated, including biotic interactions in modeling species distributions, can increase the prediction accuracy of macrophytes multispecies assemblages, and the role of biotic interactions in determining species distribution can be even more important than abiotic factors for macrophytes, in the case of eutrophication.
Species association between macrophytes in stressful environmental, that is, low light condition in eutrophic water, become more tightly and probably tend to be positive, which has large contributions on predicting distribution of multispecies assemblages. We suggest the approaches that join horizontal biotic factors into species distribution model can be applied to a wide range of aquatic ecosystem from community scale to regional scale, which is helpful for restoration and management of eutrophic water. However, the join species distribution models built on different alterative ecosystem states cannot simply be used for each other, without extrapolating the distribution of all potentially interacting species and the variation of their interactions.

ACKNOWLEDGMENTS
This study was supported by the National Natural Science Foundation