Simulating water quality and ecological status of Lake Vansjø, Norway, under land-use and climate change by linking process-oriented models with a Bayesian network

• A Bayesian Network chained to a lake model allows predicting cyanobacteria


Introduction
Eutrophication of lakes due to nutrient inputs from agriculture and wastewater is a widespread environmental problem (Schindler et al., Science of the Total Environment 621 (2018) 713-724 2016). 55% of European waterbodies failed to meet the objective of good ecological status mandated by the Water Framework Directive (WFD) (EU, 2000), with nutrient pollution being one of the most significant causes. Eutrophic lakes may have blooms of toxic blue-green algae (cyanobacteria), fish kills, unpalatable drinking water, and unaesthetic bathing water (Carvalho et al., 2013).
Climate change may exacerbate these problems (Chapra et al., 2017), for example, by increasing water temperatures or by increasing precipitation, impacting nutrient loads from the catchment. Mitigation measures undertaken to fulfil the WFD may be offset by the effects of climate change. Conversely, climate change may ameliorate eutrophication by diluting nutrient concentrations through increased precipitation and discharge. Eutrophic lakes thus face multiple stressors, and the combined effects can be synergistic or antagonistic (Hering et al., 2015).
Quantitative assessment of the future combined effects of these multiple stressors on lake eutrophication benefits from the use of models. Future climate conditions are typically estimated by means of downscaled global climate models (GCM), allowing to simulate future discharge from the lake catchment using a hydrological model. A biogeochemical model simulates nutrient losses in runoff, and a lake model simulates the lake temperature, nutrient concentrations and algal abundances. A Bayesian network approach may be used to estimate the combined effects of these environmental changes on the abundance of cyanobacteria and on the ecological status of the lake, as stipulated by the WFD .
Here, we use a linked chain of models (Couture et al., 2014) and feed the outputs from these into a Bayesian network  to evaluate alternative scenarios (storylines) that combine future mitigation measures and climate change. We project the phosphorus (P) transport from the catchment to Lake Vansjø, estimate the probability of cyanobacteria blooms, and estimate the probability that the lake will acquire "good ecological status" in the future.
Lake Vansjø in south-eastern Norway is a meso-eutrophic lowland lake in a catchment dominated by forestry and agriculture. It has occasional blooms of cyanobacteria (Haande et al., 2011). The lake and the major inflowing river have been monitored since the 1970s and 1980s, respectively. Management measures taken to limit the inputs of nutrients have included restrictions on agricultural activities, limitations on fertilizer applications, buffer strips, artificial wetlands, and installation and upgrading of sewage systems (Barton et al., 2016). The site is one of several case studies in the EU project MARS (Managing Aquatic ecosystems and water Resources under multiple Stress) (Hering et al., 2015). The MARS project has set up a framework for studying multiple stressors on lakes across the climatic and geographic gradients in Europe. Several common climate scenarios and economic storylines are used to assess possible alternative future ecological conditions at 16 case studies across Europe.

Study site
The Vansjø-Hobøl catchment (area = 675 km 2 ), also known as the Morsa catchment, is located in south-eastern Norway (59°24′N 10°42′ E) (Fig. 1). The Hobøl River, with a mean discharge of 4.5 m 3 s −1 , drains a sub-catchment of 336 km 2 (301 km 2 before the monitoring gauge and 35 km 2 after) into Lake Vansjø, the catchment's main lake. Lake Vansjø has a surface area of 36 km 2 and consists of several sub-basins, the two largest being Storefjorden (eastern basin, draining a catchment of 244 km 2 ) and Vanemfjorden (western basin, draining a catchment of 58 km 2 ). The water-column of both basins remains oxygenated throughout the year. The deeper Storefjorden basin (max depth 41 m) drains to the shallower Vanemfjorden basin (max depth 19 m) through a shallow channel, with an overall water residence time of about 13 months. The outlet of Vanemfjorden discharges into the Oslo Fjord (Fig. 1). The Storefjorden basin is used as the drinking water source for 60,000 inhabitants in the city of Moss and surroundings, and both basins are extensively used for bathing, fishing and other recreational activities.
Land cover of the Vansjø-Hobøl catchment is dominated by forest (78%) and agriculture (15%), of which the agricultural land-use is mainly for cereal production (89%). The local catchment for Vanemfjorden has the highest proportion of vegetable crops (b1%). Agricultural activities contribute about 48% of the total P input to the river basin, followed by natural runoff (39%), wastewater treatment plants (WWTP) (5%) and wastewater from scattered dwellings (8%) (Skarbøvik and Bechmann, 2010b).
The lake has a long history of eutrophication from at least the 1970s when systematic monitoring of the lake began. Total P concentrations in Vanemfjorden lie between 20 and 40 μg P L −1 , which is above the threshold for the good ecological status as required by the Water Framework Directive (Fig. 2). Lake Vansjø, and in particular the western basin, Lake Vanemfjorden, experienced blooms of toxinproducing cyanobacteria causing beach closures due to high toxin levels, although blooms have largely collapsed after 2007 (Fig. 2

Meteorological data
Observed precipitation, temperature and wind records at Lake Vansjø were obtained from daily weather data at the Norwegian Meteorological Institute stations (1715 Rygge; 1750 Fløter; 378 Igsi) located between the two lake basins (59°38′N, 10°79′E).

Water quality
Water chemistry and temperature data were provided by the Vansjø-Hobøl monitoring programme, conducted by the Norwegian Institute for Bioeconomy Research (NIBIO) and by the Norwegian Institute for Water Research (NIVA) . Suspended sediment (SS) and total phosphorus (TP) in the Hobøl River have been sampled weekly or bi-weekly since 1985 as part of a monitoring programme (Skarbøvik and Haande, 2012;Skarbøvik et al., 2017). Watercolumn sampling of epilimnion P, nitrogen (N), chlorophyll-α (Chl) and phytoplankton, as well as other parameters have been sampled weekly or bi-weekly since 1976, at the deepest site of both basins using a depth-integrating tube sampler from 0 to 4 m depth (Skarbøvik and Haande, 2012). For both basins values of TP, particulate phosphorus (PP), and dissolved inorganic P (termed ortho-phosphate PO 4 in the database) are accessible for both basins through NIVA's online database (http://www.aquamonitor.no).

Land cover data and sources of P load
The land cover for the Vansjø-Hobøl catchment was constructed from GIS digital terrain elevation maps provided by NIBIO and complemented by a recent report on the fertilization regimes of agricultural fields (Skarbøvik and Bechmann, 2010a). Historical nutrient inputs from wastewater treatment plants (WWTPs) were obtained from the online database KOSTRA, maintained by Statistics Norway (http:// www.ssb.no/offentlig-sektor/kostra). TP and SS data were analysed downstream of Høgfoss, at Kure . P loadings from scattered dwellings came from NIBIO's online GIS information system, GISavløp.

Phytoplankton data
Bi-weekly Chl and cyanobacteria data from 2005 to 2014 were downloaded from NIVA's monitoring database (http://www. aquamonitor.no) . These are for the summer months April-October, as only these months are relevant for the ecological status classification.

Model chain
The model chain consists of five separate modules: pre-processors for climate model outputs, a hydrological model, a catchment model for P, a lake model, and the Bayesian network for cyanobacteria and lake ecological status (Fig. 3). The data for the most recent 10-year period (2005-2014) were used to calibrate the models. Each of the models is first calibrated, and then run with climate change scenarios and storylines to simulate conditions in the future. The model chain is described in detail in Couture et al. (2014) and Moe et al. (2016), and summarized below.

Climate models
Outputs from each of the two global climate models were used to generate air temperature, pressure, precipitation and wind speed data for the period 2006-2099. The two models are part of a suite of models used by the Inter-Sectoral Impact Model Inter-comparison Project (ISIMIP) at the Potsdam Institute for Climate Change Research. The first of these was developed by the Geophysical Fluid Dynamics Laboratory (GFDL) run by the National Oceanic and Atmospheric Administration at Princeton, New Jersey, USA. Here the earth system model 2 M (ESM2M) was used. The second was developed by the Institute Pierre Simon Laplace (IPSL) climate-modelling centre, a consortium of several organisations in France. Here, the climate model 5 (CM5) was used. GFDL and IPSL were selected here as they give results close to the ISIMIP median for the northern region (Faneca Sanchez et al., 2015). These two models differ in term of atmospheric prognostic state variables (wind, pressure, temperature, humidity, cloud water and ice and cloud fraction for GFDL and wind, pressure, temperature and cloud water for IPSL) and of the spatial resolution of their atmospheric grid (2.5 o lon. by 2.0 o lat. for GFDL and 2.5 o lon. by 3.75 o lat. for IPSL). Further conceptual differences between the GFDL and IPSL models and details on the setup of the inter-comparison experiments are given in Warszawski et al. (2014) and references therein.

Hydrology model
The outputs of the GCMs, together with basin characteristics, were used as inputs for the hydrological PERSiST model (Futter et al., 2014) to produce daily estimates of runoff, hydrologically effective rainfall and soil moisture deficit. PERSiST (v. 1.0.17) is a daily-time step, semidistributed rainfall-runoff model designed specifically for use with INCA models. Although PERSiST shares many conceptual characteristics with the HBV model (Bergström, 1976), such as the temperature index representation of snow dynamics and evapotranspiration, it differs in its description of water storage. PERSiST uses the same conceptual representation of water storage as the INCA models. Coupling PERSiST with INCA allows a consistent conceptual model of the runoff generation process for both hydrological estimations and water chemistry simulations.

Catchment P model
Daily hydrological outputs from PERSiST and weather forcing from the climate models were used as inputs for INCA-P. INCA-P is a process-based, mass balance model that simulates temporal variation in P export from different land-use types within a river system (Wade et al., 2002). INCA-P is semi-distributed, in that soil properties and management practices are spatially averaged within user-defined subcatchments. It has been used extensively in Europe and North America to simulate P dynamics in soils and surface waters and to assess the potential effects of climate and land management on surface water quality (Crossman et al., 2013;Whitehead et al., 2011). We use the recent fullybranched version of INCA-P (Branched-INCA-P v. 1.4.11;Jackson-Blake et al., 2016), in which reaches are defined as stretches of river between two arbitrarily defined points, such as a gauging station, a topographic feature or a lake basin (Jackson-Blake et al., 2016). Here we define the 4 reaches referred to above: The Hobøl River sub-catchments upstream and downstream from the gauging station at Kure, and the local subcatchments of the Storefjorden and Vanemfjorden basins (Fig. 1). The reach structure was established using GIS and land-use maps for the area and the location of monitoring stations and discharge points into lake basins. It produces daily estimates of discharge (Q), concentration of suspended solids (SS), total dissolved P (termed TDP in INCA) and total phosphorus (TP). The simulated TP concentrations and total flux of phosphorus will be used as input for MyLake model in the model chain.

Lake phosphorus and chlorophyll model
Output from the climate models and INCA-P were then used to drive the lake model, MyLake. MyLake (v. 1.2.1) is a one-dimensional processbased model designed for the simulation of seasonal ice-formation and snow-cover in lakes, as well as for simulating the daily distribution of heat, light, P species, and phytoplankton chlorophyll-a in the water column (Saloranta and Andersen, 2007). MyLake has been successfully applied to several lakes in Norway, Finland and Canada to simulate lake stratification, ice formation (Dibike et al., 2012;Gebre et al., 2014) and phytoplankton biomass (Couture et al., 2014). It uses daily meteorological input data including global radiation, cloud cover, air temperature, relative humidity, air pressure, wind speed and precipitation, as well as inflow volumes and P fluxes to produce daily temperature profiles in the water column, concentrations of chemical species at various depths in the water column and in the outlet. State variables include suspended solids (SS), dissolved inorganic P (PO 4 ), particulate inorganic P (PIP), dissolved organic P (DOP), particulate organic P (POP), and Chl. The biogeochemical processes linking these state variables in the watercolumn are the mineralisation of DOP and Chl to PO 4 , and the removal of PO 4 through phytoplankton growth (yielding Chl) or through sorption onto SS (yielding PIP). Phytoplankton growth is function of PO 4 concentration, temperature and light (which is then attenuated by phytoplankton biomass). Phytoplankton decay is subject to sinking losses and to a first-order remineralisation rate representing losses by respiration, lysis and predation (Saloranta and Andersen, 2007). In the sediments, mineralisation of organic-bound P and equilibrium partitioning of PIP to the pore water governs the fluxes of PO 4 to the water-column, while resuspension allows Chl and PIP to return to the bottom water from the sediment. PO 4 release from the sediment in response to seasonal anoxia (e.g., internal P loading) was not modelled as the lake basin considered is perennially oxygenated. Details on the equations governing these processes are given in Saloranta and Andersen (2007). In the MyLake model, phytoplankton has a constant C:P ratio of 106:1. It assumes that the particulate organic-P produced is a proxy for Chl. Similar stoichiometry and constant P:Chl ratios can be found Mean of summer (June-September) TP (circles, μg L −1 ), chlorophyll-a (squares, μg L −1 ) and maximum cyanobacteria (diamonds, mg L −1 ) concentrations between 0 and 4 m depth in the water column of Vanemfjorden (Lake Vansjø) for the time-period 1976-2015. Lines show the thresholds for ecological status classes "good" to "moderate" for TP (long dashed), chlorophyll-a (square) and cyanobacteria (dotted). TP data from . in other models (Reynolds et al., 2001). Total particulate P was calculated offline and compared to field observations to calculate performance metrics. For this study MyLake was set-up for the western lake basin, Vanemfjorden. Vanemfjorden discharges to the Oslo fjord (Fig. 1). The outputs of the simulations from INCA-P for the seven separate subcatchments to Vanemfjorden were combined and used as inputs. Finally, nitrogen was not included in the model as the phytoplankton is assumed to be phosphorus-limited in this lake. The growing season total nitrogen (TN) over TP weight ratio (N 25) and nitrate concentrations (N 100 μg) remain high throughout most of the growing season during the period included in the modelling (Skarbøvik and Haande, 2012).

Bayesian network (BN) model
In a BN, each node (variable) is defined as a set of alternative states (e.g. intervals of nutrient concentrations or ecological status classes), and a probability is assigned to each status. The probability distribution for each given child node is determined by the probability of each state of its parent nodes. For example, the child node "cyanobacteria" has the parent nodes "Chl" and "water temperature". The probability distributions of a child and its parent nodes are linked by a conditional probability table (CPT). When the BN model is run for different scenarios, changes in the probability distribution of parent nodes are combined with the CPT and result in updated posterior probability distributions of the child nodes. As an example, the CPT for cyanobacteria is shown in Supplementary Information Table SI-2. This CPT is based on the empirical relationship between Chl, water temperature and cyanobacteria in the monitoring data (i.e., the count of observations of different states).
The purpose of the BN model was to link the outcome of the processbased hydrologic, catchment, and lake models under different climate and land-use scenarios to a biological response such as the abundance of cyanobacteria. The ecological status of water bodies should be determined primarily by biology and secondarily by supporting physicochemical element such as nutrients. However, process-based models typically do not accurately predict biological responses other than the Chl concentration. By including the cyanobacteria abundance in this model, we can obtain a better assessment of the ecological status in Lake Vansjø. The BN model included the output of lake temperature, TP concentration and Chl concentration from the MyLake simulations. The development and application of the BN model for Lake Vansjø (basin Vanemfjorden) under different scenarios are described by Moe et al. (2016). In contrast to Moe et al. (2016), we no longer include Secchi depth (transparency) in the status assessment, because the most recent version of the national classification guidance states that current class boundaries are not applicable for lakes with high turbidity such as Lake Vansjø.
The selection of predictor variables in the BN was limited to those that are predicted by MyLake (TP, Chl and water temperature). Predictor variables for cyanobacteria included water temperature and Chl. TP was omitted here because it is strongly correlated to Chl, as is commonly observed in lakes . Therefore, even if both variables are suitable parent nodes for Cyanobacteria, we chose Chl it is a better predictor of cyanobacteria biomass than the more commonly used TP . Variables for assessment of ecological status were TP, Chl and CyanoMax (seasonal maximum of cyanobacteria concentration). Each of these nodes had three intervals that corresponded to concentrations of High-Good, Moderate and Poor-Bad status classes, respectively (see Table 1). The node Temperature (water temperature) had two intervals: above and below 19°C. This boundary was determined by a regression tree analysis of the monitoring data: the probability of cyanobacteria abundance exceeding 1000 μg L − 1 was significantly higher when the water temperature was above 19°C .
The BN assigns the probability of ecological status of the lake given estimates of total P, Chl and cyanobacteria abundance, according to the status class boundaries for each variable (Table 1). The ecological status is determined by two biological indicators (Chl and cyanobacteria) and one physico-chemical indicator (TP), using the following two combination rules: (1) If the cyanobacteria status, expressed as normalized Ecological Quality Ratio (EQR) values, is lower than Chl status, then the combined phytoplankton status is set to the average of the normalized EQR values of Chl and cyanobacteria (if the cyanobacteria status is equal to or higher than the Chl status, then the cyanobacteria status is not included), (2) If the phytoplankton status is High or Good, and TP status is lower than the phytoplankton status, then the combined lake status is reduced by one status class, from high to good or from good to moderate. Such combination rules follow the guidance described in the intercalibrated Norwegian classification method for accessing ecological status for phytoplankton in lakes (Lyche-Solheim et al., 2014).

Climate scenarios
The two global climate models (GFDL and IPSL) were driven by two separate trajectories of greenhouse gas concentrations, Representative Concentration Pathways RCP4.5 and RCP8.5. These RCPs are moderate and extreme greenhouse gas concentration trajectories adopted by the United Nations Inter-governmental Panel on Climate Change (IPCC) for its fifth Assessment Report (Meinshausen et al., 2011;Moss et al., 2008;Stocker, 2014). Daily data  for these two models driven by the two RCPs were downloaded from the ISIMIP database for grid size 0.5 × 0.5 degrees (Faneca Sanchez et al., 2015). The temperature and precipitation data were subsequently bias-corrected based on monitoring data at the Vansjø catchment, using the delta change approach (Rojas et al., 2011). These input data are summarized in Table 3.

Storylines for land-based phosphorus sources
We used three alternative "storylines" to estimate future loadings of P from land-based sources to water. We use the three storylines developed by the MARS project (www.mars-project.eu/index.php/factsheets.html) (Cremona et al., 2017;Faneca Sanchez et al., 2015), describing different sets of possible future economic, environmental, political and climatic developments to impose different levels of land-based sources of P from agriculture and wastewaters from urban and rural dwellings, by and large the main sources of P to lakes (McCrackin et al., 2017). In "Consensus World" (CW), the economy grows at the same pace as now, with current guidelines and environmental policies continuing after 2020 in a more integrated manner. Sustainable and efficient use of resources is promoted and water management strategies are set to comply with the regulations. This world is based on a combination of the shared socioeconomic pathways 2 (SSP2; See Kriegler et al., 2012 for a description of SSP's) and a radiative forcing of 4.5 W m −2 . This storyline represents a "best case" situation with respect to nutrient pollution to the lake. In "Fragmented World" (FW) economic growth is promoted at the expense of ecosystem preservation. This storyline represents an intermediate business as usual situation and is based on a combination of SSP3 and a predicted radiative forcing of 8.5 W m −2 . Finally, in "Techno World" (TW) the economy grows quickly and outpaces regulation of environmental protection by governments. Water resources management focuses on water needed for economic development and production of drinking water. This storyline Table 1 Boundaries of status classes for biological and chemical elements included in the BN model, according to the Norwegian classification system ⁎ for lakes of type L-N8a (large, lowland, moderately calcareous, humic).
is based on a combination of SSP5 and a predicted atmospheric radiative forcing of 8.5 W m −2 . This storyline represents a "worst case" situation with respect to nutrient pollution to the lake. The main sources of anthropogenic P to the river and lake include diffuse sources from agricultural activities and point sources from wastewater. For all three storylines, the stressors were modified by applying specific measures as described in Table SI-1. The downscaling of these socio-economic factors and anticipated land-use changes for the three storylines for Norway and the Vansjø-Hobøl river basin was performed together with stakeholders involved in the catchment's landuse and water management during the REFRESH EU FP7 project (Couture et al., 2014). As a result, the following scenarios represent realistic developments and actions that stakeholders have the capacity to implement.
The "extended baseline" (BL) assumes no change in climate and no changes in land-use practices from present-day conditions. We then ran the chained model setup for a total of 11 different scenarios for the future 2015-2070. There are four separate climate change simulations, two forcing functions (RCP 4.5 and RCP 8.5) each simulated by two different climate models. There are also three "storylines" for future P loadings, each combined with the climate projections made with the two climate models ( Table 2). The RCP 4.5 was used for the CW scenario, while for the two other storylines (FW and TW) we used the RCP8.5.

Calibration results
The PERSIST and INCA-P models were manually calibrated against Q and TP loads, respectively, for the time-period 1/1/1983-31/12/2014. Performance of these models is shown in Table 4. Upscaling of INCA-P from the calibration point at Kure on the River Hobøl was made by assuming that the INCA parameters for Kure also hold for the other six sub-catchments to Vanemfjorden. The MyLake model then took the output of INCA-P, summed over the relevant sub-catchments as input to Storefjorden and Vanemfjorden. Here we use previously determined parameters as a starting point (Couture et al., 2014) and add 2 years of more recent data to report on temperature and Chl in the Vanemfjorden basin ( Fig. 4 and Table 4), as they are the state variables needed for the BN model. The MyLake model was calibrated for the time-period 1/4/ 2005-1/9/2014 using the genetic algorithm provided with the Matlab global optimization toolbox. MyLake calibration at Vanemfjorden captures the seasonal minima in PO 4 and maxima in both TP and Chl. The seasonal trends in Chl were also well captured by the model, with the exception of an algal bloom in the summer of 2006, whose magnitude was not fully captured (Fig. 4). For the best fit to the observations, the input of P simulated by INCA-P needed to be scaled up by 9%. Finally, the evaluation of the BN is described by Moe et al. (2016).

Scenario and storyline results
The hydrologic model PERSiST chained to the INCA-P model projects that discharge in the River Hobøl will increase in the future (Fig. 5). The increase is projected by both the GCMs, and is larger under the stronger RCP 8.5 relative to RCP 4.5 (not shown). The IPSL climate model indicates much larger year-to-year variation in discharge with extreme years having as much as four-fold the normal discharge.
The INCA-P model projects that the extended baseline will not lead to significant long-term change in the fluxes of TP in the River Hobøl at Kure (Fig. 6), although with climate change, the model projects higher fluxes of TP. This is the case for both GCMs driven by both RCPs. Using the IPSL climate model results in higher year-to-year variations in TP fluxes relative to the GFDL model, and for both climate models the higher forcing of RCP 8.5 yields larger changes in TP fluxes compared to the RCP 4.5.
For the storylines, the INCA-P model predicts that the TW gives the highest TP fluxes, the CW storyline the lowest, and the FW storyline intermediate values (Fig. 6). Results from the IPSL climate model are somewhat higher than the GFDL model, probably because the IPSL model projects an increase in precipitation and runoff greater than that of the GFDL model. INCA-P indicates that for the "best case" consensus storyline, both climate models project TP fluxes below the extended baseline (Fig. 6).
The MyLake model indicates that under the extended baseline scenario, the TP and Chl concentrations in the Vanemfjorden basin of Lake Vansjø will decrease slightly in the future (Fig. 7). Concentrations of TP and Chl are projected to increase relative to the extended baseline for both climate models and both RCPs. The MyLake results show differences between the various scenarios and storylines in Chl concentrations for the critical summer months, June, July and August (Fig. 8). The IPSL climate model, again, gives somewhat larger year-to-year variations compared to the GFDL model. For the storylines, the MyLake model suggests that the CW storyline gives the lowest, the FW intermediate, and the "TW" the highest TP and Chl concentrations (Fig. 9a).  Results from the IPSL climate model are again somewhat higher than the GFDL model. MyLake indicates that for the "best case" consensus storyline, both climate models project TP and Chl concentrations below the extended baseline. This is the period in which blooms of cyanobacteria might occur. The results show that climate change alone will only yield a slight rise in the number of summer days with high Chl concentrations (Fig. 9b). The three storylines, however, differ significantly in the number of summer days with high Chl concentrations.
The Bayesian Network was used to project the changes in the ecological status of Lake Vansjø, given the alternative future climate scenarios and storylines (Fig. 10). A BN is an essential end point in the model chain as it provides the necessary quantitative tools to link the projections of Chl provided by MyLake to estimates of the probability of cyanobacteria abundance that exceed the threshold for good ecological status. In addition, it integrates this result with others from the processbased models using the combination rules of the national classification system. Two examples of the BN's model predictions are shown in Fig.  11: the posterior probability distributions for all nodes under the two storylines CW (a) and TW (b), both using the climate model GDFL and the time period 2050-2070. In the CW storyline (Fig. 11a), the probability High-Good status for Chl as predicted by the MyLake model is 84%.
The probability of High-Good status for cyanobacteria, however, is only 67%, and the combined phytoplankton status has even lower probability (63%) of being acceptable (High-Good). The TP predicted by MyLake in this scenario has a high probability (93%) of High-Good status, and therefore contributes little to the overall lesser lake status (59% probability of High-Good). In the TW storyline (Fig. 11b), the probability of High-Good Chl status (63%) is much lower than in the CW. The probability of High-Good cyanobacteria status is practically unaltered (66%), but the probability of combined phytoplankton status is reduced to 46%. In this scenario the Total P status has only 58% probability of being High-Good, resulting in an overall lake status with only 30% probability of High-Good.
The posterior probability distributions for selected nodes can be compared across all scenarios. The BN results indicated that climate change alone as projected by the two GCMs and driven by the two RCPs does not produce great differences in the probability of obtaining good ecological status for Lake Vansjø (Vanemfjorden). On the other hand, land-use changes in the three alternative storylines, presented significant changes in the probability of obtaining good ecological status in the future. Relative to the extended baseline, the consensus storyline improved the ecological status, the fragmented storyline produced a deterioration of the status, and the techno storyline gave a large deterioration of the status (Fig. 11).

Calibration of the models
Achieving satisfactory performance at the daily time-scale with INCA-P remains a challenge. The difficulties lie both with the INCA-P model, which is necessarily a simplification of the hydrologic and biogeochemical processes occurring in the landscape, and in the fact that the observed data used here are fortnightly rather than daily. In an application of INCA-P to a small agricultural catchment in Scotland, Jackson-Blake and Starrfelt (2015) show that more frequent measurements of P in stream water greatly reduce the parameter-related uncertainty in the model calibration. They also indicate that manual calibration may be just as effective as automatic calibration methods. We thus elected to perform manual calibration of PERSiST and INCA-P using the entire dataset rather than divide it into a calibration and validation period. The NS metrics we obtained for Q of about 0.7 for the River Hobøl at Kure are nearly as good as that obtained by Jackson-Blake and Starrfelt (2015) for automatic calibration on daily data for the Scottish catchment. Simplifications in the PERSiST and INCA-P models as well as the lack of daily data both contribute to the imperfect fit between simulated and observed concentrations and fluxes of TP in the River Hobøl at Kure. While shortcomings in the INCA model Table 4 Outcome of INCA-P and MyLake calibration assessed against the coefficient of determination (R 2 ), the Nash-Sutcliffe metric (N\ \S), and the normalized root-mean square deviation (RMSD*; INCA-P) or root-mean square error (RMSE; Mylake).   contribute to some of the calibration challenges (Jackson-Blake and Starrfelt, 2015), it is also likely that greater uncertainty in runoff measurements and high particulate loads at peak flow also cause uncertainty in TP calculations. In our case, INCA-P simulated TP loads were used as inputs to the MyLake model. As Lake Vansjø has a theoretical water retention of about 13 months, precise simulation of the TP concentrations from the catchment was therefore not of primary importance. Instead we focused on cumulative TP loadsrather than instantaneous concentrationsto the lake basins, as those have the controlling influence on TP concentrations in the lake water. Calibrating to TP loads, at the expense of decreased performance against TP concentrations (Table 4), means that satisfactory performance against TP loads was obtained. This procedure was recommended by Farkas et al. (2013) for the nearby Skuterud catchment, a small catchment (4.5 km 2 ) located a few kilometres north-west of Vansjø-Hobøl.

Outcome of climate change and storylines at the river basin scale
Results of hydrological modelling show that the RCP 8.5 leads to more annual runoff, due to higher precipitation, relative to RCP 4.5. There are also differences between the outputs of the two climate models, such that IPSL generally projects a greater increase in precipitation and runoff and more extreme inter-annual variations (Fig. 5). The INCA-P simulations suggest that TP loads are affected by climate due to variations in runoff and erosion resulting from changing precipitation. Consequently, storylines relying on RCP8.5 (TW and FW) produce higher annual TP loads. IPSL-based storylines yield higher TP loads to the lake inlet, while for the storyline CW, based on RCP4.5, both IPSL and GFDL give similar outcomes. While predicting stable or increasing TP at the lake inlet, INCA-P predicts decreasing TDP, which suggest that although erosion sustains TP loads under climate change (Fig. 6), dissolved P (both organic and inorganic) is slowly being leached out from the soils and is not replenished due to measures imposed to reduce fertilizer applications.
Legacy P in soils at the catchment scale is the focus of current efforts to understand lake recovery (Kleinman, 2017;Sharpley et al., 2013). At Vansjø, the legacy P may delay the lake response to management measures taken to reduce P inputs from the catchment. Most difficult to appraise is the role of P accumulated in agricultural soils over decades of organic and artificial fertilizer applications. Data from the few studies available indicate that legacy P in soil may delay the response of "edge-of-the-field" P losses to changes in fertilizer applications by 4-40 years (Sharpley et al., 2013). At Vansjø, Bechmann and Øgaard (2013) suggests that agricultural soils near the Vanemfjorden basin of Lake Vansjø have high concentrations of adsorbed P, and fertilizer applications are larger than the P removed annually in the crops. These authors also report that for six small streams, the effects of mitigation measures on TP and SS concentrations were difficult to detect over a 7 and 5-year period, respectively, despite comprehensive implementation of mitigation measures in all sectors.
The effect of changing land-use on TP loads according to the storylines is shown in Fig. 6. As previously estimated for different storylines (Couture et al., 2014), land-use is the key driver of TP loads in the basin. Here, the CW storyline is worth highlighting as it predicts a sharp reduction of TP loads in the river (60% for the GFDL climate model and 80% for IPSL climate model; Fig. 6). The best case is the CW storyline under the GFDL model, while the worst case is the TW  Table 2.  Table 2. storyline under the IPSL model. All other simulations, including the baseline, are within those two extremes. The great spread of predicted P loads associated with the worst case (TW) and best case scenario (CW) will imply widely different water qualities in receiving waters and potentially also significantly different responses within the biotic community. The different storylines include a combination of different measures within both the agriculture and the wastewater sectors. The flux of P from wastewaters has been greatly reduced to present-day levels, thus most of further reductions in P loading will have to come from the agricultural sector. A closer evaluation of the efficiency of each measure is therefore not possible without a more detailed analysis in which the measures are simulated one by one and in various combinations. Further, INCA does not allow catchments characteristics such as % land-use to vary during the course of multi-year simulations, as is the case for many catchment models. As a result, storylines are fully implemented in the year 2015, rather than as progressive changes. In reality, changes in catchment land-use occur more gradually.

Outcome of climate change and storylines at the lake scale
The linked catchment-lake model was first tested under scenarios of climate change only, keeping historical land-use throughout the simulations. The result (Fig. 7) shows that climate change, regardless of its magnitude (e.g., RCP4.5 vs RCP8.5) does not cause a significant increase or decrease in the average phytoplankton biomass at Vansjø. The overall decrease with time is consistent with decreasing TDP fluxes from the river basin discussed above. Another factor may be that internal P loading, which refers to the release of P stored in the lake sediments to the overlying water (Orihel et al., 2017;Sondergaard et al., 2013), deceases with time. Internal P load at Vansjø is, however, deemed to be of minor importance, as the perennially oxygenated water column precludes the diffusion of PO 4 through the sediment-water interface (Andersen et al., 2006).
The model suggests that the variability in the 2020-2070 mean annual phytoplankton biomass will increase with the magnitude of the climate change, with standard deviation of 0.47, 0.56, and 0.67 for historical, RCP8 (GFDL) and RCP8 (IPSL) climate projections, respectively. Fig. 7 shows clear effects of changes in land-use and wastewater effluent loads on mean annual Chl throughout the 2020-2070 period. Mean annual Chl is highest for the Techno storylines and the FW storylines, and lowest for the CW storylines. These differences reflect the measures implemented for these scenarios. The Techno storyline, which uses RCP8.5 (IPSL model), stands out as the worst-case scenario at Lake Vansjø, while the CW storyline, which uses RCP4.5 (GFDL model), stands out as the best-case scenario (Figs. 6 and 7).

Accessing lake status using Bayesian Network
Interaction between temperature and Chl as stressors for cyanobacteria was included in the CPT for cyanobacteria in the BN  model . The Bayesian Network results indicated that the response of cyanobacteria status (status Moderate or Poor-Bad with a seasonal concentrations maximum N 1 mg L − 1 ) to change brought about by the climate scenarios and the storylines are small. In addition, the difference in Chl status observed for the three storylines is not reflected in the cyanobacteria status (Fig. 11). One reason is that the probabilities of concentrations N 1 mg L −1 are very low (typically b1%). Another reason for the lack of response is that the CPT for cyanobacteria contains high uncertainty. It is based on relatively few observations (103 samples from Vansjø distributed on 18 cells in the CPT), of which only 13 samples have cyanobacteria concentration exceeding 1 mg L −1 . As a result, the probability of different outcomes does not vary much between different scenarios. In follow-up work with this BN we aim at improving this CPT by including data from other lakes or by basing the CPT on an ordinal regression model rather than on observations counts. Nevertheless, although the cyanobacteria did not show a marked response to the scenarios in the study, calculating this state variable with the help of the BN model is crucial to determine the ecological status of the lake. For all scenarios, cyanobacteria contribute to worsening the status assessed for phytoplankton, compared to using Chl alone. In this way, the status of cyanobacteria, as predicted by the BN, contributes to the overall response of phytoplankton status and of total lake status to the scenarios.

Future lake water quality under multiple stressors
The scenarios and storylines used here in this modelling study for the MARS project differ from those in previous model applications at Lake Vansjø (Couture et al., 2014;Moe et al., 2016). Here the climate scenarios have been updated to the latest RCPs suggested by the IPCC, and theses have been used to drive the two GCMs, IPSL and GFDL. The modelling results for these new storylines generally fall within the range indicated by the previous best case/worst case storylines (Couture et al., 2014), although future eutrophication status in Lake Vansjø differ from the previous assessment, generally predicting lower Chl abundance under climate change. Overall, our results further illustrate that better connectivity of scattered dwellings to wastewater treatment plants and fertilizer load reductions are effective in reducing P fluxes to Vansjø, consistent with what was observed throughout Europe (Ibisch et al., 2016). Further improving the ecological status of the lake likely depends on more stringent measures to reduce diffuse pollution from agriculture.
Chaining climate, hydrologic, catchment and lake models is a useful approach to simulate the outcome of climate and land-use changes. Our results also show the value of predicting a biological indicator of lake ecological status, in this case cyanobacteria abundance, with a BN model. Given the model complexity, only the chief factors controlling blooms, namely climate variables and catchment P dynamics, were Probability of High-Good status of Chl for the CW and TW scenarios are 84% and 63%, respectively, while the probability of High-Good status for Cyanobacteria are 67% and 66%, respectively. Note the impact of the "Status Cyano" on the "Status Phytoplankton" node, which reduces the probability of High-Good to 63% and 46% for the CW and TW storylines, respectively. investigated here. Under those constraints, our results suggest that climate change alone does not cause significant increase in the mean phytoplankton biomass at our site, but more severe climate change does increase inter-annual variation. Higher inter-annual variation in phosphorus loads and phytoplankton biomass associated with future climate change implies that operational monitoring programmes prescribed by the WFD should be run for several more years to allow a credible assessment of the effects of P-reducing mitigation measures.
Improving on this work to assess the impact of multiple stressors on the ecological status of lakes should, on one hand, consider nutrient legacies at the river basin and lake scales, and, on the other hand, novel stressors of increasing importance. Along these two lines, we identify specifically (i) the effect of catchment P legacy (Sharpley et al., 2013) and N (Van Meter et al., 2017) on the timing and inputs of nutrients to the lakes, which remains uncertain in particular under high flows, (ii) the effect of changes in lake colour (i.e., browning) brought about by increasing dissolved organic carbon (DOC), on the lake photon budget and thus on phytoplankton growth (Deininger et al., 2017;Solomon et al., 2015) with effects on cyanobacteria (Seekell et al., 2015), and finally (iii) the controls on the internal loads of elements critical for the growth of cyanobacteria, in particular P and ferrous iron (Verschoor et al., 2017). exemple conditional probability table for the BN model, and a graphical representation of the structure of the BN model.