State of the art in modelling of phosphorus in aquatic systems: Review, criticisms and commentary

This systematic review considers how water quality and aquatic ecology models represent the phosphorus cycle. Although the focus is on phosphorus, many of the observations and discussion points here relate to aquatic ecosystem models in general. The review considers how models compare across domains of application, the degree to which current models are fit for purpose, how to choose between multiple alternative formulations, and how models might be improved. Lake and marine models have been gradually increasing in complexity, with increasing emphasis on inorganic processes and ecosystems. River models have remained simpler, but have been more rigorously assessed. Processes important in less eutrophic systems have often been neglected: these include the biogeochemistry of organic phosphorus, transformations associated with fluxes through soils and sediments, transfer rate-limited phosphorus uptake, and responses of plants to pulsed nutrient inputs. Arguments for and against increasing model complexity, physical and physiological realism are reviewed. Display Omitted The treatment of phosphorus in aquatic models is systematically reviewed.The complexity of lake, river and marine models is increasing over time.Catchment-river models tend to be simpler than lake and marine models.Performance assessment of lake and marine models is generally inadequate.Processes not included in models are discussed.


Introduction
Though planktonic population dynamic models date back to the 1930s (Fleming, 1939), the current generation of mechanistic water quality models are in many senses the direct descendents of the NPZ models developed for eutrophic coastal systems during the 1950s, 60s and 70s (Dugdale, 1967;Steele, 1959Steele, , 1974 and the variations designed to facilitate management of eutrophic rivers and lakes (O'Brien, 1974;Scavia and Park, 1976). In the intervening decades, these models have been applied to an ever-growing range of aquatic environments: oligotrophic lakes (Buzzelli et al., 2000), rivers (Garnier et al., 2005), weir pools, floodplains and freshwater wetlands (Paudel et al., 2010;Wang et al., 2010), estuaries (Colijn et al., 1987;Wild-Allen et al., 2010), coastal and deep ocean systems (Allen and Clarke, 2007;Blauw et al., 2009), and even salt ponds (Bruce and Imberger, 2009). A brief history of marine plankton models has been presented by Gentleman (2002).
Over time, many models have become increasingly detailed, both in resolution and in the number of processes and components represented. It is timely to pause and take stock of these models. Are models that were developed for eutrophic lakes suitable for the other systems to which they are now being applied? Do they accurately reflect our current understanding of the biogeochemistry of aquatic systems? Are they getting more accurate? Are they improving in the range of system properties and behaviours that can be predicted? Where there is variety, can we demonstrate that one approach is better than another and should always be preferred?
This review will begin this process, focussing on how phosphorus cycles are represented in models of aquatic systems. Phosphorus is important as the dominant limiting nutrient in many freshwater systems, and has long been considered a key control of harmful algal blooms and the productivity of aquatic ecosystems (Abell et al., 2010;Elser et al., 1990). Nitrogen more commonly limits planktonic production in marine systems, where nitrogen fixation by cyanobacteria is more constrained and inputs from sewage treatment plants are less important (Vitousek and Howarth, 1991).
The paper begins by reviewing recent criticisms of and commentary regarding aquatic ecological models. I will then consider the various ways in which the phosphorus cycle is represented, considering water column and benthic processes as well as the physical frameworks in which water quality models are embedded. The variety of formulations that have been used to represent specific processes are not reviewed in detail, but the following section will refer readers to other such reviews and suggest a framework by which to evaluate alternative formulations.
The quantitative contribution of this paper is a systematic review of 158 published aquatic modelling papers that include aspects of the phosphorus cycle. The models included in the systematic review are compared in terms of their representation of components and processes in the phosphorus cycle and assessed in terms of how well these representations hold up in the face of current biogeochemical understanding, and how the complexity of phosphorus cycle modelling is changing over time. I will discuss whether increases in model complexity have been justified, how representations vary between sub-disciplines (lotic and lentic limnological versus oceanographic modelling), and how well aquatic ecosystem models are currently being evaluated. I will use the terms "aquatic ecosystem models," "aquatic biogeochemical models" and "water quality models" somewhat interchangeably. Some of the models included in the review are true ecosystem models, incorporating a range of biota from predatory fish and macroinvertebrates to seagrasses and periphyton. Others are best considered biogeochemical models, concentrating exclusively on phosphorus (and sometimes also nitrogen) processes, but including aspects of plankton ecology. A few simulate only a limited number of nutrient processes, but still aim to simulate variations in water quality. The boundaries between these three groups are fuzzy, as they represent parts of a spectrum rather than discrete model types.
Finally, I will discuss some of the processes not included in most current phosphorus models, and the circumstances in which the omission of these processes may be important, and will conclude with some general recommendations for the future direction of aquatic ecological modelling.

Methods
For the systematic review, I assessed 73 distinct models of aquatic water quality and ecology, in a total of 158 applications from the published literature. This total includes 51 applications of lake and reservoir models, 42 applications of coastal and marine models (including models of estuaries), 38 applications of catchment and river models, 13 wetland models and, for purposes of comparison, 8 sediment diagenesis models and 6 process-based models of biological wastewater treatment systems. Models were compared in both qualitative and quantitative terms.
There are many thousands of aquatic models that include phosphorus dynamics in the published literature, and it is not possible to include them all in this analysis. The approach taken here has been to select a few highly cited models from each decade, supplemented with an ad hoc selection of other modelling publications, giving preference to those that mention phosphorus as important in the system under consideration or in verification of the model, and aiming to include a fair representation of three broad groups: lake models, marine models and Catchmente river models. I have preferred, but not limited the selection to models published in higher-impact factor, peer-reviewed journals. A few grey-literature reports are included where these represent the only comprehensive descriptions of models that have been widely used in journal-published literature. Catchment and paddock models that do not represent any in-stream phosphorus processes have been deliberately excluded, though such models are also widely used. Only mechanistic models are considered, though a variety of other approaches (e.g. Kuo et al., 2007;Ticehurst et al., 2007;Ulen et al., 2001) are also in use. Finally, I have preferentially included more recent papers; those published since Arhonditsis and Brett's (2004) review of the state of the art in mechanistic biogeochemical modelling.
Following the above guidelines, modelling publications were selected for inclusion from amongst matches found on Web of Science and Google Scholar search results from the following keyword sets: 1. ["phosphorus model" or "biogeochemical model" or "water quality model" or "ecological model"] AND ["lake" or "marine" or "river" or "coastal" or "water" or "wetland" or "ocean" or "hydrological" or "hydrodynamic"]; or 2. "phosphorus" AND "water quality" AND "model" A second round of publications was selected for inclusion from amongst Web of Science and Google Scholar search results for the names of each of several named modelling packages (GEM, CAEDYM, SWAT, ERSEM, DELFT-3D, BLOOM, WWQM,  TUDP, SIMCAT, INCA-P, SCOBI, QUAL2K, QUAL2E, MyLake, LakeWeb, LakeMab, MINLAKE, PHOSMOD, EMS, EMOWAD, GEM, CoastWeb, CoastMab and CCHE3D-WQ). In this second round, only publications since 2002 were considered for possible inclusion.
It is acknowledged that the list of models included in the analysis is not comprehensive, though I believe it to be representative.
In assessing model complexity, I have considered the number of phosphorus stores and process types included in each model. In each process category, one "complexity point" was counted for inclusion of each of the components or processes included in each category in Table 1. The categories are somewhat arbitrary: components influencing redox processes, for instance, could be included in either the benthic or abiotic count, for instance. The count is also simplified: one point is assigned for inclusion of multiple phytoplankton groups, for example, regardless of whether the model includes two groups or twenty, and one point is assigned for inclusion of "other animals", regardless of whether this is a single fish group or an entire foodweb. The focus of this paper is the treatment of phosphorus rather than the overall complexity of aquatic ecosystem models.
Not all of the models included in the systematic review are included in the reference list. A full listing is available in the supporting information.
3. Criticisms of the state of the art Arhonditsis and Brett (2004) conducted a systematic review of mechanistic marine and lake biogeochemical models published between 1990 and 2002, finding that less than 28% reported any quantitative sensitivity analysis results and only 30% reported any quantitative metrics to evaluate model performance. Where r 2 and RE were provided or could be calculated from the results that were provided, nitrogen, phosphorus and phytoplankton were found to be only moderately well predicted (r 2 ¼ 0.4 to 0.6 and REw40%), while zooplankton and bacteria were poorly predicted. Model complexity varied greatly, but increased complexity was not associated with better predictive power. The authors concluded with a call for more consistent adoption of best practise in modelling and for the adoption of an agreed set of performance standards.
Continuing this analysis, Arhonditsis et al. (2006) found that neither the degree of rigour employed in model development nor the performance of these models had any impact on the frequency with which publications were subsequently cited, either in the modelling literature or the limnological or oceanographic literature. In other words, not only are we in the aquatic systems modelling community not rigorously assessing our models: it seems that we don't care! Part of my aim in the present paper is to evaluate whether there has been any improvement since the Arhonditsis and Brett (2004) and Aronditsis et al. (2006) reviews. Arhonditsis et al. (2006) also found that applications of models to address local environmental management concerns tend to be less well cited than other modelling publications, and attribute this to inadequacies in model implementation. An alternative explanation may be simply that local model applications are less likely to be of wide interest to others in the modelling and oceanographic communities. Flynn (2005) fiercely criticised the state of the art in planktonic ecosystem models, pointing to the failure of models to update the biophysical underpinnings of their algorithms. He argued that, though models are increasing in complexity, they are doing so by "bolting together" familiar model components rather than improving the structural representation of existing components. In making this case, Flynn pointed to conceptual deficiencies in modelling of combined nitrogen and phosphorus limitation of phytoplankton, in representation of zooplankton grazing behaviour, and in representation of bacterial nutrient uptake and dissolved organic matter. In summing up, Flynn called for more reflective model development and greater interaction between modellers and biologists. Franks (2009) criticised the use of models as prediction "toasters" rather than as hypotheses to be tested. I concur and suggest that, because the implementation of mechanistic models of aquatic systems can be so much more time-consuming and resource-intensive than other modelling approaches and yet has not been demonstrated to have a greater payoff in terms of predictive power in biogeochemical and ecological applications, the use of mechanistic models can only be justified if the models are used to generate hypotheses about system function, or for educational purposes, or are able to predict (possibly unforeseen) higherlevel emergent properties of complex systems.
Allen and Polimene (2011) took up the call for greater conceptual accuracy, arguing that models that more closely represent our true physiological understanding of organisms will be better able to demonstrate prediction of strongly emergent properties of complex ecosystems. This seems like a reasonable hypothesis but has not yet, to my knowledge, been demonstrated in practise. Mooij et al. (2010) discussed challenges and opportunities in lake ecosystem modelling, arguing that the proliferation of competing lake models has led to duplication of effort, with modellers "reinventing the wheel" with a variety of different but functionally similar formulations and models. They summarised some of the more common approaches in lake modelling and describe briefly key characteristics of several named model systems (CAEDYM, CE-QUAL-W2, DELFT 3D-ECO, EcoPath with EcoSim, LakeMab, LakeWeb, MyLake, PCLake, PROTECH and SALMO), some of these widely used within the modelling community and others used repeatedly by particular modellers or small groups. Mooij et al. (2010) concluded with a call for development of openaccess, community lake models that will retain the flexibility to adopt new approaches while minimising duplication of effort and proliferation of trivial variations. They also called for more detailed representation of sediment diagenesis, more focus on fish and zooplankton as top-down controls of water quality, and more effort to predict biodiversity of lake ecosystems.
Others (Min et al., 2011) have called for better integration between data collection and modelling efforts, highlighting the potential to improve our conceptual understanding of aquatic systems in measurements are designed to test conceptual models and fill data gaps in predictive models.
4. How is the aquatic phosphorus cycle represented in models?

Water-column processes
Aquatic biogeochemical models may represent the phosphorus cycle in any of several ways, differing mainly in the number of components (phosphorus stores) and processes (phosphorus transformations) included. Conservation of mass is the key governing principle: with the exception of transfers across boundaries of the model domain, any process that adds (or removes) phosphorus from one component must remove (or add) an equal mass of phosphorus to another pool. Five examples of conceptual representation of water-column phosphorus cycling as embodied in mechanistic models are given in Fig. 1. Although models exist on a spectrum of complexity, varying widely in the number of components (phosphorus stores) and processes (phosphorus fluxes) represented, there tends to be broad conformity surrounding which processes and components are included for a model of a given complexity (Fig. 1).

Representation of plankton
A major focus of modelling of lakes and estuaries has been the prediction of harmful phytoplankton blooms. To this end, the phytoplankton and zooplankton and grazer compartments of aquatic ecological models are commonly divided into several groups, either size-based, taxonomically based (Reynolds et al., 2001), or taking a hybrid or "functional groups" approach (Baretta et al., 1995;Reynolds et al., 2002). The use of multiple groups allows a more detailed representation of the plankton dynamics of the system, such as representations of phytoplankton succession Robson, 2000, 2001), transitions at estuarine interfaces (Chan et al., 2002), or occurrences of harmful algal blooms (Robson and Hamilton, 2003), and also allows more realistic simulation of overall system dynamics, as the phytoplankton and zooplankton communities adapt to changing environmental conditions (Baird, 2010).
In models that use taxonomically or phylogenetically specified groups, parameter values are specified that relate to specific classes or species that are known or believed likely to be present. These models make use of maximum growth rates, nutrient uptake rates, photosynthesis-irradiance response curves and sometimes settling velocities measured in the laboratory or in situ for each taxonomic group. Division by phylogenetic class lumps together species that  (Burger et al., 2008;Elliott et al., 2000). b) Division of organic phosphorus into particulate and dissolved fractions (Cole and Wells, 2008;Petihakis et al., 2002). c) including dissolved organic phosphorus. d) with explicit representation of bacterial biomass and a particulate inorganic phosphorus (or sediment-adsorbed phosphorus) component (Allen and Clarke, 2007;Garnier et al., 2005). e) A typical more complex model, with additional sources of detritus and division of organic phosphorus into two pools of differing lability (Gal et al., 2009;Wild-Allen et al., 2011). Note that some of the examples cited also include components and processes not included in these figures, such as additional biota. For simplicity, benthic sediment processes (including exchanges between the sediments and water column), soil and crop processes (such as erosion) physical transport processes (advection, diffusion, vertical movements, resuspension, boundary fluxes and external sources) and higher ecology (predators and fish) are omitted from all figures, though they are commonly included in the models. Grazers, for the purpose of these diagrams, may include benthic filter feeders. Components that may be subject to removal from the water column by settling and sedimentation are shaded. Acronyms: DIP (Dissolved Inorganic Phosphorus), PIP (Particulate Inorganic Phosphorus), DOP (Dissolved Organic Phosphorus). may have widely varying parameter values, so it is common to include additional groups that represent particular species known to be important in the system (e.g. Robson and Hamilton, 2003). This approach requires the model developer to anticipate all of the functionally distinct species that might become important in the system.
A more systematic approach is to use size-based mechanistic models (Baird, 2010;Baird and Suthers, 2007), which make use of empirically established or theoretically determined relationships between organism size and functional characteristics. The crosssectional area of a phytoplankton cell, for instance, is a key control on the amount of light it can intercept, and hence can be related to its expected photosythesis-irradiance response. This same crosssectional area is also related to the cell's surface area relative to volume, and hence to the rate at which it can take up nutrients from the water column. Cell size can also be used to calculate an expected settling velocity (for a given cell shape and density) and influences both zooplankton encounter rates and suitability as food for zooplankton of a given size. Similarly, zooplankton organism size is strongly related to maximum grazing rates, growth rates and swimming speeds (Hansen and Bjørnsen, 1997). Size-based models use these relationships to effectively simulate multiple functional groups without introducing a variety of additional parameters. The specific characteristics of the local phytoplankton and zooplankton communities do not need to be known, but can be inferred through the evolution of a model that includes several competing size classes.
Models that employ size-based groupings are typically considerably more efficient than taxonomic or other multiple-group models in the number of parameters they require. Possible disadvantages are that the results may be less readily related to observed community structures, and functional responses that are not readily related to organism size (including salinity responses, nitrogen fixation, buoyancy variations and toxin production) requires another approach.
Instead of dividing plankton taxonomically or purely by size, some models make use of functional groupings, typically specifying up to 30 groups, which are divided according to ecological function, observed conditions of occurrence and similarities in responses to environmental variations (Le Quere et al., 2005;Reynolds et al., 2002). This approach has been successful in predicting phytoplankton community composition in local and regional applications, where parameter values are locally calibrated, but has not been demonstrated to be effective across larger geographic scales and has been criticised as introducing too much complexity to ecosystem models before their dynamics are sufficiently well understood (Anderson, 2005). Kruk et al. (2011) proposed and advocated morphology-based functional groups, using surface area, volume and the presence or absence of aerotypes, flagella, mucilage, heterocysts and siliceous structures as defining characteristics of each of 7 groups. Kruk et al. (2011) then assessed the power of multiple linear regression models using commonly measured environmental variables as inputs to predict phytoplankton according to species, phylogenetic class, Reynolds functional group, or morphology-based functional group. They found that predictive power increased from left to right through this list (i.e. morphology-based groups were best predicted by the models, while species models had the weakest predictive power). That a model to predict one of only 7 morphologicallybased groups has greater predictive power than a model to predict from among 28 Reynolds functional groups is not surprising. That phylogenetic classes (8 groups) could be not be as wellpredicted as either morphological or Reynolds functional groups, is more interesting.
One commonly-seen limitation of models that represent plankton species or functional groups is that they do not simulate a realistic, sustained co-existence of multiple groups: instead, one group outcompetes all others and completely dominates until conditions change. Other groups avoid extinction in these models only by imposition of a minimum concentration or external source. This may well be due to under-representation of microhabitat heterogeneity, spatial and temporal patchiness that would allow each group to find its own niche within the system. Alternatively, it has been suggested (Thingstad et al., 2010) that current models lack "kill-the-winner" mechanisms that enhance diversity in real systems: these mechanisms may include increasing populations of predators and viruses that specifically target dominant species or functional groups. Thigstad et al. (2010) recommended the inclusion of size-selective predator functional groups (e.g. heterotrophic flagellates, ciliates and mesozooplankton) to complement the use of phytoplankton functional (or size-based) groups, as an effective and realistic "kill-the-winner" mechanism.
An alternative to multiple groups is the use of "goal function" models, in which a single phytoplankton or zooplankton component changes its properties as conditions change, in order to maximise some function such as the calculated exergy or emergy 1 of the system (Bendoricchio and Jorgensen, 1997;Fath et al., 2001;Zhang et al., 2010). Though theoretically attractive, this approach is relatively little-used, perhaps because it is more difficult to implement than a multiple-group approach, while retaining some of the limitations of a single-group model.

Other animals
Benthic filter feeders have often been found to be particularly important controls on phosphorus cycling and algal blooms in lake ecosystem models, so these are explicitly included as grazers in several models (e.g. Los and Wijsman, 2007). A few models explicitly include a range of other aquatic animals: representation of fish populations and food-webs is beyond the scope of this review.

Grazers
Most models do not explicitly simulate bacterial biomass (instead implicitly representing bacteria through hydrolysis or remineralisation rates applied to organic matter, as in Fig. 1c), but a growing number of models (Bruce and Imberger, 2009;Hakanson, 2009) do simulate bacteria (Fig. 1d). These are rarely calibrated or validated against observations of bacterial abundance or phosphorus content, but it is argued that they allow more realistic system dynamics; for example, by simulating the impact of microbial processing on water-column stoichiometry (Li et al., 2013). Another advantage may be the ability to more easily represent positive feedbacks between concentrations of organic matter and degradation rates (Lønborg et al., 2009) as well as the impact of organic matter stoichiometry (particularly C:N:P) on degradation rates (Wohlers-Zollner et al., 2011).
A few models that do explicitly include bacteria include more than one functional group of bacteria; for example, aerobic and anaerobic groups (Allen and Clarke, 2007;Kaufman and Borrett, 2010), sulphate-reducing and non sulphate-reducing , or nitrifying and denitrifying bacteria.

Phosphorus adsorption and desorption
McGechan and Lewis (2002)  be modified to represent the influence of pH, dissolved oxygen, and sediment aluminium and iron concentrations.

Benthic processes
Although early lake models tended to omit interactions between the water column and benthic sediments, and this is still true of most deep ocean models and some river models, modern lake and coastal models do include interactions at the sediment interface. There are essentially three approaches to this: 1) Represent the benthos as a boundary condition, applying specified, calibrated phosphorus fluxes that may either be fixed as in CoastMab (Hakanson and Bryhn, 2008) and PROTECH (Elliott et al., 2000) or vary as a function of temperature, dissolved oxygen and DIP in the bottom layer of water (Allen and Clarke, 2007;Robson et al., 2008), as in simpler implementations of CAEDYM (e.g. Robson and Hamilton, 2004). 2) Duplicate water-column components and processes in one or more sediment layers (often a surface aerobic layer and a bottom anaerobic layer), specifying different parameter values to control the rates of remineralisation or hydrolysis in aerobic and anaerobic layers and either omitting phytoplankton or replacing them with benthic microalgae/microphytobenthos in the sediment layer. Examples of models that take this approach include EMS (Wild-Allen et al., 2011), CE-QUAL-W2 (Cole and Wells, 2008) and recent implementations of ERSEM (Allen and Clarke, 2007). Models that include rooted plants also often include uptake of phosphorus from sediments, though some models simply assume that macrophytes (especially seagrasses) are not nutrient limited, and omit them from the simulated phosphorus cycle (de Boer, 2007). 3) Incorporate a complete sediment diagenesis model, simulating redox conditions as a function of concentrations of relevant ions (e.g. Fe 2þ , Fe 3þ , Ca 2þ , S 2À , SO 4 2À , H þ ), and including representations of processes such as precipitation and immobilisation of apatite phosphates. Only a very few models take this approach e the more complex implementations of CAEDYM (Hipsey and Busch, 2012) and the model presented by Komatsu et al. (2006) being key examples.
Approach (1) has the advantage of simplicity and does not require detailed characterisation of the benthos. Its great disadvantage is that the resulting parameterisation will be highly sitespecific and the model will not be able to simulate long-term changes that alter the state or character of the sediments. If sediment fluxes are not observationally measured, there is also a danger that the sediment fluxes may become a "fudge factor" in calibrating such models, as they have a broad range of reasonable values and can have a strong influence on predicted water-column concentrations.
Approach (2) avoids these problems and allows conservation of mass within the system as a whole, at a cost of an increase in the number of parameters included and the initialisation requirements of the model.
Approach (3) is strongly driven by a geochemical conceptualisation of the system and allows a stronger representation of abiotic chemical processes that bind inorganic phosphorus in sediments, but results in a very complex, computationally expensive and data-hungry model, as it requires a raft of additional components to be included in the model, both within the watercolumn and multiple sediment layers (which may be highly spatially heterogeneous). This approach may be indicated when modelling systems with iron-rich sediments, in which coprecipitation of phosphorus may be an important removal mechanism (Baldwin et al., 2002). Calcite co-precipitation may be particularly important in hard water systems, rich in calcium, especially when phytoplankton or sediment biofilm blooms elevate pH (Baldwin, 2013). Iron precipitation is mediated by bacterial processes (Baldwin, 2013), though this interaction adds another layer of complexity and is rarely if ever explicitly simulated.

Physical framework
Aquatic biogeochemical and ecological models are typically embedded within hydrologic or hydrodynamic models. These range from simple box models of lakes (Imboden, 1974), through multiple box models of seas (Baretta-Bekker et al., 1997) and flowrouting network models of rivers (Tolson and Shoemaker, 2007), through one dimensional, vertically resolved models of lakes (Hamilton and Schladow, 1997), one dimensional, longitudinally resolved models of rivers (which may be either simple hydrologic models or physics-based hydraulic models) (Jia et al., 2010), twodimensional, spatially resolved models of catchments or aquatic systems (sometimes with additional layers to represent groundwater storages) (Cole and Wells, 2008) and two-dimensional, vertically resolved river and estuary models, to full threedimensional, baroclinic hydrodynamic modelling systems (Cerco, 2000).
Greater resolution and dimensionality is becoming increasingly common as computing resources improve (Fig. 2), but threedimensional modelling should not always be the default. Three dimensions are sometimes necessary to simulate important physical dynamics, but implementing a three-dimensional model of reasonable resolution is computationally expensive, which limits the options available for parameter estimation, sensitivity analysis and uncertainty analysis, while also making verification less straightforward and reducing the number of scenario runs that can be conducted.
Physical processes sometimes dominate biogeochemical and ecological responses and often overwhelm any other considerations in an aquatic system model (Robson, 2014). The choice of physical framework is therefore important in phosphorus modelling, but is beyond the scope of the present paper, though a few examples are given in Table 2. Blocken and Gualtieri (2012) offer guidelines on computational fluid dynamic model development procedures, while Hodges (2013) discusses the transition from hydrological to hydraulic modelling of river systems.
Coupling of hydrodynamic with biogeochemical models presents some challenges. In a fully coupled model system (e.g. Skerratt et al., 2013), the hydrodynamic model directly handles Fig. 2. Number of spatial dimensions in 96 lake and marine models, grouped by year of publication. Models with fewer than 12 boxes arranged two-dimensionally or fewer than 5 boxes arranged as vertical layers are counted as zero-dimensional for the purpose of this count. transport and diffusion of phosphorus and other biogeochemical species, on the same time-step and spatial grid as used for temperature and salinity. This usually ensures mass conservation and numerical stability of transport processes, and enables feedback between biogeochemical and hydrodynamic processes, such as the effect of particulate and dissolved substances on light and heat absorption. This approach is also very computationally expensive, as it dramatically increases the number of tracers to be advected by the hydrodynamic model and requires biogeochemistry to be handled on a time-step that is often much shorter than the timestep of important biogeochemical processes. An additional disadvantage is that animals (including zooplankton and sometimes also fish) are represented as "concentrations" of phosphorus in fullycoupled models, which limits the approaches that can be used to simulate swimming behaviours and population dynamics.
Decoupling transport processes from kinetic processes to allow longer time-steps and faster integration schemes for biogeochemical and ecological functions run on the same spatial grid as the hydrodynamic model allows considerably increased computational efficiency at some cost to numerical accuracy (e.g. Lazzari et al., 2010;Park and Kuo, 1996). Butsenschon et al. (2012) discuss issues to be considered in time-splitting coupled models.
If biogeochemistry and associated transport is run "offline", after completion of the hydrodynamic model run, much faster biogeochemical model runs are possible (e.g. Cerco and Noel, 2013). It also makes it relatively easy to swap in different hydrodynamic models, increasing the flexibility of the system. This approach requires substantially more storage space, as transport variables must be saved across the whole domain at every time-step, and precludes simulation of possible effects of biogeochemistry on hydrodynamics.
Finally, hydrodynamic and biogeochemical models can be fully decoupled, using either a particle tracking or mass balance approach to calculate transport fluxes from outputs of the hydrodynamic model. This is again potentially much faster, as it allows the biogeochemistry to be calculated on both a longer time-step and a different spatial structure from the underlying hydrodynamic model. This may be a different grid or mesh (e.g. Larsen et al., 2013), or even a Lagrangian modelling scheme (e.g. Paster et al., 2013). Accuracy of transport is likely to be reduced, but this approach facilitates long simulations and agent-based modelling of ecosystems.

How should we choose between alternative functional forms?
Although many models share similar overall structures in terms of which components and processes are included, there is great diversity in how these processes are represented. Tian (2006), making a call for standardisation, reviewed the variety of functions used in marine plankton models, identifying thirteen different functions that have been used to represent the impact of light on phytoplankton growth (differing mostly in the sharpness of the decline e if any e exhibited at high irradiance due to photoinhibition), five functions for nutrient limitation of phytoplankton growth (all variants on either MichaeliseMenten kinetics introduced by Dugdale (1967) or Droop kinetics (Droop, 1973), which allow variable intracellular nutrient concentrations), ten functions for the effect of temperature on process rates (some allowing inhibition at high temperatures, others, not), eight for mortality of zooplankton, six for respiration and excretion, two ways of combining light, temperature and nutrient limitation of phytoplankton, and a total of more than 35 different ways of representing zooplankton grazing. For details of these various alternative functions and one perspective on which should be preferred, refer to that paper. Note that Tian's (2006) extensive list of alternative formulations was not exhaustive, even at the time of publication, and proliferation of variants has continued since that time. Tian (2006) observes that in some cases, one functional form is equivalent to another, given certain parameter values. In these cases, the more flexible form might be preferred if it is believed to be valid over a wider range of conditions, using fixed parameter values to simplify the model where this is appropriate.
Models are sensitive to the details of the functional forms chosen to represent processes, but in many cases, the selection of one functional form over another is a matter of tradition or convenience rather than analysis. There are three types of analysis to which these functions are amenable: 1) comparison of the performance of models using different functional representations in specific model applications; 2) evaluation of alternative model structures in a theoretical framework, testing whether their behaviour is in reasonable accordance with ecological understanding in all circumstances; and 3) close examination of the physical, chemical and physiological processes underlying the relationship summarised by a particular model process and the way these more fundamental processes interact over the range of conditions likely to occur in the modelled sub-system. Flynn (2010) uses all three approaches to demonstrate the deficiencies of the (still dominant) MichaeliseMenten (a.k.a. Redfield-Monod) formulation for phytoplankton nutrient uptake. He (approach 1) demonstrates that a MichaeliseMenten model is not able to accurately simulate nutrient dynamics in a phytoplankton batch culture, whereas a variable stoichiometry model can; (approach 2) demonstrates that a MichaeliseMenten model calibrated to produce the same results as a variable stoichiometry model under steady-state conditions will give very different results in time-varying, nutrient-limiting conditions, and (approach 3) demonstrates that intracellular nutrient stoichiometry is known to be variable in the real world, especially with respect to C:P under the nutrient-limited conditions that these models are designed to simulate. Table 2 Examples of physical dynamics likely to be of primary importance in modelling of phosphorus cycles in aquatic systems.

Type of system Important physical dynamic Minimally sufficient physical model Example
Deep lake or reservoir Timing and duration of thermal stratification and mixing Vertically-resolved one-dimensional baroclinic hydrodynamic model (e.g. DYRESM) (Trolle et al., 2008) Shallow lake Wind-driven advection and resuspension Vertically averaged two-dimensional model (Ji, 2007) Salt-wedge estuary Salinity stratification and along-river movements of the salt wedge Two-dimensional baroclinic hydrodynamic model, resolved vertically and longitudinally (Kurup et al., 1998) Macrotidal estuary Resuspension and formation of a turbidity maximum zone Longitudinally resolved one-dimensional hydraulic model (can be barotropic) (Even et al., 2007) Offshore marine waters Depth of thermocline relative to depth of photic zone Vertically-resolved one-dimensional model (Kuhn and Radach, 1997) Continental shelf marine waters Upwelling Three-dimensional baroclinic hydrodynamic model (Zhurbas et al., 2008) Fiksen et al. (2013) recently reviewed MichaliseMenten nutrient uptake functions in the context of modelling of microbes, examining the physical and physiological basis of these functions, including the roles of cell size, molecular diffusion, enzymes and uptake sites, all of which are equally relevant in the context of phytoplankton modelling. They suggest a more accurate alternative formulation which requires a number of additional parameters (cell size and parameters controlling the rate of diffusion), but point out that some models already include these parameters for other purposes, in which case the adoption of the more complex formulation may not increase the overall complexity of the model.
Formulations for grazing of phytoplankton by zooplankton are considered by Mitra et al. (2007), who assess the behaviour of a range of alternative formulations in a range of theoretical situations as well as in a real application. They show that, in models in which prey quality (i.e. the stoichiometry of phytoplankton) influences zooplankton assimilation efficiency and ingestion rate, more material flows through the detrital compartment of the aquatic ecosystem. Because it makes an important difference to the ecological behaviour of the system, they again recommend that this more conceptually accurate (though more complex) approach should be preferred. Recent work has also shown the importance of fatty acids (PUFA and HUFA) to food quality. Perhar et al. (2012Perhar et al. ( , 2013b show that including HUFA in a dynamic ecosystem model alters predatoreprey interactions in a way that affects phytoplankton community composition, zooplankton population size, and overall system stability. uptake of DIP from the water column, uptake of DOP from the water column, uptake from sediments, mortality, lysis, and respiration. Animal processes include grazing of plants, release of phosphorus due to feeding inefficiencies, release for stoichiometric balance, grazing of detritus, mortality, respiration and predation. Abiotic factors include settling, resuspension, burial, dispersive fluxes from sediments, effects of temperature, effects of pH, effects of salinity, apatite precipitation, effects of iron concentrations, effects of sulphate concentrations and effects of calcium. Detrital processes include hydrolysis, remineralisation of detritus, remineralisation of DOP, uptake by bacteria, and release by bacteria. Numbers in parentheses indicate the number of model publications included in the analysis. The scale (0e8) represents the number of processes and components included, from Table 1. Predation formulations are also considered by Montagnes and Fenton (2012), who report evidence that prey abundance affects zooplankton assimilation efficiency, and demonstrate that including this in models alters simulated system dynamics, sometimes in counter-intuitive ways. Fulton et al. (2003) compared different formulations for grazing and mortality, finding that the functional form does matter, but that the use of more complex functional responses is not justified in all circumstances. They provide some specific advice regarding when more complex formulations may be needed in foodweb models. Mitra (2009) considered the treatment of zooplankton mortality, comparing the behaviour of models employing hyperbolic, quadratic or sigmoidal closure terms with that of a model in which zooplankton mortality is simulated through zooplankton eating zooplankton ("intra-guild predation"). The latter was found to predict a more realistic trophic transfer efficiency and, again, a greater role for the detrital loop, while presenting similar results for phytoplankton and zooplankton concentration time-series. The recommended model in this case was not significantly more complex than the simpler alternatives.
6. How do river models, lake models and marine models compare? Fig. 3 summarises the complexity and relative emphasis of models published in the past decade. The difference between marine and lake models on one hand and catchment/river models on the other hand is striking.
Lake and marine models are often complex (i.e. Fig. 1 c, d or e), and the formulation of lake models and marine models is very similar. If salinity is included, the models can be identical, though CAEDYM (Robson and Hamilton, 2004;Spillman et al., 2007;Trolle et al., 2011), LakeMab/CoastMab and the associated LakeWeb/ BaltWeb (Håkanson and Boulion, 2002;Hakanson and Eklund, 2007) seem to be unusual in having been applied to both marine and lake systems. These models are sometimes applied to the lower portion of rivers, including estuaries (Robson et al., 2008), but they are rarely extended further upstream.
Marine models are more frequently extended to include ecosystems (foodwebs and a variety of plants), while lake models more frequently include detailed sediment chemistry.
Catchmenteriver models are almost universally on the simpler side with respect to their treatment of in-stream processes (e.g. Fig. 1 a or b), and sometimes include only one water-column phosphorus store (TP) and one removal process (settling). Pure catchment models (not included in this systematic review) include no in-stream nutrient processes at all, assuming that landscape generation processes are dominant in determining river nutrient loads. This assumption is likely to be reasonable only when residence times in river reaches and weir pools are shorter than the time-scales of biogeochemical transformations or of in-channel physical processes such as bank erosion, sedimentation and bed scouring.
Wetland models (e.g. Paudel et al., 2010;Walker and Kadlec, 2011;Wang et al., 2010) are less common than lake or marine models. As an indication, a Web of Knowledge search for the topic "wetland*" occurring with "model" and "phosphorus" conducted on 8 Oct 2013 returned 866 matches, whereas "lake" occurring with "model" and "phosphorus" returned over 5500 matches and "river" occurred with "model" and "phosphorus" in 2208 indexed papers. Wetland ecosystems and ecosystem services have historically been under-valued, though this is now changing (Woodward and Wui, 2001). This may be a reason for a historical paucity of wetland models, in which case, we should expect to see an increase in demand for wetland modelling in the near future.
Wetland models are variable in design and tend to be custombuilt for specific applications. Not surprisingly, they often emphasise processes relating to emergent macrophytes and (in the case of coastal wetlands) intertidal vegetation, which may occur over longer time-scales than processes relating to pelagic primary production in lakes and marine environments. Wetland models are typically run over time-scales of several years to several decades. Work on emergent vegetation processes in wetland models may be a useful reference for further development of river models, where these processes are also important.
A selection of other phosphorus models is included for purposes of comparison. Sediment diagenesis models (e.g. Tromp et al., 1995;Vancappellen and Berner, 1988) include detailed representation of redox chemistry, usually simulating a range of ions (e.g. O 2À , S 2À , SO 4 2À , Ca 2þ , Fe 2þ and Fe 3þ ) in many sediment layers in order to calculate redox state profiles to determine processes such as phosphorus precipitation and immobilisation. These models are very computationally intensive and usually implemented for idealised systems rather than spatially heterogeneous applications in the real world. Sediment diagenesis models have occasionally been incorporated into lake or marine models for particular applications (e.g. Hipsey et al., 2011), but most subsequent applications of these models do not employ this capability.
Looking at these data in another way, we might divide the models into three broad groups: water quality models (those including fewer than 3 biogeochemical or ecological processes), biogeochemical models (those including 3 or more biogeochemical processes from Table 1 but fewer than three ecological processes or components), and ecological models (those including 3 or more ecological processes or components from among: multiple phytoplankton groups, multiple zooplankton groups, benthic invertebrates, other animals, macroalgae, microphytobenthos, periphyton, submerged macrophytes, macrophyte roots or rhizomes, and bioturbation). The mean number of processes and components from Table 1 included in ecological models was 11.3, in biogeochemical models, 6.5, and in water quality models, 1.8. Considering post-2003 publications only, 88% of marine models and 91% of lake models described are best classified as ecological models, while none of the river models described fall into this category: 74% of river models fall into the "biogeochemical model" category, with the remainder being simple water quality models. This reflects differing priorities in river modelling, lake and marine models. Rivers are often conceptualised as delivery mechanisms of nutrients to important aquatic ecosystems (e.g. Correll, 1998;Seitzinger et al., 2005) rather than being considered as aquatic ecosystems in their own right. When ecological models are built for rivers, they often do not consider phosphorus at all, being defined instead in terms of flow requirements of key species (e.g. Chan et al., 2012;Escobar-Arias and Pasternack, 2011). Hydroecological and environmental flow models of this type do not fall within the domain of this review.

How are models changing over time?
In the preceding sections, I have described repeated calls for improvements in the physiological realism of models. In many cases, these calls have been backed up by work demonstrating that less conceptually accurate models of particular processes produce results that are demonstrably incorrect, especially when it comes to simulation of higher-level emergent properties and trophodynamics of ecosystems.
Unfortunately, greater physiological realism is almost always associated with increased model complexity. Fig. 4 shows how the complexity of models included in our systematic review has changed over time (counting all phosphorus processes included in Table 1, but not phosphorus stores). Over time, complexity in all sub-disciplines is increasing, as is the variability in complexity among models.
This comes at a cost of reduced computational efficiency and reduced identifiability (in the sense of Beck, 1987). More complex models usually include more parameters that require calibration: Denman (2003), for instance, found that the number of parameters in plankton models increased approximately as the square of the number of components simulated. In turn, data requirements for robust parameter estimation increase approximately in proportion to the square of the number of parameters. Models that are too complex relative to data support: May support multiple parameter sets that produce equally good calibration results, but make different predictions; May have greater parameter uncertainty than simpler models which are better constrained by the observational data; and May not be suitable for hypothesis testing, as it can be difficult to identify the point of conceptual failure when the model fails to correctly predict system responses.
These issues have been discussed in detail by Snowling and Kramer (2001).
In some cases, these increases in model complexity are a direct response to calls for increased realism. As noted by Flynn (2005), however, this is not always the case, as it is also common for new components to be added to models e for example, by coupling plankton models with foodweb models e without pausing to examine how existing components can be made to better reflect a biological understanding of the system. It is rarely obvious what level of complexity is appropriate. This issue is discussed further in Section 10.
8. How successful are our models? Arhonditsis and Brett (2004) conducted a systematic review of 153 aquatic biogeochemical models published between 1990 and 2002. Only 30% of these publications included any quantitative performance metrics, and less than 6% objectively quantified performance with all appropriate measures (r 2 and relative error (RE) of the match between observations and model output for key state variables). It appears that little has changed in the intervening decade. Table 3 summarises reporting statistics for the more recent papers analysed in the present review. Marine modelling papers are still hovering around 30% reporting any statistics of performance and it is very rare for such statistics to be reported for all relevant phosphorus stores. Lake modelling papers are little better in this regard. Interestingly, about half of the lake modelling papers that did report performance metrics were implementations of CAEDYM, suggesting that the CAEDYM modelling community is now aware of the issue.
A range of other performance metrics that may be suitable for assessment of aquatic ecological models are reviewed in the special issue of Journal of Marine Systems led by Lynch et al. (2009), and more recently in Environmental Modelling and Software by Bennett et al. (2013).
When lake and marine modellers do report performance statistics, these are not always for independent validation data sets: calibration and verification were conflated in about half the papers assessed for this analysis. It is rare for statistics to be reported for more than a selected few of the variables (often chlorophyll a and total phosphorus) simulated by the model.
The situation is quite different in the river modelling community, with a clear majority of papers reporting performance statistics for separate (though not always independent) calibration and verification datasets. In many cases, Nash-Sutcliffe efficiency, r 2 and relative error are reported for flow, and only a subset of these is reported for phosphorus.
None of the papers reviewed reported sufficient flux data (e.g. phosphorus sedimentation rates or rates of release from benthic sediments) to allow evaluation of model performance with regard to calculated flux rates.
Though the metrics included in publications were inadequate, Arhonditsis and Brett (2004) were in many cases able to calculate some basic performance metrics from the information presented in time-series plots or tables. Statistics for phosphate, phytoplankton and zooplankton are relevant here, as each of these is also a phosphorus store (Fig. 1). Arhonditsis and Brett (2004) provide metrics for each decile of model performance, along with several other analyses. A few key results are reproduced in Table 4.
From these figures, it is difficult to support a claim that mechanistic models have (or had, by 2002) any capacity to predict observations of bacteria, and the same may be true of zooplankton. If these components are so poorly simulated, but phosphate (which is released by bacteria) and phytoplankton (which is grazed by zooplankton) are better simulated, it seems likely that the system is not well represented, but that the models are either overparameterised or contain several errors which to some degree cancel out in predicting phytoplankton and phosphate concentrations.
I have not attempted in the current review to replicate the massive undertaking of digitising time-series plots in order to calculate consistent metrics for papers published since 2003, however a qualitative examination of the metrics that have been provided in some of these papers does not suggest that there has been any significant improvement in predictive performance with respect to phosphorus. Nor does it suggest that catchmenteriver models generally perform better or worse than lake models, though this is difficult to confirm, since catchment modellers often report phosphorus results only in terms of monthly or annual loads, while lake and marine modellers tend to present continuous time-series model comparisons with point-in-time observations.
A complicating factor is the uncertainty inherent in the observational data, which may increase as we move from top to bottom of Table 4. While most performance metrics treat observational data points as an objective standard of truth, measurements also embody considerable uncertainty and error, including analytical error, sampling error, and representational errors such as the difference between actual phytoplankton phosphorus stores and phytoplankton phosphorus calculated using an assumed constant relationship with measured chlorophyll a. Refsgaard et al. (2007) discuss sources of uncertainty in environmental modelling and include data uncertainty as one source. The question of how best to evaluate model performance in the presence of variable (yet potentially calculable) data uncertainty remains open. The GLUE methodology (Beven and Binley, 1992) is one attempt to address this. The effectiveness of this approach has been the subject of some discussion (Beven et al., 2007;Christensen, 2004;Mantovan and Todini, 2006;Mantovan et al., 2007) and in recent years, the formal Bayesian MCMC methodology, DREAM (Beven, 2009;Vrugt et al., 2008Vrugt et al., , 2009a has gained ground in the hydrological literature. Both approaches are computationally expensive, which may limit their application in association with complex models incorporating computational fluid dynamics. Other Bayesian approaches are the subject of active development.

What explains the disparity between river models and models of other aquatic systems?
I have demonstrated that the treatment of phosphorus in catchment and river models tends to be much simpler than in lake and marine models and that the catchment modelling community is more rigorous in sensitivity analysis and performance evaluation. What explains these differences? The argument that biogeochemical processes are less important in rivers than in other aquatic systems, and that it is therefore appropriate to leave these processes out of river models, is unconvincing. I will speculate instead that several cultural factors have influenced the current situation: River modelling is often conducted within the context of a tight regulatory framework (e.g. Duncan, 2006), which encourages standardisation and discourages innovation and ad hoc development of new models and model variants. Lake and marine water quality models tend to be coupled with hydrodynamic models, while river models usually use a simple hydrological or flow-routing model framework. As a result, lake and marine models are more complex and computationally intensive to begin with, so this modelling community has a greater tolerance for, and expectation of, complexity. Hydrological journals have, for a long time now, focused on rigour in model calibration and assessment (e.g. Beven, 1989;Refsgaard, 1997), which encourages the use of simple models with mathematically defensible, low numbers of parameters, as well as the use of standardised models with built-in calibration and performance characterisation tools. By contrast (Arhonditsis et al., 2006), rigour in calibration and assessment appears to have played no significant role in the publication and subsequent citation of marine models e though this may now be starting to change. It is now Environmental Modelling & Software policy, for example, to expect at least a sensitivity analysis to have been conducted, and this journal has recently published advice regarding best practise in development methodology for aquatic models (Blocken and Gualtieri, 2012;Robson et al., 2008) as well as more general advice regarding performance characterisation (Bennett et al., 2013). Limnological and oceanographic journals, by contrast, have put more emphasis on rigour in the biogeochemical conceptualisation of models and novelty in scientific application. It is difficult to publish results of straightforward applications of an existing model to new systems in highly-ranked limnological and oceanographic journals, while this appears to be comparatively common in even highly regarded hydrological journals. Again, this tends to encourage novelty, diversity and creeping complexity in lake and marine models. It is possible (though difficult to confirm) that the scientific training of many of those in the lake and marine modelling communities may be different from that of those in hydrological modelling communities. Lake and marine modelling communities may include a greater proportion of biologists and oceanographers and a lower proportion of civil, environmental or software engineers, which could contribute to differences in approach.
The greater standardisation of models and assessment methodologies for rivers has encouraged the development and wide application of utilities like SWAT-CUP (Abbaspour et al., 2007), which automates calibration, sensitivity and uncertainty analysis of SWAT applications. The development of similar utilities for lake and marine models is not a trivial task, as these models are typically more computationally intensive (which limits the application of analytical techniques that require many model runs) and often have many parameters (which increases the potential for equifinality (Beven and Binley, 1992)). Lake and marine modellers can borrow appropriate approaches from SWAT-CUP while also looking to innovative approaches such as Bayesian hierarchical modelling (Arhonditsis et al., 2008;Parslow et al., 2013) and Bayesian melding (Chiu and Gould, 2010) to help overcome these challenges.

How complex should models be?
It is clear that aquatic biogeochemical models are becoming more complex, but there is no clear agreement regarding whether this is appropriate.
In one camp are those who argue in favour of complexity and physiological realism. Flynn (2005) states that "modellers should not omit representations of biological behaviour unless it is demonstrated (empirically and/or mathematically) that it is safe to do so." Allen and Polimene (2011) call for a new generation of more complex, physiologically detailed models, arguing that simple models which operate at a high level of abstraction will never be able to simulate chaotic emergent ecosystem properties or accurately predict unexpected ecosystem responses that could not be anticipated simply by examining the component parts of the model.
In the opposing camp are those who argue for simplicity. Paudel and Jawitz (2012) present such an excellent discussion that it is worth quoting directly. "Complex biogeochemical models usually have fewer restricting assumptions and exhibit more flexibility; however, increasing the level of complexity in the model leads to an increased sensitivity of the output to the input (Lindenschmidt, 2006b;Snowling and Kramer, 2001). This is primarily because large uncertainties may arise due to the increased number of interactions between state variables and unconstrained parameters (Robson et al., 2008). Incorporating comprehensive representations of biogeochemical processes and their effects into models also entails practical limitations. For example, mechanistic biogeochemical models require large amounts of data (Robson et al., 2008), which may be relatively scarce. If the model is not constrained by the available field data, the cost associated with fitting noise could lead to diminished performance (Friedrichs et al., 2006). Complex models also need huge human and computer resources (Jorgensen et al., 2002); therefore, increasing the level of complexity by incorporating more state variables and processes may not be cost effective, because the majority of the modelling resources may then be devoted to developing and maintaining the model, rather than its application (Fulton et al., 2003). In addition, the computational cost of adding more detail may effectively inhibit the utility of the model. As a consequence, there is a conflict between the desire to constrain the model complexity and to incorporate more processes mechanistically." This tension between physiological realism and model parsimony can only be resolved by assessing the actual performance of models with different structures.
Because metrics are not routinely published in lake and marine modelling and because there has been no agreement regarding which metrics should be used, meta-analyses comparing models of different types or degrees of complexity are difficult to conduct. Even in catchment and river modelling of phosphorus, there have been few studies that have attempted to assess the impact of varying model complexity.
Those studies that have been undertaken have generally been limited to comparisons of two or three formulations of a model as applied either in a very specific applied context or in an abstract, mathematical sense. Some examples follow: Migliaccio et al. (2007) evaluated whether coupling the catchmenteriver model SWAT with a more detailed in-stream water quality model (QUAL2E) improved prediction of nutrient loads in a north American catchment, and found that it did not. Kim et al. (2013) compared two models of differing complexity to simulate phosphorus dynamics in a north American bay, and were able to achieve better performance using the more complex model. Table 4 50th percentile r 2 and relative error (RE) of 153 marine models assessed by Arhonditsis and Brett (2004 Min et al. (2011) compared two mechanistic phosphorus cycle models applied to the Florida Everglades, USA. In this case, the simpler model was just as effective as the more complex model in predicting variations in total phosphorus, but the authors found that the more complex model was useful in that it provided greater insight into sources of error and the role of soluble reactive phosphorus. Baird et al. (2007Baird et al. ( , 2003 compared an estuarine eutrophication model using mechanistic versus empirically determined process representations for algal growth and zooplankton grazing in application to an Australian bay, and found the results generally similar. Fulton et al. (2004) compared a simple marine ecosystem model with a more complex model in application to the same Australian bay. The simpler model produced equally accurate results in most circumstances, but the complex model performed better in times of extreme change and also appeared to do better in simulating the behaviour of infauna, for which limited field data were available. Lindenschmidt (2006a) tested five different dissolved oxygen submodels in WASP to test the hypothesis that increasing model complexity resulted in a better fit to observational data but also increased model sensitivity to changes in parameter values. The results supported the hypothesis. Crout et al. (2009) proposed a method to evaluate the appropriate level of model complexity by establishing a family of models, iteratively replacing one or more variables with constants and comparing model performance in a Bayesian framework. This approach may be useful for some aspects of biogeochemical modelling, but it would need to be used with care to ensure that it does not "break" the model with respect to conservation of mass or capacity to simulate feedback effects likely to be important outside the test conditions. Baird and Suthers (2010) and Baird (2010) found that the structural complexity of a size-resolved pelagic ecosystem model affected its results. Results varied as the number of size classes increased from 1 to 123, but did not change beyond this resolution in the test conditions. Models with more size classes or a greater number of predatoreprey interactions were found to be less sensitive to errors in initial conditions. Paudel and Jawitz (2012) applied a suite of biogeochemical models of varying complexity to a storm-water treatment wetland. They found that the most complex of the six models tested provided the most accurate fit to the data, but with markedly diminishing returns as complexity increased. Heinle and Slawig (2013) explored the dynamic behaviour of three differently formulated NPZD (nutrient-phytoplanktonzooplankton-detritus) models, finding that the limit cycles exhibited by the simplest model disappeared when more complexity was added. Perhar et al. (2013a) compared a model that represented the effects of food quality (phytoplankton stoichiometry and highly unsaturated fatty acids (HUFA) concentrations) on ingestion efficiency and zooplankton growth with an earlier model that did not consider food quality. The more complex model provided a better fit to the observational data, though this might simply reflect greater freedom in fitting due to an increased number of parameters. The more complex model was also found to be more dynamically stable.
The question is far from resolved, in part because it has sometimes been poorly posed. It is not simply a question of how much complexity is needed, but of which components of a model require greater or lesser complexity in what circumstances, and for what types of application. To answer this will require either a much greater investment in testing a range of models in a range of theoretical (ala Baird and Suthers, 2010) or real-world applications (perhaps "modelling everything everywhere", as intriguingly proposed by Beven and Alcock, 2012), or a greater investment in epidemiology-style meta-analyses, which in turn will require much more consistent reporting of performance metrics by the modelling community.
To assess the capacity of models to anticipate significant change, including regime changes between alternative stable states (Scheffer et al., 2001), modellers should also endeavour to validate models under conditions that are substantially different from calibration conditions. While not all applications require this capacity, in some cases, it is very important. The potential to simulate hysteresis and predict regime shifts has been put forward as an argument in favour of mechanistic over statistical models. While this capacity has been demonstrated in theoretical ecology (Genkai-Kato, 2007;Scheffer et al., 1993) and speculative scenarios (e.g. Webster and Harris, 2004;Zhang et al., 2003), there are only a few examples (e.g. Janse, 1997;Wang et al., 2012) of such models being compared with observations of hysteresis or state change in real systems, and fewer (if any) of the validated use of models to predict hysteresis in a real system in advance of it actually being observed.
In the meantime, I suggest that we do not need a "one size fits all" model that includes every process for every application, but we do need a toolkit of well-considered submodels for different components of the phosphorus cycle, so that we can pick and choose appropriate submodels to include in each application, selecting those most important to the problem at hand and supportable by measurements, perhaps choosing simpler representations when it is essential that uncertainty is well defined and parameter values are defensible, and choosing more complex (but physiologically determined) representations when it is desirable to predict unexpected and possibly chaotic ecosystem responses. This may be best achieved through the development of the flexible, open-source community models advocated by Mooij et al. (2010) and Trolle et al. (2012). This modelling toolkit should extend beyond lake models to also encompass landscape, catchment, estuary and marine applications.
Whenever one formulation has been demonstrated to be both less physiologically realistic and no less complex than an alternative formulation (e.g. Mitra, 2009), the less accurate formulation should immediately be abandoned.

How does what is modelled relate to what is measured?
In routine monitoring programmes, water-column phosphorus measurements may be taken at any of a number of degrees of detail. It is not my intention to delve into the intricacies of sampling and analysis techniques here, but rather to introduce the commonly reported measurements as they are seen and treated by modellers.
1) Total phosphorus (TP) only. This is common in poorly resourced monitoring programmes as it is the cheapest way to measure phosphorus, requiring the simplest sampling protocol. Modellers must remember that TP includes not only dissolved and non-living particulate material, but also phosphorus associated with bacteria, phytoplankton (and, depending on sampling procedures, some zooplankton) in the water column. In interpreting field data, modellers typically estimate the store of phosphorus associated with phytoplankton by assuming a fixed ratio between intracellular phosphorus stores and chlorophyll, though this assumption is very problematic (Klausmeier et al., 2004(Klausmeier et al., , 2008.
Models that consider only TP, or that divide TP only into algal and non-algal phosphorus, are likely to be highly tuned to specific systems and are not likely to be capable of predicting responses to substantial changes in environmental forcing. The chemical form in which phosphorus is present depends on its sources and transport mechanisms, and can have a dramatic impact on its bioavailability and reactivity.
The residence time of the system under consideration also makes a difference here. Relatively refractory phosphorus in the form of terrestrial organic matter may be of little significance in a stream with a residence time of a few hours or a few days, but may become important when delivered to a reservoir in which it can be broken down over the course of months or years.
This provides some basic additional information which can be used in models. Where these are the only observational data available, a common assumption is that all dissolved phosphorus is labile and available for immediate biological uptake, while all particulate phosphorus is associated with either phytoplankton or organic detritus. The accuracy of these assumptions varies considerably between systems. 3) TP, TDP plus total organic phosphorus. In this case, a common modelling assumption is that all non-algal organic phosphorus is particulate and detrital, while all dissolved phosphorus is inorganic. 4) Any or all of: TP, filterable reactive phosphorus (FRP, also known as soluble reactive phosphorus, SRP) or phosphate (H x PO 4 (3Àx)À ), particulate organic phosphorus (POP, which includes phosphorus in phytoplankton) and dissolved organic phosphorus (DOP).
In model representations, FRP, phosphate and DIP (dissolved inorganic phosphorus) are interchangeable. Chemically, they are not identical, as the analytical methods used to determine FRP include not only phosphate, but also the readily hydrolysed fraction of dissolved organic phosphorus (DOP) (Baldwin, 1998) including simple nucleotides such as ATP and ATM (Baldwin, 2013). If we assume that the chemically hydrolysable fraction of DOP corresponds exactly to the highly biologically available fraction of DOP, then the distinction between DIP and measured FRP is a fine one, unlikely to affect the accuracy of models. This assumption, however, has not to my knowledge been demonstrated to be true.
In routine monitoring, it is rare for organic phosphorus to be reported with any more detailed level of chemical description than POP and DOP, and it is also uncommon for particulate inorganic phosphorus (chiefly, phosphate adsorbed onto mineral particle surfaces) to be separately reported. The phosphorus content of benthic sediment stores, if measured at all, is measured on a one-off basis rather than routinely.
This pattern of measurement is reflected in the treatment of phosphorus in models. The simplest water quality models consider only living and non-living water-column TP. The most complex divide phosphorus into refractory and labile POP and DOP pools, dissolved inorganic phosphorus, labile and immobile adsorbed inorganic phosphorus, and living phosphorus pools such as algal phosphorus stores, in both the water column and benthic sediments.
The frequency and spatial resolution of measurements can also be an issue. While aquatic biogeochemical models may have a resolution of seconds, hours or days (designed to match the timescale of processes believed to be important or imposed by the stability requirements of coupled hydrodynamic models), many monitoring programs have sampling intervals of months or more and sample at only a few sites within an area that may span hundreds of kilometres.
There is a fundamental dilemma here. If our models represent phosphorus cycles in more detail than our measurements, can they be adequately constrained? If mechanistic models represent phosphorus too simplistically given our understanding of the biogeochemical mechanisms involved, can they have sufficient predictive power? Should monitoring programmes drive model design, or should it be the other way around? These questions require ongoing interaction between modellers and observational scientists (Min et al., 2011).

What is not modelled? Challenges and gaps
Today's aquatic biogeochemical models have their roots in management and modelling of harmful algal blooms. These models were designed with mesotrophic to eutrophic, lentic waters in mind. What processes are missing from our current generation of models that might be important in the other contexts in which these models are now being applied?

Transfer-limited nutrient uptake
Uptake of phosphorus by plants is controlled by: the rate at which phosphorus can be incorporated into the plant's cellular infrastructure; the rate at which phosphorus passes through receptor sites on the cell surface (and the density of such sites); and the rate at which phosphorus moves from the bulk water to the surface of plant cells. For phytoplankton, the rate of transfer to the cell surface is controlled primarily by the concentration of available phosphorus in the water column, as the diffusive boundary layer thickness around the planktonic cell can be considered constant. For benthic plants and corals, however, this boundary layer thickness varies as a function of water velocity, leaf stiffness and morphometry and surface roughness.
In oligotrophic (low nutrient) environments, the rate of transfer of dissolved phosphorus across the diffusive boundary layer is likely to be the primary control on the rate at which plants are able to take up phosphorus from the water column. Hearn et al. (2001) present a formulation for modelling of flux-limited nutrient uptake in coral reefs, Stevens et al. (2003) and Fram et al. (2008) do the same for giant kelp, while Townsend and Padovan (2009) and Robson (2010) consider flux-limited nutrient uptake in modelling benthic macroalgae in a relatively pristine tropical river. These formulations, however, have not yet been widely adopted in complete biogeochemical or ecological modelling packages. Though there are some exceptions in marine modelling, there appears to be little awareness of this issue in the freshwater modelling communities.

Flow through sediments
Flow through sediments due to groundwater-surface water interactions (Jones and Holmes, 1996;Vanliere and Mur, 1982), flow through sand ripples (Rocha, 2008), tidal pressure gradients (Werner et al., 2006) and wave action (Cardenas et al., 2008) greatly increases interactions between the water column and sediment pore-water and may enhance microbial activity and rates of many biogeochemical reactions by providing dynamic redox conditions and exposure to reactive particle surfaces. Rocha (2008) presents a discussion of sandy sediments as biogeochemical reactors, but the quantitative importance of this across a diverse range of aquatic environments is not yet well understood. These interactions are not yet generally considered in biogeochemical models. In intertidal environments, shallow streams, and aquatic environments in which reaction rates are otherwise low, this may be an important oversight.

Soil and groundwater biogeochemistry
Though many catchment hydrology models include groundwater stores, and models such as SWAT (Vadas and White, 2010) do consider transport of dissolved phosphorus through groundwater, most models do not go further than this. Hydrolysis of dissolved organic phosphorus to phosphate, removal of phosphorus through adsorption to soil surfaces (Notodarmojo et al., 1991) or uptake by riparian vegetation (Boar et al., 1995;Kuusemets et al., 2001), and long time-lags between addition or removal of sources of groundwater nutrient pollution and subsequent delivery to aquatic systems (Spiteri et al., 2007) are rarely considered. Recent work (de Vries et al., 2013) has shown that soil biota are a better predictor than land use of carbon and nitrogen loss across a large geographical area. The same is almost certainly true of phosphorus.
Groundwater phosphorus fluxes are an important component of the overall phosphorus budget of many aquatic systems, both lotic and lentic (Hayashi and Yanagi, 2009;Holman et al., 2008), so the behaviour of phosphorus in groundwater might warrant closer attention in future integrated catchmenteriver or catchmentelake models. Mechanistic simulation of soil and groundwater biogeochemistry requires coupled reactive transport modelling capable of simulating depth-dependent variations in redox conditions and soil properties across soil-water interfaces.

Effects of sediment drying and rewetting
Sediments in many aquatic environments are subject to wetting and drying on tidal timescales in coastal environments, seasonal timescales in seasonal rivers and wetlands, and for years at a time on many floodplains. Drying and subsequent rewetting can have complex biogeochemical implications as sediment surfaces switch between anaerobic and aerobic respiration and bacterial communities change . Biogeochemical processing continues when soil is dry, and can result in the release of large quantities of phosphorus immediately after rewetting. This depends on the concentrations and ratios of sulfides, oxidised iron and phosphate in the sediments Loeb et al., 2008).
At present, even models that simulate the physics of wetting and drying (Bruce et al., 2006;Wild-Allen et al., 2010) do not simulate biogeochemical processes in dry cells. This is something that probably needs to be considered if we are to accurately simulate phosphorus cycles of floodplains or seasonal wetlands.

Biochemistry of dissolved organic matter
Biogeochemical understanding of organic phosphorus in aquatic environments is incomplete, but has been developing rapidly in recent years. Our models do not yet reflect this growing understanding. Baldwin (2013) has recently reviewed the biogeochemistry and ecological significance of organic phosphorus in some detail, and the following discussion is informed primarily by this review.
Most modern aquatic biogeochemical models represent organic phosphorus in some form. Many (e.g. Chavan and Dennett, 2008;Eilola et al., 2009;Reed et al., 2011) do not distinguish between particulate and dissolved forms of organic phosphorus, and simply represent all organic material as "detritus". Some assume a constant C:N:P ratio for this detritus, requiring any excess phosphorus to be immediately available as DIP to maintain a phosphorus mass balance.
More complex models track organic nitrogen, carbon and phosphorus separately to allow variable stoichiometry, and most divide organic phosphorus into separate particulate and dissolved pools. Sometimes the particulate pool (Cerco, 2000;Cole and Wells, 2008;Janse, 1997;Wild-Allen et al., 2010) and more rarely both dissolved and particulate organic phosphorus (Garnier et al., 2005;Hipsey et al., 2004) are divided into "labile" and "refractory" pools, distinguished by different specified or (most commonly) calibrated breakdown rates.
The origin of organic phosphorus in aquatic environments determines its chemical composition and physical characteristics, which in turn affects bioavailability, chemical availability and nutritional quality. Organic phosphorus can be found in the form of nucleic acids (including DNA, RNA and other nucleotides), inositol phosphates and phospholipids, among other forms (Baldwin, 2013).
Our review has not uncovered any examples of biogeochemical models of natural environmental systems that divide organic phosphorus into chemically distinguished components, though some wastewater treatment models do so (Liu et al., 2007). I will not put forward a new formulation for organic phosphorus modelling here, but will at least discuss some of the major forms of organic phosphorus in aquatic systems which may be of interest to modellers, and the factors that influence their cycling which may be included in future models.

DNA and RNA
Nucleic Acids (DNA and RNA) make up a substantial proportion of dissolved organic phosphorus (DOP) in aquatic systems, though that proportion varies between systems, between the water column and sediment stores, and over time. Siuda et al. (1998) found that DNA accounted for between 0 and 100% of total phosphorus in the 21 German lakes studied (mean 54%) while RNA also accounted for anything from 0 to 100% (mean 36%). Similar results have been found in coastal systems (Sakano and Kamatani, 1992).
The proportion of organic phosphorus that is in the form of DNA and RNA is therefore both substantial and highly variable. This proportion has been found to vary as a function of trophic status, with eutrophic systems having a greater proportion of DOP as nucleic acids than oligotrophic or mesotrophic systems (Siuda and Chrost, 2001), probably due to the release of DOP from phytoplankton by lysis.
The bioavailability of dissolved DNA depends on whether it is "free" DNA or whether it is bound up in viral particles. Siuda and Chrost (2000), in their coastal case study, found that the proportion of dissolved DNA that was "free" and readily hydrolysable, varied from 50 to 90%.
Simulating this variability could be important in the next generation of aquatic phosphorus models because DNA and RNA are often very bioavailable in comparison with other forms of organic phosphorus.

Other nucleotides
Other nucleotides (including ATP, AMP and GTP) also contain phosphorus in organic form, and have been measured in lakes and marine systems, though in small quantities. Relatively little is known about the biogeochemistry of these molecules in aquatic systems (Baldwin, 2013). Although concentrations are low, the rate of turnover of nucleotides may be very high (Bjorkman and Karl, 2005), and it has been suggested that ATP in particular may be important in the phosphorus cycles of oligotrophic systems (Azam and Hodson, 1977;Baldwin, 2013). ATP and AMP are produced by phytoplankton during blooms and can be adsorbed to mineral surfaces, which reduces their reactivity (Baldwin, 2013).

Inositol phosphates
Inositol phosphates are common in terrestrial soils and plants, with high concentrations occurring in seeds, including seeds of aquatic macrophytes. Inositol phosphates may account for a substantial proportion of organic phosphorus in aquatic systems (up to 80% of sediment organic phosphorus (Mckelvie, 2007)), but this has not yet been sufficiently assessed (Baldwin, 2013).
Inositol phosphates in aquatic systems may be largely of terrestrial origin. When released through decomposition of plants, these phosphates are quickly bound to soil surfaces, but may be released in estuaries when exposed to marine salinities. The bioavailability of phosphorus in this form is not clear (Baldwin, 2013), so this is one topic regarding which modellers may have to watch and wait. In the meantime, inositol phosphates could be modelled as a refractory form of organic phosphorus that is present in detrital form in terrestrial loads, is adsorbed to sediment surfaces on breakdown of terrestrial detrital material, and is released as refractory dissolved organic phosphorus at higher salinities.

Phospholipids
Phospholipids (fats containing phosphorus) are present primarily in particulate form and may concentrate at the surface of waterbodies (for example, in sea foams) and on sediment surfaces. They may represent an important sink of phosphorus in oligotrophic systems, effectively removing it from the aquatic phosphorus cycle on timescales relevant to water quality modelling (Baldwin, 2013). This removal mechanism is not yet considered in aquatic phosphorus models.

Other processes
Other processes and interactions that are not generally included in aquatic biogeochemical models include: Direct uptake of some forms of dissolved organic phosphorus by phytoplankton; which may be a major pathway when dissolved inorganic phosphorus concentrations are low (Cotner and Wetzel, 1992); Bed transport (saltation) of phosphorus-carrying particles (Campbell, 1978); Changes in phosphorus uptake rates of aquatic plants associated with a pulsed nutrient supply (Touchette and Burkholder, 2000); Adaptations to nutrient deprivation such as changes in pigmentation of cyanobacteria and shifts between C3 and C4 photosynthesis (Touchette and Burkholder, 2007); The influence of burrowing animals on sediment ventilation (Van Cappellen et al., 2005).
Each of these may be significant in some circumstances (Fig. 5).

Methodological issues in environmental modelling
The ongoing development of aquatic water quality and ecological models is occurring in the wider context of ongoing improvements in environmental modelling methodology. In recent years, the journal, Environmental Modelling & Software, has covered developments, amongst others, relating to: Best practise modelling procedures. Jakeman (2006) outline a generic environmental modelling framework consisting of ten steps, to guide modellers through the process from clear identification of clients and objectives, through to final evaluation of the model and documentation of its accuracy and limitations. The relevance of the ten steps approach to biogeochemical modelling has been assessed by Robson et al. (2008), while Blocken and Gualtieri (2012) considered these steps in the context of computational fluid dynamics. Both found the suggested procedure to be very suitable for this type of modelling. More recently, Black et al. (2014) presented a framework for best practice modelling in the context of water resources management, which is particularly relevant to river phosphorus modelling. Sensitivity and uncertainty analysis. Refsgaard et al. (2007) presented a general framework for uncertainty analysis, including a brief review of available methods. This review offers a very useful starting point, as it covers not only formal, numerical methods that are not always appropriate in the context of environmental fluid dynamics, but also more flexible Fig. 5. A speculative summary of types of systems in which some less-commonly modelled phosphorus processes are most likely to be important. approaches such as uncertainty matrices, expert opinion elicitation, and scenario analysis. When a detailed sensitivity analysis is required, Campolongo et al. (2007) describe a method to minimise the size of the task, while Saltelli and Annoni (2010) provide guidance regarding how to ensure that the results are meaningful. Model emulation (also known as metamodelling or surrogate modelling). Ratto et al. (2012) and other authors in the 2012 thematic issue on model emulation describe approaches that can be used to derive a "model of a model", i.e. a relatively simple, data-based model of a more complex (often mechanistic) environmental model. Emulation techniques can be used to conduct sensitivity analysis (by proxy) of a computationally intensive model (Borgonovo et al., 2012), or for parameter optimisation (Margvelashvili et al., 2013), or to evaluate large numbers of scenarios to optimise management (Carnevale et al., 2012). Model evaluation. Bennett et al. (2013) review methods to characterise the performance of environmental models, including a range of traditional and more innovative methods that have been applied to models of aquatic systems. Integrated modelling. Laniak et al. (2013) and other authors in the special issue on integrated modelling address challenges in integrating models across environmental domains (e.g. catchment-to-coast or source-to-sea modelling) and subject domains (e.g. combining biophysical models with social and economic models). These include both technical challenges of data architecture and model infrastructure and social challenges of bringing together different knowledge systems and ontologies. This work is particularly relevant to large-scale modelling of phosphorus in environmental systems (which is likely to require integration of landscape and receiving water models), modelling of water quality in an operational or forecasting mode (which is likely to require robust data infrastructure), and modelling in a decision support framework (in which simulation of the fate of phosphorus is just one small part of an interdisciplinary puzzle). Fitness for purpose. Alexandrov et al. (2011) present guidelines regarding how to demonstrate that a model is fit for the purpose for which it has been developed. This includes guidelines for delineation of the model's appropriate domain of application, demonstrating that the model is credible, and demonstrating an improvement over prior art. Kelly et al. (2013) offer one perspective on how to choose the most appropriate modelling approach, given the context in which the model will be used. Systems dynamic models, the focus of the present review, are not always the best tool for the job.

Conclusions
There is a clear separation between marine and lake models on one side (diverse and conceptually sophisticated, but lacking rigour in implementation and assessment) and catchment and river models on the other (tending to neglect important biogeochemical processes, but with more rigorous characterisation of performance). As we move towards integrated catchment-to-ocean modelling, we need to overcome these differences, establishing a consistent approach across the two communities.
At present, the predictive performance of most aquatic biogeochemical and ecological models is not really very good, at least if judged by r 2 and relative error comparisons with observational data points. If we are to improve our predictive capacity, we need to understand why. To build this understanding, we must: Assess the physiological basis of alternative functional forms for phosphorus processes, and reject those than cannot produce ecologically realistic results within the range of conditions for which the models are being applied; Interact continually with observational scientists, to ensure that our models reflect the best available systems understanding and to ensure that observational programs are designed that can provide the information needed by our models; Agree on standard metrics and methodologies for performance assessment, so that we can extract knowledge from a range of model applications and conduct meaningful meta-analyses to identify problems and successes; and Develop techniques and utilities for automatic sensitivity, calibration, uncertainty estimation and performance characterisation that are suited to complex lake and marine models and sufficiently flexible to work with a range of model platforms.
As a community, we should be working to build a toolkit of submodels from which we can pick and choose to suit each modelling application e not necessarily within a single modelling framework (though this could be achieved through opensource community modelling of the sort recommended by Trolle et al. (2012) or Pereira et al. (2006)), but across the literature as a whole.
This toolkit should include a range of processes relevant to oligotrophic systems. Most current models are derived from models developed to aid management of eutrophication, and may not be suitable for the low-nutrient systems to which they are now also being applied. In applying biogeochemical models to oligotrophic systems, modellers should consider including formulations to represent: transfer limited uptake of phosphorus by benthic plants; direct uptake of bioavailable dissolved organic phosphorus by phytoplankton; rapid cycling of organic phosphorus in the form of nucleotides; and production of phospholipids and subsequent physical removal mechanisms.
Other processes, not commonly simulated at present, which might be included in such a toolkit include the effects of sediment drying and rewetting, flow through sediments, and groundwater biogeochemistry.
At the same time, we should beware of continuing increases in the complexity of models, as increases in model complexity over the past few decades have not been demonstrated to have substantially improved our predictive capacity. As a rule of thumb, a more complex model should only be used where a simpler model has been demonstrated to be inadequate, and a proposed, more complex formulation should always be thoroughly investigated to ensure that (even in light of the limitations of available data and uncertainties in other parts of the phosphorus cycle) it really does lead to more accurate representation of phosphorus cycling in aquatic ecosystems.