Comparing strengths and weaknesses of three ecosystem services modelling tools in a diverse UK river catchment

We compared three spatially explicit ecosystem services modelling tools across three services (water supply, carbon storage and nutrient retention): LUCI (Land Utilisation and Capability Indicator), ARIES (Artificial Intelligence for Ecosystem Services) and InVEST (Integrated Valuation of Ecosystem Services and Tradeoffs). Models were parameterised for the UK and applied to a temperate catchment with widely varying land use in North Wales. Although each tool provides quantitative mapped output, can be applied in different contexts, and can work at local or national scale, they differ in the approaches taken and underlying assumptions made. In this study, we focus on the wide range of outputs produced for each service and discuss the differences between each modelling tool. Model outputs were validated using empirical data for river flow, carbon and nutrient levels within the catchment. The sensitivity of the models to land-use change was tested using four scenarios of varying severity, evaluating the conversion of grassland habitat to woodland (0-30% of the landscape). We show that, while the modelling tools provide broadly comparable quantitative outputs, each has its own unique features and strengths. Therefore the choice of tool depends on the study question.


Highlights
• Ecosystem service decision support tools range in complexity and sophistication.
• We compared three spatial ecosystem service tools: ARIES, InVEST and LUCI.
• Models were run for water supply, carbon storage and nutrient retention services.
• All three tools performed similarly, but have different strengths.
• As each tool has unique features, choice of model depends on study question.


Introduction
Ecosystem services modelling tools allow the quantification, spatial mapping, and in some cases economic valuation, of ecosystem services. The output from these tools can provide essential information for land managers and policy makers to evaluate the potential impact of alternative management options or land-use change on multiple services. Such tools are now being used around the world, at a range of spatial scales, to address a wide variety of policy and management questions. For example, they have been used to investigate the possible effects of climate change on water provisioning and erosion control in a Mediterranean basin (Bangash et al., 2013), to provide guidelines for water resource management in China (Fu et al., 2014), and to examine the potential impact of agricultural expansion on biodiversity and carbon storage in Brazil.
Ecosystem service decision support tools range in complexity, with the simpler models requiring less user time and data inputs while the more complex models require more technical skill but can result in greater accuracy and utility. The simplest include spreadsheets (e.g. Ecosystem Services Review [ESR]; WRI, 2012), and mapping overlay tools based on land-cover based lookup tables (Burkhard et al., 2009). Intermediate complexity spatial tools provide information on the relative magnitude of service provision (e.g. SENCE; Vorstius and Spray, 2015), and the more complex tools allow spatial quantification and mapping of services, for example InVEST (Integrated Valuation of Ecosystem Services and Tradeoffs; Sharp et al., 2015), LUCI (Land Utilisation and Capability Indicator; Jackson et al., 2013) and ARIES (Artificial Intelligence for Ecosystem Services; Villa et al., 2014). With an ever increasing variety of tools available, there are now a number of reviews and comparisons that help potential users make informed decisions on which tool might be appropriate for their needs. These typically focus on tool capabilities, ease of access/use, time requirements and generalisability (Nelson and Daily, 2010; Vigerstol and Aukema, 2011; Bagstad et al., 2013a; Drakou et al., 2015; Burgess et al., 2016). For example, model outputs from ARIES and InVEST for carbon storage, water and scenic viewshed services were compared for a semi-arid river basin in Arizona, USA, and northern Sonora, Mexico, under different management scenarios (Bagstad et al., 2013b). Vorstius and Spray (2015) investigated similarities in mapped outputs from three different tools in relation to service delivery at a local scale. Turner et al. (2016), focusing on methods to assess land degradation, briefly reviewed a range of decision support tools and other models whose outputs have been evaluated in the context of ecosystem services.
There are also online toolkits available, for example, the National Ecosystem Approach Toolkit (NEAT; http://neat.ecosystemsknowledge.net/), providing guidance on selecting an appropriate modelling tool.
At first glance, many of the ecosystem services modelling tools appear to produce similar outputs; they can model multiple services, and are designed to be used for scenario analysis and decision-making. However, the approaches taken and underlying assumptions made for the models within each tool are often different, the appropriate resolution and scale of their application can vary and, since the models are in continuous development, reviews can become rapidly outdated. Therefore, there is an ongoing need for studies that compare multiple models for the same service(s) and study site(s), along with a need to evaluate models in new biophysical settings. In particular, this paper demonstrates how three such tools differ, highlighting unique aspects and discussing their strengths and weaknesses, at a level of detail which is not met in most previous reviews.
In this paper we compare three spatially explicit ecosystem services modelling tools, using examples of provisioning and regulating services (water supply, carbon storage and nutrient retention). The models are parameterised for the UK and applied to a temperate catchment with widely varying altitude and land use in North Wales. While two of the tools have previously been compared (ARIES and InVEST) (e.g. Vigerstol and Aukema, 2011; Bagstad et al., 2013b), LUCI has not been evaluated in a tool comparison. Additionally, we focus on an aspect receiving little attention in previous reviews, i.e. that the modelling tools produce a range of different outputs for each 'service'; these differing outputs may inform the choice of tool for a particular application. Lastly, since ecosystem services modelling tools are often used to evaluate the impacts of land-use change, we assess their sensitivity to varying severities of land-use change (0-30% change of catchment area).

Study site
The Conwy catchment in North Wales, UK, is 580 km² in area (Fig. 1). It is a small catchment in global terms, but is characterised by a diverse range of elevation (0-1060 m), climate, geology and land uses. Predominantly rural, the land-use comprises sheep farming in the upland areas to the west and mixed dairy, beef and sheep farming in the lower areas to the east. The lowland flood plain area also contains some arable land. There is a large afforested area to the mid-west. Most of the sub-catchments contain some semi-natural woodland, including areas of riparian woodland. In the uplands to the south of the catchment lie extensive areas of blanket bog, protected under the European Natura 2000 biodiversity designation. More information can be found on the Conwy catchment in Emmett et al. (2016).

Modelling ecosystem services
We have chosen examples from both provisioning and regulating services, including those where the spatial context is important to the flow of services (water yield, nutrient retention) and where it is less directly important (carbon storage). We did not include a cultural service as ARIES and LUCI do not have readily available cultural models parameterised for the UK.

Overview of model approaches
ARIES, InVEST and LUCI were chosen as spatially explicit ecosystem services modelling tools that provide quantitative output, can be applied in different contexts, and can work at local or national scale, depending on the available data. InVEST combines land use and land cover (LULC) data with information on the supply (biophysical processes) and demand of ecosystem services to provide a service output value in biophysical or economic terms. The models, written in Python, are available as stand-alone applications. LUCI is a decision support tool that can model ecosystem service condition and identify locations where interventions or changes in land use might deliver improvements in ecosystem services. Output maps are colour-coded for ease of interpretation: in default mode green is used to indicate good opportunity for changes, and red to mean "stop, don't make changes here". The models incorporate biophysical processes, applying topographical routing for hydrological and related services, and use lookup tables where appropriate, e.g. for carbon stock. The models are written in Python, and run in an ESRI GIS environment. LUCI has a unique, built-in trade-off tool, which allows the user to identify locations where there is potential for "win-wins", i.e., where multiple services might benefit from interventions, or where there may be a trade-off, with one service benefitting from interventions while another is reduced.
In contrast, ARIES was developed as an online platform to allow the building and integration of various kinds of models. This allows the most appropriate ecosystem services model to be assembled automatically from a library of modular components, driven by context-specific data and machine-processed ecosystem services knowledge (Villa et al., 2014). ARIES focuses on beneficiaries, probabilistic analysis, and spatio-temporal dynamics of flows and scale, aiming to distinguish between potential and actual benefits. While InVEST and LUCI focus on using known biophysical relationships (where possible) to model physical processes, ARIES, in addition to standard modelling approaches incorporated by model wrapping, can also use probabilistic methods (Bayesian networks) if there are insufficient local data to use in biophysical equations (Vigerstol and Aukema, 2011). A key feature of ARIES is its conceptualisation of 'source' elements within a landscape that contribute to service provision, and 'sink' elements that detract from service provision (Villa et al., 2014).
Models were parameterised for the UK and then applied to the study catchment. Although conceptually similar in some ways, differences in modelling approach can create differing requirements for some input data. Our aim was also to run each model realistically, as users would do in the real world, rather than as a direct comparison with identical input data. ARIES and InVEST were run using 50 m by 50 m resolution digital elevation data (CEH Integrated Hydrological Digital Terrain Model) (Morris and Flavin, 1990) and the UK Land Cover Map (LCM) 2007 (Morton et al., 2011) (50 m by 50 m custom aggregation). LUCI used vector format LCM 2007 and soil type data (National Soil Research Institute, 1999), and 5 m by 5 m resolution digital elevation data (NextPerspectives, 2014) as the accurate simulation of overland and near-surface flow mitigation requires detailed simulation of catchment hydrology, using high resolution topographic data. This is not required for InVEST and ARIES due to their approach of aggregating outputs at catchment or sub-catchment scale (see Supplementary material for further information). Due to differences in required input format between the models and LUCI's requirement for higher spatial resolution data as mentioned above, it was not possible to use the same sources for all input data. More detail on data inputs for all models (including biophysical lookup tables and summary tables showing which data inputs differed between tools) is available in the Supplementary material.

Water supply models
The InVEST water yield model provides a value for annual water yield per grid-cell by subtracting the water lost via evapotranspiration from the average annual precipitation. Evapotranspiration is based on an approximation of the Budyko curve (Zhang et al., 2004), and information on vegetation and rooting properties. The value per grid-cell is then summed to provide a total yield for the watershed. Water abstractions (i.e. removal of water from the system) can also be included. The model can calculate the value of the energy that would be produced if the water reached a hydropower facility, therefore providing biophysical and economic outputs.
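The per-cell water balance described above can be illustrated with a minimal sketch. It assumes a Fu-type form of the Budyko curve with a single shape parameter (here called `omega`); the value of `omega`, the cell values and all function names are hypothetical placeholders, not taken from the study or from the InVEST code, where the curve is informed by vegetation and rooting properties.

```python
def aet_fraction(pet_mm, precip_mm, omega):
    """Fraction of precipitation lost to evapotranspiration.

    Fu-type Budyko curve (an assumed form, for illustration only):
    AET/P = 1 + PET/P - (1 + (PET/P)**omega) ** (1/omega)
    """
    ratio = pet_mm / precip_mm
    return 1.0 + ratio - (1.0 + ratio ** omega) ** (1.0 / omega)


def water_yield_mm(precip_mm, pet_mm, omega=2.6):
    """Annual water yield of one cell: precipitation minus actual ET."""
    aet_mm = aet_fraction(pet_mm, precip_mm, omega) * precip_mm
    return precip_mm - aet_mm


# Hypothetical cells: (annual precipitation, potential ET), both in mm.
cells = [(2500.0, 500.0), (1200.0, 550.0), (900.0, 600.0)]
CELL_AREA_M2 = 50.0 * 50.0  # 50 m by 50 m grid-cell

yields_mm = [water_yield_mm(p, pet) for p, pet in cells]
# Convert depth (mm) to volume (m3) and sum over the watershed.
total_yield_m3 = sum(y / 1000.0 * CELL_AREA_M2 for y in yields_mm)
print(yields_mm, total_yield_m3)
```

In this form, actual evapotranspiration never exceeds either precipitation or potential evapotranspiration, so the yield of every cell stays between zero and the precipitation input.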
The LUCI flood mitigation and water supply services model calculates direction of flow over the landscape using GIS functions. The model then combines this with spatial data on hydrologically effective rainfall, calculated by subtracting estimated evapotranspiration from precipitation, and simulates accumulation of this water across the landscape using flow accumulation routines. Average flow delivery to all points in the river network is simulated, and can be used to estimate water supply. The model identifies "mitigating features" that enhance infiltration and retention of water, such as woodland or wetlands, based on land use data (and soils data where parameterised). The model also identifies areas where flow is routed through the mitigating land use features, and maps these areas as "mitigated", i.e. much less water is expected to travel to the watercourse as overland or other rapid flow.
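The routing idea above can be sketched as follows. This is a simplified illustration only, not LUCI's GIS implementation: it assumes each cell passes all of its water to the lowest of its eight neighbours, and the grid values and function name are hypothetical.

```python
def flow_accumulation(elevation, effective_rain):
    """Accumulate hydrologically effective rainfall down a small grid.

    Each cell passes everything it holds to its lowest strictly lower
    neighbour (8-neighbour rule). Cells are visited from highest to
    lowest, so a cell has received all upslope water before passing it on.
    """
    rows, cols = len(elevation), len(elevation[0])
    acc = [row[:] for row in effective_rain]  # start with local rainfall
    order = sorted(
        ((elevation[r][c], r, c) for r in range(rows) for c in range(cols)),
        reverse=True,
    )
    for z, r, c in order:
        target, lowest = None, z
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                    if elevation[nr][nc] < lowest:
                        lowest, target = elevation[nr][nc], (nr, nc)
        if target is not None:  # pits and outlets keep their water
            acc[target[0]][target[1]] += acc[r][c]
    return acc


# Hypothetical 3 x 3 hillslope draining to the bottom-right corner.
elevation = [[9.0, 8.0, 7.0], [6.0, 5.0, 4.0], [3.0, 2.0, 1.0]]
# Effective rainfall = precipitation minus estimated evapotranspiration.
effective_rain = [[1.0] * 3 for _ in range(3)]
accumulated = flow_accumulation(elevation, effective_rain)
print(accumulated[2][2])
```

Because water is only moved, never created or lost, the value at the single outlet equals the total effective rainfall over the grid.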
A few options have become available to model water supply under the ARIES framework, as the product has evolved. Process-based hydrological models are currently supported and maintained for this service (e.g. PRMS, Leavesley et al., 1995, or SWAT, Neitsch et al., 2011). However, the probabilistic approach, using Bayesian networks, has been applied more often (e.g. Bagstad et al., 2013b). We created two water supply models in ARIES: a probabilistic model, calibrated at UK scale and designed to accommodate the influences of land cover on evapotranspiration to enable sensitivity to land-use change to be evaluated, and a 'Flow & Use' model that can track service flows through the landscape using mechanistic routing algorithms (which simulate the process of water flowing downwards through the path of least resistance) and accounts for abstraction. The latter was implemented in an independent GIS environment as a variation of the Service Path Attribution Network (SPAN) modules, designed to map the flow of services, and its components, in previous distributions of ARIES (Johnson et al., 2010). Using this method, it is possible to incorporate both point and diffuse water 'sources' and abstractions ('sinks') and to follow the fate of the service across the landscape. Spatial data on rainfall and evapotranspiration are handled in a deterministic way through flow routines, while source/abstraction points are accounted for as contributing masses. Both the Bayesian and the 'Flow & Use' models provide annual water supply as output for any location (grid-cell) in the catchment (see Supplementary material for further detail).
For all models, UK precipitation data (1 km resolution) from the CEH-GEAR dataset were used (Tanguy et al., 2014). Potential evapotranspiration data for the UK (1 km resolution) were calculated from the CHESS meteorological dataset (Robinson et al., 2015) using the Penman-Monteith equation (Monteith, 1965). An annual average over the period 2000-2010 was used for the climate data, accounting for variability between years. Data on UK water abstractions were taken from annual estimates by the UK Department for Environment, Food and Rural Affairs (Defra).

Carbon models
The InVEST carbon storage and sequestration model calculates carbon stocks within the landscape using lookup tables containing one value per land cover type. Carbon in four stores is summed: aboveground biomass (bark, trunk, branches, leaves), below-ground biomass (roots), dead organic matter (standing deadwood and litter), and soil carbon. The depth of soil assumed for carbon stock depends on available data: two depths, 30 cm and 1 m, were used for the Conwy catchment. There is an option to provide current and projected land cover maps, which allows the net change in carbon stock resulting from land-use change over time, interpreted as either sequestration or loss, to be mapped. The model can perform an uncertainty analysis if the mean and standard deviation for each carbon estimate is given. The market or social value of the carbon stored in the study area can also be calculated (based on values provided by the user).
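The lookup-table approach can be sketched in a few lines. The carbon densities, land cover names and maps below are hypothetical illustration values, not the parameters used in the study, and the function names are not part of InVEST.

```python
# Hypothetical carbon densities (t C per ha) for the four stores,
# one value per land cover type. Not the values used in the study.
CARBON_TABLE = {
    "woodland": {"above": 70.0, "below": 20.0, "dead": 10.0, "soil": 100.0},
    "grassland": {"above": 2.0, "below": 3.0, "dead": 0.0, "soil": 90.0},
    "arable": {"above": 1.0, "below": 1.0, "dead": 0.0, "soil": 60.0},
}
CELL_AREA_HA = 0.25  # a 50 m by 50 m grid-cell


def cell_carbon(cover, table=CARBON_TABLE, area_ha=CELL_AREA_HA):
    """Carbon in one cell (t C): sum of the four stores times cell area."""
    return sum(table[cover].values()) * area_ha


def total_carbon(land_cover_map):
    """Total carbon stock (t C) over a land cover grid."""
    return sum(cell_carbon(cover) for row in land_cover_map for cover in row)


current = [["grassland", "grassland"], ["arable", "woodland"]]
projected = [["woodland", "grassland"], ["arable", "woodland"]]

# A positive net change is read as sequestration, a negative one as loss.
net_change = total_carbon(projected) - total_carbon(current)
print(total_carbon(current), total_carbon(projected), net_change)
```

With these placeholder values, converting one grassland cell to woodland raises the total stock, which is the kind of net change the current versus projected comparison is meant to map.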
The LUCI model can calculate total carbon stocks at steady state, i.e. assuming that soil and vegetation carbon are at equilibrium, using data on average carbon stock in above and below-ground biomass, dead matter and the top 30 cm (or 1 m) of soil for different soil and land use combinations. The lookup table aggregates land use into four types (wood, permanent grass, semi-natural, arable) to provide sufficient data points for each soil and land use combination. If spatial data on historic or scenario-based land-use change are available, the model can be used to calculate change in carbon stock between two equilibrium values. In the absence of land-use change data, the model allows a comparison of carbon stock, with the potential value under the current land use assigned as the maximum soil carbon stock for that soil type, highlighting areas with potential to increase carbon stocks. LUCI output maps present opportunities to increase soil carbon stocks, or to protect areas where carbon stocks are already high within the landscape.
Carbon regulation can be modelled in ARIES in different ways. When enough data are available, carbon budgeting simulated through biomass dynamics is handled by the LPJ-GUESS model (Smith et al., 2014), which has been ported to ARIES. However, in data-poor situations, other modelling choices are available. We opted for the Bayesian network approach, applied by Balbi et al. (2015). Carbon concentration in the upper 15 cm of soil was calculated, using available data to train the model. Explanatory variables were topography (aspect, slope, elevation), growing period in degree days, precipitation, soil group and land use type. The model was calibrated and validated on measured carbon concentration from ~2500 sample points across the UK (topsoil, 0-15 cm; Emmett et al., 2010), then applied spatially to the Conwy catchment (see Supplementary material for further details). Carbon stock was not calculated using the ARIES model.

Nutrient retention models
Annual average runoff in InVEST is calculated (per grid-cell), using the water yield model. The model then determines the quantity of pollutant exported and retained by each grid-cell, based on a lookup table containing the nutrient loading (export coefficients) and the filtering capacity of each land cover type. The nutrient loading value is adjusted by a Hydrologic Sensitivity Score (creating the Adjusted Loading Value, ALV), which helps to account for differences between conditions where the export coefficient was measured and the study site. Natural vegetation, for example forests and wetlands, retains a high percentage (60-80%) of the nutrient flowing through the cell, while urban areas have low retention. The final output is total annual nutrient export (i.e. load) and retention for the watershed. Mapped outputs include, for each grid-cell, the nutrient export to stream (kg), which reflects the nutrient released from each cell that reaches the stream, and the nutrient retention (kg), with values based on the filtering capacity of the cell and the total load coming from upstream. The model can also calculate the economic saving that habitats within the ecosystem have provided due to avoided water treatment costs (input by the user).
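The export and retention logic can be illustrated with a simplified sketch. It assumes that the load leaving a cell is reduced by the filtering capacity of each cell it crosses on its way to the stream; the coefficient values and function names are hypothetical and this is not the InVEST implementation.

```python
def adjusted_loading_value(export_coefficient, sensitivity_score):
    """ALV: export coefficient scaled by a Hydrologic Sensitivity Score."""
    return export_coefficient * sensitivity_score


def export_to_stream(load_kg, downslope_retention):
    """Load (kg) from one cell that still reaches the stream.

    downslope_retention lists the filtering capacity (0-1) of each cell
    crossed along the flow path, in order; each removes its share of
    whatever load arrives.
    """
    remaining = load_kg
    for efficiency in downslope_retention:
        remaining *= 1.0 - efficiency
    return remaining


# Hypothetical cell: export coefficient 0.5 kg per year, score 1.2.
alv = adjusted_loading_value(0.5, 1.2)
# Flow path crosses a woodland cell (70% retention, within the 60-80%
# range quoted above) and then a low-retention cell (10%, hypothetical).
reaching_stream = export_to_stream(alv, [0.7, 0.1])
retained = alv - reaching_stream
print(alv, reaching_stream, retained)
```

Summing the exported and retained amounts over all cells would give watershed totals of the kind reported as the final model output.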
The LUCI diffuse pollution mitigation model estimates nutrient loading in the landscape based on land cover, average stocking density and fertiliser input. Further functionality that also considers the influence of slope, soil type, and detail on land management has recently been developed, but is not yet parameterised for the UK (see Trodahl et al., in press). Accumulated loading over the landscape is calculated by combining the nutrient loading estimates with the flow direction layer calculated from topography and applying flow accumulation routines. Nutrient flow accumulation for near surface flow is calculated similarly, by weighting spatial data on flow direction with nutrient export coefficients and a factor for the solubility of nitrogen. For overland and rapid near-surface flow, "mitigating features", as identified by the water supply model, are assumed to remove all or part of N and P entering them. The combined output from routed overland and near surface flow provides simulated values of spatially distributed annual mean in-stream loading and concentrations of dissolved nutrients within the stream network. As InVEST and LUCI only model diffuse pollution, a post-hoc estimation of point-source phosphorus entering the catchment, based on the number of people served by sewage works in the catchment (see Supplementary material), can be added to the final model output.
At the time of writing, no modules for nutrient regulation have been formally released and supported within the ARIES framework, therefore no ARIES model was run for this service. However, users are free to implement customised models (e.g. Balbi et al., 2015) or adopt models made available by the ARIES user community. Barquín et al. (2015) discuss insights into the future development of ARIES, including broader scope for water quality modelling.

Scenarios
To test the sensitivity of the models to land-use change, scenarios of varying severity were compared to a baseline of no change. The first three scenarios were 5%, 10% and 30% of the catchment changing from grassland to woodland. These scenarios were inspired by Welsh Government targets to increase the extent of forests by an additional 5% of the land area of Wales (Welsh Government, 2009). Semi-natural grasslands (rough, neutral, acid, calcareous and montane) were merged to form a single grassland habitat type. Random patches of woodland were placed within current grassland habitats to create input layers. Patch size ranged between 5 ha and 100 ha, based on an average field size of ~4 ha and an average farm size of ~40 ha in Wales. Given the structure of the landscape, it is unlikely that any larger patches would emerge. The fourth scenario used the 'Managed Ecosystems' scenario (Prosser et al., 2014) from the DURESS (Diversity of Upland Rivers for Ecosystem Service Sustainability) project. The DURESS scenarios were developed through discussions with stakeholders and experts on current and future drivers of land-use change in Wales. The 'Managed Ecosystems' scenario envisages an upland landscape with focus on management for carbon and biodiversity, expansion of woodlands and wetlands, restoration of peatlands and de-intensification of pastures (improved grassland). In terms of land-use change, this leads to a 2.3% increase in woodland, a 16% decrease in pasture and a 13% increase in semi-natural grassland within the Conwy catchment.

Trade-offs
Using ARIES and InVEST, post-hoc examination of the spatial pattern of service provision across the landscape under a variety of future scenarios can demonstrate trade-offs across multiple services (e.g. Nelson et al., 2009). However, LUCI is the only tool in this study that currently has a module for evaluating trade-offs. The model applies an equal weighting to each service as a default, but users can increase the weighting of services that are of particular interest and/or exclude services. Units are normalised prior to input to the trade-off tool, with classifications based on user-defined thresholds.
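The weighting and normalisation steps can be illustrated with a minimal sketch. The source does not describe LUCI's internal algorithm, so the min-max normalisation, the weighted mean, the class labels and the thresholds below are all assumptions made for illustration.

```python
def normalise(values):
    """Rescale a list of service values to the 0-1 range (min-max)."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def combined_score(services, weights=None):
    """Weighted mean of normalised services for each cell.

    services maps a service name to a list of per-cell values.
    A weight of zero excludes a service; equal weights are the default.
    """
    if weights is None:
        weights = {name: 1.0 for name in services}
    active = [n for n in services if weights.get(n, 0.0) > 0.0]
    if not active:
        raise ValueError("at least one service needs a positive weight")
    total_weight = sum(weights[n] for n in active)
    scaled = {n: normalise(services[n]) for n in active}
    n_cells = len(services[active[0]])
    return [
        sum(weights[n] * scaled[n][i] for n in active) / total_weight
        for i in range(n_cells)
    ]


def classify(score, low=0.33, high=0.66):
    """Assign a class from user-defined thresholds (hypothetical labels)."""
    if score < low:
        return "low"
    return "high" if score >= high else "medium"


# Hypothetical per-cell values in different units.
services = {"carbon": [10.0, 20.0, 30.0], "water": [3.0, 2.0, 1.0]}
equal = combined_score(services)
carbon_only = combined_score(services, {"carbon": 2.0, "water": 0.0})
print(equal, [classify(s) for s in carbon_only])
```

Normalising first means services measured in different units can be combined, which is why the thresholds are applied to unitless scores.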

Model validation
Model outputs were validated against measured data collected in the Conwy catchment (Fig. 1). Flow data were taken from two sites within the gauging station network of the UK National River Flow Archive (NRFA), which is coordinated by the Centre for Ecology and Hydrology (CEH). The boundary for each NRFA catchment was defined using the CEH Integrated Hydrological Digital Terrain Model (Morris and Flavin, 1990). Soil carbon, above and below-ground biomass data were collected from up to 18 sites with varying land-use in the catchment (Glanville et al., in prep; Smart et al., in prep). Water quality data (nitrogen as nitrate and phosphorus as orthophosphate concentrations) for one site within the Conwy catchment were extracted from the UK Environment Agency's Harmonised Monitoring Scheme database and annual loads calculated using river flow data following Dunn et al. (2014) (see Supplementary material for further information).

Water supply
All of the tools provide comparable maps of annual water yield per grid-cell (Fig. 2a, d, g), with the lowest yield consistently seen on the eastern half of the catchment. The key InVEST output is annual water yield per sub-catchment (Fig. 2b), as calculations are performed at this scale. Grid-cell level maps are for model checking purposes only. The LUCI traffic light map of flood interception (Fig. 2c) shows areas providing flood mitigation in red, including trees and deep permeable soils, while green areas have high flood concentration and could benefit from mitigation. LUCI also provides quantitative outputs, including overland flow accumulation (Fig. 2e), showing the accumulation of water over the landscape according to topographic hydrological routing of hydrologically effective precipitation, and average flows estimated as the annual flow at each point in the stream network (Fig. 2f). The ARIES 'Bayesian' model can also produce an uncertainty map (Fig. 2h), while the ARIES 'Flow & Use' model delivers the flow of water available for use through the catchment (Fig. 2i; named "Actual source" in the model, see Supplementary material).
When model outputs were compared with observed annual flow data from two gauging stations in the catchment, model performance was similar, with LUCI and the ARIES 'Flow & Use' model providing the closest estimates to the measured values (Table 1). The ARIES 'Flow & Use' model showed that demand (amount required by users, but not necessarily available to them) was equal to water use (i.e. volume abstracted) at all abstraction points, while dummy data was used to illustrate how the model could report unmet demand (Table 2). Water use at the Llyn Cowlyd abstraction point was particularly high (77% of total available).

Carbon models
For the carbon models, InVEST and LUCI provide broadly comparable mapped outputs for total carbon stock (biomass + soil at both 30 cm and 1 m depth) (Fig. 3a & d, b & e). However, while the maps for 'biomass + 1 m depth' are similar between the models, the maps for 'biomass + 30 cm soil depth' show some differences in both the spatial pattern and the magnitude of carbon stocks, reflecting differences between the carbon approach used for the two models (see Supplementary material, Section 4). The ARIES output is carbon concentration in the topsoil (15 cm) (Fig. 3g). Both InVEST and ARIES provide maps of uncertainty, using the standard deviation and the coefficient of variation (ratio of the standard deviation to the mean; %) respectively, for the carbon estimates per land class (Fig. 3c, h). The LUCI 'carbon sequestration potential' map identifies areas where existing carbon stock is already high (red; no potential for change), and where there may be potential for increasing carbon stocks under different land use (green) (Fig. 3f).
Modelled outputs for total carbon stock for the catchment (30 cm and 1 m soil depth) using InVEST and LUCI were very similar, within 10% of each other (Table 3), despite the differences in spatial distribution of carbon shown in the mapped output. When compared to total carbon calculated from measurements taken in the catchment, both models showed over-estimates; however, values were of the same order of magnitude (Table 3). Total measured carbon at points within the catchment was also compared with LUCI modelled values, with a mean difference of 14.35 kg m⁻² (12.43 sd; 17 points) for biomass + 30 cm soil depth and a mean difference of 5.47 kg m⁻² (5.43 sd; 22 points) for biomass + 1 m soil depth (see Supplementary material).
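The comparison statistics reported here can be reproduced in outline with a short sketch. The point values below are hypothetical, and the use of absolute differences is an assumption, since the text does not state whether the mean difference is signed.

```python
from statistics import mean, stdev


def percent_difference(modelled, measured):
    """Percentage difference of a modelled total relative to the measured one."""
    return 100.0 * (modelled - measured) / measured


def point_differences(modelled, measured):
    """Mean and sample standard deviation of absolute point differences."""
    diffs = [abs(m - o) for m, o in zip(modelled, measured)]
    return mean(diffs), stdev(diffs)


# Hypothetical carbon values (kg per square metre) at three sample points.
modelled_points = [12.0, 20.0, 8.0]
measured_points = [10.0, 15.0, 9.0]

mean_diff, sd_diff = point_differences(modelled_points, measured_points)
overestimate = percent_difference(110.0, 100.0)
print(round(mean_diff, 2), round(sd_diff, 2), overestimate)
```

A positive percentage difference corresponds to a model over-estimate, matching the way differences between modelled and measured totals are described above.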

Nutrient retention
The InVEST and LUCI nutrient retention models produce slightly different mapped outputs, which are not directly comparable. Fig. 4 shows examples for phosphorus but outputs are also available for nitrogen (see Supplementary material). While the InVEST Adjusted Loading Value and LUCI phosphorus (P) load (at each point) maps (Fig. 4a, d) are based on nutrient exports per land class, the InVEST map also takes into account the flow upstream of the grid-cell. InVEST provides an output of the vegetation filtering capacity of each land class (Fig. 4b) and the load from each grid-cell that eventually reaches the stream (accounting for the nutrient being retained downslope) (Fig. 4c). LUCI outputs the nutrient concentration for any point in the stream network (Fig. 4e) and the accumulated P loading for each point, considering the P contribution from uphill sources (Fig. 4f).
When the average annual load was calculated using measured concentration and flow data (following Dunn et al., 2014; see Supplementary material) and compared to model outputs, both models showed considerable underestimates for the study catchment, particularly the InVEST nitrogen model (Table 4).

Trade-offs - LUCI
The LUCI trade-offs tool showed that, when all modelled services were considered, there is some opportunity to enhance multiple services, particularly in the north and east of the catchment (Fig. 5a). The potential for possible gains was explored using maps of trade-offs between pairs of services. For example, when pairing carbon and flood mitigation (Fig. 5c), the area mapped in dark green indicates opportunity to enhance both services, while areas in the south and west of the catchment have existing high provision for both services.

Scenarios
The sensitivity of the models to land-use change depended on the service, with greater changes seen for carbon stocks and nutrient loads than water yield. The change in annual water yield per watershed (compared to baseline) was minimal for all models. However, the area of mitigating features (LUCI) increased greatly and was 85% greater than the baseline for the 30% grassland to woodland (GW) scenario (Table 5i). Change in mitigated area is reported for the LUCI model as opposed to change in water yield, because the functionality available for UK applications does not yet include a function to adjust evapotranspiration for land-use change scenarios. For InVEST and LUCI, the change from grassland to woodland led to an increase in total carbon stocks, as did the DURESS scenario. ARIES predicted decreases in soil carbon for the GW scenarios (Table 5ii). Nitrogen load generally decreased with increasing woodland, while the DURESS scenario for InVEST and LUCI showed large reductions in annual N load. While phosphorus load increased with increasing woodland using InVEST, LUCI showed a gradual decrease; however, both models had similar outputs for the phosphorus DURESS scenario (Table 5iii).
Table 3. Carbon measurements (soil, above and below-ground biomass) from 18 sites in the catchment scaled to give a total estimate of carbon for the catchment compared to modelled carbon stocks to 30 cm and 1 m soil depth (including soil and above-ground, below-ground, and dead vegetation biomass) (percentage difference between modelled and measured in brackets).
Fig. 4. Nutrient retention models for phosphorus: a) InVEST, Adjusted Loading Value (ALV) (kg year⁻¹ per 50 m by 50 m grid-cell): the export coefficient is multiplied by a Hydrologic Sensitivity Score, which accounts for each grid-cell's run-off index, i.e. log (sum of the water yield of grid-cells along the flow path above the grid-cell); b) InVEST, vegetation filtering capacity (% of nutrient retained from cell upslope); c) InVEST, P export to stream (kg year⁻¹ per 50 m by 50 m grid-cell); the load from each grid-cell that eventually reaches the stream (accounting for the nutrient being retained downslope); d) LUCI, Point source P load (kg ha⁻¹ year⁻¹): phosphorus load at any point in the landscape, based on export coefficients per land use; e) LUCI, P river concentration (mg l⁻¹); f) LUCI, accumulated P loading (kg year⁻¹ for each 5 m by 5 m grid-cell), considering the load not just at a point source but also that contributed from "uphill" sources.

Strengths and performance of modelling tools
To increase our understanding of ecosystem services modelling tools, there is a need for quantitative comparisons, for the same services and study area, in a variety of environmental conditions. Run for a temperate UK catchment, the three tools in this study were found to have broadly comparable quantitative model outputs for each service, as Bagstad et al. (2013b) also concluded when using ARIES and InVEST for a semi-arid environment. However, the modelling tools also have unique features and strengths. InVEST has been used widely, has a comprehensive user manual and provides example input data per model. In addition to biophysical outputs, InVEST also provides estimates of valuation, based on user inputs, highlighting areas with high levels of provision for particular services (e.g. Fu et al., 2014). LUCI's traffic light maps allow quick and easy interpretation of the model output. The LUCI flood mitigation map has been applied as part of the Glastir Monitoring and Evaluation Programme (GMEP), simulating impacts of interventions, such as riparian planting, to provide fast feedback to the Welsh government (Emmett and GMEP Team, 2014). LUCI is also the only tool with a trade-off module, providing a useful visual output of the impacts of land-use change on multiple services, and the only tool that respects fine-scale spatial configuration of landscape elements. ARIES represents a good option in data-scarce areas and its probabilistic approach can cope with data gaps, providing maps of modelled outputs along with associated uncertainty. When analysing abstraction and water use, tracking the flow of service provision across the landscape is necessary. The ARIES 'Flow & Use' model allows for detailed mapping of the various flow components.
Model validation revealed that performance against observed data was variable. The water yield models performed well, as Redhead et al. (2016) found when running the InVEST model for 42 catchments across the UK. Annual average flow values from the LUCI model also compare very well with measured values from the NRFA at Wales national scale (see Supplementary material). The InVEST and LUCI carbon models overestimated total carbon in the catchment; however, values were of the same order of magnitude. This is to be expected, as input data were extracted from a variety of literature sources and national-scale spatial data (soil) were used. Also, the measured C data from the catchment did not include an estimate of dead matter, although this would represent a small percentage of the overall total. When modelled and measured carbon were compared for individual points, the LUCI model showed reasonable performance given the same constraints of generalised inputs and spatial data which did not match the observed land use or soil type for all points. All of the nutrient retention models performed less well, particularly InVEST, partly due to the difficulties in assigning suitable export coefficients. However, at national scale, LUCI values for N in Wales compare well to measured values from the Water Information Management Solution (WIMS) database (see Supplementary material).

Common limitations
All of the modelling tools share some limitations; these are ongoing areas for development. The water and nutrient retention models work on an annual basis, meaning that more detailed temporal changes in water supply, hydropower production and nutrient concentration are not considered. Sparsely sampled measured phosphorus data may not be representative of the annual load, due to high variability and the tendency for sampling during base-flow dominated conditions, whereas much of the load may actually come from events. The water models also do not allow for surface water-groundwater interactions, where streams can either gain or lose water through the streambed. The use of average inventory values for carbon fails to account for variation within a land use type, due to many factors, including land management history, temperature or elevation. Chaplin-Kramer et al. (2015) adapted the InVEST carbon model to allow for edge effects on carbon storage in forests. There are also difficulties with calculating carbon emissions for land-use change scenarios, as soil type (and associated carbon stock) will affect preferred land use and inventories may thus not actually be indicative of the impact of land use on soil type. This space-for-time substitution is currently necessary due to a lack of appropriate process-based models that do not require site-specific calibration; incorporation of simplified process-based approaches would be advantageous.
The nutrient retention models are highly dependent on the accuracy of the export coefficient values used. Published export coefficients tend to be derived from only a few case studies, and may not be directly applicable to the study area. Many factors can influence nutrient export within a land use type, including management practices (grazing regime, fertiliser application rates), livestock density (particularly important for nitrogen), topology, soil type and rainfall (Reckhow et al., 1980). Also, published export coefficients implicitly include the retention element, while InVEST decomposes the coefficients into export and retention factors, which may add further uncertainty. There was only one water quality monitoring station with associated flow data in our study area, which may not be sufficient to validate the models. Discrepancies between reality and export coefficients based on variations in land management may be expected to average out at larger scales. While an estimate of point source P was added to the InVEST and LUCI outputs, this value was based on the human population served by the sewage works within the catchment and did not account for the export of phosphorus from septic tanks.
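To make the dependence on both flow and concentration data concrete, the following minimal Python sketch estimates an annual load from paired samples using a flow-weighted mean concentration. This is one common estimator and is an assumption here, not necessarily the method used in this study; the function names and the sample values are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # non-leap year


def flow_weighted_mean_concentration(conc_mg_per_l, flow_m3_per_s):
    """Mean concentration weighted by the flow at each sampling time."""
    if len(conc_mg_per_l) != len(flow_m3_per_s) or not conc_mg_per_l:
        raise ValueError("inputs must be non-empty and of equal length")
    total_flow = sum(flow_m3_per_s)
    if total_flow <= 0:
        raise ValueError("total sampled flow must be positive")
    weighted = sum(c * q for c, q in zip(conc_mg_per_l, flow_m3_per_s))
    return weighted / total_flow


def annual_load_kg(conc_mg_per_l, flow_m3_per_s, mean_annual_flow_m3_per_s):
    """Annual load (kg/year) from sparse samples and the mean annual flow.

    1 mg/l equals 1 g/m3, so concentration times flow gives g/s.
    """
    fwmc = flow_weighted_mean_concentration(conc_mg_per_l, flow_m3_per_s)
    grams_per_second = fwmc * mean_annual_flow_m3_per_s
    return grams_per_second * SECONDS_PER_YEAR / 1000.0


# Hypothetical samples: constant 1 mg/l at flows of 2 and 4 m3/s,
# with a mean annual flow of 3 m3/s.
load = annual_load_kg([1.0, 1.0], [2.0, 4.0], 3.0)  # 94608.0 kg/year
```

Because samples taken at high flow carry more weight, a record dominated by base-flow samples can misrepresent the annual load, which is the sampling concern raised above for phosphorus.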

Key tool differences
InVEST is currently easily accessible and free to download. LUCI is not yet freely available for public use, although it can be accessed by contacting the model developers, and a fully accessible, free to download version is planned for release in April 2017. ARIES is based on shared, open source development, so it is available at no cost for non-profit use, while its k.Lab technology, the technical documentation and the development environment are freely accessible to registered users. There is also an ARIES online modelling tool under development, which is due to be released in 2017. InVEST and LUCI are straightforward to use for those with basic GIS skills; the gathering of input data is often the most time-consuming step for application of either tool. The planned online ARIES tool is intended to be simple for new users; however, the development of customised models in ARIES and further new algorithms through its k.Lab technology requires a high degree of technical skill.
The InVEST water and nutrient retention models run at the grid-cell scale and summarise by sub-watershed/watershed, while the LUCI and ARIES models can provide information (e.g. flows, mass, concentrations) for every point in the landscape. The choice of modelling tool therefore depends on the required scale of the outputs. Also, as the key output for the InVEST nutrient retention model is annual load, both measured flow and nutrient concentration data are required for model validation, whereas the LUCI model outputs nutrient concentration as well as load per point. The differences between the InVEST and LUCI carbon stock maps (biomass + soil 30 cm depth) reflect differences in the modelling approach; LUCI calculates carbon based on soil type and aggregated land use, whereas InVEST uses only land use, but splits this into more categories. In theory, InVEST could also allocate carbon based on the soil-type and land use combination; however, this approach is not currently applied. As seen here, spatial variation in model output may cancel out at catchment scale, given that lookup table values are based on averages.
Table 4 Average annual N and P load (2000-2010), calculated using N as nitrate and P as orthophosphate concentrations, following Dunn et al., 2014,
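The contrast between a land-use-only carbon lookup and a soil-type by land-use lookup can be sketched in Python as follows. All carbon densities, class names and the six-cell landscape are hypothetical; the land-use-only values were deliberately set to the average of the soil-specific values over these cells, to show how per-cell differences can cancel in the catchment total.

```python
CELL_AREA_HA = 0.25  # one 50 m by 50 m grid-cell

# Hypothetical carbon densities (t C per ha), not values from this study.
CARBON_BY_LAND_USE = {"grassland": 175.0, "woodland": 230.0}
CARBON_BY_SOIL_AND_LAND_USE = {
    ("peat", "grassland"): 400.0,
    ("mineral", "grassland"): 100.0,
    ("mineral", "woodland"): 230.0,
}

# Each cell is a (soil type, land use) pair.
CELLS = [
    ("peat", "grassland"),
    ("mineral", "grassland"),
    ("mineral", "grassland"),
    ("mineral", "grassland"),
    ("mineral", "woodland"),
    ("mineral", "woodland"),
]


def carbon_map_by_land_use(cells):
    """Per-cell carbon stock (t C) from a land-use-only lookup."""
    return [CARBON_BY_LAND_USE[land_use] * CELL_AREA_HA for _, land_use in cells]


def carbon_map_by_soil_and_land_use(cells):
    """Per-cell carbon stock (t C) from a soil type by land use lookup."""
    return [CARBON_BY_SOIL_AND_LAND_USE[cell] * CELL_AREA_HA for cell in cells]


land_use_map = carbon_map_by_land_use(CELLS)
soil_map = carbon_map_by_soil_and_land_use(CELLS)
# The maps differ cell by cell, but both totals are 290.0 t C.
```

In this constructed case the peat cell is strongly underestimated by the land-use-only lookup and the mineral grassland cells are overestimated, yet the two totals agree.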
In terms of spatial allocation of demand, in InVEST one value for consumptive demand is ascribed to each land class, although this could vary greatly within the same land class. Variation in demand could be incorporated by defining additional land classes, but only to a very limited extent. The LUCI tool does not currently consider demand. The ARIES 'Flow & Use' module can explicitly model demand spatially, if local information is available.

Scenarios and trade-offs
As the sensitivity of the models to land-use change depended on the service, an assessment of different scenarios will depend on which services are being prioritised within the catchment. Bagstad et al. (2013b) found broadly similar gains and losses for each service (carbon, water yield and viewsheds) when comparing the impacts of land-use change scenarios using ARIES and InVEST. In the current study, the outcome of the GW scenario for the phosphorus nutrient retention model varied between InVEST and LUCI. This may be due to varying model assumptions on nutrient uptake/retention or slight differences in the default export coefficients used for each model. This highlights the importance of using input values that have been collected under conditions as similar as possible to those of the study site, and also demonstrates a need to be aware of and understand differences between default parameterisations for the models.
For the current study, small changes in water yield are mainly due to the small amount of evapotranspiration relative to precipitation, so that change in vegetation does not greatly affect the amount of water flowing downstream. The same models applied to hot and/or dry regions may be expected to provide differing results. The placement of land-use change is also important (e.g. Verhagen et al., 2016) and may affect the simulation of flood mitigation in LUCI due to influence of hydrological routing, and carbon modelling (ARIES and LUCI) due to influence of soil and other site specific factors. Both extent and placement in the landscape should be considered when designing land-use change scenarios, with ARIES and LUCI providing added value for use in assessing the impact of spatially explicit land-use change.
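The reasoning about evapotranspiration relative to precipitation can be illustrated with a simple annual water balance (yield = precipitation minus actual evapotranspiration). The Python sketch below uses hypothetical values in mm per year, not data from the study catchment, and applies the same absolute increase in evapotranspiration to a wet and a dry setting.

```python
def annual_water_yield(precip_mm, aet_mm):
    """Annual water yield (mm) from a simple water balance."""
    return precip_mm - aet_mm


def relative_yield_change(precip_mm, aet_before_mm, aet_after_mm):
    """Fractional change in yield when evapotranspiration changes."""
    before = annual_water_yield(precip_mm, aet_before_mm)
    after = annual_water_yield(precip_mm, aet_after_mm)
    return (after - before) / before


# Hypothetical land-use change raising evapotranspiration from 400 to 500 mm.
wet = relative_yield_change(2000.0, 400.0, 500.0)  # -0.0625, i.e. about -6%
dry = relative_yield_change(600.0, 400.0, 500.0)   # -0.5, i.e. -50%
```

Under these assumed values, the same vegetation change removes about 6% of the yield in the wet case but half of it in the dry case.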
Using LUCI's trade-off module, the variation between maps indicates that, depending on the services considered, appropriate placement of interventions and protective measures may differ significantly. This unique feature is particularly useful for stakeholder participatory exercises, allowing visualisation of the impacts that different scenarios could have on multiple services.

Ecosystem service provision in the study site
This comparison study also demonstrates how the use of a suite of modelling tools can deliver extensive information on service provision within a catchment. All of the tools emphasise the high carbon storage capacity of the catchment, especially the Migneint blanket bog to the south. Stream phosphorus concentrations are generally low, although the LUCI outputs suggest that any mitigation efforts should be targeted to the north-east of the catchment. LUCI output on mitigation services shows that large areas are already providing mitigation from nutrient runoff and overland flow but, due to placement of these features, a relatively small additional area receives benefit. The benefitting areas are mostly uplands in the west of the catchment, which have relatively low flow concentration and nutrient accumulation; hence current service provision may be considered somewhat limited compared to potential service provision. Analysis of flows using the ARIES models suggests a sustainable use of water resources in the catchment overall. However, a high proportion of water use relative to the total available (Llyn Cowlyd) may imply serious consequences for both ecosystems and users downstream. In particular, it could impact river habitats directly and ecological processes directly or indirectly (Dewson et al., 2007). On the users' side, overexploitation may hamper continuity of service provisioning (Ngigi et al., 2008). There may be room for increasing abstraction from some of the other reservoirs, in case of further demand, but that would likely involve trade-offs with other services.

Conclusions
Ecosystem services modelling tools can provide useful decision support outputs. While the three tools highlighted the key areas of service provision within the catchment, each has unique strengths. The choice of tool therefore depends on the study question and user requirements. Based on our experience of using these three ecosystem services tools, we outline the characteristics that we judged most useful for each tool (Table 6).
As InVEST is freely available, with detailed documentation and example data, it is recommended for users with time constraints. It is also the only one of the three tools with well-developed economic valuation models, so is recommended to those requiring economic valuation as an output. LUCI, available for public use in 2017, would benefit users seeking fine-scale outputs (for local or national scale applications) or requiring trade-off maps for multiple services. Once parameterised for international use, LUCI will be particularly well suited to explore impacts of detailed rural change. ARIES, with an easy-to-use online tool under development, currently allows the customisation of models and is particularly useful when data are scarce. Studies are beginning to assess the sensitivity of these modelling tools to the scale of input data (e.g. Grafius et al., 2016), and to the local or national relevance of input data compared with global default values (Redhead et al., 2016). Further work is required in both of these areas. There is still a lack of tools to map or quantify cultural ecosystem services. Although InVEST contains some tools such as 'Scenic Quality' and 'Recreation and Tourism', further development of this aspect of ecosystem service modelling is much needed. Similarly, the majority of modelling tools focus on the supply side or potential ecosystem service delivery (Tallis et al., 2012; Jones et al., 2016) and do not focus sufficiently on the beneficiaries. While ARIES incorporates 'Users' within its conceptual approach, much more work is required to develop tools which adequately incorporate spatial mapping of the demand side, i.e. to map where, and how much, services are actually used by beneficiaries. For future validation studies, it would be useful to compare modelling tools across multiple scales (e.g. sub-catchment to sub-continental) and also to further develop coefficients and look-up tables for a variety of climates and regions. A tool comparison including more diverse services, for example cultural services (viewsheds, recreation), would further inform users in their choice of tool.
Table 5 Testing the sensitivity of ARIES, InVEST and LUCI to land-use change (compared to a baseline of zero change). The first three scenarios represent 5%, 10% and 30% of the catchment changing from semi-natural grassland (G) to woodland (W). The DURESS 'Managed Ecosystems' scenario aims to maintain ecosystem integrity, focusing on management for carbon and biodiversity. Land-use change includes 2.3% increase in woodland and 16% decrease in pasture (improved grassland).