A comparison of three simple approaches to identify critical areas for runoff and dissolved reactive phosphorus losses

Diffuse phosphorus (P) losses are the main cause of eutrophication of surface waters in many regions. Implementing mitigation measures on critical source areas (CSAs) is seen to be the most effective way to reduce P losses. Thus, tools are needed that delineate CSAs on the basis of available data. We compared three models based on different approaches and sets of input data: the rainfall-runoff-phosphorus (RRP) model, the dominant runoff processes (DoRP) model, and the Sensitive Catchment Integrated Modeling Analysis Platform (SCIMAP). The RRP model is a parsimonious dynamic model using the topographic index and a binary soil classification to simulate discharge and P losses. The DoRP model distinguishes eight soil classes based on soil and geological maps. It does not account for topography when calculating runoff. SCIMAP assesses runoff risks solely on the basis of topography using the network index. Compared with surface runoff and soil moisture data available from a catchment in Switzerland, the RRP model and SCIMAP made better predictions than the DoRP model, suggesting that in our study area topography was more important for CSA delineation than soil data. Based on the results, we suggest improvements of SCIMAP to enable average risk predictions and the comparison of risk predictions between catchments.


Introduction
Diffuse phosphorus (P) losses from agricultural land continue to be a severe problem for water quality, causing eutrophication of many surface waters (Buda et al., 2012;Kleinman et al., 2011b;Schoumans et al., 2009). The European Union (EU) Water Framework Directive (WFD) calls for restoration of all water bodies to good quality by 2015 if possible, and 2027 at the latest (Hering et al., 2010). Since P losses from point sources, which can be identified and targeted rather easily, were substantially reduced over the last 2 decades (Dubrovsky et al., 2010), diffuse P losses from agricultural areas are now the main cause of eutrophication of water bodies in many countries (Carpenter et al., 1998;Sharpley et al., 1994). Thus, in order to meet the WFD goal, the focus now needs to be directed to measures by which these diffuse inputs can effectively be reduced.
A wide range of mitigation options were proposed and discussed within COST Action 869 (Schoumans et al., 2011). Some measures aim to prevent or intercept P transfer by runoff from fields to surface waters (e.g., buffer strips; Roberts et al., 2012); others focus on the immobilization of P sources in soil and manure by adding P sorbing agents (Buda et al., 2012).
Several studies reported that most P found in runoff at the catchment outlet originated from rather small areas within the catchment (Gburek and Sharpley, 1998; Pionke et al., 1997, 2000). Targeting these critical source areas (CSAs) seems to be the most efficient and cost-effective approach to mitigating the water pollution problem (Doody et al., 2012; Heathwaite et al., 2003; Rodriguez et al., 2011; Schulte et al., 2009). Kleinman et al. (2011b) reviewed source and transport factors that need to be considered when addressing diffuse P losses and CSA delineation. Distinguishing chronic or legacy sources and acute or temporary sources, the main P sources are pools of legacy P in soils due to excessive fertilizer application in the past and temporary increases in available P resulting from current applications of manure or fertilizer. Decaying crop residues can also play a role as a temporary P source (Kleinman et al., 2011b; Pote et al., 1999). While proper crop nutrition and soil management according to codes of best practices can reduce acute sources very effectively, sources of legacy P are more difficult to deal with as they can persist for a long time (Kleinman et al., 2011a). In earlier studies erosion and surface runoff were considered the main or only transport mechanisms carrying P to streams. More recently, other processes have been recognized to transport substantial amounts of P to surface waters (Doody et al., 2012; Kleinman et al., 2011b), such as subsurface flow (Kleinman et al., 2009) and tile drain flow (Stamm et al., 1998; Vadas et al., 2007).
Published by Copernicus Publications on behalf of the European Geosciences Union.
Given the diversity of sources and transport processes involved in diffuse P losses, the identification of CSAs in a catchment is a challenging task. Various tools have been developed to model diffuse P losses and to delineate CSAs (Krueger et al., 2012;Radcliffe et al., 2009;Schoumans et al., 2009), ranging from rather simple site-assessment tools to complex physically based catchment models. The former are usually easy to apply but are often too simplistic, whereas the latter are generally too complex for practical applications and usually require input data that are not easily available. As pointed out by various authors (Heathwaite et al., 2007;Radcliffe et al., 2009;Srinivasan and McDowell, 2007), there is a need for parsimonious process-based and easy-to-use but nonetheless sound and reliable models that are easy to parameterize and can assist water protection policy and agricultural P management in targeting effective and cost-efficient water protection and mitigation measures to the most critical areas.
Here we compare three tools that can be used to predict critical source areas in catchments based on different approaches and input data: the rainfall-runoff-phosphorus (RRP) model, the dominant runoff processes (DoRP) model, and the Sensitive Catchment Integrated Modeling Analysis Platform (SCIMAP). The RRP model is a dynamic, parsimonious model that simulates runoff and dissolved reactive P (DRP) losses from a catchment, using the topographic index λ proposed by Beven and Kirkby (1979) and a binary soil classification by drainage capacity to simulate a site's hydrological responses to rainfall events (Hahn et al., 2013;Lazzarotto, 2005). The DoRP model, which was proposed by Schmocker-Fackel et al. (2007), distinguishes eight soil classes by drainage and soil water storage capacities, based on soil and geological maps, but does not account for topography when calculating runoff. In contrast, SCIMAP, which was proposed by Lane et al. (2006), assesses runoff risks solely on the basis of topography, using the network index NI introduced by Lane et al. (2004). This network index is derived from the topographic index λ by correcting it for reduced topographic connectivity of locations that are not directly connected to a stream. RRP and SCIMAP provide a framework to combine these hydrological predictions with pollutant source data to identify CSAs. DoRP on the other hand solely comprises the hydrological part and does not originally provide a structure to combine the hydrological predictions with pollutant source data.
While the three models represent very different approaches, their performance has never been tested in direct comparison. Our hypothesis here was that we can get useful information from such a comparison not just about the specific models but also on the underlying general approaches. Thus, we tested the three approaches by applying them to two catchments located in a grassland-dominated hilly region of the Swiss Plateau. First, we investigated the importance of topography relative to soil properties in defining runoff risks by comparing the ability of the RRP and DoRP models to reproduce observed runoff patterns. We then compared the RRP model to SCIMAP to test the impact of applying a simpler, time-integrated model. While Lane et al. (2009) also evaluated the network index using a dynamic model, they suggest that it is necessary to further investigate the index's potential with regards to the duration of integration (monthly, yearly). We therefore focus on that aspect during our assessment of SCIMAP. We used the RRP model as a reference because it is the most comprehensive of the three models, and it had already been validated against data from the catchment outlets as well as from within the experimental catchments, including observations on soil moisture, runoff generation and groundwater levels. For a detailed presentation and discussion of the validation of RRP, readers are referred to Hahn et al. (2013) and Lazzarotto (2005).
By comparing the three approaches (RRP, DoRP, SCIMAP), we investigate the importance of connectivity and detailed soil data for the delineation of CSA and identify the benefits of a parsimonious dynamic model using soil and topographical information. In addition, we assess the performance of SCIMAP regarding the duration of integration (storm event, yearly) and the comparability of SCIMAP results between catchments.

The rainfall-runoff-phosphorus (RRP) model
The RRP model is a parsimonious dynamic model developed by Lazzarotto (2005) to predict dissolved reactive P (DRP) losses from small agricultural catchments. The model consists of a hydrological and a P submodel. For this study we used version 2 of the model, which has been developed and described in detail by Hahn et al. (2013). The hydrological submodel (Lazzarotto et al., 2006), referred to here as rainfall-runoff model, is a semi-distributed hydrological model with an hourly time resolution that uses soil and topographic information to describe the dynamics of the system. It is based on the concept that areas with the same topographic index λ and the same soil drainage capacity have the same hydrological behavior. The topographic index λ (Beven and Kirkby, 1979; Kirkby, 1975) is an indicator of the wetness of the soil at a given location within a catchment and is determined by the upslope contributing area and the local slope of a location. Soils are divided into two drainage classes (well-drained and poorly drained soils), the ensembles of which form a poorly drained and a well-drained hydrological response unit (HRU), respectively. The discharge from each of these two HRUs is composed of a fast-flow and a slow-flow component. The former comprises all kinds of quickly responding flow, including preferential flow, saturation excess runoff and Hortonian overland flow. The slow-flow component is an approximation of the baseflow. The model was simultaneously calibrated (uniform Monte Carlo method) on discharge data from four catchments draining into Lake Sempach (Hahn et al., 2013; Lazzarotto et al., 2006). The four catchments varied in soil composition and hydrological responses. The model parameters were determined by repeated random sampling from a uniform prior distribution within the range of each parameter. The performance of each parameter combination was assessed by comparing simulated discharge with measured discharge in the four catchments.
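The grouping concept described above can be illustrated with a minimal sketch. It assumes the usual form of the Beven and Kirkby (1979) index, λ = ln(a / tan β), where a is the upslope contributing area per unit contour length and β the local slope. The function names, the bin width of one λ unit and the class labels are illustrative assumptions and are not taken from the RRP code.

```python
import math


def topographic_index(area_per_contour_m, slope_rad):
    """Topographic index lambda = ln(a / tan(beta)).

    area_per_contour_m: upslope contributing area per unit contour length (m)
    slope_rad: local slope angle in radians (must be > 0)
    """
    return math.log(area_per_contour_m / math.tan(slope_rad))


def response_class(lam, poorly_drained, bin_width=1.0):
    """Group cells that are assumed to behave alike hydrologically.

    Cells sharing the same lambda bin and the same drainage class get the
    same key. The bin width is a hypothetical choice for illustration.
    """
    drainage = "poor" if poorly_drained else "well"
    return (math.floor(lam / bin_width), drainage)


# Example: a gentle, strongly convergent cell versus a steep ridge cell.
wet_cell = topographic_index(1000.0, math.atan(0.02))
dry_cell = topographic_index(20.0, math.atan(0.30))
```

Larger contributing areas and flatter slopes both raise λ, which is why valley-bottom cells end up in the wettest classes.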
Although not very long, the calibration period covered a wide range of stream flow conditions, which is more important than its actual length for obtaining reliable results (Gupta and Sorooshian, 1985; Lazzarotto et al., 2006; Yapo et al., 1996). Using the modified Nash-Sutcliffe criterion, NSC (Nash and Sutcliffe, 1970), as defined by Lazzarotto et al. (2006), and an NSC threshold value of 0.6, 724 parameter sets out of 5 million were judged behavioral and used for model application (Hahn et al., 2013). This generalized likelihood uncertainty estimation (GLUE) approach produced parameter values that gave very good predictions of the discharge for validation time periods as well as for a different catchment and thus was considered satisfactory. The P submodel assigns a constant P concentration, observed during baseflow (Lazzarotto, 2005), to baseflow runoff. For areas where fast flow occurs, P loads comprise incidental P losses from manure, P losses from topsoil, which might be enriched with P due to P application in excess of crop demands in the past (also called legacy P), and losses associated with baseflow.
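The selection of behavioral parameter sets can be sketched as follows. This is a simplified illustration, not the procedure of Hahn et al. (2013): it uses the standard Nash-Sutcliffe efficiency rather than the modified criterion of Lazzarotto et al. (2006), and the `model` callable, the parameter names and the sample size are hypothetical.

```python
import random


def nash_sutcliffe(observed, simulated):
    """Standard Nash-Sutcliffe efficiency (1 = perfect fit)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def select_behavioral(model, observed, param_ranges, n_samples,
                      threshold=0.6, seed=0):
    """Uniform Monte Carlo sampling with an efficiency threshold.

    model: callable mapping a parameter dict to a simulated series
    param_ranges: {name: (lower, upper)} bounds of the uniform priors
    Returns the (parameters, score) pairs that reach the threshold.
    """
    rng = random.Random(seed)
    behavioral = []
    for _ in range(n_samples):
        params = {name: rng.uniform(lo, hi)
                  for name, (lo, hi) in param_ranges.items()}
        score = nash_sutcliffe(observed, model(params))
        if score >= threshold:
            behavioral.append((params, score))
    return behavioral
```

Keeping all parameter sets above the threshold, rather than only the best one, is what later allows risk to be expressed as the fraction of accepted sets that predict fast flow.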

The dominant runoff processes (DoRP) assessment scheme
In contrast to the simplification in the RRP model regarding the soil data, the GIS-based DoRP model presented by Schmocker-Fackel et al. (2007) classifies the soil hydrology in a more differentiated way based on several soil attributes contained in a standard soil map and information about the parent material obtained from geological maps. The approach is based on the study of Scherrer and Naef (2003), who developed decision schemes to determine the dominant runoff process (DoRP) of a soil profile. Schmocker-Fackel et al. (2007) simplified these schemes to reduce data requirements and to enable automatic GIS-based mapping of DoRPs at catchment scale. The DoRP maps were used for flood discharge simulations, but they can also be used to determine risk areas for pesticide or P losses and erosion. The DoRP model distinguishes between the following runoff processes: Hortonian overland flow (HOF), saturated overland flow (SOF), fast subsurface flow (SSF), and deep percolation (DP). For SOF and SSF to occur, an impermeable layer in the soil profile is required. According to the drainable porosity of a profile, three classes of soil water storage capacity are distinguished: STO1, very low (0-40 mm); STO2, medium (40-100 mm); and STO3, large (100-200 mm), where the numbers in parentheses give the total volume of drainable porosity per unit area above the impermeable layer (in mm). Schmocker-Fackel et al. (2007) used a soil map, a geological map, a forest map and a land-use map to determine the occurrence of DP and HOF and to determine the storage classes STO1, STO2, and STO3. Apart from roads and other artificially sealed areas, HOF is assumed to occur on soils with very low infiltration capacity, e.g., due to soil surface sealing, water repellency or compaction, at very high rainfall intensity (> 50 mm h⁻¹). At lower intensities, these areas are assigned to another runoff process. Information on slope is needed to differentiate between SOF and SSF.
SOF was observed to occur below a slope of 15 %, and SSF on slopes above 15 % (Schmocker-Fackel et al., 2007). If drainage data are available, SSF can be further differentiated into drainage flow and natural SSF. Following the terminology of Schmocker-Fackel et al. (2007), SOF and SSF runoff from soil with storage capacity STOi (i = 1, 2, 3) is denoted as SOFi and SSFi, respectively.
Areas dominated by DP are assumed to generate no runoff; instead, rain falling on these areas infiltrates and is delivered to groundwater. As STO1 areas are saturated much faster than STO2 areas, and these in turn faster than STO3 areas under the same initial conditions, the risk of runoff in a given event is highest on areas with HOF, followed by STO1, STO2 and STO3 areas. Furthermore, SSF areas are assumed to drain faster than SOF areas. Following Schmocker-Fackel et al. (2007) we used storage capacities of 5 mm for HOF and 20, 70 and 150 mm for STO1, STO2 and STO3 areas, respectively. While the DoRP classification of soils gives a qualitative assessment of the hydrological behavior of a given location, the DoRP model also includes a simple bucket model to make quantitative discharge predictions (Schmocker-Fackel et al., 2007). Neglecting lateral flow, runoff is generated as soon as rainfall exceeds the soil's storage capacity at a given location. To account for antecedent soil moisture conditions, all rain water is stored for 5 days in the soil, if storage capacity is available, and is then removed.
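The bucket concept can be sketched in a few lines. The storage capacities are those given above; the daily time step, the function name and the way the 5-day retention is tracked are assumptions made for illustration and may differ from the implementation of Schmocker-Fackel et al. (2007).

```python
from collections import deque

# Storage capacities (mm) used in this study for each DoRP class.
STORAGE_CAPACITY_MM = {"HOF": 5.0, "STO1": 20.0, "STO2": 70.0, "STO3": 150.0}


def bucket_runoff(daily_rain_mm, capacity_mm, retention_days=5):
    """Runoff (mm per day) from a single bucket without lateral flow.

    Rain fills the free storage; anything beyond the capacity runs off.
    Water stored on a given day is removed again after retention_days.
    """
    stored = deque()  # infiltrated amounts of the most recent days
    runoff = []
    for rain in daily_rain_mm:
        if len(stored) == retention_days:
            stored.popleft()  # water stored retention_days ago is removed
        free = max(capacity_mm - sum(stored), 0.0)
        infiltrated = min(rain, free)
        stored.append(infiltrated)
        runoff.append(rain - infiltrated)
    return runoff


rain = [15.0, 10.0, 0.0, 0.0, 0.0, 0.0, 30.0]
sto1 = bucket_runoff(rain, STORAGE_CAPACITY_MM["STO1"])
sto3 = bucket_runoff(rain, STORAGE_CAPACITY_MM["STO3"])
```

In this example the STO1 bucket spills on the second day and again during the last storm, whereas the STO3 bucket absorbs all rain. DP areas are left out because they are assumed to generate no runoff.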

SCIMAP
The Sensitive Catchment Integrated Modeling Analysis Platform (SCIMAP) estimates the risk of pollutant loss from a location x to a stream or other receiving water body. This locational risk $p^{gc}_x$ indicates whether a pollutant is both available at a location and can be delivered to the water body. It is calculated by multiplying two relative site indicators, the availability risk at point x ($p^{g}_x$) and the connection risk ($p^{c}_x$), the risk that it can be delivered to a water body:
$$p^{gc}_x = p^{g}_x \, p^{c}_x$$
In this study, which focuses on DRP losses, the availability risk is defined by the P availability (source factor). The connection risk is the likelihood that runoff is generated and delivered to a stream (transport factor). The availability risk for dissolved P losses can be related to the concentration of legacy P in the soil, which is a permanent P source. Given that the relationship between water extractable P concentrations (WSPs) in the soil and DRP in runoff was found to be linear in many studies (Hahn et al., 2012; Vadas et al., 2005), we normalized the WSP data by the maximum WSP value.
In line with the original SCIMAP concept, this normalization was carried out for each catchment separately. To allow for comparisons across catchments, we also modified this normalization procedure to account for the value range in both catchments. Following Reaney et al. (2011), the connection risk was derived from the network index (NI), which is a connectivity index equivalent to the topographic index λ but corrected for locations of reduced connectivity (Lane et al., 2004). Although cells with a higher λ are more likely to be saturated than cells with a lower λ, water from these cells might not reach the stream network if drier cells along the flow path enable infiltration. Assuming that the contribution of an upslope location to stream runoff is limited by the lowest λ value further downslope along the flow path, the NI assigns the lowest λ value found further downslope along that flow path to all cells further upslope that otherwise would have a higher λ value. The NI then needs to be related to $p^{c}_x$. Based on a comparison with a distributed hydrological model (Lane et al., 2009), Reaney et al. (2011) suggested a $p^{c}_x$ of 0 for NI values below their 5 % quantile and 1 for values above the 95 % quantile, with a linear relationship between $p^{c}_x$ and NI in between (i.e., from $p^{c}_x$ = 0 at the 5 % quantile to $p^{c}_x$ = 1 at the 95 % quantile). Again, this normalization does not allow comparisons across catchments. We modified this procedure by deriving quantiles for the pooled data sets of both catchments.
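The two steps described above, deriving NI from λ along a flow path and scaling NI to a connection risk, can be sketched as follows. This is an illustration under simplifying assumptions, not the SCIMAP code: a flow path is represented as a plain list of λ values ordered from the upslope cell to the stream, quantiles are obtained by linear interpolation, and all function names are hypothetical.

```python
def network_index(lambdas_along_path):
    """Network index for the cells of one flow path.

    lambdas_along_path: topographic index values ordered from the most
    upslope cell to the cell at the stream. Each cell receives the lowest
    value found at or downslope of its own position.
    """
    ni = []
    running_min = float("inf")
    for lam in reversed(lambdas_along_path):
        running_min = min(running_min, lam)
        ni.append(running_min)
    ni.reverse()
    return ni


def quantile(values, q):
    """Quantile by linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def connection_risk(ni_values, low_q=0.05, high_q=0.95):
    """Scale NI linearly from 0 at the 5 % to 1 at the 95 % quantile.

    Assumes that the two quantiles differ.
    """
    lo, hi = quantile(ni_values, low_q), quantile(ni_values, high_q)
    return [min(1.0, max(0.0, (v - lo) / (hi - lo))) for v in ni_values]


def locational_risk(wsp_values, ni_values):
    """Availability risk (WSP / max WSP) times connection risk."""
    max_wsp = max(wsp_values)
    conn = connection_risk(ni_values)
    return [(w / max_wsp) * c for w, c in zip(wsp_values, conn)]
```

For a path with λ values 12, 7, 9 and 10 from hillslope to stream, the wet upslope cell is capped at 7 by the drier cell below it, while the remaining cells keep their own values. Passing pooled NI values from both catchments to `connection_risk` corresponds to the modified, cross-catchment normalization.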

Spatial comparison of model results
Weighted kappa (Cohen, 1968) was used to compare spatial risk predictions. To calculate kappa, the model results of SCIMAP and RRP were rescaled and grouped. For this purpose, the results obtained with each of the two models were divided by the respective maximum value and then grouped as follows: locations with values between 0 and 0.2 were considered to be at low risk, those with values between 0.2 and 0.5 at medium risk, those with values between 0.5 and 0.8 at high risk, and those with values between 0.8 and 1 at very high risk. Weighted kappa was calculated using R (R Development Core Team, 2007) and the psy package.
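The rescaling, grouping and agreement calculation can be sketched as follows. The analysis itself was carried out in R with the psy package; this Python version is only an illustration. The weighting scheme (linear by default, squared with `power=2`) and the treatment of values falling exactly on a class boundary are assumptions, since neither is specified above.

```python
RISK_BOUNDS = (0.2, 0.5, 0.8)  # upper limits of low, medium and high risk


def risk_class(value, maximum):
    """Class index 0 (low) to 3 (very high) after division by the maximum."""
    scaled = value / maximum
    return sum(scaled > bound for bound in RISK_BOUNDS)


def weighted_kappa(classes_a, classes_b, n_classes=4, power=1):
    """Cohen's weighted kappa for two equally long lists of class indices.

    Disagreement weights are (|i - j| / (n_classes - 1)) ** power.
    """
    n = len(classes_a)
    observed = [[0.0] * n_classes for _ in range(n_classes)]
    for a, b in zip(classes_a, classes_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(n_classes)]
    col = [sum(observed[i][j] for i in range(n_classes))
           for j in range(n_classes)]
    disagree_obs = 0.0
    disagree_exp = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (abs(i - j) / (n_classes - 1)) ** power
            disagree_obs += weight * observed[i][j]
            disagree_exp += weight * row[i] * col[j]
    return 1.0 - disagree_obs / disagree_exp
```

With weighted kappa, a pixel classified as high risk by one model and very high by the other counts as a smaller disagreement than a pixel classified as low by one and very high by the other.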

Study sites
We applied the three approaches to the Lippenrütibach catchment (3.3 km²) and the Stägbach catchment (8.24 km²), which are both situated in the hilly area of the Swiss Plateau northwest of Lucerne (Fig. 1). The former drains into Lake Sempach, whereas the latter drains into Lake Baldegg. Both lakes have a legacy of eutrophication and are artificially aerated to avoid oxygen depletion in the lower part of the water column. The region is characterized by intensive livestock production (dairy and pig farms, 2.4 livestock units per ha; Herzog, 2005) and intensively manured permanent grassland (4 to 6 cuts and manure applications per year). Soil P levels are elevated due to a legacy of P inputs in excess of crop demands. The average annual precipitation ranges from 1000 to 1200 mm. The catchments are characterized by hilly terrain that undulates between altitudes of 500 and 800 m a.s.l. (above sea level). The landscape shows strong molding by the Würm glaciation. The parent material varies between upper freshwater molasse and moraines (Peyer et al., 1983). The soils are predominantly Eutric and Gleyic Cambisols with loamy texture (Peyer et al., 1983; AGBA, 1993). Less than 10 % of the two catchment areas is covered by settlements, roads and other constructed features. Forests account for 17 % of the Lippenrütibach catchment and 8 % of the Stägbach catchment. The remaining area is used for agriculture.
The Lippenrütibach catchment (LIP) is one of the four catchments that were used for the calibration of the RRP model, using discharge data and DRP analyses of runoff collected from 7 until 17 July 2000. In the present study we applied the model to the period from March until November 1999, for which precipitation, ET, discharge, and manure application data were available that had not been used for model calibration. For the Stägbach catchment (Stäg), we used data collected in 2010 to run the RRP model and to validate model results. In addition to discharge and DRP measurements at the catchment outlet, permanent soil moisture measurements at four locations (S1 to S4) as well as groundwater level and overland flow detector (OFD) measurements at 10 stations (S1 to S10) were available for validation. For more details the reader is referred to Hahn et al. (2013).

RRP
The RRP model predicted the discharge dynamics at the outlet of the Lippenrütibach and the Stägbach catchment fairly well, with median NSC values of 0.5 and 0.62, respectively (Hahn et al., 2013). To characterize the uncertainty of fast-flow predictions, we constructed maps showing the fraction of the accepted parameter sets that predicted fast flow for each pixel at a given time (Fig. 2a). Values between 0 and 0.2 are considered to indicate a low risk, values between 0.2 and 0.5 a medium risk, values between 0.5 and 0.8 a high risk, and values between 0.8 and 1 a very high risk of fast flow. Thus, if for example more than 80 % of the model realizations predicted fast-flow generation for a pixel, that pixel was assigned to the very high risk class. For both catchments the percentage of high-risk areas increased with soil moisture (Table 1). The percentage of the respective agricultural areas with very high predicted runoff risk varied between 9 % for small events and 27 % for large events in the Lippenrütibach catchment, and between 7 and 16 % in the Stägbach catchment. Runoff measurements from OFDs installed at different locations within the Stägbach catchment were in good agreement with these predictions. For example, runoff was collected at five out of seven locations during the large event in June 2010, but at only two locations during the small event in May. The RRP model was able to capture the temporal soil moisture and runoff patterns in three out of the four locations where soil moisture was monitored continuously at 10 and 30 cm depth and runoff was collected using OFDs. Only for one site (S4) did the RRP model underestimate runoff risks (Hahn et al., 2013).
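The construction of such risk maps can be illustrated with a short sketch. The data layout (one list of per-pixel booleans for each accepted parameter set) and the function names are assumptions made for this example, not the structure of the RRP code.

```python
def fast_flow_fraction(predictions):
    """Fraction of accepted parameter sets predicting fast flow per pixel.

    predictions: one list per accepted parameter set, each holding one
    boolean per pixel (True = fast flow predicted at the given time).
    """
    n_sets = len(predictions)
    n_pixels = len(predictions[0])
    return [sum(run[p] for run in predictions) / n_sets
            for p in range(n_pixels)]


def risk_label(fraction):
    """Map a fraction to the four risk classes used in the maps."""
    if fraction > 0.8:
        return "very high"
    if fraction > 0.5:
        return "high"
    if fraction > 0.2:
        return "medium"
    return "low"


# Hypothetical example: five accepted parameter sets, three pixels.
runs = [
    [True, False, True],
    [True, False, False],
    [True, False, True],
    [True, False, True],
    [True, True, False],
]
labels = [risk_label(f) for f in fast_flow_fraction(runs)]
```

In this toy example all five sets agree on fast flow for the first pixel, only one does for the second, and three do for the third.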

DoRP
Discharge predictions of the DoRP model are based on an extremely simple model concept of a limited storage volume characteristic for each soil type (see above). Despite its simplicity and the lack of any calibration, the predicted discharge volumes for different events correlated reasonably well with the measured values. However, the slope deviates from the 1-to-1 line in both catchments with an underestimation for the Stägbach catchment and an overestimation for the Lippenrütibach (Fig. 3). In contrast to the RRP model, DoRP predicts discharge only in direct response to rainfall events, and thus, due to the small number of events, no reliable comparison with discharge measurements was possible for the Stägbach catchment. For the Lippenrütibach catchment more runoff events were available to compare DoRP predictions with measurements. The model thus proved to be useful to predict the discharge of an event relative to discharge of other events from the same catchment, but it cannot be used to predict discharge from another catchment without re-calibration. This is in contrast to the RRP model that yielded satisfactory results without site-specific calibration.
Given the large percentages of STO2 areas in the two catchments (Table 2), the runoff contributing area predicted by the DoRP model increased sharply as soon as rainfall plus antecedent moisture exceeded the storage capacity of this class of soils. The difference between simulated and measured discharge increased with increasing event size (Fig. 3) suggesting that either storage capacities need to be calibrated or more differentiation between locations in STO2 is needed.
Saturated overland flow from soils of storage class 2 (SOF2) was the most prevalent runoff category in both catchments (Fig. 2b). Three of the four permanent measurement stations (S2, S3 and S4) in the Stägbach catchment were situated on soils of this category, while S1 was classified as DP. Indeed, S1 was the only station where surface runoff was never detected during the whole monitoring period and where the groundwater table never rose above a level of 0.5 m below the soil surface. Thus, in contrast to the RRP model, the DoRP model differentiated correctly between S1 and S4, but not between S2, S3 and S4, although S2, S3 and S4 also showed substantial differences in their soil moisture and runoff regimes (Hahn et al., 2013). The fact that S2, S3 and S4 were in line with RRP predictions suggests that topography was in general a better predictor of the runoff responsiveness of a location to rainfall events than hydraulic soil properties.

Comparison of RRP and DoRP
The highest observed P losses occurred during the largest event in each of the two catchments during the study period. Thus, we used these two events (Fig. 1) to compare the DoRP assessment of the soils of the two catchments with risk classes determined by the RRP model. In general, the extreme risk classes (very high or very low) agree well between the two approaches. On the one hand, very high fast-flow risks according to the RRP model were strongly associated with soils of low (STO1) to medium (STO2) storage capacity (Table 3). On the other hand, the DoRP storage classes DP and STO3 primarily fell into the categories of low fast-flow risks according to RRP (Table 4). The intermediate categories in each approach were distributed across the classes of the other approach in a much more homogeneous manner. This holds true especially for the DoRP storage class STO2, which contains for example 45.6 % of RRP low-risk class areas but also 29.7 % of very high risk during the largest runoff event in the Lippenrütibach catchment (Table 4). The STO2 class dominates in both study areas, which explains why DoRP allows for little spatial differentiation. One reason for the relatively good match between the two models regarding the extreme risk classes is that all STO1 soils were classified as poorly drained soils in the RRP model and that locations with low soil water storage capacity tended to have large λ values (Fig. 4). In line with this observation, the Kruskal-Wallis test, which was used here because of outliers and non-normal distribution, revealed that the mean topographic index (λ) values of the storage classes were significantly different. These relationships can be understood as an expression of the dependence of soil formation processes on topography and position within the landscape.

Comparison of RRP and SCIMAP
In contrast to the RRP and the DoRP model, SCIMAP is a time-integrated rather than a dynamic model and therefore cannot be used to predict discharge. However, it is the only one of the three approaches that accounts for connectivity when identifying runoff risks within a catchment. This aspect does not seem to play a major role in our study areas: a comparison between the topographic index λ and the network index NI reveals only minor differences (see Fig. S1 in the Supplement). Accordingly, it is not surprising that the spatial NI patterns are similar to the RRP risk classes, as can be seen in Fig. 2 for the largest event during the respective monitoring periods for the two study areas. Figure 5a shows that there were close relationships between the NI and the risk of fast flow predicted by the RRP model. The scatter is due to pixels where the topographic index λ differs from the network index NI. For such locations the risk might be overestimated by the RRP model, unless there is a direct connection to a stream that does not directly depend on topography, e.g., a connection via tile drains. In Fig. 5b these pixels were not displayed to avoid the influence of connectivity when comparing the RRP hydrological risk with the SCIMAP connection risk.
As should be expected, well-drained locations were predicted by the RRP model to have lower fast-flow risk than poorly drained locations with the same NI (Fig. 5). For the Lippenrütibach catchment this observation was even more pronounced.
Because SCIMAP yields a static risk assessment while RRP predicts the risk of fast flow as a function of time, it is useful to compare the SCIMAP prediction with RRP predictions for discharge events of different magnitude (Fig. 5b). For small events, the SCIMAP risk predictions are systematically larger than those resulting from RRP (all RRP results lie below the 1 : 1 line in Fig. 5b). For large events, the situation is more complex. According to the RRP model, a considerable areal fraction has a higher risk than according to SCIMAP. On well-drained soils this relationship turns to the opposite for RRP low-risk areas. For poorly drained soils, RRP yields systematically higher risk values than SCIMAP. The average RRP risk over the monitoring period does not show a risk of 1 because it is an average of storm flow events and low flow. Even after rescaling the RRP risk to range between 0 and 1 to enable a fair comparison, the average RRP risk predictions are always lower than the SCIMAP risk predictions. The average risk integrates periods of large and small storm flows as well as very low runoff events and therefore continues to increase until quite high NI values are reached (Fig. 5a). At a SCIMAP connection risk of 1, the average RRP risk ranges between 0.2 and 0.6 (Fig. 5b). This is because the scaling between the 5 and 95 % quantiles does not enable a fine enough differentiation between high NI values. Therefore, SCIMAP differentiates in space less than the dynamic RRP model. Hence, SCIMAP agrees better with RRP predictions for larger events.
Hydrol. Earth Syst. Sci., 18, 2975-2991, 2014 www.hydrol-earth-syst-sci.net/18/2975/2014/

Comparison of all three model predictions
Despite the differences between RRP and SCIMAP described above, their spatial predictions of areas prone to fast-flow processes and hence to DRP losses are much more similar to each other than to the DoRP risk predictions (Fig. 2). Many topographic features where RRP and SCIMAP predict a high risk for runoff generation are not reflected at all in the DoRP model. Despite a more refined soil classification, such features may get lost in the DoRP classification. Interestingly, this is not always the case. The two SOF1 areas (DoRP) in the Stägbach catchment, for example, appear very clearly in both RRP and SCIMAP. In general, however, DoRP seems rather coarse in its classification compared to the differentiation indicated by the other two models. Interestingly, the incorporation of soil information, which is part of the RRP model concept, caused very little change in spatial patterns relative to SCIMAP, which is solely based on topographic information. Accounting for soil drainage classes in the RRP approach resulted in only two small differences between Fig. 2a and c, observed in the southwestern part of LIP and the northeastern part of Stäg. Accounting for connectivity (SCIMAP) had little influence on the spatial pattern of predicted hydrological risk areas relative to those predicted based on the original topographic index (Fig. 2a and c).

Comparison of RRP and SCIMAP
As shown by Hahn et al. (2013), the RRP model produced good predictions of P loads at the outlet of the Stägbach catchment and one of its subcatchments. Spatial RRP predictions of runoff risks were also in good (qualitative) agreement with local measurements of soil moisture, groundwater levels and surface runoff. This gives confidence that the RRP model reflects the main processes and is a valid tool to delineate CSA for P exports into the streams of the study catchments. Given the good performance of the RRP model, it is interesting that SCIMAP predicted similar areas with high risk of DRP loss and that this agreement appears to be better for high-runoff events (kappa Stäg: 0.54; kappa LIP: 0.68) than for the average DRP load during the simulation period (kappa Stäg: 0.26; kappa LIP: 0.29; see also Figs. 6 and 7, and Table 5). Thus SCIMAP appears to have good potential to be used at least as a first screening tool for the identification of critical source areas.

Relationship between NI and the connection risk used in SCIMAP
The original SCIMAP model prescribes a static linear relationship between NI and the connection risk $p^c_x$ from 0 at the 5 % NI quantile to 1 at the 95 % quantile. This relationship is considered time invariant, and it is based on the assumption that the least connected 5 % fraction of a catchment never connects, while the most connected 5 % fraction always connects to a stream. This approach has three major limitations. Firstly, the comparison with the RRP model shown above suggests that the relationship between NI and connection risk is not time invariant but that SCIMAP predictions mainly reflect larger events in our study areas. Secondly, the assumption that 5 % of the catchment is always connected and 5 % is never connected makes the method insensitive to these parts of the catchment. Assuming that areas with very low NI values never connect is reasonable for single runoff events and probably also for most monitoring periods. Assuming that areas with the highest 5 % of NI values always connect is appropriate for large events, but not necessarily for aggregated risks over a period of time or for small events (Fig. 5b). This can be seen in Fig. 7a and c, which show a considerable scatter for SCIMAP locational risks. The scatter was much less when the 5 % assumption was relaxed and the connection risk assumed to scale linearly with NI up to its maximum value (Fig. 8), accounting for the fact that there were significant differences in connectivity even within the most connected 5 % fraction of our catchment. While areas close to the catchment outlet characterized by very high λ and NI values frequently contributed to runoff according to the RRP model, even during very small events, areas further upstream, where the λ and NI values were lower but still within the top 5 %, contributed runoff much less frequently. Extending the linear NI/risk scaling up to the maximum NI enabled differentiation between these areas.
A third major limitation of the original SCIMAP approach is that by normalizing the generation risk and NI values between zero and one the model can predict risks at a given location only relative to the risks at other locations within the same catchment. To enable comparisons between different catchments, we normalized the generation risk (source factor) and delivery risk (transport factor) by setting a common upper limit for all catchments. For the source factor we simply used the maximum value of all catchments for this purpose. For the transport factor it was less straightforward. The highest NI value ($\mathrm{NI_{max}}$) of the two catchments studied here was 20, and the lowest ($\mathrm{NI_{min}}$) was 4.7. The 5 % quantile of all NI values was 6, and based on our RRP model predictions the runoff risk of cells with NI values lower than 6 can be neglected. Thus, we set the transport factor to 0 at NI ≤ 5 % quantile and to 1 at NI ≥ $\mathrm{NI_{max}}$ and to vary linearly with NI between these limits. The locational risk calculated with these "globally" scaled source and transport factors ranged between 0 and 0.4.

Hydrol. Earth Syst. Sci., 18, 2975-2991, 2014 www.hydrol-earth-syst-sci.net/18/2975/2014/

Figure 7. Comparison of the P load calculated with the RRP model, which differentiates between well-drained and poorly drained hydrological response units ($\mathrm{HRU_{well}}$, $\mathrm{HRU_{poor}}$), for the entire monitoring period and the highest runoff event with the locational risk for P losses predicted using the SCIMAP approach.

[Figure panels: Lippenrütibach catchment, Stägbach catchment; legend: $\mathrm{HRU_{well}}$, $\mathrm{HRU_{poor}}$]

Figure 8. Comparison between the average DRP loads for the whole monitoring periods predicted by the RRP model for both hydrological response units ($\mathrm{HRU_{well}}$, $\mathrm{HRU_{poor}}$) and the global SCIMAP locational risk estimations. For the global locational risk we set the delivery risk to 0 at network index NI ≤ 6, the 5 % quantile, and to 1 at NI ≥ 20, the maximum NI of both catchments, and scaled linearly for NI in between. The source factor (generation risk) was normalized by dividing by the maximum WSP of all catchments. In our catchments the difference from the original locational risk resulted mainly from the changed delivery risk delineation.
Using the RRP results as a reference, the global locational risk was in better agreement with the average DRP loads over the whole monitoring period (kappa Stäg: 0.29; kappa LIP: 0.45; Fig. 8) than the original locational risks with catchment-specific normalization (kappa Stäg: 0.26; kappa LIP: 0.30; Fig. 7). Since the catchments had similar soil P status, this improvement can be attributed to the modified relationship between NI and delivery risk.
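The two NI scalings compared in this section can be written down compactly. The sketch below contrasts the original catchment-specific scaling (0 at the 5 % NI quantile, 1 at the 95 % quantile) with the global scaling (0 at NI ≤ 6, 1 at NI ≥ 20, the values reported for the two study catchments). The function names and the use of NumPy's default percentile interpolation are assumptions for illustration; they are not taken from the SCIMAP software.

```python
import numpy as np

def connection_risk_original(ni, lower_q=5.0, upper_q=95.0):
    """Catchment-specific scaling: 0 at the lower NI quantile, 1 at the upper one."""
    ni = np.asarray(ni, dtype=float)
    lo, hi = np.percentile(ni, [lower_q, upper_q])
    if hi == lo:
        raise ValueError("NI quantiles coincide; scaling is undefined")
    return np.clip((ni - lo) / (hi - lo), 0.0, 1.0)

def transport_factor_global(ni, ni_lower=6.0, ni_max=20.0):
    """Global scaling: 0 at NI <= ni_lower, 1 at NI >= ni_max, linear in between."""
    ni = np.asarray(ni, dtype=float)
    return np.clip((ni - ni_lower) / (ni_max - ni_lower), 0.0, 1.0)

# NI values spanning the range reported for the two catchments (4.7 to 20).
example_ni = np.array([4.7, 6.0, 13.0, 20.0])
global_risk = transport_factor_global(example_ni)

# Synthetic NI distribution (assumption) showing the saturation above the 95 % quantile.
synthetic_ni = np.linspace(0.0, 100.0, 101)
original_risk = connection_risk_original(synthetic_ni)
```

With the original scaling, every cell above the 95 % quantile receives a risk of 1, so differences within the most connected fraction disappear. The global scaling keeps those cells distinguishable up to the maximum NI.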

Discussion
The performance of the three models indicates that a large amount of useful hydrological information can be extracted for making predictions on P export risks and CSAs from widely available data sources (soil map, geological map, topography). The results further indicate, for our study area, that runoff generation and associated P export risks depend more on topography than on soil characteristics, when the information is solely extracted from conventional maps. Although neglecting topographic connectivity and using a rather crude binary soil classification by drainage capacity, the RRP model predicted spatial distributions of runoff risks that were in good (qualitative) agreement with local measurements of soil moisture, groundwater level and surface runoff (Hahn et al., 2013). The DoRP predictions showed surprisingly close relationships with measured discharges at the catchment outlets. However, the DoRP model could not sufficiently differentiate between locations, because large areas fell into the same runoff category (STO2), while there was substantial variation in runoff risks within this category due to topography, which was accounted for by the RRP model and SCIMAP. The SCIMAP predictions were very similar to the RRP simulations, indicating that there are not many areas with reduced topographic connectivity in the study catchments.
While the RRP model was found to be well suited to the identification of high-risk areas in our catchments, taking account of connectivity and differentiating between more soil types may be crucial in other types of catchments. This could be achieved most simply by combining predictions of the three models. The DoRP model can help to better distinguish between sites assigned to the same drainage class by the RRP model, as shown for the example of the measurement locations S1 and S4. Thus, in targeting mitigation measures to areas identified as CSAs using RRP, it may be worth checking for the runoff category assigned to these areas by the DoRP model, before priorities are determined.
SCIMAP, which accounts for topographic connectivity along flow pathways, can be used to identify areas whose connectivity with the stream may be overestimated by the other two models due to the inherent assumption of unrestricted connectivity. In case of a DP zone, it is unlikely that the area is connected to a stream or lake by an artificial drainage system, which means that the risk of P export is less than predicted by the latter models. In STO1 and STO2 zones, it may be necessary to check for the presence of a drainage network before correcting the risk prediction.

Connectivity
Many studies highlight the importance of accounting for connectivity when identifying CSAs (Doody et al., 2012; Doppler et al., 2012; Frey et al., 2009; Lane et al., 2009; Srinivasan and McDowell, 2007). However, connectivity has been defined in different ways that are relevant for the interpretation. The network index (NI) used in SCIMAP assumes that the connectivity along a flow line breaks down if the topographic index values λ along a flow line have a local minimum. Given the definition of λ, this implies that the connectivity is lost if a section of the flow path is steeper than the upslope part. Lane et al. (2009) illustrated the potential of the NI to generalize information about the hydrological connectivity between locations in areas where topography is the dominant factor. Another approach assumes that connectivity is only lost if water cannot flow further downstream because it is retained in a sink area. While internal sink areas in catchments are often filled by default by conventional GIS software, detailed analyses have revealed that internal sinks may be essential elements of catchments. This may be caused by natural processes in landscapes like the pothole region in North America or may be caused by anthropogenic influence on topography in the Swiss Plateau (Doppler et al., 2012; Frey et al., 2009).
In summary, the two concepts assume different mechanisms behind the loss of connectivity. The first concept (NI) assumes that re-infiltration due to an increased gradient prevents surface runoff downhill. The second approach assumes the opposite in that ponding occurs because a topographic barrier prevents any flow downhill on the soil surface. Which process is more relevant for any catchment depends on the specific conditions. The steeper the terrain, the less probable is the formation of sinks. This was the reason why the NI has been used in our study area, which is steeper than other parts of the Swiss Plateau where sinks may be very relevant. Due to its simplicity, the NI has a high potential to be widely applied, especially within the scope of the WFD, as shown for Ireland by Doody et al. (2012) and Wall et al. (2011). This potential is still limited by insufficient knowledge about the relationship between the NI and the probability that a site connects to the stream network of a catchment over time (Lane et al., 2009; Reaney et al., 2011). Here we related the NI to the runoff risk as predicted with the RRP model. These relationships contain information about the catchment responses during the monitoring period that can be used to refine the scaling of NI/connection-risk relationships used in SCIMAP.
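The NI concept can be illustrated with a small sketch. It assumes the standard topographic index λ = ln(a / tan β), with a the upslope contributing area per unit contour width and β the local slope, and it takes the NI of a cell as the lowest λ encountered along the flow path from that cell to the stream, as the index is commonly described. The restriction to a single, pre-extracted flow path and the function names are simplifying assumptions for illustration.

```python
import math

def topographic_index(area_per_width, slope_rad):
    """Topographic index lambda = ln(a / tan(beta)); slope given in radians."""
    return math.log(area_per_width / math.tan(slope_rad))

def network_index_along_path(lambdas):
    """NI for each cell on one flow path.

    `lambdas` is ordered from the most upslope cell to the cell at the stream.
    The NI of a cell is the minimum lambda from that cell down to the stream,
    so a steep (low-lambda) section limits the NI of every cell above it.
    """
    ni = []
    running_min = math.inf
    for lam in reversed(lambdas):
        running_min = min(running_min, lam)
        ni.append(running_min)
    ni.reverse()
    return ni

# Hypothetical flow path: wet upslope cell, steeper mid-slope section, wet valley floor.
path_lambda = [9.0, 7.0, 8.0, 12.0]
path_ni = network_index_along_path(path_lambda)
```

In this example the upslope cell has a high λ of 9.0, but its NI is only 7.0 because the steeper section below it has a lower λ. This is the re-infiltration mechanism assumed by the first concept. A sink-based approach would instead test whether water can leave each cell at all.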
The reasonable match between RRP and SCIMAP predictions of CSAs for larger events in our study catchments suggests that SCIMAP can be used as a screening tool for CSA delineation in catchments where CSAs are primarily determined by topography. This is of particular interest in regions where dynamic models cannot be applied due to a lack of discharge data. While SCIMAP was originally developed to predict relative risks for individual catchments, a modified version of SCIMAP can be used to compare P export risks among catchments and thus help to identify the most critical catchments within watersheds, given a homogeneous distribution of rainfall. However, the transformation of NI into a connection risk needs further refinement to appropriately describe the hydrological risk. If applicable, RRP simulations can be used for this purpose in addition to field data. The stepwise linear relationship with zero risk up to the 5 % NI quantile and a maximum risk level with no further change at the 95 % NI quantile, as proposed by Reaney et al. (2011), was found to be appropriate for storm events but not for ensembles of different events. Based on our results, we recommend a relationship with a linear increase from the 5 % quantile up to the maximum NI value in the latter case.

Limitations of the model approaches
In contrast to the other two models, SCIMAP is not a dynamic model and thus cannot be used to predict how discharge and associated P losses vary between rainfall events of different magnitudes. However, it can be very useful to identify areas with a high potential for P losses in regions where topography governs hydrology. The DoRP model in contrast is advantageous where the drainage properties of soils are the dominant factor. The RRP model is less refined than the DoRP model with respect to differentiation between soils and does not account for topographic connectivity. However, combining the most basic information on soil hydrology with the topographic information within a process-based framework seems to predict the relevant hydrological and DRP export processes in our study catchments well. The simplifications regarding the representation of the hydrologic processes in the RRP approach are an advantage in practical applications, but by necessity they also come with limitations.

Hydrological drivers
The generation of fast flow, which includes all kinds of quickly responding runoff in the RRP model, is bound to λ and thus is more likely in wet areas. In reality, however, infiltration excess runoff (IER) can also contribute significantly to the total runoff and transport DRP and PP to a stream (Doppler et al., 2012; Srinivasan et al., 2002), depending on rainfall patterns and soil properties. For certain high-resolution soil maps, it is possible to identify areas prone to HOF using the DoRP model (Schmocker-Fackel et al., 2007). Unfortunately, most soil maps do not contain sufficient information. In addition, one has to consider that relevant properties like the infiltration capacity of soils may strongly depend on management practices and hence vary in time. Srinivasan et al. (2002), for example, reported that the occurrence of IER during field experiments was scattered, disjunct and transient, which makes the prediction of areas prone to IER difficult. IER may also be due to water-repellent soil surfaces. While repellency is common for many land-use types with permanent vegetation cover in humid temperate climates (Doerr et al., 2006), we assume that it did not cause a lot of runoff that reached the stream network.
Another limitation of the approaches discussed here relates to the assumption that surface topography reflects the relevant gradients controlling water fluxes. If the topography of an impermeable layer differs significantly from surface topography, runoff may occur on unexpected areas. Like the models evaluated by Srinivasan and McDowell (2007), the models compared in our study do not enable users to recognize areas where subsurface flows can potentially emerge to the surface as seeps and springs on steep hillsides (Srinivasan and McDowell, 2007). Zheng et al. (2004) reported that particularly high P concentrations were associated with this type of runoff.
In principle, the RRP and DoRP models do not only account for surface but also for subsurface flow, which can also carry substantial amounts of P to a water body (Kleinman et al., 2011b). In practice, the difficulty is to account for artificial drainage systems, which can contribute substantially to DRP losses (Stamm et al., 1998, 2002; Watson and Matthews, 2008). Drains can connect areas that appear disconnected on the basis of topographic analysis, and thus the lack of drainage data can be a major problem for CSA determination. However, in this kind of landscape, surface runoff and tile drain flow are often not separate flow processes, but they may occur in sequence: flow may start as surface runoff and be intercepted by, e.g., macropores connected to tile drains (Stamm et al., 2002; Doppler et al., 2012). The drained area amounts to approximately 10 % of the agricultural area in the Lippenrütibach catchment and to around 15 % in the Stägbach catchment. Field inspections and measurements in the Stägbach catchment revealed that locations predicted to be wet were indeed wet and that surface runoff from some of these locations occurred although they were in close proximity to drains. Thus, our results show that, even in the presence of drainage systems, topography may still provide crucial information on runoff generation risks and CSA delineation. Because the combination of surface runoff and macropore flow to tile drains is part of the fastflow component of RRP, the influence of tile drains is accounted for during the calibration process. To assess the potential role of IER and artificial drainage systems, it may be helpful to complement available ground information through field visits and interviews of local farmers, as suggested by Frey et al. (2011). The models discussed here can be very useful in guiding the collection of such information.

Sources and types of P
This study focused on the prediction of P losses in the form of DRP, because DRP can immediately be taken up by algae (Dorioz et al., 2006; Sharpley, 1993; Sharpley et al., 1994) and thus represents the main risk factor for eutrophication (Kleinman et al., 2011b). However, it should be recognized that particulate P (PP) can also become bioavailable, given specific physical-chemical dynamics (Dorioz et al., 2006). Particulate P losses can be high, especially on arable land (Doody et al., 2012) or due to bank erosion. There are several models that simulate sediment and PP losses (Krueger et al., 2012). SCIMAP can also be adjusted to predict CSAs for sediment and thus PP export. However, the RRP model was developed for grassland-dominated catchments and solely simulates DRP losses. Phosphorus losses from vegetation (Kleinman et al., 2011b) and seasonal changes in P availability (Dorioz et al., 2006; Pote et al., 1999) were assumed to be negligible. Furthermore, freshly applied manure or other fertilizers can be a relevant source for P export (Withers et al., 2003). The RRP model is able to account for this source. However, as the application of manure can be easily controlled by appropriate management, we did not consider it for CSA delineation in this study.

Conclusion
The study demonstrates that a large amount of hydrological information needed for the prediction of P export risks and CSAs can be extracted from widely available data sources. The comparison of the predictions obtained with the three tools to each other and to the available experimental data indicates that the location and extent of CSAs within the study catchments was more dependent on topography than on variation in soil properties as they are represented in the DoRP model. SCIMAP predicted CSAs particularly well for large storm events. It may be especially useful for screening purposes, in particular in regions without discharge data. RRP simulations can be used to adapt the relationship between NI and connection risks used in SCIMAP to the particular characteristics of a catchment or region. The stepwise linear relationship with zero risk up to the 5 % NI quantile and a maximum risk level with no further change at the 95 % NI quantile, as proposed by Reaney et al. (2011), was found to be appropriate for storm events, while we recommend a relationship with a linear increase from the 5 % quantile up to the maximum NI value for a multitude of events. Two major problems in predicting P export risks and delineating CSAs in general are the difficulty of accounting for IER and the limited availability of data on tile drainage systems. To cope with these problems, we suggest that model predictions are complemented by ground information obtained from field visits and interviews of local farmers, as suggested by Frey et al. (2011). The models can be very useful in guiding the collection of such information.
The Supplement related to this article is available online at doi:10.5194/hess-18-2975-2014-supplement.