Tracers reveal limited influence of plantation forests on surface runoff in a UK natural flood management catchment

Studies of NFM have not


Introduction
There is increasing interest globally in the use of nature-based solutions for flood risk management and disaster risk reduction (EEA, 2017; World Bank, 2018). In the UK, one manifestation of nature-based solutions is a new wave of policies and projects in support of natural flood management (NFM) (Dadson et al., 2017; Kay et al., 2019; Lane, 2017). These policies promote a catchment-wide

Tracer-based hydrograph separation has a physical basis that is lacking in most standard hydrometric separation methods (Klaus and McDonnell, 2013). When applied at high resolution through storm events, tracers can reveal not just the proportions of different runoff components during events but also the timing of different components during individual event hydrographs (e.g. whether flood peaks are dominated by event rainfall). Finally, tracers can help in evaluating the co-benefits and risks of different types of interventions, for example by helping to investigate flow paths and residence times of water around interventions such as field buffer strips (Novara et al., 2013; Wilkinson et al., 2014).
We report tracer-based hydrograph separation (²H/¹⁸O and acid neutralising capacity (ANC)) in three neighbouring catchments within a wider 70 km² pilot NFM project in the UK uplands. Two of these catchments are paired, with similar soils, geology and topography but a ~50% difference in forest cover. The third catchment, with the lowest forest cover, has some differences in underlying characteristics, but is representative of the range of conditions in the wider NFM project. We test the null hypothesis that plantation forest cover has no influence on the source components of runoff. We then further investigate the relative influence of catchment and storm event characteristics on fractions of event and groundwater runoff.
The main questions addressed in the research were:
1. What is the effect of plantation forest cover on the fraction of event water runoff?
2. How do other catchment features (soils, geology and topography) impact the source components of runoff?
3. How do the event magnitude and intensity impact these differences?
Finally, we discuss the implications for NFM in the UK from these process findings.

Table 1
Summary of characteristics of the three catchments. 'Mixed bottom land' soil type was reclassified into Alluvial soils, and Lithosols into Brown soils for clarity. Data sources: topographic data - derived in ArcGIS from a 5 m × 5 m resolution digital terrain model (DTM) (Ordnance Survey, 2016); land cover - Scottish Borders Council survey (Medcalf and Williams, 2010); soils data - 1:25,000 soils map of Scotland (Soil Survey of Scotland Staff, 1970); geology - 1:25,000 geological map produced by the British Geological Survey (Auton, 2011).

Fig. 1. Maps of the study site. a) location of the Eddleston Water catchment (including TBR rain gauges and weather station locations) highlighting the three catchments (pink outline shows catchment boundaries) where event sampling was carried out (red arrow indicates stream flow direction); b) monitoring network in the three sub-catchments and sub-catchment topography (red arrows indicate stream flow direction). Automatic water samplers were located adjacent to stream gauges; c) sub-catchment land use; d) sub-catchment soil cover ('Mobol' is undifferentiated mixed bottom land); e) sub-catchment geology. R. sampler (seq): sequential rainfall sampler; TBR / S R.gauge: paired tipping bucket and storage rain gauges.

At the time of the study, forest cover figures in the Middle Burn and Shiplaw catchments, respectively, included ~25% and ~10% recently felled forest, with stumps left in the ground.

Study site and design
The research focussed on three small headwater catchments (2.4-3.1 km², Fig. 1, Table 1) in the Scottish Borders, UK, which are typical of many UK upland catchments and are monitored within the wider NFM project. The catchments have a large range in forest cover (94%, 41% and 1%), making them ideal for investigating the influence of forest cover on source components of flow. Comparisons between the neighbouring Shiplaw and Middle Burn catchments enabled a 'paired catchment' approach, given their similar catchment characteristics other than percentage forest cover. Comparisons between these catchments and the Longcote catchment allowed for insights into the relative role of other catchment characteristics on runoff components across the wider NFM catchment. We used hydrograph separation based on isotopes (²H and ¹⁸O) and ANC to compare time and geographic source components, respectively, of runoff in the three catchments during four high flow events. The wider Eddleston Water catchment (a sub-catchment of the River Tweed) is the focus of one of the UK's largest and longest-running NFM pilot sites, which aims to inform national water policy (under the EU Water Framework Directive and EU Floods Directive) (Tweed Forum, 2019; Werritty et al., 2010).
Forest cover was historically limited in most of the catchment during the 20th century, but conifer plantations (primarily Sitka spruce, Picea sitchensis) were established in the 1960s and 1970s, particularly in the west. The peaty soils were prepared for planting by deep ploughing to create drainage ditches. However, it was assumed that these had minimal impact in this study since they have not been maintained since at least the 1990s and have been actively blocked in some cases to follow more recent forest management guidelines. Approximately 94% of Middle Burn was conifer forest cover at various growth stages at the time of the study, including ~25% recently felled forest. No new access roads were built during felling and brash mats were used to minimise compaction. A series of leaky wooden dams has also been constructed in this catchment as part of the NFM project, but these are only active during the largest events (Black et al., 2021) and release water rapidly as storms subside, so we assume they have negligible effect on total event water runoff fractions. The adjacent Shiplaw catchment has 41% conifer forest cover (including ~10% recent felling), 28% improved and semi-improved grassland, much of which has under-field tile drainage that is common in the UK uplands (Harrison, 2012), and 27% fenland. The steeper Longcote catchment in the east has 1% conifer forest cover, 11% improved grassland, 64% rough grassland (30% heathland, 22% unimproved acid grassland, 12% bracken) and 22% fenland. Most of the catchment is grazed by sheep; cattle grazing occurs on the lower slopes, and drains exist under the improved grassland areas.
Soils in the western catchments include extensive areas of poorly permeable gleyed soils and peats, but also areas of more freely draining brown soils, whilst the eastern catchment is dominated by brown soils with some peaty and gleyed soils on hilltops. Soil median field saturated hydraulic conductivities measured nearby (~3 km) in the wider catchment in a separate study (Archer et al., 2013) were 0.50-0.94 m d⁻¹ for improved grassland sites, 1 m d⁻¹ for ~50 year old plantation forest, and 2.86-4.18 m d⁻¹ for broadleaf forests >180 years old. Soils and underlying geology are strongly associated. The western catchments are dominated by poorly permeable glacial till (Aitken et al., 1984) with pockets of permeable glacio-lacustrine sands and gravels (Fig. 1a). The estimated hydraulic conductivity of the glacial till is <0.001 to 1 m d⁻¹ (MacDonald et al., 2012). The eastern catchment is mostly rock head overlying bedrock, with smaller areas of glacial till mantling some of the main streams. The hydraulic conductivity of the Silurian greywacke bedrock was not measured, but Silurian greywacke aquifers elsewhere in southern Scotland have low productivity (Ó Dochartaigh et al., 2015), with an estimated average transmissivity of 20 m² d⁻¹ (Graham et al., 2009).
Elevation ranges between 250 and 600 masl across the three study catchments. At Eddleston Village mean annual precipitation (2011-2017) is ~900 mm, falling mainly as rainfall; monthly mean air temperatures are 3-13 °C; and actual daily evapotranspiration ranges from 0.2 mm in winter to 2.5 mm in summer (estimated using methods of Granger and Gray, 1989).

Hydrometric monitoring
Rainfall has been measured since April 2011 at four locations, representative of the wider ~70 km² Eddleston Water catchment, using stainless steel Octapent storage rain gauges and tipping bucket rain gauges (RIM8020) recording at 15 min intervals and in increments of 0.2 mm. These gauges are situated inside or within 1 km of the study catchments (Fig. 1). Air temperature, solar radiation, relative humidity and wind speed and direction have been measured at the same time step over the same period at a weather station (Campbell CR1000 Automatic Weather Station) located at the centre of the catchment (Fig. 1a and b).
Stream water levels have been measured every 15 min (Hobo U20 0-3.5 m unvented pressure-based water level recorders) in each catchment since April 2011. Discharge was calculated at the same time step using rating curves derived from applying the mid-section method (Dingman, 2014) to velocity-area gauging at natural rated sections approximately 8 times a year under a range of conditions (Peskett, 2020).
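As a rough illustration of the mid-section method referred to above, the sketch below sums velocity × depth × width over the gauging verticals, assigning each vertical half the distance to its neighbour on either side. The function name and the example gauging values are hypothetical and are not taken from the study's rating data.

```python
def midsection_discharge(distances_m, depths_m, velocities_ms):
    """Discharge (m^3/s) from one velocity-area gauging, mid-section method.

    distances_m   : distance of each vertical from a reference bank (m)
    depths_m      : water depth at each vertical (m)
    velocities_ms : mean velocity at each vertical (m/s)
    """
    n = len(distances_m)
    if n < 2 or n != len(depths_m) or n != len(velocities_ms):
        raise ValueError("need at least two verticals of equal-length data")

    total = 0.0
    for i in range(n):
        # Each vertical represents a panel reaching halfway to its neighbours;
        # the first and last verticals only have a neighbour on one side.
        left = distances_m[i - 1] if i > 0 else distances_m[i]
        right = distances_m[i + 1] if i < n - 1 else distances_m[i]
        width = (right - left) / 2.0
        total += velocities_ms[i] * depths_m[i] * width
    return total


# Hypothetical gauging across a 4 m wide section (illustrative values only)
q = midsection_discharge(
    [0.0, 1.0, 2.0, 3.0, 4.0],
    [0.0, 0.5, 0.8, 0.5, 0.0],
    [0.0, 0.2, 0.4, 0.2, 0.0],
)
```

Repeating this for gaugings taken at different stages gives the stage-discharge pairs from which a rating curve can be fitted.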

Event precipitation sampling
Event sampling was carried out over a 48 h period in the three catchments for seven events between December 2015 and February 2017. Events were targeted based on weather forecasts and predicted precipitation maps from the UK Met Office, and were only considered if total predicted event rainfall was above 15 mm with an average intensity of approximately 2 mm h⁻¹ (based on prior knowledge of the responsiveness of the catchment).
Event rainfall was sampled within the Longcote and Shiplaw catchments (Fig. 1) using sequential rainfall samplers built using a modified version of the method described in Kennedy et al. (1979). These were deployed a few hours prior to the start of the forecast event rainfall in open field locations. The samplers collect rainfall in 6 mm increments for the first three samples, 11 mm for the fourth sample and then a bulk sample for the rest of the event. The volume increments were converted to time increments by pairing with the cumulative rainfall data from the closest TBR rain gauge. Bulk rainfall samples for the whole event were also collected from adjacent temporary storage gauges in case of failure of the sequential samplers.
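The conversion from volume increments to time increments described above can be sketched as follows: accumulate the tipping bucket record and note the time step at which each sampler threshold (6, 12, 18 and 29 mm cumulative) is first reached. This is a minimal sketch under the assumption that each bottle fills exactly at its nominal depth; the function and variable names are hypothetical.

```python
from itertools import accumulate

# Nominal rainfall depth collected by each sequential bottle (mm), as
# described in the text; rainfall after the fourth bottle goes to the bulk sample.
SAMPLE_INCREMENTS_MM = [6.0, 6.0, 6.0, 11.0]


def sample_end_times(tbr_times, tbr_depths_mm, increments_mm=SAMPLE_INCREMENTS_MM):
    """Return the TBR time stamp at which each sequential bottle filled.

    tbr_times     : time stamps of the rain gauge record (any comparable type)
    tbr_depths_mm : rainfall depth recorded in each interval (mm)
    Bottles that never filled during the record are returned as None.
    """
    thresholds = list(accumulate(increments_mm))  # 6, 12, 18, 29 mm
    end_times = []
    cumulative = 0.0
    next_bottle = 0
    for t, depth in zip(tbr_times, tbr_depths_mm):
        cumulative += depth
        # Small tolerance guards against float error when summing 0.2 mm tips.
        while next_bottle < len(thresholds) and cumulative >= thresholds[next_bottle] - 1e-9:
            end_times.append(t)
            next_bottle += 1
    end_times.extend([None] * (len(thresholds) - len(end_times)))
    return end_times


# Hypothetical record: 2 mm in every 15 min interval, times in minutes
times = list(range(0, 150, 15))
filled_at = sample_end_times(times, [2.0] * 10)
```

In this made-up record only 20 mm falls, so the first three bottles are assigned end times and the fourth is left as `None`.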
Rainfall samples were collected before any further rainfall (and within 12 h of the end of each event), with aliquots for isotopic analysis transferred in the field into two 15 mL HDPE bottles filled completely to remove air. Snow samples were also taken in the February event to test for the influence of snow on separation results (see Supplementary data S1.1). Throughfall was not sampled, but a sensitivity analysis was conducted for the effects of throughfall in hydrograph separation using data from the literature (S1.1).

Event stream water sampling
Stream water sampling for the first event (December 2015) used two automatic water samplers (ISCO 6712, Teledyne ISCO, Nebraska, USA) programmed for a 2-hourly sampling frequency and manual sampling in the third catchment (Middle Burn) at lower frequency, as an automatic sampler was not available. All subsequent events used three automatic samplers sited adjacent to the gauging stations and programmed at a 2-hourly sampling frequency, giving a sampling window of up to 48 h, since each sampler held 24 bottles. All samplers were primed with clean, dry bottles prior to the event, timed to start a few hours before forecast rainfall and programmed to purge the inlet tubing with river water prior to collecting each sample. We used a number of approaches to prevent and check for evaporation: samples were pumped into bottles with narrow necks within the shaded, wind-free and pale-grey compartment of automatic water samplers; samples were collected within 12 h of the sampler programmes finishing; samples were transferred directly from the automatic sampler bottles into two 15 mL HDPE bottles in the field, and were filled completely to exclude air before sealing; and after sample analysis we also checked that there was no systematic deviation on δ²H vs. δ¹⁸O plots compared to the Local Meteoric Water Line. The remaining stream water samples were transported to the laboratory in capped automatic sampler bottles, where they were refrigerated at 4 °C prior to ANC analysis and analysed within 48 h of the event ending.
Automatic water samplers failed during two events. During the July 2016 event, a blocked sampling tube in Shiplaw meant that only the first sample was collected and this catchment had to be removed from the analysis of that event. An electronic failure in the Middle Burn automatic sampler during the February 2017 event resulted in some missing data for the first part of the event. Manually collected samples prior to the event and before the onset of the main flow peak enabled estimation of the event fraction using interpolation and linear regression based on the neighbouring Shiplaw catchment data (as explained in Tables S1 and S2).
In total, 60 event rainfall samples, 4 event snow samples and 395 stream samples were collected across seven events. The final event dataset for isotopic and ANC analysis reported here is based on four events (Table 2). Three of the events were excluded due to smaller than forecast rainfall or rainfall isotopic composition that was insufficiently different from streams to enable hydrograph separation.

Laboratory methods
Precipitation and stream samples were analysed for H and O isotope compositions using a Los Gatos Research liquid water Off-Axis Integrated-Cavity Output Spectroscopy (Off-Axis ICOS) laser absorption spectrometer at the University of Saskatchewan, Canada. We used standard analytical methods (IAEA, 2009) and report δ²H and δ¹⁸O values relative to V-SMOW; precision was ±1.0‰ and ±0.2‰, respectively.
ANC was determined in stream water samples using acidimetric titration with H₂SO₄ in accordance with Rounds (2012) to endpoints of pH 4.5, 4.1, 4.0 and 3.5 within 48 h of returning from the field. In natural waters where aluminium concentrations are low this method has been shown to give a good approximation of ANC (Neal, 2001).
Conductivity and pH were not a focus of this study, but were determined in stream samples to help characterise water chemistry. Conductivity was measured using a temperature-compensated conductivity meter (Mettler Toledo SG7) and pH was measured during

Table 2
Summary of samples and hydrological characteristics for four events sampled at high frequency in the three catchments. R: total event rainfall; I: maximum rainfall intensity; API: 5-day Antecedent Precipitation Index (Kohler and Linsley, 1951); AP28d: 28-day pre-event rainfall; RR: runoff ratio; Bulk δ²H: isotopic signature for all precipitation sampled during the event.

To help characterise catchment responses to rainfall, hydrometric data were initially analysed for the three catchments using the complete 15 min frequency time series (October 2011 - September 2017) to generate summary statistics as well as statistics based on a set of ~60 events in each catchment. The data analysis methods used are detailed in S1.3. These data were used to calculate Spearman rank correlation coefficients to highlight any possible relationships between event runoff ratios and event total rainfall depth, rainfall intensity and antecedent rainfall. Summary statistics were also derived from routine (weekly / 2 weekly) geochemical sampling data for May 2015 - May 2017 encompassing the duration of the storm sampling (S1.4).
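For readers unfamiliar with the antecedent precipitation index listed in Table 2, the sketch below shows one common formulation, in which rainfall on each preceding day is weighted by a decay constant raised to the number of days before the event. The decay constant used in the study is not given in the text, so `k = 0.9` here is a placeholder assumption, as are the function name and the example rainfall values.

```python
def antecedent_precipitation_index(daily_rain_mm, k=0.9):
    """Antecedent Precipitation Index over the days preceding an event.

    daily_rain_mm : daily rainfall totals (mm), oldest first, for the n days
                    before the event (n = 5 for a 5-day API)
    k             : daily decay constant, 0 < k <= 1 (placeholder value; the
                    constant used in the study is not stated in the text)
    """
    if not 0.0 < k <= 1.0:
        raise ValueError("k must lie in (0, 1]")
    api = 0.0
    # The most recent day is weighted by k, the day before by k**2, and so on.
    for days_before, rain in enumerate(reversed(daily_rain_mm), start=1):
        api += (k ** days_before) * rain
    return api


# Hypothetical example: 10 mm on the day before the event, none earlier
api_recent = antecedent_precipitation_index([0.0, 0.0, 0.0, 0.0, 10.0], k=0.5)
# Same rainfall, but falling five days before the event
api_old = antecedent_precipitation_index([10.0, 0.0, 0.0, 0.0, 0.0], k=0.5)
```

The same rainfall depth contributes far less to the index when it fell several days earlier, which is the behaviour the index is designed to capture.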

Table 3
Discharge statistics for the three catchments based on daily discharge data for October 2011-September 2017. Calculation methods are detailed in S1.3. MAPR: mean annual peak runoff; RB: Richards-Baker flashiness index (Baker et al., 2004); FDC_Q1_5: gradient of flow duration curve between 1st and 5th percentiles; Lag time (LT) is between rainfall centroid and discharge peak for events selected based on the methodology outlined in S1.3. Time to peak (TTP) and runoff ratio (RR) are based on the same events dataset.
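The Richards-Baker flashiness index listed in the Table 3 caption is, as defined by Baker et al. (2004), the sum of absolute day-to-day changes in discharge divided by the sum of discharge. The sketch below is a minimal illustration; the exact implementation used in the study is in S1.3 and is not reproduced here, and the function name and example series are hypothetical.

```python
def richards_baker_index(daily_q):
    """Richards-Baker flashiness index for a series of daily mean discharge.

    The numerator sums |q_i - q_(i-1)| for i = 1..n, so the denominator is
    summed over the same days (the first value only serves as q_0).
    """
    if len(daily_q) < 2:
        raise ValueError("need at least two daily values")
    path_length = sum(abs(b - a) for a, b in zip(daily_q, daily_q[1:]))
    total_flow = sum(daily_q[1:])
    if total_flow <= 0:
        raise ValueError("total discharge must be positive")
    return path_length / total_flow


# Hypothetical daily discharge series (arbitrary units)
rb_flashy = richards_baker_index([1.0, 3.0, 1.0, 3.0])
rb_steady = richards_baker_index([2.0, 2.0, 2.0, 2.0])
```

A perfectly steady stream gives an index of zero, and the index increases as day-to-day oscillations grow relative to total flow.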

Isotope and ANC event-based hydrograph separation
We used isotope- and ANC-based hydrograph separation to calculate the fractions of pre-event water and groundwater, respectively, in stream discharge at the 2 h stream water sampling time step for each event. Hydrograph separation relies on a number of assumptions and has limitations that have been extensively reviewed elsewhere (Klaus and McDonnell, 2013). Despite these limitations, isotope-based hydrograph separation is considered more objective than separation methods based on hydrometric data alone and provides a useful first approximation of runoff components operating at the catchment scale (Klaus and McDonnell, 2013).
Pre-event water fractions were estimated as follows:

$$Q_t = Q_p + Q_e$$

$$C_t Q_t = C_p Q_p + C_e Q_e$$

$$F_p = \frac{Q_p}{Q_t} = \frac{C_t - C_e}{C_p - C_e}, \qquad F_e = \frac{Q_e}{Q_t} = 1 - F_p$$

where $Q_t$ is stream discharge, $Q_p$ the discharge contribution from pre-event water, $Q_e$ the discharge contribution from event water, $C_t$, $C_p$ and $C_e$ are the δ values of stream water, pre-event water and event water, and $F_p$ and $F_e$ are the fractions of pre-event and event water in the stream (Klaus and McDonnell, 2006). The analysis was conducted using both ¹⁸O and ²H. Both isotopes gave similar results, indicating minimal fractionation effects, so only the ²H data are presented here.
To define the pre-event endmember for each stream, we used the mean of the pre-event stream water samples collected on the day prior to the event, as in other studies (Klaus and McDonnell, 2013). For the event endmember, we used both the sequential rainfall samples for each event (except December 2015 when sequential samples were not available) and bulk rainfall samples in order to cross-check the results. $F_p$ and $F_e$ are presented in the subsequent analysis as values at peak discharge using the sample closest to peak discharge ($F_p(Q_{max})$ or $F_e(Q_{max})$) and as a total for the storm based on a sampling window defined by the first sample on the rising limb of the hydrograph and the earliest final sample across the three catchments ($F_p(Q_{tot})$ or $F_e(Q_{tot})$). Time to peak (TTP) from the start of the rising limb to the peak discharge and the runoff ratio based on the event water fraction (RR) were also calculated (with subscripts p and e denoting pre-event and event water respectively). We also carried out sensitivity analyses to assess the influence of throughfall and of snow on hydrograph separation (S1.1).
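The two-component isotope separation described above can be sketched in a few lines. The example assumes a single (bulk) event endmember and assumes that the storm-total fraction is a discharge-weighted sum over equally spaced samples; the text does not spell out the weighting, so this is an illustration rather than the authors' exact procedure. All names and δ²H values are hypothetical.

```python
def pre_event_fraction(c_stream, c_pre, c_event):
    """Pre-event water fraction F_p from a two-component mixing model."""
    denom = c_pre - c_event
    if denom == 0:
        raise ValueError("endmembers are identical; separation is undefined")
    return (c_stream - c_event) / denom


def total_event_fraction(discharge, c_stream_series, c_pre, c_event):
    """Discharge-weighted event water fraction over a sampling window.

    Assumes equally spaced samples, so each discharge value can be used
    directly as a weight (an assumption of this sketch).
    """
    if len(discharge) != len(c_stream_series) or not discharge:
        raise ValueError("discharge and tracer series must match and be non-empty")
    event_flow = sum(
        q * (1.0 - pre_event_fraction(c, c_pre, c_event))
        for q, c in zip(discharge, c_stream_series)
    )
    return event_flow / sum(discharge)


# Hypothetical delta-2H values (per mil): pre-event stream -50, rainfall -80
f_p_peak = pre_event_fraction(-56.0, -50.0, -80.0)
f_e_total = total_event_fraction([1.0, 3.0], [-50.0, -65.0], -50.0, -80.0)
```

With these made-up numbers, a stream sample at -56‰ lies 80% of the way from the rainfall endmember to the pre-event endmember, so the pre-event fraction at that time step is 0.8.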
We used a similar two-component mixing model to estimate the groundwater fraction in runoff during each event, subject to the same assumptions as the isotope-based model:

$$F_{gw} = \frac{Q_{gw}}{Q_t} = \frac{A_s - A_r}{A_{gw} - A_r}$$

where $F_{gw}$ is groundwater fraction, $Q_t$ is stream discharge, $Q_{gw}$ is discharge contribution from groundwater, $A_s$ is ANC of stream water, $A_r$ is ANC of surface runoff endmember, and $A_{gw}$ is ANC of groundwater endmember. The groundwater endmember was defined as the mean ANC of the five lowest flows in each sub-catchment for the period September 2015-August 2016 (based on weekly sampling as discussed in S1.4), similar to other studies (Neal et al., 1997; Soulsby et al., 2003). The surface runoff endmember was defined as zero, as this approximates the ANC of rainfall. The stream water endmember was taken as the ANC at the time of sampling. As for isotope-based hydrograph separation, $F_{gw}$ is presented in the subsequent analysis as values at peak discharge or total discharge ($F_{gw}(Q_{max})$; $F_{gw}(Q_{tot})$), and $TTP_{gw}$ and $RR_{gw}$ are reported for time to peak and runoff ratios based on groundwater components.
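A minimal sketch of the ANC-based separation follows: the groundwater endmember is taken as the mean ANC of the five lowest sampled flows, the surface runoff endmember defaults to zero, and the groundwater fraction is the ratio of the two ANC differences. Function names, units and the example flow and ANC values are hypothetical.

```python
def groundwater_endmember(flows, anc_values, n_lowest=5):
    """Mean ANC of the samples taken at the n lowest flows."""
    if len(flows) != len(anc_values) or len(flows) < n_lowest:
        raise ValueError("need matching series with at least n_lowest samples")
    lowest = sorted(zip(flows, anc_values), key=lambda pair: pair[0])[:n_lowest]
    return sum(anc for _, anc in lowest) / n_lowest


def groundwater_fraction(anc_stream, anc_groundwater, anc_runoff=0.0):
    """Groundwater fraction F_gw from a two-component ANC mixing model."""
    denom = anc_groundwater - anc_runoff
    if denom == 0:
        raise ValueError("endmembers are identical; separation is undefined")
    return (anc_stream - anc_runoff) / denom


# Hypothetical routine samples: flow (arbitrary units) and ANC (ueq/L)
flows = [5.0, 1.0, 2.0, 9.0, 3.0, 4.0, 8.0]
anc = [100.0, 500.0, 480.0, 50.0, 460.0, 440.0, 60.0]
a_gw = groundwater_endmember(flows, anc)
f_gw = groundwater_fraction(99.0, a_gw)
```

Because the runoff endmember is zero, the fraction reduces to stream ANC divided by the groundwater endmember ANC.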
Uncertainties in the pre-event water and groundwater fractions were estimated using the Gaussian error propagation approach of Genereux et al. (1998), based on 70% confidence intervals, which were considered appropriate for analysis of this size of dataset and the uncertainties involved in hydrograph separation (Bazemore et al., 1994). Details of the input parameters for the uncertainty analysis are outlined in S1.5.
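First-order Gaussian error propagation for the two-component model can be sketched by combining the partial derivatives of the pre-event fraction with the uncertainty in each tracer value. The input uncertainties must share a common confidence level (70% in the text); how they were derived is set out in S1.5 and is not reproduced here. The function name and example values are hypothetical.

```python
import math


def pre_event_fraction_uncertainty(c_t, c_p, c_e, w_t, w_p, w_e):
    """Uncertainty in F_p = (c_t - c_e) / (c_p - c_e) by Gaussian propagation.

    c_t, c_p, c_e : tracer values of stream, pre-event and event water
    w_t, w_p, w_e : their uncertainties, all at the same confidence level
    """
    denom = c_p - c_e
    if denom == 0:
        raise ValueError("endmembers are identical; separation is undefined")
    # Partial derivatives of F_p with respect to each tracer value
    d_ct = 1.0 / denom
    d_cp = -(c_t - c_e) / denom ** 2
    d_ce = (c_t - c_p) / denom ** 2
    return math.sqrt((d_ct * w_t) ** 2 + (d_cp * w_p) ** 2 + (d_ce * w_e) ** 2)


# Hypothetical delta-2H values (per mil), each with an uncertainty of 1 per mil
w_fp = pre_event_fraction_uncertainty(-56.0, -50.0, -80.0, 1.0, 1.0, 1.0)
```

The uncertainty scales inversely with the separation between the two endmembers, which is why events whose rainfall was isotopically too close to stream water could not be used.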
Finally, three-component hydrograph separation was conducted using a two-step approach (Klaus and McDonnell, 2013) to approximate soil water based on the difference between the pre-event water and groundwater fractions.

Fig. 4. Relationships between event water ($F_e$) or groundwater ($F_{gw}$) fractions and event size, in plots of (a, b) maximum event water (or groundwater) discharge fraction against maximum discharge ($Q_{max}$) and (c, d) total event water (or groundwater) discharge fraction against maximum discharge ($Q_{max}$). Error bars represent 70% confidence intervals calculated as explained in the text.
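The two-step, three-component separation described in the text amounts to simple bookkeeping once the isotope-based and ANC-based fractions are available: event water is the complement of pre-event water, and soil water is pre-event water minus groundwater. The sketch below is illustrative; the function name and input values are hypothetical.

```python
def three_component_fractions(f_pre_event, f_groundwater):
    """Split runoff into event water, soil water and groundwater fractions.

    f_pre_event   : pre-event fraction from isotope-based separation
    f_groundwater : groundwater fraction from ANC-based separation
    """
    f_soil = f_pre_event - f_groundwater
    if f_soil < 0:
        # The two tracers disagree beyond what the model allows; flag it
        # rather than silently clipping the value.
        raise ValueError("groundwater fraction exceeds pre-event fraction")
    return {
        "event": 1.0 - f_pre_event,
        "soil": f_soil,
        "groundwater": f_groundwater,
    }


# Hypothetical fractions at one sampling time step
parts = three_component_fractions(0.8, 0.25)
```

Because soil water is obtained as a difference, it inherits the uncertainty of both underlying separations.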
These data were used to compare catchment responses during the different events. Spearman rank correlation coefficients were calculated to highlight any possible relationships between tracer-based event water fractions/runoff ratios and event total rainfall depth, rainfall intensity and antecedent rainfall.
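A Spearman rank correlation coefficient is the Pearson correlation of the ranks of two variables, with tied values sharing their average rank. The dependency-free sketch below computes the coefficient only; it does not compute p-values, for which a statistics package would normally be used. The function names and the example event values are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length (at least 2 values)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no rank correlation")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical event totals (mm) and event water fractions for four events
rho = spearman_rho([10.0, 20.0, 30.0, 40.0], [0.10, 0.20, 0.25, 0.30])
```

With only four events per catchment, as here, even a perfect monotonic relationship gives limited statistical power, which is consistent with the caution expressed in the text.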

Long-term hydrometric and geochemical sampling data
Analysis of the complete stream discharge time series indicated that the two western catchments, Middle Burn and Shiplaw, were more responsive to rainfall events than the eastern catchment, Longcote (Table 3). The highly forested Middle Burn catchment was less responsive than the adjacent Shiplaw with lower forest cover, but much more responsive than Longcote with the lowest forest cover.
Large differences were also observed in catchment long-term (May 2015 -May 2017) stream water geochemistry (Fig. 2, Fig. S1). Middle Burn was most acidic, and had the lowest ANC and conductivity compared to the other catchments, which may be due to the acidifying effects of the forest; Shiplaw had intermediate values; and Longcote had the highest values. Shiplaw had a wider range of ANC and conductivity compared to the other catchments, which may reflect its greater responsiveness. The median values and range in isotopic composition were similar for Middle Burn and Shiplaw, but notably different for Longcote, reflecting both the proximity of the western catchments and potentially the higher groundwater fraction in Longcote.

Event-based hydrograph separation
Overall, despite the responsiveness of the catchments, event water runoff was generally limited. The total event runoff fraction, $F_e(Q_{tot})$, was <0.36 for all catchments and events (Table 4). The peak runoff fraction, $F_e(Q_{max})$, was higher (<0.54) in all catchments and events, but with the same relative values as those based on $F_e(Q_{tot})$. Groundwater runoff was generally low across the catchments. Total groundwater runoff fraction, $F_{gw}(Q_{tot})$, was <0.35 for all winter events, though much more variable during the summer event in the two catchments sampled (0.25-0.65). $F_{gw}(Q_{max})$ showed a similar pattern between catchments and events, but estimates were up to 10% lower than fractions based on totals (Table 4). These low groundwater estimates, combined with low event water estimates, result in large soil water fraction estimates in the three-component hydrograph separations.
In terms of the differences between catchments, the hydrograph separation results show a similar pattern to the results based on long-term data, with no systematic relationship between forest cover and the fraction of event water runoff. Runoff ratios based on event water estimates from isotope-based hydrograph separation indicate that Shiplaw had the highest event water runoff ratios, Longcote the lowest and Middle Burn intermediate values despite having the highest forest cover. Uncertainties in these estimates are large, but apparent differences were observed between Shiplaw and Middle Burn for the largest events (November 2016 and February 2017, Table 4). There is good consistency between the relative order of these runoff ratios for each catchment and runoff ratios calculated using the hydrometric data for events over the longer time period (cf. Tables 2 and 4).
Groundwater runoff fractions across the catchments followed a different pattern. They were lowest in Middle Burn (0.13 of total runoff during the largest event), intermediate in Shiplaw and highest in Longcote, where groundwater contributed 0.27 of total runoff during the largest event. The differences were systematic between catchments and events (Table 4). Despite similar fractions of event water runoff during the intense summer event (July 2016), the groundwater contribution was much higher at Longcote (0.65 of total runoff) than Middle Burn (0.25 of total runoff).

Intra-event dynamics
The intra-event isotope/ANC dynamics (Figs. S2-S5) as well as the associated runoff components within events give additional insights into runoff processes. These are best illustrated during the event with the largest discharge peak (Fig. 3). Longcote had a double peak hydrograph (also observed in the hydrometric data for some other events that were not sampled) with an initial rapid discharge of soil water / groundwater, followed by a slower increase in event water during the second peak that was quickly overwhelmed by soil water / groundwater on the falling limb. It was the only catchment with an increase in soil water / groundwater fraction following the hydrograph peak. Shiplaw had some similarities to Longcote, in that the initially high soil water / groundwater inputs were rapidly overwhelmed by event water runoff that dominated the peak of the hydrograph, but decreased rapidly on the falling limb. In Middle Burn, the response was generally more damped with more coincident peaks in event water, soil water and groundwater, and a greater fraction of event water on the falling limb.

Influence of event characteristics on runoff components
Event water and groundwater fractions appear to vary approximately linearly with maximum discharge in the autumn/winter events in all catchments (Fig. 4). The exception was Longcote, which had a less linear response, particularly in the $F_e(Q_{max})$ values, perhaps due to the higher measured rainfall on the eastern side of the catchment for the largest event or to threshold behaviour. The switch in direction of ANC hysteresis in Longcote between the two smallest and largest events may also be indicative of such behaviour (Fig. 5).
The differences in estimated event water fraction were greater between events than between catchments, suggesting that event characteristics are an important control on runoff partitioning. Spearman rank correlation coefficients between event water fractions and tracer-derived runoff ratios, and event total rainfall depth, rainfall intensity and antecedent rainfall, showed some possible associations (p = 0.07 for correlation of runoff ratio with total event rainfall), though it was not possible to determine the main drivers due to the limited number of events sampled. Correlation analysis based on hydrometric data alone for the events during the longer time period suggested that the only significant correlation (p < 0.01) was between total runoff and event total rainfall depth.

Discussion
The results give insights into the role of rapid surface rainfall runoff versus stored water during storm events in this NFM pilot catchment. As we outline below, stored water dominates the hydrograph under most conditions, and its contribution appears to be mediated to some extent by forest cover. Across the wider NFM catchment as a whole, other factors such as soils and geology are a more dominant control. However, the influence of these catchment characteristics appears to be outweighed by that of event magnitude. As we discuss, these findings have implications for locating NFM interventions involving land use change and the level of benefits we might expect from these flood risk management approaches. They also demonstrate how tracers can complement more traditional hydrometric techniques in the evaluation of NFM.

Pre-event and subsurface sources of runoff during events
Pre-event water was an important fraction of stream discharge in all sampled events and sub-catchments, constituting the dominant fraction of total runoff (>0.64). This finding is consistent with many other isotope-based hydrograph separation studies that have demonstrated the importance of pre-event water (e.g. Bonell et al., 1990; McDonnell, 1990; Pearce et al., 1986; Sklash and Farvolden, 1979). Findings were similar using either ²H or ¹⁸O, with differences in the event fractions of ±7%, which were lower than the error propagation estimates. Nevertheless, there was high variability in pre-event water fraction between different events, suggesting that humid forested catchments may have more variable pre-event fractions than many early studies suggested (Fischer et al., 2017). There was also relatively low spatial variability in bulk precipitation isotopic composition for different events across different parts of the catchment, in contrast to some studies (Fischer et al., 2017; McGuire et al., 2005), suggesting that in this environment isotopic tracers can provide a useful tool for understanding event runoff. Further research is warranted on the effects of throughfall on isotopic composition. Assuming the differences between bulk throughfall and storm rainfall δ¹⁸O and δ²H values reported by Kubota and Tsuboyama (2003), we estimated that $F_e(Q_{max})$ ranged from 12% lower to 17% higher than estimates based on the sampled rainfall across the different events in Middle Burn. However, this did not alter the relative order of catchments in terms of their event water runoff fractions.
Groundwater was a small fraction of stream discharge during the winter events, although variable between catchments and important in Longcote where it still constituted 21% of stream discharge during the largest event. The small groundwater fraction implies that soil water is an important water source in these temperate upland catchments. This is consistent with studies in similar catchments with poorly permeable bedrock (or superficial deposits), where surface runoff and shallow subsurface runoff in soils tend to dominate (Tetzlaff et al., 2007b). Groundwater appears to be more important in summer events, but testing this hypothesis would require sampling larger and more summer events.

Forest cover and management controls on event runoff fractions
Comparisons between Shiplaw and Middle Burn suggest that forest cover has some influence on event water fraction. In these paired catchments, the total event water fraction in the forested Middle Burn catchment was consistently lower across the sampled events, and in the largest event it was 11% lower (and 17% lower at peak discharge). We are not aware of data on event water fractions from similar environments with NFM schemes in the UK, but reductions due to forest cover are consistent with other studies generally (Buttle, 1994), although not as high as reductions seen in other areas. For instance, Muñoz-Villers and McDonnell (2013) reported ~24% difference in event water contributions between forest and pasture in a tropical environment due mainly to (much higher) contrasts in infiltration rates between the two land uses, and to a lesser extent to canopy interception. Considering this evidence, and the infiltration data we have from the wider Eddleston catchment and similar environments in the UK (3.4/0.026 m d⁻¹ (A/B horizon) for grassland, versus 8.3 m d⁻¹ for forest; Marshall et al., 2009), we suggest that the reductions in event water in Middle Burn compared to Shiplaw are linked to relative differences in exceedance of rainfall infiltration capacity between forest and grassland areas and resulting reductions in overland flow in the forested areas. Interception by the forest is also likely to have led to reductions in the event water fraction as reported in other studies (Roa-García and Weiler, 2010).
Forest management methods (particularly drainage in site preparation and compaction during harvesting) and under-field drainage in improved grassland may introduce uncertainties. The effects of drainage systems in similar temperate environments are poorly studied, particularly at catchment scale, and are highly variable depending on factors such as drain spacing, soil type and antecedent conditions (Dadson et al., 2017). Open forest drainage ditches and compaction are likely to have increased the fraction of event water runoff compared to undrained forest, although, as already noted, the effects of drainage ditches and compaction are expected to be minimal in the study catchments. Under-field drainage is also likely to have increased event water runoff fraction under the wet antecedent conditions of most of the storms we investigated, but less so than increases due to drainage in the forest. The net effect of drainage in both the Middle Burn and Shiplaw catchments is a likely decrease in the difference in event water fractions between them, though it is unlikely to have changed the results we have reported. These relative effects of drainage would benefit from further study in this catchment and elsewhere in the UK, given that around 61% of agricultural land in the UK is drained (Wiskow and van der Ploeg, 2003).

Soil, geological and meteorological controls on runoff fractions
Comparisons between the western (Shiplaw, Middle Burn) catchments and the eastern catchment (Longcote) illustrate the wide variation in response across the wider NFM catchment. Event water fractions were up to 48% lower in Longcote compared to Shiplaw (whilst up to 33% lower in Middle Burn). Longcote responded rapidly to rainfall, though with lower peak discharge, slow recession rates and double-peaked hydrographs, as well as high groundwater and soil water fractions. These findings suggest a dominance of subsurface flow despite the steeper catchment topography and low forest cover. Given the poorly permeable bedrock in Longcote, subsurface flow is most likely to occur in the soils, superficial deposits and weathered bedrock. This is in contrast to some hydrograph separation studies in steep catchments that have reported positive correlations between slope and event water fraction (Suecker et al., 2000).
There is some uncertainty in whether the lower event water fractions in the Longcote catchment are related to lower levels of disturbance (and particularly the acid grassland, heathland and bracken land cover with an expected higher roughness and infiltration) or to the underlying soils and geology. Research on catchment disturbance has frequently found evidence of increased runoff and event water fraction in more disturbed catchments (e.g. Monteith et al., 2006;Nainar et al., 2018). However, the higher groundwater fraction and the nature of the stream response suggest that more permeable soils and superficial deposits are particularly important in controlling event water fractions. This corresponds with research in other Scottish upland catchments, which has demonstrated the importance of freely draining soils and glacial drift deposits in controlling catchment flow paths, leading to much longer transit times even on steeper slopes (e.g. Soulsby et al., 2006). Regardless of these uncertainties, a conclusion emerging from our results that is relevant to planning NFM interventions more widely is the likely limited effect on flood risk of hillslope afforestation in areas already dominated by deeper subsurface flow.
The small number of storms sampled makes it difficult to test how different meteorological characteristics (in terms of total rainfall depth, intensity and API) control event runoff fractions. However, given that there were greater differences in event runoff between events than between catchments, these characteristics may be more important controls on the fraction of event water in the hydrograph than catchment characteristics. Fischer et al. (2017) reported a similar hierarchy of influences controlling pre-event water fraction across multiple storms in pre-alpine catchments. Antecedent conditions may be particularly important, as suggested by the large contrast between summer and winter events, and in Longcote the switch in hysteresis direction, which has been linked to the changing inputs of hillslope soil water in streams under different wetness states (Zuecco et al., 2016).

Implications for natural flood management policy
Our study demonstrates the importance of the displacement of pre-event water in runoff mechanisms in landscapes subject to NFM interventions, implying that saturated overland flow may not be a dominant runoff mechanism even in relatively responsive catchments. The results also suggest that upland afforestation can influence the partitioning of runoff at the catchment scale, so could be used to manipulate flow paths in NFM. However, the effects might be minimal for the largest events compared to improved grassland areas, which NFM measures frequently target, due to the limited effects of forests on infiltration and transient storage (Soulsby et al., 2017;Tetzlaff et al., 2007a) and relatively high infiltration rates in improved grasslands in many UK upland environments. These findings support suggestions in the NFM literature that climatic conditions and individual storm event characteristics dominate over catchment characteristics in influencing peak flows in the largest events (Dadson et al., 2017). In other regions, for example where grasslands are severely degraded, forest planting may have a greater effect on reducing event water runoff fractions (van Meerveld et al., 2019;Zwartendijk et al., 2017).
The implication of these results is that in areas of more permeable geology and soils, similar to the Longcote catchment, hillslope afforestation may have limited benefit as a NFM strategy (we do not consider the effects of riparian or floodplain planting), although woodland may of course bring other benefits. Therefore, hillslope planting for NFM should target other areas, in particular areas with compacted soils overlying relatively permeable soils or geology, where forest planting may help to connect runoff to streams via the groundwater zone. In this case the location of forest cover and the type of trees become important, as discussed by Neal et al. (1997) who suggested long term experiments with more deeply rooting trees. To take account of these mechanisms in NFM, implementation might require greater consideration of subsurface features. Flood vulnerability maps used to determine the location of woodland-related NFM interventions in the UK consider some proxies for soil and related geological substrate permeability at a coarse resolution (Environment Agency, 2018), but should include further consideration of the interaction with factors such as the degree of compaction and depth to groundwater. Targeted tracer studies could improve such process understanding, helping better constrain models to detect change as well as quantify co-benefits of NFM interventions involving woodland, such as increased biodiversity, carbon storage and water quality (Iacob et al., 2014;Keesstra et al., 2018). Of course, there are practical challenges, given high costs associated with tracer studies, but the increasing availability of in-situ high frequency sensors for tracers such as conductivity, temperature and potentially isotopic tracers (Berman et al., 2009) means that such approaches could be incorporated into NFM planning and monitoring.

Conclusions
To our knowledge this is the first study using natural tracers at the catchment scale to investigate runoff mechanisms in UK-based NFM projects. It gives insights into the diversity of runoff mechanisms operating during storm events across different upland catchments. The results suggest that pre-event water is an important fraction of stream discharge during events for the size of catchments (< 10 km²) and events (< 20% of the mean annual flood) for which there is currently some evidence that forest cover could have an impact on discharge (Dadson et al., 2017). The study also suggests that plantation forest cover reduces the fraction of event water runoff in streams over this range of event magnitudes. However, the effects of these differences in land cover may be dominated by differences in event characteristics, suggesting limited impacts of land cover for the largest events. Since many NFM measures are designed to target event water, this implies a need for careful consideration of the types and locations of NFM interventions to ensure they are effective. Finally, the study demonstrates the potential utility of using tracers in NFM for understanding runoff processes and monitoring co-benefits such as surface water and groundwater pollution management.