Climate variability and implications for keeping rivers cool in England

Water temperature (Tw) is a primary determinant of river ecosystem health and function that is strongly controlled by climate variability and change but mediated by catchment properties. We apply a nested analysis to: (1) evaluate how annual and seasonal mean Tw varied across England during the period 2000–2018; (2) assess the extent to which these regional-temporal dynamics correlate with the North Atlantic Oscillation (NAO); and (3) quantify the impact of local climate variability on modelled daily maximum Tw for open, shaded and spring-fed river reaches. Such information is used to identify sentinel locations for long-term monitoring and reporting, to evaluate the true benefit of riparian shade management, and to assess the impacts of climate change on Tw. We draw on a national archive of nearly 1 million Tw values and data from a highresolution field experiment in central England. Nationally, annual mean Tw changed by − 0.4 C/ decade over the period 2000 to 2018, broadly in line with Central England Temperatures, although summer Tw changed by +0.6 to +1.1 C/decade in parts of central and northern England. There were significant associations between summer Tw and NAO (rho = 0.64, p < 0.05), especially at sites above 300 m altitude (rho = 0.70, p < 0.01). The regional analysis reveals strongest links between summer Tw and NAO in northeast England and weakest associations in lowland regions of southern and east England with major aquifers. Hence, places with significant groundwater flows offer the greatest chance of detecting long-term signals in Tw that are not being driven by the NAO. Site-specific, logistic regression models of daily maximum Tw are found to be sensitive to the prevailing NAO phase during calibration periods. Such models show a thermal benefit for shaded sites compared with open sites that is one average 0.2 C under negative NAO but 2.8 C under positive NAO. Based on the findings from our nested analysis we suggest ways of optimising monitoring networks plus improving the appraisal of measures intended to keep rivers cool. In particular, we call for the creation of a national indicator of Tw for use in UK Climate Change Risk Assessments.

Water temperature (Tw) is a primary determinant of river ecosystem health and function that is strongly controlled by climate variability and change but mediated by catchment properties. We apply a nested analysis to: (1) evaluate how annual and seasonal mean Tw varied across England during the period 2000-2018; (2) assess the extent to which these regional-temporal dynamics correlate with the North Atlantic Oscillation (NAO); and (3) quantify the impact of local climate variability on modelled daily maximum Tw for open, shaded and spring-fed river reaches. Such information is used to identify sentinel locations for long-term monitoring and reporting, to evaluate the true benefit of riparian shade management, and to assess the impacts of climate change on Tw. We draw on a national archive of nearly 1 million Tw values and data from a highresolution field experiment in central England. Nationally, annual mean Tw changed by − 0.4 • C/ decade over the period 2000 to 2018, broadly in line with Central England Temperatures, although summer Tw changed by +0.6 to +1.1 • C/decade in parts of central and northern England. There were significant associations between summer Tw and NAO (rho = 0.64, p < 0.05), especially at sites above 300 m altitude (rho = 0.70, p < 0.01). The regional analysis reveals strongest links between summer Tw and NAO in northeast England and weakest associations in lowland regions of southern and east England with major aquifers. Hence, places with significant groundwater flows offer the greatest chance of detecting long-term signals in Tw that are not being driven by the NAO. Site-specific, logistic regression models of daily maximum Tw are found to be sensitive to the prevailing NAO phase during calibration periods. Such models show a thermal benefit for shaded sites compared with open sites that is one average 0.2 • C under negative NAO but 2.8 • C under positive NAO. Based on the findings from our nested analysis we suggest ways of optimising monitoring networks plus improving the appraisal of measures intended to keep rivers cool. In particular, we call for the creation of a national indicator of Tw for use in UK Climate Change Risk Assessments.

Introduction
Water temperature is a primary determinant of freshwater ecosystem health and function. Site-specific controls of river water temperature (Tw) are multivariate and nested (Hannah and Garner, 2015). Relatively few studies have explored variations in Tw at global to national scales where climate gradients and geological factors dominate (e.g. Graf and Wrzesiński, 2020;Isaak et al., 2020;Laizé et al., 2017;Orr et al., 2015a;2015b;van Vliet et al., 2013). Most focus on factors governing energy and water fluxes at catchment to river reach scales including: river regulation, weirs and abstractions (e.g. Bae et al., 2016;Webb and Walling, 1997;Maheu et al., 2016); runoff sources, tributaries and hydrological pathways (e.g. Dugdale et al., 2013;Johnson et al., 2014;Nichols et al., 2014); land cover and timber harvesting (e.g. Albertson et al., 2018;Arismendi and Groom, 2019;Hannah et al., 2008); topographic and riparian shade (e.g. Broadmeadow et al., 2011;Johnson and Wilby, 2015); drainage network density, slope and aspect (e.g. Garner et al., 2017a;2017b;Lisi et al., 2013). Other work on spatial-temporal dynamics of Tw is contributing to the development of tools for managing thermal habitats and/or improving their resilience to drivers of thermal change (Jackson et al., 2017;Ouellet et al., 2020). This includes practical measures for protecting instream refugia and 'keeping rivers cool' by careful management of riparian shade (Lenane, 2012;Orr et al., 2015a;2015b).
Interventions are needed because climate change is exacerbating pressures on freshwater ecosystems by altering the thermal regimes of rivers (Garner et al., 2017a;2017b;;Reid et al., 2019;van Vliet et al., 2013). Long-term studies show regional warming of rivers that reflects the complex interplay of rising air temperatures (Ta) with changing hydrological regimes, moderated by catchment controls, as mentioned before (Isaak et al., 2012;Webb and Nobilis, 2007). However, there has been limited attention to inter-annual variations in Tw linked to opposing phases of climate modes such as the El Niño Southern Oscillation (ENSO), Pacific Decadal Oscillation (PDO), or North Atlantic Oscillation (NAO). This may reflect the paucity of multi-decadal Tw seriesa pre-requisite for investigating quasi cyclical phenomena with periodicities spanning 2-7 years for ENSO, 5-8 years for NAO, and 20-30 years for the PDO. Exceptions include analyses of ENSO and PDO signals in Tw records for the Pacific Northwest (Kiffney et al., 2002;Mote et al., 2003); or coherent patterns of regional forcing of Tw by the NAO in the Alps (Hari et al., 2006), Danube basin (Basarin et al., 2016;Webb and Nobilis, 2007), Poland (Graf and Wrzesiński, 2019) and Serbia (Milenković et al., 2017).
In contrast, there is a larger body of evidence on associations between the NAO and hydroclimatic conditions across northwest Europe (e.g. Hurrell, 1995;Kingston et al., 2006). Signatures of positive NAO phases in the UK include a more northerly position of the North Atlantic storm track in summer (Dong et al., 2013); more frequent westerly airflows in winter and blocking anticyclonic conditions in summer (Fowler et al., 2003); above average winter rainfall over Scotland but lower than average across central, northeast and southeast England (Wilby et al., 1997); amplified orographic precipitation and river flow over upland Britain (Burt and Howden, 2013); more extreme winter rainfall (Maraun et al., 2011); more protracted high-flows in northern and western catchments (Hannaford and Marsh, 2008); higher groundwater levels (Rust et al., 2019); milder winters, springs and autumns over central England (Wilby et al., 1997). Generally, opposite behaviour occurs in all these indicators under negative phases of the NAO.
Inter-annual variations in the NAO matter to Tw because the index is a proxy for changes in UK solar radiation (Colantuono et al., 2014), regional rainfall (West et al., 2019), hydrological pathways (Rust et al., 2020) and flow regimes (Burt et al., 2016) that ultimately drive stream temperatures (Briers et al., 2004;Durance and Ormerod, 2007). Furthermore, the NAO may confound climate change detection and evaluation of the thermal benefits of riparian shading . Hence, there is a risk that results from short-lived (<5 year) field experiments may be biased by periods with extremely negative (cool) or positive (warm) NAO phases. Therefore, the aims of the present nested analysis are to: (1) evaluate how annual and seasonal mean Tw varied across England during the period 2000-2018; (2) assess the extent to which these spatial-temporal dynamics were correlated with the NAO; and (3) quantify the impact of climate variability on Tw for shaded and unshaded river reaches in the period 2011-2018. This research is relevant to UK Climate Change Risk Assessments (CCRAs) because sentinel locations for detecting emergent trends in Tw are identified, and the benefits of riparian shade management (a key adaptation option) are more robustly evaluated. Moreover, this is the first time that NAO-related variations in Tw spanning nearly two decades have been presented for the whole of Englandother studies (cited above) have concentrated on individual catchments or shorter periods.

Data sources
We implemented a nested analysis to situate spatial and temporal variations in Tw at the catchment scale within regional and national contexts (Fig. 1). Sources of data and methods of analysis are described below.

Environment Agency archive (2000-2018)
The Tw data were collected by the Environment Agency of England (EA) between 2000 and 2018. This part of the archive holds 922,694 values from routine monitoring of rivers. Measurements were predominantly taken via handheld, multi-purpose probes that record temperature, pH, and dissolved oxygen, with a typical accuracy of 0.3 • C. EA procedures require that probes be held in the water until readings stabilisethen Tw is recorded to the nearest 0.1 • C. For an overview of EA field methods, including sensors used for Tw data collection, see Orr et al. (2015).
Following quality assurance and screening processes, we analysed 867,557 Tw values collected from 1154 to 6148 sites, depending on year. The sampling frequency was typically monthly, with spot samples taken from fixed locations between 08:00 and 16:00 hrs. However, over time there have been changes in the sampling network coverage (Fig. 2a), number of sites ( Fig. 2b) and sampling frequencies at each site (Fig. 2c). Some sites did not have monthly data for the full period, and/or were retired part way through the record. Sites with less than six values in a year were removed from the analysis. Other standardisation procedures are described in section 2.2.

Luten (2011-2019)
The Loughborough University Temperature Network (LUTEN) 1 was established in March 2011 to investigate the influence of catchment and riparian properties on surface water temperature in the Rivers Dove and Manifold, central England (Toone et al., 2011;Wilby et al., 2012). Key catchment features, land-use, sampling sites and procedures have been described in detail before (see Johnson et al., 2014;Johnson and Wilby, 2015). In short, the headwaters are characterized by open moorland and pasture valleys; lower reaches by deep Limestone gorges and perennial spring flows. Of particular relevance to the present study is the high-density of paired air and water temperature monitoring (up to 36 sites at 15 min resolution) for the period 2011-2019. Both air and water temperature were monitored at every site using Gemini Tinytag Aquatic 2 thermistor data loggers. LUTEN sites reflect a wide range of in-channel and riparian conditions, including water depth, width, velocity and discharge, amount of riparian and landscape shade, influence of weirs, groundwater springs and tributaries, channel aspect and gradient. Seven sites with most complete records are used here (Table 1). In terms of riparian and topographic shade in summer, the sites with least to most upstream shade are: D1 > D23 > D20 > D17 > D16 > D6 > D10. Flows at D17, D20 and D23 are impacted by a series of small step-weirs, D16 to D23 by ephemeral springs, and D23 is situated downstream of a reach with significant groundwater influx.

North Atlantic Oscillation index
A monthly NAO index was obtained from the NOAA Climate Prediction Center 2 , based on the classification developed by Barnston and Livezey (1987). This NAO series is available from January 1950 to present and updated monthly. The national analysis (below) used the annual mean winter (December to February) and summer (June to August) indices for the years 2000-2018. During this period, the most negative NAO seasons (with index values) were winter 2009-10 (-1.67) and summer 2012 (-1.61); the most positive   ) was more positive in winter (+0.44 versus − 0.12) but more negative in summer (-0.44 versus + 0.14). Across England, NAO conditions in the study period favoured more unsettled, milder winters and wetter summers than in the WMO reference period.

Other supporting information
National-scale variations in Tw were also correlated with the Central England Temperature (CET) series (Manley, 1953;Parker et al., 1992) to evaluate the EA national and regional data, and to assess the longer-term representativeness of the period of study. National scale Tw was also correlated with the England runoff series (Marsh et al., 2015) to explore whether annual and seasonal variations are linked to discharge (m 3 s − 1 ). The CET was obtained from the Met Office Hadley Centre 3 ; the runoff data from the National River Flow Archive via the UK Centre for Ecology and Hydrology 4 .
The average CET during 2000-2018 compared with the WMO climate reference period  was warmer in both winter (+0.8 • C) and summer (+0.7 • C); average England runoff was 6% higher during 2000-2018 than in the period . Again, these support the view that our study period was generally milder and wetter than average, consistent with observed long-term trends in the UK climate reported by Kendon et al. (2020).

Standardizing EA sample times
The EA data were initially sorted to remove values not associated with routine monitoring. Other data gathered reactively or for compliance purposes (e.g. discharge permits or pollution incidents) were withdrawn to avoid biasing by artificial heat sources, such as wastewater. Raw data were also filtered to omit values that were not measured in rivers. This included sampling of effluent discharges and estuaries where tidal effects render monthly monitoring at different times more problematic. Data were then stratified into four seasons: winter (DJF), spring (MAM), summer (JJA) and autumn (SON), although only results for winter and summer are presented herein. The archive begins on 1 December 2000 -the start of winter in the first year of record (2000), with subsequent June to August (2001) being the first summer in the record. The resultant series were then quality assured for suspect entries by removing values above 35 • C (0.001% of records) or more than three standard deviations from the regional monthly mean Tw. Checks were also made for Tw values below 0 • C but none were found. The overall percentage of values removed by these checks was 0.01%.
The EA data were not collected at the same time of day, which can bias estimates of annual mean temperature (Everall et al., 2015) and introduce artificial trends (Toone et al. 2011). Therefore, an equivalent temperature was derived for a reference time. This was achieved by fitting linear regressions of Tw versus decimal time (such as 8.25 for 08:15 hrs) for each EA region (Fig. 1), season s and year y (spanning 1 December to 30 November to match the seasonal analysis, described below). The EA region-, season-and yearspecific regression slope m EA was then used to adjust every sample i of observed Tw i taken at decimalised time t i to the temperature at reference time 12:00 hrs Tw 12,i using: The standardised temperature Tw 12,i was used in all subsequent analyses, including for estimating annual and seasonal means of Tw at each site.

Standardizing EA spatial coverage
Annual and seasonal averages of EA point data were entered into ArcGIS 10.3.1. for processing and mapping. These data were used to create national heat maps to compare regional variations in Tw for each year despite changes in the sampling network over time (Fig. 2). Site-specific temperature anomalies were calculated by subtracting unweighted annual, winter or summer average temperature for each EA region for the 18-year study period from the equivalent local value. Regional rather than national reference values were used to better draw out spatial variations in Tw anomalies that would otherwise be masked. For instance, Devon and Cornwall (DC) would always have a strong positive anomaly because it is relatively warm nationally, whereas Cumbria and Lancashire (CL) would always have a negative anomaly because it is relatively cool nationally, simply due to the north-south gradient in Tw across England.  (Fig. 2b) whilst the average number of samples taken at each site increased by 18% (region range − 31% to +113%) over the same period (Fig. 2c). Overall, there was a 75% decline (region range − 28 to − 94%) in the number of sample sites and ~30 km north-easterly shift in the network locus. During the same period, the mean altitude of sample sites dropped slightly from 37 m to 32 m.
We countered these variations in sample density by aggregating data to regional and national averages. Standard errors for annual and regional annual and seasonal averages reveal the uncertainty due to variations in sample size. For annual averages, the standard error ranges between 0.02 • C in summer and winter 2007 to 0.05 • C in summer 2018. Standard errors for regional means during 2000-2017 were on average 0.09 • C and up to 0.21 • C (with only 14 of the 272 year-region combinations having standard errors over 0.15 • C). In 2018, when sample sizes fell considerably (Fig. 2), the average standard error across the regions was 0.18 • C (but affected by regions KSL and LN, which had very small sample sizes and relatively large standard errors of 0.47 • C and 0.31 • C, respectively). No other region had standard errors above 0.2 • C. Hence, the vast majority of regional and seasonal means had standard errors that were still less than measurement error, despite significant changes in the overall monitoring network density.

Correlation and logistic regression analysis
Statistical analyses were performed on annual, winter and summer Tw. Summer was prioritised because this season typically has the highest Tw, and scope for riparian shade management. Winter Tw is ecologically significant to over-wintering aquatic organisms and was also assessed as a comparator to summer. Annual and seasonal mean Tw anomalies (national and regional) were correlated with the concurrent NAO index, using the Spearman Rank correlation (rho), to account for non-normal data.
Following Johnson et al. (2014), logistic regression models were fitted to paired daily maximum Ta and Tw values measured at each LUTEN site. As with other Ta-dependent models, Ta is a proxy for heating of the water body by solar radiation. The S-shaped logistic function is preferred over a linear model because Tw departs from linearity for Ta < 0 • C due to thermal buffering of river temperatures by hyporheic and phreatic water from channel margins. The Ta-Tw relationship is also non-linear at high Ta due to evaporation and latent cooling which set an upper limit for Tw. All three parameters of the logistic model (alpha, beta, and gamma) have units in • C. Alpha specifies the upper asymptote of the modelthe maximum Tw that the model can predict for any Ta. Beta is the inflection point of the curvethe Ta with greatest rate of change in Tw with Ta. Gamma is the gradient of the model at the inflection pointthe increase of Tw per unit increase in Ta. Separate models were calibrated for each year and site to examine variability in logistic parameters with NAO phase.

Results
We begin by describing inter-annual variations in Tw detected at the national scale linked to NAO phase. These findings provide the context for regional-and catchment-scale variations in Tw. Through this nested analysis we explore the dominant controls of Tw at each scale.

National-scale analysis
Over the period 2000-2018, annual mean Tw changed by − 0.4 • C/decade, broadly in line with changes in the CET (r = 0.68, <0.05) (Fig. 3a). However, correlations between annual mean Tw and the NAO (r = 0.11, p > 0.05) or between Tw and all England runoff (r = 0.03, p > 0.05) were both statistically insignificant. This is likely because annual averaging of NAO masks years that contain both positive and negative NAO phases. For example, 2015 has annual NAO = 0.19, concealing a strongly negative NAO in summer (-1.28) and strongly positive NAO in winter (+1.31). Similarly, annual mean runoff obscures considerable sub-annual variability. Therefore, subsequent analyses (below) were performed seasonally, focusing on winter (DJF) and summer (JJA) separately.
Like annual runoff data, nationally-averaged Tw metrics also conceal much regional heterogeneity (Fig. 4). As would be expected, rivers in southern England are on average warmer than those in the north, and sites in upland regions are cooler than those in lowland areas. Figs. 5 and 6 suggest an association between NAO and regional Tw anomalies in summer and winter, with negative NAO phases  (Fig. 6). Within these national patterns, some regional features also stand out, such as the hotspot in Cambridgeshire and Bedfordshire (CB) following the extraordinary rainfall and flooding in winter 2013 (Fig. 5) and cool spots around The Wash in summer 2002, and Solent South Downs (SSD) area in summer 2003 (Fig. 5).
To further evaluate these national patterns, the NAO was correlated with seasonal average temperature anomalies (Fig. 7). The correlation between NAO and summer temperature anomaly is significant (rho = 0.64, p < 0.01), with positive NAO coinciding with higher Tw. This association is stronger than the equivalent correlation (rho = 0.42, p < 0.05) for winter which, whilst significant, is weakened by four outliers. Winter 2006-07 was remarkable because it had both positive (Dec 2006, +1.34) and negative (Jan 2007, − 0.49) NAO phases in a single season, which averaged to NAO = 0.36. If maximum monthly seasonal NAO is used in place of the seasonal mean, this year lies closer to the regression line. Winters 2011-12, 2014-15 and 2017-18 all had extreme weather due to major storms (e.g. "Beast from the East" in 2017-18) and late winters, with very cold February-March but warmer Decembers. If these four winters are removed from the analysis, the correlation is rho = 0.81 (p < 0.01).

Regional-scale analysis
The above national-scale correlation analysis was repeated but now for each EA region. There were no statistically significant trends in winter Tw (Fig. 8a) but there were some correlations with winter NAO (Fig. 8b) at the regional scale. In summer, there were four regions (CB, DN, NDT and SWWM) with significant changes ranging between +0.6 and +1.1 • C/decade. There were also significant correlations between summer Tw and NAO in four regions of north and southwest England (CL, DC, NDT, and Y), with up to 49% explained variance. Overall, the weakest correlations between Tw and NAO were in CB, KSL, and W (in summer) and CB, LN, W and WT (in winter) -all regions with major Chalk and Limestone aquifers. The only region to have both a significant trend in summer Tw and correlation with summer NAO is Northumberland, Durham and Tees (NDT); the only region with a significant trend in summer Tw but no link to NAO is Cambridgeshire and Bedfordshire (CB). This region has potentially the greatest signal to noise (variability) ratio, so represents the most promising sentinel area for detecting trends in Tw that are not associated with the NAO.
Sample sites below 15 m (Fig. 8c) or above 300 m (Fig. 8d) were extracted from the national dataset and the Tw correlated with NAO to investigate the influence of elevation. The correlation between summer Tw and NAO is stronger for upland (rho = 0.70, p < 0.01) than lowland (rho = 0.60, p < 0.05) sites. The gradient of the best fit regression is also steeper for upland (0.57 • C/NAO unit) than lowland (0.38 • C/NAO unit) sites. Hence, unit changes in NAO are associated with larger responses in Tw at higher elevation sites. Again, for long-term detection, lowland sites are preferred for monitoring Tw trends unassociated with NAO.

Catchment-scale analysis
The logistic models describe between 76% (D23, 2015-16) and 93% (D6, 2014-15) of the variance in daily maximum Tw, with an average of 88% across all sites and years. The forms of the curve vary markedly between sites (Fig. 9a) due to spatial and temporal variations in the three logistic parameters. The upper bound Tw set by the alpha parameter ranges between 12.6 • C (D1 in 2018-19) and 25.8 • C (D20 in 2012-13), reflecting the relative contribution of groundwater to summer flow at each site (Fig. 9b). The beta parameter indicates the Ta value where the gradient with Tw is steepest, and generally charts the downstream gradient in Tw from cool headwaters D1, to warmer mid-and lower-reaches D16 to D20 before the cool groundwater fed site at D23 (Fig. 9c).
The amount of explained variance in Tw by Ta is interpretable as the local significance of thermal buffering by groundwater versus radiative heating. Relatively low explained variance for D23 is expected due to substantial groundwater influx in reaches above the site. This effectively weakens the influence of Ta. Previous research shows that the gamma parameter of the logistic model is negatively correlated with cumulative upstream riparian shade and groundwater inputs (Johnson et al., 2014). This is evident in the spatial variation of the gamma parameter ( Fig. 9d) with upstream sites (D1, D6 and D10) generally less buffered by groundwater than downstream sites (D17, D20 and D23), hence giving higher gamma values (i.e. sensitivity to Ta) closer to the headwaters. Riparian shading will have greatest impact on Tw where gamma values are high as this indicates sites that are sensitive to Ta (as a surrogate for solar radiation). In contrast, sites with low gamma values are less sensitive to shade, and hence other management measures will be needed upstream. Fig. 10 shows the variability in logistic form at selected sites under positive (red lines) and negative (blue lines) phases of the NAO. Overall, there is a tendency for lower maxima Tw (or at least lower bounding curves) in years with negative NAO, especially at sites D17 and D20, and higher curves under positive NAO. Logistic curves for headwater and groundwater sites (D1, D6 and D23) are remarkably stable between years. We express the uncertainty due to NAO variability by using the logistic curves to calculate the maximum range in Tw given Ta = 25 • C (CV 25 ). This metric spans 1.3 • C (D23) to 3.9 • C (D20) and is generally lower for shaded/ spring-fed sites than for unshaded sites. As well as variations in the logistic form at each site, the NAO phase also modulates differences in Tw between sites. Inter-site variations in Tw for the same year tend to be greater under positive (Fig. 11a) than negative (Fig. 11b) NAO phases. For example, the CV 25 for D6 (shaded) and D20 (open) is 2.8 • C for positive NAO but only 0.2 • C for negative NAO. Overall, the maximum CV 25 range was 3.1 • C under negative NAO (for D10 versus D23) and 6.3 • C under positive NAO (for D1 versus D20). However, there is also uncertainty in Tw as represented by the upper/lower bound 95% confidence estimates of the logistic parameters. This uncertainty is   . 7. Associations between mean deviation in Tw from regional average Tw and NAO in (a) winter and (b) summer. Four winters with extreme weather are labelled (2006, 2011, 2014 and 2017). T-bars show standard error of mean estimates.
typically greater for open sites such as D20 than shaded and/or spring-fed sites such as D10 and D23, respectively. For example, given Ta = 25 • C, the uncertainty in Tw at D20 is 1.9 • C for negative NAO and 2.4 • C for positive NAO (Fig. 11c and 11d). The equivalent uncertainty range in Tw for D10 is 1.5 • C to 1.7 • C, and for D23 is 1.5 • C to 2.2 • C. Hence, local factors have stronger influence on Tw than the NAO, but spatial variations (and logistic model uncertainties) are heightened under positive NAO.

Discussion
The NAO has been used before to account for variability in Tw. However, our analysis is the first to present a nested nationalregional-catchment-scale analysis of NAO signatures spanning nearly two decades. This helps to situate local adaptation measures in a broader spatial-temporal context and leads us to some practical implications.

National-scale analysis
We found that annual mean Tw changed by − 0.4 • C /decade during 2000-2018 for England whereas Orr et al. (2015) reported warming of Tw by +0.3 • C/decade during 1990 to 2006 for England and Wales. Trends in annual mean CET over the same periods were 0.0 • C /decade and +0.6 • C /decade, respectively. Both Tw studies fall outside the period of striking positive trends in the NAO between the 1960s and 1990s; more recently the NAO has been less positive, with some extreme negative excursions (Hanna et al., 2015(Hanna et al., : 2540. Hence, no regime shift in Tw is expected in either study period. The difference in Tw between the two studies is probably within the bounds of sampling uncertainty, not least because the accuracy of most hand-held temperature loggers is 0.1 to 0.3 • C depending on instrument. However, it may also be explained by methodological differences, in particular the disparity in periods and domains analysed. National-scale Tw trends should be interpreted in terms of the homogeneity of sampling time, instrumentation, site distribution/ elevation, and network density . We were careful to standardize for inconsistent spot sampling times as previous studies show that this factor can yield potential errors of ±1.5 • C (Everall et al. 2015) or introduce artificial warming trends at some sites (Toone et al., 2011). The shift in the EA network locus ~30 km to the northeast, combined with 5 m drop in mean site elevation, is unlikely to have altered Tw trends. However, the reduced density of sites in recent years, particularly in 2017 and 2018, may impact national and regional averages (see Fig. 2a). This could be further investigated through boot-strap analysis of Tw using years with greatest network density.
Correlations between annual mean Tw and NAO or with England runoff were statistically insignificant, but there was a significant association between Tw and CET (r = 0.68). The strength of the correlation between NAO and Tw averaged across the whole of England was greater in summer (rho = 0.64) than in winter (rho = 0.42). These values are comparable with other studies reporting r = 0.44 for the NAO with winter Tw in the Danube at Ybbs, Austria (Webb and Nobilis, 2007), r = 0.59 for winter Tw in moorland and forest streams at Llyn Brianne, Wales (Durance and Ormerod, 2007), rho = 0.60 for winter and rho = 0.49 for summer in the Danube at Bezdan, Serbia (Basarin et al., 2016). In these cases, the NAO is regarded as a proxy for the position of storm tracks, seasonal air temperatures and precipitation totals, heavy rainfall, groundwater levels and river flowall of which can influence the thermal regimes of rivers.
Our national analysis further reveals places that have comparatively large and stable Tw anomalies under opposing NAO phases. For example, Dartmoor and Exmoor in southwest England (DC in Fig. 1) are persistent 'cool spots' in winter whereas rivers draining into the Wash in east England form a 'hot spot' in summer (Fig. 4). Regions underlain by major aquifers also tend to have damped Tw responses to NAO, although local variations in aquifer sensitivity will depend on unsaturated zone thickness and the properties of subsurface hydrological pathways (Rust et al., 2020). Nationally, these are the areas most suited for monitoring multi-decadal changes in Tw that are not linked to the NAO.

Regional-scale analysis
The regional analysis showed significant warming in summer Tw across parts of central and northeast England, but no trends anywhere in winter (Fig. 8). The strongest association between Tw and the NAO occurred in northeast and west England during summer, and at sites above 300 m (Fig. 8). Again, this is consistent with the expectation that NAO signatures in orographic precipitation and river flow are amplified in the uplands (Burt and Howden, 2013), not least because of the smaller water volumes (and hence lower thermal inertia) in headwaters compared to lowland rivers. Under positive NAO in winter, storms tend to pass more frequently over southern England. Conversely, under positive NAO in summer, there is typically high-pressure/ anticyclonic blocking over southern England which shifts the Atlantic storm track northwards with drier conditions to the south (Dong et al., 2013). Positive NAO is also correlated with fewer summer cyclones over the British Isles (Matthews et al., 2016).
Lower precipitation and runoff volumes in summer (under possitive NAO) favour more rapid heating and cooling of river flows, with less buffering by diminished groundwater inputs (Webb et al., 2003;Arora et al., 2016). Headwater streams are particularly vulnerable due to their relatively low volume of water, increasing their sensitivity to solar heating. This is identifiable in the River Dove by declining gamma values with distance downstream. Gu and Li (2002) describe the 'diminishing returns' with Tw management better focused in headwaters where Tw is sensitive to change, rather than further downstream where Tw is lagged and defined by upstream processes. Based on their sensitivity analysis of heat balance equations and hypothetical examples, Gu and Li (2002) identify a theoretical 'critical distance' of 32 km from the source, upstream of which is sensitive to hydrological and metereological changes. Similarly, Johnson and Wilby (2015) propose the upstream 20 km of the River Dove as most suitable for riparian shade provision, although they also acknowledged that for this particular river, the most upstream 6 km are buffered by groundwater inputs with limited exposure to solar heating. The target reaches will also depend on local channel width and scope for canopy closure. Understandably, headwater channels have been highlighted for restorative action (Riley et al., 2018).

Catchment-scale analysis
The LUTEN network straddles two regions (DNL and SWWM in Fig. 1) with both summer warming trends and associations with NAO that are statistically significant (Fig. 8). Practically significant NAO signatures also emerge in the daily thermal series of open and shaded sites at the sub-catchment scale (see below). During summer, positive phases of the NAO are manifested by generally higher daily maximum Tw and greater contrasts in Tw between open and shaded sites. During summers with negative NAO, the opposite conditions apply. This is because the thermal benefits of riparian shade are greater under sunnier/drier conditions than during cloudy/ wet summers. Hence, the maximum thermal benefit of riparian shade for catchments in central and southern England is observed when the summer NAO phase is positive. Under such conditions, the difference in mean daily maximum Tw between shaded and unshaded sites in the River Dove was ~3 • C.
In contrast to the benefits of riparian shade which vary between years and sites depending on prevailing weather, the effect of groundwater is remarkably constant. For the River Dove, groundwater inflow has greater influence on Tw than riparian shade, with similar findings reported elsewhere on the significance of groundwater and hyporheic flows to Tw dynamics (Burkholder et al., 2008;Arrigoni et al., 2008;Dugdale et al., 2013;Wawrzyniak et al., 2017). It is also telling that the weakest regional correlations with NAO were in areas of Chalk and Limestone (CB, KSL, LN, W and WT) for rivers with relatively large volumes of spring-flow. In these areas, Tw variations are not so strongly coupled with prevailing weather conditions. Therefore, thermal management of rivers should include surveying and protecting areas of groundwater influx, since these habitats may provide important thermal refugia for aquatic life (O'Driscoll and DeWalle, 2006;Everall et al., 2015).

Practical implications
The above findings have connotations for measures that are intended to 'keep rivers cool':

Identifying sentinel sites for long-term monitoring and reporting
Our national-and regional-scale analyses of the EA archive highlighted regions and seasons where Tw is relatively insensitive to NAO (the primary driver of inter-annual hydroclimatic variability in the UK) (Fig. 8). These are essentially rivers in southern and eastern England with significant groundwater influences. We also recognize that the EA water temperature monitoring network was not established with climate detection or benchmarking in mind. However, to maximise the utility of future monitoring, sampling should be undertaken at standard times at fixed locations, especially within the sentinel regions identified above. Further scrutiny of EA archives may help to identify specific records that meet these criteria. Then, with a network of a few well-chosen sites and rigorous sampling protocols, it would be possible to construct a national indicator of Tw for detecting changes to thermal regimes in rivers. This would fill a gap in the list of priority indicators identified by the Committee on Climate Change (2019) thereby providing future CCRAs with a key metric of ecosystem health and function.

Evaluating the true thermal benefit of riparian shade
Our catchment-scale analysis shows that the assessed thermal benefit of riparian shade depends on physical location within the river network and on prevailing climate conditions. Uncertainty in logistic regression model predictions is similarly sensitive to NAO phase. Differences in mean daily maximum Tw between open, shaded and spring-fed river reaches are maximised under positive summer NAO. However, the summer NAO index has been predominantly neutral or negative in the 21st centurysince year 2000 only three summers had positive NAO (namely, 2002, 2013 and 2018). This suggests that Tw field experiments set up in England during the last 20 years to evaluate riparian shade effects may have understated thermal benefits. Therefore, we recommend that when assessing the impact of riparian shade management on river temperatures, claimed efficacies are qualified by stating the prevailing weather conditions. In general, more field-and model-based evidence is needed optimise the design of adaptation measures and to demonstrate value for money (Wilby et al., 2010).

Protecting and creating cool water refugia
Our analyses showed the local, stabilizing influence of groundwater flows on Tw, regardless of NAO phase (contrasting with the more differentiated efficacy of riparian shade). We have previously recommended the protection of cool thermal refugia from other pressures, alongside riparian shade management (Everall et al., 2015;Johnson and Wilby, 2015). Others have called for a broadening of perspectives on environmental flow assessment to incorporate both water quantity and temperature management (e.g. Olden and Naiman, 2010). Unfortunately, historic river management practices have tended to disconnect surface flows from groundwater through channelization, incision and lowering of water tables. This has created a legacy of impacts on Tw through reduced buffering that riparian shading alone is unlikely to fully mitigate. However, efforts to restore environmental flows by reducing groundwater abstractions could bring a co-benefit of lower Tw through improved water volumes. Another option is to enhance cold-water refugia by temporarily pumping groundwater via flow augmentation schemes, particularly during episodes of low flow and/or high thermal stress (e.g. Kurylyk et al., 2015;Wilby, 1993).

Assessing the impacts of climate change
We showed that the form of logistic regression models depends on the phase of the NAO, with higher Tw generally predicted under positive NAO for the same Ta, perhaps reflecting generally drier and sunnier conditions (see Fig. 10). For example, logistic models for the open site D20, predict Tw = 19.0 • C (calibrated under positive NAO) or Tw = 16.5 • C (under negative NAO), when given Ta = 30 • C. This 2.5 • C variance might not appear much but an equivalent difference between sites on the River Dove was enough to significantly alter the phenology of populations of the mayfly Ephemera danica with implications for their resilience to future climate and/or environmental change (Everall et al. 2015). [For the shaded site D10 under positive NAO and Ta = 30 • C, the equivalent Tw = 16.4 • C, so has a 2.6 • C benefit relative to unshaded]. Other regression-based models of Tw that are used for rapid climate risk assessment (e.g. Caldwell et al., 2015;Rehana, 2019) are likely sensitive to the hydroclimatic conditions under which they were calibrated. Similar concerns about the transferability of hydrological models between contrasting climate regimes have been raised elsewhere (Wilby, 2005;Broderick et al., 2016). Since climate model projections of the summer NAO suggest a general shift towards the positive phase along with higher air temperatures in the UK (Murphy et al., 2019), regression models of Tw calibrated under positive NAO are recommended.

Conclusions
This research shows that the strength of the correlation between Tw and NAO varies in space and time across England. This matters for schemes intended to keep rivers cool because natural climate variability confounds trend detection and interpretation of field data gathered to evaluate interventions. Climate variability also increases uncertainty in models used to assess future changes in Tw. Monitoring programmes in England were not established with these issues in mind but quality assured records could be identified for a few rivers where likelihood of trend detection is high. Such data could underpin a priority national indicator of long-term Tw for CCRAs as well as more robust assessments of adaptation options. We also discussed the scope for holistic management of riparian shade and groundwater to protect cool water refugia.
There are many opportunities for further research. These include more exhaustive geospatial analysis of the influence of elevation, geology, land cover, and shade on national patterns of Tw. Detailed case studies could also be undertaken of the thermal footprint of recent extreme events such as the 'Beast from the East' (February 2018), the heatwave and record-breaking temperatures (July 2019), or the sunniest month ever (May 2020). The thermal signature of other climate modes of variability beyond NAO could be investigated leading contenders might be ENSO and the Arctic Oscillation. Equipped with this information, national capacities would be strengthened for monitoring and modelling climate changes in a fundamental indicator of freshwater ecosystem health and function.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.