Long-term daily stream temperature record for Scotland reveals spatio-temporal patterns in warming of rivers in the past and further warming in the future

once-a-month observations

Evaluating long term historical trends in T w worldwide has revealed that these can be either decreasing (e.g. in large parts of the tropics and southeast Asia) or increasing (e.g. in the Northern Hemisphere excluding the arctic regions) and are strongly correlated to trends in T a (Wanders et al., 2019), but the dominant trends reflect warming patterns with an average global increase in T w of 0.16°C per decade between 1960 and 2014 (Wanders et al., 2019). For Europe, Webb (1996) identified an increase of up to ca. 1.0°C in T w and up to 1.4°C in mean T a during the 20th century. Others have shown that there has been a relatively stronger increase in T w in Europe since the 1960s (Moatar and Gailhard, 2006;Pohle et al., 2019). More specifically for the UK, studies have also demonstrated an increase in temperatures for freshwater bodies since the 1970s or 1980s (Durance and Ormerod, 2009;Hannah and Garner, 2015;Orr et al., 2015;Watts et al., 2015;Webb and Walling, 1992). For a river in NE Scotland, for example, Pohle et al. (2019) observed higher warming after 1961 with an increase in T w of 0.2°C per decade (analysed period: 1912-2016). They reported that this coincided with an increase in T a of the same magnitude. For another Scottish upland river, Langan et al. (2001) reported an increase of 2.0°C in maximum T w for winter and spring over a 30 year period . They also related this to changes in seasonal T a .
Changes in T w are often closely related to changes in T a as both are influenced by atmospheric conditions, solar radiation, wind speed/humidity and evaporation/condensation (Arora et al., 2018;Caissie, 2006;Jackson et al., 2018;Kelleher et al., 2021;Lisi et al., 2015;Michel et al., 2020;Webb et al., 2008). Furthermore, Rabi et al. (2015) highlighted the importance of T a in the preceding days, indicating the delayed response of T w to net radiation compared to T a . Under the RCP8.5 level of radiative forcing, T a may increase in the future by >6.5°C by 2081-2100 in the north of Europe (Gutiérrez et al., 2021). Projections for seasonal UK air temperatures suggest that mean summer T a may increase by 3.6°C -5.2°C, and mean winter T a by 1.8°C -3.6°C by 2061-2080compared to 1981-2000(Met Office, 2019. For UK climate projections, the UKCP18 Strand 3 dataset (Met Office Hadley Centre, 2018) indicates the potential for an increase in summer surface T a of up to 7.0°C and winter surface T a of up to 4.0°C at the RCP8.5 level of radiative forcing (2061-2080 relative to 1981-2000). Even though a stronger increase in surface T a is mostly expected for southern England, Scotland's surface T a is estimated to increase by up to 5.0°C in summer and 4.0°C in winter by 2061-2080 (relative to 1981-2000) (Met Office, 2019).
The overall future increase in T a is expected to result in raised summer T w in many temperate climate zone catchments (Dugdale et al., 2018). Van Vliet et al. (2011) explored the impact of T a and discharge changes on future daily T w for 157 global river stations and predicted increases in annual mean stream temperatures of +1.3°C, +2.6°C, and +3.8°C under T a increases of +2°C, +4°C, and +6°C, respectively. For the UK, a range of studies also suggested a general increase in mean annual T w in the future (2071-2100), ranging between 1.0°C to 3.6°C under different climate scenarios (van Vliet et al., 2015(van Vliet et al., , 2013. However, climate change has both direct and indirect impacts on stream temperature, which are spatially variable (Liu et al., 2020;Punzet et al., 2012;van Vliet et al., 2013). Furthermore, in addition to climate, changes in hydrological conditions, landscape/channel characteristics and natural/artificial thermal inputs (Arora et al., 2016;Caissie, 2006;Dick et al., 2017;Dugdale et al., 2018;Jackson et al., 2017a;Kelleher et al., 2021) have also been recognised to influence changes in T w . A general understanding of controls on T w and knowledge of sensitive sites is therefore important for management strategies now, but also in future. Generating such understanding relies on the availability of high spatial and temporal resolution T w data. Studies with long-term data are typically available for single sites (Kedra, 2017;Pohle et al., 2019;Rabi et al., 2015;Webb and Walling, 1992), while efforts for many sites generally involve shorter-term data (Arora et al., 2018;Jackson et al., 2018Jackson et al., , 2017bJackson et al., , 2017aLisi et al., 2015;Steel et al., 2016). Only a few international studies such as Garner et al. (2014); Arora et al. (2016); Piccolroaz et al. (2016);Michel et al. (2020);and Kelleher et al. (2021) have analysed long term-data across multiple sites . Still, the temporal resolution in such T w datasets is typically coarse (i.e. less than daily). Coarse temporal resolution is known to introduce bias, as it is unlikely to represent the complete thermal variability of daily data (Koch and Grünewald, 2010). Being able to deduct high resolution data from such low resolution longer-term T w datasets across many sites would therefore provide an important step towards understanding T w trends and changes over longer periods of time (Isaak et al., 2012;Moatar and Gailhard, 2006;Orr et al., 2015), as well as the controls on past and future T w patterns.
A few studies, such as Seyedhashemi et al. (2022) and Zhu et al. (2019), have focused on obtaining daily stream temperature from coarser resolution data sets. These involved a range of different modelling approaches such as different regression models (Koch and Grünewald, 2010), k-nearest neighbour (St-Hilaire et al., 2012), dynamic 1-D water energy routing modelling (Wanders et al., 2019) or standard statistical models such as air2stream (Piccolroaz et al., 2016). These studies mostly agree on improved model performance by including current T a and preceding (1-2 days) T a and T w (Koch and Grünewald, 2010;St-Hilaire et al., 2012;Wanders et al., 2019). While these studies are all focusing on modelling daily T w from T a and in some cases include variables for seasonality or flow, combinations of a variety of climatic and hydrological variables are not considered. Hence there is a need for a relatively simple and widely applicable methodology to obtain long-term daily T w records from coarser temporal observation records by implementing simple climatic and hydrological variables. One such solution to acquire historical high temporal resolution datasets is by reconstruction from temporal sparse data by generalized additive models (GAMs) as shown by Pohle et al. (2019). GAMs are similar to generalized linear models but replace the linear form by a sum of smooth functions providing a flexible method to identify nonlinear effects in regression models (Hastie and Tibshirani, 1986) and are therefore suitable to consider complex relationships between T w and different environmental variables and model T w under a changing environment (i.e. change in seasons). The application of GAMs to derive daily T w data from coarser resolution data would then enable better understanding of (i) the relative importance of various drivers for T w and (ii) potential changes in T w under future climate scenarios.
The main aim of this study was to create a national daily stream water temperature dataset from once-a-month observations and use this to gain insights into annual as well as seasonal historic and future spatial patterns in T w . This was achieved by developing and applying GAMs to a long-term once-a-month national dataset for Scotland, UK. The specific objectives were to (i) reconstruct a national long-term daily T w record, (ii) analyse historic trends and the role of environmental controls and (iii) explore future trends to identify most vulnerable sites.

Study catchments and data
This study focussed on 45 catchments across Scotland ( Fig. 1 and Appendix A), covering~33,200 km 2 (approximately 43 % of mainland Scotland) in total. These catchments are part of a long-term national water quality monitoring programme, the Harmonised Monitoring Scheme (HMS). The HMS dataset contains monthly T w for 56 rivers (monitored since mid-1970s) across mainland Scotland collected once a month near the mouth of the network draining the represented catchment as part of regulatory water quality sampling (Scottish Environmental Protection Agency, 2013). Monitoring in eleven catchments ceased prior to 2015, so that the dataset for this study was reduced to those 45 catchments with more complete and comparable long-term records ( Fig. 1a and Appendix A). Complementary runoff data at the same or nearby sites were extracted from the national river flow archive (NRFA) (UK Centre for Ecology and Hydrology, 2019). The average distance between an HMS and nearby NRFA sampling site is 295 m with a maximum of~1.7 km (for catchment 37, which drains and area of 164 km 2 ).
The 45 catchments are representative of a range of different environmental conditions, in terms of climate, topography and land cover ( Fig. 1 and Appendix A). Catchment size varies between~60 km 2 and 4580 km 2 . Mean elevation of the catchments ranges from 79 m to 458 m ( Fig. 1a) with lower mean catchment elevation mostly occurring in central Scotland and higher elevations in the north (the area of the Cairngorms and Highlands). Mean annual catchment precipitation ranged from 766 mm to 2693 mm (Fig. 1b), with lower precipitation values in the east compared to west of Scotland. For the reference period 1981-2010, annual mean air temperature (T a ) as catchment averages ranged between 5.8°C and 8.5°C (Fig. 1c), higher temperatures are observed in the central and southern part of Scotland. Scotland's main land uses are moorland grazing and arable agriculture (Morton et al., 2007). Woodland cover is between 8 % and 57 % for individual catchments, with higher values in southern Scotland (Forestry Commission, 2020) while settlement density is the highest in the southern central area (Fig. 1d). Observed mean annual discharge for all catchments ranges from 265 mm to 2454 mm. Baseflow index (BFI), as derived from modelled runoff using the baseflows function implemented in R package hydrostats (Bond, 2016), is between 0.14 and 0.52 (Fig. 1e) showing lower groundwater contribution estimates in southern Scotland and along the west coast. The geology of Scotland is showing a clear division of mostly sedimentary basement in the south and metamorphic basement in the centre and north divided by the highland boundary fault. At the HMS sampling points annual mean T w for the reference period 1981-2010 ranged between 8.0°C and 10.5°C (Fig. 1f), with higher temperatures mostly in the south of Scotland.
While the HMS dataset provides monthly T w data, higher resolution observations for the River Gairn, a 146.5 km 2 sub-catchment of the Dee (catchment 41 in Fig. 1a) were used to independently evaluate the approach. The Gairn dataset consists of minimum and maximum T w data at 15 min intervals for the period 2013-2017. The 15 min interval data were averaged to generate hourly and daily records. To replicate the monthly sampling of the HMS dataset, data of a single day within a month were extracted for the sampling days available for catchment 41. As no information on the specific sampling time was available, 10:00 am was chosen to replicate the single measurement of a day.
Daily meteorological data (maximum and minimum T a [°C] and precipitation [mm]) for the time period 1960-2015 were obtained from HadUK-Grid at a spatial resolution of 1 km × 1 km (Met Office et al., 2018). Average daily T a was calculated as the mean of maximum and minimum daily T a . From the UKCP18 dataset, we also extracted future 12 km × 12 km resolution daily meteorological data , for 12 climate projections simulated using the HadRM3 Regional Climate Model for the RCP8.5 level of radiative forcing (Met Office Hadley Centre, 2018). This future climate dataset was partially downscaled to a 5 km × 5 km resolution and spatially bias corrected following Rivington et al. (2008b). This bias correction method assesses the climate model's ability to simulate daily T a and precipitation interpolated observations (Perry et al., 2009) for each 5 km × 5 km grid cell, from which correction factors are developed. For precipitation this means (I) correcting the number of dry days (P < 0.2 mm), by subtracting a value (generally <0.3 mm) from all rainfall events so that the mean number of modelled dry days matches observations. This is performed as Rivington et al. (2008a) identified a systematic error where there were too many days with low precipitation amounts and (II) a correction to match the Mean Annual Total precipitation, which corrects for the models' original error and the dry day correction. For maximum and minimum T a , the minimised value was the difference between the observed and modelled sum of daily means per month. The mean correction values (+ or -) for the 1960-1990 baseline period for each month were identified per grid cell and applied additively. On the assumption that errors in regional climate model estimates for the observed period are systematic, and therefore exist in future projections, the correction factors were applied to the 12 future climate projection daily data. The objective of this method is to minimise the changes to individual daily data values and so maintain the climate change signal within the HadRM3 projections. Future catchment average precipitation and T a were then calculated for all catchments from the 5 km downscaled dataset.

Reconstructing daily stream temperature data from monthly observations
Daily T w was modelled for the individual catchments using generalized additive models (GAMs) with seven potential explanatory variables and calibrated using the monthly observed T w values from the HMS dataset (Fig. 2). The explanatory variables include daily values for T a , cumulative T a , astronomical daylength, total precipitation, snowmelt, runoff, and the ratio of snowmelt over runoff (Fig. 2). These variables were selected because they (i) have been recognised by other studies to have the strongest influence on changes in T w , as summarized in the introduction, and (ii) are relatively simple to obtain, and thereby most consistently and readily available. Appendix B provides a summary of the Pearson correlation coefficients between individual explanatory variables. Catchment T a data were extracted from the UKCP18 dataset, as a weighted average of all grid cells covering the individual catchments. Cumulative T a was computed as the cumulative sum of air temperature throughout the thermal year (1st December -30th November). The astronomical daylength was derived using the R software package oce (Kelley et al., 2018) for each catchment centroid.
To obtain variables related to precipitation and runoff, a snow model and hydrological model, were used, respectively. The snow model is based on a single-layer degree-day model (Spencer, 2016) and runs on a daily timestep. It was used to simulate effective precipitation and snowmelt based on observations of total precipitation and T a from HadUK-Grid data for the past and bias corrected UKCP18 data for the future. Daily runoff was simulated using the conceptual hydrological model TUWmodel (Parajka et al., 2010) for each catchment to enable closing data gaps in the past and runoff modelling for the future. It was parametrized by calibrating against observed daily runoff from the national river flow archive (NRFA) dataset (UK Centre for Ecology and Hydrology, 2019) using the Nash-Sutcliffe efficiency (Nash and Sutcliffe, 1970) for natural logarithms of runoff (Krause et al., 2005). The internal snow routine of the TUWmodel was deactivated to explicitly account for snow as simulated by the single layer degree-day snow model. The GAMs were then used to model T w for all possible combinations of the seven potential explanatory variables, using the GAM function of the mgcv package (Wood, 2021) implemented in R (R Core Team, 2021). To account for antecedent conditions, a moving average over the preceding days was applied to T a , cumulative T a and astronomical day length. The time window for which the moving average was applied was determined by the highest correlation (based on Pearson correlation coefficient) between the moving average of T a and T w observations for each catchment. For the TUWmodel and GAMs, the data prior to 1996 was chosen as a warmup period for the models, while data from 1996-2005 were chosen for calibration and 2006-2015 for validation. This allowed for calibration and validation of the models using near-continuous observed timeseries. All possible combinations of the seven variables resulted in a maximum of 128 different GAMs per catchment. Firstly R 2 > 0.7 was used to select the best 20 GAMs (out of the 128 runs) for T w in both the calibration and validation period, as R 2 is a simple way to determine the quality of a linear model. In case >20 GAMs fulfilled this criterion they were additionally ranked and selected based on their Akaike Information Criteria (AIC). The wider model performance of the selected GAMs T w models is summarised in Appendix B. In addition to R 2 , this includes the independently calculated bias, root mean square error (RMSE), Nash-Sutcliffe efficiency (NSE) (Nash and Sutcliffe, 1970) and Kling-Gupta-efficiency (KGE) (Gupta et al., 2009), which were all calculated using the hydroGOF R package (Zambrano-Bigiarini, 2020). Finally, the results of the top 20 GAMs were plotted in form of response curves against the explanatory variables to identify and exclude physically implausible models for each catchment individually.
To independently evaluate the modelling approach, we used it to simulate daily T w data based on the monthly subsampled data for the Gairn catchment, which were then compared with the 5 years (2013-2017) of daily observed discharge and T w timeseries. As the timespan for the River Gairn dataset represents a shorter timespan (mid 2012start 2017) compared to the HMS dataset (mid 1970 -2015), the calibration period was adjusted to mid 2012end 2014 and the validation period to early 2015end 2016.
To identify dominant controls on simulated daily T w for individual catchments, the p-value for variable significance was calculated using Pearson correlation. A variable was considered as significant for a GAM with p ≥ 0.01/n (n = 7; number of variables in the model). The total importance for each variable per catchment was calculated as the sum of times it was considered as significant.

Analysis of historical trends
Daily T w was calculated for outputs of each of the 20 best performing GAMs. Averages of these 20 results are presented in Sections 3.2 and 3.3, so that each catchment has one value but with consideration of uncertainty in the modelling. Using a five-year starting window, annual and seasonal (Winter: Dec -Feb, Spring: March -May, Summer: Jun -Aug, Autumn: Sept -Nov) trends in form of Sen slopes (Sen, 1968) were calculated for a 30-year period using 1961, 1971 and 1981 as the starting year. This means for the starting year 1961, trends for five 30-year periods (starting 1961, 1962, 1963, 1964, 1965) were calculated. The significance of a trend was evaluated using the Yue and Wang (2004) variance correction for autocorrelated data. Where four out of five 30-year windows indicated a significant trend (p-value <0.05) in the same direction, a significant trend was assumed for the entire period and a median Sen slope over all models for the entire period was calculated; in case of more than one insignificant or contrasting trend, results were marked as "contrasting trends". This approach was chosen to allow for identification of inconsistencies and artefacts in the dataset.

Future stream water temperature trends
The period between 1981 and 2010 was considered the 'baseline' for future T w projections. The future projections were generated based on the climate data. Future cumulative T a , snowmelt, runoff, and ratio snowmelt/ runoff for the 12 climate projections were generated as described in Section 2.2. These were then used for each of the best 20 GAMs from the historical dataset to calculate T w for each climate projection. Results from the 12 projections were then averaged to a single dataset of future T w (2031-2080) for each of the 20 models. Trends for future T w were calculated according to the same procedure as for the historical dataset (Section 2.3). We focussed specifically on trends in annual and seasonal T w for 1981-2010 and 2051-2080, to consider the longest possible period into the future.

Model performance and evaluation
Overall, the modelling approach showed good performance across all catchments (Fig. 3, Appendix B). Mean R 2 over all 900 models (20 models per catchment) is 0.94 for calibration and 0.93 for the validation period (Appendix B). Other independent model evaluation criteria, which were not used in the model selection process, further indicate the suitability of the approach (Fig. 3 and Appendix C).
The quality criteria for the individual catchments also indicate a good performance (Fig. 3) with mean R 2 of 0.87-0.96, mean KGE 0.87-0.98, mean NSE 0.86-0.96 and mean RMSE 0.94°C -1.55°C. These values have been calculated as a mean for the selected models of each catchment individually. All quality criteria are showing similar spatial patterns with relatively lower model performance for catchments in northwest and south Scotland (Fig. 3). There was no clear relationship between model performance and catchment size. The modelling approach also performed well for the independent test using the dataset from the River Gairn. Mean R 2 (calibration 0.93, validation 0.89) was only slightly lower than the mean over all catchments (Appendix C). The data gap in early 2016 (due to logger issues) did not affect the models as the calibration period was defined from mid 2012 to end 2014. The simulated data are within the range of the observed T w (Fig. 4 a  & b), although the models do seem to slightly overestimate T w for temperatures <5°C (Fig. 4 c). Nevertheless, they indicate a good fit for higher temperatures, which is most important here in the context of warming and temperature thresholds.

Environmental controls on stream water temperature and historic trends
The importance of environmental controls on T w based on the selected models per catchment indicated a unique combination of variables for each catchment (Fig. 5). Although some spatial patterns existed, specific grouping of catchments in terms of the dominant controls to T w was not possible. Overall, T a had the most dominant control (Fig. 5a). Cumulative T a was relatively more important in northeast Scotland (Fig. 5b) while the spatial pattern for daylength was the opposite (Fig. 5f). Precipitation and runoff had a similar importance over all catchments (Fig. 5c & d). Snowmelt had no significant relationship with T w in the central area of Scotland and individual catchments in the north (Fig. 5e), which is consistent with relatively low snowfall in these areas compared to the rest of Scotland. We also found that the ratio of snowmelt/runoff had no significant relationship with T w for any of the catchments.
Response curves of GAM model means to individual explanatory variables (Appendix F) indicated that catchments with lower T a in the past generally also show lower T w . A similar pattern was found for catchments with relatively higher elevations showing on average lower T w . However, specific grouping of catchments based on their response curves was not possible.
The historical data analyses consistently revealed either an increase in annual T w or contrasting trends over different 30-year periods (Fig. 6). All catchments indicated an increase in annual T w from 1981 (Fig. 6k). The annual trends seem to be linked mainly to spring and summer T w , as winter most often shows contrasting trends, regardless of the starting window. Trends in autumn are more complex, as they revealed cooling or contrasting trends for starting window 1961, but warming in more recent decades.

Changes in future stream water temperature
The future annual and seasonal T w are showing different spatial patterns compared to the past. Almost all future T w (2051-2080) are consistently warmer than the current baseline (1981-2010) (Fig. 7 top two rows, respectively), with the strongest increase in T w in summer, and spatially, in northwest and west of Scotland. Future annual, spring, summer and autumn T w show a homogenous spatial pattern across Scotland (Fig. 7f & h-j), winter shows a more diverse spatial pattern with generally colder catchments in the north (Fig. 7g). All future projections show warming. Overall change in T w (Fig. 7k-o) is showing a contrary spatial pattern to past T w (Fig. 7a-e) leading to more homogenous T w all over Scotland in the future (Fig. 7f-j).
Daily baseline and future T w revealed the same temporal variation throughout the year, with a consistent amplitude (Fig. 8a). Fig. 8 b-f also reveal that while annual and seasonal T w consistently increase in T w over time, this is strongest in summer, followed by spring, autumn, and winter (see also Fig. 7k-o). This indicates that the increase in summer and spring T w is the is main cause for the increase in annual T w .
There is a strong negative correlation between annual T w in the past and future increase in T w (Appendix E subfigure c). This indicates that catchments with currently low annual T w will undergo a stronger increase in T w in the future compared to catchments already showing higher T w in the past. This will lead to more homogenous but also consistently higher T w all over Scotland in the future (see also Fig. 7).

Reconstructing long-term daily stream water temperature data from monthly observations
This study demonstrated that implementing relatively simple climatic and hydrological variables in GAMs is a suitable method to derive long-term daily T w from once-a-month observations. Here, we developed and applied the approach for 45 catchments across Scotland, providing a high resolution, national, long-term spatial dataset of T w . The evaluation across sites revealed an R 2 of 0.93 in the validation period. Furthermore, the R 2 was 0.89 for a shorter period of fully independent observed daily data. Also, other quality criteria showed reasonably good performance in calibration and validation period for both main and test dataset (Appendix C). The simulated data shows an overall good fit compared to daily observations and a slight overestimation for T w below 5°C can be seen as acceptable considering that under future climate predictions T w is in general expected to increase. While further analysis is needed to better understand factors that lead to an overestimation of T w for the temperatures below 5°C, it is speculated that this might be connected to the representation of snowmelt. Relatively few days of observed T w data had snowmelt and most of the models that had to be excluded due to being hydrological implausible included the snowmelt predictor. Previous studies that developed approaches for high temporal resolution mostly focussed on a single catchment (Koch and Grünewald, 2010;St-Hilaire et al., 2012;Zhu et al., 2018). Even though other studies show the suitability of modelling daily T w from only T a using more sophisticated approaches such as machine learning (Zhu et al., 2018), our approach shows comparable results in the quality of the models with mean R 2 = 0.94 over all selected models after calibration. While the method revealed an overall good performance, it could be expanded by applying different model combinations for different seasons. This would address previously established positive relations between specific variables and T w on a seasonal basis. For example, Simpson et al. (2010) demonstrated a seasonal relationship between T w Fig. 4. Model results for the River Gairn: a) timeseries of observed and simulated T w (incl. range), b) scatterplot observed vs simulated T w for calibration and validation period, c) exceedance probability for observed and simulated T w in calibration and validation period. and rainfall or flow respectively; and Hrachowitz et al. (2010) showed that the role of different landscape controls is also varying with the season.
A limitation of the data used is that T w and runoff for individual catchments were only observed at a single point near the catchment outlet. Studies that explored spatial variation in T w data within individual catchments in Scotland showed that there are complex internal T w patterns between tributaries and variations with river order (Hrachowitz et al., 2010;Jackson et al., 2021Jackson et al., , 2017b. The analysis of the historical part of the dataset is also not considering any natural or anthropogenic changes, apart from the explanatory variables included in the GAMs, in the environment of the catchments. Changes in land use and riparian woodland over time can have significant impacts on T w (Hrachowitz et al., 2010;Jackson et al., 2021;Moore et al., 2005;Ogilvy et al., 2022). Other anthropogenic influences like dams, water pumping, deviations, intakes and associated changes in discharge have also been shown to influence T w (Michel et al., 2020;Seyedhashemi et al., 2021). Exploring this role would be a crucial next step towards improved modelling of future T w timeseries. A starting point for anthropogenic influences could be to calculate trends for datasets that allow for Before-After-Control-Impact comparison or inclusion of artificial thermal inputs into the model where such data are available. Other uncertainties within the obtained historical dataset are related to the uncertainties introduced by implementing the UKCP18 RCP8.5 (e.g. interpolation) and HMS datasets (e.g. sampling time and frequency of the HMS dataset). Nevertheless, the developed method allowed us to gain a broad understanding of historical trends on widely available climatic and hydrological data.

Controls on spatial patterns and historical stream water temperature
The results showed that each catchment's T w is a response to a unique combination of environmental controls. This corresponds with findings by others (Jackson et al., 2017a), who showed catchment-specific relationships between T w and their T a metric, indicating that relationships were not transferable between catchments. Nevertheless, certain environmental controls were more dominantly correlated to T w than others. T a and cumulative T a were included in most models for all catchments, indicating the strong relationship of preceding and current T a on T w . This corresponds with findings from a wide range of previous studies which showed a close relation between T a and T w as well as the importance of preceding T a (Hannah et al., 2004;Hrachowitz et al., 2010;Jackson et al., 2018;Jonkers and Sharkey, 2016;Koch and Grünewald, 2010;Pohle et al., 2019;St-Hilaire et al., 2012;Wanders et al., 2019;Webb et al., 2003). Catchments for which snowmelt appeared unimportant for determining T w also showed a relatively lower importance of T a . We found that these catchments typically have relatively lower elevation with less presence of snow. This observation is in contrast to findings of Lisi et al. (2015) who revealed that low elevation catchments in Alaska were more sensitive to changes in T a , although in Alaska T a is generally lower overall. While Hrachowitz et al. (2010) found specific spatial patterns in the relationship between T a and T w as a result of river order and elevation, we were unable to identify these. This might be a result of the single point measurements used for this study which are located close to the outlet and therefore not providing a strong variation in river order. The observed historical trends are consistent with other studies describing an average T w increase of 0.22°C per decade for 1981-2001 (Jonkers and Sharkey, 2016) and a T w increase of 0. 46°C for 199046°C for -200646°C for (Simpson et al., 2010 for rivers in Britain. More specifically, our results correspond to the previously reported increase in T w of 0.2°C per decade for the River Spey (catchment 39) since 1961 as described by Pohle et al. (2019) and an increase of 2.0°C in maximum T w for winter and spring over a 30 year period for the Girnock Burn (sub-catchment of catchment 41) from 1968 to 1997 (Langan et al., 2001). This study additionally provides insights into the temporal and spatial consistency of those trends. The identified seasonal trends correspond with Jonkers and Sharkey (2016), showing spring has the highest warming rates, followed by autumn, summer and then winter. However, it is contrary to the findings from Simpson et al. (2010), who described autumn and winter had the strongest increasing trends in England and Wales. This discrepancy might be caused by Simpson et al. (2010) solely focusing on streams in England and Wales where annual temperatures are on average higher than in Scotland, while Jonkers and Sharkey (2016) have investigated streams in whole Britain (including Scotland) and our study provides a more detailed focus only on Scotland.

Future stream water temperature
In general, future increases in T a will lead to catchments experiencing warming T w in the future, even for those catchments that have not experienced warming in the past. Our results show that catchments with lower mean T w in the past will experience stronger increase in T w in the future, which will lead to less spatial variability in annual T w across Scotland. This could be caused by a variety of internal and external factors e.g. climate variability (Laizé et al., 2017), changes of snow melt timing (Lisi et al., 2015), or elevation dependent warming (Isaak and Luce, 2023). As the spatial patterns in T w change do not fully match the spatial patterns of T a increase based on UKCP18 Strand3 projections (Met Office, 2019) it indicates a non-linear relationship between T a and T w . This is also supported by response curves showing a s-shaped relationship between T a and T w (Appendix F subfigure a) and highlights the importance of other site-specific variables such as precipitation, discharge, snowmelt and daylength. Considering other studies for a region also included here, Jackson et al. (2018) described higher maximum T w in northwest Scotland during summer and lower maximum T w in northwest Scotland and the Cairngorms during winter under "extreme" climate conditions. The spatial pattern described by Jackson et al. (2018) in winter can also be observed in the projected T w of this study, while summer T w are more homogenous. This discrepancy could be explained by the fact that Jackson et al. (2018) analysed maximum T w whilst our study focused on annual and seasonal averages.
The predicted increase of up to 4°C in annual T w is within the range of +3°C to +4°C by 2080-2090 (compared to 1990-2000) as described by Michel et al. (2022) for Switzerland applying the same climate scenario. However, compared to UK focused studies, the increase in average annual T w is above the predicted increase of 1.2°C -1.4°C for T w for UK rivers by 2071-2100 based on SRES B1-A2 (van Vliet et al., 2013) and also above the UKCP09 RCP8.5 predicted increase of 1.4°C -1.8°C by 2080 (van Vliet et al., 2015). This might be mostly related to different changes in T a as the UKCP18 Strand 3 used here predicts a stronger increase in T a (summer surface T a up to 7.0°C and winter surface T a up to 4.0°C) for Scotland than the SRES B1-A2 and UKCP09 RCP 8.5 scenarios used by van Vliet et al. (2013van Vliet et al. ( , 2015 respectively, which predicted increase of T a of up to 4.0°C and 2.0°C, respectively. The future increase of T w of 1.9°C for the River Dee is also slightly higher than the predicted increase of 1.4°C by the end of the 21st century (Hrachowitz et al., 2010). It has to be noted that Hrachowitz et al. (2010) models are based on the regional UKCP09 dataset with a medium emission scenario while the presented study is based on a regional climate (UKCP18 RCP8.5) high emission scenario. We have therefore evaluated an updated but warmer scenario than previously suggested.
While Van Vliet et al. (2013) found no spatial or seasonal variation for T w throughout the UK with maximum warmings of 1.2°C -1.4°C, Jackson et al. (2018) identified most (climate) sensitive streams to be in the north and nortwest of Scotland and the Cairngorm Mountains, this spatial pattern can also be observed in the change of mean summer T w of this study.
Our study also considers only one ensemble (12 projections) of a high emission climate scenario for future predictions of T w , however, as van Vliet et al. (2015) has shown, different scenarios will have different changes in T a and impacts on hydrology and therefore also on T w . While UKCP18 RCP8.5 provides plausible projections for the future it is not without uncertainties, which are also associated with downscaling UKCP18 RCP8.5 as described by Rivington et al. (2008aRivington et al. ( , 2008b.

Wider implications and future work
The methodology developed here has widespread applications and could be used to create long-term daily records from a wide range of temporally coarse national, international, and global datasets which are more widely available than daily datasets typically required for robust trend assessments. Indeed, it has the potential to be applied to smaller scale sites as well, i.e., at a sub-catchment level for individual tributaries. As such, our work could be expanded to analyse more Scottish streams (e.g., streams monitored over a shorter period or where monitoring has ceased prior to 2015) to identify their unique parameter sets and include those catchments as part of the national dataset. Observed runoff could be fully replaced by calculations based on the hydrological model used in this study or any other suitable rainfall-runoff model (Ouellet-Proulx et al., 2017), in places where field observations of runoff are temporally coarse. Our method could therefore help to understand the role of environmental controls on different catchments and might allow transfer of models to ungauged sites. Even though, the models are different for individual catchments the response curves indicate common general trends. Hence, in future work it could be assessed how a mean model for whole Scotland compares to the mean models for the individual catchments investigated in our study and how this might affect the uncertainty within the results. Results could also be used to identify climate sensitive catchments which might require more detailed monitoring and management implications. For example, for streams of high ecological status in Scotland, the increase of T w is limited to a max. 2°C from ambient T w due to infiltration and no breach of 20°C; for streams of good ecological status the uplift is limited to a max. 3°C from ambient T w and no breach of 23°C (Scottish Environment Protection Agency, 2016). Based on our projections, none of the analysed catchments are expected to exceed the thresholds for streams of good ecological status, however, summer temperatures of up to 20°C especially in the southwest of Scotland are reaching the limits for streams of high ecological status. Using the temperature maps produced in this study, it has been possible to identify sites that are associated with the highest temperature increases, which are in the northwest and west of Scotland. This information could be used as an initial assessment to identify "at risk" catchments for thermal refugia and catchments to focus on implementing thermal moderation measures e.g., introducing/increasing riparian vegetation.

Conclusion
We successfully developed a national long-term daily stream water temperature (T w ) record from over 40 years of once-a-month observations by implementing simple climatic and hydrological variables in generalized additive models. This allowed us to (i) analyse historical trends as coarser temporal resolution (e.g., once-a-month observations) is likely to introduce bias, (ii) understand the role of environmental controls for individual catchments, and (iii) predict future changes under regional climate projections (UKCP18 Strand3, RCP8.5) in T w to identify climate sensitive sites. The method performed well with R 2 ranging between 0.87 and 0.96 for individual catchments across Scotland. Even though, uncertainties within the resulting datasets are to be expected based on the results of the performance test for the GAMs (Appendix C) and due to uncertainties introduced by the used datasets (UKCP18 RCP8.5 and HMS). A fully independent evaluation with a shorter-term daily dataset (for the River Gairn) resulted in R 2 = 0.89. By analysing 45 catchments in contrasting environmental conditions, we found air temperature (T a ), cumulative T a and daylength appear to be the most important factors influencing T w in Scotland. Nevertheless, precipitation and runoff also show moderate importance for all catchments while snowmelt shows no importance for catchments in the south of Scotland (catchments which experience less snowfall in general due to low elevation and on average higher T a ). While cumulative T a was relatively more important in northeast Scotland, daylength showed a contrary spatial pattern with lower importance in the northeast. In the past (1981-2010), mean annual T w was highest in  Fig. 1a) in comparison to baseline period 1981-2010 (using re-analysis data of climate scenarios). a) mean annual stream water temperature over time b) annual mean temperature for individual periods c) winter mean temperature for individual periods d) spring mean temperature for individual periods e) summer mean temperature for individual periods f) autumn mean temperature for individual periods.
southern Scottish catchments (9.5°C), though historical showed an increase of up to 0.04°C/year in annual T w for individual catchments in the southwest and central Scotland for the same period. Future changes are showing an increase of up to 4.0°C in annual T w compared to 1981-2010 for catchments located in the northwest and west of Scotland leading to a more homogenous spatial pattern in the future with mean annual T w reaching up to 10.0°C .
Based on our Scotland wide analysis, we have been able to identify catchments in northwest and west Scotland as the most climate sensitive sites, i.e., catchments with strongest future increase in T w . These catchments should therefore require more detailed monitoring in the form of higher spatial and temporal resolution measurements and should be considered for T w management approaches such as introducing or increasing riparian vegetation. Our approach was developed and tested for relatively large-scale catchments in Scotland. While these data represent sites with variable catchment characteristics, the wider transferability of the approach to different climate zones still needs to be tested. T a will likely have a consistently dominant control, but the role of other factors may vary when the approach is applied to further national, international or even global datasets. Depending on the geographical settings, other variables might also need to be included into the modelling approach (e.g. glacial meltwater). Nevertheless, the developed methodology enables the analysis of historic trends and future changes at a high temporal resolution. It also has the potential to be applied to smaller scale sites (as demonstrated with the independent Gairn dataset), i.e., at a subcatchment level for individual tributaries to analyse T w with a higher spatial resolution.

Data availability
Data will be made available on request.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.