Urbanization and aridity mediate distinct salinity response to floods in rivers and streams across the Contiguous United States

Salinity is an important water quality parameter that affects ecosystem health and the use of freshwaters for industrial, agricultural, and other beneficial purposes. Although a number of studies have investigated the variability and trends of salinity in rivers and streams, the effects of floods on salinity across a wide range of watersheds have not been determined. Here, we examine this question by utilizing long-term observational records of daily streamflow and specific conductance (SC; a proxy for salinity) in addition to catchment characteristics for 259


Introduction
Salinity, a measure of the total dissolved inorganic salts in water, is one of the most important water quality indicators in freshwater ecosystems. Long-term elevated salinity levels have a detrimental effect on water treatment, agricultural water usage, infrastructure (through corrosion; Stets et al., 2018; Pieper et al., 2018), ecosystem functioning and biodiversity (e.g., Cañedo-Argüelles et al., 2013; Herbert et al., 2015; Velasco et al., 2019; Gutiérrez-Cánovas et al., 2019; Schulz & Cañedo-Argüelles, 2019). Moreover, abrupt and transient fluctuations in salinity, both increasing and decreasing, can have adverse effects on aquatic organisms due to their intolerance to such changes. As a result, salinity or concentrations of individual major ions (e.g., chloride) are typically regulated using threshold criteria (e.g., US EPA, 1998, 2002; Canadian Council of Ministers of the Environment, 2011). Salinity also modulates the solubility of oxygen in water, which has a direct impact on the health of aerobic organisms. In recent years, a number of studies reported increasing trends of freshwater salinization in several regions around the world (e.g., Kaushal et al., 2018; Bird et al., 2018; Estévez et al., 2019). These trends were primarily attributed to anthropogenic factors such as extensive mining, usage of deicing road salts, application of fertilizers, and human-accelerated weathering of materials, both natural and artificial (e.g., concrete).
An important issue that has received little attention to date is the effect of extreme events, and in particular floods, on river salinity. This subject is of particular importance for anticipating climate change impacts on river water quality because of the increased probability of more frequent and intense floods and droughts, as well as rapid shifts between the two, in a warmer climate (Swain et al., 2018). Robust signals of flood regime changes have already been detected in observational records; for instance, a number of studies reported a shift in the timing of flood peaks in many rivers around the world (e.g., Blöschl et al., 2017; Wilson et al., 2010; Arheimer & Lindström, 2015; Stewart et al., 2005). Additionally, socio-economic development is projected to be a significant driver of alterations in flood characteristics in the future (Winsemius et al., 2016). Such changes in flood regimes may result in severe consequences for water availability, both in terms of quantity and quality.
Previous studies of flooding impacts on salinity, individual ion concentrations or suspended solids have primarily focused on investigating the nonlinear variability of solute concentrations during the flood hydrograph, also known as hysteresis (e.g., House & Warwick, 1998;Hamshaw et al., 2018). It has been shown that, at hourly time scales, concentrations of most solutes are higher during the rising limb of the hydrograph compared to the falling limb (i.e., clock-wise hysteresis; House & Warwick, 1998). This has been attributed mainly to the "flushing" effect of floods whereby solutes deposited in the catchment and streambed prior to floods are mobilized and flushed during the rising limb of the hydrograph, a phenomenon also known as "first flush" in the stormwater community. Additionally, earlier results from a few catchments in Southwestern England showed that the variability of chemical concentrations during floods is complex with concentrations either increasing, decreasing or a mixture of both (Walling & Foster, 1975). Furthermore, several studies reported a time-lag effect whereby the flood hydrograph precedes variations in chemical concentrations "chemograph" by several hours (Glover & Johnson, 1974;Walling & Foster, 1975).
The aforementioned studies provided important insights on the effects of floods on solute concentrations at sub-daily time scales, and often relied on short records of high-frequency observations (e.g., hourly, 2-hourly) from a limited number of sites. However, the impact of floods on salinity and potential influences of climatic, geologic, topographic and anthropogenic factors across large spatial domains remain unexplored. The focus of the present paper is to examine the impact of floods on salinity, measured as specific conductance (SC), across 259 sites in the contiguous United States (CONUS) using long-term observations with a sample size of at least 3650 records (i.e., 10 years of daily data) at each site. Here we hypothesize that antecedent salinity levels (memory) combined with the interaction of climatic, geologic, hydrologic and anthropogenic factors such as mining, agriculture and urbanization leads to variability in the response of salinity to floods. We test this hypothesis by quantifying the variations in salinity during floods and examining their relationships to natural and anthropogenic properties of drainage basins. More specifically, our research objectives include: (1) quantifying the spatial and temporal variability in SC during high flow conditions, (2) identifying catchment characteristics that demonstrate a strong relationship with the response of specific conductance to floods, and (3) exploring factors that mediate the variability of specific conductance within individual sites during flood events.

Site selection
For the present study, we selected USGS stations with readily available and sufficient information on their associated catchment properties. Therefore, we used the Geospatial Attributes of Gages for Evaluating Streamflow (GAGES-II) dataset (Falcone et al., 2010) as a reference for our site selection. The GAGES-II dataset provides 355 metadata features (e.g., climatic, geologic, land use, soil, hydrologic properties) for 9,322 USGS sites. Of these 9,322 sites, we selected sites that had at least 10 years of daily measurements (n = 3650) of concurrent streamflow (Q) and specific conductance (SC) in the USGS National Water Information System (NWIS). This resulted in a set of 259 sites across the contiguous United States (Figure 1a). Across all sites, SC was measured in the field from an unfiltered water subsample and adjusted to a standard temperature of 25 °C. Detailed information on equipment, calibration and measurement methods of SC is summarized in U.S. Geological Survey (2019). Streamflow was measured using one of two methods: the Velocity-Area method or the Midsection method. Further details on equipment, measurement practices and quality assurance of Q measurements are summarized in Turnipseed and Sauer (2010). All SC and Q measurements used in this study are designated with data quality code "A", which indicates that the data have been approved for publication and the review is completed. For all sites in our case study, we used 123 catchment characteristics (Table S1) from the original 355 metadata features provided by the GAGES-II dataset. These characteristics represent 7 broad categories: climatic, hydrologic, hydrologic modification (e.g., dams), topographic, land use/land cover, nutrient and pesticide application, and soil properties. The remaining catchment characteristics excluded from analysis were either categorical values or monthly climatic statistics (e.g., monthly average temperature).
The 259 sites used in this study span a wide range of climate, elevation, size of drainage basin, extent of urbanization and soil properties (Table 1 and Table S1).

Pre-processing time series of SC and Q
Daily SC measurements obtained from USGS NWIS were reported either as mean, median, maximum, minimum, at-noon or instantaneous values in units of µS/cm. We opted to make use of all these measurements whenever needed in order to increase the sample size in our analysis. We used a hierarchical data imputation scheme to estimate daily SC measurements.
More specifically, SC is estimated using the following order: mean, median, a linearly interpolated value between min and max, at-noon, and instantaneous measurement. For instance, at any given station and day, daily SC is estimated as an interpolated value between min and max only if the mean and median values were not reported. If in addition to mean and median, the values of max and min were not reported, daily SC is taken to be equal to at-noon SC measurement. Since deviations of SC and Q are defined relative to the long-term time averaged mean, it is a prerequisite that any long-term trends in SC or Q are accounted for and removed to ensure consistency and uniform sampling throughout the record. Details of detrending timeseries are reported in Text S1.
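The priority order described above can be illustrated with a minimal Python sketch. The dictionary keys (`mean`, `median`, `min`, `max`, `noon`, `instantaneous`) and the function name are hypothetical, and taking the midpoint as the "linearly interpolated value between min and max" is an assumption; the text does not state which interpolation point was used.

```python
from typing import Optional


def daily_sc(record: dict) -> Optional[float]:
    """Return one daily SC estimate from whichever statistics were reported.

    `record` maps hypothetical statistic names to values (or None when missing).
    """
    # 1) and 2): prefer the reported daily mean, then the daily median.
    for key in ("mean", "median"):
        if record.get(key) is not None:
            return record[key]
    # 3): fall back to a value interpolated between the daily min and max.
    lo, hi = record.get("min"), record.get("max")
    if lo is not None and hi is not None:
        # Assumption: the interpolated value is taken as the midpoint.
        return 0.5 * (lo + hi)
    # 4) and 5): at-noon value, then any instantaneous measurement.
    for key in ("noon", "instantaneous"):
        if record.get(key) is not None:
            return record[key]
    return None  # nothing was reported for this day
```

Days for which the function returns `None` would simply remain missing in the daily series.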

Identification of floods / high flow events
We define floods as daily streamflow ($q$) values higher than the 95th percentile. More specifically, the set of daily streamflow values considered as floods at any given site, hereafter referred to as $Q_{HF}$, is defined as follows:

$$Q_{HF} = \{\, q_t : q_t > q_{95} \,\} \qquad (1)$$

where $q_{95}$ is the 95th percentile of all daily streamflow values at a given station.
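As an illustration of this threshold definition, the following Python sketch returns the indices of days classified as floods. The function name is hypothetical, and the percentile is computed with linear interpolation (`method="inclusive"`), which is an assumption; the text does not specify the percentile estimator.

```python
from statistics import quantiles


def flood_days(q, pct=95):
    """Indices of days whose streamflow exceeds the pct-th percentile of the record."""
    # quantiles(..., n=100) returns the 99 percentile cut points; index pct - 1
    # is the pct-th percentile (linear interpolation, an assumed estimator).
    threshold = quantiles(q, n=100, method="inclusive")[pct - 1]
    return [t for t, q_t in enumerate(q) if q_t > threshold]
```

By construction, roughly 5% of the days in each record are retained as flood days.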

A Normalized measure of specific conductance (NSC)
SC varies significantly across the 259 sites used in this study (see Figure 1a and Table 1); therefore, we use a normalized measure of SC to facilitate analysis and comparison across all sites. The normalized measure, denoted as NSC, is defined as follows:

$$NSC_t = \frac{SC_t - \overline{SC}}{\overline{SC}} \qquad (2)$$

where $SC_t$ is the value of specific conductance at day $t$, and $\overline{SC}$ is the long-term average value of specific conductance at a given station. More specifically, $\overline{SC}$ is computed for each site as the mean of all daily de-trended SC values during the period of record. Theoretically, $NSC_t$ has a lower bound of -1 when $SC_t = 0$ (i.e., no salts in water), whereas it has no upper bound. An NSC value of 0.4 indicates that specific conductance is 40% higher than the long-term mean.
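A short Python sketch of this normalization is given below; it assumes the input series has already been de-trended, and the function name is hypothetical.

```python
from statistics import fmean


def normalized_sc(sc):
    """Relative deviation of each daily SC value from the long-term mean of the series."""
    mean_sc = fmean(sc)  # long-term mean of the (assumed de-trended) record
    return [(value - mean_sc) / mean_sc for value in sc]
```

For example, a record of 100, 200 and 300 µS/cm has a mean of 200 µS/cm, so the normalized values are -0.5, 0 and 0.5.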

Dryness Index as a measure of aridity
The water balance in hydrologic basins can be represented as follows:

$$P + \Delta G = ET + R + \Delta S \qquad (3)$$

where $P$, $ET$, $R$, $\Delta G$ and $\Delta S$ denote precipitation, evapotranspiration, runoff, groundwater input and changes in water storage, respectively. At long (climatological) time scales, fluctuations in $\Delta G$ and $\Delta S$ balance out, leading to equation 4:

$$P = ET + R \qquad (4)$$
The upper limit of $ET$ is controlled by potential evapotranspiration (PET); therefore, the ratio of PET to P, defined as the Dryness Index (DI), expresses the deficit (availability) of water in hydrologic basins as shown in the following equation (Budyko, 1961):

$$DI = \frac{PET}{P} \qquad (5)$$

Figure 2a shows the relationship between DI and the ratio of ET/P for all 259 sites, as well as the Budyko curve (gray line), which expresses the theoretical relationship between the two. The ratio of ET/P was calculated from equation 4 as (1 - runoff ratio), where runoff ratio is equal to R/P. We classify all sites in our study into three climate zones based on the value of DI as follows: wet (DI < 0.7), temperate (0.7 ≤ DI < 1.2) and arid (DI ≥ 1.2); the number of sites in these three zones is 133, 80 and 46, respectively. Figure 2b shows the values of runoff ratio at each climate zone, whereas Figure S1 in the supplementary information shows a map of all sites according to their climate zone.
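The classification by Dryness Index and the ET/P estimate from the long-term water balance can be sketched in Python as follows. The function names are hypothetical; the thresholds are those stated above.

```python
def climate_zone(pet, p):
    """Classify a site by Dryness Index DI = PET / P using the thresholds in the text."""
    di = pet / p
    if di < 0.7:
        return "wet"
    if di < 1.2:
        return "temperate"
    return "arid"


def evaporative_fraction(r, p):
    """ET/P from the long-term water balance P = ET + R, i.e. 1 - runoff ratio."""
    return 1.0 - r / p
```

A site with PET = 1500 mm and P = 1000 mm has DI = 1.5 and is classified as arid; if its runoff is 250 mm, ET/P is 0.75.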

Random Forest models
In the present study, we used random forest (RF) models to predict SC levels during days of floods, and used their feature importance to determine the dominant variables affecting the response of SC to floods. The set of predictor variables used in the RF models consists of 16 variables shown in Table 2. The rationale behind the selection of these predictors is to represent hydrologic conditions in the catchment and stream during floods and the memory of the catchment at different time scales of 5 days (short), 30 days (medium) and 120 days (long). We tested the sensitivity of the results to these specific time lags (e.g., 5 as opposed to 7 days), and found negligible differences when the lags were varied by ±2 days, because SC is highly autocorrelated. Two modeling approaches were carried out in this study. First, we set up a different RF model for each individual site and carried out feature importance analysis; the results were then averaged across each climatic zone. Second, we developed a regional RF model that is trained on normalized data compiled from all sites within each climatic zone (arid, temperate and wet). The total sample size was 12,476 for arid, 26,269 for temperate, and 41,847 for wet climates. Each model was trained on 75% of the data and tested on the remaining 25%. All RF models were implemented using the scikit-learn package in Python (Pedregosa et al., 2011).
Hyperparameter tuning of the models is explained in Text S2.

[Table 2. Predictor variables used in the RF models; surviving entry: Q at day of high flow.]
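To make the notion of catchment memory concrete, the sketch below builds antecedent-mean predictors for a flood day at the three time scales named above. It is only an illustration: the feature names are hypothetical, the full set of 16 predictors in Table 2 is not reproduced, and excluding the flood day itself from each window is an assumption.

```python
from statistics import fmean


def antecedent_mean(series, t, k):
    """Mean of the k values preceding index t (day t excluded); None if the window is incomplete."""
    if t < k:
        return None
    return fmean(series[t - k:t])


def memory_features(sc, q, t, lags=(5, 30, 120)):
    """Predictors for flood day t: flow on that day plus antecedent means of SC and Q."""
    features = {"Q_HF": q[t]}  # streamflow at the day of high flow
    for k in lags:
        features[f"SC_mean_{k}"] = antecedent_mean(sc, t, k)
        features[f"Q_mean_{k}"] = antecedent_mean(q, t, k)
    return features
```

Rows built this way for every flood day could then be passed to a regressor such as scikit-learn's `RandomForestRegressor`, with SC on the flood day as the target.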

Causal Analysis using Transfer Entropy
We used causal analysis to validate the results obtained from RF models and their feature importance. The advantage of using causal inference is that it allows conditioning on potential confounders, i.e., variables that might be a common cause of an apparent statistical relationship between two variables. The method we use here is Transfer Entropy (TE; Schreiber, 2000), which quantifies the conditional mutual information $I(X; Y \mid Z)$, the shared information between variables X and Y given a set of conditioning variables Z. The implementation of Transfer Entropy we use here is based on a K-nearest neighbors algorithm (Kraskov et al., 2004), which has been shown to provide reasonable performance in detecting causal relations in hydrological systems (Ombadi et al., 2020). The interested reader should refer to Ombadi et al. (2020) for further details on the implementation of the algorithm.
For instance, in order to investigate whether $\overline{SC}_5$ (short-term antecedent SC) is an important factor affecting the variability of salinity response to floods at individual sites, TE enables us to test the relationship between $\overline{SC}_5$ and specific conductance at days of floods ($SC_{HF}$) while conditioning on potential confounding variables. Specifically, we test the following conditional independence:

$$TE(\overline{SC}_5 \to SC_{HF}) = I(\overline{SC}_5 ; SC_{HF} \mid \overline{Q}_5, Q_{HF}) \qquad (6)$$

Equation 6 quantifies the information flow from $\overline{SC}_5$ to $SC_{HF}$ after conditioning on the short-term memory of flow ($\overline{Q}_5$) and flow at days of floods ($Q_{HF}$). This conditional independence test accounts for any effects induced by autocorrelation in streamflow. Similarly, the transfer entropy from $Q_{HF}$ to $SC_{HF}$ can be expressed as follows:

$$TE(Q_{HF} \to SC_{HF}) = I(Q_{HF} ; SC_{HF} \mid \overline{Q}_5, \overline{SC}_5) \qquad (7)$$

We used data compiled for each climate zone to compute TE in equations 6 and 7, whereas the statistical significance was assessed using 50 randomly shuffled surrogates of the time series. In this paper, surrogates are only used to assess statistical significance in the computation of TE because it is a nonparametric test.
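The surrogate-based significance test can be sketched in Python as below. This is a generic illustration under stated assumptions, not the estimator used in this study: the K-nearest-neighbor TE estimator is not reproduced, an absolute Pearson correlation serves as a stand-in statistic, and the p-value is taken as the fraction of shuffled surrogates whose statistic reaches the observed value. All function names are hypothetical.

```python
import random
from statistics import fmean


def abs_correlation(x, y):
    """Stand-in dependence statistic (absolute Pearson correlation), NOT transfer entropy."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / (sxx * syy) ** 0.5


def surrogate_p_value(x, y, statistic, n_surrogates=50, seed=0):
    """Fraction of shuffled-x surrogates whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = statistic(x, y)
    exceed = 0
    for _ in range(n_surrogates):
        x_shuffled = list(x)
        rng.shuffle(x_shuffled)  # shuffling destroys any dependence between x and y
        if statistic(x_shuffled, y) >= observed:
            exceed += 1
    return exceed / n_surrogates
```

With 50 surrogates, a p-value of 0 means that none of the surrogates reached the observed statistic, so the smallest resolvable non-zero p-value is 1/50 = 0.02.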

Variability of Specific Conductance during floods
We start our analysis by examining the variability in NSC during days of floods across the 259 sites. Figure 3a shows the distribution of NSC for flood events across all sites with a total number of n = 91,257. Overall, NSC varies from -0.9 (90% decrease relative to long-term mean) to 0.14 (14% increase relative to long-term mean) between the 2.5th and 97.5th percentiles, which indicates a large range of variation across sites. The bulk of the distribution shown in Figure 3a is located below zero (i.e., the long-term mean), which indicates that dilution, the process of decreasing the concentration of a solute by addition of water, is the dominant mechanism that determines salinity response to floods. However, we also note that in 6.1% (n = 5521) of all flood events, NSC is higher than zero, which means that there are alternative mechanisms that lead to enrichment, an increase in the concentration of salts during floods.
Furthermore, Figure 3b demonstrates respectively. We, hereafter, refer to this type of "within" site variability as intra-site variability. order streams that aggregate water from headwater streams; similarly, an increased percentage of canals leads to stronger dilution due to its impact in enhancing runoff.

Figure 4. Heatmap of significant Spearman correlation coefficients between catchment characteristics (horizontal axis) and $\widetilde{NSC}$ (vertical axis) at three climatic regimes: Arid, Temperate and Wet. The relationship with the long-term time averaged specific conductance ($\overline{SC}$) is also shown.

Random Forest Models and Feature Importance Analysis
The RF models trained separately for each site provide a reasonable performance in predicting $SC_{HF}$. Table 3 summarizes the performance of the models in terms of different metrics including the Nash-Sutcliffe efficiency coefficient (NSE). NSE values at the 259 sites range from 0.29 (5th percentile) to 0.97 (95th percentile) with a median value of 0.73. Only five sites have NSE values lower than zero, indicating that the RF model had lower skill than a model that uses $\overline{SC}$ as an estimate for all events. The reasonable performance of the models supports their use for identifying the key factors affecting intra-site variability in the response of salinity to floods. Figure 5 shows the relative importance of predictors obtained from the first modeling approach (one model for each site) averaged across each climate zone. Across all zones, the top three features, in order of importance, are $\overline{SC}_5$, $Q_{HF}$ and $\overline{SC}_{30}$. Results from the second modeling approach (regional models), shown in Figure S2 of the Supplementary Information, were indistinguishable from those obtained using the first modeling approach, with the most important predictors again being $\overline{SC}_5$, $Q_{HF}$ and $\overline{SC}_{30}$. See Table 2 for the notation of variables.

Figure 6 shows the values of log-transformed $TE(\overline{SC}_5 \to SC_{HF})$ and $TE(Q_{HF} \to SC_{HF})$ (equations 6 and 7) for each climate zone as well as the distribution of the surrogates (box plots).
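For reference, the Nash-Sutcliffe efficiency used to evaluate the RF models can be computed as in the following Python sketch (standard definition; the function name is hypothetical). A value of 1 indicates a perfect fit, 0 indicates skill equal to predicting the mean of the observations, and negative values indicate lower skill than that mean-based benchmark.

```python
from statistics import fmean


def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum of squared errors / sum of squared deviations of observations from their mean."""
    mean_obs = fmean(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance
```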

Causal analysis using Transfer Entropy
All values of TE are statistically significant (p-value = 0). These results indicate that $\overline{SC}_5$ is indeed the most important factor affecting specific conductance during days of floods amongst the chosen input variables, since the values of $TE(\overline{SC}_5 \to SC_{HF})$ exceed those of $TE(Q_{HF} \to SC_{HF})$ across all climate zones (blue markers compared with orange ones).

Discussion

Response of salinity to floods: dilution and enrichment
This study is the first continental-scale analysis that clearly shows that flood events lead to a decrease of solute concentration due to dilution at a majority of sites, although we also found that approximately 6% of flood events across 259 sites (n = 5521) lead to enrichment of salinity. Possible mechanisms leading to enrichment include flushing of concentrated salts from agricultural lands and mining sites, significant contributions from saline groundwater, or seawater intrusion and salt fronts in coastal sites. Moreover, the extent of dilution varies significantly across streams and rivers in CONUS, with median values of specific conductance ranging from a 75% (5th percentile) to 8% (95th percentile) decrease relative to the long-term mean. The wide range of values covered by the distribution in Figure 3a indicates that floods have significant and distinct impacts on salinity at different sites.

Catchment characteristics affecting the response of salinity to floods are distinct from those controlling mean salinity levels
Figure 4 shows that catchment characteristics regulating the response of salinity to floods ($\widetilde{NSC}$) are different from those that control $\overline{SC}$. Most notable is that urbanization in temperate climates leads to lower values of salinity during floods (i.e., stronger dilution) while it has no discernible impact on $\overline{SC}$. This result might seem to contradict previous regional studies that linked urbanization to increased levels of $\overline{SC}$ (e.g., Moore et al., 2019; Prowse, 1987; Conway, 2007) due to higher usage of deicing road salts (Moore et al., 2019) or as a result of denudation and weathering of materials (Prowse, 1987). Our analysis in this study, however, did not reveal any significant links between urbanization and elevated levels of average specific conductance. In fact, natural characteristics such as percentage of organic matter in soils and percentage of shrublands in the watershed are more important factors controlling $\overline{SC}$ in arid climates (Figure 4). Similarly, the average depth to water table is the only factor exhibiting a significant relationship with $\overline{SC}$ in wet climates. These results are more consistent with findings in a large-scale analysis reported by Lintern et al. (2018), in which they highlighted the dominant role of natural rather than anthropogenic catchment characteristics in controlling $\overline{SC}$. Although urbanization had no discernible impact on $\overline{SC}$, other anthropogenic factors such as the density of salinization point sources (e.g., NPDES; National Pollutant Discharge Elimination Systems) and the hydrologic disturbance index were found to be important factors in mediating $\overline{SC}$ in temperate climates. Moreover, the density of dams was also found to be an important factor affecting $\overline{SC}$ in arid climates.
We also note that the significant variability in factors regulating $\overline{SC}$ as well as $\widetilde{NSC}$ across different climate zones indicates that results obtained from regional studies (e.g., Timpano et al., 2018; Moore et al., 2019; Bird et al., 2018) must be interpreted in a cautious manner. This is because, as shown in this study, processes and mechanisms that affect salinity vary considerably based on aridity. It is noteworthy that the majority of recent studies on trends and variability of salinity are focused on the Eastern USA.

The interplay between aridity and anthropogenic attributes of catchments
Two main findings that stand out in section 4.2 are: 1) the adverse impact of mining in arid climates, and 2) the role of land use in strengthening dilution in temperate climates. Regarding the former, the positive correlation of mining and elevated salinity during floods can be attributed to the high concentration of salts from mines in arid and semi-arid climates induced by high evapotranspiration rates (Li et al., 2014; Kent, 1982; Jordan et al., 2004). Floods will typically mobilize and flush these concentrated salts, leading to an increase in the mass of salts being transported, and hence an increase in specific conductance. As for the second finding, at first glance, it might appear perplexing that land use has a first order control on the response of salinity to floods only in temperate climates, while it has no discernible impact in arid and wet climates. This is because one would intuitively assume that an increased percentage of impervious cover will result in an increase of runoff ratio regardless of climatic conditions. However, Figure 7 shows that changes in impervious cover have a significant impact in increasing the runoff ratio only in temperate climates (slope = 0.0032, p-value = 0.04). This is because the runoff ratio is already high in wet climates and, conversely, low in arid climates (Figure 2b), such that changes in impervious cover will not fundamentally alter processes by which precipitation is partitioned into runoff. On the other hand, sites with a temperate climate are characterized by runoff ratios that are sensitive to changes in impervious cover. In order to lend credence to this finding, we tested the relationship between impervious cover and runoff ratio in all sites provided by the GAGES-II dataset (n = 9322 sites; Figure S3). The results show an increase in runoff ratio with higher percentage of impervious cover in both temperate and arid climates; however, the rate of increase in temperate climates is more robust than that in arid ones, which corroborates the relationships shown in Figure 7.

The implications of the dominant role of short-term memory are perhaps most relevant to traditional concentration-discharge (C-Q) relationships (e.g., Godsey et al., 2009; Chanat et al., 2002; Evans & Davies, 1998), in which the main assumption is that concentration of solutes can be expressed as a function of discharge in a power-law form (i.e., $C = aQ^b$). In contrast, our results show that streamflow (discharge) at day of flood ($Q_{HF}$) is a less important factor than the short-term memory of the stream (catchment) in terms of average specific conductance. Despite the usefulness of these simple relationships as tools for analysis and management of water quality, they inherently neglect the dynamical nature of stream water chemistry. Our results suggest that the dynamical nature of the system, as represented by short-term antecedent SC conditions, is far more important than information on the magnitude of discharge. This is consistent with recent results that pointed out the variability of C-Q relationships for different hydrologic events (Minaudo et al., 2019; Knapp et al., 2020).
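For readers unfamiliar with C-Q relationships, the sketch below shows how the two parameters of the form $C = aQ^b$ are commonly estimated by ordinary least squares on log-transformed data. It is an illustration only, no such fit was carried out in this study, and the function name is hypothetical. A negative exponent $b$ corresponds to dilution with increasing discharge, whereas $b$ close to zero corresponds to near-constant concentration.

```python
import math
from statistics import fmean


def fit_power_law(q, c):
    """Least-squares fit of log(C) = log(a) + b * log(Q); returns (a, b)."""
    x = [math.log(v) for v in q]
    y = [math.log(v) for v in c]
    mx, my = fmean(x), fmean(y)
    # Slope of the log-log regression is the C-Q exponent b.
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    a = math.exp(my - b * mx)
    return a, b
```

Because such a fit depends on discharge alone, it cannot represent the antecedent SC conditions discussed above.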

Limitations of the study and implications of findings for water quality in a future climate
In this study, we defined deviations of SC during floods relative to the long-term mean rather than pre-event SC levels; thus, dilution and enrichment are labeled relative to mean baseline conditions. This is an important caveat to consider while interpreting the values of NSC presented in this study. Text S3 and Figure S4 discuss the rationale behind this definition and potential implications for the results. Another limitation of this study is that land cover characteristics in the GAGES-II dataset were originally obtained from the 2006 National Land Cover Dataset (NLCD), which may not be representative of current conditions if a significant change in land cover has taken place. We speculate that this might only have a minimal impact on results since the period of record for most sites used in this study is between 1990 and 2021 (Figure 1b). Furthermore, it is important to recognize that the analysis presented in section 3.2 relates the properties of hydrologic catchments to $\overline{SC}$ and $\widetilde{NSC}$ at a continental scale. Therefore, the existence of strong relationships with other properties of catchments at regional scales cannot be ruled out. For instance, although not apparent at a continental scale, it is well established that urbanization and increases in impervious surface are linked to elevated salinity levels ($\overline{SC}$) at a regional scale in cold climates due to the use of deicing salt (Corsi et al., 2010; Kaushal et al., 2005; Trowbridge et al., 2010; Perera et al., 2013; Snodgrass et al., 2017; Burgis et al., 2020).
The implications of our findings for water quality in a future climate are primarily centered on the result that aridity has a first order control on the response of salinity to floods.
More specifically, we showed that the relationships between catchment properties and inter-site variability in salinity response to flooding are distinct across different climate aridity zones (wet, temperate and arid). The implications of this finding become clear when viewed against a backdrop of anticipated aridity changes in a future climate. In particular, several studies highlighted that a warmer climate will increase aridity in most regions of the world due to an increase in evapotranspiration that outweighs potential increases in precipitation (Sherwood & Fu, 2014; Fu & Feng, 2014; Lin et al., 2018). Therefore, streams and rivers that drain catchments with considerable mining activity in temperate climates are most likely to experience elevated salinity levels during floods in the future due to a shift toward higher aridity. Moreover, our results suggest that socio-economic development (increased urbanization) in the future is expected to shift runoff ratios (i.e., change the way in which precipitation is partitioned into runoff and evapotranspiration), particularly in temperate climates. Although our results indicate that this will lead to stronger dilution during floods at a continental scale, it might also lead to elevated mean salinity levels at regional scales, especially in regions with a cold climate and significant use of deicing salt. This raises several research questions on potential tradeoffs related to increased urbanization in a future climate; in particular, whether the positive impact of higher runoff ratios (more water available for dilution) will outweigh the negative impact of higher usage of deicing salt and higher rates of weathering (larger mass of salt).

Conclusions
In this study, we found that the response of salinity to floods across 259 sites within the CONUS exhibits a wide spectrum of behaviors including both dilution and enrichment. More specifically, in a total of 91,257 flood events, salinity ranged from a 100% decrease to a 34% increase relative to the long-term mean, with considerable inter-site and intra-site variability. Inter-site variability was primarily explained by the interaction of climate aridity and anthropogenic factors. Most notably, urbanization and increases in impervious cover were found to be associated with stronger dilution during floods only in temperate climates due to the sensitivity of runoff ratio to changes in the percentage of impervious cover. In contrast, mining has a noticeable impact on driving elevated salinity levels during floods in arid climates. These findings suggest that future changes in aridity combined with socio-economic development (increased urbanization) will have significant water quality implications with regard to salinity. We also found that short-term memory and antecedent conditions of salinity in the few days preceding the flood are more important in regulating the response of salinity to flood events within individual sites than the magnitude of flood discharge or long-term salinity. This finding implies that greater consideration needs to be given to the role of system memory in determining solute concentrations than is typically inferred from concentration-discharge relationships.

Data and Code Availability
The daily observations of streamflow and specific conductance used in this study are publicly available from USGS NWIS at https://waterdata.usgs.gov/nwis. The characteristics of catchments from the GAGES-II dataset are publicly available and can be accessed at https://pubs.er.usgs.gov/publication/70046617. Processed data and code will be made public with a CC BY 4.0 license on the U.S. Department of Energy ESS-DIVE data repository upon acceptance of the paper.

Declaration of interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.