Scoured or suffocated: Urban stream ecosystems oscillate between hydrologic and dissolved oxygen extremes

Headwater streams draining urbanized watersheds are subject to frequent and intense storm flows. These floods can disrupt metabolic processes occurring in benthic biofilms via the removal of biomass (i.e., scouring flows, bed mobilization) or light attenuation due to turbidity. Furthermore, channel incision caused by frequent hydraulic disturbance alters the geomorphology of streams, indirectly changing the flow and light regimes experienced by benthic biofilms. We measured dissolved oxygen (DO) and modeled whole‐stream metabolism for 18 months in six urban headwater streams in the North Carolina Piedmont, U.S.A. All streams were heterotrophic and had low rates of productivity despite relatively high streamwater nutrient concentrations. Light availability at the channel surface explained more of the day to day variation in gross primary productivity within each stream than did hydrologic disturbance. Yet among streams, the explanatory power of light declined with increasing hydrologic flashiness. We found a surprisingly wide range in DO regimes, which ranged from frequent hypoxia to near constant saturation. Hypoxia was more common in streams with lower channel gradients where bedrock outcroppings and culverts create rapid slope transitions between pools. We hypothesize this geomorphic change increases the susceptibility of benthic biota to perturbation during storms and the mean water residence time during baseflow. Increased water residence times together with elevated organic matter and nutrient inputs can set up ideal conditions for hypoxia at baseflows punctuated by frequent scouring storm flows. As a result, benthic biota are caught between hydrologic and chemical extremes that constrain their productivity.

The flow regime of a stream is defined by the magnitude, frequency, timing, and duration of hydrologic fluxes, which together encapsulate the set of dominant physical processes governing a stream's bioenergetic capacity. Environmental changes that modify the characteristics of a stream's natural flow regime (Poff et al. 1997) can alter a stream's morphology (Wolman 1967;Arnold et al. 1982;Trimble 1997), as well as the ecological communities (Power et al. 1988) and processes (Stanley et al. 2010) that it supports. Forecasts for hydroclimatic change in the 21 st century predict greater variability in precipitation across the globe (Milly et al. 2005;Swain et al. 2018) leading to altered drought and flood disturbance regimes with unclear consequences for lotic ecosystems.
Stream ecosystems draining highly urbanized areas are some of the most extreme cases of altered hydrologic regimes (Konrad and Booth 2005;Walsh et al. 2005). The hydrology of urban streams tends to be "flashy," having an increased frequency and magnitude of erosive flows during storms of all sizes (Leopold 1968;Booth and Jackson 1997). These flows are large enough to cause hydraulic disturbance to biota (Wang et al. 2000;Roy et al. 2005;Walsh et al. 2005), channel incision, and bank erosion (Wolman 1967;Arnold et al. 1982;Trimble 1997). While flashy hydrology has been shown to have a consistent negative effect on macro-organisms (Paul and Meyer 2001;Walsh et al. 2005), the impacts on stream bioenergetics driven by algae and bacteria are less clear (Wenger et al. 2009;Bernot et al. 2010;Kaushal and Belt 2012). Urban watersheds provide an opportunity to study streams that span a wide range in hydrologic flashiness within a small physiographic region because the impacts of urbanization on streams are highly variable at intermediate levels of development (Somers et al. 2013).
The bioenergetic capacity of a stream can be characterized by whole-stream metabolism which integrates the total energetics of all autotrophic and heterotrophic organisms in a reach (Odum 1956;Hoellein et al. 2013;Bernhardt et al. 2018). Whole-stream metabolism can be modeled using diel changes in dissolved oxygen (DO) concentrations to estimate daily gross primary productivity (GPP), an estimate of total autotrophic activity, and daily ecosystem respiration (ER), an estimate of total autotrophic and heterotrophic respiration (Appling et al. 2018). GPP and ER are sensitive to physical habitat characteristics (Battin et al. 2008), chemical and light conditions (Mulholland et al. 2001;Bernot et al. 2010), and disturbances related to flooding (Uehlinger 2000;Acuña et al. 2004;Roberts et al. 2007;O'Connor et al. 2012;Beaulieu et al. 2013;Reisinger et al. 2017). Urban streams tend to be hydrologically flashy and have an increased supply of limiting nutrients (Paul and Meyer 2001;Walsh et al. 2005) making it difficult to predict the cumulative impacts of these opposing drivers on GPP and ER. Frequently disturbed biofilms recover algal chlorophyll a faster following perturbation (Katz et al. 2017) suggesting rapid recovery of GPP, but previous studies have shown urban streams to be heterotrophic and within-stream GPP to be primarily disturbance constrained (Beaulieu et al. 2013;Larsen and Harvey 2017;Reisinger et al. 2017).
Understanding the flow regime of a stream without an understanding of its geomorphology is inadequate to characterize its disturbance regime (Doyle et al. 2005;Stanley et al. 2010;Peckarsky et al. 2014). The severity of a flood as a disturbance is determined not only by the intensity of the hydrologic flow, but also by channel geomorphology since habitats vary in their susceptibility to perturbation. In small headwater streams, most of the productivity occurs on the streambed (Battin et al. 2008). Therefore, the physical instability of a streambed is an important factor in suppressing biofilm development (Fisher et al. 1982;Biggs and Thomsen 1995;Uehlinger et al. 2002;Hoyle et al. 2017;Katz et al. 2017).
Less well understood is the relative importance of light availability and hydrologic conditions in between storms that might serve as "press" disturbances as compared to the frequency of "pulsed" disturbances (Bender et al. 1984) in urban streams. Channel incision, culverts, and stream burial within the urban stream channel network can reduce the cumulative light that reaches the channel surface (Elmore and Kaushal 2008;Hope et al. 2013). Water column turbidity can be elevated in streams draining urban landscapes (Walters et al. 2003;Lawler et al. 2006), causing light extinction in the streamwater column long after stormflows have receded thereby suppressing productivity within rivers (Izagirre et al. 2008). Seasonally dependent prolonged low baseflows have been observed in urban streams (Klein 1979;Roy et al. 2005) potentially because of decreased groundwater recharge in watersheds. However, support for this effect of urbanization on the hydrology of streams is variable (McMahon et al. 2003;O'Driscoll et al. 2010;Hopkins et al. 2015). When they occur, reduced baseflows have the potential to increase the mean residence time (MRT) of both water and organic material in headwater streams. Together with nutrient inputs from sewer leaks or fertilizer runoff, high biochemical oxygen demand might result in hypoxic conditions because of the drawdown of streamwater DO concentrations in near stagnant waters.
In this study, we identify the dominant controls on the metabolic and DO regimes of urban streams across a gradient of hydrologic flashiness in six urban headwater streams. We pair whole stream metabolism modeling based on continuous DO at a single-station with site-specific measurements of the light regime at the channel surface, channel geomorphology, and water chemistry. We predicted that variation in GPP within streams would be primarily controlled by light availability, but that variation among streams would be constrained by disturbance regimes.

Site description
We monitored six streams draining urbanized watersheds in the cities of Durham and Raleigh, North Carolina, U.S.A. (latitude 35-36 N, longitude 78-79 W) from October 2015 to April 2017. We delineated watershed area using the National Hydrography Dataset Plus Version 2 (McKay et al. 2012) and estimated watershed impervious surface cover (ISC) using the 2011 National Land Cover Dataset (NLCD) Percent Developed Imperviousness data product (Homer et al. 2015) and one meter resolution planimetric maps (City of Durham Public Works 2010; Wake County Government 2013) (Supporting Information Fig. S1). After previously monitoring the hydrology of 24 headwater streams in the region for the 2015 water year (01 October 2014-30 September 2015), we chose six streams to span the widest possible range in hydrologic flashiness using calculations described below. We assigned streams a final rank from F1 as the least flashy to F6 as the flashiest based upon their hydrologic flashiness during the 2016 water year (01 October 2015-30 September 2016). Study watersheds range in size from 60 ha to 231 ha and had similar proportions of NLCD characterized ISC ranging from 7% to 16% and planimetric characterized ISC ranging from 17% to 39% (Table 1). Outside of open space and lawns, undeveloped areas in the watersheds were covered by forests comprised of a mix of deciduous forest and pine stands (Griffith et al. 2002). The region is characterized by a subtropical, humid climate with precipitation that falls primarily as rain apart from occasional winter snowfall.
Durham and Raleigh, North Carolina are positioned at the intersection of multiple EPA level-IV ecoregions (Griffith et al. 2002) which correspond with physiographic differences. Three out of the six study watersheds (F1, F2, and F6) are in the Triassic Basin (ecoregion 45g), which is characterized by deep, low-permeability clay soils. F4 and F5 are located in the Northern Outer Piedmont (NOP) basin (ecoregion 45f ) and F3 is located in the Carolina Slate Belt (CSB) basin (ecoregion 45c), both of which have shallow soils overlying slate bedrock and are therefore grouped together as CSB-NOP streams. Differences in geologic basins are also reflected in steeper channel gradients and more exposed bedrock in the stream channel in CSB-NOP basin streams (F3, F4, and F5) relative to Triassic basin streams (F1, F2, and F6).

Hydrology
Pressure transducers (Solinst Levelogger EDGE, Georgetown, Ontario, Canada) recorded water level and temperature every 5 min in each stream continuously from 01 October 2015 to 30 April 2017. Barometric pressure transducers (Solinst Barologgers, Georgetown, Ontario, Canada) were deployed in dry wells open to the atmosphere to buffer diurnal temperature variation within 8 km of every stream. We used precipitation data from three distributed USGS rain gauges (0208732534, 360419078543145, and 355511078570745). We downloaded in-stream sensor data approximately every 2 weeks. Any missing dates of level or temperature data less than 2 weeks due to sensor failure were gap filled if possible using relationships developed with data from another site with the strongest correlation. We developed leveldischarge rating curves for each stream using a combination of dilution gauging, velocity profiling by electromagnetic velocity sensors (Marsh-McBirney Flo-Mate, Frederick, Maryland), and acoustic Doppler profiling (SonTek IQ+, San Diego, California). All discharge rating curves were based on at least 10 discharge measurements across each curve. We addressed the challenge of estimating extreme peak flows in our urban streams (especially for F4-F6) by solving for peak flow using stage from winter storms during wet periods. We only used stage from storms that did not exceed bank full and used the existing discharge rating curve to solve for flow assuming a rainfall to runoff ratio of one. Three storms in F1 (24-25 December 2015, 08 October 2016, and 25 April 2017) exceeded the upper limit of the discharge rating curve for the site (5.25 cm) by an unrealistic amount relative to the watershed size (> 10 cm). Therefore, in only F1, we set any discharge estimates exceeding the upper limit of the rating curve to 5.5 cm.
We calculated a suite of hydrologic metrics for the 2016 water year (01 October 2015-30 September 2016) to characterize the flow regimes of the streams. Discharge time series were complete for the 2016 water year for all sites except F5, which is missing data before 19 October 2015. We aggregated 5-min interval discharge data to hourly data and divided by watershed area before calculating cumulative flow for the water year (mm). We calculated velocity by dividing the discharge by the channel crosssectional area. We used the "BaseflowSeparation" function in the "EcoHydRology" package in R to separate stormflow from baseflow (Fuka et al. 2014). This function uses a recursive filter developed by Lyne and Hollick (1979) that we ran with 3 passes and a 0.9 filter parameter. We calculated a subset of the seven fundamental streamflow statistics (FDSS) recommended by Archfield et al. (2014) for which only one water years' worth of time-series data was sufficient. The subset of the FDSS we calculated included the L-moments of the coefficient of variation (L-CV), skewness (L-Skewness), and kurtosis (L-Kurtosis) of hourly flow data that we determined using the "lmomco" package in R (Asquith 2017). L-moments are statistical parameters which describe the shape of a probability distribution function of time-series data (Hosking 1990). Lower values of L-CV, L-Skewness, and L-Kurtosis indicate overall lower variability in flow with increasing extreme outliers. We calculated the autoregressive correlation coefficient (AR(1)) of discharge by fitting a first-order autoregressive model using the "arima" function in R software (R Development Core Team, Vienna, Austria) after removing the seasonal signal by subtracting the monthly mean from each day.
We selected urban streams along a gradient of hydraulic disturbance by characterizing extreme flows at each site. To do this, we identified two metrics that describe the intensity of changes in the hydrologic regime during storms and used an average rank of the two to rank the study watersheds from F1 to F6. First, the Richards-Baker index (RBI) characterizes the frequency and magnitude of event flows critical to hydrologic flashiness (Baker et al. 2004). We calculated RBI for each stream using 10-min discharge time-series data (Q; cms) in the following equation: We normalized RBI values for each stream to watershed area to account for the influence of watershed size on the magnitude of RBI. Second, we calculated the number of peaks above the 75 th quantile of flow (Peaks Q75 ). Peaks were only counted within the time series if they were at least 5 h in duration and had a rising limb of at least 1 h using the "findpeaks" function in the R software package "pracma" (Borchers 2018). We normalized peak numbers to watershed area to account for the influence of watershed size. Streams F4 and F5 initially had equal rank, but we ranked F5 higher because it had more peaks which better characterizes the disturbance regime we were trying to represent.

Water chemistry
We took streamwater grab samples approximately every 2 weeks by filtering 60 mL of water from the thalweg upstream of any deployed sensors through Whatman GF/F filters into a high-density polyethylene (HDPE) bottle. We measured nitrate (NO 3 − -N) concentrations on an ion chromatograph (ICS-2000) with an IonPac AS-18 analytical column and KOH eluent generator (Dionex Corporation, Sunnyvale, California, U.S.A.). We measured soluble reactive phosphate (SRP) concentrations on a Lachat QuickChem 8500 automated system (Lachat Instruments, Loveland, Colorado, U.S.A.) using EPA method 365.1. We measured ammonium (NH 4 + -N) concentrations using the o-phthalaldehyde (OPA) fluorometric technique (Holmes et al. 1999) and a field fluorometer (10 AU, Turner Designs, Sunnyvale, California). We measured total dissolved nitrogen (TDN) concentrations with a TOC-V CPH with TNM-1 module (Shimadzu Corporation, Kyoto, Japan).

Channel morphology
We characterized the channel geomorphology of the streams by surveying the bed material and channel geometry. We surveyed bed material from 03 March 2017 to 28 April 2017 using a modified Wolman pebble count method (Wolman 1954). Within a 100-m reach upstream of the DO monitoring station described below, we delineated pool and riffle reaches. We performed up to 20 cross-sections within each 100-m reach, but the number of channel cross sections within each reach type (pool vs. riffle) was proportional to its cumulative longitudinal length within the surveyed area. Cross-section locations were randomly distributed within each reach type and were not performed within culverts. Within each cross-section, we used a gravelometer to record the width of the intermediate axis of a particle chosen blindly at 0%, 10%, 30%, 50%, 70%, 90%, and 100% of the bankfull width. We excluded measurements made at 0 and 100 from final calculations of the median grain size (D 50 ) due to the frequent interference of vegetation. We assessed channel geometry in November 2017 using a survey station (Trimble, Sunnyvale, California, U.S.A.). The longitude, latitude, and elevation of the top of the left and the right banks and the deepest point in the stream channel were surveyed approximately every 10 m for up to 100 m upstream of the DO sensor location.
The frequency of perturbation of bed habitat for biofilms can be estimated by calculating the frequency of exceedance of the critical discharge at which most of the bed is mobilized (Wolman and Miller 1960) or of the velocity of periphyton removal (Hondzo and Wang 2002). We used within channel slope and the median grain size of bed material (D 50 ) to estimate the frequency of bed mobilization during high-flow events as described in the Supporting Information. We used the critical depth to calculate critical velocity (v c ) and critical discharge (Q c ) using the same level-discharge rating curves and channel geometry as used for the entire hydrology time series. Then, we calculated the proportion of the time discharge exceeded the Q c threshold for each site and compared daily velocity to the v c for periphyton removal, estimated to be 0.15 m s −1 (Hondzo and Wang 2002). We determined an average reach-scale channel incision ratio by calculating the distance between banks (channel width) from the elevation at the top of the shorter bank and dividing that by the maximum channel depth from that height for every location at which we surveyed the stream channel upstream of the DO sensor.

Light and turbidity
We deployed pendant lux sensors (Onset HOBO Pendant, Bourne, Massachusetts, U.S.A.) at the DO sensor and at four 10-m intervals upstream (n = 5). All sites had 5-min interval Average flashiness rank 1.5 2.0 2.5 4.5 4.5 6.0 Note: No flow days were counted when discharge was less than 0.00001 mm h −1 . We determined watershed area by delineating watersheds using from 30-m resolution light detection and ranging (LiDAR)-derived digital elevation model (DEM) data (Somers et al. 2013). National landcover data ( Data of "mm h −1 ," "5-min cms," and "10-min cms" all represent different discharge temporal resolutions. QF is quick flow and CV is coefficient of variation. light data beginning 18 March 2016 to 30 April 2017, with the exception of missing data in site F5 from 24 August 2016 to 21 October 2016. We strapped sensors horizontally to rebar positioned above the stream thalweg with a few on the bank when in-stream deployment was not possible. We cleaned them on average once every 2 weeks. We compared lux measurements across the five sensors within each stream to detect and remove periods of time in which the sensor may have had interference. We averaged mean daily light reaching the surface of the channel measured as lux (lumens m −2 d −1 ) across all sensors and converted to photosynthetic photon flux density (PPFD; unit: mol m −2 d −1 ) using the following relationship based on the wavelength at 555 nm where lux is at peak absorbance (1 lumens m −2 = 683 W m −2 ) (Williamson and Cummins 1983): For periods of time missing measured light data, we gap filled using data from the North American Land Data Assimilation System (NLDAS; Mitchell 2004). We only used NLDAS light data for modeling purposes to determine the beginning and end of light availability each day, while measured light at the channel surface was used for both metabolism modeling and data analysis.
Turbidity was recorded at 5-min intervals (Turner Design Cyclops 7, San Jose, California, U.S.A.) from December 2015 to August 2016. We primarily recorded turbidity several days prior, during, and several days following storms when turbidity is elevated. We calibrated the turbidity sensor with a formazin standard and measured as formazin turbidity units (FTU).

Dissolved oxygen and modeling metabolism
Optical DO sensors (Onset HOBO U26, Bourne, Massachusetts, U.S.A.) measured DO continuously from 01 October 2015 to 30 April 2017 at 5-min intervals at a single station in each stream. We calibrated DO sensors every 6 months after replacing the sensor cap by bubbling~40 L of water at room temperature through an aquaculture air stone to achieve saturated water. We waited 10 min before recording the point of 100% saturation to avoid calibrating with supersaturated water. To calibrate to 0% saturation, we immersed the sensor in sodium sulfite solution. We deployed DO sensors horizontally, strapped to cement blocks with the sensor cap facing downstream. To minimize biofouling, we brushed sensor caps gently with a toothbrush once a week during the spring and once every 2 weeks the rest of the year. Copper wire casings were occasionally used during short periods in the spring as another effort to minimize biofouling.
We fit a Bayesian state-space model to estimate three parameters: GPP, ER, and the O 2 -specific gas exchange rate coefficient normalized to Schmidt number of 600 (K 600 ) using the "streamMetabolizer" package in R (Appling et al. 2017(Appling et al. , 2018. Models for each stream were fit using 10-min DO, streamwater temperature, light, depth, and discharge time series. Due to computational limits and minimal differences between 5 min and 10 min intervals on a daily scale, we aggregated all 5 min data into 10 min intervals by taking the mean of two 5 min measurements. In the simplest terms, models were fit using the following relationship among three volumetric process rates (g O 2 m −3 d −1 ): , mean water depth (m), and light; R t is a function of daily mean ER (g O 2 m −2 d −1 ) and mean water depth (m); D t is the product of daily K 600 (d −1 ) and the deficit (g O 2 m −3 ) between actual and equilibrium concentrations of DO. For further detail on of the process equations that generated the estimates of GPP, ER, and K 600 , see Appling et al. (2017Appling et al. ( , 2018. We set the specifications of the Bayesian state-space model to incorporate both observation and process error and to use the trapezoid rule to solve the ordinary differential equation for DO. We used Bayesian hierarchical partial pooling to pool K 600 values based on days with similar discharge conditions to constrain variability in modeled K 600 (Appling et al. 2018). We set priors for each site based on an empirical relationship developed for headwater streams (Raymond et al. 2012), in which the standard deviation of the prior distributions of the piecewise function of Q to predict K 600 was set to 0.70 and the mean for each stream was separately calculated as: To improve pooling performance, we set the half-normal standard deviation prior on the standard deviation of K 600 − K 600,pred (Q d ) to 0.05 d −1 and set node centers to match the range of mean daily discharge.
Even with constrained variability, modeled estimates of ER correlated strongly with K 600 because of weak metabolic signals. Therefore, we took a three-step approach to modeling metabolism. First, we visually examined the model fits to the raw DO time series for every day. We removed days (04:00 h to 04:00 h) that had poor fits, reran the model after removing poorly fitting days, and re-examined the relationship between ER and K 600 . All sites but F1 and F2 still had a strong correlation between the two parameters. Therefore, for streams F3, F4, F5, and F6, we reran the model with the cleaned data, but only for days with GPP values above the mean for the entire time series to determine priors only informed by days with strong diel DO signals. Finally, using the priors from the high GPP models for each stream, we reran the entire cleaned time series. We set any remaining days with GPP estimates less than zero to zero and ER estimates above zero to not-available (NA). The model outputs physically impossible results (i.e., negative GPP or positive ER) because they fall within the margin of error when fitting the model to data on days with insufficient diel variation in DO to get good model fits. Setting ER to NA leads to a lack of data on ER, but the fact that these shallow streams can remain hypoxic during periods of low flow provides ample proof that heterotrophic respiration is at least equal to rates of O 2 diffusion and thus nonzero. We retained GPP values that were zero even on days with NA ER because in these highly disturbed and shaded streams, GPP is likely approximately zero when there is no diel DO signal. Following this approach, we reduced the strength of the correlation between K 600 and ER for all sites except F4 and F5 (Supporting Information Fig. S2).

Data analysis
We examined the relationships between modeled urban headwater metabolism estimates and environmental controls using R version 3.4.2. Given a stream n of six, we performed simple Spearman's rank correlation analysis to assess correlations among stream features (channel incision ratio), environmental variables (light and temperature), and cumulative metabolism estimates (GPP, ER, and net ecosystem productivity (NEP)) among the six streams. We used the "ks" package in R to create kernel density plots of GPP and ER (Duong 2017). Because of the strength of the correlation (r > 0.7) between ER and K 600 for F4 and F5, we took a conservative approach to drawing conclusions about within-stream variability of ER in those sites and did not include those sites in ER regressions with GPP and light. We calculated the mean of light reaching the channel surface, the maximum velocity, and the mean GPP for the previous 7 d (week) for each day of modeled GPP using the "dplyr" package in R (Wickham et al. 2017).
We used simple and multiple linear regression analysis to explore environmental controls on modeled GPP and ER. Data were either ln or square root transformed to meet assumptions of normality. We assessed the potential influence of nitrate and SRP on modeled estimates of daily GPP and ER using linear regressions with concentrations from grab samples on the same day within each stream. We corrected for an inflated α with Bonferroni adjustment for 12 comparisons with each nutrient (α = 0.045) to determine significance. We used Pearson's product moment correlation analysis to examine the relationships between streamwater chemistry and the mean DO saturation on days streamwater was sampled in F6, which had the greatest variation in daily DO saturation on grab sample days.

Hydrology
The hydrologic regimes of the study streams spanned a wide gradient in flow statistics during the 2016 water year, particularly those used to characterize the impacts of urbanization (Table 1). Annual runoff was 32-75% higher in the flashiest watershed (F6) compared to the other watersheds. Mean baseflow was 60% greater in the winter months (December-February) than in the summer (June-August) in streams draining watersheds located in the Triassic basin (F1, F2, and F6). In addition, two streams draining Triassic basin watersheds (F1 and F6) had over a month of no flow days during the late summer and early fall (Table 1), while the rest of the streams had nearly continuous flow. There was less than a 9% decrease in summer baseflow in the NOP basin streams (F4 and F5), but the CSB stream (F3) decreased by 43%. The frequency of precipitation events was mostly even throughout the year, with a slight increase in the frequency of small events during June and July (Supporting Information Fig. S3). RBI and Peaks Q75 were positively correlated with each other (ρ > 0.5), but not significantly (p > 0.05).

Geomorphology
Channel gradient measurements differentiated streams into two groups-those with higher channel gradients of~3% (F3, F4, and F5) and lower channel gradients of~0.5% (F1, F2, and F6). High channel gradient streams were in the CSB and NOP basins (Table 2). At baseflow in the low-gradient streams, more than half of surveyed upstream reaches (> 60 m) from the DO sensor location were classified as pools. Channel incision correlated strongly with annual flow (ρ = −1, p = 0.003) and the number of no flow days (ρ = −0.88, p = 0.02), but was not correlated with any hydrologic flashiness metric.
Overall, median grain sizes did not correlate with hydrologic flashiness metrics. Instead, we observed a wide variation among sites in the distribution of grain sizes within streams (Supporting Information Fig. S4). The median bed material grain size (D 50 ) ranged from < 2 mm in the second least flashy stream (F2) located in the Triassic ecoregion to 128.4 mm (large cobble) in the second most flashy stream (F5) located in the NOP ecoregion. The D 50 was larger in the NOP-CSB streams, but all streams had a D 16 of < 2 mm. Lowgradient streams had higher bed material skewness and kurtosis, but none of the streams had a normal distribution of grain sizes. Calculations of critical discharge yielded unrealistic values of close to 100% exceedance in sites F1, F2, and F3 (Supporting Information Fig. S5). Therefore, exceedance of critical discharge and velocity will not be discussed further.

Light, temperature, water chemistry, and turbidity
We observed relatively low-light conditions at the stream surface ( Fig. 1; Supporting Information Fig. S6) and strong seasonal trends in streamwater temperature. The maximum weekly mean of PPFD occurred at all sites in early April in both 2016 and 2017 (9.6 mol m −2 d −1 and 9.9 mol m −2 d −1 , respectively) followed by a rapid decline due to canopy cover leaf out. Weekly mean PPFD was < 3.1 mol m −2 d −1 from the end of May 2016 to the end of January 2017. Cumulative PPFD after 340 shared days was 40% lower in the darkest, most flashy site (F6) relative to the site with the most light reaching the channel surface (F5; Fig. 1A). Cumulative PPFD was highest in the two least incised channels (F3 and F5) and lowest in the most incised channel (F6) but was not significantly correlated with the incision ratio (p > 0.1; Fig. 1B). Mean streamwater temperatures were highest across all sites in September 2016 (23.8 C) and lowest in January 2016 (6.4 C). The maximum daily streamwater temperature difference among sites was 1.6 C, and no stream was consistently different from the others in temperature.
Median TDN concentrations ranged from 0.31 mg L −1 in F1 to 0.94 mg L −1 in F4 and were lowest in the late summer and early fall (August 2016-October 2016) (Supporting Information Fig. S7). Median NO 3 − -N concentrations ranged from 0.07 mg L −1 in F2 to 0.98 mg L −1 in F4. Median NH 4 + -N concentrations were an order of magnitude lower and ranged from 0.004 mg L −1 in F3 to 0.017 mg L −1 in F6. TDN correlated more strongly with NO 3 − -N than NH 4 + -N (ρ = 0.84 and 0.33, respectively, p < 0.001), and declined in the later summer and early fall in all but streams located in the NOP basin. SRP concentrations were lowest across all sites during the winter (January 2016-March 2016) and median concentrations ranged from < 0.01 mg L −1 in F2 to 0.09 mg L −1 in F6, which had distinctly higher concentrations than the rest of the streams throughout the study period (Supporting Information Fig. S7). Median daily turbidity ranged from 94 FTU in F4 to 380 FTU in F2 on days for which every site had data (n = 46). F2 tended to have the highest turbidity, while F4 had the lowest (Fig. 1C).

Dissolved oxygen
Percent DO saturation (DO %sat ) regimes were distinct among streams (Figs. 2, 5A). Differences among streams in supersaturation (DO %sat > 100%) were primarily observed during peaks of productivity in the spring, when F1, F2, and F3 had noticeable peaks. In contrast, differences among streams in periods of low DO %sat were primarily during the late summer and early fall. During these periods, there was low to almost no flow in shallow gradient streams (F1, F2, and F6) and a concomitant decline in DO concentrations that was most pronounced in F6. Median daily DO %sat (and DO obs ) ranged from 47% (4.6 mg L −1 ) in F6 to 91% (9.1 mg L −1 ) in F4 for the entire monitoring period. Both shallow gradient streams F2 and F6 had the most drastic decline in median DO %sat between the winter (December 2015-February 2016) and summer (June 2016-August 2016), with median daily saturation values falling by 30% and 49%, respectively. All other streams had less than a 10% decline in median DO %sat between seasons.

Modeled GPP and ER
We modeled relatively low median GPP in the headwater streams draining urbanized landscapes monitored in this study (Figs. 2, 3). Across all dates and streams, GPP ranged from 0 g O 2 m −2 d −1 to 9.1 g O 2 m −2 d −1 (median: 0.15 g O 2 m −2 d −1 ). February through April were the most productive months within each site, while August through November were the least productive months, apart from June 2016 as the least productive in F4. Final cumulative GPP values over 86 shared days across all sites ranged from 10.8 g O 2 m −2 in F5 to 82.3 g O 2 m −2 in F3 (Fig. 3). Across all dates and streams, ER ranged from 0.01 g O 2 m −2 d −1 to 25.3 g O 2 m −2 d −1 (median: 3.4 g O 2 m −2 d −1 ). F2 had the highest rates of respiration (median: 7.9 g O 2 m −2 d −1 ) with monthly means consistently greater than 6 g O 2 m −2 d −1 . F4 and F5 had the lowest median rates of respiration (1.5 g O 2 m −2 d −1 and 2.0 g O 2 m −2 d −1 , respectively), but strong correlations between modeled ER and K 600 suggest reaeration interference in model estimation (Supporting Information Fig. S2). ER tended to be a seasonal and cumulative ER across shared days ranged from 136.6 g O 2 m −2 in F4 to 741.8 g O 2 m −2 in F2 (Fig. 3). Mean cumulative ER was lower by more than 50% in the NOP streams (F4 and F5) than any other stream.
All streams were heterotrophic with cumulative NEP across shared days ranging from −690.2 g O 2 m −2 in F2 to −120.9 g O 2 m −2 in F4. All modeled days of metabolism in all streams were heterotrophic (GPP : ER < 1) except for 2 d in March 2016 in the most productive site F3.

Environmental drivers: Hydrology and geomorphology
The three least flashy streams had more pronounced spring GPP peaks than the three most flashy sites (Figs. 2, 3). The 95 th quantile of GPP in the less flashy sites was on average 68% greater than that of the three most flashy sites. However, the mean of maximum daily velocity for the antecedent week could only predict up to 13% of the variation in GPP within a stream ( Fig. 4; Table 3), and predictive ability only declined with a shorter antecedent window. ER was most tightly coupled to GPP and streamwater temperature in the flashiest site (F6; Supporting Information Table S1). Cumulative NEP was not correlated with any flashiness metric and the amount of time velocity exceeded the predicted velocity threshold for periphyton removal (0.15 m s −1 ; T Velocity>0.15 in Table 1) did not correlate with cumulative GPP.
Low-flow conditions during baseflow coincided with decreasing DO concentrations in the water column of low-gradient (~0.5%) streams F2 and F6 (Fig. 2). More than 97% of DO %sat conditions less than 50% occurred during flow conditions of less than 0.15 mm h −1 . Low-gradient streams tended to have lower 50% exceedance values of DO %sat as compared to high-gradient (~3%) streams (Fig. 5A). Reaeration of streamwater approaching anoxia was evident during the summer in F6 when DO %sat exhibited flushing behavior following storms (Fig. 5C) and in F2 (Supporting Information Fig. S8).

Environmental drivers: Surface and water column light availability
The explanatory power of light reaching the channel surface declined with increasing hydrologic flashiness rank, but antecedent light conditions explained more of the variation in GPP than antecedent velocity ( Fig. 4; Table 3). However, light availability was clearly not the only constraint on productivity as F5 had the most cumulative light of all the streams (Fig. 1), but the lowest cumulative GPP (Fig. 3).

Fig. 2.
Time series representing the duration of the study period of 10-min streamwater percent DO saturation (DO %sat ) in black, modeled daily GPP in green, and modeled ER in brown. Gray bars in the DO %sat time series represent periods of time when the sensor was removed from the stream channel to avoid ice damage (January 2016-February 2016) or the DO data was removed due to burial or sensor failure. Missing data in the metabolism estimates can either be a result of missing data or an inability to fit an estimate using the data. The dashed line in DO %sat plots is set at 30% saturation, which is a commonly used hypoxia threshold.
Antecedent light conditions explained more of the variation in GPP than light conditions measured on the same day as modeled GPP. Antecedent GPP as a proxy for autotrophic biomass accrual explained the most variation in GPP and its explanatory power also declined with increasing hydrologic flashiness rank (Table 3).
Following a storm, turbidity can constrain light reaching the streambed and therefore we expected turbidity to be negatively correlated with GPP. We did not find a direct negative correlation between GPP and turbidity, except in F1 where water column turbidity explained the residuals in the relationship between GPP and light reaching the channel surface on the same day (R 2 adj = 0.55, df = 77, p < 0.0001).

Environmental drivers: Temperature and nutrients
Streamwater temperature was a stronger predictor than GPP of within-stream variation in ER (Supporting Information  Table S1). However, GPP and streamwater temperature explained at most 25% of within-stream variation in ER, suggesting that respiration was not driven or constrained by internal productivity. Regression analysis of within-stream ER was not performed using modeled ER from NOP streams (F4 and F5) due to uncertainty in reaeration interference.

Discussion
In this study of the DO and metabolic regimes of urban headwater streams across a gradient of hydrologic flashiness, we found two key results. First, antecedent light conditions at the channel surface explained more of the variation in GPP than antecedent velocity within streams, but the explanatory power of light declined with increasing hydrologic flashiness among streams. Second, urban headwater streams span a surprisingly wide range in DO regimes, ranging from frequent hypoxia to near constant saturation. Both of these DO regimes resulted in heterotrophic streams with low rates of productivity. Our results suggest that variation in channel gradient and underlying geology can play an important role in determining the bioenergetics of urban headwater streams by intensifying "pulse" and "press" disturbances (Bender et al. 1984). Changes to channel geomorphology increase the susceptibility of benthic biota to perturbation during "pulsed" stormflow disturbances and create "press" disturbance conditions at baseflow with more frequent hypoxia between storms.

Light and disturbance constrained productivity
Human alterations of streams and near-stream zones have led to a reduction in the amount of solar energy that reaches urban headwater streams because of channel modification. At the reach scale, incident light reaching the channel surface in urban streams can be reduced through bank shading as a consequence of incision. Channel incision results from an imbalance in sediment delivery and erosive forces because of increased hydrologic flashiness (Wolman 1967). We observed less cumulative incident light reaching the channel surface with increasing channel incision (Fig. 1B), but the relationship was not significant most likely due to variable canopy closure. At the network scale, less light can reach the channel surface in urban settings due to culverts, piping, and stream burial (Elmore and Kaushal 2008) potentially resulting in an overall reduction of GPP (Hope et al. 2013). The low-light availability at the stream channel surface in our headwater streams might explain why median rates of GPP from these streams are comparable, but lower than those previously reported for urban streams in the mid-Atlantic (Larsen and Harvey 2017). Proximate controls of light availability through canopy and bank shading are a product of both human management of riparian vegetation and channel incision from increased hydrologic flashiness.
Of the incident light that does reach urban headwater channel surfaces, less might be converted to biomass because autotrophic organisms are constantly recovering from habitat disturbance following nearly every storm (Reisinger et al. 2017). Disturbance was frequent in our urban streams with a mean recurrence interval of discharge exceeding the 95 th quantile of flow every 5 AE 1 d during the 2016 water year across all streams. The longest period without a storm exceeding the 95 th quantile of flow was 6 AE 3 weeks across all streams. While underlying geology has an influence on the degree of sedimentation and erosion (Utz et al. 2016), most of the bed material in our streams was susceptible to being mobilized by these frequent events. Some of our stream channels had strongly bimodal distributions of grain sizes, with sand and silt intermixed with bedrock and boulders (Supporting Information Fig. S4). Critical shear stress calculations rely on a representative median grain size (D 50 ), yet the non-normal distributions of bed material sizes within many of our streams made this calculation unreliable as a metric of the frequency of bed disturbance. While larger bed material is mostly likely stable during storm events, its surface could be subject to hydraulic scour and abrasion by mobilized fines, leading to periphyton removal (Francoeur and Biggs 2006). Because of this abrasion and hydraulic scouring, any physically stable substrate in a channel may not necessarily represent refugia for biofilm biota.
The urban streams monitored in this study were less able to capitalize on available light than other previously reported southeastern headwater streams, most likely because of a reduction in autotrophic biomass on the streambed. GPP is both a function of the instantaneous response to light and the lagged effect of biomass growth. Variation in light reaching the channel surface explained little of the variation in GPP in our three most flashy streams ( Fig. 4; Table 3). Roberts et al. (2007) found they could explain 84% of the variability in GPP in the spring with light measured at the stream channel surface in another southeastern headwater stream (Walker Branch, Tennessee). In contrast to these instantaneous effects, daily GPP in our urban headwater streams was more closely connected to both antecedent light conditions and GPP over the preceding week. The explanatory power and slope of the relationship of daily GPP with antecedent GPP declined with increasing flashiness suggesting a reduction in photosynthetically active biomass. Time since last disturbance has been shown to be an important control on GPP in rivers (Uehlinger 2006;Roberts et al. 2007;Reisinger et al. 2017), but further work is needed to address how variation in the resilience of photosynthetically active biomass to disturbance changes  Table 3 for more information on linear model results.
relationships between whole-stream GPP and light availability throughout recovery intervals.
Furthermore, extreme flows do not just scour biofilms, they also can be accompanied by an extended period of high turbidity that delays biotic recovery. We found a wide range in turbidity conditions within each stream, but turbidity only explained the residuals in the relationship between GPP and light in one stream (F1). Turbidity may not persist for long in these streams following storms, but there is also the possibility that heterogeneity in the proximity of bed material with algal growth to the water surface might prevent turbid waters from attenuating light and slowing productivity (e.g., submerged tree roots near the water surface with surficial algal growth). Further work needs to be done to understand the relative influence of turbidity in urban streams on whole-stream productivity.

Respiration estimates in unproductive streams
Rates of ER dominated metabolism in all six streams (Fig. 3). We had little success in explaining day to day ER (Supporting Information Table S1). We have limited confidence in our precision, given the strong negative correlation between modeled ER and gas evasion (K 600 ) in several of our stream records (Supporting Information Fig. S2). A strong negative correlation is the direct result of a weak diel signal in DO which reflects either low rates of biological activity or high rates of reaeration. When the change in daily DO approaches zero, the metabolism model essentially equates the ER estimate with the gas evasion estimate. Current metabolism modeling methods are particularly challenged to separate variability in gas evasion and ER when the signal of productivity is low (Appling et al. 2018), such as during summer low flow periods in shaded streams. Regardless of the strength of the DO signal, future studies modeling metabolism should report correlations between modeled ER and K 600 to improve transparency in the amount of inference to be gained from the modeled respiration and gas evasion estimates. Measuring K empirically through gas additions in systems with low productivity or high K can help constrain priors in metabolism models. Table 3. Results of linear models between log-transformed GPP and the mean light reaching the channel surface (PPFD mean ), the maximum velocity (Velocity max ), the interaction of mean light and maximum velocity (PPFD mean × Velocity max ), and the mean GPP (GPP mean ) of the antecedent week (7 days) to every modeled day of GPP. Bimodal dissolved oxygen regimes For half of our streams, chronically degraded conditions were evident during baseflow periods with low DO concentrations for extended periods between storms, especially during the summer. These hypoxic periods were seen in streams where both baseflow and channel slope were low (most evident in sites F2 and F6 in Figs. 2, 5 and Supporting Information Fig. S8). Urbanization leads to reduced infiltration and shallow groundwater flows, with far more of the water exiting watersheds via direct stormwater piping to the channel. Reductions in urban stream baseflows relative to rural areas have been previously reported for southern streams in the NC Piedmont and Coastal Plain regions (Hardison et al. 2009;O'Driscoll et al. 2010). In addition to the reduction of baseflow of urban streams, changes in the geomorphology of stream channels as a result of heterogenous channel incision have the potential to increase the MRT of water within headwater streams. Normally incision is predicted to be continuous along stream channels (Schumm et al. 1984). However, bedrock outcrops or culverts under road crossings can make incision irregular and in these cases, incision may reduce slopes in between bedrock or culvert constrained sections of the channel (Fig. 6). Shoffner and Royall (2008) reported reach scale impoundment of urban streams by incision-resistant bedrock outcroppings resulting in upwards of 60% reach-scale pool hydraulic habitat in urban streams in the same physiographic region (Greensboro, North Carolina). We observed similar patterns in half of our streams where deep, slowmoving silty pools alternate with bedrock slides. In these pools, oxygen demand by benthic heterotrophs likely exceeds oxygen diffusion, and they remain hypoxic until storm waters replenish oxygen rich water (e.g., Fig. 5C). Hypoxic conditions are stressful for aquatic macro-organisms, and we observed fish kills on 08 February 2017 following a hypoxic period in the F2 stream.
During these hypoxic periods, we could not solve for stream metabolism, which relies on diel variation in DO. This does not mean that there is no productivity during these times, merely that there is no detectable signal of photosynthesis in these streams as they approach a dystrophic lentic state. We hypothesize that these oxygen depleted periods are likely extremely important to the biogeochemical cycling within these channels and given their prevalence (both spatially and temporally) may dominate nutrient cycling in these systems. We find that these low DO periods are strongly correlated with low streamwater nitrate concentrations of grab samples taken on the same day (Fig. 5B). We hypothesize that the increase in the MRT of water during periods of low flow allows for the drawdown of water column DO and increases the probability of N assimilation or transformation either in the water column or the hyporheic zone. In addition, under hypoxic conditions it seems likely that either rates of denitrification are elevated or less ammonium is nitrified, although we did not observe a relationship between DO and streamwater ammonium concentrations. In any system where long MRTs and high-organic matter availability lead to oxygen declines, classic stream techniques may underestimate productivity and nutrient retention capacity by ignoring these places and periods of time. To fully assess the impact of these important habitats will require the development of methods that do not rely on advective transport and rapid turnover.

Implications
Pools are a ubiquitous feature of rivers, made more so by both urban driven incision and the long-term legacies of dam building and dam removal (Walter and Merritts 2015). Whether a pool is likely to experience hypoxia is dependent upon its geologic setting and its interactions with groundwater. Urbanization both increases channel incision and lowers local groundwater tables-two forces that may act in concert to increase the area of pool habitats and reduce their hydrologic connectivity during low flows. Our data suggest that these pools may be quite active zones for nitrogen removal, and it is likely that they serve as important control points (sensu Bernhardt et al. 2017) for many other solutes that are sensitive to redox or that serve as alternative electron acceptors. Urban stream sediments tend to have elevated concentrations of contaminants that might be mobilized in pools under anoxic conditions (e.g., P and Hg; Vidon et al. 2010), and flushed to downstream ecosystems during floods. We have evidence from one of our streams (F6) that while streamwater nitrate concentrations decline, SRP concentrations increase under hypoxic conditions. This suggests a potential feedback mechanism in which stagnant pools in urban streams undergo internal eutrophication as P is released from bonds with cations under altered redox states. Heterogeneity in stream flow resulting from human modification of the geomorphic template within channels can alter the rate of transformation and transport of reactive materials within river networks.
Urban streams are heavily modified, but their features, while extreme, are not unique. The study of urban streams has led to a number of important insights in geomorphology (Leopold 1968), hydrology (Hollis 1975), and has been used as an important endpoint in studies of urban ecology (Paul and Meyer 2001). Much of the ecosystem science in urban watersheds has focused on the high fluxes from watersheds and through stream channels during stormflows. There has been far less attention to the dramatic changes in these ecosystems between these events. Our work documents a shift in ecosystem dynamics in urban streams toward bimodality between the fast dynamics of advective transport from pavements and hillslopes and the slow dynamics of redox active zones in nutrient and organically enriched stream pools. These bimodal ecosystem dynamics arise as a result of the bimodality of the hydrogeomorphic template, whereby incision creates slow-moving, low-slope areas in between high-slope and hardened segments of the channel (Fig. 6). This high-spatial variation in DO dynamics is not unique, as high-spatial heterogeneity in DO in rivers has been previously reported in low-gradient streams in Illinois (Garvey et al. 2007) and Kansas (Siders et al. 2017).
We set out to study the metabolism of urban streams, but the most interesting things we learned from our study arise from the frequent situations in these streams where our methodologies fail. Methods in lotic systems often rely on advection to characterize ecosystem function, constraining our ability to understand stream biogeochemical processes independent of flow. Urbanization amplifies the differences between pools and riffles, between baseflows and stormflows, and between hillslope and in-stream dynamics. Much attention has been paid to how impervious surfaces dramatically increase the contributing area to streams via surface runoff. Less attention has been directed to how the reduced baseflows that result from lower infiltration may disconnect streams and their riparian zones (Groffman et al. 2003) from their watersheds between storms. Urban stream dynamics are driven by the characteristics of their watersheds (or "valleys") during storms (Hynes 1975) but the drivers of those dynamics become increasingly distinct from the landscapes they drain at baseflow.