Turning Lakes Into River Gauges Using the LakeFlow Algorithm

Rivers and lakes are intrinsically connected waterbodies yet they are rarely used to hydrologically constrain one another with remote sensing. Here we begin to bridge the gap between river and lake hydrology with the introduction of the LakeFlow algorithm. LakeFlow uses river‐lake mass conservation and observations from the Surface Water and Ocean Topography (SWOT) satellite to provide river discharge estimates of lake and reservoir inflows and outflows. We test LakeFlow performance at three lakes using a synthetic SWOT data set assuming the maximum measurement errors defined by the mission science requirements, and we include modeled lateral inflow and lake evaporation data to further constrain the mass balance. We find that LakeFlow produces promising discharge estimates (median Nash‐Sutcliffe efficiency = 0.88, relative bias = 14%). LakeFlow can inform water resources management by providing global lake inflow and outflow estimates, highlighting a path for recognizing rivers and lakes as an interconnected system.

understanding of how impoundments impact surface water flows and has motivated the development of alternative techniques for supplementing river and lake gauge observations. Satellite remote sensing is uniquely capable of providing observation-based discharge estimates in near-real time and at the global scale (Smith, 1997). Although remote sensing of discharge (RSQ) has been performed using a variety of satellite data and techniques (Gleason & Durand, 2020), much of the recent work has focused on preparing for the recently launched Surface Water and Ocean Topography (SWOT) mission (Biancamaria et al., 2016). Though SWOT cannot directly observe river discharge, it can potentially provide unprecedented cotemporal measurements of river area, elevation, width, and slope for all rivers within the SWOT River Database (SWORD) (Altenau et al., 2021). The SWOT mission will also produce discharge estimates calculated by combining cotemporal SWOT observations with flow laws (e.g., hydraulic geometry, Manning's equation), mass conservation principles, and a priori estimates of non-SWOT-observable flow-law parameters (FLP) such as frictional resistance (Manning's n) and bathymetry (Brinkerhoff et al., 2020; Durand et al., 2014). These SWOT discharge estimates will be produced in practice using the Confluence program, which houses several different RSQ algorithms (Durand et al., 2023). SWOT RSQ algorithms are sensitive to FLP estimates, which are provided by the SWORD of Science (SoS) database for all SWOT observable rivers (Brinkerhoff et al., 2020). SoS priors of Manning's n and bathymetry are developed using in situ measurements that are then paired with river attributes such as mean width, allowing for mean width alone to provide prior estimates of these FLPs.
Although SWOT discharge is expected to improve our understanding of global river discharge (Pavelsky et al., 2014), existing SWOT RSQ algorithms do not leverage SWOT observations of lakes into their workflow, which could improve performance.
In lakes, SWOT can observe lake surface area and elevation, which together can be combined to estimate volumetric storage change (Busker et al., 2019; Crétaux et al., 2011; Gao, 2015; Zhao & Gao, 2019). Storage change estimates are valuable for understanding seasonal and long-term trends in water availability and usage (Cooley et al., 2021; Keys & Scott, 2018; Ryan et al., 2020). Storage change fluctuations also influence downstream river discharge (Nickles & Beighley, 2021; Wang et al., 2013), but very few remote sensing applications consider lakes and rivers as an interconnected system (Gardner et al., 2019). The few remote sensing studies that do assess lakes and rivers together rely on modeled discharge and use satellite estimates of lake storage change to revise the modeled outflow discharge (Bonnema & Hossain, 2019; Yoon & Beighley, 2015; Yoon et al., 2016). This calibration only improves the difference between the inflow and outflow discharge, leaving the original bias in the modeled inflow (or outflow) discharge uncorrected (Bonnema, Sikder, Miao, et al., 2016). However, the accuracies of both inflow and outflow discharge are important because together they provide key insights into human water management and the impact lakes have on river flow regime. Currently, SWOT RSQ algorithms have neither been assessed at river-lake boundaries nor been specifically designed to run there (Durand et al., 2016; Frasson et al., 2021).
To address these gaps in our ability to monitor the river-lake continuum, we develop LakeFlow, an algorithm which applies river-lake mass conservation to estimate both lake inflow and outflow discharge. Like other SWOT RSQ algorithms, LakeFlow relies on Manning's equation and mass conservation (Feng et al., 2021; Hagemann et al., 2017) but also leverages additional SWOT observations of lake storage change to further constrain river discharge. In addition to discharge, LakeFlow estimates Manning's n and bathymetry of lake inflow and outflow channels, which can be used to inform or improve other SWOT RSQ algorithms. LakeFlow could potentially be applied to the nearly 17 thousand SWOT observable lakes that are located along the SWORD network and have at least one inflow and one outflow reach that are observable from SWOT (Figure 1). In total, LakeFlow could possibly provide valuable insights into discharge dynamics at 19,380 inflow and 16,959 outflow reaches that are connected to SWOT observable lakes. Ultimately, LakeFlow bridges the gap between lake storage and river discharge to improve SWOT discharge coverage and accuracy.

LakeFlow Algorithmic Design
The LakeFlow algorithm uses SWOT observed river and lake variables to estimate discharge. LakeFlow uses the modified version of Manning's equation from Durand et al. (2014) to describe discharge dynamics for the inflow and outflow reaches,

$$Q = \frac{1}{n}\,(A_0 + \delta A)^{5/3}\, W^{-2/3}\, S^{1/2} \tag{1}$$

where Q is discharge and n is the frictional resistance of the river channel, referred to as Manning's n. $A_0$ represents the unobservable cross-sectional area that lies below the minimum observed water level, hereinafter referred to as bathymetry, δA is the SWOT observable change in cross-sectional area, W is river width, and S is slope. LakeFlow leverages SWOT estimated lake storage change (δV) during a time period such as between two consecutive SWOT overpasses (p) to constrain inflow and outflow discharge based on mass conservation (Equation 2), where t represents any time during period p, $Q_l$ is lateral inflows from channels too small to be observed by SWOT, E is lake evaporation, and all other variables are the same as in Equation 1, with i and o denoting the variables of the SWOT observable inflow and outflow reaches, respectively (Figure 2). Simply put, LakeFlow assumes that lake storage change is equal to inflow minus outflow discharge while accounting for lateral inflows and evaporation.
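The two relationships described above can be illustrated with a minimal Python sketch. It is not the LakeFlow implementation: the function names, the example reach values, the treatment of evaporation as a volumetric flux in m^3/s, and the discretization of the mass balance as a sum of net flux times a fixed time step are all assumptions made here for illustration.

```python
import math


def manning_discharge(n, a0, delta_a, width, slope):
    """Modified Manning's equation: Q = (1/n) * (A0 + dA)^(5/3) * W^(-2/3) * S^(1/2)."""
    area = a0 + delta_a  # total cross-sectional area, m^2
    return (1.0 / n) * area ** (5.0 / 3.0) * width ** (-2.0 / 3.0) * math.sqrt(slope)


def storage_change(q_in, q_out, q_lateral, evaporation, dt_seconds):
    """Lake mass balance over one period: dV = sum((Qi - Qo + Ql - E) * dt).

    All fluxes are lists in m^3/s sampled at equal steps of dt_seconds.
    This discretization is an assumption of the sketch.
    """
    return sum(
        (qi - qo + ql - e) * dt_seconds
        for qi, qo, ql, e in zip(q_in, q_out, q_lateral, evaporation)
    )


# Hypothetical reach values, chosen only to exercise the functions.
q_i = manning_discharge(n=0.030, a0=50.0, delta_a=30.0, width=60.0, slope=1e-4)
q_o = manning_discharge(n=0.035, a0=40.0, delta_a=25.0, width=55.0, slope=1e-4)

# One-day storage change implied by these fluxes (m^3).
dv = storage_change([q_i], [q_o], [2.0], [0.5], dt_seconds=86400.0)
print(f"Q_in={q_i:.1f} m^3/s, Q_out={q_o:.1f} m^3/s, dV={dv:.0f} m^3")
```

In the algorithm itself the direction is reversed: δV, δA, W, and S are observed, and n and $A_0$ for each reach are the unknowns to be inferred.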
While SWOT provides estimates of lake storage change (δV), change in river cross-sectional area (δA), slope (S), and width (W), it does not observe Manning's n (n) or bathymetry ($A_0$) for the inflow and outflow reaches, leaving four unknown variables in Equation 2. Note that for simplicity, we only include one inflow reach and one outflow reach in Equation 2, but LakeFlow can be applied to lakes with multiple inflow and outflow reaches.

[Figure caption, partial] Each of these lakes is observable by SWOT (Sheng et al., 2016) and contains at least one SWOT observable inflow and one SWOT observable outflow (Allen & Pavelsky, 2018; Altenau et al., 2021). Note that the Lake Allatoona inflow gauge is located on the inflow mainstem (dashed orange line), 7 km upstream of the SWORD reach (orange line).
parameters ($n_i$, $A_{0i}$, $n_o$, $A_{0o}$) given repeated SWOT observations. Bayesian approaches start from Bayes' rule,

$$p(\Theta \mid x) = \frac{f(x \mid \Theta)\, p(\Theta)}{\int f(x \mid \Theta)\, p(\Theta)\, d\Theta} \tag{3}$$

where Θ is a set of unobserved SWOT parameters, x is the SWOT observed data, f(x|Θ) is the sampling model where data are conditional on the parameters, and p(Θ) is the joint prior distribution of the parameters. Thus, we are interested in approximating p(Θ|x), the posterior distribution. Bayesian inference aims to approximate the posterior distribution by assuming proportionality (p(Θ|x) ∝ f(x|Θ)p(Θ)) and using Monte Carlo sampling. To implement the Bayesian inference, we log transform and scale Manning's equation to have integer coefficients,

$$6 \log Q = -6 \log n + 10 \log(A_0 + \delta A) - 4 \log W + 3 \log S \tag{4}$$

To provide a likelihood equation, we rearrange Equation 4 to isolate the measured variables for both the inflow and outflow reaches,

$$4 \log W - 3 \log S = -6 \log Q - 6 \log n + 10 \log(A_0 + \delta A)$$

and a corresponding likelihood equation is defined for river-lake mass conservation. The Bayesian approach requires prior estimates of all unknown parameters in Equation 2, which are taken from the SWOT SoS. In addition to estimates of Manning's n and bathymetry, the SoS provides gauge-constrained and unconstrained modeled estimates of mean flow, and LakeFlow uses the gauge-constrained estimate, taken from the Global Reach-Level A Priori Discharge Estimates for SWOT (GRADES) model product (Lin et al., 2019). The Bayesian inference uses the Stan probabilistic programming language (Stan Development Team, 2023) to approximate the posterior distribution and provide estimates of all unknowns in Equation 2.
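To make the role of the rearranged Manning's equation concrete, the sketch below evaluates an unnormalized log posterior, log f(x|Θ) + log p(Θ), for a single reach in plain Python. It is not the Stan model used by LakeFlow and it omits the mass-conservation likelihood. The Gaussian error model, the value of `sigma`, the log-normal priors, the treatment of per-overpass discharge as a latent variable, and every numeric value are assumptions chosen for illustration.

```python
import math


def measured_side(width, slope):
    """Measured side of the rearranged equation: 4 log W - 3 log S."""
    return 4.0 * math.log(width) - 3.0 * math.log(slope)


def parameter_side(log_q, log_n, a0, delta_a):
    """Other side: -6 log Q - 6 log n + 10 log(A0 + dA)."""
    return -6.0 * log_q - 6.0 * log_n + 10.0 * math.log(a0 + delta_a)


def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -math.log(sd * math.sqrt(2.0 * math.pi)) - 0.5 * ((x - mean) / sd) ** 2


def log_posterior(log_n, a0, log_q, obs, priors, sigma=0.25):
    """Unnormalized log posterior for one reach.

    obs:    list of (W, S, dA) tuples, one per overpass
    log_q:  list of latent log-discharges, one per overpass
    priors: (mean, sd) pairs for 'log_n', 'log_a0', 'log_q' (assumed log-normal)
    sigma:  assumed standard deviation of the measured side (hypothetical)
    """
    lp = normal_logpdf(log_n, *priors["log_n"])
    lp += normal_logpdf(math.log(a0), *priors["log_a0"])
    for (w, s, da), lq in zip(obs, log_q):
        lp += normal_logpdf(lq, *priors["log_q"])
        lp += normal_logpdf(measured_side(w, s), parameter_side(lq, log_n, a0, da), sigma)
    return lp


# Hypothetical "true" parameters and noise-free observations.
true_n, true_a0 = 0.03, 50.0
obs = [(60.0, 1.0e-4, 10.0), (65.0, 1.2e-4, 30.0), (70.0, 1.5e-4, 60.0)]
log_q = [
    math.log((1.0 / true_n) * (true_a0 + da) ** (5.0 / 3.0) * w ** (-2.0 / 3.0) * math.sqrt(s))
    for (w, s, da) in obs
]
priors = {
    "log_n": (math.log(0.03), 1.0),
    "log_a0": (math.log(50.0), 1.0),
    "log_q": (math.log(30.0), 1.0),
}

at_truth = log_posterior(math.log(true_n), true_a0, log_q, obs, priors)
perturbed = log_posterior(math.log(true_n) + 0.5, true_a0, log_q, obs, priors)
print(at_truth > perturbed)
```

A sampler such as the one in Stan explores this surface rather than evaluating it at two points; the comparison here only shows that parameter values consistent with the observations score higher than perturbed ones.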

Data Sets
We investigate the performance of LakeFlow in three sample lakes spanning a range of climate regions as seen in Figure 1: Lake Allatoona (humid); Lake Mohave (arid); and Tuttle Creek Reservoir (semi-arid). Lake Allatoona (area: 36 km²) is a flood control reservoir along the Etowah River in northwestern Georgia. Lake Mohave (area: 99 km²) is a hydropower reservoir on the Colorado River spanning the border of Arizona and Nevada. Tuttle Creek Reservoir (area: 43 km²) is located in northeastern Kansas and is built to control floods on the Little Blue and Big Blue Rivers. These lakes each have a U.S. Geological Survey (USGS) gauge station on or near their SWOT observable inflow and outflow reaches as well as on the lakes themselves. Lake Allatoona and Lake Mohave each contain one inflow and one outflow reach and Tuttle Creek Reservoir has two inflow reaches.
Because SWOT data are not yet available, we generate a synthetic data set of SWOT observable variables by utilizing gauge records from the USGS (U.S. Geological Survey, 2022), a Landsat-based water occurrence map (Pekel et al., 2016), and a priori channel attributes provided in SWORD. We then corrupt these data to produce SWOT-like observations by using the measurement errors defined by the mission science requirements and limit the number of observations to one observation per week, corresponding to the approximate average overpass rate of SWOT over these lakes (Biancamaria et al., 2016). The synthetic data set is developed using hydraulic principles and contains values of non-SWOT observed Manning's n and bathymetry, with ranges of 0.030-0.035 s/m^(1/3) and 12-59 m², respectively (see Text S1 in Supporting Information S1 for details of the synthetic data set). The historical timespan of the synthetic data set is determined by the availability of USGS gauge records, such that the measurements of the lake and its inflow and outflow must all overlap in time. As a result, the timespan for

Where the LakeFlow algorithm can run (i.e., SWOT observable lakes with SWOT observable inflow and outflow reaches), lake storage change is predominantly governed by large-river inflows and outflows that are observable by SWOT, but lake storage change can also be influenced by other factors including inflow from groundwater runoff, small lateral streams (pink lines in Figure 1), and evaporation loss (Tayfur et al., 2007; Tian et al., 2022; Zhao et al., 2022). To study the impact of including these factors on LakeFlow's performance, we run two scenarios of LakeFlow: one that only includes SWOT-based observations and a second that includes SWOT observations and also ancillary datasets of lateral inflow and evaporation, represented by $Q_l$ and E in Equation 2, respectively.
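The corruption and weekly sampling steps used to build the synthetic data set can be sketched as follows. This is only an illustration of the idea: the function names, the daily series, and the error magnitudes are placeholders rather than the mission science requirement values, and the zero-mean Gaussian error model is an assumption.

```python
import random


def weekly_subsample(daily_values, step_days=7):
    """Keep one value per week to mimic the approximate SWOT overpass rate."""
    return daily_values[::step_days]


def corrupt(values, sd, rng, relative=False):
    """Add zero-mean Gaussian noise to each value.

    sd is an absolute standard deviation, or a fraction of each value
    when relative=True. The error model is an assumption of this sketch.
    """
    return [v + rng.gauss(0.0, sd * abs(v) if relative else sd) for v in values]


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical daily gauge-derived series over four weeks.
daily_wse = [250.0 + 0.01 * day for day in range(28)]   # water surface elevation, m
daily_width = [80.0 + 0.5 * day for day in range(28)]   # river width, m

# Placeholder error magnitudes (not the actual SWOT requirements).
swot_like_wse = corrupt(weekly_subsample(daily_wse), sd=0.10, rng=rng)
swot_like_width = corrupt(weekly_subsample(daily_width), sd=0.15, rng=rng, relative=True)

print(len(swot_like_wse), len(swot_like_width))
```

Subsampling before corrupting keeps the noise draws tied to the retained overpasses only, which makes it easy to regenerate the same synthetic observations from a seed.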
We estimate lateral inflow using high-resolution simulated discharge from GRADES (Lin et al., 2019) and we estimate evaporation losses using modeled data from the Global Lake Evaporation Volume (GLEV) data set (Zhao et al., 2022) (see Text S2 in Supporting Information S1 for details of these ancillary datasets). We then assess LakeFlow's performance related to the ancillary datasets for each of the three study sites by comparing same-day LakeFlow estimated discharge with gauge discharge from the USGS and calculate Nash-Sutcliffe Efficiency (NSE), relative bias (rBias), normalized root-mean-square error (NRMSE), and mean absolute error (MAE) (Table S1 in Supporting Information S1). In addition to assessing discharge accuracy, we compare LakeFlow FLP estimates with the synthetic data set's values of Manning's n and bathymetry. We further compare LakeFlow FLP estimates with the SoS prior estimates to assess LakeFlow's capabilities for informing other SWOT-based RSQ algorithms. The SoS FLPs are chosen for comparison as these are the default prior FLP estimates for SWOT RSQ algorithms (Durand et al., 2023) (see Section 1 for more information).
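For readers unfamiliar with the four metrics named above, a minimal Python sketch of commonly used definitions is given here. The exact formulas used in the study are those in Table S1 of the Supporting Information, which is not reproduced in this text, so the forms below (in particular expressing rBias and NRMSE as percentages and normalizing RMSE by the mean observed discharge) are assumptions.

```python
import math


def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 - sum((sim-obs)^2) / sum((obs-mean(obs))^2)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((s - o) ** 2 for s, o in zip(sim, obs))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def rbias(sim, obs):
    """Relative bias in percent (assumed form): 100 * sum(sim - obs) / sum(obs)."""
    return 100.0 * sum(s - o for s, o in zip(sim, obs)) / sum(obs)


def nrmse(sim, obs):
    """RMSE normalized by mean observed value, in percent (assumed normalization)."""
    rmse = math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))
    return 100.0 * rmse / (sum(obs) / len(obs))


def mae(sim, obs):
    """Mean absolute error, in the units of the inputs."""
    return sum(abs(s - o) for s, o in zip(sim, obs)) / len(obs)


# Hypothetical same-day pairs of estimated and gauge discharge (m^3/s).
gauge = [10.0, 20.0, 30.0, 25.0]
estimate = [11.0, 19.0, 33.0, 26.0]
print(nse(estimate, gauge), rbias(estimate, gauge), nrmse(estimate, gauge), mae(estimate, gauge))
```

NSE equals 1 for a perfect match and falls below 0 when the estimate is worse than simply using the mean of the gauge record, which is why a positive NSE is read as evidence that flow variability is captured.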

Results
The results of the analysis, generated from synthetic SWOT data at the three test sites, indicate that the LakeFlow algorithm will be able to successfully estimate lake inflows and outflows from SWOT observations. In general, we find that LakeFlow estimated discharge closely resembles the gauge hydrograph for all of the inflow and outflow reaches (Figure 3). However, there is clear bias on some reaches, namely the Allatoona Lake Inflow and Tuttle Creek Reservoir Inflow 1. Even where biases are present, LakeFlow captures flow variability for each of the reaches analyzed here, as evidenced by a positive NSE for all reaches and a median NSE and NRMSE of 0.88 and 29.0%, respectively. While two reaches have relatively large rBias values, all of the other reaches have an absolute rBias less than 15% with a median rBias of 13.5%, indicating that on average, LakeFlow provides near-zero bias discharge estimates at river-lake interfaces.
LakeFlow accurately estimates discharge dynamics across all seven study reaches (Figure 4a). Overall, LakeFlow discharge performance tends to modestly improve with the addition of the lateral inflow and evaporation ancillary datasets but does not tend to improve with the addition of only a single one of these datasets (Figures S1 and S2 in Supporting Information S1). This discrepancy is likely due to the inherent bias introduced when only including one of these ancillary terms. LakeFlow discharge mean absolute error (MAE) improves by 1.6% when both ancillary datasets are included compared to including neither. However, the bias marginally increases when both ancillary data are used but remains near-zero (Figure 4a). With and without the ancillary data, LakeFlow discharge for each study location correlates well with same-day gauge discharge observations, with marginal overestimations and underestimations in low and high flows, respectively (Pearson correlation coefficient, R, ranges from 0.95 to 0.99).
Across all reaches, we find that discharge performance modestly improves with the addition of the ancillary data (Figure 4b). For example, the mean NSE and NRMSE improve by 4.8% and 6.7%, respectively, when including ancillary data. Conversely, there is a positive bias present in most reaches and the mean rBias is unaffected by the inclusion of the ancillary data. Nearly all of the metrics have a negatively skewed distribution, indicating that LakeFlow performs well on average but occasionally exhibits poor performance. In addition to estimating discharge, LakeFlow can estimate unobserved bathymetry ($A_0$) and Manning's n, with MAE values of 44 m² and 0.37 s/m^(1/3) (log), respectively. Across all reaches and scenarios, LakeFlow MAE for bathymetry is on average 80% lower than the SoS MAE (Figure 4c) while LakeFlow estimated Manning's n values are marginally worse than the SoS (Figure 4d). LakeFlow tends to overestimate Manning's n values in the three test lakes, which may be related to bathymetry estimates having a positive bias. Bathymetric accuracy declines by 2.8% and Manning's n accuracy remains stable with the inclusion of the ancillary data.

Discussion
The LakeFlow algorithm can provide useful discharge estimates at river-lake interfaces and will enhance the SWOT mission's capabilities for monitoring surface water dynamics. We do not test LakeFlow in locations where other SWOT RSQ algorithms have been assessed, but our findings indicate that LakeFlow's discharge accuracy is comparable to or better than that of other SWOT RSQ algorithms, thus providing the capability to extend the SWOT discharge product to river-lake boundaries with no expected decline in accuracy. These discharge data can inform hydroelectric and water management decisions and improve our understanding of how reservoir dynamics affect the surrounding environment (Barnett & Pierce, 2008; Chadwick et al., 2021; Huang et al., 2019; Wang et al., 2018). Reservoir operations are particularly important in transboundary water basins where water management in upstream portions of the basin can lead to actual or perceived inequities in downstream water distribution (UNEP, 2016). However, LakeFlow inflow and outflow discharge estimates can potentially increase the transparency of reservoir management practices with implications for water management decisions within transboundary basins (Gleason & Hamdan, 2017).

[Figure caption, partial] LakeFlow estimated discharge for all lake inflows and outflows compared to gauge records. These LakeFlow discharge estimates are produced using ancillary data ("SWOT + $EQ_l$").
In addition to discharge, LakeFlow's ability to estimate Manning's n and bathymetry values could provide useful geomorphic insights near river-lake interfaces. Compared to the SoS, LakeFlow provides marginally worse Manning's n estimates but significantly more accurate bathymetric estimates (assuming the synthetic FLPs are a reasonably valid benchmark). However, Manning's n values are inherently limited to a small range of 0.02-0.07 s/m^(1/3) (Arcement & Schneider, 1989) whereas bathymetry varies widely globally. Since RSQ algorithms are sensitive to prior FLP estimates (Durand et al., 2016; Tuozzolo et al., 2019), the more accurate LakeFlow bathymetries could improve the performance and efficiency of other RSQ algorithms near river-lake interfaces. Thus, there is potential to implement LakeFlow into the SWOT Confluence program (Durand et al., 2023) to inform other SWOT RSQ algorithms while also extending the SWOT discharge product to river-lake boundaries.
While LakeFlow is shown to perform well at the three study sites presented here, further work should be done to fully assess LakeFlow's performance. First, expanding the analysis to contain many more lakes spanning a variety of conditions would help to determine which factors (e.g., lake size, climate) are the dominant control on LakeFlow performance. To determine LakeFlow's benefits beyond discharge information, studies should quantify the effect of using LakeFlow estimates of Manning's n and bathymetry as a priori information in other SWOT RSQ algorithms. Further work is also needed to better characterize the importance of including ancillary data in LakeFlow as these data, on average, improve LakeFlow discharge estimates while decreasing bathymetric accuracy. Future work should also investigate whether additional ancillary data (e.g., water withdrawal, groundwater outflows) can improve LakeFlow's ability to estimate inflows and outflows. Errors in LakeFlow outputs may largely be driven by inaccuracies in the input data. The GRADES discharge product reports a Kling-Gupta efficiency of ≥0.2 at 62% of validation sites and the GLEV evaporation data set has an estimated uncertainty of 10%. While these errors are relatively low for global-scale products, their influence on LakeFlow's performance should be evaluated once SWOT data are available for use. Finally, running LakeFlow with real SWOT data will allow for a more accurate assessment of LakeFlow performance. Running LakeFlow at the global scale using SWOT observations requires a harmonized lake and river data set, which is currently in development and will enable the further understanding of river-lake interactions worldwide.
Overall, this study presents a first step in bridging river and lake hydrology with satellite remote sensing, illuminating a path forward for monitoring river-lake dynamics globally. Potential applications of LakeFlow include informing reservoir operations for flood control or optimizing the distribution of freshwater resources to humans and ecosystems (Boulange et al., 2021; Grimaldi et al., 2016; Munier et al., 2015). LakeFlow could also be used to provide estimates of water residence time in lakes, which could offer insights into the variability of lake greenhouse gas emissions (Maavara et al., 2019, 2020), sediment supply of rivers and lakes (Kondolf et al., 2014; Lewis et al., 2013; Wisser et al., 2013), and lotic-lentic ecosystem connectivity (Harvey & Schmadel, 2021). Applied at the global scale, LakeFlow could potentially enhance our ability to monitor and understand the impact of reservoir operations on the global water cycle.

Conclusion
The LakeFlow algorithm applies observations from SWOT to a river-lake mass conservation framework to estimate river discharge at lake inflows and outflows. We applied LakeFlow to three sample lakes spanning a variety of physiographic conditions using a synthetic data set of SWOT-like measurements. Our findings suggest that LakeFlow can provide promising discharge estimates at river-lake boundaries using data from the SWOT satellite. Specifically, LakeFlow captures the flow dynamics at all of the SWOT-observable inflow and outflow reaches in this study with NSE values ranging from 0.46 to 0.95, similar to or better than the performance of other SWOT RSQ algorithms. Incorporating lateral inflow and lake evaporation ancillary datasets into LakeFlow typically improves performance, although the impact of ancillary datasets on algorithm efficacy will be clearer once SWOT data become available in sufficient quantities. LakeFlow can improve upon prior estimates of bathymetry, which may prove beneficial for other SWOT RSQ algorithms, with relevance to the SWOT Confluence program. Estimating discharge at reservoir inflow and outflow reaches will improve our understanding of the effect of reservoir regulation on river discharge. LakeFlow is a step toward integrating remote sensing of lake storage variability and river discharge to provide a more comprehensive view of surface water dynamics.

Data Availability Statement
The LakeFlow outputs and synthetic SWOT data are openly available on Zenodo (https://doi.org/10.5281/zenodo.7781510) and the code used in this analysis can be found on GitHub (https://github.com/Ryan-Riggs/Lakeflow). All data used to develop the synthetic datasets are publicly available: Landsat data (https://global-surface-water.appspot.com/download), U.S. Geological Survey gauge data (https://waterdata.usgs.gov/nwis/rt), evaporation data (https://doi.org/10.5281/zenodo.4646621), and GRADES hydrological model outputs (https://www.reachhydro.org/home/records/grades).

data generation. The authors acknowledge members of the SWOT discharge algorithm working group for their feedback in the early stages of algorithm development. The authors would also like to acknowledge Valeriy Ivanov, Cassandra Nickles, and an anonymous reviewer for their feedback, which helped to improve the manuscript quality.