Beyond river discharge gauging: hydrologic predictions using remote sensing alone

This study suggests a radical approach to hydrologic predictions in ungauged basins, addressing the long-standing challenge of issuing predictions when in-situ river discharge does not exist. A simple but powerful rationale for measuring and modeling river discharge is proposed, using coupled advances in hydrologic modeling and satellite remote sensing. Our approach presents a Surrogate River discharge driven Model (SRM) that infers Surrogate River discharge (SR), which can mimic river discharge across varying topographies and vegetation cover, from remotely sensed microwave signals. The SR is then used to calibrate a hydrological model, and physical realism in the resulting river discharge profile is obtained by adding an estimated mean of river discharge via the Budyko framework. The strength of SRM is that it uses only remotely sensed data in prediction. The approach is demonstrated for 130 catchments in the Murray-Darling Basin (MDB) in Australia, a region of high economic and environmental importance. The newly proposed SR ($SR_L$, representing L-band microwave) boosts the Nash-Sutcliffe Efficiency (NSE) of modeled flow, showing a mean NSE of 0.54, with 70% of catchments exceeding NSE 0.4. We conclude that SRM effectively predicts high-flow and low-flow events related to flood and drought. Overall, this new approach will significantly improve catchment simulation capacity, enhancing water security and flood forecasting capability not only in the MDB but also worldwide.


Introduction
Natural disasters, such as extreme water events, occur each year (Thomas and López 2015) and have become more severe as a result of climate change (IPCC 2021, Kim et al 2022). However, there exist few river discharge gauges in remotely located catchments, limiting modeling and scenario assessment of future changes (Power et al 1999, Chiew and Mcmahon 2002, Verdon et al 2004, Zhang and Chiew 2009). Robust hydrological predictions remain a challenge across the world, as demonstrated in international initiatives like the Model Parameter Estimation Experiment (MOPEX; Duan et al 2006).
To address modeling in ungauged catchments, our study adopts a novel alternative for measuring river discharge using surrogate river discharge (SR). The proposed SR approach is based on the calibration-measurement ratio (C/M ratio or MC ratio; Brakenridge et al 2007, 2012), first created using spatial comparisons of retrieved temperatures, recognizing that land with a lower brightness temperature has a wetter perimeter (Brakenridge and Anderson 2006) and hence a greater river discharge. Early SR was typically derived using the Ka-band microwave (Kugler and De Groeve 2007; $SR_K$ hereafter). More recently, the use of an L-band microwave was recommended because it has a greater penetration ability (Kugler et al 2019). Consequently, Yoon et al (2022b) proposed a new approach to create SR derived from L-band microwave ($SR_L$ hereafter) based on temporal comparisons of the retrievals.
With this approach, a new basis for hydrological modeling via SR was developed (Yoon et al 2022a) with markedly greater accuracy that did not rely on an observed river discharge time series but on a correlated SR time series. This approach enables the prediction of hydrological anomalies with reduced uncertainty in ungauged basins across a variety of settings.
However, one question that remains unaddressed is the choice of the hydrologic signature to use. This signature could be the mean of the river discharge, the essential requirement in hydrological model calibration using SR, or an estimated mean using empirical or physically based alternatives. This study presents a new Surrogate River discharge driven Model (SRM) that uses remote sensing data alone, but adopts the estimated mean flow hydrologic signature using the Budyko framework (Budyko 1958, 1974) instead of observations. Since the Budyko framework only requires climate data, SRM depends on climate data and SR alone, enabling river discharge estimation without in-situ river discharge observations.
The proposed SRM approach is tested over the Murray-Darling Basin (MDB) in Australia. The MDB, with an area of approximately 1 million km$^2$, is known as the food bowl of Australia and is responsible for producing one-third of Australia's food supply (Leblanc et al 2012). The catastrophic Millennium drought (∼1996-2010) significantly impacted agriculture (Kirby et al 2014), ecosystems, and water availability for residents living in the MDB (Leblanc et al 2012). Beyond this event, a projection indicates that rainfall may decline primarily in winter and spring, causing droughts as severe as the Millennium drought, while the intensity of summer rainfall may increase until 2070, leading to severe floods (van Dijk 2008, Leblanc et al 2012). Therefore, this basin needs advanced hydrologic predictions in ungauged catchments to prepare for such challenges in the future. Also, focusing on this region can show how extreme hydrological events could be predicted using remotely sensed data alone, benefitting flood and water resources management globally. Our proposed approach is selected by comparing the prediction skill of two different SRs from two alternative microwave sources to assess their accuracy in predicting river discharge.
The rest of this paper is organized as follows. The next section presents the study area, data, methods, and assessment metrics. Section 3 presents the results, and section 4 concludes with a discussion, including the implications of this work.

Study area
Our study assesses the performance of SRs over an 8 year study period (1/1/2011 to 31/12/2018) at 130 Australian Hydrologic Reference Stations (HRS; Turner et al 2012, Zhang et al 2016), anthropogenically unaffected catchments that are frequently used for modeling studies in the MDB (figure 1). Two catchments (station ID: 401212 and 405218) are selected to highlight the performance of the modeled SRs ($Q_{SR}$) with river discharge and to contrast the performance between the two SRs ($SR_K$, $SR_L$) and two modeled SRs ($Q_{SR_K}$, $Q_{SR_L}$).
(Raupach et al 2009), and the potential evapotranspiration data are provided by the Australian Landscape Water Balance model (Frost et al 2016). These data are applied as inputs for a hydrological model. Prediction using SRM is tested in gauged basins: daily in-situ river discharge, obtained from the Australian Bureau of Meteorology (Turner et al 2012, Zhang et al 2016) for 130 HRS across the MDB, is used to evaluate the SRs.

SR derivation
The early derivation of SR was based on the assumption that the brightness temperature of a measurement cell ($T_{b,m}$) consists of the brightness temperatures of land ($T_{b,l}$) and water ($T_{b,w}$). This SR is typically created from Ka-band microwave ($SR_K$ hereafter) and has been called the calibration-measurement ratio (Brakenridge et al 2007), as
$$SR_K = \frac{T_{b,c}}{T_{b,m}},$$
where $T_{b,c}$ is the calibration brightness temperature selected among nearby pixels at the daily time step.
Here, $T_{b,c}$ is carefully selected to estimate $T_{b,l}$ of the measurement pixel. Subsequently, a new SR is derived from the L-band microwave, whose considerably greater ground penetration improves the signal quality. However, the calibration is challenging due to its markedly coarser spatial resolution (25 × 25 km$^2$ or more). Therefore, a time-based reference is used instead (Yoon et al 2022b), creating a new SR derived from the L-band microwave ($SR_L$ hereafter). This new SR is expressed relative to the driest time, which is specified as the reference value (Yoon et al 2022b), as
$$SR_L = \frac{r_m - r_m^L}{r_m^U - r_m},$$
which is the ratio of the reflectance increase ($r_m - r_m^L$) from the driest time ($r_m^L$, the lower baseline) to the reflectance gap ($r_m^U - r_m$) to the wettest time ($r_m^U$, the upper baseline). The lower baseline denotes the lowest recorded reflectance over the measurement period, corresponding to the lowest water presence and river discharge. We adopt this reference chosen in time instead of space, enabling the use of a longer wavelength with a high penetration ability. Both SRs have been filtered with the Topographic Wetness Index filtering approach suggested by Kim and Sharma (2019), using a 500 m digital elevation model product derived from the 90 m Shuttle Radar Topography Mission (Jarvis et al 2008).
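As an illustrative sketch only (not the authors' implementation), the two SR definitions described above can be written in a few lines of Python. The function names `sr_k` and `sr_l` are hypothetical, the upper baseline is assumed here to be the highest reflectance in the supplied series, and the Topographic Wetness Index filtering step is not included.

```python
import numpy as np

def sr_k(tb_calibration, tb_measurement):
    """Space-referenced SR: calibration over measurement brightness temperature."""
    tb_c = np.asarray(tb_calibration, dtype=float)
    tb_m = np.asarray(tb_measurement, dtype=float)
    return tb_c / tb_m

def sr_l(reflectance):
    """Time-referenced SR: (r - r_low) / (r_high - r) for one measurement pixel.

    Assumption: r_low and r_high are the minimum and maximum of the series.
    """
    r = np.asarray(reflectance, dtype=float)
    r_low, r_high = np.nanmin(r), np.nanmax(r)
    gap = r_high - r
    # The wettest time step has a zero gap, so the ratio is undefined there;
    # it is returned as NaN rather than infinity.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(gap > 0, (r - r_low) / gap, np.nan)
```

In this form the driest time step maps to 0 and the ratio grows without bound as the reflectance approaches the upper baseline, which is why the wettest step is masked.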

SR driven model
Developing SR via a hydrologic model involves two steps. First, a statistical model that relates SR to the actual flow ($Q$) must be defined. Such a model is analogous to a rating curve (RC) model commonly used to relate stage to river discharge, with the unknown parameter set to be determined denoted $\theta_{RC}$. Following this, a hydrologic model with a parameter set $\theta_H$ needs to be determined based not on observed flows (Marshall et al 2004, Jeremiah et al 2012), but on the remotely sensed correlated flow, the daily time series of SR. Once this second model is determined, we are in a position to simulate flow, given relevant climate inputs for the present and the future. This simulated flow is denoted $Q_{SR}$ and will be a markedly improved representation of the actual flow compared to SR, primarily because of the added climate inputs and physical hydrologic realism in the flow simulation. The difficulty in determining the parameters of the two models ($\theta_{RC}$ and $\theta_H$), however, is the difference in volume between surrogate observations and ground measurements, given that the parameters of the RC model ($\theta_{RC}$) cannot be independently confirmed (Yoon et al 2022a). We circumvent this by using a river discharge signature, denoted $\bar{q}$, the mean of river discharge across the study period. The use of this signature allows one to infer the posterior distribution of both the hydrologic and SR model parameter sets, leading to a likelihood function that consists of three terms, in which $x$ denotes climate inputs, $\sigma_H$ and $\sigma_{RC}$ represent the respective model error standard deviations, and the SR enters in its retrieved, standardized, and averaged forms (Yoon et al 2022a). Here, the signature $\bar{q}$ is estimated using the Fu equation (Fu 1981, Zhang et al 2008, 2020, Teng et al 2012, Li et al 2013, Xu et al 2013) as
$$\bar{q}_y = P\left[\left(1 + AI_y^{\alpha}\right)^{1/\alpha} - AI_y\right],$$
which is derived from the Budyko relationship (Budyko 1958, 1974), enabling SR development with climate data.
Here $\bar{q}_y$ is the annual mean of river discharge, $AI_y$ is the aridity index ($E_p/P$), $E_p$ is mean annual potential evapotranspiration, $P$ is mean annual precipitation, and $\alpha$ is a parameter representing climate and physical features. Here, we set $\alpha$ to a default value of 3.0 (Zhang et al 2020), which is also within the preferable range suggested for this region (Teng et al 2012, Zhang et al 2014). The novelty in the above rationale lies in being able to define a hydrologic model using a surrogate measurement whose scale is independent of the response being modeled, a complication that is overcome through the use of the exogenously specified $\bar{q}$. The GR4J model (Perrin et al 2003) is calibrated with the proposed approach in this study, and the Shuffled Complex Evolution algorithm (Duan et al 1993) is applied to calibrate the model by approximating the likelihood, owing to its accuracy and fast calculation speed, which enable a large-scale study.
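To make the signature estimate concrete, the following is a minimal sketch of the runoff form of the Fu equation, Q/P = (1 + AI^α)^(1/α) − AI, with AI = E_p/P. The function name `fu_mean_runoff` and the example inputs are hypothetical; only the default α = 3.0 is taken from the text, and the calculation is shown as an illustration rather than as the authors' code.

```python
import numpy as np

def fu_mean_runoff(precip, pet, alpha=3.0):
    """Mean annual runoff (same depth units as precip) from the Fu equation.

    precip : mean annual precipitation P
    pet    : mean annual potential evapotranspiration E_p
    alpha  : Fu parameter (3.0 is the default quoted in the text)
    """
    p = np.asarray(precip, dtype=float)
    ep = np.asarray(pet, dtype=float)
    aridity = ep / p  # aridity index AI = E_p / P
    # Runoff ratio Q/P = (1 + AI^alpha)^(1/alpha) - AI, always between 0 and 1
    runoff_ratio = (1.0 + aridity ** alpha) ** (1.0 / alpha) - aridity
    return p * runoff_ratio

# Hypothetical catchment: P = 600 mm/yr, E_p = 1200 mm/yr (AI = 2)
q_bar = fu_mean_runoff(600.0, 1200.0)
```

For a fixed precipitation, the estimate falls toward zero as the aridity index grows and approaches P as the aridity index approaches zero, which is the behaviour expected of a Budyko-type curve.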

Evaluation metrics
To assess the prediction ability of the SR and the modeled SR via the hydrological model, Nash-Sutcliffe Efficiency (NSE) (Nash and Sutcliffe 1970) and linear correlation (R) are primarily used. Additionally, quantiles of modeled flow, Probability of Detection (POD, Doswell et al 1990), and Severe Drought Index (SDI, Verdon-Kidd et al 2017) are used to measure the prediction capability in extreme events.
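For reference, the two primary metrics can be computed as in the sketch below, which uses the standard definitions of NSE and the Pearson correlation coefficient. The helper and function names are hypothetical, and dropping time steps where either series is missing is an assumption made here, not a detail stated in the text.

```python
import numpy as np

def _paired(observed, simulated):
    """Return both series as float arrays, dropping steps where either is NaN."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    keep = ~(np.isnan(obs) | np.isnan(sim))
    return obs[keep], sim[keep]

def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 - SSE / variance of observations about their mean."""
    obs, sim = _paired(observed, simulated)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def pearson_r(observed, simulated):
    """Linear (Pearson) correlation coefficient R."""
    obs, sim = _paired(observed, simulated)
    return np.corrcoef(obs, sim)[0, 1]
```

An NSE of 1 indicates a perfect match, while an NSE of 0 means the simulation is no better than the observed mean. R is insensitive to scale and offset, which is why a dimensionless SR can be compared with river discharge using R but not NSE.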
Firstly, the quantiles of modeled SR and in-situ river discharge ($Q$) are calculated as
$$Q_{SR,p} = F_{Q_{SR}}^{-1}(p), \qquad Q_p = F_Q^{-1}(p),$$
where $F_{Q_{SR}}$ and $F_Q$ are the cumulative distribution functions of $Q_{SR}$ and $Q$, and $p$ is the probability (expressed as a percentage). The quantiles with p = 80, 90, and 95 are compared to evaluate high-flow predictions. Along with quantiles, the POD of probability $p$ is calculated as
$$POD_p = P\left(Q_{SR} > Q_{SR,p} \mid Q > Q_p\right),$$
which represents the probability that a modeled flow ($Q_{SR}$) exceeds its quantile ($Q_{SR,p}$) provided that the in-situ river discharge ($Q$) exceeds its corresponding quantile ($Q_p$). The PODs with p = 80, 90, and 95 are used in this study to measure the high-flow prediction skill in a timely manner. In addition, SDI is calculated to assess the capability of predicting drought severity via
$$V_{i,k} = \sum_{j \in k} Q_{i,j}, \qquad i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, 12;\ k = 1, 2, 3, 4$$
$$SDI_{i,k} = \frac{V_{i,k} - \bar{V}_k}{s_{V,k}}, \qquad i = 1, 2, \ldots, n;\ k = 1, 2, 3, 4$$
where $V_{i,k}$ is the cumulative volume of river discharge for the $i$th hydrological year (from March to February, Zhang et al 2016) and the $k$th reference period (k = 1 for March to May, k = 2 for June to August, k = 3 for September to November, and k = 4 for December to February), and $Q_{i,j}$ is the monthly mean of river discharge, summed over the months $j$ in reference period $k$. $\bar{V}_k$ is the mean and $s_{V,k}$ is the standard deviation of the period $k$ across $n$ years.
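The POD and SDI definitions described above can be sketched as follows. This is a hedged illustration: the function names are hypothetical, `np.percentile` with its default interpolation stands in for the inverse cumulative distribution function, and the sample standard deviation (`ddof=1`) is an assumption, since the text does not say which estimator is used.

```python
import numpy as np

def pod(observed, simulated, p):
    """Probability of detection for exceedances of the p-th percentile.

    Each series is compared against its own percentile, so the score
    rewards correct timing of high flows rather than correct magnitude.
    """
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    obs_event = obs > np.percentile(obs, p)
    sim_event = sim > np.percentile(sim, p)
    # Fraction of observed events that the model also flags
    # (undefined if the observed series is constant and has no events).
    return np.sum(obs_event & sim_event) / np.sum(obs_event)

def sdi(volumes):
    """Standardize period volumes V[i, k] (years x periods) per period k."""
    v = np.asarray(volumes, dtype=float)
    # Assumption: sample standard deviation across the n years
    return (v - v.mean(axis=0)) / v.std(axis=0, ddof=1)
```

Negative SDI values mark periods drier than the multi-year mean for that reference period, and computing the index separately for observed and modeled flow allows drought severity classes to be compared catchment by catchment.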

Modeled flow using surrogate river discharge
We present here how the estimated river discharge ($Q_{SR}$) is derived from SR with an example time series. Firstly, the SRs derived from two microwave signals, denoted $SR_K$ and $SR_L$, are shown as a contrasting example for the two selected catchments in figures 2(a) and (b). Both SRs are correlated with ground river discharge observations at catchment 401212 (R = 0.45 for $SR_K$; R = 0.70 for $SR_L$). However, $SR_L$ outperforms $SR_K$ at catchment 405227, where $SR_L$ has a much higher R (R = −0.02 for $SR_K$; R = 0.73 for $SR_L$). $SR_L$ also predicts high-flow peaks better than $SR_K$, as seen in figure 2(b). R values are calculated for both catchments over the study period (1/1/2011 to 31/12/2018). Although a high correlation is observed in the SRs, we recognize that SR is still insufficient as an alternative to the actual river discharge. To generate a surrogate flow having the physical realism visible in observed hydrologic data, it is necessary to calibrate flow via a hydrologic model. The calibrated results are robust for both signals at catchment 401212, as seen in figure 2(c), with high NSE values (NSE = 0.64 for $Q_{SR_K}$; NSE = 0.83 for $Q_{SR_L}$). Also, correlation increases considerably (R = 0.86 for $Q_{SR_K}$; R = 0.93 for $Q_{SR_L}$) with the model calibration, which combines the information of the SR and the rainfall signal. In contrast, the performances of the two SR estimates differ at catchment 405227 depending on the satellite signal used, with NSE = 0.46 (R = 0.73) for $Q_{SR_K}$ and NSE = 0.84 (R = 0.93) for $Q_{SR_L}$. $Q_{SR_K}$ underestimates most of the high flows and overestimates some low flows (figure 2(d)). However, $Q_{SR_L}$ predicts well in this catchment, with an overall well-matched flow time series (figure 2(d)). These results imply that $SR_L$ can work at a catchment where $SR_K$ shows poor performance; this is investigated further in the following section.

Model performance across the Murray-Darling Basin
Here the performance of SR and $Q_{SR}$ for 130 HRS catchments across the MDB is evaluated. As seen in figure 3(a), $SR_L$ (mean (E) = 0.43, standard deviation (SD) = 0.15) shows a higher R than $SR_K$ (E = 0.21, SD = 0.13). Also, the R of $Q_{SR}$ increased relative to SR for both $Q_{SR_K}$ (E = 0.58, SD = 0.23) and $Q_{SR_L}$ (E = 0.76, SD = 0.12) through hydrologic model calibration. As seen in figure 3(b), $Q_{SR_L}$ shows higher NSE performance (E = 0.54, SD = 0.40) than $Q_{SR_K}$ (E = 0.31, SD = 0.26). This result indicates that $SR_L$ is more analogous to river discharge across the MDB, which results in higher performance of $Q_{SR_L}$. There is strong evidence in these results that hydrologic prediction without an in-situ river discharge time series can be achieved with hydrological model calibration using SR. In particular, $Q_{SR_L}$ shows a robust model result, with 70% of catchments having NSE values greater than 0.4, indicating good model performance (Zhang et al 2020). As seen in figure 3(c), the performance of $Q_{SR_L}$ is notably higher than that of $Q_{SR_K}$ in the southern and northeastern regions of the MDB, where the proportion of catchments with NSE under 0.4 decreases significantly from $Q_{SR_K}$ to $Q_{SR_L}$.
Additionally, the effectiveness of $Q_{SR}$ in predicting hydrologic anomalies is assessed in terms of high-flow (flood) events. High-flow predictions are examined in two ways, related to the timing and magnitude of peak flows. Firstly, POD values corresponding to large probabilities (p = 80, 90, and 95) are calculated to assess how well $Q_{SR}$ can predict high-flow occurrences. The results are displayed in figure 4(a). Secondly, the quantiles of p = 80, 90, and 95 are calculated for $Q$ and $Q_{SR}$ to examine how well $Q_{SR}$ can estimate the quantity of high flow (figure 4(b)). $Q_{SR_K}$ and $Q_{SR_L}$ estimate high-flow quantiles well, but they overestimate extremes at 60-70% of catchments in the MDB. These results show that both $Q_{SR_L}$ and $Q_{SR_K}$ predict the timing and magnitude of high-flow events effectively.

Discussion and conclusions
Our study presents a novel approach to river discharge prediction using satellite remote sensing without using in-situ data. Our proposed $SR_L$ (based on L-band satellite signals; Yoon et al 2022b; section 2.3) further improves the derived $Q_{SR_L}$ via SRM (section 2.4), exhibiting higher correlations (figure 3(a)) along with other measures of performance. Furthermore, integrating an estimated hydrologic signature derived from climate data allowed predictions to be formulated entirely from remotely sensed data, illustrating the possibilities our approach could have in ungauged catchments worldwide.
It should be highlighted that the new $SR_L$ enables a more precise prediction with a higher correlation, prompting a higher NSE of the resulting $Q_{SR_L}$ (figures 2 and 3). The good quality of the SR is key to creating a precise estimate of river discharge, enabling better hydrological model calibration. Conversely, a poor SR will lead to poor model performance, and hence the SR should be verified as acceptable before use in model calibration and prediction. Specifically, figure 2(d) shows that the modeled value does not correspond well to the observed peaks when the SR does not represent a suitable signal. The significance of the SR quality may become dominant in hydrological prediction, especially when rainfall signals are uncertain, and hence is an aspect that should be investigated further. It must be pointed out that $Q_{SR_L}$ shows excellent prediction in southern catchments in the MDB, which are located in mountainous areas (figure 3(c)) and hence allow greater penetration of the satellite signal.
Although $Q_{SR_L}$ performs better in general prediction, with its high NSE and R values (figure 3), both $Q_{SR_L}$ and $Q_{SR_K}$ perform strongly when focusing on extremes. Both the timing and quantity of floods are sufficiently predicted by $Q_{SR}$, as shown in figures 4(a) and (b). Also, catchments that experienced stress during the 5th year of the Millennium drought (from 3/2015 to 2/2016; figure 5(a)) are well predicted. This implies that drought prediction using $Q_{SR}$ can be effective, as is indicated in figures 5(b) and (c) using SDI metrics. Notably, while only the estimated mean value of the river discharge and a dimensionless satellite signal are used in this study, these are able to predict extreme events such as drought and flood, offering hope for the multitude of catchments in remote settings that face this situation all the time. The fact that these represent a variety of geographic and climatic settings is a promising outcome for prediction of hydrologic extremes.
Finally, it should be noted that there are many more ungauged basins in the MDB. Although we only assess gauged HRS catchments as a basis for illustrating the viability of our proposed approach, $Q_{SR_L}$ may be a good alternative to derive river discharge in ungauged catchments across the MDB, in addition to other parts of the world.
Overall, the results of this study reveal that accurate hydrologic prediction is possible in ungauged basins, both to simulate continuous river discharge and to characterize hydrologic extremes. The approach suggested in this study should be further applied to other catchments to demonstrate its ability for hydrological prediction worldwide.
All data that support the findings of this study are included within the article (and any supplementary files).