Storage in South‐Eastern Australian Catchments

Storage and subsequent release of water is a key function of catchments that moderates the impact of meteorological and climate extremes. Despite the fact that many key hydrological processes depend upon storage, there are relatively few studies that focus on storage itself. Storage is difficult to quantify due to catchment heterogeneity and the paucity of data on key catchment characteristics that largely determine storage, such as soil, hydrogeology, and topography. We adopt a multi‐method approach to estimate the dynamic and extended dynamic storages using hydrometric data in 69 catchments in the Murray‐Darling Basin in south‐eastern Australia. We test relationships between the derived catchment storages and hydrological and physical characteristics that potentially control storage. The study catchments tended to have small dynamic storages relative to the extended dynamic storage; proportionally the dynamic storages were all less than 10% of the extended dynamic storage. While storage estimates produced by the different methods and study catchments varied, the order in which they ranked was consistent. Correlations between catchment characteristics and estimates of storage were inconsistent; however, the results indicated that greater storage is strongly associated with steeper catchments and smoother hydrographs. This study highlights limitations in the current methodology to derive storage and the quality of widely applied hydrometric data. We reinforce the need to collect data that can validate storage estimates and call for new approaches that can broadly estimate storage at the catchment scale.

2 of 18 of environments to obtain insights into relationships between catchment processes and storage dynamics. Since then, a few studies have employed multi-method and multi-catchment approaches to investigate storage (e.g., Peters & Aulenbach, 2011;Sayama et al., 2011;Staudinger et al., 2017); however, globally such studies are still sparse.
Much of the current research has been devoted to understanding storage in headwater catchments. Headwater catchments are often located in mountainous regions and provide high volumes of river flows to lowland areas such that they are considered the "water towers of the world" (Viviroli et al., 2007). These flows are important for maintaining hydrologic connectivity and ecological integrity of regional hydrologic systems (Freeman et al., 2007) and are important sources of water for downstream human water demands. Headwater catchments in montane areas are particularly vulnerable to climate change and other anthropogenic developments (Immerzeel et al., 2020;Viviroli et al., 2011), and a lack of a deep understanding of catchment storage threatens global water security. In the south-east of Australia, forested catchments along the Great Dividing Range are responsible for large inflows into the Murray-Darling Basin, which is Australia's largest food bowl (Wheeler, 2014) and a region of significant ecological importance. In Australia, as well as in other temperate to semi-arid regions across the globe, droughts are a frequent phenomenon and are often severe (Leblanc et al., 2009;van Dijk et al., 2013). Water stored and later released by headwater catchments serve as a buffering mechanism that can reduce the impacts of drought and understanding the role catchment storage plays in sustaining streamflows and evapotranspiration is therefore crucial.
In this study, we build on past multi-method and multi-catchment approaches and estimate storage in catchments spread across the Murray-Darling Basin. We are interested in whether landscape, and climate factors are indeed related to derived storage, such that specific catchments can be protected and managed effectively. Catchment characteristics may also reveal common controls on catchment storage (Geris et al., 2015;Saft et al., 2016;Wagener et al., 2007). More specifically, our aims are to (1) estimate and evaluate the dynamic storage and extended dynamic storage of catchments in the Murray-Darling Basin; (2) determine if there are robust relationships with catchment characteristics; and (3) Evaluate if the comparative approach is useful to gain insights into catchment storage.

Study Catchments and Data
Catchments located within the Murray-Darling Basin were selected from the Australian Bureau of Meteorology (BOM) Hydrological Reference Station (HRS) project (X. S. Zhang et al., 2016). HRS is unregulated catchments with high-quality streamflow records that are in areas with minimal land use change and impacts of water resource development. As such, they are ideal for long-term analysis. The study period focuses on 1990-2018. This time range includes both distinct wet and dry periods. The 1990s, early 2010s, and 2016 were notably wet, while the Millennium Drought (van Dijk et al., 2013) was a severe drought that extended over much of the 2000s.
Streamflow data were obtained from the BOM HRS portal (http://www.bom.gov.au/water/hrs/). There is no missing data in the records as the BOM gap fills the data using the GR4J model. Gauges selected for the study needed to have more than 70% of data classified as containing the best available data (quality code A) or good data (quality code B) over the study period. After data quality filtering, 69 catchments remained for analysis ( Figure 1). Area weighted daily catchment means of precipitation, maximum temperature, minimum temperature, and Morton's potential and actual evapotranspiration (Morton, 1983) were extracted from the SILO-gridded database (Jeffrey et al., 2001;www.longpaddock.qld.gov.au/silo/). The choice of Morton's models was motivated by the suitability of the models to calculate catchment water balances and in rainfall-runoff modeling (McMahon et al., 2013).
A summary of the main catchment hydrological forcings is presented in Figure 1. Mean annual precipitation (P) ranges from 473 to 1341 mm.year −1 . Mean annual catchment streamflows (Q) ranges from 19 to 804 mm.year −1 and are highly variable, where the annual coefficient of variation of Q (Qcv), defined as the ratio of the standard deviation to the mean of annual flows, ranges from 0.23 to 1.72 (Table 2). Morton's mean annual potential evapotranspiration (PET) for the catchments ranges from 1,244 to 2,298 mm.year −1 and vastly exceeds Morton's actual transpiration (AET), which ranges from 717 to 1,124 mm.year −1 , indicating most catchments are water limited.

Defining Catchment Storage
Water storage can be considered the sum of the individual stores of water that exist within catchments. These individual stores may include groundwater, soil moisture, vegetation, surface water, and snow. The term storage is used inconsistently in hydrology and may include or omit some of these features due to the diverse applications and various domains of hydrological studies (Condon et al., 2020;McNamara et al., 2011). We follow the suggestion of McNamara et al. (2011) and use standardized methods to investigate the relationship between storage dynamics and catchment processes. Staudinger et al. (2017) created a scheme that distinguishes different conceptual catchment storages ( Figure 2). The different conceptual storages are total storage, immobile storage, mobile storage, extended dynamic storage, and dynamic storage. The partitions are based on specific methodologies that derive them and are of practical interest. Total storage is the sum of all water stored in the catchment and includes all mobile and immobile water. Total storage can be estimated through an aggregation of hydrogeological assessment of aquifers, groundwater, soil moisture information, and aboveground storage (e.g., snow). In reality, total storage cannot be precisely quantified. Immobile water is water that does not participate in the hydrological cycle and may be found in bedrock with poor permeability (Staudinger et al., 2017). Mobile water is water that participates in the hydrological cycle and is connected to catchment fluxes. Mobile water can include water with a variety of ages, such as soil moisture (young), shallow groundwater, and deep groundwater (old) passing through fractured rock systems. Estimates of mobile water can be obtained using tracer methods (Birkel et al., 2011;Cartwright & Morgenstern, 2016;Howcroft et al., 2018) or through hydrological transport models (Rinaldo et al., 2015;van der Velde et al., 2012).
Dynamic storage is the storage that controls streamflow dynamics (Birkel et al., 2011;Kirchner, 2009;Spence, 2007). Dynamic storage can be estimated from streamflow data alone using, for example, streamflow recession analysis (Kirchner, 2009) or by hydrological modeling Staudinger et al., 2017). For non-perennial streams, such as intermittent or ephemeral streams, there are periods when there is no streamflow,  Table 1. 4 of 18 yet storage continues to decrease due to subsurface water flow and evapotranspiration (Carrer et al., 2019). Extended dynamic storage is defined by Staudinger et al. (2017) as this hydrologically dynamic storage that is a result of precipitation, discharge, evapotranspiration, and groundwater. Extended dynamic storage can be estimated using hydrological models or the cumulative water balance. In this study, we focus on dynamic and extended dynamic storages as they are readily estimated from available hydrometric data. While we use different methods to derive dynamic and extended dynamic storages, we expect that the estimates obtained from each method should be reasonably comparable to other studies that have used the same definitions of storage (e.g., Buttle, 2016;Cooke & Buttle, 2020;McNamara et al., 2011;Staudinger et al., 2017).

Storage Methods
The methods used to estimate storage in this study are introduced in the subsections below. A summary of the methods is presented in Table 1.

Dynamic Storage: Storage-Discharge Relationship
The first method uses the storage-discharge (SD) relationship to estimate dynamic storage. The storage-discharge relationship is derived by examining the relationship of streamflow recession (−dQ/dt) and discharge (Q) during minimal flux periods of precipitation (P) and evapotranspiration (ET). Kirchner (2009) showed that during these periods, storage is theoretically a function of discharge (i.e., S = f −1 (Q)) and several studies have estimated storage using the method (Ajami et al., 2011;Birkel et al., 2011;Staudinger et al., 2017;Teuling et al., 2010;Yeh & Huang, 2019). Dynamic storage was estimated as the difference between maximum (S max ) and minimum storage (S min ) corresponding to some maximum (Q max ) and (Q min ) discharge rates. We estimated dynamic storage using the means of the annual maxima and minima of flows for each catchment, as done by Kirchner (2009).
where g(Q) is: Daily data were used to estimate the storage-discharge relationships and only measured (i.e., not gap-filled) streamflow data classified in the top two quality codes were used. While hourly data were used in the original study, Kirchner (2009)  Note. Dynamic storage is the storage that controls streamflow dynamics, while extended dynamic storage includes all measurable fluxes. P is precipitation, Q is discharge, PET and AET are potential and actual evapotranspiration, respectively, Tavg is average daily temperature, and S is storage.

Table 1 A Summary of the Storage Estimation Methods Used in This Study
amount of data points. Kirchner (2009) selected days in the recession where P and ET were less than 10% of discharge. In south-east Australia, the latter condition of ET being less than 10% of discharge is rarely met as high rates of ET are possible even in cooler seasons. This resulted in an insufficient amount of data points to calculate robust storage-discharge relationships. Instead, to minimize the effect of catchment fluxes on the storage-discharge relationships, we excluded days with precipitation and one day after, and restricted analyses to only include data from the winter months between June and August. This approach may underestimate the size of dynamic storage due to the effects of ET and this will be discussed later.

Extended Dynamic Storage: Water Balance
The second method uses the cumulative water balance to calculate the extended dynamic storage: where ΔS(t) is the extended dynamic storage increase or decrease from timestep t = 1 to timestep t = t, P is precipitation, Q is streamflow, AET is actual evapotranspiration, and s ET is the evapotranspiration scaling factor. P, Q, and AET are in mm per timestep, which is daily. The term ΔS(t) is used as some initial storage (S 0 ), and the total storage (S) cannot be determined using the water balance method (Sayama et al., 2011). AET was scaled for each catchment using a scaling factor s ET to ensure the water balances closed over the study period (equivalent to f WB in Equation 2 Staudinger et al., 2017). s ET is calculated as: Note. PVB, percent valley bottoms; Qcv: coefficient of variation of annual flow; P/PET, aridity index, where P is precipitation and PET is potential evapotranspiration; Q/P, annual runoff ratio; BFI, baseflow index; AC, lag-1 day autocorrelation coefficient. where , , and are mean annual precipitation, discharge, and actual evapotranspiration, respectively. Extended dynamic storage was calculated as the difference between the maximum and minimum values of ΔS observed over the study period . Using long study periods to derive storage using this method is critical to satisfy the steady-state assumption, especially in more arid regions (Han et al., 2020).

Extended Dynamic Storage: Budyko Framework
A second estimate of extended dynamic storage was obtained using the Budyko framework (Budyko, 1974). We used the framework to obtain an alternate estimate of actual evapotranspiration and subsequently the water balance. The Budyko framework relates the index of dryness (PET/P) and the evaporative index (AET/P) on the basis that water availability and atmospheric demand are the primary constraints on the equilibrium water balance (J. Y. Zhang et al., 2008). The Budyko curve, therefore, captures the interactions and feedbacks between the atmosphere, vegetation, and soil within the hydrological cycle (van der Velde et al., 2014). A generic form of the Budyko-like equation is the Fu-Zhang equation (Fu, 1981;L. Zhang et al., 2004) and is defined as: where w is an adjustable catchment parameter. The implementation of the w parameter allows for the representation of the geographical variation of the Budyko curve and the integrated effects of vegetation cover, soil properties, and catchment topography (L. Zhang et al., 2004). Equation 5 is normally solved over mean annual timescales, and AET is usually assumed to equal = − , which also inherently assumes negligible storage change (i.e., ΔS = 0).

of 18
The approach in Equation 5 yields the average annual evapotranspiration, but we needed finer-scale temporal estimates to calculate the water balance and derive storage. AET is limited by water availability and energy, but water availability can be carried through time via storage and is not simply a result of annual precipitation. Zeng and Cai (2015) showed that the water balance (ΔS) can be integrated into the Fu-Zhang equation to obtain finer-scale estimates of AET: where i is the timestep, ′ = + Δ −1 , and w adopts a similar definition to the optimized catchment parameter in Equation 5. We first optimized w in Equation 6 using annual data to define the overall relation between water, energy, and integrated catchment characteristics. Optimization of w was performed over the range 1 < w ≤ 10 using the least squares approach. We then calculated the water balance for each catchment using monthly P, Q, and estimates of AET obtained using Equation 6 with the optimized catchment value of w. The extended dynamic storage was then estimated as the difference between the maximum and minimum observed level of ΔS, as in Section 2.3.2.

Extended Dynamic Storage: Conceptual Model
The last approach estimates extended dynamic storages using a conceptual hydrological model. We used the same approach as (Staudinger et al., 2017, Section 3.3) and used the Hydrologiska Byråns Vattenbalansavdelning (HBV) model. In this method, the values of model parameters control the sizes of the storage state variables in the model. The state variables within HBV that store water are snow depth, soil moisture, upper groundwater storage, and lower groundwater storage. The extended dynamic storage was estimated as the sum of the maximum size of the HBV state variables within the study period. The parameter ranges used in calibration are presented in Table  S1 in Supporting Information S1. The full study period  was used to calibrate the model for each catchment and the model is run on a daily timestep. An adaptation of the HBV-light model as described in Seibert and Vis (2012) was used within the R package hydromad (Andrews et al., 2011). The HBV method to calculate AET was used (Seibert & Vis, 2012;Equation 4), and Morton's estimates of daily PET were input directly into the model routine (i.e., the mean daily temperature and long-term PET approach was not used). The HBV model parameters were calibrated using the Shuffled Complex Evolution-University of Arizona (SCE-UA) algorithm (Duan et al., 1992) and the Nash-Sutcliffe Efficiency objective function (Nash & Sutcliffe, 1970) with the Lindström penalty for volume error ( 2 ) (Lindström, 1997). Model calibration for each catchment was repeated 10 times to capture the effect of parameter uncertainty on simulated storage sizes due to parameter equifinality. Extended dynamic storage was calculated independently for each calibrated catchment model and then averaged.

Catchment Characteristics
Several catchment physical characteristics are used to explore the controls on catchment storage. We selected characteristics that have demonstrated relations to storage, including soil properties (Geroy et al., 2011;Western et al., 1999), bedrock type (Pfister et al., 2017;Tague & Grant, 2004), and topographic attributes (Sayama et al., 2011). The BOM Geofabric V2.1 product (http://www.bom.gov.au/water/geofabric/), a stream and nested catchment framework for Australia (Stein et al., 2014), was used to extract several characteristics, including mean elevation, elevation range, stream density, stream length, slope, and the proportion of catchment grid cells that are valley bottoms (henceforth named PVB). Three geological attributes were also extracted: the catchment area proportion of igneous rocks, sedimentary rocks, and metamorphic rocks. The catchment average of the Silica Index (Gray et al., 2016), a broad classification of soil parent material that focuses on chemical composition rather than the formation process, is an additional measure that was included to evaluate the effect of lithology on storage. Catchment average soil depth and clay content in the top meter of soil were extracted from the Soil and Landscapes Grid of Australia (Grundy et al., 2015).
Additional catchment characteristics were calculated using hydrometric data: the coefficient of annual streamflow variability (Qcv), the runoff ratio (Q/P), the mean annual aridity index (P/PET), the baseflow index (BFI), and the lag-1 day autocorrelation coefficient (AC) (Winsemius et al., 2009). The BFI has been shown to represent the storage and release properties of catchments (Salinas et al., 2013;Van Loon & Laaha, 2015) and was calculated using the lfstat R package (Koffler & Laaha, 2013). The lag-1 autocorrelation is a measure of smoothness of the hydrograph and can provide insights into water release properties of a catchment, where a higher autocorrelation coefficient indicates a slower release of water from the catchment. It is also considered one of the key hydrological signatures (Euser et al., 2013).
The study catchments cover a wide range of catchment physical properties and characteristics ( Table 2). The catchment areas range from 25 to 5158 km 2 and the median catchment area is 304 km 2 . The distribution of catchment areas is also presented in Figure 1. Igneous and sedimentary rocks are the most common underlying geologies of the catchments. Soils are moderately deep (mean depth is 0.73-1.14 m) with a range of clay fractions (22%-44%).
Spearman's rho statistic (ρ) was used to evaluate the association between the different storage estimates and catchment properties. Significance (P < 0.05) of the relationship was evaluated using Spearman's rank correlation test Algorithm AS 89 (Best & Roberts, 1975).
In our study catchments, we hypothesize that catchment storage will be greater in catchments with greater elevation and greater mean slope, based on other findings that high topographic gradients lead to increased deeper infiltration of water (Hayashi, 2020;Jasechko et al., 2016). Since flatter catchments will have lower topographic gradients, this also means that the runoff ratio should be lower and allow for greater evaporation, at least in the study region which is not energy limited. We also hypothesize that catchments with greater baseflow and smoother hydrographs, as indicated by the BFI and AC, respectively, will be indications of greater storage. Similarly, deeper soils and greater clay content are expected to indicate greater soil storage. Bedrock permeability has been found to exert a large influence on storage characteristics (Hale et al., 2016;Pfister et al., 2017). If significant relationships are discovered between rock types and storage estimates, it may relate to more permeable rock types, such as sedimentary sandstone or the presence of fractured metamorphic or igneous bedrocks. However, a simple relationship between bedrock type and storage may not be found, as a regional study found that the bedrock type alone does not simply control storage and release properties (Howcroft et al., 2018).

Storages
Robust storage-discharge relationships were found for all catchments. The mean and standard deviation of the coefficient of determination was R 2 = 0.92 ± 0.06. The minimum R 2 was 0.68. Storage values for the SD method ranged from 3 to 158 mm (Figure 3) and had a median storage value of 22 mm. Recession plots and plots of the storage-discharge relationships are presented in Figures S2 and S3 in Supporting Information S1, respectively. All HBV models obtained reasonable calibration scores (median 2 = 0.71 ). All catchments obtained a score above 0 (minimum 2 = 0.32 ), which is often used to distinguish good and bad NSE performance (Knoben et al., 2019) as a score of 0 indicates the model can only simulate mean Q. Variability of calibration scores across the 10 calibration trials for each catchment was low, where the maximum standard deviation of a catchment's 2 score was 0.04. HBV-extended dynamic storage estimates covered a range from 147 to 1012 mm with a median value of 402 mm.
Budyko curve-derived water balance storage estimates ranged between 97 and 841 mm and had a median value of 343 mm. The distribution of storage values derived using the Budyko method is broadly comparable to the HBV method ( Figure 3). The mean absolute difference between the HBV estimates and Budyko estimates of extended dynamic storage was 108 mm. The w parameter had a mean and standard deviation of 2.78 ± 0.66 across all catchments, and the relationship of w to storage is discussed later.
Extended dynamic storages estimated by the water balance method ranged from 536 to 1,802 mm and had the highest median value of 1,061 mm. The water balance scaling factor, s ET , had a mean and standard deviation of 0.83 ± 0.17, which highlights that most catchments required a reduction in actual evapotranspiration to close the water balance. The relation of s ET to storage is described later. Extended dynamic storage sizes estimated by this method were greater compared to the HBV and Budyko methods. The mean absolute difference of extended dynamic storage between the water balance method and the HBV method was 631 mm.
The size of dynamic storage, as estimated by the SD method, relative to extended dynamic storage varied from 0.3% to 59.1% depending on the catchment and the method. The median ratio of dynamic storage to extended dynamic storage was 5.2%, 6.6%, and 2.0% for the HBV, Budyko, and water balance methods, respectively.
To determine if there were consistent storage size differences between the methods, we ranked the sizes of the estimated storage for each catchment. From smallest to largest, the rankings were consistently ordered for the SD, Budyko, HBV, and the water balance methods (Table 3) with only a few exceptions.

Physical Characteristics
Significant Spearman correlations (P < 0.05) were found for several characteristics across all storage methods (Table 4). Greater mean catchment elevation, elevation range, and slope were strongly associated with greater storage. This result is reflected in PVB, where greater proportion of valley bottoms in catchments indicated lesser storage. No significant relationship was found between the size of the catchment and any of the estimated storages. Greater soil depth, unsurprisingly, indicated greater water storage. This is despite percent clay content, the particle size fraction that has the greatest water storage capacity, had no significant correlation with storage. No geological variables had consistent and strong relationships with the different estimates of storage. Catchments with lower mean annual flow variance were found to have greater storage capacity. Significant relationships between the runoff ratio and the aridity index indicated that wetter catchments have greater storage potential. Limited inference can be made with these variables, however, as the variables are part of the equations used to derive storage. The BFI had strong positive correlations with all storage methods, suggesting the digital low pass filter captures some aspect of storage and release properties. AC  Note. Bolded values are significant (P < 0.05) correlations.

Table 4 Spearman Correlation Coefficients Between Storage Estimates and the Catchment Characteristics
also had significant correlations with all storage estimates. Combined, these two variables support our initial hypothesis that smoother and slower releases of water are related to greater storage capacity.

HBV Partitioning
The HBV model has conceptual stores for snow, soil water, and groundwater and can provide insights into the simulated partitioning of water storage in the study catchments. The calibrated models show that soil storage was simulated as the largest storage for most catchments (Figure 4). Groundwater storage was the next largest storage, but the distribution was long-tailed, and some catchments have large simulated groundwater storages. Snow storage was minimal with most catchments having zero simulated snowfall. The size of conceptual stores for each catchment can be viewed in Figure S3 in Supporting Information S1. The correlations between the physical characteristics and the HBV conceptual storages are presented in Table S2 in Supporting Information S1. Aside from snow storage, there are few differences across the conceptual stores. Overall, the results match the HBV method in Table 4, where more varied topography, greater slope, deeper soils, and a smoother hydrograph, as indicated by the BFI and AC, had greater catchment storage.

Water Balance
The water balances were scaled using the scaling factor s ET to force the water balances to close over the study period. As mentioned, s ET had a mean and standard deviation of 0.83 ± 0.17. Indeed, in 66 out of the 69 catchments, the scaling factor was less than 1, indicating that Morton's model systematically overestimates AET and/or that there are errors in the values of P and Q.
We planned to evaluate whether s ET had any relationships with catchment characteristics to determine if the scaling factor was representative of any characteristics (Table S3 in Supporting Information S1). However, s ET had a significant positive correlation with water balance-derived storage estimates (ρ = 0.49, r = 0.51), which suggests that larger storages tend to either have smaller observational errors or lower overestimation of AET.

Budyko Approach
The distribution of points in Figure 5 shows the catchments respected the Budyko water and energy limits. Fitting a w parameter for all catchments in the study by minimizing the sum of squared error resulted in a value of 2.77, which is close to the mean value of the individual fits. This number is comparable to the values of 2.84 and 2.55 found by L. Zhang et al. (2004) for forested and grass covered Australian catchments, respectively.
The relationship of catchment storages and the calibrated Fu-Zhang curve parameters w in the Budyko space are presented in Figure 5. There appears to be a general trend that catchments that have a lower index of dryness and a lower evaporative index have greater storage. However, the w parameters have no significant correlation with storage (ρ = −0.04) and the parameter does not strongly represent physical characteristics that may be associated with storage (Table S3 in Supporting Information S1).

Storage and Catchment Characteristics
Our study catchments had small dynamic storages and relatively large extended dynamic storage. Across all the study catchments, the mean and one standard deviation of the dynamic storage as a proportion of extended dynamic storage is 5.1 ± 4.8%. Dynamic storage represents the storage that directly contributes to streamflow and the fact the storages were estimated to be small reflects the study environment, where evapotranspiration dominates catchment losses. The difference between the sizes of dynamic and extended dynamic storage sizes can be interpreted that a large proportion of catchment storage is "reserved" for evapotranspiration (Brooks et al., 2010;Carrer et al., 2019). While the dynamic storages are small, the fact that the headwater catchments along the southeast of Australia continue to flow in prolonged dry periods and have long travel times (Cartwright et al., 2020) suggests that these stores are deeper in the subsurface and are connected to long groundwater flowpaths (Howcroft et al., 2018).
We tested several physical catchment characteristics hypothesized to act as controls on catchment storage as well as to assess if the storage estimation methods possess physical realism. As we hypothesized, storages were strongly linked with topographic characteristics. Catchments with greater slopes were positively correlated with storage and catchments with a lower percentage of valley bottoms were negatively correlated to storage across all methods. These characteristics together express a physical system where water can readily drain to subsurface stores and support prior findings that catchment topographic characteristics are pivotal to water storage (Berghuijs et al., 2014;Jencso & McGlynn, 2011). Soil storage was found to be important, and soil depth positively correlated with all the storage estimates. This was also highlighted in the simulated partitioning of water by the HBV model, where soil water represented the greatest store for most catchments. The BFI and stream AC also significantly correlated with water storage, in line with other studies that have found the BFI captures storage and release properties of catchments (Salinas et al., 2013). A greater BFI relates to higher stream autocorrelation and for the study catchments, there is a strong Pearson's correlation between the two characteristics. A physical interpretation of this result is that greater autocorrelation and therefore greater memory in the streamflow signal, suggesting a slower storage release, slower flow paths, and therefore greater storage.
The bedrock characteristics were not found to be a strong indication of storage, with no consistent correlations across the storage estimates from the different methods. This may be a result of the coarseness of the parent data (1:1M) and the uncertainty of spatial mapping of bedrock type. Bedrock and the soil-bedrock interface are important for hydrological storage (Jencso & McGlynn, 2011;Pfister et al., 2017;Sklash & Farvolden, 1979;Sophocleous, 2002); however, other evidence has shown that the physical arrangement of these features (e.g., McGuire et al., 2005) is more important than the simple bedrock constituencies. Staudinger et al. (2017) also did not find a significant relationship between their geological indicator (average quaternary depth) and derived storage. This raises a broader issue of what the ideal geological indicators and measures are when determining broad-scale storage controls.

Methodology
The methods we applied all yielded different results, but like Staudinger et al. (2017), we found that the methods had similar rankings. That is, the methods consistently estimated relatively smaller or larger storages for the same catchment. Moreover, the multi-method and multi-catchment approach demonstrated the difficulty of quantifying catchment storage. The strong correlations with the physical characteristics show the methods captured some aspect of catchment storage behavior that match conceptual ideas of catchment storage; however, the inconsistencies of the correlations to some of the methods raises doubts if simple rules about the controls on catchment storage can be established. A potential source of this inconsistency is the fact that, despite using the most upto-date sources of data that covered the study region, many of the physical characteristics are spatially modeled values derived from other landscape-level data.
Each of the methods has their relative strengths (and weaknesses) and is discussed in subsections below. A general problem that applies to all the methods in this study is that none of the methods are direct observations of storage rather they have been inferred from catchment fluxes. Without some direct measure of storage, there is a reciprocal problem: it is difficult to define storage without defining it from fluxes when storage itself is defining or controlling those processes.

Storage-Discharge
The SD method provides a clever way of estimating the dynamic storage size by analyzing times when streamflow is a function of storage. This behavior can be observed during low flux hours, that is, when there is negligible precipitation and evapotranspiration, and the stream is in recession. This proved challenging to implement in this study using daily data and it is likely to always be an issue in drier regions. An additional complication is that catchments in Australia tend to be larger due to flatter topography. This typically results in low yields of water, and it is rarely the case that streamflow is substantially larger than evapotranspiration.
The effects of P and ET are minimized in this study by removing days with precipitation and the day after from analysis and limiting the analysis to cooler months of June to August. However, ET can still be considerable during these months in south-eastern Australia and there is almost certainly an effect on the calculated dynamic storage sizes. Improperly excluding ET results in the underestimation of storage (Kirchner, 2009), and this is a caveat of the results. Nevertheless, it is clear that dynamic storage is likely to be much smaller than extended dynamic storage in most catchments in our study region. The use of hourly data is one opportunity to improve the reliability of storage estimates using this method. This comes with other challenges, including (1) long time series of hourly data for many catchments are not widely available (2) nocturnal transpiration can still be considerable in the Australian environment (Buckley et al., 2011).

Hydrologiska Byråns Vattenbalansavdelning
As Staudinger et al. (2017) identified, the HBV model can consider different sources of storage and their relative contributions to extended dynamic storage. These storages are simulated and are not based on any real observations of groundwater, soil water, or snow storage. While they are simulated storages, our results show these conceptual stores were significantly correlated to many physical characteristics that are representative of these stores.
Model structure and the choice of the objective function are likely to have an impact on the partitioning of water and model performance . This source of uncertainty was not assessed in this study, but it could be examined by comparing the results of multiple conceptual models and objective functions to evaluate the consistency of water partitioning and storage size. Additionally, there is always uncertainty that derives from the chosen initial parameter ranges and model calibration routine (Butts et al., 2004). We used parameter ranges that are consistent with the literature (Lidén & Harlin, 2000;Seibert, 1997;Seibert & Vis, 2012), and we repeated the calibration trials 10 times for each catchment to capture parameter equifinality. The values of parameters can have a large effect on the partitioning of water between the different stores. The ranges of calibrated parameters did not indicate that there was limiting behavior that prevented further increases or decreases of the sizes of storages. There were limited cases where parameters were poorly identified across the 10 calibration trials for catchments, but the variation in the size of the conceptual storages was low across calibration trials for each catchment.

Water Balance
The water balance approach should theoretically provide the optimal measure of extended dynamic catchment storage as it tries to directly relate changes in storage with fluxes. However, a clear source of uncertainty for the water balance approach is the use of the scaling factor s ET . The use of this scaling factor was necessary as, without this factor, sensible water balances could not be computed with the data for most catchments. Despite the apparent suitability of Morton's estimates of evapotranspiration to calculate the water balance (McMahon et al., 2013), Morton's estimates of evapotranspiration do not factor in effects from wind, which can cause large differences in PET and AET calculations (Donohue et al., 2010). Most catchments had an s ET of less than 1, indicating that the catchment losses to ET are less than what is estimated by Morton's actual areal evapotranspiration. A few possibilities that may explain this result include the poor estimation of actual evapotranspiration, inaccurate spatial estimation of precipitation, or inaccurate gauging of streamflow. Small errors in any of those variables accumulate over time and cause the water balance not to close. This raises a broader issue in that we cannot close the water balance from the best data sets we have available. Moreover, despite the ubiquity of the cumulative water balance equation (i.e., ΔS = P − Q − ET) in hydrology, the equation excludes other losses, such as inter-catchment flows which are often (and potentially falsely) assumed to be negligible (Bouaziz et al., 2018;Fan, 2019). This also gives rise to another common assumption, employed here, that long-term average AET can be estimated using = − . This term could be considered the mean loss term that excludes Q, as any losses to other sources are attributed to ET. While this assumption does have utility, recent studies have highlighted that there are cases where steady state may not be reached even in reasonably long (30-year) windows (Han et al., 2020), and there is evidence of ecohydrological feedbacks with storage (Rice & Emanuel, 2019). As such, careful consideration should be given when using the hydrological steady-state assumption in studies of storage and future storage assessment frameworks should explicitly incorporate tests of the assumption.

Budyko
The Budyko approach simplifies the complex processes and interactions and expresses the controls of actual evapotranspiration by the availability of energy and water and has been validated globally (Choudhury, 1999;Koster & Suarez, 1999;L. Zhang et al., 2001). We added this method due to the limitations of the water balance approach, where it was suspected poor evapotranspiration estimates may hinder an accurate simulation of the water balance. We defined the w parameter using annual data to capture the overall long-term relationship between AET/P and PET/P. Extrapolating Budyko relationships to the monthly scale, while commonly done in the literature (e.g., Du et al., 2016;Zeng & Cai, 2015) creates some uncertainty because the relationship between AET/P and PET/P may not be consistent seasonally. Since the main long-term relationship is represented, this may mean that the extremes of change in storage are reduced and there could be an underestimation of extended dynamic storage.
We hypothesized that w may relate to some physical characteristics related to storage, as the parameter is widely believed to represent the integrated effects of soil, vegetation, and topography (L. Zhang et al., 2004). The fact that w parameter did not strongly relate to many physical characteristics likely indicates that w does integrate many characteristics, and it is unlikely to have simple relationships to due catchment heterogeneity.

Implications and Future Research
This study builds on the global push to understand water storages in catchments by using common storage definitions (McNamara et al., 2011) and estimation methods (Staudinger et al., 2017). In our study catchments, the multi-method and multi-catchment approach did not tightly constrain the sizes of extended dynamic storages. A key limitation was that the uncertainties of the hydrometric input data that ultimately limited how well we could constrain the size of storages. Quantifying the uncertainty of the hydrometric data should be a focus of further work. Further research is also required to obtain physical estimates of storage to validate storage estimation methods using hydrometric data. This includes, and is not limited to, using tracers to characterize mobile storage, and using satellite products, groundwater level data, or local-scale gravimetry (e.g., Creutzfeldt et al., 2014) to study dynamic and extended dynamic storage. Hydrometric methods are currently the only viable way to assess storage broadly at the catchment scale, and as such it is critical that improvements are found so that storage can be easily and rapidly estimated.
Many of the results here indicated that groundwater and slow storage release are important to water storage and release from catchments. Hydrological models poorly simulate these features and are likely a reason why performance outside calibration windows is reduced. Our results reinforce the call to improve conceptual models to better account for slow flow processes (e.g., Fowler et al. (2020)). It is likely a complete understanding of the underlying mechanisms that cannot be attained without grasping the mobile storage. Much of the underlying hydrological processes likely occur in the mobile storage domain where there is an important distinction between particle velocities and celerities (Beven & Davies, 2015;McDonnell & Beven, 2014). Mobile storage was not assessed as it cannot be determined from hydrometric data alone. Rather, it is usually inferred with the assistance of tracers. Several studies have evaluated MTTs using tracer data within the study region, and these could be pooled to evaluate mobile storage. However, the physical controls on MTTs in some of these catchments have not been readily identified (Cartwright et al., 2020;Howcroft et al., 2018), and the estimates of MTTs often carry considerable uncertainty due to the assumptions required to estimate recharge rates (e.g., Li et al., 2019). Despite the clear challenges, further work focusing on water age behavior could lead to breakthroughs in the understanding of the controls on catchment storage.
Changes in interannual catchment storage volumes are often assumed to be negligible, even though this assumption is widely acknowledged as being unrealistic (Rice & Emanuel, 2019). This assumption was applied in this study to derive storage from the long-term water balance. It is likely that dynamic and extended dynamic storages in our study region behave non-linearly, as indicated by the research by Saft et al. (2015Saft et al. ( , 2016, who showed that drought induces changes to the land system that are likely to influence water storage and release properties. Recent research by Peterson et al. (2021) has also shown that several catchments in south-eastern Australia have not recovered from the Millennium Drought, which may indicate permanent changes in their capacity to store water. Analysis of distinct wet and dry periods may reveal changes in storage behavior and should be a focus of further work. A similar phenomenon has been observed in regions with Mediterranean climates, where the sharp seasonal differences in precipitation and temperature in combination with drought resulted in complex runoff, evapotranspiration, and storage partitioning behavior (Avanzi et al., 2020;Feng et al., 2019;Hahm et al., 2019). Progress in this area is critical, as hydrological models also frequently perform poorly under changing climate scenarios (e.g., Avanzi et al., 2020;Duethmann et al., 2020;Fowler et al., 2016), and more realistic concepts of storage and release behavior need to be integrated into model structures .

Conclusions
Storage sits at the intersection of the main hydrological processes and advances in the understanding of catchment storage will provide greater insight into catchment functioning. While in hydrology the focus is often on the fluxes, flux behavior can be more precisely quantified within hydrological boundary conditions if that boundary can be established. With impending challenges that will be faced globally, such as climate change and large-scale land use change, it is critical to understand water storage and availability from a water resource perspective. This is particularly the case for our study region, the Murray-Darling Basin, given the recent findings of Peterson et al. (2021) that demonstrated clear changes in storage and release patterns after severe drought.
We performed a broad and comprehensive analysis of storage across 69 catchments in the Murray-Darling Basin. In relation to our original aims, (1) we successfully estimated dynamic and extended dynamic storages across our study area. While the different methods were generally ranked consistently, the estimates of dynamic and extended dynamic storage could vary substantially depending on the catchment. (2) It was difficult to determine robust catchment characteristics that control storage, but several key characteristics highlighted the nature of the storage. Our results indicate that topography and hydrograph characteristics are better indicators of storage in the study region. The geological characteristics used in the study did not strongly relate to the storage estimations, and further work is required to identify useful geological measures that relate to storage. (3) The multi-method and multi-catchment approach, as applied in this study, has been proposed as a clear way to advance our understanding of storage (e.g., McNamara et al., 2011;Staudinger et al., 2017). Our results highlight that in the absence of high-quality hydrometric data, it is difficult to precisely quantify storage. This poses a wider challenge, given that there is currently no other way to effectively assess storage broadly at the catchment scale. We propose that the uncertainty of hydrometric data needs to be addressed, and we call for new methods that can robustly and easily estimate storage at the catchment scale.