Reservoir Sizing at Draft Level of 75% of Mean Annual Flow Using Drought Magnitude Based Method on Canadian Rivers

On a global basis, there is trend that a majority of reservoirs are sized using a draft of 75% of the mean annual flow (0.75 MAF). The reservoir volumes based on the proposed drought magnitude (DM) method and the sequent peak algorithm (SPA) at 0.75 MAF draft were compared at the annual, monthly and weekly scales using the flow sequences of 25 Canadian rivers. In our assessment, the monthly scale is adequate for such analyses. The DM method, although capable of using flow data at any time scale, has been demonstrated using monthly standardized hydrological index (SHI) sequences. The moving average (MA) smoothing of the monthly SHI sequences formed the basis in the DM method for estimating the reservoir volume through the use of the extreme number theorem, and the hypothesis that drought magnitude is equal to the product of the drought intensity and drought length. The truncation level in the SHI sequences was found as SHIo [ = (0.75 ‒ 1) µo/σo], where µo and σo are the overall mean and standard deviation of the monthly flows. The DM-based estimates for the deficit volumes and the SPA-based reservoir volumes were found comparable within an error margin of ±18%.


Introduction
The flows from a river can be analyzed using annual, monthly or weekly scales for estimating the reservoir volume (V R ) corresponding to a certain draft level. The draft levels are expressed as the ratio to the mean annual flow (MAF), such as 100% (1 µ a ), 90% (0.9 µ a ), 75% (0.75 µ a ), 60% (0.60 µ a ), etc., where µ a is the mean of the annual flow sequences under consideration. The ratio is denoted as 'α' in the ensuing text with the values 1, 0.90, 0.80, 0.75, etc. The mean flow would turn out to be nearly the same at all scales, though the variance and autocorrelation structure would differ significantly from each other at respective scales. For brevity, in the ensuing text, the terms mean, standard deviation, coefficient of variation, and lag-1 autocorrelation are referred to as µ, σ, cv and , respectively, whereas for the specific cases a suffix is attached to these symbols, such as µ a and σ a for the annual flows; µ o and σ o for the monthly flows; and other symbols with the meaning explained therein. As one moves from annual to monthly and further to weekly scales, the σ and of the flow sequences will increase warranting a greater storage volume of the reservoir. The sequent peak algorithm (SPA) can be touted as the universal method for sizing a reservoir because of its popularity and wide familiarity. The method first developed by Thomas and Burden [1] is documented in hydrological and water resources books [2][3][4][5][6], and journal papers [7][8][9][10], among others. The river flow data, primarily, at the annual and monthly scales are used at a proposed site of a dam for estimating a design value of the reservoir size.
The hydrological drought analyses have been the subject of considerable investigations from the decade of the 1970s, and the pioneering models are well documented in Yevjevich [11,12]. Two main parameters of the hydrological drought viz., duration (length, L) and magnitude (M, also termed as severity), have been the subject of study. The drought modelling activity has been focused on the drought lengths, with a less noticeable thrust towards the drought magnitude, which, in a true sense, is of greater importance in terms of the management of waters, and consequently for the sizing and operation of reservoirs that are built across rivers. The research work on modeling the drought magnitude was enhanced by notable researchers [13][14][15][16][17][18][19][20][21][22][23][24][25]. In the recent past, Sharma and Panu [24] have suggested a simple model for predicting the drought magnitude by coupling it with the drought length through a third parameter, namely, drought intensity, i.e., magnitude (M) = intensity (I d ) × duration (L). They have carried out analyses by standardizing the flow sequences in the respective scales (annual, month and week), which are named SHI (standardized hydrological index) sequences. In brief, the SHI is an entity with µ = 0 and σ = 1, while retaining the probabilistic structure/character of the flow sequence under consideration. The SHI has performed well in the modelling of drought lengths and magnitudes on Canadian river flow sequences at annual, monthly and weekly scales [23,24].
Recently, Sharma and Panu [26] have endeavored to link the deficit volume (D T ) using the drought magnitude (DM)-based methodology to the reservoir volume (V R ) based on the SPA. They also demonstrated that at annual and monthly scales, the draft equivalent to 0.90 to 1 µ a indeed can be construed to be high drafts for the design and operation of reservoirs. For pragmatic reasons, a majority of reservoirs are designed and operated at a 0.75 µ a level of the draft [27]; therefore, there is a need to examine/assess the drought magnitude-based analysis at the above draft level. The estimation of the deficit volume based on the DM-based methodology at the annual time scale is fairly straightforward. However, at monthly and weekly scales, the estimation of the deficit volume presents some challenges, which form the main focus of this investigation. The DM-based deficit volumes are being compared with the estimates of the reservoir volume from the conventional SPA. The scope of this paper includes the assessment of the efficacy of the DM-based methods of reservoir sizing to the popular SPA, and to demonstrate that the proposed method is not only comparable but also possesses significant advantages, such as the ability to assign a return period and evaluate the associated risk in the design of reservoirs.

Sequent Peak Algorithm (SPA) and Drought Magnitude (DM)-Based Procedures
The sequent peak algorithm (SPA), currently, is the most familiar procedure for the estimation of reservoir volume (V R ). The SPA requires historical or synthesized river flow data as inflows, drafts as outflows, and data are analyzed without stationarization. A distinction is made between the terms stationarization and standardization using the monthly flows as an example. The monthly flows are regarded as nonstationary because of the periodicity imbued in them. The flow sequences, when standardized month-bymonth are rendered stationary with µ = 0 and σ = 1 in the resultant sequences. Such an operation to stationarize the data is referred to as standardization in the ensuing text. The nonstationary monthly flows can be standardized (not by month-by-month standardization but rather flatly using an overall µ, σ (denoted respectively as µ o and σ o ) of the entire monthly flow sequences). For example, in the Bow River case, the overall µ o = 38.96 m 3 /s and σ o = 40.85 m 3 /s (Table 1) and shall still retain the nonstationary character of flows. Such standardization is referred to as flat standardization in the ensuing text. The flat standardized sequence shall also have µ = 0 and σ = 1. The annual flow sequences are generally perceived to be stationary and their standardization is tantamount to flat standardization. The computations using SPA are amenable to the nonstationary monthly and weekly flow sequences, the stationary annual flow sequences, and the standardized monthly, weekly and annual flow sequences alike. The calculations are conducted numerically using the mass curve or the residual mass curve methods involving inflows and drafts to assess the reservoir volume (V R ). The procedure is fully acquiescent to computerized computations to arrive at the required V R for a given situation. The aforesaid methods for computations of reservoir size using the SPA are well documented in hydrologic textbooks [3,5]. Although the SPA offers the advantage of treating the stationary and non-stationary flow data in a likewise manner, it suffers from the shortcoming that no return period can be attached to the V R estimates. For instance, in a dataset of the annual flows of the Bow River (Table 2) for 106 years (1911-2016), the V R at a 0.75 MAF level of the draft was computed as 4.18 × 10 7 m 3 , whereas the estimate based on 53 years of the sample (1911-1963) turned out to be 6.01 × 10 7 m 3 . Although, this discrepancy can be attributed to differing estimates of µ, σ, and from these samples, yet attaching a return period (T = 106 or 53) to these estimates is not devoid of uncertainty. Similar to the SPA, the non-stationary flow sequences can be truncated (or chopped) at a constant draft level and the runs of the flow deficiency can be observed. The deficiency in the largest run can be computed numerically. This deficiency is denoted as 'D T-o in the ensuing text. The reservoir volumes thus have two estimates, one based on the SPA and another by counting the flow deficiency below a truncation level. For example, in the Bow River case , at the monthly scale V R was computed to be 4.01 × 10 8 m 3 , whereas 'D T-o was computed to be 3.99 × 10 8 m 3 (Table 3). These estimates of the volumes may slightly differ from each other, though may turn out to be equal under specific draft conditions. It is imperative that in the DM-based procedure, the flow sequences must be stationary. Therefore, standardization of the annual, monthly and weekly sequences is achieved and the resultant sequences are known as SHI sequences, which turn out to be weakly stationary (i.e., second-order stationarity). At the annual time scale, the draft is set at the levels 1 µ a , 0.9 µ a , 0.8 µ a , 0.75 µ a , etc. The SHI sequences are truncated at these draft levels to carry out analyses in the DM-based procedure. For example, in the Bow River case at the annual time scale (Table 1: µ a = 38.96 m 3 /s and cv a = 0.13; the suffix "a" stands for annual) with the draft = 1 µ a , the truncation (SHI x ) level in the SHI sequence can be determined by the relationship, SHI x = (αµ a − µ a )/σ a = (α − 1)/cv a = (1 − 1)/0.134 = 0. At the draft of 0.75 µ a (α = 0.75), the truncation level (SHI x ) shall turn out to be (0.75−1)/0.134 = −1.87. Because of a single value of SHI x , the analysis is tractable at the annual scale for all the draft levels (i.e., 1µ a , 0.90 µ a , and 0.75 µ a ). Note: a (annual), ma1 (MA1 smoothed monthly SHI sequences) and ρ ma2 (MA2 smoothed monthly SHI sequences) show lag-1 autocorrelation. The notation cv a denotes the values of coefficient of variation at the annual scale, cv av average of 12 monthly values of cv's, which is computed as the ratio of averaged out (σ av ) of 12 values to overall mean µ o , cv m stands for the maximum value among monthly 12 values of cv computed as the ratio of σ max (maximum value of standard deviation) to µ o . cv o stands for coefficient of variation in the non-stationary monthly sequence, i.e., the ratio of overall monthly σ o to µ o , likewise cv ow stands for the ratio of the overall weekly standard deviation to µ o . Table 3. Comparison of V R and D T-o at the draft level of 0.75µ o : annual, monthly and weekly scales.  Asterisk (*) denotes the percentage increase in monthly value compared to the annual value. Asterisk (**) denotes the percentage increase in the weekly value of V R compared to the monthly value, and -means no values were evaluated for the V R and D T-o because both were zero at the annual scale. These Bold numbers in the coulumns 3, 6, and 9 are SPA based numbers for comaparative purposes.  At the monthly scale with a draft level of 1 µ a (=1 µ o ), the SHI sequences are truncated at SHI x = 0 to correspond to the variable monthly means. It should also be realized that µ a (annual mean) = µ o (overall monthly mean) and thus, these terms can be used interchangeably. The SHI sequence obtained has uniform µ = 0 and σ = 1 across the months.
The sequence is construed as stationary and is tractable using the known methods of stochastic analysis. However, at 0.75 µ o , there shall emerge 12 values of SHI x to truncate the SHI sequence. For example, in the Bow River case at the draft level of 0.75 µ o , the 12 values of cv vary from 0.12 to 0.41 with the corresponding SHI x values ranging from −2.11 to −0.62 (Table 1). Thus, the SHI sequence needs to be truncated by variable SHI x values (or a curve rather than a horizontal line), which presents a complex scenario requiring some procedures such that known methods of statistical analysis for the estimation of drought magnitude can be applied. The above description can be extended to weekly flows and corresponding SHI sequences.

Strategies for Truncating the Non-Stationary (Monthly) Flow Sequences
At draft levels less than 1µ o , such as 0.75µ o , one way of conducting the DM-based analysis is through selecting varying values of "α" with respect to each month in order to render the SHI X = (α 1 − 1)/cv 1 = (α 2 − 1)/cv 2 = (α 3 − 1)/cv 3 = (α 4 − 1)/cv 4 -= (α 12 − 1)/cv 12 = 'SHI o ' (say). It is noted that cv 1 is defined herein as the ratio of σ 1 (standard deviation of month 1) to µ 1 (mean of month 1), and likewise for the rest of the months. Other considerations for obtaining values of the 'SHI o ' could be as follows: (1) the average of 12 monthly values of SHI x (denoted as SHI av ), (2) the maximum of 12 SHI x values (denoted as SHI m ), or (3) any other combination of these 12 values. Thus, the quantity "SHI av " is defined to be equal to "(α − 1) × µ o /σ av ", and likewise the quantity "SHI m " to be equal to "(α − 1) × µ o /σ max " (where σ max is the maximum value among 12 monthly values of standard deviations). The above analogy can also be extended to the weekly flows, where there are 52 values of α'. The hypothesis proposed in the paper is that the 'SHI x ' = 'SHI o ' (a flat standardized value), and the use of SHI m or SHI av for truncating the SHI sequences provides a reasonable procedure to evaluate the drought magnitude as an estimator for the reservoir volume at the draft of 0.75 µ o . The best estimate from the above combinations can be assessed by comparing the DM-based results with the SPA-based outputs.

Computation of Drought Magnitude (M T-o ) by the DM-Based Counting Method
In the DM-based counting method, the SHI sequence is truncated at the level SHI x . The values of the SHI below the truncation level (SHI x level) are referred to as the drought (dubbed as 'd'); whereas, the values of the SHI above the truncation level are referred to as the wet (dubbed as 'w'). Initially, at each time scale, the value of SHI x is set = 'SHI o '.

Estimation of Drought Magnitude (M T-e ) by the DM-Based Analytical Method
For a given return period (T), the longest drought period (L T ), and the corresponding drought magnitude (M T ), can be estimated using the drought magnitude-based analytical method for which the parameters are derived from the historical flow data (or regional patterns of these parameters). In the DM-based analytical method, the probabilistic relationship for M T can be obtained by applying the extreme number theorem [28] and utilizing the linkage relationship. The following applies: drought magnitude (M) = drought intensity (I d ) × characteristic drought length (L c ). The drought intensity (I d ) can be assumed to follow a truncated normal probability density function (pdf) and the drought magnitude (M) to obey the normal pdf because this quantity represents the summation of the drought intensity spikes. The mean and variance of "I d " are evaluated using the truncated normal pdf. The longest drought duration (L T ) is estimated by the Markov chain (MC)-based relationship, which is shown as Equation (A9) in Appendix A. From the above linkage relationship, the terms M T and L T are related and thus, for design purposes over a time period (T), it is imperative to obtain the longest drought duration (L T ) for the estimation of the term M T . On the monthly basis, the MC1 (Markov chain order 1) is a reasonable predictor (or estimator) of L T . In the MC1-based relationship, the four input parameters are conditional probabilities (q q , q p ), plotting position factor (F), and return period (T) which is normally assumed to be equal to the sample size. The conditional probabilities q q (i.e., present period is drought given that the past period was also drought) and q p (i.e., present period is drought given that the past period was wet) are estimated based on the analytical relationship due to Cramer and Leadbetter [29]. In the present analysis, the relationship F = 1. 33(1 + 0.25/T), developed for Canadian rivers by Adamowski [30], was used. The MC-based value of L T can be reduced to the characteristic drought length (L c ) by using a simple weighing parameter Φ (ranging from 0 to 1), which can also be obtained through an optimization procedure.
Based on the above notions for the estimation of M T , the following probability-based relationship can be deduced (see: Appendix A for detail).
Equation (1) is a discrete version of evaluating the mean from the first principles, and the following applies: , where x is a value of the random variable with the occurrence probability = p(x)]. The notation P (.) stands for the cumulative probability and p (.) stands for the simple probability. Since M T is a continuous random variable, and thus possesses a continuous pdf such that p (M T = Y j ) can be evaluated as It is noted that the upper limit of summation (n 1 ) will vary from the annual to monthly scales. For the annual scale, the maximum value of Y = 30 was found to be sufficient, whereas for the monthly scale, the maximum value of Y = 150 was found to be adequate. For integration purposes, the Y is discretized into small intervals with a step (∆) of 0.05, such that n 1 becomes equal to 600 (=30/0.05) at the annual scale, and likewise equal to 3000 for the monthly scale. For purposes of numerical integration, thus, these Y j 's shall take on the values as 0, 0.05, 0.10, 0.15, etc., up to 150 at the monthly scale and up to 30 at the annual scale.
A simplified version for the estimation of E (M T ) can be developed based only on the mean of the drought intensity (µ d ) as follows: It should be borne in mind that Equation (1) involves implicitly both the mean and variance of drought intensity (I d ) for the estimation of E (M T ). Relevant detail on the derivation of Equations (1) and (2) is provided in Appendix A.
At the monthly and weekly scales, the SHI sequences of Canadian rivers have been reported by Sharma and Panu [23] to follow a gamma pdf vis-à-vis the normal pdf for the annual SHI sequences. In such a situation, firstly the gamma-distributed SHI value at the desired truncation level (denoted as SHI x ) is transformed into an equivalent standard normal number z 0 using the following Wilson-Hilferty transformation, as documented in Viessman and Lewis [31]: A corresponding value of the drought probability (q) can be obtained from the following polynomial equation [32]: . (4) where, B = q for z 0 < 0; and q = (1 − B) for z 0 ≥ 0 and the term |z 0 | represents the absolute value of z 0 . The error evaluated by this formula is less than 0.00025 and exactly the same value is obtained from a standard normal probability table for z 0 = 0 (i.e., q = B = (1 − 0.5) = 0.5).
As an illustrative example, consider the Bow River, which obeys the gamma pdf with cv av = 0.15 at the monthly scale (Table 1) for a value of SHI x = −0.24. The corresponding z 0 = −0.16 was obtained using the Wilson-Hilferty transformation given in Equation (3). Substituting z 0 = −0.16 in Equation (4), the value of q would result to be 0.45. It is reported earlier in the text that monthly flow sequences in Canadian rivers tend to follow the gamma pdf, and for this reason, the gamma pdf has been used in this example.
In the DM-based procedure, D T-e (or D T-o ) should correspond to V R obtained from the SPA, in turn, such an expectation forms the criterion for perfecting the DM-based estimates. Once the DM-based estimates for an appropriate L T and M T are evaluated (either by the counting or analytical methods), then these quantities are used for the estimation of reservoir volumes. Analytically derived E (M T ) is denoted as M T-e and correspondingly, D T-e = σ av × M T-e (subscript "e" stands for analytically estimated).

Data Set and Computations of Reservoir Volumes by the SPA and DM-Based Methods
Twenty-five rivers from Western to Atlantic Canada ( Figure 1, Table 2) were involved in the analysis. The rivers encompassed drainage areas ranging from 46 to 74,600 km 2 with the data bank spanning from 36 to 106 years. The monthly and annual flow data for these 25 rivers were extracted from the Canadian hydrological database [33]. The weekly flows were collated using the daily flow data for the above gauging stations. The values of the statistical parameters µ, σ, cv, and for these rivers at annual, monthly and weekly scales were computed and are summarized in Table 2. Since cv = σ/µ, therefore instead of σ, the values of cv are summarized (Table 2) for the sake of brevity and ease of comprehension. In the case of the monthly and weekly flow sequences, there are 12 and 52 values of cv (or σ) that were used in respective standardization. The cv av was computed as the ratio of σ av (arithmetic average of 12 monthly values) to µ o (overall monthly mean or MAF, µ a ) and similar calculations also apply to the weekly scale. The first step in the analysis was to compute the statistics viz., µ o and σ o of the monthly sequences for the flat standardization of these sequences. Likewise, these statistics were computed for the weekly flow sequences. The flat standardized values of cv at monthly and weekly scales are denoted as cv o and cv ow . The flat standardized values of 'SHI o ' were computed at the 0.75µ o level of cutoff using the overall µ (=µ o ) and σ (=σ o or σ ow ) of the non-stationary monthly and weekly flow sequences, and are shown in Table 3.

Inter-Comparison of Reservoir (V R ) and Deficit ('D T-o ) Volumes for the Draft of 0.75 µ o at Varying Time Scales
The analysis commenced with the computing of V R and 'D T-o at the annual, monthly and weekly scales and the results for 25 rivers are summarized in Table 3. It is reiterated here that V R (m 3 (Table 3) is that at the monthly and weekly scales, the V R values are much larger than those at the annual scale. The V R values at the monthly scale were found to be almost 200% (average = 199% in Table 3) compared to the annual scale, but at the weekly scale the increase was found to be marginal (3.3%) concerning the monthly values.
At each time scale, the values of 'D T-o were found to be less than V R . However, in general, a parallel trend in the values of 'D T-o being similar to V R is apparent, although such values tend to increase from the annual to monthly scale. However, an increase in the 'D T-o values at the weekly scale is not consistent because of random occurrences of a marginal decreases in values, which can be ascribed to sampling variations. At the annual scale, the V R and 'D T-o values are almost equal, though at times they turned out to be zero, for example, 2 rivers (#23 North Margaree and #24 Upper Humber in Table 2). For these two rivers, there appears to be no need for storage at the draft of 0.75 µ a based on the annual flow analysis. This finding seemingly is less convincing as there are long dry periods rendering the presence of low flows in these rivers, requiring compensating flow releases from storage. Therefore, the aforesaid analysis at the annual scale provides fewer functional estimates of the reservoir and the corresponding deficiency volumes at the draft level of 0.75 µ a . However, more functional estimates are obtained by analyzing at the monthly and weekly scales. At these scales, to meet the draft of 0.75 µ a , storage is required in the form of reservoirs across the above rivers, as evinced by significant storage values (Table 3). Another point to be noted is that the cutoff level 'SHI o ' at the annual scale shows large variability (range −0.68 to −2.03) compared to the monthly and weekly scale, where variability is contained within a small range (from −0.17 to −0.34).
The aforesaid analysis thus points out that the monthly based V R values are drastically different from those obtained at the annual scale. The weekly based estimates are marginally higher than the monthly based, so monthly analysis seems to be adequate. Further, at the monthly and weekly scales, 'SHI o ' is varying within a close range and thus, either of the time scales can be considered for further analysis, weighing the choice to the monthly scale because of its affable tractability in terms of statistical analysis. The monthly values are easy to procure and/or synthesize. The monthly SHI (standardized flow) sequences fit well within the MC1 dependence structure, which is better amenable for analysis using stochastic methods. Because of the above observations, the detailed analysis for computing V R , D T-o and D T-e has been done on the monthly flow and SHI sequences, and the methods of analysis are presented below.

Implementation of the DM-Based Counting and Estimation Methods
In the DM-based methods, the first step was to form the moving average (MA) smoothed sequences from the monthly SHI sequences. In the present paper, the SHI sequence by itself is taken as the MA1 (moving average 1) smoothing with µ = 0 and σ = 1. In the MA2 smoothing, 2 consecutive values of SHI were averaged out, and a running smooth sequence of MA2 was formed with µ = 0 and σ (designated as σ m2 ), which turned out to be smaller (<1) than the case of MA1. Intuitively, a higher degree of smoothing leads to a higher dependence and that is why (denoted as m2 ) for the MA2 case will be greater than that ( m1 ) for the MA1 case. The MA2 sequence was again standardized using µ = 0 and a new value of σ (denoted as σ m2 ) for further analysis.
In the counting process, the SHI sequence was truncated at the level of SHI x (i.e., SHI x = SHI o , SHI x = SHI m , and SHIx = SHI av ) in both the cases of the MA1 and MA2 sequences. The values of the above three SHI x at the monthly scale are summarized in Table 4   Note: These numbers in Bold has been referred in the text.
On the other hand, in the analytical procedure, the values of L T-e , M T-e and D T-e were evaluated by involving Equations (1) to (4), and other relevant equations (Appendix A), with the parameters (cv av , σ av , m1 and m2 ) estimated from the observed flows and SHI sequences. A value of Φ in determining the characteristic length L C was needed, which required a calibration, though the initial values are available from the investigations of Sharma and Panu [26]. Such values have been further affirmed through an evaluation using the Nash-Sutcliffe efficiency (NSE) and the mean error (MER) criterion. To convert M T-e into D T-e , the original value of σ av was used as a multiplier similar to the case of the counting method.

Comparison of V R ' and M T-o at Monthly Scale with SHI o as Cutoff: By Counting Method
It must be reiterated that the DM-based method is applied to the SHI sequences.  Table 4. For brevity, all the rivers are not listed in Table 4, rather included are two typical and representative rivers from each of the three regions. The selection criterion for choosing a river was the persistence characteristics represented by the lag-1 autocorrelation ( m1 ). The first set of two rivers (#3 and #5) are located in the Prairies and Western Ontario with m1 ≥ 0.5; the second set of rivers (#8 and #11) are from Northern Ontario with 0.25 < m1 < 0.5; and the third set of rivers (#20 and #25) are from Atlantic Canada with a modest autocorrelation (i.e., m1 < 0.25). However, it should be noted that the performance statistics (NSE and MER) reported in Table 4 and all graphical displays are based on all the 25 rivers, and accordingly, the results are discussed. For comparative analysis of any pair on a 1:1 basis, the performance statistics (NSE and MER) were used. For an acceptable quality of parity in the entities in a pair, one should expect the value of NSE to be greater than 75% along with the value of MER to be within ±5% [23].
Based on the results of the analyses, it was found that M T-o in terms of the MA1 sequences underestimated V R ' (MER ≈ −14%) whereas, it was overestimated (≈15%) in the MA2 sequences with an NSE of less than 60% (Table 4, columns 4 and 5). These values, therefore, suggest that the counting-based procedure for evaluating D T-o to match V R can be interpreted as unsatisfactory. This discrepancy necessitated an alternative method to evolve estimates for deficiency volumes (D T ) comparable to SPA-based (V R ) values. Therefore, recourse was taken to an analytical method in which the probability-based relationships (Equations (1) and (2)) formed the basis for the evaluation of the needed entities.

Comparison of V R ' and M T-e (Equation (2)) at the Monthly Scale with SHI o as Cutoff
The simple procedure in the ambit of estimation methods is the use of Equation (2) (Table 4, columns 6). The M T-e estimates ( Table 4, column 7) based on the MA2 sequences seemed better with MER ≈ 0, but the NSE was low (≈46%). Such values appear to be stoic and allude to the modest potential of the simple equation to provide acceptable estimates for the reservoir volumes. Similar behavior of Equation (2) at the draft of 0.95 to 1 MAF has been reported [25], where a substantial underestimation was observed. Because of the foregoing observations, it is imperative to extend to a step higher model (Equation (1)), which involves both the mean and variance of drought intensity for the estimation of M T-e .

Comparison of V R ' and M T-e (Equation (1)) at Monthly Scale with SHI o as Cutoff
In the analytical method using Equation (1), the crucial parameter is Φ because it controls the value of the characteristic drought length, L c . The experience of authors on Canadian rivers suggests that two values of Φ, viz. 0 and 0.5, be considered for use in Equation (1). The values of M T-e were estimated based on the MA1 and MA2 smoothed sequences along with Φ = 0 and 0.5, and the results are summarized in Table 4.
The values of M T-e were computed by involving only MA1 sequences along with Φ = 0 and the results are presented in Table 4, column 8. It is apparent that the M T-e values in column 8 are generally larger than V R ' (column 3), thus indicating an overestimation. This fact is also reflected in MER ≈ 27%, though the value of NSE was reasonable (≈76%). Therefore, a value of Φ = 0.5 was attempted on the MA1 sequences, which resulted in the underestimation (MER ≈ −14%) and a low value of NSE ≈ 62% (Table 4, column 9). Since the results based on the MA1 sequences turned out to be less than encouraging, it was considered appropriate to estimate M T-e using the MA2 sequences with values of Φ = 0 and 0.5, and the results thus obtained are exhibited in columns 11 and 12 (Table 4). A striking feature on the MA2-based computations is the significant overestimation (MER ≈ 99% for Φ = 0; and MER ≈ 35% for Φ = 0.5).
With the use of Φ = 0.5 on the MA1 sequences, major discrepancies were noted in the rivers lying in Prairies and Western Ontario (#1 to #6) in which m1 was ≥0.5, whereas the rivers from #6 to #25 tended to show reasonable correspondence. It can, thus, be conjectured that the same value of Φ is not applicable uniformly in all the rivers. Therefore, the following two groups of rivers were formed: group 1 was river #1 to #6 and group 2 was river #6 to #25. The discrepancy in group 1 was ameliorated with the use of Φ = 0, while retaining Φ = 0.5 for the rivers in group 2. Likewise with Φ = 0.5 on the MA2 sequences in the group 1 rivers, but Φ = 0.5 on the MA1 sequences in the group 2 rivers tended to improve the matching. This mooted the idea of mixing Φ's (0 or 0.5) on the MA1 sequences (named as combination-A) or mixing the MA1 and MA2 sequences with Φ = 0.5 (named as combination-B). Based on the use of differential values of Φ on the MA1 sequences in combination-A, the resultant M T-e values are arranged in column 10 ( Table 4). The combination-A-based values of M T-e compared satisfactorily with V R ' as shown in Figure 2A (MER ≈ −1% and NSE ≈ 80%). Likewise, in combination-B the M T-e values with differential MA (MA1 or MA2 sequences) with a uniform value of Φ = 0.5 are arranged in column 13 ( Table 4). The combination-B resulted in NSE = 78% with negligible overestimation of 1.5% ( Figure 2B).  Table 4). The combination-B resulted in NSE = 78% with negligible overestimation of 1.5% ( Figure 2B). Although, the estimated values of MT-e under both combinations (i.e., A and B) do not appear significantly different but one would be inclined to use combination-A with the use of differential values of Φ on the MA1 sequences for estimation purposes of MT-e. This is because working on the MA1 (which is the SHI itself) sequences is easier and convenient as the Φ values can be easily plugged in the desired equations. A worthy point to note is that the combination-B involves the MA2 sequences, thus warrants making another sequence (extra effort) based on the SHI sequences and also evolving a new set of parameters, viz. ρm2 (lag-1 autocorrelation of MA2 sequences) and σm2 (standard deviation of MA2 sequences), which is not devoid of uncertainty and errors.

Comparison of VR' and MT-e (Equation (1)) at Monthly Scale with SHIm and SHIav as Cutoff
Although SHIo seemed reasonably satisfactory as a cutoff for the MA1 sequences, Although, the estimated values of M T-e under both combinations (i.e., A and B) do not appear significantly different but one would be inclined to use combination-A with the use of differential values of Φ on the MA1 sequences for estimation purposes of M T-e . This is because working on the MA1 (which is the SHI itself) sequences is easier and convenient as the Φ values can be easily plugged in the desired equations. A worthy point to note is that the combination-B involves the MA2 sequences, thus warrants making another sequence (extra effort) based on the SHI sequences and also evolving a new set of parameters, viz. m2 (lag-1 autocorrelation of MA2 sequences) and σ m2 (standard deviation of MA2 sequences), which is not devoid of uncertainty and errors.

Comparison of V R ' and M T -e (Equation (1)) at Monthly Scale with SHI m and SHI av as Cutoff
Although SHI o seemed reasonably satisfactory as a cutoff for the MA1 sequences, there is still a noticeable divergence in the higher range of values of these entities (Figure 2A,B) between V R ' and M T-e . Should this divergence be mitigated, the NSE would improve, alluding to a better fit. Since V R ' is already a fixed entity, the divergence can be reduced by improving the values of M T-e . Therefore, two versions of SHI x were contemplated, namely, SHI m and SHI av . SHI m is computed as = (0.75 − 1)/cv m and SHI av = (0.75 − 1)/cv av , where cv m = σ max /µ o , cv av = σ av /µ o , and σ max is the maximum value among 12 monthly σ's, whereas σ av is the arithmetic average of these 12 monthly values. The new values of SHI x as SHI m and SHI av are shown in Table 4 (in the mid and lower portion of column 2) for the rivers under consideration. These cutoff levels were applied on the MA1 and MA2 sequences in the same manner as was conducted with cutoff = SHI o while using Φ = 0 and 0.5. The results based on the new values of SHI x (=SHI m ) for M T-e are shown in Table 4 (columns 8 to 13). With the SHI m as the cutoff, the fit has improved yielding the NSE value ≈ 85% (combination-A, mixing of Φ values) and ≈ 86% (combination-B, mixing MA1 and MA2). The MER also remained within 3% suggesting an acceptable overestimation.
The new values of M T-e against V R ' are shown in Figure 3A,B, where the points fall closer to the 1:1 line vis-à-vis Figure 2A,B, where they lie farther from the 1:1 line. At a first glance, the plots further suggest that SHI m can be construed to be a better cutoff level compared to the SHI o , though only marginally. A further attempt was made to engage SHI av , and the new estimates of NSE and MER were evaluated (the lower portion in Table 4, columns 10 and 13). The response with SHI av as the cutoff was less than promising as MER indicated underestimation in the range of nearly 26% (combination-A) and 22% (combination-B). It can be perceived from Table 4 (column 2) that the values of SHI av appear significantly different from SHI o and SHI m , thus yielding low values of M T-e in comparison with V R '. Consequently, no further analysis was pursued using SHI av as a candidate choice. Prima-facie, SHI m appears to be a better cutoff followed by SHI o and further investigations were needed to discriminate between these two cutoffs to evolve the better one. The relative difference between V R and D T-e was considered a better measure for the discriminant analysis and was conducted as follows. Consequently, no further analysis was pursued using SHIav as a candidate choice. Prima-facie, SHIm appears to be a better cutoff followed by SHIo and further investigations were needed to discriminate between these two cutoffs to evolve the better one. The relative difference between VR and DT-e was considered a better measure for the discriminant analysis and was conducted as follows.

Relative Difference between Deficit Volume (DT-e) and Reservoir Volume (VR)
The performance statistics NSE and MER essentially help to discern the quality of a fit between MT-e and VR' about the 1:1 line because both are respectively standardized values of DT-e and VR. Also, the large variations within DT-e and VR are not accurately accounted for by NSE and MER statistics. The relative difference in percent (designated as a relative error, RE), i.e., RE (=100 × (DT-e − VR)/VR) can be used as a robust measure to express the adequacy of the estimated values of DT-e as a competing estimate of VR. Preferably, RE should be close to zero for an ideal estimate of DT-e, and conversely, a higher value indicates a poor estimation. Thus, based on the DT-e and VR values of 25 rivers, the

Relative Difference between Deficit Volume (D T-e ) and Reservoir Volume (V R )
The performance statistics NSE and MER essentially help to discern the quality of a fit between M T-e and V R ' about the 1:1 line because both are respectively standardized values of D T-e and V R . Also, the large variations within D T-e and V R are not accurately accounted for by NSE and MER statistics. The relative difference in percent (designated as a relative error, RE), i.e., RE (=100 × (D T-e − V R )/V R ) can be used as a robust measure to express the adequacy of the estimated values of D T-e as a competing estimate of V R . Preferably, RE should be close to zero for an ideal estimate of D T-e , and conversely, a higher value indicates a poor estimation. Thus, based on the D T-e and V R values of 25 rivers, the RE values were evaluated for the two cutoff conditions, viz. SHI x = SHI o and SHI x = SHI m involving both combination-A and combination-B. The mean (designated as µ re ) and standard deviation (designated as σ re ) of these 25 RE values were computed and are presented in Table 5. The cutoff equal to SHI o with combination-A yielded the least µ re = 0.77% (σ re ≈ 17%) followed by µ re = 1.52% (σ re ≈ 18%) with combination-B. The cutoff equal to SHI m outputted µ re close to 4% (σ re ≈ 17%) in both combination-A and combination-B. Given these statistics, one can interpret that SHI o is a better cutoff than SHI m , which contradicts the earlier finding (based on NSE and MER) that had placed SHI m ahead of SHI o . It is worthy to note that µ re and σ re (not tabulated in Table 5) were also evaluated for the cutoff equal to SHI av , which resulted in µ re ≈ −24% (σ re ≈ 14%) for combination-A and likewise, µ re ≈ −22% (σ re ≈ 14%) for combination-B. These values (specifically the µ re values), being far from the desired value of 0, further support the earlier argument of rejecting SHI av as a viable and competing cutoff level.
At this point, there is a need to discriminate between SHI o and SHI m , and to suggest a consistent cutoff level for estimating the deficit volumes (=reservoir volumes), as the statistics MER and RE have not rendered unanimous results. This anomaly was solved by enlarging the sample size from 25 (each river is a sample) to 37. The rivers with a long record (#3, #5, #19, #20, #22 and #23 in Table 2) were split into two samples, each of nearly equal size, and V R and D T-e values were computed for 37 (25 + 12) samples. Based on the 37 samples, the MER (using M T-e and V R ') and RE (using D T-e and V R ) were evaluated and the results are shown in Table 5 at the cutoff levels of SHI o and SHI m for combination-A and combination-B. For the cutoff level (SHI m ), the new µ re values were found to range between 3 to 4% (σ re > 18%) but the NSE values dropped significantly from 85% to 73%. This is undesirable behavior, and thus the reliability of the cutoff (SHI m ) lands in a doubtful regime. In contrast, the behavior of the cutoff (SHI o ) is more consistent because the NSE values remained closer to each other in both samples of the original 25 flow datasets and the latter 37 flow datasets. Further, the µ re values were found to be closer to 0 (ranging from 0.77 to 3.91%). Based on these numbers, it can be easily conjectured that the cutoff (SHI o ) is better compared to the cutoff (SHI m ). Needless to mention, it is easy to derive SHI o values by computing µ o and σ o (or cv o ) from the non-stationary monthly flow sequences. Likewise, the m1 and σ av can also be rapidly computed from the monthly flow and SHI sequences.
To objectively illustrate the DM-based methodology, an example is presented using the monthly data of the following two rivers: the English River (#5 in Table 2, with a high value of m1 = 0.76) and the Neebing River (#7 in Table 2, with a modest value of m1 = 0.43). The relevant specificities and computations for these rivers are as follows.
English River: T = 92 years (1164 months), µ o = 58.61, cv o = 0.74, cv av = 0.51, m1 = 0.76, and m2 = 0.88. Since m1 = 0.76 (>0.50), so Φ = 0. Plugging the above parameters in the relevant equations, D T-e was computed as 2.26 × 10 9 m 3 . The SPA-based value of V R was computed as 2.04 × 10 9 m 3 . The relative difference is therefore +11% (indicating an overestimation compared to the SPA-based value), which is reasonably acceptable. The above estimate can be attached to a return period of 100 years (sample size 92 years ≈ 100 years). The other merit of the DM-based method is the estimation of D T-e with any other return period, say 50 years (600 months). Plugging T = 600 into the relevant equations, D T-e = 1.92 × 10 9 m 3 , which is less than the 100-year value (=2.04 × 10 9 m 3 ).
Neebing River: T = 66 years (792 months), µ o = 1.62 m 3 /s, cv o = 1.48, cv av = 0.81, m1 = 0.43, and m2 = 0.73. Since m1 = 0.43 (<0.50), so Φ = 0.5. Plugging the above parametric values into the relevant equations, D T-e was computed as 4.83 × 10 7 m 3 . The SPA-based value of V R was computed as 5.69 × 10 7 m 3 . The relative difference is therefore about −15% (an underestimation compared to the SPA-based value), which is again within an acceptable range of error. The above estimate can be attached to a return period of 66 years. The D T-e value with the return of T = 50 years or 600 months was computed as 4.58 × 10 7 m 3 and also for T = 100 years or 1200 months as 5.20 × 10 7 m 3 . These D T-e estimates are closer to the SPA estimate of 5.69 × 10 7 m 3 , with a slight underestimation. For design application, the V R can be taken as the average of 5.69 × 10 7 and 4.83 × 10 7 = 5.26 × 10 7 m 3 for T = 66 years.
In a nutshell, the proposed analytical method yields satisfactory estimates of D T-e , which are comparable to the SPA-based estimates of reservoir volume, with a margin of error of ±18%. The combination with the length scaling parameter Φ = 0 and Φ = 0.5, respectively, for the rivers with high persistence and low persistence on the SHI (MA1) sequences proved adequate for estimation purposes.
Although in this analysis, the comparison has been made between SPA-and DM-based methods using the Canadian river flow data, it was considered appropriate to test the efficacy of the DM method using the data of the Mitta Mitta River (Australia) reported by McMahon and Mein (1978), and compare the estimates of the reservoir volumes by various other methods as shown in Table 6 (draft level of 0.75µ at the monthly scale). The estimate of the reservoir volume based on the DM method is also listed in the last row of this table. Table 6. Comparison of reservoir capacity by different methods at the monthly scale.

Reservoir Capacity Estimation Method
Reservoir Capacity at 0.75 µ (MAF) It is evident from the above table that the estimate of reservoir volume by the DM (#9) lies between the SPA (#3), Gould Gamma (#6) and Behavioral analysis (#8). It has been reported (McMahon and Mein, 1978) that the Gould Gamma and Behavior analysis methods yield better results in terms of the least bias and standard error. In view of the foregoing observation, the DM-based estimate is relatively closer (i.e., within +14% error) to the above two estimates. While comparing with the SPA-based estimate, the DM-based estimate is slightly less (i.e., within −10.6%). Succinctly, the DM-based method offers an approach for the design of reservoir capacity, which opens a platform for comparison with techniques such as Extended deficiency analysis, Behavioral analysis, Hardison generalized method, Phien method, Vogel and Stedinger method, Gould's probability method, etc., which have been discussed by McMahon et al. (2007). Such investigations of the aforesaid methods on Canadian rivers are underway by the authors under a separate study.

Conclusions
The DM-based methodology for reservoir sizing has been developed and applied at a 0.75 mean annual flow (MAF) level of the draft at the monthly scale in 25 rivers across Canada. The analysis, involving the annual, monthly and weekly flows at the aforesaid draft level, suggested that the monthly scale is the most optimal for the estimation of reservoir volumes at the draft of 0.75 MAF. The annual time scale was found inadequate. The weekly scale was adequate but only offered marginal benefits over the monthly based estimates of reservoir volumes.
In the DM-based methodology, the monthly standardized hydrological index (SHI) sequences were used to evolve the values of standardized drought magnitudes, M T-e , which were compared with the SPA-based standardized reservoir volumes, V R '. The DMbased analysis involved the MA1 (moving average 1) and MA2 sequences, which were formed from the SHI sequences. The SHI sequences themselves are denoted as MA1, while the averaged consecutive two SHI values formed the MA2 sequences. The cutoff level for truncating the MA1 and MA2 sequences was determined as SHI o = (0.75 − 1) µ o /σ o , where µ o and σ o are the overall mean and standard deviation of monthly flow sequences. Another competitive cutoff level was found as SHI m = (0.75 − 1) µ o /σ max , in which σ max is the maximum value of standard deviation among 12 monthly values. In the DM-based methodology, one crucial parameter is the characteristic drought length, which is controlled (or optimized) through the length scaling parameter Φ. At the monthly scale, the value of Φ = 0 was found adequate for rivers with a lag-1 autocorrelation ( m1 ) ≥ 0.5, and the Φ = 0.5 for rivers with m1 < 0.5. Two combinations of MA sequences were attempted to estimate M T-e , as follows: in combination-A, differential values of Φ (Φ = 0 for rivers with m1 > 0.5 and Φ = 0.5 for rivers with m1 < 0.5) were applied on the MA1 sequences; and in combination-B, a single value of Φ = 0.5 was applied on the differential MA sequences, i.e., MA1 sequences of rivers with m1 < 0.5 and MA2 sequences of rivers with m1 ≥ 0.5. Combination-A emerged to be better in terms of ease and convenience of computations while yielding M T-e values that are compatible with V R '. The DM-based estimates of deficit volumes (D T-e ) were found to be closer to the SPA-based reservoir volumes (V R ) within an error margin of ±18%. An important feature of the DM-based method is its ability to explicitly involve the return period in the estimation process, unlike the SPA, which yields estimates of reservoir volumes that are devoid of return periods.
Author Contributions: T.C.S. and U.S.P. collaboratively accomplished various tasks related to this manuscript, draft preparation and finalization of the manuscript for journal publication such as conceptualization and development of the methodology, data collection and data analysis. All authors have read and agreed to the published version of the manuscript.

Acknowledgments:
The partial financial support of the Natural Sciences and Engineering Research Council of Canada for this paper is gratefully acknowledged.

Conflicts of Interest:
The authors declare no conflict of interest. Such a consideration reduces the expression for the term P (M ≤ Y j ) in Equation (A3) as follows: It was noted that the parameters µ M and σ 2 M are related to the extreme drought length L T , and the mean drought length L M [24]. Such a representative drought length is named herein as a characteristic drought length L C , which can be expressed as follows: The parameter Φ can be designated as a weighing parameter as it weighs the mean drought length L M and the longest drought length L T . The value of Φ varies from 0 to 1 and is determined through optimization or a trial and error procedure. For first-order dependence or a Markov Chain order 1 (MC1) situation, the mean length L M can be expressed as follows [13]: The expression for the expected value of L T in the MC1 situation in drought periods can be obtained as follows [24]: where F is the factor to account for the plotting position in the empirical estimation of the exceedance probability. That is, in the Hazen plotting position formula, the exceedance probability = 0.5/T (T = sample size), so the return period is equal to T/0.5 = 2T or F = 2. Likewise, in the Weibull plotting position formula F = 1. In this analysis, the plotting position formula [30] as developed for Canadian rivers has been used. The formula evaluates the exceedance probability = 0.75/(T + 0.25), so F = 1. 33 (1 + 0.25/T) ≈ 1.33 as T is generally large. The term q q stands for the conditional probability of the present period being drought given the previous period was also a drought, and likewise, q p stands for the present period being drought given the previous period was wet. The conditional probabilities q q and q p can be computed from the following relationship due to Cramer and Leadbetter (29): where ϑ is a dummy variable of integration while other terms are as defined earlier. The integral in Equation (A10) can be evaluated by a numerical procedure, and the values of q q for a given and z 0 can be computed. For the monthly SHI sequences = m1 . Equation (A10) can also be used to estimate p p (probability present time being wet given the past period was also wet) by replacing q q by p p and q by p (=1 − q). Therefore, an estimate of q p can be found as q p (=1 − p p ), which can be used in Equation (A9) for the estimation of the longest drought lengths L T , involving the MC1 situation. Note for the MC0 situation, such as for the annual SHI sequences, q q = q p = q. The expressions for µ M and σ 2 M can be shown to be related as follows [34]: Once a proper value of L C has been determined, Equation (A6) is integrated numerically to evaluate P (M ≤ Y j ) and then the value of the integrand is inserted in Equation (A3) to yield an estimate of P (M T ≤ Y j ). Letting these values of Y j as Y 0 (j = 0) = 0, Y 1 (j = 1) = 0.05, Y 2 (j = 2) = 0.10, Y 3 (j = 3) = 0.15, Y n1 = 150 (for the monthly scale) with an increment, ∆ = 0.05, E (M T ) can be computed using Equation (A3).
A particular version for the estimation of E (M T ) can be developed based on the mean of the drought intensity, µ d involving Equations (A4) and (A9) as follows: Note, Equation (A2) involves both the mean and variance of drought intensity (I d ) for the estimation of E (M T ), and therefore is more comprehensive.