The bounded inverse Weibull distribution: An extreme value alternative for application to environmental maxima?



Introduction
There has long been interest in the magnitude of future large low-probability natural events that are greater than any previously recorded. For example, future large floods may significantly impact catchment ecosystems, and their magnitudes are also relevant to civil engineering design.
Extrapolating beyond the largest recorded event is always subject to error, and asymptotic extreme value theory has been widely employed as an aid. The theory was developed by early workers in the field, including Fisher and Tippett (1928), von Mises (1936) and Gnedenko (1943). A historical overview is given by Kotz and Nadarajah (2000). Coles (2001) considers a range of practical applications.
For magnitudes that are unbounded above, extreme value theory indicates that the largest values of sufficiently large samples will be distributed approximately as either a Type 1 (Gumbel) extreme value distribution, or a Type 2 extreme value distribution.
Both Type 1 and Type 2 distributions have been widely applied to environmental maxima. The Type 2 form for application to rainfall annual maxima was first proposed by Jenkinson (1955) and has subsequently been supported by studies of many rainfall data sets. See, for example, Papalexiou and Koutsoyiannis (2013).
Type 1 and Type 2 extreme value distributions applied to data are conveniently presented on Gumbel plots. Type 1 distributions yield linear data plots and Type 2 forms display an upward curvature. Fig. 1 and Fig. 2 illustrate Gumbel data plots for the two distribution types. The lines passing through the data in Fig. 1 and Fig. 2 represent, respectively, a Type 1 linear relation and a Type 2 curve. The respective lines here were not fitted by formal methods but were superposed on the data to illustrate similarity to the data values. This similarity would normally be taken to mean that the unbounded extreme value models hold and that there is therefore some theoretical justification for extrapolating to large low-probability event magnitudes.
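The construction of a Gumbel data plot can be sketched as follows. This is a minimal illustration, assuming the Gringorten plotting position F_i = (i − 0.44)/(n + 0.12) that the figure captions say is used for all data plots. The function names and the sample values are hypothetical and are not the Buller River or Lake Tekapo records.

```python
import math

def gringorten_positions(n):
    """Gringorten plotting positions F_i = (i - 0.44) / (n + 0.12), i = 1..n."""
    return [(i - 0.44) / (n + 0.12) for i in range(1, n + 1)]

def gumbel_plot_points(maxima):
    """Return (y, x) pairs: Gumbel variate y = -ln(-ln F) against ranked maxima."""
    xs = sorted(maxima)
    ps = gringorten_positions(len(xs))
    return [(-math.log(-math.log(p)), x) for p, x in zip(ps, xs)]

# Hypothetical annual maxima, for illustration only
sample = [3120.0, 4550.0, 2980.0, 6010.0, 3875.0, 5240.0, 4100.0]
points = gumbel_plot_points(sample)
for y, x in points:
    print(f"y = {y:6.3f}   x = {x:8.1f}")
```

Points lying close to a straight line would suggest a Type 1 form, while an upward curvature would suggest a Type 2 form.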
However, there is a significant issue of concern in such extrapolations. Type 1 and Type 2 extreme value distributions are both unbounded above, so their use in any context implies magnitudes increasing without limit as exceedance probabilities decrease. That is, the Type 1 and Type 2 models must at some point over-predict event magnitudes. In reality, environmental magnitudes are bounded above by the bounded nature of the processes that create them. For example, it is always possible to specify a rainfall magnitude so large that it has zero probability of exceedance, as opposed to a vanishingly small probability.
Applying unbounded extreme value models to bounded situations invalidates the theoretical justification for extrapolating beyond the largest past magnitude. Of course, for application within the data range there is no issue with using any flexible unbounded distribution as an empirical exercise in data matching. Zaghloul et al. (2020) give an overview of distributions applied to flood maxima. See also El Adlouni et al. (2010), and Marani and Ignaccolo (2015). However, all unbounded distributions will inevitably lead to over-prediction at some point if extrapolated far beyond the largest value.
The question then arises as to whether there exists any approach which might permit bounded extreme value theory to be applied to apparently unbounded data. This carries the implication of estimating an upper bound parameter and quantifying its inevitably large estimation uncertainty. The purpose of this brief paper is to outline one such approach, which might be considered for further studies. The content here is developed from an earlier published paper by the senior author (Bardsley, 2019).

Asymptotic distribution of maxima reciprocals
As is well known, asymptotic extreme value theory for sample maxima can also be applied to sample minima. In particular, reciprocals of sample maxima become the sample minima for positive-valued random variables. If the original sampled variable has an upper bound c, then asymptotic extreme value theory for minima indicates that for sufficiently large samples the distribution of sample maxima reciprocals will be approximated by a 3-parameter Weibull distribution with distribution function:

$$W(z) = 1 - \exp\left\{-\left[\frac{z - c^{-1}}{a}\right]^{\gamma}\right\}, \qquad z \ge c^{-1} \tag{1}$$

W(z) here is the probability of a smaller reciprocal value, and c⁻¹, a, and γ are the Weibull location, scale, and shape parameters, respectively. The parameter c is the upper bound to the original data.
From the inverse of W(z), it is evident that z will plot as a linear function of r, where $r = \{-\ln[1 - W(z)]\}^{1/\gamma}$ and γ is the Weibull shape parameter. Specifically, z = c⁻¹ + a r, so the vertical intercept is c⁻¹ and the slope is a.
It follows that a graphical check of the fit of the 3-parameter Weibull distribution to maxima reciprocals can be made if values of c⁻¹, a, and γ can be found so that the reciprocals plot close to the linear function of r.
The Weibull plots for the Buller River discharge reciprocals and the Lake Tekapo rainfall reciprocals are shown in Fig. 3 and Fig. 4, respectively. The $r_i$ values for the data were obtained here by estimating the W(z) values with the Gringorten plotting position expression.
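The Weibull plot of reciprocals described above can be sketched in a few lines. This is an illustrative sketch rather than the authors' code: it assumes the shape parameter (written `gamma` here) is supplied, uses the Gringorten expression (i − 0.44)/(n + 0.12) for W, and the function names and sample values are hypothetical.

```python
import math

def weibull_plot_points(maxima, gamma):
    """Return (r, z) pairs for reciprocals z = 1/x ranked in increasing order.

    W is estimated by the Gringorten plotting position and
    r = (-ln(1 - W)) ** (1 / gamma).
    """
    z = sorted(1.0 / x for x in maxima)
    n = len(z)
    points = []
    for i, zi in enumerate(z, start=1):
        w = (i - 0.44) / (n + 0.12)
        r = (-math.log(1.0 - w)) ** (1.0 / gamma)
        points.append((r, zi))
    return points

def weibull_line(r, c, a):
    """Straight line z = 1/c + a * r implied by the 3-parameter Weibull model."""
    return 1.0 / c + a * r

# Hypothetical maxima, for illustration only
sample = [3120.0, 4550.0, 2980.0, 6010.0, 3875.0, 5240.0, 4100.0]
pts = weibull_plot_points(sample, gamma=2.0)

# Intercept of the line using the parameter values quoted in the Fig. 3 caption
intercept = weibull_line(0.0, c=9809.0, a=1.3488e-4)
```

A plot of the `pts` pairs that tracks the straight line down to a positive intercept would be suggestive of a finite upper bound c to the original maxima.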
There is evidently something of a contradiction between the two extreme value approaches applied to these data sets. The Type 1 and Type 2 extreme value similarities to the data could be interpreted as being consistent with an absence of upper bounds. However, the Weibull reciprocal plots from the same data are suggestive of upper bounds, particularly for the Buller River example. This indicates that some apparent Type 1 and Type 2 extreme value data plots can be equally well described by the reciprocal extreme value model with an upper magnitude bound.
A 3-parameter Weibull match to maxima reciprocals, while interesting, is not helpful in itself for practical application. This is because all the extrapolation would be within the narrow region between the smallest reciprocal value and zero. This would inevitably result in considerable error if Weibull parameter estimation was carried out using the data reciprocals.
It is therefore more useful to work with the original untransformed data. To this end, an alternative distribution for maxima can be defined as the distribution of the reciprocals of random variables from the 3-parameter Weibull distribution defined by Eq. (1). The implication is that if sample size is large enough for Eq. (1) to apply to reciprocals, then the distribution of 3-parameter Weibull random variable reciprocals would be a useful distribution to apply to the original maxima. This distribution is referred to here by the utility name of "bounded inverse Weibull distribution", described in the next section.

Bounded inverse Weibull distribution
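The bounded inverse Weibull distribution is defined here as the distribution of reciprocals of random variables from a 3-parameter Weibull distribution with a positive location parameter. The distribution function and density function are given respectively by:

$$F(x) = \exp\left\{-\left[\frac{x^{-1} - c^{-1}}{a}\right]^{\gamma}\right\}, \qquad 0 < x \le c \tag{2}$$

$$f(x) = \frac{\gamma}{a x^{2}} \left[\frac{x^{-1} - c^{-1}}{a}\right]^{\gamma - 1} \exp\left\{-\left[\frac{x^{-1} - c^{-1}}{a}\right]^{\gamma}\right\}, \qquad 0 < x \le c \tag{3}$$

where c is the distribution upper bound and a and γ are the Weibull scale and shape parameters.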
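The distribution and density functions can be written as short functions for numerical checking. This is a minimal sketch under the definition above (reciprocal of a 3-parameter Weibull variable with location 1/c, scale a, and shape written here as `gamma`). The function names are hypothetical, and the density is evaluated only on the open interval 0 < x < c.

```python
import math

def _u_and_power(x, c, a, gamma):
    """Return u = (1/x - 1/c)/a and u**gamma, guarding against overflow near x = 0."""
    u = (1.0 / x - 1.0 / c) / a
    try:
        return u, u ** gamma
    except OverflowError:
        return u, math.inf

def biw_cdf(x, c, a, gamma):
    """Distribution function: P(X <= x) for upper bound c, scale a, shape gamma."""
    if x <= 0.0:
        return 0.0
    if x >= c:
        return 1.0
    _, ug = _u_and_power(x, c, a, gamma)
    return math.exp(-ug)

def biw_pdf(x, c, a, gamma):
    """Density on 0 < x < c; returns 0 outside that open interval."""
    if x <= 0.0 or x >= c:
        return 0.0
    u, ug = _u_and_power(x, c, a, gamma)
    if math.isinf(ug):
        return 0.0
    return gamma / (a * x * x) * u ** (gamma - 1.0) * math.exp(-ug)

# Parameter values quoted in the Fig. 3 caption (Buller River)
c, a, gamma = 9809.0, 1.3488e-4, 2.07
print(biw_cdf(4000.0, c, a, gamma), biw_pdf(4000.0, c, a, gamma))
```

A quick consistency check is that the density integrates to one over (0, c) and agrees with a finite-difference derivative of the distribution function.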
The bounded inverse Weibull distribution should not be confused with the 3-parameter inverse Weibull distribution, which is unbounded above (Marušić et al., 2010).
The moments of the bounded inverse Weibull distribution do not appear to be amenable to analytical expression, so moment-based parameter estimators are not available.
The situation where the shape parameter γ > 1 is likely to be of most interest for application to block maxima. This corresponds to f(c) = 0, f(x) tending toward zero as x approaches zero from above, and a single mode m within the range 0 < m < c.
A lower bound approximation to the mode is given by Eq. (4). If c is held constant then a → 0 implies m → c.
If other parameters are held constant then increasing the upper bound c causes F(x) to tend toward the distribution function of a Type 2 extreme value distribution with its location parameter at zero. It is of interest to see how the bounded inverse Weibull distribution displays in Gumbel plots, that is, plots where magnitude x (vertical axis) is expressed as a function of the horizontal Gumbel variate y = −ln{−ln[F(x)]}. For the bounded inverse Weibull distribution, this function is given by:

$$x = \left[c^{-1} + a \exp(-y/\gamma)\right]^{-1} \tag{5}$$

which is a rising S-shaped form with a single inflection point at y* = γ ln(ac), where γ is the shape parameter. That is, the second derivative is positive for y < y* and negative for y > y*. Gumbel data plots from bounded inverse Weibull distributions may therefore show increasing gradients, decreasing gradients, or have S-shaped forms, depending on whether y* is located above, below, or within the data range.
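The Gumbel plot function and its inflection point can be checked numerically. The sketch below assumes the relation x = 1/(1/c + a exp(−y/γ)), which follows from setting y = −ln{−ln[F(x)]} in the distribution function, with the shape parameter written as `gamma`. The function names are hypothetical, and the parameter values are those quoted in the Fig. 3 and Fig. 6 captions.

```python
import math

def biw_gumbel_x(y, c, a, gamma):
    """Magnitude x as a function of the Gumbel variate y."""
    return 1.0 / (1.0 / c + a * math.exp(-y / gamma))

def biw_inflection(c, a, gamma):
    """Gumbel variate y* at which the Gumbel plot changes curvature."""
    return gamma * math.log(a * c)

# Buller River parameter values from the figure captions
c, a, gamma = 9809.0, 1.3488e-4, 2.07
y_star = biw_inflection(c, a, gamma)
print(round(y_star, 2))  # about 0.58, the value quoted for Fig. 6

# Sign of the second difference on either side of y*
def second_difference(y, h=0.01):
    return (biw_gumbel_x(y + h, c, a, gamma)
            - 2.0 * biw_gumbel_x(y, c, a, gamma)
            + biw_gumbel_x(y - h, c, a, gamma))

below, above = second_difference(y_star - 1.0), second_difference(y_star + 1.0)
```

Because the relation is a logistic curve in y, the magnitude at the inflection point is exactly half the upper bound, x(y*) = c/2.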
In contrast, Gumbel plots for the extreme value distributions do not have inflection points, with Type 1 distributions plotting as straight lines and Type 2 distributions having rising curves with increasing gradient.
If a bounded inverse Weibull distribution quantile value is fixed at k such that F(k) = exp(−1), then increasing the shape parameter γ while holding the upper bound c fixed will result in the distribution moving toward a Type 1 extreme value distribution which has k as its location parameter. Good matches of Type 1 and Type 2 extreme value distributions to block maxima are therefore not evidence for the absence of an upper bound. The bounded inverse Weibull model predicts that all Gumbel plots are of S-shaped form in reality. However, impossibly long data records might sometimes be required to confirm the final S-form. The issue then becomes whether the largest few values in an existing record are close enough to the upper bound to reflect its influence and permit a data-based estimate of c.
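The two limiting cases can be made explicit with a short worked calculation. This is a heuristic first-order sketch, not the authors' derivation. It writes the shape parameter as γ and restates the distribution function so that the block is self-contained.

```latex
% Bounded inverse Weibull distribution function (upper bound c, scale a, shape \gamma)
F(x) = \exp\left\{-\left[\frac{x^{-1} - c^{-1}}{a}\right]^{\gamma}\right\},
\qquad 0 < x \le c

% Type 2 limit: let c \to \infty with a and \gamma held fixed, so c^{-1} \to 0
\lim_{c \to \infty} F(x) = \exp\left\{-(a x)^{-\gamma}\right\}, \qquad x > 0

% Type 1 limit: fixing F(k) = e^{-1} requires a = k^{-1} - c^{-1}.
% Expanding the Gumbel plot relation to first order for |y| \ll \gamma:
x(y) = \left[c^{-1} + a\, e^{-y/\gamma}\right]^{-1}
     \approx \left[k^{-1} - \frac{a y}{\gamma}\right]^{-1}
     \approx k + \frac{a k^{2}}{\gamma}\, y
```

The first limit is the Type 2 distribution function with its location parameter at zero. In the second case the Gumbel plot is locally a straight line through x = k at y = 0, and the range of y over which the linear approximation holds widens as γ increases.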
Fig. 5 illustrates some bounded inverse Weibull distribution forms and their associated Gumbel plots. All the Gumbel plots extend over the range of y from −1.53 to 4.6, corresponding to F(x) = 0.01 and F(x) = 0.99, respectively. The Fig. 5(a) Gumbel plot has an inflection point at y* = 2.08 and the distribution has similarities with the Type 1 extreme value distribution. The Fig. 5(b) distribution is similar in form to a Type 2 extreme value distribution with a location parameter at zero. Its associated Gumbel plot has an inflection point at y* = 3.6. The inflection point for the Fig. 5(c) Gumbel plot is located at y* = −1.83. This is below the y range of the plot, so the graphed curve displays a consistently decreasing gradient.
The S-shaped Gumbel plot form of bounded inverse Weibull distributions is most evident in the parameter combination of Fig. 5(d), where the Gumbel plot inflection point is more clearly visible at y* = 1.65. Such distinct S-shaped forms might be interpreted as arising from mixtures of distributions. However, the bounded inverse Weibull distribution can accommodate some plots of this type without bringing in additional parameters from a second probability distribution.
Other parameter combinations can be set up which give "unnatural" distribution forms, not shown here because they would be of no value as descriptors of real-world block maxima distributions.
A feature of the bounded inverse Weibull distribution is that, like the extreme value distributions for maxima, it has a reproductive property with respect to its sample maxima. If random samples of size n are taken from a bounded inverse Weibull distribution with upper bound c, scale a, and shape γ, then the distribution of its sample maxima is a bounded inverse Weibull distribution with parameters (c, $a_n$, γ), where $a_n = a n^{-1/\gamma}$. This means that if it happens that individual events within a year can be approximated as bounded inverse Weibull random variables, then there is empirical support for also using the distribution for annual maxima, with parameter adjustment for a. Zaghloul et al. (2020) make similar comments with respect to the reproductive property of the unbounded Burr Type 3 distribution.
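The reproductive property can be verified numerically, since the distribution function of the maximum of n independent values is F(x) raised to the power n. The sketch below compares that quantity with the bounded inverse Weibull distribution function evaluated at the adjusted scale a·n^(−1/γ). The parameter values and function names are hypothetical and chosen only for illustration.

```python
import math

def biw_cdf(x, c, a, gamma):
    """Bounded inverse Weibull distribution function on 0 < x < c."""
    u = (1.0 / x - 1.0 / c) / a
    return math.exp(-u ** gamma)

def maxima_scale(a, n, gamma):
    """Scale parameter of the distribution of maxima of samples of size n."""
    return a * n ** (-1.0 / gamma)

# Hypothetical parameters: upper bound, scale, shape, and events per block
c, a, gamma, n = 300.0, 0.02, 2.5, 25

checks = []
for x in (40.0, 80.0, 150.0, 250.0):
    lhs = biw_cdf(x, c, a, gamma) ** n                       # F(x)^n
    rhs = biw_cdf(x, c, maxima_scale(a, n, gamma), gamma)    # adjusted scale
    checks.append((x, lhs, rhs))
    print(f"x = {x:6.1f}   F^n = {lhs:.6e}   adjusted = {rhs:.6e}")
```

The two columns agree to rounding error, while the upper bound and shape parameter are unchanged.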

Parameter estimation and an upper confidence bound to c
Distribution location parameters outside of the data range can make estimation difficult and may create considerable estimation error.This applies particularly for practical application of the bounded inverse Weibull distribution, where the physical upper bound to rainfall or flood magnitude may be some distance away from the largest recorded event magnitude.
Maximum likelihood estimation is one convenient approach for bounded inverse Weibull distribution estimation, given the absence of moment estimators. Maximising the log likelihood function simplifies to finding the maximum of a function defined by two parameters rather than three (Appendix). Plotting the log likelihood function over a range of trial c values provides a necessary check on whether estimation of c is feasible in any given instance. In particular, estimation of the upper bound will not be possible if the likelihood increases without limit for increasing trial values of c. The maximum likelihood estimate c* would also not be helpful if it was less than the largest recorded value. Some preliminary study suggests that c* is biased toward under-estimation. However, this can be offset in part by obtaining a 95% upper confidence bound to c by using a bootstrap approach. This bound is likely to be more useful than c* itself for making inferences about large magnitudes beyond the data range.
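The check of the log likelihood over trial c values can be sketched as below. The Appendix is not reproduced in this section, so the reduction used here is an assumption: for fixed upper bound c and shape (written `gamma`), the scale that maximises the likelihood is taken to be a = [mean((1/x_i − 1/c)^gamma)]^(1/gamma), which follows from differentiating the log likelihood of the density with respect to a. The golden-section search assumes a single maximum in gamma on the search interval. All function names and the data values are hypothetical.

```python
import math

def loglik(data, c, a, gamma):
    """Log likelihood of the bounded inverse Weibull density; needs c > max(data)."""
    total = 0.0
    for x in data:
        u = (1.0 / x - 1.0 / c) / a
        total += (math.log(gamma) - math.log(a) - 2.0 * math.log(x)
                  + (gamma - 1.0) * math.log(u) - u ** gamma)
    return total

def scale_given(data, c, gamma):
    """Scale a maximising the log likelihood for fixed c and gamma (assumed form)."""
    mean_sg = sum((1.0 / x - 1.0 / c) ** gamma for x in data) / len(data)
    return mean_sg ** (1.0 / gamma)

def best_gamma(data, c, lo=0.2, hi=20.0, tol=1e-6):
    """Golden-section search for the gamma maximising the profile log likelihood."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    f = lambda g: loglik(data, c, scale_given(data, c, g), g)
    x1, x2 = hi - phi * (hi - lo), lo + phi * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 < f2:                       # maximum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + phi * (hi - lo)
            f2 = f(x2)
        else:                             # maximum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - phi * (hi - lo)
            f1 = f(x1)
    g = 0.5 * (lo + hi)
    return g, f(g)

def profile_over_c(data, trial_cs):
    """Rows of (c, gamma_hat, a_hat, log likelihood) for each trial upper bound."""
    rows = []
    for c in trial_cs:
        g, ll = best_gamma(data, c)
        rows.append((c, g, scale_given(data, c, g), ll))
    return rows

# Hypothetical annual maxima, for illustration only
data = [3120.0, 4550.0, 2980.0, 6010.0, 3875.0,
        5240.0, 4100.0, 3500.0, 4800.0, 7050.0]
rows = profile_over_c(data, [7500.0, 9000.0, 12000.0, 20000.0])
for c, g, a_hat, ll in rows:
    print(f"c = {c:8.0f}  gamma = {g:6.3f}  a = {a_hat:.4e}  loglik = {ll:9.4f}")
```

If the tabulated log likelihood keeps rising as the trial c values increase, the data give no interior estimate of the upper bound. A clear interior maximum corresponds to the situation in which c* can be estimated.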
Given maximum likelihood estimates of the three unknown bounded inverse Weibull parameters from a sample of size n, an upper confidence bound to c can be obtained by a parametric bootstrap procedure. The inverse of Eq. (2) is first used to simulate N samples of size n, where N is sufficiently large. Alternatively, the N samples could be generated from the corresponding Weibull distribution and then converted to reciprocals.
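Both simulation routes can be sketched with the Python standard library. The sketch assumes the inverse of the distribution function is x = 1/(1/c + a(−ln p)^(1/γ)), with the shape written as `gamma`, and uses `random.weibullvariate(alpha, beta)`, whose arguments are the Weibull scale and shape. The function names are hypothetical, and the parameter values are those quoted for the Buller River.

```python
import math
import random

def biw_quantile(p, c, a, gamma):
    """Inverse of the distribution function: the x with F(x) = p, for 0 < p <= 1."""
    return 1.0 / (1.0 / c + a * (-math.log(p)) ** (1.0 / gamma))

def simulate_by_inversion(n, c, a, gamma, rng):
    """Draw n values by inverting F at uniform random probabilities in (0, 1]."""
    return [biw_quantile(1.0 - rng.random(), c, a, gamma) for _ in range(n)]

def simulate_by_reciprocal(n, c, a, gamma, rng):
    """Draw n values as reciprocals of 3-parameter Weibull variables."""
    return [1.0 / (1.0 / c + rng.weibullvariate(a, gamma)) for _ in range(n)]

# Buller River parameter values from the figure captions
c, a, gamma = 9809.0, 1.3488e-4, 2.07
rng = random.Random(1)
sample_a = simulate_by_inversion(2000, c, a, gamma, rng)
sample_b = simulate_by_reciprocal(2000, c, a, gamma, rng)
median = biw_quantile(0.5, c, a, gamma)
print(max(sample_a) <= c, max(sample_b) <= c, round(median, 1))
```

Every simulated value lies at or below the upper bound c, so each bootstrap sample can be refitted in the same way as the original data.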
For each of the N simulated samples, maximum likelihood parameter estimates are obtained and the N bound estimates $B_i$ are recorded. In the event that a simulated sample generates a likelihood which is monotonically increasing with c, the $B_i$ estimate is specified as a large unknown value. A sequence $D_i$ is defined as $D_i = B_i - c^*$, where c* is the maximum likelihood estimate of c from the original sample.

Examples
We consider again the Buller River annual flow maxima and Lake Tekapo annual rain maxima.
The Buller River data proved feasible for obtaining maximum likelihood estimates of all parameters. Inserting the estimates into Eq. (5) gives the function plot shown in Fig. 6. It is evident from comparison with Fig. 1 that the recorded maxima are consistent with both the Type 1 extreme value distribution and the bounded inverse Weibull distribution. That is, the inflection point in Fig. 6 at y* = 0.58 is hardly discernible, so there is little to choose between the two distributions over the data range. However, extrapolating beyond the largest flood event may be better achieved by the bounded inverse Weibull distribution, given evidence of a lower bound to the discharge reciprocals (Fig. 3). At the same time, due note must be taken of the estimation error of c. The Buller River example is helpful for illustrating the effect of estimation error and the importance of the largest recorded values. Prior to 2021, the largest flood in the period of observation was 8,192 m³ s⁻¹ in 1970. This was exceeded by a large event of 8,887 m³ s⁻¹ in July 2021. An upper bound estimate made at the end of 2020 would have given c* = 8,753 m³ s⁻¹ and $c_{0.95}$ = 10,214 m³ s⁻¹, with c* in this case being less than the flood magnitude recorded in the following year. Fig. 8 shows how incorporating the 2021 flood magnitude modified the upper tail form of the fitted distribution.
For the Lake Tekapo rainfall example, inserting the maximum likelihood parameter estimates into Eq. (5) gives the Gumbel plot shown in Fig. 9 and the probability distribution shown in Fig. 10. Comparison with Fig. 2 indicates that there is little difference between the data support for the bounded inverse Weibull distribution and an unbounded extreme value distribution, in this case the Type 2 extreme value distribution.
It can be seen from Fig. 10 that the maximum likelihood upper rainfall bound estimate of c* = 277 mm is some distance above the main body of the distribution, so there is likely to be considerable estimation error. Also, the reciprocal of the 95% upper confidence bound is 0.002 mm⁻¹, which is not far removed from zero when taking into account that simulating N = 1000 samples involves [...]tently from around 300 mm. As with the Buller example, this estimate is likely to be changed by the occurrence of a larger event.

Discussion
Asymptotic extreme value models for maxima have traditionally provided the theoretical justification for extrapolating hydrological magnitudes beyond the largest recorded value. As noted earlier, the theoretical validity of the unbounded Type 1 and Type 2 distributions to this end is questionable because they ignore the existence of physical upper bounds to the environmental processes involved.
The proposed alternative approaches the extrapolation issue by invoking the Weibull asymptotic extreme value distribution for minima, applied to maxima reciprocals. This is a considerable departure and has generated robust discussion. See, for example, the HESS discussions in Bardsley (2017). There is also a school of thought that has argued against even the concept of upper rainfall bounds (Koutsoyiannis, 2004). More recently, Lombardo et al. (2019) derive unbounded and supposedly exact distributions of hydrological extremes. However, we find it difficult to see how such distributions can be described as exact in the absence of an upper bound parameter.
There is no suggestion of course that the two examples considered here provide justification of an alternative general model of extreme value analysis incorporating upper bounds.Nonetheless, the examples do give proof of concept that, at least in some situations, the more physically justifiable Weibull-based upper bound models can be applicable to apparently unbounded data.
One specific aspect could be of interest for further work. An upper confidence bound to c for rainfall maxima has similarities to probable maximum precipitation (PMP), which has been defined as a rainfall magnitude having a "very small chance of being exceeded" (Salas et al., 2020). For situations where c estimates are possible for rainfall data, it would be useful to compare upper confidence bounds for various significance levels against any previously determined PMP magnitudes.

Conclusion
The use of the Weibull distribution for reciprocals of block maxima from sufficiently large samples is no less valid an application of asymptotic extreme value theory than applying the generalised extreme value distribution directly to the maxima. The case of particular interest is where the distribution of data reciprocals indicates the presence of a positive lower bound. This corresponds to the 3-parameter Weibull distribution being the appropriate asymptotic form of smallest extremes, showing that the data indicate the presence of an upper bound to the maxima.
In such situations, parameter estimation is best carried out using the distribution of Weibull reciprocals, referred to here as the bounded inverse Weibull distribution. It can be possible for apparent Type 1 and Type 2 extreme value data plots to contain sufficient information to enable maximum likelihood upper bound estimation via the bounded inverse Weibull distribution.
The method outlined here has potential for extreme value extrapolation beyond the largest recorded magnitude because any detected upper bound prevents unrealistically large magnitudes from low exceedance probabilities. Use of the upper confidence bound to c should offset the risk of anticipating future magnitudes which are too small because of estimator bias. However, the method remains a proposal for now and extensive application to data sets would be required to establish its practical value.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1. Gumbel plot of annual maximum discharges for the Buller River at Te Kuha, New Zealand. Data are for 1963-2021. The straight line is from a Type 1 extreme value distribution. All data plots in this paper use the Gringorten plotting position expression.

Fig. 2. Gumbel plot of annual maximum daily rainfalls from Lake Tekapo, New Zealand. Data are for 1927-2020. The curved line is from a Type 2 extreme value distribution.

Fig. 3. Weibull plot of reciprocals of Buller River discharge annual maxima. The parameter values for the line are shape γ = 2.07, location c⁻¹ = 1.0195 × 10⁻⁴ m⁻³ s, and scale a = 1.3488 × 10⁻⁴ m⁻³ s. The c⁻¹ value (vertical intercept) gives the Buller River upper discharge bound estimate as c* = 9,809 m³ s⁻¹. Parameter values were obtained by maximum likelihood fitting of the bounded inverse Weibull distribution (see Section 3).

Fig. 4. Weibull plot of reciprocals of Lake Tekapo daily rainfall annual maxima. The parameter values for the line are shape γ = 2.35, location c⁻¹ = 0.00361 mm⁻¹, and scale a = 0.02187 mm⁻¹. The c⁻¹ value (vertical intercept) gives the Lake Tekapo upper rainfall bound estimate as c* = 277 mm. Parameter values were obtained by maximum likelihood fitting of the bounded inverse Weibull distribution (see Section 3).
In the parametric bootstrap procedure for c, the differences $D_i = B_i - c^*$ between the simulated bound estimates $B_i$ and the maximum likelihood estimate c* are arranged in decreasing rank order, with any unknown large values grouped at the start. If a 95% upper confidence bound to c is required then the quantile $K_{0.05}$ is found, defined such that 5% of the $D_i$ values are smaller. The $c_{0.95}$ upper confidence bound to c is then obtained as $c_{0.95} = c^* - K_{0.05}$. A useful description for obtaining parametric bootstrap confidence intervals is given in MIT OpenCourseWare (Orloff and Bloom, 2014).
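The final step of the bootstrap procedure can be sketched as follows. The sketch assumes the differences are D_i = B_i − c*, consistent with unknown large bound estimates being grouped at the large end of the ranking, and it represents those unknown large values as `math.inf`. The function name and the numerical values are hypothetical.

```python
import math

def upper_confidence_bound(c_star, bound_estimates, level=0.95):
    """Upper confidence bound c* - K, where K is the lower (1 - level) quantile of D.

    bound_estimates holds the bootstrap estimates B_i, with math.inf for any
    simulated sample whose likelihood increases without limit in c.
    """
    d = sorted(b - c_star for b in bound_estimates)   # ascending; inf values last
    n_smaller = int(round((1.0 - level) * len(d)))    # count of D values below K
    k = d[n_smaller]
    if math.isinf(k):
        raise ValueError("too many unbounded bootstrap fits for this level")
    return c_star - k

# Hypothetical bootstrap output: 19 finite estimates and one unbounded fit
c_star = 100.0
b_values = [90.0 + i for i in range(19)] + [math.inf]
c_upper = upper_confidence_bound(c_star, b_values)
print(c_upper)  # 109.0
```

Sorting in ascending order is equivalent to the decreasing ranking described above, and the unbounded fits do not affect the result as long as they make up less than 95% of the simulated samples.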

Fig. 7 shows the degree of definition of the upper discharge bound estimate for the Buller River, for trial c values up to 15,000 m³ s⁻¹. The 95% upper confidence bound was obtained as $c_{0.95}$ = 11,252 m³ s⁻¹, from maximum likelihood c estimates from 1000 samples of size n = 59, simulated from the fitted distribution.

Fig. 6. Gumbel plot of annual maximum discharges from the Buller River. The curved line is from a bounded inverse Weibull distribution defined by maximum likelihood parameter estimates: shape γ* = 2.07, scale a* = 1.3488 × 10⁻⁴ m⁻³ s, and upper bound c* = 9,809 m³ s⁻¹.

Fig. 7. Plot of log-likelihood values for the Buller River data for trial values of c over the range of 9,000 to 15,000 m³ s⁻¹. The rising curve shows the corresponding values of the shape parameter γ that maximise the likelihood for given c.

Fig. 8. Bounded inverse Weibull probability density functions for the Buller River annual maxima, as obtained from maximum likelihood parameter estimation, with and without the 2021 flood event of 8,887 m³ s⁻¹. Left and right vertical arrows are the $c_{0.95}$ values obtained from excluding and including the 2021 flood event, respectively. The two distributions are defined over the discharge ranges set by the respective upper bounds of c = 8,753 m³ s⁻¹ and c = 9,809 m³ s⁻¹.

Fig. 10. Bounded inverse Weibull probability density function for the Lake Tekapo annual rainfall maxima, as obtained from maximum likelihood parameter estimation. The distribution is defined over the finite rainfall range extending to c = 277 mm.

Fig. 11. Plot of log-likelihood values for the Lake Tekapo rainfall maxima for trial values of c over the range of 200-1000 mm. The rising curve shows the corresponding values of the shape parameter γ that maximise the likelihood for given c.