Simultaneous confidence intervals for all pairwise differences between the coefficients of variation of rainfall series in Thailand

The delta-lognormal distribution is a combination of binomial and lognormal distributions, and so rainfall series that include zero and positive values conform to this distribution. The coefficient of variation is a good tool for measuring the dispersion of rainfall. Statistical estimation can be used not only to illustrate the dispersion of rainfall but also to describe the differences between rainfall dispersions from several areas simultaneously. Therefore, the purpose of this study is to construct simultaneous confidence intervals for all pairwise differences between the coefficients of variation of delta-lognormal distributions using three methods: fiducial generalized confidence interval, Bayesian, and the method of variance estimates recovery. Their performances were gauged by measuring their coverage probabilities together with their expected lengths via Monte Carlo simulation. The results indicate that the Bayesian credible interval using the Jeffreys’ rule prior outperformed the others in virtually all cases. Rainfall series from five regions in Thailand were used to demonstrate the efficacies of the proposed methods.


INTRODUCTION
Thailand is located in Southeast Asia and is classed as a tropical area. It is influenced by both the southwest and northeast monsoons. The southwest monsoon crosses Thailand from mid-May to mid-October (the rainy season) and brings moist air from the Indian Ocean that causes clouds and heavy rain. The northeast monsoon crosses Thailand from mid-October to mid-February (the winter season), causing cold and dry weather. Moreover, the South receives additional heavy rainfall due to moisture coming in from the Gulf of Thailand. From mid-February to mid-May (the summer season), the season changes and the weather is unsettled and influenced by tropical cyclones in the South China Sea; thus, it is generally hot and dry but often with heavy rain and thunderstorms (Thai Meteorological Department, 2015). Thailand often endures flooding due to thunderstorms, which can take lives and damage property, especially on farms, since Thailand is an agricultural country. Thailand is divided into five regions according to its climate pattern and meteorological conditions (Table 1) (Thai Meteorological Department, 2015). Therefore, it is important to investigate rainfall dispersion in each area to gain preliminary information for formulating policies to mitigate such incidents.
For statistical inference, the coefficient of variation (CV), the ratio of the standard deviation to the mean, is a good tool for investigating rainfall dispersion. The advantage of using the CV is that it is unitless and thus is useful for comparing dispersion across data series with different units or drastically different means. Confidence intervals for the CV, and for functions of CVs, have been presented for several distributions. Wong & Wu (2002) suggested a small-sample asymptotic method for constructing the confidence intervals for the CV of normal and non-normal distributions when the sample size is very small. Mahmoudvand & Hassani (2009) proposed two new methods for constructing the confidence intervals for the CV of a normal distribution and compared them with Miller's, McKay's, Vangel's, and Sharma-Krishna's methods; they found that their proposed methods are more appropriate than the others. Buntao & Niwitpong (2012) proposed the generalized pivotal approach (GPA) and a closed-form method for variance estimation for the difference between the CVs of lognormal and delta-lognormal distributions; their results show that the GPA is the most suitable. After that, they constructed the confidence intervals for the ratio of the CVs of delta-lognormal distributions using GPA and the method of variance estimates recovery (MOVER) (Buntao & Niwitpong, 2013); their results were similar to those for the difference between the CVs. Wongkhao, Niwitpong & Niwitpong (2015) presented the generalized confidence interval (GCI) and MOVER to construct the confidence intervals for the ratio of CVs of normal distributions and then compared their methods with the Verrill and Johnson and bootstrapping methods; they found that GCI and MOVER performed better than the others.
Sangnawakij & Niwitpong (2017a) proposed MOVER, GCI, and the asymptotic confidence interval (ACI) for constructing the confidence interval for the CV and difference between the CVs of two-parameter exponential distributions; their results show that GCI was appropriate for a single CV and ACI worked well for the difference between the CVs. In addition, confidence intervals were extended by Sangnawakij & Niwitpong (2017b) based on the score and Wald intervals for the difference between and ratio of CVs of two gamma distributions; their proposed methods performed well in a comparative study. Recently, Yosboonruang, Niwitpong & Niwitpong (2018) proposed GCI and a modified Fletcher method to construct the confidence intervals for the CV of a delta-lognormal distribution and found that GCI was the best. Afterward, they introduced the fiducial GCI (FGCI) and MOVER to construct the confidence intervals for the CV of a delta-lognormal distribution (Yosboonruang, Niwitpong & Niwitpong, 2019a). Moreover, they compared the confidence intervals based on FGCI and a Bayesian method for the CV of a delta-lognormal distribution (Yosboonruang, Niwitpong & Niwitpong, 2019b); their results indicate that the Bayesian method outperformed FGCI. Yosboonruang & Niwitpong (2020) constructed confidence intervals using GCI and MOVER based on variance stabilizing transformation, the Wilson score, and Jeffreys' method for the ratio of the CVs of delta-lognormal distributions; their results show that GCI was the most suitable. Yosboonruang, Niwitpong & Niwitpong (2020) presented FGCI and a Bayesian method to construct the confidence interval for the difference of CVs of delta-lognormal distributions; they concluded that the Bayesian method was the most appropriate.
Since dispersion in the precipitation series for different areas can be the same or different, simultaneous estimation for multiple areas is of interest, and simultaneous intervals have been investigated for various distributions and parameters. Mandel & Betensky (2008) introduced an algorithm for simultaneous confidence interval (SCI) construction and then compared bootstrapped and normal-based SCIs, in which the limits of the bootstrap intervals were smaller than the normal-based intervals. Donner & Zou (2011) used a two-step MOVER approach for constructing SCIs for multiple contrasts of binomial proportions; their proposed method was reasonable for small-to-moderate sample sizes. Abdel-Karim (2015) considered three methods (FGCI-MOVER, MOVER-MOVER, and simultaneous FGCI) to construct SCIs for the ratio of means of lognormal distributions and reported that the MOVER-MOVER method outperformed the others. Li, Song & Shi (2015) suggested parametric bootstrapping to construct SCIs for all pairwise differences between the means of two-parameter exponential distributions. Thangjai, Niwitpong & Niwitpong (2019) presented three methods (MOVER, a computational approach, and FGCI) to construct SCIs for all of the differences between the CVs of lognormal distributions; their results show that MOVER was the best and the computational approach performed similarly to MOVER when the sample size was large. In addition, Thangjai & Niwitpong (2020) used parametric bootstrapping, GCI, and MOVER for SCI construction for all of the differences between CVs in two-parameter exponential distributions; their results indicate that GCI was the most appropriate in most cases, while MOVER was the best for large sample sizes.
As mentioned above, rainfall series data follow a delta-lognormal distribution. Since our focus is on comparing the dispersion of rainfall from five regions in Thailand, the pairwise differences between the CVs of their rainfall data distributions are an interesting topic to study. Although numerous methods have been published for constructing SCIs for the differences between the parameters of several types of distributions, constructing SCIs for all of the pairwise differences between the CVs of delta-lognormal distributions has not yet been reported. GCI is a general method that is often used to construct confidence intervals, but FGCI is stronger than GCI since it provides asymptotically correct frequentist coverage (Hannig, Abdel-Karim & Iyer, 2006). Moreover, previous researchers have reported that MOVER is an appropriate method for constructing the SCIs for various parameters of several types of distributions. Therefore, one of our aims was to construct SCIs for this scenario based on FGCI and compare them with ones based on MOVER and Bayesian methodology. The coverage probability (the probability that the confidence interval covers the true value of the parameter) together with the expected length were used to assess the performance of the confidence intervals.

METHODS
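Let $X_i = (X_{i1}, X_{i2}, \ldots, X_{in_i})$, $i = 1,2,\ldots,k$, be a random sample from $k$ independent delta-lognormal distributions, denoted by $X_{ij} \sim \Delta(\mu_i, \sigma_i^2, \delta_{i(0)})$, where $\delta_{i(0)} = P(X_{ij} = 0)$. Since this distribution contains zero and positive values, the number of zero values follows a binomial distribution and the positive values follow a lognormal distribution, denoted by $n_{i(0)} \sim \mathrm{Bin}(n_i, \delta_{i(0)})$ and $Y_{ij} = \ln X_{ij} \sim N(\mu_i, \sigma_i^2)$, respectively. Here, $n_{i(0)}$ and $n_{i(1)}$ are the numbers of zero and positive values, respectively, where $n_i = n_{i(0)} + n_{i(1)}$. The distribution function of a delta-lognormal distribution is given by

$$G(x_{ij}; \mu_i, \sigma_i^2, \delta_{i(0)}) = \begin{cases} \delta_{i(0)}, & x_{ij} = 0, \\ \delta_{i(0)} + \delta_{i(1)}\, F(x_{ij}; \mu_i, \sigma_i^2), & x_{ij} > 0, \end{cases} \qquad (1)$$

where $F(\cdot\,; \mu_i, \sigma_i^2)$ is the lognormal distribution function and $\delta_{i(1)} = 1 - \delta_{i(0)}$. Following Aitchison (1955), the respective population mean and variance of $X_i$ are

$$E(X_i) = \delta_{i(1)} \exp\left(\mu_i + \frac{\sigma_i^2}{2}\right) \qquad (2)$$

and

$$\mathrm{Var}(X_i) = \delta_{i(1)} \exp\left(2\mu_i + \sigma_i^2\right)\left[\exp(\sigma_i^2) - \delta_{i(1)}\right]. \qquad (3)$$

Following this, the CV of $X_i$ can be expressed as

$$\nu_i = \frac{\sqrt{\mathrm{Var}(X_i)}}{E(X_i)} = \sqrt{\frac{\exp(\sigma_i^2) - \delta_{i(1)}}{\delta_{i(1)}}}. \qquad (4)$$

Since we are interested in constructing the SCIs for all pairwise differences between the CVs, the parameters of interest are

$$\nu_{il} = \nu_i - \nu_l = \sqrt{\frac{\exp(\sigma_i^2) - \delta_{i(1)}}{\delta_{i(1)}}} - \sqrt{\frac{\exp(\sigma_l^2) - \delta_{l(1)}}{\delta_{l(1)}}},$$

where $i,l = 1,2,\ldots,k$ and $i \neq l$. The maximum likelihood estimators of $\delta_{i(1)}$ and $\mu_i$ are $\hat{\delta}_{i(1)} = n_{i(1)}/n_i$ and $\hat{\mu}_i = \sum_{j=1}^{n_{i(1)}} \ln x_{ij} / n_{i(1)}$, respectively. Furthermore, the unbiased estimator for $\sigma_i^2$ is $\hat{\sigma}_i^2 = \sum_{j=1}^{n_{i(1)}} (\ln x_{ij} - \hat{\mu}_i)^2 / (n_{i(1)} - 1)$. Assuming that $\hat{\delta}_{i(1)}$ and $\hat{\sigma}_i^2$ are independent, the estimator of $\nu_i$ can be defined as

$$\hat{\nu}_i = \sqrt{\frac{\exp(\hat{\sigma}_i^2) - \hat{\delta}_{i(1)}}{\hat{\delta}_{i(1)}}}.$$

Similarly,

$$\hat{\nu}_{il} = \hat{\nu}_i - \hat{\nu}_l,$$

where $i,l = 1,2,\ldots,k$ and $i \neq l$.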
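As a minimal sketch of the point estimators defined above, the following Python function computes the proportion of positive values, the log-scale mean and variance, and the plug-in CV from a series containing zeros. The function and variable names are hypothetical, and the sketch assumes that non-positive entries other than zero do not occur.

```python
import math

def delta_lognormal_cv(sample):
    """Return (delta1_hat, mu_hat, sigma2_hat, cv_hat) for a series with zeros.

    Illustrative helper (name is an assumption, not from the paper).
    """
    positives = [x for x in sample if x > 0]
    n, n1 = len(sample), len(positives)
    if n1 < 2:
        raise ValueError("need at least two positive observations")
    delta1_hat = n1 / n                        # proportion of positive values
    logs = [math.log(x) for x in positives]
    mu_hat = sum(logs) / n1                    # mean of the log-positive values
    # unbiased variance of the log-positive values (divisor n1 - 1)
    sigma2_hat = sum((y - mu_hat) ** 2 for y in logs) / (n1 - 1)
    # plug-in CV: sqrt((exp(sigma^2) - delta1) / delta1)
    cv_hat = math.sqrt((math.exp(sigma2_hat) - delta1_hat) / delta1_hat)
    return delta1_hat, mu_hat, sigma2_hat, cv_hat

# toy series: two dry days and three wet days
print(delta_lognormal_cv([0, 0, 1.0, math.e, math.e ** 2]))
```

The estimated difference between two series is then simply the difference between their two `cv_hat` values.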

The simultaneous FGCIs
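To construct the simultaneous FGCIs, a fiducial generalized pivotal quantity (FGPQ), which is a subclass of the generalized pivotal quantity (GPQ) (Hannig, Iyer & Patterson, 2006), is presented as follows.
Definition 1 Let $X$ be a random sample whose distribution depends on unknown parameters, and let $X^{*}$ be an independent copy of $X$. A GPQ $R(X, X^{*}, \theta)$ is called an FGPQ if it satisfies the following two conditions (Weerahandi, 1993; Hannig, Iyer & Patterson, 2006): (i) the conditional distribution of $R(X, X^{*}, \theta)$ given $X = x$ is free of all unknown parameters; and (ii) for every $x$, $R(x, x, \theta) = \theta$, where $\theta$ is the parameter of interest.
The FGPQs for $\sigma_i^2$ and $\delta_{i(1)}$ can be constructed by applying Definition 1. According to Hannig, Iyer & Patterson (2006) and Li, Zhou & Tian (2013), the respective FGPQs for $\delta_{i(1)}$ and $\sigma_i^2$ are

$$R_{\delta_{i(1)}} \sim \frac{1}{2}\,\mathrm{Beta}\left(n_{i(1)}, n_{i(0)} + 1\right) + \frac{1}{2}\,\mathrm{Beta}\left(n_{i(1)} + 1, n_{i(0)}\right)$$

and

$$R_{\sigma_i^2} = \frac{(n_{i(1)} - 1)\,\hat{\sigma}_i^2}{U_i},$$

where $U_i \sim \chi^2_{n_{i(1)} - 1}$. Following this, the FGPQ for $\nu_i$ is simply

$$R_{\nu_i} = \sqrt{\frac{\exp(R_{\sigma_i^2}) - R_{\delta_{i(1)}}}{R_{\delta_{i(1)}}}}.$$

Hence, the FGPQ for the difference between two independent CVs can be expressed as

$$R_{\nu_{il}} = R_{\nu_i} - R_{\nu_l},$$

where $i,l = 1,2,\ldots,k$ and $i \neq l$.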
Therefore, the $100(1-\alpha)\%$ two-sided SCI for $\nu_i - \nu_l$ based on the FGCI method can be written as $L_{il} \le \nu_{il} \le U_{il}$, where $L_{il}$ and $U_{il}$ are the $\alpha/2$-th and $(1-\alpha/2)$-th quantiles of $R_{\nu_{il}}$, respectively.
Theorem 1 Let $X_i = (X_{i1}, X_{i2}, \ldots, X_{in_i})$, $i = 1,2,\ldots,k$, be a random sample from $k$ independent delta-lognormal distributions with mean $\mu_i$, variance $\sigma_i^2$, and probability of zero values $\delta_{i(0)}$. Let $\nu_i = \sqrt{(\exp(\sigma_i^2) - \delta_{i(1)})/\delta_{i(1)}}$ and $\nu_l = \sqrt{(\exp(\sigma_l^2) - \delta_{l(1)})/\delta_{l(1)}}$, for $i,l = 1,2,\ldots,k$ and $i \neq l$, be the CVs of $X_i$ and $X_l$, respectively. Furthermore, let $\hat{\nu}_i$ and $\hat{\nu}_l$ be the estimators of $\nu_i$ and $\nu_l$, respectively. The estimator for the variance of the difference between $\hat{\nu}_i$ and $\hat{\nu}_l$ is $\widehat{\mathrm{Var}}(\hat{\nu}_i - \hat{\nu}_l)$. Let $n_i$ be the sample size of the $i$-th random sample and $n = n_1 + n_2 + \cdots + n_k$. Assume that $n_i/n \to r_i$ as $n \to \infty$, where $0 < r_i < 1$. Therefore, $P(L_{il} \le \nu_i - \nu_l \le U_{il},\ \forall i \neq l) \to 1 - \alpha$ as $n \to \infty$.
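The FGCI construction can be illustrated by simulation. The sketch below is an assumption-laden illustration, not the authors' R code: it takes the FGPQ for the proportion of positive values to be an equal-weight mixture of Beta(n1, n0 + 1) and Beta(n1 + 1, n0) and the FGPQ for the log-scale variance to be (n1 - 1) times the sample variance divided by a chi-square draw, and it requires at least one zero and two positive values per series. All function names are hypothetical.

```python
import math
import random

def fgpq_cv_draws(n0, n1, sigma2_hat, draws, rng):
    """Simulate the FGPQ of the CV for one series (needs n0 >= 1, n1 >= 2)."""
    out = []
    for _ in range(draws):
        u = rng.gammavariate((n1 - 1) / 2.0, 2.0)   # chi-square with n1 - 1 df
        r_sigma2 = (n1 - 1) * sigma2_hat / u
        # assumed FGPQ for delta_(1): equal-weight mixture of two beta laws
        if rng.random() < 0.5:
            r_delta = rng.betavariate(n1, n0 + 1)
        else:
            r_delta = rng.betavariate(n1 + 1, n0)
        out.append(math.sqrt((math.exp(r_sigma2) - r_delta) / r_delta))
    return out

def empirical_quantile(sorted_vals, p):
    """Simple order-statistic quantile of an already sorted list."""
    idx = min(len(sorted_vals) - 1, max(0, int(p * len(sorted_vals))))
    return sorted_vals[idx]

def fgci_difference(group_i, group_l, alpha=0.05, draws=5000, seed=1):
    """Each group is (n0, n1, sigma2_hat); returns (L_il, U_il)."""
    rng = random.Random(seed)
    r_i = fgpq_cv_draws(*group_i, draws, rng)
    r_l = fgpq_cv_draws(*group_l, draws, rng)
    diff = sorted(a - b for a, b in zip(r_i, r_l))
    return (empirical_quantile(diff, alpha / 2),
            empirical_quantile(diff, 1 - alpha / 2))

# hypothetical summaries: 9 zeros, 22 positives, log-scale variances 1.0 and 1.5
print(fgci_difference((9, 22, 1.0), (9, 22, 1.5)))
```

For k series, the same function would be applied to every pair (i, l); a fixed seed is used here only so that the illustration is reproducible.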

The Bayesian method
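According to the distributions of $X_i$ for $i = 1,2,\ldots,k$ with the unknown parameters $\mu_i$, $\sigma_i^2$, and $\delta_{i(1)}$, the likelihood function is, up to a multiplicative constant,

$$L(\mu_i, \sigma_i^2, \delta_{i(1)} \mid x_{ij}) \propto \prod_{i=1}^{k} (1 - \delta_{i(1)})^{n_{i(0)}}\, \delta_{i(1)}^{n_{i(1)}}\, (\sigma_i^2)^{-n_{i(1)}/2} \exp\left\{-\frac{1}{2\sigma_i^2} \sum_{j=1}^{n_{i(1)}} (\ln x_{ij} - \mu_i)^2\right\}. \qquad (14)$$

By applying the second-order partial derivative of the log-likelihood function with respect to the unknown parameters, the Fisher information matrix of the unknown parameters is block diagonal, and the block for the $i$-th sample, with the parameters ordered as $(\delta_{i(1)}, \mu_i, \sigma_i^2)$, can be written as

$$I(\delta_{i(1)}, \mu_i, \sigma_i^2) = \mathrm{diag}\left[\frac{n_i}{\delta_{i(1)}(1 - \delta_{i(1)})},\ \frac{n_i \delta_{i(1)}}{\sigma_i^2},\ \frac{n_i \delta_{i(1)}}{2\sigma_i^4}\right].$$

In this paper, we constructed both equal-tailed SCIs and simultaneous credible intervals from simulated values of the posterior distributions under two priors, the Jeffreys' rule prior and the uniform prior; the suitability of each prior was determined by considering whether the simulated values from its posterior distributions correspond to those for a delta-lognormal distribution. See also Yosboonruang, Niwitpong & Niwitpong (2019b) and Yosboonruang, Niwitpong & Niwitpong (2020).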

The Jeffreys' rule prior
The Jeffreys' rule prior is obtained from the square root of the determinant of the Fisher information matrix (Jeffreys, 1946). It is well known that a delta-lognormal distribution comprises lognormal and binomial distributions. From the CV in Eq. (4), the parameters of interest are $\sigma_i^2$ and $\delta_{i(1)}$, and each receives its own Jeffreys' rule prior, $p(\sigma_i^2)$ and $p(\delta_{i(1)})$, respectively. Assuming that $\sigma_i^2$ and $\delta_{i(1)}$ are independent, the prior distribution for a delta-lognormal distribution can be defined as the product $p(\sigma_i^2, \delta_{i(1)}) = p(\sigma_i^2)\, p(\delta_{i(1)})$. By combining the likelihood function and the prior distribution of a delta-lognormal distribution, the joint posterior density function $p(\sigma_i^2, \delta_{i(1)} \mid x_{ij})$ is obtained (Eq. (16)). By integrating Eq. (16) with respect to the other parameters, the respective marginal posterior distributions of $\sigma_i^2$ and $\delta_{i(1)}$ are derived; they are of inverse-gamma and beta form, respectively.

The uniform prior
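Since the uniform prior assigns a constant prior density (Stone, 2013), the uniform priors of $\sigma_i^2$ and $\delta_{i(1)}$ are $p(\sigma_i^2) \propto 1$ and $p(\delta_{i(1)}) \propto 1$, respectively. Hence, the uniform prior for a delta-lognormal distribution becomes $p(\sigma_i^2, \delta_{i(1)}) \propto 1$. Similar to Eq. (16), the joint posterior density function is obtained by combining $p(\sigma_i^2, \delta_{i(1)})$ with the likelihood function from Eq. (14). Subsequently, the marginal posterior distributions of $\sigma_i^2$ and $\delta_{i(1)}$ are obtained by integrating the joint posterior density function with respect to the other parameters; they are of inverse-gamma form for $\sigma_i^2$ and beta form for $\delta_{i(1)}$. Therefore, the $100(1-\alpha)\%$ equal-tailed SCI and simultaneous credible interval for $\nu_{il}$ based on the Bayesian method are $L_{il} \le \nu_{il} \le U_{il}$, where $L_{il}$ and $U_{il}$ are the lower and upper bounds of the intervals, respectively.
Theorem 2 Let $X_i = (X_{i1}, X_{i2}, \ldots, X_{in_i})$, $i = 1,2,\ldots,k$, be random samples from $k$ independent delta-lognormal distributions with mean $\mu_i$, variance $\sigma_i^2$, and probability of zero values $\delta_{i(0)}$, with sample sizes $n_1, n_2, \ldots, n_k$ and $n = n_1 + n_2 + \cdots + n_k$. Let $n_i/n \to r_i$ as $n \to \infty$. Let $\nu_i = \sqrt{(\exp(\sigma_i^2) - \delta_{i(1)})/\delta_{i(1)}}$ and $\nu_l = \sqrt{(\exp(\sigma_l^2) - \delta_{l(1)})/\delta_{l(1)}}$ be the CVs of $X_i$ and $X_l$, respectively. Let $\hat{\nu}_i$ and $\hat{\nu}_l$ be the estimators of $\nu_i$ and $\nu_l$, respectively. An estimator for the variance of the difference between $\hat{\nu}_i$ and $\hat{\nu}_l$ is $\widehat{\mathrm{Var}}(\hat{\nu}_i - \hat{\nu}_l)$. Let $p(\sigma_i^2, \delta_{i(1)})$ and $p(\sigma_i^2, \delta_{i(1)} \mid x_{ij})$ be the prior distribution and the joint posterior density function for the delta-lognormal distribution, respectively. Therefore, $P(L_{il} \le \nu_i - \nu_l \le U_{il},\ \forall i \neq l) \to 1 - \alpha$ as $n \to \infty$.
Proof The proof is similar to that of Theorem 1.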
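To make the distinction between the equal-tailed interval and the credible interval concrete, the following sketch draws from an inverse-gamma posterior for the log-scale variance and a beta posterior for the proportion of positive values, and then extracts both an equal-tailed interval and the shortest interval containing the same number of draws (an HPD-style interval). The posterior hyperparameters are passed in by the caller; the values in the usage lines are textbook conjugate choices assumed purely for illustration and are not necessarily those implied by the priors in this paper. All names are hypothetical.

```python
import math
import random

def posterior_cv_draws(beta_a, beta_b, ig_shape, ig_scale, draws, rng):
    """Draw CV values given Beta(beta_a, beta_b) and Inv-Gamma(ig_shape, ig_scale)."""
    out = []
    for _ in range(draws):
        delta1 = rng.betavariate(beta_a, beta_b)
        sigma2 = ig_scale / rng.gammavariate(ig_shape, 1.0)  # inverse-gamma draw
        out.append(math.sqrt((math.exp(sigma2) - delta1) / delta1))
    return out

def equal_tailed_and_shortest(sorted_draws, alpha=0.05):
    """Return ((L, U) equal-tailed, (L, U) shortest) from sorted posterior draws."""
    n = len(sorted_draws)
    m = n - round(alpha * n)              # number of draws each interval spans
    lo = round(alpha * n / 2)
    equal_tailed = (sorted_draws[lo], sorted_draws[lo + m - 1])
    # slide a window of m consecutive draws and keep the narrowest one
    start = min(range(n - m + 1),
                key=lambda s: sorted_draws[s + m - 1] - sorted_draws[s])
    shortest = (sorted_draws[start], sorted_draws[start + m - 1])
    return equal_tailed, shortest

# ASSUMED hyperparameters for a series with 9 zeros, 22 positives, variance 1.0
n0, n1, s2 = 9, 22, 1.0
args = (n1 + 0.5, n0 + 0.5, (n1 - 1) / 2, (n1 - 1) * s2 / 2)
rng = random.Random(7)
diff = sorted(a - b for a, b in zip(posterior_cv_draws(*args, 4000, rng),
                                    posterior_cv_draws(*args, 4000, rng)))
print(equal_tailed_and_shortest(diff))
```

Because the equal-tailed window is one of the candidate windows in the search, the shortest interval can never be wider than the equal-tailed one.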

MOVER
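The concept of MOVER proposed by Donner & Zou (2012) can be applied to construct the $100(1-\alpha)\%$ two-sided confidence interval of $\nu_i - \nu_l$ for $i,l = 1,2,\ldots,k$ and $i \neq l$, for which $L_{il} \le \nu_{il} \le U_{il}$, where $L_{il}$ and $U_{il}$ denote the lower and upper limits of the confidence interval, respectively, expressed as

$$L_{il} = \hat{\nu}_i - \hat{\nu}_l - \sqrt{(\hat{\nu}_i - l_i)^2 + (u_l - \hat{\nu}_l)^2}$$

and

$$U_{il} = \hat{\nu}_i - \hat{\nu}_l + \sqrt{(u_i - \hat{\nu}_i)^2 + (\hat{\nu}_l - l_l)^2},$$

where $(l_i, u_i)$ and $(l_l, u_l)$ are the confidence limits for $\nu_i$ and $\nu_l$, respectively, $i,l = 1,2,\ldots,k$, and $i \neq l$. From Eq. (4), the parameters of interest are $\delta_{i(1)}$ and $\sigma_i^2$, and so the confidence intervals for these parameters can be constructed.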
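Since the unbiased estimator of $\sigma_i^2$ is given by $\hat{\sigma}_i^2 = \sum_{j=1}^{n_{i(1)}} (\ln x_{ij} - \hat{\mu}_i)^2 / (n_{i(1)} - 1)$, it follows that $(n_{i(1)} - 1)\hat{\sigma}_i^2/\sigma_i^2 \sim \chi^2_{n_{i(1)} - 1}$. Consequently, the respective lower and upper bounds for $\sigma_i^2$ are defined as

$$l_{\sigma_i^2} = \frac{(n_{i(1)} - 1)\,\hat{\sigma}_i^2}{\chi^2_{1-\alpha/2,\, n_{i(1)} - 1}}$$

and

$$u_{\sigma_i^2} = \frac{(n_{i(1)} - 1)\,\hat{\sigma}_i^2}{\chi^2_{\alpha/2,\, n_{i(1)} - 1}},$$

where $\chi^2_{p,\, n_{i(1)} - 1}$ denotes the $p$-th quantile of the chi-square distribution with $n_{i(1)} - 1$ degrees of freedom. The score method proposed by Wilson (1927) is used to construct the confidence limits for $\delta_{i(1)}$. According to Brown, Cai & DasGupta (2001) and Donner & Zou (2011), the respective lower and upper limits of $\delta_{i(1)}$ are given by

$$l_{\delta_{i(1)}} = \frac{n_{i(1)} + Z_i^2/2}{n_i + Z_i^2} - \frac{Z_i}{n_i + Z_i^2} \sqrt{\frac{n_{i(0)}\, n_{i(1)}}{n_i} + \frac{Z_i^2}{4}}$$

and

$$u_{\delta_{i(1)}} = \frac{n_{i(1)} + Z_i^2/2}{n_i + Z_i^2} + \frac{Z_i}{n_i + Z_i^2} \sqrt{\frac{n_{i(0)}\, n_{i(1)}}{n_i} + \frac{Z_i^2}{4}},$$

where $Z_i$, $i = 1,2,\ldots,k$, refers to the standard normal distribution (in practice, its $(1-\alpha/2)$-th quantile). The confidence limits for $\sigma_l^2$ and $\delta_{l(1)}$ are constructed in the same way. Therefore, the $100(1-\alpha)\%$ two-sided SCIs for $\nu_i - \nu_l$ based on the MOVER method are $[L_{il}, U_{il}]$, where $i,l = 1,2,\ldots,k$ and $i \neq l$.
Theorem 3 Let $X_i = (X_{i1}, X_{i2}, \ldots, X_{in_i})$, $i = 1,2,\ldots,k$, be random samples from $k$ independent delta-lognormal distributions with mean $\mu_i$, variance $\sigma_i^2$, and probability of zero values $\delta_{i(0)}$. Furthermore, let the sample size of the $i$-th random sample be $n_i$, where $n = n_1 + n_2 + \cdots + n_k$ and $n_i/n \to r_i$ as $n \to \infty$, for which $0 < r_i < 1$. Let $\nu_i = \sqrt{(\exp(\sigma_i^2) - \delta_{i(1)})/\delta_{i(1)}}$ and $\nu_l = \sqrt{(\exp(\sigma_l^2) - \delta_{l(1)})/\delta_{l(1)}}$, for $i,l = 1,2,\ldots,k$ and $i \neq l$, be the CVs of $X_i$ and $X_l$, respectively. In addition, let $\hat{\nu}_i$ and $\hat{\nu}_l$ be the estimators of $\nu_i$ and $\nu_l$, respectively. Let $L_{il}$ and $U_{il}$, for $i,l = 1,2,\ldots,k$ and $i \neq l$, be the respective lower and upper limits of the confidence interval for $\nu_{il} = \nu_i - \nu_l$. Therefore, $P(L_{il} \le \nu_{il} \le U_{il},\ \forall i \neq l) \to 1 - \alpha$ as $n \to \infty$.
Proof Suppose that the respective lower and upper limits of the confidence intervals for $\nu_i$ and $\nu_l$ are $(l_i, u_i)$ and $(l_l, u_l)$, where $i,l = 1,2,\ldots,k$ and $i \neq l$. Thus, the respective estimators of variance for $\hat{\nu}_i$ and $\hat{\nu}_l$ at $\nu_i = l_i$ and $\nu_l = l_l$ are $(\hat{\nu}_i - l_i)^2 / z_{\alpha/2}^2$ and $(\hat{\nu}_l - l_l)^2 / z_{\alpha/2}^2$, where $z_{\alpha/2}$ is the $\alpha/2$-th quantile of the standard normal distribution. Similarly, the respective estimators of variance for $\hat{\nu}_i$ and $\hat{\nu}_l$ at $\nu_i = u_i$ and $\nu_l = u_l$ are $(u_i - \hat{\nu}_i)^2 / z_{\alpha/2}^2$ and $(u_l - \hat{\nu}_l)^2 / z_{\alpha/2}^2$. Hence, the respective lower and upper limits can be expressed as

$$L_{il} = \hat{\nu}_i - \hat{\nu}_l - \sqrt{(\hat{\nu}_i - l_i)^2 + (u_l - \hat{\nu}_l)^2}$$

and

$$U_{il} = \hat{\nu}_i - \hat{\nu}_l + \sqrt{(u_i - \hat{\nu}_i)^2 + (\hat{\nu}_l - l_l)^2}.$$

Suppose that $n_i/n \to r_i \in (0,1)$ as $n \to \infty$, $i = 1,2,\ldots,k$, where $n = n_1 + n_2 + \cdots + n_k$.
From the central limit theorem, $\sqrt{n}\,(\hat{\nu}_i - \nu_i)$ is asymptotically normal. Following Skorokhod's theorem, let $Y_n$ and $Y$ be random variables from the common probability space with distributions $D_n$ and $D$, respectively. Thus, $Y_n$ converges to $Y$ almost surely, denoted by $Y_n \xrightarrow{a.s.} Y$, and $D_n$ converges to $D$ almost surely, denoted by $D_n \xrightarrow{a.s.} D$. Assume that $Z_i$ and $Z_i^{*}$ are independent and identically distributed random variables. Thus, the limiting distribution of $T(X, X^{*}, \mu, \sigma^2)$ exists for $i,l = 1,2,\ldots,k$ and $i \neq l$. Since the limiting distribution of $T(X, X^{*}, \mu, \sigma^2)$ is continuous and $z_{\alpha/2}(X) \to q_{\alpha/2}$, where $q_{\alpha/2}$ is the $\alpha/2$-th quantile of the distribution of $D^{*}$, we can obtain $P(D_n \le z_{\alpha/2}) \to P(D \le q_{\alpha/2}) = P(D^{*} \le q_{\alpha/2}) = 1 - \alpha$ as $n \to \infty$. Therefore, $P\left(\nu_{il} \in \hat{\nu}_{il} \pm z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}(\hat{\nu}_i) + \widehat{\mathrm{Var}}(\hat{\nu}_l)},\ \forall i \neq l\right) \to 1 - \alpha$, which implies that $P(L_{il} \le \nu_{il} \le U_{il},\ \forall i \neq l) \to 1 - \alpha$.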
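Two building blocks of the MOVER construction can be sketched in a few lines of Python: the Wilson score limits for the proportion of positive values, and the recovery step that turns separate limits for two CVs into limits for their difference. This is a minimal illustration using the standard MOVER formula for a difference; the function names are hypothetical, and how the limits for each individual CV are derived from the limits for the variance and the proportion is deliberately left to the caller.

```python
import math
from statistics import NormalDist

def wilson_limits(n1, n, alpha=0.05):
    """Wilson score limits for the proportion of positive values n1 / n."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = (n1 + z * z / 2) / (n + z * z)
    half = z / (n + z * z) * math.sqrt(n1 * (n - n1) / n + z * z / 4)
    return centre - half, centre + half

def mover_difference(est_i, l_i, u_i, est_l, l_l, u_l):
    """MOVER limits for est_i - est_l from the separate limits of each term."""
    diff = est_i - est_l
    # lower limit uses the lower limit of term i and the upper limit of term l
    lower = diff - math.sqrt((est_i - l_i) ** 2 + (u_l - est_l) ** 2)
    # upper limit uses the upper limit of term i and the lower limit of term l
    upper = diff + math.sqrt((u_i - est_i) ** 2 + (est_l - l_l) ** 2)
    return lower, upper

print(wilson_limits(22, 31))
# hypothetical CV estimates with hypothetical individual limits
print(mover_difference(5.0, 4.0, 6.0, 3.0, 2.0, 4.0))
```

The recovery step needs only point estimates and individual limits, which is why MOVER remains a closed-form method even when the individual limits are asymmetric.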

SIMULATION RESULTS
Here, the performances of the proposed methods, assessed via Monte Carlo simulation with the R statistical program, are presented. The best method attains a coverage probability equal to or greater than the nominal simultaneous confidence level of 0.95 together with the shortest expected length. The simulations were conducted with 15,000 iterations for each combination of parameters. Furthermore, 5,000 replications for the FGCI and Bayesian methods were carried out for each parameter combination. The settings were $n_i = 25, 50, 100$; $\delta_{i(1)} = 0.2, 0.5, 0.8$; and $\sigma_i^2 = 0.5, 1.0, 2.0$. The results in Tables 2-4 and Figs. 1-3 show that the coverage probabilities of FGCI and the equal-tailed Bayesian interval using Jeffreys' rule prior were close to or greater than the nominal confidence level for almost all $k$ values. Similarly, the coverage probabilities of the equal-tailed Bayesian interval using the uniform prior, the Bayesian credible intervals using Jeffreys' rule and uniform priors, and MOVER were close to or greater than the nominal confidence level for all cases. For most cases, the Bayesian credible interval using Jeffreys' rule prior attained the shortest expected length, except for $n_i = 50$; $\delta_{i(1)} = 0.5, 0.8$; and $\sigma_i^2 = 0.5, 1.0$, for which the expected lengths of FGCI were the shortest.
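The two performance criteria can be made concrete with a small harness. The following Python sketch (the study itself used R) generates delta-lognormal samples, applies any user-supplied interval function to every pair of series, and reports the simultaneous coverage probability and the average interval length. The function names and the `interval_fn` interface are assumptions made for this illustration.

```python
import math
import random
from itertools import combinations

def true_cv(sigma2, delta1):
    return math.sqrt((math.exp(sigma2) - delta1) / delta1)

def draw_sample(n, mu, sigma2, delta1, rng):
    """n draws: positive (lognormal) with probability delta1, otherwise zero."""
    sd = math.sqrt(sigma2)
    return [math.exp(rng.gauss(mu, sd)) if rng.random() < delta1 else 0.0
            for _ in range(n)]

def coverage_and_length(interval_fn, params, n, iterations, seed=1):
    """params: one (mu, sigma2, delta1) per series.
    interval_fn(sample_i, sample_l) must return (lower, upper)."""
    rng = random.Random(seed)
    cvs = [true_cv(s2, d1) for _, s2, d1 in params]
    pairs = list(combinations(range(len(params)), 2))
    covered, total_length = 0, 0.0
    for _ in range(iterations):
        samples = [draw_sample(n, mu, s2, d1, rng) for mu, s2, d1 in params]
        limits = [interval_fn(samples[i], samples[l]) for i, l in pairs]
        # simultaneous coverage: every pairwise difference must be captured
        if all(lo <= cvs[i] - cvs[l] <= hi
               for (lo, hi), (i, l) in zip(limits, pairs)):
            covered += 1
        total_length += sum(hi - lo for lo, hi in limits) / len(pairs)
    return covered / iterations, total_length / iterations

# placeholder interval function, used only to exercise the harness
params = [(0.0, 0.5, 0.5), (0.0, 1.0, 0.8), (0.0, 2.0, 0.2)]
print(coverage_and_length(lambda a, b: (-1e9, 1e9), params, n=25, iterations=50))
```

Replacing the placeholder with an FGCI, Bayesian, or MOVER interval function would give estimates of the quantities reported in the tables, subject to Monte Carlo error.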

EMPIRICAL STUDY
Thailand is generally divided into five areas by topography, i.e., Northern (A1), Northeastern (A2), Central (A3), Eastern (A4), and Southern (A5). The daily rainfall data from these areas in August 2020 were used to assess the performances of the proposed methods for SCI construction. The distributions of these data are presented in Fig. 4, which shows right-skewness for all of the datasets. Thus, the minimum Akaike information criterion (AIC) and the minimum Bayesian information criterion (BIC) were used to check the fit of candidate distributions to these data. From the AIC and BIC results in Table 5, it is evident that the positive values in the rainfall datasets from the five areas conform to lognormal distributions. Moreover, normal Q-Q plots were constructed to show the distributions of the log-transformed positive rainfall data from the five areas (Fig. 5), which support the AIC and BIC results that these datasets follow lognormal distributions. A summary of these data is as follows, where $\hat{\delta}_{i(1)}$ is the proportion of positive values and $\hat{\eta}_i$ is the estimated CV:
A1: $n_1 = 31$, $\hat{\delta}_{1(1)} = 0.7097$, $\hat{\mu}_1 = 0.7715$, $\hat{\sigma}_1^2 = 3.4565$, $\hat{\eta}_1 = 6.6088$
A2: $n_2 = 31$, $\hat{\delta}_{2(1)} = 0.6774$, $\hat{\mu}_2 = 1.4332$, $\hat{\sigma}_2^2 = 2.9550$, $\hat{\eta}_2 = 5.2294$
A3: $n_3 = 31$, $\hat{\delta}_{3(1)} = 0.6452$, $\hat{\mu}_3 = 1.5512$, $\hat{\sigma}_3^2 = 2.8638$, $\hat{\eta}_3 = 5.1154$
A4: $n_4 = 31$, $\hat{\delta}_{4(1)} = 0.4839$, $\hat{\mu}_4 = 1.4178$, $\hat{\sigma}_4^2 = 2.1487$, $\hat{\eta}_4 = 4.0888$
A5: $n_5 = 31$, $\hat{\delta}_{5(1)} = 0.4839$, $\hat{\mu}_5 = 1.8040$, $\hat{\sigma}_5^2 = 2.1962$, $\hat{\eta}_5 = 4.1930$
Table 6 reports the 95% SCIs and credible intervals for all pairwise differences between the CVs of the daily rainfall series from five areas in Thailand. The results show that the length of the Bayesian credible interval using the Jeffreys' rule prior was the shortest, which corresponds with the simulation results. Therefore, it is a good choice for constructing the SCI for all of the pairwise differences between the CVs of the precipitation series from the five areas in Thailand.
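The reported CV estimates can be checked directly from the summary statistics. The short Python sketch below recomputes each area's CV from its proportion of positive values and log-scale variance, and then lists the ten pairwise point differences whose intervals are the subject of Table 6. The dictionary layout and names are illustrative only; the numbers are the summary values quoted above.

```python
import math
from itertools import combinations

# (proportion of positive values, log-scale variance) for each area
summary = {
    "A1": (0.7097, 3.4565),
    "A2": (0.6774, 2.9550),
    "A3": (0.6452, 2.8638),
    "A4": (0.4839, 2.1487),
    "A5": (0.4839, 2.1962),
}

def cv(delta1, sigma2):
    # CV of a delta-lognormal distribution: sqrt((exp(sigma^2) - delta1) / delta1)
    return math.sqrt((math.exp(sigma2) - delta1) / delta1)

cvs = {area: cv(*values) for area, values in summary.items()}
pairwise = {(a, b): cvs[a] - cvs[b] for a, b in combinations(summary, 2)}

for area, value in cvs.items():
    print(area, round(value, 4))
for pair, value in pairwise.items():
    print(pair, round(value, 4))
```

The recomputed CVs agree with the reported values up to rounding of the inputs, which confirms that the quoted proportions are those of the positive (non-zero) observations.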

DISCUSSION
The simulation results indicate that the Bayesian credible interval using Jeffreys' rule prior outperformed the other methods in virtually all cases. Although the coverage probabilities in some cases were close to 1.00, suggesting that overestimation may have occurred, the expected lengths were the shortest. Therefore, the Bayesian credible interval using Jeffreys' rule prior can be used to construct the SCIs for all of the pairwise differences between the CVs of delta-lognormal distributions. Since constructing SCIs concerns the differences between the parameters of interest for all pairwise comparisons, our findings correspond with Yosboonruang, Niwitpong & Niwitpong (2020), who found that the highest posterior density Bayesian interval using Jeffreys' rule prior is appropriate for constructing the confidence interval for the difference between two independent CVs of delta-lognormal distributions. However, Abdel-Karim (2015) and Thangjai, Niwitpong & Niwitpong (2019) reported that MOVER is the most suitable for constructing SCIs for the mean or CV of a lognormal distribution; this is not in agreement with our findings for the data and scenario used in this study, since the MOVER SCIs were wider than those obtained using the Bayesian methods.
Table 2 The coverage probabilities and expected lengths for the 95% SCIs and credible intervals for all pairwise differences between the CVs of delta-lognormal distributions for k = 3. Note: B.Jrule-E and B.Uni-E represent the equal-tailed Bayesian confidence intervals using Jeffreys' rule and uniform priors, respectively, and B.Jrule-C and B.Uni-C represent the Bayesian credible intervals using Jeffreys' rule and uniform priors.
Table 3 The coverage probabilities and expected lengths for the 95% SCIs and credible intervals for all pairwise differences between the CVs of delta-lognormal distributions for k = 5.
In addition, the SCIs for the differences between the CVs of the daily rainfall series from the five areas of Thailand were wide, and we take this as an indication that rainfall dispersion differs among the five areas.

CONCLUSIONS
Herein, we proposed methods to construct the SCIs for all pairwise differences between the CVs of delta-lognormal distributions: FGCI, Bayesian methods yielding equal-tailed confidence intervals and credible intervals under the Jeffreys' rule and uniform priors, and MOVER. The performances of the proposed methods were determined via their coverage probabilities together with their expected lengths under various circumstances. The results indicate that the Bayesian credible interval using the Jeffreys' rule prior was suitable for constructing the SCIs for all pairwise differences between the CVs of delta-lognormal distributions in terms of the coverage probability together with the expected length. Furthermore, FGCI is appropriate for constructing these SCIs when the variances are 0.5 or 1.0 and the proportions of non-zero values are 0.5 or 0.8, for sample sizes of 50 and 100. In addition, the results of using daily rainfall data from five regions in Thailand coincided with those from the simulation study.