Bayesian computation for the common coefficient of variation of delta-lognormal distributions with application to common rainfall dispersion in Thailand

Rainfall fluctuation makes precipitation and flood prediction difficult. The coefficient of variation can be used to measure rainfall dispersion and so provide information for predicting future rainfall, thereby helping to mitigate future disasters. Rainfall data usually consist of positive and true zero values, which correspond to a delta-lognormal distribution. Therefore, the coefficient of variation of a delta-lognormal distribution is more appropriate than that of a lognormal distribution for measuring rainfall dispersion. In particular, the dispersion of precipitation from several areas can be measured jointly through the common coefficient of variation of the rainfall from those areas. Herein, we construct confidence intervals for the common coefficient of variation of delta-lognormal distributions by employing the fiducial generalized confidence interval, equal-tailed Bayesian credible intervals incorporating the independent Jeffreys or uniform priors, and the method of variance estimates recovery. The coverage probabilities and expected lengths of the proposed methods, obtained via a Monte Carlo simulation study, were used together to compare their performances. The results show that the equal-tailed Bayesian credible interval based on the independent Jeffreys prior was suitable. In addition, the equal-tailed Bayesian credible interval based on the uniform prior can be used as an alternative. The efficacies of the proposed confidence intervals are demonstrated by applying them to daily rainfall datasets from Nan, Thailand.


INTRODUCTION
Currently, anthropogenic emissions of greenhouse gases, sulfate aerosols, and black carbon are having a seriously deleterious effect on the Earth's climate (Nema, Nema & Roy, 2012). This phenomenon is directly increasing the global temperature, warming the oceans, and melting the polar ice caps, thereby causing a rise in sea level and initiating extreme weather events (NASA, 2020). Southeast Asia is a tropical area that is affected by ocean currents, prevailing winds, and abundant rainfall during the monsoon season (WorldAtlas, 2021). Thailand is located in Southeast Asia, where the climate is influenced by the monsoon winds. In particular, the combined effect of the southwest monsoon, the Inter-Tropical Convergence Zone, and tropical cyclones causes plenty of rain to fall over the country (Thai Meteorological Department, 2015). Large amounts of rainfall cause regular flooding in some areas of the country, leading to damage to property and loss of life. Moreover, Thailand is an agricultural country, and rainfall fluctuation makes it difficult to predict heavy precipitation that may cause loss of or damage to crops. Therefore, it is necessary to measure the dispersion of rainfall in specific areas by using statistical tools such as the coefficient of variation (CV) to enable accurate prediction of future catastrophic events. Nan is a province in Thailand located near the origin of the Nan River, which flows into the Chao Phraya River. Throughout the year, the precipitation in Nan fluctuates between a precipitation deficit and heavy rainfall. The latter, accompanied by thunderstorms, occurs in the late summer period, and due to the southwest monsoon, the amount of daily rainfall increases from mid-May to early October, with the highest daily rainfall frequently occurring in August or September, which can cause flooding in some areas (Thai Meteorological Department, 2015).
Therefore, datasets of daily rainfall from three areas (Chiang Klang, Tha Wang Pha, and Pua) in Nan province in August 2018 and 2019 were selected. These data comprise positive values that conform to a lognormal distribution and true zero values whose frequency conforms to a binomial distribution, as presented in Fig. 1. In addition, the normality plots shown in Fig. 2, together with the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), indicate that the daily rainfall data from the three areas follow delta-lognormal distributions. Furthermore, many researchers have reported that rainfall data follow a delta-lognormal distribution (Fukuchi, 1988; Shimizu, 1993; Yue, 2000; Kong et al., 2012; Maneerat, Niwitpong & Niwitpong, 2019a; Yosboonruang, Niwitpong & Niwitpong, 2019b). Since the CV is the ratio of the standard deviation to the mean of a population, it is free from units of measurement and is often used to measure the dispersion of data and to compare it between populations. For statistical inference, several methods for constructing confidence intervals for the CV and functions of the CV have been suggested (e.g. Pang et al., 2005; Hayter, 2015; Nam & Kwon, 2017; Yosboonruang, Niwitpong & Niwitpong, 2018). However, statistical inference for the common CV of delta-lognormal distributions has not previously been reported, and it has become our research interest because it is useful for measuring the dispersion in several independent data series, especially rainfall data. Several statisticians have suggested confidence intervals for the common CV of normal and non-normal distributions. After obtaining the asymptotic variance of the common CV for normal distributions, Gupta, Ramakrishnan & Zhou (1999) constructed confidence intervals and compared their coverage probabilities and expected lengths. Tian (2005) developed a method for the common CV by using the concept of the generalized confidence interval (GCI).
Subsequently, Behboodian & Jafari (2008) used the concepts of generalized p-values and GCI to construct a new method and compared it with Tian's method (Tian, 2005); their method outperformed Tian's by attaining a suitable coverage probability and the shortest expected length. Ng (2014) constructed confidence intervals for the common CV of lognormal distributions using the generalized variables approach; the performance of the proposed method was similar to that of Tian's method (Tian, 2005). Liu & Xu (2015) constructed a confidence interval for the common CV of several normal populations based on the confidence distribution interval. Thangjai & Niwitpong (2017) proposed the adjusted method of variance estimates recovery (MOVER) to construct the confidence interval for the weighted CV of two-parameter exponential distributions; a performance comparison between it, GCI, and a large sample method revealed that GCI performed the best and adjusted MOVER was only suitable for data with [...]. Thangjai, Niwitpong & Niwitpong (2020a) applied the adjusted GCI and a computational method to construct confidence intervals for the common CV of normal distributions; they compared them with GCI and the adjusted MOVER, and the results show that the adjusted GCI is appropriate for small samples and the computational method is suitable for large ones. In addition, Thangjai, Niwitpong & Niwitpong (2020b) extended the computational approach and MOVER to produce confidence intervals for the common CV of lognormal distributions and compared their performances with those of the fiducial GCI (FGCI) and Bayesian approaches, of which FGCI was the best. However, the work of Thangjai, Niwitpong & Niwitpong (2020b) considered only a positively skewed distribution, the lognormal distribution. In this work, we consider the lognormal distribution with true zero values, i.e. the delta-lognormal distribution, for the confidence interval construction of the common CV.
Therefore, the work of Thangjai, Niwitpong & Niwitpong (2020b) needs to be extended, because rainfall data follow a delta-lognormal distribution. The fact that daily rainfall data can usually be fitted to a delta-lognormal distribution after collecting data over a sufficiently long period has led several researchers to present statistical inference for its parameters. Several researchers have suggested confidence intervals for the mean and functions of the mean of delta-lognormal distributions, such as the traditional method, the normal algorithm, the exponential algorithm (Kvanli, Shen & Deng, 1998), bootstrapping, the likelihood ratio, the signed log-likelihood ratio (Zhou & Tu, 2000; Tian, 2005; Tian & Wu, 2006), GCI (Tian, 2005; Chen & Zhou, 2006; Li, Zhou & Tian, 2013; Wu & Hsieh, 2014; Hasan & Krishnamoorthy, 2018; Maneerat, Niwitpong & Niwitpong, 2018), MOVER (Maneerat, Niwitpong & Niwitpong, 2018), Aitchison's estimator, a modified Cox's method, a modified Land's method, the profile likelihood interval (Fletcher, 2008; Wu & Hsieh, 2014), FGCI (Li, Zhou & Tian, 2013; Hasan & Krishnamoorthy, 2018; Maneerat, Niwitpong & Niwitpong, 2019a), as well as Bayesian approaches (Maneerat, Niwitpong & Niwitpong, 2019a). Moreover, confidence interval estimations for the variance (Maneerat, Niwitpong & Niwitpong, 2020a), the CV (Yosboonruang, Niwitpong & Niwitpong, 2018), and functions of the CV of delta-lognormal distributions have been suggested, including GCI, the modified Fletcher's method, FGCI, MOVER, the Bayesian approach, and bootstrapping.
Herein, we construct confidence intervals for the common CV of delta-lognormal distributions based on FGCI, the Bayesian approach, and MOVER. Their performances are compared in terms of their coverage probabilities and expected lengths. The methods for confidence interval estimation are presented in the next section. Subsequently, the results of a simulation study are presented and discussed, followed by the use of daily rainfall data to assess the applicability of the methods in real situations. Finally, conclusions on the study are offered.

METHODS
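Let $X_{ij}$, $i = 1, 2, \ldots, k$, $j = 1, 2, \ldots, n_i$, be a random sample of size $n_i$ from the $i$-th of $k$ delta-lognormal distributions with density function

$$f(x_{ij}; \mu_i, \sigma_i^2, \delta_i) = (1 - \delta_i)\, I_0[x_{ij}] + \delta_i \frac{1}{x_{ij}\,\sigma_i \sqrt{2\pi}} \exp\left\{ -\frac{[\ln(x_{ij}) - \mu_i]^2}{2\sigma_i^2} \right\} I_{(0,\infty)}[x_{ij}],$$

where $I_0[x_{ij}]$ is an indicator function that equals 1 when $x_{ij} = 0$ and 0 otherwise; $I_{(0,\infty)}[x_{ij}]$ equals 0 when $x_{ij} = 0$ and 1 when $x_{ij} > 0$; and $\delta_i = P(X_{ij} > 0)$. This distribution is a combination of lognormal and binomial distributions. The numbers of positive and zero observations are denoted by $n_{i1}$ and $n_{i0}$, respectively, where $n_i = n_{i1} + n_{i0}$. According to Aitchison (1955), the mean and variance of a delta-lognormal distribution are

$$E(X_{ij}) = \delta_i \exp\left(\mu_i + \frac{\sigma_i^2}{2}\right)$$

and

$$\mathrm{Var}(X_{ij}) = \delta_i \exp\left(2\mu_i + \sigma_i^2\right)\left[\exp(\sigma_i^2) - \delta_i\right],$$

respectively. Since the CV is the ratio of the standard deviation to the mean, $\sqrt{\mathrm{Var}(X_{ij})}/E(X_{ij})$, then

$$\eta_i = \sqrt{\frac{\exp(\sigma_i^2) - \delta_i}{\delta_i}}.$$

By using the log-transformation (Yosboonruang, Niwitpong & Niwitpong, 2018), let

$$\varphi_i = \ln(\eta_i) = \frac{1}{2}\left[\ln\left(\exp(\sigma_i^2) - \delta_i\right) - \ln(\delta_i)\right]. \tag{6}$$

The unbiased estimators for $\sigma_i^2$ and $\delta_i$ are $\hat{\sigma}_i^2 = \sum_{j: x_{ij} > 0} [\ln(x_{ij}) - \hat{\mu}_i]^2 / (n_{i1} - 1)$ and $\hat{\delta}_i = n_{i1}/n_i$, respectively, where $\hat{\mu}_i = \sum_{j: x_{ij} > 0} \ln(x_{ij}) / n_{i1}$; substituting them into Eq. (6) gives the estimator $\hat{\varphi}_i$. The approximately unbiased estimate of the variance of $\hat{\varphi}_i$ is denoted by $\hat{V}(\hat{\varphi}_i)$. The ordinary form of the common log-transformed CV is given by

$$\tilde{\varphi} = \frac{\sum_{i=1}^{k} w_i \hat{\varphi}_i}{\sum_{i=1}^{k} w_i}, \tag{8}$$

where $w_i = 1/\hat{V}(\hat{\varphi}_i)$. Accordingly, the common CV is defined as

$$\tilde{\eta} = \exp(\tilde{\varphi}). \tag{9}$$

Here, the methods to establish the confidence intervals for the common CV of delta-lognormal distributions are provided in detail.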
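The plug-in quantities defined above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the variances passed to `common_cv` are placeholders supplied by the caller: the closed form of the approximately unbiased variance estimate is not reproduced here, so this is only an illustration of the inverse-variance weighting, not the authors' implementation.

```python
import math

def delta_lognormal_cv(delta, sigma2):
    """CV of a delta-lognormal distribution: sqrt((exp(sigma2) - delta) / delta)."""
    return math.sqrt((math.exp(sigma2) - delta) / delta)

def estimate_parameters(sample):
    """Return (delta_hat, mu_hat, sigma2_hat) from a sample that may contain zeros."""
    logs = [math.log(x) for x in sample if x > 0]  # log of the positive values only
    n, n1 = len(sample), len(logs)
    delta_hat = n1 / n                             # proportion of non-zero values
    mu_hat = sum(logs) / n1
    sigma2_hat = sum((y - mu_hat) ** 2 for y in logs) / (n1 - 1)
    return delta_hat, mu_hat, sigma2_hat

def common_cv(phi_hats, variances):
    """Inverse-variance weighted mean of log-CVs, back-transformed with exp.

    `variances` are caller-supplied variance estimates of each log-CV
    (an assumption of this sketch).
    """
    weights = [1.0 / v for v in variances]
    phi_tilde = sum(w * p for w, p in zip(weights, phi_hats)) / sum(weights)
    return math.exp(phi_tilde)

# Usage: one sample with two zeros and three positive values.
delta_hat, mu_hat, sigma2_hat = estimate_parameters([0.0, 0.0, 1.2, 3.4, 9.8])
eta_hat = delta_lognormal_cv(delta_hat, sigma2_hat)
```

When $\delta = 1$ the CV reduces to the familiar lognormal form $\sqrt{\exp(\sigma^2) - 1}$, which is a convenient sanity check for the first function.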

FGCI
Let $X_{ij}$, $i = 1, 2, \ldots, k$, $j = 1, 2, \ldots, n_i$, be a random sample with density function $f(x_{ij}; \theta_i, \mu_i)$, where $\theta_i = (\delta_i, \sigma_i^2)$ are the parameters of interest and $\mu_i$ is a nuisance parameter. Let $x_{ij}$ be the observed values of $X_{ij}$. To construct the FGCI (Weerahandi, 1993; Hannig, Iyer & Patterson, 2006), the fiducial generalized pivotal quantity (FGPQ) $R(X_{ij}; x_{ij}, \theta_i, \mu_i)$ must satisfy the following two properties:
1. For each $x_{ij}$, the conditional distribution of $R(X_{ij}; x_{ij}, \theta_i, \mu_i)$ is unaffected by the nuisance parameter.
2. The observed value of $R(X_{ij}; x_{ij}, \theta_i, \mu_i)$ at $X_{ij} = x_{ij}$ is the parameter of interest $\theta_i$.

Given that $R_{\alpha}$ is the $100\alpha$-th percentile of $R(X_{ij}; x_{ij}, \theta_i, \mu_i)$, then $(R_{\alpha/2}, R_{1-\alpha/2})$ becomes the $100(1-\alpha)\%$ two-sided FGCI for $\theta_i$. Hence, it is essential to use the FGPQs for $\delta_i$ and $\sigma_i^2$ to construct the confidence interval for the common CV ($\tilde{\eta}$) of delta-lognormal distributions.
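Consider $k$ individual random samples $X_{i1}, X_{i2}, \ldots, X_{in_i}$. Following Hannig (2009) and Li, Zhou & Tian (2013), the FGPQ for $\delta_i$ is as follows:

$$R_{\delta_i} \sim \frac{1}{2}\,\mathrm{Beta}(n_{i1}, n_{i0} + 1) + \frac{1}{2}\,\mathrm{Beta}(n_{i1} + 1, n_{i0}).$$

Similarly, Wu & Hsieh (2014) followed the concept of Krishnamoorthy & Mathew (2003) to find the FGPQ for $\sigma_i^2$, defined as

$$R_{\sigma_i^2} = \frac{(n_{i1} - 1)\,\hat{\sigma}_i^2}{U_i},$$

where $U_i \sim \chi^2_{n_{i1}-1}$. To find the FGPQ for $\hat{\varphi}_i$, we then substitute $R_{\delta_i}$ and $R_{\sigma_i^2}$ into Eq. (6) as follows:

$$R_{\hat{\varphi}_i} = \frac{1}{2}\left[\ln\left(\exp(R_{\sigma_i^2}) - R_{\delta_i}\right) - \ln(R_{\delta_i})\right].$$

Consequently, the FGPQ for the common CV ($\tilde{\eta}$) is

$$R_{\tilde{\eta}} = \exp\left(\frac{\sum_{i=1}^{k} R_{w_i} R_{\hat{\varphi}_i}}{\sum_{i=1}^{k} R_{w_i}}\right),$$

where $R_{w_i}$ is the inverse of the FGPQ for the estimated variance of $\hat{\varphi}_i$, which is obtained by replacing $\delta_i$ and $\sigma_i^2$ in $\hat{V}(\hat{\varphi}_i)$ with $R_{\delta_i}$ and $R_{\sigma_i^2}$. Thus, we employ $R_{\tilde{\eta}}$ to produce the confidence interval for $\tilde{\eta}$. Consequently, the $100(1-\alpha)\%$ two-sided confidence interval for $\tilde{\eta}$ based on FGCI becomes $\left(R_{\tilde{\eta}}(\alpha/2), R_{\tilde{\eta}}(1-\alpha/2)\right)$, whose endpoints denote the $100(\alpha/2)$-th and $100(1-\alpha/2)$-th percentiles of $R_{\tilde{\eta}}$, respectively.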
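The sampling steps behind the FGPQs can be sketched for a single sample as follows. This is a minimal illustration under stated assumptions: the FGPQ for $\delta_i$ is taken to be an equal-weight mixture of $\mathrm{Beta}(n_{i1}, n_{i0}+1)$ and $\mathrm{Beta}(n_{i1}+1, n_{i0})$ distributions, both counts are assumed positive, and the weighting step for the common CV is omitted because it needs the variance FGPQ. All function names are hypothetical.

```python
import math
import random

def fgpq_delta(n1, n0, rng):
    """Draw from the equal-weight mixture of Beta(n1, n0 + 1) and Beta(n1 + 1, n0)."""
    if rng.random() < 0.5:
        return rng.betavariate(n1, n0 + 1)
    return rng.betavariate(n1 + 1, n0)

def fgpq_sigma2(n1, sigma2_hat, rng):
    """Draw (n1 - 1) * sigma2_hat / U with U ~ chi-square(n1 - 1)."""
    u = rng.gammavariate((n1 - 1) / 2.0, 2.0)  # chi-square(df) = Gamma(df / 2, scale 2)
    return (n1 - 1) * sigma2_hat / u

def fgpq_log_cv(n1, n0, sigma2_hat, rng):
    """FGPQ for the log-transformed CV of one sample."""
    r_delta = fgpq_delta(n1, n0, rng)
    r_sigma2 = fgpq_sigma2(n1, sigma2_hat, rng)
    return 0.5 * (math.log(math.exp(r_sigma2) - r_delta) - math.log(r_delta))

def percentile_interval(draws, alpha=0.05):
    """Equal-tailed interval from the empirical alpha/2 and 1 - alpha/2 percentiles."""
    s = sorted(draws)
    lo = s[math.floor(alpha / 2 * (len(s) - 1))]
    hi = s[math.ceil((1 - alpha / 2) * (len(s) - 1))]
    return lo, hi

# Usage with illustrative counts: 45 positive and 17 zero observations.
rng = random.Random(1)
draws = [fgpq_log_cv(45, 17, 1.7857, rng) for _ in range(2000)]
log_lo, log_hi = percentile_interval(draws)
cv_interval = (math.exp(log_lo), math.exp(log_hi))
```

Because the exponential function is monotone, the percentile interval for the log-transformed CV maps directly to an interval for the CV itself.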

Bayesian methods
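Since the random samples $X_{ij}$, $i = 1, 2, \ldots, k$, $j = 1, 2, \ldots, n_i$, have a delta-lognormal distribution with unknown parameters $\zeta = (\delta_i, \mu_i, \sigma_i^2)$, the likelihood function for the $i$-th sample is

$$L(\zeta \mid x_{ij}) \propto (1 - \delta_i)^{n_{i0}}\, \delta_i^{n_{i1}}\, (\sigma_i^2)^{-n_{i1}/2} \exp\left\{ -\frac{1}{2\sigma_i^2} \sum_{j: x_{ij} > 0} [\ln(x_{ij}) - \mu_i]^2 \right\}.$$

Subsequently, the Fisher information matrix of $\zeta$, based on the second-order partial derivatives of the log-likelihood function with respect to $\delta_i$, $\mu_i$, and $\sigma_i^2$, becomes

$$I(\zeta) = \mathrm{diag}\left[ \frac{n_i}{\delta_i (1 - \delta_i)},\; \frac{n_i \delta_i}{\sigma_i^2},\; \frac{n_i \delta_i}{2\sigma_i^4} \right].$$

In the present study, the Bayesian method is used to construct equal-tailed credible intervals for the common CV. In the following sections, we propose the independent Jeffreys and uniform priors.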

The Bayesian method using the independent Jeffreys prior
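Since Jeffreys' prior for the unknown parameter $\zeta$ is derived from the square root of the determinant of the Fisher information matrix, the independent Jeffreys priors for $\delta_i^{*} = 1 - \delta_i$ and $\sigma_i^2$ are $\pi(\delta_i^{*}) \propto (\delta_i^{*})^{-1/2} (1 - \delta_i^{*})^{-1/2}$ and $\pi(\sigma_i^2) \propto 1/\sigma_i^2$ (Harvey & van der Merwe, 2012), respectively. Since $\delta_i^{*}$ and $\sigma_i^2$ are independent, the independent Jeffreys prior for a delta-lognormal distribution is $\pi(\delta_i^{*}, \sigma_i^2) \propto (\sigma_i^2)^{-1} (\delta_i^{*})^{-1/2} (1 - \delta_i^{*})^{-1/2}$. Thus, the joint posterior density of $\zeta$ is proportional to the product of this prior and the likelihood. This leads to the posterior density of $\delta_i^{*}$ given by

$$p(\delta_i^{*} \mid x_{ij}) \propto (\delta_i^{*})^{n_{i0} - 1/2} (1 - \delta_i^{*})^{n_{i1} - 1/2},$$

which is a beta distribution with parameters $n_{i0} + 1/2$ and $n_{i1} + 1/2$, denoted by $\delta_i^{*} \mid x_{ij} \sim \mathrm{Beta}(n_{i0} + 1/2,\; n_{i1} + 1/2)$. Similarly, the posterior density of $\sigma_i^2$ can be derived as

$$p(\sigma_i^2 \mid x_{ij}) \propto (\sigma_i^2)^{-(n_{i1} - 1)/2 - 1} \exp\left\{ -\frac{(n_{i1} - 1)\,\hat{\sigma}_i^2}{2\sigma_i^2} \right\},$$

which is in the general form of an inverse gamma distribution, denoted by $\sigma_i^2 \mid x_{ij} \sim \mathrm{IG}\left((n_{i1} - 1)/2,\; (n_{i1} - 1)\hat{\sigma}_i^2/2\right)$.

The Bayesian method using the uniform prior
Because all possible values are equally likely a priori under the uniform prior, it is a constant function of the a priori probability (Stone, 2013; O'Reilly & Mars, 2015). According to Bolstad & Curran (2016), the uniform priors for $\delta_i^{*}$ and $\sigma_i^2$ are proportional to 1, which can be written as $\pi(\delta_i^{*}) \propto 1$ and $\pi(\sigma_i^2) \propto 1$; thereby, the uniform prior for the parameters of interest of a delta-lognormal distribution is $\pi(\delta_i^{*}, \sigma_i^2) \propto 1$. Accordingly, the joint posterior density function is proportional to the likelihood, and the resulting posterior density of $\delta_i^{*}$ is that of a beta distribution, i.e. $\delta_i^{*} \mid x_{ij} \sim \mathrm{Beta}(n_{i0} + 1,\; n_{i1} + 1)$. For $\sigma_i^2$, the posterior density is an inverse gamma distribution with shape and scale parameters $(n_{i1} - 2)/2$ and $(n_{i1} - 2)\hat{\sigma}_i^2/2$, respectively, which is expressed as $\sigma_i^2 \mid x_{ij} \sim \mathrm{IG}\left((n_{i1} - 2)/2,\; (n_{i1} - 2)\hat{\sigma}_i^2/2\right)$. Subsequently, we construct the equal-tailed credible intervals for the common CV by substituting the posterior distributions of $\delta_i^{*}$ and $\sigma_i^2$ from the independent Jeffreys and uniform priors into Eqs. (6), (7), and (9).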
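The posterior simulation described above can be sketched for one sample as follows. This is a hedged illustration rather than the authors' code: it assumes that $\delta_i^{*}$ is the zero probability, so that $\delta_i = 1 - \delta_i^{*}$, and it takes the beta offset and the inverse gamma shape and scale as arguments so that either prior can be plugged in. The function names are hypothetical, and the usage example applies the uniform-prior parameters stated in the text.

```python
import math
import random

def posterior_cv_draws(n1, n0, beta_offset, ig_shape, ig_scale, n_draws, rng):
    """Posterior draws of the CV for one delta-lognormal sample.

    beta_offset: 1.0 for the uniform prior, 0.5 for the independent Jeffreys prior.
    ig_shape, ig_scale: parameters of the inverse gamma posterior of sigma^2.
    """
    draws = []
    for _ in range(n_draws):
        delta_star = rng.betavariate(n0 + beta_offset, n1 + beta_offset)
        delta = 1.0 - delta_star  # assumption: delta* is the probability of a zero
        # If G ~ Gamma(shape, scale=1), then ig_scale / G ~ InverseGamma(shape, ig_scale).
        sigma2 = ig_scale / rng.gammavariate(ig_shape, 1.0)
        draws.append(math.sqrt((math.exp(sigma2) - delta) / delta))
    return draws

def equal_tailed_interval(draws, alpha=0.05):
    """Equal-tailed credible interval from empirical percentiles of the draws."""
    s = sorted(draws)
    lo = s[math.floor(alpha / 2 * (len(s) - 1))]
    hi = s[math.ceil((1 - alpha / 2) * (len(s) - 1))]
    return lo, hi

# Usage with the uniform prior and illustrative values n1 = 45, n0 = 17.
n1, n0, sigma2_hat = 45, 17, 1.7857
rng = random.Random(7)
draws = posterior_cv_draws(n1, n0, 1.0, (n1 - 2) / 2, (n1 - 2) * sigma2_hat / 2, 2000, rng)
cv_lo, cv_hi = equal_tailed_interval(draws)
```

An equal-tailed interval places probability $\alpha/2$ in each tail of the posterior, which is why plain percentiles of the draws are sufficient here.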

MOVER
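Following the method of Zou & Donner (2008), let $\varphi_1$ and $\varphi_2$ be the parameters of interest, and let $\hat{\varphi}_1$ and $\hat{\varphi}_2$ be independent estimators of $\varphi_1$ and $\varphi_2$, respectively. The lower and upper confidence limits for $\varphi_1 + \varphi_2$ are

$$[L_{\varphi_1+\varphi_2}, U_{\varphi_1+\varphi_2}] = (\hat{\varphi}_1 + \hat{\varphi}_2) \pm z_{\alpha/2} \sqrt{\mathrm{Var}(\hat{\varphi}_1) + \mathrm{Var}(\hat{\varphi}_2)}. \tag{23}$$

Subsequently, let $l_i$ and $u_i$, for $i = 1, 2$, be the lower and upper bounds of the confidence interval for $\varphi_i$, respectively. Since $l_i$ and $u_i$ provide the possible parameter values, $l_1 + l_2$ is close to $L_{\varphi_1+\varphi_2}$ and $u_1 + u_2$ is close to $U_{\varphi_1+\varphi_2}$. To obtain the lower limit $L_{\varphi_1+\varphi_2}$, the estimated variance of $\hat{\varphi}_i$ at $\varphi_i = l_i$ is given by

$$\widehat{\mathrm{Var}}(\hat{\varphi}_i^{\,l}) = \frac{(\hat{\varphi}_i - l_i)^2}{z_{\alpha/2}^2}.$$

Similarly, to obtain the upper limit $U_{\varphi_1+\varphi_2}$, the estimated variance of $\hat{\varphi}_i$ at $\varphi_i = u_i$ is given by

$$\widehat{\mathrm{Var}}(\hat{\varphi}_i^{\,u}) = \frac{(u_i - \hat{\varphi}_i)^2}{z_{\alpha/2}^2}.$$

Next, by substituting $\widehat{\mathrm{Var}}(\hat{\varphi}_i^{\,l})$ and $\widehat{\mathrm{Var}}(\hat{\varphi}_i^{\,u})$ into Eq. (23), we obtain

$$L_{\varphi_1+\varphi_2} = \hat{\varphi}_1 + \hat{\varphi}_2 - \sqrt{(\hat{\varphi}_1 - l_1)^2 + (\hat{\varphi}_2 - l_2)^2}$$

and

$$U_{\varphi_1+\varphi_2} = \hat{\varphi}_1 + \hat{\varphi}_2 + \sqrt{(u_1 - \hat{\varphi}_1)^2 + (u_2 - \hat{\varphi}_2)^2}.$$

Thereby, the estimated variance of $\hat{\varphi}_i$ can be expressed in terms of $l_i$ and $u_i$. When this concept is extended to $k$ parameters, the lower and upper confidence limits for $\upsilon = \sum_{i=1}^{k} \varphi_i$ are given by

$$L_{\upsilon} = \sum_{i=1}^{k} \hat{\varphi}_i - \sqrt{\sum_{i=1}^{k} (\hat{\varphi}_i - l_i)^2}$$

and

$$U_{\upsilon} = \sum_{i=1}^{k} \hat{\varphi}_i + \sqrt{\sum_{i=1}^{k} (u_i - \hat{\varphi}_i)^2}.$$

According to Krishnamoorthy & Oral (2017), and recalling the common log-transformed CV from Eq. (8), the lower and upper confidence limits for $\sigma_i^2$ and $\delta_i$ are required to construct the confidence interval for the common CV of delta-lognormal distributions. Since the estimate of $\sigma_i^2$ is $\hat{\sigma}_i^2 = \sum_{j: x_{ij} > 0} [\ln(x_{ij}) - \hat{\mu}_i]^2 / (n_{i1} - 1)$, where $(n_{i1} - 1)\hat{\sigma}_i^2/\sigma_i^2 \sim \chi^2_{n_{i1}-1}$, the $100(1-\alpha)\%$ confidence interval for $\sigma_i^2$ is derived as

$$[l_{\sigma_i^2}, u_{\sigma_i^2}] = \left[ \frac{(n_{i1} - 1)\,\hat{\sigma}_i^2}{\chi^2_{1-\alpha/2,\, n_{i1}-1}},\; \frac{(n_{i1} - 1)\,\hat{\sigma}_i^2}{\chi^2_{\alpha/2,\, n_{i1}-1}} \right].$$

To construct the confidence interval for $\delta_i$, the concept of the variance stabilizing transformation proposed by DasGupta (2008) and Wu & Hsieh (2014) was used. Therefore, the confidence interval for $\delta_i$ is given by

$$[l_{\delta_i}, u_{\delta_i}] = \left[ \sin^2\left( \arcsin\sqrt{\hat{\delta}_i} - \frac{z_{1-\alpha/2}}{2\sqrt{n_i}} \right),\; \sin^2\left( \arcsin\sqrt{\hat{\delta}_i} + \frac{z_{1-\alpha/2}}{2\sqrt{n_i}} \right) \right].$$

These limits yield the bounds $l_i$ and $u_i$ for $\varphi_i$, for $i = 1, 2, \ldots, k$. Therefore, the $100(1-\alpha)\%$ confidence interval for $\tilde{\eta}$ based on MOVER can be written as $[\exp(L_{\tilde{\varphi}}), \exp(U_{\tilde{\varphi}})]$, where $L_{\tilde{\varphi}}$ and $U_{\tilde{\varphi}}$ are the lower and upper MOVER limits for the common log-transformed CV.

PeerJ, DOI 10.7717/peerj.12858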
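The variance-recovery step for a sum of $k$ parameters can be sketched in a few lines of Python. The first function implements the standard Zou & Donner limits for a sum; the second assumes that the variance stabilizing interval for $\delta_i$ takes the usual arcsine form and clamps the transformed limits to $[0, \pi/2]$, which is a choice of this sketch. Both function names are hypothetical, and the numbers in the usage lines are made up for illustration.

```python
import math
from statistics import NormalDist

def mover_sum_limits(estimates, lowers, uppers):
    """MOVER limits for the sum of k parameters from per-parameter estimates and limits."""
    total = sum(estimates)
    lower = total - math.sqrt(sum((e - l) ** 2 for e, l in zip(estimates, lowers)))
    upper = total + math.sqrt(sum((u - e) ** 2 for e, u in zip(estimates, uppers)))
    return lower, upper

def vst_interval_delta(delta_hat, n, alpha=0.05):
    """Arcsine variance-stabilized interval for a binomial proportion (assumed form)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = math.asin(math.sqrt(delta_hat))
    half_width = z / (2 * math.sqrt(n))
    lo = math.sin(max(centre - half_width, 0.0)) ** 2          # clamp at 0
    hi = math.sin(min(centre + half_width, math.pi / 2)) ** 2  # clamp at pi / 2
    return lo, hi

# Usage with made-up estimates and per-parameter limits.
lower, upper = mover_sum_limits([1.0, 2.0], [0.7, 1.6], [1.3, 2.4])
l_delta, u_delta = vst_interval_delta(0.7258, 62)
```

Note that the lower limit uses only the distances $\hat{\varphi}_i - l_i$ and the upper limit only $u_i - \hat{\varphi}_i$, so asymmetric per-parameter intervals carry over to an asymmetric interval for the sum.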

Monte Carlo simulation studies
The R statistical program was used to run the Monte Carlo simulation study and calculate the results for evaluating the performances of FGCI, MOVER, and the Bayesian intervals with the independent Jeffreys or uniform priors. The criteria for choosing the best-performing confidence interval were a coverage probability $\geq 0.95$ and the shortest expected length for each scenario tested. To generate the data, we set the number of populations as $k = 3, 5, 10$; sample sizes as $n_1 = n_2 = \ldots = n_k = n = 25, 50, 100$; probabilities of non-zero values as $\delta_1 = \delta_2 = \ldots = \delta_k = \delta = 0.2, 0.5, 0.8$; and variances as $\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_k^2 = \sigma^2 = 0.1, 0.5, 1.0, 2.0$. For each combination of parameters, 10,000 simulation runs were generated together with 2,000 replications for FGCI and the Bayesian approaches by applying Algorithms 1 and 2, respectively. For MOVER, the steps were as follows: generate $x_{ij}$, $i = 1, 2, \ldots, k$, $j = 1, 2, \ldots, n_i$, from a delta-lognormal distribution; compute $l_{\sigma_i^2}$, $u_{\sigma_i^2}$, $l_{\delta_i}$, $u_{\delta_i}$, $l_i$, and $u_i$; and compute the $100(1-\alpha)\%$ confidence interval for $\tilde{\eta}$.

The results for the 95% confidence and credible intervals for the common CV of delta-lognormal distributions for various sample sizes, probabilities of non-zero values, and variances are reported in Tables 1-3 and displayed in Figs. 3-8. The equal-tailed Bayesian credible intervals based on the independent Jeffreys or uniform priors produced coverage probabilities close to or greater than the nominal confidence level for almost all of the scenarios, whereas the others achieved this in only some of them. Furthermore, in terms of the expected lengths, the equal-tailed Bayesian credible intervals based on the independent Jeffreys prior were shorter than those based on the uniform prior in all cases. In addition, the expected lengths of the Bayesian credible interval based on the independent Jeffreys prior were shorter than the others in almost every case when $\sigma_i^2 = 0.5$. For all $k$ and sample sizes together with $\sigma_i^2 = 1, 2$, the expected lengths of the equal-tailed Bayesian credible interval based on the independent Jeffreys prior were the shortest when $\delta_i = 0.2, 0.5$, while MOVER had the shortest expected lengths for $\delta_i = 0.8$. For FGCI, the intervals were very wide in all cases, which makes it an unreasonable choice for constructing the confidence interval. However, the equal-tailed Bayesian credible interval based on the independent Jeffreys prior can be used to derive the confidence interval for the common CV of delta-lognormal distributions since it produced coverage probabilities $\geq 0.95$ for almost all cases, although its expected lengths were not always the shortest.

Table 2. The results for the 95% two-sided confidence intervals for the common CV of delta-lognormal distributions for $k = 5$.
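The two evaluation criteria used in the simulation study, coverage probability and expected length, can be sketched generically in Python. This is an illustration only: the study itself was run in R, the function names and the location parameter `mu` are assumptions of the sketch, and the Wald interval for $\delta$ is a simple stand-in rather than one of the four methods compared above.

```python
import math
import random

def generate_delta_lognormal(n, delta, mu, sigma2, rng):
    """Simulate n values: lognormal with probability delta, otherwise a true zero."""
    sd = math.sqrt(sigma2)
    return [rng.lognormvariate(mu, sd) if rng.random() < delta else 0.0
            for _ in range(n)]

def coverage_and_length(interval_fn, true_value, simulate_fn, n_runs, rng):
    """Estimate coverage probability and expected length of an interval method."""
    hits, total_length = 0, 0.0
    for _ in range(n_runs):
        lo, hi = interval_fn(simulate_fn(rng))
        hits += lo <= true_value <= hi   # True counts as 1
        total_length += hi - lo
    return hits / n_runs, total_length / n_runs

def wald_delta_interval(sample, z=1.96):
    """Stand-in method: Wald interval for the proportion of non-zero values."""
    n = len(sample)
    p = sum(x > 0 for x in sample) / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

# Usage: coverage of the stand-in interval for delta = 0.5 with n = 50.
rng = random.Random(2022)
coverage, expected_length = coverage_and_length(
    wald_delta_interval, 0.5,
    lambda r: generate_delta_lognormal(50, 0.5, 0.0, 1.0, r),
    400, rng)
```

Any of the interval constructions for the common CV could be passed as `interval_fn`, with `simulate_fn` returning the $k$ samples and `true_value` set to the common CV implied by the chosen $\delta$ and $\sigma^2$.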

Application of the methods to real datasets
Datasets of daily rainfall from Chiang Klang, Tha Wang Pha, and Pua in Nan, Thailand, were obtained from the Upper Northern Region Irrigation Hydrology Center. The reason for using these datasets was discussed in the Introduction. AIC and BIC were used to compare candidate distributions for the non-zero observations in these datasets (Tables 4 and 5, respectively). The results indicate that the non-zero observations in the three datasets most closely follow a lognormal distribution. Furthermore, the normal Q-Q plots of the log-transformed non-zero observations shown in Fig. 2 reveal that they follow normal distributions. Taken together with the binomial distribution of the counts of true zero observations, these results indicate that the daily rainfall data from the three areas follow delta-lognormal distributions. The summary statistics for the three daily rainfall datasets were $n_1 = n_2 = n_3 = 62$; $\hat{\delta}_1 = 0.7258$, $\hat{\delta}_2 = 0.7903$, $\hat{\delta}_3 = 0.7419$; $\hat{\mu}_1 = 2.1189$, $\hat{\mu}_2 = 1.6448$, $\hat{\mu}_3 = 1.8971$; $\hat{\sigma}_1^2 = 1.7857$, $\hat{\sigma}_2^2 = 3.4406$, $\hat{\sigma}_3^2 = 1.8346$; and $\tilde{\eta} = 3.2011$. The 95% confidence and credible intervals for the common CV of the three daily rainfall datasets are summarized in Table 6. The results reveal that the three confidence intervals tested contained the real value of the parameter, thereby reinforcing the conclusions based on the simulation study results. However, the length of the FGCI interval was the shortest, thereby making it a good choice for estimating the common CV in the dispersion of precipitation from the three areas in Nan province, Thailand.

DISCUSSION
We extended the idea of Thangjai, Niwitpong & Niwitpong (2020b), who established confidence intervals using FGCI for the common CV of lognormal distributions, to the context of the same distribution but with excess zeros. We then applied it to examine the dispersion in three daily rainfall datasets. In our case, the findings from the simulation study indicate that the Bayesian methods were preferable to FGCI and MOVER in almost all cases. Although MOVER performed well for cases with a high proportion of non-zero values, it produced coverage probabilities that were lower than the nominal confidence level for cases with small variances. This is probably because the lower and upper bounds for the zero values are used in the confidence interval construction, and their combined effect with the other parameters caused the inadequate coverage probabilities.

CONCLUSION
Confidence intervals for the common CV of delta-lognormal distributions were constructed based on FGCI, two equal-tailed Bayesian credible intervals using the independent Jeffreys or uniform priors, and MOVER. Their coverage probabilities and expected lengths under various simulation scenarios were used to assess their efficacies. The equal-tailed Bayesian credible interval based on the independent Jeffreys prior provided superior coverage probabilities compared to the other methods. Moreover, the equal-tailed Bayesian credible interval based on the uniform prior can be used as an alternative. This is because the parameter to be estimated relies on the posterior densities of $\delta_i^{*}$ and $\sigma_i^2$. According to the results of the simulation study and the real data example, the Bayesian credible interval based on the independent Jeffreys prior is suitable for cases with small variances, since it provided the narrowest interval due to it falling in the domain of its posterior density. Furthermore, MOVER is the best choice when the proportion of non-zero values is high and the variance is large.