Generalized exponential distribution : A Bayesian approach using MCMC methods

© 2015 Growing Science Ltd. All rights reserved


Introduction
A generalized exponential distribution (see Gupta & Kundu, 1999) can be a good alternative for the use of the popular gamma or Weibull distributions to analyse lifetime data (see also, Raqab, 2002;Raqab & Ahsanullah, 2001;Zheng, 2002;Sarhan, 2007;Gupta & Kundu, 2001, 2007).The generalized exponential distribution with two parameters has density given by, where t > 0; α > 0 and λ > 0 are respectively, shape and scale parameters.Let us denote this model as GE(α, λ).The density function given in Eq. ( 1) has great flexibility of fitting depending on the shape parameter α: if α < 1, we have a decreasing function and if α > 1, we have a unimodal function with mode given by λ −1 logα.Observe that if α = 1, we have an exponential distribution with parameter λ.
The survival and hazard function associated with Eq. ( 1) are given Eq. ( 2) and Eq.(3), respectively, as follows, S(t; α, λ) = P (T and (3) Observe that the hazard function h(t; α, λ) has a non-deceasing trend from 0 to λ when α > 1; a nonincreasing trend from ∞ to 1 when α < 1 and constant with α = 1.This behavior of the hazard function given in Eq. ( 3) is similar to the behavior of the hazard function of a gamma distribution.Also observe that the median lifetime obtained from S(t; α, λ) = 1/2 is given by, The moment generating function for a random variable T with a generalized exponential distribution and density of Eq. ( 1) is as follows, (see Gupta & Kundu, 2008) From Eq. ( 5), we find all moments of interest.The mean and variance are given, respectively, by 2 1 1 E(T) [ ( 1)-( 1)], var(T) [ '(1)-'( 1)], where ψ(.) is a digamma function given by and Γ (x) is a gamma function.In this paper, we develop a Bayesian analysis for the generalized exponential distribution using Markov Chain Monte Carlo (MCMC) methods (see for example, Gelfand & Smith, 1990;Chib & Greenberg, 1995) to obtain the posterior summaries of interest.
The paper is organized as follows: in Section 2, we introduce the likelihood function; in Section 3, we present a Bayesian analysis considering different non-informative priors for the parameters; in Section 4, we present inference for the survival function at a specified time; in Section 5, we introduce two numerical illustrations; finally, in Section 6, we present some conclusions.

The Likelihood Function
Suppose we have identically distributed lifetimes t = (t 1 , ..., t n ) ′ from a GE(α, λ) distribution.The likelihood function in the parameters α and λ, based on t is then The logarithm of the likelihood function given in Eq. ( 7) is given by, The maximum likelihood estimators (MLE) for α and λ are obtained from ∂l/∂α = 0 and ∂l/∂λ = 0, where, From Eq. ( 9), we find the MLE for α given by, The MLE for λ is obtained by solving the nonlinear equation, where α ˆ is given by Eq. ( 10) and Observe that we need to use an iterative method to find the MLE λ ˆ from Eq. ( 11).The second derivatives of l(α, λ) are given, respectively, by Hypotheses tests and confidence intervals for α and λ can be obtained using the asymptotical normal distribution for α ˆ and λ ˆ , that is where I 0 is the observed Fisher information matrix given by,

A Bayesian Analysis
For a Bayesian analysis of the GE(α, λ) distribution, we assume different prior distributions for α and λ.The Jeffreys non-informative prior (see for example, Box & Tiao, 1973) for α and λ is given by where I (α, λ) is the Fisher information matrix given by, where, (see Gupta & Kundu, 1999).Let us denote the prior Eq. ( 15) as " Jeffreys1".A possible simplification is to consider a non-informative prior from Π(α, λ) = Π(λ| α)Π 0 (α).Using the Jeffreys'rule, we have, is given in Eq. ( 17) and Let us denote the prior Eq. ( 19) as " Jeffreys2".A third non-informative prior distribution is assumed considering independence between α and λ, that is, where α > 0 and λ > 0. Let us denote the prior (20) as " Jeffreys3".Assuming dependence between the random quantities α and λ, we could assume a bivariate prior distribution for α and λ derived from copula functions (see for example, Nelsen, 1999;Trivedi & Zimmer, 2007).A special case is given by the Farlie-Gumbel-Morgenstern Copula (see Morgenstern, 1956) given by, where u = F 1 (α) (marginal distribution function for α) and v = F 2 (λ) (marginal distribution function for λ).The joint distribution function for α and λ is given (from ( 21)) by, where the parameter δ is associated to the dependence between α and λ.If δ = 0, we have independence between α and λ.The joint prior density for α and λ, obtained from ∂ 2 F(α, λ)/∂α∂λ, is given by, where f 1 (α) and f 2 (λ) are the marginal densities for α and λ.Assuming marginal exponential distributions with known hyperparameters a 1 and a 2 , we have, conditional on the hyperparameter δ, Let us denote Eq. ( 24) as "Farlie-Gumbel" or "Copula prior".The specification of Eq. ( 24) must be completed by a prior distribution for δ, Π(δ).A suggestion for this prior could be a uniform U [−1, 1] distribution on the interval [−1, 1].Other prior specifications also could be considered, as independent informative Gamma distributions, that is, where a α , b α , a λ and b λ are known hyperparameters and Gamma(a,b) denotes a gamma distribution with mean a/b and variance a/b 2 .

The joint posterior distribution for α and λ assuming the Farlie-Gumbel-Morgenstern prior
Assuming the "Farlie-Gumbel-Morgenstern" prior Π 4 (α, λ| δ) introduced in Eq. ( 24), the statistical model and prior model together form an ordered structure in which the distribution of the data is written conditionally on the parameters (α, λ) as f(t| α, λ); the distribution of (α, λ) is written conditionally on the hyperparameter δ as Π 4 (α,λ| δ) and is completed by the distribution of δ, Π(δ).The full joint distribution of all random quantities in the problem is hierarchically written as, In a three-level hierarchy with the form Eq. ( 26), inference about (α, λ) and δ is simply obtained through their joint posterior distribution, and inference about (α, λ) and δ are given by their marginal posteriors.Therefore, the joint posterior for the GE(α, λ) distribution parameters (α, λ) and the hiperparameter δ is given by The next step is to specify a prior distribution for the parameter δ and a convenient prior proposed by Box and Tiao (1973) for correlation parameter in bivariate normal data is given by, for an appropriated choice of "c".If we do not have any information from previous studies, a common choice is c = 0, that is, π(δ) ∝1.We decided to use c = −1/2, based on MCMC convergence rate and mathematics convenience, providing the joint posterior In our study, the aim is to specify the posterior marginal distributions of the parameters α and λ.Thus, to estimate the parameters of interest, we use the marginal posterior distributions by integrating out the nuisance parameter δ from the joint density Eq. (30).Solving the integral above for the parameter δ and considering the integral results, ; in this way, we have the joint posterior distribution for α and λ given by which is the same posterior obtained if we have used π(δ)∝1.Samples of the joint posterior distributions of Eq. ( 30) can be simulated by using MCMC methods.In this way, we simulate α from the conditional posterior distribution Π(α|λ, t) and λ from the conditional distribution Π(λ|α, t) using the Metropolis-Hastings algorithm (see for example, Chib and Greenberg, 1995).

Survival function S
In applications, we usually have interest in the survival function S, given by Eq. ( 2), that is, if X represents the lifetime of a patient under a given treatment then S represents the probability of this patient "survives" for at least a specified time t (or longer).Next, we compare the posterior densities of the survival function S(t) by using the priors discussed in this paper.We note that Eq. ( 2) is a function of α and λ and hence it is a parameter itself with a posterior distribution Π(S | t).To derive this posterior we first transform (α, λ) to (S, W) where W = α and S = S 0 = 1 − [1 − exp(−λt 0 )] α To derive this posterior we first transform (α, λ) to (S, W) where W = α and S = S 0 .As Jeffreys and gamma priors are invariant to 1-1 transformation then the posterior distribution for the new parameters can be derived by the Jacobian transformation, that is, . The Jacobian matrix J of this transformation with respect to the parameters W and S is given by: where Note that although the prior distribution of Eq. ( 24) belongs to the copula Farlie-Gumbel family the prior resulting from transformation S and W does not belong to the same family, that is, this prior is not invariant under nonlinear transformations.Marginal posterior of S could be obtained by integrating Π(S, W |t) with respect to the auxiliar variable W, but this is complicated.We prefer the MCMC approach to get this posterior density.From Eq. ( 2), the MLE of survival function S = S(t o ) is given by where α ˆ and λ ˆ are the MLE of α and λ, respectively.

Numerical Illustrations
In this section, we introduce two examples to illustrate the proposed method.
From the plots of Fig. 1, we observe that the contour plots are strongly affected by the different parameterizations, especially for small sample sizes as the case of n = 5 observations.In this case, we need to be careful to assume classical asymptotical inference results.For comparison of the different priors proposed in this paper, the joint posterior contours are given in Fig. 2, Fig. 3 and Fig. 4 considering the data sets introduced in Table 1 with n = 5, 25 and 50, respectively.Observe that the contour plots are similar considering the different non-informative prior distributions for α and λ, specially for large sample sizes (n = 25 and n = 50).For small sample sizes (n = 5), we observe that the choice of non-informative priors for α and λ could be an important issue in the Bayesian analysis of GE(α, λ) distributions.We also need to appeal to numerical procedures to extract characteristics of marginal posterior distributions such as Bayes estimator, mode and credible intervals.We can then use MCMC algorithm to obtain a sample of values of α and λ from the joint posterior.The chain is run for 25,000 iterations with a burn-in period of size 5,000.The MCMC traceplots for the posterior distribution are shown in Fig. 5, Fig. 6 and Fig. 7, and the resulting marginal distributions are plotted in Fig. 8, Fig. 9 and Fig. 10.It is important to point out that we have used the hyperparameter values a α = b α =a λ = b λ = 0.01 for the gamma priors (25) and a 1 = 1 and a 2 = 1 for the "Farlie-Gumbel" prior Eq. ( 28).We also have used the prior (31) for the parameter δ.
The MCMC plots suggest we have achieved convergence and the algorithm also showed a rate of acceptance around 25-35%.  2 and  3).From the results of Tables 2 and 3, we observe that the Bayesian posterior means for α and λ are very different from the values α = 1.5 and λ = 3.5 used to simulate the data sets considering a small sample size (n = 5).Also observe that in this case, the estimated variances are very large.

Table 2
Posterior mean and variance for the parameter α using the data of   From the results of Figs.(8-10), we observe that the marginal posterior densities for α and λ become more similar assuming the different non-informative priors for α and λ, as the data sample size increases (n = 25 and n = 50).Assuming t = 0.2, the survival function given the parameter values α = 1.5 and λ = 3.5 is given by S(0.2) = 0.64.Assuming the three simulated data sets of Table 1, we have in Figs.(11)(12)(13), the marginal posterior distributions for S(0.2) obtained from the 20,000 simulated Gibbs samples for the joint posterior distribution of α and λ, considering the different non-informative priors introduced in Section 3. From the plots of Figs.(11-13), we observe that the choice of non-informative priors for the parameters α and λ of the GE(α, λ) distribution could be very important to obtain accurate Bayesian inferences for the survival function, since with the use of the "Farlie-Gumbel (copula)" prior we have more accurate Bayesian inferences for S(0.2).In Tables 6 and Table 7, we have the posterior summaries of interest for S(0.2) assuming the different non-informative priors for α and λ.We also observe more accurate Bayesian inferences for the survival function S(0.2), assuming the "Farlie-Gumbel" prior distribution for the parameters α and λ.In Fig. 16, we have the plots of the empirical and fitted survival functions modeled by the generalized exponential for α ˆ = 5.23 and λ ˆ = 0.031 considering distribution with density given in Eq. ( 1) and assuming the Bayesian estimators b the copula prior distribution for α and λ (see Tables 8 and 9).From Fig. 16, we observe a good fit of the generalized exponential distribution for the data (Lawless data).

Conclusion
The use of the generalized exponential distribution with density (1) could be a good alternative to analyse lifetime data, in comparison to the popular gamma distribution.Observe that the survival function (see ( 2)) for the GE(α, λ) distribution presents a closed analytical form and the survival function for the gamma distribution presents an incomplete gamma function.In this way, since lifetimes of medical or engineering applications present censored data, the use of GE(α, λ) can have good computational advantages.Accurate inference for the parameters of the GE(α, λ) could be obtained using MCMC methods for a Bayesian analysis of the model.In this way, we need to choose an appropriate prior distribution for the parameters of the model, especially in the situations where we do not have expert opinion to build our prior.From the results of this paper we observe that the use of copula prior distributions for the random quantities α and λ could improve our inference results, especially considering the survival function at specified time t.These results are of great interest in applications.

Fig. 11 .Fig. 12 .Fig. 13 .
Fig. 11.Posterior densities of survival function S for n = 5 Fig. 12. Posterior densities of survival function S for n = 25 Fig. 13.Posterior densities of survival function S for n = 50 Table 6Posterior mean and variance for the parameter S

Table 3
Posterior mean and variance for the parameter λ using the data of Table1

Table 6
Posterior mean and variance for the parameter S (The values between parentheses express the posterior variance)

Table 10
Posterior summaries for S (Lawless data)