Bayesian Survival Analysis of Generalized DUS Exponential Distribution

Modeling and analysis of survival data have proved fruitful in many fields of science. This paper uses a Bayesian approach to fit the generalized DUS exponential distribution (GDUSED) as a survival model. The distribution builds on the transformation proposed by Kumar, Singh, and Singh (2015), known as the DUS transformation. A real survival data set is used for illustration. Implementation is done using LaplaceApproximation and JAGS. Graphical representations of the probability density function and hazard function of the GDUSED are provided. LaplaceApproximation and JAGS code is provided to handle the censoring mechanism using both optimization and simulation tools.


Introduction
A variety of models for lifetime data is available in the literature. Initially, the exponential distribution was widely used to analyze lifetime data because of its simplicity and analytical tractability. Although the one-parameter exponential distribution has a number of attractive properties, it is inappropriate in situations where the hazard rate is not constant. To accommodate non-constant hazard rates, researchers have worked to develop new lifetime distributions that are more flexible than existing ones in the shape of their probability density function (pdf) and their hazard rate. Kumar et al. (2015) proposed a transformation, called the DUS transformation, for obtaining a new distribution. If G(x) is the baseline cumulative distribution function (cdf), the DUS transformation yields the new cdf F(x) given by

$$F(x) = \frac{e^{G(x)} - 1}{e - 1}.$$

Maurya, Kaushik, Singh, and Singh (2017) proposed another class of distributions, which accommodates a wide range of hazard rate shapes for a suitable choice of the shape parameter, by applying the DUS transformation to the exponentiated cdf; this is henceforth referred to as the generalized DUS (GDUS) transformation. This new distribution is of interest to researchers who wish to use it in applied work through Bayesian analysis. Survival analysis has numerous applications in various fields of science, for example, in biological science, medicine, engineering, management, and public health. Lifetime distributions are used to model the life of an item in order to study its important features. A suitable distribution can therefore yield valuable information and support sound decisions.
In this paper, we examine how the Bayesian approach fits the GDUSED model using LaplaceApproximation and JAGS (Just Another Gibbs Sampler). The tools and techniques used here are implemented with the LaplacesDemon and R2jags packages of R. Based on Bayes' rule, Bayesian inference provides a coherent way to update beliefs in the light of new data. Because high-dimensional problems are difficult to solve analytically, we rely on two packages. The first is LaplacesDemon (Statisticat (2015)), which facilitates high-dimensional Bayesian inference. The second is R2jags, which allows JAGS to be run directly from R and is used for simulation from the posterior density. The jags function takes data and starting values as input, automatically writes a JAGS script, calls the model, and saves the simulations for easy access in R. A real survival data set is used to illustrate LaplaceApproximation and JAGS. The Bayesian analysis of the GDUSED model is carried out with the following aims:
• To describe a Bayesian model, that is, to specify the likelihood and the prior distribution.
• To write the R code for approximating posterior densities and for simulation, with LaplaceApproximation and JAGS.
• To illustrate numerical as well as graphical summaries of the posterior densities.

The generalized DUS exponential distribution
Maurya et al. (2017) proposed a class of distributions that accommodates a wide range of hazard rate shapes for a proper choice of the shape parameter. They applied the DUS transformation to the exponentiated cdf, henceforth referred to as the generalized DUS (GDUS) transformation. The distribution obtained by the GDUS transformation can have monotone as well as bathtub-shaped hazard rates, depending on the values of the parameters. To illustrate, we take the exponential distribution as the baseline distribution because of its simplicity and prominence in life-testing problems, even though its own use is limited to phenomena where the hazard rate is constant.
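As a concrete illustration of the construction described above, the following Python sketch evaluates the cdf, pdf, survival, and hazard functions of the generalized DUS exponential distribution. It assumes the GDUS cdf F(t) = (exp(G(t)^α) − 1)/(e − 1) with exponential baseline G(t) = 1 − exp(−λt), where λ is treated as a rate. The function names and this parameterization are assumptions of the sketch and are not taken from the paper's appendices, which use R.

```python
import math

def gduse_cdf(t, alpha, lam):
    """CDF, assuming the baseline G(t) = 1 - exp(-lam * t) with rate lam."""
    if t <= 0:
        return 0.0
    g = 1.0 - math.exp(-lam * t)
    return math.expm1(g ** alpha) / (math.e - 1.0)

def gduse_pdf(t, alpha, lam):
    """PDF, obtained by differentiating gduse_cdf with respect to t."""
    if t <= 0:
        return 0.0
    g = 1.0 - math.exp(-lam * t)
    return (alpha * lam * math.exp(-lam * t) * g ** (alpha - 1.0)
            * math.exp(g ** alpha) / (math.e - 1.0))

def gduse_survival(t, alpha, lam):
    """Survival function S(t) = 1 - F(t)."""
    return 1.0 - gduse_cdf(t, alpha, lam)

def gduse_hazard(t, alpha, lam):
    """Hazard h(t) = f(t) / S(t); undefined once S(t) underflows to 0."""
    return gduse_pdf(t, alpha, lam) / gduse_survival(t, alpha, lam)

# Tabulate the hazard for two illustrative (hypothetical) shape values.
for alpha in (0.5, 2.0):
    hazards = [gduse_hazard(t, alpha, 1.0) for t in (0.1, 0.5, 1.0, 2.0)]
    print(alpha, [round(h, 4) for h in hazards])
```

Printing the hazard on a grid of times for several values of α is a quick way to inspect which hazard shapes the model produces for a given parameter choice.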

Bayesian inference
Bayesian inference can be viewed as a three-step process:
1. Establishing a full probability model for all observable and unobservable quantities. This model should be consistent with the available knowledge about the data being modeled and how they were collected.
2. Computing the posterior probability of the unknown quantities conditional on the observed quantities. The unknown quantities may include unobservable quantities such as parameters and potentially observable quantities such as predictions for future observations.
3. Evaluating the fit of the model to the data. This includes evaluating the implications of the posterior.
Three components are central to Bayesian inference:
• Prior distribution p(θ): expresses uncertainty about the parameter θ, quantified as a probability distribution, before the data are taken into account.
• Likelihood p(y|θ): relates the observed variables to the parameter within the full probability model.
• Posterior distribution p(θ|y): the joint posterior distribution, which expresses uncertainty about the parameter θ after taking both the prior and the data into account, as in the following equation:

$$p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)} \propto p(y \mid \theta)\, p(\theta).$$

The prior distributions
In Bayesian inference, the prior distribution carries the information about an uncertain parameter θ that is combined with the probability distribution of the data to obtain the posterior distribution p(θ|y). It is therefore important to express the prior information through a distribution for the parameter. The information available before analyzing the experimental data, expressed as a probability distribution, is referred to as the prior probability distribution (or simply the prior). In this article, we use two types of priors, the half-Cauchy prior and the normal prior. The simplest type of prior is a conjugate prior, which simplifies posterior calculations: a conjugate prior for an unknown parameter leads to a posterior distribution with simple formulas for the posterior mean and variance. AbuJarad and Khan (2018a) applied the half-Cauchy distribution with scale parameter α = 25.
First, the probability density function of the half-Cauchy distribution with scale parameter α is defined as

$$f(x) = \frac{2\alpha}{\pi (x^2 + \alpha^2)}, \qquad x > 0,\ \alpha > 0.$$

The mean and variance of the half-Cauchy distribution do not exist, but its mode is equal to 0. The half-Cauchy distribution with scale α = 25 is a recommended default, weakly informative prior distribution for a scale parameter. At this scale, the density of the half-Cauchy is almost, but not completely, flat (see Figure 2). Priors that are not completely flat provide enough information for the numerical approximation algorithm to keep exploring the target density (the posterior distribution). The inverse-gamma distribution is often used as a non-informative prior for a scale parameter, but it causes problems in the model fitting process (Gelman and Hill (2006)). If more prior information is needed, the half-Cauchy is a favorable option. Consequently, in this article, the half-Cauchy distribution with scale parameter α = 25 is used as a weakly informative prior distribution.
Next, every regression parameter is assigned a weakly informative normal (Gaussian) prior. The parameters β_j are given independent normal priors with mean 0 and standard deviation 1000, i.e., β_j ∼ N(0, 1000), which yields a nearly flat prior, as shown in Figure 2. The large variance reflects considerable uncertainty about each parameter and hence a weakly informative prior.
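The near-flatness of both priors can be checked numerically. The sketch below, written in Python with hypothetical function names, evaluates the half-Cauchy density with scale 25 and the normal density with mean 0 and standard deviation 1000, and compares each density at x = 5 with its value at x = 0. Reading 1000 as a standard deviation follows the text above.

```python
import math

def half_cauchy_pdf(x, scale=25.0):
    """Half-Cauchy density 2*scale / (pi * (x^2 + scale^2)) for x >= 0."""
    if x < 0:
        return 0.0
    return 2.0 * scale / (math.pi * (x * x + scale * scale))

def half_cauchy_cdf(x, scale=25.0):
    """Half-Cauchy cdf (2/pi) * arctan(x / scale) for x >= 0."""
    return 0.0 if x < 0 else (2.0 / math.pi) * math.atan(x / scale)

def normal_pdf(x, mean=0.0, sd=1000.0):
    """Normal density with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

# Density at x = 5 relative to the density at the mode x = 0.
# Values close to 1 indicate an almost flat prior over that range.
ratio_half_cauchy = half_cauchy_pdf(5.0) / half_cauchy_pdf(0.0)
ratio_normal = normal_pdf(5.0) / normal_pdf(0.0)
print(ratio_half_cauchy, ratio_normal)
```

Both ratios are close to 1, with the normal prior the flatter of the two over this range, which is what "weakly informative" means in practice for parameters of moderate magnitude.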

Laplace approximation
The Laplace approximation is a technique that dates back to De Laplace (1774). The technique requires the prior information to be specified before the data are analyzed. Many simple Bayesian analyses based on non-informative prior distributions give results similar to standard non-Bayesian approaches, for example, the posterior t-interval for the normal mean with unknown variance. How far a non-informative prior can be justified depends on the amount of information in the data: in simple cases, as the sample size n increases, the influence of the prior distribution on posterior inference decreases. These ideas are sometimes referred to as asymptotic approximation theory because they concern properties that hold in the limit as n becomes large. The Laplace approximation is one such asymptotic method, and in many cases it accurately approximates the posterior moments and marginal posterior densities of unimodal posteriors. This section gives a concise, informal description of the method.
Suppose −h(θ) is a smooth, bounded, unimodal function with a maximum at $\hat{\theta}$, where θ is a scalar. By the Laplace approximation (e.g., Tierney and Kadane (1986)), the integral

$$I = \int f(\theta) \exp[-n h(\theta)] \, d\theta$$

can be approximated by

$$\hat{I} = f(\hat{\theta}) \sqrt{\frac{2\pi}{n}} \, \sigma \exp[-n h(\hat{\theta})], \qquad \sigma = \left[ h''(\hat{\theta}) \right]^{-1/2}.$$

As presented in Mosteller and Wallace (1964), the Laplace approximation expands h about $\hat{\theta}$ to obtain

$$I \approx \int f(\hat{\theta}) \exp\left\{ -n \left[ h(\hat{\theta}) + (\theta - \hat{\theta}) h'(\hat{\theta}) + \frac{(\theta - \hat{\theta})^2}{2} h''(\hat{\theta}) \right] \right\} d\theta.$$

Setting $h'(\hat{\theta}) = 0$, we get

$$I \approx f(\hat{\theta}) \exp[-n h(\hat{\theta})] \int \exp\left[ -\frac{n (\theta - \hat{\theta})^2}{2 \sigma^2} \right] d\theta = \hat{I}.$$

Intuitively, if exp[−nh(θ)] is sharply peaked about $\hat{\theta}$, then the integral is well approximated by the behavior of the integrand near $\hat{\theta}$, which can be expressed as

$$I = \hat{I} \left[ 1 + O(1/n) \right].$$

To evaluate moments of posterior distributions, we need to evaluate expressions such as the following (Tanner (2012)):

$$E[g(\theta)] = \frac{\int g(\theta) \exp[-n h(\theta)] \, d\theta}{\int \exp[-n h(\theta)] \, d\theta},$$

where exp[−nh(θ)] = L(θ|y)p(θ).
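The approximation above can be tried on an integral with a known value. The following Python sketch is an illustration under stated assumptions, not the LaplaceApproximation routine: it takes h(θ) = θ − log θ and f(θ) = 1, for which the minimizer of h is 1 and the exact integral over (0, ∞) equals Γ(n + 1)/n^(n+1). The helper name laplace_integral and the finite-difference step are choices made for this example.

```python
import math

def laplace_integral(f, h, theta_hat, n, eps=1e-4):
    """Laplace approximation to the integral of f(theta) * exp(-n * h(theta)).

    theta_hat must be the minimizer of h (the maximizer of -h).
    """
    # Central-difference estimate of the second derivative h''(theta_hat).
    h2 = (h(theta_hat + eps) - 2.0 * h(theta_hat) + h(theta_hat - eps)) / eps ** 2
    sigma = h2 ** -0.5
    return (f(theta_hat) * math.sqrt(2.0 * math.pi / n) * sigma
            * math.exp(-n * h(theta_hat)))

n = 50
h = lambda theta: theta - math.log(theta)   # minimized at theta = 1
f = lambda theta: 1.0

approx = laplace_integral(f, h, theta_hat=1.0, n=n)
# Exact value Gamma(n + 1) / n^(n + 1), computed on the log scale.
exact = math.exp(math.lgamma(n + 1) - (n + 1) * math.log(n))
ratio = exact / approx
print(approx, exact, ratio)
```

In this example the approximation reproduces Stirling's formula, so the ratio of the exact to the approximate value differs from 1 by a term of order 1/n, in line with the error statement above.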

Bayesian analysis of the generalized DUS exponential
The goal of Bayesian analysis is to obtain the marginal posterior distribution of the parameters of interest. In principle, the route to this goal is straightforward. We first obtain the joint posterior distribution of all unknown parameters, and then integrate this distribution over the unknowns that are not of immediate interest to obtain the desired marginal distribution. Equivalently, using simulation, we draw samples from the joint posterior distribution and then look only at the parameters of interest, ignoring the values of the other unknowns (see, e.g., AbuJarad and Khan (2018b)).

The generalized DUS exponential
The probability density function (pdf) f(t; α, λ) and the survival function S(t; α, λ) of the generalized DUS exponential model are obtained by applying the GDUS transformation to the exponential cdf. Since the data in our case are right censored, the likelihood function for right-censored data is

$$L = \prod_{i=1}^{n} [f(t_i)]^{\delta_i} [S(t_i)]^{1 - \delta_i},$$

where δ_i is an indicator variable that takes the value 0 if the observation is censored and the value 1 if the observation is uncensored.
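For reference, applying the GDUS transformation of Maurya et al. (2017) to the exponential cdf G(t) = 1 − e^{−λt} gives the expressions below. They are written under the assumption that λ enters as a rate parameter; the notation of the paper's own displayed equations may differ.

```latex
\begin{align}
F(t;\alpha,\lambda) &= \frac{\exp\{(1-e^{-\lambda t})^{\alpha}\} - 1}{e-1}, \\
f(t;\alpha,\lambda) &= \frac{\alpha\lambda\, e^{-\lambda t}\,(1-e^{-\lambda t})^{\alpha-1}
                       \exp\{(1-e^{-\lambda t})^{\alpha}\}}{e-1}, \\
S(t;\alpha,\lambda) &= 1 - F(t;\alpha,\lambda)
                     = \frac{e - \exp\{(1-e^{-\lambda t})^{\alpha}\}}{e-1}, \\
h(t;\alpha,\lambda) &= \frac{f(t;\alpha,\lambda)}{S(t;\alpha,\lambda)}
                     = \frac{\alpha\lambda\, e^{-\lambda t}\,(1-e^{-\lambda t})^{\alpha-1}
                       \exp\{(1-e^{-\lambda t})^{\alpha}\}}{e - \exp\{(1-e^{-\lambda t})^{\alpha}\}},
\qquad t > 0,\ \alpha > 0,\ \lambda > 0.
\end{align}
```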
The likelihood function of the generalized DUS exponential model, in terms of f(t; α, λ) and S(t; α, λ), is

$$L(\alpha, \lambda \mid t) = \prod_{i=1}^{n} [f(t_i; \alpha, \lambda)]^{\delta_i} [S(t_i; \alpha, \lambda)]^{1 - \delta_i}.$$

The posterior distribution is obtained in the standard way by applying Bayes' rule (Statisticat (2015)); the joint posterior density, as given by AbuJarad and Khan (2018a), is

$$p(\alpha, \beta \mid t, X) \propto L(\alpha, \beta \mid t, X)\, p(\alpha)\, p(\beta).$$

To complete the Bayesian specification of the generalized DUS exponential model, we choose prior distributions for α and the β_j. The choice of prior distributions is discussed in section 4. We take the prior for α to be half-Cauchy, HC(0, 25), and the prior for each β_j to be normal, N(0, 1000). Applying Bayes' rule, equation (7), to the likelihood in equation (8) gives the posterior density for α and β shown in equation (9). The marginal posterior distributions involve a high-dimensional integral over all model parameters β_j. To solve this integral, we approximate it by means of Markov chain Monte Carlo (MCMC) methods. With software such as the LaplacesDemon package, this model can be fitted easily in the Bayesian paradigm through the Laplace approximation as well as MCMC techniques, and also through JAGS.

IUD data set
To investigate menstrual bleeding patterns in women using contraceptives, the World Health Organization (WHO 1987) made available clinical trial data on 18 women aged between 18 and 35 years who used an intrauterine device (IUD) known as the Multi-load 250. The response is the time until discontinuation because of menstrual bleeding problems. The time origin is the first day of IUD use, and the end point is discontinuation because of bleeding problems. Some women in the study stopped using the IUD because they wished to become pregnant, because they no longer needed a contraceptive, or because they were lost to follow-up. The study aimed to follow each woman for two years from the time origin. For practical reasons, the women could not all be examined at exactly two years to see whether they were still using the IUD. This explains why there are three right-censored discontinuation times of more than 104 weeks. The table below gives the number of weeks of IUD use (Collett (2015)).

Implementation using Laplace approximation
Bayesian fitting of the generalized DUS exponential model to these data can be done in R using the function LaplaceApproximation for analytic approximation and then the function LaplacesDemon for MCMC simulation. Both functions are provided by the LaplacesDemon package.

Creation of IUD data
Although most R functions take data as a data frame, Laplace's Demon uses one or more numeric matrices in a list. A numeric matrix is much faster to process than a data frame in iterative estimation. The data above give the survival times of the 18 women using the IUD. The data and related code are given in Appendix A2.

Specification of the generalized DUS exponential
We now turn to the posterior estimates of the parameters when the generalized DUS exponential model (GDUSED) is fitted to the data above. The first requirement for the Bayesian fitting is the definition of the likelihood. The likelihood function can be written as

$$L(\alpha, \beta \mid t, X) = \prod_{i=1}^{n} [f(t_i)]^{\delta_i} [S(t_i)]^{1 - \delta_i},$$

and taking the logarithm of this likelihood function gives

$$\log L(\alpha, \beta \mid t, X) = \sum_{i=1}^{n} \left[ \delta_i \log f(t_i) + (1 - \delta_i) \log S(t_i) \right],$$

where λ = exp(Xβ), Xβ is a linear combination of explanatory variables, and log denotes the natural logarithm. The Bayesian framework requires the specification of prior distributions for the parameters. Taking a subjective view, we introduce weakly informative priors for the parameters. The priors for β and α are taken to be normal and half-Cauchy, respectively (see, e.g., AbuJarad and Khan (2018a)):
β_j ∼ N(0, 1000), j = 1, 2, 3, ..., J,
α ∼ HC(0, 25).
In this way, we obtain the log posterior of the GDUSED. The model specification and related code are given in Appendix A3.
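The log posterior described above can be sketched as follows. This Python version is an illustration only, not the R model function of Appendix A3. It assumes the rate parameterization λ = exp(Xβ) of the exponential baseline, works with log α as the free parameter without a Jacobian term, and uses a small hypothetical data set in place of the IUD data.

```python
import math

LOG_E_MINUS_1 = math.log(math.e - 1.0)

def log_pdf(t, alpha, lam):
    """Log density of the GDUSE model (rate parameterization assumed)."""
    g = -math.expm1(-lam * t)                     # 1 - exp(-lam * t)
    return (math.log(alpha) + math.log(lam) - lam * t
            + (alpha - 1.0) * math.log(g) + g ** alpha - LOG_E_MINUS_1)

def log_survival(t, alpha, lam):
    """Log survival function of the GDUSE model."""
    g = -math.expm1(-lam * t)
    return math.log(math.e - math.exp(g ** alpha)) - LOG_E_MINUS_1

def log_normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)

def log_half_cauchy_pdf(x, scale):
    return math.log(2.0 * scale) - math.log(math.pi) - math.log(x * x + scale * scale)

def log_posterior(params, times, delta, X):
    """Unnormalized log posterior; params = [beta_1, ..., beta_J, log_alpha]."""
    *beta, log_alpha = params
    alpha = math.exp(log_alpha)
    log_lik = 0.0
    for t, d, row in zip(times, delta, X):
        lam = math.exp(sum(b * x for b, x in zip(beta, row)))
        # Uncensored observations contribute log f, censored ones log S.
        log_lik += log_pdf(t, alpha, lam) if d == 1 else log_survival(t, alpha, lam)
    log_prior = sum(log_normal_pdf(b, 0.0, 1000.0) for b in beta)  # beta_j ~ N(0, sd 1000)
    log_prior += log_half_cauchy_pdf(alpha, 25.0)                  # alpha ~ HC(25)
    return log_lik + log_prior

# Hypothetical toy data: times, censoring indicators (1 = uncensored), intercept-only design.
times = [0.5, 1.2, 2.0, 0.8, 3.0]
delta = [1, 0, 1, 1, 0]
X = [[1.0] for _ in times]
value = log_posterior([0.0, 0.0], times, delta, X)
print(value)
```

Working on the log scale keeps the likelihood contributions numerically stable, and a function of this shape is what an optimizer or an MCMC sampler evaluates repeatedly.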

Fitting of generalized DUS exponential with Laplace approximation
We now use Laplace's method, via the function LaplaceApproximation, to approximate the joint posterior density. Several optimization algorithms are implemented in this function. Among them, we have found that BFGS, LBFGS, and TR perform well in most cases. For this particular case, the trust region (TR) algorithm of Nocedal and Wright (1999) is preferred because it converges in the smallest number of iterations. To start the optimization, initial values for the parameters must be supplied, and the regression coefficients are initialized at zero. The function LaplaceApproximation is then called with TR as the method to approximate the posterior densities of the GDUSED model for the data. The result is stored in an object named GDUSEDLA. The model fit with the Laplace approximation and related code are given in Appendix A4.
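The idea behind this step, locating the posterior mode by optimization and reading a standard deviation off the curvature at the mode, can be sketched in one dimension. The Python example below uses plain Newton iterations with finite-difference derivatives rather than the trust region algorithm named above, and a toy log posterior a·log θ − b·θ whose mode a/b and curvature are known in closed form. All names and values are hypothetical.

```python
import math

def numeric_derivatives(fn, x, eps=1e-4):
    """Central-difference estimates of the first and second derivatives of fn at x."""
    d1 = (fn(x + eps) - fn(x - eps)) / (2.0 * eps)
    d2 = (fn(x + eps) - 2.0 * fn(x) + fn(x - eps)) / eps ** 2
    return d1, d2

def posterior_mode_newton(log_post, start, tol=1e-8, max_iter=100):
    """Find the mode of a one-dimensional log posterior by Newton's method.

    Returns the mode and the standard deviation implied by the curvature
    at the mode, sqrt(-1 / log_post''(mode)).
    """
    theta = start
    for _ in range(max_iter):
        d1, d2 = numeric_derivatives(log_post, theta)
        step = d1 / d2
        theta -= step
        if abs(step) < tol:
            break
    _, d2 = numeric_derivatives(log_post, theta)
    return theta, math.sqrt(-1.0 / d2)

# Toy log posterior with known mode a / b = 3 and curvature-based SD sqrt(a) / b = 1.
a, b = 9.0, 3.0
log_post = lambda theta: a * math.log(theta) - b * theta

mode, sd = posterior_mode_newton(log_post, start=1.0)
print(mode, sd)
```

In several dimensions the second derivative becomes the Hessian matrix, and the inverse of its negative plays the role of the approximate posterior covariance.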

Summarizing output
The analytic results obtained with the LaplaceApproximation function are shown in Table 2. The posterior modes of the parameters beta and log.alpha are 4.16 ± 0.40 and −0.65 ± 0.37, respectively. According to the 95% credible intervals, beta is statistically significant and log.alpha is statistically non-significant. Hence, they are treated as appropriate parameters for modeling the survival data. The simulated results obtained with the sampling importance resampling (SIR) method are shown in Table 3, which reports the posterior mode (Mode), posterior standard deviation (SD), Monte Carlo standard error (MCSE), effective sample size (ESS), and the credible interval bounds LB (2.5%), Median (50%), and UB (97.5%).
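Sampling importance resampling, mentioned above, can be sketched in a few lines. The Python example below is a generic one-dimensional illustration and not the routine inside LaplaceApproximation: it draws from a normal proposal, weights each draw by the ratio of the target to the proposal density, and resamples in proportion to the weights. The target, the proposal settings, and the function name sir are assumptions of the example.

```python
import math
import random

def sir(log_target, proposal_mean, proposal_sd, n_proposals, n_resample, rng):
    """Sampling importance resampling with a normal proposal distribution."""
    draws = [rng.gauss(proposal_mean, proposal_sd) for _ in range(n_proposals)]
    # Log importance weights: log target minus log proposal density.
    # Additive constants are dropped because the weights are normalized by resampling.
    log_w = [log_target(x) + 0.5 * ((x - proposal_mean) / proposal_sd) ** 2
             for x in draws]
    max_log_w = max(log_w)                      # subtract the maximum for stability
    weights = [math.exp(lw - max_log_w) for lw in log_w]
    return rng.choices(draws, weights=weights, k=n_resample)

rng = random.Random(1)
# Toy unnormalized target: normal with mean 1 and standard deviation 0.5.
log_target = lambda x: -0.5 * ((x - 1.0) / 0.5) ** 2

sample = sir(log_target, proposal_mean=0.0, proposal_sd=2.0,
             n_proposals=20000, n_resample=5000, rng=rng)
sample_mean = sum(sample) / len(sample)
sample_sd = math.sqrt(sum((x - sample_mean) ** 2 for x in sample) / (len(sample) - 1))
print(sample_mean, sample_sd)
```

The resampled values behave approximately like draws from the target, so summaries such as means, standard deviations, and quantiles can be computed from them directly.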

Simulation study of IUD data of GDUSED
In this section, simulation is performed using the random walk Metropolis algorithm, with the IUD data used for illustration. The R commands for IM use the function LaplacesDemon and produce an object named GDUSEDLD. The results are summarized in Table 4, and the posterior density plot is presented in Figure 3. The simulation study of the IUD data with the GDUSED and related code is given in Appendix A5.

Implementation using JAGS
JAGS, Just Another Gibbs Sampler, was written mainly by Martyn Plummer (2011) to provide an engine for the BUGS language. It is a program for the analysis of Bayesian models using Markov chain Monte Carlo (MCMC). R2jags (Su and Yajima (2012)) is an R package that allows JAGS models to be fitted from within R (Gelman et al. (2013)). JAGS is built on a version of the numerical library (Rmath) used by R, so many of the mathematical and statistical functions in base R are also available in JAGS (Lunn, Jackson, Best, Spiegelhalter, and Thomas (2012)). We consider the Bayesian analysis of the IUD data with JAGS through its R interface, the R2jags package, which is designed for inference on Bayesian models using MCMC simulation from the posterior density. The jags function takes data and starting values as input. It automatically writes a JAGS script, calls the model, and saves the simulations for easy access in R.
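Both the LaplacesDemon simulation study and JAGS rely on Metropolis-type MCMC updates. A minimal random walk Metropolis sampler can be sketched in Python as follows. It is a generic illustration with a toy bivariate normal target, a fixed proposal standard deviation, and hypothetical names, not the algorithm configuration used for the results in the paper.

```python
import math
import random

def random_walk_metropolis(log_target, start, step_sd, n_iter, rng):
    """Random walk Metropolis with independent normal proposals per coordinate."""
    current = list(start)
    current_lp = log_target(current)
    draws, accepted = [], 0
    for _ in range(n_iter):
        proposal = [c + rng.gauss(0.0, step_sd) for c in current]
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
            accepted += 1
        draws.append(list(current))
    return draws, accepted / n_iter

def log_target(theta):
    """Toy target: independent unit-variance normals with means 1 and -2."""
    return -0.5 * ((theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2)

rng = random.Random(42)
draws, acceptance_rate = random_walk_metropolis(
    log_target, start=[0.0, 0.0], step_sd=1.0, n_iter=20000, rng=rng)

kept = draws[2000:]                               # discard burn-in draws
mean_0 = sum(d[0] for d in kept) / len(kept)
mean_1 = sum(d[1] for d in kept) / len(kept)
print(acceptance_rate, mean_0, mean_1)
```

Replacing log_target with the log posterior of the model of interest gives a basic sampler for that model; in practice the proposal scale is tuned so that the acceptance rate is neither very high nor very low.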

Creation of data
To fit the model with JAGS, the data must be supplied as a list containing each named vector. The R code for this is given in Appendix A6.

Definition of the generalized DUS exponential
The GDUSED is used to model the IUD data, as presented in the following formula:

$$y \sim \mathrm{GDUSED}(\alpha, \lambda), \qquad \lambda = \exp(X\beta),$$

where X is the model matrix and β is the vector of regression coefficients.
The JAGS code for this model is given in Appendix A7.

Summarizing output
Table 5 summarizes the JAGS simulations after fitting the GDUSED(α, λ) model to the IUD data. JAGS simulates from the posterior density using a Metropolis-within-Gibbs algorithm. The Rhat values are very close to 1.0, which indicates good convergence. A plot of the posterior densities is shown in Figure 4.
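The Rhat diagnostic referred to above compares the variation between chains with the variation within chains. The Python sketch below computes the classic (non-split) Gelman-Rubin potential scale reduction factor for a single parameter. It illustrates the idea on simulated chains and is not guaranteed to match the exact variant reported by R2jags; the function name and test chains are hypothetical.

```python
import math
import random
import statistics

def gelman_rubin_rhat(chains):
    """Classic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average of the within-chain sample variances.
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    # B: n times the sample variance of the chain means.
    between = n * statistics.variance(chain_means)
    var_plus = (n - 1) / n * within + between / n
    return math.sqrt(var_plus / within)

rng = random.Random(7)
# Four chains sampling the same distribution: Rhat should be close to 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Two chains stuck around different values: Rhat should be well above 1.
stuck = [[rng.gauss(center, 1.0) for _ in range(1000)] for center in (0.0, 3.0)]

rhat_mixed = gelman_rubin_rhat(mixed)
rhat_stuck = gelman_rubin_rhat(stuck)
print(rhat_mixed, rhat_stuck)
```

Values close to 1 indicate that the chains agree with one another, while values well above 1 indicate that they have not yet converged to a common distribution.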

Conclusion
In this paper, the GDUSED has been used to analyze the IUD data in the Bayesian paradigm, with a real survival data set used for illustration. The analytic approximation and simulation methods are implemented using the LaplacesDemon and R2jags packages of R. The findings indicate that the simulation tools provide better results than those obtained by asymptotic approximation.
Here y is the vector of survival times containing both groups, and censor is a binary vector of censoring indicators, with 1 for an uncensored observation and 0 for a censored observation.