New Flexible Regression Models Generated by Gamma Random Variables with Censored Data

We propose and study a new log-gamma Weibull regression model. We obtain explicit expressions for the raw and incomplete moments, quantile and generating functions and mean deviations of the log-gamma Weibull distribution. We demonstrate that the new regression model can be applied to censored data since it represents a parametric family of models which includes as sub-models several widely-known regression models and therefore can be used more effectively in the analysis of survival data. We obtain the maximum likelihood estimates of the model parameters by considering censored data and evaluate local influence on the estimates of the parameters by taking different perturbation schemes. Some global-influence measurements are also investigated. Further, for different parameter settings, sample sizes and censoring percentages, various simulations are performed. In addition, the empirical distribution of some modified residuals are displayed and compared with the standard normal distribution. These studies suggest that the residual analysis usually performed in normal linear regression models can be extended to a modified deviance residual in the proposed regression model applied to censored data. We demonstrate that our extended regression model is very useful to the analysis of real data and may give more realistic fits than other special regression models.


Introduction
The CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior) is responsible for assessing every three years, postgraduate courses in the country with the goal of keeping the courses with a level of excellence and contribute to the training of researchers (Horta and Moraes, 2005) in Brazil.According to CAPES, one of the requirements is that the master programs are done in two years and doctoral courses in four years (Moreira et al., 2010).For this reason, the programs graduate has studied how to encourage students before or within the stipulated time.To determine the best strategy to take, a particular program postgraduate noted the time it took his doctoral students to hold, according to sex and age at first registration.
One way to study the effect of these explanatory variables (gender and age of the students) on the completion time is through a location-scale regression model, also known as a model of accelerated lifetime.These models consider that the response variable belongs to a family of distributions characterized by a parameter location and scale parameter.Further details on this class of regression models can be found in Cox and Oakes (1984), Kalbfleisch and Prentice (2002) and Lawless (2003).
However, for the case of parametric models, it is assumed that the time until the end of the PhD is sampled from a continuous distribution.In the context of survival analysis, some distributions have been used to analyze censored data.For example, Leiva et al. (2007) conducted a diagnostic study in a log-Birnbaum-Saunders regression model.Carrasco et al. (2008) defined a regression model considering a modification of the Weibull distribution.Ortega et al. (2011) proposed the beta-Weibull regression model and Silva et al. (2011) proposed the log-Burr XII regression model.Thus, using the same approach adopted in this work, a distribution obtained from a generated gamma family will be expressed in the form of models belonging to the location-scale models.In this way, we can study the influence of explanatory variables on the completion time of doctoral students.
Regression models can be proposed in different forms in survival analysis.Among them, the location-scale regression model (Lawless, 2003) is distinguished and it is frequently used in clinical trials.In this paper, we propose a locationscale regression model based on the log-gamma Weibull (LGW) distribution.We consider a classic analysis for the LGW regression model.The inferential part was carried out using asymptotic distribution of the maximum likelihood estimators (MLEs), which, in situations when the sample is small, may present difficult results to be justified.As an alternative to classic analysis we explore the use of bootstrap method for survival times analysis as a feasible alternative.After modeling, it is important to check assumptions in the model and to conduct a robustness study in order to detect influential or extreme observations that can cause distortions in the results of the analysis.Numerous approaches have been proposed in the literature to detect influential or outlying observations.On the other hand, when using case deletion, all information from a single subject is deleted at once and, therefore, it is hard to tell whether that subject has any influence on a specific aspect of the model.A solution for the earlier problem can be found in a quite different paradigm, being a local influence approach where one again investigates how the results of an analysis are changed under small perturbations in the model, and where these perturbations can be specific interpretations.We develop a similar methodology to detect influential subjects in LGW regression models with censored data.Further, we compare two residuals to assess departures from the error assumptions and to detect outlying observations in the LGW regression models with censored observations.For different parameter settings, sample sizes and censoring percentages, various simulation studies are performed and the empirical distribution of each residual is displayed and compared with the standard normal distribution.
In Section 2, we perform a brief review on the GW and LGW distributions and derive the quantile function (qf) of the last distribution.Some of the mathematical properties of the LGW model such as the ordinary and incomplete moments, mean deviations, Bonferroni and Lorenz curves and generating function are investigated in Section 3. In Section 4, we consider a brief study on the GW distribution and present certain the characterizations of LGW distribution.In Section 5, we obtain the MLEs and the estimates based on a bootstrap method and provide some results from simulation studies for the LGW regression model with censored data.In Section 6, we use diagnostic measures considering three perturbation schemes and case-deletion in the LGW regression model with censored observations.Section 7 deals with the definition and discussion of the residuals and presents useful comments on the results from various simulation studies.In Section 8, a real data set is analyzed for illustrative purposes.Finally, we offer some conclusions in Section 9.

The Log-gamma-Weibull Distribution
The art of proposing generalized distributions has attracted theoretical and applied statisticians due to their flexible properties.Most of the distributions used to describe real data are chosen for the following reasons: a physical or statistical theoretical argument to explain the mechanism of the generated data, a model that has previously been used successfully and an appropriate model whose empirical fit is good to data.The Weibull distribution is a very popular distribution for modeling lifetime data and for modeling phenomenon with monotone failure rates.When modeling monotone hazard rates, the Weibull distribution may be an initial choice because of its negatively and positively skewed density shapes.This distribution has cumulative distribution function (cdf) (for t > 0) given by where α > 0 is a shape parameter and λ > 0 a scale parameter.The probability density function (pdf) corresponding to (1) is given by We write T ∼GW(α, λ) for a random variable T having the pdf (2).
There has been an increased interest in defining new univariate continuous distributions by introducing one additional shape parameter to the baseline distribution.The extra parameter has been proved useful in some cases to explore tail properties and to improve the goodness-of-fit of the new distribution.In fact, Zografos and Balakrishnan (2009) and Ristic, and Balakrishnan (2012) proposed a family of univariate distributions generated by one-parameter gamma random variables.Based on any baseline cdf G(t), they defined the gamma-G family with pdf f (t) and cdf F(t) (for ϕ > 0) given by and du is the gamma function, and γ(a, z) = ∫ z 0 u a−1 e −u du is the incomplete gamma function.The gamma-G family has the same parameters of the G distribution plus one additional shape parameter ϕ > 0. For any specified G distribution, we can generate the associated gamma-G distribution.Recently Nadarajah et al. (2015) provide a comprehensive treatment of general mathematical properties of gamma-G distributions.For ϕ = 1, the G distribution is a basic exemplar of the gamma-G distribution with a continuous crossover towards cases with different shapes (for example, a particular combination of skewness and kurtosis).
In this context, we define the gamma Weibull (GW) density function by inserting (1) and (2) in equation (3).So, we obtain where ϕ > 0 and α > 0 are shape parameters and λ > 0 is a scale parameter.By combining the gamma-G and Weibull distributions, we obtain the generalized gamma distribution (Stacy, 1962).The applications of the GW distribution can be directed to model insurance data, tree diameters, software reliability, extreme value observations in floods, carbon fibrous composites, firmware system failure, reliability prediction and fracture toughness, among others.
To obtain a distribution that belongs to the location-scale model, we consider the transformation of the random variable Y = log(T ), which has the log-gamma Weibull (LGW) distribution.Thus, considering the pdf (4) and the transformations α = σ −1 and λ = e µ , the LGW distribution (for y ∈ R) reduces to where ϕ > 0, σ > 0 and µ ∈ R. Therefore, Y follows the LGW distribution.Plots of the pdf (5) for selected parameter values are displayed in Figure 1.These plots show great flexibility for different values of the shape parameter ϕ with µ = 0 and σ = 1.If Y is a random variable having pdf (5), we write Y ∼ LGW(ϕ, µ, σ).
The survival function corresponding to (5) becomes The simulation of Y is very easy: if V is a gamma random variable with shape parameter ϕ and unit scale parameter then Y = µ + σ log(V), will have the LGW density function (5).

Mathematical Properties
In this section, we derive explicit expressions for the ordinary and incomplete moments, generating function and Bonferroni and Lorenz curves for the LGW distribution.Its mathematical properties are not difficult to be implemented in applications because of the computational and analytical facilities available in programming softwares like MATHEMAT-ICA and MAPLE that can easily tackle the problems involved in computing the special functions in these properties.

Moments
The need for necessity and the importance of moments in any statistical analysis especially in applied work is obvious.Some of the most important features and characteristics of a distribution can be studied through moments (e.g.tendency, dispersion, skewness and kurtosis).The rth raw moment of Y can be expressed as where u = exp . Further, based on a result by Prudnikov et al. (1986, equation 2.6.21.1), we can rewrite (9) as The mean and variance of Y follow from ( 10) as where ψ(•) is the digamma function, ψ(n, ϕ) is the polygamma function and n is a positive integer.
The central moments (µ r ) and cumulants (κ r ) of Y can be determined from (10) as 2 and kurtosis γ 2 = κ 4 /κ 2 2 follow from the second, third and fourth cumulants.Other kinds of moments such L-moments may also be obtained in closed-form, but we consider only the previous moments for reasons of space.
Figure 2 provides some plots of the skewness (Figure 2a) and kurtosis (Figure 2b) for σ = 1.5 as a function of µ for some values of ϕ, whereas Figure 3 provides some plots of the skewness (Figure 3a) and kurtosis (Figure 3b) for µ = 0 as a function of ϕ for some values of σ.It can be noted that when the parameter µ increases the LGW distribution is higher and concentrated than the standard normal distribution, i.e., it has heavy tails.Moreover, when the parameter ϕ decreases, the LGW distribution becomes positive asymmetrical (Figure 2).On the other hand, Figure 3 indicates that when the parameter ϕ increases, the shape of the LGW distribution is flatter than that of the normal distribution.For the parameter ϕ greater than three, this distribution has a negative skewness and if the parameter ϕ is less than three, the distribution is positively skewed.
The nth descending factorial moment of Y is where is the Stirling number of the first kind which counts the number of ways to permute a list of r items into k cycles.So, we can obtain the factorial moments from the ordinary moments given before.

Mean Deviations
For empirical purposes, the shapes of many distributions can be usefully described by what we call the first incomplete moment, which plays an important role for measuring inequality, for example, income quantiles and Lorenz and Bonferroni curves.The first incomplete moment of Y is given by m and using a similar approach of Section 3.1, we can write where where i = √ −1 is the complex unit and L denotes an integration path (see Gradshteyn and Ryzhik, 2000;Section 9.3) for a description of this path.The Meijer G-function contains many integrals with elementary and special functions.Some of these integrals are included in Prudnikov et al. (1986).In the MATHEMATICA software, ψ(ϕ) is denoted by PolyGamma[0, ϕ] and the Meijer G-function is represented by We can obtain using the MATHEMATICA software Applications of equation ( 11) can be addressed to obtain Bonferroni and Lorenz curves defined for a given probability π by B(π) = m 1 (q)/(πµ ′ 1 ) and respectively, where ) is easily calculated from the survival function ( 6) and m 1 (z) is given by (11).

Generating Function
The moment generating function (mgf) provides the basis of an alternative route to analytical results compared with working directly with the pdf and cdf and it is widely used in the characterization of distributions and the application of the skew-normal test (Meintanis, 2010) and other goodness of fit tests (Ghosh, 2013).Therefore, using a result in Prudnikov et al. (1986, equation 2.6.21.1), we can derive the mgf of Y as

Characterizations of LGW Distribution
In this section we present certain characterizations of LGW distribution.The first characterization is based on a simple relationship between two truncated moments.It should be mentioned that for this characterization, the cdf need no have a closed form.We believe, due to the nature of the cdf of LGW, there may not be other possibly interesting characterizations than the ones presented in this section.Our first characterization result borrows from a theorem due to [Gläzel, 1987] , see Theorem G in the Appendix A. Note that the result holds also when the interval I is not closed.Moreover, as shown in [Gläzel, 1990], this characterization is stable in the sense of weak convergence.Here is our first characterization of LGW distribution.
Proposition 1.Let X : Ω → R be a continuous random variable and let q 1 (x) = exp The random variable X belongs to LGW family (5) if and only if the function η defined in Theorem G has the form Proof.Let X be a random variable with density (5), then and and finally Conversely, if η is given as above, then and hence Now, in view of Theorem G, X has density (5) .
Corollary 1.Let X : Ω → R be a continuous random variable and let q 1 (x) be as in Proposition 1.The pdf of X is (5) if and only if there exist functions q 2 and η defined in Theorem G satisfying the differential equation The general solution of the differential equation in Corollary 1 is where D is a constant.Note that a set of functions satisfying the differential equation in Corollary 1, is given in Proposition 1 with D = 0.However, it should be also noted that there are other triplets (q 1 , q 2 , η) satisfying the conditions of Theorem G.

The Log-gamma Weibull Regression Model with Censored Data
The last decade is full of works on generalized classes of regression models, which are always precious for applied statisticians.In practical applications, the lifetimes are affected by explanatory variables such as the cholesterol level, blood pressure and many others.Let x i = (x i1 , . . ., x ip ) ⊤ be the explanatory variable vector associated with the ith response variable y i , for i = 1, . . ., n.
We construct a linear regression model for the response variable y i based on the LGW distribution given by where the random error z i has the pdf (8), β = (β 1 , . . ., β p ) ⊤ , σ > 0 and ϕ > 0 are unknown scalar parameters and x i is the vector of explanatory variables modeling the location parameter Hence, the location parameter vector µ = (µ 1 , . . ., µ n ) ⊤ of the LGW regression model has a linear structure µ = xβ, where x = (x 1 , . . ., x n ) ⊤ is a known model matrix.
Consider a sample (y 1 , x 1 ), . . ., (y n , x n ) of n independent observations, where each random response is defined by y i = min{log(t i ), log(c i )}.We assume non-informative censoring such that the observed lifetimes and censoring times are independent.Let F and C be the sets of individuals for which y i is the log-lifetime or log-censoring, respectively.We consider non-informative censoring such that the observed lifetimes and censoring times are independent.The loglikelihood function for the vector of parameters θ = (ϕ, σ, β ⊤ ) ⊤ from model ( 12) has the form is the pdf (5) and S (y i ) is the survival function ( 6), for i = 1, . . ., n.Therefore, the log-likelihood function for θ reduces to where θ = (ϕ, σ, β ⊤ ) ⊤ is the vector of unknown parameters, r is the observed number of failures and γ(x, ϕ) is the incomplete gamma function.The components of the score vector U(θ) are given by where z i = log(u)du and j = 0, 1, . . ., p.
The MLE θ of θ can be obtained by maximizing the log-likelihood function (13).We use the matrix programming language Ox (MaxBFGS function) (see Doornik, 2007) to calculate the estimate θ.Initial values for β and σ are taken from the fit of the LW regression model with ϕ = 1.
Under general regularity conditions, the asymptotic distribution of √ n( θ − θ) is multivariate normal N p+3 (0, K(θ) −1 ), where K(θ) is the expected information matrix.The asymptotic covariance matrix K(θ) −1 of θ can be approximated by the inverse of the (p + 3) × (p + 3) observed information matrix J(θ) and then the inference on the parameter vector θ can be based on the normal approximation N p+3 (0, J(θ) −1 ) for θ.This multivariate normal N p+3 (0, J(θ) −1 ) distribution can be used to construct approximate confidence regions for some parameters in θ and for the hazard and survival functions.

Bootstrap Re-sampling Method
The bootstrap is a computer-based method for assessing the accuracy of statistical estimates and tests.It was first proposed by Efron (1979).Treat the data as if they were the(true, unknown) population and draw samples (with replacement) from the data as if you were sampling from the population.Repeat the procedure a large number of times(say B) each time computing the quantity of interest.Then, use the B values of the quantity of interest to estimate its unknown distribution.
Let T=(T 1 , . . ., T n ) be an observed random sample and F be the empirical distribution of T. Thus, a bootstrap sample T * is constructed by re-sampling with replacement of n elements of the sample T. The bootstrap estimator of the standard error (Efron and Tibshirani, 1993) is the standard deviation of these bootstrap samples, namely Note that B is the number of bootstrap samples generated.According to Efron and Tibshirani (1993), assuming B ≥ 200, it is generally sufficient to present good results to determine the bootstrap estimates.However, to achieve greater accuracy, a reasonably high B value must be considered.We describe the bias corrected and accelerated (BCa) method for constructing approximated confidence intervals based on the bootstrap re-sampling method (Hashimoto et al., 2013).

Sensitivity Analysis
There are basically two approaches to detecting observations that seriously influence the results of a statistical analysis.One approach is the case-deletion approach, and the second approach is one in which the stability of the estimated outputs with respect to the model inputs is studied via various minor model perturbation schemes.

Global Influence
The assessment of robustness aspects of the parameter estimates in statistical models has been an important concern of various researchers in recent decades.The case deletion measures , which consists of studying the impact on the parameter estimates after dropping individual observations, is probably the most employed technique to detect influential observations (Cook and Weisberg, 1982) A global influence measure considered by Xie and Wei (2007) is a generalization of the Cook distance defined as a standardized norm θ(i) − θ.It is expressed as where L(θ) is the observed information matrix.
Another measures to evaluate the influence is called of likelihood distance and considers the difference between θ(i) and θ.
Thus, the likelihood distance is given by where l( θ) is the value of the logarithm of the likelihood function of the full sample and l( θ(i) ) is the value of the logarithm of the likelihood function of the sample excluding the i-th observation.

Local Influence
A second tool for sensitivity analysis is known as local influence.The local influence measures are calculated after a perturbation scheme in the model or data is established.Thus different perturbation schemes can be used according to the purpose of the analysis.Therefore, considering the model defined in ( 12) and the logarithm of the likelihood function expressed in ( 13), the following perturbation schemes are used: a. Case perturbation Let 0 ≤ ω i ≤ 1 and ω 0 = (1, . . ., 1) ⊤ be the vector representing no perturbation.The logarithm of the log-likelihood disturbed for each model incorporating different weights for each element of the data is defined by ]} .

b. Response variable perturbation
To verify the sensitivity of the model ( 12), it is assumed that the response variable y i = log(t i )(i = 1, . . ., n) is submitted to the additive perturbation scheme such that y * i = y i + ω i sd(y i ), where sd(y i ) is a scaling factor which can be the standard deviation of the logarithm of failure times and ω i ∈ R is a perturbation vector (Silva et al., 2010).In this case, ω 0 = (0, . . ., 0) ⊤ is the vector representing no perturbation and the logarithm likelihood function disturbed is expressed as ]} .

c. Explanatory variable perturbation
Another way of evaluating the sensitivity of the model ( 12) is to consider small perturbations in a particular continuous explanatory variable, denoted by X j .In this case, the explanatory variable is submitted to the additive perturbation scheme, such that x * i j = x i j + ω i sd(x i j ), wheresd(x i j ) is scaling factor that can be the standard deviation of the disturbed explanatory variable (Silva et al., 2010).Thus, considering that x * ⊤ i β = β 0 + β 1 x i1 + . . .+ β j x * i j + . . .+ β p x ip and ω 0 = (0, . . ., 0) ⊤ is the vector representing no perturbation, the logarithm of the likelihood function is given by For all three perturbation schemes, the array of maximum curvature is calculated numerically as , where v = 1, . . ., p + 3, i = 1, . . ., n and ω = (ω 1 , . . ., ω n ) ⊤ is the vector of weights that penalizes the LGW regression model or the observations.

Residual Analysis
In order to study the assumptions of the errors and the presence of outliers, we propose various residuals, for example, Collett (2003), Weisberg (2005) and Colosimo and Giolo (2006).In the context of survival analysis, the deviance residual has been more widely used, because they take into account the information of censored times (Silva et al., 2008).Thus, the deviance residual plot versus the observed times provides a way to verify the adequacy of the adjusted model and help to find atypical observations.The deviance residual is expressed as where is the martingale residuals, sign(•) is a function that leads the values +1 if the argument is positive and −1 if the argument is negative and ẑi = (y i − x ⊤ i β)/ σ for i = 1, . . ., n.

Simulation Study
We performed a simulation to assess the MLEs of the LGW regression model with censored data, and also to investigate the behavior of the empirical distribution of martingale and deviance residuals.For the simulation study, the variables z 1 , . . ., z n of the LGW distribution (8) were generated by the acceptance-rejection method, as explained in Ross (2006) and Bonat et al. (2012).Therefore, for the sample sizes n = 100, n = 300 and n = 500, the values of the parameters of the distributions are set at ϕ = 0.8 and 1.5, σ = 1.0, β 0 = 2.0 and β 1 = 4.0.The survival times are generated from the following algorithm (Zeviani, 2012), adapted in this work for censored data.
i. Generate v ∼ uniform(a 1 , b 1 ), where a 1 and b 1 are chosen to represent the support of random variable with pdf (8).
ii. Generate u ∼ uniform(0, b 2 ), where b 2 was chosen to represent the values of the density (8). iii vi. Generate c ∼ uniform(0, τ), where τ was adjusted to obtain the percentages of right censoring 10% and 30%.
viii.Otherwise, return to step i.
Therefore, 1, 000 samples are generated for each combination of n, ϕ, σ and censoring percentages by means of Monte Carlo simulations, and the MLEs of the model parameters are obtained for each of the samples.Then, for each adjusted model, the residuals r D i (16) are determined.On the other hand, Figures 4-5 display the plots of the residuals versus the expected values of the order statistics of the standard normal distribution.This plot is known as the normal probability plot and serves to assess the departure from the normality assumption of the residuals (Weisberg, 2005).Therefore, the following interpretations are obtained from these plots: the empirical distribution of the deviance residual agrees with the standard normal distribution and as the sample size increases, the empirical distribution of the deviance residual becomes closer to the normal distribution (as illustrated in Figures 4-5).

Application
In this study, information from 49 PhD students of a particular program of postgraduate between the periods 1999 to 2012 are used.The interest of the study is to check whether the median completion time is within the maximum period (four years) stipulated by CAPES.Further, we wish to verify whether gender and age in the year of first registration exert some influence on the completion time of the students and if there is interaction between these explanatory variables.Thus, the response variable was defined as the time (in months) of first registration until the end of the PhD (date of defense).However, students who dropped out or who abandoned the course and did not return or who have not completed the course during this period are considered censored times.So, we define the following variables, y: logarithm of time (months), x 1 : age (years) and x 2 : gender (0= female, 1=male).
In many applications there is qualitative information about the hazard shape, which can help with selecting a particular model.In this context, a device called the total time on test (TTT) plot (Aarset 1987) is useful.The TTT plot is obtained by plotting ∑ n i=1 T i:n ), where r = 1, . . ., n and T i:n , i = 1, . . ., n are the order statistics of the sample, against r/n (Mudholkar et al, 1996).
First, to verify the behavior of the PhD data, the Kaplan-Meier and the TTT curves are displayed in Figure 6.From these plots, it is noted that the TTT-plot indicates that the time until the end of the PhD of students presents an increasing failure rate (Figure 6a).Moreover, the time not present a level above zero as shown in Figure 6b.There will be evidence that the gender of the students did affect the time until the end of the PhD.Thus, in accordance with what was observed in Figure 6, we can consider the following model: To maximize the function ( 13) and obtain the MLEs of the parameters of the proposed model, we use the MaxBGFS subroutine of the matrix programming language Ox version 6.20 with initial values ϕ = 1.000, σ = 0.760, β 0 = 4.880, β 1 = −0.153and β 2 = 0.026, obtained from the fit of the Weibull regression model in software R (version 2.15.1).Thus, Table 1 provides the parameter estimates of the models, standard errors and significance of the parameters for the MLEs and non-parametric bootstrap estimates.By examining the figures in this table, we conclude that the estimates by the two methods are very similar.So, there is an evidence that the presence of the interaction between age and gender at the time of completion of a PhD degree.The next step is to detect possible influential points in the LGW regression model.The measurements of global and local influence are calculated using the matrix programming language Ox (version 6.20).Generalized Cook's distance ( 14) and likelihood distance (15) are displayed in Figure 7. From this figure, it is noted that the cases ♯8, ♯18 and ♯21 are possible influential observations.For the local influence plots, considering perturbation of cases (C d max = 2.1805), the logarithm of time perturbation (C d max = 0.3925) and explanatory variable perturbation x 1 (C d max = 4.4124), it is noted that the points ♯18 and ♯21 can be considered as possible influential observations as illustrated in Figures 8-9.On the other hand, the plot of the deviance residuals versus the fitted values is displayed in Figure 10.It is observed that there is evidence that the observations ♯7, ♯18 and ♯21 are discrepant (Figure 10a).Therefore, the sensitivity analysis (global influence and local influence) and residual analysis detect points ♯18 and ♯21 as possible influential observations.These observations identified as potential influential points correspond to the students who have the following descriptions: i.The observation 18 corresponds to a 31 years old female student, who defended her PhD in a maximum time (60 months).
ii.The observation 21 corresponds to a 31 years old male student, who defended his PhD in a minimum time (35 months).
Thus, to analyze the impact of these observations on the parameter estimates, we adjust the model eliminating individually each observation, and then removing the two observations.In Table 2, we present relative changes (in percentages) of the estimates defined by , where θ j(i) is the MLE without the ith observation.
On the figures of Table 2, we note that the MLEs of the parameters of the LGW regression model are robust to the deletion of influential observations.Moreover, the significance of the estimates of the parameters does not change (at the 5% significance level) after removal of the cases, that is, no changes inferential after removal of observations considered influential in diagnostics plots.Therefore, the observations are kept in the data set.
Finally, we verify the quality of the adjustment range of the LGW regression model by constructing in Figure 11 the normal probability plot for the component of the waste diversion with simulated envelope (Atkinson, 1985).This figure reveals that there is evidence of a good fit of the LGW regression model to the current data.Thus, the fitted model to the data can be expressed as ŷi = 4.1351 − 0.0045 x i1 − 0, 4237 x i2 + 0, 0101 x i1 x i2 , i = 1, . . ., 49.
Thus, according to this model, the time until the end of the PhD for male students varies with age, in this case, older students have the highest degree of time than the younger students.

Concluding Remarks
In this paper, we derive explicit expressions for the raw and incomplete moments, quantile and generating functions and mean deviations of the log-gamma Weibull distribution.Based on this distribution, we construct a new log-gamma Weibull regression model to investigate the effect of the age and gender at the time until the end of a PhD student.We also provide diagnostic measures to test the adequacy of the fitted model.In particular, the estimates of the parameters of the new model obtained by two methods are similar.Further, we provide diagnostic analysis for the fitted model to the the times of the students until the end of their PhD.Considering these aspects, we conclude that the fitted model can explain the effect of the variables on the time and that there is an interaction between age and gender of the students at the time until the end of the PhD.
is the incomplete gamma function, and J(ϕ, z ⋆ ) = ∫ z ⋆ 0 log(u) u ϕ−1 e −u du.The integral J(ϕ, z ⋆ ) can be expressed in terms of the digamma function ψ(ϕ) = d log[Γ(ϕ)]/dϕ and the Meijer G-function defined by For the B bootstrap samples generated, T * 1 , . . ., T * B , the bootstrap replication of the parameter of interest for the bth sample is given by θ * b = s(T * b ), i.e., the value of θ for sample T * b , b = 1, . . ., B.

Figure 9 .Figure 10 .
Figure 9. Index plot of local total C i from the regression model fitted to the current data.(a) Case-weight perturbation.(b) Response variable perturbation.(c) Explanatory variable perturbation, x 1

Table 1 .
MLEs and non-parametric bootstrap estimates for the parameters of the regression model for the current data.

Table 2 .
Relative changes [-RC-in %], estimates and the corresponding p-values in parentheses for the regression coefficients to explain the logarithm of time.