The Generalized Weighted Lindley Distribution: Properties, Estimation and Applications

In this paper, we propose a new lifetime distribution, namely the generalized weighted Lindley (GWL) distribution. The GWL distribution is a useful generalization of the weighted Lindley distribution that accommodates increasing, decreasing, decreasing-increasing-decreasing, bathtub, or unimodal hazard functions, making it a flexible model for reliability data. A significant account of the mathematical properties of the new distribution is presented. Several estimation procedures are also given: maximum likelihood, method of moments, ordinary and weighted least squares, percentile, maximum product of spacings, and minimum distance estimators. The estimators are compared through extensive numerical simulations. Finally, we analyze two data sets for illustrative purposes, showing that the GWL distribution outperforms several common three-parameter lifetime distributions.


Introduction
In recent years, several new extensions of the exponential distribution have been introduced in the literature for describing real problems. Ghitany, Atieh, and Nadarajah (2008) investigated different properties of the Lindley distribution and outlined that in many cases the Lindley distribution

PUBLIC INTEREST STATEMENT
We have proposed and presented a probability distribution called the generalized weighted Lindley (GWL) distribution. This distribution is a useful generalization of the weighted Lindley (WL) distribution that accommodates increasing, decreasing, decreasing-increasing-decreasing, bathtub, and unimodal hazard rates. A significant account of the mathematical properties of this distribution was presented. Different estimation procedures were proposed and compared by extensive numerical simulations. We believe that the new distribution will allow users to describe different data-sets and obtain better predictive performance than with other common distributions.
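The probability density function (p.d.f.) of the GWL distribution is given by

$$ f(t \mid \phi, \lambda, \alpha) = \frac{\alpha \lambda^{\alpha\phi}}{(\lambda+\phi)\Gamma(\phi)}\, t^{\alpha\phi-1} \left(\lambda + (\lambda t)^{\alpha}\right) e^{-(\lambda t)^{\alpha}}, \tag{1} $$

where $t > 0$ and $\phi$, $\lambda$, $\alpha > 0$.

The remainder of this paper is organized as follows. Section 2 provides a significant account of mathematical properties for the new distribution. Section 3 presents the eight estimation methods which are considered. In Section 4, a simulation study is presented in order to identify the most efficient procedure. Section 5 illustrates the proposed methodology on two real data-sets. Section 6 summarizes the present work.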

Generalized weighted Lindley distribution
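The generalized WL distribution (2) can be expressed as a two-component mixture

$$ f(t \mid \phi, \lambda, \alpha) = p\, f_1(t \mid \phi, \lambda, \alpha) + (1-p)\, f_2(t \mid \phi, \lambda, \alpha), $$

where $p = \lambda/(\lambda+\phi)$ and $T_j \sim \mathrm{GG}(\phi+j-1, \lambda, \alpha)$, for $j = 1, 2$, i.e. $f_j$ is the density of the generalized gamma (GG) distribution, given by

$$ f_j(t \mid \phi, \lambda, \alpha) = \frac{\alpha}{\Gamma(\phi+j-1)}\, \lambda^{\alpha(\phi+j-1)}\, t^{\alpha(\phi+j-1)-1} e^{-(\lambda t)^{\alpha}}. $$

The behavior of the p.d.f. (2) when $t \to 0$ and $t \to \infty$ is, respectively, given by

$$ f(0) = \begin{cases} \infty, & \text{if } \alpha\phi < 1, \\ \dfrac{\alpha\lambda^{2}}{(\lambda+\phi)\Gamma(\phi)}, & \text{if } \alpha\phi = 1, \\ 0, & \text{if } \alpha\phi > 1, \end{cases} \qquad \text{and} \qquad f(\infty) = 0. $$

Figure 1 gives examples of the shapes of the density function for different values of $\phi$, $\lambda$ and $\alpha$.

The cumulative distribution function of the GWL distribution is given by

$$ F(t \mid \phi, \lambda, \alpha) = \frac{\gamma\left[\phi, (\lambda t)^{\alpha}\right](\lambda+\phi) - (\lambda t)^{\alpha\phi} e^{-(\lambda t)^{\alpha}}}{(\lambda+\phi)\Gamma(\phi)}, $$

where $\gamma(y, x) = \int_0^x w^{y-1} e^{-w}\, dw$ is the lower incomplete gamma function.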
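The mixture representation and the c.d.f. above can be checked numerically. The following Python sketch is an illustration only and is not part of the original analysis, which was carried out in R. It assumes the parameter order ($\phi$, $\lambda$, $\alpha$) and the density in the form $\alpha\lambda^{\alpha\phi} t^{\alpha\phi-1}(\lambda + (\lambda t)^{\alpha}) e^{-(\lambda t)^{\alpha}} / ((\lambda+\phi)\Gamma(\phi))$. The function names are hypothetical, and the incomplete gamma function is evaluated with a plain power series that is adequate only for moderate arguments.

```python
import math


def gwl_pdf(t, phi, lam, alpha):
    """Assumed GWL density, evaluated on the log scale for stability (t > 0)."""
    x = (lam * t) ** alpha
    log_const = (math.log(alpha) + alpha * phi * math.log(lam)
                 - math.log(lam + phi) - math.lgamma(phi))
    return math.exp(log_const + (alpha * phi - 1.0) * math.log(t) - x) * (lam + x)


def gg_pdf(t, shape, lam, alpha):
    """Generalized gamma density GG(shape, lam, alpha) for t > 0."""
    x = (lam * t) ** alpha
    return alpha / t * math.exp(shape * math.log(x) - x - math.lgamma(shape))


def lower_reg_gamma(a, x, terms=500):
    """Regularized lower incomplete gamma gamma(a, x) / Gamma(a), power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, terms):
        term *= x / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))


def gwl_cdf(t, phi, lam, alpha):
    """GWL c.d.f. computed as the mixture of the two GG c.d.f.s."""
    x = (lam * t) ** alpha
    p = lam / (lam + phi)
    return p * lower_reg_gamma(phi, x) + (1.0 - p) * lower_reg_gamma(phi + 1.0, x)


if __name__ == "__main__":
    # Arbitrary illustrative parameter values, not taken from the paper.
    t, phi, lam, alpha = 1.3, 2.0, 0.5, 1.5
    p = lam / (lam + phi)
    mixture = p * gg_pdf(t, phi, lam, alpha) + (1 - p) * gg_pdf(t, phi + 1, lam, alpha)
    print(gwl_pdf(t, phi, lam, alpha), mixture, gwl_cdf(t, phi, lam, alpha))
```

In practice a library routine for the incomplete gamma function would be preferable to the series used here.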

Moments
Many important features and properties of a distribution can be obtained through its moments such as mean, variance, kurtosis, and skewness. In this section, important moment functions such as the moment-generating function, r-th moment, r-th central moment, among others are presented.

Survival properties
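In this section, we present the survival, hazard, and mean residual life (MRL) functions for the GWL distribution. The survival function of T is given by

$$ S(t \mid \phi, \lambda, \alpha) = \frac{\Gamma\left[\phi, (\lambda t)^{\alpha}\right](\lambda+\phi) + (\lambda t)^{\alpha\phi} e^{-(\lambda t)^{\alpha}}}{(\lambda+\phi)\Gamma(\phi)}, $$

where $\Gamma(y, x) = \int_x^{\infty} w^{y-1} e^{-w}\, dw$ is the upper incomplete gamma function. The hazard function is given as

$$ h(t \mid \phi, \lambda, \alpha) = \frac{\alpha \lambda^{\alpha\phi}\, t^{\alpha\phi-1} \left(\lambda + (\lambda t)^{\alpha}\right) e^{-(\lambda t)^{\alpha}}}{\Gamma\left[\phi, (\lambda t)^{\alpha}\right](\lambda+\phi) + (\lambda t)^{\alpha\phi} e^{-(\lambda t)^{\alpha}}}. \tag{12} $$

The behavior of the hazard function (12) when $t \to 0$ and $t \to \infty$ is, respectively, given by

$$ h(0) = \begin{cases} \infty, & \text{if } \alpha\phi < 1, \\ \dfrac{\alpha\lambda^{2}}{(\lambda+\phi)\Gamma(\phi)}, & \text{if } \alpha\phi = 1, \\ 0, & \text{if } \alpha\phi > 1, \end{cases} \qquad \text{and} \qquad h(\infty) = \begin{cases} 0, & \text{if } \alpha < 1, \\ \lambda, & \text{if } \alpha = 1, \\ \infty, & \text{if } \alpha > 1. \end{cases} $$

Theorem 2.5 The hazard rate function h(t) of the GWL distribution is increasing, decreasing, bathtub, unimodal, or decreasing-increasing-decreasing shaped.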
Proof The theorem proposed by Glaser (1980) is not easily applied to the GWL distribution. Since the hazard rate function (12) is complex, we consider the following cases: (1) Let $\alpha = 1$; then the GWL distribution reduces to the WL distribution. Ghitany et al. (2011) proved that the hazard function is bathtub-shaped (increasing) if $0 < \phi < 1$ ($\phi \geq 1$), for all $\lambda > 0$.
(3) Let = 2 and = 1, from Glaser's theorem (Glaser, 1980), the hazard rate function is decreasing shaped (unimodal) for 0 < < 1 ( > 1). ✷
These properties make the GWL distribution a flexible model for reliability data. Figure 2 gives examples of the shapes of the hazard function for different values of $\phi$, $\lambda$ and $\alpha$.
Proposition 2.6 The MRL function $r(t \mid \phi, \lambda, \alpha)$ of the GWL distribution is given by The behavior of the MRL function when $t \to 0$ and $t \to \infty$ is, respectively, given by

Entropy
In information theory, entropy plays a central role as a measure of the uncertainty associated with a random variable. Shannon's entropy is one of the most important metrics in information theory. For the GWL distribution, Shannon's entropy can be obtained by solving

$$ H_S(\phi, \lambda, \alpha) = -\int_0^{\infty} \log\left(f(t \mid \phi, \lambda, \alpha)\right) f(t \mid \phi, \lambda, \alpha)\, dt. \tag{14} $$

Proposition 2.7 A random variable T with GWL distribution has Shannon's entropy given by where
Proof From Equation (14), we have Note that using the change of variable $y = (\lambda t)^{\alpha}$ and after some algebra . From Equations (6) and (10), we can easily find the solution of $E[T^{\alpha}]$ and $E[\log T]$, and the result follows. ✷
Another popular entropy measure was proposed by Renyi (1961). Some recent applications of the Renyi entropy can be seen in Popescu and Aiordachioaie (2013). If T has the probability density function (1), then the Renyi entropy is defined by
Proposition 2.8 A random variable T with GWL distribution has the Renyi entropy given by
Proof The Renyi entropy is given by and with some algebra the proof is completed. ✷

Lorenz curves
The Lorenz curve (Bonferroni, 1930) is a well-known measure used in reliability, income inequality, life testing and renewal theory. The Lorenz curve for a non-negative random variable T is given through the consecutive plot of
Proposition 2.9 The Lorenz curve for the GWL distribution is

Methods of estimation
In this section, we present eight different estimation methods for the parameters $\phi$, $\lambda$ and $\alpha$ of the GWL distribution.

Maximum likelihood estimation
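The maximum likelihood method has been widely used owing to its good asymptotic properties. The estimates are obtained by maximizing the likelihood function. Let $T_1, \ldots, T_n$ be a random sample where $T \sim \mathrm{GWL}(\phi, \lambda, \alpha)$; the likelihood function is given by

$$ L(\phi, \lambda, \alpha; t) = \frac{\alpha^{n} \lambda^{n\alpha\phi}}{(\lambda+\phi)^{n}\, \Gamma(\phi)^{n}} \left\{\prod_{i=1}^{n} t_i^{\alpha\phi-1}\right\} \left\{\prod_{i=1}^{n} \left(\lambda + (\lambda t_i)^{\alpha}\right)\right\} \exp\left\{-\lambda^{\alpha} \sum_{i=1}^{n} t_i^{\alpha}\right\}. $$

The log-likelihood function $l(\phi, \lambda, \alpha; t) = \log L(\phi, \lambda, \alpha; t)$ is given by

$$ l(\phi, \lambda, \alpha; t) = n\log\alpha + n\alpha\phi\log\lambda - n\log(\lambda+\phi) - n\log\Gamma(\phi) + (\alpha\phi-1)\sum_{i=1}^{n}\log t_i + \sum_{i=1}^{n}\log\left(\lambda + (\lambda t_i)^{\alpha}\right) - \lambda^{\alpha}\sum_{i=1}^{n} t_i^{\alpha}. $$

The estimates are obtained by setting the partial derivatives of $l$ with respect to $\phi$, $\lambda$ and $\alpha$ equal to zero. Numerical methods such as Newton-Raphson are required to find the solution of the resulting nonlinear system. Note that from (21) and (23)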
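As a rough illustration of the quantity being maximized, the log-likelihood can be coded directly from the density. The sketch below is in Python rather than the R used for the paper's computations. It assumes the parameter order ($\phi$, $\lambda$, $\alpha$) and the density $\alpha\lambda^{\alpha\phi} t^{\alpha\phi-1}(\lambda + (\lambda t)^{\alpha}) e^{-(\lambda t)^{\alpha}} / ((\lambda+\phi)\Gamma(\phi))$. The name gwl_loglik is hypothetical and the data values are placeholders, not observations from the paper.

```python
import math


def gwl_loglik(params, data):
    """Log-likelihood of the assumed GWL density for positive observations."""
    phi, lam, alpha = params
    if min(phi, lam, alpha) <= 0.0:
        return -math.inf  # outside the parameter space
    n = len(data)
    const = n * (math.log(alpha) + alpha * phi * math.log(lam)
                 - math.log(lam + phi) - math.lgamma(phi))
    sum_log_t = sum(math.log(t) for t in data)
    sum_log_mix = sum(math.log(lam + (lam * t) ** alpha) for t in data)
    sum_pow = sum((lam * t) ** alpha for t in data)
    return const + (alpha * phi - 1.0) * sum_log_t + sum_log_mix - sum_pow


if __name__ == "__main__":
    # Placeholder data, used only to show the call.
    data = [0.4, 1.1, 1.7, 2.3, 3.9]
    print(gwl_loglik((2.0, 0.5, 1.5), data))
```

Returning minus infinity outside the parameter space lets the function be handed to a general-purpose optimizer, applied to its negative, without separate bound handling. Whether that is more stable than solving the score equations is not something this sketch establishes.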

Moments estimators
The method of moments is one of the oldest methods used for estimating parameters in statistical models. The moments estimators (MEs) of the GWL distribution can be obtained by equating the first three sample moments with the corresponding theoretical moments. Therefore, the MEs $\hat{\phi}_{ME}$, $\hat{\lambda}_{ME}$ and $\hat{\alpha}_{ME}$ can be obtained by solving the non-linear equations

Method of maximum product of spacings
The MPS method is a powerful alternative to MLE for the estimation of unknown parameters of continuous univariate distributions. Proposed by Cheng and Amin (1979, 1983), this method was also independently developed by Ranneby (1984) as an approximation to the Kullback-Leibler information measure. Cheng and Amin (1983) proved desirable properties of the MPS estimators such as asymptotic efficiency and invariance and, more importantly, showed that their consistency holds under more general conditions than for MLEs. Under mild conditions for the GWL distribution, the MPS estimators are asymptotically normally distributed with a joint trivariate normal distribution given by
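To make the idea concrete, the MPS criterion can be sketched as the mean log-spacing of the fitted c.d.f. evaluated at the ordered sample, following the usual definition of the method. The Python code below is illustrative only. The names mps_objective and exp_cdf are hypothetical, the data are placeholders, and an exponential c.d.f. stands in for the GWL c.d.f. so that the block stays short. Any callable c.d.f. could be passed instead.

```python
import math


def mps_objective(cdf, data):
    """Mean log-spacing; the MPS estimate maximizes this over the parameters."""
    ts = sorted(data)
    # Pad with F(t_(0)) = 0 and F(t_(n+1)) = 1 to obtain n + 1 spacings.
    u = [0.0] + [cdf(t) for t in ts] + [1.0]
    spacings = [b - a for a, b in zip(u, u[1:])]
    if min(spacings) <= 0.0:
        return -math.inf  # tied observations give a zero spacing
    return sum(math.log(d) for d in spacings) / len(spacings)


def exp_cdf(rate):
    """Stand-in model: exponential c.d.f. with the given rate."""
    return lambda t: 1.0 - math.exp(-rate * t)


# Placeholder data and a crude grid search over the single stand-in parameter.
data = [0.2, 0.5, 1.1, 1.9, 3.0]
grid = [0.05 * k for k in range(1, 101)]
best_rate = max(grid, key=lambda r: mps_objective(exp_cdf(r), data))
```

For the three-parameter GWL model the grid search would be replaced by a numerical optimizer, and tied observations would need separate treatment because they produce zero spacings.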

The Cramer-von Mises minimum distance estimators
The Cramer-von Mises estimator (CME) is a type of minimum distance estimator (also called a maximum goodness-of-fit estimator) and is based on the difference between the estimated cumulative distribution function and the empirical distribution function (Luceño, 2006). Macdonald (1971) motivated the choice of the CME by providing empirical evidence that its bias is smaller than that of other minimum distance estimators. The Cramer-von Mises estimates $\hat{\phi}_{CME}$, $\hat{\lambda}_{CME}$ and $\hat{\alpha}_{CME}$ of the parameters $\phi$, $\lambda$ and $\alpha$ are obtained by minimizing with respect to $\phi$, $\lambda$ and $\alpha$. These estimates can also be obtained by solving the nonlinear equations: where $\Delta_1(\cdot \mid \phi, \lambda, \alpha)$, $\Delta_2(\cdot \mid \phi, \lambda, \alpha)$ and $\Delta_3(\cdot \mid \phi, \lambda, \alpha)$ are given respectively in (27).

The Anderson-Darling and Right-tail Anderson-Darling estimators
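Another type of minimum distance estimator is based on the Anderson-Darling statistic and is known as the Anderson-Darling estimator (ADE). The ADE estimates $\hat{\phi}_{ADE}$, $\hat{\lambda}_{ADE}$ and $\hat{\alpha}_{ADE}$ of the parameters $\phi$, $\lambda$ and $\alpha$ are obtained by minimizing, with respect to $\phi$, $\lambda$ and $\alpha$, the function These estimates can also be obtained by solving the nonlinear equations
The Right-tail ADE (RADE) estimates $\hat{\phi}_{RADE}$, $\hat{\lambda}_{RADE}$ and $\hat{\alpha}_{RADE}$ of the parameters $\phi$, $\lambda$ and $\alpha$ are obtained by minimizing the function with respect to $\phi$, $\lambda$ and $\alpha$. These estimates can also be obtained by solving the nonlinear equations: where $\Delta_1(\cdot \mid \phi, \lambda, \alpha)$, $\Delta_2(\cdot \mid \phi, \lambda, \alpha)$ and $\Delta_3(\cdot \mid \phi, \lambda, \alpha)$ are given respectively in (27).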
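The three minimum distance criteria can be sketched in their usual textbook forms (see Luceño, 2006). It is an assumption that these coincide with the displays omitted above. The Python code is illustrative only: the function names are hypothetical, and each function accepts any callable c.d.f., so the GWL c.d.f. would be substituted in an actual fit.

```python
import math


def cvm_objective(cdf, data):
    """Cramer-von Mises criterion: 1/(12n) + sum (F(t_(i)) - (2i-1)/(2n))^2."""
    ts = sorted(data)
    n = len(ts)
    return 1.0 / (12 * n) + sum(
        (cdf(t) - (2 * i - 1) / (2 * n)) ** 2 for i, t in enumerate(ts, 1))


def ad_objective(cdf, data):
    """Anderson-Darling criterion, using F at t_(i) and 1 - F at t_(n+1-i)."""
    u = [cdf(t) for t in sorted(data)]
    n = len(u)
    s = sum((2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
            for i in range(1, n + 1))
    return -n - s / n


def rad_objective(cdf, data):
    """Right-tail Anderson-Darling criterion (upper-tail weighting only)."""
    u = [cdf(t) for t in sorted(data)]
    n = len(u)
    s = sum((2 * i - 1) * math.log(1.0 - u[n - i]) for i in range(1, n + 1))
    return n / 2.0 - 2.0 * sum(u) - s / n
```

Each criterion would be minimized over ($\phi$, $\lambda$, $\alpha$) numerically. The Anderson-Darling variants require the fitted c.d.f. to stay strictly between 0 and 1 at every observation, otherwise the logarithms are undefined.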

Simulation study
In this section, an intensive simulation study is presented to compare the efficiency of the estimation procedures for the parameters of the GWL distribution. The following procedure was adopted: (1) Generate pseudo-random values from the GWL($\phi$, $\lambda$, $\alpha$) distribution with size n.
(4) Using $\hat{\theta} = (\hat{\phi}, \hat{\lambda}, \hat{\alpha})$ and $\theta = (\phi, \lambda, \alpha)$, compute the mean relative estimates (MRE) $\sum_{j=1}^{N} (\hat{\theta}_{i,j} / \theta_i) / N$ and the mean square errors (MSE) $\sum_{j=1}^{N} (\hat{\theta}_{i,j} - \theta_i)^2 / N$, for $i = 1, 2, 3$.
Under this approach, the most efficient estimation method will have MREs closer to one and MSEs closer to zero. The results were computed in the software R, using the seed 2015 to generate the pseudo-random values. The initial values were the same values used to generate the random samples. The chosen values for this procedure were $N = 10{,}000$ and $n = (50, 60, \ldots, 250)$. For reasons of space, we present the results only for $\theta = (2, 0.5, 0.1)$.
However, the results are similar for other choices of $\theta$.
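Two ingredients of this procedure can be sketched briefly: drawing GWL samples and summarizing the replicated estimates. The Python code below is a hedged illustration, not the authors' R code. It assumes the parameter order ($\phi$, $\lambda$, $\alpha$) and uses the two-component generalized gamma mixture with weight $\lambda/(\lambda+\phi)$ to generate values. The names rgwl and mre_mse are hypothetical, the parameter values and the list of estimates are arbitrary placeholders, and a Python seed of 2015 does not reproduce the R random stream.

```python
import random


def rgwl(n, phi, lam, alpha, rng):
    """Draw n GWL variates through the assumed two-component GG mixture."""
    p = lam / (lam + phi)
    sample = []
    for _ in range(n):
        shape = phi if rng.random() < p else phi + 1.0
        g = rng.gammavariate(shape, 1.0)  # G ~ Gamma(shape, scale 1)
        sample.append(g ** (1.0 / alpha) / lam)  # then (lam * T)^alpha = G
    return sample


def mre_mse(estimates, true_value):
    """Mean relative estimate and mean square error over N replicates."""
    n_rep = len(estimates)
    mre = sum(e / true_value for e in estimates) / n_rep
    mse = sum((e - true_value) ** 2 for e in estimates) / n_rep
    return mre, mse


rng = random.Random(2015)
sample = rgwl(50, 2.0, 0.5, 1.0, rng)      # one simulated data-set of size 50
placeholder_estimates = [1.9, 2.2, 2.05, 1.95]  # made-up values for illustration
mre, mse = mre_mse(placeholder_estimates, 2.0)
```

In a full study each replicate would be fitted by every estimation method, and mre_mse would be applied per parameter and per sample size.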
For this comparison to be meaningful, the estimation procedures need to be performed under the same conditions. However, for some particular samples and estimation methods, the numerical techniques do not work well in finding the parameter estimates. Therefore, a failure-rate study is presented to verify how often the numerical solutions converge. This is carried out by counting the number of times each estimation method fails to find a numerical solution. In Figure 3 we present the proportion of failures for each method.
From Figure 3, the MLE, LSE, WLSE, ME, and CME estimators fail to find the parameter estimates for a significant number of samples. Therefore, these methods are not recommended for estimating the GWL parameters. Hereafter, we consider the MPS, ADE, and RADE estimators because of their better computational stability. The MLE is considered only for illustrative purposes, since it is the most widely used estimation method.
From the results in Figure 4, the MSEs of the MLE, MPS, ADE, and RADE estimators tend to zero for large n and, as expected, the MREs tend to one, i.e. the estimators are consistent and asymptotically unbiased for the parameters. For small sample sizes, the MLE has the largest MSEs. The MPS has smaller MSEs with MREs closer to one for almost all values of n. Additionally, the MPS, ADE, and RADE estimators were the only methods able to find $\hat{\phi}$, $\hat{\lambda}$ and $\hat{\alpha}$ for all the $2 \times 10^{6}$ generated samples. Therefore, combining all results with the good properties of the MPS method, such as consistency, asymptotic efficiency, normality and invariance, we conclude that the MPS estimators are highly competitive with maximum likelihood for estimating the parameters of the GWL distribution.
The TTT-plot (total time on test) is considered in order to verify the behavior of the empirical hazard function (Barlow & Campo, 1975). The TTT-plot is obtained by plotting $[r/n, G(r/n)]$, where $G(r/n) = \left(\sum_{i=1}^{r} t_{(i)} + (n-r)\, t_{(r)}\right) / \sum_{i=1}^{n} t_{(i)}$, $r = 1, \ldots, n$, and $t_{(i)}$ denotes the i-th order statistic. If the curve is concave (convex), the hazard function is increasing (decreasing). On the other hand, when it starts convex and then becomes concave (concave and then convex), the hazard function has a bathtub (inverse bathtub) shape.
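The coordinates of the TTT-plot can be computed in a few lines. The Python sketch below uses the standard scaled total-time-on-test transform, with the numerator equal to the sum of the r smallest observations plus (n - r) times the r-th one. The function name ttt_points is hypothetical.

```python
def ttt_points(data):
    """Return the points (r/n, G(r/n)) of the scaled TTT transform."""
    ts = sorted(data)
    n = len(ts)
    total = sum(ts)
    points = []
    running = 0.0
    for r, t in enumerate(ts, 1):
        running += t  # sum of the r smallest observations
        points.append((r / n, (running + (n - r) * t) / total))
    return points
```

Plotting these points against the diagonal shows the shape directly. Points lying above the diagonal along a concave curve suggest an increasing hazard.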
The goodness of fit is checked using the Kolmogorov-Smirnov (KS) test. This procedure is based on the KS statistic $D_n = \sup_t \left| F_n(t) - F(t; \phi, \lambda, \alpha) \right|$, where $\sup_t$ is the supremum of the set of distances, $F_n(t)$ is the empirical distribution function and $F(t; \phi, \lambda, \alpha)$ is the c.d.f. A hypothesis test is conducted at the 5% level of significance to test whether or not the data come from $F(t; \phi, \lambda, \alpha)$. In this case, the null hypothesis is rejected if the returned p-value is smaller than 0.05.
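For a continuous fitted c.d.f., the supremum in the KS statistic is attained at an observation, either just before or just after the jump of the empirical distribution function. The Python sketch below computes it that way. The name ks_statistic is hypothetical, and any callable c.d.f. (here, the fitted GWL c.d.f.) can be supplied. The p-value is not computed.

```python
def ks_statistic(cdf, data):
    """Largest distance between the empirical and the fitted c.d.f."""
    ts = sorted(data)
    n = len(ts)
    return max(
        max(i / n - cdf(t), cdf(t) - (i - 1) / n)  # after / before the jump
        for i, t in enumerate(ts, 1))
```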
To carry out the model selection, the following discrimination criteria are adopted: AIC (Akaike information criterion) and AICc (corrected Akaike information criterion), computed, respectively, by $AIC = -2\, l(\hat{\theta}; t) + 2k$ and $AICc = AIC + 2k(k+1)/(n-k-1)$, where k is the number of parameters to be fitted, n is the sample size and $\hat{\theta}$ is the estimate of $\theta$. For a set of candidate models for t, the best one provides the minimum values.
Aarset (1987) presents the data-set (see Table 1) related to the lifetime in hours of 50 devices on test.
Comparing the empirical survival function with the fitted distributions, it can be observed that the GWL distribution provides a better fit. This result is also confirmed by the AIC and AICc (see Table 3), since the GWL distribution has the minimum values and its p-value from the KS test is greater than 0.05. It should be emphasized that, at a significance level of 5%, the other models are not able to fit these data. Table 3 displays the MPS estimates, standard errors, and the confidence intervals (CI) for $\phi$, $\lambda$ and $\alpha$ of the GWL distribution.
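The AIC and AICc used in this comparison are simple functions of the maximized log-likelihood. A minimal Python sketch follows, using the usual small-sample correction $2k(k+1)/(n-k-1)$. The function names are hypothetical.

```python
def aic(loglik, k):
    """Akaike information criterion for a model with k fitted parameters."""
    return -2.0 * loglik + 2.0 * k


def aicc(loglik, k, n):
    """Corrected AIC for sample size n; requires n > k + 1."""
    return aic(loglik, k) + 2.0 * k * (k + 1) / (n - k - 1)
```

For a three-parameter model fitted to 50 observations, the correction term is 24/46, so the two criteria differ by about 0.52.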

Average flows data
The study of average flows has proved to be of high importance for protecting and maintaining aquatic resources in streams and rivers (Reiser, Wesche, & Estes, 1989). In this section, we consider a real data-set related to the average flows (m³/s) of the Cantareira system during January at São Paulo city in Brazil. It is worth mentioning that the Cantareira system provides water to 9 million people in the São Paulo metropolitan area. The data-set, available in Table 4, was obtained from the National Water Agency and covers 1930 to 2012. Here we use the ML estimator, showing that either the MPS or the ML estimator can be used successfully in applications. Figure 6 shows (left panel) the TTT-plot, (middle panel) the fitted survival functions superimposed on the empirical survival function and (right panel) the hazard function fitted by the GWL distribution. Table 5 presents the AIC and AICc criteria and the p-value from the KS test for all fitted distributions, considering the data-set related to the January average flows (m³/s) of the Cantareira system.
From the empirical survival function and the fitted distributions, it can be observed that the GWL distribution provides the better fit. This result is also confirmed by the AIC and AICc, since the GWL distribution has the minimum values and its p-value from the KS test is greater than 0.05. Table 6 displays the ML estimates, standard errors, and the CI for $\phi$, $\lambda$ and $\alpha$ of the GWL distribution.

Concluding remarks
To summarize, we have proposed a three-parameter lifetime distribution. The GWL distribution is a straightforward generalization of the WL distribution proposed by Ghitany et al. (2011), which accommodates increasing, decreasing, decreasing-increasing-decreasing, bathtub, and unimodal hazard rates, making the GWL distribution a flexible model for reliability data. The mathematical properties of this distribution are also discussed.
The estimation procedures for the parameters of the GWL distribution are also derived, considering eight estimation methods. Since it is not feasible to compare these methods theoretically, we have presented an extensive simulation study in order to identify the most efficient procedure. We observed that the MLE, ME, LSE, WLSE, and CME estimators fail to find the parameter estimates for a significant number of samples. The simulations showed that the MPS (maximum product of spacings) method is the most efficient for estimating the parameters of the GWL distribution in comparison with its competitors. Finally, two data-sets were analyzed for illustrative purposes, showing that the GWL distribution outperforms several common three-parameter lifetime distributions.