Abstract

New one-parameter and two-parameter distributions are introduced in this paper. The failure rate of the one-parameter distribution is unimodal (upside-down bathtub), while the failure rate of the two-parameter distribution can be decreasing, increasing, unimodal, increasing-decreasing-increasing, or decreasing-increasing-decreasing, depending on the values of its two parameters. The two-parameter distribution is derived from the one-parameter distribution by using a power transformation. We discuss some properties of these two distributions, such as the behavior of the failure rate function, the probability density function, the moments, skewness, and kurtosis, and limiting distributions of order statistics. Maximum likelihood estimation for the two-parameter model using complete samples is investigated. Different algorithms for generating random samples from the two new models are given. Applications to real data are discussed and compared with the fit attained by some one- and two-parameter distributions. Finally, a simulation study is carried out to investigate the mean square error of the maximum likelihood estimators, the coverage probability, and the width of the confidence intervals of the unknown parameters.

1. Introduction

Lindley [1] proposed a one-parameter distribution, now known as the Lindley distribution, with the following probability density function (pdf): The failure rate function of the Lindley distribution is always increasing. The properties of the Lindley distribution are studied in detail by Ghitany et al. [2]. There are situations in which the Lindley distribution may not be suitable from a theoretical or applied point of view (Ghitany et al. [3]). For this reason, Ghitany et al. [3] used a power transformation, , to introduce the power Lindley (PL) distribution, which is more flexible. The pdf of PL is Ghitany et al. [3] showed that the hazard function of PL can be increasing, decreasing, or decreasing-increasing-decreasing, depending on the values of the parameters. They also discussed some of the statistical properties of the distribution, used the maximum likelihood method to estimate its two unknown parameters, and applied it to a real data set. Despite its flexibility, the PL distribution fails to fit some data sets.

The main aim of this paper is to introduce two new distributions. The first is a one-parameter distribution which is similar to the Lindley distribution and the second is the power transformation of the one-parameter distribution. We refer to these two distributions as and TN respectively. The hazard function of is only unimodal, while the hazard function of TN can be decreasing, increasing, unimodal, decreasing-increasing-decreasing, or increasing-decreasing-increasing depending on the values of its two parameters. The variety of shapes of the hazard function of the TN enables it to be a good model to fit different data sets.

The rest of the paper is organized as follows. Section 2 introduces the new one-parameter distribution, and some of its characteristics are discussed in Section 3. Section 4 presents the power transformation of the new distribution, TN. Different characteristics of TN, such as the hazard function, quantiles, random sample generation, moments, and order statistics distributions, are discussed in Section 5. Section 6 discusses the maximum likelihood estimation of the two parameters of TN. Applications of the two models are presented in Section 7. A Monte Carlo simulation study is carried out in Section 8 to examine the accuracy of the maximum likelihood estimators of the TN parameters as well as the coverage probability and average width of the confidence intervals for the parameters. Finally, Section 9 concludes this paper.

2. The New Distribution

Consider the random variable whose pdf is given by The survival function (sf) of is given by while its hazard rate function is given by For simplicity, from now on, we refer to this distribution as .

Interpretation. There are two different interpretations of as follows.
(1) The pdf is a mixture density of two mixture components. One follows Exp and the other is the lifetime of a series system of two independent components with and and mixture weights and , respectively. This means that can be expressed in terms of and as .
(2) The random variable can be described as a mixture of three components: , , and a with mixture weights and , respectively. This means that can be expressed in terms of , and as .

Some characteristics of are derived in the next section.

3. Characteristics of

In this section, algorithms are described to obtain quantiles of and to generate samples from . Also, the moment generating function and the moments of this distribution are derived.

3.1. Quantiles

The th quantile, , can be derived as follows.
(1) Let ;
(2) Solve the following equation numerically in
(3) The th quantile is .
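The numerical step above amounts to inverting a continuous, increasing cdf. A minimal sketch of that idea follows, using bracketing and bisection. The equation of the new distribution is not reproduced here, so the well-known Lindley cdf is swapped in as a stand-in; the names `quantile` and `lindley_cdf` and the parameter value 1.5 are illustrative assumptions, not part of the paper.

```python
import math

def quantile(cdf, p, lo=0.0, hi=1.0, tol=1e-12):
    """Solve cdf(x) = p for x >= 0 by bracketing followed by bisection."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # Double the upper end until the bracket [lo, hi] encloses the root.
    while cdf(hi) < p:
        hi *= 2.0
    # Halve the bracket until it is narrow enough.
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lindley_cdf(x, theta):
    """Stand-in cdf (Lindley distribution); NOT the paper's new distribution."""
    return 1.0 - (1.0 + theta * x / (1.0 + theta)) * math.exp(-theta * x)

# Median of the stand-in distribution for an arbitrary parameter value.
median = quantile(lambda x: lindley_cdf(x, 1.5), 0.5)
```

Bisection is slower than Newton-type methods but needs only the cdf, which makes it a safe default when derivatives are awkward to code.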

3.2. Random Sample Generation

We provide below three equivalent algorithms to generate a random variate from .

Algorithm 1. (1) Generate .
(2) Solve numerically the following equation in :
(3) Set .

Algorithm 2. (1) Generate I from the set such that , .
(a) If , set , where .
(b) If , set , where .
(2) Set .

Algorithm 3. (1) Generate I from the set such that , .
(a) If , set , where .
(b) If , set , where .
(c) If , set , where .
(2) Set .
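Algorithms 2 and 3 follow the usual two-stage recipe for sampling from a finite mixture: draw the component index I with the mixture weights, then draw from the selected component. The sketch below illustrates only that recipe. Because the components and weights of the new distribution are not reproduced here, it uses the Lindley distribution as a stand-in, which is known to be a mixture of an exponential and a gamma with shape 2; all function names and the parameter value are illustrative assumptions.

```python
import random

def sample_mixture(weights, samplers, rng):
    """Draw index I with P(I = i) = weights[i], then sample component I."""
    u = rng.random()
    cumulative = 0.0
    for weight, draw in zip(weights, samplers):
        cumulative += weight
        if u < cumulative:
            return draw(rng)
    return samplers[-1](rng)  # guard against round-off in the weights

def lindley_sampler(theta):
    """Stand-in mixture (Lindley); NOT the paper's new distribution."""
    weights = [theta / (1.0 + theta), 1.0 / (1.0 + theta)]
    samplers = [
        lambda rng: rng.expovariate(theta),              # Exp with rate theta
        lambda rng: rng.gammavariate(2.0, 1.0 / theta),  # Gamma, shape 2, scale 1/theta
    ]
    return lambda rng: sample_mixture(weights, samplers, rng)

rng = random.Random(2024)  # fixed seed so the run is reproducible
draw = lindley_sampler(1.5)
sample = [draw(rng) for _ in range(20000)]
sample_mean = sum(sample) / len(sample)
```

For the stand-in, the sample mean should be close to the Lindley mean (theta + 2) / (theta (theta + 1)), which gives a quick sanity check of the sampler.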

3.3. The Moments and the Moment Generating Function

The moment generating function (mgf) of may be written as Differentiating the above expression times with respect to and setting to zero, we get th moments, , as

Based on the first four ordinary moments, the measures of skewness and kurtosis of can be obtained using The skewness and kurtosis of the distribution as functions of are shown in Figure 1. From the plots, sk and are unimodal functions of . The skewness is always positive and the kurtosis is larger than 3; therefore, is positively skewed and leptokurtic.
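As a hedged illustration of how skewness and kurtosis follow from the first four ordinary (raw) moments, the sketch below uses the standard moment-based definitions; it is assumed, not confirmed, that these match the paper's own expressions. The function name `shape_measures` is hypothetical, and the check uses the unit exponential distribution rather than the new distribution.

```python
def shape_measures(m1, m2, m3, m4):
    """Return (skewness, kurtosis) from the raw moments E[X], ..., E[X^4]."""
    variance = m2 - m1 ** 2
    if variance <= 0.0:
        raise ValueError("moments do not define a positive variance")
    # Central moments expressed through the raw moments.
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    return mu3 / variance ** 1.5, mu4 / variance ** 2

# Check on Exp(1), whose raw moments are E[X^k] = k!.
skewness, kurtosis = shape_measures(1.0, 2.0, 6.0, 24.0)
```

For the unit exponential this gives skewness 2 and kurtosis 9, so it is positively skewed and leptokurtic in the sense used above.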

4. Power Transformation of the New Distribution

To get a more flexible distribution, we consider an extension of the new distribution with the pdf (3) by using the power transformation , . The pdf of is given by The density of is plotted in Figure 2 for three choices of when , which shows that the density is symmetric when , left skewed when , and right skewed when . This implies that the power parameter characterizes the shape of the density function. The density is investigated further in the next section using the skewness and kurtosis measures. From now on, we will use TN to refer to the power transformation of the new distribution .

Interpretation. There are two different interpretations of as follows.
(1) The pdf is a mixture density of two mixture components. One component follows and the other is the lifetime of a series system of two independent components with and PG and mixture weights and , respectively.
(2) The random variable can be described as a mixture of three components: , , and a PG with mixture weights , and , respectively.

Straightforward calculations yield the survival function of TN as We derive some characteristics of TN in the next section.

5. Characteristics of

5.1. The Hazard Function

The hazard rate function of TN is For , the hazard function is unimodal. Its limiting values at zero and infinity are , and it reaches a maximum value of at where denotes the Lambert function which is the inverse of the function .
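The Lambert function W is defined by W(x) e^{W(x)} = x, so evaluating the location of the hazard maximum requires a numerical routine for W. The following is a minimal sketch using Newton's iteration on the principal branch for non-negative arguments only. Whether the paper's expression needs a negative argument or another branch cannot be determined from the text, so that restriction is an assumption, as is the function name `lambert_w0`.

```python
import math

def lambert_w0(x, tol=1e-14, max_iter=100):
    """Principal branch of the Lambert W function for x >= 0 (Newton's method)."""
    if x < 0.0:
        raise ValueError("this sketch covers only x >= 0")
    w = math.log1p(x)  # starting value at or above the root for x >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))  # f(w) / f'(w) with f(w) = w e^w - x
        w -= step
        if abs(step) <= tol * max(1.0, abs(w)):
            break
    return w

omega = lambert_w0(1.0)  # solves w * exp(w) = 1
```

In practice a library routine such as SciPy's `scipy.special.lambertw` would normally be preferred; the hand-written version is shown only to keep the example dependency-free.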

For , the shape of the hazard function is difficult to ascertain analytically. The shape was determined numerically by examining the derivative of the hazard out to the 99.99th percentile of the distribution, and the results are shown in Figure 3. For , the hazard is decreasing except for a small region with close to 1 and where the hazard is initially decreasing, then increasing, and finally decreasing (DID). For , the hazard is strictly increasing for large ( in the figure). For smaller with close to 1, the hazard can be unimodal (for very small ) or initially increasing, then decreasing, and finally increasing (IDI) (for slightly larger ). Figure 4 shows the hazard for five choices of and which demonstrate the five possible shapes.

5.2. Quantiles and Random Sample Generations

The 100 th quantile of TN , , can be derived from that of , , as follows: Figure 5 depicts the three quartiles , , and , which can be obtained from the th quantile by setting , and , respectively. From Figure 5, the Interquartile range (IQR = ) decreases dramatically when increases.

The following algorithm generates a random variate from TN .

Algorithm 4. (1) Generate from , using one of Algorithms 1–3;
(2) Set .

5.3. The Moments and Shape Measures

Let follow TN . After some algebra, the th ordinary moment of is derived as Therefore, the mean and variance of are Figure 6 depicts the mean and variance of TN as functions of when . The mean decreases sharply in , reaches its minimum of 0.8298 at , and then increases steadily to its maximum of 0.9687, while the variance is decreasing.

Based on the first four ordinary moments, the measures of skewness and kurtosis of TN can be obtained by substituting (17) into (9) and (10), respectively. Plots of the skewness and kurtosis of TN distribution as functions of , when , are given in Figure 7. From these plots, the skewness is positive when and negative when and the kurtosis is (i) equal to 3 when either or which means that the distribution is mesokurtic; (ii) greater than 3 when either or which means that the distribution is leptokurtic; (iii) smaller than 3 when which means that the distribution is platykurtic. This analysis shows how the power parameter improves , because the power transformation model can be used for data with a wide variety of distributional shapes.

5.4. Order Statistics

Consider independent and identical components whose lifetimes, say , follow TN . The following theorem gives the limiting distributions of the lifetime of the series system and of the parallel system consisting of these components.

Theorem 5. The limiting distributions of and are where and is the cdf of .

Proof. Using L'Hôpital's rule, Therefore, (19) follows by Theorem (ii) of Arnold et al. [4].
For the power transformation, we have Therefore, (20) follows by Theorem of Arnold et al. [4].

The following theorem gives the limiting distribution of the th order statistic of the lifetimes .

Theorem 6. The limiting distributions of , are where .

Proof. It follows from Theorem 5 and of Arnold et al. [4].

Theorem 5 means that and follow asymptotically and Ext , respectively, while Theorem 6 means that follows asymptotically PG .

6. Maximum Likelihood Estimation

Maximum likelihood estimation (MLE) is one of the most common methods for estimating the parameters of a statistical model. Assume that independent and identical items, whose lifetimes follow TN , are put on a life test simultaneously. Let be the failure times of the items and let . The likelihood function for is The log-likelihood function is where The first partial derivatives of   , with respect to and , are where The second partial derivatives of are where The information matrix is The MLE of and , say and , are the solution of the system of nonlinear equations obtained by setting and such that the is positive definite. This system has no analytic solution, so numerical methods, such as the Newton-Raphson method, Burden and Faires [5], should be used.

Large-Sample Intervals. The MLE of the parameters and are asymptotically normally distributed with means equal to the true values of and and variances given by the inverse of the information matrix. In particular, where is the inverse of , with main diagonal elements and given by Using (32), large-sample confidence intervals for and are where is the upper quantile of the standard normal distribution.
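The large-sample intervals described above have the usual Wald form: estimate plus or minus a standard normal quantile times the square root of the corresponding diagonal element of the inverse information matrix. A minimal sketch follows; the function name `wald_intervals` and the numerical inputs are hypothetical and are not taken from the paper's tables.

```python
import math
from statistics import NormalDist

def wald_intervals(estimates, inverse_information, level=0.95):
    """Return one (lower, upper) interval per parameter.

    `inverse_information` is the estimated variance-covariance matrix,
    given as a list of rows; only its diagonal is used.
    """
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)  # upper quantile
    intervals = []
    for i, estimate in enumerate(estimates):
        half_width = z * math.sqrt(inverse_information[i][i])
        intervals.append((estimate - half_width, estimate + half_width))
    return intervals

# Hypothetical two-parameter example (illustrative numbers only).
cov = [[0.04, 0.01],
       [0.01, 0.09]]
intervals = wald_intervals([2.0, 1.2], cov)
```

Because both parameters are positive, a lower limit that falls below zero would in practice be truncated at zero or the interval built on a log scale; that choice is outside what the text specifies.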

7. Applications

In this section, we analyze four data sets to illustrate the applicability of the two new distributions proposed in this paper. The first data set consists of 61 observed recidivism failure times (in days) of individuals released directly from correctional institutions to parole in the District of Columbia, USA [6]. The second data set consists of 43 active repair times (in hours) for an airborne communication transceiver [7]. The third data set consists of 57 times (in thousands of operating hours) of unscheduled maintenance actions for the number 4 diesel engine of the U.S.S. Grampus, up to 16 thousand hours of operation [8]. The fourth data set consists of the tensile strength (measured in GPa) of 69 carbon fibers tested under tension at gauge lengths of 20 mm [9].

We will refer to these data sets as failure times, repair times, maintenance actions, and tensile strength data, respectively. For each data set, we fit the proposed two distributions as well as Lindley and power Lindley distributions. For the sake of comparison, we apply goodness-of-fit tests to verify which distribution better fits these data sets. We consider the well-known Kolmogorov-Smirnov (K-S) statistic, the Cramér-von Mises (C-M), and Anderson-Darling (A-D) statistics [10]. Furthermore, we consider the Akaike information criterion , where is the log-likelihood function at the MLE of the parameters and is the number of model parameters. Table 1 shows the MLE of the parameters of each model, the corresponding maximum log-likelihood value, and the AIC for the four data sets. Table 2 presents the results of the goodness of fit tests for the four data sets using each model.
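Two of the comparison measures above are simple to compute once a model has been fitted. The sketch below shows the AIC in its standard form (minus twice the maximized log-likelihood plus twice the number of parameters) and the Kolmogorov-Smirnov statistic against a fitted cdf. The function names and the toy data are illustrative assumptions; none of the paper's data sets are reproduced here.

```python
def aic(max_loglik, n_params):
    """Akaike information criterion: smaller values indicate a better trade-off."""
    return -2.0 * max_loglik + 2.0 * n_params

def ks_statistic(data, cdf):
    """Largest gap between the empirical cdf of `data` and the fitted `cdf`."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical cdf jumps from (i - 1)/n to i/n at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Toy check against the uniform cdf on [0, 1].
d_toy = ks_statistic([0.9, 0.1, 0.5], lambda x: x)
aic_toy = aic(-100.0, 2)
```

When the cdf parameters are estimated from the same data, the usual tabulated K-S critical values are only approximate, so the statistic is best read as a relative measure across models.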

For every data set, we plotted the scaled total time on test transform (TTT-transform) plot, which gives qualitative information about the hazard rate shape [11]; the hazard functions for the four fitted models; and the empirical and fitted density and distribution functions. Figures 8, 9, 10, and 11 show the four plots for the four data sets 1–4, respectively. The scaled TTT-transform plots show that the repair data set has a unimodal hazard, while the rest of the data sets have increasing hazards.
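The scaled TTT-transform can be computed directly from the ordered sample. The sketch below uses the standard empirical definition: at r/n the value is the sum of the r smallest observations plus (n - r) times the r-th smallest, divided by the sample total. The function name `scaled_ttt` and the toy data are illustrative assumptions.

```python
def scaled_ttt(data):
    """Return the points (r/n, phi(r/n)) of the scaled TTT-transform plot."""
    xs = sorted(data)
    n = len(xs)
    total = sum(xs)
    points = [(0.0, 0.0)]
    running = 0.0
    for r, x in enumerate(xs, start=1):
        running += x  # sum of the r smallest observations
        points.append((r / n, (running + (n - r) * x) / total))
    return points

# Toy example with three observations.
ttt_points = scaled_ttt([3.0, 1.0, 2.0])
```

A curve that is concave and lies above the diagonal suggests an increasing hazard, a convex curve below the diagonal suggests a decreasing hazard, and a curve that is first concave and then convex suggests a unimodal hazard.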

The inverses of the information matrix, evaluated at the MLE for each of the four data sets, are listed below.

Failure times:

Active repair:

Maintenance actions:

Tensile strength:

For the first three data sets, the TN model has the smallest values of the Kolmogorov-Smirnov (and the largest p-value), Cramér-von Mises, and Anderson-Darling goodness-of-fit test statistics, which indicates that the TN model provides the best fit for these data sets. For the fourth data set, the power Lindley model provides the best fit in the sense of having the smallest test statistics. For all data sets, TN is a better fit than . For the first two data sets, is a better fit than both and PL while it is the worst fit for the last two data sets. The AIC statistic is the lowest for TN for all data sets except for tensile strength where it is slightly higher.

Further, for testing as a submodel of the TN , we use the likelihood ratio test statistic (LRT) to check if the fit using the TN is statistically superior to a fit using the for each data set. The LRT for testing against is , where and are the maximum log-likelihood values under and , respectively. Under , . The LRT rejects if , where denotes the upper point of the chi-square distribution with 1 degree of freedom. Table 3 lists the values of the LRT and the corresponding p-values for the four data sets. Based on the p-values, the is not rejected against the to fit the repair times data set, while it is rejected, at any level of significance greater than or equal to 0.0473, to fit the other three data sets.
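The likelihood ratio test described above can be sketched in a few lines: the statistic is twice the gain in maximized log-likelihood from the restricted one-parameter model to the two-parameter model, and its p-value comes from the chi-square distribution with 1 degree of freedom. The function name `likelihood_ratio_test` and the log-likelihood values below are hypothetical; they are not the values in Table 3.

```python
import math

def likelihood_ratio_test(loglik_null, loglik_alt):
    """Return (statistic, p_value) for one restriction (1 degree of freedom)."""
    statistic = 2.0 * (loglik_alt - loglik_null)
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(statistic, 0.0) / 2.0))
    return statistic, p_value

# Hypothetical maximized log-likelihoods (illustrative numbers only).
statistic, p_value = likelihood_ratio_test(-104.2, -101.9)
reject_at_5_percent = p_value < 0.05
```

The closed form with `math.erfc` holds only for 1 degree of freedom; a comparison involving more restrictions would need a general chi-square survival function.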

8. Simulation Study

We used a simulation study to investigate the accuracy of the point and interval estimates of the two parameters of TN . The steps are as follows:
(1) Specify the values of the parameters and ;
(2) Specify the sample size ;
(3) Use Algorithm 2 and the transformation to generate a random sample with size from TN .
(a) Calculate the MLE of the two parameters and the inverse of the Fisher matrix;
(b) Calculate the squared deviation of the MLE from the exact value of each parameter;
(c) Calculate a 95% CI for each parameter;
(4) Repeat steps 2-3, times;
(5) Calculate the mean squared error (MSE), the average of the confidence interval widths, and the coverage probability for each parameter.

The MSE associated with the MLE of the parameter , , is where is the MLE of using the th sample, , and . Coverage probability is the proportion of the simulated confidence intervals which include the true parameter .

The simulation study is carried out with , the sample sizes are 25, 50, 75, and 100, and the parameter values , , , , , , , , and . Some of the selected values of give decreasing, unimodal, increasing, and increasing-decreasing-increasing hazard shapes, respectively, as shown in Figure 3. Table 4 presents the MSE, coverage probability , and average width (AW) of 95% confidence intervals of each parameter. As expected, this table shows that the MSEs of the estimates decrease as the sample size increases, that the coverage probabilities are very close to the nominal level of 95%, and that the average widths decrease as the sample size increases.
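The structure of such a study can be sketched as a short Monte Carlo loop. Because the TN likelihood is not reproduced here, the sketch swaps in the exponential distribution as a stand-in: its MLE and inverse Fisher information have closed forms, so the loop over replications, the MSE, the coverage probability, and the average width are easy to follow. The function name `simulate`, the number of replications, and the seed are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist

def simulate(rate, n, replications=2000, level=0.95, seed=1):
    """Monte Carlo MSE, coverage, and average CI width for a stand-in model."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    squared_error = 0.0
    total_width = 0.0
    covered = 0
    for _ in range(replications):
        sample = [rng.expovariate(rate) for _ in range(n)]
        mle = n / sum(sample)            # closed-form MLE of the rate
        std_err = mle / math.sqrt(n)     # sqrt of inverse Fisher information at the MLE
        lower, upper = mle - z * std_err, mle + z * std_err
        squared_error += (mle - rate) ** 2
        total_width += upper - lower
        covered += lower <= rate <= upper
    return {
        "mse": squared_error / replications,
        "coverage": covered / replications,
        "avg_width": total_width / replications,
    }

# Same sample sizes as in the study; the true rate 1.0 is arbitrary.
results = {n: simulate(1.0, n) for n in (25, 50, 75, 100)}
```

For the two-parameter model, the closed-form MLE and standard error would be replaced by a numerical maximization of the log-likelihood and the inverse of the observed information matrix, with one interval per parameter.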

9. Conclusion

In this paper, we have proposed new one-parameter and two-parameter distributions, called the and , respectively. The was obtained by using a power transformation of the distributed variable. The provides more flexibility than the in terms of the shape of the density and hazard rate functions as well as its skewness and kurtosis. We derived the maximum likelihood estimates of the parameters and their variance-covariance matrix. We proposed different algorithms to generate samples from the two proposed distributions. Applications of the two proposed distributions to real data sets show better fits than Lindley and power Lindley distributions. Finally, we examined the accuracy of the maximum likelihood estimators of the parameters as well as the coverage probability and average width of the confidence intervals for the parameters using simulation.

Notation

pdf:Probability density function
cdf:Cumulative distribution function
mgf:Moment generating function
:Uniform distribution on
Exp :Exponential distribution with mean
:Weibull distribution with pdf
:Gamma distribution with pdf
PG :Power gamma distribution with pdf
PL :Power Lindley distribution with pdf
Ext :Extreme-value distribution with pdf .

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.