The modified Power function distribution

Abstract: Recently, many new, improved, flexible and robust probability distributions have been developed from existing distributions to encourage their application in diverse fields. This paper proposes a new lifetime distribution called the Modified Power function (MPF) distribution. The distribution belongs to the Marshall-Olkin-G family of distributions and is an extension of the one-parameter Power function distribution. The MPF distribution has closed-form distributional expressions. Some of its statistical properties, including possible transformations, are presented. The paper suggests the method of maximum likelihood for estimating the parameters of the new distribution. The applicability of the distribution is illustrated with two real data-sets, and its goodness-of-fit is compared with that of the Exponential, Weibull, Lindley Exponential, Exponentiated Exponential, Kumaraswamy, Power function and Beta distributions using the AIC, AICc, CAIC, BIC, HQC, W* and A* goodness-of-fit measures. The results show that the MPF distribution is the best candidate for the data-sets.


Introduction
For many decades, the survival/reliability and failure characteristics of devices, items, or equipment that are liable to fall out of use have been studied with many probability

PUBLIC INTEREST STATEMENT
The accuracy and dependability of the results of any statistical modelling rely heavily on the choice of model used. The right data with the wrong model often yield vague and misleading estimates and predictions and, consequently, erroneous and complicated interpretations and conclusions. This paper presents a new model called the Modified Power function (MPF) distribution. The distribution is an important generalization of the Power function distribution, whose appeal stems from its simplicity and handiness in modelling variables that take values between 0 and 1. For example, data arising from meteorological measurements and several other scientific experiments often take such values and can be modelled by the MPF distribution. Additionally, in real applications the distribution has a better fitting capability than some other well-known distributions.
The idea in this paper is to extend the one-parameter Power function distribution to a more flexible distribution by adopting the scheme of Marshall and Olkin (1997), in order to make it more versatile in analysing a variety of complex data-sets. For a collection of different versions of the Power function distribution, readers are referred to Table 1 of Tahir, Alizadeh, Mansoor, Cordeiro, and Zubair (2014). The cumulative distribution function (cdf) of the particular parameterization of the Power function distribution under study in this paper is of the form

$$G(x) = 1 - (1 - x)^{\alpha}; \quad 0 < x < 1;\ \alpha > 0, \tag{1}$$

while the corresponding probability density function (pdf) is given by

$$g(x) = \alpha(1 - x)^{\alpha - 1}; \quad 0 < x < 1;\ \alpha > 0, \tag{2}$$

where $\alpha$ is a shape parameter. This distribution is a competitor and a special case of the beta distribution. The distribution is useful in describing the characteristics of random variables that are confined to the open interval (0, 1) in real life. For instance, the distribution could be used in meteorology to model sunshine data as in Table 2 of Sulaiman, Oo, Wahab, and Zakaria (1999).
The formulation of the Marshall-Olkin family of distributions is as follows:

$$\bar{F}(x) = \frac{\beta\,\bar{G}(x)}{1 - (1 - \beta)\,\bar{G}(x)}, \tag{3}$$

while the cdf is $F(x) = 1 - \bar{F}(x)$ and the pdf is given by

$$f(x) = \frac{\beta\, g(x)}{\left[1 - (1 - \beta)\,\bar{G}(x)\right]^{2}}, \tag{4}$$

where $\bar{F}(x)$ is the complementary cumulative distribution function (reliability/survival function) of the Marshall-Olkin-G family, $\bar{G}(x)$ and $g(x)$ are the reliability function and pdf of the baseline (original) distribution, respectively, and $\beta > 0$ is the additional shape parameter. Notably, the Marshall-Olkin distribution reduces to the baseline distribution when $\beta = 1$. To see some of the distributions that have been modified according to Marshall and Olkin (1997), readers are referred to Barreto-Souza, Lemonte, and Cordeiro (2013) and Cordeiro, Lemonte, and Ortega (2014).
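The construction can be illustrated with a minimal Python sketch. It assumes the standard Marshall-Olkin form, with survival function β Ḡ(x) / [1 − (1 − β) Ḡ(x)] and pdf β g(x) / [1 − (1 − β) Ḡ(x)]², and the symbol names alpha (baseline shape) and beta (additional parameter); the helper name `marshall_olkin` is hypothetical and not taken from the paper.

```python
from typing import Callable, Tuple

Fn = Callable[[float], float]


def marshall_olkin(sf_base: Fn, pdf_base: Fn, beta: float) -> Tuple[Fn, Fn, Fn]:
    """Return (sf, cdf, pdf) of the Marshall-Olkin extension of a baseline.

    Assumes the standard Marshall-Olkin form with tilt parameter beta > 0.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")

    def sf(x: float) -> float:
        g_bar = sf_base(x)
        return beta * g_bar / (1.0 - (1.0 - beta) * g_bar)

    def cdf(x: float) -> float:
        return 1.0 - sf(x)

    def pdf(x: float) -> float:
        g_bar = sf_base(x)
        return beta * pdf_base(x) / (1.0 - (1.0 - beta) * g_bar) ** 2

    return sf, cdf, pdf


# Baseline: one-parameter Power function distribution with shape alpha.
alpha = 2.0
base_sf = lambda x: (1.0 - x) ** alpha
base_pdf = lambda x: alpha * (1.0 - x) ** (alpha - 1.0)

# beta = 1 should reproduce the baseline; beta = 3 gives a tilted version.
sf1, cdf1, pdf1 = marshall_olkin(base_sf, base_pdf, beta=1.0)
sf3, cdf3, pdf3 = marshall_olkin(base_sf, base_pdf, beta=3.0)
```

Evaluating `pdf1` and `base_pdf` at the same point gives the same value, which is the reduction to the baseline at β = 1 noted above.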
Different parameterizations of the Power function distribution have been extensively studied in the literature. For example, Meniconi and Barry (1996) proposed the two-parameter Power function distribution as a simple alternative to the Exponential distribution for modelling failure data, particularly those related to electrical components. Tahir et al. (2014) extended the two-parameter Power function distribution to a more general and flexible four-parameter Weibull-Power function distribution as an adequate distribution for modelling survival data; they studied some of its statistical properties and also proposed a bivariate extension. Naveed Shahzad and Asghar (2016) proposed the Transmuted Power function distribution, generalizing the Power function distribution according to Shaw and Buckley (2009). Hanif, Al-Ghamdi, Khan, and Shahba (2015) estimated the parameter of the one-parameter Power function distribution using the Bayesian method (Gibbs sampler) and compared the estimates obtained under two priors (Weibull and Generalized gamma distributions) with those obtained by the method of maximum likelihood. Saleem, Aslam, and Economou (2010) modelled a heterogeneous population using a two-component mixture of one-parameter Power function distributions. Zarrin, Saxena, and Kamal (2013) provided some analytical results for reliability computation and Bayesian estimation of the reliability of a system whose applied stress and strength follow the two-parameter Power function distribution. Naveed-Shahzad, Asghar, Shehzad, and Shahzadi (2015) showed by a Monte-Carlo simulation study that the method of L-moments provides better estimates of the parameters of the two-parameter Power function distribution than the methods of moments and maximum likelihood.
Zaka and Akhter (2013) estimated the parameters of the two-parameter Power function distribution through the methods of least squares, relative least squares and ridge regression. Kumar and Khan (2014) presented concise expressions for single and product moments of the generalized order statistics (gos) of the three-parameter Power function distribution and discussed its characterization based on the conditional moments of the gos. Saran and Pandey (2004) derived and discussed the linear unbiased estimates of the parameters of a three-parameter Power function distribution based on the kth record values. Chang (2007) provided characterizations of the two-parameter Power function distribution using the independence of record values. Lim and Lee (2013) gave a proof of a characterization of the two-parameter Power function distribution using the lower record values. Ahsanullah, Shakil, and Golam Kibria (2013) presented a new characterization of the two-parameter Power function distribution based on the lower records.
The rest of this paper is organized as follows: Section 2 introduces the Modified Power function (MPF) distribution and its properties; Section 3 covers parameter estimation; Section 4 presents the Monte-Carlo simulation study; Section 5 applies the new distribution to data; and Section 6 concludes.

Model definition
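The cdf of the MPF distribution is given by

$$F(x) = \frac{1 - (1 - x)^{\alpha}}{1 - (1 - \beta)(1 - x)^{\alpha}}, \tag{5}$$

with pdf

$$f(x) = \frac{\alpha\beta(1 - x)^{\alpha - 1}}{\left[1 - (1 - \beta)(1 - x)^{\alpha}\right]^{2}}; \quad 0 < x < 1;\ \alpha, \beta > 0, \tag{6}$$

while the reliability function $\bar{F}(x)$, which gives the probability that a system will survive beyond a specified time, say $x$, is defined by

$$\bar{F}(x) = \frac{\beta(1 - x)^{\alpha}}{1 - (1 - \beta)(1 - x)^{\alpha}}, \tag{7}$$

and the reversed hazard rate function (rhrf) is defined by

$$H(x) = \frac{f(x)}{F(x)} = \frac{\alpha\beta(1 - x)^{\alpha - 1}}{\left[1 - (1 - x)^{\alpha}\right]\left[1 - (1 - \beta)(1 - x)^{\alpha}\right]}, \tag{8}$$

which gives the instantaneous failure rate of a system at time $x$ given that it failed before time $x$.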
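As a quick numerical check of these definitions, the following Python sketch codes the four functions and verifies that the density integrates to one. It assumes the Marshall-Olkin form applied to the baseline G(x) = 1 − (1 − x)^α, that is, f(x) = αβ(1 − x)^(α−1) / [1 − (1 − β)(1 − x)^α]²; the function names and the midpoint-rule integrator are illustrative choices, not the authors' code.

```python
def mpf_cdf(x: float, alpha: float, beta: float) -> float:
    """cdf of the MPF distribution (assumed Marshall-Olkin form)."""
    s = (1.0 - x) ** alpha
    return (1.0 - s) / (1.0 - (1.0 - beta) * s)


def mpf_pdf(x: float, alpha: float, beta: float) -> float:
    """pdf of the MPF distribution."""
    s = (1.0 - x) ** alpha
    return alpha * beta * (1.0 - x) ** (alpha - 1.0) / (1.0 - (1.0 - beta) * s) ** 2


def mpf_sf(x: float, alpha: float, beta: float) -> float:
    """Reliability (survival) function."""
    s = (1.0 - x) ** alpha
    return beta * s / (1.0 - (1.0 - beta) * s)


def mpf_rhrf(x: float, alpha: float, beta: float) -> float:
    """Reversed hazard rate: pdf divided by cdf."""
    return mpf_pdf(x, alpha, beta) / mpf_cdf(x, alpha, beta)


def midpoint_integral(f, a: float, b: float, n: int = 100_000) -> float:
    """Composite midpoint rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))
```

For example, `midpoint_integral(lambda x: mpf_pdf(x, 2.0, 3.0), 0.0, 1.0)` should be close to 1, and integrating up to 0.4 should agree with `mpf_cdf(0.4, 2.0, 3.0)`.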

Asymptotics and shapes
The asymptotic and shape characteristics of the pdf in Equation (6) and the rhrf in Equation (8) are outlined in this section.
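The asymptotic behaviour of the pdf is

$$\lim_{x \to 0} f(x) = \frac{\alpha}{\beta}$$

and

$$\lim_{x \to 1} f(x) = \begin{cases} \infty, & \alpha < 1, \\ \beta, & \alpha = 1, \\ 0, & \alpha > 1, \end{cases}$$

while the asymptotic behaviour of the reversed hazard rate function is given by

$$\lim_{x \to 0} H(x) = \infty, \qquad \lim_{x \to 1} H(x) = \lim_{x \to 1} f(x).$$

To characterize the shape of the MPF distribution, we start by obtaining the first derivative of its pdf, which is

$$f'(x) = -\frac{\alpha\beta(1 - x)^{\alpha - 2}\left[(\alpha - 1) + (\alpha + 1)(1 - \beta)(1 - x)^{\alpha}\right]}{\left[1 - (1 - \beta)(1 - x)^{\alpha}\right]^{3}}.$$

If $f'(x) < 0$ for all $x$, the pdf of the MPF distribution is monotone decreasing. When the pdf has a critical point $x_0$ in (0, 1), it is given by

$$x_0 = 1 - \left[\frac{\alpha - 1}{(\alpha + 1)(\beta - 1)}\right]^{1/\alpha}.$$

Thus, if $f(x)$ is increasing for $x < x_0$ and decreasing for $x > x_0$, then $f(x)$ has a single mode at $x_0$. For different parameter values, the pdf of the MPF distribution could take any of the following shapes:

$$f(x) \text{ is } \begin{cases} \text{increasing}, & \text{if } \alpha \le 1 \text{ and } \beta > 1, \text{ or if } \alpha < 1 \text{ and } \beta \ge 1, \\ \text{decreasing}, & \text{if } \alpha > 1 \text{ and } \beta \le 1, \text{ or if } \alpha \ge 1 \text{ and } \beta < 1, \\ \text{unimodal}, & \text{if } \alpha, \beta > 1. \end{cases}$$

Another important shape characteristic of the MPF distribution is the bathtub shape. This shape can be verified by showing that the density function in Equation (6) is convex.
For parameter values at which $f''(x) > 0$, the convexity of $f(x)$ is confirmed. Figure 1 illustrates all the possible shapes of the pdf of the MPF distribution.
To characterize the shape of the reversed hazard rate function $H(x)$ of the MPF distribution, we start by obtaining its first derivative, which is

$$H'(x) = -H(x)\left[\frac{\alpha - 1}{1 - x} + \frac{\alpha(1 - x)^{\alpha - 1}}{1 - (1 - x)^{\alpha}} + \frac{\alpha(1 - \beta)(1 - x)^{\alpha - 1}}{1 - (1 - \beta)(1 - x)^{\alpha}}\right].$$

$H'(x) < 0$ implies that the reversed hazard rate function is decreasing (reverse-J), and for $\beta \ne 1$ the equation $H'(x) = 0$ has two roots at

$$x = 1 - \left[\frac{(2 - \beta) \pm \sqrt{(2 - \beta)^{2} + 4(\alpha^{2} - 1)(1 - \beta)}}{2(\alpha + 1)(1 - \beta)}\right]^{1/\alpha}.$$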
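The shape discussion can be probed numerically. The sketch below is an illustration under stated assumptions rather than the authors' derivation: it takes the pdf f(x) = αβ(1 − x)^(α−1) / [1 − (1 − β)(1 − x)^α]², for which setting f′(x) = 0 gives the interior critical point x₀ = 1 − [(α − 1)/((α + 1)(β − 1))]^(1/α) whenever the bracketed ratio lies in (0, 1), and compares it with a brute-force grid search. The helper names are hypothetical.

```python
def mpf_pdf(x: float, alpha: float, beta: float) -> float:
    """pdf of the MPF distribution (assumed Marshall-Olkin form)."""
    s = (1.0 - x) ** alpha
    return alpha * beta * (1.0 - x) ** (alpha - 1.0) / (1.0 - (1.0 - beta) * s) ** 2


def mpf_critical_point(alpha: float, beta: float):
    """Interior critical point of the pdf, or None if there is none in (0, 1)."""
    if beta == 1.0:
        return None
    ratio = (alpha - 1.0) / ((alpha + 1.0) * (beta - 1.0))
    if not 0.0 < ratio < 1.0:
        return None
    return 1.0 - ratio ** (1.0 / alpha)


def grid_extremum(alpha: float, beta: float, find_max: bool = True, n: int = 10_000) -> float:
    """Grid point in (0, 1) at which the pdf is largest (or smallest)."""
    xs = [j / n for j in range(1, n)]
    pick = max if find_max else min
    return pick(xs, key=lambda x: mpf_pdf(x, alpha, beta))
```

With (α, β) = (2, 3) the critical point is a mode (unimodal case), while with (α, β) = (0.5, 0.5) it is a minimum, which corresponds to the bathtub shape; for (α, β) = (2, 0.5) no interior critical point exists and the pdf is monotone decreasing.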

Transformations
If X is a continuous random variable with pdf $f_X(x)$ and $y = h(x)$ is a differentiable, strictly monotone transformation, then $Y = h(X)$ has pdf $f_Y(y) = f_X(x)/|dy/dx|$ evaluated at $x = h^{-1}(y)$, where $|dy/dx|$ is the absolute value of the Jacobian of the transformation. We have presented some of the possible transformations of the MPF distribution in Appendix 1.
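As a worked illustration of the change-of-variable rule, consider Y = −ln(1 − X). This particular transformation is chosen here for illustration and is not necessarily one of those listed in Appendix 1. Assuming the pdf f(x) = αβ(1 − x)^(α−1) / [1 − (1 − β)(1 − x)^α]², we have dy/dx = 1/(1 − x) = e^y, so the density of Y is αβ e^(−αy) / [1 − (1 − β) e^(−αy)]², which is the Marshall-Olkin extended exponential density with rate α. The Python sketch below checks this identity numerically; the function names are hypothetical.

```python
import math


def mpf_pdf(x: float, alpha: float, beta: float) -> float:
    """pdf of the MPF distribution (assumed Marshall-Olkin form)."""
    s = (1.0 - x) ** alpha
    return alpha * beta * (1.0 - x) ** (alpha - 1.0) / (1.0 - (1.0 - beta) * s) ** 2


def transformed_pdf(y: float, alpha: float, beta: float) -> float:
    """pdf of Y = -ln(1 - X) via f_X(x) / |dy/dx| with x = 1 - exp(-y)."""
    x = 1.0 - math.exp(-y)
    dy_dx = 1.0 / (1.0 - x)  # equals exp(y)
    return mpf_pdf(x, alpha, beta) / abs(dy_dx)


def mo_exponential_pdf(y: float, rate: float, beta: float) -> float:
    """Marshall-Olkin extended exponential pdf with the given rate."""
    e = math.exp(-rate * y)
    return rate * beta * e / (1.0 - (1.0 - beta) * e) ** 2
```

Evaluating `transformed_pdf(y, alpha, beta)` and `mo_exponential_pdf(y, alpha, beta)` at the same y > 0 returns matching values up to floating-point error.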

Quantile function and random number generation
By inverting Equation (5) we obtain the quantile function of the MPF distribution as

$$x_Q = 1 - \left[\frac{1 - Q}{1 - (1 - \beta)Q}\right]^{1/\alpha}; \quad 0 < Q < 1. \tag{9}$$

When Q = 0.50 in Equation (9) we obtain the median of the MPF distribution as

$$x_{0.5} = 1 - (1 + \beta)^{-1/\alpha}.$$

If Q ∼ U(0, 1), then we can simulate random variables from the MPF distribution through the inversion-of-cdf method with Equation (9). Also, using Equation (9) we can obtain the Bowley skewness (B) due to Bowley (1901/1920) and the Moors kurtosis (M) due to Moors (1986). Remarkably, these measures do not depend on the moments of the distribution and are almost insensitive to outliers, unlike the classical skewness and kurtosis statistics.
The Bowley skewness statistic is given by

$$B = \frac{x_{3/4} - 2x_{1/2} + x_{1/4}}{x_{3/4} - x_{1/4}},$$

while the Moors kurtosis statistic is given by

$$M = \frac{x_{7/8} - x_{5/8} + x_{3/8} - x_{1/8}}{x_{6/8} - x_{2/8}}.$$

The plots of the Bowley skewness of the MPF distribution indicate that the distribution could be asymmetric (positive or negative) or symmetric, while the plots of the Moors kurtosis indicate that the distribution is heavy tailed. Notably, the variability of the two measures depends on the values of the two shape parameters ($\alpha$ and $\beta$).
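The quantile-based quantities in this section translate directly into code. The Python sketch below assumes the quantile function obtained by inverting the cdf F(x) = [1 − (1 − x)^α] / [1 − (1 − β)(1 − x)^α], namely x_Q = 1 − [(1 − Q)/(1 − (1 − β)Q)]^(1/α), and the usual quartile and octile definitions of the Bowley and Moors statistics. Function names are illustrative.

```python
import random


def mpf_cdf(x: float, alpha: float, beta: float) -> float:
    """cdf of the MPF distribution (assumed Marshall-Olkin form)."""
    s = (1.0 - x) ** alpha
    return (1.0 - s) / (1.0 - (1.0 - beta) * s)


def mpf_quantile(q: float, alpha: float, beta: float) -> float:
    """Quantile function obtained by inverting the cdf."""
    return 1.0 - ((1.0 - q) / (1.0 - (1.0 - beta) * q)) ** (1.0 / alpha)


def mpf_sample(n: int, alpha: float, beta: float, rng: random.Random) -> list:
    """Inverse-transform sampling: plug uniform draws into the quantile function."""
    return [mpf_quantile(rng.random(), alpha, beta) for _ in range(n)]


def bowley_skewness(alpha: float, beta: float) -> float:
    q1, q2, q3 = (mpf_quantile(p, alpha, beta) for p in (0.25, 0.5, 0.75))
    return (q3 - 2.0 * q2 + q1) / (q3 - q1)


def moors_kurtosis(alpha: float, beta: float) -> float:
    e = [mpf_quantile(j / 8.0, alpha, beta) for j in range(1, 8)]  # octiles E1..E7
    return ((e[6] - e[4]) + (e[2] - e[0])) / (e[5] - e[1])
```

For instance, `mpf_sample(1000, 2.0, 3.0, random.Random(42))` draws 1,000 variates; with α = β = 1 the distribution is uniform on (0, 1), so the Bowley skewness is 0 and the Moors kurtosis is 1.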

Useful expansion
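To present a straightforward analytical derivation of some important properties of the new distribution, the following expansion of the pdf of the MPF distribution in Equation (6) is handy. Given that

$$(1 - z)^{-2} = \sum_{i=0}^{\infty}(i + 1)z^{i}; \quad |z| < 1,$$

and taking $z = (1 - \beta)(1 - x)^{\alpha}$, we obtain a useful expansion of the pdf as

$$f(x) = \alpha\beta\sum_{i=0}^{\infty}(i + 1)(1 - \beta)^{i}(1 - x)^{\alpha(i + 1) - 1}. \tag{10}$$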

Moments
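Lemma 2.1 If X follows the MPF distribution, then its kth crude moment is given by

$$E(X^{k}) = \alpha\beta\sum_{i=0}^{\infty}(i + 1)(1 - \beta)^{i}\,B\big(k + 1, \alpha(i + 1)\big),$$

where $B(\cdot, \cdot)$ is the beta function.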
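A series representation of the moments can be cross-checked against direct numerical integration. The sketch below rests on two assumptions that should be read as such: the pdf f(x) = αβ(1 − x)^(α−1) / [1 − (1 − β)(1 − x)^α]², and the power-series expansion (1 − z)^(−2) = Σ (i + 1) z^i, which requires |1 − β| < 1, that is 0 < β < 2. Under these assumptions the kth moment is αβ Σ (i + 1)(1 − β)^i B(k + 1, α(i + 1)). Function names and truncation length are illustrative.

```python
import math


def beta_fn(a: float, b: float) -> float:
    """Beta function via log-gamma for numerical stability."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def mpf_moment_series(k: int, alpha: float, beta: float, terms: int = 200) -> float:
    """kth crude moment from the truncated series (assumes 0 < beta < 2)."""
    if not 0.0 < beta < 2.0:
        raise ValueError("series form assumed valid only for 0 < beta < 2")
    total = sum((i + 1) * (1.0 - beta) ** i * beta_fn(k + 1, alpha * (i + 1))
                for i in range(terms))
    return alpha * beta * total


def mpf_pdf(x: float, alpha: float, beta: float) -> float:
    s = (1.0 - x) ** alpha
    return alpha * beta * (1.0 - x) ** (alpha - 1.0) / (1.0 - (1.0 - beta) * s) ** 2


def mpf_moment_numeric(k: int, alpha: float, beta: float, n: int = 200_000) -> float:
    """kth crude moment by the composite midpoint rule on (0, 1)."""
    h = 1.0 / n
    return h * sum(((j + 0.5) * h) ** k * mpf_pdf((j + 0.5) * h, alpha, beta)
                   for j in range(n))
```

For (α, β) = (2, 0.5) the two routes agree to several decimal places for k = 1 and k = 2; for β outside (0, 2) the numerical integral remains usable while the series as written does not.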

Generating function
Apart from producing moments, the moment generating function (mgf) could be used to describe and characterize the distribution of a random variable say X.

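Lemma 2.2 If X is distributed according to the MPF distribution, then its mgf, $M_X(t)$, could be obtained through the general definition of the mgf of a continuous random variable, which is defined as

$$M_X(t) = E\left(e^{tX}\right) = \int_{0}^{1} e^{tx} f(x)\,dx = \sum_{k=0}^{\infty}\frac{t^{k}}{k!}E(X^{k}).$$

Thus the mgf of the MPF distribution is

$$M_X(t) = \alpha\beta\sum_{k=0}^{\infty}\sum_{i=0}^{\infty}\frac{t^{k}}{k!}(i + 1)(1 - \beta)^{i}\,B\big(k + 1, \alpha(i + 1)\big).$$

Proof The proof is straightforward, hence we omit it. ✷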

Incomplete moment
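In Economics, the incomplete moment forms a basic tool for constructing measures of inequality such as the Lorenz and Bonferroni curves. The kth incomplete moment of a continuous random variable is defined as

$$m_k(y) = \int_{0}^{y} x^{k} f(x)\,dx.$$

Lemma 2.3 If X is distributed according to the MPF distribution, then its kth incomplete moment is given by

$$m_k(y) = \alpha\beta\sum_{i=0}^{\infty}(i + 1)(1 - \beta)^{i}\,B_y\big(k + 1, \alpha(i + 1)\big),$$

where $B_y(a, b) = \int_{0}^{y} t^{a - 1}(1 - t)^{b - 1}\,dt$ is the incomplete beta function.
Proof The proof is analogous to that of Lemma 2.1. ✷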

Entropy
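Entropies are used to quantify the variation, likelihood, or randomness of a random variable. In this section, we present the Rényi entropy measure due to Rényi (1961) of the MPF distribution. The Rényi entropy generalizes the following entropies: Hartley, Shannon, collision and min-entropy, and it is given by

$$I_R(\delta) = \frac{1}{1 - \delta}\log\int_{0}^{1} f^{\delta}(x)\,dx; \quad \delta > 0,\ \delta \ne 1.$$

Lemma 2.4 If X follows the MPF distribution, then its Rényi entropy measure is given by

$$I_R(\delta) = \frac{1}{1 - \delta}\log\left\{(\alpha\beta)^{\delta}\sum_{i=0}^{\infty}\frac{\Gamma(2\delta + i)}{\Gamma(2\delta)\,i!}\cdot\frac{(1 - \beta)^{i}}{\delta(\alpha - 1) + \alpha i + 1}\right\},$$

provided that $\delta(\alpha - 1) + 1 > 0$ and the series converges.
Proof Using Equation (12)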
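The Rényi entropy can also be evaluated by direct numerical integration, which avoids any series expansion. The Python sketch below assumes the standard definition (1/(1 − δ)) log ∫ f(x)^δ dx with δ > 0, δ ≠ 1, and the pdf f(x) = αβ(1 − x)^(α−1) / [1 − (1 − β)(1 − x)^α]²; the midpoint rule and the function names are illustrative choices, and the rule is only reliable when the integrand is bounded (for example α ≥ 1).

```python
import math


def mpf_pdf(x: float, alpha: float, beta: float) -> float:
    """pdf of the MPF distribution (assumed Marshall-Olkin form)."""
    s = (1.0 - x) ** alpha
    return alpha * beta * (1.0 - x) ** (alpha - 1.0) / (1.0 - (1.0 - beta) * s) ** 2


def renyi_entropy(delta: float, alpha: float, beta: float, n: int = 100_000) -> float:
    """Renyi entropy of order delta by the composite midpoint rule on (0, 1)."""
    if delta <= 0.0 or delta == 1.0:
        raise ValueError("delta must be positive and different from 1")
    h = 1.0 / n
    integral = h * sum(mpf_pdf((j + 0.5) * h, alpha, beta) ** delta for j in range(n))
    return math.log(integral) / (1.0 - delta)
```

Two sanity checks follow from general properties of the Rényi entropy: it is zero for the uniform case α = β = 1, and it is non-increasing in δ, so `renyi_entropy(0.5, 2.0, 3.0)` should not be smaller than `renyi_entropy(2.0, 2.0, 3.0)`.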

Lorenz curves
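The Lorenz curve is a popular tool in Economics which gives a graphical representation of the distribution of income or wealth. It was originally introduced by Lorenz (1905) and is defined as $L(p) = \mu^{-1}\int_{0}^{x_p} x f(x)\,dx$, where $x_p$ is the quantile of order $p$ and, for the MPF distribution, the mean is $\mu = \alpha\beta\sum_{i=0}^{\infty}(i + 1)(1 - \beta)^{i}\,B(2, \alpha(i + 1))$.
https://doi.org/10.1080/23311835.2017.1319592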
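A Lorenz curve can be traced numerically from the first incomplete moment. The sketch below assumes the usual definition L(p) = (1/μ) ∫₀^{x_p} x f(x) dx, where x_p is the quantile of order p and μ is the mean, together with the MPF pdf f(x) = αβ(1 − x)^(α−1) / [1 − (1 − β)(1 − x)^α]² and its quantile function x_p = 1 − [(1 − p)/(1 − (1 − β)p)]^(1/α). Function names are hypothetical.

```python
def mpf_pdf(x: float, alpha: float, beta: float) -> float:
    s = (1.0 - x) ** alpha
    return alpha * beta * (1.0 - x) ** (alpha - 1.0) / (1.0 - (1.0 - beta) * s) ** 2


def mpf_quantile(q: float, alpha: float, beta: float) -> float:
    return 1.0 - ((1.0 - q) / (1.0 - (1.0 - beta) * q)) ** (1.0 / alpha)


def first_incomplete_moment(upper: float, alpha: float, beta: float, n: int = 50_000) -> float:
    """Midpoint-rule approximation of the integral of x f(x) over (0, upper)."""
    h = upper / n
    return h * sum((j + 0.5) * h * mpf_pdf((j + 0.5) * h, alpha, beta) for j in range(n))


def lorenz(p: float, alpha: float, beta: float) -> float:
    """Lorenz curve ordinate at population share p in [0, 1]."""
    mean = first_incomplete_moment(1.0, alpha, beta)
    return first_incomplete_moment(mpf_quantile(p, alpha, beta), alpha, beta) / mean
```

Evaluating `lorenz` on a grid of p values gives the curve; it starts at 0, ends at 1, and lies below the diagonal, for example `lorenz(0.5, 2.0, 3.0)` is smaller than 0.5.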

Order statistics
The density of the kth order statistic, denoted by $f_{X_{(k)}}(x)$, of a random sample $X_1, X_2, X_3, \ldots, X_n$ of size $n$ is generally given by

$$f_{X_{(k)}}(x) = \frac{n!}{(k - 1)!\,(n - k)!}\,f(x)\,[F(x)]^{k - 1}\,[1 - F(x)]^{n - k}. \tag{16}$$

The density of the kth order statistic of the MPF distribution could be obtained by substituting Equations (5) and (10) into Equation (16).
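The general order-statistic density n!/((k − 1)!(n − k)!) f(x) F(x)^(k−1) [1 − F(x)]^(n−k) can be evaluated directly from the closed-form cdf and pdf, without the series expansion. The Python sketch below does this under the assumed MPF forms F(x) = [1 − (1 − x)^α] / [1 − (1 − β)(1 − x)^α] and f(x) = αβ(1 − x)^(α−1) / [1 − (1 − β)(1 − x)^α]²; the function names are illustrative.

```python
import math


def mpf_cdf(x: float, alpha: float, beta: float) -> float:
    s = (1.0 - x) ** alpha
    return (1.0 - s) / (1.0 - (1.0 - beta) * s)


def mpf_pdf(x: float, alpha: float, beta: float) -> float:
    s = (1.0 - x) ** alpha
    return alpha * beta * (1.0 - x) ** (alpha - 1.0) / (1.0 - (1.0 - beta) * s) ** 2


def order_stat_pdf(x: float, k: int, n: int, alpha: float, beta: float) -> float:
    """Density of the kth order statistic in a sample of size n."""
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    const = math.factorial(n) / (math.factorial(k - 1) * math.factorial(n - k))
    big_f = mpf_cdf(x, alpha, beta)
    return const * mpf_pdf(x, alpha, beta) * big_f ** (k - 1) * (1.0 - big_f) ** (n - k)


def integrate_unit(f, m: int = 100_000) -> float:
    """Composite midpoint rule on (0, 1)."""
    h = 1.0 / m
    return h * sum(f((j + 0.5) * h) for j in range(m))
```

As a check, `integrate_unit(lambda x: order_stat_pdf(x, 2, 5, 2.0, 3.0))` should be close to 1, and for n = k = 1 the order-statistic density coincides with the pdf itself.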

Parameter estimation
Here, we estimate the parameters of the MPF distribution through the method of maximum likelihood. Suppose X is distributed according to a distribution with pdf f(x); then the joint density of an observed sample of size n, say $x_1, x_2, \ldots, x_n$, viewed as a function of the parameters, is referred to as the likelihood, while the parameter values that maximize it are called the maximum likelihood estimates (mle).
The mle for a sample from any random variable X is defined by

$$\hat{\theta} = \arg\max_{\theta \in \Theta}\prod_{i=1}^{n} f(x_i; \theta),$$

where $\theta$ is the vector of parameters belonging to f(x) and $\Theta$ is the parameter space.
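The mle of the MPF distribution is derived using Equation (6). The log-likelihood function is

$$\mathcal{L}(\Omega) = n\log\alpha + n\log\beta + (\alpha - 1)\sum_{i=1}^{n}\log(1 - x_i) - 2\sum_{i=1}^{n}\log\left[1 - (1 - \beta)(1 - x_i)^{\alpha}\right],$$

with partial derivatives

$$\frac{\partial\mathcal{L}}{\partial\alpha} = \frac{n}{\alpha} + \sum_{i=1}^{n}\log(1 - x_i) + 2(1 - \beta)\sum_{i=1}^{n}\frac{(1 - x_i)^{\alpha}\log(1 - x_i)}{1 - (1 - \beta)(1 - x_i)^{\alpha}}, \tag{22}$$

$$\frac{\partial\mathcal{L}}{\partial\beta} = \frac{n}{\beta} - 2\sum_{i=1}^{n}\frac{(1 - x_i)^{\alpha}}{1 - (1 - \beta)(1 - x_i)^{\alpha}}. \tag{23}$$

The maximum likelihood estimates of $\alpha$ and $\beta$ could be obtained by setting Equations (22) and (23) to zero and solving them simultaneously. However, these equations cannot be solved analytically because of their nonlinear structure. Instead, one could use the built-in optimization functions available in most mathematical and statistical software. For instance, the mle function in the stats4 package, the mle2 function in the bbmle package, and the nlm, nlminb and optim functions in the stats package offer numerical solutions to such problems in the R language (R Core Team, 2013).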
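The numerical maximization can be sketched in a few lines of plain Python. This is an illustration only and not the R workflow mentioned above: it assumes the log-likelihood implied by the pdf f(x) = αβ(1 − x)^(α−1) / [1 − (1 − β)(1 − x)^α]², and it swaps in a simple derivative-free compass (pattern) search on (log α, log β) in place of a library optimizer. The function names, starting values and tolerances are hypothetical choices.

```python
import math
import random


def mpf_loglik(data, alpha: float, beta: float) -> float:
    """Log-likelihood of an MPF sample with values in [0, 1)."""
    n = len(data)
    ll = n * math.log(alpha) + n * math.log(beta)
    for x in data:
        u = 1.0 - x
        ll += (alpha - 1.0) * math.log(u) - 2.0 * math.log(1.0 - (1.0 - beta) * u ** alpha)
    return ll


def mpf_quantile(q: float, alpha: float, beta: float) -> float:
    """Quantile function, used here only to simulate test data."""
    return 1.0 - ((1.0 - q) / (1.0 - (1.0 - beta) * q)) ** (1.0 / alpha)


def fit_mpf(data, start=(1.0, 1.0), step=0.5, tol=1e-6, max_iter=5_000):
    """Compass search on (log alpha, log beta); returns (alpha, beta, loglik)."""
    a, b = math.log(start[0]), math.log(start[1])
    best = mpf_loglik(data, math.exp(a), math.exp(b))
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for da, db in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            cand = mpf_loglik(data, math.exp(a + da), math.exp(b + db))
            if cand > best:  # accept the first improving move
                a, b, best = a + da, b + db, cand
                improved = True
                break
        if not improved:
            step /= 2.0  # no axis direction improves: refine the mesh
    return math.exp(a), math.exp(b), best
```

Working on the log scale keeps both parameters positive without explicit constraints. A typical use is to simulate a sample with `mpf_quantile` and uniform draws from `random.Random`, then call `fit_mpf(sample)`; in practice a library optimizer with gradient information would usually converge faster.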
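Denote $\hat{\Omega} = (\hat{\alpha}, \hat{\beta})'$. Under some standard regularity conditions, $\sqrt{n}(\hat{\Omega} - \Omega)$ has an asymptotic multivariate normal distribution $N_2(0, J_n^{-1}(\Omega))$, where $J_n(\Omega)$ is the expected information matrix defined by $-E(\partial^{2}\mathcal{L}(\Omega)/\partial\Omega\,\partial\Omega')$. The expected information matrix can be approximated by the observed information matrix, denoted by $I_n(\Omega)$, where the diagonal entries of $I_n^{-1}(\Omega)$ are the variances of $\hat{\Omega}$ while the off-diagonal entries are the covariances. Given that $\sqrt{n}(\hat{\Omega} - \Omega) \sim N_2(0, I_n^{-1}(\Omega))$ is available, we can perform statistical inference for functions of $\Omega$. The observed information matrix of the MPF distribution is given by

$$I_n(\Omega) = -\begin{pmatrix} \mathcal{L}_{\alpha\alpha} & \mathcal{L}_{\alpha\beta} \\ \mathcal{L}_{\alpha\beta} & \mathcal{L}_{\beta\beta} \end{pmatrix},$$

where, writing $D_i = 1 - (1 - \beta)(1 - x_i)^{\alpha}$,

$$\mathcal{L}_{\alpha\alpha} = -\frac{n}{\alpha^{2}} + 2(1 - \beta)\sum_{i=1}^{n}\frac{(1 - x_i)^{\alpha}[\log(1 - x_i)]^{2}}{D_i^{2}}, \qquad \mathcal{L}_{\alpha\beta} = -2\sum_{i=1}^{n}\frac{(1 - x_i)^{\alpha}\log(1 - x_i)}{D_i^{2}}$$

and

$$\mathcal{L}_{\beta\beta} = -\frac{n}{\beta^{2}} + 2\sum_{i=1}^{n}\frac{(1 - x_i)^{2\alpha}}{D_i^{2}}.$$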

Monte-Carlo simulation
To investigate the performance of the maximum likelihood estimates of the parameters $\Omega = (\alpha, \beta)'$ of the MPF distribution, we conduct a Monte-Carlo simulation study. First, samples of sizes n = 20, 30, 40, …, 400 were drawn from the MPF distribution with parameters (10.00 and 7.00) and (0.25 and 0.80) using the inverse transform method; secondly, the parameters of the distribution were estimated; and finally, the two steps of drawing a sample of a given size n and estimating the parameters were repeated for N = 5,000 iterations. For each sample size, the parameter estimates, standard errors, bias and mean square errors (MSE) were computed and plotted in Figures 4 and 5, where

$$\text{Bias} = \frac{1}{N}\sum_{i=1}^{N}(\hat{\Omega}_i - \Omega), \qquad \text{MSE} = \frac{1}{N}\sum_{i=1}^{N}(\hat{\Omega}_i - \Omega)^{2}.$$

The simulation results shown in Figures 4 and 5 indicate that the maximum likelihood method provides good estimates of the parameters of the MPF distribution, because the parameter estimates stabilize and approach the true values as n increases, while the standard errors, bias and MSE decrease with increasing n.

Application
In this section we demonstrate the usefulness of the MPF distribution with two data-sets. The first data-set, whose basic statistics are presented in Table 1, is on the Anxiety performance of a group of 166 normal women and was reported in Bourguignon, Ghosh, and Cordeiro (2016). The second illustration is based on the Evaporation data in Table 2, which were extracted from the monthly publication of climatological data of the National Oceanic and Atmospheric Administration (NOAA). The data are on the daily pan evaporation, in hundredths of inches, recorded in September 2016 in San Joaquin Drainage 05 Friant Government Camp, California, USA. The data are freely available at http://www.ncdc.noaa.gov/oa/ncdc.html. Some of the basic statistics of the Evaporation data are listed in Table 3. The values of the two data-sets correspond to the support of the MPF distribution, and this characteristic is one of the main motivations for these illustrations.
The MPF distribution and seven other competing distributions, namely the exponential (Exp), Weibull, Lindley Exponential (LE) due to Bhati, Malik, and Vaman (2015), Exponentiated exponential (EE) due to Gupta and Kundu (2001), Kumaraswamy, Power function (PF) and Beta distributions, are fitted to the data-sets, and their performance is compared by the following goodness-of-fit measures: AIC, AICc, CAIC, BIC, HQC, the W* statistic due to Cramér (1928) and Von Mises (1928), and the A* statistic due to Anderson and Darling (1952).
• Akaike information criterion (AIC),
• AIC with a correction (AICc),
• Consistent Akaike information criterion (CAIC),
• Bayesian information criterion (BIC),
• Hannan-Quinn information criterion (HQC),
• Cramér-von Mises W* criterion,
• Anderson-Darling A* criterion,
where L, k, n and F(⋅) correspond to the estimate of the model maximized log-likelihood function, the number of parameters in the distribution, the sample size of the fitted data and the estimated distribution function under the ordered data, respectively.
The distribution with the smallest goodness-of-fit measures is considered the best candidate for the given data. Results from the model fittings are listed in Tables 4 and 5, while Figures 6 and 7 show the pdfs and cdfs of the estimated distributions superimposed on the empirical ones.
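The likelihood-based criteria are simple functions of the maximized log-likelihood L, the number of parameters k and the sample size n. The Python sketch below uses commonly quoted textbook forms; these are assumptions for illustration and may differ from the exact expressions used in this paper, in particular for CAIC, where the form −2L + k(log n + 1) is taken here. The function name is hypothetical, and the W* and A* statistics are not covered.

```python
import math


def information_criteria(loglik: float, k: int, n: int) -> dict:
    """Common information criteria; smaller values indicate a better fit.

    The formulas are standard textbook forms assumed for illustration.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = -2.0 * loglik + 2.0 * k
    return {
        "AIC": aic,
        "AICc": aic + 2.0 * k * (k + 1) / (n - k - 1),
        "CAIC": -2.0 * loglik + k * (math.log(n) + 1.0),
        "BIC": -2.0 * loglik + k * math.log(n),
        "HQC": -2.0 * loglik + 2.0 * k * math.log(math.log(n)),
    }
```

For a two-parameter model such as the MPF distribution, `information_criteria(loglik, k=2, n=len(data))` returns all five values at once, so competing fits can be ranked criterion by criterion.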
The estimated Fisher information matrix of the MPF distribution under the Anxiety data shows that the mle of $\alpha$ and $\beta$ converges to the global maximum. The inverse of this matrix gives the variance-covariance matrix of the distribution parameters. The Fisher information matrix is given by:
The estimated Fisher information matrix of the MPF distribution under the Evaporation data shows that the mle of $\alpha$ and $\beta$ converges to the global maximum. The inverse of this matrix gives the variance-covariance matrix of the distribution parameters. The Fisher information matrix is given by:
Comparing the results in Tables 4 and 5, one can quickly see that the MPF distribution, with the smallest goodness-of-fit statistics, appears to be the best candidate for the data under consideration. The pdf (left panel) and cdf (right panel) plots of the fitted distributions in Figures 6 and 7 do not suggest otherwise regarding the appropriateness of the MPF distribution in modelling the two data-sets.

Concluding remarks
This paper has contributed a new distribution, called the MPF distribution, to the Marshall-Olkin-G family of distributions. The new distribution is a generalization of the one-parameter Power function distribution. Some of the statistical properties of the MPF distribution have been derived, including the quantile function, skewness, kurtosis, moments, variance, generating functions, Rényi entropy, Lorenz curve and order statistics, and some useful transformations are given. The method of maximum likelihood was used to estimate the parameters of the new distribution, and results based on a Monte-Carlo simulation study support its use for estimating the parameters of the MPF distribution. The goodness-of-fit of the MPF distribution was demonstrated with two real data-sets, and the results show that the MPF distribution offers a better fit to the data-sets than the other competing distributions. We hope that the newly proposed distribution will be widely used across relevant fields.