The New Generalized Exponentiated Fréchet–Weibull Distribution: Properties, Applications, and Regression Model



Introduction
Statistical distributions are extremely useful in describing many real-world phenomena. Specifically, finding an appropriate distribution is a fundamental requirement for analyzing and interpreting data properly. A suitable choice of distribution leads to valid inference and sound conclusions. Many statistical distributions have been proposed and applied to fit real data in many applications, such as education, physics, chemistry, demography, management, and engineering. However, in many of these areas, data may display a complex pattern that cannot be adequately fitted by the classical and traditional distributions. This complexity in data patterns has led to the need for statistical distributions that are more flexible, practical, and accurate in modeling such data. Recently, several studies have attempted to extend the classical models and generate new families of distributions (for a good review of these methods, see [1]).
Alzaatreh et al. [2] introduced a general method for generating families of continuous distributions, called the transformed-transformer (T-X) method. This method generalizes the beta-G [3] and Kumaraswamy-G [4] families by replacing the beta distribution and Kumaraswamy distribution with any continuous distribution for a random variable T defined on [a, b]. In particular, the cumulative distribution function (cdf) of the T-X family can be defined as

$$F(x) = \int_{a}^{W[G(x)]} r(t)\,dt, \qquad (1)$$

where r(t) is the probability density function of T, a is a real number, and W[G(x)] is a function of the cdf G(x) of a random variable X.
Abd-Elmonem et al. [21] applied the T-X method to introduce a new extended distribution, called the Fréchet-Weibull distribution, which is based on the Fréchet distribution as a generator. The cdf and the probability density function (pdf) of the Fréchet-Weibull distribution with four parameters, namely, α, λ > 0 as scale parameters and β, k > 0 as shape parameters, are given in equations (2) and (3), respectively.

On the other hand, Gupta et al. [22] introduced the exponentiated method, in which an existing distribution is generalized by adding an extra shape parameter to its pdf. Subsequently, Cordeiro et al. [23] proposed a new class that generalizes the exponentiated method by adding two extra shape parameters to an existing distribution. Recently, Rezaei et al. [24] introduced a more general method by adding three extra shape parameters to an existing distribution. The cdf and pdf of this new exponentiated family are defined in equations (4) and (5), respectively, where G(x) and g(x) are the cdf and pdf of any statistical distribution and a, b, and θ are positive real numbers. The exponentiated generalized half logistic Fréchet distribution introduced in [25] and the exponentiated generalized exponential Dagum distribution proposed in [26] can be regarded as members of this family. Another member is the exponentiated generalized extended Gompertz distribution in [27], which generalizes the Gompertz distribution.

The purpose of this paper is to increase the flexibility of some existing distributions in order to accommodate the complexity of certain data. To that end, we combine the classical Fréchet-Weibull distribution with the new generalized exponentiated distribution class, yielding the new generalized exponentiated Fréchet-Weibull distribution (NGEFWD). The proposed distribution can be used as an alternative to several existing distributions in modeling different applications. Another goal is related to the significance of regression modeling.
Specifically, real data are frequently explained by other variables, which are referred to as explanatory variables or covariates. Hence, researchers have shown an increasing interest in investigating these relationships through regression analysis. Many regression models have recently been constructed in the literature based on particular distributions of the response variable. In particular, log-location-scale regression models have been considered by many authors based on different distributions. Among these, Silva et al. [28] studied the log-Burr XII regression model, Carrasco et al. [29] introduced the log-modified Weibull regression model, Ortega et al. [30] proposed the log generalized modified Weibull regression model, Pescim et al. [31] developed a log-linear regression model based on the odd log-logistic generalized half-normal distribution, Altun et al. [32] suggested the log Zografos-Balakrishnan BXII distribution, Korkmaz et al. [33] proposed the log odd power Lindley-Weibull regression model, Baharith et al. [34] introduced the log odds exponential Pareto IV regression model, Cordeiro et al. [18] discussed the log-Xgamma Weibull regression model, Eliwa et al. [35] proposed the log odd Lindley half logistic regression model, Altun et al. [36] proposed the log additive odd log-logistic odd Weibull-Weibull regression model, Shama et al. [37] suggested the log gamma Gompertz regression model, and Anwaar Dhiaa and Sunbul Rasheed [38] provided two regression models derived from the Burr XII family of distributions. Thus, a further objective of this paper is to introduce a new regression model based on the NGEFWD.
This paper is organized as follows. In Section 2, the NGEFWD is introduced and some plots of its pdf and hazard rate function (hrf) are provided. In Section 3, we derive the expansion of the pdf of the NGEFWD. In Section 4, we discuss some statistical properties of the new distribution. The maximum likelihood estimates (MLEs) of the model parameters are determined in Section 5. Section 6 discusses the simulation results. In Section 7, the NGEFWD is applied to three real datasets. In Section 8, we propose the log-NGEFW regression model and estimate its parameters by maximum likelihood. Section 9 presents simulation studies on the estimation of the log-NGEFW regression model parameters. In Section 10, three real datasets are investigated to show the flexibility of the new regression model. Finally, Section 11 offers some concluding remarks.

The New Generalized Exponentiated Fréchet-Weibull Distribution
The NGEFWD can be obtained by replacing G(x) in equation (4) by the cdf in equation (2) and g(x) in equation (5) by the pdf in equation (3). That is, a random variable X is said to have the NGEFWD with seven parameters, a, b, θ, β, k > 0 as shape parameters and α, λ > 0 as scale parameters, if its cdf and pdf are defined by equations (6) and (7), respectively. The reliability function and hrf of the NGEFWD then follow, respectively, as R(x) = 1 − F(x) and h(x) = f(x)/R(x), where F(x) and f(x) denote the cdf in equation (6) and the pdf in equation (7). For various values of the distribution's parameters, Figures 1 and 2 illustrate the shapes of the NGEFWD's pdf and hrf, respectively. It can be seen that the NGEFWD can exhibit left-skewed, symmetrical, right-skewed, and reversed-J shaped densities. Also, its hazard rate can take decreasing, upside-down bathtub, reversed bathtub, and reversed-J shapes. Accordingly, the NGEFWD can be considered an appropriate model for fitting a variety of lifetime data in applied areas.
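The construction described above can be sketched numerically. The block below is a minimal illustration, not the authors' code: equations (2)-(7) are not reproduced in this text, so the Fréchet-Weibull cdf G(x) = exp[−α^β (λ/x)^{kβ}] and the family cdf F = 1 − {1 − [1 − (1 − G)^a]^b}^θ used here are assumptions (the second is inferred from the form of z(u) in the quantile function). All function names are hypothetical.

```python
import math

def fw_cdf(x, alpha, beta, k, lam):
    # ASSUMED Frechet-Weibull cdf: G(x) = exp(-alpha^beta * (lam/x)^(k*beta)), x > 0
    return math.exp(-(alpha ** beta) * (lam / x) ** (k * beta))

def fw_pdf(x, alpha, beta, k, lam):
    # Derivative of fw_cdf with respect to x
    t = (alpha ** beta) * (lam / x) ** (k * beta)
    return (k * beta / x) * t * math.exp(-t)

def ngefw_cdf(x, a, b, theta, alpha, beta, k, lam):
    # ASSUMED family cdf: F = 1 - {1 - [1 - (1 - G)^a]^b}^theta
    G = fw_cdf(x, alpha, beta, k, lam)
    inner = 1.0 - (1.0 - G) ** a
    return 1.0 - (1.0 - inner ** b) ** theta

def ngefw_pdf(x, a, b, theta, alpha, beta, k, lam):
    # Chain rule applied to ngefw_cdf
    G = fw_cdf(x, alpha, beta, k, lam)
    g = fw_pdf(x, alpha, beta, k, lam)
    s = 1.0 - G
    inner = 1.0 - s ** a
    return (a * b * theta * g * s ** (a - 1.0) * inner ** (b - 1.0)
            * (1.0 - inner ** b) ** (theta - 1.0))

def ngefw_hrf(x, a, b, theta, alpha, beta, k, lam):
    # h(x) = f(x) / R(x) with R(x) = {1 - [1 - (1 - G)^a]^b}^theta
    G = fw_cdf(x, alpha, beta, k, lam)
    g = fw_pdf(x, alpha, beta, k, lam)
    s = 1.0 - G
    inner = 1.0 - s ** a
    return a * b * theta * g * s ** (a - 1.0) * inner ** (b - 1.0) / (1.0 - inner ** b)
```

Evaluating `ngefw_pdf` and `ngefw_hrf` on a grid of x values for several parameter choices is one way to explore the density and hazard shapes discussed above, subject to the assumed functional forms.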

Expansion of pdf for NGEFWD
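In the following, we express the pdf of the NGEFWD in equation (7) in an expanded form using the binomial expansion defined for a positive real power as

$$(1 - z)^{c} = \sum_{i=0}^{\infty} (-1)^{i} \binom{c}{i} z^{i}, \qquad (9)$$

for |z| < 1 and c > 0 (when c is a nonnegative integer, the sum stops at i = c). Specifically, applying the binomial expansion in equation (9) three times, the pdf of the NGEFWD can be rewritten as the series in equation (10), whose coefficients $w_{ijh}$ are given in equation (11).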
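The binomial expansion used for the pdf can be checked numerically. The following sketch assumes the standard generalized binomial series (1 − z)^c = Σ_i (−1)^i C(c, i) z^i for |z| < 1 and compares a truncated sum with the closed form; the function names and the truncation length are illustrative choices.

```python
def gen_binom(c, i):
    # Generalized binomial coefficient C(c, i) = c (c - 1) ... (c - i + 1) / i! for real c
    out = 1.0
    for j in range(i):
        out *= (c - j) / (j + 1)
    return out

def binomial_series(z, c, terms=200):
    # Partial sum of (1 - z)^c = sum_i (-1)^i C(c, i) z^i, valid for |z| < 1
    return sum((-1) ** i * gen_binom(c, i) * z ** i for i in range(terms))
```

For an integer power the coefficients vanish beyond i = c, so the sum is finite; for a noninteger power the truncation error shrinks geometrically in |z|.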

Statistical Properties
In this section, we derive some useful statistical properties of the NGEFWD.

The Quantile Function and Median.
The quantile function of the NGEFWD is defined in equation (12), where

$$z(u) = 1 - \left(1 - \left\{1 - [1 - u]^{1/\theta}\right\}^{1/b}\right)^{1/a}$$

and u is a uniformly distributed random variable. Setting u = 0.25 or u = 0.75 gives the first or the third quartile of the NGEFWD, respectively, and setting u = 0.5 gives the median.

The Moors kurtosis (MK) measures the degree of tail heaviness (as MK increases, the tail of the distribution becomes heavier). It is defined in [40] as

$$MK = \frac{Q(7/8) - Q(5/8) + Q(3/8) - Q(1/8)}{Q(6/8) - Q(2/8)},$$

where Q(.) is the quantile function in equation (12). From Figure 3, the NGEFWD can be right skewed, and for fixed α, the MK is a decreasing function of θ.

4.3. The rth Moment.
The rth moment of the NGEFWD, $E(X^{r})$, can be obtained as a series in the coefficients $w_{ijh}$ defined in equation (11). The mean and variance of the NGEFWD then follow, respectively, as $E(X)$ and $E(X^{2}) - [E(X)]^{2}$.
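The quantile-based quantities in this subsection can be sketched as follows. The function z(u) is taken from the text; the inversion of the baseline cdf assumes the Fréchet-Weibull form G(x) = exp[−α^β (λ/x)^{kβ}], which is not reproduced in this text, and the Bowley skewness uses its standard quartile-based definition. All names are hypothetical.

```python
import math

def nge_z(u, a, b, theta):
    # z(u) = 1 - (1 - {1 - (1 - u)^(1/theta)}^(1/b))^(1/a): baseline-cdf level to invert
    return 1.0 - (1.0 - (1.0 - (1.0 - u) ** (1.0 / theta)) ** (1.0 / b)) ** (1.0 / a)

def ngefw_quantile(u, a, b, theta, alpha, beta, k, lam):
    # Solve G(x) = z(u) for the ASSUMED baseline G(x) = exp(-alpha^beta (lam/x)^(k beta))
    z = nge_z(u, a, b, theta)
    return lam * (alpha ** beta / -math.log(z)) ** (1.0 / (k * beta))

def bowley_skewness(Q):
    # Standard quartile-based skewness: [Q(3/4) - 2 Q(1/2) + Q(1/4)] / [Q(3/4) - Q(1/4)]
    return (Q(0.75) - 2.0 * Q(0.5) + Q(0.25)) / (Q(0.75) - Q(0.25))

def moors_kurtosis(Q):
    # Octile-based kurtosis: [Q(7/8) - Q(5/8) + Q(3/8) - Q(1/8)] / [Q(6/8) - Q(2/8)]
    return (Q(0.875) - Q(0.625) + Q(0.375) - Q(0.125)) / (Q(0.75) - Q(0.25))
```

Both measures take any quantile function, so `moors_kurtosis(lambda u: ngefw_quantile(u, 2.0, 1.5, 0.8, 1.2, 0.9, 1.5, 2.0))` evaluates MK for one illustrative parameter set. Values of u strictly between 0 and 1 are required, since the endpoints lead to log(0) or a division by zero.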

The Moment Generating Function and Characteristic Function.
Based on the expansion $e^{tx} = \sum_{r=0}^{\infty} t^{r} x^{r}/r!$, the moment generating function can be calculated from the rth moment of the NGEFWD as

$$M_X(t) = \sum_{r=0}^{\infty} \frac{t^{r}}{r!}\, E(X^{r}),$$

where $E(X^{r})$ is expressed through $w_{ijh}$ defined in equation (11).

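Similarly, we can obtain the characteristic function based on the rth moment of the NGEFWD as

$$\varphi_X(t) = \sum_{r=0}^{\infty} \frac{(\mathrm{i}t)^{r}}{r!}\, E(X^{r}),$$

where $\mathrm{i}$ is the imaginary unit and $E(X^{r})$ is expressed through $w_{ijh}$ defined in equation (11).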

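Order Statistics.
Suppose $x_1, x_2, \ldots, x_n$ is a random sample from the NGEFWD and $X_L$ is the Lth order statistic; then, the pdf of this Lth order statistic is defined as

$$f_{L}(x) = \frac{n!}{(L-1)!\,(n-L)!}\, f(x)\,[F(x)]^{L-1}\,[1 - F(x)]^{n-L},$$

where f(x) and F(x) are the pdf and cdf of the NGEFWD defined, respectively, in equations (7) and (6). By using the binomial expansion, we can write

$$[1 - F(x)]^{n-L} = \sum_{m=0}^{n-L} (-1)^{m} \binom{n-L}{m} [F(x)]^{m}.$$

Thus,

$$f_{L}(x) = \frac{n!}{(L-1)!\,(n-L)!} \sum_{m=0}^{n-L} (-1)^{m} \binom{n-L}{m} f(x)\,[F(x)]^{L+m-1}.$$

Substituting equations (6) and (7) and applying the binomial expansion in equation (9) four times yields a series representation of this pdf in terms of the model parameters.

4.6. Rényi Entropy.
The Rényi entropy of a random variable represents a measure of variation of the uncertainty, and it is defined as

$$I_R(\rho) = \frac{1}{1 - \rho} \log \int_{0}^{\infty} f^{\rho}(x)\,dx, \qquad \rho > 0,\ \rho \neq 1.$$

The Rényi entropy of the NGEFWD is obtained by substituting the pdf in equation (7) into this definition, as in equation (26). By using the binomial expansion in equation (9) three times, $f^{\rho}(x)$ is written as a series, from which the Rényi entropy of the NGEFWD can be obtained.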

Estimation of the NGEFWD Parameters
This section provides the estimation of the unknown parameters using the maximum likelihood technique, which is the most widely used estimation method. Let $x_1, x_2, \ldots, x_n$ be a random sample from the NGEFWD with unknown parameters Θ = (a, b, θ, α, β, k, λ); then, the log likelihood function is given as

$$\ell(\Theta) = \sum_{i=1}^{n} \log f(x_i; \Theta), \qquad (31)$$

where f is the pdf in equation (7).

By taking the first partial derivatives of the log likelihood function with respect to a, b, θ, α, β, k, and λ and setting them to zero, we obtain the system of nonlinear equations (32)-(38). The MLEs of the unknown parameters can then be obtained by solving this system numerically. Alternatively, equation (31) can be maximized directly using an optimization technique in any software, such as the R statistical program.
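As a rough sketch of the direct-maximization route, the log likelihood can be coded on the log scale and handed to a general-purpose optimizer. The block below is an illustration only: the pdf is the assumed form f = abθ g (1 − G)^{a−1} [1 − (1 − G)^a]^{b−1} {1 − [1 − (1 − G)^a]^b}^{θ−1} with the assumed Fréchet-Weibull baseline G(x) = exp[−α^β (λ/x)^{kβ}], neither of which is reproduced in this text, and the function names are hypothetical.

```python
import math

def ngefw_loglik(params, data):
    # params = (a, b, theta, alpha, beta, k, lam); returns sum_i log f(x_i) under the ASSUMED pdf
    a, b, theta, alpha, beta, k, lam = params
    if min(params) <= 0.0:
        return -math.inf                      # outside the parameter space
    total = 0.0
    for x in data:
        t = alpha ** beta * (lam / x) ** (k * beta)
        s = -math.expm1(-t)                   # 1 - G(x), computed stably
        inner = 1.0 - s ** a                  # 1 - (1 - G)^a
        if t <= 0.0 or not (0.0 < inner < 1.0):
            return -math.inf                  # numerically degenerate observation
        log_g = math.log(k * beta / x) + math.log(t) - t   # log of the baseline pdf
        total += (math.log(a * b * theta) + log_g
                  + (a - 1.0) * math.log(s)
                  + (b - 1.0) * math.log(inner)
                  + (theta - 1.0) * math.log1p(-inner ** b))
    return total

def neg_loglik(params, data):
    # Objective for a minimizer: the negative log likelihood
    return -ngefw_loglik(params, data)
```

A derivative-free minimizer applied to `neg_loglik`, started from several initial values, would play the role that `optim` plays in R. With seven parameters the surface can be flat or multimodal, so checking convergence from more than one start is advisable.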

Simulation Studies for the NGEFWD
In this section, some simulation studies are performed to examine the accuracy of the MLEs of the NGEFWD. The results were obtained by generating N = 1000 samples from the NGEFWD with different sample sizes, n = 25, 50, 100, 200, and 500, and with various cases for the true parameter values. The quantile function in equation (12) is applied to generate random samples from the NGEFWD, where u is uniformly distributed. The mean square error (MSE) and the root mean square error (RMSE) were computed for each parameter in order to investigate its accuracy using the following relations:

$$\mathrm{MSE} = \frac{1}{N} \sum_{j=1}^{N} \left(\hat{\theta}_j - \theta_{tr}\right)^{2}, \qquad \mathrm{RMSE} = \sqrt{\mathrm{MSE}},$$

where N is the number of generated samples, n is the size of each sample, $\hat{\theta}_j$ is the MLE from the jth sample, and $\theta_{tr}$ is the true value of each parameter. From Table 1, it can be seen that when the sample size n increases, the MLEs become closer to the true values of the parameters, and hence the MSE and RMSE decrease and tend to zero. The results demonstrate that the maximum likelihood method provides an accurate estimation of the parameters of the NGEFWD.
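The accuracy summaries used in the simulation can be computed as below. This is a minimal sketch assuming the usual definitions, MSE as the average squared deviation of the replicated estimates from the true value and RMSE as its square root; the function name is illustrative.

```python
import math

def accuracy_summary(estimates, true_value):
    # estimates: one MLE per simulated sample (N replications) for a single parameter
    n_rep = len(estimates)
    mean_estimate = sum(estimates) / n_rep
    mse = sum((e - true_value) ** 2 for e in estimates) / n_rep
    return mean_estimate, mse, math.sqrt(mse)
```

In a full study this function would be applied to each of the seven parameters, for every sample size and parameter case.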

Applications for the NGEFWD
In this section, some applications of the NGEFWD are provided to illustrate its usefulness, using three real datasets. The goodness of fit of the NGEFWD is compared with some of its submodels and a related distribution. Specifically, the fit of the NGEFWD is compared to the following distributions.
(iv) The exponentiated generalized Fréchet-Weibull distribution (EGFWD).
(v) The Kumaraswamy-Weibull-Burr XII distribution (KWBXIID) in [41].
The comparison is based on several criteria, namely, the negative log likelihood function (−LogL), the Akaike information criterion (AIC), the consistent Akaike information criterion (CAIC), the Bayesian information criterion (BIC), the Hannan-Quinn information criterion (HQIC), and the Kolmogorov-Smirnov (KS) statistic. The best model to fit the data is the one with the lowest values of AIC, CAIC, BIC, HQIC, and KS and the highest p value. The MLEs of the model parameters were computed using the "optim" function in R. Furthermore, the observed frequencies for the data are plotted and compared with the expected frequencies for each model. The results are reported in Tables 2-4.
7.1. First Dataset.
We will consider the dataset discussed in [42], which represents the ages of 155 patients suffering from breast tumors from June to October in 2014.
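The comparison criteria can be sketched as follows. The formulas are the commonly used ones and are assumptions here, since the paper's own expressions are not reproduced in this text; in particular, CAIC is coded as Bozdogan's consistent AIC, −2 log L + p(log n + 1), whereas some papers use the same abbreviation for a small-sample corrected AIC. Function names are hypothetical.

```python
import math

def information_criteria(loglik, p, n):
    # loglik: maximized log likelihood; p: number of parameters; n: sample size
    return {
        "AIC": 2.0 * p - 2.0 * loglik,
        "CAIC": -2.0 * loglik + p * (math.log(n) + 1.0),   # ASSUMED: consistent AIC
        "BIC": p * math.log(n) - 2.0 * loglik,
        "HQIC": 2.0 * p * math.log(math.log(n)) - 2.0 * loglik,
    }

def ks_statistic(data, cdf):
    # Largest distance between the empirical cdf and a fitted cdf
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d
```

For example, `information_criteria(loglik, 7, 155)` would apply to a seven-parameter fit to a sample of 155 observations, with `loglik` the maximized log likelihood of that fit.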

The Log New Generalized Exponentiated Fréchet-Weibull Regression Model
Assume that X is a random variable from the NGEFWD given in equation (7) and let Y = log(X). Then, the cdf and pdf of the log new generalized exponentiated Fréchet-Weibull (LNGEFW) regression model are expressed using the transformation parameters μ = log(λ) and σ = 1/k, where α, σ > 0 are the scale parameters, a, b, θ, β > 0 are the shape parameters, and −∞ < μ < ∞ is the location parameter. The survival function of the LNGEFW regression model is obtained as one minus this cdf. The standardized random variable z = (y − μ)/σ has the pdf in equation (47), with survival function in equation (48).

Maximum Likelihood Estimation of the LNGEFW Regression Model.
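For right-censored lifetime data, let $(y_1, \eta_1), \ldots, (y_n, \eta_n)$ be a random sample of n observations, where each random response variable is obtained as $y_i = \min\{\log(x_i), \log(C_i)\}$. Let $\log(x_i)$ be the log-lifetime and $\log(C_i)$ be the log-censoring time, which are independent and random; then, the likelihood function L(Θ) for the parameter vector $\Theta = (a, b, \theta, \alpha, \beta, \sigma, \upsilon^{T})^{T}$ is given as

$$L(\Theta) = \prod_{i=1}^{n} [f(y_i)]^{\delta_i}\,[S(y_i)]^{1-\delta_i},$$

where $\delta_i$ is the indicator random variable, with $\delta_i = 1$ if the ith observation is a lifetime and $\delta_i = 0$ if it is censored. Then, the log likelihood function can be obtained as

$$\ell(\Theta) = -r \log \sigma + \sum_{i=1}^{n} \delta_i \log f(z_i) + \sum_{i=1}^{n} (1 - \delta_i) \log SF(z_i),$$

where $f(z_i)$ and $SF(z_i)$ are given in equations (47) and (48), respectively, r denotes the number of uncensored observations, and $z_i = (y_i - \eta_i^{T}\upsilon)/\sigma$. The MLE of the parameter vector Θ can be obtained by maximizing the log likelihood function.

The martingale residual was introduced in [45] as

$$r_{M_i} = \delta_i + \log S(y_i; \hat{\Theta}),$$

where $\delta_i$ is the censor indicator; $\delta_i = 1$ if the ith observation is a lifetime and $\delta_i = 0$ if the ith observation is censored. The martingale residual of the LNGEFW regression model is obtained by evaluating this expression with the survival function in equation (48) at $z_i = (y_i - \eta_i^{T}\upsilon)/\sigma$. The residual $r_{M_i}$ takes values between −∞ and +1 and has a skewed distribution.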

Deviance Residual.
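Therneau et al. [46] defined the deviance residual, which reduces the skewness of the martingale residual and is more symmetrically distributed around zero, as

$$r_{D_i} = \operatorname{sign}(r_{M_i}) \left\{ -2 \left[ r_{M_i} + \delta_i \log(\delta_i - r_{M_i}) \right] \right\}^{1/2},$$

where $r_{M_i}$ is given in equation (54). The deviance residual of the LNGEFW regression model follows by substituting the martingale residual of the model into this expression.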
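The two residuals can be sketched in a few lines. The block assumes the standard forms, r_M = δ + log S and r_D = sign(r_M) {−2 [r_M + δ log(δ − r_M)]}^{1/2}, and takes the fitted survival probability S at each observation as an input rather than computing it from the LNGEFW model; the function names are illustrative.

```python
import math

def martingale_residual(delta, surv):
    # delta: 1 if the lifetime is observed, 0 if censored; surv: fitted survival probability in (0, 1)
    return delta + math.log(surv)

def deviance_residual(delta, surv):
    r_m = martingale_residual(delta, surv)
    sign = (r_m > 0) - (r_m < 0)
    # delta * log(delta - r_m) is zero for censored observations (delta = 0)
    term = math.log(delta - r_m) if delta == 1 else 0.0
    # max(0, .) guards against tiny negative values caused by rounding
    return sign * math.sqrt(max(0.0, -2.0 * (r_m + term)))
```

Plotting `deviance_residual` against the observation index, as done for the fitted models later in the paper, flags observations whose residuals fall outside a band such as (−3, 3).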

Simulation Studies for the Log New Generalized Exponentiated Fréchet-Weibull Regression Model
We conduct Monte Carlo simulation studies for various sample sizes (n), parameter values, and censoring percentages to investigate the accuracy of the MLEs in the LNGEFW regression model. The lifetimes $x_1, \ldots, x_n$ are sampled from the NGEFWD in equation (7) considering the reparametrization μ = log(λ) and σ = 1/k and taking $\mu_i = \upsilon_0 + \upsilon_1 \eta_i$, where $\eta_i$ is the explanatory variable generated from a standard uniform distribution. The considered values for the parameters are
Case I: $\upsilon_0$ = 3.4, $\upsilon_1$ = 2.9, σ = 3.6, a = 2.6, b = 4.2, θ = 4.6, α = 0.6, β = 0.7.
Noninformative censoring is commonly used in different studies. The censoring times $C_1, \ldots, C_n$ are generated from a uniform distribution on (0, τ), where the indicator random variable is $\delta_i = 1$ if $x_i \le C_i$ and $\delta_i = 0$ otherwise. The value of τ is adjusted until the censoring proportions of 0.1, 0.3, and 0.5 are reached. The lifetimes considered in each fit are calculated as $y_i = \min\{\log(x_i), \log(C_i)\}$. This simulation was repeated N = 1000 times, and for each parameter, the mean estimate, MSE, and RMSE are calculated. The results are listed in Table 8.
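The censoring step can be sketched as below. The paper does not say how τ is tuned, so the bisection used here is an assumption, as are the function names. Writing C_i = τ U_i with fixed standard uniform draws U_i makes the censored share a nonincreasing function of τ, which is what allows bisection to work.

```python
import random

def censored_fraction(lifetimes, unit_draws, tau):
    # Share of observations censored when C_i = tau * U_i, i.e. C_i ~ Uniform(0, tau)
    return sum(tau * u < x for x, u in zip(lifetimes, unit_draws)) / len(lifetimes)

def calibrate_tau(lifetimes, unit_draws, target, iters=60):
    # Bisection on tau (an ASSUMED tuning scheme) until the censored share meets the target
    lo, hi = 0.0, max(lifetimes)
    while censored_fraction(lifetimes, unit_draws, hi) > target:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if censored_fraction(lifetimes, unit_draws, mid) > target:
            lo = mid
        else:
            hi = mid
    return hi

def censor(lifetimes, unit_draws, tau):
    # Returns (observed time, delta) pairs, delta = 1 if the lifetime is observed
    pairs = []
    for x, u in zip(lifetimes, unit_draws):
        c = tau * u
        pairs.append((min(x, c), 1 if x <= c else 0))
    return pairs
```

The observed times are returned on the original scale; taking logarithms of them gives the responses y_i = min{log(x_i), log(C_i)} used in the fits. Unit draws can be generated with `random.Random(seed).random()`.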
Table 8 shows that as the sample size increases, the MSE and RMSE of the estimates decrease and the estimates tend to the true values of the parameters. Also, for the same sample size, the MSE and RMSE of the parameter estimates increase as the censoring level increases. The results indicate that the maximum likelihood method provides consistent estimation of the parameters of the LNGEFW regression model.

Applications for the Log New Generalized Exponentiated Fréchet-Weibull Regression Model
In this section, three real datasets are used to illustrate the usefulness of the LNGEFW regression model. For the three applications, the maximum likelihood method is applied to obtain the estimates of the parameters of the LNGEFW regression model. The estimates and their standard errors (SEs) are calculated along with the AIC, CAIC, BIC, and HQIC to compare the LNGEFW regression model with some competitive models, namely, the log-Burr XII (LBXII) regression model in [28], the log Topp-Leone-Fréchet (LTLF) regression model in [47], and the log Topp-Leone generated Weibull (LTLGW) regression model in [48]. The estimates and their SEs are reported in Tables 9-11, while Tables 12-14 summarize the information criteria for each analyzed dataset.
10.1. Voltage Data.
Lawless [49] introduced an experiment in which specimens of solid epoxy electrical insulation were considered in an accelerated voltage life test. The sample size of the data is n = 60 with 10% censored observations, and there are three levels of voltage: 52.5, 55.0, and 57.5 kV. The variables considered in the study are as follows: $y_i$: failure times for epoxy insulation specimens, $cens_i$: censoring indicator (0 = censoring, 1 = lifetime observed), and $\eta_{i1}$: voltage (kV). The results are obtained by fitting the regression model with voltage as the covariate, where $y_i$ follows the NGEFWD.

Leukemia Data.
Leukemia data are presented in [49]. These data contain information on 33 patients who were diagnosed with leukemia. The variables involved in the study are as follows: $t_i$: survival time, $y_i$: log survival time, $cens_i$: censoring indicator (0 = censoring, 1 = lifetime), $\eta_{i1}$: white blood cell characteristics test (0 = negative, 1 = positive), and $\eta_{i2}$: white blood cell count. The regression model is fitted with these two covariates, where $y_i$ follows the NGEFWD.

Stanford Heart Transplant Data.
As in the previous applications, $y_i$ follows the NGEFWD in the fitted regression model. The results in Tables 12-14 show that the LNGEFW regression model has the smallest values of AIC, CAIC, BIC, and HQIC for the voltage, leukemia, and Stanford heart transplant data compared to the other competitive models. Therefore, the LNGEFW regression model provides the best fit to the three datasets among the models considered. Figures 7-9 show the deviance residuals against the index of the observations for all datasets. It can be noted that all observations fall within the interval (−3, 3), except observation 26 in Figure 9. Thus, observation 26 in Figure 9 is a possible outlier. In addition, from these figures, it can be seen that all points lie inside the envelope, which indicates that the LNGEFW regression model provides a good fit to all datasets.

The bold values represent the best results in the tables.

Conclusions
Introducing flexible distributions is of great importance for modeling different real datasets more accurately. In addition, regression models must be developed to analyze the effect of covariates on the data in numerous practical applications. Thus, in this article, the NGEFWD is proposed in order to handle the complex patterns of some datasets. Some useful statistical properties of the new distribution are derived. The maximum likelihood method is applied to estimate the model's parameters. Additionally, in order to examine these MLEs, some Monte Carlo simulation studies are conducted for different cases; the results indicate that the proposed estimators perform well and that a better estimate is obtained as the sample size increases. Thus, the consistency, normality, and maximum efficiency properties of the MLEs are supported by the simulations. The suggested distribution can be applied in different areas, such as engineering, reliability, and many other real-life settings. Hence, the usefulness of the new distribution is examined by analyzing three real datasets. It has been observed that the NGEFWD consistently provides a better and more accurate fit than some other common competitive models. Moreover, based on the NGEFWD, the log-location-scale technique is applied to introduce the LNGEFW regression model. The maximum likelihood method for right-censored data is considered to estimate the parameters of the LNGEFW regression model. Some simulation studies with various values of parameters, sample size, and censoring percentage are considered to demonstrate the new regression model's versatility. Based on the results of two Monte Carlo simulation studies conducted for the LNGEFW regression model, the MLEs provided consistently good results.
The LNGEFW regression model performed very well when applied to three real-world datasets and provided the best fits among the competitor regression models based on the information criteria. Therefore, it can be considered the most appropriate model among those compared. Hence, the NGEFWD and its extension, the LNGEFW regression model, are expected to attract attention in various applied sciences due to their suitability and flexibility. Further studies could be conducted using other methods of estimation, such as the moment estimation method, and different regression techniques, such as quantile regression.

Data Availability
The references for the data used to support the findings of this study are cited within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.