Classical and Bayesian inference for the new four-parameter Lomax distribution with applications

In this study, a new four-parameter Lomax distribution is proposed using a new alpha power transformation technique. The new distribution is named the "New Alpha Power Transformed Power Lomax Distribution." Mathematical properties, including moments, the moment-generating function, the mean residual life, order statistics, and the quantile function, are obtained. The maximum likelihood approach is used to estimate the model parameters, and a comprehensive simulation study evaluates the behavior of the maximum likelihood estimators. Two real-world data sets are used to demonstrate the usefulness of the proposed model, and the results show that the new model performs better than its competitors when fitting lifetime data sets. Finally, Bayesian estimation with the Metropolis-Hastings algorithm is used to construct approximate Bayes estimates for the data sets, and convergence diagnostics for the Markov Chain Monte Carlo output are applied.


Introduction
The significance of statistical theory stems from its capacity to analyze different types of data sets. Lifetime data modeling is imperative in many areas, including medicine, engineering, insurance, and finance. It is common practice to model lifetime data sets using probability distributions and their generalizations. Finding adaptable probability models that accommodate varied data types with extreme observations is a difficult task for researchers, because the traditional probability models do not provide the best fit when the data sets are heavy-tailed or contain extreme observations. Adding one or two parameters to the fundamental distributions encourages the development of new modeling ideas in distribution theory. In recent years, extending existing distributions has been a common way to derive flexible distributions that can be tailored to actual data.
Moreover, Ref. [36] introduced an approach that induces a new parameter into the baseline distribution. The resulting family is called the New Alpha Power Transformed (NAPT) family, and its pdf and cdf are given by the following formulae. Using the same generalized family, Ref. [37] presented a new extension of the power Lindley distribution and used it to analyze wind speed data.
The key purpose of the present study is to introduce a new four-parameter Lomax distribution. The new model is named the "New Alpha Power Transformed Power Lomax (NAPTPLx) distribution". We derive some of its important mathematical properties. The parameters of the NAPTPLx distribution are estimated using the maximum likelihood (ML) estimation technique. A comprehensive simulation study is used to assess the behavior of the derived ML estimators. We use two data sets from different fields to show the usefulness of the new distribution. Finally, we also analyze the data sets using the Bayesian estimation approach with Markov Chain Monte Carlo (MCMC) methods.
The remaining portions of the manuscript are organized as follows. The new probability distribution is introduced in Section 2. Section 3 derives several mathematical properties, including the quantile function, raw moments, incomplete moments, the moment-generating function, order statistics, the survival and hazard functions, and the mean residual life function. In Section 4, the model's parameters are estimated and a thorough simulation analysis is presented. Section 5 provides the formulation of the Bayesian model. Section 6 provides two examples of practical data applications, together with Bayesian estimation using the MCMC approach. Finally, we conclude our study in Section 7.

The NAPTPLx model
The cdf of the NAPTPLx distribution is derived by incorporating equation (1) in equation (2), and the corresponding pdf is obtained from it. In addition, by using a series expansion, an alternative form of equation (3) is obtained as equation (4). The shape of the density function depends on four parameters. The limiting behavior of the density is examined at both the lower and the upper limit of its support. Fig. 1 lists all possible shapes of the probability density function. The adaptability of the NAPTPLx distribution is demonstrated by the varying shapes of its pdf curves. The model under study is categorized into three subfamilies. In the first subfamily (β < 1), the pdf curves decline sharply. In the second subfamily (β = 1), the density curves are decreasing but start from a specific point on the vertical axis. In the third subfamily (β > 1), the pdf curves are unimodal. These subfamilies, with their different density shapes, show the flexibility of the proposed distribution.
The survival function, the failure rate (hazard rate) function (hrf), and the cumulative hrf of the NAPTPLx distribution are obtained from the cdf and pdf given above.
M. Abdullah et al.
The reverse hrf of the NAPTPLx distribution is also derived. The different shapes of the hrf are given in Fig. 2, which shows some interesting features of the failure rate pattern of the NAPTPLx distribution. The hrf curves start from the origin when β > 1, start from a specific point on the y-axis when β = 1, and are L-shaped when β < 1.
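For reference, the reliability functions named in this section have standard definitions in terms of a generic cdf F and pdf f on the positive half-line. The block below restates only these textbook definitions; it does not reproduce the closed-form NAPTPLx expressions, and the symbols S, h, H, and r are notation assumed here.

```latex
\begin{align*}
S(x) &= 1 - F(x)                                   && \text{(survival function)} \\
h(x) &= \frac{f(x)}{S(x)}                          && \text{(hazard rate function)} \\
H(x) &= \int_0^x h(t)\,dt = -\log S(x)             && \text{(cumulative hazard rate function)} \\
r(x) &= \frac{f(x)}{F(x)}                          && \text{(reverse hazard rate function)}
\end{align*}
```

Substituting the NAPTPLx cdf and pdf into these definitions gives the corresponding model-specific functions.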

Statistical properties
In this section, we derive several essential statistical properties of the NAPTPLx distribution.

Quantiles
The quantile function (qf) is a key statistical measure and is used for data generation. The qf of the NAPTPLx distribution is given in equation (5), where p follows a uniform distribution on (0, 1). The quartiles of the NAPTPLx distribution are calculated numerically and presented in Table 1.
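Numerical quantile evaluation and data generation of this kind can be sketched as follows. This is a minimal illustration only: it inverts a cdf by bisection and uses the power Lomax baseline cdf, in one common parameterization, as a stand-in. The function names and parameter values are assumptions, and the NAPTPLx cdf itself is not reproduced here.

```python
import random


def power_lomax_cdf(x, beta, theta, lam):
    """Stand-in cdf (power Lomax baseline); NOT the NAPTPLx cdf."""
    if x <= 0.0:
        return 0.0
    return 1.0 - (1.0 + x ** beta / lam) ** (-theta)


def quantile(cdf, p, tol=1e-10):
    """Solve cdf(x) = p on x > 0 by bracketing followed by bisection."""
    lo, hi = 0.0, 1.0
    while cdf(hi) < p:              # enlarge the bracket until it holds the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sample(cdf, n, rng):
    """Inverse-transform sampling: feed uniform draws to the quantile solver."""
    return [quantile(cdf, rng.random()) for _ in range(n)]


# Illustrative parameter values (hypothetical, not taken from Table 1).
cdf = lambda x: power_lomax_cdf(x, beta=2.0, theta=3.0, lam=1.5)
quartiles = [quantile(cdf, p) for p in (0.25, 0.50, 0.75)]
draws = sample(cdf, 5, random.Random(1))
```

Any monotone cdf can be passed to `quantile`, so the same routine would apply once the NAPTPLx cdf is coded.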

Moments
Moments play a significant part in statistical analysis and its applications. They can be used to investigate the most prominent traits and properties of a model, such as the mean, variance, dispersion index, skewness, and kurtosis.
The rth moment can be derived using the formula $\mu'_r = E(X^r) = \int_0^\infty x^r f(x)\,dx$, which is equation (6). Substituting equation (4) into equation (6) and making the transformation $y = x^{\beta}/\lambda$, and then transforming $y = w/(1-w)$ in equation (8), we obtain the final expression after some simplification.

Moment-generating function
The moment-generating function (mgf) is obtained from the formula $M_X(t) = E(e^{tX}) = \int_0^\infty e^{tx} f(x)\,dx$. The mgf of the NAPTPLx distribution is derived by using equation (4). Table 2 lists some computational measures for various parameter choices, including the mean, standard deviation (SD), coefficient of variation (CV), dispersion index (DI), coefficient of skewness (CS), and coefficient of kurtosis (CK).

Incomplete moments
The rth incomplete moment of the NAPTPLx distribution is defined as $\varphi_r(t) = \int_0^t x^r f(x)\,dx$.

Order statistics
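Let $X_1, X_2, X_3, \ldots, X_n$ be a random sample taken from the NAPTPLx distribution and $X_{1:n}, X_{2:n}, X_{3:n}, \ldots, X_{n:n}$ be the corresponding order statistics. The probability density function of the kth order statistic is
\[ f_{k:n}(x) = \frac{n!}{(k-1)!\,(n-k)!}\, f(x)\,[F(x)]^{k-1}\,[1-F(x)]^{n-k}, \tag{9} \]
where f(x) and F(x) are the pdf and cdf of the NAPTPLx distribution. The binomial expansion of $[1-F(x)]^{n-k}$ is
\[ [1-F(x)]^{n-k} = \sum_{j=0}^{n-k} (-1)^{j} \binom{n-k}{j} [F(x)]^{j}. \tag{10} \]
Putting equation (10) in equation (9) gives
\[ f_{k:n}(x) = \frac{n!}{(k-1)!\,(n-k)!} \sum_{j=0}^{n-k} (-1)^{j} \binom{n-k}{j} f(x)\,[F(x)]^{k+j-1}. \]
Substituting the pdf and cdf of the NAPTPLx distribution yields the density of the kth order statistic for the NAPTPLx distribution. Setting k = 1 and k = n in equation (11) gives the densities of the smallest and largest order statistics, respectively.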

Mean residual life
The mean residual life (MRL) of the NAPTPLx distribution is derived from its definition. After a transformation and algebraic simplification, and after substituting equation (12) in equation (11), the final expression for the MRL is obtained.

Parameter estimation of NAPTPLx distribution
In this section, we consider maximum likelihood estimation (MLE) for a given sample of size n from the NAPTPLx distribution. The likelihood equations are obtained by setting the partial derivatives of the log-likelihood function with respect to α, β, θ, and λ equal to zero. Since these equations cannot be solved in closed form, we use R software to solve the non-linear equations numerically with Newton-Raphson and other optimization techniques.

Table 3
Simulation study results for the NAPTPLx distribution.

Table 4
Simulation study results for the NAPTPLx distribution.
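The Newton-Raphson step used for likelihood equations of this kind can be sketched on a simpler model. The example below is an illustration under stated assumptions: it solves the profile score equation for the shape of a two-parameter Weibull distribution, which stands in for the NAPTPLx likelihood equations (not reproduced here). The function name, starting value, and simulated data are hypothetical.

```python
import math
import random


def weibull_mle(data, k0=1.0, tol=1e-10, max_iter=100):
    """Newton-Raphson on the Weibull profile score equation g(k) = 0.

    Stand-in example only; it does not use the NAPTPLx likelihood.
    """
    n = len(data)
    logs = [math.log(x) for x in data]
    mean_log = sum(logs) / n
    k = k0
    for _ in range(max_iter):
        w = [x ** k for x in data]
        s0 = sum(w)
        s1 = sum(wi * li for wi, li in zip(w, logs))
        s2 = sum(wi * li * li for wi, li in zip(w, logs))
        score = s1 / s0 - 1.0 / k - mean_log                  # g(k)
        slope = (s2 * s0 - s1 * s1) / (s0 * s0) + 1.0 / (k * k)  # g'(k) > 0
        new_k = k - score / slope                             # Newton update
        if new_k <= 0.0:                                      # keep the shape positive
            new_k = 0.5 * k
        converged = abs(new_k - k) < tol
        k = new_k
        if converged:
            break
    scale = (sum(x ** k for x in data) / n) ** (1.0 / k)      # closed form given k
    return k, scale


rng = random.Random(42)
data = [rng.weibullvariate(2.0, 1.5) for _ in range(2000)]    # scale 2.0, shape 1.5
k_hat, scale_hat = weibull_mle(data)
```

For a four-parameter model the same idea applies with a score vector and a Hessian matrix in place of the scalar score and slope, which is why general-purpose optimizers are usually preferred in practice.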

Simulation study
A comprehensive simulation analysis is used in this section to assess the behavior of the derived maximum likelihood estimators. Random samples of sizes 10, 30, 80, 100, and 200 are generated for this numerical study, and the process is repeated 10,000 times. The absolute bias (AB) and mean square error (MSE) are computed and used for evaluation. The following combinations of parameters are used to simulate samples. The simulation results are given in Tables 3 and 4.
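The structure of such a Monte Carlo study can be sketched as below. This is a minimal sketch under stated assumptions: the exponential distribution with its closed-form rate MLE stands in for the NAPTPLx model, the number of replications is reduced to 2,000 to keep the run short, and AB is taken here to be the absolute value of the average estimation error, which is one common convention and may differ from the definition used for Tables 3 and 4.

```python
import random


def simulate(true_rate, n, reps, rng):
    """Return (absolute bias, mean square error) of the MLE over `reps` samples."""
    estimates = []
    for _ in range(reps):
        sample = [rng.expovariate(true_rate) for _ in range(n)]
        estimates.append(n / sum(sample))          # MLE of the exponential rate
    mean_est = sum(estimates) / reps
    ab = abs(mean_est - true_rate)                 # |average estimate - truth|
    mse = sum((e - true_rate) ** 2 for e in estimates) / reps
    return ab, mse


rng = random.Random(2023)
sample_sizes = (10, 30, 80, 100, 200)              # sample sizes used in the study
results = {n: simulate(2.0, n, 2000, rng) for n in sample_sizes}
```

With a consistent estimator, both measures are expected to shrink as the sample size grows, which is the pattern such tables are read for.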

Bayesian analysis
Since the Bayesian framework treats model parameters as random variables, prior distributions must be specified to estimate the parameters of the NAPTPLx distribution. Choosing a prior distribution is therefore crucial for parameter estimation. Independent gamma priors $G(a_1, b_1)$, $G(a_2, b_2)$, $G(a_3, b_3)$, and $G(a_4, b_4)$ are our choice for α, β, θ, and λ, respectively. The gamma prior is flexible, covers the non-informative case for suitable hyperparameters, and is a conjugate prior for many standard likelihoods. These factors led to the selection of this prior density. In the gamma prior densities, $a_1, a_2, a_3, a_4, b_1, b_2, b_3$, and $b_4$ are the hyperparameters, and all are positive constants. The joint prior is the product of the four gamma densities. The posterior density function is proportional to the product of the likelihood function of the NAPTPLx distribution and the joint prior distribution of these parameters; in the resulting expression, $L_1$ denotes the log-likelihood function.
No closed-form inference is available because the posterior density is complicated. Therefore, we suggest simulating samples from the posterior using MCMC approaches, especially the Gibbs sampling and Metropolis-Hastings (M-H) algorithms, which enable straightforward sample-based conclusions. The M-H algorithm is available in MCMCpack, an R package that contains functions to perform Bayesian analysis. The model was run for 1,000,000 iterations, with a burn-in phase of 10,000 simulated samples and a thinning interval of 200.
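A random-walk M-H sampler with burn-in and thinning can be sketched as follows. This is an illustration, not the MCMCpack routine: an exponential likelihood with a Gamma(a, b) prior in rate form stands in for the NAPTPLx posterior, because its exact posterior mean is known and can be used as a check. The step size, the hyperparameters, and the shortened run length are all assumptions.

```python
import math
import random


def log_posterior(theta, n, total, a, b):
    """Exponential likelihood times Gamma(a, b) prior (rate form), up to a constant."""
    return (a + n - 1.0) * math.log(theta) - (b + total) * theta


def metropolis_hastings(n, total, a, b, iters, burn_in, thin, step, rng):
    phi = 0.0                                    # phi = log(theta) keeps theta > 0
    current = log_posterior(math.exp(phi), n, total, a, b) + phi  # + phi: Jacobian
    draws = []
    for i in range(iters):
        prop = phi + rng.gauss(0.0, step)        # symmetric random-walk proposal
        cand = log_posterior(math.exp(prop), n, total, a, b) + prop
        if math.log(1.0 - rng.random()) < cand - current:   # accept or reject
            phi, current = prop, cand
        if i >= burn_in and (i - burn_in) % thin == 0:      # drop burn-in, then thin
            draws.append(math.exp(phi))
    return draws


rng = random.Random(7)
data = [rng.expovariate(2.0) for _ in range(50)]            # simulated stand-in data
n, total, a, b = len(data), sum(data), 1.0, 1.0
draws = metropolis_hastings(n, total, a, b, iters=60000, burn_in=10000,
                            thin=10, step=0.3, rng=rng)
post_mean = sum(draws) / len(draws)
exact_mean = (a + n) / (b + total)                          # conjugate posterior mean
```

Under the settings reported above, the number of retained draws is 4,950 if the burn-in is counted within the 1,000,000 iterations and 5,000 if it is run in addition to them.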
For all parameters of interest, the highest posterior density intervals (abbreviated as HPD intervals or HDI) with 95% coverage were obtained. Among all Bayesian credible intervals with a given coverage, the HPD interval is the shortest. Convergence was evaluated visually using traceplots of each MCMC chain and quantitatively using the Geweke criterion. Geweke's z-score is the difference between the means of two segments of the chain divided by its estimated standard error, assuming that the two segments are asymptotically independent. This statistic has an asymptotically standard normal distribution if the MCMC samples are drawn from a stationary distribution.
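Both summaries can be computed directly from a vector of posterior draws, as the following sketch shows. It rests on simplifying assumptions: the Geweke standard error uses plain sample variances, which is reasonable only for nearly independent (for example, heavily thinned) draws, whereas standard implementations use spectral density estimates; the 10% and 50% segment fractions are the customary defaults; and the chain is simulated standard normal noise rather than real MCMC output.

```python
import math
import random
import statistics


def geweke_z(chain, first=0.1, last=0.5):
    """Compare the mean of the first 10% of the chain with that of the last 50%."""
    n = len(chain)
    a = chain[: int(first * n)]
    b = chain[n - int(last * n):]
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def hpd_interval(chain, coverage=0.95):
    """Shortest interval that contains the requested share of the sorted draws."""
    s = sorted(chain)
    m = math.ceil(coverage * len(s))             # number of draws inside the interval
    start = min(range(len(s) - m + 1), key=lambda i: s[i + m - 1] - s[i])
    return s[start], s[start + m - 1]


rng = random.Random(11)
chain = [rng.gauss(0.0, 1.0) for _ in range(4000)]   # stand-in for posterior draws
z = geweke_z(chain)
lower, upper = hpd_interval(chain)
```

A z-score well inside the range of a standard normal variable is read as no evidence against stationarity.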

Data analysis
In this section, two data sets from different fields are used to illustrate the applicability of the proposed NAPTPLx distribution. The distributions used for comparison are the Lomax (Lx) distribution, the Power Lomax (PLx) distribution, the Marshall-Olkin Lomax (MOLx) distribution, and the Marshall-Olkin Power Lomax (MOPLx) distribution. The density functions of these models are given in the Appendix. The parameters of all competing distributions are estimated by maximum likelihood. Well-known model selection and goodness-of-fit measures, namely the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and the Anderson-Darling (AD), Cramér-von Mises (CVM), and Kolmogorov-Smirnov (KS) statistics, are used to identify the best-fitting distribution. The model with the maximum log-likelihood value, the largest p-values for the AD, CVM, and KS tests, and the minimum values of the AIC, BIC, and the AD, CVM, and KS statistics is chosen as the best fit for the data.
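Three of the measures listed above can be computed with a few lines of code. The sketch below uses the standard definitions AIC = 2k − 2ℓ and BIC = k log n − 2ℓ, where ℓ is the maximized log-likelihood and k the number of parameters, together with the one-sample KS distance. The data are a hypothetical toy sample, and an exponential fit stands in for the models compared in this section.

```python
import math


def aic(loglik, k):
    """Akaike Information Criterion for k estimated parameters."""
    return 2.0 * k - 2.0 * loglik


def bic(loglik, k, n):
    """Bayesian Information Criterion for k parameters and n observations."""
    return k * math.log(n) - 2.0 * loglik


def ks_statistic(data, cdf):
    """Largest gap between the empirical cdf and the fitted cdf."""
    s = sorted(data)
    n = len(s)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n) for i, x in enumerate(s))


toy = [0.5, 1.0, 1.5, 2.0, 5.0]                  # hypothetical data, not from the paper
rate = len(toy) / sum(toy)                       # exponential MLE of the rate
loglik = len(toy) * math.log(rate) - rate * sum(toy)
fitted_cdf = lambda x: 1.0 - math.exp(-rate * x)

aic_value = aic(loglik, k=1)
bic_value = bic(loglik, k=1, n=len(toy))
ks_value = ks_statistic(toy, fitted_cdf)
```

Smaller values of all three quantities indicate a better fit, which is the rule applied when comparing the candidate distributions.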

Analysis of bladder cancer patients' data
The first data set contains the remission times (in months) of 128 bladder cancer patients. It was originally studied by Ref. [38]. The nature of this data set is assessed using a boxplot. We also draw the Total Time on Test (TTT) plot to identify the failure rate pattern of the data set. The boxplot and TTT plot are given in Fig. 3. The MLEs with standard errors (SE) for each model and the goodness-of-fit measures for the first data set are given in Table 5. Fig. 4 shows the fitted density over the histogram of the empirical data, the empirical cdf (black line), and the estimated cdf (red line). Fig. 4 also shows the fitted survival function and the probability-probability (PP) plot of the NAPTPLx distribution for the bladder cancer patients data set.
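The coordinates of a scaled TTT plot follow from the ordered sample, as sketched below. For ordered observations x(1) ≤ ... ≤ x(n), the plotted points are (i/n, T_i) with T_i = [x(1) + ... + x(i) + (n − i) x(i)] divided by the sample total. The function name and the four-point toy sample are hypothetical; the bladder cancer data are not reproduced here.

```python
def scaled_ttt(data):
    """Return the points (i/n, T_i) of the scaled Total Time on Test transform."""
    s = sorted(data)
    n, total = len(s), sum(s)
    points, running = [], 0.0
    for i, x in enumerate(s, start=1):
        running += x                              # sum of the i smallest observations
        points.append((i / n, (running + (n - i) * x) / total))
    return points


ttt_points = scaled_ttt([4.0, 1.0, 3.0, 2.0])     # toy sample for illustration
```

A curve that is concave and lies above the diagonal suggests an increasing failure rate, a convex curve below the diagonal suggests a decreasing one, and a curve that crosses the diagonal points to a non-monotone pattern.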

Analysis of income tax data
The second data set concerns Egypt's monthly income tax from January 2006 to November 2010. The data observations are: 5.9, 20.4, 14.9, 16. ... The goodness-of-fit measures are given in Table 6. Fig. 6 shows the fitted density over the histogram of the empirical data, the empirical cdf, and the estimated cdf. Fig. 6 also shows the fitted survival function and the PP plot of the NAPTPLx distribution for the income tax data set.
According to Tables 5 and 6, the proposed distribution has lower values of the model selection criteria and goodness-of-fit statistics and is therefore the best fit among the competing distributions. Figs. 4 and 6 show the fitted pdf, cdf, survival function, and PP plots for the bladder cancer and monthly income tax data sets. Overall, the NAPTPLx distribution fits both data sets better than the other distributions.

Analysis via Bayesian method
Under the Bayesian methodology described in Section 5, the prior distributions of α, β, θ, and λ were assumed to be α ∼ Gamma(a_1, b_1), β ∼ Gamma(a_2, b_2), θ ∼ Gamma(a_3, b_3), and λ ∼ Gamma(a_4, b_4). The posterior means, standard errors, Geweke's z-scores, and lower and upper HPD limits are given in Table 7. Traceplots and histograms of the posterior densities are used to evaluate the MCMC iterations. The posterior samples for the parameters are shown in Figs. 7 and 8.

Conclusion
In this paper, we introduce a new four-parameter Lomax distribution called the New Alpha Power Transformed Power Lomax (NAPTPLx) distribution. The new distribution allows for more flexibility in analyzing real-world data sets. Several characteristics of the proposed distribution are studied. The density function of the new model exhibits a variety of shapes, such as exponentially decreasing and unimodal behavior. Some statistical properties are derived. The parameters of the NAPTPLx distribution are estimated.

Fig. 1. Possible shapes of the probability density function of the NAPTPLx distribution.

Fig. 6. Fitted distributions for the income tax data.

Fig. 7. Trace and posterior density plots based on the first data set.

Table 1
First, second, and third quartile values for some selected values of parameters.

Table 5
Estimates (SE) and goodness-of-fit statistics of MLEs: The bladder cancer patients' data.

Table 6
Estimates (SE) and goodness-of-fit statistics of MLEs: The income tax data.

Table 7
Bayesian estimates, SE, HPD, and Geweke's score for both data sets.