Classical and Bayesian Inference for the Kavya–Manoharan Generalized Exponential Distribution under Generalized Progressively Hybrid Censored Data

This manuscript focuses on the statistical inference of the Kavya–Manoharan generalized exponential distribution under the generalized type-I progressive hybrid censoring scheme (GTI-PHCS). Different classical approaches of estimation, such as maximum likelihood, the maximum product of spacings, least squares (LS), weighted LS, and percentiles under GTI-PHCS, are investigated. Based on the squared error and linear exponential loss functions, the Bayes estimates of the unknown parameters utilizing separate gamma priors under GTI-PHCS are derived. Point and interval estimates of the unknown parameters are developed. We carry out a Monte Carlo simulation to show the performance of the inferential procedures. Finally, two real-world data sets are examined for illustration purposes.


Introduction
Censoring schemes (CSs) play a significant role in lifespan and reliability studies. Owing to constraints on the expected experiment time and the accompanying cost, many practical experiments that rely on the lifespan of objects must be completed before all of the items fail. In these situations, only a subset of the items' failure information is recorded, and the data collected are known as censored data.
The most popular censoring methods used in life tests are the type-I and type-II CSs. Ref. [1] proposed a hybrid CS, which is a combination of the type-I and type-II CSs. In many cases, it is planned in advance to remove items prior to failure at several stages of the experiment; however, the above CSs lack the flexibility to allow items to be removed from the experiment at stages other than the trial's endpoint. To address this issue, Ref. [2] proposed the type-II progressive CS (T-IIPCS) as a generalization of the censoring systems described previously.
The T-IIPCS can be described as follows. Assume that a life-test experiment is conducted on a random sample of $n$ items and that the number of failures to be observed, $m\,(<n)$, together with the progressive CS $(R_1, R_2, \ldots, R_m)$, is fixed before the trial starts. At the time of the first failure, $X_{1:n}$, $R_1$ operating items are picked at random and removed from the experiment. At the second failure time, $X_{2:n}$, $R_2$ operating items are randomly chosen and removed from the experiment. The procedure continues until the final failure time $X_{m:n}$ occurs, at which point all remaining operating items $R_m = n - m - \sum_{i=1}^{m-1} R_i$ are removed and the experiment is terminated at $X_{m:n}$.
The T-IIPCS has one big drawback: if the items being studied are reliable and of excellent quality, the experiment time may be very long. This restriction was addressed in Ref. [3] with an improved scheme known as the type-I progressive hybrid CS (TI-PHCS), in which $n$, $m$, and $(R_1, R_2, \ldots, R_m)$, as well as the experimental duration $\tau$, are determined beforehand. In this case, the experiment is terminated at time $X^{*}=\min\{X_{m:n}, \tau\}$. Except for the termination rule, this scheme is identical to the T-IIPCS.
One of the most significant shortcomings of the TI-PHCS is that the effective sample size is random and might be relatively small. As a result, statistical inference techniques may be unreliable or less efficient. A novel variation of progressive censoring, called the generalized TI-PHCS (GTI-PHCS), was introduced in Ref. [4] to eliminate the problems that arise in the TI-PHCS; here, a minimum number of failures is predetermined. Using this censoring method saves time and money throughout a life-test trial, and observing more failures improves the statistical efficiency of the estimates. The CS helps to ensure that at least a fixed number of observed failures $w\,(<m<n)$ is attained, so that the efficiency necessary for statistical evaluation is reached, and it also controls the overall duration of the experiment to be close to $\tau$ if the number of observed failures happens to be very low up until $\tau$. In this case, the experiment is completed at the moment $X^{*}=\max\{X_{w:n}, \min\{X_{m:n}, \tau\}\}$, and any remaining operational items are removed from the experiment.
Numerous researchers have reported different techniques of estimation using the GTI-PHCS. In accordance with the maximum product of spacings (MPS), Ref. [5] studied progressive type-II hybrid censoring. For an exponential (E) model and a Weibull model, Refs. [4,6], respectively, provided exact likelihood inference and an entropy estimation methodology. Salem et al. [7] discussed a joint type-II generalized progressive hybrid censoring scheme based on the exponential distribution. By combining the generalized E (GE) distribution and a simple step-stress accelerated life test with a competing risks model, Ref. [8] investigated the statistical prediction problem of unobserved failure times. In Refs. [9,10], Bayesian and maximum likelihood (ML) estimation strategies for the E and Weibull distributions under competing risks models were examined. When applying partially accelerated life tests to units whose lifetimes are exponentially distributed under normal stress conditions, Ref. [11] explored several point and interval estimates for the parameters involved, as well as the optimal stress change time. Ref. [12] examined competing risk models under GTI-PHCS based on the Chen distribution. Ref. [13] used GTI-PHCS to estimate the Weibull distribution's unknown parameters, reliability, and hazard functions, with an application to real data. Ref. [14] provided the ML and Bayesian estimators of the parameters, reliability, and hazard functions of a GE distribution based on GTI-PHCS data, with an application to the numbers of millions of revolutions before failure for each of 23 ball bearings in a life test.
The GE model has been shown to be beneficial in a wide range of applications involving life testing, survival analysis, and reliability. Ref. [15] investigated this model, which is a special instance of the exponentiated Weibull model [16,17]. The cumulative distribution function (CDF) and probability density function (PDF) of the GE model, with scale parameter $\lambda$ and shape parameter $\theta$, are, for $x>0$,
$$G(x;\theta,\lambda)=\left(1-e^{-\lambda x}\right)^{\theta} \qquad (1)$$
and
$$g(x;\theta,\lambda)=\theta\lambda\, e^{-\lambda x}\left(1-e^{-\lambda x}\right)^{\theta-1}. \qquad (2)$$
Several authors have used the PDF (2) and CDF (1) to generate new extensions of the GE model, such as the beta GE model [18], the Marshall-Olkin GE [19], the half-Cauchy GE [20], the odd Lomax GE [21], and the modified slashed GE [22]. Recently, Ref. [23] introduced the Kavya-Manoharan GE (KMGE) model as a special case of the Kavya-Manoharan exponentiated Weibull model. The KMGE distribution is a new extension that does not require any parameters in addition to those of the baseline distribution, which is a clear advantage. The CDF, PDF, and hazard rate function (HRF) of the KMGE model are
$$F(x;\theta,\lambda)=\frac{e}{e-1}\left[1-\exp\!\left\{-\left(1-e^{-\lambda x}\right)^{\theta}\right\}\right],\quad x>0, \qquad (3)$$
$$f(x;\theta,\lambda)=\frac{e}{e-1}\,\theta\lambda\, e^{-\lambda x}\left(1-e^{-\lambda x}\right)^{\theta-1}\exp\!\left\{-\left(1-e^{-\lambda x}\right)^{\theta}\right\}, \qquad (4)$$
and $h(x;\theta,\lambda)=f(x;\theta,\lambda)/[1-F(x;\theta,\lambda)]$, respectively. The plots of the PDF and HRF are displayed in Figure 1. It can be noticed from this figure that the PDF can be uni-modal, decreasing, and right-skewed, while the HRF can be increasing, decreasing, or constant. These shapes indicate that the KMGE model is very flexible in modeling different types of data; a short numerical sketch of these functions is given after the list of objectives below.
As far as we are aware, there has not been any research that uses the MPS, LS, and weighted LS (WLS) estimation techniques to estimate the parameters of a probability distribution in the presence of GTI-PHC data. Given the novelty of the KMGE distribution, we therefore provide these three important estimation methods in addition to the ML, percentiles (PE), and Bayesian methods. A medical and an engineering data application are then supplied, in accordance with the flexibility of the KMGE distribution (see Figure 1). In this regard, we summarize our study's objectives as follows:
• Discuss the point and interval statistical inference of the two unknown parameters $\theta$ and $\lambda$ of the KMGE distribution using five classical estimation approaches, namely ML, MPS, LS, WLS, and PE, based on GTI-PHCS.
• Estimate the model parameters of the KMGE distribution via the Bayesian estimation strategy using symmetric and asymmetric loss functions.
• Run a simulation study, using specific metrics of accuracy, to examine how the different estimates behave.
• Explore a potential application based on GTI-PHCS for data from the engineering and medical sciences.
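Since the paper's computations are carried out in R, the short R sketch below illustrates the KMGE distribution functions. It assumes the Kavya-Manoharan transform $F(x)=\frac{e}{e-1}\left[1-e^{-G(x)}\right]$ applied to the GE baseline, as in (3) and (4); the function names (pkmge, dkmge, hkmge) and the parameter values are purely illustrative choices.

```r
# Illustrative sketch of the KMGE CDF, PDF, and HRF in (3) and (4); the
# Kavya-Manoharan transform of the GE baseline is assumed, and the function
# names and parameter values are arbitrary choices for this illustration.
pkmge <- function(x, theta, lambda) {
  G <- (1 - exp(-lambda * x))^theta            # GE baseline CDF (1)
  (exp(1) / (exp(1) - 1)) * (1 - exp(-G))      # KM transform, Equation (3)
}
dkmge <- function(x, theta, lambda) {
  g <- theta * lambda * exp(-lambda * x) * (1 - exp(-lambda * x))^(theta - 1)  # GE PDF (2)
  G <- (1 - exp(-lambda * x))^theta
  (exp(1) / (exp(1) - 1)) * g * exp(-G)        # Equation (4)
}
hkmge <- function(x, theta, lambda) {
  dkmge(x, theta, lambda) / (1 - pkmge(x, theta, lambda))  # HRF = f / (1 - F)
}

# Reproduce the kinds of shapes shown in Figure 1 on a small grid
x  <- seq(0.01, 5, length.out = 200)
op <- par(mfrow = c(1, 2))
plot(x, dkmge(x, theta = 2.5, lambda = 1.1), type = "l", ylab = "PDF")
plot(x, hkmge(x, theta = 0.5, lambda = 1.1), type = "l", ylab = "HRF")
par(op)
```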
The rest of this paper is structured as follows: the model formulation of the GTI-PHCS is presented in Section 2. Five classical estimation approaches, namely ML, MPS, LS, WLS, and PE, are investigated in Section 3. Bayesian estimation with credible intervals is discussed in Section 4. In Section 5, we evaluate the quality of the point and interval estimators using a Monte Carlo approach. In Section 6, we apply the theoretical findings to real-world data. Finally, the discussion and conclusion are presented in Section 7.

Generalized Type-I Progressive Hybrid Censoring
The implementation steps of the GTI-PHCS are described below:
1. Assume that a random sample of $n$ units undergoes a lifetime testing trial.
2. The minimum acceptable number of failures $w$, the desired number of failures $m$ with $w < m < n$, the threshold time $\tau$, and the progressive CS $(R_1, R_2, \ldots, R_m)$ with $n = m + \sum_{i=1}^{m} R_i$ are all fixed before the trial starts.
3. The life test starts and the successive failure times are recorded.
4. At the first failure time $X_{1:n}$, $R_1$ operational units are chosen at random and removed from the experiment. At the subsequent failure time $X_{2:n}$, $R_2$ operating units are randomly selected and removed from the experiment, and the procedure is repeated. Eventually, the experiment is completed at $X^{*}=\max\{X_{w:n}, \min\{X_{m:n}, \tau\}\}$, and any remaining operational units $R_C$ are removed from the experiment. Table 1 contains the values of the final censoring number $R_C$.
5. Let $B$ denote the number of units that fail prior to $\tau$. The experiment's end time $X^{*}$ is, therefore, given by
$$X^{*}=\begin{cases} X_{w:n}, & \text{if } \tau < X_{w:n} < X_{m:n},\\[2pt] \tau, & \text{if } X_{w:n} \le \tau < X_{m:n},\\[2pt] X_{m:n}, & \text{if } X_{w:n} < X_{m:n} \le \tau. \end{cases}$$
Any one of the following three cases may be observed:
• Case I: The time $\tau$ occurs before the $w$th failure time $X_{w:n}$ and $B\,(<w)$ failures occur up to time $\tau$, that is, $\tau < X_{w:n} < X_{m:n}$. In this case, no operating units are removed at the $(B+1)$th, $\ldots$, $(w-1)$th failure times, i.e., $R_{B+1}=\cdots=R_{w-1}=0$; at the $w$th failure time, all of the remaining operating units $R^{*}_{w}=n-w-\sum_{i=1}^{w-1}R_i$ are removed, and the experiment stops at $X^{*}=X_{w:n}$; see Figure 2. Here, the experiment is allowed to continue after the experimental time $\tau$ is reached to guarantee that at least the $w$th failure time $X_{w:n}$ is observed. The observations in this case are $X_{1:n}<\cdots<X_{B:n}\le\tau<X_{B+1:n}<\cdots<X_{w:n}$.
• Case II: The $w$th failure time $X_{w:n}$ occurs before $\tau$, that is, $X_{w:n}\le\tau<X_{m:n}$, and $B\,(\ge w)$ failures occur up to time $\tau$. The experiment is terminated at $X^{*}=\tau$ by removing all of the remaining operational units $R^{*}_{\tau}=n-B-\sum_{i=1}^{B}R_i$; see Figure 3. The observations in this case are $X_{1:n}<\cdots<X_{w:n}<\cdots<X_{B:n}\le\tau$.
• Case III: The $m$th failure time $X_{m:n}$ occurs before the time $\tau$, that is, $X_{w:n}<X_{m:n}\le\tau$; then all the remaining operational units $R_m=n-m-\sum_{i=1}^{m-1}R_i$ are removed from the experiment, terminating it at $X^{*}=X_{m:n}$, as shown in Figure 4. The observations in this case are $X_{1:n}<\cdots<X_{w:n}<\cdots<X_{m:n}\le\tau$.
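As a quick illustration of the scheme, the following R sketch draws one GTI-PHCS sample from the KMGE model. It uses the Balakrishnan-Sandhu uniform-spacings algorithm for the underlying progressive type-II sample and the quantile function implied by (3); the function names are hypothetical, and, as a simplification, Case I is obtained by truncating a sample generated with the prefixed scheme rather than regenerating the tail with $R_{B+1}=\cdots=R_{w-1}=0$.

```r
# Hypothetical sketch: draw one GTI-PHCS sample from the KMGE model.
qkmge <- function(u, theta, lambda) {            # quantile function implied by (3)
  G <- -log(1 - u * (exp(1) - 1) / exp(1))
  -log(1 - G^(1 / theta)) / lambda
}

rgtiphcs_kmge <- function(n, m, w, tau, R, theta, lambda) {
  stopifnot(length(R) == m, n == m + sum(R), w < m)
  # Balakrishnan-Sandhu algorithm for a progressive type-II censored sample
  W <- runif(m)
  V <- W^(1 / ((1:m) + cumsum(rev(R))))          # V_i = W_i^{1/(i + R_m + ... + R_{m-i+1})}
  U <- 1 - cumprod(rev(V))                       # uniform progressive order statistics
  X <- qkmge(U, theta, lambda)                   # KMGE progressive order statistics
  # GTI-PHCS stopping rule X* = max{X_w, min{X_m, tau}}
  if (tau < X[w]) {                              # Case I (simplified, see note above)
    list(x = X[1:w], K = w, Xstar = X[w], case = "I")
  } else if (tau < X[m]) {                       # Case II
    B <- sum(X <= tau)
    list(x = X[1:B], K = B, Xstar = tau, case = "II")
  } else {                                       # Case III
    list(x = X[1:m], K = m, Xstar = X[m], case = "III")
  }
}

set.seed(1)
rgtiphcs_kmge(n = 40, m = 20, w = 8, tau = 1.5,
              R = rep(1, 20), theta = 2.5, lambda = 1.1)
```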

Different Classical Approaches of Estimation
This section discusses five classical methods, namely ML, MPS, LS, WLS, and PE, for estimating the underlying parameters $\theta$ and $\lambda$ using data gathered via GTI-PHCS.

Approach of ML Estimation
The likelihood function based on GTI-PHCS is given by
$$L(\theta,\lambda\mid \mathbf{x}) \propto \prod_{j=1}^{K} f(x_{j:n};\theta,\lambda)\,\big[1-F(x_{j:n};\theta,\lambda)\big]^{R_j}\,\big[1-F(X^{*};\theta,\lambda)\big]^{R_C}, \qquad (5)$$
where $\mathbf{x}=(x_{1:n},\ldots,x_{K:n})$, $R^{*}_{\tau}=n-K-\sum_{i=1}^{K}R_i$, and the final censored number $R_C$, the experiment end time $X^{*}$, and the number of observed failures $K$ for the three cases are reported in Table 1. It is interesting to note that, in Case I, several values of $R_i$, $i=1,2,\ldots,m$, may differ during the test from those set before the test even starts.
Utilizing the CDF (3) and PDF (4) in (5), the log-likelihood function takes the form
$$\ell(\theta,\lambda)= K\ln\!\frac{e}{e-1} + K\ln(\theta\lambda) - \lambda\sum_{j=1}^{K}x_j + (\theta-1)\sum_{j=1}^{K}\ln\!\left(1-e^{-\lambda x_j}\right) - \sum_{j=1}^{K}\left(1-e^{-\lambda x_j}\right)^{\theta} + \sum_{j=1}^{K}R_j\ln\!\big[1-F(x_j;\theta,\lambda)\big] + R_C\ln\!\big[1-F(X^{*};\theta,\lambda)\big], \qquad (6)$$
where, for simplicity, we write $x_j$ instead of $x_{j:n}$. According to Equation (6), the MLEs $(\hat{\theta},\hat{\lambda})$ of $(\theta,\lambda)$ are determined by taking the first partial derivatives of (6) with respect to $\theta$ and $\lambda$, equating them to zero, and solving the score equations $\partial\ell/\partial\theta=0$ and $\partial\ell/\partial\lambda=0$ simultaneously. Because the roots cannot be obtained analytically, these equations are investigated numerically utilizing iterative procedures, for example via the "maxLik" package in the R 4.3.0 programming language.
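As a concrete illustration, the following R sketch maximizes the GTI-PHCS log-likelihood numerically. Base R's optim() is used here instead of the "maxLik" package mentioned in the text; dkmge() and pkmge() are the KMGE functions sketched in the Introduction, and the inputs x, R, Rc, and Xstar denote the observed failure times, the removals applied at those failures, the final censoring number, and the experiment end time, respectively (illustrative names).

```r
# Hedged sketch of ML estimation under GTI-PHCS by direct maximization of (6).
negloglik_kmge <- function(par, x, R, Rc, Xstar) {
  theta <- par[1]; lambda <- par[2]
  if (theta <= 0 || lambda <= 0) return(1e10)          # keep the parameters positive
  ll <- sum(log(dkmge(x, theta, lambda))) +
        sum(R * log(1 - pkmge(x, theta, lambda))) +
        Rc * log(1 - pkmge(Xstar, theta, lambda))
  if (!is.finite(ll)) return(1e10)
  -ll
}

fit_mle <- function(x, R, Rc, Xstar, start = c(1, 1)) {
  optim(start, negloglik_kmge, x = x, R = R, Rc = Rc, Xstar = Xstar,
        method = "Nelder-Mead", hessian = TRUE)        # hessian feeds the intervals below
}
```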
According to the usual asymptotic normality theory of MLEs, the distribution of $(\hat{\theta},\hat{\lambda})$ can be approximated by a bivariate normal distribution with mean $(\theta,\lambda)$ and variance-covariance matrix $I^{-1}(\hat{\theta},\hat{\lambda})$, where $V(\hat{\theta})$ and $V(\hat{\lambda})$, the variances of $\hat{\theta}$ and $\hat{\lambda}$, are the diagonal elements of the inverse of the observed Fisher information matrix
$$I(\hat{\theta},\hat{\lambda})=\begin{pmatrix} -\dfrac{\partial^{2}\ell}{\partial\theta^{2}} & -\dfrac{\partial^{2}\ell}{\partial\theta\,\partial\lambda}\\[8pt] -\dfrac{\partial^{2}\ell}{\partial\lambda\,\partial\theta} & -\dfrac{\partial^{2}\ell}{\partial\lambda^{2}} \end{pmatrix}_{(\theta,\lambda)=(\hat{\theta},\hat{\lambda})},$$
where the caret $\hat{\;}$ denotes that the derivatives are evaluated at $(\hat{\theta},\hat{\lambda})$. The second partial derivatives of the natural logarithm of the likelihood function with respect to $\theta$ and $\lambda$ are simple to obtain. The approximate $100(1-\gamma)\%$ normal approximation confidence intervals (NACIs) of $\theta$ and $\lambda$ are then $\hat{\theta}\pm z_{\gamma/2}\sqrt{V(\hat{\theta})}$ and $\hat{\lambda}\pm z_{\gamma/2}\sqrt{V(\hat{\lambda})}$, where $z_{\gamma/2}$ is the upper $\gamma/2$ quantile of the standard normal distribution.
The lower bound of the NACI of a positive parameter can occasionally be negative. For this situation, Ref. [24] recommended using a log-transformed confidence interval (LTCI). Based on the approximate normality of the log-transformed MLE $\ln\hat{\vartheta}_i$, $i=1,2$ (with $\vartheta_1=\theta$ and $\vartheta_2=\lambda$), the $100(1-\gamma)\%$ LTCI is given by
$$\left[\hat{\vartheta}_i\exp\!\left\{-\frac{z_{\gamma/2}\sqrt{V(\hat{\vartheta}_i)}}{\hat{\vartheta}_i}\right\},\ \hat{\vartheta}_i\exp\!\left\{\frac{z_{\gamma/2}\sqrt{V(\hat{\vartheta}_i)}}{\hat{\vartheta}_i}\right\}\right].$$
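A small R sketch of both intervals, assuming the fit_mle() output above (whose hessian is the observed information of the negative log-likelihood), may look as follows.

```r
# 95% NACI and LTCI from a fit returned by fit_mle(); formulas as in the text.
ci_from_fit <- function(fit, level = 0.95) {
  z    <- qnorm(1 - (1 - level) / 2)
  est  <- fit$par
  se   <- sqrt(diag(solve(fit$hessian)))      # standard errors from the observed information
  naci <- cbind(lower = est - z * se, upper = est + z * se)
  ltci <- cbind(lower = est * exp(-z * se / est),
                upper = est * exp( z * se / est))
  list(NACI = naci, LTCI = ltci)
}
```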

Approach of Maximum Product of Spacing Estimation
Ref. [25] proposed the product of spacings as an alternative to the ML method for estimating unknown parameters of continuous distributions. Refs. [26,27] utilized progressive type-II censoring to estimate the parameters involved in the Weibull and Kavya-Manoharan inverse length-biased exponential distributions. The MPS estimates (MPSEs) of $\theta$ and $\lambda$ are obtained by maximizing the product of spacings
$$D(\theta,\lambda)=\prod_{j=1}^{K+1}\big[F(x_j;\theta,\lambda)-F(x_{j-1};\theta,\lambda)\big]\prod_{j=1}^{K}\big[1-F(x_j;\theta,\lambda)\big]^{R_j}\,\big[1-F(X^{*};\theta,\lambda)\big]^{R_C},$$
where $F(x_0;\theta,\lambda)=0$ and $F(x_{K+1};\theta,\lambda)=1$, and the quantities associated with the GTI-PHCS model are given in Table 1. Utilizing (3) and maximizing the product of spacings with respect to $\theta$ and $\lambda$, the MPSEs $\hat{\theta}$ and $\hat{\lambda}$ of $\theta$ and $\lambda$ are obtained, where $a^{*}=\frac{1}{e-1}\left(\frac{e}{e-1}\right)^{K}$.
To obtain the MPSEs, the corresponding nonlinear equations can also be solved simultaneously. Since an exact solution for the roots is not available, these equations are resolved numerically using iterative strategies and statistical software; the same forms are obtained for $j=1$ or $K$.
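A hedged R sketch of the spacings criterion follows. The exact spacing function used by the authors (including their constant $a^{*}$ and the Table 1 bookkeeping) is not reproduced here, so the censored-spacings variant below, which augments the adjacent spacings with the usual censoring contributions, should be read as one common choice rather than the paper's exact objective; it reuses pkmge() from the Introduction sketch.

```r
# Sketch of a product-of-spacings objective under GTI-PHCS (one common variant).
neg_logps_kmge <- function(par, x, R, Rc, Xstar) {
  theta <- par[1]; lambda <- par[2]
  if (theta <= 0 || lambda <= 0) return(1e10)
  Fx <- pkmge(x, theta, lambda)
  sp <- diff(c(0, Fx, 1))                       # spacings with F(x_0) = 0, F(x_{K+1}) = 1
  val <- sum(log(pmax(sp, .Machine$double.eps))) +
         sum(R * log(1 - Fx)) +
         Rc * log(1 - pkmge(Xstar, theta, lambda))
  if (!is.finite(val)) return(1e10)
  -val
}
# mpse <- optim(c(1, 1), neg_logps_kmge, x = x, R = R, Rc = Rc, Xstar = Xstar)$par
```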

Approaches of LS and WLS
Ref. [28] established the LS and WLS techniques for estimating the parameters of the beta distribution. Refs. [29,30] employed progressive type-II censoring to estimate the parameters contained in the doubly Poisson-exponential and exponential-doubly Poisson distributions. To estimate the parameters in the Poisson-logarithmic half-logistic distribution under a progressive-stress accelerated life test, Ref. [31] proposed adaptive type-II progressive hybrid censoring.
Let $(X_1,\ldots,X_K)$ be the ordered GTI-PHCS sample of size $K$ from the KMGE model. The LS estimates (LSEs) of $\theta$ and $\lambda$ are derived by minimizing
$$S(\theta,\lambda)=\sum_{j=1}^{K}\Big[F(x_j;\theta,\lambda)-E\big\{\hat{F}(x_j)\big\}\Big]^{2}$$
with respect to $\theta$ and $\lambda$, in which $E\{\hat{F}(x_j)\}$ denotes the expectation of the empirical CDF, as supplied in [32]. The LSEs $\hat{\theta}$ and $\hat{\lambda}$ of $\theta$ and $\lambda$ can equivalently be obtained by simultaneously solving the nonlinear equations $\partial S/\partial\theta=0$ and $\partial S/\partial\lambda=0$, where $A(x_j)$ and $B(x_j)$ are provided in (10) and (11), respectively. Since precise solutions for the roots are not available, these equations are resolved numerically using iterative methods and statistical tools.
The WLS estimates (WLSEs) of $\theta$ and $\lambda$ may be generated by minimizing
$$W(\theta,\lambda)=\sum_{j=1}^{K}w_j\Big[F(x_j;\theta,\lambda)-E\big\{\hat{F}(x_j)\big\}\Big]^{2},$$
where $w_j=1/V[\hat{F}(x_j)]$ is the weight factor and $V[\hat{F}(x_j)]$ is the variance of the empirical CDF; see [32]. Minimizing this quantity with respect to $\theta$ and $\lambda$, we obtain the WLSEs $\hat{\theta}$ and $\hat{\lambda}$.
To produce the WLSEs, the corresponding nonlinear equations can be solved numerically via iterative methods and statistical tools, where $A(x_j)$ and $B(x_j)$ are given by (10) and (11), respectively.
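For illustration, the R sketch below implements rough LS and WLS objectives. Under GTI-PHCS, the expectations and variances of the empirical CDF should come from the progressive-censoring results cited in the text ([32]); the plain plotting positions $j/(K+1)$ and the complete-sample variances of uniform order statistics are used here only as stand-ins, and pkmge() is reused from the Introduction sketch.

```r
# Rough LS / WLS objectives (plotting positions used in place of the exact
# GTI-PHCS expectations and variances from [32]).
ls_obj <- function(par, x) {
  theta <- par[1]; lambda <- par[2]
  if (theta <= 0 || lambda <= 0) return(1e10)
  K <- length(x); p <- (1:K) / (K + 1)
  sum((pkmge(x, theta, lambda) - p)^2)
}

wls_obj <- function(par, x) {
  theta <- par[1]; lambda <- par[2]
  if (theta <= 0 || lambda <= 0) return(1e10)
  K  <- length(x); j <- 1:K; p <- j / (K + 1)
  wt <- (K + 1)^2 * (K + 2) / (j * (K - j + 1))   # 1 / Var of uniform order statistics
  sum(wt * (pkmge(x, theta, lambda) - p)^2)
}
# lse  <- optim(c(1, 1), ls_obj,  x = x)$par
# wlse <- optim(c(1, 1), wls_obj, x = x)$par
```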

Approach of Percentiles Estimation
Ref. [33] suggested a percentile approach for estimating the parameters of a distribution. If the data come from a distribution with a closed-form CDF, it is natural to estimate the unknown parameters by fitting a straight line between the theoretical points generated by the CDF and the percentile points of the sample. In this method, the empirical CDF provides estimated percentile points $p_j$, $j=1,\ldots,K$, where $K$ is defined as in Table 1. Based on GTI-PHCS, the PEs $\hat{\theta}$ and $\hat{\lambda}$ of $\theta$ and $\lambda$ are obtained by minimizing
$$P(\theta,\lambda)=\sum_{j=1}^{K}\Big[x_j-F^{-1}(p_j;\theta,\lambda)\Big]^{2}$$
with respect to $\theta$ and $\lambda$, where $F^{-1}(\cdot\,;\theta,\lambda)$ denotes the quantile function of the KMGE distribution. These estimates can also be achieved by concurrently solving the corresponding nonlinear equations. Because an exact solution for the roots is not available, these equations are investigated numerically by employing iterative procedures in statistical software.
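The following R sketch shows a percentile-type objective. The exact percentile points $p_j$ under GTI-PHCS follow the definition tied to Table 1 in the text, so the plain positions $j/(K+1)$ are used here purely as placeholders; qkmge() is the quantile function sketched in Section 2.

```r
# Illustrative percentile-method objective (placeholder percentile points).
pe_obj <- function(par, x) {
  theta <- par[1]; lambda <- par[2]
  if (theta <= 0 || lambda <= 0) return(1e10)
  K <- length(x); p <- (1:K) / (K + 1)
  sum((x - qkmge(p, theta, lambda))^2)
}
# pe <- optim(c(1, 1), pe_obj, x = x)$par
```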

Bayesian Estimation
Here, the squared error loss (SEL) function and the linear exponential loss (LINEXL) function are used to derive the Bayes estimators of $\theta$ and $\lambda$. To this end, we assume that the KMGE model parameters $\theta$ and $\lambda$ have independent gamma priors of the forms $G(o_1, s_1)$ and $G(o_2, s_2)$, respectively. Gamma priors are attractive for several reasons: they are (a) adjustable, (b) able to accommodate diverse shapes depending on the parameter values, and (c) fairly simple and concise, so they do not lead to an unduly challenging estimation problem. The joint prior density of the KMGE parameters $\theta$ and $\lambda$ is given by
$$\pi(\theta,\lambda)\propto\theta^{o_1-1}\lambda^{o_2-1}\exp\{-(s_1\theta+s_2\lambda)\},\quad \theta>0,\ \lambda>0, \qquad (14)$$
where the hyper-parameters $o_1$, $o_2$, $s_1$, and $s_2$ encode the prior information. Many authors have constructed Bayesian estimates for their models utilizing informative gamma priors, including Refs. [34,35] and Ref. [36].
The informative priors are used to elicit the hyper-parameters as follows. The means and variances of the ML estimates of the KMGE parameters are computed, and the means and variances of the gamma priors are equated to them, i.e.,
$$\frac{o_1}{s_1}=\hat{\theta},\quad \frac{o_1}{s_1^{2}}=V(\hat{\theta}),\qquad \frac{o_2}{s_2}=\hat{\lambda},\quad \frac{o_2}{s_2^{2}}=V(\hat{\lambda}).$$
After resolving the two pairs of equations above, the elicited hyper-parameters can be expressed as
$$o_1=\frac{\hat{\theta}^{2}}{V(\hat{\theta})},\quad s_1=\frac{\hat{\theta}}{V(\hat{\theta})},\qquad o_2=\frac{\hat{\lambda}^{2}}{V(\hat{\lambda})},\quad s_2=\frac{\hat{\lambda}}{V(\hat{\lambda})}.$$
The likelihood function (5) and the joint prior (14) may be combined to obtain the joint posterior density, say $\Pi^{*}(\theta,\lambda\mid\mathbf{x})$, which can be written in the final form
$$\Pi^{*}(\theta,\lambda\mid\mathbf{x})=\frac{\pi(\theta,\lambda)\,L(\theta,\lambda\mid\mathbf{x})}{\int_{0}^{\infty}\int_{0}^{\infty}\pi(\theta,\lambda)\,L(\theta,\lambda\mid\mathbf{x})\,d\theta\,d\lambda}. \qquad (16)$$
The SEL function is considered in a Bayesian analysis for several reasons: besides being the standard symmetric loss and being straightforward and easy to interpret, it treats overestimation and underestimation equally and leads directly to the posterior mean as the Bayes estimator. Under the SEL function, the Bayes estimate of any function $g(\theta,\lambda)$ is the posterior expectation with respect to (16),
$$\hat{g}_{SEL}=E\big[g(\theta,\lambda)\mid\mathbf{x}\big].$$
The most widely used asymmetric loss function is the LINEXL function; in many ways, an asymmetric loss function is thought to be more comprehensive, as argued by Varian [37]. The Bayes estimate (BE) of any function $g(\theta,\lambda)$ under the LINEXL function with shape constant $c\neq 0$ is
$$\hat{g}_{LINEX}=-\frac{1}{c}\ln\!\Big(E\big[e^{-c\,g(\theta,\lambda)}\mid\mathbf{x}\big]\Big).$$
It is clear from (16) that the marginal posterior densities of $\theta$ and $\lambda$ cannot be expressed explicitly. We therefore propose utilizing Bayesian Markov chain Monte Carlo (MCMC) methods to generate samples from (16). The conditional posterior density functions of $\theta$ and $\lambda$ are obtained from (16) as
$$\Pi^{*}(\theta\mid\lambda,\mathbf{x})\propto\theta^{o_1-1}e^{-s_1\theta}\,L(\theta,\lambda\mid\mathbf{x}) \qquad (19)$$
and
$$\Pi^{*}(\lambda\mid\theta,\mathbf{x})\propto\lambda^{o_2-1}e^{-s_2\lambda}\,L(\theta,\lambda\mid\mathbf{x}). \qquad (20)$$
As shown in (19) and (20), these conditional posteriors cannot be analytically reduced to any known distribution. As a result, the Metropolis-Hastings (M-H) method is the most suitable approach for this problem; for further information, see Refs. [38-40]. The sampling scheme of the M-H algorithm based on a normal proposal distribution proceeds as follows (a code sketch of the sampler is given below): starting from the MLEs $\hat{\theta}$ and $\hat{\lambda}$, candidate values $\theta'$ and $\lambda'$ are generated from normal proposal distributions centered at the current values; the acceptance probabilities $A_{\theta}$ and $A_{\lambda}$ are computed from the conditional posteriors (19) and (20); two uniform random numbers $u_1$ and $u_2$ are drawn, and if $u_1$ and $u_2$ are less than $A_{\theta}$ and $A_{\lambda}$, respectively, the candidates are accepted, otherwise the current values are retained. These steps are repeated $H$ times to obtain $\theta^{(i)}$ and $\lambda^{(i)}$ for $i = 1, 2, \ldots, H$.
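A minimal R sketch of such a sampler is given below. It assumes the gamma priors $G(o_1,s_1)$ and $G(o_2,s_2)$ are parameterized by shape and rate, reuses negloglik_kmge() from the ML section, and, for brevity, updates $(\theta,\lambda)$ jointly with a normal random-walk proposal rather than component-wise; the proposal standard deviations, chain length, and burn-in are arbitrary illustrative choices.

```r
# Metropolis-Hastings sketch for the posterior (16) under independent gamma priors.
mh_kmge <- function(x, R, Rc, Xstar, o1, s1, o2, s2,
                    H = 11000, burn = 1000, start = c(1, 1), sd_prop = c(0.1, 0.1)) {
  logpost <- function(par) {
    if (any(par <= 0)) return(-Inf)
    -negloglik_kmge(par, x, R, Rc, Xstar) +
      dgamma(par[1], shape = o1, rate = s1, log = TRUE) +
      dgamma(par[2], shape = o2, rate = s2, log = TRUE)
  }
  draws <- matrix(NA_real_, H, 2, dimnames = list(NULL, c("theta", "lambda")))
  cur <- start; lp_cur <- logpost(cur)
  for (i in 1:H) {
    prop    <- rnorm(2, mean = cur, sd = sd_prop)     # random-walk proposal
    lp_prop <- logpost(prop)
    if (log(runif(1)) < lp_prop - lp_cur) { cur <- prop; lp_cur <- lp_prop }
    draws[i, ] <- cur
  }
  draws[-(1:burn), ]
}
# Bayes estimates under the SEL (posterior mean) and LINEXL (with constant c0) losses:
# post  <- mh_kmge(x, R, Rc, Xstar, o1, s1, o2, s2)
# sel   <- colMeans(post)
# c0    <- 0.5
# linex <- -log(colMeans(exp(-c0 * post))) / c0
```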

Results of Simulation
Because the relative efficiency of the estimation methods presented in the earlier sections is difficult to assess theoretically, a Monte Carlo simulation is employed to evaluate their performance. The procedure is as follows (a code skeleton of the main loop is given after the list):
1. Specify the true values of the parameters $(\theta, \lambda)$.
2. Specify the values of $n$, $m$, $K$, $\tau$, and the progressive censoring scheme $(R_1, \ldots, R_m)$.
3. Generate a random sample of size $n$ from the KMGE distribution.
4. As described in Section 2, apply GTI-PHCS to the random sample produced in Step 3.
5. Compute the point estimates of $\theta$ and $\lambda$ using the ML, MPS, LS, WLS, PE, and Bayesian methods, together with the 95% NACIs and LTCIs.
6. Repeat Steps 3-5 $M$ times.
7. Determine the average of the estimates, the mean squared error (MSEr), and the relative bias (RB) of $\hat{\eta}$ across the $M$ samples, where $\hat{\eta}$ is an estimate of $\eta$.
8. Determine the mean of the different estimates together with their MSErs and RBs by repeating Step 7 for each estimation method.
9. Compute the average of the RBs (ARB) and of the MSErs (AMSEr).
10. Calculate the average lengths (ALs) and coverage probabilities (COVPs) of the 95% NACIs and LTCIs of the parameters $(\theta, \lambda)$, and calculate also the average of the ALs (AAL).
The sample generation uses four censoring schemes, CS.1, CS.2, CS.3, and CS.4. The calculations were performed using the true parameter values $\theta = 2.5$ and $\lambda = 1.1$. Moreover, the values $n = 40, 80$, $r_m = m/n = 50\%, 75\%, 100\%$ of the sample size (so that $m = n r_m$), $r_K = K/n = 40\%$ of the sample size, and $\tau = 2.5, 5.0$ are used in the simulation analysis via the R 4.3.0 programming software, using the "maxLik" package for the MLEs and the "coda" package for the Bayes point estimates.
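The skeleton below sketches the core of such a Monte Carlo loop in R. It reuses rgtiphcs_kmge() from Section 2, takes any estimator function that maps a simulated sample to $(\hat{\theta},\hat{\lambda})$, and uses one common definition of the relative bias (the paper's exact formula is not reproduced); all names are illustrative.

```r
# Monte Carlo skeleton: average estimates, MSEr, and RB over M replications.
simulate_metrics <- function(M, n, m, w, tau, R, theta, lambda, estimator) {
  est <- t(replicate(M, {
    s <- rgtiphcs_kmge(n, m, w, tau, R, theta, lambda)
    estimator(s)                                  # returns c(theta_hat, lambda_hat)
  }))
  true <- c(theta, lambda)
  list(mean = colMeans(est),
       MSEr = colMeans(sweep(est, 2, true)^2),             # mean squared error per parameter
       RB   = colMeans(abs(sweep(est, 2, true))) / true)   # one common relative-bias definition
}
```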
The following points may be detected based on the computational results contained in Tables 2-11:
1. The MPSEs are the best estimates in terms of the AMSErs and ARBs.
2. The MLEs are comparable to the LSEs, WLSEs, and PEs in terms of the ARBs and AMSErs.
3. The WLSEs are comparable to the LSEs and PEs in terms of the ARBs and AMSErs.
4. The LSEs are comparable to the PEs in terms of the ARBs and AMSErs.
5. The NACIs are comparable to the LTCIs in terms of the AALs.
6. For similar values of m and τ, as n rises, the RBs, MSErs, ARBs, AMSErs, ALs, and AAL decrease.
7. For the same values of n and τ, as m increases, the RBs, MSErs, ARBs, AMSErs, ALs, and AAL decrease.
8. For similar values of n and m, as τ rises, the RBs, MSErs, ARBs, and AMSErs decrease for the MPSEs, MLEs, LSEs, and WLSEs, while they increase for the PEs.
9. As τ increases, for the same values of n and m, the ALs and AAL decrease.

Applications
The significance and applicability of the suggested KMGE model are illustrated using two real data sets from engineering and medical science. We use the "maxLik" package in R to compute the likelihood estimates via the Newton-Raphson (NR) algorithm; for further information, see [41].
The first data set contains the waiting times (in minutes) of 100 bank customers before receiving service; it was first employed by [42]. For data I: from Table 12, the MLEs (with their standard errors (SEs)) of θ and λ were obtained, and the Kolmogorov-Smirnov (KS) statistic (p-value) was 0.0366 (0.9993). Figure 5 illustrates data I graphically: the estimated and empirical CDFs of the KMGE model on the left, the estimated KMGE density over the histogram in the center, and the quantile-quantile (Q-Q) plot of the KMGE model on the right for the waiting times before receiving the banking service. From the results in Table 12, we can confirm that data I fit the KMGE distribution well.
The second data set contains the number of hours (in thousands) between failures of secondary reactor pumps. For data II: from Table 12, the MLEs (with their SEs) of θ and λ were obtained, and the KS statistic (p-value) was 0.1198 (0.8579). This means that the KMGE lifetime model fits the secondary reactor pump data well. Figure 6 illustrates the estimated and empirical CDFs of the KMGE model on the left, the estimated KMGE density over the histogram in the center, and the Q-Q plot of the KMGE model on the right for these data. From the results in Table 12, we can confirm that data II fit the KMGE distribution well.
For the GTI-PHCS of data I: different GTI-PHCS samples (where m = 80, K = 70, and τ = 12) based on various CS selections were obtained from the waiting-time data and are shown in Table 13 to evaluate our acquired estimators. The MLE and Bayesian estimates based on GTI-PHCS for data I are reported in Table 14. For the GTI-PHCS of data II: different GTI-PHCS samples based on various CS selections were obtained from the secondary reactor pump data and are shown in Table 15, and the corresponding MLE and Bayesian estimates are reported in Table 16. Non-informative priors were adopted for the Bayesian results in Tables 14 and 16 because there was no prior knowledge about the unknown KMGE parameters θ and λ from the available data sets. Figures 9 and 10 show the trace plots of each generated sample to illustrate the convergence of the MCMC iterations. Figures 11 and 12 display the MCMC posterior densities and scatter plots of the parameters based on CS.4 for data I and data II, respectively.
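A short R sketch of the complete-data fitting and goodness-of-fit check reported in Table 12 is given below; `waiting` is a hypothetical placeholder for the vector of 100 waiting times of data I, and dkmge()/pkmge() are the functions sketched in the Introduction.

```r
# Fit the KMGE model to a complete data set and run a Kolmogorov-Smirnov check.
fit_complete <- function(y, start = c(1, 1)) {
  nll <- function(par) {
    if (any(par <= 0)) return(1e10)
    -sum(log(dkmge(y, par[1], par[2])))
  }
  fit <- optim(start, nll, hessian = TRUE)
  se  <- sqrt(diag(solve(fit$hessian)))                     # standard errors of the MLEs
  ks  <- ks.test(y, function(q) pkmge(q, fit$par[1], fit$par[2]))
  list(estimate = fit$par, se = se, ks = ks)
}
# fit_complete(waiting)   # 'waiting' = data I (hypothetical variable name)
```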

Concluding Remarks
In this paper, we considered the problems of parameter estimation for the KMGE distribution under generalized type-I progressively hybrid censored samples. For point estimation, five classical approaches, namely ML, MPS, LS, WLS, and PE, were discussed, and the Bayesian approach was also studied. For interval estimation, we used the normal approximation confidence interval and the normal approximation of the log-transformed MLE. The five classical estimation methods were then compared in terms of RB, MSEr, ARB, AMSEr, and the AL of the CIs via Monte Carlo simulations. The MPS method shows better performance than the other four classical estimation methods in most of the considered cases. Bayesian estimation of the unknown parameters was presented under informative priors using two different loss functions. The results of the illustrative examples show that the proposed ML and Bayesian methods work well. In summary, the estimation methods based on the ML, MPS, LS, WLS, PE, and Bayesian approaches contribute to more accurate parameter estimation for the KMGE distribution. These methods have practical utility in fields such as finance, insurance, engineering, healthcare, environmental modeling, and the social sciences. By utilizing these estimation methods, practitioners can obtain reliable parameter estimates, leading to improved decision-making and predictive modeling in real-world applications. Future studies will take into account the estimation problem based on a broad framework called unified hybrid censoring.