Theoretical framework and inference for fitting extreme data through the modified Weibull distribution in a first-failure censored progressive approach

The importance of biomedical physical data is underscored by its crucial role in advancing our comprehension of human health, unraveling the mechanisms underlying diseases, and facilitating the development of innovative medical treatments and interventions. This data serves as a fundamental resource, empowering researchers, healthcare professionals, and scientists to make informed decisions, pioneer research, and ultimately enhance global healthcare quality and individual well-being. It forms a cornerstone in the ongoing pursuit of medical progress and improved healthcare outcomes. This article aims to tackle challenges in estimating unknown parameters and reliability measures related to the modified Weibull distribution when applied to censored progressive biomedical data from the initial failure occurrence. In this context, the article proposes both classical and Bayesian techniques to derive estimates for unknown parameters, survival, and failure rate functions. Bayesian estimates are computed considering both asymmetric and symmetric loss functions. The Markov chain Monte Carlo method is employed to obtain these Bayesian estimates and their corresponding highest posterior density credible intervals. Due to the inherent complexity of these estimators, which cannot be theoretically compared, a simulation study is conducted to evaluate the performance of various estimation procedures. Additionally, a range of optimization criteria is utilized to identify the most effective progressive control strategies. Lastly, the article presents a medical application to illustrate the effectiveness of the proposed estimators. Numerical findings indicate that Bayesian estimates outperform other estimation methods by achieving minimal root mean square errors and narrower interval lengths.


Introduction
Effectively analyzing biomedical physical data holds the potential to advance personalized medicine, contribute to disease prevention, enable early diagnosis, and support the development of targeted therapies. However, it also introduces notable ethical and privacy considerations, particularly when handling sensitive patient information. Collaboration between researchers and healthcare professionals becomes imperative to ensure the responsible and secure management of biomedical data while leveraging its potential for improving human health. In various experimental and statistical scenarios, obtaining comprehensive information on failure units can be exceedingly challenging, if not impossible, due to constraints such as cost and time limitations. This challenge is particularly pertinent in reliability research, medical survival analysis, and industrial life testing trials, where minimizing total testing duration and associated high costs is of utmost importance. In these experiments, units may either fail or be removed before failing, and the removed units may be utilized in subsequent experiments. As a result, censoring occurs when the precise ages of only some of the units in the test are known. Current censoring methodologies encompass various types that have been implemented in lifetime experiments. Among the most commonly used is Type II censoring, where all n units are initially included in the test and the test concludes when the pre-determined m-th unit (1 ≤ m ≤ n) fails. The time at which the test ends is therefore random.
Despite the potential for extended testing times due to the presence of long-lived units, many experimenters opt for Type II censoring. However, Type II censoring has the drawback that units cannot be withdrawn from the test once it has started (see Kundu and Howlader [1], Balakrishnan and Han [2] and Lawless [3]). A censoring method that offers more flexibility than Type II censoring, allowing for the withdrawal of units during the test, is known as Type II progressive censoring (PT2C). Progressive censoring schemes have garnered significant attention in recent times due to their adaptability in permitting units to be removed at points other than the endpoint. Progressive censoring systems have been introduced in various forms, including Type I, Type II, and hybrid progressive schemes. However, conducting investigations under these schemes, especially when dealing with highly reliable products, can be time-consuming. A robust solution to this issue involves grouping the tested units into several sets, each containing an equal number of units; the time until the first failure within each group is recorded, resulting in what is known as a progressive first-failure censoring scheme. This approach has gained popularity in recent years for reliability analysis and life testing studies. For more details on estimation based on PT2C with applications, see EL-Sagheer [4], Wu and Gui [5], EL-Sagheer et al. [6], Khodadadian et al. [7], Noii et al. [8], Khodadadian et al. [9], Khodadadian et al. [10] and Luo et al. [11].
Although PT2C can enhance experimental efficiency, the testing duration remains a concern. Johnson [12] introduced a life test method where test units are grouped and all groups are tested simultaneously until the first failure occurs in each group. This type of censoring is known as first-failure censoring (FFC), as discussed by Wu et al. [13] and Wu and Yu [14]. Unfortunately, once grouped, units cannot be removed during FFC. To address this limitation and further improve test efficiency, Wu and Kus [15] proposed a new life test method that combines PT2C with FFC, termed progressive first failure censoring (PFFC). PFFC allows for the removal of certain groups of test units before observing any failures in those groups. Many researchers have explored statistical inference using PFFC across various models; see, for instance, Ahmadi and Doostparast [16], Kayal et al. [17], Shi and Shi [18] and EL-Sagheer et al. [19]. In this paper, the modified Weibull distribution (MWD) is discussed based on the PFFC approach. Xie et al. [20] suggested the MWD as a generalization of the WD. Moreover, its statistical properties and a detailed statistical analysis were given in Tang et al. [21] and Chen [22]. If X follows a MWD, then the probability density function (PDF), cumulative distribution function (CDF), survival function (SF), hazard rate function (HRF) and inverse hazard rate function (IHRF) are given, respectively, as
f(x) = λβ(x/α)^(β−1) e^((x/α)^β) exp{λα(1 − e^((x/α)^β))}, x > 0,
F(x) = 1 − exp{λα(1 − e^((x/α)^β))},
S(x) = exp{λα(1 − e^((x/α)^β))},
h(x) = λβ(x/α)^(β−1) e^((x/α)^β),
and r(x) = f(x)/F(x),
where α is the scale parameter and both β and λ are the shape parameters. It is clear that the exponential power distribution EPD(β, λ) is a special case of the MWD with α = 1; see Smith and Bain [23], Aarset [24] and Gupta et al. [25]. Also, the shape of the HRF of the MWD depends only on the shape parameter β as follows: for β ≥ 1, the HRF is an increasing function; for 0 < β < 1, the HRF is decreasing for x < α(1/β − 1)^(1/β) and increasing thereafter (a bathtub shape). The PDF and HRF plots of the MWD are given in Figs. 1 and 2, respectively.
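Given these closed-form expressions, the MWD functions are straightforward to evaluate numerically. The sketch below (function and argument names are our own, and the parameterization is the one stated above) may help readers check computations:

```python
import math

# Modified Weibull distribution (MWD), assuming the parameterization
#   S(x) = exp{lam * a * (1 - exp((x/a)**b))},  x > 0,
# with scale a and shape parameters b, lam (all positive).

def mwd_sf(x, a, b, lam):
    """Survival function S(x)."""
    return math.exp(lam * a * (1.0 - math.exp((x / a) ** b)))

def mwd_cdf(x, a, b, lam):
    """Cumulative distribution function F(x) = 1 - S(x)."""
    return 1.0 - mwd_sf(x, a, b, lam)

def mwd_hrf(x, a, b, lam):
    """Hazard rate h(x) = lam * b * (x/a)**(b-1) * exp((x/a)**b)."""
    return lam * b * (x / a) ** (b - 1.0) * math.exp((x / a) ** b)

def mwd_pdf(x, a, b, lam):
    """Density f(x) = h(x) * S(x)."""
    return mwd_hrf(x, a, b, lam) * mwd_sf(x, a, b, lam)
```

As a quick sanity check, S(0) = 1 and the numerical derivative of F matches f.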
The main aim of the article is to address the challenges associated with estimating unknown parameters and reliability measures when applying the MWD to censored progressive biomedical data. Specifically, the article aims to:
• Develop and compare classical (maximum likelihood) and Bayesian techniques for estimating the parameters, survival, and failure rate functions under the modified Weibull distribution framework.
• Compute Bayesian estimates using both asymmetric and symmetric loss functions, employing the MCMC method to derive these estimates and their corresponding credible intervals.
• Conduct a simulation study to evaluate the performance of the proposed estimation procedures, considering various optimization criteria to identify optimal progressive control strategies.
• Demonstrate the practical application of the proposed estimators through a medical case study, showcasing their effectiveness in biomedical data analysis.
• Provide numerical evidence supporting the superiority of Bayesian estimates, showing reduced mean square errors and narrower interval lengths compared to alternative estimation methods.
In essence, the article seeks to contribute methodological advancements in statistical inference for extreme biomedical data, particularly in the context of first-failure censored progressive scenarios, thereby enhancing the reliability and applicability of statistical methods in medical research and healthcare quality improvement efforts.
The paper layout is arranged as follows. Section 2 presents the maximum likelihood estimates (MLEs) and the observed Fisher information matrix (FIM). Bayes estimates are obtained using Lindley's and Markov chain Monte Carlo (MCMC) approaches in Section 3. In Section 4, a simulation study is carried out. An application to renal transplant survival times is studied in Section 5. Finally, the article is concluded in Section 6.

MLE
Maximum Likelihood Estimation (MLE) stands as a cornerstone in statistical inference, offering a powerful framework to deduce parameters that best describe the underlying data distribution. By maximizing the likelihood function, MLE seeks to find the values of parameters that make the observed data most probable under the assumed statistical model. Widely employed across disciplines from finance to biology, MLE facilitates robust parameter estimation for complex models, relying on the assumption of independently and identically distributed (i.i.d.) data. This methodological approach inherently balances simplicity with efficiency, providing optimal estimates under ideal conditions of large sample sizes. Understanding MLE empowers researchers and practitioners to make informed decisions based on data-driven insights, essential in shaping modern scientific and industrial practices.
This section discusses the MLE given the observed data. We have extended the Weibull distribution to a three-parameter model, a form of added model complexity aimed at attaining a better fit to the data and a higher level of accuracy. Furthermore, we compute the estimates and the approximate confidence intervals for the survival function (SF), hazard rate function (HRF), and inverse hazard rate function (IHRF), which, to the best of our knowledge, have not been discussed much in the literature. Let X_{i:m:n:k}, i = 1, 2, ..., m, denote the PFFC sample from the MWD with censoring scheme R = (R₁, R₂, ..., R_m).
For more details on the model description, see Wu and Kus [15]. Thus, the log-likelihood function, without the normalizing constant, can be expressed as
ℓ(α, β, λ) = m ln(λβ) + Σ_{i=1}^{m} [(β − 1) ln(x_i/α) + (x_i/α)^β] + λα Σ_{i=1}^{m} k(R_i + 1)(1 − e^((x_i/α)^β)).   (1)
It is possible to obtain the maximum likelihood (ML) estimators by setting the partial derivatives of Eq. (1) with respect to α, β and λ to zero and solving the resulting likelihood equations (2), (3) and (4). As these non-linear equations are evidently unsolvable analytically, a numerical approach such as Newton-Raphson is employed, as stated in EL-Sagheer [4]. Additionally, by the invariance property of MLEs, the MLEs of S(t), h(t) and r(t) are obtained by replacing α, β and λ with their MLEs in the corresponding expressions.
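The log-likelihood above can be coded directly and handed to any numerical optimizer (Newton-Raphson or a quasi-Newton routine). The sketch below assumes the standard PFFC likelihood L ∝ ∏ f(x_i) S(x_i)^{k(R_i+1)−1} of Wu and Kus [15]; function and variable names are illustrative:

```python
import math

# PFFC log-likelihood for the MWD, assuming the parameterization
# S(x) = exp{lam * a * (1 - exp((x/a)**b))} used throughout.

def log_sf(x, a, b, lam):
    """log S(x) for the MWD."""
    return lam * a * (1.0 - math.exp((x / a) ** b))

def log_pdf(x, a, b, lam):
    """log f(x) = log h(x) + log S(x) for the MWD."""
    return (math.log(lam * b) + (b - 1.0) * math.log(x / a)
            + (x / a) ** b + log_sf(x, a, b, lam))

def pffc_loglik(a, b, lam, xs, R, k):
    """Log-likelihood (up to an additive constant) of a PFFC sample xs
    with removal scheme R and group size k."""
    return sum(log_pdf(x, a, b, lam) + (k * (r + 1) - 1) * log_sf(x, a, b, lam)
               for x, r in zip(xs, R))
```

With k = 1 and all R_i = 0, the censoring weights vanish and the expression reduces to the complete-sample log-likelihood, which is a useful unit test.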

Approximate confidence intervals
Approximate Confidence Intervals (ACIs), leveraging the Fisher Information Matrix (FIM), offer a robust statistical tool for estimating parameter uncertainties. By utilizing the second derivatives of the log-likelihood function, the FIM provides a framework to calculate ACIs efficiently. These intervals are valuable in scenarios where exact solutions are impractical, providing reliable estimates with manageable computational effort, and they are widely applied across disciplines for their versatility in uncertainty quantification. Based on the asymptotic normality of the MLEs, the ACIs of the parameters α, β and λ can be constructed via the asymptotic variances obtained from the inverse of the FIM, I⁻¹(α, β, λ). Practically, we usually estimate I⁻¹(α, β, λ) by I⁻¹(α̂, β̂, λ̂); a more straightforward and legitimate procedure is to invert the matrix of negative second partial derivatives of the log-likelihood evaluated at the MLEs. Therefore, the (1 − γ)100% ACIs for the parameters α, β and λ become
α̂ ± z_{γ/2} √(var(α̂)),  β̂ ± z_{γ/2} √(var(β̂)),  λ̂ ± z_{γ/2} √(var(λ̂)),
where z_{γ/2} is the percentile of the standard normal distribution with right-tail probability γ/2. According to the delta method discussed in Greene [26], the variances of Ŝ(t), ĥ(t) and r̂(t) can be roughly calculated as
var(Ŝ(t)) ≈ [∇Ŝ(t)]ᵀ I⁻¹(α̂, β̂, λ̂) [∇Ŝ(t)],
and similarly for ĥ(t) and r̂(t), where ∇Ŝ(t), ∇ĥ(t) and ∇r̂(t) are the gradients of Ŝ(t), ĥ(t) and r̂(t) with respect to α, β and λ. Therefore, the (1 − γ)100% ACIs for S(t), h(t) and r(t) are
Ŝ(t) ± z_{γ/2} √(var(Ŝ(t))),  ĥ(t) ± z_{γ/2} √(var(ĥ(t))),  r̂(t) ± z_{γ/2} √(var(r̂(t))).
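To make the ACI/delta-method mechanics concrete without the MWD's lengthy derivatives, here is a toy illustration on a one-parameter exponential model, where the Fisher information has the closed form I(λ) = n/λ². This is our own illustrative example, not the paper's computation; names and the z value are assumptions:

```python
import math

# Toy ACI + delta-method illustration on an exponential model:
# MLE lam_hat = n / sum(x), inverse information var = lam_hat**2 / n,
# and a delta-method interval for S(t) = exp(-lam * t).

def exp_mle_aci(xs, t, z=1.96):
    n = len(xs)
    lam_hat = n / sum(xs)                     # MLE of the rate
    var_lam = lam_hat ** 2 / n                # inverse observed information
    lam_ci = (lam_hat - z * math.sqrt(var_lam),
              lam_hat + z * math.sqrt(var_lam))
    s_hat = math.exp(-lam_hat * t)            # SF estimate via invariance
    grad = -t * s_hat                         # dS/dlam (delta-method gradient)
    var_s = grad ** 2 * var_lam
    s_ci = (s_hat - z * math.sqrt(var_s), s_hat + z * math.sqrt(var_s))
    return lam_hat, lam_ci, s_hat, s_ci
```

For the MWD, the same recipe applies with a 3 × 3 inverse FIM and a three-component gradient.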

Bayesian estimation
Bayesian estimation represents a powerful paradigm in statistical inference, rooted in Bayes' theorem, which updates prior beliefs with observed data to yield posterior distributions. Unlike frequentist methods, Bayesian estimation incorporates prior knowledge into the analysis, making it particularly adept in scenarios with limited data or when historical information is available. This approach allows for the quantification of uncertainty through posterior distributions, offering a comprehensive understanding of parameter estimates. Bayesian methods excel in complex modeling tasks, where incorporating prior information can enhance accuracy and robustness. The gamma distribution (GD) family is widely recognized for its flexibility in accommodating a diverse range of prior beliefs held by experimenters; adjusting its parameters covers a rich variety of prior shapes, so the GD garners significant attention within the statistical community. Here, it is assumed that the parameters α, β and λ are independent and follow gamma distributions
π₁(α) ∝ α^(a₁−1) e^(−b₁α),  π₂(β) ∝ β^(a₂−1) e^(−b₂β),  π₃(λ) ∝ λ^(a₃−1) e^(−b₃λ),
where the hyperparameters a_i and b_i, i = 1, 2, 3, reflect the prior knowledge about (α, β, λ) and are assumed to be nonnegative and known. A special case: when all hyperparameters of the GD are zero, we obtain the Jeffreys-type prior of the form π(θ) ∝ 1/θ. Therefore, the joint prior can be expressed as
π(α, β, λ) ∝ α^(a₁−1) β^(a₂−1) λ^(a₃−1) e^(−(b₁α + b₂β + b₃λ)).

Lindley's technique
Lindley's technique offers a simplified method to approximate posterior expectations without requiring extensive computational resources. The technique approximates Bayesian estimators by leveraging a second-order Taylor expansion of the posterior around the maximum likelihood estimates. By focusing on local behavior, Lindley's approximation provides a pragmatic solution for situations where exact posterior calculations are challenging. This method is particularly useful when the posterior distribution is unimodal and approximately symmetric around its mode, offering a computationally efficient alternative to more complex Bayesian inference techniques. The Lindley approximation was first presented by Lindley [27]. It is significant because it allows the Bayes estimators to be evaluated without computing integrals: the Bayes estimate of any function u(α, β, λ) reduces to an expression involving the MLEs, the second- and third-order partial derivatives of the log-likelihood, and the derivatives of the log-prior, all evaluated at (α̂, β̂, λ̂).
It is known that Lindley's approximation does not provide interval estimates. Hence, we construct the credible intervals (CRIs) of the unknown quantities based on the MCMC technique.

MCMC technique
Markov chain Monte Carlo (MCMC) techniques stand as a cornerstone in Bayesian estimation, offering powerful tools to approximate complex posterior distributions through iterative sampling. Originating from the marriage of Markov chains and Monte Carlo methods, MCMC has revolutionized statistical inference by enabling practitioners to tackle high-dimensional problems that defy conventional analytical solutions. At its core, MCMC generates a sequence of correlated samples from the target distribution by constructing a Markov chain whose equilibrium distribution matches the posterior of interest. The chain's ergodicity ensures that, with sufficient iterations, the samples converge to the true posterior distribution, overcoming the curse of dimensionality often encountered in Bayesian inference. Several types of MCMC algorithms have emerged to address varying challenges in Bayesian estimation. The foundational Metropolis-Hastings (M-H) algorithm remains widely used, proposing candidate states that are accepted or rejected according to an acceptance criterion. Its extension, the Gibbs sampler, handles multivariate distributions by sampling iteratively from the full conditionals; see Geman and Geman [28], Metropolis et al. [29] and Hastings [30]. Further innovations include Hamiltonian Monte Carlo (HMC), which leverages gradient information to improve sampling efficiency, particularly in high-dimensional spaces, and sequential Monte Carlo (SMC) methods, which provide alternatives for dynamic models or scenarios with evolving data streams. In summary, MCMC techniques have become indispensable in Bayesian statistics, offering a principled approach to exploring and summarizing complex posterior distributions. The joint posterior density can be reformulated as the product of the likelihood and the joint prior, from which the full conditional densities of α, β and λ follow as Eqs. (9), (10) and (11). In particular, the full conditional of λ,
π*(λ | α, β, data) ∝ λ^(m + a₃ − 1) exp{−λ[b₃ + α Σ_{i=1}^{m} k(R_i + 1)(e^((x_i/α)^β) − 1)]},
i.e., Eq. (11), follows a GD, enabling the straightforward generation of samples for λ using any gamma-generating routine.
Conversely, Eqs. (9) and (10) do not conform to established distributions, necessitating the use of MCMC techniques for sampling. Specifically, the algorithm employs Gibbs sampling with M-H steps in sequence to generate samples from these conditionals.
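A minimal runnable sketch of this Gibbs-with-M-H scheme follows. The functions log_cond_a and log_cond_b are hypothetical placeholders (toy normal log-densities) standing in for the nonstandard conditionals (9) and (10), and the fixed gamma draw stands in for the gamma conditional (11); in a real run these would be replaced by the actual full conditionals:

```python
import math
import random

random.seed(1)

def log_cond_a(a, b, lam):   # placeholder for the real conditional (9)
    return -0.5 * (a - 1.0) ** 2

def log_cond_b(b, a, lam):   # placeholder for the real conditional (10)
    return -0.5 * (b - 1.0) ** 2

def mh_step(cur, log_target, scale=0.5):
    """One Metropolis-Hastings step with a normal random-walk proposal."""
    prop = random.gauss(cur, scale)
    if math.log(random.random()) < log_target(prop) - log_target(cur):
        return prop
    return cur

def gibbs(n_iter=5000, burn=1000):
    """Gibbs sampler: direct gamma draw for lam, M-H steps for a and b."""
    a, b, lam = 1.0, 1.0, 1.0
    draws = []
    for i in range(n_iter):
        lam = random.gammavariate(2.0, 1.0)  # stands in for Eq. (11)
        a = mh_step(a, lambda v: log_cond_a(v, b, lam))
        b = mh_step(b, lambda v: log_cond_b(v, a, lam))
        if i >= burn:
            draws.append((a, b, lam))
    return draws
```

Credible intervals then come from the empirical quantiles of the retained draws.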
where M is the burn-in period. To establish the CRIs, the generated samples after burn-in are arranged in ascending order, θ^(M+1) ≤ θ^(M+2) ≤ …, and the empirical γ/2 and 1 − γ/2 quantiles are taken as the lower and upper bounds.

Simulation study
Simulation studies play a pivotal role in the realm of statistical research, particularly in evaluating and comparing estimation methods. By simulating data under known conditions, researchers can systematically assess the performance of various statistical techniques across different scenarios. These studies provide a controlled environment where the true values are known, allowing for a rigorous comparison of estimation accuracy, precision, and robustness. Moreover, simulations facilitate the exploration of methodological assumptions and their implications in practical applications. They help identify strengths and weaknesses, guiding the selection of appropriate methods based on the specific characteristics of the data and research objectives. In essence, simulation studies serve as a cornerstone for advancing statistical methodologies, ensuring that researchers can confidently apply the most effective techniques to real-world data analysis challenges. Using the algorithm suggested by Balakrishnan and Sandhu [31] together with the CDF 1 − (1 − F(x))^k, 1000 PFFC samples were generated from the MWD with parameters (α, β, λ) = (1, 0.1, 2), k = 2 and different (n, m). The performance of the derived estimates of α, β and λ from the proposed methods (MLE, Lindley approximation, MCMC technique) is compared in terms of point and interval estimates. To this end, the mean squared error, MSE = (1/1000) Σ_{i=1}^{1000} (θ̂_i − θ)², is computed for each estimator. The main observations are:
1. It is clear from all tables that, as (n, m) increase, the MSEs decrease, and the Bayes estimates under GELF with shape parameter 1 have the smallest MSEs.
2. Scheme 1 performs better than other schemes in the sense of having smaller MSEs.
3. The MCMC CRIs give more accurate results than the ACIs because the lengths of the MCMC CRIs are smaller than those of the ACIs for different n and m.
Table 1. The MSE of the parameter α.
4. Generally speaking, the Bayes estimates of the parameters using the MCMC method are better than their MLEs and the Bayes estimates using the Lindley approximation, in the sense of MSEs.

5. From Tables 1-8, the estimated values for all parameters using the Lindley approximation under GELF with shape parameter −1 are exactly equal to those obtained using the Lindley approximation under SELF.
6. From Tables 1-8, the estimated values for all parameters using MCMC under GELF with shape parameter −1 are exactly equal to those obtained using MCMC under SELF.
7. The estimates from the ML and Bayesian approaches are very similar, and their ACIs have high CPs.
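The PFFC samples used in the simulations above can be generated by transforming a progressive Type-II uniform sample (Balakrishnan and Sandhu [31]) through the first-failure relation X = F⁻¹(1 − (1 − U)^(1/k)). A sketch, assuming the MWD inverse CDF derived from the parameterization in the introduction (parameter values are illustrative):

```python
import math
import random

random.seed(7)

def mwd_inv_cdf(u, a, b, lam):
    """Inverse CDF of the MWD: solves F(x) = u for x."""
    return a * math.log(1.0 - math.log(1.0 - u) / (lam * a)) ** (1.0 / b)

def pt2c_uniform(R):
    """Progressive Type-II uniform order statistics (Balakrishnan-Sandhu)."""
    m = len(R)
    # V_i = W_i**(1/(i + R_m + ... + R_{m-i+1})) with W_i ~ U(0, 1)
    V = [random.random() ** (1.0 / (i + sum(R[m - i:]))) for i in range(1, m + 1)]
    # U_i = 1 - V_m * V_{m-1} * ... * V_{m-i+1}, increasing in i
    return [1.0 - math.prod(V[m - i:]) for i in range(1, m + 1)]

def pffc_sample(a, b, lam, R, k):
    """PFFC sample from the MWD with removal scheme R and group size k."""
    return [mwd_inv_cdf(1.0 - (1.0 - u) ** (1.0 / k), a, b, lam)
            for u in pt2c_uniform(R)]
```

Repeating this 1000 times for each (n, m, R) and computing the estimators on each sample reproduces the structure of the simulation study.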

Table 3. The MSE of the parameter λ.

Application on renal transplant survival times
To elucidate the estimation methodologies covered in the preceding sections, we offer an application to real-world data on renal transplant survival times. The data set was first reported by Hand et al. [32] and shows the graft survival times (in years) for one hundred kidney transplant recipients. The data are listed as follows: 0.0035, 0.0068, 0.0101, 0.0167, 0.0168, 0.0197, 0.0213, 0.0233, 0.0234, 0.0508, 0.0508, 0.0533, 0.0633, 0.0767, 0.0768, 0.0770, 0.1066, 0.1267, 0.1300, 0.1639, 0.1803, 0.1867, 0.2180, 0.2967, 0.3328, 0.3700, 0.3803, 0.4867, 0.6233, 0.6367, 0.6600, 0.7180, 0.7800, 0.7967, 0.8016, 0.8300, 0.8410, 0.9100, 0.9233, … The Kolmogorov-Smirnov (K-S) test statistic is used to assess the degree of fit between the MWD and the actual data. The K-S distance and associated p-value are calculated to be 0.3571 and 0.092661, respectively. Based on the p-value, we can conclude that the MWD fits the data adequately. Fig. 3 provides additional illustrations in the form of empirical, Q-Q, and P-P plots.
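The K-S distance reported above is simply the largest gap between the empirical CDF of the sorted data and the fitted MWD CDF. A sketch of that computation (the MWD parameter values below are illustrative, not the fitted values from the paper):

```python
import math

def mwd_cdf(x, a, b, lam):
    """MWD CDF under the parameterization used in this paper."""
    return 1.0 - math.exp(lam * a * (1.0 - math.exp((x / a) ** b)))

def ks_distance(data, cdf):
    """Two-sided K-S statistic D = sup_x |F_n(x) - F(x)|, checking the
    empirical CDF jump at each sorted observation."""
    xs = sorted(data)
    n = len(xs)
    return max(max(abs((i + 1) / n - cdf(x)), abs(i / n - cdf(x)))
               for i, x in enumerate(xs))
```

In practice one would evaluate ks_distance on the 100 survival times with the MLEs (α̂, β̂, λ̂) plugged into mwd_cdf.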
Based on the previous PFFC sample, the MLEs and ACIs for α, β, λ, S(t), h(t) and r(t) are reported in Tables 9 and 10. Moreover, to compute the Bayesian estimates, the prior distributions of the parameters need to be specified. Since we have no prior information, we assume non-informative gamma priors for α, β and λ, that is, the hyper-parameters are a_i = 0.0001 and b_i = 0.0001, i = 1, 2, 3. In addition, 12000 MCMC samples were generated and the first 2000 samples expunged as 'burn-in'. Figs. 4 and 5 display the trace plots of the parameters generated by the MCMC approach and the associated histograms; the dashed line verifies the convergence of the MCMC method (around the point estimate of the parameter), while the solid lines mark the lower and upper bounds of the credible intervals. Tables 9 and 10 show the Bayesian estimates as well as 95% CRIs for α, β, λ, S(t), h(t) and r(t).

Conclusion
In this study, we have devised three different methods employing a PFFC scheme to estimate the unknown parameters of the MWD. Using the Fisher information matrix, we have constructed ACIs for α, β and λ. Furthermore, the ACIs for the SF, HRF, and IHRF have been computed using the delta approach. The posterior distribution equations for the unknown parameters are complex and difficult to reduce analytically into well-known forms, particularly when considering Bayesian estimates. To overcome this difficulty, we have used MCMC techniques and the Lindley approximation to compute the Bayesian estimators, calculating these Bayes estimates under both SELF and GELF. In addition, the study evaluated the various methodologies and directly compared their performance in a simulated environment. Based on the results obtained, it was determined that the Bayes method is suitable for estimating and constructing approximate confidence intervals for unknown parameters when dealing with progressively first-failure censored data from the MWD, and the MCMC algorithm demonstrated superior performance compared to Lindley's method. Subsequently, the MWD was applied to real-world medical data, revealing its capability to accurately model the current data and thereby suggesting its potential for analyzing similar datasets in the medical field. Despite these findings, the study highlights several avenues for future research: optimizing censoring schemes for enhanced effectiveness and extending statistical inference methods to accommodate accelerated life testing models with multiple failure factors. Our paper can also have impacts and benefits across different fields, which we list as follows:
• Healthcare and Biomedical Research: In healthcare, understanding the distribution of extreme medical events, such as patient survival times or disease progression, is vital for treatment planning and resource allocation. By applying the modified Weibull distribution to such data, more accurate estimates of survival and hazard behavior can be obtained.
M.S. Eliwa, L.A. Al-Essa, A.M. Abou-Senna et al.

Fig. 1. The PDFs of the MWD with different α, β and λ.

Table 2. The MSE of the parameter β.

Table 4. The ALs and CPs of 95% ACIs and CRIs for α, β and λ.