A new least squares method for estimation and prediction based on the cumulative hazard function

Abstract: In this paper, the cumulative hazard function is used to solve estimation and prediction problems for generalized order statistics (defined in a general setup) based on any continuous distribution. The suggested method makes use of the Rényi representation. The method can be used with type II right-censored data as well as complete data. Extensive simulation experiments are carried out to assess the efficiency of the proposed procedures. Comparisons with the maximum likelihood (ML) and ordinary weighted least squares (WLS) methods are performed, based on both the root mean squared error (RMSE) and Pitman's measure of closeness (PMC). Finally, two real data sets are considered to investigate the applicability of the presented methods.


Introduction
The main objective of statistical inference is to draw conclusions about a certain population based on a random sample of that population. Conclusions may be about the functional form of the distribution for the population of interest, estimates of its characteristics, testing statistical hypotheses regarding parameters or making predictions about future events based on current knowledge.
One of the essential elements of reliability analysis is testing product failure times under normal operating conditions. When products are highly reliable, testing complete samples is prohibitively costly and time-consuming. In such situations, statisticians have recommended many types of censoring to save time and money. Type II right censoring is one of these types and has become increasingly popular over the past few decades. Under this scheme, only the first few order statistics are observed, and subsequent inferences are then required, such as determining the distribution from which the data are collected, estimating the distribution parameters and making predictions about unobserved events. Here, the emphasis is on estimation for censored samples as well as on prediction problems.
There are many contributions to estimation based on complete and censored data. Excellent references for this subject, among others, include Balakrishnan and Cramer [1], Casella and Berger [2], Lehmann and Casella [3] and the references therein.
Prediction of future observations is a major concern in many real-world problems. The prediction for ordered random variables (RVs) is frequently used in industrial applications and survival studies to estimate the future prevalence of defective products. In life-testing experiments, interval and point predictions are also helpful in determining the best censoring strategy. More precisely, we can place identical items on a life test and wait until a manageable number of failures, in relation to cost and time, has been observed. We can then predict the failure times of the surviving items using these observed failure times. This prediction enables the experimenter to choose an appropriate censoring scheme, such as Type I or Type II censoring, progressive Type II censoring, or hybrid censoring. Prediction also helps us decide whether or not the life test needs to be accelerated and at which future failure the life test should end. Many authors have considered the prediction problem in the statistical literature from both theoretical and applied standpoints. Ahsanullah [4] studied the linear prediction of record values from the two-parameter exponential distribution. Bayesian predictions of both ordinary order statistics (OOSs) and generalized order statistics (GOSs) were studied in AL-Hussaini [5] and AL-Hussaini and Ahmed [6], respectively. In Aly [7], two-point predictors of the fractional kth upper record values from the exponential distribution are given. Some predictive and reconstructive results for dual GOSs from the inverse Weibull distribution are obtained in Aly [8] via the pivotal quantities approach. The problem of predicting future lifetimes from the Weibull probability model for a simple step-stress plan under the Khamis-Higgins model is studied by Amleh and Raqab [9].
The rest of this paper is organized as follows: In Section 2, some necessary preliminary results that are related to the present work are presented. A new least-squares method for estimation and prediction is presented in Section 3. The methodology presented in Section 3 is validated in Section 4 through comprehensive simulation studies. In Section 5, two real data sets are analyzed for illustrative and comparison purposes.

Preliminary results
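As a unified model of various ascending models of ordered RVs, Kamps [22] introduced the GOSs, based on any continuous cumulative distribution function (CDF) $F$ with density $f$, via their joint probability density function (PDF)
$$f_{1,\dots,n}(x_1,\dots,x_n)=k\Big(\prod_{j=1}^{n-1}\gamma_j\Big)\Big(\prod_{i=1}^{n-1}[1-F(x_i)]^{m_i}f(x_i)\Big)[1-F(x_n)]^{k-1}f(x_n),\qquad F^{-1}(0)<x_1\le\dots\le x_n<F^{-1}(1),\tag{2.1}$$
where $F^{-1}$ is the quantile function of $F$. The model parameters are defined by the vector $\gamma=(\gamma_1,\dots,\gamma_n)$, where $\gamma_i=k+n-i+\sum_{j=i}^{n-1}m_j>0$, $i=1,2,\dots,n$, with $\gamma_n=k>0$ and $(m_1,\dots,m_{n-1})\in\mathbb{R}^{n-1}$. Particular models, such as OOSs, sequential order statistics (SOSs), progressive type II censored order statistics, standard $k$th record values and Pfeifer's records, result from specific choices of the model parameters $\gamma_1,\dots,\gamma_n$. When $m_i=m$ for all $i=1,2,\dots,n-1$, extensive asymptotic results for bootstrapping $m$-GOSs are obtained by Barakat et al. [21]. Under the restriction $\gamma_i\ne\gamma_j$ for $i\ne j$, a wide subclass of the GOSs model, excluding record values, has been discussed by Kamps and Cramer [23], where some important distributional properties have been considered. In particular, the PDF of the $r$th GOS $X^*_{r:n}$ is given by
$$f_{X^*_{r:n}}(x)=c_{r-1}\sum_{i=1}^{r}a_i(r)\,[1-F(x)]^{\gamma_i-1}f(x),\qquad c_{r-1}=\prod_{j=1}^{r}\gamma_j,\quad a_i(r)=\prod_{j=1,\,j\ne i}^{r}\frac{1}{\gamma_j-\gamma_i},\tag{2.2}$$
and the CDF of $X^*_{r:n}$ is
$$F_{X^*_{r:n}}(x)=1-c_{r-1}\sum_{i=1}^{r}\frac{a_i(r)}{\gamma_i}\,[1-F(x)]^{\gamma_i}.\tag{2.3}$$
In view of the probability integral transform, the RVs $U^*_{r:n}=F(X^*_{r:n})$, $r=1,2,\dots,n$, are uniform GOSs. Consequently, it can be shown that
$$E[U^*_{r:n}]=1-\prod_{j=1}^{r}\frac{\gamma_j}{\gamma_j+1}\tag{2.4}$$
and
$$\mathrm{Var}(U^*_{r:n})=\prod_{j=1}^{r}\frac{\gamma_j}{\gamma_j+2}-\prod_{j=1}^{r}\Big(\frac{\gamma_j}{\gamma_j+1}\Big)^{2}.\tag{2.5}$$
In particular, for OOSs ($\gamma_j=n-j+1$),
$$E(U_{r:n})=\frac{r}{n+1}\tag{2.6}$$
and
$$\mathrm{Var}(U_{r:n})=\frac{r(n-r+1)}{(n+2)(n+1)^{2}},\qquad r=1,2,\dots,n.\tag{2.7}$$
The ordinary least squares method of estimation for complete samples was originally proposed by Swain et al. [24]. The method is based on minimizing the function
$$\sum_{i=1}^{n}\Big[F(x_i;\Theta)-\frac{i}{n+1}\Big]^{2}\tag{2.8}$$
with respect to the unknown parameter vector $\Theta=(\theta_1,\theta_2,\dots,\theta_\ell)$, where $\mathbf{x}_n=(x_1,x_2,\dots,x_n)$ is an observed ordered sample. The weighted least squares estimates can be obtained by minimizing the function
$$\sum_{i=1}^{n}\frac{(n+1)^{2}(n+2)}{i(n-i+1)}\Big[F(x_i;\Theta)-\frac{i}{n+1}\Big]^{2}\tag{2.9}$$
with respect to $\Theta$. Several authors have used the least squares and weighted least squares methods for estimating the parameters of different distributions; among them are Gupta and Kundu [25] and Kundu and Raqab [26].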
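The moments of the uniform GOSs can be checked numerically. The following is a minimal sketch, assuming the product forms $E[U^*_{r:n}]=1-\prod_{j\le r}\gamma_j/(\gamma_j+1)$ and $E[(1-U^*_{r:n})^2]=\prod_{j\le r}\gamma_j/(\gamma_j+2)$; the function names are illustrative and not taken from the paper. It confirms that the OOS choice $\gamma_j=n-j+1$ reproduces the closed forms $r/(n+1)$ and $r(n-r+1)/((n+2)(n+1)^2)$.

```python
from math import prod

def uniform_gos_moments(gammas, r):
    """Return (mean, variance) of the r-th uniform GOS, r = 1, ..., n."""
    g = gammas[:r]
    m1 = prod(x / (x + 1.0) for x in g)  # E[1 - U*_{r:n}]
    m2 = prod(x / (x + 2.0) for x in g)  # E[(1 - U*_{r:n})^2]
    return 1.0 - m1, m2 - m1 * m1

def oos_gammas(n):
    """gamma_j = n - j + 1: the ordinary order statistics special case."""
    return [n - j + 1 for j in range(1, n + 1)]

n, r = 10, 3
mean, var = uniform_gos_moments(oos_gammas(n), r)

# Closed forms for ordinary order statistics.
closed_mean = r / (n + 1)
closed_var = r * (n - r + 1) / ((n + 2) * (n + 1) ** 2)
```

Passing any other positive vector of model parameters, for example those of a progressive type II censoring plan, gives the corresponding moments without further changes.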
El-Adll and Aly [27] extended the above results to type II censored samples. Namely, based on the first $r$ observed OOSs, $x_1,x_2,\dots,x_r$, approximate least squares estimates of $\Theta$ can be obtained by minimizing the least squares function (2.10). For the GOSs model, we can extend the ordinary least squares and weighted least squares functions to take the forms (2.11) and (2.12), respectively, where $w_i=1/\mathrm{Var}(U^*_{i:n})$, $i=1,2,\dots,n$, $\mathbf{x}_r=(x_1,x_2,\dots,x_r)$ are observed values of the GOSs $X^*_{1:n},X^*_{2:n},\dots,X^*_{r:n}$, and $E[U^*_{i:n}]$ and $\mathrm{Var}(U^*_{i:n})$ are given by (2.4) and (2.5), respectively.
Kamps [22] extended Rényi's representation [29] to the GOSs model (see also Barakat et al. [28]). This representation enables us to express the $r$th GOS based on the exponential distribution as a linear combination of independent and identically distributed (iid) RVs from EXP(1). Namely,
$$X^*_{r:n}\overset{d}{=}\sum_{j=1}^{r}\frac{Z_j}{\gamma_j},\qquad r=1,2,\dots,n,\tag{2.13}$$
where $Z_1,\dots,Z_n$ are iid RVs from EXP(1) and "$U\overset{d}{=}V$" means that the RVs $U$ and $V$ have the same CDF.
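The representation also gives a direct way to simulate exponential GOSs. The sketch below assumes the form $X^*_{r:n}\overset{d}{=}\sum_{j\le r}Z_j/\gamma_j$ with iid EXP(1) variables $Z_j$; the seed, the sample size and the choice of OOS parameters are illustrative only. It compares Monte Carlo means with the partial sums $\sum_{j\le r}1/\gamma_j$.

```python
import random

def exp_gos_sample(gammas, rng):
    """One realization of (X*_{1:n}, ..., X*_{n:n}) from EXP(1) via partial sums Z_j / gamma_j."""
    total, out = 0.0, []
    for g in gammas:
        total += rng.expovariate(1.0) / g
        out.append(total)
    return out

rng = random.Random(2024)      # fixed seed so the run is reproducible
gammas = [5, 4, 3, 2, 1]       # OOSs with n = 5 (illustrative choice)
reps = 20000

sums = [0.0] * len(gammas)
for _ in range(reps):
    for i, x in enumerate(exp_gos_sample(gammas, rng)):
        sums[i] += x

empirical = [s / reps for s in sums]
theoretical = [sum(1.0 / g for g in gammas[: i + 1]) for i in range(len(gammas))]
one_sample = exp_gos_sample(gammas, rng)
```

A GOS sample from another continuous CDF $F$ can then be obtained by applying the quantile transform $x\mapsto F^{-1}(1-e^{-x})$ to each simulated value.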

A new least squares method for estimation and prediction
Minimizing the least squares and weighted least squares functions in (2.11) and (2.12) is not always a simple problem. In this section, a novel and efficient method for solving such problems is proposed.

Least squares estimation via the cumulative hazard function
The cumulative hazard function of any RV $X$ with CDF $F$ is defined by
$$H(x)=-\log\big(1-F(x)\big).\tag{3.1}$$
Evidently, $H(x)$ is a nonnegative and nondecreasing function of $x$. Suppose now that $X^*_{i:n}:=X(i,n,\tilde m,k)$ denotes the $i$th GOS from a continuous CDF $F$. Then the RVs $Z^*_{i:n}:=H(X^*_{i:n})$, $i=1,2,\dots,n$, are GOSs from the standard exponential distribution (denoted by EXP(1)). Clearly, if the CDF $F$ depends on an unknown vector of parameters $\Theta=(\theta_1,\theta_2,\dots,\theta_\ell)$, with $\ell\ge 1$, then $H$ depends on the same vector of parameters. Hence, we can develop a least squares method based on $H$. In view of Kamps [22] and Barakat et al. [28], we have
$$E[Z^*_{i:n}]=\sum_{j=1}^{i}\frac{1}{\gamma_j},\qquad \mathrm{Var}(Z^*_{i:n})=\sum_{j=1}^{i}\frac{1}{\gamma_j^{2}},\qquad i=1,2,\dots,n.\tag{3.2}$$
Consequently, the parameters can be estimated by minimizing the least squares function
$$\sum_{i=1}^{n}\Big[H(x_i;\Theta)-E[Z^*_{i:n}]\Big]^{2},\tag{3.3}$$
where $-\infty<x_1<x_2<\dots<x_n<\infty$ are observed values of the GOSs $X^*_{1:n},X^*_{2:n},\dots,X^*_{n:n}$. Moreover, the weighted least squares estimates based on the cumulative hazard function can be obtained by minimizing the weighted least squares function
$$\sum_{i=1}^{n}\frac{1}{\mathrm{Var}(Z^*_{i:n})}\Big[H(x_i;\Theta)-E[Z^*_{i:n}]\Big]^{2}.\tag{3.4}$$
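As an illustration of how such an objective might be evaluated, here is a minimal sketch. It assumes the complete-sample objective is the (optionally weighted) sum of squared gaps between $H(x_i;\Theta)$ and $E[Z^*_{i:n}]=\sum_{j\le i}1/\gamma_j$, with weights $1/\mathrm{Var}(Z^*_{i:n})$. The function names and the Weibull example, with $H(x)=(x/s)^c$, are hypothetical choices made for the demonstration.

```python
def exp_gos_moments(gammas):
    """Means and variances of Z*_{i:n}, i = 1..n, for GOSs from EXP(1)."""
    means, variances = [], []
    m = v = 0.0
    for g in gammas:
        m += 1.0 / g
        v += 1.0 / (g * g)
        means.append(m)
        variances.append(v)
    return means, variances

def hazard_ls(x, gammas, H, weighted=False):
    """(Weighted) sum of squared gaps between H(x_i) and E[Z*_{i:n}]."""
    means, variances = exp_gos_moments(gammas)
    total = 0.0
    for xi, m, v in zip(x, means, variances):
        w = 1.0 / v if weighted else 1.0
        total += w * (H(xi) - m) ** 2
    return total

def weibull_H(c, s):
    """Cumulative hazard of a Weibull with shape c and scale s (illustrative model)."""
    return lambda x: (x / s) ** c

gammas = [4, 3, 2, 1]                        # OOSs with n = 4
means, _ = exp_gos_moments(gammas)
x = [2.0 * m ** (1 / 1.5) for m in means]    # artificial data lying exactly on the means

at_truth = hazard_ls(x, gammas, weibull_H(1.5, 2.0), weighted=True)
off_truth = hazard_ls(x, gammas, weibull_H(1.0, 2.0), weighted=True)
```

In practice the objective would be handed to a numerical optimizer over the parameter vector; the two evaluations above only show that the objective vanishes at the generating parameters and is positive elsewhere.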
According to the extended Rényi representation (2.13), we have relation (3.5), where $z_1,\dots,z_n$ is an observed random sample from EXP(1). Clearly, $E\big[\sum_{j=r+1}^{i}(Z_j-1)/\gamma_j\big]=0$. Therefore, we can approximate the sum in the last term by its mean. Consequently, we suggest the modified least squares function $L^*_{H,r}(\Theta\,|\,\mathbf{x}_r)$ in (3.6). An approximate least squares estimate of $\Theta$ based on the first $r$ observed GOSs, $\mathbf{x}_r:=(x_1,x_2,\dots,x_r)$ for $r\le n$, can be obtained by minimizing $L^*_{H,r}(\Theta\,|\,\mathbf{x}_r)$ with respect to $\Theta$. Similarly, minimizing the function $WL^*_{H,r}(\Theta\,|\,\mathbf{x}_r)$ in (3.7) produces a modified weighted least squares estimate of $\Theta$ based on $x_1,x_2,\dots,x_r$.
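The approximation step can be spelled out as follows. This is one plausible reading of the argument, not necessarily the exact form of the paper's displayed equations: it assumes that each unobserved $H(x_i)$, $i>r$, is written as $H(x_r)$ plus exponential spacings $z_j/\gamma_j$, and that the zero-mean remainder is then dropped.

```latex
% Assumption: for i > r, H(x_i) = H(x_r) + sum_{j=r+1}^{i} z_j / gamma_j.
\begin{align*}
H(x_i;\Theta)-\sum_{j=1}^{i}\frac{1}{\gamma_j}
 &= \Big[H(x_r;\Theta)-\sum_{j=1}^{r}\frac{1}{\gamma_j}\Big]
   +\sum_{j=r+1}^{i}\frac{z_j-1}{\gamma_j},
   \qquad i=r+1,\dots,n,\\[4pt]
\sum_{i=1}^{n}\Big[H(x_i;\Theta)-\sum_{j=1}^{i}\frac{1}{\gamma_j}\Big]^{2}
 &\approx \sum_{i=1}^{r}\Big[H(x_i;\Theta)-\sum_{j=1}^{i}\frac{1}{\gamma_j}\Big]^{2}
   +(n-r)\Big[H(x_r;\Theta)-\sum_{j=1}^{r}\frac{1}{\gamma_j}\Big]^{2}.
\end{align*}
```

Under this reading, the censored observations contribute $n-r$ copies of the squared gap at the last observed value, so the objective depends only on $x_1,\dots,x_r$.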

A least squares predictor relying on GOSs
The results of El-Adll and Aly [27] can be extended to GOSs by minimizing the predictive least squares function (3.8) with respect to $\Theta$ and $x_s$. Similarly, a weighted least squares predictor can be derived by choosing the weights $w_i$ defined by (2.12). Proceeding as in (3.5) and (3.6), an approximate point predictor of the unobserved $s$th GOS can be obtained by minimizing the proposed predictive least squares function (3.9). In the same manner, approximate weighted least squares estimates of $\Theta$ and $x_s$ based on $x_1,x_2,\dots,x_r$ are derived by minimizing the predictive weighted least squares function (3.10).
Remark 3.1. One advantage of the cumulative hazard transformation is that the transformed RV $H(X)$ always follows the standard exponential distribution. In addition, the mean and variance of the transformed GOSs can be computed easily for any model of ordered random variables, and they do not depend on unknown parameters. Moreover, as a quick comparison between the proposed method and one of the most well-known methods of estimation, the maximum likelihood estimates (MLEs) of the distribution parameters may be difficult to obtain in certain cases, particularly when the support of the distribution is unknown. Also, the MLEs may not be robust to departures from the assumed distribution. These considerations motivated the use of the least squares approach.

Numerical simulation experiments
In this section, simulation experiments are carried out to compare the proposed method with different estimation and prediction techniques. We are mainly interested in some important probability distributions that can be widely applied in survival analysis, reliability theory and life-testing experiments. For brevity, we compare only three methods, namely the maximum likelihood, the ordinary weighted least squares and the weighted least squares via the cumulative hazard function. The following assumes that the first r (r ≤ n) elements of the GOSs based on a continuous distribution F are observed and used to estimate the unknown distribution parameters and predict some future observations. In this simulation, 10,000 independent random samples are generated from F and then used in the proposed estimation and prediction methods.

Estimation
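In view of Kamps [22], the likelihood function based on the first $r$ elements of the GOSs, which is the joint PDF of $X^*_{1:n},X^*_{2:n},\dots,X^*_{r:n}$, is given by
$$L^*(\Theta\,|\,\mathbf{x}_r)=\Big(\prod_{j=1}^{r}\gamma_j\Big)\Big(\prod_{i=1}^{r-1}[1-F(x_i;\Theta)]^{m_i}f(x_i;\Theta)\Big)[1-F(x_r;\Theta)]^{\gamma_r-1}f(x_r;\Theta),\qquad x_1\le x_2\le\dots\le x_r.$$
The maximum likelihood estimates (MLEs) of the unknown parameters $\Theta=(\theta_1,\theta_2,\dots,\theta_\ell)$ can be obtained by maximizing $L^*(\Theta\,|\,\mathbf{x}_r)$, that is, by solving the nonlinear equations $\partial\log L^*(\Theta\,|\,\mathbf{x}_r)/\partial\theta_j=0$, $j=1,2,\dots,\ell$, numerically. The ordinary weighted least squares estimates (WLSEs) are obtained by minimizing the function $WL^*_{F,r}(\Theta\,|\,\mathbf{x}_r)$ given in (2.12), by solving the nonlinear equations $\partial WL^*_{F,r}(\Theta\,|\,\mathbf{x}_r)/\partial\theta_j=0$, $j=1,2,\dots,\ell$, numerically. Finally, the modified weighted least squares estimates (MWLSEs) can be obtained by minimizing the function $WL^*_{H,r}(\Theta\,|\,\mathbf{x}_r)$ in (3.7), by solving the nonlinear equations $\partial WL^*_{H,r}(\Theta\,|\,\mathbf{x}_r)/\partial\theta_j=0$, $j=1,2,\dots,\ell$, numerically. Comparing these three methods is primarily the focus of the next sections of the study.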
We use the root mean squared error (RMSE) to compare different estimators or predictors. Because the RMSE is sensitive to extreme values, and may not even exist, Pitman's measure of closeness (PMC) (Pitman [30]) is also used for comparing estimators or predictors. According to Keating et al. [31], who provided several instructive examples, PMC is an efficient criterion for selecting among estimators, and it has been widely applied by several authors to assess estimators and predictors. Balakrishnan et al. [32] used PMC to compare different point predictors for type II censored data from an exponential distribution in the one-sample and two-sample cases. Nagaraja [33] used the mean squared error and PMC to compare the best linear predictor with the best linear invariant predictor for record values and order statistics. Raqab et al. [34] compared different point predictors of progressively censored units using PMC. Let $\hat\theta_1$ and $\hat\theta_2$ be two estimators of the same parameter $\theta$. Then, $\hat\theta_1$ is Pitman closer than $\hat\theta_2$ if
$$\Pr\big(|\hat\theta_1-\theta|<|\hat\theta_2-\theta|\big)\ge\frac{1}{2}$$
for all values of $\theta$, with strict inequality for at least one value of $\theta$.
Pitman's measure of closeness is applied to explore which of two estimators is the closest (in probability) to the true value of a parameter.
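Both criteria are straightforward to estimate from simulation output. The following is a minimal sketch, assuming the estimates from the two competing methods are stored replication by replication in two lists of equal length; the function names and the four illustrative values are hypothetical and do not come from the paper's tables.

```python
from math import sqrt

def rmse(estimates, true_value):
    """Root mean squared error of a list of simulated estimates."""
    return sqrt(sum((e - true_value) ** 2 for e in estimates) / len(estimates))

def pitman_closeness(est1, est2, true_value):
    """Fraction of replications in which est1 is strictly closer to the truth than est2."""
    wins = sum(abs(a - true_value) < abs(b - true_value) for a, b in zip(est1, est2))
    return wins / len(est1)

# Made-up estimates of a parameter whose true value is 1.0.
est1 = [1.1, 0.9, 1.2, 1.0]
est2 = [1.3, 0.95, 1.1, 0.7]

pmc_12 = pitman_closeness(est1, est2, 1.0)
rmse_1 = rmse(est1, 1.0)
rmse_2 = rmse(est2, 1.0)
```

A value of `pmc_12` above one half favors the first estimator in the Pitman sense, while the RMSE comparison may point the other way, which is why the two criteria are reported side by side.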

Two parameter-exponential distribution
If the CDF of the RV $X$ is given by
$$F(x;\theta,\sigma)=1-\exp\Big(-\frac{x-\theta}{\sigma}\Big),\qquad x\ge\theta,\ \sigma>0,$$
it is said to have a two-parameter exponential distribution, denoted by Exp($\theta$, $\sigma$), with location parameter $\theta$ and scale parameter $\sigma$. The MLEs of $\theta$ and $\sigma$ based on the first $r$ elements of GOSs are given in El-Adll [35] by
$$\hat\theta=x_1,\qquad \hat\sigma=\frac{1}{r}\Big[\sum_{i=1}^{r-1}(m_i+1)(x_i-x_1)+\gamma_r(x_r-x_1)\Big].$$
The WLSEs, $\tilde\theta$ and $\tilde\sigma$, are obtained by minimizing the function $WL^*_{F,r}(\theta,\sigma\,|\,\mathbf{x}_r)$ in (2.12), subject to the constraints $\sigma>0$ and $\theta-a>0$, for some real constant $a$. We get the MWLSEs, $\theta^*$ and $\sigma^*$, by minimizing the function $WL^*_{H,r}(\theta,\sigma\,|\,\mathbf{x}_r)$ in (3.7), subject to $\sigma>0$ and $\theta-a>0$, where $a$ is a suitable real number chosen to be less than the minimum of the data. Minimization can be accomplished by equating the first partial derivatives with respect to $\theta$ and $\sigma$ to zero and then solving the resulting equations numerically. The results are presented in Tables 1 and 2.
The two-parameter exponential distribution is chosen for the simulation study not only for its theoretical importance but also because the maximum likelihood estimators of its parameters have explicit forms; consequently, it can serve as a reference in comparisons with the proposed method.
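For this distribution the cumulative hazard is linear in $x$, $H(x;\theta,\sigma)=(x-\theta)/\sigma$, so a hazard-based weighted least squares fit can be written in closed form as a weighted linear regression of $E[Z^*_{i:n}]$ on $x_i$. The sketch below makes several assumptions that are not taken from the paper: it uses only the observed terms with weights $1/\mathrm{Var}(Z^*_{i:n})$, it omits any correction for censored units, and it does not enforce the constraint $\theta>a$. The function name and the numerical values are illustrative.

```python
def fit_exponential_hazard_wls(x, gammas):
    """Weighted LS fit of (theta, sigma) using H(x) = (x - theta) / sigma = a + b * x."""
    mu, w = [], []
    m = v = 0.0
    for g in gammas[: len(x)]:
        m += 1.0 / g
        v += 1.0 / (g * g)
        mu.append(m)        # E[Z*_{i:n}]
        w.append(1.0 / v)   # 1 / Var(Z*_{i:n})
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * mi for wi, mi in zip(w, mu))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * mi for wi, xi, mi in zip(w, x, mu))
    b = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)  # slope, equal to 1 / sigma
    a = (sy - b * sx) / sw                           # intercept, equal to -theta / sigma
    if b <= 0:
        raise ValueError("fitted slope is not positive, so sigma > 0 is violated")
    return -a / b, 1.0 / b

gammas = [6, 5, 4, 3, 2, 1]             # OOSs with n = 6
theta0, sigma0 = 2.0, 3.0               # hypothetical true values
mu = [sum(1.0 / g for g in gammas[: i + 1]) for i in range(len(gammas))]
x = [theta0 + sigma0 * m for m in mu]   # noiseless data lying on the expected hazards

theta_hat, sigma_hat = fit_exponential_hazard_wls(x, gammas)
```

The reparametrization $(a,b)=(-\theta/\sigma,\,1/\sigma)$ is one-to-one for $\sigma>0$, which is why the linear fit and the original minimization share the same solution whenever the fitted slope is positive.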

The two-parameter modified Kies exponential distribution
Al-Babtain et al. [36] introduced a new family of continuous probability distributions, which they called the "new modified Kies family." They discussed the two-parameter MKi-exponential (MKiExp) distribution in detail as a special case and demonstrated its practical importance and flexibility in fitting several types of real data. The CDF of the MKiExp distribution is given in [36], where α and σ are the shape and scale parameters, respectively. Abd El-Raheem et al. [37] considered the estimation problem for multiple constant-stress tests under progressive type-II censored MKiExp data with binomial removals. In this work, the problem of determining numerical estimates is reduced to a minimization problem subject to the constraints α > 0 and σ > 0. The results are given in Tables 5 and 6.

Prediction
According to Kaminsky and Rhodin [13], the predictive likelihood function (PLF) of the future order statistic $x_s$, $r<s\le n$, based on the first $r$ observed order statistics is given by
$$L(x_s,\Theta\,|\,\mathbf{x}_r)\propto\Big(\prod_{i=1}^{r}f(x_i;\Theta)\Big)\big[F(x_s;\Theta)-F(x_r;\Theta)\big]^{s-r-1}\big[1-F(x_s;\Theta)\big]^{n-s}f(x_s;\Theta),\qquad x_s>x_r.$$
The above PLF was extended to the GOSs model in Barakat et al. [28] for fixed and random sample sizes.
By setting α = 1, the above results are specialized to the two-parameter exponential distribution. The prediction problem for the two-parameter exponential distribution is discussed in detail in Barakat et al. [38]. Moreover, θ = 0 yields the two-parameter Weibull distribution.

Predicting future observations from the modified Kies exponential distribution
As in the Weibull distribution, the prediction of a future order statistic from the two-parameter modified Kies exponential distribution can be accomplished by minimizing the three objective functions corresponding to the ML, WLS and MWLS methods. The minimization problem is subject to the constraints σ > 0, α > 0 and x s > x r .
Remark 4.3. The RMSE and PMC are obtained numerically via simulation, and all computations are performed in Mathematica 13.1.
In view of the simulation studies given above, the following comments are extracted:
1. In all cases, for the scale and shape parameters as well as the point predictors, the RMSEs decreased as r increased.
2. In most cases, the RMSE of the MLEs is smaller than the RMSEs of both the WLSEs and MWLSEs of the parameters.
3. For the Exp(θ, σ), according to PMC shown in Table 2, the MLE is the best followed by the MWLSE, which is followed by the WLSE for estimating the location parameter θ, while for estimating the scale parameter σ, the MWLSE is the best followed by the WLSE, which is followed by the MLE.
... Table 3, it is noted that ...
6. In view of PMC, Table 4 reveals that the WLSE is the best for the location parameter, while the MWLSEs of the scale and shape parameters are the best whenever n − r is small, but the WLSEs are the best whenever n − r is large. Moreover, for complete samples (i.e., r = n), the MWLSEs of the location, scale and shape parameters are the best.
7. According to the results presented in Tables 5 and 6, there is no preference for one method over the others when estimating the MKiExp distribution parameters.
8. In most cases, it is noted that the MWLSP and WLSP are better than the MLP, according to both the RMSE and PMC (see Tables 7-10).

The aircraft windscreen failure times
The windscreen on a large aircraft is a complex piece of equipment comprised of several layers. Failures of these items typically involve damage to or delamination of the heating system's nonstructural outer ply. These failures do not result in damage to the aircraft but do require replacement. Data of this type is incomplete in that all failure times have not yet been observed and may include failures to date of a particular model or combination of models. Murthy et al. [39] reported failure and service times for a specific model windscreen in Table 16.11 on page 297. The data represents 84 observed failure times for a specific windscreen device. Al-Babtain et al. [36] have shown that the MKiExp is an appropriate model for fitting this data.
We assume that the first r failure times have been observed, and we apply our prediction methods to predict the remaining failure times in two different scenarios. In the first scenario, three point predictors, along with the corresponding parameter estimates and relative errors, are obtained jointly. In the second scenario, to avoid computational complications, we first estimate the parameters based on the first r observed failure times and then compute three different point predictors for the future sth failure. To assess the prediction results, we compute the relative error (RE) for each point predictor.
Recall that the RE is defined by $\mathrm{RE}=100\times|\hat X_{s:n}-X_{s:n}|/X_{s:n}$, where $\hat X_{s:n}$ denotes the point predictor and $X_{s:n}$ is the exact value of the quantity to be predicted. The results are presented in Table 11.
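The relative error is a one-line computation. The snippet below is a small sketch assuming a positive exact value, as is the case for failure times; the function name and the numbers are made up for illustration and are not taken from Table 11.

```python
def relative_error(predicted, actual):
    """Relative error in percent: 100 * |predicted - actual| / actual."""
    if actual <= 0:
        raise ValueError("the exact value must be positive")
    return 100.0 * abs(predicted - actual) / actual

# Hypothetical predictor of 2.1 for an exact failure time of 2.0.
re_example = relative_error(2.1, 2.0)
```

Because the measure is scaled by the exact value, it allows predictors of early and late failure times to be compared on the same percentage scale.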

Application to reticulum cell sarcoma
The second data set is reported and analyzed by Hoel [40] and Abu El Azm [41] among others. According to these data, male mice received a radiation dose of 300 roentgen at an age of 5-6 weeks. Each mouse's cause of death was identified by autopsy as either thymic lymphoma, reticulum cell sarcoma or other causes. Reticulum cell sarcoma is designated as cause 1 in this instance, and the other two causes of death are merged to make cause 2. There were 77 observations in this set of data, 39 of which are attributable to the second cause of death, while 38 are attributed to the first. For analysis purposes, we consider the following observations that are due to the first cause of death: 317, 318, 399, 495, 525, 536, 549, 552, 554, 557, 558, 571, 586, 594, 596, 605, 612, 621, 628, 631, 636,  643, 647, 648, 649, 661, 663, 666, 670, 695, 697, 700, 705, 712, 713, 738, 748, 753. It has been shown that the Makeham-Gompertz distribution is quite adequate for fitting reticulum cell sarcomas (e.g., Hoel [40]). Arguing as in the first data set, the prediction results are shown in Table 12.
The real data analysis given in the two preceding applications revealed that the maximum RE appears for the WLS method, followed by the MWLS method and then the ML method.
Table 12. Three different point predictors, the corresponding estimates and the relative errors of future OOSs x s:n , s = r + 1, ..., n based on the first r ∈ {22, 26, 30} OOSs for Example 2.

Conclusions
In this article, by using the cumulative hazard function, a new least squares method for estimation and prediction has been proposed. The method is presented in a general setup so that it can be applied to any model of GOSs. A simulation study and numerical comparisons based on the RMSEs and PMCs have been performed through three important probability distributions. For applicability, two examples of real data sets are provided to illustrate the prescribed method. The comparisons reveal that the method is comparable with the ML and WLS methods, in the sense that there is no obvious preference for one method over the others for all estimation and prediction situations. Moreover, analyzing the real data revealed that the second scenario, in which we first estimate the unknown distribution parameters using Type II right censoring and then predict future unobserved failures, performs better than the first scenario for the three methods.

Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.