Inference for the log-logistic distribution based on an adaptive progressive type-II censoring scheme

The primary aim of this study is to investigate maximum likelihood (ML) and Bayesian estimation of the parameters of the log-logistic distribution, and to construct approximate intervals for the parameters and the survival function, based on adaptive progressive type-II censored data. The ML estimators of the parameters were obtained via the Newton-Raphson method. The approximate confidence intervals for the reliability function were calculated using the delta method. The Bayes estimators under the squared error loss function (SELF), together with approximate credible intervals for the unknown parameters and the survival function, were constructed using the Markov chain Monte Carlo (MCMC) method. A Monte Carlo study was performed to examine the proposed methods under different situations, using mean squared error, bias, coverage probability, and expected interval length as criteria. The Bayesian approach appears to perform better than the likelihood approach for estimating the log-logistic model parameters. An application to real data is included.

Subjects: Science; Mathematics & Statistics; Statistics & Probability; Statistics; Mathematical Statistics; Statistical Computing; Statistical Theory & Methods

ABOUT THE AUTHORS
Maha F. Sewailem is a graduate student in the Master of Applied Statistics program at Qatar University. Her research interests are in likelihood and Bayesian inference with lifetime data and their applications. This article is based on her master's thesis under the supervision of the second author, Prof. Ayman Baklizi.
Ayman Baklizi is a professor in the Department of Mathematics, Statistics and Physics, Qatar University. He earned his BSc and MSc degrees in Statistics from Yarmouk University, Jordan, in 1991 and 1994, respectively, and his PhD from Universiti Putra Malaysia in 1998. His research interests are in the fields of lifetime data analysis, likelihood inference, and nonparametric statistics. He has published several papers in these fields and serves as a regular referee and editorial board member for several statistical journals.

Sewailem & Baklizi, Cogent Mathematics & Statistics (2019), 6: 1684228. https://doi.org/10.1080/25742558.2019.1684228
Received: 05 March 2019; Accepted: 21 October 2019; First Published: 30 October 2019.
*Corresponding author: Ayman Baklizi, Department of Mathematics, Statistics and Physics, Qatar University, Doha, Qatar. E-mail: a.baklizi@qu.edu.qa
Reviewing editor: Yan Sun, Mathematics & Statistics, Utah State University, Logan, USA
© 2019 The Author(s). This open access article is distributed under a Creative Commons Attribution (CC-BY) 4.0 license.


PUBLIC INTEREST STATEMENT
The log-logistic distribution has been used extensively to analyze lifetime data. Its flexibility allows us to analyze many types of data, such as the lung cancer data that arise in survival analysis, economics, reliability, and hydrology. This article develops likelihood and Bayesian inference procedures for the parameters of the log-logistic distribution and the survival function, assuming that the data are adaptive progressively type-II censored. We obtained point as well as approximate interval estimators for the various quantities of interest. To explain the properties of these inference procedures to practitioners, a simulation study is conducted to assess their performance. Furthermore, to show how the proposed methods work in practice, an example based on a real data set is given.

Introduction
The log-logistic distribution, known in economics as the Fisk distribution (Fisk, 1961), is widely used in survival analysis and in life-testing experiments. Its shape resembles that of the log-normal distribution, and it can be considered an alternative to the Weibull distribution when a non-monotonic hazard function is needed in real-life data analysis. It can also be obtained as a compound of the Gompertz distribution with a gamma mixing distribution whose mean and variance both equal one. Moreover, this lifetime distribution has the characteristic property that, in contrast to the log-normal distribution, its distribution function can be written explicitly in closed form; this makes it convenient for analyzing many types of censored data. Relevant work on the log-logistic distribution includes Guure (2015), who estimated the parameters of the log-logistic model based on right-censored data using Bayesian and classical estimation methods, and Al-Shomrani, Shawky, Arif, and Aslam (2016), who considered Bayesian estimation of the unknown parameters using MCMC techniques in the complete-sample case.
Assume that the random variable X has a logistic distribution with mean µ and variance σ², and let T = e^X; then T has a log-logistic distribution. Writing s = σ√3/π for the logistic scale parameter (so that X has variance σ²), the probability density function (pdf), cumulative distribution function (cdf), survival function, and hazard rate function of the log-logistic distribution are, respectively, for t > 0:

f(t; µ, σ) = exp{−(ln t − µ)/s} / ( s t [1 + exp{−(ln t − µ)/s}]² ),

F(t; µ, σ) = [1 + exp{−(ln t − µ)/s}]^(−1),

S(t; µ, σ) = [1 + exp{(ln t − µ)/s}]^(−1),

h(t; µ, σ) = 1 / ( s t [1 + exp{−(ln t − µ)/s}] ),

where µ and σ are the location and scale parameters, respectively. This form of the log-logistic distribution was considered by Balakrishnan (1992) and AL-Haj Ebrahem and Baklizi (2005). Figures 1 and 2 show the plots of the log-logistic probability density function and the survival function for some distinct values of the parameter σ. As for the hazard function, it is decreasing when σ ≥ π/√3; for σ < π/√3 the hazard function increases to a maximum and then decreases. Figure 3 shows the graph of the hazard function for some values of the parameter σ.
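As a quick numerical sketch of the parametrization above (illustrative Python, not code from the paper; the function names are our own), the four functions can be written as follows:

```python
import math

def loglogistic_cdf(t, mu=0.0, sigma=1.0):
    """CDF of the log-logistic: T = exp(X), with X logistic having
    mean mu and variance sigma**2, i.e. logistic scale s = sigma*sqrt(3)/pi."""
    s = sigma * math.sqrt(3.0) / math.pi
    return 1.0 / (1.0 + math.exp(-(math.log(t) - mu) / s))

def loglogistic_pdf(t, mu=0.0, sigma=1.0):
    s = sigma * math.sqrt(3.0) / math.pi
    z = (math.log(t) - mu) / s
    return math.exp(-z) / (s * t * (1.0 + math.exp(-z)) ** 2)

def loglogistic_sf(t, mu=0.0, sigma=1.0):
    """Survival function S(t) = 1 - F(t)."""
    return 1.0 - loglogistic_cdf(t, mu, sigma)

def loglogistic_hazard(t, mu=0.0, sigma=1.0):
    """Hazard rate h(t) = f(t) / S(t)."""
    return loglogistic_pdf(t, mu, sigma) / loglogistic_sf(t, mu, sigma)
```

With µ = 0 and σ = 1, the survival values S(0.5), S(1), and S(2) from these functions match those quoted later in the simulation section (0.7784, 0.5, 0.2215).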
Researchers may not have enough time to observe the lifetimes of all experimental units. Reducing the duration of the experiment and the related cost are the main aims of censoring. The most common censoring schemes are type-I censoring, where the experiment stops at a predetermined time T, and type-II censoring, where the experiment stops once a pre-specified number m of failures has been observed. However, these classical schemes do not allow the removal of units at points other than the terminal point of the experiment. Therefore, the progressive type-II censoring scheme was introduced in life tests. A more flexible scheme, called type-II progressive hybrid censoring, was proposed by Kundu and Joarder (2006). A combination of the type-I and type-II progressive censoring schemes, known as the adaptive type-II progressive censoring scheme, was proposed by Ng, Kundu, and Chan (2009) for life studies. Consider n identical, independent units in a reliability experiment, with m and n fixed in advance, and let the progressive censoring scheme R = (R_1, …, R_m) be specified before the experiment starts. The total test time may run over the pre-fixed ideal time T. Let J denote the number of failures observed before time T, i.e., X_{J:m:n} ≤ T < X_{J+1:m:n}, where X_{0:m:n} ≡ 0 and X_{m+1:m:n} ≡ ∞. When the total test time exceeds the ideal time T, no surviving units are removed at the remaining failures (R_{J+1} = ⋯ = R_{m−1} = 0), and all n − m − Σ_{i=1}^{J} R_i remaining units are removed at the time of the m-th failure; this accelerates the experiment so that it ends as soon as possible. For more detailed accounts of progressive censoring and real applications in reliability and quality, see Balakrishnan (2007) and Balakrishnan and Cramer (2014).
The rest of the paper is organized as follows. In Section 2, the ML estimators of the unknown parameters and the corresponding survival function are derived; the approximate confidence intervals for µ and σ are constructed using the Fisher information matrix, and the approximate confidence interval for the survival function is constructed using the delta method. In Section 3, the Bayesian approach is used to estimate the same parameters and to construct the corresponding approximate credible intervals using the Metropolis-Hastings algorithm. A Monte Carlo simulation study investigating the efficiency of the different estimators is carried out in Section 4. An example based on a real data set is provided in Section 5. Finally, a brief summary and conclusions are given in Section 6.

Classical estimation
In this section, we derive the likelihood-based inference procedures for the parameters and the reliability function of the log-logistic distribution under adaptive progressive type-II censoring.

Maximum likelihood estimation
We consider n units in a life experiment from a lifetime distribution with cdf F(x; θ) and pdf f(x; θ), under the assumptions above. The conditional likelihood function of the parameter vector θ, given the vector of observations x obtained with progressive censoring scheme R = (R_1, …, R_m), was given by Ng et al. (2009) as

L(θ) ∝ ∏_{i=1}^{m} f(x_{i:m:n}; θ) ∏_{i=1}^{J} [1 − F(x_{i:m:n}; θ)]^{R_i} [1 − F(x_{m:m:n}; θ)]^{n − m − Σ_{i=1}^{J} R_i},

Figure 3. Plots of the log-logistic hazard rate function for some values of σ when µ = 0.
and the corresponding conditional likelihood function of the log-logistic distribution follows by substituting the pdf and cdf given above; its logarithm is the log-likelihood in Equation (1). The ML estimators µ̂ and σ̂ are the values that maximize this likelihood. They are obtained by equating the first partial derivatives of Equation (1) with respect to µ and σ to zero and solving the two equations simultaneously. These likelihood equations have no closed-form solution; therefore, a numerical method is required to find the ML estimators.
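As an illustration (not the authors' code), the log-likelihood above can be maximized numerically; here a general-purpose SciPy optimizer stands in for the hand-coded Newton-Raphson iteration, and the toy data are a complete sample (J = m, all R_i = 0), in which case the censored likelihood reduces to the ordinary one:

```python
import math

import numpy as np
from scipy.optimize import minimize

SQRT3 = math.sqrt(3.0)

def negloglik(theta, x, R, J, n):
    """Negative log-likelihood for adaptive progressive type-II censored data.

    x : the m observed (sorted) failure times
    R : planned removals R_1..R_m
    J : number of failures observed before the ideal time T
    n : total number of units on test
    """
    mu, sigma = theta
    if sigma <= 0.0:
        return np.inf
    m = len(x)
    s = sigma * SQRT3 / math.pi            # logistic scale so that Var = sigma^2
    z = (np.log(x) - mu) / s
    log_f = z - np.log(s * x) - 2.0 * np.logaddexp(0.0, z)   # log pdf terms
    log_S = -np.logaddexp(0.0, z)                            # log survival terms
    r_star = n - m - np.sum(R[:J])         # units removed at the m-th failure
    return -(log_f.sum() + np.dot(R[:J], log_S[:J]) + r_star * log_S[-1])

# Toy complete sample of size 20 from the true model (mu = 0, sigma = 1)
rng = np.random.default_rng(1)
x = np.sort(np.exp(rng.logistic(0.0, SQRT3 / math.pi, size=20)))
R, J, n = np.zeros(20, dtype=int), 20, 20

res = minimize(negloglik, x0=[0.0, 1.0], args=(x, R, J, n), method="Nelder-Mead")
mu_hat, sigma_hat = res.x
```

A derivative-free simplex method is used here for simplicity; Newton-Raphson on the score equations, as in the paper, would use the same log-likelihood.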
The ML estimator of the reliability function is obtained from the invariance property of ML estimators as Ŝ(t) = S(t; µ̂, σ̂).

Asymptotic confidence intervals for µ and σ
The observed Fisher information matrix of the parameters µ and σ, for large n, is given as follows (Aldrich, 1997):

I(µ̂, σ̂) = − [ ∂²ℓ/∂µ²    ∂²ℓ/∂µ∂σ ;  ∂²ℓ/∂σ∂µ    ∂²ℓ/∂σ² ] evaluated at (µ, σ) = (µ̂, σ̂),

where ℓ denotes the log-likelihood. For the log-logistic model with adaptive progressive type-II censored data, the entries are obtained from the log-likelihood in Equation (1).
Unfortunately, the exact expression for the expected Fisher information matrix is difficult to obtain analytically, because the expectation of the Hessian matrix involves intractable terms. Therefore, using the inverse of the observed Fisher information matrix I^(−1)(µ̂, σ̂), the approximate 100(1 − α)% normal confidence intervals for the parameters µ and σ are given, respectively, by

µ̂ ± z_{α/2} √V(µ̂)  and  σ̂ ± z_{α/2} √V(σ̂),

where V(µ̂) and V(σ̂) are the estimated variances of µ̂ and σ̂, given by the main diagonal elements of I^(−1)(µ̂, σ̂), and z_{α/2} is the upper α/2 quantile of the standard normal distribution. In addition, the delta method (Greene, 2010) is applied to obtain an approximate confidence interval for the survival function S(t). This is a natural way to construct confidence intervals for functions of the ML estimators whose variances are analytically intractable: we form a linear approximation of the survival function and compute the variance of that approximation,

V̂(Ŝ(t)) ≈ (∇Ŝ(t))ᵀ I^(−1)(µ̂, σ̂) (∇Ŝ(t)),

where ∇Ŝ(t) is the gradient of S(t) with respect to (µ, σ) evaluated at (µ̂, σ̂). The approximate confidence interval for S(t) is then

Ŝ(t) ± z_{α/2} √V̂(Ŝ(t)).
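The mechanics of the observed-information interval and the delta method can be sketched as follows. This is a generic illustration (our own helper names), using finite differences on a toy function whose curvature is known exactly, rather than the paper's log-likelihood:

```python
import math

import numpy as np

def numerical_hessian(f, theta, h=1e-4):
    """Finite-difference Hessian of f at theta; the observed information
    matrix is this Hessian of the negative log-likelihood at the MLE."""
    theta = np.asarray(theta, dtype=float)
    k = theta.size
    H = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            ei = np.zeros(k); ei[i] = h
            ej = np.zeros(k); ej[j] = h
            H[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                       - f(theta - ei + ej) + f(theta - ei - ej)) / (4.0 * h * h)
    return H

def wald_ci(est, var, z=1.959964):
    """Approximate 95% normal (Wald) interval: est +/- z * sqrt(var)."""
    half = z * math.sqrt(var)
    return est - half, est + half

def delta_var(g, theta_hat, cov, h=1e-6):
    """Delta-method variance of g(theta_hat): grad(g)' cov grad(g)."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    grad = np.zeros(theta_hat.size)
    for i in range(theta_hat.size):
        e = np.zeros(theta_hat.size); e[i] = h
        grad[i] = (g(theta_hat + e) - g(theta_hat - e)) / (2.0 * h)
    return float(grad @ cov @ grad)

# Toy negative log-likelihood with known Hessian [[2, 1], [1, 4]]
f = lambda th: th[0] ** 2 + th[0] * th[1] + 2.0 * th[1] ** 2
I_obs = numerical_hessian(f, [0.0, 0.0])
cov = np.linalg.inv(I_obs)                    # asymptotic covariance matrix
lo, hi = wald_ci(0.0, cov[0, 0])              # interval for the first parameter
v = delta_var(lambda th: th[0] + 2.0 * th[1], [0.0, 0.0], cov)
```

In the paper's setting, `f` would be the negative log-likelihood, `theta_hat` the MLE (µ̂, σ̂), and `g` the survival function S(t; µ, σ).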

Bayesian estimation
The Bayesian approach provides an alternative way to estimate the parameters when prior knowledge about µ and σ is available. In this paper, the Bayes estimates under the squared error loss function (SELF) are obtained for the unknown parameters (µ, σ) and for the survival function, and the corresponding credible intervals for these quantities are calculated. The noninformative priors for µ and σ are taken to be π₁(µ) ∝ 1 and π₂(σ|µ) ∝ 1/σ, so the joint noninformative prior density is

π(µ, σ) ∝ 1/σ.

Additionally, a conjugate logistic prior for µ is considered. Because the scale parameter σ of the log-logistic distribution is non-negative, a gamma prior is taken for this parameter; this prior distribution is flexible and popular (Guure, 2015). The prior density for µ is taken to be logistic with hyperparameters (θ, β), and the prior for σ is taken as π₂(σ|µ) ∝ σ^(a−1) exp(−σ/b), where a and b are the hyperparameters. The joint informative prior density of µ and σ is the product of these two densities. Subsequently, the posterior density is proportional to the likelihood function times the prior density,

π(µ, σ | x) ∝ L(µ, σ | x) × π(µ, σ),

and the joint posterior density is obtained with the noninformative prior in one case and with the informative prior in the other. Hence, the Bayes estimate of any function h(µ, σ) under SELF is its posterior expectation,

ĥ = E[h(µ, σ) | x] = ∫∫ h(µ, σ) π(µ, σ | x) dµ dσ.   (4)

Equation (4) cannot be computed analytically in closed form. MCMC is one of the best numerical approximation methods for estimating these unknown parameters and building the associated credible intervals; see Hamada, Wilson, and Reese (2008) for more on MCMC techniques and Bayesian reliability examples.
Here, the Metropolis-Hastings algorithm is used to simulate samples from the joint posterior distribution, with the proposal making a joint move on (µ, σ): a logistic candidate density for µ and an inverse-gamma candidate density for σ. This provides a flexible way of obtaining random values from the target distribution. Writing θ = (µ, σ) for the two-dimensional, real-valued parameter vector, the algorithm proceeds as follows:
(1) Start with initial values θ⁽⁰⁾ = (µ⁽⁰⁾, σ⁽⁰⁾) and set i = 1.
(2) Draw a candidate µ* from the logistic proposal density.
(3) Draw a candidate σ* from the inverse-gamma proposal density.
(4) Set θ* = (µ*, σ*).
(5) Compute the acceptance probability r = min{1, [π(θ* | x) q(θ⁽ⁱ⁻¹⁾)] / [π(θ⁽ⁱ⁻¹⁾ | x) q(θ*)]}, where q denotes the proposal density.
(6) Draw u from a uniform (0,1) density.
(7) If u ≤ r, set θ⁽ⁱ⁾ = θ*; otherwise set θ⁽ⁱ⁾ = θ⁽ⁱ⁻¹⁾. Increase i by one and return to step (2).
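A generic version of the sampler can be sketched as follows. Note the simplification: the paper's joint proposal uses a logistic candidate for µ and an inverse-gamma candidate for σ, whereas this illustration uses a symmetric Gaussian random walk on a toy two-dimensional target, so the proposal density q cancels from the acceptance ratio:

```python
import numpy as np

def metropolis(log_target, theta0, n_iter=20000, step=0.8, seed=0):
    """Random-walk Metropolis sketch: propose a joint move on the whole
    parameter vector, accept with probability min(1, target ratio)."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    lp = log_target(theta)
    chain = np.empty((n_iter, theta.size))
    for i in range(n_iter):
        cand = theta + step * rng.standard_normal(theta.size)
        lp_cand = log_target(cand)
        u = rng.uniform()                 # step (6): draw u from uniform(0,1)
        if np.log(u) < lp_cand - lp:      # step (7): accept or reject
            theta, lp = cand, lp_cand
        chain[i] = theta
    return chain

# Toy target: a standard bivariate normal posterior (up to a constant)
chain = metropolis(lambda th: -0.5 * float(th @ th), [2.0, -2.0])
draws = chain[1000:]                      # discard burn-in draws
```

For the paper's model, `log_target` would be the log of the joint posterior in Equation (4)'s integrand, and an asymmetric proposal would reinstate the q-ratio in the acceptance probability.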

Simulation study
In this section, a Monte Carlo simulation study was carried out to investigate the performance of different estimators of µ, σ, and S(t).

General specifications
The simulation study was implemented in R. The procedure for generating an adaptive progressive type-II censored sample with pre-determined n and m, a given progressive censoring scheme, and a given ideal censoring time T from the log-logistic distribution follows Balakrishnan and Sandhu (1995) and Ng et al. (2009). The algorithm to generate an adaptive progressively type-II censored sample from any continuous lifetime distribution is as follows:
(1) Specify the values of n, m, µ, σ, T, and (R_1, R_2, …, R_m).
(2) Simulate m independent uniform (0,1) random variables W_1, W_2, …, W_m.
(3) Set V_i = W_i^{1/(i + R_m + R_{m−1} + ⋯ + R_{m−i+1})} for i = 1, 2, …, m.
(4) Set U_i = 1 − V_m V_{m−1} ⋯ V_{m−i+1} for i = 1, 2, …, m. Then U_1, U_2, …, U_m is a progressive type-II censored sample from the uniform (0,1) distribution.
(5) Set X_i = F^(−1)(U_i; θ) for i = 1, 2, …, m, where F^(−1)(·; θ) is the quantile function of the log-logistic distribution. Then X_1, X_2, …, X_m is the required progressive type-II censored sample from the specified distribution F, obtained by the inverse transformation method.
(6) Identify the value of J such that x_{J:m:n} < T < x_{J+1:m:n}, and discard the values x_{J+2:m:n}, …, x_{m:m:n}.
(7) Simulate the first m − J − 1 order statistics from the truncated distribution f(x)/[1 − F(x_{J+1:m:n})] with sample size n − Σ_{i=1}^{J} R_i − J − 1 as x_{J+2:m:n}, …, x_{m:m:n}.
The Monte Carlo simulation was performed for different total sample sizes n, observed sample sizes m, and progressive censoring schemes R for each choice of m and n, as shown in Table 1.
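Under the parametrization used earlier, steps (1)-(7) can be sketched in Python as follows (illustrative function names; not the authors' R code):

```python
import math

import numpy as np

SQRT3 = math.sqrt(3.0)

def ll_quantile(u, mu=0.0, sigma=1.0):
    """Quantile function F^{-1}(u) of the log-logistic distribution."""
    s = sigma * SQRT3 / math.pi
    u = np.asarray(u, dtype=float)
    return np.exp(mu + s * np.log(u / (1.0 - u)))

def ll_cdf(x, mu=0.0, sigma=1.0):
    s = sigma * SQRT3 / math.pi
    return 1.0 / (1.0 + np.exp(-(np.log(x) - mu) / s))

def adaptive_pt2_sample(n, m, R, T, mu=0.0, sigma=1.0, seed=0):
    """Generate an adaptive progressive type-II censored sample; returns (x, J)."""
    rng = np.random.default_rng(seed)
    # Steps (2)-(4): progressive type-II uniform order statistics
    W = rng.uniform(size=m)
    V = np.array([W[i - 1] ** (1.0 / (i + sum(R[m - i:])))
                  for i in range(1, m + 1)])
    U = np.array([1.0 - np.prod(V[m - i:]) for i in range(1, m + 1)])
    x = ll_quantile(U, mu, sigma)             # step (5): inverse transform
    # Step (6): J = number of failures observed before time T
    J = int(np.searchsorted(x, T))
    if J >= m - 1:
        return x, J                           # nothing to regenerate
    # Step (7): regenerate x_{J+2}, ..., x_m as the first m-J-1 order
    # statistics of a sample of size n - sum(R_1..R_J) - J - 1 from the
    # distribution left-truncated at x_{J+1}
    n_rem = n - int(np.sum(R[:J])) - J - 1
    c = ll_cdf(x[J], mu, sigma)
    draws = ll_quantile(c + rng.uniform(size=n_rem) * (1.0 - c), mu, sigma)
    tail = np.sort(draws)[: m - J - 1]
    return np.concatenate([x[: J + 1], tail]), J

# Example schemes: an early censoring time and a very late one
x1, J1 = adaptive_pt2_sample(25, 15, [10] + [0] * 14, T=1.0)
x2, J2 = adaptive_pt2_sample(25, 15, [10] + [0] * 14, T=100.0)
```

The truncated draws are obtained by mapping uniforms into the interval (F(x_{J+1}), 1) before applying the quantile function, which is the standard inverse-cdf trick for left truncation.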
The simulation study used two distinct values of the ideal total test time, T = 1 and T = 1.8; these censoring times correspond to F^(−1)(0.5) = 1 and F^(−1)(0.75) ≈ 1.8 under the true parameter values. To generate the data, we took the true parameter values (µ, σ) = (0, 1) and used t = 0.5, 1, 2, for which the corresponding values of the survival function are S(t) = 0.7784, 0.5, and 0.2215, respectively. For prior information, the noninformative priors were the flat prior for µ and the Jeffreys prior 1/σ for σ. Additionally, an informative prior was considered, with µ following the logistic distribution with known hyperparameters (θ = 0, β = 1) and σ following the gamma distribution with known hyperparameters (a = 1, b = 1). To find the Bayesian estimates and the 95% Bayes intervals for the unknown parameters, we simulated 11,000 MCMC values from the target distribution using the Metropolis-Hastings algorithm. Generally, successive draws from the target distribution are correlated; however, this autocorrelation dies out as the MCMC algorithm continues to run.

Results and comparisons
For both the ML and Bayesian estimators, the process was replicated 2000 times. For each generated sample, we obtained the ML and Bayes estimators and constructed 95% confidence and credible intervals. For each interval, we determined whether the true value falls inside the interval and computed the interval's width. The coverage probability was computed as the number of intervals containing the true value divided by 2000, and the average length of the confidence and credible intervals was computed as the mean width over the 2000 replications. The mean, bias, and MSE of the estimators of µ, σ, and S(t) for each method are summarized in Tables 2 and 3, for T = 1 only to save space. The simulation study shows that the ML estimators are close to the Bayes estimators when the priors are noninformative; thus, it is preferable to use ML estimation when no reliable prior information is available. Furthermore, according to the tables, the bias was very small in all cases of the progressive censoring schemes and for all estimation methods; consequently, the estimators are approximately unbiased. Following this, the coverage probability and average length were examined, as shown in Tables 4 and 5. From these tables, the average lengths of the confidence and credible intervals decrease as n and m increase. The coverage probabilities of the likelihood-based confidence intervals approach the nominal level of 0.95 for µ, σ, and S(t = 0.5, 1, 2) as n grows, but fall short of the nominal level for small n. On the other hand, the coverage probabilities of the credible intervals are close to the nominal level of 0.95 for µ, σ, and S(t = 0.5, 1, 2) in most cases.
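The bookkeeping behind these two criteria is straightforward; a minimal sketch (with toy intervals, not the simulation output):

```python
import numpy as np

def coverage_and_length(intervals, true_value):
    """Coverage probability (fraction of intervals containing the true value)
    and average interval length over the replications."""
    lo, hi = np.asarray(intervals, dtype=float).T
    coverage = float(np.mean((lo <= true_value) & (true_value <= hi)))
    avg_length = float(np.mean(hi - lo))
    return coverage, avg_length

# Toy replications: three intervals for a true parameter value of 0.5;
# the first and third cover it, the second does not
cov, length = coverage_and_length([(0.1, 0.9), (0.6, 1.0), (0.3, 0.7)], 0.5)
```

In the study itself, `intervals` would hold the 2000 replicated confidence (or credible) intervals for a given parameter and censoring scheme.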

An example based on real data
The data set was originally reported by Nichols and Padgett (2006) and has been analyzed by Lemonte (2014) and by AL Sobhi and Soliman (2016). The uncensored data on the breaking stress of carbon fibers (in GPa) consist of 100 observations, shown in Table 6. Carbon fiber is composed of carbon atoms bonded together to form a long chain. The fibers are extremely stiff, strong, and light. Although carbon fiber has many significant benefits over more traditional materials such as steel, aluminum, and plastic, it is more expensive. It is used in many processes to create excellent building materials such as solid carbon sheets and carbon tubes; its most common applications are in sports equipment and robotics. In this illustration, the data set was used to simulate an adaptive progressive type-II censored sample with m = 60 and with two distinct values of the ideal total test time T (1.60 and 3.66); the progressive censoring scheme was R = (30, 0*58, 10). In this short-form notation, R = (1, 0*4, 3) stands for R = (1, 0, 0, 0, 0, 3). The function "sample" in R was used to remove 30 surviving units at random from the 99 units remaining at the first failure; the remaining 10 surviving units were removed at the last failure. The observed adaptive progressive type-II censored samples are shown in Table 7 for the two values of T and the two corresponding values of J: J = 13 means that only 13 failures were observed before time T = 1.60, while J = 60 means that all failure times were observed before time T = 3.66, i.e., the experiment ended before time T. To check whether the sample follows the log-logistic distribution, we apply the one-sample Kolmogorov-Smirnov test.
The complete-sample estimates of the parameters were used to standardize the data and transform it to the standard logistic distribution via x = (π/√3)(log(y) − µ)/σ, since the parametrization of the log-logistic model used here differs from the one adopted in R. At the 0.05 significance level, the p-value (0.3927) exceeds the significance level and the test statistic (0.090001) is small. This implies that the proposed log-logistic model fits the sample data satisfactorily. The different estimators and the associated 95% intervals for µ, σ, and S(0.5) were computed. Since no prior knowledge of the unknown parameters was available, diffuse priors were used for both µ and σ. We generated 11,000 MCMC samples (µᵢ, σᵢ), i = 1, 2, …, 11,000, and discarded the first 1000 draws. Table 8 summarizes the ML and Bayes estimators of µ, σ, and S(0.5) from the censored sample; these are close to the estimates of the same parameters obtained from the complete sample. Furthermore, the Bayes estimates under the noninformative prior and the ML estimates were close to each other.
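The standardizing transformation and the test can be sketched as follows. The data here are simulated for illustration (not the carbon-fiber sample), and the helper name is our own:

```python
import math

import numpy as np
from scipy.stats import kstest

def to_standard_logistic(y, mu, sigma):
    """x = (pi/sqrt(3)) * (log(y) - mu) / sigma, which is standard logistic
    when y is log-logistic with parameters (mu, sigma)."""
    return (math.pi / math.sqrt(3.0)) * (np.log(np.asarray(y)) - mu) / sigma

# Simulated log-logistic data with mu = 0.5, sigma = 2 (logistic scale
# sigma*sqrt(3)/pi on the log scale), then tested against its own model
rng = np.random.default_rng(0)
y = np.exp(rng.logistic(0.5, 2.0 * math.sqrt(3.0) / math.pi, size=100))
stat, pval = kstest(to_standard_logistic(y, 0.5, 2.0), "logistic")
```

A small statistic and a large p-value, as obtained for the standardized carbon-fiber data in the text, indicate no evidence against the log-logistic fit.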
The approximate 95% confidence and credible intervals were computed together with the length of each interval, as reported in Table 9. The likelihood-based interval was nearly as short as the Bayes credible interval under the noninformative prior. The results of this example are consistent with those obtained in the simulation study.

Summary and conclusion
This study considered the likelihood and Bayesian approaches to estimating the parameters of the log-logistic distribution and the survival function based on adaptive progressive type-II censored data. The ML estimators of the parameters and the reliability function cannot be obtained in closed form; therefore, they were computed using the Newton-Raphson numerical method. Additionally, the asymptotic confidence intervals for µ and σ were constructed, and the delta method was utilized to find an approximate confidence interval for the reliability function. The Bayesian analysis was based on noninformative priors for both unknown parameters in one case and, in another, on an informative logistic conjugate prior for µ and a gamma prior for σ. The Bayes estimates under the squared error loss function could not be obtained analytically; therefore, the Metropolis-Hastings algorithm was used to generate 11,000 samples, of which the first 1000 draws were discarded as burn-in based on a convergence diagnostic via the "coda" package in R. The two unknown parameters were estimated, and the corresponding credible intervals for these quantities and for the reliability function were computed. A simulation study with 2000 replications then investigated the performance of the derived methods for various sample sizes n, effective sample sizes m, and three different progressive censoring schemes for each choice of n and m. The proposed methods were also illustrated with a real-life example.
The biases of both estimators were small in all situations; therefore, the estimators are approximately unbiased. The MSEs of the two estimators are also close to each other. The estimated coverage probabilities show that the intervals based on ML estimation attain the nominal level as the sample size and the effective sample size increase, while the credible intervals attain the nominal level in most cases. The results suggest that the Bayesian inference procedures have similar, and sometimes better, overall performance, and they are therefore recommended.