Parametric Bootstrap Methods for Estimating Model Parameters of Non-homogeneous Gamma Process

Non-Homogeneous Gamma Process (NHGP) is characterized by an arbitrary trend function and a gamma renewal distribution. In this paper, we estimate the confidence intervals of model parameters of NHGP via two parametric bootstrap methods: simulation-based approach and re-sampling-based approach. For each bootstrap method, we apply three methods to construct the confidence intervals. Through simulation experiments, we investigate each parametric bootstrapping and each construction method of confidence intervals in terms of the estimation accuracy. Finally, we find the best combination to estimate the model parameters in trend function and gamma renewal distribution in NHGP. KeywordsBootstrap, Non-homogeneous gamma process, Trend function, Confidence interval.


Introduction
In various fields of reliability engineering, stochastic point processes are used to describe probabilistic behavior of the cumulative number of events occurred as time goes by (see Meeker and Escobar, 1998). For example, the occurrence of failures in repairable systems or the detection of faults in software testing are often modeled by representative stochastic point processes. By applying stochastic point processes to various situations, we can derive some important measures quantitatively, such as mean time between failures for repairable systems or software reliability for software products. The most important issue for applying them is to statistically estimate the model parameters which characterize the stochastic point process in order to evaluate the quantitative measures.
In general, statistical estimation can be classified into point estimation and interval estimation. The former estimates model parameters or reliability measures as numeric values. However, since these model parameters have to be estimated based on the limited information such as failure data observed in a specific period, the estimators of model parameters always include the estimation errors. Thus, different failure time data may often result different model parameters, and these may not guarantee any confidence of the model parameters estimated by means of the point estimation. The interval estimation is more suitable, and the confidence intervals of model parameters will give the statistically significant information because the resulting estimates of model parameters or reliability measures are given by confidence regions. A lot of existing works on interval estimation of model parameters for representative stochastic point processes have been reported in the field of software reliability. Yamada and Osaki (1985), Joe (1989), van Pul (1992), Zhao and Xie (1996) gave asymptotically approximate confidence intervals of model parameters for specific Non-Homogeneous Poisson Process (NHPP)-based software reliability models (SRMs). Okamura et al. (2007) developed variational approaches to approximate the confidence interval of model parameters from the both viewpoints of frequentist and Bayesian methods. In this way, considerable attention has been paid to perform the interval estimation of model parameters of NHPP.
In this paper, we propose interval estimation methods of model parameters for a more generalized stochastic point process. Non-Homogeneous Gamma Process (NHGP) were proposed by Berman (1981) as a kind of generalized stochastic point process of NHPP. Later, software reliability models based on NHGP was considered by Ishii and Dohi (2008). In their work, the parameter estimation of models was done in accordance with the common maximum likelihood principle. Furthermore, Saito and Dohi (2015) extended NHGP-based SRMs from both viewpoints of modeling and parameter estimation, and developed a non-parametric maximum likelihood estimation method. In these existing works, the authors focused on only the point estimation of model parameters or software reliability measures. In fact, interval estimation of model parameters or several reliability measures for an NHGP is much more difficult in the sense of analytics than the point estimation in the maximum likelihood estimation framework. To our best knowledge, the bootstrap method is a straightforward but the most effective method to calculate the confidence regions in such cases. For the NHPP case, some authors try to perform interval estimations of complex estimators via bootstrap methods (see Croteau et al., 2005;Kaneishi and Dohi, 2011;Léger and Cléroux, 1992;Tokumoto et al., 2014). We aim to derive approximately the probability distributions of statistical estimators of model parameters using parametric bootstrap methods, and calculate the two-sided confidence intervals of model parameters for NHGP. Once the probability distribution of an estimator of model parameters is given, we can obtain the complex estimators such as some reliability measures easily with bootstrap method.
In this paper, we consider a simple estimation problem of model parameters in NHGP and introduce two parametric bootstrap methods with three construction methods of confidence intervals. The estimator distributions are obtained, and the best combination of bootstrapping and construction methods is investigated through simulation experiments.
The rest of the paper is organized as follows. In Section 2, we describe the failure model by an NHGP. Section 3 discusses two parametric bootstrap methods and three construction methods of confidence intervals. In Section 4, we show simple numerical examples through simulation experiments. Finally, concluding remarks are made in Section 5.

Non-Homogeneous Gamma Process 2.1 Model Description
Non-Homogeneous Gamma Process (NHGP) is introduced by Berman (1981) and well known as a generalized point process of Non-Homogeneous Poisson Process (NHPP). In the NHPP case, On the other hand, in the case of NHGP, the random variables ( +1 ) − ( ) are characterized by the Stieltjes convolution of exponential distributions with unit scale parameter, and obey the independent and identically gamma probability density functions with shape parameter and unit scale parameter: where Γ(⋅) denotes the standard gamma function. In other words, ( +1 ) − ( ) follow a renewal process having the gamma renewal density ( ) with shape parameter and unit scale parameter. That is, Berman (1981) considers a situation where there are (> 0) NHPPs with same failure intensity function ( ) and represents probabilistic events to occur when every successive th event of the underlying NHPP is observed. For the background, Berman (1981) supposed that the shape parameter is an unknown integer-valued parameter. However, it is not necessary to limit shape parameter to be an integer-value. Hence, we assume the shape parameter as an unknown real-valued parameter which has to be estimated from the event count time data. In the case of > 1, the system after the repair is improved in better condition than that before the failures occur, and the larger indicates the larger improvement effect. On the other hand, if < 1, then the system after the repair gets in worse condition than that before the repair of failures.
By taking the inverse time-scale transform of ( ), the NHGP sequence is given. Let ( | − ) be the conditional intensity function of NHGP at time , where − denotes the history of ( ) up to, but not including time . The intensity function of an NHGP depends on the past history of events and is no longer deterministic except for = 1. When = 1, it is evident that the conditional intensity function of NHGP reduces to the intensity function of NHPP, say, ( | − ) = ( ) which is deterministic. On the other hand, when ≠ 1, the intensity function of the NHGP may be stochastic. To avoid confusion, ( ) is called the trend function for the NHGP to distinguish the intensity function for the NHPP in the latter discussion. Furthermore, ( ) = ∫ ( ) 0 is called the cumulative trend function for an NHGP which is equivalent to the mean value function for an NHPP when = 1, since the cumulative trend function ( ) does not always represent the mean value function of NHGP, E[ ( )] (see Lindqvist et al., 2003). On the other hand, when ≠ 1, the system after repair would not be as same as it was just before the failure. In fact, an NHGP can represent two representative aging properties in the same model, called minimal repair property ( = 1) and major repair property ( ( ) = ) in the common reliability engineering (see Kijima, 1989). More precisely, if we assume the trend function to be a constant ( ( ) = ), then the NHGP is reduced to a gamma renewal process with the gamma renewal distribution with shape parameter and scale parameter . Therefore, the trend function ( ) or the cumulative trend function ( ) characterizes the NHGP as a non-stationary stochastic point process. We illustrate a schematic behavior of the conditional intensity function of NHGP in Fig. 1, where the jump size at each failure time ( = 1, 2, ⋯ , ) in the case of ≠ 1 depends on the kinds of trend function.

Maximum Likelihood Estimation of Model Parameters
Suppose that failure times ( = 1, 2, ⋯ , ) are observed from the time = 0 to the last data point = , where ( = 1, 2, ⋯ , ) are realizations of ( = 1, 2, ⋯ , ). Then, the likelihood function of the NHGP is given by (see Berman, 1981) By taking the logarithm of both sides in Eq.(2), we get the log-likelihood function of NHGP: = ∑ (log ( ( ) − ( −1 )) + log( ( ))) =1 ( 3) where 0 = ( 0 ) = 0. For easy understanding, we assume the power law type trend function in the following argument: for 0 < < ∞, 0 < < ∞. In this case, the log-likelihood function of NHGP can be rewritten by From this result, we can get Maximum Likelihood (ML) estimates of model parameters (̂,̂,) which maximize the log-likelihood function in Eq.(5) simultaneously. Therefore, once we select the trend function ( ), or equivalently the cumulative trend function ( ), we can estimate all model parameters in the NHGP. It should be noted that Eq.(5) is reduced to the log-likelihood function of NHPP having power law type trend function when = 1. Therefore, it can be said that the NHPP is completely included in the NHGP as a special case. Also, our proposed method can be applied to an NHGP having the arbitrary trend function.

Interval Estimation 3.1 Parametric Bootstrap Methods
In the above case, the resulting ML estimate is based on a fixed sample of failure time data 1 , 2 , … , and may involve the estimation error. That is, we cannot consider the uncertainty of the estimator as a random variable. It can be said that the interval estimation is more suitable in order to take account of uncertainty of estimation results. Unfortunately, the method to derive analytically the interval estimates of model parameters in an arbitrary NHGP has not been proposed yet. Instead of deriving confidence intervals analytically, we consider a statistical estimation via parametric bootstrap method and derive the probability distribution of estimators of model parameters in the NHGP. The bootstrap method is a representative statistical approach to replicate several failure time data sets from original failure time data 1 , 2 , … , (see Efron, 1979). It is well known that the bootstrap method can be used to the cases of i.i.d. sample and NHPP sample. In this paper, the same ideas are applied to calculate some confidence intervals of the NHGP's model parameters. The generated failure time data sets are called the bootstrap samples. For the parameter estimation of NHGP with failure time data, we generate (> 0) bootstrap samples from original failure time data 1 , 2 , … , and obtain ML estimates of model parameters of the trend function and of the gamma renewal distribution. In detail, we propose two bootstrap methods; simulation-based method and re-sampling-based method, to replicate the bootstrap samples as follows.
 Repeat these steps times, and generate sets of bootstrap samples.

Method (ii) Re-sampling-based method (BS2):
We sample failure time data with replacement randomly from the original data 1 , 2 , … , and omit the tied data. We obtain ML estimates of model parameters with bootstrap samples ℎ, * ( = 1,2, … , ℎ ; ℎ = 1,2, … , ) obtained by the re-sampling technique, where ℎ is decided by the size of tied data.

Construction Method of Confidence Intervals
We estimate ML estimates ̂ℎ (ℎ = 1,2, … , ) based on the bootstrap samples. From these estimates of model parameter, we derive the ordered statistics of parameter estimates by −∞ = (0) ≤̂( 1) ≤ ⋯ ≤̂( ) , and regard it as the complete sample from the random variable International Journal of Mathematical, Engineering andManagement Sciences Vol. 3, No. 2, 167-176, 2018 ISSN: 2455-7749 (statistical estimator) ̂. If there exists the cumulative distribution function (c.d.f.) of ̂, the empirical distribution function (̂) corresponding to the sample estimates ̂( ℎ) (ℎ = 0, 1, ⋯ , ) is given by It is well known that the above empirical distribution function approaches to the real (but unknown) c.d.f. of the ML estimator (random variable) ̂ as → ∞ and is strongly consistent. From the empirical distribution function of ̂, we can obtain not only the bootstrap sample mean E[̂] = ̅ and its higher moments, but also the confidence intervals with significance level . For the model parameter , let ̂( 1− )/2 and ̂( 1+ )/2 be the lower and upper limits of two-sided 100 % confidence interval with significance level . As a simplest case, the confidence interval of ̂ can be defined by the quantile of Eq.(6). That is, the two-sided 100 % confidence interval is given by where ℎ and ℎ indicate the numbers of replication order of ̂( ℎ) satisfying (̂( ℎ ) ) = (1 − )/2 and (̂( ℎ ) ) = (1 + )/2, respectively. The above confidence interval is called the Percentile Confidence Interval (PCI).
It should be noted that PCI strongly depends on the number of replication . In other words, we need to replicate a sufficiently large to guarantee the convergence. However, the computational cost of the corresponding bootstrap method will be rather costly. In fact, it is known that the bootstrap method works well when the sample size is large enough, because the coverage probability of bootstrap confidence intervals asymptotically converges to the nominal level. For such a limitation, two alternative techniques to approximate the confidence intervals are introduced in this paper.

Approximate CI-1 (ACI-1):
In this method, the basic idea comes from normal approximation. Suppose that the estimator distribution of model parameter ̂ can be approximated by a normal distribution. Then, the two-sided 100 % confidence interval is given by where (1− )/2 is the (1 − )/2-quantile of the standard normal distribution, and Var[̂] is the variance of bootstrap samples. The above confidence interval is called the approximate confidence interval-1 (ACI-1) in this paper.

Approximate CI-2 (ACI-2):
When the assumption of normality is suspected for the probability distribution of ̂, it is appropriate to approximate by the distribution of bootstrap statistics ̅ −̂.
In this case, the lower and upper quantile points of the distribution for ̅ −̂ are given by which denotes the two-sided 100 % confidence interval of ̂. Hence, we obtain The above confidence interval is called the approximate confidence interval-2 (ACI-2) in this paper.

Simulation Experiment
Suppose the non-homogeneous gamma process having a power law type trend function with ( , ) = (50, 3), and a gamma renewal distribution with unit scale parameter and shape parameter = 2 to generate the original failure time data in our simulation experiments. We choose four representative sample paths as original failure time data which are described in Fig. 2 and apply two bootstrap methods to generate = 2,000 bootstrap samples. DS1 indicates the case of somewhat larger cumulative number of failures than the mean value function of the true model. DS2 fits best the true mean value function. DS3 represents the smaller trend than the true model. Finally, DS4 shows an S-shaped curve around the true mean value function. The ML estimates for respective data sets are given in Table 1. From this table, we can know that ML estimates of parameters and in DS2 take closest values to the real ones. However, focusing on the parameter in the gamma renewal distribution, the shape parameter in DS3 is nearest to the real value.  International Journal of Mathematical, Engineering andManagement Sciences Vol. 3, No. 2, 167-176, 2018 ISSN: 2455-7749  We calculated 2,000 model parameters of the power law type trend function and the gamma renewal distribution with ML estimation method. For the significance level = 0.95, we derive the two-sided 95% confidence intervals of each model parameter. To compare the resulting confidence intervals with each other, we apply two bootstrap methods and three construction methods of confidence intervals in Section 3. Tables 2-7 show two-sided 95% confidence intervals of model parameters with two bootstrap methods and three construction methods of confidence intervals. The confidence intervals which included real model parameter ( , , ) = (50, 3, 2) are denoted with underline. By comparing the estimation results of BS1 with that of BS2, it is observed that the confidence intervals with BS1 tend to include real model parameters among these four data sets. Especially, ACI-2 with BS1 results the appropriate confidence intervals which include the real value of model parameters in many cases. From these tables, we can know that the resulting confidence intervals depend on the underlying data. Even if we apply BS1 and ACI-2 to derive the confidence intervals of model parameter in the non-homogeneous gamma process, we may have to pay attention that the confidence interval of model parameter does not include the real value in special cases such as DS1. However, it is due to the data set which is rather larger than the true mean value function. Therefore, this result may rarely occur in real situations.

Concluding Remarks and Future Work
In this paper, we have proposed two bootstrap methods and three construction methods of confidence intervals for the model parameters of Non-Homogeneous Gamma Process (NHGP). In detail, we have applied the proposed parametric bootstrap methods to derive the estimator distributions of model parameters of power law type trend function and gamma renewal distribution. Furthermore, we have calculated two-sided 95% confidence intervals via three different approaches. As a result, it has been shown that the confidence intervals could be derived appropriately with bootstrap methods for the model parameters of NHGP. Through simulation experiments, we have also investigated which method was the best one to calculate confidence intervals in NHGP case. It could be confirmed that the ML point estimate was included between the two-sided 95% confidence intervals from the results of simulation experiment in almost all cases with BS1 and ACI-2. For future studies, we will propose nonparametric bootstrap methods for model parameters of NHGP and derive confidence intervals of estimators in the case with incomplete knowledge on the trend function.