Accelerated degradation tests with inspection effects

This study proposes a framework to analyze accelerated degradation testing (ADT) data in the presence of inspection effects. Motivated by a real dataset from the electric industry, we study two types of effects induced by inspections. After each inspection, the system degradation level instantaneously drops by a random value, while the degradation rate is elevated afterwards. Because some observations are missing for practical reasons, we employ the expectation–maximization (EM) algorithm to analytically estimate the unknown parameters in a stepwise Wiener degradation process with covariates. Moreover, to maintain generality for the adaptation of the method to various scenarios, a confidence density approach is utilized to hierarchically estimate the parameters in the acceleration link function. The proposed methods can provide efficient parameter estimation under complex link functions with multiple stress factors. Further, confidence intervals are derived based on the large-sample approximation. Simulation studies and a case study from Schneider Electric are used to illustrate the proposed methods. The results show that the proposed model yields a remarkably better fit to the Schneider data than the conventional Wiener ADT model. © 2020 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)


Introduction and motivation
Reliability tests are widely used to predict product lifetime in various industries. A successfully planned and conducted reliability test can provide important information supporting managerial decisions under a reasonable test budget, thereby reducing both prospective costs and risks. In order to shorten the test duration, conventional life tests are commonly conducted under elevated stresses to accelerate the failures of test units. In recent decades, with the advances in sensor and monitoring technologies, degradation tests have become preferable to life tests in the sense that they can predict reliability characteristics over time without the concern of censoring ( Meeker, Escobar, & Lu, 1998 ). In a typical degradation test, discrete degradation measurements are taken and the observed degradation paths are then employed to make inferences about product reliability. Degradation paths are usually modeled based on a quality characteristic (QC), such as the brightness of displays and the battery life of electronic devices ( Wang, Tang, Bae, & Xu, 2018 ).
on system degradation levels. Another reason for atypical degradation paths lies in the intervening nature of inspections. In other words, certain types of inspections may inherently influence the system degradation ( Zhao, Gaudoin, Doyen, & Xie, 2019 ). One example is destructive testing, where inspections directly damage the test systems ( Shi, Escobar, & Meeker, 2009 ). Considering the aforementioned issues, we propose to model the effects brought by inspections in degradation tests and then investigate parameter estimation from such testing data.
The proposed research is motivated by a real experiment carried out by Schneider Electric with the objective of revealing the degradation characteristics of an electrical distribution device. As a key part of the device, a mechanical linkage corrodes over time, which is a dominant cause of performance degradation. To quantify the degradation level of the device, engineers measure the torque needed to separate the linkage; a higher torque implies a more severe state of corrosion. On the one hand, since the inspection separates the linkage, the accumulated corrosion is physically broken apart, leading to a reduction in the degradation level during the inspection. On the other hand, the inspection damages the integrity of the surface treatments on the linkage, which leads to a higher rate of corrosion. Obviously, the two types of effects are opposite with respect to the system health. Another concern is the difficulty of revealing the exact degradation reduction during the inspection. Following a basic rule of inspection, engineers seek to minimize its influence and therefore only measure the torque that separates the linkage; once a measurement is obtained, the inspection is terminated immediately. Consequently, the measurement process may only capture the degradation level before inspections and fails to observe the degradation reduction. Although engineers can approximate the degradation reduction from prior knowledge or other experiments, its exact value can never be known. Thus, the reduction effect is a hidden variable that cannot be directly utilized for statistical inference.
By employing the experiment from Schneider Electric as an illustrative example, the study aims at establishing a framework to analyze accelerated degradation tests with complex inspection effects. In general, the proposed method can be applied to model degradation data in the presence of environmental covariates and interventions that exert both positive and negative effects.

Literature review
With the fast emergence of system monitoring technologies, modeling and inference of degradation data now play a vital role in the research area of reliability engineering and its interfaces with other areas, such as mechanical engineering ( Wang & Tsui, 2017 ), energy ( Lin et al., 2017 ) and electrical engineering ( Si, 2015 ). Degradation analysis not only subsumes approaches to model relevant data, but also creates alternative planning methods of reliability tests for life prediction.
Interest in accelerated degradation tests (ADTs) has grown in recent decades owing to successful applications to various products and systems, such as LED lamps, lithium-ion batteries and rail tracks ( Ye & Xie, 2015 ). Since their introduction, regression-based general path models have been widely used for degradation modeling in ADT ( Hong, Duan, Meeker, Stanley, & Gu, 2015 ). Further, owing to their clear physical interpretations and appealing mathematical tractability, stochastic processes such as Wiener processes ( Hu, Lee, & Tang, 2015 ), gamma processes ( Tsai, Sung, Lio, Chang, & Lu, 2016 ) and inverse Gaussian processes ( Ye, Chen, Tang, & Xie, 2014 ) have come to play an influential role in ADT modeling and planning. For a more detailed overview, one may refer to Limon, Yadav, and Liao (2017) . To date, several new considerations on ADT planning and analysis have emerged in the literature to tackle more practical issues. To name a few, Tseng and Lee (2016) proposed a general exponential-dispersion model to characterize degradation and gave optimal ADT plans in analytical form. In Li, Wu, Ma, Li, and Kang (2018) , random fuzzy theory was adopted to model the uncertainty in ADT data. Wang and Tsui (2017) considered multiple stresses in ADT for rubber sealed O-rings. With regard to model uncertainty, Liu, Li, Zio, Kang, and Jiang (2017) applied the Bayesian model averaging approach to model ADT data.
In real problems, it is common that a degradation path cannot be characterized by models of regular forms such as linear or typical nonlinear ones (e.g., polynomial, logarithmic and exponential models). The reasons behind an atypical degradation path can be rather complex. For example, Hong et al. (2015) proposed a degradation modeling approach utilizing dynamic weathering covariates to characterize irregular degradation paths. In some other studies ( Bae, Yuan, Ning, & Kuo, 2015; Wang et al., 2018 ), change-point detection and modeling were discussed for degradation data. Adaptive and dynamic methods for online degradation modeling have also prevailed in the literature ( Si, 2015; Zhai & Ye, 2018 ).
Apart from these, human interventions in industrial systems can also be a pivotal cause of atypical degradation paths, a common example being imperfect maintenance ( Mercier & Castro, 2019 ). Surprisingly, despite plentiful extant work on atypical degradation paths, we can find only very scant research in the literature addressing similar issues in ADT problems. Xiao and Ye (2016) discussed the ADT planning problem with random initial degradation levels. In Ye, Hu, and Yu (2019) , the initial performance of test units was considered to allocate units to stress levels. Nevertheless, these works did not incorporate the effect of inspection as described in Section 1 .
The rest of the paper is organized as follows. In Section 3 , a systematic approach to model construction and parameter estimation is established for ADT data with inspection effects. Section 4 discusses the uncertainty quantification of parameter estimators. Simulation studies are carried out in Section 5 . Section 6 presents the case study from Schneider Electric. Finally, conclusions are drawn in Section 7 .

Preliminaries
Consider a degradation test with $M$ stress levels, where $N_i$ test units are allocated to stress level $i$, $i = 1, \dots, M$. For the $j$th test unit under stress $i$, which we call unit $(i,j)$ for simplicity, a total of $O_{ij}$ inspections are carried out at time epochs $\tau_{ij1}, \dots, \tau_{ijO_{ij}}$.
Since an instant degradation reduction occurs upon each inspection, we denote the degradation level before the reductive effect of the $k$th inspection of unit $(i,j)$ by $y^{-}_{ijk}$, with $k = 0$ representing the initial inspection prior to the test. Likewise, the degradation level after the reductive effect is denoted by $y^{+}_{ijk}$. We introduce a variable $z_{ijk}$ to model the proportion of degradation reduction with respect to $y^{-}_{ijk}$ at the $k$th inspection of unit $(i,j)$:
$$z_{ijk} = \frac{y^{-}_{ijk} - y^{+}_{ijk}}{y^{-}_{ijk}}, \qquad 0 \le z_{ijk} < 1. \tag{1}$$
The notational details are illustrated in Fig. 1 . As mentioned in Section 1 , $y^{-}_{ijk}$ is usually observable whereas $y^{+}_{ijk}$ is relatively difficult to reveal. Thus, for notational simplicity, we subset the data to $\mathbf{y}^{-} = \{ y^{-}_{ijk} \}$ and $\mathbf{z} = \{ z_{ijk} \}$. To further facilitate degradation modeling, we let $y_{ijk} = y^{-}_{ijk} - y^{+}_{ij(k-1)}$ denote the degradation increment between the $(k-1)$th and $k$th inspections. Considering the notation in (1), we can rewrite $y_{ijk}$ as
$$y_{ijk} = y^{-}_{ijk} - \left(1 - z_{ij(k-1)}\right) y^{-}_{ij(k-1)}.$$

Wiener degradation model and inspection effect
We employ the Wiener process as the baseline model to characterize the inherent generation of degradation for the system of interest. More concretely, if the system operates without any intervention, its degradation path can be modeled by a drifted Wiener process. The Wiener process features independent Gaussian increments over non-overlapping time periods, which enables its wide application in degradation modeling. A drifted Wiener process $\{ W(t);\, t \ge 0 \}$ is characterized by drift and diffusion parameters, denoted by $\mu$ and $\sigma$, respectively. In this manner, for any $t > s$, the increment $W(t) - W(s)$ follows a normal distribution with mean $\mu(t-s)$ and variance $\sigma^2(t-s)$, i.e., $W(t) - W(s) \sim N(\mu(t-s), \sigma^2(t-s))$. In the presence of inspection effects, i.e., nonzero $z_{ijk}$'s, the degradation path no longer follows the conventional Wiener process. Nevertheless, according to the aforementioned properties, given the $z_{ijk}$'s, the $y_{ijk}$ are independent increments of a Wiener process and follow
$$y_{ijk} \mid z_{ij(k-1)} \sim N\!\left( \mu_{ijk}\,\Delta\tau_{ijk},\; \sigma^2\,\Delta\tau_{ijk} \right), \tag{2}$$
where $\mu_{ijk}$ is the degradation rate between the $(k-1)$th and $k$th inspections of test unit $(i,j)$ and $\Delta\tau_{ijk} = \tau_{ijk} - \tau_{ij(k-1)}$. Note that if the inspection has no effect on the system degradation, then the $z_{ijk}$'s are 0 for all $i$, $j$ and $k$, and the degradation path reduces to a conventional drifted Wiener process. Note also that the proposed model adopts a flexible notational convention in the sense that $\tau_{ijk}$ and $O_{ij}$ can differ across test units, implying that the proposed methods apply to data sets that are unbalanced in both inspection times and numbers of observations. Next, experimental factors are introduced into the model as covariates. The majority of research on degradation tests suggests parametric models to link $\mu$ and covariates $x_i$ ( Jakob, Kimmelmann, & Bertsche, 2017 ).
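To make the model concrete, the following sketch simulates pre-inspection degradation levels under the stepwise Wiener model described above, with beta-distributed reduction proportions and a linear rate elevation $g(k;\omega)=\omega k$; all numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_unit(tau, f_acc, omega, sigma, u, v, rng):
    """Simulate pre-inspection degradation levels y^- for one test unit.

    tau   : inspection epochs tau_1 < ... < tau_K (tau_0 = 0 implied)
    f_acc : baseline degradation rate at the unit's stress level
    omega : slope of the linear rate-elevation model g(k; omega) = omega * k
    sigma : diffusion parameter of the Wiener process
    u, v  : beta shape parameters of the reduction proportion z
    """
    tau = np.concatenate(([0.0], np.asarray(tau, dtype=float)))
    y_minus, y_post = [], 0.0          # y_post: level right after the last inspection
    for k in range(1, len(tau)):
        dt = tau[k] - tau[k - 1]
        mu = f_acc + omega * (k - 1)   # rate elevated by each earlier inspection
        incr = rng.normal(mu * dt, sigma * np.sqrt(dt))
        y_pre = y_post + incr          # level observed just before inspection k
        y_minus.append(y_pre)
        z = rng.beta(u, v)             # random proportional reduction
        y_post = (1.0 - z) * y_pre
    return np.array(y_minus)

path = simulate_unit([1, 2, 3, 4], f_acc=2.0, omega=0.5, sigma=0.3, u=2, v=3, rng=rng)
```

Unequal inspection epochs and different numbers of inspections per unit are handled naturally, mirroring the unbalanced-data convention of the model.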
We use $f_{acc}(x_i)$ to denote the baseline degradation rate under stress $x_i$, and the acceleration model can be formulated in either a parametric or nonparametric manner. The parametric acceleration model can take various forms based on the physical mechanism of degradation and the factors involved in the test.
The selection of $f_{acc}(x_i)$ is not the main focus of the study, and some brief discussions regarding the Schneider example are given in Appendix A . Moreover, the complexity of $f_{acc}(x_i)$ may impede inferential efficiency under various specific models. Under different forms of $f_{acc}(x_i)$, we wish to maximize the generalizability of the proposed model. Towards this end, we first treat the $f_{acc}(x_i)$ as separate parameters and then employ a hierarchical analysis in Section 3.6 for further inference.
To capture the increase in degradation rate after the $k$th inspection, a function $g(k; \omega)$ is introduced, where $\omega$ is a vector of unknown parameters. To benchmark the effect, we use $g(k; \omega)$ as an additive term to the baseline degradation rate and therefore assume $g(0; \omega) \equiv 0$. The degradation rate between the $(k-1)$th and $k$th inspections of test unit $(i,j)$ is then given by
$$\mu_{ijk} = f_{acc}(x_i) + g(k-1; \omega). \tag{3}$$
One implicit assumption in (3) is that the inspection-induced increase in degradation rate and the stress variables $x_i$ act independently. In other words, we assume that the inspection effect on the degradation rate depends only on $k$ and does not interact with environmental stresses. Since the degradation rate increases with the number of inspections, $g(k; \omega)$ is a non-decreasing function.
The simplest forms are polynomial; e.g., the first- and second-order polynomial models are given by
$$g(k; \omega) = \omega k \qquad \text{and} \qquad g(k; \boldsymbol{\omega}) = \omega_1 k + \omega_2 k^2.$$
Without loss of generality, we assume $g(k; \omega) = \omega k$ for analytical simplicity in the following analysis.
Recall that $z_{ijk}$ was introduced earlier to describe the proportion of degradation reduction. A beta distribution is employed to model the unconditioned $z_{ijk}$; for any realization $z$ of $z_{ijk}$, the density function is given by
$$h(z) = \frac{z^{u-1}(1-z)^{v-1}}{B(u,v)}, \qquad 0 < z < 1, \tag{4}$$
where $u$ and $v$ are the shape parameters of the beta distribution and $B(\cdot,\cdot)$ is the beta function.
It is worth mentioning that the beta distribution has been widely used to model the effect of imperfect repair ( Zhang, Gaudoin, & Xie, 2015 ).

Remark. If $y^{+}_{ijk}$ is unobservable, $u$ and $v$ cannot be estimated from the model due to the absence of $\mathbf{z}$ under the frequentist setting. However, engineers may manage to approximately quantify the reductive effect from domain expertise or preliminary experiments. In the Schneider Electric example, engineers can carry out a different type of experiment to measure the degradation levels before and after the first inspection at $t = 0$ and thereby model the reductive effects, although this is inapplicable during the ADT because obtaining the reductive measurements takes much longer. Given such data at $t = 0$, it is easy to fit a beta distribution to the observed reductive measurements. Other types of models for $z$ can also be used after straightforward modifications to the likelihoods in the remainder of this section and the density functions in Appendix B . It is noteworthy that the Bayesian method can be an appealing alternative, where $u$ and $v$ are first assigned prior distributions. The computational burden can be an important issue in Bayesian inference, especially when $\mathbf{z}$ is unobservable ( Bernardo et al., 2003 ). Further, in this paper, we presume that the $z_{ijk}$'s are independent random variables. The assumption is valid if the reductive inspection effects are instantaneous and do not interact over time. In realistic applications, the effects can interact over time under stress environments; in such cases, a joint distribution can be employed to characterize the interdependence of the $z_{ijk}$'s. However, the introduction of interdependence leads to more complicated likelihoods, which hinders the tractability of the estimators. The Monte Carlo EM algorithm can be useful in implementing such extensions ( Levine & Casella, 2001 ). In Sections 3.3 and 3.4 , we discuss the modeling of degradation data under two different scenarios.
In the first scenario, the degradation levels before and after the inspection are observable, so that all the parameters in the model can be estimated through a complete likelihood. In the second, only the degradation levels before the inspection can be obtained for inference, in which case parameters u and v are given a priori to facilitate the estimation of the other unknown parameters.
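Returning to the remark above: when pilot measurements of the reductive proportion $z$ are available at $t = 0$, a beta distribution can be fitted to them. A minimal method-of-moments sketch, using simulated stand-in data in place of real pilot measurements:

```python
import numpy as np

def beta_moment_fit(z):
    """Method-of-moments estimates (u0, v0) of the beta shape parameters
    from observed reduction proportions z in (0, 1)."""
    m, s2 = np.mean(z), np.var(z)
    common = m * (1.0 - m) / s2 - 1.0   # requires s2 < m * (1 - m)
    return m * common, (1.0 - m) * common

rng = np.random.default_rng(7)
z_obs = rng.beta(2.0, 3.0, size=5000)   # stand-in for pilot measurements at t = 0
u0, v0 = beta_moment_fit(z_obs)
```

Maximum likelihood fitting (e.g., `scipy.stats.beta.fit` with `floc=0, fscale=1`) is an alternative when the sample is small or skewed.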

Inspection effect modeling with complete observations
As a preliminary for the following analyses, this part establishes the data-modeling framework via maximum likelihood estimation (MLE). Under the current model setting, the unknown parameters can be summarized by
$$\boldsymbol{\theta}_c = \left( f_{acc}(x_1), \dots, f_{acc}(x_M), \omega, \sigma^2, u, v \right).$$
The complete log-likelihood function of $\boldsymbol{\theta}_c$ under data $(\mathbf{y}^{-}, \mathbf{z})$ can be represented by
$$\ell_c(\boldsymbol{\theta}_c) = \sum_{i,j,k} \log \phi\!\left( y_{ijk};\, \mu_{ijk}\Delta\tau_{ijk},\, \sigma^2 \Delta\tau_{ijk} \right) + \sum_{i,j,k} \log h(z_{ijk}; u, v), \tag{5}$$
where $\phi(\cdot\,; m, s^2)$ denotes the normal density with mean $m$ and variance $s^2$. Note again that we assume the $z_{ijk}$'s are mutually independent and also independent of $\mathbf{y}$ and the environmental factors. If the effect of degradation reduction can be observed, the problem is simplified to a standard MLE problem with all observations available in (5). Further, since parameters $u$ and $v$ are independent of the other parameters in the model, the likelihood involving $u$ and $v$ can be maximized independently via the observations $z_{ijk}$. The remaining part of the likelihood can also be maximized by numerical methods.
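A minimal numerical sketch of this complete-data MLE, for a single stress level with unit inspection spacing and hypothetical parameter values: the Gaussian part of the likelihood is maximized numerically, while u and v are fitted separately from the observed z's.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import beta, norm

rng = np.random.default_rng(1)
N, K = 200, 5                           # units, inspections (unit time steps)
true_f, true_w, true_s = 2.0, 0.5, 0.3  # baseline rate, rate elevation, diffusion
true_u, true_v = 2.0, 3.0               # beta shape parameters of z

# Simulate complete data (y^- and z both observed) at a single stress level.
Y = np.zeros((N, K))                    # pre-inspection levels y^-
Z = rng.beta(true_u, true_v, (N, K))    # observed reduction proportions
y_post = np.zeros(N)
for k in range(K):
    mu_k = true_f + true_w * k          # rate after k earlier inspections
    Y[:, k] = y_post + rng.normal(mu_k, true_s, N)
    y_post = (1.0 - Z[:, k]) * Y[:, k]

def neg_loglik(par):
    """Negative Gaussian part of the complete log-likelihood."""
    f, w, log_s = par
    s = np.exp(log_s)                   # log-scale parameterization keeps sigma > 0
    prev = np.hstack([np.zeros((N, 1)), (1.0 - Z[:, :-1]) * Y[:, :-1]])
    incr = Y - prev                     # conditional Wiener increments
    mu = f + w * np.arange(K)
    return -norm.logpdf(incr, mu, s).sum()

res = minimize(neg_loglik, x0=[1.0, 0.0, 0.0], method="Nelder-Mead",
               options={"xatol": 1e-6, "fatol": 1e-6, "maxiter": 2000})
f_hat, w_hat, s_hat = res.x[0], res.x[1], np.exp(res.x[2])
# u and v are maximized separately on the observed z's alone
u_hat, v_hat, _, _ = beta.fit(Z.ravel(), floc=0, fscale=1)
```

The separability of the beta part mirrors the statement above: the normal and beta factors of the likelihood share no parameters, so each can be maximized on its own.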

Inspection effect modeling with hidden effect observations
When $y^{+}_{ijk}$ is unobservable, the effect of degradation reduction cannot be captured, which makes $u$ and $v$ in the model inestimable. This kind of information can nevertheless be quantified by a beta distribution as described in (4). As discussed before, we assume that the pilot distribution of $z$ is known and characterized by shape parameters $u_0$ and $v_0$. Preserving the independence of the $z_{ijk}$'s, the following log-likelihood is to be maximized to estimate $\boldsymbol{\theta} = \left( f_{acc}(x_1), \dots, f_{acc}(x_M), \omega, \sigma^2 \right)$:
$$\ell(\boldsymbol{\theta}) = \sum_{i,j,k} \log \int_0^1 \phi\!\left( y^{-}_{ijk} - (1-z)\, y^{-}_{ij(k-1)};\; \mu_{ijk}\Delta\tau_{ijk},\; \sigma^2 \Delta\tau_{ijk} \right) h(z; u_0, v_0)\, \mathrm{d}z. \tag{6}$$
Due to the absence of the $z_{ijk}$'s, the log-likelihood cannot be maximized in its current form. Alternatively, we resort to the expectation–maximization (EM) algorithm to obtain the parameter estimates.
The EM algorithm is an iterative method to find the MLE for statistical models with latent variables. Following its first introduction by Dempster, Laird, and Rubin (1977) , the EM algorithm has been studied extensively from both theoretical and practical perspectives ( McLachlan & Krishnan, 2007 ). Specifically, in reliability engineering, the EM algorithm is commonly used to capture latent random effects in life and degradation models ( Chen & Ye, 2017; Duan & Wang, 2018 ). Apart from the EM algorithm, the hidden semi-Markov model was used for health diagnosis and prognosis with latent effects in Dong and He (2007) . A two-stage approach was proposed in Lee, Hu, and Tang (2017) to estimate the model from time-censored ADT data. The Kalman filtering technique has also been prevailingly employed in remaining life estimation based on degradation models ( Si, Wang, Hu, & Zhou, 2014 ). The EM algorithm consists of two steps: (1) the E-step, in which the conditional expectation of the complete log-likelihood given the incomplete data is computed; (2) the M-step, in which the expected log-likelihood is maximized to update the parameter estimates at the current iteration. Denoting the conditional expectation of the complete log-likelihood at iteration $n$ by $Q(\boldsymbol{\theta} \mid \boldsymbol{\theta}^{(n)})$, we have
$$Q(\boldsymbol{\theta} \mid \boldsymbol{\theta}^{(n)}) = E_{\mathbf{z}}\!\left[ \ell_c(\boldsymbol{\theta}) \mid \mathbf{y}^{-}, \boldsymbol{\theta}^{(n)} \right].$$
In the expectation step, we need to compute $E\!\left( z_{ij(k-1)} \mid \mathbf{y}^{-}, \boldsymbol{\theta}^{(n)} \right)$ and $E\!\left( z^2_{ij(k-1)} \mid \mathbf{y}^{-}, \boldsymbol{\theta}^{(n)} \right)$. Unfortunately, the conditional distribution of $z_{ij(k-1)}$ cannot be identified as belonging to a known distribution family. We therefore resort to numerical methods to evaluate the expected values with respect to $z_{ij(k-1)}$; related details are given in Appendix B .
Next, the first-order derivatives of $Q(\boldsymbol{\theta} \mid \boldsymbol{\theta}^{(n)})$ are provided for its maximization. Following the previous analyses, we again treat the $f_{acc}(x_i)$ as unknown parameters. Solving $\partial Q(\boldsymbol{\theta} \mid \boldsymbol{\theta}^{(n)}) / \partial f_{acc}(x_i) = 0$ gives the estimating equation (9), whose solution leads to (11). Plugging Eq. (9) into Eq. (11) yields the updating equations for $\omega$ and $f_{acc}(x_i; \boldsymbol{\delta})$, and the update of $\sigma^2$ is easily obtained as the root of the remaining score equation in (15). Through (9)–(15), the current iteration of the EM algorithm is realized; the iterations continue until the parameter estimates converge.
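The E-step quantities $E(z \mid \mathbf{y}^{-})$ and $E(z^2 \mid \mathbf{y}^{-})$ described above can be approximated by one-dimensional quadrature, since each $z_{ij(k-1)}$ enters the likelihood only through a single Gaussian increment. A sketch for one inspection epoch, assuming a unit time step and hypothetical values:

```python
import numpy as np
from scipy.stats import beta, norm

def z_posterior_moments(y_prev, y_curr, mu, sigma, u0, v0, ngrid=400):
    """E[z | y^-] and E[z^2 | y^-] for one inspection, by midpoint quadrature.

    z affects the likelihood only through the increment
    y_curr - (1 - z) * y_prev ~ N(mu, sigma^2)  (unit time step assumed),
    so the conditional density is prop. to beta(z; u0, v0) times that normal pdf.
    """
    z = (np.arange(ngrid) + 0.5) / ngrid                   # midpoint grid on (0, 1)
    w = beta.pdf(z, u0, v0) * norm.pdf(y_curr - (1.0 - z) * y_prev, mu, sigma)
    w /= w.sum()                                           # normalize the posterior weights
    return (w * z).sum(), (w * z**2).sum()

m1, m2 = z_posterior_moments(y_prev=2.0, y_curr=3.5, mu=2.5, sigma=0.3,
                             u0=2.0, v0=3.0)
```

As a sanity check, an extremely diffuse likelihood (huge `sigma`) returns essentially the beta prior mean $u_0/(u_0+v_0)$.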

Initial estimates and termination of iterations
To start the aforementioned EM algorithm, initial estimates $\boldsymbol{\theta}^{(0)}$ are needed, and the convergence speed of the algorithm hinges on their selection. Here, we utilize the mean of $z_{ijk}$ to approximately obtain $\boldsymbol{\theta}^{(0)}$. Clearly, $E(z_{ijk}) = u_0 / (u_0 + v_0)$. Replacing the unobservable $z_{ij(k-1)}$ by $E(z_{ijk})$, we have
$$\tilde{y}_{ijk} = y^{-}_{ijk} - \left( 1 - \frac{u_0}{u_0 + v_0} \right) y^{-}_{ij(k-1)},$$
where the left-hand side is completely observable; after a routine MLE manipulation, $\boldsymbol{\theta}^{(0)}$ is obtained. Another issue of the EM algorithm is when to terminate the iterations, which poses a tradeoff between estimation precision and computational efficiency. A common criterion is to terminate the iterations when the proportional changes in the absolute values of the parameter estimates fall below critical values $\boldsymbol{\varepsilon}$; it is a vector because different parameters may have different critical values. For the problem described in this paper, parameters play different roles depending on how decision makers utilize the estimates. For example, in terms of life prediction, the $f_{acc}(x_i)$'s are important for extrapolating the degradation rate to normal usage conditions, especially in the presence of condition fluctuations, while $\omega$ is more useful if the device is frequently inspected. Regarding these reliability concerns, $\boldsymbol{\varepsilon}$ can be chosen to satisfy the required estimation accuracy.
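A sketch of the two ingredients above — the observable pseudo-increments obtained by plugging in $E(z) = u_0/(u_0+v_0)$, and a proportional-change stopping rule — with hypothetical numbers:

```python
import numpy as np

def converged(theta_new, theta_old, eps):
    """Stop when the proportional change of every parameter estimate falls
    below its (possibly parameter-specific) critical value eps."""
    rel = np.abs(theta_new - theta_old) / np.maximum(np.abs(theta_old), 1e-12)
    return bool(np.all(rel < eps))

# Initial guess: replace each unobservable z by its prior mean u0/(u0 + v0),
# which makes the pseudo-increments fully observable.
u0, v0 = 2.0, 3.0
y_minus = np.array([2.1, 3.8, 5.9, 7.6])           # hypothetical y^- for one unit
z_bar = u0 / (u0 + v0)
prev = np.concatenate(([0.0], (1.0 - z_bar) * y_minus[:-1]))
pseudo_incr = y_minus - prev                        # observable surrogate increments
mu0 = pseudo_incr.mean()                            # crude starting degradation rate
```

Passing a vector `eps` implements the per-parameter critical values discussed above.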

A hierarchical analysis to estimate f acc ( x i )
In the aforementioned analyses, $f_{acc}(x_i)$, $i = 1, \dots, M$, are treated as unknown parameters for estimation. As one of the main objectives of the study, the estimation of the degradation rate under normal usage conditions is realized by a hierarchical method. Due to the possible complexity of $f_{acc}(x_i)$, it could be onerous to derive analytical iterative solutions for its parameters within the EM algorithm. In view of this, the hierarchical method provides reasonable estimation while keeping the mathematical derivations in the paper directly adaptable to various scenarios. Liu, Liu, and Xie (2015) reported a method to conduct meta-analysis of independent studies via a confidence density (CD) approach. In this paper, we employ a revised version of the confidence density to estimate the parameters $\boldsymbol{\delta}$ in $f_{acc}(x_i)$, and we write $f_{acc}(x_i; \boldsymbol{\delta})$ in the following context. As discussed in Appendix A , the form of $f_{acc}(x_i; \boldsymbol{\delta})$ depends on certain known physical mechanisms. To ensure that $f_{acc}(x_i; \boldsymbol{\delta})$ is greater than zero, we take its natural logarithm; by the invariance property of the MLE, the MLE of $\log f_{acc}(x_i; \boldsymbol{\delta})$ is readily given by $\log \hat{f}_{acc}(x_i; \boldsymbol{\delta})$. A three-times differentiable mapping function $\mathbf{M} = (M_1, \dots, M_M)$ is used to link $\log f_{acc}(x_i; \boldsymbol{\delta})$ and the unknown parameter vector $\boldsymbol{\delta}$, where $\mathbf{a}$ is an $M$-dimensional column vector with elements $a_i = \log \hat{f}_{acc}(x_i)$. By maximizing the confidence density in (18), we obtain the CD point estimator of $\boldsymbol{\delta}$, denoted by $\hat{\boldsymbol{\delta}}_{CD}$. The reason for using CD estimation is twofold. First, as mentioned previously, it facilitates the derivation of closed-form estimators under the EM framework. Second, the hierarchical operations do not compromise the efficiency of estimation. The following analyses are presented to justify the latter statement.
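Before those asymptotic analyses, here is an illustration of the CD point estimation, assuming for concreteness a Peck-type form $f_{acc} = A \cdot RH^{\,n} \cdot \exp(-E_a/(k_B T_K))$ (the exact form used in Appendix A may differ). Then $\log f_{acc}$ is linear in $\boldsymbol{\delta} = (\log A, n, E_a)$, and maximizing a Gaussian confidence density reduces to generalized least squares on the stage-one estimates $\mathbf{a}$; all numerical inputs below are hypothetical.

```python
import numpy as np

KB = 8.617e-5                          # Boltzmann constant in eV/K

def log_peck(delta, TK, RH):
    """log f_acc under an assumed Peck model A * RH^n * exp(-Ea / (KB * TK))."""
    logA, n, Ea = delta
    return logA + n * np.log(RH) - Ea / (KB * TK)

# Stress levels (temperature in Kelvin, relative humidity as a fraction)
TK = np.array([333.15, 348.15, 363.15, 333.15, 348.15, 363.15])
RH = np.array([0.70, 0.70, 0.70, 0.85, 0.85, 0.85])

# Hypothetical stage-one estimates a_i = log f_acc_hat(x_i), with small noise
delta_true = np.array([5.0, 2.0, 0.6])
rng = np.random.default_rng(3)
a = log_peck(delta_true, TK, RH) + 0.002 * rng.normal(size=TK.size)
V_inv = np.eye(TK.size)                # inverse asymptotic covariance of a (assumed)

# Because log f_acc is linear in (logA, n, Ea), maximizing the Gaussian
# confidence density is equivalent to generalized least squares.
X = np.column_stack([np.ones(TK.size), np.log(RH), -1.0 / (KB * TK)])
delta_cd = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ a)
```

For a nonlinear mapping $\mathbf{M}(\boldsymbol{\delta})$, the same criterion would be minimized numerically instead of solved in closed form.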
Under a conventional MLE framework, $\boldsymbol{\delta}$ could instead be used directly to maximize the likelihood function in (6); we denote the resulting estimator by $\hat{\boldsymbol{\delta}}_{DIR}$. Lemma 1 characterizes its asymptotic behavior, where $\tilde{\mathbf{I}}(\boldsymbol{\zeta}) = \mathbf{V}^{-1}$ and $\mathbf{J}(\boldsymbol{\delta}) = \partial \mathbf{M}(\boldsymbol{\delta}) / \partial \boldsymbol{\delta}$ is the Jacobian of $\mathbf{M}$ with respect to $\boldsymbol{\delta}$.
The proofs of the lemma and the following results are provided in Appendix D . Lemma 1 implies the asymptotic properties of the direct estimator $\hat{\boldsymbol{\delta}}_{DIR}$ via the delta method. Next, we focus on the CD estimator $\hat{\boldsymbol{\delta}}_{CD}$ with the following lemma.
Lemma 2. The first-order derivative of the log-confidence density function $\log h(\boldsymbol{\zeta}) \equiv \log h(\boldsymbol{\delta})$ with respect to $\boldsymbol{\zeta}$ is asymptotically equivalent to the score function $s(\boldsymbol{\delta}) = \partial \log L(\boldsymbol{\delta}) / \partial \boldsymbol{\delta}$.
According to Lemma 2 , the CD estimator ˆ δ CD and direct estimator ˆ δ DIR share exactly identical asymptotic properties. Therefore, we can introduce Theorem 1 in analogy to Lemma 1 immediately following Lemma 2 .
The statements in the theorem show that the CD approach is asymptotically as efficient as the direct estimation approach. To quantify the uncertainty in the CD estimators, we put forward Corollary 1 based on Lemma 2 and Theorem 1 .
Remark. On the one hand, the confidence density based methods can address the aforementioned problem to hierarchically estimate parameters without imposing difficulties in the EM algorithm. On the other hand, as was advised in Liu et al. (2015) , the information from independent studies can be well combined via the confidence density. In the problem we have been focusing on in the paper, the proposed hierarchical analysis can be applied to degradation tests under different acceleration functions and finally yield integrated results for the parameters of interest, which could be a subset or transformation of parameters that are already involved in those tests.
In light of the previous analyses on hierarchical estimation, we have justified the efficiency of the CD approach, and the estimate of $\boldsymbol{\delta}$ is readily obtained via (19). Due to the nonlinear and non-additive structure of the Peck model, it is not straightforward to compare the effects of the environmental factors under usage environments, which can be of great interest to reliability engineers and decision makers. The following proposition, given under the Peck model, compares the effects of a single-unit change in temperature and in relative humidity.

Proposition 1. (Intuitive comparison of effects under the Peck model)
The baseline degradation rate at the reference environment and is more sensitive to relative humidity otherwise.
The proposition can be justified straightforwardly by taking the first-order derivatives of $f_{acc}(\mathbf{x}; \boldsymbol{\delta})$ with respect to $T_K$ and $RH$; the proof is thus omitted. The proposition will be used for illustration in Section 6.2 . If acceleration models other than the Peck model are used, similar statements can be derived for effect comparison.

Uncertainty quantification of the estimated parameters
Quantifying the uncertainty in the parameter estimators is vital to justify their adoption in knowledge creation and decision making. Compared to point estimation, interval estimation is usually preferable in real problems. In this section, we discuss a large-sample method to construct confidence intervals for the estimated parameters.
Large-sample approximations are commonly utilized to provide the asymptotic covariance matrix of parameter estimators, from which confidence intervals are obtained. For complete datasets, a routine practice is to compute the Fisher information directly from the log-likelihood; the elements of the Fisher information matrix $\mathbf{I}(\boldsymbol{\theta})$ are given by
$$\left[ \mathbf{I}(\boldsymbol{\theta}) \right]_{st} = -E\!\left[ \frac{\partial^2 \ell(\boldsymbol{\theta})}{\partial \theta_s\, \partial \theta_t} \right].$$
Asymptotically, $\hat{\boldsymbol{\theta}}$ follows a multivariate normal distribution, i.e., $\hat{\boldsymbol{\theta}} \sim N\!\left( \boldsymbol{\theta}, \mathbf{I}(\boldsymbol{\theta})^{-1} \right)$.
Since $\boldsymbol{\theta}$ is itself unknown, we can alternatively employ the observed information $\mathbf{I}(\hat{\boldsymbol{\theta}})$ to compute the asymptotic covariance matrix of $\hat{\boldsymbol{\theta}}$.
For incomplete data sets with hidden observations as discussed in Section 3.4 , we adopt the approach of Oakes (1999) , in which the Fisher information can be calculated directly from the function $Q$. Accordingly, the observed Fisher information can be computed by
$$\mathbf{I}(\hat{\boldsymbol{\theta}}; \mathbf{y}^{-}) = -\left. \left[ \frac{\partial^2 Q(\boldsymbol{\theta} \mid \boldsymbol{\theta}^{(n)})}{\partial \boldsymbol{\theta}\, \partial \boldsymbol{\theta}^{\top}} + \frac{\partial^2 Q(\boldsymbol{\theta} \mid \boldsymbol{\theta}^{(n)})}{\partial \boldsymbol{\theta}\, \partial \boldsymbol{\theta}^{(n)\top}} \right] \right|_{\boldsymbol{\theta} = \boldsymbol{\theta}^{(n)} = \hat{\boldsymbol{\theta}}}. \tag{25}$$
Note that the second term on the right-hand side can be viewed as the missing information due to the absence of $\mathbf{z}$. Likewise, asymptotic confidence intervals for the unknown parameters can be constructed from the Fisher information. The derivation of (25) requires manipulations based on Appendix B , and the analytical details of (25) are given in Appendix E . As a side note, for the parameter $\sigma^2$, normal-approximation confidence intervals are usually inappropriate. Instead, we build confidence intervals for $\log \sigma^2$ via the delta method and then transform them back by exponentiation to quantify the uncertainty in $\sigma^2$. More approaches for uncertainty quantification based on the EM algorithm can be found in Louis (1982) and Meng and Rubin (1991) .
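The log-transformation trick for $\sigma^2$ can be sketched as follows; the variance fed in is hypothetical and would in practice come from the inverse observed Fisher information in (25).

```python
import numpy as np
from scipy.stats import norm

def ci_log_sigma2(sigma2_hat, var_sigma2_hat, alpha=0.05):
    """CI for sigma^2 built on the log scale and back-transformed.

    var_sigma2_hat is the asymptotic variance of sigma2_hat. By the delta
    method, Var(log sigma2_hat) ~= var_sigma2_hat / sigma2_hat**2.
    """
    z = norm.ppf(1.0 - alpha / 2.0)
    se_log = np.sqrt(var_sigma2_hat) / sigma2_hat
    lo = np.log(sigma2_hat) - z * se_log
    hi = np.log(sigma2_hat) + z * se_log
    return np.exp(lo), np.exp(hi)       # exponentiate back to the sigma^2 scale

lo, hi = ci_log_sigma2(0.09, 0.0004)
```

The resulting interval is asymmetric around the point estimate and always positive, unlike a naive normal interval on $\sigma^2$ itself.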

Simulation study
To facilitate the simulation, we need to specify an experimental design for the degradation test. Letting $N = \sum_i N_i$ be the total number of test units and $\pi_i = N_i / N$ the proportion of test units allocated to stress $i$, we suppose that $\boldsymbol{\pi} = (\pi_1, \dots, \pi_M)$ is a pre-specified experimental design for the simulation study. Without loss of generality, the test is assumed to vary temperature and humidity under the Peck model introduced in Appendix A . Additionally, three levels are specified for each factor and we follow a 4:2:1 allocation rule per factor ( Meeker & Hahn, 1977 ). To be specific, the test plan allocates more test units to lower stress levels to avoid excessive extrapolation. The detailed plan is shown in Table 1 .
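As a side note, applying the 4:2:1 rule independently to the two three-level factors gives joint allocation weights whose total is 49, which is consistent with the sample sizes $N = 49$, 98, 147 used below. A small sketch:

```python
import numpy as np

# 4:2:1 allocation over three levels of each of two factors; the joint
# allocation is the outer product of the per-factor ratios.
ratio = np.array([4, 2, 1])
joint = np.outer(ratio, ratio)          # 3 x 3 table of allocation weights
pi = joint / joint.sum()                # experimental design proportions pi_i
N = 49
Ni = pi * N                             # units per stress combination
```

With $N = 49$ the allocations are exactly the integer weights (16 units at the lowest joint stress, 1 at the highest); multiples of 49 keep them integer as well.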
Parameters are set as shown in Table 2 for the purpose of illustration. As a side note, the reference temperature and relative humidity are fixed at $T_{K,\mathrm{ref}} = 293.15$ Kelvin (20 degrees Celsius) and $RH_{\mathrm{ref}} = 50\%$, respectively.
To explore the effect of sample size, we show results under $N = 49$, 98 and 147 in the simulation studies. Test units are assumed to be inspected three times. The following four sub-studies constitute this section and explore the effectiveness of the proposed model.

Estimates from the EM algorithm and confidence intervals
First, point estimates are obtained from the EM algorithm under 1000 simulation replicates for each sample size of interest. The convergence criterion is set to $\varepsilon = 0.001$ (1‰) for all parameters.
In Table 3 , the mean bias and root mean squared error (RMSE) are shown under $N = 49$, 98 and 147. The results indicate that the EM algorithm estimates the unknown parameters accurately, with rather low mean bias and RMSE even under moderate sample sizes. Moreover, the estimation accuracy improves as the sample size increases. Specifically, compared to the other parameters, the estimation accuracy of $\sigma^2$ is relatively low but improves drastically with the sample size. A possible reason is that the estimation of $\sigma^2$ involves both $E(z_{ij(k-1)} \mid \mathbf{y}^{-}, \boldsymbol{\theta}^{(n)})$ and $E(z^2_{ij(k-1)} \mid \mathbf{y}^{-}, \boldsymbol{\theta}^{(n)})$, as indicated in (15), so that more uncertainty from the hidden variables enters the estimator. Further, with the point estimates at hand, confidence intervals are constructed via the method proposed in Section 4 . Specifically, the $(1-\alpha) \times 100\%$ confidence interval is given by
$$\hat{\theta}_i \pm z_{\alpha/2} \sqrt{ \left[ \mathbf{I}(\hat{\boldsymbol{\theta}})^{-1} \right]_{ii} },$$
where $[\,\cdot\,]_{ii}$ denotes the $i$th diagonal element of a matrix and $z_{\alpha/2}$ is the $1 - \alpha/2$ quantile of the standard normal distribution. In Table 4 , the coverage probabilities and average lengths of the 95% confidence intervals under the large-sample approximation are listed for the three sample sizes. As the sample size increases, the coverage probability approaches 0.95 and the average length shortens. Even under a sample of 49, the confidence intervals perform well: for most parameters, over 90% of them cover the true values. Again, influenced by the relatively large bias, the coverage probability for $\sigma^2$ is moderately lower under small sample sizes. Recall that we use $\log \sigma^2$ to construct the confidence intervals for $\sigma^2$; this transformation demonstrably improves performance. For instance, under $N = 49$ it yields a coverage probability of 0.841, compared to 0.750 when $\sigma^2$ is used directly to quantify the uncertainty.

Hierarchical analysis
We now consider the estimation of δ in the Peck model by means of the proposed hierarchical analysis. The performance of point estimation and uncertainty quantification is reported in Tables 5 and 6, respectively. The results show good performance, with low mean bias and RMSE for the point estimators as well as coverage probabilities close to 0.95. It is worth noting that the hierarchical analysis consumes limited computational effort. For example, point estimation together with uncertainty quantification under N = 147 takes less than one second on a single Intel i5 core. If the hierarchical analysis were not employed, the complexity of f_acc(·) would hinder the derivation of analytical results in the EM algorithm, which could introduce an enormous computational burden.

Sensitivity analysis with respect to the degradation reduction effect
As discussed in Section 3, the choice of u₀ and v₀ relies on the experience of engineers. Misspecification of the distribution for z₀ may occur and lead to higher bias in the parameter estimation. Here, by assuming the true values of u₀ and v₀ to be 2 and 3, respectively, we vary the assumed values to explore the influence of misspecification under sample size N = 49. Note that E(z₀) = 0.4 and var(z₀) = 0.04 hold under the true values. In Table 7, we list the bias and RMSE under four settings of misspecified u₀ and v₀. As seen from the results, misspecifying the mean of z₀ brings considerable bias to the estimators of f_acc(x_i) and ω, while the estimation of σ² suffers more when the variance is misspecified. For extrapolation based on the ADT data, f_acc(x_i) and ω play the more important roles; engineers should therefore focus on evaluating the mean degradation reduction effect to obtain better estimation accuracy. To explore the influence of misspecification on the parameters of the Peck model, we carry out the hierarchical method to estimate δ and report the mean bias and RMSE in Table 8. The estimated δ suffers considerable bias when the mean of z₀ is misspecified, while a misspecified variance exerts a relatively smaller influence on the estimation accuracy.
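The moments quoted above follow directly from the beta distribution; the sketch below translates (u₀, v₀) choices into the implied mean and variance of the reduction effect, which is how a misspecified setting can be diagnosed.

```python
def beta_moments(u, v):
    """Mean and variance of a Beta(u, v) random variable, used here to map
    an assumed (u0, v0) pair to the implied reduction-effect moments."""
    mean = u / (u + v)
    var = u * v / ((u + v) ** 2 * (u + v + 1))
    return mean, var

# True simulation setting: u0 = 2, v0 = 3 gives E(z0) = 0.4, var(z0) = 0.04.
print(beta_moments(2, 3))  # (0.4, 0.04)
```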

Data from Schneider Electric
Schneider Electric conducted a degradation test for a type of electrical distribution device. A total of 104 test units underwent the test under four different settings of temperature and humidity; specifically, two levels of temperature (313.15 and 333.15 Kelvin) were combined with two levels of relative humidity, as summarized in Table 9. Based on discussions with engineers from Schneider, we propose to model the degradation reduction effect by a beta distribution with parameters u₀ = 1 and v₀ = 3.

Model estimation
We proceed to the parameter estimation and uncertainty quantification. First, we are interested in how the parameter estimators converge through the iterations. In Fig. 3, the estimates at each iteration are plotted. We compare the results under the proposed initial estimates (see Section 3.5) against initial values of 0.1 for all parameters. As shown, the proposed scheme speeds up the convergence of the EM algorithm by selecting initial values close to the MLE. Table 10 reports the MLE, asymptotic standard deviation, and 95% confidence interval of θ (LB and UB represent the lower and upper bounds of a confidence interval, respectively). We observe that the lower bound of the 95% confidence interval of ω is positive, which implies that the effect of degradation rate elevation is significant.
Next, the hierarchical analysis is carried out and the results are shown in Table 11. It can be seen that both Ê_a and δ̂₁ are significantly positive. Thus, temperature and relative humidity both exert accelerating influences on the degradation.
At the current stage, it is not intuitively clear how the environmental factors influence the degradation rate, so we utilize Proposition 1 evaluated at the CD estimate of δ. The result implies that, under the usage environment, the baseline degradation rate is more sensitive to changes in relative humidity. From a practical point of view, it is therefore of greater importance to prevent the relative humidity from becoming overwhelmingly high.
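The exact form of f_acc(·) is given in the paper's model section; a common Peck-type link takes the degradation rate proportional to RH^δ₁ · exp(−E_a/(k_B T)). Under that form, the acceleration factor between a stress condition and the use condition can be sketched as below. The numerical values of E_a and δ₁ here are hypothetical, not the fitted Schneider estimates.

```python
import math

K_B = 8.617e-5  # Boltzmann constant in eV/K

def peck_accel_factor(T_test, RH_test, T_use, RH_use, Ea, delta1):
    """Acceleration factor of a Peck-type link rate ∝ RH^delta1 * exp(-Ea/(k_B*T))
    between a test condition and a use condition. Ea in eV, T in Kelvin,
    RH on any consistent scale (fraction or percent)."""
    temp_part = math.exp(Ea / K_B * (1.0 / T_use - 1.0 / T_test))
    rh_part = (RH_test / RH_use) ** delta1
    return temp_part * rh_part

# Hypothetical illustration: high-stress test condition vs. a mild use condition.
af = peck_accel_factor(T_test=333.15, RH_test=0.85, T_use=298.15, RH_use=0.50,
                       Ea=0.7, delta1=2.0)
```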
To demonstrate the advantages of the proposed model, we perform a comparison with the conventional ADT model that overlooks the inspection effects. Specifically, the conventional model assumes linear Wiener degradation paths. Table 12 lists the parameter estimates as well as the Bayesian information criterion (BIC) values under various combinations of assumed u and v, and under the conventional Wiener degradation model. We observe a remarkable improvement when the inspection effects are considered. Under the BIC, the advantage of the proposed model is overwhelming, so there is no further need for comparisons under criteria with a smaller penalty on additional parameters, such as the Akaike information criterion (AIC). Specifically, the estimates of σ² under the proposed models (around 0.02 to 0.03) are considerably smaller than that under the conventional model (0.052). This implies that the proposed model systematically captures additional uncertainty via the inspection effects, which the conventional model ignores.
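For reference, the BIC trades off fit against model size via a log(n) penalty, which is heavier than the AIC's penalty of 2 per parameter whenever n > e² ≈ 7.4; this is why a decisive BIC advantage makes an AIC comparison redundant. A minimal sketch with illustrative (hypothetical) likelihood values:

```python
import math

def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: lower is better. The penalty
    n_params * log(n_obs) grows with both model size and sample size."""
    return n_params * math.log(n_obs) - 2.0 * log_lik

# Hypothetical comparison: the richer model wins only if its likelihood
# gain outweighs the penalty for its two extra parameters.
bic_simple = bic(log_lik=120.0, n_params=3, n_obs=104)
bic_rich = bic(log_lik=150.0, n_params=5, n_obs=104)
```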

Analyses of reliability characteristics
The ultimate goal of degradation data modeling is to better understand the reliability and support related decision making. Towards this end, reliability engineers usually make use of the results of the estimated model to infer reliability characteristics of interest. Some commonly used characteristics include mean time to failure (MTTF) and life quantiles. By taking into account both accelerated environmental factors and inspection effects, the proposed methods can generate some interesting results that differ from those under conventional approaches.
First, we are curious about the reliability under different environmental conditions, which is one of the main purposes of the ADT. Four combinations of temperature and relative humidity are considered. An appealing property of the Wiener degradation model is that the lifetime follows an inverse Gaussian distribution, under which the reliability function is analytical. Assuming the threshold to be 0.5, the reliability curves and 80% pointwise confidence bands subject to estimation uncertainty are shown in Fig. 4. The effect of relative humidity appears more significant, which validates the statement of Proposition 1. Practically, managers should keep the device working in cool and dry environments to prolong its lifetime. Additionally, due to the extrapolation of factors in ADTs, there exists considerable variability in the reliability curves, especially under lower T_K and RH, which decision makers need to bear in mind. More conservative decisions are usually suggested in the presence of larger model uncertainty.
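The closed-form reliability mentioned above is the first-passage survival of a linear Wiener process, whose lifetime is inverse Gaussian. A sketch, with hypothetical drift and variance values standing in for the fitted parameters (the threshold 0.5 matches the text):

```python
import numpy as np
from scipy.stats import norm

def wiener_reliability(t, mu, sigma2, threshold):
    """Reliability R(t) = P(first passage > t) for a linear Wiener process
    with drift mu and variance parameter sigma2 against a fixed threshold.
    The lifetime is inverse Gaussian, so R(t) has the standard closed form
    R(t) = Phi((D - mu*t)/sqrt(sigma2*t))
           - exp(2*mu*D/sigma2) * Phi(-(D + mu*t)/sqrt(sigma2*t))."""
    t = np.asarray(t, dtype=float)
    s = np.sqrt(sigma2 * t)
    term1 = norm.cdf((threshold - mu * t) / s)
    term2 = np.exp(2.0 * mu * threshold / sigma2) * norm.cdf(-(threshold + mu * t) / s)
    return term1 - term2

# Hypothetical drifts for a milder and a harsher humidity level.
r_dry = wiener_reliability(10.0, mu=0.02, sigma2=0.03, threshold=0.5)
r_humid = wiener_reliability(10.0, mu=0.05, sigma2=0.03, threshold=0.5)
```

A higher drift (harsher environment) yields lower reliability at the same time point, matching the ordering of the curves described in the text.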
Another objective of the company is to predict the remaining useful life (RUL) for devices of different ages and inspection histories. Under the estimated model, we plot the mean RUL and the 0.1 RUL quantile under different combinations of current degradation level (y⁺) and historical number of inspections (k) in Fig. 5. In accordance with intuition, a higher y⁺ or k leads to a smaller RUL. More specifically, the mean RUL decreases in proportion to the increase in y⁺, while the influence of k is more pronounced when k is small. The behavior of the 0.1 RUL quantile is similar, except for a faster decrease in the quantile when y⁺ becomes higher. It is noteworthy that, in the presence of inspections, the reliability function in a general form over a time horizon from zero to infinity is onerous to derive due to the randomness in both the degradation levels and the inspection effects. Decision makers can instead employ the RUL distribution given the inspected degradation levels, which is tractable, to characterize the reliability and support future decisions.
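The proportionality of the mean RUL in y⁺ can be illustrated with a small sketch. The dependence of the post-inspection drift on k used below (mu0 + k*omega) is an illustrative assumption, not the paper's exact specification; under it, the remaining first-passage time is inverse Gaussian with mean (threshold − y⁺)/drift, reproducing both qualitative behaviors described above.

```python
def mean_rul(y_plus, k, mu0, omega, threshold):
    """Mean remaining useful life of a linear Wiener path currently at level
    y_plus after k inspections. The post-inspection drift mu0 + k*omega is a
    hypothetical form chosen for illustration; the mean of the inverse
    Gaussian remaining lifetime is (threshold - y_plus) / drift."""
    drift = mu0 + k * omega
    return (threshold - y_plus) / drift

# A higher current level or more past inspections shortens the mean RUL.
r1 = mean_rul(y_plus=0.1, k=1, mu0=0.02, omega=0.005, threshold=0.5)
r2 = mean_rul(y_plus=0.3, k=1, mu0=0.02, omega=0.005, threshold=0.5)
r3 = mean_rul(y_plus=0.1, k=4, mu0=0.02, omega=0.005, threshold=0.5)
```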
For a better exposition of degradation paths under different inspection intervals, we simulate and plot the expected degradation path and confidence bands in Fig. 6, using the same scale for the x- and y-axes for comparison. Clearly, more frequent inspections lead to faster degradation despite the degradation reduction effect. Interestingly, we observe that the uncertainty in the degradation paths is smallest when τ = 0.1, probably the result of a trade-off between the uncertainties of the two types of inspection effects. For the studied device, it may therefore not be wise to inspect the system too frequently. Nevertheless, the decision maker should consider the risk of unexpected failures under limited information.
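A Monte Carlo rendering of one such path can be sketched as follows. The proportional form of the random reduction and the additive drift elevation are illustrative assumptions; the paper's model section gives the exact specifications.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_path(tau, horizon, mu0, sigma2, omega, u0, v0, dt=0.01):
    """Simulate one degradation path with periodic inspections every tau
    time units. At each inspection the level drops by a random Beta(u0, v0)
    fraction (assumed proportional here, for illustration) and the drift is
    raised by omega."""
    n_steps = round(horizon / dt)
    insp_every = round(tau / dt)          # simulation steps between inspections
    mu, y = mu0, 0.0
    path = np.empty(n_steps)
    for i in range(1, n_steps + 1):
        # Euler step of the Wiener process: drift mu, variance parameter sigma2
        y += mu * dt + np.sqrt(sigma2 * dt) * rng.standard_normal()
        if i % insp_every == 0:
            y -= rng.beta(u0, v0) * max(y, 0.0)   # random degradation reduction
            mu += omega                           # elevated degradation rate
        path[i - 1] = y
    return path

path = simulate_path(tau=0.1, horizon=1.0, mu0=0.5, sigma2=0.01,
                     omega=0.1, u0=2, v0=3)
```

Averaging many such replicates over a grid of τ values reproduces the qualitative comparison in Fig. 6.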

Concluding remarks
We focus on an atypical degradation modeling problem motivated by a real example. As a first attempt to deal with partially observable data exhibiting manifold inspection effects, the study utilizes the EM algorithm to iteratively derive the parameter estimators. Further, a hierarchical confidence density approach is used to accurately estimate the acceleration model, and uncertainty quantification is carried out via the large-sample approximation. The proposed model can be widely applied to different types of ADT data. Simulation studies validate the proposed methods and show satisfactory estimation performance under different sample sizes, and the computational burden is significantly reduced by the hierarchical analysis. A case study from Schneider Electric demonstrates the advantage of the model over conventional degradation models, and a series of reliability analyses based on the estimation results are provided to explore managerial insights from practical perspectives.

The involvement of inspection effects can be further investigated for reliability modeling and management. Bayesian methods can be explored to characterize the hidden inspection effects. Dynamic approaches to system inspection and maintenance optimization based on the current model are of interest for effectively managing industrial systems. As mentioned in Section 3.6, it is also useful to combine the information from different tests, which opens up opportunities for future research. For more detailed extensions to solve real industrial problems, other degradation models, such as gamma processes and inverse Gaussian processes, as well as further models for inspection effects, can be explored.

Moreover, the first-order derivatives of f with respect to θ^{(n)} are given in (B.4).

Appendix C. Expression for initial guesses of starting points
Following the analytical results in Section 3.4, we can calculate the initial values of the parameter estimates by the following equations:

Appendix D. Technical proofs in Section 3.6

D.1. Proof of Lemma 1
In the following proof, we assume that the regularity conditions described in Appendix B of Liu et al. (2015) hold for the likelihood contributions and that ζ is identifiable in the likelihood under the current model. As a side note, we use the denominator-layout notation. Further, let L*(ζ) ≡ L(δ) under ζ = M(δ). We apply the Taylor approximation as follows: It is well known that, as n → ∞, the normalized observed information converges to Ĩ(ζ) in probability. As implied in Liu et al. (2015), the random part of the equation follows a multivariate normal (MN) distribution. This completes the proof of Lemma 1.

(D.4)
For ∂ log L(δ)/∂δ, a Taylor approximation is employed and we obtain (D.5). As seen, (D.4) and (D.5) are equal under the Taylor approximation; thus, they share the same asymptotic properties, which completes the proof.

Appendix E. The observed Fisher information from the method in Oakes (1999)
The following second-order derivatives of Q are given and used for the construction of the Fisher information in (25) :