Trend analysis of the power law process using Expectation–Maximization algorithm for data censored by inspection intervals

doi:10.1016/j.ress.2011.03.018

Reliability Engineering & System Safety

Volume 96, Issue 10, October 2011, Pages 1340-1348

https://doi.org/10.1016/j.ress.2011.03.018

Abstract

Trend analysis is a common statistical method used to investigate the operation and changes of a repairable system over time. This method takes historical failure data of a system or a group of similar systems and determines whether the recurrent failures exhibit an increasing or decreasing trend. Most trend analysis methods proposed in the literature assume that the failure times are known, so the failure data are statistically complete; however, in many situations, such as hidden failures, failure times are subject to censoring. In this paper we assume that the failure process of a group of similar independent repairable units follows a non-homogeneous Poisson process with a power law intensity function. Moreover, the failure data are subject to left, interval and right censoring. The paper proposes using the likelihood ratio test to check for trends in the failure data. It uses the Expectation–Maximization (EM) algorithm to find the parameters that maximize the data likelihood under the null and alternative hypotheses. A recursive procedure is used to solve the main technical problem of calculating the expected values in the Expectation step. The proposed method is applied to a hospital's maintenance data for trend analysis of the components of a general infusion pump.

Introduction

Trend analysis is a common statistical method used to investigate the operation and changes in a repairable system over time. This method takes historical failure data of a system or a group of similar systems and determines whether the recurrent failures exhibit an increasing or decreasing trend. Both graphical methods and trend tests are used for trend analysis. The latter are statistical tests for the null hypothesis that the failure process follows a homogeneous Poisson process (HPP) [1].

The Crow/AMSAA test [2], [3] assumes that the failure process of a repairable system has a Weibull intensity function and finds the maximum likelihood estimates of the parameters. The shape parameter is then used as an indicator of growth or deterioration in the system reliability. The Laplace and the Military Handbook tests [4] are used to test whether data follow an HPP. Kvaloy and Lindqvist [5] propose the Anderson–Darling trend test for an NHPP with a bathtub-shaped intensity function, based on the Anderson–Darling test statistic [6]. The Lewis–Robinson test [7] is a modification of the Laplace test and is used as a general test to detect trend departures from a general renewal process [8]. The Mann test [9] corresponds to a renewal process as the null hypothesis and a monotonic trend as the alternative. Kvaloy et al. [10] state that the Mann test is a more powerful test against a decreasing trend. The MIL-HDBK-189 test [11], [12] is used for testing an NHPP with a power law intensity function. Caroni [13] modifies the trend tests introduced by Kvaloy et al. [10] for data that end with the last of a random number of failures within a predetermined observation period. Louit et al. [14] review several tests available to assess the existence of trends, and propose a practical procedure to discriminate between the use of statistical distributions to represent the time to failure. Regattieri et al. [15] introduce a framework defining a general approach for failure process modeling, and demonstrate the application of the proposed framework in a light commercial vehicle manufacturing system.
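To make the two classical tests mentioned above concrete, the following is a minimal sketch of the Laplace and Military Handbook statistics for a single system with complete (uncensored) failure times observed on (0, T]. It uses the standard textbook forms of these statistics for time-truncated data, not the censored-data method of this paper; the function names and the example data are hypothetical.

```python
import math

def laplace_statistic(times, T):
    """Laplace statistic for failure times observed on (0, T].

    Approximately standard normal under an HPP; positive values suggest
    that failures cluster late (increasing intensity).
    """
    n = len(times)
    return (sum(times) - n * T / 2.0) / (T * math.sqrt(n / 12.0))

def mil_hdbk_statistic(times, T):
    """Military Handbook statistic 2 * sum(ln(T / t_i)).

    Chi-square with 2n degrees of freedom under an HPP; small values
    suggest an increasing intensity.
    """
    return 2.0 * sum(math.log(T / t) for t in times)

def chi2_sf_even(x, df):
    """P(X > x) for a chi-square variable with an even number of df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k          # (x/2)^k / k!
        total += term
    return math.exp(-half) * total

def normal_two_sided_p(z):
    """Two-sided p-value for a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical data: five failures clustered near the end of (0, 100].
times, T = [60.0, 70.0, 80.0, 90.0, 95.0], 100.0
u = laplace_statistic(times, T)
z = mil_hdbk_statistic(times, T)
print("Laplace U:", u, "p:", normal_two_sided_p(u))
print("MIL-HDBK Z:", z, "lower-tail p:", 1.0 - chi2_sf_even(z, 2 * len(times)))
```

Both statistics require exact failure times, which is the limitation that motivates the censored-data approach of this paper.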

Most trend tests [4], [16] assume that failure times are known, so the failure data are complete. Currently in the literature, except for right censoring, there is no available method for estimating the parameters of a non-homogeneous Poisson process (NHPP) that incorporates left and interval censored failure data if repairs are not instantaneous or not performed immediately. However, data received from industry can be expected to include missing and incomplete information. Sometimes a particular type of data of interest is not measured at all, or if measured may be incomplete or unreliable. For example, hidden failures, which make up as much as 40% of all failure modes of a complex industrial system [17], are not evident to operators and remain dormant until they are rectified at scheduled inspections. In this case, the times of hidden failures are either left or interval censored, and the censoring interval is the interval between two consecutive inspections.

Weibull or power law [2] and log linear [18] intensity functions are two common models used to describe recurrent event data underlying an NHPP. The maximum likelihood estimates of the parameters are obtained for two intensity parameters [12], [19]. In this paper, we assume that the failure process follows an NHPP with a power law intensity function. We use the Expectation–Maximization (EM) algorithm to estimate the parameters of the power law process and propose a recursive method to calculate the expected values in the EM Expectation step. In the Maximization step, we use the Newton–Raphson method. We then apply the general likelihood ratio test to test the HPP against the NHPP alternative [12]. In some practical situations, the mid-points of the censoring intervals may be considered as the failure times and used to estimate the parameters of the failure process and to perform trend analysis. If the inspection intervals are not short, this approximation of the actual failure times can be inaccurate.
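The mid-point approximation described above can be sketched as follows. This is only an illustration under stated assumptions: the intensity is parameterized as λ(t) = αβt^(β−1), which is a common form of the power law process but is not given in this excerpt, and the closed-form estimates are the standard complete-data ones for a single system observed on (0, T]. It is not the EM procedure proposed in the paper, and the function names and inspection intervals are hypothetical.

```python
import math

def power_law_mle(times, T):
    """Complete-data MLEs for a power law process observed on (0, T].

    Assumes the intensity alpha * beta * t**(beta - 1) (an assumed
    parameterization), so that beta = 1 corresponds to an HPP.
    """
    n = len(times)
    beta = n / sum(math.log(T / t) for t in times)
    alpha = n / T ** beta
    return alpha, beta

def midpoints(intervals):
    """Replace each censoring interval (lower, upper) by its mid-point."""
    return [(lo + hi) / 2.0 for lo, hi in intervals]

# Hypothetical inspection intervals in which hidden failures were detected;
# the first one is left censored (failure some time before the inspection at 30).
intervals = [(0.0, 30.0), (30.0, 60.0), (60.0, 90.0), (90.0, 120.0)]
T = 120.0
alpha_hat, beta_hat = power_law_mle(midpoints(intervals), T)
print("alpha:", alpha_hat, "beta:", beta_hat)
```

With wide inspection intervals such as these, the true failure times may lie far from the mid-points, so the resulting estimate of β can be misleading.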

A detailed problem description is given in Section 2. Section 3 describes the EM algorithm and proposed trend test method, while Section 4 includes numerical examples using adapted data from the case study [20]. Section 5 presents our conclusions.

Section snippets

Problem definition

Consider a repairable system or a unit subject to censored failure times. For example, an audible component in an infusion pump in a hospital is used to communicate with operators and inform them of the status of the patient to whom the device is attached. When the level of liquid delivered to a patient is reduced to a certain level, the audible component starts sending warning alarms. If the component fails, the pump can still function, but the patient's health risk increases if the operator

Trend test

We propose to use the likelihood ratio test [19], [21] to test for possible trend in failure times, with null (H₀) and alternative (H₁) hypotheses, as follows:

H₀: homogeneous Poisson process (β=1) (no trend)
H₁: non-homogeneous Poisson process (β≠1) (trend)

Let L(α,β) be the likelihood of the observed data, which is given in detail by Eq. (4). Under H₀, parameter α is estimated by maximum likelihood assuming that β=1. Then let $L_{0} = L(\hat{\alpha}, \beta = 1)$.

Under H₁, both parameters are estimated by
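As an illustration of the likelihood ratio test in the simplest setting, the sketch below handles a single system with complete (uncensored) failure times observed on (0, T]. It assumes the intensity αβt^(β−1) and the standard statistic −2 ln(L₀/L₁), compared with a chi-square distribution with one degree of freedom; neither the parameterization nor the exact form of the statistic appears in this excerpt, and the censored-data likelihood of Eq. (4) with its EM estimation is not reproduced here. The function names and data are hypothetical.

```python
import math

def log_likelihood(alpha, beta, times, T):
    """Log-likelihood of a power law process observed on (0, T] (complete data)."""
    n = len(times)
    return (n * math.log(alpha) + n * math.log(beta)
            + (beta - 1.0) * sum(math.log(t) for t in times)
            - alpha * T ** beta)

def lr_trend_test(times, T):
    """Likelihood ratio test of H0: beta = 1 against H1: beta != 1.

    Returns (statistic, p_value, alpha_hat, beta_hat), where the p-value
    uses the asymptotic chi-square distribution with 1 degree of freedom.
    """
    n = len(times)
    # H0: HPP, beta fixed at 1, so the MLE of alpha is n / T.
    ll0 = log_likelihood(n / T, 1.0, times, T)
    # H1: closed-form complete-data MLEs of both parameters.
    beta_hat = n / sum(math.log(T / t) for t in times)
    alpha_hat = n / T ** beta_hat
    ll1 = log_likelihood(alpha_hat, beta_hat, times, T)
    stat = max(0.0, 2.0 * (ll1 - ll0))       # guard against rounding below zero
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square(1) upper tail
    return stat, p_value, alpha_hat, beta_hat

# Hypothetical data: five failures clustered near the end of (0, 100].
stat, p_value, alpha_hat, beta_hat = lr_trend_test([60.0, 70.0, 80.0, 90.0, 95.0], 100.0)
print("LR statistic:", stat, "p-value:", p_value, "beta_hat:", beta_hat)
```

A small p-value leads to rejecting H₀ in favor of a trend; the estimated β then indicates whether the intensity is increasing (β>1) or decreasing (β<1).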

Case studies

The results obtained from the study in [20] show that components of a general infusion pump (Fig. 3) can be categorized into two groups. Failures of some components such as indicators, switches, and occlusions cause the system to stop functioning as soon as they occur, and they are fixed immediately, so these failure times are known. The second group consists of components with hidden failures such as circuit breakers or informative components such as audible signals. The pump can continue to

Concluding remarks

In many situations, such as hidden failures, failure times are subject to censoring. Current trend analysis methods in the literature address only right censoring and do not handle recurrent failure data with left or interval censoring and periodic or nonperiodic inspections if repairs are not instantaneous or not performed immediately. It is assumed that the failure process follows a non-homogeneous Poisson process with a power law intensity function. The paper proposes using the likelihood

Acknowledgments

We acknowledge the Natural Sciences and Engineering Research Council (NSERC) of Canada, the Ontario Centers of Excellence (OCE), and C-MORE Consortium members for their financial support. Permission to include Figs. 4 and 5 was kindly granted by the U.S. Food and Drug Administration (FDA). We are thankful to the anonymous reviewers for their comments, which helped us to improve the presentation of the paper.

References (27)

J.Y. Kvaloy et al.
TTT-based tests for trend in repairable systems data
Reliab Eng Syst Saf
(1998)
D.M. Louit et al.
A practical procedure for the selection of time-to-failure models based on the assessment of trends in maintenance data
Reliab Eng Syst Saf
(2009)
A. Regattieri et al.
Estimating reliability characteristics in the presence of censored data: a case study in a light commercial vehicle manufacturing system
Reliab Eng Syst Saf
(2010)
B.H. Lindqvist
On the statistical modeling of repairable systems
Stat Sci
(2006)
L.H. Crow
Reliability analysis for complex repairable systems
L.H. Crow
Confidence interval procedures for the Weibull process with Applications to reliability growth
Technometrics
(1982)
H. Ascher et al.
Repairable systems reliability. modeling, inference, misconceptions and their causes
(1984)
T.W. Anderson et al.
Asymptotic theory of certain ‘goodness of fit’ criteria based on stochastic processes
Ann Math Stat
(1952)
P.A.W. Lewis et al.
Testing for a monotone trend in a modulated renewal process
J.F. Lawless et al.
A point-process model incorporating renewals and time trend, with application to repairable systems
Technometrics
(1996)
H.B. Mann
Nonparametric tests against trend
Econometrica
(1945)

Kvaloy JT, Lindqvist B, Malmedal H. A statistical test for monotonic and non-monotonic trend in repairable systems. In:...

MIL-HDBK-189. Reliability growth management. Washington DC: Department of Defense;...

Cited by (33)

On system reliability for time-varying structure
2023, Reliability Engineering and System Safety
In reliability theory, the aging of a multi-state system is dominated by both the components and the corresponding structure functions. In previous studies, structures are usually assumed to be static, and thus the time-independent structure functions are utilized. However, due to the complex nature of aging, the structure could also vary with time, which may lead to unsatisfactory assessment reliability with the static structure-based analysis. The current investigation provides a universal approach to assessing the reliability of complex systems with time-varying structures. An open-system model is introduced to broaden the logic method of the system reliability. The analysis of open-system model implies that structure functions are probabilistically described by the time-varying structure distributions, which are the fine graining of the conditional probabilistic tables (CPTs) of the Bayesian networks. The aging of components and the time-varying structures are integrated into a probabilistic graphical model together, which is put forth to assess the time-varying reliability of complex systems. A general algorithm based on expectation–maximization (EM) for various dynamic processes for components and system structures is obtained. Two specific processes, e.g., Markov and Weibull, are studied in detail. Three examples are presented to illustrate the proposed approach and give a deeper understanding of time-varying structures.
Failure modes based censored data analysis for repairable systems and its industrial perspective
2021, Computers and Industrial Engineering
Reliability analysis of repairable systems (RS) has always been a challenging task for the industries especially when the collected data is censored. In real-world situation, the industries, generally preserve all the system’s related information such as the number of failures, time between failures and their respective failure modes (FMs) for future reliability analysis. The problem arises when FM wise analysis is required to be done for RS along with other censoring criteria such as test termination, removing the operating system from the study etc. The literature provides various models to deal with different types of censored data but lacks in providing FM based censored data analysis technique for RS, when both preventive and corrective maintenances (PM and CM) are treated as imperfect. To bridge this gap, the paper develops a technique and virtual age models for RS which can simultaneously address FM wise analysis with right and multiply censoring data considering both the PM and CM as imperfect. The paper also develops likelihood function using proposed models for parameter estimation. The proposed technique and models are demonstrated with the help of a case study selected from aviation industry. The paper then highlights the applicability of proposed models to industries dealing with complex and critical RS in conducting failure modes and effects analysis (FMEA) and remaining useful life (RUL) estimation. The paper also brings out as to how the proposed models can be converted to the existing models with some modifications.
Adaptive stochastic-filter-based failure prediction model for complex repairable systems under uncertainty conditions
2020, Reliability Engineering and System Safety
Citation Excerpt :
Crow proposed the Crow-AMSAA model with the power-law recurrence rate function to describe failure counting process [12]. Taghipour used an NHPP and the expectation–maximization (EM) algorithm to analysis failure data censored by inspection intervals [4]. Meeker integrated the nonparametric estimator and the NHPP model used for the reliability analysis of window-observation warranty data [13].
Dynamical reliability assessment and failure prediction are effective tools for ensuring the efficiency, availability, and safety of repairable systems. To achieve better assessment performance, accurate modeling failure recurrence data are the core of prediction approaches. However, because of the uncertainties from the environmental conditions and repair activities, the failure counting model is usually not well established. To solve this problem, in this paper, we propose an adaptive recursive-filter-based dynamical failure prediction approach for complex repairable systems. First, based on the framework of the state space model, a fusion model that fuses Brownian motion into a nonhomogeneous Poisson process is proposed to characterize failure process under multiple uncertainty conditions. Then, an adaptive statistical inference method based on a Bayesian recursive filter and the EM algorithm is derived to update the model parameters and estimate the initial states adaptively. To verify the effectiveness of the proposed approach, a real gas pipeline compressors reliability prediction problem was implemented.
Power–law nonhomogeneous Poisson process with a mixture of latent common shape parameters
2020, Reliability Engineering and System Safety
Rapid developments in information technologies enabled recording big data environments in near real-time. Such big data environments provide an unprecedented opportunity for efficient event detection and therefore effective reliability models, but they also pose interesting challenges. One challenge is modeling the number of recurrent events for heterogeneous subpopulations with limited records. To address this challenge, a power–law nonhomogeneous Poisson process with machine learning capabilities is proposed. The scale parameter of the Poisson process is learned for each individual subpopulation. However, the shape parameter is learned for latent groups that each consists of multiple (internally homogenous) subpopulations. The proposed Poisson process collaboratively models multiple heterogeneous subpopulations; therefore, it allows transferring knowledge between subpopulations and diminishes the chances of overfitting. Simulation and real-life case studies showed the high modeling accuracy of the proposed approach.
Dynamic reliability assessment and prediction for repairable systems with interval-censored data
2017, Reliability Engineering and System Safety
The ‘Test, Analyze and Fix’ process is widely applied to improve the reliability of a repairable system. In this process, dynamic reliability assessment for the system has been paid a great deal of attention. Due to instrument malfunctions, staff omissions and imperfect inspection strategies, field reliability data are often subject to interval censoring, making dynamic reliability assessment become a difficult task. Most traditional methods assume this kind of data as multiple normal distributed variables or the missing mechanism as missing at random, which may cause a large bias in parameter estimation. This paper proposes a novel method to evaluate and predict the dynamic reliability of a repairable system subject to interval-censored problem. First, a multiple imputation strategy based on the assumption that the reliability growth trend follows a nonhomogeneous Poisson process is developed to derive the distributions of missing data. Second, a new order statistic model that can transfer the dependent variables into independent variables is developed to simplify the imputation procedure. The unknown parameters of the model are iteratively inferred by the Monte Carlo expectation maximization (MCEM) algorithm. Finally, to verify the effectiveness of the proposed method, a simulation and a real case study for gas pipeline compressor system are implemented.
A reliability decision framework for multiple repairable units
2016, Reliability Engineering and System Safety
In practice, the analyst is often dealing with multiple repairable units, installed in different positions or functioning under different operating conditions, and maintained by different disciplines. This paper presents a decision framework to identify an appropriate reliability model for massive multiple repairable units. It splits non-homogeneous failure data into homogeneous groups and classifies them based on their failure trends using statistical tests. The framework discusses different scenarios for analysing multiple repairable units, according to trend, intensity, and dependency of the units׳ failure data. The proposed framework has been verified in a fleet of aircraft and in two simulated data sets. The results show a reliability model of multiple repairable units may contain a mixture of different stochastic models. Considering single reliability models for such populations may cause erroneous calculation of the time to failure of a particular unit, which can, in turn, lead to faulty conclusions and decisions. When dealing with massive and non-homogeneous multiple repairable units, the application of the proposed framework can facilitate the selection of an appropriate reliability model.
