Production, Manufacturing and Logistics
Two new stochastic models of the failure process of a series system

https://doi.org/10.1016/j.ejor.2016.07.052

Highlights

  • New failure process models for repair of a series system are proposed.

  • The concept of a virtual component is developed.

  • The models outperform existing models of repair based on nine simulated datasets.

Abstract

Consider a series system consisting of sockets into each of which a component is inserted: if a component fails, it is replaced with a new identical one immediately and system operation resumes. An interesting question is: how to model the failure process of the system as a whole when the lifetime distribution of each component is unknown? This paper attempts to answer this question by developing two new models, for the cases of a specified and an unspecified number of sockets, respectively. It introduces the concept of a virtual component, which corresponds to the part of the system that is replaced upon system failure. It then discusses the probabilistic properties of the models and methods for parameter estimation. Based on six datasets of artificially generated system failures and a real-world dataset, the paper compares the performance of the proposed models with four other commonly used models: the renewal process, the geometric process, Kijima’s generalised renewal process, and the power law process. The results show that the proposed models outperform these comparators on the datasets, based on the Akaike information criterion.

Introduction

Modelling the failure processes of technical systems has attracted much attention from reliability researchers. Many papers attempt to develop statistical models for characterising the failure process of a system (see, for example, Baxter, Kijima, and Tortorella (1996); Cox and Lewis (1966); Dorado, Hollander, and Sethuraman (1997); Doyen and Gaudoin (2004, 2011); Duane (1964); Kijima and Sumita (1986); Lam (1988); Wu and Zuo (2010)). However, much of this existing research assumes that the systems are equivalent to one-component systems. Such an assumption is restrictive and unrealistic as real-world systems normally consist of very many components. In addition, in the real world, the lifetime distribution of each component may not be estimable for various reasons; for example, data on real systems often contain little or no information about the failure processes of individual components. Hence, there is a need to develop new, simple (few parameters) and elegant (richly applicable) failure process models for multi-component systems. This is the purpose of this paper.

Before we introduce our models, we define repair concepts, and two important models of imperfect repair.

In reliability mathematics, the effect of maintenance upon failure of an item is typically categorised into:

  • Perfect repair, in which maintenance restores the condition of a failed item to an “as good as new” status; for example, a failed item is replaced with a new identical one. The renewal process is a widely used model for the failure process of items under perfect repair (Ross, 1996).

  • Minimal repair (see, Cox and Lewis (1966); Duane (1964), for example), in which maintenance restores a failed item to its state immediately prior to failure. The operating state of an item after minimal repair is often called “as bad as old” in the literature. The only model of minimal repair available in the literature is the non-homogeneous Poisson process (NHPP).

  • Imperfect repair, in which maintenance restores a failed item to a status somewhere between “as good as new” and “as bad as old”. Many models, including the geometric process (GP) and its variants (Lam, 1988; Wang & Pham, 1996; Wu & Clements-Croome, 2006), the generalised renewal process models (GRP) (Doyen & Gaudoin, 2004; Kijima, 1989; Kijima & Sumita, 1986), and the reduction of failure hazard models (Doyen & Gaudoin, 2004), have been developed for modelling imperfect repair.

The particular models themselves are defined as follows.

  • The geometric process: Following Lam (1988), given a sequence of non-negative random variables $\{X_k, k = 1, 2, \ldots\}$, if they are independent and the cumulative distribution function of $X_k$ is given by $F(a^{k-1}x)$ for $k = 1, 2, \ldots$, where $a$ is a positive constant, then $\{X_k, k = 1, 2, \ldots\}$ is called a geometric process (GP). The GP has attracted a lot of attention in the literature (see, Wu and Scarf (2015); Zhang, Gaudoin, and Xie (2015), for example).

  • The generalised renewal process: Kijima and Sumita (1986) and Kijima (1989) introduce two types of repair models, type I and type II, using the concept of virtual age. These models distinguish between the age of the system, which is the time elapsed since the system was new (usually at time $t = 0$), and the virtual age of the system, which accounts for the current health of the system when compared to a new system. The two models assume $V_k = V_{k-1} + q_k X_k$ and $V_k = q_k (V_{k-1} + X_k)$, respectively, where $V_k$ is the virtual age of the system immediately after the kth repair, $X_k$ is the operating time of the system since the kth repair, and $0 \le q_k \le 1$. The models are often referred to collectively as the generalised renewal process (GRP). In the type I model, if $q_k = 0$, then the kth repair is a perfect repair; if $q_k = 1$, then the kth repair is a minimal repair.

  • The renewal process, the superimposed renewal process, and the homogeneous Poisson process (HPP): Consider a series system consisting of m sockets, into each of which a non-repairable component is inserted, with the components independent of one another. Whenever a component fails, the system fails, the failed component is replaced with a new identical one, and system operation resumes. Then the number of failures occurring at each socket is a renewal process and the number of failures of the series system as a whole forms a superimposed renewal process (Høyland & Rausand, 2009). In general, the superimposed renewal process is not a renewal process, unless the individual renewal processes are homogeneous Poisson processes (Drenick, 1960). When both the number of components in the system and the operation time of the system are large, the superimposed renewal process behaves approximately as a homogeneous Poisson process (HPP) (Høyland & Rausand, 2009). The HPP is a counting process with constant intensity function (or rate of occurrence of failures).

  • The non-homogeneous Poisson process (NHPP): This process generalises the HPP and has a time-varying intensity function.
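As a rough illustration of the definitions in this list, the following Python sketch computes the virtual ages under the two Kijima recursions and draws a geometric process by scaling draws from a base distribution. The function names, the starting value of zero for the virtual age, and the `base_sampler` argument are assumptions made for this example; they are not taken from the paper.

```python
import random

def kijima_virtual_ages(x, q, model="I"):
    """Virtual ages V_1..V_n after each repair, assuming V_0 = 0.

    x: operating times X_1..X_n; q: repair-effect parameters q_1..q_n.
    """
    v, ages = 0.0, []
    for xk, qk in zip(x, q):
        if model == "I":
            v = v + qk * xk      # type I:  V_k = V_{k-1} + q_k X_k
        else:
            v = qk * (v + xk)    # type II: V_k = q_k (V_{k-1} + X_k)
        ages.append(v)
    return ages

def sample_geometric_process(n, a, base_sampler):
    """Draw X_1..X_n where X_k has cdf F(a^(k-1) x).

    If Y_k has cdf F, then Y_k / a^(k-1) has cdf F(a^(k-1) x).
    """
    return [base_sampler() / a ** (k - 1) for k in range(1, n + 1)]

if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical base distribution: exponential with rate 1.
    print(sample_geometric_process(5, 1.1, lambda: rng.expovariate(1.0)))
    print(kijima_virtual_ages([2.0, 3.0], [0.5, 0.5], model="II"))
```

In the type I recursion, setting every q_k to 0 keeps the virtual age at zero (perfect repair), while setting every q_k to 1 makes the virtual age equal to the accumulated operating time (minimal repair).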

The failure process models such as the GRP (Gilardoni, Toledo, Freitas, & Colosimo, 2015), GP (Wu & Clements-Croome, 2006) and NHPP (Asfaw & Lindqvist, 2015) do not distinguish the effect of maintenance upon failure of different components in the system. Thus, such models effectively consider the system as a one-component system. The use of these models is typically justified by the fact that in practice failure data are scarce, and so, even if one could model each component in the system individually and plan maintenance accordingly, such an approach would not be applicable.

Furthermore, appealing to the asymptotic behaviour of the superimposed renewal process as justification for the use of an HPP in an application is questionable because in practice typical systems are relatively young and failures are rare. Using an NHPP with an increasing intensity function such as the power-law process (Høyland & Rausand, 2009) overcomes this system age issue. However, the NHPP (and HPP for that matter) supposes repair is minimal. At the other end of the spectrum, the renewal process supposes the entire system is replaced on failure so that repair is perfect.
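The superimposed renewal process of a series system can be explored numerically with a short simulation. The sketch below is illustrative only: the Weibull lifetimes, the parameter values, the number of sockets and the function name are all assumptions chosen for the example, not values from the paper.

```python
import random

def simulate_superimposed_renewal(m, horizon, draw_lifetime, rng):
    """Sorted system failure times in (0, horizon] for a series system of m sockets.

    Each socket runs its own renewal process: a failed component is
    replaced at once by a new one whose lifetime is a fresh draw.
    """
    times = []
    for _ in range(m):
        t = draw_lifetime(rng)
        while t <= horizon:
            times.append(t)
            t += draw_lifetime(rng)
    return sorted(times)

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical component lifetime: Weibull with scale 1.0 and shape 2.0
    # (an increasing hazard function).
    weibull = lambda r: r.weibullvariate(1.0, 2.0)
    failures = simulate_superimposed_renewal(5, 10.0, weibull, rng)
    gaps = [b - a for a, b in zip([0.0] + failures[:-1], failures)]
    print(len(failures), gaps[:5])
```

Varying `m` and `horizon` gives a feel for how the times between system failures behave for a young system with few sockets compared with an old system with many.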

Capturing imperfect repair with the GP or the GRP presents further difficulties. The GP implicitly assumes that the times between failures are either stochastically increasing or stochastically decreasing, which is not always true for the failure process of a series system. For example, if all the components in the series system have increasing hazard functions, then a replacement of a failed component improves the reliability of the system; on the other hand, operating times between adjacent replacements of the system are stochastically decreasing. Hence the times between failures of the system may be neither stochastically increasing nor stochastically decreasing. Furthermore, in the limit (a large number of components at large t) the superimposed renewal process behaves as a homogeneous Poisson process, which cannot be captured by the GP.

The GRP overcomes the issue of stochastically increasing or decreasing times between failures by allowing the repair effect parameter to vary. However, if the $q_k$ do vary with k, then typically they must be estimated using a very limited number of failure observations. As a result, the models will be poorly estimated. If the $q_k$ are assumed equal for all k, then the GRP will not capture the fact that the effects of replacement of failed components of different types are different.

In summary then, the existing models of the failure process of a multi-component system are restrictive. Furthermore, limited failure data make it impossible to estimate either the lifetime distribution of each component in a system or a model for a system as a whole with many parameters. We must therefore seek simple and elegant models for a system as a whole that can be fitted using a limited number of failure observations. Our contribution develops two new classes of such models for a series system with non-identical components.

In these models, the failure process of the system is regarded as equivalent to the failure process of a system consisting of a component, called the virtual component, in its socket and the remainder of the system, called the virtual sub-system (VSS). Upon a failure of the system, we suppose that the virtual component is replaced and the remainder of the system is either not maintained (equivalently minimally repaired) or subject to imperfect repair. In reality, one can contend that at a repair, the change in system reliability is at least as big as if the most reliable component had been replaced. Broadly speaking, replacement of the virtual component upon system failure captures this notion.

It should be noted then that in this paper, we distinguish three systems: i) the real system, that is, the reality, e.g. a manufacturing cell, a traction motor, a wind turbine, a compressor; ii) the system, that is, the mathematical model of the system, e.g. a series system with a number of non-repairable, non-identical components; and iii) the virtual system (VS), consisting of a virtual component (VC) and a virtual sub-system (VSS).

In our first model (Model I) the number of sockets in the series system is not specified. In our second (Model II) the number of sockets in the series system is m. The distinguishing features of the two models are discussed in detail later in the paper. The key concept, and a contribution of this paper, is the notion of the virtual component. Through this notion, our models capture not only that a repair effect is neither as good as new nor as bad as old but also that systems comprise distinct, typically non-identical components.

The paper considers the scenarios where the lifetime distribution of each component is neither known nor knowable for various reasons. For example, if the number of system failures is large but one does not know the causes of system failure (so that the different components cannot be distinguished), the lifetime distribution for each component cannot be estimated. On the other hand, if the number of failures is small, knowing the causes of system failures does not provide sufficient information to estimate the lifetime distribution for each component.

Thus, we claim that this paper is the first paper to model the repair effect in multi-component systems with a stochastic process, on the basis that the virtual age reduction models, such as the GRP, do not model a multi-component system, and the superimposed renewal process, HPP and NHPP do not model repair. It proposes elegant models for the failure process of a series system that overcome the limitations of existing models that are either restrictive (renewal process and its generalisations) or require knowledge of the failure process for individual components (superimposed renewal process).

The structure of the paper is as follows. Section 2 lists assumptions and notation required. Section 3 develops the two models of interest. Section 4 gives the likelihood functions of the two models given data on failures. Section 5 assesses the validity of the proposed models based on artificially generated datasets and a real-world dataset. We make some concluding remarks, including the implications of our work for applications, in the final section.

Section snippets

Assumptions and notation

Consider a series system with a number of statistically independent components. If a component fails, the system fails. On failure, the failed component is replaced instantaneously by an identical component, and the system is restored to operation. Time for replacement is negligible. Associate a socket with each component location in the system, in the sense of Ascher and Feingold (1984), so that the components in their sockets collectively perform the operational function of the system.

The

Model development

In this section we will develop the two new models. First, we recall two concepts, and then we derive a result about the cumulative intensity function of the system when the components in a series system are identical (Proposition 1).

Given a component with hazard function $h(t)$ and lifetime $X$, then

$$\Pr\{X \le y + x \mid X > y\} = \frac{F(x+y) - F(y)}{1 - F(y)} = 1 - \exp\left(-\int_{y}^{x+y} h(u)\,du\right). \qquad (3)$$

$\Pr\{X \le y + x \mid X > y\}$ in Eq. (3) is the cumulative distribution function of the remaining lifetime distribution of a component that has survived for y time
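The conditional probability of failing within a further time x, given survival to age y, can be checked numerically. The sketch below assumes a Weibull lifetime, for which the cumulative hazard is (t/scale)^shape, and compares the hazard form 1 - exp(-(H(x+y) - H(y))) with the ratio form (F(x+y) - F(y)) / (1 - F(y)). The choice of distribution and all names are assumptions for illustration.

```python
import math

def weibull_cdf(t, shape, scale):
    """F(t) = 1 - exp(-(t/scale)^shape) for t >= 0."""
    return 1.0 - math.exp(-((t / scale) ** shape))

def remaining_life_cdf(x, y, shape, scale):
    """Pr{X <= y + x | X > y} via the integrated hazard between y and x + y."""
    cum_hazard = ((x + y) / scale) ** shape - (y / scale) ** shape
    return 1.0 - math.exp(-cum_hazard)

def remaining_life_cdf_ratio(x, y, shape, scale):
    """The same probability via (F(x+y) - F(y)) / (1 - F(y))."""
    f_y = weibull_cdf(y, shape, scale)
    return (weibull_cdf(x + y, shape, scale) - f_y) / (1.0 - f_y)

if __name__ == "__main__":
    # Hypothetical values: shape 2.0, scale 3.0, age y = 1.0, further time x = 0.5.
    print(remaining_life_cdf(0.5, 1.0, 2.0, 3.0))
    print(remaining_life_cdf_ratio(0.5, 1.0, 2.0, 3.0))
```

With shape equal to 1 the lifetime is exponential and the result does not depend on y, which is the memoryless property.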

Parameter estimation

Consider M independent systems of the same kind (replicates), each of which consists of m non-repairable components with hazard functions $h_i(t)$ for $i = 1, \ldots, m$. Suppose that $M_j$ failures of system j are observed at times $t_{j,1}, t_{j,2}, \ldots, t_{j,M_j}$ (where $j = 1, 2, \ldots, M$), respectively. Denote $x_{j,1} = t_{j,1}$ and $x_{j,k} = t_{j,k} - t_{j,k-1}$ for $k \ge 2$ and all j.
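The times between failures defined above can be obtained from the observed failure times by simple differencing. A minimal sketch follows; the function name and the example failure times are hypothetical.

```python
def inter_failure_times(failure_times):
    """Map ordered failure times t_1 <= t_2 <= ... of one system to x_1, x_2, ...

    x_1 = t_1 and x_k = t_k - t_{k-1} for k >= 2.
    """
    previous = [0.0] + list(failure_times[:-1])
    return [t - p for p, t in zip(previous, failure_times)]

if __name__ == "__main__":
    # Hypothetical failure times for M = 2 replicate systems.
    systems = [[1.5, 4.0, 4.5], [2.0]]
    print([inter_failure_times(ts) for ts in systems])
```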

As discussed above, for both Model I and Model II, we suppose (a) upon system failure, VC1 in both Model I and Model II is always renewed, and (b) upon system failure, VC2

Simulation study

In this subsection, we fit Models I and II, and four other models (RP, NHPP, GP, and GRP) to artificially generated datasets, and then compare the AIC values of these models. It is known that the renewal process (RP) is usually used for perfect repair scenarios, the non-homogeneous Poisson process with power law intensity function (NHPP-PL) is often used for minimal repair, and two models for imperfect repair: the geometric process (GP) and the generalised renewal process with $V_k = V_{k-1} + q X_k$ being

Conclusions

This paper develops two stochastic models of the failure process of a multi-component series system. Model I regards the failure process of the system as equivalent to that of a virtual system consisting of two sockets, into each of which a virtual component is inserted. Whenever the system fails, replacement occurs at socket 1 and minimal repair or no maintenance is conducted on the virtual component in socket 2. Model II regards the failure process of the system as equivalent to that of a

Acknowledgments

The authors are indebted to the reviewers and the editor for their constructive comments.

References (25)

  • R.F. Drenick (1960). The failure law of complex equipment. Journal of the Society for Industrial & Applied Mathematics.

  • J. Duane (1964). Learning curve approach to reliability monitoring. IEEE Transactions on Aerospace.