Importance measure of probabilistic common cause failures under system hybrid uncertainty based on Bayesian network

In modern complex systems, the relationships between components can lead to various dependencies between component failures, where multiple items of the system fail simultaneously in unpredictable fashions. These probabilistic common cause failures greatly affect the performance of such critical systems. In this paper, a novel methodology is developed to quantify the importance of common cause failures when hybrid uncertainties are present in systems. First, the probabilistic common cause failures are modeled with Bayesian networks and are incorporated into the system model using the α-factor model. Then, probability-boxes (a bound analysis method) are introduced to model the hybrid uncertainties and quantify the effect of uncertainties on system reliability. Furthermore, an extended Birnbaum importance measure is defined to identify the critical common cause failure events and coupling impact factors when uncertainties are expressed by probability-boxes. Finally, the effectiveness of the method is demonstrated through a numerical example.


Jinhua Mi, Yan-Feng Li, Michael Beer, Matteo Broggi, Yuhua Cheng

Mi J, Li Y-F, Beer M, Broggi M, Cheng Y. Importance measure of probabilistic common cause failures under system hybrid uncertainty based on Bayesian network. Eksploatacja i Niezawodnosc - Maintenance and Reliability 2020; 22 (1): 112-120, http://dx.doi.org/10.17531/ein.2020.1.13.

Introduction
The reliability assessment of modern industrial systems has to take into account various system characteristics, since systems are becoming increasingly large and complex. As an example, reliability models have to account for dynamic behavior [32], multiple failure mechanisms [10], dependencies between components, uncertainties, etc. [15,18]. Several conventional combinatorial methods have been developed and proved to be effective tools for system reliability modelling and assessment, including reliability block diagram (RBD), fault tree analysis (FTA) [12], Markov chains, Bayesian networks [11,23], etc. Nevertheless, in terms of model performance, computational efficiency and implementation complexity, these traditional models present advantages but also serious limitations [3]: (i) Static fault tree and RBD models can map system components to events, but fail to capture dynamic behavior; dynamic fault tree models are needed to model time-dependent behaviors, which significantly increases the complexity of the model. (ii) Markov chains can deal with dynamic behavior, but they are limited to exponential distributions for failure behaviors. Markov chains also face state-space explosion when applied to large systems, because the method considers all relationships among parent nodes, child nodes, and even shared nodes. (iii) Owing to the conditional independence assumptions between random variables and the dependence separation among nodes, a child node in a Bayesian network is only affected by a limited number of parent nodes [11,12]. (iv) In addition, Bayesian networks provide a powerful capability for probabilistic reasoning, dynamic behavior modeling and multi-model synthesis [23]. These advantages have made the Bayesian network a widely used method in reliability modelling and assessment of a diversity of large engineering systems.

In engineering practice, unavoidable uncertainties are of utmost importance for system reliability assessment. The combination of aleatory (stochastic) uncertainty and epistemic (lack of knowledge) uncertainty leads to the framework called "mixed uncertainty" or "hybrid uncertainty", which is ubiquitous in engineering systems [4]. Uncertainties mainly arise from three sources: observational uncertainty, model uncertainty and parametric uncertainty. The purpose of uncertainty analysis is to develop advanced approaches that reduce those uncertainties, thus leading to more accurate analysis and assessment of system reliability [29]. Uncertainty characterization models can be divided into three types: classic probabilistic models, non-probabilistic models, and imprecise probability models. Imprecise probability models, including evidence theory [7,14], probability-box (p-box) theory [8], fuzzy probability theory [25,28], etc., have been shown to be more appropriate for hybrid uncertainty. In particular, p-box theory essentially combines classic probability theory with interval arithmetic; it is a very effective tool for treating imprecise probabilities and allows for the comprehensive propagation of hybrid uncertainty [6,24].
As defined by the Nuclear Energy Agency (NEA), common cause failures (CCFs) are the simultaneous failure events of two or more components in the same common cause component group (CCCG). CCFs are caused by shared initiating events, which are also called "coupling impact factors" [30]. CCFs directly connect failure events with their root causes; thus, research on CCFs can establish the cause-effect relationship between component failures and failure causes. Multiple parametric models, such as the β-factor model and the α-factor model, generically classified as "ratio models", have been developed for the quantification of CCFs. In addition, CCF models that allow for direct representation of the CCF events, such as the square-root method, and shock models, such as the binomial failure rate model, have also been proposed. For reliability analysis and assessment of systems with CCFs, these parametric models have been extended to incorporate CCFs into the system fault tree model [13,16], the Bayesian network model [17], etc. These methods are especially widely used in the probabilistic safety assessment of large complex systems with high reliability and long lifetime requirements [9].
Probabilistic CCF (PCCF) is a generalized model of CCF that can characterize the simultaneous failures of components in CCCGs with different probabilities of occurrence. Among the ratio models used for PCCF analysis, the β-factor model is the most widely used thanks to its simplicity, but the α-factor model is receiving increased interest as well, since it can model multiplicities of CCFs and can build a bridge between failure events and coupling causes [31]. For the reliability assessment of systems with PCCFs by means of static fault tree models, explicit and implicit modelling methods were proposed by Wang et al. [26] to estimate the reliability of systems with arbitrary component types and different component failure distributions. Thereafter, Wang et al. [27] extended these models and proposed both an explicit and an implicit method to analyze the reliability of phased-mission systems with PCCFs. Zhu et al. [33] proposed a stochastic computational approach to deal with the reliability overestimation of dynamic fault trees with PCCFs when dynamic behaviors are considered in redundant systems. Additionally, when epistemic uncertainty is also present in systems, Zuo et al. [34] evaluated the system reliability with PCCFs specified as interval values based on an evidential network, and the Birnbaum importance was extended to measure the contribution of components to system reliability. Based on evidential networks, Qiu et al. [21] proposed a valuation-based system method for system reliability analysis with consideration of parametric uncertainty and CCFs. Mi et al. [17] incorporated CCFs and uncertainties into an evidential network to study the reliability of multi-state systems. These methods mainly focus on epistemic uncertainty and CCFs, and it needs to be emphasized that there is still a lack of research on system reliability when hybrid uncertainties and PCCFs are both considered.
Hence, in recent years, research on CCFs has mainly focused on quantification models and method extensions, while few works have addressed the importance of different types of CCFs and of impact factors to system reliability [1,22], especially when aleatory uncertainty and epistemic uncertainty are unavoidably present in systems. Filling this research gap is of considerable significance to safety-critical industrial areas that are seriously affected by CCFs, such as the aerospace and nuclear industries. The ranking of CCF events and impact factors can give explicit guidance for system redesign, and is also meaningful for formulating maintenance measures and eliminating faults.
To evaluate the importance of various CCF events to system reliability, the CCF events should be modelled and expressed in the system reliability model. Therefore, this paper is organized as follows. Firstly, the CCF events are modelled by Bayesian networks based on the alpha factor model, and incorporated into the system Bayesian network. Then the hybrid uncertainties are expressed by p-boxes, and the Birnbaum importance is extended to the EBI (extended Birnbaum importance), which can be used to define the importance of CCF events to system reliability. Finally, a numerical example is used to demonstrate the importance measure of CCFs with respect to system reliability.

Parametric model for common cause failure
The α-factor model is a multi-parameter method which can be used to quantify all kinds of CCFs. The α-factor, denoted $\alpha_k$, is defined as the fraction of the total failure probability of events occurring in the system that involve the failure of k components due to a common cause. For a common cause component group (CCCG) with m components of the same type, also called a CCCG of size m, the sum of all $\alpha_k$ equals 1. After a series of experiments, the basic failure events are counted, and the number of failure events involving k component failures due to a common cause, $n_k$, can be computed by the weighted impact vector method [19,20]. When sufficient test data are available, the alpha factors can then be estimated with the maximum likelihood estimator:

$$\alpha_k = \frac{n_k}{\sum_{j=1}^{m} n_j}, \quad k = 1, \ldots, m. \tag{1}$$

Then, a common cause vector $\boldsymbol{\alpha}^{m}_{CCCG}$ is defined in Eq. (2) to represent the degree to which the different CCF events affect the failure of each component in a CCCG of size m. Besides, whereas the β-factor model can only give an approximate range based on engineering experience, the α-factor model can integrate expert judgment about the system with past data, which makes it a more suitable parametric model in engineering practice. When the total failure probability of a component is $P_t$, the occurrence probability $P_k$ of a failure event involving k components under staggered testing is given by:

$$P_k = \frac{1}{\binom{m-1}{k-1}}\, \alpha_k P_t, \quad k = 1, \ldots, m. \tag{3}$$

When the total failure probability $P_t$ of a component is specified, the alpha factors can be calculated by Eqs. (1) and (2), and the occurrence probabilities of the different CCF scenarios can then be calculated by Eq. (3).
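As a minimal numerical sketch of this procedure, the Python snippet below estimates alpha factors from event counts and converts them into event probabilities. It assumes the standard maximum likelihood estimator $\alpha_k = n_k / \sum_j n_j$ and the standard staggered-testing relation $P_k = \alpha_k P_t / \binom{m-1}{k-1}$ from the alpha-factor literature; the function names, the event counts and the value of $P_t$ are hypothetical and chosen only for illustration.

```python
from math import comb


def estimate_alpha_factors(event_counts):
    """Maximum likelihood estimate of the alpha factors.

    event_counts[k-1] is n_k, the number of observed failure events
    that involve exactly k components of the CCCG.
    """
    total = sum(event_counts)
    if total <= 0:
        raise ValueError("at least one failure event is required")
    return [n_k / total for n_k in event_counts]


def staggered_event_probabilities(alphas, p_total):
    """Probability of a failure event involving k specific components
    (staggered testing), for k = 1..m, given the total component
    failure probability p_total."""
    m = len(alphas)
    return [
        alpha_k * p_total / comb(m - 1, k - 1)
        for k, alpha_k in enumerate(alphas, start=1)
    ]


# Hypothetical data for a CCCG of size m = 3 (not taken from the paper).
counts = [90, 6, 4]
alphas = estimate_alpha_factors(counts)
probs = staggered_event_probabilities(alphas, p_total=0.01)
print(alphas)
print(probs)
```

A useful consistency check is that the event probabilities recombine to the total failure probability of one component, that is, the sum over k of $\binom{m-1}{k-1} P_k$ returns $P_t$.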

Bayesian network modeling of component in common cause component groups
Bayesian networks are a widely used methodology for reliability modelling; they are composed of nodes, which represent binary-state random variables, and edges, which represent dependencies between these variables. In a Bayesian network with n nodes, where $X_i$ is the random variable corresponding to node i, for any $X_i$ (i = 1, …, n) there exists a parent set $\pi(X_i) \subseteq \{X_1, \ldots, X_{i-1}\}$ such that, given $\pi(X_i)$, the variable $X_i$ is conditionally independent of the other variables in the set $\{X_1, \ldots, X_{i-1}\}$. Thus, based on the chain rule and the conditional independence of the Bayesian network, the joint distribution of the n variables can be derived as:

$$P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \pi(X_i)). \tag{4}$$

When $\pi(X_i) = \emptyset$, the conditional distribution $P(X_i \mid \pi(X_i))$ reduces to the marginal distribution $P(X_i)$. Thanks to this decomposition of the joint distribution, it is possible to greatly reduce the complexity of a Bayesian network model.
In this framework, the failure event of a component in a CCCG can be decomposed based on Eq. (3). For example, for a component in a CCCG with 3 components, the failure event identified by the random variable $X_1$ can be further characterized as an independent failure, $X_{1\text{-ind}}$, two-component CCFs, $X_{12}$ and $X_{13}$, and a three-component CCF, $X_{123}$. Then, the Bayesian network of the failure of the component can be built as shown in Fig. 1. Based on the conditional independence between variables and the reasoning mechanism of the Bayesian network, the probability of component $X_1$ is obtained by summing over the states of its parent events:

$$P(X_1) = \sum_{X_{1\text{-ind}},\, X_{12},\, X_{13},\, X_{123}} P(X_1 \mid X_{1\text{-ind}}, X_{12}, X_{13}, X_{123})\, P(X_{1\text{-ind}})\, P(X_{12})\, P(X_{13})\, P(X_{123}).$$

When an arbitrary independent failure event or CCF event occurs, the failure of the component is triggered. Using 0 and 1 to represent the failure and functioning states of X, respectively, the conditional probability table (CPT) of node X is listed in Table 1.
In the case where 3 components of the same type are connected in a parallel system, the failure of each component, including its CCF events, is represented by a Bayesian network of the form shown in Fig. 1, and the Bayesian network of the whole system is assembled as shown in Fig. 2. Finally, the probability of the system leaf node can be evaluated by forward reasoning in the Bayesian network together with Eq. (4), that is, by summing the factorized joint distribution over the states of the root and component nodes. When the CPTs and marginal distributions are given, the system reliability can be further evaluated.
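The decomposition and forward reasoning described in this section can be illustrated with a brute-force enumeration. The sketch below is not the authors' implementation: it assumes that the root events are mutually independent, that a component fails as soon as any of its parent events occurs, and that the parallel system fails only when all three components fail. The occurrence probabilities are hypothetical, and Boolean flags (event occurred or not) are used instead of the 0/1 state coding of the CPTs.

```python
from itertools import product

# Hypothetical occurrence probabilities of the decomposed events
# (independent failure, two-component CCF, three-component CCF).
P_IND, P_PAIR, P_TRIPLE = 0.009, 0.0003, 0.0004

# Root nodes of the network for a CCCG of size 3.
ROOT_EVENTS = {
    "X1-ind": P_IND, "X2-ind": P_IND, "X3-ind": P_IND,
    "X12": P_PAIR, "X13": P_PAIR, "X23": P_PAIR,
    "X123": P_TRIPLE,
}

# Parent events of each component node.
PARENTS = {
    1: ("X1-ind", "X12", "X13", "X123"),
    2: ("X2-ind", "X12", "X23", "X123"),
    3: ("X3-ind", "X13", "X23", "X123"),
}


def failure_probabilities():
    """Enumerate all root-event states and accumulate the failure
    probability of every component and of the parallel system."""
    names = list(ROOT_EVENTS)
    p_component = {c: 0.0 for c in PARENTS}
    p_system = 0.0
    for states in product((False, True), repeat=len(names)):
        occurred = dict(zip(names, states))
        # Joint probability of this combination of independent root events.
        weight = 1.0
        for name, happened in occurred.items():
            p = ROOT_EVENTS[name]
            weight *= p if happened else 1.0 - p
        # Deterministic CPT: a component fails if any parent event occurs.
        failed = {
            c: any(occurred[e] for e in parents)
            for c, parents in PARENTS.items()
        }
        for c, is_failed in failed.items():
            if is_failed:
                p_component[c] += weight
        # Parallel system: fails only if all components fail.
        if all(failed.values()):
            p_system += weight
    return p_component, p_system


p_component, p_system = failure_probabilities()
print(p_component[1], p_system)
```

Enumeration grows exponentially with the number of root events, so it is only practical for small groups; dedicated Bayesian network inference algorithms would be used for larger models.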

Probability-box for uncertainty expression
The term "hybrid uncertainties" is used to identify uncertainty quantification and propagation procedures that include both aleatory uncertainty and epistemic uncertainty. Aleatory uncertainty characterizes the inherent randomness typical of some physical processes, which cannot be eliminated or reduced, and is quantified by means of probability theory. Epistemic uncertainty, on the other hand, is present in systems due to lack of knowledge, insufficient data, etc., and can be reduced by providing more data and increasing the knowledge of the system. P-box theory has been proved to be an effective method to analyze aleatory and epistemic uncertainty in systems. The probability expression of the p-box boundaries carries the aleatory uncertain information of system performance, while the area between the upper and lower bounds represents the epistemic uncertain information. For a random variable X affected by hybrid uncertainty, its probability distribution is not identified by a unique cumulative distribution $F_X(t)$, but by a lower and an upper bound, which together form a p-box $[\underline{F}_X(t), \overline{F}_X(t)]$. The overall slant of a p-box represents the aleatory uncertainty, while the epistemic uncertainty is represented by the breadth between the upper and lower bounds of the p-box. Based on this definition, the p-box has been extended and used in system reliability analysis, sensitivity analysis, risk analysis, etc.

Fig. 2. BN of 3-component parallel system with CCFs
As an example of a p-box appearing in the estimation of the reliability of a system, consider the case where the lifetime of a component is assumed to follow a Weibull distribution. The shape parameter β and the scale parameter η are affected by imprecise information and are defined as interval parameters, which vary in $[\underline{\beta}, \overline{\beta}]$ and $[\underline{\eta}, \overline{\eta}]$, respectively. The lifetime of a component is described by a non-negative random variable $X_i$ on the real numbers $\mathbb{R}$, and $\underline{R}_i(t)$ and $\overline{R}_i(t)$ are the bounding reliability functions for $X_i$, such that $\underline{R}_i(t) \le R_i(t) \le \overline{R}_i(t)$. The p-box variable employed to express the system reliability is then defined as in Eq. (7) [17]. For example, for a component whose lifetime follows a Weibull distribution with the scale parameter within the interval [1.68, 1.86] and the shape parameter within the interval [2.08, 2.32], the Weibull p-box is constructed by taking the envelopes of those distributions and is shown in Fig. 3.
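The envelope construction for this Weibull example can be sketched as follows. The snippet assumes the two-parameter Weibull reliability function $R(t) = \exp(-(t/\eta)^{\beta})$; the function names are illustrative. For a fixed time t, this function is monotone in each parameter separately, so the smallest and largest values over the parameter box are attained at its four corners, and evaluating those corners is sufficient.

```python
import math
from itertools import product


def weibull_reliability(t, shape, scale):
    """Reliability R(t) = exp(-(t / scale) ** shape) of a Weibull lifetime."""
    return math.exp(-((t / scale) ** shape))


def weibull_pbox(t, shape_interval, scale_interval):
    """Lower and upper reliability bounds at time t for interval-valued
    shape and scale parameters, taken over the corners of the box."""
    values = [
        weibull_reliability(t, shape, scale)
        for shape, scale in product(shape_interval, scale_interval)
    ]
    return min(values), max(values)


# Parameter intervals quoted in the text.
SHAPE = (2.08, 2.32)
SCALE = (1.68, 1.86)

for t in (0.5, 1.0, 2.0, 3.0):
    lower, upper = weibull_pbox(t, SHAPE, SCALE)
    print(f"t = {t:.1f}: R in [{lower:.4f}, {upper:.4f}]")
```

Note that the corner that yields the lower bound changes with t: the shape parameter acts in opposite directions for times below and above the scale parameter, which is why all four corners are evaluated at every time point.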

Bayesian network reasoning with hybrid uncertainty
Consider a Bayesian network with n+m+1 nodes, in which the variables of the n root nodes are denoted $x_i$ (i = 1, 2, …, n), the variables of the intermediate nodes are denoted $y_j$ (j = 1, 2, …, m), and the leaf node variable is T. Based on the chain rule of the Bayesian network, the system reliability can be calculated by the following equation:

$$P(T) = \sum_{x_1, \ldots, x_n,\; y_1, \ldots, y_m} P(T \mid \pi(T)) \prod_{j=1}^{m} P(y_j \mid \pi(y_j)) \prod_{i=1}^{n} P(x_i),$$

where $\pi(T)$ is the parent set of the leaf node T, and $\pi(y_m)$ represents the parent set of the intermediate node $y_m$. When hybrid uncertainties are taken into account in the system and the reliability of the basic components is expressed by p-boxes as in Eq. (7), the occurrence probabilities of the decomposed independent failure events and CCF events can be obtained by Eq. (3) once the alpha factors are given or estimated. Since the reliability function is a monotone decreasing function, the lower and upper bounds of the system reliability p-box can be further derived. After the upper and lower bounds of the system reliability p-box are obtained, the hybrid uncertainty of the system can be shown intuitively.
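A minimal sketch of this bound propagation is given below. It is an illustration under stated assumptions rather than the paper's procedure: components are taken to be independent, the structure is coherent (so the system reliability does not decrease when any component reliability increases), and the example structure, one component in series with a parallel pair, is hypothetical. Under these assumptions the lower system bound follows from the lower component bounds and the upper system bound from the upper component bounds.

```python
from itertools import product


def system_reliability(component_reliability, structure):
    """Probability that the system works, by enumerating all component
    states (1 = functioning, 0 = failed) of independent components."""
    total = 0.0
    for states in product((0, 1), repeat=len(component_reliability)):
        weight = 1.0
        for r, s in zip(component_reliability, states):
            weight *= r if s == 1 else 1.0 - r
        total += weight * structure(states)
    return total


def series_parallel(states):
    """Hypothetical structure: component 1 in series with the
    parallel pair (2, 3). Returns 1 if the system works, else 0."""
    x1, x2, x3 = states
    return int(x1 and (x2 or x3))


def reliability_bounds(lower, upper, structure):
    """Bounds of the system reliability for a coherent structure,
    obtained from the component-wise lower and upper bounds."""
    return (
        system_reliability(lower, structure),
        system_reliability(upper, structure),
    )


# Hypothetical component reliability intervals at one time instant.
lower = (0.90, 0.80, 0.70)
upper = (0.95, 0.90, 0.85)
bounds = reliability_bounds(lower, upper, series_parallel)
print(bounds)
```

Repeating the computation on a grid of time points, with the component bounds taken from their p-boxes, traces out the two curves that delimit the system reliability p-box.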

Extended Birnbaum importance measures
The Birnbaum importance measure has been commonly used to evaluate the contributions of components to the reliability of binary coherent systems. In this paper we focus on dependent failures, especially CCFs. The failure events of the components are originally not independent; however, after the decomposition of the component failure events by the partial alpha-factor model, the decomposed basic events are independent [2]. The Birnbaum importance measure can therefore be extended and employed to evaluate the contributions of CCF events and independent events to the system reliability.
For a Bayesian network which has n root nodes and is composed of both CCF events and independent failure events, the system reliability is represented as $R(p_1, \ldots, p_j, \ldots, p_n)$. In addition, the state of component or event j, indicated by $x_j$, is considered: $x_j = 0$ represents the j-th component or event being in a false state, while $x_j = 1$ represents event j being true. By explicitly assigning these two possible states to the j-th event, the reliability of the system where $x_j$ is either 0 or 1 is expressed as $R(p_1, \ldots, 0_j, \ldots, p_n)$ and $R(p_1, \ldots, 1_j, \ldots, p_n)$, respectively. Then, by extending the basic definition of BI, which is the partial derivative of the system reliability function with respect to $p_j$, the extended Birnbaum importance (EBI) of event j is defined from these two conditional reliabilities, where $p_j$ is the occurrence probability of event j. The EBI measure can be used to rank the importance of failure event j when aleatory uncertainties are exclusively considered. However, when epistemic uncertainty is also present in the system and the reliability of the components is represented by p-boxes (refer to Section 3.1), the extended BI of event j is itself expressed as a p-box, whose lower and upper bounds are computed by two global optimization problems, one minimizing and one maximizing the EBI. After the BIs of the various CCF events are obtained and expressed by p-boxes, the interval-valued BI at a specific time t can also be derived, and the ranking of the contributions of CCF events to system reliability can then be obtained.
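The idea of an interval-valued Birnbaum importance can be sketched on a small example. The snippet below makes several assumptions that are not taken from the paper: the events are independent, $p_j$ denotes the probability that component j functions (the classic Birnbaum setting), the structure is a hypothetical series-parallel system, and the global optimization is replaced by enumeration of the vertices of the interval box. The replacement is valid here because the reliability of a system of independent binary events is multilinear in the $p_j$, so the importance attains its extremes at vertices.

```python
from itertools import product


def system_reliability(p):
    """Hypothetical structure: component 1 in series with the
    parallel pair (2, 3); p holds functioning probabilities."""
    p1, p2, p3 = p
    return p1 * (1.0 - (1.0 - p2) * (1.0 - p3))


def birnbaum(p, j):
    """Birnbaum importance of component j: system reliability with
    component j surely working minus with it surely failed."""
    working = list(p)
    working[j] = 1.0
    failed = list(p)
    failed[j] = 0.0
    return system_reliability(working) - system_reliability(failed)


def birnbaum_bounds(intervals, j):
    """Lower and upper bound of the importance of component j when each
    probability is only known to lie in an interval (vertex enumeration)."""
    values = [birnbaum(vertex, j) for vertex in product(*intervals)]
    return min(values), max(values)


# Hypothetical probability intervals at one time instant.
intervals = [(0.90, 0.95), (0.80, 0.90), (0.70, 0.85)]
bounds = [birnbaum_bounds(intervals, j) for j in range(3)]
for j, (low, high) in enumerate(bounds, start=1):
    print(f"component {j}: BI in [{low:.3f}, {high:.3f}]")
```

When two importance intervals overlap, the ranking of the corresponding events is not unique at that time instant, which is one reason interval-valued importance measures are usually inspected over time rather than at a single point.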

System description
An arbitrary 13-component non-repairable complex system from Ref. [5] is used in this section to illustrate the proposed method. The compound system is transformed into a series-parallel system as shown in Fig. 4. The 13 components are divided into 5 groups, and the components in the same group have the same lifetime distribution. The component classification and the corresponding hypothetical lifetime distributions are defined in Table 2.

Reliability analysis of the system
The reliability modeling and analysis of the example system is carried out with the following steps:
Step 1: Develop the Bayesian network model of the system without considering CCFs and hybrid uncertainty. Based on the system structure in Fig. 4, the Bayesian network of the system is constructed, as shown in Fig. 5. The nodes 1 to 13 correspond to the basic failure events of the 13 basic components, the nodes 14 to 24 are the intermediate nodes, and node 25 is the leaf node, which represents the event of system failure. The CPTs of a simple 3-node Bayesian network under series and parallel system structures are shown in Table 3, and the CPTs of a Bayesian network with more nodes can be easily inferred from the system structure and failure mechanisms. The system reliability can be obtained by inference in the Bayesian network, and it is shown in Fig. 6. In order to validate the proposed approach, the system reliability from the Bayesian network is compared with the reliability computed by the survival signature; the two results agree very well, which supports the validity of the proposed method.
Step 2: System reliability without CCF events but with hybrid uncertainties. When aleatory uncertainty and epistemic uncertainty are considered in this system, the system Bayesian network has the same structure as shown in Fig. 5. The upper and lower bounds of the system reliability can be calculated based on Eqs. (10) to (12), and are shown in Fig. 7. For comparison, the results obtained by the survival signature method under two conditions, (1) with consideration of hybrid uncertainties and (2) without hybrid uncertainties, are also shown in Fig. 7. Regardless of whether or not the hybrid uncertainties are considered, the system reliability computed by the survival signature and by the traditional Bayesian network is consistent, and these results lie within the upper and lower bounds calculated by the p-box method. The trend of the system reliability contains the aleatory uncertain information of the system, and the breadth of the reliability p-box clearly reveals the epistemic uncertainty of the system.
Step 3: Develop the Bayesian network model of the system with CCF events. Based on the CCF modeling method illustrated in Section 2, a new Bayesian network model of the example system can be built, as shown in Fig. 8. Here, the nodes 1 to 37 are the independent failure events of the components and the various CCF scenarios of the CCCGs, nodes 38 to 49 correspond to the basic failure events of the 13 components, nodes 50 to 60 are the intermediate nodes, and the system failure event is represented by node 61. The CPTs of the nodes in the component layer are similar to Table 1.
Based on the partial alpha factor model, the probabilities of independent failures and various CCF events can be calculated by Eqs. (1) to (3). The system reliability can then be computed, and it is shown in Fig. 9. The system reliability decreases more sharply when CCFs are considered, which further shows the significant influence of CCF events on system reliability.

Table 3. The CPTs of 3-node Bayesian networks under series and parallel system structures
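The series and parallel CPTs used in Step 1 can be sketched for a child node with two parents. The snippet below is illustrative and does not reproduce Table 3: it assumes deterministic CPTs, independent parent nodes, and the state coding used in the text (0 = failed, 1 = functioning); the parent reliabilities are hypothetical.

```python
from itertools import product

# CPT entries give P(child = 1 | parent states), with 1 = functioning.
# Series: the child works only if both parents work.
SERIES_CPT = {
    (a, b): float(a == 1 and b == 1) for a, b in product((0, 1), repeat=2)
}
# Parallel: the child works if at least one parent works.
PARALLEL_CPT = {
    (a, b): float(a == 1 or b == 1) for a, b in product((0, 1), repeat=2)
}


def child_reliability(cpt, r_a, r_b):
    """Forward inference for a 3-node network: marginalize the child
    over the states of its two independent parents."""
    total = 0.0
    for (a, b), p_child_works in cpt.items():
        p_a = r_a if a == 1 else 1.0 - r_a
        p_b = r_b if b == 1 else 1.0 - r_b
        total += p_child_works * p_a * p_b
    return total


# Hypothetical parent reliabilities.
r_series = child_reliability(SERIES_CPT, 0.9, 0.8)
r_parallel = child_reliability(PARALLEL_CPT, 0.9, 0.8)
print(r_series, r_parallel)
```

Larger series-parallel structures can be assembled by chaining such nodes, with the output of one node serving as the parent reliability of the next, as long as the sub-structures share no common components.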

Importance analysis of components and common cause failures
Based on Eqs. (10) to (12), the importance of all types of components without considering CCFs can be calculated, as shown in Fig. 10. The resulting ranking of component importance is RI(Type5) > RI(Type2) > RI(Type1) > RI(Type4) > RI(Type3). To illustrate the effect of uncertainties on component importance, the importance of component type 5 with uncertainties is shown in Fig. 11. The importance of type 5 changes considerably once uncertainties are considered, and the lower and upper bounds of the importance intersect.
To indicate the importance of CCFs to the system, the EBIs of the various CCFs of component type 5 are computed based on the definition in Section 3.3, as shown in Fig. 12. The importance ranking of the various CCF events can then be obtained at any specific time. Furthermore, once the impact factors of the CCFs are obtained, they will give more accurate guidance for system design and for planning maintenance measures.

Conclusions
This paper discusses the importance measure of CCF events with respect to system reliability under hybrid uncertainties. Firstly, building on existing research on CCFs and system modeling, the system reliability is modelled by a Bayesian network, and the CCFs are then incorporated into the system Bayesian network model based on the alpha factor model. The comparison with the survival signature shows the validity of the Bayesian network model, and the results show that CCFs can decrease system reliability considerably. When hybrid uncertainties are considered in the system, the Birnbaum importance is extended to CCF events on the basis of p-boxes, and its upper and lower bounds are obtained by global optimization. Finally, through a numerical study, the importance p-boxes of various CCF events are calculated and the evolution of the importance over time is plotted; the results confirm the effectiveness of the proposed method. This paper only measures the importance of various CCF events to system reliability. In future work, the importance of coupling common causes to system reliability should be further investigated, which can provide guidance for system design and maintenance.

Fig. 1. BN of decomposed component in CCCG of size 3

Fig. 4. The structure of the example complex system

Fig. 6. System reliability ignoring uncertainties and CCFs

This work was partially supported by the National Natural Science Foundation of China under contract No. 51805073 and U1830207, the Chinese Universities Scientific Fund under contract No. ZYGX2018J061, and the Sichuan Science and Technology Project under contract No. 2019JDJQ0015. Jinhua Mi wishes to acknowledge the financial support of the China Scholarship Council.

Table 2. Parameters of system components