Research on Hidden Failure Reliability Modeling of Electric Power System Protection

For digital relay protection systems, a novel hidden failure Markov reliability model is presented for single main protection and double main protection systems, based on hidden failure and protection function under Condition-Based Maintenance (CBM), and reliability indices such as the probability of the protection system hidden failure state are calculated. The impacts of different parameters (including human errors) on the hidden failure state probability, and the optimal measures to improve reliability, are analyzed by the variable parameter method. The results show that, compared with a single main protection, a double main protection system has a higher hidden failure probability, so the probability of the truly good state decreases. The reliability of both main protections must therefore be improved at the same time, and the configuration of the whole protection system for the protected component should not be overly complicated. By improving the on-line self-checking and monitoring system of digital protection and by improving human reliability, the practical application of CBM can decrease the hidden failure state probability. Only in this way can the protection system be assured to work in a good state. The results provide a reference for protection system reliability engineering.


Introduction
Refs. [1][2][3][4][5] were the first to examine hidden failures in protection systems in detail. Later, many researchers studied protection hidden failure and its contribution to protection system reliability and power system reliability, and obtained many good results [6][7][8][9][10][11][12][13][14]. CBM (Condition-Based Maintenance) is now being applied to power systems and protection systems in China. Previously, hidden failure of protection was defined as a functional defect of the protection device. Under the new CBM circumstance [15,16], hidden failure is defined as a hidden defect of protection that cannot be detected by CBM means such as the on-line self-checking and monitoring system, and that may result in mal-operation or non-operation of the protection system under certain conditions; for example, the settings of the protection do not change according to the operation mode of the protected equipment. CBM is based on the condition of the protection device instead of its operation time, so it can decrease test time and test cost. CBM targets the hidden failure state of the protection system; the degree to which it is put into practice determines how well the protection system stays in a good state.
When studying protection system reliability with the Markov method, it is often assumed that the failure rate and repair rate of the protection are constant. Because CBM substitutes on-line self-checking and monitoring for the routine test, the routine test interval does not need to be considered. In the following, for digital relay protection systems, a novel hidden failure Markov reliability model is presented separately for a single main protection system and a double main protection system, based on hidden failure and protection function under CBM, and reliability indices such as the probability of the protection system hidden failure state are calculated. The impacts of different parameters (including human errors) on the hidden failure state probability, and the optimal measures to improve reliability, are analyzed by the variable parameter method. The results provide a reference for protection system reliability engineering and for the application of CBM in protection systems.

Hidden Failure Reliability Model of Single Protection System
First, the hidden failure reliability model of a single main protection is presented as Model 1, shown in Figure 1. When studying the reliability of a protection system, each state of the system must be considered, together with the probability of each state and the transition rates between states. The Markov process is a useful tool for analyzing these questions. In Figure 1, state 1 is the normal state of the protected component and the protection equipment; state 2 is the state in which the component fails and its protection operates correctly, and after the component is repaired the system returns to state 1; state 3 is the state in which the component is good and the protection has a self-checkable failure; state 4 is the state in which the component is good and the protection has a non-self-checkable mal-operation failure; state 5 is the state in which the component is good and the protection has a non-self-checkable non-operation failure; state 6 is the state in which a hidden mal-operation is triggered under an external fault or its own fault condition, and non-self-checkable mal-operation of the protection occurs; state 7 is the state in which the component fails and non-self-checkable non-operation of the protection occurs: if the component is repaired first, the system goes to state 3, and if the protection is repaired first, it goes to state 2; state 8 is the state in which the component fails and the protection's mal-operation is regarded as correct operation, and after the component is repaired the system goes to state 4. The hidden mal-operation state (state 4) can convert to the hidden non-operation state (state 5) and vice versa.
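The state probabilities of a Markov model of this kind are the steady-state solution of pi Q = 0 with the probabilities summing to 1, where Q is the transition rate (generator) matrix. The following is a minimal sketch of that computation in plain Python. It does not reproduce the eight-state Model 1: the three-state model, the `reveal` rate at which a hidden failure leaves its state, and all numerical values are illustrative assumptions, not data from this paper.

```python
def steady_state(Q):
    """Solve pi @ Q = 0 with sum(pi) = 1 for a CTMC generator matrix Q."""
    n = len(Q)
    # Transpose Q to get the balance equations, then replace the last
    # equation by the normalization condition sum(pi) = 1.
    A = [[Q[j][i] for j in range(n)] for i in range(n)]
    A[-1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    pi = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][c] * pi[c] for c in range(i + 1, n))
        pi[i] = (b[i] - s) / A[i][i]
    return pi


def toy_generator(lam_p, c1, mu, reveal):
    """Hypothetical 3-state model (assumption, not Model 1).

    State 0: protection good.
    State 1: self-checkable failure, repaired at rate mu.
    State 2: hidden failure, left at the assumed rate `reveal`.
    c1 is the fraction of protection failures caught by self-checking.
    """
    return [
        [-lam_p, c1 * lam_p, (1.0 - c1) * lam_p],
        [mu, -mu, 0.0],
        [reveal, 0.0, -reveal],
    ]


pi = steady_state(toy_generator(lam_p=0.02, c1=0.9, mu=50.0, reveal=1.0))
print("p_good=%.6f p_selfcheck=%.6f p_hidden=%.6f" % tuple(pi))
```

The same `steady_state` function applies to a larger generator matrix, such as one built from the eight states of Figure 1, once the transition rates between those states are specified.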
In Figure 1, λ_C is the failure rate of the component being protected, μ_c is the repair rate of the component being protected, and λ_P is the failure rate of the protection (it consists of the hardware failure rate and the software failure rate). Through Equation (1) and (2

Hidden Failure Reliability Model of Double Main Protection System
The reliability model of the double main protection system is presented as Model 2, shown in Figure 2. The model is similar to Model 1 but more complicated because of the double main protection; protections P1 and P2 have identical status. Define λ_P as the failure rate of protection P1; the parameters of main protection P1 are identical to those of Model 1.
As for protection P2, λ_P2 is the failure rate of the protection, is non-self-checkable mal-operation rate of protection,

Hidden Failure Reliability Model of Single Protection System Considering Human Error
Human error can be defined as any improper action that results in events affecting the proper action of the system. From a system point of view, even with reliable hardware and software, human error remains a great threat to system safety [17][18][19][20]. For example, incorrect operation by operating personnel occurred in the cascading outage of the interconnected power grid of the southwestern United States and northern Mexico on Sept. 8, 2011, so human error has become an important factor that deserves attention. The causes of human error include fatigue and sleeplessness, anger, emotional upset, lack of skill, hunger, letdown from low blood sugar, medication, drugs and so on. Human error can be divided into seven kinds: design error, operator error, fabrication error, maintenance error, contributory error, inspection error and handling error.
There are numerous techniques available for conducting human reliability assessment, such as THERP (technique for human error rate prediction) and HEART (human error assessment and reduction technique). Through these methods we can obtain the failure probability of human operation. Here, human error is described by a constant mean failure probability.
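As an illustration of how such a method yields a failure probability, the sketch below applies the standard HEART calculation: a nominal human error probability for the generic task is multiplied, for each error-producing condition, by (EPC - 1) x APOA + 1, where EPC is the condition's maximum multiplier and APOA is the assessed proportion of affect between 0 and 1. The function name and all numerical values are hypothetical and are not taken from this paper or from the HEART tables.

```python
def heart_hep(nominal_hep, conditions):
    """Human error probability by the HEART multiplication rule.

    nominal_hep: nominal unreliability of the generic task type.
    conditions: list of (epc_multiplier, assessed_proportion_of_affect).
    """
    hep = nominal_hep
    for epc, apoa in conditions:
        if epc < 1.0 or not 0.0 <= apoa <= 1.0:
            raise ValueError("expected epc >= 1 and 0 <= apoa <= 1")
        # Each condition scales the probability by (EPC - 1) * APOA + 1.
        hep *= (epc - 1.0) * apoa + 1.0
    # A probability cannot exceed 1.
    return min(hep, 1.0)


# Hypothetical task: nominal value 0.003 with two error-producing conditions.
hep = heart_hep(0.003, [(11.0, 0.4), (3.0, 0.5)])
print("assessed human error probability: %.4f" % hep)
```

A value obtained this way could serve as the constant mean human error probability used in the model of this section.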
The two fault modes of a protection system are mal-operation and non-operation, so the impact of human error on the protection system is also of two kinds: mal-operation and non-operation. In the following analysis, it is assumed that human error appears after some operation and repair.
The hidden failure reliability model of the single main protection system considering human error is presented as Model 3, shown in Figure 3. This model is based on Model 1, and two kinds of human error are considered: 1) protection system mal-operation owing to incorrect operation by operating personnel, for example, when dispatching personnel or the operator on duty fails to follow the correct procedure; 2) the protection system is not completely good after repair, for example, the settings of the protection are not changed after repair, which may cause hidden mal-operation or non-operation of the protection system.
In Figure 3, when protection P trips incorrectly owing to human error, state 1 goes to state 6; when protection P is not repaired completely owing to human error, state 3 goes to state 4 (hidden mal-operation state) or state 5 (hidden non-operation state). For protection P, K_h1 is the mean human error rate and v_1 is the mal-operation percentage owing to human error; with these, we can obtain the same reliability indices as for Model 1.

Case Studies
Taking the data of Table 1 as an example, we calculate the reliability indices of the three models and analyze the results; the computation results are shown in Table 2. Using the variable parameter method, the p_hidden curve of Model 1 under different C_1 is shown in Figure 4 (that is, for a given C_1, the curve of p_hidden is obtained as λ_P increases), the p_hidden curve of Model 2 under different C_1 is shown in Figure 5 (for Model 2, when λ_P2 increases, the p_hidden curve under different C_2 is the same as Figure 5), and the impact of human error on p_hidden of Model 3 is shown in Figure 6. From Table 2 and Figures 4 to 6, we can draw the following conclusions:

Compared to Model 1, Model 2 has a higher p_hidden and p_hw and a lower p_hj. This shows that redundant protection can decrease the hidden non-operation state probability, but at the same time it increases the hidden mal-operation state probability; the hidden failure state probability therefore increases, and the completely good state probability of the protection system decreases. This trade-off must be considered when using redundant protection.


For Model 3, when K_h1 increases, p_hidden increases; when v_1 increases as the arrow shows, p_hidden decreases; compared with Model 1, when K_h1 is small, it has little impact on these indices. This means that the mean human error rate and the mal-operation percentage owing to human error can affect the hidden failure state probability, so all feasible measures must be taken to decrease the human error rate and improve the reliability of the protection system.


From Figure 4 and Figure 5, we can see that the curves of the hidden failure state probability of Model 1 and Model 2 under different C_1 are similar: when λ_P increases, p_hidden increases; when C_1 increases, p_hidden decreases. This shows that the failure rate of the protection and the self-checkable success rate of the protection greatly affect the reliability of the protection system, and that the reliability of both main protections must be improved at the same time. By improving the on-line self-checking and monitoring system of digital protection, the practical application of CBM can decrease the hidden failure state probability. When the reliability of a single main protection system is high, a simplified configuration of the whole protection system can be considered.
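The variable parameter method used for these curves can be sketched as a sweep of one parameter while the others are held fixed. The example below uses a hypothetical three-state model (good, self-checkable failure, hidden failure) with a closed-form steady state, not Model 1 or Model 2; the repair rate `mu`, the assumed `reveal` rate at which a hidden failure leaves its state, and all values are illustrative assumptions rather than the data of Table 1.

```python
def p_hidden(lam_p, c1, mu=50.0, reveal=1.0):
    """Steady-state hidden failure probability of a toy 3-state model.

    Assumed transitions: good -> self-checkable failure at c1 * lam_p,
    good -> hidden failure at (1 - c1) * lam_p, and back to good at
    rates mu and reveal respectively.
    """
    a = c1 * lam_p / mu                 # self-checkable state relative to good
    h = (1.0 - c1) * lam_p / reveal     # hidden state relative to good
    return h / (1.0 + a + h)


def sweep(lam_values, c1_values):
    """One p_hidden curve over lam_values for each self-checking rate c1."""
    return {c1: [p_hidden(lam, c1) for lam in lam_values] for c1 in c1_values}


lam_values = [0.01 * k for k in range(1, 6)]
curves = sweep(lam_values, [0.80, 0.90, 0.95])
for c1, curve in curves.items():
    print("C_1=%.2f" % c1, ["%.5f" % p for p in curve])
```

Under these assumptions the toy model shows the same qualitative trend as described above: each curve rises with the protection failure rate, and a higher self-checking success rate gives a lower curve.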

Conclusions
For digital protection systems, measures must be taken not only to decrease the mal-operation probability and non-operation probability, but also to decrease the hidden failure state probability. Compared with a single protection, a double main protection system has a higher hidden failure state probability, so the probability of the truly good state decreases. The reliability of both main protections must therefore be improved at the same time, and the configuration of the protection system for the protected component should not be overly complicated (such as two-out-of-three voting). Human error can increase the hidden failure state probability of the protection system, so human error must be reduced during normal operation and maintenance. By improving the on-line self-checking and monitoring system of digital protection, the practical application of CBM can decrease the hidden failure state probability. Only in this way can the protection system be assured to work in a good state. The results provide a reference for protection system reliability engineering.

Figure 1. Hidden failure reliability model of single main protection system.

Figure 2. Hidden failure reliability model of double main protection system.

Figure 3. Hidden failure reliability model of single main protection system considering human error.
), we can get the steady-state transition probability matrix B and each state probability. non-operation rate of protection, μ_2 is the repair rate of the protection, is the repair rate of both protections at the same time. Define: C_9 = C_1 λ_P, C_10 = C_2 λ_P2.