Markov Chain Reliability Model of Cogeneration Power Plant Substation

The paper presents a Markov chain reliability model of a cogeneration power plant substation. The stochastic automata networks formalism and functional transition rates were used to specify the reliability behavior of the system. Iterative solution techniques were used to find the steady-state solution of Markov models with different sets of randomly generated failure and repair rates. The modeling results were used to perform uncertainty and sensitivity analysis of the reliability model. DOI: http://dx.doi.org/10.5755/j01.eee.19.5.1214


I. INTRODUCTION
The Markov chain is an effective statistical modeling technique that can describe the complex behavior of various stochastic systems and has a well-developed mathematical apparatus. Examples of Markov chain models can be found in computer and telecommunication networks [1], engineering [2] and biological systems [3]. Markov chains can also be used in reliability modeling.
However, some examples of Markov chains in reliability modeling deal with relatively small systems (fewer than 100 states) [4], assume total independence of model units [5] or do not address the model solution [6]. Markov chain models of industrial power systems can have thousands or even millions of states. This means that efficient model specification techniques and fast computation algorithms are very important in reliability modeling.
In this paper, a Markov chain is used to model the reliability of a cogeneration power plant substation. The stochastic automata networks (SAN) formalism [7] was used to specify the system behavior. Stochastic automata networks have been successfully applied to model the availability of large computer networks [8]. We think that the SAN formalism is suitable for specifying reliability models of power systems and estimating performance measures of a system under investigation.
One of the advantages of a Markov chain model is that it allows computing the steady-state probabilities of all system states, which helps to estimate the probabilities of rare events and failure scenarios. This would be a difficult task for simulation, which would require a lot of CPU time or the implementation of special modeling techniques [9].
The lack of statistical data is an important issue in reliability modeling, since parameter uncertainty can lead to the misestimation of system measures [10]. In this paper, the Markov chain model is solved with different sets of parameters, which allows performing uncertainty and sensitivity analysis [11]. The necessary computations can be executed efficiently using a Markov chain model and iterative solution algorithms.

II. MARKOV CHAIN MODELS AND SAN
In this paper, stationary analysis of an irreducible and homogeneous continuous-time Markov chain is performed. A Markov chain describes a system as a discrete set of states with possible transitions among them. In realistic models the state space can be large (thousands or millions of states), so numerical modeling techniques must be applied. The numerical analysis can be divided into three main stages.
1) Generation of system states and the transition matrix (an infinitesimal generator matrix). In reliability modeling this means specifying possible failure scenarios, failure and repair rates, etc. Failure and repair rates can be evaluated statistically and stored in an infinitesimal generator matrix $Q$.
2) Computation of a steady-state probability vector $\pi$ from the system of linear equations
$$\pi Q = 0. \quad (1)$$
Equation (1) can also be interpreted as a left eigenvector problem. Since the infinitesimal generator matrix $Q$ is singular, an additional condition is used, i.e. the sum of all probabilities must be equal to 1. It can be expressed in matrix form as
$$\pi e = 1,$$
where $e$ is a column vector of ones. In realistic models the matrix $Q$ is large, which means that numerical methods should be applied to find a steady-state solution. The standard methods for this type of problem are direct algorithms, iterative methods and projection methods [12].
3) Computation of reliability measures using the steady-state probabilities. Usual measures in reliability modeling include system availability, the mean time between system failures, the frequency of system failure, etc. For example, system availability (i.e., the probability that the system is available) can be calculated according to the formula
$$P_A = \sum_{i \in A} \pi_i,$$
where $A$ denotes the set of states in which the system is available.
The third stage requires a thorough selection and analysis of all system states, though it is not as time-consuming as the computation of the steady-state solution.
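The three stages can be illustrated on a deliberately tiny model. The following Python sketch assumes a single repairable item with hypothetical rates `lam` and `mu` (they are not taken from Table I), and it solves (1) by dense Gaussian elimination, which is only reasonable for very small state spaces. The helper name `steady_state` is an illustrative choice, not part of the original work.

```python
def steady_state(Q):
    """Solve pi Q = 0 with sum(pi) = 1 by Gaussian elimination (small models only)."""
    n = len(Q)
    # Transpose so that the unknowns form a column vector: Q^T pi^T = 0.
    A = [[Q[j][i] for j in range(n)] for i in range(n)]
    b = [0.0] * n
    # Q is singular, so the last equation is replaced by the normalisation condition.
    A[-1] = [1.0] * n
    b[-1] = 1.0
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    pi = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(A[r][c] * pi[c] for c in range(r + 1, n))
        pi[r] = (b[r] - s) / A[r][r]
    return pi


# Stage 1: states and generator (state 0 = operating, state 1 = failed).
lam, mu = 0.02, 50.0  # hypothetical failure and repair rates per year
Q = [[-lam, lam],
     [mu, -mu]]

# Stage 2: steady-state probabilities.
pi = steady_state(Q)

# Stage 3: reliability measure (availability = probability of the operating state).
available_states = {0}
availability = sum(pi[i] for i in available_states)
print(availability)
```

For this two-state model the result can be checked against the closed form `mu / (lam + mu)`.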
One of the main problems in Markov chain modeling is the rapid growth of the number of system states. For example, a Markov chain model of a system consisting of 10 parallel items (each of which can be in 2 possible states: failed or operating) has $2^{10} = 1024$ states. Moreover, each additional item doubles the size of the state space, a phenomenon called state space explosion. This problem must be addressed in system specification and solution.
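One of the methods to mitigate the state space explosion is the use of the stochastic automata networks (SAN) formalism. The SAN method is based on specifying the system by its division into smaller interacting subsystems, called automata. Each automaton $A^{(i)}$ is associated with its own state space, and the transitions among its states can depend on other automata. Interactions among different automata can be modelled by synchronising events and functional transition rates. The infinitesimal generator matrix $Q$ of the entire network, i.e. the SAN descriptor, can be represented as a sum of Kronecker products of infinitesimal generators $Q^{(i)}$ (each describes the behaviour of an individual automaton [13]). The Kronecker product of matrices $X \in \mathbb{R}^{m \times n}$ and $Y \in \mathbb{R}^{p \times q}$ is defined as the $mp \times nq$ block matrix
$$X \otimes Y = \begin{pmatrix} x_{11} Y & \cdots & x_{1n} Y \\ \vdots & \ddots & \vdots \\ x_{m1} Y & \cdots & x_{mn} Y \end{pmatrix}.$$
The Kronecker sum of two square matrices $X$ (of order $n$) and $Y$ (of order $p$) is defined as $X \oplus Y = X \otimes I_p + I_n \otimes Y$, where $I_n$ denotes the identity matrix of order $n$. Since the Kronecker product is associative [13], i.e.
$$(X \otimes Y) \otimes Z = X \otimes (Y \otimes Z),$$
the Kronecker sum of $k$ square matrices $Q^{(i)}$ of orders $n_i$ can be defined as follows:
$$\bigoplus_{i=1}^{k} Q^{(i)} = \sum_{i=1}^{k} I_{n_1} \otimes \cdots \otimes I_{n_{i-1}} \otimes Q^{(i)} \otimes I_{n_{i+1}} \otimes \cdots \otimes I_{n_k}.$$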
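The Kronecker sum can be made concrete with a short Python sketch. It assumes fully independent automata (no functional rates or synchronising events), and the three two-state items with their rates are hypothetical values chosen only for illustration.

```python
def kron(X, Y):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_x for y in row_y]
            for row_x in X for row_y in Y]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def kron_sum(matrices):
    """Kronecker sum: the i-th term is I x ... x Q_i x ... x I."""
    sizes = [len(M) for M in matrices]
    total = 1
    for n in sizes:
        total *= n
    result = [[0.0] * total for _ in range(total)]
    for i, M in enumerate(matrices):
        term = [[1.0]]
        for j, n in enumerate(sizes):
            term = kron(term, M if j == i else identity(n))
        for r in range(total):
            for c in range(total):
                result[r][c] += term[r][c]
    return result


def two_state_generator(lam, mu):
    """Generator of one independent item: state 0 = operating, 1 = failed."""
    return [[-lam, lam], [mu, -mu]]


# Hypothetical (failure, repair) rates per year for three independent items.
rates = [(0.02, 50.0), (0.05, 100.0), (0.01, 20.0)]
Q = kron_sum([two_state_generator(lam, mu) for lam, mu in rates])
print(len(Q))  # order of the global generator
```

Three 2 x 2 generators are enough to describe the resulting 8 x 8 global generator, which is the storage saving that the descriptor form is meant to provide.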

III. SAN DESCRIPTOR OF POWER PLANT SUBSTATION RELIABILITY MODEL
We assume a cogeneration power plant substation (0.4/10.5 kV) consisting of two independent blocks. The first block has a single transformer (T1), and the second block has two transformers (T2-T3) connected to a busbar section (B2). Each block consists of transformers (T1-T3), switches (S1-S6), circuit breakers (C1-C3) and busbar sections (B1-B2). These items are connected with lines, which can also fail. The reliability model of the power plant substation is created with the following assumptions:
1) Each item can be in one of two possible states, i.e. operating or failed;
2) Failed items are detected immediately and repair is initiated;
3) There is no limit on repair capacity;
4) A repaired item is as good as new;
5) An item cannot fail if the power is disconnected;
6) Failure and repair times are exponentially distributed.
There are 14 different items connected with 12 line segments, which means that a Markov chain reliability model built directly from these two-state items has $2^{26}$ states and a transition matrix of size $2^{26} \times 2^{26}$.
The second method of state space reduction is the lumping of states with identical items. For example, if there are two switches in a consecutive branch, their failures can be represented by one state instead of two. In this case, the identical failure rates add up and the repair rate remains the same.
The proposed model description leads to a network consisting of 4 automata. The first automaton $A^{(1)}$ describes the first block (T1-S1-C1-S2-B1). The infinitesimal generator of the first automaton is given by matrix (6). The first row of (6) corresponds to the operating state and contains the failure rates, while the other rows correspond to states in which one item has failed. Diagonal elements (denoted as *) are the negative sums of the off-diagonal elements of the respective rows of matrix (6).
In the model, the second block is described by three automata, according to the resulting circuit breaker actions when items are being repaired. The second and the third automata, $A^{(2)}$ and $A^{(3)}$, consist of identical items, so their infinitesimal generators coincide, $Q^{(2)} = Q^{(3)}$, and are given by matrix (7). The fourth automaton represents the (S4-B2-S6) part of the power plant substation. The infinitesimal generator matrix of $A^{(4)}$ is given by (8). In matrices (7) and (8), $f$ refers to functional transition rates, because the failure rates of these automata depend on each other. Each function can be expressed as an indicator
$$f_i = \delta\left(sA^{(i)} = 0\right),$$
where $\delta(\cdot)$ equals 1 if the condition holds and 0 otherwise, and $sA^{(i)}$ denotes the state of the $i$-th automaton; i.e., each function indicates whether the respective automaton is in the operating state. For example, if the transformer T2 is under repair, the branch (T2-S3-C2) is disconnected, while (T3-S5-C3) can still operate.
The global infinitesimal generator matrix $Q$ of the whole reliability model can be expressed as a Kronecker sum of the infinitesimal generators of the automata:
$$Q = Q^{(1)} \oplus_g Q^{(2)} \oplus_g Q^{(3)} \oplus_g Q^{(4)}. \quad (10)$$
The subscript $g$ in (10) denotes the generalization of the Kronecker sum to matrices with functional transition rates.
The state space of the system can be described as a set of 4-tuples
$$\left(sA^{(1)}; sA^{(2)}; sA^{(3)}; sA^{(4)}\right). \quad (11)$$
For example, the state (0;1;0;0) means that the second automaton $A^{(2)}$ is in failed state number 1 (which means that transformer T2 has failed), while all other automata are in the operating state.
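The effect of a functional transition rate can be shown on a toy network of two automata. The sketch below does not reproduce matrices (7) and (8); it assumes, purely for illustration, that an automaton can fail only while the other one is in the operating state 0, and it builds the global generator by enumerating state tuples instead of evaluating a generalized Kronecker sum. All rates are hypothetical.

```python
from itertools import product

lam = [0.02, 0.03]   # hypothetical failure rates per year
mu = [50.0, 40.0]    # hypothetical repair rates per year


def f(other_state):
    """Indicator function: 1 if the other automaton is operating (state 0)."""
    return 1.0 if other_state == 0 else 0.0


# Global states are tuples (state of automaton 1, state of automaton 2).
states = list(product(range(2), repeat=2))
index = {s: k for k, s in enumerate(states)}
n = len(states)
Q = [[0.0] * n for _ in range(n)]

for s in states:
    for a in range(2):
        other = s[1 - a]
        target = list(s)
        if s[a] == 0:
            rate = lam[a] * f(other)   # functional failure rate
            target[a] = 1
        else:
            rate = mu[a]               # constant repair rate
            target[a] = 0
        if rate > 0.0:
            Q[index[s]][index[tuple(target)]] += rate

# Diagonal elements are the negative sums of the off-diagonal row elements.
for r in range(n):
    Q[r][r] = -sum(Q[r][c] for c in range(n) if c != r)

for s, row in zip(states, Q):
    print(s, row)
```

Under this assumed rule the state (1, 1) cannot be entered from any other state, which shows how functional rates remove failure combinations that assumption 5 rules out.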

IV. MODELING RESULTS
After reducing the state space, the Markov chain reliability model of the power plant substation has only 750 states. In this case it is possible to store the infinitesimal generator matrix $Q$ in RAM and to solve the model by direct methods. However, since the matrix $Q$ is very sparse (most of its elements are zeros), a more efficient approach is to use sparse storage and iterative solution methods.
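A minimal Python sketch of sparse storage combined with a Gauss-Seidel iteration is given below. It is not the implementation used for the 750-state model; it assumes a hypothetical system of two independent repairable items, chosen because its exact steady-state solution has a product form that can be used as a check.

```python
def to_sparse_columns(Q):
    """Keep only nonzero off-diagonal entries, grouped by column, plus the diagonal."""
    n = len(Q)
    cols = [[(i, Q[i][j]) for i in range(n) if i != j and Q[i][j] != 0.0]
            for j in range(n)]
    diag = [Q[j][j] for j in range(n)]
    return cols, diag


def gauss_seidel(cols, diag, start, tol=1e-12, max_iter=1000):
    """Gauss-Seidel sweeps for pi Q = 0, renormalised after every sweep."""
    pi = list(start)
    for sweep in range(1, max_iter + 1):
        change = 0.0
        for j in range(len(pi)):
            # Balance equation of state j: pi_j * (-q_jj) = sum_i pi_i * q_ij.
            new = -sum(pi[i] * q for i, q in cols[j]) / diag[j]
            change = max(change, abs(new - pi[j]))
            pi[j] = new
        total = sum(pi)
        pi = [p / total for p in pi]
        if change < tol:
            return pi, sweep
    raise RuntimeError("Gauss-Seidel did not converge")


# Two independent repairable items; states ordered (0,0), (0,1), (1,0), (1,1).
l1, m1, l2, m2 = 0.02, 50.0, 0.05, 100.0   # hypothetical rates per year
Q = [[-(l1 + l2), l2, l1, 0.0],
     [m2, -(m2 + l1), 0.0, l1],
     [m1, 0.0, -(m1 + l2), l2],
     [0.0, m1, m2, -(m1 + m2)]]

cols, diag = to_sparse_columns(Q)
pi, sweeps = gauss_seidel(cols, diag, [0.25] * 4)
print(pi, sweeps)
```

Only the nonzero entries are visited in each sweep, so the cost per sweep grows with the number of nonzeros rather than with the square of the number of states.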
One of the advantages of the Markov model is that it allows generating all possible system states and calculating the steady-state probabilities of even the rarest failure scenarios. This task would be more difficult using a simulation approach, since it would require a lot of CPU time [9].
Statistical data collected by the Lithuanian Energy Institute and from [14] were used to obtain the modeling results. The model parameters (failure and repair rates per year) are presented in Table I. Some probable failure scenarios and their steady-state probabilities (Table II) are:

System state    Probability
(0;1;0;0)       0.00010268
(5;0;0;0)       0.00004936
(0;4;0;0)       0.00003702
(0;0;3;0)       0.00003295
(0;0;0;3)       0.00002468

A special property of reliability modelling is the lack of statistical data to evaluate failure and repair rates precisely. Parameter uncertainty can lead to unreasonable conclusions and significant misestimation of system measures. Uncertainty analysis can mitigate the problem, but it requires repeated model solution in order to estimate system measures with different sets of model parameters. Markov chain models have a certain advantage over some other techniques (e.g. simulation) if iterative solution algorithms are applied to find the steady-state probabilities.
The time requirements for performing the uncertainty analysis by the simulation approach and by the Markov model are shown in Table III. $N$ denotes the number of different sets of model parameters.
The Markov model has an advantage since the steady-state solution vectors $\pi^{(i)}$ are relatively close to each other.
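Because the solution vectors are close to each other, an iterative solver can be started from an already computed solution instead of an arbitrary vector. The Python sketch below illustrates this warm start. It assumes a hypothetical four-state model of two independent items, hypothetical uniform ranges for the rates, and an availability rule (the system works unless both items have failed) that is not the one used for the substation. The function names follow the SteadyStateSolution and SystemMeasures steps of the pseudo-code.

```python
import random


def generator(l1, m1, l2, m2):
    """Two independent repairable items; states (0,0), (0,1), (1,0), (1,1)."""
    return [[-(l1 + l2), l2, l1, 0.0],
            [m2, -(m2 + l1), 0.0, l1],
            [m1, 0.0, -(m1 + l2), l2],
            [0.0, m1, m2, -(m1 + m2)]]


def steady_state_solution(Q, start, tol=1e-12, max_iter=1000):
    """Gauss-Seidel for pi Q = 0 starting from `start`; returns (pi, sweeps)."""
    n = len(Q)
    pi = list(start)
    for sweep in range(1, max_iter + 1):
        change = 0.0
        for j in range(n):
            new = -sum(pi[i] * Q[i][j] for i in range(n) if i != j) / Q[j][j]
            change = max(change, abs(new - pi[j]))
            pi[j] = new
        total = sum(pi)
        pi = [p / total for p in pi]
        if change < tol:
            return pi, sweep
    raise RuntimeError("no convergence")


def system_measures(pi):
    """Availability, assuming the system works unless both items have failed."""
    return 1.0 - pi[3]


rng = random.Random(0)
N = 200
# Hypothetical uniform ranges U(a; b) for the rates l1, m1, l2, m2.
ranges = [(0.018, 0.022), (45.0, 55.0), (0.045, 0.055), (90.0, 110.0)]
params = [[rng.uniform(a, b) for a, b in ranges] for _ in range(N)]

# The first parameter set is solved from a uniform starting vector (cold start).
pi_1, cold_sweeps = steady_state_solution(generator(*params[0]), [0.25] * 4)
availability = [system_measures(pi_1)]
warm_sweeps = []
for p in params[1:]:
    pi_i, sweeps = steady_state_solution(generator(*p), pi_1)  # warm start
    availability.append(system_measures(pi_i))
    warm_sweeps.append(sweeps)

print(cold_sweeps, sum(warm_sweeps) / len(warm_sweeps))
```

The printed sweep counts allow the cold start and the warm starts to be compared for this toy model; how much is gained on a larger model depends on its size and structure.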
Setting the vector $\pi^{(1)}$ as the initial approximation ensures fast convergence for the rest of the calculations. This can be described using pseudo-code in which every call of the steady-state solver receives $\pi^{(1)}$ as its starting vector.

System availability was estimated with different sets of model parameters, which allows evaluating the distribution of the system availability (Fig. 2). It was assumed that the system is available if at least two transformers are operating. In that case, the set of states in which the system is available consists of the 4-tuples (11) that satisfy this condition.

The MATLAB Statistics Toolbox was used to analyze the modeling results. The experiments showed that the beta distribution, with probability density function
$$f(x) = \frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)}\, x^{a-1} (1-x)^{b-1}, \quad 0 < x < 1,$$
was the best fit to the modeling data (Table V), since it provides the highest log-likelihood value ($\Gamma$ denotes the gamma function). The beta distribution is also more suitable than left-bounded distributions (e.g., lognormal or Weibull), because its domain (between 0 and 1) matches the range of a probability.
The estimated average system availability is about 0.999081, which means that the power plant is shut down on average for 8 hours and 3 minutes per year.
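The conversion from availability to yearly downtime is simple arithmetic: the unavailability is multiplied by the number of hours in a year. A short Python check, assuming a year of 8760 hours (365 days), is:

```python
HOURS_PER_YEAR = 8760  # assumption: 365 days, leap years ignored


def downtime_per_year(availability):
    """Return the expected yearly downtime as (whole hours, rounded minutes)."""
    hours = (1.0 - availability) * HOURS_PER_YEAR
    whole_hours = int(hours)
    minutes = round((hours - whole_hours) * 60)
    return whole_hours, minutes


print(downtime_per_year(0.999081))  # (8, 3)
```

With an availability of 0.999081 the unavailability is 0.000919, and 0.000919 x 8760 h is about 8.05 h, i.e. 8 hours and 3 minutes.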
The chi-square goodness-of-fit test supported our distribution fitting results. The estimated p-value was 0.1733, which indicates that the null hypothesis, i.e. that system availability has the beta distribution with parameters a = 11272.3 and b = 10.3635, cannot be rejected at the standard 0.05 significance level.
The estimated distribution of system availability can be used as a part of a larger simulation model, because system availability can now be rapidly simulated by generating beta-distributed random variables.
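As a hedged illustration of this idea, the following Python sketch draws availability values from the fitted beta distribution with the standard library generator `random.betavariate`. The parameter values are those quoted for the uniformly distributed model parameters; the sample size and the seed are arbitrary choices.

```python
import random

A_PARAM, B_PARAM = 11272.3, 10.3635  # fitted beta parameters quoted in the text

rng = random.Random(42)  # fixed seed, chosen only for reproducibility
samples = [rng.betavariate(A_PARAM, B_PARAM) for _ in range(10000)]

sample_mean = sum(samples) / len(samples)
theoretical_mean = A_PARAM / (A_PARAM + B_PARAM)  # mean of a beta distribution
print(sample_mean, theoretical_mean)
```

The mean of the beta distribution, a / (a + b), is close to the reported average availability of 0.999081, which gives a quick consistency check of the fitted parameters.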
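Other numerical experiments were conducted under the assumption that the model parameters are normally distributed. However, since failure and repair rates cannot be negative, a one-sided (left-tail) truncation of the normal distribution was used, which leads to the probability density function
$$f(x) = \frac{\frac{1}{\sigma}\,\varphi\!\left(\frac{x-\mu}{\sigma}\right)}{1 - \Phi\!\left(\frac{x_l-\mu}{\sigma}\right)}, \quad x \ge x_l,$$
where $\varphi$ and $\Phi$ denote the standard normal density and distribution function, $\mu$ and $\sigma$ are the mean and standard deviation of the untruncated distribution, and $x_l$ is the left truncation point. Truncated normal random numbers were generated using the acceptance-rejection method. In this case it requires a similar amount of CPU time as the generation of standard normal random numbers, since the left truncation point $x_l$ is far from the mean value $\mu$.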
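One simple acceptance-rejection scheme, shown here as a sketch and not necessarily the variant used in the experiments, proposes values from the untruncated normal distribution and rejects those below the truncation point. The values of `mu`, `sigma` and `x_left` are hypothetical.

```python
import random


def truncated_normal(rng, mu, sigma, x_left):
    """Draw from N(mu, sigma^2) conditioned on x >= x_left by simple rejection."""
    while True:
        x = rng.gauss(mu, sigma)
        if x >= x_left:  # accept only values to the right of the truncation point
            return x


rng = random.Random(1)
mu, sigma, x_left = 0.02, 0.002, 0.0   # hypothetical failure rate parameters
rates = [truncated_normal(rng, mu, sigma, x_left) for _ in range(5000)]
print(min(rates), sum(rates) / len(rates))
```

When the truncation point lies many standard deviations below the mean, as in this example, almost every proposal is accepted, so the cost is close to that of ordinary normal sampling.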
As for the uniformly distributed model parameters, we computed the steady-state probabilities with 1000 different sets of normally distributed model parameters and estimated the distribution of system availability (Fig. 3). As with uniformly distributed model parameters, the beta distribution was the best fit (Table VI). The estimated average system availability is about 0.999091, which means that the power plant is shut down on average for 7 hours and 58 minutes per year, slightly less than in the case of uniformly distributed model parameters.
The chi-square goodness-of-fit test did not reject our null hypothesis, i.e. that system availability has the beta distribution with parameters a = 11933.8 and b = 10.858. In this case the estimated p-value of 0.3247 is almost two times higher than in the case of uniformly distributed parameters, and it is well above the standard 0.05 significance level. This indicates that the beta distribution provides a good fit for modeling the system availability.
Sensitivity analysis was performed in order to identify the model parameters which contribute significantly to the system availability. For this purpose we measured the correlation between the randomly generated failure and repair rates and the computed system availability. The following correlation coefficients were estimated: Pearson's r, Spearman's ρ and Kendall's τ. Failure and repair rates which are significantly correlated with the system availability (at significance level 0.05) are presented in Table VII. The results of the correlation sensitivity analysis confirm the intuitive assumptions about the reliability model. The explanation of the negative correlation between system availability and failure rates (and of its positive correlation with repair rates) is straightforward. The transformer has the most significant effect, since it has the highest failure rate and the lowest repair rate (which leads to a longer average repair time). The fact that the number of line segments in the reliability model exceeds the number of any other items could explain their significance.
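The three correlation coefficients can be computed with a few lines of Python. The sketch below is only an illustration: it assumes a single repairable item whose availability is mu / (lam + mu), hypothetical uniform ranges for the rates, and data without ties; it does not compute the significance levels reported in Table VII.

```python
import random


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """Ranks 1..n; ties are not handled (continuous data assumed)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall's tau-a by direct pair counting, O(n^2)."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return 2.0 * s / (n * (n - 1))


rng = random.Random(7)
lam = [rng.uniform(0.01, 0.03) for _ in range(500)]   # hypothetical failure rates
mu = [rng.uniform(40.0, 60.0) for _ in range(500)]    # hypothetical repair rates
avail = [m / (l + m) for l, m in zip(lam, mu)]

for name, rate in (("lam", lam), ("mu", mu)):
    print(name, pearson(rate, avail), spearman(rate, avail), kendall(rate, avail))
```

In this toy model the coefficients for the failure rate are negative and those for the repair rate are positive, in line with the interpretation given above.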
Correlation sensitivity analysis of normally distributed model parameters and system availability showed similar results (Table VIII).The same failure and repair rates are statistically significant in both sets of experiments.In both cases Pearson's and Spearman's correlation coefficients have higher values than Kendall's τ.

V. CONCLUSIONS
A Markov chain reliability model of a cogeneration power plant substation was presented. The stochastic automata networks formalism with functional transition rates is suitable for specifying the reliability behaviour of a system. The size of the state space can be lowered significantly if a suitable SAN descriptor is chosen.
The Markov chain reliability model and the iterative algorithms allow estimating the probabilities of rare failure scenarios with high precision. Repeated model solution with different sets of model parameters can be performed efficiently by the use of iterative methods and Markov chain models. This property was used to perform uncertainty and sensitivity analysis of the system availability of the power plant.
The obtained results showed that the beta distribution is the best fit for modeling the system availability. Failure and repair rates of transformers and line segments have the most significant effect on system availability.

Automata $A^{(2)}$ and $A^{(3)}$ describe two identical parts of consecutive branches in the second block of the substation.

Fig. 2. Uncertainty of system availability with uniformly distributed model parameters.
To generate normally distributed model parameters with mean values and variations close to those of the uniformly distributed parameters (Table IV), formulas (16)-(18) were used. Parameters $a$ and $b$ in (12) denote the minimum and maximum values of the uniform distribution $U(a; b)$. According to (16)-(18) and Table IV, transformer failure rates have a truncated normal distribution.

Fig. 3. Uncertainty of system availability with normally distributed model parameters.
Failure rates of transformers, busbars, switches, circuit breakers and line segments are denoted as $\lambda_t$, $\lambda_b$, $\lambda_s$, $\lambda_c$

TABLE I. FAILURE AND REPAIR RATES.
The steady-state probabilities were computed with $10^{-15}$ precision, using the Gauss-Seidel algorithm. In Table II we present some probable failure scenarios and their steady-state probabilities.

TABLE II. FAILURE SCENARIOS AND STEADY-STATE PROBABILITIES.

TABLE III
6. $\pi^{(i)} \leftarrow$ SteadyStateSolution($p^{(i)}$, $\pi^{(1)}$);
7. $s^{(i)} \leftarrow$ SystemMeasures($\pi^{(i)}$);
8. end for.

In this paper, uncertainty analysis was performed by the use of the Markov chain model with 2 sets of parameters (the size of each set is 1000) with different distribution laws. The first set of failure and repair rates has a uniform distribution (Table IV) with probability density function
$$f(x) = \frac{1}{b-a}, \quad a \le x \le b. \quad (12)$$

TABLE IV
Random generation of model parameters and steady-state calculation were performed in C++ Builder, using a PC with an AMD Athlon 64 X2 dual core processor 4000+ (2.10 GHz) and 896 MB of RAM (physical address extension). It took 1.514 seconds of CPU time to perform the entire calculation.

TABLE VII. CORRELATION BETWEEN SYSTEM AVAILABILITY AND UNIFORMLY DISTRIBUTED MODEL PARAMETERS.