A Markovian model for power transformer maintenance

The condition of the insulation paper is one of the key determinants of the lifetime of a power transformer. The winding insulation paper may deteriorate aggressively and cause the unexpected failure of a power transformer, especially in the presence of high moisture, oxygen, and metal contaminants. Such failures can be prevented if the deterioration is detected in time. Various condition monitoring techniques, such as dissolved gas analysis (DGA) and frequency response analysis (FRA), have been developed to assess transformer condition. They are non-intrusive and provide early warning of accelerated chemical and mechanical deterioration. However, the accuracy of these techniques is imperfect, which means periodic inspection is still indispensable. In this paper, we discuss the value of continuous condition monitoring for power transformers and present a way to estimate this value. To this end, a continuous-time Markov decision model is presented to optimize periodic inspections, so that the cost is minimized and the availability is maximized. We then analyze the performance based on the information from both discrete inspection and continuous condition monitoring using DGA and FRA. The results show that dissolved gas analysis can improve both the availability and the operation cost, while frequency response analysis can only improve the availability of power transformers.


Introduction
Power transformers are critical assets in a power transmission network. A failure of a power transformer may also cause cascading failures and a catastrophic blackout in the power grid. The need to increase the reliability and availability of power transformers can be seen directly from a financial point of view. Between 1997 and 2001, the total losses caused by power transformer failures in the US were over 286 million [1]. Moreover, the population of aging power transformers has grown since 1975 [2]. Together, these imply that power transformer failures, and the resulting load curtailment, can be expected to increase if the maintenance strategy remains the same.
In the literature, various types of maintenance models have been developed to address the problem of power transformer maintenance. Aldhubaib and Salama developed a reliability-centered maintenance and replacement approach to increase the lifetime of power transformers and reduce the annual cost [3]. Dhople et al. proposed a set-theoretic method for capturing the uncertainty in Markov reliability and reward models to maximize the availability of power transformers [4]. Abu-Elanien et al. developed a decision support system to determine the life expectancy of transformers from a techno-economic perspective [5]. Lima et al. designed a two-level framework of fault diagnosis and decision making for power transformers that considers the loss of life caused by overload conditions [6]. Abiri-Jahromi et al. developed a two-stage maintenance management model that contains both mid-term and short-term maintenance to maximize the serviceability of power transformers [7]. Koksal and Ozdemir improved power transformer maintenance using a Markovian model [8].
Because the condition state of a power transformer is considered to be discrete, most of the developed models are based on a Markovian deterioration model. However, in these models the deterioration of power transformers is oversimplified and modeled by a Markov chain with a single deterioration path. Such an approach is inaccurate because it overlooks the complexity of the deterioration of power transformers, such as the accelerated deterioration of insulation paper caused by high moisture. As a result, the effectiveness of condition-based maintenance that relies solely on periodic inspection is overestimated. Therefore, to re-estimate the value of continuous monitoring, it is essential to improve the deterioration model of power transformers. In practice, the accuracy of condition monitoring is imperfect and may be affected by interference from operation signals and external signals. Therefore, even for power transformers that already have condition monitoring devices installed, periodic inspection can still provide additional value by cross-checking the condition information estimated by condition monitoring. The objective of the paper is twofold: to optimize the condition-based maintenance of power transformers, and to explore the value of online monitoring from the perspective of the lifecycle of power transformers.
To achieve these objectives, in Section 2 we use cause-effect analysis on different subsystems of the power transformer to identify the potential risk of accelerated deterioration of insulation paper caused by the malfunction of different subsystems. In Section 3, we develop a continuous-time Markov chain model to optimize the maintenance of power transformers based on the information from inspection and continuous monitoring. Section 4 analyzes the value of different types of online monitoring numerically. Section 5 gives the concluding remarks of the paper.

Deterioration of power transformers
We systematically analyze the deterioration of power transformers to identify the information that can be used to improve the modeling of the power transformer. According to functionality and structure, power transformers can be divided into seven subsystems: winding, magnetic core, insulation oil, bushing, tap changer, tank and cooling equipment. In practice, the condition of the winding insulation paper is usually regarded as the index of power transformer condition [9,10]. The deterioration of the winding insulation paper may accelerate in the presence of high moisture, oxygen, and metal contaminants.
In this section, we aim to identify the potential risks of the accelerated deterioration of winding insulation paper caused by the malfunctions of other subsystems using cause-effect analysis. In general, the deterioration of winding insulation paper may accelerate in two ways: accelerated chemical aging and accelerated mechanical aging. The accelerated chemical deterioration is a combination of three interacting processes: pyrolysis, hydrolysis, and oxidation [11]. Hydrolysis is the dominant process in accelerated chemical degradation. The rate of hydrolysis depends on the moisture content and is catalyzed by acidity [12]. The increase in acidity is caused by sludge formation as a result of oxidation. The sludge also increases the temperature and accelerates pyrolysis. The accelerated chemical deterioration starts with the occurrence of contamination and moderate partial discharge. During the deterioration process, dissolved gas is generated. Eventually, the partial discharge results in treeing, tracking or even the breakdown of the winding insulation. The accelerated mechanical deterioration is usually initiated by the loss of clamping force or the distortion of winding geometry, which is mainly caused by abrasion under electromagnetic forces [13]. Under accelerated mechanical deterioration, the partial discharge proceeds to creeping and results in the breakdown of the winding insulation.
The aging of winding insulation is related to the moisture, acidity, oxygen, contaminant level, and clamping forces. Empirically, abnormality in these factors is usually caused by malfunctions of other subsystems. For example, an inelastic gasket on a bushing can increase the risk of excessive moisture, oxygen, and contaminant levels, which in turn accelerates the aging of the winding insulation and reduces the life of the power transformer. Malfunctions such as an inelastic gasket can be repaired at a minor cost if they are detected in time. However, the resulting deterioration of the insulation paper is irreversible and will significantly reduce the service lifetime of the power transformer. Therefore, it is valuable to identify the malfunctions of different subsystems that may result in the accelerated deterioration of the insulation paper. We have identified these malfunctions from the extant literature using cause-effect analysis. The result is illustrated in Fig. 1.

Nomenclature
$\pi_{i,j}$: steady-state probability of state $(i,j)$
$\lambda_a$: infant mortality rate of power transformers
$\lambda_{di,0}$: deterioration rate of winding insulation paper from state $(i,0)$ in the normal deterioration process
$\lambda_{di,1}$: deterioration rate of winding insulation paper from state $(i,1)$ in the accelerated chemical deterioration
$\lambda_{di,2}$: deterioration rate of winding insulation paper from state $(i,2)$ in the accelerated mechanical deterioration
$\lambda_{fi,1}$: transition rate from state $(i,0)$ in the normal deterioration to state $(i,1)$ in the accelerated chemical deterioration
$\lambda_{fi,2}$: transition rate from state $(i,0)$ in the normal deterioration to state $(i,2)$ in the accelerated mechanical deterioration
$\lambda_F$: sudden failure rate of the power transformer in the normal deterioration process
$\lambda_{Fd}$: sudden failure rate of the power transformer in the accelerated deterioration process
$\lambda_{ol}$: rate of malfunctions successfully detected by the online monitoring device
$1/\mu_{in}$: duration of periodic inspection
$1/\mu_c$: duration of minor maintenance
$1/\mu_M$: duration of major maintenance
$1/\mu_F$: duration of corrective maintenance
$1/\mu_R$: duration of replacement
$C_u$: downtime penalty cost due to unexpected failure
$C_p$: downtime penalty cost due to maintenance and inspection
$C'_{in}$: cost of periodic inspection
$C'_c$: cost of minor maintenance
$C'_M$: cost of major maintenance
$C'_F$: cost of corrective maintenance
$C'_R$: cost of replacement
$C_{ol}$: annual cost of the online monitoring device
In practice, the malfunction rates of the subsystems of a power transformer are recorded, as shown in [14]. Using the findings in this section, we can identify the transition rates between normal deterioration and accelerated deterioration of the winding insulation paper and refine the deterioration model of power transformers. In the next section, we apply this knowledge to formulate the deterioration and maintenance models of power transformers in both distribution networks and transmission networks.

Deterioration and maintenance models for power transformers
We use a continuous-time Markov chain (CTMC) to model the deterioration and maintenance of power transformers. In practice, the maintenance strategies used for power transformers in distribution networks and in transmission networks are different. We first model the deterioration and maintenance strategy for power transformers in distribution networks. The state transition diagram is illustrated in Fig. 2.
In Fig. 2, the states are represented by two indices, $i$ and $j$: $i$ represents the condition of the winding insulation paper and $j$ represents the type of deterioration. We classify the condition of the winding insulation $i$ into 5 states (healthy, aged, defective, faulty and failure). The normal deterioration process is indicated by $j = 0$, accelerated chemical aging by $j = 1$ and accelerated mechanical aging by $j = 2$. We denote by $\lambda_{fi,j}$ the rate of transition from state $(i,0)$ to state $(i,j)$. In an accelerated deterioration process, the power transformer reaches the deterioration failure state $(4,0)$ faster than in the normal process. Apart from deterioration failure, we also consider sudden failure with rate $\lambda_F$, which is mainly caused by exogenous events. In practice, if one of the power transformer subsystems has malfunctioned, the overall vulnerability of the transformer rises. We represent the rate of sudden failure during accelerated deterioration as $\lambda_{Fd}$ ($\lambda_{Fd} \geq \lambda_F$). The infant mortality rate $\lambda_a$ is also considered in the first condition state. The combination of all failure rates is a bathtub curve with diverse wear-out tails, as shown in Fig. 3.
In Fig. 3, the wear-out tail is not only dependent on the type of the accelerated deterioration process it undergoes, but also on the timing of when the malfunctions result in accelerated deterioration.
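As a rough illustration of how the two-index state space leads to different wear-out tails, the following Python sketch enumerates a deterioration-only version of the chain and computes the mean time to deterioration failure from any starting state. It is a minimal sketch under stated assumptions: all rate values in LAMBDA_D and LAMBDA_F are hypothetical placeholders, the rates are taken to be independent of $i$, and sudden failure, infant mortality and all maintenance transitions are left out, so this is not the full model of Fig. 2.

```python
from functools import lru_cache

# Hypothetical rates (per year), chosen only for illustration.
LAMBDA_D = {0: 0.10, 1: 0.40, 2: 0.30}  # deterioration rate on path j
LAMBDA_F = {1: 0.003, 2: 0.001}         # switch rate from path 0 to path j
FAILURE = (4, 0)                        # deterioration failure state


def transitions(state):
    """Return {next_state: rate} for the deterioration-only chain."""
    if state == FAILURE:
        return {}
    i, j = state
    out = {}
    # One step of insulation deterioration along the current path j.
    next_state = FAILURE if i == 3 else (i + 1, j)
    out[next_state] = LAMBDA_D[j]
    # Only the normal path (j = 0) can switch to an accelerated path.
    if j == 0:
        for k, rate in LAMBDA_F.items():
            out[(i, k)] = rate
    return out


@lru_cache(maxsize=None)
def mean_time_to_failure(state):
    """Expected time (years) to reach FAILURE, by first-step analysis."""
    out = transitions(state)
    if not out:
        return 0.0
    total = sum(out.values())
    # Mean sojourn time plus the probability-weighted remaining time.
    return 1.0 / total + sum(
        rate / total * mean_time_to_failure(nxt) for nxt, rate in out.items()
    )


for start in [(0, 0), (0, 1), (0, 2), (2, 0), (2, 1)]:
    print(start, round(mean_time_to_failure(start), 2))
```

Because this simplified chain has no maintenance loops, the recursion terminates. The later a malfunction moves the transformer onto an accelerated path, the smaller its effect on the remaining life, which is the qualitative behavior described for the wear-out tails.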
In some scenarios, the main objective of the maintenance of power transformers in the distribution network is to retain basic serviceability. Corrective maintenance is implemented after the sudden failure of a power transformer. It can effectively restore the transformer to the normal operating condition it was in just before the failure. However, not all failures are repairable. After the breakdown of the winding insulation, replacement of the power transformer is necessary to preserve serviceability. We denote the duration of corrective maintenance as $1/\mu_F$ with a cost $C'_F$, and the replacement time as $1/\mu_R$ with a cost $C'_R$. Meanwhile, a failure also results in an unplanned downtime penalty $C_u$ due to the risk of load curtailment. The availability of the power transformer $A$ is shown in Eq. (1), where $\pi_{i,j}$ indicates the steady-state probability of state $(i,j)$. The long-term operational cost $C$ is shown in Eq. (2).
In Eq. (2), the operation cost is a sum of three items. The first is the downtime penalty cost for load curtailment, the second is the cost of corrective maintenance, and the third is the cost of replacement. To derive the analytic solution for Eqs. (1) and (2), we need to express the steady-state probabilities analytically. This first requires expressing all states with respect to a reference state, for which we choose $\pi_{0,0}$. The expressions of all the states in terms of $\pi_{0,0}$ are shown in Appendix A. By recalling that the sum of all the steady-state probabilities is equal to 1, we can calculate $\pi_{0,0}$ and in turn the rest of the steady-state probabilities.
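The reference-state procedure can be sketched on a much smaller chain than the one in Fig. 2. The Python example below is an illustrative assumption, not the model of Appendix A: it uses a single deterioration path with equal deterioration rates, one repair state per condition level and one replacement state, and every numerical value is hypothetical. The cost expression is likewise only an assumed form built from the three items listed above (downtime penalty, corrective maintenance, replacement).

```python
def steady_state(lam_d, lam_F, mu_F, mu_R, levels=4):
    """Steady-state probabilities of a single-path toy chain.

    States: ("up", k) for k = 0..levels-1, ("repair", k) after a sudden
    failure at level k, and ("replace",) after deterioration failure.
    """
    # Step 1: express every state relative to the reference state ("up", 0).
    rel = {}
    for k in range(levels):
        rel[("up", k)] = 1.0               # equal flow around the cycle
        rel[("repair", k)] = lam_F / mu_F  # balance of the repair state
    rel[("replace",)] = lam_d / mu_R       # balance of the replacement state
    # Step 2: normalize so that the probabilities sum to 1.
    total = sum(rel.values())
    return {state: value / total for state, value in rel.items()}


def availability_and_cost(pi, mu_F, mu_R, C_u, C_F, C_R):
    """Availability and yearly cost (assumed form: penalty + repairs + replacements)."""
    up = sum(p for s, p in pi.items() if s[0] == "up")
    repairs_per_year = mu_F * sum(p for s, p in pi.items() if s[0] == "repair")
    replacements_per_year = mu_R * pi[("replace",)]
    cost = C_u * (1.0 - up) + C_F * repairs_per_year + C_R * replacements_per_year
    return up, cost


# Hypothetical rates (per year) and costs, for illustration only.
pi = steady_state(lam_d=0.1, lam_F=0.005, mu_F=52.0, mu_R=12.0)
A, C = availability_and_cost(pi, mu_F=52.0, mu_R=12.0,
                             C_u=1_000_000, C_F=20_000, C_R=500_000)
print(f"availability = {A:.4%}, cost = {C:,.0f} per year")
```

In this toy chain the relative probabilities follow directly from the balance equations. For the full two-index chain the same two steps apply, but the relative expressions are longer.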
For power transformers in transmission networks, failure can be catastrophic and is associated with huge penalty costs. It is important to prevent failures and maximize the availability of power transformers. Here, condition-based maintenance is applied in addition to corrective maintenance and replacement. Condition-based maintenance is a type of maintenance strategy that recommends maintenance activities based on the information acquired through periodic inspection or online monitoring. In practice, both periodic inspection and online monitoring have their limitations. In periodic inspection, the deterioration may proceed to an unacceptable state or even failure state between the two successive inspections because of the stochastic nature of aging. The limitation of online monitoring is that not all types of malfunctions can be detected by online monitoring. Therefore, it is practical to implement both types of monitoring on power transformers in transmission networks.
The state transition diagram for condition-based maintenance of power transformers in transmission networks is shown in Fig. 4. The rate of periodic inspection is denoted as $\lambda_{in}$. If the inspection indicates that the power transformer is in an acceptable condition ($i < 3$ under normal deterioration), no further action is taken, as represented by state $(i,4)$. If any type of accelerated deterioration of the winding insulation is detected, minor maintenance is applied, as represented by state $(i,5)$. The malfunctioning subsystems are repaired and the deterioration returns to the normal process, thereby prolonging the service lifetime of the transformer. However, because the deterioration of winding insulation paper is irreversible, minor maintenance does not restore the condition of the winding insulation. We denote the state of minor maintenance as state $(i,6)$ where $i < 3$. The duration of the minor maintenance is $1/\mu_c$ with a cost $C'_c$. If $i = 3$, a major maintenance $(3,6)$ is implemented to completely overhaul the power transformer. The major maintenance includes the replacement of the deteriorated winding insulation paper and can restore the condition of the transformer to $(0,0)$. The duration of major maintenance is $1/\mu_M$ with a cost $C'_M$.
State $(i,7)$ indicates the minor preventive maintenance that is scheduled immediately after an early warning signal of a malfunction is detected by online monitoring. The rate of this type of preventive maintenance is $\lambda_{ol}$. When $\lambda_{ol} = 0$, the state diagram represents the scenario in which the maintenance strategy is based on periodic inspection only, without the support of online monitoring. When online monitoring is implemented, early warning signals of some malfunctions can be detected in time. As a result, the malfunction can be repaired preventively by a minor maintenance before it causes any non-negligible damage to the winding insulation. In this case, the malfunction rates $\lambda_{fi,1}$ or $\lambda_{fi,2}$ are reduced by the preventive maintenance that online monitoring enables, and $\lambda_{ol}$ is equal to the amount by which the malfunction rate is reduced. We denote the cost of online monitoring as $C_{ol}$. We assume that all maintenance and inspection activities require the transformer to be stopped. The penalty cost per unit time for such stoppages is denoted as $C_p$. Because maintenance and inspection are preventive and prescheduled activities, $C_p$ is significantly smaller than $C_u$, which is caused by unexpected failures. Using a method similar to the one used for the distribution network, we can express the availability and operation cost as shown in Eqs. (3) and (4), respectively.
In Eq. (4), the operation cost is the sum of the downtime penalty cost for unexpected failure, the downtime penalty cost for preventive activities, the inspection cost, the minor maintenance cost, the major maintenance cost, the corrective maintenance cost, the replacement cost and the cost of online monitoring. This condition-based maintenance model provides a way to find the optimal mean time between inspections that maximizes the availability or minimizes the operation cost, with or without online monitoring. In Section 4, a numerical case study is provided to demonstrate the approach and to assess the value of different types of online monitoring.

Value of monitoring
In general, online monitoring is worthwhile only for power transformers in transmission networks, and we restrict our focus to those. The illustrative example is based on the parameter settings of 220 kV oil-immersed power transformers. We first aim to find the optimal mean time between inspections to maximize availability or minimize operation cost without applying online monitoring. Then, we optimize the mean time between inspections with different types of online monitoring. Using that, we assess the value of different types of online monitoring.
For simplicity, we assume the deterioration rates of winding insulation paper are independent of the condition state. The degree of polymerization (DP) is widely used as a measure of the condition of insulation paper. As shown in [15], 1/DP can be used to define the condition boundaries for the aging insulation paper. Based on [16], the condition of insulation paper reaches the failure state when DP is below   . The parameters $\lambda_{f1}$ and $\lambda_{f2}$ are evaluated by experts according to the cause-effect analysis in Section 2. The overall numerical setting for all the parameters is shown in Table 1. In the table, the values of the deterioration and maintenance duration parameters are estimated from secondary data and verified by experts. In practice, the values of failure cost and maintenance cost may vary case by case.
Based on the parameter settings, we can calculate the availability and cost of the power transformer using Eqs. (3) and (4). We plot the availability and operation cost against the mean time between inspections in Figs. 5 and 6 respectively.
Because both availability and operation cost are unimodal, we can find the optimal mean time between inspections that maximizes the availability using Eq. (5).
Likewise, we can calculate the optimal mean time between inspections that minimizes the operation cost using Eq. (6).
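When a closed-form optimum is inconvenient, the unimodality noted above also allows a simple numerical search. The sketch below uses golden-section search, which is a swapped-in generic technique and not necessarily how Eqs. (5) and (6) are solved. The function stand_in_cost is a hypothetical unimodal curve used only to exercise the search; in practice it would be replaced by the operation cost of Eq. (4), or by the negative of the availability of Eq. (3), evaluated as a function of the mean time between inspections.

```python
import math


def golden_section_min(f, lo, hi, tol=1e-6):
    """Locate the minimizer of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # 1 / golden ratio
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # The minimum lies in [a, d]; reuse c as the new right probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # The minimum lies in [c, b]; reuse d as the new left probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0


def stand_in_cost(t):
    """Hypothetical unimodal cost: inspection effort 4/t plus failure risk t."""
    return 4.0 / t + t


best_interval = golden_section_min(stand_in_cost, 0.1, 10.0)
print(f"optimal mean time between inspections: {best_interval:.3f} years")
```

To maximize availability instead, the same routine can be called with a function that returns the negative of the availability.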
Fig. 5 shows that the optimal mean time between inspections that maximizes the availability is about 1.7 years, with a maximum availability of 99.42%. From Fig. 6, the operation cost is minimized at £89,750/year when the mean time between inspections is around 0.33 years. The average rate of major maintenance is 0.0519/year. In practice, the cost settings of $C_u$, $C_p$, $C'_M$ and $C'_R$ would be smaller than the default setting, because protection systems, control algorithms and the redundancy of power systems can significantly reduce the impact of the unavailability of a power transformer. Fig. 7 provides a sensitivity analysis under different cost settings: it presents the operation cost when $C_u$, $C_p$, $C'_M$ and $C'_R$ are scaled down to 80%, 60%, 40% and 20% of the default setting. The inspection interval is optimized to minimize the operation cost under each cost setting. The minimum operation cost and optimal inspection interval are presented in Table 2.
From Table 2, we can see that the minimum operation cost decreases linearly as the cost settings are scaled down, while the optimal inspection interval increases non-linearly.
We now examine how online condition monitoring might further improve the performance of power transformer maintenance. We consider two types of online monitoring devices: dissolved gas analysis device (DGA) and frequency response analysis device (FRA).
DGA determines the concentration of dissolved gas and the moisture content. It is very sensitive to changes in the key gases produced by the chemical aging of insulation paper. The accuracy of DGA can be as high as 90% [20]. This means that DGA may provide early warning for 90% of the malfunctions that cause accelerated chemical aging of the insulation paper. Hence, the transition rate is $\lambda_{ol} = 0.9\,\lambda_{f1} = 0.0027$/year, and $\lambda_{f1}$ is reduced to 0.0003/year. According to [21] and [22], we assume the annual cost of operating a condition monitoring device is $C_{ol}$ = €3000/year ≈ £2550/year. With the same method, we can calculate the availability and operation cost when DGA is applied. The results are illustrated in Figs. 8 and 9 respectively. From Fig. 8, we can see that the maximum availability with DGA is 99.48%, reached when the mean time between inspections is about 2 years. Compared to the optimized availability without DGA, the improvement from applying DGA is 0.06%. When the mean time between inspections is smaller than 0.4 years, little improvement can be made by using DGA. The improvement in availability increases as the mean time between inspections increases. Fig. 9 shows the operation cost with DGA under different mean times between inspections. The minimum operation cost is £85,895/year with a mean time between inspections of 0.685 years. Compared to the optimized operation cost without DGA, the saving is about £3855/year. It is worth noting that when the mean time between inspections is smaller than 0.2 years, the operation cost without DGA is smaller because of the cost of online monitoring. The average rate of major maintenance is reduced to 0.0504/year.
We then investigate the performance of FRA. The accuracy of FRA in detecting mechanical malfunctions is shown to be 90% in [23]. In this scenario, $\lambda_{ol} = 0.9\,\lambda_{f2} = 0.0009$/year and $\lambda_{f2}$ is reduced to 0.0001/year. We assume $C_{ol}$ remains the same. The resulting availability and operation cost are illustrated in Figs. 10 and 11. In Fig. 10, the maximum availability with FRA is 99.43%, reached when the mean time between inspections is about 1.86 years. Compared to the optimized availability without FRA, the improvement from applying FRA is 0.01%. Fig. 11 shows that the minimum operation cost for the power transformer with FRA is £91,840/year, which is even larger than in the scenario without FRA. This is because the performance improvement cannot offset the annual cost of online monitoring. The average rate of major maintenance is reduced to 0.0517/year. Table 3 compares the performance of DGA and FRA. As shown in Table 3, DGA can improve the availability, reduce the operation cost and reduce the average rate of major maintenance, whereas FRA provides only a relatively small improvement in availability and a smaller reduction in the average rate of major maintenance. In the example, FRA does not reduce the operation cost because the cost savings are offset by the annual operation cost of online monitoring. In practice, FRA is even less effective because its results are hard to interpret, and hence it is usually applied only for post-fault tests.
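The rate adjustments and cost comparisons quoted above reduce to a few lines of arithmetic, reproduced here as a small Python check. The base malfunction rates of 0.003/year and 0.001/year are not stated directly; they are inferred here as the sum of the detected and residual rates given in the text, and the function name split_malfunction_rate is a hypothetical helper.

```python
def split_malfunction_rate(base_rate, detection_accuracy):
    """Split a malfunction rate into a detected part and a residual part."""
    detected = detection_accuracy * base_rate  # becomes lambda_ol
    residual = base_rate - detected            # remaining lambda_f
    return detected, residual


# Base rates (per year) inferred from the text as detected + residual.
dga_ol, dga_residual = split_malfunction_rate(0.003, 0.90)
fra_ol, fra_residual = split_malfunction_rate(0.001, 0.90)

# Minimum operation costs quoted in the text (GBP per year).
cost_inspection_only = 89_750
cost_with_dga = 85_895
cost_with_fra = 91_840

dga_saving = cost_inspection_only - cost_with_dga  # positive: cost goes down
fra_saving = cost_inspection_only - cost_with_fra  # negative: cost goes up

# Maximum availabilities quoted in the text (percent).
dga_gain = 99.48 - 99.42
fra_gain = 99.43 - 99.42

print(f"DGA: lambda_ol={dga_ol:.4f}, residual={dga_residual:.4f}, "
      f"saving={dga_saving}, availability gain={dga_gain:.2f} points")
print(f"FRA: lambda_ol={fra_ol:.4f}, residual={fra_residual:.4f}, "
      f"saving={fra_saving}, availability gain={fra_gain:.2f} points")
```

The quoted operation costs already include the annual cost of the monitoring device, so the sign of each saving indicates directly whether the device pays for itself in this example.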

Conclusion
In this paper, we first analyzed the malfunctions that may accelerate the deterioration of winding insulation paper. Then, we applied this information to refine the deterioration model of power transformers. We considered multiple types of failure of power transformers, namely infant mortality, sudden failure, and deterioration failure, so that the overall failure rate of a power transformer follows a bathtub curve with multiple wear-out tails. We developed analytical models for power transformer maintenance in the distribution network and the transmission network using CTMC. A condition-based maintenance model is designed for power transformers in the transmission network. It provides a way to optimize the maintenance of power transformers according to the information from both periodic inspection and online monitoring. Further work will focus on extending this model to the maintenance of a power system composed of multiple power transformers, switches, and other devices.
Table 1 Parameter setting [14,17,18,19].