Interpreting dissolved gases in transformer oil: A new method based on the analysis of labelled fault data

In this contribution, a new dissolved gas analysis (DGA) method combining the key gas and ratio approaches for power transformer fault diagnosis is presented. It is based on studying subsets and uses the five main combustible gases: hydrogen (H2), methane (CH4), ethane (C2H6), ethylene (C2H4), and acetylene (C2H2). The proposed method is built on 475 samples, divided into subsets formed from the maximum and minimum concentration(s) of the dataset samples. It has been tested on 117 DGA samples and validated on the International Electrotechnical Commission (IEC) TC10 database. The performance of the proposed diagnostic method was evaluated and compared with the following diagnostic methods: IEC ratios method, Duval's triangle (DT), three ratios technique (TRT), Gouda's triangle (GT), and self-organizing map (SOM) clusters. The results were analysed by computer simulation using MATLAB. The proposed method has a diagnosis accuracy of 97.43% for fault types,


INTRODUCTION
Power transformers are the most expensive and important elements of power systems. They are crucial for the safety and stability of network operations. Indeed, the failure of a power transformer can lead to a major breakdown of the power grid, causing outages, costly repairs and heavy financial losses [1]. Therefore, early detection of transformer faults is imperative in operating and maintaining power system networks. Chromatographic analysis of dissolved gas in oil, namely dissolved gas analysis (DGA), is one of the most widely used techniques for the early detection of faults in the active parts of transformers [2,3]. Its popularity stems from the fact that the technique is non-intrusive and can be used for real-time monitoring. The method consists of periodically sampling the transformer insulation oil to determine the composition of the gases dissolved in it as a result of the degradation of the insulation system [4]. Identification of the different dissolved gases is made possible by gas chromatography, discovered in the 1940s [5]. Gas production is favoured by the temperature level and/or the energy produced by the fault. Depending on the type of fault, different decomposition processes may occur. When electrical or thermal faults occur in transformer oil, the oil degrades, generating combustible gases such as hydrogen (H2), methane (CH4), ethane (C2H6), ethylene (C2H4), and acetylene (C2H2). When decomposition occurs in cellulosic insulation, the gases generated are carbon monoxide (CO) and carbon dioxide (CO2), which indicate a thermal fault. Other gases such as oxygen (O2) and nitrogen (N2) are also produced [6]. Once the gases have been identified and quantified, the result still needs to be interpreted to assess the condition of the transformer.
Several methods have been proposed in the literature to predict the occurrence of faults and to determine their types by interpreting the concentrations of the gases detected [7]. Several standards from different committees and organizations, such as International Electrotechnical Commission (IEC) 60599-1999, Institute of Electrical and Electronics Engineers (IEEE) C57.104-1991, and International Council on Large Electric Systems (CIGRE) TF 15.01.01, provide guidelines for DGA interpretation.
Generally, conventional diagnostic methods using dissolved gases can be divided into three main categories: key gas, graphical and gas ratio methods [8]. The key gas method is based on the correlation of key gases generated with the fault type. In this method, the fault type is identified by the percentage of the generated gases as suggested by IEEE C57.104-2019 [9]. The graphical methods are based on a graphical representation visualizing the different types of faults. Each side of these graphs represents the relative proportions of key gases concentrations or combinations. The most popular graphical methods are Duval's triangle (DT) [10] and Duval's pentagon (DP) [11]. Other graphic methods exist in the literature such as Mansour's pentagon [12], Gouda's heptagon [13], or Gouda's triangle (GT) [14]. Gas ratio methods are based on the correlation of ratio of fault gas concentrations with incipient fault types. These methods take into account the ratios of key gases to develop a code that is supposed to give an indication of fault type. These include, among others, Doernenburg's ratios method (DRM) [ [34][35][36] for the diagnosis of transformer faults based on DGA data. The current existing conventional and intelligent methods are carried out by means of a sample dataset with the corresponding labelled faults. The size of the training data is a limitation for conventional methods because they require interpretation by human experts [37]. As a result, many of these techniques are based on a reduced amount of data, thus increasing the probability of misdiagnosis.
In this paper, a new diagnostic model combining the key gas and gas ratio methods is proposed. It is based on multiple studying subsets and six ratios of H2, CH4, C2H6, C2H4, and C2H2. The method addresses the problem of dataset size by creating subsets from combinations of the maximum and minimum sample concentration(s) of the main dataset. The ratio approach is used to distinguish between the different faults in each subset. The proposed diagnostic method was developed using a dataset of 475 samples and tested on 117 DGA samples. Its classification performance is validated on the IEC TC10 database and compared with the following conventional methods: DT, IRM, RRM, TRT, and DRM.
The remainder of this paper is organized as follows. A brief description of the types of faults detectable by DGA, and of the relationships between the gases produced and the corresponding faults, is given in Section 2. Section 3 is devoted to a brief review of gas ratio methods. The principle and the flow chart of the proposed method are presented in Section 4. The test performance of the proposed method and its comparison with conventional methods using the IEC TC10 database are presented in Section 5. Finally, Section 6 concludes the paper.

Transformer fault types
The three major types of power transformer faults that can be reliably identified during a visual inspection are partial discharges, thermal overheating, and arcing [38]. Partial discharges and arcing are electrical faults and correspond to the deterioration of insulation due to high electrical stress. Thermal faults refer to the deterioration of the insulation system as a result of an abnormal rise in temperature. Such rises result from overheating of conductors, short circuits, overheating of windings due to eddy (Foucault) currents, loose connections, and insufficient cooling [5]. Based on IEC 60599, these major fault types can be further classified into six types of transformer faults, summarized in Table 1.

Relationship between faults and dissolved gas produced
The two main causes of gas formation in an operating transformer are electrical and thermal stresses. Each type of fault degrades the oil or paper differently, and each produces its own amount of dissolved gas. The quantities are larger or smaller depending on the intensity of the particular fault. The nature of the gases formed and their relative proportions provide information on the type of stress, its intensity and the type of materials affected [39]. When an electric arc discharge occurs, large amounts of hydrogen and acetylene are produced, with minor amounts of methane and ethylene. For such a failure, acetylene typically accounts for 20% to 70% and hydrogen for 30% to 90% of the total hydrocarbons. Carbon dioxide and carbon monoxide can also be formed if cellulose is present at the fault site. In some cases, the oil may carbonize [40]. The occurrence of thermal faults leads to the degradation of oil and paper. Oil overheating produces ethylene and methane with small amounts of hydrogen and ethane. Traces of acetylene can be formed if the fault is serious or involves electrical contacts. Large quantities of carbon dioxide and carbon monoxide are produced when thermal faults attack cellulose. Hydrocarbon gases, such as methane and ethylene, are formed if the fault involves an oil-impregnated structure [7].

GAS RATIO METHODS
The gas ratio methods are conventional methods that use key gas ratios for fault diagnosis. In this section, a brief review of these methods is presented.

Doernenburg's ratio method
The DRM is the first method using the DGA approach. It was designed in 1974 in order to evaluate the three main fault types.

Rogers ratio method
The Rogers ratio method (RRM) takes into account the ratios of H2, CH4, C2H6, C2H4, and C2H2 to develop a code allowing fault diagnosis. Table 4 lists the ratio ranges and corresponding codes. The corresponding diagnostics for the various code combinations are presented in Table 5 [9].

IEC ratio method
The IEC ratio method (IRM) uses the same ratios as the RRM and classifies faults into nine categories. The codes of the three ratios in Table 4 are reused in Table 6, which presents the code combinations and the corresponding IRM fault diagnoses.
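As a minimal sketch, the three ratios shared by the two methods can be computed as follows. The ratio definitions used here (C2H2/C2H4, CH4/H2, C2H4/C2H6) are the ones commonly given for the Rogers and IEC methods and are an assumption, since Tables 4 to 6 are not reproduced in this text. The function name, the sample values, and the handling of zero denominators are also illustrative assumptions. Python is used for convenience.

```python
import math


def three_ratios(h2, ch4, c2h6, c2h4, c2h2):
    """Return (C2H2/C2H4, CH4/H2, C2H4/C2H6) from gas concentrations in ppm."""

    def ratio(num, den):
        # Assumption of this sketch: x/0 maps to inf and 0/0 to nan.
        # The standards do not prescribe this handling.
        if den == 0:
            return math.nan if num == 0 else math.inf
        return num / den

    return ratio(c2h2, c2h4), ratio(ch4, h2), ratio(c2h4, c2h6)


# Hypothetical sample (ppm), not taken from the paper's dataset.
print(three_ratios(h2=200, ch4=50, c2h6=20, c2h4=100, c2h2=150))
# (1.5, 0.25, 5.0)
```

Each ratio would then be mapped to a code through the ranges of Table 4, and the code triple looked up in Table 5 or Table 6.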

Three ratios technique
The TRT proposed by Gouda et al. [18] uses three new gas ratios to classify fault types and their severity, as shown in Table 7. In this method, the R1 ratio is used to classify thermal, arcing, and partial discharge faults. The R3 ratio, also used in the above diagnostic techniques, separates thermal and electrical faults, and so confirms the fault type indicated by the R1 ratio. The R2 ratio is used to assess the degree of severity of thermal, electrical and partial discharge faults. It distinguishes between low (PD1) and high (PD2) partial discharge faults, low (D1) and high (D2) energy discharge faults, and very low (T0), low (T1), medium (T2) and high (T3) temperature thermal faults [14]. The corresponding diagnostics for the various code combinations, inspired by the flowchart described in [14], are presented in Table 8. This technique shall be applied when at least one of the dissolved gas concentrations exceeds the normal limits shown in Table 9.

Principle of the method
This article proposes a diagnostic method for power transformer faults that combines the key gas and gas ratio approaches. It is mainly based on decomposing the studying dataset into studying subsets, which are then studied individually using the ratio approach. Six gas ratios involving the five main gases formed in transformer oil, namely H2, CH4, C2H6, C2H4, and C2H2, are used. The subsets obtained by decomposing the main dataset result from combinations of the maximum and minimum sample concentration(s) of the main dataset. The gas ratio approach is used to determine the different faults in each subset. As each subset is treated independently of the others, there is more flexibility in the choice of ratios and ratio ranges used to develop the model of each subset (sub-model). The final diagnostic model is obtained by combining the sub-models obtained for the individual subsets. Table 10 shows the subsets resulting from combinations having hydrogen as the maximum concentration. A total of 75 studying subsets can be created from the main dataset. Table 11 lists the definitions of the gas ratios used, while Figure 1 illustrates the principle of the proposed method.
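The total of 75 subsets can be checked by enumeration. The sketch below rests on an interpretation, not on Table 10 itself (which is not reproduced here): a subset is assumed to be identified by the single gas with the maximum concentration together with the non-empty set of other gases that share the minimum concentration. The names GASES and all_subset_keys are illustrative.

```python
from itertools import combinations

GASES = ("H2", "CH4", "C2H6", "C2H4", "C2H2")


def all_subset_keys():
    """Enumerate (max_gas, min_gases) pairs under the assumed subset definition."""
    keys = []
    for max_gas in GASES:
        others = [g for g in GASES if g != max_gas]
        # Any non-empty group of the four remaining gases may share the minimum.
        for k in range(1, len(others) + 1):
            for combo in combinations(others, k):
                keys.append((max_gas, frozenset(combo)))
    return keys


keys = all_subset_keys()
print(len(keys))  # 75
```

Under this reading, each of the 5 possible maximum gases has 4 + 6 + 4 + 1 = 15 possible minimum sets, which gives 5 x 15 = 75 and agrees with the count stated in the text.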

Example of application of the method
This example illustrates the application of the proposed method to a dataset of 25 samples (Table 12). The first step is to create subsets from the samples in the studying dataset. In the second step, each subset is studied individually and the corresponding sub-model is proposed. The third and last step consists of grouping all the sub-models into a single program to obtain the diagnostic model. These three steps are presented in Figure 2. Generalizing to a larger database yielded the flowchart of the diagnostic method presented in Table B1 and the pseudo code in Appendix A. Examples of numerical application to samples 5 (purple), 10 (red), 17 (blue) and 25 (green) from Table 12 can be seen in Table B1.
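The three steps can be sketched in code as follows. This is only an illustration under stated assumptions: the subset key (gas at maximum plus the gases sharing the minimum) is one interpretation of the text, the sample values are invented, and the single sub-model shown uses a placeholder threshold that is not taken from Table B1.

```python
from collections import defaultdict

GASES = ("H2", "CH4", "C2H6", "C2H4", "C2H2")


def subset_key(sample):
    """Assumed key: (gas at maximum, set of other gases sharing the minimum)."""
    hi, lo = max(sample.values()), min(sample.values())
    max_gas = next(g for g in GASES if sample[g] == hi)  # ties: first in GASES order
    min_gases = frozenset(g for g in GASES if sample[g] == lo and g != max_gas)
    return max_gas, min_gases


def build_subsets(samples):
    """Step 1: split the studying dataset into subsets."""
    subsets = defaultdict(list)
    for s in samples:
        subsets[subset_key(s)].append(s)
    return subsets


def h2_max_c2h2_min(sample):
    """Step 2 (placeholder): hypothetical sub-model with an invented threshold."""
    return "PD" if sample["CH4"] / sample["H2"] < 0.1 else "undetermined"


def diagnose(sample, sub_models):
    """Step 3: dispatch a sample to the sub-model of its subset."""
    model = sub_models.get(subset_key(sample))
    return model(sample) if model else "undetermined"


# Invented samples (ppm), not from Table 12.
samples = [
    {"H2": 500, "CH4": 20, "C2H6": 10, "C2H4": 5, "C2H2": 0},
    {"H2": 300, "CH4": 90, "C2H6": 40, "C2H4": 15, "C2H2": 1},
    {"H2": 30, "CH4": 120, "C2H6": 40, "C2H4": 400, "C2H2": 2},
]
sub_models = {("H2", frozenset({"C2H2"})): h2_max_c2h2_min}

subsets = build_subsets(samples)
results = [diagnose(s, sub_models) for s in samples]
print(len(subsets), results)
```

Keeping the sub-models in a dictionary keyed by subset mirrors the idea that each subset is handled independently and that the final model is simply their combination.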

Data collection
The present study was carried out using 592 samples covering the six fault classes, with actual fault types collected from several sources as presented in Table 13. To develop the proposed method, the DGA data were divided into studying and testing datasets as shown in Table 14. The studying dataset is composed of labelled dissolved gas samples and is used to build the flow chart of the proposed method. The testing dataset is used to verify the observations made in each subset.

Results and discussion
The proposed method was implemented in MATLAB, with the algorithm programmed as .m scripts. Table 15 presents an overview of the fault diagnostic accuracy obtained on the studying and testing datasets. Considering the diagnostic accuracy results obtained from the studying dataset, it is clear that the proposed method performs better at detecting PD, D2 and T3 faults, with accuracy greater than or equal to 90%. A fairly good accuracy, close to 70%, was obtained for faults D1 and T2, while an accuracy of 82.6% was obtained for fault T1. In summary, 83.36% of the dissolved gas samples were correctly diagnosed, i.e. 396 out of 475 samples. The diagnostic accuracies obtained from the testing dataset suggest that the observations made on the studying dataset carry over well, since the accuracy on the testing dataset was higher.
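The accuracy figures are simple proportions of correctly diagnosed samples. The sketch below shows one way to compute them per fault class and overall. The function names and the short label lists are illustrative assumptions, and only the 396 out of 475 count comes from the text.

```python
from collections import Counter


def per_class_accuracy(true_labels, predicted_labels):
    """Percentage of correctly diagnosed samples for each true fault class."""
    total = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {c: 100.0 * correct[c] / total[c] for c in total}


def overall_accuracy(n_correct, n_total):
    """Percentage of correctly diagnosed samples over the whole dataset."""
    return 100.0 * n_correct / n_total


# Invented labels, for illustration only.
truth = ["PD", "PD", "T1", "T1", "T1", "D2"]
predicted = ["PD", "D1", "T1", "T1", "T2", "D2"]
print(per_class_accuracy(truth, predicted))

# Studying dataset: 396 correct diagnoses out of 475 samples.
print(overall_accuracy(396, 475))
```

The second call evaluates to about 83.368%, which corresponds to the quoted 83.36% when truncated to two decimals and to 83.37% when rounded.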

Validation and comparison with other conventional methods using IEC TC10 database
The IEC TC10 database contains 117 fault cases for transformers in service, which were identified by visual inspection [38]. These data were not used in developing the proposed DGA method, so the database was used to validate the proposed model. The diagnostic results are presented in Table C1 and the average diagnostic accuracies by equipment type are summarized in Table 17. Table 16 lists the equipment abbreviations of the IEC TC10 database. In Table 17, the fault types refer to the three major fault types described above. The diagnostic accuracies on the IEC TC10 database for the different methods are presented by equipment and broken down by severity and fault type. Considering the diagnostic accuracy obtained by equipment, the proposed method could be used to detect and classify faults in P, U, R, I, and C equipment. For power transformers without communicating OLTC, the proposed method has a diagnostic accuracy of 88.88% for severity and 97.22% for fault type. For power transformers with communicating OLTC, the diagnostic accuracy is 100% for both. Over the 117 cases including all equipment, the proposed method has a diagnostic accuracy of 90.60% for severity and 97.43% for fault type.
The use of subsets makes it possible, on the one hand, to propose empirical methods for diagnosing power transformers using a large number of labelled data and, on the other hand, to take into account all the characteristics of the sample subsets created. However, multiplying the studying datasets increases the work of the human expert, who must now make the observations needed to detect and classify faults in several sets at the same time rather than in a single set. Although the new diagnostic method is more demanding in terms of the work required, it offers several avenues for improving the performance of existing methods. It can also be used to propose a method with dynamic ratios that depend on the subsets created. It could even be used to combine several methods into one by applying them to the different subsets created.

CONCLUSION
In this paper, a new conventional DGA method for fault diagnosis of power transformers is proposed. This method is based on multiple datasets and combines the key gas and gas ratio approaches. The key gas approach is used to form the different studying subsets from combinations of the maximum and minimum sample gas concentration(s) of the main dataset. The gas ratio approach is used to detect and classify the faults of each studying subset. The dataset used in this paper contains 709 labelled samples covering six fault types. The first group of 592 samples is used for the implementation and evaluation of the proposed diagnostic model. Taking into account the subjectivity of the testing dataset, the performance of the proposed diagnostic model was validated using the second group of data, consisting of the 117 samples from the IEC TC10 database. The proposed method has a diagnosis accuracy of 97.43% for fault types, as compared to 93.16% for TRT, 96.58% for the GT method, 97.25% for the SOM clusters method, and 98.29% for the DT method. In terms of fault severity, however, the proposed method has the highest diagnostic accuracy, 90.60%, compared to 78.90% for the SOM clusters method, 83.76% for TRT, 88.03% for the DT method and 89.74% for the GT method. The main advantage of the proposed method is that it can be formalized, insofar as the schematic approach is clear and comprehensible. This is not the case with the conventional methods existing in the literature, which present their flow chart without the methodical approach that made it possible. The use of studying subsets makes it possible to implement conventional diagnostic methods using large databases, leading to a more efficient diagnostic model. In addition, it offers many possibilities for improving existing conventional methods and for implementing combined or even hybrid diagnostic approaches.
The proposed model appears to be a promising approach to support a new generation of DGA diagnosis and to overcome the complexities.

APPENDICES
The pseudo code describes step by step how the method can be reproduced by anyone. It shows, with two examples, how the flowchart can be turned into code. Table B1 presents the flow chart of the proposed diagnostic method and Table C1 shows the diagnostic results obtained with the conventional methods and the proposed method, using the IEC TC10 database.