Power transformer fault diagnosis using FCM and improved PCA

Abstract: In order to improve the fault diagnosis accuracy of power transformers, a new fault diagnosis model based on the fuzzy C-means (FCM) clustering algorithm and improved principal component analysis (IPCA) is presented. First, dissolved gas analysis samples are clustered with FCM and the cluster centre for each fault type is regarded as the reference sequence. Then, the IPCA approach is implemented to obtain the main principal components containing 95% of the original information. Finally, Euclidean distances between the principal components of the reference sequences and the testing sample are calculated to identify the final fault type. Case studies and test results show that the proposed approach recognises transformer faults effectively and has a higher diagnostic accuracy than the International Electrotechnical Commission (IEC) ratio method and the improved three ratio method.


Introduction
The power transformer is one of the most expensive and vital assets in a power system; it converts voltage levels and connects transmission and distribution networks [1]. The operating condition of power transformers has a direct impact on the safety, reliability and stability of the power system. However, transformer faults seem inevitable owing to ageing and various stresses, including electrical, mechanical, thermal and environmental stresses. Therefore, it is imperative to detect and diagnose incipient faults of power transformers early and correctly for reliable and steady operation of the power system.
Dissolved gas analysis (DGA) is an extensively used interpretation approach for power transformer fault diagnosis. The main advantage of the DGA method is that it diagnoses faults simply and effectively regardless of the surrounding magnetic field. Besides, DGA can be carried out without de-energising the power transformer [2]. Plenty of fault interpretation techniques have been developed based on DGA, such as the key gas method [3,4], the IEC ratio method [5], the improved three ratio (ITR) method [6], the Rogers ratio method [7], the Dornenburg ratio method [8] and the Duval triangle method [9]. The DGA methods mentioned above have made progress in practical application, but there are still some drawbacks, such as missing codes, sharp boundaries between codes, inconsistent diagnoses and unsatisfactory performance [10]. Recently, with the development of artificial intelligence (AI) and machine learning, various AI techniques and theories have been utilised in power transformer fault diagnosis. Support vector machines (SVMs) [11], artificial neural networks (ANNs) [12], the fuzzy clustering method (FCM) [13], grey theory [14], matter element analysis [15] and evidence theory [16] have been applied to improve fault diagnosis performance. The AI techniques mentioned above have limitations. An ANN has good self-learning and recognition ability but suffers from slow convergence and local optima. For an SVM, a suitable kernel function and parameters are hard to determine, and new structures need to be designed for multi-class problems. Determination of the distinguishing coefficient and the standard sequence is a complex task for grey theory. Effective fusion of conflicting evidence is still a challenge for evidence theory. Therefore, there is room for both traditional DGA techniques and AI-based models to further improve their fault diagnosis capability.
Many variables can be used to interpret incipient faults of power transformers. Nevertheless, these variables may be correlated and redundant, which makes the fault diagnosis model more complex and degrades the diagnosis performance. So, it is necessary to reduce the dimension and remove redundancy in the variables. The PCA method is a simple yet effective way to compress the data space, extract key features and eliminate redundancies between variables. A DGA-based fault diagnosis method using the fuzzy C-means (FCM) clustering algorithm and improved principal component analysis (IPCA) is put forward in this paper. The cluster centres of the original DGA dataset obtained by FCM are used as reference sequences. The main principal components of the sample matrix are computed using IPCA. The final diagnosis results are based on a comparison of the Euclidean distances between the reference sequences and the testing samples. The results of a case study and an experimental test verify the validity and accuracy of the proposed method.

Establishment of reference sequence
A power transformer reference sequence is usually employed to determine fault types according to the distance between testing samples and typical sequences [17]. The fault type corresponding to the minimum distance is treated as the diagnosis result. So, the creation of the typical sequence, known as the reference sequence, has a significant influence on the final result. Until now, most reference sequences have been based on the mean value of the input variables, which suffers from limited sample size, imprecision and inconsistency. Besides, the conventional normalisation process used to acquire the reference sequence puts too much emphasis on the uniformity of samples and pays too little attention to sample differences or distribution characteristics. As a result, the reference sequence may not be typical enough and may lead to inaccurate results. Therefore, more reliable and precise reference sequences are needed in the fault diagnosis of power transformers.

Data normalisation
Differences in the range and magnitude of raw samples result in long computational time, low accuracy and poor generalisation of the learning model. Hence, normalisation is a critical preprocessing step in AI and machine learning. In general, the original data matrix X is normalised with (1):
$$X^{*} = \frac{X - X_m}{S} \tag{1}$$
where X* is the standardised matrix, X_m stands for the mean value of each variable and S is the standard deviation.
In order to take sample differences and distribution characteristics into consideration, the relative proportion of each variable within a sample is used to standardise the sample matrix, described as follows:
$$x^{*}_{ij} = \frac{x_{ij}}{\sum_{j=1}^{m} x_{ij}}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m \tag{2}$$
where x_ij and x*_ij are the variables before and after normalisation, respectively, n stands for the sample size and m for the number of dimensions.
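The difference between the two normalisations can be illustrated with a minimal sketch. It assumes that each row is one DGA sample and each column one gas; the gas order and the concentration values are hypothetical and are not taken from the paper's dataset. Two samples with the same gas pattern but different absolute levels map to the same relative-proportion vector, whereas column-wise standardisation only reflects how far each value lies from the column mean.

```python
import math

def zscore_normalise(rows):
    """Column-wise standardisation as in (1): subtract the mean, divide by the standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - mu) ** 2 for v in c) / (n - 1)) for c, mu in zip(cols, means)]
    return [[(v - mu) / s for v, mu, s in zip(row, means, stds)] for row in rows]

def proportion_normalise(rows):
    """Row-wise relative proportion as in (2): each gas divided by the total of its sample."""
    return [[v / sum(row) for v in row] for row in rows]

# Hypothetical concentrations in ppm (assumed order: H2, CH4, C2H6, C2H4, C2H2).
samples = [
    [200.0, 50.0, 20.0, 25.0, 5.0],
    [20.0, 5.0, 2.0, 2.5, 0.5],   # same gas pattern at one tenth of the level
]

props = proportion_normalise(samples)
z = zscore_normalise(samples)
for row in props:
    print([round(v, 4) for v in row])
```

Because the relative proportions of every sample sum to one, the normalised vectors are directly comparable across transformers with very different total gas content.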

Fuzzy c-mean cluster algorithm
The FCM clustering algorithm is a popular unsupervised clustering approach, which was introduced by Dunn and has been used intensively in many different fields, such as quantisation, segmentation and pattern recognition [18]. FCM implements fuzzy clustering by searching for a set of clusters and determining cluster centres that are intended to mark the average location of each cluster. FCM assigns each data point a membership grade for each cluster and updates the cluster centres and membership grades iteratively so that the cluster centres move towards the 'right' location in the dataset [19]. Determination of the final cluster centres can be regarded as an optimisation problem that minimises the constrained objective function presented as follows:
$$J = \sum_{i=1}^{c}\sum_{j=1}^{n} \mu_{ij}^{m} d_{ij}^{2}, \quad \text{s.t.} \; \sum_{i=1}^{c} \mu_{ij} = 1 \tag{3}$$
where c and n are the numbers of clusters and samples in the dataset, respectively, μ_ij ∈ [0, 1] is the membership degree of the data x_j in the ith cluster, d_ij is the Euclidean distance between the ith cluster centre v_i and the jth data x_j, and m ∈ [1, +∞) is a weighting exponent that is generally set to 2.
With the application of the Lagrange multiplier method, an iterative optimisation is carried out, and the updated memberships μ_ij and cluster centres v_i are given by (4) and (5), respectively:
$$\mu_{ij} = \frac{1}{\sum_{k=1}^{c} \left( d_{ij} / d_{kj} \right)^{2/(m-1)}} \tag{4}$$
$$v_{i} = \frac{\sum_{j=1}^{n} \mu_{ij}^{m} x_{j}}{\sum_{j=1}^{n} \mu_{ij}^{m}} \tag{5}$$
The iteration stops when the termination condition (6) is satisfied, that is, when the change between iterations t and t + 1 falls below the termination criterion δ, where t is the iteration number.
To sum up, the establishment of the reference sequence consists of the following steps: (i) normalise the raw dataset with (2); (ii) initialise the FCM algorithm, including the initial cluster centres, cluster number, iteration number, weighting exponent and termination criterion; (iii) calculate the objective function and update the memberships and cluster centres with (4) and (5); and (iv) terminate the calculation if (6) is satisfied, otherwise return to step (iii).
The final cluster centres are viewed as the reference sequences of each fault type and are used to determine the incipient fault of new DGA samples [20].
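The iteration described above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `fcm`, the explicit `init_centres` argument, the toy two-dimensional data and the stopping rule (largest change in any membership value not exceeding `delta`) are all assumptions made here, and the weighting exponent `m` must be greater than 1 for the membership update to be defined.

```python
import math

def fcm(data, init_centres, m=2.0, delta=1e-6, max_iter=200):
    """Fuzzy C-means sketch. Returns (centres, u) with u[i][j] the
    membership of sample j in cluster i. Requires m > 1."""
    centres = [list(v) for v in init_centres]
    c, n, dim = len(centres), len(data), len(data[0])
    u = [[0.0] * n for _ in range(c)]
    for _ in range(max_iter):
        # Membership update: inverse-distance weighting between clusters.
        new_u = [[0.0] * n for _ in range(c)]
        for j, x in enumerate(data):
            d = [math.dist(x, v) for v in centres]
            if 0.0 in d:
                # The sample coincides with a centre: full membership there.
                new_u[d.index(0.0)][j] = 1.0
            else:
                for i in range(c):
                    new_u[i][j] = 1.0 / sum(
                        (d[i] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c)
                    )
        # Centre update: mean of the samples weighted by membership ** m.
        for i in range(c):
            w = [new_u[i][j] ** m for j in range(n)]
            total = sum(w)
            centres[i] = [
                sum(w[j] * data[j][k] for j in range(n)) / total for k in range(dim)
            ]
        # Assumed stopping rule: largest membership change is at most delta.
        change = max(abs(new_u[i][j] - u[i][j]) for i in range(c) for j in range(n))
        u = new_u
        if change <= delta:
            break
    return centres, u

# Toy data with two well-separated groups (hypothetical, not DGA data).
data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]]
centres, u = fcm(data, init_centres=[[0.2, 0.2], [0.8, 0.8]])
print([[round(v, 3) for v in centre] for centre in centres])
```

In the setting of this paper, `data` would hold the samples normalised with (2), the number of initial centres would equal the number of fault types, and the returned centres would serve as the reference sequences.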

Improved principal component analysis
PCA is one of the most classical feature extraction methods in statistical analysis. This approach extracts significant factors via a linear transformation and uses a minimal number of dimensions to represent the original information. Therefore, PCA is used to reduce the dimension of the variables, eliminate redundant information, simplify the structure of the classifier with an acceptable loss of information and improve classification performance [21].
Generally speaking, the information contained in the raw dataset can be divided into two parts. One is the relationship between variables, which is represented by the correlation coefficient, and the other is the difference between variables, which is revealed through the variance. The common normalisation method is shown in (1), after which the mean value of each variable is 0 and its variance is 1. As a result, only the relationship information is kept and the difference information is lost. Consequently, a PCA algorithm that uses this normalisation method may not be able to represent the original information entirely and correctly [22]. To overcome this shortcoming, an averaging approach described as (7) is employed to normalise the raw data:
$$x'_{ij} = \frac{x_{ij}}{\bar{x}_{j}}, \quad \bar{x}_{j} = \frac{1}{n}\sum_{i=1}^{n} x_{ij} \tag{7}$$
where x′_ij is the standardised value of x_ij. The main advantage of the averaging normalisation method is that the correlation matrix is unchanged and no information is lost after normalisation [23].
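The argument above can be checked numerically with a small sketch. The two columns below are hypothetical values chosen only for illustration, and the helper names are assumptions of this sketch. Dividing each column by its mean leaves the correlation coefficient unchanged and keeps the relative spread of each variable, while standardisation forces every variance to 1.

```python
import math

def sample_variance(column):
    mean = sum(column) / len(column)
    return sum((v - mean) ** 2 for v in column) / (len(column) - 1)

def mean_normalise(column):
    """Averaging normalisation: divide every value by the column mean."""
    mean = sum(column) / len(column)
    return [v / mean for v in column]

def zscore(column):
    """Standardisation: zero mean and unit variance."""
    mean = sum(column) / len(column)
    std = math.sqrt(sample_variance(column))
    return [(v - mean) / std for v in column]

def pearson(a, b):
    """Pearson correlation coefficient of two equally long columns."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(
        sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)
    )

# Hypothetical columns: same absolute spread, very different mean level.
gas_a = [10.0, 20.0, 30.0, 40.0]
gas_b = [100.0, 110.0, 90.0, 120.0]

r_raw = pearson(gas_a, gas_b)
r_avg = pearson(mean_normalise(gas_a), mean_normalise(gas_b))
var_avg = [sample_variance(mean_normalise(c)) for c in (gas_a, gas_b)]
var_z = [sample_variance(zscore(c)) for c in (gas_a, gas_b)]
print(r_raw, r_avg, var_avg, var_z)
```

After averaging normalisation the variance of each column equals its squared coefficient of variation, so variables that fluctuate strongly relative to their mean retain a larger variance than stable ones.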
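Given the data matrix X_{n×p}, the implementation steps of PCA are listed below:
(i) Normalisation of the data matrix X: in order to keep all information, (7) is applied to standardise the original samples.
(ii) Construction of the correlation coefficient matrix R and calculation of the eigenvalues λ and eigenvectors g: the correlation coefficients are computed with (8) after normalisation and form the correlation coefficient matrix R. Solving the characteristic equation |λI − R| = 0 yields the eigenvalues λ and the corresponding eigenvectors g:
$$r_{jk} = \frac{\sum_{i=1}^{n} (x'_{ij} - \bar{x}'_{j})(x'_{ik} - \bar{x}'_{k})}{\sqrt{\sum_{i=1}^{n} (x'_{ij} - \bar{x}'_{j})^{2} \sum_{i=1}^{n} (x'_{ik} - \bar{x}'_{k})^{2}}} \tag{8}$$
(iii) Determination of the number of principal components h: the contribution rate of the ith principal component and the cumulative contribution rate of the first h principal components are computed with (9) and (10), with the eigenvalues sorted in descending order, and h is chosen so that the retained components contain 95% of the original information:
$$\eta_{i} = \frac{\lambda_{i}}{\sum_{k=1}^{p} \lambda_{k}} \tag{9}$$
$$\eta_{\Sigma}(h) = \frac{\sum_{i=1}^{h} \lambda_{i}}{\sum_{k=1}^{p} \lambda_{k}} \tag{10}$$
(iv) Calculation of the first h principal components of the samples: after h has been determined, the principal components Y_i of the samples are obtained using (11), where X′ is the normalised data matrix and g_i is the eigenvector belonging to λ_i:
$$Y_{i} = X' g_{i}, \quad i = 1, 2, \ldots, h \tag{11}$$
Table 1.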
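The choice of the number h of principal components and the projection onto the retained eigenvectors can be sketched as follows. The eigenvalues, the default 95% threshold argument and the function names are assumptions made for illustration; in practice the eigenvalues and eigenvectors would come from an eigendecomposition of R, which is not shown here.

```python
def select_components(eigenvalues, threshold=0.95):
    """Return (h, rates): the smallest h whose cumulative contribution rate
    reaches the threshold, and the per-component contribution rates."""
    lam = sorted(eigenvalues, reverse=True)
    total = sum(lam)
    rates = [v / total for v in lam]
    cumulative = 0.0
    for h, rate in enumerate(rates, start=1):
        cumulative += rate
        if cumulative >= threshold:
            return h, rates
    return len(lam), rates

def project(sample, eigenvectors, h):
    """Principal component scores of one normalised sample: dot products
    with the first h eigenvectors (sorted by descending eigenvalue)."""
    return [sum(x * g for x, g in zip(sample, vec)) for vec in eigenvectors[:h]]

# Hypothetical eigenvalues of a 4 x 4 correlation matrix.
h, rates = select_components([3.0, 1.5, 0.4, 0.1])
print(h, [round(r, 2) for r in rates])

# Hypothetical orthonormal eigenvectors for a two-variable example.
scores = project([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]], 1)
```

With these illustrative eigenvalues the first two components account for 90% of the total, so a third component is needed to reach the 95% level.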

Preprocessing of fault diagnosis
To obtain the reference sequences, a matrix containing all 524 samples is first normalised with (2), and then the FCM method is implemented to obtain six cluster centres, one for each fault type. The result is shown in Table 2.
It can be concluded from Table 2 that, for discharge faults (such as D1, D2 and PD), hydrogen (H2) accounts for the major proportion of the gases, while the proportions of C2H2 and C2H4 increase as the discharge intensity increases. Meanwhile, the main gases for T1 are H2 and CH4, but the proportions of CH4 and C2H4 increase as the temperature rises. The major proportions of the key gases and the changing tendencies suggested by Table 2 agree with [4-6].
After the reference sequences are established, a sample matrix X combining the reference sequences and the testing samples is built, and IPCA is implemented to extract the main principal components from X and determine the final diagnosis result of the testing samples. The main steps of IPCA are described in Section 3. When the first h principal components have been obtained by following step (iv), the Euclidean distance D between the principal components of the testing sample C_s and those of the reference sequence C_r can be calculated with (12) [24]:
$$D = \sqrt{\sum_{k=1}^{h} \left( C_{s}(k) - C_{r}(k) \right)^{2}} \tag{12}$$
The fault type of the reference sequence corresponding to the minimum distance is taken as the diagnosis result. For example, if the minimum distance is the one to the reference sequence of high-energy discharge, then the final result is high-energy discharge.
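The minimum-distance decision can be sketched as a nearest-reference lookup. The principal component values below are hypothetical and are not taken from Table 2 or from the case study, only four of the six fault types are shown, and the function name `diagnose` is an assumption of this sketch.

```python
import math

def diagnose(test_pc, reference_pcs):
    """Return (fault, distances): the fault type whose reference principal
    components are closest to the testing sample in Euclidean distance."""
    distances = {
        fault: math.dist(test_pc, ref_pc) for fault, ref_pc in reference_pcs.items()
    }
    return min(distances, key=distances.get), distances

# Hypothetical first two principal components of four reference sequences.
reference_pcs = {
    "PD": [0.10, 0.25],
    "D1": [0.60, 0.40],
    "D2": [0.90, 0.10],
    "T1": [0.30, 0.80],
}

fault, distances = diagnose([0.12, 0.20], reference_pcs)
print(fault, {k: round(v, 4) for k, v in distances.items()})
```

Keeping the full distance dictionary alongside the decision makes it easy to see how close the runner-up fault type was, which is useful when two reference sequences lie near each other.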

Case study and experimental test
In order to reveal the fault diagnosis process concretely and demonstrate the validity and effectiveness of the proposed approach, a practical case and another 130 samples from China Southern Power Grid are studied.
According to the proposed method, the testing sample is normalised and then the sample matrix X is created. After that, IPCA is applied to calculate the correlation coefficient matrix R, the eigenvalues λ and the corresponding eigenvectors g. The contribution rate and cumulative contribution rate of the principal components are listed in Table 3. The minimum distance is d_3 = 0.2649, which indicates that the final diagnosis result of the testing sample is partial discharge. The diagnosis result accords with the actual situation, which demonstrates the fault diagnosis capability of the proposed approach.
In order to evaluate the effectiveness and accuracy of the proposed approach, 130 extra samples provided by a power supply bureau are diagnosed by three different interpretation techniques: the IEC ratio method, the ITR method and the proposed method. The comparison of the fault diagnosis results is presented in Table 4.
Table 4 illustrates that the proposed approach is capable of diagnosing incipient faults of power transformers effectively and correctly. Moreover, the proposed method has a higher accuracy than the IEC ratio method and the ITR method.

Conclusion
A fault diagnosis method based on FCM clustering and the IPCA algorithm is proposed in this paper. The method normalises the raw dataset with the relative percentage of gas concentration and regards the cluster centres of the sample matrix as reference sequences. IPCA is then applied to reduce the dimension and eliminate redundancy in the input variables. The distance between the principal components of the testing sample and those of the reference sequences is used to determine the final diagnostic result. The case study and experimental test verify the effectiveness and accuracy of the proposed approach. The comparison results show that the proposed method has a better fault diagnosis performance than the IEC ratio method and the ITR method.

Acknowledgments
This project is supported by the National High-tech R&D Program of China (No. 2015AA050201). The authors also thank all the reviewers for their useful comments.