An Efficient Approach With Application of Linear and Nonlinear Models for Evaluation of Power Transformer Health Index

In this paper, efficient and accurate linear and nonlinear models are proposed for indicating comprehensive health requirements of the transformer using health index (HI) concept. The models are established with 336 experimental datasets including oil characteristics and dissolved gas analysis (DGA) of various types of transformers placed in different areas. The significance of DGA parameters in transformer health condition is considered with the inclusive DGA factor (DGAF) parameter, which considers the weighting importance of seven dissolved gases. Nonlinear models used in this paper are artificial neural network (ANN) and adaptive neuro-fuzzy inference system (ANFIS), which represent the behavior of transformer insulation parameters. The nonlinear models are compared with multiple linear regression (MLR) which is a linear statistical model. The models are established with 80 percent of the experimental dataset. The other 20 percent of data are utilized for the efficiency assessment of the models. The results demonstrate that the models provide an assessment of the health condition of the transformers comparable to existing models with high accuracy. The contributions of this paper are: 1) Evaluating the overall HI of the transformer employing a complete set of 15 input parameters of transformer oil-paper insulation system. 2) Adding DGAF, %WaterPaper, IFT parameters and showing the importance of these parameters. 3) Regarding the condition of solid insulation of the transformer particularly. 4) Applying a diverse and large practical dataset composed of 336 different transformers located in different country areas. 5) Using the MLR method for three purposes. 6) Providing linear (MLR) and nonlinear (ANN, ANFIS) models for HI calculation of the dataset, simultaneously. 7) Verifying the applicability and efficiency of the ANFIS model for simulating HI value.

Continuous performance of power transformers is necessary to maintain the reliability of the power transmission and distribution network. Aging, along with changes in loading conditions, weather conditions, faults, and other electrical, chemical, and mechanical stresses, accelerates insulation deterioration of the transformers. Power transformer lifetime depends directly on the condition of the transformer insulation. Condition assessment of power transformers is necessary to extend transformer lifetime by detecting any probable failure and poor health condition. Some maintenance strategies are developed based on a comprehensive and simultaneous survey of different dissolved gas analysis (DGA) and oil quality related parameters [1]-[3].
The condition of each oil characteristic represents just one feature of the transformer insulation condition using the limits of the parameters related to DGA [4], [5] and oil quality [6]- [9] tests given in standards. But to make proper decisions, the operator needs the comprehensive health assessment of the transformer insulation system and investigation of all DGA [10] and oil-quality parameters. Therefore, some efficient methods are required that can be trained from network history and employed to assess the comprehensive health status of power transformers.
In this paper, the parameters collected from the site and laboratory diagnostic tests, operating observations, and field inspections are utilized for assessing the comprehensive health status of the transformer. The oil-quality test parameters are: breakdown voltage (kV), dissipation factor at 90 °C, acidity (mg KOH/g oil), interfacial tension (mN/m), water content in oil at 20 °C (ppm), percent water saturation of oil, percent water in paper insulation, degree of polymerization (DP), and furfural content (ppm). The DGA test parameters are the contents of the CO2, CO, H2, CH4, C2H2, C2H4, and C2H6 gases.
Health Index (HI) is a methodology that combines data on different features into a single quantity for comparing the comprehensive status of transformers. The utility uses HI to distinguish between degradation that requires maintenance schemes and degradation that indicates end of life, defined as DP = 200. The HI tool employs expert knowledge to forecast future operation, replacement procedures, and failure probabilities. HI quantifies the transformer condition based on multiple condition criteria related to the long-term degradation factors that cumulatively result in the transformer's end of life.

B. LITERATURE SURVEY
The Health Index concept for assets as it is known today was first introduced by Hughes [11]. It was continued in [12], where risk factors were included to provide a composite health index for network assets, and was then used extensively in [13] to describe the impact of preventative maintenance on the health index and to predict future asset condition based on the current health index and maintenance practices.
After discussing HI for general transmission and distribution assets, an approach to determine the health index specifically for power transformers is presented in [14], which gives a realistic and detailed Health Index formulation method for power transformers. For this purpose, a simple linear weighting system is used for each parameter, whereas the actual weighting, scoring and limits could vary from one power utility to another [15], [16]. Some works provide incipient fault diagnosis for transformers but do not consider a health index that represents the overall health condition by investigating all DGA and oil-quality parameters together [17]-[21].
HI has been used to evaluate the health status of the transformer in several studies [13]-[16], [22]-[33]. The information required for deriving different transformer assessment indices is presented in the CIGRE 761 technical brochure [34].
Artificial intelligence algorithms such as fuzzy logic [15], fuzzy SVM [16], synthetic minority over-sampling technique [22], binary cat swarm optimization based SVM [23], and multi-agent system [35] are used to obtain the transformer health condition. Neural network [17], [18] and neuro-fuzzy [19]-[21] methods are utilized to detect the fault condition of transformers. A feature selection method using classification techniques to extract the most effective parameters for determining the HI condition is presented in [24]. In [25], statistical analysis of the transformer data is performed first, and then the HI approach is given for transformer maintenance. A probabilistic method for transformer HI calculation that deals with data uncertainty is presented in [26], [27]. In [28], the HI decreasing rate is considered to improve the transformer condition assessment. Some regression-based models are simulated for HI prediction in [29]. A statistical distribution method is applied to predict the HI of transformers in [30]. A procedure for calculating the health index of oil-paper transformers using binary logistic regression is presented in [31]. A decision-support model that determines which assets need additional maintenance or replacement by failure mode and effect analysis is presented in [32]. In [33], the orthogonal wavelet network is used to estimate the transformer Health Index from transformer test results. In [36], economic parameters are included in the assessment so that the health index result can be used to prioritize maintenance activities.
Some important related works are categorized in Tables 1 to 3 in terms of input data, classification methods, aggregation methods, output type, and the advantages and disadvantages of each reference. The input data categories for some sample literature are compared in Table 1. The classification and aggregation methods for the same sample literature are compared in Table 2. The output type, advantages, and disadvantages for the same sample literature are described in Table 3.
The originality and contributions of the proposed work with respect to the literature, and how they address the identified research gaps, are explained in detail in the next subsection.

C. MAIN CONTRIBUTIONS
In this study, three efficient methods have been proposed for transformer HI calculation. Multiple Linear Regression (MLR) model is developed to obtain the best fit of data using linear regression [37], [38]. MLR method models the linear relationship between HI and transformer insulation parameters. Also, in order to consider the nonlinear relationship between model parameters, Artificial Neural Network (ANN) and Adaptive Neuro-Fuzzy Inference System (ANFIS) models are used for transformer HI calculation. The different models are implemented using 336 experimental datasets, and their performances are compared.
The originality and the principal contributions of the paper with respect to the literature are addressed as follows:
I) Evaluating the overall HI of the transformer employing a complete set of eight input parameters of the transformer oil-paper insulation system, including physical, chemical, mechanical, and electrical aspects of the transformer insulation condition. One of the input parameters of this paper is DGAF, which includes seven dissolved gases with their importance weightings. So, in this paper 15 input parameters are utilized for construction of the models. An increase in the amount of input data results in an improvement of the model's performance.
II) Utilizing the DGAF parameter instead of TDCG (total dissolved combustible gas) [4], and adding two significant parameters, IFT (Interfacial Tension) and %WaterPaper (percent Water in Paper insulation), to the input parameters in comparison with the previous works. The importance of adding the DGAF, %WaterPaper, and IFT parameters is shown by the importance ranking of the input parameters obtained in the results section using the efficient MLR method.
III) Giving particular regard to the condition of the solid insulation of the transformer. Due to the influence of the paper insulation condition on the overall HI of the transformer, %WaterPaper is considered in addition to Furfural and the CO and CO2 gases to monitor the insulating paper condition accurately.
IV) Applying a diverse and large practical dataset composed of 336 different transformers located in different areas of the country, which stabilizes the model so that it predicts a more accurate HI value for each new set of data.
V) Using the MLR method for three purposes. Firstly, MLR detects the outliers and bad influential points, which cause problems in model construction. [...] previous works is made to indicate the accuracy of the presented models.
The superior efficiency of the ANFIS model results from combining the learning capabilities of neural networks with the reasoning capabilities of fuzzy logic.

D. PAPER LAYOUT
The remainder of this paper is organized as follows. In Section 2, the dataset, including the parameters of the transformer diagnostic tests, is described. In Section 3, the methodology is presented, including the MLR, ANN and ANFIS formulation and implementation for determining the HI of the transformer oil-paper insulation system, and the error criteria for comparing the models. The results, which show the applicability of the developed ANFIS in predicting transformer HI and compare the models based on their deviation (error) from the experimental HI, are provided in Section 4, followed by the conclusion in Section 5.

III. DATASET OF TRANSFORMER INSULATION PARAMETERS
In this paper, a comprehensive dataset is utilized. The dataset is composed of 336 DGA and oil characteristics test reports of various transformers. The voltage levels and power ranges of the transformers in the dataset are different. The transformers are located in varying weather and operating conditions. This dataset includes advanced diagnostic tests of power transformers which are prepared by the Iran Transformer Research Institute (ITRI). It should be noted that the dataset does not correspond to 336 different transformers; some sets of data may relate to the same transformer at different time intervals. Diagnostic tests related to DGA and the oil-paper insulation system are done on transformers placed in various districts of Iran. The transformers of the dataset are used in different industries and loading conditions. Therefore, applying such a comprehensive dataset makes the results of the models reasonable. The model closest to the assessment of ITRI can be regarded as a credible model that has learned from the comprehensive dataset in the best manner and predicts HI accurately.
In this study, eight important parameters from different electrical, physical and chemical tests of power transformer oil-paper insulation, including Interfacial Tension (IFT), Breakdown Voltage (BDV), Acidity, Dissipation Factor at 90 °C (DF), Furfural content, Water content in oil at 20 °C (Water), percent Water in Paper insulation (%WaterPaper), and DGA Factor (DGAF) are regarded as the input parameters of the models. The HI parameter is the output of the models.
In this paper, the DGA is not used to specify the type of faults that occurred in the transformer. It helps to assess the comprehensive health status of the transformer. Thus, the values of seven dissolved gases (the DGA parameters) are integrated into one inclusive parameter (DGAF) [14]. In [15], the influence of DGA parameters on the HI is examined with the dissolved combustible gas (DCG) parameter. The DCG parameter is defined simply as the sum of the DGA gas values, excluding CO and CO2. Two disadvantages of using the DCG parameter are:
1) The CO and CO2 gases, which carry useful information about the degradation of paper insulation, are disregarded [4], [5], [16], [21].
2) The degree of importance and the weighting factors of the gases are not considered [14], [16].
The DGAF value is obtained using various limits of the dissolved gases given in the standards [4], [5] as follows [14].

$$\mathrm{DGAF} = \frac{\sum_{i=1}^{7} S_i W_i}{\sum_{i=1}^{7} W_i} \tag{1}$$

where $i$ indexes the seven dissolved gases (H2, CH4, C2H6, C2H4, C2H2, CO and CO2), $S_i$ is the score value according to the volume of the dissolved gas, and $W_i$ stands for the weighting factor of the dissolved gas assumed by [14].
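The DGAF aggregation can be sketched in a few lines, assuming it is the weighted average of the per-gas scores (sum of $S_i W_i$ divided by the sum of $W_i$). The function name `dgaf` and the example scores and weights are hypothetical placeholders; they are not the values tabulated in [14] or in the standards.

```python
from typing import Sequence


def dgaf(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of per-gas scores: sum(S_i * W_i) / sum(W_i)."""
    if len(scores) != len(weights) or not scores:
        raise ValueError("scores and weights must be non-empty and equal in length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


# Placeholder scores and weights for H2, CH4, C2H6, C2H4, C2H2, CO, CO2.
# These numbers are illustrative assumptions only.
example_scores = [1, 2, 1, 3, 1, 2, 1]
example_weights = [2, 3, 3, 3, 5, 1, 1]
example_dgaf = dgaf(example_scores, example_weights)
```

Because the result is a weighted average, it always lies between the smallest and largest individual gas score, which keeps DGAF on the same scale as the scores themselves.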
In this paper, two parameters DP and percent water saturation of oil are not considered as the input parameters for the models, because they are highly related to Furfural and Water parameters, respectively.
In order to deal with the moisture content appropriately, the water content of the oil measured at the sampling temperature should be corrected to a specified temperature. On empirical grounds, the specified temperature is taken as 20 °C [6]. In this paper, the corrected values of water at 20 °C are utilized to facilitate the comparison of the parameters at various oil temperatures [6]. The parameter percent water saturation of oil is calculated from the equation: 100×[(water in oil)/(water in oil at saturation)] [8], [9], where water in oil at saturation means the maximum content of water that is soluble in the oil at a particular temperature, which is equal to 53 ppm [9] at 20 °C. Therefore, percent water saturation of oil is not considered as an individual input parameter because it is related to the water content of oil at 20 °C (Water).
The relation between DP and Furfural in the dataset is very close to the equation: DP = (1.51 − log F)/0.0035 [39], so DP is also not considered as an individual input parameter.
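The two relations used to drop redundant inputs can be written out directly. This is a minimal sketch: the function names are hypothetical, the 53 ppm saturation value is the one quoted above for 20 °C, and the logarithm in the DP relation is assumed to be base 10 with furfural given in ppm.

```python
import math

WATER_SATURATION_PPM_20C = 53.0  # solubility limit at 20 °C quoted in the text


def percent_water_saturation(water_ppm: float,
                             saturation_ppm: float = WATER_SATURATION_PPM_20C) -> float:
    """100 * (water in oil) / (water in oil at saturation)."""
    return 100.0 * water_ppm / saturation_ppm


def dp_from_furfural(furfural_ppm: float) -> float:
    """DP = (1.51 - log10(F)) / 0.0035; the base-10 logarithm is an assumption."""
    if furfural_ppm <= 0:
        raise ValueError("furfural content must be positive")
    return (1.51 - math.log10(furfural_ppm)) / 0.0035
```

Both functions depend on a single measured quantity, so feeding their outputs to a model alongside Water and Furfural would add no independent information.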
In this paper, two insulation parameters including IFT and %WaterPaper are also emphasized in the input parameters for a more accurate assessment of transformer overall HI. The IFT between oil and water is an excellent indicator to detect the particles of the degradation process and contaminants soluble in the transformer oil. The parameter IFT can be used to detect the deterioration of materials in overloaded transformers [6], [8], [9].
VOLUME 9, 2021
Water content in the oil is measured as one of the routine tests for oil in the transformers. The content of moisture in the oil does not always signify moisture in the paper insulation. In the process of transformer cooling, water tends to return to the paper slowly. The variations in water content of oil slightly affect water content of paper, because about 99 percent of the water exists in the paper insulation.
When a thermal balance between the paper and oil is established, water content in oil could be an accurate indicator of the water content in the paper insulation. Such a case usually does not happen in operating transformers [6], [8], [9]. In this paper, due to the significance of the solid insulation condition in specifying the health status of the transformer, in addition to furfural and dissolved carbon oxides in oil, %WaterPaper is also considered as an individual parameter to evaluate the condition of transformer paper insulation [6], [8], [9], [21].
In this paper, experimental values of HI prepared by transformer experts at ITRI, are utilized as the output parameter of the proposed models.

IV. METHODOLOGY
In this section, MLR, ANN, and ANFIS models are proposed for HI calculation of the transformer insulation system. The dataset is split randomly, with 80% used for training and 20% used for testing. The testing dataset is utilized for evaluating the proficiency of the models. The trained models are utilized to predict HI for the testing dataset (unseen data) with the smallest possible deviation from the experimental values of HI provided by the ITRI.
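A random 80/20 split of this kind can be sketched with NumPy as follows. The helper name `split_indices`, the seed, and the use of `numpy.random.default_rng` are illustrative assumptions; the paper does not state how its random split was generated.

```python
import numpy as np


def split_indices(n_samples: int, train_fraction: float = 0.8, seed: int = 0):
    """Return (train_idx, test_idx) from a random permutation of the sample indices."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]


# With 335 sets of data this gives 268 training and 67 testing sets.
train_idx, test_idx = split_indices(335)
```

Fixing the seed makes the split reproducible, so that different models are compared on exactly the same training and testing sets.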

A. MULTIPLE LINEAR REGRESSION (MLR) MODEL
MLR is a method utilized to model the linear relationship between input parameters (transformer insulation characteristics) and the output parameter (HI) using regression analysis [37], [38]. Linear regression provides an equation that minimizes the distance between the fitted line and all data points. A small difference between the experimental and predicted HI values indicates that the model fits the data well. The most common fitting criterion in linear regression is the minimization of the sum of the squared errors. The model expresses the value of an output variable as a linear function of the input variables, so the resulting prediction equation for the $i$th set of data is as follows.

$$y_i^{prd} = c_0 + \sum_{j=1}^{p} c_j k_j \tag{2}$$

where $k_j$ is the value of the $j$th input parameter, $c_0$ is the regression constant, $c_j$ is the coefficient of the $j$th input parameter, $p$ is the number of input parameters, and $y_i^{prd}$ is the predicted output value for the $i$th set of data.
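A least-squares fit of the regression constant and coefficients can be sketched with `numpy.linalg.lstsq`. The function names and the synthetic, noise-free data below are assumptions for illustration; they are not the ITRI dataset or the authors' implementation.

```python
import numpy as np


def fit_mlr(K: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Return [c_0, c_1, ..., c_p] minimizing the sum of squared errors."""
    design = np.column_stack([np.ones(len(K)), K])  # leading column of ones for c_0
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs


def predict_mlr(coeffs: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Evaluate y_prd = c_0 + sum_j c_j * k_j for every row of K."""
    return coeffs[0] + K @ coeffs[1:]


# Synthetic data with p = 3 inputs and known coefficients.
rng = np.random.default_rng(1)
K_demo = rng.normal(size=(50, 3))
true_coeffs = np.array([0.5, 0.2, -0.1, 0.3])
y_demo = true_coeffs[0] + K_demo @ true_coeffs[1:]
fitted = fit_mlr(K_demo, y_demo)
```

On noise-free data the fit recovers the generating coefficients, which is a quick sanity check before applying the same code to measured data.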

B. ARTIFICIAL NEURAL NETWORK (ANN) MODEL
The idea of ANN is derived from the human brain. Complicated relations in the problem data can be modeled with an ANN, which is used as a black-box model that needs no precise details of the problem [17], [18], [40].
ANN is a proficient nonlinear method that evaluates the transformer predicted HI value. ANN model learns the relations between input parameters (transformer insulation characteristics) and output parameter (HI) based on training data.
In order to implement the ANN model, a three-layer feedforward neural network including one hidden layer trained with the Levenberg-Marquardt (LM) Back-Propagation (BP) algorithm is utilized. Through extensive experiments, it has been demonstrated from a practical viewpoint that neural networks with one hidden layer are preferred to networks with more than one hidden layer, because the latter are more prone to falling into a local minimum. In engineering applications, neural networks with one hidden layer are usually utilized [40].
The number of neurons in the input and output layers is specified based on the problem definition. In this study, the input layer has eight neurons and the output layer has one neuron. The number of hidden layer neurons can be considered as an adjustable parameter, which should be optimized.
The weights and biases of the ANN model are adjusted for each training sample to minimize the mean squared error between the predicted value of the network and the experimental one. In the input layer of the network, the summation function is calculated with the inputs, their weights and biases. The output of the $j$th hidden neuron is attained with the following transfer function.

$$a_j = f_1\left(\sum_{i=1}^{r} w_{ij} x_i + b_j\right) \tag{3}$$

where $r$ is the number of input layer neurons (input parameters), $w_{ij}$ is the connection weight between the input and hidden layers, $x_i$ is the $i$th input, $b_j$ is the bias of the $j$th hidden neuron, and $a_j$ is the output of the hidden layer and the input of the next layer (output layer). The final output is calculated as follows.

$$a_k = f_2\left(\sum_{j=1}^{S} w_{jk} a_j + b_k\right) \tag{4}$$

where $S$ is the number of hidden layer neurons, $w_{jk}$ is the connection weight between the hidden and output layers, $b_k$ is the bias of the output layer, and $a_k$ is the final output. Several linear and nonlinear transfer functions are available for ANN. The nonlinear transfer functions are continuous and differentiable, so the network can capture complicated relations between input and output data. In this work, the sigmoid logsig ($f_1$) and linear purelin ($f_2$) transfer functions are utilized in the hidden and output layers, respectively.
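The forward pass of such a one-hidden-layer network can be sketched as below, with a logistic sigmoid in the hidden layer and an identity (purelin) output. The function names, array layout, and placeholder weights are assumptions for illustration; training with the LM algorithm is not shown.

```python
import numpy as np


def logsig(z):
    """Logistic sigmoid, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


def ann_forward(x, W_ih, b_h, w_ho, b_o):
    """x: (r,), W_ih: (r, S), b_h: (S,), w_ho: (S,), b_o: scalar."""
    a_hidden = logsig(x @ W_ih + b_h)    # hidden-layer outputs a_j
    return float(a_hidden @ w_ho + b_o)  # linear output a_k (purelin is the identity)


# Hypothetical sizes: r = 8 inputs, S = 3 hidden neurons; weights are placeholders.
r, S = 8, 3
x_demo = np.ones(r)
out_demo = ann_forward(x_demo, np.zeros((r, S)), np.zeros(S), np.ones(S), 0.1)
```

With all input-to-hidden weights set to zero, every hidden neuron outputs 0.5, so the placeholder network returns 3 × 0.5 + 0.1 = 1.6 regardless of the input.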
The implementation diagram of the ANN architecture is illustrated in Fig. 1.

C. ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM (ANFIS) MODEL
ANFIS is an adaptive network that utilizes a neural network learning method and a fuzzy inference system to map inputs into the output. It can be used to simulate complex nonlinear problems. The basis of hybrid neuro-fuzzy models is the application of neural network learning rules to specify the membership function (MF) parameters automatically [19], [20], [38].
The ANFIS method was suggested by Jang on the basis of the TSK (Takagi-Sugeno-Kang) fuzzy inference system, and it includes the capabilities of both neural network and fuzzy logic methods [41].
The output of the TSK fuzzy system is a linear combination of the inputs. Therefore, the output is a crisp number and a defuzzification process is not needed. The $i$th rule of the TSK fuzzy system is as follows.

$$\text{If } x_1 \text{ is } A_{i1} \text{ and } x_2 \text{ is } A_{i2} \ldots \text{ and } x_r \text{ is } A_{ir} \text{ then } y_i = p_{i0} + \sum_{q=1}^{r} p_{iq} x_q, \quad i = 1, \ldots, m \tag{7}$$

where $x_j$ is the $j$th input parameter, $r$ is the number of input parameters, $m$ is the number of rules, $A_{ij}$ is the $j$th fuzzy set of the $i$th rule, $y_i$ is the output of the $i$th rule, and $p_{iq}$ are the consequent parameters of the $i$th rule. The ANFIS structure has five layers: fuzzy layer, product layer, normalized layer, defuzzify layer, and total output layer [41].
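In the first layer, fuzzy MFs of the input parameters are generated as follows.

$$O^1_{ij} = \mu_{A_{ij}}(x_j) \tag{8}$$

where $\mu_{A_{ij}}$ could be any fuzzy MF type. The Gaussian MF is usually used for ANFIS as follows.

$$\mu_{A_{ij}}(x_j) = \exp\left(-\frac{(x_j - c_{ij})^2}{2\sigma_{ij}^2}\right) \tag{9}$$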
where $c_{ij}$ and $\sigma_{ij}$ are premise parameters which describe the Gaussian MF center and width, respectively. The second layer is composed of fixed nodes. These nodes combine the input MFs to calculate the firing strength of the $i$th rule ($w_i$), which is computed by the algebraic product T-norm as follows.

$$w_i = \prod_{j=1}^{r} \mu_{A_{ij}}(x_j) \tag{10}$$

The third layer applies a normalization function to obtain the normalized firing strength of the $i$th rule as follows.

$$\bar{w}_i = \frac{w_i}{\sum_{l=1}^{m} w_l} \tag{11}$$

In the fourth layer, the nodes are adaptable and every node outputs the product of the rule consequent in (7) and the normalized firing strength in (11) as follows.

$$O^4_i = \bar{w}_i \left(p_{i0} + \sum_{q=1}^{r} p_{iq} x_q\right) \tag{12}$$

Finally, the fifth layer is the total output layer, which gives the overall output as the sum of all incoming signals as follows.

$$y = \sum_{i=1}^{m} O^4_i \tag{13}$$
In the ANFIS structure, there are two adaptable layers including layer 1 which has two adjustable parameters (premise parameters c ij and σ ij ) related to the input MFs, and layer 4 which has r+1 adjustable parameters (consequent parameters p iq ) of the first-order polynomial.
The overall pseudocode of the proposed ANFIS method consists of five loops. In the first loop, all membership degrees are calculated by the Gaussian function; the firstlayer_nods value represents the number of nodes. In the remaining loops, the results of layers two to five are evaluated; the nods value represents the number of nodes in these layers. The functions rule_layer, normalize and consequent are explained in (10) to (12).
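The five-layer evaluation can be sketched as a vectorized NumPy forward pass. This is not the authors' pseudocode: the function name, the array layout, and the Gaussian form exp(−(x − c)² / (2σ²)) are assumptions, and the two-rule parameters below are placeholders.

```python
import numpy as np


def anfis_forward(x, centers, sigmas, consequents):
    """
    x: (r,) inputs; centers, sigmas: (m, r) premise parameters;
    consequents: (m, r + 1) rows [p_i0, p_i1, ..., p_ir].
    Returns (overall output, normalized firing strengths).
    """
    # Layer 1: Gaussian membership degree of every input in every rule.
    mu = np.exp(-((x - centers) ** 2) / (2.0 * sigmas ** 2))
    # Layer 2: firing strength of each rule (algebraic product T-norm).
    w = mu.prod(axis=1)
    # Layer 3: normalized firing strengths.
    w_bar = w / w.sum()
    # Layer 4: first-order rule outputs weighted by normalized firing strength.
    weighted = w_bar * (consequents[:, 0] + consequents[:, 1:] @ x)
    # Layer 5: overall output as the sum of all incoming signals.
    return float(weighted.sum()), w_bar


# Placeholder system with m = 2 rules and r = 2 inputs.
x_demo = np.array([0.0, 0.0])
centers_demo = np.array([[0.0, 0.0], [1.0, 1.0]])
sigmas_demo = np.ones((2, 2))
consequents_demo = np.array([[1.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
y_demo, w_bar_demo = anfis_forward(x_demo, centers_demo, sigmas_demo, consequents_demo)
```

Because the normalized firing strengths sum to one, the output is a convex combination of the rule outputs; here it lies between 1 and 3 and is closer to 1, since the input sits on the center of the first rule.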
In this study, the ANFIS model with eight inputs and one output is utilized based on the subtractive clustering algorithm. The subtractive clustering algorithm itself identifies the number of clusters. In this algorithm, the number of fuzzy rules is only associated with the number of clusters. Therefore, it is a proper method to solve problems which have a large number of inputs.
The ANFIS utilizes a hybrid learning method to train the network, which is a combination of least-squares and backpropagation gradient descent methods. The hybrid learning approach effectively attains the optimal premise parameters in layer 1 and consequent parameters in layer 4 [41].
The implementation diagram of the ANFIS structure, which consists of five layers, is shown in Fig. 2.

D. ERROR CRITERIA
A combination of error metrics is often required to evaluate the model performance. In this paper, four statistical error criteria including root mean squared error (RMSE), coefficient of determination (R 2 ), mean absolute error (MAE), and mean relative error (MRE) have been utilized to determine the performance and predictive capabilities of the models. The error criteria equations are as follows.
where $y_i^{exp}$ and $y_i^{prd}$ are the experimental and predicted output values for the $i$th set of data, respectively, $\bar{y}^{prd}$ is the average of the predicted output values, and $n$ is the number of samples in the dataset.
Perfect agreement between the predicted and experimental values corresponds to RMSE, MAE, and MRE equal to zero and R2 equal to one. Together, these error measures capture different aspects of training quality. R2 is a measure ranging from 0 to 1 that shows the global fit of the model. In this paper, R2 is used to measure the agreement between the experimental and predicted values of each model. The closer R2 is to 1, the stronger this agreement.
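The four criteria can be computed as in the sketch below. RMSE and MAE follow their usual definitions. The forms of MRE (absolute error relative to the experimental value) and R2 (one minus the ratio of residual to total sum of squares, centred on the experimental mean) are textbook assumptions; the text refers to the mean of the predicted values, so the paper's exact R2 expression may differ.

```python
import numpy as np


def error_criteria(y_exp, y_prd):
    """Return RMSE, R2, MAE and MRE for experimental vs. predicted values."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_prd = np.asarray(y_prd, dtype=float)
    err = y_exp - y_prd
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    # Assumed definition: relative error taken with respect to the experimental value.
    mre = float(np.mean(np.abs(err) / np.abs(y_exp)))
    # Assumed definition: 1 - SSE / SST with SST centred on the experimental mean.
    r2 = float(1.0 - np.sum(err ** 2) / np.sum((y_exp - y_exp.mean()) ** 2))
    return {"RMSE": rmse, "R2": r2, "MAE": mae, "MRE": mre}


perfect = error_criteria([0.2, 0.5, 0.9], [0.2, 0.5, 0.9])
imperfect = error_criteria([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```

MRE as written is undefined when an experimental value is exactly zero, so a guard would be needed if the normalized HI can reach 0.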

V. RESULTS AND DISCUSSION
The quality of the training dataset may considerably affect the efficiency of the intelligent methods of transformer condition evaluation. In this paper, a diverse and large dataset is used that contains almost all possible conditions (good, fair, poor) assumed in [6] for each input parameter, which stabilizes the model so that it predicts the HI value closest to the ITRI assessment for each new set of data.

A. CHECKING QUALITY OF THE DATASET
At first, MLR analysis is done with all 336 sets of data to examine data quality and detect the outliers and influential points, which cause problems in model construction.
The applicability domain of the MLR model is examined by the Williams plot [38], which is an efficient method to find both the response outliers and the structurally influential points in the model. This plot is obtained from the calculation of the standardized residuals and leverage values for each set of data. The leverage value ($h_i$) for the $i$th set of data is obtained as follows.

$$h_i = x_i \left(X^T X\right)^{-1} x_i^T \tag{18}$$

where $h_i$ is the leverage value, $X$ is the $n \times (r+1)$ matrix containing the $r$ input parameters for each of the $n$ data sets together with a column of ones for the regression constant, and $x_i$ is the $i$th row vector of $X$.
The standardized residuals (e s,i ) for the i th set of data is defined as follows.
where mse is the mean squared error of the model between the predicted and experimental HI values.
In the Williams plot, structurally influential sets have leverage values greater than the critical hat value (h* = 3(r + 1)/n). Moreover, sets of data with standardized residuals greater than three standard deviation units (3σ) are considered as outliers. Fig. 3 shows the Williams plot that has four zones. Zones 1, 2, 3 and 4 indicate regular sets, good influential high leverage sets, bad influential high leverage sets, and high residual outlier sets of data, respectively.
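The zone assignment of the Williams plot can be sketched as follows. The leverage is the diagonal of the hat matrix and the critical value is 3(r + 1)/n as stated above. Scaling the residuals by the square root of the mean squared error is an assumption (other standardizations also divide by a leverage-dependent term), and the function name and synthetic data are hypothetical.

```python
import numpy as np


def williams_zones(X_inputs, y_exp, y_prd):
    """Return leverages, standardized residuals, critical hat value and zone labels."""
    X_inputs = np.asarray(X_inputs, dtype=float)
    n, r = X_inputs.shape
    X = np.column_stack([np.ones(n), X_inputs])  # n x (r + 1) design matrix
    # Leverage h_i = x_i (X^T X)^{-1} x_i^T, i.e. the diagonal of the hat matrix.
    hat_diag = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)
    residuals = np.asarray(y_exp, dtype=float) - np.asarray(y_prd, dtype=float)
    # Assumption: residuals are scaled by sqrt(mse) only.
    std_res = residuals / np.sqrt(np.mean(residuals ** 2))
    h_star = 3.0 * (r + 1) / n
    high_lev = hat_diag > h_star
    high_res = np.abs(std_res) > 3.0
    # Zone 1: regular, 2: good high leverage, 3: bad high leverage, 4: outlier.
    zones = np.where(high_lev & high_res, 3,
                     np.where(high_lev, 2, np.where(high_res, 4, 1)))
    return hat_diag, std_res, h_star, zones


# Synthetic example (not the ITRI dataset): n = 40 sets of data, r = 2 inputs.
rng = np.random.default_rng(2)
X_demo = rng.normal(size=(40, 2))
y_demo = 1.0 + X_demo @ np.array([0.5, -0.3]) + 0.05 * rng.normal(size=40)
design = np.column_stack([np.ones(40), X_demo])
y_fit = design @ np.linalg.lstsq(design, y_demo, rcond=None)[0]
h, e_s, h_star, zones = williams_zones(X_demo, y_demo, y_fit)
```

The leverages always sum to r + 1, so the average leverage is (r + 1)/n and the critical value flags points with more than three times the average.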
It can be seen from Fig. 3 that the majority of the sets of data are placed inside the applicability domain, that there is no outlier (Zone 4), and that there is only one bad influential point (Zone 3) in the dataset. The high leverage sets of data that have small residuals (Zone 2) are related to some transformers with Poor or Very Poor experimental HI condition. They are good influential points that make the model stable and more accurate. However, the bad influential points (Zone 3), which have simultaneously high leverage and high residual values and could probably be associated with wrong measurements, destabilize the model and should be removed from the dataset to avoid decreasing the accuracy of the model. Removing the bad influential point decreases the RMSE of the model by 4.093%, from 0.2150 to 0.2062. After removing the bad influential point from the dataset, the remaining 335 sets of data are divided randomly into training and testing subsets. The 268 training sets are utilized to build the proposed MLR, ANN and ANFIS models, whereas the remaining 67 testing sets are used to indicate the efficiency of the trained models in HI evaluation using the transformer insulation parameters.

B. MLR MODEL
The linear model constructed with training dataset by MLR method given in equation (2) between transformer insulation parameters and HI value for the i th set of data is as follows.
where regression coefficients calculated by MLR model are: It can be seen that the sign of the coefficients of BDV and IFT parameters is positive and the other six parameters have a negative sign. The physical concepts of transformer insulation parameters given in standards [6]- [9] confirm that the values of BDV and IFT parameters have a positive relationship with the transformer insulation condition (the higher the BDV and IFT values, the better the transformer insulation condition), and the other six parameters have a negative relationship with the transformer insulation condition.
In order to make the MLR model coefficient values comparable and investigate the significance of each transformer insulation parameter on the evaluation of HI condition, the parameters should be standardized. For this purpose, HI and transformer insulation parameters are standardized such that their mean values be zero and standard deviation values be one. By doing MLR analysis on this standardized dataset, the model coefficients are calculated as: The standardization of the regression coefficients makes it possible to emphasize the parameters with larger absolute standardized coefficients. So importance ranking of insulation parameters becomes as follows.
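The standardization step and the resulting ranking can be sketched as below. The function names and the synthetic data are assumptions for illustration; the coefficient values reported in the paper are not reproduced here.

```python
import numpy as np


def standardized_coefficients(K, y):
    """Fit MLR on z-scored inputs and output and return the standardized coefficients."""
    K = np.asarray(K, dtype=float)
    y = np.asarray(y, dtype=float)
    Kz = (K - K.mean(axis=0)) / K.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    # After standardization the intercept is zero, so no constant column is needed.
    beta, *_ = np.linalg.lstsq(Kz, yz, rcond=None)
    return beta


def importance_ranking(beta, names):
    """Order parameter names by decreasing absolute standardized coefficient."""
    order = np.argsort(-np.abs(beta))
    return [names[i] for i in order]


# Synthetic example with three hypothetical inputs k0, k1, k2.
rng = np.random.default_rng(3)
K_demo = rng.normal(size=(200, 3))
y_demo = 0.1 * K_demo[:, 0] - 2.0 * K_demo[:, 1] + 0.7 * K_demo[:, 2]
beta_demo = standardized_coefficients(K_demo, y_demo)
ranking_demo = importance_ranking(beta_demo, ["k0", "k1", "k2"])
```

The sign of each standardized coefficient is preserved, so the same output also shows whether a parameter raises or lowers the predicted HI.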
It could be seen that DGAF has the highest effect on transformer HI value, and also %WaterPaper and Furfural are two next significant parameters. Moreover, it can be illustrated that two insulation parameters, IFT and %WaterPaper considered in addition to the parameters of the previous works, have a considerable effect on HI value.

C. ANN MODEL
ANN performance depends on the settings of its architecture parameters. The selection of the initial values of its weights and biases has a significant effect on the network's performance. Because a nonlinear optimization method (the LM BP training algorithm) is used in the ANN structure, it does not necessarily reach the same solution at each run. Finding an appropriate architecture for ANN is a difficult task. The optimum number of neurons in the hidden layer is determined by trial and error.
The optimal number of hidden neurons can be determined by comparing the average calculated RMSE of the networks. In Table 4, the average and standard deviation of RMSE for the testing dataset are given for different numbers of hidden neurons over 100 trials.
It can be inferred from Table 4 that if the hidden layer has three neurons, the ANN model results in minimum average and standard deviation of RMSE.
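The repeated-trial selection can be sketched generically. Here `train_and_score` is a hypothetical callable standing in for "train one network from a random initialization and return its testing RMSE"; the stand-in scorer below is synthetic and only mimics a minimum at three hidden neurons.

```python
import statistics
from typing import Callable, Dict, Iterable, Tuple


def select_hidden_size(train_and_score: Callable[[int, int], float],
                       sizes: Iterable[int],
                       n_trials: int = 100) -> Tuple[int, Dict[int, Tuple[float, float]]]:
    """Return the size with the lowest mean RMSE and the (mean, std) summary per size."""
    summary = {}
    for n_hidden in sizes:
        scores = [train_and_score(n_hidden, seed) for seed in range(n_trials)]
        summary[n_hidden] = (statistics.mean(scores), statistics.pstdev(scores))
    # Lowest mean RMSE wins; ties are broken by the lowest standard deviation.
    best = min(summary, key=lambda k: summary[k])
    return best, summary


def fake_score(n_hidden: int, seed: int) -> float:
    """Synthetic stand-in for a real training run (assumption, not measured data)."""
    return 0.2 + 0.01 * abs(n_hidden - 3) + 0.001 * (seed % 5)


best_size, table = select_hidden_size(fake_score, range(1, 8), n_trials=10)
```

Averaging over many random initializations separates the effect of the architecture from the run-to-run variation of the training algorithm.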
The weight and bias values of the optimal ANN configuration (with three hidden neurons) are shown in Table 5.

D. ANFIS MODEL
In this paper, the ANFIS model with the subtractive clustering algorithm and hybrid learning method is utilized. The premise and consequent parameters for the optimal ANFIS model are shown in Tables 6 and 7. For example, from (7), the premise of Rule 1 becomes: If $x_1$ is $A_{11}$ and $x_2$ is $A_{12}$ and $x_3$ is $A_{13}$ and $x_4$ is $A_{14}$ and $x_5$ is $A_{15}$ and $x_6$ is $A_{16}$ and $x_7$ is $A_{17}$ and $x_8$ is $A_{18}$, where $x_1$, $x_2$, $x_3$, $x_4$, $x_5$, $x_6$, $x_7$, and $x_8$ are the BDV, DF, Acidity, IFT, Water, %WaterPaper, Furfural, and DGAF parameters, respectively.
The Gaussian type fuzzy MFs µ A ij for the j th input parameter and the i th rule, generated with equation (9) are shown in Fig. 4.
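How the premise of a rule such as Rule 1 is evaluated can be illustrated with a short sketch. Two assumptions are made here: the Gaussian MF takes the common form exp(-(x - c)^2 / (2 sigma^2)), since equation (9) is not reproduced in this section, and the "and" connective is implemented as a product. The centers and widths below are hypothetical and are not the values of Table 6.

```python
import numpy as np

def gaussian_mf(x, center, sigma):
    """Gaussian membership value (assumed standard form, peak value 1 at the center)."""
    return np.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def rule_firing_strength(x, centers, sigmas):
    """Firing strength of one rule: product of the memberships of all inputs.

    The product as the 'and' operator is an assumption of this sketch.
    """
    x = np.asarray(x, dtype=float)
    centers = np.asarray(centers, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    return float(np.prod(gaussian_mf(x, centers, sigmas)))

# Hypothetical premise parameters for one rule with eight inputs.
centers = np.linspace(0.1, 0.8, 8)
sigmas = np.full(8, 0.2)

at_center = rule_firing_strength(centers, centers, sigmas)       # every MF at its peak
shifted = rule_firing_strength(centers + 0.2, centers, sigmas)   # one sigma away on each input
```

An input vector that coincides with the rule centers fires the rule fully, and the firing strength decays smoothly as the inputs move away from the centers.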

E. COMPARISON OF THE MODELS AND DISCUSSION
Unlike the ANN model, the MLR and ANFIS models are robust and yield exactly the same error criteria and HI values at each run. In Table 8, the error criteria of the HI calculation for the proposed MLR, ANN and ANFIS models are provided.
The small values of the RMSE, MAE and MRE and the values of $R^2$ close to one in Table 8 demonstrate the agreement of the proposed models with the experimental model of ITRI. It can be observed from Table 8 that the ANFIS model provides superior results for the training, testing and total datasets. This superior efficiency results from combining the learning capabilities of neural networks with the reasoning capabilities of fuzzy logic in the ANFIS.
In order to specify the overall health status of a transformer, the HI values are normalized on a scale from 0 (thoroughly degraded transformer) to 1 (excellent condition). Table 9 presents the categories of HI values and correlates them with the failure probability, expected lifetime and required actions. HI values are classified into condition categories from "Very Good" to "Very Poor". The health status of each transformer is specified by the ranges of HI values defined in [13], [14].
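The four error criteria can be computed as in the following sketch. The definitions used are the common ones and are an assumption, since the formulas are not restated in this section. In particular, MRE is taken as the mean of the absolute error divided by the absolute reference value, which requires nonzero reference values. The function name and sample values are illustrative.

```python
import numpy as np

def error_criteria(y_true, y_pred):
    """Return RMSE, MAE, MRE and R^2 between reference and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    # Assumed definition: mean relative error; y_true must not contain zeros
    mre = float(np.mean(np.abs(err) / np.abs(y_true)))
    # Coefficient of determination: 1 - SS_res / SS_tot
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return {"RMSE": rmse, "MAE": mae, "MRE": mre, "R2": r2}

# Small illustrative vectors (not the paper's HI values).
perfect = error_criteria([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
example = error_criteria([1.0, 2.0, 4.0], [2.0, 2.0, 4.0])
```

A perfect prediction gives zero RMSE, MAE and MRE together with an $R^2$ of one, which is the pattern the small errors in Table 8 approach.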
The comparison of the experimental normalized HI values with those predicted by the proposed MLR, ANN and ANFIS models is given in Fig. 5 for the 268 training datasets of transformers.
The corresponding comparison for the 67 testing datasets of transformers is given in Fig. 6.
It can be seen from Figs. 5 and 6 that the HI values of the transformers in the training and testing datasets fall in various health condition zones of Table 9.
The normalized HI values predicted by the ANFIS model (the most precise and robust model) are plotted against the experimental HI values for the 335 training and testing datasets in Fig. 7.
For data points placed in the diagonal grids of Fig. 7, the health condition of Table 9 is the same for the experimental and predicted HI values. Datasets in the nondiagonal grids correspond to cases in which the predicted condition differs from the experimental one. For example, the health condition for the cases in Grid 1 is Poor for the predicted HI and Very Poor for the experimental HI. For the cases in Grid 2, the predicted and experimental HI values correspond to Fair and Good health conditions, respectively. It is concluded from Fig. 7 that the accuracy of the proposed model is high for this comprehensive dataset, because about 80% of the datasets are placed in the diagonal grids.
Also, the other cases are adjacent to the diagonal line. These cases are located at the border of two condition zones. For these datasets, the experimental HI indicates that the transformer is at the end of one condition zone, and the predicted HI indicates that the transformer is at the beginning of the adjoining condition zone. Therefore, the results indicate that the predicted and experimental HI values are in close agreement.
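The grid analysis of Fig. 7 can be sketched as a five-by-five agreement matrix between experimental and predicted condition categories. The category boundaries used below are hypothetical equal-width placeholders and are not the ranges of Table 9, which are defined in [13], [14]. The function names and sample HI values are likewise illustrative.

```python
import numpy as np

# Placeholder boundaries on the normalized 0..1 scale.
# These equal-width bins are an assumption, NOT the ranges of Table 9.
EDGES = np.array([0.2, 0.4, 0.6, 0.8])
LABELS = ["Very Poor", "Poor", "Fair", "Good", "Very Good"]

def category(hi):
    """Map normalized HI values to category indices 0 (Very Poor) to 4 (Very Good)."""
    return np.digitize(np.asarray(hi, dtype=float), EDGES)

def agreement(hi_exp, hi_pred):
    """Return the 5x5 count grid, the diagonal fraction, and the within-one-zone fraction."""
    c_exp, c_pred = category(hi_exp), category(hi_pred)
    grid = np.zeros((5, 5), dtype=int)
    np.add.at(grid, (c_exp, c_pred), 1)  # rows: experimental, columns: predicted
    diff = np.abs(c_exp - c_pred)
    return grid, float(np.mean(diff == 0)), float(np.mean(diff <= 1))

# Illustrative values only: the third case sits at a zone border.
hi_exp = [0.10, 0.35, 0.59, 0.90]
hi_pred = [0.15, 0.38, 0.61, 0.85]
grid, exact, adjacent = agreement(hi_exp, hi_pred)
```

In this example the third transformer has nearly identical experimental and predicted HI values but falls on opposite sides of a boundary, which is the border effect described above for the nondiagonal cases.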
The test data, which are the input parameters of the presented models, are shown in Table 10. The output parameter of the presented models is the normalized Health Index value (N. Value). The condition of the transformers is obtained from the normalized value of HI. A comparison of the experimental HI, as the reference value, with some previous works and with the models presented in this paper is given in Table 11 for the above-mentioned 15 sample transformers.
In Table 11, Conditions 1, 2, 3, 4 and 5 refer to the Very Good, Good, Fair, Poor and Very Poor conditions of the transformers, respectively. HI 1 is the experimental HI, which is prepared by transformer experts at ITRI. In this paper, the experimental HI is considered as the reference value for the validation of the proposed models. The experimental HI values are obtained using the comments of utility experts. In addition to the results of the oil characteristic tests, utility experts have more information about the transformers. They may know the maintenance history, loading history and operating conditions, and they have the expertise to analyze the data and assess the transformer condition. HI 2 is obtained from the method based on industry standards explained in [16]. HI 3 is obtained with Duval's Triangle [42]. In Duval's Triangle, PD = partial discharges, D1 = discharges of low energy, D2 = discharges of high energy, T1 = thermal faults of temperature < 300 °C, T2 = thermal faults of temperature 300 °C < T < 700 °C, T3 = thermal faults of temperature > 700 °C, and DT = mixtures of thermal and electrical faults. HI 4 is obtained with the Fuzzy C-Means (FCM) method [43]. FCM is a clustering method in which a data point belongs to a cluster to a degree given by a membership function. HI 5 is obtained with the correlation coefficients between the health index and the input parameters [44]. HI 6, HI 7 and HI 8 are the health index values obtained from the MLR, ANN, and ANFIS models presented in this paper, respectively. The accuracy of the presented ANFIS model is demonstrated by the comparison given in Table 11. It can be seen that, for the sample transformers, there is no difference between the condition obtained by the ANFIS model and the experimental one, whereas the other methods show some deviations from the experimental conditions. However, the results of the MLR and ANN models are also acceptable in comparison with the other above-mentioned methods.

VI. CONCLUSION
In this paper, a procedure is presented for combining transformer insulation specifications and dissolved gas analysis data into a single numerical Health Index value that serves as a comparative measure of the overall status of the transformer. The HI is calculated for 336 experimental field datasets of transformers with different voltage levels and power ranges in different weather and operating conditions. Also, an inclusive DGAF parameter, which weights seven dissolved gases according to their importance, provides a relative indication of the transformer DGA condition. Two parameters of transformer insulation, IFT and %WaterPaper, are also included in the models as two significant oil characteristics.
In this paper, the linear MLR model and the nonlinear ANN and ANFIS models are proposed for predicting the transformer HI value. The models are trained with 268 datasets, and their proficiency is then verified with the other 67 testing datasets. The results demonstrate that the most accurate and robust model is the ANFIS model.
Although both the linear and nonlinear models presented here provide good results, the ANFIS model is somewhat superior.
In 80% of cases, the health condition predicted by ANFIS exactly matches the ITRI experimental health condition assessment. In the remaining cases, the predictions are placed at the border of two adjoining condition zones. Its performance therefore appears reliable for such a diverse dataset. The presented procedure assists the operator in distinguishing between degradation that requires maintenance and diagnosis plans and degradation that indicates end of life, defined by DP=200, and thus in directing asset management decisions.