Moisture Prediction of Transformer Oil-Immersed Polymer Insulation by Applying a Support Vector Machine Combined with a Genetic Algorithm

The support vector machine (SVM) combined with the genetic algorithm (GA) has been used for the fault diagnosis of transformers owing to its high accuracy. In addition to fault diagnosis, the condition assessment of transformer oil-immersed insulation is also of crucial engineering significance. However, approaches for applying GA-SVM to the moisture prediction of oil-immersed insulation have rarely been reported. In view of this issue, this paper pioneers the application of GA-SVM and frequency domain spectroscopy (FDS) to the moisture prediction of transformer oil-immersed insulation. In the present work, a method of constructing a GA-SVM multi-classifier for moisture diagnosis based on a fitting analysis model is first reported. Then, the feasibility and reliability of the reported method are verified by laboratory and field experiments. The experimental results indicate that the reported prediction model might serve as a potential tool for the moisture prediction of transformer oil-immersed polymer insulation.


Introduction
Power transformers perform the vital task of transforming electrical energy in power systems, and the condition evaluation of their internal oil/paper insulation system has attracted wide attention [1][2][3][4][5][6]. In recent years, a prevailing method of moisture determination has been the frequency domain spectroscopy (FDS) technique [7][8][9][10], also called the dielectric frequency response (DFR) technique. The condition assessment of transformer oil-immersed polymer insulation based on FDS can be realized in the following steps. First, feature parameters that reflect the condition of the oil-immersed insulation are extracted from the measured FDS data. Then, the quantitative relationship between these feature parameters and the insulating state is established. Finally, the insulating state of the oil-immersed insulation is evaluated by means of this quantitative relationship. Existing studies [11,12] indicate that the insulation condition of power transformers can be affected by various factors, such as moisture, temperature, oxygen, and acids.
Moreover, for every 0.5% increase in moisture, the life of oil/paper insulation is shortened greatly [13], and increasing moisture reduces the breakdown voltage, which leads to equipment accidents [14]. Therefore, moisture diagnosis is of great significance to the transformer operating condition, and it is treated as the theme of this work. According to existing investigations, the approaches for predicting the condition of oil-immersed insulation based on dielectric response spectroscopy can be divided into two types. One is based on extracting the feature parameters
The cellulosic pressboards and insulating oil are first dried in a vacuum tank at 105 °C/50 Pa for 48 h. Then, they are placed into the vacuum tank at 60 °C/50 Pa for 48 h to complete the vacuum impregnation. In that way, the oil-immersed pressboards are prepared. Afterward, the prepared oil-immersed pressboards are divided into five groups and placed into five different aging cans for an accelerated aging experiment at 150 °C for 0, 1, 3, 7, and 14 days, respectively. Finally, the oil-immersed pressboards with initial mc% (a%) are placed on a precision electronic balance and their mass (m) is recorded; natural moisture absorption is then performed to obtain the expected moisture by controlling the mass. When the measured value of the balance reaches m × (1 + b%)/(1 + a%), the moisture content of the pressboard is regarded as b%. In this work, the expected moisture contents are 1%, 2%, 3%, and 4%. In this way, oil-immersed pressboards with various aging conditions and moisture contents are prepared.
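The stopping criterion for natural moisture absorption can be sketched in a few lines. This is only an illustration of the mass relation m × (1 + b%)/(1 + a%) quoted above; the function name and the sample mass and initial moisture are hypothetical and not taken from the experiments.

```python
def target_mass(m_initial: float, a_percent: float, b_percent: float) -> float:
    """Balance reading at which the moisture content is regarded as b%.

    m_initial is the recorded mass m of the pressboard at its initial
    moisture content a%; the relation is m * (1 + b%) / (1 + a%).
    """
    return m_initial * (1 + b_percent / 100) / (1 + a_percent / 100)


# Hypothetical sample: 100 g pressboard with 0.5% initial moisture.
for b in (1, 2, 3, 4):
    print(f"target {b}% -> stop absorption at {target_mass(100.0, 0.5, b):.3f} g")
```

Absorption would be stopped for each sample once the balance reading reaches the computed value for its target moisture.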
The prepared oil-immersed pressboards are put into a three-electrode device, and the frequency response test is carried out with a dielectric response analyzer, i.e., the DIRANA tester. The three-electrode device is filled with dried and degassed insulating oil and the pressboard is immersed in the insulating oil to simulate the off-line measurement status (vacuum and oil immersion) of the transformer main insulation.
Afterward, the moisture is tested by the Metrohm Coulomb Karl Moisture Tester (as shown in Figure 1) according to the International Electrotechnical Commission standard IEC 60814, and the degree of polymerization (DP) is tested by the Automatic Viscosity Tester (as shown in Figure 1) according to IEC 60450-2007. The experiment schedule is shown in Figure 1. In addition, according to the preset moisture values, the moisture content of the oil-immersed pressboard is divided into eight different levels. The classification of moisture content is shown in Table 2.
Polymers 2020, 12, x FOR PEER REVIEW

Analysis of FDS Curves
The dielectric response experiment is executed at 45 °C, with a test voltage of 200 V AC and a test frequency range of 2 × 10⁻⁴ Hz to 5000 Hz. The resulting tanδ curves of oil-immersed pressboards with various moisture contents and aging degrees can be seen in Figure 2.
Previous studies revealed that the aging effect alters the response curves in the low-frequency region, while the curves in the high-frequency region are almost unaffected. By contrast, the moisture effect changes the response curve over the entire frequency range. Therefore, feature parameters extracted from the response curve in the high-frequency region are regarded as an available tool for analyzing the moisture effect [26,27]. The integral value of the tanδ curves can thus be chosen as the feature parameter (fingerprint) for moisture diagnosis.
Therefore, in order to collect more moisture information from the tanδ curves while avoiding the impact of aging, the integral values in three characteristic ranges of the middle-high frequency section of the tanδ curves are selected as the dielectric fingerprints (D1-D3) to predict the moisture inside the oil-immersed insulation, as shown in Equation (1):

where Di (i = 1, 2, 3) represents the dielectric fingerprints, and the multiplied coefficient keeps the Di values on a similar data scale [24]. In addition, the contributions of the moisture effect and the aging effect to the tanδ curves are too similar to separate. Moreover, different combinations of moisture and aging may lead to tanδ curves of similar shape [11,24,26]. In this case, the evaluation results of both aging and moisture are unreliable.
In view of this issue, the DC conductivity of the insulating oil (σoil) is introduced as an auxiliary fingerprint (D4). If the value of D4 is relatively large, the shape of the tanδ curve is dominated by the moisture effect, because moisture is the leading factor of σoil. Conversely, the shape of the tanδ curve is greatly affected by the aging effect when the value of D4 is relatively small. Moreover, for the oil-immersed samples with the same aging status (DP value) shown in Table 3, σoil changes greatly (633, 2200, 1359, 1375, and 1667 times, respectively) as the moisture increases from 1% to 4%. In summary, when the measured FDS curves have a similar shape, σoil can be used to distinguish the moisture effect, and this property is especially helpful for moisture diagnosis. Afterwards, the integral values in Figure 2 are extracted based on Equation (1) and the DC conductivity is measured by the DIRANA tester. In this way, the fingerprints D1-D4 of the prepared oil-immersed pressboards are acquired, as shown in Table 3.
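The idea of an integral fingerprint can be illustrated with a short sketch. Equation (1) is not reproduced here, so the integration limits, the scale factor, the choice of integrating over linear frequency, and the sample curve below are all placeholders chosen for illustration, not the values used in this work.

```python
def integral_fingerprint(freqs, tan_delta, f_lo, f_hi, scale=1.0):
    """Trapezoidal integral of tan(delta) over [f_lo, f_hi], times a scale factor.

    Only measured points inside the window are used, so the window limits
    are assumed to coincide with (or enclose) measured frequencies.
    """
    points = sorted((f, t) for f, t in zip(freqs, tan_delta) if f_lo <= f <= f_hi)
    area = 0.0
    for (f0, t0), (f1, t1) in zip(points, points[1:]):
        area += 0.5 * (t0 + t1) * (f1 - f0)
    return scale * area


# Hypothetical tan(delta) curve sampled at a few frequencies (Hz).
freqs = [1.0, 10.0, 100.0, 1000.0]
tan_delta = [0.050, 0.020, 0.010, 0.008]

# Hypothetical window and scale factor; the real ranges come from Equation (1).
print(integral_fingerprint(freqs, tan_delta, 10.0, 1000.0, scale=0.01))
```

Three such windows would give D1-D3, with the scale factors chosen so that the fingerprints have comparable magnitudes.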

Construction of Fitting Analysis Model
In part A, 20 groups of dielectric fingerprints were extracted to reflect the moisture level of oil-immersed insulation. However, if only 20 groups of dielectric fingerprints are used to train the GA-SVM multi-classifier, the model will underfit, which restricts its learning ability. Adequate dielectric fingerprints could in theory be collected by preparing a great number of oil-immersed pressboards with various mc% and DP. However, this would be extremely laborious in practice because of the preparation duration, materials, and accuracy involved. Considering this situation, a fitting analysis model is proposed in this work to obtain sufficient fitting fingerprints. The fitting analysis model is then used to calculate the fitting fingerprints (F1-F4) that make up the training set. The fitting analysis model is constructed in the following steps:
I. The fitting fingerprint Fi is defined as the dependent variable Zi, i = 1, 2, 3, 4, which represents the integral value of the tanδ curves. Moreover, the values of DP and mc% are defined as the independent variables X and Y, respectively, since both have an obvious impact on the shape of the tanδ curves.
II. The types of fitting functions are determined so as to build the model with a higher goodness of fit. Two types of functions (Power 2D and Rational Taylor) are selected to construct the fitting analysis model in this work. Moreover, because the values of Fi cannot be negative, an absolute-value operation is added to the fitting functions.
III. The 20 groups of original data shown in Table 3 are substituted into the chosen functions to construct the fitting analysis model.
IV. The parameters of the model are adjusted. The parameter values largely determine the goodness of the model. After experiments and adjustments, the parameters are fixed and the fitting analysis model is established. Finally, the goodness of fit of Fi (i = 1-4) reaches 0.988, 0.983, 0.999, and 0.989, respectively. The surfaces of the fitting analysis model are shown in Figure 3, and its parameters and formulas are given in Table 4.
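As a rough illustration of fitting a fingerprint surface over (DP, mc%), the sketch below fits a simple power-law surface z = a·x^b·y^c by linear least squares in log space. This form is an assumption standing in for the functions of Table 4, which are not reproduced here; it is not the Power 2D or Rational Taylor model of this work, and the data are synthetic.

```python
import math


def solve3(A, v):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(3):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [mr - factor * pv for mr, pv in zip(M[r], M[col])]
    return [M[i][3] / M[i][i] for i in range(3)]


def fit_power_surface(xs, ys, zs):
    """Least-squares fit of ln z = ln a + b ln x + c ln y; returns (a, b, c)."""
    rows = [(1.0, math.log(x), math.log(y)) for x, y in zip(xs, ys)]
    targets = [math.log(z) for z in zs]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    v = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
    ln_a, b, c = solve3(A, v)
    return math.exp(ln_a), b, c


def surface(x, y, a, b, c):
    """Evaluate the fitted surface; abs() keeps the fingerprint non-negative."""
    return abs(a * x ** b * y ** c)


# Synthetic data on a small (DP, mc%) grid, generated from known parameters.
grid = [(dp, mc) for dp in (300, 600, 900, 1200) for mc in (1.0, 2.0, 3.0, 4.0)]
xs = [dp for dp, _ in grid]
ys = [mc for _, mc in grid]
zs = [2.0 * dp ** -0.5 * mc ** 1.5 for dp, mc in grid]

a, b, c = fit_power_surface(xs, ys, zs)
print(a, b, c, surface(500, 2.5, a, b, c))
```

A log-space fit is used only because it keeps the sketch dependency-free; a nonlinear least-squares fit in the original space would weight the residuals differently.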

Construction of the GA-SVM Moisture Content Prediction Model
The introduction of SVM and GA is given in this chapter. Combining the training samples provided by the reported fitting analysis model, the GA-SVM model utilized for moisture prediction is later constructed.

The Introduction of SVM
The support vector machine is developed on the basis of statistical learning theory [28]. It uses the principle of structural risk minimization to increase the generalization capability of the classification model [29]. Sample points that are linearly inseparable in the low-dimensional space are mapped into a high-dimensional space with the help of a kernel function, where they become linearly separable. Then, the optimal hyperplane is established to classify the samples. An ordinary hyperplane can be expressed by Equation (3).
If the sample points (xi, yi) meet the condition expressed in Equation (4), the classification result is correct.
If the classification margin between the two types of sample data reaches its maximum, the hyperplane in Equation (3) becomes the optimal hyperplane, which can be expressed by Equation (5).
where ξi is called the slack variable, whose introduction allows for the existence of misclassified samples. C is a penalty factor, and its value reflects the importance attached to misclassified samples. In addition, φ(·) maps the sample points from the original space to the feature space. Solving the quadratic optimization problem in (5) with the Lagrange optimization method yields Equation (6).
where αi and βi are called the Lagrange multipliers. If Equation (6) satisfies the optimality conditions of the Lagrangian shown in Equation (7), Equation (5) can be converted to the dual form shown in Equation (8).
Then, the kernel function is introduced to simplify the calculations. The Gaussian radial basis function is selected as the kernel function in this paper, and it can be expressed in the form of Equation (9).
where γ is the key parameter of the Gaussian radial basis kernel function. The decision function with the kernel function substituted in is shown in Equation (10).
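For readers who prefer code to formulas, the sketch below shows the textbook Gaussian radial basis kernel, K(x, z) = exp(-γ‖x - z‖²), and the textbook binary decision rule sign(Σ αi yi K(xi, x) + b). These are the standard forms and are assumed, not confirmed, to match Equations (9) and (10); the support vectors, multipliers, and bias are hand-picked for illustration rather than obtained from training, and the combination of binary classifiers into a multi-classifier is not shown.

```python
import math


def rbf_kernel(x, z, gamma):
    """Gaussian radial basis kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)


def decision(x, support_vectors, labels, alphas, bias, gamma):
    """Binary SVM decision: sign of sum_i alpha_i * y_i * K(x_i, x) + b."""
    score = bias + sum(
        a * y * rbf_kernel(sv, x, gamma)
        for sv, y, a in zip(support_vectors, labels, alphas)
    )
    return 1 if score >= 0 else -1


# Hand-picked (not trained) support vectors, labels, multipliers, and bias.
svs = [[0.0, 0.0], [2.0, 2.0]]
ys = [1, -1]
alphas = [1.0, 1.0]

print(decision([0.1, 0.1], svs, ys, alphas, bias=0.0, gamma=0.5))
print(decision([1.9, 1.9], svs, ys, alphas, bias=0.0, gamma=0.5))
```

A larger γ makes each support vector's influence more local, which is why γ (together with C) is the parameter worth tuning.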

The Introduction of GA
The parameters C and g (g = γ) reported in part A can affect the performance of the SVM model, and the genetic algorithm can optimize the performance of SVM by adjusting the values of C and g, where the optimizing process is shown in Figure 4.
The parameter optimization by the genetic algorithm can be realized in the following steps:
I. The value ranges of C and g are first defined, and the parameters are then binary-coded.
II. An initial population containing M individuals is generated, where M is the number of individuals in the initial population.
III. The fitness of the individuals in the population is calculated by Equation (11):
$$\text{Fitness} = \frac{N_T}{N_T + N_F} \qquad (11)$$
where N_T and N_F represent the number of correctly classified and incorrectly classified samples, respectively.
IV. The individuals are ranked by fitness, and the termination condition is checked; if it is met, the iteration ends; if not, go to step V.
V. The progeny is generated by the genetic operators. Then, go to step III.
The genetic operators in the above process consist of selection, crossover, and mutation. Selection is realized by roulette wheel selection. In this method, individuals with higher fitness occupy a larger area of the roulette wheel, so they are more likely to be selected. Crossover and mutation are, respectively, the main and secondary ways of producing offspring that differ from their parents, which enhances the diversity of the population.
In the loop of Figure 4, the genetic algorithm continuously adjusts the composition of the parameter population based on fitness and removes the individuals with lower fitness. As the number of iterations increases, the new population contains an increasing number of high-quality individuals. Finally, the parameter optimization is completed and the optimized parameters are output.
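The loop of binary coding, fitness evaluation, roulette wheel selection, crossover, and mutation can be sketched as follows. This is a minimal illustration, not the program used in this work: the bit resolution, crossover and mutation rates, parameter ranges, and the elitism step are assumptions, and a toy fitness function stands in for the cross-validation accuracy of the SVM.

```python
import random

BITS = 16  # bits per parameter (hypothetical resolution)


def decode(bits, lo, hi):
    """Map a list of 0/1 bits to a real value in [lo, hi]."""
    as_int = int("".join(map(str, bits)), 2)
    return lo + (hi - lo) * as_int / (2 ** len(bits) - 1)


def roulette(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    pick = rng.uniform(0, sum(fitnesses))
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running >= pick:
            return individual
    return population[-1]


def run_ga(fitness_fn, c_range, g_range, pop_size=20, generations=50,
           p_cross=0.8, p_mut=0.02, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(2 * BITS)] for _ in range(pop_size)]
    history = []  # best (fitness, (C, g)) of each generation
    for _ in range(generations):
        params = [(decode(ind[:BITS], *c_range), decode(ind[BITS:], *g_range))
                  for ind in pop]
        fits = [fitness_fn(c, g) for c, g in params]
        best = max(range(pop_size), key=fits.__getitem__)
        history.append((fits[best], params[best]))
        children = [pop[best][:]]  # elitism (an added assumption): keep the best
        while len(children) < pop_size:
            p1, p2 = roulette(pop, fits, rng), roulette(pop, fits, rng)
            child = p1[:]
            if rng.random() < p_cross:  # single-point crossover
                cut = rng.randint(1, 2 * BITS - 1)
                child = p1[:cut] + p2[cut:]
            child = [b ^ 1 if rng.random() < p_mut else b for b in child]  # mutation
            children.append(child)
        pop = children
    return max(history, key=lambda h: h[0]), history


def toy_fitness(c, g):
    """Stand-in for N_T / (N_T + N_F); peaks at the hypothetical point C=3, g=0.5."""
    return 1.0 / (1.0 + (c - 3.0) ** 2 + (g - 0.5) ** 2)


(best_fit, (best_c, best_g)), history = run_ga(toy_fitness, (0.1, 10.0), (0.01, 5.0))
print(best_fit, best_c, best_g)
```

In an actual run, `toy_fitness` would be replaced by a function that trains the SVM with the decoded C and g and returns its cross-validation accuracy.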

Construction of Sample Set
The training set is obtained from the reported fitting analysis model shown in Figure 3. The sample data are obtained by taking points at equal steps from the surfaces (a)-(d) presented in Figure 3. It can be learned from Table 2 that the value range of DP is from 279 to 1172 and the value range of mc% is from 0.91 to 4.47.
Therefore, this study presets the available value ranges of mc% (1%-4.5%) and DP (280-1170). The step sizes of DP and mc% are preset to 10 and 0.5%, respectively, which gives a total of 720 sample points. These sample points are then mapped onto the surfaces in Figure 3, and 720 sets of fitting fingerprints (Fi) corresponding to different mc% and DP values are obtained. Lastly, the 720 sets of fitting fingerprints are used to construct the training set of the GA-SVM model.
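The sample count follows directly from the two ranges and step sizes, as the short sketch below confirms. The variable names are illustrative, and the assignment of the eight mc% values to level labels M1-M8 is an assumption, since the level boundaries of Table 2 are not reproduced here.

```python
# DP from 280 to 1170 in steps of 10; mc% from 1.0 to 4.5 in steps of 0.5.
dp_values = list(range(280, 1171, 10))
mc_values = [1.0 + 0.5 * k for k in range(8)]

# Every (DP, mc%) pair is one sample point on the fitted surfaces.
grid = [(dp, mc) for dp in dp_values for mc in mc_values]

# ASSUMPTION: each preset mc% value corresponds to one moisture level M1-M8.
labels = {mc: f"M{k + 1}" for k, mc in enumerate(mc_values)}

print(len(dp_values), len(mc_values), len(grid))
```

The 90 DP values and 8 mc% values give 90 × 8 = 720 sample points, each of which is then evaluated on the four fitted surfaces to obtain one set of fitting fingerprints.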

Model Building and Parameter Optimization
The reported 720 sets of fitting fingerprints are employed to accomplish the training of the GA-SVM model and the genetic algorithm is utilized to optimize the parameters C and g.
As shown in Step I (Section 4.2), the variation ranges of the key parameters C and g are first determined, as shown in Equation (12):
Then, according to Step II, the number M of individuals in the initial population is set to M = 20, and 10-fold cross-validation is selected to obtain the cross-validation accuracy of the GA-SVM model. Moreover, considering the accuracy of cross-validation and the time cost of training, 200 is selected as the maximum number of iterations in this work. After these settings are completed, the remaining Steps (III, IV, V) are carried out using a Matlab program. The optimization process of the parameters C and g is shown in Figure 5.
The optimization terminates when the number of iterations reaches 100, because the cross-validation accuracy of the GA-SVM model is then close to 100% (well above 95%). As a result, the optimized parameters are output, with the best C = 1.688 and the best g = 284.9706.
In addition, it can be learned from Figure 5 that the best fitness of the generations is stable at 100% and the average fitness fluctuates between 99.93% and 100%. Moreover, the cross-validation accuracy of the GA-SVM model on the training set reaches 100%. In this way, the construction of the GA-SVM model is completed.
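The 10-fold cross-validation used as the fitness measure can be sketched as follows. This is a generic illustration of the fold bookkeeping only: the shuffling seed and the `train_and_score` callback (which would train the SVM on the training indices and return the fraction of held-out samples classified correctly) are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_accuracy(n_samples, k, train_and_score):
    """Average held-out accuracy over k folds.

    train_and_score(train_idx, test_idx) is a user-supplied callback that
    returns the fraction of test samples classified correctly.
    """
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return sum(scores) / k


folds = k_fold_indices(720, k=10)
print([len(fold) for fold in folds])
```

With 720 samples and 10 folds, each model is trained on 648 samples and scored on the remaining 72.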

Controlled Trial
As a controlled trial, the MATLAB R2019b software and the same 720 training sets are utilized to train the SVM model (parameters are not optimized by the genetic algorithm). The 10-fold cross-validation and Gaussian radial basis functions are also applied during the training. The classification result of the SVM model on the training sets is shown in Figure 6.
The classification results indicate that the CV accuracy of the SVM model is 98.9%, and that 9% of the training samples that should be classified into M7 are misclassified into M6. The controlled trial therefore shows that applying the genetic algorithm can improve the classification accuracy of the SVM model.

Feasibility Verification of the GA-SVM Model for Moisture Prediction
After accomplishing the construction of the GA-SVM moisture diagnosis model, the verification experiments are performed to prove the feasibility and accuracy of the moisture diagnosis model in lab and field conditions.

Verification under Laboratory Conditions
Two types of cellulosic pressboards (as shown in Table 5) and Karamay No.25 naphthenic mineral oil are utilized to prepare the testing oil-immersed pressboards. The purpose of this experimental design is to discuss whether the reported GA-SVM model could be applied to the cellulose materials with different thicknesses. Then, four groups of testing samples with different mc% and DP are prepared and the FDS curves are tested according to the steps shown in Figure 1. The insulating information of the testing lab samples is tested by Karl Fischer titrator and viscosity tester, which is shown in Table 6. Moreover, the tanδ curves of the lab samples can be seen in Figure 7.

For the feasibility verification of the reported GA-SVM model, the dielectric fingerprints of the lab samples are extracted by Equation (1). Combined with the tested σoil, the dielectric fingerprints of the lab samples are collected, as shown in Table 7. The four groups of dielectric fingerprints in Table 7 are input to the reported GA-SVM model, and the moisture diagnosis results are obtained, as shown in Figure 8, where the percentage error (P.E) in Table 8 is computed by Equation (13).
As shown in Table 8, the measured mc% of the lab samples is 2.04%, 2.97%, 1.28%, and 3.18%, and the samples are classified as M3, M5, M2, and M5, respectively. As a result, the reported GA-SVM moisture prediction model achieves the correct classification of the lab samples, and the percentage errors are 1.96%, 1.01%, 17.19%, and 5.66%.

GA-SVM Error Analysis
From the above experiments, the calculated average percentage error of lab testing samples is 6.46%, so the reported GA-SVM model realized a relatively reliable moisture prediction for the lab samples.
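The reported percentage errors can be checked with a few lines of arithmetic. Equation (13) is not reproduced here, so the sketch assumes the usual relative-error form |predicted - measured| / measured × 100%; the predicted mc% values assigned to levels M3, M5, M2, and M5 (2.0%, 3.0%, 1.5%, and 3.0%) are likewise an assumption, chosen because they are consistent with the reported errors.

```python
measured = [2.04, 2.97, 1.28, 3.18]  # measured mc% of the lab samples
predicted = [2.0, 3.0, 1.5, 3.0]     # ASSUMED mc% for levels M3, M5, M2, M5


def percentage_error(pred, meas):
    """Relative error of the prediction with respect to the measurement, in %."""
    return abs(pred - meas) / meas * 100


errors = [percentage_error(p, m) for p, m in zip(predicted, measured)]
mean_error = sum(errors) / len(errors)

print([round(e, 2) for e in errors], mean_error)
```

Under these assumptions the individual errors agree with the reported 1.96%, 1.01%, 17.19%, and 5.66%, and their mean agrees with the reported 6.46% to within the rounding of the individual values.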

Verification under Field Conditions
Three-winding transformers with different loading histories that are under maintenance are selected in this part to verify the field application of GA-SVM. Their details are shown in Table 9. The DIRANA is used for the FDS test of the main insulation system between the high- and medium-voltage windings, as well as for the internal moisture. The scheme of the assessment activities is given in Figure 8, where the original measured values are processed by the XY model and temperature correction to give the dielectric fingerprints D1-D4.
The field verification can be achieved by the following steps:
I. The complex capacitance Ctot*(ω) of the above field transformers is tested. The complex dielectric constant εtot*(ω) is calculated by the formula εtot*(ω) = Ctot*(ω)/C0.
II. In this work, the structure parameters X and Y of the tested transformers are obtained by the commercial dielectric response analyzer, i.e., DIRANA/OMICRON. The DIRANA finds the preset curve that is most similar to the measured curve by a matching technique. Then, the structure parameters (X and Y) of the preset curve are approximately regarded as the parameters of the tested sample. The analysis results for X and Y are shown in Table 11.
III. The XY model is used to calculate the εPB*(ω) of the transformer oil-immersed pressboards, as shown in Equation (14).
IV. The influence of the different test temperatures is corrected. The FDS data of the field transformers are measured at 30 °C, 29 °C, and 30 °C, respectively, while the data of the reported GA-SVM model are measured at 45 °C. Therefore, the shift factor αT shown in Equation (15) is applied to realize the temperature correction of the FDS data at different test temperatures.
where Ea represents the activation energy (Ea = 113 kJ/mol), R represents the gas constant (R = 8.314 J/(mol·K)), Tref represents the reference temperature (Tref = 318.15 K), and T represents the test temperature (T1 = 303.15 K; T2 = 302.15 K; T3 = 303.15 K). Then, the shift factors αT1, αT2, and αT3 are calculated (αT1 = 8.28; αT2 = 9.60; αT3 = 8.28), and the temperature correction of the measured FDS data is realized. The corrected tanδ curves are shown in Figure 9, where the scattered points are the uncorrected data and the curves are corrected by the shift factor. The dielectric fingerprints D1-D3 can be extracted from these corrected curves by using Equation (1). In addition, it has been shown in studies [24,30] that the temperature correction of σoil can be achieved by multiplying by the shift factor αT, so that D4 = σoil × αT. The dielectric fingerprints obtained in this way are shown in Table 12.
V. The dielectric fingerprints D1-D4 are input to the reported GA-SVM model. The mc% prediction of the field transformers is thus realized, and the results are shown in Table 10. The predicted mc% in Table 10 is from GA-SVM and the measured mc% is from DIRANA. It can be seen that the same moisture level is given for Field 1 and Field 3. As for Field 2, the number of operating years is 8. Considering that the normal operating life of a transformer is 30 years, the Field 2 transformer is still in its early-life stage. Therefore, the predicted results of DIRANA and GA-SVM are both within a reasonable range.
In summary, the feasibility and accuracy of the reported GA-SVM model have been preliminarily verified based on the laboratory and field data shown in Tables 8 and 10, respectively.
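The temperature correction in step IV can be checked numerically. Equation (15) is not reproduced here; the sketch assumes the Arrhenius-type form αT = exp[(Ea/R)(1/T - 1/Tref)], which is inferred from the fact that it yields the reported shift factors for the stated Ea, R, Tref, and test temperatures. The oil conductivity used in the last line is a hypothetical value.

```python
import math

E_A = 113_000.0  # activation energy, J/mol (113 kJ/mol)
R = 8.314        # gas constant, J/(mol*K)
T_REF = 318.15   # reference temperature, K (45 degC)


def shift_factor(t_kelvin):
    """Arrhenius-type shift factor relative to T_REF (form inferred, not quoted)."""
    return math.exp(E_A / R * (1.0 / t_kelvin - 1.0 / T_REF))


def corrected_d4(sigma_oil, t_kelvin):
    """D4 = sigma_oil * alpha_T, the temperature-corrected oil conductivity."""
    return sigma_oil * shift_factor(t_kelvin)


for t in (303.15, 302.15, 303.15):
    print(f"T = {t} K -> alpha_T = {shift_factor(t):.2f}")

# Hypothetical oil conductivity in S/m, for illustration only.
print(corrected_d4(1.0e-12, 303.15))
```

Under this assumed form, the computed factors agree with the reported αT1 = αT3 = 8.28 and αT2 = 9.60 to two decimal places.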

Conclusions
In contrast to traditional moisture prediction methods, this paper proposes a novel idea that constructs the quantitative relationship between FDS feature parameters and the moisture. The fitting analysis model is employed to expand the number of training samples so that the GA-SVM can be trained and then applied to the moisture prediction of transformer oil-immersed polymer insulation. The analysis and findings can be summarized as follows:

I. A group of dielectric fingerprints D1-D4, sensitive to the moisture content inside the transformer oil-immersed insulation, can be obtained by the presented analysis.
II. It is difficult to train the SVM multi-classifier with a small number of samples, so a fitting analysis model is proposed in this paper to expand the number of training samples. Consequently, the heavy work of preparing a great number of oil-immersed pressboards is avoided and a training set with adequate fitting fingerprints is obtained easily.
III. In the course of constructing the moisture prediction model, the genetic algorithm is used to optimize the key parameters of the SVM. It has been shown that the GA-SVM performs better than the SVM in classifying the training set.
IV. The feasibility and accuracy of the moisture prediction model based on the GA-SVM are verified by the laboratory and field experiments. The reported GA-SVM achieves the correct classification of the moisture levels of the lab samples, and the average relative error for the lab samples is 6.46%. As for the field transformers, considering the measured results of DIRANA and the operating years, the predicted values of GA-SVM are within a reasonable range.