Partial Least Squares (PLS) Integrated Fourier Transform Infrared (FTIR) Approach for Prediction of Moisture in Transformer Oil and Lubricating Oil

Fourier transform infrared (FTIR) spectroscopy has been advocated as a promising alternative to the Karl Fischer titration method for quantification of moisture in oil. This study integrates partial least squares regression (PLSR) with FTIR spectra for prediction of moisture in locally accessible transformer oil and lubricating oil. The oil samples, spiked with known moisture concentrations, were extracted with acetonitrile and analysed with an FTIR spectrophotometer. The PLSR model was built based on 100 training/test splits, and the prediction performance was measured with the percentage root mean square error (% RMSE). The concentration range studied was 0 to 5000 ppm. The marker regions of moisture were found at 3750–3400 and 1700–1600 cm⁻¹, with the latter demonstrating better predictive ability in both lubricating oil and transformer oil. The prediction of moisture in lubricating oil was characterized by a lower % RMSE. At concentrations below 700 ppm, the prediction accuracy deteriorates, suggesting poor sensitivity. The PLSR model was applied to IR spectra of a set of blind samples, verified with the Karl Fischer method (for transformer oil) and the Kittiwake method (for lubricating oil). The prediction was encouraging at concentrations above 1000 ppm; at lower concentrations, it was characterized by high percent error. The algorithm, validated with 100 training/test splits, was converted into an executable program for prediction of moisture based on FTIR spectra. This program can be used for prediction of other substances provided that the marker region is identified. FTIR can be used for prediction of moisture in oil; nevertheless, the sensitivity and precision are low for samples with low moisture concentration.


Introduction
Moisture analysis is a routine monitoring activity for utility companies. The presence of moisture in transformer oil and lubricating oil can lead to breakdown of transformers and machinery. In transformer oil, moisture reduces the dielectric strength, whilst in lubricating oil, it affects the oil viscosity and causes corrosion of the machinery. Conventionally, the moisture in oil is determined using the Karl Fischer titration method.
The method demonstrates sensitivity as low as 10 ppm; however, it is expensive, involves various solvents, and is time consuming [1,2].
Fourier transform infrared (FTIR) spectroscopy, integrated with partial least squares regression (PLSR), has been advocated as a promising alternative for quantification of moisture in oil. This technique has been employed for lubricating oil [3][4][5][6][7], transformer oil [8,9], fuel oil [10], turbine oil [11], and biodiesel [12] with promising sensitivity as low as 50 ppm [6]. The advantage of FTIR is that it allows rapid analysis with minimal sample preparation and is inexpensive. It also permits on-site monitoring of utility oil; nevertheless, the quantification is oil-specific, requiring individualised calibration.
In this study, we employ the PLSR approach on FTIR spectra for prediction of moisture in locally accessible transformer oil and lubricating oil. The model was validated based on exhaustive training/test splits and was applied to a set of blind samples, verified with the Karl Fischer and Kittiwake methods.
The prediction algorithm was converted into an executable program that runs on Windows. The exhaustive training/test split strategy is new in this area of application; it tests and verifies the prediction while avoiding overfitting of a model to one single dataset.

Sample Preparation.
A sample of transformer oil (Hydrax Hypertrans HR) and a sample of lubricating oil (Shell Turbo T68), provided by Sarawak Energy Berhad, were used to develop the model. The calibration model was established based on oil samples spiked with known concentrations of moisture and extracted with acetonitrile. Prior to sample preparation, the oil was left in dried molecular sieve with a pore size of 4 Å for three days to remove the moisture present. The Karl Fischer and Kittiwake methods suggest that the saturation level of moisture was <10 ppm for transformer oil and <500 ppm for lubricating oil. Note that the molecular sieve was dried in a furnace at 325 °C for 24 hours before use.

Sample Analysis.
Ten millilitres of each treated oil sample were transferred to centrifuge tubes and spiked with distilled water to attain moisture concentrations at varying levels (0, 500, 700, 1000, 2000, 3000, 4000, and 5000 ppm).
The concentrations cover a wide range, which helps in examining the sensitivity of the method; the standards were prepared by gravimetric addition of water to the oils. Each was vortexed and extracted with 10 mL of dried acetonitrile for 1 min. The samples were then centrifuged for 10 min at 7500 rpm to allow separation.
The solvent layer was transferred to septum-capped vials for analysis with a Fourier transform infrared (FTIR) spectrophotometer equipped with an ATR accessory (Agilent 4500 Series FTIR). All spectra were obtained at a resolution of 4 cm⁻¹ with 64 scans.
Each oil (lubricating oil and transformer oil), spiked with moisture and extracted with acetonitrile, was scanned between 4700 and 590 cm⁻¹ in five replicates, yielding a total of 40 spectra per oil (8 concentration levels × 5 replicates). The spectra were saved in *.spc format.
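The gravimetric spiking and the spectra count can be illustrated with a short calculation. This is only a sketch: it assumes that ppm means mass of water per mass of oil (w/w), neglects the water's own contribution to the total mass, and uses a hypothetical oil density of 0.87 g/mL, none of which is stated in the text.

```python
# Hypothetical illustration of the spiking arithmetic.
# Assumptions (not from the paper): ppm is w/w, oil density is 0.87 g/mL.

LEVELS_PPM = [0, 500, 700, 1000, 2000, 3000, 4000, 5000]  # levels listed in the text
REPLICATES = 5                                             # replicate scans per level


def water_mass_mg(target_ppm, oil_volume_ml=10.0, oil_density_g_per_ml=0.87):
    """Mass of water (mg) that gives target_ppm (w/w) in the oil aliquot."""
    oil_mass_g = oil_volume_ml * oil_density_g_per_ml
    # 1 ppm (w/w) corresponds to 1 microgram of water per gram of oil.
    return target_ppm * oil_mass_g / 1000.0


def spectra_per_oil(levels=LEVELS_PPM, replicates=REPLICATES):
    """Number of spectra collected for one oil."""
    return len(levels) * replicates
```

Under these assumptions, a 10 mL aliquot needs only a few milligrams of water at the lower levels, which hints at why small weighing errors matter most below 1000 ppm.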

Model Development.
The PLS regression model was developed in Matlab R2013a. The spectra in *.spc format were converted into *.mat format, readable in Matlab.
The IR region of interest was determined, and the spectral data were extracted as X (M × N). The corresponding concentrations form the response vector, y (M × 1). The spectral data were split into training and test sets. Two-thirds of the samples were assigned to the training set to build the prediction model, whilst the remainder served as test samples to validate the model.
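The region extraction and the two-thirds/one-third split can be sketched as below. The sketch is in Python rather than the Matlab used in the study, and the function names are hypothetical. It assumes a simple random split without stratification by concentration, which the text does not specify.

```python
import random


def extract_region(wavenumbers, spectrum, high, low):
    """Keep the absorbances whose wavenumber lies within [low, high] cm^-1."""
    return [a for w, a in zip(wavenumbers, spectrum) if low <= w <= high]


def split_train_test(n_samples, train_fraction=2 / 3, rng=None):
    """Randomly assign sample indices to a training set and a test set.

    Assumption: plain random assignment; the paper does not say whether
    the split was stratified by concentration level.
    """
    if rng is None:
        rng = random.Random()
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return sorted(indices[:n_train]), sorted(indices[n_train:])
```

With 40 spectra per oil, this gives 27 training and 13 test spectra per split; repeating the call with 100 different seeds would produce the 100 training/test sets.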
The training data, X (M × N), were standardized and the response, y (M × 1), was mean-centred before being subjected to the PLS algorithm. Note that in this study, two PLS components were used. The test set, X_test (M_test × N), was likewise standardized using the mean and standard deviation of the training samples.
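The key point of this preprocessing is that the test spectra are scaled with statistics taken from the training spectra only. A minimal sketch follows, with hypothetical function names and assuming the sample (n − 1) standard deviation, which the text does not specify.

```python
import statistics


def fit_scaler(X_train):
    """Column means and standard deviations of the training spectra.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    columns = list(zip(*X_train))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.stdev(c) for c in columns]
    return means, stds


def standardize(X, means, stds):
    """Scale any set of spectra with the training means and deviations."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


def centre(y_train):
    """Mean-centre the training response; return centred values and the mean."""
    y_mean = statistics.fmean(y_train)
    return [v - y_mean for v in y_train], y_mean
```

The returned `y_mean` has to be added back to any centred prediction to recover a concentration in ppm. A column with zero variance would cause a division by zero and would need to be dropped beforehand.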
The PLS algorithm assumes a linear relationship between the predictor, X, and the response, y. They are decomposed into models of X = T · P + E and y = T · q + f, where E and f are the noise; T is the scores matrix common to X and y; and P and q are the loadings. For brevity, the reader is referred to Brereton [13] for the PLS algorithm.
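For readers who want to see the decomposition in action, the following is a sketch of single-response PLS (PLS1) with NIPALS-style deflation. It is a generic textbook formulation written in plain Python, not the authors' Matlab code, and the function names are hypothetical. It expects X already standardized and y already mean-centred.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_components=2):
    """Fit PLS1 by successive deflation of X and y.

    Returns one (w, p, q) tuple per component: weight vector,
    X loading vector and y loading.
    """
    X = [list(row) for row in X]  # work on copies
    y = list(y)
    n_vars = len(X[0])
    model = []
    for _ in range(n_components):
        # Weight vector proportional to X'y, normalised to unit length.
        w = [dot([row[j] for row in X], y) for j in range(n_vars)]
        norm = dot(w, w) ** 0.5
        if norm == 0.0:  # nothing left to model
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in X]  # scores
        tt = dot(t, t)
        p = [dot([row[j] for row in X], t) / tt for j in range(n_vars)]
        q = dot(y, t) / tt
        # Remove the modelled part from X and y before the next component.
        X = [[row[j] - ti * p[j] for j in range(n_vars)] for row, ti in zip(X, t)]
        y = [yi - q * ti for yi, ti in zip(y, t)]
        model.append((w, p, q))
    return model


def pls1_predict(model, x):
    """Predict the centred response for one standardized spectrum."""
    x = list(x)
    y_hat = 0.0
    for w, p, q in model:
        t = dot(x, w)
        y_hat += q * t
        x = [xj - t * pj for xj, pj in zip(x, p)]
    return y_hat
```

The prediction is on the centred scale, so the training mean of y must be added back to obtain ppm.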

Model Evaluation.
The prediction model was validated based on 100 training/test sets, with performance measured by the percentage root mean square error (% RMSE). The flow chart in Figure 1 illustrates the prediction of moisture in oil with PLSR. The model was programmed in Matlab R2013a, given a graphical user interface (GUI), and converted into an executable program.
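The exact % RMSE expression is not reproduced in this text, so the sketch below rests on an assumption: the RMSE of the predictions is divided by the mean expected concentration and multiplied by 100, a common convention in chemometrics. The function names are hypothetical.

```python
import math
import statistics


def percent_rmse(expected, predicted):
    """RMSE as a percentage of the mean expected concentration.

    Assumption: normalisation by the mean of the expected values;
    the definition used in the study may differ.
    """
    n = len(expected)
    rmse = math.sqrt(sum((p - e) ** 2 for e, p in zip(expected, predicted)) / n)
    return 100.0 * rmse / statistics.fmean(expected)


def average_percent_rmse(per_split_values):
    """Average the % RMSE obtained from each training/test split."""
    return statistics.fmean(per_split_values)
```

In a validation over 100 splits, `percent_rmse` would be evaluated once per test set and the 100 values passed to `average_percent_rmse`.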
The algorithm was then applied to a set of blind samples, with the prediction compared against the measurements obtained using the Karl Fischer (for transformer oil) and Kittiwake (for lubricating oil) methods. The blind samples were an independent test set spiked with known concentrations of moisture. Analysis of variance (ANOVA) was performed on the % RMSE obtained from the different spectral regions over 100 training/test splits to determine whether there is a significant difference at the 95% confidence level.
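The ANOVA comparison can be sketched as a one-way F statistic in which each group holds the % RMSE values from one spectral region. This is a generic formulation with a hypothetical function name. It stops at the F statistic and its degrees of freedom; the p-value would come from an F distribution table or a statistics package.

```python
def one_way_anova_f(groups):
    """One-way ANOVA F statistic with its degrees of freedom.

    groups: list of lists, e.g. the % RMSE values of each spectral region.
    """
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation between the group means and the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the values around their own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

With two regions and 100 splits each, the degrees of freedom would be 1 and 198.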

Results and Discussion
Lubricating Oil.
Figure 2 shows the IR spectra of acetonitrile with moisture at varying concentrations, extracted from lubricating oil. The spectra overlap perfectly except at the regions typically assigned to the absorption of OH groups at 3750-3400 cm⁻¹ and 1700-1600 cm⁻¹. The band intensity increases with concentration, and these regions have been commonly used for prediction of moisture in oil. At the lower concentrations of 500 and 700 ppm, the peak intensities are closely similar. Under the influence of spectral variability, these concentrations may be confused, suggesting that the detection of moisture at this level can be challenging.
Both regions, 3750-3400 and 1700-1600 cm⁻¹, were evaluated to determine their predictive ability based on 100 training/test sets incorporating PLSR. The prediction performance was measured as the average % RMSE over the 100 iterations. With 100 training/test splits, the assessment does not hinge on one single partition of the dataset, which reduces the risk of a biased or overfitted estimate. Figure 3 shows the predicted versus expected concentrations using spectral data at 3750-3400 and 1700-1600 cm⁻¹. The average % RMSE of the 100 training/test sets suggests that the region at lower frequency demonstrates statistically better prediction accuracy (p < 0.05).
This observation contrasts with the finding of Ng and Mintova [4], where the moisture was extracted using DMSO; the best prediction was reported using the spectral region at 5400-4900 cm⁻¹. The region at 1800-1500 cm⁻¹ was inferred to have the lowest accuracy as a result of interference from aminic and phenolic additives and other oxidation products present in lubricating oil. On the contrary, Meng et al. [14] corroborate the finding of this study, concluding that the absorption at 1630 cm⁻¹ experiences noticeably fewer interferences than the OH stretching region at 3400 cm⁻¹. Van de Voort et al. [5], however, recommend the absorption at 3676 cm⁻¹, implying that this frequency is less affected by the interference of phenol antioxidant. Note that both Meng et al. [14] and Van de Voort et al. [5] apply the solvent extraction strategy using acetonitrile. Overall, both regions have been widely used for quantification of moisture in oil, whether directly or indirectly, as summarized in Table 1. The choice of optimal water band may differ as a result of matrix interference and the type of oil studied.
The dataset, with moisture concentrations ranging between 0 and 5000 ppm, was divided into two subsets of low (0, 500, 700, and 1000 ppm) and high (0, 1000, 2000, 3000, 4000, and 5000 ppm) concentrations to evaluate the method sensitivity. Both subsets are expected to yield comparable % RMSE if the sensitivity is not compromised. Table 2 summarizes the average % RMSE of the training/test sets for the two subsets, according to spectral data at 1700-1600 and 3750-3400 cm⁻¹. The average % RMSE is higher for the subset of low concentrations, indicative of poorer prediction.
To determine the model sensitivity, the calibration samples were subjected to self-prediction as tabulated in Table 3.
The concentration of 700 ppm was detected with reasonable accuracy; hence, the limit of detection (LOD) is recommended at 700 ppm. Nevertheless, the concentration at which the moisture can be comfortably detected, in other words the limit of quantification (LOQ), is suggested at 1000 ppm. This recommendation is in line with the literature findings, where Dong et al. [3] propose the LOD at 500 ppm whilst Holland et al. [19] postulate 1000 ppm, agreeing with Fitch [20]. In terms of precision, the prediction is subject to high variance, indicative of unstable models. According to Shang et al. [21], ATR-FTIR suffers the limitations of short path length and weak signal for quantitative analysis, rendering measurements less accurate and inconsistent. FTIR has been employed for quantification of moisture in oil with sensitivity ranging from as low as 30 ppm to 13000 ppm (Table 1); this broad variation is hypothetically attributed to the types of oil examined, sample preparation, analytical procedure, and validation strategies. In this study, acetonitrile extraction is applied, incorporating the exhaustive splitting of training and test sets to verify the prediction model.
The model was further applied to a set of blind samples, where the prediction was compared and verified with the Kittiwake moisture sensor. Table 4 shows the moisture concentrations predicted with PLSR using the spectral data at 1700-1600 cm⁻¹ and the measurements obtained using the Kittiwake method. The accuracy and precision of the results are evaluated based on the percentage error (% error) [abs(mean value − expected value)/expected value × 100%] and the percentage coefficient of variation (% CV) [standard deviation/mean value × 100%]. At concentrations lower than 800 ppm, the prediction with PLS regression is inconsistent and inaccurate; however, the prediction improves as the concentration increases.
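The two figures of merit defined above translate directly into code. The sketch below uses hypothetical function names and assumes the sample (n − 1) standard deviation for the % CV, which the text does not specify.

```python
import statistics


def percent_error(values, expected):
    """abs(mean value - expected value) / expected value x 100%."""
    return abs(statistics.fmean(values) - expected) / expected * 100.0


def percent_cv(values):
    """standard deviation / mean value x 100%.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    return statistics.stdev(values) / statistics.fmean(values) * 100.0
```

As an illustration with made-up numbers (not taken from Table 4), replicate predictions of 900, 1000, and 1100 ppm for a sample expected at 800 ppm give a % error of 25% and a % CV of 10%.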
There is a marked difference between the concentrations predicted with PLSR and those measured with Kittiwake. Note that the recommended operating range for the Kittiwake sensor is between 0 and 3000 ppm with a sensitivity of 500 ppm. At concentrations between 500 ppm and 800 ppm, the Kittiwake measurement is satisfactory with an error of 5-8%; however, as the concentration increases (>800 ppm), the method consistently records higher error than the prediction of PLSR.

Transformer Oil.
Figure 4 shows the spectra of acetonitrile containing moisture extracted from transformer oil. The marker regions were likewise identified at 3750-3400 and 1700-1600 cm⁻¹, and the spectral data in these regions were singled out for prediction according to training/test sets over 100 iterations.
Figure 5 shows the predicted versus expected concentrations of the training/test sets, where the spectral data at 1700-1600 cm⁻¹ demonstrate better predictive ability than those at 3750-3400 cm⁻¹ with statistical significance (p < 0.05). The prediction of moisture in transformer oil is seemingly less promising than that in lubricating oil. The average % RMSE for test sets of transformer oil and lubricating oil is 28.63% and 16.55%, respectively. The prediction performance may be oil dependent, governed by the formulation and additives. Essentially, oils dissolve some water, with their saturation point governed by the amount of additives. Transformer oil is a mineral oil with minimal additives; it can be saturated with 3-10 ppm of water, whilst lubricating oils saturate at higher moisture levels depending on the oil type (hydraulic fluids 100-1000 ppm; industrial lubricating oil 600-5000 ppm; automotive lubricating oil 1000-5000 ppm; stern tube oil >16%) [22].
The dataset was likewise divided into two subsets of lower and higher concentration ranges to examine the prediction performance. As with lubricating oil, the prediction for samples at lower concentrations is characterized by a greater % RMSE of 60.75% (test samples), whilst at the higher range the prediction improves, with a % RMSE of 21.40% (Table 5). The prediction similarly exhibits better accuracy at concentrations above 1000 ppm, as suggested by the self-prediction results of transformer oil (Table 6). Table 7 summarizes the prediction of moisture in blind samples, verified with the Karl Fischer method. The corresponding % error and % CV are included in the table. The prediction with PLSR suffers at concentrations <1000 ppm, with a large error of between 29 and 66%. As the moisture concentration increases, the prediction accuracy improves, with a percent error of 3-14%. For the Karl Fischer method, the measured water against the expected concentrations fluctuates inconsistently and unpredictably, with high prediction error and variance. The coulometric Karl Fischer method is ideal for detection of moisture at low levels of 10 µg to 100 mg; at increased concentration, Margolis [23] reports reduced measured water when the oil becomes insoluble in the solvent.

Journal of Spectroscopy

Table 1 (partial):
3410 and 3454 cm⁻¹ (water in oil); Higgins and Sleenbinder [11]
Lubricating oil (0.1-3.7%); 3600-3100 cm⁻¹ (water in oil); Blanco et al. [18]
Lubricating oil (100-4000 ppm); 5400-4900, 3750-3200, 1800-1500 cm⁻¹ (water in DMSO); Ng and Mintova [4]
Lubricating oil (0-2100 ppm); 3676 cm⁻¹ (water in acetonitrile); Van de Voort et al.
This study reports the quantification of moisture using the indirect strategy of acetonitrile extraction. The prediction was also attempted using the spectra of oil directly spiked with moisture; unfortunately, the prediction was erroneous and irreproducible, likely due to scattering of infrared light by inhomogeneous water globules [15]. A surfactant stabilizer was added to reduce the scattering and thereby enhance the prediction; however, no observable improvement was found (results not included).

Prediction Program for Moisture in Oil.
The algorithm was given a graphical user interface (GUI) and compiled into an executable file that runs on Windows. In this program, users provide the calibration spectra with known moisture concentrations for evaluation according to 100 training/test sets. The program can then be executed to predict the moisture in unknown samples. Figure 6 shows the interface and output of the program for prediction of moisture in oil.
The program can be used to predict other substances provided that the marker region is identified.
Essentially, water is present in oil as dissolved, emulsified, or free water. At a low concentration, water is able to disperse in oil; however, as the level increases, it becomes immiscible. The saturation level depends largely on the type of base oil, additives, temperature, and pressure.
Dissolved water may be present at a concentration of less than 2000 ppm, whilst emulsified and free water range from 150-5,000 ppm and 500-50,000 ppm, respectively [24]. In this study, the moisture examined is mostly present in the state of emulsified and free water. For concentrations lower than 1000 ppm, the method is not sufficiently sensitive and may not be useful for moisture monitoring in utility oil. Typically, the allowable moisture content in in-service transformer oil is less than 35 ppm [25].
This implies that FTIR is unsuitable for prediction of moisture in transformer oil. For lubricating oil, the desired level of moisture may vary depending on the machine's specification and operation. In some machines, a small amount of water can be destructive. Normally, the moisture in lubricating oil is maintained below 0.03%, and a concentration of 0.15-0.20% can be damaging [26].

PLSR versus Simple Linear Regression (SLR).
The peak intensity is found to increase with concentration without sign of interference; this relationship suggests that the moisture could possibly be predicted using simple linear regression (SLR). To test this, identical 100 training/test splits were subjected to PLSR and SLR using the spectral area at 1700-1600 cm⁻¹. Table 8 summarizes the average % RMSE of prediction using PLSR and SLR based on 100 training/test splits (concentration range 0-5000 ppm). Evidently, the prediction performance of PLSR and SLR is comparable, with the prediction in lubricating oil demonstrating better accuracy.
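The SLR alternative reduces each spectrum to a single number, the band area, and fits a straight line. The sketch below assumes trapezoidal integration over the 1700-1600 cm⁻¹ band without baseline correction, and assumes that concentration is regressed on area; neither detail is given in the text, and the function names are hypothetical.

```python
def band_area(wavenumbers, absorbances):
    """Trapezoidal area under a band, regardless of axis direction.

    Assumption: no baseline correction is applied.
    """
    area = 0.0
    for i in range(1, len(wavenumbers)):
        width = abs(wavenumbers[i] - wavenumbers[i - 1])
        area += 0.5 * (absorbances[i] + absorbances[i - 1]) * width
    return area


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean
```

Here `x` would hold the band areas of the training spectra and `y` their known concentrations; a test concentration is then predicted as `slope * area + intercept`.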

Conclusion
FTIR integrated with PLSR is feasible for prediction of moisture in oil, demonstrated on lubricating oil and transformer oil.
The spectral region at 1700-1600 cm⁻¹ exhibited better predictive ability, with lubricating oil revealing a lower % RMSE. The method unfortunately yielded high prediction error for samples with concentrations less than 700 ppm. Hence, for monitoring of moisture in utility oil, which is typically low in concentration, the method is lacking in sensitivity. The Karl Fischer and Kittiwake methods were intended to validate the FTIR method in detection of moisture; however, the results suggest that the three methods are not directly comparable because their optimum measuring ranges differ. The coulometric Karl Fischer method is commonly recommended for moisture levels of a few ppm to <1-2%, whilst the Kittiwake method is found to work better at less than 1,000 ppm with a detection limit of 500 ppm. For the FTIR method, on the other hand, detection of moisture is suggested at >700 ppm with the solvent extraction strategy.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Figure 2: Spectra of acetonitrile with moisture extracted from lubricating oil.

Table 1: Summary of literature studies of moisture in oil using FTIR.

Table 2: Average % RMSE for subsets of low and high moisture concentrations in lubricating oil according to spectral regions at 3750-3400 and 1700-1600 cm⁻¹.

Table 3: Self-prediction of the calibration samples (lubricating oil).

Table 4: Moisture concentrations predicted with PLSR using the spectral data at 1700-1600 cm⁻¹ and the measurements obtained using the Kittiwake method.

Table 5: Average % RMSE for subsets of low and high moisture concentrations in transformer oil according to spectral regions at 3750-3400 and 1700-1600 cm⁻¹.

Table 6: Self-prediction of the calibration samples (transformer oil).

Table 7: Prediction of moisture in blind samples of transformer oil using PLSR, verified with the Karl Fischer method.

Table 8: The average % RMSE of prediction using partial least squares regression (PLSR) and simple linear regression (SLR) based on 100 training/test sets (concentration range 0-5000 ppm).