Discrimination and Identification of Vegetable Oil Based on Voltammetric Electronic Tongue

This study applies a voltammetric electronic tongue to the discrimination and identification of vegetable oil. Specifically, it investigates the discrimination of different oils and the prediction of unknown oils. Seven oil samples of different varieties and geographical origins were measured with a voltammetric platinum electrode as the sensing part. The electrochemical response current signals of the samples, which formed the original data, were obtained by cyclic voltammetry. Principal Component Analysis (PCA) and Cluster Analysis (CA) were each used as modeling tools to discriminate the different vegetable oils. Discriminant Factorial Analysis (DFA) and RBF Neural Network (RBFNN) were used as prediction models for unknown oils. Fast Fourier Transform (FFT) and Discrete Wavelet Transform (DWT) were applied as feature extraction methods for the input data of the prediction models. Different combinations of prediction models and feature extraction methods were compared. Samples of different varieties or origins were clearly discriminated by PCA and CA. The best prediction result, an identification accuracy of 90.48%, was obtained with DWT-RBFNN. The study suggests that the electronic tongue may be a useful tool for oil quality evaluation and control.


INTRODUCTION
Oil occupies an important position in daily life and in the chemical industry. With continuous progress in edible oil production and processing technology, the range of applications of oil keeps widening. As one of the most basic nutrients, edible oil supplies the human body with a variety of fatty acids. Oil that has oxidized, turned rancid or otherwise deteriorated directly affects health. Reliable and convenient quality testing technology for edible oil is therefore significant. Sensory analysis and physical and chemical tests are the two traditional methods of oil quality testing. Sensory analysis is susceptible to subjective and external factors, which affect the objectivity and accuracy of the evaluation results. Physical and chemical tests analyze the acid value, peroxide value or other indicators of the oil in order to assess its quality and nutritional characteristics. However, they have disadvantages such as cumbersome steps, long testing times and high costs. Reliable and convenient testing technology for edible oil quality evaluation and control is therefore particularly important and needs further research.
The voltammetric electronic tongue is a modern analytical instrument that uses an array of selective, nonspecific, cross-sensitive metal electrodes or modified electrodes to sense the characteristic response signals of liquid samples and applies signal pattern recognition or multivariate statistical analysis to them, allowing qualitative and quantitative analysis of the samples. Its basic principle is to place the sensor array in the solution and collect the different polarization currents produced by the sample solution under electrochemical voltammetry. At present this technology has been applied in food, environment, medicine and other fields (Winquist et al., 1997; Oliveri et al., 2009; Ariza Avidada et al., 2013; Ivarsson et al., 2001; Han et al., 2008; Wu et al., 2011; Wei, 2011; Makhotkina and Kilmartin Paul, 2010; Isabel et al., 2012; Arunangshu et al., 2012; Kamalika et al., 2013; Jorge et al., 2010; Fredrik et al., 2011; Liu et al., 2013; Dmitry et al., 2013; Carolin et al., 2011).
Most applications of the voltammetric electronic tongue test samples that dissolve in water. The tested solutions are water based, such as fruit juice, tea, wine or other alcoholic beverages, and these media provide sufficiently good conductivity. Because vegetable oil contains redox-active substances that are relevant to sensory perception and have antioxidant properties, such as vitamin E, polyphenol compounds and carotenoids, it can in principle be analyzed by electrochemical methods. However, due to the high viscosity and low electrical conductivity of oil, it is very difficult to apply the voltammetric electronic tongue and electrochemical sensor arrays to oil directly. An appropriate hydrophobic supporting electrolyte is needed to apply voltammetric methods to oil quality testing.
In recent years, many researchers have studied the detection of edible vegetable oil with bionic electronic tongue systems. Péter (2005) successfully distinguished corn oil, sunflower oil, palm oil and oils mixed in different proportions with the French Alpha Astree electronic tongue system. Apetrei et al. (2005, 2007) used modified carbon paste electrodes as working electrodes to distinguish corn oil from other oils and to analyze the degree of bitterness of virgin olive oil. Rodriguez-Mendez et al. (2008) used twelve carbon paste electrodes as working electrodes to analyze phenolic compounds in olive oil. Oliveri et al. (2009) designed an electronic tongue system with two platinum microelectrodes and applied it directly to vegetable oil, successfully distinguishing oils of different qualities and geographical origins. Hang et al. (2013) and Men et al. (2013) also applied electronic tongue systems to edible oil detection.
This study aims to evaluate the feasibility of discriminating vegetable oils of different varieties and geographical origins, and of predicting unknown oil samples, by means of the voltammetric electronic tongue with various preprocessing strategies and modeling tools. Seven oil samples of different varieties and geographical origins were measured with a voltammetric platinum electrode as the sensing part. The electrochemical response current signals of the samples, which formed the original data, were obtained by cyclic voltammetry. Principal Component Analysis (PCA) and Cluster Analysis (CA) were each used as modeling tools to discriminate the different vegetable oils. Discriminant Factorial Analysis (DFA) and RBF Neural Network (RBFNN) were used as prediction models for unknown oils. Fast Fourier Transform (FFT) and Discrete Wavelet Transform (DWT) were applied as preprocessing methods for the input data of the prediction models.

MATERIALS AND METHODS
Test material: Seven edible vegetable oil samples of different varieties and geographical origins were used. In the discrimination experiments, the seven oil samples were arranged into three categories. The first category (Experiment No. 1) comprised three varieties of Xinzheng vegetable oil (named A1, B1 and C1). The second category (Experiment No. 2) comprised three varieties of Zhoukou vegetable oil (named A2, B2 and C2). The third category (Experiment No. 3) comprised the same variety of vegetable oil from three geographical origins (named A1, A2 and A3). In the identification and prediction investigations, all seven oil samples were used for the prediction models.
Trihexyl(tetradecyl)phosphonium decanoate was adopted as the hydrophobic supporting electrolyte. It is a Room Temperature Ionic Liquid (RTIL) produced by Strem in the United States. The purity of the RTIL is 95% (Oliveri et al., 2009; Sun et al., 2003).
Experimental samples: Due to the high viscosity and low electrical conductivity of the oil, a good current-potential response cannot be obtained with the working electrode of the voltammetric electronic tongue. A suitable supporting electrolyte is needed to improve the conductivity of the oil. The RTIL was selected as the supporting electrolyte because it dissolves inorganic and organic materials well, is non-volatile, has low toxicity and conducts well. The RTIL also has a wide electrochemical window and can be mixed with oil directly.
All samples were therefore mixed with RTIL before testing. Each experimental sample was a mixture of 20 mL of vegetable oil and 1.5 g of RTIL.

Main instruments:
The main instrument was a CS350 electrochemical workstation (potentiostat) with a three-electrode configuration (a working electrode, a counter electrode and a reference electrode). A platinum electrode was used as the working electrode, a platinum column electrode as the counter electrode and an Ag/AgCl electrode as the reference electrode. In addition, electrode polishing materials, an ultrasonic cleaning instrument, an ultrasonic processor, anhydrous ethanol and a JA2003 electronic balance (dividing value 0.001 g, linear error ±0.002) were used in the test.
Voltammetric electronic tongue: The electronic tongue system used in the study was designed for this work. It consists of an electrochemical workstation with three electrodes as the sensing part and a PC that controls the generation of the excitation potential, collects the response signal data and builds the discrimination and identification models. The experimental samples were supplied to the electronic tongue for analysis.
The experimental samples were homogenized by ultrasound for 5 min and left to stand for 15 min before the voltammetric measurement. The three-electrode array was then placed in the mixed solution to begin measuring.
Cyclic voltammetry measurements were performed at room temperature (25°C) using the CS350 electrochemical workstation (potentiostat) controlled by its software package.
In order to obtain a stable voltammetric response throughout the oil sample experiments, the electrodes were first cycled in a standard 10 mM KCl solution to obtain the standard voltammogram. Afterwards, the mixed experimental sample solutions were used for each measurement.
In each measurement, the cyclic voltammogram was recorded by cycling the excitation potential applied between the working electrode and the reference electrode in rapid linear scanning mode. The excitation potential was an isosceles triangle waveform produced by the electrochemical workstation. The response current was acquired between the working electrode and the counter electrode and then used for later data analysis. Additionally, an electrode cleaning stage was performed between measurements to remove any impurities from the working electrode. The electrodes were placed alternately in the cleaning fluid (anhydrous ethanol) and in the samples. The cleaning time in the cleaning fluid was set to 40 s.
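The isosceles triangle excitation can be sketched as below. This is only an illustration: the limits of -2 V and +2 V and the 50 mV/s scan rate are the values the paper reports for the RTIL voltammogram, while the function name `triangle_sweep`, the sampling interval `dt` and the choice to start at the lower limit are assumptions.

```python
def triangle_sweep(v_low=-2.0, v_high=2.0, scan_rate=0.05, dt=0.1):
    """Return (times, potentials) for one cycle: v_low -> v_high -> v_low.

    scan_rate is in V/s and dt in s; dt and the start potential are
    illustrative assumptions, not settings taken from the paper.
    """
    half_steps = round((v_high - v_low) / (scan_rate * dt))
    step = (v_high - v_low) / half_steps
    rising = [v_low + i * step for i in range(half_steps + 1)]
    falling = rising[-2::-1]  # mirror of the rising branch, apex not repeated
    potentials = rising + falling
    times = [i * dt for i in range(len(potentials))]
    return times, potentials
```

Calling `triangle_sweep()` gives one full cycle; the current sampled at each of these potentials would form one row of the data matrix.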
Each sample was measured 10 times, with 15953 points sampled per measurement. Hence, a dataset of 70 measurements of the vegetable oil samples was available for analysis. The full dataset formed a matrix of 10 × 7 = 70 rows and 15953 columns. Each row was one observation and each column the current response at a given potential.

Data processing:
Current data preprocessing: The electronic tongue response current signals of the three groups of oil samples contained noise and spikes, so the signal data needed preprocessing before analysis. To obtain signals with a high signal-to-noise ratio, spike removal and smoothing filters were applied to the data. Moreover, as noted above, the voltammetric sensor generated current data of high dimensionality: each oil sample measurement yielded 15953 sampling points. It was difficult to use the whole current trace directly in discrimination modeling. However, published results show that using the whole curve is better than using only particular peaks (Alsberg et al., 1998). For discrimination modeling, therefore, the current values of the first 1000 points, the middle 1000 points and the last 1000 points of the sampled time series were used to represent the feature information of an oil sample.
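A minimal sketch of this preprocessing step follows. The paper does not name the smoothing filter or the exact position of the "middle" segment, so the centred moving average, its `window` and the centred middle segment are assumptions, as are the function names.

```python
def moving_average(signal, window=5):
    """Smooth a current trace with a centred moving average (assumed filter)."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def select_segments(signal, seg_len=1000):
    """Keep the first, middle and last seg_len points of one trace."""
    n = len(signal)
    mid_start = (n - seg_len) // 2  # assumption: middle segment is centred
    return (signal[:seg_len]
            + signal[mid_start:mid_start + seg_len]
            + signal[n - seg_len:])
```

Applied to a 15953-point trace, `select_segments(moving_average(trace))` returns 3000 values, which matches the column count of the discrimination dataset.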

Discrimination modeling:
After the current data preprocessing, the dataset formed a matrix of 10 × 7 = 70 rows and 3000 columns. The next step was modeling for oil sample discrimination, which was carried out with PCA and CA.
Principal Component Analysis (PCA) is a dimension reduction technique: a multivariate statistical method that transforms the original variables into a small number of integrated variables. These few integrated variables replace the original variables while retaining as much of the original information as possible, and each of them is independent of the others (Han et al., 2008; Wu et al., 2011; Wei, 2011). The purpose of PCA is to reduce the number of variables, eliminate redundant information and explain most of the information in the original data with fewer variables. A few new variables (principal components) can explain most of the variance of the original data. PCA was used for the qualitative discrimination of the samples within Group 1, Group 2 and Group 3, which covered different varieties and origins of vegetable oil. The Principal Components (PCs) whose cumulative contribution rate exceeded 85% were retained to characterize the sample information.
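The 85% cumulative contribution rule can be illustrated with a short NumPy sketch. The study ran PCA in SPSS, so this is not the authors' implementation; the function name `pca_scores`, centring without scaling and the use of SVD are assumptions made for illustration.

```python
import numpy as np


def pca_scores(X, min_cumulative=0.85):
    """Project rows of X onto the leading PCs that together explain
    at least min_cumulative of the total variance."""
    Xc = X - X.mean(axis=0)                       # centre each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)               # contribution rate per PC
    k = int(np.searchsorted(np.cumsum(explained), min_cumulative) + 1)
    k = min(k, len(s))
    return Xc @ Vt[:k].T, explained[:k]
```

For a 30 × 3000 subset (three oils, ten measurements each), the returned scores are what a two-dimensional PCA plot would display when two PCs are retained.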
Cluster Analysis (CA) is a multivariate analysis method that classifies objects according to their features. Essentially, CA looks for a statistic (a distance or similarity coefficient) that objectively reflects how closely related the elements are, and then divides the elements into several classes according to that statistic. The basic idea of hierarchical clustering is to start with each of the N samples as its own class and to define the distance between samples and the distance between classes. The two nearest classes are then merged into a new class, and the process is repeated until all samples belong to one class. The result is a relationship graph (clustering tree or dendrogram). In this study, cluster analysis based on Euclidean distance was applied to the discrimination of vegetable oil samples.
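The merging procedure described above can be sketched in plain Python. The class average (average linkage) rule is the one the paper mentions in its results; the function names are hypothetical, and the standardization step applied before clustering in the study is omitted here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(points):
    """Agglomerative clustering with the class average rule.

    Returns the merge history as (members_a, members_b, distance).
    """
    clusters = [[i] for i in range(len(points))]  # each sample starts alone
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sum(euclidean(points[p], points[q])
                        for p in clusters[i] for q in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges
```

Reading the merge history from the end backwards corresponds to cutting the cluster tree at decreasing distances, first into two classes and then into three.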
Feature extraction: In the case considered, discrimination modeling could be carried out on the preprocessed current data, which compressed the original data to 3000 values per measurement. For identification and prediction modeling, however, the 70×3000 dataset matrix was still high-dimensional, especially for RBFNN. This was because RBFNN training calls for a larger number of samples rather than a larger number of input variables. Modeling directly on this dataset was evidently difficult.
FFT is an efficient algorithm for computing the Discrete Fourier Transform (DFT). It is a standard tool in digital signal processing that decomposes a long data sequence into its frequency components. The signal is decomposed into pairs of sine and cosine functions of different frequencies, and a coefficient is calculated for each according to its contribution to the original signal. FFT was applied here to compress the original current data down to a small number of Fourier coefficients. The number of coefficients retained can be optimized by checking the reconstruction obtained through the inverse Fourier Transform (Cetó et al., 2012, 2013, 2015).
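A hedged sketch of this compression with NumPy follows. The paper reports 16 coefficients but does not say which ones or in what form, so keeping the magnitudes of the 16 lowest-frequency coefficients is an assumption, and `fft_features` and `reconstruct` are illustrative names.

```python
import numpy as np


def fft_features(signal, n_coeffs=16):
    """Magnitudes of the n_coeffs lowest-frequency Fourier coefficients."""
    spectrum = np.fft.rfft(np.asarray(signal, dtype=float))
    return np.abs(spectrum[:n_coeffs])


def reconstruct(signal, n_coeffs=16):
    """Inverse-transform the truncated spectrum to judge how much of
    the original shape the retained coefficients preserve."""
    x = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(x)
    spectrum[n_coeffs:] = 0
    return np.fft.irfft(spectrum, n=len(x))
```

Comparing `reconstruct(trace, k)` with the original trace for increasing `k` is one way to choose the number of coefficients.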
FFT lacks localization: it cannot easily determine the position and distribution of particular features within the signal. DWT offers better localization in both time and frequency by projecting the signal onto a wavelet function whose width is adjustable through dilation and translation factors. The key feature of the Wavelet Transform (WT) is its multiresolution capability, which captures local features of the signal using wavelet basis functions. These basis functions are derived from a mother wavelet through the dilation and translation factors, and DWT is obtained by discretizing these factors. Signal decomposition and reconstruction in DWT are computed with Mallat's multiresolution analysis: a signal of length N is decomposed into subspaces of length N/2, and the decomposition level is then increased, halving the data length at each level. A Daubechies wavelet was used as the basis function in this study. As with FFT, the number of inputs can be optimized according to the degree of signal reconstruction.
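The halving behaviour of the Mallat decomposition can be shown with the Haar wavelet, the simplest member of the Daubechies family. The paper does not state the wavelet order or the decomposition level, so Haar, the function names and the handling of odd lengths (the last sample is dropped) are assumptions for illustration only.

```python
import math


def haar_step(signal):
    """One decomposition level: length N -> N/2 approximation and detail."""
    s = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)  # an odd trailing sample is dropped
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def haar_approximation(signal, levels):
    """Repeat the decomposition on the approximation branch and return it."""
    approx = list(signal)
    for _ in range(levels):
        approx, _detail = haar_step(approx)
    return approx
```

Each additional level halves the number of approximation coefficients, which is how a long trace can be reduced to a few tens of model inputs.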
Identification and prediction modeling: DFA is a form of multiple discriminant analysis. It establishes a discriminant function from the properties of samples of known category, then uses it to assign samples of unknown category to the known categories. The Fisher criterion of DFA looks for a set of low-dimensional linear functions of the high-dimensional space and classifies the observations in the low-dimensional space. Because the differences between categories are enlarged after dimension reduction, high discriminant efficiency can be obtained.
RBFNN was selected as a prediction model because of its performance. It is a three-layer feedforward network that uses a radial basis function (Gaussian function) as the activation function and can approximate any continuous function with arbitrary precision. The mapping from the input space to the hidden layer is nonlinear, whereas the mapping from the hidden layer to the output layer is linear. This greatly accelerates learning and avoids the local minimum problem. RBFNN is well suited to the analysis of nonlinear sensor responses and of complex relationships in high-dimensional data.
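The two-stage structure (nonlinear hidden layer, linear output layer) can be sketched with NumPy. The study built its network in MATLAB; here the hidden-unit centres and the Gaussian width are passed in by the caller, and solving the output weights by least squares is an assumption, as are all function names.

```python
import numpy as np


def rbf_design(X, centres, width):
    """Gaussian activation of every sample (rows) at every hidden unit (columns)."""
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * width ** 2))


def train_rbf(X, y, centres, width):
    """Fit the linear output layer by least squares (bias column appended)."""
    H = np.hstack([rbf_design(X, centres, width), np.ones((len(X), 1))])
    weights, *_ = np.linalg.lstsq(H, y, rcond=None)
    return weights


def predict_rbf(X, centres, width, weights):
    """Network output for new samples, using the trained output weights."""
    H = np.hstack([rbf_design(X, centres, width), np.ones((len(X), 1))])
    return H @ weights
```

Because only the output weights are fitted and that fit is linear, training reduces to a single least-squares solve once the centres are fixed.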
In building the identification and prediction models, reducing the input data was both important and necessary: it lowers the complexity of the input, avoids data redundancy, shortens model training time and yields a model with better prediction ability. In this study, two feature extraction methods, FFT and DWT, were applied for input data reduction.
PCA and CA were performed with SPSS 11. FFT and DWT were performed with LabVIEW 8.5, while DFA and RBFNN were implemented in MATLAB 7.1.

Cyclic current-voltage characteristics of edible vegetable oil and RTIL:
The cyclic voltammetry response curve of trihexyl(tetradecyl)phosphonium decanoate, recorded with a 2 mm platinum electrode at a scanning potential of -2 V to +2 V and a scanning rate of 50 mV/s, is shown in Fig. 1. The upper part, the reduction wave, is called the cathodic branch, while the lower part, the oxidation wave, is called the anodic branch. The wave reached its maximum and minimum values at potentials of +2 V and -2 V. Figure 1 shows that the RTIL has a wide electrochemical window and can serve as an ideal supporting electrolyte for electrochemical reactions. When the RTIL is used as the supporting electrolyte for vegetable oil, it must not mask the characteristic information of the oil.

Discrimination of vegetable oil samples using PCA and CA:
The aim of Experiments No. 1 and 2 was to determine whether the cyclic voltammetric electronic tongue can correctly distinguish edible oils by variety, while Experiment No. 3 aimed to show whether it can distinguish the same variety of vegetable oil by geographical origin.

AiChu vegetable oil samples:
To this end, a subset of three samples (A1, B1 and C1) was analyzed with the electronic tongue. The original data were preprocessed as described in the section above and then analyzed by PCA. The PCA plot is shown in Fig. 3. Samples A1, B1 and C1 occupied three separate areas of the two-dimensional space with no overlap, which indicates that the three vegetable oil samples of different varieties can be recognized well by principal component analysis. Moreover, the contribution rate of PC1 reached 98.69% and that of PC2 reached 1.27%. The total accumulated explained variance reached 99.96% with only the first two PCs, which means that these two PCs represent most of the variance contained in the original data.
For CA, the preprocessed data were first standardized and the Euclidean distance between samples was then calculated with the class average method. The cluster tree shown in Fig. 4 was obtained by distance classification, and it clearly presents the relationship among the three varieties of vegetable oil. At a squared distance between classes of 35.523, the samples were divided into two classes: one was the AiChu blend oil (sample C1) and the other contained the AiChu peanut oil (sample B1) and the AiChu soybean oil (sample A1). When the squared distance between classes decreased to 20.2184, the samples were divided cleanly into three classes. In other words, a discrimination accuracy of 100% can be achieved by selecting a suitable squared distance between classes. Figure 4 shows that CA is also a useful tool for classifying vegetable oil samples of different varieties.

JinLongYu oil samples:
To confirm that this satisfactory result was indeed due to variety, Experiment No. 2 was performed with samples A2, B2 and C2. The original data were preprocessed in the same way as before. The PCA plot is shown in Fig. 5. The three samples were well distributed in three areas, and the total accumulated explained variance reached 99.88% with only the first two PCs. A good classification of these three other varieties of vegetable oil was thus also obtained. Although some individual points in the samples deviated from the centers of their classification regions, the classification results met expectations. CA was performed with the same algorithm as before. At a squared distance of 52.7531, the three samples were divided into two classes: one was sample B2 and the other contained samples A2 and C2. A discrimination accuracy of 100% was achieved at a squared distance of 26.9589 (Fig. 6). The two experiments and the two discrimination models showed that the electronic tongue was able to correctly distinguish edible oils by variety.

Discrimination of different geographical origin oil samples:
The subset of three samples (A1, A2 and A3) from different geographical origins was analyzed with the electronic tongue. After the current data preprocessing, the dataset formed a matrix of 10 × 3 = 30 rows and 3000 columns, which was passed to the discrimination models.
Figure 7 shows a good discrimination of the samples by PCA. The samples occupied three areas of the two-dimensional space as expected, except for individual measurements of the Xinzheng sample (Sample A1) that deviated from their group. The total accumulated explained variance reached 99.78% with the first two PCs.
CA showed the same trend (Fig. 8). The samples were divided into two classes (one was A3 and the other contained A1 and A2) at a squared distance of 62.4261, and into three classes at 44.7194, which achieved 100% discrimination. The cluster tree shows that when the samples are divided into three classes, oil samples from the three origins are discriminated well.
Discrimination of oils according to their geographical origin can therefore be achieved with both PCA and CA modeling.

Prediction of different vegetable oil using DFA and RBFNN:
To further assess the ability of the electronic tongue to identify and predict unknown vegetable oils, identification modeling with both DFA and RBFNN was also attempted. However, the obvious difficulty was building identification models from the high-dimensional original data matrix collected by the voltammetric sensor, or from the preprocessed data. The total original data matrix was 70 (samples) × 15953 (data points). Each voltammetric measurement was compressed down to only 16 coefficients without loss of significant information by means of FFT, and down to only 23 coefficients by means of DWT. These coefficients were used as the input data for DFA and RBFNN modeling (Table 2).
The 70 samples were split randomly into two subsets: the training subset (49 samples, 70%) was used to build the DFA or RBFNN model and the testing subset (21 samples, 30%) was used to assess the prediction performance of the models.
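A random 70/30 split of this kind can be sketched as follows. The paper does not say whether the split was stratified by oil type or which random seed was used, so the plain shuffle, the `seed` argument and the function name are assumptions.

```python
import random


def split_indices(n_samples=70, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into training and testing subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for repeatability
    n_train = round(n_samples * train_fraction)
    return sorted(indices[:n_train]), sorted(indices[n_train:])
```

With the defaults, the function returns 49 training indices and 21 testing indices that together cover all 70 measurements.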
For DFA modeling, the training set was clearly discriminated into three clusters, and all 21 testing oil samples were projected onto the DFA map as unknown samples. The prediction results and identification accuracy obtained with DFA are listed in Table 2. With DWT feature extraction, 15 samples were identified correctly, an identification accuracy of 71.43%; with FFT feature extraction, 14 samples were identified correctly, an accuracy of 66.67%. For the DFA model, the best results were thus obtained with DWT as the feature extraction method.
RBFNN was also applied to predict the oil samples. After the topology was optimized through training, the structure of the RBF neural network was set to 16-19-1 when FFT was used as the feature extraction method. The 16 coefficients obtained from FFT were used as input neurons. The number of hidden layer neurons was optimized according to training and prediction accuracy and set to 19. One neuron in the output layer represented the predicted sample. The same modeling procedure was followed with DWT, for which the network structure was 23-24-1. The prediction results are shown in Table 2. With DWT, 19 samples were identified correctly (90.48% accuracy), which was better than with FFT, where 17 samples were identified correctly (80.95% accuracy).
Hence, RBFNN modeling outperformed DFA with both FFT and DWT feature extraction. The best results were obtained with DWT-RBFNN, which reached an identification accuracy of 90.48%.
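The reported accuracies follow directly from the counts of correctly identified samples out of the 21 testing samples. The short check below uses only the counts stated in the text; the dictionary layout and function name are illustrative.

```python
def accuracy_percent(n_correct, n_total=21):
    """Identification accuracy as a percentage, rounded to two decimals."""
    return round(100 * n_correct / n_total, 2)


# Correct identifications reported in the text for each model/feature pair.
reported = {
    ("DFA", "FFT"): 14,
    ("DFA", "DWT"): 15,
    ("RBFNN", "FFT"): 17,
    ("RBFNN", "DWT"): 19,
}

for (model, features), n_correct in reported.items():
    print(f"{features}-{model}: {n_correct}/21 = {accuracy_percent(n_correct)}%")
```

The values 66.67%, 71.43%, 80.95% and 90.48% agree with those given for DFA and RBFNN.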

CONCLUSION
Given the weak electrical conductivity of vegetable oil, an appropriate supporting electrolyte must be chosen so that the voltammetric electronic tongue provides a good electrochemical response window for later modeling.
The voltammetric electronic tongue has proved to be a useful tool for the discrimination and identification of vegetable oil. Specifically, we reported its application to discriminating vegetable oils of different varieties and geographical origins and to identifying different vegetable oils. Moreover, different discrimination models, feature extraction methods and identification models were applied and compared.
The research provides an analytical approach for follow-up work on edible oil quality evaluation and control. Future efforts may involve further model validation (e.g., cross validation, a larger number of samples) and system optimization (e.g., increasing the number and variety of voltammetric electrodes). In addition, the approach may be improved by combining it with an electronic nose.

Fig. 1: Cyclic voltammetry response signal of RTIL

Figure 2 shows the cyclic voltammograms of AiChu soybean oil and of the same oil mixed with RTIL (Sample A1). The red curve is the voltammogram of AiChu soybean oil and the blue curve is that of AiChu soybean oil mixed with RTIL. It can be seen from Fig. 2 that the pure oil was almost non-conductive. The response of the voltammetric electronic tongue improved greatly after RTIL was added. The cyclic voltammogram of the mixed solution also differed from that of the RTIL alone (compare Fig. 1). This further indicated that the response current characteristics of the mixed solution reflected the essential attributes of the pure oil.

Fig. 7: Discrimination result and PCA score plot of different geographical origin oil samples

Fig. 8: Cluster tree of different geographical origin oil samples

Table 1: Three categories of experimental samples

Table 2: The prediction results of unknown samples by DFA and RBFNN