Evaluation of Brazilian Monovarietal Extra Virgin Olive Oils Using Digital Images and Independent Component Analysis

Digital images combined with chemometric tools were applied, in a non-destructive approach, to the evaluation of Brazilian monovarietal extra virgin olive oils. Independent component analysis (ICA) was employed to evaluate image data of olive oils produced from distinct olive varieties, and it successfully highlighted the natural grouping among the samples based on their color histograms (red, green and blue (RGB) channels). The results indicate that the twenty-four samples can be accurately separated according to their color. To support this hypothesis, principal component analysis (PCA) was applied to the visible spectra of the Brazilian monovarietal extra virgin olive oils to observe the similarities between the samples, and the results agree with those obtained by image analysis coupled with ICA. The method demonstrates that olive oils can be characterized while respecting the principles of Green Chemistry.


Introduction
Extra virgin olive oil (EVOO) is the superior olive oil category and is obtained solely from the fruit of the olive tree, Olea europaea L., by mechanical means. Although olive oil production is concentrated in the Mediterranean countries, olive trees are also cultivated in other countries, including Brazil; the increasing consumption of EVOO is related to its peculiar properties. 1 In Brazil, Maria da Fé (a city in the south of the state of Minas Gerais, in the Serra da Mantiqueira) favors olive tree cultivation because its altitude brings lower temperatures and higher rainfall. 2 A small area of around 500 ha 3 has begun olive tree cultivation and the production of Brazilian monovarietal extra virgin olive oil (Brazilian monovarietal EVOO), which is obtained from a single variety 4 and presents distinct characteristics directly related to the olive cultivar and the natural environment in which it grows. 5
Arbequina, one of the most widely grown and marketed cultivars in the world, has recently started to be cultivated in Brazil. Its popularity arises from its easy adaptation to new environmental conditions and its good oil quality, which is considered ideal for new and emerging markets. 3 Grappolo 541 is a Brazilian clone obtained from the Italian varietal Grappolo, and its fruit serves for both olive oil and table olive production. Maria da Fé is a Brazilian clone of the Portuguese varietal Galega and is indicated for olive oil production. 6 Empeltre is a variety commonly cultivated in the Mediterranean area and now in Brazil, where it is used for olive oil production. 7 The Coratina cultivar is widely diffused in the south of Italy; the oil it produces is characterized by a strong peppery flavor due to its high polyphenol content. 8 The Koroneiki cultivar originated in Greece and produces olive oil of exceptional quality; in Brazil, this varietal has been successfully tested for olive oil extraction. 9,10
The quality and authenticity of olive oil are attributed to several factors, including the varietal from which it was obtained. 11 It is therefore important to differentiate and recognize monovarietal olive oils, since they show unique features. EVOO is a complex food matrix, and monitoring its quality (from harvesting through transformation to storage) and authenticity (olive variety, detection of adulteration, identification of geographical origin) can be a challenge. 12 Analytical tools can provide this kind of distinction; several researchers have sought a method that allows distinction according to cultivar, based on the electronic tongue, 11 characterization of fatty acid composition, 13 nuclear magnetic resonance, 12 phenolic and organic compounds, 14,15 and chemical composition, 16,17 among others. 18,19 Nevertheless, some methods have several drawbacks, the most significant being slowness, high cost, the need for sample pre-treatment and the demand for highly skilled personnel. 20 In this sense, chemometric tools can help to characterize or authenticate monovarietal olive oils, 21 and multivariate analysis can be useful since pattern recognition procedures can be applied to compare similarities and differences in a large dataset. 22 Chemometric tools make it possible to work with alternative analytical techniques, assisting in the interpretation of the results and extracting relevant information from the data.
In this context, digital images have been used as a source of analytical information since the last century. 23 Moreover, color and chemometrics were previously employed in the characterization of EVOO 24 and in the classification of EVOO samples with respect to brand. 25 Digital images may be an alternative route for the authentication of Brazilian monovarietal EVOO since they are intrinsically a multivariate system, that is, a collection of data stored in pixels. 26 The images are formed by the combination of the red (R), green (G) and blue (B) channels, and each pixel holds numerical information that can be accessed by decomposing the image into its RGB channels. The intensity of each color in the RGB channels ranges from 0 to 255, and the combination of these channels creates the different colors, giving 256³ possible combinations. Counting, for each channel, how often each intensity occurs over all pixels yields a frequency histogram that can be treated as spectral data and used for developing chemometric models. 23 Instrumentation for hyperspectral imaging can be expensive. In this study, however, the images were acquired with a commercial scanner, which has the advantages of being low-cost, requiring no sample pre-treatment and being eco-friendly; the images can be stored as a matrix of pixels. 27 Image analysis can be used with unsupervised (exploratory) or supervised (discrimination/classification) chemometric tools. 28
Therefore, this study aimed to propose an alternative methodology for the rapid distinction of Brazilian monovarietal EVOO according to cultivar by combining digital images with independent component analysis (ICA). ICA is an exploratory tool, used as a curve resolution method, that recovers the signals of the pure components from signal mixtures. Moreover, the image decomposition performed by ICA makes it possible to efficiently extract "interesting" signals for the evaluation of Brazilian monovarietal EVOO. ICA is therefore a promising tool for digital image analysis and interpretation. 29 To assist in interpreting the ICA results, principal component analysis (PCA) was applied to the visible spectra of the Brazilian monovarietal EVOO samples that were subjected to the novel method.

Apparatus and software
A Hewlett-Packard (HP) model F4480 scanner was used for image acquisition. The signals were separated using the ICA resolution method with the JADE (Joint Approximate Diagonalization of Eigenmatrices) algorithm 30 in MATLAB version R2007b. 31 Visible spectra were recorded with an Ocean Optics UV-Vis spectrometer interfaced to a computer through the SpectraSuite software. The spectra were collected from 400 to 800 nm in a 1 mm quartz cuvette, without any sample preparation.

Procedure for the image acquisition
For image acquisition, 5 mL of each sample were placed in small Petri dishes (6.0 cm radius × 1.5 cm height) and positioned on the flatbed scanner so as to guarantee uniform illumination over all sections. 32 All images were acquired under the same lighting conditions, and the sample position on the scanner was the same for all samples. All images were digitized at 2338 × 1700 pixels with a resolution of 200 dpi and stored in .jpeg format. The images were exported to the MATLAB environment for conversion into RGB histograms. From each image, three areas of 80 × 80 pixels were selected, giving three samples per image (three tensors {80,80,3}, where 80 is the number of pixels along each side and 3 refers to the variables R, G and B, which can take values from 0 to 255). The areas were carefully chosen in a homogeneous part of the image. Each chosen area is represented by a tensor that was unfolded into its RGB channels, resulting in 72 vectors. Figure 1 illustrates the whole procedure: image acquisition, organization of the RGB histograms and construction of the matrix from a single channel.
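The unfolding step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MATLAB code used in the study: the patches are random stand-ins for real scanner regions, and the function names (`channel_histogram`, `unfold_patch`, `random_patch`) are hypothetical. Each 80 × 80 × 3 patch is reduced to three 256-bin histograms placed side by side (R, G, B), and the 72 resulting vectors are stacked as rows of a matrix.

```python
import random

def channel_histogram(patch, channel, levels=256):
    """Count how many pixels of `patch` take each intensity in one channel."""
    counts = [0] * levels
    for row in patch:
        for pixel in row:
            counts[pixel[channel]] += 1
    return counts

def unfold_patch(patch):
    """Concatenate the R, G and B histograms into one 768-element vector."""
    vector = []
    for channel in range(3):  # 0 = R, 1 = G, 2 = B
        vector.extend(channel_histogram(patch, channel))
    return vector

def random_patch(size=80, seed=0):
    """Hypothetical stand-in for an 80 x 80 region cut from a scanned image."""
    rng = random.Random(seed)
    return [[(rng.randrange(256), rng.randrange(256), rng.randrange(256))
             for _ in range(size)] for _ in range(size)]

# 72 regions in total (three regions per scanned image), one row per region.
X = [unfold_patch(random_patch(seed=i)) for i in range(72)]
print(len(X), len(X[0]))  # 72 768
```

Each histogram sums to the number of pixels in the patch (6400), which is a convenient sanity check when real images are substituted for the random patches.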

Independent component analysis (ICA)
ICA is a blind source separation method based on the construction of factors named independent components (ICs), which are linear combinations of the original variables. The ICs are assumed to correspond to the most statistically independent signals of the "pure" sources present in the analyzed mixtures. 33 In signal processing, ICA is a computational method for separating a multivariate signal into additive subcomponents. 34 The signals are organized into a matrix X (s × v), where the s samples correspond to the rows and the v variables to the columns of X. The ICA model can be described, following Kassouf et al., 35 as
$$X = AS$$
where A is the matrix of pure signal proportions (the mixing matrix) and S is the matrix of pure source signals (the ICs).
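The algebra of the mixing model can be illustrated with a toy example. The sketch below is not an ICA algorithm: it assumes a known, hypothetical 2 × 2 mixing matrix A and two made-up source signals, builds the measured signals as the product of A and S, and shows that multiplying by the inverse of A returns the sources. In a real analysis A is unknown, and an algorithm such as JADE has to estimate the inverse from X alone.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse_2x2(m):
    """Invert a 2 x 2 matrix; raises if it is singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("mixing matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

# Two hypothetical pure signals (rows of S), four variables each.
S = [[0.0, 1.0, 4.0, 1.0],
     [3.0, 2.0, 0.0, 0.0]]

# Hypothetical proportions of each pure signal in two samples (rows of A).
A = [[0.8, 0.2],
     [0.3, 0.7]]

X = matmul(A, S)            # measured signals, one row per sample
W = inverse_2x2(A)          # inverse of the (here known) mixing matrix
S_recovered = matmul(W, X)  # matches S up to floating-point rounding
```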
The main objective of ICA is to estimate a 'demixing matrix', $W = A^{-1}$. Since ICA is a blind source separation method, the goal is to calculate W knowing only X. The rows of S are the pure source signals and may be recovered from the matrix of measured signals (X) by
$$S = WX$$
The Joint Approximate Diagonalization of Eigenmatrices (JADE) algorithm, which aims to extract the most independent non-Gaussian sources from signal mixtures with Gaussian noise, was used to calculate W. 33
Figure 2 shows the RGB histograms for all Brazilian monovarietal EVOO images obtained from the scanner. Each sample, after tensor unfolding, generated a vector with dimensions 1 × 768 (256 values for each of the R, G and B variables, side by side in that order). These vectors were organized in a matrix X (RGB histograms), to which ICA was applied. As an unsupervised method, ICA is not used to predict data, since there is no associated response variable; rather, it is a multivariate tool that makes it possible to recover the distinctive traits of a specific food, in this case Brazilian monovarietal EVOO, by highlighting the patterns in the data. 36 In fact, in the literature, independent components estimated from various scientific data are often reported without any kind of validation. 37 The image acquired from a scanner is known to be a mixture of the red, green and blue components. Therefore, it seems natural to treat all information obtained from the RGB channels as useful. 38 However, when the R, G and B channels were used together, no relevant differentiation between the Brazilian monovarietal EVOO samples was observed. Each channel was then examined separately in search of a differentiation that would enable the evaluation of the samples. Figure 3 presents the average histogram of the B channel for each monovarietal. This was the channel that provided the most useful information for differentiating the EVOO monovarietals.
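Because each unfolded vector stores the R, G and B histograms side by side in that order, isolating one channel is a matter of slicing. The following sketch, with hypothetical helper names and toy numbers in place of real histograms, extracts the B block (the last 256 of the 768 entries) from every row and averages a group of rows, which is the kind of per-varietal average histogram described for the B channel.

```python
LEVELS = 256  # intensity levels per channel

def split_channels(vector):
    """Split a 768-element vector into its R, G and B histograms."""
    if len(vector) != 3 * LEVELS:
        raise ValueError("expected 768 values (R, G, B side by side)")
    return {
        "R": vector[0:LEVELS],
        "G": vector[LEVELS:2 * LEVELS],
        "B": vector[2 * LEVELS:3 * LEVELS],
    }

def channel_matrix(rows, channel):
    """Keep only one channel's 256 columns from every row of the matrix."""
    return [split_channels(row)[channel] for row in rows]

def mean_histogram(rows):
    """Average several histograms bin by bin (e.g. all rows of one varietal)."""
    return [sum(column) / len(rows) for column in zip(*rows)]

# Toy data: two rows whose entries are easy to trace by eye.
X_toy = [list(range(768)), [value + 2 for value in range(768)]]

X_blue = channel_matrix(X_toy, "B")   # 2 rows x 256 columns
blue_mean = mean_histogram(X_blue)    # first bin: (512 + 514) / 2 = 513.0
```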
In all cases, the number of ICs was tested from 6 (the number of varietals) to 8; the results showed feasible differentiation between varietals from the B channel with 7 ICs. Although seven ICs were necessary to provide the signal separation useful for distinguishing Brazilian monovarietal EVOO, only those containing relevant information related to the aim of this study are discussed. Figures 4 and 5 present the ICs (IC1, IC2, IC3, IC4, IC6 and IC7) responsible for grouping the varietals. IC5 showed no ability to distinguish any group, and for this reason its result is not presented here. For each IC two plots are shown: the first represents the scores, corresponding to the group formation, while the second presents the extracted signals responsible for the monovarietal grouping.

Results and Discussion
The scores plot (Figure 4) showed a high degree of natural grouping, making it possible to affirm that the six groups of varietals were efficiently separated. Samples 13 to 24, corresponding to the Empeltre varietal, were discriminated by IC1. Samples 49 to 60, belonging to Maria da Fé, were separated on IC3. Samples 25 to 36, from the Koroneiki varietal, were separated on IC4, and samples 37 to 48, from the Arbequina varietal, on IC6. The Coratina (samples 1 to 12) and Grappolo (samples 61 to 72) varietals were grouped together on IC2 and IC7.
Grappolo and Coratina appear together on the same ICs. Grappolo is a Brazilian clone obtained from an Italian varietal, 6 and the Coratina cultivar is widely diffused in the south of Italy. 8 For these varietals of Italian origin, the results suggest that the Brazilian climate and soil contributed to a similar composition, which gave these olive oils comparable colors. Figure 5 shows which portion of the B channel is responsible for the separation shown in the scores. The Empeltre monovarietal presents its extracted signal on IC1 around variable 75. The Maria da Fé monovarietal, discriminated on IC3, shows its extracted signal around variable 6. On IC4, on which the Koroneiki monovarietal was separated, the variables around 13 are responsible for the distinction. The variables around 88 are the extracted pure signals that discriminated the Arbequina monovarietal on IC6. The Coratina and Grappolo monovarietals present extracted signals around variable 40 on IC2 and around variable 45 on IC7.
The R, G and B channels are responsible for the color composition of a digital image. For this reason, the visible spectra (Figure 6) of the Brazilian monovarietal EVOO were evaluated. Several studies show that the α, β and γ carotenes absorb at 447, 451 and 462 nm, respectively, while chlorophylls absorb at 420 and 670 nm. 39-45 In Figure 6, it is possible to observe that the Brazilian olive oils from different monovarietals present a shift at these wavelengths. This effect is typical of concentration variations.
Principal component analysis (PCA) was applied to the visible spectra of three independent samples of Brazilian monovarietal EVOO to fully understand the results obtained when applying ICA to the digital images. PCA was applied to the transposed matrix, i.e., with the wavelengths in the rows and the samples in the columns. This strategy aimed to evaluate the relation between the absorptions at the different wavelengths and assign them to the samples. Figure 7 shows the PCA results. The scores showed a correlation between the carotene and chlorophyll absorptions on the positive side of the first principal component (PC1), which explains 99.12% of the variance in the dataset. In fact, the spectral regions on the positive side of PC1 correspond to the absorption of the α, β and γ carotenes and of the chlorophylls. The loadings plot showed the Brazilian monovarietal EVOO samples in increasing order of carotene and chlorophyll content, with the highest content in the Maria da Fé monovarietal and the lowest in the Empeltre monovarietal. The samples of the Coratina and Grappolo varietals are similar in their quantities of carotenes and chlorophylls, which reinforces the result achieved by ICA coupled with digital images.
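The effect of transposing the matrix can be illustrated with a small, self-contained sketch. It rests on assumptions that the text does not confirm: the data are synthetic (one made-up base spectrum scaled by three made-up concentrations), the columns are mean-centered before the decomposition, and the first component is found by plain power iteration rather than by the software used in the study. With wavelengths in the rows and samples in the columns, the scores describe wavelengths and the loadings describe samples, so samples with more pigment receive larger loadings on PC1.

```python
import math

def center_columns(rows):
    """Subtract each column's mean (assumed preprocessing)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[value - mean for value, mean in zip(row, means)] for row in rows]

def covariance(rows):
    """Covariance between columns of an already centered matrix."""
    n, p = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)]
            for i in range(p)]

def mat_vec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def first_component(cov, iterations=200):
    """Leading eigenvector and eigenvalue of `cov` by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = mat_vec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(cov, v)))
    return v, eigenvalue

# Synthetic data: rows = wavelengths, columns = samples.
base_spectrum = [0.1, 0.9, 0.4, 0.2, 0.6]   # hypothetical absorbances
concentrations = [1.0, 2.0, 3.0]            # hypothetical pigment levels
spectra = [[b * c for c in concentrations] for b in base_spectrum]

centered = center_columns(spectra)
cov = covariance(centered)
loadings, eigenvalue = first_component(cov)   # one loading per sample
scores = mat_vec(centered, loadings)          # one score per wavelength
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
```

In this idealized case the spectra differ only by a scale factor, so PC1 carries essentially all of the variance and the loadings increase with concentration; real spectra would show a smaller explained fraction.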
The color of olive oils is related to their pigment content and varies from light gold to a rich green. 46 Chlorophylls and carotenoids are very common pigments responsible for the color of EVOO, as observed in its visible spectrum; 40 the exact combination and proportions of these pigments determine the final color of the oil. Chlorophyll pigments, such as pheophytin, are responsible for the greenish color, while compounds like lutein and β-carotene are the yellow pigments. The amount and proportions of these pigments depend on the cultivar, the maturation and the olive processing system, as well as the storage conditions. 45,46 In fact, a study assessing Brazilian monovarietal olive oil in two different packaging systems showed that glass bottles provide more protection for the Brazilian monovarietal olive oil than tinplate cans. 47 Here, all samples were in glass bottles and, according to the producer, the olives passed through the same processing system and were harvested at similar degrees of maturation. Furthermore, the samples had the same storage time. The cultivar is therefore the variable that contributes to the differences in the amount of pigments.

Conclusions
Scanner-based image analysis coupled with independent component analysis is a feasible tool for the evaluation of Brazilian monovarietal extra virgin olive oil. The proposed methodology is an easy and cheap way to collect a large amount of data and can be an alternative to robust methodologies in the evaluation and distinction of monovarietal extra virgin olive oil, with the advantages that it requires no sample pre-treatment, reagents or solvents, and is quick. The combination of digital images and ICA provides high rates of distinction, probably because the different compositions of the varietals lead to different color intensities in the B channel.
The performance achieved by this novel methodology was considered satisfactory for the distinction of olive oils produced from a single varietal, such as Empeltre, Koroneiki, Arbequina and Maria da Fé. Nevertheless, it is possible to conclude that the Coratina and Grappolo varietals cultivated in Brazil have similar constituents contributing to their colors; these two varietals therefore grouped together and were not discriminated from each other. To test this hypothesis, PCA was applied to the visible spectra, and the results confirm that the Coratina and Grappolo monovarietals are similar in their carotene and chlorophyll quantities.
and Luiz F. Oliveira da Silva for data curation, software, validation, visualization and writing original draft; Patrícia Valderrama, Paulo H. Março, Sandra T. M. Gomes, and Makoto Matsushita for writing-review and editing.