Data on fatty acid profiles of green Spanish-style Gordal table olives studied by compositional analysis

This article contains processed data related to the research published in “Tentative application of compositional data analysis to fatty acid profiles of green Spanish-style Gordal table olives” (Garrido-Fernández et al., 2018) [1]. It provides information on the application of compositional data analysis (CoDa) to the fatty acid profiles of Spanish-style Gordal table olives, as opposed to conventional statistical analysis of the composition expressed in percentages. Specifically, it includes: i) the matrix of the sequential binary partition used for the balance estimation and the isometric log-ratio (ilr) transformation of the fatty acid profiles, ii) the correlations among the fatty acids expressed in percentages, with their significance levels, iii) the ilr transformed values (coordinates in the Euclidean space) obtained from that sequential binary partition, iv) the graphical presentation in the Simplex (ternary centred plot) of the treatments as a function of the four fatty acids with the highest log-ratio variances, and v) the segregation of treatments based on Cluster Analysis.



Value of the data
The data include the sequential binary partition of fatty acid profiles in CoDa, which may help interested researchers calculate balances and ilr transformations for other food compositions.
The correlations among fatty acids expressed in percentages may help other researchers identify spurious relationships.
The information may facilitate the comparison of conventional and compositional multivariate techniques, regardless of the field, and promote international collaboration in data analysis.
Presentation in the Simplex can be an appropriate way of graphing compositional data and treatment effects.
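
The risk of spurious relationships among percentages can be illustrated with a minimal sketch. It assumes three independent, simulated components (not the olive data) and shows that closing them to 100% induces a negative correlation that is absent in the raw values. The helper names `pearson` and `close_to_percent` and the simulated values are hypothetical and only serve the illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def close_to_percent(row):
    """Rescale a row of positive values so that it sums to 100."""
    total = sum(row)
    return [100.0 * v / total for v in row]


# Assumption: three independent components drawn from the same uniform range.
rng = random.Random(0)
raw = [[rng.uniform(1.0, 2.0) for _ in range(3)] for _ in range(2000)]
closed = [close_to_percent(row) for row in raw]

# Correlation between the first two components, before and after closure.
r_raw = pearson([row[0] for row in raw], [row[1] for row in raw])
r_closed = pearson([row[0] for row in closed], [row[1] for row in closed])

print("raw values:  r =", round(r_raw, 3))
print("percentages: r =", round(r_closed, 3))
```

With three exchangeable parts, the constant-sum constraint alone pushes the correlation between any two percentages towards -0.5, even though the underlying components are unrelated.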

Data
The data cover aspects of conventional and compositional analysis: the presentation of the data in the Simplex (Fig. 1), the binary partition (Table 1), the ilr transformations based on it (Table 3), and the application of multivariate tools to the original data (Table 2 and Fig. 2A) and to the ilr coordinates (Fig. 2B).
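
As an illustration of how ilr coordinates follow from a sequential binary partition, the sketch below uses the standard balance formula: the square root of r·s/(r+s) times the logarithm of the ratio of the geometric means of the parts coded +1 and −1, where r and s are the numbers of parts in each group. The three-part composition and its partition are hypothetical; they are not the 19 fatty acids or the partition of Table 1, and the function names are illustrative only.

```python
import math


def balance(composition, signs):
    """Balance for one row of a sequential binary partition.

    signs holds +1 (numerator), -1 (denominator) or 0 (not involved).
    All parts used must be strictly positive (zeros need prior treatment).
    """
    plus = [math.log(x) for x, s in zip(composition, signs) if s == 1]
    minus = [math.log(x) for x, s in zip(composition, signs) if s == -1]
    r, s = len(plus), len(minus)
    if r == 0 or s == 0:
        raise ValueError("each balance needs parts coded +1 and -1")
    # Difference of mean logs equals the log of the ratio of geometric means.
    return math.sqrt(r * s / (r + s)) * (sum(plus) / r - sum(minus) / s)


def ilr(composition, partition):
    """ilr coordinates: one balance per row of the partition matrix."""
    return [balance(composition, row) for row in partition]


# Hypothetical partition of a 3-part composition (D parts give D-1 balances).
partition = [
    [1, 1, -1],  # parts 1 and 2 versus part 3
    [1, -1, 0],  # part 1 versus part 2
]
coords = ilr([60.0, 30.0, 10.0], partition)
print(coords)
```

Because balances depend only on ratios, the same coordinates are obtained whether the composition is expressed in percentages, proportions, or any other constant-sum scale.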

Experimental design, materials and methods
Olives (maturity index = 1) [2] were processed in duplicate according to the green Spanish-style. After fermentation for eight months, 10 kg olives from each replicate were packaged in glass containers (50 g NaCl/L and 5.5 g lactic acid/L cover brine), stabilized by pasteurization, and stored at room temperature (22 ± 2 °C) for two months. The applied processing and packaging mimicked those used at industrial scale [3]. Samples (~5 kg olives) were withdrawn in duplicate from i) the fresh Gordal olives, extracted by Abencor (RM), ii) each of the replicates of the fermented fruits, extracted by Abencor (FO) and Soxhlet (FOS), and iii) the packaged olives, extracted by Abencor (PO) and Soxhlet (POS). The olives from the samples were pitted, homogenized with an Ultra-Turrax T25 (IKA-Labortecnik, Staufen, Germany) and extracted as described elsewhere [4,5].
Fatty acid profiles were obtained through analysis of their fatty acid methyl esters (FAMEs) by GC according to the procedures recommended in the Commission Regulation (EU) No 2015/1833 [6]. The FAMEs were quantified on a Hewlett-Packard 5890 series II gas chromatograph, using a fused silica capillary column Select FAME (100 m × 0.25 mm, 0.25 μm film thickness) (Varian, Bellefonte, PA), a flame ionization detector, and a reference standard of saturated and unsaturated fatty acid methyl esters (FAME Mix C4-24). Details of the procedure can be found elsewhere [5,7–9]. The identification of fatty acids followed the guidelines of the Commission Delegated Regulation (EU) 2015/1830 (8 July 2015) and previous works on processed olives [5,7–9]. Each replicate was analysed in duplicate, and the average was recorded.
The data matrix consisted of 10 rows (five treatments in duplicate) and 19 columns (fatty acids). Values were first tested for outliers and normality. The data were plotted in the Simplex (Fig. 1), analysed with specific exploratory techniques such as the compositional biplot [1], and subjected to sequential binary partition (Table 1). This partition led to the CoDa dendrogram [1, Fig. 2] and the ilr transformed values (or coordinates) (Table 3) [10,11]. Finally, both data sets (percentages and ilr coordinates) were subjected to similar multivariate Cluster Analysis (based on the Euclidean distance and the Ward method) (Fig. 2) [12].
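
The clustering step can be sketched as follows. This is a naive illustration of Ward's agglomerative criterion on Euclidean distances, assuming hypothetical two-dimensional points; it is not the XLSTAT implementation used for Fig. 2, and the function name `ward_merges` is illustrative only. At each step it merges the two clusters whose union gives the smallest increase in the within-cluster sum of squares.

```python
def ward_merges(points):
    """Return the Ward merge sequence as (sorted member indices, cost) tuples.

    The cost of merging clusters A and B is
    |A||B| / (|A| + |B|) * squared Euclidean distance between centroids,
    i.e. the increase in the within-cluster sum of squares.
    """
    # Cluster id -> (member indices, centroid)
    clusters = {i: ([i], list(p)) for i, p in enumerate(points)}
    merges = []
    next_id = len(points)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                members_a, centroid_a = clusters[a]
                members_b, centroid_b = clusters[b]
                dist2 = sum((u - v) ** 2 for u, v in zip(centroid_a, centroid_b))
                na, nb = len(members_a), len(members_b)
                cost = na * nb / (na + nb) * dist2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        cost, a, b = best
        members_a, centroid_a = clusters.pop(a)
        members_b, centroid_b = clusters.pop(b)
        na, nb = len(members_a), len(members_b)
        # Centroid of the union is the size-weighted mean of both centroids.
        centroid = [(na * u + nb * v) / (na + nb)
                    for u, v in zip(centroid_a, centroid_b)]
        clusters[next_id] = (members_a + members_b, centroid)
        merges.append((sorted(members_a + members_b), cost))
        next_id += 1
    return merges


# Hypothetical coordinates: two tight pairs that are far apart.
demo_points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
demo_merges = ward_merges(demo_points)
for members, cost in demo_merges:
    print(members, cost)
```

The same routine could be run on rows of percentages or on rows of ilr coordinates; only the input matrix changes, which is what makes the two dendrograms comparable.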
CoDaPack v. 2.01.14 (Department of Computer Science and Applied Mathematics, University of Girona, Spain) and XLSTAT 2014 (Addinsoft, Paris, France) were used for data processing and graph drawing.

Fig. 1. Segregation of treatments (processing phases and extraction systems), as described by a ternary centred plot based on the four fatty acids with the highest log-ratio variances. RM, oil extracted from the raw material (fresh fruits); FO and FOS, oils extracted from the fermented olives; PO and POS, oils extracted from packaged olives. S, samples extracted by Soxhlet; otherwise, by Abencor.
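
Centring, as commonly used for ternary centred plots, can be sketched as below. The idea is to divide each composition by the closed geometric mean of the data set and re-close it, which moves the centre of the data to the barycentre of the triangle and spreads out points that would otherwise crowd near an edge. The three-part values are hypothetical, and the functions are an illustration under that assumption, not the CoDaPack routine.

```python
import math


def closure(parts):
    """Rescale positive parts so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]


def geometric_center(data):
    """Closed vector of column-wise geometric means."""
    n = len(data)
    k = len(data[0])
    means = [math.exp(sum(math.log(row[j]) for row in data) / n)
             for j in range(k)]
    return closure(means)


def center(data):
    """Perturb every composition by the inverse of the geometric centre."""
    g = geometric_center(data)
    return [closure([x / gj for x, gj in zip(row, g)]) for row in data]


# Hypothetical three-part subcompositions dominated by the first part.
sample = [
    [0.90, 0.07, 0.03],
    [0.88, 0.08, 0.04],
    [0.92, 0.05, 0.03],
    [0.85, 0.10, 0.05],
]
centered = center(sample)
print(geometric_center(centered))
```

After centring, the geometric centre of the data sits at equal parts, so differences among treatments are easier to see than in the raw ternary diagram.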

Table 1
Sequential binary partition used for calculating the balances and the ilr transformation. In each balance, parts coded as +1 are assigned to the numerator, those coded as −1 to the denominator, and those coded as 0 do not participate. When more than one fatty acid is assigned to the numerator, the denominator, or both, the balance is based on their geometric means.