Dataset on the phytochemicals, antioxidants, and minerals contents of pecan nut cake extracts obtained by ultrasound-assisted extraction coupled to a simplex-centroid design

This article contains a dataset related to the research published in “The potential of the pecan nut cake as an ingredient for the food industry” [1]. A three-component simplex-centroid mixture design coupled with response surface methodology (RSM) was applied to generate statistical models and to analyze the dataset. The method was also applied to evaluate the effect of different solvents (ethanol, water, and acetic acid) on the extraction of bioactive compounds from pecan nut cake (PNC) and on its antioxidant activity. Furthermore, simultaneous optimization of the solvent mixture was carried out to predict the optimum point, that is, the combination of solvents that yields an extract with enhanced phytochemical composition and high in vitro antioxidant activity. The maximization of total phenolic compounds, condensed tannins, and antioxidant activity of the PNC was predicted by the desirability function. A total of 80 interactions were run to provide the best condition for optimization. The combined use of the different solvents enables a higher recovery of the compounds than their isolated use. This dataset may help other researchers apply a mixture design to recover phytochemicals from a broad range of co-products, such as defatted meals and other nut cakes, which are sometimes discarded as waste by many industries.


© 2020 The Author(s).

Value of the Data
Data show that non-toxic solvents such as ethanol, water, and acetic acid may be used as an alternative to increase the yield of phenolic compounds and the antioxidant activity of extracts obtained from the pecan nut cake in an easy approach aided by ultrasound-assisted extraction and the simplex-centroid design. Thus, the applied mathematical and statistical tools were able to provide an optimized extraction method and generate mathematical models with satisfactory prediction capability, which may be useful in the extraction of other raw materials.
Optimized values obtained from the desirability function of the extraction process offer support and may help other researchers recover bioactive compounds from different by-products. The dataset should also encourage the use of raw materials that are usually considered waste as ingredients in the food, feed, pharmaceutical, and cosmetics industries, adding value to them. The multi-response analytical optimization was shown to be a feasible strategy to improve the process conditions and to obtain a product with unique characteristics. Thus, the presentation of the statistical tools used herein is intended to support not only pecan nut researchers, but also professionals in food science and technology, microbiology, food development, sensory evaluation, and nutrition, promoting a reduction in time, labor, and operating costs. The dataset presents information and tools that help researchers estimate the influence of variables on extraction processes, reject the variables that do not seem to contribute to the quality of the final product, and optimize the process conditions in order to obtain an improved extract.

Data
The pecan nut (Carya illinoinensis (Wang.) K. Koch) cake (PNC) is a by-product of pecan nut oil extraction and is rich in bioactive compounds. Therefore, this fraction has the potential to be used for the extraction of such substances [1]. The presented dataset shows a statistical approach to the extraction procedure, employed to establish the influence of different non-toxic solvents on the phytochemicals (total phenolic compounds, TPC, and condensed tannins, CT) and antioxidant activity (reducing potential of the hydrophilic compounds, RPHC; 2,2-diphenyl-1-picrylhydrazyl, DPPH; and total reducing capacity, TRC) of PNC, as well as data on the quantification of minerals in PNC.
The manuscript is organized as follows. Subsection 1.1, Screening of variables and obtaining of a mathematical model (Tables 1 and 2; Figs. 1–3), describes the data on estimates and regression coefficients (raw, unadjusted data), correlation plots, Pareto charts, and normality of residuals. In Subsection 1.2, Optimization by desirability function (Figs. 4 and 5, and Table 3), we present data on the multi-response optimization of the mixture of solvents and trace graphs, in addition to the predicted and experimental values for the optimized data. Subsection 2.1, Preparation of the sample, describes the steps for evaluating the nutritional, mineral, microstructural, and functional properties. Fig. 6 presents a detailed flowchart with the steps of the experimental approach. In Subsection 2.2, Parameters for minerals determination (Table 4), the data on the analytical and instrumental parameters for the analysis of minerals are presented. Subsection 2.3, Statistical design of the extraction process, presents the parametric statistical techniques related to the mathematical modeling of processes using experimental design followed by multiple regression analysis, the so-called response surface methodology (RSM). The underlying requirements for assessing the fit, quality, and predictability of the generated models are also presented. Fig. 7 describes how the statistical procedure can be used as a tool for analyzing and optimizing the ultrasound-assisted extraction of the phytochemical and antioxidant content of PNC using a simplex-centroid mixture design.

Screening of variables and obtaining of a mathematical model
The data on the content of phytochemicals and the antioxidant activity obtained in the ultrasound-assisted extraction with mixtures of ethanol, acetic acid, and water, which were used for calculating the following statistics, were reported in our previous work [1]. The data used to propose mathematical equations that explain the effects of each solvent on the phytochemicals and antioxidant activity of the extracts are reported in Tables 1 and 2, and Figs. 1–3.
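For readers unfamiliar with the design, the sketch below enumerates the blends of a standard three-component simplex-centroid design. It assumes the textbook layout of 2^q − 1 points (pure components, binary 1:1 blends, and the ternary centroid); any replicated or additional points used in the original experiment are described in [1] and are not reproduced here. The function name and component labels are illustrative.

```python
from itertools import combinations


def simplex_centroid(components):
    """Return the 2**q - 1 blends of a standard simplex-centroid design."""
    q = len(components)
    design = []
    # Every non-empty subset of components is blended in equal proportions.
    for size in range(1, q + 1):
        for subset in combinations(range(q), size):
            row = [1.0 / size if i in subset else 0.0 for i in range(q)]
            design.append(dict(zip(components, row)))
    return design


# Component labels are illustrative; proportions in each blend sum to 1.
design = simplex_centroid(["ethanol", "water", "acetic_acid"])
for run, blend in enumerate(design, start=1):
    print(run, blend)
```

Each row is one solvent blend, so a response measured in triplicate on this layout would give 21 observations per response.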

Optimization by desirability function
The multiple linear regression data obtained with RSM were used to propose the simultaneous optimization, which explained the effects of each solvent on the phytochemicals and antioxidant activity of the extracts (Figs. 4 and 5). The data used to calculate the best option for obtaining a mixture with maximized antioxidant capacity are reported in Table 3.

Table abbreviations: RPHC: reducing potential of the hydrophilic compounds; TRC: total reducing capacity; R-sqr: coefficient of determination (R²); R-adj: adjusted R².

Preparation of the sample
The pecan nut sample was processed as reported by Maciel et al. [1]. After the oil removal, the obtained cake was evaluated for its nutritional and mineral compositions, microstructure, and functional properties. Then, the sample was extracted with the aid of an ultrasound system to obtain antioxidant-rich extracts according to an experimental design [1]. Finally, these extracts were evaluated for total phenolic compounds, condensed tannins, and antioxidant activity. Fig. 6 shows a schematic diagram of the analyses performed for obtaining the pecan nut cake (PNC) and its corresponding extracts.

Fig. 6. Flowchart of the experimental procedure for obtaining and analyzing pecan nut cake (PNC) and PNC extracts.

Parameters for minerals determination
A total of 9 elements were evaluated: 4 major elements (calcium, magnesium, sodium, and potassium) and 5 trace elements (zinc, manganese, copper, iron, and cobalt). The analytical and instrumental parameters are specified in Table 4.

Statistical design of the extraction process
Table 4. Analytical and instrumental parameters for the analysis of minerals by flame atomic absorption spectrometry (F-AAS) and flame atomic emission spectrometry (F-AES).

All the analyses were conducted in triplicate, and the data were expressed as original replicates or as the mean ± standard deviation. To screen the variables and obtain a mathematical model, we first evaluated the statistically significant differences using one-way ANOVA, followed by Fisher's LSD test (p ≤ 0.05) for parametric and homoscedastic data. The normality of the data was checked by the Shapiro-Wilk test, while the Brown-Forsythe test was used for homoscedasticity. Linear correlation and regression analyses were performed to verify the degree of association between responses. Linear correlations were calculated and expressed by Pearson's correlation coefficient (r), where p values below 5% were considered significant. Correlation strengths were classified as perfect (r = 1.0), strong (0.80 ≤ r < 1.0), moderate (0.50 ≤ r < 0.80), weak (0.10 ≤ r < 0.50), or very weak (r < 0.10) [2].
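The correlation-strength criteria above can be expressed as a small helper. This is a minimal sketch, not the procedure run in Statistica: it assumes the thresholds are applied to the absolute value of r so that negative correlations are graded by magnitude, and the two response vectors are made-up numbers used only to show the call.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_strength(r):
    """Grade r with the thresholds quoted in the text (assumption: applied to |r|)."""
    a = abs(r)
    if math.isclose(a, 1.0):
        return "perfect"
    if a >= 0.80:
        return "strong"
    if a >= 0.50:
        return "moderate"
    if a >= 0.10:
        return "weak"
    return "very weak"


# Hypothetical responses for seven extracts (illustrative values, not from the dataset).
tpc = [12.1, 18.4, 9.7, 22.3, 15.0, 17.2, 24.8]
dpph = [30.5, 41.0, 27.9, 48.6, 36.2, 39.9, 52.1]
r = pearson_r(tpc, dpph)
print(round(r, 3), correlation_strength(r))
```

The significance test on r (p < 5%) is a separate step and is not covered by this helper.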
RSM was applied to estimate the effects of the different solvents on the content of phytochemicals and the antioxidant activity of PNC, and to model each response as a function of the variables (types of solvents). The analysis of variance of the models was calculated, and the effects and regression coefficients of the linear, quadratic, and cubic terms were determined. Nonsignificant regression coefficients (p ≥ 0.05) were discarded, and the data were reevaluated to obtain the final model for each parameter. The statistical quality of the proposed models was evaluated by the percentage of variability explained, given by the coefficient of determination (R²), by the adjusted coefficient of determination (R²adj), and by the significance of the model (p ≤ 0.05). The lack-of-fit p value was used to verify the adequacy of each model: a lack-of-fit p value > 0.05 indicates that the model adequately fits the experimental data. In addition, the 95% confidence interval was calculated for each effect. Regression coefficients were then used to generate Pareto charts and two-dimensional contour plots for each response. The residual plots were examined for obvious patterns (predicted vs. experimental data) for all response variables, and the residuals were formally tested for normality using the Kolmogorov-Smirnov test [3].
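As an illustration of the modeling step, the sketch below fits a Scheffé special cubic mixture model, y = b1·x1 + b2·x2 + b3·x3 + b12·x1·x2 + b13·x1·x3 + b23·x2·x3 + b123·x1·x2·x3, by ordinary least squares and reports R² and adjusted R². The choice of this model form is an assumption based on the "linear, quadratic, and cubic terms" mentioned above; the coefficients and responses are simulated, not taken from Tables 1 and 2, and the significance and lack-of-fit tests performed in Statistica are not reproduced.

```python
import numpy as np


def special_cubic_terms(X):
    """Model matrix of the Scheffe special cubic model (no intercept)."""
    x1, x2, x3 = X[:, 0], X[:, 1], X[:, 2]
    return np.column_stack([x1, x2, x3, x1 * x2, x1 * x3, x2 * x3, x1 * x2 * x3])


def fit_special_cubic(X, y):
    """Least-squares fit; returns coefficients, R^2 and adjusted R^2."""
    M = special_cubic_terms(X)
    coef, *_ = np.linalg.lstsq(M, y, rcond=None)
    resid = y - M @ coef
    ss_res = float(resid @ resid)
    # Mean-corrected total sum of squares (the usual convention for mixture models).
    ss_tot = float(((y - y.mean()) ** 2).sum())
    n, p = M.shape
    r2 = 1.0 - ss_res / ss_tot
    r2_adj = 1.0 - (ss_res / (n - p)) / (ss_tot / (n - 1))
    return coef, r2, r2_adj


# Seven simplex-centroid blends (ethanol, water, acetic acid), each in triplicate.
points = np.array([
    [1, 0, 0], [0, 1, 0], [0, 0, 1],
    [0.5, 0.5, 0], [0.5, 0, 0.5], [0, 0.5, 0.5],
    [1 / 3, 1 / 3, 1 / 3],
])
X = np.repeat(points, 3, axis=0)

# Simulated response: made-up coefficients plus random noise (NOT the published data).
rng = np.random.default_rng(0)
true_coef = np.array([20.0, 35.0, 10.0, 40.0, 15.0, 25.0, 90.0])
y = special_cubic_terms(X) @ true_coef + rng.normal(0.0, 0.5, len(X))

coef, r2, r2_adj = fit_special_cubic(X, y)
print("coefficients:", np.round(coef, 2))
print("R2:", round(r2, 4), "adjusted R2:", round(r2_adj, 4))
```

Dropping a nonsignificant term amounts to removing its column from the model matrix and refitting, which also frees degrees of freedom for a lack-of-fit test.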
After modeling the responses, the optimization steps based on the desirability function were performed. The maximization of TPC, CT, and antioxidant activity of the PNC was predicted by the desirability function and its d-value, which measures how closely the proposed formulation conforms to the main goal of the optimization. A total of 80 interactions were run to provide the best condition for optimization. Then, the trace plot and multi-response optimization graph were generated for each response variable. Finally, the values obtained with the optimization were validated experimentally against the predicted values. Statistica v. 10.0 (StatSoft Inc., USA), Microsoft Office Excel® v. 2016 (Microsoft Inc., USA), and Action v. 2.9 (Statcamp, Brazil) were used for data processing. A summary of all the steps used to apply the statistical tools is presented in Fig. 7.
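The idea behind the multi-response optimization can be sketched as follows. This is an assumption-laden illustration, not the Statistica routine: it uses a Derringer-Suich style "larger is better" desirability for each response, combines the individual values through their geometric mean, and searches a regular grid of blends. The three predictive models and the lower and target limits are hypothetical placeholders, not the fitted models of Table 3.

```python
import numpy as np


def desirability_max(y, low, high, weight=1.0):
    """Larger-is-better desirability: 0 at or below `low`, 1 at or above `high`."""
    d = (float(y) - low) / (high - low)
    return float(np.clip(d, 0.0, 1.0) ** weight)


def overall_desirability(ds):
    """Geometric mean of individual desirabilities (0 if any response is unacceptable)."""
    ds = np.asarray(ds, dtype=float)
    return float(np.prod(ds) ** (1.0 / ds.size))


def simplex_grid(steps):
    """Yield every (ethanol, water, acetic acid) blend on a regular grid summing to 1."""
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            yield (i / steps, j / steps, (steps - i - j) / steps)


# Hypothetical fitted models (placeholders, NOT the published equations).
def predict_tpc(e, w, a):
    return 20 * e + 35 * w + 10 * a + 40 * e * w


def predict_ct(e, w, a):
    return 15 * e + 12 * w + 8 * a + 30 * e * a


def predict_dpph(e, w, a):
    return 50 * e + 60 * w + 30 * a + 45 * e * w + 20 * w * a


# response: (model, lowest acceptable value, target value); all limits are made up.
targets = {
    "TPC": (predict_tpc, 10.0, 40.0),
    "CT": (predict_ct, 8.0, 20.0),
    "DPPH": (predict_dpph, 30.0, 70.0),
}


def score(blend):
    """Overall desirability D of one solvent blend."""
    return overall_desirability(
        [desirability_max(model(*blend), low, high) for model, low, high in targets.values()]
    )


best = max(simplex_grid(20), key=score)
print("best blend (ethanol, water, acetic acid):", best)
print("overall desirability D:", round(score(best), 3))
```

Because the geometric mean is zero whenever one response is unacceptable, the search favors blends that perform reasonably on all responses rather than excelling on a single one, which is the point of optimizing TPC, CT, and antioxidant activity simultaneously.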