Hyperspectral Imaging to Characterize Table Grapes

Table grape quality is important for consumers and thus for producers. Objective quality is usually determined by destructive methods, mainly based on sugar content. This study evaluated the potential of hyperspectral imaging to characterize table grape quality through total soluble solids (TSS), total flavonoid (TF), and total anthocyanin (TA) contents. Different data pre-treatments (WD, SNV, and 1st and 2nd derivatives) and different methods were tested to obtain the best prediction models: PLS with the full spectra, and then Multiple Linear Regression (MLR) after selecting the optimal wavelengths using the regression coefficients (β-coefficients) and the Variable Importance in Projection (VIP) scores. All models performed well, showing that hyperspectral imaging is a relevant method to predict sugar, total flavonoid, and total anthocyanin contents. The best predictions were obtained from optimal wavelength selection based on β-coefficients for TSS and from VIP optimal wavelength windows using SNV pre-treatment for total flavonoid and total anthocyanin contents. Thus, good prediction models were proposed to characterize grapes while reducing the data sets and limiting data storage to enable industrial use.


Introduction
Grapes are one of the most consumed fruits in the world, as fresh fruit, grape juice, raisins, and wine. About 36% of grape production is destined for fresh fruit consumption (International Organization of Vine and Wine statistics). The European production of table grapes (~1.9 million tons) is mainly located in the Mediterranean area and is dominated by Italy (61%), followed by Greece (16%), Spain (15%), and France (1.5%) [1]. The French production of table grapes is mostly located in Vaucluse and Tarn-et-Garonne. About 80% of the production concerns only three varieties: Alphonse Lavallée, Chasselas, and Muscat de Hambourg. French table grape production (~30,000 tons) represents approximately 40% of the national consumption, while the remaining 60% is mainly imported from Spain and Italy.
The right time for the commercial harvest of table grapes is usually determined by different parameters such as skin color, texture softening, titratable acidity, total soluble solids, and sometimes flavonoid content and aromatic compounds [2,3]. Visual attributes of table grapes, such as intensity and uniformity of color, large berry size, and brightness, are the main characteristics that influence consumer choice [4,5]. Color is of high importance to assess quality in the food industry [6]. Furthermore, some studies have found clear evidence that a greater consumption of fresh grapes decreases the risk of cardiovascular diseases and cancer [7,8]. This beneficial effect is mainly related to the presence of minerals, fibers, vitamins, and phytochemical compounds including flavonoids and anthocyanins [9,10]. However, the concentration of these quality attributes changes during postharvest storage and thus influences the sensory perception and nutritional value of table grapes.

I. Developing partial least squares (PLS) models to validate the correlation between hyperspectral imaging spectra and Total Anthocyanin (TA) content, Total Flavonoid (TF) content, and Total Soluble Solids (TSS), using the visible and short-wave near-infrared region;
II. Selecting the lowest number of optimal wavelengths, based on regression coefficient (RC) and Variable Importance in Projection (VIP) algorithms, which gave the highest correlation between the spectral data and the three selected quality parameters;
III. Developing Multiple Linear Regression (MLR) models using spectra from only the optimal wavelengths and then checking the validation of the developed calibration models.
The novelty of this study was to assess the potential of hyperspectral imaging to supply prediction models for quality parameters (TF, TA, and TSS) usable for all table grapes. Moreover, the use of specific wavelengths rather than the full spectra for these products represented a new approach.

Samples
Seven table grape varieties were bought in regional markets at commercial harvest ripeness: three white table grapes (Sugarone Superior Seedless, Thompson Seedless, and Victoria) and four red/black table grapes (Sable Seedless, Alphonse Lavallée, Lival, and Black Magic). Alphonse Lavallée and Lival were chosen because they are French cultivars produced in the south-east of France and mostly consumed throughout the country. The other 5 cultivars were chosen because they are widely distributed around the world. Approximately 5 kg of randomly selected clusters were sampled for each cultivar. A subsample of 50 berries of each variety, with short attached pedicels, was collected from different bunch parts (shoulders, middle, and bottom). Grapes were then washed, gently dried with absorbent paper, and stored at 4 °C until the HIS acquisitions.
For the 7 varieties, 50 berries of each were analyzed in triplicate by hyperspectral imaging and then chemically analyzed, leading to 350 mean spectra, 350 TF and TSS values, and 200 TA values. Then, PLS and MLR were applied to the pre-treated data.

Hyperspectral Imaging System (HIS)
The system is composed of the following components (Figure 1): (a) a hyperspectral imaging camera (Pika L, Resonon, Bozeman, MT, USA) coupled with an objective lens (Xenoplan 1.4/23, Schneider-Kreuznach, Bad Kreuznach, Germany); (b) an illumination unit, which consists of four 35 W quartz tungsten halogen (QTH) MR16 lamps adjusted at an angle of 45° to illuminate the camera's field of view; (c) a mounting tower; and (d) a transport stage (PS-12-20-1.0, Servo Systems Co., Rockaway, NJ, USA), with motor (DMX-J-SA-17, Arcus Technology Inc., Livermore, CA, USA). The sensor has 900 spatial channels, each with 281 spectral channels covering the range from 387 to 1026 nm. The maximum spectral resolution is 2.1 nm. The camera was set up at 450 mm from the target. The spectral images were collected in a dark room where only the halogen light source was used. The HIS was controlled by a PC with the software SpectrononPRO (Resonon, Bozeman, MT, USA) for image acquisition.

Image Acquisition
The samples were kept at room temperature (20 °C) for 1 h prior to image acquisition in reflectance mode. The hyperspectral image of each sample (one berry) was recorded in three different berry positions corresponding to berry rotations of approximately 120° between positions. Berry reflectance was measured along the berry "equator" when considering the pedicel to be the "pole." This is a common practice reported by several articles [38,39]. The hyperspectral images were recorded by the SpectrononPRO software (Resonon, Bozeman, MT, USA) using an exposure time of 12 ms and a stage speed of 11 mm s⁻¹ with a gain of 10. Only the spectral data in the wavelength range of 411-1000 nm were used in the data analysis, in order to remove the noise and data redundancy found outside this range. For each sample (50 berries), three reflectance spectra were collected, corresponding to the berry rotations, and averaged over the spatial dimension.

Preprocessing of Hyperspectral Images
All the acquired images were processed and analyzed using SpectrononPro 5.1 Hyperspectral Imaging System software (Resonon, Bozeman, MT, USA). The hyperspectral images were first corrected with a white and a dark reference (WD). The dark reference was used to remove the effect of the dark current of the CCD detectors, which are thermally sensitive.
The corrected reflectance (R) is estimated using Equation (1):

$$R = \frac{S - D}{W - D} \tag{1}$$

where S is the intensity of an image, W is the intensity of the white reference image (Teflon white board with 99% reflectance), and D is the intensity of the dark reference image (with 0% reflectance) recorded by turning off the lighting source with the lens of the camera completely covered. The corrected reflectances were the basis for the subsequent image analysis to extract the spectral response of each fruit, select effective wavelengths, and predict physicochemical parameters.
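The white and dark correction described above can be illustrated with a minimal sketch. It assumes the usual flat-field form R = (S − D)/(W − D) applied band by band; the function name `correct_reflectance` and the intensity values are hypothetical and are not taken from the acquisition software.

```python
def correct_reflectance(sample, white, dark):
    """Apply a white/dark reference correction band by band.

    Assumed form: R = (S - D) / (W - D), where S, W and D are the raw
    sample, white reference and dark reference intensities.
    """
    corrected = []
    for s, w, d in zip(sample, white, dark):
        span = w - d
        if span == 0:
            raise ValueError("white and dark references are equal in one band")
        corrected.append((s - d) / span)
    return corrected


# Hypothetical raw intensities for three spectral bands of one pixel.
raw = [120.0, 300.0, 510.0]
white = [1020.0, 1000.0, 1010.0]
dark = [20.0, 0.0, 10.0]
reflectance = correct_reflectance(raw, white, dark)
```

In practice the same operation would be applied to every pixel of the hyperspectral cube before the region of interest is averaged.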

Immediately after image acquisition, each berry was subjected to the determination of Total Soluble Solids (TSS), Total Flavonoids (TF), and Total Anthocyanins (TA). Each berry was weighed and manually peeled, and the juice was collected separately. Total Soluble Solids were measured with a portable refractometer (Mettler Toledo Refracto 30PX) with an uncertainty of 0.2 °Brix. The skins were separately weighed and extracted four times with 7.5 mL of hydrochloric ethanol solution (ethanol/water/hydrochloric acid 70/30/1 v/v/v). The samples were shaken for 60 min with a horizontal shaker VXR vibrax (IKA-Werke, Staufen, Germany) at 1500 rpm and centrifuged at 5000 rpm for 5 min, and the supernatant was collected in a volumetric flask. The supernatants were pooled, brought to a volume of 25 mL, and stored at −80 °C until analysis. The quantification of TA and TF was carried out spectrophotometrically by recording the UV-visible spectra in the range of 220-700 nm using a Safas UV mc 2 spectrophotometer (Safas, Monaco City, Monaco) and measuring the absorption values at 280 and 520 nm, as previously reported [40]. The results were expressed as mg (+)-catechin equivalents/kg fresh grape and mg malvidin-3-O-glucoside equivalents/kg fresh grape for the flavonoids and anthocyanins, respectively.

• Collecting spectral data
Only regions of interest (ROIs) were collected, as already described [41], and an average spectrum was calculated by averaging the relative reflectance spectra.

• Spectra pre-treatments
To overcome or reduce unwanted spectral variation, baseline shifts, and various noise, a series of pre-treatment methods was applied to the mean spectral data to decrease the influence of high-frequency random noise, sample nonuniformity, and surface scattering. Before building the validation model, Equations (2) to (4) were used for spectral pre-treatments [42]:
SNV: Standard Normal Variate. The average intensity ($A_{mean}$) and standard deviation ($A_{SD}$) of the spectrum are calculated and inserted in Equation (2):

$$A_i^{SNV} = \frac{A_i - A_{mean}}{A_{SD}} \tag{2}$$

1st derivative: The first derivative $A'_i$ was calculated using the symmetric difference quotient (3):

$$A'_i = \frac{A_{i+1} - A_{i-1}}{2h} \tag{3}$$

2nd derivative: The second derivative $A''_i$ was calculated using the symmetric difference quotient for the second derivative (4):

$$A''_i = \frac{A_{i+1} - 2A_i + A_{i-1}}{h^2} \tag{4}$$

With i = 1 to N, N being the number of samples, and h being the spacing between adjacent spectral points.
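The three pre-treatments can be sketched in plain Python as follows. This is only an illustration under stated assumptions: SNV uses the sample standard deviation (n − 1), the derivatives use the standard symmetric difference quotients with a spacing `step` between adjacent spectral points, and the two end points of the spectrum are dropped. The function names and the toy spectrum are hypothetical.

```python
import statistics


def snv(spectrum):
    """Standard Normal Variate: centre and scale one spectrum."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)  # assumption: sample SD (n - 1)
    return [(a - mean) / sd for a in spectrum]


def first_derivative(spectrum, step=1.0):
    """Symmetric difference quotient; the two end points are dropped."""
    return [
        (spectrum[i + 1] - spectrum[i - 1]) / (2 * step)
        for i in range(1, len(spectrum) - 1)
    ]


def second_derivative(spectrum, step=1.0):
    """Second-order symmetric difference; the two end points are dropped."""
    return [
        (spectrum[i + 1] - 2 * spectrum[i] + spectrum[i - 1]) / step ** 2
        for i in range(1, len(spectrum) - 1)
    ]


# Toy spectrum (hypothetical reflectance values at five wavelengths).
spectrum = [1.0, 4.0, 9.0, 16.0, 25.0]
scaled = snv(spectrum)
d1 = first_derivative(spectrum)
d2 = second_derivative(spectrum)
```

After SNV each spectrum has zero mean and unit standard deviation, which is what removes multiplicative scatter differences between berries.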

Model establishment
Chemometrics is widely used to model spectral data and is considered a standard procedure for building predictive models in the analysis of hyperspectral images. The partial least squares (PLS) analysis between one quality attribute (TA, TF, or TSS) and the spectral data (average spectra with 276 wavelengths in the range from 411 to 1000 nm) was conducted using XLSTAT software (Addinsoft, Paris, France, 2019). No outlier detection was performed, in order to keep all spectra and the heterogeneity due to the plant material.
A total of 350 mean reflectance spectra were obtained from 350 berries. The calibration and validation sets were established by ordering the fruit samples according to their physicochemical references. Briefly, 4 samples per variety, i.e., 28 samples in total, were randomly selected for the prediction set. The two highest and two lowest values were assigned to the calibration set. Afterward, two-thirds of the samples were randomly selected as calibration data and one-third of the samples were defined as validation data in a 2:1 leave-one-out procedure. Thus, the calibration set and the validation set were independent.
PLS regression used to develop calibration models was carried out with two calibration sample sets: (i) N = 207 samples for TF and TSS and (ii) N = 116 samples for TA. The PLS models for TF and TSS took into account both the white and red table grape cultivars, while the models for TA considered only the red and rosé grape cultivars, since white grapes do not have anthocyanins. To reduce the probability of overfitting the experimental data [43], PLS models with 1-15 latent variables (LVs) were fitted, and the model with the number of PLS factors that maximized the coefficient of determination for the calibration (R²cal) and minimized the root mean square error of calibration (RMSEC) was selected. These two parameters allowed the evaluation of the models.

• Hyperspectral imaging model validation
Two validation sets (N = 103 samples for TF and TSS; N = 56 samples for TA) were used to calculate the root mean square error of validation (RMSEV), the coefficient of determination (R²val), the Bias, and the Ratio Performance Deviation (RPD) of the PLS models following [42], where N is the number of samples, R is the number of PLS factors, $y_i^{ref}$ is the reference value for sample i, and $y_i$ is the predicted value for sample i.
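As an illustration of these figures of merit, the sketch below computes them from reference and predicted values using commonly used definitions, which are assumptions here and may differ in detail from those of [42]: RMSE as the root of the mean squared residual, Bias as the mean residual, R² as one minus the ratio of residual to total sum of squares, and RPD as the standard deviation of the reference values divided by the RMSE. The function name and the numbers are hypothetical.

```python
import math
import statistics


def validation_metrics(reference, predicted):
    """Return RMSE, bias, R2 and RPD for paired reference/predicted values."""
    n = len(reference)
    residuals = [p - r for r, p in zip(reference, predicted)]
    ss_res = sum(e * e for e in residuals)
    rmse = math.sqrt(ss_res / n)
    bias = sum(residuals) / n
    mean_ref = statistics.fmean(reference)
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    r2 = 1.0 - ss_res / ss_tot
    # Assumption: RPD = sample SD of the reference values / RMSE.
    rpd = statistics.stdev(reference) / rmse
    return {"rmse": rmse, "bias": bias, "r2": r2, "rpd": rpd}


# Hypothetical TSS values (g/100 g) for four validation berries.
reference = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]
metrics = validation_metrics(reference, predicted)
```

The same function can be applied to the calibration, validation, and prediction sets; only the sample subset changes.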

• Hyperspectral imaging prediction
The quality of prediction of the models was tested using 4 samples per variety. The level of prediction is discussed based on R²pr and RMSEP values. The RMSEP was calculated as follows:

$$RMSEP = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - y_i^{ref}\right)^2}$$

• Selection of optimal wavelengths
Spectral wavelengths in hyperspectral images are characterized by their large degree of dimensionality with collinearity and redundancy. Researchers are often interested in finding the most important wavelengths which contribute to the evaluation of quality parameters and eliminate wavelengths having no discrimination power. After proving the good performance of the PLS models on the validation set, the next step was to select only the wavelengths that showed the maximum spectral information.
The regression coefficients (RC), also called β-coefficients, and the Variable Importance in Projection (VIP) scores were applied to select the most informative wavelengths from the best PLS calibration model built with the full spectrum as variables. The wavelengths that corresponded to the highest absolute values of β-coefficients were considered optimal wavelengths [44]. Based on the studies conducted by Olah et al. [45], all wavelengths at which the VIP scores were above a threshold of 1.0 (highly influential) were selected and compared with those identified using β-coefficients. In this study, only the wavelengths with the highest β-coefficients (absolute values) on the one hand, and the highest VIP scores (above the threshold of 1.0) on the other hand, were selected to establish Multiple Linear Regression (MLR) models, instead of using the whole spectral range. Moreover, all the wavelengths with a VIP score above 1 (spectral windows) were also used to build another PLS regression model in order to improve its performance.
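The two selection rules can be sketched as follows, assuming the β-coefficients and VIP scores have already been exported from the full-spectrum PLS model. The helper names, the number of wavelengths kept, and all numeric values are hypothetical; only the VIP threshold of 1.0 comes from the text.

```python
def select_by_beta(wavelengths, betas, n_keep):
    """Keep the n_keep wavelengths with the largest absolute beta-coefficient."""
    ranked = sorted(zip(wavelengths, betas), key=lambda p: abs(p[1]), reverse=True)
    return sorted(w for w, _ in ranked[:n_keep])


def select_by_vip(wavelengths, vip_scores, threshold=1.0):
    """Keep every wavelength whose VIP score exceeds the threshold."""
    return [w for w, v in zip(wavelengths, vip_scores) if v > threshold]


def vip_windows(wavelengths, vip_scores, threshold=1.0):
    """Group contiguous wavelengths with VIP > threshold into (start, end) windows."""
    windows, start, last = [], None, None
    for w, v in zip(wavelengths, vip_scores):
        if v > threshold:
            if start is None:
                start = w
            last = w
        elif start is not None:
            windows.append((start, last))
            start = None
    if start is not None:
        windows.append((start, last))
    return windows


# Hypothetical wavelengths (nm), beta-coefficients and VIP scores.
wavelengths = [411, 413, 415, 417, 419, 421]
betas = [0.01, -0.30, 0.12, 0.02, 0.25, -0.05]
vips = [0.4, 1.2, 1.5, 0.9, 1.1, 0.3]

beta_selection = select_by_beta(wavelengths, betas, n_keep=2)
vip_selection = select_by_vip(wavelengths, vips)
windows = vip_windows(wavelengths, vips)
```

The individual wavelengths would feed an MLR model, whereas the contiguous windows would feed a reduced PLS model.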

Statistical Analyses
One way ANOVA on quality attributes of table grapes was performed with XLSTAT 2019.1 software (Addinsoft, Paris, France). Mean values were separated with Tukey's test (p < 0.05) to present the significant differences between varieties.

Grape Composition
Berries from each grape variety were characterized by their sugar content (Total Soluble Solids (TSS)), their Total Flavonoid content (TF), and their Total Anthocyanin content (TA). Table 1 shows that the selected varieties had different total flavonoid contents, from 201 mg/kg FW for Victoria grapes to 1642 mg/kg FW for Lival grapes, with white grapes presenting the lowest phenolic concentration, as expected. This result is in agreement with Mikulic-Petkousek et al. [46], who showed that the Victoria variety has a low phenolic content. Similarly, a wide range of total anthocyanin content was observed, from 217 mg/kg FW for Alphonse Lavallée to 590 mg/kg FW for Sable Seedless. Sugar concentration ranged from 14.0 g/100 g (Victoria) to 24.8 g/100 g (Alphonse Lavallée), corresponding to the ripening level [2]. Statistics showed that TF, TA, and TSS were significantly dependent on the grape cultivar.

Spectral Profiles
The mean reflectance spectrum of each grape variety is presented in Figure 2. These spectra obtained by HIS showed clear differences between the grape varieties, as already reported by Baiano et al. [47] on 7 other varieties. White grapes exhibited high reflectance from about 500 to 650 nm, in contrast to red grapes. Indeed, chlorophyll pigments absorb around 540 nm, giving the green-yellow color to these varieties, as hypothesized by Costa et al. (2019) [48]. All grapes presented a much higher reflectance percentage between 700 and 950 nm, with intensities of reds and whites intermixed, but varieties showed similar trends depending on their color: whites had a higher intensity around 700-720 nm, which decreased up to 950 nm, and reds showed a flattened bell curve with a maximum around 820 nm. The absorption band at 840 nm is mainly due to sugar [47] and the absorption above 960 nm is due to water [48,49].

Modelization of Table Grape Composition Using the Whole Spectral Range of 411-1000 nm
PLS models were developed to establish the relationship between the spectral data and the corresponding TA, TF, and TSS contents analyzed by conventional chemical methods. First, the whole dataset was used to select the best pre-treatment for each quality parameter. The results are reported in Table 2. Five parameters were used to select the best model: R², LVs, RMSE, RPD, and Bias. RMSE has to be minimized and RPD has to be maximized [48]. HIS data were relevant [49] for modeling Total Flavonoid content, Total Anthocyanin content, and TSS, since all determination coefficients (R²cal and R²val) were over 0.87 (Table 2). All pre-treatments showed good results. Thus, for similar ranges of R² and RMSE, we selected the pre-treatment leading to the lowest number of LVs, that is to say the SNV pre-treatment for all quality parameters. In detail, for TF, the SNV pre-treatment used only 9 LVs, with R²cal = 0.94, R²val = 0.93, and RMSEV = 141 mg/kg. For Total Anthocyanins, the SNV model was characterized by R²cal = 0.93, R²val = 0.95, and RMSEV = 47 mg/kg with only 3 LVs. Finally, for TSS, the model, with 10 latent variables, gave R²cal = 0.94 and R²val = 0.91 with RMSEV = 1.1 g/100 g. As for the RPD, the selected pre-treatments (mainly WD) generated values close to 4, which suggests the capability of the models to provide a good quantification and satisfactory prediction of TF, TA, and TSS [50,51]. The relatively low number of LVs of the models, particularly for TA, and the fact that the models were built using grape berries of seven different cultivars contributed to the robustness of the models. Moreover, measured data vs. validated data were plotted for the three selected models (Figure 3). These graphs validated the selected models, supporting the ability of hyperspectral imaging data to predict TF, TA, and TSS in table grapes.
It is interesting to note that a bimodal effect could be observed for TF and TSS. In the case of TF, this phenomenon is due to the white varieties, since they had the lowest TF values (as expected) compared to the other varieties. Nevertheless, the prediction of low TF values could be considered uncertain, since TF concentrations below 500 mg/kg did not seem to follow a linear trend, even if they contribute positively to the model (data not shown). For TSS, the effect was due to the rosé grapes, since they presented the highest TSS values.

The predictions of these quality parameters were good, since the determination coefficients obtained ranged between 0.92 and 0.98 for the pre-treatments selected above. The RMSEP was of the same order of magnitude as the errors of the validation and calibration sets, and even slightly better: the RMSEP was 33 mg/kg for TA and 0.9 g/100 g for TSS, whereas the RMSEV was 47 mg/kg for TA and 1.1 g/100 g for TSS. This result shows that the method used for the validation was sound. These results also showed that all grape varieties could be gathered in a single model.

Modelization of Table Grape Composition from Optimal Wavelengths Obtained by β-Coefficients
Hyperspectral data, with hundreds of contiguous wavelengths for each image pixel, are a major challenge for data processing. Therefore, the selection of optimal wavelengths is very important to reduce the computation time, to simplify the prediction model, and to allow real-time inspection [52]. In this section, regression coefficients (RC) resulting from full-spectrum PLS models were used to select the key wavelengths in order to establish the Multiple Linear Regression (MLR) models. Figure 4 shows the values of the β-coefficients for the quality attributes Total Flavonoids, Total Anthocyanins, and TSS from the HIS data. The optimal wavelengths are those having the highest absolute values of β-coefficients (framed in the figure). Table 3 presents the accuracy and robustness of the RC-MLR models built using the selected wavelengths. The model for TF showed R² = 0.94 and 0.95 for the calibration and validation sets, respectively, and RMSEV = 128 mg/kg. For TA, the model had R²cal of 0.93 and R²val of 0.95 with RMSEV = 48 mg/kg, and the model for TSS presented R²cal = 0.95, R²val = 0.93, and RMSEV = 1.0 g/100 g. To visualize these models, measured data vs. validated data were plotted (Figure 5). The correlation between the spectral data and the Total Flavonoid content (R²val = 0.95, Figure 5A), the Total Anthocyanin content (R²val = 0.95, Figure 5B), and TSS (R²val = 0.93, Figure 5C) was good, with points concentrated on the line y = x and a relatively narrow scattering of data showing the low error of the model. As for Figure 3, the bimodal effect for TF and TSS was observed in Figure 5. These results seem obvious, since the reduction in wavelengths for the model should not lead to a loss of information.
Chemosensors 2021, 9, x FOR PEER REVIEW 12 of 21

Thus, our models for TF, TA, and TSS showed good quantification and good prediction potential, as indicated by their RPD values (Table 3) [49-51]. However, the Bias values are rather large for TA, although this could be improved.
The prediction of the data from the test set also showed good results, with R² over 0.93. The RMSEP values obtained were in the range of RMSEC and RMSEV, with values slightly higher for TF but lower for TA and TSS.
Although approximately 92.0% of the variables were eliminated, the MLR models performed well. Compared to the full spectra, the MLR models were better for model building and prediction for TA, worse for TSS, and similar for TF. The improvement observed in some cases using MLR could be attributed to the use of the optimal wavelengths only, neglecting unnecessary wavelengths and thereby mitigating the problems of collinearity and overfitting [53]. Therefore, the regression coefficient algorithm appears useful and effective for the selection of key wavelengths in predicting TF, TA, and TSS contents in table grapes.
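To make the MLR step concrete, the following minimal sketch fits an ordinary least squares model with an intercept on the reflectance at a few selected wavelengths by solving the normal equations. It is an illustration only, not the XLSTAT procedure used in the study; the function names and the toy data (two hypothetical wavelengths, five berries) are assumptions.

```python
def fit_mlr(X, y):
    """Ordinary least squares with intercept, solved via the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(A[k][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("collinear predictors: X'X is singular")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for k in range(p):
            if k != col:
                factor = A[k][col]
                A[k] = [vk - factor * vc for vk, vc in zip(A[k], A[col])]
    return [A[i][p] for i in range(p)]  # [intercept, b1, b2, ...]


def predict_mlr(coeffs, X):
    """Predict the quality attribute from reflectance at the selected wavelengths."""
    return [coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], r)) for r in X]


# Toy data: reflectance at two hypothetical wavelengths for five berries,
# generated without noise from y = 2 + 3*x1 - 1*x2.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 5.0]]
y = [5.0, 1.0, 4.0, 7.0, 6.0]
coeffs = fit_mlr(X, y)
fitted = predict_mlr(coeffs, X)
```

With strongly collinear neighbouring wavelengths the normal equations become ill-conditioned, which is one reason for keeping only a few well-separated wavelengths in an MLR model.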

Modelization of Table Grape Composition from Optimal Wavelengths Obtained by VIPs Score
The VIP scores resulting from the best preprocessing PLS regression model were used to develop a robust model by selecting feature-related wavelengths for TF, TA, and TSS of table grapes. The performance of the model developed by MLR depended largely on the cut-off value of the VIP scores. Generally, the "greater-than-one" rule is used to identify optimal wavelengths [54]. Only the wavelengths with the highest VIP scores, above the threshold of 1.0, were selected to establish MLR models, whereas all the wavelengths with VIP scores above 1 (spectral windows) were selected to build a new PLS model. As shown in Figure 6, the optimal wavebands selected from all 283 wavebands were 10 [...]
Table 4 presents the accuracy and robustness of the MLR models for TF, TA, and TSS based on VIP scores. The model for the quality attribute TF led to R² of 0.90 for both calibration and validation sets, with RMSEV = 178 mg/kg. The model for TA content showed R² of 0.93 for the calibration set and 0.95 for the validation set, with RMSEV = 37 mg/kg. For the sugar content (TSS), the VIPs-MLR model had R²cal equal to 0.86 and R²val of 0.83, with RMSEV = 1.6 g/100 g.
Table 4. Performance of MLR models for predicting Total Flavonoids (TF), Total Anthocyanins (TA), and the Total Soluble Solids (TSS) using the optimal wavelengths extracted from VIPs of the best PLS full spectra analysis.
The MLR models based on the VIPs wavelength selection showed RPD values close to or higher than 2.5, which indicated that these models were good enough to be of high utility [52]; the RPD was over 4 for TA, showing a good prediction potential. However, these results showed a lower validation accuracy for the TF and TSS models compared to the full-spectrum PLS and RC-MLR models. In contrast, the VIPs-MLR model for TA was much better than the others, with an RMSEV of only 37 mg/kg instead of 47 mg/kg in the case of the full spectra.
Once more, to check the quality of the models, the measured data vs. validated data were plotted (Figure 7). All graphs showed that the validated data fitted the measured data. The model is particularly good for TA, with a narrower spread of the data. Again, the bimodal effect can be observed for TF and TSS. In addition, as for Figure 3, the data show that information was not lost with the reduction in wavelengths.
The prediction was good for all quality parameters (R² > 0.86) (Table 4). The prediction errors were, however, higher for TF and TSS compared to those obtained with the full spectra and the MLR models, whereas the prediction was improved for TA.

Variable
The last trial was to select all the wavelengths with a VIP score above 1 (spectral windows). New VIPs-PLS models were then built ( [...] [55]. Figure 8 shows the measured data vs. validated data curves for these best models. Again, the models fitted the measured data well, since the data spread is rather narrow for all three parameters, suggesting good validation models from HIS data restricted to specific windows. The simplified VIPs-PLS models showed a slight increase in the validation accuracy of TF and TA compared to the full-spectrum PLS models, in terms of determination coefficient, RMSE, and RPD values. However, the best validation model for TSS was built using the whole spectral data.

Concerning the prediction ability of the VIPs-PLS models, Table 5 showed that the determination coefficients were again over 0.94, with errors in the same range as for the calibration and validation sets. The prediction models were even better with this method using spectral windows than with the full spectra, considering both R² and RMSEP for all three quality parameters TF, TA, and TSS.

Discussion
The possibility of using the full spectra from HIS to generate a relevant PLS model to predict the sugar content was indeed reported by Baiano et al. [47] using the same device. These authors developed calibration models able to predict the TSS of white and red table grapes with R²val of 0.94 and 0.93, respectively. Our method was, however, valid for all grape varieties, including reds, rosés, and whites, which would be easier to manage from an industrial point of view. In addition, the results of the present study were comparable to those of another work carried out by Gomes et al. [56], in which the prediction of TSS in wine grapes was performed using two different model development techniques, i.e., PLS regression and Neural Networks. The obtained R² of prediction was 0.92 for both PLS regression and Neural Networks, with RMSEP of 0.94 °Brix and 0.96 °Brix, respectively. Good correlations were likewise achieved in numerous other works on the prediction of TSS for table and wine grapes [24,38,57,58].
Other authors have also reported good performance of linear models to predict the total anthocyanin content, with R²CV > 0.94 using spectral data in the Vis-NIR [59] and NIR ranges [60], or the total phenol content, with R²CV = 0.89 using spectral data in the Vis-NIR range [57,61]. Moreover, several studies also reported very good performance of nonlinear models to predict the TA content in whole Port and Cabernet Sauvignon wine grapes using a hyperspectral imaging device in the Vis-NIR range [38,56,62]. Thus, our results were at least as good as those of other works and, for the first time, showed the relevance of HIS for red and white table grapes.
Our results highlighted not only that hyperspectral imaging is a relevant method to assess TA, TF, and TSS content but also that data reduction is possible using the MLR method with β-coefficients (RC method) or the variable importance in projection (VIP). RC methods were already reported to be relevant to predict sugar content in lychee fruit [55] and the total polyphenol concentration in cocoa beans [60,63]. VIP selection (specific windows) was applied by Sen and co-workers [64] to build OPLS models for the prediction of chemical parameters of wine by combined use of visible and mid-infrared (MIR) spectroscopies. These authors built models able to predict anthocyanin compounds, total phenol content, and TSS of red wine with R²val ranging between 0.77 and 0.96. Our work is thus consistent with the previous studies and showed for the first time that reducing data from HIS, thanks to VIP or β-coefficients, is suitable for table grapes. No similar results have been found in table grapes for the control of total flavonoids and total anthocyanins, although they have been found in wine grapes and other matrices with errors of the same order of magnitude [24,59,60,65].
Looking now at the other quality factor of a calibration model, the measurement of TSS by refractometry led to a standard deviation ≤ 1.8 (Table 1), which included the uncertainty due to the refractometer and to the heterogeneity of the berries. Using full spectra, the PLS model led to an RMSEP of only 0.9. The reduction in the number of wavelengths lowered it to 0.7 °Brix for the β-coefficients wavelength selection. For TA, the lowest RMSEP was obtained with VIPs-PLS (27 mg/kg), followed by both full spectra and VIPs-MLR from optimal wavelengths (33 mg/kg). For TF, the prediction models were much better for high levels of flavonoid content. RMSEP decreased from 374 mg/kg (reference method, Table 1) to 128 mg/kg with VIP scores using specific windows or 149 mg/kg with β-coefficients. However, for very low concentrations of flavonoids, as in the Victoria variety, the models induced higher RMSEP.
Hyperspectral imaging is a tool that could provide relevant on-line information about total flavonoids, total anthocyanins, and total soluble solids through the use of consistent validation models. The use of full-spectra models generated with SNV pretreatment, together with the fact that the models were built using grape berries of seven different cultivars, contributed to the robustness of our models. The possibility of using the same pretreatment for all parameters and all varieties is interesting, as it could limit the complexity of the method and avoid mistakes in professional use.
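For readers unfamiliar with the SNV pretreatment mentioned above, a minimal sketch is given here. It assumes the usual definition of the standard normal variate transform, in which each spectrum is centred on its own mean and divided by its own sample standard deviation. The exact implementation used in this study is not described in this section, and the function name and values are hypothetical.

```python
import math

def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own statistics."""
    n = len(spectrum)
    if n < 2:
        raise ValueError("a spectrum needs at least two points")
    mean = sum(spectrum) / n
    # Sample standard deviation (n - 1 in the denominator), an assumed convention.
    sd = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    if sd == 0:
        raise ValueError("a flat spectrum cannot be scaled")
    return [(x - mean) / sd for x in spectrum]

# Two hypothetical spectra that differ only by a multiplicative factor
# give the same SNV-corrected profile.
print(snv([1.0, 2.0, 3.0]))     # [-1.0, 0.0, 1.0]
print(snv([10.0, 20.0, 30.0]))  # [-1.0, 0.0, 1.0]
```

Because the correction uses only the spectrum itself, the same pretreatment can be applied unchanged to every berry and every variety, which is consistent with the point made above about limiting the complexity of the method.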
The reduction in data, using only the wavelengths with the highest β-coefficients (absolute values) on the one hand, and spectral windows obtained from all the wavelengths with VIPs > 1 on the other hand, would allow an industrial use requiring less computer memory and giving quicker answers. The method could also be used for quality control. The database first has to be expanded, not only to strengthen our current models but also to test new non-linear models. Another step would be to implement hyperspectral imaging on an industrial conveyor belt to take into account not only elements such as vibration on the conveyor but also analytical speed, in order to provide real-time information. Moreover, in an on-line perspective, localized information could be added to separate berries from a batch in order to obtain two or more final batches for different transformations or different quality ranges, depending on the berry average spectrum and thus on their composition. Nonetheless, the tool could already be used for rapid table grape characterization at producer or industrial sites.
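The β-coefficient route to data reduction can be sketched as a simple ranking, assuming the regression coefficients of a fitted full-spectrum PLS model are available as one value per wavelength. Keeping the k largest absolute values is an assumption made for illustration; this section does not state how many wavelengths were retained or whether peaks were picked by another rule. The function name and the numbers are hypothetical.

```python
def top_beta_wavelengths(wavelengths, betas, k):
    """Return the k wavelengths with the largest absolute regression coefficients."""
    if len(wavelengths) != len(betas):
        raise ValueError("wavelengths and betas must have the same length")
    if not 0 < k <= len(betas):
        raise ValueError("k must be between 1 and the number of wavelengths")
    # Rank band indices by |beta|, largest first, then restore spectral order.
    ranked = sorted(range(len(betas)), key=lambda i: abs(betas[i]), reverse=True)
    keep = sorted(ranked[:k])
    return [wavelengths[i] for i in keep]

# Hypothetical wavelengths (nm) and beta-coefficients, for illustration only.
wavelengths = [500, 550, 600, 650, 700]
betas = [0.02, -0.31, 0.05, 0.27, -0.01]
print(top_beta_wavelengths(wavelengths, betas, k=2))  # [550, 650]
```

The retained bands would then serve as the predictors of an MLR model, so only a handful of reflectance values per berry need to be stored and processed.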