Headspace with Gas Chromatography-Mass Spectrometry for the Use of Volatile Organic Compound Profile in Botanical Origin Authentication of Honey

The botanical origin of honey determines its composition and hence properties and product quality. As a highly valued food product worldwide, assurance of the authenticity of honey is required to prevent potential fraud. In this work, the characterisation of Spanish honeys from 11 different botanical origins was carried out by headspace gas chromatography coupled with mass spectrometry (HS-GC-MS). A total of 27 volatile compounds were monitored, including aldehydes, alcohols, ketones, carboxylic acids, esters and monoterpenes. Samples were grouped into five categories of botanical origins: rosemary, orange blossom, albaida, thousand flower and “others” (the remaining origins studied, due to the limitation of samples available). Method validation was performed based on linearity and limits of detection and quantification, allowing the quantification of 21 compounds in the different honeys studied. Furthermore, an orthogonal partial least squares-discriminant analysis (OPLS-DA) chemometric model allowed the classification of honey into the five established categories, achieving a 100% and 91.67% classification and validation success rate, respectively. The application of the proposed methodology was tested by analysing 16 honey samples of unknown floral origin, classifying 4 as orange blossom, 4 as thousand flower and 8 as belonging to other botanical origins.


Introduction
Honey is a natural substance recognised for its many health benefits. It is considered a complex solution since its composition depends on several factors, including geographical and botanical origin and production process. Consequently, the beneficial effects are also dependent on phytochemical composition [1]. Depending on the botanical origin, three types of honey can be distinguished: monofloral and multifloral honey (from plant nectar) and honeydew (from plant secretions).
The origin of honey is closely related to market price and product quality. Generally, monofloral honey (from the nectar of a single plant species) has a more defined taste and aroma than multifloral honey (from the nectar of several plant species), which results in a higher price. As a high-value food product, honey is often subject to fraud, including adulteration and mislabelling of its origin [2]. The authenticity of honey in terms of botanical origin was traditionally determined by melissopalynological analysis based on pollen structure. However, this technique is difficult to apply for routine analysis as it requires much time and trained personnel [3]. At present, high-performance liquid chromatography (HPLC) [4][5][6], gas chromatography (GC) [7][8][9] and the analysis of physicochemical parameters such as colour, electrical conductivity, moisture, pH or the content of hydroxymethylfurfural (HMF) [7,8,10,11] are the most commonly used techniques for this purpose [12].

HS-GC-MS Method Optimisation
The first step was a comprehensive optimisation of the proposed analytical method to achieve efficient VOC extraction from the investigated Spanish honeys. The optimised parameters were oven programme, injection mode and volume, incubation time and temperature, amount of honey and the addition of NaCl to the sample. Optimisation was carried out using an orange blossom honey sample as a reference matrix. First, experiments were carried out using 1 g of honey fortified at 1 µg g⁻¹ of the 37 VOC standards investigated (specified in Section 3.1). Then, honey was incubated at 100 °C and 750 rpm for 10 min, and 1 mL of headspace was injected into the system in splitless mode. The volatile profile of honey was used to study the response of the different conditions. Firstly, the oven programme was investigated to achieve an adequate peak separation in a shorter elution time. As no peak was observed at times over 35 min, the oven programme finally used was as follows: 40 °C (5 min), increased to 130 °C at 5 °C min⁻¹ and then to 200 °C at 35 °C min⁻¹.
Subsequently, the sample injection mode was optimised by performing splitless and split experiments at ratios of 10:1, 20:1, 50:1 and 100:1. Among the split experiments, the highest intensities and peak resolutions were obtained at the lowest dilution ratio. Therefore, the split ratio was set at 10:1. Regarding the injection volume, three different volumes were tested: 1, 1.5 and 2 mL. As expected, signal intensity increased with injection volume; however, no significant differences were found between 1.5 and 2 mL, and thus the volume selected was 1.5 mL.
The amount of honey studied ranged from 0.5 to 5 g. As the amount of honey increased, the number and intensity of signals increased (Figure S1). However, 3 g of honey was finally selected as optimal in order to avoid the possible contamination associated with larger quantities of sample. The addition of NaCl to honey was investigated in a range of 0-10% to improve VOC extraction by increasing the ionic strength of the medium. However, the presence of NaCl resulted in lower peak areas, and its addition was therefore rejected for honey analysis.
The next parameters were the incubation time and temperature of honey samples. Firstly, the time of incubation was studied in the 5-20 min range, achieving higher peak intensities at longer incubation times; thus, 20 min was selected as the optimal time. Finally, the incubation temperature was varied from 80 to 110 °C. No significant differences were obtained from 90 °C onwards, and this temperature was therefore established as optimal for honey incubation. Higher temperatures were not tested to avoid forming derivative compounds which modify the volatile profile of honey.
As a result, the proposed methodology allowed the monitoring of 27 VOCs. Table 1 summarises the retention time and the target and qualifier ions of these compounds. Figure 1 shows the total ion chromatogram (TIC) obtained for an orange blossom honey sample fortified at 1 µg g⁻¹ with the mixture of monitored standards.

Method Characterisation and Quantification of Identified VOCs
The determination of monitored compounds in honey samples was carried out by the construction of calibration curves using refined oil fortified at eight concentration levels ranging from 0.02 to 1 µg g⁻¹ in duplicate. Samples were also fortified with toluene and p-xylene as IS at a constant concentration of 0.1 µg g⁻¹.
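As an illustrative check, the internal standard level of 0.1 µg g⁻¹ can be reproduced from the spiking conditions given in the HS-GC-MS Analysis section (30 µL of a 10 mg L⁻¹ IS solution added to 3 g of sample). The following minimal Python sketch shows this arithmetic; the function name and its parameters are hypothetical and were chosen only for illustration.

```python
def spiked_concentration_ug_per_g(volume_ul, stock_mg_per_l, sample_mass_g):
    """Concentration (ug/g) reached by spiking a stock solution into a sample."""
    # 1 mg/L equals 1 ng/uL, so volume (uL) x stock (mg/L) gives nanograms.
    mass_ng = volume_ul * stock_mg_per_l
    mass_ug = mass_ng / 1000.0
    return mass_ug / sample_mass_g


# Values taken from the text: 30 uL of IS solution at 10 mg/L in 3 g of sample.
is_level = spiked_concentration_ug_per_g(volume_ul=30, stock_mg_per_l=10, sample_mass_g=3)
print(round(is_level, 3))  # 0.1
```

The same helper could be used to plan the fortification volumes for the eight calibration levels, provided the stock concentrations are known.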

Table columns: Compound; Linear Range (µg g⁻¹); R²; LOD (LOQ) (µg g⁻¹).
The optimised and validated method was applied for VOC quantification in honey samples from different botanical origins. All samples were analysed in duplicate. Table 3 shows the mean content, the range of concentrations and the percentage of occurrence of each compound in the different honeys. The mean and occurrence were calculated considering only concentrations above the corresponding LOQs. Although eleven different honeys were available, five groups were established according to botanical origin, because the limited number of samples from certain origins required them to be included in the same group. Thus, honey was classified as albaida, orange blossom (including orange blossom-lemon), thousand flower, rosemary or honey from other origins (including Spanish lavender, heather, melon, broom, oak and thyme).
To determine which compounds are relevant to each category of honey, one-way analysis of variance (ANOVA) and least significant difference (LSD) tests were conducted. Based on the results, ethyl butyrate and 2-heptanone are volatile compounds characteristic of albaida and orange blossom honey, respectively, since these compounds were not detected in any other type of honey. Linalool allows the differentiation of orange blossom honey from the other honeys as it is classified in a separate group according to the LSD test, with an average concentration of 35 ± 19 ng g⁻¹. Similarly, decanal showed the highest content in rosemary honey (260 ± 446 ng g⁻¹). On the other hand, trans-2-heptenal and trans-2-pentenal were only detected in the category "other origins" and, therefore, albaida, orange blossom, rosemary and thousand flower honeys do not contain them. Similarly, albaida, orange blossom and rosemary honeys do not contain 4-methylpentan-2-one, and 2-nonanone was not found in orange blossom, thousand flower or rosemary honeys. The compound 2-hexanone was detected below the LOQ in all cases.
2-Octanone showed the highest quantity in rosemary honey (56.2 ± 0.3 ng g⁻¹) and the lowest in thousand flower honey (20.5 ± 0.4 ng g⁻¹). Finally, no significant differences were found for octanal, nonanal and 4-methylacetophenone among the different honey categories. The compounds 2-pentanone, 6-methyl-5-hepten-2-one, valeraldehyde, 2-hexanone, octanal, hexanal, nonanal, benzaldehyde and linalool were also previously detected in honeys from the same origins by HS-GC-IMS [25].
Due to the variability of the compound contents found within a group, the assignment of marker compounds for a specific honey was not feasible. Thus, chemometric techniques were investigated for honey classification according to botanical origin.

Non-Targeted Approach Using GC-MS Data
A non-targeted analysis of honeys was carried out using GC-MS data in order to investigate other features of honey samples besides the identified and quantified VOCs. For this purpose, peak detection, deconvolution and alignment treatments were applied to the data. First, peak detection was performed using an amplitude of 1000 a.u. as the minimum peak height, and a mass slice width and mass accuracy for centroiding of 0.5 Da. Data were smoothed using the "linear weighted moving average" approach with a smoothing level of 3 scans and an average peak width of 20 scans. Peak deconvolution was carried out by setting a sigma window of 0.5 to obtain the resolved chromatographic peaks, avoiding the detection of noise. Finally, peak alignment was performed based on the RT with a tolerance of 0.075 min and a 70% similarity threshold. As a result, 274 features were detected: the 27 monitored VOCs of honey, the 2 IS compounds (toluene and p-xylene) and 245 non-identified compounds. The detected peaks of identified and non-identified compounds were used for chemometric analysis.
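The idea behind retention-time alignment can be illustrated with a deliberately simplified Python sketch. It is not the alignment algorithm implemented in MS-DIAL, which also takes spectral similarity into account (the 70% threshold mentioned above); here peaks are grouped greedily using only the 0.075 min RT tolerance. The function name, the data layout and the retention times in the example are hypothetical.

```python
RT_TOLERANCE_MIN = 0.075  # retention-time tolerance quoted in the text


def align_by_rt(samples, tolerance=RT_TOLERANCE_MIN):
    """Group peaks from several samples into features by retention time.

    `samples` maps a sample name to a list of peak retention times (min).
    Returns a list of features; each feature is a dict sample -> RT.
    """
    peaks = sorted((rt, name) for name, rts in samples.items() for rt in rts)
    features = []
    for rt, name in peaks:
        last = features[-1] if features else None
        # Start a new feature if the peak is too far from the first RT of the
        # current feature, or if this sample already contributed a peak to it.
        if last is None or rt - last["ref_rt"] > tolerance or name in last["members"]:
            features.append({"ref_rt": rt, "members": {name: rt}})
        else:
            last["members"][name] = rt
    return [f["members"] for f in features]


# Hypothetical retention times (min), for illustration only.
demo = {
    "honey_A": [5.02, 9.80, 14.31],
    "honey_B": [5.06, 9.95, 14.30],
}
aligned = align_by_rt(demo)
print(len(aligned))  # 4
```

In this example the peaks near 5.0 and 14.3 min are merged across the two samples, whereas the peaks at 9.80 and 9.95 min differ by more than the tolerance and remain separate features.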

Chemometric Model for the Classification of Honey According to Botanical Origin
In order to investigate honey sample classification according to the botanical origin, an orthogonal partial least squares-discriminant analysis (OPLS-DA) using the identified and non-identified compounds was carried out. For this purpose, the 31 available samples analysed in duplicate were used to classify them into the 5 established groups of floral origins: albaida, orange blossom, thousand flower, rosemary and others. Thus, the data matrix was composed of 62 sample analyses (rows) × 274 features of MS (columns).
The chemometric model was built using unit variance (UV) scaling, also known as "autoscaling". The honey samples displayed a normal distribution of data in a normal probability plot of residuals. Eighty percent of the data was used for model training, comprising 50 analyses of albaida (5), orange blossom (8), thousand flower (13), rosemary (5) and others (19). The other 20% of the data was applied to validate the model; therefore, the validation set included 12 analyses (1 from albaida, 2 from orange blossom, 3 from thousand flower, 1 from rosemary and 5 from other origins). The proposed model was composed of 4 + 13 + 0 components with R2X = 0.850, R2Y = 0.967 and a model prediction index (Q2) of 0.641, indicating the good predictive ability of the model. Figure 2 shows the two-dimensional scatter plot of the first component against the second component, which explains the largest variation of the X space. Honey samples were successfully classified into the five established botanical origin categories, achieving classification and validation success rates of 100% and 91.67%, respectively. Only the rosemary analysis in the validation set was misclassified, being assigned to the thousand flower category (Table S1).
An analysis of variable importance in projection (VIP) was conducted to identify the compounds which most influence the classification of honey into the five established categories. As VIP values greater than 1 are considered influential values, 121 features including hexanal, 2-heptanone, 4-methylacetophenone, 1-hexanol, 1-octanol, ethyl isovalerate, 2-hexanone, trans-2-octenal, 6-methyl-5-hepten-2-one, 2-nonanone, 2-octanol, 4-methylpentan-2-one and linalool were deemed key compounds. A loadings scatter plot of the model was also produced to evaluate the relationship between the Y variables and X variables of the predictive components. Higher loading values lead to higher contributions to model building. Figure S2 shows which categories of honey provide similar information to the model and the relationship of each one to the investigated MS features. As can be seen in Figure S2, "thousand flower" is the most influential category. Thousand flower, orange blossom and other honeys differ significantly more than albaida and rosemary, which provide similar information to the model. Regarding the MS features, the two most influential markers for each honey are also indicated in Figure S2.
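For readers unfamiliar with UV scaling, the transformation applied to each feature column before model building can be sketched as follows. This is a minimal Python illustration and not the SIMCA implementation; the use of the sample standard deviation (n - 1) is an assumption, and the peak areas in the example are hypothetical.

```python
from statistics import mean, stdev


def autoscale(matrix):
    """Unit-variance scale each column: subtract the mean, divide by the SD."""
    columns = list(zip(*matrix))
    means = [mean(c) for c in columns]
    sds = [stdev(c) for c in columns]  # sample SD (n - 1); an assumption
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, sds)]
        for row in matrix
    ]


# Hypothetical peak areas: 3 analyses (rows) x 2 features (columns).
areas = [[1000.0, 2.0], [2000.0, 4.0], [3000.0, 6.0]]
scaled = autoscale(areas)
print(scaled)  # each column becomes -1.0, 0.0, 1.0
```

After scaling, every feature has zero mean and unit variance, so intense and weak chromatographic signals contribute to the model on the same footing.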
Methanol was supplied by ThermoFisher Scientific (MA, USA), and sodium chloride (NaCl) was purchased from Sigma-Aldrich. Helium from Messer (Madrid, Spain) was used as a carrier gas.

Honey Samples
The different varieties of monofloral and multifloral honey were provided by several beekeepers from Murcia (Spain). Specifically, 31 honey samples from 11 botanical origins were analysed, namely orange blossom (five samples), albaida (three samples), heather (two samples), orange blossom-lemon (one sample), Spanish lavender (one sample), melon (two samples), thousand flower (eight samples), broom (two samples), oak (two samples), rosemary (three samples) and thyme (two samples) honeys. Furthermore, 16 samples of unknown origin were purchased from local markets and analysed to evaluate the applicability of the proposed method. All honeys were kept in the dark at 4 °C until analysis.

Instrumentation and Software
An 8890 gas chromatograph from Agilent Technologies (CA, USA) with a multipurpose sampler (MPS) operating in headspace mode and a 2.5 mL syringe (Gerstel, Mülheim, Germany) was coupled to a 5977B quadrupole mass spectrometer with an inert ion source, also from Agilent. Chromatographic separation was performed using two in-line Agilent HP-5MS capillary columns (5% diphenyl-95% dimethylpolysiloxane) of 15 m × 0.25 mm I.D. × 0.25 µm, combined with a backflush system with a post-run time of 0.83 min.
MassHunter Workstation software (Qualitative Analysis version B.08.00) from Agilent Technologies was used for data acquisition. StatGraphics Plus 5.1 (Statistical Graphics, Rockville, MD, USA), MS-DIAL 4.80 and SIMCA 14.1 (Umetrics, Umeå, Sweden) software were used for the processing of data. The NIST mass spectral library was used for the identification of VOCs.
For the homogenization of honey samples before analysis, an LLG-uniTEXER vortex agitator (Heathrow Scientific, Vernon Hills, Chicago, IL, USA) was used.

HS-GC-MS Analysis
Honey was tempered at room temperature before analysis and 3 g was weighed into a 20 mL vial. Then, 100 µL of MeOH and 30 µL of the IS solution at 10 mg L⁻¹ were added. This mixture was shaken for 1 min by vortex at 1500-2000 rpm for homogenisation. Subsequently, the vial containing the sample was incubated at 90 °C for 20 min at 750 rpm and 1.5 mL of the headspace was injected into the GC system. The injection temperature was set to 100 °C and injection was performed in split mode at a ratio of 10:1. The flow of the carrier gas, helium, was 1 mL min⁻¹. The GC oven programme started at 40 °C (5 min), increased to 130 °C at 5 °C min⁻¹ and then to 200 °C at 35 °C min⁻¹, resulting in a total runtime of 25 min. The ion source, transfer line and quadrupole temperatures were 230, 300 and 150 °C, respectively.
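The total runtime of 25 min follows directly from the oven programme described above. The following short Python sketch reproduces this calculation, assuming that no hold time is applied after either ramp (the text does not state one); the function name and the tuple layout are hypothetical.

```python
def oven_runtime_min(start_temp_c, initial_hold_min, ramps):
    """Total runtime of a GC oven programme.

    `ramps` is a list of (target_temp_c, rate_c_per_min, hold_min) tuples.
    """
    total = initial_hold_min
    current = start_temp_c
    for target, rate, hold in ramps:
        total += abs(target - current) / rate + hold
        current = target
    return total


# 40 C (5 min), then 5 C/min to 130 C, then 35 C/min to 200 C; holds assumed 0.
runtime = oven_runtime_min(40, 5, [(130, 5, 0), (200, 35, 0)])
print(runtime)  # 25.0
```

The first ramp contributes 18 min and the second 2 min, which together with the 5 min initial hold gives the stated 25 min.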
The MS was operated in electron impact (EI) mode (70 eV) and experiments were carried out using scan mode in a range of 35-500 m/z. The quantification of compounds was performed using selected ion monitoring (SIM) mode and the extracted ion chromatograms (EIC) of the target ions.

Data Processing
First, HS-GC-MS data were converted to Analysis Base Framework (ABF) format for data processing using MS-DIAL. The pre-treatment of the total ion chromatograms (TICs) involved peak detection, deconvolution and alignment of the data. The detected peaks of identified and non-identified compounds were used for chemometric analysis in order to differentiate honey according to botanical origin. The chemometric model based on orthogonal partial least squares-discriminant analysis (OPLS-DA) was constructed with SIMCA software using unit variance (UV) scaling. Model building was carried out using a training dataset consisting of 80% of the data, selected randomly, and validation was performed with a validation dataset using the remaining 20%. The parameters R2X (cum), R2Y (cum) and Q2 (cum) were evaluated to assess the adequacy of the model. R2X (cum) and R2Y (cum) correspond to the cumulative fraction of variance in X and Y explained by the model components, and Q2 (cum) represents the predictive ability of the model. The range of these parameters is 0 to 1 [33]. The chemometric model is considered acceptable at a Q2 value of 0.5 or higher [34]. Sensitivity of the model was defined as ∑ True positive/(∑ True positive + ∑ False negative) × 100.
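The sensitivity definition can be applied to the validation set reported in the Results (12 analyses, with the single rosemary analysis assigned to thousand flower). The following Python sketch is only an illustration of the formula; the function name and the label lists are constructed here from the counts given in the text and are not taken from Table S1 itself.

```python
from collections import Counter


def sensitivity_percent(true_labels, predicted_labels):
    """Per-class sensitivity: TP / (TP + FN) x 100."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {cls: 100.0 * hits[cls] / n for cls, n in totals.items()}


# Validation set composition as described in the text.
true = (["albaida"] + ["orange blossom"] * 2 + ["thousand flower"] * 3
        + ["rosemary"] + ["others"] * 5)
pred = list(true)
pred[true.index("rosemary")] = "thousand flower"  # the one misclassified analysis

per_class = sensitivity_percent(true, pred)
overall = 100.0 * sum(t == p for t, p in zip(true, pred)) / len(true)
print(per_class["rosemary"], round(overall, 2))  # 0.0 91.67
```

With 11 of 12 validation analyses assigned correctly, the overall rate equals the 91.67% reported, while the per-class value for rosemary is 0% because its only validation analysis was misclassified.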

Conclusions
The proposed analytical method allowed the characterisation of honey from different botanical origins in a rapid and efficient way without the requirement of sample pretreatment, which involves a longer process time and additional costs for instruments and reagents. Characterisation was carried out by monitoring 27 volatile compounds, obtaining the average content, concentration range and incidence of these compounds in each honey. This allowed the identification of possible floral markers such as ethyl butyrate for albaida honey and 2-heptanone for orange blossom honey. In addition, the proposed OPLS-DA chemometric model based on a non-targeted analysis of honeys allowed a successful classification according to botanical origin, achieving a classification success rate of 100% and a validation success of 91.67%, demonstrating the suitability of the model for the identification of honeys of unknown origin.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/molecules28114297/s1, Figure S1: Total ion chromatogram of honey during sample amount optimisation by the proposed HS-GC-MS method; Figure S2: Loadings scatter plot of the proposed OPLS-DA model; Table S1: Validation rate of the proposed OPLS-DA model.