Application of vibrational spectroscopy in the quality assessment of Buchu oil obtained from two commercially important Agathosma species (Rutaceae)

Quality assessment of natural raw materials and derived consumer products is often done using conventional analytical techniques such as liquid and gas chromatography, which are expensive and time consuming. This paper reports on the use of vibrational spectroscopy techniques as possible alternatives for the rapid and inexpensive assessment of the quality of ‘buchu oil’ obtained from two South African species, Agathosma betulina and Agathosma crenulata, belonging to the Rutaceae family. Samples of A. betulina (55) and A. crenulata (16) were collected from different natural localities and cultivation sites in South Africa. The essential oil was obtained by hydrodistillation and scanned on near infrared (NIR), mid infrared (MIR) and Raman spectrometers. The spectral data were processed using chemometric techniques, and orthogonal partial least squares discriminant analysis (OPLS-DA) clearly differentiated between A. betulina and A. crenulata. The OPLS-DA technique also proved to be a useful tool for identifying wavenumber regions that contain biomarkers (peaks) contributing to the separation of the two species. The three spectroscopy techniques were also evaluated for their ability to accurately predict the percentage composition of seven major compounds that occur in A. betulina ‘buchu’ oil. Using GC-MS reference data, calibration models were developed for the MIR, NIR and Raman spectral data to predict/profile the major compounds in ‘buchu oil’. A comparison of the three spectroscopy techniques showed that MIR together with PLS algorithms produced the best model (R²X = 0.96; R²Y = 0.88 and Q²Ycum = 0.85) for the quantification of six of the seven major oil constituents. The MIR model showed high predictive power for pseudo-diosphenol (R² = 0.97), isomenthone (R² = 0.97), menthone (R² = 0.90), limonene (R² = 0.91), pulegone (R² = 0.96) and diosphenol (R² = 0.85). These results illustrate the potential of MIR spectroscopy as a rapid and inexpensive alternative to predict the major compounds in buchu oil.


Introduction
Agathosma species ('buchu') are medicinal shrubs belonging to the Rutaceae family. The species are widely distributed throughout the Western Cape province of South Africa, which harbours two commercially important species: A. betulina (Bergius) Pillans and A. crenulata (L.) Pillans (Spreeth, 1976). Both species are known as 'true buchu'; however, owing to their characteristic leaf shapes, A. betulina is known as round leaf buchu and A. crenulata as long leaf buchu (Fig. 1). The buchus are integral to the traditional healing practices of inhabitants of the South Western Cape, where they are used to treat renal disorders and chest complaints (Van Wyk and Wink, 2004). In addition, the essential oil obtained from A. betulina, commonly referred to as 'buchu oil', is used as a flavourant to enhance fruit flavours such as blackcurrant notes, while in perfumery it is used as a fragrance material (Simpson, 1998; Turpie et al., 2003; Van Wyk and Wink, 2004).
Due to widespread commercialization of buchu-containing products, both locally and abroad, there has been an increase in the demand (albeit with fluctuation) for buchu oil. The oil from A. betulina has, however, gained market favour over that of A. crenulata due to its unique organoleptic properties (Posthumus et al.; Webber et al., 1999). The need to correctly identify A. betulina and A. crenulata plant material and oil thus became apparent. Currently, identification is based on leaf shape; however, due to the emergence of hybrids and the variable leaf shape observed between several populations, leaf shape may not always be a reliable character (Blommaert and Bartel, 1976; Pillans, 1950). Gas chromatography coupled to mass spectrometry (GC-MS) is the analytical tool used for analysis of buchu oil. However, the method is time consuming, expensive and requires skilled personnel (Qiao and Van Kempen, 2004). Fast, efficient and cost-effective methods for analysing buchu oil are needed to supply a product of consistently high quality and to reduce the risk of financial losses due to the supply of low quality oil. Vibrational spectroscopy has been identified as an important alternative method in quality assessment of raw material and herbal products. The technique has already been used in the inspection and analysis of raw materials and to quantify constituents in a wide range of products (Lin et al., 2009). In the food and beverage industries, spectroscopy is used for in situ analysis of moisture content, fat, protein, sugar and acid levels (Osborne and Fearn, 1993; Pedersen et al., 2003). In the pharmaceutical industry, spectroscopy is used in process monitoring and product quality control that includes raw material identification, content and particle size uniformity, and moisture (Reich, 2005).
The advantages of vibrational spectroscopy over current analytical techniques are that it is robust, efficient, non-destructive, non-invasive and inexpensive, and requires little or no sample preparation (Schulz et al., 2004). In this study, the use of three vibrational spectroscopy techniques (near infrared, mid infrared and Raman) to characterise and classify buchu oil from A. betulina and A. crenulata was evaluated. In addition, the techniques were evaluated for the quantification of the major constituents in the commercially important A. betulina oil. Chemometric tools were used to develop calibration models that would assist in the rapid prediction of major buchu oil components.

Selection and preparation of plant material
Seventy-one A. betulina and A. crenulata plants (wild and cultivated) from 19 different locations in the South Western Cape region of South Africa were obtained (Table 1). All samples were kindly supplied by Chicken Naturals. Several individual plants were harvested both in commercial plantations and from the wild. The essential oil was obtained through hydrodistillation of the aerial parts using a Clevenger-type apparatus. The oils were stored at −20 °C prior to analysis.

Gas chromatography-mass spectrometry (GC-MS)
Analysis of the distilled oils was done using gas chromatography-mass spectrometry (GC-MS). An Agilent 6860 N chromatograph fitted with an HP-Innowax, 60 m × 250 μm polyethylene glycol column (film thickness 0.25 μm) was used. The following oven temperature programme was used: start at 60 °C, rising to 220 °C at 4 °C/min, holding for 10 min, and then rising to 240 °C at 1 °C/min. Helium was used as the carrier gas at a constant flow of 1.2 ml/min and a pressure of 24.79 psi (split 1:200). Mass spectra were obtained by electron impact at 70 eV, scanning from 35 to 550 m/z. Identification of the major compounds was based on retention indices and library databases that include Mass Finder® and NIST®. The percentage composition of the major compounds was obtained from the flame ionization detector (FID) peak areas according to the 100% method (Kamatou et al., in press).
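The percentage composition step can be illustrated with a minimal sketch, assuming the 100% method refers to conventional area normalisation, in which each peak area is expressed as a share of the summed peak areas. The function name and the peak areas below are hypothetical and serve only as an illustration; they are not data from this study.

```python
def area_percent(peak_areas):
    """Return each peak's share of the total integrated area, in percent."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical FID peak areas (arbitrary units), for illustration only.
example_areas = {"limonene": 150.0, "menthone": 100.0, "isomenthone": 250.0}
percentages = area_percent(example_areas)

for name, pct in percentages.items():
    print(f"{name}: {pct:.1f}%")
```

Because the percentages are relative to the summed area, they always total 100% and are independent of the injected amount.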

NIR spectroscopy measurements
The near infrared spectra of the oils were recorded on a NIRFlex N500 liquid cell spectrometer (Büchi Labortechnik AG, Flawil, Switzerland). High precision cells (cuvettes) of 0.20 mm path length (Hellma GmbH & Co, KG, Müllheim, Germany) were used. The oil spectra were collected in the transmittance mode over the wavenumber range 4000-10,000 cm⁻¹ (2500-1000 nm). NIRWare 1.2 was used for operating the instrument and obtaining spectra. Approximately 50 μl of sample was aliquoted into the cuvette, which was placed on the spacer. A total of 32 scans were accumulated for each sample with a spectral resolution of 4 cm⁻¹ (1501 data points). The procedure was done in triplicate and the spectra were averaged in MS Excel® for chemometric analysis.

MIR spectroscopic measurements
The mid-infrared spectra of the oils were recorded in the range of 550-4000 cm⁻¹ (~18,000-2500 nm) on an alpha-P Bruker spectrometer mounted with an ATR diamond crystal (Bruker OPTIK GmbH, Ettlingen, Germany). OPUS 6.5 was used for obtaining the spectra. The essential oil sample (10 μl) was placed directly on the surface of the ATR diamond crystal and spectral data were obtained in the absorbance mode. A total of 32 scans were accumulated for each sample with a spectral resolution of 4 cm⁻¹ (2436 data points). The procedure was done in triplicate and the spectra were averaged in MS Excel® for chemometric analysis (Baranska et al., 2005).

Raman measurements
FT-Raman spectra were recorded using a Nicolet NXR 9650 spectrometer equipped with a laser emitting at 1064 nm and a germanium detector cooled with liquid nitrogen. Approximately 5 μl of essential oil was aliquoted into the center of a steel disk and placed on an xy stage. Spectral data of individual oils were accumulated using OMNIC software from 64 scans with a spectral resolution of 4 cm⁻¹ in the range of 100-4000 cm⁻¹ (100,000-2500 nm) (8090 data points). Laser power of 100 mW was supplied by an unfocused beam (Baranska et al., 2006).
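The wavelength equivalents quoted alongside the wavenumber ranges of the three techniques follow from the relation λ (nm) = 10⁷ / ν̃ (cm⁻¹). The short sketch below reproduces them; the function name is illustrative and is not part of any instrument software.

```python
def wavenumber_to_nm(wavenumber_cm1):
    """Convert a wavenumber in cm^-1 to the equivalent wavelength in nm."""
    if wavenumber_cm1 <= 0:
        raise ValueError("wavenumber must be positive")
    return 1e7 / wavenumber_cm1


# Spectral ranges (cm^-1) as reported for the three techniques.
ranges_cm1 = {"NIR": (4000, 10000), "MIR": (550, 4000), "Raman": (100, 4000)}

for technique, (low, high) in ranges_cm1.items():
    print(technique, round(wavenumber_to_nm(low)), "-", round(wavenumber_to_nm(high)), "nm")
```

For the Raman range the wavenumbers are shifts relative to the 1064 nm excitation line, so the converted values are nominal equivalents rather than the wavelengths of the scattered light.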

Data analysis
Chemometric analysis of the spectral data was performed using SIMCA-P+ 12.0 software (Umetrics AB, Malmö, Sweden). Orthogonal partial least squares discriminant analysis (OPLS-DA) was performed on MIR, NIR and Raman spectral data for discrimination of the two Agathosma species. Spectral data were centered and the whole wavenumber region was used without spectral pretreatment for this analysis.
Table 2 Oil composition of samples used in this study (A. betulina n = 55 and A. crenulata n = 16). Columns: Component; Relative retention indices (RRI).
Partial least squares (PLS) regression analysis was carried out on NIR, MIR and Raman spectra to set up calibration models of A. betulina essential oil constituents. Principal component analysis was initially done to identify any strong outliers (scores scatter plot) or moderate outliers (DmodX) that could be removed from the dataset. The pretreatments used were multiplicative scatter correction (MSC), standard normal variate (SNV) and second derivative. The whole wavenumber region was used, and cross validation with the prediction error sum of squares (PRESS) method was used to estimate the predictive ability of the model. Response permutation was applied to determine the appropriate number of PLS components to include in the model and hence avoid overfitting. The model was fitted on centered spectral data while unit variance (UV) scaling was applied to the reference (GC) data.
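Two of the operations named above, mean centering and SNV, can be sketched in a few lines. This is a minimal illustration using only the Python standard library and the textbook definitions; it is not the SIMCA-P+ implementation, and the absorbance values are hypothetical.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: scale one spectrum by its own mean and SD."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("a flat spectrum cannot be SNV-scaled")
    return [(x - m) / s for x in spectrum]


def mean_centre(spectra):
    """Subtract the per-wavenumber (column) mean from every spectrum."""
    n_vars = len(spectra[0])
    col_means = [mean(row[j] for row in spectra) for j in range(n_vars)]
    return [[row[j] - col_means[j] for j in range(n_vars)] for row in spectra]


# Hypothetical absorbances: three spectra that differ only by a scale factor.
spectra = [
    [0.10, 0.20, 0.40, 0.30],
    [0.20, 0.40, 0.80, 0.60],
    [0.15, 0.30, 0.60, 0.45],
]

snv_spectra = [snv(row) for row in spectra]  # row-wise pretreatment
centred = mean_centre(spectra)               # column-wise centering
```

The three example spectra become identical after SNV, which is the multiplicative scatter effect the pretreatment is meant to remove. Centering, by contrast, acts on each wavenumber across samples and leaves differences between samples intact.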
A training set and a test set were defined for external validation by randomly selecting 70% of observations for the training set; the remaining 30% (test set) were used to evaluate the predictive ability of the model. Statistical accuracy was described by the squared correlation coefficient (R²) and the root mean square error of prediction (RMSEP) for observations in the prediction set.
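The external validation scheme can be sketched as follows. The split function, the random seed and the observed and predicted values are hypothetical. R² is computed here as the squared Pearson correlation and RMSEP with the number of test observations as denominator, which are common definitions but are assumptions about the exact formulas used by the software.

```python
import math
import random


def split_train_test(n_obs, train_percent=70, seed=0):
    """Randomly assign observation indices to a training and a test set."""
    indices = list(range(n_obs))
    random.Random(seed).shuffle(indices)
    n_train = n_obs * train_percent // 100  # integer arithmetic, rounds down
    return sorted(indices[:n_train]), sorted(indices[n_train:])


def rmsep(observed, predicted):
    """Root mean square error of prediction over the test observations."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


# 55 A. betulina oils split 70/30 into training and test sets.
train_idx, test_idx = split_train_test(55)

# Hypothetical GC reference values and model predictions (% of oil).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 31.0, 39.0]
print(f"RMSEP = {rmsep(observed, predicted):.2f}, R2 = {r_squared(observed, predicted):.3f}")
```

RMSEP is expressed in the units of the reference data (here % of oil), so it is best judged against the concentration range of each compound.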

GC-MS reference analysis
The results show that there were no qualitative differences between the two species when comparing the major compounds (% area > 1) listed in Table 2. A. crenulata is characterized by high pulegone content ranging between 50 and 66%. A. betulina is characterized by the presence of diosphenol (15-35%), pseudo-diosphenol (12-30%), isomenthone (4-26%) and limonene (5-24%). These compounds also occur in A. crenulata, although some are found in very low quantities (< 2%). The commercially important sulphur-containing compounds (cis- and trans-8-mercapto-p-menthan-3-one) are characteristic of A. betulina. Although these occur in small amounts, they are responsible for the characteristic organoleptic properties of the oil. Fig. 2 shows the total ion chromatograms of the two species, highlighting the major compounds and the corresponding structures. These results are consistent with previous reports on the chemical profiles of A. betulina and A. crenulata essential oils. Fluck et al. (1961) observed that, qualitatively, the two species had similar compounds; however, A. betulina contained high levels of diosphenol, while A. crenulata contained high levels of pulegone. Other reports also confirmed high pulegone levels as a marker for identifying A. crenulata oil, and high diosphenol and low pulegone content as markers for identifying A. betulina oil (Blommaert and Bartel, 1976; Collins and Graven, 1996).

Classification and discrimination of Agathosma species
A two-component OPLS-DA model was successfully used to discriminate between A. betulina and A. crenulata oils using MIR spectral data. The model explained 96.5% of the total variation in X (R²X cum, predictive + orthogonal) and the goodness of prediction of the model was 97.8% (Q² cum = 97.8%). Most of the predictive information was found in the first component, which showed that 89.7% of the variation in X was related to the separation of the two species (Fig. 3). The orthogonal component, 15.5% of the variation in X, is systematic variation that did not contribute to the separation. Separation of the two classes was very good, as can be seen from the high R²Y (goodness of fit) and Q²Y values of 98% and 97.9%, respectively. Fig. 4a is the loadings plot that shows the correlation between the wavenumbers and the two species. The positive loadings are correlated to A. betulina while negative loadings are correlated to A. crenulata. The indicated peaks show the regions that are most responsible for the separation of the two species. Fig. 4b confirms these regions of high magnitude and high reliability in the separation of the two species. The wavenumbers (1388-1394 and 1635-1652 cm⁻¹) indicated on the far top right are correlated to chemical profiles associated with A. betulina, while in the bottom left corner the wavenumbers (1281-1289 and 1675-1694 cm⁻¹) are correlated to A. crenulata. The region in the center carries a high risk of spurious correlation and is therefore not used to distinguish the species, as it can give false information (Eriksson et al., 2006). The results obtained show that there is considerable variation between A. betulina and A. crenulata, which can be confirmed using GC data. In addition, the OPLS-DA model developed using MIR data reliably separated the two Agathosma species. Overall, the results demonstrate that MIR with the aid of chemometrics can be used to rapidly determine the authenticity of buchu oil.

Vibrational spectra of A. betulina essential oils
Three vibrational spectroscopy methods (MIR, NIR and Raman) were compared for analysis of major compounds in the essential oils of A. betulina. Representative spectra for each of the techniques are shown in Fig. 5. The MIR spectrum shows characteristic key bands/peaks that can be assigned to the major compounds in the oils (Fig. 5a). In contrast, the NIR spectrum consists of broad overlapping bands that can be applied in quality assessment but may not be useful for assessment of individual compounds (Fig. 5b) (Baranska et al., 2006). The Raman spectrum, like the MIR spectrum, shows characteristic key bands that can be assigned to specific components in the oils (Fig. 5c). The spectrum is, however, considerably noisier than the MIR spectrum and may require smoothing, which has the disadvantage that it may mask or remove certain characteristic signals, resulting in the loss of useful information.

Quantification of major compounds
Fifty-five A. betulina essential oils were used to develop linear calibration models for the prediction of seven major compounds which occur in varying concentrations in the oils (Table 2). Calibration models were developed from MIR, NIR and Raman data to directly predict the major compounds from the spectral data. A comparison of the three models was made to identify the technique that produced the best model for prediction. The results obtained are presented in Table 3. MIR provided the best predictive model, with only 3 PLS components explaining 96% of the variation in X (R²X = 0.96) and 88% of the variation in Y (R²Y = 0.88). The predictive ability of the model was 85% (Q² = 0.85). The whole wavenumber region of the spectra was included for this calibration model. Restricting the model to only the fingerprint region of the spectra did not improve the predictive power of the model. No strong or moderate outliers were identified, thus the model included all 55 observations. Although three pretreatment methods (SNV, MSC and second derivative) were applied to the dataset, they seemed to distort the model, resulting in unsatisfactory prediction quality. The model displayed in Table 3 therefore shows data without pretreatment. According to Eriksson et al. (2006), a good model is one where both R² (cum) and Q² (cum) are > 0.5 and the difference between the two values is not > 0.2. In this study, the MIR model gave R² and Q² values close to 1, and the difference between them is 0.11. In addition, the number of components used, together with the results from response permutation, shows that the model is not overfitted. The model is therefore good for the prediction of the major components.
The NIR results gave a reasonable calibration model with eight PLS components explaining 99% of the variation in X (R²X = 0.99) and 89% of the variation in Y (R²Y = 0.89). The predictive ability of the model was 73% (Q² = 0.73), which is lower than what was observed for MIR data. The whole spectra were used for calibration, and pre-processing with SNV improved the original model compared to the other two pre-processing methods (Table 3). No outliers were removed from the dataset. Although the model obtained was good, the difference between R² (cum) and Q² (cum) is 0.26, which is above 0.2, and the model may therefore give unsatisfactory predictions. The number of components used is also higher (8) than for the MIR model (3), which indicates that the model might have been overfitted, hence the lower predictive ability. Overall, although the NIR technique can be used for predictions, the MIR technique presents a much better model with a better predictive ability.
Raman spectroscopy did not yield satisfactory results. The number of PLS components used to build the model was higher (10) than for the other two models. In addition, pre-processing of the spectral data did not significantly improve the model. The difference between the R²X (0.99) and Q² (0.55) was 0.44, which shows that the model was heavily overfitted. Manually reducing the number of components resulted in the Q² (cum) being reduced to less than 0.5. Once again, no outliers were identified and all the observations were included in the model. The unsatisfactory model obtained with the Raman data could be a result of the noisy spectra obtained with Raman measurements. Smoothing of the spectra did not improve the model, which might imply that the method is not sensitive enough for quantification (Baranska et al., 2006). From the results presented in Table 3, MIR showed the best calibration statistics, and the results for prediction of the seven major compounds using this model are presented in Table 4.
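The rule of thumb cited from Eriksson et al. (2006) can be applied to the three calibration models as a simple check. The sketch below uses the R² and Q² pairs compared in the text (the R² values are the R²X figures quoted for each technique); the function name is illustrative.

```python
def assess_model(r2, q2, min_value=0.5, max_gap=0.2):
    """Rule of thumb: R2 and Q2 above min_value and R2 - Q2 not above max_gap."""
    gap = round(r2 - q2, 2)  # rounded to the two decimals reported in the text
    acceptable = r2 > min_value and q2 > min_value and gap <= max_gap
    return gap, acceptable


# (R2, Q2) pairs as compared in the text for each technique.
models = {"MIR": (0.96, 0.85), "NIR": (0.99, 0.73), "Raman": (0.99, 0.55)}
results = {name: assess_model(r2, q2) for name, (r2, q2) in models.items()}

for name, (gap, acceptable) in results.items():
    verdict = "meets" if acceptable else "fails"
    print(f"{name}: R2 - Q2 = {gap:.2f}, {verdict} the rule of thumb")
```

Only the MIR model satisfies all three conditions. The NIR and Raman models pass the 0.5 thresholds but exceed the 0.2 gap, in line with the overfitting discussed above.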
The model developed using MIR data was evaluated for its ability to analyse all seven Y-variables (compounds) in a single PLS model. This was done by creating a PCA of the Y matrix alone. The model had only 3 components, which is small compared to the number of variables and implies that the Y-variables are correlated; a single model could therefore be used for all the Y-variables (Wold et al., 2001). A review of the PCA loadings plot also showed that there is no strong clustering within the Y-variables and thus a single PLS model was sufficient to predict all the components. The results presented in Table 4 show that most of the major compounds were well predicted using the MIR calibration model. Good predictive quality was observed for six of the seven compounds (pseudo-diosphenol, isomenthone, menthone, limonene, diosphenol and pulegone), for which R² values were > 0.8. In addition, the RMSEP values (indicating the average difference between the measured and predicted response) for the compounds are low considering the range of concentrations used, implying that reliable predictions are possible. In contrast, cis-8-mercapto-p-menthan-3-one was not well predicted using this model, with R² = 0.45 and RMSEP = 1.9. The dispersion of points around the calibration line was considerable, showing that the errors were large (results not shown). The poor prediction could also have been a result of the narrow content range of the component in the samples used (1-11%). Overall, PLSR was a useful tool for modeling and analysing several compounds together, which gives a simpler overall picture than a separate model for each component (Wold et al., 2001). MIR gave the best prediction model, which is consistent with the results of previous researchers, who obtained the best prediction using a model based on ATR-IR data compared to NIR and FT-Raman for determination of lycopene and β-carotene in tomato fruits (Baranska et al., 2006).

Conclusion
In this study the feasibility of using vibrational spectroscopy as an alternative tool in the quality assessment of buchu oil has been shown. The technique was proven to be reliable in chemotaxonomic characterization of buchu oil where the use of leaf shape may not necessarily be reliable especially where hybridization is prominent. Although MIR, NIR and Raman spectroscopy can be used for quality assessment of buchu essential oils, MIR was shown to produce the best calibration model with the best predictive capacity for quantification of the major compounds. In this regard, spectroscopy was again proven to be a fast, reliable and inexpensive technique in the profiling of major compounds that occur in the commercially important buchu oil.