1 Introduction

Metabolomics research involves the comprehensive non-target analysis of all, or at least as many as possible, metabolites in cells, tissue or body fluids. The complexity of the metabolome makes this a challenging task for analytical chemists. For example, samples of the simplest microorganisms already contain, by estimation, several hundreds of different metabolites. At present, the main analytical techniques used for the analysis of the metabolome are nuclear magnetic resonance spectroscopy (NMR) and hyphenated techniques, such as gas chromatography (GC) and liquid chromatography (LC) coupled to mass spectrometry (MS).

Gas chromatography coupled to mass spectrometry (GC–MS) is a highly suitable technique for metabolomics analysis due to the high separation power, reproducible retention times and sensitive selective mass detection. In previous papers a one-dimensional GC–MS method (Koek et al. 2006) and a comprehensive two-dimensional gas chromatography mass spectrometry method (GC × GC–MS; Koek et al. 2008) suitable for the analysis of a broad range of small polar metabolites were described using a derivatization with an oximation reagent followed by silylation. Several other GC–MS (Fiehn et al. 2000; Jonsson et al. 2004; Roessner et al. 2000; Strelkov et al. 2004; Villas-Boas et al. 2005) and GC × GC–MS (O’Hagan et al. 2007; Pierce et al. 2006b; Shellie et al. 2005) based methods for metabolomics have been reported.

GC × GC–MS is based on two analytical columns with different selectivities that are coupled through a modulator. The so-called dual-stage cryogenic modulator, equipped with four jets (two liquid-nitrogen-cooled and two hot-gas jets), allows for the consecutive trapping, cryogenic focussing and release of small fractions of the first-column effluent in narrow bands onto the second column. In this comprehensive setup, the entire sample is separated on both columns and no information from the first separation is lost during the second one. The resulting GC × GC–MS chromatogram consists of a large series of consecutive second-dimension (2D) separations. To maintain the separation achieved on the first column, each peak eluting from the first dimension should be sampled, i.e. modulated, at least three to four times (Murphy et al. 1998).
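The sampling criterion can be illustrated with a short calculation. The sketch below is only an illustration: the function names and the 24 s first-dimension peak width are hypothetical, whereas the 6 s modulation period is the value used in this study (Sect. 2.7).

```python
def modulations_per_peak(peak_width_s: float, modulation_period_s: float) -> float:
    """Number of modulation cycles that sample one first-dimension peak."""
    if peak_width_s <= 0 or modulation_period_s <= 0:
        raise ValueError("peak width and modulation period must be positive")
    return peak_width_s / modulation_period_s


def max_modulation_period(peak_width_s: float, min_modulations: float = 3.0) -> float:
    """Longest modulation period that still samples the peak min_modulations times."""
    return peak_width_s / min_modulations


# Hypothetical first-dimension peak, 24 s wide at baseline, 6 s modulation period.
n_modulations = modulations_per_peak(24.0, 6.0)
longest_period = max_modulation_period(24.0, 3.0)
print(n_modulations, longest_period)
```

Under these assumed values the peak is sampled four times, and a modulation period of up to 8 s would still satisfy the lower bound of three modulations per peak.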

GC × GC–MS offers several advantages over GC–MS, i.e. higher chromatographic separation power, a broader dynamic range and lower detection limits, and should be the preferred technique for metabolomics analysis. However, quantification of metabolomics samples using GC × GC–MS is still a major challenge. To get from raw total-ion chromatographic data to a list of sample components with their corresponding peak areas and mass spectra, many steps are required, including peak finding, deconvolution, integration and combining of the peaks from different modulations originating from one compound. The performance of each of these steps influences the final data quality and, consequently, the reliability of the biological information extracted from the data. In addition, all metabolites are of interest and need to be quantified. Several approaches have been published to process GC × GC–MS data after analysis to find metabolites that distinguish between samples (Mohler et al. 2008; Pierce et al. 2006a; Shellie et al. 2001; Sinha et al. 2004), but only a few papers on the quantification of all peaks (or at least as many as possible) have been published.

Hoggard and Synovec (2008) described a method for applying PARAFAC to GC × GC-TOF-MS data in an automated fashion that required no assumptions about analyte identities. They proposed that the method was applicable as a post-processing step providing deconvolution and quantification of all analytes in a sample. However, their method was very time-consuming, i.e. one chromatogram had to be divided into numerous subsections, and complete analysis required, by their estimation, tens of hours. Oh et al. (2008) developed a peak sorting method (MSsort) for GC × GC–MS data in Matlab. Raw data files were first processed using the ChromaTOF software (LECO, St. Joseph, MI, USA) to provide peak tables. Subsequently, MSsort was used to sort and combine peaks by utilizing first- (1rT) and second-dimension (2rT) retention times and the mass spectrum. However, no quantitative data were presented in either of these papers. To our knowledge, the only attempt at non-target quantification of metabolites in a real-life metabolomics study was published by Li et al. (2009). They quantified 692 peaks in 79 human-plasma samples to identify possible biomarkers for type-2 diabetes mellitus. Quantification was performed by exporting m/z 73 from the GC × GC–MS chromatograms, and alignment, peak merging and quantification were performed using their in-house developed software (GC × GC Workstation; Qiu et al. 2007). The repeatability of the quantification was tested using pooled plasma samples. The mean relative standard deviations (RSDs) in five consecutive injections of one plasma sample and five consecutive injections of five different plasma samples were 14 and 20%, respectively. It is not fully clear how many peaks were included in the mean RSD (only peaks that were quantified in all samples were included). Besides, the use of a single mass trace (m/z) instead of the deconvoluted spectrum of a peak for quantification can result in errors in the quantification of coeluting peaks and in the assignment of the identity of a peak.

In this paper, the possibilities and limitations of commercially available software (ChromaTOF, LECO) with regard to non-target semi-automated processing of GC × GC–MS data were evaluated. This was done by measuring and processing a set of mouse-liver samples that were part of a larger study investigating the development of insulin resistance/type-2 diabetes mellitus (DM2) (Kleemann et al. 2010). DM2 is a multifactorial complex disease associated with metabolic deregulations. Despite major efforts, the pathophysiological mechanisms underlying the onset and progression of the disease are still incompletely understood. Identification of changes in hepatic metabolite profiles can help to identify dysregulated metabolic pathways in DM2 and thus aid the selection of (new) therapeutic regimens. Mice with a humanized lipoprotein metabolism, APOE*3Leiden transgenic (E3L) mice (Zadelaar et al. 2007), were used and were fed a high-fat diet known to induce insulin resistance/DM2. Livers were collected at different time points during 12 weeks of high-fat-diet feeding. The samples were measured with GC × GC–MS and with GC–MS to be able to compare the results from both methods. Results of semi-automated GC × GC–MS data processing were compared with a fully optimized, but labour-intensive, targeted GC–MS data processing method used in our lab, involving the inspection and, if required, manual correction of the integration of all quantified metabolites. In addition, time-resolved changes in metabolic profiles of the mouse livers were identified using principal-component analysis (PCA) and principal-component-discriminant analysis (PCDA).

2 Experimental

2.1 Chemicals and materials

Pyridine (Baker analyzed) was purchased from Mallinckrodt Baker (Deventer, The Netherlands) and pyridine hydrochloride (analytical grade) was purchased from Sigma-Aldrich (Zwijndrecht, The Netherlands). A solution of 56 mg/ml ethoxyamine hydrochloride (>99%, Acros Organics, Geel, Belgium) in pyridine was used for oximation and N-methyl-N-trimethylsilyl trifluoroacetamide (MSTFA; Alltech, Breda, The Netherlands) was used for silylation.

2.2 Standards

Standards used as quality-control standards, leucine-d3, glutamic acid-d3, phenylalanine-d5, glucose-d7, alanine-d4 and cholic acid-d4, were purchased from Spectra Stable Isotopes (Columbia, USA). 4,4-Difluorobiphenyl, trifluoroanthracene and dicyclohexyl phthalate were purchased from Sigma-Aldrich. Three internal standard (IS) mixtures were prepared: IS mix 1 containing leucine-d3 (250 ng/μl), glutamic acid-d3 (250 ng/μl), phenylalanine-d5 (250 ng/μl) and glucose-d7 (250 ng/μl) in water, IS mix 2 containing alanine-d4 (250 ng/μl) and cholic acid-d4 (250 ng/μl) in pyridine, and IS mix 3 containing 4,4-difluorobiphenyl (250 ng/μl), trifluoroanthracene (250 ng/μl) and dicyclohexyl phthalate (250 ng/μl) in pyridine.

2.3 Mouse-liver samples

Animal experiments were approved by the Institutional Animal Care and Use Committee of The Netherlands Organization for Applied Scientific Research (TNO) and were in compliance with European-Community specifications regarding the use of laboratory animals. Male ApoE*3Leiden transgenic (E3L) mice were subjected to high-fat-diet feeding essentially as specified in Kleemann et al. (2010). Briefly, E3L mice displaying a humanized lipoprotein metabolism and lipid profile and sensitive to high-fat-diet treatment (Kleemann et al. 2007) were fed a high-fat diet containing 24% beef tallow (HF diet; Hope Farms, Woerden, The Netherlands) and were euthanized with CO/CO2 after zero weeks (n = 8), 6 weeks (n = 8) and 12 weeks (n = 8) of high-fat-diet feeding.

Livers were collected at sacrifice and were snap-frozen immediately in liquid nitrogen, and stored at −80°C until use (no longer than 10 months).

2.4 Sample preparation

The liver samples were freeze-dried overnight and homogenized. 10-mg aliquots of the liver samples were weighed and placed inside a 2-ml Eppendorf tube. After addition of 10 μl of IS mix 1 and 500 μl of methanol/water 4:1 v/v, all samples were sonicated for 30 min and subsequently centrifuged for 10 min at 14086×g (10000 rpm). The supernatants were transferred to autosampler vials and subsequently dried under nitrogen flow. Then 10 μl of IS mix 2 and 30 μl of ethoxyamine hydrochloride solution were added and the samples were oximated for 90 min at 40°C on a tube roller mixer placed inside an oven. Subsequently, 10 μl of IS mix 3 and 100 μl of MSTFA were added and the samples were silylated for 50 min at 40°C on a tube roller mixer inside an oven. Finally, the samples were centrifuged for 20 min at 2081×g (3500 rpm) prior to injection.

2.5 Quality-control (QC) sample

A pooled sample of six different liver samples from different time points (two per time point) was used as QC sample. The samples were prepared according to the sample preparation described above; however, after extraction the supernatants of all samples were mixed and subsequently divided over ten separate autosampler vials. Furthermore, the amounts of liver sample and IS-mix were adjusted to obtain the same amount of biomass and internal standards in the QC samples compared to the study samples.

2.6 GC–MS analysis

The derivatized extracts were analyzed with an Agilent 6890 gas chromatograph coupled with an Agilent 5973 mass-selective detector (Agilent Technologies, Santa Clara, CA, USA). 1-μl aliquots of the extracts were injected into a DB5-MS capillary column (30 m × 250 μm I.D., 0.25 μm film thickness; J&W Scientific, Folsom, CA, USA) using PTV-injection (Gerstel CIS4 injector; Mülheim an der Ruhr, Germany) in the splitless mode. The temperature of the PTV was 70°C during injection; 0.6 min after injection the temperature was raised to 300°C at a rate of 2°C/s and held at 300°C for 20 min. The initial GC-oven temperature was 70°C; 5 min after injection the GC-oven temperature was increased at 5°C/min to 320°C and held for 5 min at 320°C. Helium was used as the carrier gas and was pressure-programmed such that the helium flow was kept constant at 1.7 ml/min. Detection was achieved using MS detection in electron ionisation and full-scan monitoring mode (m/z 15–800). The temperature of the ion source was set at 250°C and that of the quadrupole at 200°C.

2.7 GC × GC–MS analysis

The derivatized samples were analyzed with an Agilent 6890 gas chromatograph fitted with a dual-stage, four-jet (two liquid-nitrogen cooled and two hot-gas jets) cryogenic modulator and a secondary oven (LECO) and coupled to a time-of-flight mass spectrometer (Pegasus III, LECO). The configuration of the first (1D) and second dimension (2D) column and the method parameters were optimized, as described in Koek et al. (2008).

A 30 m × 0.25 mm I.D. × 0.25 μm forte BPX-50 column (SGE, Milton Keynes, UK) was used as the 1D column and a 2 m × 0.32 mm I.D. × 0.25 μm forte BPX5 column (SGE Europe) was used as the 2D column.

1-μl aliquots of the derivatized extracts were injected using PTV-injection (Gerstel CIS4) in the splitless mode. The temperature of the PTV was 70°C during injection; 0.6 min after injection the temperature was raised to 300°C at a rate of 2°C/s and held at 300°C for 20 min. The initial GC-oven temperature was 70°C; 3 min after injection the temperature was raised to 300°C at a rate of 5°C/min and held at 300°C for 10 min. The temperature offsets of the secondary oven and the modulator relative to the GC oven were set at +30 and +40°C, respectively. The modulation time was 6 s, with the hot-pulse time set at 1 s. Helium was used as the carrier gas and the analyses were carried out in constant-pressure mode at 300 kPa. The MS transfer line was set at 325°C and the ion-source temperature was 280°C. The detector voltage was set at −1600 V and the data acquisition rate was 75 Hz.
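To make the size and shape of the resulting data concrete, the following sketch folds a one-dimensional detector trace into the two-dimensional retention plane. It is an illustration only: it assumes that acquisition spans the complete oven programme without a solvent delay, it uses an all-zero placeholder trace instead of real data, and the function name fold_trace is hypothetical.

```python
import numpy as np

MODULATION_TIME_S = 6.0      # modulation time from the method description
ACQUISITION_RATE_HZ = 75.0   # data acquisition rate from the method description


def fold_trace(trace: np.ndarray, modulation_time_s: float, rate_hz: float) -> np.ndarray:
    """Fold a 1D detector trace into (n_modulations, points_per_modulation)."""
    points = int(round(modulation_time_s * rate_hz))
    n_mod = len(trace) // points          # drop an incomplete final modulation
    return trace[: n_mod * points].reshape(n_mod, points)


# Oven programme: 3 min hold, 70 to 300 degrees C at 5 degrees C/min, 10 min hold.
run_min = 3 + (300 - 70) / 5 + 10
n_spectra = int(run_min * 60 * ACQUISITION_RATE_HZ)
trace = np.zeros(n_spectra)               # placeholder signal, not real data
plane = fold_trace(trace, MODULATION_TIME_S, ACQUISITION_RATE_HZ)
print(run_min, n_spectra, plane.shape)
```

Under these assumptions one run of 59 min yields 590 modulations of 450 spectra each, i.e. more than 2.6 × 10^5 full mass spectra per chromatogram, which is consistent with the large data files discussed in Sect. 3.1.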

2.8 Data processing GC–MS

The Chemstation software (Version E02.00.493, Agilent Technologies) was used for processing of the data. A target table was constructed using an in-house library containing the mass spectra and retention times of over 600 reference metabolites (authentic standards), over 100 annotated metabolites (spectral match with NIST library) and over 200 unknown metabolites commonly found in blood products. Furthermore, metabolites (known or unknown) specific for this study were added to the target table. A total of 175 targets were found in the QC samples (total of three) and quantified in all samples by reconstructing an ion chromatogram of a specific mass from the mass spectrum of the target. The quantification for all targets was manually checked by visual control and if necessary peak integration was corrected manually.
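The principle of quantifying a target on a reconstructed ion chromatogram can be sketched as follows. This is not the algorithm implemented in the Chemstation software; the function names, the toy scans and the target ion m/z 174 are invented for illustration, and a simple trapezoidal integration between fixed limits is assumed.

```python
def reconstructed_ion_chromatogram(scans, target_mz, tol=0.5):
    """Return the summed intensity within tol of target_mz for every scan.

    scans: list of dicts mapping m/z -> intensity, one dict per scan.
    """
    ric = []
    for scan in scans:
        ric.append(sum(i for mz, i in scan.items() if abs(mz - target_mz) <= tol))
    return ric


def trapezoid_area(times, signal, start, stop):
    """Integrate signal over time between the scan indices start and stop."""
    area = 0.0
    for k in range(start, stop):
        area += 0.5 * (signal[k] + signal[k + 1]) * (times[k + 1] - times[k])
    return area


# Invented toy data: five scans, 1 s apart; m/z 73 is present throughout,
# while the target-specific ion m/z 174 forms a small peak.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
scans = [{73: 50, 174: 0}, {73: 60, 174: 10}, {73: 55, 174: 20},
         {73: 58, 174: 10}, {73: 52, 174: 0}]
ric = reconstructed_ion_chromatogram(scans, 174)
area = trapezoid_area(times, ric, 0, 4)
print(ric, area)
```

Because only the selected mass is integrated, the large background signal at m/z 73 in the toy data does not contribute to the peak area.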

2.9 Optimization of GC × GC–MS data processing

ChromaTOF software V3.35 was used for data processing. During the optimization step the following parameters were varied separately in the processing method: first-dimension peak width (1wB) (30, 60, 90 and 120 s), second-dimension peak width (2wB) (0.1, 0.15, 0.2, 0.3 and 0.4 s), smoothing factor (auto, 3, 5, 7) and the match required to combine different 2D peaks originating from one entry (400–800). The different processing methods were evaluated by investigating the deconvoluted mass spectra, the integration and the combining of the 2D peaks of the IS. For all IS, except for cholic acid-d4, the naturally occurring non-labelled form was also detected in the sample and partly coeluted with the labelled IS. These naturally occurring compounds (except for glucose, which was present at an extremely high concentration) were also evaluated to check the performance of the deconvolution. The IS and naturally occurring metabolites were distributed over the entire chromatogram and eluted at 1rT between 356 and 2846 s and 2rT between 2.4 and 5 s. The 2wB and the match required to combine 2D peaks were the primary parameters determining the quality of the deconvolution (2wB) and the combining of the different 2D peaks from one peak (both parameters). Unfortunately, the software does not allow different 2wB values to be set for different 2rT, even though metabolites eluting at high 2rT, e.g. cholic acid-d4, were better quantified with broader peak widths than metabolites eluting at low 2rT. In our case study, the 2wB was best set somewhat narrower (0.15 s) than the actual peak width of the narrowest peaks of interest (0.2 s baseline).

2.10 Data processing for GC × GC–MS

A computer with the following specifications was used: Intel Pentium® dual processor, CPU 3.4 GHz, 3 GB RAM, hard disk: Serial ATA, 7200 RPM, 16 MB cache, RAID 24/7 (Seagate Barracuda ES, 3.0 GB/s, 500 GB). All samples were processed with ChromaTOF V3.35 software with the following settings. Baseline tracking: default; baseline offset: 1.0; peak width: 0.15 s; segmented processing: peak find S/N 20, number of apexing masses 2; GC × GC parameters: match required to combine 500, peak width 90 s, mass threshold 0. Every individual entry was quantified on its unique mass in the mass spectrum, as determined by the ChromaTOF software. The peaks from the constructed calibration table were quantified in all QC and study samples.

2.11 Construction of calibration table for GC × GC–MS

One of the QC samples from the middle of the sequence was processed with the method described above, except that the peak find S/N was set to 200. As many artefact peaks as possible were removed, for example all peaks related to solvents and reagents (eluting at low 2rT) and multiple entries from highly concentrated tailing metabolites (i.e. phosphate). All remaining entries were added to a calibration table. Targets from the 1D-GC–MS target table that were unambiguously identified in the 2D-GC × GC–MS data, i.e. the identity was confirmed by the injection of an authentic standard or the mass spectrum of the metabolite was considered unique, were renamed (total 107 targets) in the 2D calibration table. The maximum 1rT deviation in the calibration table was set to 13 s for every entry. The retention time deviation was set to 0.1 s, the minimum area threshold was 0, the match threshold was 550 and the S/N threshold was set to 5.
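How such thresholds act together when a detected peak is assigned to a calibration entry can be sketched as follows. The sketch is not the ChromaTOF implementation: the function and field names are hypothetical, the spectral match score (0–1000) is taken as already computed, and the 0.1 s retention time deviation is assumed here to apply to the second-dimension retention time.

```python
def matches_entry(peak, entry, max_d_rt1=13.0, max_d_rt2=0.1,
                  min_match=550.0, min_sn=5.0, min_area=0.0):
    """Return True if a detected peak may be assigned to a calibration entry."""
    return (abs(peak["rt1"] - entry["rt1"]) <= max_d_rt1      # 1rT window (s)
            and abs(peak["rt2"] - entry["rt2"]) <= max_d_rt2  # assumed 2rT window (s)
            and peak["match"] >= min_match                    # spectral match, 0-1000
            and peak["sn"] >= min_sn
            and peak["area"] >= min_area)


# Invented example values, not taken from the study data.
entry = {"name": "hypothetical_entry", "rt1": 1200.0, "rt2": 3.10}
peak_ok = {"rt1": 1206.0, "rt2": 3.15, "match": 720.0, "sn": 42.0, "area": 1.5e5}
peak_low_match = dict(peak_ok, match=480.0)   # same peak, poorly deconvoluted spectrum
peak_shifted = dict(peak_ok, rt1=1220.0)      # outside the 13 s window
results = [matches_entry(p, entry) for p in (peak_ok, peak_low_match, peak_shifted)]
print(results)
```

In this sketch a peak that is present but has a low spectral match is rejected in the same way as a peak that is absent, so both cases end up as a blank value for that entry.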

2.12 Post processing of GC × GC–MS

The quantitative data for all 1025 targets in the calibration table were exported to Excel. Compounds that were not found in more than one QC sample were removed (825 entries left). Subsequently, entries with more than four blank values in all samples were removed from the data set (691 entries left excl. internal standards). Of course, a blank value can be obtained when the concentration of the metabolite is below the limit of detection. However, on many occasions blank values were obtained even when the peak of interest was present in the sample (further referred to as a missing value), due to a low spectrum match. A low match was mostly caused by mistakes in the deconvolution, either in the sample itself or in the sample used for the construction of the calibration table. However, the use of a selective mass from the mass spectrum for every metabolite (as defined in the reference table of the selected QC sample) still allows the quantification of wrongly deconvoluted peaks, although the reliability is lower. To fill the remaining missing values in the data set (total of 169 blank values), the chromatograms were reprocessed with a match threshold in the calibration table of 200 rather than 500. In this way the missing values for peaks that were unassigned due to a low match factor could be filled from the newly processed data. Of course, only correct assignments of these missing peak areas (as checked manually against the mass spectrum and retention time) were filled from the newly processed data. Then, all remaining peaks with missing values in the QC and/or study samples were checked and corrected manually by assigning the right peak in the chromatogram to the compound in the calibration table. The integration of the peaks and the combining of 2D peaks were not corrected, as this was extremely time-consuming and therefore considered an unrealistic option.
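The two filtering steps can be expressed compactly in code. The sketch below is an illustration under stated assumptions: the first criterion is interpreted as removing entries that are missing in more than one QC injection, blank values are represented by None, and the function name, entry names and peak areas are invented.

```python
def filter_entries(areas, qc_columns, max_missing_qc=1, max_blank_total=4):
    """Keep entries that pass both blank-value filters.

    areas: dict mapping entry name -> list of peak areas (None = blank value).
    qc_columns: indices of the QC injections within each list.
    """
    kept = {}
    for name, values in areas.items():
        missing_qc = sum(values[i] is None for i in qc_columns)
        blank_total = sum(v is None for v in values)
        if missing_qc <= max_missing_qc and blank_total <= max_blank_total:
            kept[name] = values
    return kept


# Invented toy table: nine injections, QC injections at positions 0, 4 and 8.
areas = {
    "entry_A": [10, 11, None, 12, 10, 9, 11, 10, 10],         # one blank: kept
    "entry_B": [None, 5, 6, 5, None, 6, 5, 6, 5],             # missing in two QCs
    "entry_C": [7, None, None, None, 7, None, None, 8, 7],    # five blanks in total
}
kept = filter_entries(areas, qc_columns=[0, 4, 8])
print(list(kept))
```

Entries that survive both filters but still contain blank values are the ones that would subsequently be filled by reprocessing or by manual assignment.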

3 Results and discussion

The present study was directed at performing and optimizing non-target data processing for GC × GC–MS. A set of 29 mouse liver samples was analyzed with both GC–MS and GC × GC–MS; both systems used identical injectors, injection methods and gas chromatographs. Therefore, the variability in the RSDs of internal standards and QC samples was caused by later stages of the analytical process (i.e. GC temperature ramp, second-dimension GC separation versus no second-dimension GC separation, detection and processing). Ideally, the same detectors should be used to compare one-dimensional and two-dimensional processing. However, in our experience the Chemstation software allows a more precise quantification in one-dimensional processing than can be achieved with the ChromaTOF software, e.g. due to (i) the possibility to set integration parameters for individual peaks, (ii) the ability to use qualifier masses (define ratios between masses that should be fulfilled to assign a target) and (iii) the absence of an automated deconvolution process. Therefore, we used an Agilent GC-quadrupole MS system for this comparative study (and the majority of metabolomics studies in our lab), even though the time-of-flight mass spectrometer is more sensitive.

The fully optimized GC–MS processing method had a targeted approach (see Sect. 2). All 170 quantified metabolites in all samples were automatically integrated, the integration results visually inspected and wrongly integrated peaks were manually corrected. The non-targeted GC × GC–MS data processing method was semi-automated, i.e. the construction of the target table and assignment of missing values required manual interaction; however, the integration of peaks or mistakes in the combination of peaks from the same entry were not corrected to reduce the processing time. The data-processing times and results for GC–MS and GC × GC–MS were compared. Furthermore, the general data quality of the GC × GC–MS analyses was investigated. Finally, the results obtained with the liver samples using GC–MS and GC × GC–MS were analyzed using multivariate statistics (PCA/PCDA) in order to identify time-resolved metabolite patterns. These data may provide biomarkers for the development and progression of insulin resistance/DM2 and insight into the metabolic dysregulations underlying the disease process.

3.1 Comparison of time required for processing of GC–MS and GC × GC–MS data

The workflow for the GC × GC–MS processing is shown in Table 1. In total, approximately 50 h of analyst time were required to optimize the processing and subsequently process the entire data set of 29 samples. Furthermore, approximately 61 h of computer time were needed for the processing. The processing of the data files with the computer was mostly done overnight, so that the optimization and processing of the data set could be performed within about 2 weeks. The GC–MS processing required about 40 h of analyst time: 25 h for the construction of the target table (including finding the targets by using the in-house database of commonly found metabolites, searching for new targets in this specific study and adjusting integration parameters for individual metabolites) and 15 h for the processing of one batch and, where necessary, correction of the integration of metabolites from the calibration table. Both processing methods are quite time-consuming, especially for the GC × GC–MS data, which required 25% more analyst time than GC–MS processing. However, the optimization and construction of the target table take a relatively long time, and every extra batch of about 30 samples will take approximately 20 and 10 h of analyst time for 2D and 1D processing, respectively. For the GC × GC–MS processing, the assignment of the missing values is very time-consuming (Table 1, step 8), even though we used a strategy to fill the missing values by reprocessing the data with a very low match threshold of 200 in the calibration table (cf. Sect. 2.12). This strategy decreased the required analyst time for this step (Table 1, step 8) by approximately 30–50%, from 30 h to about 15–20 h per batch. However, one major bottleneck in the processing was the speed of the software in combination with the computer speed. Due to the large data files obtained with GC × GC–MS, every action performed in the software is very slow, e.g. loading of a data file, selection of an assigned peak in the calibration table, the assignment of a peak to an entry in the calibration table and especially the correction of wrongly integrated and combined modulations from one entry. So even if no manual corrections are performed, the checking of the data is already very time-consuming. This might be improved using a faster computer; however, it is expected that the gain in speed will be limited. Probably, more efficient algorithms are required to improve the speed significantly. It should be noted that recently a newer and faster version of the ChromaTOF software (V 4.30) has become available. Most probably the computer processing time and the time required for the assignment of the blank values (step 8, Table 1) can be reduced with the new software.
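The time figures quoted above can be checked and extrapolated with a few lines of arithmetic. The sketch assumes that every follow-up batch costs the same number of analyst hours and that the first batch includes all optimization and table construction; the function names are hypothetical.

```python
def percent_more(new_h, ref_h):
    """Percentage by which new_h exceeds ref_h."""
    return 100.0 * (new_h - ref_h) / ref_h


def percent_saved(old_h, new_h):
    """Percentage of old_h saved when the task takes new_h instead."""
    return 100.0 * (old_h - new_h) / old_h


def cumulative_hours(first_batch_h, extra_batch_h, n_batches):
    """Analyst hours for n batches: set-up batch plus identical follow-up batches."""
    return first_batch_h + extra_batch_h * (n_batches - 1)


extra_2d = percent_more(50, 40)                         # first batch, 2D versus 1D
saving_low = round(percent_saved(30, 20), 1)            # step 8 reduced from 30 to 20 h
saving_high = percent_saved(30, 15)                     # step 8 reduced from 30 to 15 h
hours_2d = [cumulative_hours(50, 20, n) for n in (1, 2, 3)]
hours_1d = [cumulative_hours(40, 10, n) for n in (1, 2, 3)]
print(extra_2d, saving_low, saving_high, hours_2d, hours_1d)
```

Under these assumptions the relative difference in analyst time between the two methods grows with every additional batch, because the per-batch cost of 2D processing is about twice that of 1D processing.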

Table 1 Workflow for optimizing and carrying out GC × GC–MS data processing

3.2 Comparing data processing results of one-dimensional GC–MS with two-dimensional GC × GC–MS

3.2.1 Number of entries

The target table for GC–MS was constructed using a home-made database with retention times and mass spectra of over 600 reference metabolites (authentic standards), over 100 annotated metabolites (spectral match with NIST library) and over 200 unknown metabolites commonly found in blood products. Additionally, compounds detected specifically for this study in the QC sample (see Sect. 2) were added to the target table. A total of 170 targets were found in the liver samples, some with very low signal-to-noise ratios (S/N = 3 in reconstructed ion chromatogram (RIC)). In Table 2 the number of entries above a certain signal-to-noise ratio in the total ion current (TIC) after deconvolution are shown for GC–MS (determined with AMDIS deconvolution software; Stein 1999) and GC × GC–MS. Due to lower detection limits and higher peak capacities in GC × GC–MS compared with GC–MS, more possible compounds were detected.

Table 2 Number of entries in a GC–MS and GC × GC–MS chromatogram of a pooled mouse liver sample

For the construction of the GC × GC–MS target table an S/N cutoff of 200 was chosen. It should be mentioned that this S/N is calculated for the unique mass (RIC) determined by the software, rather than in the total ion current as used in Table 2. Therefore, the number of entries at this cut-off value was higher than in Table 2, i.e. 1034 entries were found with an S/N >200. It should also be mentioned that, owing to the oximation, two peaks can be obtained for some metabolites, so the actual number of metabolites detected is lower than the number of entries found.

3.2.2 RSDs of internal standards

The quantification of the internal standards was checked and errors in the integrations were corrected manually for both methods. The RSDs of the MS response for the labelled internal standards were calculated for both the GC–MS and the GC × GC–MS methods (Table 3). The chromatographic performance of the GC × GC–MS method was comparable to or even better than that of the GC–MS method, i.e. the RSDs of phenylalanine-d5 and glutamic acid-d3 were significantly better in the GC × GC–MS data than in the GC–MS data, and the other RSDs were comparable in both data sets. These results were in agreement with the results in Koek et al. (2008); in that study, better RSDs were obtained with GC × GC–MS than with GC–MS for a representative set of metabolites, in addition to better peak shapes and an improved mass loadability.

Table 3 Comparing the RSDs of normalized MS response for the internal standards for GC–MS and GC × GC–MS in all samples (QC and study samples)

3.2.3 RSDs in pooled QC samples

A set of pooled mouse-liver samples was used as QC samples. These samples were injected at the beginning and at the end of the sequence and between every six samples. In total five QC samples were measured over the course of the study. The RSDs of the MS response of target compounds that were found with both 1D and 2D GC–(×GC)–MS (total 107 targets) were compared. The RSDs for all compared metabolites are shown in Table S1 in the supplement.

For the majority of metabolites (70 metabolites) similar RSDs were found with both methods (<10% difference in RSDs), although generally the values of the RSDs for the GC × GC–MS data were slightly higher than those obtained with the semi-automated GC–MS processing (Fig. 1). However, it should be taken into account that the comparison is somewhat biased towards the GC–MS method. Because the targets from the GC–MS list were searched for in the GC × GC–MS data and not the other way around, metabolites that performed well in the GC × GC–MS method and were not present in the GC–MS target list or were not detected with the GC–MS method (due to higher detection limits) were not evaluated in this comparison.
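The RSD comparison can be reproduced for any pair of response series with the sketch below. It assumes that the RSD is based on the sample standard deviation and that the 10% criterion refers to an absolute difference in percentage points; the function names and the example responses are invented.

```python
import statistics


def rsd_percent(values):
    """Relative standard deviation (sample SD divided by mean) in percent."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def compare_rsd(rsd_1d, rsd_2d, margin=10.0):
    """Classify the 2D result relative to the 1D result (margin in percentage points)."""
    diff = rsd_2d - rsd_1d
    if diff > margin:
        return "poorer"
    if diff < -margin:
        return "better"
    return "similar"


# Invented QC responses of one metabolite in three injections.
example_rsd = rsd_percent([90, 100, 110])
labels = [compare_rsd(8.0, 25.0), compare_rsd(12.0, 15.0), compare_rsd(30.0, 12.0)]
print(example_rsd, labels)
```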

Fig. 1 Comparison of RSDs for metabolites in the QC samples of GC–MS and GC × GC–MS data after processing; only metabolites detected in both methods are shown

For six metabolites, better RSDs (differences in RSD of more than 10%) were obtained with the GC × GC–MS method. For 37 metabolites poorer RSDs (differences in RSD of more than 10%) were obtained in the GC × GC–MS processing. The chromatographic performance of GC × GC–MS, when compared visually, was comparable to or even better than that of GC–MS. This is in agreement with the comparable or even better RSDs for the manually corrected integration results of the internal standards (see above). The poorer RSD values for the peaks that were not manually corrected in GC × GC–MS were therefore caused by errors in the data processing. Seven of these compounds were overloaded (S-Table 1), which resulted in split peaks in the second dimension. Obviously, these peaks will not be integrated correctly in an automated fashion, neither in GC–MS nor in GC × GC–MS. In the GC–MS processing method overloaded peaks were manually integrated and therefore better RSDs were obtained.

For most other peaks the higher RSDs resulted from errors in the combining of 2D peaks belonging to the same metabolite. For peaks to be combined, the match between the mass spectra of different modulation cycles should meet the required match factor as set in the software. Decreasing the match required to combine, however, would risk combining peaks that originate from different metabolites, especially because m/z 73 ((CH3)3Si) and m/z 147 ((CH3)3SiOSi(CH3)2) are dominant masses in the mass spectra of silylated compounds. Furthermore, in most cases the problems with combining peaks were due to deconvolution faults, and decreasing the match factor would not be an option in these cases. For nine compounds, isomeric interference from a closely eluting peak was the cause of the combination error (S-Table 1). Due to the nature of the derivatization, two distinct compounds are formed for, for example, sugars and sugar-phosphates (cis- and trans-oxime forms). These two forms of one sugar elute close together in the first dimension and possess identical mass spectra. In these cases the chance of a wrong assignment of the identity or of mistakes in the combination of second-dimension peaks is high. Another obstacle that impaired the quantification of seven metabolites was the assignment of the unique mass in the mass spectrum by the ChromaTOF software. For these compounds the non-selective masses m/z 73 or m/z 147 were assigned as unique masses, while these masses are present in all mass spectra of silylated compounds. Due to interferences from (partly) coeluting compounds the integration of these metabolites was inaccurate. In principle, the masses used for quantification of these metabolites can be manually adjusted in the calibration table, and these mistakes can probably be avoided by selecting a more selective mass instead of m/z 73 or m/z 147. However, this requires extra time to check all automatically chosen quantification masses in the calibration table.
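The effect of the dominant silylation masses on the spectral match can be illustrated numerically. The sketch uses a cosine similarity scaled to 0–1000 as a stand-in for the match factor; this is an assumption and not the scoring algorithm of the ChromaTOF software, and both spectra are invented.

```python
import math


def match_factor(spec_a, spec_b, exclude=()):
    """Cosine similarity of two spectra (dicts m/z -> intensity), scaled to 0-1000."""
    mzs = (set(spec_a) | set(spec_b)) - set(exclude)
    dot = sum(spec_a.get(m, 0.0) * spec_b.get(m, 0.0) for m in mzs)
    norm_a = math.sqrt(sum(spec_a.get(m, 0.0) ** 2 for m in mzs))
    norm_b = math.sqrt(sum(spec_b.get(m, 0.0) ** 2 for m in mzs))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return 1000.0 * dot / (norm_a * norm_b)


# Invented spectra of two different silylated compounds: they share the
# dominant m/z 73 and 147 but differ in their only compound-specific ion.
spec_x = {73: 100.0, 147: 60.0, 204: 30.0}
spec_y = {73: 100.0, 147: 55.0, 217: 30.0}
with_tms = match_factor(spec_x, spec_y)
without_tms = match_factor(spec_x, spec_y, exclude=(73, 147))
print(round(with_tms), without_tms)
```

With the shared masses included, the two unrelated spectra score far above the match of 500 required to combine peaks (Sect. 2.10); with m/z 73 and 147 excluded, the score drops to zero. This is why lowering the required match carries a real risk of merging peaks from different metabolites.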

In summary, 70 metabolites were quantified correctly and 37 metabolites were quantified less accurately with the semi-automated GC × GC–MS data processing method compared to the semi-automated GC–MS processing. For seven of the less accurately quantified peaks the poorer quantification could not be attributed to the ChromaTOF software, but was due to overloading effects. Thus, for 70% of the evaluated metabolites the semi-automated GC × GC–MS data processing method worked as well as the manually corrected GC–MS reference method.

3.2.4 Summary on GC × GC–MS data quality

The goal of this study was to assess the feasibility of using a processing strategy based on commercially available software (i.e. ChromaTOF software, LECO) for the unbiased, non-target, semi-automated quantification of as many metabolites as possible in mouse liver samples measured with GC × GC–MS. The RSDs of the MS response for all entries were calculated in the QC samples. Fig. 2 gives an overview of the number of entries per RSD-range for both GC–MS and GC × GC–MS. In the GC × GC–MS data, obtained with a non-targeted approach, 224 entries still met the strict FDA requirement for targeted analysis in bioanalysis (RSD <15%) (U.S. Department of Health and Human Services et al. 2001), compared to 116 entries in the GC–MS data. This illustrates the added value of GC × GC–MS compared to GC–MS. In addition, the entries with higher RSDs still contain useful information when the differences between the compared groups of samples are larger than the analytical variation, as illustrated below (cf. Sect. 3.3).
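The RSD calculation and the count of entries below the 15% limit can be sketched as follows. The QC responses, entry names and function names are hypothetical and serve only to show the computation; the sample standard deviation (n − 1) is assumed.

```python
from statistics import mean, stdev

def rsd_percent(responses):
    """Relative standard deviation (sample SD / mean) in percent."""
    return 100.0 * stdev(responses) / mean(responses)

def count_within_limit(qc_table, limit=15.0):
    """Number of entries whose RSD over the QC samples is below the limit."""
    return sum(1 for r in qc_table.values() if rsd_percent(r) < limit)

# Hypothetical relative responses of two entries in five QC injections
qc_table = {
    "entry_001": [0.98, 1.02, 1.00, 1.01, 0.99],  # tight: low RSD
    "entry_002": [0.60, 1.10, 0.85, 1.40, 0.95],  # scattered: high RSD
}

for name, responses in qc_table.items():
    print(f"{name}: RSD = {rsd_percent(responses):.1f}%")
print("entries with RSD < 15%:", count_within_limit(qc_table))
```

Binning the same RSD values into ranges would give an overview of the kind shown in Fig. 2.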

Fig. 2 Overview of the number of entries in the QC samples per RSD-range for GC–MS and GC × GC–MS

3.3 Application

The aim of the GC(×GC)–MS study was to investigate the influence of a high fat diet on the metabolite profiles in the liver. A total of 24 mouse liver samples, i.e. t = 0 weeks (n = 8), t = 6 weeks (n = 8) and t = 12 weeks (n = 8) after the start of the high fat diet, were measured with GC × GC–MS and GC–MS. Development of insulin resistance was monitored in individual mice over time by performing glucose tolerance tests and measuring specific biomarkers in plasma, and hyperinsulinemic-euglycemic clamp analysis to assess insulin resistance in a tissue-specific manner as described by Kleemann et al. (2010). After 6 weeks the mice had developed insulin resistance in the liver and after 12 weeks also in skeletal muscle and fat tissue (white adipose tissue).

With GC × GC–MS many more compounds were measured than with GC–MS. Typical chromatograms of a mouse liver sample measured with GC–MS and GC × GC–MS are shown in Figs. 3 and 4. A total of 170 and 691 metabolites were quantified with GC–MS and GC × GC–MS, respectively, and analyzed using PCA and PCDA. First, the peaks that were present in both the GC–MS and the GC × GC–MS data set, hereafter referred to as overlap data, were analyzed with PCA (Fig. 5) and PCDA (Figure S-1, cf. Supplement). The first principal component (PC1) in PCA and the first linear discriminant (LD1) in PCDA mostly explained the variance between the t = 0 samples and the t = 6 and t = 12 samples, whereas PC2 (PCA) and LD2 (PCDA) explained some of the variance between t = 0 and t = 12 compared to t = 6 weeks. As can be seen from Fig. 5, the results from the two overlap data sets were very comparable, and the groups of mice from the different time points were rather well separated. The Mahalanobis distances, i.e. a measure of the separation of the groups that takes into account the spread of the samples within a group, were calculated in the overlap data after PCA (Mahalanobis 1930) (Table 4). The separation between the t = 0 (baseline) group and the high-fat-diet treated groups was clearly better in the GC × GC–MS data; only the separation of the t = 6 and t = 12 groups was slightly better in the GC–MS data set. Furthermore, more than ten of the 20 metabolites with the highest loadings in PCDA (LD1 and LD2, data not shown) were the same for the GC–MS and GC × GC–MS data, indicating the similarity of the two overlap data sets. Consequently, although the RSDs of the responses for the metabolites in the GC × GC–MS data were somewhat higher than in the GC–MS data, the biological information was preserved and even a slightly better group separation was obtained.
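A minimal sketch of a Mahalanobis distance between two groups in a two-dimensional PCA score space is given below. It assumes that the within-group spread is described by the pooled covariance of the two groups; the text does not specify which covariance estimate was used, so this choice, the PC scores and the function names are all hypothetical.

```python
import math

def mean_vec(points):
    """Centroid of a list of (PC1, PC2) score pairs."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(2)]

def scatter(points, centre):
    """2x2 sum-of-squares-and-products matrix around the centre."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - centre[0], p[1] - centre[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def mahalanobis_between(group_a, group_b):
    """Distance between group centroids, scaled by the pooled covariance."""
    ma, mb = mean_vec(group_a), mean_vec(group_b)
    sa, sb = scatter(group_a, ma), scatter(group_b, mb)
    dof = len(group_a) + len(group_b) - 2
    cov = [[(sa[i][j] + sb[i][j]) / dof for j in range(2)] for i in range(2)]
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    inv = [[cov[1][1] / det, -cov[0][1] / det],
           [-cov[1][0] / det, cov[0][0] / det]]
    d = [ma[0] - mb[0], ma[1] - mb[1]]
    q = sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))
    return math.sqrt(q)

# Hypothetical PC1/PC2 scores: same within-group spread, different shifts
t0 = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
t6_near = [(x + 2.0, y) for x, y in t0]
t6_far = [(x + 4.0, y) for x, y in t0]

near = mahalanobis_between(t0, t6_near)
far = mahalanobis_between(t0, t6_far)
print(f"near groups: {near:.2f}, far groups: {far:.2f}")
```

With identical within-group spread, doubling the shift between the centroids doubles the distance; a larger spread at the same shift would reduce it, which is what makes the measure suitable for comparing group separation between data sets.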

Fig. 3 Total-ion GC–MS chromatogram of a pooled mouse-liver sample

Fig. 4 Two-dimensional colour plot of a total-ion GC × GC–MS chromatogram of a pooled mouse-liver sample

Fig. 5 PCA analysis of the overlap data (107 entries) for GC–MS (a) and GC × GC–MS (b) of mouse liver after 0 (red, +), 6 (blue, ∆) and 12 (green, ●) weeks (Color figure online)

Table 4 Mahalanobis distances in the overlap data after PCA analysis of GC–MS and GC × GC–MS

Fig. 6 shows the result of the PCA analysis of the additional entries from the GC × GC–MS data set (all entries minus overlap entries) compared to the GC–MS data set (PCDA is shown in supplement Figure S-2). Again, the groups of mice from different time points were rather well separated, indicating the biological relevance of the additional peaks in explaining the differences between the three groups, i.e. these additional metabolites contain information on the differences in metabolic profiles during the development of insulin resistance. It should be mentioned that 170 entries were detected in GC–MS and only 107 of these metabolites were assigned in the GC × GC–MS data set. The remaining 63 entries in GC–MS, except for a maximum of six high-boiling compounds (molecular weight of the underivatized compound >600), are probably also present in the GC × GC–MS data set. However, these metabolites could not be unambiguously assigned, as no reference standards were available and/or their mass spectra were not unique. As the first-dimension column in GC × GC was a polar BPX50 column rather than the apolar DB5 column used for GC–MS, the elution order and retention times differed for the first-dimension separation, and retention times could not be used to identify metabolites found with one method in the other method.

Fig. 6 PCA analysis of additional entries (compared to GC–MS; 584 peaks) in GC × GC–MS of mouse liver after 0 (red, +), 6 (blue, ∆) and 12 (green, ●) weeks (Color figure online)

The added value of GC × GC–MS is also demonstrated when the top 20 metabolites with the highest loadings in PCDA (LD1) of the total GC–MS and the total GC × GC–MS data set are compared (Table 5). Most of the metabolites with high loadings in LD1 of the GC–MS data are found in the top 20 of LD1 of GC × GC–MS (bold in Table 5) or are ranked slightly lower with still good loadings (italics in Table 5). With GC × GC–MS, many additional metabolites with high loadings were found in LD1 of the PCDA model. In only a few cases did metabolites from the GC–MS top 20 rank lower due to poorer repeatability (i.e. 1-palmitoyl-sn-glycero-3-phosphocholine and gluconic acid); conversely, a few metabolites (e.g. taurine and tyrosine) ranked within the top 20 of GC × GC–MS but not in the top 20 of GC–MS due to better performance with GC × GC–MS. In summary, the GC × GC–MS analysis provides many additional possible biomarkers for the development and progression of insulin resistance.

Table 5 Top 20 metabolites with highest loading in LD1 in PCDA analysis

Fig. 7 shows box plots of the relative response of some high-ranked metabolites from the PCDA analysis of the GC × GC–MS data; these metabolites could provide more insight into the metabolic processes involved in the development and progression of insulin resistance caused by a high fat diet. For example, the campesterol concentrations (Fig. 7, compound A) were significantly lowered after 6 and 12 weeks of high fat diet. Campesterol is a phytosterol, i.e. a steroid derived from plants, which is known for its cholesterol-lowering properties when used as a food additive (de Graaf et al. 2002). Moreover, a low campesterol/sitosterol ratio in serum and liver has been associated with severe liver damage and liver transplantation in primary biliary cirrhosis (Nikkila et al. 1992a, b). In addition, the relative responses of metabolites B, C (linoleic acid), D (Fig. 7) and arachidonic acid were significantly lower after 6 and 12 weeks of high fat diet; metabolites B and D were present only in low concentrations and were probably for that reason not detected with GC–MS. These metabolites were all poly-unsaturated fatty acids, which are considered beneficial for health (Buckley and Howe 2009; Furuhjelm et al. 2009; Panza et al. 2009; Yaqoob 2009). Lowered levels of these metabolites are associated with an increased risk of inflammation and promotion of subacute chronic inflammation in metabolic disorders, which in turn is related to the development of obesity and insulin resistance (Cai et al. 2005; Verschuren et al. 2009). Furthermore, our results are consistent with the elevated levels of hepatic inflammation markers determined in plasma of the mice used herein, as well as with the increased activity of intrahepatic proinflammatory transcription factors such as NF-κB observed in liver biopsies adjacent to the ones used for the present study (Kleemann et al. 2010).

Fig. 7 Box plots of the relative concentrations of the PCDA variables in LD1: campesterol (a; P < 0.0001), M0617 (poly-unsaturated fatty acid methyl ester) (b; P < 0.0001), linoleic acid (c; P < 0.0001), M0600 (poly-unsaturated fatty acid) (d; P = 0.0002), tyrosine (e; P < 0.0001) and spermidine (f; P = 0.0015). P-values were calculated with one-way ANOVA (α = 0.05). Tukey post-hoc testing showed significant differences between groups t = 0 vs. t = 6 and t = 0 vs. t = 12 for all metabolites
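The one-way ANOVA named in the caption of Fig. 7 rests on the F statistic, which compares the variance between the group means with the variance within the groups. The sketch below computes only this statistic and its degrees of freedom; the responses are hypothetical, the function name is illustrative, and neither the P-value nor the Tukey post-hoc test is implemented here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the samples around their own group mean
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical relative responses of one metabolite at three time points
t0 = [1.0, 2.0, 3.0]
t6 = [2.0, 3.0, 4.0]
t12 = [5.0, 6.0, 7.0]

f_stat, df_b, df_w = one_way_anova_f([t0, t6, t12])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The P-value follows from the F distribution with these degrees of freedom; for three groups of eight mice, as in this study, they would be 2 and 21.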

Finally, the relative responses of tyrosine, spermidine (Fig. 7, compounds E and F) and beta-alanine were lower in the groups of mice sacrificed at t = 6 weeks and t = 12 weeks, while the level of taurine was significantly increased. The levels of amino acids are known to fluctuate during the development of insulin resistance (Huffman et al. 2009; Wijekoon et al. 2004). Interestingly, taurine has been suggested to have beneficial effects through its ability to reduce intracellular oxidative stress generation and glycooxidation (Anuradha 2009), yet it was the only metabolite in the GC × GC–MS PCDA top 20 that was significantly elevated in the animals after 6 and 12 weeks of high-fat diet. Furthermore, it is believed that certain amino acids play an important role in the development of diabetes and that dietary treatment with amino acids could prevent diabetes and diabetic complications (Anuradha 2009).

In conclusion, the added value of GC × GC–MS compared to GC–MS is clearly illustrated in this pre-clinical study. Although the RSDs of compounds in the QC samples for GC × GC–MS were somewhat higher than in the GC–MS data, the biological information in the data was preserved. In addition, many more candidate biomarkers were detected that were significant in explaining the differences between the sample groups in this study. Furthermore, the higher peak capacity resulted in cleaner mass spectra, facilitating the identification of possible biomarkers. Moreover, the position of the metabolite in the chromatogram (especially the second-dimension retention time, 2tR) can also aid in the identification process (data not shown).

3.4 Concluding remarks

The feasibility of semi-automated non-target processing of GC × GC–MS data using commercially available software was assessed. A set of mouse liver samples was measured with GC–MS and GC × GC–MS and a total of 170 and 691 peaks, respectively, were quantified. The performance of the two methods was compared by evaluating the RSD values in the quality-control samples of metabolites present in both datasets. Although the chromatographic performance was comparable or even slightly better for GC × GC–MS, as demonstrated for the manually integrated labelled internal standards, somewhat poorer RSDs were obtained for the relative responses of peaks determined in a semi-automated manner, due to less accurate processing. Still, a reliable and repeatable quantification was obtained for approximately 70% of the peaks, even though the integrations of the peaks from the GC × GC–MS data, in contrast with the GC–MS data, were not manually corrected.

In addition, GC × GC–MS processing is time-consuming, the major bottleneck being the speed of the software tools and algorithms. However, application of the strategy described in this paper is feasible for small studies with a maximum of about 30–50 samples (possibly measured in duplicate). For the routine application of GC × GC–MS in metabolomics in larger studies, further improvement of data processing tools is required.

The mouse-liver samples were measured to study the development and progression of insulin resistance. The added value of GC × GC–MS was clearly illustrated: (i) over four times more peaks could be quantified, (ii) the biological information as acquired in GC–MS was preserved, (iii) several extra candidate biomarkers for the development of insulin resistance were found, and (iv) the superior peak capacity resulted in cleaner mass spectra, in principle facilitating the putative identification of metabolites.