Comprehensive Insight into Colorectal Cancer Metabolites and Lipids for Human Serum: A Proof-of-Concept Study

Colorectal cancer (CRC) ranks as the third most frequently diagnosed cancer and the second leading cause of cancer-related deaths. The current endoscopy-based and stool-based diagnostic techniques are either highly invasive or lack sufficient sensitivity. Thus, there is a need for less invasive and more sensitive screening approaches. We therefore conducted a study on 64 human serum samples representing three different groups (adenocarcinoma, adenoma, and control) using cutting-edge GC×GC–LR/HR-TOFMS (comprehensive two-dimensional gas chromatography coupled with low/high-resolution time-of-flight mass spectrometry). We analyzed the samples with two specifically tailored sample preparation approaches, one for lipidomics (fatty acids; 25 μL serum) and one for metabolomics (50 μL serum). In-depth chemometric screening with supervised and unsupervised approaches and metabolic pathway analysis were applied to both datasets. The lipidomics study revealed that specific PUFA (ω-3) molecules are inversely associated with the odds of CRC, while some PUFA (ω-6) analytes show a positive correlation. The metabolomics approach revealed downregulation of amino acids (alanine, glutamate, methionine, threonine, tyrosine, and valine) and myo-inositol in CRC, while 3-hydroxybutyrate levels were increased. This unique study provides comprehensive insight into molecular-level changes associated with CRC and allows for a comparison of the efficiency of two different analytical approaches for CRC screening using the same serum samples and a single instrument.


Introduction
According to the World Health Organization (WHO), colorectal cancer (CRC) is the third most commonly diagnosed cancer and the second leading cause of cancer-related deaths globally [1]. The International Agency for Research on Cancer (IARC) projects a significant increase in the global burden of CRC: the number of new cases per year is projected to rise by 56% between 2020 and 2040, to over 3 million cases annually. The number of deaths from CRC is projected to increase by 69% during the same period, with approximately 1.6 million deaths worldwide expected in 2040. This increase is anticipated to be most prominent in countries with a high Human Development Index [1]. The abnormal growth of tissue protruding from epithelial cells of the colorectal mucosa develops into a polyp and progresses into a tumor. If it remains benign, it is known as an adenoma or precancerous neoplastic lesion; when the tumor progresses to the carcinogenic stage, it is known as an adenocarcinoma or malignant CRC tumor [2]. As shown in Figure 1, multiple risk factors have been linked to an increased risk of developing CRC. These risk factors include a medical history of cancer (in individuals or their relatives), inflammatory bowel disease (IBD), colon polyps, cholecystectomy, and diabetes mellitus. The lifestyle-associated risk factors include physical inactivity, overweight and obesity, alcohol consumption, cigarette smoking, and inappropriate dietary patterns (a diet high in red and processed meat; a diet low in fiber, fruits, vegetables, calcium, vitamin D, and dairy products). Furthermore, age, gender, race, and gut microbiota are also considered risk factors that influence CRC [3].
The current standard tests suggested by European guidelines to maintain quality assurance in the screening and diagnosis of CRC include the Guaiac Fecal Occult Blood Test (gFOBT), Immunochemical FOBT (iFOBT or FIT), flexible sigmoidoscopy (FSIG), colonoscopy, and tissue biopsy [4,5]. The endoscopy-based approaches are invasive but sensitive, while stool-based approaches are non-invasive but less sensitive. Although these guidelines acknowledge newer screening techniques such as computed tomography (CT) colonography, capsule endoscopy, and stool DNA testing as emerging possibilities, they do not recommend using them for screening and diagnostic purposes in the average-risk population [5]. Therefore, there is a need for a less invasive and more sensitive analytical method for the screening of CRC.
The rapid advancement of high-throughput "omics" approaches, such as metagenomics, transcriptomics, proteomics, metabolomics, lipidomics, microbiomics, and volatolomics, offers potentially less invasive alternatives to the available techniques for developing novel biomarkers for CRC screening that could contribute to its clinical management (Figure S1) [6]. Each approach offers its own advantages for biomarker discovery, cancer screening, and diagnosis. Biomarker specificity and metabolite/lipidome identification can often vary depending on the type of biospecimen investigated (e.g., adipose tissue, tumor tissue, dried blood spot, plasma, serum, urine, and stool) [7].
Metabolomics is the comprehensive study of small molecules within biofluids, cells, tissues, or living organisms. These small molecules, commonly known as metabolites, are low-molecular-weight organic compounds, usually within a mass range of 50-1500 Daltons (Da). Metabolites typically act as substrates or products of metabolic pathways in a living organism. Collectively, the total set of metabolites and their interactions within a biological system at any specific time point is known as the metabolome. The metabolome comprises complex mixtures of thousands of molecules with a wide range of chemical diversity, including nucleotides, oligopeptides, amino acids, organic acids, lipids, and sugars. The analytical strategies for metabolomics are predominantly based on NMR (nuclear magnetic resonance), chromatography coupled to mass spectrometry (MS), and mass analyzers [8,9]. Various studies have reported the use of different analytical techniques, such as direct infusion MS [10], GC-MS (gas chromatography-MS), LC-MS (liquid chromatography-MS) [11][12][13], and UHPLC-MS (ultra-high-performance liquid chromatography-MS) [14], to investigate the CRC-related metabolome in different biological matrices, including plasma [15,16], serum [17][18][19][20][21][22], urine [23,24], and stool [25][26][27].
Lipidomics, or the comprehensive analysis of lipids, has emerged as a new branch of metabolomics within the omics cascade, owing to the wide range of lipid molecules with complex hydrophobic and amphiphilic natures [28,29]. Lipids are generally classified into eight classes based on the presence of isoprene and ketoacyl groups. These are fatty acids (FAs), glycerophospholipids, glycerolipids, sterol lipids, prenol lipids, saccharolipids, sphingolipids, and polyketides [30][31][32]. FAs play an essential role in many biological functions: they are fundamental components for maintaining the fluidity and structural stability of all cell membranes, they serve as building blocks for structurally more complex lipids, and they act as energy storage molecules [33]. Lipidomic analysis is commonly performed by LC coupled with MS and MS-related techniques [34]. Several studies have explored the influence of FAs in CRC using various analytical techniques. For instance, Yaping Zhang [35] analyzed serum samples using Fourier transform ion cyclotron resonance MS (FTICR MS), and GC-MS has been used to analyze different types of samples, including plasma [36], serum [37], and tissue [38,39].
Over the past few years, we have successfully utilized the GC×GC-TOFMS (comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry) technique in various disease phenotyping and biomarker discovery applications, such as asthma, systemic sclerosis, and Crohn's disease [9,40,41,42,43]. Our previous findings have established a strong foundation for the application of GC×GC-TOFMS in disease diagnosis, demonstrating its potential as a powerful analytical tool in the omics field. The importance of dysregulated FA metabolism in cancer is well studied [31,44]. The coverage of FAs in metabolomics methods largely depends on the type of sample preparation and analytical method used, resulting in less extensive coverage of unsaturated long-chain fatty acids (C13 to C21, LFAs) and unsaturated very long-chain fatty acids (≥C22, VLFAs) compared to methods developed explicitly for lipids (FAs) [45]. The aim of this research is to identify sensitive and less invasive diagnostic biomarkers for CRC using GC×GC coupled with low- and high-resolution TOFMS. This study demonstrates a proof of concept for the efficacy of a less exhaustive and more automated lipidomics approach in CRC screening compared to a metabolomics workflow from an analytical perspective. Furthermore, applying both the metabolomics and lipidomics analytical approaches to the same samples is a powerful way to compare and combine the overall omics information in CRC relative to healthy volunteers (control) and benign (adenoma) conditions, and to gain a deeper molecular-level understanding of CRC.

Results
In this study, a total of 64 serum samples were analyzed, representing three different groups: adenocarcinoma (n = 20), adenoma (n = 23), and control (n = 21). The analysis was conducted using two different analytical methods, specifically targeting lipids (fatty acids) and metabolites with GC×GC-LR/HR-TOFMS. Table 1 presents a comprehensive summary of the clinical and demographic characteristics of the participants, encompassing the three distinct study groups that were subjected to analysis.
The selection of CRC lipid biomarkers was based on the following three statistical criteria for multi-group analysis: first, the false discovery rate (FDR) from one-way analysis of variance (ANOVA); second, the variable importance in projection score (VIP score) from partial least squares-discriminant analysis (PLS-DA); third, the Mean Decrease Accuracy (MDA score) from the random forest (RF) algorithm. With a cut-off of FDR < 0.05, nine significant features were identified (11 features with a p-value < 0.05). The VIP score plot is used to assess the importance of features in PLS-DA; a total of eight features were identified as significant with a cut-off of VIP score > 1. With a cut-off of MDA > 0.008, the top eight features remained the same as in the first two tests. Higher MDA values indicate greater importance for classification accuracy.
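The difference between the raw p-value cut-off and the FDR cut-off can be illustrated with a short sketch. The text does not state which FDR procedure was used; the Benjamini-Hochberg adjustment is assumed here, the function name `benjamini_hochberg` is illustrative, and the p-values are hypothetical rather than data from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (FDR) in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw ANOVA p-values for five features (not study data).
raw_p = [0.001, 0.008, 0.039, 0.041, 0.60]
fdr = benjamini_hochberg(raw_p)

# Features kept under each criterion.
passed_raw = [i for i, p in enumerate(raw_p) if p < 0.05]
selected = [i for i, q in enumerate(fdr) if q < 0.05]
print(passed_raw, selected)
```

With these hypothetical values, four features pass the raw p-value threshold but only two pass the FDR threshold, which mirrors why the FDR cut-off retains fewer features than the p-value cut-off.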
After identifying the eight significant features, unsupervised analysis was performed. A PCA plot was generated to visualize the improvement in the clustering of the groups compared to using all features (Figure S2a). Using the top features, a small sub-cluster within the adenocarcinoma group was observed (Figure S3). Further investigation revealed that it was related to the different stages of cancer based on the pTNM staging system. As depicted in Figure 3a, a clear clustering trend was observed between adenoma, control, and the different stages of adenocarcinoma (stage 1, stage 2, and stage 3). The heat map, generated using Pearson's correlation as the similarity distance measure and Ward's linkage as the clustering algorithm, also showed a clear clustering trend between the three groups. Thus, these findings suggest that the selected eight fatty acid analytes can potentially differentiate among the adenocarcinoma, adenoma, and control groups.
To evaluate the efficiency of the selected features using the lipidomics workflow, two supervised approaches, namely PLS-DA and RF, were used (Figure 4). With 5-fold cross-validation for PLS-DA, the accuracy, goodness of fit (R²), and goodness of prediction (Q²) were 0.93, 0.94, and 0.93, respectively, compared to 0.75, 0.89, and 0.93 when using all features. For RF, the package has built-in support for cross-validation, eliminating the need to split the data into training and validation sets. One-third of the samples from the dataset were randomly selected as a validation set. The model was built on the remaining two-thirds of the samples and used to predict the classes of the validation set. The misclassification rate of the validation set was calculated, and the procedure was repeated multiple times to obtain an average out-of-bag (OOB) error estimate. The OOB error was reduced from 0.203 (using all features) to 0.015 (using selected features). The selected features exhibited low class error rates of 0, 0, and 0.0476 for adenocarcinoma, adenoma, and control, respectively.
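For reference, the goodness of fit and goodness of prediction reported for PLS-DA are commonly computed from residual and predictive sums of squares. The sketch below uses those usual definitions (R² = 1 − RSS/TSS from fitted values, Q² = 1 − PRESS/TSS from cross-validated predictions). The exact computation in the software used for this study is not stated and may differ, the function name `r2_q2` is illustrative, and the response values are hypothetical.

```python
def r2_q2(y_true, y_fitted, y_cv):
    """Return (R2, Q2): R2 from fitted values, Q2 from cross-validated predictions."""
    mean_y = sum(y_true) / len(y_true)
    # Total sum of squares around the mean response.
    tss = sum((y - mean_y) ** 2 for y in y_true)
    # Residual sum of squares of the model fitted on all samples.
    rss = sum((y - f) ** 2 for y, f in zip(y_true, y_fitted))
    # Predictive residual sum of squares from held-out predictions.
    press = sum((y - p) ** 2 for y, p in zip(y_true, y_cv))
    return 1 - rss / tss, 1 - press / tss


# Hypothetical dummy-coded class membership (1 = in class, 0 = not in class).
y_true = [1, 1, 0, 0]
y_fitted = [0.9, 0.8, 0.1, 0.2]  # predictions from the full model
y_cv = [0.8, 0.7, 0.3, 0.2]      # predictions made while each sample was held out

r2, q2 = r2_q2(y_true, y_fitted, y_cv)
print(round(r2, 2), round(q2, 2))
```

A Q² close to R² indicates that the model predicts held-out samples almost as well as it fits the training data, whereas a large gap between the two suggests overfitting.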

Metabolomic Profiling of Colorectal Cancer Serum by GC×GC-LR/HR-TOFMS
The same 64 serum samples were analyzed with a metabolomics approach. The approach involved a two-stage derivatization process in which 50 µL of serum was used. In total, 230 analytes were observed after the removal of artifacts (i.e., solvent interference, siloxanes), of which 105 metabolites were putatively identified. The metabolites ranged widely in nature and included amino acids, carboxylic acids, carbohydrates, fatty acids, nucleosides, purines, and vitamins. First, all the features were analyzed using unsupervised analysis (PCA, heatmap, and dendrogram) to observe clustering trends (Figure S4). However, no clear or even partial separation was observed among the adenocarcinoma, adenoma, and control groups. Therefore, the same set of chemometric tests used for lipidomics was applied to find significant features for CRC screening with the metabolomics approach. Upon applying a one-way ANOVA test with a p-value cut-off of 0.05, 11 features were identified as significant. Applying the more stringent FDR cut-off resulted in only five of the 105 features being identified as significant. Therefore, the top 11 features with a p-value < 0.05 were selected, which were the same as the top 11 features of the VIP score plot (VIP > 1.6). Moreover, 13 features were found to be significant with an MDA > 2 cut-off. Of these, the top eight features were common to all the tests (except for one feature, which was in the ninth position according to MDA). Using these chemometric tools in conjunction, we identified and selected a subset of eight features (Figure 5, Table 3) that achieved high classification accuracy in distinguishing the adenocarcinoma, adenoma, and control groups (Table S1). As shown in Figure 5, six of the top eight significant metabolites belonged to the amino acid class. Among these, three (L-alanine, L-methionine, and L-valine) were non-polar, uncharged aliphatic amino acids, and one (L-tryptophan) was a non-polar, uncharged aromatic amino acid. Additionally, one was a polar acidic amino acid (L-glutamic acid), and one was a polar aromatic amino acid (L-tyrosine). All of these amino acids were found to be downregulated in CRC. In addition, the six-carbon cyclic polyol (sugar alcohol) myo-inositol was also downregulated in CRC [43]. On the other hand, the four-carbon organic acid 3-hydroxybutyric acid was found to be upregulated in CRC.

After identifying the eight significant features, unsupervised analysis was performed. A PCA plot was generated to visualize the improvement in the clustering of the groups compared to using all features (Figure S4a). However, sample number 47 from the control group was found to affect the separation (Figure S5a,b). It is important to note that sample number 47 was not identified as an outlier when performing the Grubbs test with all features. However, it needed to be excluded to assess the actual separation capacity of the selected features. As depicted in Figure 6a, a clustering trend with slight overlap was observed between the adenoma, control, and adenocarcinoma classes. However, it was not possible to separate the different stages of adenocarcinoma based on pTNM staging using the metabolomics approach (Figure S5c). The heatmap showed similar results. The overlap between the control and adenoma groups was slightly higher than that between the adenoma and adenocarcinoma groups, which could be because adenoma samples represent an early precancerous stage that is closer to the control group than to the adenocarcinoma group.
To assess the efficiency of the selected features using the metabolomics approach, the supervised analysis techniques PLS-DA and RF were employed (Figure 7). With 5-fold cross-validation for PLS-DA, the accuracy, R², and Q² were 0.81, 0.92, and 0.90, respectively, compared to 0.50, 0.95, and 0.65 when using all features. The out-of-bag (OOB) error was reduced from 0.375 (using all features) to 0.0476 (using selected features). The selected features exhibited low class error rates of 0, 0.087, and 0.05 for adenocarcinoma, adenoma, and control, respectively.
Furthermore, all the features of the lipidomics and metabolomics approaches were subjected to quantitative enrichment analysis (QEA) (Figure 8) to find biologically meaningful groups of metabolites in the data. For the lipidomics approach, alpha-linolenic acid and linoleic acid metabolism was the key metabolic pathway affected in CRC (Figure 8). For the metabolomics approach (Figure 8b), amino acid pathways were the most affected. Receiver operating characteristic (ROC) curves were generated to evaluate the ability of the biomarkers to distinguish CRC during the screening process (Figure 9).
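The class error rates reported for the metabolomics RF model can be read as per-class misclassification fractions. As a worked check, the sketch below uses a hypothetical confusion matrix. The group sizes follow the text (20 adenocarcinoma, 23 adenoma, and 20 controls, assuming sample 47 was also excluded from the RF model), but the number of misclassified samples and the classes they were assigned to are assumptions chosen to be consistent with the reported rates, not values taken from the study. The function name `class_error_rates` is illustrative.

```python
def class_error_rates(confusion):
    """Return (per-class error rates, overall error rate).

    confusion[true_label][predicted_label] holds sample counts.
    """
    per_class = {}
    total = 0
    wrong = 0
    for true_label, row in confusion.items():
        n = sum(row.values())
        # Everything not on the diagonal is a misclassification.
        errors = n - row.get(true_label, 0)
        per_class[true_label] = errors / n
        total += n
        wrong += errors
    return per_class, wrong / total


# Hypothetical counts (assumed, not reported): rows are true classes.
confusion = {
    "adenocarcinoma": {"adenocarcinoma": 20, "adenoma": 0, "control": 0},
    "adenoma": {"adenocarcinoma": 0, "adenoma": 21, "control": 2},
    "control": {"adenocarcinoma": 0, "adenoma": 1, "control": 19},
}

per_class, overall = class_error_rates(confusion)
for label, rate in per_class.items():
    print(label, round(rate, 4))
print("overall", round(overall, 4))
```

Under these assumptions, the per-class errors round to 0, 0.087, and 0.05 and the overall error rounds to 0.0476, which agrees with the reported values.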

Discussion
Comprehensive lipidomics and metabolomics analytical workflows were implemented using GC×GC-TOFMS for the screening of CRC using the same samples from three different groups: adenocarcinoma, adenoma, and control (healthy volunteers). Typically, biomarker investigations for CRC screening are performed using either lipidomics or metabolomics approaches separately, which limits the ability to compare their performance on the same samples.
In this study, a comparison of the results from both approaches suggests that the lipidomics approach has advantages over the metabolomics approach for CRC screening. The lipidomics approach is less time-consuming and can be automated with a dual-head autosampler accommodating two different syringe volumes for sample preparation. When the efficiency of both approaches for CRC screening was compared using the same set of samples, the lipidomics approach (Figure 3) appeared to perform more efficiently than the metabolomics approach (Figure 6). Moreover, the lipidomics approach showed the potential to differentiate between the different stages of adenocarcinoma, adenoma, and control samples while monitoring fewer analytes (Figure 3). The lipidomics results suggest that long- and very long-chain PUFA (ω-3) and PUFA (ω-6), which are usually not efficiently detected using the metabolomics approach, play a major role in the progression of CRC [48]. Overall, the significant PUFA (ω-6) features were upregulated in CRC (adenocarcinoma > adenoma > control), while the PUFA (ω-3) features were downregulated in CRC (adenocarcinoma < adenoma < control). Figure 10 illustrates that both PUFA (ω-3) (such as eicosapentaenoic acid (EPA)) and PUFA (ω-6) (such as arachidonic acid (AA)) share a common biochemical pathway mediated by the same enzymes but exert distinct physiological effects through the production of different types of prostanoids (a subclass of eicosanoids). Series 2 and 4 eicosanoids are more inflammatory (pro-inflammatory), while series 3 and 5 eicosanoids are less inflammatory (anti-inflammatory) [49]. PUFA (ω-6) have been implicated in the synthesis of pro-inflammatory prostanoids, which are known to promote inflammation and have been reported to be upregulated in CRC [49,50]. Conversely, PUFA (ω-3) have been linked with the synthesis of anti-inflammatory prostanoids, which have been reported to be downregulated in CRC [51,52].

A few studies have identified palmitic acid (C16:0) and stearic acid (C18:0) as significant fatty acids in CRC using metabolomics approaches [54][55][56]. However, unsaturated LFAs and VLFAs were found to be significant only using the lipidomics but not the metabolomics approach. This may be due to differences in sample preparation and the relative abundance of these fatty acids. A tailored sample preparation method for lipids can more efficiently extract unsaturated LFAs and VLFAs, which are present in lower abundance than SFAs and MUFAs in blood. Moreover, metabolomics approaches need to maintain broad selectivity towards various classes of metabolites, which can affect the extraction and detection rates of these fatty acids. Therefore, investigating the same serum sample using both approaches can reveal unique molecular changes involved in CRC. Using the metabolomics approach, the most frequently identified metabolite class perturbated in CRC is amino acid. The proteogenic amino acids (alanine (Ala), glutamic acid (Glu), methionine (Met), tryptophan (Thp), tyrosine (Thr), and valine (Val)) are downregulated in CRC. Amino acids can be preferentially metabolized to provide energy for the increased metabolism of cancer cells, act as precursors for the excessive synthesis of nucleotides in cancer cells or neutralize the heightened production of reactive oxygen species (ROS) by cancerous cells. Additionally, amino acids can also function as transcriptional or epigenetic regulators, fueling cancer-specific processes [54,57]. 3-Hydroxybutyrate, or beta-hydroxybutyrate (BHB), is a ketone body metabolite synthesized in the liver mitochondria from acetyl-CoA is a product of fatty acid degradation. It also serves as an essential energy source for the body. The level of 3-Hydroxybutyrate was increased in CRC compared to adenoma and control [54]. We acknowledge the limitation related to the limited number of samples in each cohort to make any definitive statements. 
Figure 10. Short schematic summary of pathways for the synthesis of long-chain omega-6 and omega-3 PUFA [53]. PGE: prostaglandin E; TXA: thromboxane A; PGI: prostacyclin; LTB: leukotriene B (* fatty acids found to be significant in CRC screening using the lipidomics approach).
A few studies have identified palmitic acid (C16:0) and stearic acid (C18:0) as significant fatty acids in CRC using metabolomics approaches [54–56]. However, unsaturated LFAs and VLFAs were found to be significant only with the lipidomics approach and not with the metabolomics approach. This may be due to differences in sample preparation and in the relative abundance of these fatty acids. A sample preparation method tailored to lipids can more efficiently extract unsaturated LFAs and VLFAs, which are present in lower abundance than SFAs and MUFAs in blood. Moreover, metabolomics approaches need to maintain broad selectivity towards various classes of metabolites, which can affect the extraction and detection rates of these fatty acids. Therefore, investigating the same serum sample using both approaches can reveal unique molecular changes involved in CRC. Using the metabolomics approach, the most frequently identified class of metabolites perturbed in CRC was amino acids. The proteinogenic amino acids alanine (Ala), glutamic acid (Glu), methionine (Met), threonine (Thr), tyrosine (Tyr), and valine (Val) were downregulated in CRC. Amino acids can be preferentially metabolized to provide energy for the increased metabolism of cancer cells, act as precursors for the excessive synthesis of nucleotides in cancer cells, or neutralize the heightened production of reactive oxygen species (ROS) by cancerous cells. Additionally, amino acids can function as transcriptional or epigenetic regulators, fueling cancer-specific processes [54,57]. 3-Hydroxybutyrate, or beta-hydroxybutyrate (BHB), is a ketone body synthesized in liver mitochondria from acetyl-CoA, a product of fatty acid degradation, and it serves as an essential energy source for the body. The level of 3-hydroxybutyrate was increased in CRC compared to adenoma and control [54].
We acknowledge that the limited number of samples in each cohort does not allow definitive statements. However, these results serve as a proof of concept that can be used for power calculations and to guide the design of a larger-scale study. As discussed in detail in our previously published work that used the same sample preparation protocol [58], the structured chromatographic separation provided by GC×GC-TOFMS yields valuable insights in lipid analysis without requiring in-depth MS/MS investigations in untargeted analysis. The optimized sample preparation and the known lipid molecules for CRC screening make it feasible to transfer the method to 1D GC-MS, which is cost-effective and scalable. Such a transfer would be highly challenging for the metabolomics approach, which requires the increased separation power of GC×GC to monitor hundreds of metabolites.

Chemicals, Standards, and Samples
The derivatizing agents for the lipidomics sample preparation method, 0.5 M sodium methoxide (CH3ONa) and 20% boron trifluoride (BF3) methanolic solution, were purchased from Thermo Fisher and Sigma-Aldrich, respectively. n-Heptane was purchased from Biosolve. A Supelco 37 FAMEs standard mixture and an n-alkanes mixture (C7-C30) were purchased from Sigma-Aldrich and Millipore Sigma, respectively. The methyl undecanoic acid in heptane used as an internal standard was obtained from Reagecon, Ireland. Standard solutions of 10 ppm FAMEs in dichloromethane, 10 ppm n-alkanes in hexane, and 500 ppm methyl undecanoic acid (IS-1′) in n-heptane were prepared.
This study used pooled human plasma purchased from TCS Biosciences (Buckingham, UK) as the quality control (QC) sample, injected at an interval of every five injections for both methods. The certified reference standard SRM 1950 "Metabolites in frozen human plasma" was procured from NIST (National Institute of Standards and Technology, Rockville, Maryland, USA). The blood samples were collected in BD® Vacutainer® SST™ 3.5 mL dry tubes (cat. no. 367957) containing an inert gel barrier that separates the serum from the blood clot during centrifugation. The clotting time was 30 min at room temperature, after which the tubes were centrifuged at 2000 RCF at 4 °C for 10 min. Finally, 100 µL of serum per sample was stored at −80 °C. Before sample preparation, each serum sample was thawed at room temperature for 15 min. A total of 64 serum samples were analyzed with both analytical techniques (metabolomics and lipidomics) (Table 1).
For the lipidomics analytical method, the starting oven temperature was 55 °C (held for 2 min); the temperature was then increased to 155 °C at 30 °C/min, followed by a ramp of 2 °C/min to 245 °C. Finally, the temperature was raised to 300 °C at 30 °C/min and held for 2 min. The modulation period was 8 s (hot pulse: cool time between stages, 0.40:3.60). The total run time for the GC method was 54.1 min. The modulator temperature offset was +15 °C. Mass spectra were acquired over a mass range of m/z 45-700 at an acquisition rate of 150 spectra/s using positive-mode electron ionization (EI) at 70 eV. The transfer line and ion source temperatures were maintained at 250 °C and 230 °C, respectively.
For the metabolomics analytical method, the GC oven program started at 50 °C (held for 2 min), followed by a ramp of 3 °C/min to 240 °C. Finally, the temperature was raised to 300 °C at 30 °C/min and held for 5 min. The modulation period, modulator temperature offset, and MS method were the same as for the lipidomics method described above.
The same chromatographic and MS methods were used for the GC×GC-HRTOFMS analyses of both lipidomics and metabolomics samples, performed with a Pegasus GC-HRT 4D (LECO Corporation, St Joseph, MI, USA) equipped with an Agilent 7890 GC (Santa Clara, CA, USA).
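As a quick consistency check on the oven programs described above, the total run time can be recomputed from the hold times and ramp rates. The following is a minimal sketch; the function name `run_time_min` and the tuple layout for the ramps are illustrative assumptions, not part of the instrument software. For the lipidomics program the result (about 54.2 min) is in line with the reported 54.1 min; the metabolomics value is only a calculated estimate, since the text does not state a run time for that method.

```python
def run_time_min(start_temp_c, initial_hold_min, ramps):
    """Return the total oven program time in minutes.

    ramps: list of (rate in degC/min, target temperature in degC, hold in min).
    The layout of this list is an assumption made for illustration.
    """
    total = initial_hold_min
    temp = start_temp_c
    for rate, target, hold in ramps:
        total += (target - temp) / rate + hold  # ramp duration plus hold
        temp = target
    return total


# Lipidomics: 55 degC (2 min) -> 155 degC at 30 degC/min -> 245 degC at 2 degC/min
# -> 300 degC at 30 degC/min (2 min hold)
lipidomics_min = run_time_min(55, 2, [(30, 155, 0), (2, 245, 0), (30, 300, 2)])

# Metabolomics: 50 degC (2 min) -> 240 degC at 3 degC/min
# -> 300 degC at 30 degC/min (5 min hold)
metabolomics_min = run_time_min(50, 2, [(3, 240, 0), (30, 300, 5)])

print(f"lipidomics:   {lipidomics_min:.2f} min")
print(f"metabolomics: {metabolomics_min:.2f} min")
```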

Sample Preparation
The samples for the lipidomics method were prepared as described by Bhatt et al. [58]. Additionally, 2 µL of 500 ppm methyl undecanoic acid in n-heptane was spiked as the internal standard (IS-1′), as shown in Figure 11a. The metabolomics sample preparation protocol (Figure 11b) was based on [9], with minor changes made to account for differences in instrument sensitivity. The samples were collected at room temperature, stored at −80 °C, and thawed at room temperature for 20 min before sample preparation.

QAQC: The Injection Sequence
Pooled human plasma was used as the QC sample at an interval of every five injections for both methods (Figure 12). First, a system blank, the n-alkane mixture, the 37 FAMEs standard mixture, three injections of NIST plasma metabolites, and a sample blank were injected at the beginning and end of the entire analytical batch (before and after the analysis of all samples). Second, to establish the baseline for the QC chart, 10 pooled human plasma samples were injected after step 1 and before the actual serum sample analysis (here, a system blank was injected before these injections, after five injections, and at the end to check for carryover). All serum samples (n = 64) were injected in randomized order, with QC samples (pooled human plasma) at an interval of every five injections. Representative analytes were selected to monitor the QC samples with both approaches. For lipidomics, the average % RSD ± SD of the monitored analytes for raw area, ¹D Rt, and ²D Rt were 10.03 ± 1.10, 0.12 ± 0.08, and 1.40 ± 0.65, respectively (Figure S6). For metabolomics, the corresponding values were 17.04 ± 2.86, 0.18 ± 0.05, and 3.21 ± 3.64, respectively (Figure S7). Moreover, the reference standard NIST SRM 1950 (in triplicate), the alkane mixture, and the FAMEs standard mixture were analyzed using GC×GC-HR-TOFMS to enhance identification confidence for both analytical approaches and to annotate MSI confidence levels [46,47]. In addition, for the metabolomics approach, an actual pooled QC sample representing each class was also analyzed.
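The QC statistic reported above (average % RSD ± SD across monitored analytes) can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the analyte names and peak areas are invented placeholders rather than study data, and the helper names `percent_rsd` and `summarize_qc` are hypothetical.

```python
from statistics import mean, stdev


def percent_rsd(values):
    """Relative standard deviation (sample SD / mean) expressed in percent."""
    return 100.0 * stdev(values) / mean(values)


def summarize_qc(qc_table):
    """Return per-analyte % RSD plus the mean and SD of those % RSD values."""
    rsd_by_analyte = {name: percent_rsd(vals) for name, vals in qc_table.items()}
    rsd_values = list(rsd_by_analyte.values())
    return rsd_by_analyte, mean(rsd_values), stdev(rsd_values)


# Hypothetical raw peak areas of three monitored analytes across five QC injections.
qc_areas = {
    "analyte_A": [100.0, 110.0, 90.0, 105.0, 95.0],
    "analyte_B": [200.0, 180.0, 220.0, 210.0, 190.0],
    "analyte_C": [50.0, 55.0, 45.0, 52.0, 48.0],
}

per_analyte, avg_rsd, sd_rsd = summarize_qc(qc_areas)
for name, rsd in per_analyte.items():
    print(f"{name}: {rsd:.2f} % RSD")
print(f"average % RSD ± SD: {avg_rsd:.2f} ± {sd_rsd:.2f}")
```

The same calculation applies to first- and second-dimension retention times by replacing the area values with retention times.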

Data Processing and Chemometrics
Data processing included exporting the data from ChromaTOF® (ver. 4.72, LECO Corp., St. Joseph, MI, USA) with baseline correction in the ANDI-MS format (.cdf) and processing them using GC Image™ (ver. 2021r, Zoex Corp., Nebraska, USA). The putative identification of analytes was based on the following criteria: (i) electron ionization mass spectral libraries (main EI MS database (mainlib) and replicate spectra database (replib)) with a spectral match (MS %) of ≥700/1000, (ii) linear retention indices (LRI) within a ±20 window (C7-C30 alkane mixture), (iii) mass accuracy within ±4 ppm (when using the HRMS detector), and (iv) the NIST reference standard, the FAMEs standard solution, and the location in the 2D plane. Depending on the analyte, at least two to three of these criteria, if not all, were fulfilled for identification. Before applying chemometric tools, the data were pre-processed by normalization to the sample median, square root transformation, and autoscaling. The chemometric tests, including one-way ANOVA, unsupervised screening (PCA, HCA, heatmap), multivariate supervised analysis (PLS-DA), random forest (RF), and enrichment analysis, were performed using MetaboAnalyst 5.0 (Xia Lab, McGill University, Montréal, QC, Canada) [59]. The random forest algorithm was also run in Python (version 3.9.7) using the scikit-learn library ('sklearn.ensemble.RandomForestClassifier'), which provided the same results as MetaboAnalyst.
Random forest is a supervised learning algorithm that uses an ensemble of classification trees. Each tree is grown on a bootstrap sample, with a random subset of features considered at each branch. The class prediction is determined by the majority vote of the ensemble. RF provides useful information such as the OOB (out-of-bag) error and the MDA (mean decrease in accuracy). The MDA is calculated by randomly permuting the values of each feature in the out-of-bag samples and re-evaluating the prediction accuracy. The difference between the baseline accuracy and the accuracy after permutation is calculated for each feature, representing the decrease in classification accuracy when the information carried by that feature is removed from the model. The mean of these decreases across all trees in the random forest is then calculated for each feature, and the features are ranked according to their MDA values. These values are used to generate a feature importance plot showing the top-ranked features in descending order of importance.
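The pre-processing and random forest steps described above can be sketched in Python with scikit-learn. This is a hedged illustration, not the study's actual script: the data are synthetic, the function name `preprocess`, the group coding, and all parameter values (number of trees, split size, number of permutation repeats) are assumptions. Note also that `sklearn.inspection.permutation_importance` evaluates the accuracy drop on a supplied dataset (here a held-out split), which is related to, but not identical with, an OOB-based MDA.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split


def preprocess(X):
    """Normalize to sample median, square-root transform, then autoscale."""
    X = np.asarray(X, dtype=float)
    X = X / np.median(X, axis=1, keepdims=True)  # per-sample median normalization
    X = np.sqrt(X)                               # square root transformation
    # Autoscaling: mean-center each feature and divide by its standard deviation.
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


# Synthetic example: 3 groups (0 = control, 1 = adenoma, 2 = adenocarcinoma),
# 20 samples each, 8 features; only feature 0 differs between groups.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 20)
X = rng.lognormal(mean=2.0, sigma=0.2, size=(y.size, 8))
X[y == 1, 0] *= 1.6
X[y == 2, 0] *= 2.6

Xp = preprocess(X)
X_train, X_test, y_train, y_test = train_test_split(
    Xp, y, test_size=0.3, stratify=y, random_state=0
)

# oob_score=True makes the out-of-bag accuracy available after fitting.
rf = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0)
rf.fit(X_train, y_train)
oob_error = 1.0 - rf.oob_score_

# Permute each feature in turn and record the mean drop in accuracy.
result = permutation_importance(
    rf, X_test, y_test, scoring="accuracy", n_repeats=20, random_state=0
)
ranking = np.argsort(result.importances_mean)[::-1]  # most important first

print(f"OOB error: {oob_error:.3f}")
for idx in ranking:
    print(f"feature {idx}: mean decrease in accuracy = {result.importances_mean[idx]:.3f}")
```

In this toy setup the single informative feature is expected to rank first, mirroring how a feature importance plot orders features by their decrease in accuracy.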