Abstract
Despite remarkable success in the prevention and treatment of tuberculosis (TB), it remains one of the most devastating infectious diseases worldwide. Management of TB requires an efficient and timely diagnostic strategy. In this study, we comprehensively characterized the plasma lipidome of TB patients, then selected candidate lipid and lipid-related gene biomarkers using a data-driven, knowledge-based framework. Among 93 lipids that were identified as potential biomarker candidates, ether-linked phosphatidylcholine (PC O–) and phosphatidylcholine (PC) were generally upregulated, while free fatty acids and triglycerides with longer fatty acyl chains were downregulated in the TB group. Lipid-related gene enrichment analysis revealed significantly altered metabolic pathways (e.g., ether lipid, linolenic acid, and cholesterol) and immune response signaling pathways. Based on these potential biomarkers, TB patients could be differentiated from controls in the internal validation (random forest model, area under the curve [AUC] 0.936, 95% confidence interval [CI] 0.865–0.992). PC(O-40:4), PC(O-42:5), PC(36:0), and PC(34:4) were robust biomarkers able to distinguish TB patients from individuals with latent infection and healthy controls, as shown in the external validation. Small changes in expression were identified for 162 significant lipid-related genes in the comparison of TB patients vs. controls; in the random forest model, their utilities were demonstrated by AUCs that ranged from 0.829 to 0.956 in three cohorts. In conclusion, this study introduced a potential framework that can be used to identify and validate metabolism-centric biomarkers.
Similar content being viewed by others
Introduction
Tuberculosis (TB), a communicable disease caused by Mycobacterium tuberculosis (Mtb), remains a global health crisis. The World Health Organization (WHO) estimated that TB was responsible for 1.5 million deaths and approximately 10 million new patients in 20201. The persistently high incidence and prevalence of TB in part reflect inadequate diagnostic approaches; it has been estimated that only 60% of new cases are detected, especially in countries with a high disease burden and low treatment coverage1. A sensitive and easy-to-implement test would provide an important initial improvement in diagnostic accuracy. However, the current standard in the diagnosis of TB is smear microscopy or culture tests, both of which have a low sensitivity, are laborious, and require a specialist laboratory2. Molecular tests, such as the Xpert MTB/RIF assay, have been introduced; however, their use is not economically feasible in primary care settings3. Moreover, TB tests often rely on sputum and thus have sub-optimal sensitivity, especially for patients with early active TB—such patients cannot consistently provide sputum4. To improve TB healthcare quality, the WHO has urged the development of a rapid and sensitive biomarker-based non-sputum test that can be implemented at the clinical site and utilizes accessible samples, such as blood, urine, or breath condensate4.
Omics-based discovery studies in TB patients—which involved comprehensive profiling of the host transcriptome, metabolome, and proteome—identified several biomarkers for the diagnosis of TB5,6,7,8,9,10,11. Moreover, implementing network analysis with multi-omics data could potentially verify and choosing the best among those biomarkers12. Transcriptomics is the most matured technology that identifies promising transcript biosignatures for TB diagnosis, treatment monitoring, and outcome prediction. Among the proposed signatures, Sweeney3, a host-response three-gene signature, has met the WHO’s target product profiles for a triage test13, whereas lipidomics research applications in TB management remain limited. Consequently, there is a need for further research, particularly into the altered lipidome of TB patients; lipids and lipid-related genes also have potential for use as diagnostic and prognostic biomarkers. Moreover, large-scale lipid profiling using plasma can provide insights into the disease because host lipids constitute a significant nutrition source for Mtb growth and reproduction14.
Mutual metabolic alterations constitute important aspects of host–pathogen interactions; together with regulatory factors, such alterations are responsible for drug tolerance but can be exploited to design effective host-tailored therapies15,16. Mtb lipid metabolism in host macrophages has a vital role in TB pathogenesis16,17. Lipid droplet (LD) formation, an important event in Mtb lipid metabolism, is a multifaceted process related to Mtb intracellular growth and drug tolerance; it also acts as a host defense mechanism to combat the pathogen17,18,19,20. Accordingly, studies that examine biomarkers related to lipid metabolism and immunology are expected to be fruitful.
There have been several investigations of the biological fluid lipidome in TB patients, with the goal of identifying biomarkers for TB diagnosis21,22,23,24,25. Chen et al. described changes in lipid levels during TB treatment, and the unbiased lipidomics approach of Shivakoti et al. revealed an association between the host lipidome and treatment failure26,27. These pioneering studies indicate significant differences in lipid profiles of patients with active TB and their counterparts; thus, they highlight potential applications of lipid and lipid-gene biomarkers in diverse clinical scenarios.
In the current study, a robust workflow was developed that facilitates the identification and validation of multi-omics metabolism-centric lipid and lipid-gene biomarkers for the diagnosis of active pulmonary TB.
Material and methods
Institutional review board statement for the clinical cohort
The Institutional Review Board of Korea University Guro Hospital reviewed and approved the study (No. 2017GR0012). All procedures were carried out following the Declaration of Helsinki. Written informed consent was obtained from all participants that allowed the blood and clinical data analysis to be used.
Sample characteristics
As mentioned elsewhere, plasma and clinical data were obtained from the Biobank of Korea University Guro Hospital28. Patients with malignant diseases, diabetes mellitus, hyperlipidemia, human immunodeficiency virus infection, and chronic liver or renal diseases were excluded. Thus, 35 patients with confirmed pulmonary TB and 37 controls were included in this study. The demographic information of included populations was described in Supplementary Table S1. There were no statistically significant differences between the two groups in terms of age (Wilcoxon rank sum test) or sex (Fisher’s exact test).
Available transcriptomics data
Three data sets with baseline gene expression profiles of TB patients and the counterparts were selected for the differentially expressed analysis and machine learning (ML)-based classification studies. The data sets are: GSE107991 (21 TB, 21 latent tuberculosis infection (LTBI), and 12 Control), E-MTAB-8290 (54 TB and 127 non-TB, including presumptive symptomatic adults with negative TB diagnosis controls and with or without human immunodeficiency virus infection), and GSE101705 (28 TB and 16 LTBI)29,30,31,32.
Chemicals, reagents, and consumables
The LC–MS grade ammonium formate, formic acid, methyl tertbutyl ether (MTBE), and toluene were purchased Sigma Aldrich (St. Louis, Missouri, USA). LiChroSolv® LC–MS grade solvents including water, methanol, acetonitrile, and isopropanol were purchased from Merck KGaA (Darmstadt, Germany). The SPLASH Lipidomix® Mass Spec Standard was purchased from Avanti Polar Lipids (Alabama, USA).
Acquity charged surface hybrid technology (CSH) C18 2.1 × 100 mm, 1.7 μm column and Acquity VanGuard CSH C18 2.1 × 5 mm, 1.7 μm pre-column were purchased from Waters (Milford, MA, USA).
Sample preparation and lipid extraction
Sample preparation and lipid extraction were performed in accordance with previously established methods, with a few modifications33,34. In brief, 55-μL plasma samples were thawed on ice for approximately 30 min; subsequently, 5 μL were removed from each sample and pooled to obtain a quality control (QC) sample. Five microliters of the lipid internal standard mixture were injected into each sample (1:10, v/v) and the sample was briefly vortexed. After the sample had been incubated on ice for 20 min with intermittent vortexing, 300 μL of methanol (− 20 °C) and 1000 μL of MTBE (− 20 °C) were added. The mixture was vortexed vigorously for 10 s, then incubated at 4 °C for 1 h with occasional vortexing. After the addition of 250 μL of water, vigorous vortexing for 20 s, and a 10-min incubation at 4 °C, the sample was centrifuged for 2 min at 4 °C and 14,000 rcf. The two supernatants (lipid fraction), each comprising 500 μL, were collected. One half was used for assessments in positive ion mode, and the other half was used for assessments in negative ion mode. The lipid fraction was completely dried in a vacuum at room temperature and stored at − 80 °C until needed.
Instrumental conditions for untargeted lipid profiling
An Acquity charged surface hybrid technology C18 column (2.1 × 100 mm, 1.7 μm) and Acquity VanGuard charged surface hybrid technology C18 pre-column (2.1 × 5 mm, 1.7 μm) were used for lipid separation with a binary gradient elution as described in Supplementary Table S2. A Shimadzu Nexera LC system (Kyoto, Japan) was utilized for the experiment. Lipid extracts were resuspended in methanol/toluene (9:1, v/v) and kept at 4 °C in an autosampler. The injected volume was ion-mode- and data-acquisition-dependent. From 200 μL of resuspended volume, 1 μL (scan profiling) and 2 μL (information-dependent acquisition and SWATH-based data-independent acquisition) were injected in positive ion mode; 3 μL (scan profiling) and 6 μL (information-dependent acquisition and SWATH) were injected in negative ion mode. The separated lipid ions were analyzed using an X500R QTOF with a Turbo V™ ion source with a TwinSpray probe (SCIEX, MA, USA). For the tandem MS analyses, either 45 eV (spread of 15 eV) or 25 eV (spread of 15 eV) were used. The MS parameters are shown in Supplementary Table S3. Mass calibration was automatically performed after every fifth injection through the instrument’s CDS system, using X500R positive or negative calibration solutions.
Lipid data processing, alignment, and lipid annotation
Raw data (wiff files) were directly input to MS-DIAL (version 4.8) for data processing, alignment, and lipid identification. The parameters were ion mode-dependent, as described in Supplementary Table S4. The aligned data were exported for subsequent use. Post-alignment data processing was performed using MetaboAnalyst 5.0 and features with missing rates ≥ 50% were removed; otherwise, the k-nearest neighbors algorithm was used to impute the missing features35. Features with relative standard deviation of ≥ 25% in the pooled QC were also removed. The MS-DIAL inbuilt library and Fiehn’s lab lipidomics library were used for lipid identification36,37.
Transcriptomics data processing and differential analysis
Raw counts of transcripts mapped into genes were summarized using the sum level. The annotated gene-level raw counts were normalized using Trimmed Mean of M-values. The pipeline was implemented using NetworkAnalyst 3.038. Differentially expressed analysis was applied for lipid-related genes in the three transcriptome profiles (i.e., E-MTAB-8290, GSE107991, GSE101705) using two-sided unpaired t-test (rstatix package version 0.7.0, implemented in R 4.1.2). Genes with a false discovery rate (FDR) less than 0.05 were considered as significant.
Data exploration and visualization
An unsupervised method, principal component analysis (PCA), was employed to explore and visualize the lipidome data. Prior to the analysis, the data were normalized (using the median method), log-transformed, and Pareto scaled. PCs that explained the most sample variance were plotted in a two-dimensional space (MetaboAnalyst 5.0) or three-dimensional space (R package, Plotly version 4.10.0). Heatmap and volcano plots (MetaboAnalyst 5.0) were also used for data visualization.
Statistical analysis and modeling of lipidomic data
Prior to univariate analysis using an unpaired t-test, the data were normalized using the median method and log-transformed. An FDR of 0.05 was set as the threshold for significant features. Fold-change (FC) thresholds of 1.2, 1.5, and 2 were also tested for biomarker candidate selection. Class discrimination between the lipid profiles of the two groups was achieved using partial least squares-discriminant analysis (PLS-DA). Because the discriminant model has tuning parameters (e.g., the number of components), the optimal model was selected in a tenfold cross-validation process. The variable importance in projection (VIP) score of the PC1 of the optimal model was set at ≥ 1.2 as the threshold of important features used to detect potential biomarker candidates. Statistical analyses were conducted using MetaboAnalyst 5.0 unless stated otherwise.
Internal and external biomarker validation
Univariate receiver operating characteristic (ROC) curve analysis was conducted to examine the potential biomarker applications of individual lipids. Random forest and linear support vector machine (SVM) were carried out to investigate the discriminatory capacity of the lipid biomarker candidates. Random forest is an ensemble method that generates many decision trees, then aggregates their outcomes to obtain greater prediction accuracy39. This powerful tool uses bagging and random feature selection to build multiple base learners. In SVM, a hyperplane is identified that maximizes the margin from data points. A larger margin leads to greater separation by the hyperplane, thus reducing generalization error. The performances of random forest and SVM are stable, regardless of the domain and data types.
For internal validation, the ROC curve-based exploratory analysis was utilized because it can automate important feature identification and performance evaluation. In the external validation, the biomarkers that were overlapped with the quantified lipids in the data of Cho et al. (at the fatty acyl/alkyl sum composition) were used to validate their performance in classifying TB patients from latent infection and controls21. All matched biomarkers were used to establish the biomarker models. The analyses were carried out in three different scenarios: TB vs. LTBI + control, TB vs. LTBI, and TB vs. control. The biomarker models using lipid-related genes in the datasets E-MTAB-8290 (54 TB, 127 control/non-TB), GSE107991 (21 TB, 12 controls, 21 LTBI), and GSE101705 (28 TB, 16 LTBI) were trained and validated using the same approach. In particular, the gene expression of matched lipid-related genes in four different data sets were utilized for the ROC-curve-based exploratory analysis. In all analyses, the area under the curve (AUC) and 95% confident interval (CI) of the best models are reported. The analyses were performed in the “Biomarker Analysis” module of MetaboAnalyst 5.0.
Correlation network analysis
Normalized expression levels of lipid biomarker candidates were visualized by correlation network analysis in the R package corrr (version 0.4.3). The network shows variables as nodes and their association as edges. The proximity of two nodes is determined by their correlation strength; their locations (or Euclidean coordinates) are found by multidimensional scaling. This method reduces the number of data dimensions to facilitate variable visualization.
Functional analysis
Biomarker candidate data were submitted to Lipid Ontology (LION) for lipid ontology enrichment analysis via the “LION-PCA heatmap” module40. In addition, lipid-gene association networks were analyzed using Lipidsig and lipid-genes were extracted41. For visualization, the R package ggplot2 (version 3.3.5) was used.
Results
Lipid profiles of TB patients are distinguishable from lipid profiles of controls
PCA was performed in positive ion mode to explore sample tendencies independent of sample source. The analysis was conducted using a total of 3791 detected lipid features of TB and control samples, with and without QC samples. In the PCA scores plot with QC samples, all QC samples clustered tightly together (Fig. S1A), and the relative standard deviation of the raw total ion chromatogram among QC samples was only 6.5%. These data indicated satisfactory repeatability of the untargeted lipid profiling analysis, which allowed subsequent data analyses and interpretation. In the PCA scores plot with TB and control samples, the three first PCs explained 52.1% of the variance: 23.2%, 21.1%, and 7.8% for PCs 1, 2, and 3, respectively. The relative separation of samples into two separate groups is evident in the three-dimensional PCA plot (Fig. 1A). Heatmap analysis captured relative differences between the two groups at the feature level; differences in lipid features were relatively clear (Fig. 1B). In addition, PCA analysis, which included 762 detected features, were also conducted in negative ion mode. Similar to positive ion mode, the QC samples clustered together (relative standard deviation of total ion chromatogram: 5.6%, Fig. S1B). The three-dimensional PCA plot indicated relative separation of the samples into two groups (Fig. 1C). At the feature level in the heatmap, we could also notice a proportionately contrast between the two groups (Fig. 1D). Taken together, the data exploration analyses in positive and negative ion modes indicated considerable differences between the lipid profiles of TB patients and of controls.
Univariate and multivariate analyses suggest numerous lipid biomarker candidates
Data exploration indicated considerable differences in lipid metabolic profiles between TB patients and controls, but a more sophisticated statistical approach was needed to identify lipids that could be regarded as biomarker candidates. Supervised investigation was conducted using PLS-DA and the lipid profiles of TB patients and controls. In positive ion mode, a PLS-DA model with five components classified the two groups with appropriate performance metrics (Fig. 2A, accuracy = 0.90, R2 = 0.92, and Q2 = 0.58). Similarly, in negative ion mode, a PLS-DA model with five components provided satisfactory classification (Fig. 2B, accuracy = 0.90, R2 = 0.95, and Q2 = 0.62) (Fig. S2A,B). The VIP score of the first PC, which explained the most sample variance, was extracted as an additional metric of biomarker candidate potential. A VIP score ≥ 1.2 was determined for 821 (21.66%) and 139 (15.71%) features in positive and negative ion modes, respectively. Finally, the random forest model demonstrated satisfactory performance in distinguishing the two groups. The cross-validated out-of-bag errors were 19.7% and 11.3% for positive and negative ion modes, respectively.
Univariate analysis using a t-test was employed to further explore potential biomarker candidates. In positive ion mode, 752 significant features (351 up- and 401 downregulated in TB patients) were found based on an FDR threshold of 0.05. Among these features, 743 (343 up- and 400 downregulated in TB patients), 404 (115 up- and 289 downregulated in TB patients), and 195 (51 up- and 144 downregulated in TB patients) exceeded the FC thresholds of 1.2, 1.5, and 2.0, respectively. In negative ion mode, 175 significant features (94 up- and 81 downregulated in TB patients) were identified with an FDR threshold of 0.05. With the FC thresholds of 1.2, 1.5, and 2.0, 156 (78 up- and 78 downregulated in TB patients), 74 (23 up- and 51 downregulated in TB patients), and 33 (8 up- and 25 downregulated in TB patients) features were selected, respectively. The volcano plots in Supplementary Fig. S3A (positive ion mode) and S3B (negative ion mode) show significant features based on the FC threshold of 1.5 and FDR threshold of 0.05.
The intersection of two criteria, a VIP score (PC1, PLS-DA model) of ≥ 1.2 and an FC (t-test) of ≥ 1.5, revealed 89 and 28 potential biomarker candidates in positive and negative ion modes, respectively. Among the selected features, 73 (positive ion mode) and 26 (negative ion mode) were successfully annotated as lipids, thus yielding 93 non-overlapping lipid biomarkers (Table 1).
Internal and external validation indicate satisfactory performance of lipid biomarker candidates
Annotated lipid biomarker candidates were first subjected to univariate biomarker analysis. The ROC curves for those candidates were significantly associated with the TB status (Supplementary Table S5). Among 93 candidate lipid biomarkers, 21 had AUC values < 0.7, whereas 72 were considered promising (AUC ≥ 0.7); of the 72, 13 were considered good (AUC ≥ 0.8) and 2 were considered excellent (AUC > 0.9). The “excellent” lipid biomarkers were two ether-linked phosphatidylethanolamines: PE(O-38:5) and PE(O-40:5). The “good” biomarker candidates were from six lipid sub-classes: two phosphatidylcholine (PC), PC(36:0) and PC(38:7); two ether-linked phosphatidylcholines (PC(O-)), PC(O-36:0) and PC(O-34:0); two ether-linked lysophosphatidylethanolamines (LPE(O-)), LPE(O-16:1) and LPE(O-18:1); two phosphatidylethanolamines (PE), (PE(36:1) and PE(38:4); one PE(O-), PE(O-40:5); and four free fatty acids (FAs), FA(20:3), FA(20:5), FA(22:5), and FA(22:6). Multivariate biomarker analysis using the random forest method revealed that models with 93 variables had the best performance (AUC = 0.921, 95% confidence interval [95% CI] 0.834–0.987) (Fig. 3A). The result of the linear SVM method was approximately similar to the result of the random forest method (Fig. 3B). Correlation analysis showed a significant linear correlation among biomarkers for both TB patients (Fig. 3C) and controls (Fig. 3D), suggesting that a small number of lipids could be used as biomarkers to differentiate TB patients from controls.
To rule out the possibility that the internal validation overestimated candidate biomarker performance, external validation was conducted using the dataset from Cho et al. (21 active TB patients, 20 patients with latent infections, 28 controls)21 but restricted to overlapping lipids (i.e., one LPC, 4 PCs, and 7 PC(O-)s). The validation was first conducted by dividing the samples into active TB vs. non-TB (patients with LTBI and controls) groups. In the univariate ROC analysis, six lipids exhibited an AUC ≥ 0.7. While LPC(20:3) and PC(O-34:0) were unable to differentiate between groups, the AUC of PC(O-40:4) and PC(O-42:5) was 1. Satisfactory performance (AUC 1, 95% CI 1–1) was obtained using the random forest model. Similar results were achieved in the comparison of active TB vs. LTBI or TB vs. control (Table 2). The results of the external validation partially supported the results of correlation network analysis, i.e., only a few lipids could differentiate TB from LTBI or controls.
Functional analysis reveals profound lipid metabolic alterations in TB patients
The LION ontology results indicated that PC(O-), PC, and PE were generally enriched in the TB group, whereas FAs and triacylglycerols (TAGs) with longer acyl chains were downregulated. The TB group also exhibited enrichment of lipids associated with mitochondrion, endoplasmic reticulum, and membrane components; it showed decreased levels of LD-related lipid species (Fig. 4A).
The reported biomarker candidates belonged to 15 (sub)-classes, of which six contained ≥ 3 lipid species. Those six sub-classes were subjected to lipid-gene analysis to identify TB-associated functional dysregulation and potential gene biomarkers. As shown in Fig. 4B, numerous pathways were altered. Significantly enriched biological processes included the PI3K-Akt, Rap1, calcium, and chemokine signaling pathways. Ether lipid metabolism, fat digestion and absorption, linolenic acid metabolism, and cholesterol metabolism were also altered. The full list of altered pathways and associated genes is provided in Supplementary Table S6.
Lipid-genes are excellent biomarkers for differentiating active TB from its counterparts
Among dysregulated pathways detected in the lipid-gene analysis (p < 0.01), 162 unique genes were identified. These genes were tested for their ability to differentiate active TB from LTBI or controls in three different data sets. The random forest classifier established from the expression of lipid-related genes was able to distinguish TB from its counterparts in three different TB cohorts, with an AUC ranging from 0.829 (95% CI 0.707–0.931, E-MTAB-8290) to 0.958 (95% CI 0.909–1, GSE101705) (Fig. 5A–D). The linear SVM model showed similar results (Fig. S4A–D). However, most genes exhibited a small FC.
Discussion
This study demonstrated differences in the lipidomes of TB patients and non-TB controls; it revealed 73 and 26 potential biomarker candidates, identified in positive and negative mode, respectively (six biomarkers were detected in both). In TB patients, the biomarkers had at least a 1.5-fold difference (in either direction). Among the significantly altered lipid sub-classes, ceramide (Cer), LPC, PC(O-), and PE were generally upregulated; certain PCs, diacylglycerols, and FAs were downregulated in TB patients. TAGs with shorter acyl chains were strongly increased in TB patients, while TAGs with longer acyl chains were decreased. The biomarkers mostly belonged to the lipid classes PC, PC(O-), PE, PE(O-), FA, and TAG, suggesting that these lipid classes are important in TB pathophysiology.
Among the putative lipid biomarkers, 12 were matched with previously reported lipid profiles that utilized a targeted approach21. PC(O-40:4), PC(O-42:5), PC(36:0), and PC(34:4) were prominent biomarker candidates identified by internal and external validation. Besides, our biomarkers showed concordance partially with the top biomarkers reported by Chen et al. and Han et al. in terms of lipid species. PC and PC (O-) were dominant in the top list, which suggests the importance of these lipids in differentiating TB from the controls23,26. However, different biological and technical factors can be attributed to the heterogeneity of the findings among studies, such as cohort characteristics, genetic background, sample treatment and instrumental data acquisition, and utilized statistical methods.
Lipid-related genes were associated with various TB pathologically comparable pathways; they also formed a distinctive signature that differentiated active TB from LTBI and other non-TB controls. Although gene expression was only subtly altered, it provided a consistent signature. Among the 162 lipid-related genes, 96 genes were found to be differentially expressed in TB versus non-TB in at least one comparison (Supplementary Table S7); these genes are presumably involved in crucial biological processes that underlie TB pathophysiology. For example, differentially expressed genes related to the biochemical regulations of PC, which is profoundly changed in TB21,23,26, included CHPT1, LPCAT2, LPCAT4, PLA2G4A, PLA2G4C, PLD2, PLD4, MBOAT2, ADORA2A and ADORA2B.
A significant increase in Cer(d34:1) was identified in TB patients. This biomarker has been previously reported to exhibit consistently higher levels in TB patients than in healthy individuals or patients with other respiratory diseases23,24,42,43. Shivakoti et al. showed that Cer(d34:1) is also related to TB treatment outcome; patients with the highest Cer levels had an increased risk of treatment failure27. The involvement of Cer in host immune responses against Mtb—via immune cell activation, phagocytosis, and other mechanisms—might explain its higher level in TB patients than in controls24,44,45.
Although Mtb relies on different carbon sources at different stages of pathogenesis, host lipids are generally the primary carbon source for Mtb in vivo46. The genome of Mtb laboratory strain H37Rv has > 250 genes related to lipid metabolism47. The infection of macrophages by Mtb triggers the formation of foamy macrophages through the accumulation of lipid bodies of TAGs and cholesterol esters48,49. Consistent with previous findings23,26, our study showed a decrease in TAGs in host plasma; this may be related to the uptake of host TAGs into foamy macrophages to form LDs, which can serve as nutrient sources14,48. While LD formation may be a host-driven immune response rather than an Mtb-mediated process19, the resulting physiological changes in Mtb lead to TAG accumulation, LD formation, growth reduction, decreased metabolic activity, and development of phenotypic drug resistance; these processes are associated with the persistent and non-dividing stages of Mtb50. We also found the downregulation of FAs in TB patients; this hinders the formation of longer-chain FAs that form the main components of Mtb cell wall lipids51. Fas I/II-induced elongation could partially explain the decrease in plasma FAs.
Lysophosphatidylcholine acyltransferase 2 (LPCAT2) induces LD accumulation in cancer patients during the onset of chemoresistance52. Our study is presumably the first to report an association of the upregulation of LPCAT2 with metabolic alterations in TB patients vs. non-TB controls, suggesting that LPCAT2 can serve as a biomarker in the diagnosis of TB.
There is also emerging evidence concerning the crucial role of metabolism in host–pathogen dynamics, with the transcription factor PPAR (peroxisome proliferator-activated receptor) implicated in LD buildup during inflammation and infectious diseases53,54,55. Our analysis demonstrated the enrichment of several lipid-related genes associated with PPAR signaling pathways; these genes include PPARA, CD36, FABP4, and ACSL156. Lipid mediators, cytokines, and chemokines may act in a paracrine manner to induce LD formation14,57.
We also identified the involvement of lipids and lipid-related genes in chemokine signaling (CDC42, FGR, IKBKB, RAC1), ether lipid metabolism, glycerophospholipid metabolism, sphingolipid metabolism, and phospholipase D signaling pathways. In a mouse model of TB58 and a study of T cells from TB patients59, Mtb was found to inhibit host proinflammatory cytokine production through the PI3K-Akt signaling pathway. In the ontology analysis of lipid genes, the PI3K-Akt signaling pathway exhibited the most significant functional dysregulations in TB patients. Together, these observations support a significant role of lipid metabolism and lipid-related genes in the host immune response.
Our study had several limitations. First, its focus was on the discovery and validation of lipid biomarkers; however, there is evidence to support the use of hydrophilic metabolites (e.g., glutamic acid and glutamine) as biomarkers21. The combined use of these metabolites and lipids would significantly improve the detection of active TB in clinical settings. Second, other infectious respiratory diseases (e.g., community-acquired pneumonia) were not included in the lipidomics analysis. Nevertheless, some markers were able to reliably distinguish TB and LTBI. Subsequent studies should examine the differential diagnostic performance of those biomarker candidates in other infectious respiratory diseases. Third, quantitative information is available for some biomarker candidates, based on isotopically labeled internal standards at ratios relative to human plasma. However, an accurate quantification strategy (e.g., AdipoAtlas60) is needed to facilitate clinical application. This can be readily achieved through targeted analysis of a subset of the most promising biomarkers. Fourth, through our exploratory analysis, we enable the identification of several altered lipids and lipid genes, as well as lipid-related metabolism and immune response pathway in TB patients. Experimental studies on in vitro or animal models are required to substantiate our findings. Finally, a prospective validation cohort study with actual concentrations of lipid biomarkers is required to examine the relevance of the identified biomarkers in TB manifestations.
In summary, our study identified and validated lipid-focused biomarkers. Multiple data mining methods with lipidome and lipid-related transcript signatures were used to obtain robust biomarkers and gain new mechanistic insights into TB. Lipid species that belonged to the PC(O-), PCs, TAGs, FAs, and Cer were identified as excellent candidate biomarkers. PC(O-40:4), PC(O-42:5), PC(36:0), and PC(34:4) were externally validated and had a good performance. Additionally, our study revealed systemic and multi-omics levels of biologically relevant processes involved in host responses to Mtb infection. Overall, comprehensive omics analyses employing a data-driven, knowledge-based approach can support metabolism-centric biomarker discovery and validation.
Data availability
The data underlying this article cannot be shared publicly for the privacy of individuals that participated in the study. Pre-processed and imputed data will be shared by the corresponding author upon reasonable requests.
References
World Health Organization. Global Tuberculosis Report 2021 (World Health Organization, 2021).
Walzl, G. et al. Tuberculosis: Advances and challenges in development of new diagnostics and biomarkers. Lancet Infect. Dis. 18, e199–e210. https://doi.org/10.1016/S1473-3099(18)30111-7 (2018).
Stop TB Partnership. TB REACH Xpert Budget Estimation Tool. Accessed 23 March 2022.
World Health Organization. High Priority Target Product Profiles for New Tuberculosis Diagnostics: Report of a Consensus Meeting (World Health Organization, 2014).
De Groote, M. A. et al. Discovery and validation of a six-marker serum protein signature for the diagnosis of active pulmonary tuberculosis. J. Clin. Microbiol. 55, 3057–3071. https://doi.org/10.1128/JCM.00467-17 (2017).
Kumar, N. P. et al. Plasma chemokines are biomarkers of disease severity, higher bacterial burden and delayed sputum culture conversion in pulmonary tuberculosis. Sci. Rep. 9, 18217. https://doi.org/10.1038/s41598-019-54803-w (2019).
Ota, M. O. et al. Rapid diagnosis of tuberculosis using ex vivo host biomarkers in sputum. Eur. Respir. J. 44, 254–257. https://doi.org/10.1183/09031936.00209913 (2014).
Penn-Nicholson, A. et al. RISK6, a 6-gene transcriptomic signature of TB disease risk, diagnosis and treatment response. Sci. Rep. 10, 8629. https://doi.org/10.1038/s41598-020-65043-8 (2020).
Roe, J. et al. Blood transcriptomic stratification of short-term risk in contacts of tuberculosis. Clin. Infect. Dis. 70, 731–737. https://doi.org/10.1093/cid/ciz252 (2020).
Sambarey, A. et al. Unbiased identification of blood-based biomarkers for pulmonary tuberculosis by modeling and mining molecular interaction networks. EBioMedicine 15, 112–126. https://doi.org/10.1016/j.ebiom.2016.12.009 (2017).
Weiner, J. 3rd. et al. Metabolite changes in blood predict the onset of tuberculosis. Nat. Commun. 9, 5208. https://doi.org/10.1038/s41467-018-07635-7 (2018).
Kontsevaya, I. et al. Perspectives for systems biology in the management of tuberculosis. Eur. Respir. Rev. 30, 160. https://doi.org/10.1183/16000617.0377-2020 (2021).
Sweeney, T. E., Braviak, L., Tato, C. M. & Khatri, P. Genome-wide expression for diagnosis of pulmonary tuberculosis: A multicohort analysis. Lancet Respir. Med. 4, 213–224. https://doi.org/10.1016/S2213-2600(16)00048-5 (2016).
Peyron, P. et al. Foamy macrophages from tuberculous patients’ granulomas constitute a nutrient-rich reservoir for M. tuberculosis persistence. PLoS Pathog. 4, e1000204. https://doi.org/10.1371/journal.ppat.1000204 (2008).
Park, J. H., Shim, D., Kim, K. E. S., Lee, W. & Shin, S. J. Understanding metabolic regulation between host and pathogens: New opportunities for the development of improved therapeutic strategies against Mycobacterium tuberculosis infection. Front. Cell. Infect. Microbiol. 11, 635335. https://doi.org/10.3389/fcimb.2021.635335 (2021).
Shim, D., Kim, H. & Shin, S. J. Mycobacterium tuberculosis infection-driven foamy macrophages and their implications in tuberculosis control as targets for host-directed therapy. Front. Immunol. 11, 910. https://doi.org/10.3389/fimmu.2020.00910 (2020).
Saka, H. A. & Valdivia, R. Emerging roles for lipid droplets in immunity and host–pathogen interactions. Annu. Rev. Cell Dev. Biol. 28, 411–437. https://doi.org/10.1146/annurev-cellbio-092910-153958 (2012).
Chai, Q., Wang, L., Liu, C. H. & Ge, B. New insights into the evasion of host innate immunity by Mycobacterium tuberculosis. Cell. Mol. Immunol. 17, 901–913. https://doi.org/10.1038/s41423-020-0502-z (2020).
Knight, M., Braverman, J., Asfaha, K., Gronert, K. & Stanley, S. Lipid droplet formation in Mycobacterium tuberculosis infected macrophages requires IFN-gamma/HIF-1alpha signaling and supports host defense. PLoS Pathog. 14, e1006874. https://doi.org/10.1371/journal.ppat.1006874 (2018).
Santucci, P. et al. Nitrogen deprivation induces triacylglycerol accumulation, drug tolerance and hypervirulence in mycobacteria. Sci. Rep. 9, 8667. https://doi.org/10.1038/s41598-019-45164-5 (2019).
Cho, Y. et al. Identification of serum biomarkers for active pulmonary tuberculosis using a targeted metabolomics approach. Sci. Rep. 10, 3825. https://doi.org/10.1038/s41598-020-60669-0 (2020).
Collins, J. M. et al. High-resolution plasma metabolomics analysis to detect Mycobacterium tuberculosis-associated metabolites that distinguish active pulmonary tuberculosis in humans. PLoS ONE 13, e0205398. https://doi.org/10.1371/journal.pone.0205398 (2018).
Han, Y. S. et al. Identification of potential lipid biomarkers for active pulmonary tuberculosis using ultra-high-performance liquid chromatography-tandem mass spectrometry. Exp. Biol. Med. 246, 387–399. https://doi.org/10.1177/1535370220968058 (2021).
Lau, S. K. et al. Metabolomic profiling of plasma from patients with tuberculosis by use of untargeted mass spectrometry reveals novel biomarkers for diagnosis. J. Clin. Microbiol. 53, 3750–3759. https://doi.org/10.1128/JCM.01568-15 (2015).
Vrieling, F. et al. Patients with concurrent tuberculosis and diabetes have a pro-atherogenic plasma lipid profile. EBioMedicine 32, 192–200. https://doi.org/10.1016/j.ebiom.2018.05.011 (2018).
Chen, J. X. et al. Novel therapeutic evaluation biomarkers of lipid metabolism targets in uncomplicated pulmonary tuberculosis patients. Signal Transduct. Target. Ther. 6, 22. https://doi.org/10.1038/s41392-020-00427-w (2021).
Shivakoti, R. et al. Host lipidome and tuberculosis treatment failure. Eur. Respir. J. 59, 2004586. https://doi.org/10.1183/13993003.04532-2020 (2022).
Phuoc Long, N. et al. Molecular perturbations in pulmonary tuberculosis patients identified by pathway-level analysis of plasma metabolic features. PLoS ONE 17, e0262545. https://doi.org/10.1371/journal.pone.0262545 (2022).
Johnson, W. E. et al. Comparing tuberculosis gene signatures in malnourished individuals using the TBSignatureProfiler. BMC Infect. Dis. 21, 106. https://doi.org/10.1186/s12879-020-05598-z (2021).
Leong, S. et al. Existing blood transcriptional classifiers accurately discriminate active tuberculosis from latent infection in individuals from south India. Tuberculosis 109, 41–51. https://doi.org/10.1016/j.tube.2018.01.002 (2018).
Singhania, A. et al. A modular transcriptional signature identifies phenotypic heterogeneity of human tuberculosis infection. Nat. Commun. 9, 2308. https://doi.org/10.1038/s41467-018-04579-w (2018).
Turner, C. T. et al. Blood transcriptional biomarkers for active pulmonary tuberculosis in a high-burden setting: A prospective, observational, diagnostic accuracy study. Lancet Respir. Med. 8, 407–419. https://doi.org/10.1016/S2213-2600(19)30469-2 (2020).
Barupal, D. K. et al. Generation and quality control of lipidomics data for the Alzheimer’s disease neuroimaging initiative cohort. Sci. Data 5, 180263. https://doi.org/10.1038/sdata.2018.263 (2018).
Lange, M. & Fedorova, M. Evaluation of lipid quantification accuracy using HILIC and RPLC MS on the example of NIST(R) SRM(R) 1950 metabolites in human plasma. Anal. Bioanal. Chem. 412, 3573–3584. https://doi.org/10.1007/s00216-020-02576-x (2020).
Pang, Z. et al. MetaboAnalyst 5.0: Narrowing the gap between raw spectra and functional insights. Nucleic Acids Res. 49, W388–W396. https://doi.org/10.1093/nar/gkab382 (2021).
Tsugawa, H. et al. A lipidome atlas in MS-DIAL 4. Nat. Biotechnol. 38, 1159–1163. https://doi.org/10.1038/s41587-020-0531-2 (2020).
Kind, T. et al. LipidBlast in silico tandem mass spectrometry database for lipid identification. Nat. Methods 10, 755–758. https://doi.org/10.1038/nmeth.2551 (2013).
Zhou, G. et al. NetworkAnalyst 3.0: A visual analytics platform for comprehensive gene expression profiling and meta-analysis. Nucleic Acids Res. 47, W234–W241. https://doi.org/10.1093/nar/gkz240 (2019).
Breiman, L. Random forests. Mach. Learn. 45, 5–32. https://doi.org/10.1023/A:1010933404324 (2001).
Molenaar, M. R. et al. LION/web: A web-based ontology enrichment tool for lipidomic data analysis. Gigascience https://doi.org/10.1093/gigascience/giz061 (2019).
Lin, W. J. et al. LipidSig: A web-based tool for lipidomic data analysis. Nucleic Acids Res. 49, W336–W345. https://doi.org/10.1093/nar/gkab419 (2021).
Andreas, N. J. et al. Performance of metabonomic serum analysis for diagnostics in paediatric tuberculosis. Sci. Rep. 10, 7302. https://doi.org/10.1038/s41598-020-64413-6 (2020).
Lam, C. W. & Law, C. Y. Untargeted mass spectrometry-based metabolomic profiling of pleural effusions: Fatty acids as novel cancer biomarkers for malignant pleural effusions. J. Proteome. Res. 13, 4040–4046. https://doi.org/10.1021/pr5003774 (2014).
Anes, E. et al. Selected lipids activate phagosome actin assembly and maturation resulting in killing of pathogenic mycobacteria. Nat. Cell Biol. 5, 793–802. https://doi.org/10.1038/ncb1036 (2003).
Chackerian, A., Alt, J., Perera, V. & Behar, S. M. Activation of NKT cells protects mice from tuberculosis. Infect. Immun. 70, 6302–6309. https://doi.org/10.1128/IAI.70.11.6302-6309.2002 (2002).
Gouzy, A. et al. Mycobacterium tuberculosis nitrogen assimilation and host colonization require aspartate. Nat. Chem. Biol. 9, 674–676. https://doi.org/10.1038/nchembio.1355 (2013).
Cole, S. T. et al. Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature 393, 537–544. https://doi.org/10.1038/31159 (1998).
Daniel, J., Maamar, H., Deb, C., Sirakova, T. D. & Kolattukudy, P. E. Mycobacterium tuberculosis uses host triacylglycerol to accumulate lipid droplets and acquires a dormancy-like phenotype in lipid-loaded macrophages. PLoS Pathog. 7, e1002093. https://doi.org/10.1371/journal.ppat.1002093 (2011).
Pandey, A. K. & Sassetti, C. M. Mycobacterial persistence requires the utilization of host cholesterol. Proc. Natl. Acad. Sci. USA 105, 4376–4380. https://doi.org/10.1073/pnas.0711159105 (2008).
Gengenbacher, M. & Kaufmann, S. H. Mycobacterium tuberculosis: Success through dormancy. FEMS Microbiol. Rev. 36, 514–532. https://doi.org/10.1111/j.1574-6976.2012.00331.x (2012).
Smith, J. L. & Sherman, D. H. An enzyme assembly line. Science 321, 1304–1305. https://doi.org/10.1126/science.1163785 (2008).
Cotte, A. K. et al. Lysophosphatidylcholine acyltransferase 2-mediated lipid droplet production supports colorectal cancer chemoresistance. Nat. Commun. 9, 322. https://doi.org/10.1038/s41467-017-02732-5 (2018).
Almeida, P. E., Carneiro, A. B., Silva, A. R. & Bozza, P. T. PPARgamma expression and function in mycobacterial infection: Roles in lipid metabolism, immunity, and bacterial killing. PPAR Res. 2012, 383829. https://doi.org/10.1155/2012/383829 (2012).
Almeida, P. E. et al. Mycobacterium bovis bacillus Calmette-Guerin infection induces TLR2-dependent peroxisome proliferator-activated receptor gamma expression and activation: Functions in inflammation, lipid metabolism, and pathogenesis. J. Immunol. 183, 1337–1345. https://doi.org/10.4049/jimmunol.0900365 (2009).
Mahajan, S. et al. Mycobacterium tuberculosis modulates macrophage lipid-sensing nuclear receptors PPARgamma and TR4 for survival. J. Immunol. 188, 5593–5603. https://doi.org/10.4049/jimmunol.1103038 (2012).
Szatmari, I. et al. PPARgamma regulates the function of human dendritic cells primarily by altering lipid metabolism. Blood 110, 3271–3280. https://doi.org/10.1182/blood-2007-06-096222 (2007).
Dawa, S. et al. Inhibition of granuloma triglyceride synthesis imparts control of Mycobacterium tuberculosis through curtailed inflammatory responses. Front. Immunol. 12, 722735. https://doi.org/10.3389/fimmu.2021.722735 (2021).
Liu, Y., Li, J. Y., Chen, S. T., Huang, H. R. & Cai, H. The rLrp of Mycobacterium tuberculosis inhibits proinflammatory cytokine production and downregulates APC function in mouse macrophages via a TLR2-mediated PI3K/Akt pathway activation-dependent mechanism. Cell. Mol. Immunol. 13, 729–746. https://doi.org/10.1038/cmi.2015.58 (2016).
Zhang, X. et al. Inhibition of the PI3K-Akt-mTOR signaling pathway in T lymphocytes in patients with active tuberculosis. Int. J. Infect. Dis. 59, 110–117. https://doi.org/10.1016/j.ijid.2017.04.004 (2017).
Lange, M. et al. AdipoAtlas: A reference lipidome for human white adipose tissue. Cell Rep. Med. 2, 100407. https://doi.org/10.1016/j.xcrm.2021.100407 (2021).
Acknowledgements
This study was sponsored in part by the National Research Foundation of Korea (NRF) Grant funded by the Korea government (MSIT) (No. 2022R1C1C1009250), and in part by the National Research Foundation of Korea (NRF) Grant funded by the Korean government (MSIT) (Grant No. 2018R1A5A2021242). The funding organization did not involve in the study design, data acquisition, data analysis and interpretation, or the content formation of the manuscript.
Author information
Authors and Affiliations
Contributions
N.P.L.: Conceptualization, Data curation, Methodology, Investigation, Formal analysis, Validation, Visualization, Writing—Original draft, Writing—Review & Editing. N.K.A.: Conceptualization, Data curation, Methodology, Investigation, Formal analysis, Validation, Visualization, Writing—Original draft, Writing—Review & Editing. N.T.H.Y.: Data curation, Methodology, Formal analysis, Validation, Visualization, Writing—Review & Editing. N.K.P.: Data curation, Methodology, Formal analysis, Validation, Visualization, Writing—Review & Editing. S.P.: Methodology, Formal analysis, Visualization, Writing—Original Draft, Writing—Review & Editing. V.T.A.T.: Methodology, Writing—Original Draft, Writing—Review & Editing. Y.S.C.: Conceptualization, Methodology, Writing—Review & Editing. J.G.S.: Conceptualization, Methodology, Writing—Review & Editing. J.Y.O.: Conceptualization, Methodology, Writing—Review & Editing, Resources, Supervision, Funding acquisition. D.H.K.: Conceptualization, Methodology, Writing—Review & Editing, Resources, Supervision, Funding acquisition. These authors contributed equally to this work: N.P.L and N.K.A.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Long, N.P., Anh, N.K., Yen, N.T.H. et al. Comprehensive lipid and lipid-related gene investigations of host immune responses to characterize metabolism-centric biomarkers for pulmonary tuberculosis. Sci Rep 12, 13395 (2022). https://doi.org/10.1038/s41598-022-17521-4
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598-022-17521-4
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.