Integration of Metabolomics and Transcriptomics Reveals a Complex Diet of Mycobacterium tuberculosis during Early Macrophage Infection

The nutrients consumed by intracellular pathogens are mostly unknown. This is mainly due to the challenge of disentangling host and pathogen metabolism sharing the majority of metabolic pathways and hence metabolites. Here, we investigated the metabolic changes of Mycobacterium tuberculosis, the causative agent of tuberculosis, and its human host cell during early infection. To this aim, we combined gene expression data of both organisms and metabolite changes during the course of infection through integration into a genome-wide metabolic network. This led to the identification of infection-specific metabolic alterations, which we further exploited to model host-pathogen interactions quantitatively by flux balance analysis. These in silico data suggested that tubercle bacilli consume up to 33 different nutrients during early macrophage infection, which the bacteria utilize to generate energy and biomass to establish intracellular growth. Such multisubstrate fueling strategy renders the pathogen’s metabolism robust toward perturbations, such as innate immune responses or antibiotic treatments.

strategy renders the pathogen's metabolism robust toward perturbations, such as innate immune responses or antibiotic treatments. KEYWORDS Mycobacterium tuberculosis, host-pathogen interactions, metabolism, systems biology I ntracellular pathogens acquire nutrients from the host environment during infection to generate energy and biomass for survival and replication (1)(2)(3). Whether a given substrate is consumed by pathogens depends on its availability at the site of infection and the pathogen's metabolic capacity (4,5). Although nutrient uptake and utilization are the basis of an organism's physiology, it is poorly understood for most intracellular pathogens during infection. Investigations of in vivo physiology during infection are conceptually and technically challenging, and evidence for intracellular nutrient usage is mostly indirect. The standard method to probe intracellular pathogen uptake of a specific substrate is based on infection phenotypes of auxotrophic strains (6). More direct evidence may be obtained by monitoring 13 C incorporation into proteinogenic amino acids of bacteria infecting prelabeled host cells (7). These approaches provide insights into the consumption of host molecules as biomass precursors, but they typically fail to identify energy sources (8).
Development of omics technologies has enabled the assessment of system-wide changes of transcript and protein levels as surrogate measurements of metabolic rearrangements during infection (9)(10)(11). Conceptually, the main difficulty of omics experiments is to define appropriate reference conditions that allow distinguishing metabolic interactions from indirect effects, such as host response-triggered stresses that influence gene expression. In a recent study of Shigella flexneri infection, targeted metabolite measurements helped to identify a major carbon flux from the human host cell through the pathogen and back to the host (12). Although such direct analysis of metabolites would be ideal for understanding a pathogen's in vivo physiology, it is hampered by the large overlap of host and pathogen pathways (13). Therefore, physiological analysis of intracellular pathogens requires a combination of complementary (omics) measurements and integration strategies for data interpretation.
Mycobacterium tuberculosis, the causative agent of tuberculosis, led to an estimated 10.4 million new cases and 1.8 million deaths in 2015 (14). Airborne bacilli typically infect the lungs, where the bacteria are ingested by alveolar macrophages. M. tuberculosis successfully evades killing by macrophages by residing in modified vacuoles for replication and prolonged survival upon initiation of the adaptive immune response (15)(16)(17). To colonize its niche, M. tuberculosis resists host defense while acquiring nutrients from the host environment to generate energy and biomass for its survival and replication. To identify consumed nutrients and engaged metabolic pathways during infection, growth of auxotrophic M. tuberculosis strains was assessed in cellular and animal infection models (6). These studies revealed crucial roles of amino acid metabolism (18)(19)(20), mycobacterial lipid uptake and breakdown (21)(22)(23)(24), lipid-based fueling of central carbon metabolism (25,26), and gluconeogenesis (27,28) but intriguingly, also carbohydrate uptake and utilization (29,30). Genome-wide transcript data of intracellular mycobacteria aimed at capturing more subtle metabolic adaptations during infection, such as regulation of generally essential genes and genes that are not causing pronounced infection phenotypes upon perturbation. Targeted inspection of transcriptional changes in metabolic pathways and ontology enrichment analysis underscored the activation of lipid-related metabolic pathways and general stress responses upon entry into the host cell (31)(32)(33)(34)(35). Tracking biomass incorporation by cold or hot substrate labeling strategies directly demonstrated in vivo utilization of both specific lipids and amino acids by M. tuberculosis (7,24). However, despite remarkable progress in understanding mycobacterial metabolism of recent years, many aspects of intracellular nutrient acquisition by M. tuberculosis remain puzzling (36).
Inspired by the fact that extensive cometabolism of nutrients has previously been suggested for M. tuberculosis (7,37), we develop here an approach to (i) identify putative substrates of the intracellular pathogen, (ii) quantify their relative utilization, (iii) assess their fate in the pathogen's metabolism, and (iv) infer metabolic consequences in the host caused by bacterial metabolism. Early infection of human macrophage-like cells with M. tuberculosis served as a model system for the integration of dynamic metabolite and transcript data to infer metabolic host-pathogen interactions. We collected samples for both of these measurements from the same cultures and overlaid the resulting data onto a genome-wide metabolic reaction pair network to identify metabolic subnetworks of particular importance during infection. We then integrated the detected transcriptional changes into a combined genome-scale model of human macrophages and M. tuberculosis (38) to simulate the host-pathogen metabolic interactions in silico. Our analyses revealed the pathogen's coutilization of up to 33 different nutrients during macrophage infection, of which we predicted 3 to be used solely as biomass precursors, whereas the rest are further metabolized for energy generation and precursor formation. On the basis of these findings, we propose that such multiple-substrate fueling confers high robustness to interventions of the pathogen's metabolism.

RESULTS
Measured metabolites are mainly derived from the host. To study metabolic host-pathogen interactions during the early stage of infection, we infected THP-1 cells with M. tuberculosis H37Rv after induced differentiation into macrophages (39). This system was selected for the following reasons: (i) the human origin of the host in contrast to murine systems, (ii) the highly reproducible characteristics of a stable cell line in contrast to primary cell isolates, (iii) immediate intracellular replication upon infection compared to extensive initial bacterial killing of other model systems (34,39).
Given the high conservation of metabolic processes in biological systems, the vast majority of metabolites, such as nucleotides, amino acids, and central carbon metabolites, are commonly found across organisms. Hence, measured metabolites typically cannot be assigned to a particular organism in mixed samples, in sharp contrast to DNA, RNA, and proteins whose sequences are mostly species specific. The short biological turnover time of metabolites precludes physical separation of species prior to metabolite measurement; hence, infection samples contain both host-and pathogenderived metabolites. Therefore, prior to setting up the infection, we performed a metabolomics spike-in experiment of different mixtures of separately cultured THP-1 macrophages and tubercle bacilli (cell-to-bacterium ratios of 1:0, 1:5, 1:10, and 1:20) to evaluate the potential contribution of either organism to each of the assessed metabolite pools.
Untargeted metabolite measurements were applied for their high measurement coverage (40), which resulted in the detection of 4,593 ions in total. Pairwise comparison (t test) of ion intensities revealed that 233 (5%) of all detected ions showed a significant increase (fold change of Ͼ1.5; false-discovery rate [FDR] of Ͻ0.05) in the presence of the maximally assayed, nonphysiological bacterial load of 20 bacteria per single THP-1 cell compared to THP-1 alone ( Fig. 1A; also see Fig. S1 in the supplemental material). Only two ions were found to have significantly decreased intensities, demonstrating that addition of bacterial biomass to the samples does not negatively interfere with the measurement, e.g., through matrix effects (41). To unravel the identity of the increased ions upon the addition of bacteria to THP-1 cells, we adapted a metabolite reference list that aggregated the Kyoto Encyclopedia of Genes and Genomes (KEGG) metabolite repository specific to M. tuberculosis H37Rv (mtu, 3,414 metabolites), Homo sapiens (hsa, 2,915 metabolites) (42), and the M. tuberculosis Mtb Lipid Database (http://mrl.colostate.edu/mtb/) and MycoMass (http://www.brigh amandwomens.org/research/depts/medicine/rheumatology/Labs/Moody/default.aspx) databases of mycobacterial lipids (10,424 metabolites in total) (43). Out of 233 increased ions, 26 could be annotated, of which 9 were likely derived from M. tuberculosis due to their species specificity ( Fig. 1B and Table S1). Since only a few human metabolites changed upon bacterial spiking (17 of 514 annotated metabolites), we conclude that detected metabolite signals mainly reflect host metabolites and that mycobacterial metabolites can be monitored only if they are unique and relatively abundant in the bacteria.
Dynamic metabolite changes during infection. THP-1 cells were infected at a multiplicity of infection (MOI) of 5, and immediate bacterial replication was comparable to replication in previous reports (39), based on CFUs that increased 3.6-fold between 4 and 48 h postinfection (h p.i.). To study metabolite dynamics during infection, we performed untargeted metabolomics analysis at 0, 4, 24, and 48 h p.i. (40). Based on the pilot experiment with bacterium-THP-1 cell mixtures, we expected measurements to reflect mainly host metabolic changes caused by either consumption of nutrients by the infecting bacteria or the response of host cells to phagocytosis. To control for the latter, we used gamma-irradiated tubercle bacilli, which still undergo phagocytosis and trigger a cellular response, but do not engage in active metabolism, i.e., consumption of nutrients. A total of 1,886 detected ions were annotated using the combined metabolite list described above, including 108 unique to mycobacteria and 173 unique to the lipids list ( Fig. 2A and Table S1).
To identify metabolites associated with dynamic changes during infection, we applied a three-step procedure. First, we reduced measurement noise applying the permutation filtering method for dynamic data recently reported (44). This method is based on the assumption that "real" metabolic changes are a continuous process, so that data of neighboring time points show smaller variation than randomly picked data (permutation of time points). Second, for each time point, we performed comparative analyses of the metabolomes of infected and uninfected cells and defined the threshold for significant changes (|log 2 fold change| Ͼ log 2 1.5; FDR Ͻ 0.05). Third, we applied a significance threshold for the comparison between the changes in the infection experiment versus control experiment (FDR Ͻ 0.05). These three subsequent steps led to the identification of consistently and significantly changing (132 increasing, 394 decreasing) metabolites during the 48 h of infection ( Fig. 2B and Table S1). To discover general metabolic trends in the host upon M. tuberculosis infection, we performed a metabolite set enrichment analysis for both significantly increased and decreased metabolites using the functional set definitions of the Human Metabolome Database (HMDB) (45,46).
Accumulating metabolites were significantly enriched for various subclasses of lipids, such as monoacyl glycerols (FDR ϭ 0.05) and phosphatidylinositol phosphates (FDR ϭ 5 ϫ 10 Ϫ4 ) ( Fig. 2C and Table S1). This is in agreement with previous observation of increased lipid levels in macrophages upon ingestion of M. tuberculosis (47,48). This lipid accumulation can be caused by shedding of bacterial lipids (49,50), through the pathogen's interference with the host's lipid turnover (51), and by the activity of bacterial lipolytic enzymes digging carbons from host membranes (52). The latter mechanism may explain the observed accumulation of monoacyl glycerols and fatty acids at the expense of phosphatidylethanolamines, phosphatidylcholines, and phosphatidylserine that decreased during infection (Fig. 2D).
The major functional classes enriched among decreasing metabolites were amino acids (FDR ϭ 0.3 ϫ 10 Ϫ6 ), nucleotides (FDR ϭ 2 ϫ 10 Ϫ3 ), and sugar phosphates (FDR ϭ 4 ϫ 10 Ϫ4 ) ( Fig. 2C and Table S1). These compounds are potential mycobacterial substrates, some of which have already been reported, such as aspartate and asparagine (18,19) (Fig. 2D). Moreover, a recent study based on 13 C-tracing experiments proposed the uptake of various amino acids by intracellular M. tuberculosis (7). The availability of another potential carbon source is indicated by the enrichment for the chemical classes   of monosaccharide and sugar phosphates (FDR ϭ 4 ϫ 10 Ϫ4 ) (Fig. 2D). The uptake of phosphorylated carbohydrate(s) by intracellular mycobacteria is intriguing, because it provides not only carbon but also phosphorus, which appears to be limiting during infection (53). Overall, the metabolic analyses suggest that mycobacteria have access to a variety of nutrients inside the macrophage, such as amino acids, nucleotides, sugar phosphates, lipids, and fatty acids.
Dynamic transcriptional changes in the host and pathogen during infection. To obtain further insights into host-pathogen interactions, we investigated whether transcriptional adaptations of both host and pathogen during infection reflect multinutrient acquisition of the bacteria. Analogous to metabolomics, the transcriptome was analyzed at 0, 4, 24, and 48 h p.i. Complete transcriptome analysis was performed on rRNA-depleted total RNA by microbe-enriched dual RNA sequencing (dual RNA-seq) (54). The physical mycobacterial RNA enrichment yielded an increased number of reads for mycobacterial genes (up to 5.7 million) and hence, improved the statistical power of mycobacterial transcript measurements ( Fig. 3A and B). Analyses of host transcripts were performed on samples without physical enrichment of mycobacterial RNA. RNA reads could be mapped to 4,008 mycobacterial annotated genes and 17,198 human annotated genes (Table S2).
Similar to the analysis of metabolite data, we aimed to identify general trends in the transcript changes. We first defined differentially expressed transcripts during infection (|log 2 fold change| Ͼ log 2 1.5; FDR Ͻ 0.05 compared to uninfected samples) and performed pathway enrichment analysis using KEGG and TBDB pathway definitions (55). Upregulated host transcripts were mainly enriched for infection-associated pathways, such as antigen processing and presentation, for steroid and primary bile acid  biosynthesis, and for xenobiotic degradation pathway. In contrast, aminoacyl-tRNA biosynthesis and several signaling pathways were among the significantly enriched functional groups of downregulated host transcripts (Table S2). In mycobacteria, several amino acid biosynthetic pathways were enriched among the downregulated transcripts, whereas glycine cleavage pathways and the previously reported methylcitrate cycle and steroid degradation were enriched in the group of upregulated transcripts ( Fig. 3C and Table S2). Overall, transcriptional changes in the host revealed a general stress response, whereas numerous pathway activation and deactivation events in the pathogen indicated major mycobacterial adaptations during infection. Although pathway enrichment analysis provides a general picture of the changes occurring, it strongly depends on arbitrary pathway definitions and is performed separately for metabolites and genes. To combine these data to characterize host-pathogen interactions better, we next integrated metabolite and RNA-seq data into an unbiased genome-scale reaction pair network. Unbiased integration of metabolomics and RNA-seq data using a genome-wide reaction pair network. To overlay metabolite and transcript data independent from a priori pathway definitions, we mapped the observed changes onto a genome-scale reaction pair network to identify changing metabolic subnetworks. We built the network based on KEGG reaction pairs and searched for subnetworks containing a minimum of three metabolically connected enzymes with significantly changing transcription (|log 2 fold change| Ͼ log 2 1.5 and FDR Ͻ 0.05 for at least one time point), as recently applied to proteomics data (56). We identified 24 subnetworks of changing enzymes and metabolites with a length of 3 or more that represent metabolic changes upon infection (Fig. 4). Several subnetworks had an increasing (cholesterol degradation, nucleotide salvage pathway, and pentose phosphate pathway) or decreasing (aromatic and branched-chain aliphatic amino acids and porphyrin metabolism) transcriptional pattern, while some showed a mixed response (glycerophospholipids, fatty acids, and central carbon metabolism). In the following sections, we focus on the key hostpathogen interaction subnetworks identified.
Cholesterol degradation fuels M. tuberculosis metabolism. Bacterial consumption and degradation of a host metabolite should lead to its decrease and to activation of the pathogen's catabolic genes (activated metabolic subnetworks in Fig. 4). Cholesterol utilization by intracellular tubercle bacilli has been discovered through different experimental approaches (23,34,57) and hence served as an illustrative consumption example. Most bacterial genes of the catabolic pathway were expressed at higher levels during infection. Consistent with the hypothesis of induced catabolism, we observed accumulation of the intermediate 4,5-9,10-diseco-3-hydroxy-5,9,17-tri-oxoandrosta-1(10),2-diene-4-oic acid (DSHA), which is not produced by humans and hence directly reflects mycobacterial cholesterol degradation (Fig. 5A). Host cholesterol levels decreased by 80% (P ϭ 0.09), and other cholesterol derivatives were also depleted during infection (Fig. 5A). As a consequence of such cholesterol depletion, host cells may need to replenish their cholesterol pools. In fact, we found induced expression of methylsterol monooxygenase 1 (EC 1.14.13.72) (MSMO1; Entrez gene identifier 6307) and the lipase A/cholesterol esterase (EC 3.1.1.13) (LIPA; Entrez gene identifier 3988) plus general enrichment of the steroid biosynthesis pathway among the host genes with increased expression during infection, which is consistent with increased cholesterol synthesis ( Fig. 5A and Table S2).
Cholesterol degradation results in profound abundance of propanoyl-coenzyme A (CoA), which becomes toxic unless further metabolized (58,59). Expectedly, we found strong transcriptional induction of the mycobacterial methylcitrate cycle, which converts propanoyl-CoA and oxaloacetate to succinate and pyruvate (Fig. 5B). Increased expression of Rv3280 and Rv3281 involved in converting propanoyl-CoA to methylmalonyl-CoA (Fig. 5B) indicated increased synthesis of methylmalonyl-CoA. This is likely linked to its role as a building block for branched-chain fatty acids, which have been found to be more abundant under axenic growth conditions on cholesterol (59). This is further supported by the fact that the expression of mutA (methylmalonyl-CoA mutase, Rv1492), converting methylmalonyl-CoA to succinyl-CoA in the tricarboxylic acid (TCA) cycle, was downregulated (Fig. 5B). These data underscore prior evidence of in vivo cholesterol consumption and indicate that the propanoyl-CoA produced by cholesterol degradation is converted to methylcitrate to fuel central carbon metabolism or to methylmalonyl-CoA to enter lipid biosynthesis (34).   . Nodes represent metabolites, and edges represent reactions according to the KEGG main reaction pairs. Reaction directions are indicated according to the KEGG kgml pathway maps (the arrows do not necessarily correspond to reaction reversibility). Only metabolic paths connecting changing reactions with length of Ն3 are shown. G1P, glucose-1-phosphate; GAP, D-glyceraldehyde 3-phosphate; S7P, sedoheptulose 7-phosphate; DSHA, 4,5-9,10-diseco-3-hydroxy-5,9,17-tri-oxoandrosta-1(10),2-diene-4-oic acid. Upregulated genes had a log 2 fold change Ͼ log 2 1.5 and FDR Ͻ 0.05 for at least one time point, and downregulated genes had a log 2 fold change ϽϪlog 2 1.5 and FDR Ͻ 0.05 for at least one time point. Increased metabolites had a log 2 fold change Ͼ log 2 1.5 and FDR Ͻ 0.05 for at least one time point, and decreased metabolites had a log 2 fold change ϽϪlog 2 1.5 and FDR Ͻ 0.05 for at least one time point. Subgraphs were drawn with Cytoscape and manually classified into KEGG pathways (see Table S4 in the supplemental material).
Cholesterol degradation illustrates the pathogen's activation of catabolic pathways to utilize host metabolites, which are consequently depleted in host cells. On the basis of a comparable activation pattern, we provide evidence that tubercle bacilli utilize glycine and sugar phosphates through the glycine cleavage and rhamnose biosynthesis pathways, respectively ( Fig. S2A and B).
Consumption of hydrophobic amino acids deactivates anabolism by mycobacteria. Among the deactivated mycobacterial subnetworks were nonpolar and aromatic amino acid biosynthesis pathways, i.e., expression of most genes involved in valine, leucine, and isoleucine biosynthesis as well as phenylalanine, tyrosine, and tryptophan biosynthesis was downregulated during infection (log 2 fold change Ͻlog 2 1.5; FDR Ͻ 0.05), whereas the levels of the corresponding end products decreased over time (log 2 fold change ϽϪlog 2 1.5; FDR Ͻ 0.05) ( Fig. 5C and D). This pattern of transcript and metabolite data suggests that these amino acids are directly utilized by M. tuberculosis, which is supported by a recent study demonstrating replication of a tryptophan auxotrophic M. tuberculosis strain in human monocyte-derived macrophages (20). As a consequence of supplemented biosynthesis with host-derived amino acids, corresponding anabolic pathways are downregulated. For other amino acids, expression patterns of biosynthetic and degradation pathways changed in different directions, suggesting that M. tuberculosis depends, at least in part, on their biosynthesis during infection. Nonpolar and aromatic amino acids may be particularly accessible nutrients for M. tuberculosis because their hydrophobic nature facilitates entry into the phagosome by passive membrane diffusion. Another example of metabolite uptake that deactivates biosynthesis in intracellular mycobacteria is pyrimidine metabolism, since levels of pyrimidine intermediates decreased during infection, transcription of the enzymes required for de novo synthesis from the pentose phosphate pathway intermediates was also decreased, whereas enzymes of the nucleotide salvage pathway were increased (Fig. S2C). In contrast to catabolic cholesterol degradation, these examples illustrate the availability of biomass precursors at the site of infection, because bacterial consumption deactivates their biosynthetic pathways in M. tuberculosis.
Integrating RNA-seq constraints into a genome-scale model of host-pathogen interactions. Although data integration using a genome-wide reaction pair network identified subnetworks of particular interest during infection, this approach focused solely on the bacterial response without considering metabolic host-pathogen interactions in their entirety. To obtain a more integrated picture, we modeled metabolic fluxes between host and pathogen using a comprehensive combined genome-scale metabolic model of macrophages and intracellular tubercle bacilli (38). This model contains both human and mycobacterial metabolic networks with gene-protein-reaction annotations connected through the phagosomal compartment. We integrated the dual RNA-seq data of both host and bacterial gene expression into the model using the gene-proteinreaction annotations and estimated interspecies metabolic fluxes between the host and pathogen to gain a complete view of the infection system behavior.
Modeling metabolic host-pathogen interactions was performed in five steps. First, we expanded the existing model by including the reactions that we identified to be transcriptionally changed in the pathogen during infection and that were missing; i.e., adding cholesterol degradation, methylcitrate cycle and the capacity to synthesize glycerol for the production of glycerolipids. Second, we updated the gene-proteinreaction rules in the original model using the latest version of human (Recon 2) and mycobacterial models (60). Third, we relaxed the phagosomal uptake constraints in the existing model, which forced the bacteria to utilize glycerol and nitrogen monoxide as the major carbon source and sole nitrogen source, respectively. This step is justified by previous work showing that THP-1 cells do not produce prominent amounts of nitrogen monoxide (61) and that glycerol consumption plays a negligible role during infection (62). Fourth, we used an optimization procedure to find the flux solution in the modified model that maximizes either macrophage or mycobacterial biomass flux. The macrophage maximal biomass flux was 0.027 h Ϫ1 as in the original model (38), whereas the mycobacterial maximal biomass flux was 0.029 h Ϫ1 (compared to 0.0021 h Ϫ1 with the original constraints) given the possibility to take up different nutrients from the phagosome. This corresponds to an estimated mycobacterial doubling time of 24 h, which matches our experimental data based on CFUs indicating roughly two replications during the 48-h infection experiment. Fifth, following the rationale that gene expression and fluxes through an enzyme are loosely related, we used transcript data to formulate an optimization function based on activation and repression of genes, as previously described (63). Briefly, we searched for a flux solution through the network that maximizes the number of active and inactive reactions corresponding to increased and decreased gene expression levels, respectively, while satisfying stoichiometric constraints. This procedure results in condition-dependent flux solutions that mimic the metabolic behavior of the cells provided RNA-seq fold change data and thresholds as model inputs.
In the host-pathogen model, bacteria are enabled to consume 68 different substrates, based on reported substrates under in vitro culture conditions (38). In order to investigate whether the RNA-seq data can provide insights into early host-pathogen interactions, we performed sensitivity analysis of the predicted phagosomal uptake to the mycobacterial genes used in the optimization function. We varied the number of differentially expressed genes used to formulate the flux optimization function by changing the fold change threshold and obtained the optimal flux solution for each set of the changing genes. Constrained by the RNA-seq data, the identified flux solution indicated consumption of 33 of the 68 possible M. tuberculosis nutrients for at least one of the tested sets of changing genes. Whereas cholesterol and aspartate were predicted to be taken up from the phagosome for Ͼ85% of the tested gene sets in the sensitivity analysis, glycerol was utilized rarely (in solutions for Ͻ30% tested gene sets), which is in agreement with previous experimental findings (62) (Fig. 6A). To assess the temporal development of the host-pathogen interactions, we also compared the solutions resolved for the constraints of each time point separately. As the infection progresses, fatty acids seem to become a less prominent carbon source, whereas amino acids gain importance as a carbon source (Fig. S3).
In order to investigate the contribution of the predicted substrate consumption on mycobacterial growth, we repeated the flux optimization procedure using the full list of RNA-seq constraints (all genes that pass the threshold of |log 2 fold change| Ͼ log 2 1.5 and FDR Ͻ 0.05) for each set of constraints on the lower bound of mycobacterial and host cell biomass flux, ranging from 0 to 50% of either maximal value (36 biomass constraint sets in total). For all parameter sets, fatty acids, amino acids, and cholesterol were predicted to be consumed (Fig. S4), indicating the robustness of the predictions against the set biomass parameter in the model. The consumption profiles, however, varied quantitatively, depending on the set constraints (Fig. S5). For each of the 36 constraint sets, we calculated the fraction of consumed metabolite used for biomass formation through comparison of uptake and biomass flux multiplied by the corresponding coefficient (Fig. S6). Among 33 identified carbon sources, 9 were used mostly for biomass (fraction Ͼ 0.50), whereas the remaining ones are mostly catabolized to produce other metabolites or energy (fraction Ͻ 0.5) ( Table 1 and Table S3). The main nitrogen sources were amino acids rather than nitrogen monoxide, consistent with increased gene expression of Rv2220 catalyzing the central nitrogen metabolism reaction from ammonia and glutamate to glutamine (Table S2). Of the 33 predicted carbon sources, 31 could be detected by our metabolomics measurements, of which 26 were significantly decreased during infection (log 2 fold change Ͻ Ϫlog 2 1.5; FDR Ͻ 0.05; Fig. 6B and Table S1), supporting our hypothesis of their bacterial consumption.
Overall, M. tuberculosis appears to follow a multiple-substrate fueling strategy during the infection of macrophages, utilizing a large variety of different nutrients ranging from amino acids to carbohydrates and lipids (Fig. 6). This conclusion is further supported by the activation of several mycobacterial transporters of sugars and amino acids (Fig. 6C and Table S2). Compared to a single carbon source metabolism, this metabolic adaptation suggests increased flexibility and hence robustness of M. tuberculosis toward interventions, either through host responses or chemotherapeutic treatments.

DISCUSSION
We characterized metabolic interactions between human macrophage-like cells and M. tuberculosis during early infection by integrating dynamic metabolomics and dual RNA-seq data with two genome-wide network analysis approaches. In the first approach, we identified metabolic subnetworks that were differentially active during infection of human macrophage-like cells with M. tuberculosis. This genome-wide reaction pair analysis revealed two distinct patterns of activity employed by intracellular tubercle bacilli to utilize available host metabolites. For certain consumed host metabolites, the pathogen activates specific metabolic pathways to make use of newly acquired substrates, such as the cholesterol degradation, methylcitrate cycle, and glycine cleavage pathways. Other metabolites, such as some amino acids and nucleotides, are directly used for anabolism, allowing M. tuberculosis to downregulate respective endogenous biosynthesis pathways. In addition to changes in mycobacterial metabolism, we identified adaptations in host metabolism and found indications for mycobacterial manipulation of host metabolism, for example an altered lipid composition that was likely caused by upregulated mycobacterial lipases during infection. In the second approach, the key findings of the first network analysis built the basis for refinements of a metabolic host-pathogen model. Subsequent genome-scale modeling with RNA-seq-derived constraints of the metabolic interactions between macrophages and intracellular tubercle bacilli revealed a multiple-nutrient strategy during early infection, utilizing more than 30 different carbon and nitrogen sources. Sensitivity analysis of the set constraints further weighted the robustness of each predicted substrate. Metabolic flux analysis further enabled us to estimate the relative contribution of each nutrient to the pathogen's intracellular growth and survival at different time points of the infection through determination of the fates of these nutrients in either anabolism or catabolism to serve as biomass precursors or energy sources, respectively. The demonstrated data-driven modeling of the host-pathogen interactions serves as a platform for future experiments aiming at integrating multi-omics data during bacterial infections. M. tuberculosis infection cycles can span decades during which pathogen and host undergo different stages as a result of persistent interactions (64). Generally, in vitro infection models such as the early infection model used here cannot recapitulate this complexity and are subject to inherent heterogeneity with respect to infection levels and intracellular growth. The results thus have to be interpreted with appropriate caution. Nevertheless, the results may give insights into some general  principles of mycobacterial infection strategies. The multiple-substrate fueling of intracellular M. tuberculosis demonstrated here underlines the organism's ability to utilize even sparsely available nutrients. This ability renders M. tuberculosis robust against pharmacological or host interventions that aim at blocking the food supply, but our findings could potentially help to identify the pathogen's "metabolic Achilles heel" for future therapeutic intervention measures.
To illustrate how our and future data may be used for guidance of therapeutic interventions, we performed a gene essentiality analysis with the combined hostpathogen genome-wide metabolic model before and after integration of our omics data by setting the flux value for each mycobacterial reaction to zero and estimating bacterial in vivo growth by solving the mycobacterial biomass optimization problem. The suggested multisubstrate strategy reduced the predicted essential metabolic genes from 162 to 103 (growth rate reduction Ͼ 90% compared to the wild type). The 59 genes that became "nonessential" under conditions of multisubstrate availability belonged mostly to amino acid metabolism (44 genes) (Table S3). Of these 59 genes, 18 were genes that were also found to be nonessential under axenic culture conditions (65). The 103 genes that were predicted to impact mycobacterial in vivo replication under multisubstrate utilization conditions were mainly found in membrane (27 genes), amino acid (17 genes), nucleic acid (29 genes), and lipid metabolism (18 genes). Our results suggest that there are fewer genes truly essential for mycobacteria during infection, thus narrowing down options of potential drug targets in mycobacterial metabolism. Supported by the recently discovered growth inhibitors of intracellular M. tuberculosis targeting cholesterol metabolism (57), we believe that our identified active metabolic subnetworks should be focused upon in future investigations aiming at identifying suitable metabolic perturbation targets in M. tuberculosis infection. Metabolic sample preparation, measurements, and data analysis. Cells were washed twice with prewarmed ammonium acetate solution (75 mM, pH 7.5), frozen on dry ice, and stored at Ϫ80°C until further processing. Metabolites were extracted three times with 70% ethanol at 78°C for 2 min, and the three extracts were pooled, dried under vacuum at ambient temperature (22°C), and resuspended in 0.5-ml water for metabolic analysis. Untargeted measurements of metabolites was performed by direct injection mass spectrometry using a quadrupole time of flight instrument (Agilent 6550 Q-TOF) with the following settings: negative mode; 4-GHz high-resolution mode; scanning the m/z range of 50 to 1,000 following published protocol (40). Negative ionization mode was chosen for its ability to detect the organic acids and phosphorylated compounds central to carbon metabolism and hence essential for the employed metabolic models. Ions were annotated to metabolites based on exact mass considering [M-Hϩ] and [MϩF-] ions using the metabolite reference list compiled from the Kyoto Encyclopedia of Genes and Genomes (KEGG) metabolite repositories hsa (Homo sapiens) and mtu (M. tuberculosis), the Human Metabolome Database (HMDB) (46), and Mtb Lipid Database and MycoMass databases of mycobacterial lipids. Ions were assigned to metabolites using this list and allowing a mass tolerance of 1 mDa and an intensity cutoff of 1,500 counts as previously described (40). Raw intensity values were normalized by mean intensity value in each sample. The annotation was filtered in three steps. (i) For each ion, only metabolites with the top annotation score were retained. (ii) For each metabolite, only the annotation with the top score was retained. (iii) The annotations with adducts such as NaCl, H/Na, H/K were removed. Measurements were filtered to account for noise using the permutation filtering method recently reported (44). In brief, the variability of neighboring time point measurements was calculated and compared to the variability of the time point measurements arranged in a randomized order. The measurements with variability indifferent from the one of randomized data were removed from further analysis (quantile cutoff Ͼ 0.33).

Infection
RNA isolation, mycobacterial RNA enrichment, and RNA sequencing. Total RNA from mycobacterial cultures was prepared as previously described (66). Extraction of total RNA from THP-1 cells was prepared with TRIzol reagent using glycogen as a carrier according to the supplier's recommendation (Life Technologies). Total RNA from M. tuberculosis-infected THP-1 cells without enrichment was isolated by abrasive particles in a reciprocal shaker with TRIzol (66). Enrichment of mycobacteria from infected THP-1 cells was carried out by differential lysis of host and mycobacterial cells using guanidine thiocyanate (GITC).
Infected cells were washed with PBS at room temperature (RT). Cold 4 M GITC was added to the monolayer, and the cells were transferred to a 1.5-ml screw-cap tube. After centrifugation, the pellet was resuspended in residual GITC and mixed with 1 ml TRIzol containing 20 g/ml linear acrylamide, followed by incubation for 5 min at RT. Bacteria were disrupted by bead beating (FastPrep instrument; two cycles of 30 s at maximum speed with cooling on ice between cycles). The sample was centrifuged for 1 min at 4°C and 13,000 rpm, and the supernatant was transferred to a 2-ml screw-cap tube containing 200 l chloroform, mixed, and incubated at RT for 5 min. After centrifugation at 4°C and 13,000 rpm for 10 min, RNA was extracted from the aqueous phase using the Qiagen RNeasy minikit including an on-column DNase digestion (Qiagen). The quality and quantity of total RNA were assessed using an Agilent 2100 bioanalyzer (Agilent Technologies) and a NanoDrop 1000 spectrophotometer.
Bar-coded RNA-seq libraries were prepared according to the TruSeq RNA Sample Preparation v2 Guide (67) without fragmentation, as no additional fragmentation step was required because the average final library size was approximately 350 bp, and without size selection. The Gram-positive bacteria Ribo-Zero (Epicentre) rRNA magnetic removal kit was used to remove bacterial rRNA from mycobacterial total RNA. For total RNA from uninfected THP-1 cells at time point 0 h, the Ribo-Zero magnetic kit human/mouse/rat was used, while depletion of rRNA from infections with and without enrichment were depleted with the Ribo-Zero magnetic gold kit (epidemiology). All cDNA libraries were verified for size distribution and quality using the highly sensitive DNA kit (Agilent) on a 2100 bioanalyzer and quantified with the Qubit 2.0 fluorometer (Life Technologies). Libraries from each consecutive experiment were pooled as 8-plex and on-board loaded with a Hi-Seq 1500 instrument. All sequencing reactions were carried out as rapid runs using TruSeq rapid PE (paired-end) cluster kits and a 200-cycle TruSeq rapid SBS (sequencing by synthesis) kit as 2 ϫ 51 cycles including 7-cycle indexing in order to obtain 50-bp paired-end reads.
Mapping of RNA-seq reads. Raw reads were trimmed using Trimmomatic (68) to remove lowquality bases and adapters and quality checked using FastQC (http://www.bioinformatics.babraham.ac .uk/projects/fastqc/). Fastq files were imported in pairs into the CLC Genomics Workbench (CLC Bio, version 7.5.1) running on a CLC Genomics Server (CLC Bio, version 6.5.2). The imported reads were mapped to the mycobacterial genome H37Rv (NC_000962) and Homo sapiens genome hg38 (GRCh38) with the RNA-seq tool. For transporter analysis, RNA-seq data were mapped to a set of mycobacterial transporters compiled from the Transporter Classification Database (TCDB) [http://www.tcdb.org/ download.php, section Retrieve protein FASTA sequence(s) with Organism Name, query transporters for mycobacterium]. The following parameters were used for mapping: mismatch cost of 2, insertion cost of 3, deletion cost of 3, length fraction of 0.8, similarity fraction of 0.8, maximum number of hits for a read of 10, and failed reads were removed. Mapping reports were generated for each sample and exported from the CLC Genomics Workbench in csv format for further analysis. Unique read counts were normalized by a size factor calculated for each sample as the median of the ratios of the detected gene counts to the counts of an averaged sample. The averaged sample was calculated as the geometric mean across all samples (69). Enriched samples were used for mycobacterial RNA analysis, and nonenriched samples were used for human RNA analysis.
Statistical analysis. Mass spectrometry and dual RNA-seq data were imported into Matlab (Math-Works) for quantitative and statistical analysis. For differential analysis, human samples with spiked bacteria at a ratio of 0, 5, 10, or 20 bacteria per human cell were compared to the human samples without bacteria using t test with unequal variances (four biological replicates and two technical replicates per condition). Metabolomics mixed samples from infection and control experiments at 4, 24, and 48 h p.i. were compared to uninfected human samples using a t test with unequal variances (three or four biological replicates and two technical replicates per time point). Fold changes of the infection versus control experiments at 4, 24, and 48 h p.i. were compared using a t test with unequal variances. RNA-seq nonenriched mixed samples at 4, 24, and 48 h p.i. were compared to the uninfected human samples; RNA-seq enriched mixed samples were compared to the mycobacterial samples using a negative binomial model (Matlab function nbintest with constant variance link and pooled/nonpooled variance for enriched/nonenriched samples) (three biological replicates per time point). P values were corrected for multiple hypotheses testing by calculating the false-discovery rate (FDR) by the Benjamini-Hochberg procedure. For functional group and pathway enrichment, metabolite classes were defined as in the HMDB database. Pathway descriptions were downloaded from KEGG and TBDB databases. Metabolites passing the threshold of |log 2 fold change| Ͼ log 2 1.5, FDR Ͻ 0.05, and FDR(change to control) Ͻ 0.05 and transcripts passing the threshold of |log 2 fold change| Ͼ log 2 1.5 and FDR Ͻ 0.05 were ranked according to P value or fold change at each time point. Enrichment analysis was performed with the Fisher exact test for each subset of size varying from 1 to the total changing set size, and the smallest P value was retained for each pathway, as proposed in the gene set enrichment analysis (GSEA) method (45). P values for the enrichment analysis were adjusted for multiple hypotheses testing by calculating FDR by the Benjamini-Hochberg procedure.
Computational analysis. All computational analyses were done in Matlab (MathWorks), if other tools are not mentioned explicitly. For KEGG reaction pair network analysis, the KEGG reaction pair list, KEGG reaction pair-mycobacterial gene association list, and KEGG compound list were downloaded from KEGG API (application programming interface) (http://rest.kegg.jp/). An in-house script in Python (Python 2.7) was used to retain only main reaction pairs. For each mycobacterial gene, a list of metabolites involved in reactions associated with this gene was created. A gene-gene connectivity matrix was built based on the metabolites associated with genes (if two genes are associated with reactions sharing a metabolite, they are connected through this metabolite). K-shortest path script based on Yen's algorithm (http:// www.mathworks.com/matlabcentral/fileexchange/32513-k-shortest-path-yen-s-algorithm) was used to calculate the shortest paths between two genes based on the connectivity matrix. All paths of length three and more were visualized in Cytoscape software (Cytoscape 2.8.3). Reaction directions were downloaded from KEGG (kgml pathway files) with an in-house Matlab script.
Flux balance analysis with RNA-seq constraints: the joint macrophage-mycobacteria model was downloaded from the repository (http://systemsbiology.ucsd.edu/InSilicoOrganisms/MacTB) and extended with metabolites and reactions listed in Table S3 in the supplemental material. The latest releases of human genome-scale model Recon 2.04 and iNJ661 (http://bigg.ucsd.edu/models/iNJ661) were used to update the gene-protein-reaction (GPR) information (Table S3). Flux balance analysis was performed optimizing for biomass production using the cplexlp tool from CPLEX Optimizers 12.5.1 (IBM). For RNA-seq data integration, reactions associated with human and mycobacterial transcripts passing the threshold of |log 2 fold change| Ͼ log 2 1.5 and FDR Ͻ 0.05 for at least one time point were defined active or inactive. Mixed-integer linear programming (MILP) minimizing the flux through inactive reactions and maximizing the flux through active reactions was defined as described in reference 63 and solved using the cplexmilp tool. To perform sensitivity analysis with the RNA-seq constraints, mycobacterial genes were sorted by maximum absolute fold change, and all sets of parameters from 1 to the total number of changing model genes (236 genes) were used to constrain the model without additional biomass constraints. To perform sensitivity analysis with biomass constraints, a set of 36 constraints was used (all combinations of lower bounds for mycobacterial and TPH1 biomass fluxes from the set [0 10% 20% 30% 40% 50%] of maximum value). All changing genes were used to constrain the model. Based on the flux solution, catabolic flux of a metabolite was calculated as the difference between the phagosomal uptake flux and biomass requirements defined in the model for each set of biomass constraints, and the average fraction of metabolite taken up that was used for biomass formation was calculated. For the gene essentiality analysis, the original and modified models (with phagosomal uptake upper bounds for metabolites identified from sensitivity analysis to RNA-seq constraints set at 1,000) were used to solve the mycobacterial biomass optimization task with single gene knockout simulations. The genes were defined essential if the knockout resulted in more than 90% decreased growth rate compared to that of the wild-type strain.
Accession number(s). Raw sequence reads of the RNA-seq analysis can be downloaded from the EMBL-EBI European Nucleotide Archive under accession no. E-MTAB-5287. Metabolomics data are available in Table S1. The modified metabolic model and the browsable Cytoscape file of Fig. 4 can be downloaded from the laboratory's webpage (http://www.imsb.ethz.ch/research/sauer.html). All numerical data used for all main and supplemental figures are provided in the supplemental tables.