Genome-scale metabolic modeling of the human milk oligosaccharide utilization by Bifidobacterium longum subsp. infantis

ABSTRACT Bifidobacterium longum subsp. infantis is a representative and dominant species in the infant gut and is considered a beneficial microbe. This organism displays multiple adaptations to thrive in the infant gut, regarded as a model for human milk oligosaccharides (HMOs) utilization. These carbohydrates are abundant in breast milk and include different molecules based on lactose. They contain fucose, sialic acid, and N-acetylglucosamine. Bifidobacterium metabolism is complex, and a systems view of relevant metabolic pathways and exchange metabolites during HMO consumption is missing. To address this limitation, a refined genome-scale network reconstruction of this bacterium is presented using a previous reconstruction of B. infantis ATCC 15967 as a template. The latter was expanded based on an extensive revision of genome annotations, current literature, and transcriptomic data integration. The metabolic reconstruction (iLR578) accounted for 578 genes, 1,047 reactions, and 924 metabolites. Starting from this reconstruction, we built context-specific genome-scale metabolic models using RNA-seq data from cultures growing in lactose and three HMOs. The models revealed notable differences in HMO metabolism depending on the functional characteristics of the substrates. Particularly, fucosyl-lactose showed a divergent metabolism due to a fucose moiety. High yields of lactate and acetate were predicted under growth rate maximization in all conditions, whereas formate, ethanol, and 1,2-propanediol were substantially lower. Similar results were also obtained under near-optimal growth on each substrate when varying the empirically observed acetate-to-lactate production ratio. Model predictions displayed reasonable agreement between central carbon metabolism fluxes and expression data across all conditions. Flux coupling analysis revealed additional connections between succinate exchange and arginine and sulfate metabolism and a strong coupling between central carbon reactions and adenine metabolism. More importantly, specific networks of coupled reactions under each carbon source were derived and analyzed. Overall, the presented network reconstruction constitutes a valuable platform for probing the metabolism of this prominent infant gut bifidobacteria. IMPORTANCE This work presents a detailed reconstruction of the metabolism of Bifidobacterium longum subsp. infantis, a prominent member of the infant gut microbiome, providing a systems view of its metabolism of human milk oligosaccharides.

probiotics in the food industry for their multiple health-promoting activities (5).These microorganisms have a saccharolytic lifestyle, fermenting several carbohydrates with a preference for oligo and dietary polysaccharides (6).
The Bifidobacterium genus relies on the bifid shunt for central metabolism (7).This pathway is characterized by the presence of fructose-6-phosphate phosphoketolase (PKL), and it enables Bifidobacterium to produce more ATP per fermented glucose (2.5 ATP) than fermentative pathways present in other bacteria, e.g., two ATP in lactic acid bacteria (8).Essential enzymes in this shunt are fructose 6-phosphate erythrose 4-phosphate lyase (F6PE4PL), transaldolase (TALA), and transketolases (TK1 and TKT2), as well as ribose 5-phosphate isomerase and epimerase (6).Bifidobacterium metabolism theoretically produces acetate and lactate in a 3:2 ratio as end-products from hexoses (9).This ratio is indeed variable, and it depends on the nature of the substrate used for fermentation (10).The production of lactate and ethanol is useful for NADH regeneration (11), and the TCA cycle is incomplete in most Bifidobacterium (12).In addition, Bifido bacterium genomes display an enrichment in carbohydrate metabolism genes (13)(14)(15).These include exo-and endoglycolytic enzymes that target complex carbohydrates, ABC transporters, and feeder pathways deriving carbohydrates into the bifid shunt (16)(17)(18).Genomic and functional studies have shown that these microorganisms have coevolved with the host and display an extensive array of adaptations for the infant gut environ ment (19,20).
Genome-scale metabolic models (GSMMs) are powerful mathematical structures that provide a coherent representation of cell metabolism, enabling a better understanding of metabolic behaviors and microbial adaptations under different scenarios (33,34).To improve their prediction capabilities, computational methods integrating additional omics data, such as transcriptomics, have been developed and evaluated (35,36).These tools, paired with high-quality metabolic models, have provided a helpful platform for probing cellular metabolism and increasing our knowledge about bifidobacterial metabolism (7,12,37) Although B. infantis is a model microorganism for HMO utilization, the active metabolic pathways, fluxes, and metabolite exchange reactions associated with the metabolism of distinct HMOs molecules remain poorly understood.To address this limitation, a refined genome-scale network reconstruction of B. infantis metabolism (iLR578) was built consistent with recent literature information and transcriptomic data.The former reconstruction was then employed to generate four context-specific GSMMs describing growth on distinct carbon sources such as lactose or HMOs such as LNnT, 3′FL, and 6′SL.Our results revealed emergent patterns of metabolic adaptations of B. infantis under different HMOs utilization conditions.

Genome-scale network reconstruction and model building
Reconstruction and refinement of B. infantis (iLR578) metabolic network involved multiple steps (Fig. 1).First, data from multiple databases and references were employed to define gene-protein-reaction (GPR) relationships for each gene in the reconstruc tion.This process is used as a template for the reconstruction of Bifidobacterium_lon-gum_infantis_ATCC_15697 from AGORA 1.03 (Assembly of Gut Organisms through Reconstruction and Analysis) (38) and integrated genome annotation information for this strain (39).Genes were manually translated from PEG to Blon annotation using the SEED database (40).Context-specific metabolic models were constructed using computational reconstruction tools from COBRA Toolbox v3.0 for MATLAB (41).Transcriptomic data from a previous study (28) were used to confirm gene expression under different carbon sources (see below).This information was integrated into each metabolic model and employed to gap-fill the model using flux balance analysis (FBA) to maximize the specific growth rate as a biological objective (42).Additional reactions and metabolites were added until the list of experimentally expressed genes was consistent with the network reconstruction in all carbon sources.Moreover, the removal of reactions and genes without experimental support or evidence against their presence in the template model was also carried out (Supplemental Data).Finally, two biomass reactions were employed to simulate growth: a generalized reaction in the original AGORA v1.03 reconstruction and a recent biomass reaction from a close strain (B.longum) (12).
For the initial integration of the model with transcriptomics data, normalized counts for each gene and condition were determined using a variety of preprocessing and normalization steps (see next section).Then, reactions associated with each expressed gene were evaluated for their capacity to carry flux while allowing biomass growth (Supplemental Data).The list was checked manually, and reactions and metabolites were added based on information from KEGG pathways (43) and BiGG Models (44).Finally, the quality of the metabolic reconstruction was assessed with the open tool MEMOTE (MEtabolic MOdel TEst suite) (45).

Transcriptomics data quality control and preprocessing
Preliminary analysis of the transcriptomic data suggested a potential contamination with reads of another Bifidobacterium (Bifidobacterium bifidum).For this task, a quality control step was performed using FastQC (46) and MultiQC (47), and low-quality reads were trimmed using Trimmomatic (48) with the following parameters: leading 3, trailing 3, sliding window 3:15, and minlen 36.To eliminate B. bifidum reads, the transcriptomic libraries of B. infantis were aligned to the B. bifidum genome using bowtie2 (49).Subsequently, samtools was used to obtain the read identifiers.Using BBMap and Hisat2 (50,51), reads mapped to B. bifidum were removed from the data set.To assess the removal effect of filtered libraries, a principal component analysis was conducted to compare the behavior of the counts data of replicates under different conditions (Fig. S1).After removing the contaminated libraries, the processed data displayed a similar behavior for all but two conditions (2'-fucosyl lactose and lacto-N-neotetraose), forming clusters at approximately the same locations.This pointed to a robust data set behav ior, supporting the execution of the next steps.The preprocessed data were finally normalized using the median of ratios method from the DESeq2 (52) package to enable analysis across conditions.

Growth simulation under different carbon sources
FBA simulations were performed to probe B. infantis metabolism under different carbon sources.Gurobi 9.0.1 (Beaverton, Oregon) was employed as a linear programming solver in all computations.Briefly, FBA (equation 1) solves a linear optimization problem whose objective is to maximize the flux through the biomass reaction in the model.S is the stoichiometric matrix, v is the metabolic flux vector, and μ ⊂ support(v) is the specific biomass growth rate (objective function).Upper and lower bounds for the reaction fluxes are captured by the ub and lb vectors, respectively.These vectors are defined according to each simulated condition such that they support growth in a defined in silico medium with distinct exchange reactions potentially active.Reactions associated with genes not expressed based on transcriptomic data have both bounds set to zero. (1) Typically, the linear optimization formulation in equation 1 has multiple alternative optimal flux solutions.An FBA variant known as parsimonious FBA (pFBA) (53) was employed to determine a biologically relevant flux distribution.This method is known to produce flux distributions that approximate optimal enzyme usage and avoid infeasi ble loops (54)(55)(56).In addition, flux variability analysis (FVA) was employed to evaluate alternative solutions by computing the flux ranges for relevant reactions (i.e., network flexibility) under (sub)optimal conditions (57).
The in silico medium used for all simulations was based on a modified Man Rogosa Sharp medium containing 2% (wt/vol) of each carbon source.The carbon sources were lactose and three HMOs: LNnT, 3′FL, and 6′SL.To compare yield calculation across conditions, the total number of moles for each substrate contained in 200 µL at 2% (wt/ vol) was calculated and later normalized by the moles of lactose.Other carbon sources were also normalized by the moles of lactose for a fair comparison (Table S1).Similarly, amino acids, vitamins, and nucleotides were proportionally normalized and asserted that they did not limit growth.Bounds for alternative carbon sources absent in the media were set to zero.

Evaluation of the refined metabolic model prediction capabilities
The third step was the evaluation of the prediction fidelity of the model.Simulated growth of context-specific models was validated against experimental data (28) and literature (16,(58)(59)(60)(61)(62)(63)(64)(65)(66)(67).Data on growth rates, metabolite consumption, and production (10) were used to assess model predictions.Briefly, a binary analysis was performed where the experimental growth of B. infantis on different carbon sources was compared against in silico predictions.In this case, four outcomes can be obtained.A true positive (TP) indicates that both the model and data indicate growth on a particular condition; a false positive (FP) implies the model can grow on a given substrate but is not supported by the experimental evidence; a true negative (TN) indicates that both the model and data agree that no growth is observed; and, a false negative (FN) implies the model does not support growth, but experimental evidence does.The F-score is a standard measure of the prediction capabilities of the model (equation 2) (68)].Additionally, Matthew's correlation coefficient (MCC) was also calculated as it is regarded as a stricter measure of prediction power, producing a high score (MMC = 1) only if the prediction obtained good results in the outcomes above (69) (equation 3).

Transcriptomic data integration
The fourth step was integrating transcriptomics data into the model.For this task, RNA-seq data from Garrido et al. (28) for Bifidobacterium (particularly B. infantis ATCC 15697) growing exponentially on lactose and three HMOs (3′FL, 6′SL, and LNnT) were employed.The GIMME algorithm (35) was applied to build four context-specific models.Briefly, GIMME uses a user-defined expression threshold corresponding to the 30th percentile to determine if a gene has a low/high expression and then deactivates the reactions below the former.This choice is based on typical choices reported elsewhere (35) and yielded similar simulation results (Table S13).Simply put, this algorithm minimizes the usage of low-expression reactions while maximizing the biomass reaction flux (biological objective).In this way, the resulting model achieves greater consistency with the available data and assumed biological objective (35).Reactions with an "AND" GPR association are associated with the corresponding lowest expressed gene, while those with an "OR" GPR association are associated with the corresponding highest expressed gene.

Gene essentiality analysis and flux-gene expression maps
Essential gene lists were generated by simulating single-gene deletions in all conditions.An essential gene is defined as a gene that, when knocked out, causes null growth.The corresponding reactions were identified as essential reactions.To visualize flux distribu tion results after FBA simulations, metabolic maps were generated using Escher (70), an online application for the integrated visualization of flux and transcriptomic data.MAT models were translated to JSON format and then uploaded to Escher, along with fluxes and a transcriptomic data table to visualize network expression maps.An integrated map for each condition was manually generated.

Flux coupling network analysis
To reveal emergent metabolic properties under each growth condition, Flux coupling analysis (FCA) was performed using the F2C2 tool version 0.95b (71) to reveal flux dependencies (coupling) between reactions under each condition.For this task, we filtered networks encompassing major end products that contained exchanges for lactate, acetate, succinate, formate, ethanol, and 1,2-propanediol (1,2-PD) in each condition.Coupled reaction networks were identified, and their topological parameters were calculated and visualized using Cytoscape (72) as in Dal'Molin et al. (73).Finally, to visualize differences in connectivity between the four context-specific models, we built a consensus network with Dynet (74).For each node, the linkages shared between two or more networks and those unique to each network were identified.Then, changes (rewiring) were quantified using the rewiring score metric (Dn-score [74]).

Reconstruction and refinement of a GSMM for B. infantis
In this work, we combined metabolic modeling and transcriptomics to refine and expand the metabolism of B. infantis during the utilization of lactose and three HMOs (3′FL, 6′SL, and LNnT).The workflow followed is presented in Fig. 1.The initial AGORA model had 541 genes in PEG annotation, 1,005 reactions, and 889 metabolites in two different compartments (Table 1).
To integrate the genome-scale transcriptomics data into the metabolic model, the PEG annotation was manually translated to Blon annotation using the SEED database (75), yielding Draft Model v1.0 (Fig. 1).One PEG gene did not have a match in the Blon annotation, but it had a match in another B. infantis genome (BLIJ_2570, "pbiosynthe sis").This reaction had several genes in OR association, so it was assumed unimportant (Supplemental Data).Then, genes associated with reactions and metabolites involved in HMOs utilization were added to the reconstruction.This addition expanded the gene list to 546 genes, with 1,021 reactions and 900 metabolites (Fig. 1).Finally, GPR relation ships were included for each gene based on the available experimental and reported information.
The model was later integrated with transcriptomics data.Reactions associated with expressed genes were individually checked to see if they could carry flux while ensuring biomass production under each condition.For this task, a generic GSMM of B. infantis was built using the current reconstruction, expression, and carbon source uptake data, thereby enabling the assessment of its growth capabilities using (p)FBA (76).In this step, 23 reactions, 24 metabolites, and 32 genes were mapped and incorporated into the model.Relevant GPRs were recovered from a previous model (12).In addition, two biomass reactions and the nongrowth-associated maintenance energy were included, although the latter was not constrained due to the lack of data.The list of blocked reactions contained 20 genes associated with 33 reactions; these were blocked in the model (i.e., carried zero flux) but are present as expression data supports them (Table S2).
The resulting metabolic model was labeled iLR578, a functional GSMM able to produce biomass for B. infantis in all the studied conditions.The latter model con tained 578 genes, 1,047 reactions (893 are gene-associated), and 924 metabolites (Table 1).Metabolic reactions represented 72.6% of total reactions (760 reactions), whereas transport and exchange reactions were 135 and 149 (12.9% and 14.2% of the total number of reactions, respectively; see Table 1).While the number of blocked reactions slightly increased, the number of metabolic reactions mapped onto iLR578 also grew from 714 (AGORA v1.03) to 760 (6.4% net increase; Table 1).Notably, the numbers of genes, reactions, metabolites, and GPRs of iLR578 are higher than other related bifidobacterial GSMMs reported to date (7).For instance, the current model B. infantis ATCC 1,597 has 541 genes, 1,005 reactions, and 889 metabolites, whereas B. longum subsp.longum 157F has 519 genes, 914 reactions, and 853 metabolites.This could correlate with the larger genome size of B. infantis ATCC 15697 (15).Notably, while the general reconstruction indicators of iLR578 were improved over the AGORA v1.03 reconstruction (Table 1), the MEMOTE score-a quality metric standard for genome-scale network reconstructions-did not increase (in both cases, the score was 43, Supplemen tary Data), despite having an increase of 16.1% of mapped metabolites.The MEMOTE score evaluates GSMMs compliance with community sharing and annotation standards, incorporation of relevant meta-data (e.g., annotations, GPRs, metabolite formulas, etc.), as well as different simulation checks (e.g., growth feasibility, flux cycles, and metabolic imbalances predictions, among others) (45).Yet, this suite does not cover specific metabolic capabilities under different conditions and their consistency against experimental data.Experimental evaluation of the predictive power of iLR578 is assessed in the next section.

Model prediction evaluation: growth and metabolic profiles
Exponential phase growth data from a previous study using four different carbon sources (28) was employed to evaluate the predictive power of the iLR578, checking biomass production under anaerobic conditions using pFBA.Before this analysis, the prediction capabilities of the two biomass reactions were evaluated.Results showed that the generalized biomass reaction from AGORA was superior, reaching a 55.3% lower root-mean-square error (RMSE) when predicting the specific growth rate under the assumption of biomass maximization (mean RMSE 0.048 vs 0.108 h −1 ; Table S6).These results are somewhat to be expected, considering that there is a twofold difference in the growth-associated maintenance energy (GAM) of the two reactions (20 mmol•gDCW −1 vs 40 mmol•gDCW −1 for the Schöpping et al. and AGORA biomass reactions, respectively; Supplementary file).Besides, out of the 74 precursors of the biomass reaction reported by Schöpping et al. (12), 64 could be successfully mapped onto the iLR578 model (Supplemental file ).It is worth noting that the latter biomass reaction was developed for B. longum subsp.longum BB-46 growing on sucrose (12), which greatly differs from the simulation conditions of this study.In fact, even when adjusting and equating the GAM parameter, the latter equation displays a slightly higher RMSE than the AGORA v1.03 biomass reaction (2% greater; Table S6).Considering these results and the lack of additional experimental data under the studied conditions (e.g., chemostats under various dilution rates) and/or gene essential data, we decided to employ the AGORA v1.03 biomass reaction for the remaining of the study.
A theoretical ratio of 3:2 is expected for acetate to lactate in the B. infantis metabolism (6), so the former was imposed in the subsequent flux calculations.To assess the impact of the magnitude of the production ratio, the latter was set to vary from 2.5:2 to 3.5:2 of acetate to lactate.Results showed no statistically significant differences in the predicted specific growth rates (one-way ANOVA P-value = 0.971, data available in Table S7) or the maximum biomass yields (one-way ANOVA P-value = 0.987, data available in Table S8).Although only a modest increase (1.7%) in the predicted specific growth rate was determined when imposing a 3.5:2 ratio production of acetate to lactate (Table S7), this ratio condition will be employed in the following for the analysis.
Experimental biomass yields were similar for LNnT and approximately twice that of 3′FL and 6′SL (Table 2).Prediction of biomass yields on LNnT agreed with model predictions using pFBA.However, predictions in the remaining conditions overestimated the biomass yield except for lactose (Table 2).The production yields of major metabolic end products, such as lactate, acetate, ethanol, and succinate, were also evaluated  (10), and Garrido et al. (28).Yields were calculated assuming a 3.5:2 acetate to lactate production ratio based on experimental observations.This ratio maximized the specific growth rate for all substrates and was therefore chosen for subsequent analyses (Table S11).
under the assumption of biomass growth maximization (Table 3).Again, there were no substantial differences when employing different production ratios of acetate to lactate (Table S9 and S10).LNnT, 3′FL, and 6′SL are HMOs that contain lactose as a building block.Model predictions showed higher acetate and lactate yields on lactose followed by 6′SL.Overall, low formate and no succinate production were determined in silico.
Fucose metabolism in Bifidobacterium uniquely results in pyruvate and 1,2-PD ( 64) as an end-product (Table 3).Application of flux fariability analysis for the determination of the yield ranges for the major end products under near-optimal conditions (i.e., 99% of the optimal specific growth rate) produced tight ranges, pointing to similar results and confirming the observed metabolic trends shown in Table 3 (Table S11).Most notably, even when the ratio constraint for acetate and lactate is relaxed under near-optimal conditions, the computed yield ranges are only slightly larger, and they overall display the same trends, i.e., high acetate and lactate production and very low formate and ethanol, and no succinate (Table S12).
Finally, iLR578 showed an increase in sensitivity and F-score compared to the initial AGORA v1.03 model when predicting the growth of various carbon sources (Table 4; Table S4) (68).The former model had 34 TP, four FPs, and four FN.This increase was explained by iLR578 having more TPs and fewer FNs than the original model.Amino acid metabolism appears to be accurately represented by the metabolic model, considering B. infantis is only auxotrophic for L-cysteine (79).The F-score of 89.4% indicates that there is still space to improve the metabolic predictive capabilities of the model.Bifidobacterium genomes usually have many hypothetical proteins, and their metabolism is still not completely understood (15).A higher MCC value for iLR578 compared to the initial model corroborates the improvements made to this GSMM (Table 4).

Context-specific models of HMO utilization based on transcriptomic data integration
Context-specific GSMMs were built from iLR578 and transcriptomic data on each HMO (Fig. 2).This integration resulted in four carbon source-specific models for 3′FL, lactose, 6′SL, and LNnT (Fig. 1; Supplemental Data).The highest number of deactivated transcriptionally turn-off reactions was observed in the LNnT model, with 152 reactions, followed closely by the 6′SL model (153 reactions), then the 3′FL model (145 reactions), and finally, the lactose model, with 140 deactivated reactions (Fig. 1).Context-specific models showed similar biomass yields compared to the generic iLR578 (Table 2).
We first analyzed gene expression patterns in these context-specific models.3′FL is correctly imported by an ABC transporter and degraded intracellularly, generating fucose and lactose (Fig. 2A).Gene expression for genes converting lactose in galactose and glucose (including the Leloir pathway) and their conversion to glucose-6-phosphate was medium to high.Similarly, the putative fucose metabolic pathway showed medium to high expression, resulting in the production of 1,2-PD through lactaldehyde reductase (Blon_0540), acetate (Blon_1731; acetate kinase), and lactate (Blon_ 1090 or Blon_0840; The lactose model showed similar gene expression patterns as 3′FL, without fucose metabolization and 1,2-PD production (Fig. 2B).It showed an increased gene expression for reactions at the entry of lactose through lactose permease to the bifid shunt until the production of lactate and acetate (Fig. 2B).Formate and succinate production was lower than acetate and lactate, which correlated with a lower gene expression of Blon_1714, Blon_1715 (pyruvate formate lyases, PFL), and Blon_2035 (dihydrodipicolinate synthase, DHDPS), respectively.
6′SL is imported intracellularly in B. infantis (18); however, the transporter is unknown.Like other HMO, 6′SL is likely imported by an ABC transporter.While gene expression supported a 6′SL-derived lactose metabolization like 3′FL or lactose, sialic acid utilization resulted in additional pyruvate and acetate production, with increased gene expression of the overlapping aminosugar metabolic pathway (Fig. 2C; Blon_0644, N-acetyl-Dmannosamine kinase; Blon_0645, N-acetylmannosamine 6-phosphate epimerase; Blon_0882, N-acetylglucosamine-6-phosphate deacetylase; Blon_0881, glucosamine-6phosphate deaminase).However, certain enzymes in the sialic acid metabolism showed a low or null gene expression, which could indicate that other genes participate in this process, or B. infantis does not fully use sialic acid.
Gene expression in the context-specific metabolic models showed several reactions with unexpected or crossed expressions.Some examples are the expression of LNnT transport and GlcNAc metabolism in the presence of 3′FL (Fig. 2A), α-fucosidases during growth in lactose, 6′SL, and LNnT (Fig. 2B through D), and some fucose metabolizing enzymes during growth in LNnT (Fig. 2D).As described previously (28), B. infantis is likely prepared to use multiple HMOs simultaneously, suggesting that HMOs are not usually sensed as only one molecule by B. infantis.

Comparison of flux patterns and gene expression in the central metabolism of iLR578 under each carbon source
Fluxomics data showed less complex patterns than transcriptomics (Fig. 3).No crossed reactions were observed when considering metabolic fluxes.Metabolic fluxes were favored through the bifid shunt pathway in all HMOs and lactose (Fig. 3), with differences in feeder pathways according to the nature of the compound.In all four distribution maps, the carbon flux goes from sources to products, favoring the production of lactate and acetate in the previously set ratio.Reactions in the bifid shunt presented relatively high fluxes during consumption of these four substrates, which correlated with an  S4.
Figure 3B shows the flux distribution for central pathways in B. infantis during lactose utilization, the simplest of the substrates evaluated.Lactose is predicted to be imported by a lactose permease and hydrolyzed by a β-galactosidase.The fluxes of these reactions and galactokinase showed medium flux values (Fig. 3B).Galactose and glucose are derived into central metabolism by feeder pathways such as the Leloir pathway, which includes gene products such as UDPG4R (UDP-glucose-4-epimerase), GALU (UTP-glucose-1-phosphate uridylyltransferase), and UGLT (UDPglucose-hexose-1phosphate uridylyltransferase).The fluxes of these intermediate reactions were predicted to be high (Fig. 3B).
Metabolic analysis indicates that 3′FL and 6′SL show an even flux distribution between fucose/sialic acid and lactose metabolism pathways (Fig. 3A and C).In these substrates, lactose metabolism fluxes were lower (green) than lactose as the sole carbon source (yellow).Flux distribution captured well the production of 1,2-PD and pyruvate in 3′FL, which was not produced in any other substrate (Fig. 3A), and how sialic acid generates glucosamine and pyruvate (Fig. 3C).Finally, LNnT overall metabolic fluxes were similar to 6′SL, probably due to NeuAc metabolism overlapping with GlcNAc (Fig. 3D).
Some reactions were active in the TCA cycle, but with a low flux, producing formate or ethanol (Fig. 3).Albeit at different levels, the models reflected well the production of lactate and acetate.The four models displayed low fluxes of formate and ethanol (Fig. 3), associated with pyruvate-formate lyase (Blon_1714 and Blon_1715) and alcohol dehydrogenase (Blon_2241), also showing low but positive fluxes.The models did not predict succinate fluxes, contrasting with the low but positive gene expression of enzymes such as succinyl-CoA synthetase (Fig. 2 and 3).
Finally, the metabolic models indicate the coupling of two reactions catalyzed by GF6PTA (glutamine-fructose-6-phosphate transaminase; Blon_2018) and G6PDA (glucosamine-6-phosphate deaminase; Blon_0881).The net result of their action is the deamination of glutamine for NH 4 + release (Supplemental Data), which might indicate the models couple these reactions in the absence of ammonia.The expression of Blon_0881 was high in 6′SL and LNnT but not lactose or 3′FL, which correlates with this enzyme-generating glucosamine.Blon_2018 showed a low gene expression across all four conditions.However, metabolic modeling estimated high fluxes for these two reactions, which might be an artifact due to ammonia limitation.
The metabolic activity of almost all bifid shunt and lower glycolysis enzymes correlated with gene expression data (Fig. 4A).To assess this effect, genes expressed consistently across all conditions and their corresponding reactions were mapped, and their linear correlation coefficients were analyzed.When the absolute correlations distribution for each consistently expressed gene and the corresponding reaction was assessed, there was only a slight enrichment in the mean absolute correlation of central carbon reactions and associated genes (0.61 vs 0.56, Fig. 4B), albeit this was not statistically significant with a high confidence level (P-value = 0.360 > 0.05, Wilcoxon rank sum test).Overall, fluxes correlated well with gene expression, particularly in central carbon metabolism, supporting the integration of this information for gaining insights into the active metabolic pathways under each condition.

Gene essentially simulations and analysis
Genes are predicted to be essential when their deletion results in an in silico growth rate of zero.The accuracy of these calculations in less studied microorganisms ranges typically between 60% and 80% (80).While there are no experimental gene lethality studies of B. infantis in the literature, this analysis could provide helpful information for better understanding metabolic limitations.Of the 578 genes of iLR578, 81 (14%) were predicted to be essential and common to all carbon sources (Table S3).As expected, genes related to critical cellular functions such as replication, translation, cargo (demand and transport reactions, 38.3%), amino acid, nucleotide, nitrogen metabolism (19.8%), and vitamins metabolism (6.2%) were most predicted to be essential (Table S5).Regardless of the carbon source, 397 genes (68.7%) did not affect biomass growth.Most of these were related to transport and demand reactions (approximately 11%).Some genes were predicted to be nonlethal, albeit they caused a reduction in the computed growth rate.There were 76 genes of this type across the three HMOs studied (Supplemental Data).Blon_2152 (phosphoglycerate mutase) was previously predicted as an essential gene for B. infantis (7), and using iLR578, an approx.51% reduction in growth was indicated by a single knockout.Genes predicted to be nonlethal with reduced growth rates are related to carbon source-specific pathways, ATP production, bifid shunt enzymes, and biomass-demanding metabolites (Supplemental Data).

Identification of metabolic clusters involved in SCFAs production under different conditions
To better understand the release of major metabolites by B. infantis under different growth conditions, we analyzed how exchange metabolites are coupled to metabolic reactions in each context-specific model (Fig. 5).For this analysis, we again considered and imposed a 3.5:2 production ratio of acetate to lactate.The selected exchange reactions for network construction were lactate, acetate, succinate, formate, ethanol, and 1,2-PD.Connected networks of both partially and fully coupled reactions were identified (Fig. 5).Fully coupled reaction pairs are those where the increase/decrease in the flux value of one reaction causes a fixed increase/decrease in the flux of the other and vice versa (81).In the case of partially coupled reaction pairs, the extent of the increase/ decrease exerted by one reaction on the other is variable and not necessarily the same in the opposite direction (81).
The network topology for 6′SL and LNnT was identical (Fig. 5).We found that the network for 3′FL was only composed of fully coupled reactions, whereas lactose, LNnT, and 6′SL had both fully and partially coupled reactions (Fig. 5).The network with the highest number of nodes was lactose, far more than the other HMOs (Table Absolute correlation distribution for each consistently expressed gene and the corresponding reaction.While there is a slight increase in the mean absolute correlation of central carbon reactions and genes (0.61) vs all reactions and expressed genes (0.56), albeit this was not statistically significant (P-value = 0.360 > 0.05, Wilcoxon rank sum test).5).Interestingly, the 3′FL network displayed the highest clustering coefficient (0.45), whereas the lactose network showed the lowest (0.29) despite having similar network sizes (Table 5).This result suggests a more coordinated-and possibly regulatednetwork underpinning metabolite production under LNnT than lactose.The latter was confirmed by the increased network connections (edges).Lastly, we also evaluated the effect of the acetate-to-lactate ratio on the reaction coupling and generated the reaction networks without this constraint.The resulting networks were almost identical in size and contained fewer edges as expected due to the lower number of coupled reactions (Fig. S2).Of the four networks, three maintained the number of nodes but reduced the number of edges (Fig. S2).The only exception was the LNnT network, which reduced its number of edges from 22 to 17 and now shares the same topology as the 6′SL network (Fig. S2).Notably, while the lactose network did not show a decrease in the  number of nodes, its connectivity decreased substantially from 76 to 35, rendering a more disconnected network (Fig. S2).The networks remained almost identical in size and displayed similar topologies-except for lactose-suggesting the presence of specific network motifs under each condition regardless of the forcing ratio constraint.In all identified networks, their degrees followed a power law (scale-free) distribu tion, which provides a robust network structure against random failure/perturbations (82).Lactate, acetate, ethanol, and formate are usually clustered in the same network with their corresponding transporters (Fig. 5).Interestingly, succinate exchange usually formed a separate network from other metabolites (Fig. 5).
A global network coupling metabolites and fluxes across all four substrates is shown in Fig. 6.The Dynet analysis showed the most rewired nodes considering all four networks in Fig. 4. 1,2-PD and succinate clustered separately and were coupled to different reactions compared to lactate, acetate, formate, and ethanol (black nodes; Fig. 6).White nodes indicate reactions always coupled with these metabolites; light red nodes are reactions with a high degree of coupling.For example, lactate, acetate, formate, and ethanol appeared to be highly coupled to an alcohol dehydrogenase (ALCD2x), as well as to several reactions related to adenine and purine metabolism [ADSL, AICART, ADPT, ADSS, and EX_ade(e)].As expected, reactions coupled to 1,2-PD were related to fucose metabolism, especially lactaldehyde reductase (LCARS).Finally, succinate was associated strongly with reactions of arginine biosynthesis, sulfate and sulfite metabolism, and succinate dehydrogenase.

DISCUSSION
GSMMs are robust mathematical structures representing microbial metabolism, useful for understanding and optimizing metabolic pathways and microbial interactions (76).In this study, a functional GSMM of B. infantis was built using available transcriptomic and bibliomic information to understand HMOs utilization by this microbe.This general reconstruction served as the basis for the generation of four context-specific models describing growth on different HMOs: lactose, LNnT, 3′FL, and 6′SL.This model advan ces current models of Bifidobacterium, especially in the context of HMO utilization.A comprehensive literature search with multiple metabolites and reactions of carbo hydrate consumption being used for this microorganism and the inclusion of global gene expression data helped provide a more precise and accurate representation of the consumption of HMOs by B. infantis.
One of the most challenging tasks for refining this model was the lack of experimental information for cellular composition, growth capabilities, and metabolite production under different media for validation.Most experimental data available for B. infantis are not readily applicable for model validation.GSMMs can simulate a phenotype in a specific environment.For a complex medium, approximations are employed, and care must be taken when interpreting simulation results.This adds a layer of uncer tainty as the simulated production cannot be directly compared with experimental data if the simulation is deprived of some essential compound (83).Improvements in model simulations can be attained by adjusting the biomass reaction stoichiometry to experimental data such that growth and production/consumption capabilities match observations.In the case of this study, the best-performing biomass reaction was the AGORA v1.03 reaction (38), albeit the available data for testing was very limited (see Results).Condition-specific experimental data could help improve the current model's quality, highlighting gaps in our understanding of B. infantis metabolism.
All Bifidobacterium produce lactate and acetate as part of their bifid shunt, with varying ratios depending on the substrate (6).However, their production is not well captured by metabolic models and remains an issue to improve in GSMMs.Here, the expected ratio of acetate to lactate was fixed, a solution that forces the model to produce lactate.This issue has been addressed differently in GSMMs of lactic acid bacteria that do not accurately represent lactate production (7,33).
Compared to other reconstructions of Bifidobacterium, Devika et al. (7) used the AGORA model for B. infantis and added reactions for sialic acid and N-acetylglucosa mine utilization.El-Semman et al. (37) presented a reconstruction for Bifidobacterium adolescentis with 454 genes and 699 reactions.Recently, Schöpping et al. (12) presented specific biomass stoichiometric equations for Bifidobacterium animalis and B. longum.The latter unfortunately did not produce satisfactory results for predicting biomass growth and yields in this study.Importantly, iLR578 stands as the most comprehensive genome-scale metabolic reconstruction of B. infantis reported to date, encompassing an increased number of genes, GPRs relations, and most importantly, metabolic capabilities.
Metabolites produced by Bifidobacterium are in the front line of the protective effect for the host attributed to this genus.Modifying their production might result in increased health benefits.Fucose and FL metabolism resulted in the most diver gent metabolic profiles across three HMOs, strongly coupled to 1,2-PD production.The models resulting from this work predict formate being produced across all four substrates.However, experimental data suggest that formate in the infant gut derives mainly from FL and fucose metabolism (2).Finally, molecules produced at low quantities are complex to model, but it is well known that some might have a powerful effect on the host (84).
Lactate, acetate, formate, and ethanol production in B. infantis appeared intercon nected, strongly coupling to adenine metabolism (Fig. 6).Adenine is a building block in riboflavin biosynthesis-an essential vitamin produced by B. infantis-suggesting HMO metabolism results in the constitutive production of riboflavin.Adenine is also related to purine metabolism, specifically DNA/RNA synthesis, which may also suggest a more direct and intuitive link to biomass biosynthesis and cellular growth.A cou pling of succinate production in B. infantis to arginine biosynthesis was expected (6), but a connection between succinate and sulfate/sulfite metabolism deserves further investigation.

Conclusions
Bifidobacterium species are essential members of the gut microbiome.Their saccharolytic lifestyle enables them to thrive on different carbon sources.Bifidobacterium infantis is one of the most studied microorganisms for its ability to utilize different HMOs because of their genomic and functional adaptations, including ABC transporters and saccharo lytic enzymes.Modeling these adaptations using GSMMs provides an opportunity for a better understanding of the physiological features of B. infantis upon utilization of different HMOs, ultimately supporting the design of probiotics.In this study, we built more comprehensive and refined GSMMs of B. infantis metabolism, integrating contextspecific transcriptomic data under three different HMOs and lactose.This reconstruc tion (iLR578) recapitulated several differences in HMOs metabolism depending on the functional characteristics of the molecules.The production of major metabolites like acetate and lactate was common to all substrates, with a small production of succinate, formate, and ethanol.Differential production of compounds like 1,2-PD was observed in the fucosylated substrate.Finally, coupled reaction networks underpinning short-chain fatty acids production were identified under each condition, providing insights into their robustness and possible regulation.Overall, iLR578 constitutes a valuable platform for the scientific community to simulate B. infantis metabolism on different HMOs, which can be further expanded and improved as more experimental data become available.

FIG 4
FIG 4 Correlation analysis between gene expression and metabolic fluxes in B. infantis utilizing different substrates.(A) Correlation between genes expressed consistently across all conditions and their corresponding reactions.In red, the flux and corresponding expression of central carbon metabolism genes expressed in all conditions are shown (n = 37), whereas in blue, the flux and corresponding expression of all genes expressed in all conditions are shown (n = 201).(B)

FIG 5
FIG 5 Coupled reaction networks for each context-specific model derived from FCA.For each carbon source, major fermentation metabolites, namely, lactate, acetate, succinate, formate, ethanol, and 1,2-PD (red rectangles), are depicted linked to reactions coupled to them in each condition (colored ovals).Panels describe network generated under different carbon sources, namely, (A) 3´′FL, (B) 6′SL-LNnT (identical networks), and (C) lactose.The legend on the right indicates the type of reaction coupling (partial or full-dashed or solid line, respectively), edge betweenness (line thickness), and reactions subsystems (color coded).Reaction nomenclature can be found in the Supplemental Data.

FIG 6
FIG6 Comparison of network rewiring for building a consensus coupled reaction network.Consensus networks were constructed from all four context-specific networks using the DyNet tool.The network comprises 53 nodes and 131 edges.Black nodes represent major metabolite exchange reactions that were subject to the analysis, whereas white nodes describe reactions present in all context-specific networks.The red-shaded nodes are only present in some of the networks.Reaction nomenclature can be found in the Supplemental Data.

TABLE 1
Reaction content comparison for different reconstructions

Carbon source Experimental yield a In silico yield prediction from iLR578 In silico yield prediction from contextualized iLR578 models
a Data taken from Garrido et al. (16), Garrido et al. (77), Garrido et al.

TABLE 3
Yield of major metabolites (g⋅gDW −1 ) predicted by iLR578 assuming biomass maximization under different carbon sources a (78)idence for metabolite production taken from Garrido et al.(10), Van der Meulen et al.(11), and Cheng et al.(78).Yields were calculated assuming a ratio of 3.5:2 acetate to lactate production ratio as in Table2.b Production was observed experimentally.

TABLE 4
Performance comparison of GSMMs of B. infantis a a Data available in Table

TABLE 5
Topological properties of reaction networks found with FCA a Reaction networks for 6SL and LNnT were identical.