Identifying functional metabolic shifts in heart failure with the integration of omics data and a cardiomyocyte-specific, genome-scale model

The heart is a metabolic omnivore, known to consume many different carbon substrates in order to maintain function. In diseased states, the heart’s metabolism can shift between different carbon substrates; however, there is some disagreement in the field as to the metabolic shifts seen in end-stage heart failure and whether all heart failure converges to a common metabolic phenotype. Here, we present a new, validated cardiomyocyte-specific GEnome-scale metabolic Network REconstruction (GENRE), iCardio, and use the model to identify common shifts in metabolic functions across heart failure omics datasets. We demonstrate the utility of iCardio in interpreting heart failure gene expression data by identifying Tasks Inferred from Differential Expression (TIDEs) which represent metabolic functions associated with changes in gene expression. We identify decreased NO and Neu5Ac synthesis as common metabolic markers of heart failure across datasets. Further, we highlight the differences in metabolic functions seen across studies, further highlighting the complexity of heart failure. The methods presented for constructing a tissue-specific model and identifying TIDEs can be extended to multiple tissue and diseases of interest.


Introduction
In order for the heart to maintain its function, the main contractile unit (cardiomyocyte) reactions, were not detected in v18 of the HPA tissue data but two of the NOS isoforms, NOS1  (Blais et al, 2017) with additional metabolic tasks that were curated from the literature based on cardiac metabolism. These tasks were used to test if the model could perform functions known to be present in the heart while removing metabolic functions not present in the heart. After curation, the final iCardio model has a 100% cardiomyocyte-specific task accuracy.

Figure 2. Validating iCardio (A) descriptively using the number of reactions covered my metabolic tasks and (B) quantitatively using ATP yields for common carbon substrates. (A)
A descriptive validation of iCardio using the number of reactions covered using the different metabolic tasks. Together, the tasks account for almost half of the reactions in iCardio. The remaining reactions represent areas for future improvement of metabolic tasks. (B) A qualitative validation of iCardio using ATP yields for a variety of common carbon sources. The ATP yields for iCardio (y-axis) are compared to another recently published, but smaller, metabolic model, MitoCore (Smith et al., 2017), which contains 324 reactions. The agreement between the models demonstrates the lack of energy generating cycles and infinite loops in iCardio.
although the tasks may cover the same reactions, they cover different combinations of reactions 144 and pathways for each task. The two sets of tasks together may cover 1,700 reactions but in total 145 over 20,000 reactions are used in different combinations to complete the tasks, indicating that a 146 number of reactions are repeated between tasks. This result is to be expected, especially for 147 reactions that involve central carbon metabolism and ATP production, such as ATP synthase. 148 The maximum number of reactions covered by a given task was 788 reactions (1 task), the 149 metabolic task that describes the de novo synthesis of lipids from glucose and essential fatty 150 acids, and the minimum number of reactions covered by a task was one reaction (24 tasks), 151 metabolic tasks that describe transport. Overall, the reaction coverage of tasks demonstrates a 152 qualitative validation of iCardio. 153 To provide a quantitative validation of iCardio, ATP yields were predicted for a number Metabolic models provide an alternative approach for interpreting changes in gene 171 expression to yield insight into metabolic shifts that may be contributing to a diseased state.

172
Here, we use iCardio to identify metabolic tasks that were significantly associated with 173 differentially expressed genes, called Tasks Inferred from Differential Expression (TIDEs), 174 which represent metabolic functions that are shifted in an experimental versus control state 175 ( Figure 3). Heart failure is a complex disease, both in etiology and presentation, but heart failure 176 is associated with a shift in metabolism (Wende et al., 2017). Several different metabolic shifts 177 have been noted in heart failure, such as decreased fatty acid utilization (Lopaschuk, 2017; 178 Figure 3. Identifying functional metabolic changes from gene expression data using metabolic tasks. Metabolic tasks quantitatively describe metabolic functions that a tissue or organism is known to catalyze. iCardio was used to identify the reactions that are utilized in order to perform each of the previously defined metabolic tasks. Next, each reaction is assigned a reaction weight based on gene expression fold changes. A task score is calculated based on the weights for each reaction and represents an average gene expression value over reactions utilized in that metabolic task. In order to determine the statistical significance of this task score, the original gene expression data is shuffled over all genes with data and task scores are recalculated to give a distribution of task scores. A p-value is assigned based on where the original task score falls in the distribution of randomized data. utilization of branched chain amino acids (Sun et al., 2016). However, there are still 180 disagreements in the field. For example, one study noted an increase in fatty acid uptake in heart  To do this, we analyzed gene expression data from patients undergoing heart transplants 186 for either advanced ischemic or idiopathic heart failure compared to healthy hearts. We 187 integrated these differentially expressed genes (DEGs) and their associated log fold changes with 188 iCardio to determine TIDEs, representing metabolic functions associated with a significant 189 change in gene expression. When compared to healthy hearts, the ischemic hearts from GSE5406 190 had 2678 DEGs; 392 of these DEGs were present in iCardio (Supplemental Table 3). After  Next, we expanded the TIDE analysis to the two remaining studies (Supplemental Table  across the datasets. First, nitric oxide synthesis from arginine is decreased in 5 of the 6 datasets 218 (excluding the ischemic heart failure data from GSE1869). Synthesis of long-chain, unsaturated 219 fatty acids was decreased in 4 of the 6 datasets (excluding both ischemic and idiopathic heart 220 failure data from GSE1869). De novo synthesis of Neu5Ac was decreased in 4 of the 6 datasets 221 (excluding the ischemic and idiopathic heart failure data from GSE57345). The breakdown of 222 valine to succinyl-CoA was increased in 2 of the 6 datasets (the idiopathic heart failure data from 223 GSE1869 and GSE5406) and the breakdown of threonine to a Krebs cycle intermediate was 224 increased in 2 of the 6 datasets (both ischemic and idiopathic heart failure in GSE57345).

Figure 5. TIDEs analysis identifies functional metabolic changes across different datasets (A) compared to results from a traditional GSEA analysis (B). (A) Three datasets that contained samples for ischemic, idiopathic, and healthy hearts were downloaded, analyzed, and integrated with iCardio using the TIDEs pipeline to identify shifts in metabolic functions. For each of the three datasets, both ischemic and idiopathic heart failure are shown. TIDEs cluster by study rather than type of heart failure. (B) The same three datasets were used with a traditional GSEA analysis with gene sets defined by KEGG pathways. The KEGG metabolic pathways are shown here and grouped similar to the TIDE groupings shown in (A).
ischemic and idiopathic heart failure data from GSE5406). Across the different datasets, the 228 previously reported metabolic signatures of heart failure appear. However, they tend to appear 229 within a study rather than across all studies or types of heart failure. Finally, rather than 230 clustering by type of heart failure (ischemic vs idiopathic), the TIDEs results cluster by the 231 study.  model curation demonstrated that the CORDA method produced a draft model that was more 288 accurate with respect to performance of cardiomyocyte-specific metabolic tasks than other 289 integration methods (Supplemental Table 2). It is interesting to note that the CORDA algorithm 290 produced one of the smaller draft models while still maintaining the highest task accuracy. As  The metabolic tasks used for benchmarking the integration algorithms and further manual 301 curation cover 41% of the network while also identifying areas for future development of both 302 general and tissue-specific metabolic tasks. Here, metabolic tasks were formulated agnostic to 303 the gene expression data that was integrated with the resulting model. Additional tasks could 304 serve as hypotheses for changes in metabolic functions of the heart that could then be tested with 305 specific gene expression data sets. Finally, although not every reaction is covered by a metabolic 306 task, these reactions represent metabolic functions with either evidence of protein expression in 307 the HPA or pathway connectivity from the CORDA integration algorithm. These reactions and hypotheses for important, tissue-specific metabolic functions.

310
The pFBA approach assumes that the pathway for each metabolic task remains the same, 311 independent of the data. This reaction-centric TIDE approach for determining significantly  The code generated in this study is available at https://github.com/csbl/iCardio. specific models. For GIMME, ATP production was used as the objective function. Data was integrated from the HPA, which contains tissue-specific protein expression where each protein is 446 assigned either a high, medium, low, or no expression based on data from antibody-based 447 immunohistochemistry (Uhlén et al., 2015). Code for implementing each algorithm is available 448 at https://github.com/csbl/iCardio. The CORDA algorithm was chosen from among the 449 algorithms given its accuracy for the pre-defined list of cardiomyocyte-specific metabolic tasks 450 (Supplemental File 1, Supplemental Table 2).

451
The CORDA algorithm takes as an input user-defined high, medium, and negative 452 confidence reactions to produce a model that is (1)   Analyzing transcriptomics data from heart failure patients 475 Microarray data (Supplemental Table 3   The Edinburgh human metabolic network reconstruction and its functional analysis. Mol. Syst.