Identification of shared fatty acid metabolism related signatures in dilated cardiomyopathy and myocardial infarction

Aim It is to be elucidated the risk-predictive role of differentially expressed fatty acid metabolism related genes (DE-FRGs) in dilated cardiomyopathy (DCM) and myocardial infarction. Materials & methods Four gene enrichment analyses defined DE-FRGs’ biological functions and pathways. Three strategies were applied to identify risk biomarkers and construct a nomogram. The 4-DE-FRG correlation with immune cell infiltration, drugs, and ceRNA was explored. Results DE-FRGs were enriched in lipid metabolism. A risk nomogram was established by ACSL1, ALDH2, CYP27A1 and PPARA, demonstrating a good ability for DCM and myocardial infarction prediction. PPARA was positively correlated with adaptive immunocytes. Thirty-five drugs are candidate therapeutic targets. Conclusion A nomogram and new biological targets for early diagnosis and treatment of DCM and myocardial infarction were provided.


Risk model nomogram
Early diagnostic methods lndividualized treatment Dilated cardiomyopathy (DCM) and myocardial infarction (MI) are highly fatal cardiovascular diseases that pose significant public health threats globally [1,2]. DCM is characterized by heart muscle dilation and dysfunction, commonly ending in heart failure (HF) [1][2][3]; MI is caused by a sudden blockage of blood flow to the heart, leading to tissue damage and HF failure [4]. Nevertheless, current diagnostic tools that fail to predict DCM before changes in left ventricular ejection fraction [5,6] and MI before its sudden onset [4,7], are inadequate in detecting presymptomatic patients.
DCM and MI are inextricably linked. They share a few risk factors in common, including infections, inflammation and family history [8,9]. Moreover, overexpression of MI-associated transcripts has been observed in patients with inflammatory DCM [10]. Considering these observations, characterizing genetic profiles to construct a shared risk prediction model may hold promising implications for early intervention improvement.
Fatty acid, as the primary energy source for heart muscle [11], may substantially contribute to the pathogenesis of DCM and MI. Decreased expression of PPARα (fatty acid metabolism related protein) caused by doxorubicin may induce DCM [12]. Administration of eicosatetraenoic acid improved cardiac metabolism, attenuated histological remodeling and resulted in functional improvements after MI [13]. Identifying novel predictive genes in this pathway may provide valuable insights into disease pathogenesis and innovative therapeutic strategies development.
The present study aims at constructing a shared risk prediction model for DCM and MI related to fatty acid metabolism. The association between risk signatures and infiltrating immune cells, drugs and competing endogenous RNA (ceRNA) was also analyzed, which may assist in developing personalized preventive strategies to reduce the burden of cardiovascular disease. MI: Myocardial infarction.

Materials & methods
Data collection & data processing First, GSE21610, GSE66360, GSE120895 and GSE48060 were downloaded as candidate datasets from the Gene Expression Omnibus (GEO) database (www.ncbi.nlm.nih.gov/geo/). The basic information of retrieved datasets is shown in Table 1. Additional clinical information for the dataset can be found in Supplementary Table 1. The raw data were normalized with quantile normalization and transformed into a base two logarithmic scale using the R package 'LIMMA.' Subsequently, a principal component analysis was performed to verify their quality [14]. Then, we combined GSE21610 and GSE66360 as the train set and GSE120895 and GSE48060 as the test set. The R package 'sva' [15] removed the merged dataset's batch effects, and principal component analysis verified the quality (Supplementary Figure 1).

Differential expression analysis
Differentially expressed genes (DEGs) between control (healthy samples) and disease (DCM or MI samples) group in the training set were identified by R packages 'LIMMA'. DEGs with the p-value < 0.05 and |log2 FC| >1 were statistically significant. A total of 210 FRGs with a Relevance score >32 were obtained from the Genecards database. The intersection of obtained FRGs and DEGs yielded a set of DEGs associated with fatty acid metabolism (DE-FRG), visualized by R packages 'LIMMA', 'pheatmap' and 'ggpubr' [14,20,21].

Gene enrichment analysis
Gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Set Enrichment Analysis (GSEA) were conducted with the use of the R package 'clusterProfiler' [22] to explore the function and pathway of the DE-FRGs. Disease ontology (DO) enrichment analysis was performed with the R package 'DOSE'. p < 0.05 was considered to show the statistical significance.
Construction & validation of the risk nomogram Three algorithms were applied to the DE-FRGs for screening risk biomarkers for DCM and MI, including the random forests (RF) [23,24], the least absolute shrinkage and selection operator (LASSO) logistic regression [25] and the support vector machine recursive feature elimination (SVM-RFE) [26]. The RF algorithm was adopted with the R package 'randomForest.' The LASSO logistic regression investigation was conducted with R package 'glmnet,' and minimal lambda was considered optimal. Later, three screening results were intersected to obtain risk biomarkers, visualized with a nomogram. The test set was used for in-depth testing of the efficacy of key biomarkers. The receiver operating characteristic curve (ROC), calibration curve, clinical impact curve and decision curve were used to judge the predictive ability of the nomogram. The train set and the test set were assessed based on the investigation of ROC curves, and the area under the curve (AUC) was calculated to evaluate the predictive effect achieved by the algorithms. These steps were achieved via R package 'dplyr' [27], 'timeROC' [28], 'rms' [29] and 'ROCR' [30]. Following this, the calibration curve (relationship between observation probability and prediction probability) and clinical impact curve were performed to evaluate the degree of consistency between observed and predicted outcomes. The decision curve analysis examined the theoretical relationship between patients' threshold death or relapse probability and the relative value of falsepositive and false-negative results to evaluate the nomogram benefit further [31]. These steps were achieved via R package 'Hmisc' [32], 'lattice' [33] and 'Formula' [34].

Evaluation & correlation analysis of infiltration related immune cells
The CIBERSORT website filters 28 kinds of the immune cell matrix. According to p < 0.05, the immune cell infiltration matrix was obtained. The present study adopted the R package 'corrplot' for drawing the correlation heatmap for visualizing the correlation of 28 kinds of infiltrating immune cells. The 'ggstatsplot' and 'ggplot2' packages were adopted to analyze the Spearman relationship between characteristic risk markers and immune infiltrating cells and visualize the result.

Construction of ceRNA network
A ceRNA network was built based on three public websites, TargetScan, miRDB, and miranda, to determine whether the DE-FRGs exist in competing endogenous regulating networks mediated by long non-coding RNAs (lncRNAs) and micro RNAs (miRNAs). Visualization of the ceRNA was performed by Cytoscape v3.9.1 (https: //cytoscape.org/). Data analysis R v4.1.2 (R Foundation for Statistical Computing, Vienna, Austria, www.r-project.org/) and RStudio v1.1.463 (Integrated Development for R. RStudio, Inc., MA, USA, www.rstudio.com/) were operated to perform data analysis. Significance evaluation was fulfilled by t-test for distributed data and by Wilcoxon Mann-Whitney test for other data. The Spearman correlation coefficient was calculated to identify the associations between the DE-FRGs and 28 infiltrating immune cells. A multivariate logistic regression model was established to estimate the hazards ratio along with 95% CI. Analysis with a p-value is statistically significant when it is <0.05 and considered statistically highly significant if it is <0.01.

Research flow & the collection of DE-FRGs
After downloading the dataset from GEO, quality control and normalization of the gene expression matrix were performed. Subsequently, differential analysis was executed for DEGs, intersecting with FRGs to generate 20 DE-FRGs. Among them, four were identified as risk biomarkers by the SVM-RFE algorithm, 12 by the LASSO algorithm, and 10 by the RF algorithm, respectively. After the intersection of these genes, four DE-FRGs were finally selected for nomogram. We then conducted KEGG, GO and DO analysis and established a proteinprotein Interaction (PPI) network of the 20 DE-FRGs. Immuno-infiltration analysis, ceRNA regulatory network construction and gene-drug interaction network construction were undertaken based on the 4-DE-FRG ( Figure 1). Differential gene expression analysis was performed on the control and disease groups of the train set, and a total of 20 DE-FRGs were obtained. In the disease groups, the PPARA, CD36, ELOVL4, PTGS2, ACSL1, MMAB, CSF2RA, CYP27A1, RXRA, ABHD5 and ALDH2 were downregulated and GAA, ELOVL6, TNF, SLC7A7, ASAH1, ABCA1, CRABP2, IL1B and TLR4 were upregulated (Figure 2A & B).

Construction & validation of the risk nomogram based on DE-FRGs
Among the 20 DE-FRGs, ten key biomarkers were identified by the RF algorithm with an importance score greater than 3 ( Figure 3A & B). Twelve genes were identified as important biomarkers with LASSO logistic regression algorithm ( Figure 3C & D). Later, four genes were identified as key biomarkers by the SVM-RFE algorithm ( Figure 3E & F). Following the intersection, four genes (ACSL1, ALDH2, CYP27A1 and PPARA) were obtained for subsequent analysis.
A risk model based on the 4-DE-FRG (ACSL1, ALDH2, CYP27A1 and PPARA) was constructed, and a nomogram ( Figure 4A) was drawn for clinical ease of use. According to the measurements of the 4 DE-FRGs expression levels in the patients blood, the user can find them on the corresponding scale in the nomogram and project them onto the top scale to read the points for each variant. The sum of each point is the total number of points, and the risk of disease for this patient can be obtained by projecting the total points downwards. The AUC of the train set was 0.936 ( Figure 4B), indicating that the risk nomogram had an excellent accuracy of predictive value. It was also confirmed by the test set (AUC = 0.737) ( Figure 4C). The calibration curves of the nomogram showed a good agreement in the train set ( Figure 4D). In addition, the clinical impact curves and the decision curve were proposed to assess the clinical usefulness of the risk prediction nomogram ( Figure 4E & F).

Enrichment analysis of DE-FRGs in DCM & MI
GO analysis revealed that DE-FRGs were mainly enriched in biological processes related to lipid metabolisms and small-molecule transport, such as fatty acid metabolic process, external side of plasma membrane and fatty acid synthase activity ( Figure 5A). The enriched GO terms for individual 4-DE-FRG were shown separately in Figure 5B-E.
Similarly, KEGG analysis showed that DE-FRGs were mainly enriched in lipid metabolism pathways, such as adipocytokine signaling pathway, PPAR signaling pathway and lipid and atherosclerosis (Figure 6 A). The enriched KEGG terms for individual 4-DE-FRG were shown separately in Figure 6B-E. These enrichments in DCM or MI suggested that their inhibitors could be considered potential therapeutic strategies for patients with DCM or MI.

Analysis of immune infiltration
Seventeen immune cells significantly differed between MI or DCM patients and healthy individuals ( Figure 8A). 10.2144/fsoa-2023-0008 Future Sci. OA (2023) FSO847 future science group Among them, activated B cell, activated CD8 + T cell, effector memory CD4 + T cell, memory B cell and central memory CD4 + T cell were lowly expressed in the disease group. PPARA was mainly positively correlated with adaptive immune cells and negatively correlated with innate immune cells, while ACSL1, ALDH2 and CYP27A1 were the opposite. ( Figure 8B).

Construction of regulatory gene-drug network
A gene-drug network was created to provide more detailed information to illustrate the complex associations between drugs and the 4-DE-FRG targets. According to Figure 9 and Supplementary Table 2, multiple drugs have agonist roles in PPARA.

Construction of ceRNA network
To better explain the interactions of multiple RNA types at the gene level, three public websites, TargetScan, miRDB and miranda, were used to predict potential combinations between 4-DE-FRG and transcription factors. As shown in Figure 10, all edges contained 412 nodes, 150 positively correlated miRNA-mRNA edges and 262 positively correlated miRNA-lncRNA edges.

Discussion
The lack of effective diagnostic methods for DCM and MI presymptomatic patients makes it important to develop a stable risk model. Our study constructed a risk prediction nomogram for DCM and MI based on the 4-DE-FRG (ACSL1, ALDH2, CYP27A1 and PPARA), showing good performance in external validation (AUC = 0.737). Notably, all the four genes were upregulated in disease groups. Later pathway enrichment analysis uncovered that the 4 DE-FRGs are associated with lipid metabolism and small molecule transport pathways. In addition, the 4-DE-FRG was highly correlated with infiltrating immune and 35 drug candidates in DCM and MI. The heart is a highly energy demanding organ with fatty acids as its leading energy supplier. Davila et al. found that cardiac fatty acid metabolism was decreased in DCM patients [36]. Malonic acid can promote cardiomyocyte proliferation and heart regeneration in MI mice [37]. Moreover, short-chain fatty acids are more easily oxidized and utilized in failing hearts [38], representing a promising metabolic target.
PPARA regulates lipid metabolism, making it the core of atherosclerotic plaque formation [39]. PPARA is protective against DCM and MI [40,41]. The PPARA agonist fenofibrate attenuates left ventricular diastolic abnormalities and systolic dysfunction to delay HF's progression and improve survival in HF animal models [40,42]. That is consistent with our belief that PPARA is protective. Exploring the mechanisms of PPARA in DCM and MI may be a breakthrough.
Seventeen immune cells significantly differed between MI or DCM patients and healthy individuals. Neutrophils may be a risk marker for the severity of HF in DCM patients [43], and their elevation is related to poor prognosis in MI [44]. Mast cells are influential in ventricular remodeling and HF progression in DCM mice [43], and their stabilizers reduce ischemia/reperfusion injury in MI [45,46]. Blockade of NK cells and their receptors reduces inflammation and injury in DCM [47][48][49][50]; similarly, NK cells increase inflammation in the infarct zone during MI. The function of adaptive immune cells during MI, such as CD4 + T [51,52], CD8 + T [53,54], and Th 2 [55,56], remains elusive. Few studies have considered how the 4 DE-FRGs affect DCM and MI inflammation, which may be necessary as the essential polyunsaturated fatty acids produce the most potent inflammatory mediators of MI [57,58]. According to our results and the research mentioned above, ACSL1, ALDH2 and CYP27A1 were positively associated with more harmful cells, whereas PPARA was negatively associated with them. That corresponds to our nomogram that ACSL1, ALDH2 and CYP27A1 are risk factors and PPARA is a protective factor. However, uncertainty about the role of immune cells in DCM and MI leaves these speculations to further investigation.
By KEGG and GO analysis, we found that lipid metabolism and small molecule transport-related pathways, such as the fatty acid metabolic process, the external side of the plasma membrane and the fatty acid synthase activity, were enriched in DCM and MI. By constructing a gene-drug regulatory network, we also revealed that thirty agonists and activators were sensitive to PPARA, which might be a promising drug target against DCM and MI. In contrast, the correlation between other genes and existing drugs requires more investigation. Noncoding RNAs have been evaluated as potential diagnostic and therapeutic targets for various cardiovascular diseases [59,60], including DCM [61,62]. Simultaneously, the lncRNA-miRNA-mRNA axis can link lncRNAs to various regulatory functions in MI [63]. Therefore, constructing ceRNA regulatory networks may lead to potential diagnostic and   therapeutic targets. Our result shows that PPARA was regulated by the maximum number of miRNAs among the 4 DE-FRGs, suggesting that it might be a hub node of the diagnosis and treatment for DCM and MI.

Limitations
First, this study is a retrospective study designed based on a public database, and the developed model lacks evidence from in vivo and in vitro experiments. More animal experiments and extensive clinical samples are needed. Second, the public datasets provided limited information on the samples, with missing data on the source of the samples, age, BMI and gender of the sample donors. Additionally, DCM is a complex disease with heterogeneous pathological mechanisms [64]. We did not consider that different etiologies expressed relevant pathways may respond differently to treatment, possibly leading to inaccurate model predictions. In the future, it is of prospective clinical value to experimentally investigate the role of immune cells and ceRNA regulatory networks in the development of DCM or MI.

Conclusion
In this study, we constructed a 4-DE-FRG nomogram and identified potential biological targets for the early diagnosis and treatment of DCM and MI.

Summary points
• A risk model based on the 4-differentially expressed fatty acid metabolism related gene (DE-FRG) (ACSL1, ALDH2, CYP27A1 and PPARA) was constructed. The area under the curve of the train set was 0.936 ( Figure 4B), and that of the test set was 0.737 ( Figure 4C). • PPARA was mainly positively correlated with adaptive immune cells and negatively correlated with innate immune cells, while ACSL1, ALDH2 and CYP27A1 were the opposite. That corresponds to our nomogram that ACSL1, ALDH2 and CYP27A1 are risk factors and PPARA is a protective factor. • Lipid metabolism and small molecule transport-related pathways, such as the fatty acid metabolic process, the external side of the plasma membrane and the fatty acid synthase activity, were enriched in dilated cardiomyopathy and myocardial infarction.