Abnormal expression of TGFBR2 , EGF , LRP10 , and IQGAP1 is involved in the pathogenesis of coronary artery disease

Coronary artery disease (CAD) is the most common cardiovascular disease worldwide. In this study, we investigated the pathogenesis of CAD. We downloaded the GSE98583 dataset, including 12 CAD samples and 6 normal samples, from the Gene Expression Omnibus (GEO) database and screened differentially expressed genes (DEGs) in CAD versus normal samples. Next, we performed functional enrichment analysis, protein-protein interaction (PPI) network, and functional module analyses to explore potential functions and reg-ulatoryfunctionsofidentifiedDEGs. Next,transcriptionfactors(TFs) and microRNAs (miRNAs) targeting DEGs were predicted. In total, 456 DEGs were identified in CAD and normal samples, including 175 upregulated and 281 downregulated genes. These genes were enriched in the intestinal immune network for immunoglobulin A pro-ductionandthemitogen-activatedproteinkinasesignalingpathway (e.g., TGFBR2 and EGF ). The PPI network contained 212 genes, and HIST1H2BJ , HIST1H2AC , EGF ,and EP300 werehubgeneswithdegrees higher than 10. Four significant modules were identified from the PPI network, with genes in the modules mainly enriched in the inflammatory response, protein ubiquitination involved in ubiquitin-dependent protein catabolic processes, protein transport, and mitochondrial translational elongation, respectively. Two TFs (E2F1 and FOXK1) and five miRNAs ( miR-122A , miR-516-5P , miR-507 , miR-342 , and miR-520F ) were predicted to target 112 DEGs. miR-122A reportedly targets both LRP10 and IQGAP1 in the TF-miRNA target regulatory network. The abnormal expression of TGFBR2 , EGF , LRP10 , and IQGAP1 may be implicated in CAD pathogenesis. Our study provides targetsandpotentialregulatorsforinvestigatingCADpathogenesis.


Introduction
Coronary artery disease (CAD), also known as ischemic heart disease, is a group of diseases that includes sudden cardiac death, myocardial infarction, and stable and unstable angina [1]. CAD is the most common cardiovascular disease and is usually characterized by chest discomfort or chest pain [2]. For CAD, primary risk factors include a lack of ex-ercise, smoking, excessive alcohol consumption, high blood pressure, obesity, depression, and poor diet [3,4]. Additionally, genetics is considered a risk factor for developing CAD [5]. In clinical practice, CAD can be diagnosed by employing coronary angiography, electrocardiogram, coronary computed tomographic angiography, and cardiac stress testing [6]. In 2015, 110 million CAD cases were reported, leading to 8.9 million deaths, thus making it the leading cause of disease-related deaths globally [7]. Therefore, elucidating mechanisms that underlie CAD is of considerable importance and significance.
Kalirin (KALRN ) reportedly inhibits the activities of guanine-exchange factor and inducible nitric oxide synthase, which play important roles in the CAD mechanism via the Rho GTPase signaling pathway [8]. Decreased adiponectin and increased interleukin-6 (IL-6) levels promote CAD progression in epicardial adipose tissues [9,10]. Moreover, levels of neuregulin-4 (Nrg4) are found to be inversely related to the development and severity of CAD [11]. Transforming growth factor-β1 (TGF-β1) is involved in the pathogenesis of restenosis, including thrombogenesis and inflammation. In patients with CAD, polymorphisms and TGF-β1 levels are independent risk factors for developing in-stent restenosis after coronary bare-metal stent implantation [12]. MiR-214 is known to inhibit the expression of vascular endothelial growth factor (VEGF), as well as activities of endothelial progenitor cells; therefore, circulating miR-214 could be employed as a novel biomarker and a diagnostic factor for CAD [13]. MiR-34a mediates sirtuin 1 (SIRT1) in endothelial progenitor cells, and atorvastatin reportedly improves endothelial function by promoting SIRT1 expression by suppressing miR-34a [14,15]. Serum levels of miR-126, miR-197, and miR-223 are reportedly increased in patients with CAD, and both miRNA-197 and miRNA-223 can predict cardiovascular death [16]. According to a report by Bai et al. [17], the MEG3-miR-26a-Smad1 regulatory axis can be implicated in regulating the proliferation/apoptosis balance of vascular smooth muscle cells during atherosclerosis. Although these studies have focused on CAD pathogenesis, key genes and miRNAs associated with CAD remain unclear.
Microarray studies of human diseases, including CAD, are limited owing to a lack of human disease tissues or appropriate disease models. Peripheral blood plays a crucial role in mediating immune responses, metabolism, and intercellular communication, as well as affords convenient sample collection; accordingly, it is an ideal tissue for biomarker detection [18,19]. Moreover, gene expression in peripheral blood could reflect CAD severity [20,21]. Additionally, Taurino et al. [22] revealed that analyzing gene expression in whole blood is useful for detecting genes that determine cardiovascular phenotypes, including those implicated in the pathogenesis and progression of CAD. In the present study, we utilized the microarray dataset GSE98583, contributed by Kumar and Kashyap et al. [23]. In the study by Kumar and Kashyap et al. [23], differentially expressed genes (DEGs) of different disease severities were identified, followed by functional enrichment and other analyses to explore candidate genes and pathways contributing to CAD severity. In the present study, we primarily identified DEGs in CAD and control samples, followed by functional enrichment and prediction of transcription factors (TFs) and microRNAs (miR-NAs) regulating these DEGs, to elucidate potential genes and their corresponding regulators involved in CAD pathogenesis. This study will provide deeper insights into the pathogenesis of CAD and provide a theoretical basis for developing targeted therapy.

Ethical approval
In the present study, all datasets were downloaded from public databases, which allowed researchers to download and analyze public datasets for scientific purposes. Therefore, ethical approval was not required.

Data source
We downloaded and used the microarray dataset GSE98583 from the Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) database, which is based on the GPL571 [HG-U133A_2] Affymetrix Human Genome U133A 2.0 Array. The GSE98583 dataset included 12 whole blood samples from non-diabetic male patients with CAD based on their coronary angiogram results. Of the 12 patients, 6 presented single-vessel disease (stenosis >95% in the left anterior descending artery, Gensini score 20-30), and 6 had triple vessel disease (stenosis >95% in all three major epicardial vessels, Gensini score 50-60). Additionally, six whole blood samples from control subjects with atypical angina and normal coronary angiograms were included.

Data preprocessing and differential expression analysis
The original CEL files were downloaded and preprocessed using the R package Oligo (version 1.34.0, http://bioc onductor.org/help/search/index.html?q=oligo/, Johns Hop-kins University, Baltimore, MD, USA.) [24]. Data preprocessing involved data format conversion, filling missing data, background correction, and data standardization. Next, the probes were annotated and combined with platform annotation files. Probes that could not be matched to gene symbols were filtered out. For multiple probes mapped to one gene symbol, the average value of the probes was obtained as the expression value of the corresponding gene symbol. Using the R package Limma (version 3.10.3, http://www.bioconductor.org/packages/releas e/bioc/html/limma.html, Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) [25], we analyzed DEGs between CAD and control samples. Genes with a Pvalue of <0.05 were defined as DEGs.

Enrichment analysis
Based on the Database for Annotation, Visualization and Integrated Discovery (DAVID, version 6.8, https://david. ncifcrf.gov/, Laboratory ofHuman Retrovirology and Immunoinformatic, USA) tool [26], Gene Ontology (GO) terms [27] and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway [28] enrichment analyses were performed for identified DEGs. The number of genes involved in each term was set at ≥2, and a P-value < 0.05 was established as the significant threshold.

PPI network analysis
Combined with the STRING (version 10.0, http://string -db.org/) database [29], PPI pairs were used to predict proteins encoded by identified DEGs. The PPI score was set at 0.7. Next, a PPI network was constructed for DEGs using the Cytoscape software (version 3.2.0, http://www.cytoscape.or g, National Institute of General Medical Sciences, USA) [30]. Using the CytoNCA plug-in (version 2.1.6, http://apps.cyt oscape.org/apps/cytonca Central South University, Changsha, China) [31] in Cytoscape with parameter set as: without weight, we performed network topology analysis to identify the hub network nodes. Furthermore, significant modules with a score >5 were selected from the PPI network using the MCODE plug-in (version 1.4.2, http://apps.cytoscape.org/a pps/MCODE, University of Toronto, Canada) [32] in Cytoscape.

TF-miRNA target regulatory network analysis
Using the iRegulon plug-in (version 1.3, http://apps.cytos cape.org/apps/iRegulon, Laboratory of Computational Biology, KU Leuven, Belgium) [33] in Cytoscape, we performed a TF-target prediction for the PPI network nodes. The parameters "minimum identity between orthologous genes" and "maximum false discovery rate on motif similarity" were set at 0.05 and 0.001, respectively. Results with a normalized enrichment score (NES) of >4 were selected. Using the We-bGestalt GAST (version: updata 2013, http://www.webgesta lt.org/option.php, Baylor College of Medicine, Houston, TX, USA) tool [34], we predicted target miRNAs for PPI network nodes by employing the overrepresentation enrichment analysis (ORA) method. The least number of enriched genes was set at two, and the top five results are presented. Finally, the results of TF-target prediction and miRNA-target prediction were merged to build the TF-miRNA target regulatory network using Cytoscape [30].

Validation using the Comparative Toxicogenomics Database
The etiology of several chronic diseases is based on interactions between environmental chemicals and genes regulating physiological processes [35]. The Comparative Toxicogenomics Database (CTD, http://ctdbase.org/, NC State University, Raleigh, NC, USA) is a publicly available database for identifying chemical-gene-disease networks [36]. We conducted a CTD search to identify genes and pathways associated with CAD. Next, we performed a Venn analysis to identify overlapping genes and pathways between the CTD database and the microarray dataset GSE98583.

Differential expression analysis
The distribution of expression values after data normalization is shown in Fig. 1. The medians were at the same level, indicating that data preprocessing results were good. According to the screening threshold, a total of 456 DEGs (175 upregulated and 281 downregulated genes) were identified. The clustering heatmap indicated that DEGs could help distinguish samples with different disease statuses (Fig. 2).

Enrichment analysis
Multiple GO functional terms were enriched for upregulated and downregulated genes. For upregulated genes, positive regulation of proteins targeting mitochondria, the glutathione derivative biosynthetic process, and mitochondrial translational termination were the primary functional terms that were enriched (Fig. 3). In contrast, cellular response to mechanical stimulus, centrosome localization, and thymus development were potential functions of downregulated genes (top 20 listed, Fig. 4). Meanwhile, the upregulated genes were implicated in 3 pathways (such as the intestinal immune network for immunoglobulin A (IgA) production, P = 2.10 × 10 −2 ), whereas the downregulated genes were implicated in 15 pathways (such as endocytosis, P = 4.61 × 10 −4 ; mitogen-activated protein kinase [MAPK] signaling pathway, P = 1.22 × 10 −2 ) ( Table 1). In particular, the downregulated transforming growth factor-beta receptor 2 (TGFBR2) and epidermal growth factor (EGF) were enriched in the MAPK signaling pathway.

Validation with CTD database
In the CTD database, a total of 25,384 genes and 149 pathways were found to be associated with CAD. The Venn analysis identified 429 overlapping genes between CAD-associated genes and 456 DEGs, including TGFBR2, EGF, LRP10, and IQ-GAP1 ( Fig. 8A and Supplemental Table 1). Similarly, we screened 10 overlapping pathways between CAD-associated pathways and 18 significant KEGG pathways, including the MAPK signaling pathway (Fig. 8B and Table 3). These results suggest that identified genes and pathways are important in CAD and could be implicated in CAD pathogenesis.

Discussion
In the present study, we identified 456 DEGs (including 175 upregulated and 281 downregulated genes) between CAD and control samples. In the PPI network, EGF (down, degree = 16) was a hub node. Additionally, we screened four significant network modules (modules A, B, C, and D) and observed that each node was individually implicated in the inflammatory response, protein ubiquitination involved in ubiquitin-dependent protein catabolic process, protein transport, and mitochondrial translational elongation. Furthermore, we built a TF-miRNA target regulatory network.
In patients with CAD, TGFBR2 polymorphism is correlated with the risk of sudden cardiac arrest induced by ventricular arrhythmias, suggesting that genetic variations in the TGF signaling pathway could influence susceptibility to sudden cardiac arrest [37]. TGFBR1 is reportedly overexpressed in patients with left ventricular dysfunction and is thus considered a potential prognostic factor after acute myocardial infarction [38]. The mRNA expression levels of EGFR in atheromatous lesions could be a promising prognostic biomarker for predicting the stimulatory growth factorinduced increase in smooth muscle cell proliferation [39]. Circulating miR-23a could serve as a diagnostic biomarker to indicate the presence and severity of coronary lesions in patients with CAD. Moreover, miR-23a regulates vasculogenesis in CAD by inhibiting EGFR expression [40]. In the present study, both TGFBR2 and EGF were involved in the MAPK signaling pathway, thus indicating their potential roles in CAD development.
Plasma levels of miR-122 and miR-370, which are upregulated in patients with hyperlipidemia, are positively associated with CAD severity; therefore, they may be correlated with the development and progression of CAD in pa-tients with hyperlipidemia [41]. The expression of circulating miR-122-5p is reportedly elevated in patients with acute myocardial infarction, suggesting its application as a promising biomarker [42,43]. The plasma levels of miR-122, miR-140-3p, miR-720, miR-2861, and miR-3149 are higher in acute coronary syndrome samples than in control samples; thus, they can be employed as potential markers in patients with acute coronary syndrome [44]. These results indicate that miR-122 plays a critical role in CAD pathogenesis.
LRP is known to possess biological functions in multiple vascular biology-associated processes. Moreover, LRP poly- morphisms are risk factors, especially in Caucasians with premature CAD [45]. In cardiac fibroblasts, LRP1 contributes to the expression of matrix metallopeptidase 9 (MMP9), which has been associated with ventricular remodeling following myocardial infarction [46]. IQGAP1 reportedly affects neovascularization after ischemia by mediating endothelial cellregulated angiogenesis, macrophage infiltration, and reactive oxygen species production; therefore, IQGAP1 is a valuable therapeutic target for ischemic cardiovascular diseases [47,48]. As a scaffold for the extracellular signal-regulated kinase (ERK)1/2 cascade, IQGAP1 mediates the integration of hypertrophy and survival signals in the heart, facilitating left ventricular remodeling following pressure overload [49]. Therefore, both LRP10 and IQGAP1 were targeted by miR-122A in the regulatory network, implying a probable correlation between miR-122A and CAD via the regulation of LRP10 and IQGAP1.

Conclusions
In total, 456 DEGs were screened in CAD samples. Herein, we revealed the probable involvement of TGFBR2, EGF, LRP10, IQGAP1, and miR-122 in CAD pathogenesis. The functions of these genes and miRNAs in CAD pathogenesis need to be comprehensively validated in future experimental research.

Author contributions
Conception and design of the research: WZ and XL; acquisition of data: NW, TL and GZ; analysis and interpretation of data: SF and LL; Statistical analysis: XL and YH; drafting the manuscript: YD; revision of manuscript for important intellectual content: WZ. All authors read and approved the final manuscript.