Bioinformatic gene analysis for potential biomarkers and therapeutic targets of atrial fibrillation-related stroke

Background Atrial fibrillation (AF) is one of the most prevalent sustained arrhythmias; however, epidemiological data may understate its actual prevalence. Meanwhile, AF is considered a major cause of ischemic stroke due to irregular heart rhythm, coexisting chronic vascular inflammation, renal insufficiency, and blood stasis. We studied co-expressed genes to understand the relationship between AF and stroke and to reveal potential biomarkers and therapeutic targets of AF-related stroke. Methods AF- and stroke-related differentially expressed genes (DEGs) were identified via bioinformatic analysis of the Gene Expression Omnibus (GEO) datasets GSE79768 and GSE58294, respectively. Subsequently, extensive target prediction and network analysis methods were used to assess protein–protein interaction (PPI) networks, Gene Ontology (GO) terms, and pathway enrichment for DEGs; co-expressed DEGs coupled with corresponding predicted miRNAs involved in AF and stroke were assessed as well. Results We identified 489, 265, 518, and 592 DEGs in left atrial specimens and in cardioembolic stroke blood samples at < 3, 5, and 24 h, respectively. LRRK2, CALM1, CXCR4, TLR4, CTNNB1, and CXCR2 may be implicated in AF, and the hub genes CD19, FGF9, SOX9, GNGT1, and NOG may be associated with stroke. Finally, the co-expressed DEGs ZNF566, PDZK1IP1, ZFHX3, and PITX2, coupled with corresponding predicted miRNAs, especially miR-27a-3p, miR-27b-3p, and miR-494-3p, may be significantly associated with AF-related stroke. Conclusion AF and stroke are related, and the ZNF566, PDZK1IP1, ZFHX3, and PITX2 genes are significantly associated with AF-related stroke and are potential novel biomarkers. Electronic supplementary material The online version of this article (10.1186/s12967-019-1790-x) contains supplementary material, which is available to authorized users.

Furthermore, the global prevalence of stroke, together with stroke-related disability and mortality, will increase with population aging [5,6]. Thus, we may not yet know the true burden of stroke, because brain imaging is limited in identifying small (< 10 mm) hypointense areas and silent infarctions, which occur in 28% of patients older than 65 years of age [7]. AF is commonly classified as paroxysmal, persistent, permanent, or new-onset arrhythmia based on its duration: paroxysmal AF self-terminates within 7 days, whereas persistent AF lasts longer than 7 days or needs cardioversion, and usually has lasted for 3 months [8]. As is well known, AF is considered a major cause of ischemic stroke due to irregular heart rhythm, coexisting chronic vascular inflammation, renal insufficiency, and blood stasis. Based on the Rivaroxaban Once Daily Oral Direct Factor Xa Inhibition Compared With Vitamin K Antagonism for Prevention of Stroke and Embolism Trial in Atrial Fibrillation (ROCKET-AF), Steinberg et al. [9] reported that, after adjustment of efficacy and safety outcomes, patients with paroxysmal AF had lower rates of stroke or systemic embolism (adjusted HR: 0.78, 95% CI 0.61-0.99, P = 0.045), all-cause mortality (adjusted HR: 0.79, 95% CI 0.67-0.94, P = 0.006), and the composite of stroke or systemic embolism or death (adjusted HR: 0.82, 95% CI 0.71-0.94, P = 0.005) than patients with persistent AF. According to the Oxford Vascular Study (OXVASC), nearly 43.9% of ischemic strokes were associated with AF among patients 80 years of age or older, a group that had a threefold increase in AF over the past 3 decades [10]. However, this assumption has been challenged by the atrial fibrillation reduction atrial pacing trial (ASSERT), which examined the temporal association between subclinical AF and stroke risk among patients with implantable pacemakers and defibrillators.
They reported that only 8% and 16% of patients showed an association with pre-detected and post-detected AF, respectively, within months of stroke or systemic embolism [11]. Of note, AF is often intermittent and asymptomatic, and presents with electromechanical dissociation. Clinically, current stroke risk scores and traditional electrocardiographic diagnosis are practical, but they have clear limitations in accurately predicting stroke risk in individual AF patients, especially those with persistent AF, which carries a higher risk of stroke or systemic embolism and all-cause mortality [12]. In this study, we identified co-expressed differentially expressed genes (co-DEGs) of persistent AF and stroke and elucidated the molecular mechanisms and pathology of AF-related DEGs (AF-DEGs) and stroke-related DEGs (stroke-DEGs). Finally, we provide a bioinformatic analysis of DEGs and predicted microRNAs (miRNAs) for AF patients prone to stroke.

Materials and methods
GSE79768 and GSE58294 datasets were downloaded from GEO (http://www.ncbi.nlm.nih.gov/geo/) [13]; both expression profiles were generated on the GPL570 (HG-U133_Plus_2) Affymetrix Human Genome U133 Plus 2.0 Array platform (Affymetrix, Santa Clara, CA). The GSE79768 dataset, comprising 26 specimens of paired left atrial (LA) and right atrial (RA) tissue obtained from 13 patients, was used to identify differential LA-to-RA gene expression and molecular mechanisms in patients with persistent AF or sinus rhythm (SR), and to describe potential mechanisms of AF-related remodeling in the LA and the relationship between LA arrhythmogenesis and thrombogenesis. In this study, persistent AF had lasted continuously for > 6 months, while the SR patients had no clinical evidence of AF and no history of anti-arrhythmic drug use. Blood samples in GSE58294 were collected from cardioembolic stroke patients (N = 69) and control patients (N = 23) at < 3, 5, and 24 h.

Data processing
The R packages "affy", "affyPLM", and "limma" (http://www.bioconductor.org/packages/release/bioc/html/affy.html), provided by the Bioconductor project [14], were applied to assess the GSE79768 and GSE58294 raw datasets. We used the robust multi-array average (RMA) method, comprising background correction, quantile normalization, probe summarization, and log2 transformation, together with log-transformed perfect match and mismatch probe (PM and MM) methods. The Benjamini-Hochberg method was used to adjust the original p-values and control the false discovery rate (FDR), and fold-changes (FC) were calculated. Genes with |log2 FC| > 1 and adjusted p < 0.05 were retained as AF-DEGs. For stroke-DEGs, a stricter threshold of |log2 FC| > 1.5 and adjusted p < 0.05 was used, given that blood sample specificity pointed to many genes. Additionally, we drew Venn diagrams to identify co-DEGs shared by AF- and stroke-DEGs.
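The filtering step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which was run in R with limma; the function names and the toy gene labels are hypothetical, and the input is assumed to be a mapping from gene to a (log2 FC, raw p-value) pair.

```python
from typing import Dict, List, Tuple


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def filter_degs(
    stats: Dict[str, Tuple[float, float]],
    lfc_cutoff: float,
    alpha: float = 0.05,
) -> List[str]:
    """Keep genes with |log2 FC| > lfc_cutoff and adjusted p < alpha."""
    genes = list(stats)
    adj = benjamini_hochberg([stats[g][1] for g in genes])
    return [
        g for g, q in zip(genes, adj)
        if abs(stats[g][0]) > lfc_cutoff and q < alpha
    ]


# Toy input (hypothetical values): gene -> (log2 FC, raw p-value).
toy_stats = {
    "GENE_A": (1.8, 0.001),
    "GENE_B": (-1.2, 0.004),
    "GENE_C": (0.4, 0.0005),
    "GENE_D": (2.1, 0.2),
}
af_like = filter_degs(toy_stats, lfc_cutoff=1.0)      # AF-style threshold
stroke_like = filter_degs(toy_stats, lfc_cutoff=1.5)  # stroke-style threshold
```

With the same adjusted p-value cutoff, raising the fold-change threshold from 1 to 1.5 only shrinks the selected set, which is why the stricter value suits the blood samples, where many genes pass.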

Identification of protein-protein interaction (PPI) networks of DEGs
PPI networks of AF- and stroke-DEGs were analyzed using the Search Tool for the Retrieval of Interacting Genes (STRING database, V10.5; http://string-db.org/), which predicts protein functional associations and protein-protein interactions. Subsequently, after downloading the STRING results with a confidence score > 0.4, Cytoscape software (V3.5.1; http://cytoscape.org/) was applied to visualize and analyze the biological networks and node degrees [19].
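As a rough illustration of how node degrees identify hub genes, the sketch below counts the neighbours of each protein in an edge list after applying the confidence cutoff. It is a simplified stand-in for the Cytoscape analysis, not a reproduction of it; the edge tuples, the 0 to 1 score scale, and the function names are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str, float]  # (protein_a, protein_b, combined score in 0..1)


def node_degrees(edges: Iterable[Edge], min_score: float = 0.4) -> Dict[str, int]:
    """Count distinct interaction partners per node for edges above the cutoff."""
    degree: Counter = Counter()
    seen = set()
    for a, b, score in edges:
        if a == b or score <= min_score:
            continue  # drop self-loops and low-confidence edges
        pair = frozenset((a, b))
        if pair in seen:
            continue  # treat A-B and B-A as the same undirected edge
        seen.add(pair)
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


def top_hubs(degrees: Dict[str, int], k: int = 5) -> List[str]:
    """Return the k highest-degree nodes, ties broken alphabetically."""
    return sorted(degrees, key=lambda g: (-degrees[g], g))[:k]


# Hypothetical toy network, not taken from the STRING results of this study.
toy_edges = [
    ("A", "B", 0.9),
    ("B", "A", 0.9),   # duplicate of the first edge
    ("A", "C", 0.5),
    ("C", "D", 0.3),   # below the 0.4 cutoff
    ("B", "C", 0.41),
]
toy_degrees = node_degrees(toy_edges)
toy_hubs = top_hubs(toy_degrees, k=2)
```

Degree is only one of several centrality measures; it is used here because the text refers to node degrees specifically.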

Functional enrichment analysis
Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of AF- and stroke-DEGs were carried out using the Database for Annotation, Visualization and Integrated Discovery bioinformatics resources (DAVID Gene Functional Classification Tool, http://david.abcc.ncifcrf.gov/) [20] and the REACTOME database (v62; http://www.reactome.org) [21]. GO terms and KEGG maps of biological functions with p < 0.05 were considered significantly enriched. In addition, we presented the biological processes, molecular functions, and cellular components of AF- and stroke-DEGs from the DAVID and REACTOME databases, respectively.
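The idea behind a term-level enrichment p-value can be sketched with a plain one-sided hypergeometric test. This is an illustration under stated assumptions rather than the calculation DAVID performs (DAVID reports a modified Fisher exact p-value, the EASE score); the function name and the numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(
    n_background: int, n_term: int, n_degs: int, n_overlap: int
) -> float:
    """P(X >= n_overlap) when n_degs genes are drawn from n_background genes,
    of which n_term are annotated to the term of interest."""
    upper = min(n_term, n_degs)
    total = comb(n_background, n_degs)
    # math.comb returns 0 when k > n, so impossible draws contribute nothing.
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 10 background genes, 4 annotated to a term,
# 3 DEGs of which 2 carry the annotation.
p_example = hypergeom_enrichment_p(10, 4, 3, 2)
is_enriched = p_example < 0.05
```

In this toy case the p-value is 1/3, so the term would not pass the p < 0.05 cutoff used in the text.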

Identification of co-DEGs associated with nervous or cardiovascular diseases
The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) integrates chemical-gene, chemical-disease, and gene-disease interactions to generate expanded networks and predict novel associations [23]. We used these data to analyze relationships between gene products and nervous or cardiovascular diseases. Here, direct or implied associations between co-DEGs and diseases were identified.

Identification of DEGs
We identified 54,674 probes corresponding to 20,484 genes in the GSE79768 and GSE58294 datasets, from which AF- and stroke-DEGs were determined. We found 489 DEGs in LA specimens of AF patients compared with SR patients, including 428 down-regulated and 61 up-regulated genes. In addition, a total of 265, 518, and 592 DEGs were identified at the time points of less than 3, 5, and 24 h after stroke, respectively. We defined the 210 DEGs co-expressed at all three time points as the stroke-DEGs. Heatmaps of AF-DEG expression related to inflammatory and immune response, ion channels, and cell signaling appear in Fig. 1 and Additional file 1: S1. Similarly, Fig. 2 and Additional file 2: S2 show the expression values of stroke-DEGs related to inflammatory response, energy metabolism, ion channels and transport, and neuronal regulation.
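Defining stroke-DEGs as the genes shared by all three time points, and co-DEGs as the overlap with AF-DEGs, amounts to set intersection. The sketch below shows that logic with hypothetical gene labels; the function names are illustrative and the toy sets do not reflect the actual gene lists.

```python
from typing import Dict, Iterable, Set


def shared_degs(*deg_sets: Iterable[str]) -> Set[str]:
    """Genes present in every supplied DEG list."""
    if not deg_sets:
        return set()
    return set.intersection(*(set(s) for s in deg_sets))


def venn_regions(a: Iterable[str], b: Iterable[str]) -> Dict[str, Set[str]]:
    """Split two gene lists into the three regions of a two-set Venn diagram."""
    a_set, b_set = set(a), set(b)
    return {
        "only_a": a_set - b_set,
        "both": a_set & b_set,
        "only_b": b_set - a_set,
    }


# Hypothetical toy lists standing in for the < 3, 5, and 24 h DEG sets.
degs_3h = {"G1", "G2", "G3"}
degs_5h = {"G2", "G3", "G4"}
degs_24h = {"G2", "G3", "G5"}
stroke_degs = shared_degs(degs_3h, degs_5h, degs_24h)

# Hypothetical AF-DEG list compared against the shared stroke-DEGs.
af_degs = {"G3", "G9"}
regions = venn_regions(af_degs, stroke_degs)
```

Here `regions["both"]` plays the role of the co-DEGs shown in the Venn diagram.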

Functional enrichment in Co-DEGs
Figure 3c illustrates the AF- and stroke-DEGs and their co-expressed genes. Interestingly, four co-expressed DEGs were observed: zinc finger protein 566 (ZNF566), PDZK1 interacting protein 1 (PDZK1IP1), zinc finger homeobox 3 (ZFHX3), and paired-like homeodomain 2 (PITX2). The AmiGO database was used to confirm GO term enrichment related to biological processes, molecular functions, and cellular components, and the Co-DEGs were associated with various processes, as indicated in Table 1.
Using the DAVID database, the top 5 GO terms related to biological processes among those genes were primarily associated with inflammatory response (Fig. 4 and Additional file 3: S3).

Table 1 The Gene Ontology (GO) terms enrichment for the co-expressed genes of the AF-related stroke. ISS, sequence similarity evidence used in manual assertion; IGI, genetic interaction evidence used in manual assertion; IDA, direct assay evidence used in manual assertion; TAS, traceable author statement used in manual assertion; IEA, evidence used in automatic assertion; IPI, physical interaction evidence used in manual assertion.

KEGG pathway analysis data appear in Fig. 4c. The results suggest that the AF-DEGs were mainly enriched in the pathways of cytokine-cytokine receptor interaction (p-value: 1.02E−04), cGMP-PKG signaling pathway (p-value: 0.025), antigen processing and presentation (p-value: 0.022), and NF-kappa B signaling pathway (p-value: 0.037). In contrast, the PI3K-Akt signaling pathway (p-value: 0.017) and B cell receptor signaling pathway (p-value: 0.045) were enriched in stroke-DEGs (Fig. 4c and Additional file 4: S4). GO term enrichment using the REACTOME database identified additional associations, and these appear in Fig. 4d. The CTD database showed that Co-DEGs targeted several nervous system and cardiovascular diseases and these data appear in

Identification of functional and pathway enrichment among predicted miRNAs and Co-DEGs
Prediction analysis using mirDIP, miRDB, TargetScan, and DIANA bioinformatic tools identified the top 5 selected miRNAs targeting each Co-DEG involved in AF-related stroke and these data appear in Table 2.
These data help us understand how the predicted miRNAs are related to the progression of AF-related stroke.
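One plausible way to combine predictions from several tools is to rank each candidate miRNA by how many tools support it. The text does not state how the top 5 miRNAs were selected, so the sketch below is an assumption for illustration only; the tool labels, gene label, and miRNA labels are hypothetical placeholders, and no real tool API is called.

```python
from collections import Counter
from typing import Dict, List, Tuple

# tool name -> gene -> miRNAs predicted to target that gene
Predictions = Dict[str, Dict[str, List[str]]]


def consensus_mirnas(
    predictions: Predictions, gene: str, k: int = 5
) -> List[Tuple[str, int]]:
    """Rank miRNAs for one gene by the number of tools that predict them."""
    support: Counter = Counter()
    for per_gene in predictions.values():
        # Use a set so a tool listing the same miRNA twice counts once.
        for mirna in set(per_gene.get(gene, [])):
            support[mirna] += 1
    ranked = sorted(support.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hypothetical predictions from three unnamed tools for one placeholder gene.
toy_predictions: Predictions = {
    "tool_1": {"GENE_X": ["miR-a", "miR-b"]},
    "tool_2": {"GENE_X": ["miR-b", "miR-c"]},
    "tool_3": {"GENE_X": ["miR-b", "miR-a"]},
}
toy_ranking = consensus_mirnas(toy_predictions, "GENE_X")
```

A score-weighted ranking would be an equally reasonable choice; counting supporting tools is simply the easiest scheme to audit by hand.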

Discussion
Predicting AF is needed for stroke prevention, but 30% of patients show no signs of AF despite months of continuous cardiac rhythm monitoring. Thus, malignant cardiovascular events may be related to irregular and infrequent cardiovascular incidents, as well as to limitations of the electromechanical indices meant to predict problems with atrial contractility [7,11,12]. Markers of, and associations between, atrial dysfunction and embolic stroke are therefore of interest and may yield novel therapeutic targets for primary care. Inflammatory and immune responses, and ion channels and transport, are significantly associated with AF recurrence and maintenance, as well as with stroke occurrence. Several hub genes that directly or indirectly regulate the nervous system were found among AF-DEGs. Visanji's group compared resting electrocardiograms of LRRK2-associated Parkinson's disease (PD) patients, nonmanifesting carriers, noncarriers, and idiopathic PD patients to investigate heart rate variability in LRRK2-associated PD [24]. There is evidence that LRRK2 may act as a biofunctional mediator linking heart rate variability and PD [24]. In a molecular mechanistic study, a neuroprotective role in regulating mitochondrial complex I function and oxidative stress in ischemia/reperfusion was identified [25,26]. Based on gene-gene interaction analysis, Timasheva's group showed that the CXCR2 locus is significantly associated with stroke development in patients with hypertension [26]. In addition, CXCR2 antagonism attenuated neurological deficits and infarct volumes via decreased cerebral neutrophil infiltration and peripheral neutrophilia in a hyperlipidemic ApoE −/− mouse stroke model [27]. CALM1 is recognized as a major regulator of cardiac ion-current expression and calcium handling, and a key determinant of cardiac electrical function [28].
Also, specific risk alleles of CALM1 were associated with increased risk of stroke in studies of coronary heart disease [29]. Thus, cardiovascular and nervous system diseases may be related, and both may arise from locus mutations or gene variants. Additionally, PITX2, a member of the pituitary homeobox (Pitx) family, has a critical role in organ morphogenesis and in AF maintenance, which is related to short stature homeobox 2 (Shox2) [30]. Pitx2 is expressed in the LA and the pulmonary vein, which are considered a substrate and a trigger for AF maintenance, respectively. However, several experimental data indicate a trend toward silencing of PITX2 gene expression during aging in LA samples, suggesting that gene silencing contributes to increased AF susceptibility [30,31]. Furthermore, miRNA functional analysis and a genomic approach showed that miR-17-92 and miR-106b-25 were associated with the regulation of Pitx2 expression and are implicated in human AF susceptibility [31]. To reveal relationships between genetic variants and the risk of ischemic stroke, Malik's group studied the PITX2 and ZFHX3 genes and found a significant association with cardioembolic stroke (CE) in a meta-analysis [31,32]. Similarly, in a genome-wide association study using clinical samples from paroxysmal or persistent AF patients, ZFHX3 was significantly associated with LA enlargement and persistent AF, and subsequently with ablation outcomes [33]. Correspondingly, Choi's group found a significant association between the top susceptibility loci (chromosomes 4q25 [PITX2], 16q22 [ZFHX3]) and AF recurrence after ablation in a Korean population, although none of the top single nucleotide polymorphisms (SNPs) predicted clinical recurrence after catheter ablation [34]. A regulatory role for PDZK1IP1 (MAP17) in reactive oxygen species production has been confirmed; it is considered a marker of increased oxidative stress and may be a new therapeutic target [35].
Recent research also suggests a potential role in ion channel regulation, linked to the Na+/H+ exchanger 3 and the A-kinase anchor protein 2/protein kinase A pathway [36]. ZNF566 plays a central role in heart regeneration and repair, and in endocardial and epicardial epithelial-to-mesenchymal transitions [37,38].