CircUBAP2-mediated competing endogenous RNA network modulates tumorigenesis in pancreatic adenocarcinoma

We investigated the role of the competing endogenous RNA (ceRNA) network in the development and progression of pancreatic adenocarcinoma (PAAD). We analyzed the expression profiles of PAAD and normal pancreatic tissues from multiple GEO databases and identified 457 differentially expressed circular RNAs (DEcircRNAs), 19 microRNAs (DEmiRNAs) and 1993 mRNAs (DEmRNAs). We constructed a ceRNA network consisting of 4 DEcircRNAs, 3 DEmiRNAs and 149 DEmRNAs that regulates the NF-kappa B, PI3K-Akt, and Wnt signaling pathways. We then identified and validated five hub genes, CXCR4, HIF1A, ZEB1, SDC1 and TWIST1, which are overexpressed in PAAD tissues. The expression of CXCR4, HIF1A, ZEB1, and SDC1 in PAAD was regulated by circ-UBAP2 and hsa-miR-494. The expression of CXCR4 and ZEB1 correlated with the levels of M2 macrophages, T-regulatory cells (Tregs) and exhausted T cells in the PAAD tissues. The expression of CXCR4 and ZEB1 positively correlated with the expression of CTLA-4 and PD-1. This suggests that CXCR4 and ZEB1 proteins inhibit antigen presentation and promote immune escape mechanisms in PAAD cells. In summary, our data suggest that the circUBAP2-mediated ceRNA network modulates PAAD by regulating the infiltration and function of immune cells.


INTRODUCTION
Pancreatic adenocarcinoma (PAAD) is the most commonly occurring pancreatic cancer, and the seventh leading cause of tumor-related death in both men and women worldwide [1]. It is often asymptomatic during the early stages. Although the treatment of PAAD has improved in recent years, thanks to the availability of newer chemotherapeutic drugs and advances in surgical techniques, the survival rates of PAAD patients remain very low. The majority of the patients are diagnosed with advanced stage PAAD, which is not amenable for surgical therapy [2]. Moreover, the 5-year survival rates of patients that have undergone successful surgery remain below 8% because of disease recurrence [3]. In addition, high rates of drug resistance to several chemotherapeutic drugs, including gemcitabine also contribute to low survival rates [4]. Therefore, reliable biomarkers useful for accurate early diagnoses are needed to improve survival rates in patients with PAAD.
Previous studies have revealed several protein-coding genes, miRNAs or long non-coding RNAs (lncRNAs) that regulate PAAD. For example, upregulation of miR-222 promotes tumor cell proliferation and invasiveness in PAAD by decreasing the nuclear levels of the p27 tumor suppressor protein [5]. In addition, silencing LINC00958 prevents initiation of pancreatic tumorigenesis by inhibiting PAX8 via miR-330-5p [6]. Circular RNAs (circRNAs) are a novel class of endogenous small non-coding RNAs that form covalently closed-loop structures and lack a 5′ cap or 3′ poly-A tail [7]. Although previously thought to be generated as a result of splicing errors [8], recent advances in RNA sequencing technologies have revealed that they play a key role in regulating gene expression by sponging miRNAs [9]. Aberrant expression of circRNAs has been documented in several human diseases, including cancers [10].
The exosomes secreted by PAAD cells contain circular aminoacyl-tRNA synthetases (circ-IARS), which promote tumor invasion and metastasis by enhancing endothelial monolayer permeability [11]. A competing endogenous RNA (ceRNA) network that includes circRNAs and miRNAs plays a critical role in posttranscriptional regulation of key proteins, and dysregulation of ceRNAs is associated with tumor development and progression [12]. However, the role of circRNAs in the growth and progression of PAAD is not fully understood [13]. The immune system plays a crucial role in PAAD growth and progression [14]. For that reason, we investigated the relationship between ceRNAs and tumor-infiltrated immune cells in the development and progression of PAAD.

Identification of DEcircRNAs, DEmiRNAs, and DEmRNAs in PAAD tissues
We normalized the expression of circRNAs (Supplementary Figures 1A and 1B), microRNAs and mRNAs (Supplementary Figures 1C and 1D) in PAAD and normal pancreatic tissues. We identified 457 differentially expressed circRNAs (DEcircRNAs), which included 174 in the GSE69362 dataset [15] and 283 in the GSE79634 dataset [16], based on a threshold of P value < 0.05 and |log FC| ≥ 1. We generated volcano plots and heatmaps of the DEcircRNAs (Figures 1A and 1B), 19 DEmiRNAs (7 up-regulated and 12 downregulated; Figure 1C) and 1993 DEmRNAs (1012 upregulated and 981 down-regulated; Figure 1D) to demonstrate their aberrant expression in PAAD tissues. We compared the DEcircRNAs from the two datasets and identified 19 up-regulated and 8 down-regulated DEcircRNAs for further analysis ( Figure 1E).

Analysis of DEmRNAs
We performed Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses of the DEmRNAs to identify aberrantly regulated biological processes and signaling pathways in PAAD. The Gene Ontology analyses suggested that biological processes such as cell proliferation, angiogenesis and negative regulation of transcription from RNA polymerase II promoter were dysregulated ( Figure 4A). KEGG pathway analysis of DEmRNAs highlighted the involvement of the NF-kappa B, PI3K-Akt, and Wnt signaling pathways, and cancer-related dysregulation of transcription ( Figure 4B). The NF-kappa B, PI3K-Akt, and Wnt signaling pathways have previously been implicated in the growth and progression of PAAD [17][18][19]. The Sankey plots for all the enriched GO terms and KEGG pathways with a P value < 0.05 were shown, further highlighting the genes and the pathways involved in PAAD.

Identification and validation of the hub genes
We analyzed the DEmRNAs using the Search Tool for the Retrieval of Interacting Genes database (STRING) and constructed a protein-protein interaction (PPI) network consisting of 141 nodes and 50 edges ( Figure 5). We identified the top 15 genes, which were found in at least 6/11 topological algorithms and had a high-ranking score. After integrating the results of the GO, KEGG and PPI network analyses, we identified 5 hub genes that were closely related to PAAD tumorigenesis, namely, C-X-C Motif Chemokine Receptor 4 (CXCR4), Hypoxia-Inducible Factor 1 Subunit Alpha (HIF1A), Zinc Finger E-Box Binding Homeobox 1 (ZEB1), Syndecan 1 (SDC1) and Twist Family BHLH Transcription Factor 1 (TWIST1).
The transcription levels of the 5 hub genes in GSE60980 (GPL14550), 3 related miRNAs in GSE60980 (GPL15159), and 4 circRNAs in GSE79634 ( Figure 6) and GSE69362 (Supplementary Figure 3) were shown. We further validated the 5 hub genes by analyzing their mRNA expression in the Gene Expression Profiling Interactive Analysis (GEPIA) and the protein expression in The Human Protein Atlas (THPA) database as shown in Figures 7A and 7B, respectively. The transcription levels of these 5 hub genes in GEPIA were similar to the results in Figure 6. The protein expression data in the THPA database showed that SDC1 and ZEB1 were aberrantly overexpressed in PAAD compared to normal pancreatic tissues, which were in accordance with our transcription results in Figure 6. However, HIF1A protein levels were low in both PAAD and normal pancreatic tissues, and the protein levels of the remaining two hub proteins, CXCR4 and TWIST1 were not found in the THPA database.

Analysis of prognosis and tumor infiltration of immune cells
We analyzed the overall survival (OS) using the Kaplan-Meier Plotter database (KM plotter database) to determine the prognostic value of hub genes and miRNAs. Prognosis based on their high and low expression in PAAD tissues is shown in Figure 8. High expression of SDC1 and TWIST1 was associated with poorer OS (HR = 2.23, P = 0.0030 and HR = 1.66, P = 0.018). High expression of hsa-miR-494 and hsa-miR-324 was associated with increased OS (HR = 0.64, P = 0.033 and HR = 0.50, P = 0.00095).
Since several studies have shown that immune cell infiltration and the tumor microenvironment plays a critical role in the development of cancers, we analyzed the relationship between the expression of the hub genes and the status of immune cell infiltration in PAAD tumors using the Tumor Immune Estimation Resource (TIMER) database. As shown in Supplementary showed no significant correlation with immune cell infiltration in the PAAD tumors. These results suggested that the circRNA-mediated ceRNAs might modulate tumorigenesis by regulating the infiltration of immune cells in PAAD.

Correlation analysis between the hub genes and the immune cell markers
Next, to further characterize the role of the different subsets of immune cells with PAAD tumorigenesis, we analyzed the correlation between the expression of hub genes and various immune cell markers, including markers specific for CD8+ T cells, T cells (general), B cells, monocytes, tumor-associated macrophages (TAMs), M1 and M2 macrophages, neutrophils, natural killer cells (NKs) and DCs. We also analyzed the data for different subsets of T cells, such as T-helper 1 (Th1) cells, T-helper 2 (Th2) cells, follicular helper T cells (Tfhs), T-helper 17 cells (Th17s), regulatory T cells (Tregs) and exhausted T cells. After adjusting for purity, we found that the expression of most of the immune cell markers positively correlated with the expression of CXCR4 and ZEB1 and negatively correlated with SDC1 expression in PAAD tissues ( Table 2).  Log2FC means that log2 transformation to fold changes of circRNA expression between PAAD tissues and normal tissues. UP means upregulated circRNA in PAAD tissues. DOWN means downregulated circRNA in PAAD tissues.

DISCUSSION
Complex mechanisms involving several genes and pathways are involved in the growth and progression of PAAD. The ceRNA network has emerged a novel method of posttranscriptional gene regulation. In the present study, we identified a total of 457 DEcircRNAs in GSE69362 and GSE79634, 19 DEmiRNAs in GSE60980 (GPL15159) and 1993 DEmRNAs in GSE60980 (GPL14550). We generated a circRNAmediated ceRNA network that included 4 circRNAs, 3 miRNAs and 149 mRNAs. We then constructed a PPI network and identified 5 hub genes, including CXCR4, HIF1A, ZEB1, SDC1 and TWIST1, by performing functional enrichment using GO and KEGG pathway analyses. We further analyzed the relationship between the protein expression of the hub genes and the prognosis and the status of immune cell infiltration in PAAD.
We identified 4 circRNAs, circ-UBAP2, circ-CLEC17A, circ-HIBADH and circ-TADA2A, as part of the ceRNA network from among 457 DEcircRNAs. Previous studies have shown that circ-UBAP2 plays an important role in many cancers by acting as a sponge for miRNAs. In triple-negative breast cancer (TNBC), upregulated circ-UBAP2 binds to and inhibits miR-661, which promotes high expression of the MTA1 oncogene that activates TNBC cell proliferation and migration [20]. In osteosarcoma cells, circ-UBAP2 acts as an miR-143 sponge and suppresses apoptosis by upregulating Bcl-2 [21].  We identified 5 hub genes that have previously been associated with PAAD tumorigenesis. Artemin promotes metastasis and invasiveness of PAAD cells by inducing CXCR4 expression via the activation of the NF-κB signaling [22]. Moreover, miR-494 inhibits proliferation, invasion and metastasis of prostate and breast cancer cells by suppressing CXCR4 expression [23,24]. Furthermore, miR-494 suppresses epithelialmesenchymal transition (EMT), metastasis and invasiveness of PAAD cells by downregulating SDC1 [25]. Liu H et al. reported that downregulation of TWIST1 suppresses PAAD cell invasiveness and metastasis [26]. TWIST1 may be a direct target of hsa-miR-214, which is part of the ceRNA network we constructed in this study. Cao et al. reported that downregulation of mir-214-5p promotes the proliferation, invasion and migration of PAAD cells in a JAG1dependent manner, which is consistent with the effect of TWIST1 knockdown [27]. We therefore speculate that the hsa-miR-214/TWIST1 axis plays a critical role in PAAD progression. When exposed to hypoxic conditions, PAAD cells generate exosomes that are rich in mir-301a-3p which induces M2 polarization of the tumor-associated macrophages (TAMs) via the activation of PTEN/PI3Kγ in a HIF1A-or HIF2Adependent manner, thereby promoting tumor cell EMT, invasiveness, and metastasis [28]. In hypoxic conditions, activated KRAS upregulates carbonic anhydrase 9 modulates pH and glycolysis via HIF1A or HIF2A and promotes aggressive growth of PAAD cells [29].
Notably, 4 out of the 5 hub genes, namely, CXCR4, SDC1, ZEB1 and HIF1A, are potential targets of the circ-UBAP2/hsa-miR-494 ceRNA network. This suggests that circ-UBAP2/hsa-miR-494 ceRNA network plays a critical role in the initiation, growth, and progression of PAAD.   To address the potential regulatory roles of the hub genes in the recruitment of tumor-infiltrating lymphocytes, we assessed the correlation between the expression of hub genes and the status of immune infiltration levels in PAAD using the TIMER database. Our data suggest that high expression of both CXCR4 and ZEB1 correlates moderately or strongly with the markers of immune cell types such as M2 macrophages, CD163, VSIG4 and MS4A4A, but shows a weak correlation with M1 macrophage markers. A previous study confirmed that high expression of ZEB1 induces polarization of TAMs and promotes ovarian tumor growth [30]. We hypothesize that up-regulated CXCR4 and ZEB1 promotes M2 polarization of TAMs and promotes growth and progression of PAAD. Earlier studies have shown that activated Tregs suppress antitumor immunity and promote tumor survival [31].  [34]. Furthermore, CXCR4 inhibitor in combination with an immuneactivating fusion protein called VIC-008 suppresses mesothelioma growth by inhibiting PD-1 expression in CD8+ T cells and promotes transition of Tregs into T helper cells [35]. The upregulation of PD-1 and TIM-3 on CD4+ and CD8+ T cells restricts T cell responses in patients with PAAD [36].
In summary, our study demonstrates that circRNAs regulate PAAD by modulating the expression of the hub genes via the ceRNA network. Furthermore, the ceRNA network involving circ-UBAP2, hsa-miR-494 and the 5 hub genes, especially for CXCR4 and ZEB1, regulates PAAD by modulating the tumor infiltration of immune cells.

Microarray data
In this study, the microarray data of circRNA was obtained from the GSE79634 and GSE69362 datasets. The GSE79634 dataset was based on the GPL19978 platform and included 20 PAAD and adjacent normal tissue samples each. The GSE69362 dataset was based on the GPL19978 platform and contained 6 PAAD and adjacent normal tissue samples each. The microarray data of miRNA was obtained from the GSE60980 (GPL15159) dataset and included 51 PAAD and 6 normal samples. The microarray data of mRNA was obtained from the GSE60980 (GPL14550) dataset and included 49 PAAD and 12 normal samples.

Identification of DEcircRNAs, DEmiRNAs and DEmRNAs
The expression of cicrRNAs in the GSE79634 and GSE69362 datasets was normalized in quantile method, and the DEcircRNAs were determined using the "limma" package [37]. P values < 0.05 and |log FC| ≥ 1 were defined as statistically significant. Furthermore, to enhance the accuracy of the results, the DEcircRNAs were analyzed using the Venn diagram software (http://bioinformatics.psb.ugent.be/webtools/Venn/), and the intersections between the various DEcircRNAs were calculated. We used a similar strategy to analyze the differential expression of the miRNAs in GSE60980 (GPL15159) and the mRNAs in GSE60980 (GPL14550).

Construction of the circRNA-miRNA-mRNA network
To construct the circRNA-miRNA-mRNA network, we obtained basic information regarding the circRNAs, including their chromosomal locations from the circBase (http://www.circbase.org) [38]. We then used the cancer-specific circRNA database (CSCD) [39] to establish a circRNA-miRNA network associated with pancreatic tumorigenesis based on the intersection between the circRNA-miRNA pairs and the DEmiRNAs. We also downloaded all the available information of the circRNAs from the CSCD, including the number and the position of the microRNA response element (MRE), the RNA binding protein (RBP) and the open reading frame (ORF). Next, we obtained the miRNA-predicted mRNA pairs from the miRDB, miRTarBase and TargetScan databases, and made an intersection between the predicted mRNAs and the DEmRNAs to construct a miRNA-mRNA network related to pancreatic tumorigenesis. In the three databases, these predicted miRNA-mRNA pairs are validated experimentally by reporter assay, western blot, microarray and nextgeneration sequencing experiments. Only pairs existed in at least 2 out of 3 databases mentioned before were thought stable and would be included in the present study [40][41][42]. Finally, we constructed a visual display of the circRNA-miRNA-mRNA network using the Cytoscape software [43].

Analysis of DEmRNAs
We performed GO enrichment analysis of the DEmRNAs to identify biological processes involved in pancreatic tumorigenesis using DAVID (version 6.8, Database for Annotation, Visualization and Integrated Discovery, https://david.ncifcrf.gov/) [44].
KEGG enrichment analysis was performed by the Gene Set Enrichment Analysis (GSEA) method using the "clusterprofiler" package [45]. Biological processes (BP) and KEGG pathways with a P value < 0.05 were considered statistically significant and visualized as a Sankey plot using the "ggalluvial" package [46].

Identification and validation of the hub genes
We analyzed the relative importance of the DEmRNAs using the STRING database (https://string-db.org/) [47], which contains 9,643,763 proteins from 2,031 organisms and 1,380,838,440 interactions. We constructed the PPI network using the Cytoscape software. The genes were ranked using the cytoHubba plug-in in Cytoscape [48]. We identified the hub genes by integrating the results of the GO, KEGG and PPI network analyses. The expression of 4 circRNAs (circ-HIBADH, circ-UBAP2, circ-TADA2A, and circ-CLEC17A), 3 miRNAs (has-miR-214, has-mir-324-3p, and has-miR-494) and 5 hub genes (CXCR4,HIF1A, ZEB1, SDC1 and TWIST1) were compared between the PAAD and the normal pancreatic tissues from the GEO dataset. The GEPIA (http://gepia.cancer-pku.cn/) [49] and THPA database (https://www.proteinatlas.org) [50] were used to validate the transcript (mRNA) and the protein levels of the hub genes. Differentially expressed genes were identified by applying the criteria of |log 2 FC| > 1 and P value < 0.05. Moreover, we compared the expression of the four circRNAs in men and women using the GSE79634 database that provides gender-based information for PAAD patients.