Cancer stem cell-specific expression profiles reveal emerging bladder cancer biomarkers and identify circRNA_103809 as an important regulator in bladder cancer

Bladder cancer stem cells (BCSCs), exhibiting self-renewal and differentiation capacities, may contribute to the tumor initiation, metastasis, recurrence and drug resistance of bladder cancer. However, the underlying functional mechanisms of BCSCs remain to be clarified. In this study, we describe the differentially-expressed mRNAs, lncRNAs, and circRNAs in BCSCs compared with that in bladder cancer non-stem cells (BCNSCs) through the transcriptome microarray data analysis using bladder cancer patients’ specimens. CircRNA_103809, the top one among the highly expressed circRNA identified in BCSCs, promotes the self-renewal, migration and invasion capabilities of bladder cancer by acting as a miR-511 sponge. Additionally, GO and KEGG pathway analysis suggest the differentially expressed genes identified may be involved in the cellular metabolism, differentiation and metastasis regulation of the cancer cells. Co-expression networks of lncRNAs/mRNAs and circRNAs/mRNAs constructed by WGCNA give a picture of the non-coding/coding RNAs regulating patterns in BCSCs. Notably, as core genes in the networks, AHCY, C6orf136 and LRIG1 show high potential to be prognosticators for bladder cancer. Therefore, further studies of non-coding RNA functional mechanisms in BCSCs is valuable for detecting the pathogenic mechanisms and discovering novel biomarkers in bladder cancer.


INTRODUCTION
Bladder cancer (BC) is one of the most life-threatening malignancies with high morbidity and mortality rates worldwide [1]. Benefit from the continual development of neoplasm diagnosis and therapy methods, the death rate of cancer had been decreased in the past 10 years. However, because of the lack of progress in the treatment for bladder cancer these years, the death rate of bladder cancer had barely changed [2]. Deeper researches on the pathogenesis and molecular biology of bladder cancer are needed to improve the diagnosis and treatment methods. Recent studies have suggested the critical role of cancer stem cells (CSCs) or cancer-initiating cells in In this study, we represented the expression profiles and interactive networks of mRNAs, lncRNAs, and circRNAs associated with BCSCs for the first time. Based on the RNAs expression data arose from two microarrays of three bladder cancer specimens from patients, differentially expressed circRNAs, lncRNAs and mRNAs in BCSCs when compared to that in BCNSCS were profiled. Then the most highly expressed circRNA in BCSCs named circRNA_103809 was chosen to be demonstrated further about its functions. GO and KEGG pathway enrichment analyses were performed to investigate the biological pathways tuned by these differentially expressed RNAs. The Coexpression networks were constructed to display the potential important interactive network within BCSCs. The expression of some of the core genes in the coexpression networks were found to be correlated to the prognosis of BC patients. Results from this study highlighted the biological processes regulated by noncoding RNAs in BCSCs and provided inspiring clues about future research for novel diagnostic and therapeutic targets based on non-coding RNAs in bladder cancer.

Differential expression profiles of mRNAs, lncRNAs and circRNAs in BCSCs
Our previous studies indicated that the monoclonal antibody BCMab1 recognized aberrantly glycosylated integrin α3 and could be used in the isolation of human bladder cancer stem cells (BCSCs) when combined with CD44 [31]. To identify genes associated with functions of BCSCs, we performed a transcriptome microarray analysis of human BCSCs (BCMab1+CD44+) and BCNSCs (BCMab1-CD44-) isolated from three BC patients. Workflow of this study was shown in Figure 1. The expression matrices for 3 pairs of BCSCs and BCNSCs samples were constructed after data preprocessing. Heatmaps for the expression of total mRNAs, lncRNAs and circRNAs were shown in Figure 2 respectively. Volcano plots were used for assessing gene expression variation between BCSCs and BCNSCs ( Figure 3). A total of 2849 mRNAs, 2698 lncRNAs and 127 circRNAs were found to be differentially expressed with the P-value < 0.05 and fold change > 2.0. Compared with that in BCNSCs, 1685 mRNAs, 1309 lncRNAs and 113 circRNAs were highly expressed, while 1164 mRNAs, 1389 lncRNAs and 14 circRNAs were lowly expressed in BCSCs. Hierarchical clustering analysis showed systematic variations in the expression of these RNAs among samples (Figure 4), and suggested that the expression of mRNAs, lncRNAs and circRNAs in BCSCs differ from those in BCNSCs.

Structure feature and chromosomes distribution of BC expressing non-coding RNAs
In this study, a total of 12,261 annotated lncRNAs and 4,451 novel circRNAs were identified from the BC samples. To present the structural features of these two kinds of non-coding transcripts in BC cells, analysis of gene structure and sequence conservation on the lncRNAs and circRNAs detected in this microarray assay was conducted. Our results showed that intergenic and natural antisense lncRNAs constitute the largest number in all expressing lncRNAs, respectively ( Figure  5A). And exonic and intronic circRNAs constitute the largest number in all expressing circRNAs ( Figure 5B). In addition, analysis of the distribution of circRNAs on 24 chromosomes (22 autosomes and 2 sex chromosomes) showed that circRNAs expressed in AGING bladder cancer cells are mainly distributed on the chr1 and the chr17 ( Figure 5C).

Validation of differentially expressed mRNAs, lncRNAs and circRNAs
To validate the differential expression profiles gained from the microarray analysis, six of mRNAs, lncRNAs and circRNAs were randomly selected respectively from the differentially expressed genes and be verified in vitro. According to previous reports, cells which are capable of forming suspensive oncospheres in a serum-free culture medium supplemented with selected growth factors show high stemness, oncosphere formation culture then be considered an efficient method to enrich cancer stem cells in vitro [32,33]. Therefore, we enriched bladder cancer stem cells from two bladder cancer cell lines T24 and EJ by oncosphere formation culture ( Figure 6A). Validation was performed by comparing the expression level of selected RNAs in oncosphere cells with that in nonsphere cells, and the trends differential expression were consistent with the results from microarray analysis based on patients' samples ( Figure 6B, 6C).  AGING circRNA_103809 serves as a sponge for the miR-511 and induces bladder cancer progression To study functions of the differentially expressed noncoding RNAs identified in BCSCs, we chose circRNA_103809, the most variable gene among the highly expressed circular RNAs, to investigate further. CircRNA_103809 locates at chr5:32379220-32388780 (CircBase ID: hsa_circ_0072088), and was found derived from the ZFR gene and may be associated with  AGING tumor relapse of hepatocellular carcinoma [34]. In this study, we knocked down the expression of circRNA_103809 in bladder cancer cells with two siRNAs that cover the back-splicing region of circRNA_103809, and confirmed the efficiency by qRT-PCR ( Figure 7A). It turned out that the silencing of circRNA_103809 significantly reduced the oncosphere formation, migration and invasion abilities of bladder cancer cells ( Figure 7B-7D). These findings indicated that under-expression of circRNA_103809 may be capable of reducing the progression of BC. To address whether circRNA_103809 works as miRNA sponge in BC cells, we predicted the circRNA/miRNA interaction with Arraystar's homemade miRNA target prediction  AGING software. According to the prediction, circRNA_103809 may interact with five miRNAs (Supplementary Figure  1). Then to validate whether circRNA_103809 serves as the binding platform of these miRNAs, we performed RNA pulldown assay in the EJ cells. As shown in Figure 7E, we first determined that miRNAs expression in EJ cell line. Then we used circRNA_103809 probe pull-down the endogenous different miRNAs, we found that only miR-511 was specifically enriched by circRNA_103809 in EJ cells ( Figure 7F). All these experiments proved that circRNA_103809 may function as a sponge for miR-511 in BC progression.

Biological processes and pathways enrichment of differentially expressed RNAs
The expression of many lncRNAs is significantly correlated with the functions of their neighboring protein-coding genes [35]. And the processing of circRNAs can affect the splicing of their precursor transcripts, leading to altered gene expression outcomes [36]. To clarified the biological processes influenced by the differentially expressed lncRNAs, circRNAs and mRNAs in BCSCs, we performed pathway and function enrichment analysis on gene sets constituted by neighboring protein-coding genes of the lncRNAs, precursor transcripts of the circRNAs, and the mRNAs respectively.
The GO network and diagrams were obtained from the Gene Ontology enrichment analysis (Figure 8). According to the GO category of "Biological Process", the most significant terms in the mRNA group were sensory perception of chemical stimulus, carboxylic acid metabolic process and G-protein coupled receptor signaling pathway. While in the lncRNA group, the most significant terms were ethanol oxidation, innervation, positive regulation of microtubule polymerization, interleukin-7-mediated signaling pathway, and positive regulation of vascular endothelial growth factor production. The most significant terms in the circRNA group were receptor recycling, digestive AGING tract morphogenesis, intra-Golgi vesicle-mediated transport, Golgi to plasma membrane transport, prostate gland development, vesicle-mediated transport to the plasma membrane. In the GO category of "Cellular Component", endoplasmic reticulum membrane, cytoplasmic part, vesicle, cell periphery and mitochondrion were the most significant in the mRNA group and the most significant terms in the lncRNA group were histone deacetylase complex, spindle microtubule, and proton-transporting two-sector ATPase complex. In the circRNA group terms were cortical actin cytoskeleton, specific granule lumen, ciliary tip, and ciliary base. In the GO category of "Molecular Function", we recorded transcriptional activator activity, RNA polymerase II transcription regulatory region sequence-specific DNA binding, Gprotein coupled receptor activity and steroid hydroxylase activity as the most representative terms in Figure 8. GO analysis of differentially expressed mRNAs, lncRNAs, and circRNAs. GO and pathway terms were connected and grouped based on their functional similarity, which is calculated using kappa statistics in ClueGO. Terms in the same group are specified by identical colors, and their size is proportional to their statistical significance. The most significant term in each group is highlighted by a colorful and bigger font. the mRNA group. The most representative terms in the lncRNA group were alcohol dehydrogenase activity, zinc-dependent, retinal dehydrogenase activity, alcohol dehydrogenase (NAD) activity, steroid dehydrogenase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor, RNA polymerase II transcription corepressor activity. In the circRNA group terms were double-stranded RNA binding and phosphatidylinositol-4,5-bisphosphate binding.
As a major public database of pathways, the Kyoto Encyclopedia of Genes and Genomes has been used to determine the significantly enriched pathways for candidate target genes compared with the entire genome background [37,38]. The KEGG analysis performed on the gene set mentioned above had shown the pathways associated with the differentially expressed mRNAs, lncRNAs, and circRNAs in BCSCs respectively (Figure 9). For mRNAs, the drug metabolism, metabolism of xenobiotics by cytochrome P450 and steroid hormone biosynthesis were identified as the top enriched KEGG pathways ( Figure 9A). Fatty and metabolism, pantothenate and CoA biosynthesis, glyoxylate and dicarboxylate metabolism were the most significantly enriched pathways for lncRNAs ( Figure 9B). Whereas, the significantly enriched pathways were cysteine and methionine metabolism, hedgehog signaling and tight junction for circRNAs ( Figure 9C).

LncRNA-mRNA and circRNA-mRNA co-expression networks in BCSCs
Generally, the functions of lncRNAs and circRNAs were mainly performed by interacting with their targets [8,9]. The mRNAs whose expression is correlated with circRNAs or lncRNAs could be important targets of the latter. Here, co-expression networks of differentially expressed lncRNAs-mRNAs and circRNAs-mRNAs were constructed through weighted correlation network analysis (WGCNA). The coexpression network of mRNAs and circRNAs was comprised of 66 network nodes and 1088 connections, and this network included 6 mRNAs, CARHSP1, DOCK7, GFI1B, PARD6A, RAB3IL1 and SPNS3 constituting probably the core of the network ( Figure 10A). Meanwhile, the coexpression network of mRNAs and lncRNAs was comprised of 39 network nodes and 380 connections, and this network included 13 mRNAs, ADAL, BCAT1, IL19, KIAA0748, NR2E3, PRSS22, SCARB1, SMPD1, ZNF319, ALDOA, AHCY, C6orf136 and LRIG1 ( Figure 10B). Next, we analyzed the relationship between the expression of these core genes and the survival prognosis of patients in the TCGA bladder cancer database. A significant difference of overall survival and disease free survival between the high expression groups and the low expression groups for AHCY, C6orf136 and LRIG1 was observed ( Figure 11, P < 0.05). Therefore, the expression levels of AHCY, C6orf136 and LRIG1 could be prognosticators of survival in BC patients.

DISCUSSION
Bladder cancer is one of the most common cancers, which ranks the 4th among cancers in males according to the USA cancer statistics of 2017 [1]. Cancer stem cells with high tumorigenic, drug-resistant and metastatic characters were considered may contribute to the relapse and metastasis of cancer [39]. Making use of AGING a mAb (BCMab1) against CD44 + human bladder cancer cells that recognize aberrantly glycosylated integrin α3β1, we isolated a subset of bladder cancer cells from primary samples in the previous study. We showed that the BCMab1 + CD44 + cells act as bladder cancer stem cells (BCSCs) and are correlated with clinicalpathologic features of bladder cancer [31]. To gain some insights into biological functions of non-coding RNAs in the CSCs of bladder cancer, we performed a comprehensive analysis of microarray data of BCSCs and BCNSCs from three primary samples. We identified the top differentially expressed lncRNAs and circRNAs, then  AGING validated their expression by qRT-PCR. By constructing lncRNAs/mRNAs and circRNAs/mRNAs coexpression networks, we identified mRNAs which may function as core molecules in the genes regulation network of BCSCs. Results from overall survival analysis and disease free analysis with clinical data from TCGA revealed that core mRNAs in the coexpression networks, AHCY, C6orf136 and LRIG1, correlate with the progression and recurrence of bladder cancer significantly.
As shown in previous reports, AHCY (Sadenosylhomocysteine hydrolase) catalyzes the reversible hydrolysis of SAH (S-adenosylhomocysteine) to adenosine and l-homocysteine. This enzyme is frequently overexpressed in many tumor types and is considered to be a validated anti-tumor target [40]. Inhibition of AHCY decreased cell proliferation by G2/M arrest in MCF7 cells [41]. LRIG1 (Leucine-rich repeats and immunoglobulin-like domains 1) is a cell surface protein that antagonizes ERBB receptor signaling by downregulating receptor levels. Interestingly, a contrast to what we found in bladder cancer, LRIG1 was shown to be associated with good survival in 7 types of cancers such as breast cancer, uterine cervical cancer, and head-and neck cancer etc [42]. And it was also reported that LRIG1 marks stem cells in the gut and may maintain the intestinal epithelial homeostasis [43]. Lack of previous reports on the functions of C6orf136, further studies are needed to clarify cancer associated functions of this gene.
Whole-genome and whole-exome mapping have provided an overview of the genomic aberrance associated with the tumorigenesis of bladder cancer [44]. The expression profile data gained from microarray in this study improved the understanding of molecular mechanisms of bladder cancer stem cells by providing information about novel lncRNAs and circRNAs involved in the critical biological pathway of BCSCs.
To unravel the working mechanisms of these noncoding RNAs in the biological process regulation and prognosis of bladder cancer, this study has performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis on associated coding genes of lncRNAs and circRNAs which are differentially expressed between BCSCs and BCNSCs. The results from the analysis revealed that lncRNAs and circRNAs work through different pathways on the function maintenance of BCSCs. For example, lncRNAs may help to support the energy consumption during the efficient proliferation of bladder cancer stem cells by regulating the fatty acid metabolism process. And circRNAs tend to affect the metastasis ability and differentiation of bladder cancer cells by acting on the tight junction and hedgehog signaling associated pathway. The pathway mapping for mRNAs expressed differentially in the two types of cells was also performed and suggests that the drug metabolism function of BCSCs is quite different from BCNSCs, corresponding to the hypothesis that BCSCs may contribute to the chemoresistance of bladder cancer.
In summary, the interactive network of non-coding RNAs and mRNAs constructed by transcription data comparison between BCSCs and BCNSCs suggests that the RNA regulation network plays important role in stemness related functions of bladder cancer cells. Exploring further for critical non-coding RNAs working in BCSCs is required for a better understanding of the molecular mechanisms of bladder cancer recurrence and metastasis.
In conclusion, differentially expressed mRNAs, lncRNAs and circRNAs were screened from BCSCs and BCNSCs of three clinical bladder cancer specimens and analyzed. A total of 2849 mRNAs, 2698 lncRNAs and 127 circRNAs were differentially expressed, and regulate expression of their related genes and play important roles in the stemness maintaining of bladder cancer cells. The relevant signaling pathways of these RNAs shed light on the future study to investigate detailed mechanisms of bladder cancer initiation and progression. The current work also gives rise to the consideration that metabolism processes, especially those which are related to the drug metabolism should be investigated further to unravel the chemotherapy resistance mechanisms of bladder cancer stem cells.

Clinical specimens and cell lines
The 3 samples of bladder tumor specimens used in this study were obtained from patients during operation. All human studies were reviewed and approved by the Institute of Biophysics, Chinese Academy of Sciences, and written informed consent was provided according to the World Medical Association Declaration of Helsinki. Bladder cancer cell lines EJ and T24 used in this study were obtained from ATCC, the American Type Culture Collection. Both cell lines were cultured with RPMI 1640 medium (Gibco, NY, USA) supplemented with 10% fetal bovine serum (FBS) and incubated at 37°C in a humidified atmosphere containing 5% CO2.

Flow cytometry sorting cancer stem cells and microarray assay analysis
Primary human bladder cancer stem cells were sorted by flow cytometry (BD FACSAria II) as previously AGING described [31]. Total RNAs from each sample was quantified using the NanoDrop ND-1000. The sample preparation and microarray hybridization was performed based on the Arraystar's standard protocols. Briefly, total RNAs from each sample was amplified and transcribed into fluorescent cRNAs utilizing random primers according to Arraystar's Super RNA Labeling protocol (Arraystar Inc.). The labeled cRNAs were hybridized onto the Arraystar Human circRNA Array (8×15K, Arraystar). After having washed the slides, the arrays were scanned by the Agilent Scanner G2505C.
Differentially expressed LncRNAs, circRNAs and mRNAs with statistical significance between the two groups of BCSCs and BCNSCs were identified through P-value/FDR filtering. Hierarchical Clustering and combined analysis were performed using homemade scripts.

Cell migration and invasion assay
Cell migration was performed with in vitro scratch assay. The invasion was determined using modified Boyden chambers coated with Martrigel matrix in 24-well plate (BD Biosciences) as previously described [45].
circRNA targets miRNA prediction circRNA_103809-miRNA interaction was predicted with miRNA target prediction software based on TargetScan and miRanda (Arraystar's home-made).

GO and KEGG enrichment analysis
To explore the function of mRNAs, lncRNAs and circRNAs, the Gene Ontology (GO) and Kyoko Encyclopedia of Genes and Genomes (KEGG) enrichment analysis were conducted. The ClueGO (v1.18.0) was used to perform the GO enrichment analysis of differentially expressed genes or target genes of lncRNAs or source genes of differentially expressed circRNAs, in which gene length bias was corrected. All three GO categories, namely cellular component, biological process, and molecular function, were included, and GO terms with the q-value < 0.05 were considered significantly enriched. R software was used to examine the statistical enrichment analysis of differential expression genes or lncRNA target genes or source genes of differentially expressed circRNAs in KEGG pathways. The enriched information was evaluated by the statistical test and correction. The EASE score was calculated to test the relevance, and p-value < 0.05 was considered significantly enriched by differentially expressed genes.

Correlation and coexpression analysis
The coexpression analysis was based on weighted correlation network analysis (WGCNA). WGCNA package in R as a powerful tool was used to make and analyze a co-expression network of selected lncRNAs, circRNAs and mRNAs. Compared to general methods, such as Pearson's correlation coefficient, WGCNA uses the soft threshold, which can provide more extensive and exact correlation between transcripts. Differentially expressed lncRNAs, circRNAs and mRNAs with fold change > 2, P < 0.05, and FDR < 0.05 were analyzed. The value of parameter soft threshold > 0.98 and P-value < 0.05 was recommended for the coexpression analysis. k-Core scoring was used to determine core transcripts of coexpression networks. A higher k-core score means a more central location of a transcript within a network. The soft threshold was adjusted to 0.8 to obtain the lncRNA and circRNA coexpressed mRNA cluster for further functional analysis of lncRNAs and circRNA.

Statistical analysis
We used UALCAN analysis to estimate the effects of hub gene expression levels on patient survival in the Cancer Genome Atlas (TCGA) bladder cancer datasets. Available TCGA patient survival data were also used for Kaplan-Meier survival analyses [47]. The statistical difference in gene expression of qRT-PCR results was analyzed by Student's t-test. It was considered to be statistically significant when p-value < 0.05.