Comprehensive analysis of lncRNAs N6-methyladenosine modification in colorectal cancer

Background: Long non-coding RNAs (lncRNAs) and their N6-methyladenosine (M6A) modifications are involved in cancer occurrence and development. Methods: lncRNA M6A modification in colorectal cancer (CRC) was comprehensively analyzed for the first time. Results: M6A levels of lnRNAs in CRC tissues were higher than those in tumor-adjacent normal tissues. A total of 8,332 M6A peaks were detected in 6,690 lncRNAs in CRC tissues. Approximately 91% of the modified lncRNAs had unique M6A modification peaks. A total of 383 lncRNAs were differentially methylated in CRC, of which 48.24% had a length of 1-1,000 bp. Most of these were located on chromosomes 1, 2, 7, 11, 16 and 19; 42.3% were within a sense-overlapping exon. RNA sequencing identified 163 differentially expressed lncRNAs in CRC. GO and KEGG analyses revealed that genes near differentially-methylated or -expressed lncRNAs were associated with CRC occurrence and development. Methylation was positively correlated with lncRNA expression levels in CRC and tumor-adjacent normal tissues. More unmethylated than M6A methylated lncRNA molecules were detected. A competing endogenous RNA (ceRNA) and lncRNA-mRNA expression-regulation network revealed a regulatory relationship between lncRNAs, microRNAs (miRNAs), and mRNAs. Conclusions: The findings may help improve our understanding of lncRNA function in colorectal cancer.


INTRODUCTION
Colorectal cancer (CRC) is the second most common cause of cancer-related death [1]. Epidemiological analysis has revealed that over 50% of patients with CRC have liver metastasis [2]. Although the treatment of CRC has been continuously optimized as medical technology has advanced, there is still no effective treatment for patients with advanced metastatic CRC. One of the basic processes during CRC occurrence and development is the accumulation of various genetic and epigenetic changes in colonic epithelial cells [3,4].
Epigenetic modifications, including DNA methylation, histone modification, nucleosome remodeling, and RNA modification [5], regulate cell self-renewal, differentiation, invasion, and apoptosis by modulating gene expression [6]. RNA can be modified by N6-methyladenosine (M6A), 5-methylcytosine (m (5) C), and pseudouridine [7]. M6A refers to an adenosine at the N6 position that has a methyl group attached to it and is the most abundant and evolutionarily conserved RNA modification; M6A can be found in most organisms, from bacteria to AGING mammals [8,9]. Abnormal M6A modification is closely related to the occurrence of a variety of tumors, such as hepatocellular carcinoma, breast cancer, lung cancer, and CRC [10][11][12][13]. Abnormal M6A modification has been suggested to contribute to tumor stem cell self-renewal and promote cancer cell proliferation [14]. However, M6A has also been found to affect tumor progression by regulating the expression of oncogenes or tumor suppressor genes, suggesting a dual regulatory role for M6A in cancer [15,16].
Non-coding RNAs (ncRNAs) are non-protein coding RNAs that primarily include small ncRNAs and long non-coding RNAs (lncRNAs). Although lncRNAs encode almost no protein, they can regulate gene expression through RNA interference, gene coinhibition, gene silencing, gene imprinting, and DNA demethylation [17,18]. LncRNAs also participate in various molecular biological processes. For example, lncRNAs play a key regulatory role in the occurrence and development of many diseases, including CRC [19]. LINC00858 plays a tumor-promoting role in CRC by upregulating HNF4α and down-regulating WNK2 [20]. Additionally, the lncRNA HITT inhibits HIF-1 mRNA expression and inhibits angiogenesis and tumor growth in vivo by interfering with the translation of HIF-1 α [21]. Increased expression of the lncRNA ITIH4-AS1 mediated by the downregulation of the Re1silencing transcription factor can promote the growth and metastasis of CRC cells by activating the JAK/STAT3 signaling pathway [22]. Therefore, it is essential to investigate lncRNAs that are differentially expressed between CRC tissue and tumor-adjacent normal tissues to elucidate the mechanisms by which lncRNAs influence the growth, metastasis, and invasion of CRC cells.
M6A methylated lncRNAs also have important regulatory functions in a variety of biological and pathological processes. In osteosarcoma cells, M6A modification promotes cellular proliferation and growth by upregulating lncRNA plasmacytoma variant translocation 1 [23]. M6A modification of LincRNA1281 affects the differentiation potential of mouse embryonic stem cells [24]. In addition, M6A modification regulates CRC proliferation and metastasis by affecting the expression of the lncRNA XIST [25]. However, the abnormal expression and M6A methylation of lncRNAs in CRC cells and their therapeutic applications remain unclear. Therefore, the establishment of a transcriptome map of the expression and M6A modification of lncRNAs in CRC is of great significance to understand the mechanisms by which lncRNAs affect the growth, metastasis, and invasion of CRC cells.

General features of M6A methylation of lncRNAs in CRC and tumor-adjacent normal tissues
There were 8,332 M6A peaks detected within 6,690 lncRNAs in CRC tissues, while 7, 064 M6A peaks were detected within 5,513 lncRNAs in the tumor-adjacent normal tissues. Among them, 4,202 peaks (accounting for only 37.5% of all peaks in the two groups) were shared between CRC and tumor-adjacent normal tissues ( Figure 1A). The low percentage of shared M6A peaks within lncRNAs indicates a difference in the M6A pattern between the two groups.
Analyzing the distribution of M6A peaks in each lncRNA revealed no significant difference between the two groups. Approximately 91% of the modified lncRNAs had unique M6A peaks, and approximately 7% of the modified lncRNAs had two M6A peaks ( Figure 1B), which is consistent with the proportions previously reported for mRNAs [26].
To further understand the M6A methylated lncRNAs in CRC and tumor-adjacent normal tissues, they were divided into the following six categories based on the positional relationship between the M6A methylated lncRNA and mRNA: bidirectional, exon sense overlap, intron sense overlap, intron antisense, natural antisense, and intergenic. Our results revealed that the majority of lncRNAs contained M6A sites with exon sense overlap (CRC tissues accounted for 57.02% and tumor-adjacent normal tissues accounted for 52.9% of such lncRNAs), and most lncRNAs contained intergenic M6A sites (CRC tissues accounted for 21.8% and tumor-adjacent normal tissues accounted for 24.6% of such lncRNAs). Only approximately 3.5% of the methylation sites were in the bidirectional group ( Figure 1C). Furthermore, M6A peaks demonstrated a higher fold enrichment in the bidirectional group in CRC tissues. However, in tumor-adjacent normal tissues, the natural antisense group harbored peaks with higher fold enrichment ( Figure 1D), indicating differential M6A patterns in CRC and tumor-adjacent normal tissues.

AGING
The top ten hypermethylated or hypomethylated lncRNAs are shown in Table 1.
Analyzing the length of differentially M6A methylated lncRNAs revealed no significant difference between those that were hypermethylated and those that were hypomethylated, and the length of the methylated lncRNAs was primarily 1-1,000 bp (48.24%) ( Figure 2B).
Further analysis of the proportions of the differentially M6A methylated lncRNAs showed that 42.3% of the differentially M6A methylated lncRNAs were in the exon sense-overlapping group ( Figure 2E). Among the hypermethylated lncRNAs, those in the intergenic group showed the highest fold change, whereas among the hypomethylated lncRNAs, those in the intronic antisense group had the highest fold change ( Figure 2F).

Functional analysis of genes near differentially methylated lncRNAs
To reveal the role of differentially methylated lncRNAs in the occurrence and development of CRC, we performed GO and KEGG pathway analysis on the genes located near differentially methylated lncRNAs.
GO results revealed that genes near hypermethylated lncRNAs primarily participate in the regulation of cell cycle arrest, DNA replication, and response to endogenous stimulus in the BP category. In terms of CC, genes are primarily involved in the nucleolus, membrane-enclosed lumen, intracellular organelle lumen, and postsynaptic density. In terms of MF, genes are primarily involved in RNA binding, poly (A) RNA binding, and organic cyclic compound binding ( Figure 3A). Genes near hypomethylated lncRNAs are primarily involved in the following BPs: regulation of catabolic process, regulation of GTPase activity, and positive regulation of hydrolase activity. In terms of CC, genes are associated with cell projections and neurons. In terms of MF, genes primarily participate in ankyrin binding, phosphatidylinositol phospholipase C activity, and phospholipase C activity ( Figure 3B).   AGING KEGG pathway analysis revealed the four most important signaling pathways associated with genes near hypermethylated lncRNAs to be glutamatergic synapse, selenocompound metabolism, morphine addiction, and circadian entrainment ( Figure 3C). For genes near hypomethylated lncRNAs, the four most important signaling pathways were the calcium signaling pathway, circadian entrainment, Rap1 signaling pathway, and dopaminergic synapse ( Figure 3D).

AGING
These results suggest that M6A modified lncRNAs may affect the occurrence and development of CRC through biological processes, cell composition, MF, and signaling pathways. Figure 4A shows a heatmap of differentially expressed lncRNAs, indicating that these lncRNAs have different expression patterns in the two groups. There were 163 differentially expressed lncRNAs in CRC, of which 44 were upregulated and 119 were downregulated ( Figure  4B). The top ten upregulated or downregulated lncRNAs are listed in Table 2.

Overview of differentially expressed lncRNAs in CRC
To further understand the effects of genes near lncRNAs, we identified the genes near differentially expressed lncRNAs and analyzed their GO enrichment and KEGG pathways. GO analysis revealed that genes near upregulated lncRNAs were significantly enriched in cell proliferation, extracellular region, and glycosaminoglycan binding ( Figure 4C). Genes near downregulated lncRNAs were significantly involved in the cell developmental process, actin cytoskeleton, and actin binding ( Figure 4D).
KEGG pathway analysis revealed that genes near upregulated lncRNAs were primarily involved in leishmaniasis, myocardial contraction, hypertrophic cardiomyopathy, hematopoietic cell pedigree, etc.
( Figure 4E). Genes near downregulated lncRNAs were mainly involved in retrograde endocannabinoid signaling, vascular smooth muscle contraction, dopaminergic synapse, insulin signaling, and Wnt signaling ( Figure 4F). Together, these results indicate that genes near differentially expressed lncRNAs may be related to CRC.

Association of M6A methylation and expression of lncRNAs
Methylation sequencing data showed that the level of lncRNAs methylation in CRC group was significantly higher than that in NC group ( Figure 6A). Here we further analyzed the methylation correlation between the two groups. The correlation graph showed that the M6A level of lncRNAs in CRC group was positively correlated with the M6A level of lncRNAs in NC group (Spearman-related Rho=0.49, p < 0.05).
Combining the results of methylation sequencing and RNA sequencing, we found that 132 lncRNAs were hypermethylated, but of which was not differentially expressed lncRNAs. 251 lncRNAs were hypomethylated, among them, 5 lncRNAs were downexpressed ( Figure 6B). To analyze the correlation between lncRNA methylation level and expression level, a correlation graph was constructed using the fold enrichment of lncRNA M6A methylation and expression value in terms of FPKM. The results indicated that there was a statistically significant positive correlation between methylation and expression levels of lncRNAs in CRC and NC ( Figure 6C, 6D).
To further analyze whether M6A methylation affects lncRNA expression, we divided all expressed lncRNAs into M6A lncRNAs and non-M6A lncRNAs, calculated the log two-fold change (log2FC) values of these lncRNAs, and generated a cumulative curve. The result revealed that the proportion of lncRNAs not modified by M6A was larger than that of lncRNAs modified by M6A, especially in terms of the log2FC of the lncRNA FPKM value between 0 and 2 ( Figure 6E).

Construction of lncRNA-miRNA-mRNA and lncRNA-mRNA networks in CRC
To explore the mRNAs regulated by lncRNAs, we screened eight lncRNAs with fold changes >7 out of 267 differentially methylated lncRNAs (Table 3) and eight lncRNAs with fold changes > 2.5 out of 158 differentially expressed lncRNAs ( Table 3), all of which were associated with CRC. A ceRNA network was constructed by lncRNA-miRNA-mRNA association analysis. The network consisted of the top five miRNAs combined with a screened lncRNA and the top five mRNAs bound to the miRNAs, including 16 lncRNAs, 80 miRNAs, and 400 mRNAs. From this ceRNA network, it is clear that lncRNAs regulate to miRNAs and mRNAs. For example, NR_103783 could act as a sponge for hsa-mir-3652/hsa-mir-4430 to affect TRIM40 expression, and hsa-miR-138-5p may be sponged to regulate FOXC1 expression ( Figure 7A).  Next, to investigate the regulatory relationship between lncRNAs and mRNAs in CRC cells, we screened 12 differentially expressed lncRNAs that were related to CRC (Table 4) and analyzed the lncRNA-mRNA regulatory network. The network consisted of 12 lncRNAs and 158 mRNAs. The lncRNA with the largest number of chains was uc001tfa.1, which has 61 edges. The lncRNA uc001tfa.1 can positively regulate the expression of ITIH5 and RAP1 and negatively regulate the expression of SQLE, TMEM9, and USAP1 ( Figure  7B). These two networks provide a deeper understanding of the functional role of lncRNAs in CRC.

DISCUSSION
In this study, we found significant differences in M6A methylation and expression of lncRNAs between cancer tissues and tumor-adjacent normal tissues. GO and KEGG analyses indicated that genes near differentially methylated or expressed lncRNAs were involved in important biological pathways and signaling pathways, which were related to the occurrence and development of CRC. The correlation graph revealed a positive correlation between the methylation and expression levels of lncRNAs in CRC and NC. The cumulative curve indicated that the proportion of lncRNAs not modified by M6A was greater than that of lncRNAs modified by M6A, suggesting that methylation downregulates lncRNA expression. A ceRNA and lncRNA-mRNA expression-regulation network revealed a regulatory relationship between lncRNAs, miRNAs, and mRNAs. This study comprehensively profiled M6A modification and expression patterns of lncRNAs in human CRC to provide a deeper understanding of the functional role of lncRNAs in CRC and new ideas and directions for the diagnosis and treatment of CRC in the future.
KEGG and GO analyses were used to functionally profile the genes near differentially methylated lncRNAs to determine the role of M6A modified lncRNAs in CRC. Our results showed that genes near differentially methylated lncRNAs participate in the biological processes of cell cycle arrest and catabolism; are expressed in cellular components, such as nucleolus, cell cavity, and cell projection; and participate in other molecular functions, such as RNA binding, anchor protein binding, and phosphatidylinositol phospholipase C activity. In addition, genes proximal to differentially methylated lncRNAs were also found to be involved in the regulation of glutamatergic synapses, selenium compound metabolism, and calcium signaling pathways. Cell cycle arrest can provide an opportunity for tumor cells to repair their damaged DNA [27]. The progress of glutamate synaptic input drives tumors in glioma cells [28]. Selenium is a trace element with important benefits for both humans and animals. Selenium compounds exhibit good chemopreventive and chemotherapeutic effects in cancer treatment [29][30]. This indicates that M6A modified lncRNAs affect the occurrence and development of CRC through biological processes, cell composition, and molecular function signaling pathways.

AGING
LncRNA NR-001566 (lncRNA TERC) is a component of telomerase RNA, and its 11-base template region is considered as an ideal target for direct enzyme inhibition of telomerase activity [31]. LncRNA ENST00000500112 (LncRNA CCAT1), which was upregulated in CRC and esophageal cancer, by adjusting the expression of HOXB13 and SPRY4 affect cell proliferation and migration in esophageal squamous carcinoma [32,33]. LncRNA ENST00000524003 (lncRNA OTUD6B-AS1) was downregulated in renal clear cells and inhibited the migration and invasion of cancer cells by inhibiting the activity of Wnt/ catenin pathway and epithelial-mesenchymal transformation [34]. In this study, we found that lncRNA TERC, lncRNA CCAT1 and lncRNA OTUD6B-AS1 were upregulated in CRC. The expression trend of lncRNA OTUD6B-AS1 in CRC and renal cell carcinoma is inconsistent, which may be related to tissue specificity and functional differences.
The relationship between lncRNA methylation and expression has always been a concern, but there is not yet a clear conclusion regarding this relationship. In our study, there was a positive correlation between methylation and expression levels of lncRNAs in CRC and NC. The cumulative curve revealed that the proportion of lncRNAs not modified by M6A was greater than that of lncRNAs modified by M6A, suggesting that methylation downregulates lncRNA expression. Previously, no comprehensive study has demonstrated the correlation between lncRNA methylation and expression. However, some studies on individual lncrRNAs have shown this correlation. Some studies have suggested that M6A methylation is positively correlated with lncRNA expression. M6A methylation was thought to be involved in the upregulation of lncRNA RP11 by increasing its nuclear accumulation [35]. M6A methylation improves the stability and level of RHPN1-AS1 by reducing RNA degradation [36]. However, other studies have found M6A levels to negatively correlate with lncRNA expression. For example, when the M6A methylation level of XIST was significantly decreased, the expression level of XIST was significantly increased [25].  In this study, we constructed a ceRNA and lncRNA-mRNA expression-regulation network to analyze the regulatory relationship between lncRNAs and miRNAs. From the two network diagrams, it can be seen that NR-103783 can serve as a sponge for hsa-mir-3652/hsa-mir-4430 to affect TRIM40 expression and for hsa-miR-138-5p to regulate FOXC1 expression. The lncRNA uc001tfa.1 can positively regulate the expression of ITIH5 and RAP1 and negatively regulate the expression of SQLE, TMEM9, and USAP1. The target genes of lncRNAs are related to the occurrence and development of tumors. TRIM40 has been reported to inhibit nuclear factor-kappaB (NF-κB) activity and prevent inflammation from causing cancer in the gastrointestinal tract [37]. High FOXC1 expression is closely related to metastasis, recurrence, and decreased survival rates in CRC [38]. ITIH5 is considered to be associated with extracellular matrix stability and may therefore play a   [39]. RAP1A promotes CRC development through the PTEN/FOXO3/ CCND1 signaling pathway [40]. SQlE is a breast cancer oncogene [41]. TMEM9, through vacuolar-ATPaseactivated Wnt/β-catenin signaling, promotes intestinal tumorigenesis [42]. Silencing of NUSAP1 inhibits cell proliferation, migration, and invasion by inhibiting DNMT1 expression in human CRC [43]. The abovementioned genes are only few of the tumor-related lncRNA target genes, and their functions require further investigation. However, from these genes, it is clear that the two network diagrams presented herein can both explain the regulatory relationship between lncRNAs and mRNAs and be of great significance to further understand the role of lncRNAs in the occurrence and development of CRC. This could provide a theoretical basis for the occurrence, diagnosis, and treatment of colon cancer at the transcriptional and posttranscriptional levels.  expression of lncRNA was expressed as fold change using the 2−ΔΔCt method and processed by SPSS19.0 with a one-way analysis of variance.

Bio-informatic analysis of LncRNA targets and associated pathways
After sequencing, Q30 was used for quality control and cutadapt [44] (v1.9.3) was used to remove 3' adaptors and low-quality reads to obtain clean high-quality reads.
High-quality reads were aligned to the human reference genome (UCSC HG19) using Hisat2 [45] (v2.0.4). Then, guided by the Ensembl gtf gene annotation file, the FPKM [46] (Fragments per kilobase of exon per million fragments mapped) values of transcript-level lncRNAs and gene-level mRNAs were obtained using cuffdiff (part of v2.2.1). These values were used to form the expression profile of lncRNAs and mRNAs, and multiple changes and p-values between the two groups of samples were calculated to identify differentially expressed lncRNAs and mRNAs. MACS [47] was used to identify methylated sites in each sample. Differentially methylated sites were identified by diffReps [48]. We wrote a program to filter the peaks on lncRNA exons.
LncRNAs are non-protein coding RNAs, but they can regulate target gene expression. Therefore, the function of lncRNAs can be characterized by the function of their target genes [49,50]. The target genes of lncRNA AGING can be divided into two types, cis-targets and transtargets [49]. The transcribed genes within a 10 kbp window upstream or downstream of lncRNAs location were considered as cis-target genes [50]. When the expression quantity correlation coefficient of a lncRNA and its corresponding target mRNA was p ≥ 0.9, it was considered to be a potential trans-target [49].
Annotation, visualization, and integrated discovery databases were used for gene ontology (GO) and pathway enrichment analyses. GO consists of three components: cell component (CC), molecular function (MF), and biological process (BP). The p-value indicates the significance of GO term enrichment. Pathway enrichment analysis is the functional analysis of genes mapped to specific pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG). The Fisher p-value indicates the importance of the path related to the condition.
We first screened lncRNAs related to CRC from the differentially methylated or expressed lncRNAs; then, we screened the lncRNAs with a methylation fold change >7 and the lncRNAs with an expression fold change >2.5 from the lncRNAs that were related to CRC. The miRNA-target gene prediction software based on Miranda [51] and TargetScan [52] was used to predict miRNAs and mRNAs which were combined for the screened lncRNAs. The ceRNA network was plotted using Cytoscape [53] (v3.7.1). The Pearson correlation coefficient of expression level between lncRNAs and mRNAs [54] was calculated, with a threshold value of 0.95, and a coding and non-coding co-expression (CNC) network was plotted using Cytoscape [53] (v3.7.1).

Ethics approval and consent to participate
This study was approved by the Ethics Review Committee of the First Affiliated Hospital of Chengdu Medical College (permit number: CYYFYEC2019002).

Data availability
If you need the raw high-throughput M6A and lncRNAs sequencing generated during the current study, you can ask the corresponding author by E-mail.