Comprehensive Analysis of Aberrantly Expressed Profiles of ncRNAs Revealed lncRNA AGPAT4-IT1 is a Potential Prognostic Marker for Breast Cancer Metastasis


Aims: We investigated the relationship between long non-coding RNAs (lncRNAs) and breast cancer lung metastasis (BCLM). Methods: We performed lncRNA microarray analyses to establish the lncRNA profile of BCLM. Bioinformatics analyses were carried out to analyze the functional roles of the identified lncRNAs. Kaplan-Meier analysis was conducted to determine the relationship between lncRNA AGPAT4-IT1 and breast cancer prognosis. Results: We found 317 upregulated and 166 downregulated lncRNAs in the BCLM group. AGPAT4-IT1 was positively correlated with its parental gene AGPAT4. Furthermore, AGPAT4-IT1 was more highly expressed in higher-grade tumours and predicted poorer prognosis. Conclusions: These findings provide evidence for exploring the mechanisms of BCLM and indicate that AGPAT4-IT1 is a prospective prognostic marker for breast cancer metastasis.

lncRNA in the HOX termed HOTAIR was highly expressed in primary breast tumors and metastases, and suggested that a high level of HOTAIR in primary tumors predicted a tendency to metastasis and poor prognosis [9]. In addition, another study reported that the lncRNA growth arrest-specific transcript 5 (GAS5) was downregulated in breast cancer tissues compared with corresponding normal tissues; further work showed that GAS5 could induce growth arrest and apoptosis of breast cancer cells [10]. Our previous research also demonstrated that the expression level of lncRNA CUEDC1 was low in breast cancer cell lines and that it could negatively regulate the stemness of breast cancer [11]. Considering the critical function of lncRNAs in breast cancer, we hypothesized that lncRNAs might be correlated with the pathogenesis of BCLM.
In the current study, we constructed an lncRNA signature and circRNA profile in primary breast cancer tissue and corresponding lung metastases. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were used to explore the functional roles that the identified lncRNAs might play. Potential target microRNAs were also predicted for the differentially expressed lncRNAs to probe the underlying mechanisms. We found that lncRNA AGPAT4-IT1 was highly expressed in the lung metastasis group. To explore its functional role in cancer metastasis, we used an online database (https://lncar.renlab.org/) and showed that AGPAT4-IT1 expression was positively correlated with higher tumour grade. Kaplan-Meier analysis suggested that triple-negative breast cancer patients with high AGPAT4-IT1 expression had shorter overall survival and relapse-free survival. Additionally, we established an AGPAT4-IT1-mRNA network to explore potential mechanisms of action. In summary, we conducted a comprehensive analysis that lays a foundation for studying the roles of lncRNAs in BCLM, and our results suggest that lncRNA AGPAT4-IT1 might be a potential prognostic marker for breast cancer metastasis.

Materials And Methods
Tissue and cell line
The human breast cancer cell line MDA-MB-231 was obtained from the American Type Culture Collection (ATCC) and cultured in DMEM supplemented with 10% fetal bovine serum (FBS) as routine [11]. Primary tumour tissue and lung metastases were obtained from an orthotopic tumour model and a lung metastasis model. The xenograft model was established according to our previous publication [12]. Briefly, a total of 2 × 10^6 cells were injected into the fourth pair of mammary fat pads of nude mice. Six weeks later, the mice were sacrificed and the xenografted tumours were harvested. The lung metastasis model was constructed based on a previous study [13]. Briefly, 2 × 10^5 cells were injected into the tail vein of nude mice; six weeks later, the mice were sacrificed and the lung metastases were collected. All tumour tissues and metastases were confirmed by pathology.

RNA extraction and the expression profile analysis of lncRNAs
Total RNA was extracted with the RNeasy Total RNA Isolation Kit (Qiagen, GmBH, Germany) (Life technologies, Carlsbad, CA, US) according to the manufacturer's instructions, then purified with the RNeasy Mini Kit (Qiagen, GmBH, Germany). RNA integrity was assessed by the RIN number using an Agilent Bioanalyzer 2100 (Agilent technologies, Santa Clara, CA, US). The biotinylated ceRNA targets were then hybridized with the slides. After hybridization, the slides were scanned on the Agilent Microarray Scanner (Agilent technologies, Santa Clara, CA, US). The lncRNA chip (Sino Human ceRNA array V3.0) contains 5 91,614 ncRNA probes and 25,353 mRNA probes. These ncRNAs were screened against several databases, including NCBI, Ensembl, NONCODE v5 and LNCipedia 5.0. Data were extracted with Feature Extraction software 10.7 (Agilent technologies, Santa Clara, CA, US). Raw data were normalized with the quantile algorithm in the R package "limma".
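To illustrate the quantile normalization step, the following Python sketch shows the basic idea: each array is forced onto a common distribution by replacing every value with the mean of the values that share its rank across arrays. This is not the limma implementation used in the study; the function name `quantile_normalize` and the toy intensities are assumptions for illustration, and ties are broken by position rather than averaged.

```python
from statistics import mean


def quantile_normalize(samples):
    """Quantile-normalize a list of equal-length samples (one list per array).

    Each value is replaced by the mean, across samples, of the values
    holding the same rank. Ties are broken by original position, which is
    a simplification compared with limma.
    """
    n_probes = len(samples[0])
    if any(len(s) != n_probes for s in samples):
        raise ValueError("all samples must contain the same number of probes")

    # Sort every sample, then average across samples rank by rank.
    sorted_cols = [sorted(s) for s in samples]
    rank_means = [mean(col[i] for col in sorted_cols) for i in range(n_probes)]

    normalized = []
    for s in samples:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: s[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy intensities for two arrays and three probes (made-up numbers).
toy = [[5, 2, 3], [4, 1, 6]]
normalized = quantile_normalize(toy)
```

After normalization both toy arrays contain the same set of values, only in a different probe order, which is the property the quantile method is designed to enforce.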

Identification of aberrantly expressed lncRNAs and mRNAs in primary breast cancer and lung metastases
To establish the lncRNA profiles, differentially expressed lncRNAs with a fold change of at least 2 and a P value < 0.05 were selected for further analysis. Hierarchical clustering of the target genes was performed with the R package "pheatmap".
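The selection rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: fold changes are taken as linear metastasis/primary ratios, P values are assumed to be precomputed, and the function name `classify_lncrna` and the two placeholder transcripts are hypothetical.

```python
def classify_lncrna(fold_change, p_value, fc_cutoff=2.0, p_cutoff=0.05):
    """Label one lncRNA as 'up', 'down' or 'unchanged'.

    fold_change is the metastasis/primary expression ratio on a linear scale.
    """
    if fold_change <= 0:
        raise ValueError("fold change must be positive on a linear scale")
    if p_value >= p_cutoff:
        return "unchanged"
    if fold_change >= fc_cutoff:
        return "up"
    if fold_change <= 1.0 / fc_cutoff:
        return "down"
    return "unchanged"


# The AGPAT4-IT1 values are those reported in this paper; the other two
# entries are made-up placeholders.
candidates = {
    "AGPAT4-IT1": (5.79, 0.03),
    "hypothetical_lncRNA_1": (0.40, 0.01),
    "hypothetical_lncRNA_2": (3.10, 0.20),
}
labels = {name: classify_lncrna(fc, p) for name, (fc, p) in candidates.items()}
```

A transcript with a large fold change but a P value above the cutoff, such as the second placeholder, is left out of the differentially expressed set.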

Biological analysis
For functional enrichment analysis, the Gene Ontology database (http://www.geneontology.org) was used to run Gene Ontology (GO) analysis on the identified genes. GO enrichment analysis of the target genes was performed with Fisher's exact test using the R package "clusterProfiler". GO categories with Fisher's exact test P values < 0.05 were selected. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis was used to detect significant pathways in which the identified genes are involved. Pathway enrichment analysis was performed with Fisher's exact test. Significant pathway terms were selected with P values < 0.05.
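For over-representation analysis, a one-sided Fisher's exact test is equivalent to the upper tail of the hypergeometric distribution. The Python sketch below computes that tail for a single term; it is an illustration only, not the clusterProfiler code path, and the function name, parameter names and toy counts are assumptions.

```python
from math import comb


def enrichment_p_value(hits, targets, term_size, background):
    """One-sided P value for over-representation of a term.

    hits       -- target genes annotated to the term
    targets    -- total number of target genes
    term_size  -- background genes annotated to the term
    background -- total number of background genes
    Returns P(X >= hits) for a hypergeometric X.
    """
    if not (0 <= hits <= min(targets, term_size)):
        raise ValueError("hits must lie between 0 and min(targets, term_size)")
    if targets > background or term_size > background:
        raise ValueError("targets and term_size cannot exceed the background")

    total = comb(background, targets)
    tail = sum(
        comb(term_size, k) * comb(background - term_size, targets - k)
        for k in range(hits, min(targets, term_size) + 1)
    )
    return tail / total


# Toy example: 3 of 3 target genes fall in a 3-gene term, background of 10.
p_toy = enrichment_p_value(hits=3, targets=3, term_size=3, background=10)
```

Terms would then be retained when this P value falls below 0.05, matching the threshold stated above.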

Prediction of interacting miRNAs of differentially expressed lncRNAs
Previous studies have demonstrated that lncRNAs can sponge microRNAs and thereby affect downstream gene expression [14,15]. In the present study, to explore the mechanisms in which the identified lncRNAs are involved, target miRNAs were predicted with online databases. miRanda, TargetScan and miRBase were used to screen the target miRNAs. In addition, the lncRNA/miRNA network of one lncRNA, AGPAT4-IT1, was constructed with Cytoscape software.
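One common way to combine predictions from several tools is to keep only the miRNAs reported by all of them and then write the result as an edge list for network software. The Python sketch below shows this idea. Requiring agreement between tools is an assumption, since the text does not state how the results were combined, and the function names and miRNA identifiers are hypothetical placeholders.

```python
def consensus_mirnas(predictions):
    """Return the miRNAs predicted by every tool in `predictions`.

    `predictions` maps a tool name to the set of miRNA IDs it predicts
    for one lncRNA.
    """
    if not predictions:
        return set()
    sets = [set(ids) for ids in predictions.values()]
    return set.intersection(*sets)


def network_edges(lncrna, mirnas):
    """Build a sorted (source, target) edge list for a network viewer."""
    return [(lncrna, m) for m in sorted(mirnas)]


# Placeholder predictions; the IDs are made up for illustration.
toy_predictions = {
    "miRanda": {"miR-a", "miR-b", "miR-c"},
    "TargetScan": {"miR-b", "miR-c", "miR-d"},
}
shared = consensus_mirnas(toy_predictions)
edges = network_edges("AGPAT4-IT1", shared)
```

The resulting pairs can be saved as a two-column table, a format that network tools such as Cytoscape can import.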

Survival analysis
To determine the functional role of lncRNA AGPAT4-IT1, we conducted survival analysis with an online database (https://lncar.renlab.org/). Overall survival (OS) and metastasis-free survival (MFS) were analyzed. In addition, the correlation between AGPAT4-IT1 and clinicopathological characteristics was analyzed. The GSE58812 dataset was used in this analysis.
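For readers unfamiliar with the Kaplan-Meier method, the Python sketch below computes the product-limit survival estimate for one patient group. It is a minimal illustration, not the code behind the online database; the function name `kaplan_meier` and the toy follow-up data are assumptions, and no log-rank comparison between groups is included.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.

    times  -- follow-up time for each patient
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Toy follow-up data (made-up): the second patient is censored.
toy_curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1])
```

In a two-group comparison, one such curve would be computed for the high-expression patients and another for the low-expression patients.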

LncRNA-mRNA network
Crosstalk between lncRNAs and mRNAs was established as previously described [16]. To analyze the mRNAs interacting with lncRNA AGPAT4-IT1, we established an interaction network with an online database (http://syslab4.nchu.edu.tw/). The GSE43358 dataset was used for the correlation analysis, and Cytoscape was used to construct the network.
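A correlation-based lncRNA-mRNA network can be sketched as follows: compute the Pearson correlation between the lncRNA and each mRNA across samples, and keep the pairs whose correlation is strong enough. This Python example is an illustration under assumptions, not the method of the online database; the cutoff `min_abs_r`, the function names and the toy expression values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def build_edges(lncrna_expr, mrna_exprs, min_abs_r=0.5):
    """Return (gene, r, sign) for mRNAs correlated with the lncRNA.

    min_abs_r is an illustrative cutoff, not a value taken from the study.
    """
    edges = []
    for gene, expr in mrna_exprs.items():
        r = pearson_r(lncrna_expr, expr)
        if abs(r) >= min_abs_r:
            edges.append((gene, r, "positive" if r > 0 else "negative"))
    return edges


# Toy expression values across four samples (made-up numbers).
lnc = [1, 2, 3, 4]
mrnas = {"GENE_A": [2, 4, 6, 8], "GENE_B": [4, 3, 2, 1], "GENE_C": [1, -1, 1, -1]}
toy_edges = build_edges(lnc, mrnas)
```

Each retained pair becomes one edge of the network, labelled by the sign of the correlation.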

Statistical analysis
All data were analysed with SPSS (version 17.0; SPSS, Chicago, IL) unless otherwise specified. The independent-samples t-test was used to compare the identified lncRNAs between groups. Fisher's exact test was used in the GO and KEGG analyses. The Kaplan-Meier method was used to perform survival analysis. P < 0.05 was considered statistically significant unless otherwise stated.

Expression profiles of lncRNAs and mRNAs
In the current study, 3 lung metastases and 2 corresponding primary breast tumours were used for the microarray. A total of 75,590 human lncRNA candidates were detected in the primary breast cancer and lung metastasis groups. Of these, 483 differentially expressed lncRNAs (FC > 2 or FC < 0.5, p < 0.05) were identified in the metastasis group compared with the primary tumour group (Fig. 1A). Among these aberrantly expressed RNA molecules, 317 lncRNAs were upregulated and 166 lncRNAs were downregulated. The volcano plot shows the differentially expressed lncRNAs that reached statistical significance between the two groups (Fig. 1B). The chromosomal distribution of all significantly aberrantly expressed lncRNAs is presented in Fig. 1C. These differentially expressed lncRNAs were visualized with a heatmap, and the top 100 aberrantly expressed lncRNAs are presented in Table 1.

GO and KEGG analysis
To decipher the functional roles of the differentially expressed RNA molecules in lung metastases, GO enrichment and KEGG pathway analyses were conducted. The top 30 GO enrichments, covering biological processes (BP), cellular components (CC) and molecular functions (MF), were chosen for visualization. Of these, most GO terms for differentially expressed genes acting in cis were related to kinase regulation, such as protein tyrosine kinase binding (GO:1990782), protein serine/threonine phosphatase activity (GO:0004674), negative regulation of MAPK cascade (GO:0043409) and negative regulation of ERK1 and ERK2 cascade (GO:0070373) (Fig. 2A-2B). For genes acting in trans, significantly enriched GO terms were mainly associated with ATP metabolism, including ATP-dependent peptidase activity (GO:0004176), ATP-dependent 5'-3' DNA helicase activity (GO:0004003), UDP-glucose process (GO:0006011) and oxygen homeostasis (GO:0032364) (Fig. 2C-2D). In addition, KEGG pathway analysis displayed the top 30 enriched pathways for both cis and trans genes. The genes acting in cis were mainly enriched in transcriptional misregulation in cancer (hsa05202), the p53 signaling pathway (hsa04115) and the HIF-1 signaling pathway (hsa04066), among others (Fig. 3A-3B). Similarly, the top 30 KEGG enrichments in trans included cancer metastasis-related pathways such as the p53 signaling pathway (hsa04115) and the Hippo signaling pathway (hsa04392); lipoic acid metabolism (hsa00785) was also significantly enriched (Fig. 3C-3D). These results suggest that the differentially expressed lncRNAs might play crucial roles in lung metastasis via the pathways mentioned above.

LncRNA AGPAT4-IT1 is correlated with shorter survival and higher tumour grade
Among the differentially expressed lncRNAs, we found that lncRNA AGPAT4-IT1, located on chromosome 6 (chr6: 161160114-161161982), was highly expressed in lung metastases (fold change = 5.79, p value = 0.03). To explore the functional role of AGPAT4-IT1, we conducted survival analysis with an online database, using the GEO dataset GSE58812. A high level of AGPAT4-IT1 was associated with shorter OS and MFS in breast cancer (Fig. 4A-4B). In addition, a higher level of AGPAT4-IT1 was associated with higher tumour grade (Fig. 4C). These data indicate that AGPAT4-IT1 has potential as a predictive marker of breast cancer prognosis.

Co-expression network analysis
To explore the functional roles of lncRNA AGPAT4-IT1, a network of AGPAT4-IT1 and its interacting mRNAs was constructed based on correlation analysis. The crosstalk network involved a total of 201 mRNAs, including 57 positively correlated and 144 negatively correlated mRNAs, and 200 connections were established (Fig. 5). Of these, AGPAT4-IT1 had a strong positive correlation with its parent gene AGPAT4 (r = 0.6698). These results provide evidence that AGPAT4-IT1 might serve as a positive modulator of AGPAT4.

Prediction and annotation of lncRNAs/miRNAs network
Increasing evidence suggests that most lncRNAs can function as microRNA sponges or competitive RNAs to modulate the expression of their downstream genes [17][18][19]. In addition, a previous report showed that cytoplasmic lncRNAs can serve as miRNA sponges to regulate gene expression [20]. In the current study, to investigate how AGPAT4-IT1 acts on gene expression, we examined its localization, and our analysis indicated that lncRNA AGPAT4-IT1 is located in the cytoplasm. Hence, we speculated that AGPAT4-IT1 might control its target genes by acting as a competing endogenous RNA. The miRNAs interacting with AGPAT4-IT1 were therefore predicted with miRanda, TargetScan and miRBase (Fig. 6). Our data indicated that 5 miRNAs, namely hsa-miR-1271-3p, hsa-miR-378a-5p, hsa-miR-4727-5p, hsa-miR-4728-3p and hsa-miR-6861-5p, may interact with lncRNA AGPAT4-IT1. These data are consistent with our hypothesis that AGPAT4-IT1 might control downstream gene expression by acting as a miRNA sponge.

Discussion
Owing to endocrine therapy and HER-2 targeted therapy, breast cancer mortality has dropped dramatically. However, breast cancer metastasis remains the leading cause of breast cancer-related mortality and is an inescapable event. More than 90 percent of breast cancer-related deaths have been reported to be associated with metastasis. Of these, the breast cancer lung metastasis rate was demonstrated to account for 60%-70% [21]. Hence, there is an urgent need to manage breast cancer metastasis effectively. Accordingly, deciphering the mechanisms that drive breast cancer metastasis to the lung is important for developing novel biomarkers and therapeutic targets. Mounting evidence has shown that lncRNAs might take part in breast cancer metastasis [22,23]. However, systematic research remains scarce.
In the present study, lncRNA and mRNA expression profiles were constructed using primary breast cancer tissue and corresponding lung metastases, which allowed us to identify BCLM-related lncRNAs. A total of 483 differentially expressed lncRNAs were detected between the primary breast cancer group and the lung metastasis group. Of these, 317 lncRNAs were upregulated and 166 lncRNAs were downregulated. In addition, we found that lncRNA AGPAT4-IT1 was highly expressed in the lung metastasis tissues (chr6: 161160114-161161982, fold change = 5.79, p value = 0.03). Breast cancer patients with a high level of AGPAT4-IT1 tended to display higher tumour grade and to have shorter OS and MFS (p value = 0.0001 for both). AGPAT4-IT1-related mRNAs were also predicted, and an AGPAT4-IT1/mRNA network was constructed based on correlation analysis. Mechanistically, five potential miRNA targets, namely hsa-miR-1271-3p, hsa-miR-378a-5p, hsa-miR-4727-5p, hsa-miR-4728-3p and hsa-miR-6861-5p, were predicted via miRanda, TargetScan and miRBase. The current study suggests that lncRNA AGPAT4-IT1 might serve as a diagnostic and prognostic biomarker for BCLM and might provide a novel therapeutic target for treating BCLM.
Reports have revealed that the function of lncRNAs can be inferred from their correlated protein-coding mRNAs [8,11]. To explore the function of the differentially expressed RNAs, we predicted their functional roles based on their correlated mRNAs. Our data showed that the identified lncRNAs are closely associated with cell proliferation-related pathways such as the MAPK pathway (GO:0043409) and the ERK1/2 pathway (GO:0070373). A previous report showed that the Ras/MAPK pathway plays critical roles in epithelial-mesenchymal transition (EMT), which facilitates invasion and metastasis [24]. In addition, TP53 (hsa04115) was reported to be associated with adverse prognosis, especially in ER-positive, node-negative and basal-like tumours [25]. Under hypoxic conditions, cancer cells can induce hypoxia-inducible factor 1 (HIF-1) (hsa04066) [26], and HIF-1 has been shown to be related to hematogenous metastasis of breast cancer to the lungs [27]. Hence, these results indicate that the differentially expressed lncRNAs might be associated with cancer metastasis.
Previous research has demonstrated that lncRNAs can function either in cis or in trans. Acting in cis means that an lncRNA regulates the expression of its parental gene, whereas acting in trans means that an lncRNA modulates gene expression at distant genomic or cellular locations [28,29]. The best-known and most well-established example of a cis-acting lncRNA is the X-inactive specific transcript, Xist. Xist has been reported to accumulate in cis and trigger silencing of the entire chromosome, leaving relatively few active genes [30]. HOTAIR was the first lncRNA demonstrated to control gene expression in trans. Rinn and his colleagues found that HOTAIR could suppress transcription in trans; they showed that HOTAIR was essential for Polycomb Repressive Complex 2 (PRC2) occupancy and was required for H3K27me3 of the HOXD locus. In addition, HOTAIR could interact with PRC2. They revealed that HOTAIR could induce gene silencing at a distance [31]. In our study, we found that lncRNA AGPAT4-IT1 was more highly expressed in lung metastases than in primary breast tumour tissue (fold change = 5.79, p value = 0.03). We also showed that a high level of AGPAT4-IT1 predicted shorter OS and MFS, and that AGPAT4-IT1 was more highly expressed in higher-grade tumours. The lncRNA-mRNA network revealed that AGPAT4-IT1 displayed a positive correlation with its parental gene AGPAT4. Taking these results together, we speculate that AGPAT4-IT1 might upregulate the expression of AGPAT4 in cis and might serve as a prognostic predictor for breast cancer.
lncRNAs, as a class of functional RNA molecules, are commonly classified into four types: signaling, decoy, guide and scaffold lncRNAs [32]. Of these, decoy lncRNAs act as decoys for transcription factors or repressors [33]. A common mechanism of decoy lncRNAs is the competitive endogenous RNA (ceRNA) pattern, in which they act as "sponges" for miRNAs [32]. For example, H19 was demonstrated to work as a sponge for let-7 and to impair let-7 availability [34]. Another report revealed that TUG1, a cytoplasmic lncRNA, could modulate PTEN expression via a miRNA sponge pattern and that the sponge function of TUG1 was related to its subcellular localization [35]. In the current study, to probe the potential mechanism of the identified lncRNAs, we predicted their miRNA targets via the miRanda, miRBase and TargetScan databases. Five microRNAs, namely hsa-miR-1271-3p, hsa-miR-378a-5p, hsa-miR-4727-5p, hsa-miR-4728-3p and hsa-miR-6861-5p, were suggested to be possible targets of AGPAT4-IT1. These data suggest that AGPAT4-IT1 could serve as a miRNA sponge involved in breast cancer metastasis.

Conclusion
In conclusion, the current study is the first to establish a systematic lncRNA profile of breast cancer lung metastasis, and it revealed that lncRNA AGPAT4-IT1 was highly expressed in lung metastasis tissues. Survival analysis demonstrated that high expression of AGPAT4-IT1 was closely related to shorter OS and MFS; in addition, a high level of AGPAT4-IT1 was associated with higher tumour grade. Finally, correlation analysis showed that AGPAT4-IT1 was positively correlated with its parental gene AGPAT4. The current study provides a framework for better exploring the underlying mechanisms of breast cancer lung metastasis and lends support to the idea that lncRNA AGPAT4-IT1 might be a potential predictive marker for breast cancer lung metastasis.

Declarations
Ethics approval and consent to participate
The institutional review board approved this manuscript and the current study.