Identification of Potential lncRNAs and miRNAs as Diagnostic Biomarkers for Papillary Thyroid Carcinoma Based on Machine Learning

Background Papillary thyroid carcinoma (PTC) accounts for most of the proportion of thyroid cancer (TC). The objective of this study was to identify diagnostic, differentially expressed long noncoding RNAs (lncRNAs) and microRNAs (miRNAs), contributing to understanding the epigenetics mechanism of PTC. Methods The data of lncRNA, miRNA, and mRNA were downloaded from the Cancer Genome Atlas (TCGA) dataset, followed by functional analysis of differentially expressed mRNAs. Optimal diagnostic lncRNA and miRNA biomarkers were identified via random forest. The regulatory network between optimal diagnostic lncRNA and mRNAs and optimal diagnostic miRNA and mRNAs was identified, followed by the construction of ceRNA network of lncRNA-mRNA-miRNA. Expression validation and diagnostic analysis of lncRNAs, miRNAs, and mRNAs were performed. Overexpression of ADD3-AS1 was performed in PTC-UC3 cell lines, and cell proliferation and invasion assay were used for investigating the role of ADD3-AS1 in PTC. Results A total of 107 differentially expressed lncRNAs, 81 differentially expressed miRNAs, and 515 differentially expressed mRNAs were identified. 11 lncRNAs and 6 miRNAs were regarded as the optimal diagnostic biomarkers for PTC. The epigenetic modifications via the above diagnostic lncRNAs and miRNAs were identified, including MIR181A2HG-FOXP2-hsa-miR-146b-3p, BLACAT1/ST7-AS1-RPS6KA5-hsa-miR-34a-5p, LBX2-AS1/MIR100HG-CDHR3-hsa-miR-34a-5p, ADD3-AS1-PTPRE-hsa-miR-9-5p, ADD3-AS1-TGFBR1-hsa-miR-214-3p, LINC00506-MMRN1-hsa-miR-4709-3p, and LOC339059-STK32A-hsa-miR-199b-5p. In the functional analysis, MMRN1 and TGFBR1 were involved in cell adhesion and endothelial cell migration, respectively. Overexpression of ADD3-AS1 inhibited cell growth and invasion in PTC cell lines. Conclusion The identified lncRNAs/miRNAs/mRNA were differentially expressed between normal and cancerous tissues. In addition, identified altered lncRNAs and miRNAs may be potential diagnostic biomarkers for PTC. Additionally, epigenetic modifications via the above lncRNAs and miRNAs may be involved in tumorigenesis of PTC.


Introduction
yroid cancer (TC) accounts for 94% of the endocrine system cancers and 66% of the deaths. Clinically, TC is divided into four histological subtypes: follicular TC, anaplastic TC, medullary TC, and papillary thyroid carcinoma (PTC). Mortality of TC is low. However, the incidence and recurrence rate of PTC are increasing worldwide, especially among women [1]. Some factors appear to increase the risk mechanism of PTC, develop novel diagnostic strategy for the disease, and interfere with the thyroid neoplasm progress into malignant cancer.
Previous reports have demonstrated that long noncoding RNAs (lncRNAs) play roles in TC [9][10][11]. e lncRNA BRAF-activated long noncoding RNA (BANCR) can promote proliferation and inhibit apoptosis in TC cells [12]. In addition, changes in the expression of multiple microRNAs (miRNAs) may be a major mechanism in TC tumorigenesis and could be used in disease diagnosis [13,14]. In PTC cells, several miRNAs (hsa-miR-220, hsa-miR-221, and hsa-miR-222) are significantly upregulated, while, several other miRNAs (hsa-let-7, hsa-miR-26, and hsa-miR-345) are significantly downregulated [15][16][17][18]. It is worth mentioning that the interaction between lncRNAs and miRNAs influences tumor development and progression [19]. In this study, we tried to identify potential diagnostic lncRNAs and miRNAs biomarkers in PTC. We performed the differential expression analysis of lncRNAs, miRNAs, and mRNAs. en, we performed machine learning to find optimal diagnostic, differentially expressed lncRNAs and miRNAs for PTC. Subsequent analysis was based on these optimal diagnostic lncRNAs and miRNAs.

Data Retrieval.
In the Cancer Genome Atlas (TCGA) database (http://sangerbox.com/TcgaDown), the clinical data of 506 PTC patients, lncRNA and mRNA data of 502 PTC patients, and miRNA data of 506 PTC patients were recorded. Transcriptome lncRNA, miRNA, and mRNA count data of PTC were obtained from primary solid tumor and normal solid tissue. Finally, the lncRNA and mRNA data (involving 502 cases and 58 controls) and miRNA data (involving 506 cases and 59 controls) were used for analysis.

Identification of the Optimal Diagnostic lncRNAs and miRNAs.
Firstly, importance value of each lncRNA/miRNA was ranked by the random forest algorithm. en, the optimal number of features was found by subsequently adding one lncRNA/miRNA at a time in a top-down forward-wrapper approach. Optimal lncRNA/miRNA with diagnostic value for PTC were used to establish classification models including random forests (RF), decision tree (DT), and support vector machine (SVM). e "randomForests" package in R language, "rpart" package in R language, and "e1071" package in R language were used to establish the RF model, DT model, and SVM model, respectively. Diagnostic ability of classification prediction was evaluated by obtaining the area under a receiver operating characteristic (ROC) curve.

2.4.
Network of lncRNA-mRNA-miRNA. Firstly, the Pearson correlation coefficient (r) of identified diagnostic lncRNAs and differentially expressed mRNA was estimated using the cor.test function in R. e standard of co-expression relation pair was |r| > 0.5 and p value ≤ 0.05. Secondly, the pairwise Pearson correlation coefficient (r) between identified diagnostic miRNAs and mRNAs was calculated. miRNA-target prediction tools (miRDB, TragetScan, miRanda, and PIC-TAR2) were used to predict target mRNAs of diagnostic miRNAs. Interaction network of diagnostic lncRNA-coexpressed mRNA and the selected diagnostic miRNA-target pairs with negative correlations were visualized by using Cytoscape.

Functional Analysis of mRNAs.
In order to understand the biological function of mRNAs, we conducted Gene Ontology (GO) enrichment analysis through DAVID 6.8 database (https://david.ncifcrf.gov/). p < 0.01 was considered as statistically significant.

Validation and Diagnostic Analysis of lncRNAs and miRNAs on Public Gene Expression Dataset.
e GEO dataset, GSE33630 [20,21] (involving 49 cases and 45 controls), was used for expression validation and ROC analysis of lncRNAs. In addition, GSE104006 [22] (involving 57 cases and 11 controls) was used for expression validation and ROC analysis of miRNAs. e expression result of these lncRNAs/miRNAs was shown by box plots.

Ex Vivo Validation of lncRNAs, miRNAs, and mRNAs.
e inclusion criteria of PTC patients were as follows: (1) patients were diagnosed with PTC according to pathological examination; (2) patients underwent surgery of PTC for the first time and received no adjuvant therapy before; and (3) patients had complete clinical data including medical history of present illness, family history, personal history, detailed physical examination data, and postoperative pathological data. e exclusion criteria of PTC patients were as follows: (1) patients had autoimmune thyroid disease, other malignant tumors, or viral infections; (2) patients with recurrence; (3) patients with incomplete clinical information; and (4) patients took psychotropic drugs for a long time. e tumor tissue and paracarcinoma tissue of 7 patients were collected. e tumor tissue and paracarcinoma tissue of these patients were collected for validation. All participating individuals provided informed consent with the approval of the ethics committee of the local hospital.
All tissue specimens (confirmed by the pathologist) were frozen continuously within 15 min after surgery and stored in the refrigerator at −80°C for use. Total RNA of the tissue and paracarcinoma tissue was extracted using the TRIzol reagent. 1 µg RNA was applied to synthesize DNA by FastQuant cDNA first-strand synthesis kit (TIANGEN). en real-time PCR was performed in the ABI 7300 realtime PCR system with SYBR ® Green PCR Master Mix.
GAPDH and ACTB were endogenous controls of mRNA. Hsa-U6 was endogenous controls of lncRNA. Relative lncRNA/miRNA/mRNA expression was analyzed fold change method.

Function
Analysis of Identified lncRNAs. In order to further analyze the function of identified lncRNAs, PTC cell lines (PTC-UC3) were used for experiments on cell proliferation and invasion. To confer stable overexpression, a lentiviral vector was used for gene delivery [23,24]. e lncRNA was PCR-amplified from a complementary deoxyribonucleic acid library. PCR products and plasmid were purified. When PTC-UC3 cells were 50% confluent, they were infected with overexpression vector by Lipofectamine 2000 (Invitrogen) for 24 h at 37°C. en, the virus-containing medium was replaced with fresh complete medium. e PTC-UC3 cells were incubated for a further 48 h. Cell growth status was deciphered by MTT assays. e absorption of the solution was measured at 490 nm at various time points (0 h, 24 h, 48 h, and 73 h). Cell invasion assay was evaluated using Transwell inserts with Matrigel-coated membrane matrix. e cell migration rate was calculated.

Network of lncRNA-mRNA-miRNA.
To investigate the interaction between diagnostic lncRNAs and mRNAs and diagnostic miRNAs and mRNAs, the ceRNA network of lncRNA-mRNA-miRNA was constructed. A total of 335 mRNAs were co-expressed with 11 diagnostic lncRNAs. e interaction network of diagnostic lncRNA-co-expressed mRNA is shown in Figure 4. In addition, 152 diagnostic miRNA-mRNA pairs (involving 6 miRNA and 130 mRNA) were identified in the targeted and negatively correlation analysis. e established regulatory network of diagnostic miRNA-targeted mRNA with negative correlation is shown in Figure 5. After taking the intersection, we obtained 95 common differentially expressed mRNAs between networks of diagnostic lncRNA-co-expressed mRNA and diagnostic miRNA-targeted mRNA. e ceRNA network of diagnostic lncRNA-common mRNA-diagnostic miRNA is shown in Figure 6. e ceRNA network contains 11 diagnostic lncRNAs, 6 diagnostic miRNAs, and 95 differentially expressed mRNAs. Several interaction pairs were identified,

Functional Enrichment of mRNAs.
e GO analysis results demonstrated that these targeted differentially expressed mRNAs were significantly involved in some biological processes (BP), such as cell adhesion (involved MMRN1) and endothelial cell migration (involved TGFBR1) (Figure 7).
e expression validation and diagnostic analysis were consistent with the informatics analysis.

Ex Vivo Validation of lncRNAs, miRNAs, and mRNAs.
e ex vivo experiment was used to further validate the expression of identified lncRNA (BLACAT1), miRNAs (hsa-International Journal of Endocrinology miR-146b-3p and hsa-miR-214-3p), and mRNA (MMRN1) in 7 PTC patients. Clinical information of 7 patients is presented in Table 1. e expression of BLACAT1 and hsa-miR-146b-3p was significantly upregulated, while the expression of hsa-miR-214-3p and MMRN1 was downregulated in PTC without significance (Figure 9). e expression trend of these molecules was in line with the bioinformatics analysis.

Function Analysis of ADD3-AS1.
According to the bioinformatics analysis and related literature report, we selected one of lncRNAs ADD3-AS1 for function analysis. e MTT assay showed that overexpression of ADD3-AS1 significantly inhibited cell proliferation at 24 h, 48 h, and 73 h ( Figure 10). e cell invasion result indicated that overexpression of ADD3-AS1 decreased the PTC-UC3 cells' invasion capacity compared with the normal control ( Figure 11). ese results demonstrated that overexpression of ADD3-AS1 inhibited the proliferation and invasion of PTC-UC3 cells.
LBX2-AS1 is upregulated in PTC and TC [37]. MIR100HG is downregulated in pediatric medulloblastoma [31]. In gastric cancer, downregulation of MIR100HG inhibits cell proliferation and migration [40]. In triple-negative breast cancer, reduced expression of MIR100HG leads to a reduction in cell proliferation [41]. It is reported that MIR100HG is significantly associated with overall survival of PTC patients [42]. We found that LBX2-AS1 and MIR100HG were, respectively, upregulated and downregulated in PTC tumor. In addition, cadherin-related family member 3 (CDHR3) was co-expressed with LBX2-AS1 and MIR100HG. Moreover, CDHR3 was targeted by hsa-miR-34a-5p. It is indicated that the epigenetic regulation between LBX2-AS1, MIR100HG, hsa-miR-34a-5p, and CDHR3 may provide a novel field in understanding the pathological mechanism of PTC.
ADD3-AS1 is downregulated in PTC and TC [35,43]. In PTC, hsa-miR-9-5p is downregulated and plays an important role by targeting B-Raf proto-oncogene and serine/ threonine kinase (BRAF) [44]. Hsa-miR-9-5p is downregulated in anaplastic TC, suggesting its role as a potential tumor suppressor in anaplastic TC [45]. We found that ADD3-AS1 and hsa-miR-9-5p were downregulated in PTC, Figure 6: e ceRNA network of diagnostic lncRNA-common mRNA-diagnostic miRNA in PTC after taking the intersection of the interaction network of diagnostic lncRNA-co-expressed mRNA and diagnostic miRNA-mRNA pairs. e rhombus, rectangle, and ellipse represent the diagnostic lncRNAs, diagnostic miRNAs, and mRNAs, respectively; the pink and blue colors represent upregulation and downregulation, respectively; the black border represents the top 10 upregulated and top 10 downregulated lncRNA/miRNA/mRNA. 8 International Journal of Endocrinology which agrees with previous studies. Moreover, upregulated protein tyrosine phosphatase receptor type E (PTPRE) was co-expressed with ADD3-AS1 and targeted by hsa-miR-9-5p. Compared with benign PTC, PTPRE is upregulated in malignant PTC [46]. In the function analysis, we found that overexpression of ADD3-AS1 inhibited the proliferation and invasion of PTC-UC3 cells, which suggested the important role of ADD3-AS1 in the tumorigenesis of PTC. In addition, our study implied that the interaction pairs of ADD3-AS1-PTPRE-hsa-miR-9-5p play a crucial role in PTC.
It is revealed that hsa-miR-214-3p is decreased in hepatocellular carcinoma and non-small-cell lung cancer [47,48]. hsa-miR-214-3p is involved in PTC [37]. We found that the expression of hsa-miR-214-3p decreased in PTC. In addition, upregulated transforming growth factor beta receptor 1 (TGFBR1) was co-expressed with ADD3-AS1 and one of the targets of hsa-miR-214-3p. e expression of TGFBR1 is upregulated in PTC and TC [29,49]. In PTC, SLC35F2 expedites the proliferation and migration of tumor cells by targeting TGFBR1 [50]. Our finding indicated that the ceRNA network of ADD3-AS1-TGFBR1-hsa-miR-214-3p could be involved in PTC proliferation and migration.
Up to now, no association between LOC339059 and cancer has been found in previous reports. In PTC, downregulated hsa-miR-199b-5p can predict tumor aggressiveness and is a diagnostic blood marker of PTC [53,54]. In follicular TC, hsa-miR-199b-5p is   downregulated, indicating its potential to promote malignant transformation and an important diagnostic tool [55]. In medullary TC, downregulated hsa-miR-199b-5p may function as a carcinogen or tumor suppressor [56]. Herein, we found that LOC339059 and hsa-miR-199b-5p were downregulated and had a remarkably diagnostic value for PTC patients. In addition, upregulated serine/threonine kinase 32A (STK32 A) was co-expressed with LOC339059 and targeted by hsa-miR-199b-5p. e expression of STK32 A has been found in familial nonmedullary TC and PTC [43,57]. is suggested that LOC339059-STK32A-hsa-miR-199b-5p played a crucial role in PTC.
Besides above lncRNAs, the remaining 3 downregulated diagnostic lncRNAs (MORC2-AS1, FAM95C, and FAM181A-AS1) may play roles in PTC. MORC2-AS1, FAM95C, and FAM181A-AS1 are downregulated in TC [35,37,43]. In addition, we found that MMRN1 and TGFBR1 were, respectively, involved in cell adhesion and endothelial cell migration according to the GO functional analysis. Some neural cell adhesion molecules are expressed  International Journal of Endocrinology normally in thyroid follicular cells. It is found that various pathways, such as cell adhesion molecules, were increased in PTC [58]. e endothelial cell migration is a crucial prerequisite for tumor angiogenesis. It has been demonstrated that endothelial cell migration is involved in TC [59-61].

Data Availability
All data used to support the study are included within the article.

Conflicts of Interest
e authors declare that there are no conflicts of interest regarding the publication of this paper.