Network-Based Genetic Profiling Reveals Cellular Pathway Differences Between Follicular Thyroid Carcinoma and Follicular Thyroid Adenoma

Molecular mechanisms underlying the pathogenesis and progression of malignant thyroid cancers, such as follicular thyroid carcinomas (FTCs), and how these differ from benign thyroid lesions, are poorly understood. In this study, we employed network-based integrative analyses of FTC and benign follicular thyroid adenoma (FTA) lesion transcriptomes to identify key genes and pathways that differ between them. We first analysed a microarray gene expression dataset (Gene Expression Omnibus GSE82208, n = 52) obtained from FTC and FTA tissues to identify differentially expressed genes (DEGs). Pathway analyses of these DEGs were then performed using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) resources to identify potentially important pathways, and protein-protein interactions (PPIs) were examined to identify pathway hub genes. Our data analysis identified 598 DEGs, 133 genes with higher and 465 genes with lower expression in FTCs. We identified four significant pathways (one carbon pool by folate, p53 signalling, progesterone-mediated oocyte maturation signalling, and cell cycle pathways) connected to DEGs with high FTC expression; eight pathways were connected to DEGs with lower relative FTC expression. Ten GO groups were significantly connected with FTC-high expression DEGs and 80 with low-FTC expression DEGs. PPI analysis then identified 12 potential hub genes based on degree and betweenness centrality; namely, TOP2A, JUN, EGFR, CDK1, FOS, CDKN3, EZH2, TYMS, PBK, CDH1, UBE2C, and CCNB2. Moreover, transcription factors (TFs) were identified that may underlie gene expression differences observed between FTC and FTA, including FOXC1, GATA2, YY1, FOXL1, E2F1, NFIC, SRF, TFAP2A, HINFP, and CREB1. We also identified microRNA (miRNAs) that may also affect transcript levels of DEGs; these included hsa-mir-335-5p, -26b-5p, -124-3p, -16-5p, -192-5p, -1-3p, -17-5p, -92a-3p, -215-5p, and -20a-5p. Thus, our study identified DEGs, molecular pathways, TFs, and miRNAs that reflect molecular mechanisms that differ between FTC and benign FTA. Given the general similarities of these lesions and common tissue origin, some of these differences may reflect malignant progression potential, and include useful candidate biomarkers for FTC and identifying factors important for FTC pathogenesis.


Introduction
Thyroid cancers are the most common type of endocrine malignancy, although they have a relatively low mortality rate compared to most other common metastatic diseases. The United States had 56,460 new diagnoses of thyroid cancer and 1780 related deaths reported in 2012 [1]. The incidence of thyroid cancers is also rising globally at about 5% per year, although some of this increase may be due to improved detection, and it notably affects those in the 20 to 34 year age range [2]. Thyroid cancers include several major types, such as papillary thyroid carcinomas, medullary thyroid carcinoma, anaplastic thyroid carcinoma, and follicular thyroid carcinomas (FTCs) [3]; FTC is one of the more aggressive types, although it accounts for a minority (14%) of total thyroid cancers [4][5][6].
The causes and cellular processes underlying FTC and controlling these tumours' behaviours are poorly understood; accordingly, these cancers have few effective treatment options [7]. There is therefore a great need to understand the mechanisms that drive development and progression in FTC to identify new approaches to detection, estimate the risk of progression, and find new therapies. In addition, differential diagnosis of FTC is problematic as it can be difficult to distinguish from follicular thyroid adenoma (FTA), a benign and non-invasive lesion. Accordingly, there is more focus on molecular markers that distinguish FTC and FTA (and other types of thyroid lesions). Thus, Wojtas et al. conducted a gene expression comparison of FTC and FTA lesions which identified potential markers that can distinguish FTC from FTA with a sensitivity and specificity of 78% and 80%, respectively [8]. We aim to also use this dataset to identify pathways with different levels of activity in benign and aggressive thyroid tumours (here, FTA and FTC) that may reflect important molecular mechanisms that underlie their behaviour.
For such pathway studies, our starting point is the differential expression of genes (DEGs) between these lesions. We can then use analytic tools to study these DEGs and discover pathways, and pathway hub genes that may affect cell functions and thereby underlie tumour morbidity, growth, and invasion [9][10][11][12]. The repertoire of candidate pathway factors can be extended using approaches such as gene ontology analysis and protein-protein interaction (PPI) studies.
To identify such DEGs, microarray gene expression profiling is widely used [13,14], and several such studies have been performed on thyroid cancer subtypes [15]. For example, Huang et al. [16] studied gene expression profiles of thyroid cancers, although not functional interactions between gene products. To better understand the underlying molecular mechanisms and identify functionally critical biomolecules, integrative analysis within a gene network context is needed [17][18][19][20]. That may lead to candidate FTC biomarkers (the focus of the work by Wojtas et al.), but our main interest here is to obtain key or hub genes distinguishing malignant FTC from benign FTA, as these may be potential therapeutic targets. Such a systems biology approach integrating statistical network and topological analyses of experimental datasets can thus clarify disease mechanisms that are difficult to identify by other approaches [21]. Thus, we used a systems approach (Figure 1) to identify FTC molecular signatures at the miRNA and mRNA levels (and derived from these, the protein-protein interactions) that distinguish FTC tissue from FTA. These analyses should be more meaningful than a comparison between FTC with normal thyroid tissue, as the the two tumour types are similar, while markedly differing in their potential for invasive metastasis. Figure 1. The multi-stage analysis methodologies employed in this study. Gene expression datasets related to follicular thyroid carcinoma (FTC) and follicular thyroid adenoma (FTA) tissues were collected from the NCBI Gene Expression Omnibus (NCBI-GEO) database and statistically analysed using GEO2R to identify differential expression of genes (DEGs). Four types of functional enrichment analyses of DEGs were then performed to identify significantly enriched pathways. Thus, we constructed protein-protein interaction networks around DEGs topological analyses to identify putative pathways hub proteins, identified possible micro-RNA (miRNA) and transcription factor (TF) interactors, and used Gene Ontology annotation terms to provide pathways enrichment. TF and miRNA studies employed JASPAR and miRTarbase databases, respectively. DEGs were integrated with those networks, and higher degree and the betweenness centrality were used to designate TFs and miRNAs as the reporter transcriptional regulatory elements. The target DEGs of reporter miRNAs and TFs were subjected to pathway enrichment analyses.

Materials and Methods
In this study, the multi-step analysis method we developed and applied is shown in Figure 1. We statistically analysed gene expression datasets to identify the DEGs and their regulatory patterns. We employed these DEGs to identify enriched pathways, biological processes, and annotation terms (i.e., Gene Ontology terms) by using functional enrichment methods. Then, to identify reporter biomolecules, we integrated the intermediate analysis results with biomolecular networks.

Dataset Employed and Statistical Methods Used
We obtained the gene expression data of FTC and FTA (GSE82208) for our study from the NCBI Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo/) [8,22]. This dataset contains analyses of RNA from frozen tumour tissue specimens from 27 FTC and 25 FTA lesions using Affymetrix human genome U133 (Plus 2.0) arrays. The lesions were diagnosed histologically, many with a second diagnosis to confirm, when paraffin embedded material was available [8]. Gene expression analysis using microarrays is a widely used method to develop and refine the molecular determinants of disorders affecting humans. Using these methods, we analysed the gene expression profiles of FTC and FTA. To identify the DEGs between FTC samples and FTA, we ran t-tests using the Limma package [23]. To identify the up-regulated genes, we used the conditions of p-value < 0.05 and logFC > 2 (FC, fold change), and for identifying the down-regulated genes, p-value < 0.05 and logFC < −2 were used. All identified up-regulated genes and down-regulation genes were considered DEGs. We applied the topological and neighbourhood-based benchmark methods to find gene-gene associations. A gene-gene network was constructed by using the gene-gene associations, where the nodes in the network represent gene [18,24]. This network can also be characterised as a bipartite graph. These topological and neighbourhood-based benchmark methods were adopted from our previous studies [12].
The common neighbours are based on the Jaccard Coefficient method, where the edge prediction score for the node pair is as [25]: where G is the set of nodes and E is the set of all edges. We used R software packages "comoR" [9] and "POGO" [12] to cross-check the genes-diseases associations.

Functional Enrichment of Gene Sets
We performed gene ontology and pathway analysis on identified up-regulation genes and down-regulation genes using DAVID bioinformatics resources (https://david-d.ncifcrf.gov/) (version v6.8) [26] to obtain further insight into the molecular pathways that differ between FTC and FTA. In these analyses, GO and KEGG pathway databases were used as annotation sources. Enrichment results showing adjusted p-values < 0.05 were considered significant.

Construction and Analysis of Protein-Protein Interaction (PPI) Subnetworks
The PPI network was first constructed with the DEGs and analysed using STRING [27], a web-based visualisation software resource. The constructed PPI network was represented as an undirected graph, where nodes represent the proteins and the edges represent the interactions between the proteins. To construct the PPI network from the STRING database (http://string-db.org) [27], we used database data, data mined from PubMed abstract text. Co-expression, gene fusion, and neighbourhood were active interaction sources and a combined score that was greater than 0.4 was set as the level of significance. The PPI network was then visualised and analysed using Cytoscape (v3.5.1) [28,29]. A topological analysis was applied to identify highly connected proteins (i.e., hub proteins) by using the Cyto-Hubba plugin [30] where betweenness centrality and higher degree were employed. The top three modules (i.e., the three most highly interconnected protein clusters) in the PPI subnetwork were identified using the MCODE plug-in [30]. These modules were further analysed and characterised using enrichment analyses by NetworkAnalyst [31]. The KEGG pathway enrichment analysis of the PPI networks involving DEGs were performed by NetworkAnalyst [31].

Identifying TFs and miRNAs that Influence the Expression of Candidate Genes
To identify TFs and miRNAs that affect transcript levels around which significant changes occur at the transcriptional level, we obtained experimentally verified TF-target genes from the JASPAR database [32] and miRNA-target gene interactions from TarBase [33] and miRTarBase [34] by using NetworkAnalyst tools [31] where betweenness centrality and higher degree filters were used. At the time, there were many techniques to measure the topological properties. We used the degree centrality (DC) and betweenness centrality (BC) to find out network's topological properties. We can define the DC of a node v in a network as the total number of nodes which are directly connected to node v in that network. The definition can also be written as follows [35]: whereas n represents total number of nodes in the network, and a vj represents that the node v and the node j are directly connected. In the case of betweenness centrality (BC), the total number of times of node v appearing in the shortest path between other nodes is quantified. It is also defined as follows [35]: where σ ij = total number of shortest paths from node i to node j, and σ ivj = total number of paths through node v.

Results of DEG Analyses
Transcriptomic Signatures: Differentially Expressed Genes FTA and FTC tissue gene expression patterns were analysed using oligonucleotide microarrays from the NCBI GEO (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE82208) [8]. Gene expression profiling was performed in 27 malignant FTCs and 25 FTAs: 598 genes were differentially expressed (p < 0.05, > 1.0 log2 fold change) relative to FTAs, of which 465 genes were significantly lower expression and 133 genes were higher expression levels in FTC lesions.

Pathway and Functional Correlation Analysis
By combining large scale and state of the art transcriptome and proteome analysis, we performed a regulatory analysis to gain further insight into the molecular pathways associated with the FTC and predicted links to pathways that differ relative to the benign FTAs. DEGs and pathways were analysed using KEGG pathway database (http://www.genome.jp/kegg/pathway.html) and functional annotation tool DAVID version 6.8 (http://niaid.abcc.ncifcrf.gov) to identify overrepresented pathway groups, and it was observed that four pathways were associated with DEGs with higher expression in FTC compared to FTA; namely, the "one carbon pool by folate" pathway, the p53 signalling pathway, the cell cycle, and the progesterone-mediated oocyte maturation signalling pathway. Fold enrichment, adjusted p-values, and genes associated with these pathways are presented in Table 1 (a). Early growth response 2 Reporter Transcription Factor hsa-mir-548c-3p MicroRNA 548c Reporter microRNA hsa-mir-335-5p MicroRNA 335 Reporter microRNA hsa-mir-26b-5p MicroRNA 26b Reporter microRNA hsa-mir-155-5p MicroRNA 155 Reporter microRNA hsa-mir-124-3p MicroRNA 124 Reporter microRNA hsa-mir-1-3p MicroRNA 1 Reporter microRNA hsa-mir-16-5p MicroRNA 16 Reporter microRNA hsa-mir-1-1 MicroRNA 1 Reporter microRNA hsa-mir-192-5p MicroRNA 192 Reporter microRNA hsa-mir-215-5p MicroRNA 215 Reporter microRNA It was also observed that eight pathways were significantly over-represented. Mineral absorption, thyroid hormone synthesis, pathways in cancer, oestrogen signalling, sphingolipid metabolism, protein processing in the endoplasmic reticulum, axon guidance, and focal adhesion pathways (Table 1 (a)) were identified. These are associated with the DEGs expressed at lower levels in FTC lesions compared to FTA. We then performed GO analysis using DAVID to obtain further insight into the molecular roles and biological function of DEGs identified in this study. From this analysis, nine GO groups were associated with DEGs that were more highly expressed in FTC (see Table 2 (a)). Reflecting their greater number, the genes with lower expression in FTC compared to FTA were associated with 80 GO groups; the 10 most significant GO groups are presented in Table 2 (b). Fold enrichment, adjusted p-values, and genes associated with these ontologies are also indicated. Table 2. (a) Gene ontologies associated with genes with higher expression in FTC. (b) Gene ontologies associated with genes with lower expression in FTC compared to FTA. The 10 most significant biological functional ontologies are indicated and are those associated with these significant genes. Fold enrichment, adjusted p-values and genes associated with these ontologies are presented in these tables.  RNA polymerase II promoter   FGFR2, GLIS3, TSHZ3, ZMYND11, CAV1,  TFCP2L1, ZNF366, PRDM16, CALR, CBX7,  CITED2, MINA, TCF4, BHLHE41, NFIL3,  NR2F1, TXNIP, EGR1, NR4A2, TLE1, FOSB,  SNAI2, FOXP2, CD36, BTG2, ID1, SALL1,  FOXE1, LRP8, ID3, PRDM1, SMARCA2 A PPI network was constructed using the DEGs identified in this study (Figure 2A) by using the STRING package. Topological analyses using STRING and further analysis by Cytoscape's Cyto-Hubba plugin identified the most significant hub proteins, which were identified as gene products of TOP2A, JUN, EGFR, CDK1, FOS, CDKN3, EZH2, TYMS, PBK, CDH1, UBE2C, and CCNB2. The simplified PPI network for the most significant hub genes was constructed by using Cyto-Hubba plugin and is shown in Figure 2B.

A.
B.

Enrichment Analysis of the Modules Found in PPI of DEG
The PPI network was analysed by using MCODE plug-in in Cytoscape (version 3.5.1), and top three modules were selected ( Figure 3). The enrichment analysis of the top three modules was analysed by using DAVID. Module 1 represented a set of biological process significantly enriched for ATP binding, cell division, and cell proliferation; the associated enriched cellular components were the kinetochore, the condensed chromosome kinetochore (both of which relate to cell division), and nucleoplasm and cytoplasm, which are very general categories. Molecular functions significantly enriched in Module 1 included chromatin binding and protein kinase activity (Table 3 (a)). Module 2 showed significantly enriched biological process positive regulation of protein phosphorylation and protein autophosphorylation; these significantly enriched cellular components included membrane raft, endosome, apical plasma membrane, and focal adhesion. Molecular functions of Module 2 were enriched in protein tyrosine kinase activity, enzyme binding, protein binding, and integrin binding (Table 3 (b)).  Module 3's gene ontology annotation showed a significant enrichment for protein ubiquitination, with significantly enriched cellular components being ubiquitin ligase complex and SCF ubiquitin ligase complex, which have involvement in proteosomal and autophagy functions. Module 3's molecular functions showed significantly enrichment in ubiquitin-protein transferase activity and zinc ion binding (Table 3 (b)). It is important to note that pathway enrichment analyses found Modules 1 and 2 were significantly enriched in protein binding pathways, which was not seen in Module 3 (Table 3 (b)).

Transcriptional Regulatory Network Construction and TF Enrichment Analysis
The construction of reporter TFs and DEGs interaction network by NetworkAnalyst revealed a number of potentially important TFs selected by the topological analysis using a dual metric approach involving degree and betweenness (Figure 4). The top 10 ranked TFs with the highest degree and betweenness centrality included FOXC1, GATA2, YY1, FOXL1, E2F1, NFIC, SRF, TFAP2A, HINFP, and CREB1. Pathway enrichment analysis of the TFs that differ between FTC and FTA and DEGs mainly identified as statistically significant, a number of relevant pathways which included transcriptional misregulation in cancer, acute myeloid leukaemia, pathways in cancer, thyroid cancer, cell cycle, dorso-ventral axis formation, arrhythmogenic right ventricular cardiomyopathy (ARVC), cocaine addiction, and prostate cancer pathways (Table 4 (b)).

Discussion
To clarify the molecular mechanisms underlying the pathological features (especially malignancy potential) that distinguish malignant and benign thyroid lesions, we examined differential gene expression between FTC and FTA. Such a comparison is likely to be more informative than making such a comparison of malignant and normal tissues, which would contain many more differences that relate to neoplastic cell development. In addition, the lesions contain significant non-neoplastic cell components (such as mesenchymally-derived and endothelial cells) that are also likely to differ markedly from those of normal tissue.
The DEGs identified from the FTC/FTA comparison were used to identify possible regulatory patterns, key molecular pathways, and protein-protein interactions among these pathway gene products. Such an analytical approach can be used to find molecular signatures that may serve either as potential therapeutic targets or as biomarkers to differentially diagnose or identify FTC. As did the original study by Wojtas et al. [8], we identified significantly altered genes that may be candidate biomarkers that distinguish FTC. However, another important use of such data (and our main focus here) is to identify and characterise biological functions associated with these genes that may give insights into the biology and behaviour of FTC lesions themselves. Wang et al. [36] examined on normal tissue vs. FTC and normal tissue vs. FTA in the microarray dataset accession number GSE27155. They identified some important pathways that included drug metabolism cytochrome p450, viral carcinogenesis, tyrosine metabolism, cytokine-cytokine receptor interaction, and cocaine Addition as significant pathways distinguishing normal from FTC tissue. Our analysis of the FTA vs. FTC tissue data found several important pathways, as described in the above results section. However, there were no pathways that we identified (in our FTC/FCA comparison) which were also seen in the studies comparing FTC to normal tissues. This almost certainly reflects the fact that a tumour has very different cellular profiles compared to normal tissue (e.g., infiltrating leukocytes and stromal cells) so in this circumstance, where we want to better understand the behaviour of the tumour, it is be better to compare with a benign or non-aggressive tumour such as FTA. Our study thus identified a number of important pathways that were differentially expressed in FTC, some of which would be expected by the malignant profile of FTC compared to FTA. For example, the p53 signalling pathway plays an important role in cancer cell apoptosis and DNA repair by causing cycle arrest in response to DNA damage [37], and has been previously implicated in thyroid cancers [38]. Cell cycle progression is regulated and facilitated by cyclin-dependent kinases that are activated by cyclins such as cyclin D1. Indeed, cyclin D1 levels can influence tumour progression, and may have prognostic significance in FTC [39]. Related to this, it has also been reported that the one-carbon metabolism pathway is actively involved in cancer progression, probably due to its involvement in nucleotide synthesis [40]. We have summarised the significant GO groups associated with the DEGs, as shown in Table 2 .
PPI network reconstruction and the analysis of reconstructed PPI network represent a powerful approach for understanding disease mechanisms, so to construct a PPI network for the DEGs in our study, we combined results of statistical analyses with the protein interactome network. To identify potential hub proteins, topological analysis strategies were employed. This method identified 12 hub genes (Table 1 (b)), including a number of genes commonly associated with cancers (including thyroid cancers), such as EGRF, JUN, FOS, CDK1, and other cyclin pathway genes, E-cadherin (CDH1) and E2F1. The hub protein EGFR is linked to growth in many types of cancer, and cardiovascular diseases, and has been previously linked to thyroid cancers [41]. JUN and FOS are best characterised as subunits of the AP-1 transcription factor that has a crucial role in many types of cell differentiation and inflammation processes [42]. GO databases also delineated particular roles for JUN in angiogenesis and the regulation of endothelial cell proliferation, both of which have central roles in metastasis and invasion (Table 2 (b)). CDK1 is an important cell cycle regulator and is involved in breast, lung, and ovary carcinomas [43]. CDK1 overexpression has been documented in lung cancer, lymphoma, and advanced melanoma, while loss of cytoplasmic CDK1 predicts poor patient survival and may confer chemotherapeutic resistance in the latter [44]. Cyclin B1 (in the same family as CDK1) overexpression and/or mislocalisation has been described in several primary cancers, including thyroid carcinoma and colon, gastric, prostate, breast, and non small-cell lung cancers [44]. It is also actively involved in mitotic cell cycle phase transition (Table 2 (a)). CDKN3 encodes a cyclin inhibitor (regulating cell cycle) and has been described as being overexpressed in lung adenocarcinoma (ADC), squamous cell carcinoma (SCC), hepatocellular carcinoma, cervical cancer, and epithelial ovarian cancer [45]. Over-expression of EZH2 is frequently observed in many cancer types [46]. TYMS expression is also associated with the risk of development of epithelial cancers and lymphoma cancer [47]. In addition, in hereditary diffuse gastric cancer (HDGC), the hub gene CDH1 mutations are connected with an increased incidence of lobular carcinoma of the breast, and possibly, prostate cancer and colorectal carcinoma [48]. This gene is also involved in extracellular matrix organisation (Table 2 (b)). Hub gene CCNB2 is known to be underexpressed in thyroid cell tumours [49]. We found that CDK1 and CCNB2 hubs genes were actively associated with the p53 signalling, progesterone-mediated oocyte maturation, and cell cycle pathways. The hub gene TYMS is also involved in one carbon pool by folate pathway; FOS, JUN, and CDH1 genes are involved in pathways in cancer; and JUN and FOS genes are involved in the oestrogen signalling pathway. A number of other hub genes have been described as associated with malignant thyroid cancers, including EXH2, UBCH10,TFAP2A, TOP2A, and SRF [50][51][52][53][54]. By considering the possible roles of these hub proteins in pathogenesis of FTC and other related diseases, new roles for these proteins may be identified. In addition to the above-mentioned hub genes, we also identified a number of factors not previously noted as having a role in FTC or thyroid cancers, including UBA52, GATA2, FOXL1, and NFIC.
As the regulation of gene expression is controlled by TFs and miRNAs at post-transcriptional and/or transcriptional levels, changes in these molecules may provide crucial information regarding the dysregulation of gene expression in FTC. Thus, we investigated TFs and their relationship to DEGs in these tumours, including FOXC1. One prior study found that FOXC1 is strongly associated with thyroid cancers [55]. The overexpression of YY1 in differentiated thyroid cancers has also been noted [56]. The pathway analysis of these TFs showed statistically significance in FTC. From the DEGs, a miRNA interaction network of miRNAs (Table 1 (b)) was also identified and analysed. Our pathway analysis of these miRNA also showed statistically significant relationships with FTC. Notably, the target DEGs of these miRNAs and TFs included the p53 signalling pathway, PPAR signalling pathway, dorso-ventral axis formation pathway, focal adhesion pathway (Table 4 (a)), pathways in cancer, transcriptional misregulation in cancer, and thyroid cancer pathway. The miRNA species we identified that have previously been shown to have a role in thyroid cancer, including hsa-mir-335-5p, hsa-124-3p, hsa-mir-17-5p, and hsa-mir-20a-5p [57][58][59][60]. However, six other miRNAs we identified have not previously been associated with thyroid lesions. miRNA species have wide ranging effects on gene expression which are indirect, so further analysis for these are needed.
We analysed the dataset originally generated by Wojtas et al., who used combined micro-array methods and literature meta-analysis to achieve this to find markers to distinguish FTC and FTA. The gene markers they identified included TFF3, CPQ (previously characterised FTC markers), PLVAP, and ACVRL1, all genes expressed in the thyroid, although ACVR1 is an activin receptor with wide expression. These latter genes were not identified by our analytical approach, but it should be noted that our study used very different analytic tools, focussing on identifying pathways and on finding evidence for gene functions using PPI and other resources. Our analytical approach was somewhat similar to that of Wang et al. [36] who examined an older thyroid cancer dataset (GSE27155) containing 10 FTA, 14 FTC, and 4 normal thyroid tissue samples plus those of other types of thyroid lesions. While we employed a number of additional analytical tools (some developed previously by us), there were a number of significant genes and pathways that were also found in the work by Wang et al. [36] where they compared FTC and FTA. These notably included the AP-1 TF and its components (e.g., JUN and FOS), BCL2, factors relating to DNA damage, and cyclin-related factors that regulate the cell cycle. As noted above, We identified no genes that Wang et al. identified when comparing FTC and normal tissues. Recent studies by Shang et al. and by Liang and Sun [61] examined datasets generated from papillary thyroid cancer lesions that were compared with normal tissues. DEGs identified in these studies were analysed by methods related to our approach [35]. However, unlike the Wang et al. study cited above, very few of the hub genes these studies identified were found in our analysis, BCL2 being the main exception. This lack of concordance with the FTC studies may reflect the different tumour types (and lower malignancy rates) represented by papillary thyroid cancer, and it might be useful to compare all these datasets in further detail to determine how these types of tumour differ. We also note that work by Pfeifer et al. [62] developed a classification using gene expression data that considered only five genes, and this distinguished FTC from FTA with good accuracy. They identified five genes (ELMO1, EMCN, ITIH5, KCNAB1, and SLCO2A1) as biomarkers. However, it should be emphasised that the focus of our work and methodology is different (since we studied pathway differences not gene classifiers), so not surprisingly, there were no significant genes identified in common between our work and Pfeifer et al.

Conclusions
Our data-driven approach has uncovered a number of significant molecular mechanisms that may underlie the pathogenic differences seen between FTC and FTA. In this study we used integrated bioinformatic analyses to study gene expression profiles in FTC and FTA, and used this information to identify candidate hub proteins that could play significant roles in FTC development. Our results included a number candidate genes and pathways that indicate the directions for future experimental work needed to clarify the roles of these cellular factors. This study identified gene networks that advance our understanding of FTC pathogenesis and indicates new avenues to develop therapies for FTC.

Abbreviations
The following abbreviations are used in this manuscript: