Functional genomics of AP-2α and AP-2γ in cancers: in silico study

Among all causes of death, cancer is the most prevalent and is only outpaced by cardiovascular diseases. Molecular theory of carcinogenesis states that apoptosis and proliferation are regulated by groups of tumor suppressors or oncogenes. Transcription factors are example of proteins comprising representatives of both cancer-related groups. Exemplary family of transcription factors which exhibits dualism of function is Activating enhancer-binding Protein 2 (AP-2). Scientific reports concerning their function in carcinogenesis depend on particular family member and/or tumor type which proves the issue to be unsolved. Therefore, the present study examines role of the best-described AP-2 representatives, AP-2α and AP-2γ, through ontological analysis of their target genes and investigation what processes are differentially regulated in 21 cancers using samples deposited in Genomic Data Analysis Center (GDAC) Firehose. Expression data with clinical annotation was collected from TCGA-dedicated repository GDAC Firehose. Transcription factor targets were obtained from Gene Transcription Regulation Database (GTRD), TRANScription FACtor database (TRANSFAC) and Transcriptional Regulatory Relationships Unraveled by Sentence-based Text mining (TRRUST). Monocle3 R package was used for global samples profiling while Protein ANalysis THrough Evolutionary Relationships (PANTHER) tool was used to perform gene ontology analysis. With RNA-seq data and Monocle3 or PANTHER tools we outlined differences in many processes and signaling pathways, separating tumor from normal tissues or tumors from each other. Unexpectedly, a number of alterations in basal-like breast cancer were identified that distinguished it from other subtypes, which could bring future clinical benefits. Our findings indicate that while the AP-2α/γ role remains ambiguous, their activity is based on processes that underlie the cancer hallmarks and their expression could have potential in diagnosis of selected tumors.

factors (TFs) [9]. Of these, probably the best known is Guardian of the Genome (p53 protein), whose correct activity requires that all of its subunits are not mutated and that no dominant negative effect occur [10]. In addition, the members of the Activating enhancer-binding Protein 2 (AP-2) family also demonstrate ambiguity; of these, AP-2α exhibits dualism of function depending on its related signaling pathway [11,12]. Another AP-2 protein, AP-2γ, was thought to have oncogenic activity since its transcriptional activity is inhibited by WW Domain Containing Oxidoreductase (WWOX) tumor suppressor after its sequestration outside the nucleus [13]. However, the other reports indicating that AP-2γ induces expression of p21 protein suggest that it may also demonstrate anti-tumoral traits [14], presenting contrasting function which prompts consideration.
The entire AP-2 family fits within the basic Helix-Span-Helix (bHSH) superclass and comprises the five members (AP-2α, AP-2β, AP-2γ, AP-2δ, AP-2ε) encoded by TFAP2A-E genes [15]. The activation domain is located in the amino terminus, whereas the sequences responsible for dimerization and binding of DNA are on the side of carboxyl terminus [16]. All members recognize specific G/C-rich evolutionarily conserved motifs i.e. GCCN3/4GGC, GCCN3/4GGG or CCC CAG GC [17,18] while the binding of transcription factors' themselves is dictated by a proline-rich motif located in the activation domain (excluding AP-2δ, which lacks these critical residues) [15]. Cytogenetically, AP-2 family members are encoded on the sixth chromosome, except AP-2γ and AP-2ε which are on the twentieth and first chromosomes, respectively [16]. When functioning correctly, the AP-2 transcription factors regulate appropriate gene expression during early developmental processes such as face, eye or limb development [19]. It was suggested that in the case of mutation, the loss of TF activity of AP-2 members can lead to the impairment of proliferation, differentiation and apoptosis processes [16], suggesting AP-2 activity may play role in development. Indeed, both AP-2α and AP-2γ have prognostic value for some tumor types. AP-2α overexpression is recognized as a prognostic indicator of shorter patient survival in papillary thyroid carcinoma and epithelial ovarian cancer but longer survival in gastric adenocarcinoma [20][21][22]. In terms of AP-2γ, its overexpression was also associated with poorer overall survival in breast cancer patients [23] with the other literature data to support this statement and enrich it with correlation presenting worse anti-hormone therapy response [24]. The other case of chemoresistance (to 5-fluorouracil) was shown in colorectal cancer upregulating AP-2γ [25] while endometrial cancer example demonstrated that knockdown of this AP-2 member sensitizes cells to megestrol acetate via Estrogen receptor alpha (ERα) expression upregulation [26].
Literature data indicates that the activity of AP-2 in carcinogenesis is very much dependent on specific family members. The most recent scientific reports concern the first three transcription factors, while the least is known about AP-2δ or AP-2ε. The best described factors, AP-2α and AP-2γ, play distinct roles depending on tumor type, as noted in a recent review [15]; however; this summary has since been enriched with additional tumor models not available at that time. These findings confirm that AP-2α activity is influenced by tumor tissue type: while it was found to demonstrate oncogenic activity in cervical, gallbladder and ovarian cancer [27][28][29], it has been also seen to act as a suppressor in many other tumors [11,[30][31][32][33][34][35]. AP-2γ has been conclusively demonstrated to play an oncogenic role in breast cancer [36] (which coincides with previous evaluation) and furthermore expanded with its unfavorable characteristics in colorectal cancer [25]. The last representative of the AP-2 family abundantly described in the literature is beta member. Latest data regarding contribution in cancer indicates that AP-2β overexpression has been found to promote tumor growth in both breast and thyroid cancer and predicted poor prognosis or tumor progression, respectively [37,38]. Finally, although little is known of AP-2δ and AP-2ε, the former is thought to be associated with progression and genomic instability of prostate cancer while the latter acts as a tumor suppressor in neuroblastoma [39,40].
The precise functionality of AP-2 factors in cancer clearly remains unknown. Therefore, the aim of the present study was to clarify the role of transcription factors AP-2α and AP-2γ in carcinogenesis using samples (RNAseq) acquired from 21 cancers of The Cancer Genome Atlas (TCGA) and Monocle3 or Protein Analysis Through Evolutionary Relationships (PANTHER) bioinformatics tools. The study identifies the processes which are differentially regulated between the studied cancer types, compare the findings with those of normal tissue and identifies whether any differences exist in the expression of target genes for these factors in specific types of tumors.

Acquisition of tumor patients' data
RNA-seq expression data with corresponding clinical annotation was collected from 21 tumors (Table 1) of TCGA-dedicated GDAC Firehose Repository (level 3 RNA-seqV2, RSEM normalized, data status of 28th Jan 2016 available at https ://gdac.broad insti tute.org/). Patients that lacked expression or clinical data were discarded from the study. Available normal, paired solid tissues were additionally retrieved through R-dedicated package TCGA-Assembler [41].

Global profiling of the tumors and ontological annotation
Phenotype heterogeneity between selected tumors accompanied by normal tissues was studied and visualized by applying the UMAP method preceded by principle component analysis (PCA) regarding the expression of the targets of both AP-2α and AP-2γ using the Monocle3 R package (https ://cole-trapn elllab.githu b.io/monoc le3/) [44]. The analyses of AP-2α and AP-2γ targets were performed as two separate entities accordingly. The PCA pre-processing step (preprocess_cds()) was done with the dimensionality of the reduced space of 100 (num_dim). Reduction of dimensions (reduce_dimension()), and clustering of the individuals (cluster_cells()) within spaces were applied with the UMAP algorithm for dimensionality reduction method upon which to base clustering (reduction_ method). The clusters of individuals were compared with graph_test() function based on Moran's I spatial autocorrelation analysis with knn neighbor graph and q-value threshold of 0.05. Furthermore, the genes varying across the clusters selected at the previous step were grouped into modules through Louvain community analysis (find_gene_modules()) with parameters set to default. The modules were clustered in two ways: 1) clustering involving all of the individuals enabling to compare between tumors and tumor vs non-tumor samples of specific cancer type and 2) clustering of the modules restricted to the BRCA foci enabling to differentiate between PAM50 subtypes, separately for AP-2α and AP-2γ targets. Finally, the results were visualized with pheatmap() and clustered with the Ward D2 method. The whole pipeline was performed according to the Monocle3 tutorial (https ://cole-trapn ell-lab. githu b.io/monoc le3/). To annotate the findings in the context of the biological processes in which they are involved, overrepresentation test available at PAN-THER Classification System (http://www.panth erdb. org/) with Fischer's Exact Test and Bonferroni correction for multiple testing were performed. The methodology is graphically summarized in Fig. 1.

Identification of global differences using toolkit for spatial analysis
Considered cohorts were clustered according to designed variable discriminating tumors and their corresponding normal tissues (marked with an additional prefix "n_"). Their distribution across Uniform Manifold Approximation and Projection (UMAP) dimensions was presented according to the genetic targets for a given transcription factor (AP-2α or AP-2γ). The analysis based on AP-2α identified a group of cancers distinct from other separated clusters but to a lesser extent from the corresponding normal tissue: LIHC, KIRP, KIRC, PRAD, SKCM, OV, THCA, BRCA, STAD, LUAD and PAAD. In some cases, one cancer type coincided in the plot with another (forming a "cluster") e.g. GBM and LGG, COAD and READ. Surprisingly, the remaining tumor reservoir (BLCA, CESC, UCEC, ESCA, HNSC, LUSC) formed a heterogeneous mixed cluster together with their normal tissues, albeit with deviations for ESCA and LUSC. The same tendencies as of AP-2α were observed in terms of AP-2γ targets, although with minor differences-these are collectively presented in Fig. 2.
In the next stage, distribution in the UMAP1 and UMAP2 dimensions was further analyzed with regard to differences in expression of the transcription factor itself. Global similarities were observed in the case of both AP-2α or AP-2γ expression; however, some exceptions were observed e.g. AP-2α expression was higher in SKCM while AP-2γ was higher in THCA ( Fig. 3a, b, respectively). Evident differences in expression were observed between kidney tumors (KIRC, KIRP) and normal kidney tissue; this is not surprising in the case of AP-2α which has been suggested as a biomarker in renal carcinoma [45,46] however our findings suggest that AP-2γ may have the same properties.
Furthermore, due to noticeable separation of breast cancer samples visible for both target gene lists (Fig. 2), possession of extraordinary tumor profiling signature (Prosigna Breast Cancer Prognostic Gene Signature Assay-PAM50) prompted us to perform additional analyses to assess the subtypes' separation by TFs target genes. For both AP-2α or AP-2γ, differences were observed in the PAM50 classifiers of breast cancer indicating that the basal-like subtype was different from the others (Fig. 4a, b, respectively).
Lastly, heatmaps were generated to arrange the transcription factor target genes into modules with common expression profile, indicating how these modules differ between clusters; this approach allowed more specific differences between individual cancers, the tumor and the corresponding normal tissue or intrinsic subtypes to be identified (Figs. 5, 6). The content of modules from particular analysis type has been summarized in the Additional file 1. At first glance, signaling by the AP-2 targets in tumor tissue is congruent to signaling in the corresponding normal tissue for PAAD, LIHC, THCA and SKCM (in the case of AP-2α) or for PAAD, THCA and SKCM (in the case of AP-2γ). However, the analysis of signaling through TF targets in the remaining clusters indicated a large number of modules that could be used to distinguish the tumor from normal tissue (the latter taken as a reference), or the basal-like BRCA subtype from the luminal A/B and HER2-enriched subtypes. The proposed modules taken for further comparisons between tumor and normal tissue or between different tumor tissues are summarized in Table 2 (AP-2α) and Table 3 (AP-2γ). In the case of differences between BRCA subtypes, the modules 6, 12 and 13 were taken for subsequent consideration for the AP-2α target gene list while 5, 13, 19 modules for AP-2γ.
The differences between tumors were then identified from a heterogeneous mixed cluster. Using modules 2, 11, 19 with target genes for AP-2α, the entire selection was divided into two different "communities"-LUSC, ESCA, HNSC and UCEC, BLCA, CESC. The same tendency could be explained using module 17 of AP-2γ targets with additional dissimilarities in UCEC versus LUSC or ESCA versus BLCA which can be explained by the inverse of module 15 or 6, respectively.

Gene ontology of modules distinguishing specific comparisons
Analyzing a multitude of comparisons in the form of modules diversifying groups (from the previous section) led to the extraction of genes for their subsequent ontological analysis. The greatest attention was paid to modules showing the opposite tendency in the context of expression of genes contained therein. As previously, the following distinctions were made: between the cancer and the corresponding normal tissue, between similar tumors or between PAM50 classifiers for breast cancer. Kołat et al. BMC Med Genomics (2020) 13:174

Comparisons by means of target genes for AP-2α transcription factor
Differences between tumors and normal tissues indicated changes in processes (e.g. apoptosis, proliferation, cell migration, cell cycle, cytoskeleton organization, cell adhesion or angiogenesis) and signal transduction pathways (e.g. ERBB, Wnt, MAPK and Notch) which are visualized in Fig. 7. In regards to alterations between different types of tumors, they were approximate to above mentioned however additional processes such as regulation of cell shape were also noticeable (Fig. 8). Further data such as specific genes that described a particular module during ontological analysis, as well as statistical significance, are reviewed in Additional file 2.

Comparisons by means of target genes for AP-2γ transcription factor
In the case of second AP-2 family representative, tumor versus normal tissue changes also affected processes (e.g. autophagy, adhesion, vesicle budding, extracellular matrix organization, apoptosis, cell growth and spreading) and signaling pathways (e.g. TNF, EGFR, Wnt, JAK-STAT, mTOR, NFkB and TGFβ) which is summarized in Fig. 9. Regarding variation between tumors, no additional processes other than those mentioned above were identified, even though different modules were used (i.e. module 6) (Fig. 10). The specific genes that characterizing individual modules through ontological analysis, together with their statistical significance, are recapitulated in Additional file 3.

Comparisons of breast cancer intrinsic PAM50 subtypes
Pursuing the scheme of previous subsections, gene ontology of modules distinguishing basal-like BRCA signature from other subtypes indicated differences e.g.
in apoptosis, cytoskeleton organization, autophagy, but also in EGFR, TNF, Wnt or cadherin signaling pathway. The results are summarized in Fig. 11 and Additional file 4.

Validation of the Results
Validation of the findings based on independent adequate cohorts for all considered tumors was not performed due to lack of relevant expression data.

Discussion
Carcinogenesis is shaped by both tumor suppressors and oncogenes where one group of crucial proteins that demonstrate such duality of function are transcription factors. Available literature indicates that the character of AP-2 factors depends on the particular family representative or considered tumor type. Assurance definitely applies to fact that AP-2α and AP-2γ have been described to the greatest extent compared to other family members. Unlike previous papers, the present study examines the activities of the target genes of AP-2α and AP-2γ in more than 20 solid tumors from the TCGA database. The observed alterations in the expression of transcription factors and their effect on target genes identified processes that can potentially distinguish the tumor from its corresponding normal tissue or from other tumor types.
Regarding the first type of comparisons, a multitude of differences were observed; these allowed the identification of more than one module that could distinguish tumor tissue from corresponding normal tissue, with ontological descriptors concerning both biological processes and molecular pathways. A few modules from AP-2α target genes list explained the differences for several tissues at once: modules 2, 11 and 12 respectively differentiated CESC and UCEC, HNSC and KIRC, as well as COAD, ESCA and STAD from their corresponding normal tissues. Module 2 has been described as implicated in cell cycle regulation as well as the apoptotic pathway, with the cell cycle being differentially regulated between cervical cancer versus solid normal tissue samples [47] or uterine corpus endometrial carcinoma versus normal controls [48]. Following, in module 11 the processes involving e.g. MAPK activity was noticed which could support previous observations that MAPK may play a role in HNSC and KIRC carcinogenesis [49,50]. Lastly, module 12 was annotated with processes involved in cell cycle or cell adhesion. Differences in cell cycle regulation were previously found between tumor tissue and paired normal samples of colon adenocarcinoma [51], while cell adhesion processes were found to distinguish tumor samples from normal tissue in esophageal carcinoma [52] or gastric cancer [53].
The remaining modules selected for gene ontology analysis indicated differences in PRAD, LUSC, BLCA, GBM, READ, LUAD, BRCA and KIRP between tumors and their matched solid normal tissues. Module 18 shown alterations in apoptosis, the ERBB pathway and regulation of cytoskeleton organization between PRAD and its paired control which is consistent with previous findings for programmed cell death [54] and ERBB signaling [55]. The other pathway shown was Wnt, describing lung squamous cell carcinoma versus normal tissue differences with module 15 which corresponded to literature data on cellular levels of this protein [56]. Furthermore, "regulation of cell cycle" or "response to drug" descriptors of module 13 explained bladder cancer versus normal bladder tissue; this is consistent with current scientific reports where the drug resistance is dependent on the JUND representative of the Jun family [57,58], which was a part of AP-2α target genes list forming module 13 along with two other Jun members. Module 10 concerns the misfolding of proteins which appeared to be crucial mechanism of lipid storage or increased cellular proliferation and varied between glioblastoma multiforme and corresponding normal tissue [59]. Module 8 revealed  differences in intrinsic apoptosis, among others, between rectum adenocarcinoma and controls as noted previously [60]; the same signaling pathway was annotated in module 6 explaining lung adenocarcinoma versus normal lung, although with an additional cell adhesion descriptor which might be related to changes in adhesive properties during epithelial-to-mesenchymal transition, as confirmed via AP-2α-promoted tumorigenicity of LUAD [61]. Finally, kidney renal papillary and breast invasive carcinomas were distinguished from their paired normal tissues using module 1 and 3, respectively. Among the others, signaling pathways such as Wnt, Notch (for KIRP) or ERBB (for BRCA) were implicated in explanation which is coherent with other research [62,63].
Since the magnitude of the results for AP-2γ is greater than in the case of AP-2α, obtained differences were confronted with the literature only in terms of signaling pathways. Differences in certain pathways, such as TNF or ERBB (for liver hepatocellular carcinoma and prostate adenocarcinoma), EGFR or Wnt (for lung squamous cell carcinoma), JAK-STAT (for kidney renal clear cell carcinoma), NFkB (for breast invasive carcinoma) and TGFβ, Fig. 7 Gene ontology of selected modules differentiating tumor and corresponding tissue by means of AP-2α targets mTOR or NFkB (for stomach adenocarcinoma) were noted between tumor and normal tissue. Of these, TNF for LIHC, NFkB for BRCA and TGFβ for STAD have been identified previously [64][65][66].
Both AP-2α and AP-2γ demonstrated differences associated with signal transduction pathways or biological processes in tumor versus tumor comparisons. Regarding first transcription factor, many of the modules generated through Monocle3 analysis can distinguish KIRC from KIRP; in this, ontological analysis identified changes in processes e.g. cell adhesion, proliferation, angiogenesis, cell migration, cell cycle or pathways i.e. Notch, Wnt, ERBB signaling (Fig. 8). Differences in adhesion or cell cycle have been noted between two renal cell carcinomas, showing specific gene expression to be lower in KIRP [67]. Increased expression of AP-2α targets genes from module 13 (ontologically indicating regulation of cell cycle, proliferation or development) in KIRC compared to KIRP was also noticeable in LGG versus GBM comparison although with inverse tendency (Fig. 8).
Considering the cell cycle differences, the alteration frequencies indicate on average higher percentage of pathway disruption in GBM (86%) compared to LGG (46%) [68]. The cell adhesion and intrinsic apoptotic signaling pathway (module 6) remained on trend in both KIRC versus KIRP and LUSC versus LUAD, implying that lung carcinomas differ not only in terms of adhesion or apoptosis but also regarding cell migration or Wnt signaling pathways, as demonstrated by module 15 (Fig. 8). This is consistent with a previous comparison of genes linked to adhesion and migration that were found to be differentially expressed between LUSC and LUAD [69]. Of the two types of lung cancer, LUSC was also found within a mixed cluster (Fig. 2a) created by more similar tumors (LUSC, ESCA, HNSC, UCEC, BLCA, CESC). Nevertheless, the tumors could be distinguished using modules 2, 11 and 19 which included genes related to cell adhesion, regulation of cell cycle arrest and inactivation of MAPK activity.
Concerning differences outlined through AP-2γ targets, identical subgroups could be distinguished within the mixed tumor cluster (Fig. 2b) formed by the same six cancer types described above-distinction is explained by module 17 that post-ontologically shown differences in Fig. 9 Gene ontology of selected modules differentiating tumor and corresponding tissue by means of AP-2γ targets cell adhesion (Fig. 10), which is concordant with the analysis of the same tumors through AP-2α targets (Fig. 2a). Additionally, module 6 or 15 distinguished two subdivisions inside the cluster i.e. ESCA versus BLCA or UCEC versus LUSC, respectively. Using ontology descriptors, changes concerned regulation of mTOR signaling and extrinsic apoptosis for module 6 or EGFR and Wnt pathways for module 15 (Fig. 10). In regards to the latter, the alteration frequencies indicate on average lower percentage of Wnt pathway disruption in LUSC (18%) compared to UCEC (47%) [68]. Other comparisons applied to COAD versus READ explained by vesicle budding (module 9) or LGG versus GBM elucidated through mTOR signaling, apoptosis and cellular adhesion (module 5) with the latter confirmed by another bioinformatics analysis [70]. Lastly, numerous modules distinguished KIRC from KIRP or LUSC from LUAD with module 12 (response to hypoxia, regulation of autophagy, adhesiondependent cell spreading) differentiating both. Referring the literature, renal cell carcinomas have already been differentiated in terms of hypoxia indicating specific gene expression to be higher in KIRC [67]. Other modules for KIRC versus KIRP comparison have been associated with processes such as autophagy (module 3), hormone-mediated signaling pathway (module 4) and cell growth regulation or NFkB signaling (module 16). For LUSC versus LUAD, they differ according to genes associated with Wnt pathway (module 10), regulation of cytoskeleton organization (module 7) or EGFR signaling (module 15). The last mentioned might be supported by a threefold difference in the frequency of EGFR mutations between LUSC and LUAD during a previous pan-cancer analysis [71].
A separate topic was the division of breast cancer subtypes noticed during the global analysis of all cancer cohorts in relation to their corresponding normal tissues. Both target gene lists i.e. for AP-2α and AP-2γ, have separated two subgroups of BRCA samples. Additional spatial analysis and assignment of PAM50 signatures to samples revealed that the distinguished group was a basal-like subtype (Fig. 4). Due to the fact that this subtype is the most aggressive form of breast  [72], finding potential differences from the others could support clinical aspects in the future. Ontological analysis was performed on modules whose constituent genes demonstrated opposite expression profiles (basal-like vs. luminal A/B or HER2-enriched). Mutual part indicated through target genes lists of both AP-2α and AP-2γ shown that the processes differentiating the basal-like subtype concern regulation of programmed cell death or EGFR and TNF signaling pathways (Fig. 11), with the latter observed during previous enrichment analysis [72]. Nevertheless, the cadherin and Wnt signaling pathways play a key role in subtype differentiation due to the number of genes described ontologically in module 5 (for AP-2γ targets' comparisons), as noted previously [73].
On the whole, the analysis presented in this research indicate that many differentiating processes or pathways between tumor and normal tissue or between different tumor types/subtypes find their reference in current scientific reports. However, some of the obtained alterations not having regard in the literature, could serve as a novel preliminary point for the development of new anti-cancer therapies.

Conclusions
Collectively, our research indicates both novel and previously described differences between tumors or between a specific tumor and normal matched solid tissue. In addition, an unplanned division of samples from the breast cancer cohort was performed, differentiating the basal-like subtype from the others. These findings could have the clinical applications.
Analyses also proved that the regulation of gene expression by AP-2α and AP-2γ is very complex and involves many biological processes and signaling pathways. The role of AP-2 in carcinogenesis is ambiguous, but it is associated with the regulation of processes that underlie cancer hallmarks (proliferation, apoptosis, angiogenesis, cell growth), and the target genes of both transcription factors allow for further analysis of changes that occur in individual tumors. Finally, the differences in AP-2α and AP-2γ expression observed between tissue types (e.g. THCA and SKCM or KIRC/ KIRP vs. normal kidney tissue) suggest their potential usefulness in diagnosing and treating specific cancers.
Nonetheless, we believe that our findings regarding AP-2α and AP-2γ functional genomics sheds new light on specific comparisons yet it is entirely possible that subsequent discoveries through in-depth data mining could have a further impact on the future of medical practice.