Pancancer Analyses Reveal Genomics and Clinical Characteristics of the SETDB1 in Human Tumors

Background. Malignant tumors are among the most common diseases that seriously affect human health. The prior literature has reported the biological function and potential therapeutic targets of SET domain bifurcated histone lysine methyltransferase 1 (SETDB1) as an oncogene. However, SETDB1 has rarely been analyzed from a pan-cancer perspective. Methods. Bioinformatics analysis tools and databases, including GeneCards, National Center for Biotechnology Information (NCBI), UniProt, Illustrator for Biological Sequences (IBS), Human Protein Atlas (HPA), GEPIA, TIMER2, Sangerbox 3.0, UALCAN, Kaplan-Meier (K-M) plotter, cBioPortal, Catalogue Of Somatic Mutations In Cancer (COSMIC), PhosphoSitePlus, TISIDB, STRING, and GeneMANIA, were utilized to clarify the biological functions and clinical significance of SETDB1 from a pan-cancer perspective. Results. In this study, the pan-cancer analysis demonstrated that SETDB1 was significantly differentially expressed between tumor tissues and paracancerous tissues in most cancers, and SETDB1 expression was associated with clinicopathological features and clinical prognosis. We also found that SETDB1 mutations occurred in most tumors and were related to tumorigenesis. In addition, DNA methylation of SETDB1 primarily occurred at the cg10444928 site and was associated with prognosis in several human tumors. The predicted phosphorylation site of SETDB1 was Ser1006. We found that SETDB1 was significantly related to specific tumor-infiltrating immune cell populations and to the expression of clinically targetable immune checkpoints and may be a promising immunotherapy target. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses also indicated that SETDB1 may function as a crucial regulator in the carcinogenesis of human cancers. Conclusions. SETDB1 is an important oncogene involved in tumorigenesis and tumor progression through different biological mechanisms. Furthermore, SETDB1 may be a potential therapeutic target for cancer treatment.


Introduction
Malignant tumor is becoming a major disease endangering human health [1]. At present, there are no curative strategies for malignant tumors. Antitumor therapies, including radical surgical resection, radiofrequency ablation, transplantation, chemotherapy, immunotherapy, and targeted therapy, have been developed [2]. However, the overall survival (OS) for patients with cancer, especially pancreatic adenocarcinoma (PAAD), lung adenocarcinoma (LUAD), and breast invasive carcinoma (BRCA), remains low because of the complexity and heterogeneity of tumorigenesis [3][4][5].
Germline mutations caused by abnormal activation and expression of oncogenes have also been confirmed as major inducements of tumorigenesis [6]. The investigation of epigenetic changes, expression levels, potential molecular basis, and clinical significance of oncogene can help understand the mechanisms of tumorigenesis and improve the treatment of various cancers.
SET domain bifurcated histone lysine methyltransferase 1 (SETDB1) protein, also known as ERG-associated protein with SET domain (ESET), KG1T, KMT1E, TDRD21, and H3-K9-HMTase4, is a member of the SET family involved in chromatin gene silencing, chromatin remodeling, transcriptional suppression, and histone methylation in cells [7,8]. The SET family also acts in epigenetic regulation, changing gene expression and function without altering the DNA sequence. The SET family is a significant regulator of tumorigenesis and is important for tumor-targeted therapy [9,10]. SETDB1 was first reported in 1999, and a growing number of studies have found that SETDB1 is significantly related to human tumorigenesis and immune cell functions [11][12][13]. SETDB1 is also a well-known histone H3 lysine 9 (H3K9) methyltransferase that is associated with methylation in various euchromatic regions, which causes gene silencing [12]. Therefore, it is important to conduct a comprehensive genomic analysis of SETDB1 and to explore its relationship with clinical outcome and its potential as a target for oncotherapy in human malignant tumors.
In this study, for the first time, we conducted a structure/function pan-cancer analysis of SETDB1 based on several online databases to explore its oncogenic role and clinical significance in various cancers.

Material and Methods
2.1. Omics Analysis of SETDB1. Firstly, we acquired the chromosome localization, coding sequence (CDS), and exon counts of SETDB1 based on the GeneCards database (https://www.genecards.org/). Subsequently, the biological information of the SETDB1 gene and its six encoded protein isoforms was obtained from the "gene" and "protein" modules of the National Center for Biotechnology Information (NCBI) (https://www.ncbi.nlm.nih.gov/), with its 3D (three-dimensional) protein structure explored in the UniProt database (https://www.uniprot.org/). In addition, the CDS in nucleotide sequence and conserved domains in amino acid sequence were visualized using Illustrator for Biological Sequences (IBS, version 1.0) [14] (http://ibs.biocuckoo.org/). The position of conserved domains of histone-lysine N-methyltransferase SETDB1 isoform 1 protein was obtained from the "HomoloGene" of NCBI. Conserved amino acid sequences encoded by SETDB1 and the phylogenetic tree of the SETDB1 family were explored by the Constraint-based Multiple Alignment Tool (https://www.ncbi.nlm.nih.gov/tools/cobalt/) in the NCBI. Finally, the distribution of SETDB1 protein was obtained from the Human Protein Atlas (HPA) (https://www.proteinatlas.org/) database.

Clinicopathological Features Analysis. The "Pathological Stage Plot" module of GEPIA2 was applied to assess the relationship between the SETDB1 gene expression and cancer stage based on TCGA. P < 0.05 was set as the significance threshold. We also used the UALCAN database to explore the relationship between the SETDB1 mRNA expression level and clinicopathological stage, including the within-stage correlation. Sangerbox 3.0 was applied to confirm the connection between the SETDB1 mRNA transcription level and other clinicopathological features, including TNM classification and clinicopathological grade.

Survival Analysis. GEPIA2 also provides comprehensive prognostic analysis, and the target gene was used as input for survival analysis in various human cancers. In this study, we first used the "Survival Map" module of GEPIA2 to obtain the overall survival (OS) and disease-free survival (DFS) significance maps of SETDB1 among all tumors from TCGA datasets. The patients with cancer were divided into high- and low-expression subgroups according to the median expression level of SETDB1. Subsequently, the "survival analysis" module of GEPIA2 was used to draw the Kaplan-Meier (K-M) curves for patients with cancer, and P < 0.05 was set as the significance threshold. Furthermore, Cox analyses based on the Sangerbox 3.0 database were performed for disease-specific survival (DSS) and progression-free interval (PFI) of SETDB1 across samples of various cancers, and the results were displayed in a forest plot. Then, we explored the prognostic value of SETDB1 in ovarian cancer, liver hepatocellular carcinoma (LIHC), LUAD, and BRCA based on the K-M plotter [17] (http://kmplot.com/analysis/). The survival datasets were derived from TCGA and Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/geo/).
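The median split and K-M estimation described above can be illustrated with a minimal sketch. It is not the GEPIA2 implementation: the function names, the toy cohort, and the choice to assign samples exactly at the median to the low group are assumptions made here for illustration only.

```python
from statistics import median


def split_by_median(expression):
    """Label each sample 'high' or 'low' relative to the cohort median.

    Assumption: samples exactly at the median are placed in the 'low' group.
    """
    cutoff = median(expression)
    return ["high" if value > cutoff else "low" for value in expression]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with an observed event.

    times  -- follow-up time per patient
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving this time point
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy cohort (hypothetical values, not TCGA data)
expression = [2.1, 5.4, 3.3, 8.0, 1.2, 6.7]
times = [30, 12, 25, 8, 40, 15]
events = [0, 1, 1, 1, 0, 1]

groups = split_by_median(expression)
high = [(t, e) for t, e, g in zip(times, events, groups) if g == "high"]
low = [(t, e) for t, e, g in zip(times, events, groups) if g == "low"]
km_high = kaplan_meier([t for t, _ in high], [e for _, e in high])
km_low = kaplan_meier([t for t, _ in low], [e for _, e in low])
```

In practice the two curves would then be compared with a log-rank test against the P < 0.05 threshold; that step is omitted here to keep the sketch short.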

Genetic Alteration Analysis. A total of 32 studies containing 10967 samples were selected from the "TCGA Pan-Cancer Atlas Studies" module of the cBioPortal (https://www.cbioportal.org) online database. The genetic alteration levels of SETDB1 were further explored using 110443 samples with mutation data. The overall set of samples with SETDB1 genetic alterations across TCGA tumors was generated from the "OncoPrint" module of cBioPortal. According to the "Cancer Types Summary" module of cBioPortal, the alteration frequency, number of genetic mutations, type of SETDB1 mutations, and copy number variation (CNV) in each tumor type were analyzed. The mutated site information of SETDB1 was shown in the amino acid sequence containing conserved domain sites and in the 3D structure by the "Mutations" module. The Sangerbox 3.0 online service was used to comprehensively analyze the mutational landscape of SETDB1. The GSCA database (http://bioinfo.life.hust.edu.cn/GSCA/#/) was used to analyze the CNV percentage in each cancer and the relationship between SETDB1 expression and CNV. The Catalogue Of Somatic Mutations In Cancer (COSMIC) (https://cancer.sanger.ac.uk/cosmic), a comprehensive alteration analysis database, was also used to explore the mutation of SETDB1 in human cancers.
2.6. Methylation and Protein Phosphorylation Analysis. We first assessed the difference in SETDB1 promoter methylation between tumor tissues and normal tissues with the UALCAN online database. Furthermore, the relationship between SETDB1 expression and RNA modification-related genes was explored using the Sangerbox 3.0 online service. The MethSurv [18] (https://biit.cs.ut.ee/methsurv/) online database was used to obtain the relative methylation level of single CpGs of SETDB1 and their prognostic value. The MethSurv database is specifically designed to compare the relative level of a single CpG and perform multivariable survival analysis using DNA methylation data. The prognostic value of single CpGs of SETDB1 in 25 cancers was also assessed using the "all cancers" and "single CpG" modules of the MethSurv database. Subsequently, PhosphoSitePlus (version 6.6.0.2, https://www.phosphosite.org/) was used to investigate the protein phosphorylation sites of SETDB1 in the amino acid sequence. The UALCAN online database was also used to compare SETDB1 phosphorylation levels between tumor tissues and paracancerous tissues. The protein phosphorylation data were sourced from the Clinical Proteomic Tumor Analysis Consortium (CPTAC) (https://proteomics.cancer.gov/programs/cptac) and covered BRCA, glioblastoma multiforme (GBM), PAAD, head and neck squamous cell carcinoma (HNSC), and LUAD.
2.7. Immune and Molecular Subtype Analysis, Immune Infiltration Analysis, and Immune Checkpoint Inhibitor-Related Gene Analysis of SETDB1. We then queried the TISIDB [19] (http://cis.hku.hk/TISIDB/index.php) database with "SETDB1" to assess the association of SETDB1 expression status with immune and molecular subtypes in various human cancers. The "Immune-Gene" module of TIMER2, which is specifically designed to analyze immune infiltration across all TCGA cancers, was applied to explore the association between SETDB1 expression and tumor-related immune cell infiltration levels. SETDB1 expression was related to the abundance of tumor-infiltrating cells, including cancer-associated fibroblasts (CAFs), CD8+ T cells, CD4+ T cells, regulatory T cells (Tregs), and B cells. Scatterplots were used to present the correlation between SETDB1 mRNA expression and the abundance of infiltrating CAFs. The TISIDB database was also used to analyze the association of SETDB1 expression with immune checkpoint inhibitor-related genes, including immunoinhibitors and immunostimulators. Part of the results with significant correlation was presented as scatterplots. To identify potential groups that may benefit from immunotherapy, we used radar charts based on Sangerbox 3.0 to display the association of SETDB1 expression with microsatellite instability (MSI) and tumor mutational burden (TMB) in various cancers.
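As an illustration of how an expression-infiltration association of this kind can be quantified, the sketch below computes a Spearman rank correlation in plain Python. This is an assumption for illustration: the text above does not state which statistic each web tool reports, the purity adjustment applied by such tools is not reproduced, and the function names and sample values are hypothetical.

```python
def rank(values):
    """Return 1-based average ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))


# Hypothetical per-sample values: SETDB1 expression and an estimated CAF score
setdb1_expression = [1.0, 2.5, 3.1, 4.8, 6.0]
caf_score = [0.10, 0.15, 0.30, 0.28, 0.50]
rho = spearman(setdb1_expression, caf_score)
```

A rank-based statistic is used here because infiltration estimates are rarely normally distributed; a positive rho corresponds to the "positive correlation" wording used for the scatterplots.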

Function and Pathway Analysis. The STRING (https://string-db.org/) database was used to explore the target gene-binding proteins by searching the protein name "SETDB1" and the organism type "Homo sapiens." By setting the parameters of STRING, the experimentally determined SETDB1-binding proteins were obtained. Using the GeneMANIA (http://genemania.org/) online database, we predicted the possible function of SETDB1-related genes according to their association with genes with assigned biological functions. To clarify the functions of the target genes, the Gene Ontology (GO) functional enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of SETDB1-related genes were performed in R (version 3.6.3, https://www.r-project.org/). Furthermore, the HALLMARK terms were analyzed by Sangerbox 3.0. In order to construct the mRNA-miRNA-lncRNA network, we first predicted the miRNAs targeting SETDB1 based on TargetScanHuman (Release 7.2, March 2018, http://www.targetscan.org/), mirDIP (http://ophid.utoronto.ca/mirDIP/), and miRWalk (http://mirwalk.umm.uni-heidelberg.de/). Then, the complementary sequences of SETDB1 and the miRNAs targeting SETDB1 were displayed using the TargetScanHuman database. Finally, the lncRNAs targeting these miRNAs were predicted by the LncBase Predicted v2 module of DIANA tools (http://carolina.imis.athena-innovation.gr/diana_tools/web/index.php?r=site/index).
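Enrichment analyses such as the GO and KEGG steps above are commonly based on an over-representation test. The sketch below shows a one-sided hypergeometric test in Python as an assumed stand-in: the text states only that the analysis was run in R 3.6.3, so the choice of test, the function name, and the toy numbers are illustrative assumptions rather than the authors' procedure.

```python
from math import comb


def hypergeom_enrichment_p(background, term_size, selected, overlap):
    """One-sided over-representation p-value, P(X >= overlap).

    background -- number of annotated genes in the universe
    term_size  -- number of background genes annotated to the term
    selected   -- number of genes in the query list
    overlap    -- number of query genes annotated to the term
    """
    upper = min(term_size, selected)
    total = comb(background, selected)
    tail = sum(
        comb(term_size, k) * comb(background - term_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 20 background genes, 5 annotated to a term,
# 4 query genes of which 2 carry the annotation
p_value = hypergeom_enrichment_p(background=20, term_size=5, selected=4, overlap=2)
```

When many terms are tested at once, as in a full GO or KEGG run, the resulting p-values would normally be corrected for multiple testing before terms are reported as enriched.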
2.9. Drug Sensitivity and Resistance Analysis. The drug sensitivity analysis and resistance analysis were performed based on the Genomics of Drug Sensitivity in Cancer (GDSC) (https://www.cancerrxgene.org/) database, and the volcano plot was displayed. The IC50 values of Ara-G and Bleomycin (50 μM) for SETDB1 mutation were analyzed. GSCALite (http://bioinfo.life.hust.edu.cn/web/GSCALite/) is a comprehensive web-based analysis platform for gene set cancer analysis and drug sensitivity analysis. The DrugBank (https://www.drugbank.com/) database was used to explore the chemical formula and structural formula for Ara-G and Bleomycin (50 μM).
2.10. Validating Expression of SETDB1 by IHC. Six pairs of paraffin-embedded digestive system tumors, including LIHC, CHOL, COAD, ESCA, PAAD, and STAD, and corresponding adjacent tissues were collected in the Shulan (Hangzhou) Hospital. Collected tissues were embedded in paraffin and sliced into 4 μm sections, then baked in an oven at 65°C for 2 hours, and hydrated. These tissues were incubated with a 1:25 dilution of anti-SETDB1 monoclonal antibody (catalog number: KHC0067). After incubation with the anti-rabbit secondary antibody (ORIGENE) at room temperature for 1 h, diaminobenzidine (DAB) was used to reveal the color of antibody staining. Finally, the stained sections were observed under the microscope.

(Figure 2(a)). The CDS of SETDB1 in nucleotide sequence was displayed (Figure 2(a)). SETDB1 encoded six protein isoforms, histone-lysine N-methyltransferase SETDB1 isoforms 1-6, which were mainly distributed in the nucleoplasm (Figure S1A). The mRNA, protein reference sequences (Refseq), and the conserved domains of SETDB1 were summarized (Table 1). Histone-lysine N-methyltransferase SETDB1 isoform 1 was the dominant isoform and is primarily responsible for histone-lysine N-methyltransferase SETDB1 functions. To better understand the biological function and structural information of histone-lysine N-methyltransferase SETDB1 protein isoform 1, structure-function analysis was conducted, and the protein domains, regions, and nucleotide compositional bias were displayed. As shown in Figure 2(b), the protein structure of SETDB1 consists of six domains, five regions, one coiled coil, and ten nucleotide compositional biases. For multiple species, SETDB1 contains six domains, including two Tudors (cl02573), MBD (cl00110), pre-SET (cl02622), SET (cl02566), post-SET, and SEEEED (cl19208) (Figure 2(d)). The pre-SET, SET, and post-SET domains are required for methyltransferase activity. Additionally, the protein 3D structure was displayed (Figure 2(c)).
According to the NCBI online database, the SETDB1 protein was conserved in different species, such as chimpanzee, rhesus monkey, dog, cow, mouse, rat, chicken, zebrafish, and frog (Figure S1B). The phylogenetic tree of SETDB1 protein was produced using the fast minimum evolution method, and it presented the evolutionary relationship among different species (Figure S1C). We also found that SETDB1 protein was mainly localized in the nucleoplasm of A-431 (human epithelial carcinoma cell line), U-2 OS (human osteosarcoma cells), and U-251 MG (human brain glioblastoma astrocytoma cancer cells) and in the vesicles of U-2 OS cell lines (Figure S1D).

Gene Expression Analysis of SETDB1. We first confirmed that SETDB1 was widely expressed in human normal and tumor tissues (Figure S2A and Table S1). Then, the SETDB1 mRNA expression levels were compared in nontumor tissues based on the HPA and GTEx databases (Figure 3(a) and Figure S2B). The bar charts showed that SETDB1 had the highest expression level in the testis, followed by the thymus, tonsil, spleen, and lymph node, indicating that SETDB1 was mainly expressed in the bone marrow and lymphoid tissues. The SETDB1 expression level was high in most normal tissues, indicating the low tissue specificity of SETDB1 mRNA expression. Additionally, the IHC and H&E staining results of the top five normal tissues in terms of SETDB1 expression level were displayed based on the HPA online database (Figures 3(b) and 3(c)). No data related to the IHC and H&E staining assessment of SETDB1 in thymus tissues were obtained. The expression levels of SETDB1 in different cell lines were assessed. The results from the HPA database showed that SETDB1 was significantly enriched in U-698, followed by BEWO, THP-1, NTERA-2, and SH-SY5Y (Figure 3(d)). In addition, SETDB1 single cell specificity is displayed in Figure S2C. The SETDB1 expression level was significantly higher in late spermatids, early spermatids, spermatocytes, oligodendrocytes, and microglial cells. Finally, the SETDB1 expression patterns in testis tissues were assessed using published RNA-sequencing data (Figure S2D).
To further explore the expression levels of SETDB1 in different tumor tissues and paracancerous tissues, we performed differential expression analysis using several online databases. In the TIMER database, the SETDB1 expression level was elevated in bladder urothelial carcinoma (BLCA), BRCA, CHOL, colon adenocarcinoma (COAD), esophageal carcinoma (ESCA), GBM, HNSC, KIRC, LIHC, LUAD, lung squamous cell carcinoma (LUSC), rectum adenocarcinoma (READ), stomach adenocarcinoma (STAD), thyroid carcinoma (THCA), and uterine corpus endometrial carcinoma (UCEC) (Figure 4(a)). However, compared with the SETDB1 expression level in paracancerous tissue, that in kidney chromophobe (KICH) was lower (Figure 4(a)). The mRNA expression levels of SETDB1 in HNSC-HPV+ tumor and SKCM-Metastasis tissues were higher than those in HNSC-HPV- tumor and SKCM primary tumor tissues (Figure 4(a)). These data are in agreement with the expression levels of SETDB1 in tumor tissues and paracancerous tissues in the Sangerbox and UALCAN databases (Figure S3). The expression of SETDB1 in SKCM was lower in patients with primary tumors than in patients with metastatic tumors, indicating that a high expression level of SETDB1 in SKCM may imply metastasis (Figure 4(a)). In the GEPIA2 online database, which combines the TCGA and GTEx datasets, the differential expression analysis showed that SETDB1 was highly expressed in thymoma (THYM), lower grade glioma (LGG), acute myeloid leukemia (LAML), GBM, lymphoid neoplasm diffuse large B-cell lymphoma (DLBC), and CHOL, while it was lowly expressed in THCA, prostate adenocarcinoma (PRAD), KICH, and adrenocortical carcinoma (ACC) and was not differentially expressed in other cancers (Figure 4(b)).
To acquire more comprehensive expression information, the HPA and UALCAN online databases were combined to assess the protein expression of SETDB1 in various cancers and normal tissues. In the HPA online dataset, SETDB1 protein was examined in 45 normal tissue samples. Among them, 15 samples showed a high expression score, nine samples exhibited a medium expression score, 11 samples exhibited a low expression score, and 10 samples had no expression score (Figure 5(a)). In tumor samples, most cancers were weakly stained or negative. Moderate to strong nuclear and cytoplasmic positivity was observed in several gliomas, lymphomas, melanomas, colorectal cancer, endometrial cancer, and testicular cancer (Figure 5(b)). Furthermore, classic IHC staining was performed. The results showed that SETDB1 had high expression levels in brain glioma and Hodgkin's lymphoma, medium levels in THCA and BLCA, and a low level in endometrium adenocarcinoma (Figure 5(c)). In the "CPTAC" module of UALCAN, we assessed the differences in protein expression of SETDB1 between various cancers and paracancerous tissues. Compared with SETDB1 protein expression levels in the adjacent normal tissues, those in BRCA (p = 2:
The results indicate that SETDB1 may be a protective factor in these tumors. Furthermore, for uveal melanoma (UVM) and TGCT patients, the SETDB1 expression levels in cancers at advanced clinicopathological stages were significantly lower than those in tumors at early stages, implying that decreased SETDB1 expression may indicate tumor progression in these patients (Figures 6(r) and 6(s)). These results demonstrated that abnormal expression of SETDB1 may be associated with the initiation and progression of various human cancers. We further investigated the association of SETDB1 expression with other clinicopathological features, such as TNM classification and clinicopathological grade, based on Sangerbox 3.0. SETDB1 expression in GBM, BRCA, ESCA, sarcoma (SARC), and PRAD was significantly associated with tumor T classification (Figure 6(t)). SETDB1 expression was significantly associated with N classification in LGG, LIHC, and READ and was related to M classification in LUAD, PAAD, and UVM (Figures 6(u) and 6(v)). Additionally, the clinicopathological grade of LUSC, uterine carcinosarcoma (UCS), and SARC was associated with SETDB1 expression (Figure 6(w)).

Survival Analysis of SETDB1. To explore the prognostic value of SETDB1 in various human cancers, we classified the cancer samples into high- and low-expression subgroups according to the median expression value of SETDB1. First, the GEPIA2 database was used to perform the OS and DFS analyses in pan-cancer cohorts. In terms of OS, higher expression of SETDB1 was associated with poorer clinical outcomes in ACC (p = 0.0055) and LIHC (p = 0.0290) (Figure 7(a)). The results also revealed a correlation between high SETDB1 expression levels and poor DFS in ACC (p = 3.9e-05), READ (p = 0.0180), and PRAD (p = 0.0150) (Figure 7(a)). These results indicate that abnormally expressed SETDB1 may be a prognostic indicator in these tumors.
(Figure 7(c)). We focused on the association between SETDB1 expression and the prognosis of breast cancer, ovarian cancer, lung cancer, and gastric cancer. In order to evaluate the prognostic abilities of SETDB1 in these cancers, independent clinical factors were selected as the subgroups (Table S2-S6).
3.5. Genetic Alteration Analysis of SETDB1. Malignant tumor is caused by genetic alterations, and mutated genes offer potential molecular therapeutic targets [20,21]. Given that SETDB1 genetic alterations were associated with molecular therapeutic targets for various human cancers, we investigated the genetic alteration levels of SETDB1 in various human cancers based on TCGA datasets. The results showed that SETDB1 was altered in 630 (6%) of 10439 cases (data from PanCancer Atlas and TCGA) (Figure 8(a)). We also found that missense mutation was the main type of SETDB1 mutation, followed by truncating mutation and splice mutation (Figure 8(a) and Figure S4A). Furthermore, the primary SNV class was C > T (29.81%), followed by G > A (21.12%), A > G (11.47%), and G > T (10.56%) (Figure S4B). By analyzing genetic alterations and expression, we found that SETDB1 genetic alterations induced a shift in the mRNA expression levels of SETDB1 in human tumors. However, few differences were observed for deep deletion (Figure S4C). Furthermore, amplification was the genetic alteration type in cholangiocarcinoma (CHOL) (8.99% of 523 cases), PCPG (3.37% of 178 cases), DLBC (7.5% of 440 cases), and THYM (0.81% of 123 cases) (Figure 8(b)). Missense mutation was the genetic alteration type in kidney renal papillary cell carcinoma (KIRP) (1.81% of 276 cases), LAML (0.5% of 200 cases), and THCA (0.2% of 490 cases) (Figure 8(b)). Additionally, the SETDB1 alteration frequency was the highest in patients with mixed endometrial carcinomas (15.09% of 517 cases), including 8.51% (44 cases) mutation and 6.58% (34 cases) amplification (Figure 8(b)). Structural variants and deep deletions were rare in human cancers and were identified in only five of the cancers included in this research, namely, LIHC, BRCA, and SKCM (structural variant) and ESCA and SARC (deep deletion) (Figure 7(b)). We used the "mutation" module of the cBioPortal database to investigate the type and site of SETDB1 mutation (NM_001145415/ENST00000271640) in each sample. The R1256W/L/Q mutation, a substitution of R (arginine) by W (tryptophan), L (leucine), or Q (glutamine), was observed in the SET conserved domain and occurred in one case of GBM (R1256W), one case of skin cutaneous melanoma (SKCM) (R1256L), one case of STAD (R1256W), and two cases of colorectal adenocarcinoma (R1256W and R1256Q). However, the function of the R1256W/L/Q mutation remains unknown (Figure 8(d)).
The mutation spectrum of SETDB1 was explored by Sangerbox 3.0 (Figure S4D). Finally, the 3D structure of SETDB1 protein and the mutation of the sequence were displayed (Figure 8(c)). However, the R1256W/L/Q mutation was not displayed in the 3D structure of the SETDB1 protein. The CNV pie chart also showed that the heterozygous amplification of CNV was distributed in most cancers, whereas the heterozygous deletion was predominantly distributed in KICH (Figure S4E). A significant positive correlation was observed between SETDB1 expression and CNV in various cancers (Figure S4F).
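The alteration frequencies quoted in this section follow from simple case counts, which the short sketch below reproduces for the figures given in the text. The function name is hypothetical, and summing the mutation and amplification counts assumes that the two groups of cases do not overlap.

```python
def alteration_frequency(altered_cases, total_cases):
    """Percentage of cases carrying an alteration, rounded to two decimals."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    return round(100.0 * altered_cases / total_cases, 2)


# Case counts quoted in the text for mixed endometrial carcinoma
total = 517
mutation = alteration_frequency(44, total)
amplification = alteration_frequency(34, total)
# Assumption: mutated and amplified cases are disjoint, so the counts can be added
combined = alteration_frequency(44 + 34, total)

# Pan-cancer figure quoted in the text: 630 altered cases out of 10439
overall = alteration_frequency(630, 10439)

print(mutation, amplification, combined, overall)
```

The computed values (8.51, 6.58, 15.09, and 6.04) agree with the 8.51%, 6.58%, 15.09%, and approximately 6% reported above.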
3.6. Methylation Analysis of SETDB1. Growing evidence shows that aberrant methylation is associated with oncogenesis and may have significant clinical value [22]. Therefore, we assessed the DNA methylation levels of SETDB1 and their prognostic value in various human cancers. Firstly, we compared the levels of SETDB1 promoter methylation in tumors and paracancerous tissues based on the UALCAN database. The results showed that the promoter methylation levels of SETDB1 in BLCA, BRCA, COAD, ESCA, HNSC, LIHC, LUAD, LUSC, PRAD, READ, TGCT, and UCEC were significantly reduced compared with those in paracancerous tissues (Figures 9(a)-9(l)). Correlation analysis showed that SETDB1 expression was significantly positively correlated with RNA modification-related genes (Figure S5). In the MethSurv online database, we evaluated the DNA methylation level and prognostic value of SETDB1 in various human cancers, and the relative methylation level was displayed in Figure S6. The cg10444928 site of SETDB1 showed the highest DNA methylation level in 25 human tumors. To analyze the association of the cg10444928 site of SETDB1 with prognosis across various human cancers, we explored the prognostic value of the single CpG (cg10444928) of SETDB1 based on the "single

Protein Phosphorylation Analysis of SETDB1. Protein phosphorylation may be a promoter or a suppressor of oncogenesis. Therefore, exploring protein phosphorylation is beneficial to developing novel antitumor agents for human tumors [23]. We first explored the protein phosphorylation sites of SETDB1 based on the PhosphoSitePlus database. As shown in Figure 9(t), the most predominant protein phosphorylation locus for SETDB1 is Ser1006 (flanking sequence: RNYGYNPsPVkPEGL), located in the SET conserved domain. Subsequently, we assessed the differences in phosphorylation levels at this single phosphorylation site of SETDB1 between tumor tissues and paracancerous tissues using the CPTAC dataset. The Ser1006 locus of SETDB1 possessed a higher phosphorylation level in BRCA (p = 1.55E-08), GBM (p = 1.27E-02), PAAD (p = 1.00E-11), HNSC (p = 1.16E-33), LUAD (p = 1.91E-32), and KIRC (p = 8.36E-17) (Figures 9(u)-9(z)). These results implied that protein phosphorylation of SETDB1 at the Ser1006 locus may play an important role in the development and progression of those tumors. Our findings above suggested that SETDB1 protein was mainly located in the nucleoplasm. However, whether the protein phosphorylation of SETDB1 at the Ser1006 locus affects its location or its function remains unknown and requires further investigation.

Immune and Molecular Subtype Analysis of SETDB1. We assessed the relationship between SETDB1 expression status and immune activity as well as molecular subtypes in human cancers based on the TISIDB database. According to immune activity, the tumor tissues were divided into C1 (wound healing), C2 (IFN-gamma dominant), C3 (inflammatory), C4 (lymphocyte depleted), C5 (immunologically quiet), and C6 (TGF-β dominant). In order to verify the dynamic relationship between SETDB1 expression status and immune activity, we assessed the immune activity levels of the six subtypes in various cancers. The results showed that SETDB1 expression was significantly associated with immune subtypes in BLCA, COAD, KICH, KIRC, LIHC, LUAD, LUSC, ovarian serous cystadenocarcinoma (OV), SARC, STAD, and TGCT (Figure 10(a)). Similarly, SETDB1 expression was associated with molecular subtypes in ACC, BRCA, COAD, GBM, HNSC, KIRP, LGG, LUSC, OV, PCPG, PRAD, SKCM, STAD, and UCEC (Figure 10(b)). These results implied that SETDB1 expression status was relevant to the immune subtypes and molecular subtypes of various cancers.
3.9. Immune Infiltration Analysis of SETDB1. Considering the importance of the immune microenvironment in tumorigenesis and cancer progression, we characterized the relationship between SETDB1 expression and immune infiltration using several databases. Cancer-associated fibroblasts (CAFs) are the fibroblasts surrounding tumor cells and the major stromal cells in the tumor microenvironment. They play a significant role in the initiation and progression of tumors [24,25], and targeting CAFs has been demonstrated to be an effective treatment strategy for various cancers [26]. The EPIC, MCPCOUNTER, and TIDE algorithms were applied to assess the relationship between the infiltration level of CAFs and SETDB1 expression in various human cancers. We observed a significantly positive correlation between SETDB1 expression and the infiltration level of CAFs (Figure 11). Furthermore, partial correlation analysis between SETDB1 expression and immune cell infiltration was conducted using the TIMER2.0 database. The results demonstrated a remarkable correlation between SETDB1 expression and CD8+ T cells, CD4+ T cells, Tregs, and B cells (Figure S7A-7D).
[...] by immune cells [27]. We also demonstrated a significant correlation between SETDB1 expression and immune checkpoint-related genes (immunoinhibitors and immunostimulators) in diverse cancers in TCGA (Figures 12(a) and 12(b)). For example, in KIRC, SETDB1 expression has a significantly positive correlation with the expression of TIGIT, PDCD1, CTLA4, CD96, CD244, CD160, BTLA, ADORA2A, TGFBR1, LAG3, HHLA2, CXCR4, CD80, CD70, CD48, etc. (Figure 12(d)). The major histocompatibility complex (MHC), known in humans as the human leukocyte antigen (HLA) system, plays an important role in tumor immunotherapy by activating T cells [28,29]. According to our results, the association between SETDB1 expression and HLA-related genes varies markedly among cancer types. SETDB1 expression and HLA-related genes were positively correlated in ACC, CESC, and KIRC but negatively correlated in other cancers in TCGA (Figure 12(c)). TMB and MSI are emerging predictors associated with survival and response to immunotherapy [30,31]. This study showed that SETDB1 expression was positively correlated with MSI in BLCA, CESC, LUAD, LUSC, READ, and SARC, negatively correlated with MSI in DLBC, and not correlated with MSI in other cancers (Figure 12(e)). SETDB1 expression was positively correlated with TMB in BLCA, BRCA, LGG, LUAD, and STAD, negatively correlated with TMB in THCA and UCS, and not correlated with TMB in other cancers (Figure 12(f)).
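The partial correlation analysis mentioned above can be illustrated with a minimal sketch. It assumes that the covariate being controlled for is tumor purity and that a rank-based (Spearman) coefficient is used; the text does not specify either, and all sample values and variable names below are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def partial_spearman(x, y, z):
    """Spearman correlation of x and y after controlling for z."""
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical per-sample values, for illustration only.
setdb1_expr = [2.1, 3.4, 1.8, 4.0, 2.9, 3.7]
cd8_infiltration = [0.10, 0.22, 0.12, 0.30, 0.25, 0.15]
tumor_purity = [0.90, 0.60, 0.80, 0.50, 0.70, 0.55]

print(partial_spearman(setdb1_expr, cd8_infiltration, tumor_purity))
```

Controlling for the covariate matters because both expression and infiltration estimates from bulk tissue can vary with the fraction of tumor cells in the sample, so an unadjusted coefficient may partly reflect that shared dependence.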

Function and Pathway Analysis of SETDB1-Related Genes. To further elucidate the biological function and molecular mechanism of SETDB1 and provide theoretical support for the study of tumorigenesis, we identified SETDB1-binding proteins with the STRING tool and conducted bioinformatics analyses. The interaction network of SETDB1 and these proteins comprised 51 nodes and 715 edges (Figure S8A). These genes were used as the target genes to obtain enriched GO terms and significant KEGG pathways. In GO analysis, 364 GO categories were detected, including 251 biological process (BP), 40 cellular component (CC), and 73 molecular function (MF) categories. In the GO-BP category, the target genes were mainly enriched in covalent chromatin modification (GO:0016569), histone modification (GO:0016570), and peptidyl-lysine modification (GO:0018205) (Figure 13(a)). In the GO-CC category, the genes were related to heterochromatin (GO:0000792), chromosomal region (GO:0098687), and chromosome, telomeric region (GO:0000781) (Figure 13(a)). In the GO-MF category, [...] (Figure 13(a)). In KEGG pathway analysis, the 51 genes were categorized into 11 KEGG pathways, of which lysine degradation, thyroid hormone signaling pathway, and cell cycle were identified as the main pathways (Figure 13(b)). These results are consistent with the results of GSEA (Figures 13(c) and 13(d)). Additionally, in terms of the HALLMARK gene sets, high SETDB1 expression was significantly enriched in the mitotic spindle, unfolded protein response, and PI3K-AKT-MTOR signaling. In contrast, low SETDB1 expression was significantly enriched in the inflammatory response, allograft rejection, coagulation, and epithelial-mesenchymal transition (Figures 13(e) and 13(f)).
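Over-representation analyses such as the GO and KEGG enrichment above are commonly based on a hypergeometric test. The text does not state which test was used, so the following is only a sketch of that common approach. The background size, term size, and overlap in the example are hypothetical; only the query size of 51 genes comes from the text.

```python
from math import comb


def enrichment_p_value(n_background, n_in_term, n_query, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: genes in the annotated background
    n_in_term:    background genes annotated to the term or pathway
    n_query:      genes in the query list
    n_overlap:    query genes annotated to the term or pathway
    """
    upper = min(n_in_term, n_query)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_in_term, i) * comb(n_background - n_in_term, n_query - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 51 query genes, a 300-gene term, a 20000-gene background,
# and 12 of the query genes annotated to the term.
p = enrichment_p_value(n_background=20000, n_in_term=300, n_query=51, n_overlap=12)
print(f"{p:.3e}")
```

In practice the p-values for all tested terms would also be corrected for multiple testing before a term is called enriched.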
Furthermore, the results of GeneMANIA also revealed that SETDB1 and SETDB1-binding proteins were mainly related to chromatin assembly, chromatin assembly or disassembly, DNA packaging, DNA conformation change, protein-DNA complex, nucleosome organization, and DNA packaging complex (Figure S8B). Competing endogenous RNAs (ceRNAs) contribute to tumorigenesis by forming an extensive network involving mRNA, miRNA, and ncRNA. We identified hsa-miR-29a-3p as the most important miRNA regulator by overlapping the predictions of three databases (Figure S8C). We then explored the complementary sequences between SETDB1 and hsa-miR-29a-3p using the TargetScanHuman database (Figure S8D). We also predicted twelve target lncRNAs from miRNA-lncRNA interactions in the LncBase database. The lncRNA-miRNA-mRNA network was then constructed from the lncRNA-miRNA and miRNA-mRNA regulation pairs (Figure S8E).
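Constructing the lncRNA-miRNA-mRNA network from the two sets of regulation pairs amounts to joining them on the shared miRNA. A minimal sketch follows. The lncRNA names and the second miRNA are hypothetical placeholders; only hsa-miR-29a-3p and SETDB1 are taken from the text.

```python
from collections import defaultdict


def build_cerna_triples(lncrna_mirna_pairs, mirna_mrna_pairs):
    """Join lncRNA-miRNA and miRNA-mRNA pairs on the shared miRNA.

    Returns a sorted list of (lncRNA, miRNA, mRNA) triples.
    """
    targets_by_mirna = defaultdict(set)
    for mirna, mrna in mirna_mrna_pairs:
        targets_by_mirna[mirna].add(mrna)

    triples = set()
    for lncrna, mirna in lncrna_mirna_pairs:
        # A lncRNA joins the network only if its miRNA also targets an mRNA.
        for mrna in targets_by_mirna.get(mirna, ()):
            triples.add((lncrna, mirna, mrna))
    return sorted(triples)


# Placeholder lncRNA identifiers; the twelve predicted lncRNAs are not named here.
lncrna_mirna = [
    ("LNC_A", "hsa-miR-29a-3p"),
    ("LNC_B", "hsa-miR-29a-3p"),
    ("LNC_C", "hsa-miR-other"),  # hypothetical miRNA with no mRNA target listed
]
mirna_mrna = [("hsa-miR-29a-3p", "SETDB1")]

for triple in build_cerna_triples(lncrna_mirna, mirna_mrna):
    print(" -> ".join(triple))
```

Each resulting triple corresponds to one lncRNA-miRNA edge plus one miRNA-mRNA edge in the network.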
3.12. Drug Sensitivity Analysis. Genetic mutations can influence the efficacy of chemotherapy and targeted therapy. Therefore, we evaluated the role of SETDB1 in chemotherapy and targeted therapy by investigating the drug sensitivity and drug resistance of cancer cell lines in the GDSC datasets. ANOVA showed that sensitivity to Ara-G (nelarabine), Nilotinib, and KIN001-042 was significantly correlated with SETDB1 expression (negative correlation with IC50), whereas resistance to Bleomycin (50 μM), JAK3_7406, and FGFR_0939 was correlated with SETDB1 expression (positive correlation with IC50) (Figure 14(a)). The correlation between GDSC drug sensitivity and SETDB1 expression showed that most drugs were negatively correlated with SETDB1 expression, while 17-AAG and trametinib showed a positive correlation with SETDB1 expression (Figure 14(b); Figure 15). These results are similar to those we obtained from the other databases.

Discussion
The SETDB1 gene, first identified in 1999 by Harte et al. [11], spans 38.6 kilobases (kb) and is located on human chromosome 1q21.3. Abnormalities in chromosome number and structure are important factors in tumorigenesis and therapeutic response [32,33]. Tumor tissues carry many chromosomal variants. Chromosome 1q gains occur in various human cancers, such as LUAD, LIHC, OV, BRCA, and multiple myeloma [32,34,35]. Chromosome 1q21.3 abnormalities are related to breast cancer recurrence, and they can promote cell proliferation and DNA damage response in metastatic melanoma [35,36]. SETDB1, located in the 1q21.3 region, encodes a histone methyltransferase that regulates transcriptional repression, histone methylation, and gene silencing [37,38]. This study demonstrated that SETDB1 is differentially expressed between tumor and normal tissues in most cancers, suggesting that it also plays an oncogenic role in these tumors. The amplification of SETDB1 in human tumors is significantly associated with immune exclusion and tumor progression, but its biological and functional role and its contribution to tumor prognosis remain unknown [12]. To our knowledge, this is the first pan-cancer analysis of SETDB1 across 33 different tumors based on data from the TCGA, CPTAC, and GEO databases. The results show that SETDB1 is significantly correlated with tumorigenesis and clinical outcomes.
SETDB1 has specific domains [39], including two Tudor domains, MBD, pre-SET, SET, and post-SET, which is consistent with our findings. Most of the biological function of SETDB1 is ascribed to the SET domain, which is highly conserved across species and was originally identified in the Drosophila Trithorax (TRX) and human MLL proteins [40]. SETDB1 is a member of the SET family and is an H3K9 methyltransferase that modulates gene activity. The pre-SET, SET, and post-SET domains are crucial for histone methyltransferase activity. Furthermore, the SETDB1 protein has a canonical CpG DNA methyl-binding domain (MBD) at the N-terminus, which can bind methylated DNA at one site [41]. Growing evidence suggests the involvement of MBD genes in cancers [42]. MBD is involved in various signaling pathways and cellular functions, including DNA damage repair, chromatin remodeling, histone methylation, and X chromosome inactivation [42]. MBD can also potentially coordinate the functions of DNA methyl-CpG binding and H3K9 methylation, both of which can promote epigenetic marks [42,43]. SETDB1 also contains a unique tandem Tudor domain that recognizes histone H3 sequences containing both methylated and acetylated lysines [44]. The biological function of SETDB1 is a double-edged sword. On the one hand, it may downregulate tumor suppressor genes through histone methylation. On the other hand, it may inhibit tumor-intrinsic immunogenicity, enabling cancer cells to evade immune responses [12,45].
A recent study indicated that tumor cell-intrinsic epigenetic alterations drive tumorigenesis and cancer progression [46]. Epigenetic characteristics reflect the heterogeneity of tumors and indicate potential epigenetic changes that lead to cancer cell invasion during tumor progression [46,47]. As an important player in tumor epigenetics, SETDB1 is differentially expressed between cancerous tissues and adjacent healthy tissues in most cancers [8,48-51], which is consistent with our findings. SETDB1 has been shown to be an oncogene and an important prognostic factor in some tumors. SETDB1 expression is upregulated in LIHC tissues and is associated with tumor size, advanced stage, and TNM classification [52]. Similarly, for TCGA-LIHC patients, we observed that the expression level of SETDB1 is significantly elevated in tumor tissues compared with paracancerous tissues. LIHC tissues from patients with advanced-stage tumors show significantly higher expression levels of SETDB1 than those from patients with early-stage tumors. However, we also observed that the expression levels of SETDB1 were significantly lower in stage 4 tumors than in early-stage tumors. This result may be inaccurate because of the limited sample size (six samples with stage 4 tumors), so a study with a larger sample size is needed to verify the conclusion. Cancer develops as a result of genetic mutational events that lead either to the overexpression of growth-promoting oncogenes or to the inactivation of cell cycle-controlling tumor suppressor genes [53]. Growing evidence implies that SETDB1 is a potential oncogene in tumorigenesis [8]. Therefore, a comprehensive understanding of the biological functions of SETDB1 mutations can help to inhibit tumorigenesis and develop effective antitumor agents.
Figure 15: Immunohistochemistry results of SETDB1 in normal and tumor human tissues.
It has been reported that SETDB1 mutations are widespread and occur in most malignant pleural mesotheliomas [54,55]. The frequent SETDB1 mutation indicates that SETDB1 may be a potential therapeutic target for malignant pleural mesothelioma [55]. We used the cBioPortal online database to explore the genetic mutation levels of SETDB1 in various cancers. The pan-cancer mutation spectra showed that SETDB1 was mutated at a high frequency in most human tumors, with the highest frequency in UCEC (15.09% of 517 cases). These results suggest that SETDB1 mutation plays a significant role in tumorigenesis.
Cancer is an increasingly health-threatening disease that has a poor prognosis due to the lack of effective treatment. The progression and recurrence of tumors challenge the effectiveness of therapies [56,57]. Because of therapeutic resistance and tumor relapse after therapy, cancer-centric therapeutic paradigms are not sufficient to eradicate the malignancy [58]. Targeting the tumor microenvironment (TME) has emerged in recent years as a novel treatment strategy. CAFs are the most abundant stromal cells in the TME and play significant roles in tumor development. Our results revealed a significant association between SETDB1 expression and the infiltration level of CAFs in certain tumors, including ACC, BRCA, CESC, COAD, HNSC, HNSC-HPV, KIRP, LIHC, READ, and TGCT. Furthermore, we used online databases to explore the correlation between SETDB1 expression and immune cell infiltration levels in human cancers and found that tumor-related immune cells were significantly increased in tumor tissues with high SETDB1 expression. These results suggest that the expression level of SETDB1 influences tumor growth, metastasis, and prognosis.
Immunotherapy has emerged as a new pillar of cancer treatment in recent years. The introduction of PD-1, PD-L1, and CAR-T cell immunotherapy into the therapeutic strategy for advanced cancer has led to unprecedentedly prolonged survival for patients [59]. According to our findings, increased expression of SETDB1 has a significantly negative correlation with immunoinhibitors and immunostimulators in most cancers. Therefore, we speculate that decreasing SETDB1 expression in tumor cells might enhance immunotherapeutic responses.

Conclusion
In summary, we conducted a pan-cancer analysis of the SETDB1 oncogene for the first time. We performed omics, prognostic, methylation and phosphorylation, immune, and enrichment analyses of SETDB1, and analyzed its mRNA and protein expression levels and gene alteration levels. The investigation and characterization of SETDB1 biological function are expected to help identify key targets and regulatory pathways and to promote human cancer treatment in the future.