Associations of individual and joint expressions of ERCC6 and ERCC8 with clinicopathological parameters and prognosis of gastric cancer

Background Excision repair cross-complementing group 6 and 8 (ERCC6 and ERCC8) have been implicated in genetic diseases and cancers. However, the relationship between individual and joint expression of ERCC6/ERCC8 and the clinicopathological parameters and prognosis of gastric cancer (GC) remains unclear. Methods Protein expression of ERCC6, ERCC8 and ERCC6-ERCC8 was detected by immunohistochemistry (IHC) in 109 paired GC and para-cancerous normal tissue samples. mRNA expression was detected in 36 pairs of tissue samples. IHC results and RNA-seq data extracted from The Cancer Genome Atlas (TCGA) were used to explore the clinical value of ERCC6 and ERCC8 expression in GC. We further conducted protein-protein interaction analysis, Gene Ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis, gene set enrichment analysis (GSEA), and gene-gene interaction analysis to explore the functions and regulation networks of ERCC6 and ERCC8 in GC. Results Individual and joint ERCC6/ERCC8 expression was significantly higher in adjacent normal mucosa than in GC tissues. ERCC6 mRNA expression did not differ between GC and paired tissues, while ERCC8 mRNA was significantly decreased in GC tissues. Protein expression of ERCC6 and ERCC8, double positive ERCC6-ERCC8 expression, and overexpressed ERCC6 mRNA were related to better clinicopathological parameters, whereas double negative ERCC6-ERCC8 expression and overexpressed ERCC8 mRNA suggested worse parameters. Univariate survival analysis indicated that overall survival (OS) was longer when ERCC6 protein expression and ERCC8 mRNA expression increased, and double negative ERCC6-ERCC8 expression was associated with a short OS. Bioinformatics analyses showed that ERCC6 and ERCC8 were associated with the nucleotide excision repair (NER) pathway, and six and ten gene sets were found to be related to ERCC6 and ERCC8, respectively. KEGG pathway analysis showed that ERCC6/ERCC8-related gene sets were mainly involved in the regulation of the PI3K/AKT/mTOR pathway. Direct physical interactions were found between ERCC6 and ERCC8. Conclusions Individual and joint expression of ERCC6/ERCC8 was associated with clinical features of GC. Protein expression of ERCC6 and ERCC6-ERCC8 and mRNA expression of ERCC8 were related to the prognosis of GC. ERCC6 and ERCC8 primarily function in the NER pathway, and may regulate GC progression through the PI3K/AKT/mTOR pathway.


Background
In the human genome, DNA damage and repair exist in a dynamic equilibrium. A variety of DNA repair mechanisms help maintain the integrity of genes damaged by endogenous and exogenous factors (1). Nucleotide excision repair (NER) can repair a wide range of DNA lesions, including oxidatively damaged DNA bases, bulky adducts and UV-induced cyclobutane pyrimidine dimers (2,3). Deficient expression of DNA repair genes can prevent activation of the repair pathway, which can decrease DNA repair capacity and increase cancer susceptibility (4).
Excision repair cross-complementing group 6 (ERCC6/CSB) and excision repair cross-complementing group 8 (ERCC8/CSA) are core members of the NER pathway (5-8). Given their important function in NER, many studies have explored the roles of ERCC6 and ERCC8 in disease onset and progression, and have shown that ERCC6 and ERCC8 expression has clinical significance in genetic diseases and cancers. Cheng et al. (9,10) observed that patients with lung cancer and head and neck cancer were more likely than controls to have decreased ERCC6 mRNA expression. In contrast, ERCC6 mRNA was overexpressed in colorectal cancer compared with matched normal tissues (11). Caputo et al. (12) reported higher ERCC6 mRNA expression in kidney and lung cancer samples, and their western blot analysis revealed increased ERCC6 protein levels in bladder, cervix, prostate and breast cancer cells compared with normal cells. Previous research has also revealed that ERCC6 and ERCC8 play a crucial part in the onset and progression of the hereditary disease Cockayne syndrome (13). Our previous study showed that ERCC6, ERCC8, and ERCC6-ERCC8 joint expression is related to the risk of gastric cancer (GC) (14).
Taken together, current data indicate that ERCC6 and ERCC8 are heterogeneously expressed across diseases, which may influence disease development and progression. However, to date no reports have described the prognostic implications of ERCC6, ERCC8 and ERCC6-ERCC8 expression in GC. Here, for the first time, we conducted a comprehensive analysis using immunohistochemistry (IHC) and RNA-seq data to explore the associations of ERCC6 and ERCC8 expression with clinicopathological parameters and prognosis of GC.

Materials And Methods
Human tissue specimens
Paired gastric cancer and para-cancerous normal tissue samples were collected from 109 individuals with gastric cancer who were diagnosed at the anorectal department of the First Hospital of China Medical University. None of the selected patients had received neoadjuvant chemoradiotherapy or any other treatment from 2012 to 2015.
Histological diagnoses followed the updated Sydney System Classification (15) for gastritis and the World Health Organization criteria (16) for GC. Tumors were staged according to the 7th edition (2010) of the TNM staging system of the International Union Against Cancer/American Joint Committee on Cancer (17), based on postoperative pathologic examination. This study was approved by the ethics committee of the First Hospital of China Medical University. Written informed consent was obtained from each patient.
Two experienced pathologists who were blinded to patient-related information evaluated and scored the staining results independently. A semi-quantitative scoring criterion was applied to assess the extent of staining on each slide (18). Staining was scored on the basis of intensity (0, no staining; 1, light staining; 2, moderate staining; 3, strong staining) and staining ratio (0, ≤ 5%; 1, 5-25%; 2, 25-50%; 3, 50-75%; 4, ≥ 75%). An immunoreactivity score (IS) was generated by multiplying the two scores for each sample. An IS of 0 indicated negative protein expression, and any other score indicated some degree of positive staining.
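The scoring rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the scoring procedure used by the pathologists: the function names are hypothetical, the handling of values that fall exactly on a shared bin boundary (for example 25%) is an assumption because the published bins overlap, and the "single positive" label for the mixed case is likewise an assumed name.

```python
def ratio_score(percent_stained: float) -> int:
    """Map the percentage of stained cells (0-100) to the 0-4 staining-ratio score."""
    if not 0 <= percent_stained <= 100:
        raise ValueError("percent_stained must be between 0 and 100")
    # Assumption: a value on a shared boundary (5, 25, 50) goes to the lower bin;
    # 75% goes to score 4 because the text defines that bin as >= 75%.
    if percent_stained <= 5:
        return 0
    if percent_stained <= 25:
        return 1
    if percent_stained <= 50:
        return 2
    if percent_stained < 75:
        return 3
    return 4


def immunoreactivity_score(intensity: int, percent_stained: float) -> int:
    """IS = intensity score (0-3) multiplied by staining-ratio score (0-4)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return intensity * ratio_score(percent_stained)


def is_positive(score: int) -> bool:
    """An IS of 0 is negative; any other IS counts as positive staining."""
    return score > 0


def joint_status(ercc6_is: int, ercc8_is: int) -> str:
    """Combine two IS values into a joint ERCC6-ERCC8 category."""
    positives = is_positive(ercc6_is) + is_positive(ercc8_is)
    # "single positive" is an assumed label for the mixed case.
    return {0: "double negative", 1: "single positive", 2: "double positive"}[positives]
```

For example, moderate intensity (2) with 60% of cells stained gives a ratio score of 3 and therefore an IS of 6, which counts as positive.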
Clinical information and RNA-seq data collection
Data on age, gender, smoking, family history and alcohol consumption for the 109 IHC cases were obtained via questionnaire. Clinical characteristics were collected from medical records, including TNM stage, Lauren's classification, Borrmann classification, tumor size, phase of progression, lymph node metastasis, perineural invasion, and vascular invasion. The final follow-up was completed in July 2016. Full prognostic data were obtained from 97 participants.
Data for 415 stomach cancer patients, including RNA-seq data and survival information, were downloaded from TCGA (https://cancergenome.nih.gov/) (19,20). Detailed clinicopathological parameters, including age, gender, grade, TNM stage and histological type, were also downloaded for further analysis.

Interaction and functional analysis of ERCC6 and ERCC8
Given the clinical significance of ERCC6/ERCC8 in GC, we then investigated the biological functions of ERCC6 and ERCC8. First, we used ERCC6 and ERCC8 as core genes to construct protein-protein interaction (PPI) networks with the Search Tool for the Retrieval of Interacting Genes (STRING v.11.0; https://string-db.org/; accessed on August 27, 2020), in order to identify proteins that interact functionally with ERCC6/ERCC8 (21). Analytic information, including node degrees and biological networks, was visualized with the Cytoscape platform (v.3.7.2) (22). The ten most closely associated proteins are shown in the diagrams.
We then conducted Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses to explore the biological functions of ERCC6 and ERCC8, using the Database for Annotation, Visualization and Integrated Discovery (DAVID; v.6.8; https://david.ncifcrf.gov/home.jsp; accessed on August 31, 2020), a user-friendly database providing comprehensive analysis of gene annotation (23). R (version 3.6.3) and the ggplot2 package were used to visualize the results. Terms with P < 0.05 were deemed significant; for GO, only the top ten terms of each group were visualized.
Identification of regulation networks of ERCC6 and ERCC8 by GSEA
Gene set enrichment analysis was conducted on the GSEA platform (version 4.1.0; https://www.broadinstitute.org/gsea/) coupled with the MSigDB database (24,25). Oncogenic signature gene sets (c6.all.v7.1.symbols.gmt) and the TCGA expression data were included in this analysis. GSEA allowed us to identify the regulation networks in which ERCC6 and ERCC8 are involved in GC. Significantly changed regulation networks were identified using 1,000 phenotype permutations, with Pearson as the ranking metric. A nominal P value < 0.01 and a false discovery rate (FDR) < 0.25 were the criteria for significant enrichment. The normalized enrichment score (NES) was used to rank the results.
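The significance criteria above amount to a simple filter followed by a ranking step, which the following Python sketch illustrates. It is not the GSEA software itself: the `GseaResult` record, the function name, the placeholder gene-set names and all numeric values are hypothetical, and ranking by the absolute value of the enrichment score (so that strongly negative sets are not pushed to the bottom) is an assumption.

```python
from typing import List, NamedTuple


class GseaResult(NamedTuple):
    """One row of a GSEA report (hypothetical minimal record)."""
    gene_set: str
    nes: float    # normalized enrichment score
    nom_p: float  # nominal P value
    fdr: float    # false discovery rate (q value)


def significant_gene_sets(
    results: List[GseaResult],
    p_cutoff: float = 0.01,
    fdr_cutoff: float = 0.25,
) -> List[GseaResult]:
    """Keep gene sets passing both cutoffs, ranked by |enrichment score|."""
    kept = [r for r in results if r.nom_p < p_cutoff and r.fdr < fdr_cutoff]
    # Assumption: rank by absolute score so positive and negative
    # enrichments are compared on the same scale.
    return sorted(kept, key=lambda r: abs(r.nes), reverse=True)


# Made-up rows for illustration only; these are not results from the study.
example = [
    GseaResult("SET_A", 2.10, 0.002, 0.08),
    GseaResult("SET_B", -1.95, 0.004, 0.12),
    GseaResult("SET_C", 1.60, 0.030, 0.20),  # fails the P value cutoff
    GseaResult("SET_D", 1.80, 0.008, 0.30),  # fails the FDR cutoff
]
ranked = significant_gene_sets(example)
```

With these made-up rows, only SET_A and SET_B pass both thresholds, in that order.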

Identifying gene-gene interactions between ERCC6 and ERCC8
To identify interactions between ERCC6 and ERCC8, we performed gene-gene interaction analysis with the Gene Multiple Association Network Integration Algorithm (GeneMANIA; https://www.genemania.org/; accessed on November 13, 2020), which is a user-friendly interface providing analysis with available genomics and proteomics data (26). To display interactions, nodes and links represented genes and networks respectively in the visualized results.
Statistical analysis
IBM SPSS Statistics for Windows, version 23.0 (IBM Corp., Armonk, N.Y., USA) and R (version 3.6.3) were used for statistical analyses. For IHC data, the Pearson χ2 test was used to analyze the correlations between ERCC6 and ERCC8 expression levels and clinicopathological parameters. Univariate and multivariate Cox regression analyses were used to determine their impact on overall survival (OS); age, TNM stage, perineural invasion, vascular invasion and lymph node metastasis were adjusted for in the multivariate model to evaluate independent prognostic value. For RNA-seq data, the Wilcoxon test and the Kruskal-Wallis H test were used to assess the relationships between ERCC6/8 expression and clinical characteristics; the two-sided log-rank test and a multivariate Cox proportional-hazards model adjusted for gender, age, grade, stage, T, N, and M were used to clarify the prognostic value of ERCC6 and ERCC8. P < 0.05 was considered statistically significant.
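As an illustration of the Pearson χ2 test named above, the following Python sketch computes the statistic and P value for a 2 × 2 table (for example, expression positive/negative against stage I-II/III-IV) using only the standard library. The analyses in this study were run in SPSS and R; the function name and the counts below are hypothetical, and the sketch applies no continuity correction.

```python
import math
from typing import Sequence, Tuple


def pearson_chi2_2x2(table: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for a 2 x 2 table.

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column must contain at least one observation")

    chi2 = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # With 1 degree of freedom, chi2 is the square of a standard normal
    # variable, so the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up counts for illustration only; not data from this study.
chi2, p_value = pearson_chi2_2x2([[10, 20], [20, 10]])
```

Tables with more than two rows or columns need the general χ2 distribution, which a statistics package provides.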

Expression of ERCC6 and ERCC8 in GC and adjacent normal mucosa
In this study we compared the expression levels of ERCC6 and ERCC8 between GC and adjacent normal mucosa. Representative ERCC6 and ERCC8 staining is presented in Fig. 1. Individual and joint expression of ERCC6 and ERCC8 was significantly higher in adjacent normal mucosa than in GC tissues (all P < 0.001) (Table 1). Specifically, the ERCC6-ERCC8 double positive rate dropped to 16.5% in GC, and the double negative rate was only 1.8% in adjacent normal mucosa.
We then explored the associations of ERCC6/ERCC8 expression levels with clinicopathological parameters in GC patients; the results are summarized in Table 2. ERCC6 expression was significantly related to Borrmann classification (P = 0.017), Lauren's classification (P = 0.004), TNM stage (P = 0.005; P = 0.012 for T stage) and perineural invasion (P = 0.001). High ERCC6 expression was observed in gastric cancers of Borrmann class I-II, TNM stage I-II, intestinal type and without perineural invasion. ERCC8 expression was significantly higher in patients with TNM stage I-II than in those with stage III-IV (P < 0.001; P < 0.001 for T stage; P = 0.002 for lymph node metastasis), and was higher in early-stage and small tumors (P = 0.031 and 0.007, respectively). Higher ERCC8 expression was also observed in intestinal-type GC (P = 0.008). As for the joint expression of ERCC6/ERCC8, double positivity was related to small tumor size (P = 0.005), Borrmann class I-II (P < 0.001), TNM stage I-II (P = 0.001; P < 0.001 for T stage), Lauren intestinal type (P = 0.014) and negative perineural invasion (P = 0.015). Double negativity was associated with TNM stage III-IV (P < 0.001; P = 0.002 for T stage), positive perineural invasion (P = 0.002), advanced stage (P = 0.034) and diffuse type (P < 0.001) of GC.
The analysis of RNA-seq data obtained from TCGA is shown in Fig. 2. Higher ERCC6 expression was related to a more favorable T stage (P = 0.027), which is consistent with the IHC results, while no significant associations were found between ERCC6 expression and age (P = 0.570), gender (P = 0.646), histological type (P = 0.425), grade (P = 0.072), stage (P = 0.091), N stage (P = 0.572) or M stage (P = 0.242). As for ERCC8, Fig. 3 shows that overexpressed ERCC8 was closely related to worse grade (P = 0.018), advanced stage (P = 0.022), and worse N stage (P = 0.037); the associations between ERCC8 expression and age, gender, histological type, T stage and M stage were not significant (all P > 0.05).
The relationship between ERCC6 and ERCC8 expression and prognosis of GC
Univariate survival analysis showed no significant correlation between ERCC8 protein expression and GC prognosis (P = 0.211), while a significant correlation was observed between ERCC6 protein expression and GC prognosis (P = 0.047, HR = 3.416, 95% CI = 1.017-11.475). In addition, double negative expression of ERCC6 and ERCC8 was significantly associated with poorer prognosis (P = 0.022, HR = 2.603, 95% CI = 1.148-5.905). However, no statistical association was found between the expression levels of ERCC8 and ERCC6-ERCC8 and GC prognosis (both P > 0.05). Because TNM stage (P < 0.001), age (P = 0.039), perineural invasion (P = 0.043), vascular invasion (P = 0.007), and lymph node metastasis (P < 0.001) were all statistically associated with gastric cancer prognosis (Additional file 1: Table S1), a Cox proportional hazards model adjusted for perineural invasion, TNM stage, age and vascular invasion was further applied to evaluate the prognostic value. However, the multivariate analysis suggested that neither ERCC6, ERCC8 nor ERCC6-ERCC8 expression was an independent factor for GC prognosis (all P > 0.05) (Table 3).
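The hazard ratios, confidence intervals and P values reported above can be checked for internal consistency with a short calculation. The Python sketch below assumes that each 95% CI is a Wald interval, symmetric on the log hazard scale, and that the P value comes from the corresponding two-sided z test; the text does not state how the intervals were derived, so this is an illustrative check rather than a reproduction of the original analysis. The function name is hypothetical.

```python
import math
from typing import Tuple

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_p_from_hr(hr: float, ci_low: float, ci_high: float) -> Tuple[float, float, float]:
    """Recover the standard error, z statistic and two-sided P value
    implied by a hazard ratio and its 95% confidence interval.

    Assumes a Wald interval: log(HR) +/- Z_95 * SE.
    """
    if not 0 < ci_low < ci_high:
        raise ValueError("confidence limits must satisfy 0 < ci_low < ci_high")
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = log_hr / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return se, z, p_value


# Values taken from the univariate IHC results reported in the text.
_, _, p_ercc6 = wald_p_from_hr(3.416, 1.017, 11.475)
_, _, p_double_negative = wald_p_from_hr(2.603, 1.148, 5.905)
```

Under this assumption, both recovered P values agree with the reported 0.047 and 0.022 to within rounding.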
Survival analysis with RNA-seq data suggested that higher ERCC8 mRNA expression was related to better OS (P = 0.025; Fig. 3I), while ERCC6 expression showed no significant association with OS (P = 0.15; Fig. 2I).
We then conducted GO and KEGG analyses based on the network results obtained from STRING. As shown in Fig. 5, ERCC6 network genes were enriched in the molecular functions of protein N-terminus binding, damaged DNA binding and DNA-dependent ATPase activity (Fig. 5A). According to the cellular component analysis, they were mainly located in the nucleoplasm, the transcription factor TFIID complex, and the holo TFIIH complex (Fig. 5B). Figure 5C shows that ERCC6-interacting genes were significantly enriched in the biological processes of nucleotide-excision repair and UV protection. KEGG analysis suggested that these genes were closely related to nucleotide excision repair, Huntington's disease, RNA polymerase and basal transcription factors (Fig. 5D). For ERCC8 network genes, GO enrichment analysis showed that these genes were related to the composition of transcriptional initiation complexes and ubiquitin ligase complexes; not surprisingly, their main molecular functions and biological processes closely resembled those of the ERCC6 network genes (Fig. 6A-C). Similar KEGG results, namely nucleotide excision repair and basal transcription factors, were also observed for ERCC8 network genes. A result specific to ERCC8 was ubiquitin mediated proteolysis (Fig. 6D).

Identification of gene set-associated regulatory networks of ERCC6 and ERCC8
We further identified the gene sets most positively and negatively related to ERCC6 and ERCC8, in order to determine the cellular regulatory networks in GC in which ERCC6 and ERCC8 are involved. Oncogenic signature analysis indicated that in GC there were six and ten most significant gene sets for ERCC6 and ERCC8, respectively. Among the results, both ERCC6 and ERCC8 were associated with TBK1- and BCAT-related regulatory networks; ERCC6 was also associated with EIF4E-, MTOR-, JAK2- and CSR-related regulatory networks; and ERCC8 was also associated with PIGF-, RB-, ERBB2-, GCNP-, SRC- and CYCLIN D1-related regulatory networks. Detailed information is shown in Fig. 7 for ERCC6 and Fig. 8 for ERCC8.

Gene-gene interaction network between ERCC6 and ERCC8
The gene-gene interaction network obtained from GeneMANIA clarified the relationships between ERCC6 and ERCC8 in terms of pathway, prediction, shared protein domains, physical interactions, co-localization, and co-expression. As shown in Fig. 9, ERCC6 and ERCC8 were linked by direct interactions (physical interactions and pathway) and by indirect interactions (prediction, co-expression, co-localization and shared protein domains).

Discussion
By analyzing IHC and TCGA data, our experiment elucidated that abnormally expressed ERCC6 and ERCC8 were associated with clinicopathological behaviors and survival of GC. Furthermore, by performing bioinformatics analysis of GO, KEGG, GSEA and gene-gene interaction analysis, our research extended the existing knowledge of ERCC6/ERCC8 in GC.
We first measured protein expression levels of ERCC6/ERCC8 in GC and para-cancerous tissues. The results indicated that both individual and joint expression were significantly decreased in GC compared with adjacent tissues. We then investigated associations between protein expression of ERCC6 and ERCC8 and clinicopathological parameters. The analysis showed that overexpressed ERCC6, ERCC8 and ERCC6-ERCC8 were significantly related to favorable clinicopathological features, which are key factors with great impact on disease progression. RNA-seq data gave consistent results for ERCC6, with higher ERCC6 expression associated with favorable T stage, whereas overexpressed ERCC8 mRNA was associated with unfavorable clinicopathological parameters. We suspect that this discrepancy may be due to mechanisms that destabilize ERCC8 protein during GC progression. Cancer cells lacking ERCC6 or ERCC8 protein, which are responsible for DNA repair, may exhibit a more malignant and poorly differentiated phenotype. A recent study reported that ERCC6 deficiency could result in heterochromatin loss and exacerbate cellular aging (27). Defects in ERCC6 and ERCC8 influence the coupling of transcription and repair to a certain extent, thus leading to declining DNA repair capacity (28). Physiologically, DNA repair capacity could be related to the expression levels of proteins involved in DNA repair activities (29). Previous studies have also reported that downregulation of DNA repair genes is related to late-stage cancers and malignant transformation (30). Therefore, it is conceivable that the expression status of DNA repair genes could reflect the capacity of a cell to meet repair demands after exposure to a carcinogen. We suggest that ERCC6 and ERCC8 downregulation could allow unrepaired DNA lesions to persist, leading to decreased DNA repair capacity and increased cancer susceptibility, and finally resulting in cancer progression.
To further explore the prognostic value of ERCC6 and ERCC8, we studied the correlation between ERCC6 and ERCC8 expression and survival in GC patients using both IHC and RNA-seq data. According to univariate survival analysis based on IHC, higher ERCC6 protein expression was associated with better prognosis, while double negative ERCC6 and ERCC8 expression indicated worse overall survival of GC patients. RNA-seq data also showed that overexpressed ERCC8 was related to better OS of GC patients. After adjustment for certain parameters in the multivariate Cox analysis, ERCC6 and ERCC8 expression in IHC and RNA-seq data no longer maintained independent predictive power, which may be due to the complexity of tumor progression. A previous laboratory study showed that knockdown of ERCC6 could sensitize HCT116 cells to 5-fluorouracil in xenograft mouse models, and that colorectal cancer patients with high ERCC6 expression exhibited shorter overall survival (31). As for other DNA repair family genes, high ERCC5 expression was shown to correlate with shorter survival times compared with low ERCC5 expression (32), whereas decreased ERCC1 expression was reported to predict a favorable prognosis in gastric cancer (33). Overall, our data suggest that expression levels of ERCC6 protein, double negative ERCC6-ERCC8 protein, and ERCC8 mRNA may, to some extent, have prognostic value in GC, and that additional factors should be taken into account in further analyses to estimate GC prognosis more comprehensively.
Next, bioinformatic analyses were conducted to better investigate the biological functions and regulation networks of ERCC6 and ERCC8 in GC progression. First, we queried the 10 genes most relevant to ERCC6 and ERCC8 through STRING and then performed GO and KEGG analyses on the results. Enrichment analysis of ERCC6 and ERCC8 and their relevant genes showed similar results. Both genes were mainly involved in the composition of transcriptional initiation complexes and influenced diverse nucleotide excision repair pathways. Similarly, other studies have identified ERCC6 and ERCC8 as core NER genes (34-36). KEGG pathway analysis further revealed that ERCC6 was also involved in Huntington's disease and that ERCC8 had a significant role in ubiquitin mediated proteolysis. Consistent with our analyses, one study has reported that ERCC8 is involved in the formation of a complex that exhibits ubiquitin ligase activity (37).
Furthermore, we conducted GSEA to explore ERCC6- and ERCC8-associated regulation networks in GC. An oncogenic pathway signature is usually defined as a set of genes that show characteristic patterns of up- or downregulation when a known tumorigenesis-related pathway is activated. In our study, oncogenic signature analysis suggested that ERCC6 was significantly associated with the oncogenic signatures of EIF4E-, TBK1-, BCAT-, mTOR-, JAK2- and CSR-related regulation networks. For ERCC8, the results indicated a significant relationship with TBK1-, PIGF-, BCAT-, RB-, ERBB2-, GCNP-, SRC- and CYCLIN_D1-associated oncogenic regulation networks. Emerging evidence has shown that PI3K/AKT/mTOR pathway deregulation plays an important part in GC progression (38). A study by Riquelme reported that two mTOR pathway genes, EIF4E and mTOR, were overexpressed in GC cells (39). ERBB2 has been found to mediate the activation of PI3K (40). Moreover, some studies have reported a context-dependent inhibitory or activating role of TBK1 in mTOR signaling (41-43). Given these results, we suspect that ERCC6 and ERCC8 could regulate GC progression through the PI3K/AKT/mTOR pathway. Because our analysis found similar and identical functions and pathways, we then performed gene-gene interaction analysis to identify potential associations between ERCC6 and ERCC8. The results demonstrated direct physical interactions and shared pathways between ERCC6 and ERCC8, which is supported by a previous study (44). Indirect interactions, including prediction, co-expression, co-localization and shared protein domains, were also revealed. From these findings we propose that ERCC6 and ERCC8 act cooperatively, which needs further in-depth study.

Conclusions
In conclusion, individual and joint expressions of ERCC6 and ERCC8 were associated with clinical features of GC.

Figure 4
Protein interaction networks of 10 associated partners with a confidence score > 0.4 obtained from the STRING database. A. ERCC6 is defined as the core gene; B. ERCC8 is defined as the core gene.

Figure 9
Gene-gene interaction network between ERCC6 and ERCC8. Nodes and links represent genes and networks, respectively.