Exploration of a Novel Biomarker in Endometrial Carcinoma

Background The adenomatous polyposis coli (APC) gene, located on chromosome 5q21, is a chromatin remodeling-related gene and a typical tumor suppressor. As reported, patients with high expression of programmed death-ligand 1 (PD-L1) or a high tumor mutational burden (TMB) may benefit from immunotherapy in endometrial cancer. The objective of this study was to demonstrate that APC could serve as a new target for the diagnosis and treatment of endometrial cancer by analyzing the correlation of APC with PD-L1 expression or TMB. Methods We performed an integrative analysis of a commercial panel of 520 cancer-related genes on 99 tumors from an endometrial cancer cohort in China, together with DNA-seq data from The Cancer Genome Atlas (TCGA), to identify new gene mutations as endometrial cancer immunotherapy markers. To determine the effect of gene mutations on endometrial cancer, we explored the correlation between gene mutations and the tumor immune microenvironment, including TMB, PD-L1 expression and lymphocytic infiltration.


Background
Endometrial cancer is one of the most common malignant tumors of the female reproductive system. In recent years, the morbidity and mortality of endometrial cancer have increased worldwide. Early diagnosis, surgery, and chemotherapy reduce endometrial cancer mortality. However, there is a subset of low-grade, early-stage, well-differentiated endometrioid tumors in which unexpected recurrences and poor outcomes do occur. For women diagnosed with a clinically aggressive histologic subtype of the disease, such as serous carcinoma, clinical outcomes worsen considerably with recurrent or advanced disease (1,2).
Traditional endometrial cancer typing (Bokhman) has limited predictive value for prognosis because it does not consider the genetic variation and heterogeneity of the tumor. The Cancer Genome Atlas (TCGA) published a comprehensive genomic study of serous and endometrioid histotypes and reported four genomic subtypes: polymerase-epsilon (POLE), microsatellite instability (MSI), copy-number low and copy-number high (3). Intracellular mutation accumulation caused by POLE mutations or MSI leads to high tumor mutational burden (TMB), increased expression of new antigens and abundant tumor-infiltrating lymphocytes, which result in sensitivity to immune checkpoint inhibitors (4).
Immunotherapy, the most promising research direction in cancer treatment, regulates the tumor immune microenvironment to kill tumor cells by activating or regulating the body's immune system. Multiple clinical trials have shown that patients with POLE-hypermutated or MSI tumors are potential beneficiaries of programmed cell death protein 1 (PD-1) and programmed death-ligand 1 (PD-L1) immunotherapy in endometrial cancer (3,5,6). The expression of PD-L1 is typically detected by immunohistochemistry to predict the effect of immunotherapy for endometrial cancer. However, some clinical studies have indicated that patients with positive PD-1 expression do not respond to PD-1/PD-L1 immunotherapy (6,7). Consistent with this, recent research by Dou et al. found that the efficiency of antigen expression varies greatly in MSI-type endometrial cancer, and that down-regulation of antigen presentation ability would alter the effect of immunotherapy (7,8). There is an urgent need to identify new targets for diagnosis and treatment to improve the effect of immunotherapy in endometrial cancer.
Epigenetics is the study of heritable phenotype changes, most often involving changes that affect gene activity and expression, but do not involve DNA sequence alterations. Due to epigenetic changes, tumor cell immunogenicity and immune recognition mechanisms are destroyed (9,11). In addition, epigenetic silencing affects antigen processing and presentation (12). Chromatin remodeling is an important part of epigenetics. Impaired chromatin remodeling leads to the accumulation of epigenetic abnormalities.
Several studies have found that, in malignant tumors, genes related to chromatin remodeling have a high mutation frequency and play an important role in tumor immune escape (13,14). The adenomatous polyposis coli (APC) gene, located on chromosome 5q21, is a chromatin remodeling-related gene and a typical tumor suppressor. APC protein is involved in the modification of transcription activation and cell cycle regulation. APC has an oligomerization domain, a 15- or 20-residue repeat domain important for binding to β-catenin, SAMP repeats for axin binding, a basic domain for microtubule binding and C-terminal domains that bind to EB1 and DLG proteins (15). The basic region and the C-terminal region, combined with microtubules, interact with EB1 to promote chromosome aggregation (16). APC inactivation leads to the loss of spindle function in mitosis and to instability of the genome and chromosomes (17). Aberrant structure or expression of APC has been reported to be associated with various cancers. For example, high APC expression is an unfavorable prognostic factor for T4 gastric cancer and may be used as a novel biomarker for pathogenesis research, diagnosis, and treatment of gastric cancer (18). Loss of APC function leading to Wnt/β-catenin signaling hyper-activation is considered one of the driving forces of colorectal cancer tumorigenesis (19). APC mutation occurs in 20-45% of endometrial cancers (20), and the methylation of APC is associated with endometrial cancer occurrence (21). Recent studies have shown that APC mutation can induce endometrial hyperplasia and endometrial cancer by preventing estrogen signal transduction in the endometrial epithelium (22). Therefore, we speculated that APC may have an important role in the pathogenesis and clinical progression of endometrial cancer.
The objective of this study was to demonstrate that APC could serve as a new target for the diagnosis and treatment of endometrial cancer. We used data from TCGA and tumor samples from an endometrial cancer cohort to characterize endometrial cancer with and without the APC mutation. Our findings support the view that APC may play an important role in the immunotherapy of endometrial cancer.

Research data source
We randomly selected 100 endometrial cancer tumors from 1000 in the endometrial cancer cohort biorepository of Shanghai First Maternity and Infant Hospital. Specimens were kept on dry ice to maintain specimen integrity and then cryo-pulverized. The cryo-pulverized specimen and paired blood sample were prepared for DNA isolation for molecular characterization. Because one sample failed the quality test, a total of 99 endometrial cancer patients were enrolled in this study, with an average age of 58 years. The Chinese cohort included 88 cases of endometrioid adenocarcinoma, 6 cases of uterine carcinosarcoma, 3 cases of serous carcinoma and 2 cases of other types of endometrial cancer. Among them, 74.2% of patients were grade I, followed by 19.1% grade II and 6.7% grade III. Most of the samples were early-stage (71.6% stage I and 10.1% stage II), and 18.1% were late-stage. The data from TCGA, including high-throughput RNA sequencing (RNA-seq) and clinical follow-up information for uterine corpus endometrioid carcinoma, were downloaded from TCGA on May 31, 2020. Samples from TCGA with more than 30 days of follow-up were screened for clinical follow-up data to further match the RNA-seq expression profile.

DNA extraction and fragmentation
DNA isolation and subsequent sequencing procedures were performed in the laboratory of Burning Rock Biotech (Guangzhou, China), which is accredited and certified by the College of American Pathologists and Clinical Laboratory Improvement Amendments. Genomic DNA (gDNA) was extracted from tumor tissues and white blood cells with the QIAamp DNA Blood Mini Kit (Qiagen, Hilden, Germany). A Qubit fluorometer with the dsDNA high-sensitivity assay kit (Life Technologies, Carlsbad, CA, USA) was used to measure DNA quality following the manufacturer's instructions. DNA fragmentation was performed using a Covaris M220 Focused-ultrasonicator (Woburn, MA, USA), followed by end repair, phosphorylation, and adaptor ligation.
Fragments of 200 to 400 bp were selected using AMPure beads (Agencourt AMPure XP Kit, Beckman Coulter, Brea, CA, USA). Subsequently, hybridization with capture probe baits, hybrid selection with magnetic beads, and PCR amplification were performed. A high-sensitivity DNA assay was then performed to assess the quality and size of the DNA fragments (Agilent 2100 Bioanalyzer instrument, Agilent, Santa Clara, CA, USA).

Next-generation sequencing (NGS) library preparation and capture-based targeted DNA sequencing
The NGS library was constructed for the DNA isolated from tumor tissues and white blood cells according to an optimized protocol. A minimum of 50 ng of DNA was required for NGS library construction. Target capture was performed using a commercially available panel of 520 cancer-related genes (OncoScreen Plus, Burning Rock Biotech, Guangzhou, China), spanning 1.64 Mb of the human genome. This panel detects single nucleotide variants, insertion-deletions, copy number variations (CNV), and structural variations of genes that are clinically relevant to cancer. The average sequencing depth was 1,000X for tissue DNA. Indexed samples were sequenced on a NextSeq 500 sequencer (Illumina, Inc., San Diego, CA, USA) with paired-end reads.
Sequencing data were analyzed using proprietary computational algorithms optimized for somatic variant calling as described previously (23,24).

Sequencing data analysis
The sequencing data in FASTQ format were mapped to the human genome (hg19) using Burrows-Wheeler Aligner v.0.7.10 (25). Local alignment optimization and variant calling were performed using GATK v.3.2 (26) and VarScan v.2.4.3 (27), respectively. Variants were filtered using the VarScan fpfilter pipeline; loci with sequencing depth less than 100 were eliminated. Matched white blood cells were used to filter out germline mutations. Variant calling in plasma and tissue samples required at least 8 supporting reads for single nucleotide variants and 2 and 5 supporting reads for insertions and deletions, respectively. Variants with population frequency over 0.1% in the ExAC, 1000 Genomes, dbSNP or ESP6500SI-V2 databases were grouped as single nucleotide polymorphisms and excluded from further analysis. The remaining variants were annotated with ANNOVAR (2016-02-01 release) (28) and SnpEff v.3.6 (29). DNA translocation analysis was performed using Factera v.1.4.3 (30). Gene-level CNV was assessed using an in-house-developed algorithm based on a statistic after normalizing read depth at each region by the total read number and region size and correcting the GC-bias using a LOESS algorithm. CNV was called if the coverage data of the gene region was quantitatively and statistically significantly different from its reference control. The limit of detection for CNVs was 1.5 for copy number deletions and 2.64 for copy number amplifications. TMB was calculated as the ratio of mutation count to the size of the coding region of the panel (1.26 Mb), excluding CNV, fusions, large genomic rearrangements and mutations occurring in the kinase domains of EGFR and ALK. The MSI phenotype detection method used a read-count-distribution-based approach according to a previously published protocol and algorithm (31).
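The filtering and TMB rules described above can be illustrated with a minimal Python sketch. It is not the proprietary pipeline used in the study: the `Variant` record, its field names and the variant-type labels are hypothetical, and only the thresholds (depth of at least 100, population frequency of at most 0.1%, a 1.26 Mb coding region, and the listed exclusions) are taken from the text.

```python
from dataclasses import dataclass

# Thresholds taken from the Methods text.
PANEL_CODING_MB = 1.26      # size of the panel coding region, in Mb
MIN_DEPTH = 100             # loci with depth below this are eliminated
MAX_POP_FREQ = 0.001        # variants above 0.1% population frequency are treated as SNPs

# Hypothetical labels for the variant classes excluded from TMB.
EXCLUDED_KINDS = {"CNV", "fusion", "LGR"}
KINASE_DOMAIN_GENES = {"EGFR", "ALK"}


@dataclass
class Variant:
    """Hypothetical minimal variant record used only for this illustration."""
    gene: str
    kind: str                 # e.g. "SNV", "indel", "CNV", "fusion", "LGR"
    depth: int                # sequencing depth at the locus
    pop_freq: float           # highest population frequency across databases
    in_kinase_domain: bool = False


def passes_filters(v: Variant) -> bool:
    """Apply the depth and population-frequency filters."""
    return v.depth >= MIN_DEPTH and v.pop_freq <= MAX_POP_FREQ


def counts_toward_tmb(v: Variant) -> bool:
    """Exclude CNV, fusions, large rearrangements and EGFR/ALK kinase-domain mutations."""
    if v.kind in EXCLUDED_KINDS:
        return False
    if v.gene in KINASE_DOMAIN_GENES and v.in_kinase_domain:
        return False
    return True


def tumor_mutational_burden(variants, panel_mb: float = PANEL_CODING_MB) -> float:
    """TMB in mutations per Mb: eligible mutation count divided by coding region size."""
    count = sum(1 for v in variants if passes_filters(v) and counts_toward_tmb(v))
    return count / panel_mb
```

For example, a sample with 50 eligible mutations would have a TMB of 50 / 1.26, or roughly 39.7 mutations per Mb under these assumptions.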

Immunohistochemistry
Paraffin-embedded tissue sections (4 µm) of the endometrial cancer samples were processed for immunohistochemistry. First, specimens were deparaffinized and dehydrated, and then sections were stained with anti-APC antibody from Abcam (1:100 dilution); anti-PD-L1 antibody from Ventana (prediluted); anti-CD3 and anti-CD8 antibodies from Springs (1:200 dilution); and anti-MLH1, anti-MSH2, anti-MSH6, and anti-PMS2 antibodies from Maixin (1:100 dilution). After washing, the sections were incubated with biotin-conjugated secondary antibodies and subsequently with streptavidin-HRP. The sections were finally visualized by incubation with 3,3′-diaminobenzidine substrate. Images were obtained with the Mantra System (PerkinElmer, Waltham, Massachusetts, USA) with identical exposure times. The integrated optical density was used to quantify the protein levels of APC, PD-L1, CD3, CD8, MLH1, MSH2, MSH6 and PMS2 in tumor tissue; it was calculated as the staining intensity divided by the staining area (brown-stained area).

Statistics
Categorical data were described by frequency and percentage. Quantitative variables were expressed as means ± SEM. Fisher's exact test or the Chi-square test was performed to compare categorical variables. Student's t-test was used to compare continuous variables between two groups. Pearson's correlation coefficient was used to assess correlations. All data were analyzed with the R statistics package (R version 3.5.3; R: The R-Project for Statistical Computing, Vienna, Austria). All statistical tests were two-sided, and P-values < 0.05 were considered significant. The Kaplan-Meier (KM) method in the R package survival was used to estimate the survival rate, and Cox proportional hazards regression was used to calculate hazard ratios.
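To make the Kaplan-Meier estimate mentioned above concrete, the following is a small Python stand-in for the product-limit estimator. It is not the R survival code used in the study; the function name `kaplan_meier` and the input layout (follow-up times plus an event flag of 1 for death and 0 for censoring) are assumptions made for illustration.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  : follow-up time for each patient
    events : 1 if the event (death) was observed, 0 if the patient was censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past time t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        n_at_risk -= leaving
    return curve
```

Censored patients contribute to the number at risk until their last follow-up but never produce a step in the curve, which is why short follow-up records still carry information.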

Comprehensive analysis of gene mutation in endometrial cancer
The clinical and pathological characteristics of the 99 tumors from the cohort biorepository are summarized in Table S1. We used tissues and peripheral blood to detect gene mutations. The sequencing results showed that the most frequently mutated genes in somatic cells were PTEN, PIK3CA and ARID1A, with mutation rates of 82%, 62% and 53%, respectively. DNA damage repair (DDR) gene mutations were detected in 94% of samples (Fig. 1A). We also found that 17% of tumors had high levels of MSI, and the tumors with MSI had higher TMB (mean TMB: 53.25 mut/Mb in MSI, 37.37 mut/Mb in microsatellite-stable tumors; Fig. 1B). In addition, we found that TMB was significantly higher in tumors with the POLE mutation and in those with MSI than in those without (both p < 0.01; Fig. 1C). In our study, the total mutation rate of mismatch repair genes, including MSH2, MSH6, MLH1 and PMS2, was 26% (Fig. 1D). By analyzing the genes mutated in more than 10% of samples, we found that most of the mutated genes were related to chromatin status, including covalent modification of chromatin, chromosome organization and chromatin organization (Fig. 1E).

Immune microenvironment correlated with chromatin status related genes
To determine the effect of chromatin-related gene mutations on endometrial cancer, we explored the correlation between gene mutations and the tumor immune microenvironment. From the mutated chromatin-related genes, we identified 12 mutant genes that were significantly positively correlated with PD-L1 expression (R > 0.25; p < 0.05) or TMB (R > 0.8; p < 0.05). Mutations of these 12 genes (STAG2, TAF1, ARID1A, KMT2C, KMT2D, APC, KMT2A, JAK2, BRCA2, PRKDC, ATRX, BCORL1) were detected in 66% of the endometrial cancer samples (Fig. 2A-B). The 66 samples that contained these mutated genes comprised the discovery set, and the remaining 33 samples were used as the control set. Next, we compared the discovery set with the control set in terms of TMB, PD-L1 expression and lymphocytic infiltration. The mean TMB of the discovery set (58.57 mut/Mb, log2TMB = 4.45; SD = 1.96) was significantly higher than that of the control set (4.92 mut/Mb, log2TMB = 2.16; SD = 0.64; p < 0.001; Fig. 2C). We then quantified the expression of PD-L1 in tumor cells and immune cells. PD-L1 expression was significantly higher in the discovery set than in the control set in both tumor cells and immune cells (both p = 0.001), especially in the immune cells (PDL1.TC: 0.074 in control and 0.130 in discovery; PDL1.IC: 0.225 in control and 0.378 in discovery; Fig. 2D). When comparing the discovery set and the control set, we found no significant difference in lymphocytic infiltration for CD3+ T cells (p = 0.11) or CD8+ T cells (p = 0.07; Fig. 2E).
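The correlation-based screening described above can be sketched in Python as follows. This is only an illustration under stated assumptions: mutation status is encoded as 0/1 per sample, the function names `pearson_r` and `screen_genes` and the example gene labels are hypothetical, and the p < 0.05 significance test applied in the study is omitted here, so only the correlation thresholds (R > 0.25 for PD-L1, R > 0.8 for TMB) are reproduced.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A constant variable has no defined correlation; treat it as 0 here.
        return 0.0
    return sxy / sqrt(sxx * syy)


def screen_genes(mutations, pdl1, tmb, r_pdl1=0.25, r_tmb=0.8):
    """Keep genes whose mutation status correlates with PD-L1 expression or TMB.

    mutations : dict mapping gene name -> list of 0/1 mutation status per sample
    pdl1, tmb : per-sample PD-L1 expression and TMB, in the same sample order
    Note: the significance test (p < 0.05) is not implemented in this sketch.
    """
    selected = []
    for gene, status in mutations.items():
        if pearson_r(status, pdl1) > r_pdl1 or pearson_r(status, tmb) > r_tmb:
            selected.append(gene)
    return selected


# Toy data with hypothetical gene labels, four samples.
example_mutations = {"GENE_A": [1, 1, 0, 0], "GENE_B": [0, 1, 0, 1]}
example_pdl1 = [0.4, 0.3, 0.1, 0.1]
example_tmb = [50, 60, 5, 4]
print(screen_genes(example_mutations, example_pdl1, example_tmb))
```

Samples carrying a mutation in any selected gene would then form the discovery set, and the remaining samples the control set.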

Chromatin remodeling-related gene APC affects the immune microenvironment
To evaluate the effects of the 12 specific gene mutations, we explored the immune microenvironment, including TMB, PD-L1 expression and lymphocytic infiltration, in endometrial cancer (Fig. 3A). Four genes (KMT2C, APC, KMT2A, JAK2) were selected on the basis of significance and relevance. The mutation rates were 18% for KMT2C, 18% for APC, 17% for KMT2A, and 11% for JAK2 (Fig. 3B). Unlike the other three genes, mutations of the chromatin remodeling-related gene APC were associated with significant increases in TMB, PD-L1 expression and CD3+ T cell infiltration (all p < 0.02; Fig. 3C-E). To further verify the impact of APC mutations on the immune microenvironment, we analyzed the gene mutations of endometrial cancer from TCGA, which yielded similar results (Fig. 4A-C).

Chromatin remodeling-related gene APC expression in endometrial cancer
To further investigate the expression of APC as an endometrial cancer-specific immunotherapy marker, we assessed the protein levels of APC, MLH1, MSH2, MSH6, PMS2, PD-L1, CD3 and CD8 by immunohistochemistry using tissue from 99 endometrial cancer tumors. We quantified staining with the integrated optical density value, which combined the staining intensity and the percentage of positive cells. In endometrial cancer, most APC mutations are inactivating mutations that lead to reduced protein levels (22). In our study, low expression of APC was accompanied by high levels of PD-L1 expression and increased infiltration of CD3+ and CD8+ T cells. To evaluate MSI, we quantified the expression levels of MLH1, MSH2, MSH6 and PMS2 by immunohistochemistry. The results indicated that samples with APC-inactivating mutations were of the MSI type, which was consistent with gene sequencing (Fig. 5A-B). The results of the survival analysis among the 526 TCGA samples suggested that the APC mutation was associated with longer survival (p = 3.5e-06; Fig. 5C).

Discussion
We analyzed endometrial cancer samples from our cohort by using a commercial panel of 520 genes that are closely associated with cancer pathogenesis. We comprehensively analyzed the mutation profiles of endometrial cancer and confirmed the somatic and germline mutations described at the genomic level in a previous study (3). In recent years, PD-1/PD-L1 treatment that blocks immune checkpoints has developed rapidly and attracted broad attention. TMB, MSI and PD-L1 expression have all been reported as markers for PD-1/PD-L1 immunotherapy (4,6,32). However, because clinical trials have produced conflicting results (7), the challenge has been to identify markers that accurately predict the efficacy of immunotherapy in endometrial cancer. Our results revealed that the samples characterized by MSI or POLE mutation were accompanied by an increase in TMB (Fig. 1B-C). Beyond that, clustering analysis of gene mutation profiles showed that 94% of the samples had mutations of DDR genes, and that genes mutated in more than 10% of samples are related to chromatin status (Fig. 1D-E). These findings suggested that mutations of chromatin state-related genes may have an impact on the immune microenvironment of endometrial cancer.
Feature selection is a data preprocessing technique that has been widely used in many bioinformatics applications (33). Here, we modeled marker discovery as an approach to select the best feature subset for the immunotherapy of endometrial tumors. It is not easy to choose a reliable feature subset due to the multi-dimensionality of QIAamp DNA FFPE tissue kit data. Therefore, we designed a cross feature selection schema, based on the correlation of chromatin status-related genes with PD-L1 expression or TMB, to select a reliable subset (Fig. 2A-2B). Filters may lead to locally optimal sets rather than the most discriminative subset, which may make it impossible to find diagnostic markers with high sensitivity and specificity. We therefore further analyzed the TMB level, PD-L1 expression and lymphocytic infiltration of the selected subset, which contained 12 chromatin state-related genes, to verify its reliability (Fig. 2C-2E). The results indicated that the cross method performed well for identifying a discovery set of genes associated with the immune microenvironment in endometrial cancer, except for lymphocytic infiltration.
As cancer treatments, immunotherapy approaches have been highly successful but can be affected by the immune microenvironment. Our results indicate that the gene APC may serve as a new marker for assessing the impact on the immune microenvironment (Fig. 3A-3B). By analyzing the relationship between individual mutant genes in the discovery set and the immune microenvironment, we found that APC mutations were significantly associated with the immune microenvironment, including TMB, PD-L1 expression and lymphocytic infiltration (Fig. 3C-3E). Our results showed that APC mutations might indicate elevated TMB, PD-L1 expression and lymphocytic infiltration, which may help identify patients who may benefit from immunotherapy. Moreover, we confirmed our findings regarding the effect of the APC mutation on the immune microenvironment using TCGA (Fig. 4A-4C). Further biological validation and clinical trials are needed to evaluate the clinical significance of the chromatin status-related gene APC, along with studies to understand the mechanism by which APC regulates the immune microenvironment.
In our study, the significantly decreased expression of APC was further confirmed in human endometrial cancer tissues by immunohistochemistry (Fig. 5A-5B). The APC mutation may contribute to increased expression of new antigens and abundant tumor-infiltrating lymphocytes, which may result in sensitivity to immune checkpoint inhibitors. This may explain why patients with APC mutation have a more favorable prognosis (Fig. 5C).

Conclusion
In summary, immune therapy, including checkpoint inhibition and tumor vaccination, plays a crucial role in cancer treatment. The sensitivity of existing target molecules is insufficient, such that a significant proportion of patients fail to respond to immunotherapy. Our results suggest that APC may affect the immune microenvironment of endometrial cancer; thus, patients with APC mutations may be more sensitive to immunotherapy. Consequently, this information could improve selection of endometrial cancer patients who will have a better response to immune checkpoint therapy and thus a better prognosis. Although our results are observational, they provide the basis for multiple hypotheses of clinical relevance that can and should be further explored by the scientific community.

Consent for publication
The authors confirm that we have obtained written consent from the patients to publish this manuscript.

Availability of data and material
The datasets used and/or analyzed during the present study are available from the corresponding author on reasonable request.

Competing interests
The authors have declared that no conflict of interest exists.

Authors' contributions
YRL carried out the molecular genetic studies and immunoassays, participated in the sequence alignment, performed the statistical analysis, and drafted the manuscript. KW participated in the design of the study and the sequence alignment. XPW conceived of the study, participated in its design and coordination, and helped to draft the manuscript. All authors read and approved the final manuscript.

Figure legend. A. Specific mutated genes screened by significant correlation with PD-L1 (programmed death-ligand 1) expression or TMB (tumor mutational burden); Pearson's correlation coefficient was used to assess correlations. B. Mutations of the 12 specific mutated genes identified as the discovery set. C. Relationship between the discovery set and TMB. D. Relationship between the discovery set and PD-L1 expression. E. Relationship between the discovery set and CD3+ and CD8+ T cell infiltration.

Figure legend. Analysis of TCGA. A. Relationship between APC and TMB. B. Relationship between APC and PD-L1 expression. C. Relationship between APC and CD3+ and CD8+ T cell infiltration.