Characterization of Leading Dysregulated Plasma-Proteome Associated Genes in Patients with Gastro-Esophageal Cancers


Background: Gastro-esophageal cancers are among the major causes of cancer-related death worldwide. Novel biomarkers are needed in the management of gastro-esophageal cancers to identify new therapeutic targets and to predict response to the available therapies. Our study aims to identify the leading genes that are dysregulated (upregulated or downregulated) in patients with gastro-esophageal cancers.
Methods: Using The Cancer Genome Atlas (TCGA), we examined expression data for genes whose protein products can be detected in the plasma in 600 independent tumor samples and 46 matching normal tissue samples. The non-parametric Mann-Whitney U test was used to evaluate differential expression of genes, with a cut-off of p < 0.05.
Results: The comparison between tumor samples and healthy tissue showed that BIRC5 (p = 2.61E-08), APOC2 (p = 3.23E-08), CENPF (p = 4.38E-08), STMN1 (p = 5.74E-08), and HNRPC (p = 8.21E-08) were the leading genes significantly overexpressed in esophageal cancer, whereas CST1 (p = 3.97E-21), INHBA (p = 9.22E-20), ACAN (p = 1.08E-19), HSP90AB1 (p = 2.62E-19), and HSPD1 (p = 3.91E-19) were the leading genes overexpressed in stomach cancer. Conversely, C16orf89 (p = 9.78E-08), AR (p = 1.01E-07), CKB (p = 1.17E-07), ADH1B (p = 1.79E-07), and NCAM1 (p = 2.15E-07) were the leading genes significantly downregulated in esophageal cancer, whereas GPX3 (p = 1.65E-19), CLEC3B (p = 5.70E-19), CFD (p = 5.68E-18), GSN (p = 4.51E-17), and CCL14 (p = 1.12E-16) were significantly downregulated in stomach cancer. Furthermore, stage-based examination showed stage-specific differential expression of various genes, as well as progressive stage-wise upregulation or downregulation of selected genes.
Conclusions: The present study identified leading upregulated and downregulated genes in gastro-esophageal cancers whose products can be detected in the plasma proteome. These genes have the potential to become diagnostic and therapeutic biomarkers for early detection of cancer, for detection of recurrence following surgery, and for development of targeted treatment.


Introduction
Cancers of the stomach and esophagus, or gastro-esophageal (GE) cancers, are highly aggressive and are among the major causes of cancer-related death worldwide. Stomach cancer is the fifth most common cancer and the third leading cause of cancer-related death worldwide. In 2018, more than 1 million new cases of stomach cancer were diagnosed and about 783,000 people died from it (1). Likewise, esophageal cancer is the seventh most common cancer and the sixth leading cause of cancer-related death. Each year, more than 500,000 new cases of esophageal cancer are diagnosed and about 509,000 people die from it (1). Despite improvements in surgical and radiation treatments and the availability of newer agents, the prognosis of patients with recurrent GE cancers remains very poor (2)(3)(4). Novel strategies to improve current therapy are therefore vital in the management of GE cancers.
It is well known that cancer development and progression are triggered by altered activities and dysregulated expression of genes that control cell proliferation and differentiation (5). Comparative assessment of genetic aberrations between cancerous and matched normal control tissues has facilitated identification of new biomarkers that may also serve as new therapeutic targets or predict various cancer-related outcomes. Several known biomarkers are associated with tumorigenesis or have prognostic and predictive value in patients with stomach and esophageal cancers (6)(7)(8)(9)(10). For example, approximately 20% of stomach and gastroesophageal junction cancers are associated with amplification of the HER2 gene, which is an important therapeutic target and predicts response to trastuzumab (8). Most biomarkers, such as TP53 or CDH1, however, have limited therapeutic and predictive value, and their presence or absence does not alter treatment strategies (9). There is a strong unmet need for novel biomarkers in the management of GE cancers to identify new therapeutic targets and to predict response to the available therapies.
We conducted this study by intersecting the gene expression profiles from The Cancer Genome Atlas (TCGA) with the plasma proteome database to identify leading genes that are dysregulated (upregulated or downregulated) in patients with esophageal and stomach cancers (11,12). The purpose of these analyses is to identify differentially expressed, novel tumor-specific genes that code for plasma proteins and to use this information to develop blood-based prognostic biomarker studies in the near future.

TCGA Gene expression Analyses
We obtained the level-3 HiSeq RSEM gene-normalized RNA-seq gene expression data for stomach adenocarcinoma (STAD) and esophageal carcinoma (ESCA) from the TCGA database (11). Overall, gene expression data were available for 415 independent tumor samples and 35 matching normal tissue samples for STAD, and for 185 tumor samples and 11 matching normal tissue samples for ESCA. We also downloaded the plasma proteome database from http://www.plasmaproteomedatabase.org. The database contained information on 1241 protein-coding genes, 1232 of which mapped to gene expression profiles from TCGA. To analyze plasma proteome genes, we used the non-parametric Mann-Whitney U test to identify genes that are expressed at significantly different levels (p < 0.05) in cancerous and normal tissues. The deregulated genes were then grouped according to tumor stage based on the available patient data. This allowed identification of genes whose expression was significantly increased or decreased at multiple stages of cancer.
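The screening step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the gene names and expression values are invented toy data, the function names (`average_ranks`, `mann_whitney_u`, `screen_plasma_genes`) are hypothetical, and the p-value uses the normal approximation with tie and continuity corrections so that the example needs only the Python standard library. In practice a statistics package (for example `scipy.stats.mannwhitneyu`) would likely be used instead.

```python
import math
from statistics import median


def average_ranks(values):
    """Return 1-based average ranks plus the size of every tie group."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie and continuity corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks, tie_sizes = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    tie_term = sum(t**3 - t for t in tie_sizes) / (n * (n - 1))
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term)
    if var_u <= 0:  # all values identical: no evidence of a difference
        return u1, 1.0
    z = max(abs(u1 - n1 * n2 / 2) - 0.5, 0.0) / math.sqrt(var_u)
    return u1, math.erfc(z / math.sqrt(2))  # two-sided tail of the standard normal


def screen_plasma_genes(tumor, normal, plasma_genes, alpha=0.05):
    """Test every plasma-proteome gene present in both groups; return hits sorted by p-value."""
    hits = []
    for gene in sorted(set(plasma_genes) & set(tumor) & set(normal)):
        _, p = mann_whitney_u(tumor[gene], normal[gene])
        if p < alpha:
            direction = "up" if median(tumor[gene]) > median(normal[gene]) else "down"
            hits.append((gene, direction, p))
    return sorted(hits, key=lambda hit: hit[2])


# Toy expression values (hypothetical, not TCGA data).
tumor = {
    "GENE_A": [10, 12, 11, 13, 14, 15, 12, 16],
    "GENE_B": [5, 6, 5, 7, 6, 5, 6, 7],
    "GENE_C": [20, 21, 22, 23, 24, 25, 26, 27],
}
normal = {
    "GENE_A": [1, 2, 3, 2, 1, 2],
    "GENE_B": [5, 6, 7, 5, 6, 6],
    "GENE_C": [1, 1, 2, 2, 3, 3],
}
plasma_genes = {"GENE_A", "GENE_B"}  # GENE_C is not in the plasma list, so it is skipped

hits = screen_plasma_genes(tumor, normal, plasma_genes)
for gene, direction, p in hits:
    print(f"{gene}\t{direction}\tp={p:.2e}")
```

Intersecting the three gene sets before testing mirrors the restriction to genes that appear both in the plasma proteome database and in the TCGA expression profiles. The sketch applies the raw p < 0.05 cut-off stated in the text and performs no multiple-testing correction.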

Leading upregulated genes in GE cancers
We examined the expression of 1232 protein-coding genes detected in the plasma proteome in tumors of 185 patients with esophageal cancer and in the matching tissue of 11 subjects with no cancer. Among cancer patients, 18 (9.7%) had stage I tumors, 78 (42.2%) had stage II disease, 56 (30.3%) had stage III disease, and 9 (4.9%) had stage IV cancer. In 24 (13%) patients, cancer stage was not known. The comparison between esophageal tumors and healthy tissue showed BIRC5 (p = 2.61E-08), APOC2 (p = 3.23E-08), CENPF (p = 4.38E-08), STMN1 (p = 5.74E-08), and HNRPC (p = 8.21E-08) to be the five leading genes overexpressed in esophageal cancer (Fig. 1A). The stage-based assessment of overexpressed genes showed significant overexpression of BIRC5, APOC2, CENPF, STMN1, and HNRPC across all cancer stages, including early, locally advanced, and metastatic esophageal tumors (Fig. 1B). In stomach cancer, CST1 (p = 3.97E-21), INHBA (p = 9.22E-20), ACAN (p = 1.08E-19), HSP90AB1 (p = 2.62E-19), and HSPD1 (p = 3.91E-19) were the five leading genes that were overexpressed (Fig. 1C). The stage-based assessment of overexpressed genes showed significant upregulation of CST1, INHBA, ACAN, HSP90AB1, and HSPD1 across all stages, including early, locally advanced, and metastatic stomach cancer (Fig. 1D). The p-values for each gene compared to normal samples and the number of samples in each stage are provided in supplementary table 2.

Stage-specific upregulation of genes in GE cancers
In this analysis, we examined the pattern of gene expression according to the specific stage of disease in GE cancers. In addition to the five overexpressed genes reported above, the stage-based analysis showed differential expression of the following genes depending on the stage of the disease. Patients with stage I esophageal cancer had significantly higher expression of CPS1 (p = 0.003), PNP (p = 0.007), SERPINBB (p = 0.042), and EHD1 (p = 0.046). In patients with stage II disease, MSN (p = 0.003), KRTS (p = 0.004), TNC (p = 0.007), and NAP1L4 were overexpressed compared with other stages of the disease. In patients with stage IV esophageal cancer, CYCS (p = 0.014), PON3 (p = 0.14), ACPP (p = 0.047), and RPL22 (p = 0.047) were significantly upregulated compared with patients with early-stage cancer. We did not find a stage-specific upregulated gene in stage III esophageal cancer (Fig. 2A). The p-values for each gene compared to normal samples are provided in supplementary table 3.
Likewise, the stage-based analysis in patients with stomach cancer showed differential expression of several genes at specific stages of the disease. Patients with stage I, II, III, and IV stomach cancer had significantly higher expression of PYGB (p = 0.043), TNF (p = 0.02), HLA-A (p = 0.05), and EFNB2 (p = 0.001), respectively (Fig. 2B). The p-values for each gene compared to normal samples are provided in supplementary table 4.
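A one-stage-versus-other-stages comparison of the kind reported above could be sketched as follows. The text does not say which test was used for these stage-specific comparisons, so this example swaps in a simple label-permutation test on the difference in mean expression; that choice, the function name `stage_vs_rest_pvalue`, and the toy values are all assumptions made for illustration.

```python
import random
from statistics import mean


def stage_vs_rest_pvalue(expr_by_stage, stage, n_perm=5000, seed=0):
    """One-sided permutation p-value that `stage` has higher mean expression
    than all other stages pooled together."""
    in_stage = list(expr_by_stage[stage])
    rest = [v for s, vals in expr_by_stage.items() if s != stage for v in vals]
    observed = mean(in_stage) - mean(rest)
    pooled = in_stage + rest
    k = len(in_stage)
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign stage labels at random
        if mean(pooled[:k]) - mean(pooled[k:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0


# Toy expression of one hypothetical gene, grouped by tumor stage.
gene_s = {
    "I": [9.1, 8.7, 9.4, 9.0, 8.9],
    "II": [5.0, 5.3, 4.8, 5.1, 5.2, 4.9],
    "III": [5.1, 4.7, 5.0, 5.2, 4.9],
    "IV": [5.0, 4.8, 5.3],
}
p_stage_i = stage_vs_rest_pvalue(gene_s, "I")
p_stage_ii = stage_vs_rest_pvalue(gene_s, "II")
print(f"stage I vs rest: p={p_stage_i:.4f}; stage II vs rest: p={p_stage_ii:.4f}")
```

In this toy example only stage I stands out from the remaining stages, which is the pattern a stage-specific gene would be expected to show.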

Stage-wise increasing overexpression of leading genes in GE cancers
In this analysis, we examined whether the expression of certain genes intensifies in parallel with the progression of disease. The analysis showed a stage-wise increasing overexpression of ANGPT2, APOC2, CXCL5, HIST1H1E, IL17A, IL2RA, IL8, OSM, PF4V1, and SAA4 in esophageal cancer. Among them, the gradual increase from early-stage cancer to more advanced stages was strongest for APOC2 and IL8, followed by SAA4 and OSM, whereas HIST1H1E, PF4V1, CXCL5, and IL2RA showed limited stage-wise upregulation (Fig. 3A).
In stomach cancer, the stage-wise analysis showed a gradually increasing overexpression of the following genes: AGRN, CETP, FGL1, HABP2, MDK, OSMR, RNASE2, SELE, SERPINE1, and VCAN. Among them, the upregulation from early-stage cancer to more advanced stages was strongest for RNASE2, SERPINE1, and CETP (Fig. 3B).
In patients with stomach cancer, the following five genes were most significantly downregulated: GPX3 (p = 1.65E-19), CLEC3B (p = 5.70E-19), CFD (p = 5.68E-18), GSN (p = 4.51E-17), and CCL14 (p = 1.12E-16).
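One way to flag stage-wise increasing expression is sketched below. The criterion used here (strictly rising stage medians, ranked by the difference between the last and first stage median) is an assumption chosen for illustration, since the text does not define how the stage-wise trend was scored; the gene names and values are hypothetical.

```python
from statistics import median

STAGE_ORDER = ["I", "II", "III", "IV"]


def stage_medians(expr_by_stage):
    """Median expression per stage, in clinical order, skipping stages with no samples."""
    return [median(expr_by_stage[s]) for s in STAGE_ORDER if expr_by_stage.get(s)]


def is_stagewise_increasing(expr_by_stage):
    """True when every stage median is strictly higher than the previous one."""
    meds = stage_medians(expr_by_stage)
    return len(meds) >= 2 and all(later > earlier for earlier, later in zip(meds, meds[1:]))


def rank_increasing_genes(expr):
    """Genes with a monotonic rise, strongest first (last-stage minus first-stage median)."""
    rising = {}
    for gene, by_stage in expr.items():
        if is_stagewise_increasing(by_stage):
            meds = stage_medians(by_stage)
            rising[gene] = meds[-1] - meds[0]
    return sorted(rising.items(), key=lambda item: item[1], reverse=True)


# Toy data: GENE_X rises steeply, GENE_Y rises slightly, GENE_Z is not monotonic.
expr = {
    "GENE_X": {"I": [2.0, 2.2, 1.9], "II": [3.1, 3.0, 3.3], "III": [4.5, 4.4, 4.8], "IV": [6.0, 6.2]},
    "GENE_Y": {"I": [5.0, 5.1, 4.9], "II": [5.2, 5.3, 5.2], "III": [5.4, 5.3, 5.5], "IV": [5.6, 5.5]},
    "GENE_Z": {"I": [3.0, 3.1, 2.9], "II": [4.0, 4.2, 4.1], "III": [3.5, 3.4, 3.6], "IV": [3.8, 3.9]},
}
ranked = rank_increasing_genes(expr)
for gene, rise in ranked:
    print(f"{gene}\trise in median expression: {rise:.2f}")
```

Medians are used rather than means so that a single outlying sample in a small stage group (such as stage IV) has less influence on the trend.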

Discussion
The investigation described in this manuscript allowed us to characterize the leading genes that are upregulated or downregulated in patients with esophageal and stomach cancers. We specifically focused on genes whose protein products can be detected in the plasma, as recorded in the plasma proteome database. Thus, our investigation has a direct translational impact. Abnormal gene expression plays a pivotal role in tumor development and progression (5). We noted that, compared to normal tissue, BIRC5, CENPF, STMN1, APOC2, and HNRPC were the five most significantly upregulated genes in esophageal cancer. Furthermore, these genes were also overexpressed in stomach cancer.
The baculoviral IAP repeat containing 5 (BIRC5) gene, also known as survivin, is a member of the inhibitor of apoptosis (IAP) family, whose genes encode regulatory proteins that prevent apoptotic cell death. Survivin localizes to the mitotic spindle and participates in regulating mitosis. In addition to GE cancers, it is highly expressed in various other malignancies and is associated with inferior outcomes, including shorter survival (13)(14)(15). The CENPF gene encodes centromere protein F, which associates with the centromere-kinetochore complex. CENP-F is thought to be a cell cycle-regulated protein that may play a role in chromosome segregation during mitosis. Interestingly, there is evidence that CENPF expression is associated with inferior outcomes in patients with esophageal cancer: patients with lower CENPF expression had a better survival rate than those with higher CENPF expression (16). The CENPF gene is also amplified in other solid tumors, including hepatocellular and breast cancers, and correlates with patients' outcomes (17)(18)(19). As cancer cells undergo active division, the upregulation of genes such as BIRC5 and CENPF could be a direct consequence of active mitosis. STMN1 belongs to the stathmin family of genes and encodes the cytoplasmic phosphoprotein stathmin 1. The encoded protein belongs to the family of microtubule-destabilizing proteins that control the assembly and disassembly of the mitotic spindle and thereby regulate mitosis. As in esophageal cancer, STMN1 is highly expressed in various cancers, including leukemia and breast, prostate, and lung cancer, and is a promising target for cancer therapy (20,21). There is some evidence that it may have prognostic significance in early-stage gastric cancer. For example, a study that evaluated the role of STMN1 in both operable and advanced gastric cancers showed that, in the operable cohort, STMN1 expression correlated with cancer recurrence and resistance to adjuvant therapies (22).
In contrast to the relatively well-known roles of BIRC5, CENPF, and STMN1 in malignancies, the functions of the APOC2 and HNRNPC genes in cancer cells are less well defined. The APOC2 gene encodes a lipid-binding protein that belongs to the apolipoprotein family and is a component of very low-density lipoprotein. This protein activates the enzyme lipoprotein lipase, which hydrolyzes triglycerides. APOC2 mutations can cause hyperlipoproteinemia type IB, characterized by hypertriglyceridemia, xanthomas, and early atherosclerosis (23,24). The HNRNPC gene encodes a protein that belongs to the subfamily of ubiquitously expressed heterogeneous nuclear ribonucleoproteins (hnRNPs). The hnRNPs are RNA-binding proteins that associate with pre-mRNAs in the nucleus. These proteins are involved in pre-mRNA processing and other aspects of mRNA metabolism and transport, along with cell proliferation and differentiation (25). However, the functions of hnRNPs in tumorigenesis and cancer progression in solid and hematological malignancies are not well understood (26).
Our analysis showed that, compared with normal stomach tissue, CST1, INHBA, ACAN, HSP90AB1, and HSPD1 were the five leading genes overexpressed in stomach cancer. Furthermore, these genes were also upregulated in patients with esophageal cancer. The CST1 gene encodes a secretory peptide called Cystatin SN, which is a cysteine proteinase inhibitor. Cysteine proteases are involved in tissue remodeling during development, and they support the migration of cancer cells. CST1 itself is known to promote proliferation, clone formation, and metastasis in breast cancer cells, and high CST1 expression is negatively correlated with breast cancer survival (27). CST1 has also been considered a potential tumor marker in various epithelial malignancies (27,28). Of note, in a study of patients with esophageal squamous cell carcinoma, those whose tumors expressed high levels of Cystatin SN had more favorable survival than those with low Cystatin SN expression (29).
Inhibin-βA (INHBA), a ligand belonging to the transforming growth factor-β superfamily, is associated with cell proliferation in cancer. INHBA is overexpressed in various types of cancer, including esophageal and stomach tumors (30,31). Overexpression of the INHBA gene is considered a useful independent predictor of outcomes in patients with gastric cancer after curative surgery. In patients with stomach cancer, high INHBA expression has been shown to be associated with significantly poorer 5-year overall survival compared with low expression (31).
HSPD1 and HSP90AB1 belong to the heat shock protein (HSP) group and encode chaperonin family proteins (32). HSPD1 encodes a mitochondrial protein that is important for the assembly of proteins imported into the mitochondria and may function as a signaling molecule in the immune system. HSP90AB1 is thought to play a role in gastric apoptosis and inflammation. HSPs control a wide variety of signaling and cellular responses and have been classified into several subfamilies, such as the HSP60s, HSP70s, HSP90s, and HSP100s (33). HSP expression often correlates with patient prognosis in various malignancies (34)(35)(36). For example, HSP60 has been identified as an independent prognostic factor for both overall survival and recurrence-free survival in patients with early-stage stomach cancer (36). The ACAN gene is a member of the aggrecan/versican proteoglycan family. The encoded protein is an integral part of the extracellular matrix in cartilaginous tissue, where it withstands compression. Mutations in this gene may be involved in skeletal dysplasia and spinal degeneration; however, its role in cancer is not well understood.
With respect to downregulated genes, C16orf89, AR, CKB, ADH1B, and NCAM1 were the leading downregulated genes in patients with esophageal cancer. C16orf89 is predominantly expressed in the thyroid gland and is involved in the development and function of the thyroid (37). Its role in tumorigenesis and progression has not yet been elucidated. The androgen receptor (AR) gene encodes the androgen receptor. Once AR binds its hormone ligand testosterone, it translocates into the nucleus and stimulates transcription of androgen-responsive genes (38). In vitro evidence suggests a significant influence of sex hormones on cancer growth (38,39). For example, the AR pathway plays an important role in the development of prostate cancer and various other epithelial malignancies, including those of the bladder, kidney, lung, breast, liver, and ovary (39). However, the role of AR in the development and progression of GE cancers is not known (40). The CKB, or creatine kinase B, gene encodes a cytoplasmic enzyme that is involved in energy homeostasis. Its dysregulation could promote cancer invasiveness and progression (41). As with AR, its disease-modulating effect in GE cancers is unknown. ADH1B encodes the alcohol dehydrogenase 1B enzyme. Evidence suggests that genetic polymorphisms of this enzyme are associated with an increased risk of aerodigestive cancer triggered by alcohol consumption (42). The NCAM1, or neural cell adhesion molecule 1, gene encodes a cell adhesion protein, a member of the immunoglobulin superfamily that is involved in both cell-to-cell and cell-to-matrix interactions. Its downregulation has been linked to cancer progression and development of metastases in gastrointestinal and other malignancies (43).
In patients with stomach cancer, GPX3, CLEC3B, CFD, GSN, and CCL14 were the five most significantly downregulated genes. GPX3 encodes glutathione peroxidase 3, which belongs to a family of selenocysteine-containing redox enzymes that play important roles in cell signaling and immune modulation (44). Promoter hypermethylation and downregulation of GPX3 in melanoma and in stomach, head and neck, cervical, and lung cancers suggest that GPX3 serves as a tumor suppressor in these cancers (44,45). C-type lectin domain family 3 member B (CLEC3B) is a member of the C-type lectin superfamily and encodes tetranectin. Dysregulation of CLEC3B has been reported in various epithelial cancers, including stomach cancer (46,47). Chen and others, using the TCGA database, also noted downregulation of CLEC3B in stomach cancer. However, when they evaluated 328 patients with early-stage stomach cancer, a high intratumoral tetranectin level was significantly associated with tumor invasion, lymph node metastasis, advanced TNM stage, and shorter overall survival (47). CFD, or complement factor D, encodes a serine protease that catalyzes the breakdown of factor B, a rate-limiting step of the alternative pathway of complement activation. An impaired balance of complement activation could promote inflammation and tumorigenesis, resulting in proliferation, migration, invasiveness, and metastasis of malignant cells (48,49). GSN, or the gelsolin gene, encodes a protein that is involved in the assembly and disassembly of actin filaments. Gelsolin has been implicated in prostate tumorigenesis and malignant transformation (50,51). The C-C motif chemokine ligand 14 (CCL14) gene encodes the CCL14 chemokine, which induces targeted cell migration and is thought to play a role in carcinogenesis and metastasis of certain malignancies, including breast cancer (52,53).
Aside from the leading upregulated and downregulated genes in patients with GE cancers, we also noted a stage-wise upregulation of several genes, such as APOC2, IL8, RNASE2, SERPINE1, and CETP, and a stage-wise downregulation of other genes that play important roles in cell growth and survival, including AHNAK, MEGF8, MMRN1, PROC, REG1A, SECTM1, TNFRSF10C, and TPPP3. The stage-related expression of these genes suggests a potential role in disease progression and utility as monitoring markers or therapeutic targets. The cholesteryl ester transfer protein (CETP), for example, maintains cholesterol homeostasis and has been identified as a potential target for estrogen-positive breast cancer (54). Conversely, AHNAK can act as a tumor suppressor gene and mediates the negative regulation of cell growth (55).
While our work provides a significant amount of novel information regarding the behavior of cancer-related molecules in esophageal and stomach tumors, it does not assess the level of gene expression according to the molecular classification of stomach and esophageal cancers (7). Furthermore, information on the histopathology of the esophageal and stomach cancers was not available; therefore, we were not able to segregate the data by histopathology.
In summary, the present study identified leading upregulated and downregulated genes in esophageal and stomach tumors. Since expression of the upregulated genes was minimal in both stomach and esophageal normal tissues, these genes have strong potential to become diagnostic and therapeutic biomarkers for screening and early detection of cancer, for detection of recurrence following surgery, and for anti-cancer therapies. Future studies will be required to validate the diagnostic, therapeutic, and prognostic importance of these genes. Our group plans to prospectively evaluate the prognostic and predictive value of selected genes in a cohort of patients with metastatic gastroesophageal cancer who are treated with combination chemotherapy.