Retrospective review using targeted deep sequencing reveals mutational differences between gastroesophageal junction and gastric carcinomas

Adenocarcinomas of both the gastroesophageal junction and stomach are molecularly complex, but differ with respect to epidemiology, etiology and survival. There are few data directly comparing the frequencies of single nucleotide mutations in cancer-related genes between the two sites. Sequencing of targeted gene panels may be useful in uncovering multiple genomic aberrations using a single test. DNA from 92 gastroesophageal junction and 75 gastric adenocarcinoma resection specimens was extracted from formalin-fixed paraffin-embedded tissue. Targeted deep sequencing of 46 cancer-related genes was performed through emulsion PCR followed by semiconductor-based sequencing. Gastroesophageal junction and gastric carcinomas were contrasted with respect to mutational profiles, immunohistochemistry and in situ hybridization, as well as corresponding clinicopathologic data. Gastroesophageal junction carcinomas were associated with younger age, more frequent intestinal-type histology, more frequent p53 overexpression, and worse disease-free survival on multivariable analysis. Among all cases, 145 mutations were detected in 31 genes. TP53 mutations were the most common abnormality detected, and were more common in gastroesophageal junction carcinomas (42% vs. 27%, p = 0.036). Mutations in the Wnt pathway components APC and CTNNB1 were more common among gastric carcinomas (16% vs. 3%, p = 0.006), and gastric carcinomas were more likely to have ≥3 driver mutations detected (11% vs. 2%, p = 0.044). Twenty percent of cases had potentially actionable mutations identified. R132H and R132C missense mutations in the IDH1 gene were observed, and are the first reported mutations of their kind in gastric carcinoma. Panel sequencing of routine pathology material can yield mutational information on several driver genes, including some for which targeted therapies are available. 
Differing rates of mutations and clinicopathologic differences support a distinction between adenocarcinomas that arise in the gastroesophageal junction and those that arise in the stomach proper.


Background
Gastric cancer accounts for over 10,000 deaths annually in the United States [1], and is the second most common cause of cancer mortality worldwide [2]. Although carcinomas of the gastroesophageal junction (GEJ) have been grouped with gastric carcinomas in cancer registries and in clinical trials for targeted therapies [3], lesions at these two sites have distinct clinical features. Adenocarcinomas of the stomach proper are primarily caused by Helicobacter pylori infection [4] and are decreasing in incidence worldwide [1]. In contrast, GEJ cancers are most associated with gastroesophageal reflux disease [2][3][4][5] and obesity [6], and the incidence of GEJ carcinomas has remained stable over the past 20 years [7]. In addition, the prognosis of GEJ carcinomas has been noted to be worse than gastric carcinomas, and there is uncertainty as to whether GEJ carcinomas should be staged as gastric or esophageal tumors [8]. Recognizing the distinction between carcinomas of the GEJ, esophagus, and stomach may enhance the collection of meaningful epidemiologic data and result in increased management precision [9].
Several studies have noted differences in the molecular characteristics of GEJ carcinomas versus those that arise elsewhere in the stomach. TP53 mutations are more frequent in the GEJ than in the distal stomach, while loss of heterozygosity of the TP53 locus is also more common in GEJ tumors [10,11]. Significant differences in promoter methylation rates of APC and CDKN2A have also been described [12]. Furthermore, differences in APC mutation rates and protein expression, as well as differences in global gene expression profiles between the two sites have also been demonstrated [13][14][15][16].
Testing of amplifications of the ERBB2 (also known as HER2) gene in gastric and gastroesophageal junction cancers is now routine practice in many institutions [17]. Similarly, testing for driver mutations, particularly single nucleotide substitutions, in oncogenes and tumour suppressor genes currently informs treatment in adenocarcinomas of other sites such as the lung and colon [18][19][20]. As further molecular targets are discovered across disease sites, effective assays will be required to determine cancers' susceptibility to targeted treatment.
Next-generation sequencing (NGS) may be used in the near future to interrogate multiple genes in a single sample, and these data could be used to inform clinicians of driver mutations and guide targeted treatment. Targeted panel sequencing is a form of next-generation sequencing in which single nucleotide variants are detected in a limited number of previously determined genomic loci, which are often chosen because they are prognostically and therapeutically critical. Panel sequencing enables multiplexing of samples, and deep coverage (>500x) facilitates the analysis of suboptimal template material from archival tissue and samples with low tumor cellularity. The narrower set of genes also allows for quicker specimen processing and bioinformatic analysis. Thus, actionable results can be obtained within days, rather than the weeks required by whole genome and exome approaches. However, the data are restricted by the inherently biased selection of genes, and by the inability to detect copy number changes, loss of heterozygosity, and structural rearrangements such as gene fusions. Thus, the effective use of NGS requires careful assessment of technologies, assay limitations, template requirements, and the research and clinical questions under consideration.
The objectives of this study were to probe the utility of panel sequencing on formalin-fixed paraffin-embedded (FFPE) tissue, and to compare clinically annotated GEJ and gastric carcinomas through panel sequencing of the hotspots of 46 cancer genes. We also sought to compare the frequencies of mutations identified with panel sequencing of hotspots against whole-exome sequencing, using publicly available data from The Cancer Genome Atlas.

Case selection and retrieval of clinicopathologic data
Institutional ethics approval was obtained from the University of British Columbia/British Columbia Cancer Agency research ethics board (#H07-2807), and research was conducted in accordance with the Helsinki declaration. Cases of gastric carcinoma were retrieved from departmental archives from the British Columbia Cancer Agency (BCCA), a provincial referral center. Inclusion criteria were referral to the agency between 2004 and 2010, available FFPE tissue from surgical resection of the primary tumor, complete clinicopathologic data including clinical outcomes on follow-up, and the absence of metastatic disease at presentation. Biopsy specimens of primary and metastatic lesions were excluded due to the absence of complete pathologic data. GEJ location was defined as lesions with an epicenter within 5 cm of the proximal end of the gastric rugal folds [21]. No distinction was made between tumors with regards to the location of their epicenter within the 5 cm of the GEJ (i.e. Siewert type was not recorded) [22]. Carcinomas located exclusively within the esophagus were excluded, as per the most recent WHO criteria [21]. All gastric tumors located distal to the GEJ were binned together for this study. Clinicopathologic data was collected retrospectively through review of patients' charts by a member of the clinical team, as well as through review of pathology reports.

DNA sample processing, sequencing, and variant calling
In each case, hematoxylin and eosin slides were used to guide macrodissection or scrolling of tumor tissue from FFPE slides following outlining of tumours by an anatomical pathologist. Tumor DNA from each case was extracted using the Qiagen FFPE DNA extraction kit (Qiagen, Venlo, Netherlands); no germline DNA was extracted. Extracted DNA was quantified using the QUBIT HS dsDNA assay (Life Technologies, Gaithersburg, MD, USA); all cases had a minimum of 10 ng of DNA extracted from FFPE, in keeping with a previously reported requirement for the assay [21]. A minimum A260/280 ratio of 1.8 was required for each DNA sample. DNA amplicon library construction was performed using DNA primers from the Ion Ampliseq™ Cancer Hotspot Panel v1 (Life Technologies). The kit consists of 207 primer pairs that cover 739 hotspots within 46 cancer-related genes (Additional file 1: Table S1). Indexed amplicon libraries were pooled for emulsion polymerase chain reaction and sequencing on the Ion Torrent PGM platform (Life Technologies). A minimum coverage of 500x was required for each case. Variant calling was performed using the Torrent Variant Caller v2.2 (Life Technologies) against the hg19 reference genome. Only variants present at frequencies ≥5% were considered. Because germline DNA was unavailable for comparison, variants were excluded as possible somatic mutations if they were identified as single nucleotide polymorphisms with mean allele frequencies of >0 within the dbSNP database (www.ncbi.nlm.nih.gov/SNP); the status of the remaining variants as non-germline was further confirmed using a PubMed search (www.ncbi.nlm.nih.gov/pubmed).
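The three filters described above (coverage, variant allele frequency, and exclusion of catalogued polymorphisms) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study, which relied on the Torrent Variant Caller. The `Variant` record, its field names, the example calls, and the read counts are all hypothetical, and the sketch applies the 500x threshold per position, whereas the text states it per case.

```python
from dataclasses import dataclass
from typing import Optional

MIN_COVERAGE = 500       # minimum read depth (threshold taken from the text)
MIN_ALLELE_FREQ = 0.05   # variants below 5% allele frequency are discarded


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one variant call (field names are assumptions)."""
    gene: str
    protein_change: str
    coverage: int                      # total reads at the position
    alt_reads: int                     # reads supporting the variant allele
    dbsnp_allele_freq: Optional[float]  # population frequency, None if not in dbSNP

    @property
    def allele_freq(self) -> float:
        return self.alt_reads / self.coverage if self.coverage else 0.0


def is_candidate_somatic(v: Variant) -> bool:
    """Return True if the call passes all three filters."""
    if v.coverage < MIN_COVERAGE:
        return False
    if v.allele_freq < MIN_ALLELE_FREQ:
        return False
    # Without matched germline DNA, a variant catalogued as a polymorphism
    # with a population frequency above zero is treated as likely germline.
    if v.dbsnp_allele_freq is not None and v.dbsnp_allele_freq > 0:
        return False
    return True


# Hypothetical calls; the numbers are invented for illustration only.
calls = [
    Variant("TP53", "R175H", coverage=1200, alt_reads=300, dbsnp_allele_freq=None),
    Variant("KRAS", "G12D", coverage=800, alt_reads=24, dbsnp_allele_freq=None),   # 3% allele frequency
    Variant("TP53", "P72R", coverage=1500, alt_reads=750, dbsnp_allele_freq=0.4),  # catalogued polymorphism
    Variant("IDH1", "R132H", coverage=350, alt_reads=70, dbsnp_allele_freq=None),  # insufficient depth
]
kept = [v for v in calls if is_candidate_somatic(v)]
```

A filter of this kind is conservative by design: as noted in the study limitations, true somatic variants that coincide with a catalogued polymorphism would be discarded.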

Comparison with the Cancer Genome Atlas (TCGA) data
Curated somatic mutation calls for 281 TCGA stomach adenocarcinoma samples with known anatomical sites were retrieved from the TCGA Data Portal (https://tcgadata.nci.nih.gov/tcga/) on February 19, 2014. Protein-coding mutations located in the regions amplified by the Ion Ampliseq™ Cancer Hotspot Panel v1 in each of the 46 genes were obtained for these cases and stratified by location (60 cardia/proximal and gastroesophageal junction versus 221 fundus/body, antrum/distal and stomach NOS). Copy number data, RNA expression data, and protein expression data were not considered, as our own assay only detects single nucleotide variants (SNVs) and small insertions/deletions (INDELs). The frequencies of mutations, irrespective of the type of mutation, were compared with those obtained from the hotspot panel sequencing that we performed.
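As an illustration of the stratification step, the following Python sketch groups cases by anatomical site and computes the fraction of cases in each group with at least one mutation in a given gene. It is a sketch under stated assumptions, not the analysis code used here: the site label strings are taken from the description above and may not match the exact field values in the TCGA files, and the case identifiers and mutation calls are hypothetical.

```python
from collections import defaultdict

# Site labels grouped as proximal (GEJ) in the text; exact TCGA strings are assumed.
PROXIMAL_SITES = {"cardia/proximal", "gastroesophageal junction"}


def site_group(anatomic_site: str) -> str:
    """Map an anatomical site label to 'GEJ' or 'gastric'."""
    return "GEJ" if anatomic_site.lower() in PROXIMAL_SITES else "gastric"


def mutation_frequencies(cases, mutations):
    """Fraction of cases per site group with >=1 mutation in each gene.

    cases: dict mapping case id -> anatomical site label
    mutations: iterable of (case id, gene) pairs, already restricted
               to the panel's hotspot regions
    """
    group_sizes = defaultdict(int)
    for site in cases.values():
        group_sizes[site_group(site)] += 1

    mutated_cases = defaultdict(set)  # (group, gene) -> set of case ids
    for case_id, gene in mutations:
        if case_id in cases:
            mutated_cases[(site_group(cases[case_id]), gene)].add(case_id)

    return {key: len(ids) / group_sizes[key[0]] for key, ids in mutated_cases.items()}


# Hypothetical example data.
example_cases = {
    "case-1": "Gastroesophageal junction",
    "case-2": "Cardia/proximal",
    "case-3": "Antrum/distal",
    "case-4": "Fundus/body",
}
example_mutations = [
    ("case-1", "TP53"),
    ("case-1", "TP53"),  # a second TP53 mutation in the same case counts once
    ("case-3", "APC"),
    ("case-4", "APC"),
]
freqs = mutation_frequencies(example_cases, example_mutations)
```

Counting cases rather than individual mutations matches the comparison described above, in which frequencies are reported irrespective of mutation type.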

Data analysis
Mann-Whitney U-tests and Student's t-tests were used to compare continuous variables, where appropriate. Fisher exact and chi-square tests, where appropriate, were used to compare categorical values. Survival analyses were performed using Kaplan-Meier estimates with log-rank tests and Cox proportional hazards models. The 46 panel genes were mapped to the Kyoto Encyclopedia of Genes and Genomes (KEGG) [22,23] and the Ingenuity® Integrated Pathway Analysis program (Qiagen) to identify oncogenic pathways and networks enriched for mutations, and to test for statistically significant differences between gastroesophageal junction and gastric adenocarcinoma specimens. P values were corrected for multiple testing using the Benjamini-Hochberg (BH) correction [24]. All statistical tests were two-tailed and a P value of < .05 was considered statistically significant. Statistical analyses were performed using SPSS Statistics software (v22, IBM, Armonk, NJ, USA) and the R statistical language v.2.15.1 (R Core Team (2012). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org/).
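For readers unfamiliar with these procedures, the following self-contained Python sketch shows a two-sided Fisher exact test for a 2x2 table and the Benjamini-Hochberg adjustment. The analyses in this study were run in SPSS and R, so this code is only an illustration of the techniques. The function names are our own, and the example counts are approximations back-calculated from the rounded percentages in the abstract (16% of 75 gastric and 3% of 92 GEJ cases), so they should not be read as the study's actual contingency table.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, total)


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Approximate counts (assumption): APC/CTNNB1 mutated vs. not mutated.
p_wnt = fisher_exact_two_sided(12, 75 - 12, 3, 92 - 3)

# Adjusting a set of hypothetical raw p-values from several gene-level tests.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

The BH procedure controls the false discovery rate rather than the family-wise error rate, which makes it less conservative than a Bonferroni correction when many genes or pathways are tested at once.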

Results
Within departmental archives at the BCCA, 229 resection specimens of gastric and GEJ carcinomas were obtained from 2004 to 2010 and were available for construction of a tissue microarray. DNA was available for extraction for 176 cases. No clinicopathologic data was available for correlation in 6 cases. Three cases had metastatic disease documented within a month of presentation, and these were excluded from the analysis. Of the remaining 167 cases, 92 originated in the gastroesophageal junction and 75 originated in the remainder of the stomach (Figure 1).

Clinicopathologic differences between GEJ and gastric carcinomas
The clinicopathologic features of these cases are summarized in Table 1 and anonymized clinical data is provided in a supplemental file (Additional file 2: Table S2). GEJ carcinomas were associated with younger age at resection, more frequent intestinal-type and less frequent diffuse histology, more frequent p53 overexpression and less frequent loss of p53 expression, more frequent stage III disease, less frequent stage I disease, and more frequent recurrences. Disease-free survival was significantly worse among patients with GEJ carcinomas (Figure 2A), though the two cohorts were not statistically different in terms of overall survival (Figure 2B). Other clinicopathologic features were similar between tumors of the two locations, including T-stage, resection margin involvement, ERBB2 amplification, and MMR protein loss (Table 1). The proportion of diffuse carcinomas (in the Lauren classification) was similar between the two sites. Subgroup analysis of only intestinal-type carcinomas showed persistent differences between GEJ and gastric carcinomas in age, disease-free survival and p53 expression. Differences in age, p53 expression and outcome also persisted when tumours were stratified into three subtypes (proximal non-diffuse, diffuse, and distal non-diffuse) as suggested by Shah et al. [16] (Additional file 3: Table S3).
No differences in the involvement of oncogenic pathways were noted between the two sites, based on mutational profiles.

Prognostic significance of mutations
ERBB4 mutations were associated with worse disease-free survival (p = 0.018), while there was a trend towards worse disease-free survival associated with mutations in ABL1 (p = 0.063) and JAK3 (p = 0.059). None of these mutations were prognostically significant after accounting for age, sex, Lauren subtype, stage, grade and margin status. Mutations in BRAF (p < 0.001), FGFR3 (p < 0.001), and FLT3 (p < 0.001) were associated with worse overall survival on univariate analysis as a result of a single case with mutations in all three of these genes. BRAF mutation remained prognostically significant after accounting for age, sex, Lauren subtype, stage, grade and margin status (p = 0.002).

Comparison with TCGA data
When assessing the hotspot regions covered by the sequencing panel, the overall number of mutated genes per case was similar between the TCGA and study cohorts (p = 0.659), including when comparing either GEJ (p = 0.399) or gastric (p = 0.845) tumors only (Figure 5A). A trend towards more frequent cases with mutations in ≥3 genes in the stomach compared to the GEJ was also observed in the TCGA data (12% vs. 3%, p = 0.054). The overall frequency of TP53 mutations was not different between the study cohort and the TCGA cohort (p = 0.230).
No differences in TP53, KRAS, and APC/CTNNB1 mutation rates between GEJ and gastric carcinomas were observed in the TCGA dataset (Figures 5B-D). The mutated genes in the TCGA data set are included in Additional file 7: Table S7. Regarding the mutations with possible prognostic significance identified in our cohort, the TCGA cohort showed a trend towards worse overall survival associated with BRAF mutations (p = 0.079), but no prognostic association with mutations in ERBB4, ABL1, JAK3, FLT3 or FGFR3.

Discussion
This study aimed to probe the utility of panel sequencing in identifying single nucleotide changes in routinely processed gastric resection specimens, which could be used to guide targeted therapies. We secondarily sought to contrast GEJ and gastric carcinomas through targeted deep sequencing of a panel of 46 cancer-related genes, which revealed some differences at the genomic level that may reflect differing clinicopathologic profiles. Finally, we also sought to compare the frequencies of mutations obtained using this panel with results from whole exome sequencing in The Cancer Genome Atlas. Adenocarcinomas of the gastrointestinal tract are molecularly heterogeneous and complex [41][42][43][44]. In gastric carcinoma, deep sequencing of single nucleotide polymorphism and RNA expression arrays have recently revealed abnormalities in several pathways including WNT, Hedgehog, cell cycling, DNA damage repair and the epithelial-to-mesenchymal transition [45]. The current use of multiple single gene tests is untenable given this complexity, particularly in the presence of a growing number of targeted therapies, constrained resources, and limited tissue availability. Thus, it is desirable to investigate multiple genes simultaneously. Panel sequencing has a sensitivity of close to 100% relative to conventional assays such as Sanger sequencing and PCR-based methods, as well as an ability to detect SNVs and INDELs at allele frequencies as low as 5% and 20%, respectively, in both FFPE [21][46][47][48] and cytology specimens [49][50][51][52]. Targeted panel sequencing can detect aberrations in cancer-related genes in early gastric cancers and precursor lesions [53], and its deep coverage could be particularly useful in gastric cancer by providing adequate results despite scant biopsy material and the admixture of tumor cells with desmoplasia and inflammatory cells.
Putative driver mutations were identified in a majority of GEJ and gastric carcinomas investigated in this study. By far the most frequently detected mutated gene was TP53, and these mutations have also been detected in early stage and precursor lesions using the same assay [53]. Multiple driver mutations were identified in several cases, reinforcing the idea that multiple genes need to be interrogated at once in genomically complex tumors such as gastric adenocarcinomas. A case with a mutation in BRAF (as well as FLT3 and FGFR3) was associated with poor overall survival on both univariate and multivariable analysis. This finding mirrors a trend observed in the TCGA data towards poor overall survival in BRAF-mutated tumours, suggesting that in some cases panel sequencing could have a prognostic role.
We were also able to detect potentially actionable mutations in approximately 20% of cases, which involved either genes or pathways where targeted therapies are available or in development. While this number would ideally be higher, our assay only covered certain hotspot regions of these genes, and did not account for copy number alterations that could also yield useful information. Further refinement of such panels to include a broader range of genes and gene segments will likely increase the proportion of cases in which mutations are identified. For example, although TP53 mutations occur throughout the gene, the panel primarily covers exons 5-8, and some of the gene segments that were not sequenced are more frequently associated with loss of p53 on immunohistochemistry [54]. This fact may potentially explain both the differences in the rates of TP53 mutations and patterns of immunohistochemical expression observed in the GEJ and stomach. Nevertheless, this study does demonstrate that single nucleotide variants can be identified from routine/archival pathology materials, and that with additional refinement panel sequencing may have a significant role in the future.
An unexpected result of the cancer hotspot panel sequencing approach was the identification of mutations in genes usually associated with non-epithelial malignancies, such as IDH1 R132H/R132C, JAK3 V722I, and FLT3 A680V. The IDH1 variants identified occur primarily in glial and hematologic malignancies, and result in altered cancer cell metabolism [55]. To the best of our knowledge, these cases constitute the first report of pathogenic IDH1 mutations in gastric cancer. Recently IDH1 mutations have been targeted [33], and mutation-specific treatments are currently the aim of a phase I clinical trial that includes cholangiocarcinomas (http://clinicaltrial.gov/ct2/show/NCT02073994). FLT3 mutations occur in a third of cases of acute myelogenous leukemia (AML) [56], and the point mutation resulting in the A680V substitution has not been previously described in gastric cancer, while being observed occasionally in AML [57]. Similarly, activating JAK3 mutations such as V722I have been identified in acute megakaryoblastic leukemia [58] and NK/T-cell lymphoma [31], and only in a few cases of gastric and breast cancer [59].
Epidemiologic and clinicopathologic differences exist between GEJ and gastric carcinomas [60][61][62]. GEJ carcinomas in this cohort were associated with younger age, different histotypes, and worse disease-free survival. As in other series, the rates of p53 overexpression were higher in the GEJ, as were the rates of TP53 mutation [10,11], while Wnt abnormalities were more common in the gastric carcinomas [12]. In addition, mutations across ≥3 genes were more frequent in gastric carcinomas, suggesting a higher mutational load and/or a bias towards genes included in the panel compared to GEJ lesions. Although the absence of differences in actionable mutations suggests that tumors in these sites can be considered together, the differences in TP53 and Wnt component mutation rates support the recent push to use location to distinguish proximal and distal gastric carcinomas as separate entities. Based on gene expression data, Shah et al. recently suggested that gastric carcinomas be grouped into three different subtypes [16]. The detection of more frequent KRAS mutations within distal non-diffuse carcinomas in our dataset when using this subclassification further supports pathologic classification of gastric cancers based on location and histotype.
Overall, mutation frequencies within the targeted hotspots were similar to those observed with exome sequencing in the TCGA data, also suggesting that with appropriate design, panel sequencing could be a viable method for interrogating multiple genes with a single test. Cases with mutations in ≥3 genes were also more common in the stomach in this cohort. However, the differences in mutation rates in TP53, KRAS, and APC/CTNNB1 between GEJ and gastric carcinomas were not observed within the TCGA cohort, even after comparing mutation frequencies within specific gastric locations. It is uncertain whether differences in case selection relating to etiology, geography or ethnicity could account for such differences, or whether differences in sequencing technology or bioinformatic analyses may also have contributed to these divergent observations. Further studies directly comparing the two approaches and comparing different patient populations will further enhance our understanding of GEJ and gastric carcinoma.

Study limitations
Regarding case selection, in the presence of gastroesophageal reflux many of the landmarks used to delineate the stomach from the esophagus are destroyed. This study relied on the epicenter of the tumor being within 5 cm of the gastroesophageal junction. However, we derived this classification from pathology reports and could not confirm the gross descriptions, nor did we subclassify tumours by Siewert type. Many of the tumours in this series may have in fact been esophageal in origin, and this could explain the similarities of the tumours with esophageal adenocarcinoma (e.g. worse prognosis and rates of TP53 mutations). The patients' family histories were not recorded for correlation, and the presence of gastric and GEJ cancer risk factors such as Helicobacter infection and Barrett esophagus was also not recorded. Sampling for sequencing and tissue microarray construction was limited, and intratumoral heterogeneity was not addressed. No germline DNA was available for comparison; as a result some somatic variants, which contribute to carcinogenesis but are present at low frequencies as single nucleotide polymorphisms, may have been omitted. In addition, we did not perform validation with Sanger sequencing or other methods. As such, we could not confirm the assay's sensitivity and specificity on this series. The assay has been shown to be accurate in other studies and in our own laboratory. Further validation of this platform with Sanger sequencing or other methods would be required before this assay could be used clinically.

Conclusions
GEJ and gastric tumors differ in several clinicopathologic respects, including the frequencies of mutations in certain cancer-related genes. Tailoring treatment towards individual gastric cancer patients will require in-depth characterization of their tumors. This study shows that such characterization will derive information both from traditional clinicopathologic parameters such as tumor location and from emerging molecular assays. Targeted panel sequencing is an approach that can be applied to routine pathology material and can simultaneously yield information on several genes. With refinement, this approach may become a powerful tool for pathologists and clinicians in the future.