Prognostic value of a 25-gene assay in patients with gastric cancer after curative resection

This study aimed to develop and validate a practical, reliable assay for predicting prognosis and chemotherapy benefit in gastric cancer (GC), and to compare it with conventional staging. Candidate genes were obtained by analyzing microarray data from 78 frozen tumor specimens. Twenty-three candidate genes whose quantitative hybridization results correlated significantly with the microarray results, plus 2 reference genes, were selected to form a 25-gene prognostic classifier that assigns patients to 3 groups with different risks of mortality. The 25-gene assay was associated with overall survival in both the training cohort (P = 0.017) and the testing cohort (P = 0.005), which together comprised 462 formalin-fixed paraffin-embedded samples. Risk prediction was significantly better in stages I + II than in stage III. Analysis demonstrated that this 25-gene signature is an independent prognostic predictor and shows higher prognostic accuracy than conventional TNM staging in early stage patients. Moreover, in stage I + II only high-risk patients were found to benefit from adjuvant chemotherapy (P = 0.043), while low-risk patients in stage III were not found to benefit from adjuvant chemotherapy. In conclusion, our results suggest that this 25-gene assay can reliably identify patients with different risks of mortality after surgery, especially among stage I + II patients, and might be able to predict which patients benefit from chemotherapy.


Results
Screening of candidate biomarkers by microarray and establishment of a 31-gene prognostic algorithm. Detailed clinicopathological characteristics of the selecting, training, and testing sets are shown in Table 1. For initial screening of candidate biomarkers by microarray profiling, we selected 78 patients with qualified frozen tissues in the first batch.
On Affymetrix microarray analysis of the 78 tumors and 24 matched adjacent non-cancerous gastric mucosa samples, 2880 genes showed significantly differential expression between the GC tissues and the adjacent non-cancerous gastric mucosa. Cox proportional hazards modeling was the main analytical method used to build the prognostic classifier, which selected 31 target genes (Table 2) in the 78-patient selecting cohort. Among them, 14 genes correlated with patient prognosis according to hazard ratios from univariate Cox regression: 6 protective genes (XAF1, IFITM1, NCOA7, GZF1, APAF1, and TCF7L2, with hazard ratios less than 1) and 8 risk genes (DYRK2, UBA2, EPHB2, PDCD5, FADD, MARCKS, B3GALT6, and ITCH, with hazard ratios greater than 1). The other 17 genes were related to classical oncogenic pathways or potential therapeutic targets in GC reported in previous publications, including MMP2

We then derived a formula (Supplementary materials and methods) to calculate a mortality risk score for every patient from their individual 31-gene expression levels. GC patients with a high-risk thirty-one gene signature had a shorter median overall survival than patients with an intermediate-risk or low-risk gene signature (median survival: 13.42 months vs. 32.24 months vs. not reached, P < 0.001, Fig. 1B). We also validated our model in the publicly available gastric cancer data set GSE62254. Similar to our previous results, those GC patients with a high-risk thirty-one gene signature also had a shorter median overall survival than the patients with intermediate-risk
Gene selection by quantitative hybridization assay in GC. Because reproducibility, cost, and widespread availability are key priorities in clinical settings, we aimed to establish a practical prognostic algorithm based on FFPE tissue samples. We therefore systematically measured the expression of the 31 genes from the microarray analysis in 61 matched FFPE tissues by QGP.
The twenty-five-gene signature and survival in GC. We then identified the gene signature by quantitative hybridization assay. The reference genes were TBP and PGK1 (Supplementary materials and methods). We established a 25-gene (23 correlated genes plus 2 reference genes) prognostic algorithm based on FFPE tissues. The coefficient for each of the 23 genes was derived from the previous cohort, and the formula for calculating each patient's risk score was changed accordingly.
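The selection of genes whose hybridization results agree with the microarray results can be illustrated with a minimal sketch. It assumes per-gene expression vectors measured on the same matched samples in the same order. The gene names and values are made up, and the threshold `min_r` is a hypothetical stand-in: the study selected genes by the significance of the Pearson correlation, not by a fixed cutoff on r.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def select_concordant_genes(microarray, hybridization, min_r=0.5):
    """Keep genes whose two platforms correlate across matched samples.

    min_r is an illustrative assumption, not a value from the study.
    """
    return [
        gene
        for gene in microarray
        if gene in hybridization
        and pearson_r(microarray[gene], hybridization[gene]) >= min_r
    ]


# Hypothetical expression values for four matched samples.
microarray = {"GENE_A": [1.0, 2.0, 3.0, 4.0], "GENE_B": [1.0, 2.0, 3.0, 4.0]}
hybridization = {"GENE_A": [2.0, 4.0, 6.0, 8.5], "GENE_B": [4.0, 1.0, 3.0, 2.0]}
concordant = select_concordant_genes(microarray, hybridization)
```

In this toy input only GENE_A is retained, because GENE_B is negatively correlated between the two platforms.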
To limit statistical confounding, patients with TNM stage IV disease were excluded from the training and testing cohorts (the characteristics of this group are shown in Table 1). We then evaluated the 25-gene assay in FFPE tissue from the training cohort of 102 patients. Those high-risk GC patients were with a shorter   Fig. 1D).
To confirm that the 25-gene algorithm had similar prognostic value in different populations, we tested it in an independent cohort of 360 patients. The general condition between patients in training cohort and test cohort is  (Table 3S). We applied the cutoff values for categorization derived in the training group to the independent test set of 360 patients. Similar to the training cohort, patients with a high-risk gene signature had a shorter median overall survival than those with an intermediate-risk or low-risk gene signature (median survival: 34.39 months vs. 37.77 months vs. not reached, P = 0.005) (Fig. 1E).    Fig. 2B). The associations between the gene signature and prognosis in the training and testing cohorts were also analyzed separately for stage I and II and for stage III. In the subgroup analysis of 29 patients with TNM stage I and II disease in the training cohort, those with a high-risk gene signature had a shorter overall survival than those with an intermediate-risk or low-risk gene signature (median survival: 21.45 months vs. 63.39 months vs. not reached, P = 0.002, Fig. 2C). In the subgroup analysis of 126 patients with stage I and II disease in the testing cohort, those with a high-risk gene signature also showed a shorter overall survival than those with an intermediate-risk or low-risk gene signature (median survival: 68.64 months vs. not reached vs. not reached, P = 0.014, Fig. 2D). In contrast, overall survival in the stage III group did not differ significantly by gene signature in either the training or the testing cohort (training cohort: P = 0.194; testing cohort: P = 0.264, figure not shown).

Twenty-five-gene assay is an independent prognostic factor in stage I and II patients. Moreover, we noted similar results when the stage I and II patients of the training and testing cohorts were combined: those with a high-risk gene signature showed a shorter overall survival than those with an intermediate-risk or low-risk gene signature (median survival: 54.57 months vs. not reached vs. not reached, P < 0.001, Fig. 2E).
Age, sex, gene signature, differentiation, vascular invasion, and TNM stage were included in the Cox multivariate regression analysis. According to the analysis, the high-risk gene signature, differentiation, and tumor stage II were significantly associated with death from any cause among the 155 patients (Table 3) (hazard ratio for the high-risk signature vs. the intermediate-risk signature: 5.325, 95% confidence interval, 2.061 to 13.758, P = 0.001; high-risk signature vs. the low-risk signature: 6.248, 95% confidence interval, 2.320 to 16.826, P < 0.001).
The 25-gene signature based classifier also showed significantly higher prognostic accuracy than any clinicopathological risk factor, including TNM stage and differentiation (Fig. 2F). Thus, this signature can add prognostic value to clinicopathological prognostic features.
Twenty-five-gene signature and adjuvant chemotherapy. We noted that adjuvant chemotherapy improved survival across all 436 patients with available treatment data (chemotherapy information was missing for another 26 cases; median survival:  Table 4S).
In the 149 stage I and II cases, adjuvant chemotherapy did not improve survival (median survival: not reached vs. not reached, P = 0.101, Fig. 3E). Subgroup analysis using our twenty-five-gene signature-based classifier showed that patients in the high-risk group had a favorable response to adjuvant chemotherapy (median survival: 36.59 months vs. not reached, P = 0.043, Fig. 3F, Table 4S).
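Several comparisons in this section report a median survival of "not reached". A minimal Kaplan-Meier sketch shows how such a value arises: the median is the first time at which the estimated survival probability falls to 0.5 or below, and it is undefined when the curve never drops that far during follow-up. The follow-up times and event flags below are made up, and the function names are illustrative rather than taken from the study's software.

```python
from fractions import Fraction


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, S(time))] at each distinct death time.

    times: follow-up in months; events: 1 = death, 0 = censored.
    Exact fractions avoid floating-point surprises at S(t) = 0.5.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = Fraction(1)
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            leaving += 1
            i += 1
        if deaths:
            survival *= Fraction(at_risk - deaths, at_risk)
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


def median_survival(times, events):
    """First time with S(t) <= 0.5, or None when the median is not reached."""
    for t, s in kaplan_meier(times, events):
        if s <= Fraction(1, 2):
            return t
    return None


# Hypothetical groups: all five patients die vs. one death and four censored.
median_all_events = median_survival([3, 6, 9, 12, 15], [1, 1, 1, 1, 1])
median_mostly_censored = median_survival([3, 6, 9, 12, 15], [1, 0, 0, 0, 0])
```

In the second group the estimated survival never falls below 0.8, so the function returns `None`, which corresponds to "not reached".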
Moreover, the associations between adjuvant chemotherapy and prognosis were analyzed separately by lymph node metastasis status (N negative, P < 0.001, Fig. 4A; N positive, P = 0.967, Fig. 4E). Adjuvant chemotherapy improved survival in lymph node metastasis-positive patients in the high-risk (P = 0.012, Fig. 4B) and intermediate-risk (P = 0.001, Fig. 4C) groups. In the remaining patients, namely the lymph node metastasis-positive low-risk group (P = 0.454, Fig. 4D) and all lymph node metastasis-negative groups (high-risk group, P = 0.567, Fig. 4F; intermediate-risk group, P = 0.51, Fig. 4G; low-risk group, P = 0.347, Fig. 4H), overall survival did not differ significantly between the chemotherapy and no-chemotherapy groups (Table 4S). These results indicate that our classifier could successfully identify patients who were suitable candidates for adjuvant chemotherapy.

Discussion
Our practical, quantitative-hybridization-based assay reliably identified GC patients at high risk of mortality after surgical resection, discriminating such patients with greater accuracy than NCCN criteria alone. Most of the twenty-three genes fall into the following classes: epidermal growth factor receptor, cell cycle factor, angiogenesis, matrix metalloproteinase, and apoptosis genes [17][18][19][20][21]. Some genes are also involved in the Notch signaling pathway, the TOR signaling pathway, regulation of transcription 22,23, the MAPK signaling pathway, and metabolic processes 24,25. Although other groups have developed gene signatures prognostic of survival in GC 9, none of these previous studies used FFPE samples. Furthermore, most previous studies did not subject their prognostic signatures to large-scale, independent validation. Taken together, our assay for GC is the first of its kind in these important respects: the performance of the assay in one of the studies in a laboratory that was independent from the laboratory in which the assay was developed, the relatively large sizes of the independent testing cohorts, and the potentially large disparity between the genetic background of one of the test cohorts and that of the original training cohort used for development of the assay.
In this study, we noted that among stage I and II GC patients, those with a high-risk gene signature had poorer overall survival than those with a low-risk gene signature, while overall survival among patients with stage III disease did not differ significantly. This is probably because of the palliative nature of surgical treatment in the stage III group. The survival of patients at this stage is affected significantly by many clinical and treatment factors other than the genetic background of the cancer, and some gene-classifier studies have preferred to focus on early stage cancer 26. Two large Asian randomized Phase III studies (the ACTS-GC and CLASSIC trials) have now confirmed the survival benefit of postoperative chemotherapy after curative D2 lymph node dissection in patients with GC 27,28. However, not all patients, especially those with early stage disease, need chemotherapy or benefit from it 29,30. Our results indicate that our classifier could identify patients at different stages who may benefit from adjuvant chemotherapy. The 25-gene signature could be used to select early stage GC patients at high risk, and advanced stage patients at either intermediate or high risk, for adjuvant chemotherapy. Meanwhile, it may spare early stage patients at low or intermediate risk, and advanced stage patients at low risk, from unnecessary chemotherapy.
In conclusion, we identified a 25-gene signature associated with prognosis in GC and validated it in another 360 cases. Statistical analysis demonstrated that it is an independent prognostic predictor. Its predictive performance is significantly better in stage I and II disease. Moreover, patients classified as high-risk by the assay benefited from chemotherapy in stage I and II GC, while low-risk patients in stage III were not found to benefit from adjuvant chemotherapy.

Materials and Methods
Patients and samples. All patients with GC included in this study were diagnosed and surgically treated at Peking University Cancer Hospital between 1996 and 2007 and followed up until January 2013. This investigation was approved by the Institutional Review Board and Ethics Committee of Peking University Cancer Hospital, informed general consent was obtained from each patient at the time of sample collection, and all methods were performed in accordance with the relevant guidelines and regulations. All frozen samples were collected and stored by the Central Biobank Facility, and all FFPE samples by the Pathology Department of the hospital. All frozen samples used in this investigation passed histological re-assessment, which required at least 70% tumor cells. For all FFPE tissue samples, one hematoxylin-eosin (HE) stained slide was evaluated by two pathologists (YL and BD), and the tissue was manually dissected to remove non-cancerous mucosa and mesenchymal tissue so that at least 80% tumor cells remained. The TNM stage of GC was classified according to the 7th edition of the classification recommended by the American Joint Committee on Cancer (AJCC) 31.

Study design. The study design is shown in Fig. 1. First, frozen tissues from 78 patients (78 cancer tissues and 24 matched normal tissues) were profiled on Affymetrix Hu133Plus2 arrays for mRNA expression. Analysis of the microarray data then generated candidate biomarkers related to prognosis and consensus therapeutic targets. Only biomarkers with comparable expression detected by the quantitative QuantiGene assay in the matched FFPE tissues were selected to develop a multiple-gene assay, which was then tested and validated in two cohorts of patients with FFPE tissue samples.
For the expression profiling assay, frozen tissues from patients of all stages (I-IV) with complete clinicopathological and follow-up information were randomly selected and retrieved from the Central Biobank Facility. In the subsequent quality control test, samples with a tumor percentage <70% on histological evaluation or with poor RNA quality were excluded.
In the test and validation phase, only patients with TNM stage I-III GC who underwent curative resection (histologically negative resection margin) and had complete clinicopathological and follow-up information available were included. Stage IV patients were excluded from the validation cohort because of the palliative nature of surgical treatment at this stage: the survival of patients with stage IV disease is affected significantly by many clinical and treatment factors other than the genetic background of the cancer. Patients whose FFPE tissues were not available or failed the quality assessment were also excluded.
Microarray Analysis. Total RNA was extracted from frozen tissues and profiled on Affymetrix Hu133Plus2 arrays for mRNA expression according to the manufacturer's specifications. The Robust Multi-array Analysis (RMA) algorithm provided by the Expression Console software was used to call gene-level expression values from raw signals. Based on a published algorithm, only the best probe set was selected to represent the expression of each gene. Any gene whose best probe set had an informative score < 0.5 was removed from the analysis.

Quantitative hybridization assay in FFPE tissues. After the FFPE slides were manually dissected with scalpels to remove non-cancerous mucosa, tissue homogenates were prepared according to the procedure described in the QuantiGene Sample Processing Kit for FFPE Tissues (Panomics, Inc., Fremont, CA). Briefly, 200 µl of homogenizing solution supplemented with 2 µl of proteinase K (50 µg/µl) was incubated with 6 deparaffinized 5 µm sections overnight at 65 °C. The tissue homogenate was then separated from debris by brief centrifugation and transferred to a new tube.
Standard probe design software was used to design specific oligonucleotide probe sets for detecting target genes with the QuantiGene Plex 2.0 Reagent System (Panomics, Inc.), which gives 400-fold signal amplification. The assay was performed according to the protocols recommended by the manufacturer (Panomics, Inc.). Briefly, probe set oligonucleotides were mixed with the sample solution in a 96-well plate. Target RNA was captured during an overnight incubation at 54 °C. Unbound material was removed by three washes with 200 µl of wash buffer, followed by sequential pre-amplifier hybridization, amplifier hybridization, and label probe hybridization. Finally, plates were prepared for analysis after the streptavidin-conjugated phycoerythrin (SAPE) working reagent was added.
Gene Signature and Statistical Analysis. First, genes showing significantly differential expression between the GC tissues and adjacent non-cancerous gastric mucosa were selected from the 78 microarray results. Cox proportional hazards modeling was then used as the main analytical method to develop the prognostic algorithm. Hazard ratios from univariate Cox regression analysis were used to determine which genes were associated with death. Protective genes were defined as those associated with a hazard ratio for death of less than 1; risk genes were defined as those associated with a hazard ratio for death of more than 1. For genes that were significantly correlated with survival, we used a linear combination of the gene-expression coding values weighted by the regression coefficients to calculate a risk score for each patient. The predicted risk scores from the training cohort were divided at the 33rd and 67th percentiles to generate cutoffs for categorizing risk scores as low-risk, intermediate-risk, or high-risk. Kaplan-Meier analysis with the log-rank test was used to compare the survival distributions of two or more groups. Multivariate Cox proportional hazards regression analysis with backward stepwise selection was used to evaluate independent prognostic factors associated with survival. The correlation between the microarray and QGP results was assessed by Pearson's correlation test. P < 0.05 was considered to indicate statistical significance, and all tests were two-tailed.
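The risk-score calculation and the percentile-based categorization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual formula, which is given in the Supplementary materials and methods. The gene names, coefficients, and scores are hypothetical, the percentile uses linear interpolation between order statistics, and the handling of scores that fall exactly on a cutoff is an assumption. Since a Cox coefficient is the natural logarithm of the hazard ratio, risk genes carry positive weights and protective genes negative weights, so a higher score means a higher predicted risk.

```python
def risk_score(expression, coefficients):
    """Linear combination of expression values weighted by Cox coefficients."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between order statistics."""
    xs = sorted(values)
    if not xs:
        raise ValueError("values must not be empty")
    position = (len(xs) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(xs) - 1)
    return xs[lower] + (xs[upper] - xs[lower]) * (position - lower)


def fit_cutoffs(training_scores):
    """Cutoffs at the 33rd and 67th percentiles of the training risk scores."""
    return percentile(training_scores, 33), percentile(training_scores, 67)


def classify(score, cutoffs):
    """Assign a risk group; boundary handling (<=) is an assumption."""
    low_cut, high_cut = cutoffs
    if score <= low_cut:
        return "low-risk"
    if score <= high_cut:
        return "intermediate-risk"
    return "high-risk"


# Hypothetical two-gene example: one risk gene, one protective gene.
coefficients = {"RISK_GENE": 0.8, "PROTECTIVE_GENE": -0.5}
patient = {"RISK_GENE": 2.0, "PROTECTIVE_GENE": 1.0}
score = risk_score(patient, coefficients)

# Cutoffs are fitted on training scores only, then reused unchanged.
training_scores = list(range(101))  # made-up scores 0..100
cutoffs = fit_cutoffs(training_scores)
groups = [classify(s, cutoffs) for s in (10, 50, 90)]
```

Fitting the cutoffs on the training cohort and reusing them unchanged for new patients mirrors how the training-derived cutoff values were applied to the independent testing cohort.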
Significance. This study develops and validates a practical, reliable assay which can identify patients with different risk for mortality after surgery, and might be able to predict patients who may benefit from chemotherapy in GC.