p53, cathepsin D, Bcl-2 are joint prognostic indicators of breast cancer metastatic spreading

Traditional prognostic indicators of breast cancer, i.e. lymph node diffusion, tumor size, grading and estrogen receptor expression, are inadequate predictors of metastatic relapse. Thus, additional prognostic parameters are urgently needed. Individual oncogenic determinants have largely failed in this endeavour. Only a few individual tumor growth drivers, e.g. mutated p53, Her-2, E-cadherin, Trops, have reached some prognostic/predictive power in clinical settings. As multiple factors are required to drive solid tumor progression, clusters of such determinants were expected to be stronger indicators of tumor aggressiveness and malignant progression than individual parameters. To identify such prognostic clusters, we coordinately analysed molecular and histopathological determinants of tumor progression of post-menopausal breast cancers in the framework of a multi-institutional case series/case-control study. A multi-institutional series of 217 breast cancer cases was analyzed. Twenty-six cases (12 %) showed disease relapse during follow-up. Relapsed cases were matched with a set of control patients by tumor diameter, pathological stage, tumor histotype, age, hormone receptors and grading. Histopathological and molecular determinants of tumor development and aggressiveness were then analyzed in relapsed versus non-relapsed cases. Stepwise analyses and model structure fitness assessments were carried out to identify clusters of molecular alterations with differential impact on metastatic relapse. p53, Bcl-2 and cathepsin D were shown to be coordinately associated with unique levels of relative risk for disease relapse. As many Ras downstream targets, among them matrix metalloproteases, are synergistically upregulated by mutated p53, whole-exon sequence analyses were performed for TP53, Ki-RAS and Ha-RAS, and findings were correlated with clinical phenotypes. Notably, TP53 insertion/deletion mutations were only detected in relapsed cases. Correspondingly, Ha-RAS missense oncogenic mutations were only found in a subgroup of relapsing tumors. We have identified clusters of specific molecular alterations that greatly improve prognostic assessment with respect to individually analysed indicators. The combined analysis of these multiple tumor-relapse risk factors promises to become a powerful approach to identify patient subgroups with unfavourable disease outcome.


Keywords: Breast cancer, Metastatic relapse, Prognostic indicators, TP53, Bcl-2, Cathepsin D, RAS

Abbreviations: CI, Confidence interval; CV, Cross-validation; Fab, Fragment antigen-binding; FFPE, Formalin-fixed paraffin-embedded; HR, Hazard ratio; IHC, Immunohistochemistry; PCR, Polymerase chain reaction; PK, Proteinase K; PLS-DA, Partial least squares discriminant analysis; TMA, Tissue micro-array; VIP, Variable importance in the projection

Background
Breast cancer (BC) is the most frequent malignancy in women with 800 cases out of 100,000 people, four-times as many as the second most frequent one, i.e. colorectal cancer [1]. Histopathology classification of BC according to tumor grade, stage, histotype, lymph node invasion and hormonal receptor status [2] is broadly used to draw correlations with survival. However, this classification performs poorly in predicting differential biological aggressiveness of tumors with identical grade and stage. As an example, patients with the best prognosis, i.e. bearing small size tumors, expressing estrogen receptors and without lymph node invasion, experience early tumor relapse in 10-20 % of the cases [3,4]. Cases that relapse do not detectably differ from those that do not, as far as conventional prognostic parameters are concerned.
Tumor development depends on the accumulation of several specific genetic and epigenetic changes [12-14]. Thus, the analysis of individual oncogenic factors is unlikely to suffice in defining the biological nature and aggressiveness of a tumor [15]. Major control pathways or clusters of drivers of cell growth, apoptosis or invasion are, on the other hand, expected to associate with tumor aggressiveness and overall malignancy much more strongly than individual factors. In this work we tested this model. Histopathological and oncogenically activated determinants of tumor progression of BC were analyzed in the framework of a case-control study. The results obtained were evaluated by means of statistical analyses able to detect significant interactions of biological determinants connected with tumor relapse. This showed that correlated p53, Bcl-2 and cathepsin D specifically associate with unprecedentedly high levels of relative risk for local invasion and metastatic relapse. As matrix metalloproteases, which play a key role in local invasion and distant cancer spreading, were shown to be a transactivation target for mutant p53, in cooperation with oncogenic Ras, exon sequence analysis was performed for TP53 and RAS genes, and findings were coordinately analyzed with the immunohistochemistry (IHC) data and clinical phenotypes.

Breast cancer case series
A multi-institutional case series of BC patients was collected from the National Cancer Institute of Naples, together with the University of Udine, the district hospital of Venice and Rovigo, Italy. Two hundred and seventeen BC patients were analyzed (Table 1). Clinical data (age, family history, clinical stage, disease follow-up) and conventional prognostic indicators (size, pathological stage, local invasion, margin width, lymph-node invasion, histological type, necrosis, inflammatory infiltration, hormonal receptor status) were recorded [16,17] (Table 1). Cancer grade was determined as described [18] (Table 1; Additional file 1: Table S1 and Additional file 2: Table S2). Twenty-six cases (12 %) showed disease relapse during follow-up (Additional file 1: Table S1). Relapsed cases were matched with a set of control patients by tumor diameter, pathological stage, tumor histotype, age, hormone receptors and grading (Additional file 1: Table S1), and analyzed for expression of tumor progression determinants by IHC and DNA sequencing, as indicated. To identify patterns of aggregation of molecular alterations associated with different classes of BC prognosis, stepwise grouping procedures were performed for model structure fitness assessment, as described.

Histopathology
Tissue micro-arrays (TMA) of tumor samples were assembled as described [19,20]. Briefly, whole-tumor sections of formalin-fixed paraffin-embedded (FFPE) BC samples were stained with hematoxylin-eosin, and used to guide the selection of tumor-containing areas. Three 1 mm diameter cylinders were then obtained from each tumor and transferred to recipient blocks. Filled blocks were heated for 15 min at 37°C to induce the tumor cores to adhere to the paraffin walls. TMA sections were analysed by IHC for the expression of markers relevant to tumor development and aggressiveness (Figs. 1, 2 and 3, Table 1). Briefly, 5 μm sections of BC TMA were mounted onto Vectabond-coated slides (Vector Laboratory). Before staining, sections were heated at 56°C and dewaxed in xylene/ethanol. Endogenous peroxidase was blocked with hydrogen peroxide in methanol. Heat-mediated 'antigen retrieval' was performed by treatment in pH 6 citrate buffer in a pressure cooker or microwave oven, as required for each specific target. After preincubation with appropriate blocking agents, e.g. species-matched normal serum, sections were incubated with the primary antibody (Additional file 3: Table S3). After washing, sections were challenged with fragment antigen-binding (Fab)2 biotinylated secondary reagents, followed by avidin-peroxidase and 3,3′-diaminobenzidine  is the maximum observed intensity and 2 corresponds to an intermediate intensity. A combined score was obtained by multiplying percentages of positive cells by intensity. Scores were then categorized for statistical evaluation [21].
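
As a minimal sketch of the combined score described above, the snippet below multiplies the percentage of positive cells by the staining intensity. The integer 0-3 intensity scale is an assumption (the surviving text states only that 2 is an intermediate intensity), and the function name `combined_ihc_score` is hypothetical. The categorization cut-offs used for statistical evaluation are not given in the text and are therefore not modeled.

```python
def combined_ihc_score(percent_positive, intensity):
    """Combined IHC score = percentage of positive cells x staining intensity.

    Assumption: intensity is graded on an integer 0-3 scale
    (the scale bounds are not stated explicitly in the text).
    """
    if not 0 <= percent_positive <= 100:
        raise ValueError("percent_positive must be between 0 and 100")
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return percent_positive * intensity


# Example: 50 % positive cells at intermediate intensity (2)
print(combined_ihc_score(50, 2))  # 100
```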

DNA extraction
FFPE BC sections were processed as described [22,23]. This procedure provided relatively crude DNA preparations, which, however, could be efficiently used as a template in ≤150 bp-long polymerase chain reaction (PCR) amplifications [24]. Briefly, four 5 μm tumor sections were deparaffinized by two extractions with either xylene or Histoclear (Carlo Erba), followed by two extractions with ethanol. Samples were then digested for 3 h at 50°C with proteinase K (PK) 2 mg/ml, Tween 20, Tris-Cl 50 mM, EDTA 1 mM, pH 8.5, then overnight at 50°C after PK replenishment. Samples were then incubated at 95°C for 15 min to inactivate PK, centrifuged at top speed for 15 min at 4°C, transferred to a fresh tube and stored at −20°C. DNA yields were quantified by ethidium bromide fluorescence in solution [25]. On average, 30 μg of DNA per sample was obtained. Size distribution of the extracted DNA [26] was profiled by ethidium bromide/agarose gel electrophoresis for sample quality assessment.

PCR amplification
After thawing, DNA samples prepared as above were incubated at 95°C for 25 min (this step was critical for successful amplification). One μl of this crude extract was added to the amplification mix. Primers were designed using Primer3 [27,28] (Additional file 4: Table S4). TP53 exons (from 2 to 11) were separately amplified

Sequence analysis of TP53, Ha-RAS and Ki-RAS in human tumors
In both Ha-RAS (c-Ha-RAS1) and Ki-RAS (c-Ki-RAS2), activating oncogenic mutations are found at hotspots in exons 1 and 2, at codons 12, 13 (exon 1) or 61 (exon 2). Care was taken to differentially amplify the regions of interest of functional genes versus non-expressed pseudogenes, i.e. c-Ha-RAS2 and c-Ki-RAS1. Benchmark PCR amplification of Ha- and Ki-RAS exons 1 and 2 was performed using genomic DNA and cDNA from the T24 cell line, which carries a mutated, oncogenic form of Ha-RAS with a transversion at codon 12 (from GGC to GTC). When using cDNA templates, PCR primers were designed that reside in exonic regions, for simultaneous amplification of both exons 1 and 2 of Ha- and Ki-RAS. Joint amplification of exons 1 and 2 from genomic DNA was only performed for the Ha-RAS gene (the intervening intron is only 267 bp long in the Ha-RAS gene; it is more than 12,500 bp long in the Ki-RAS gene). Additional primers were designed that included intronic regions and were therefore specific for amplification of functional genes from genomic DNA. Amplified fragments were sequenced on both strands. Insertions or deletions (indels) of the TP53 gene (Additional file 5: Table S5) were shown to carry the highest prognostic weight [29]; such mutations were identified and matched against those listed in the IARC database [29].
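
The T24 benchmark mutation above (codon 12, GGC to GTC) is described as a transversion, i.e. a purine-to-pyrimidine change or the reverse. The following sketch, which is illustrative only and not part of the study's pipeline, classifies a single-base codon change accordingly; the function name `classify_substitution` is hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref_codon, alt_codon):
    """Return (position, ref_base, alt_base, kind) for a single-base change.

    position is 1-based within the codon; kind is 'transition'
    (purine<->purine or pyrimidine<->pyrimidine) or 'transversion'.
    """
    ref, alt = ref_codon.upper(), alt_codon.upper()
    if len(ref) != 3 or len(alt) != 3:
        raise ValueError("codons must be three bases long")
    if not set(ref + alt) <= (PURINES | PYRIMIDINES):
        raise ValueError("codons must contain only A, C, G, T")
    diffs = [(i, r, a) for i, (r, a) in enumerate(zip(ref, alt), start=1) if r != a]
    if len(diffs) != 1:
        raise ValueError("expected exactly one substituted base")
    pos, r, a = diffs[0]
    same_class = {r, a} <= PURINES or {r, a} <= PYRIMIDINES
    return pos, r, a, "transition" if same_class else "transversion"


# T24 Ha-RAS codon 12: GGC -> GTC
print(classify_substitution("GGC", "GTC"))  # (2, 'G', 'T', 'transversion')
```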

Statistical analysis
The independent impact of individual risk factors on prognosis is commonly evaluated in the framework of uni- or multivariate models [8,19,30,31]. Univariate analyses were performed with GraphPad Prism 6.0 (GraphPad Software Inc., La Jolla, CA) and XLStat 2009 (Addinsoft, Paris, France). Multivariate analyses and data modeling were performed using MetaboAnalyst 2.0 [32-34] and SIMCA-P+ 11 (Umetrics, Umea, Sweden) [35] software. However, uni- or multivariate analyses do not effectively quantify interaction effects on the final outcome. To explore such interactions, a priori specified hypotheses have been used in the past as trial models, but at the risk of introducing analytical bias. To overcome these limitations, patterns of aggregation of molecular parameters affecting prognosis were modeled here through logistic regression and partial least squares discriminant analysis (PLS-DA). PLS-DA clustering was performed using relapse as a dichotomic variable. PLS-DA model validation was performed as previously described [36]. Briefly, to define the optimal number of PCs, 7-fold cross-validation (CV) was applied [37]. Using CV, the predictive power of the model was verified through R² (goodness of fit) and Q² (goodness of prediction). A model with Q² > 0.5 was considered good, and one with Q² > 0.9 excellent [38]. The performance of PLS-DA models was further validated by a permutation test (200 permutations). To help interpret the PLS-DA results, we used variable importance in the projection (VIP) scores. This allowed us to evaluate the influence of each parameter on the model and to identify the best descriptors of relapsing versus non-relapsing BC. VIP scores are weighted sums of squares of the PLS loading weights, which take into account the amount of explained Y-variation for each dimension [33]. VIP values were cumulatively calculated from all extracted PLS components, using a threshold of 0.8 [39].
As some variables may exert effects on the whole population (global), while others can be relevant in specific subgroups only (local), procedures were utilized to identify homogeneous subgroups with respect to corresponding parameter subclasses [40-42]. Spearman's correlation analysis was performed using MetaboAnalyst 2.0 software [32-34] and GraphPad Prism.
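
The VIP definition given above (weighted sums of squares of the PLS weights, weighted by the Y-variation explained per component) can be sketched as follows. This is a plain-Python illustration of the commonly used VIP formula, not the SIMCA-P+ or MetaboAnalyst implementation; the weights and explained sums of squares are hypothetical numbers, and `vip_scores` is a hypothetical helper name.

```python
import math


def vip_scores(weights, ssy):
    """Compute VIP scores from PLS weights.

    weights[a][j]: weight of variable j on component a
    ssy[a]: Y sum of squares explained by component a
    VIP_j = sqrt(p * sum_a(ssy_a * w_aj^2 / ||w_a||^2) / sum_a(ssy_a))
    """
    n_vars = len(weights[0])
    total = sum(ssy)
    scores = []
    for j in range(n_vars):
        acc = 0.0
        for w_a, ss_a in zip(weights, ssy):
            norm_sq = sum(w * w for w in w_a)
            acc += ss_a * w_a[j] ** 2 / norm_sq
        scores.append(math.sqrt(n_vars * acc / total))
    return scores


# Hypothetical two-component model with three variables
weights = [[0.8, 0.6, 0.0], [0.0, 0.6, 0.8]]
ssy = [3.0, 1.0]
vip = vip_scores(weights, ssy)
print([round(v, 3) for v in vip])  # [1.2, 1.039, 0.693]

# Variables retained at the 0.8 threshold used in the text
selected = [j for j, v in enumerate(vip) if v >= 0.8]
print(selected)  # [0, 1]
```

By construction the squared VIP values average to 1 across variables, which is why thresholds close to 1 (here 0.8) are used to flag influential descriptors.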
Correlated analysis of the tumor determinants revealed a marked increase in HR for p53, cathepsin D and Bcl-2 (Figs. 1, 2). Positivity for p53 nuclear expression was found to associate with an eleven-fold increase in relapse risk (HR = 11; 95 % C.I. = 2.5-51.8). An unprecedented increase in risk was found for cathepsin D expression (HR = 20; 95 % C.I. = 2.3-184.3). Notably, expression profiles of p53 and cathepsin D remained significantly different between cases and controls when subgrouping patients by lymph node status, supporting an independent prognostic value of these parameters. Lymph node diffusion correlated with local cancer relapse (rho = 0.405, p = 0.014), but not with distant metastatic relapse, raising the issue that determinants of local invasion may differ from those required for metastatic diffusion. Hence, we assessed the impact of p53 and cathepsin D in lymph node-negative patients. Remarkably, tumor co-expression of p53 and cathepsin D in this patient subgroup remained associated with a sixteen-fold higher risk of experiencing relapse (HR = 16; 95 % C.I. = 1.5-171.2). Trends for association of positive lymph nodes and tumor size were found: 50 and 78 % of lymph-node-positive women were positive for p53 and cathepsin D, respectively; 63 and 74 % of women with tumors bigger than 2 cm were positive for p53 and cathepsin D, respectively.
Remarkably, the expression of Bcl-2 was associated with a markedly better prognosis, and a nine-fold reduction of risk (HR = 9.2; 95 % C.I. = 1-87.8). Bcl-2 expression was previously found to correlate with a differentiated cancer phenotype, i.e. with lower grading and lack of p53 [45]. Consistently, Bcl-2 expression was found to correlate with that of ERα and PgR, and was anti-correlated with cancer grading and with the expression of p53, Cyclin E and Her-2 (Table S3).
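
The text does not state how the risk estimates and their 95 % confidence intervals were derived. Purely as an illustration of why small case-control tables yield wide intervals such as those reported above, the sketch below computes an odds ratio with a Woolf (log-based) interval from a 2×2 table. The choice of an odds ratio, the function name `odds_ratio_ci` and the cell counts are all assumptions; the counts are hypothetical and not taken from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a: marker-positive, relapsed      b: marker-positive, not relapsed
    c: marker-negative, relapsed      d: marker-negative, not relapsed
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only
or_, lo, hi = odds_ratio_ci(10, 5, 5, 10)
print(or_, round(lo, 2), round(hi, 2))
```

With only a few patients per cell, the standard error of the log odds ratio is large, so the upper bound can exceed the point estimate several-fold.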
Correspondingly, Bcl-2 expression was shown to have a beneficial influence on prognosis [46,47], whereas loss of Bcl-2 was found in 70 % of the aggressive triple-negative BC, and was significantly associated with high proliferation, tumor progression, increased risk of death and recurrence [48]. Still, the magnitude of Bcl-2 prognostic impact observed here in metastatic versus non-metastatic BC had not been previously revealed [49], supporting a critical value of correlated evaluation of malignancy determinants (Bcl-2, p53, cathepsin D) for effective use in prognostic assessment.
To verify the strength of this unsupervised analysis, and to further build on it, we performed a supervised PLS-DA [50]. Datasets of pathological/experimental parameters were grouped using a dichotomic classification (metastatic relapse versus no relapse). This model was found to have strong goodness of fit (cumulative R²Y = 0.828) and prediction power (cumulative Q² = 0.548) (Figs. 4, 5 and 6). PLS-DA-identified determinant clusters yielded a clear-cut discrimination between metastatic and non-metastatic tumours (Figs. 4, 5a). A PLS-DA weight plot was generated in order to identify the major discriminants between the groups analyzed (Fig. 4). Next, VIP scores were computed for each parameter. Twenty descriptors, i.e. local relapse, grading, HER-2 (membrane intensity), lymph node status, p53, p16, Bcl-2, Cyclin E, PgR, together with stromal cathepsin D, PAI-1, uPA and MMP-11 were found to markedly contribute to the classification model (VIP score ≥ 0.8) (Fig. 5b) [39]. Permutation tests were carried out in order to validate the PLS-DA model [38,50]. The original model was found to have higher R² and Q² values than the permuted models, and negative Q² values were obtained for both permuted groups tested (Fig. 5c).
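
The logic of the permutation test above (compare the statistic of the original model with the same statistic after shuffling the class labels) can be sketched generically. In this illustration the PLS-DA R²/Q² values are replaced by a much simpler stand-in statistic, the absolute difference between group means, so the code shows the permutation scheme only and not the actual model validation; all names and data are hypothetical.

```python
import random


def mean_gap(values, labels):
    """Stand-in statistic: absolute difference between the two group means."""
    g1 = [v for v, lab in zip(values, labels) if lab == 1]
    g0 = [v for v, lab in zip(values, labels) if lab == 0]
    return abs(sum(g1) / len(g1) - sum(g0) / len(g0))


def permutation_test(values, labels, statistic=mean_gap,
                     n_permutations=200, seed=0):
    """Return (observed statistic, empirical p-value) under label shuffling."""
    rng = random.Random(seed)
    observed = statistic(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the link between values and classes
        if statistic(values, shuffled) >= observed:
            hits += 1
    # add-one correction keeps the p-value strictly positive
    p_value = (hits + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical scores: 0 = no relapse, 1 = relapse
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
observed, p_value = permutation_test(values, labels)
print(observed, p_value)
```

A statistic that stays well above its permuted counterparts, as reported for R² and Q² in the text, indicates that the class separation is unlikely to arise from chance labelling.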

DNA extraction
DNA was extracted from sections of FFPE BC (Additional file 1: Table S1A). Ethidium bromide gel electrophoresis (Fig. 7) and amplification of RAS and TP53 exons benchmarked DNA as viable for downstream analyses. RAS and TP53 sequences were determined on case and control DNA (Additional file 1: Table S1), as described (Figs. 7 and 8).

TP53 mutations
Case and control FFPE tumor samples were systematically analyzed for insertions, deletions and stop codons in the coding region of the TP53 gene by PCR and sequencing of PCR amplification products (Figs. 7 and 8; Additional file 5: Table S5). Structural alterations of the TP53 gene are listed in Additional file 5: Table S5. Three indels were identified, and one stop codon, all of which led to truncation of the corresponding p53 proteins. Remarkably, all truncated p53 (8.7 % of the BC cases) were identified in relapsing cancers, three out of four cases being grade 3, i.e. those with the most malignant phenotype. These findings support models where severely damaged p53 is a strong risk factor for tumor progression in defined subgroups of BC [7,8,10,31]. Notably, though, only one of these cases was a triple-negative tumor, a tumor phenotype traditionally associated with tumor aggressiveness [8], suggesting that the present molecular characterization may lead to novel subgrouping strategies of BC for risk determination. However, larger case series are needed to validate this approach.

RAS mutations
Case and control BC samples were analyzed for mutations at codons 12, 13, 14 and 61 of the Ha-RAS and Ki-RAS genes by PCR amplification and sequencing of the first and second exon. Three cases showed a mutation at codon 12 of Ha-RAS, from GGC to GTC (Gly → Val); one case showed an additional mutation at codon 14 from GTG to ATG (Val → Met), with an overall prevalence of tumors bearing RAS mutations of 6.4 % (Additional file 1: Table S1). Of note, all mutations occurred in the metastatic and locally invasive/relapsing cases. This suggested relevance of mutated Ha-RAS in a small, distinct subset of metastatic BC. Mutations of both Ha-RAS and TP53 were identified in the same cancer, suggesting a possible cooperativity in cell transformation [51].
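
The amino acid consequences of the codon changes reported above can be checked against the standard genetic code. The short sketch below builds the standard codon table and derives the one-letter missense notation (G = Gly, V = Val, M = Met); the helper name `describe_missense` is hypothetical and the snippet is illustrative, not the authors' analysis code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def describe_missense(codon_number, ref_codon, alt_codon):
    """Return one-letter notation such as 'G12V' for a codon change."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    return f"{ref_aa}{codon_number}{alt_aa}"


print(describe_missense(12, "GGC", "GTC"))  # G12V (Gly -> Val)
print(describe_missense(14, "GTG", "ATG"))  # V14M (Val -> Met)
```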

Discussion
Traditional prognostic indicators of BC, i.e. lymph node diffusion, tumor size, grading and estrogen receptor expression, are inadequate predictors of metastatic relapse. Therefore, parameters that complement these traditional prognostic indicators are urgently needed. Several genes (oncogenes, tumor suppressor genes, transcription factors, signaling molecules, adhesion proteins, proteases) play a driving role in tumor progression [52]. Individual oncogenic determinants, e.g. p53, Her-2, E-cadherin, Trops, have been shown to possess prognostic/predictive power [7-11, 20, 53]. However, they did not outperform traditional prognostic indicators. Tumor progression is a multistep process [13,54-58], which correlates with multiple, successive molecular modifications [13,14]. Hence, clusters of tumor-driving traits are expected to be associated with tumor aggressiveness and overall malignancy much more strongly than individual factors. In this work, we tested such a model in BC. Histopathological and molecular determinants of tumor progression of post-menopausal BC were analyzed, to assess their impact on metastatic relapse. Aggregation of cancer determinants was explored by modeling through discriminant analysis, logistic regression, partial least squares and partition trees. This identified upregulation of p53 and cathepsin D, together with downregulation of Bcl-2, as associated with a major increase in risk of disease relapse.
p53 is a tumor suppressor gene which is frequently mutated in cancer cells [59], and was identified as an indicator of both prognosis [8,60-62] and response to therapy [7]. A cooperation of p53 with other drivers of tumor progression, e.g. Her-2 [8,63] and Trop-1/Ep-CAM [10,64], was previously shown, thus lending support to our model of interaction between distinct prognostic determinants.
Bcl-2 inhibits cellular apoptosis [65]. However, Bcl-2 expression has a stronger impact as an indicator of retained cancer differentiation, and of better disease outcome [45]. Indeed, loss of Bcl-2 was shown to have negative prognostic impact [46,47,49]. Bcl-2 expression was lost in 70 % of the most aggressive triple-negative BC cases, i.e. those lacking ERα, PgR and Her-2, and was significantly associated with high proliferation, tumor progression and increased risk of death and recurrence [48]. Supporting these findings, we found that Bcl-2 expression negatively correlated with cancer grading and with the expression of p53, cyclin E and Her-2. On the other hand, Bcl-2 expression was found to correlate with that of ERα and PgR, i.e. with differentiated cancer phenotypes.
As for the additional determinants we analyzed, cyclins D and E regulate the cell cycle [80], and increased levels are associated with worse prognosis and increased relapse rates in BC patients [81]. p27/kip1 and p16/INK4 are inhibitors of cyclin-dependent kinases and can prevent progression through the cell cycle [55], but can also be determinants of malignancy. High levels of the p27/kip1 cyclin inhibitor have been associated with worse prognosis and higher relapse rate in BC [82,83]. On the other hand, deletion of p16/INK4 can be selected for in BC [84]. Consistent with an interactive predictive value, the levels of Cyclin E and of the p27 cyclin inhibitor were shown to have a higher impact when combined [82]. The mitotic index (Ki-67) is a measure of the percentage of tumor cells in active division and is a relevant prognostic indicator in BC [31]. Her-2 is a transmembrane tyrosine kinase receptor that regulates the growth of tumor cells [85]. The levels of expression of Her-2 have been shown to be independent indicators of worse prognosis, with respect to tumor relapse and overall survival in BC patients [86].
To identify interaction effects of different variables on disease outcome, expression profiles of tumor progression drivers were assessed, and results were evaluated by means of statistical analyses designed to detect significant prognostic interaction. To preempt the need for a priori specified hypotheses, patterns of aggregation of molecular parameters affecting prognosis were modeled through logistic regression and PLS-DA, using relapse as a dichotomic variable. PLS-DA score plot clustering of tumor samples with or without disease relapse separated the two clusters. The major discriminant parameters were HER-2, p53, p16, Cyclin E and PgR, together with stromal cathepsin D, PAI-1, uPA and MMP-11, all of which markedly contributed to the classification model; these efficiently clustered with local relapse, lymph node diffusion, tumor staging and grading. Among prognostic factors, p53 and cathepsin D stood out as major determinants of cancer relapse. Bcl-2 expression was shown to provide unprecedented protective power versus tumor recurrence, making the combined assessment of these IHC parameters a candidate for use in clinical settings.

Figure caption (fragment): Unlike R²X (cum), Q² (cum) is not additive. c Permutation tests for metastatic (left) and non-metastatic tumors (right). Permutation tests were performed by comparing R² and Q² of the original model with R² and Q² of Y-class-permuted models. The correlation coefficients of original and permuted data are reported on the X axis; 200 random permutations were carried out. The values of R² and Q² are reported on the Y axis. The green triangles and blue squares in the upper right (ρ = 1) correspond to the values of R² (green triangles) and Q² (blue squares) of the original data. The low values of intercepts show that the model has high statistical significance (no over-fitting)
Of interest, our case-control study included only one triple negative BC, indicating that a triple negative status was not a confounding variable in our study, and that p53, cathepsin D and Bcl-2 are efficient aggressiveness determinants in BC across currently categorized cancer subgroups.
Specific mutations of oncogenes and tumor suppressor genes play key roles in tumor progression. TP53 is frequently inactivated in several human tumors [87-89] and TP53 mutations help classify and select patient subgroups with different biological features [8,90], particularly in BC [8,10,31]. Mutations in different regions of TP53 were shown to be heterogeneous in nature [91] and clinical outcome, indels having the highest impact [92]. Consistently, sequencing of the TP53 gene revealed a subgroup of BC where truncating mutations, such as indels and stop codons, were in all cases associated with cancer relapse.
The RAS genes code for small G proteins that play a critical role in signal transduction pathways downstream of growth-factor receptors. RAS mutations can affect prognosis [93-95]. Moreover, Ras downstream target genes are synergistically upregulated by mutated p53 and Ha-Ras, among them matrix metalloproteases, which play a key role in local invasion and distant dissemination [96]. Hence, hot-spot sequence analysis was performed for Ha- and Ki-RAS, and findings were correlated with the IHC data and clinical phenotypes. The constitutive activation of the Ras proteins by point mutations, concentrated in hotspots at codons 12, 13 and 61, is among the most frequently observed oncogene activations in human malignancies (75 % of adenocarcinomas of the pancreas, 40 % of adenomas and carcinomas of the colon and rectum, 25 % of carcinomas of the lung) and has been linked to worse prognosis [97]. However, although mutations of Ha-RAS and Ki-RAS are often found in animal models of BC [98], their mutation frequency in human BC was shown to vary widely across studies. c-Ki-RAS mutations were shown to occur in 1 out of 8 BC by Yanez et al. [99]. Ha-RAS mutations were detected by Spandidos et al. [100], but not by Biunno et al. [101]. An overall low frequency of Ha-RAS mutations was found in most subsequent studies [97,102-106]. Our findings support an incidence of mutated Ha-RAS in ≈ 5 % of BC cases. No mutations were detected in Ki-RAS. Remarkably, all RAS mutations were identified in relapsed cases, suggesting an impact of mutated Ha-RAS in a distinct subset of malignant BC [97,104-106]. This finding warrants testing in a prospective clinical trial with adequate size and predictive power for subgroup dissection of relapsed cases.

Conclusions
Taken together, our findings support a model of high BC aggressiveness as associated with high levels of p53 [8,10] and cathepsin D [79], together with a downregulation of Bcl-2 [48]. An interaction between tumor-relapse risk factors may thus have a marked impact on prognosis, paving the way for using cluster molecular profiling of BC to identify patient subgroups with distinct disease outcomes.

Figure caption (fragment): Table S5); the mutation site is boxed; the corresponding amino acid sequence is indicated