Prediction of Recurrence and Survival for Triple-Negative Breast Cancer (TNBC) by a Protein Signature in Tissue Samples

To date, there is no available targeted therapy for patients who are diagnosed with triple-negative breast cancers (TNBC). The aim of this study was to identify a new specific target for targeted treatment. Frozen primary tumors were collected from 83 adjuvant therapy-naive TNBC patients. These samples were used for global proteome profiling by an iTRAQ-OFFGEL-LC-MS/MS approach in two series: a training cohort (n = 42) and a test set (n = 41). Patients who remained free of local or distant metastasis for a minimum of 5 years after surgery were classified in the no-relapse group; the others were in the relapse group. OPLS and Kaplan–Meier analyses were performed to select candidate markers, which were validated by immunohistochemistry. Three proteins were identified in the training set and validated in the test set by the Kaplan–Meier method and immunohistochemistry (IHC): TrpRS as a good prognostic marker and DP and TSP1 as bad prognostic markers. We propose the establishment of an IHC test to calculate the score of TrpRS, DP, and TSP1 in TNBC tumors to evaluate the degree of aggressiveness of the tumors. Finally, we propose that DP and TSP1 could provide therapeutic targets for specific treatments.

Triple-negative breast cancers (TNBC) lack expression of the estrogen, progesterone, and HER2/neu receptors and account for about 15% of all breast cancers. This subtype is associated with poor prognosis (1) in terms of disease-free survival (DFS) and overall survival (OS), and to date, there is no clinically available targeted therapy for patients diagnosed with TNBC. Because of the absence of specific treatment guidelines for this group of patients, TNBC are managed with standard adjuvant chemotherapy (2), which, however, seems to be less effective in those cancers. In order to improve survival, it is important to identify new, specifically targeted treatments.
A proteomic analysis has several inherent advantages over a genomic approach in that measured mRNA levels do not necessarily correlate with the corresponding protein levels. In addition, protein detection is probably also more reflective of the tumor microenvironment. Several proteomic studies have been conducted on TNBC (3)(4)(5), but none was conducted on a large cohort including the clinical outcome of the patients, except a recent comparative proteome analysis that identified an 11-protein signature for aggressive TNBC in a large cohort of 93 microdissected tumors (6). Although microdissection was necessary to elucidate the contribution of TNBC cells, it did not reflect the tumor with its microenvironment, which is increasingly described as fundamental to explaining the tumor outcome. Thus, it is now recognized that carcinomas derive from phenomena that occur in tissues, not in individual cancer cells. From this perspective, the microenvironment becomes an integral, essential part of the tumor (7,8). In this context, taking into account the tumor microenvironment, we investigated a cohort of 83 TNBC samples without microdissection by a quantitative proteomic approach using iTRAQ labeling. Based on clinical data, we established a protein signature of the most aggressive tumors. Some of these differentially expressed proteins appeared to be potential therapeutic targets.

PATIENTS AND METHODS
Patients-The study involved 83 patients diagnosed with and treated for TNBC at the Institut de Cancérologie de l'Ouest (ICO) between early 1998 and 2007. The primary inclusion criterion was an adequate fresh tumor obtained from a resected tumor sample (see below). Patients were included if they fulfilled the following criteria: (a) female with primary unilateral invasive ER/PR- and HER2-negative breast carcinoma without previous or concomitant malignancies; (b) T1-T2, N0-N3, M0 staging according to UICC criteria; (c) older than 18 years; and (d) surgical first-line treatment. No patient showed evidence of distant metastasis at the time of diagnosis. None had received chemotherapy, endocrine therapy, or radiation therapy prior to surgery. Treatment decisions were based solely on consensus recommendations at the time of diagnosis. All the patients received the same adjuvant chemotherapy (FEC100) and radiotherapy treatments.
Patients were followed up for disease evolution. The 83 tumors were divided into two cohorts: a training cohort (n = 42) corresponding to patients diagnosed at ICO Paul Papin (Angers) and a test cohort (n = 41) corresponding to patients diagnosed at ICO René Gauducheau (Nantes). The clinicopathological characteristics of these TNBC cohorts are listed in Supplemental data S1. Follow-up data were collected for all patients, including disease-free survival (DFS; time from diagnosis to first recurrence of the disease or contralateral breast cancer or second primary other cancer) and overall survival (OS; time from diagnosis to death from any cause). Recurrences were defined as locoregional (breast, mammary region, or regional lymph nodes) or metastatic (visceral or not). Informed consent was obtained from patients to use their surgical specimens and clinicopathological data for research purposes, as required by the French Committee for the Protection of Human Subjects. This study did not need ethical approval.
Tumor Characteristics-Pathological data were reviewed by two pathologists. Tumor size (pT) was measured on fresh resection specimens as the maximum diameter (mm) of the tumor. Histological type was determined according to the WHO classification and histological grade according to the Elston and Ellis method.
ER and PR status were assessed by immunohistochemistry on representative formalin-fixed tumor blocks sectioned at a 4-μm thickness. Tumors were considered negative when < 10% of cells stained positive. All patients were HER-2 negative, meaning an immunostaining score of 1+ according to the HercepTest scoring system, or 2+ without HER-2 gene amplification as investigated by fluorescence in situ hybridization.
Sample Collection-All specimens were collected immediately after surgery, snap frozen, and stored in liquid nitrogen until the time of analysis. The time between the resection of the breast tumor and its freezing was less than 1 h. We also selected four normal macroscopic areas for our control pool. Frozen sections (12 μm thick) of either TNBC or normal areas were embedded in OCT and cut on a cryostat (Bright Instrument Co. Ltd., St. Margarets Way, UK). Specific sections were stained with toluidine blue for visual reference, and each tissue section from all specimens was evaluated by experienced pathologists to determine the cancer cell proportion. Samples containing less than 75% tumor cells were excluded.
Protein Extraction from Frozen Tissues-Ten frozen sections per tumor were lysed in a buffer consisting of 0.1 M Tris-HCl, pH 8.0; 0.1 M DTT; and 4% SDS at 95°C for 90 min. Detergent was removed from the lysates, and the proteins were digested with trypsin using the FASP protocol (9) with spin ultrafiltration units with a nominal molecular weight cutoff of 30,000. To YM-30 Microcon filter units (Cat. No. MRCF0R030, Millipore) containing the protein concentrates, 200 μl of 8 M urea in 0.1 M Tris/HCl, pH 8.5 (UA), was added, and samples were centrifuged at 14,000 × g at 20°C for 8 min. This step was repeated three times. Then 6 μl of 200 mM MMTS in 8 M urea was added to the filters, and the samples were incubated for 20 min. Filters were washed three times with 200 μl of 8 M UA followed by six washes with 100 μl of 0.5 M TEAB. Finally, trypsin (AB Sciex, Carlsbad, CA) was added in 100 μl of 0.5 M TEAB to each filter. The protein to enzyme ratio was 100:1. Samples were incubated overnight at 37°C, and released peptides were collected by centrifugation. Samples were then dried completely using a SpeedVac, resuspended in 100 μl of 0.5% trifluoroacetic acid (TFA) in 5% acetonitrile, and desalted via PepClean C-18 spin columns (Pierce Biotechnology, Rockford, IL). Peptide content was determined using the Micro BCA Protein Assay Kit (Pierce-Thermo Scientific).
Peptide Labeling with iTRAQ Reagents-Each peptide solution was labeled at room temperature for 2 h with one iTRAQ reagent vial previously reconstituted with 70 μl of ethanol for the 4plex iTRAQ reagent or with 50 μl of isopropanol for the 8plex iTRAQ reagent. A mixture containing small aliquots from each labeled sample was analyzed by MS/MS to determine a proper mixing ratio to correct for unevenness in peptide yield. Labeled peptides were then mixed in a 1:1:1:1 (or 1:1:1:1:1:1:1:1) ratio. The peptide mixture was then dried completely using a SpeedVac.
Peptide OFFGEL Fractionation-For pI-based peptide separation, we used the 3100 OFFGEL Fractionator (Agilent Technologies, Böblingen, Germany) with a 24-well setup using our protocol (10). Briefly, prior to electrofocusing, samples were desalted on a Sep-Pak C18 cartridge (Waters). For the 24-well setup, peptide samples were diluted to a final volume of 3.6 ml using OFFGEL peptide sample solution. To start, the 24-cm-long IPG gel strip (GE Healthcare, München, Germany) with a 3-10 linear pH range was rehydrated with the Peptide IPG Strip Rehydration Solution according to the manufacturer's protocol for 15 min. Then, 150 μl of sample was loaded in each well. Electrofocusing of the peptides was performed at 20°C and 50 μA until the 50 kVh level was reached. After focusing, the 24 peptide fractions were withdrawn, and the wells were washed with 200 μl of a solution of water/methanol/formic acid (49/50/1). After 15 min, the washing solutions were pooled with their corresponding peptide fraction. All fractions were evaporated by centrifugation under vacuum and maintained at −20°C. Just prior to nano-LC, the fractions were resuspended in 20 μl of H2O with 0.1% (v/v) TFA.
Capillary LC Separation-The samples were separated on an UltiMate 3000 nano-LC system (Dionex, Sunnyvale, USA) using a C18 column (PepMap100, 3 μm, 100 Å, 75 μm id × 15 cm, Dionex) at a 300 nl/min flow rate. Buffer A was 2% ACN in water with 0.05% TFA, and buffer B was 80% ACN in water with 0.04% TFA. Peptides were desalted for 3 min using only buffer A on the precolumn, followed by a separation for 105 min using the following gradient: 0 to 20% B in 10 min, 20% to 45% B in 85 min, and 45% to 100% B in 10 min. Chromatograms were recorded at a wavelength of 214 nm. Peptide fractions were collected using a Probot microfraction collector (Dionex). We used CHCA (LaserBioLabs, Sophia-Antipolis, France) as the MALDI matrix. The matrix (concentration of 2 mg/ml in 70% ACN in water with 0.1% TFA) was continuously added to the column effluent via a micro "T" mixing piece at a 1.2 μl/min flow rate. After a 12-min run, a start signal was sent to the Probot to initiate fractionation. Fractions were collected for 10 s and spotted on a MALDI sample plate (1,664 spots per plate, Applied Biosystems, Foster City, CA).
MALDI-MS/MS-MS and MS/MS analyses of offline spotted peptide samples were performed using the 5800 MALDI-TOF/TOF Analyzer (AB Sciex) and 4000 Series Explorer software, version 4.0. The instrument was operated in positive ion mode and externally calibrated using a mass calibration standard kit (AB Sciex). The laser power was set between 2,800 and 3,400 for MS and between 3,600 and 4,200 for MS/MS acquisition. After screening all LC-MALDI sample positions in MS-positive reflector mode using 1,500 laser shots, the fragmentation of automatically selected precursors was performed at a collision energy of 1 kV using air as collision gas (pressure of ~2 × 10⁻⁶ Torr) with an accumulation of 2,000 shots for each spectrum. MS spectra were acquired between m/z 1,000 and 4,000. For internal calibration, we used the parent ion of Glu1-fibrinopeptide at m/z 1,570.677 diluted in the matrix (30 femtomoles per spot). Up to 12 of the most intense ion signals per spot position having an S/N > 12 were selected as precursors for MS/MS acquisition. Peptide and protein identification were performed with the ProteinPilot™ Software V 4.0 (AB Sciex) using the Paragon algorithm as the search engine (11). Each MS/MS spectrum was searched for Homo sapiens species against the UniProt/Swiss-Prot database (UniProtKB/Sprot 20120208 release 01, with 525,997 sequence entries). The searches were run with the fixed modification of methylmethanethiosulfate-labeled cysteine enabled. Other parameters such as tryptic cleavage specificity, precursor ion mass accuracy, and fragment ion mass accuracy are MALDI 5800 built-in functions of the ProteinPilot software. The detected protein threshold (unused ProtScore (confidence) in the software) was set to 1.3 to achieve 95% confidence, and identified proteins were grouped by the ProGroup algorithm (AB Sciex) to minimize redundancy. The bias correction option was executed.
A decoy database search strategy was also used to estimate the false discovery rate (FDR), defined as the percentage of decoy proteins identified against the total protein identification. The FDR was calculated by searching the spectra against the UniProt H. sapiens decoy database. The low estimated FDR of 0.9% indicated a high reliability of the identified proteins.
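The FDR definition above reduces to a simple ratio. The following is a minimal sketch, not the ProteinPilot implementation; the function name and the counts in the usage line are hypothetical and chosen only to reproduce a 0.9% value.

```python
def decoy_fdr_percent(n_decoy_hits: int, n_total_hits: int) -> float:
    """Percentage of decoy protein identifications among all identifications."""
    if n_total_hits <= 0:
        raise ValueError("n_total_hits must be positive")
    if not 0 <= n_decoy_hits <= n_total_hits:
        raise ValueError("n_decoy_hits must lie between 0 and n_total_hits")
    return 100.0 * n_decoy_hits / n_total_hits


# Hypothetical counts for illustration only (not taken from the study).
example_fdr = decoy_fdr_percent(9, 1000)
```

With these illustrative counts, 9 decoy hits among 1,000 identifications correspond to an FDR of 0.9%.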
Quantification of Relative Protein Expression-We employed a customized software package, iQuantitator (12)(13)(14), to infer the magnitude of change in protein expression. The software infers sample-dependent changes in protein expression using Markov chain Monte Carlo and Bayesian statistical methods. Basically, this approach was used to generate means and 95% credible intervals (upper and lower) for the expression of each protein in each tumor of the training set and the test set by using peptide-level data for each component peptide. For proteins whose iTRAQ ratios were down-regulated in tumors, the down-regulation was considered significant if the upper limit of the credible interval had a value lower than 1. Conversely, for proteins whose iTRAQ ratios were up-regulated in tumors, the up-regulation was considered significant if the lower limit of the credible interval had a value greater than 1. The width of these credible intervals depends on the data available for a given protein.
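The decision rule on the credible interval can be written in a few lines. This is a sketch of the rule as stated in the text only; it does not reproduce iQuantitator, and the function name and labels are illustrative assumptions.

```python
def classify_regulation(lower: float, upper: float) -> str:
    """Classify an iTRAQ ratio from the bounds of its 95% credible interval.

    'down' if the whole interval lies below 1, 'up' if it lies above 1,
    otherwise 'unchanged' (the interval contains a ratio of 1).
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if upper < 1.0:
        return "down"
    if lower > 1.0:
        return "up"
    return "unchanged"
```

For example, an interval of (0.4, 0.8) would be called down-regulated, (1.2, 2.5) up-regulated, and (0.8, 1.3) not significantly changed.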
Since the number of peptides observed and the number of spectra used to quantify the change in expression for a given protein are taken into consideration, it is possible to detect small but significant changes in up- or down-regulation when many peptides are available. The peptide selection criteria for relative quantification were as follows. Only peptides unique to a given protein were considered for relative quantification; a peptide that could be assigned to more than one protein was eliminated from consideration prior to analysis. Proteins were identified on the basis of having at least two peptides with an ion score above 95% confidence. The protein sequence coverage (95%) was estimated for specific proteins as the number of matching amino acids from the identified peptides having confidence greater than or equal to 95%, divided by the total number of amino acids in the sequence and expressed as a percentage.
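The sequence coverage calculation can be illustrated as follows. This is a minimal sketch under the assumption that the peptides passed in have already been filtered at the 95% confidence level; the sequence and peptides in the usage line are invented for illustration.

```python
from __future__ import annotations


def sequence_coverage_percent(protein_seq: str, peptides: list[str]) -> float:
    """Percentage of residues covered by at least one identified peptide."""
    if not protein_seq:
        raise ValueError("protein sequence must not be empty")
    covered = [False] * len(protein_seq)
    for pep in peptides:
        if not pep:
            continue  # ignore empty strings
        start = protein_seq.find(pep)
        while start != -1:
            # Mark every residue spanned by this occurrence of the peptide.
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein_seq.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein_seq)


# Invented 16-residue sequence and two peptides, for illustration only.
example_coverage = sequence_coverage_percent("MKTAYIAKQRQISFVK", ["TAYIAK", "QISFVK"])
```

Overlapping peptides are counted once per residue, so the value never exceeds 100%.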
Functional Annotation and Network Analysis-Gene ontology (GO) terms for identified proteins were extracted, and overrepresented functional categories for differentially abundant proteins were determined by the high-throughput GOMiner tool (National Cancer Institute, http://discover.nci.nih.gov.gate2.inist.fr/gominer/) (15). All proteins that were subjected to iQuantitator analysis served as the background list, and GO terms with at least five proteins were used for statistical calculations. A p value for each term was calculated via the one-sided Fisher's exact test, and FDR was estimated by permutation analysis using 1,000 randomly selected sets of proteins sampled from the background list. Statistically significant (FDR < 25%) GO terms were clustered based on the correlation of associated proteins to minimize potential redundancy in significant GO terms.
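For a single GO term, the one-sided Fisher's exact test is equivalent to the upper tail of a hypergeometric distribution. The sketch below shows that calculation with the standard library only; it is not the GOMiner code, it omits the permutation-based FDR step, and the argument names are assumptions.

```python
from math import comb


def enrichment_p_value(k: int, n: int, K: int, N: int) -> float:
    """One-sided (over-representation) Fisher exact p value for one GO term.

    N: proteins in the background list
    K: background proteins annotated with the term
    n: differentially abundant proteins
    k: differentially abundant proteins annotated with the term
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(n, K)):
        raise ValueError("inconsistent counts")
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total
```

In practice the same value can be obtained from a statistics library; the explicit sum is shown here only to make the definition visible.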
Statistical Analysis-To visualize clustering of groups, a two-way (by protein and tumor ID) hierarchical clustering was performed on log2-transformed data. Further multivariate statistics and modeling were performed with SIMCA (SIMCA 13.0, Umetrics, Sweden) (16). The analysis was performed on mean-centered, unit-variance-scaled data, assuming equal importance of each protein regardless of relative abundance and magnitude of variance between samples. Principal component analysis (PCA) (17,18) was performed to get an overview of the data, detect clustering of the data, and pick up outliers, if any. The PCA summarizes the variation of the data matrix (i.e. protein ratios) and shows the relationship between the observations. For classification and identification of proteins differentiating relapse from relapse-free tumors/patients, we used orthogonal partial least squares analysis (OPLS) (19). The OPLS analysis detects the protein expression data that covary with the defined clinical groups. For optimization of the OPLS models, we used the variable importance in the projection value to judge protein influence (including prediction performance) on the model. The OPLS models were validated by sevenfold full cross-validation. Proteins with high variable importance in the projection throughout the cross-validation of the model (95% confidence interval) were selected for the optimized model. We used the plots of the scores predicted in the cross-validation and analysis of variance (CV-ANOVA) to evaluate the model validity. Fisher exact tests were calculated for the training set versus the test set and for the relapse versus no-relapse cohorts.
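Mean-centering with unit-variance scaling (often called autoscaling) can be sketched as below. This only illustrates the preprocessing step; it is not SIMCA code, and the use of the sample standard deviation (n - 1 denominator) and the handling of constant columns are assumptions.

```python
from __future__ import annotations

from math import sqrt


def autoscale(matrix: list[list[float]]) -> list[list[float]]:
    """Mean-center and unit-variance scale each column (protein).

    Rows are tumors and columns are proteins. Constant columns are set
    to 0.0 to avoid division by zero (an assumption of this sketch).
    """
    n = len(matrix)
    if n < 2:
        raise ValueError("at least two observations are required")
    scaled_columns = []
    for column in zip(*matrix):
        mean = sum(column) / n
        sd = sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
        scaled_columns.append([(v - mean) / sd if sd > 0 else 0.0 for v in column])
    # Transpose back to rows = tumors, columns = proteins.
    return [list(row) for row in zip(*scaled_columns)]
```

After this step every protein has mean 0 and standard deviation 1 across tumors, which is what gives each protein equal weight regardless of abundance.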
Survival rates were calculated using the nonparametric Kaplan-Meier method, and log-rank tests were performed to evaluate the difference in the time between recurrence and nonrecurrence groups. Multivariate Cox models were used to assess the prognostic value of each variable.
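The Kaplan-Meier product-limit estimate can be sketched with the standard library as follows. This is an illustration of the estimator only; the log-rank test and the Cox models are not included, and the function name and input layout are assumptions.

```python
from __future__ import annotations


def kaplan_meier(times: list[float], events: list[int]) -> list[tuple[float, float]]:
    """Return (time, survival probability) at each time with an event.

    times: follow-up time per patient (e.g. months)
    events: 1 if the event (relapse or death) occurred, 0 if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve
```

Censored patients reduce the number at risk without lowering the curve, which is why the estimate only steps down at event times.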
Immunohistochemistry and Scoring-The 42 tumors from the training set (ICO René Gauducheau) were studied by immunohistochemistry. The immunohistochemistry was carried out on 4-μm-thick paraffin-embedded sections of formalin-fixed tumor samples. Details of the antigen retrieval technique and dilution of primary antibodies (TrpRS, DP, and TSP1) are described in Supplemental data S2. The immunolabeling was performed on a BenchMark automated tissue staining system (Ventana Medical System, Tucson, AZ). The immunohistochemistry was evaluated semiquantitatively by the percentage of cells with cytoplasmic staining, the staining intensity, and the presence or absence of secretory granules. To exclude subjectivity, all slides were evaluated by two pathologists who had no knowledge of the patients' identities or clinical status. In discrepant cases, the two pathologists reviewed the slides together and reached a consensus. First, the percentage of immunopositive stained cells (A) was divided into five grades (0 to 4): <10% (0); 10-30% (1); 30-50% (2); 50-70% (3); and >70% (4). Second, the intensity of staining was scored by evaluating the average staining intensity (B) of the positive cells (0, none; 1, weak; 2, intermediate; and 3, strong). The score for each section was calculated as A × B, and the result was defined as negative (-, 0), weakly positive (+, 1-3), positive (++, 4-7), or strongly positive (+++, 8-12). The immunohistochemical data were subjected to statistical analysis. All quantitative data were recorded as mean ± S.D. Comparison between multiple groups was performed by one-way ANOVA and the Wilcoxon rank test (p value < .05).
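The A × B scoring scheme can be expressed directly in code. This is a minimal sketch; because the published grade ranges share their boundary values (for example, 30% appears in two ranges), the sketch assumes that each upper bound is inclusive, and the function names are illustrative.

```python
def percent_grade(percent_positive: float) -> int:
    """Grade A (0-4) from the percentage of immunopositive cells.

    Assumption: shared boundaries (30, 50, 70) fall in the lower grade.
    """
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive < 10:
        return 0
    if percent_positive <= 30:
        return 1
    if percent_positive <= 50:
        return 2
    if percent_positive <= 70:
        return 3
    return 4


def ihc_score(percent_positive: float, intensity: int) -> int:
    """Score A x B, where B is the staining intensity (0 to 3)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2, or 3")
    return percent_grade(percent_positive) * intensity


def ihc_category(score: int) -> str:
    """Map an A x B score (0-12) to -, +, ++, or +++."""
    if not 0 <= score <= 12:
        raise ValueError("score must be between 0 and 12")
    if score == 0:
        return "-"
    if score <= 3:
        return "+"
    if score <= 7:
        return "++"
    return "+++"
```

For instance, a section with 80% positive cells and strong staining receives a score of 4 × 3 = 12 and is classified as strongly positive.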
Receiver Operating Characteristic Curves-Individual and combined biomarker performances were investigated on the receiver operating characteristic curves with linear discriminant analysis. Linear discriminant analysis was used to find a linear combination of features that characterizes or separates two or more classes of objects or events. The resulting combination was then used as a linear classifier. To determine how accurately the learning algorithm was able to predict data, cross-validation and bootstrapping methods were used. In leave-one-out cross-validation, one sample was removed from the dataset, and a classifier was generated using the remaining samples to predict the status of the removed sample. In 10-fold cross-validation, the data were divided into 10 subsets of approximately equal size, and 10 iterations of training and validation were performed. The 0.632+ bootstrap cross-validation uses a resampling technique. The 10-fold cross-validation and bootstrapping procedures were replicated 100 times. Statistical analyses were performed using TANAGRA (v1.4.49).
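The two ideas in this paragraph, the area under the ROC curve and leave-one-out cross-validation, can be sketched for a single marker as follows. This is not the TANAGRA procedure: the classifier is a one-dimensional nearest-class-mean rule used as a simple stand-in for linear discriminant analysis, and all names and example values are hypothetical.

```python
from __future__ import annotations


def roc_auc(scores_relapse: list[float], scores_no_relapse: list[float]) -> float:
    """AUC as the fraction of (relapse, no-relapse) pairs ranked correctly.

    Ties count for one half (Mann-Whitney formulation).
    """
    if not scores_relapse or not scores_no_relapse:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for r in scores_relapse:
        for n in scores_no_relapse:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(scores_relapse) * len(scores_no_relapse))


def loo_accuracy(scores: list[float], labels: list[int]) -> float:
    """Leave-one-out accuracy of a nearest-class-mean rule on one marker."""
    correct = 0
    for i, (x, y) in enumerate(zip(scores, labels)):
        # Train on every sample except the held-out one.
        train = [(s, l) for j, (s, l) in enumerate(zip(scores, labels)) if j != i]
        pos = [s for s, l in train if l == 1]
        neg = [s for s, l in train if l == 0]
        if not pos or not neg:
            raise ValueError("each class needs at least two samples")
        mean_pos = sum(pos) / len(pos)
        mean_neg = sum(neg) / len(neg)
        predicted = 1 if abs(x - mean_pos) < abs(x - mean_neg) else 0
        correct += int(predicted == y)
    return correct / len(scores)
```

With hypothetical IHC scores of 8, 6, and 9 for relapse cases and 2, 6, and 4 for no-relapse cases, 8.5 of the 9 possible pairs are ranked correctly, giving an AUC of about 0.94.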

RESULTS
The training tumors were profiled by the iTRAQ-LC-MS/MS approach (Supplemental data S3). The baseline clinical features of patients were similar between the ICO Paul Papin training set and the ICO René Gauducheau test cohort, although tumor sizes were slightly larger in the training set (Table I). The median follow-up for the good prognosis patients in the training and test sets was 168 (range = 68-279) and 203 (range = 51-413) months, respectively. In the training cohort, 14 patients experienced a relapse (13 distant metastases and 1 contralateral), and among these patients, we recorded 11 deaths. In the test set, 17 patients experienced a relapse (15 distant metastases and 2 contralateral), and among these patients, we recorded 13 deaths. Clinical data used for data analysis were updated until January 2013.

Identification of Expressed Proteins in the Training Samples: Proteomic Coverage of 42 Triple-Negative Breast Tumors-As we considered the microenvironment to be an integral, essential part of the tumor, the samples were not microdissected, but each tumor section was validated by pathologists as containing more than 75% tumor cells. Using ProteinPilot and iQuantitator software, we identified and quantified a total of 2,784 nonredundant proteins with at least two peptides, according to the schematic workflow of the experimental design presented in Fig. 1. By taking into consideration both the peptide and spectra numbers, this approach allowed us to detect small but significant expression changes, provided that several peptides were detected. Using this analysis, we were able to obtain a list of quantified proteins from the 20 iTRAQ experiments. Following Metacore analysis using the "Enrichment of protein function" function (Supplemental data S4), we identified 690 enzymes, 58 phosphatases, 122 proteases, 105 kinases, 73 ligands, 82 transcription factors, and 83 receptors. This analysis showed that the best enrichment score and p value were assigned to the GO Process "Metabolic Process" and to the "Cytoskeleton Remodeling" pathway (Supplemental data S4). Among these 2,784 proteins, 220 proteins met our definition for differential expression in a comparison between tumor and normal tissues: 126 were overexpressed, and 93 were underexpressed (Supplemental data S3).
A Proteomic Coverage of Different Status-We used the iQuantitator software to quantify protein expression for the two statuses, "relapse" (Supplemental data S5) and "no relapse" (Supplemental data S6). For the relapse group, 295 proteins were significantly differentially expressed: 165 were overexpressed, and 130 were underexpressed. The Metacore analysis of this list of proteins indicated a cytoskeleton remodeling, with p = 9.2 × 10⁻¹² for the Process Network "Regulation of Cytoskeleton Rearrangement" and the best enrichment score and p value for "Binding" (p = 9.4 × 10⁻²⁶) in the GO Molecular Functions term. It should be noted that 26 secreted proteins were found in this list characterizing the relapse group (Supplemental data S5). For the no-relapse group, 189 proteins were significantly differentially expressed: 98 were overexpressed, and 91 were underexpressed. For this group, the best score for the Process Network was obtained for "Cell adhesion_Integrin-mediated cell-matrix adhesion" (p = 7.5 × 10⁻¹¹). The protein class "ligands" was found to have the best z-score in the module "Enrichment for Protein Function," with 15 proteins (Supplemental data S6).
Classification Based on Relapse Status-We investigated whether we could detect differences between the relapse and no-relapse groups in terms of protein levels in the triple-negative tumors by OPLS analysis. This analysis was performed on the 549 proteins for which a quantitative value was available in all the tumors. The OPLS model, initially based on all 549 proteins, was optimized by stepwise removal of proteins with a small variable importance in the projection value. This was repeated until the model no longer improved as judged by the CV-ANOVA p value, which indicates the probability that the model is the result of chance alone. The optimized OPLS model included 59 proteins (p = 2.1 × 10⁻¹⁵) (Fig. 2). Among these proteins, 33 were assigned to the group without recurrence and 26 to the group with recurrence. These proteins were matched against a database of known protein signaling pathways using Metacore. For the no-relapse group, two significant pathways (p < .05) were found: Blood coagulation (p = 4.4 × 10⁻⁶) and Chemotaxis_Lipoxin inhibitory action on fMLP-induced neutrophil chemotaxis (p = .0003). The relapse group was characterized by just one significant pathway: Cytoskeleton remodeling_Keratin filaments (p = 7.9 × 10⁻⁷) (Supplemental data S7).
Proteomic Signature of Triple-Negative Breast Tumors Relapse-By combining protein lists obtained from the univariate (with iQuantitator) and the multivariate (OPLS) analyses, we generated two lists of proteins that characterize the relapse (DP, TSP1, G6PD, IDH, KRT19, KRT8, EPPK1, ARHGAP1, DPYSL3) and no-relapse (TrpRS, HSPE1, SAMHD1, HK1, IGHG1) groups of triple-negative breast tumors. These proteins were submitted to a Fisher exact test, and only TrpRS, TSP1, DP, and IDH were validated (Table II and Supplemental data S8).
Prognostic Value of the Markers in the Training Set and in the Test Set-The prognostic value of the markers was evaluated through estimation of disease-free survival (DFS) and overall survival (OS) using the Kaplan-Meier method. The 41 tumors of the test set were analyzed by the same iTRAQ quantitative proteomic approach used for the training set. Then, in both the training and test cohorts, the patients were divided into two categories based on the median iTRAQ expression data for each marker: high expression (protein levels higher than the median) versus low expression (protein levels lower than the median).
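The median-based dichotomization can be sketched as follows. This is an illustration only; the tumor identifiers and ratios are invented, and leaving tumors exactly at the median in neither group is an assumption, since the text defines only "higher" and "lower" than the median.

```python
from __future__ import annotations

from statistics import median


def split_by_median(expression: dict[str, float]) -> tuple[float, list[str], list[str]]:
    """Split tumors into high and low expression groups around the median.

    Returns (cutoff, high_ids, low_ids). Tumors whose value equals the
    median are assigned to neither group (assumption of this sketch).
    """
    if not expression:
        raise ValueError("expression must not be empty")
    cutoff = median(expression.values())
    high = sorted(t for t, v in expression.items() if v > cutoff)
    low = sorted(t for t, v in expression.items() if v < cutoff)
    return cutoff, high, low


# Invented iTRAQ ratios for one marker, for illustration only.
example_split = split_by_median({"T1": 0.5, "T2": 1.5, "T3": 2.0, "T4": 0.9})
```

The two resulting groups are the ones that would then be compared by Kaplan-Meier curves and the log-rank test.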
In the training set, for the no-relapse group, a high expression level of TrpRS was correlated with a significantly better DFS compared with a low expression (p = .0129) (Supplemental data S9). In the test set, TrpRS (p = .049) was validated. When we considered the expression of TrpRS with the OS rates, we showed that TrpRS was validated in the training (p = .098) and in the test set (p = .0136) (Supplemental data S10).
In the training set, for the relapse group, patients whose tumors had a high expression level of any of the three proteins experienced a significantly worse DFS compared with those with low expression (p < .0001, p = .0330, and p = .0016 for DP, TSP1, and IDH, respectively) (Supplemental data S11). In the test set, DP (p = .002), TSP1 (p < .0001), and IDH (p = .0040) were validated. Furthermore, tumors with high DP (p = .0209), TSP1 (p = .0364), and IDH (p = .0007) expression were also associated with lower OS rates in the training set (Supplemental data S12), while only DP (p = .0256) and TSP1 (p = .0018) were validated in the test cohort. By combining the expression levels of DP and TSP1, we showed that patients who have a high expression of both of these biomarkers in their tumors have a significantly worse DFS and OS compared with those with low expression of DP and TSP1 (Fig. 3). Finally, DP and TSP1 expression was a strong independent prognostic factor to predict risk of recurrence, and TrpRS expression was an independent prognostic factor to predict no recurrence. No other prognostic factors, such as age, tumor size, or nodal status of the patients, were statistically significantly associated with recurrence (Supplemental data S13).
Validation of Dysregulated Protein Expression-To confirm the dysregulation of the best biomarker candidates for the relapse and no-relapse groups, the expression of TrpRS, TSP1, and DP was analyzed by immunohistochemistry using paraffin-embedded tissues from the training and the test cohorts. Representative pictures of TrpRS, TSP1, and DP staining in no-relapse and relapse cases are shown in Fig. 4 for the test set. For the no-relapse group, intense cytoplasmic staining was shown for TrpRS, whereas we observed a significant (p < .0001) decrease in staining intensity in the relapse tumor group for this marker. Inversely, for TSP1 and DP, the no-relapse cases exhibited a moderate cytoplasmic staining, whereas the cytoplasmic staining increased significantly in the relapse cases (Fig. 4 and Supplemental data S14). In order to evaluate the performance of our protein signature by the IHC approach, we investigated the receiver operating characteristic curves that discriminate the no-relapse and relapse groups. Individual biomarkers and combinations of two or three biomarkers were investigated. The corresponding AUC was estimated in the training and in the test cohorts (Supplemental data S15). In the training set, each individual biomarker was able to discriminate between the two groups, but we obtained the best AUC for the association of TSP1-TrpRS (AUC: 0.92) and for the combination of the three biomarkers TSP1-TrpRS-DP (AUC: 0.93). In the test set, each biomarker was also able to discriminate between the two groups, but the best discrimination between the no-relapse and relapse groups was obtained with the association of the three biomarkers (AUC: 0.82) (Fig. 5). Finally, to confirm the robustness of the learning algorithm, accuracy was estimated using common methods based on resampling: cross-validation (leave-one-out and repeated 10-fold cross-validation) and bootstrap.
Interestingly, the results show stable accuracy estimates for the linear discriminant analysis classifier (Supplemental data S15).

DISCUSSION
Despite the many recent advances in breast tumor treatments through targeted therapies, no specific treatment exists for triple-negative breast tumors, and there are no prognostic molecular markers that would predict whether a tumor will behave aggressively or remain indolent. It is abundantly clear that tumor biology plays a significant role in resultant tumor behavior. Unfortunately, triple-negative primary breast tumors that are placed in the same prognostic category based on currently used parameters may behave differently. It is our hypothesis that the underlying biology of these tumors and differences in its detail will determine a particular tumor's potential for aggressiveness. In addition, we can use these biological differences to identify novel molecular markers that may be useful for diagnostic, prognostic, or predictive purposes, the success of which would pave the road for a new era of personalized medicine in breast cancer.
In this study, we performed quantitative proteomic profiling of 83 triple-negative breast tumors to identify biomarkers of good and bad prognosis. Although the sizes of the training set and the test set are limited, we propose the DP/TSP1 pair as a marker of recurrence and bad prognosis and TrpRS as a marker of good prognosis.
Desmoplakin (DP) is the principal plaque protein of desmosomes, the adhesion junctions found in various tissues. Desmosomes are intercellular junctions that provide strong adhesion among cells. These proteins are ubiquitously expressed in epithelia and play a critical role in the maintenance of epithelial tissue integrity. Recent studies suggest that desmosomes participate in the regulation of cell motility, growth, differentiation, and apoptosis (20-22). DP, as the founding member of the plakin family, is an obligate component of desmosomal plaques (23). Two isoforms of DP have been reported so far, DP I (322 kDa) and DP II (259 kDa), both encoded by the DSP gene on human chromosome 6p24.3. DP proteins interact with plakoglobin (γ-catenin), plakophilins, and intermediate filaments, providing the intimate link between desmosomal cadherins and the cytoskeleton (24,25), and belong to the Cytoskeleton remodeling_Keratin filaments and Gap junctions pathways found in the Metacore analysis. This is in agreement with the fact that this pathway is the top-ranked pathway characterizing the relapse group in our proteomic approach.
The second poor prognostic biomarker, thrombospondin-1 (TSP1), is an extracellular matrix glycoprotein (26) and was initially recognized as an antiangiogenic factor (27,28). More recently, TSP1 was shown to induce proangiogenic activity in breast cancer cells (29). Other experiments suggested that TSP1 accelerates invasion and metastasis in breast (30), pancreatic (31), thyroid (32), and prostate (33) cancers. A recent study on invasive ductal carcinoma of the breast reported that TSP1 is highly expressed in tumors associated with lymph node metastasis (34). In this context, TSP1 is not a specific marker of triple-negative breast tumors but rather a marker of aggressive tumors.
The other protein of interest, a good prognostic marker, is TrpRS, a tryptophanyl-tRNA synthetase. Several tRNA synthetases have been identified as secreted cytokines that control angiogenesis and immune responses and that may have roles in the tumor microenvironment (35). This protein is 653 amino acids in length and is involved in protein synthesis, regulation of RNA transcription and translation, and cytokine activities in inflammatory and angiogenic signaling pathways (36). The native enzyme lacks angiogenic activity. Proteolysis or alternative splicing of its N-terminal 47 amino acids generates a mature mini-TrpRS or T2-TrpRS (37,38), which possesses the angiostatic cytokine function (39). The production of mini-TrpRS is stimulated by IFNγ. The secretion of TrpRS is mediated by its dissociation from a ternary complex that is formed with annexin II and S100A10 in the cytosol (40). Interestingly, cytosolic S100A10 was observed to be decreased in abundance in this study. VE-cadherin, a calcium-dependent adhesion molecule, was identified as a receptor for mini-TrpRS (41) and is selectively expressed and concentrated at the intercellular junctions of endothelial cells.
TrpRS has been demonstrated to regulate ERK, Akt, and eNOS activation pathways that are associated with angiogenesis, cytoskeletal reorganization, and shear stress-responsive gene expression (40). Interestingly, a recent report indicated that low expression levels of TrpRS are related to an increased risk of disease recurrence and reduced survival of patients with colon cancer (42), although it is not known whether this finding is related to the angiostatic activity of TrpRS.
Future prospective clinical trials are needed to further consolidate the validity of this biomarker signature. Nevertheless, we propose the establishment of an IHC test to calculate the score of TrpRS, DP, and TSP1 and to evaluate the degree of aggressiveness of triple-negative tumors.