Novel approaches for the identification of biomarkers of aggressive prostate cancer

The ability to distinguish indolent from aggressive prostate tumors remains one of the greatest challenges in the management of this disease. Ongoing efforts to establish a panel of molecular signatures, comprising gene expression profiles, proteins, epigenetic patterns, or a combination of these alterations, are being propelled by rapid advancements in 'omics' technologies. The identification of such biomarkers in biological fluids is an especially attractive goal for clinical applications. Here, we summarize recent progress in the identification of candidate prognostic biomarkers of prostate cancer using biological fluid samples.

resulted in the detection of lower-risk prostate cancer at earlier and more treatable stages of disease [1], prostate cancer is still the most frequently diagnosed cancer in men in developed countries and remains the second most common cause of cancer-specific mortality [2]. Furthermore, with the advent of large-scale screening for serum PSA, unnecessary biopsies and patient over-treatment are becoming increasingly evident [3,4]. The US Preventive Services Task Force has recommended against PSA-based prostate cancer screening on the basis of high false-positive rates and the risks associated with biopsies and over-treatment [5]. Aggressive or advanced cancers can spread quickly and warrant intensive treatment, but up to 90% of men who have prostate cancer harbor localized disease [6] and many patients are over-treated on the basis of PSA screening [4,7]. The prospective demarcation of patients with indolent tumors from those with aggressive disease is therefore of paramount importance. The identification of biomarkers that can classify patients into high- and low-risk groups, before their cancers reach advanced or metastatic states, is a major area of ongoing research. A biomarker is a measurable biological indicator that can provide information about the presence or progression of a disease or the effects of a given treatment. A clinically useful biomarker should be safely obtainable from the patient by non-invasive means, have high sensitivity and specificity, high positive and negative predictive values, and facilitate clinical decisions that allow optimal care to be administered [8].
Proteomics and integrated genomics approaches have resulted in the identification of numerous putative prognostic biomarkers for prostate cancer. With the recent advances in mass spectrometry technologies especially, proteomes can now be analyzed with impressive coverage. Verification and validation platforms have also improved significantly; mass spectrometry-based assays with multiplexing capability can be established for the targeted quantification of specific peptides of interest. In this review, we begin by summarizing some of the efforts that have been made in various fields to identify prognostic biomarkers for prostate cancer. Following this, we introduce concepts for biomarker discovery in

Prostate cancer prognostic biomarkers
In broad terms, current and proposed alternative or adjunct prognostic markers for prostate cancer can be divided into clinical-pathological features and molecular factors (Table 1). In this section, we briefly summarize existing and recently proposed prognostic biomarkers for prostate cancer. These include Gleason grading, the classic pathological scoring system for biopsy specimens, and more recent discoveries, such as molecular features, that might offer insight into disease progression and prognosis.

Classic prognostic biomarkers
Currently, Gleason grading is considered to be the best predictor of outcome [9]. When using this method, pathologists assign numerical grades (ranging from 1 to 5, with 5 being the poorest grade) to the two most commonly observed histological patterns, based on the degree of loss of normal glandular tissue. These two grades are summed into a Gleason score. Patients with Gleason scores of 7 or higher are at increased risk of extraprostatic extension and recurrence after therapy [10,11]; furthermore, individuals with Gleason 4+3 tumors (those where pattern 4 is most prevalent but some amount of pattern 3 is also observed) may be at greater risk of prostate cancer-specific mortality than Gleason 3+4 patients (pattern 3 most prevalent but some pattern 4 is also observed) [12]. The multifocal nature of prostate cancer, whereby different genetic alterations may exist in different tumor foci of a prostate, however, increases the likelihood of missing a high-grade focus. Furthermore, the risks associated with biopsies, such as bleeding and increased risk of infections potentially leading to sepsis, underscore the need for alternative approaches for accurate prognosis [13]. The change in PSA levels (that is, PSA velocity) has also been used as a predictor of outcome after treatment; a PSA velocity of greater than 2 ng/ml/year is associated with a significantly higher risk of prostate cancer-specific mortality [14].
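As a minimal sketch of the arithmetic described above, the following Python snippet sums the two pattern grades into a Gleason score and estimates PSA velocity from serial measurements. The class and function names are hypothetical, and the least-squares slope is an assumption made for illustration; the text defines PSA velocity only as the change in PSA levels over time and does not prescribe a calculation method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GleasonGrade:
    """Two histological pattern grades (hypothetical helper, not a clinical tool)."""
    primary: int    # most prevalent pattern, 1 to 5
    secondary: int  # second most prevalent pattern, 1 to 5

    def __post_init__(self):
        for pattern in (self.primary, self.secondary):
            if not 1 <= pattern <= 5:
                raise ValueError("Gleason patterns range from 1 to 5")

    @property
    def score(self) -> int:
        # The Gleason score is the sum of the two pattern grades.
        return self.primary + self.secondary

    def label(self) -> str:
        # Keeps the order of the patterns, so 4+3 and 3+4 stay distinguishable.
        return f"{self.primary}+{self.secondary}={self.score}"


def psa_velocity(years, psa_ng_per_ml):
    """Least-squares slope of PSA (ng/ml) against time (years).

    Assumption: a linear fit over serial measurements; other definitions exist.
    """
    if len(years) != len(psa_ng_per_ml) or len(years) < 2:
        raise ValueError("need at least two paired measurements")
    mean_t = sum(years) / len(years)
    mean_p = sum(psa_ng_per_ml) / len(psa_ng_per_ml)
    num = sum((t - mean_t) * (p - mean_p) for t, p in zip(years, psa_ng_per_ml))
    den = sum((t - mean_t) ** 2 for t in years)
    if den == 0:
        raise ValueError("measurements must span more than one time point")
    return num / den


# Illustrative values only; these are not patient data.
grade = GleasonGrade(primary=4, secondary=3)
velocity = psa_velocity([0.0, 0.5, 1.0], [4.0, 5.5, 7.0])
print(grade.label(), "| PSA velocity:", velocity, "ng/ml/year")
print("Above the 2 ng/ml/year threshold:", velocity > 2.0)
```

Keeping the primary and secondary patterns separate, rather than storing only their sum, preserves the distinction between Gleason 4+3 and 3+4 tumors noted in the text.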

Cellular markers
Ki-67 is a nuclear protein that is associated with cellular proliferation [15]. Its immunohistochemical staining index has been correlated with outcome in treated patients [16-19]. Heterogeneous immunohistochemical staining for α-methylacyl-coenzyme A racemase (AMACR) has been correlated with Gleason score [20], and low AMACR gene expression in localized prostate cancer has been linked to recurrence and metastasis [21]. Prostate-specific membrane antigen (PSMA) is a transmembrane protein expressed in all types of prostatic tissue that is used in the diagnosis of prostate cancer [22]. Its overexpression is associated with higher tumor grade, stage, PSA recurrence and metastatic disease [23,24].

Genetic aberrations as prognostic biomarkers
Focusing on a specific pathway or a group of interrelated genes that are involved in fundamental tumor biology has also proven useful. Cuzick et al. [25] focused on genes involved in cell-cycle progression and measured the mRNA expression of 126 genes in formalin-fixed paraffin-embedded prostate cancer tissues. A 31-gene signature was generated on the basis of the correlation of each gene with the mean expression of the entire panel of 126 genes. When used to retrospectively score patients who underwent prostatectomy and patients with localized disease, this signature was shown to predict recurrence after surgery and risk of death in conservatively managed patients, independently of Gleason score and other clinical factors. Using comparative transcriptomic analyses, Ding et al. [26] identified the robust activation of the Tgfβ/Bmp-Smad4 signaling pathway in indolent Pten-null mouse prostate tumors. Deletion of Smad4 in the Pten-null mouse prostate led to highly proliferative, invasive, metastatic and lethal tumors. When combined with expression levels of the key molecular players cyclin D1 and osteopontin, a four-gene expression signature (for PTEN, SMAD4 and genes coding for cyclin D1 and osteopontin) could predict biochemical recurrence and supplement the Gleason score in predicting lethal metastasis of prostate cancer in patients.
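The selection step attributed to Cuzick et al. [25], in which genes are chosen by their correlation with the mean expression of the whole panel, can be illustrated with a small Python sketch. This is a sketch under stated assumptions rather than the authors' procedure: the function names, the toy gene labels and expression values, and the use of Pearson correlation with a simple top-k cut are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_by_panel_correlation(expression, top_k):
    """Rank genes by correlation with the per-sample mean of the whole panel.

    expression: dict mapping gene name -> list of expression values,
    one value per sample, in the same sample order for every gene.
    """
    n_samples = len(next(iter(expression.values())))
    panel_mean = [
        sum(values[i] for values in expression.values()) / len(expression)
        for i in range(n_samples)
    ]
    ranked = sorted(
        expression,
        key=lambda gene: pearson(expression[gene], panel_mean),
        reverse=True,
    )
    return ranked[:top_k]


# Toy panel of four hypothetical genes measured in four samples.
toy_panel = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [4.0, 3.0, 2.0, 1.0],
    "GENE_D": [2.0, 2.0, 2.0, 2.0],
}
print(select_by_panel_correlation(toy_panel, top_k=2))
```

In the study described above, the analogous step reduced a 126-gene panel to a 31-gene signature; the threshold or ranking rule actually used is not stated in this review.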
Genomic variations, such as copy number alterations, have also been linked to diseases including cancer. In a comprehensive genomic analysis of prostate cancer, Taylor and colleagues [27] analyzed copy number alterations in primary prostate tumors and found distinct patient clusters with varying degrees of relapse that had no association with Gleason score. Penney and colleagues [28] constructed a 157-gene signature based on the comparison of Gleason ≤6 and Gleason ≥8 patients. When applied to patients with Gleason 7 scores, their signature improved the prediction of lethality when compared to Gleason score alone.
DNA methylation patterns in prostate cancer may also provide insight into prostate cancer outcome. Cottrell et al. [29] performed a genome-wide scan in patients with early recurrence, high Gleason score or advanced stage; they identified 25 methylation markers that were significantly different between low- and high-Gleason-score patients. Furthermore, the methylation states of three markers (GPR7, ABHD9 and Chr3-EST) were significantly increased in patients whose tumors recurred, as measured by elevated post-prostatectomy PSA levels.

Circulating biomarkers
Urokinase plasminogen activator (uPA) and its inhibitor, PAI-1, have been associated with aggressive prostate cancer exhibiting extraprostatic extension and seminal vesicle invasion, and with post-prostatectomy recurrence in patients with aggressive disease [30]. Preoperative plasma levels of transforming growth factor beta 1 (TGF-β1) have been shown to be a predictor of biochemical recurrence [31] and, in conjunction with preoperative plasma levels of soluble interleukin 6 receptor (IL-6sR), have been associated with metastasis and progression [32].
Disseminated tumor cells in the bone marrow, a common site of prostate cancer metastasis, have been shown to have an association with metastatic disease and high Gleason score [33,34]. Although disseminated tumor cells may be a prognostic marker of unfavorable outcome in patients with localized disease at diagnosis, attention has shifted to tumor cells that have entered the peripheral blood, as these are more easily accessible. The number of circulating tumor cells can be determined at the time of diagnosis, and elevated numbers, as indicated by reverse transcriptase polymerase chain reaction for PSA, have been associated with advanced stage and increased Gleason score [35]. Goodman et al. [36] determined that, prior to treatment, a count at or above a cut-off of 4 circulating tumor cells per 7.5 ml of blood was negatively correlated with survival and could predict metastasis.

MicroRNAs
MicroRNAs (miRNAs) are a class of small, non-coding RNA molecules that are involved in the negative regulation of gene expression. Porkka and colleagues [37] demonstrated distinct miRNA expression profiles of benign prostate hyperplasia, untreated prostate cancers, and hormone-refractory prostate cancers, suggesting a potential prognostic role for miRNAs. Mitchell et al. [38] demonstrated that tumor-derived miRNAs are present in plasma and showed that miR-141 was significantly elevated in the sera of prostate cancer patients, demonstrating the utility of miRNAs as blood-based cancer biomarkers. Khan et al. [39] analyzed localized prostate tumor and adjacent normal tissues, as well as samples from advanced cases, using isobaric tags for relative and absolute quantification (iTRAQ) followed by mass spectrometry. Integrating their findings with a cancer microarray database, these authors identified differentially expressed proteins that are targets of miR-128, a finding that was further supported by in vitro experiments demonstrating a role for miR-128 in prostate cancer invasion [39].

Table 1 (excerpt)
Marker | Source | Reference(s)
Clinical or pathological characteristics
High-risk prostate cancer defined as: stage T2c or higher and PSA >20 ng/ml, or ... | Tissue biopsy, serum | D'Amico et al. [11]
PSA velocity | Serum | D'Amico et al. [14]
Circulating ...

Emerging 'omics' approaches
Alternative strategies for the identification of disease biomarkers include metabolomics and lipidomics. Sreekumar and colleagues [40] undertook a global metabolomic profiling study to look for alterations that are associated with prostate cancer progression using mass spectrometry. Over 1,000 metabolites were identified in over 250 prostate cancer samples (of urine, plasma, and tissue). Sarcosine, an N-methyl derivative of glycine, was found to be elevated in patients with metastatic disease when compared to those with organ-confined tumors and was shown to be involved in prostate cancer invasion. Using a lipidomics approach, Zhou et al. [41] profiled 390 lipid species in plasma from patients with prostate cancer and healthy controls. Of the 390 species, 35 were found to be significantly differentially expressed, and 12 of these were proposed as individual markers of prostate cancer based on a sensitivity above 80% and specificity above 50%.
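The selection criterion reported for the lipid markers (sensitivity above 80% and specificity above 50%) can be expressed as a short filter. The Python sketch below is illustrative only: the function names, marker labels and confusion-matrix counts are hypothetical and are not taken from Zhou et al. [41].

```python
def sensitivity(true_pos, false_neg):
    """Fraction of patients with disease who test positive."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no diseased cases to evaluate")
    return true_pos / total


def specificity(true_neg, false_pos):
    """Fraction of healthy controls who test negative."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no control cases to evaluate")
    return true_neg / total


def passes_thresholds(counts, min_sensitivity=0.80, min_specificity=0.50):
    """Apply the two cut-offs quoted in the text to one candidate marker."""
    sens = sensitivity(counts["tp"], counts["fn"])
    spec = specificity(counts["tn"], counts["fp"])
    return sens > min_sensitivity and spec > min_specificity


# Hypothetical counts for two made-up lipid species (50 patients, 50 controls).
candidates = {
    "LIPID_X": {"tp": 45, "fn": 5, "tn": 30, "fp": 20},   # sens 0.90, spec 0.60
    "LIPID_Y": {"tp": 35, "fn": 15, "tn": 40, "fp": 10},  # sens 0.70, spec 0.80
}
selected = [name for name, c in candidates.items() if passes_thresholds(c)]
print(selected)
```

A marker can fail this filter despite high specificity, as LIPID_Y does here, because the criterion weights sensitivity more heavily.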

Prostate-related proximal tissue fluids
In the context of protein-based analysis platforms, the potential of serum or plasma as a source of biomarkers is hampered by its immense complexity [42] (Figure 1). The human plasma proteome, for instance, has a dynamic range of protein concentrations in the order of 10^10 for many known proteins [43]; low-abundance species are thus overlooked by currently available technologies (that is, mass spectrometers can detect proteins over a maximum of five orders of magnitude). Tissue-proximal fluids are located in close proximity to the tissue of interest and have been proposed as rich sources for biomarker discovery [44]. They house secreted proteins and sloughed cells that provide a potentially comprehensive assessment of the organ and the extent of disease. These fluids include urine, seminal fluid, semen, and expressed prostatic secretions (EPS). EPS exist either as direct-EPS, which are collected from the prostate prior to radical prostatectomy, or as EPS-urine, which is expelled into voided urine after digital rectal examination (DRE). The prostatic urethra carries urine through the prostate and hence may represent a useful source of prostate cancer biomarkers. One major advantage of urine over serum or plasma, with regard to protein biomarker detection, is that its contents remain relatively stable and do not undergo massive proteolytic degradation [45]. Nevertheless, the volume collected may result in varying protein concentrations, highlighting the need for standardized collection protocols.

Biomarkers in urine
Prostate cancer antigen 3 (PCA3) is a prostate-specific non-coding RNA that was first identified in a comparative transcriptomics study looking at tumor and adjacent normal tissues [46]. Subsequently, an RT-PCR-based test was developed to detect PCA3 in urinary EPS [47]. The ratio of PCA3 RNA to PSA RNA, known as the PCA3 score, is used, in combination with other clinical information, to guide decisions on repeat biopsy in men who are 50 years of age or older and who have previously had at least one negative prostate biopsy. Interestingly, Nakanishi et al. [48] reported the mean PCA3 score to be significantly lower in patients with low-volume and low-grade prostate tumors than in those with advanced tumors. The ability of the PCA3 test to predict aggressive prostate cancers is, however, under debate [48-50].
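As a worked illustration of the ratio described above, the following Python sketch computes a PCA3 score from two RNA measurements. The function name and input values are hypothetical. The multiplication by 1,000 reflects how the score is commonly reported for urinary assays and is an assumption here, because the text specifies only a PCA3:PSA RNA ratio; no clinical cut-off is implied.

```python
def pca3_score(pca3_rna_copies, psa_rna_copies, scale=1000.0):
    """Ratio of PCA3 RNA to PSA RNA, multiplied by an assumed reporting scale.

    Normalizing to PSA RNA accounts for how much prostate-derived
    material is present in the urine specimen.
    """
    if psa_rna_copies <= 0:
        raise ValueError("PSA RNA must be detectable to normalize the PCA3 signal")
    if pca3_rna_copies < 0:
        raise ValueError("RNA copy numbers cannot be negative")
    return scale * pca3_rna_copies / psa_rna_copies


# Illustrative copy numbers only; these are not measured values.
print(pca3_score(pca3_rna_copies=350.0, psa_rna_copies=10000.0))
```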
Tomlins et al. [51] first reported the occurrence of a recurrent TMPRSS2:ERG fusion transcript (transmembrane protease serine 2 gene fusion with E twenty-six (ETS) transcription factors) in patients with prostate tumors. These fusions were detectable in 42% of urinary EPS samples from men with prostate cancer [52], although their presence in urinary sediment was not correlated with biopsy Gleason scores [53]. Telomerase is a ribonucleoprotein involved in telomere synthesis and repair [54]. Its activity, which can be measured in urinary EPS using the telomeric repeat amplification protocol assay [55,56], was found to be increased in prostate cancer and has been shown to be associated with Gleason score [55]. Urinary annexin A3 and various matrix metalloproteinases have also been shown to have diagnostic and/or prognostic potential in prostate cancer [57-60].
Approximately 3% of the total urinary protein content is composed of exosomal proteins [61], which thus represent a sub-fraction for the discovery of prostate cancer biomarkers [62,63]. Exosomes are small vesicles (40 to 100 nm) containing protein, RNA and lipids that are secreted by various normal and tumor cells [63,64]. Wang et al. [65] used shotgun proteomics to generate the largest catalogue of urinary exosome proteins to date. In their study, over 3,000 unique proteins were identified from samples derived from nine healthy individuals. Exosome secretion is elevated in the biofluids of cancer patients, including those with prostate cancer [66], and exosomes have been shown to be enriched in tumor-cell-specific transcripts [67,68]. miRNA and mRNA can be transferred between cells via exosomes and have been shown to be functional in their new location [69]. Nilsson et al. [63] showed, in a proof-of-concept study, that urinary exosomes derived from prostate cancer patients contained two known biomarkers (PCA3 and TMPRSS2:ERG) and thus could be used as sources of biomarkers for disease.

Proteomics in prostate cancer biomarker discovery
Proteomics approaches allow for high-throughput analyses of complex biological samples, leading to the identification of biomarker candidates (Table 2). A typical cancer biomarker discovery workflow consists of a discovery phase, during which a comprehensive comparative catalogue of candidate proteins is generated. This is followed by verification of candidates using targeted methods of quantification, and finally, validation and clinical assay development [42].

Protein biomarker discovery in prostate-proximal fluids
Using mass spectrometry, Li et al. [70] identified 114 proteins in the direct-EPS from patients with low- and high-grade prostate cancers, benign prostate hyperplasia and one healthy individual. In a subsequent study, Drake and colleagues [71] used multidimensional protein identification technology [72,73] to analyze direct-EPS from nine prostate cancer patients (Gleason 6 and 7 cancers). Over 900 proteins were identified by Drake et al., 94 of which were also identified in the study of Li and colleagues [70]. Zhao and colleagues [74] used stable-isotope-labeled secretome standards, a technique in which prostate cancer cells (PC3 cell line) were grown in media labeled with heavy stable isotopes and the labeled secreted proteins subsequently used as a standard across 11 direct-EPS samples to identify and quantify 86 proteins simultaneously. Principe et al. [75] performed a comparative study of urine obtained from individuals with or without cancer before and after prostatic massage. A total of 1,022 proteins were identified, of which 49 were found to be prostate-enriched. Furthermore, proteomic analyses of urine by Adachi et al. [76] catalogued over 1,500 proteins in urine from 10 healthy individuals. Seminal fluid may also represent a source of proteins that may be informative about prostate cancer outcome, and thus should be explored for this purpose [77,78]. These examples provide an important resource for future biomarker discovery efforts in these important classes of prostate-proximal fluids.

Targeted proteomics
The validation of candidate protein biomarkers, which includes the task of selectively and reliably quantifying disease-related alterations in protein concentrations, remains a major bottleneck. Traditional workflows utilize antibodies for the targeted quantification of such candidates, but caveats associated with antibody development and validation significantly reduce the feasibility of relying on these types of assays for high-throughput biomarker validation. Selected reaction monitoring mass spectrometry (SRM-MS) can be used to develop highly quantitative assays that can complement the more traditional approaches. Although this method is reliably used for quantifying small molecules [79], it has recently been adopted as a robust, sensitive, reproducible and specific assay for protein quantification [80-82]. Several studies have developed SRM-MS for validation of cancer biomarkers, such as biomarkers of bladder cancer in urine [83], biomarkers of ovarian cancer in ascites and serum [84], human lung cancer xenograft lysates in mice [85], and biomarkers of prostate cancer in serum [86]. Quantification by SRM-MS can be achieved by spiking the sample with a known concentration of a stable heavy-isotope-labeled peptide standard, which has the same biophysical properties as the endogenous peptide but a difference in mass that is resolved by mass spectrometry. By comparing the peak areas of the endogenous and heavy peptides, the concentration of the endogenous peptide can be inferred. Highly purified and accurately quantified heavy peptides (AQUA™ Peptides, Thermo Scientific) can be used for the absolute quantification of endogenous peptides. These peptides are costly, however, so absolute quantification is reserved for the most promising biomarker candidates.
Unlike antibody-based combinatorial detection systems, SRM-MS-based quantification approaches have the advantage of being easily multiplexable, and thus have great potential for success.
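The stable isotope dilution calculation described above reduces to a peak-area ratio multiplied by the known concentration of the spiked heavy standard. The Python sketch below illustrates this for a small multiplexed panel. It assumes that the endogenous (light) and heavy peptides give an equal signal response; the function names, peptide labels, peak areas and concentration units are hypothetical.

```python
def endogenous_concentration(light_area, heavy_area, heavy_spike_conc):
    """Infer the endogenous peptide concentration from SRM peak areas.

    light_area: integrated peak area of the endogenous (light) peptide
    heavy_area: integrated peak area of the heavy-isotope-labeled standard
    heavy_spike_conc: known concentration of the spiked heavy standard
    Assumption: light and heavy peptides respond identically in the instrument.
    """
    if heavy_area <= 0:
        raise ValueError("heavy standard was not detected; cannot quantify")
    if light_area < 0 or heavy_spike_conc < 0:
        raise ValueError("peak areas and concentrations cannot be negative")
    return heavy_spike_conc * light_area / heavy_area


def quantify_panel(measurements):
    """Quantify several peptides from one run (the multiplexing case)."""
    return {
        peptide: endogenous_concentration(light, heavy, spike)
        for peptide, (light, heavy, spike) in measurements.items()
    }


# Hypothetical peptides: (light area, heavy area, spiked concentration in fmol/ul).
run = {
    "PEPTIDE_1": (2.0e6, 1.0e6, 50.0),
    "PEPTIDE_2": (5.0e5, 1.0e6, 50.0),
}
for peptide, conc in quantify_panel(run).items():
    print(peptide, conc, "fmol/ul")
```

Because each peptide is paired with its own heavy standard, adding candidates to the panel only adds entries to the measurement table, which is one way to see why the approach multiplexes easily.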
Hüttenhain et al. [87] developed a high-throughput workflow for the quantification of cancer-associated proteins in human urine and plasma. Their study, which utilized SRM-MS, tracked 408 urinary proteins. Interestingly, 169 of these were previously undetected in the datasets from the Human Protein Atlas and in the urinary proteome dataset from Adachi et al. [76]. Furthermore, using SRM-MS assays of plasma from patients with ovarian cancer and benign ovarian tumors, Hüttenhain et al. [87] were able to demonstrate the reproducible differential expression of a number of candidates. In another study, Cima and colleagues [86] focused their analyses on the glycoproteome of Pten-null mouse serum and prostate. Label-free comparative analysis of the Pten-null animals and age-matched wildtype mice revealed 111 candidates from the prostate tissue and 12 candidates from the sera that were significantly differentially expressed. Next, these authors utilized SRM-MS assays to reliably quantify the 39 protein orthologs (selected on the basis of consistent quantification) in the sera of prostate cancer patients and controls, and used the resulting profiles to build predictive regression models for the diagnosis and grading of prostate cancer. Our group has also aimed to develop a proteomics-based platform for the discovery and subsequent verification of prostate cancer-related proteins [71,75,88]. Specifically focusing our attention on prostate-proximal fluids, we have recently identified over 100 protein candidates that are differentially expressed when patients with organ-confined and extraprostatic tumors are compared [88]. A small number of these candidates were also found to be expressed differentially in urinary EPS from patients with recurrent disease (identified on the basis of elevated post-prostatectomy PSA levels) when assayed by stable isotope dilution-SRM-MS. Future studies will be dedicated to the verification of all differentially expressed candidates, using SRM-MS in a medium-sized cohort of urinary EPS samples from clinically stratified prostate cancer patients, in order to demonstrate the application of SRM-MS as a useful verification tool for protein biomarker candidates in these fluids.

Table 2 (excerpt)
Marker | Samples | Performance | Reference
... | ... | ... | Alaiya et al. [104]
Panel of 6 candidates | Tissue: localized (n = 7) compared with lymph node metastases (n = 1) | - | Pang et al. [105]
Lamin A | Tissue: low (n = 23) compared with high (n = 23) Gleason | ROC AUC = 0.88 | Skvortsov et al. [106]
Transthyretin | Serum: n = 4 | - | Wang et al. [107]
Clusterin | Serum: n = 4 | - | Wang et al. [107]
Shotgun proteomics
Panel of 5 markers | Tissue (n = unknown) | - | Drake et al. [108]
Panel of 3 markers | Pooled serum: metastases (n = 5) compared with progressing (n = 5) | - | Rehman et al. [109]
Panel of 7 candidates | Pooled serum: metastases (n = 5) compared with progressing (n = 5) | - | Rehman et al. [109]
Panel of 5 markers | Serum: Gleason <7 (n = 27) compared with ≤7 (n = 27) | ROC AUC = 0.79 | Cima et al. [86]
Acetyl-coenzyme A acetyltransferase | Cell lines: androgen-dependent compared with androgen-independent | - | Saraon et al. [110]
Targeted proteomics
Alpha-methylacyl-CoA racemase | ... | ... | ...

Recently, sequential window acquisition of all theoretical fragment-ion spectra mass spectrometry (SWATH-MS) has come to the forefront of new developments in mass spectrometry. Relying on data-independent acquisition, and originally described by the Yates group [89], this approach records the fragment ion spectra of all analytes in a sample that fall within a predetermined m/z range and retention-time window [89-91]. This approach allows confident identification of peptides over a dynamic range of four orders of magnitude and detects precursor ions that have not been selected in the MS scan by data-dependent acquisition [90]. Although the sensitivity of SWATH-MS coupled with targeted data analysis is slightly lower than that of SRM-MS, its quantification accuracy rivals that of SRM-MS [90,91], and thus this method could prove to be a powerful platform for biomarker discovery and verification. Advances in mass spectrometry have also led to higher-resolution instruments that can allow for the systematic removal of interferences [92-94], allowing improved targeted analyses in complex backgrounds. This can be achieved by mass spectrometry in single ion monitoring (SIM) mode coupled with tandem mass spectrometry (MS/MS), which allows quantification at the MS/MS level. Gallien et al. [94] comparatively assessed the performance of SIM-MS and SRM-MS in analyzing urine and noted similar sensitivities, although the SIM-MS analysis was able to quantify a larger number of peptides at the lowest concentrations of spiked-in standards.
Biological fluids are highly complex and efforts in pursuit of complete proteome coverage are underway. Functionalized nanoparticles with high-affinity baits can be used to capture desired classes of proteins, including low-abundance proteins [95-97]. Alternatively, focusing analyses on specific sub-proteomes by exploiting post-translational modifications can also selectively enrich for desired classes of proteins. One such modification that is commonly used in biomarker discovery efforts is N-linked glycosylation, which is particularly abundant in secreted and membrane proteins [98]. N-linked glycosylated proteins are captured by a solid support via hydrazide chemistry and then enzymatically released by peptide N-glycosidase F [99,100] (alternatively, various lectin-affinity approaches can be used). In addition, peptide antibody-based techniques, such as stable isotope standard capture with anti-peptide antibodies (SISCAPA®) [101,102], can be coupled to SRM-MS to enrich and quantify target peptides selectively.

Into the clinic
According to the Early Detection Research Network [103], a biomarker should undergo five major phases of development before it can be confidently utilized in clinical settings for the benefit of the population. These phases are: i) preclinical exploratory studies, in which tumor- and/or aggressive-disease-associated samples are compared with non-tumor or indolent-disease specimens in order to identify molecular characteristics that distinguish the two cohorts and can be further explored; ii) clinical assay development and validation, during which an assay that can accurately measure the biomarker and reliably segregate tumor from non-tumor specimens is developed; iii) retrospective longitudinal studies, in which specimens from individuals who were monitored over time for the development or progression of disease (such as patients who progress from indolent to aggressive prostate cancer) are compared with specimens from individuals who do not develop disease or do not progress; iv) prospective screening studies, which are performed using the assay in order to evaluate the extent of disease at the time of detection; and v) randomized control studies, which are performed in order to determine the reduction of disease burden in the population as a result of performing the assay.
Emerging technologies that not only provide an in-depth look into the complex biology of tumors but also allow timely verification and validation will undoubtedly accelerate the progress of molecular markers through the biomarker development pipeline. We and others have shown that such technologies are applicable to a variety of sample types, including bio-fluids, and can enable rapid verification of exhaustive lists of candidate biomarkers.

Conclusions
The long path from biomarker discovery to validation and clinical use has resulted in exhaustive lists of biomarker candidates, but relatively few are currently used in patient management. The consensus within the field is that candidate biomarkers need to be verified rapidly using large, well-annotated sample cohorts, standardized assays and multi-institutional validations. Rapidly improving targeted proteomics approaches could lay the foundations for such validation platforms in the near future. The use of proximal tissue fluids (such as EPS-urine) in combination with specific enrichment protocols (such as those for exosomes and glycoproteins) is an especially exciting strategy that will need to be systematically evaluated. In the context of exosomes, additional cancer-specific biomolecular cargo, such as tumor-derived miRNAs and mRNAs and possibly tumor DNA, could complement these studies and provide powerful multidimensional biomarker panels for the accurate detection of aggressive prostate cancers (see Figure 1 for a summary of the various biomarker pipelines).

Competing interests
The authors have no competing interests.