Analytical Validation of a Cell Cycle Progression Signature Used as a Prognostic Marker in Prostate Cancer

Background: Prostate cancer is the most common cancer in men in the developed world. Appropriate clinical care requires accurate prognostic information to determine whether definitive treatment or conservative management is most appropriate for a given patient. We previously demonstrated that a gene expression signature, which measures the RNA expression of 31 cell cycle progression (CCP) genes and generates a CCP score, is a robust predictor of patient outcome in cohorts of conservatively managed patients diagnosed by needle biopsy or transurethral resection of the prostate. Methods: The current studies represent the analytical validation of this gene signature, for the testing of either formalin-fixed paraffin-embedded (FFPE) prostate resection tissue (radical prostatectomy, RP) or FFPE prostate needle biopsy samples. Results: The measured standard deviation (SD) of the signature was determined to be 0.1 score units, representing 1.6% of the range of scores observed within previous clinical validation studies. Individual amplicons for all genes within the signature had a SD <1 CT, with a median SD of 0.52 CT. The median amplification efficiency for all genes was 92.6%. The signature was linear over a ~260-fold range of RNA concentrations. When RNA was extracted from 7,525 recent prostate samples, 100% of RP samples and 99.8% of needle biopsy samples produced sufficient RNA for testing. Finally, RNA samples reproduced similar CCP scores when stored for up to 8 weeks. Conclusion: These studies indicate that this prognostic gene signature is robust and reproducible, and is analytically validated for use on FFPE prostate biopsy and radical prostatectomy samples.
Journal of Molecular Biomarkers & Diagnosis


Introduction
Prostate cancer is the most prevalent cancer among men in the developed world, with ~30,000 annual deaths in the US [1]. While PSA screening has significantly decreased disease-specific mortality [2], prostate cancer progression is highly variable. As PSA screening is not capable of discriminating between low and high-risk cancers [3,4], this type of screening results in the over-detection of indolent cancers that do not pose a significant risk of mortality [5]. While clinical and pathologic features, such as Gleason score, clinical stage, and baseline PSA levels, are currently utilized to distinguish aggressive and indolent prostate cancers, these features have been shown to have limited accuracy [6][7][8].
As a result of these clinical limitations, a majority of men with prostate cancer will receive treatment that may include surgery and/or additional therapies, despite the fact that only 15% to 30% of prostate cancers will exhibit oncologic progression [9][10][11]. The risk of treatment-related complications and morbidity for many of these patients may outweigh their prostate cancer risk [5][6][7][8][9][10][11]. The lack of adequate screening technologies also results in the under-treatment of men with more aggressive cancers. Ultimately, appropriate clinical care requires accurate prognostic information to determine if removal of the prostate or additional therapies after prostate removal might improve patient outcomes.
We recently developed a gene expression signature that can assess the risk of death from prostate cancer by measuring the RNA expression of 31 cell cycle progression (CCP) genes, which have increased expression in aggressive tumors. The expression of the CCP genes is normalized by the expression of 15 housekeeper genes to generate a CCP score, as previously described [12]. Previous clinical validations have demonstrated that the CCP score is a robust predictor of prostate cancer outcomes, including disease-specific survival, in conservatively managed cohorts.
RNA expression assays require stringent analytical validation of the components of the assay, as well as the tissue being tested. The work presented here represents the analytical validation of this gene signature, for the testing of either formalin-fixed paraffin-embedded (FFPE) prostate resection tissue (radical prostatectomy, RP) or FFPE prostate needle biopsy samples. The aim of these studies is to demonstrate that the CCP score is a robust and reproducible molecular diagnostic tool that is appropriate for clinical use.

RNA extraction and CCP score calculation:
This CCP signature has previously been clinically validated on FFPE prostate resection tissue (radical prostatectomy, RP) or FFPE prostate needle biopsy samples [13][14][15][16]. The analytical validation studies presented here were performed using commercial samples of FFPE prostate biopsy or RP tissue (Avaden Biosciences, Inc., Scarsdale, NY), or residual RNA from samples submitted for clinical testing. Upon completion of clinical testing, all samples used in these studies were anonymized. All of these studies were performed within a CLIA-certified laboratory under established protocols.
Figure 1: Precision of individual genes within the signature. The SD of each target (black) and housekeeper (gray) gene is graphed and the error bars represent the 95% confidence intervals. The SD of the target CCP genes is on the ∆CT scale, while the SD of the housekeeper genes is on the CT scale.
RNA expression was analyzed as previously reported [18]. In brief, every sample required an H&E stained slide and 1-5 sections of unstained tissue, containing a total of 3-25 µm of tissue. An anatomic pathologist identified and circled tumor-enriched areas on the H&E slide, which were then macro-dissected from the unstained tissue. RNA was extracted from the unstained tissue using the RNeasy FFPE kit (QIAGEN, Valencia, CA) on a QIACube instrument (QIAGEN). Isolated RNA was treated with DNase I (Sigma-Aldrich, St. Louis, MO), and quantified using a Nanodrop spectrophotometer (Thermo Scientific, Waltham, MA).
To measure gene expression, 25-500 ng of RNA was used to synthesize cDNA with the High Capacity cDNA Reverse Transcription Kit (Life Technologies, Carlsbad, CA). Unless otherwise noted, samples with <25 ng of total RNA were not tested. All CCP and housekeeper genes were pre-amplified for 14 cycles in a single multiplex reaction, using the TaqMan PreAmp Master Mix and the associated TaqMan expression assays for each gene (Life Technologies). Gene expression was then measured on a custom TaqMan Low Density Array, on an Applied Biosystems 7900HT machine. All samples were run in triplicate, with samples being split into the triplicate measurements directly after cDNA synthesis. The expression of each gene was recorded as the CT (crossing threshold) at a pre-specified threshold. The CCP score was calculated as previously reported [12].

Precision Estimation
The precision of the CCP score was assessed in a set of 6 FFPE biopsy and 12 FFPE RP samples, collected by a commercial supplier according to IRB-approved guidelines (Avaden Biosciences Inc). The RP samples had sufficient tissue for 3 replicates, while the biopsy samples had sufficient tissue for 4 or 6 replicates. Samples were required to have mean expression of housekeeper genes ≤24 CT, in order to match the average expression of clinical samples. Potential biological variation between the different unstained tissue sections was minimized by combining a set of interleaving unstained tissue slides for each replicate. The precision for the overall CCP score was defined as the standard deviation captured in the residual variation term from the linear mixed model $Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, where $Y_{ij}$ was the CCP score, $\mu$ was the overall mean effect, $\alpha_i$ was the random effect from the $i$th sample ($i = 1, 2, \ldots, 23$), and $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$ was the residual error from the $j$th run of the $i$th sample ($j = 1, 2, \ldots, 6$), which was assumed to be independent of $\alpha_i$. Individual gene SDs were determined within the triplicate measurements.
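The residual term of this model can be illustrated with a minimal sketch. It uses the pooled within-sample (ANOVA-style) estimator of the residual SD, which is a simplification and not necessarily the fitting procedure used in the study; the function name and the replicate scores are hypothetical.

```python
from math import sqrt

def residual_sd(scores_by_sample):
    """Pooled within-sample SD of replicate CCP scores.

    scores_by_sample maps a sample ID to its list of replicate scores.
    This is the ANOVA-style estimate of the residual SD in the model
    Y_ij = mu + alpha_i + eps_ij (an illustrative simplification).
    """
    ss_within = 0.0  # squared deviations from each sample's own mean
    df_within = 0    # total replicates minus number of contributing samples
    for scores in scores_by_sample.values():
        if len(scores) < 2:
            continue  # a single replicate carries no within-sample information
        mean = sum(scores) / len(scores)
        ss_within += sum((y - mean) ** 2 for y in scores)
        df_within += len(scores) - 1
    if df_within == 0:
        raise ValueError("need at least one sample with two or more replicates")
    return sqrt(ss_within / df_within)

# Hypothetical replicate scores, not data from the study
example = {
    "RP-01": [0.9, 1.0, 1.1],
    "BX-01": [-0.6, -0.5, -0.5, -0.4],
}
print(round(residual_sd(example), 3))
```

Because each sample is centered on its own mean, differences between samples (the random effect) do not inflate the estimate; only replicate-to-replicate variation remains.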

Assessment of linear range for RNA concentration and cDNA amplification efficiency
The linear range for each gene was tested on three "samples", each of which was the aggregation of anonymized RNA from both biopsy and RP clinical samples with known CCP scores. RNA for samples with similar scores was first combined, then concentrated using a Savant SpeedVac (Thermo Scientific). Serial 2-fold dilutions were prepared for each sample, with RNA concentrations ranging from 125 to 0.06 ng/µL. Samples were processed using a fixed volume of input RNA after dilutions were made.
The linear range was defined as the minimum and maximum RNA concentrations for which the CCP scores did not vary more than 1 SD from the mean and the $R^2$ value between the CT and the log concentration was greater than or equal to 0.93 for all the genes. Amplification efficiencies of the resulting cDNA were calculated for sample B (Figure 1) using the formula $\text{Efficiency} = 100\,(2^{-1/\text{slope}} - 1)$, where the slope is estimated from the regression of CT measurements versus log base 2 RNA concentrations, over the previously determined linear range.
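As a sketch of this calculation, the following fits CT against log base 2 concentration by ordinary least squares and converts the slope to an efficiency. The helper names and the dilution data are hypothetical; the data describe an idealized assay in which each 2-fold dilution adds exactly one cycle.

```python
from math import log2

def fit_line(x, y):
    """Ordinary least-squares slope, intercept and R^2 for y versus x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

def amplification_efficiency(concentrations, ct_values):
    """Efficiency (%) = 100 * (2 ** (-1 / slope) - 1), slope from CT vs log2(conc)."""
    log_conc = [log2(c) for c in concentrations]
    slope, _, r_squared = fit_line(log_conc, ct_values)
    efficiency = 100.0 * (2.0 ** (-1.0 / slope) - 1.0)
    return efficiency, r_squared

# Hypothetical ideal assay: CT rises by one cycle per 2-fold dilution
conc = [62.5, 31.25, 15.625, 7.8125]  # ng/uL
ct = [20.0, 21.0, 22.0, 23.0]
eff, r2 = amplification_efficiency(conc, ct)
print(round(eff, 1), round(r2, 3))
```

A slope of -1 corresponds to perfect doubling per cycle (100% efficiency); a steeper slope, such as -1.06, gives an efficiency below 100%.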

Stability of extracted RNA
RNA was extracted from five biopsy and six RP samples and aliquoted after quantification. One aliquot for each sample was initially tested at time zero. All remaining RNA aliquots were stored at -20°C, and an aliquot of each sample was tested every two weeks, over an eight week period.

Clinical laboratory control measures
A variety of control processes have been implemented to ensure the accuracy and precision of clinical samples run within our CLIA-certified laboratory, which are codified within standard operating procedures (SOPs). In brief, a batch of six clinical samples will contain both a positive control (RNA with a known CCP score) and a no-template negative control. In order for clinical samples in a batch to be reported, both the positive and negative control within the batch must perform within predetermined specifications. New lots of reagents are quality control tested against previous reagent lots, prior to use. Reagents are aliquoted and stored until use, according to manufacturer's guidelines. All instruments within the laboratory have a preventative maintenance schedule; new or serviced instruments are qualified prior to use, according to current operational qualification SOPs. Laboratory technicians undergo biannual proficiency testing.

Reproducibility and precision of the CCP gene expression signature
The overall precision of the signature was determined by testing 18 samples, with 3-6 biological replicates for each sample (depending on the amount of tissue available). Interleaving unstained tissue sections were used when testing biological replicates of each sample, to minimize potential biological variation between the different unstained tissue sections. This precision measurement includes variation within all steps of the process, from tissue macro-dissection through the quantitative measurement of gene expression.
The overall standard deviation (SD) of the signature was determined to be 0.1 CCP score units (95% CI, 0.08-0.13) between replicate measurements. We observed CCP scores from -2.0 through 4.1 during recent clinical validation studies [13][14][15][16]; this precision represents 1.6% of this range of observed scores, indicating the gene expression signature is reproducible and precise.
During the calculation of the CCP score, the expression of 31 CCP genes is normalized by the expression of 15 housekeeper genes. We also determined the precision of the individual TaqMan amplicons for all 46 genes within the signature, by calculating the SD for each amplicon (Table 1 and Figure 1). We observed that all amplicons had a SD ≤1 CT, with a median SD of 0.325 CT units. More specifically, we observed that the housekeeper genes were more precise than the CCP genes, with a median SD of 0.43 ∆CT and 0.26 CT units for CCP and housekeeper genes, respectively. This result was anticipated, as the housekeeper genes generally have higher expression than the CCP genes.

Stability of stored RNA
To assess the reproducibility and stability of RNA stored at -20°C, we tested the reproducibility of CCP scores of 11 samples (5 biopsy and 6 RP) over an 8 week timeframe, testing each sample every 2 weeks. We observed that the CCP scores were reproducible across all time points, with no trend in the scores of any of the individual samples (Figure 2). This indicates that the CCP score is not biased toward change as the RNA is stored over this time period. Furthermore, we observed that all samples had a SD equal to or less than 0.1 CCP score units, which is similar to the overall precision of the signature.
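One way such a stability series could be summarized for a single sample is sketched below: the SD across time points and the least-squares slope of score against storage time. The text does not state how the absence of a trend was judged, so the slope is an illustrative choice, and the scores are hypothetical.

```python
from statistics import stdev

def trend_per_week(weeks, scores):
    """Least-squares slope of CCP score against storage time (score units per week)."""
    n = len(weeks)
    mean_w = sum(weeks) / n
    mean_s = sum(scores) / n
    sxx = sum((w - mean_w) ** 2 for w in weeks)
    sxy = sum((w - mean_w) * (s - mean_s) for w, s in zip(weeks, scores))
    return sxy / sxx

# Hypothetical scores for one sample tested at 0, 2, 4, 6 and 8 weeks
weeks = [0, 2, 4, 6, 8]
scores = [0.52, 0.48, 0.55, 0.47, 0.50]

sample_sd = stdev(scores)              # compare against the 0.1 score-unit precision
slope = trend_per_week(weeks, scores)  # near zero when there is no drift
print(round(sample_sd, 3), round(slope, 4))
```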

Yields of RNA extraction from FFPE tissue
This RNA expression signature requires a minimum RNA concentration of 2 ng/µL (25 ng of input RNA). Samples with concentrations greater than 40 ng/µL (500 ng of input RNA) are diluted and tested at an RNA concentration of 40 ng/µL. In order to determine the frequency at which the RNA extraction process provides sufficient RNA for testing, the RNA yields from 952 RP and 6,573 biopsy clinical samples were assessed (the majority of which were less than 2 years old). We observed that 100% of the RP and 99.8% of the biopsy samples produced sufficient RNA for testing (Figure 3). Furthermore, 82.6% of the RP samples produced RNA concentrations in excess of 40 ng/µL and required subsequent dilution, while only 5.5% of the biopsy samples required dilution (Figure 3). It was expected that RP samples would produce such large quantities of RNA compared to biopsy samples, due to the large relative difference in the sizes of the two types of samples.
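The concentration thresholds above amount to a simple triage rule, sketched here. The 12.5 µL input volume is not stated in the text; it is inferred from the paired figures (25 ng at 2 ng/µL, 500 ng at 40 ng/µL). The function and constant names are hypothetical.

```python
MIN_CONC = 2.0          # ng/uL, minimum concentration for testing
MAX_CONC = 40.0         # ng/uL, samples above this are diluted to this value
INPUT_VOLUME_UL = 12.5  # uL, inferred from 25 ng / (2 ng/uL); an assumption

def triage(concentration):
    """Return (action, tested concentration in ng/uL, input RNA in ng)."""
    if concentration < MIN_CONC:
        return ("insufficient", None, None)
    tested = min(concentration, MAX_CONC)
    action = "dilute" if concentration > MAX_CONC else "test as is"
    return (action, tested, tested * INPUT_VOLUME_UL)

for conc in (1.5, 2.0, 18.0, 100.0):
    print(conc, triage(conc))
```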

Linearity of the RNA concentration
The linearity of the signature was determined with respect to RNA concentration, as the amount of RNA obtained from FFPE samples can vary drastically. To this end, 3 samples were tested across a range of RNA concentrations from 125 to 0.06 ng/µL (1,560 to 1.5 ng of input RNA), which exceeds the clinical range over which we test samples (40 to 2 ng/µL, 500 to 25 ng of input RNA). The 3 samples tested had scores that ranged from -1.5 to 2.1, to assess whether samples with different CCP scores produced consistent scores over the same range of RNA concentrations. We observed that all three samples had consistent CCP scores across the range of RNA concentrations that produced a result (Figure 4); however, none of the three samples produced a CCP score at 0.06 ng/µL (1.5 ng of input RNA) because the results at that concentration did not pass our quality control measures. Although the low-scoring sample failed to give a result at 7.8 ng/µL due to an apparent technical failure, at least two concentrations greater than and less than this concentration gave successful results.
Using these data, we next calculated that the linear range of the RNA concentration was from 62.5 to 0.24 ng/µL. This ~260-fold range exceeds the 20-fold range of RNA concentrations over which the signature was clinically validated and clinical samples are tested (40 to 2 ng/µL). Within this linear range, every CCP score was within 1 SD (0.1 score units) of the average CCP score for that sample, for all three samples (Figure 4).
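The fold ranges quoted above can be checked with a short sketch. It assumes the dilution series is an exact 2-fold series starting at 125 ng/µL, so that the reported 0.24 and 0.06 ng/µL are rounded values of the ninth and eleventh halvings; the helper name is hypothetical.

```python
def dilution_series(start, halvings):
    """Concentrations of a serial 2-fold dilution, highest first."""
    return [start / 2 ** k for k in range(halvings + 1)]

series = dilution_series(125.0, 11)  # 125 down to about 0.06 ng/uL

# Fold range using the rounded concentrations reported in the text
reported_linear_fold = 62.5 / 0.24   # about 260
clinical_fold = 40.0 / 2.0           # 20

# Fold range using the unrounded series values (62.5 and about 0.244 ng/uL)
nominal_linear_fold = series[1] / series[9]

print(round(reported_linear_fold), clinical_fold, nominal_linear_fold)
```

Under that assumption, the nominal span between 62.5 and 0.244 ng/µL is eight halvings, so the ~260-fold figure reflects the rounding of the lower bound to 0.24 ng/µL.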

Amplification efficiency of genes within the CCP gene expression signature
The amplification efficiency of each amplicon within the signature was also determined, over the linear RNA concentration range from 62.5 to 0.24 ng/µL (Table 1). We observed no statistical difference in the amplification efficiencies when comparing housekeeper and target genes (p-value, 0.39).

Dynamic range of the CCP gene expression signature
We previously determined the dynamic range over which this gene expression signature could produce valid CCP scores [18]. This was accomplished by testing samples with an expected CCP score (as created by pre-specified ratios of the CCP genes to housekeeper genes) and comparing the expected score with the observed CCP score for each sample. We found that the signature had a very wide dynamic range over a 10^8-fold range of gene expression, from CCP scores of -13 to 14. The range of CCP scores we observed within recent clinical validations in prostate cancer samples (CCP scores from -2.0 to 4.1) [12][13][14][15][16][17] is well within the dynamic range of the gene expression signature.
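The correspondence between the score range and the fold range can be sketched under one assumption: that a single CCP score unit corresponds to one CT, i.e. a 2-fold change in relative expression. The scoring formula itself is given in [12] rather than here, so this is an illustrative reading, and the function name is hypothetical.

```python
def fold_span(low_score, high_score):
    """Fold change in expression spanned by a score interval.

    Assumes a log2 score scale (one score unit = 2-fold), which is an
    assumption made for illustration.
    """
    return 2.0 ** (high_score - low_score)

dynamic = fold_span(-13, 14)     # 2 ** 27, on the order of 10 ** 8
clinical = fold_span(-2.0, 4.1)  # 2 ** 6.1, under 100-fold

print(f"{dynamic:.2e}", round(clinical, 1))
```

On this reading, the clinically observed scores occupy only a small part of the range over which the assay returns valid scores.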

Discussion
The CCP score is calculated from the average expression of 31 CCP genes and 15 housekeeper genes and has been clinically validated as a robust predictor of prostate cancer outcomes in cohorts of conservatively managed men [12][13][14][15][16][17]. Previous studies have also demonstrated the clinical utility of the CCP score, which has been shown to influence medical management [19]. However, this clinical utility requires high analytical accuracy and precision. Here, we have presented the analytical validation of this signature, assessing the reproducibility, dynamic range, linearity, amplicon efficiency, and precision of the signature.
Previous studies have demonstrated a clinical range of CCP scores in biopsy and RP prostate samples from -2.0 to 4.1 [13][14][15][16]. The standard deviation of the CCP score determined here is 0.1 CCP score units between sample replicates, which is only 1.6% of the range of clinically observed scores. This is similar to previous analytical validations of RNA expression signatures, including the precision of this CCP expression signature within lung resection tissue [18][20][21][22]. The individual amplicons for all 46 genes in the signature had a SD ≤1 CT, with the housekeeper genes showing a greater degree of precision than CCP genes (Figure 1 and Table 1). These data are consistent with the higher overall expression of the housekeeper genes compared to the CCP genes in prostate tissue, resulting in higher precision when measuring housekeeper gene expression.
The stability of the RNA extracted from both needle biopsy and RP samples was also examined, as clinical samples occasionally require re-testing. Currently, extracted RNA from clinical samples is stored for one month within our CLIA laboratory, after which RNA must be extracted from new tissue if a sample requires re-testing. We observed that representative biopsy and RP samples were stable over at least an eight week period (Figure 2), which is double the amount of time clinical samples are retained for re-testing.
Figure 4: Dilution series for the RNA concentration, for 3 samples with different CCP scores. Note, the Low score sample failed to produce a CCP score at concentrations 7.8 and 0.12 ng/µL, and all three samples failed to produce a CCP score at 0.06 ng/µL because the CCP scores at those concentrations did not pass our quality control measures.
The quantity of RNA extracted from FFPE tissue can be highly variable, which can impact clinical testing of RNA expression. Here we have shown that 100% of RP samples (n=952) produced sufficient RNA for testing (2 ng/µL, 25 ng of input RNA), with 82.6% having concentrations high enough to require subsequent dilution prior to testing (≥40 ng/µL). Similarly, 99.8% of biopsy samples (n=6,573) had RNA concentrations sufficient for testing, while only 5.5% had concentrations high enough to require dilution. With the large difference of the relative sizes of RP and needle biopsy lesions, it was unsurprising that a large percentage of only RP samples yielded such high concentrations of RNA.
Previous studies have shown that the dynamic range of the CCP score is from -13 to 14 [18]. This extends well beyond the clinical range of scores observed in biopsy and RP samples (-2.0 to 4.1). Linearity of the CCP score was observed for RNA concentrations ranging from 62.5 to 0.24 ng/µL (781 to 3 ng of input RNA) (Figure 4), which includes the full range of RNA concentrations specified in the clinical testing protocol (40 to 2 ng/µL). Although this indicates that testing samples with concentrations as low as 0.24 ng/µL could provide an accurate CCP score, the threshold at 2 ng/µL is based on the detection limit of the Nanodrop spectrophotometer used for RNA quantification.
Many analytical validations of RNA expression signatures also require assessment of potential PCR inhibitors found within different types of tissue. For example, melanin found in skin biopsies can inhibit PCR and must be explicitly investigated [20]. However, there are no known PCR inhibitors found in prostate tissue.

Conclusions
The work presented here demonstrates this CCP gene expression signature is reproducible and robust when testing prostate FFPE needle biopsy and FFPE RP samples. The linear and dynamic range of the signature exceeds the parameters utilized in clinical testing, indicating that the test is suitable for use. In conjunction with the previous clinical validation [13,16] and utility studies [19], these studies indicate that this signature can be useful in providing accurate prognostic information to better inform medical management decisions for men with prostate cancer.