Validation of T2- and diffusion-weighted magnetic resonance imaging for mapping intra-prostatic tumour prior to focal boost dose-escalation using intensity-modulated radiotherapy (IMRT)

Highlights
• 5 mm mapping prostate biopsies correlated with imaged intra-prostatic tumour.
• Diffusion-restricted tumour of ≥0.5 cm³ can be dose-escalated with confidence.
• Tumours of <0.5 cm³ should not be dose-escalated.
• Diffusion-weighted MR has good diagnostic accuracy for dominant tumour lesions.


Abstract
Background and purpose: To assess the diagnostic accuracy and inter-observer agreement of T2-weighted (T2W) and diffusion-weighted (DW) magnetic resonance imaging (MRI) for mapping intra-prostatic tumour lesions (IPLs) for the purpose of focal dose-escalation in prostate cancer radiotherapy.
Materials and methods: Twenty-six men selected for radical treatment with radiotherapy were recruited prospectively and underwent pre-treatment T2W+DW-MRI and 5 mm spaced transperineal template-guided mapping prostate biopsies (TTMPB). A 'traffic-light' system was used to score both data sets. Radiologically suspicious lesions measuring ≥0.5 cm³ were classified as red; suspicious lesions of 0.2-0.5 cm³ or larger lesions equivocal for tumour were classified as amber. The histopathology assessment combined pathological grade and tumour length on biopsy (red = ≥4 mm primary Gleason grade 4/5 or ≥6 mm primary Gleason grade 3). Two radiologists assessed the MRI data and inter-observer agreement was measured with Cohen's Kappa coefficient.
Results: Twenty-five of 26 men had red image-defined IPLs by both readers; 24 had red pathology-defined lesions. There was a good correlation between lesions ≥0.5 cm³ classified "red" on imaging and "red" histopathology in biopsies (Reader 1: r = 0.61, p < 0.0001; Reader 2: r = 0.44, p = 0.03). Diagnostic accuracy for both readers for red image-defined lesions was sensitivity 85-86%, specificity 93-98%, positive predictive value (PPV) 79-92% and negative predictive value (NPV) 96%. Inter-observer agreement was good (Cohen's Kappa 0.61).
Conclusions: MRI is accurate for mapping clinically significant prostate cancer; diffusion-restricted lesions ≥0.5 cm³ can be confidently identified for radiation dose boosting.

Although dose-escalation to the whole prostate gland improves biochemical control of prostate cancer, it is at the expense of increased rectal toxicity [1][2][3][4][5][6][7].
The most important site for local recurrence is the dominant intra-prostatic tumour lesion (DIL) [8][9][10][11] suggesting that focal radiation boosts to the DIL may improve the therapeutic ratio of prostate radiotherapy [12,13]. Therefore, to achieve this improvement, the accuracy of imaging to detect the DIL needs to be established.
Diffusion-weighted MR is the most widely used multi-parametric magnetic resonance imaging (mpMRI) parameter for detecting and staging prostate cancer, because its quantitation is correlated with Gleason grade [14][15][16][17][18][19][20]. Much of the correlation between mpMRI and histopathology has used the gold standard of whole-mount radical prostatectomy specimens (WM-RP), restricting analysis of imaging to patients suitable for radical prostatectomy. There are few data correlating mpMRI with whole-gland histology in patients treated with radiotherapy, in particular to define DILs that might be suitable for radiation boosts.
Mathematical modelling suggests that transperineal template mapping prostate biopsies (TTMPB) with 5 mm spacing detect lesions ≥0.125 cc with 95% certainty [21]. Clinical studies have shown that the accuracy to detect a cancer volume of 0.2 cc or greater and 0.5 cc or greater is in the order of 90-95% with 5 mm sampling [22,23]. Five mm sampling (giving a sampling density of approximately 1 core per millilitre) has been shown to have a 95% sensitivity and negative predictive value (NPV) for detecting clinically significant prostate cancer (defined as ≥0.5 cm³ or Gleason ≥7), although this is significantly reduced with 1 cm mapping [24,25]. If clinically significant lesions are defined as ≥0.5 cm³ then 5 mm TTMPB detects 96-100% of such lesions [21,22]. A study classifying tumour burden from TTMPB core biopsy samples found that a single core with maximum cancer core length (MCCL) of 6 mm or greater had sensitivity to detect more than 95% of lesions of ≥0.5 cm³ (approximating to a 1 cm diameter lesion). A 4 mm MCCL detected more than 95% of ≥0.2 cm³ lesions [23].
The present study was designed to assess the diagnostic accuracy and inter-observer agreement of T2W+DW-MRI for mapping IPLs, using TTMPB as the reference standard, for the purpose of focal dose-escalation in patients selected for prostate cancer radiotherapy. This is the key first step in defining the DIL for boost therapy as tested in Phase 3 trials such as FLAME (NCT01168479) and PIVOTALboost (ISRCTN80146950).
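The correspondence between lesion volume and diameter quoted above can be checked with simple geometry. The sketch below assumes an idealised spherical lesion, which is a simplification introduced here for illustration; the function names are hypothetical.

```python
import math


def sphere_volume_cm3(diameter_mm: float) -> float:
    """Volume in cm^3 of a sphere with the given diameter in mm."""
    radius_cm = diameter_mm / 10.0 / 2.0
    return (4.0 / 3.0) * math.pi * radius_cm ** 3


def sphere_diameter_mm(volume_cm3: float) -> float:
    """Diameter in mm of a sphere with the given volume in cm^3."""
    radius_cm = (3.0 * volume_cm3 / (4.0 * math.pi)) ** (1.0 / 3.0)
    return 2.0 * radius_cm * 10.0


# Diameters and volumes used as thresholds in the text.
for diameter in (6, 7, 10):
    print(f"{diameter} mm diameter -> {sphere_volume_cm3(diameter):.2f} cm^3")
for volume in (0.2, 0.5):
    print(f"{volume} cm^3 -> {sphere_diameter_mm(volume):.1f} mm diameter")
```

Under this spherical assumption a 10 mm lesion has a volume of about 0.52 cm³ and a 0.2 cm³ lesion has a diameter of a little over 7 mm, so the volume and diameter thresholds used in the text agree only approximately.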

Study design and patient population
This single institution prospective study was a sub-group of the DELINEATE trial (ISRCTN04483921). Consenting patients were recruited sequentially. The trial was approved by the local institutional review board and Regional Ethics Committee and performed in accordance with European Union guidelines for Good Clinical Practice. Hormone-naïve patients with National Comprehensive Cancer Network (NCCN) [26] intermediate or high risk localised prostate cancer were eligible; patients with seminal vesicle involvement were excluded. All patients had standard staging investigations prior to recruitment. Eight weeks after the diagnostic trans-rectal ultrasound-guided biopsies, patients underwent an MRI comprising T2W and DW-MR followed by a TTMPB procedure.

MR acquisition
MR imaging was performed on a 1.5T whole-body MR scanner (Avanto, Siemens, Erlangen). Data were acquired using an endorectal receiver coil (ERC) inflated with 60 ml of air in combination with an external phased array body coil. A 20 mg intramuscular injection of butylscopolamine bromide (Buscopan, Boehringer Ingelheim) was administered to reduce peristalsis. The MR protocol comprised slice-matched, 3-mm, transverse T2W fast spin-echo and single-shot echo-planar DW-MRI to cover the entire prostate gland. T2W fast spin-echo images were also acquired in sagittal and coronal planes. Apparent diffusion coefficient (ADC) maps were generated from all b values 0-800 s/mm² (parameter details in Supplementary Appendix A).

TTMPB procedure
Patients were anaesthetised, given prophylactic antibiotics and set up in the lithotomy position. Biopsies were taken at 5 mm intervals; apical and basal aspects of the prostate were biopsied separately if prostate length required. Cores were taken by an experienced urologist, blinded to the MR results, and each core was marked with ink at the apical end to define polarity [27]. The supra-urethral area was avoided to prevent urethral injury.

Imaging and histopathology interpretation
Two uro-radiologists qualitatively and independently read the T2W+DW-MRI data. Both radiologists were blinded to the clinical patient data. Tumour was defined as a low signal-intensity focal lesion on T2W that showed restricted diffusion on DW-MRI. In the transitional zone, the homogeneity of the lesion and its mass effect were also considered in order to differentiate it from stromal nodules. The prostate was analysed in octants and in Barzell zones [28], which were modified for analysis taking into account the size of the prostate gland. If the prostate was short in length the analysis was performed as a single layer making quadrants or 11 Delineate-modified Barzell zones (DMBZ) (Fig. 1B and Supplementary Appendix B). Each sector was classified with a traffic-light system as red, amber, green or white. Red corresponded to tumour ≥10 mm in diameter (≥0.5 cm³) on imaging or MCCL ≥6 mm of primary Gleason grade 3 or ≥4 mm of primary Gleason grade 4 on TTMPB. Amber corresponded to tumour of 7-9.9 mm (0.2-0.49 cm³) or an abnormality equivocal for tumour on imaging, or MCCL 4-6 mm of primary Gleason grade 3 or ≥2 mm of primary Gleason grade 4. Green corresponded to tumour ≤6 mm in diameter (<0.2 cm³) or low suspicion of tumour on imaging, or MCCL <4 mm of Gleason 6 or <2 mm of Gleason 7. White corresponded to no tumour on imaging or biopsy (Fig. 1A). Each sector was analysed using 2 thresholds: a sector was a true positive if imaging and pathology were classified as (1) both red or (2) both either red or amber. Green and white were not considered to be clinically significant prostate cancer lesions. If less than 1/3 of a sector on biopsy was affected by tumour it was classified as negative. The DMBZ analysis was performed with both strict and flexible methods.
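The traffic-light rules can be written as two small decision functions. This is an illustrative sketch rather than the scoring software used in the study: the function names and the suspicion labels ('high', 'equivocal', 'low', 'none') are hypothetical, imaging size is expressed as volume to avoid the gap between the 6 mm and 7 mm diameter bands, and the pathology rule is keyed on primary Gleason grade alone, which is a simplifying assumption.

```python
def classify_imaging(volume_cm3: float, suspicion: str) -> str:
    """Traffic-light category for an imaged lesion.

    suspicion is one of 'high', 'equivocal', 'low', 'none'
    (hypothetical labels introduced for this sketch).
    """
    if suspicion == "none" or volume_cm3 <= 0:
        return "white"
    if suspicion == "low" or volume_cm3 < 0.2:
        return "green"
    if suspicion == "high" and volume_cm3 >= 0.5:
        return "red"
    # Suspicious lesions of 0.2-0.49 cm^3, or larger equivocal lesions.
    return "amber"


def classify_pathology(mccl_mm: float, primary_grade: int) -> str:
    """Traffic-light category from maximum cancer core length (MCCL)."""
    if mccl_mm <= 0:
        return "white"
    # Shorter core lengths suffice when the primary Gleason grade is 4 or 5.
    if primary_grade >= 4:
        red_mm, amber_mm = 4.0, 2.0
    else:
        red_mm, amber_mm = 6.0, 4.0
    if mccl_mm >= red_mm:
        return "red"
    if mccl_mm >= amber_mm:
        return "amber"
    return "green"


print(classify_imaging(0.6, "high"), classify_pathology(6.5, 3))
print(classify_imaging(0.6, "equivocal"), classify_pathology(2.5, 4))
```

Keeping the imaging and pathology rules separate mirrors the study design, in which the two data sets were scored independently and compared sector by sector afterwards.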
The flexible method allowed for minor geographical mismatch between sectors; if an imaged sector was positive where the corresponding pathology sector was negative, any directly adjacent positive pathology zones were classified as a true positive [29]. The DIL was defined as the largest lesion identified by both readers. The second IPL was defined as the next largest lesion identified by one or both readers. The pathological DIL and other IPLs were defined considering the total cancer core length contained in clustered positive biopsies, i.e., all adjacent/contiguous biopsies (Fig. 2).
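The difference between strict and flexible sector matching can be sketched as follows. This is one possible reading of the rule, not the study's own implementation: the sector labels and adjacency map are hypothetical (the real DMBZ layout is not reproduced), and the sketch only reclassifies the imaged sector, leaving the adjacent pathology-positive sector to be scored on its own merits.

```python
def score_sectors(sectors, imaging_pos, pathology_pos, adjacency, flexible=True):
    """Count TP/FP/FN/TN over the sectors of one patient for one reader."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for sector in sectors:
        on_imaging = sector in imaging_pos
        on_pathology = sector in pathology_pos
        if flexible and on_imaging and not on_pathology:
            # Minor geographical mismatch: accept a directly adjacent
            # pathology-positive sector as confirming the imaged lesion.
            on_pathology = any(
                neighbour in pathology_pos
                for neighbour in adjacency.get(sector, ())
            )
        if on_imaging and on_pathology:
            counts["TP"] += 1
        elif on_imaging:
            counts["FP"] += 1
        elif on_pathology:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


# Hypothetical strip of four sectors, each adjacent to its neighbours.
SECTORS = ["A", "B", "C", "D"]
ADJACENCY = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

# Imaging calls sector A positive; biopsy finds tumour in neighbouring B.
strict = score_sectors(SECTORS, {"A"}, {"B"}, ADJACENCY, flexible=False)
flexible = score_sectors(SECTORS, {"A"}, {"B"}, ADJACENCY, flexible=True)
print("strict:  ", strict)
print("flexible:", flexible)
```

In this toy case the strict method scores sector A as a false positive, whereas the flexible method scores it as a true positive because the neighbouring sector B is pathology-positive.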

Statistical analysis
Statistical analysis was performed in Excel (Microsoft) and Statistical Package for the Social Sciences (SPSS® v21.0, IBM Corp, NY, USA) following a pre-specified Statistical Analysis Plan (Supplementary Appendix B). Descriptive statistics were used to assess tumour volumes and diameters. Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) by sector with binomial 95% confidence intervals (CI) were calculated with Wald adjustments. All sectors were combined to give diagnostic accuracy measurements for each reader. Receiver operating characteristic (ROC) curves compared the area under the curve (AUC) for each reader for significant cancer detection. Inter-observer agreement was measured with Cohen's Kappa coefficient [30] and interpreted as: 0-0.2 slight agreement, 0.21-0.4 fair agreement, 0.41-0.6 moderate agreement, 0.61-0.8 good agreement, 0.81-1.0 almost perfect agreement. Spearman rank correlation assessed relationships between imaging tumour volumes and pathological findings. The pre-specified primary end-point was the diagnostic performance using DMBZ with the red only, flexible methodology; other endpoints were regarded as exploratory.
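For readers who wish to reproduce this kind of sector-level analysis, a minimal sketch of the calculations is given here. It assumes that "Wald adjustments" refers to the adjusted Wald (Agresti-Coull) interval, which is an interpretation rather than something the text states, and the counts and ratings in the example are hypothetical values chosen only for illustration.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal


def adjusted_wald(successes: int, n: int, z: float = Z95):
    """Adjusted Wald (Agresti-Coull) interval for a binomial proportion."""
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2.0) / n_adj
    half_width = z * math.sqrt(p_adj * (1.0 - p_adj) / n_adj)
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int):
    """Point estimate and 95% CI for each measure (denominators must be > 0)."""
    pairs = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }
    return {name: (x / n, adjusted_wald(x, n)) for name, (x, n) in pairs.items()}


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same sectors (lists)."""
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    categories = set(ratings_a) | set(ratings_b)
    expected = sum(
        (ratings_a.count(c) / n) * (ratings_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1.0 - expected)


# Hypothetical sector counts and reader calls, not data from this study.
results = diagnostic_accuracy(tp=40, fp=5, fn=10, tn=145)
for name, (estimate, (low, high)) in results.items():
    print(f"{name}: {estimate:.3f} (95% CI {low:.3f}-{high:.3f})")

reader_1 = [1, 1, 0, 0]
reader_2 = [1, 0, 0, 0]
print(f"kappa = {cohen_kappa(reader_1, reader_2):.2f}")
```

The sketch pools sectors as if they were independent observations, which matches the approach described in the limitations of this study but ignores clustering of sectors within patients.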

Results
Twenty-six eligible and consenting patients were recruited between October 2010 and November 2013 (56 patients were recruited to the whole study's initial phase). The diagnostic accuracy parameters for the T2W+DW-MR images for readers 1 and 2 are shown in Table 2. The diagnostic accuracy of T2W+DW-MRI was high for both readers for identification of tumour within a given sector of the prostate: sensitivity 85-86%, specificity 93-98%, PPV 79-92% and NPV 96%. Cohen's kappa statistic for inter-observer agreement was 0.61, indicating good agreement between the 2 readers. Median DIL volumes (Fig. 3A) were 2.2 cm³ (IQR 1.4-3.1 cm³) for reader 1 and 1.54 cm³ (IQR 1.1-2.7 cm³) for reader 2. A 2nd IPL was recorded on MR in 11 patients by reader 1 (median volume 0.63 cm³, IQR 0.34-0.88 cm³) and in 5 patients by reader 2 (median volume 0.29 cm³, IQR 0.27-0.38 cm³) (Fig. 3A + B).
Cancer core lengths in DILs and 2nd IPLs are shown in Fig. 3C + D. There was a statistically significant correlation between both the imaged DIL volume and any imaged red lesion volume with total cancer core length in that volume; reader 1: r = 0.44 and 0.61, p = 0.026 and 0.0001 respectively; reader 2: r = 0.50 and 0.44, p = 0.01 and 0.03 respectively (Table 3). Correlation between the 2nd IPL volume or any imaged amber lesion volume and total cancer core length was poorer and did not reach statistical significance. Exploratory endpoints, including amber as well as red categories, assessment using octants, and strict rather than flexible DMBZ definitions, are shown in Supplementary Appendix C. The various methods showed sensitivity 67-87%, specificity 84-98%, PPV 49-91% and NPV 89-96%.
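The rank correlation between imaged lesion volume and total cancer core length can be computed as in the sketch below. It is a pure-Python equivalent of the Spearman coefficient (Pearson correlation of average ranks, so ties are handled), written for illustration; the study itself used SPSS, p-values are not computed here, and the paired values are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return covariance / (spread_x * spread_y)


# Hypothetical lesion volumes (cm^3) and total cancer core lengths (mm).
volumes = [0.6, 1.2, 2.2, 3.1, 1.5]
core_lengths = [8.0, 15.0, 12.0, 30.0, 20.0]
print(f"rho = {spearman_rho(volumes, core_lengths):.2f}")
```

A rank-based coefficient suits this comparison because total cancer core length is only a surrogate for pathological volume, so a monotonic rather than linear relationship is the most that can be expected.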

Discussion
DELINEATE is the first prospective study to our knowledge assessing the diagnostic accuracy of MR for the purposes of defining the position of a radiation boost in a population planned for radical treatment with radiotherapy rather than prostatectomy. TTMPB at 5 mm intervals were used as the reference standard to achieve whole gland pathological sampling.
This study has shown that T2W+DW-MRI has good diagnostic accuracy for mapping the location and extent of tumour lesions measuring ≥0.5 cm³ or ≥1 cm in diameter with restricted diffusion on MR. Sensitivity, specificity and NPV were consistently high for 2 independent observers for the primary outcome measure, the red only, flexible method (85-86%, 93-98% and 96% respectively), although the PPV varied more widely (79-92%) owing to the low proportion of sectors affected by tumour. The inclusion of amber lesions in the analysis caused a small decrease in sensitivity (78-80%) and NPV (92%). This was due to a marginal decrease in false positives with a corresponding larger increase in false negatives.
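The sensitivity of PPV to prevalence and to small changes in specificity follows directly from Bayes' theorem, as the sketch below illustrates. The sensitivity and specificity values are taken from the ranges reported above, but the prevalence values are hypothetical and are not the sector prevalence observed in this study.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) implied by sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical proportions of tumour-bearing sectors.
for prevalence in (0.10, 0.15, 0.30):
    for specificity in (0.93, 0.98):
        ppv, npv = predictive_values(0.85, specificity, prevalence)
        print(
            f"prevalence {prevalence:.2f}, specificity {specificity:.2f}: "
            f"PPV {ppv:.2f}, NPV {npv:.2f}"
        )
```

When few sectors contain tumour, most negative sectors are true negatives, so a few percentage points of specificity translate into a large swing in PPV while NPV remains high.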
Other small, single institution series have referenced MR with WM-RP but vary in technical parameters used and so are not strictly comparable [17,31-34]. Reported sensitivities and specificities vary widely from 23 to 96% [35][36][37], mainly reflecting the prevalence of the disease, the reason for performing the test, the variability in the threshold level for defining positive disease and the sector level selected for analysis, all of which critically influence the reported results.
Two complementary studies have recently reported assessments of MR in the diagnostic pathway for prostate cancer. The UK PROMIS paired-cohort multicentre trial [38] investigated mpMRI (T2W, DWI and dynamic contrast-enhanced (DCE) MR) to define primary Gleason 4 or cancer of any grade ≥6 mm (≥0.5 cm³) on TTMPB, demonstrating a sensitivity, specificity, PPV and NPV of 93%, 41%, 51% and 89% respectively. However, as the primary end-point was cancer diagnosis, the threshold chosen for a positive MR was lower and considered the whole prostate as a single entity. When the PROMIS data are assessed with only lesions likely to represent cancer (exclusion of equivocal lesions), sensitivity, specificity, PPV and NPV are 70%, 78%, 70% and 84% respectively, which remain lower than but more comparable with our work. This may be explained by the different population of patients examined, the improved imaging resolution using an endorectal coil in our study and the multicentre nature of the PROMIS data. A similar prospective multicentre Australian study of 344 men assessed T2W/DWI/DCE using the PI-RADS scale [19], with a cut-off of equivocal for a positive MR [39]. Significant prostate cancer was defined as Gleason 7 with more than 5% grade 4, ≥20% of cores positive or ≥7 mm of prostate cancer in any core on transperineal template-guided prostate biopsies (median 30 cores per patient). Sensitivity, specificity, PPV and NPV of 96%, 36%, 52% and 92% respectively were reported, which are very similar to PROMIS. Anatomical concordance of the location of imaged lesion and significant cancer on biopsy was found in 97%. The lack of 5 mm mapping is likely to have impacted on the ability to detect all clinically significant cancer, as the size of tumour left undetected is directly related to the uniform spacing between core samples [21]. The consistent results of the PROMIS and Australian multicentre studies suggest that our findings will be generalisable.
For initial MR screening it is desirable to keep the false negative rate as low as possible. For focal dose-escalation, where the remainder of the prostate is getting standard doses of radiation, it is preferable to keep the false positive rates as low as possible, i.e., a higher specificity. Dose-escalation to false positives could cause increased toxicity without additional benefit, thereby reducing any improvement in therapeutic ratio. In our data only 2 patients (8%) had a false positive MR when assessing the whole prostate; one had multiple adjacent amber cores (multiple cores of 2-5.5 mm Gleason grade 4 + 3). The second had multiple adjacent cores with <4 mm of tumour in the basal sections of the cores, classified as green, suggesting inadequate sampling of the imaged basal tumour.
Inter-observer agreement between the readers and correlation between the delineated MR volumes (for the DIL and red lesions) and the total cancer core length were good, largely a consequence of the size of the DIL and the higher Gleason grade of these tumours causing substantial diffusion-restriction on the DW-MRI. Correspondingly, smaller lesions of lower Gleason grade were more difficult to define and led to poor agreement between radiologists and poorer correlation with total cancer core length. The inclusion of these lesions for dose boosting is questionable both because of the imaging uncertainty and lack of need to boost smaller cancer foci. Reassuringly, inter-observer agreement in the multi-centre PROMIS trial (0.63 Cohen's Kappa) was similar to that in our study. In future, a combination of T2W and diffusion-weighted imaging will generate contrast for more accurate and even semi-automated gross tumour volume (GTV) delineation.
There are several limitations to our study. First, the number of patients was small. Second, despite the extensive sampling some areas of prostate are difficult to fully biopsy without undue risks to patients. These areas include the extreme base of the gland (bladder neck injury), the supra-urethral area (urethral injury) and the most anterior part of larger prostate glands, where pubic arch interference limits access. In our patients this caused "false positives" to be scored on a minority of imaging sectors. Third, prostate biopsies may underestimate the true tumour burden by sampling the periphery rather than the centre of smaller lesions. Although this risk is reduced with 5 mm mapping, it will have had an effect on the analysis of total cancer core length as a surrogate for pathological volume. Fourth, we acknowledge that the statistical analysis assesses all DMBZ and octants as independent of each other within each patient. This is, however, a well-documented approach to sector-based diagnostic accuracy studies [29,40-42]. Finally, we acknowledge that there is inevitably some uncertainty related to the mapping of the prostate images to the stylised Barzell diagram, which certainly will have introduced minor geographical discrepancies between the reporting radiologists and the pathological assessments.
We have shown that T2W+DW-MRI robustly identifies the DIL for focal boost radiotherapy, the accuracy of which underpins clinical evaluation of such approaches. The DELINEATE trial has now recruited over 200 patients using conventional or modest hypofractionation schedules. A recent systematic review identified 988 patients treated with a DIL radiation boost within Phase 1/2 studies, which appears to be associated with low toxicity [43] even with prolonged follow-up of 8 years [44]. DIL boosts are being assessed in ongoing clinical Phase 3 trials such as FLAME (NCT01168479) [45,46] and PIVOTALboost (ISRCTN80146950).
In summary, focal dose escalation to the DIL may be limited to lesions ≥1 cm in diameter (≥0.5 cm³), where T2W+DW-MRI suggests a high suspicion of tumour which can be defined with confidence. Lesions <0.5 cm³ or larger lesions less restricted on DW-MRI should be treated with standard radiation doses. Including these lesions in the threshold for focal boosts increases false positives and risks increasing toxicity without therapeutic benefit.