Automated Patient-level Prostate Cancer Detection with Quantitative Diffusion Magnetic Resonance Imaging

Take Home Message: An automated, quantitative biomarker from prostate magnetic resonance imaging, derived from a 2-min scan on a standard clinical scanner, was superior to conventional apparent diffusion coefficient and comparable with expert-determined Prostate Imaging Reporting and Data System for patient-level detection of clinically significant prostate cancer.

Restriction spectrum imaging (RSI) is an advanced technique for diffusion-weighted imaging (DWI) that accounts for a complex tissue microstructure by estimating the contributions of distinct tissue compartments believed to correspond to restricted intracellular water, hindered extracellular water, freely diffusing water, and vascular flow [15,17]. We have recently developed a PCa MRI biomarker, called the RSI restriction score (RSI rs ), which relies specifically on the restricted intracellular water signal (Fig. 1). RSI rs gives improved cancer conspicuity and voxel-level PCa detection compared with the current clinical standard for quantitative DWI, the apparent diffusion coefficient (ADC) [17,18].
The most important current clinical use of mpMRI is to guide the decision of whether to biopsy, that is, patient-level detection of csPCa [1,2,19-22]. Here, we evaluate RSI rs as a quantitative marker for patient-level detection of csPCa (grade group ≥2) without reliance on the subjective expert manual identification of specific lesions. We compared the performance of RSI rs with that of conventional ADC as well as of PI-RADS v2.1 in a dataset not used in prior studies. We hypothesized that RSI rs is superior to ADC for patient-level detection of csPCa on biopsy.

Study population
With IRB approval, we retrospectively studied all men who underwent

MRI acquisition and processing
Scans were collected on a 3-T clinical MRI scanner (Discovery MR750; GE Healthcare, Waukesha, WI, USA) using a 32-channel phased-array body coil (acquisition parameters are shown in Table 1).
We performed postprocessing of MRI data in MATLAB (MathWorks, Natick, MA, USA), including corrections for distortions from B0 inhomogeneity, gradient nonlinearity, and eddy currents [23,24]. We performed RSI calculations as described previously (Fig. 1) [17,18].
We generated receiver operating characteristic (ROC) curves for patient-level detection of csPCa using ADC, RSI rs , and PI-RADS. In the primary analysis, we analyzed RSI rs and ADC as quantitative metrics, taking the maximum RSI rs and minimum ADC within the prostate. It is important to note that the use of the minimum ADC here differs from clinical practice, where an expert radiologist typically identifies a suspicious lesion and then calculates the mean ADC from all or part of that lesion [27].
Our approach using the maximum RSI rs and minimum ADC is analogous to the use of the maximum standardized uptake value in positron emission tomography imaging for cancer [28]. We chose this prostate-wide approach to evaluate whether a quantitative metric could be used in a fully automated fashion within the prostate, without relying on subjective delineation of individual lesions, which depends on reader experience [11]. We considered biopsies finding only grade group 1 cancers (Gleason 6) or benign tissue as negative results for the ROC curves.
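The prostate-wide reduction described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' MATLAB pipeline: the function name patient_level_metrics, the array layout, and the assumption that a binary prostate mask is already available are all hypothetical.

```python
import numpy as np

def patient_level_metrics(rsi_rs, adc, prostate_mask):
    """Return (maximum RSIrs, minimum ADC) over voxels inside the prostate mask.

    All three inputs are assumed to be arrays of identical shape; the mask is
    assumed to come from a separate segmentation step not shown here.
    """
    mask = np.asarray(prostate_mask, dtype=bool)
    if not mask.any():
        raise ValueError("prostate mask is empty")
    rsi_vals = np.asarray(rsi_rs, dtype=float)[mask]
    adc_vals = np.asarray(adc, dtype=float)[mask]
    # nanmax/nanmin ignore voxels flagged as missing (NaN) by earlier steps.
    return float(np.nanmax(rsi_vals)), float(np.nanmin(adc_vals))

# Toy 1x2x2 volume: the brightest RSIrs voxel (5.0) lies outside the mask
# and is therefore ignored.
rsi = np.array([[[0.1, 0.9], [0.3, 5.0]]])
adc = np.array([[[1.2, 0.6], [1.0, 0.1]]])
mask = np.array([[[True, True], [True, False]]])
score, min_adc = patient_level_metrics(rsi, adc, mask)
```

Each patient is thereby reduced to one number per metric, which is what the ROC analysis consumes.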
We assessed performance by the area under the ROC curve (AUC) and made statistical comparisons via 10 000 bootstrap samples to calculate 95% confidence intervals and bootstrap p values for the difference between the performance (AUC) of ADC, RSI rs , and PI-RADS [29]. We used two-sided α = 0.05 to determine statistical significance.
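One way to carry out such a comparison is a paired bootstrap over patients, sketched below in Python. The text does not specify the exact resampling scheme or p-value definition, so the percentile interval and the two-sided p value used here are assumptions, as are the function names and the toy data. Because a lower ADC is more suspicious, ADC is negated so that higher scores indicate cancer for both metrics.

```python
import numpy as np

def auc(scores, labels):
    """AUC via the Mann-Whitney U statistic; tied pairs count as 0.5."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        return np.nan  # AUC is undefined when only one class is present
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def bootstrap_auc_difference(score_a, score_b, labels, n_boot=10_000, seed=0):
    """Paired bootstrap of AUC(a) - AUC(b): observed difference, 95% CI, p value."""
    score_a = np.asarray(score_a, dtype=float)
    score_b = np.asarray(score_b, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    rng = np.random.default_rng(seed)
    n = labels.size
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample patients with replacement
        diffs[i] = auc(score_a[idx], labels[idx]) - auc(score_b[idx], labels[idx])
    diffs = diffs[~np.isnan(diffs)]  # drop resamples that contain a single class
    ci = np.percentile(diffs, [2.5, 97.5])
    # Assumed two-sided p value: twice the smaller tail proportion around zero.
    p = min(1.0, 2 * min((diffs <= 0).mean(), (diffs >= 0).mean()))
    observed = auc(score_a, labels) - auc(score_b, labels)
    return observed, ci, p

# Toy data (hypothetical values): 1 = csPCa on biopsy, 0 = benign or grade group 1.
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
rsi_max = np.array([1.0, 2.0, 1.5, 3.0, 2.5, 4.0, 5.0, 3.5])
adc_min = np.array([1.2, 0.9, 1.1, 0.7, 0.8, 1.0, 0.5, 0.95])
observed, ci, p = bootstrap_auc_difference(rsi_max, -adc_min, labels, n_boot=2000)
```

Resampling whole patients keeps the two scores paired, so the interval reflects the correlation between metrics measured on the same patient.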
We used procedures analogous to those described above for subsequent analyses as follows:

Quantitative diffusion MRI within PI-RADS categories
To determine whether RSI rs enhances the detection of higher-grade PCa compared with PI-RADS alone, we repeated the RSI rs patient-level analysis within the strata of each PI-RADS category (ie, 3, 4, and 5).

Combination of PI-RADS and RSI
To explore overall performance of the combination of PI-RADS and RSI rs , we generated an ROC curve for PI-RADS + RSI rs by concatenating the within-PI-RADS strata performance from above (ie, the logistic posterior probabilities) across categories. We then calculated the AUC of the resulting ROC curve for PI-RADS + RSI rs and compared it with either PI-RADS or RSI rs alone.
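The concatenation step can be illustrated with the following Python sketch: a univariate logistic model of csPCa on the maximum RSI rs is fitted within each PI-RADS category, and the resulting posterior probabilities are pooled into one score per patient. The fitting routine (Newton's method with a small ridge term), the handling of single-class strata, and all names and toy values are assumptions made for illustration. The model is fitted and evaluated on the same patients here, so the resulting AUC is likely optimistic.

```python
import numpy as np

def sigmoid(z):
    # Clip the logit to avoid overflow in exp for extreme values.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def fit_logistic_1d(x, y, n_iter=50, ridge=1e-3):
    """Fit P(y=1|x) = sigmoid(b0 + b1*x) by Newton's method with a small ridge term."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        w = p * (1.0 - p)
        grad = X.T @ (y - p) - ridge * beta
        hess = X.T @ (X * w[:, None]) + ridge * np.eye(2)
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-8:
            break
    return beta

def combined_posteriors(pirads, rsi_max, labels):
    """Posterior probability per patient from a logistic fit within each PI-RADS stratum."""
    pirads = np.asarray(pirads)
    rsi_max = np.asarray(rsi_max, dtype=float)
    labels = np.asarray(labels, dtype=float)
    post = np.full(labels.shape, np.nan)
    for cat in np.unique(pirads):
        idx = pirads == cat
        y = labels[idx]
        if y.min() == y.max():
            post[idx] = y.mean()  # single-class stratum: fall back to prevalence
        else:
            b0, b1 = fit_logistic_1d(rsi_max[idx], y)
            post[idx] = sigmoid(b0 + b1 * rsi_max[idx])
    return post

def auc(scores, labels):
    """AUC via the Mann-Whitney U statistic; tied pairs count as 0.5."""
    labels = np.asarray(labels, dtype=bool)
    pos, neg = np.asarray(scores)[labels], np.asarray(scores)[~labels]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

# Toy data (hypothetical): four patients in each of PI-RADS 3, 4, and 5.
pirads = np.array([3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5])
rsi_max = np.array([1.0, 2.0, 3.0, 4.0, 2.0, 3.0, 4.0, 5.0, 3.0, 4.0, 5.0, 6.0])
labels = np.array([0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1])
post = combined_posteriors(pirads, rsi_max, labels)
combined_auc = auc(post, labels)
```

Pooling posteriors rather than raw RSI rs values lets each PI-RADS category contribute its own baseline risk to the combined score.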

Peripheral zone and transition zone
We again repeated the patient-level analysis in subgroups with lesions in only either the peripheral zone or the transition zone. For the transition zone analysis, we limited the search for the maximum RSI rs and minimum ADC to the central gland (transition and central zones). We performed an analogous analysis for patients with peripheral zone cancers. Then, to evaluate whether zone-specific searching was necessary to optimize performance, we repeated the transition zone and peripheral zone subgroup analyses but allowed the search for the maximum RSI rs and minimum ADC to include the whole prostate.
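The difference between zone-restricted and whole-prostate searching amounts to a change of mask, as the short Python sketch below illustrates. The zone masks, the helper name max_in_region, and the toy values are hypothetical; how the zones were actually segmented is not described in this section.

```python
import numpy as np

def max_in_region(volume, region_mask):
    """Maximum value of `volume` inside `region_mask`, or NaN if the mask is empty."""
    vals = np.asarray(volume, dtype=float)[np.asarray(region_mask, dtype=bool)]
    return float(vals.max()) if vals.size else float("nan")

# Toy flattened volume of four voxels: two central gland, two peripheral zone.
rsi = np.array([0.2, 1.5, 0.7, 3.1])
central_gland = np.array([True, True, False, False])
peripheral_zone = np.array([False, False, True, True])
whole_prostate = central_gland | peripheral_zone

zone_max = max_in_region(rsi, central_gland)    # search limited to the central gland
whole_max = max_in_region(rsi, whole_prostate)  # search over the whole prostate
```

Because the whole-prostate mask contains each zone, its maximum can never be lower than the zone-restricted one; the two searches differ only when the hottest voxel lies outside the zone of interest.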

Results
A total of 151 patients met the criteria for inclusion (characteristics are summarized in Table 2). Ten radiologists had interpreted the imaging for these 151 patients, reading a median of 18 cases each (interquartile range [IQR]: 4-24 cases). The radiologists were board certified and subspecialty fellowship trained, with a median of 4 yr of experience (IQR: 4-9 yr). More experienced radiologists read more cases, so the mean number of years of experience per case was 8.5 yr (standard deviation: 1 yr).

Primary analysis: patient-level detection of csPCa
All 151 patients were included in the primary (whole-prostate) analysis. AUC values for ADC, RSI rs , and PI-RADS are reported in Table 3. Both RSI rs (p < 0.0001) and PI-RADS (p < 0.0001) were superior to ADC as a patient-level classifier of higher-grade PCa. The performance of RSI rs was comparable with that of PI-RADS (p = 0.8). The histograms and ROC curves for the primary analysis are shown in Figures 2 and 3A, respectively.
AUC values within each PI-RADS category are reported in Table 3. Performance for RSI rs was numerically greater than that for ADC in each subset, although confidence intervals were wide. The difference was statistically significant within patients with PI-RADS 4 lesions (p < 0.0001) but not within patients with PI-RADS 3 (p = 0.10) or 5 (p = 0.13) lesions.
There was no significant difference in performance between the alternate ADC maps and vendor-calculated ADC maps (p = 0.24).

Combination of PI-RADS and RSI
AUC values for RSI rs concatenated within-PI-RADS subsets (PI-RADS + RSI rs ) and applied to all 151 patients are shown in Table 3. PI-RADS + RSI rs was superior to either PI-RADS (p = 0.001) or RSI rs (p = 0.03) alone. ROC curves are shown in Figure 3A.

Peripheral zone
We found 103 patients with a peripheral zone lesion and no transition zone lesion (15 benign, 23 grade group 1, and 65 csPCa). AUC values are shown in Table 3. RSI rs performance was comparable with that of PI-RADS for the peripheral zone (p = 0.98) and superior to that of ADC (p = 0.0002). ROC curves are shown in Figure 3B. PI-RADS + RSI rs was superior to either PI-RADS (p = 0.005) or RSI rs alone (p = 0.003). Similar results were obtained when searching the whole prostate for the maximum RSI rs .

Transition zone
We found 37 patients with a transition zone lesion and no peripheral zone lesion (14 benign, 15 grade group 1, and eight csPCa). AUC values are shown in Table 3. RSI rs performance was superior to that of ADC (p < 0.0001) in the transition zone. RSI rs performance was numerically superior to that of PI-RADS, but this difference was not statistically significant (p = 0.08). PI-RADS + RSI rs was superior to PI-RADS (p = 0.005) but not RSI rs alone (p = 0.63). ROC curves are shown in Figure 3C. RSI rs images and ADC maps for two patients with transition zone lesions are shown in Figure 4.
Similar results were obtained when searching the whole prostate for the maximum RSI rs , suggesting that zone-specific searching may not be necessary.

Discussion
RSI rs performed well for quantitative, automated detection of csPCa at the patient level. ADC proved unreliable as a quantitative marker with an analogous approach. We note that routine clinical use of ADC is not automated and fully quantitative; rather, it is typically used within expert-defined lesions. RSI rs was based solely on a 2-min diffusion MRI acquisition on a standard clinical scanner; yet performance was comparable with PI-RADS categories assigned by experts using all images from a complete mpMRI examination. An analysis of the transition zone was underpowered because relatively few csPCa cases could be included (there were many more false positives from PI-RADS interpretation than true positives in the transition zone). With that limitation, there was no suggestion of worse performance for RSI rs in the transition zone, with an AUC of 0.84 (0.68, 0.95) for RSI rs , compared with 0.73 (0.54, 0.88) for PI-RADS (p = 0.08). This should be investigated further in larger datasets, as a prior retrospective analysis using a different RSI model found superior specificity for RSI in the transition zone [30].
In exploratory analyses, we found that combining PI-RADS categories and the maximum RSI rs might improve performance over either of these alone. RSI rs had an AUC of ≥0.70 within each PI-RADS subset, including PI-RADS 3. Concatenating the within-PI-RADS ROC results showed that the combination of PI-RADS and RSI rs also performed better than PI-RADS alone across the full dataset. This last finding should be interpreted cautiously because there were relatively few patients in each PI-RADS category subset. In the future, larger datasets will permit development of a multivariable model with PI-RADS and RSI rs , which could then be validated in an independent dataset. In contrast, all other findings in this study already represent validation tests in an independent dataset from the one used to develop the quantitative RSI rs biomarker.
Our approach is clinically feasible. RSI rs was calculated from a 2-min acquisition on a standard clinical scanner, and all postprocessing was achieved in 14 min per patient using a desktop computer. Similar RSI models are already commercially available and in clinical use. The present study demonstrates performance of a quantitative RSI metric for csPCa detection in a completely independent dataset from that used to develop the model and with a distinct acquisition protocol (different b values and echo time).
PI-RADS categories for this study were assigned during routine clinical practice. All readers were board-certified and subspecialty-trained attending radiologists at an academic center and adhered to PI-RADS standards, but this does not preclude some inter-reader variability. The goal of this analysis was not to use idealized PI-RADS implementation with central reads, but rather to obtain a real-world comparator for the quantitative biomarker. Performance for PI-RADS here is within the range of expected values [4]. Clinical decision-making surrounding biopsy may have been influenced by any number of imaging and nonimaging clinical factors, per standard of care. However, as none of these additional risk factors are formally incorporated into PI-RADS, there is nothing to suggest that this decision-making would unduly influence the relative performance of PI-RADS, ADC, and RSI rs among men who did undergo biopsy. Studies to incorporate RSI rs and other clinical factors for optimal decision-making are ongoing and would only improve on the encouraging performance demonstrated in the present work.
Limitations of this study include its retrospective, single-institution design. Patients who did not undergo biopsy were excluded, although mpMRI is known to have a high negative predictive value, and the population included in this study is most likely to benefit from improvements in quantitative MRI. Imaging for this dataset was acquired on a single scanner. This study relied on PI-RADS interpretation per clinical routine, which reflects real-world practice at our institution but may differ from the centralized review by one or two readers. Biopsy as the gold standard is also a limitation (some cancers may be missed), although this also reflects real-world performance; neither prostatectomy nor template-mapping biopsy is offered for routine diagnosis. In a post hoc subset analysis, the main findings were unchanged when evaluating only those who did not have a prostatectomy (results not shown). We could not adequately evaluate lesion-level performance because the retrospective analysis does not permit histopathologic verification of lesions detected by RSI rs , although the patient-level decision of whether to biopsy is the most important clinical use case of MRI [1,2], and voxel-level performance with RSI rs was quite good in prior studies [17,18].

Conclusions
In an independent validation, the performance achieved by RSI rs for patient-level detection of csPCa was superior to that of conventional ADC and comparable with that of routine, clinical PI-RADS. The combination of PI-RADS and RSI rs may perform better than either RSI rs or PI-RADS alone. These patterns held true within the transition zone, a region known to be more challenging for standard mpMRI. RSI rs holds promise as a quantitative marker and should be studied prospectively for improvement of PCa diagnosis.
... and also serves on its Scientific Advisory Board. These companies might potentially benefit from the research results. The terms of this arrangement have been reviewed and approved by the University of California San Diego in accordance with its conflict-of-interest policies.
Funding/Support and role of the sponsor: This work was supported, in part, by the National Institutes of Health (NIH/NIBIB K08 EB026503), the American Society for Radiation Oncology, and the Prostate Cancer Foundation.