Published online Apr 27, 2020.
https://doi.org/10.3348/kjr.2019.0842
The Diagnostic Performance of the Length of Tumor Capsular Contact on MRI for Detecting Prostate Cancer Extraprostatic Extension: A Systematic Review and Meta-Analysis
Abstract
Objective
The purpose was to review the diagnostic performance of the length of tumor capsular contact (LCC) on magnetic resonance imaging (MRI) for detecting prostate cancer extraprostatic extension (EPE).
Materials and Methods
PubMed and EMBASE databases were searched up to March 24, 2019. We included diagnostic accuracy studies that evaluated LCC on MRI for EPE detection using radical prostatectomy specimen histopathology as the reference standard. Quality of studies was assessed using the Quality Assessment of Diagnostic Accuracy Studies-2 tool. Sensitivity and specificity were pooled and graphically presented using hierarchical summary receiver operating characteristic (HSROC) plots. Meta-regression and subgroup analyses were conducted to explore heterogeneity.
Results
Thirteen articles with 2136 patients were included. Study quality was generally good. Summary sensitivity and specificity were 0.79 (95% confidence interval [CI] 0.73–0.83) and 0.67 (95% CI 0.60–0.74), respectively. Area under the HSROC was 0.81 (95% CI 0.77–0.84). Substantial heterogeneity was present among the included studies according to Cochran's Q-test (p < 0.01) and Higgins I2 (62% and 86% for sensitivity and specificity, respectively). In terms of heterogeneity, measurement method (curvilinear vs. linear), prevalence of Gleason score ≥ 7, MRI readers' experience, and endorectal coils were significant factors (p ≤ 0.01), whereas method to determine the LCC threshold, cutoff value, magnet strength, and publication year were not (p = 0.14–0.93). Diagnostic test accuracy estimates were comparable across all assessed MRI sequences.
Conclusion
Greater LCC on MRI is associated with a higher probability of prostate cancer EPE. Due to heterogeneity among the studies, further investigation is needed to establish the optimal cutoff value for each clinical setting.
INTRODUCTION
Extraprostatic extension (EPE) in prostate cancer patients is associated with adverse oncological outcomes such as post-treatment biochemical recurrence, development of metastasis, and decreased survival (1, 2, 3). Recognition of the presence of EPE is critical in patients treated with radical prostatectomy (RP) (e.g., whether to perform nerve-sparing procedures) and in treatment planning for patients who undergo radiotherapy (4). However, it is challenging to accurately predict EPE based only on clinical assessment using digital rectal examination, biopsy Gleason scores, and/or prostate-specific antigen levels (5, 6).
Multiparametric magnetic resonance imaging (mp-MRI) has been widely utilized for the detection, local staging, and treatment planning in patients with prostate cancer (7). However, the accuracy of mp-MRI in determining EPE has been variable among studies (8). This may stem from the fact that EPE evaluation on MRI has traditionally been based on the subjective assessment of imaging findings of abutment, irregularity of the prostate capsule, bulging, and neurovascular bundle thickening on T2-weighted images (T2WIs). Poor interreader agreement and dependence on the level of experience render consistent reporting of EPE among radiologists difficult (9, 10). To overcome this shortcoming of subjective EPE evaluation, objective and quantitative measures have been introduced for predicting EPE on mp-MRI, including calculation of apparent diffusion coefficient (ADC) values of the dominant lesion or standardizing interpretation and reporting using the Prostate Imaging Reporting and Data System (PI-RADS) (11, 12).
The length of tumor capsular contact (LCC), defined as the length of prostate tumor in contact with the capsule, has been proposed as an independent and reproducible predictor of EPE (13, 14, 15, 16). LCC showed improved accuracy in predicting EPE compared to the previously reported qualitative MRI findings of bulging or irregularity of the capsule (13). However, its adoption in clinical practice has been slow for several reasons. Prior studies have been based on small numbers of patients and have assessed LCC using different thresholds (6–20 mm) using various MRI sequences (i.e., T2WI, ADC, or dynamic contrast-enhanced [DCE] MRI) (13, 14, 15, 16). The purpose of this study was to systematically review the literature and perform a meta-analysis regarding the diagnostic performance of LCC on MRI for detecting EPE in prostate cancer.
MATERIALS AND METHODS
This study was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis of Diagnostic Test Accuracy guidelines (17). A research question based on the patient, index test, comparator, outcome, and study design (PICOS) criteria was formulated as follows: what is the diagnostic performance of LCC on MRI for predicting EPE in prostate cancer patients, as compared with that of histopathological results after RP, in original articles?
Literature Search
PubMed and EMBASE databases were systematically searched up to March 24, 2019. The following keywords and related terms were included in the search query: prostat* AND (“magnetic resonance” OR MR OR MRI) AND (capsul* OR contact) AND (extracapsular OR extraprostatic). The references of identified articles were screened to find other eligible studies.
Inclusion Criteria
Studies were considered eligible for the meta-analysis if they met the following PICOS criteria (18): 1) patients diagnosed with prostate cancer; 2) LCC on MRI used for EPE detection as an index test; 3) histopathological results after RP as comparator; 4) EPE as outcome; and 5) original articles as type of study.
Exclusion Criteria
Exclusion criteria were 1) studies with fewer than ten patients; 2) publication type other than original articles; 3) studies focusing on different topics (i.e., diagnostic accuracy of other MRI findings for EPE prediction); 4) overlapping patient populations; and 5) insufficient data necessary for meta-analytic pooling (even after attempts to contact the authors). If overlap was present among multiple publications, the study with the largest patient cohort was included. Two reviewers performed the literature search and study selection independently. Consensus was reached after discussion with a third reviewer.
Data Extraction and Quality Assessment
Data regarding patient, study, and MRI characteristics were extracted using a standardized form. Methodological quality of the included studies was assessed using the Quality Assessment of Diagnostic Accuracy Studies-2 tool (18). Two reviewers independently performed both data extraction and quality assessment followed by discussion with a third reviewer in cases of disagreement.
Data Synthesis and Analysis
Sensitivity and specificity were calculated from reconstructed data from the included studies in 2 × 2 tables (true positive, false negative, false positive, and true negative). Results from the more experienced reader were used for the meta-analysis if results from multiple independent readers were available.
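As a minimal illustration of how sensitivity and specificity follow from a reconstructed 2 × 2 table, the sketch below computes both from the four cell counts. The function name and the example counts are hypothetical and are not taken from any included study.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) from 2 x 2 table counts."""
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one EPE-positive and one EPE-negative patient")
    sensitivity = tp / (tp + fn)  # share of pathological EPE cases flagged by LCC
    specificity = tn / (tn + fp)  # share of organ-confined cases not flagged
    return sensitivity, specificity


# Hypothetical counts, for illustration only
sens, spec = sensitivity_specificity(tp=40, fn=10, fp=20, tn=30)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```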
The pooled sensitivity and specificity and their 95% confidence intervals (CIs) were calculated using the bivariate random effects model (19). A hierarchical summary receiver operating characteristic (HSROC) curve with a 95% confidence region and prediction region was presented graphically to display the results (20). Publication bias was assessed using the Deeks' funnel plot and Deeks' asymmetry test (21).
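The quantities behind a Deeks' funnel plot can be sketched as follows. This assumes the commonly described formulation, in which the log diagnostic odds ratio (ln DOR) is regressed on 1/√ESS with weights equal to the effective sample size, ESS = 4·n1·n2/(n1 + n2). The helper names, the 0.5 continuity correction, and the restriction to the slope estimate (no p value) are assumptions of this sketch; it is not the implementation used for the analyses reported here.

```python
import math


def effective_sample_size(n_diseased, n_non_diseased):
    """ESS = 4 * n1 * n2 / (n1 + n2)."""
    return 4.0 * n_diseased * n_non_diseased / (n_diseased + n_non_diseased)


def log_dor(tp, fn, fp, tn):
    """Natural log of the diagnostic odds ratio (TP * TN) / (FP * FN).

    Adds 0.5 to every cell when any cell is zero (assumed continuity correction).
    """
    if min(tp, fn, fp, tn) == 0:
        tp, fn, fp, tn = (c + 0.5 for c in (tp, fn, fp, tn))
    return math.log((tp * tn) / (fp * fn))


def weighted_slope(x, y, w):
    """Slope of the weighted least-squares line of y on x."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


def deeks_slope(tables):
    """tables: list of (tp, fn, fp, tn) tuples, one per study."""
    ess = [effective_sample_size(tp + fn, fp + tn) for tp, fn, fp, tn in tables]
    x = [1.0 / math.sqrt(e) for e in ess]
    y = [log_dor(*t) for t in tables]
    return weighted_slope(x, y, ess)
```

A slope close to zero indicates little small-study asymmetry; the formal test additionally requires the standard error of the slope, which is omitted here.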
Heterogeneity was determined using both Cochran's Q test, with p < 0.05 indicating statistical significance (22), and the inconsistency index (I2), interpreted using the following criteria (23): 0–40%, heterogeneity might not be important; 30–60%, moderate heterogeneity may be present; 50–90%, substantial heterogeneity may be present; and 75–100%, considerable heterogeneity. The threshold effect was visually assessed with coupled forest plots of sensitivity and specificity, and a Spearman correlation coefficient greater than 0.6 between sensitivity and false-positive rate was considered to suggest a considerable threshold effect (24).
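The relationship between Cochran's Q and I2 can be sketched as below, using the standard definition I2 = max(0, (Q − df)/Q) × 100 with df equal to the number of studies minus one. The function names and the example Q value are hypothetical; the inputs are assumed to be study-level effects (for instance, logit-transformed sensitivities) with known within-study variances.

```python
def cochran_q(effects, variances):
    """Cochran's Q: inverse-variance weighted squared deviations from the pooled effect."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def i_squared(q, n_studies):
    """Higgins I2 in percent, with df = n_studies - 1."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical example: Q = 24 across 13 studies (df = 12)
print(i_squared(24.0, 13))
```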
Meta-regression analysis was conducted to explore the cause of heterogeneity using the following categories: 1) method to determine threshold of LCC (receiver operating characteristics [ROCs] curve vs. others); 2) LCC cutoff value (> 10 mm vs. ≤ 10 mm based on 10 mm in PI-RADS version 2 and > 12 vs. ≤ 12 mm dichotomized by the median of included studies); 3) LCC measurement method (curvilinear vs. linear); 4) prevalence of high Gleason score (≥ 7) on biopsy (≥ 75.4% [median of included studies] vs. < 75.4%); 5) magnetic field strength (3- vs. 1.5-Tesla [T]); 6) experience of MR readers (≥ 10 years vs. < 10 years); 7) use of endorectal coils; and 8) publication year (before 2000 vs. after 2000). Additional subgroup analyses stratified according to the MRI sequences that were used to measure LCC (T2WI, ADC, and DCE) were performed.
The “midas” module in Stata 10.0 (StataCorp LLC, College Station, TX, USA) and “mada” package in R software version 3.5.1 (R Foundation for Statistical Computing, Vienna, Austria) were used for statistical analyses with p < 0.05 suggesting statistical significance.
RESULTS
Literature Search
Initially, 176 studies were found in the systematic literature search. After removing 57 duplicates, screening of the 119 titles and abstracts yielded 30 potentially eligible studies. After full-text reviews, 17 studies were excluded for the following reasons: insufficient data to reconstruct 2 × 2 tables (n = 3), focusing on overall local staging of prostate cancer (n = 7), and assessment of other MRI findings as a predictor of EPE (n = 7). Ultimately, 13 original articles including 2136 patients assessing the diagnostic performance of LCC on MRI for detection of EPE in prostate cancer patients were analyzed (13, 14, 15, 16, 25, 26, 27, 28, 29, 30, 31, 32, 33). Figure 1 summarizes the detailed study selection process.
Fig. 1
Flow diagram describing study selection process for meta-analysis.
EPE = extraprostatic extension
Characteristics of Included Studies
The patient characteristics are shown in Table 1. The number of patients per study ranged from 30 to 553. Seven studies reported the Gleason score from biopsy with a median value of 7. Pathological T stage in addition to histopathological EPE status was reported in seven studies.
Table 1
Patient Characteristics of Included Studies
The study characteristics are described in Table 2. Only one study was prospective in design, but LCC measurement was performed in a retrospective manner. In terms of the methods to determine optimal LCC threshold, a ROC curve was utilized in ten studies, 10 mm as stated in the PI-RADS v2 guideline in one study, and the method was unclear in two studies. All studies evaluated LCC on T2WI. LCC was additionally measured on ADC in three studies, on DCE in two studies, and the maximum value among all sequences in one study. The threshold value of LCC ranged from 6 to 20 mm. LCC was measured using a curvilinear ruler tool in nine studies, linear in three, and the method was not specified in one.
Table 2
Study Characteristics of Included Studies
The MRI characteristics are summarized in Table 3. Eight studies used 3T scanners and five studies used 1.5T scanners. The experience of MRI readers ranged from 0.5 to 22 years. An endorectal coil was utilized in five studies and was not used in eight.
Table 3
MRI Characteristics of Included Studies
Quality Assessment
In general, the quality of the studies was considered good, with 10 of the 13 studies satisfying five or more of the seven domains (Fig. 2). In the patient selection domain, two studies had an unclear risk of bias, as they did not describe whether patients were consecutively enrolled (15, 32). Regarding the index test domain, there was a high risk of bias in two studies that did not explicitly mention how the LCC cutoff value was determined (31, 33). There was an unclear risk of bias in three studies as it was not explicit whether MRI was read blinded to clinicopathological information (13, 27, 28). Regarding reference standard domain, there was an unclear risk of bias in eight studies as it was unclear whether pathologists were blinded to the MRI interpretation (13, 15, 25, 26, 27, 28, 31, 32). Regarding the flow and timing domain, five studies had an unclear risk of bias as the interval between MRI and surgery was not provided (27, 28, 29, 30, 31).
Fig. 2
Grouped bar charts show risk of bias (left) and concerns of applicability (right) for 13 studies using QUADAS-2 tool.
QUADAS-2 = Quality Assessment of Diagnostic Accuracy Studies-2
Diagnostic Performance of LCC on MRI for Detection of EPE
The sensitivities and specificities of the individual studies ranged from 59% to 91% and from 44% to 88%, respectively. The Q-test demonstrated that heterogeneity was present (p < 0.01). The Higgins I2 statistics demonstrated substantial heterogeneity in terms of both the sensitivity (I2 = 62%) and specificity (I2 = 86%). A threshold effect was not evident based on the coupled forest plots (Fig. 3) and the Spearman correlation coefficient between the sensitivity and false-positive rate (−0.174 [95% CI −0.661–0.417]).
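The threshold-effect check can be illustrated with a short sketch that computes the Spearman correlation between study-level sensitivity and false-positive rate (1 − specificity) and compares it with the 0.6 criterion stated in the methods. The function names and any input values are hypothetical; per-study estimates are not reproduced here.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values given the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def suggests_threshold_effect(sensitivities, specificities, cutoff=0.6):
    """True when rho between sensitivity and false-positive rate exceeds the cutoff."""
    false_positive_rates = [1.0 - s for s in specificities]
    return spearman_rho(sensitivities, false_positive_rates) > cutoff
```

Under this criterion, the coefficient of −0.174 reported above falls well below 0.6 and would not be flagged.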
Fig. 3
Coupled forest plots of summary sensitivity and specificity.
Numbers are pooled estimates with 95% CIs in parentheses. Corresponding heterogeneity statistics are provided in bottom right corners. Horizontal lines indicate 95% CIs. CI = confidence interval
For the 13 studies combined, the summary sensitivity was 0.79 (95% CI 0.73–0.83) with a specificity of 0.67 (95% CI 0.60–0.74). Summary positive likelihood ratio and negative likelihood ratio were 2.4 (95% CI 1.9–3.0) and 0.31 (95% CI 0.24–0.41), respectively. A large difference between the 95% confidence and prediction regions was noted in the HSROC curve, also suggesting heterogeneity between studies (Fig. 4). The area under the HSROC curve was 0.81 (95% CI 0.77–0.84). The likelihood of publication bias was low according to Deeks' funnel plot, with a p-value of 0.69 for the slope coefficient (Fig. 5).
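The summary likelihood ratios are consistent with the textbook definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity applied to the summary point estimates. The sketch below shows this arithmetic and, as an illustration only, how a likelihood ratio converts a pre-test probability into a post-test probability. The function names and the pre-test probability of 0.30 are hypothetical, and the point-estimate calculation is an approximation rather than the model-based estimation that yields the confidence intervals.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Summary estimates reported in this study
lr_pos, lr_neg = likelihood_ratios(0.79, 0.67)
print(round(lr_pos, 1), round(lr_neg, 2))

# Hypothetical pre-test probability of EPE, for illustration only
print(round(post_test_probability(0.30, lr_pos), 2))
```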
Fig. 4
HSROC curve of diagnostic performance of length of tumor capsular contact on MRI for extracapsular extension prediction in prostate cancer patients.
HSROC = hierarchical summary receiver operating characteristic
Fig. 5
Deeks' funnel plot.
Likelihood of publication bias was low with p value of 0.69 for slope coefficient. ESS = effective sample size
Heterogeneity Exploration
The results of meta-regression analysis are provided in Table 4. Among the several covariates evaluated, LCC measurement method, prevalence of high Gleason score on biopsy, experience of MR readers, and use of endorectal coils were significant factors affecting heterogeneity (p ≤ 0.01). However, when comparing sensitivity and specificity estimates among these subgroups, significant and clinically meaningful differences were only seen regarding endorectal coil usage. Studies that used endorectal coils showed a significantly lower sensitivity (0.72 [95% CI 0.66–0.77]) compared with those not using endorectal coils (0.83 [95% CI 0.79–0.87], p = 0.001). For other subgroup comparisons, there was substantial overlap in the 95% CIs of the sensitivities and specificities (p = 0.07–0.83 and 0.02–0.81 for sensitivity and specificity, respectively). The method used to determine the LCC threshold, LCC cutoff value, magnet strength, and publication year were not significant factors affecting the heterogeneity (p = 0.14–0.93).
Table 4
Meta-Regression Analyses Stratified by Multiple Variables
Additional subgroup analyses revealed both sensitivity and specificity estimates were comparable regardless of the MRI sequences used to measure LCC as follows: T2WI (n = 13) (sensitivity 0.78 [95% CI 0.73–0.82], specificity 0.68 [95% CI 0.59–0.75]); ADC (n = 3) (sensitivity 0.77 [95% CI 0.66–0.86], specificity 0.68 [95% CI 0.56–0.77]); DCE (n = 2) (sensitivity 0.80 [95% CI 0.66–0.90], specificity 0.67 [95% CI 0.56–0.77]), and maximum length from all sequences (n = 1) (sensitivity 0.84 [95% CI 0.71–0.93], specificity 0.76 [95% CI 0.68–0.83]).
DISCUSSION
In the current study, we performed a meta-analysis on the diagnostic performance of LCC on MRI for the detection of EPE in prostate cancer. Overall, the diagnostic performance of LCC was moderate with sensitivity and specificity estimates of 0.79 (95% CI 0.73–0.83) and 0.67 (95% CI 0.60–0.74), respectively. Nevertheless, it is noteworthy that the sensitivity was relatively higher than results based on the subjective assessment of EPE (pooled sensitivity of 0.57) as reported in a meta-analysis by de Rooij et al (8). In fact, it has been shown that additionally using LCC along with other indirect criteria for EPE resulted in increased sensitivity for detecting EPE (57.4% to 83.9%) compared with that when using only direct criteria (i.e., focal capsular irregularity/disruption or neurovascular bundle invasion) (34). This may have important clinical implications as setting either a high sensitivity or specificity reading could be applied to different clinical settings (35). High sensitivity is required when selecting optimal patients to enroll in active surveillance or choosing candidates for RP with neurovascular bundle sparing. On the other hand, high specificity could be favored when there is a need to avoid withholding potential curative treatment (11). Therefore, based on the results of our study, we believe that LCC on MRI can provide incremental value in the management of patients with prostate cancer, especially in clinical settings where high sensitivity for predicting EPE is needed.
Substantial heterogeneity existed among the included studies regarding MRI sequences, threshold values and how they were determined, and methods for measuring LCC. However, all sequences, including T2WI, ADC, and DCE, showed similar diagnostic performance (sensitivities and specificities of 0.77–0.80 and 0.67–0.68, respectively). Although not significantly different, the sensitivity and specificity (0.84 and 0.76, respectively) using the maximum value from all sequences were slightly greater in one study (16). This may be related to the fact that tumor measurements on MRI tend to be underestimated compared with those performed using pathological specimens, and using the maximum value could potentially lessen the degree of MRI-based underestimation of LCC relative to the pathological LCC (36). However, there are also studies stating that MRI-based LCC correlates well with pathological LCC without any overestimation or underestimation (14, 37). Therefore, conclusions regarding the incremental value of using the maximum value cannot be drawn, as there are not enough data to support them, and further studies are warranted.
Although different cutoff values ranging from 6 to 20 mm were used, this was not shown to affect the overall heterogeneity (p = 0.93 and 0.88 using cutoff values of 10 and 12 mm, respectively). Moreover, a threshold effect, which is generally expected in meta-analyses using thresholds for a continuous variable like LCC, was not observed. Still, increasing LCC has been shown to be associated with a greater probability of EPE (30, 38). For example, in the study by Masumoto et al. (30, 38), every 1-mm increase in LCC was associated with a 13% increase in the odds for EPE. In addition, there are studies that suggest anterior tumors are less aggressive and that different threshold values, specifically, a less strict criterion using higher cutoff values, should be used for anterior tumors compared with those used for posterior tumors (38, 39). Therefore, the cutoff value should be tailored to several factors, such as location of the tumor and the clinical setting (e.g., using a lower cutoff to detect EPE more sensitively). Curvilinear measurement of LCC on MRI, theoretically, may reflect pathological LCC better than linear measurement, and was a significant factor affecting heterogeneity. However, there was no significant difference in the sensitivity or specificity; thus, it may seem reasonable to use any available ruler tool that is provided by the image viewing software.
Sensitivity and specificity from more experienced readers were not significantly better than those from less experienced readers. At first, this may be unexpected, as it has been shown in the literature that the accuracy of EPE assessment using prostate MRI is dependent on experience (9, 40). However, those studies were based on subjective assessment of EPE, while LCC can be considered more objective and quantitative, rendering it less dependent on experience and potentially enhancing the reproducibility between readers with different experience levels. In fact, substantial to almost perfect agreement (kappa values of 0.70–0.98 and intraclass correlation coefficients of 0.979–0.983) for measuring LCC was shown in the majority of studies assessing interreader agreement (14, 15, 29, 32, 33).
In this meta-analysis, technical aspects of MRI were investigated. Magnet strength (3- vs. 1.5T) did not have a statistically significant effect on either sensitivity (0.78 and 0.81 for 3- and 1.5T, respectively) or specificity (0.68 and 0.66, respectively). This contrasts with results of a previous meta-analysis assessing the performance of MRI for detecting EPE using subjective assessment (sensitivity of 0.61 and 0.55 for 3- and 1.5T, respectively) (8). Detection of subtle capsular irregularity or small foci of extracapsular tumor may require high spatial resolution, which is easier to obtain at higher magnetic strengths; however, it seems that LCC may be less affected by magnet strength when ≥ 1.5T scanners are used. Therefore, both 1.5- and 3T scanners may provide comparable and objective LCC measurement on MRI provided MRI protocols are optimized. Studies with endorectal coils showed significantly lower sensitivity compared with those without. We speculate that this may stem from the fact that endorectal coils can lead to deformation of the prostatic contour possibly influencing measurement of LCC (41).
In the present study, prevalence of high Gleason tumors was a factor affecting heterogeneity, but no significant differences were observed in either sensitivity or specificity estimates. Nevertheless, in one of the included studies that additionally evaluated LCC stratified by tumor grade, mean LCC was significantly larger for higher grade tumors (15.3 mm vs. 9.0 mm, p = 0.0001) (25). Furthermore, for the LCC criterion of < 10 mm, 41.6% of higher-grade tumors still had EPE compared with only 2.8% of lower grade tumors (25). Bakir et al. (37) recently stated pathology-based LCC cutoff values decreased as the International Society of Urological Pathology grade group increased in terms of EPE positivity, substantiating the possible influence of tumor grade on the LCC-EPE relationship. Further studies are warranted to investigate whether different threshold values are required for tumors with different Gleason scores.
Our study has several limitations. First, LCC was measured retrospectively in all included studies, including the one with a prospective design, which may have introduced selection bias. Second, as full organ histopathological correlation was required to evaluate pathological EPE, we solely included patients who underwent RP. Therefore, caution is needed when applying these results to a non-surgical population (i.e., radiotherapy, active surveillance, or recurrence). Third, the heterogeneity was substantial between the studies. Although we performed meta-regression and subgroup analyses to identify potential factors attributable to this heterogeneity, some of it remains unexplained. Fourth, although we were able to derive some conclusions, for instance, that LCC can be measured using any MRI sequence with either curvilinear or linear tools, the optimal threshold value of LCC could not be established. The cutoff value should be tailored to the likelihood of EPE and to the clinician's and patient's preferences for management.
In conclusion, greater LCC on MRI was associated with a higher probability of EPE. Despite its overall moderate diagnostic performance, the relatively higher sensitivity compared with that of conventional subjective assessment might be of incremental value for helping select candidates for active surveillance or functional preservation treatments by avoiding underestimation of the disease. Furthermore, as LCC is relatively simple to measure and is less dependent on reader experience, it can be considered a reproducible and objective quantitative predictor in the assessment of EPE in prostate cancer. However, further studies are needed to establish the optimal cutoff value for each clinical setting.
Conflicts of Interest: Since May 2017, Dr. Hricak has served on the Board of Directors of Ion Beam Applications (IBA), a publicly traded company, and she receives annual compensation for her service. Furthermore, Dr. Hricak is a member of the External Advisory Board of the University of Michigan Comprehensive Cancer Center, the International Advisory Board of the University of Vienna, Austria, and the Scientific Committee of the DKFZ (German Cancer Research Center), Germany; she does not receive financial compensation for any of these roles. Otherwise, we do not have any other conflict of interest to disclose.
References
Mikel Hubanks J, Boorjian SA, Frank I, Gettman MT, Houston Thompson R, Rangel LJ, et al. The presence of extracapsular extension is associated with an increased risk of death from prostate cancer after radical prostatectomy for patients with seminal vesicle invasion and negative lymph nodes. Urol Oncol 2014;32:26.e1–26.e7.
Boccon-Gibod L, Bertaccini A, Bono AV, Dev Sarmah B, Höltl W, Mottet N, et al. Management of locally advanced prostate cancer: a European consensus. Int J Clin Pract 2003;57:187–194.
Higgins JPT, Green S. Cochrane handbook for systematic reviews of interventions, Version 5.1.0. Cochrane Collaboration; 2011 [updated March 2011]. [Accessed March 21, 2019]. Available at: https://training.cochrane.org/handbook/archive/v5.1/.
Costa DN, Passoni NM, Leyendecker JR, de Leon AD, Lotan Y, Roehrborn CG, et al. Diagnostic utility of a Likert scale versus qualitative descriptors and length of capsular contact for determining extraprostatic tumor extension at multiparametric prostate MRI. AJR Am J Roentgenol 2018;210:1066–1072.
Kongnyuy M, Sidana A, George AK, Muthigi A, Iyer A, Ho R, et al. Tumor contact with prostate capsule on magnetic resonance imaging: a potential biomarker for staging and prognosis. Urol Oncol 2017;35:30.e1–30.e8.
Van Holsbeeck A, Degroote A, De Wever L, Vanhoutte E, De Keyzer F, Van Poppel H, et al. Staging of prostatic carcinoma at 1.5-T MRI: correlation of a simplified MRI exam with whole-mount radical prostatectomy specimens. Br J Radiol 2016;89:2016010
Cornud F, Rouanne M, Beuvon F, Eiss D, Flam T, Liberatore M, et al. Endorectal 3D T2-weighted 1mm-slice thickness MRI for prostate cancer staging at 1.5Tesla: should we reconsider the indirects signs of extracapsular extension according to the D'Amico tumor risk criteria? Eur J Radiol 2012;81:e591–e597.