Diagnostic accuracy of high b-value diffusion weighted imaging for patients with prostate cancer: a diagnostic comprehensive analysis

We performed a meta-analysis to assess the diagnostic accuracy of high b-value diffusion-weighted imaging for patients with prostate cancer. A comprehensive literature search of the PubMed, Excerpta Medica Database, Cochrane Library, China National Knowledge Infrastructure, China Biology Medicine disc, and Wanfang databases from January 1, 1995, to April 30, 2021, was conducted. The quality of the retrieved papers was assessed using the Quality Assessment of Diagnostic Accuracy Studies 2. The sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, diagnostic odds ratio, and their 95% confidence intervals (CIs) were evaluated using bivariate mixed-effects models. A total of twenty-four articles matched the selection criteria and were finally included after screening the titles, abstracts, and full texts of 641 initial articles. The pooled sensitivity and specificity (95% CI) were 0.84 (0.80–0.87) and 0.87 (0.81–0.91), respectively. The pooled positive and negative likelihood ratios (95% CI) were 6.4 (4.4–9.3) and 0.19 (0.16–0.23), respectively. The diagnostic odds ratio was 34 (95% CI: 22–51). The area under the summary receiver operating characteristic curve was 0.91 (95% CI: 0.88–0.93). Subgroup analyses yielded similar results. The diagnostic accuracy of high b-value diffusion-weighted imaging was similarly high in the qualitative and quantitative evaluation of prostate cancer.


INTRODUCTION
Prostate cancer is the second most commonly diagnosed cancer and the sixth leading cause of cancer-related death in men [1]. Early diagnosis and treatment are of great importance. Serum prostate-specific antigen (PSA) detection is the primary option for prostate cancer screening [2]. However, PSA is specific to prostate tissue rather than tumor tissue, and prostatitis, urinary tract infection and even prostate massage can lead to an increase in PSA levels. It has been reported that the specificity of PSA is very low when 4.0 ng/ml serum PSA is used as a threshold [3]. PSA detection alone may cause a high false-positive rate and lead to a large number of unnecessary biopsies [4]. Therefore, PSA-based screening for prostate cancer is controversial.
In clinical practice, magnetic resonance imaging (MRI) has been widely used to detect prostate cancer. It differs from other techniques such as computed tomography and ultrasound in that it produces excellent soft tissue contrast without harmful ionizing radiation; MRI also provides imaging evidence for the clinical examination of prostate cancer location, staging, postoperative follow-up, and the evaluation of tumor invasion [5]. MRI mainly includes five imaging parameters: T1-weighted imaging, T2-weighted imaging, diffusion-weighted imaging (DWI), magnetic resonance spectroscopy, and dynamic contrast-enhanced imaging. DWI can estimate morphological changes in prostate tissue that occur with the induction of plasticity by probing water diffusion and can qualitatively and quantitatively evaluate the cellular and histological structure of prostate cancer [6]. The b-value is one of the primary parameters influencing DWI results. According to the Prostate Imaging Reporting and Data System version 2, high b-values (1,400-2,000 s/mm²) are favored over standard b-values (800-1,000 s/mm²) for improving tumor detection since they can qualitatively distinguish lesions from normal prostate tissue. The latter shows high signal intensity on DWI, which may not be suppressed even at a b-value of 1,000 s/mm², resulting in obscured prostate cancer [7,8]. Higher b-value DWI has been increasingly applied in clinical practice. Although a high b-value reduces the signal-to-noise ratio of images and may distort images, it can reduce the T2 shine-through effect and the contribution of microcirculation perfusion and more truly reflect the tissue and cytological structure. Because of conflicting results from qualitative and quantitative studies, it is not definitively known whether high b-value DWI improves the diagnostic accuracy for prostate cancer. The purpose of this meta-analysis was therefore to assess the diagnostic performance of high b-value DWI for detecting prostate cancer.

Search process and general characteristics
A flow chart of the study selection process is shown in Figure 1. A total of 641 articles were retrieved during the original search of publications from January 1, 1995, to December 31, 2020, and 233 remained after excluding duplicates among the different databases during the second search round. After excluding reviews, comments, case reports, and studies unrelated to our topics, 61 articles were reviewed. Twenty-four articles remained after omitting articles that did not mention diagnostic accuracy or that had insufficient data. The excluded studies with reasons are provided in Supplementary Material 1. The backgrounds and designs of these studies are shown in Table 1 and Table 2. Thirteen studies selected populations with PCa, and two studies were performed on suspected cases. The methods for identifying b-values differed: ten studies used motion-probing gradients, and five studies used signal extrapolation by fitting models. All DWI scans were acquired using single-shot spin-echo echo-planar imaging. The biopsy types included targeted biopsy, prostatectomy and systematic biopsy. The amount of tissue used was rarely reported in the studies. The primary characteristics of the selected studies are shown in Table 3. Among the enrolled studies, 17 were prospective studies and 7 were retrospective studies. Across the studies, a total of 1887 patients with 11374 lesions were analyzed. The mean age of all patients was 66.3 years. The MRI field intensity used in most of the studies was 3.0 T, with only three studies using a field intensity of 1.5 T. An endorectal coil was used in only two studies. Table 1 also lists the MRI suppliers, gold standards, and DWI diagnostic measures. Table 2 shows that each study contained a b-value of 2,000 s/mm², and the highest b-value was 4,500 s/mm². The ranges of the sensitivities and specificities in all studies were 44.0-98.6% and 50.0-99.4%, respectively.

Quality assessment
As shown in Figure 2 and Figure 3, the risk of bias was assessed in four domains (flow and timing, patient selection, index test, and reference standard), whereas applicability concerns were assessed in the last three domains but not in flow and timing. Only five studies had a high risk, and one was unclear in terms of the index test for both the risk of bias and applicability concerns. For the reference standard, one study was unclear; four studies were unclear in terms of flow and timing, and three studies were unclear in terms of the index test. Overall, the quality of the identified studies was high.

Pooling results
The Spearman correlation indicated no threshold effect (r = 0.317, P = 0.094). From the data obtained, we determined pooled sensitivity and specificity values of 0.84 (95% CI: 0.80-0.87) and 0.87 (95% CI: 0.81-0.91), respectively (Figure 4 and Figure 5). The AUC was 0.91 (95% CI: 0.88-0.93) (Figure 6). The PLR and NLR were 6.4 (95% CI: 4.4-9.3) and 0.19 (95% CI: 0.16-0.23), respectively, and the DOR was 34 (95% CI: 22-51). According to Fagan's nomogram (Figure 7), when the pretest probability was 20%, the corresponding post-test probability was 61% using the PLR and 5% using the NLR. The diagnostic performance was visualized by a likelihood ratio scattergram (Figure 8). All of these results suggest that the diagnostic accuracy of high b-value DWI for detecting prostate cancer was relatively high.
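The post-test probabilities read from a Fagan nomogram follow from the standard conversion between probability and odds. The following is a minimal sketch of that arithmetic, assuming the pooled likelihood ratios reported in the abstract (PLR 6.4, NLR 0.19) and a pretest probability of 20%; the function name `post_test_probability` is illustrative and is not part of the authors' Stata workflow.

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a post-test probability via odds."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Pooled likelihood ratios from the abstract; 20% pretest probability as in the text.
positive = post_test_probability(0.20, 6.4)   # odds 0.25 * 6.4 = 1.6
negative = post_test_probability(0.20, 0.19)  # odds 0.25 * 0.19 = 0.0475

print(f"Post-test probability after a positive result: {positive:.1%}")
print(f"Post-test probability after a negative result: {negative:.1%}")
```

With these inputs the positive result gives roughly 0.615 and the negative result roughly 0.045, in line with the approximately 61% and 5% reported.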

Subgroup analysis
We conducted subgroup analyses of six subgroups (study design, number of patients, mean age of the patients, MRI field intensity, b-value, and DWI diagnostic measures) to identify the sources of heterogeneity. All results are shown in Table 4. The study design was divided into prospective and retrospective studies, and the sensitivity, specificity, PLR, NLR, and DOR showed no significant differences, but the AUC was significantly different, suggesting that it was a cause of the heterogeneity. The sensitivity, [...] (Table 4). The heterogeneity was high within studies. The meta-regression indicated that publication year and population setting may cause the heterogeneity within studies (Table 5).

Sensitivity analysis and publication bias
The sensitivity analysis is presented in Figure 9. The goodness of fit (A) and bivariate normality (B) show how well the regression line fits the observed values. As shown, the observed values are distributed around the reference line and are stable. The influence analysis indicated that four studies may overestimate the pooled results. The outlier detection test indicated that two studies were out of the detection range. After excluding these studies, the pooled sensitivity and specificity did not change (results not shown). In addition, we constructed Deeks' funnel plot, which indicated that there was no publication bias (t = −1.21, p = 0.240) (Figure 10).

DISCUSSION
This meta-analysis compared twenty-four studies evaluating the use of high b-value DWI to diagnose prostate cancer. Importantly, the analysis indicated that DWI is crucial for diagnosing prostate cancer when using MRI. Compared with central lesions, peripheral lesions are easier to assess using DWI, which is consistent with the Prostate Imaging Reporting and Data System version 2 scoring system. The human prostate is a highly heterogeneous organ at the cellular level, and in many prostate pathologies the tissue structure changes during the early stages on a scale of micrometers or smaller. The ability of DWI to detect prostate cancer relies on the shrinkage of glands, tight cell arrangement, and increased parenchyma density, such that water diffuses more slowly in prostate cancer tissues than in normal prostate tissues [33][34][35], [37][38][39]. The b-value, which needs to be selected carefully in clinical applications, is a key parameter reflecting the sensitivity of DWI for detecting diffusional movements. A high b-value can better distinguish cancerous tissues from normal tissues; however, it also has some disadvantages, such as reducing the image signal-to-noise ratio, which can obscure cancerous tissues [36]. Nevertheless, Siemens has reportedly developed the Readout Segmentation of Long Variable Echo-trains DWI technique, which uses multi-shot segmented readout for acquisition and k-space filling; this significantly shortens the echo time, reduces the echo spacing, and improves DWI image quality. A dedicated targeted uniformity technique can also optimize magnetic field uniformity, including in anatomically complex regions, which further improves DWI image quality. This technology has been validated in several tumor types [37][38][39]. There is no accepted standard range of b-values for DWI that is optimal for diagnosing prostate cancer.
Our results suggest that a high b-value is a robust tool for prostate cancer diagnosis. Regarding study design, the sensitivity of the prospective studies was superior to that of the retrospective studies, although the two designs showed no obvious differences overall. Furthermore, the DOR was higher for the prospective studies than for the retrospective studies, and the AUC showed a significant difference between the two groups. We therefore concluded that the study design was a source of the heterogeneity observed. MRI field intensity showed the most striking results in that the PLR and DOR were almost twice as high for the studies using 3.0 T as for those using 1.5 T, and the AUC showed a significant difference between the two groups. The MRI field intensity contributed more to the heterogeneity among studies than did the study design. The sensitivity of studies involving high b-values (2,000 s/mm²) was slightly lower and the specificity was slightly higher than that of studies involving ultrahigh b-values (> 2,000 s/mm²), and the DOR and AUC between the two groups showed no significant differences. With respect to the number of patients, the studies with > 50 patients (compared with ≤ 50 patients) showed a higher sensitivity, PLR, DOR, and AUC. The mean ages of the patients showed the same results as those for the number of patients, but the difference was not significant. We also performed subgroup analysis for qualitative and quantitative evaluation, and the studies based on ADC values (quantitative) versus visual evaluation (qualitative) showed no differences, suggesting that the evaluation method had little or no contribution to the heterogeneity among studies. Although visual evaluation relies on the experience and skills of the performer, it resulted in no overall changes. Although the diagnostic accuracy was almost identical, there were still differences in imaging characteristics.
A previous study found that the image deformation of DWI is smaller, the lesion contrast is higher in qualitative analysis, and the ADC value of DWI sequences shows better repeatability in quantitative analysis than standard DWI sequences [40]. The Prostate Imaging Reporting and Data System version 2.1 has recommended qualitative evaluation of DWI [41].
This study had some limitations. (1) The heterogeneity was high within studies. Even though we conducted subgroup analyses, the heterogeneity should be carefully considered. The meta-regression indicated that the year of publication and population setting can affect the estimates. In addition, human prostate cancer cells are heterogeneous, comprising a variety of cancer cells with phenotypic and functional differences, and this may generate heterogeneity. However, almost none of the studies reported prostate cancer cell types; they only distinguished PCa from other tissues. Further research is required. [...] The language of the searched studies was restricted to English and Chinese, which may have reduced the representativeness of the included studies. (4) The repeatability of parameters derived from DWI signal decay needs to be evaluated because high repeatability of measurements is a prerequisite for quantitative, patient-tailored treatment planning and therapy monitoring. Previous work found that the monoexponential model demonstrated the highest repeatability and clinical value in region-of-interest-based analysis of prostate cancer DWI. However, the included studies did not report the models used, and that previous work was based on b-values in the range of 0-500 s/mm². Evaluation of models based on high b-values is required [42].
In summary, high b-value DWI showed high diagnostic accuracy in the qualitative and quantitative evaluation of prostate cancer. We should consider the possibility of its clinical application, although studies with large sample sizes and higher quality are needed, particularly for quantitative evaluation. In addition, publication bias should be carefully considered when interpreting and applying our results.

METHODS
This meta-analysis was performed and reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines listed in the PRISMA statement [43]. The PRISMA statement is provided in Supplementary Material 5.

Literature search
A comprehensive systematic literature search in the PubMed, Excerpta Medica Database (EMBASE), Cochrane Library, China National Knowledge Infrastructure, China Biology Medicine disc, and Wanfang databases was conducted to identify studies investigating the diagnostic performance of high b-value DWI for detecting prostate cancer. The search query combined synonyms and related terms of prostate cancer ("prostate disease," "prostate tumor," "prostate lesions," and "PCa"), high b-value ("strong b-value," "multiple b-value," and "ultra-high b-value"), DWI, and diagnostic accuracy ("diagnostic performance," "sensitivity," "specificity," and "receiver operator characteristic curve"). Logical operators (AND, NOT, and OR) were then used to conduct comprehensive combinations of these terms. Details of the search strategy are provided in Supplementary Material 6. Studies were restricted to English and Chinese languages, and the time span of the studies was January 1, 1995, to April 30, 2021. The references included in the identified papers were also screened to expand the range of our search.

Selection criteria
Two reviewers (LC and LN) independently conducted the study selection. Controversies were settled by discussion.
Only studies that met all of the following criteria were chosen: (1) a retrospective or prospective design was used; (2) the purpose of the study was to evaluate the diagnostic value of high b-value DWI in prostate cancer alone, or data for assessing the accuracy of high b-value DWI for prostate cancer could be extracted; (3) the study used b-values ≥ 2,000 s/mm²; (4) the study included ≥ 30 patients; (5) histopathological results (as the gold standard) were available for all patients; and (6) sufficient information was provided to establish 2 × 2 contingency tables and to calculate the sensitivity and specificity for detecting prostate cancer. Studies were excluded if they satisfied any of the following criteria: (1) reviews, case reports, dissertations, or unpublished articles; (2) inclusion of animal experimental data; and (3) combination with other MRI modalities (T2-weighted or dynamic contrast-enhanced imaging) to evaluate the diagnostic performance of high b-value DWI for prostate cancer.

Data extraction
One reviewer collected the data from the included studies using standardized tables, and the other reviewer double-checked this process. The following information was collected from the studies: author(s), country, study design, numbers of patients and lesions, mean age of the patients, PSA level range, MRI field intensity, MRI supplier, coil type, b-value, DWI diagnostic measures, population setting, biopsy type, diffusion times, DWI postprocessing, evaluation type (quantitative or qualitative), and methods for identifying the b-value. In addition, the numbers of true positive (TP), true negative (TN), false negative (FN), and false positive (FP) cases were collected to calculate the sensitivity and specificity. Disagreements in the data extraction findings were resolved via discussion or adjudication with a third reviewer.
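The per-study accuracy measures follow directly from the four extracted cell counts. The sketch below shows the standard definitions; the function name `diagnostic_metrics` and the example counts are hypothetical and are not taken from any included study. Zero cells would need a continuity correction, which is omitted here.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard accuracy measures from a 2 x 2 contingency table."""
    sensitivity = tp / (tp + fn)            # diseased cases correctly detected
    specificity = tn / (tn + fp)            # non-diseased cases correctly excluded
    plr = sensitivity / (1.0 - specificity)  # positive likelihood ratio
    nlr = (1.0 - sensitivity) / specificity  # negative likelihood ratio
    dor = plr / nlr                          # equals (tp * tn) / (fp * fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "plr": plr,
        "nlr": nlr,
        "dor": dor,
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=84, fp=13, fn=16, tn=87)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```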

Quality evaluation
Each paper's quality was assessed using the Quality Assessment of Diagnostic Accuracy Studies 2, a validated tool specifically designed to evaluate diagnostic accuracy studies via four domains: flow and timing, patient selection, index test, and reference standard. A risk of bias existed in all four domains, but applicability concerns existed in only the last three domains. Both risk of bias and applicability concerns were graded as low, unclear, or high [44]. This step was conducted independently by two reviewers, and controversies were settled by discussion or by consulting a third party.

Statistical analysis
We used an exact binomial rendition of the bivariate mixed-effects regression model developed by van Houwelingen for treatment trial meta-analysis and modified for the synthesis of diagnostic test data [45]. The pooled sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR), and area under the SROC curve (AUC) along with 95% confidence intervals (CIs) were determined using Stata 14.0 software (https://www.stata.com/). The heterogeneity among studies was quantified using the Q test and I² statistic [46]. The Q test was defined by Cochran and is calculated by summing the squared deviations of each study's effect estimate from the overall effect estimate, weighting the contribution of each study by its inverse variance. The I² index measures the extent of true heterogeneity, dividing the difference between the result of the Q test and its degrees of freedom by the Q value itself and multiplying by 100 [43]. An I² > 50% and P < 0.05 were considered to indicate heterogeneity. Subgroup analysis was used to evaluate the heterogeneity among groups. The study design, number of patients, mean age of the patients, MRI field intensity, DWI diagnostic measures, and b-value were compared by subgroup analyses. The aim of our study was to understand the effect of high b-values versus standard b-values on the diagnostic accuracy for prostate cancer; because the difference between high and ultrahigh b-values had not been analyzed further, subgroup analyses were conducted. The specific comparisons were (1) study design: prospective versus retrospective; (2) number of patients: ≤ 50 versus > 50; (3) mean age: ≤ 65 years versus > 65 years; (4) MRI field intensity: 1.5 T versus 3 T; (5) b-value: high (2,000) versus ultrahigh (> 2,000); and (6) DWI diagnostic measure: ADC values versus visual evaluations.
Fagan's nomogram was used to show the relationship among the pretest probability, likelihood ratio, and post-test probability. Publication bias was visualized using Deeks' funnel plot. Meta-regression was performed to explore the heterogeneity within studies. All statistical computations were conducted using Stata 14.0 software (Stata Corp, College Station, TX, USA), and the results were considered significant at P < 0.05.
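The Q test and I² index described in the statistical analysis can be sketched in a few lines. This is an illustrative implementation of the verbal definitions, not the authors' Stata code; the function name `cochran_q_and_i2` and the input values are hypothetical, and truncating I² at zero when Q falls below its degrees of freedom is a common convention assumed here.

```python
def cochran_q_and_i2(effects, variances):
    """Return Cochran's Q and the I-squared index (in percent)."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # Assumed convention: I-squared is set to 0 when Q <= df.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


# Hypothetical log diagnostic odds ratios and variances, for illustration only.
effects = [3.0, 4.0, 5.0]
variances = [0.25, 0.25, 0.25]
q, i2 = cochran_q_and_i2(effects, variances)
print(f"Q = {q:.2f}, I2 = {i2:.1f}%")
```

With these inputs the weights are all 4, the pooled estimate is 4.0, Q is 8 on 2 degrees of freedom, and I² is 75%, which would count as heterogeneous under the I² > 50% criterion.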

Data availability
All data are included within the manuscript without any restriction.

AUTHOR CONTRIBUTIONS
LC conceived the idea for the present study. SLF designed the search strategy. LN and LZZ extracted and collected data. LZZ provided analysis software. LC drafted the manuscript. LZZ revised the manuscript. All authors reviewed the manuscript and approved this submission.