Introduction

Neoadjuvant chemoradiotherapy (nCRT) followed by total mesorectal excision (TME) is the current standard of care for locally advanced rectal cancer (LARC) because it reduces the risk of local recurrence and improves overall survival [1, 2]. Pathological examination is the gold standard for evaluating the response of LARC to nCRT. The pathologic response to nCRT varies across patients with LARC, of whom 15–27% achieve a pathologic complete response (pCR) [3]. These patients may need no further anti-tumor treatment and may benefit from a “wait-and-watch” strategy [4], while alternative therapy should be considered for those without pCR. Therefore, identifying the pathologic response to nCRT as early as possible would be of great value for personalized treatment.

Magnetic resonance imaging (MRI) plays an increasingly important role in identifying the therapeutic response to preoperative treatment in rectal cancer [5]. Morphologic MRI can determine the T and N stages of LARC owing to its excellent soft-tissue contrast [6, 7], but it is of limited value for assessing the response to chemotherapy and radiation [8]. Functional MRI approaches such as diffusion-weighted imaging (DWI) [9] and dynamic contrast-enhanced MRI [10] have advantages in evaluating the treatment effect in LARC. Nevertheless, dynamic contrast-enhanced MRI requires the administration of an exogenous gadolinium contrast agent, which carries risks such as allergy and nephrogenic systemic fibrosis in a small percentage of patients, restricting its application in clinical practice. Traditional DWI, which is based on a mono-exponential decay model, reportedly has the potential to predict the treatment response of LARC, but its performance remains controversial across previous reports [11, 12].

Intravoxel incoherent motion DWI (IVIM-DWI) is based on a bi-exponential model that can quantify the pure diffusion and the perfusion-related motion of water molecules without the administration of a gadolinium agent. Thus, IVIM-DWI is theoretically superior to conventional DWI in providing information on the tumor micro-environment related to treatment effect. Indeed, pretreatment IVIM-DWI is reportedly more powerful than traditional DWI in discriminating the pathological response of LARC to nCRT [13]. However, the reported performance of baseline IVIM-DWI parameters in predicting pCR is conflicting across prior investigations on LARC [13,14,15]. A possible reason is that the findings in these reports [13,14,15] were based on the mean or percentile values of IVIM-DWI parameters, which may not adequately reflect intratumoral heterogeneity [16, 17], a feature closely associated with therapeutic response [18].
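For reference, the two signal models contrasted above are commonly written as follows. This is a sketch of the usual textbook form, not a formula quoted from this study: the symbols S(b), S_0, and b (the diffusion weighting) are notation introduced here, and some formulations replace D* in the perfusion term with D + D*.

```latex
% Mono-exponential model (conventional DWI)
\frac{S(b)}{S_0} = \exp(-b \cdot \mathrm{ADC})

% Bi-exponential IVIM model
\frac{S(b)}{S_0} = f \exp(-b D^{*}) + (1 - f) \exp(-b D)
```

Here D is the pure diffusion coefficient, D* the pseudo-diffusion coefficient, and f the perfusion fraction.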

Texture analysis can assess the heterogeneity within a tumor by quantifying the gray-level intensity or position of the pixels of an image [16, 19, 20]. Recently, several studies demonstrated that texture features derived from T2-weighted images [21, 22] and apparent diffusion coefficient (ADC) maps [23] have the potential to predict the pathologic response, T down-staging, or lymph node eradication of rectal cancer treated with nCRT. Theoretically, IVIM-DWI is more effective than morphologic MRI and conventional DWI in predicting the curative effect of chemoradiotherapy on tumors [13, 24,25,26]. Therefore, we hypothesized that texture analysis based on IVIM-DWI can identify the pathological response of LARC to nCRT. To the best of our knowledge, however, this capacity has not yet been established, which prompted the present study.

Materials and methods

Patient selection

This prospective single-center study was approved by the Medical Ethics Committee of our institution, and was conducted in accordance with the Declaration of Helsinki. Written informed consent was acquired from all patients. The inclusion criteria were (1) newly diagnosed non-mucinous adenocarcinoma with pathological confirmation, (2) scheduled for nCRT before surgical operation, (3) clinical stage of II to III (cT3-4M0, and/or regional lymph node positive), and (4) over 18 years old. Patients were excluded if they (1) refused to participate, (2) received prior anti-tumor treatment, or (3) had contraindication for nCRT, MRI, or surgery. In total, 45 patients were initially enrolled from January 2017 to August 2017.

Conventional MRI protocol

All patients underwent conventional MRI examinations and IVIM-DWI on a 1.5-Tesla MRI scanner (Optima® MR360, GE Healthcare, Milwaukee, WI, USA) with a phased-array body coil 1–3 days before nCRT. The imaging parameters are summarized in Table 1.

Table 1 MRI protocol in this study

IVIM-DWI protocol

IVIM-DWI was performed on all patients after the conventional MRI examinations. Twelve b values were applied in a single-shot diffusion-weighted spin-echo echo-planar (ssSE-DW-EPI) sequence. Multiple b value measurements in one series were performed based on the modification of the lookup table of gradient direction. The details are shown in Table 1. To reduce possible distortion and image degradation, magnetic field shimming was performed each time before DWI data acquisition; all MRI images were carefully reviewed, and patients with poor image quality were excluded.

IVIM-DWI parametric values measurement

All IVIM-DWI data were post-processed with the MADC Kit (a software package for multiple ADC measurements) on the Advantage Workstation (version AW 4.6, GE Medical Systems). Four IVIM-DWI parametric maps (ADC; pure diffusion coefficient, D; pseudo-diffusion coefficient, D*; and perfusion fraction, f) of each primary rectal tumor were generated by pixel-by-pixel fitting with the Levenberg–Marquardt algorithm [27]. To delineate each tumor, one radiologist (Observer A, with 11 years of experience in abdominal radiology), who was blinded to the clinicopathological characteristics and treatment outcomes, manually traced the outer edge of the lesion on each axial ADC map with reference to the T2-weighted image, yielding the corresponding two-dimensional (2D) region of interest (ROI) for each map. The most superior and the most inferior slices of each tumor were excluded to avoid volume averaging. Based on all the ROIs of a tumor, the MADC Kit automatically generated a three-dimensional (3D) volume of interest (VOI) and output the mean ADC value of this VOI. The software also automatically copied the same VOI onto all the other IVIM-DWI maps. To avoid possible mismatch, these VOIs were further verified by the same radiologist with reference to the T2-weighted images. Subsequently, the corresponding IVIM-DWI parametric values were obtained.
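The pixel-by-pixel fitting step can be illustrated with a minimal Python sketch. It is not the MADC Kit implementation, whose internals are not described here. It assumes the common bi-exponential form of the IVIM model and uses SciPy's `curve_fit` with `method="lm"` (Levenberg–Marquardt). The twelve b values and the starting guess are hypothetical placeholders (the actual b values are listed in Table 1), and the ADC map, which comes from a separate mono-exponential fit, is not covered.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical set of twelve b values in s/mm^2 (the study's values are in Table 1).
B_VALUES = np.array([0, 10, 20, 30, 50, 80, 100, 150, 200, 400, 600, 800], dtype=float)


def ivim_signal(b, s0, f, d_star, d):
    """Bi-exponential IVIM model: perfusion-related decay plus pure diffusion decay."""
    return s0 * (f * np.exp(-b * d_star) + (1.0 - f) * np.exp(-b * d))


def fit_ivim_voxel(b_values, signal):
    """Fit S0, f, D* and D for a single voxel with Levenberg-Marquardt."""
    # Hypothetical starting guess; real data may need a more careful initialization.
    p0 = [signal[0], 0.1, 0.01, 0.001]
    popt, _ = curve_fit(ivim_signal, b_values, signal, p0=p0, method="lm", maxfev=10000)
    return dict(zip(("s0", "f", "d_star", "d"), popt))


if __name__ == "__main__":
    # Noise-free synthetic voxel, used only to show the call pattern.
    synthetic = ivim_signal(B_VALUES, 1000.0, 0.2, 0.02, 0.001)
    print(fit_ivim_voxel(B_VALUES, synthetic))
```

Repeating `fit_ivim_voxel` over every voxel of a slice would yield the D, D*, and f maps. On noisy clinical data an unconstrained fit can return non-physical values, so production tools typically add bounds or a segmented fitting strategy.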

After the IVIM-DWI parametric values were obtained, the IVIM-DWI maps with ROIs were saved as BMP images for texture analysis. Additionally, all the conventional MRI images and IVIM-DWI parametric maps in DICOM (Digital Imaging and Communications in Medicine) format were stored in the picture archiving and communication system (DHPACS® version 4.6.4, Donghua Software, Beijing, China) of our institution as archive data.

Texture analysis

All the BMP images were transferred into the MaZda program (http://www.eletel.p.lodz.pl/programy/mazda/index.php?Action=mazda) for texture analysis. Because the ROIs on the IVIM-DWI parametric maps could not be used directly by MaZda, one radiologist (Observer A) manually retraced the borders of the original ROIs to generate new ROIs for texture analysis. Subsequently, a VOI for each tumor was generated automatically from these new ROIs. For each VOI, gray-level normalization was performed using μ ± 3σ (μ, gray-level mean; σ, gray-level standard deviation) to minimize the influence of contrast and brightness variation. For each IVIM-DWI parametric map, 5 first-order texture features (Mean, Kurtosis, Skewness, Variance, and Median) and 11 gray-level co-occurrence matrix (GLCM) features were extracted automatically from the VOI by MaZda. The GLCM features were Angular Second Moment (AngScMom), Contrast, Correlation (Correlat), Difference Entropy (DifEntrp), Difference Variance (DifVarnc), Entropy, Inverse Difference Moment (InvDfMom), Sum Average (SumAverg), Sum Entropy (SumEntrp), Sum of Squares (SumOfSqs), and Sum Variance (SumVarnc).
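To make the normalization and GLCM steps concrete, the following NumPy sketch computes four of the features that recur in the Results (Contrast, DifVarnc, Correlat, and SumAverg) for a single 2D patch. It is an illustration under stated assumptions, not MaZda's implementation. It uses the standard Haralick-style definitions with 0-based gray levels, so SumAverg is offset relative to 1-based conventions. The number of gray levels, the single pixel offset, and the rectangular patch (instead of an irregular ROI or a 3D VOI) are hypothetical simplifications.

```python
import numpy as np


def normalize_gray_levels(values, n_levels=64):
    """Clip intensities to mean +/- 3 SD and quantize to levels 0..n_levels-1."""
    values = np.asarray(values, dtype=float)
    mu, sigma = values.mean(), values.std()
    lo, hi = mu - 3.0 * sigma, mu + 3.0 * sigma
    if hi <= lo:  # constant patch: every pixel maps to level 0
        return np.zeros(values.shape, dtype=int)
    scaled = (np.clip(values, lo, hi) - lo) / (hi - lo)
    return np.minimum((scaled * n_levels).astype(int), n_levels - 1)


def glcm(levels, dx=1, dy=0, n_levels=64):
    """Symmetric, normalized co-occurrence matrix for one non-negative offset."""
    rows, cols = levels.shape
    first = levels[:rows - dy, :cols - dx]   # reference pixels
    second = levels[dy:, dx:]                # neighbors at offset (dy, dx)
    counts = np.zeros((n_levels, n_levels))
    np.add.at(counts, (first.ravel(), second.ravel()), 1.0)
    counts = counts + counts.T               # count each pair in both directions
    return counts / counts.sum()


def glcm_features(p):
    """Contrast, DifVarnc, Correlat and SumAverg from a normalized GLCM."""
    n = p.shape[0]
    i, j = np.indices(p.shape)
    diff = np.abs(i - j)
    # Distribution of gray-level differences |i - j|.
    p_diff = np.bincount(diff.ravel(), weights=p.ravel(), minlength=n)
    k = np.arange(n)
    mu_i, mu_j = np.sum(i * p), np.sum(j * p)
    sd_i = np.sqrt(np.sum((i - mu_i) ** 2 * p))
    sd_j = np.sqrt(np.sum((j - mu_j) ** 2 * p))
    denom = sd_i * sd_j
    cov = np.sum((i - mu_i) * (j - mu_j) * p)
    return {
        "Contrast": float(np.sum(diff ** 2 * p)),
        "DifVarnc": float(np.sum(k ** 2 * p_diff) - np.sum(k * p_diff) ** 2),
        "Correlat": float(cov / denom) if denom > 0 else float("nan"),
        "SumAverg": float(np.sum((i + j) * p)),
    }
```

A typical call would be `glcm_features(glcm(normalize_gray_levels(patch)))`. Texture packages usually average such features over several directions and distances; the settings used by MaZda in this study are not specified in the text.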

nCRT treatment

All patients received intensity-modulated radiation therapy for 5 weeks, accompanied by concurrent chemotherapy with oral capecitabine (1650 mg/m² body-surface area) daily. The prescription dose was 45 Gy/25 fractions (1.8 Gy/fraction, 1 fraction/day, 5 fractions/week) for the planning target volume, and 50 Gy/25 fractions (2.0 Gy/fraction, 1 fraction/day, 5 fractions/week) for the gross tumor volume.

Pathological response assessment

TME was performed 8 weeks after the end of nCRT. The fresh specimens were soaked in formalin for 48 h after TME. A pathological expert with 12 years of experience in colorectal pathology evaluated these tissue sections stained with haematoxylin–eosin. The pathological response to nCRT was classified in accordance with the Dworak tumor regression grade (TRG) system [28], namely, TRG 4, a fibrotic mass with no residual tumor (complete response); TRG 3, rare residual tumor cells scattered in the fibrosis; TRG 2, more residual tumor cells, but still less than fibrosis; TRG 1, more residual tumor than fibrosis; and TRG 0, no regression changes (no response). Patients with a TRG score of 4 were classified into the pCR group of the present study, while the other patients were divided into the non-pathological complete response (non-pCR) group.

Statistical analysis

All statistical analyses were performed with SPSS version 22.0 (SPSS Inc., Chicago, IL, USA) or MedCalc v15.0 software (MedCalc Software bvba, Ostend, Belgium). P < 0.05 was regarded as statistically significant. The differences in the values of IVIM-DWI parameters and texture features between the pCR and non-pCR groups were investigated using the Mann–Whitney U test. To identify independent predictors, the indicators showing statistical significance (P < 0.05) in univariate analysis were selected as input variables in the subsequent multivariate logistic regression analysis (forward stepwise, LR; probability for stepwise entry, 0.05; removal, 0.1). Receiver operating characteristic (ROC) curve analysis was conducted to explore the diagnostic performance of the individual indicators and the multivariate regression model for identifying pCR.
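The link between the Mann–Whitney U test and ROC analysis can be shown with a short, dependency-free Python sketch. The area under the ROC curve equals U divided by the product of the two group sizes. This is a generic illustration, not the SPSS or MedCalc procedure. The function names are hypothetical, the P value is not computed, and which group is treated as "positive" and the direction of the cutoff are assumptions made by the caller.

```python
def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: pairs with a > b count 1, ties count 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def roc_auc(group_a, group_b):
    """AUC for separating group_a (higher values) from group_b, via U / (n_a * n_b)."""
    return mann_whitney_u(group_a, group_b) / (len(group_a) * len(group_b))


def sensitivity_specificity(positives, negatives, cutoff):
    """Classify a value as positive when it is >= cutoff."""
    true_pos = sum(v >= cutoff for v in positives)
    true_neg = sum(v < cutoff for v in negatives)
    return true_pos / len(positives), true_neg / len(negatives)
```

For a feature that is lower in the group of interest, the groups can be swapped or the values negated before calling `roc_auc`.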

Results

Four patients were excluded from this study because they did not undergo TME (n = 2), had poor MRI image quality (n = 1), or withdrew (n = 1). The final cohort included 41 LARC patients (pCR, n = 9; non-pCR, n = 32). Table 2 lists the clinicopathological characteristics of this cohort. Among the 44 GLCM features, 11 differed significantly between the pCR and non-pCR groups. Lower values of Contrast and DifVarnc derived from the ADC, D, and D* maps, together with a lower CorrelatD value, were observed in the pCR group (all P < 0.05). Meanwhile, the pCR group exhibited higher CorrelatD*, Correlatf, SumAvergADC, and SumAvergD values than the non-pCR group (all P < 0.05), as shown in Tables 3, 4, 5, and 6. There were no significant differences in the values of IVIM-DWI parameters and first-order texture features between the two groups (Table 7). Figures 1 and 2 show two examples of MRI and pathological images for LARC patients with different treatment responses. In univariate analysis, the area under the ROC curve (AUC) for the individual predictors ranged from 0.698 to 0.837, with sensitivities from 43.75 to 87.50% and specificities from 66.67 to 100.00% (Table 8). In multivariate logistic regression analysis, CorrelatD* (P < 0.001), DifVarncADC (P = 0.024), and DifVarncD (P < 0.001) were independent predictors of pCR, with an AUC of 0.986 (95% confidence interval, 0.958 to 1.000), a sensitivity of 93.75%, and a specificity of 100.00%.

Table 2 Clinical and pathological characteristics of patients
Table 3 Differences in the GLCM features from the ADC map between the pCR and non-pCR groups
Table 4 Differences in the GLCM features from the D map between the pCR and non-pCR groups
Table 5 Differences in the GLCM features from the D* map between the pCR and non-pCR groups
Table 6 Differences in the GLCM features from the f map between the pCR and non-pCR groups
Table 7 Differences in the first-order texture feature and parameter of IVIM-DWI between the pCR and non-pCR groups
Fig. 1

A patient with LARC from the pCR group. Images in sequence are pre- and post-therapy T2WI, VOI, pathological, and IVIM-DWI parametric maps. The pre- and post-therapy T2WI shows a relatively good response to nCRT. The pathological map (haematoxylin–eosin staining, original magnification × 40) after nCRT implies the absence of residual cancer (TRG 4). LARC locally advanced rectal cancer, pCR pathological complete response, T2WI T2-weighted imaging, VOI volume of interest, IVIM-DWI intravoxel incoherent motion diffusion-weighted imaging, nCRT neoadjuvant chemoradiotherapy, TRG tumor regression grade

Fig. 2

A patient with LARC from the non-pCR group. Images in sequence are pre- and post-therapy T2WI, VOI, pathological, and IVIM-DWI parametric maps. The pre- and post-therapy T2WI shows a relatively good response to nCRT. The pathological map (haematoxylin–eosin staining, original magnification × 40) after nCRT implies more residual tumor cells, but still less than fibrosis (TRG 2). LARC locally advanced rectal cancer, pCR pathological complete response, T2WI T2-weighted imaging, VOI volume of interest, IVIM-DWI intravoxel incoherent motion diffusion-weighted imaging, nCRT neoadjuvant chemoradiotherapy, TRG tumor regression grade

Table 8 Diagnostic efficacy of the GLCM features from the IVIM-DWI maps in differentiation between the pCR and non-pCR groups

Discussion

This study examined whether texture features based on IVIM-DWI, together with mean parametric values, can distinguish pCR from non-pCR after preoperative chemoradiotherapy for LARC. We found that the pCR and non-pCR groups differed significantly in baseline GLCM features derived from IVIM-DWI, which demonstrates the potential of GLCM analysis to discriminate the pathological response status of LARC. Furthermore, GLCM analysis may be superior to first-order texture analysis and traditional mean-parameter-value analysis when pretreatment IVIM-DWI is used to identify pCR in LARC.

In the current study, we investigated three types of indices derived from IVIM-DWI: conventional mean parametric values, first-order texture features, and second-order texture features. Our finding that the pCR and non-pCR groups had similar values of all 4 IVIM-DWI parameters suggests that the conventional IVIM-DWI method might not be able to predict pCR in LARC, which is in line with several prior investigations on rectal cancer [14, 15]. Nevertheless, Wen Lu et al. found that rectal cancers achieving pCR after nCRT have markedly higher pretreatment D* and f values than those with non-pCR [13]. A recent report on LARC also demonstrated that the pCR group exhibits significantly lower baseline ADC and D values than the non-pCR group [29]. Although these inconsistent findings might result from differences in clinicopathological characteristics, treatment regimens, imaging protocols, and data analysis methods across studies, a likely reason is that mean parametric values cannot readily quantify intratumor heterogeneity. Tumors are heterogeneous in cellularity, angiogenesis, extravascular extracellular matrix, and areas of necrosis [16]. Intratumor heterogeneity is a feature closely associated with therapeutic response [18]. The IVIM-related micro-environment features, such as cellularity, necrosis, and microcirculation, vary considerably across different regions of malignant tumors [30]. However, analysis based on mean IVIM-DWI parametric values cannot describe the underlying spatial distribution [16, 31] and therefore cannot reflect intratumor heterogeneity. Thus, using the mean values of IVIM-DWI parameters to predict the treatment response of a tumor may not be accurate enough.

Texture analysis can quantitatively evaluate intratumoral heterogeneity by analyzing the variations of pixel or voxel gray levels in an image [32]. Among the methods used in texture analysis, the statistics-based approach is the most frequently used and provides three levels of texture features: first-, second-, and higher-order statistics. Histogram-based measures, commonly referred to as first-order statistics, reflect the intensity distribution of a VOI [33]. The gray-level co-occurrence matrix (GLCM), a widely used texture analysis algorithm, provides second-order texture features that reflect the location of the pixels and the spatial interrelationship between gray values [34, 35]. GLCM features have shown promise in predicting therapeutic response in many kinds of tumors [18, 23, 36, 37]. In this study, there were significant differences in the GLCM features, rather than in the first-order texture features, between the pCR and non-pCR groups, suggesting that second-order texture features may have advantages over first-order features obtained from IVIM-DWI maps in predicting the pathological response to nCRT in LARC. Similar observations were also reported by previous studies on rectal and cervical cancers [15, 38]. For example, no relationship was found between the baseline median ADC, D, D*, or f value of rectal cancer and its response to chemoradiotherapy [15].

It has been reported that histogram metrics calculated from IVIM-DWI maps do not add to the median parametric values in discriminating the pathological response of rectal cancer to chemoradiotherapy [15]. Similarly, GLCM features generated from dynamic contrast-enhanced MRI images can reportedly discriminate between cured and relapsed cervical cancer patients who underwent chemoradiotherapy, whereas first-order features cannot [38]. These observations imply an advantage of GLCM analysis over first-order statistics in discriminating the pathological response of LARC, possibly because first-order texture features do not capture the spatial information of images and therefore cannot provide sufficient information on tumor heterogeneity [17, 33, 34, 39].

Our data revealed that the pre-therapy Contrast, DifVarnc, Correlat, and SumAverg values differed markedly between the pCR and non-pCR groups in univariate analysis. Multivariate analysis in this study demonstrated that CorrelatD*, DifVarncADC, and DifVarncD served as independent indicators of pCR. These findings suggest that baseline GLCM features derived from IVIM-DWI could act as imaging predictors of pCR in patients with LARC. Among these GLCM features, Correlation and SumAverg are not directly related to heterogeneity, whereas Contrast and DifVarnc are measures of tissue heterogeneity [39]; that is, lower Contrast or DifVarnc means less heterogeneity [35, 36, 39, 40]. In the present study, all the values of Contrast and DifVarnc derived from the ADC, D, and D* maps in the pCR group were lower than those in the non-pCR group, implying that higher heterogeneity on IVIM-DWI maps may be associated with poorer therapeutic effect. This result is in line with several previous studies indicating that higher heterogeneity on imaging maps is usually a sign of poorer outcome in a variety of malignancies such as rectal, breast, and cervical cancers [23, 38,39,40,41]. Rectal cancer responding favorably to nCRT has a higher value of InvDfMom (a GLCM feature directly related to intratumoral homogeneity) on the pre-therapy ADC map [23], or exhibits both a lower Entropy (a GLCM feature representing heterogeneity) value and a higher Uniformity value on pretreatment contrast-enhanced CT images [41]. Similarly, among patients with breast cancer, a lower Contrast value on baseline gadolinium-enhanced T1-weighted MRI images was found for responders to chemotherapy than for non-responders [40]. In a recent study, higher Contrast values for cervical cancer on the pharmacokinetic parameter (enhancement amplitude and washout rate of contrast agent) maps of dynamic contrast-enhanced MRI [38] before nCRT also indicated a treatment outcome of relapse.

There are some limitations in the present study. First, the study population was relatively small, which may lead to statistical bias. Second, posttreatment texture features derived from IVIM-DWI were not investigated, which limits the discussion of the relationship between treatment response and texture features of LARC. Thus, further studies with larger sample sizes and analysis of posttreatment texture features generated from IVIM-DWI maps are needed to better understand the performance of texture features in predicting the treatment outcomes of LARC. Third, the images were assessed by a single radiologist in this study. Considering the excellent interobserver agreement observed in our and others’ studies [13, 15, 42], this limitation might not have an obvious influence on the accuracy of the MRI image assessment. Fourth, the process of copying the VOI may lead to mismatch across different IVIM-DWI maps. However, as the four IVIM-DWI maps for each rectal lesion were generated from the same MRI data acquisition, the VOI drawn on the ADC map theoretically ought to match the VOIs on the other 3 IVIM-DWI maps. Additionally, the VOIs in all IVIM-DWI maps were manually verified to further reduce the possibility of mismatch.

In conclusion, the present study suggests that GLCM analysis based on IVIM-DWI may be a potential approach to identify the pathological response of LARC before starting chemoradiotherapy.