Introduction

Neoadjuvant chemotherapy (NACT) is a systemic treatment that helps to reduce the size of a tumour or, in the best case, remove all tumour cells before surgery. NACT allows breast-conserving surgeries in patients with locally advanced breast cancer and often enables surgeries on initially non-operable tumours. In addition, NACT helps provide better delineation between healthy and malignant tissue during the surgery, and increases the number of patients with better postoperative recovery [1, 2]. In particular, a pathologic complete response (pCR) after NACT has been associated with a significantly better disease-free and overall survival of patients compared to a partial response [3–5]. Therefore, the non-invasive prediction of response to NACT with imaging might play an important role in potential therapy plan modifications.

Contrast-enhanced (CE)-MRI, because of its high sensitivity, is the standard method for prediction of NACT response in breast cancer. The most commonly used CE-MRI marker is the lesion size. Another marker is segmented tumour volume, which has been reported to be more predictive of response than the maximal tumour diameter [6].

Diffusion-weighted imaging (DWI) – providing a supplementary MRI contrast – is a well-established method for breast lesion characterization, with a high specificity for the detection of malignant lesions [7–10]. The apparent diffusion coefficient (ADC), derived from DWI images, reflects changes in tissue cellularity, and its mean value within the lesion was found to change during therapy earlier than the lesion size [11, 12]. In several previous studies, changes in mean ADC values at an early time-point in therapy predicted therapy outcome, distinguishing responders from non-responders [11–18].

The aim of our study was to determine the optimal MRI acquisition and evaluation method, as well as the ideal time-point, using CE-MRI and DWI at 3T, to predict pCR in breast cancer patients undergoing NACT.

Materials and methods

Patients and therapy

Institutional review board approval was obtained, and all patients gave written informed consent. Forty-two breast cancer patients, 52 ± 10 years old (mean ± standard deviation, range 29–74), were investigated from December 2010 to December 2013. All patients underwent baseline (BS) MRI before NACT. During NACT, patients were measured once or twice more. If a patient was measured twice, the first measurement fell in the first half of NACT and the second in the second half.

The patient inclusion criteria were: 18 years of age or older, not pregnant, not breast feeding, histology proven cancer prior to enrolment (BI-RADS 6) with no previous treatment, and no contraindications for MR imaging or contrast agent administration.

There were two types of NACT regimens administered in 3-week long cycles:

  1) taxane-based (22/42) with anthracyclines (19/22) delivered in six or eight cycles

  2) anthracycline/taxane-based (20/42) consisting of four + four cycles, where anthracycline treatment and cyclophosphamide were followed by taxanes (CA – T, 16/20), or vice versa (T – CA, 4/20).

The study flow chart, with the number of patients measured at each time-point and the chemotherapy used, is shown in Fig. 1.

Fig. 1

A flow chart of the study design depicting the number of patients and the time-points at which they were examined. (NACT neoadjuvant chemotherapy)

In some patients with Her2/Neu-positive lesions (5/42), trastuzumab was used in combination with the NACT regimen. The duration of NACT ranged from 84 to 168 days (three to eight cycles, median 138 days).

pCR was defined as no residual invasive or non-invasive cancer in breast tissue or in lymph nodes (ypT0 N0) on histopathology from surgical resection that was performed in all patients. The average time between the last chemotherapy and surgery was 37 ± 16 days.

Measurements

All experiments were performed on a 3 T MR system (TIM Trio, Siemens Healthcare, Erlangen, Germany) using a dedicated bilateral breast coil with four ¹H channels (In vivo, Orlando, FL, USA) and with patients in the prone position. First, a transversal T2-weighted turbo spin echo sequence with fat suppression was measured in all patients (data not used in this study). Then, DWI images were acquired with a bilateral three-scan trace using readout-segmented echo planar imaging (rs-EPI) with fat suppression in the transversal plane and with b-values of 0 and 850 s/mm², a repetition time (TR)/echo time (TE) of 5800/68 ms, and a total measurement time of 3 min. rs-EPI was used because of its advantages over single-shot EPI, e.g. fewer image distortions at 3 T [19]. The in-plane resolution was 1.4 × 1.4 mm² with a 5-mm slice thickness. DWI was measured before contrast injection and followed by a CE-MRI sequence. Two different 3D T1-weighted sequences with fat suppression were used during the period of this study – both were gradient echo-based: i) a high-spatial and high-temporal resolution sequence (time-resolved angiography with stochastic trajectories – TWIST) with a temporal resolution of 14 s, spatial isotropic resolution of 1.1 mm³, field of view 248 × 350 mm², matrix size 240 × 320, 144 slices per slab, one average, TR/TE of 6.81/2.84 ms, and a flip angle of 11° [20]; or ii) a fast low angle shot (FLASH) sequence with a duration of 2 min, isotropic 1 mm³ spatial resolution, field of view 320 × 134 mm², matrix size 320 × 134, 96 slices per slab, one average, TR/TE of 877/3.82 ms, and a flip angle of 9°, interleaved with high temporal resolution (13 s) volumetric interpolated breath-hold examination (VIBE) imaging [21, 22]. Images from these two protocols were considered equivalent because of the similar resolution and timing of the images used for tumour size evaluation. The whole protocol is also represented in the block diagram in Fig. 2. From each sequence, only one pre-contrast image and one maximum-contrast image, measured at 2 min 12 s after the contrast agent injection, were used in data processing. Gadoterate meglumine contrast agent (Dotarem, Guerbet, IN, USA) was injected intravenously as a bolus (0.1 mmol/kg of body weight) 2 min after the start of CE-MRI.

Fig. 2

A block diagram showing the succession of sequences in the study protocol, along with sequence durations and temporal resolution. (RS-EPI readout-segmented echo planar imaging, DWI diffusion weighted imaging, FLASH fast low angle shot magnetic resonance imaging, res. resolution, VIBE volumetric interpolated breath-hold examination, TWIST time-resolved angiography with stochastic trajectories)

Image analysis

The MRI examinations were read in consensus by two radiologists, both with more than 10 years of experience in breast MRI. Tumour size was assessed based on percentage enhancement (PE) maps that were defined as PE = [(S1 – S0)/S0], where S0 and S1 represent the signal intensity on the pre-contrast and maximum-contrast images, respectively [2]. The volume of contrast-enhanced lesions (3Dseg) was calculated from three-dimensional segmented ROIs defined on the PE maps. The threshold used for ROI segmentation varied from 70–80 % on BS PE maps to 25–70 % on maps acquired during/after the therapy, to account for lower contrast medium uptake after NACT, as proposed by Partridge et al. [6]. For comparison, the largest diameter of the lesion in two and three perpendicular directions was measured manually on CE images according to RECIST 1.1 [23], as the standard tumour size measure.
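
As a rough illustration of the PE-map thresholding described above, the following Python sketch computes a per-voxel percentage enhancement and a thresholded lesion volume. It is not the OsiriX/Horos workflow used in the study: the factor of 100 is an assumption (made because the thresholds are quoted in percent), the function names are hypothetical, and plain thresholding stands in for 3D region growing, so connectivity to a seed point is ignored.

```python
def percentage_enhancement(s0, s1):
    """Percentage enhancement of one voxel: 100 * (S1 - S0) / S0.

    The factor 100 is an assumption; voxels with zero pre-contrast
    signal are given a PE of 0 to avoid division by zero.
    """
    if s0 == 0:
        return 0.0
    return 100.0 * (s1 - s0) / s0


def segmented_volume_mm3(pre, post, threshold_pct, voxel_volume_mm3):
    """Total volume of all voxels whose PE reaches the threshold.

    `pre` and `post` are flat sequences of signal intensities from the
    pre-contrast and maximum-contrast images, in the same voxel order.
    """
    n_voxels = sum(
        1
        for s0, s1 in zip(pre, post)
        if percentage_enhancement(s0, s1) >= threshold_pct
    )
    return n_voxels * voxel_volume_mm3


# Toy data: four voxels of 1 mm^3 each, segmented with a 70 % threshold.
pre = [100, 100, 100, 0]
post = [180, 150, 260, 50]
volume = segmented_volume_mm3(pre, post, threshold_pct=70.0, voxel_volume_mm3=1.0)
```

Lowering `threshold_pct` for follow-up scans mirrors the idea of adapting the threshold to reduced contrast uptake during therapy.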

Non-target lesions were followed and documented as “present,” “absent,” or “unequivocal progression.” Lesions < 10 mm or pathological lymph nodes ≥ 10 mm to < 15 mm in the short axis were considered non-measurable disease and were excluded from response assessment.

ROIs on diffusion images were defined to delineate the hyperintense areas on diffusion-weighted images (b = 850 s/mm²) [7], which corresponded to hypointense areas on the ADC map. These ROIs were marked and then copied to the ADC maps. Two-dimensional ROIs were drawn manually on the single slice with the largest tumour area, as ADC data are typically processed. Necrotic areas, which appear hyperintense on both DWI and ADC maps, were excluded (values higher than approx. 2 × 10⁻³ mm²/s). All ROIs were defined independently for each time-point.
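
To make the ROI-based ADC evaluation more concrete, the sketch below derives ADC from the two acquired b-values and averages it over an ROI while skipping voxels above the approximate necrosis cut-off. It assumes the common mono-exponential model ADC = ln(S(b=0)/S(b))/b; the manuscript does not state how the ADC maps were computed, so the model, the function names and the toy signal values are all illustrative assumptions.

```python
import math


def adc_from_two_b_values(s_b0, s_b, b=850.0):
    """ADC in mm^2/s from the signals at b = 0 and at b (in s/mm^2).

    Assumes mono-exponential decay; returns None where the logarithm
    is undefined (non-positive signal).
    """
    if s_b0 <= 0 or s_b <= 0:
        return None
    return math.log(s_b0 / s_b) / b


def mean_roi_adc(adc_values, necrosis_cutoff=2.0e-3):
    """Mean ADC over an ROI, skipping undefined and necrotic-range voxels."""
    kept = [v for v in adc_values if v is not None and v <= necrosis_cutoff]
    if not kept:
        return None
    return sum(kept) / len(kept)


# Toy ROI of (signal at b=0, signal at b=850) pairs.
# The last voxel has an ADC above the cut-off and is excluded.
signals = [(1000.0, 450.0), (1000.0, 400.0), (1000.0, 150.0)]
roi_adc = [adc_from_two_b_values(s0, sb) for s0, sb in signals]
mean_adc = mean_roi_adc(roi_adc)
```

The same averaging applies to 2D and 3D ROIs; only the set of voxels passed in differs.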

All 3D segmented ROIs (on CE-MRI and DWI) were selected using the “Grow Region” 3D segmentation algorithm with a threshold (lower/upper bounds) and processed in OsiriX® (Pixmeo, Geneva, Switzerland)/Horos™ (horosproject.com). ROIs were defined only for the target lesion. The readers were blinded to pathologic results.

Data analysis and statistics

Our statistical analysis was performed using IBM SPSS Statistics 22 (Armonk, NY, USA). In our study, the ADC values within each lesion were not normally distributed, but since most articles report mean ADC values, we included them in our evaluation and results. Mean ADC values were calculated automatically from two- and three-dimensional ROIs. Histogram analysis was used to assess median ADC values and the 15th, 25th, 50th, 75th, and 90th percentiles. The 15th and 90th percentiles were specifically reported to enable a direct comparison with literature data [24]. Only 3D ADC values, exported using the “Export ROIs” plugin in OsiriX, were used for histogram analysis because of the larger number of data points.
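
The histogram step can be sketched as follows: given the exported ADC values of one lesion, compute the reported percentiles. The linear-interpolation rule used here is an assumption (statistical packages differ in their percentile definitions), and the function names and toy values are hypothetical.

```python
def percentile(values, p):
    """p-th percentile (0-100) using linear interpolation between ranks."""
    if not values:
        raise ValueError("percentile() requires at least one value")
    data = sorted(values)
    rank = (p / 100.0) * (len(data) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(data) - 1)
    fraction = rank - lower
    return data[lower] + fraction * (data[upper] - data[lower])


def adc_histogram_summary(values, percentiles=(15, 25, 50, 75, 90)):
    """Map each requested percentile to its value for one lesion."""
    return {p: percentile(values, p) for p in percentiles}


# Toy lesion: ADC values in units of 10^-3 mm^2/s.
lesion_adc = [0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7]
summary = adc_histogram_summary(lesion_adc)
```

With only a handful of voxels, as in a small 2D ROI, the outer percentiles become unstable, which is one practical reason to prefer 3D ROIs for this kind of analysis.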

The size measures (3Dseg, tumour diameters), the DWI measures (mean, median, and percentiles of ADC), and the changes in these parameters were compared using the non-parametric Mann-Whitney U-test. For paired data, e.g., the comparison between mean ADC values before and during therapy, the Wilcoxon signed-rank test was used. The ability to predict pCR was assessed using the area under the curve (AUC) obtained from receiver operating characteristic (ROC) analysis. We used DeLong’s test to compare the AUCs of different measures within the same time-point and a bootstrap test to compare AUC values between different time-points, using the package pROC [25] in the R Project for Statistical Computing [26].
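
For readers less familiar with ROC analysis, the sketch below shows one way to obtain an AUC directly from two groups of values. It uses the rank interpretation of the AUC (the probability that a randomly chosen pCR case scores higher than a randomly chosen non-pCR case, which equals the Mann-Whitney U statistic divided by the product of the group sizes). It does not reproduce the DeLong or bootstrap comparisons performed with pROC, and the function name and toy values are hypothetical.

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correctly ranked pair.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Toy data: tumour diameter reduction from baseline, in percent.
pcr_reduction = [60.0, 45.0, 20.0]
non_pcr_reduction = [30.0, 15.0, 10.0]
auc = roc_auc(pcr_reduction, non_pcr_reduction)
```

An AUC of 0.5 corresponds to no discrimination between the groups. For a measure where lower values indicate pCR, the two groups would be swapped or the values negated.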

All parameter changes mentioned in the manuscript are meant to be relative to the baseline (BS) measures.
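
A minimal sketch of such a baseline-relative change is given below. The manuscript does not state the exact formula, so expressing the change as a percentage of the baseline value is an assumption, and the function name and example diameters are hypothetical.

```python
def relative_change_pct(baseline, follow_up):
    """Change of a parameter relative to its baseline value, in percent."""
    if baseline == 0:
        raise ValueError("baseline value must be non-zero")
    return 100.0 * (follow_up - baseline) / baseline


# Hypothetical lesion: 40 mm at baseline, 30 mm at follow-up.
diameter_change = relative_change_pct(40.0, 30.0)
```

A negative value indicates a decrease relative to baseline (shrinkage for size measures), and a positive value an increase (as typically seen for ADC during therapy).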

Results

Forty-two lesions were assessed. Histological types and grades, along with receptor positivity, are listed in Table 1.

Table 1 Summary of clinical parameters in measured patients: histology types and grades of tumours and receptor positivity

AUC values from the ROC analysis for pCR prediction (using, e.g., largest tumour diameter, 2D and 3D diameters, 3Dseg) at the initial time-point (BS) and during the chemotherapy are displayed in Table 2.

Table 2 Area under the curve from the ROC analysis of pathologic complete response prediction using different MRI measures

Changes in 2D diameter, ADC values, initial diameter, response, and NACT regimens for all lesions are listed in Table 3.

Table 3 Changes in 2D diameter and ADC values, initial diameter, response, and neoadjuvant chemotherapy regimens in all lesions. The top half of the table includes lesions with pathological complete response (pCR), and below are lesions without (non-pCR)

Comparison of 2D and 3D ADC

Three-dimensional ADC values from all time-points and patients were not significantly different from 2D ADC values (p = 0.061). In general, mean ADC values from 3D ROIs performed better than or similarly to those from 2D ROIs in pCR prediction, with higher or similar AUCs in ROC analysis (see Table 2). Moreover, 3D ADC ROIs provide more individual ADC values for histogram analysis. Therefore, all comparisons of ADC values are based on 3D ROIs unless stated otherwise.

There was no significant difference between pCR and non-pCR lesions using mean ADC values at any time-point (BS: p = 0.287, after second cycle: p = 0.368, after third and fourth cycle: p = 0.999, after fifth cycle: p = 0.093). Nevertheless, the course of the mean ADC values during NACT, depicted in Fig. 3, is worth noting. Moreover, the ADC values from 2D ROIs measured after the fifth cycle were significantly lower in the pCR group (p = 0.034). We were motivated to investigate this difference because of the high AUC value (0.800) at this time-point (see Table 2). In comparison, the difference between the pCR and non-pCR groups using 2D ROIs was not significant at other time-points (p = 0.272 at BS, 0.456 after the second cycle, and 0.875 after the third and fourth cycle). However, there was no significant difference in AUCs between any time-points or between different methods.

Fig. 3

Plot depicting mean ADC values of pathological complete responders (pCR) and non-pCR before (baseline) and during (after the 2nd, 3rd and 4th, and 5th cycle) the neoadjuvant chemotherapy

The potential added benefit of ADC values for pCR prediction during the second, third, and fourth cycles was investigated in a scatter plot relating 2D diameter change to median 3D ADC values (Fig. 4).

Fig. 4

A scatter plot depicting the relation between median 3D ADC change and 2D diameter change from the baseline measurement in the first half of the therapy (after the 2nd, 3rd, or 4th cycle). The size change is a good predictor of pCR because the majority of non-pCR cases are distributed on the left side of the plot. However, along the y-axis, the ADC values for pCR cases are distributed similarly to those for non-pCR cases; therefore, they contribute little or nothing to pCR prediction at this time-point

Selected examples of a false-positive case with higher ADC values during and after NACT in a non-responder and a false-negative case, with subsequently low ADC values during and after NACT in a pCR lesion, are shown in Figs. 5 and 6, respectively.

Fig. 5

Examples of DWI and CE MRI in a 39-year-old patient with invasive ductal carcinoma and stable disease after neoadjuvant chemotherapy (non-responder). Mean ADC values (× 10⁻³ mm²/s) measured before and during the therapy are depicted next to the corresponding ADC map, with a region of interest surrounding the lesion (white arrows)

Fig. 6

Examples of DWI and CE MRI in a 37-year-old patient with invasive ductal carcinoma and pathologic complete response to neoadjuvant chemotherapy. Mean ADC values (× 10⁻³ mm²/s) measured before and during the therapy are depicted next to the corresponding ADC maps. An example of a segmented region of interest fills the lesion (green colour) in the baseline ADC map. (NaN value missing)

Baseline measurements

All patients were included in the baseline data analysis. Before therapy, the tumour diameter measured on CE-MRI ranged from 2.1 to 12.5 cm.

There was no significant difference between the pCR and non-pCR groups in any BS tumour size measure (p = 0.843 for 3Dseg, 0.388 for 3D diameter, and 0.530 for 2D diameter).

The mean ADC values were not significantly different between pCR and non-pCR lesions (p = 0.287); the values were 0.87 ± 0.12 × 10⁻³ mm²/s in pCR and 0.96 ± 0.23 × 10⁻³ mm²/s in non-pCR.

After the second cycle

The change in tumour size was, on average, more prominent in pCR cases than in non-pCR tumours, but this difference was not significant (p = 0.371 using 3Dseg, 0.112 using 3D diameter, and 0.112 using 2D diameter of tumour).

The mean ADC values measured after the second cycle were significantly higher than at BS (p = 0.008). The mean ADC values after the second cycle were as follows: pCR – 1.33 ± 0.28 × 10⁻³ mm²/s; non-pCR – 1.13 ± 0.26 × 10⁻³ mm²/s.

After the third and fourth cycles

The tumour diameter change from the baseline started to be significantly higher for pCR cases when compared with non-pCR (p = 0.017 in 3D diameter and 0.039 in 2D diameter). However, there was no significant difference between pCR and non-pCR using 3Dseg (p = 0.056).

The mean ADC values were again significantly higher than at BS (p < 0.001). Mean ADC values after the third and fourth cycles were 1.24 ± 0.15 × 10⁻³ mm²/s in pCR and 1.28 ± 0.30 × 10⁻³ mm²/s in non-pCR.

After the fifth cycle

There was again a significant difference between pCR and non-pCR using 2D and 3D tumour diameter change (p = 0.016 for 3D diameter and 0.014 for 2D diameter), whereas the difference was not significant for 3Dseg (p = 0.116).

The mean ADC values after the fifth cycle were significantly higher than at BS (p < 0.001). The mean ADC values after the fifth cycle were as follows: pCR – 1.10 ± 0.24 × 10⁻³ mm²/s; non-pCR – 1.34 ± 0.33 × 10⁻³ mm²/s.

Discussion

In this study, we measured breast cancer patients at several time-points before and during NACT using CE-MRI and DWI at 3 T. We assessed the ability to predict pCR using three tumour size measures based on CE-MRI, together with two different ROI definition approaches on ADC maps. Our results show the advantage of tumour size measures for therapy monitoring, mostly during the first half of NACT. Compared to size measures, ADC values did not predict NACT response well, not even after the second cycle of NACT.

In a comparable study at 1.5 T by Fangberget et al. [14], DWI and CE-MRI were performed in 31 patients at three time-points: baseline, after the fourth cycle, and before surgery. That study found ADC values, tumour size, and tumour size reduction after four cycles of chemotherapy to be strong predictive markers for pCR. The authors reported an AUC of 0.80 for pCR prediction using ADC values, and ADC values were significantly higher in pCR lesions in the middle of therapy; in our data, higher ADC values in pCR lesions were seen at an earlier time-point, and the difference was not significant. In that study, patients with residual ductal carcinoma in situ (DCIS) were included in the pCR group, the ADC ROIs were drawn on ADC maps, and single-shot EPI was used with a 1.5 times lower resolution (1.5 × 1.5 × 4 mm³), which could have increased the partial volume effect from healthy/necrotic tissue. Together, these factors could explain the differences between their results and ours.

We found that tumour diameter measurement at the mid-therapy time-point (after the third or fourth cycle) is more predictive for pCR than at other time-points. The highest AUC values (>0.9) were found for 3D diameter changes measured after the third and fourth cycle of NACT. This is in agreement with a previous report by Hylton et al. [2]. That study used data from 216 patients and included only tumours larger than 3 cm (we had five cases with a diameter of less than 2.5 cm), and DCIS was part of their pCR group. In our study, lesions with residual DCIS were considered non-pCR because pCR defined as no residual invasive or non-invasive cancer was found to be associated with a highly favourable outcome compared to other groups in a study by von Minckwitz et al. [3]. Furthermore, Hylton et al. found the segmented volume to perform better than tumour diameter in pCR prediction at the early time-point (after the first cycle), but in our data, diameter measures always performed better than or similarly to tumour volume.

Additionally, we found that, without baseline data, the tumour size alone – measured after the second cycle – was much less predictive for pCR than the tumour size measured later in the therapy. In contrast, the change in tumour size from the baseline performed well in predicting the pCR outcome at every time-point.

In a study by Partridge et al. [6], the baseline tumour volume, diameter, and tumour volume change after NACT were associated with the length of recurrence-free survival, but early (after the first cycle) tumour volume and largest diameter were not. pCR was found to be associated with a highly favourable outcome [3], which would suggest an association with recurrence-free survival, too. However, in our study, none of the baseline tumour size measures was sufficient to predict pCR (AUC values of 0.527 – 0.626), and there was no difference between pCR and non-pCR patients using baseline size measures.

ADC values measured before NACT were lower in the pCR group, although not significantly, and were hardly able to predict pCR (highest AUC of 0.669 for the 15th percentile). In contrast, two other studies found significantly higher pre-treatment ADC values in responders when compared to non-responders [16, 27]. Wilmes et al. [24] found that pre-treatment ADC metrics were generally lower in responders to therapy, but the difference was significant only for high-resolution DWI. This is consistent with the hypothesis that higher ADC values are linked to tissue necrosis, characterized by hypoxia, acidity, and poor perfusion, which might account for the resistance to treatment [28, 29]. Several other studies found no correlation with tumour size change or pathologic complete response [11, 30, 31], which is in accordance with our results. However, a study by Richard et al. found ADC values to be a good predictor of non-pCR when breast cancer subtypes were considered [32].

We found that ADC values in tumours increased during chemotherapy, which is in agreement with the literature [12].

ADC metrics and their changes during the first half of the therapy were of limited value in predicting pCR (AUC values ranged from 0.5 to 0.788) when compared with size measures at the same time-points. Interestingly, the DWI measurement after the second cycle was the only time at which mean ADC values were higher in the pCR group than in the non-pCR group. While mean ADC values in the non-pCR group tended to increase or stay similar to those at earlier time-points, ADC values in the pCR group increased after the second cycle and then decreased again. It is possible that better results would have been obtained if ADCs had been assessed at an earlier time-point. ADC assessment after the second cycle could have been too late to detect necrotic changes caused by the chemotherapy and represented by higher diffusivity. Lesions could have already started to transform into fibrotic tissue. Possible evidence of ongoing fibrosis can be deduced from the high AUC obtained using mean ADC values from 2D ROIs after the fifth cycle, where ADC values in pCR lesions were significantly lower than in non-pCR lesions, which could be caused by fibrosis in the responding tumours. Moreover, there was little difference between 2D and 3D ADC values when used for pCR prediction. This could have been caused by the non-isotropic resolution of the DWI images (5 mm slice thickness).

Our semi-automated segmentation method provided mean ADC values comparable to those from manually delineated 2D ROIs. This ROI determination technique is faster than manual delineation. Moreover, 3D segmented ADC ROIs were more convenient for histogram analysis because they yield a higher number of ADC values per lesion. Furthermore, median ADC values from 3D data proved to be better in pCR prediction than mean values.

Other factors that could contribute to results that differ from the literature include different DWI acquisition parameters, ROI delineation, and differences in tumour size assessment. More data could help to better understand the benefits of integrating mean (median) ADC values as an additional imaging biomarker for NACT monitoring. Limitations of this study were the smaller number of patients at earlier time-points after the start of NACT and the fact that not all patients were measured at all time-points. This was not possible because of management complications caused by NACT side effects. Moreover, not all patients measured after the fifth cycle were measured after the end of the therapy. Another factor influencing tumour size estimation could be delayed enhancement of the tumour, making parts of the tumour not yet visible at the time of the post-contrast measurement.

In conclusion, the results of this study suggest that size changes assessed at an earlier time-point predict pCR in NACT better than later measurements. No advantage for therapy monitoring was found in using segmentation-based volumetric measures when compared with the standard 2D diameter. If measured in the middle of the therapy, DWI is less capable of predicting therapy outcome than size change measures.