Introduction

[18F]HX4 is a new 2-nitroimidazole PET imaging agent for hypoxia, in which structure–activity relationships have been used to optimize pharmacokinetic and clearance properties [1, 2]. Tumor hypoxia is a condition in which insufficiently vascularized tumor cells deprived of oxygen not only become more aggressive and malignant, but also more resistant to treatment by radiation and chemotherapy [35]. The presence of hypoxia is therefore generally considered a poor prognostic disease marker in cancer patients [6]. However, it is difficult to measure oxygen levels reproducibly and noninvasively in a highly heterogeneous tumor environment. Reliable diagnostic methods to detect and quantify tumor hypoxia are therefore needed. It has been hypothesized and currently being investigated that inclusion of hypoxic cell sensitizers during treatment, i.e., the delivery of higher radiotherapy doses to hypoxic regions [7] or the use of hypoxia-targeting therapy [811], might improve the outcome in patients with hypoxic tumors [12]. [18F]HX4 has the potential to serve as a clinically useful diagnostic tool to aid the use of hypoxia-targeting therapies in those patients who will most likely benefit from them [13, 14].

This pilot phase 2 study was primarily designed as a test–retest study to investigate the repeatability of [18F]HX4 as a noninvasive PET imaging marker for detection of tumor hypoxic regions. Here we present the results in patients with lung cancer and patients with head and neck (H&N) cancer.

Materials and methods

Patients

This multicenter study (NCT01075399) was conducted in accordance with the ethical principles of Good Clinical Practice, according to the International Conference on Harmonization of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH). Both the FDA and the institutional review boards of the participating institutions approved the study protocol and the informed consent form. All participants reviewed and signed the informed consent form before study entry. [18F]HX4 PET/CT images were acquired in 19 patients, 9 with lung cancer and 10 with H&N cancer. The patients underwent two sequential pretreatment [18F]HX4 PET/CT scans within 1 week to assess repeatability. Patient characteristics are presented in Table 1.

Table 1 Patient characteristics

Radiochemistry

[18F]HX4 (flortanidazole, 3-[18F]fluoro-2-(4-((2-nitro-1H-imidazol-1-yl)methyl)-1H-1,2,3-triazol-1-yl)-propan-1-ol) was prepared by Siemens Molecular Imaging (Culver City, CA) or a Siemens PETNET qualified manufacturing site and delivered to each site on the day of injection. The radiosynthesis has been described previously [15]. Briefly, the precursor (Siemens Molecular Imaging Inc., Culver City, California, USA) was reacted with 18F-K2.2.2 and K2CO3 in MeCN at 110 °C for 10 min, followed by a deprotection step using 1.0 mol/l HCl at 100 °C for 5 min. [18F]HX4 was purified by RP-HPLC and stabilized with ascorbic acid before sterile filtration. In order to be released, each dose of [18F]HX4 had to have a radiochemical purity greater than 95 %.

Scanners and technical parameters

[18F]HX4 PET/CT scans were performed using a high-resolution full-ring PET/CT scanners, including a GE Discovery, GE Discovery LS, Philips Gemini, and a Siemens Biograph PET/CT scanner. Images were reconstructed using scanner-specific parameters in accordance with each facility’s standard procedure, including at least attenuation and scatter correction. Repeat scans were performed on the same PET/CT scanner using the same protocol and patient positioning without respiratory gating.

[18F]HX4 PET/CT imaging

For each [18F]HX4 PET/CT scan, the patient received a single intravenous bolus injection of 368 ± 48 MBq (range 199 – 488 MBq) of [18F]HX4 followed by a saline flush. A static PET/CT scan was acquired with an acquisition time of 3 min (range 1.7 – 5 min) per bed position after an uptake time of 99 ± 10 min (range 89 – 125 min). The average difference in uptake time between repeat PET scans was 6 ± 7 min (range 0 – 27 min).

Image evaluation of [18F]HX4

[18F]HX4 PET/CT scans were analyzed using an Inveon Research Workplace (Edition 4.0.0.3; Siemens, Germany). Gross tumor volumes (GTV) of the primary lesion or largest lymph node were defined in centimeters cubed by manual contouring of the tumor on the CT images by one observer (D.C.). These tumor delineations were applied to the PET images and the maximal and mean standardized uptake values (SUVmax, SUVmean) were measured in grams per milliliter. Under the assumption of water density, the SUV is reported as unitless. For each patient, the reference tissue was defined by contouring a volume of interest (VOI; sphere of radius 25 mm) in a large (thigh) muscle on the CT image. From this muscle VOI the SUVmean (M) was determined. Tumor-to-background ratios (TBR) were calculated by dividing tumor SUVmax and tumor SUVmean by muscle SUVmean (M)

$$ \begin{array}{c}\hfill {\mathrm{TBR}}_{\max } = \mathrm{Tumor}\ {\mathrm{SUV}}_{\max }/\ \mathrm{M}\hfill \\ {}\hfill {\mathrm{TBR}}_{\mathrm{mean}} = \mathrm{Tumor}\ {\mathrm{SUV}}_{\mathrm{mean}}/\ \mathrm{M}\hfill \end{array} $$

The hypoxic volume (HV; in centimeters cubed) of each tumor was defined as the [18F]HX4 tumor volume with a TBR >1.2 (HV1.2) or TBR >1.4 (HV1.4):

$$ \begin{array}{c}\hfill {\mathrm{HV}}_{1.2} = \mathrm{V}\mathrm{olume}\ \mathrm{with}\mathrm{in}\ \mathrm{G}\mathrm{T}\mathrm{V}\ \mathrm{with}\ \mathrm{T}\mathrm{B}\mathrm{R} > 1.2\hfill \\ {}\hfill {\mathrm{HV}}_{1.4} = \mathrm{V}\mathrm{olume}\ \mathrm{with}\mathrm{in}\ \mathrm{G}\mathrm{T}\mathrm{V}\ \mathrm{with}\ \mathrm{T}\mathrm{B}\mathrm{R} > 1.4\hfill \end{array} $$

The fraction of HV (FHV, percent) of each tumor was determined by dividing the HV by its respective GTV:

$$ \begin{array}{c}\hfill {\mathrm{FHV}}_{1.2} = {\mathrm{HV}}_{1.2}/\ \mathrm{G}\mathrm{T}\mathrm{V}\hfill \\ {}\hfill {\mathrm{FHV}}_{1.4} = {\mathrm{HV}}_{1.4}/\ \mathrm{G}\mathrm{T}\mathrm{V}\hfill \end{array} $$

To evaluate the repeatability of the heterogeneous uptake pattern, the second [18F]HX4 PET/CT scans were rigidly registered and inspected for accurate registration, and a voxel-wise comparison of the SUVs within the GTV was performed.

Statistics

For all parameters, the mean ± SD are reported. The relationships among GTV-based parameters (SUVmean, SUVmax, TBR, HV, FHV) extracted from repeat [18F]HX4 PET images were analyzed by calculating Pearson correlation coefficients. A p value <0.05 was assumed to be statistically significant. In addition, a Bland-Altman analysis was performed for all parameters providing the mean difference of each parameter and the absolute and relative coefficients of repeatability (CR 1.96 × SD), defined as the value below which the difference between two measurements will be with 95 % probability. To evaluate the voxel-wise analysis, a linear fit of the data was performed, providing the correlation coefficient and slope. A Bland-Altman plot was created providing the difference in uptake for each matching voxel (ΔSUV) with the lower and upper limits of agreement of the 95 % confidence interval. In addition a histogram of SUVs within the GTV was prepared.

Results

[18F]HX4 PET/CT imaging in nine patients with lung cancer and ten with H&N cancer were included in the analysis. Two sequential baseline [18F]HX4 PET/CT scans were performed at an average interval of 1.1 days (range 1 – 2 days) in patients with lung cancer and 2.1 days (range 1 – 6 days) in patients with H&N cancer.

[18F]HX4 uptake in the GTV

[18F]HX4 uptake varied considerably among tumors on both the first scan with an average SUVmax of 1.86 ± 0.52 (range  1.2 – 2.9) and SUVmean of 1.20 ± 0.28 (range 0.85 – 1.90) and the second scan with an average SUVmax of 1.84 ± 0.50 (range 1.15 – 2.82) and SUVmean of 1.20 ± 0.28 (range 0.92 – 1.97; Table 2).

Table 2 Repeatability of [18F]HX4 uptake and hypoxic tumor volume and fraction using a threshold of 1.2 times background (HV1.2)

The uptake parameters from the first and second scans were highly correlated: r = 0.958 for SUVmax (p < 0.001, Fig. 1), and r = 0.946 for SUVmean (p < 0.001, Supplementary figure). High correlations between scans were also seen within each subgroup of cancer patients: r = 0.972 for SUVmax (p < 0.001) and r = 0.960 for SUVmean (p < 0.001) in those with lung cancer, and r = 0.945 for SUVmax (p < 0.001) and r = 0.952 for SUVmean (p < 0.001) in those with H&N cancer. In the Bland-Altman analysis, SUVmax showed a mean difference of 0.02 with an absolute CR of 0.29 and a repeatability percentage of 17 % (Fig. 1), and SUVmean showed a mean difference of 0.01 with an absolute CR of 0.18 and a repeatability percentage of 15 %.

Fig. 1
figure 1

Correlation and Bland-Altman plots (including 95 % confidence intervals) of the image parameters SUVmax and TBRmax

High correlations were also seen for TBRmax (r = 0.962, p < 0.001; Fig. 1) and TBRmean (r = 0.965, p < 0.001). High correlations were also seen within each subgroup of cancer patients: r = 0.939 for TBRmax (p < 0.001) and r = 0.972 for TBRmean (p < 0.001) in those with lung cancer, and similarly r = 0.972 for TBRmax (p < 0.001) and r = 0.964 for TBRmean (p < 0.001) in those with H&N cancer. In the Bland-Altman analysis, TBRmax showed a mean difference of −0.01 with an absolute CR of 0.30 and a repeatability percentage of 17 % (Fig. 1), and TBRmean showed a mean difference of −0.01 with an absolute CR of 0.11 and a repeatability percentage of 10 %.

HV and FHV analysis

The average tumor volume was 70 cm3 (range 2.6 – 361 cm3). The average HV1.2 in the first scan was 32 cm3 (range 0 – 211 cm3) and in the second scan was 34 cm3 (range 0 – 204 cm3; Table 2). For HV1.2, there was a high correlation between the first and second scans (r = 0.995, p < 0.001; Supplementary figure) which was retained in each subgroup of cancer patients: r = 0.997 (p < 0.001) in those with lung cancer and r = 0.998 (p < 0.001) in those with H&N cancer. In the Bland-Altman analysis, HV1.2 showed a mean difference of –1.55 cm3 with an absolute CR of 13.5 cm3 (Supplementary figure).

Applying the higher threshold of 1.4 times the background, in the first scan the average HV1.4 was 19 cm3 (range 0 – 175 cm3) and in the second scan was 19 cm3 (range 0 – 162 cm3; Supplementary table). For HV1.4, there was also a consistently high correlation between the first and second scans (r = 0.982, p < 0.001) which was retained in each subgroup of cancer patients: r = 0.959 (p < 0.001) in those with lung cancer and r = 0.999 (p < 0.001) in those with H&N cancer. In the Bland-Altman analysis, HV1,4 showed a mean difference of 0.08 cm3 with a confidence interval of –17.2 to 17.4 cm3.

There was a wide range of FHV1.2 due to varying levels of hypoxia among the tumors. In the first scan the average FHV1.2 was 20 ± 25 % (range 0 – 85 %) and in the second scan the average FHV1.2 was 23 ± 26 % (range 0 – 80 %; Table 2). This was also seen when the higher threshold of 1.4 times the background was applied: in the first scan the average FHV1.4 was 9 ± 18 % (range 0 – 71 %) and in the second scan the average FHV1.4 was 10 ± 17 % (range 0 – 63 %; Supplementary table).

For FHV1.2, there was a high correlation between the first and second scans (r = 0.957, p < 0.001) which was retained in each subgroup of cancer patients: r = 0.966 (p < 0.001) in those with lung cancer and r = 0.950 (p < 0.001) in those with H&N cancer. For FHV1.4, there was also a high correlation between the first and second scans (r = 0.975, p < 0.001) which was retained in each subgroup of cancer patients: r = 0.963 (p < 0.001) in those with lung cancer and r = 0.985 (p < 0.001) in those with H&N cancer. In the Bland Altman analysis, FHV1.2 showed a mean difference of -3.1 % with an absolute CR of 14.9 %, and FHV1.4. showed a mean difference of -0.9 % and an absolute CR of 7.8 %.

Using 1.2 times the background as the threshold to determine FHV, 79 % of the tumors (15/19) were found to have some level of hypoxia but when the higher threshold of 1.4 times the background was applied to determine FHV, only 47 % of the tumors (9/19) were characterized as having hypoxia.

Repeatability of the spatial uptake pattern

An example of voxel-wise image analysis in a patient with head and neck cancer (patient 12) is shown in Fig. 2. Comparison of the heterogeneous uptake within the GTV between the first and second [18F]HX4 PET scans showed a moderate to strong correlation in the majority of patients, with an average correlation coefficient of 0.65 ± 0.14. There were two exceptions (patients 14 and 16) in whom a poor correlation was observed (R = 0.38 and 0.39). The average slope and intercept of the linear fit of the data were 0.56 ± 0.17 and 0.47 ± 0.19, respectively. The Bland-Altman analysis showed an average ΔSUV of 0.02 ± 0.06, with a lower and upper limit of agreement of 0.15 ± 0.09 and 0.19 ± 0.08. Examples of voxel-wise image analysis in patients with lung cancer (patients 1 and 4) are shown in Fig. 3. In addition, the results for each patient are shown in Table 3.

Fig. 2
figure 2

Example of voxel-wise analysis in a patient with head and neck cancer (patient 12). The axial, coronal and sagittal planes of the first and the rigidly registered second [18F]HX4 PET/CT scan are shown. The gross tumor volume is delineated. The bottom row shows the correlation plot, the Bland-Altman and the histogram plot of the voxels within the gross tumor volume

Fig. 3
figure 3

Examples of voxel-wise analysis in patients with lung cancer (patients 1 and 4). The axial plane of the CT scans with the gross tumor volumes delineated in yellow, the first [18F]HX4 PET scan, the rigidly registered second [18F]HX4 PET scan and the difference map from the two scans are shown

Table 3 Results of the voxel-wise analysis

Discussion

The aim of this study was to investigate the repeatability of [18F]HX4 as a noninvasive PET imaging marker for the detection of tumor hypoxia in patients with lung cancer and patients with H&N cancer. Tumor hypoxia is known to be a dynamic process characterized by the presence of acute and chronic hypoxia. Acute hypoxia is usually the result of a blockage or disruption in the perfusion of the tumor, while chronic hypoxia is mainly caused by limitations of oxygen diffusion due to an inefficient blood vessel network which results in larger distances between the blood vessels and tumor tissue. Static PET imaging will show only the hypoxic status at one specific time-point and contain information about both acute and chronic hypoxia. To be able to select patients for treatment with antihypoxia therapy and/or for a hypoxia-based radiotherapy dose redistribution, it is important to gain an insight into the day-to-day variability in tumor hypoxia and its spatial location. Therefore we compared [18F]HX4 uptake, tumor-to-muscle levels and hypoxic fractions between two consecutive [18F]HX4 PET scans. To obtain information about the spatial distribution of tumor hypoxia, a voxel-wise comparison of the [18F]HX4 uptake was performed.

While there was, as anticipated, a large interpatient variability in [18F]HX4 uptake, no major differences were observed between patients with H&N or patients with lung cancer. The average SUV of [18F]HX4 was identical for lung cancer (1.2 ± 0.3) and H&N cancer lesions (1.2 ± 0.3). There is no standardized method to define tumor hypoxia on PET images. The threshold value for defining tumor hypoxia is dependent on the tracer, tracer pharmacokinetics, and other imaging parameters [16]. In a previous study [16], we showed that PET imaging using a threshold of 1.2 times background at 2 h after injection provides a similar FHV and hypoxic lesion detection rate to imaging using a threshold of 1.4 times background at 4 h after injection. In the current analysis, we included both thresholds to quantify the HV. First we defined the threshold as an uptake above 1.2 times the background level. In this case 89 % (8/9) of the patients with lung cancer and 70 % (7/10) of those with H&N cancer had a hypoxic tumor volume. These percentages are in agreement with previously published results showing, for example, hypoxia in 72 % of patients with non-small-cell lung cancer [16] and in 84 % of those with H&N cancer [17]. Increasing the threshold to 1.4 times background level resulted in decreases in the proportions of hypoxic lesions detected to 67 % of lung cancer lesions (6/9) and 30 % of H&N cancer lesions (3/10).

At the tumor level we observed a high correlation for the frequently used parameters to quantify tumor hypoxia (SUVmax, SUVmean, TBR, HV and FHV). This is in agreement with the results of a study by Okamoto et al. [18] who evaluated the reproducibility of the hypoxia PET tracer [18F]FMISO in patients with H&N cancer. They found a high correlation for SUVmax, TBR and HV. However, these results do not agree with the previous results of Nehmeh et al. [19] who found a considerable variability in intratumoral uptake between repeat [18F]FMISO PET scans. The reproducibility of the hypoxia PET tracer [18F]FAZA was evaluated by Busk et al. [20] in a mouse model and showed good reproducibility. In comparison to [18F]FDG PET/CT imaging, our observed repeatability percentages (SUVmax 17 % and SUVmean 15 %) are smaller than the relative differences required to exceed test–retest variability, which should be larger than 25 % for SUVmax and 20 % for SUVmean [21]. Since [18F]HX4 has a lower uptake than [18F]FDG, results from comparisons of the two tracers should be interpreted with caution. However, comparing our relative coefficients of repeatability with the results of the low uptake [18F]FDG measurements (Fig. 1c of de Langen et al. [21]), the observed [18F]HX4 repeatability percentage of the SUVmax (17 %) is much lower than expected based on [18F]FDG (approximately 35 %). This high repeatability of [18F]HX4 PET imaging parameters at the tumor level provides confidence that hypoxia PET imaging using [18F]HX4 can be used to reliably detect and quantify tumor hypoxia. This is essential for the use of hypoxia PET imaging as a predictor of treatment response or for monitoring changes in hypoxia during treatment. The detection of hypoxia using [18F]HX4 PET/CT at the tumor level could therefore be used to identify patients who might benefit from hypoxia-targeted treatment [22].

To evaluate the stability of the heterogeneous uptake pattern of [18F]HX4, a voxel-wise comparison was performed. This analysis showed reproducible results (R > 0.5) in the majority (17 out of 19) patients with lung cancer or H&N cancer. The observed repeatability is in agreement with previous results of Peeters et al. [23] showing high repeatability of [18F]HX4 uptake in a rat rhabdomyosarcoma model. Repeatability studies using the alternative hypoxia tracer [18F]FMISO have shown contradictory results: Okamoto et al. [18] and Bittner et al. [24] found good repeatability, while Nehmeh et al. [19] observed variability in spatial uptake. For the hypoxia tracer [18F]FAZA, repeated PET/CT imaging was performed during the course of radiotherapy. While Mortensen et al. [25] found a stable location of the HV during treatment, Servagi-Vernat et al. [26] found a spatial move in the HV. The spatial reproducibility of tumor hypoxia, as measured by a hypoxia PET tracer is essential for hypoxia PET-based radiotherapy planning. Three-dimensional information on the hypoxic areas within the tumor can be used to tailor radiotherapy treatment to give a higher radiation dose to hypoxic subvolumes [27]. In this study, [18F]HX4 PET/CT imaging was able to identify stable hypoxic areas in the majority of patients. Therefore, this imaging technique could potentially enable the reliable treatment of hypoxic areas with an increased radiotherapy dose. Several studies have already shown that it is feasible to perform radiotherapy dose planning based on hypoxia PET images [12, 28, 29].

There were some limitations to this study. First, patients with very heterogeneous disease were included. These tumors have a different histology and might therefore express a different phenotype regarding acute versus chronic tumor hypoxia, which could possibly affect the reproducibility of tracer uptake. Nevertheless, even in this heterogeneous population, a high repeatability in [18F]HX4 PET/CT uptake was observed. Second, the study design was multicentric; therefore different PET/CT scanners were used with different physical characteristics and different acquisition protocols. Differences in resolution among the scanners might have led to differences in the tumor hypoxia detection rates. In general, we expect with all scanners a partial volume effect, and particularly in small lesions, in lesions with low uptake and with a small HV this would cause larger differences in absolute uptake measurements. Also, breathing motion in the patients with lung cancer could have caused blurring of the PET signal. The differences in acquisition protocol, i.e., acquisition time per bed position and uptake period, will lead to differences in the observed signal-to-noise ratios, and TBR and SUV measurements [16, 30]. Nevertheless, since we used each patient as his or her own control, the partial volume effect and the effect of different scanners should have had only a minor influence on the repeatability results. Third, the [18F]HX4 PET scans were on average acquired at 99 min after injection, with a maximal difference in the time from injection acquisition of 27 min. Studies reported after this study was completed have shown that the contrast between tumor and background increases up to 4 h after injection. Therefore, the image contrast might have been suboptimal and the differences in uptake parameters observed might have been due to differences in the time from injection to acquisition [30].

In conclusion, repeated PET imaging with the hypoxia tracer [18F]HX4 provides reliable and reproducible results regarding the (spatial) uptake in patients with head and neck and lung cancer. [18F]HX4 has the potential to quantify hypoxia in tumors and aid hypoxia-targeted treatments.