Feasibility and robustness of dynamic 18F-FET PET based tracer kinetic models applied to patients with recurrent high-grade glioma prior to carbon ion irradiation

The aim of this study was to analyze the robustness and diagnostic value of different compartment models for dynamic 18F-FET PET in recurrent high-grade glioma (HGG). Dynamic 18F-FET PET data of patients with recurrent WHO grade III (n = 7) and WHO grade IV (n = 9) tumors undergoing re-irradiation with carbon ions were analyzed by voxelwise fitting of the time-activity curves with a simplified and an extended one-tissue compartment model (1TCM) and a two-tissue compartment model (2TCM), respectively. A simulation study was conducted to assess robustness and precision of the 2TCM. Parameter maps showed enhanced detail on tumor substructure. Neglecting the blood volume VB in the 1TCM yields insufficient results. Parameter K1 from both 1TCM and 2TCM showed correlation with overall patient survival after carbon ion irradiation (p = 0.043 and 0.036, respectively). The 2TCM yields realistic estimates for tumor blood volume, which was found to be significantly higher in WHO IV compared to WHO III (p = 0.031). Simulations on the 2TCM showed that K1 yields good accuracy and robustness, while k2 showed the lowest stability of all parameters. The 1TCM provides the best compromise between parameter stability and model accuracy; however, application of the 2TCM is still feasible and provides a more accurate representation of tracer kinetics at the cost of reduced robustness. Detailed tracer kinetic analysis of 18F-FET PET with compartment models holds valuable information on tumor substructures and provides additional diagnostic and prognostic value.

18F-FET, unlike other system L substrates such as [11C]methionine and [18F]FDOPA, is not further metabolized after being taken up by the cells, which simplifies the respective compartment models and thus makes it a potentially valuable candidate for tracer kinetic analysis. Several authors have studied the value of dynamic 18F-FET PET for diagnosis and grading of HGG 9,20–22. However, most of them resort to either qualitative analysis of curve shapes or derivation of semi-quantitative parameters such as the initial slope or time to peak. Few authors have applied detailed pharmacokinetic analysis using different compartment models 23,24, and parameter robustness and reliability were not investigated.
The aim of the present study was to investigate feasibility of tracer kinetic analysis in dynamic 18 F-FET PET of recurrent HGG and to identify the most accurate and reliable model. Therefore, we systematically investigated fits with the 1TCM, the simplified 1TCM and the 2TCM to data from patients with recurrent HGG. Quality of the 2TCM fits was assessed by fitting synthetic TACs, generated with different parameter combinations, and analyzing the resulting parameter estimates in terms of accuracy and robustness.

Methods
All procedures performed in this study were in accordance with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. The study was approved by the institutional ethics committee (ethics committee of the University of Heidelberg, S-421/2015). Owing to the retrospective nature, the need for informed consent was waived.
Compartment models. Different compartment models could be feasible to describe 18F-FET tracer kinetics. The simplest approach is a standard one-tissue compartment model (1TCM), which is depicted in Fig. 1a. The exchange of tracer between blood and tissue by means of the exchange rate constants [K1, k2] can be expressed in a single differential mass balance equation:

$$\frac{dC_t(t)}{dt} = K_1\, C_a(t) - k_2\, C_t(t)$$

$C_a(t)$ and $C_t(t)$ represent the tracer concentrations within the arterial blood pool and tissue, respectively. The total measured concentration of tracer in a volume element then yields:

$$C_T(t) = (1 - V_B)\, C_t(t) + V_B\, C_a(t)$$

Figure 1. Schematic representation of the used compartment models. Ca is the concentration of tracer in the arterial blood (arterial input function, AIF). (a) The one-tissue compartment model (1TCM) describes the exchange of tracer between the arterial blood and the tissue in terms of exchange rates K1 (1TCM) and k2 (1TCM). In an element of tissue, the blood compartment in the form of the capillaries occupies the volume fraction VB, whereas the fraction of tissue volume is VT = 1 − VB. (b) The more complex two-tissue compartment model (2TCM) separates the tracer concentration in tissue into tracer in the extracellular extravascular space (EES) and in the cells. Tracer is transported from the capillaries into the EES over the blood-brain barrier with rates K1 (2TCM) and k2 (2TCM), and from the EES it is taken up in the cells with rates k3 (2TCM) and k4 (2TCM).

Scientific Reports | (2018) 8:14760 | DOI: 10.1038/s41598-018-33034-5

VB represents the relative volume of blood within a respective volume element (in ml/100 ml tissue) 19. In many cases, VB is assumed to be zero, leading to a simplified model with only two parameters K1 and k2 25. In the following, this model will be referred to as "simplified 1TCM" (s1TCM). The two transfer constants [K1, k2] in the 1TCM include both tracer transport over the blood-brain barrier via blood flow and diffusion and transport of the tracer to the cells. After crossing the blood-brain barrier, 18F-FET is transported into the cells by the LAT system 26. Hence, a more detailed description of the transport processes could be formulated in a two-tissue compartment model (2TCM) 19, which is described by two differential equations with transfer constants [K1, k2, k3, k4]:

$$\frac{dC_1(t)}{dt} = K_1\, C_a(t) - (k_2 + k_3)\, C_1(t) + k_4\, C_2(t)$$

$$\frac{dC_2(t)}{dt} = k_3\, C_1(t) - k_4\, C_2(t)$$

The interplay of the two tissue compartments is illustrated in Fig. 1b. $C_1(t)$ represents the concentration of tracer in the interstitial space, whereas $C_2(t)$ is the concentration within cells. Transport rates K1 and k2 describe blood flow and diffusion of the tracer into the interstitial space, and k3 and k4 the transport via the LAT system. The total tracer concentration $C_T(t)$ in a voxel of tissue corresponds to the time-activity curve (TAC) derived from dynamic PET measurements. The concentration in the arterial blood pool is usually derived from the 4D PET images by extracting the TAC in a feeding artery close to the tissue of interest. This arterial TAC is often referred to as the arterial input function (AIF). For tracer kinetic analysis, the functional representation of $C_T(t)$ from the respective model is fitted to the measured TAC in tissue by means of non-linear least-squares techniques, using the measured arterial concentration. The fitting routine in turn yields estimates for the model's transfer constants.
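The compartment equations above can be integrated numerically to obtain a model TAC for a given AIF. The following is a minimal sketch rather than the implementation used in the study: the function names, the explicit Euler scheme, the linear interpolation of the AIF and the assumption that the tissue signal of the 2TCM is the sum C1 + C2 weighted by (1 − VB) are all illustrative choices. Setting k3 = k4 = 0 reduces the model to the 1TCM, and additionally setting vb = 0 gives the s1TCM.

```python
def interp(times, values, t):
    """Linearly interpolate a sampled curve at time t (clamped at both ends)."""
    if t <= times[0]:
        return values[0]
    for i in range(1, len(times)):
        if t <= times[i]:
            w = (t - times[i - 1]) / (times[i] - times[i - 1])
            return (1.0 - w) * values[i - 1] + w * values[i]
    return values[-1]


def simulate_2tcm(times, aif, K1, k2, k3, k4, vb, substeps=50):
    """Return the model TAC C_T at the sampling times (times in minutes).

    Assumptions of this sketch: explicit Euler integration with `substeps`
    steps per sampling interval, zero tissue concentration at the first
    sample, and C_T = (1 - vb) * (C1 + C2) + vb * C_a.
    """
    c1, c2 = 0.0, 0.0
    total = [(1.0 - vb) * (c1 + c2) + vb * aif[0]]
    for i in range(1, len(times)):
        dt = (times[i] - times[i - 1]) / substeps
        for s in range(substeps):
            ca = interp(times, aif, times[i - 1] + s * dt)
            dc1 = K1 * ca - (k2 + k3) * c1 + k4 * c2  # interstitial space
            dc2 = k3 * c1 - k4 * c2                   # intracellular space
            c1 += dt * dc1
            c2 += dt * dc2
        total.append((1.0 - vb) * (c1 + c2) + vb * aif[i])
    return total
```

In a fitting routine, a function of this kind would be evaluated repeatedly by the optimizer while the parameters are varied within their constraints; a higher-order integrator or an analytic convolution solution would be a natural replacement for the Euler loop.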
The subsequent bidirectional transport of 18F-FET into and out of the cells after crossing the BBB can be modeled by application of a 2TCM. However, if data quality is low, e.g. due to low temporal resolution or low signal-to-noise ratio (SNR), a simpler one-tissue compartment model (1TCM) could be more suitable, as it yields more stable parameter estimates and does not pose the risk of overfitting the data.

Patient data. 16  were segmented based on an isocontour of 70% (I70). Various thresholds are used in practice for isocontours, ranging from 50% to 90%. 18F-FET uptake is low compared to other tracers like FDG. Therefore, in our experience in recurrent glioma, I90 yields very small volumes of only a few voxels. I50, on the other hand, can easily result in widespread segmentations covering half of the brain, as recently shown 27. Thus, I70 was selected as the best compromise between these values. SUVmax was determined as the maximum value within I70. The standardized uptake ratio (SUR) was calculated by normalizing SUVmax to the mean background uptake (SUVbg), which was derived as the average SUV within a region of similar 2D size in a part of the brain contralateral to the tumor.
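The segmentation and SUR definitions above can be illustrated with a short sketch. It assumes that the isocontour threshold is taken relative to the maximum SUV of the lesion and that the image is available as a flat list of voxel values; the function names are hypothetical and do not correspond to the MITK plugins used in the study.

```python
def isocontour_mask(suv, fraction=0.7):
    """Mark voxels whose SUV reaches `fraction` of the maximum SUV (I70 for 0.7)."""
    threshold = fraction * max(suv)
    return [value >= threshold for value in suv]


def standardized_uptake_ratio(suv, mask, background):
    """SUR = SUVmax inside the isocontour divided by the mean background SUV."""
    suv_max = max(value for value, inside in zip(suv, mask) if inside)
    suv_bg = sum(background) / len(background)
    return suv_max / suv_bg


# Toy example with five lesion voxels and three contralateral background voxels.
lesion = [1.0, 2.0, 2.8, 3.0, 0.5]
mask = isocontour_mask(lesion)
ratio = standardized_uptake_ratio(lesion, mask, [1.0, 1.2, 0.8])
```

With a 90% threshold only the hottest voxel of the toy lesion would remain, while a 50% threshold would already include three voxels, which mirrors the trade-off described for I90 and I50.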
Patient individual image-derived arterial TACs were extracted from the left/right carotid artery in three image slices. Segmentations of arterial TACs were delineated based on the first and second time frame and chosen to include only the inner part of the artery by selection of voxels with maximum peak activity (visual inspection of voxel-wise TACs) and by co-registering contrast enhanced MRI, in order to minimize partial volume effects. Whole blood to plasma conversion was applied using a factor of 10%.
Voxelwise fitting of TACs with 1TCM, s1TCM and 2TCM was performed within I70, yielding parameter estimates K1 (s1TCM), k2 (s1TCM), K1 (1TCM), k2 (1TCM), VB (1TCM), K1 (2TCM), k2 (2TCM), k3 (2TCM), k4 (2TCM), VB (2TCM). For all transfer rates a constraint of 0 min⁻¹ < ki < 1 min⁻¹ was set in order to limit the search space and exclude unreasonable values. The maximum value of the respective parameter estimates was defined as the 75th quantile, since the actual maximum value potentially coincides with the set parameter constraints. Median and maximum of each of the 6 parameters were correlated with tumor grade.

Tumors were irradiated with fractionated carbon ions at a median total dose of 37.5 GyE (range 30 to 42 GyE, 3 GyE per fraction). Median parameter estimates of the 1TCM and 2TCM as well as SUR and grade were correlated with overall survival (OS) and progression free survival (PFS) after re-irradiation using the Cox proportional hazards model.

Simulation data. For assessment of stability and precision of the parameter estimates resulting from fits with the 2TCM, synthetic TACs were generated using various parameter combinations of K1, k2, k3, k4 and VB (2TCM). To our knowledge, no detailed reports on tracer kinetic modeling in dynamic 18F-FET PET in recurrent HGG exist. Thus, representative values of the model parameters, which could be used for data simulation, were not available. Representative sets of parameters for the 2TCM were therefore derived from patient parameter estimates of the current analysis. Histograms of the parameter distributions of each patient were investigated and peak values were extracted. From these, ten common parameter combinations were derived (see Results). Using the parameter combinations and two different measured patient AIFs, 1000 TACs were generated for each parameter combination using the 2TCM model function.
Acquisition noise was simulated by adding Gaussian random numbers to the synthetic curves, resulting in a signal-to-noise ratio (peak of AIF, divided by standard deviation of the noise) of SNR = 100. Simulated TACs were fitted with the 2TCM, using the same AIF as for data generation.
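The noise model described above can be sketched as follows. The standard deviation follows the stated definition (peak of the AIF divided by the SNR); the function name, the use of Python's `random` module and the optional seed are assumptions made for illustration only.

```python
import random


def add_acquisition_noise(tac, aif, snr=100.0, seed=None):
    """Add zero-mean Gaussian noise with sigma = max(AIF) / SNR to a synthetic TAC.

    Returns the noisy curve together with the noise standard deviation, which
    is needed later as the measurement error in the chi-square calculation.
    """
    rng = random.Random(seed)  # a fixed seed makes the noise realization reproducible
    sigma = max(aif) / snr
    noisy = [value + rng.gauss(0.0, sigma) for value in tac]
    return noisy, sigma
```

Calling this function repeatedly with different seeds on the same noise-free model curve would yield the set of noisy realizations that are subsequently refitted.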
Evaluation of parameter estimates. Quality of fits was evaluated by means of the reduced $\chi^2_{\mathrm{red}}$, calculated as the ratio between $\chi^2$ and the number of degrees of freedom $f$:

$$\chi^2_{\mathrm{red}} = \frac{\chi^2}{f} = \frac{1}{n - p} \cdot \frac{\mathrm{SSR}}{\Delta y^2}$$

with $n$ being the number of sampling steps, $p$ the number of model parameters, and SSR the sum of squared residuals. $\Delta y^2$ is the squared error on each measured activity value $y$. For real patient data, the exact error on the measured activity is usually not known, which complicates calculation of $\chi^2_{\mathrm{red}}$. However, apart from systematic errors of the acquisition scanner that are identical for all measurements, the error on the measured activity $y$ is influenced by the statistical nature of the radioactive decay. Therefore, we assumed $\Delta y \approx \sqrt{y}$. For simulated curves, the noise level is known, thus the error on the measured activity can be simply derived as $\Delta y = \sigma_{\mathrm{Noise}} = \mathrm{AIF}_{\mathrm{peak}} / \mathrm{CNR}$.

$\chi^2_{\mathrm{red}}$ is mainly driven by the SSR, indicating a better fit quality if the model curve lies closer to the measured data points. However, the more parameters a model provides, the better it can approximate the measured data, regardless of the validity of these parameters. Hence, the Akaike information criterion (AIC) 28 provides a more reliable quantity for fit evaluation, as it penalizes the fit quality of a model with the number of parameters 29. For small sample sizes, the corrected AIC (cAIC) is used. When comparing fits from different models, the model yielding the lowest cAIC is considered to yield the best representation of the measured data. For comparison of the three models used in this study, the cAIC was computed for each fit, and results among models were compared.

Accuracy and stability of parameter estimates from fitting the simulated TACs ($P_{\mathrm{fit}}$) were evaluated in terms of the absolute and relative deviation from the true value ($P_{\mathrm{input}}$), referred to as absolute and relative error:

$$\Delta P_{\mathrm{abs}} = P_{\mathrm{fit}} - P_{\mathrm{input}}, \qquad \Delta P_{\mathrm{rel}} = \frac{P_{\mathrm{fit}} - P_{\mathrm{input}}}{P_{\mathrm{input}}}$$

Software. All image analysis was conducted using the Medical Imaging Interaction Toolkit (MITK) 30. SUV calculations were conducted using the SUV calculation plugin. For segmentation of tumor lesions, the MITK isocontour plugin 31 was used. Tracer kinetic analysis was performed using an in-house developed software module for pharmacokinetic modeling in MITK 32, which allows for voxelwise fitting of the measured time-activity curves with various compartment models. The tool is incorporated into MITK and applies a Levenberg-Marquardt algorithm for model fitting.
For statistical analysis and plotting, the open-source R software package was utilized (version 3.3.2, http://www.R-project.org). Two-sided, unpaired Wilcoxon tests were applied, together with Bonferroni-Holm multiple testing corrections. The Mantel-Cox log-rank test was used to compare differences in Kaplan-Meier survival curves (KM). The Cox proportional hazards model was used for univariate analyses using log-transformed median parameter values (survival package, version 2.40-1, https://CRAN.R-project.org/package=survival). The significance level was set to 0.05.
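The fit-quality measures described in the Methods can be sketched in a few lines. The reduced chi-square follows the definition given in the text, written here with one error value per sample. For the AIC, the widely used least-squares forms AIC = n ln(SSR/n) + 2p and cAIC = AIC + 2p(p + 1)/(n − p − 1) are assumed; the study's exact expressions may differ, for example by counting the noise variance as an additional parameter. All function names are hypothetical.

```python
import math


def reduced_chi_square(measured, fitted, errors, n_params):
    """Chi-square divided by the degrees of freedom f = n - p."""
    chi2 = sum(((y - m) / dy) ** 2 for y, m, dy in zip(measured, fitted, errors))
    return chi2 / (len(measured) - n_params)


def aic_least_squares(ssr, n, p):
    """Assumed least-squares AIC for n samples, p parameters and residual sum ssr."""
    return n * math.log(ssr / n) + 2 * p


def corrected_aic(ssr, n, p):
    """Assumed small-sample correction of the AIC (requires n > p + 1)."""
    return aic_least_squares(ssr, n, p) + 2 * p * (p + 1) / (n - p - 1)


def preferred_model(candidates):
    """Return the name of the model with the lowest cAIC.

    `candidates` maps a model name to a tuple (ssr, n, p).
    """
    return min(candidates, key=lambda name: corrected_aic(*candidates[name]))
```

For patient data, the `errors` argument would be filled with the square root of each measured activity value, and for simulated curves with the known noise standard deviation.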

Results
Regions of increased tracer uptake were delineated on basis of endpoint 18 F-FET PET images using an isocontour of 70% (I 70 ) and SUV max and SUR were calculated. Within the isocontour, voxelwise tracer kinetic analysis was conducted using the simplified 1TCM (without blood volume V B ), the standard 1TCM and 2TCM (both with blood volume).

Figure 2. Parameter maps from fits with different compartment models in an exemplary patient with grade IV glioma. Pharmacokinetic analysis was performed within an isocontour of 70%, which was delineated based on the endpoint static summation image, as illustrated in (a). SUVmax and SUR were extracted, where SUR is the ratio between SUVmax and mean background SUV. Both the standard 1TCM and a simplified 1TCM were fitted voxelwise to the time-activity curves (TACs), yielding estimates for exchange rates K1 and k2 (b). Furthermore, TACs were fitted with the more complex 2TCM, which yields metabolic exchange rates K1, k2, k3 and k4 (d). The general definitions of the 1TCM and 2TCM yield estimates on the blood volume VB, which is displayed in (c). In the simplified 1TCM (s1TCM), the blood volume is assumed to be zero (VB = 0). Parameter maps of tracer kinetic analysis show enhanced detail on tumor substructure; however, stability is decreased for the 2TCM. Both 1TCM and 2TCM yield similar distributions of VB, indicating that the parameter is not negligible.

Figure 2 shows corresponding parameter maps from fits with the different models in an exemplary patient. K1 and k2 from the simplified and standard 1TCM (Fig. 2b) showed tumor substructure in more detail compared to the static SUV image. Parameter maps of K1 and k2 are very similar, with hot-spots and cold-spots in the same areas. However, the simplified 1TCM yields higher values in both parameters.
Fits with the 2TCM (Fig. 2d) could resolve the substructure even further; however, the enhanced level of detail of the 2TCM also resulted in less stable parameter estimates. This is visible in k 2 , where dark-red voxels indicate misfits, which approached the parameter constraint boundaries. Parameter maps of k 3 and k 4 present with congruent distributions, however, values of k 3 are higher in hotspots. V B from the 1TCM and 2TCM show also very similar distributions (Fig. 2c), indicating stability and consistency of the parameter over the models. Estimates for V B are higher from 1TCM compared to 2TCM. These results indicate that the blood volume V B is not negligible in tracer kinetic analysis of dynamic 18 F-FET PET.
Evaluation of fit quality. Quality of the respective fits with the three different models was investigated. Figure 3 shows the measured TAC in two exemplary voxels together with fitted curves from all three models. It can be observed that the 2TCM model function fits the measured time-activity curves best. The simplified 1TCM provides the least adequate fit, with substantial deviations from the measured curve, especially at later time points. The standard 1TCM describes the time course of tracer activity more appropriately; however, it also shows deviations towards the end of the acquisition. It could be hypothesized that the 2TCM describes the tracer kinetics most accurately, as it visually gives the best fit to the data. In order to further investigate this finding, quantitative analysis of goodness-of-fit was performed by evaluation of $\chi^2_{\mathrm{red}}$ and cAIC. $\chi^2_{\mathrm{red}}$ from fits to data of each patient with either one of the three utilized models is shown in Fig. 3. The simplified 1TCM yielded much higher χ red

In order to consider fit quality with respect to the number of model parameters, the corrected AIC was calculated (Fig. 4). The results show that the simplified 1TCM yields the poorest results; however, fit quality according to the cAIC is not as bad as suggested by $\chi^2_{\mathrm{red}}$. Interestingly, the 2TCM appears to be the best representation of the data (lowest cAIC) in most cases, even though fits showed slightly higher $\chi^2_{\mathrm{red}}$ compared to the 1TCM.
Parameter estimates from patient data. Table 2 lists median parameter estimates from the three models for all patients, together with the respective SUV max and SUR values. Except for k 2 from the 2TCM, which takes unreasonably high values in many patients (up to the constraint boundary of 1 min −1 ), parameter estimates from fits with all models present with reasonable values. Figures 5 and 6 show histograms of the total voxel frequency of parameter estimates from the 1TCM (Fig. 5) and 2TCM (Fig. 6) in patients with grade III and grade IV tumors, respectively.
On average, patients with grade IV glioma presented with higher SUVmax and SUR (Fig. 7). The difference was not significant (SUVmax: 2.3 vs. 3.0, p = 0.2; SUR: 2.2 vs. 3.0, p = 0.091). K1 from fits with the simplified 1TCM showed a tendency towards higher values in patients with grade IV tumors, with mean values of K1 = 0.138 min⁻¹ in grade III, and K1 = 0.204 min⁻¹ in grade IV (p = 0.17). k2 from s1TCM fits was similar in both groups (k2 = 0.189 min⁻¹ in grade III and k2 = 0.194 min⁻¹ in grade IV). As already discussed, the standard 1TCM yielded lower values for K1 and k2 compared to the simplified 1TCM. However, the tendency of lower K1 in . This can also be observed in the respective histogram, which presents with a clear shift towards higher values in grade IV tumors. VB from the 1TCM fits was significantly lower in grade III glioma as shown in Fig. 7, with values of 7.5 ml/100 ml vs. 11.6 ml/100 ml (p = 0.02518, Bonferroni-Holm corrected). This effect is also partially visible in the total voxel-value frequency histogram. For the 2TCM, similar results were observed with VB (grade III) = 5.9 ml/100 ml vs. VB (grade IV) = 9.1 ml/100 ml (p = 0.03112). The result did not remain significant after multiple testing corrections.
Average K 1 in the 2TCM over all patients yielded significantly lower values in grade III compared to grade IV glioma (0.127 min −1 vs. 0.224 min −1 , p = 0.0091, Fig. 6 top). Estimates on k 2 presented with a tendency to unrealistically high values in both groups (see Table 2): k 2 (grade III) = 0.503 min −1 and k 2 (grade IV) = 0.676 min −1 . Furthermore, estimates on k 2 resulted in several outliers up to the upper parameter constraint of 1 min −1 especially in grade IV tumors. Thus, it is likely, that the fits in these voxels ran into constraint boundaries. Values for k 3 showed a trend to higher values in grade IV tumors (0.233 min −1 vs. 0.305 min −1 ), whereas mean k 4 was comparable with 0.079 min −1 in grade III and 0.073 min −1 in grade IV.
Parameter estimates from the different models, as well as grade and SUR, were further correlated with patients' overall survival (OS) and progression free survival (PFS). Results are listed in Supplementary Table S1. For both models, a cut-off for K1 was found that significantly separated patients' overall survival, as shown in Supplementary Fig. S2. Patients with lower K1 showed improved survival, with thresholds of K1 (1TCM) = 0.074 min⁻¹ (p = 0.0041) and K1 (2TCM) = 0.104 min⁻¹ (p = 0.0041), respectively. Cut-offs for K1 from both models also presented with tendencies to separate progression free survival: K1 (1TCM) = 0.059 min⁻¹, p = 0.082 and K1 (2TCM) = 0.104 min⁻¹, p = 0.086. A trend towards correlations of these parameters with PFS was found, which was, however, not significant (p = 0.2006 and p = 0.154, respectively). Furthermore, k2 from the 1TCM showed tendencies towards correlation with PFS (p = 0.1042). Parameter k4 separated progression free survival with a threshold of k4 = 0.067 min⁻¹ (p = 0.066). The Cox proportional hazards model, however, revealed a non-significant correlation (p = 0.301).

Evaluation of accuracy and robustness of the 2TCM on simulation data.
In order to investigate the validity of the 2TCM and test accuracy and robustness of the determined parameter estimates, histogram analysis of the individual patients' parameter maps was performed, as illustrated in Fig. 8 for one patient. Corresponding histograms for K1 and k2 from the 1TCM are shown in the supplements (Fig. S3). Peak values of the histograms of each parameter were extracted in order to find common parameter values [K1, k2, k3, k4, VB]. From these values, 10 combinations were composed for data simulation, which are listed in Table 3. These parameter combinations together with two measured AIFs (Fig. 8 bottom) were used to simulate time-activity curves, which were fitted again with the 2TCM. The resulting parameter estimates were evaluated by means of stability and precision.

Table 2. Median parameter estimates from fits with the three different compartment models in every patient, together with SUVmax and SUV-to-background ratio (SUR): K1 and k2 from the simplified 1TCM; K1, k2 and VB from the standard 1TCM; K1, k2, k3, k4 and VB from the 2TCM. Note: Median parameter estimates were used instead of averages, due to several voxels with failed fits, which yielded outliers towards high transfer rates (~100 × 10⁻² min⁻¹ was set as upper constraint for all parameters).
Figure 9a shows the distribution of $\chi^2_{\mathrm{red}}$ for all 10 combinations, with either one of the AIFs. Fits to data simulated with combination 7 and AIF type I, as well as combination 2 with AIF type II, yielded exceptionally poor fit quality. Apart from these cases, fit quality was good, with $\chi^2_{\mathrm{red}}$ being close to one. However, in all combinations, some fits completely failed, yielding outliers of very high $\chi^2_{\mathrm{red}}$. This suggests decreased stability of the fits, which could be overcome by averaging the signal from multiple voxels and thus decreasing the noise of the TACs (ROI-based analysis). Figure 9b shows absolute errors on parameter estimates of K1, k2, k3 and k4. It can be observed that fits with AIF type II yield overall more stable results, with smaller quantiles of the parameter value distribution. This is counter-intuitive, since AIF type I would be assumed to yield better fit accuracy, because it captures the peak activity better. Interestingly, the distinct difference between the two AIF types was not observed in fit quality.
Overall, both accuracy and stability varied greatly over the different parameter combinations, which suggests that the 2TCM accurately describes the kinetics only in some tissue settings. Estimates on K1 showed the lowest errors of all, with deviations of about 0.01 min⁻¹. k2 presents with increased errors, which are nonetheless not larger than 0.5 min⁻¹ within the 25th and 75th quantile. With regard to stability, k2 showed the widest distribution of parameters, suggesting that it is the least stable parameter. k3 shows large errors for combinations 1 and 7 (AIF type I). The errors in combination 7 were already suggested by the poor fit quality and high $\chi^2_{\mathrm{red}}$. Interestingly,  In the majority of cases k3 and k4 show positive errors, meaning that they are commonly overestimated by the fit. This is partly due to the fact that they were initially set to very low values for simulation, which also makes it difficult to interpret the accuracy of these parameters in terms of absolute errors. Thus, further analysis was done by evaluation of relative errors on parameter estimates, which are displayed in Table 4 for the different parameter and AIF combinations. The high stability and accuracy of estimates on K1 is confirmed, as the parameter shows low errors, which are mostly under 10%, and quantile errors do not exceed ±30%. k2 shows errors that are increased in several cases up to ~50%, but are mostly low as well. However, in 4 cases, the deviations from the true value are substantial, because the quantiles do not include the 0% error value. k3 and k4 show large deviations from the true values in combinations 1, 7 and 9, with median errors up to several hundred percent with AIF type I. Furthermore, even though median errors are low in most of the other combinations, the parameters present with increased instability, reflected in the large quantiles of the errors.
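The evaluation of relative errors described above can be sketched as follows. The example assumes that the relative error is expressed in percent of the true input value and that the spread is summarized by the 25th and 75th quantile; the function names and the choice of the inclusive quantile method are illustrative assumptions.

```python
import statistics


def relative_errors(estimates, true_value):
    """Relative deviation of each fitted value from the true value, in percent."""
    return [100.0 * (p - true_value) / true_value for p in estimates]


def summarize_errors(errors):
    """Median and quartiles of a list of errors."""
    q25, median, q75 = statistics.quantiles(errors, n=4, method="inclusive")
    return {"q25": q25, "median": median, "q75": q75}


def quartiles_include_zero(errors):
    """True if the 0% error lies between the 25th and 75th quantile."""
    summary = summarize_errors(errors)
    return summary["q25"] <= 0.0 <= summary["q75"]


# Toy example: five fitted K1 values for a simulated true value of 0.10 per minute.
summary = summarize_errors(relative_errors([0.09, 0.10, 0.11, 0.12, 0.13], 0.10))
```

A parameter whose quartile range excludes zero, as reported for k2 in four cases, would return `False` from `quartiles_include_zero` and indicates a systematic bias rather than mere scatter.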

Discussion
The presented study investigated the feasibility of tracer kinetic analysis in dynamic 18F-FET PET scans of patients with recurrent high-grade glioma re-irradiated with carbon ions. To date, controversial reports exist on the value of SUV and SUV-to-background ratio (SUR) for grading of glioma 17,21,31,33. These discrepancies might be linked to differences in SUR depending on the time point of PET image acquisition after injection of the tracer, as demonstrated by Weckesser et al. 18. In this context, dynamic 18F-FET PET may provide more detailed information about tracer uptake compared to static image acquisition. However, dynamic scanning protocols are rarely used in clinical routine, due to the time-consuming acquisition and challenging interpretation of the data.
There are only few reports on detailed compartment model analysis in 18 F-FET PET in patients, and the question, which model best describes the kinetics of tracer uptake, metabolism and washout is still unanswered 23,24 . However, voxelwise tracer kinetic analysis holds the promise of unraveling tumor sub-volumes of different metabolic activity, which is especially relevant in high-grade glioma due to their infiltrative and heterogeneous nature. Compartment modeling can yield direct physiological correlates on the blood volume and perfusion (i.e. vascularization) and metabolic rate constants.
Retrospective data of 16 patients were analyzed by voxelwise fitting of measured time-activity curves (TAC) with three different compartment models: a standard one-tissue compartment model (1TCM), a simplified 1TCM that assumes the blood volume to be zero (s1TCM) and a two-tissue compartment model (2TCM). All models could be fitted to the data appropriately within an isocontour of 70%. The applied acquisition protocol was feasible for tracer kinetic analysis with compartment models, providing sufficient data quality in terms of temporal resolution and signal-to-noise ratio. We could find no evidence against application of either of the models on clinical patient data. The standard 1TCM and 2TCM fitted the data very well, with good fit quality and reasonably low $\chi^2_{\mathrm{red}}$; however, the 1TCM yielded failed fits with exceptionally high $\chi^2_{\mathrm{red}}$ in several cases. The simplified 1TCM with VB = 0 presented with poorer fits and the model did not properly represent the curve shapes. Both versions of the 1TCM yielded reasonable estimates on transfer rates K1 and k2. The 2TCM yielded the least stable parameter estimates, especially for k2. There were no hints towards overfitting of the data with the 2TCM, as in fact it yielded higher $\chi^2_{\mathrm{red}}$ (lower fit quality) than the standard 1TCM. Estimates on the blood volume VB were reasonable and showed similar patterns of increased and decreased values for the 1TCM and 2TCM, indicating that this parameter is not negligible and the simplified 1TCM might not be an adequate description of uptake kinetics. This is conceivable, as high-grade glioma present with enhanced vascularization and thus a large blood volume, whereas the fraction of blood vessels is approximately 2 to 4% in normal brain tissue 25.

Figure 8. Representative histograms of the distributions of parameter estimates K1, k2, k3 and k4 from fits with the 2TCM in an exemplary patient. Peak analysis of the histograms of all patients was used to identify the most common values of these parameters, which were used as input for data simulation. Two different AIFs, displayed in the bottom two plots, were used for both data simulation and subsequent fitting.
A large source of error in tracer kinetic modeling is the selection of an appropriate arterial input curve 34 . Ideally, the AIF would be derived from blood sampling during tracer administration; however, this invasive procedure is rarely feasible with patients. Furthermore, in retrospective settings, such an AIF is impossible to acquire. Therefore, image-derived input functions are often chosen, even though they are prone to error due to partial volume effects, motion during acquisition, etc. In this study, image-derived input functions were carefully extracted from carotid arteries, applying several suggested methods 34 like co-registration with MRI and visual inspection of TACs in included voxels, in order to minimize distortion of the arterial signal. Nonetheless, no ground truth AIF was available for comparison, and underestimation of arterial tracer concentration, especially in the peak region, can influence parameter estimates of kinetic modeling. No relation was found between the quality of fits and the AIF type, even though it was assumed that fits with AIFs of type III could yield poorer results due to an underestimation of the peak activity.
In many cases, temporal resolution is increased for the initial frames in order to more precisely sample the peak concentration of the AIF and hence, improve parameter estimates. In principle this is possible, as most PET dynamic data is acquired in list mode and afterwards sampled to respective time frames at reconstruction. However, higher temporal resolution always comes at the cost of reduced SNR, which on the other hand decreases quality of the extracted arterial TAC and thus, fit quality 35 . In contrast to MRI, the signal from the reconstructed PET images is the accumulation of all decays measured between time frames, thus forming an integral signal. This aspect potentially reduces the risk of underestimating the peak activity in the AIF.
It was found that the parameter VB provides diagnostic value for tumor grading, especially in the standard 1TCM. Overall, VB from the 1TCM was 7.5 ml/100 ml for grade III tumors and 11.6 ml/100 ml for grade IV tumors (p = 0.02518 after multiple-testing correction). Using the 2TCM yielded lower estimates for VB of 5.9 ml/100 ml in grade III glioma and 9.1 ml/100 ml in grade IV glioma. These estimates are comparable to values published in the literature that were derived using dynamic contrast-enhanced MRI (DCE MRI) 36,37. Prediction of tumor grade in recurrent glioma via imaging has a clinical impact, as these tumors may arise after a long latency period from lower-grade glioma and are often no longer biopsied. Therefore, differentiation of tumor grades based on dynamic 18F-FET PET scans could provide valuable input in predicting the outcome in the recurrent setting.
Parameter estimates were further correlated with tumor grade and OS. Parameter estimates from the 2TCM hinted towards correlation with patients' survival, further indicating the additional value of detailed tracer kinetic analysis in 18 F-FET PET. Additional studies are required to validate potential correlations between 2TCM parameters and OS.
With the 2TCM being more detailed but also less stable than the 1TCM, the question arises which model fits the data more adequately and thus better represents the underlying tracer kinetics. Therefore, accuracy and robustness of the more complex 2TCM were investigated by simulating TACs of 10 tissue types and fitting these synthetic data with the 2TCM. Since the literature provides no complete set of reference values for the exchange rates K 1 , k 2 , k 3 , k 4 and V B , we determined 10 common parameter combinations in our patient data and used these, together with measured patient AIFs, to simulate time-activity curves at the same data quality as the patient PET scans.
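The simulation step can be sketched as follows. This is a minimal illustration and not the implementation used in this study: the input function, the parameter values and the noise level are hypothetical, and the blood volume term is assumed to enter as C(t) = (1 − V B)(C 1 + C 2) + V B C a, which is one common formulation of the 2TCM. The differential equations are integrated with simple explicit Euler steps.

```python
import math
import random

def aif(t):
    """Hypothetical arterial input (arbitrary units); t in minutes."""
    return 100.0 * t * math.exp(-t / 0.5) + 5.0 * (1.0 - math.exp(-t)) if t > 0 else 0.0

def simulate_2tcm(K1, k2, k3, k4, VB, t_end=40.0, dt=0.01):
    """Integrate the 2TCM with explicit Euler steps; returns (times, tissue curve)."""
    c1 = c2 = 0.0  # tracer concentration in the first and second tissue compartment
    times, curve = [], []
    n_steps = int(round(t_end / dt))
    for i in range(n_steps + 1):
        t = i * dt
        ca = aif(t)
        times.append(t)
        # Assumed measurement model: tissue signal plus blood volume fraction
        curve.append((1.0 - VB) * (c1 + c2) + VB * ca)
        dc1 = K1 * ca - (k2 + k3) * c1 + k4 * c2
        dc2 = k3 * c1 - k4 * c2
        c1 += dt * dc1
        c2 += dt * dc2
    return times, curve

def add_noise(curve, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma to each sample."""
    return [c + rng.gauss(0.0, sigma) for c in curve]

rng = random.Random(0)  # fixed seed for reproducibility
# Illustrative parameter values only; these are not the values listed in Table 3
times, clean = simulate_2tcm(K1=0.10, k2=0.20, k3=0.05, k4=0.02, VB=0.06)
noisy = add_noise(clean, sigma=0.5, rng=rng)
```

In a full simulation study, the noisy curve would additionally be resampled to the acquisition time frames and then fitted with a nonlinear least-squares routine, so that estimated parameters can be compared with the values used for simulation.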
Fit quality and stability as well as errors on parameter estimates were quantified in order to assess the reliability of this model for data of the same quality (SNR, temporal resolution) as the patient scans. Results showed sufficient fit quality and precision of parameter estimates. K 1 and V B are parameters with high stability and accuracy. k 3 and k 4 exhibit large errors at large values (~0.20 min −1 ) but are otherwise robust estimates. k 2 proved to be the least stable and least reliable parameter. Interestingly, and contrary to expectations, AIFs of type II, where the initial frame already includes enhanced activity instead of the zero baseline, provided better fit quality and smaller errors. It was initially hypothesized that a zero-activity first frame is required for a proper description of the peak activity.
Table 3. Parameter values for K 1 , k 2 , k 3 , k 4 and V B from the 2TCM used for simulation of time-activity curves. Each parameter combination was used with either one of the two AIF types (Fig. 6) to simulate 10000 curves. Gaussian random numbers were added to the simulated curves in order to simulate noise at the same level as in the measured patient TACs. These simulated data were in turn fitted with the 2TCM in order to evaluate precision and robustness of parameter estimates from the 2TCM in data of our quality (temporal resolution, sampling, SNR).
Several authors have studied the value of dynamic 18 F-FET PET for grading and prognosis in both low- and high-grade glioma 9,10,20,22,38 . Pöpperl et al. showed the superior value of dynamic acquisition protocols over static ones for grading in glioma patients 21 . However, for analysis of dynamic scans, most authors rely on qualitative descriptions such as the overall tumor TAC shape, or on the determination of semi-quantitative values such as SUV at different time points 22 or the time-to-peak activity 9,20 .
To our knowledge, this study provides the first quantitative report on detailed parameter estimates from these compartment models. Thiele et al. performed tracer kinetic modeling with the 1TCM and 2TCM, amongst other analysis techniques, on dynamic 18 F-FET PET data of grade IV glioma patients 23 . However, correlations of parameters from compartmental analysis with clinical outcome were missing due to the low stability of their results. Furthermore, the sinus sagittalis was used for derivation of image-based arterial TACs, which only gives the tracer concentration in venous blood. In addition, neither the fit quality nor the mentioned instability nor the reliability of parameter estimates was further quantified. In contrast, the results of the current analysis provide an estimate of the precision and robustness of tracer-kinetic analysis with the 2TCM by evaluating fits to simulated data.
Recently, pharmacokinetic modeling of 18 F-FET tracer kinetics was reported in a syngeneic orthotopic preclinical model established by injecting F98 glioblastoma cells into the right hemisphere of Fischer rats 39 . The authors found a favorable performance of the 2TCM in terms of the model selection criteria AIC, F test and residual plots. This finding is in accordance with our study, which showed that the 2TCM presents with the lowest cAIC and thus appears to be the best representation of the data. However, there are some differences between their study and ours. First, F98 is a carcinogen-induced (nitrosourea-mediated mutagenesis) rat glioblastoma model that might differ from human glioblastoma in patients not only with respect to the molecular characteristics affecting radiosensitivity (e.g., presence of BRCA1 mutation) but also in tumor invasion and angiogenesis pattern 40,41 . A second pivotal difference between the two studies is that they do not consider the parameter blood volume (V B ) in their compartment models. Indeed, we found that neglecting the blood volume in the 1TCM (simplified 1TCM) yields poor results. High-grade gliomas are highly vascularized, which necessitates inclusion of V B when modeling these tumors, at least in patients. Together, our data underscore the importance of considering V B in dynamic modeling. Moreover, we performed simulation analyses to investigate accuracy and robustness of parameter estimates. Results showed that the 2TCM is in principle feasible and accurate, however, at the cost of reduced fit stability.
Figure 7 (a) shows goodness of fit in terms of reduced χ² for each of the 10 different parameter combinations (with both AIFs). Due to outliers at very high values, the y-axis in the left plot is displayed on a logarithmic scale. For more detailed evaluation, the right plot shows the distribution of χ² on an enlarged linear scale. Apart from two exceptions, both AIF types yielded similar fit quality. Figure 7 (b) shows the distribution of absolute errors in each of the four model parameters K 1 , k 2 , k 3 and k 4 . The absolute error was calculated as the parameter estimate minus the true value used for simulation. While K 1 yields good accuracy and robustness, k 2 showed the lowest stability of all parameters. k 3 and k 4 showed tendencies towards positive errors, indicating a systematic overestimation of these two parameters.
Table 4. Median relative errors on parameter estimates from fitting simulated data of 10 different parameter combinations (Table 3) and two different AIFs with the 2TCM, for each of the model parameters K 1 , k 2 , k 3 , k 4 and V B . The 25th and 75th quantiles of the errors are denoted in square brackets under the median value. Relative errors were calculated by normalizing the absolute error to the true parameter value used for simulation and multiplying by 100. The quantiles of k 2 show that, even though errors on this parameter are low in most cases, it presents with high instability. k 3 and k 4 show large relative errors, which is likely due to their low original values.
Overall, the standard 1TCM provided the best results in terms of feasibility and robustness with an appropriate description of the tracer kinetics. This is in concordance with results reported in animal studies 24 .
Additionally, K 1 from both the 1TCM and the 2TCM correlated with overall patient survival.
The blood volume V B appears to be overestimated by the 1TCM compared to values reported previously in the literature 36 . Precision and robustness of the estimates from the 1TCM should be further investigated in a larger population, e.g. with a simulation study similar to the one we conducted on the 2TCM.
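For such a simulation study, the error summary reported for the 2TCM (median relative error with 25th and 75th quantiles) could be computed along the following lines. This is a minimal sketch; the fitted values are hypothetical placeholders and the function names are chosen for illustration only.

```python
import statistics

def relative_errors(estimates, true_value):
    """Relative error in percent: (estimate - true value) / true value * 100."""
    return [(e - true_value) / true_value * 100.0 for e in estimates]

def summarize(errors):
    """Return the median together with the 25th and 75th quantile of the errors."""
    q25, _, q75 = statistics.quantiles(errors, n=4)
    return statistics.median(errors), q25, q75

# Hypothetical fitted K1 values for one simulated tissue type (true K1 = 0.10)
fitted_k1 = [0.09, 0.10, 0.11, 0.12, 0.13]
median_err, q25, q75 = summarize(relative_errors(fitted_k1, 0.10))
```

Reporting the quantiles alongside the median keeps the summary informative for parameters with occasional extreme outliers, where a mean and standard deviation would be dominated by a few failed fits.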
Fitting dynamic 18 F-FET PET data with a 2TCM appears to hold valuable information on tumor heterogeneity and substructure. The 2TCM provided the best model representation according to the Akaike information criterion 28,29 . Estimated V B from the 2TCM yielded reliable values with good accuracy and robustness and showed potential value for grading, as a trend towards higher values for K 1 and V B was observed in grade IV tumors compared to grade III tumors. k 4 appears to correlate with progression-free survival. Larger patient cohorts need to be analyzed to further investigate these findings. Furthermore, the combination of parameters from this more complex model with physiological measures derived from other imaging modalities (e.g. dynamic contrast-enhanced MRI) could provide additional, relevant information on tumor tissue composition.
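Model comparison by a corrected Akaike information criterion can be sketched as follows. The exact variant used in this study is not restated here, so the sketch assumes the common least-squares form AICc = n ln(RSS/n) + 2k + 2k(k + 1)/(n − k − 1), with n data points, k fitted parameters and residual sum of squares RSS. The RSS values and the number of frames in the example are hypothetical.

```python
import math

def aicc(rss, n_points, n_params):
    """Corrected AIC for a least-squares fit (assumed form, see lead-in)."""
    if n_points - n_params - 1 <= 0:
        raise ValueError("too few data points for the small-sample correction")
    aic = n_points * math.log(rss / n_points) + 2 * n_params
    return aic + 2 * n_params * (n_params + 1) / (n_points - n_params - 1)

# Hypothetical fit results for one voxel TAC with 30 time frames
score_1tcm = aicc(rss=12.0, n_points=30, n_params=3)  # K1, k2, VB
score_2tcm = aicc(rss=6.0, n_points=30, n_params=5)   # K1, k2, k3, k4, VB

# The model with the lower score is preferred
preferred = "2TCM" if score_2tcm < score_1tcm else "1TCM"
```

The penalty term grows with the number of parameters, so the 2TCM is only preferred when its reduction in RSS outweighs the two additional rate constants.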

Conclusion
The presented study provides a first overview of the diagnostic and prognostic value of compartmental analysis for dynamic 18 F-FET PET in recurrent high-grade glioma patients. The models revealed an improved resolution of tumor substructure (e.g. vascularization). Despite the limited size of the cohort, correlation of parameters with tumor grade and overall survival yielded promising results, which warrant further exploration. It remains to be investigated whether our findings can be translated to primary, untreated tumors. The feasibility of the analysis, however, does not depend on this, so that such a study can be conducted with the methods presented here.