Investigation of image-derived input functions for non-invasive quantification of myelin density using [11C]MeDAS PET

Multiple sclerosis (MS) is an inflammatory demyelinating disease. Current treatments are focussed on immune suppression to modulate pathogenic activity that causes myelin damage. New treatment strategies are needed to prevent demyelination and promote remyelination. Development of such myelin repair therapies requires a sensitive and specific biomarker for efficacy evaluation. Recently, it has been shown that quantification of myelin density is possible using [11C]MeDAS PET. This method, however, requires arterial blood sampling to generate an arterial input function (AIF). As the invasive nature of arterial sampling will reduce clinical applicability, the purpose of this study was to assess whether an image-derived input function (IDIF) can be used as an alternative to facilitate routine clinical use. Six healthy controls and 11 MS patients underwent MRI and [11C]MeDAS PET with arterial blood sampling. The application of both population-based whole blood-to-plasma conversion and metabolite corrections was assessed for the AIF. Next, summed images of the early time frames (0-70 seconds) and the frame with the highest blood-brain contrast were used to generate IDIFs. IDIFs were created using either the hottest 2, 4, 6 or 12 voxels, or an iso-contour of the hottest 10% voxels of the carotid artery. This was followed by blood-to-plasma conversion and metabolite correction of the IDIF. The application of a population-based metabolite correction of the AIF resulted in high correlations of tracer binding (Ki) within subjects, but variable bias across subjects. All IDIFs had a sharper and higher peak in the blood curves than the AIF, most likely due to dispersion during blood sampling. All IDIF methods resulted in similarly high correlations within subjects (r=0.95-0.98), but highly variable bias across subjects (mean slope=0.90-1.09).
Therefore, both the use of population-based blood-to-plasma and metabolite corrections and the generation of the image-derived whole-blood curve resulted in substantial bias in [11C]MeDAS PET quantification, due to high inter-subject variability. Consequently, when unbiased quantification of [11C]MeDAS PET data is required, an individual AIF needs to be used.


Introduction
Multiple sclerosis (MS) is the most common neurodegenerative disease amongst young adults, having both inflammatory and demyelinating aspects (Ramagopalan et al., 2010). Inflammation causes the formation of lesions, which are characterized by demyelination and neurodegeneration (Traugott et al., 1983). Myelin is wrapped as a sheath around neuronal axons and has neuroprotective, neurotransmission-enhancing, and axonal metabolic support functions (Morrison et al., 2013).
Abbreviations: 2T3k, irreversible 2 tissue compartment model; AIF, arterial input function; FOV, field of view; HC, healthy controls; IDIF, image-derived input function; Ki, net influx rate; MRI, magnetic resonance imaging; MS, multiple sclerosis; PET, positron emission tomography; ROI, region of interest; SE, standard error; SUV, standardized uptake value; TAC, time activity curve; VOI, volume of interest.
In vivo imaging techniques would be ideal for direct characterization of new remyelination therapies. Several magnetic resonance imaging (MRI) techniques for assessing myelin have been developed. A recent study (van der Weijden et al., 2021a), however, indicated that these MRI methods are not sensitive enough for quantitative myelin imaging. Positron emission tomography (PET), a widely used clinical imaging modality that is highly sensitive and inherently quantitative, may be an alternative technique for myelin imaging (Auvity et al., 2020; Wu et al., 2010). To date, several PET tracers have been developed with the ability to depict myelin content. In fact, [11C]PiB, originally developed as an amyloid tracer, has been repurposed for myelin imaging in clinical studies with marginal success (Bodini et al., 2016). In animal studies, however, [11C]MeDAS appeared to be a better tracer for imaging of myelin than [11C]PiB. Recently, [11C]MeDAS was successfully used in a first-in-man study (van der Weijden et al., 2022).
At present, quantification of myelin density using [11C]MeDAS PET (or any other myelin tracer) requires arterial blood sampling. As this is an invasive and labour-intensive procedure (e.g., measurement of radiolabelled metabolites), an imaging method that does not need arterial blood sampling is required for routine clinical use.
Arterial blood sampling could be avoided by deriving the arterial input function from the PET scan itself. Several studies have been conducted to explore the feasibility of a so-called image-derived input function (IDIF) for various PET tracers (Backes et al., 2009; Bahri et al., 2017; Chen et al., 1998, 2007; Croteau et al., 2010; Huisman et al., 2012; Islam et al., 2017; Kang et al., 2018; Mabrouk et al., 2014; Mourik et al., 2009; Sanabria-Bohórquez et al., 2003; Schain et al., 2013; Zanderigo et al., 2018; Zhou et al., 2011, 2012). The purpose of the present study is to assess whether such an IDIF can indeed be used to substitute arterial blood sampling and to determine potential deviations when used for [11C]MeDAS PET analysis.

Subjects
Eleven MS patients, diagnosed according to the revised McDonald Criteria (Thompson et al., 2018), and 6 healthy volunteers (HC) were included in this study. Inclusion criteria were an age of at least 18 years and, in case of MS, a diagnosis of progressive MS. Pregnant or breastfeeding subjects, subjects with a previous adverse reaction to gadolinium contrast agents, subjects suffering from claustrophobia, and subjects diagnosed with cerebrovascular disease were excluded from the study. Other exclusion criteria were a clinical history of diminished renal or liver function, participation in an investigational medication trial, and presence of magnetisable materials in the body. Written informed consent was obtained from all study participants. The study was approved by the Medical Ethics Review Committee of the University Medical Center Groningen, Netherlands (METc no. 2018/450, Trial register: Trial NL7262).

Data acquisition
All participants underwent a 60 min dynamic [11C]MeDAS-PET scan including arterial blood sampling. [11C]MeDAS emission scans were acquired on a Biograph Vision PET/CT scanner (Siemens Healthineers, Erlangen, Germany), after first performing a low-dose CT scan for attenuation correction. Individual doses of [11C]MeDAS were synthesized as described previously (van der Weijden et al., 2022). [11C]MeDAS (195 ± 33 MBq) was injected intravenously 10 s after the start of the [11C]MeDAS scan. PET data were corrected for attenuation, scatter, random coincidences, decay and dead time and reconstructed into 26 frames including a 10 s background frame (1 × 10, 10 × 5, 1 × 10, 2 × 30, 3 × 60, 2 × 150, 4 × 300 and 3 × 600 s). Continuous arterial blood sampling was performed at a rate of 5 mL·min⁻¹ for the first 5 min and at a rate of 1.66 mL·min⁻¹ for the remainder of the scan, using an online radioactivity detection system (Veenstra Instruments, Joure, The Netherlands (5 MS patients), or Twilite, Swisstrace, Menzingen, Switzerland (11 subjects)). In addition, 5 manual blood samples of 5 mL each were collected at 10, 20, 30, 45 and 60 min after tracer injection to measure plasma-to-whole blood ratios and plasma metabolite fractions (see supplementary material for a full description).
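The frame definition above can be checked with simple arithmetic. The following is a minimal sketch, assuming contiguous frames that start at scan time 0; the function name `frame_edges` is illustrative and not part of the software used in the study.

```python
# Frame protocol from the text as (number of frames, frame duration in s).
PROTOCOL = [(1, 10), (10, 5), (1, 10), (2, 30), (3, 60), (2, 150), (4, 300), (3, 600)]


def frame_edges(protocol):
    """Return (start, end) times in seconds, assuming contiguous frames from t = 0."""
    edges = []
    start = 0
    for count, duration in protocol:
        for _ in range(count):
            edges.append((start, start + duration))
            start += duration
    return edges


frames = frame_edges(PROTOCOL)
n_frames = len(frames)          # number of reconstructed frames
total_duration = frames[-1][1]  # end time of the last frame in seconds
```

Under these assumptions the protocol expands to 26 frames covering 3610 s, i.e. the 10 s background frame followed by 60 min of acquisition after injection.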

Generation of the arterial input function
The plasma-to-whole blood ratio was calculated for each manual sample, and subsequently the potential use of an average plasma-to-whole blood ratio per subject for obtaining plasma curves was investigated. Next, parent fractions (the fractions of intact tracer) measured in the manual plasma samples were fitted to a Hill function (Gunn et al., 1998). Each individual total plasma curve was then multiplied by the corresponding fitted Hill function, resulting in an individual metabolite-corrected plasma curve, which is referred to as the full arterial input function (AIF).
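The two correction steps described above can be sketched as follows. The exact parameterisation of the Hill function is not given in the text, so the form used here, 1 - a·t^b / (c + t^b), is an assumption, as are all function names and numerical values; the fitting step itself is omitted.

```python
def hill_parent_fraction(t, a, b, c):
    """Assumed Hill-type parent fraction: 1 - a*t**b / (c + t**b); equals 1 at t = 0."""
    if t <= 0:
        return 1.0
    tb = t ** b
    return 1.0 - a * tb / (c + tb)


def metabolite_corrected_plasma(times, whole_blood, plasma_ratio, hill_params):
    """Scale whole blood to plasma, then keep only the parent (intact tracer) part."""
    a, b, c = hill_params
    return [
        wb * plasma_ratio * hill_parent_fraction(t, a, b, c)
        for t, wb in zip(times, whole_blood)
    ]


# Hypothetical values for illustration only (not study data).
times = [0.0, 600.0, 1800.0, 3600.0]  # s after injection
whole_blood = [0.0, 4.0, 2.0, 1.0]    # kBq/mL
aif = metabolite_corrected_plasma(times, whole_blood, 1.5, (0.8, 1.0, 600.0))
```

In practice the Hill parameters would be obtained by fitting the measured parent fractions of each subject (or of the population) before this multiplication is applied.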

Data analysis using an arterial input function
Arterial whole blood curves, plasma-to-whole blood ratios, and parent fractions were assessed for differences between HC and MS. Next, a population-based plasma-to-whole blood ratio and population-based parent fractions were calculated. The population-based plasma-to-whole blood ratio and the population-based metabolite fractions were used to convert the arterial whole blood curves into alternative metabolite-corrected plasma curves. A total of 4 input functions for kinetic modelling were explored: (1) whole blood curve corrected for individual plasma-to-whole blood ratios and individual parent fractions, (2) whole blood curve corrected for individual plasma-to-whole blood ratios and population-based parent fractions, (3) whole blood curve corrected for population-based plasma-to-whole blood ratios and individual parent fractions, (4) whole blood curve corrected for population-based plasma-to-whole blood ratios and population-based parent fractions.
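The four input functions form a 2 × 2 design (individual or population-based ratio, individual or population-based parent fraction). A minimal sketch of that enumeration is given below; the function name, the data layout and the numbers are hypothetical and only illustrate the bookkeeping.

```python
from itertools import product


def build_input_functions(whole_blood, ratio, parent_fraction):
    """Return the four corrected plasma curves keyed by method number (1-4).

    ratio: dict with scalar values under "individual" and "population".
    parent_fraction: dict with per-sample lists under the same two keys.
    """
    curves = {}
    # Order matches the text: the ratio source varies slowest.
    combos = product(("individual", "population"), repeat=2)
    for method, (ratio_src, pf_src) in enumerate(combos, start=1):
        curves[method] = [
            wb * ratio[ratio_src] * pf
            for wb, pf in zip(whole_blood, parent_fraction[pf_src])
        ]
    return curves


# Hypothetical two-sample example (not study data).
curves = build_input_functions(
    whole_blood=[10.0, 5.0],
    ratio={"individual": 2.0, "population": 1.5},
    parent_fraction={"individual": [1.0, 0.5], "population": [1.0, 0.4]},
)
```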
[11C]MeDAS PET images were corrected for movement using the PMOD (v4.1, Zurich, Switzerland) software package, then co-registered to the T1-weighted MRI of the same individual, followed by the generation of TACs for anatomically generated ROIs (Hammers et al., 2003), as previously described (van der Weijden et al., 2022).
The AIF (method 1) and the three AIF-derived input functions (methods 2, 3, and 4) were then corrected for the time delay between the sampling site (wrist) and the PET measurements (brain). Delay correction was performed by fitting the front part of the whole brain grey matter (GM) tissue TAC peak with a reversible one-tissue (1T2k) model. The delay-corrected plasma curves were then used as input for kinetic modelling of the tissue TACs using the irreversible two-tissue compartment model (2T3k), as previously described (van der Weijden et al., 2022). The whole blood curve was used as input for estimating the blood volume fraction in the brain. Results from the kinetic analysis using input functions 2, 3, and 4 were compared with those obtained using input function 1 to assess whether population-based plasma-to-whole blood and metabolite corrections are a viable alternative for individual corrections.
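For reference, the net influx rate of an irreversible two-tissue compartment model follows from its rate constants as Ki = K1·k3 / (k2 + k3). The sketch below only illustrates this relation; the function name and the rate constants are hypothetical, and the model fitting itself (performed in PMOD in this study) is not reproduced.

```python
def net_influx_rate(K1, k2, k3):
    """Net influx rate Ki = K1 * k3 / (k2 + k3) for an irreversible two-tissue model."""
    if k2 + k3 <= 0:
        raise ValueError("k2 + k3 must be positive")
    return K1 * k3 / (k2 + k3)


# Hypothetical rate constants, for illustration only.
ki = net_influx_rate(K1=0.4, k2=0.3, k3=0.1)
```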

Generation of the image-derived whole blood curves
Regions of interest (ROIs) for the extraction of the whole blood time activity curve (TAC) were defined on coronal slices containing the carotid arteries in the neck. These ROIs were used to avoid contamination (spill-in) of radioactivity from the brain. ROIs were defined manually on the summed PET images of early time frames (0 to 70 s after injection) and on the frame with the highest blood-brain contrast. From both images, ROIs were defined for the 2, 4, 6 and 12 hottest voxels, and an additional ROI was based on an isocontour of voxels with an intensity ≥ 90% of the highest voxel intensity. Subsequently, these ROIs were transposed onto the dynamic PET images corrected for motion (van der Weijden et al., 2022) to obtain a total of 10 different image-derived whole blood curves.
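The two voxel selection rules can be sketched as below. This is an illustration under stated assumptions rather than the procedure used in the study: voxels are treated as a flattened list within a manually chosen carotid search region, no spatial contiguity is enforced, the ROI value per frame is taken as the mean of the selected voxels, and all names and numbers are hypothetical.

```python
def hottest_voxels(values, n):
    """Indices of the n highest-intensity voxels."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return sorted(order[:n])


def isocontour(values, fraction=0.9):
    """Indices of voxels with intensity >= fraction * maximum intensity."""
    threshold = fraction * max(values)
    return [i for i, v in enumerate(values) if v >= threshold]


def roi_tac(dynamic, roi):
    """Mean ROI intensity per frame; dynamic is a list of per-frame voxel lists."""
    return [sum(frame[i] for i in roi) / len(roi) for frame in dynamic]


# Hypothetical summed early image of the search region (arbitrary units).
summed = [1.0, 9.5, 3.0, 10.0, 8.0, 9.1]
roi2 = hottest_voxels(summed, 2)
roi_iso = isocontour(summed)

# Hypothetical two-frame dynamic series over the same voxels.
dynamic = [[0.0, 2.0, 0.5, 4.0, 1.0, 2.0], [0.2, 6.0, 1.0, 8.0, 3.0, 5.0]]
tac = roi_tac(dynamic, roi2)
```

Applying five selection rules (2, 4, 6, 12 hottest voxels and the isocontour) to two reference images gives the 10 image-derived whole blood curves mentioned above.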

Generation of image-derived input function and data analysis using an image-derived input function
To assess the potential of image-derived input functions, the image-derived whole blood TACs were multiplied by individual plasma-to-whole blood ratios to obtain the corresponding (image-derived) total plasma curves. The image-derived plasma curves were subsequently multiplied with a Hill function that was fitted to the parent fractions derived from the manual blood samples. The resulting 10 variants of the IDIF were then corrected for time delay between PET measurements in the neck and PET measurements in the brain. This was achieved by fitting the initial phase (0 to 220-280 s post injection) of the whole brain GM tissue TAC using a 1T2k model with an additional delay parameter. The delay-corrected whole blood and metabolite-corrected plasma curves were then used to fit brain VOI TACs to a 2T3k model, and estimated Ki values were compared with AIF-derived results. A similar method to account for delay was used for the AIF. As both IDIF and AIF were independently corrected for delay, curves were not shifted to match IDIF and AIF peaks.

Statistical analysis
The net influx rate constant (Ki) obtained with each alternative input function (AIF with population-based corrections, or IDIF) was compared with that obtained with the gold standard AIF (i.e. AIF with individual corrections for plasma-blood ratios and parent fractions). This was performed using correlation analysis, in which a high coefficient of determination (R2) indicates good correspondence with the AIF-derived quantification, and linear regression, of which the slope and intercept reflect the bias of the measurements. In addition, the intraclass correlation coefficient (ICC) was calculated, as it directly incorporates the bias in its value. Any Ki estimate derived from kinetic modelling with a percentage standard error (%SE) > 25% was considered unreliable and omitted from the correlation analyses (see supplementary Table 1). Depending on normality, as assessed using the Kolmogorov-Smirnov test, either parametric (e.g. t-test, ANOVA) or non-parametric tests (e.g. Kruskal-Wallis, Friedman) were used for assessing differences in blood characteristics.
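The regression part of this comparison can be sketched as follows. This is a minimal illustration, assuming ordinary least squares and assuming that a regional pair is dropped when either estimate exceeds the %SE limit; the function names and the numbers are hypothetical, and the ICC is not included because its variant is not specified in the text.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope, intercept and R^2 of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2


def reliable_pairs(ref, ref_se, alt, alt_se, max_se=25.0):
    """Keep regional Ki pairs whose %SE is at most max_se for both estimates (assumed rule)."""
    return [
        (r, a)
        for r, r_se, a, a_se in zip(ref, ref_se, alt, alt_se)
        if r_se <= max_se and a_se <= max_se
    ]


# Hypothetical regional Ki values and %SE (not study data).
ref = [0.10, 0.20, 0.30, 0.40]
alt = [0.09, 0.18, 0.27, 0.50]
pairs = reliable_pairs(ref, [5, 8, 10, 12], alt, [6, 9, 11, 40])
slope, intercept, r2 = linear_fit([p[0] for p in pairs], [p[1] for p in pairs])
```

In this toy example the last region is excluded because of its high %SE; the remaining pairs give a slope of about 0.9 with R2 close to 1, i.e. a high correlation combined with a 10% proportional bias.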

Subjects
Complete blood data were available for 6 HC and 9 MS patients (Table 1). For one MS patient, radiolabelled metabolites could not be measured due to malfunctioning equipment and for another MS patient, insertion of the arterial cannula was not successful. No significant differences were observed between HC and MS patients regarding the area under the whole blood standardized uptake value (SUV) curve (AUC), the AUC of the whole blood SUV curve from 500 to 3600 s, the plasma-to-whole blood ratio, and the metabolite fractions (Table 2; Fig. 1). No significant differences were observed between plasma-to-whole blood ratios over time using the Friedman test (Q = 8.11, p = 0.085).

Effect of blood corrections
The effects of population-based corrections on the whole blood curve are displayed in Fig. 2. When a population-based plasma-blood ratio and population-based metabolite corrections were applied, the curves were highly similar in shape and peak to the AIF. Subsequently, the correspondence between Ki obtained with the full AIF (individual plasma-blood ratio and parent fraction corrections) and Ki obtained with population-based plasma-blood ratio and/or metabolites was evaluated (Fig. 3, Table 3, supplementary Table 2). When the population-based plasma-blood ratio and metabolite corrections were used, overall correlations with the results from the full AIF were high within a single subject, but slopes varied across subjects. To determine the cause of this variable slope, the effects of using either the population-based plasma-to-blood ratio with individual metabolites or individual plasma-blood ratios with population-based metabolites were investigated. These comparisons showed that, irrespective of whether plasma-to-blood ratios or metabolite corrections were population based, high within-subject correlations with variable slopes were obtained. However, the bias was lower when individual metabolite corrections were applied. This indicates that individualized metabolite curves are essential for unbiased quantification. The bias observed when a population-based plasma-to-blood ratio was applied to the whole blood curve seems minimal (Fig. 4A), except for HC2 and MS9, whose plasma-to-blood ratios (1.69 and 1.90, respectively) strongly deviated from the population mean (1.56). A widespread subject-specific bias was observed when population-based metabolite corrections were applied (Fig. 4B and 4C).
This indicates that differences in tracer metabolism between subjects might be too variable to use a population-based average.

Fig. 2. Blood time-activity curves of a typical subject (MS1) based on arterial blood sampling. The whole blood curve (whole blood) was converted into a plasma curve, using either individual (indiv plasma) or population-based (pop plasma) plasma-to-whole blood ratios, and corrected with either individual (indiv plasma) or population-based metabolite curves (pop meta). For better visualisation of the peaks of the curves, only an interval of 0 to 300 s is displayed in the insert.

Image-derived blood curves
Peaks of the image-derived whole blood curves were reached earlier than those in the externally measured arterial blood curves (Fig. 5). Furthermore, the peak of each IDIF was narrower than that of the corresponding AIF, indicating less dispersion. At later time points, the tail of the AIF was slightly higher than that of the IDIF curves (Fig. 5). For blood ROIs extracted using either the summation of the 0-70 s frames or the frame with the highest blood-brain contrast, the blood curve derived from the 2 hottest voxels showed the highest peak, followed by those derived from the 4 hottest, 6 hottest and 12 hottest voxels, and finally the isocontour, suggesting some partial volume effects. Significant differences were observed between the AIF and IDIF regarding both the AUC of the entire whole blood SUV curve and that of the 500-3600 s portion of the whole blood SUV curve.

Fig. 3. Correlation analysis between regional Ki values obtained using the full arterial input function (AIF) and input functions based on simplified correction methods over 26 brain regions. HC = healthy control, LOI = line of identity, MS = multiple sclerosis patient.

Table 3. Correspondence of regional Ki values estimated using the 2T3k model with individual arterial input functions (AIF) and those estimated using the same model with input functions based on various population corrections. (Columns: whole blood correction; correspondence with AIF.)

Quantification of brain regions with [ 11 C]MeDAS-PET using IDIF
Ki values obtained using different IDIFs or the AIFs (both with individual calibrations for plasma-to-whole blood and metabolites) were highly correlated on an individual subject level (mean R2 = 0.95-0.98; Fig. 6; supplementary Tables 3, 4, 5 and 6). The average slope of the correlation plots was close to the line of identity, but there were substantial inter-subject differences (Fig. 6). This is further supported by the Bland-Altman plot in Fig. 7, showing high bias of the 2T3k Ki using IDIF as compared with 2T3k Ki using AIF. The much larger differences observed for MS9 (Figs. 6 and 7) are likely due to the fact that the IDIF whole blood curve differed more from the AIF whole blood curve than for the other subjects, but no specific reason for this deviating behaviour could be identified. Correlation analysis for the microparameters K1, k2 and k3 confirmed a highly variable bias in K1 and k2 estimations, whereas bias in k3 was more comparable between subjects (Supplemental Table 7; Supplemental Fig. 2).
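The quantities shown in a Bland-Altman plot can be computed as sketched below. This assumes the conventional definition (mean difference, with limits at the mean ± 1.96 standard deviations of the differences); whether the dashed lines in the figures follow exactly this definition is not stated in the text, and the function name and values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(reference, alternative):
    """Return (bias, lower limit, upper limit) of alternative minus reference."""
    diffs = [a - r for r, a in zip(reference, alternative)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - spread, bias + spread


# Hypothetical regional Ki values for one subject (not study data).
ki_aif = [0.10, 0.20, 0.30]
ki_idif = [0.12, 0.21, 0.34]
bias, lower, upper = bland_altman(ki_aif, ki_idif)
```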

Discussion
Myelin distribution in brain regions can be quantified using [11C]MeDAS PET to characterize MS lesions. However, accurate quantification of [11C]MeDAS uptake requires pharmacokinetic modelling using an AIF. Arterial blood sampling is clinically impractical as the procedure is intrusive for patients, increases radiation exposure of medical personnel, and requires specialized equipment for blood analysis. Therefore, the aim of this study was to assess the feasibility of substituting the AIF by an IDIF. The use of an IDIF resulted in high within-subject correlations with AIF-derived quantification of tracer binding in the brain. However, the slope of these correlations was highly variable across subjects, indicating a high bias across subjects, which suggests that an IDIF is not suitable for absolute quantification.
Fig. 4. Bland-Altman plots indicating differences between 2T3k Ki estimates using AIF and alternative input functions over 26 brain regions. The red line indicates the average difference, and the dashed black lines the 95% confidence intervals.
The methods used to generate a whole blood curve from PET images had little effect on the outcome. Similar results were obtained when a whole blood curve was derived from the frame with the highest blood-brain contrast, or when the IDIF was generated from the summed images acquired between 0 and 70 s after tracer injection. Different ROI sizes used to delineate the blood pool in the carotids (e.g. 2-12 voxels vs. isocontour) seemed to have a minor effect on the results. The use of a smaller ROI could lead to reduced partial volume effects, i.e. less underestimation of the peaks of the blood curves. However, smaller ROIs are also associated with higher noise levels, potentially resulting in overestimation of the peaks of the blood curves. Apparently, the overall effects of these factors on the estimated Ki values are small, as comparable correlations, slopes, intercepts, and ICC between IDIF- and AIF-derived Ki values were observed for the various ROI sizes. Potentially, partial volume correction (PVC) might enhance the accuracy of the image-derived whole blood curve. This could also minimize inter-subject variability. Without PVC, quantification using the IDIF with individual blood samples does correlate well, but exhibits high variations in the slope as compared with quantification using AIF. In addition, PVC may increase the peak height of the blood curve due to correction for spill-out effects in the early frames and decrease the tail height of the blood curve due to correction for spill-in effects in the late frames. The image-derived whole blood curve already showed a peak that is substantially higher than that derived from the arterial blood sampling, and PVC would further increase the difference between both blood curves. In fact, a higher and narrower peak of the image-derived whole blood curve might depict blood dynamics that are closer to the ground truth than the blood curve derived from arterial blood sampling, because it does not suffer from dispersion effects in sampling lines that are an inherent flaw of arterial blood sampling. Although dispersion correction of the AIF results in a better resemblance with the ground truth, the peak has minimal effect on Ki estimation, and differences in peak heights should therefore have limited effects on Ki. However, PVC might reduce the spill-in effects on the tail of the curve and thus reduce the bias in the Ki estimates.
Fig. 5. Time delay corrected whole blood time-activity curves of a typical subject (MS3) obtained from either arterial blood sampling or the carotid arteries in the PET images using ROIs of different sizes, drawn on the summed images acquired 0-70 s after tracer injection. For better visualisation, the peaks of the curves are displayed in the insert.
Fig. 6. Correlation between regional Ki values obtained using IDIF and AIF over 26 brain regions. The IDIF was derived from an isocontour on the summed images from 0 to 70 s, and was corrected using an individual plasma-to-whole blood ratio and individual metabolite fractions.
Fig. 7. Bland-Altman plot indicating differences between 2T3k Ki estimates using AIF and IDIF over 26 brain regions. The IDIF was derived from an isocontour on the summed images from 0 to 70 s, and was corrected using an individual plasma-to-whole blood ratio and individual metabolite fractions. The red line indicates the average difference, and the dashed black lines the 95% confidence intervals.
Correction of the whole blood curve for individual plasma-blood ratios and individual metabolite fractions seems to be important for obtaining unbiased measurements. When a population-based blood-to-plasma conversion factor and a population-based metabolite correction were used, high within-subject correlations, but variable bias of the estimates, were observed. This suggests that manual blood sampling will be required when absolute quantification of [11C]MeDAS PET is needed. The difference between the use of individual blood-to-plasma conversion factors and population-based conversion factors could be due to inter-subject variations in renal function, variability in haematocrit, or factors that affect binding of the tracer to blood cells. Variations in renal function could explain the differences, as the kidney is involved in tracer excretion from plasma. Variation in renal function can therefore cause inter-subject variability in the blood-to-plasma conversion factor. Differences in renal function could be due to high blood pressure, diabetes, smoking, obesity and kidney failure (Leblanc et al., 2005; Schefold et al., 2016). In addition, the binding of a tracer to blood cells could be affected by competition with endogenous ligands and by changes in the concentration or phenotype of specific blood cells, amongst others. Therefore, the haematocrit, which is the volume percentage of red blood cells in blood and is variable across subjects (Grau et al., 2018), could also contribute to the variation in plasma-to-whole blood ratio across subjects. However, our data (Fig. 2) indicate that correction of the whole blood curve with an individual metabolite curve has a more profound impact than correction with an individual blood-to-plasma conversion factor in most subjects. The variation caused by application of a population-based metabolite correction instead of individual metabolite corrections could be due to variations in liver function. Liver function is highly influenced by diet, alcohol consumption, drug use, and physical exercise (Bernuau et al., 1986; Lian et al., 2020; Myers et al., 2008; van der Windt et al., 2018). In the present study, high variation in tracer metabolism across subjects (CoV = 49.5% ± 11.5%) was observed, which substantially decreases the usefulness of a population-based metabolite correction for accurate quantification of [11C]MeDAS PET.
We also investigated whether an IDIF in combination with manual blood samples for calibration could be employed to avoid continuous blood sampling. Although the usage of arterial blood samples would still require arterial cannulation and thus does not reduce the invasiveness and discomfort, it would illustrate whether venous samples could be considered as a possible alternative. However, the use of venous blood sampling would only be of interest if an image-derived whole blood curve can be generated properly. Unfortunately, this was not the case in the present study. Alternative strategies for the generation of an IDIF (e.g. independent component analysis or application of PVC) might be able to better capture the blood curve. A more promising approach could be to use a large field of view PET-CT scanner, in which the brain, aorta, and heart are within the same field of view. Either the aorta or heart could be employed for the generation of an IDIF, which would suffer less from spill-in and spill-out effects than an IDIF derived from the carotids, and therefore might capture the whole blood curve more accurately. Since we showed that the individual variation is too high for using population-based averages, such an approach should include the collection of venous samples for metabolite analysis. Subsequently, the data analysis using venous samples should be compared with the results when arterial samples are used, to determine the similarity between the two methods.
Taken together, this study shows the caveats of [11C]MeDAS-PET quantification using IDIF. For low bias, an arterial input function corrected with blood samples from the same individual is required. The variability in bias across subjects, when population-based corrections of the input function are applied, precludes the use of an IDIF in longitudinal studies for potential efficacy evaluation of remyelination therapies, as robust quantification is needed. Yet, use of an IDIF, instead of an AIF, could be a good non-invasive alternative for [11C]MeDAS PET data analysis to facilitate early phase, cross-sectional clinical trials.

Data and code availability statement
Due to privacy regulations the clinical data collected in this study are not deposited in a public registry, but the data can be made available via a request to the corresponding author. Anonymized data can be made available after approval of the participants and when a signed data transfer agreement is in place.
Software programs used in the study are commercially available (PMOD, v4.1).
