Sensitivity of Arterial Spin Labeling for Characterization of Longitudinal Perfusion Changes in Frontotemporal Dementia and Related Disorders

Highlights
• This study demonstrates the value of ASL for longitudinal monitoring of perfusion in FTD patients.
• Good agreement was found in repeat measures of CBF in patients and controls.
• Transit times were not a significant source of error for the selected post-labeling delay (2 s).


Introduction
Frontotemporal dementia (FTD) and related disorders comprise a clinically and pathologically heterogeneous group of neurodegenerative disorders that are characterized by progressive atrophy of the frontal and temporal lobes with relative sparing of the posterior cerebral regions, and by abnormal molecular accumulations, most commonly of tau or TDP-43 (Seelaar et al., 2011). FTD is the second most common form of early-onset dementia, with the greatest prevalence among individuals between 45 and 64 years of age (Finger, 2016). FTD is highly heritable, with up to 40% of cases considered hereditary and ~15% autosomal dominantly inherited (Warren et al., 2013). Advances in the understanding of the link between genetic factors and the underlying pathophysiology (Whitwell and Josephs, 2012; Whitwell et al., 2009), and the subsequent development of candidate disease-modifying treatments, have stimulated the need for efficient tools to assess treatment efficacy (Logroscino et al., 2019).
Longitudinal neuroimaging studies can provide insight into therapeutic efficacy and characterize natural disease trajectory, which could increase the possibility of presymptomatic intervention. The majority of longitudinal imaging studies in FTD have focused on assessing tissue volume loss by structural magnetic resonance imaging (MRI) (Whitwell et al., 2015; Chan et al., 2001; Jiskoot et al., 2019) or regional glucose hypometabolism by 18F-fluorodeoxyglucose (FDG) positron emission tomography (PET) (Schuster et al., 2015; Rajagopalan and Pioro, 2019; Diehl-Schmid et al., 2007). While these approaches are used diagnostically, they have limitations for longitudinal imaging: structural changes are subtle at the early disease stage, and PET imaging is expensive and access is limited (Piguet et al., 2011; Olm et al., 2016; Verfaillie et al., 2015).
Due to the coupling of perfusion and metabolism, an attractive alternative to FDG PET is the MRI-based perfusion imaging technique, arterial spin labeling (ASL) (Verfaillie et al., 2015; Tosun et al., 2016). In FTD, perfusion changes precede structural findings, with reduced frontal and temporal cerebral blood flow (CBF) indexed by ASL in presymptomatic FTD mutation carriers (Dopper et al., 2016; Mutsaerts et al., 2019) as well as in symptomatic FTD (Olm et al., 2016). Furthermore, longitudinal reductions in perfusion by ASL have been detected in FTD patients (Staffaroni et al., 2019a) and are associated with clinical measures of cognitive decline (Staffaroni et al., 2019b), highlighting the potential of ASL for longitudinal assessments. However, although altered perfusion is one of the earliest changes in pathological aging (Mutsaerts et al., 2019; Iturria-Medina et al., 2016), beyond the aforementioned studies there are few reports in the literature investigating the stability of ASL for longitudinal imaging of FTD patients, and no studies assessing the sensitivity of ASL for detecting longitudinal perfusion changes in FTD patients.
In comparison to young healthy populations, where ASL shows good reliability and reproducibility (Chen et al., 2011; Parkes et al., 2004; Murphy et al., 2011), perfusion measurements among elderly populations can be challenging due to reduced signal-to-noise ratio (SNR) and potential vascular changes (Kilroy et al., 2014; Xu et al., 2010). With age, the brain's major feeding vessels become tortuous and the prevalence of stenosis increases. These factors can result in underestimated CBF due to increases in the arterial transit time (ATT), i.e., the time required for labeled water to travel from the labeling site to brain tissue, which greatly reduces the sensitivity of ASL to detect clinically relevant perfusion changes (Dai et al., 2017). Furthermore, when data are collected on different days, variation attributed to repositioning and to differences in resting perfusion between days can introduce error that confounds the measurement of longitudinal change. Establishing the magnitude of perfusion changes that can be reasonably detected among older control and patient populations would help address the influence of sources of between-scan variability (Clement et al., 2018), namely, transit time and repositioning errors, as well as day-to-day fluctuations in CBF. The primary aim of this study was to assess the sensitivity of ASL for detecting longitudinal changes in perfusion among patients with FTD using optimized parameters based on the ASL white paper (Alsop et al., 2015). To assess sources of variability, the reproducibility and reliability of single-delay pseudo-continuous ASL (SD-pCASL) were assessed for same-day scans and for scans collected during sessions separated by a month. A relatively short between-session period (4 weeks) was selected to avoid disease-related brain atrophy or pathological changes in the patient population.
Differences in variability between the aforementioned scan separations reflect error due to repositioning and differences in resting perfusion between sessions. To assess the effects of day-to-day changes in perfusion, variability was assessed for absolute and relative perfusion. Power analysis was conducted to determine the number of participants required to detect clinically relevant longitudinal perfusion changes. The influence of ATT on the longitudinal reproducibility of CBF was determined by quantifying the between-session variability of ATT measured by a low-resolution (LowRes-pCASL) sequence using multiple inversion times. Given that visual assessment remains a primary source of scan interpretation, a voxel-by-voxel approach was implemented to visualize the spatial distribution of variability and, furthermore, to identify regions where longitudinal changes would be more challenging to detect. As previous studies have suggested that time-encoded multidelay sequences can improve SNR and temporal efficiency (Samson-Himmelstjerna et al., 2016; Guo et al., 2018; Dai et al., 2013), a secondary aim was to compare perfusion and transit times measured by SD-pCASL and LowRes-pCASL, respectively, to Hadamard-encoded multidelay sequences.

Participants
This study was approved by the Western University Health Sciences Research Ethics Board and was conducted in accordance with the Declaration of Helsinki ethical standards. Participants provided written informed consent in compliance with the Tri-Council Policy Statement of Ethical Conduct for Research Involving Humans.
Fourteen neurologically healthy controls and ten patients with FTD or progressive supranuclear palsy (PSP) were enrolled in the study. Patients were recruited through the Cognitive Neurology and Aging Brain Clinic at Parkwood Hospital (St Joseph's Health Care London) and controls were recruited through advertisements and the clinic's volunteer pool. Studies were performed between November 2019 and December 2020. The patient cohort consisted of individuals meeting the consensus criteria for probable or definite FTD or PSP; specifically, behavioural variant (bvFTD), semantic variant (svFTD) (Gorno-Tempini et al., 2011), nonfluent primary progressive aphasia (nfPPA) (Gorno-Tempini et al., 2011), and PSP (Höglinger, 2017). Exclusion criteria included (1) any significant neurologic disease other than suspected FTD, (2) presence of pacemakers, aneurysm clips, artificial heart valves, ear implants, metal fragments or foreign objects that would preclude MRI participation, (3) major depression, bipolar disorder, psychotic features or behavioural problems, and (4) any significant systemic illness or unstable medical condition. Diagnostic evaluations were performed by a clinical neurologist (E.F.) based on clinical evaluation, neurocognitive testing, clinical MRI brain imaging, and genetic testing.

Imaging
All MRI examinations were performed on a 3 T Siemens Biograph mMR scanner using a 12-channel head coil. Participants were required to abstain from caffeine for 8 h before each scan. Each participant was scanned on two occasions separated by approximately 4 weeks. Repeat scans were scheduled at a similar time of day to minimize time-of-day effects (Parkes et al., 2004). SD-pCASL data were acquired twice during each imaging session for a total of 4 scans. This protocol allowed for the assessment of two types of within-subject variability: within-session, representing fluctuations in same-day measurements, and between-session, representing the variability in measurements separated by 4 weeks. LowRes-pCASL was performed once in each session to assess between-session variability in transit times. Hadamard-encoded sequences were acquired once during one of the two imaging sessions. All scans were performed at rest with the participants awake in the scanner. To improve compliance, participants watched a low-cognitive-demand movie (Vanderwal et al., 2015). Each scanning session included a T1-weighted magnetization-prepared rapid acquisition gradient echo (MPRAGE) sequence with repetition time (TR)/echo time (TE): 2000/2.98 ms, voxel size: 1 mm isotropic, field of view (FOV): 256 × 256 × 176 mm³, scan time: 4:38 min.

Single delay pCASL
Single delay pCASL data were acquired with a 4-shot 3D gradient and spin echo (GRASE) readout (Günther et al., 2005); TR/TE: 4500/22.14 ms, voxel size: 4 mm isotropic, FOV: 256 × 256 × 128 mm³, label-control pairs: 8, bandwidth: 2298 Hz/Px, 1 preparing scan, scan time: 4:53 min. A post-labeling delay (PLD) of 2000 ms and label duration (LD) of 1800 ms were used as recommended by the ASL consensus paper (Alsop et al., 2015). For all pCASL sequences, two inversion pulses were used to null components with relaxation times T1 = 700 and 1400 ms (Günther et al., 2005). To maintain a consistent acquisition protocol, these parameters were used in both patients and healthy controls, despite the shorter recommended PLD for healthy controls. An equilibrium magnetization image (M0) was acquired to convert perfusion-weighted images into physiological units of blood flow. Imaging parameters were identical to the pCASL acquisition except for a TR of 7000 ms and no background suppression or labeling. Since the feeding arteries are often tortuous in elderly populations (Dai et al., 2017; Farkas and Luiten, 2001), the labeling plane offset was manually adjusted for each participant to ensure the labeling plane was straight and parallel to the vessels. A 3D time-of-flight MRI angiography (TR/TE: 22.0/3.75 ms, voxel size: 0.3 × 0.3 × 1.5 mm³, FOV: 263 × 350 × 350 mm³, 4 slabs, 30 slices per slab, scan time: 4:23 min) was acquired to identify the major arteries for labeling plane preparation. The offset ranged between 90 and 125 mm from the center of the imaging slab. The same labeling plane was used for all subsequent ASL sequences.

Image processing
Image analysis was performed with SPM12 (http://www.fil.ion.ucl.ac.uk) (Ashburner, 2012), Oxford Centre for Functional MRI of the Brain (FMRIB)'s software library (FSL 6.0.1) (Jenkinson et al., 2012), and in-house MATLAB scripts (The MathWorks, Natick, MA). Prior to any analysis, all images were manually reoriented to the axis of the anterior and posterior commissure. T1-weighted images from each session were coregistered and averaged using SIENA. By transforming the two structural images into a halfway space, both images undergo the same resampling steps, thereby reducing potential interpolation bias. The resulting structural images were processed using the fsl_anat pipeline to generate bias-corrected, skull-stripped, tissue-segmented, and spatially normalized structural images, as well as a normalization matrix (Smith, 2004). Grey and white matter masks were generated by thresholding the respective tissue-segmented images to include voxels with tissue probabilities >0.8.

Single delay pCASL
M0 images from the first and second imaging sessions were realigned to their mean and co-registered to the T1-weighted images. Using SPM12, raw SD-pCASL data were motion corrected, registered to the mean M0 and pairwise subtracted. Poor quality difference images were identified using ENABLE (Shirzadi et al., 2018), an automated sort/check algorithm. Briefly, each difference image was scored based on a linear combination of ASL quality features: temporal SNR, detectability metric (proportion of grey-matter voxels with signals significantly greater than zero), temporal contrast-to-noise ratio, and spatial coefficient of variation. Image volumes that did not meet the quality criterion were removed. Perfusion was quantified using the Oxford ASL toolbox (oxasl), which uses Bayesian inference to perform kinetic modeling and spatial regularization (Groves et al., 2009). The incorporation of these spatial and biophysical priors reduces the uncertainty of model parameters by encoding realistic assumptions and accounting for natural variability in the model parameters. A standard well-mixed single compartment model was applied to the motion-corrected and filtered perfusion-weighted images (Buxton, 2005). Model parameters were based on the guidelines of the ASL consensus paper (Alsop et al., 2015): T1 of tissue = 1300 ms (Wansapura et al., 1999), T1 of arterial blood = 1650 ms (Zhang et al., 2013), labeling efficiency = 0.85 (Dai et al., 2008), blood-brain partition coefficient = 0.9 ml/g (Herscovitch and Raichle, 1985). Registration between perfusion/transit time and structural images was carried out using boundary-based registration. Images were normalized to the MNI template by applying the transformation parameters generated by fsl_anat using a non-linear image registration tool (FNIRT; Jenkinson and Smith, 2001) and smoothed by a 6-mm Gaussian filter.
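As a rough illustration of how a perfusion-weighted difference image is converted to physiological units, the sketch below applies the simplified closed-form single-compartment expression from the ASL consensus paper using the parameters listed above. It is not the oxasl Bayesian fit used in this study: it ignores spatial regularization, tissue T1, and any background-suppression correction, and the function name and argument names are hypothetical.

```python
import numpy as np

def cbf_single_delay_pcasl(delta_m, m0, pld_s=2.0, label_dur_s=1.8,
                           t1_blood_s=1.65, label_eff=0.85, partition=0.9):
    """Simplified closed-form pCASL CBF estimate in ml/100 g/min.

    delta_m : control-minus-label difference signal (array or scalar)
    m0      : equilibrium magnetization image, same units as delta_m
    Defaults follow the values quoted in the text (PLD 2000 ms, LD 1800 ms,
    T1 of blood 1650 ms, labeling efficiency 0.85, partition coefficient 0.9).
    """
    delta_m, m0 = np.broadcast_arrays(np.asarray(delta_m, dtype=float),
                                      np.asarray(m0, dtype=float))
    # Factor 6000 converts ml/g/s to ml/100 g/min.
    numerator = 6000.0 * partition * delta_m * np.exp(pld_s / t1_blood_s)
    denominator = (2.0 * label_eff * t1_blood_s * m0
                   * (1.0 - np.exp(-label_dur_s / t1_blood_s)))
    # Voxels with zero M0 (e.g. outside the brain) are set to zero.
    return np.divide(numerator, denominator,
                     out=np.zeros(numerator.shape), where=denominator > 0)
```

With these defaults, a difference signal equal to 1% of M0 maps to roughly 97 ml/100 g/min, which shows how strongly the scaling depends on the assumed blood T1 and labeling efficiency.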

Multi-Delay pCASL
Raw LowRes-pCASL data were motion corrected and pairwise subtracted using SPM12. Data were fit to the general kinetic model with oxasl to extract the ATT. The single compartment model with no dispersion was fit using the aforementioned SD-pCASL model parameters. ENABLE was implemented to remove low quality difference images. Hadamard-encoded data were processed in a similar manner, except that no motion correction was applied and, instead of pairwise subtraction, the Hadamard transform with Walsh ordering was applied to generate images for each label/sub-bolus (Samson-Himmelstjerna et al., 2016). Both perfusion and ATT maps were generated from the Hadamard sequences. All resulting data were normalized to the MNI template and smoothed as described previously.
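The Hadamard decoding step can be illustrated with a minimal sketch. It assumes an encoding in which column k of a Walsh-ordered (sequency-ordered) Hadamard matrix marks the acquisitions where sub-bolus k is labeled (entry -1) or left as control (entry +1), with the all-ones column unused. The actual encoding matrix and sign convention of the sequences used here are not given in the text, so the function names and the convention are assumptions.

```python
import numpy as np

def walsh_hadamard(n):
    """Return the n x n Hadamard matrix with rows in Walsh (sequency) order."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.array([[1]])
    while h.shape[0] < n:
        h = np.kron(np.array([[1, 1], [1, -1]]), h)  # Sylvester construction
    # Walsh ordering sorts rows by their number of sign changes.
    sign_changes = np.count_nonzero(np.diff(h, axis=1), axis=1)
    return h[np.argsort(sign_changes)]

def encode_sub_boli(sub_boli, s0):
    """Simulate encoded signals: static signal minus the labeled sub-boli."""
    sub_boli = np.asarray(sub_boli, dtype=float)
    n = sub_boli.shape[0] + 1
    label = (1 - walsh_hadamard(n)[:, 1:]) / 2  # 1 = label, 0 = control
    return s0 - np.tensordot(label, sub_boli, axes=1)

def decode_hadamard(encoded):
    """Recover one perfusion-weighted signal per sub-bolus.

    encoded : array of shape (n, ...) holding the n encoded acquisitions.
    Returns an array of shape (n - 1, ...).
    """
    encoded = np.asarray(encoded, dtype=float)
    n = encoded.shape[0]
    w = walsh_hadamard(n)
    # Each non-constant column sums to zero, so the static signal cancels
    # and only the matching sub-bolus survives, scaled by n / 2.
    return (2.0 / n) * np.tensordot(w[:, 1:].T, encoded, axes=1)
```

Because every decoded image uses all n acquisitions, each sub-bolus benefits from the full averaging, which is the source of the SNR efficiency attributed to time-encoded sequences; it is also why motion in a single acquisition propagates into every decoded image.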

ROI analysis
Region of interest (ROI) analysis was performed to assess regional reproducibility of SD-pCASL and to compare CBF and ATT measured by the different sequences. This was performed in grey and white matter as well as in regions commonly associated with FTD; namely, the orbitofrontal gyrus, inferior frontal gyrus, superior frontal gyrus, insular cortex, amygdala, temporal pole, and occipital gyrus (as a reference region). FTD-specific ROIs were generated by combining regions from the automated anatomical labeling (aal) atlas in WFU Pickatlas (Wake Forest University, http://fmri.wfubmc.edu/cms/software).
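A minimal sketch of the ROI averaging step is given below. It assumes the CBF map and an integer-labeled atlas are already in the same (MNI) space as NumPy arrays; the label IDs, the function name, and the optional restriction to grey-matter voxels (probability > 0.8) are illustrative assumptions rather than details confirmed by the text.

```python
import numpy as np

def roi_mean(cbf_map, atlas_labels, roi_label_ids, gm_prob=None, gm_threshold=0.8):
    """Mean CBF inside an ROI built by merging several atlas labels.

    cbf_map       : 3D array of CBF values
    atlas_labels  : 3D integer array of atlas label IDs, same shape
    roi_label_ids : iterable of label IDs to combine into one ROI
    gm_prob       : optional grey-matter probability map (hypothetical use)
    """
    roi_mask = np.isin(atlas_labels, list(roi_label_ids))
    if gm_prob is not None:
        # Optionally keep only voxels likely to be grey matter.
        roi_mask = roi_mask & (np.asarray(gm_prob) > gm_threshold)
    if not roi_mask.any():
        return float("nan")
    return float(np.asarray(cbf_map)[roi_mask].mean())
```

Left and right atlas regions can be merged by passing both label IDs in `roi_label_ids`.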

Statistics
Statistical analysis was performed using R (R Core Team 2013, https://www.r-project.org/) and MATLAB. Variance components were estimated using a random effects model that employed restricted maximum likelihood. The variance components were estimated according to the following model (Shavelson and Webb, 1991):
$$CBF_{ijk} = \mu + U_i + V_{ij} + \varepsilon_{ijk}$$
This model was fit with perfusion (CBF_{ijk}) for the ith subject, jth session, and kth run as the response variable, and with random effects for subject (U) and subject-by-session (V). The grand mean is represented by μ and ε is the residual. By nesting session within subject, sessions were uniquely coded to each subject. Variance was decomposed into 3 components: variance between subjects, variance between sessions, and a residual variance component due to random error. This residual term is an estimate of the variance that would result from the two repeat scans within a single session for a given participant. Each variance component indicates the magnitude of variance that the respective factor contributes. The within-subject variance (i.e., the sum of the between-session and within-session variances) represents the variance in perfusion images acquired during sessions collected 4 weeks apart in a given participant. This estimate reflects the variance encountered in a study in which a participant is scanned once in each session.
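To make the variance decomposition concrete, the sketch below estimates the three components for a fully balanced design (every subject has the same number of sessions and runs) using nested ANOVA mean squares. This is a swapped-in technique, not the REML fit performed in R for this study; for balanced data the two agree whenever the ANOVA estimates are non-negative, and the truncation at zero used here is only an approximation otherwise. The function name and array layout are assumptions.

```python
import numpy as np

def variance_components(cbf):
    """Nested ANOVA variance components for CBF_ijk = mu + U_i + V_ij + e_ijk.

    cbf : array of shape (subjects, sessions, runs), balanced, no missing data
    """
    cbf = np.asarray(cbf, dtype=float)
    n_subj, n_sess, n_run = cbf.shape
    session_means = cbf.mean(axis=2)            # mean per subject and session
    subject_means = session_means.mean(axis=1)  # mean per subject
    grand_mean = subject_means.mean()

    # Mean squares for runs within session, sessions within subject, subjects.
    ms_run = ((cbf - session_means[:, :, None]) ** 2).sum() / (
        n_subj * n_sess * (n_run - 1))
    ms_session = n_run * ((session_means - subject_means[:, None]) ** 2).sum() / (
        n_subj * (n_sess - 1))
    ms_subject = n_sess * n_run * ((subject_means - grand_mean) ** 2).sum() / (
        n_subj - 1)

    return {
        "grand_mean": grand_mean,
        "var_within_session": ms_run,
        "var_between_session": max((ms_session - ms_run) / n_run, 0.0),
        "var_between_subject": max((ms_subject - ms_session) / (n_sess * n_run), 0.0),
    }
```

For the design described here, the input would have shape (subjects, 2, 2): two sessions with two SD-pCASL runs each.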
Reproducibility, here defined as the variability in repeat measurements, was quantified by the coefficient of variation (CV). Between-subject and within-subject (i.e., within/between-session) CVs were calculated by dividing the respective standard deviation (σ) by the mean (μ):
$$CV = \sigma / \mu$$
The intraclass correlation coefficient (ICC) was used to assess reliability (Shrout and Fleiss, 1979). While the ICC can be interpreted as the variance in the outcome variable that is accounted for by the grouping variable (e.g., subjects), an alternate interpretation is the expected correlation between randomly drawn units from the same group (Hox et al., 2018). In the context of the current study, we implement two variants of the ICC: ICC_between, defined as the expected correlation in CBF among sessions for a randomly selected subject, and ICC_within, defined as the estimated correlation in CBF from runs within the same session for a randomly selected subject. These metrics of reliability were assessed by:
$$ICC_{between} = \sigma^2_U / \sigma^2_{total}$$
$$ICC_{within} = (\sigma^2_U + \sigma^2_V) / \sigma^2_{total}$$
where σ²_U, σ²_V, and σ²_ε are the between-subject, between-session, and residual (within-session) variance components. Total variance was defined as the sum of the individual components:
$$\sigma^2_{total} = \sigma^2_U + \sigma^2_V + \sigma^2_\varepsilon$$
ICC values range between 0 and 1, and results were interpreted based on the following guidelines (Cicchetti and Sparrow, 1981): poor (<0.4), fair (0.41-0.59), good (0.6-0.74), and excellent (>0.75). Reproducibility and reliability were calculated on a voxel-by-voxel basis to visualize the spatial distribution.
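The reproducibility and reliability metrics can be sketched from the variance components as follows. The ICC ratios used here are the standard nested-model forms implied by the verbal definitions above (correlation between sessions of the same subject, and between runs of the same session); treating them as the exact expressions used in the study is an assumption, as are the function names.

```python
import math

def cv_percent(variance, mean):
    """Coefficient of variation (standard deviation / mean) in percent."""
    return 100.0 * math.sqrt(variance) / mean

def combine_cv(*cvs):
    """Combine CVs of independent variance components that share one mean."""
    return math.sqrt(sum(cv ** 2 for cv in cvs))

def icc_from_components(var_subject, var_between_session, var_within_session):
    """ICC variants for a runs-within-sessions-within-subjects design."""
    total = var_subject + var_between_session + var_within_session
    return {
        "icc_between": var_subject / total,
        "icc_within": (var_subject + var_between_session) / total,
    }

def interpret_icc(icc):
    """Qualitative label following the Cicchetti and Sparrow guidelines."""
    if icc < 0.4:
        return "poor"
    if icc < 0.6:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"
```

Because the within-subject variance is the sum of the between-session and within-session variances, the corresponding CVs add in quadrature: `combine_cv(16.0, 10.8)` gives about 19.3, in line with the 19.2% within-subject value reported for patients from within-session and between-session CVs of 16% and 10.8%.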
Based on the variance in CBF derived by the random effects model, power calculations were performed to estimate the number of participants (N) required to detect a given change (Δ) in perfusion between sessions. N was computed from Δ, the within-subject variance (σ²), the Z-score for the significance criterion (Z_{1-α}), and the Z-score for the statistical power (Z_{1-β}).
Variance was determined based on the average of the two runs in each session, which is equivalent to one 10-minute scan. This was performed to minimize the effects of within-session variability and to reflect the variability observed in a longitudinal clinical study where multiple runs would not be acquired. The detection power was set to 80% (i.e., Z_{1-β} = 0.84) and the significance level was set to α = 0.05 based on a one-tailed t-test, since the primary focus is on detecting regional perfusion deficits (i.e., Z_{1-α} = 1.645). A detectability map depicting the number of participants required to detect a 10% perfusion change between sessions was generated to visualize the estimated sample size for ROIs generated using the aal atlas.
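The closed-form expression for N is not reproduced in this text, so the sketch below uses one common normal-approximation form, N = k (Z_{1-α} + Z_{1-β})² σ² / Δ², with the Z-scores quoted above. The multiplier k (`factor`) is an assumption: k = 1 treats σ² as the variance of the quantity being tested, whereas k = 2 is often used when comparing two independent measurements. Neither choice is confirmed by the source, and the function name is hypothetical.

```python
import math

def required_sample_size(within_subject_sd, delta,
                         z_alpha=1.645, z_beta=0.84, factor=1.0):
    """Participants needed to detect a change `delta` (same units as the SD).

    z_alpha = 1.645 : one-tailed significance level of 0.05 (from the text)
    z_beta  = 0.84  : 80% power (from the text)
    factor          : design multiplier k, an assumption (see lead-in)
    """
    if delta == 0:
        raise ValueError("delta must be non-zero")
    n = factor * (z_alpha + z_beta) ** 2 * within_subject_sd ** 2 / delta ** 2
    return math.ceil(n)
```

As an illustrative calculation with made-up inputs, a within-subject standard deviation of 8 ml/100 g/min and a target change of 6 ml/100 g/min (10% of a 60 ml/100 g/min baseline) give `required_sample_size(8.0, 6.0)` = 11 with k = 1. Since N scales with σ²/Δ², halving the variability cuts the required sample by a factor of four.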
To determine whether there were differences in CBF and ATT among the three ASL sequences, data were fit to a linear mixed model and an ANOVA was used to test for significance. T-tests were used to assess between-group differences. For all tests, p < 0.05 was considered significant. To assess the effect of day-to-day variations, the minimum detectable difference and reproducibility of SD-pCASL CBF were assessed using absolute (aCBF) and relative perfusion (rCBF), where relative perfusion was generated by intensity normalizing by the mean whole-brain CBF. Reliability was only calculated with aCBF, due to the reduction in between-subject variance after intensity normalization.
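The intensity normalization used to obtain rCBF can be sketched as a division by the mean CBF inside a whole-brain mask. The function name and the use of a boolean mask array are assumptions; the text does not specify how the whole-brain region was defined.

```python
import numpy as np

def relative_cbf(cbf_map, brain_mask):
    """Normalize a CBF map by its mean value inside the brain mask.

    The result is unitless, with a whole-brain mean of 1.
    """
    cbf_map = np.asarray(cbf_map, dtype=float)
    brain_mask = np.asarray(brain_mask, dtype=bool)
    mean_cbf = cbf_map[brain_mask].mean()
    if mean_cbf == 0:
        raise ValueError("mean whole-brain CBF is zero")
    return cbf_map / mean_cbf
```

A purely global shift in resting perfusion between sessions (every voxel scaled by the same factor) leaves rCBF unchanged, which is why normalization mainly reduces between-session rather than within-session variability. It also removes genuine global differences between subjects, consistent with reliability being reported for aCBF only.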

Demographics and cognitive measures
Of the fourteen controls and ten patients recruited and screened for the current study, three patients and one control had missing follow-up data. The final sample of the test-retest study included 13 controls and 7 patients. Within-session scans were separated by approximately 30 min, while the average separation between imaging sessions was 26 ± 4 days. Data comparing ASL sequences included Hadamard-encoded data acquired in 9 controls and 8 patients. LowRes-pCASL and SD-pCASL data from the corresponding participants were included in this analysis. Demographic and clinical characteristics of all the participants are summarized in Table 1. As expected, healthy controls scored significantly higher on all cognitive tests (p < 0.05).

Test-Retest reproducibility of single delay pCASL
Average grey-matter perfusion across all sessions was 68.6 ± 1.7 ml/100 g/min in controls and 65.2 ± 1.72 ml/100 g/min in patients. Representative perfusion images from example control and patient participants for the two sessions are shown in Fig. 1. Perfusion maps were scaled to a common range for display purposes. Perfusion maps showed the expected contrast between grey and white matter. While overall there was good agreement within and between sessions, there were noticeable differences in regional perfusion between sessions in some participants (e.g., reduced frontal perfusion for patient 10 during session 2). To a lesser extent, this phenomenon was also evident in control 1, session 2.

Voxel-by-voxel variability
Control and patient CV maps were similar, with both showing increased variance in white matter, cerebrospinal fluid, and regions proximal to the brain's feeding vessels (Fig. 2). Following intensity normalization, there was a global decrease in grey-matter CV in both controls and patients; however, regions of high variability in white matter and cerebrospinal fluid remained. In controls, between-subject variability was higher in white matter, ventricles, and the posterior regions of the brain. Although between-subject reproducibility in grey-matter was within a similar range (21.4% in controls and 25.3% in patients), distinct regions of increased variability, including the superior frontal gyrus, cerebellum, brainstem, and the left intracalcarine cortex, were apparent in patients. For both patients and controls, good-to-excellent reliability was determined for the within-session comparison, whereas fair-to-good reliability was achieved between sessions. For all comparisons, participants showed lower reliability in the striatum relative to other regions. In patients, a clear increase in grey-matter reliability relative to white matter was observed, especially between sessions. Average voxel-by-voxel within- and between-session variability in grey-matter aCBF was comparable for patients (within-session: 16%, between-session: 10.8%) and controls (within-session: 13.9%, between-session: 8.3%). Within-subject variability in grey-matter was 19.2% and 16.2% in patients and controls, respectively. After intensity normalization, there was a small decrease in within-session variability (controls: 12.3%, patients: 14.8%), whereas between-session CV decreased to a greater extent (controls: 6%, patients: 8.5%). The corresponding within-subject variability was 16.3% and 13.9% in patients and controls, respectively. Good reliability among same-day scans was found in both patients (ICC_within = 0.73) and controls (ICC_within = 0.71). Reliability was also good between sessions; however, there was a moderate decrease for both patients (ICC_between = 0.62) and controls (ICC_between = 0.62).

Variability in FTD-specific ROIs
Across FTD-specific ROIs, there were no differences in perfusion between sessions or runs. Average reliability and reproducibility in FTD-specific ROIs are summarized in Fig. 3. The superior frontal gyrus and temporal pole showed a high amount of variability (CV > 15%) in patients and controls, whereas the orbitofrontal gyrus and amygdala showed high variability in patients only. Between sessions, CV was higher in the superior frontal gyrus and amygdala in patients. Intensity normalization resulted in a significant reduction in between-session CV (p < 0.05). With both aCBF and rCBF, within-session CV was significantly higher than between-session CV (p < 0.05). Within-session reliability was fair to excellent in both patients (range: 0.48-0.78) and controls (range: 0.5-0.89). Between sessions, there was a significant reduction in reliability for patients and controls; the majority of regions showed fair reliability. However, in patients, reliability in the amygdala and temporal pole was poor (ICC_between < 0.4).

Detectability
Detectability maps depicting estimated sample sizes required to detect a 10% decrease in perfusion between sessions using a 10-minute pCASL scan are shown in Fig. 4. Across ROIs, within-subject variance ranged between 5.1 and 14.5 ml/100 g/min for aCBF and 3.5 and 12.1 ml/100 g/min for rCBF. For FTD-specific ROIs, the number of participants required was significantly higher in patients (26 ± 14) relative to controls (13 ± 5) (p < 0.05). This estimate was based on averaging common ROIs on the left and right hemisphere. After intensity normalization, these values decreased to 10 ± 9 for patients and 5 ± 2 for controls (ns). Although intensity normalization improved regional detectability, as indicated by significant reduction in estimated sample sizes and the cooler colors in Fig. 4, in patients, the lowest sensitivity remained in the frontal lobe and sub-lobar regions.

Perfusion comparison in healthy controls
Perfusion maps averaged over healthy controls and generated by the three pCASL sequences are shown in Fig. 5. Each average contains an identical sample of controls. While all sequences show similarities in contrast and regional distribution between grey and white matter perfusion, midbrain perfusion appears to be greater in FL_TE-pCASL and conv_TE-pCASL sequences. Average grey-matter CBF measured by SD-pCASL, FL_TE-pCASL and conv_TE-pCASL were 67.7 ± 15, 95.6 ± 21.1, and 96.7 ± 25 ml/100 g/min in controls. Perfusion averages within FTD-specific ROIs are shown in Fig. 6. Perfusion estimates by the Hadamard sequences had greater between-subject variability and were consistently higher than SD-pCASL estimates in all regions (p < 0.05) except for the occipital gyrus. Compared to the conv_TE-pCASL, FL_TE-pCASL was significantly lower in the amygdala, insula and temporal pole (p < 0.05).
Fig. 4. Detectability maps indicating the number of participants required to detect a 10% perfusion change within ROIs.
Fig. 7 shows ATT maps generated using LowRes-pCASL, FL_TE-pCASL, and conv_TE-pCASL sequences for both controls and patients. All sequences show similar spatial patterns, with shorter transit times near the centre of the major feeding arteries and increased transit times in the watershed regions. Grey-matter ATT measured by LowRes-pCASL, conv_TE-pCASL, and FL_TE-pCASL were 1.24 ± 0.16, 1.10 ± 0.08, and 1.12 ± 0.08 s in controls and 1.30 ± 0.11, 1.16 ± 0.05, and 1.17 ± 0.04 s in patients, respectively. For all sequences, ATT measured in patients was not significantly different from controls. Average ATT values in FTD-specific ROIs are shown in Fig. 8. Among the three sequences, the only significant differences in transit times were in the occipital gyrus, orbitofrontal gyrus and superior frontal gyrus, where the LowRes-pCASL values were significantly higher than the corresponding conv_TE-pCASL and FL_TE-pCASL values.
ATT CV maps were mostly homogeneous, with some non-specific spatial patterning (Supplementary Fig. 1). Average between-session CVs in grey-matter were 17.1 ± 5.9% and 15.2 ± 6.3% in controls and patients, respectively.

Discussion
Multiple studies have demonstrated the potential of ASL for assessing disease-driven perfusion changes that differentiate clinical populations as well as presymptomatic mutation carriers (Tosun et al., 2016; Mutsaerts et al., 2019; Anazodo et al., 2018). Considering that ASL is noninvasive and quantitative, it is well suited to longitudinal studies aimed at characterizing disease progression and evaluating treatment efficacy. Toward this goal, the current work focused on evaluating the reproducibility of an optimized ASL sequence for monitoring long-term changes in perfusion in FTD patients. As an initial evaluation, this study included patients who met the consensus criteria for probable FTD or PSP and age-matched controls. Imaging was performed in two sessions that were separated by four weeks, a period selected to minimize possible disease-related perfusion changes. Both within- and between-session variability were assessed to evaluate the impact of common sources of error associated with longitudinal studies, including head repositioning and day-to-day fluctuations in CBF. Considering the impact of ATTs on CBF quantification, a second aim was to compare the performance of SD-pCASL and LowRes-pCASL, in which perfusion and ATT are measured separately, to Hadamard-encoded sequences that measure both parameters simultaneously. While the latter methods provide the ability to image faster with superior SNR, they are more sensitive to motion artifacts. To the best of our knowledge, this is the first study to perform this comparison in an older population. The main results of the study showed that (1) test-retest repeatability was similar in the patient group compared to controls, (2) variations in transit times were not a significant source of error with this patient population, and (3) perfusion imaging by Hadamard-encoded sequences yielded systematically higher CBF compared to SD-pCASL but produced similar transit-time measurements compared to LowRes-pCASL.
Since the role of ASL in assisting with the diagnosis of FTD subtypes is to detect spatial patterns of hypoperfusion, the current study primarily focused on characterizing variability on a voxel-by-voxel basis. This approach provided the ability to identify regions with greater variability, which could make it more challenging to detect perfusion changes in longitudinal studies. In general, within-subject reproducibility and reliability maps, shown in Fig. 2 and summarized in Fig. 3, were similar for the patient and control groups. Between-session grey-matter reproducibility and reliability for patients were similar to values for controls (CV = 10.8% vs 8.3%, ICC_between = 0.62 vs 0.62, respectively). Regions with the highest variability (the superior frontal gyrus and temporal pole) were common to both groups (Fig. 3), although the CV in dementia-specific ROIs was significantly lower for controls. Increased variability in the superior frontal gyrus and temporal pole is likely related to susceptibility artifacts due to brain-air interfaces, particularly in the patient population where there is greater brain atrophy (Zhao et al., 2017). Visual inspection of Fig. 2 revealed that in both patient and control participants, there was an imaging artifact in the sub-lobar region that is consistent with signal dephasing due to the pulsatile flow in the circle of Willis during the GRASE readout. In the patient group, its border spread into the amygdala region, explaining the increased variability in this region. While a segmented sequence was implemented to reduce the effects of T2 decay during the readout (Feinberg and Günther, 2009), it may be possible to further reduce this artifact by using a variable flip angle (Liang et al., 2013; Zhao et al., 2018). Between-subject reproducibility among patients showed increased CV in the occipital and posterior regions. While this increase could reflect disease-driven heterogeneity in regional hypoperfusion, these regions are not typically affected in FTD and the related disorders, so it more likely reflects individual differences due to the small sample size.
Both the accuracy and precision of ASL-CBF are affected by factors such as its inherently low SNR, subject motion, sensitivity to transit delays, and labeling efficiency (Chen et al., 2011). In addition to these within-session sources of error, factors that can degrade between-session reproducibility include repositioning errors and differences in resting perfusion between scanning sessions (Ssali et al., 2016). Efforts to minimize these sources of variability include conducting repeat imaging sessions around the same time of day (Parkes et al., 2004), having participants avoid substances known to affect CBF (e.g. caffeine) (Clement et al., 2018), and implementing a pre-processing pipeline, including ENABLE, to ensure the quality of the ASL images and good registration to the MNI template (Shirzadi et al., 2018;Mutsaerts et al., 2018). The similarity of perfusion maps separated by a month (Fig. 1) demonstrated the effectiveness of these approaches. Between-session variability in grey-matter showed good correlation (ICC between > 0.6) and good reproducibility in both patients and controls (CV < 11%). With intensity normalization, there was approximately a 28 and 21% reduction in between-session CV in patients and controls respectively, highlighting the systemic effect of day-to-day fluctuations in resting perfusion (Fig. 2). Together, these results suggest that with good alignment of data between sessions, and careful control of perfusion modifiers, sources of between-session variability can be minimized. This is particularly relevant for clinical diagnosis and management of FTD given that perfusion changes are subtle.
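The effect of intensity normalization on day-to-day global fluctuations can be illustrated with a minimal sketch. It assumes that relative CBF is obtained by dividing each voxel by the mean CBF within a grey-matter mask; the function name `relative_cbf` and the synthetic values are hypothetical and are not taken from the study's pipeline.

```python
import numpy as np

def relative_cbf(cbf, mask):
    """Divide a CBF map by its mean inside a boolean mask (assumed normalization)."""
    cbf = np.asarray(cbf, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    return cbf / cbf[mask].mean()

# Synthetic example: session 2 equals session 1 scaled by a global 10% increase.
session1 = np.array([40.0, 60.0, 55.0, 45.0])
session2 = 1.10 * session1
gm_mask = np.ones_like(session1, dtype=bool)

r1 = relative_cbf(session1, gm_mask)
r2 = relative_cbf(session2, gm_mask)
# A purely global change cancels after normalization, so r1 and r2 coincide.
```

Under this assumption, only regional differences between sessions survive normalization, which is one way to read the reduction in between-session CV reported above.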
ROI-based between-session reproducibility and reliability across grey-matter for patients (CV = 9.04%, ICC between = 0.77) and controls (CV = 6.5%, ICC between = 0.77) were within the range of previous studies of other causes of dementia. Kilroy et al. assessed reproducibility and reliability of pCASL GRASE in a population of older healthy controls, and patients with MCI and Alzheimer's disease (Kilroy et al., 2014). The authors reported a CV of 10.9% and an ICC between of 0.707 among perfusion measurements separated by 4 weeks. Similar results were observed with different scan separations among young and older participants. Chen et al. reported a CV of 8.5 ± 0.14% for data collected 1 week apart in young healthy participants (Chen et al., 2011); more recently, in a population of adult Latinx participants at risk for vascular disease, Jann et al. reported a CV of 7% and ICC between of 0.84 (Jann et al., 2021). The finding that longitudinal variability of the current implementation of ASL is comparable to, and in some cases superior to, previous studies supports the potential of ASL as a sensitive marker of longitudinal perfusion changes.
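For readers who wish to compute comparable metrics on their own test-retest data, the following is a minimal sketch of one common formulation. It assumes a within-subject CV defined as the root mean within-subject variance divided by the grand mean, and a one-way random-effects ICC; the exact definitions used in this study and in the cited studies may differ, and the function names are hypothetical.

```python
import numpy as np

def within_subject_cv(x):
    """Within-subject CV in percent for x of shape (n_subjects, n_sessions)."""
    x = np.asarray(x, dtype=float)
    within_var = x.var(axis=1, ddof=1).mean()  # mean of per-subject variances
    return 100.0 * np.sqrt(within_var) / x.mean()

def icc_oneway(x):
    """One-way random-effects ICC(1,1) for x of shape (n_subjects, n_sessions)."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    subject_means = x.mean(axis=1)
    grand_mean = x.mean()
    # Between-subject and within-subject mean squares from a one-way ANOVA.
    msb = k * ((subject_means - grand_mean) ** 2).sum() / (n - 1)
    msw = ((x - subject_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Synthetic grey-matter CBF values (ml/100 g/min) for three subjects, two sessions.
cbf = np.array([[48.0, 52.0], [58.0, 62.0], [38.0, 42.0]])
cv = within_subject_cv(cbf)
icc = icc_oneway(cbf)
```

The synthetic values are illustrative only and do not represent data from the study.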
Metrics of within-session reproducibility and reliability were similar in both patients and controls (Fig. 2). In grey-matter, the CVs of patients and controls were within roughly 14% of each other (i.e. 16% vs 13.8%, respectively) and correlations between repeat measurements were good (i.e. 0.73 vs 0.71). An unexpected finding was that the reproducibility between sessions was greater than the reproducibility between runs in the same session. Within-session variance was 64 and 70% of the within-subject variance in patients and controls, respectively. After intensity normalization, these proportions were 70 and 77% in the two groups, respectively. This suggests that even after minimizing between-session variance, within-session variance remained dominant. One possible explanation is that perfusion modifiers such as arousal and attention could have led to changes in global CBF between the two runs (Clement et al., 2018). As participants acclimatized to the scanner environment, cerebral perfusion could have decreased considering the two runs were separated by approximately 30 min. However, a repeated measures ANOVA confirmed that average grey-matter perfusion was not significantly different between runs. A more likely explanation is the inherently low SNR of the ASL sequence. Considering that the ASL signal used to calculate perfusion is on the order of 1% (Alsop et al., 2015), several tag-control pairs are typically acquired to improve SNR. In the current study, 8 tag-control pairs were collected in each run to keep the scan time around 5 min, which is typical for clinical studies. However, the unexpectedly poor within-session reproducibility indicates that more averages should be acquired. In order to more accurately characterize perfusion, we recommend a 10-minute SD-pCASL scan.
As a means of visualizing the impact of between-session variability on tracking longitudinal changes in regional CBF, detectability maps were created to show the predicted sample size that would be required to detect a 10% perfusion change in individual anatomical regions (Fig. 4). In light of the unexpectedly high within-session variability, data from the two runs in each session were combined and within-subject (i.e. between-session) variance was estimated based on the two resulting 10-minute SD-pCASL scans. Focusing on FTD-specific ROIs, the predicted sample sizes required for patients (aCBF = 26 ± 14, rCBF = 10 ± 9) were generally larger than those required for controls (aCBF = 13 ± 5, rCBF = 5 ± 2). However, this difference only reached statistical significance for absolute CBF. For both groups, a greater number of participants was predicted for ROIs in the frontal and occipital lobes, particularly near watershed regions (Fig. 4). These findings could be related to differences in labeling efficiency between sessions. In an effort to maximize labeling efficiency, a time-of-flight image was used to locate the ideal location for the ASL labeling plane during the first session. Since identical parameters were repeated during the second session, it is possible that the labeling location was not ideal, which could influence the image contrast considering the tortuosity of feeding brain vessels increases with age (Lee et al., 2009;Qiu et al., 2010). This variability was observed in both patients and controls, as is evident in Fig. 1, in which control 1 and patient 10 exhibited reduced frontal perfusion during the second imaging session. Nevertheless, the prediction that roughly 10 participants would be required to detect a 10% perfusion change in regions relevant to FTD indicates that with careful parameter selection, the current implementation of ASL has the sensitivity to detect longitudinal perfusion changes with relatively small sample sizes. This finding is promising for clinical studies given that FTD is relatively rare (Staffaroni et al., 2019a), which can make recruitment of large numbers of patients challenging.
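The relationship between within-subject variability and the sample size needed to detect a given perfusion change can be sketched as follows. This is an illustrative approximation only: it assumes a paired (two-session) design, a normal approximation, a two-sided significance level of 0.05 and 80% power, none of which are confirmed here as the settings used to generate the detectability maps. The function name `paired_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist

def paired_sample_size(cv_percent, change_percent, alpha=0.05, power=0.80):
    """Approximate n to detect a relative change in a paired design.

    Assumes the SD of the paired difference is sqrt(2) times the
    within-subject CV, and uses a normal (z) approximation.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    ratio = cv_percent / change_percent
    return math.ceil(2.0 * (z_alpha + z_beta) ** 2 * ratio ** 2)

# Illustrative inputs: a 10% change with within-subject CVs of 10% and 5%.
n_cv10 = paired_sample_size(10.0, 10.0)
n_cv5 = paired_sample_size(5.0, 10.0)
```

Because the required sample size scales with the square of the CV under this approximation, halving the variability reduces the predicted sample size roughly fourfold, which is consistent with the smaller sample sizes predicted for relative CBF.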
Since the prevalence of cerebrovascular disease among patients with FTD is low (Toledo et al., 2013;De Reuck et al., 2012;McKhann, 2001), variation in transit times between the different subtypes was not expected, and therefore, ATT data were averaged across all patient participants. Visual inspection of the ATT maps averaged across patients shows strong resemblance to the corresponding maps generated from the controls (Fig. 7). In addition, average whole-brain ATT values for controls and patients were not significantly different. Likewise, between-session whole-brain CV for patients (15.2 ± 6.3%) and controls (17.1 ± 5.9%) were similar, and the CV maps were homogeneous (S Fig. 1), indicating no regional effects. In contrast to patients with Alzheimer's disease, for whom cerebrovascular dysfunction can result in compromised perfusion (Alsop et al., 2000;De Jong et al., 2019;Mak et al., 2012), there is limited evidence of vascular degeneration in FTD patients beyond that typically attributed to age (De Reuck et al., 2012). In the absence of a priori knowledge, the current study used a PLD of 2 s, based on the recommendations of the ASL white paper (Alsop et al., 2015). Although some studies have shown increased sensitivity by correcting for transit times (Cohen et al., 2020), on average, less than 4.2 ± 4.4% of transit times measured in the current study were greater than the selected PLD. When ATT values greater than 2.3 s were considered, this value dropped to 0.8 ± 0.9%. These results are in agreement with Dai et al. who reported that a PLD between 2 and 2.3 s is sufficient for imaging elderly cohorts (Dai et al., 2017). Given that intensity normalization further diminishes the need for ATT correction (Dai et al., 2017) and yields greater reproducibility and reliability (Ssali et al., 2016), a PLD of 2 s is sufficient for SD-pCASL imaging of FTD patients. This is particularly encouraging for studies focused on assessing presymptomatic perfusion changes considering the effects of ATT should be even more muted due to the younger age of the participants.
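The proportion of transit times exceeding a chosen PLD can be computed directly from an ATT map. The sketch below assumes the ATT map and a brain mask are available as arrays in the same space; the function name `percent_att_above` and the synthetic values are hypothetical and do not reproduce the study's data.

```python
import numpy as np

def percent_att_above(att_map, mask, threshold_s):
    """Percentage of masked voxels whose ATT (in seconds) exceeds threshold_s."""
    att = np.asarray(att_map, dtype=float)[np.asarray(mask, dtype=bool)]
    return 100.0 * np.count_nonzero(att > threshold_s) / att.size

# Synthetic ATT values in seconds for four voxels inside the mask.
att = np.array([1.0, 1.5, 2.1, 2.4])
mask = np.array([True, True, True, True])

pct_above_pld = percent_att_above(att, mask, 2.0)   # voxels beyond a 2 s PLD
pct_above_late = percent_att_above(att, mask, 2.3)  # voxels beyond 2.3 s
```

Applied per participant and then averaged, a calculation of this kind would yield summary values of the form reported above (mean ± SD across participants).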
While single delay pCASL with a 3D readout is currently recommended for ASL perfusion imaging (Alsop et al., 2015), novel Hadamard encoded approaches are gaining interest due to their ability to image both CBF and ATT in a similar amount of time with good spatial resolution and SNR (Samson-himmelstjerna et al., 2016;Dai et al., 2013;van Osch et al., 2018). Of note, FL_TE-pCASL offers the benefit of acquiring both perfusion and transit time images with similar PLD and LD as the single delay sequence, but without an increase in scan time (Wells et al., 2010;Günther, 2006). While visual inspection revealed greater agreement between FL_TE-pCASL and SD-pCASL, both Hadamard-encoded sequences showed increased contrast in the basal ganglia and insula, compared to SD-pCASL (Fig. 5). This difference could be attributed to z-direction blurring, a well-documented artifact associated with GRASE readout, as well as differences in labeling parameters between the sequences (Woods et al., 2020). Hadamard-based perfusion was significantly lower than SD-pCASL in the occipital gyrus. This reduced perfusion is in line with the superior posterior watershed region that experiences longer ATTs and, furthermore, inspection of perfusion maps revealed hyperperfusion where the vessels enter the cerebrum (e.g. the circle of Willis), particularly with conv_TE-pCASL. Overall, Hadamard encoded sequences produced significantly higher CBF estimates relative to SD-pCASL (Fig. 6). To date, few studies have performed a head-to-head comparison of these sequences and no studies have done so in older populations. A recent study demonstrated that CBF measured by conv_TE-pCASL was 22% higher in a population of young healthy participants (Guo et al., 2018). Given that internal consistency is critical for longitudinal imaging, the good reproducibility of conv_TE-pCASL (CV = 10.5%, ICC between = 0.77) over a 45-day period reported by Cohen et al. highlights the potential of these novel sequences (Cohen et al., 2020). Furthermore, as demonstrated by the results of this study, relative perfusion provides a stable estimate of perfusion over time, making this systematic offset less concerning.
There are a number of limitations with the current study. First, the estimates of variability were conducted with small sample sizes. Despite this, the results among patients and controls were similar to each other as well as to previous studies. Considering that different subtypes may have differing degrees of regional variability, future studies could investigate subtype-specific sensitivities. Another consideration was the use of global CBF for normalizing the perfusion images. Reference regions need to be selected carefully since errors can be introduced if there are regional perfusion deficits that alter global CBF (Borghammer et al., 2008). While the cerebellum is often used as a reference region, it is not always within the ASL FOV, and furthermore, studies have identified cerebellar atrophy in some patients with FTD (Schmahmann, 2016;Gellersen, 2017). Since there was no difference in global perfusion between patients and controls, global CBF was chosen as a suitable reference. The low spatial resolution compounded with regional brain atrophy in patient participants can lead to reduced perfusion due to partial volume errors. While the current study did not investigate the influence of partial volume errors, previous work demonstrated that similar hypoperfusion patterns were detected without partial volume correction applied (Dolui et al., 2020). More importantly, a between-session delay of 4 weeks was chosen to avoid any atrophy-driven perfusion changes between sessions. Finally, although efforts were made to select optimal labeling and bolus/sub-bolus durations among the ASL sequences being compared, it is evident that the Hadamard encoded sequences could have been further optimized (e.g. increasing the sub-bolus durations and PLD) to reduce the effects of vascular signal (Woods et al., 2020).

Conclusion
The results of the current study indicate that SD-pCASL with the appropriate labeling parameters is a promising approach for assessing longitudinal changes in CBF associated with FTD. With the current implementation, it was predicted that ASL can reliably detect changes in perfusion as small as 10% with an estimated sample size of 10 patients when relative perfusion is used. Agreement of longitudinal measures of CBF and ATT was similar in patients and controls, indicating that there was no additional source of variability with FTD patients compared to age-matched controls. Relative to Hadamard-encoded sequences, SD-pCASL showed better grey-to-white matter contrast; however, Hadamard-encoded ASL showed better contrast in the deep-brain regions. While the current study assessed the variability of perfusion measurements within-subject, another important aspect in the diagnosis of FTD is the ability to assess differences between patients and controls. Future work could evaluate the sensitivity of ASL for detecting disease-driven perfusion changes by direct comparison to the gold standard, PET with radiolabeled water (Ssali et al., 2018). Additionally, toward the ultimate goal of assessing longitudinal perfusion changes, methodologies described in the current study could be implemented in large multicentre studies to gain greater insight into the potential clinical role of ASL in diagnosis and management of FTD.