Clinical and volumetric changes with increasing functional impairment in familial frontotemporal lobar degeneration

Introduction: The Advancing Research and Treatment in Frontotemporal Lobar Degeneration and Longitudinal Evaluation of Familial Frontotemporal Dementia Subjects longitudinal studies were designed to describe the natural history of familial-frontotemporal lobar degeneration due to autosomal dominant mutations. Methods: We examined cognitive performance, behavioral ratings, and brain volumes from the first time point in 320 MAPT, GRN, and C9orf72 family members, including 102 non–mutation carriers, 103 asymptomatic carriers, 43 mildly/questionably symptomatic carriers, and 72 carriers with dementia. Results: Asymptomatic carriers showed similar scores on all clinical measures compared with noncarriers but reduced frontal and temporal volumes. Those with mild/questionable impairment showed decreased verbal recall, fluency, and Trail Making Test performance and impaired mood and self-monitoring. Dementia was associated with impairment in all measures. All MAPT carriers with dementia showed temporal atrophy, but otherwise, there was no single cognitive test or brain region that was abnormal in all subjects. Discussion: Imaging changes appear to precede clinical changes in familial-frontotemporal lobar degeneration, but specific early clinical and imaging changes vary across individuals.


Introduction
Frontotemporal lobar degeneration (FTLD) is a progressive, currently incurable, neurodegenerative disease that is most commonly associated with central nervous system accumulation of one of two proteins: tau or transactive response DNA-binding protein 43 [1]. Most efforts to develop treatments for FTLD are focusing on clearing and/or decreasing formation of these proteins [2]. Studies of such treatments will be more challenging because of the clinical heterogeneity of FTLD, which can present with a variety of syndromes [3]. Increasing evidence indicates that prediction of the specific FTLD protein based on the clinical syndrome can be unreliable [3]. This problem has fueled interest in cohorts of patients with FTLD in whom the protein pathology is predictable.
This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Treatment studies in familial FTLD (f-FTLD) are particularly important because each mutation is highly predictive of a specific proteinopathy [4]. In addition, because f-FTLD participants can be identified before symptoms begin, studies can evaluate the effect of a treatment in the earliest phases of illness and also test whether a treatment delays or prevents onset of symptoms.
These considerations led to the creation of the Longitudinal Evaluation of Familial Frontotemporal Dementia Subjects (LEFFTDS) and Advancing Research and Treatment in Frontotemporal Lobar Degeneration (ARTFL) studies, which were designed to understand the natural history of f-FTLD by longitudinally following up both symptomatic and asymptomatic mutation carriers. To maximize generalizability of the findings, the studies are mostly focusing on families with mutations in the genes most commonly associated with f-FTLD: MAPT, GRN, and C9orf72.
The current analysis presents data collected at the first time point from this cohort. We compared cognitive performance, behavioral ratings, and brain volumes across groups of asymptomatic and symptomatic carriers to identify the measures that might mark the early development of symptoms. One of the problems with group analysis, however, is that the findings may not apply to all individuals. This is a critical issue in f-FTLD, where each mutation affects the brain differently, and a person with a given mutation can present with a variety of symptoms [1]. Relying on a single test for all carriers may delay recognition of oncoming symptoms. To examine this issue, we quantified the frequency with which participants in each group showed abnormal performance in each cognitive measure and brain region.

Methods
Participants were recruited at one of 18 centers that are part of the ARTFL (https://www.rarediseasesnetwork.org/cms/artfl) and/or LEFFTDS (https://clinicaltrials.gov/show/NCT02372773) networks and included in this analysis if there was a confirmed mutation in the MAPT, GRN, or C9orf72 genes in at least one family member. Clinicians were blinded [...] Cognitive Assessment (MoCA), measures of verbal episodic memory (the Craft story recall task, which is similar to the Wechsler Memory Scale logical memory task), visual episodic memory (ten-minute recall for the Benson complex figure), visuospatial function (copy of the Benson figure), naming (the Multilingual Naming Test [MINT]), lexical fluency (generation of words beginning with the letters "F" and "L", each in one minute), category fluency (generation of animal and vegetable names, each in one minute), attention (forward digit span, Trail Making Test part A), working memory (backward digit span), and set shifting (Trail Making Test part B). Additional tasks included the short form of the California Verbal Learning Test [6]. Measures to characterize socioemotional behavior included the short version of the Neuropsychiatric Inventory (NPI-Q [7]), the Revised Self-Monitoring Scale (RSMS [8]), and the Behavioral Inhibition Scale [9]. Mood was quantified with the Geriatric Depression Scale (GDS [10]). Motor function was quantified with the Unified Parkinson's Disease Rating Scale [11] motor examination. General functional state was characterized using an expanded version of the Clinical Dementia Rating Scale (which is now known as the CDR® Staging Instrument and will be abbreviated as CDR® hereafter [12]). The CDR® provides a categorical rating of severity in six domains, with scores ranging from 0 (clinically normal) to 0.5 (mild/questionable symptoms not affecting daily function) and to levels 1, 2, or 3 (all indicating significant impairment consistent with dementia) for each domain.
To broaden the utility of the CDR® to FTLD spectrum disorders, behavior/comportment/personality and language domains have been added to the CDR® to form the 8-domain "FTLD-CDR" [13], and these additional behavior and language domain ratings are implemented by the National Alzheimer's Coordinating Center (NACC). This 8-domain rating is now abbreviated as the "CDR® plus NACC FTLD". The Progressive Supranuclear Palsy Rating Scale [14] quantifies a combination of motor, behavior, and cognitive features relevant to progressive supranuclear palsy.

Genetic testing
Each participant had genetic testing to identify the presence or absence of specific mutations associated with FTLD. Details of the procedures and results of genetic testing are described in a separate publication (Ramos et al., this issue). Although all participants are offered the opportunity to undergo clinical genetic testing, most of the asymptomatic persons have chosen to refrain from clinical testing thus far. However, each participant undergoes research genetic testing (clinicians remain blind to the results, which are not shared with participants), and therefore the mutation status is determined for each participant.

Image processing
Cortical volumes for the frontal and temporal lobes for each individual were also calculated by transforming a brain parcellation atlas [18] into the study-specific brain space and summing all modulated gray matter within the frontal and temporal lobes. Peak coordinates for imaging findings are provided in the coordinates of the International Consortium for Brain Mapping brain template [19].
Additional details on the acquisition, quality control, and image-processing procedures are provided in the Supplementary Materials.

Creation of groups for analysis
The group was divided into four categories based on mutation status and clinical severity, as measured by the CDR® plus NACC FTLD. The groups were asymptomatic non-mutation carriers (−mFTLD-CDR = 0), asymptomatic mutation carriers (+mFTLD-CDR = 0), mildly/questionably symptomatic mutation carriers (+mFTLD-CDR = 0.5), and symptomatic mutation carriers (+mFTLD-CDR ≥ 1). Consistent with the established approach for assigning these ratings, clinicians used a combination of direct patient observation and informant report to categorize each patient, and there was no formal incorporation of neuropsychological data. Because the CDR® does not include categories for language and behavior, there is no established algorithm for creating an overall rating that includes the outcomes of these additional ratings. Consequently, patients may have subtle impairment due to language or behavioral problems and still be rated as 0 on the CDR®. Therefore, we created an algorithm to integrate ratings for all eight categories into a global rating for each individual. The rules were as follows:

1. If all domains are 0, the global CDR® plus NACC FTLD score is 0.
2. If the maximum domain score is 0.5, the global CDR® plus NACC FTLD score is 0.5.
3. If the maximum domain score is above 0.5 in any domain, then the following applies:
   A. If the maximum domain score is 1 and all other domains are 0, the global CDR® plus NACC FTLD score is 0.5.
   B. If the maximum domain score is 2 or 3 and all other domains are 0, the global CDR® plus NACC FTLD score is 1.
   C. If the maximum domain score occurs only once and there is another rating besides zero, the global CDR® plus NACC FTLD score is one level lower than the level corresponding to maximum impairment (e.g., if maximum = 2 and there is another rating besides zero, the global CDR® plus NACC FTLD score is 1; if maximum = 1 and there is another rating besides zero, the global CDR® plus NACC FTLD score is 0.5).
   D. If the maximum domain score occurs more than once (e.g., 1 in 2 domains, 2 in 2 domains), then the global CDR® plus NACC FTLD score is that maximum domain score.
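
The scoring rules above can be written as a short function. The following Python sketch is an illustration only and is not the study's own code: the function name and the list-based input are assumptions, and it reads "one level lower" in rule C as one step down the ordered levels 0, 0.5, 1, 2, 3.

```python
# Ordered rating levels available in each domain.
LEVELS = [0, 0.5, 1, 2, 3]


def global_cdr_plus_nacc_ftld(domain_scores):
    """Combine domain ratings into a global score using rules 1-3 (A-D).

    Hypothetical helper: the name and the list input are illustrative.
    """
    scores = list(domain_scores)
    if not scores or any(s not in LEVELS for s in scores):
        raise ValueError("each domain score must be one of 0, 0.5, 1, 2, 3")

    top = max(scores)
    # Rules 1 and 2: a maximum of 0 or 0.5 is returned unchanged.
    if top <= 0.5:
        return top
    # Rule 3D: the maximum occurs in more than one domain.
    if scores.count(top) > 1:
        return top
    # From here on the maximum occurs exactly once.
    only_one_nonzero = sum(1 for s in scores if s != 0) == 1
    if only_one_nonzero:
        # Rule 3A (maximum of 1) and rule 3B (maximum of 2 or 3).
        return 0.5 if top == 1 else 1
    # Rule 3C: another nonzero rating exists, so step down one level.
    return LEVELS[LEVELS.index(top) - 1]


# Example with eight hypothetical domain ratings.
example = global_cdr_plus_nacc_ftld([2, 0.5, 0, 0, 0, 0, 0, 0])
```

Under these rules a single isolated rating of 3 gives a global score of 1 (rule B), whereas a single 3 accompanied by any other nonzero rating gives 2 (rule C).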

Group comparisons
Changes occurring with disease stage were examined by comparing the mean value across groups for all clinical variables and for the frontal and temporal lobes using linear regression, treating each variable as an outcome and disease stage as a categorical predictor, and including age, sex, and education as covariates. For models where the effect of group was statistically significant (P < .05), we conducted targeted post-hoc analyses by comparing each mutation carrier group with the −mFTLD-CDR = 0 group as well as with the lower stages of disease (e.g., +mFTLD-CDR ≥ 1 was compared with −mFTLD-CDR = 0 and +mFTLD-CDR = 0.5). To maximize statistical power, these analyses were performed with all three types of mutations together. Statistical analysis was performed using R (www.R-project.org).
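
To make the model structure concrete, the sketch below shows how disease stage can enter a linear regression as a categorical predictor alongside age, sex, and education. It is a plain-Python illustration, not the R code used for the analysis. The stage labels, the dummy coding with noncarriers as the reference group, the participant values, and the coefficients are all hypothetical. It estimates coefficients only and does not compute the P values used for the group effect or the post-hoc comparisons.

```python
# Stage labels are illustrative; the first entry is the reference group.
STAGES = ["-m0", "+m0", "+m0.5", "+m1"]


def design_row(stage, age, sex, education):
    """Intercept, one dummy per non-reference stage, then the covariates."""
    dummies = [1.0 if stage == s else 0.0 for s in STAGES[1:]]
    return [1.0] + dummies + [float(age), float(sex), float(education)]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical participants: (stage, age, sex coded 0/1, years of education).
people = [
    ("-m0", 40, 0, 12), ("-m0", 50, 1, 16), ("-m0", 60, 0, 18),
    ("+m0", 45, 1, 14), ("+m0", 55, 0, 12), ("+m0", 65, 1, 20),
    ("+m0.5", 42, 0, 16), ("+m0.5", 58, 1, 13), ("+m0.5", 63, 1, 18),
    ("+m1", 48, 1, 12), ("+m1", 57, 0, 17), ("+m1", 66, 0, 14),
]
# Made-up coefficients used to generate noise-free outcome values.
true_coefs = [30.0, 0.0, -2.0, -8.0, -0.1, 0.5, 0.3]
X = [design_row(*p) for p in people]
y = [sum(c * x for c, x in zip(true_coefs, row)) for row in X]
coefs = fit_ols(X, y)  # recovers true_coefs up to rounding error
```

With this coding, each stage coefficient is the covariate-adjusted difference in the outcome between that carrier group and the noncarrier reference group.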

Consistency of abnormalities across individuals
One of the intended uses of these measures would be to indicate that a previously healthy mutation carrier is entering a new phase of illness where function is beginning to be affected. While changes in mean values with disease stage are informative for understanding which measures might mark these transitions, it is also important to understand how well these group observations apply to each individual. One way to examine this is to quantify the proportion of individuals that show abnormalities in each variable at each stage. The ARTFL/LEFFTDS team recently implemented a procedure for transforming each individual's neuropsychological scores into age- and education-corrected standardized scores based on the normative data provided by the NACC. The details of the procedure are published elsewhere [20], and the procedure has not been implemented for all variables, but for those that have these transformations available, we examined the percentage of individuals at each stage that were abnormal using a cutoff of z = −1.5. We took a similar approach with the imaging data by creating maps showing the proportion of individuals that had w-scores lower than −1.5 at every voxel. For these analyses, the data are presented separately for each mutation type to provide information about variability in specific symptoms across mutation types.
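
The frequency calculation can be illustrated with a minimal Python sketch. The function name, the use of `None` for scores without available norms, and the example values are assumptions made for illustration. The same function would apply to w-scores at a single voxel.

```python
def proportion_abnormal(scores, cutoff=-1.5):
    """Share of non-missing standardized scores strictly below the cutoff."""
    observed = [s for s in scores if s is not None]
    if not observed:
        return None  # no normed scores available for this measure
    return sum(1 for s in observed if s < cutoff) / len(observed)


# Hypothetical z-scores for one group; None marks a missing score.
z_by_test = {
    "MoCA": [-2.0, -1.5, 0.3, -1.6],
    "MINT": [0.1, None, -1.7, 0.4],
}
rates = {test: proportion_abnormal(z) for test, z in z_by_test.items()}
```

A score exactly at the cutoff is counted as normal here, matching the z < −1.5 criterion.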

Mean values across levels of severity
Linear models grouped by levels of severity combined across mutation carriers revealed statistically significant effects of group for nearly every variable examined (Table 1). Post-hoc testing revealed that this was largely driven by the +mFTLD-CDR ≥ 1 group, which showed significant impairments in all clinical variables and decreased frontal and temporal brain volumes compared with the −mFTLD-CDR = 0, +mFTLD-CDR = 0, and +mFTLD-CDR = 0.5 groups.

Frequency of impairment on cognitive testing
Data on the percentage of participants showing impairment in each cognitive test are shown in Fig. 1, with data for each mutation type and level of severity plotted in colored bars relative to the proportion of −mFTLD-CDR = 0 showing abnormality in that measure, plotted in gray bars. Additional details are shown in Supplementary Tables 2-5 in the Supplementary Materials including how many in each group had any abnormal test, how many had abnormal performance for each test, and, for each test, how many had abnormal performance on only that test. Seventy percent of individuals in the −mFTLD-CDR = 0 group showed abnormal performance for at least one score, with the most commonly abnormal test being the MoCA (22%; Fig. 1, gray bars; Supplementary Table 2), and the second most common being the MINT (20%).
For each mutation, abnormalities were sometimes more common in carriers than in noncarriers at the FTLD-CDR = 0 stage, but the frequency of abnormalities increased along with overall disease severity (Fig. 1). For instance, the MoCA was abnormal in 22% of the −mFTLD-CDR = 0 group; abnormal MoCA scores were more frequent in +mFTLD-CDR = 0 MAPT carriers (29%) but less common in +mFTLD-CDR = 0 carriers of GRN (18%) and C9orf72 (15%). Overall, about 70% to 80% of +mFTLD-CDR = 0 and +mFTLD-CDR = 0.5 carriers had at least one abnormal test, whereas nearly 100% had at least one abnormal test in the +mFTLD-CDR ≥ 1 group (Supplementary Tables 3-5). The MoCA was a commonly abnormal test (most common or second most common in nearly all groups), and the MINT was frequently abnormal. In particular, the MINT was the most common or second most commonly abnormal test at each level of severity in MAPT carriers, who had the most consistent pattern of abnormalities across levels of severity (Fig. 1; Supplementary Table 3). Among GRN carriers, abnormal performance on the Craft story recall task was relatively common, along with Trail Making Test and "F" word fluency (Fig. 1; Supplementary Table 4). In C9orf72 carriers, there appeared to be the least consistency across levels of severity beyond the MoCA (Fig. 1; Supplementary Table 5). There was no group in whom the same test was abnormal in 100% of participants, and in all mutation types, there was a substantial number of individuals who had only one abnormal test that was not the most common test. For instance, in the +mFTLD-CDR = 0 C9orf72 group (Supplementary Table 5), the most common abnormal task was the MINT (9 people, 23% of participants), but 20 people (50%) performed normally on the MINT and abnormally on another task, and 12 people (30%) were abnormal on only one test that was not the MINT.
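
The counts described above (any abnormal test, abnormal on each test, abnormal on only that test, and abnormal elsewhere but not on the most common test) can be tallied as in the following Python sketch. The function name, the input layout (participant ID mapped to the set of tests on which that person was abnormal), and the example data are hypothetical and are not taken from the supplementary tables.

```python
from collections import Counter


def tally_abnormal(abnormal_by_person, reference_test):
    """Summarize per-person sets of abnormal tests for one group."""
    sets = list(abnormal_by_person.values())
    return {
        # People with at least one abnormal test.
        "any_abnormal": sum(1 for tests in sets if tests),
        # People abnormal on each test.
        "per_test": Counter(t for tests in sets for t in tests),
        # People whose only abnormal test was that test.
        "only_that_test": Counter(
            next(iter(tests)) for tests in sets if len(tests) == 1
        ),
        # Normal on the reference test but abnormal on some other test.
        "abnormal_elsewhere": sum(
            1 for tests in sets if tests and reference_test not in tests
        ),
        # Exactly one abnormal test, and it is not the reference test.
        "single_not_reference": sum(
            1 for tests in sets
            if len(tests) == 1 and reference_test not in tests
        ),
    }


# Hypothetical group of five participants.
group = {
    "p1": {"MINT"},
    "p2": {"MoCA"},
    "p3": {"MoCA", "Trails B"},
    "p4": set(),
    "p5": {"MINT", "MoCA"},
}
summary = tally_abnormal(group, reference_test="MINT")
```

Dividing each count by the group size gives percentages of the kind reported in the text.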

Regional volume loss across individuals
In every group, there was at least one voxel that was more than 1.5 w-score units below normal (Fig. 2).

Discussion
The goal of this analysis was to characterize cognitive performance, behavioral ratings, and brain volumes in a large group of f-FTLD family members. In group comparisons, asymptomatic mutation carriers showed nearly identical scores on all clinical measures compared with noncarriers but reduced frontal and temporal lobe volumes. The group with mild/questionable impairment showed decreased story recall, word list recall, verbal fluency, processing speed, and set-shifting performance and impaired mood and self-monitoring. With development of dementia, all scores were abnormal compared with scores in less symptomatic groups. Looking at performance across individuals, the MoCA was frequently abnormal in all mutations, but this was also true in many noncarriers. The effects of MAPT mutations on brain volume and cognition were most consistent across individuals and stages, with naming impairment and temporal volume loss being present in a high proportion of carriers. Memory disorders were prominent in GRN, but C9orf72 did not show a consistent pattern of impairment in the early stages, and both GRN and C9orf72 showed lower levels of overlap in regional volume loss than MAPT.
These findings have important implications for research and therapy in f-FTLD, which is a critical context for testing treatments in the earliest phases of disease and also for testing whether treatments can prevent onset of symptoms. With regard to prevention, our finding that neuroimaging changes appear to precede clinical changes is consistent with multiple studies demonstrating brain volume loss and other brain imaging abnormalities in asymptomatic mutation carriers [21] and findings from a comprehensive study in a similar large cohort called the Genetic FTD Initiative (GENFI), which suggested that imaging findings precede symptom onset by more than 10 years [22]. These observations support the idea that imaging can serve as a leading indicator of clinical changes and that mutation carriers with imaging abnormalities will be important candidates for prevention studies. Additional work will be required to quantify the degree of abnormality that serves as an early marker, to quantify the timing until symptoms develop, and to assess the value of additional imaging techniques such as diffusion MRI and functional MRI [23].
Ideally, sensitivity for early detection of disease should improve if monitoring could be targeted at brain regions and clinical features that are most likely to be affected first in each mutation. In MAPT, we found very frequent involvement of the temporal lobe, which is also the region most associated with MAPT mutations in prior studies [24]. The consistency of this finding supports a strategy of monitoring early temporal lobe changes in MAPT carriers. However, the findings in our GRN and C9orf72 cohorts suggest that focusing on a specific brain region in these groups would not capture early changes well in all individuals, although thalamic changes seemed to be fairly consistent in C9orf72 carriers. Similarly, our clinical data do not point to one particular cognitive score that reliably marks early symptoms, even in MAPT. Although our finding that naming impairment is frequent in early MAPT carriers is similar to observations from GENFI [22], there were many asymptomatic and mildly/questionably symptomatic MAPT mutation carriers who showed impairment in other tasks but not in naming. Consistency across GRN and C9orf72 mutation carriers appeared to be even lower, although abnormal trail making and fluency scores were relatively frequent in both groups, consistent with the frontoparietal involvement in both mutation types. This is in line with prior observations that patients with FTLD mutations can present with a variety of clinical syndromes, even with the same mutation in the same family [1].
One approach for dealing with the heterogeneity in mutation carriers would be to track larger portions of the brain such as the frontal and temporal lobes. Similarly, one could use composite measures of cognition that represent function across multiple domains. The fact that the MoCA was one of the most frequently abnormal tests in carriers, even in the asymptomatic and mildly/questionably symptomatic groups, suggests that this might be a fruitful strategy. However, many noncarriers also showed abnormal performance on the MoCA, which suggests that relying on an arbitrary threshold to identify oncoming symptoms would limit the accuracy of the approach. Thus, additional longitudinal work will have to be done to empirically define performance thresholds that reliably predict development of functional changes. Another approach would be to use a multiple-predictor strategy to identify combinations of cognitive tests and behavioral measures from a battery such as the one used in this project to predict onset of symptoms. Such an approach could identify multiple patterns of impairment with predictive value and thus apply to a variety of clinical presentations. A similar approach can be used for brain imaging (see the article by Staffaroni et al. [25] in this issue for example).
These data illustrate the importance and promise of large longitudinal studies of f-FTLD such as LEFFTDS, GENFI, and similar efforts. While our findings reinforce the complexity and heterogeneity of FTLD, even in the context of disease-causing mutations, they suggest that early changes in imaging, cognitive performance, and behavioral ratings may be able to serve as early predictors of functional impairment and help to identify suitable candidates for prevention and early-stage treatment trials. As longitudinal data from these cohorts emerge, they will provide invaluable information about the earliest signs of FTLD and neurodegenerative disease in general.

Fig. 1. Proportion of individuals in each group with abnormal performance (z < −1.5) on each cognitive test with available norms (colored bars) superimposed on proportion of noncarriers with abnormal performance on that test. Bars extend to indicate largest observed proportion, so that bars where colors extend beyond gray indicate that mutation carrier group showed higher proportion (denoted by rightward extent of colored bar from the y-axis line) than noncarriers (whose proportion is denoted by rightward extent of gray bars from y-axis line).

Fig. 2. Proportion of individuals in each group with reduced gray matter volume (w-score < −1.5) at each gray matter voxel. Increasing color from blue to yellow in "heat map" indicates higher proportion of individuals in that group showed reduced volume at that location. Left hemisphere is displayed on the left in coronal images. Abbreviations: MAPT, microtubule associated tau; GRN, progranulin; C9orf72, chromosome 9 open reading frame 72.

Alzheimers Dement. Author manuscript; available in PMC 2021 January 06.