Markers of early changes in cognition across cohorts of adults with Down syndrome at risk of Alzheimer's disease

Abstract
Introduction: Down syndrome (DS), a genetic variant of early-onset Alzheimer's disease (AD), lacks a suitable outcome measure for prevention trials targeting pre‐dementia stages.
Methods: We used cognitive test data collected in several international longitudinal aging studies from 312 participants with DS without dementia to identify composites that were sensitive to change over time. We then conducted additional analyses to support the utility of the composites. The composites were presented to an expert panel to determine the optimal cognitive battery based on predetermined criteria.
Results: There were common cognitive domains across site composites, which were sensitive to early decline. The final composite consisted of memory, language/executive functioning, selective attention, orientation, and praxis tests.
Discussion: We have identified a composite that is sensitive to early decline and thus may have utility as an outcome measure in trials to prevent or delay symptoms of AD in DS.


BACKGROUND
The ultra-high risk for AD, combined with a high diagnostic certainty for an underlying Alzheimer pathology in cases with dementia, makes people with DS an important population to consider for randomized controlled trials (RCTs) of interventions that seek to prevent, delay, or halt the development and progression of dementia. 3 Although there is growing interest in including people with DS in intervention trials, barriers remain for RCTs, including the need for reliable cognitive outcome measures of progression during the preclinical to prodromal stages of dementia. 4 Identifying early subtle changes in cognition and diagnosing AD in DS can be challenging because of the presence of developmental cognitive impairments associated with lifelong intellectual disability (ID) and the variability in baseline cognitive functioning across individuals. 5 However, there are several cognitive tests measuring memory, verbal fluency, planning, inhibition, attention, and visuo-motor abilities that appear to be appropriate for discriminating between those with and without dementia and for tracking AD-related decline and progression. [6][7][8][9][10][11][12] Although these tests show promising results in distinguishing individuals with and without dementia and may be useful as cognitive endpoints for RCTs, it is currently unclear which tests show the earliest decline (before dementia can be diagnosed clinically) for use in prevention trials. There is evidence to suggest that declines in memory (recall of new information) and attention occur first in DS, 9,13 similar to sporadic AD. Other DS studies have found that impairments in executive functioning and behavioral and psychological changes precede difficulties in memory. 14 Determining which abilities first show AD-related decline during the transition to the prodromal phase, and identifying the tests most sensitive to change in this period, is vital to determine the optimal point for an intervention. 15
If a given test is found to be the most sensitive (ie, earliest to change), then by implication it likely taps the cognitive modality that is quickest to change.
To facilitate the development of the first AD prevention trials in DS, there is a need to refine and adapt current tests of cognition and identify those which are most sensitive to early AD-related impairments and, therefore, predictive of dementia before the diagnosis can be made. Such a test battery would also be valuable to clinicians by providing them with predictive measures for tracking AD-related decline.
To this end, we utilized data from existing longitudinal studies associated with the Horizon 21 DS consortium (H21 consortium), 4 as well as from two DS research cohorts in the United States. The advantages of this approach include providing a more diverse sample of cognitive data from individuals with DS than is typically possible in single-site studies.
It also capitalizes on the expertise of multiple research groups and provides an opportunity to cross-validate the findings of one cohort with another in the presence of population, cultural, and language differences.
Our aim was to use a data-driven approach to identify the cognitive tests or test items that are most sensitive in detecting early cognitive change in adults with DS. We then sought to use these results, together with expertise from clinicians and researchers familiar with the cognitive tools, to identify the optimal constellation of tests or test paradigms to constitute a composite cognitive assessment battery for use in future RCTs in DS and in clinical settings.

Cohorts
Longitudinal data were used from five observational studies on age-associated cognitive change in DS. These sites included data from

Statistical analysis
Due to differences in the cognitive tests that were administered as well as the length and frequency of follow-up, all analyses were conducted separately within each available cohort. We were interested in the rates of change on each cognitive test and therefore conceptualized "years since baseline visit" as the time variable (hereafter referred to as "time"). Our modeling strategy then proceeded in several steps.
First, all neuropsychological scores were z-scored to the baseline visit.
Second, a linear mixed-effects model was constructed using the "lme4" package 21 to predict scores on a given cognitive test from the "time" variable. Random intercepts across participants were included in all models. Third, we extracted the beta weight of "time" from the model, which indexes the annualized rate of change in z-scores. Fourth, we con-
We repeated steps 2-4 iteratively to evaluate a variety of cognitive composite scores. We first analyzed each cognitive test in isolation; then we averaged two tests together, then three, and so on up to a maximum of six tests in a single composite. This process was applied to all tests available for a given cohort, resulting in dozens of composite scores that each consisted of the average of between one and six cognitive tests. For each composite, the Cohen's d scores were extracted from the LME model and then rank ordered in terms of absolute magnitude. Composites with the highest scores were retained for further analysis/discussion. Any test that appeared in three of the five top composite scores was assumed to tap a cognitive domain (eg, attention, memory, executive function) that shows large and consistent decline in individuals who likely have preclinical AD. We then used these tests to form an "optimal" composite consisting of the z-scored average of each of the measures.
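The composite search described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' lme4 code. It makes several assumptions that are not in the source: complete data with a baseline visit at time 0, a pooled within-participant least-squares slope standing in for the "time" coefficient of the random-intercept model, the absolute slope (in baseline-SD units per year) standing in for the Cohen's d ranking, and "three of the five" read as "at least three". All function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, stdev


def zscore_to_baseline(records, test):
    """Standardize one test against the baseline (time == 0) mean and SD."""
    baseline = [r[test] for r in records if r["time"] == 0]
    mu, sd = mean(baseline), stdev(baseline)
    return {(r["id"], r["time"]): (r[test] - mu) / sd for r in records}


def annualized_slope(scores):
    """Pooled within-participant slope of score on time.

    Assumption: used here as a simple stand-in for the fixed effect of
    "time" in a random-intercept mixed model; it is not an lme4 fit.
    """
    by_id = {}
    for (pid, t), y in scores.items():
        by_id.setdefault(pid, []).append((t, y))
    num = den = 0.0
    for obs in by_id.values():
        t_bar = mean([t for t, _ in obs])
        y_bar = mean([y for _, y in obs])
        num += sum((t - t_bar) * (y - y_bar) for t, y in obs)
        den += sum((t - t_bar) ** 2 for t, _ in obs)
    return num / den


def rank_composites(records, tests, max_size=6):
    """Average 1..max_size z-scored tests and rank composites by |slope|."""
    z = {t: zscore_to_baseline(records, t) for t in tests}
    ranked = []
    for k in range(1, min(max_size, len(tests)) + 1):
        for combo in combinations(tests, k):
            composite = {key: mean([z[t][key] for t in combo])
                         for key in z[combo[0]]}
            ranked.append((combo, annualized_slope(composite)))
    ranked.sort(key=lambda item: abs(item[1]), reverse=True)
    return ranked


def recurring_tests(ranked, top_n=5, min_count=3):
    """Tests appearing in at least min_count of the top_n composites."""
    counts = Counter(t for combo, _ in ranked[:top_n] for t in combo)
    return sorted(t for t, c in counts.items() if c >= min_count)


# Hypothetical toy data: three participants, annual visits over two years.
toy = []
for pid, (mem, att, prx) in {"A": (10, 5, 3), "B": (12, 6, 4), "C": (14, 7, 5)}.items():
    for year in (0, 1, 2):
        toy.append({"id": pid, "time": year,
                    "memory": mem - 2 * year,      # steep decline
                    "attention": att,              # no change
                    "praxis": prx - 0.5 * year})   # mild decline

ranked = rank_composites(toy, ["memory", "attention", "praxis"])
selected = recurring_tests(ranked)
```

In this toy example the single memory test ranks first, and `selected` holds the tests that recur among the five highest-ranked composites. With real cohort data, a mixed-model fit and its effect size would replace `annualized_slope`.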

Evaluation of the optimal composite
After selecting the composite within each cohort with the greatest sen-
Level of Intellectual Disability

Consensus discussion
Finally, considering the above information, an expert panel of clinicians and researchers from each site represented in the consortium includ-

RESULTS
The means of the "optimal" composite at the baseline and follow-up visits are plotted in Figure 1.

Consensus discussion
Taking into account the above data and the predefined parameters for selecting tests of interest, the following decisions were made from the tests listed in Table 2. Two tests were chosen as the consensus tests for memory abilities: a modified version of the Cued Recall test (mCRT) 6,23 and the CANTAB paired associate learning test (PAL). 24 This was due to the importance of memory (by accepted diagnostic criteria) in the early stages of AD-related decline and the variety of different tests that were present in the composites from each cohort. With regard to language

TABLE 3 Estimated sample sizes (means and confidence intervals) needed to detect a given effect size for a given trial duration and assessment frequency in the London cohort

Table 4 for full details).

DISCUSSION
This is the first comprehensive analysis of cognitive decline associated with AD in DS across cohorts regardless of assessment tools used, with a focus on decline during the earliest transition from the preclinical to the prodromal stage of AD before a clinical diagnosis of dementia. We demonstrated a range of effect sizes for different cognitive composites and between cohorts; the latter was partially explained by baseline differences in age and intellectual impairment. Not surprisingly, participants who were older at entry were more likely to decline over the course of the study. We then identified cognitive domains and specific neuropsychological tests that were consistently represented as important

Patterns of decline associated with development of AD in DS
This study confirms a pattern of decline that has been emerging in studies of AD-related cognitive change in DS. For the first time, data from several data sets have been combined, thus avoiding some of the issues associated with single research group studies, such as administration and language or cultural effects. Earlier studies highlighted the importance of decline in memory, 23,28 attention, 13,25 and executive functioning, 29 with declines in memory and attention occurring before declines in measures of executive function in machine-learning models. 9 Because AD in DS has, just like in other populations, a strong relationship with age, it is to be expected that older adults (and cohorts with higher mean age) would show larger changes on cognitive measures over time, which was confirmed in our analyses. We also showed that the degree of premorbid ID influenced effect sizes, which may be due to those with more severe ID having lower baseline scores, thus limiting the amount of decline that can be measured over time, and/or due to greater variability of scores for a given individual, as it is harder to administer the test reliably for those who are more intellectually impaired. Although we did not demonstrate any significant relationships between length of follow-up and effect sizes of decline on measures when age, ID level, and sex were taken into account, this may become apparent in studies with longer follow-up and could potentially explain differences between previously reported studies, as the effect size could be small. Other reasons for differences in effect sizes between cohorts are the different tests used within the selected domains; specific tests used within one cohort but not another may assess subdomains that are more sensitive to early decline, creating a discrepancy in the magnitude of the effect reported.
Other potential reasons may be that people who start to show decline have dropped out of longitudinal assessments and that the threshold at which people are no longer offered testing could have differed between sites, or that differences in thresholds for clinical dementia diagnosis have determined the selection of participants included in this analysis.
We excluded participants with an AD diagnosis at baseline, but cross-country differences between cohorts in their criteria for diagnosing AD could have an impact on whether the remaining participants are likely to show change over time if, for example, those in the early prodromal stage have already been given an AD diagnosis.

Outcome measure in clinical trials of treatment to delay cognitive decline in DS individuals
There is renewed interest in the need to target the earlier stages of AD in the context of a series of failed therapeutics in later-stage disease (including symptomatic therapies in DS 30,31). DS represents a relatively large population in which such trials are more feasible (due to

The mCRT consists of a learning phase and a testing phase; during the learning phase, 12 items representing distinct semantic categories are presented on 3 four-item cards, with each item accompanied by a unique category cue (Buschke, 1984). Learning is repeated up to a maximum of three times if necessary. The testing phase consists of three trials of free and cued immediate recall, generating two measures: a free immediate recall score (FIRS; spontaneous recall of the list of 12 items for each trial) and a total immediate score (TIS; FIRS plus items recalled when the category cue was provided). A 20-min delayed recall trial has also been included, generating two additional scores: a free delayed recall score (FDRS) and a total delayed score (FDRS plus items recalled after the category cue was provided).

Total raw score. Adjusted scores based on the CAMCOG-DS scoring can also be used.

Selective attention: Cancellation task. 25 Participants are shown a piece of paper with a clutter of black and white items and, following a practice trial, are asked to cross out each occurrence of a target item. Total time to complete the task and total number of correct targets crossed out are recorded.
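The mCRT scoring rules described above can be illustrated with a short Python sketch. It is a hedged illustration, not an official scoring routine. It assumes that the immediate scores are summed across the three trials, that each trial's free and cued counts together cannot exceed the 12 list items, and it uses the label "TDS" for the total delayed score, which the text does not abbreviate. The names `Trial` and `score_mcrt` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    free: int  # items recalled spontaneously
    cued: int  # additional items recalled after the category cue


def score_mcrt(immediate, delayed, n_items=12):
    """Compute the four mCRT scores from three immediate trials and one delayed trial.

    Assumption: immediate scores are summed across trials (maximum 3 * n_items).
    """
    if len(immediate) != 3:
        raise ValueError("expected three immediate recall trials")
    for trial in [*immediate, delayed]:
        if trial.free < 0 or trial.cued < 0 or trial.free + trial.cued > n_items:
            raise ValueError(f"invalid recall counts: {trial}")
    firs = sum(t.free for t in immediate)        # free immediate recall score
    tis = firs + sum(t.cued for t in immediate)  # total immediate score
    fdrs = delayed.free                          # free delayed recall score
    tds = fdrs + delayed.cued                    # total delayed score (label assumed)
    return {"FIRS": firs, "TIS": tis, "FDRS": fdrs, "TDS": tds}


# Hypothetical participant: three immediate trials and one delayed trial.
scores = score_mcrt([Trial(5, 4), Trial(7, 3), Trial(8, 4)], Trial(6, 4))
```

Keeping the free and cued counts separate per trial makes it straightforward to derive both the free and the total scores without re-entering data.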

Further development
The H21 AD test battery will be included in our longitudinal studies of cognitive decline associated with AD in DS. We will collect data on test-retest reliability of this combination of subtests across cohorts, as well as change in performance over time. This will allow for further analyses to refine scoring, for example, by considering the degree to which each individual test contributes to an overall score, and will deliver additional data to inform the design of future trials. Finally, an important

Strengths and limitations
Although we had access to a unique and large sample, some within-group analyses had limited power. Furthermore, study groups used different assessment batteries and, in some cases, different scoring criteria. Thus, our statistical analyses were aimed at the identification of important cognitive domains rather than the selection of specific neuropsychological tests. The discussion by our consensus group aimed to supplement the statistical modeling in order to identify the specific tests that would be most feasible to employ in a global clinical trial (eg, easy to score, minimal cultural or language differences). However, our main objective was to identify robust patterns of decline regardless of assessment tools used, and therefore diversity in terms of batteries was beneficial rather than a limitation. Despite the relatively small sample sizes of some data sets, the findings were consistent across the cohorts. Length of follow-up also differed between the data sets. We minimized the impact that this may have had by limiting the cases by length of follow-up and by using annualized rates of change on the final composite. Finally, the included cohorts differed slightly in terms of demographic variables such as age, which explains some of the observed differences in effect sizes, as age is a strong predictor of cognitive decline in individuals with DS. This was particularly evident in the London cohort, which included the oldest individuals (and is probably more representative of the range of intellectual ability amongst the cohorts) and showed the largest annualized change in performance.

CONCLUSIONS
The early markers of cognitive change of AD in DS include prominent decline in memory, language, attention, and praxis, and appear to be comparable to decline in other forms of AD, including sporadic AD and autosomal-dominant AD. We have identified a composite that is sensitive to cognitive change during the early prodromal stage of AD in DS prior to a diagnosis of dementia that can be used as an outcome measure in clinical trials of treatment to prevent or delay decline associated with the disease in DS.

AUTHOR CONTRIBUTIONS
Antonia Coppus: Participated in discussions, collected data considered for inclusion, and critically reviewed the manuscript. Juan Fortea: Participated in discussions, collected data considered for inclusion, and critically reviewed the manuscript. Benjamin L. Handen: Contributed data to analysis, led the collection of neuropsychological data at Pittsburgh, and assisted with manuscript preparation. Sigan Hartley: Wisconsin neuropsychologist lead; collected and processed data from Pittsburgh/Wisconsin site and edited the manuscript. Elizabeth Head: Contributed data and assisted with manuscript preparation. Judith Jaeger: Provided high-level guidance on study rationale and objectives, partic-