Influence of cognitive reserve on cognitive and motor function in α-synucleinopathies: A systematic review and multilevel meta-analysis

Cognitive reserve has shown promise as an explanation for neuropathologically unexplainable clinical outcomes in Alzheimer's disease. Recent evidence suggests this effect may be replicated in conditions like Parkinson's disease, dementia with Lewy bodies, and multiple system atrophy. However, the relationships between cognitive reserve and different cognitive abilities, as well as motor outcomes, are still poorly understood in these conditions. Additionally, it is unclear whether the reported effects are confounded by medication. This review analysed studies investigating the relationship between cognitive reserve and clinical outcomes in these α-synucleinopathy cohorts, identified from MEDLINE, Scopus, PsycINFO, CINAHL, and Web of Science. In total, 85 records, containing 176 cognition and 31 motor function effect sizes, were pooled using multilevel meta-analysis. There was a significant, positive association between higher cognitive reserve and both better cognition and motor function. Cognition effect sizes differed by disease subtype, cognitive reserve measure, and outcome type; however, no moderators significantly impacted motor function. Review findings highlight the clinical implications of cognitive reserve and the importance of engaging in reserve-building behaviours.


Rationale
In neurodegenerative disease, there can be significant differences between individuals in their competency to perform certain tasks, without any clear neuropathological explanation (Stern et al., 2019). That is, level of cognitive performance does not always match the degree of neuropathology observed. These individual differences can make prognostic predictions and treatment decisions difficult. The term resilience has been used as a possible explanation for these variations in capacity to preserve brain functionality over time despite accumulating neuropathology (Stern et al., 2023). More specifically, cognitive reserve (CR), a hypothetical brain construct, has been proposed as one possible resilience mechanism to explain such discrepancies between neuropathology levels and functional outcomes. The active CR model hypothesises that CR does not passively delay decline; instead, experiences throughout the lifespan actively help optimise performance to resist neuropathology by recruiting additional brain networks, using brain networks more efficiently, and compensating for damage (see Table 1 for further explanation; Tucker and Stern, 2011; Zahodne et al., 2015). Given its nature, CR is not directly quantifiable; instead, it is inferred from proxy factors, such as education, occupation, intelligence, or comprehensive tests that incorporate multiple factors (Nucci et al., 2012; Valenzuela and Sachdev, 2007). Some of the more recent CR literature has shifted to using a residual approach to index CR rather than relying on socio-behavioural proxy factors. This approach quantifies the difference between observed performance and predicted performance, via a regression model, at a particular level of neuropathology (Reed et al., 2010). This residual approach can offer a more dynamic and statistically based measure to test the main assumption of CR theory (i.e., mismatches in neuropathology and clinical outcomes). However, associations between a CR residual and another variable may be driven by the cognitive ability measure used to derive the performance residual, rather than reflecting a suitable index of CR (Elman et al., 2022).
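The residual approach described above can be illustrated with a minimal sketch. It assumes a single neuropathology predictor and an ordinary least-squares fit; the function name and the toy data are hypothetical and are not taken from any study in this review.

```python
import numpy as np

def cr_residuals(neuropathology, performance):
    """Observed minus predicted performance from a simple linear fit."""
    x = np.asarray(neuropathology, dtype=float)
    y = np.asarray(performance, dtype=float)
    # Design matrix: intercept column plus one neuropathology predictor.
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    predicted = X @ coef
    # Positive residual: performing better than the burden predicts.
    return y - predicted

# Hypothetical toy data: higher burden, lower scores on average.
burden = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [29.0, 27.0, 27.5, 22.0, 21.5]
residuals = cr_residuals(burden, scores)
```

Under this sketch, the third person scores above the fitted line and so receives a positive residual. Because the residual is built from the cognitive score itself, any later correlation between it and another cognitive measure partly reflects that shared score, which is the caveat raised by Elman et al. (2022).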
Under neurodegenerative conditions, the CR hypothesis postulates that patients with a higher CR can better maintain functional performance than those with a lower CR, when other factors are held constant (i.e., degree of neuropathology; Fratiglioni and Wang, 2007; Lee et al., 2019). CR was first proposed following a post-mortem analysis of patients with comparable levels of Alzheimer's disease (AD) neuropathology. Despite similar neuropathology, subjects with higher CR were significantly less likely to present with dementia (Katzman et al., 1988).
α-Synucleinopathies are characterised by accumulation of the protein α-synuclein (α-syn), which aggregates insolubly in neuronal or glial cells (McCann et al., 2014). Parkinson's disease (PD), dementia with Lewy bodies (DLB), and multiple system atrophy (MSA) are recognised as the three main α-synucleinopathies, differing clinically according to their distinct neuropathology (Goedert et al., 2017; McCann et al., 2014). PD, the most common α-synucleinopathy, is characterised by α-syn intraneuronal inclusions known as Lewy bodies and neurites that spread throughout the nervous system, causing loss of motor function and cognitive deficits (Braak et al., 2003). DLB is also characterised by Lewy bodies and neurites; however, their initial manifestation in the neocortex causes an earlier presentation of cognitive impairment, prior to possible motor function deterioration (Walker et al., 2019). Finally, MSA presents distinctly from both PD and DLB, with aggregation of α-syn occurring in the cytoplasm of glial cells (specifically oligodendroglia), resulting in glial intracytoplasmic inclusions (GCI) that trigger the predominant MSA signs (parkinsonism, cerebellar symptoms, and early-disease autonomic dysfunction; see McCann et al., 2014 for a review; Ahmed et al., 2012).
Currently, CR's utility in explaining outcome differences for α-synucleinopathy patients has mostly been researched in PD, with two previous reviews having been published on this topic, and one prospective review forthcoming (Gu and Xu, 2022; Hindle et al., 2014; Kenney et al., 2021). Both published reviews found that CR is associated with better overall cognitive performance in PD patients. However, Gu and Xu (2022) reported non-significant effects of CR on visuospatial function, memory, language, and processing speed, while Hindle et al. (2014) did not find a significant relationship between CR and dementia diagnosis. Importantly, however, these reviews have only considered cognitive outcomes for PD, and the validity of their specific cognitive domain analyses is limited by the small number of effects included in these meta-analyses.
The recent increase in CR and α-synucleinopathy publications, together with the use of a broader literature search strategy, would allow more effects to be pooled to validate previous review findings for specific cognitive abilities. To date, no review has analysed the effect CR may have on other α-synucleinopathies or motor outcomes. Across the very few studies that have explicitly explored CR in DLB patients, the CR proxies education and occupation have been associated with neuroprotective effects (Carli et al., 2021; Perneczky et al., 2008). To our knowledge, no study has specifically investigated the relationship between CR and cognitive or motor outcomes in MSA patients. While α-synucleinopathies are heterogeneous in nature, they do share significant clinical and pathological overlap, especially between PD and DLB (McCann et al., 2014; Nussbaum, 2018), often causing diagnostic or prognostic complications (Koga et al., 2015). Investigating the effect of CR on outcomes for patients with PD, DLB, and MSA in a single systematic review could reveal whether CR's substantial impact in AD extends to similar neurodegenerative conditions.
Additionally, understanding how CR interacts with each α-synucleinopathy could potentially assist clinicians in distinguishing disease subtypes and improving future treatment decisions. It is also reasonable to hypothesise, given that under pathological circumstances α-syn leads to basal ganglia dopamine degeneration (Peng et al., 2018) and that dopamine is involved in modulating neural circuits connecting the basal ganglia and prefrontal cortex (Steiner and Tseng, 2010), that CR's neuroprotective effect may extend to motor outcomes. Deficits in motor functioning are typically alleviated by dopamine replacement medication (levodopa-carbidopa), which is frequently administered to α-synucleinopathy patients (Schapira et al., 2009). While originally developed to alleviate motor impairments, levodopa-carbidopa may also affect cognitive abilities that rely on dopamine-dependent neural circuits within the basal ganglia (Cools et al., 2003; Molloy et al., 2006). However, the potentially confounding effects of medication have not always been controlled for in previous studies. If medication has a significant influence on clinical outcomes, then it is critical to control for these effects as they may be distorting associations between CR and clinical outcomes. Thus, the current review will be the first to summarise CR's influence on both cognitive and motor outcomes for all three main α-synucleinopathies and investigate whether controlling for medication affects these relationships.
This review aims to extend the emerging evidence that CR can predict clinical outcomes in a wider range of neurodegenerative diseases. If there is indeed strong evidence that CR may exert neuroprotective effects, this could have critical implications for the general population, as interventions that aim to increase CR might have widespread benefits by potentially reducing the burden of neuropathology that often develops in older age. Implementing CR in a clinical setting would allow for more accurate prognostic assessment, complementing other outcome prediction tools. As individuals with a higher CR experience the clinical onset of neurodegeneration well after development of neuropathology (Stern, 2009), the prognostic utility of CR could also be extended to improve diagnosis alongside pathological and clinical evaluations. Earlier identification and better prognostic projections of neurodegeneration outcomes would allow interventions to be administered sooner and refine individualisation of treatment for these heterogeneous conditions. Quality of life for those with neurodegenerative diseases would improve, potentially leading to a lower medical and economic burden.

I. Saywell et al.

Review questions
• Whether CR predicts cognitive and motor impairment in α-synucleinopathy patients
• If the reported effects differ depending on the α-synucleinopathy subtype studied
• If controlling for medication impacts the size of reported effects
This review also seeks to identify whether additional study characteristics might influence the effect of CR, such as differences in outcome measurements and the type of CR measure.

Methods
A protocol for this review was published a priori and registered through PROSPERO (Saywell et al., 2023). The review has been conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA; see Supplementary Material 1).

Study conditions
Eligible studies had to assess CR in some capacity, either through recognised socio-behavioural proxies or an explicit CR measure, such as the Cognitive Reserve Index questionnaire (CRIq; Nucci et al., 2012). A whitepaper focused on defining and investigating CR guided our choice of eligible proxies, consisting of education, occupation, leisure time participation, and crystallised intelligence measures like vocabulary, reading ability, and tests of premorbid intelligence (Stern et al., 2020). Physical activity and social engagement have also been recognised as valid CR proxies; however, they were excluded given they have been found to explain less variance in individual outcome performance differences (Boyle et al., 2021). More recently, CR has been indexed in studies by other protective factors, like bilingualism; these were, however, not included, as they were not explicitly mentioned or investigated by Stern et al. (2020). Limiting our search to socio-behavioural CR proxies suggested by Stern et al. (2020) also prevented the review from becoming too broad, allowing screening to be performed in a reasonable time. Alongside administering a CR measure, studies were required to test either cognition or motor function or both. Cognitive assessments could include a known neuropsychological measure or diagnostic tool. Studies that compared different levels of CR across cognitive subtypes like "normal cognition," "mild cognitive impairment" (MCI), or "dementia" were also included. Suitable evaluations of motor function were either a verified clinical rating tool or a more precise biomechanical measure. Self-report evaluations and assessments comprising single items measuring an aspect of cognition or motor function within broader scales that did not specifically target these outcomes were excluded (e.g., the memory item in the nonmotor symptoms questionnaire; Chaudhuri et al., 2006). Studies also had to clearly mention having analysed the relationship between CR and an outcome of interest in either the title or abstract. Controlling for the effects of medication was desirable, but not required. Only studies that used an α-synucleinopathy cohort, specifically patients formally diagnosed with PD, DLB, or MSA, were considered.

Study types
Cross-sectional, cohort, and case-control studies were eligible, alongside baseline data from longitudinal studies. Theses, grey literature, past reviews or meta-analyses, non-English papers, and studies

Search strategy
Four keywords, "cognitive reserve," "α-synucleinopathy," "cognition," and "motor," guided search term development for five review-relevant databases. "Cognitive reserve," "α-synucleinopathy," and the outcomes of interest were combined with "AND"; however, "cognition" and "motor" as outcomes of interest were separated by "OR," given that review eligibility did not require records to include both outcomes. Initial searches of MEDLINE via PubMed, Scopus, PsycINFO via Ovid, CINAHL via EBSCO, and Web of Science were performed on the 23rd of June 2022 (see Supplementary Material 2, Table 1 for database-specific search strategies and limits). Searches were re-run for each database from the initial search date until data synthesis commencement (1st of March 2023). Titles of records included in reference lists of records that passed all review screening phases were examined to identify additional records that met eligibility criteria. Furthermore, the titles of records that cited review-eligible studies were screened for suitability. These backward and forward snowballing search techniques were conducted for all review-eligible studies.
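The Boolean structure of the search can be sketched as follows. This is only an illustration of how the "AND" and "OR" operators nest; the synonym lists are hypothetical placeholders, and the actual database-specific strings are those given in Supplementary Material 2.

```python
def or_group(terms):
    """Join synonyms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

def build_query(reserve_terms, disease_terms, cognition_terms, motor_terms):
    # The two outcomes are alternatives, so they share one OR group.
    outcomes = "(" + or_group(cognition_terms) + " OR " + or_group(motor_terms) + ")"
    # Reserve, disease, and outcome groups must all be present.
    return " AND ".join([or_group(reserve_terms), or_group(disease_terms), outcomes])

# Hypothetical term lists, for illustration only.
query = build_query(
    ["cognitive reserve"],
    ["Parkinson's disease", "dementia with Lewy bodies", "multiple system atrophy"],
    ["cognition"],
    ["motor"],
)
```

A record matches this query if it mentions cognitive reserve, at least one α-synucleinopathy, and at least one of the two outcomes.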

Data management
Studies identified from database searches were exported to EndNote v.20 (Clarivate Analytics, PA, USA) and imported into the Covidence systematic review software (Veritas Health Innovation, Melbourne, Australia). Covidence is online software that streamlines the review screening process by providing an accessible interface to complete all screening phases, data extraction, and quality assessment. Study duplicates found in multiple databases were removed automatically by Covidence.

Selection process
Titles and abstracts of searched records were screened by five independent reviewers (IS, LF, BC, APH, IB), with each record assessed by at least two reviewers and a third if consensus could not be reached. At this phase, proportionate agreement among each pair of reviewers ranged between 91% and 99%, while Cohen's Kappa ranged from fair to moderate (see Supplementary Material 2, Table 2; McHugh, 2012). Records that passed title and abstract screening had their full text analysed for review eligibility by two independent reviewers (IS, LF). Proportionate agreement was high (90%) during full text screening and the corresponding Kappa statistic was substantial (0.78). IS screened all records during each phase along with one of the other four reviewers. Neither reviewer was blinded, and if consensus could not be reached at any stage, a third reviewer was used.
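High proportionate agreement alongside only fair-to-moderate Kappa is expected when most records are excluded, because chance agreement is then already high. The sketch below computes both statistics for two raters; the function names and the 100-record example are hypothetical and are not the review's screening data.

```python
from collections import Counter

def proportionate_agreement(r1, r2):
    """Share of records on which both raters made the same decision."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1, r2):
    """Agreement corrected for chance (undefined if chance agreement is 1)."""
    n = len(r1)
    p_o = proportionate_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement from each rater's marginal proportions.
    p_e = sum((c1[k] / n) * (c2[k] / n) for k in set(c1) | set(c2))
    return (p_o - p_e) / (1 - p_e)

# Hypothetical: both exclude 94, both include 2, and they disagree on 4.
rater_a = ["exclude"] * 94 + ["include"] * 2 + ["include"] * 2 + ["exclude"] * 2
rater_b = ["exclude"] * 94 + ["include"] * 2 + ["exclude"] * 2 + ["include"] * 2

agreement = proportionate_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
```

In this toy example the raters agree on 96% of records, yet Kappa is below 0.5 because both raters exclude almost everything.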

Data items and collection process
A predefined, review-specific data extraction template consisting of 115 items² was developed to extract critical study information (see Supplementary Material 3). IS extracted data for all records that passed full text screening, while LF extracted data for 23 studies. The interrater agreement, represented by proportionate agreement,³ was 94.5%, indicating substantial reliability among reviewers. In total, 52 authors were contacted for records that required further information regarding effect sizes. Authors were given one month to respond to an initial email and an additional week after a reminder email was sent.

Study risk of bias assessment
A quality assessment was conducted for each record included in the review by IS, while LF completed a random subset of 23 studies to ensure consistent appraisal. The Quality In Prognostic Studies (QUIPS) tool is a commonly used quality assessment for evaluating risk of bias in prognostic studies (Grooten et al., 2019; Riley et al., 2019). While the QUIPS is normally used to assess six bias domains (see Supplementary Material 2, Table 3), its authors recommend adapting the tool to fit the reviewers' purpose (Hayden et al., 2013). In the current review, we used four risk of bias domains ("Key Study Variables and Recruitment," "Prognostic Factor Measurement," "Outcome Measurement," and "Statistical Analysis and Reporting"), adjusted some domain items to suit our aims, and altered scoring to attribute greater weight to items deemed more important. Our adapted version of the QUIPS and a custom-made scoring guide are presented in Supplementary Material 2, Table 4. As recommended by Hayden et al. (2013) and Grooten et al. (2019), instead of using a total score to determine overall study risk of bias, each study was classified as low, moderate, or high risk of bias for each domain; these ratings were then used to determine an overall risk of bias (see Supplementary Material 2, Table 4). Scoring overall risk of bias using a domain-based evaluation is a realistic appraisal that removes the difficulty of validating such a subjective assessment method (Grooten et al., 2019; Hayden et al., 2013; Higgins et al., 2022). Interrater agreement between reviewers was judged sufficient if the representative Kappa statistic was 0.6 or above, a threshold recommended by guidelines set by Cohen to indicate substantial agreement (McHugh, 2012). Overall interrater agreement for risk of bias assessed across all domains was substantial (Kappa = 0.6), reaching the predetermined threshold. Domain 1 ("Key Study Variables and Recruitment"), Domain 2 ("Prognostic Factor Measurement"), and Domain 4 ("Statistical Analysis and Reporting") maintained reasonable levels of agreement (Kappa = 0.49, 0.47, and 0.78, respectively). Interrater agreement was markedly lower for Domain 3 ("Outcome Measurement") than for the other domains, only reaching a fair level (Kappa = 0.22).

Effect measures and data coding
Effect sizes and descriptive data extracted from each study were coded for inclusion in meta-analyses. Preliminary searches indicated that correlation coefficients were the most reported effect size. Therefore, if studies reported correlation coefficients, these were coded directly, while other statistical parameters were standardised to correlations for pooling (Harrer et al., 2021; Murad et al., 2019). For regression outputs, conversion into a correlation coefficient is not appropriate with multiple predictors. Instead, sample size, number of predictors, variance explained, and test-statistic for the predictor of interest can be used to estimate a semi-partial correlation coefficient that attempts to isolate the effect size of the predictor of interest (Aloe and Becker, 2012). If odds ratios were reported, they were converted to Cohen's $d$ ($d = \ln(\mathrm{OR}) \times \sqrt{3}/\pi$) before being calculated as a correlation coefficient ($r = d/\sqrt{d^2 + a}$, where $a = (n_1 + n_2)^2/(n_1 n_2)$ corrects for unequal group sizes) (Borenstein et al., 2009). For group analyses, if studies reported mean, standard deviation, and sample size for groups, then a test statistic calculator was used to generate a t-value and corresponding degrees of freedom (GraphPad, 2023). The test statistic can then be converted into a correlation coefficient ($r = \sqrt{t^2/(t^2 + df)}$) (Rosenthal, 1994). If only a p-value and sample size were available for group comparisons, then a critical correlation calculator was used to estimate a correlation coefficient (MathCracker, 2017). When studies compared more than two groups and reported a partial eta-squared value, a conversion formula transformed the reported statistics into Cohen's d, which could then be converted into a correlation coefficient (Lakens, 2013). In included studies, group analyses were frequently used to investigate differences in CR across α-synucleinopathy cognitive diagnostic categories (i.e., normal cognition, MCI, dementia). Effect sizes from these group comparisons are referred to as 'cognitive subtype grouping effects'. For studies that reported effect sizes for continuous cognitive measures, we classified each cognitive test into a specific cognitive domain (see Supplementary Material 2, Table 5). General consensus across review-included records, recent literature (Harvey, 2019), and opinion from experts in cognition and psychometrics guided classification of cognitive tests.
All coded study effect sizes were included regardless of dependency, as numerous studies reported multiple effect sizes of interest within the same sample. For example, Armstrong et al. (2012) analysed the influence of education and intelligence on attention/executive function, working/short-term memory, long-term memory/learning, visuospatial ability, and global cognition, producing effect sizes representing associations between each proxy and cognitive domain. Often, non-independent effects are dealt with by either averaging the data (Glen, 2014; López-López et al., 2018), or by selecting and analysing only one effect per study, according to a priori criteria (Cheung, 2019). Both these methods limit exploration of heterogeneity sources within studies, thereby reducing the power of meta-analyses (López-López et al., 2018). Effect size variance is important, but rarely reported, when analysing dependent effect sizes. It was therefore estimated using a formula described by Pustejovsky (2018). Directionality of effects was made consistent between studies, such that positive correlations always indicated that a higher CR is associated with better cognitive or motor function, while negative correlations suggest the opposite. Recommendations by Assink and Wibbelink (2016) and Gucciardi et al. (2022) guided item coding.

Table 3
Overall multilevel models and Grading of Recommendations, Assessment, Development and Evaluation (GRADE) summary of findings.
Patient or population: Individuals diagnosed with an α-synucleinopathy.

³ Originally in the review protocol it was stated that interrater agreement for data extraction would be assessed using Cohen's kappa. Given the array of possible responses when extracting data, it was deemed inappropriate to use a categorical interrater reliability measure. Proportionate agreement was identified as the most sensible method to measure interrater reliability when extracting a diverse combination of categorical and continuous data with two reviewers. A minimum threshold of 90% agreement was settled upon for data extraction (Graham et al., 2014).
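The effect size conversions described in this section can be sketched in a few lines of Python. This is an illustrative sketch rather than the review's own code, and all function names are hypothetical. The odds-ratio and t-statistic conversions use the standard forms associated with Borenstein et al. (2009) and Rosenthal (1994). Two steps are explicit assumptions, because the exact expressions are not recoverable from the text: the partial eta-squared conversion (via Cohen's f, with d = 2f) and the large-sample variance of a correlation, (1 − r²)²/(n − 1).

```python
import math

def d_from_odds_ratio(odds_ratio):
    """Log odds ratio rescaled to a standardised mean difference."""
    return math.log(odds_ratio) * math.sqrt(3) / math.pi

def r_from_d(d, n1=None, n2=None):
    """Convert d to r; a = 4 when group sizes are equal or unknown."""
    a = 4.0 if n1 is None or n2 is None else (n1 + n2) ** 2 / (n1 * n2)
    return d / math.sqrt(d * d + a)

def r_from_t(t, df):
    """Convert a t statistic to r, keeping the sign of t."""
    r = math.sqrt(t * t / (t * t + df))
    return math.copysign(r, t)

def d_from_partial_eta_squared(eta_sq):
    """ASSUMPTION: f = sqrt(eta^2 / (1 - eta^2)) and d = 2 * f."""
    return 2.0 * math.sqrt(eta_sq / (1.0 - eta_sq))

def variance_of_r(r, n):
    """ASSUMPTION: large-sample sampling variance of a correlation."""
    return (1.0 - r * r) ** 2 / (n - 1)
```

For example, `r_from_t(2.0, 96)` gives a correlation of 0.2, and `variance_of_r(0.5, 101)` gives 0.005625. After conversion, the sign of each correlation would still need to be aligned so that positive values mean higher CR goes with better function.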

Statistical analyses

Multilevel (three-level) models
Multilevel models consider studies that include several non-independent, within-study effects that cannot be accounted for using traditional random-effects meta-analyses (Borenstein et al., 2010; Cheung, 2019). By default, a conventional random-effects meta-analysis relies on a multilevel structure, where effect sizes differ according to sampling error of individual studies and between-study heterogeneity (Cheung, 2019). Thus, random-effects are assigned at the study level, nesting participants within each study and thereby creating a multilevel structure. However, to account for several non-independent effect sizes, an additional random-effect needs to be added to the model. Adding an extra intermediate layer converts this two-level model into a three-level model that accounts for variation among effect sizes within studies (López-López et al., 2018; Van den Noortgate et al., 2013). Here, instead of only allocating random-effects at the study level, in the extended three-level model, random-effects are assigned to both between-study (study level) and within-study effects (unique effect size level; Cheung, 2019). This captures variance at the within-study level to account for dependency in study-nested effects (Van den Noortgate et al., 2013). Three-level models are often referred to as 'multilevel meta-analysis' in the literature and are termed as such throughout this review.
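In conventional notation, the three-level structure described above can be written as follows. The symbols are our own shorthand rather than notation taken from the review: $\hat{\theta}_{ij}$ is the $i$-th effect size reported by study $j$, $\mu$ is the pooled effect, and $s^2_{ij}$ is the (assumed known) sampling variance of that effect.

```latex
\hat{\theta}_{ij} = \mu + u_{j} + v_{ij} + e_{ij}, \qquad
u_{j} \sim N\!\left(0, \tau^2_{\text{between}}\right), \quad
v_{ij} \sim N\!\left(0, \tau^2_{\text{within}}\right), \quad
e_{ij} \sim N\!\left(0, s^2_{ij}\right)
```

Setting $\tau^2_{\text{within}} = 0$ removes the intermediate level and recovers the conventional random-effects model with one effect per study.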
Effect sizes were weighted by the inverse of their squared standard error (i.e., inverse-variance weighting), which is roughly proportional to sample size (Moeyaert, 2019). Separate multilevel models were conducted for overall cognition and motor function, where variance-covariance matrices were imputed to further account for dependency among within-study effects at an estimated constant correlation of 0.4 (Pustejovsky, 2023). Pooled correlation strength was interpreted using Cohen's (1988) recommendations (McHugh, 2012), and forest and orchard plots were used to visualise multilevel models, grouped by outcome domain. Distribution of variance across the three-level models (sampling, within-study, between-study) was manually calculated and overall heterogeneity was inferred through I² (calculated by summing within- and between-study variance) and the Q statistic (Cheung, 2014). When pooling effects for commonly reported individual cognitive tests (i.e., Frontal Assessment Battery, FAB; Scales for Outcomes in Parkinson's Disease-Cognition, SCOPA-COG), traditional random-effects meta-analyses were used rather than multilevel models when there was only one effect per study (see Supplementary Material 2, Table 6).
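The idea of imputing a covariance matrix under a constant within-study correlation can be sketched as below. This is a minimal illustration, not the routine the review used: the helper names are hypothetical, the sampling variances are made up, and the only value taken from the text is the assumed correlation of 0.4.

```python
import numpy as np

def impute_covariance(variances, rho=0.4):
    """Variance-covariance block for the effects of one study."""
    sd = np.sqrt(np.asarray(variances, dtype=float))
    block = rho * np.outer(sd, sd)      # off-diagonal: rho * sd_i * sd_j
    np.fill_diagonal(block, sd ** 2)    # diagonal: the sampling variances
    return block

def block_diagonal(blocks):
    """Stack per-study blocks; effects from different studies get covariance 0."""
    size = sum(b.shape[0] for b in blocks)
    full = np.zeros((size, size))
    start = 0
    for b in blocks:
        end = start + b.shape[0]
        full[start:end, start:end] = b
        start = end
    return full

# Hypothetical sampling variances: study A reports two effects, study B one.
V = block_diagonal([impute_covariance([0.01, 0.04]), impute_covariance([0.02])])
```

Here the two effects from study A share a covariance of 0.4 × 0.1 × 0.2 = 0.008, while the effect from study B is independent of both.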

Moderator analyses
Hunter & Schmidt (2004) argued that if less than 75% of total multilevel model variance can be attributed to sampling variance, then there is likely substantial heterogeneity that justifies moderator analyses. Moderators included for cognition were α-synucleinopathy subtype, cognitive domain, CR proxy type, outcome type (i.e., global, specific, cognitive subtype grouping), and medication control. Multilevel motor function models were fitted nearly identically, consisting of α-synucleinopathy subtype (excluding DLB since no included studies with a DLB cohort assessed motor function), motor function domain, CR proxy type, outcome type (i.e., global, specific), and medication control. Additionally, global motor function tests UPDRS-III and Hoehn & Yahr stage were analysed as separate moderators when investigating domain effect differences, as these measures vary in their comprehensiveness.
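The variance distribution and the 75% rule can be sketched together. The sketch assumes the commonly used "typical" sampling variance of Higgins and Thompson as the level-1 component, which is our reading of the Cheung (2014) approach; the function names and all numeric inputs are hypothetical.

```python
def typical_sampling_variance(variances):
    """'Typical' sampling variance across effects (Higgins-Thompson form)."""
    w = [1.0 / v for v in variances]
    k = len(w)
    return (k - 1) * sum(w) / (sum(w) ** 2 - sum(x * x for x in w))

def variance_shares(variances, tau2_within, tau2_between):
    """Proportion of total variance at each of the three levels."""
    v_typ = typical_sampling_variance(variances)
    total = v_typ + tau2_within + tau2_between
    return {
        "sampling": v_typ / total,
        "within": tau2_within / total,
        "between": tau2_between / total,
    }

def moderators_warranted(shares, threshold=0.75):
    """75% rule: explore moderators if sampling variance explains less."""
    return shares["sampling"] < threshold

# Hypothetical inputs: four effects with equal sampling variance.
shares = variance_shares([0.01, 0.01, 0.01, 0.01],
                         tau2_within=0.01, tau2_between=0.02)
i_squared = shares["within"] + shares["between"]
```

With these made-up inputs, sampling variance accounts for 25% of the total, I² is 75%, and moderator analyses would be considered justified under the rule.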
For each test of moderators, the grouping type with the most effect sizes was used as the reference category, producing beta coefficients for other grouping types. This allowed us to test whether effects for other grouping types significantly differed from the reference category. For example, PD was the reference category for α-synucleinopathy subtype, meaning beta coefficients were calculated for DLB and MSA to identify how much effects for these two α-synucleinopathies varied from PD.
Mean effect size, confidence interval, and p-value for the groupings other than the reference category were obtained by removing the model intercept. Random-effects for each multilevel moderator model were defined using an 'inner | outer' formula, alongside specifying random effects at the unique effect size level. This formula assumes that different studies (the outer factor) are independent, while effects within the same study are presumed to share correlated random effects with a level of variance corresponding to values of the moderator type (the inner factor; Viechtbauer, 2010). The 'inner | outer' formula requires a variance structure corresponding to the inner factor to be specified. The multilevel moderator models used a heteroscedastic compound symmetric structure that fixes the amount of heterogeneity for the levels of the inner factor to the corresponding inner | outer formula (see code on Open Science Framework; OSF: https://doi.org/10.17605/OSF.IO/879RC; Viechtbauer, 2010). This variance structure is most suitable as it strikes a balance between theoretical appeal and feasibility with the extracted dataset.⁴ Orchard plots were used to plot moderator analyses as a useful tool for graphing many effects across groups (Nakagawa et al., 2021).
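The difference between a reference-category model and a model with the intercept removed can be shown with a deliberately simplified sketch. It uses unweighted ordinary least squares, so it ignores the inverse-variance weights and random effects of the actual multilevel models, and the effect sizes are hypothetical.

```python
import numpy as np

# Hypothetical effect sizes and their α-synucleinopathy subtype.
effects = np.array([0.30, 0.20, 0.25, 0.10, 0.14, 0.40])
subtype = np.array(["PD", "PD", "PD", "DLB", "DLB", "MSA"])
levels = ["PD", "DLB", "MSA"]  # PD first: it is the reference category

# One indicator column per subtype.
dummies = np.column_stack([(subtype == lv).astype(float) for lv in levels])

# With intercept: intercept = reference mean, betas = differences from it.
X_ref = np.column_stack([np.ones(len(effects)), dummies[:, 1:]])
beta_ref, *_ = np.linalg.lstsq(X_ref, effects, rcond=None)

# Without intercept: each coefficient is that subtype's own mean effect.
beta_means, *_ = np.linalg.lstsq(dummies, effects, rcond=None)
```

In the first fit the DLB and MSA coefficients are differences from PD (here −0.13 and 0.15); in the second, each coefficient is a subtype mean (0.25, 0.12, 0.40), which is what removing the intercept provides.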

Outliers and publication bias
Using more than one sensitivity analysis is recommended when assessing meta-biases like outliers and publication bias in multilevel models to maximise the validity of meta-analyses (Noble et al., 2017). Outliers were determined by calculating residuals as z-scores at the individual effect level (z-scores less than −1.5 and greater than 1.5 were considered outliers) and using Cook's distance (values greater than 1.5 multiplied by the average Cook's distance value across individual effects; Gucciardi et al., 2022). Separate multilevel models with outliers removed using both methods were conducted for both overall cognition and motor function. Degree of publication bias was evaluated by using a multilevel extension of Egger's test and visually inspecting the relationship between each effect size and its corresponding standard error in a sunset (power-enhanced) funnel plot. The original Egger's regression test is one of the most popular publication bias tests in conventional meta-analyses, testing whether publication bias likely occurred by identifying a statistically significant intercept when regressing standardised effect sizes on the inverse of their standard errors (Egger et al., 1997; Sterne et al., 2011). Currently, in multilevel meta-analyses with three or more levels, there is no universally used publication bias test, and the traditional Egger's test is not a valid approach when incorporating several within-study effects. To include this preferred publication bias test in review analyses on non-independent effects, the Egger's test model was extended to account for between- and within-study heterogeneity by adding a random residual component for each level of variance (Fernández-Castilla et al., 2021; Nakagawa et al., 2022). Standard error as a significant moderator in a meta-regression of multilevel models suggests evidence for publication bias (Fernández-Castilla et al., 2021). The sunset funnel plot is a variation of the original funnel plot that also visualises the statistical power and estimates probability of effect replication via an R-index (Kossmeier et al., 2020a, 2020b).
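The two outlier rules described above reduce to simple thresholds once standardised residuals and Cook's distances are available from a fitted model. The sketch below applies only the thresholds; the function name and input values are hypothetical, and obtaining the diagnostics from a multilevel model is outside its scope.

```python
def flag_outliers(std_residuals, cooks_distances, z_cut=1.5, cook_mult=1.5):
    """Return two lists of flags: one per rule, one entry per effect size."""
    mean_cook = sum(cooks_distances) / len(cooks_distances)
    # Rule 1: standardised residual beyond +/- 1.5.
    by_z = [abs(z) > z_cut for z in std_residuals]
    # Rule 2: Cook's distance above 1.5 times the average Cook's distance.
    by_cook = [d > cook_mult * mean_cook for d in cooks_distances]
    return by_z, by_cook

# Hypothetical diagnostics for five effect sizes.
z = [0.2, -1.7, 1.1, 2.3, -0.4]
cook = [0.01, 0.05, 0.02, 0.30, 0.02]
by_z, by_cook = flag_outliers(z, cook)
```

The two rules need not agree: here the second effect is flagged only by its residual, while the fourth is flagged by both rules, which is one reason for re-running the models under each rule separately.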

Confidence in body of evidence
Evidence presented across all studies was evaluated using an adapted version of the Grading of Recommendations Assessment, Development and Evaluation (GRADE) framework (Guyatt et al., 2011; Huguet et al., 2013). Overall, GRADE quality of evidence scores can be classified as high, moderate, low, and very low. Initially, quality of evidence for outcomes was graded as moderate at baseline, as CR is a relatively new concept with little theoretical grounding, especially in α-synucleinopathy cohorts. According to Huguet et al. (2013), more exploratory, earlier phase of investigation studies (phase 1) should be judged lower prior to adjusting ratings. The quality of evidence was then downgraded by one level for a breach of any of the following: risk of bias (overall high/moderate rating according to adapted QUIPS), inconsistency (substantial I² value), indirectness (limited α-synucleinopathy subtype representation, different CR proxy measures, each outcome domain/type), imprecision (effect size confidence interval > 0.5), and publication bias (significant multilevel Egger's test). A moderate/large effect size (≥ 0.30) or the presence of an exposure-gradient response (outcome performance gradient present for different levels of CR) could upgrade the quality of evidence by one level.
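The grading rules above can be expressed as a small decision function. This is a sketch under stated assumptions, not the review's scoring procedure: the function and argument names are hypothetical, the imprecision criterion is read as a confidence interval width above 0.5, the upgrade is applied once if either upgrading criterion holds, and the final rating is clamped to the four GRADE levels.

```python
LEVELS = ["very low", "low", "moderate", "high"]

def grade_rating(risk_of_bias, inconsistency, indirectness, ci_width,
                 egger_significant, effect_size, exposure_gradient=False):
    """Start at 'moderate', move one level per breach, then clamp."""
    score = LEVELS.index("moderate")          # baseline rating
    breaches = [risk_of_bias, inconsistency, indirectness,
                ci_width > 0.5,               # ASSUMPTION: CI width
                egger_significant]
    score -= sum(1 for b in breaches if b)    # one level down per breach
    # ASSUMPTION: a single one-level upgrade if either criterion is met.
    if abs(effect_size) >= 0.30 or exposure_gradient:
        score += 1
    score = max(0, min(score, len(LEVELS) - 1))
    return LEVELS[score]
```

For instance, an outcome with substantial inconsistency but a pooled effect of 0.35 would lose one level and regain one, ending at "moderate" under these assumptions.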

Results
The two searches yielded 12735 records, of which 5993 were duplicates that were removed before screening. An additional 15 records were added through forward and backward snowballing methods (Wohlin, 2014). Following title and abstract screening, 6565 records were excluded, leaving 177 records to be assessed in full text. In total, 62 records were excluded due to not examining the association between a CR factor and relevant outcomes (n=45), not using an α-synucleinopathy cohort (n=7), no pertinent cognitive or motor function measure (n=5), no CR assessment (n=4), or only using self-report measures (n=1). Thus, 115 full text records were eligible; however, an additional 30 were excluded because the information reported was not sufficient and the authors either did not respond to our data requests (n=20), were legally unable to share data (n=7), or were unwilling to share data (n=3). As a result, 85 studies (including the PPMI dataset as "one study") were included in the final review. A detailed visualisation of the screening process is depicted in a PRISMA flow diagram (see Fig. 1).

Study characteristics
A total of 12,854 α-synucleinopathy participants were assessed in the included studies (for detailed characteristics of these studies see Table 2). China (15 studies), the United States of America (13 studies), Italy (9 studies), and Brazil (9 studies) were the most prevalent countries of publication. Publication dates for included studies ranged between 1989 and 2023, with more than 50% of studies being published between 2016 and 2023. Study cohorts mostly consisted of individuals with PD (92 % of studies), while there were also three DLB and four MSA studies included. Sample sizes ranged from 14 to 901 participants and there were no instances of multiple α-synucleinopathy cohorts investigated in the same study. Furthermore, CR was generally estimated via education (89% of studies measured the association between CR and outcomes using education); other proxies like intelligence (10 studies), occupation (4 studies), vocabulary (3 studies), or comprehensive CR tests (5 studies) were less common. No studies included in this review used the residual approach to operationalise CR.
Most studies evaluated at least one cognitive outcome (80 studies), while fewer assessed motor function (19 studies). Notably, 14 studies reported associations between CR and both cognitive and motor function outcomes. The high number of studies that investigated cognition meant that a diverse range of tests, across multiple domains, was used (see Supplementary Material 2, Table 5 for classification of cognitive tests into domains), with global cognition most frequently assessed (53 % of cognition studies). Of note, the Mini-Mental State Examination (MMSE) was the most used cognitive test (31 % of cognition studies), followed by the Montreal Cognitive Assessment (MoCA) (15 % of cognition studies).
Given that few studies assessed motor function, fewer motor domains could be considered in this review. Global motor function (15 studies), gait (4 studies), bradykinesia (2 studies), and postural stability (2 studies) were the only motor outcomes evaluated; we were not able to assess the effect of CR on motor processing speed or tremor as none of the included studies reported such effects. Notably, the UPDRS-III and Hoehn & Yahr stage were the most frequently reported measures (used in 63 % and 37 % of motor function studies, respectively) and contributed to 74 % of motor function effect sizes.
⁴ For additional detail on random-effects formula and variance structure for multilevel meta-analyses, see the metafor manual, pages 255-267 (https://cran.r-project.org/web/packages/metafor/metafor.pdf).
Mean participant age (reported in 69 studies) was 65.43 years (SD = 4.72), 4577 participants were female and 7192 were male (reported in 77 studies), mean disease duration (reported in 58 studies) was 6.67 years (SD = 3.52), and mean age of onset (reported in 24 studies) was 58.23 years (SD = 5.76). Note that, for some studies, certain demographic variables were not available (see Table 2), so these data do not represent the entire analysed dataset. Four studies (Ashrafi et al., 2012; Guzzetti et al., 2019; Khalil et al., 2016; Riklan et al., 1989) did not report non-significant effect sizes and did not respond when contacted regarding these effects, so these effects could not be included in the subsequent analyses; however, the significant effect sizes they reported could be included. Studies that controlled for medication were limited, with only 7 out of 85 studies either implementing a statistical method that adjusted for levodopa equivalent daily dose or testing participants who were unmedicated or in a non-medicated state.

Overall cognition
A multilevel meta-analysis of 176 cognitive outcome effect sizes across 80 studies demonstrated that higher CR levels were moderately associated with better cognition (see Table 3 and Supplementary Material 2, Fig. 1). Significant heterogeneity was present in the model, as assessed by I² and the Q statistic. Using formulae generated by Cheung (2014), variance was calculated for each level of the model. Little variance could be attributed to sampling (17.6 %) or within-study effects (8.7 %), compared to between-study effects (73.7 %).
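The distribution of variance across levels can be illustrated with a minimal sketch. It assumes the commonly presented form of the three-level decomposition, in which a "typical" sampling variance is computed from the individual sampling variances and compared with the estimated within-study and between-study variance components. The function name `variance_shares` is hypothetical, and the two variance components are assumed to come from an already fitted three-level model.

```python
def variance_shares(sampling_variances, sigma2_within, sigma2_between):
    """Percentage of total variance at each level of a three-level model.

    Level 1: typical sampling variance; level 2: within-study variance;
    level 3: between-study variance (the latter two supplied by the caller).
    """
    k = len(sampling_variances)
    w = [1.0 / v for v in sampling_variances]  # inverse-variance weights
    sum_w = sum(w)
    sum_w2 = sum(wi ** 2 for wi in w)
    typical_v = (k - 1) * sum_w / (sum_w ** 2 - sum_w2)
    total = typical_v + sigma2_within + sigma2_between
    return {
        "sampling": 100.0 * typical_v / total,
        "within": 100.0 * sigma2_within / total,
        "between": 100.0 * sigma2_between / total,
    }
```

The three percentages always sum to 100, which is why a large between-study share implies that sampling error and within-study differences explain comparatively little of the observed variability.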

Cognition moderator analyses
Significant heterogeneity in the cognition multilevel model justified the investigation of potential sources of variance. The results from these moderator analyses are summarised in Table 4, which displays the mean correlation for each grouping of a moderator and its corresponding regression coefficient (deviation of moderator groups from the reference category effect). Visualisations of the moderator analyses are displayed in Fig. 2 and are described in more detail below.

α-Synucleinopathy subtype.
Despite the small number of DLB and MSA studies, effects were significant for each α-synucleinopathy (p < 0.001). CR's influence on MSA patient cognition was greater than in other disease subtypes, while PD and DLB effect sizes were similar (see Table 4, α-synucleinopathy section). However, there were few MSA and DLB studies included in these analyses, so these results should be interpreted with caution.

Domain-specific effect differences.
All cognitive domain effects significantly deviated from zero (all p ≤ 0.001). With global cognition as the reference category, the test of moderators approached significance (see Table 4, Domain section). Only the comparison between normal cognition and MCI groups yielded significantly different (smaller) effect sizes relative to global cognition (reference category). A similar trend was observed for the normal cognition v dementia comparison. However, when the moderators normal cognition v MCI and normal cognition v dementia were integrated into a model with only these cognitive subtype grouping effect sizes, there was no significant difference in effect size magnitude (F[1,44] = 1.19; p = 0.28). The small number of studies that assessed visual processing speed and reasoning ability means the results for these two domains should be interpreted with caution.

Cognitive reserve proxy type.
All CR proxies were significantly associated with cognition (all p < 0.001; see Table 4, CR proxy type section), and effect magnitude changed significantly depending on type of proxy. Education and comprehensive tests produced very similar effect estimates compared to other CR proxies. Vocabulary and occupation effects were both significantly lower in magnitude than education, although there was a small number of unique effect sizes for these CR proxies. As expected, intelligence was more highly correlated with cognition than any other CR proxy, considering there may be overlap between the two measures (Mohn et al., 2014), but not significantly more than education. Note that one study (Ciccarelli et al., 2022) provided a leisure activities effect size (r = 0.49) that was included in the overall cognition multilevel model, but it was not included in these moderator analyses.

Outcome test type.
CR was significantly associated with all types of cognitive outcomes (specific cognitive domain, global cognitive assessment, or cognitive subtype grouping) (all p ≤ 0.01), but these outcome types yielded significantly different effect sizes (see Table 4, Outcome type section). Effect sizes produced from global and specific cognitive measures did not significantly differ, but studies that compared CR levels across cognitive subtypes reported smaller (though still significant) effects.

Medication control.
Whether or not studies controlled for disease medication effects did not influence the reported relationship between CR and cognition, and both pooled effects (controlling or not controlling for medication) were significantly different from zero (p ≤ 0.01). Note, however, that the few studies that controlled for disease medication reported lower effects, and this analysis might have been underpowered.

Influence of cognitive reserve on individual cognitive tests
Random-effects meta-analyses were used to analyse the association between CR and frequently used cognitive tests⁵ (see Supplementary Material 2, Table 6). All effects were significant; however, the magnitude of each effect varied considerably among tests. Patients with a higher CR scored significantly higher on both the MMSE (r = 0.331, 25 studies) and MoCA (r = 0.333, 12 studies). Effects for other global cognitive tests like the SCOPA-COG (r = 0.447, 4 studies) and FAB (r = 0.390, 7 studies) were slightly larger. The strongest correlation was with Raven's matrices (r = 0.471, 3 studies) and the weakest was with verbal learning tests (r = 0.184, 5 studies).
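As an illustration of how correlations from single-effect studies can be pooled, the sketch below applies a conventional random-effects model to Fisher z-transformed correlations with a DerSimonian-Laird estimate of between-study variance. Both the transformation and the estimator are assumptions made for this example and may differ from the estimation method used in this review; the function name `pool_correlations` is hypothetical.

```python
import math


def pool_correlations(rs, ns):
    """Random-effects pooled correlation (illustrative sketch).

    Uses Fisher's z transform with sampling variance 1 / (n - 3) and the
    DerSimonian-Laird estimator of tau^2. Returns (pooled_r, tau2).
    """
    z = [math.atanh(r) for r in rs]
    v = [1.0 / (n - 3) for n in ns]
    w = [1.0 / vi for vi in v]
    sum_w = sum(w)
    z_fixed = sum(wi * zi for wi, zi in zip(w, z)) / sum_w
    # Cochran's Q and the DerSimonian-Laird between-study variance
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, z))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(rs) - 1)) / c)
    w_re = [1.0 / (vi + tau2) for vi in v]
    z_random = sum(wi * zi for wi, zi in zip(w_re, z)) / sum(w_re)
    return math.tanh(z_random), tau2  # back-transform to the r scale
```

When all studies report the same correlation, the estimated between-study variance is zero and the pooled value equals that correlation; as study estimates diverge, tau² grows and the weights become more equal across studies.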

Overall motor function
A total of 31 motor function effect sizes from 19 studies were included in the multilevel meta-analysis. Higher CR was associated with slightly better motor function performance in α-synucleinopathy patients, although mostly for those with PD (see Table 3 and Supplementary Material 1, Fig. 2). There was significant heterogeneity in the model, with most variance explained by between-study effects (74.2 %) and smaller amounts explained by within-study effects (12.6 %) or sampling (13.2 %).

Motor function moderator analyses
Influence of possible moderators was explored separately to potentially identify reasons for substantial heterogeneity. Effects of each moderator and regression coefficients (if not the designated model reference category) can be viewed in Table 5. These results are visualised in Fig. 3 and described in more detail below.

α-Synucleinopathy subtype.
All studies that investigated motor function utilised a PD sample, except one study (Barcelos et al., 2018) that investigated the relationship between education and Hoehn & Yahr stage in an MSA sample. There were no motor function effect sizes for DLB patients included in this review, and removing the MSA study did not significantly alter the pooled motor function effect size. Thus, the effect of CR on motor function in this review largely reflects PD rather than α-synucleinopathies in general.

Domain-specific effect differences.
All domains except Hoehn & Yahr (p = 0.162) were significantly associated with CR. Effect size magnitudes for gait and Hoehn & Yahr were similar to the reference category UPDRS-III, and only balance significantly differed from UPDRS-III. Both balance and bradykinesia effects were derived from two studies each, and therefore these larger effects should be interpreted with caution.

Cognitive reserve proxy type.
Education and intelligence effects significantly deviated from zero (p ≤ 0.05); however, comprehensive tests did not (p = 0.33), and vocabulary approached significance (p = 0.06). Intelligence and vocabulary effects were more than double the magnitude of the education effect, but the small number of effect sizes limits the reliability of these findings. No studies used occupation as their CR proxy measure in relation to motor function.

Outcome test type.
When UPDRS-III and Hoehn & Yahr were combined as global motor measures, their mean effect was not significant (p = 0.10). Effects for specific motor assessments (i.e., balance, bradykinesia, gait) significantly differed from zero (p < 0.001); however, despite this larger, significant effect, there was no substantial difference in effect magnitude compared to global tests.

Medication control.
As with cognition, controlling for disease medication did not significantly impact the association between CR and motor function, although the test of moderators approached significance (see Table 5, Medication control section). The larger mean effect for studies that did not control for medication was expected, given the known influence medication has on motor function in α-synucleinopathies, and especially in PD patients (Schapira et al., 2009). Only effects that did not control for medication significantly differed from zero, whereas those that controlled for medication were non-significant (p = 0.27).
⁵ Multilevel (three-level) structure only applied to MMSE, MoCA, and Verbal Learning Tests. All other commonly occurring cognitive tests did not present with multiple effect sizes within studies.

Outliers and publication bias
Multilevel Egger's regression tests suggested that publication bias likely did not confound overall cognition (p = 0.373) or motor function (p = 0.700) effects. These results were supported by symmetrical sunset funnel plots for both cognition and motor function models (see Fig. 4). Median effect size power was higher for cognition (82 %) than for motor function (32 %), as was the estimated likelihood of replicating pooled effects for each model (99 % for cognition compared to 21 % for motor function).
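The power values underlying a power-enhanced funnel plot can be approximated with a short sketch. It assumes a two-sided Wald test at α = 0.05 and treats the pooled estimate as the true effect, so that each effect size's power depends only on its standard error. The function names `study_power` and `median_power` are hypothetical, and the R-index calculation is not reproduced here.

```python
from statistics import NormalDist, median


def study_power(true_effect, std_error, alpha=0.05):
    """Power of a two-sided Wald test to detect `true_effect` given `std_error`."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    shift = true_effect / std_error
    # Probability that the test statistic falls in either rejection region
    return (1 - nd.cdf(z_crit - shift)) + nd.cdf(-z_crit - shift)


def median_power(true_effect, std_errors, alpha=0.05):
    """Median power across all effect sizes in a model."""
    return median(study_power(true_effect, se, alpha) for se in std_errors)
```

Because power rises as the standard error falls, a model dominated by small samples will tend to show low median power even when the pooled effect itself is significant.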
As with publication bias, removal of outliers did not significantly impact pooled effects for either overall cognition or motor function, nor did it substantially alter model heterogeneity. From 176 cognition effect sizes, 23 and 24 were identified as outliers using the residual approach and Cook's distance, respectively (see Supplementary Material 2, Fig. 3). For motor function, six and five of the 31 effect sizes were classified as outliers via the residual method and Cook's distance, respectively (see Supplementary Material 2, Fig. 4).
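The two outlier rules applied in these sensitivity analyses can be sketched as follows. The example assumes that standardised residuals and Cook's distances have already been extracted from a fitted multilevel model; the function name `flag_outliers` and its arguments are hypothetical.

```python
def flag_outliers(z_residuals, cooks_distances, z_cut=1.5, cook_mult=1.5):
    """Return indices flagged by each outlier rule.

    Rule 1: absolute standardised residual greater than `z_cut`.
    Rule 2: Cook's distance greater than `cook_mult` times the mean
            Cook's distance across all effect sizes.
    """
    by_residual = [i for i, z in enumerate(z_residuals) if abs(z) > z_cut]
    mean_cook = sum(cooks_distances) / len(cooks_distances)
    by_cook = [
        i for i, d in enumerate(cooks_distances) if d > cook_mult * mean_cook
    ]
    return by_residual, by_cook
```

The two rules need not flag the same effect sizes, since the residual rule reflects distance from the model prediction whereas Cook's distance reflects influence on the fitted model, which is why each set was removed in a separate sensitivity model.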

Risk of bias
Estimated risk of bias measured via our adapted QUIPS was mixed, as the overall ratings were relatively evenly distributed (see Fig. 5). Almost half of the studies were individually rated as having a low risk of bias (38 studies), an overall rating of moderate was the next most common (25 studies), and high risk of bias studies were the least prevalent (22 studies). Greater overall risk of bias ratings for individual studies were generally associated with poor prognostic factor (CR) measurement or weak statistical analysis and reporting (see Fig. 5). Tests of moderators for both cognition and motor function indicated that there was no significant difference in effect sizes across risk of bias levels (see Supplementary Material 2, Table 7).

Assessing the certainty in findings
Table 3 displays a summary of the GRADE framework used to evaluate pooled effects. Quality of evidence for overall cognition was deemed moderate; however, the smaller pooled effect size for overall motor function ultimately resulted in its quality being downgraded to low. Both outcomes were downgraded for considerable effect heterogeneity and/or indirectness (effects biased towards PD cohorts and the CR proxy education). Effects for both overall cognition and motor function were deemed unaffected by imprecision and publication bias, and risk of bias was low. Each relative effect was upgraded for exposure-gradient response (i.e., analysing continuous measures) across many outcome domains, while only cognition was upgraded for moderate/large effect size.

Discussion
This study conducted the most comprehensive systematic review of literature assessing the relationship between CR and clinical outcomes in α-synucleinopathies to date. Critically, the study integrated a multilevel meta-analysis approach that allowed incorporation of multiple within-study effects for exploration of potential moderators. Often meta-analytic studies utilise more conventional approaches that only permit including one effect size per study (Assink and Wibbelink, 2016; Cheung, 2019), limiting the statistical power of analyses and constraining the amount of useable information (Van den Noortgate et al., 2013). CR-focused studies frequently measure different proxy factors and neurodegenerative studies tend to assess multiple cognitive or motor outcomes, resulting in several effect sizes per study. Regardless, past CR and α-synucleinopathy reviews have adopted the more conventional method of analysis, restricting quantity of data and consequently inferences stemming from these models. Our multilevel approach allowed inclusion of more than double the number of effect sizes compared to included studies, so that results are more representative of the current literature, less biased, and more accurate in estimating CR's effect on cognition in patients affected by α-synucleinopathies.
The findings of this review suggest that CR has significant associations with all aspects of cognition and, to a lesser extent, but still significantly, motor function. These findings are consistent with previously reported relationships between CR and cognition in systematic reviews of studies on healthy, dementia-afflicted, stroke, AD and PD cohorts (Contador et al., 2023; Hindle et al., 2014; Meng and D'Arcy, 2012; Nelson et al., 2021; Panico et al., 2022). The concept of CR assumes that individuals who have accumulated more enriching lifetime experiences will have a higher CR, and thus perform cognitively above expectations for a given level of neuropathology (Stern, 2009). Our results suggest that this protective effect of CR generalises from more cognitive neurodegenerative conditions, like AD, to movement-based disorders. This indicates that CR may influence a larger variety of brain regions and networks, rather than only those heavily involved in cognitive processes. Individuals with a higher CR have been found to have a larger brain volume, greater subcortical structure volume, and consequently superior cognitive performance (Nelson et al., 2022; Seider et al., 2016). It is possible, therefore, that CR might increase the whole brain's resistance to neurodegeneration, which may explain why resilience translates from cognitive to motor performance in α-synucleinopathy patients. These effects might also be specific to dopaminergic circuits that are associated with both cognition and motor function. PD patients with higher CR have been shown to have better UPDRS-III ratings than those with a lower CR, despite reductions in available dopamine in the posterior putamen (Chung et al., 2022; Sunwoo et al., 2016). This outcome persists when duration of PD symptoms is held constant (Sunwoo et al., 2016), and supports the idea that CR mechanisms may help compensate for dopamine loss in the striatum and potentially in other basal ganglia nuclei. Notably, there was a high degree of heterogeneity in both multilevel meta-analytic models. The diversity across effects was mostly a product of between-study discrepancies caused by differences in cohort, CR assessment, outcome type, and, especially, how each outcome domain was measured.
Fig. 5. Risk of bias summary plot: review authors' judgements about risk of bias for each domain assessed using the adapted QUIPS, expressed as percentage of included studies rated as low, moderate or high risk of bias (upper panel). Risk of bias traffic plot: review authors' judgements about risk of bias for each domain for each individual included study (lower panel).

α-synucleinopathy subtype
Possessing a greater CR has been previously associated with less impaired cognitive performance in PD patients (Gu and Xu, 2022; Hindle et al., 2014), and the current study validates CR's prognostic utility using a larger sample. Interestingly, moderator analyses indicated similar relationships between CR and cognition in DLB and PD patients, despite distinct differences in onset and severity of cognitive impairments (McCann et al., 2014). The similar impact of CR across PD and DLB is consistent with the fact that it can be difficult to differentiate these conditions using cognitive tests alone (Hohl et al., 2000; Litvan et al., 1998). It is also possible that more cognitively impaired PD patients in more advanced stages were included in this review (mean PD disease duration = 7 years), who were thus more likely to suffer widespread aggregation of α-syn in subcortical and higher cortical areas (Halliday and McCann, 2010). Further, the potential misdiagnosis of DLB as PD may have led to comparable effects. For instance, Litvan et al. (1998) compared diagnostic accuracy across six neurologists and reported relatively high sensitivity for PD diagnosis and extremely low sensitivity for DLB diagnosis, suggesting over-diagnosis and under-diagnosis, respectively. Notably, there was a small number of DLB effects in our analyses, potentially due to diagnosis issues, which limits any definitive inferences.
In contrast to other α-synucleinopathies, CR had a more profound beneficial impact on cognitive performance in MSA patients. MSA is pathologically and clinically quite distinct from PD and DLB: it shares the accumulation of α-syn aggregates, but these occur in glia instead of neurons, affecting functional outcomes uniquely. MSA can present with similar symptoms to other α-synucleinopathies, but these can vary depending on the brain regions in which glial cytoplasmic inclusions (GCI) initially develop, leading to either parkinsonian or cerebellar subtypes in conjunction with autonomic dysfunction. Compared to PD and DLB pathological progression, where degree of neuronal loss is more region-selective, MSA patients typically suffer from widespread loss of cell function across many regions (McCann et al., 2014). CR may therefore play a more important role in clinical outcomes when cell loss is more extensive throughout the cortex or when neurodegeneration initially targets glia rather than neurons. For instance, a study by Jin et al. (2023) exploring the moderating effects of CR on brain structure and cognition identified that CR's protective role as a buffering factor was greater when significant neuropathology accumulates in the ageing brain. Additionally, other studies have suggested that CR may interact with patient outcomes and brain structure differently, at least somewhat dependent on the neurological status of the cohort in consideration or CR proxy selection (James et al., 2012; Menardi et al., 2018; Wanigatunga et al., 2021). For example, Bennett et al. (2005) demonstrated that education, a CR proxy, moderated the association between cognitive performance and AD neuropathology when it was quantified as amyloid-β accumulation, but not when neuropathology was indexed by neurofibrillary tangles. Consequently, it is possible that CR exerts a greater neuroprotective effect when neurodegeneration is more extensive and aggressive (Abos et al., 2019) or targets specific cells, as in MSA. However, as with DLB, the small number of MSA effects included in this review and the lack of evidence comparing CR's influence in MSA to other α-synucleinopathies make any conclusions speculative.

Outcome domains and type
The effect of CR remained consistent irrespective of whether global or specific cognitive tests were used. The similar effects for different outcomes were surprising, as we expected larger effects for specific cognitive tests, which can potentially afford greater sensitivity by reflecting the functioning of particular brain regions and cognitive processes (Amaefule et al., 2020; Armstrong et al., 2020; Harvey, 2019; Stellmann et al., 2021). Previous studies have demonstrated that global cognitive tests, like the MoCA and MMSE, are less able to detect cognitive deficits after stroke and traumatic brain injury than domain-specific measures (Demeyere et al., 2016; Srivastava et al., 2006). In this review, relationships between CR levels and cognitive subtypes were also analysed, displaying a weaker association with CR than global and specific tests. There were greater differences in CR level when cognitively normal patients were compared to those with dementia, rather than MCI, demonstrating CR's prognostic utility. This suggests that α-synucleinopathy patients with a higher CR are less likely to receive an MCI or dementia diagnosis, consistent with the CR hypothesis (Stern, 2009). As α-synucleinopathy neuropathology progresses over time, the chance of a dementia diagnosis increases (Hely et al., 2008; McCann et al., 2014). The stronger correlation between CR and dementia group comparisons, relative to MCI, indicates that possessing a higher CR may be even more important for resisting severe cognitive impairment in later α-synucleinopathy disease stages, excluding MSA where dementia does not appear to be a fundamental hallmark of the disease (Brown et al., 2010; Robbins et al., 1992).
There were no significant differences between global and specific motor function test effects; however, there were few specific test effects, preventing robust comparisons. Despite the simplistic nature of the Hoehn & Yahr scale compared to the UPDRS-III, they both correlated similarly with CR, possibly due to both relying on stringent items measuring similar outcome variables. Stronger associations between CR and domain-specific tests could suggest CR's motor function influence is more clearly shown by objective measures. Subjective rating scales can be limited by clinician bias or experience, and usually reduce precision through restricted answer options per item (Federico et al., 2018; Jahedi and Méndez, 2014). Severe limitations when using the widely applied Hoehn & Yahr scale to measure motor function have been reported, especially for the unmodified version. Specifically, its heavy emphasis on postural stability as an indicator of disease severity and its inability to capture other motor features of PD drastically reduce scale sensitivity (Goetz et al., 2004). More objective methods, such as a portable movement sensor, have demonstrated greater test-retest reliability when classifying PD motor impairments than clinicians using the UPDRS-III (Heldman et al., 2014). This portable system was also sensitive to significant bradykinesia, hypokinesia, and dysrhythmia changes in response to patient deep brain stimulation that clinician ratings did not identify. Motor impairments may therefore be erroneously classified; however, more studies are required, especially using objective motor measurements, before any conclusions can be drawn.

Cognitive reserve proxies
An important objective of this review was to explore whether type of CR measure substantially altered cognitive and motor function outcomes. Regardless of proxy selected, higher CR was associated with significantly better cognition. The stronger association between intelligence and cognition, compared to other proxies, was expected, given the likely overlap between these measures (Mohn et al., 2014). Interestingly though, education and comprehensive tests had nearly identical associations with overall cognition, despite being vastly different CR measures. CR proxy measures can be classified as static or dynamic (Malek-Ahmadi et al., 2017). Education is an example of a static measure that remains fixed over time, while comprehensive tests assess dynamic factors developed across the lifespan, like engagement in cognitive activities other than for educational purposes. AD and stroke studies have suggested that dynamic proxies are better CR indicators, displaying stronger associations with cognitive outcomes (Gil-Pagés et al., 2019; Malek-Ahmadi et al., 2017). Education's association with cognition, despite its known correlation (Lövdén et al., 2020), was expected to be lower than that of comprehensive tests. It is possible that the contribution of CR in α-syn-based neurodegenerative conditions may be less reliant on dynamic experiences, and instead static factors like education enhance neural networks just as effectively. Pathological discrepancies between neurodegenerative diseases may explain this variance in results, as AD patients often experience significant entorhinal region deterioration (Van Hoesen et al., 1991), which is distinct from α-synucleinopathy neuropathology. While education has been previously negatively associated with volume of AD-affected brain structures (Zhu et al., 2021), a lack of exploration of the influence different CR proxies may have on α-synucleinopathy-related brain structures prevents any comparisons with AD. Additionally, in our study, the effects of occupation and vocabulary, while significant, were substantially lower than that of education. Occupational measures' positive effects on cognition have been previously overshadowed by education in both healthy and PD cohorts (Darwish et al., 2018; Rouillard et al., 2017); however, limited effect sizes for these proxies may have contributed to significant differences.
For motor function, there were fewer effect sizes to analyse, and primarily education was used as a CR proxy. Only education and intelligence effects were significant predictors of motor function, while vocabulary was approaching significance. A previous study reported a positive association between the CRIq and global motor function (Guzzetti et al., 2019), while data for other studies suggest negative associations (Ciccarelli et al., 2018; Montemurro et al., 2019). These past studies support the inconsistent relationship between CR and motor function identified in this review (see Supplementary Material 2, Fig. 2). Discrepancies in effects potentially arise from a combination of heterogeneity in CR and motor function tests and a lack of sufficient effect sizes. Interestingly though, as with cognition, intelligence correlated most strongly with motor function. In children, lower intelligence has been linked to poorer motor performance (Smits-Engelsman and Hill, 2012), and it appears this association may persist into adulthood. Healthy individuals with better cognition and higher CR have been found to engage in greater amounts of physical activity and possess more preserved motor abilities (Buchman et al., 2019). Additionally, in PD, higher global intelligence level correlates with reduced bradykinesia and lower UPDRS-III ratings (Wiratman et al., 2019), while slowing of fluid intelligence abilities (bradyphrenia) has also been positively associated with bradykinesia (Sawamoto et al., 2002). The similar findings in our review potentially suggest a significant level of interconnectedness between intelligence and motor function. To the best of our knowledge, this review is the first study to compare CR proxy effects; however, more studies are required to fully understand what CR factors contribute to better α-synucleinopathy patient outcomes. Future studies attempting to demonstrate CR's neuroprotective effect in α-synucleinopathies would benefit from using outcome measures designed to test neural circuits typically affected by these conditions. For example, utilising a reinforcement learning task known to reflect basal ganglia circuitry dysfunction (Chakravarthy et al., 2010), a recognised deficiency in those diagnosed with α-synucleinopathies (Calabresi et al., 2023), rather than more generalised global cognition tests might provide better opportunities to test CR's effect in these conditions.

Medication control
A novel objective of this review was to investigate the potential confounding effect of medication on CR's relationship with α-synucleinopathy outcomes. Only studies with a PD cohort controlled for medication (dopamine replacement therapy) in this review. CR's influence on cognition was significant regardless of medication control, and effect magnitude varied only slightly, and not significantly, between studies that controlled for medication and those that did not. Given that failure to control for extraneous variables tends to produce higher effect sizes (Olejnik and Algina, 2003), a larger pooled effect was expected for studies that did not control for medication. However, these results may also indicate that levodopa-carbidopa's influence on cognition is inconsistent. If disease medication's influence on cognitive performance can be positive, negative or have no significant effect, then its ability to confound the CR and cognition relationship would have been limited. Levodopa-carbidopa has been shown to not affect MMSE scores in PD patients with dementia, but it did improve scores for PD patients without dementia (Molloy et al., 2006). Nevertheless, performance on other cognitive tests did not change for patients without dementia. Another study by Cools et al. (2003) found that the effect of levodopa-carbidopa on cognition varied depending on task demands. ON medication patients performed worse in decision-making tasks, while OFF medication patients exhibited poor attentional flexibility, being unable to quickly switch between tasks. In our review, a majority of the studies that controlled for medication effects used global cognitive tests, and none reported decision-making measures (see Table 2). Thus, the lower (albeit non-significant) effect in the medication-control studies may imply that CR has a favourable effect on PD global cognition that overlaps with that of medication.
Interestingly, the effect of CR on motor function was only significant when levodopa-carbidopa was not controlled for, though medication control was not a significant moderator. Levodopa-carbidopa's known effectiveness at reducing PD motor impairments (Schapira et al., 2009) likely overshadows the small, but significant, influence CR has on motor function. Motor function studies that controlled for medication used only education as a CR estimate. It is thus possible that use of a less representative, static CR measure also contributed to a non-significant pooled effect in this group. Findings reported by Schneider et al. (2015) indicate that a static measure like education does not correlate well with motor function outcomes. That the education-cognition relationship persisted after medication control, unlike the relationship with motor function, is likely indicative of education's well-known strong association with life-long cognitive functioning (Lövdén et al., 2020). Alternatively, when CR is measured using comprehensive measures, there appears to be a more significant effect of higher CR on motor function (Guzzetti et al., 2019). CR, evaluated using the CRIq, has been observed to contribute to significant longitudinal gait and balance improvements in PD through virtual reality rehabilitation (Imbimbo et al., 2021). Therefore, if CR is measured comprehensively, then its impact on PD motor function may remain significant, and possibly even moderate responsiveness to disease medication, though these are speculations.

Review strengths
Our review was the first, to our knowledge, to investigate how CR influences both cognition and motor function in multiple α-synucleinopathy subtypes, while considering several moderating variables. Previous similar reviews (Gu and Xu, 2022; Hindle et al., 2014) included fewer studies, and therefore fewer effects, whereas our review achieved greater statistical power. Including a substantial number of studies allowed us to analyse effects both within and between studies through multilevel modelling. The multilevel approach, despite being a strong method for accounting for dependent effect sizes, is frequently not considered in meta-analytic research (Assink and Wibbelink, 2016). Often, effect size dependency is avoided in conventional meta-analyses by only including one effect size per study or averaging effect sizes within studies, limiting investigation of heterogeneity (Assink and Wibbelink, 2016; Cheung, 2019; Nakagawa et al., 2023). If effects are dependent and the statistical methods employed do not account for this, then overlap in information across effects can lead to inflated effect estimates. Utilising a multilevel meta-analysis is a more robust approach to dealing with several within-study effects, as it also considers variance at the individual effect size level. By structuring our meta-analyses with three levels to account for effect size dependency, we could ensure that dependency did not inflate effect sizes, while keeping meta-analytic models as well powered as feasibly possible and incorporating all available information. This analysis method allowed us to include 176 unique cognition effect sizes from 80 studies and 31 motor function effect sizes from 19 studies. More importantly, this approach also made possible moderator analyses examining differences in effects across within-study variables like CR proxy type, a novel component of this review.
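As a sketch of the three-level structure described above, the model can be written in the standard random-effects form given in tutorials such as Assink and Wibbelink (2016) and Cheung (2019). The notation here is an illustrative assumption and is not taken from the review's own model specification.

```latex
% r_{ij}: i-th effect size (correlation) reported by study j
% \mu: pooled effect; u_j: study-level deviation (level 3)
% v_{ij}: within-study deviation of effect i in study j (level 2)
% e_{ij}: sampling error with known variance s_{ij}^2 (level 1)
\begin{aligned}
r_{ij} &= \mu + u_{j} + v_{ij} + e_{ij}, \\
u_{j} &\sim \mathcal{N}\!\left(0, \sigma^{2}_{\text{between}}\right), \quad
v_{ij} \sim \mathcal{N}\!\left(0, \sigma^{2}_{\text{within}}\right), \quad
e_{ij} \sim \mathcal{N}\!\left(0, s^{2}_{ij}\right)
\end{aligned}
```

Under this formulation, effects from the same study share the term $u_j$, which is how dependency between them is modelled rather than ignored.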
Further, 48 studies reported effect sizes that were converted to a correlation coefficient, our chosen standardised estimate. When effect sizes are converted to a different measure, certain assumptions about the nature of these effects are made; even if these assumptions do not hold, conversion is often a better approach than omitting studies, as loss of information can lead to bias (Borenstein et al., 2009). Sensitivity analyses revealed that effect size magnitude did not significantly vary between non-converted and converted effect sizes for both cognition and motor function (see code at OSF: https://doi.org/10.17605/OSF.IO/879RC).
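To illustrate the kind of conversion referred to above, the following minimal sketch converts a standardised mean difference (Cohen's d) to a correlation coefficient using the formula given by Borenstein et al. (2009). The function name and the example values are hypothetical; the conversions actually applied in the review are those in the OSF code, which may cover other effect size types.

```python
import math


def d_to_r(d: float, n1: int, n2: int) -> float:
    """Convert Cohen's d to a correlation coefficient r.

    Uses r = d / sqrt(d^2 + a), with a = (n1 + n2)^2 / (n1 * n2).
    The correction factor a equals 4 when the two groups are equal in size.
    """
    a = (n1 + n2) ** 2 / (n1 * n2)
    return d / math.sqrt(d ** 2 + a)


# Hypothetical example: d = 0.5 from two groups of 50 participants each.
r = d_to_r(0.5, 50, 50)
print(round(r, 4))
```

The conversion preserves the sign of the effect, so a negative d yields a negative r.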

Limitations
Review findings should be interpreted in conjunction with several limitations. First, there were few effect sizes for some moderator groupings. Most studies included in the review investigated outcomes in PD patients, used education as a CR proxy, and a cognitive measure as the outcome. DLB and MSA, other CR proxies besides education, motor function, and studies that controlled for disease medication were poorly represented, meaning conclusions drawn from moderator analyses should be interpreted carefully. The CR and α-synucleinopathy literature is scarce, with aims and methodology of most included studies not related to CR. However, including many studies that did not set out to investigate CR could also be perceived as a strength of the review, as it potentially reduces publication bias.
Second, our publication bias tests may not be entirely suitable given our choice of analysis. There is currently no agreed-upon method to accurately test for publication bias in multilevel models with three or more levels. While the multilevel extension of Egger's test seems like the most suitable approach for non-independent effects (Nakagawa et al., 2022), simulation studies have shown that this method can lead to inflated Type I and Type II error rates, especially for datasets containing a large number of heterogeneous effects (Fernández-Castilla et al., 2021; Nakagawa et al., 2022). However, we did supplement our multilevel Egger's tests by visually inspecting the sunset (power-enhanced) funnel plot, so that the conclusion of no publication bias rested on multiple evaluations.
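The logic behind an Egger-type test can be sketched as a weighted regression of effect size on its standard error: if small (imprecise) studies report systematically larger effects, the slope departs from zero. The example below is a simplified, single-level illustration written in plain Python. It is not the multilevel extension used in this review, it ignores dependency between effects, and it computes no standard errors or p-values; the function name and data are hypothetical.

```python
def egger_type_regression(effects, std_errors):
    """Inverse-variance weighted regression of effect size on standard error.

    Returns (intercept, slope). The slope estimates how strongly effect size
    changes with imprecision; a slope far from zero suggests small-study
    effects (funnel plot asymmetry).
    """
    weights = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    w_sum = sum(weights)
    mean_se = sum(w * se for w, se in zip(weights, std_errors)) / w_sum
    mean_es = sum(w * es for w, es in zip(weights, effects)) / w_sum
    sxx = sum(w * (se - mean_se) ** 2 for w, se in zip(weights, std_errors))
    sxy = sum(
        w * (se - mean_se) * (es - mean_es)
        for w, se, es in zip(weights, std_errors, effects)
    )
    slope = sxy / sxx
    intercept = mean_es - slope * mean_se
    return intercept, slope


# Hypothetical data in which effects grow with imprecision.
ses = [0.05, 0.10, 0.15, 0.20]
es = [0.1 + 2 * se for se in ses]
intercept, slope = egger_type_regression(es, ses)
print(intercept, slope)
```

In a multilevel setting, the same idea is usually implemented by adding the standard error (or sampling variance) as a moderator to the three-level model, so that the random effects absorb the dependency between effects.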
Third, the search strategy was limited to peer-reviewed, English-language publications, thereby possibly biasing included studies. Fourth, language tests were removed as a cognitive outcome and other validated CR proxies outside of those specified in the Stern et al. (2020) whitepaper were omitted. Language tests were overly comparable to some CR measures (i.e., intelligence) and thus were inappropriate to include as an outcome. Notably, a framework for reserve and resilience concepts was recently developed (Stern et al., 2023), providing consensus on definitions and methodology for studying reserve that supersedes the Stern et al. (2020) whitepaper. This new framework was not available when this review was performed; nevertheless, the CR proxies included in this review, which were chosen on the basis of the whitepaper, are still categorised as suitable operationalisations of CR according to the updated framework. Additionally, our method for classifying cognitive tests into specific domains, while supported by literature and expert opinion, was subjective and open to interpretation. Assigning cognitive tests to one domain rather than to others they may also have belonged to could have changed the significance of CR's effects across different domains. Lastly, effects in our multilevel models were largely pooled from between-study comparisons, meaning the full utility of multilevel meta-analysis could not be realised. Lack of sufficient within-study effects meant there was reasonable variability between effect sizes. Consequently, fitting a more flexible variance structure that would better explain the effect of CR on outcomes was impractical and a more restrictive approach was required.

Future research directions
This review has highlighted the need for further research investigating more diverse CR proxies, motor outcomes, and α-synucleinopathies other than PD to allow more definitive conclusions regarding the relationship between CR and clinical outcomes. While review analyses and previous review findings (Gu and Xu, 2022; Hindle et al., 2014) indicate that higher education is associated with reduced impairment across a variety of cognitive domains in PD, other relationships between CR and α-synucleinopathies are less clear. Despite the advantage of education being a commonly recorded demographic, overemphasis on this CR proxy skews inferences about the construct's full protective effect (Boyle et al., 2021). Future CR research should focus on deriving effects from a diverse range of static (e.g., education years and occupational level) and dynamic proxies (e.g., vocabulary and engagement in leisure/cognitively stimulating activities). Solely measuring education level as a CR index only captures a portion of the reserve usually developed in early life and ignores other cognitively enriching experiences occurring before and after education that contribute to reserve. Further, years of education might represent academic ability rather than CR (Schwartz et al., 2016), neglecting other reserve-building activities that can occur outside of academia and that also bestow neuroprotective effects. Measuring CR across multiple points during an individual's lifespan, using potentially more accurate, dynamic proxies that constitute a more complete representation of lifetime experience, would provide a better estimation of the construct (Bettcher et al., 2019). For example, in a cohort with AD neuropathology, only regression models using the dynamic CR proxy vocabulary significantly predicted cognitive performance, compared to models using years of education (Malek-Ahmadi et al., 2017), possibly suggesting substantial CR development occurs after education and throughout the entire lifespan. It should also be noted that none of the studies included in the review used the popular residual approach to operationalise CR. Since the review's search strategy was relatively expansive, incorporating several CR-related terms and multiple relevant databases, it is probable that no CR and α-synucleinopathy literature using the residual approach exists. A recent meta-analysis highlighted the clinical usefulness of the residual approach in an AD cohort (Bocancea et al., 2021), indicating that exploring its utility in α-synucleinopathies could be similarly informative.
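For readers unfamiliar with the residual approach (Reed et al., 2010), the following minimal sketch shows the core idea: cognitive performance is regressed on a neuropathology measure, and each individual's residual (observed minus predicted performance) serves as a CR index. The single-predictor model, function name, and data are hypothetical simplifications; published residual models typically include several pathology and demographic predictors.

```python
def cr_residuals(pathology, cognition):
    """Return residuals from a simple linear regression of cognition on pathology.

    A positive residual means the individual performs better than predicted
    for their level of neuropathology, which is read as higher reserve.
    """
    n = len(pathology)
    mean_x = sum(pathology) / n
    mean_y = sum(cognition) / n
    sxx = sum((x - mean_x) ** 2 for x in pathology)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(pathology, cognition))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(pathology, cognition)]


# Hypothetical data: pathology burden (arbitrary units) and a cognitive score.
pathology = [1, 2, 3, 4]
cognition = [30, 26, 27, 21]
residuals = cr_residuals(pathology, cognition)
print([round(res, 2) for res in residuals])
```

In this toy example, the third individual scores above the level predicted by their pathology burden and would therefore be assigned the highest residual CR estimate.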
Interpretations of CR's effect on α-synucleinopathies other than PD are mostly provisional, given the small number of DLB and MSA studies we could include in our review. More of these studies would help determine if patients with α-synucleinopathies other than PD benefit differently from CR. Further, the limited focus on motor function outcomes should be addressed in future α-synucleinopathy CR studies. This disparity likely results from CR's strong association with cognitive outcomes and from CR research originating in AD (Katzman, 1993; Katzman et al., 1988; Stern, 2002). Motor function impairments are hallmarks of α-synucleinopathies (McCann et al., 2014) that appear to be inconsistently altered by CR and need to be considered alongside cognition in future studies. There is also currently a lack of studies that conduct powerful statistical analyses controlling for possible confounders, like disease medication, which have been shown to influence α-synucleinopathy cognition and motor function (Cools et al., 2003; Guzzetti et al., 2019; Molloy et al., 2006).

Conclusion
This systematic review and multilevel meta-analysis demonstrated that α-synucleinopathy patients with higher levels of CR possess superior cognition and experience slightly less severe motor impairments. The influence of CR remained significant for cognition, regardless of α-synucleinopathy subtype, outcome domain, CR proxy, and controlling for disease medication (at least in PD), whereas the association between CR and motor function was inconsistent, but overall indicated a protective effect. CR should be considered a useful marker for its potential to inform the entire clinical process, from diagnosis to prognosis to intervention and throughout follow-up sessions. Using CR as one of multiple susceptibility markers for α-synucleinopathy neuropathology development may increase the effectiveness of early-stage diagnosis and thus allow for more prompt intervention. Knowledge of, and accounting for, CR's significant neuroprotective effects across different outcomes for α-synucleinopathy cohorts would help optimise prognostic predictions. A more refined understanding of individual patient disease progression would enhance intervention strategies, allowing clinicians to better personalise treatment plans that could improve patient quality of life. Furthermore, our findings underline the importance of engaging in and maintaining participation in CR-building activities throughout one's lifespan to minimise the burden of neuropathology. Development of CR by engaging in reserve-building behaviours at any point throughout the lifespan could act as an alternative to, or supplement for, pharmaceutical agents and surgical options, thereby improving clinical outcomes.

Sources of funding
This project is not directly sponsored but has been developed from a larger project receiving funding from the James and Diana Ramsey Foundation to L.C.-P. and I.B.: Evolution of decision-making in Parkinson's disease, 18th March 2019. I.S. was supported by the University of Adelaide Research Scholarship.

Declaration of Competing Interest
The authors declare no competing interests and have nothing to disclose.

k = number of unique studies. N = number of effect sizes. CC = correlation coefficient. CI = confidence interval. β = regression coefficient. Q = Q statistic test for residual heterogeneity. ToM = Test of Moderators. CR = Cognitive Reserve. PD = Parkinson's Disease. DLB = Dementia with Lewy Bodies. MSA = Multiple System Atrophy. EF = Executive Function. WM = Working Memory. STM = Short-Term Memory. LTM = Long-Term Memory. VPS = Visual Processing Speed. UPDRS-III = Unified Parkinson's Disease Rating Scale-Motor Subscale. *reference category for model. Effect sizes (mean CC) and 95% confidence intervals for each specific moderator are obtained from separate models that used that moderator as the reference category. Significant p-values (≤ 0.05) highlighted in bold. The p-values and associated β values indicate whether each moderator category differed from the reference category.

k = number of unique studies. n = number of effect sizes. CC = correlation coefficient. CI = confidence interval. β = regression coefficient. Q = Q statistic test for residual heterogeneity. ToM = Test of Moderators. CR = Cognitive Reserve. UPDRS-III = Unified Parkinson's Disease Rating Scale-Motor Subscale. *reference category for model. Effect sizes (mean CC) and 95% confidence intervals for each specific moderator are obtained from separate models that used that moderator as the reference category. Significant p-values (≤ 0.05) highlighted in bold. The p-values and associated β values indicate whether each moderator category differed from the reference category.

Fig. 1. PRISMA Flow Diagram, indicating database search results and screening process. *Original search, in June 2022: n=11931. Update to search (June 2022-March 2023): n=804. **Studies added to title/abstract screening that cited a review included paper. ***Studies added to title/abstract screening that were in the reference list of a review included paper. ****Including Parkinson's Progression Markers Initiative (PPMI) dataset as 'one study'. CR = Cognitive Reserve.

Fig. 2. Cognition moderator orchard plots: (A) effect of α-synucleinopathy subtype; (B) differences in cognitive domain effects; (C) influence of cognitive reserve proxy type on effects; (D) differences in outcome type effects; (E) effect of disease medication control. Bolded dot = mean effect size. Thick horizontal line = 95% confidence interval. Thin horizontal line = prediction interval. EF = executive function. DLB = dementia with Lewy bodies. LTM = long-term memory. NC = normal cognition. MCI = mild cognitive impairment. MSA = multiple system atrophy. PD = Parkinson's disease. VPS = visual processing speed. WM/STM = working memory/short-term memory.

Fig. 3. Motor function moderator orchard plots: (A) effect of α-synucleinopathy subtype; (B) differences in cognitive domain effects; (C) influence of cognitive reserve proxy type on effects; (D) differences in outcome type effects; (E) effect of disease medication control. Bolded dot = mean effect size. Thick horizontal line = 95% confidence interval. Thin horizontal line = prediction interval. MSA = multiple system atrophy. PD = Parkinson's disease. UPDRS-III = Unified Parkinson's Disease Rating Scale-Motor Subscale.

Fig. 4. Sunset (power-enhanced) funnel plots. Effect of cognitive reserve on overall cognition publication bias (top). Effect of cognitive reserve on overall motor function publication bias (bottom).

Table 1
Explanation of the different ways an individual can use cognitive reserve to actively cope with brain damage.

Table 2
Characteristics of studies included in review analyses.

Table 2 (continued)
Columns: Study Author | Country | N | Mean Age (Years) | Sex (% male) | Mean Disease Duration (Years) | Disease Subtype | CR measure(s) | Outcome test type(s) | Cognitive outcome measure(s) | Motor outcome measure(s) | Controlled for Medication
[...] drawing command; Digit Span Forward; Wechsler Memory Scale symbol span; Rey Complex Figure Test delay recall; Logical memory delayed recall; Boston naming test; Animals; Actions; Rey Complex [...] | No
[...] Digit Span; Fuld Object Memory Evaluation; Fuld Verbal Fluency; Block Design; MMSE; MoCA | Yes

Table 4
Results of cognition moderator analyses.

Table 5
Results of motor function moderator analyses.