Comparing three neuropsychological subgrouping approaches in subjective and mild cognitive impairment from a naturalistic multi-centre study

Subjective cognitive impairment (SCI) and mild cognitive impairment (MCI) are two clinical groups with an increased risk of developing dementia, but they are highly heterogeneous. This study compared three different approaches to subgroup SCI and MCI patients and investigated their capacity to disentangle cognitive and biomarker heterogeneity. We


Introduction
Discriminating early signs of cognitive impairment from normal age-related changes is clinically challenging. In that context, the concept of Mild Cognitive Impairment (MCI) has been useful to describe the intermediate stage between normal ageing and the early signs of dementia. However, MCI is a heterogeneous condition with varying aetiology and clinical progression (Petersen et al., 2001; Petersen et al., 1997; Winblad et al., 2004). Although an MCI diagnosis is related to an increased risk of developing dementia, some individuals with MCI do not progress to dementia, and some revert to normal cognitive status (Pandya et al., 2016; Petersen, 2016). The concepts of Subjective Cognitive Decline (SCD) and Subjective Cognitive Impairment (SCI) have recently emerged to reflect the intermediate stage between normal ageing and MCI. Both SCD and SCI include individuals with self-reported cognitive difficulties that cannot be detected with formal neuropsychological testing or other objective clinical evaluations (Jessen et al., 2014).
Although the concepts are related, SCI may involve a higher degree of complaint and has a less clear temporal connotation than SCD (Jessen et al., 2014). Both SCI and SCD are heterogeneous conditions, but individuals with either condition are more likely to progress to MCI and dementia than individuals without any cognitive complaints (Rabin et al., 2017; Reisberg et al., 2010).
How MCI and SCI/SCD are operationalised has an impact on incidence, diagnostic stability over time, and syndromic profiles (Diaz-Galvan et al., 2021; Jak et al., 2009). Jak et al. (2009) found that a more inclusive traditional definition provided larger incidence numbers and less stability, and included more individuals with non-memory related impairments. In contrast, a less inclusive comprehensive operationalisation led to lower incidence numbers and higher stability, and included individuals with more memory-related impairments. Additionally, more inclusive diagnostic MCI criteria are more sensitive in detecting individuals with borderline or very subtle alterations in neuropsychological performance (Clark et al., 2013), whereas less inclusive criteria have better prognostic performance in identifying individuals who will develop dementia (Robertson et al., 2019).
There have been efforts to parse the heterogeneity in MCI through different methods to subgroup the patients. One of the most popular approaches is stratifying MCI patients into amnestic vs. non-amnestic impairments, with further categorization into single and multiple domain MCI (Petersen et al., 2001; Winblad et al., 2004). These subgroups differ in the risk of progression to dementia (Damian et al., 2013; Jak et al., 2016). This risk may be higher for subgroups with amnestic impairments, especially those of the multiple domain type.
Another approach for MCI stratification is data-driven clustering analysis using cognitive data, which disregards top-down theoretical assumptions and instead identifies MCI subgroups empirically. MCI clustering studies have usually reported between three and four MCI subgroups, often including amnestic, language and dysexecutive subtypes (Machulda et al., 2019). In some studies, a subgroup with very subtle cognitive impairments has also been found (Damian et al., 2013; Edmonds et al., 2015; Machulda et al., 2019). The concepts of SCI and SCD are more recent and less is known about potential subgroups in these populations. A recent study showed that different subgrouping methods influenced syndromic and biomarker profiles of the resulting subgroups (Diaz-Galvan et al., 2021).
However, more subgrouping studies in SCI/SCD are needed to disentangle the heterogeneity in those syndromes.
The overall goal of the current study was to compare three subgrouping approaches for SCI and MCI patients in a naturalistic multicentre memory clinic cohort and assess their capacity to disentangle cognitive and biomarker heterogeneity. We operationalized SCI instead of SCD due to insufficient information about change over time in subjective complaints in our cohort (Jessen et al., 2014). Following Clark et al. (2013), we included the two theory-based approaches known as the traditional and comprehensive approaches, as well as a commonly used data-driven approach, i.e., hierarchical clustering (Damian et al., 2013; Edmonds et al., 2015; Machulda et al., 2019). For the biomarkers, we included widely used neuroimaging biomarkers such as medial temporal lobe atrophy for regional brain atrophy (Jack et al., 2018) and white matter hyperintensity burden for cerebrovascular disease (Wardlaw et al., 2013). We assessed amyloid-beta and tau pathology through the cerebrospinal fluid (CSF) biomarkers Aβ1-42 and phosphorylated tau, respectively (Jack et al., 2018). We also discuss the implications of the discrepancies among subgrouping approaches and their potential use in a clinical setting. We elaborate on the neuropsychological heterogeneity among individuals with SCI and MCI as well as the potential clinical difficulty of differentiating between the two in the naturalistic setting.

Participants
Participants in this study come from the MemClin-cohort (Ekman et al., 2020). All participants were assessed between 2016 and early 2019. MemClin is a Stockholm-based multi memory-clinic project where data collection is ongoing with nine clinics included. All data are managed through TheHiveDB, which is a database system that allows for secure management of personal data (Muehlboeck et al., 2014). Diagnoses are clinically established in multiteam consensus rounds according to the guidelines by the Swedish Board of Health and Welfare in 2010, revised in 2017 (Socialstyrelsen, 2017), and the ICD-10 (WHO, 1992).
For the current study, inclusion criteria were: 1) participants diagnosed with either MCI (ICD code F067) or SCI (operationalized as ICD codes R418A and Z032A); 2) participants who had neuropsychological testing available; and 3) age 65 or older. The MMSE total score was used as an estimation of global cognition (Folstein et al., 1975). MemClin follows the Declaration of Helsinki and has ethical approval from the regional ethics committee in Stockholm.

Neuropsychological testing
Neuropsychological testing was performed as part of the clinical routine for cognitive evaluation and is described elsewhere (Ekman et al., 2020). In short, the included tests were the Boston Naming Test (BNT) (Kaplan et al., 1983); Verbal Fluency, Semantic Fluency, Semantic Switching (combined score) and Trail Making Test (TMT) conditions 1 through 4 from the Delis-Kaplan Executive Function System (D-KEFS) (Delis et al., 2001); the Rey Auditory Verbal Learning Test (RAVLT) (Schmidt, 1996); the Rey Complex Figure Test (RCFT) (Meyers, 1995); and the Block Design, Information, Similarities, and Digit Span subtests from the Wechsler Adult Intelligence Scale (WAIS) (Wechsler, 2008). Following established local clinical practices, we assigned the tests to different cognitive domains as follows: episodic memory (RAVLT total words learned and delayed recall, and RCFT immediate recall); attention/processing speed (D-KEFS TMT 1-4); executive function (WAIS Similarities and D-KEFS Semantic Switching); language (BNT, WAIS Information, D-KEFS Verbal and Semantic Fluency); working memory (Digit Span from WAIS); and visuospatial ability (Block Design from WAIS and RCFT).

Neuroimaging biomarkers
Since MemClin is a completely naturalistic cohort, brain imaging was obtained from different MRI (both 1.5 Tesla and 3 Tesla scanners) and computed tomography (CT) centres. Due to the variability in scanners and sequences, we favoured visual rating scales over automated methods for the assessment of both brain atrophy and cerebrovascular disease. All visual ratings were performed by the radiology centres in accordance with the original publications (Fazekas et al., 1987; Scheltens et al., 1992). The visual ratings were performed during the clinical investigation and the values were extracted from the participants' medical records for our analyses. Atrophy of the medial temporal lobe was rated using the MTA scale (Scheltens et al., 1992). White matter hyperintensities were rated with the Fazekas scale (Fazekas et al., 1987). MTA scores were considered normal or abnormal based on the cut-offs proposed in Ferreira et al. (2015), which are age-adjusted, ranging from ≥1.5 at age 65-74 to ≥2.0 at age 75-84 and ≥2.5 at age 85-94. The Fazekas cut-offs were also age-adjusted based on the values suggested by Wahlund et al. (2017), where a rating >2 is considered abnormal for individuals <70 years and a rating >3 is always considered abnormal. There is excellent agreement between CT and MRI for assessment of MTA and substantial agreement for the Fazekas scale (Wattjes et al., 2009). Therefore, ratings from MRI and CT were combined for statistical analyses.
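The age-adjusted MTA rule quoted above can be written as a small lookup. The following is a minimal sketch, not the authors' implementation: the function name is hypothetical, and it assumes that a single MTA value per participant is already available (how left and right ratings are combined is not specified here).

```python
def mta_is_abnormal(mta_score: float, age: int) -> bool:
    """Apply the age-adjusted MTA cut-offs quoted in the text.

    Hypothetical helper: assumes one MTA value per participant.
    """
    # (minimum age, maximum age, cut-off); a score >= cut-off is abnormal.
    cutoffs = [(65, 74, 1.5), (75, 84, 2.0), (85, 94, 2.5)]
    for min_age, max_age, cutoff in cutoffs:
        if min_age <= age <= max_age:
            return mta_score >= cutoff
    # Ages outside 65-94 are not covered by the cut-offs quoted above.
    raise ValueError(f"no MTA cut-off defined for age {age}")


# The same score can be abnormal in a younger and normal in an older person.
print(mta_is_abnormal(1.5, 70))  # True
print(mta_is_abnormal(1.5, 80))  # False
```

Raising an error for ages outside the listed bands, rather than guessing a cut-off, keeps the sketch limited to what the text states.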

Cerebrospinal fluid biomarkers
CSF samples were collected, and biomarkers determined in the clinical routine using ELISA.

Subgrouping approaches
The overall goal of the current study was to compare three subgrouping approaches for SCI and MCI patients and assess their capacity to disentangle cognitive and biomarker heterogeneity. Since we aimed to understand the neuropsychological heterogeneity and its association with the operationalisations of MCI, we combined the participants with a clinical diagnosis of SCI and MCI. We also report the percentage of individuals with a clinical diagnosis of MCI for each subgroup to highlight the heterogeneity and the differences between clinical evaluation and our analysis. The three subgrouping approaches included two theory-based approaches and one data-driven approach. The two theory-based approaches were based on the subgroups from Winblad et al. (2004) and differed only in the cut point used to establish abnormality and assign individuals to subgroups: amnestic single domain, amnestic multidomain, non-amnestic single domain, non-amnestic multidomain and a "subtle subgroup". The subtle subgroup consisted of individuals who did not fulfil the criteria of cognitive impairment for MCI, independently of their original clinical diagnosis. The three subgrouping approaches used the same underlying neuropsychological data and were:
1) The traditional operationalization, where cognitive domains were considered impaired if one test score was ≤ -1.5 SD relative to the mean of the SCI group, based on the Petersen criteria (Petersen et al., 1999).
2) The comprehensive operationalization, where cognitive domains were considered impaired if two or more test scores were ≤ -1 SD relative to the mean of the SCI group, based on the criteria from Jak et al. (2009).
3) The operationalization based on a data-driven cluster analysis performed on the neuropsychological tests. An agglomerative hierarchical cluster analysis was performed using Ward's method, and the distance between clusters was calculated with the squared Euclidean distance, in line with multiple earlier publications in the field (Damian et al., 2013; Edmonds et al., 2015; Machulda et al., 2019). We tested models from 2 to 8 clusters. The appropriate number of clusters was decided based on the resulting dendrogram and the quality of the clusters as informed by the Krzanowski-Lai and Calinski-Harabasz indices, evaluated with NbClust (Malika et al., 2014).
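To make the difference between the two theory-based rules concrete, the sketch below applies both cut points to one invented participant. It is an illustration under stated assumptions rather than the study's code: the function names and z-scores are hypothetical, "one test score" is read as "at least one", and a profile is called amnestic whenever episodic memory is among the impaired domains.

```python
from typing import Dict, List

MEMORY = "episodic memory"  # domain that decides amnestic vs. non-amnestic


def impaired_domains(domain_z: Dict[str, List[float]],
                     cutoff: float, min_tests: int) -> List[str]:
    """Domains with at least `min_tests` z-scores at or below `cutoff`."""
    return [domain for domain, scores in domain_z.items()
            if sum(z <= cutoff for z in scores) >= min_tests]


def subgroup(impaired: List[str]) -> str:
    """Map a list of impaired domains to one of the five subgroups."""
    if not impaired:
        return "subtle"
    kind = "amnestic" if MEMORY in impaired else "non-amnestic"
    extent = "single domain" if len(impaired) == 1 else "multidomain"
    return f"{kind} {extent}"


def traditional(domain_z: Dict[str, List[float]]) -> str:
    # Assumed reading: at least one test at or below -1.5 SD.
    return subgroup(impaired_domains(domain_z, cutoff=-1.5, min_tests=1))


def comprehensive(domain_z: Dict[str, List[float]]) -> str:
    # Two or more tests at or below -1 SD within the same domain.
    return subgroup(impaired_domains(domain_z, cutoff=-1.0, min_tests=2))


# Invented z-scores (relative to the SCI group) for one participant.
participant = {
    "episodic memory": [-1.6, -0.4, -0.2],
    "language": [-1.2, -1.1, 0.3, 0.0],
    "executive function": [0.1, -0.3],
}
print(traditional(participant))    # amnestic single domain
print(comprehensive(participant))  # non-amnestic single domain
```

The same profile lands in different subgroups because one isolated low memory score satisfies the traditional rule, whereas the comprehensive rule requires two low scores within a domain.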

Statistical analysis
Neuropsychological test scores were Z-transformed using the means and standard deviations (SD) from the SCI group, and the z-scores were then used for all subsequent analyses of neuropsychological data. The scores from the D-KEFS TMT were inverted to represent performance in the same direction as the other tests (higher scores mean better performance; please see Appendix A for more details and raw scores for both the SCI and MCI groups). The overall amount of missing data for neuropsychological tests was 10%.
Imputation was performed with the missForest package in R 3.6.1, which is a random forest-based imputation method (Stekhoven and Bühlmann, 2012). The two theory-based subgrouping approaches were based on the six cognitive domains, while the hierarchical cluster analysis subgrouping approach was based on neuropsychological tests. This was done to leverage more cognitive measures for the hierarchical cluster analysis, while theory-based methods benefit from a simplified set of cognitive measures (as is often operationalised in clinical diagnostic criteria for MCI). A sensitivity hierarchical cluster analysis on the six cognitive domains showed virtually the same results as the main analysis on neuropsychological tests (data not shown).
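The reference-group standardisation described above can be sketched as follows. This is a minimal illustration with invented numbers: the function name is hypothetical, the sample SD (n - 1 denominator) is an assumption, and the sign flip for timed tests is one plausible reading of how the TMT scores were inverted.

```python
from statistics import mean, stdev
from typing import List


def z_scores(raw: List[float], reference: List[float],
             higher_is_better: bool = True) -> List[float]:
    """Standardise raw scores against a reference group (here: SCI).

    Assumptions: sample SD is used, and timed tests are inverted by a
    sign flip so that higher z always means better performance.
    """
    ref_mean = mean(reference)
    ref_sd = stdev(reference)
    sign = 1.0 if higher_is_better else -1.0
    return [sign * (x - ref_mean) / ref_sd for x in raw]


# Invented values: a recall score (higher is better) ...
print(z_scores([35, 50], reference=[40, 50, 60]))
# ... and a completion time in seconds (lower is better).
print(z_scores([55], reference=[30, 40, 50], higher_is_better=False))
```

Because every test is expressed relative to the same reference group, a z-score of -1.5 carries the same meaning across tests, which is what the fixed cut points of the theory-based approaches rely on.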
To address our goal of comparing subgrouping approaches and assessing their capacity to disentangle cognitive and biomarker heterogeneity, we followed three strategies. Firstly, we used Chi-squared tests, MANOVA, and discriminant analysis for characterization of the subgroups in terms of distributions and neuropsychological profiles. The Chi-squared test was used to evaluate categorical distributions within each subgrouping approach, i.e., we assessed whether approaches produced subgroups equal in size (effect sizes estimated with Cramér's V). MANOVA was performed to compare approaches in the amount of variance explained by the subgroups, and the magnitude of the between-subgroups differences in neuropsychological performance (effect sizes assessed via the partial η² estimate). Additionally, MANOVA provided the opportunity to evaluate whether confounders such as age, sex, education, and global cognition influenced the differences between the subgroups. We expected the MANOVA to explain more variance in the data-driven approach, because the data-driven approach is trained on neuropsychological tests and MANOVA is performed on the six cognitive domains derived from those neuropsychological tests. Therefore, the main goal of this comparison was to uncover which theory-driven approach performed closest to the data-driven approach. Discriminant analysis was performed to assess the accuracy of the separation of the subgroups using neuropsychological data. From the discriminant analysis, we report cross-validated accuracy and show the relationship between the subgroups in each approach. As for MANOVA, the six cognitive domains were used for discriminant analysis as well. Secondly, we assessed the correspondence among subgrouping approaches with Chi-squared tests and an alluvial plot created with ggalluvial (Brunson and Read, 2020) and ggplot2 (Wickham, 2016) in R 3.6.1.
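The test of whether an approach produces subgroups of equal size is a Chi-squared goodness-of-fit test against an even distribution. The sketch below shows the arithmetic on invented counts; the function names are hypothetical, the effect size uses the goodness-of-fit form of Cramér's V (an assumption about how it was computed), and the p-value is omitted because it requires the Chi-squared distribution function.

```python
from math import sqrt
from typing import List, Tuple


def chi_square_even(counts: List[int]) -> Tuple[float, int]:
    """Chi-squared statistic and degrees of freedom against an even split."""
    n = sum(counts)
    k = len(counts)
    expected = n / k  # every subgroup expected to be the same size
    chi2 = sum((observed - expected) ** 2 / expected for observed in counts)
    return chi2, k - 1


def cramers_v_fit(chi2: float, n: int, k: int) -> float:
    """Cramér's V for a goodness-of-fit test (assumed form)."""
    return sqrt(chi2 / (n * (k - 1)))


# Invented subgroup counts for one approach with five subgroups.
counts = [60, 15, 10, 10, 5]
chi2, df = chi_square_even(counts)
v = cramers_v_fit(chi2, n=sum(counts), k=len(counts))
print(chi2, df)  # 102.5 4
print(v)
```

A larger statistic at the same degrees of freedom and sample size means subgroup sizes deviate more from an even split, which is how the coefficients of the approaches are compared.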
Thirdly, we used Chi-squared tests and binary logistic regression to investigate imaging and CSF biomarkers. The Chi-squared test was used to evaluate subgrouping approaches in association with imaging and CSF biomarkers. Binary logistic regression was used to estimate odds ratios for an abnormal imaging or CSF biomarker across subgroups. SPSS 26 was used for the statistical analysis unless stated otherwise. Statistical significance was set at p<.05 in all the analyses.
with CT) and 315 had CSF biomarkers available (Table 1). Individuals with and without neuroimaging biomarkers available did not differ in terms of age, sex, MMSE, or education (data not shown). Individuals with CSF biomarkers available were slightly younger (M=76.2, SD=5.6) than those without CSF biomarkers available (M=78.4, SD=6.1, p<.001), and were comparable in terms of sex, MMSE and education (data not shown). In the subsample with CSF biomarkers available, a total of 102 (32%) individuals had an abnormal CSF Aβ-42 biomarker and 48 (15%) had an abnormal CSF p-tau biomarker.

Characteristics of the subgroups
For the traditional subgrouping approach, the amnestic multidomain (61%) was the most common subgroup, followed by the non-amnestic multidomain (13%), subtle (11%), amnestic single domain (9%) and non-amnestic single domain (6%) subgroups. For the comprehensive subgrouping approach, the amnestic multidomain (53%) was again the most common subgroup, followed by the subtle (17%), amnestic single domain (12%), non-amnestic single domain (9%) and non-amnestic multidomain (9%) subgroups. For the data-driven subgrouping approach, the four-cluster solution was the most appropriate (please see Appendix B for the dendrogram (Fig B1) and clustering parameters for solutions ranging from 2 to 8 clusters (Table B1)). Based on their cognitive profiles, the resulting subgroups were labelled as predominantly "amnestic single domain" (36%), "amnestic multidomain" (30%), "subtle" (19%) and "major impairment" (15%), which predominantly involved attention/processing speed and visuospatial domains, in addition to memory. For a full characterisation of the subgroups and the percentage of participants with MCI in each please see Appendix C, Table C1.
To complement this qualitative description of subgroup frequencies, we used the Chi-squared test to assess the distribution of subgroups within each subgrouping approach. Distributions were significantly different from an even distribution in all three approaches, with coefficients being the highest in the traditional approach, hence indicating that the traditional approach captured subgroups more heterogeneous in size (χ²(4) = 842.571, p<.001) than both comprehensive (χ²(4) = 542.432, p<.001) and data-driven (χ²
Next, we conducted alluvial analysis to assess the correspondence across subgrouping approaches at the individual level (Fig 2). Visual inspection showed that individuals with the highest and lowest levels of cognitive performance were classified more consistently across approaches. In contrast, the discrepancy in classification was more noticeable for individuals with intermediate levels of cognitive performance. For example, some individuals classified as non-amnestic multidomain in the traditional approach were classified as non-amnestic single domain or even subtle in the comprehensive approach. Most SCI individuals were consistently classified as subtle across approaches. MCI individuals dominated the major impairment subgroup in the data-driven approach, as well as the amnestic multidomain subgroup in both the traditional and comprehensive approaches.
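The individual-level correspondence that an alluvial plot displays is, underneath, a cross-tabulation of each person's label under one approach against their label under another. The sketch below counts these flows for a handful of invented participants; the function name and the labels assigned are hypothetical and do not reproduce the study data.

```python
from collections import Counter
from typing import List, Tuple


def flow_counts(labels_a: List[str],
                labels_b: List[str]) -> Counter:
    """Count how many individuals move from each subgroup in approach A
    to each subgroup in approach B (the widths of alluvial bands)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both approaches must label the same individuals")
    return Counter(zip(labels_a, labels_b))


# Invented labels for six participants under two approaches.
trad = ["amnestic multidomain", "amnestic multidomain", "subtle",
        "non-amnestic multidomain", "non-amnestic multidomain", "subtle"]
comp = ["amnestic multidomain", "amnestic multidomain", "subtle",
        "non-amnestic single domain", "subtle", "subtle"]

for (source, target), n in flow_counts(trad, comp).most_common():
    print(f"{source} -> {target}: {n}")
```

Bands that connect identically named subgroups indicate agreement; bands that connect different subgroups correspond to the discrepancies described for individuals with intermediate performance.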

Medial temporal lobe atrophy and cerebrovascular disease
In the subsample with neuroimaging biomarkers available (N=585), a total of 204 (35%) individuals had an abnormal MTA score and 81 (14%) had an abnormal Fazekas score.

J o u r n a l P r e -p r o o f
To evaluate the ability of the different subgrouping approaches to detect abnormal neuroimaging biomarkers, we ran three binary logistic regression models, one per subgrouping approach. The dichotomised biomarkers were the outcome variable (normal vs. abnormal) and the subgroups were the predictors. The subtle subgroup had the lowest frequency of positive biomarkers and was thus treated as the reference group in all logistic regression models. Please see Table 2 for an overview of the results.
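When subgroup membership is the only predictor and one subgroup serves as the reference, the odds ratio Exp(B) for a subgroup equals the odds of an abnormal biomarker in that subgroup divided by the odds in the reference subgroup. The sketch below computes this from a two-by-two table with invented counts; the function name is hypothetical, and the Wald-type 95% interval is a standard approximation that is not necessarily the one reported in Table 2.

```python
from math import exp, log, sqrt
from typing import Tuple


def odds_ratio(abn_group: int, norm_group: int,
               abn_ref: int, norm_ref: int) -> Tuple[float, float, float]:
    """Odds ratio of an abnormal biomarker vs. the reference subgroup,
    with an approximate (Wald) 95% confidence interval."""
    ratio = (abn_group / norm_group) / (abn_ref / norm_ref)
    # Standard error of the log odds ratio from the four cell counts.
    se = sqrt(1 / abn_group + 1 / norm_group + 1 / abn_ref + 1 / norm_ref)
    b = log(ratio)  # corresponds to the regression coefficient B
    return ratio, exp(b - 1.96 * se), exp(b + 1.96 * se)


# Invented counts: 30 abnormal / 20 normal in a subgroup,
# 10 abnormal / 40 normal in the subtle (reference) subgroup.
ratio, lower, upper = odds_ratio(30, 20, 10, 40)
print(round(ratio, 3))  # 6.0
```

An interval that excludes 1 corresponds to a subgroup whose odds of an abnormal biomarker differ significantly from those of the reference subgroup.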
For MTA, the traditional approach provided the highest odds ratios. All subgroups had significantly higher odds ratios for an abnormal MTA score as compared to the reference subtle subgroup. We observed the same finding for the data-driven approach, but odds ratios were lower. In contrast, the comprehensive approach did not detect a significant risk for an abnormal MTA biomarker in the non-amnestic single domain subgroup, but it did for the other subgroups.
For Fazekas, only the data-driven approach was able to capture a significant risk for an abnormal Fazekas score in the major impairment subgroup (Table 2). The traditional and comprehensive approaches did not capture any significant risks.

Amyloid-beta and tau biomarkers
The logistic regression model for CSF Aβ-42 showed that the traditional and comprehensive approaches captured increased odds ratios for the two amnestic subgroups to have an abnormal CSF Aβ-42 biomarker (Table 2). The odds ratio (Exp(B) = 6.139) for the amnestic single domain subgroup in the traditional approach was higher than that of other subgroups in the traditional and comprehensive approaches (Exp(B) < 3.0).

Regarding CSF p-tau, none of the three subgrouping approaches found a significantly increased risk for an abnormal CSF p-tau biomarker in any of the subgroups (data not shown).

Discussion
This study investigated three different approaches for subgrouping MCI and SCI individuals and aimed to assess the implications of the discrepancies among them. These approaches were based on neuropsychological data assessed in a relatively large, real-world naturalistic unselected multi-centre sample. We found a generally good correspondence between subgrouping approaches, especially for the individuals with the highest and lowest neuropsychological performances. Both the data-driven and the comprehensive approach favoured the detection of individuals with less pronounced difficulties, as in the subtle subgroups and the amnestic single domain subgroups. Regarding imaging and CSF biomarkers, the comprehensive approach was the only one to capture subgroups with an increased risk for an abnormal MTA score. The data-driven approach was the only one to capture subgroups with an increased risk for an abnormal Fazekas score. The traditional approach was more successful at identifying individuals with a positive CSF Aβ-42 biomarker. None of the approaches was successful at capturing individuals with a positive p-tau biomarker.
The largest MCI subgroup in the two theory-based approaches was the amnestic multidomain subgroup, which comprised 52-61% of the sample. Other studies have also reported that the amnestic multidomain subgroup was the most common profile in their samples (McGuinness et al., 2015; Rapp et al.; Vos et al., 2013). However, there are also studies where the non-amnestic single domain profile was the largest subgroup (Clark et al., 2013; Zhang et al., 2012). Since the methods have been relatively stable across studies, these differences may be mostly attributed to differences in the source population.
Regarding the subtle subgroup, we found differences in size between the traditional and the comprehensive approach, in line with previous publications (Clark et al., 2013; Jak et al., 2009). In our three subgrouping approaches, there were individuals categorised as "subtle" but with a clinical MCI diagnosis (24-36% of each subtle subgroup). This may be due to several reasons. There may be an overestimation of neuropsychological deficit based on self- or informant-based reports (Edmonds et al., 2014). Alternatively, there could be deficits in one or more neuropsychological tests that we have not collected for the present study. Isolated poor performances are also common during neuropsychological testing, even in cognitively healthy older individuals. For example, one study found that almost 75% of cognitively normal participants tested worse than 1.3 standard deviations below the norm in at least one test during a long neuropsychological test session, and 20% had results of 2 standard deviations or lower in two separate tests (Palmer et al., 1998).
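The reason isolated low scores are common in long test sessions can be illustrated with simple probability. The sketch below is a deliberately simplified illustration and not an analysis from the cited study: it assumes normally distributed scores and, unrealistically, independent tests, so it only conveys the direction of the effect. The function names are hypothetical.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def prob_at_least_one_low(n_tests: int, cutoff_z: float = -1.3) -> float:
    """Chance that a healthy person scores below `cutoff_z` on at least
    one of `n_tests`, assuming independent, normally distributed scores."""
    p_single = normal_cdf(cutoff_z)
    return 1.0 - (1.0 - p_single) ** n_tests


for n in (1, 5, 10, 20):
    print(n, round(prob_at_least_one_low(n), 2))
```

Real test scores are positively correlated, so the true probabilities rise more slowly than in this sketch, but the chance of at least one low score still grows with the length of the battery.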
When we used a data-driven hierarchical cluster analysis for subgrouping, we observed that the four-subgroup solution was the most appropriate. This finding is in line with the usually reported three to four cluster solutions in the literature (Machulda et al., 2019). Our data-driven subgroups mainly seem to be divided by severity of symptoms rather than distribution of domain impairment as seen in some previous studies (Clark et al., 2013). By looking at the average scores in the different cognitive profiles, it seems that our subgroups mostly spanned across a spectrum of amnestic impairment from minimum impairment (subtle subgroup), to intermediate (amnestic single domain) to severe impairment (amnestic multidomain and major impairment with involvement of memory). The subtle subgroup in our data-driven approach mostly indicated a SCI profile rather than an MCI one. This is in line with Damian et al. (2013), who also used a clinical sample, Edmonds et al. (2015), who used Alzheimer's Disease Neuroimaging Initiative (ADNI) data, and Machulda et al. (2019), who used a community sample. As expected, the data-driven approach would explain more variance than the theory-based approaches in analyses like MANOVA and discriminant analysis. The MANOVA demonstrated that the differences between approaches were not influenced by confounders such as age, sex, education, and global cognition. The discriminant analysis provided the data to observe how the approaches perform in supervised analysis with cross-validation, while using the same neuropsychological tests across the three different approaches. Together with the Chi-squared analyses and the alluvial plots, these analyses showed a generally good correspondence between the three subgrouping approaches.
We found some other characteristics of the subgroups that are also in line with previous results. The amnestic single domain subgroup in the data-driven approach was of a similar proportion as previously reported (Damian et al., 2013; Edmonds et al., 2015; Machulda et al., 2019). However, our subtle subgroup, consisting of 19% of the sample, is relatively small when compared to other studies that have only included individuals with MCI. In those studies, the subtle subgroup varies in size from 14% (Machulda et al., 2019) to over 40% (Clark et al., 2013; Damian et al., 2013). This may be due to the large variance in clinical severity in our sample, partially due to the inclusion of both SCI and MCI patients, but also the relatively low level of dysfunction in our amnestic single domain subgroup. Further, the subtle subgroup has been discussed either as a false-positive (Edmonds et al., 2015) or as a heterogeneous subgroup in its potential progression toward MCI (Machulda et al., 2019).
In terms of neuroimaging biomarkers, more individuals had an abnormal MTA score than an abnormal Fazekas score. This finding suggests more focal brain neurodegeneration, presumably related to aetiologies such as Alzheimer's disease, hippocampal sclerosis, or TDP-43, than diffuse white matter neurodegeneration related to cerebrovascular disease in this cohort. For the traditional subgrouping approach, there was a 4-fold increase in the risk of an abnormal MTA score in the amnestic single domain subgroup compared to the subtle subgroup. The risk was lower for the other subgroups, which may suggest a higher degree of contribution of other factors to the cognitive impairment in the amnestic multidomain as well as the non-amnestic subgroups. This is in line with previous research suggesting that the amnestic profiles in MCI patients are associated with development of Alzheimer's disease dementia (Winblad et al., 2004; Yaffe et al., 2006) and that non-amnestic MCI presentations are less associated with temporal atrophy but rather have diffuse atrophy (Zhang et al., 2012). With the data-driven approach, there was almost a 3-fold increase in the risk of having an abnormal Fazekas score in individuals belonging to the major impairment subgroup. White matter hyperintensities, which the Fazekas score measures, have been linked to increased risk of developing vascular dementia or stroke (Wardlaw et al., 2015). In addition, vascular dementia has been associated with multidomain MCI presentations (Winblad et al., 2004; Yaffe et al., 2006). Hence, diffuse vascular aetiology could be one of the contributing factors to the neuropsychological impairment in our major impairment subgroup. Overall, the comprehensive approach performed the best for MTA: there was a heightened risk for abnormal MTA in the amnestic subgroups and the non-amnestic multidomain subgroup.
Although we only assessed MTA, it is possible that the non-amnestic multidomain subgroup has a more global spread of atrophy whilst the amnestic subgroups have a more focal MTA. In the clinic this could potentially be informative in terms of discussions about aetiology and patterns of atrophy underlying specific cognitive profiles. In contrast, the data-driven approach performed the best for Fazekas, capturing an increased risk for an abnormal Fazekas score specific to the major impairment subgroup. This subgroup showed severe neuropsychological deficits but performed similarly on memory and language to the amnestic multidomain subgroup. Identifying this subgroup in the clinic is potentially important since these individuals may benefit from interventions focusing on cardiovascular health, perhaps decreasing the risk of further cognitive decline.
In our cohort, 32% of individuals had a positive CSF Aβ-42 biomarker and 15% had a positive CSF p-tau biomarker. This suggests that the pathology in MCI and SCI in MemClin may be less connected to Alzheimer's disease pathology or that our clinical cut-offs may not be inclusive enough. The theory-based methods showed that there was an increased risk for a positive CSF Aβ-42 biomarker in the amnestic subgroups, compared to the subtle subgroup. For the traditional approach, we found a 6-fold risk increase for the amnestic single domain subgroup compared to the subtle subgroup, again in line with previous publications (Winblad et al., 2004; Yaffe et al., 2006). Whilst the MemClin-cohort is a clinical sample, it is naturalistic and as such patients are heterogeneous both in terms of potential underlying aetiology and in comorbidities and other potential confounders. Therefore, it will be important to gather longitudinal data to ascertain what dementias our current individuals progress to, and to be able to characterize the differences between individuals who progress and those who are stable over time. Previous research suggests that amnestic MCI may progress to dementia quicker, but that non-amnestic single domain is associated with an increased risk of death (Yaffe et al., 2006).
There are some limitations in the present study. The use of the SCI subgroup as the reference group for all neuropsychological tests may make some of the findings less generalisable, as these are not readily available norms. However, this approach may be more consistent in potential bias than the use of multiple norms for the neuropsychological tests, which use different cohorts and treat age, sex, and education differently. Further, we combined the SCI and MCI groups for our analysis.
A limitation of this choice is that the data-driven approach could be seen as stage-based, but it allows us to explore the difficulty in clinical practice of differentiating between the early stages of MCI and SCI. Another limitation is the reduced sample size for the imaging and CSF biomarker analyses. However, it allowed us to compare the subgrouping approaches in the context of MTA, cerebrovascular disease, as well as amyloid-beta and tau-related pathologies. The visual ratings were collected directly from the patients' medical records; therefore, we could not assess inter-rater reliability, which could potentially affect the results. Finally, education or intelligence influences the test scores, which could explain the lower education in the groups that have worse performance in our cohort. Intelligence in early age is highly correlated with intelligence in old age (Deary et al., 2013). Therefore, SCI patients in our sample with low test performances may always have had a lower cognitive ability. Future longitudinal analyses will aid in the understanding of the risk of progression and of whether an approach was over- or under-classifying individuals as MCI. The strengths of the present data include the relatively large sample, the multi-disciplinary comprehensive clinical investigation, and the naturalistic nature of MemClin, which is the main novelty of this study. Our results highlight the heterogeneity and complexity of the SCI and MCI conditions as they are presented in the clinic, and generalization to the real world is likely higher than when using highly selected research-oriented cohorts.

Conclusion
We have shown that the operationalisation of MCI and SCI subgroups using theory-driven and data-driven approaches impacts the size and characteristics of the obtained subgroups. The traditional approach favours the amnestic multiple domain subgroup and has less potential to recognise the subtle subgroup. In contrast, the comprehensive approach favours the subtle subgroup. The data-driven approach was more influenced by the severity of the cognitive impairment, mostly along an amnestic spectrum from minimal to severe. Although the three approaches are rather concordant with one another, they capture some differences. If a choice must be made, our data suggest that the comprehensive approach has the best overall performance. However, the choice of the subgrouping approach may depend on the context of use and the clinical or scientific goal.
We recommend that the traditional approach is used to identify individuals with a higher risk of having an abnormal amyloid-beta biomarker (linked to the amnestic single domain subgroup); the comprehensive approach is used to identify individuals with a higher risk of a medial temporal lobe-driven cognitive phenotype; and the data-driven approach is used to identify individuals with severe impairment with a major cerebrovascular component. If the aim is to include as many individuals at risk as possible, independently of their underlying pathology, the traditional approach may be preferable. The comprehensive approach may be better for finding the high-risk individuals, thus coming down to an issue of sensitivity versus specificity. In the wider sense, the current study helps to move forward our current understanding of the clinical and biological heterogeneity within MCI and SCI, particularly in the unselected memory clinic setting. In the longer run we expect this growing knowledge will allow for more personalised care in terms of treatments and prevention.
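The sensitivity versus specificity trade-off mentioned above can be made concrete with a small sketch. It assumes a later reference outcome (for example, progression at follow-up) is available for each individual, which is not yet the case for the present cohort. The counts and the labels `inclusive` and `restrictive` are hypothetical, chosen only for illustration; they are not MemClin results and do not stand for the measured performance of any of the three approaches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of classified individuals against a later reference outcome."""
    true_pos: int   # flagged as at risk, later progressed
    false_pos: int  # flagged as at risk, remained stable
    false_neg: int  # not flagged, later progressed
    true_neg: int   # not flagged, remained stable


def sensitivity(c: ConfusionCounts) -> float:
    """Share of progressors that the approach flagged."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """Share of stable individuals that the approach left unflagged."""
    return c.true_neg / (c.true_neg + c.false_pos)


# Hypothetical counts, invented for illustration only (not study data).
# Both scenarios describe the same 200 people, 50 of whom progress.
inclusive = ConfusionCounts(true_pos=45, false_pos=60, false_neg=5, true_neg=90)
restrictive = ConfusionCounts(true_pos=30, false_pos=15, false_neg=20, true_neg=135)

for name, counts in (("inclusive", inclusive), ("restrictive", restrictive)):
    print(f"{name}: sensitivity={sensitivity(counts):.2f}, "
          f"specificity={specificity(counts):.2f}")
```

In this toy scenario the more inclusive criteria catch more of the individuals who progress but also flag more who remain stable, whereas the more restrictive criteria do the reverse.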

Disclosure statements
The authors report no conflict of interest.
financial support.
We would like to thank all the clinical psychologists involved in the data collection in this project and the other personnel involved at each clinic for making this research possible.
Finally, a heartfelt thank you to the patients who have consented to the project.

Data availability
The data, whilst pseudonymised, are considered personal data and may only be made available upon reasonable request if legal and ethical requirements can be met.

Fig 1. Discriminant analysis across neuropsychological subgroups. The figure shows the relationship between functions 1 and 2 from the discriminant analysis, displaying every observation and the centroids of the subgroups. Separate discriminant analyses were performed for each subgrouping approach and are displayed to illustrate how well the different approaches disentangle the heterogeneity in the cognitive data (based on the six cognitive domains).

Verification
-The current manuscript has not been published previously.
-The current manuscript is not under consideration for publication elsewhere.
-Publication of the current manuscript is approved by all authors and the responsible authorities where the work was carried out and, if accepted, it will not be published elsewhere in the same form, in English or in any other language, including electronically without the written consent of the copyright holder.

Highlights:
-How we determine if an individual has mild cognitive impairment matters.
-Individuals may be put into different groups depending on subgrouping method.
-Different methods lead to groups with different biological profiles.