Participants. The study comprised 1063 participants between 50 and 98 years (mean age = 71.56, SD age = 8.42, 64.6% women, mean years of education = 12.01, SD years of education = 5). The recruitment was performed across 13 sites in 9 countries, 4 HICs (Chile, France, Italy, United Kingdom, n = 476) and 5 UMICs (Argentina, Brazil, Colombia, Peru, Mexico, n = 587) as classified according to the World Bank (32). The sample included HCs and individuals with different conditions associated with aging (SCC, MCI, AD, and bvFTD, see below). Participants were recruited from different international consortia, including the Multi-Partner Consortium to Expand Dementia Research in Latin America (ReDLat) (78), the International Network on Social Condition Disorders (INSCD) (15), and the Geroscience Center for Brain Health and Metabolism (GERO) (79).
All participants underwent extensive neurological, neuropsychological, and neuropsychiatric examinations comprising semistructured interviews and standardized cognitive assessments. HCs (n = 325) had preserved cognition and no history of neurological or psychiatric conditions. Participants with SCC (n = 145) presented cognitive complaints either self-reported or reported by a knowledgeable informant, scored 0.5 or less on the Clinical Dementia Rating scale (CDR) (80), and had preserved functional abilities (79). The MCI group (n = 96) was composed of participants fulfilling the same criteria as those with SCC but scoring < 22 on the Montreal Cognitive Assessment (MoCA) (34), the most frequently used cut-off to detect MCI (81). Individuals with AD (n = 389) fulfilled the National Institute of Neurological and Communicative Disorders and Stroke-Alzheimer’s Disease and Related Disorders Association (NINCDS-ADRDA) criteria (82), were in the early and middle stages of the disease, presented memory deficits, and were functionally impaired. Individuals with bvFTD (n = 114) fulfilled the revised Rascovsky criteria (83), were in the early and middle stages of the disease, exhibited prominent behavioral changes, lacked primary language deficits, and had functional impairment. Supporting the clinical diagnosis of neurodegenerative conditions, an analysis of a subsample of participants with available structural MRI data revealed temporal and frontoparietal atrophy in the AD group (46), and fronto-temporo-insular atrophy in the bvFTD group (84) (Fig. S1 and Table S8). Demographic and cognitive information of each participant group is provided in Table S9. The IRB of each recruitment site and the Executive Committee of the ReDLat consortium approved this study. All participants signed informed consent in accordance with the Declaration of Helsinki.
Social cognition assessment. Participants completed the Mini-Social Cognition and Emotional Assessment (Mini-SEA), a short battery designed to assess two social cognition domains: facial emotion recognition and mentalizing (31). In the facial emotion recognition subtest, participants are asked to identify the emotion being depicted by an individual in 35 different photos from the Ekman series. The following options are provided: fear, sadness, disgust, anger, happiness, surprise, and neutral. Each correct item is given 1 point. The mentalizing subtest consists of an adaptation of the Faux Pas test. Participants are presented with 10 short stories and asked to identify if the protagonist committed an unintended transgression of a social rule (i.e., a faux pas). Each story also includes two control questions to assess general understanding. The maximum score for this subtest is 40 points. The scores of emotion recognition and mentalizing subtests are converted to a score of 15 each and then summed, resulting in a total score of 30, with higher scores representing better performance. From the full sample, 6.11% of participants (n = 65) were removed for lacking a valid score either in the emotion recognition or the mentalizing subtest, resulting in a final sample of 998 individuals.
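The scoring arithmetic described above can be sketched as follows. This is a minimal illustration, assuming a raw maximum of 35 for facial emotion recognition (one point per photo) and 40 for mentalizing, each rescaled proportionally to 15 points; the function and constant names are hypothetical and are not part of the Mini-SEA materials.

```python
EMOTION_MAX = 35      # one point per photo, as described above
MENTALIZING_MAX = 40  # maximum score of the adapted Faux Pas subtest
SUBSCALE_POINTS = 15  # each subtest is converted to a 15-point scale


def rescale(raw, raw_max, new_max=SUBSCALE_POINTS):
    """Rescale a raw score proportionally (assumption: plain linear conversion)."""
    if not 0 <= raw <= raw_max:
        raise ValueError(f"raw score {raw} outside 0..{raw_max}")
    return raw * new_max / raw_max


def mini_sea_total(emotion_raw, mentalizing_raw):
    """Return the 0-30 total as the sum of the two converted subscores."""
    return rescale(emotion_raw, EMOTION_MAX) + rescale(mentalizing_raw, MENTALIZING_MAX)


print(mini_sea_total(28, 32))  # 12 + 12 points, prints 24.0
```

Under this reading, a participant missing either raw subscore has no valid total, which corresponds to the exclusion rule stated above.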
Predictors of social cognition. The set of potential predictors of social cognition included:
(a) Diagnosis, HCs, SCC, MCI, AD, and bvFTD.
(b) Demographics, sex (female, male), age (years), education (years), and country income (HICs, UMICs) following the World Bank classification (32).
(c) Cognition
(c.1) Cognitive score, derived from harmonized scores in the Addenbrooke’s Cognitive Examination III (ACE-III) (35), the Mini-Mental State Examination (MMSE) (33), and the MoCA (34) –see SI Extended Methods for details about these tools, Table S9 for the number of participants assessed with each tool in each group, and Data harmonization section.
(c.2) Executive score, derived from harmonized scores in the INECO Frontal Screening (IFS) (37) and the Frontal Assessment Battery (FAB) (36) –see SI Extended Methods, Table S9, and Data harmonization section.
(d) Brain reserve
(d.1) Grey matter volume, average volume of key hubs of the SN, the DMN, the EN, the MN, and the VN from the Automated Anatomical Labeling (AAL) atlas (85) calculated using voxel-based morphometry (VBM) analysis (see below).
(d.2) Functional connectivity, average connectivity strength of the SN, the DMN, the EN, the VN, and the MN calculated via seed analysis of the fMRI resting-state series (see below).
(e) Motion artifacts, average translation and rotation movements estimated during the preprocessing of the fMRI sequence.
Neuroimage acquisition and preprocessing. This section is reported following recommendations from the Organization for Human Brain Mapping (86). Whole-brain structural 3D T1-weighted and resting-state sequences were obtained for 598 (195 HCs, 91 SCC, 53 MCI, 194 AD, 65 bvFTD) and 388 (125 HCs, 91 SCC, 52 MCI, 82 AD, 38 bvFTD) participants, respectively, across acquisition centers. Demographic and cognitive information of these subsamples is provided in Table S10 and Table S11. Scanning protocols followed by each center are detailed in Table S12 and Table S13. Structural MRI scans were preprocessed using the DARTEL Toolbox following standard procedures for VBM (38) through the Statistical Parametric Mapping software (SPM12, https://www.fil.ion.ucl.ac.uk/spm/software/spm12/). Functional images were preprocessed using the Data Processing Assistant for Resting-State fMRI toolbox (DPARSF V4.4, http://rfmri.org/DPARSF) following published procedures (39) –see details in SI Extended Methods. Six movement parameters (right, forward, up, pitch, roll, yaw) were estimated during realignment to calculate average translation and rotation movements per participant (group statistics are reported in Table S14).
Data harmonization. To harmonize the available data and increase the number of participants with homogeneous cognitive and executive measures, two procedures were applied. First, cognitive screening measures were harmonized using equivalence tables (49, 50), as previously reported (87). This procedure allows for estimations of MoCA and ACE scores using MMSE scores, and estimations of the MMSE scores using MoCA and ACE scores. As a result, a total of three new converted-harmonized variables were added. Then, the MMSE and MoCA scores were transformed from a 0-30 to a 0-100 scale and averaged with the ACE score to create a single cognitive score per participant (scale 0-100). All participants had a cognitive score. Finally, IFS and FAB scores were also transformed into a 0-100 scale and averaged to obtain a single executive score per participant. In total, 833 participants had an executive score.
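The rescaling and averaging step for the cognitive score can be illustrated with a short sketch. It assumes that each participant already has an MMSE, MoCA, and ACE value, either measured or estimated from the equivalence tables (which are not reproduced here); the function names are hypothetical.

```python
from statistics import mean


def to_percent_scale(score, max_score):
    """Map a score from 0..max_score onto 0..100."""
    return score * 100 / max_score


def cognitive_score(mmse, moca, ace):
    """Average MMSE and MoCA (both 0-30) with ACE (0-100) on a common 0-100 scale.

    Each argument is either the measured score or its estimate from the
    equivalence tables, which this sketch does not implement.
    """
    return mean([to_percent_scale(mmse, 30), to_percent_scale(moca, 30), ace])


print(cognitive_score(mmse=27, moca=24, ace=85))  # 90, 80 and 85 average to 85.0
```

The executive score would follow the same pattern, passing the maximum of each executive instrument to `to_percent_scale` before averaging.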
Second, we calculated z-scores for demographic (sex, age, education, country income), cognitive (cognitive score, executive score), grey matter, functional connectivity, and motion artifacts variables. For neuroimaging variables, z-scores were estimated using normative data from each fMRI acquisition center according to the following equation:
$${x}_{z}=\frac{x-{\mu }}{s}$$
where:
\({x}_{z}\) is the new value,
\(x\) is the original raw score,
\({\mu }\) is the mean score for HCs from the center to which the participant belongs, and
\(s\) is the standard deviation for HCs from the site or center to which the participant belongs.
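As an illustration of the center-wise normalization defined above, the following sketch computes the HC mean and standard deviation per acquisition center and applies them to every participant of that center. The record layout, the function names, and the use of the sample standard deviation (n − 1 denominator) are assumptions made for the example; the data are synthetic.

```python
from collections import defaultdict
from statistics import mean, stdev


def center_norms(records):
    """Mean and SD of healthy controls (HCs) per acquisition center."""
    by_center = defaultdict(list)
    for rec in records:
        if rec["group"] == "HC":
            by_center[rec["center"]].append(rec["value"])
    # stdev() is the sample SD (n - 1 denominator); this choice is an assumption.
    return {center: (mean(vals), stdev(vals)) for center, vals in by_center.items()}


def z_score(rec, norms):
    """Apply x_z = (x - mu) / s with the norms of the participant's own center."""
    mu, s = norms[rec["center"]]
    return (rec["value"] - mu) / s


# Synthetic records for illustration only.
records = [
    {"center": "A", "group": "HC", "value": 10.0},
    {"center": "A", "group": "HC", "value": 12.0},
    {"center": "A", "group": "HC", "value": 14.0},
    {"center": "A", "group": "AD", "value": 8.0},
    {"center": "B", "group": "HC", "value": 100.0},
    {"center": "B", "group": "HC", "value": 110.0},
    {"center": "B", "group": "HC", "value": 120.0},
    {"center": "B", "group": "bvFTD", "value": 95.0},
]
norms = center_norms(records)
z_values = [z_score(rec, norms) for rec in records]
```

In this toy data set the two centers report values on very different raw scales, yet the patients' z-scores (−2.0 and −1.5) are directly comparable because each is expressed relative to the HCs of its own center.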
Data imputation. A sklearn iterative imputer with Bayesian ridge regression (51) was used to impute missing values for age (n = 4), education (n = 2) and executive score (n = 165). This algorithm applies a multivariate imputing strategy modeling a column with missing values as a function of other features and using the estimate for imputation. Each feature is imputed sequentially allowing the usage of prior imputed values on the model that predicts later features. This process is repeated several times, allowing increasingly better estimates of missing values to be calculated as the missing values for each feature are estimated.
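A minimal sketch of this imputation strategy with scikit-learn is given below. The synthetic data, the positions of the missing values, and the settings `max_iter=10` and `random_state=0` are assumptions for illustration; the study's actual feature matrix and imputer settings are not specified in this section.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.linear_model import BayesianRidge

# Synthetic data for illustration only: age, education, executive score.
rng = np.random.default_rng(0)
age = rng.uniform(50, 98, size=60)
education = rng.uniform(0, 20, size=60)
executive = 90 - 0.5 * (age - 50) + 1.0 * education + rng.normal(0, 3, size=60)
X = np.column_stack([age, education, executive])

# Remove a few entries to mimic missing values.
X_missing = X.copy()
X_missing[[3, 17], 0] = np.nan
X_missing[[8], 1] = np.nan
X_missing[[5, 20, 41], 2] = np.nan

# Each feature with missing values is regressed on the others, round-robin,
# and the cycle is repeated up to max_iter times.
imputer = IterativeImputer(estimator=BayesianRidge(), max_iter=10, random_state=0)
X_imputed = imputer.fit_transform(X_missing)
```

Observed entries are left untouched; only the missing cells are replaced by the model's estimates.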
SVR models. To generate predictions of continuous variables (Mini-SEA emotion recognition, mentalizing, and total scores) from multimodal features (diagnosis, demographics, cognition, brain reserve, motion artifacts), we used SVR models. SVR is a variation of support vector machine which allows linear and non-linear regression. SVR transforms the feature space to establish a hyperplane that best fits the training data, while also minimizing the generalization error on new, unseen data (48). The hyperplane is defined as the set of all points x in the feature space such that:
$$w\cdot x+b=0$$
where \(w\) is the weight vector, \(b\) is the bias term, and · denotes the dot product.
The SVR model seeks to find the weight vector \(w\) and bias term \(b\) that satisfy this constraint, while also minimizing the distance between the hyperplane and the training data. The distance is measured using a loss function, typically the ε-insensitive loss:
$$L\left(y,\widehat{y}\right)=\max\left(\left|y-\widehat{y}\right|-{\epsilon },0\right)$$
where \(y\) is the true target value, \(\widehat{y}\) is the predicted target value, and \({\epsilon }\) is a small constant that defines the width of the margin around the hyperplane. The loss function penalizes errors that exceed \(\epsilon\), but ignores errors that fall within \(\epsilon\).
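The behavior of the ε-insensitive loss can be checked with a few lines of code. This sketch simply evaluates the formula above for single predictions; the function name and the example values are illustrative.

```python
def epsilon_insensitive_loss(y, y_hat, epsilon):
    """Return max(|y - y_hat| - epsilon, 0): errors within epsilon cost nothing."""
    return max(abs(y - y_hat) - epsilon, 0.0)


# An error of 0.05 lies inside a band of width 0.1 and is ignored.
print(epsilon_insensitive_loss(10.0, 10.05, epsilon=0.1))  # prints 0.0

# An error of 2.0 exceeds a band of width 0.5 by 1.5.
print(epsilon_insensitive_loss(10.0, 12.0, epsilon=0.5))   # prints 1.5
```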
To find the optimal weight vector \(w\) and bias term \(b\), SVR introduces two slack variables, \({\xi }_{i}\) and \(\widehat{{\xi }_{i}}\), for each training example, which allow targets to lie more than \(\epsilon\) above and below the fitted function, respectively. The optimization problem for SVR is then given by:
Minimize:
$$\frac{1}{2}{\left|\left|\text{w}\right|\right|}^{2}+\text{C}\left({\sum }_{\text{i}=1}^{\text{n}}\left({{\xi }}_{\text{i}}+\widehat{{{\xi }}_{\text{i}}}\right)\right)$$
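Subject to:
$${y}_{i}-\left(w\cdot {x}_{i}+b\right)\le \epsilon +{\xi }_{i}$$
$$\left(w\cdot {x}_{i}+b\right)-{y}_{i}\le \epsilon +\widehat{{\xi }_{i}}$$
$${\xi }_{i},\widehat{{\xi }_{i}}\ge 0,\quad i=1,\dots ,n$$
where \({x}_{i}\) and \({y}_{i}\) are the feature vector and target value of the \(i\)-th training example, \(C\) is a hyperparameter that controls the trade-off between the margin width and the number of violations allowed, and \(n\) is the number of training examples. The first term in the objective function encourages a wide margin, while the second term penalizes violations of the margin and the \(\epsilon\)-insensitive loss.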
SVR can be extended to handle non-linear regression tasks by using a kernel function to map the input data to a higher-dimensional feature space, where a linear function may fit the data adequately. In its dual form, the optimization problem then becomes:
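Maximize:
$$-\frac{1}{2}{\sum }_{\text{i},\text{j}=1}^{\text{n}}\left({{\alpha }}_{\text{i}}-\widehat{{{\alpha }}_{\text{i}}}\right)\left({{\alpha }}_{\text{j}}-\widehat{{{\alpha }}_{\text{j}}}\right)\text{K}\left({\text{x}}_{\text{i}},{\text{x}}_{\text{j}}\right)-\epsilon {\sum }_{\text{i}=1}^{\text{n}}\left({{\alpha }}_{\text{i}}+\widehat{{{\alpha }}_{\text{i}}}\right)+{\sum }_{\text{i}=1}^{\text{n}}{\text{y}}_{\text{i}}\left({{\alpha }}_{\text{i}}-\widehat{{{\alpha }}_{\text{i}}}\right)$$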
Subject to:
$${\sum }_{\text{i}=1}^{\text{n}}\left({{\alpha }}_{\text{i}}-\widehat{{\alpha }_{i}}\right)=0$$
$$0\le {{\alpha }}_{\text{i}},\widehat{{\alpha }_{i}}\le \text{C}$$
where \(K\left({x}_{i},{x}_{j}\right)\) is the kernel function that computes the inner product between the mapped feature vectors, and \({\alpha }_{i}\) and \(\widehat{{\alpha }_{i}}\) are Lagrange multipliers that determine the importance of each training example in defining the hyperplane. The kernel function allows SVR to learn complex, non-linear relationships between the input features and the target variable.
Hyperparameter tuning. Bayesian optimization (52) with k-fold cross-validation (k = 3) was applied to tune the hyperparameters. A Radial Basis Function kernel was used with an optimized gamma value. Models with the best hyperparameters were trained on a training set (70%) and tested on a testing set (30%), with 10 repetitions –see SI Extended Methods.
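For reference, the Radial Basis Function kernel has the standard form \(K\left(x,z\right)=\exp\left(-\gamma {\left|\left|x-z\right|\right|}^{2}\right)\), where gamma controls how quickly similarity decays with distance. The sketch below evaluates this formula for two feature vectors; the function name and the example values are illustrative and are not the tuned settings of the study.

```python
import math


def rbf_kernel(x, z, gamma):
    """Return exp(-gamma * ||x - z||^2) for two equal-length feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


same = rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5)       # identical points
near = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5)       # squared distance 2
sharper = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=5.0)    # larger gamma, faster decay
```

A larger gamma makes the kernel more local, so each training example influences a smaller neighborhood; this is why gamma is tuned rather than fixed.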
Feature selection. We used a backward elimination approach (53) to select the most significant predictors for each model. For each iteration, we dropped the predictor with the largest P value until we reached a statistically significant model, a predictor with a P value that became statistically significant, or a model with two predictors.
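The elimination loop can be sketched generically as follows. The `fit` callback is hypothetical: it stands for whatever routine returns the model-level P value and one P value per predictor. The stopping rule encodes one reading of the description above (stop when the model is significant, when the weakest remaining predictor is significant, or when two predictors remain), which is an assumption rather than the authors' exact implementation.

```python
def backward_elimination(predictors, fit, alpha=0.05, min_predictors=2):
    """Drop the predictor with the largest P value until a stopping rule is met.

    `fit(current)` is a hypothetical callback returning
    (model_p_value, {predictor_name: p_value}).
    """
    current = list(predictors)
    while True:
        model_p, p_values = fit(current)
        weakest = max(current, key=lambda name: p_values[name])
        done = (
            model_p < alpha                 # the model is significant
            or p_values[weakest] < alpha    # the weakest predictor is significant
            or len(current) <= min_predictors
        )
        if done:
            return current
        current.remove(weakest)


def toy_fit(current):
    """Toy stand-in with fixed P values and a never-significant model."""
    fixed_p = {"age": 0.80, "sex": 0.30, "education": 0.20, "cognition": 0.01}
    return 0.50, {name: fixed_p[name] for name in current}


selected = backward_elimination(["age", "sex", "education", "cognition"], toy_fit)
print(selected)  # prints ['education', 'cognition']
```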
Statistical analyses
VBM analysis. Using VBM preprocessed structural images, we calculated the average grey matter volume (mL, corrected by total intracranial volume) of 116 regions of the AAL atlas (85) to create grey matter volume indexes of the main hubs of the SN [average of the bilateral anterior cingulum and insula volume (40)], the DMN [average of the bilateral medial frontal and posterior cingulate volume (41)], the EN [average of the bilateral middle frontal and inferior parietal volume (66)], the VN [average of the bilateral occipital volume (66)], and the MN [average of the bilateral precentral volume (66)].
Functional connectivity analysis. The functional connectivity strength of the SN, the DMN, the EN, the VN, and the MN was calculated using seed analysis. Two bilateral seeds were placed on cubic regions of interest (voxel size = 7x7x7) for each network: the dorsal anterior cingulate cortex for the SN (40), MNI coordinates 10, 34, 24 and −10, 34, 24; the posterior cingulate cortex for the DMN (41), MNI coordinates 3, −54, 27 and −3, −54, 27; the middle frontal gyri for the EN (42), MNI coordinates 30, −2, 62 and −30, −2, 62; the primary visual cortex for the VN (43), MNI coordinates 8, −92, 8 and −8, −92, 8; and the primary motor cortex for the MN (44), MNI coordinates 32, −30, 68 and −32, −30, 68. The Pearson correlation coefficient between the averaged BOLD signal of each pair of seeds and voxels comprised in standard masks (88) typically involved in each resting-state network was used to extract one feature per network for each participant. The statistical significance of the resting-state networks was tested by comparing them with null surrogate models. This approach enables robust statistical evaluations to ensure that the results observed are not obtained by chance but represent a true characteristic of the underlying system (89). The surrogate data technique is based on comparing a particular property of the data (a discriminating statistic) with the distribution of the same property calculated in a set of constructed signals (surrogates) that match the original data set but do not possess the property that is being tested. To this end, we used Fourier transform-based surrogates to recreate the brain’s complex-system dynamics, including uncorrelated and correlated noise, coupling between different brain areas, and synchronization.
We found that all the computed resting-state networks were statistically significant against null connectivity (SN: P = 0.02, DMN: P = 0.02, EN: P = 0.03, VN: P = 0.02, MN: P = 0.03), further corroborating our connectivity methods.
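The logic of a Fourier transform-based surrogate test can be illustrated with a small sketch. It uses phase randomization, which keeps the amplitude spectrum of a signal while destroying its phase relationships with other signals. The choice of the absolute Pearson correlation as discriminating statistic, the number of surrogates, and the synthetic time series are all assumptions for the example; the exact surrogate procedure used in the study is not detailed in this section.

```python
import numpy as np


def phase_randomized_surrogate(signal, rng):
    """Surrogate with the same amplitude spectrum as `signal` but random phases."""
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    randomized = np.abs(spectrum) * np.exp(1j * phases)
    randomized[0] = spectrum[0]          # keep the mean (zero-frequency term)
    if n % 2 == 0:
        randomized[-1] = spectrum[-1]    # Nyquist term must stay real for even n
    return np.fft.irfft(randomized, n=n)


def surrogate_p_value(seed, target, n_surrogates=199, random_state=0):
    """P value of |r(seed, target)| against phase-randomized seeds (assumed statistic)."""
    rng = np.random.default_rng(random_state)
    observed = abs(np.corrcoef(seed, target)[0, 1])
    null = np.array([
        abs(np.corrcoef(phase_randomized_surrogate(seed, rng), target)[0, 1])
        for _ in range(n_surrogates)
    ])
    # Add-one correction so that the P value is never exactly zero.
    return (1 + np.sum(null >= observed)) / (1 + n_surrogates)


# Synthetic time series for illustration only.
rng = np.random.default_rng(42)
seed_signal = rng.normal(size=200)
coupled = seed_signal + 0.3 * rng.normal(size=200)
p_coupled = surrogate_p_value(seed_signal, coupled)
```

Because the surrogates share the spectral content of the seed signal but not its coupling with the target, a small P value indicates that the observed correlation is unlikely to arise from spectral properties alone.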
Age effects on social cognition. Simple linear regression analyses were used to evaluate the predictive value of age on emotion recognition, mentalizing, and the social cognition total score. Analyses were performed in R. The alpha threshold was set at P < 0.05. Effect size was evaluated with f², following Cohen’s criteria (90), where 0.02 indicates a small effect, 0.15 indicates a medium effect, and 0.35 indicates a large effect.
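For a regression model, Cohen's f² is obtained from the coefficient of determination as f² = R² / (1 − R²). The sketch below computes it and applies the thresholds listed above; the function names and the label "negligible" for values below 0.02 are assumptions for the example.

```python
def cohens_f2(r_squared):
    """Cohen's f2 for a regression model: R2 / (1 - R2)."""
    if not 0 <= r_squared < 1:
        raise ValueError("R2 must lie in [0, 1)")
    return r_squared / (1 - r_squared)


def effect_label(f2):
    """Classify f2 with the thresholds 0.02 (small), 0.15 (medium), 0.35 (large)."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "negligible"  # label below the smallest threshold is an assumption


f2 = cohens_f2(0.2)          # 0.2 / 0.8
print(effect_label(f2))      # prints medium
```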
Social cognition performance across diagnostic groups. Linear mixed-effects models (47) were fitted in R to examine between-group differences in emotion recognition, mentalizing, and the total score. Sex, age, and education were entered in the model as covariates of no interest, and participant’s country of origin was entered as a random effect. Post-hoc tests were corrected using the Sidak method. The alpha threshold was set at P < 0.05. Effect size was evaluated with ηp² (91), where 0.01 indicates a small effect, 0.06 indicates a medium effect, and 0.14 indicates a large effect.
SVR model estimation and performance assessment. We trained and tested 1000 optimized SVR regressors for each outcome variable to obtain the final models using a bootstrap approach. We applied P value correction for false discovery rate using statsmodels (version 0.13.2) and set aside a median-stratified 30% of the data as the test set. To evaluate the models’ performance, we used the following statistics: the coefficient of determination R², 95% CI, Cohen’s f² (90), Fisher F test, and the largest corrected P values. Outlier results (R² < IQR − 1.5 * SD or R² > IQR + 1.5 * SD) were discarded to improve average estimates.
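To make the false discovery rate correction concrete, the following sketch implements the Benjamini-Hochberg adjustment in plain Python. This is a stand-in for illustration: the study reports using statsmodels, and the specific FDR procedure is not named in this section, so the choice of Benjamini-Hochberg is an assumption.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P value, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
corrected = benjamini_hochberg(raw)
largest_corrected = max(corrected)  # the summary statistic reported per model
```

For these four raw values the adjusted values are approximately 0.02, 0.04, 0.04, and 0.02, so the largest corrected P value is about 0.04.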