Multiscale neural signatures of major depressive, anxiety, and stress-related disorders

Significance Major depressive, anxiety, and stress-related disorders are highly comorbid and may affect similar neurocircuitry and cognitive processes. However, the neurocircuitry underlying shared dimensions of cognitive impairment is unclear and holds the promise of reimagining psychiatric nosology. Here we leverage population imaging data (n = 27,132) to show that while major depressive and anxiety disorders share functional and structural neural signatures, stress-related disorders are distinct from these two conditions. We report that better cognitive function is associated with lower connectivity of specific nodes of the default mode and frontoparietal networks. These findings provide population benchmarks for brain–cognition associations in healthy participants and those with lifetime major depressive and anxiety disorders, advancing our understanding of intrinsic brain networks underlying cognitive dysfunction.


Cognitive data
Data processing. Participants with times to complete the Trailmaking test (TMT) that exceeded 4 standard deviations from the mean were excluded; participants who took less than 100 second to complete TMT alphanumeric path were excluded; TMT alphanumeric score was calculated by penalizing each error made with a 5 second penalty and adding this penalty time to the time to complete alphanumeric path.
Other UKB cognitive tests. In addition to the tests of cognitive function we included in the main analysis, there were several other cognitive tests that we decided not to include. We selected several tests that were previously shown to be impaired in major depressive disorder (MDD) or anxiety and that showed clear relevance to executive function. The tests selected were guided by our hypotheses that executive dysfunction provides a shared neurocognitive mechanism underlying MDD and anxiety.
We nevertheless present the effects of the clinical groups compared to controls on these measures of cognition after correcting for age, age 2 , age*sex and sex and site (Newcastle, Cheadle, Reading) in Supplementary Table 2. While pairs matching is a popular online game (Mah-jong being one of the most famous examples of tile/card matching), it is less commonly used in cognitive neuroscience and psychiatry research although some evidence suggests that playing Mah-jong can have beneficial effects on cognitive functioning in older adults (1). We also decided against including reaction times as they lack specificity and are affected in different neurodegenerative conditions (2) as well as schizophrenia (3). However, the underlying aetiology that leads to reaction time differences is likely different across these disorders. Next, digit span is a common measure of working memory, yet we did not find reduced digit span in our case groups. Matrix completion test was not included as it provided an additional fluid intelligence measure to the already included fluid intelligence battery. Tower test is a measure of planning that was not impaired in any other disorder apart from MDD+ANX. Finally, neuroticism is a personality trait rather than a cognitive test and was not included among tests of cognitive function. Correlations between the cognitive variables are shown in Supplementary Figure S1.

Clinical Data
One of the key limitations of the UKB data is the cross-sectional nature of the dataset, whereby ICD diagnoses refer to lifetime presence of the respective psychiatric disorder. In order to illustrate the duration since the first time a diagnosis was reported, we used the age at the time of scan (Data Field 21003), year of birth (Data Field 34) and the date when a diagnosis was first reported (Data Fields 130894 for MDD, 130906 for ANX and 130910 for STR). Further, we explored the duration since the last episode based on selfreported symptoms of depression (Data Field 20434). Antidepressant prescription data was extracted from linked electronic health records (primary care prescription records, Data-Field 42039). Age at first MDD episode was extracted from Data Field 20433 and age at most recent MDD episode was extracted from Data Field 20434. These items were self-reported during the first UKB visit and 60.0% of participants with a lifetime ICD diagnosis of MDD and 57.7% of participants with a lifetime ICD diagnosis of both MDD and ANX answered this question. Number of MDD episodes was also assessed using self-report (Data Field 20442) and only 45.1% of participants with a lifetime MDD diagnosis and only 40.1% of participants with a lifetime diagnosis of both MDD and ANX answered this question. Head motion was assessed using Data Field 25741. Genetic kinship or relatedness was assessed using Data Field 22021.
We used the following Data-Fields as an exclusion criteria for being part of the  Figure S1. Correlation structure of cognitive variables and neuroticism. For each pair of variables, rows with missing values in either variable were excluded from the correlation. TMT and Gf showed the strongest associations with other variables. TMT -trailmaking test; Gffluid intelligence; PALpaired associated learning; DSSTdigit-symbol substitution test; Pairspair matching game; RTreaction times; DSdigit span; Matricesmatrix completion test; Tower -Tower test. Correlates significant at P<0.001 are shown in bold.

Mapping the UK Biobank ICA components to Yeo 7 networks
In order to map the UKB ICA components to the Yeo 7 networks, we obtained a volumetric parcellation of the Yeo networks from Freesurfer online resources (https://surfer.nmr.mgh.harvard.edu/fswiki/CorticalParcellation_Yeo2011). We calculated the proportion of voxels in each of the ICA components that fell in each of the 7 Yeo networks (Supplementary Table 1) and assigned each independent component (IC) to the network that encapsulated the highest proportion of voxels of that IC. Exceptions were first, IC-18 that encompassed subcortical regions, notably the striatum; second, IC-15 that encompassed the cerebellum; third IC-3 that was part of both dorsal and ventral attention networks; fourth, IC-21 that was part of the FPN, DMN and to a lesser extent VAN and finally IC-10, IC-11 and IC-12 that fell into Yeo motor and attentional networks.

Multiple comparison correction and permutation testing for significance
Main approach. We use permutation testing a) to show which case-control differences were significant and b) to show which unthresholded case-control brain maps were significantly associated with each other. In our main analyses, we first construct general linear models, testing for the effects of diagnosis on cortical thickness and functional connectivity in separate sets of linear models. We include a number of variables such as site, sex and age as covariates. Each observation in the data matrix comes from a single participant. Next, re-run these linear models to obtain a random distribution of casecontrol differences using permutations of the outcome variable, i.e. either cortical thickness of one of the 360 regions or one of the 210 functional connectivities. We repeat the models with the reshuffled outcome variable 1,000 times and test for main effect of group and for case-control differences on these reshuffled cortical thickness and functional connectivity values. By re-running these randomized models, we generate random distributions of F-values testing for significance of the main effect of group. We also generate random distributions of case-control T-statistics. When testing whether case-control T-maps are significantly correlated, we generate a 'random' distribution of correlations between these T-maps.
Spatial rotation 'spin' testing. The permutation procedure described above does not consider the spatial structure of brain maps. Therefore, we extended our analyses by using a spin test for parcellated brain maps (4), an application of the spin test for any surface map (5). Spin tests can be currently applied to spatial maps e.g. capturing cortical thickness, task activation or global connectivity. Spin tests for connectome data which involves fMRI connectivities between pairs of regions is not currently available. Therefore, spin testing was not applied to the the linear models or partial least squares models that focused on functional connectivity data. Further, functional connectivities used in this analysis involve partially overlapping brain maps derived from a group independent component analysis (6), which complicates the use of a spin test as the spatial relationships between components are not equivalent to a simple spatial map.
We first use the spin test to assess significance of the correlations between case-control brain maps of cortical thickness. Figure S2 shows the resulting p-values after permutation testing in the main analysis (A) and after spin testing (B). The pattern of significance was the same, though permutation of the outcome variable in the original linear models appears to be more stringent. We therefore report the more stringent results in the main analysis.
Hybrid permutation spin test. The original purpose of the spin test was to assess the similarity of pairs of brain maps. Therefore, all examples of the spin test toolbox test for associations of pairs of surface maps, where each observation is a region of interest. In our analyses of linear models of cortical thickness, the units of observation are individual participants. However, we have worked to combine the spin/rotation method with regular permutation testing in our cortical thickness analyses. This approach allows us to preserve the brain map structure for each subject-level brain map and still generate random distributions of linear effects of diagnosis. The original approach involved reshuffling the outcome variable for each brain region in isolation. On the other hand, in our proposed novel approach each permutation involves assigning a different rotation of the cortical thickness map for each subject. We then run linear models on cortical thickness data from each of the 360 regions. We first generated 12,214 rotations, one for each subject. During each permutation, we assign one of these rotations to each subject at random. We rotate each subject's brain map and repeat the linear model analysis for each region. This procedure is repeated 1,000 times. P-values are then calculated by comparing the actual F-values with the randomized distributions of F-values. As before, when a significant main effect of diagnosis is found (permutation p<0.05), we use the T-tests from the linear model (Bonferroni corrected p<0.0125) to test for differences between each of the case groups vs controls. We show the significant case-control differences in Supplementary  Table 4. Similar to the original analysis, we also generate p-values for the correlations assessing disorder similarity ( Figure S2C).
Supplementary Figure S2. P-values for the disorder similarity correlations after regular permutation testing (A), after the spin test (B) and after the hybrid combination of regular permutation testing and spatial rotation (C) of the cortical thickness maps. Zeros indicate a p-value of p<0.001.

Clinical features and cognitive function
As shown in the Supplementary Figure S3, the first time a clinical diagnosis was reported was within 10-20 years from the time of the MRI scan for most participants. Moreover, many of the participants in the MDD-and MDD+ANX groups also showed higher PHQ-2 scores suggesting that depressive symptoms were present in these groups at the time of scan (Supplementary Figure S4, Table 1).
Supplementary Figure S3. Clinical sample characteristics. Histograms (A) illustrate the number of years since the first time a diagnosis of major depressive disorder (MDD, F32), a non-phobic anxiety disorder (ANX, F41) or a stressor-related disorder (STR, F43) was reported. We further broke the sample down into each of the groups shown in (B). The stacked bar chart on the left includes the proportion of all participants (with functional connectivity data) who had a lifetime diagnosis of MDD-, ANX-, MDD+ANX or STR-at the time of scan. The stacked bar chart on the right shows the proportion of participants falling into each of the clinical groups when considering only participants with a PHQ-2 score of two or higher.
Over 33% of participants with a lifetime diagnosis of MDD, ANX, or STR reported experiencing the most recent MDD episode during the last five years and over 60% of participants with a lifetime diagnosis of MDD, ANX, or STR reported such an episode in the last ten years (Supplementary Figure S4).
Supplementary Figure S4. Number of years since the last MDD episode based on self-report (Data Field 20434).

Significant effects of diagnosis on cortical thickness
The regions of significant differences after permutation testing are shown in Supplementary Table 3. In order to correct for multiple comparisons with the case-control differences, we used the following thresholds: PPERMUTATON<0.05 to assess whether the overall group effect was significant and in addition, we Bonferroni-corrected (P<0.0125) the post-hoc case-control comparisons of MDD-, ANX-, STR-and MDD+ANX vs control group.
Supplementary We further present the results of the hybrid permutation analysis in Supplementary Table  4. The regular and the hybrid permutation approaches produced highly consistent results, although the hybrid permutation analysis returned fewer significantly different regions compared to the regular permutation test. The same healthy control group was used for the regular and the hybrid permutation analysis. The healthy control group was picked at random to match the cases in sample size, age and sex. Supplementary

Disorder Similarity
Supplementary Figure S6. Disorder similarity assessed using cortical thickness. For each pair of disorders, we show scatterplots of case-control t-statistics comparing cortical thickness. Each datapoint represents a region of interest in the 360-region HCP parcellation. Significant correlations are highlighted in bold.
Supplementary Figure S7. Disorder similarity assessed using functional connectivity. For each pair of disorders, we show scatterplots of case-control t-statistics comparing functional connectivity. Each datapoint represents one of the 210 unique functional connectivities. Significant correlations are highlighted in bold.

Sensitivity analyses
We include two sensitivity analyses: firstly, we show that restricting the MDD group to active MDD only did not impact our disorder similarity results ( Figure S8). We defined active MDD at the time of scanning and cognitive testing using the PHQ-2 with a cut-off score of two or greater (7).
Supplementary Figure S8. Case-control differences in functional connectivity (A) and cortical thickness (C) between lifetime MDD-and healthy control groups (also shown in Figure 1 and 2) were highly consistent with the case-control differences between active MDD and controls (E, G). Focusing only on those with lifetime MDD diagnosis and PHQ2≥2 resulted in lower number of participants with active MDD (n=977 for functional connectivity and n=1,103 for cortical thickness). Disorder similarity was also highly consistent between lifetime (B) and active MDD (F).
Secondly, we repeat the case-control comparisons focusing only on unmedicated participants and show that the results were highly consistent after excluding unmedicated participants (Supplementary Figure S9).
Supplementary Figure S9. Case-control comparisons for functional connectivity (A) and cortical thickness (C) after exclusion of participants taking medications were similar to the case-control maps for all participants although the sample size was substantially reduced (total n=7,646 for each analysis). Disorder similarity (B,D) was also largely consistent between the main results and this sensitivity analysis.

Neural correlates of cognitive function in all cases
In addition to presenting the correlation matrices showing the relationships between cognitive tests (TMT, Gf, PAL, and DSST) with PLS latent variable scores, we also show scatterplots of these relationships in Supplementary Figure S10. We also provide more details on PLS2 weights in Supplementary Figure S10B.
No significant brain-cognition associations between cortical thickness (instead of functional connectivity) and cognitive function were found in all cases (n=3,372, permutation p=0.185).
Supplementary Figure S10. (A) Associations between PLS scores and cognitive function tests in all cases. (C) In addition to scatterplots, lines of best fit and 99% confidence intervals (obtained using MATLAB R2016a, fitlm and predict functions) are shown. (B) Thresholded PLS2 weights implicated connectivities between independent components corresponding to the default mode (DMN), frontoparietal (FPN) and striatal (STR) networks. Blue connections between network components suggest that higher connectivity of those components was associated with worse fluid intelligence and paired associates learning. Red connections between network components suggest that higher connectivity of those components predicted better fluid intelligence and paired associates learning. No functional connectivities showed robust loadings on PLS3 (|Z|<3) and thus no graph is shown for PLS3. Network labels include: MOTmotor; CRBcerebellum; VISvisual; DA/VAdorsal/ventral attention. TMT -Trailmaking path; Gffluid intelligence test; PALpaired associates learning; DSSTdigit-symbol substitution test.

Neural correlates of cognitive function in MDD-, ANX-MDD+ANX and STR-
A separate PLS regression found three latent variables that together explained a significant amount of variance (PPERMUTATION<0.001) in the cognitive scores in MDD-(n=1,895). PLS1, PLS2 and PLS3 explained 5.2%, 0.8% and 1.6% of variance in cognitive scores. PLS1 scores were associated with longer times to complete trailmaking and worse scores on fluid intelligence, digit-symbol substitution and paired associate learning. Since no PLS2 or PLS3 weights showed reliable weights (|Z|<3) after bootstrapping, we did not focus on these PLS latent variables.
A PLS regression in ANX-(n=426) identified three latent variables that together explained a significant amount of variance in the cognitive scores (PPERMUTATION=0.009). PLS1, PLS2 and PLS3 explained 10.0%, 6.9% and 6.3% of variance in cognitive scores. Similar to the other PLS analyses, PLS1 scores were associated with longer times to complete Trailmaking and worse scores on fluid intelligence, digit-symbol substitution and paired associate learning. No significant brain-cognition relationships were found in MDD+ANX or STR-(PPERM>0.1).

Neural correlates of cognitive function in healthy controls
We aimed to uncover the neural correlates of worse executive function and verbal memory of the Control group (n=14,199) aiming to test whether the neural correlates of executive function is similar to those of the clinical groups presented in Figure 3. These results are shown in Supplementary Figure S11 and S12. PLS regression identified three components that explained 2.0%, 1.2% and 0.4% of variance in the four cognitive function tests (TMT, GF, PAL and DSST). Permutation testing showed that these components together explained a significant amount of variance in executive function (PPERM<0.001). The first component, PLS1, captured the most variance in the outcome variables (Supplementary Figure S12) and was associated with worse performance on each of the tests. The second component, PLS2, was associated with better performance on each of the tests.
We found 20 connectivities with normalized PLS1 weights with Z>4 and 11 connectivities with normalized PLS1 weights with Z<-4 ( Figure S11B). There were 14 PLS2 weights with Z>4 and 19 PLS2 weights with Z<-4. Functional correlates of worse executive and memory performance were similar to the functional correlates of worse performance in the clinical groups, although more functional connectivity correlates were identified in the bigger sample of healthy controls.
A separate PLS model testing for brain-cognition associations with cortical thickness instead of functional connectivity was significant in controls (n=14,885, permutation p<0.001). However, it explained less variance than the functional connectivity PLS. In the cortical thickness PLS, the three components explained 0.69%, 0.37% and 0.19% of variance in the congitive function tests. The cortical thickness PLS1 scores were moderately correlated with TMT, GF, PAL and DSST (r=-0.09. r=-0.12, r=-0.05 and r=-0.05, respectively). Given that the focus of our analyses was on the brain-cognition associations in mood disorders, we report these results in the Supplementary Information.
Supplementary Figure S11. Dimensional relationships between functional connectivity and executive function from a partial least squares (PLS) regression in healthy controls. Higher FC scores on PLS1 were associated with worse executive function, while higher FC scores on PLS2 were associated with better executive function (A). Thresholded PLS1 and PLS2 weights implicated pairwise connectivities between independent components corresponding to the default mode (DMN), frontoparietal (FPN) and dorsal/ventral attention (DA/VA) networks (B). Blue connections between network components suggest that higher connectivity of those components was associated with worse cognitive performance. Red connections between network components suggest that higher connectivity of those components predicted better cognitive performance. Anatomical locations of the independent components are shown in (C). Network labels include: MOT motor; CRBcerebellum; VISvisual; STRstriatum. TMT -Trailmaking path; Gf fluid intelligence test; PALpaired associates learning; DSSTdigit-symbol substitution test.
Supplementary Figure S12. Overview of the PLS findings in healthy controls. The PLS model explained a significantly greater amount of variance (3.6%) in cognitive function than expected by chance from a permutation distribution (A, shown in dark blue). The scores of the resulting latent variable PLS1, summarizing functional connectivity were predictive of PLS1 scores summarizing cognitive function in the full control sample (B). The PLS model achieved good performance in a hold-out analysis, with moderate correlations between predicted and observed cognitive scores in held-out data. (C). Figure S13. Correlation between PLS2 scores summarizing functional connectivity (FC) and PLS2 scores summarizing cognitive function in all cases (left) and in controls (right).