Multivariate genetic analysis of personality and cognitive traits reveals abundant pleiotropy

Personality and cognitive function are heritable mental traits whose genetic foundations may be distributed across interconnected brain functions. Previous studies have typically treated these complex mental traits as distinct constructs. We applied the ‘pleiotropy-informed’ multivariate omnibus statistical test to genome-wide association studies of 35 measures of neuroticism and cognitive function from the UK Biobank (n = 336,993). We identified 431 significantly associated genetic loci with evidence of abundant shared genetic associations, across personality and cognitive function domains. Functional characterization implicated genes with significant tissue-specific expression in all tested brain tissues and brain-specific gene sets. We conditioned independent genome-wide association studies of the Big 5 personality traits and cognitive function on our multivariate findings, boosting genetic discovery in other personality traits and improving polygenic prediction. These findings advance our understanding of the polygenic architecture of these complex mental traits, indicating a prominence of pleiotropic genetic effects across higher order domains of mental function such as personality and cognitive function. Hindley et al. used multivariate statistical genetics tools to examine the genetic underpinnings of cognitive and personality traits and find they are shared across higher order domains of mental functioning.

The brain is responsible for a diverse set of interconnected and overlapping functions. Among these, personality and cognitive functions both represent heritable, higher order domains of mental functioning that (1) remain relatively stable between late adolescence and older age [1][2][3], (2) form central components of an individual's identity and (3) are related to multiple physical and mental health outcomes 4,5. They are also interrelated, with evidence of complex patterns of association between personality structure, cognitive functioning 6 and academic performance 7. A comprehensive investigation of their genetic foundations can provide insights into the neurobiological mechanisms influencing these fundamental human traits 8.
Accelerated by the population-based cohort the UK Biobank (UKB; n = ~500,000), genome-wide association studies (GWAS) have revealed evidence of genetic overlap between personality and cognitive traits. Multiple overlapping genetic loci have been discovered across large-scale GWAS of neuroticism 9,10, one of the 'Big 5' personality traits defined broadly as the propensity to experience negative emotions 11, and general cognitive function [12][13][14][15], representing the shared variance across diverse cognitive functions. Neuroticism and general
Article https://doi.org/10.1038/s41562-023-01630-9
We also leveraged our multivariate analysis to boost genetic discovery across the remaining Big 5 personality traits and improve polygenic prediction. A conceptual illustration of the study is provided in Fig. 1.

Sample description
The UKB is a population-based cohort comprising over 500,000 participants between the ages of 39 and 72 years 36. At enrolment, all participants were invited to complete a touchscreen questionnaire, including 12 dichotomous items derived from the neuroticism subscale of the Eysenck Personality Questionnaire-Revised Short Form 37. They also completed 25 diverse cognitive tasks, including 13 components of a measure of 'fluid intelligence', either at enrolment or during follow-up visits. These included measures of fluid intelligence, reaction time, executive function and memory 38,39 (Table 1 and Supplementary Table 1). After also calculating sum-scores for neuroticism and fluid intelligence, we included all 39 measures to maximize statistical power for genetic discovery. After removing participants of non-White British ancestry and related individuals, the mean sample size across all measures was 201,820, ranging from 3,627 to 336,993 (Table 1 and Supplementary Fig. 1). Sample sizes were more variable among cognitive tasks than neuroticism items. Mean age was 56.9 years (s.d. = 8.0) at enrolment and 53.7% of included participants were female.

Item-level heritability and genetic correlations
To provide an overview of the heritability of questionnaire item- and cognitive task-level measures, we first calculated linkage disequilibrium score regression (LDSR) single nucleotide polymorphism (SNP)-based heritabilities (h²_SNP) for all included measures. Since the inclusion of non-heritable phenotypes may reduce statistical power 25, four measures were removed from the analysis, leaving 35 measures (Supplementary Fig. 1 and Supplementary Table 2). We next computed pairwise LDSR genetic correlations followed by hierarchical clustering to explore directional genetic relationships (Fig. 2 and Supplementary Results) 40. Neuroticism and cognitive measures shared weak negative correlations (mean r_g = −0.15, s.d. = 0.12) compared to moderate to strong positive correlations within each domain (neuroticism mean r_g = 0.64, s.d. = 0.14; cognitive mean r_g = 0.56, s.d. = 0.23). Consequently, neuroticism and cognitive measures clustered separately. Neuroticism items further clustered into two subclusters: anxiety features (worry) and depressive features (depressed affect), reproducing previous findings 19. Cognitive measures were more heterogeneous, with 'reaction time' distinct from two clusters relating to fluid intelligence, prospective memory and numeric memory (fluid intelligence/memory) and executive function and visuospatial memory (executive function). A similar pattern was observed for phenotypic correlations (Fig. 2 and Supplementary Results).

Multivariate GWAS identifies loci with pleiotropic effects
On application of MOSTest to discover pleiotropic genetic effects, we identified 431 independent genetic loci significantly associated with the multivariate distribution of the 35 measures of neuroticism and cognition. This represented a 3.8× boost in locus discovery compared to mass univariate GWAS with correction for multiple testing (min-P), which identified 113 loci (Fig. 3a, Supplementary Fig. 2 and Supplementary Tables 3 and 4). Since MOSTest specifically leverages pleiotropy, this boost in discovery supports the hypothesis of pleiotropic genetic effects across mental traits. We also performed MOSTest analyses on neuroticism and cognitive measures separately to test the extent to which the boost in locus discovery was driven by cross-domain pleiotropy. Cognitive and neuroticism measures were associated with 221 and 199 loci, respectively. However, 153 loci discovered by the combined MOSTest analysis were not identified by either of the separate analyses, indicating that 35% of the discovered loci were driven by cross-domain pleiotropy. To test the effect of (1) including participants with medical
intelligence also exhibit weak but significant negative genetic correlation 9 and higher polygenic scores (PGS) for neuroticism predict lower intelligence 15,16, reflecting weak negative phenotypic correlations between neuroticism scores and intelligence quotient 17,18.
However, previous genetic studies have typically treated personality traits and cognitive ability as discrete constructs, reducing each complex mental trait into a single measure 9,12. The limitations of this approach are underscored by a study which constructed a bifactor model of the neuroticism scale, showing that whilst a general factor of neuroticism showed a negative genetic correlation with cognitive function, two additional factors termed anxiety/tension and worry/vulnerability showed positive genetic correlations with the same cognitive function variable 16. In addition, an item-level analysis showed divergent patterns of genetic correlation between one of the neuroticism items and cognitive function and educational attainment and another with bipolar disorder 19. In contrast, multivariate approaches simultaneously model the matrix of correlations between phenotypes, thus more accurately representing the interconnected nature of the brain and its functions.
Multivariate analysis can also increase statistical power in mental traits 19. For example, multivariate analytical frameworks such as multitrait analysis of GWAS (MTAG) and genomic structural equation modelling (GenomicSEM) have been applied to cognitive traits 15,[20][21][22] and to the neuroticism sum-score in combination with other traits, such as depressive symptoms and subjective well-being 23, to improve genetic discovery. A multivariate approach to investigate pleiotropy across the two domains of cognitive function and neuroticism has not yet been conducted. Nevertheless, due to computational limitations, it would be infeasible to apply MTAG to large numbers of phenotypes such as personality questionnaire item- and cognitive task-level data. Furthermore, neither MTAG nor GenomicSEM captures mixed effect directions, that is, the presence of shared genetic variants with a mixture of concordant and discordant effects on two phenotypes, resulting in minimal genetic correlation despite extensive overlapping genetic effects 24. In contrast, a boost in genetic discovery has been demonstrated by the 'pleiotropy-informed' multivariate omnibus statistical test (MOSTest), which is agnostic to effect direction. Applying MOSTest to brain imaging phenotypes has shown that alterations in brain morphology and functional connectivity are associated with hundreds of genetic loci with 'pleiotropic' genetic effects across the brain, even despite weak genetic correlation [25][26][27][28]. We hypothesized that the genetic architecture of interconnected higher order mental traits, such as cognitive function and personality traits, is driven by similar pleiotropic effects.
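To make the idea of a direction-agnostic omnibus test concrete, the following is a minimal sketch and not the MOSTest implementation. It swaps in a textbook quadratic-form statistic (z' R⁻¹ z, chi-square with 2 degrees of freedom for two traits) and uses hypothetical z-scores and an assumed trait correlation `rho`; all function names are illustrative.

```python
import math

GWS = 5e-8  # conventional genome-wide significance threshold


def two_sided_p(z):
    """Two-sided P value of a standard-normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def omnibus_p_two_traits(z1, z2, rho):
    """P value of the quadratic form z' R^-1 z for two traits.

    R is the 2x2 correlation matrix with off-diagonal rho (an assumption).
    Under the null the statistic is chi-square with 2 degrees of freedom,
    whose survival function is exp(-x / 2).
    """
    stat = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho)
    return math.exp(-stat / 2.0)


# Hypothetical variant: sub-threshold in each trait, same or opposite sign.
for z1, z2 in [(4.5, 4.5), (4.5, -4.5)]:
    # min-P style univariate test with Bonferroni correction over 2 traits
    min_p = min(two_sided_p(z1), two_sided_p(z2)) * 2
    omni = omnibus_p_two_traits(z1, z2, rho=0.0)
    print(z1, z2, "min-P significant:", min_p < GWS, "omnibus significant:", omni < GWS)
```

In this toy setting the variant is missed by the univariate tests but detected by the joint statistic, and flipping the sign of one z-score leaves the joint statistic unchanged, which is the property the text describes as insensitivity to effect direction.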
Additionally, our understanding of the genetics of personality traits beyond neuroticism is limited [29][30][31], in part because UKB did not collect data on the four remaining personality traits within the Big 5 taxonomy. As such, only eight loci have been reported across all five measures in the largest GWAS to date (n = 76,600-122,886) 32. However, it is possible to boost statistical power for genetic discovery, identify shared genetic loci and improve prediction in underpowered GWAS by leveraging genetic overlap with a second, more powerful GWAS using the conditional false discovery rate framework (cFDR) [33][34][35]. This approach has recently been applied to MOSTest analyses of brain structural 35 and functional measures 28 to improve discovery and prediction of mental disorders.
Given evidence of genetic overlap between neuroticism and cognitive function, we sought to boost the statistical power for genetic discovery by exploiting pleiotropic genetic effects across questionnaire item and cognitive task-level measures of neuroticism and cognition. By applying pleiotropy-informed MOSTest, which incorporates scenarios of mixed effect directions, we found a substantial boost in discovery driven by shared genetic effects across domains. The widespread effects were supported by functional analysis, which identified underlying neurobiological processes distributed across brain regions.

conditions which may affect cognition and (2) the choice of covariates, we ran two sensitivity analyses of our MOSTest findings (Supplementary Results and Supplementary Figs. 3 and 4). This showed a reduction in locus discovery (338 loci, 21.6% fewer genetic loci) but evidence of substantial cross-domain pleiotropy remained. See Supplementary Results and Supplementary Figs. 3 and 4 for further details. Given the importance of controlling for population stratification, we also performed an extended sensitivity analysis investigating the effect of including additional covariates on all analyses upstream and downstream of our MOSTest analysis (Supplementary Results, Supplementary Figs. 5-12 and Supplementary Tables 5 and 6). This showed that our findings were robust to the inclusion of additional covariates.
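The locus counts reported for the main, min-P and sensitivity analyses can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the variable names are illustrative.

```python
# Locus counts as reported in the text
mostest_loci = 431        # combined MOSTest analysis
minp_loci = 113           # mass univariate (min-P) analysis
cross_domain_only = 153   # loci found only by the combined analysis
sensitivity_loci = 338    # loci remaining in the sensitivity analysis

boost = mostest_loci / minp_loci
cross_share = cross_domain_only / mostest_loci
reduction = (mostest_loci - sensitivity_loci) / mostest_loci

print(f"boost over min-P: {boost:.1f}x")
print(f"share driven by cross-domain pleiotropy: {cross_share:.1%}")
print(f"reduction in sensitivity analysis: {reduction:.1%}")
```

The ratios agree with the reported 3.8× boost, roughly 35% cross-domain share and 21.6% reduction.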
To illustrate the distribution of genetic effects, we tested for cross-cluster genetic overlap among the 431 lead variants using univariate GWAS P values (Fig. 3b and Supplementary Table 7). This showed an increase in the number of shared variants at decreasing significance thresholds (P < 5 × 10⁻⁸, P < 1 × 10⁻⁶, P < 1 × 10⁻⁵), indicating that the pleiotropic genetic variants captured by MOSTest had predominantly subthreshold associations. Comparing across clusters, the two neuroticism clusters 'depressed affect' and 'worry' shared the most lead variants at all thresholds (n = 22-68). Nonetheless, there was a comparable number of shared variants between cognitive and neuroticism clusters (n = 0-29) and within cognitive clusters (n = 1-24). Although these findings are partly affected by differences in sample size, this provides further evidence of pleiotropic genetic effects across mental traits. We also describe the pattern of effect directions across shared variants and evidence of cross-cluster gene-level overlap in Supplementary Results and Supplementary Fig. 13.
To further investigate the pattern of effect directions across mental traits, we performed hierarchical clustering of univariate z-scores from all 431 lead variants (Supplementary Figs. 14 and 15). This revealed that most lead variants had discordant effect directions between neuroticism and cognitive measures (n = 360), reflecting the weak negative phenotypic association 17,18. However, a few variants had either concordant effects across all measures (n = 23) or had mixed effects within cognitive and neuroticism clusters (n = 48). This indicates the presence of mixed genetic effect directions but a predominance of discordant effects across domains.
Fig. 1 | Conceptual overview of study design and main findings. We performed a multivariate GWAS of 35 measures of cognitive function and neuroticism in the UKB. We show that discovered loci have pleiotropic genetic effects across both neuroticism and cognitive domains, with differential expression of mapped genes across brain tissues. We replicate these findings in independent samples before leveraging the additional power generated by multivariate analysis to boost discovery of genetic loci associated with the remaining Big 5 personality traits and improve polygenic prediction of conscientiousness (CONSC) and cognitive function.
We plotted univariate GWAS P values from all 35 measures for the top 40 lead variants to illustrate item- and task-level patterns of
Clusters are derived from genetic correlation-based hierarchical clustering (Fig. 1). Further details are provided in Supplementary Table 1.
We also present the effect directions at the individual variant level, showing that four of the five lead variants exhibit a discordant relationship between neuroticism and cognition, consistent with negative genetic correlations. Nonetheless, Fig. 4b
We also present the distribution of z-scores across the five exemplar lead variants to further emphasize the pattern of effect directions in Supplementary Fig. 17.
We provided further evidence of substantial genetic overlap with mixed effect directions using the bivariate causal mixture model (MiXeR) (Supplementary Methods). MiXeR estimates the total number of shared genetic variants between two phenotypes irrespective of effect directions. Applied to the most heritable phenotype within each phenotypic cluster (Fig. 2), there was extensive genetic overlap irrespective of the genetic correlation between each pair of traits (Supplementary Fig. 18). This pattern is indicative of widespread shared genetic variants with mixed effect directions.
Phenotypically, there were also stronger positive correlations within domains but minimal correlation across domains. Measures were clustered on genetic correlation, revealing two neuroticism clusters reproducing previously reported clusters 'depressed affect' and 'worry' 19 and three cognition clusters, broadly mapping on to 'reaction time' (RT), 'executive function' and 'fluid intelligence/memory'. For further explanation of measures see Table 1 and Supplementary Table 1.

Replication in independent samples
In line with previous GWAS [41][42][43], we tested for nominal significance, Bonferroni-corrected significance and consistency of effect direction for MOSTest-discovered lead variants in independent samples, including 23andMe neuroticism GWAS (n = 59,225) 32 and CHARGE 'general cognitive function' GWAS (n = 113,981) 14 (Supplementary Table 8). Out of 140 lead variants which were present in all three samples and 286 LD proxies (r² > 0.6) which had non-ambiguous alleles and were approximately independent from each other (r² < 0.1), 65 were nominally significant in the 23andMe neuroticism GWAS, 130 in the CHARGE general cognitive function GWAS and 26 in both datasets.
Using the exact binomial test to test the hypothesis that the number of variants was greater than that expected by chance, all three were highly significant (relative risk (RR) = 3.052, P = 1.98 × 10⁻¹⁵; RR = 6.103, P = 5.80 × 10⁻⁶⁴; and RR = 24.413, P = 2.23 × 10⁻²⁷, respectively). After Bonferroni correction (P < 0.05/426), 16 variants were significant in the CHARGE general cognitive function GWAS and four in the 23andMe neuroticism sample, none of which was significant in both. The exact binomial test showed that this remained significant for cognition (RR = 320, P = 5.24 × 10⁻³⁵) and neuroticism (RR = 80, P = 2.39 × 10⁻⁷) but not both phenotypes (RR = 0, P = 1). We also tested for consistent genetic effects of lead variants across UKB and replication datasets 43,44. Since they were the most comparable phenotypes within our analyses, we compared univariate GWAS effect directions for the neuroticism and fluid intelligence sum-scores with 23andMe neuroticism and CHARGE general cognitive function summary statistics, respectively. A total of 304 had concordant effects in neuroticism (RR = 1.427, one-tailed exact binomial P = 2.57 × 10⁻¹⁹) and 344 had concordant effects in cognition (RR = 1.615, P = 1.54 × 10⁻³⁹). A total of 244 variants were concordant in both neuroticism and cognition (RR = 2.291, P = 2.22 × 10⁻⁴⁵), providing additional evidence of pleiotropic effects in independent samples.
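The relative risks above can be reproduced from the reported counts, and the one-tailed exact binomial test can be sketched with the standard library alone. The null probabilities used here (0.05 for nominal significance in one dataset, 0.05 × 0.05 for both, 0.5 for direction concordance in one trait and 0.25 for both) are assumptions inferred from the reported RR values, not values stated in the text, and the function names are illustrative.

```python
import math


def binom_upper_tail(k, n, p):
    """One-tailed exact binomial P value: P(X >= k) for X ~ Binomial(n, p).

    Each term is computed in log space to avoid overflow in the
    binomial coefficients.
    """
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log(1.0 - p)
        )
        total += math.exp(log_pmf)
    return total


def relative_risk(k, n, p):
    """Observed proportion divided by the proportion expected under the null."""
    return (k / n) / p


N_VARIANTS = 426  # 140 lead variants + 286 LD proxies

# (observed count, assumed null probability)
tests = {
    "neuroticism, nominal": (65, 0.05),
    "cognition, nominal": (130, 0.05),
    "both, nominal": (26, 0.05 * 0.05),
    "neuroticism, concordant": (304, 0.5),
    "cognition, concordant": (344, 0.5),
    "both, concordant": (244, 0.25),
}

for label, (k, p0) in tests.items():
    rr = relative_risk(k, N_VARIANTS, p0)
    p_value = binom_upper_tail(k, N_VARIANTS, p0)
    print(f"{label}: RR = {rr:.3f}, one-tailed P = {p_value:.2e}")
```

The RR values match those reported; the P values from this sketch are not guaranteed to match the published ones to the last digit, since the exact test configuration used by the authors is not given here.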

Functional characterization
Using FUMA (fuma.ctglab.nl) 45, we performed functional annotation to provide biological insights into the genetic associations captured by MOSTest. We first used multimarker analysis of genomic annotation (MAGMA) which tests for the association between phenotypic variation and aggregated GWAS P values for 18,952 human protein-coding genes irrespective of effect direction 46. MAGMA identified 1,062 multiple comparison-corrected significant genes associated with the 35 measures of neuroticism and cognition (Supplementary Table 9). Next, MAGMA-based tissue-specific expression analysis demonstrated highly specific enrichment of mapped genes in brain tissues. At the general tissue level (n = 30), the brain, pituitary, ovary and testis were significantly enriched (Supplementary Fig. 19). At the detailed tissue level (n = 53), all of the 14 included brain tissues were significantly enriched, as well as testicular tissue (Fig. 4 and Supplementary Fig. 20). This was a modest increase compared to univariate measures alone, which identified the brain and pituitary at the general level and 13 brain tissues at the detailed tissue level. However, none of the univariate analyses was associated with either ovary or testis at the general tissue level, nor the spinal cord and testis at the detailed tissue level (Supplementary Table 10). When applied to gene ontology (GO) and canonical pathways there was a clear predominance of brain-related gene sets. Out of 43 gene sets, 29 were directly implicated in the structure or function of the central nervous system and eight out of the top ten significantly enriched gene sets were related to synaptic structure or function (Fig. 5). Outside of the top ten, other notable gene sets included 'observational learning', 'behaviour' and 'cognition', in addition to several neurodevelopmental gene sets and 'gamma aminobutyric acid signalling pathway' (Supplementary Table 11). This also represented a substantial boost in gene-set discovery compared to univariate analysis, which identified a total of four gene sets across all 35 GWAS (Supplementary Table 11). We further explored the enrichment of different functional categories according to their pattern of association with the 35 included measures (Supplementary Results and Supplementary Fig. 21).
The number of lead variants shared between each pair of clusters is represented by the width of the coloured ribbons. The proportion of variants with concordant effect directions on each cluster is represented by the colour of the ribbons from blue (0) to red (1). No adjustments were made for multiple comparisons.

Boosting genetic discovery
We used the cFDR approach 34 to leverage the additional power generated by our multivariate analysis to boost discovery of genetic loci associated with the remaining Big 5 personality traits: agreeableness, conscientiousness, extraversion and openness in an independent sample (n = 59,225) 32. The cFDR applies a Bayesian model-free statistical framework to rerank SNP associations with a primary trait given their strength of association with a conditional trait. We identified loci associated with agreeableness (n = 11), conscientiousness (n = 36), extraversion (n = 89) and openness (n = 24) (Fig. 6a and Supplementary Tables 12-15). The conditional analysis ensures that the boost in power from the MOSTest method is driven by overlapping genetic variants and not non-specific effects. Functional annotation of cFDR results identified 47 positionally mapped genes for agreeableness, 157 for conscientiousness, 531 for extraversion and 114 for openness (Supplementary Tables 16-19). MAGMA cannot be applied to cFDR statistics because cFDR test statistics are not normally distributed under the null hypothesis and so violate the assumptions of the model. We therefore applied hypergeometric test-based gene-set and tissue enrichment analyses using positionally mapped genes to replicate the approach taken by MAGMA 45. There were no gene sets or tissues significantly enriched with mapped genes from any of the four traits.
To test for pleiotropic effects in the remaining personality traits, we also performed conjunctional FDR (conjFDR), an extension of cFDR which identifies shared loci between two phenotypes. This revealed that 46-74% of loci associated with the Big 5 personality traits were also associated with our multivariate analysis of mental traits, indicating extensive pleiotropic effects beyond just neuroticism (Supplementary Tables 20-23).
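The reranking performed by cFDR, and its conjunctional extension, can be illustrated with a simple empirical estimator. This is a minimal sketch under stated assumptions and not the software used in the study: it takes cFDR for a SNP as its primary-trait P value divided by the empirical conditional distribution of primary P values among SNPs at least as strongly associated with the conditional trait, and conjFDR as the larger of the two cFDR values obtained by swapping the traits. The P values are hypothetical, and practical steps such as LD pruning are omitted.

```python
def cfdr(p_primary, p_conditional):
    """Empirical conditional FDR for each SNP.

    cFDR_i = p_i * #{j: q_j <= q_i} / #{j: p_j <= p_i and q_j <= q_i},
    capped at 1, where p is the primary and q the conditional trait.
    Quadratic in the number of SNPs, so suitable only for toy inputs.
    """
    n = len(p_primary)
    values = []
    for i in range(n):
        stratum = [j for j in range(n) if p_conditional[j] <= p_conditional[i]]
        hits = sum(1 for j in stratum if p_primary[j] <= p_primary[i])
        values.append(min(1.0, p_primary[i] * len(stratum) / hits))
    return values


def conj_fdr(p_a, p_b):
    """Conjunctional FDR: the maximum of the two cFDR values per SNP."""
    return [max(a, b) for a, b in zip(cfdr(p_a, p_b), cfdr(p_b, p_a))]


# Hypothetical P values for five SNPs
p_trait = [1e-3, 2e-3, 0.3, 0.6, 0.9]      # underpowered primary GWAS
p_multi = [0.9, 1e-5, 0.5, 0.2, 0.7]       # conditional (multivariate) GWAS

print(cfdr(p_trait, p_multi))
print(conj_fdr(p_trait, p_multi))
```

In this toy input the second SNP has a weaker primary P value than the first but a strong association with the conditional trait, so it moves ahead of the first SNP once the ranking is based on cFDR.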
We performed cFDR using independent neuroticism and general cognitive function GWAS and compared these findings to the larger   24 and 25). Of those present in both datasets, 50 out of 72 (69.4%) neuroticism and 92 out of 131 (70.2%) general cognitive function lead variants were nominally significant in the larger GWAS of neuroticism 9 and general intelligence 12, respectively.

Improving polygenic prediction
We compared PGS calculated on the basis of the top 10-100,000 LD-independent SNPs using three setups: (1) original GWAS P value ranking and original GWAS effect sizes (standard PGS), (2) cFDR-based ranking and original GWAS effect sizes (pleioPGS) 35 and (3) MTAG-based P value ranking and corresponding MTAG-adjusted effect sizes (MTAG) 22. In the pleioPGS approach we used SNP ranking based on cFDR analysis conditioning GWAS of the trait of interest (23andMe Big 5 personality traits and CHARGE cognitive function) on our multivariate GWAS of cognitive function and neuroticism. In the MTAG approach, our UKB-based GWAS of neuroticism summary score (n = 274,056) was used to adjust P values and effect sizes in 23andMe GWASs of five personality traits and our UKB-based GWAS of fluid intelligence summary score (n = 163,375) was used to adjust the CHARGE cognitive function summary statistics. We also applied the PRS-CS method (4), an approach which has been shown to outperform standard PGS models, which uses all SNPs in the model after adjusting effect sizes according to LD structure 44. In all four setups, the phenotypic variance explained by the PRS (r²) was estimated using a linear regression model controlling for age, sex and first 20 genetic principal components. We hypothesized that the boost in power from our multivariate analysis leveraged by the pleioPGS approach will prioritize more informative variants than standard GWAS, MTAG or PRS-CS, resulting in improved PGS performance. When comparing the best performing model for pleioPGS and standard PGS, pleioPGS outperformed standard PGSs by 2.6 and 2.5 times for conscientiousness and cognitive function, respectively, as well as outperforming PRS-CS and MTAG-based rankings for conscientiousness (Fig. 6b). MTAG-based rankings outperformed pleioPGS for cognitive function, demonstrating that pleiotropy can be leveraged in other ways to boost polygenic prediction. None of the other PGS achieved statistically significant prediction compared to the null model after Bonferroni correction. This may indicate a lack of signal in the primary GWAS for successful prediction in the test sample.
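The difference between the standard PGS and pleioPGS setups is only in which SNPs are selected: both use the original GWAS effect sizes, but pleioPGS ranks SNPs by cFDR rather than by GWAS P value. A minimal sketch of that selection step follows, using hypothetical dosages, effect sizes and ranking statistics; `polygenic_score` is an illustrative helper, not a function from the study's pipeline.

```python
def polygenic_score(dosages, betas, ranking_stat, top_k):
    """Sum beta * dosage over the top_k SNPs with the smallest ranking_stat."""
    selected = sorted(range(len(betas)), key=lambda j: ranking_stat[j])[:top_k]
    return [sum(person[j] * betas[j] for j in selected) for person in dosages]


# Hypothetical toy data: 3 individuals x 4 LD-independent SNPs (dosages 0-2)
dosages = [
    [0, 1, 2, 1],
    [2, 0, 1, 0],
    [1, 1, 0, 2],
]
betas = [0.10, -0.05, 0.08, 0.02]       # original GWAS effect sizes
gwas_p = [1e-6, 0.03, 0.20, 0.01]       # original GWAS P values
cfdr_vals = [1e-4, 0.40, 0.002, 0.30]   # cFDR values after conditioning

standard_pgs = polygenic_score(dosages, betas, gwas_p, top_k=2)   # selects SNPs 0 and 3
pleio_pgs = polygenic_score(dosages, betas, cfdr_vals, top_k=2)   # selects SNPs 0 and 2

print(standard_pgs)
print(pleio_pgs)
```

Evaluating either score would then follow the regression described in the text, comparing variance explained with and without the score while controlling for the listed covariates; that step is not shown here.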

Discussion
In this multivariate GWAS of 35 heritable questionnaire items, cognitive tasks and summary scores, we provide evidence of abundant pleiotropic genetic associations across personality and cognitive traits. Despite weak genetic and phenotypic correlations between neuroticism and cognitive domains, we discovered 431 genetic loci associated with the multivariate distribution of included traits, with evidence of pleiotropic associations across domains. Furthermore, we identified distinct patterns of relationships with evidence of cross-domain genetic association and mixed effect directions. This was confirmed by MiXeR analysis showing extensive genome-wide genetic overlap across all phenotypic clusters even in the presence of minimal genetic correlation. Nonetheless, most lead SNPs were not genome-wide significant in univariate GWAS, demonstrating the boost in power provided by our multivariate approach. Functional characterization revealed that the genetic signal captured by MOSTest was associated with increased gene expression across all brain tissues, the testis and ovary and implicated synaptic structure and neurodevelopmental processes. Moreover, we show that our multivariate analysis improves discovery of implicated gene sets and tissues compared to univariate GWAS. We leveraged the extra power generated by our multivariate approach to boost discovery of genetic loci associated with the remaining Big 5 personality traits, identifying 160 loci for agreeableness (n = 11), conscientiousness (n = 36), extraversion (n = 89) and openness (n = 24). We further showed how the genetic loci shared across cognition and multiple personality traits improved polygenic prediction of conscientiousness and cognitive function in an independent sample. These findings have implications for how we conceptualize the neurobiology of personality and cognition, indicating that their genetic foundations are tightly interrelated. Dimensional, multivariate approaches which account for the complex set of interactions across domains are therefore better suited to fully elucidate the molecular mechanisms contributing to these fundamental human traits.
The boost in power generated by our combined analysis of neuroticism and cognitive measures, alongside our findings of shared genetic associations across domains, is consistent with the idea that these two mental constructs are influenced by pleiotropic genetic variants. This builds on recent evidence that differences in brain structure and function are associated with a similar pattern of pleiotropic genetic effects 25,26,28. As the number of genetic loci associated with complex mental traits rises 47, it is becoming increasingly apparent that individual genetic variants impact multiple, diverse traits, with few phenotype-specific variants 24,48. This represents a key conceptual advance which has several implications. First, while large univariate GWAS have provided insights into the neurobiology of specific traits 9,12, future studies need to be aware of the lack of specificity of most variants associated with complex mental phenotypes. To fully characterize a given genetic variant, its effect should be evaluated beyond the specific phenotype of interest as it is likely to have pleiotropic effects across diverse domains 25,49. Second, as statistical power increases, the relative effect size of a variant will probably be more informative with regards to specificity and relevance for a given phenotype than the presence or absence of a statistical association. In this respect, conventional GWAS may become less a tool for discovery and more focused on the precision of effect size estimates. Third, as we have shown here, pleiotropic genetic effects can be leveraged to help boost the power for genetic discovery and polygenic prediction in related traits 15.
When comparing effect sizes of MOSTest-discovered lead variants across included measures, there was also evidence of mixed effect directions between neuroticism and cognitive domains. This is consistent with the finding of minimal genetic correlation yet pleiotropic effects between these two domains. Genetic correlation is a genome-wide summary measure of the correlation of effect sizes between two phenotypes 40. It is therefore possible for two phenotypes to share many genetic variants but possess minimal correlation if there is a balance of shared variants with the same and opposite effect directions on the two phenotypes [50][51][52]. Shared genetic variants with mixed effects reflect phenotypic findings that neuroticism does not significantly predict high school educational performance 7 or cognitive function in older adults 6. Nonetheless, 'executive function' and 'reaction time' clusters shared variants with the 'worry' cluster and 'fluid intelligence/memory' shared variants with the 'depressed affect' cluster which were strongly discordant, despite weak negative genetic correlations. This suggests that MOSTest may prioritize variants which have more strongly aligned effect alleles in relation to the genome-wide average. Further, the recent findings of pleiotropic genetic effects on brain structure and function [26][27][28], as well as patterns of widespread gene expression across different brain regions 53, underscore the highly interrelated functions of brain regions and structures. Taken with our findings, this indicates that a complex interplay between heritable brain functions results in patterns of heritable, interrelated, higher order mental traits which contribute to the core characteristics of an individual.
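The point that two phenotypes can share every causal variant yet show no correlation of effect sizes can be demonstrated with a toy example. The effect sizes below are hypothetical, and a plain Pearson correlation stands in for genetic correlation, which in practice is estimated genome-wide with methods such as LDSR.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical effect sizes of eight variants on phenotype A
effects_a = [0.02, -0.03, 0.01, -0.04, 0.03, -0.02, 0.04, -0.01]
# Same magnitudes on phenotype B, with the sign flipped for every second variant
effects_b = [b if i % 2 == 0 else -b for i, b in enumerate(effects_a)]

shared = sum(1 for a, b in zip(effects_a, effects_b) if a != 0 and b != 0)
concordant = sum(1 for a, b in zip(effects_a, effects_b) if a * b > 0)
discordant = sum(1 for a, b in zip(effects_a, effects_b) if a * b < 0)

print("shared variants:", shared)
print("concordant / discordant:", concordant, "/", discordant)
print("correlation of effect sizes:", round(pearson(effects_a, effects_b), 6))
```

All eight variants affect both phenotypes, but because concordant and discordant effects balance out, the correlation of effect sizes is essentially zero.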
We used MAGMA to provide biological insights into the statistical associations captured by MOSTest. First, tissue enrichment analysis showed significant enrichment in all included brain tissues 54, underscoring the pleiotropic nature of the genetic variants discovered. There were also several relevant gene sets identified, including 'observational learning', 'behaviour' and 'cognition', alongside several gene sets related to synaptic structure and function. Since MAGMA tests for enrichment of positionally mapped genes and so is not biased by the selection of tissue-specific eQTL databases, this indicates that MOSTest is capturing biologically plausible genes and is not driven by non-specific genetic overlap, helping to validate our findings. Furthermore, the diverse set of brain tissues identified, including cortical structures, subcortical structures, the midbrain and the hindbrain, supports the broader concept of pleiotropic effects across the brain both on structural and functional levels 26,28,55. It is also interesting to note that both the testis and ovary were significantly enriched, although to a lesser degree than brain tissues. Sex hormones can act in the brain to regulate gene transcription and interact directly with neurotransmitter systems 56. They are also known to affect cognition, particularly verbal and visuospatial abilities 57 and emotional regulation 58, a core feature of neuroticism 11. Despite this, gonadal tissue was not significantly enriched in either the aforementioned general intelligence 12 or neuroticism GWAS 9. This may be the result of the additional power achieved using MOSTest.
Finally, we leveraged the boost in power from our multivariate analysis to improve discovery of genetic loci associated with agreeableness, conscientiousness, extraversion and openness. Nonetheless, these discoveries require replication in an independent sample to ensure their validity given that they were discovered using cFDR and were not significant in the primary GWAS. Genetic overlap between schizophrenia and neuroticism and openness has previously been reported using cFDR 59. Interestingly, five of the six loci shared between schizophrenia and openness were also identified in our openness cFDR analysis. Nonetheless, larger samples are required to validate these findings. By reranking genetic variants according to the MOSTest-informed cFDR values, we also improved polygenic prediction of conscientiousness and cognitive function. As has previously been shown for schizophrenia and bipolar disorder 35, the PGSs outperformed standard GWAS-based ranking despite using the same weightings, as well as PRS-CS, suggesting that this method prioritizes more predictive variants. This approach is similar to other recent examples using multivariate GWAS to enhance discovery 28 and prediction 60,61. Nonetheless, PGSs for agreeableness, extraversion, neuroticism and openness failed to achieve statistically significant prediction in our independent test sample. This may have been due to a lack of statistical power, the use of different personality scales for the training 62 and test samples 63 or cultural differences between the American 23andMe sample 32 and the Norwegian research-focussed Thematically Organized Psychosis (TOP) sample 64.
Among multivariate approaches, MOSTest was particularly well suited for the analysis of multiple personality and cognitive traits 25. MOSTest is more flexible than canonical correlation analysis or MTAG since it can handle differences in sample size across included phenotypes and is more computationally efficient for high dimensional data 25. By using permuted individual-level genotypes, MOSTest also robustly controls for type 1 error. It is also important to note that MOSTest differs fundamentally from genomicSEM 65, another widely used multivariate GWAS method. While genomicSEM is a statistical framework for applying the principles of structural equation modelling to GWAS summary statistics based on the flexible modelling of genetic covariance matrices, MOSTest empirically models the multivariate distribution of included variables while being agnostic to effect direction. This means MOSTest can identify variants which are shared across phenotypes even if they have mixed effect directions on each trait, which has been shown for many brain-related mental traits 51,66,67.
There were limitations to this study. First, this analysis only included white European-ancestry participants due to differences in linkage disequilibrium (LD) between ancestral groups and a lack of large, deeply phenotyped non-white European samples. Whereas mixed models are increasingly used to account for ancestral diversity, they are currently not compatible with MOSTest due to the use of randomly permuted genotypes. Larger samples and the application of permutation algorithms that respect ancestry and family structure will enable the inclusion of ancestrally diverse samples as well as related individuals in future work. Second, there were differences in sample size between measures. This means that the genetic associations captured by MOSTest are probably driven to a greater extent by measures with larger sample sizes and that z-score estimates for measures with smaller sample sizes may be less precise. Despite this, we showed statistically significant associations with measures from both domains, supporting our main finding of pleiotropic effects. There were also differences in the cognitive tasks and personality trait measures used in the UKB, CHARGE, 23andMe and TOP samples. This may have reduced the power in our replication and PGS analysis due to increased noise introduced by different measures. Third, we combined cognitive measures taken at different timepoints during the study. While systematic differences in cognitive performance may subtly alter the results, this is unlikely to change the main findings of the study. Fourth, MOSTest requires the use of individual-level data. This limited our ability to include other personality traits in the main analysis which were not included in UKB. We mitigated this by using our multivariate analysis to boost discovery for the remaining four personality traits. Fifth, we used MAGMA for gene mapping, tissue enrichment and gene-set analyses, which does not incorporate eQTL or chromatin interaction gene mapping. This increased the specificity of the gene-mapping approach and meant that the gene-set and tissue enrichment analyses were not biased by the selection of eQTL or chromatin interaction databases. However, this also reduced the sensitivity of our gene-mapping procedure. We considered this approach to be the most appropriate since the discovery of individual causal genes is difficult due to a lack of experimental gene-mapping evidence. We demonstrate the value of the boosted power for locus discovery using MOSTest through gene-set and tissue enrichment analyses and improved PGS performance. Finally, we did not exclude individuals with acquired disorders which could affect cognitive functioning in the analysis and we only included ten principal components as covariates in the initial GWAS step of the MOSTest analysis, which may not capture all the population stratification present in the UKB sample. Nevertheless, our sensitivity analyses indicated that this was unlikely to affect our main finding of substantial pleiotropy across neuroticism and cognition.
In conclusion, by combining 35 item and task-level measures of mental functioning in a multivariate framework, we demonstrate that distinct cognitive and personality traits are influenced by hundreds of genetic variants with pleiotropic effects and mixed effect directions, despite minimal genetic and phenotypic correlations. This contributes to a growing body of evidence indicating that common genetic variants underlying complex mental traits are closely interrelated, suggesting that 'the whole is more than the sum of its parts' for brain-related phenotypes.

Ethical considerations
All participants provided informed consent. UKB participants who withdrew consent were excluded from the study. UKB data were accessed under accession number 27412. The 23andMe sample participated under a protocol approved by the external AAHRPP-accredited IRB, Ethical & Independent Review Services. Participants were included in the analysis on the basis of consent status as checked at the time data analyses were initiated. The use of summary statistics for cFDR analysis was evaluated by The Norwegian Institutional Review Board: Regional Committees for Medical and Health Research Ethics (REC) South-East Norway, which found that no additional ethical approval was required because no individual data were used. TOP received ethical approval from Norwegian REC (ref.

Samples and phenotyping
UK Biobank. Genotypes, demographic and clinical data were obtained from the UKB. We selected unrelated (included in UKB genetic principal components calculation), white British individuals (as derived from both self-declared ethnicity and principal component analysis) with no sex chromosome aneuploidies 36 and genotyping call rate greater than 0.9. Participants who had withdrawn their consent were removed. This resulted in 337,145 individuals with mean age of 56.9 years (s.d. = 8.0 years); 53.7% were female. For the association analysis we retained only variants on autosomes with minor allele frequency above 0.001, imputation information score >0.8 and Hardy-Weinberg equilibrium $P > 1 \times 10^{-10}$, leaving 12.9 million variants.
Table 1 and Supplementary Table 1 summarize the phenotypes included in our multivariate analysis. The UKB neuroticism items were derived from the Eysenck Personality Questionnaire-Revised Short Form 37. The scale was completed by all participants during enrolment at the assessment centre as part of the touchscreen assessment. All items comprised binary yes/no response options. Cognitive measures were collected at three different timepoints: as part of the touchscreen cognitive assessment at enrolment (2006-2010), at online cognitive follow-up (2014-2015) or during a follow-up imaging visit (2016). Some items from the touchscreen assessment were repeated during the cognitive follow-up and so were merged to maximize sample size. Included measures spanned a variety of cognitive domains, including verbal/numeric reasoning, prospective memory, working memory, non-verbal reasoning, visual declarative memory, processing speed and executive function 39. All cognitive measures were coded so that larger values indicated better performance (shorter reaction time, fewer matching errors, faster task completion and so on).
23andMe and CHARGE. For our replication and cFDR analyses, summary statistics for 23andMe Big 5 personality traits 32 and CHARGE general cognitive function 14 were accessed through collaborations. Sample make-up, genotyping procedures and phenotyping have been described in detail in the original publications 14,32. Briefly, the 23andMe samples comprised 59,225 individuals of European ancestry. Sum-scores for agreeableness, conscientiousness, extraversion, neuroticism and openness were derived from the Big Five inventory-44-item edition 62. The 23andMe customers completed the questionnaire online. The CHARGE general cognitive function sample comprised a meta-analysis of 113,981 participants of European ancestry from 51 cohorts after excluding UKB participants to eliminate overlap with the discovery sample as well as individuals enrolled in five additional cohorts that contributed data in the original publication (Age, Gene/Environment Susceptibility-Reykjavik Study, Atherosclerosis Risk in Communities Study, Cardiovascular Health Study, Framingham Heart Study and Genetic Epidemiology Network of Arteriopathy). Cognitive function was assessed using a wide variety of different cognitive tests for fluid cognitive function. Each cohort included a minimum of three different cognitive tasks that tested different cognitive domains. For each cohort, principal component analysis was applied to the cognitive test scores. The score derived for the first unrotated principal component was used as a measure of the 'general cognitive function' phenotype.
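The derivation of a 'general cognitive function' score from the first unrotated principal component can be sketched as follows. This is an illustrative example on simulated data, not the CHARGE pipeline; the number of tests, the simulated loadings and the standardization of each test before decomposition are assumptions.

```python
import numpy as np

def first_principal_component_scores(test_scores):
    """Return per-person scores and loadings for the first unrotated principal component."""
    x = np.asarray(test_scores, dtype=float)
    # Standardize each cognitive test (column) to mean 0 and unit variance.
    x = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    # Rows of vt are the principal component loadings, ordered by variance explained.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    loadings = vt[0]
    # The sign of a component is arbitrary; orient it so higher scores mean better performance.
    if loadings.sum() < 0:
        loadings = -loadings
    return x @ loadings, loadings

# Simulated cohort: four tests that all depend on one latent ability (hypothetical values).
rng = np.random.default_rng(1)
latent = rng.normal(size=500)
tests = np.column_stack(
    [0.7 * latent + rng.normal(scale=0.7, size=500) for _ in range(4)]
)
scores, loadings = first_principal_component_scores(tests)
```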
TOP sample. The TOP sample comprised participants recruited as healthy controls for an observational study of severe mental illness. Participants were identified at random from the national population register. Inclusion criteria included the absence of current or previous psychiatric disorder as identified by the Primary Care Evaluation of Mental Disorders delivered by a trained research assistant 68. Exclusion criteria were substance use disorder, physical health condition, previous traumatic brain injury, neurological disorders, autism spectrum disorder, personal or family (first-degree relative) history of severe psychiatric disorder and age outside of the range 13-72 years. Big 5 personality traits were assessed using the revised Neuroticism-Extraversion-Openness Five Factor Inventory 69, Norwegian edition, a 60-item questionnaire comprising five-point Likert scale responses. Cognitive function was measured using the Wechsler Abbreviated Scale of Intelligence second edition 70. Incomplete responses were dropped, leaving sample sizes of 587 for agreeableness, 600 for conscientiousness, 581 for extraversion, 598 for neuroticism, 578 for openness and 1,066 for cognitive function.

Data analysis
Preprocessing of UKB variables. Before the association testing, each item was manually preprocessed. Missing values were dropped from the analysis. Several continuous items with skewed and highly sparse distribution of answers were binarized. All continuous items were transformed using rank-based inverse normal transformation. Further details are provided in Supplementary Table 1.
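A rank-based inverse normal transformation can be sketched as below. The exact variant used in this study is not specified in the text; the Blom offset (c = 3/8) and the average-rank handling of ties are assumptions made for illustration.

```python
from statistics import NormalDist

def rank_inverse_normal(values, c=3.0 / 8.0):
    """Rank-based inverse normal transform; tied values receive their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    inverse_cdf = NormalDist().inv_cdf
    # Map each rank to a quantile strictly between 0 and 1, then to a normal score.
    return [inverse_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]

# Heavily right-skewed toy measure with one tie (hypothetical values).
skewed = [0.1, 0.2, 0.2, 0.5, 3.0, 40.0, 900.0]
transformed = rank_inverse_normal(skewed)
```

The transformed values depend only on ranks, so extreme raw values such as 900.0 no longer dominate the distribution.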

LD score regression heritability, genetic correlation, phenotypic correlation and hierarchical clustering. Univariate $h^2_{\mathrm{SNP}}$ and pairwise genetic correlations ($r_g$) were estimated using LDSR 40,71. Briefly, LDSR estimates univariate $h^2_{\mathrm{SNP}}$ from GWAS summary statistics by modelling the relationship between variant-level effect size and extent of LD, building on the observation that the larger the region of LD the larger the effect size estimate. Genetic correlation is then computed as the covariance of SNP effect size between two traits after controlling for LD. We performed hierarchical clustering on pairwise genetic correlations using the AgglomerativeClustering algorithm with distance function $1 - |r_g|$, as implemented in the sklearn Python package v.1.1.2 (ref. 72). Phenotypic correlations were computed using Spearman rank correlation as implemented in the Python package SciPy v.1.9.2 (ref. 73).
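Clustering on the distance $1 - |r_g|$ can be sketched as follows. Note that this example swaps in SciPy's hierarchical clustering routines rather than the scikit-learn AgglomerativeClustering class named above, because the scikit-learn argument for precomputed distances differs between releases. The correlation matrix, the average-linkage criterion and the number of clusters are all assumptions for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Hypothetical genetic correlation matrix for four measures:
# two neuroticism-like items followed by two cognition-like items.
rg = np.array([
    [1.0, 0.8, -0.1, -0.2],
    [0.8, 1.0, -0.1, -0.1],
    [-0.1, -0.1, 1.0, 0.7],
    [-0.2, -0.1, 0.7, 1.0],
])

# Distance used in the text: strongly correlated measures (of either sign) are close.
distance = 1.0 - np.abs(rg)
np.fill_diagonal(distance, 0.0)

# SciPy expects the condensed (upper-triangular) form of the distance matrix.
condensed = squareform(distance, checks=True)
tree = linkage(condensed, method="average")  # linkage criterion is an assumption
labels = fcluster(tree, t=2, criterion="maxclust")
print(labels)
```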
MOSTest and min-P. Plink2 (ref. 74; v.2.00a3LM AVX2 Intel (3 January 2021)) was applied to perform item-level genotype-phenotype association testing using linear regression for continuous items and logistic regression for binary items with sex, age and first ten genetic principal components as covariates. In total we performed GWAS of 13 neuroticism and 26 cognition measures. Corresponding summary statistics were processed with LD score regression 71 to estimate SNP-heritabilities (Supplementary Table 2 and Supplementary Fig. 1) and genetic correlations between items (Fig. 1). Since including non-heritable traits in the MOSTest analysis may reduce statistical power 25, only items with $h^2_{\mathrm{SNP}}$ $P < 3.167 \times 10^{-5}$ were used for subsequent MOSTest and min-P analyses. This threshold is recommended by the developers of LDSR-based SNP-heritability 71 and has previously been used for large-scale heritability analyses of UKB genetic data 75. In total, 35 measures (13 neuroticism and 22 cognitive) passed this $h^2_{\mathrm{SNP}}$ filter. MOSTest analysis was performed using the following steps, as outlined in previous publications 25,26: (1) univariate GWAS was run for each individual phenotype using randomly permuted genotypes (in addition to univariate GWAS using original genotypes already performed); (2) covariance matrix of z-scores was estimated from permuted genotypes; (3) MOSTest test statistics were estimated for permuted and original genotypes as the Mahalanobis norm of permuted and original z-scores, respectively, using the regularized covariance matrix obtained in (2), where the regularization parameter (r = 3) was selected to maximize the yield of genome-wide significant loci as described previously 26; (4) the distribution of the MOSTest test statistics was approximated under the null hypothesis (no genotype-phenotype association) from the observed distribution of the test statistics for permuted genotypes obtained in (3), using the empirical distribution up to the 99.99th percentile and a gamma distribution in the upper tail, selecting the shape and scale parameters of the gamma distribution to maximize the likelihood of the observed data; (5) the cumulative distribution function from (4) was used to calculate MOSTest P values using test statistics obtained using original genotypes 26. We also performed MOSTest analyses for only neuroticism measures and only cognitive measures. Min-P was computed as the smallest P value for each variant across all univariate GWAS for each phenotype, followed by correction for the number of phenotypes tested, as described previously 76.
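Steps (2) to (5) and the min-P comparison can be sketched on simulated z-scores as follows. This is a simplified illustration, not the MOSTest software: the regularization shown (flooring the smaller eigenvalues of the correlation matrix at the r-th largest) is an assumption, the null distribution is purely empirical with no gamma tail, and the min-P correction is a simple multiplication by the number of traits. All function names and simulated values are hypothetical.

```python
import numpy as np
from math import erfc, sqrt

def regularized_inverse(corr, r):
    """Invert a correlation matrix after flooring its eigenvalues at the r-th largest."""
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    eigvals = np.maximum(eigvals, eigvals[-r])
    return (eigvecs / eigvals) @ eigvecs.T

def mostest_statistic(z, inv_corr):
    """Row-wise Mahalanobis norm z^T C^-1 z for a (variants x traits) z-score matrix."""
    return np.einsum("ij,jk,ik->i", z, inv_corr, z)

def empirical_p(observed, null_stats):
    """P value from the permutation null: share of null statistics >= observed."""
    null_sorted = np.sort(null_stats)
    n_ge = null_sorted.size - np.searchsorted(null_sorted, observed, side="left")
    return (1.0 + n_ge) / (1.0 + null_sorted.size)

def min_p(z):
    """Smallest two-sided P value across traits, multiplied by the number of traits."""
    two_sided = np.vectorize(lambda v: erfc(abs(v) / sqrt(2.0)))(z)
    return np.minimum(1.0, two_sided.min(axis=1) * z.shape[1])

rng = np.random.default_rng(0)
n_traits, n_perm = 6, 20_000
true_corr = np.full((n_traits, n_traits), 0.3) + 0.7 * np.eye(n_traits)

# Stand-in for z-scores from GWAS on permuted genotypes (null, but correlated).
z_perm = rng.multivariate_normal(np.zeros(n_traits), true_corr, size=n_perm)
# One variant with modest effects of mixed direction on every trait.
z_orig = np.array([[3.0, -3.0, 3.0, -3.0, 3.0, -3.0]])

inv_corr = regularized_inverse(np.corrcoef(z_perm, rowvar=False), r=3)
null_stats = mostest_statistic(z_perm, inv_corr)
p_mostest = empirical_p(mostest_statistic(z_orig, inv_corr), null_stats)
p_minp = min_p(z_orig)
print(p_mostest, p_minp)
```

In this toy example no single trait reaches a small P value, but the multivariate statistic combines the modest, mixed-direction effects into a much stronger signal than min-P.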
Genetic overlap across univariate GWAS analyses was determined at the lead-variant level. We extracted P values for all MOSTest lead variants from each individual univariate GWAS for included measures. Genetic overlap was deemed present if the lead variant was significant in each pair of univariate GWAS at the specified significance threshold ($P < 5 \times 10^{-8}$, $P < 1 \times 10^{-6}$, $P < 1 \times 10^{-5}$). The same procedure was used to quantify overlap across the three multivariate analyses.
We performed hierarchical clustering of univariate z-scores for each MOSTest-discovered lead variant. Hierarchical clustering was produced using the AgglomerativeClustering algorithm with Euclidean distance, as implemented in the sklearn Python package. Lead variants were split into seven clusters. For each variable we then estimated the median z-score over all variants in the cluster.
Conditional/conjunctional false discovery rate. We applied cFDR to boost discovery of genetic variants associated with the Big 5 personality traits and general cognitive function. First, conditional qq-plots were constructed by comparing enrichment of association in all variants in the primary trait (Big 5 personality traits or general cognitive function) with three subsets of variants defined by their strength of association (P < 0.1, P < 0.01 and P < 0.001) with the secondary trait (MOSTest summary statistics). Successive leftward deflection, indicating greater enrichment of statistical associations, with increasing threshold of significance indicates cross-trait enrichment. Shift in enrichment conditional on the secondary trait can be directly interpreted according to the Bayesian definition of the true discovery rate (TDR = 1 − FDR), whereby a larger shift is consistent with a smaller FDR. This means cFDR values can be computed for each variant by comparing enrichment of all variants with a subset of variants which are as strongly or more strongly associated with the secondary trait. The cFDR value can therefore be interpreted as the probability that a given SNP is not associated with the primary trait given that the SNP is more strongly, or as strongly, associated with both phenotypes than observed in the original GWAS. Look-up plots were constructed which provide cFDR values given the P values in the primary and secondary traits. The conjFDR statistic was computed by repeating the analysis having switched the primary and secondary trait. The maximum of the two cFDR statistics represents the probability that a given SNP is not associated with the primary or secondary trait given that the SNP is more strongly or as strongly associated with both phenotypes than observed in the original GWAS. We performed 100 iterations of each analysis after random pruning from independent LD blocks ($r^2$ > 0.1). Genomic inflation was corrected for by a conservative genomic control procedure using intergenic variants which lack true associations relative to other functional regions 77. The major histocompatibility complex (MHC) region was excluded from the model-fitting procedure to prevent inflation of test statistics due to complex LD.
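The core cFDR and conjFDR quantities described above can be sketched with an empirical estimator: the primary-trait P value divided by the empirical conditional distribution function of primary-trait P values among variants at least as strongly associated with the secondary trait. This is a minimal sketch only; random pruning, genomic control, MHC exclusion and the look-up tables used in the study are omitted, and the P values below are hypothetical.

```python
import numpy as np

def cfdr(p_primary, p_secondary):
    """Empirical conditional FDR of the primary trait given the secondary trait."""
    p1 = np.asarray(p_primary, dtype=float)
    p2 = np.asarray(p_secondary, dtype=float)
    out = np.empty_like(p1)
    for i in range(p1.size):
        # Variants as strongly or more strongly associated with the secondary trait.
        subset = p2 <= p2[i]
        # Empirical conditional CDF of primary-trait P values within that subset.
        conditional_cdf = np.mean(p1[subset] <= p1[i])
        out[i] = min(1.0, p1[i] / conditional_cdf)
    return out

def conj_fdr(p_a, p_b):
    """Conjunctional FDR: maximum of the two cFDR values after swapping trait roles."""
    return np.maximum(cfdr(p_a, p_b), cfdr(p_b, p_a))

# Hypothetical P values for four variants in the primary and secondary traits.
p_primary = [1e-6, 0.02, 0.5, 0.8]
p_secondary = [1e-4, 0.6, 0.03, 0.9]
cfdr_values = cfdr(p_primary, p_secondary)
conj_values = conj_fdr(p_primary, p_secondary)
```

The loop is quadratic in the number of variants, which is acceptable for illustration but would need a sorted or gridded implementation for genome-wide data.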

Locus definition
Genetic loci were defined on the basis of association summary statistics produced with MOSTest, min-P and cFDR following the protocol implemented in FUMA with default parameters 45. The protocol is summarized as follows: (1) Independent significant genetic variants were identified as variants with $P < 5 \times 10^{-8}$ or cFDR < 0.05 and LD $r^2$ < 0.6 with each other. (2) A subset of these independent significant variants with LD $r^2$ < 0.1 were selected as lead variants. (3) Candidate variants were identified as variants with LD $r^2$ ≥ 0.6 with each independent significant variant. (4) For a given lead variant the borders of the genomic locus were defined as minimum/maximum positional coordinates over all corresponding candidate variants. (5) Loci were merged if they were separated by <250 kilobases (kb) and the most significant lead variant was selected as the lead variant for the merged locus.
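Step (5) of the protocol above, merging nearby loci and retaining the most significant lead variant, can be sketched as follows. This is not the FUMA implementation; the locus records, field names and variant identifiers are hypothetical.

```python
def merge_loci(loci, max_gap=250_000):
    """Merge loci on the same chromosome separated by less than max_gap base pairs.

    Each locus is a dict with keys: chrom, start, end, lead, p.
    The merged locus keeps the lead variant with the smallest P value.
    """
    merged = []
    for locus in sorted(loci, key=lambda item: (item["chrom"], item["start"])):
        last = merged[-1] if merged else None
        if (
            last is not None
            and last["chrom"] == locus["chrom"]
            and locus["start"] - last["end"] < max_gap
        ):
            last["end"] = max(last["end"], locus["end"])
            if locus["p"] < last["p"]:
                last["lead"], last["p"] = locus["lead"], locus["p"]
        else:
            merged.append(dict(locus))  # copy so the input is not modified
    return merged

# Hypothetical loci; rsA and rsB are 200 kb apart and should be merged.
loci = [
    {"chrom": 1, "start": 1_000_000, "end": 1_100_000, "lead": "rsA", "p": 1e-9},
    {"chrom": 1, "start": 1_300_000, "end": 1_350_000, "lead": "rsB", "p": 1e-12},
    {"chrom": 1, "start": 2_000_000, "end": 2_050_000, "lead": "rsC", "p": 3e-8},
    {"chrom": 2, "start": 1_120_000, "end": 1_200_000, "lead": "rsD", "p": 2e-8},
]
merged = merge_loci(loci)
```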

Replication in independent samples
As applied in several recent GWAS 41-43, we tested for en masse sign concordance of genetic effects, nominal significance and Bonferroni-corrected significance in MOSTest-discovered lead SNPs using UKB fluid intelligence sum-score and CHARGE general cognitive function summary statistics and UKB neuroticism sum-score and 23andMe neuroticism summary statistics. We dropped all variants with ambivalent effect alleles and used LD proxies ($r^2$ > 0.6) if a lead SNP was not present in both replication cohorts. We first used an exact binomial test to test the null hypothesis that sign concordance, nominal significance and Bonferroni-corrected significance were randomly distributed (P = 0.5, P = 0.05 and P = 0.05/n, respectively), given the total number of variants (n) and the number of variants with concordant effects in UKB and each independent dataset and nominally significant and Bonferroni-corrected significant in the independent datasets, respectively (k). To test for evidence of pleiotropic effects, we used an exact binomial test to test the null hypothesis that sign concordance, nominal significance and Bonferroni-corrected significance in both neuroticism and cognitive function were randomly distributed (P = 0.25, P = 0.0025 and $P = 0.0025/n^2$), given the total number of variants (n) and the number of variants which were concordant, nominally significant and Bonferroni-corrected significant in both phenotypes simultaneously. We also calculated relative risks for all exact binomial tests performed, computed as: (probability of success in the sample)/(expected probability of success).
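The exact binomial test and the relative risk defined above can be sketched as follows. The counts are hypothetical, and the use of a one-sided (upper-tail) test is an assumption, since the sidedness is not stated in this section.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """Exact one-sided P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))

def relative_risk(k, n, p):
    """(probability of success in the sample) / (expected probability of success)."""
    return (k / n) / p

# Hypothetical replication result: 15 of 20 lead variants are sign-concordant.
n_variants, n_concordant = 20, 15
p_sign = binomial_upper_tail(n_concordant, n_variants, 0.5)
rr_sign = relative_risk(n_concordant, n_variants, 0.5)

# Null probabilities for success in both phenotypes simultaneously,
# obtained by squaring the single-phenotype null probabilities.
null_joint_sign = 0.5**2
null_joint_nominal = 0.05**2
null_joint_bonferroni = (0.05 / n_variants) ** 2  # equals 0.0025 / n^2
```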

Mapped genes, tissue specificity and gene-set analyses
Gene mapping of MOSTest GWAS summary statistics was performed using MAGMA as implemented in FUMA. The MHC region was excluded and all other settings were default. Gene analyses of individual items (Supplementary Results) were performed with MAGMA (v.1.09b) 46, applying a SNP-wide mean model to GWAS summary statistics excluding variants within the MHC region (chr. 6: 25000000-33000000), with 1000 Genomes Phase 3 EUR used as the reference panel and other settings left at default. A total of 18,952 genes were included in the analysis. Tissue specificity and gene-set enrichment analysis of MOSTest summary statistics was performed using MAGMA as implemented in FUMA 46. Tissue specificity was tested in GTEx v.7 eQTL database 54 across 53 'detail tissues' and 30 'general tissues'. Gene-set enrichment was tested in GO 78 and curated gene sets from MsigDB 79 (n = 10,678). Bonferroni correction was applied to correct for multiple comparisons.
The cFDR statistics are not applicable to MAGMA because the distribution of cFDR statistics under the null hypothesis does not fit the assumption that the association statistic is normally distributed. Genes were therefore mapped to candidate SNPs identified by cFDR using positional mapping, that is, according to their physical proximity (<10 kb) to each variant. We performed tissue specificity and gene-set analysis using the GENE2FUNC functionality in FUMA using default settings. Positionally mapped genes were used as input for all analyses. Over-representation of mapped genes within tissue-specific differentially expressed genes and GO and curated gene sets was tested using a hypergeometric test. Correction for multiple comparisons was performed using the Bonferroni method.
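A hypergeometric over-representation test with Bonferroni correction can be sketched as follows. This is an illustration of the test itself, not of FUMA's GENE2FUNC; the background size, gene-set size, number of mapped genes and number of gene sets tested are hypothetical.

```python
from math import comb

def hypergeometric_upper_tail(k, n_background, n_in_set, n_mapped):
    """P(X >= k): chance of at least k gene-set members among n_mapped genes
    drawn without replacement from a background of n_background genes."""
    upper = min(n_in_set, n_mapped)
    favourable = sum(
        comb(n_in_set, i) * comb(n_background - n_in_set, n_mapped - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(n_background, n_mapped)

def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)

# Hypothetical example: 20 background genes, 5 in the gene set,
# 4 positionally mapped genes of which 2 fall in the gene set.
p_enrichment = hypergeometric_upper_tail(2, 20, 5, 4)
p_adjusted = bonferroni(p_enrichment, n_tests=3)
```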

Polygenic score analysis
We calculated and compared PGS for the Big 5 personality traits and cognitive function in the TOP sample using four different setups. The first three setups were based on the C + T (clumping + thresholding) approach 80 using different strategies for ranking SNPs and adjusting their effect sizes: (1) original GWAS P value-based ranking with original GWAS effect sizes (standard PGS); (2) cFDR-based ranking with original GWAS effect sizes 35, where cFDR analysis was performed conditioning GWAS of the trait of interest (23andMe Big 5 personality traits and CHARGE cognitive function) on our multivariate GWAS of cognitive function and neuroticism; (3) multitrait analysis of GWAS-based P value ranking and corresponding adjusted effect sizes (MTAG) 23. In the MTAG approach, our UKB-based GWAS of neuroticism summary score (n = 274,056) was used to adjust P values and effect sizes in the 23andMe GWASs of Big 5 personality traits and our UKB-based GWAS of fluid intelligence summary score (n = 163,375) was used to adjust summary statistics in the CHARGE GWAS of cognitive function. Default MTAG settings were applied as described in ref. 23, besides the application of the '--no_overlap' parameter given the absence of sample overlap across the UKB, 23andMe and no-UKB CHARGE datasets 14,32. For these three setups, PGS were calculated across five sets of LD-independent SNPs (n = 10, 100, 1,000, 10,000 and 100,000) using PRSice-2 (v.2.3.3) 80 with no additional clumping (--no-clump option). Sets of LD-independent SNPs were obtained using Plink v.1.90b6.17 based on the setup-defined SNP ranking with --clump-kb 250, --clump-r2 0.1 parameters and in-sample LD estimates. In the fourth setup (4) the PRS-CS method was deployed, which uses Bayesian regression and continuous shrinkage to adjust weights for all available SNPs accounting for LD structure 81. PRS-CS was applied with default settings using the 1000 Genomes Project Phase 3 European LD reference panel as described in ref. 81. In all four setups, the phenotypic variance explained by the PRS ($r^2$) was estimated using a linear regression model controlling for age, sex and the first 20 genetic principal components. All data met the assumptions of the statistical tests used.
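The logic of the ranking-based setups (select the top-ranked LD-independent SNPs, weight them by GWAS effect sizes, then estimate variance explained after adjusting for covariates) can be sketched on simulated data. This does not reproduce PRSice-2, Plink or PRS-CS; reading 'variance explained' as the incremental $R^2$ over a covariate-only model is an assumption, and all simulated values, names and sample sizes are hypothetical.

```python
import numpy as np

def polygenic_score(genotypes, weights, ranking, n_top):
    """Weighted allele count over the n_top best-ranked SNPs (smaller rank value = better)."""
    top = np.argsort(ranking)[:n_top]
    return genotypes[:, top] @ weights[top]

def r_squared(y, predictors):
    """R^2 of an ordinary least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(y)), predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return 1.0 - residuals.var() / y.var()

def incremental_r2(y, pgs, covariates):
    """Variance explained by the PGS over and above the covariates."""
    full = r_squared(y, np.column_stack([covariates, pgs]))
    return full - r_squared(y, covariates)

rng = np.random.default_rng(2)
n_people, n_snps, n_causal = 600, 200, 20
genotypes = rng.binomial(2, 0.3, size=(n_people, n_snps)).astype(float)
beta = np.zeros(n_snps)
beta[:n_causal] = 0.3
covariates = rng.normal(size=(n_people, 2))  # stand-ins for age and sex
y = genotypes @ beta + 0.5 * covariates[:, 0] + rng.normal(size=n_people)

# Stand-ins for GWAS effect sizes and a P value (or cFDR) based ranking.
weights = beta + rng.normal(scale=0.05, size=n_snps)
ranking = rng.uniform(size=n_snps)
ranking[:n_causal] *= 1e-6

pgs = polygenic_score(genotypes, weights, ranking, n_top=20)
delta_r2 = incremental_r2(y, pgs, covariates)
print(f"incremental R^2 = {delta_r2:.3f}")
```

Swapping the `ranking` array while keeping `weights` fixed mirrors the comparison between P value-based and cFDR-based ranking with the same effect sizes.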

Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Software and code
Data collection: No software was used for data collection.

Behavioural & social sciences study design

Study description
This was a quantitative genome-wide association study testing for the association between individual genetic variants and 35 measures of neuroticism and cognitive function.

Research sample
Our sample was derived from the pre-existing UK Biobank dataset, which comprises a total of 502,371 individuals with a mean age of 56.5 (SD 8.1), of whom 54.4% were female. We selected a subset of 337,145 non-related white British adults, as defined by the UKB analytics team. The sample had a mean age of 58.9 (SD 8.0) and 53.7% were female. Participants who had withdrawn their consent were excluded. This study sample was chosen because it offered a sufficiently large sample size for genome-wide analysis with in-depth phenotyping on cognitive and personality measures. The sample is not representative of the wider British population since it only includes white British individuals, and female participants and participants with higher socio-economic status are overrepresented [Bycroft, C. et al. Nature 562, 203-209 (2018)].

Sampling strategy
Opportunistic recruitment from 22 regional centres across the UK. Target sample size of 500,000 participants. Recruitment was completed in August 2006. This sample was deemed sufficient because it is large enough to make genetic discoveries in most behavioural and health-related phenotypes.

Data collection
Data were collected either on a touchscreen at enrolment, on a home computer (online follow-up cognitive tests) or during a follow-up imaging visit. Data were collected under hypothesis-free conditions and so collection was effectively blinded to research studies which would go on to use UK Biobank data, such as the present study.

Data exclusions
Non-heritable measures of cognition (n = 4) were excluded. This exclusion criterion was pre-defined.

Non-participation
25.3% of participants completed the online cognitive follow-up. 7% of all participants attended imaging visits (data collection is ongoing).

Randomization
Randomization was not applicable because this study was observational and not testing the effect of a specific condition or intervention.

Fig. 2 | Heatmap of genetic and phenotypic correlations across mental traits. LDSR genetic correlations (r_g, top right) and Spearman rank phenotypic correlations (r_p, bottom left) reveal a pattern of moderate to strong positive genetic correlations within neuroticism and cognitive domains but weak negative genetic correlations across cognition and neuroticism measures. Phenotypically, there were also stronger positive correlations within domains
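The layout described in the Fig. 2 caption (genetic correlations in the top-right triangle, Spearman phenotypic correlations in the bottom-left) can be sketched as follows. This is a minimal illustration and not the authors' plotting code: the function names, the simulated phenotypes and the r_g values are all hypothetical, and the LDSR estimation of r_g is not reproduced here.

```python
import numpy as np

def average_ranks(x):
    """Rank values from 1..n, giving tied values their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    # Average the provisional ranks within each group of tied values.
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    sums = np.bincount(inverse, weights=ranks)
    return (sums / counts)[inverse]

def spearman_matrix(data):
    """Spearman correlations for an (individuals x traits) array.

    Spearman's rho is the Pearson correlation of the ranked values.
    """
    ranked = np.column_stack([average_ranks(col) for col in data.T])
    return np.corrcoef(ranked, rowvar=False)

def combine_triangles(r_g, r_p):
    """Upper triangle from r_g, lower triangle from r_p, diagonal set to 1."""
    combined = np.triu(r_g, k=1) + np.tril(r_p, k=-1)
    np.fill_diagonal(combined, 1.0)
    return combined

# Toy inputs (hypothetical): 200 simulated individuals, 3 traits.
rng = np.random.default_rng(0)
phenotypes = rng.normal(size=(200, 3))
r_p = spearman_matrix(phenotypes)

# Made-up genetic correlations standing in for LDSR estimates.
r_g = np.array([[1.0, 0.6, -0.1],
                [0.6, 1.0, -0.2],
                [-0.1, -0.2, 1.0]])

heatmap_values = combine_triangles(r_g, r_p)
```

The combined matrix can then be passed to any heatmap routine. Keeping the two triangles in one array makes it easy to compare r_g and r_p for the same pair of traits by reading across the diagonal.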

Fig. 3 | Boosting the signal of genetic association for 35 mental traits by leveraging pleiotropy. a, Miami plot for MOSTest (orange) and min-P (blue), plotting for each SNP the −log10(P) against chromosomal position. By applying a multivariate framework which leverages pleiotropic effects, there is a substantial boost in signal compared to a 'mass univariate' approach such as min-P, evidenced by smaller P values and a larger number of discovered loci (n = 431 versus 113). This indicates the presence of pleiotropic genetic effects across mental traits. b, Shared genetic associations of lead variants across five genetic correlation-based clusters (Fig. 1) at three significance thresholds: 5 × 10^−8, 1 × 10^−6 and 1 × 10^−5. The number of lead variants within each cluster individually at each significance threshold is represented by the size of the coloured segments. The number of lead variants shared between each pair of clusters is represented by the width of the coloured ribbons. The proportion of variants with concordant effect directions on each cluster is represented by the colour of the ribbons from blue (0) to red (1). No adjustments were made for multiple comparisons.
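Two quantities in the Fig. 3 caption lend themselves to a small worked example: a 'mass univariate' min-P statistic per SNP, and the proportion of lead variants with concordant effect directions between two clusters. The sketch below is an illustration under stated assumptions, not the procedure used in the study. In particular, the Šidák adjustment assumes independent traits, which correlated mental traits do not satisfy, and MOSTest itself is not implemented here. All function names and input values are hypothetical.

```python
import numpy as np

def min_p_sidak(p_values):
    """Per-SNP minimum P value across traits, with a Sidak adjustment.

    p_values: array of shape (n_snps, n_traits) of univariate GWAS P values.
    ASSUMPTION: traits are treated as independent; this is a simplification.
    """
    p = np.asarray(p_values, dtype=float)
    n_traits = p.shape[1]
    p_min = p.min(axis=1)
    # 1 - (1 - p_min)^k, written with log1p/expm1 to stay accurate for tiny P.
    return -np.expm1(n_traits * np.log1p(-p_min))

def concordance(beta_a, beta_b):
    """Proportion of variants whose effects share the same sign in both sets."""
    sign_a = np.sign(np.asarray(beta_a, dtype=float))
    sign_b = np.sign(np.asarray(beta_b, dtype=float))
    return float(np.mean(sign_a == sign_b))

# Toy P values (hypothetical): 2 SNPs tested against 3 traits.
toy_p = np.array([[0.04, 0.5, 0.9],
                  [1e-9, 0.2, 0.3]])
adjusted = min_p_sidak(toy_p)

# Toy effect sizes (hypothetical) for 4 lead variants in two clusters.
share_same_direction = concordance([0.1, -0.2, 0.3, 0.05],
                                   [0.2, -0.1, -0.4, 0.01])
```

Because min-P keeps only the single strongest univariate signal per SNP, a variant with modest effects spread over many traits gains nothing from it, which is the situation a multivariate test is designed to capture.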

Fig. 5 | MAGMA tissue-specific gene expression and gene-set enrichments. a, Top 20 tissues from the MAGMA-based tissue specificity analysis of the multivariate GWAS of 35 mental traits, showing highly specific enrichment across all brain tissues and the testis. All tissues tested are shown in Supplementary Figs. 19 and 20. Please note that the ovary was significantly enriched when tested at the 'general tissues' level (Supplementary Fig. 19). b, Top 20 gene sets significantly enriched for gene-level associations with the multivariate GWAS of 35 mental traits. All significant gene sets are presented in Supplementary Table 11. All P values are corrected for multiple comparisons using Bonferroni correction.

Fig. 6 | Leveraging multivariate analysis to boost discovery and polygenic prediction of personality and cognitive function. a, The number of loci associated with agreeableness (AGREE), conscientiousness (CONSC), extraversion (EXTRA) and openness (OPEN) in the primary GWAS (pale orange) compared to the cFDR conditioning on the multivariate analysis of 35 mental measures (cFDR, dark orange). We also provide the number of shared genetic loci between personality and the multivariate analysis of 35 mental measures (conjFDR, orange). The number of loci discovered increased substantially, including the first loci reported for AGREE. b, Explained variance from a linear regression model of Big 5 personality and cognitive function (PGS)
Article https://doi.org/10.1038/s41562-023-01630-9
2009/2485), Data Inspectorate (ref. 03/02051) and The Norwegian Directorate of Health (ref. 05/5821).
nature portfolio | reporting summary, April 2023. Corresponding author(s): Guy Hindley, Alexey Shadrin, Ole Andreassen. Last updated by author(s): Apr 25, 2023.

Table 1 | Overview of neuroticism and cognitive measures from UKB