Vertex-wise multivariate genome-wide association study identifies 780 unique genetic loci associated with cortical morphology

Brain morphology has been shown to be highly heritable, yet only a small portion of the heritability is explained by the genetic variants discovered so far. Here we extended the Multivariate Omnibus Statistical Test (MOSTest) and applied it to genome-wide association studies (GWAS) of vertex-wise structural magnetic resonance imaging (MRI) cortical measures from N=35,657 participants in the UK Biobank. We identified 695 loci for cortical surface area and 539 for cortical thickness, in total 780 unique genetic loci associated with cortical morphology robustly replicated in 8,060 children of mixed ethnicity from the Adolescent Brain Cognitive Development (ABCD) Studyࣨ. This reflects more than 8-fold increase in genetic discovery at no cost to generalizability compared to the commonly used univariate GWAS methods applied to region of interest (ROI) data. Functional follow up including gene-based analyses implicated 10% of all protein-coding genes and pointed towards pathways involved in neurogenesis and cell differentiation. Power analysis indicated that applying the MOSTest to vertex-wise structural MRI data triples the effective sample size compared to conventional univariate GWAS approaches. The large boost in power obtained with the vertex-wise MOSTest together with pronounced replication rates and highlighted biologically meaningful pathways underscores the advantage of multivariate approaches in the context of highly distributed polygenic architecture of the human brain.


Introduction
Variability in brain morphology is highly heritable, with twin studies estimating heritability for global measures at 89% for total surface area and 81% for mean cortical thickness ( Panizzon et al., 2009 ) and ( N = 51,665) from the ENIGMA consortium identified 187 and 50 loci associated with global and regional cortical surface area and thickness, respectively ( Grasby et al., 2020 ). The relatively low yield despite high heritabilities of brain morphology is likely due to high polygenicity and small effect size (discoverability) per locus .
Both imaging genetics  and gene expression studies ( Hawrylycz et al., 2015 ) suggest that genetic effects are distributed across cortical regions, such that variants influencing one cortical region are also likely to affect other cortical regions. Multivariate statistical methods are naturally tailored to model distributed and pleiotropic genetic effects. We recently developed a Multivariate Omnibus Statistical Test (MOSTest)  ) that aggregates effects across spatially distributed phenotypes, such as cortical thickness, boosting our ability to detect variant-phenotype associations. We showed that applying MOSTest to cortical morphology region of interest (ROI) measures in the UK Biobank substantially increased loci discovery  compared to the commonly applied mass univariate approach used by the ENIGMA consortium ( Grasby et al., 2020 ), here referred to as the min-P approach. For each genetic variant tested for association with multiple phenotypes, min-P considers only the most significant p -value and corrects it for the effective number of phenotypes analyzed, thus failing to exploit distributed polygenic architecture across brain regions. In contrast, MOSTest leverages the distributed nature of genetic influences across brain regions and allows detection of genetic variants with weak effects in multiple brain regions. We have shown that the discoverability of GWAS variants underlying regional cortical area and thickness depends on the specific parcellation of cortical regions used, and that parcellations based on genetic correlations from twin studies perform better than genetically uninformed schemes . Here we further extend MOSTest method introducing regularization of covariance matrix which provides substantial increase of discovery yield as compared to unregularized version we introduced in the previous study . Updated method is applied to vertex-wise measures of cortical morphology showing that the combined genetic yield (number of loci discovered) for cortical area and thickness can be boosted when using MOSTest (yielding a 4.2-fold and 4.1-fold increase relative to min-P for cortical area and thickness, respectively), and boosted further when moving from a region-based approach to a more fine-grained vertexwise approach (additional 1.9-fold and 3.0-fold increase for cortical area and thickness, respectively) without sacrificing replicability of the findings in the independent sample. Uncovering the detailed genetic architecture of cortical area and thickness will provide insight into the underlying neurobiology of the human brain, and give a better understanding of brain-related human traits, such as cognition ( Vuoksimaa et al., 2016 ), as well as neurological ( Querbes et al., 2009 ) and psychiatric diseases ( Rimol et al., 2010 ).

Samples
For the primary analysis genotypes, T1 MRI scans, demographic and clinical data were obtained from the UK Biobank under accession number 27412. We selected White British individuals (as derived from both self-declared ethnicity and principal component analysis ( Bycroft et al., 2018 )) who had undergone the neuroimaging protocol and had passed genetic quality control procedures described below. The resulting sample contained 35,657 individuals with a mean age of 64.4 years (standard deviation 7.5 years), 51.7% female.
For the replication analysis we used data from the Adolescent Brain Cognitive Development (ABCD) study, with complete genetic data and baseline T1 MRI scans from data release 3.0 (NDA DOI:10.151.54/1519007) that passed the ABCD quality control procedures ( N = 8,060). These children had a mean age of 9.9 years (standard deviation 0.6 years), 46.9% female.

Data processing
Both for the primary and replication analyses T1-weighted structural MRI scans were processed with the FreeSurfer v5.3 standard "reconall " processing pipeline ( Desikan et al., 2006 ) to generate 1284 nonsmoothed vertex-wise measures (ico3 downsampling with the medial wall removed) summarizing cortical surface area and thickness. For the primary analysis we also generated 68 ROI cortical surface area and thickness measures (based on the Desikan-Killiany parcellation). All measures were pre-residualized for age, sex, scanner site, a proxy of surface reconstruction quality (FreeSurfer's Euler number ( Rosen et al., 2018 )), the first twenty genetic principal components, and a global measure specific to each set of variables: total cortical surface area and mean cortical thickness for the regional area and thickness measurements correspondingly. Subsequently, a rank-based inverse normal transformation was applied to the residualized measures.
For discovery we used UK Biobank v3 imputed data that have undergone extensive quality control procedures as described by UK Biobank genetics team ( Bycroft et al., 2018 ) . In addition, we filtered out individuals with genotype missing rate > 10%, variants with genotype missing rate > 5%, and variants failing Hardy-Weinberg equilibrium at p = 1E-9. We further removed variants with minor allele frequency below 0.5%, and imputation info score below 0.5, leaving 9 million variants.
For the replication we used ABCD genetic data that were part of data release 3.0. The data were genotyped on 646,247 genetic variants using the Affymetrix smokescreen array ( Baurley et al., 2016 ). Successful genotype calls were determined based on the recommendation of Affymetrix Axiom Analysis Suite v5.0, with at least 98% call rates. Further pre-imputation quality controls included inbreeding check, sex concordance check, and cohort level missingness check. Imputation was performed using the Michigan Imputation Server ( Das et al., 2016 ) with hrc.r1.1.2016 reference panel, Eagle v2.3 ( Loh et al., 2016 ) phasing and multiethnic imputation process. Best guess conversion at a threshold of 0.9 was used to convert dosage files to plink files using PLINK ( Chang et al., 2015 ). Post-imputation QC criteria were an imputation quality score greater than 0.9 and a Hardy-Weinberg threshold of 1E-6. This QC filtering was performed using PLINK ( Chang et al., 2015 ) and resulted in 13 million variants and 8060 individuals. Genetic ancestry was estimated using fastStructure ( Raj et al., 2014 ) with four ancestry factors, an individual was considered of European ancestry if its estimated posterior probability of being European was larger than 0.8. This resulted in 5060 samples classified as Europeans. There is no overlap between discovery and replication samples.
Variants were tested for association with cortical surface area and cortical thickness at each vertex and each ROI separately using the standard univariate GWAS procedure. Resulting univariate p-values and effect sizes were further combined in the MOSTest and min-P analyses to identify area-and thickness-associated loci. Replication rates for MOSTest and min-P were assessed based on independent MOSTest and min-P runs in the ABCD dataset.

MOSTest analysis
Consider variants and (pre-residualized) phenotypes. Let be a z-score from the univariate association test between i th variant and j th (residualized) phenotype and = ( 1 , … , ) be the row vector of z-scores of the i th variant across phenotypes. Let = { } be the × matrix of z-scores with variants in rows and phenotypes in columns. For each variant consider a random permutation of its genotypes and let ̃ = { ̃ } be the matrix of z-scores from the univariate association testing between variants with permuted genotypes and phenotypes. A given number of random permutations of genotypes are done for each variant and the resulting permuted genotype vectors are tested for association with all phenotypes, therefore preserving correlation structure between phenotypes. In our discovery analysis we perform 24 random permutations of genotypes (providing ∼200M z-scores to esti-mate distribution under null) to ensure that genome-wide significance level (5E-8) is covered with non-parametric distribution.
Let ̃ = ̃ ̃ be the × covariance matrix of ̃ , and ̃ = is its singular valued decomposition ( and -orthogonal matrixes, -diagonal matrix with singular values of ̃ on the diagonal). Since ̃ is symmetric, = , and singular valued decomposition of ̃ can be written as ̃ = . Consider the regularized version of the covariance matrix ̃ = , where is obtained from by keeping largest singular values and replacing the remaining with th largest. The MOSTest statistics for i th variant (scalar) is then estimated as = ̃ −1 , where regularization parameter is selected separately for cortical area and thickness to maximize the yield of genome-wide significant loci. In this study we observed the largest yield for cortical surface area with = 10; the optimal choice for cortical thickness was = 20 (Figure S4). The distribution of the test statistics under null ( ) is approximated from the observed distribution of the test statistics with permuted genotypes, using the empirical distribution in the 99.99 percentile and Gamma distribution in the upper tail, where shape and scale parameters of Gamma distribution are fitted to the observed data. The p-value of the MOSTest test statistic for the i th variant is then obtained as = ( ) . Of note, compared to the original version of the MOSTest ( van der Meer et al., 2020 ), regularization of the covariance matrix introduced in this study entails technical updates required for correct calibration of test statistics distribution under null. For example, analytical approximation of MOSTest test statistics distribution under null with Gamma distribution applied previously ( van der Meer et al., 2020 ) become invalid when covariance matrix is regularized. To handle it correctly here we use piecewise distribution with empirical distribution in the 99.99 percentile and Gamma distribution in the upper tail as described above.
While in this study the regularization parameter ( ) is selected to maximize discovery yield, alternative criteria might be used. For example, r can be selected to maximize replication rate using nested crossvalidation procedure. However, we expect this to have only marginal effect on the results since both discovery yield (Fig. S4) and replication rate (Fig. S5) remain similar in the broad range of regularization parameters.

min-P analysis
Similar to the MOSTest analysis, consider variants and preresidualized phenotypes. Let be a z-score from the univariate association test between i th variant and j th (residualized) phenotype and = ( 1 , … , ) be the row vector of z-scores of the i th variant across phenotypes. The min-P statistics for the i th variant is then estimated as = 2Φ( − max =1… ( | |) ) , where Φ is a cumulative distribution function of the standard normal distribution. The distribution of the min-P test statistics under null ( − ) is approximated from the observed distribution of the test statistics with permuted genotypes, using the empirical distribution in the 99.99 th percentile and Beta distribution in the upper tail, where shape parameters of Beta distribution ( and ) are fitted to the observed data. The p-value of the min-P test statistic for the i th variant is then obtained as − = − ( ) . It is worth noting that the permutation-based method used here for multiple testing correction of the min-P results essentially represents an exact version of commonly applied approach using matrix spectral decomposition ( Nyholt, 2004 ;Li and Ji, 2005 ). The latter is not applicable to the MOSTest. Therefore, to provide more direct comparison of MOSTest with min-P, we use permutation-based approach for both methods.

Locus definition
Genetic loci were defined based on association summary statistics produced with MOSTest and min-P following the protocol implemented in FUMA ( Watanabe et al., 2017 ) with default parameters. The protocol can be summarized as the following: 1. Independent significant genetic variants are identified as variants with p -value < 5E-8 and linkage disequilibrium (LD) r2 < 0.6 with each other. 2. A subset of these independent significant variants with LD r2 < 0.1 are then selected as lead variants. 3. For each independent significant variant all candidate variants are identified as variants with LD r2 ≥ 0.6 with the independent significant variant. 4. For a given lead variant the borders of the genomic locus are defined as min/max positional coordinates over all corresponding candidate variants. 5. Loci are then merged if they are separated by less than 250kb.
Alternatively, to facilitate comparison with the current largest brain morphology GWAS ( Grasby et al., 2020 ), we also counted genetic loci applying locus definition similar to that used by ENIGMA. Briefly, the association summary statistics produced with either MOSTest or min-P were clumped with PLINK ( Chang et al., 2015 ) using p-value threshold of 5E-8 (-clump-p1) and linkage disequilibrium cutoffs of 1 Mb (clump-kb) and r2 < 0.2 (-clump-r2). Obtained clumps of variants were considered as independent genome-wide significant genetic loci.

MiXeR analysis
MOSTest and min-P p-values were analyzed with the MiXeR tool  to estimate the proportion of additive genetic variance explained by genome-wide significant SNPs as a function of sample size. Right censoring (MiXeR option: -z1max 5.45) was applied to mitigate extreme effects which may lead to biased estimates.

Gene-level analysis
We carried out MAGMA-based gene analyses using default settings, which entail the application of a SNP-wide mean model to GWAS summary statistics, with the use of the 1000 Genomes Phase 3 EUR reference panel. Gene-set analyses were done in a similar manner, restricting the sets under investigation to those that are part of the Gene Ontology biological processes subset (N = 7343), as listed in the Molecular Signatures Database (MsigdB) v7.0. In addition, lead SNPs identified in the vertexwise MOSTest analysis were parsed with gene-set enrichment analysis as implemented in FUMA GENE2FUNC ( Watanabe et al., 2017 ). In contrast to MAGMA analysis where genes are prioritized based on proximity, in this analysis genes were selected combining positional and eQTL (based on GTEx v8 data) mapping.

Replication criterion
Lead variants identified in discovery vertex-wise MOSTest and min-P analyses were tested for replication in the ABCD sample. Only variants which present in both discovery and replication data were used, resulting in 633 variants tested for cortical surface area and 487 variants for cortical thickness. The variant was considered as replicated if its nominal one-tailed p-value for association in replication cohort was < 0.05.

Genetic loci discovery
Using the vertex-wise MOSTest ( van der Meer et al., 2020 ), we performed a multivariate GWAS of cortical morphology, such that the significance of each locus was estimated after aggregating its effects across all vertices (1284 data points each for thickness and area). This was conducted separately for cortical surface area and thickness in 35,657 individuals from UK Biobank. Measurements from left and right hemispheres were included separately (not averaged). We identified 695 and Loci; independent genome-wide significant (P < 5E-8). Y-axes are truncated at -log10(P) = 17.2 to highlight the region around genome-wide significance threshold. ROI = region of interest. 539 loci, respectively, equating to 780 unique loci associated with cortical morphology. Prior to performing the vertex-wise MOSTest analysis, individual cortical area and thickness measures were residualized for age, sex, scanner site, Euler number (proxy of surface reconstruction quality), the first twenty genetic principal components, and a participant-specific global measure (either total area or average thickness). Measurements from left and right hemispheres were not merged. For comparison, we repeated this procedure aggregating over 68 ROIs from the Desikan-Killiany parcellation. This resulted in the discovery of 370 loci for cortical surface area and 181 loci for cortical thickness, such that the vertex-wise MOSTest analysis provided a 1.9-fold and 3.0-fold increase in yield over the region-based MOSTest analysis, respectively. Applying the min-P approach to Desikan-Killiany ROIs resulted in further reduction in the number of loci discovered (88 for cortical surface area; 44 for cortical thickness). This represents a 4.2-fold and 4.1-fold decrease compared to the MOSTest ROI-based analysis, and a 7.9-fold and 12.3-fold decrease compared to vertex-wise MOSTest analysis, respectively. Manhattan plots are presented in Fig. 1 , with corresponding Q-Q plots in Figure S1.
Additionally, we applied min-P approach to vertex-wise data resulting 93 loci for cortical surface area and 63 loci for cortical thickness. Numbers of loci discovered with different approaches are shown in Table S1. Specific loci discovered in each analysis are listed in Tables S2-S9.To compare the vertex-wise MOSTest results with the most recent ENIGMA GWAS ( Grasby et al., 2020 ), we also applied the ENIGMAbased definition of genetic locus. This resulted in 1598 and 1054 unique loci for cortical area and thickness respectively, and a total of 1735 unique loci for cortical morphology identified in the vertex-wise MOSTest analysis (Tables S10 and S11).

Replication analysis
Generalizability of vertex-wise MOSTest and min-P findings was assessed in replication analysis using data on 8,060 participants from the Adolescent Brain Cognitive Development (ABCD) Study as described in the Methods. This analysis revealed comparable replication rates for loci discovered with vertex-wise MOSTest and min-P both for cortical surface area (43% and 55% replicated for vertex-wise MOSTest and min-P, respectively) and cortical thickness (vertex-wise MOSTest 30%, min-P 37%). Therefore, absolute numbers of replicated loci both for cortical surface area and thickness were substantially higher for vertex-wise MOSTest than for min-P (Tables S3, S4, S6 and S7).
Substantial differences in age and ancestry between discovery and replication cohorts might reduce replication rates because genetic association analyses likely capture some ancestry-and age-specific factors. For example, it's unlikely to capture genetic variants affecting degree of age-related cortical atrophy in the ABCD cohort. These factors might have different effect for MOSTest and min-P and thus distort comparison of replication rates. To alleviate differences between discovery and replication analyses and make comparison of replication performance more unbiased, we performed replication analysis restricting ABCD cohort to individuals of European ethnicity ( N = 5,060). The obtained replication rates for cortical area were 50% and 54% for MOSTest and min-P, respectively, for cortical thickness replication rates were 51% and 34% for MOSTest and min-P, respectively. Additionally, we estimated replication rates for a given number of top lead variants (N = 25, 50, 80) identified in discovery phase. These results for both mixed-and Europeanbased replication analyses are presented in Table S13.

Power analysis
To estimate the proportion of additive genetic variance explained by genome-wide significant SNPs identified by either vertex-wise MOSTest or min-P as a function of sample size, we used the MiXeR tool   ( Fig. 2 ). The horizontal shift of the curve indicates that the effective sample size of vertex-wise MOSTest is around 3.0-fold that of min-P. We estimate that with the current UK Biobank sample (N = 35,657), 11.6% and 7.0% of the additive genetic variance in cortical surface area and thickness, respectively, can be explained by genome-wide significant loci from the vertex-wise MOSTest analysis. ( Fig. 2 ). In contrast, the min-P approach identifies 1.3% and 0.2% of the explained additive genetic variance for area and thickness, respectively ( Fig. 2 ). The poweranalysis indicates that 32.2% and 24.0% of the additive genetic variance in cortical surface area and thickness, respectively, will be discovered in the full UK Biobank sample of N = 100,000 using the MOSTest vertex- wise approach ( Fig. 2 ). Further, the proportion of explained variance with the min-P approach in the full UK Biobank sample is estimated to be lower than the yield of vertex-wise MOSTest in the present sample size ( Fig. 2 ).

Gene-level analysis
Through gene-level analyses of the vertex-wise MOSTest GWAS using MAGMA ( de Leeuw et al., 2015 ), we found that 1647 and 1412 genes, out of a total of 19036 protein-coding genes, were significantly associated with area and thickness, respectively (Table S12). We also performed competitive gene-set analyses restricted to the Gene Ontology biological processes category (containing 7343 pathways). This resulted in 204 and 184 significant (p < 0.05/7343) gene sets associated with area and thickness, respectively. The most significantly associated pathways were related to neuronal development and cell differentiation, with the top 10 shown in Fig. 3 . Remarkably similar pathways were highlighted in gene-set enrichment analyses using FUMA GENE2FUNC. Both for cortical area and cortical thickness, out of top 10 gene sets identified in MAGMA analysis ( Fig. 3 ) 5 gene sets are also within top 10 gene sets identified in corresponding FUMA GENE2FUNC analysis (Tables S14,  S15).
For comparison, we also performed the same analyses on the ROIbased MOSTest and min-P GWAS summary statistics, resulting in 198 area and 66 thickness gene sets for ROI-based MOSTest and 60 area and 4 thickness gene sets for min-P. As shown in Figs. S2 and S3, the vertexwise MOSTest approach led to much greater significance for nearly all pathways identified. Interestingly, the most significant pathways identified by vertex-wise MOSTest are tightly connected with critical neurobiological processes implicated in brain development while top findings in the min-P analysis are less specific. The distributed effects of identified variants across different brain regions are also illustrated by brain maps, highlighting the mixture of effects across the cortex ( Fig. 4 ).
The strongest association signals identified in our discovery MOSTest analyses both for cortical surface area (lead variant: rs34680120, p < 1E-300) and cortical thickness (lead variant: rs8033007, p < 1E-300) are located on chromosome 15 at 15q14. Univariate vertex-wise association signals of the lead variants are presented in Fig. 4 . These lead variants are in strong linkage disequilibrium with each other (r2 = 0.33) and with rs1080066 (r2 = 0.50 and r2 = 0.30 for rs34680120 and rs8033007 respectively), which was reported as the strongest association with cortical surface area in large recent GWAS on brain morphology by ENIGMA consortium ( Grasby et al., 2020 ). Regional association patterns of our strongest genetic signals both for cortical area and thickness ( Fig. 4 ) highlight cortex region around the central sulcus. The region encompasses areas responsible for all motor and sensory functions and has rapidly evolved in course of primate evolution, presumably reflecting the increasing importance of somatosensory and motor integration of hand functions ( Hopkins et al., 2014 ). There is also evidence that this region is strongly involved in higher cognitive functions ( Mendoza and Merchant, 2014 ) and may contribute to different psychiatric disorders ( Fujiwara et al., 2007 ;Li et al., 2015 ). Taking together this may suggest an active ongoing fine-tuning of this brain region in the human species which in turn is reflected in observed strong genetic associations altering brain morphology of the region.

Discussion
Applying the vertex-wise MOSTest method, we identified 695 loci for cortical surface area and 539 for cortical thickness, in total 780 unique genetic loci associated with cortical morphology, greatly replicated in independent sample with substantially different ethnical and age composition. This reflects an approximate 8-fold increase in discovery with no penalty on replication rate compared to the commonly applied univariate GWAS methods. Our study highlights the greatly improved yield obtained with the vertex-wise multivariate approach compared to conventional region-based univariate GWAS approach, which stems from highly distributed nature of brain morphology phenotypes, representing continuous maps per individual. The present results support the hypothesis that the genetic determinants of variability in brain morphology are extensively shared across multiple regions ( van der Meer et al., 2020 ). Our findings further underscore the complex molecular mechanisms shaping the human brain, which we show are largely related to neurodevelopmental processes.
Our gene-level analyses indicated that, with the current sample size, 10% of all protein-coding genes were significantly associated with brain morphology (either cortical area or thickness). Gene-set analyses for both area and thickness confirmed involvement of pathways recently reported by ENIGMA ( Grasby et al., 2020 ), but with greater statistical significance. We additionally found strong evidence for the involvement of several genetic pathways regulating neuronal development and differentiation that were not identified by the min-P approach, implicating key biological processes regulating human surface area expansion and increases in thickness. This also corroborates the strong statistical signals and suggests that we are capturing true biological mechanisms that were missed by previous methodologies. These novel find-  ings of neurobiological underpinnings associated with brain morphology provide a framework for follow-up experimental studies to identify the complex polygenic mechanisms involved in human brain development ( Silbereis et al., 2016 ). Further, the findings implicating neuronal development and cell differentiation can facilitate experimental studies to gain better insight into the pathobiological mechanisms of brain-related diseases including psychiatric disorders ( Sullivan and Geschwind, 2019 ), where we need to understand the role of polygenic mechanisms ( Gandal and Geschwind, 2021 ).
Twin studies have suggested the largely independent nature of cortical surface area and thickness ( Panizzon et al., 2009 ). The genetic correlation between them estimated using linkage disequilibrium score regression (LDSR) is rg = -0.32 (p = 6.5E-12) ( Grasby et al., 2020 ). Here we identify the specific loci involved and show that these cortical phenotypes share a large proportion of genomic loci. Out of a total of 695 loci for cortical area and 539 loci for cortical thickness, 454 loci (58.2% of the total number of unique loci) were overlapping. These findings illustrate how measures of genetic correlation fail to fully capture the extent to which the genetic influences of two phenotypes are interrelated. LDSR and twin analyses depend on the consistency of effect directions across phenotypes. In contrast, the analysis performed here consider non-null loci as overlapping if they are both significant and in linkage disequilibrium, regardless of effect directions. Overlapping genetic architecture across brain regions despite the absence of strong genetic correlations are therefore plausible due to common molecular toolkits involved in neurodevelopment across brain regions ( Stiles and Jernigan, 2010 ). This is in line with Allen Brain Atlas maps of the adult human brain, showing regions with high similarity in gene expression between cortical structures consistent with the notion that the basic architecture across the entire cortex is similar or "canonical " ( Hawrylycz et al., 2012 ). This may also explain the shared genetic architecture observed for many brainrelated traits and disorders ( Watanabe et al., 2019 ;Smeland et al., 2019 ; Cross-Disorder Group of the Psychiatric Genomics Consortium. Electronic address, p.m.h.e., and Cross-Disorder Group of the Psychiatric Genomics, C 2019 ). Accounting for the distributed signal across the cor-tex ( Fig. 4 ) in a multivariate framework allowed us to boost power for discovery compared to traditional univariate approaches, such as min-P.
Compared to the current largest brain morphology GWAS ( N = 51,665) ( Grasby et al., 2020 ), analyzing parcellation-free, vertexwise data with MOSTest increased the yield of significant loci 4-fold for cortical surface area and 11-fold for cortical thickness, despite the lower sample size in our study ( N = 35K). Of note, while being generally consistent, our protocol differs in a few aspects from the previous GWAS ( Grasby et al., 2020 ), where global measures were included in the principal analysis, data for cortical regions were averaged across right and left hemispheres, and the definition of genetic loci was less conservative. Using the Desikan-Killiany parcellation approximately 2.0 times more variants were identified for cortical surface area than for cortical thickness both with the min-P and the MOSTest (Table  S1). In contrast, there were 1.3 times more loci for area compared to thickness when using the MOSTest for parcellation-free vertex-wise data. (Table S1). The observed difference in loci yield may be due to differing degrees of mismatch between parcellation schemes and actual architecture of the phenotypes. This seems to be particularly relevant for thickness, where variant effects obtained from an ROI parcellation scheme may be underestimated compared to the vertex-wise approach. This result may explain why parcellation schemes better reflecting the genetic architecture of the cortex improve detectability in imaging genetics studies . It is also worth noting that smoothing of the vertex-wise data results in substantial decrease of loci yield ( Figure S4). This might indicate that there is valid information in the fine spatial structure, which is getting removed when smoothing is applied. It's therefore tempting to apply the method to the data with higher resolution (up to voxel-wise). The latter might become feasible with more advanced implementation of the method, which we currently work on.
The boost in statistical power using the multivariate vertex-wise approach is equivalent to a more than three-fold increase in effective sample size for both area and thickness ( Fig. 2 ). Our analysis suggests that the substantial gain in power provided by vertex-wise MOSTest is projected to explain approximately 32.2% and 24.0% of the additive genetic variance for cortical surface area and thickness, respectively, upon completion of UK Biobank's target neuroimaging sample ( N = 100,000) ( Miller et al., 2016 ) ( Fig. 2 ). It is possible that multivariate approaches will also boost discovery of genetic associations with other human phenotypes that exhibit shared signal between traits.
To conclude, by deploying new vertex-wise MOSTest approach we have identified 780 unique loci associated with human brain morphology, highlighting its distributed polygenic architecture, and providing the foundation for functional follow-up experiments. The code implementing vertex-vise MOSTest approach is made freely available on GitHub. While our primary analysis is focused on UK Biobank, pronounced replication rates in demographically distinct ABCD cohort suggest high potential for generalizability of presented statistical framework. Flexibility of this approach allows its seamless incorporation into large-scale meta-analyses like ENIGMA , offering unique opportunities for major advances in our understanding of the genetic determinants of brain morphology.

Declaration of Competing Interest
Dr. Andreassen has received speaker's honorarium from Lundbeck, and is a consultant to HealthLytix. Dr. Dale is a Founder of and holds equity in CorTechs Labs, Inc, and serves on its Scientific Advisory Board. He is a member of the Scientific Advisory Board of Human Longevity, Inc. and receives funding through research agreements with General Electric Healthcare and Medtronic, Inc. The terms of these arrangements have been reviewed and approved by UCSD in accordance with its conflict of interest policies. The other authors declare no competing interests.