Lessons from complex trait genetics may help us overcome the neuroimaging replication crisis

The research fields of Complex Trait (or Statistical) Genetics and Neuroimaging face similar challenges in identifying reliable biological correlates of common traits and diseases. This Viewpoint focuses on five major lessons that allowed population-level genetics research to overcome many of its issues of replicability and may be directly applicable to inter-individual neuroimaging research. First, the failure of candidate gene studies inspires abandoning overly simplistic studies mapping individual brain regions onto traits and diseases. Second, developments in genetics research demonstrate that robust study results can be achieved by increasing sample sizes. Third and fourth, the success of genome-wide association studies motivates the use of mass-univariate testing and sharing summary-level association data to boost large-scale collaboration and meta-analysis. Finally, applying genetics methods dealing with complex data structures to vertex-wise (or voxel-wise) neuroimaging data promises more robust discoveries without the need to develop novel neuroimaging-specific methods. Those practices, which are firmly established in genetics research, should either be further endorsed or newly adopted by the neuroimaging community, promising to accelerate the evolution of Neuroimaging through robust discovery.

© 2023 The Author. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Fifteen years ago, researchers claimed to have found a gene responsible for depression. Roughly 450 peer-reviewed studies, published in reputable journals, had delivered apparently supporting evidence for the hypothesis that the Serotonin Transporter Gene formed the biological basis of depression (Border et al., 2019). As the name suggests, the Serotonin Transporter Gene regulates serotonin levels in the brain, making it a logical therapeutic target that conformed with popular theories of depression at the time. Many other so-called candidate gene studies, which tested similar hypotheses about single genes forming the basis of other human traits and diseases, also claimed to have uncovered underlying genetic mechanisms. Those studies were cited thousands of times. Unfortunately, almost all this research later transpired to be based on oversimplified notions of human biology and could not be reliably replicated.
Candidate gene studies had mostly accumulated false results, and are now considered obsolete (Border et al., 2019). The full extent to which candidate gene studies, which explored inter-individual traits with complex biology, produced erroneous and oversimplified results became clear when studies with better methods and superior statistical power systematically contradicted candidate gene findings. The field of Complex Trait Genetics overcame many of its flaws through drastically reforming approaches to analysing big genetic data, which eventually allowed novel insights into human biology. Contemporary genetic discoveries promise exciting translations into applications of personalised healthcare, according to which we may be able to predict disease risk from an individual's genetic make-up to guide treatment, or even prevent disease altogether (e.g., Brittain et al., 2017).
Neuroimaging research now faces some similar challenges of replicability. Specifically, studies looking for brain regions that may be "responsible" for functions (or dysfunctions) are just as difficult to replicate as candidate gene studies. For example, the Parieto-Frontal-Integration theory (P-FIT), which suggests that enlarged frontal and parietal brain regions underpin good cognitive ability, shaped many neurocognitive studies. However, meta-analytic evidence for this theory is inconsistent (Basten et al., 2015). It is possible, if not likely, that our future selves will remember the P-FIT, like the Serotonin Transporter Gene theory, as an abandoned piece of the self-correcting scientific process.
It is the aim of this Viewpoint to outline striking parallels between the challenges faced in the fields of Complex Trait Genetics and inter-individual Neuroimaging. At the core of my reasoning is that traits and diseases have complex genome-wide and brain-wide biology, and that the corresponding data structures share similar characteristics. I will discuss firmly established practices that helped genetics research overcome issues of replicability and may help inter-individual neuroimaging research do the same. Some of these practices have already been endorsed by parts of the neuroimaging community, but others are novel and may inspire an acceleration of the evolution of Neuroimaging.

Lesson 1: abandon traditional studies mapping one biological variable onto traits and diseases
To illustrate how lessons from one research field may inspire change in another, I will first focus on advances in Complex Trait Genetics that were key to moving past the replication crisis. Most importantly, it was a conceptual shift from candidate gene towards genome-wide approaches that allowed genetics research to reliably identify genetic risk factors. In essence, candidate gene approaches describe the statistical relationship between a trait and one pre-specified gene-of-interest that the researcher hypothesised to form its biological basis, whereas genome-wide approaches are hypothesis-free and consider thousands, or millions, of genetic markers.
Genome-wide methods successfully enabled robust discoveries as they accommodate two main characteristics of the genetic architecture of traits and diseases. First, genome-wide methods consider that markers across the genome are correlated among one another, which reflects the fact that genes are passed through families in conjunction with other genes. Geneticists call this linkage disequilibrium. Second, genome-wide methods recognise that most human traits have many genetic correlates that are weakly associated and diffusely distributed across the whole genome (as opposed to being controlled by only one strongly associated gene). Geneticists refer to this as polygenicity. It is now widely accepted that many genetic correlates (sometimes thousands) account for why traits and diseases are heritable. For example, whether a person develops depression is influenced by how many genetic risk markers this person inherits at birth. From this perspective, it is intuitive that polygenic traits can only be modelled appropriately by statistical methods that consider the entire genome, as opposed to one individual gene.
Those genetic data structures (i.e., linkage disequilibrium and polygenicity) both have close analogies in neuroimaging data. The parallel is strongest when neuroimaging data is represented in its raw vertex-wise (or voxel-wise) form, including hundreds of thousands of brain-wide measures. Like inter-correlated genetic markers, a measure of cortical thickness at a certain vertex (or voxel) is correlated with other vertex- (or voxel-) wise brain measures, particularly with those in physical proximity. These interdependencies are organised along cortical gradients (Huntenburg et al., 2018). Furthermore, we know from functional neuroimaging and other modalities that traits have many correlates spread across the brain (Marek et al., 2022), suggesting that approaches considering only one brain region oversimplify matters to a substantial degree. As in genetics research, empirical investigations have shown that it is most appropriate to model the brain based on thousands of vertex-wise brain measures, instead of considering crudely averaged regions-of-interest (ROIs) (Fürtjes et al., 2023). Hence, abandoning overly simplistic studies mapping individual ROIs onto traits has the potential to improve reliability of neuroimaging research.

Lesson 2: increase sample sizes
Using genome-wide approaches, the genetics community soon realised that polygenic traits have many genetic correlates with effect sizes much smaller than previously expected. As small effects require large samples to achieve adequate statistical power, it is widely accepted that insufficient sample sizes had hindered the reliable identification of genetic mechanisms. For example, Serotonin Transporter Gene studies had a median sample size of 435 (Border et al., 2019), and resulting false discoveries were amplified by publication bias. Many efforts have since been devoted to adopting more stringent significance thresholds to counteract chance findings, as well as collecting large-scale genotyped samples, in some cases including millions of participants (Yengo et al., 2022). While some neuroimaging samples are continually growing (for example, the UK Biobank cohort is on a trajectory to scanning 100,000 brains; Littlejohns et al., 2020), overall they remain small (see footnote 1). Where big samples collected through consortia improved the reliability of genetics studies, larger samples are imperative to improving neuroimaging studies too. Consortia like ENIGMA (Thompson et al., 2014) and repositories such as NeuroVault (2022), BrainMap (Fox et al., 2005; Fox & Lancaster, 2002), or Neurosynth (Yarkoni et al., 2011) are already pioneering the sharing of data from tens of thousands of participants, which will unlock the reliable identification of many correlates spread across the brain.
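The link between small effects and large samples can be made concrete with a rough power calculation. The sketch below is illustrative only: it assumes a simple two-sided test of a single correlation and uses the Fisher z approximation, and the function name `required_sample_size` and the chosen effect sizes and thresholds are hypothetical, not taken from any of the cited studies.

```python
import math
from statistics import NormalDist


def required_sample_size(r: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate N needed to detect a correlation r with a two-sided test.

    Fisher z approximation: N = ((z_alpha + z_power) / atanh(|r|))^2 + 3.
    """
    if not 0 < abs(r) < 1:
        raise ValueError("r must be non-zero and strictly between -1 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return math.ceil(((z_alpha + z_power) / math.atanh(abs(r))) ** 2 + 3)


if __name__ == "__main__":
    # Illustrative effect sizes and thresholds (assumptions, not empirical values).
    for r in (0.3, 0.1, 0.05):
        for alpha in (0.05, 5e-8):
            n = required_sample_size(r, alpha)
            print(f"r = {r:<4}  alpha = {alpha:<6g}  N >= {n}")
```

Under these assumptions, a correlation of 0.1 already requires roughly 783 participants at an uncorrected threshold of 0.05, and the requirement grows further as the threshold becomes more stringent or the effect becomes smaller.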

Lesson 3: use mass-univariate testing
Beyond increasing sample sizes, genetics research established statistical techniques for handling complicated data structures that can also model vertex- (or voxel-) wise neuroimaging data and account for complex brain-trait relationships. A popular genome-wide technique is mass-univariate testing, which geneticists call genome-wide association studies (GWAS). GWAS take a hypothesis-free approach to scanning the genome for any association between a trait and millions of genetic markers. GWAS results have been reliably replicated across many phenotypes and samples (Visscher et al., 2017). The resulting summary statistics, which conceal sensitive participant-level information, can be publicly shared, enabling large collaborative efforts and powerful meta-analyses. It has become routine for researchers to inform their genetic studies with the newest GWAS association data in order to predict individual-level disease based on polygenic scores. Those scores reflect an individual's propensity towards disease, and their predictive value is improving as sample sizes grow (Visscher et al., 2017). Mass-univariate testing, which is what a GWAS does, has also been employed by neuroimaging studies in which associations between traits and hundreds of thousands of vertex- (or voxel-) wise brain measures are quantified. Vertex- (or voxel-) wise mass-univariate testing is used in many neuroimaging studies (Ashburner & Friston, 2000); however, it has not fully replaced limited ROI-based studies. A downside of mass-univariate testing is the considerable loss of power that results from correcting many significance tests for multiple testing. However, increasing neuroimaging sample sizes and larger computational resources promise small but accurate estimates of vertex- (or voxel-) trait associations, which will help uncover meaningful brain-wide association patterns in the future.
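To show the logic of mass-univariate testing in miniature, the sketch below tests every simulated "vertex" separately against a trait and then applies a Bonferroni correction. Everything here is an assumption made for illustration: the data are simulated, the function names (`simulate`, `mass_univariate`) are hypothetical, Bonferroni is only one of several possible corrections, and real analyses would adjust for covariates and use dedicated neuroimaging software.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_p_value(r, n):
    """Two-sided p-value for H0: rho = 0, using the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def mass_univariate(trait, vertices, alpha=0.05):
    """Test each vertex on its own; return all p-values and Bonferroni hits."""
    n = len(trait)
    p_values = [correlation_p_value(pearson_r(v, trait), n) for v in vertices]
    threshold = alpha / len(vertices)  # Bonferroni-corrected threshold
    hits = [i for i, p in enumerate(p_values) if p < threshold]
    return p_values, hits


def simulate(n=400, n_vertices=200, causal=(3, 57, 120), effect=0.5, seed=1):
    """Simulated data: only the 'causal' vertices are related to the trait."""
    rng = random.Random(seed)
    trait = [rng.gauss(0, 1) for _ in range(n)]
    vertices = []
    for j in range(n_vertices):
        b = effect if j in causal else 0.0
        vertices.append([b * t + rng.gauss(0, 1) for t in trait])
    return trait, vertices


if __name__ == "__main__":
    trait, vertices = simulate()
    p_values, hits = mass_univariate(trait, vertices)
    print("Vertices passing the corrected threshold:", hits)
```

The corrected threshold shrinks in proportion to the number of tests, which is why the planted effects in this toy example are deliberately large; with realistic, much smaller effects the same design would need far more participants.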

Lesson 4: use and share summary-level association data
GWAS summary statistics are routinely used as input data to infer estimates of genetic overlap, which quantifies the level of overlapping genetic biology shared between two traits. Many studies focus on genetic overlap to better understand comorbidity or disease risk factors. Based on estimates of genetic overlap, more advanced statistical approaches model relationships between traits at the level of their underlying genetic architecture, enabling tests of specific theories about the shared biology between traits (e.g., Genomic SEM (Grotzinger et al., 2019), Genomic ICA (Soheili-Nezhad et al., 2021), Genomic PCA (Fürtjes et al., 2021)). Many more methods build on GWAS summary data, uncovering biologically interpretable mechanisms, for example, by linking them with gene expression or cell type profiles (de Leeuw et al., 2015).
Adopting practices that encourage collaboration and meta-analysis also greatly benefits neuroimaging research. Just as geneticists share GWAS summary statistics, neuroimagers should calculate summary-level trait associations for all vertices (or voxels) across the brain and share them publicly. Meaningful summary data will require great, consortium-level efforts to reduce noise and (scanner) bias. Inspired by practices surrounding GWAS, vertex- (or voxel-) wise association data may be used to infer brain-based etiology shared between traits (parallel to genetic overlap), or to uncover underlying biological mechanisms by mapping association data onto brain-specific gene expression (Shen et al., 2012) and neurotransmitter systems (Hansen et al., 2022).
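One reason shared summary data are so useful is that they can be combined across cohorts without access to participant-level data. The sketch below shows a fixed-effect, inverse-variance weighted meta-analysis applied vertex by vertex. It is a minimal illustration under stated assumptions: each cohort is assumed to share an effect estimate and standard error per vertex on a common scale, and the function names and example numbers are hypothetical.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance meta-analysis of one vertex across cohorts.

    betas, ses: per-cohort effect estimates and their standard errors.
    Returns the pooled estimate, its standard error, z statistic and p-value.
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and equally long")
    weights = [1.0 / se ** 2 for se in ses]   # more precise cohorts weigh more
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal p-value
    return beta, se, z, p


def meta_analyse_maps(cohort_maps):
    """Pool whole association maps.

    cohort_maps: one list per cohort, each holding a (beta, se) tuple per
    vertex, with vertices in the same order in every cohort.
    """
    results = []
    for per_vertex in zip(*cohort_maps):
        betas = [b for b, _ in per_vertex]
        ses = [s for _, s in per_vertex]
        results.append(inverse_variance_meta(betas, ses))
    return results


if __name__ == "__main__":
    # Two hypothetical cohorts, two vertices each (made-up numbers).
    cohort_a = [(0.10, 0.05), (0.00, 0.05)]
    cohort_b = [(0.20, 0.10), (0.02, 0.10)]
    for i, (beta, se, z, p) in enumerate(meta_analyse_maps([cohort_a, cohort_b])):
        print(f"vertex {i}: beta = {beta:.3f}, se = {se:.3f}, z = {z:.2f}, p = {p:.3g}")
```

The pooled standard error is always smaller than that of any single contributing cohort, which is the statistical motivation for sharing summary-level maps. Harmonising scanners, preprocessing and measurement scales across cohorts is a separate problem that this sketch does not address.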

Lesson 5: use multivariate approaches that were originally developed for genetics research
Alternative multivariate techniques exist that simultaneously map thousands of biological markers onto a trait or disease. Multivariate techniques do not require extensive multiple testing correction, and they therefore have more statistical power than mass-univariate methods. For example, the genetics technique genome-wide complex trait analysis (GCTA) (Yang et al., 2011) is ubiquitously used to estimate heritability. Implemented in efficient software, the GCTA framework employs linear mixed models fitting millions of variables as a vector of random effects, to quantify trait variance accounted for by genome-wide markers (i.e., heritability), while recognising the correlation structure between them. Recent neuroimaging studies repurposed GCTA to estimate morphometricity, which is the trait variance explained by brain-wide measures (Couvy-Duchesne et al., 2020).
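For readers who prefer notation, the linear mixed model described above is commonly written along the following lines. This is a sketch of the standard formulation rather than a quotation from the cited papers, and the symbols are labels chosen here for illustration.

```latex
% y: trait values for N participants
% X: covariates, with fixed effects beta
% Z: N x M matrix of standardised vertex-wise (or voxel-wise) measures
% b: aggregate brain-wide random effect; epsilon: residual
% B: brain relatedness matrix; m^2: morphometricity
\begin{align}
  \mathbf{y} &= \mathbf{X}\boldsymbol{\beta} + \mathbf{b} + \boldsymbol{\varepsilon},
  & \mathbf{b} &\sim \mathcal{N}\!\left(\mathbf{0},\, \sigma^2_b \mathbf{B}\right),
  & \boldsymbol{\varepsilon} &\sim \mathcal{N}\!\left(\mathbf{0},\, \sigma^2_e \mathbf{I}\right), \\
  \mathbf{B} &= \frac{\mathbf{Z}\mathbf{Z}^{\top}}{M},
  & m^2 &= \frac{\sigma^2_b}{\sigma^2_b + \sigma^2_e}.
\end{align}
```

If the columns of Z hold standardised genotypes instead of brain measures, B plays the role of a genetic relationship matrix and the same ratio is interpreted as marker-based heritability, which is why one software framework can serve both fields.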
All traits are heritable (Turkheimer, 2000), and given heritability and morphometricity have an analogous statistical definition, it is unsurprising that most traits are also considerably morphometric (Couvy-Duchesne et al., 2020). A recent study applied the GCTA framework to neuroimaging data, and compared the variance accounted for by ~300,000 cortical measures with variance accounted for by coarser brain atlases (Fürtjes et al., 2023). It demonstrated that atlas-based representations of the cortex explained only a fraction of the morphometricity that was explained by vertex-wise measures, which highlights that considering brain-wide vertex- (or voxel-) wise measures maximises the potential of uncovering neuronal underpinnings of traits and diseases. As in candidate gene approaches, coarse representations of the cortex using ROIs do not reliably account for trait variance.

Footnote 1: I am unaware of a study reporting a median sample size for inter-individual neuroimaging studies only. Marek et al. (2022) and Szucs and Ioannidis (2020) report a median sample size of N = 25; however, many of the studies included in this figure do not make inferences at the individual level. If the goal is to make inferences at the individual level, as is the focus in this Viewpoint, N = 25 is insufficient. When it is the aim to create a task average contrast map using functional imaging, for example, N = 25 may be sufficient. Note that a parallel between neuroimaging and genetics research may be extended to brain maps, which could parallel research on genetic components that are identical across all humans (i.e., not varying across the population); however, this discussion is beyond the scope of this Viewpoint.

Cortex 168 (2023) 76-81
Critics may argue that modelling vertex- (or voxel-) wise data using genetics frameworks would disregard the decades' worth of brain sciences that derived brain atlases to help interpret brain-trait associations. To facilitate more biologically meaningful interpretation, Couvy-Duchesne et al. (2020) demonstrate that the GCTA framework permits integrating prior knowledge about brain organisation by grouping vertex-wise measures based on the researcher's input, and fitting each set of vertices as random effects. This analysis has the advantage that it still models vertex-wise cortical structure, while it drastically reduces the multiple testing burden compared with mass-univariate testing, as it only performs a single association test per set of vertices. I suggest this framework has the potential and flexibility to fully replace ROI-based studies with robust vertex- (or voxel-) wise approaches.
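One way such a grouped analysis could be written down is as a mixed model with one random effect per set of vertices. The notation below is an illustrative sketch under that assumption, not the exact specification used by Couvy-Duchesne et al. (2020); all symbols are labels introduced here.

```latex
% y: trait values for N participants; X: covariates, with fixed effects beta
% Z_k: N x M_k matrix of standardised measures for the vertices in set k
% b_k: random effect of set k; B_k: relatedness matrix built from set k only
% m^2_k: share of trait variance attributed to set k
\begin{align}
  \mathbf{y} &= \mathbf{X}\boldsymbol{\beta} + \sum_{k=1}^{K} \mathbf{b}_k + \boldsymbol{\varepsilon},
  & \mathbf{b}_k &\sim \mathcal{N}\!\left(\mathbf{0},\, \sigma^2_k \mathbf{B}_k\right),
  & \mathbf{B}_k &= \frac{\mathbf{Z}_k \mathbf{Z}_k^{\top}}{M_k}, \\
  m^2_k &= \frac{\sigma^2_k}{\sum_{j=1}^{K} \sigma^2_j + \sigma^2_e},
  & H_0 &: \sigma^2_k = 0 \quad (k = 1, \dots, K).
\end{align}
```

Written this way, the number of association tests equals the number of sets K rather than the number of vertices, while every vertex still contributes to its set's relatedness matrix.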

Limitations of translating genetics practices to neuroimaging research
It must be noted, however, that the discussion above only applies to studies researching inter-individual traits that commonly vary across the general population. Depression is a prominent example, as it affects about 15% of people at some point in their lives (Bromet et al., 2011). It is precisely this variance across the population that the methods discussed above leverage to draw inference. Those methods would be inappropriate to model rarer monogenic traits. Huntington's disease, for example, only affects 1 in 10,000 people and was linked to one single gene coding for a protein called huntingtin (Coleman et al., 2021), which would not map onto models of polygenicity. Nonetheless, parallels between the fields hold as monogenic traits in genetics research are analogous to lesion-based neuroimaging studies. The latter link localised brain lesions (that often result from rare accidents) with very specific loss of cognitive function (e.g., Scoville & Milner, 1957). Both monogenic and lesion-based correlates are rare, they both have large effect sizes, and smaller clinical samples are sufficient to detect them.
A meaningful application of genetic methods to neuroimaging data must consider differences in data structures, which dictate the interpretation of results in their genetic- or brain-specific context. Primary among these differences is that genetic markers are inherited at conception and remain unaltered across the lifespan, while the brain evolves with its environment. Thus, genetic propensity towards a trait can imply directionality of effects, which cannot be inferred from neuroimaging studies. Genetic and neuroimaging data both contain interdependent measures, but the architecture of this interdependence is different, which may affect techniques deriving genetic or brain-based trait overlap. It complicates interpretation of genetic studies that trait correlates often sit in parts of the genome with complicated regulatory functions and no direct coding functions (Visscher et al., 2017). In comparison, the interpretation of vertex- (or voxel-) wise brain associations is trivial, as the strongest associations are between vertices (or voxels) in physical proximity.
This Viewpoint aimed to translate specific genetics research practices to vertex-wise neuroimaging data. However, interdisciplinary lessons may also be transferred in the opposite direction, from neuroimaging to genetics research. An example may be that neuroimaging research derives sophisticated methods to adjust for multiple testing (e.g., for non-stationary spatial correlations; Davey et al., 2021) that could guide future genetic studies to more powerfully account for linkage disequilibrium and correlated patterns of trait associations. More broadly, it appears that making sense of genetic and neuroimaging data, or any other multivariate data, is marked by similar challenges and opportunities, which become evident when considering interdisciplinary parallels in data structures and analysis techniques. Hence, future scientific progress may benefit from exploring parallels between all research disciplines and translating unique insights across them.

Conclusion
Population traits have both complex genetic and brain-based biology, and here I argue that practices firmly established in genetics research can be directly applied to improving neuroimaging studies. Based on striking parallels between the two fields, this Viewpoint transferred lessons drawn from the field of Complex Trait Genetics to Neuroimaging, which promises more robust discoveries if widely endorsed by the neuroimaging community. The failure of both candidate gene studies and neuroimaging studies mapping individual brain regions onto inter-individual traits to produce reliable findings illustrates that future efforts should keep increasing sample sizes, and counteract noisy findings by correcting for multiple testing. Genetics research teaches that we can improve replicability by abandoning hypothesis-driven, overly simplistic approaches, and by adopting hypothesis-free methods exemplified by GWAS and GCTA. Sharing summary-level association data will boost large-scale collaboration and meta-analysis. Those genetic practices, applied to vertex- (or voxel-) wise neuroimaging data, promise an acceleration of the evolution of neuroimaging studies, without requiring painstaking innovation of neuroimaging methods that would overcome the same challenges that Complex Trait Genetics already solved.

Disclosures
No conflicts of interest. This manuscript does not report new experimental data.

Credit author statement
This Viewpoint was conceived, written and edited by AEF. At the time of writing, AEF was funded by the Social, Genetic and Developmental Psychiatry Centre, King's College, London, and the National Institutes of Health grant R01AG054628.