Impact of gene-by-trauma interaction in MDD-related multimorbidity clusters

Background: Major depressive disorder (MDD) is considerably heterogeneous in terms of comorbidities, which may hamper the disentanglement of its biological mechanism. In a previous study, we classi�ed the lifetime trajectories of MDD-related multimorbidities into seven distinct clusters, each characterized by unique genetic and environmental risk-factor pro�les. The current objective was to investigate genome-wide gene-by-environment (G×E) interactions with childhood trauma burden, within the context of these clusters. Methods:


Introduction
Major depressive disorder (MDD) is the most common psychiatric disorder in Western countries and caused by a combination of genetic predisposition and various non-genetic factors (Flint, 2023).The heritability of MDD, based on additive genetic variation, has been estimated to lie around 25% (Flint, 2023).Additionally, numerous factors have been identi ed that increase the risk of developing depressive episodes throughout lifetime.These are factors primarily associated with an unfavorable lifestyle, such as smoking, alcohol consumption, physical inactivity, and poor diet, or associated with stressful living conditions, including childhood adversity, stressful life events, and unemployment (Sarris et al., 2020).Unfortunately, MDD is highly heterogeneous on the clinical level, with variations in individual symptoms and the development of multiple co-occurring disorders.This often leads to negative effects caused by polypharmacy and a decreased quality of life for patients and makes it even more challenging to identify biological causes of MDD (Rush et al., 2006;Fried and Nesse, 2015;Fratelli et al., 2020).
Explaining this complex relationship between predisposing genetic factors and external risk factors on a biological level is di cult.Essentially, this means that the impact of a speci c genetic variation on an individual's phenotype depends on the exposure to external risk factors (known as gene-by-environment (G×E) interaction) (Karg and Sen, 2012).Childhood adversities have been identi ed as a signi cant risk factor for the development of various psychiatric disorders, including MDD (Mandelli and Serretti, 2013).In genetic studies, previous attempts to uncover such genetic interaction effects for childhood adversity have mostly concentrated on speci c candidate genes involved in neurotransmitter systems (Mandelli and Serretti, 2013;Culverhouse et al., 2018;Border et al., 2019;Li et al., 2020a), or combined results of large genome-wide association studies (GWAS) into polygenic scores (PGS) (Peyrot et al., 2018;Coleman et al., 2020).In a genome-wide interaction approach using data from the Psychiatric Genomics Consortium (van der Auwera et al., 2018) the moderation effect of prominent candidate genes and childhood adversities on MDD could not be supported.To date, interaction analyses have not yielded new biological models, and the underlying mechanisms still remain elusive.Several factors could account for this, including limited sample sizes, as well as strong heterogeneity within the diagnosis of MDD regarding individual symptoms and co-occurring multimorbidities, which may share both genetic and non-genetic risk factors (Flint, 2023).
In the TRAJECTOME (Temporal disease map-based strati cation of depression-related multimorbidities, Juhasz et al., 2023)project, we generated temporal disease trajectories of 86 pre-selected multimorbidities related to MDD in over 1.2 million individuals with the aim to identify clusters that represent temporal courses of MDD-related multimorbidity burden over an individual's lifetime.These seven clusters are characterized by a unique genetic and non-genetic risk factor pro le (Juhasz et al., 2023), showing a clear differentiation between high-and low-risk clusters in terms of MDD and MDDrelated disorders.The signi cant contributions of both genetic predisposition and external risk factors to these clusters suggest a possible G×E interaction.In our current analysis, we aim to identify speci c clusters where the interaction between genetic factors and childhood trauma burden may in uence the assignment to the MDD-related multimorbidity clusters.We expect that the contribution of G×E will vary across clusters, not only in terms of strength of the G×E signal but also in the speci c genes and biological mechanisms involved.

UK Biobank study population and measurements
For our analysis, we used data from the UK Biobank (UKB) under the application number 1602, which contains comprehensive medical, phenotypic and genotypic information from participants recruited based on the NHS patient registers of people aged 40-69 years (Smith et al., 2013).All participants gave written informed consent, and ethical approval was obtained from the National Research Ethics Service Committee North West-Haydock (Nagel et al., 2018).All procedures were in accordance with the Declaration of Helsinki.

Cluster membership outcome variable
Within the TRAJECTOME project, seven clusters were identi ed that re ect an individual's MDD-related multimorbidity burden throughout lifetime (Juhasz et al., 2023).
To construct these clusters, we utilized temporal disease information from a total of 502,504 participants from the UK Biobank, along with 687,005 participants from two other general population cohorts (THL cohorts from Finland (Sund, 2012) and CHSS from catalonia (Farré et al., 2016)).The disease trajectories were categorized into four different cumulative time intervals (aged , , , and ).We included 86 diseases with a minimum prevalence of 1% and signi cant relevance to MDD within any of the time intervals to compute MDD-related multimorbidity scores for each time interval and each participant, respectively.Participant clustering of these scores identi ed seven distinct clusters re ecting temporal courses of the MDD-related multimorbidity burden throughout the lifespan (Fig. 1).The clusters correspond to speci c clinical subtypes with high and low MDD-related multimorbidity burden (see Table 3 for the ve diseases with the most increased and decreased prevalence within each cluster).The clusters are characterized as follows: Participants with a high probability for Clusters 1-4 have a decreased depression prevalence and a lower MDD-related multimorbidity burden whereas participants with a higher probability for Clusters 5-7 show an increased prevalence for depression and a higher overall MDD-related multimorbidity burden.For more details on the clustering procedure and clinical characteristics of the clusters see Geszi et al. (Juhasz et al., 2023).
To account for incomplete and uncertain participant trajectories in the analyses, we excluded individuals who were under 60 years of age and had a maximum posterior probability of less than 0.25 for membership in any cluster (Juhasz et al., 2023).Our outcome variables were the posterior probabilities (log-odds) of the individuals' cluster membership for each of the identi ed seven MDD-related multimorbidity clusters (Juhasz et al., 2023).

Childhood trauma assessment
Childhood trauma was assessed within the UKB mental health online follow-up (Davis et al., 2020) using the Childhood Trauma Screener (CTS) (Walker et al., 1999;Glaesmer et al., 2013), the short form of the more comprehensive Childhood Trauma Questionnaire (CTQ) (Glaesmer et al., 2013;Klinger-König et al., 2022).The ve-item CTS includes one question for each dimension of childhood trauma: emotional abuse (UKB data eld 20487), physical abuse (data eld 20488), sexual abuse (data eld 20490), emotional neglect (data eld 20489), and physical neglect (data eld 20491).Answers are given on a ve-point Likert-scale from "never true" to "very often true".A summary score was calculated that re ects the burden of childhood trauma experienced, with a range of 0-20.The score was only calculated for participants that answered all of the ve questions.

Genetic data and GWAS
The genomic quality control has been detailed elsewhere (Juhasz et al., 2023).
Variants' quality control included ltering out multiallelic variants and variants with a minor allele frequency (MAF) < 0.01, keeping only common biallelic single-nucleotide polymorphisms (SNPs) (Juhasz et al., 2023).For imputed SNPs, both info and certainty parameters had to be at least 0.9.Further, we excluded participants and SNPs according to missingness rate (iteratively, with cut-off points of 0.1, 0.05, and 0.01), as well as SNPs according to Hardy-Weinberg equilibrium violation (p < 1×10 − 5 ) (Juhasz et al., 2023).Before further steps of participant ltering, a linkage disequilibrium (LD) pruning was applied on SNPs with an r 2 of 0.2 (Juhasz et al., 2023).The maximal set of unrelated individuals (dataeld 22020) was selected (Bycroft et al., 2018), and a king-cutoff "--kin 0.044" ltering step was done (Juhasz et al., 2023).Furthermore, a sex check and a heterozygosity outlier detection (Eszlari et al., 2019) was applied on participants (Juhasz et al., 2023).X chromosome and the pseudoautosomal regions of the two sex chromosomes were included in the analyses, in addition to autosomal chromosomes.Males' haploid genotypes are coded as if they are homozygous.
For the GWAS analyses, we selected participants who passed the above GWAS quality control steps, had non-withdrawn consent in February 2022, and had non-missing data according to sex, age, all CTS questions, cluster memberships, and genotyping array.To control for population strati cation, principal component analysis was run in the nal set of participants (N = 76,856) and with the SNP subset after LD pruning detailed above (similarly to Juhasz et al., 2023).Plink 2.0 (Chang et al., 2015) (https://www.cog-genomics.org/plink/2.0/)was used to run linear regression models and assess the moderation effect of each remaining SNP in interaction with the CTS sum score on the seven MDD-related cluster probabilities as outcome.Additional predictors of each model were age, sex, the rst ten genetic principal components, genotyping array, and main effects of the SNP and CTS score.Age was taken into the model as a non-linear variable using cubic splines with a knot at age 60 (calculated by R package splines, and function bs) (Juhasz et al., 2023).Continuous predictor and outcome variables were all standardized in the analyses, as in (Juhasz et al., 2023).Genetic and phenotypic data were available for 76,856 participants.Effect sizes and p-values of the interaction term were evaluated, and entered into post-GWAS analyses.

Post-GWAS analysis
To assess the impact of SNP-interaction results on biological processes, several post-GWAS analyses were applied that extract information on signi cant loci, genes and pathways.
The G×E GWAS summary results for all seven MDD-related clusters were rst processed with FUMA (https://fuma.ctglab.nl/,Watanabe et al., 2017) to identify lead SNPs and signi cant loci.The maximum p-value of lead SNPs was set to 5×10 − 6 , r 2 ≥ 0.6 was set as threshold for independent signi cant SNPs and the maximum distance between LD blocks of independent signi cant SNPs was set to 250kb.Furthermore, MAGMA (Leeuw et al., 2015) gene-level analysis was performed to identify putative signi cant genes and gene sets.We de ned the SNP set of each gene with an extended +/− 10 kb downstream or upstream of the gene.We used the UK Biobank Genome European panel data to evaluate the LD information between SNPs.Statistical comparison of results between the clusters were done in R (https://cran.r-project.org/).

Candidate SNPs and gene selection
To compare our results with previously identi ed and discussed candidate variants and genes for G×E interaction in the light of childhood maltreatment and depression, we selected two recent papers (Border et al., 2019;Li et al., 2020a) and included the following 31 genes: SLC6A4, BDNF, COMT, HTR2A, TPH1, TPH2, MAOA, DRD2, DRD3, DRD4, MTHFR, APOE, CLOCK, SLC6A3, ACE, ABCB1, DTNBP1, DBH, CRHR1, FKBP5, CREB1, NTRK2, OXTR, IL1b, IL6, IL11, CRP, TNF, TNFRSF1A (TNFR1), TNFRSF1B (TNFR2), and GABRG2.These genes capture common biological mechanisms for G×E interaction in depression such as the serotonergic system, hypothalamic-pituitary-adrenal (HPA) axis or immune-related processes (Remes et al., 2021).To analyze these genes, a window of +/−1kb around the GRCh37/hg19 position was used, to mainly focus on SNPs within the coding region (compared to +/−10kb in the MAGMA analysis above).Within these genes, all SNPs available in the UKB genetic data and surviving the ltering process above were selected also including their putative candidate variants if available.These candidate variants are the most commonly investigated SNPs in the literature as listed by Border and Li (Border et al., 2019;Li et al., 2020a) (see Supplementary Table S4).In our study, we consider only candidate SNPs that are biallelic, which is the case in 20 out of the 31 candidate genes.
On the SNP-level p-values and effect estimates from the GWAS were analyzed, on the gene-level MAGMA-based p-values were investigated for each MDD-related cluster.
Candidate SNPs that showed a nominally signi cant G×E interaction for any cluster membership in our present results were investigated using the Oxford Brain Imaging Genetics Server (BIG40) to look up signi cant SNP associations with brain imaging phenotypes in the UK Biobank cohort itself (https://open.win.ox.ac.uk/ukbiobank/big40/pheweb33k/,Elliott et al., 2018;Smith et al., 2021).

Results
We analyzed data from 76,856 UKB participants with a mean age of 58.36 years, ranging from 40 to 72 years, of which 54% were female (characteristic of the sample see Supplementary Table S1).As these are data from the general population, the intensity of childhood trauma was relatively low with a mean CTS score of 1.67 and a range between 0 and 20 (see Supplementary Fig. S1).
On the phenotypic level, the correlations between posterior log-odds cluster membership probabilities and the CTS score (see Supplementary Table S2) re ected a pattern of high and low risk clusters that has already been observed for the clinical characteristics of the clusters (Juhasz et al., 2023).In detail, Clusters 1-4 showed a strongly signi cant but weak negative correlation with the CTS score, meaning that participants with a high probability to belong to the clusters tend to have a lower burden of childhood trauma.In contrast, the correlations with Clusters 5 and 6 were strongly signi cant but positive, re ecting a higher childhood trauma burden for subjects belonging to these clusters.Cluster 7, instead, revealed no signi cant correlation with the CTS score (see Supplementary Table S2).

Interaction GWAS for the seven MDD-related multimorbidity clusters
We performed G×E GWAS analyses in 76,856 UKB participants with complete phenotypic and genetic data using 3,875,386 genotyped or imputed SNPs (Table 1).Analyses for all seven clusters only revealed one genome-wide signi cant locus (p < 5×10 − 8 ) for Cluster 6 on the X chromosome with lead SNP rs145772219 (Supplementary Fig. S2 -S4).Thus, we set the level of suggestive signi cant SNPs to p < 5×10 − 6 to interpret their impact in interaction with the CTS score.Applying this new threshold, suggestive signi cant SNPs were found for all clusters with 291 distinct SNPs spanning 57 risk loci on all chromosomes except chromosomes 12, 22, and pseudoautosomal regions of sex chromosomes (Table 1, Fig. 2, Supplementary Table S3, Supplementary Fig. S2).The strongest genetic signal was found for Cluster 5 (132 signi cant SNPs, 19 signi cant loci) and Cluster 6 (103 signi cant SNPs, 28 signi cant loci) (Table 1, Fig. 2, Supplementary Table S3, Supplementary Fig. S2).In Clusters 3, 4, and 7 less than seven SNPs reached signi cance.The MAGMA gene-based analyses revealed three genome-wide signi cant genes after Benjamini-Hochberg correction for multiple testing using 18,937 protein-coding genes; one in Cluster 1 (LDLRAD4 p = 9.8×10 − 7 ) and two in Cluster 7 (C6orf89 p = 2.1×10 − 6 , TAAR2 p = 1.8×10 − 6 ).Lowering the signi cance threshold to p < 1×10 − 4 , additional genes emerged (see the genes in parentheses in Table 1, Supplementary Fig. S5).Even at this level no signi cant gene was found for Cluster 3. On the level of gene sets, MAGMA analysis identi ed two gene sets signi cant after multiple testing correction using 15,486 gene-sets (GO_cc: go_mon1_ ccz1_complex for Cluster 2 and Curated_gene_sets: borlak_liver_cancer _egf_up for Cluster 6).On the G×E GWAS-level, the cluster-wise correlations of beta values again re ected the pattern of high-and low-risk clusters (Supplementary Fig. S6), which was also observed in the correlation between CTS score and the cluster membership probabilities (see Supplementary Table S2).Lambda: genetic in ation factor; *signi cance of SNPs refers to a suggestive signi cance level p < 5x10 − 6 ; **results from MAGMA analyses, signi cance based on Benjamini-Hochberg correction, using 18,937 protein-coding genes.In brackets additional genes with suggested signi cance p < 1x10 − 4 , ***results from MAGMA analyses, signi cance based on Benjamini-Hochberg correction, using 31 genes.In brackets additional genes with suggested signi cance p < 0.05.

Evaluation of previously identi ed candidate genes and SNPs
To compare our ndings with previous results and hypotheses on the interaction between SNPs and childhood trauma on depression, we selected 31 putative candidate genes from the literature (see methods section Candidate SNPs and gene selection).For these genes, 3,788 SNPs were available in the UKB genetic data and passed the lter procedure (Supplementary Table S4), with the highest number situated within the CLOCK gene (N = 490) and the lowest within APOE, DRD4 and TPH1 (N = 8) before pruning.After gene-wise pruning (r 2 = 0.8), 937 independent SNPs were available, with the highest number in the NTRK2 gene (N = 110) and the lowest within DRD4 and TPH1 (N = 6).As candidate SNP studies often apply a lenient signi cance threshold of p < 0.05, we also applied this threshold in a rst screening approach to not miss potential informative ndings.
For each gene in the pruned list we found at least one nominal signi cant SNP in at least one of the seven cluster GWASes (Supplementary Table S5).The highest number of independent signi cant SNPs was found for the genes NTRK2, ABCB1, and DBH.Moreover, ten genes revealed a nominal signi cant effect for at least one SNP in all seven clusters (MTHFR, OXTR, SLC6A3, GABRG2, DTNBP1, FKBP5, IL6, NTRK2, DBH, DRD2) whereas four genes (IL1B in Cluster 1, ACE in Cluster 5, APOE in Cluster 5, CRHR1 in Cluster 6) showed signi cant SNPs in only one cluster.For three genes we observed individual SNPs that were signi cant in at least ve of the seven clusters (rs74853132 of FKBP5, rs4936274 of DRD2, 4 SNPs of MTHFR; Supplementary Tables S6 -S8, Supplementary Fig. S7 -S8).However, directions of effect were different among the clusters depending on the cluster correlation with high or low childhood trauma burden (Supplementary Fig. S9).Since our analysis was limited to common biallelic SNPs, some genetic candidate variants such as indels and variable number tandem repeats could not be included in this evaluation.
Clusters 6 (N = 85) and 2 (N = 72) came up with the highest number of independent signi cant SNPs within candidate genes, whereas Cluster 4 revealed the lowest number (N = 51) (Supplementary Table S5).
Out of the 18 candidate SNPs that were available, either directly or as proxy (r 2 > 0.8), ve exhibited a nominal signi cant association in at least one of the cluster GWASes (ABCB1 rs2235048, NTRK2 rs1147194, TNF rs1041981, IL6 rs367801961 and rs60056354, TPH1 rs1800532).None of these SNPs remains signi cant after Benjamini-Hochberg p-value correction.Looking these SNPs up in the BIG40, all ve SNPs revealed a signi cant association towards brain imaging phenotypes in the UKB cohort supporting their impact on psychiatric disorders (see Table 2 for speci c parameters and effect alleles).On the level of the 31 candidate genes, we evaluated the MAGMA gene based results for the seven Cluster GWASes (Supplementary Table S9).Applying a Benjamini-Hochberg p-value correction in each cluster separately, three clusters came up with signi cant genes: DRD2 in Cluster 1, DBH and MTHFR in Cluster 5 and TPH1 in Cluster 6 and 13 genes reached nominal signi cance in any of the clusters (Supplementary Table S9).A signi cance heatmap (Fig. 3) revealed that Clusters 1 and 2 as well as Clusters 4 and 7 were most similar regarding their signi cance pattern across the candidate genes, which was not consistent with the pattern observed on the genome-wide level or the association pattern with the CTS score (Supplementary Fig. S6).The strongest genetic signal throughout all clusters was found for the genes TPH1, OXTR, DRD2, MTHFR, and IL6.On cluster-level, the highest number of nominal signi cant candidate genes was found for Clusters 5 (N = 6) and 1 (N = 4) with the overlapping gene DRD2.The two genes ACE and CRHR1 revealed barely a signi cant SNP and were not signi cant in any of the clusters.

Discussion
In the current analysis we investigated the interaction effect between SNP-based genetic variation and childhood trauma on seven MDD-related multimorbidity clusters.These clusters re ect the temporal courses of MDD-related multimorbidity burden throughout life and could initially be associated with a unique clinical, genetic and modi able risk-factor pro le (Juhasz et al., 2023).Here, we extend this direct genetic characterization by moderation effects investigating childhood trauma, one of the strongest risk factors for MDD and other psychiatric diseases in general.In seven G×E GWAS including 76,856 UK Biobank participants, we replicated the pattern of high-and low-multimorbidity clusters concerning childhood trauma burden which has already been found on the level of genetic and non-genetic factors (Juhasz et al., 2023).The strongest genetic ndings in the G×E analyses could be observed for the high CTS burden Clusters 5 and 6 with more than 19 independent loci exceeding suggestive signi cance.
These clusters were also associated with a high MDD-related multimorbidity burden.On the genome level, the correlation pattern showed a strong similarity between Clusters 1-4 which were all associated with a lower CTS burden.In contrast, Clusters 5-7 seemed to exhibit three individual but contrary genetic pro les that contribute to the high multimorbidity load in individuals with high CTS burden.Thus, childhood trauma might promote the development of certain diseases by altering biological pathways and metabolic processes that might be traced back to the identi ed genes (Table 1, Table 3).From a clinical point of view, especially the Clusters 5, 6 and 7 are of interest, as they give insights into the genetic risk-pro les for the development of certain diseases depending on their childhood trauma burden.As our clusters are based on depression-related multimorbidity trajectories, we selected the ve diseases with the most increased and decreased prevalences within each cluster (Table 3) to draw biological connections towards the suggestive genes identi ed in our analysis.For Cluster 6, our G×E analysis identi ed FOXA3, DAW1, B3GNT7, TREX2, SBK2, FEM1A, HAUS7 as suggestive genes.FOXA3 might be involved in the regulation of allergic airway diseases and asthma (Park et al., 2009).As asthma is a potential risk factor for migraine and vice versa, also possible connections between FOXA3 and migraine are conceivable (Wang et al., 2020).Asthma as well as migraine are usually triggered by stress.DAW1 is associated with allergy (Waage et al., 2018), which aligns with allergic rhinitis being one of the high prevalence diseases in Cluster 6.For the remaining genes no associations were reported.However, B3GNT7 may play a role in the formation of neurophils and perineuronal nets in the adult brain (Takeda-Uchimura et al., 2022).
The genetic risk pro le for Cluster 7 includes the genes C6orf89, TAAR2, PNLDC1 and IL17A.The gene C6orf89 encodes the bombesin receptor-activated protein (BRAP) which might be involved in the stress response of lung epithelia (Liu et al., 2016;Xu et al., 2017) and can be linked to allergic rhinitis and asthma.In mice, the BRAP homologous protein may have a protective effect on the behavioral response to stress via regulating dendritic spine formation and synaptic plasticity in the hippocampus (Yao et al., 2023).As a consequence, it was concluded that chronic stress might cause damage to hippocampus function (Yao et al., 2023) A link towards the role of PNLDC1 instead could not be found.
With respect to the candidate genes, we did not con rm their impact in G×E analyses on our MDD-related multimorbidity clusters as a whole.However, some genes suggest a biological connection towards speci c individual clusters, underscoring biological heterogeneity stemming from distinct temporal patterns of MDD-related multimorbidity.
From the 18 available candidate SNPs, 5 (located in ABCB1, NTRK2, TNF, IL6, TPH1) showed a nominally signi cant interaction in at least one cluster (Supplementary Table S4).From these 5 SNPs, no one had a signi cant interaction in more than three clusters.Hence, our observation that they are not signi cant in all clusters is in line with  and 5, where the interaction effect with IL6 is not signi cant.The strong genetic signal of IL6 might be due to the close interrelation between in ammation, stress and depression (Ting et al., 2020).There, it was shown that psychosocial stress acts as a trigger for depression development by initiating changes in HPA-axis and immune/in ammatory system (Ting et al., 2020).Also, results for TPH1, which catalyzes the rst and rate limiting step in the biosynthesis of serotonin, revealed signi cant interactions on SNP (Cluster 1, Cluster 2 and Cluster 6) as well as on gene level (Cluster 6) in clusters that tend to have a low as well as high CTS burden.As TPH1 is associated with a broad range of psychiatric conditions (Shnayder et al., 2022) this might explain the link toward these clusters.Different effect directions for TPH1 on SNP level are re ected by the decreased prevalence rates of psychiatric conditions in the low-risk Clusters 1 and 2 in contrast to the increased prevalence rates for the high-risk Cluster 6.On a gene-based level several candidates might be of special interest as they either survive multiple testing within a cluster or show a strong association pattern towards several clusters.
The gene DRD2 (which encodes a D2 subtype of the dopamine receptor) showed a signi cant interaction effect in several clusters (Clusters 1, 2, 5, and 7; Fig. 3, Supplementary Table S9) and was the only gene that could be con rmed on the gene-level by Border et al. (2019).DRD2 is associated with schizophrenia (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014; Shioda, 2017), which is among the top ve associated disorders in all of these clusters.Interestingly, similar to Border et al.
(2019), we found no signi cant interaction effect of the candidate SNP (rs1800497).This SNP was initially assigned to the DRD2 gene, but was later found to be located in the ANKK1 gene (https://www.ncbi.nlm.nih.gov/gene/1813).Instead, we found a signi cant interaction for rs4936274 in Clusters 1-5, and 7 (Supplementary Table S6) which is not in LD with the historical candidate SNP.Hence, we propose that rs4936274 might be an alternative candidate SNP for the DRD2 x CTQ interaction on depression, as we observe a clear reversal of the effect direction when comparing clusters with high vs low MDD burden.Furthermore, the DBH and MTHFR genes revealed signi cant interaction results with CTS in the genebased analyses in Cluster 5 (Supplementary Table S9), although the proposed candidate SNP was not available in our analyses.In addition, MTHFR had a nominally signi cant interaction in Clusters 3 and 4 (Supplementary Table S9) and four independent MTHFR SNPs (r 2 < 0.8; Supplementary Fig. S7) showed a signi cant interaction in ve clusters which might also be due to the broad association of MTHFR with neurological and psychiatric disorders (Liew and Gupta, 2015;Zhang et al., 2022).
The oxytocin receptor gene (OXTR) was signi cant in Clusters 1, 2 and 6 (Supplementary Table S9).Our dataset does not contain the candidate SNP, but in other studies that SNP had no signi cant interaction effect (Tollenaar et al., 2017;van der Auwera et al., 2018).Instead, we found two independent SNPs (rs60345038, rs62243375) that showed a nominally signi cant interaction in Cluster 1, 2 and 6.One of them (rs60345038) also had a signi cant association with social cognitive performance in individuals with schizophrenia (Davis et al., 2014) and could also be a novel risk variant that is possibly linked to and associated with familial type 2 diabetes (Amin et al., 2023).
Our study has several limitations: The CTS, being a retrospective self-reported measurement, is likely to be in uenced by recall bias (Baldwin et al., 2019).Further, the cluster probabilities are based on MDDrelated multimorbidities, which makes it di cult to compare our results with previous G×E ndings for depression, although we present results for the rst analysis on G×E interaction for temporal MDDmultimorbidity clusters.In addition, the cluster assignment strongly depends on the reliability of the data from the healthcare system where missassignments can lead to wrong results for the G×E analysis.Due to strong correlations among our parameters (MDD, diseases, environmental factors and CTS), interpreting the correlations between GWASes in terms of causes and mechanisms may prove challenging.Results could be biased by the direct GWAS results for the clusters.
To conclude, our results underscore that some of the former candidate SNPs exert their effects on MDDrelated multimorbidity patterns depending on the level of childhood trauma.Such multimorbidity patterns may explain previously inconclusive results on G×E analyses.This genetically based susceptibility for early trauma effects may root in differences in brain phenotypes.Each SNP can have its distributed impacts across numerous brain regions (van der Meer et al., 2020), and these brain-wide differences may establish inter-individual differences in sensitivity to environmental (e.g., traumatic) factors, and thus in multimorbidity patterns, as suggested by our present results.Furthermore, our ndings indicate that the moderation of SNP effects by CTS may exert a more prominent in uence on the high multimorbidity clusters compared to the low multimorbidity clusters.However, future studies are to reveal the exact etiopathological mechanisms from G×E SNPs through brain phenotypes towards multimorbidity patterns.Regarding the role of candidate SNPs, we conclude that rs4936274 is likely a better candidate SNP for the DRD2 × CTQ interaction on depression than the former candidate SNP and should be tested in further analysis.Investigating MDD-related multimorbidity patterns may be a promising approach in G×E analyses.

Figures
Figures

Table 2
Candidate SNPs that were nominally signi cant in an interaction term with CTS score for any cluster membership as outcome + our new suggested candidate SNP for DRD2.In addition, signi cant associations of these SNPs with brain imaging phenotypes in Oxford Brain Imaging Genetics Server are listed.Cluster memberships are the posterior probability (log-odds) of cluster membership to each of the identi ed seven MDD-related multimorbidity clusters.SNP: single-nucleotide polymorphism; CTS: Childhood Trauma Screener; MDD: major depressive disorder.

Table 3
(Shadrin et al., 2021)Brown et al., 2018)ed genes, which are genome-wide signi cant in the GxE analysis, and the ve diseases with the most increased and decreased prevalence within each cluster plus effect direction for depression.Wotton et al., 2022), three high prevalence diseases in Cluster 5(Juhasz et al., 2023).The already known psychiatric risk gene NRG3, in turn, might have a potential developmental role in schizophrenia(Avramopoulos, 2018;Li et al., 2020b), bipolar disorder and major depression(Paterson et al., 2017).Pain and psychiatric disorders were associated with childhood trauma by overlapping brain mechanism: It has been shown that regions of the brain involved in the pain matrix (such as the anterior cingulate cortex, the amygdala, or the hippocampus) are altered after experiences of childhood abuse and trauma(Teicher et al., 2003;Brown et al., 2018).However, neither VGLL2 nor DENND3 showed any associations with the top ve diseases with increased prevalence for Cluster 5, while DENND3 is associated with the volumes of different brain regions (e.g.cortical surface area(Shadrin et al., 2021)cortical thickness (Shadrin et al., 2021; van der Meer et al., 2021), cerebellum (Zhao et al., 2019; Chambers et al., 2022)) and Alzheimer's disease (Chung et al., 2022).
Border et al. (Border et al., 2019) and Li et al. (Li et al., 2020a) as they were also not able to con rm the impact of these SNP -Childhood Trauma interactions on MDD in general.However, the effect direction of the candidate SNP for NTRK2 in Cluster 4 was in line with previous ndings (Juhasz et al., 2011; van der Auwera et al., 2018; Li et al., 2020a).
on gene-level).The GWAS catalog lists an association between IL6 and asthma which is among the top ve diseases in Clusters 3, 4, 6, 7. Conversely, asthma is not among the top ve diseases in Cluster 1, 2,