Decoding depression by exploring the exposome-genome edge amidst COVID-19 lockdown

Risk of depression increased in the general population after the COVID-19 pandemic outbreak. By examining the interplay between genetics and individual environmental exposures during the COVID-19 lockdown, we have been able to gain insight into why some individuals are more vulnerable to depression, while others are more resilient. This study, conducted on a Spanish cohort of 9218 individuals (COVICAT), includes a comprehensive non-genetic risk analysis, the exposome, complemented by a genomics analysis in a subset of 2442 participants. Depression levels were evaluated using the Hospital Anxiety and Depression Scale. Together with Polygenic Risk Scores (PRS), we introduced a novel score, the Poly-Environmental Risk Score (PERS), for non-genetic risks, to estimate the effect of each cumulative score and of gene-environment interaction. We found significant positive associations for PERSSoc (Social and Household), PERSLife (Lifestyle and Behaviour), and PERSEnv (Wider Environment and Health) scores across all levels of depression severity, and for PRSB (Broad depression) only for moderate depression (OR 1.2, 95% CI 1.03–1.40). On average, the OR increased 1.2-fold for PERSEnv and 1.6-fold for PERSLife and PERSSoc from mild to severe depression. The complete adjusted model explained 16.9% of the variance. We further observed an interaction between PERSEnv and PRSB showing a potential mitigating effect. In summary, stressors within the social and behavioral domains emerged as the primary drivers of depression risk in this population, and we detected a mitigating interaction effect that should be interpreted with caution.

Depressive disorders are a leading cause of healthcare burden and one of the most common mental health conditions 1. It has previously been reported that increased risk for depressive disorders, their onset, and maintenance is influenced by lifespan environmental stressors such as physical or emotional abuse during childhood, which are strongly associated with the risk of developing major depressive disorders (MDD) 2. Furthermore, this relationship is complex and potentially changed by the co-occurrence of other composite risks such as genetic liability and its interaction with some of these environmental stressors, as has been shown for an additive effect of the depression polygenic risk score (PRS) and childhood abuse on depression 3. Genetic studies show that the underlying genetic architecture of depression is polygenic 4, influenced by many variants across the genome with individually small effects. Polygenic risk scores are widely used as a metric for additive genetic liability of a given trait or disease, and although polygenic risk scores for diagnosis of depression have shown limited clinical utility, they have proven very useful in etiological research 5,6. Each of these individual genetic and non-genetic factors has a relatively small effect, and a combination of them occurring at different points in the lifespan needs to coalesce to ultimately shape the outcome 7. This information is key to identifying the best preventative and treatment strategies for reducing the burden of depression. Nevertheless, it is unclear how environmental and genetic factors interact and the extent to which they contribute to the phenotype.

Subjects
All subjects belong to the COVICAT study (COVID-19 cohort in Catalonia). This is a prospective epidemiological study that aims to describe the health impact of the COVID-19 pandemic on the adult population in Spain [29][30][31]. The COVICAT study includes participants from six different pre-existing ongoing population-based cohorts in Catalonia established before the outbreak and developed following the COVID-19 pandemic. The largest proportion of the participants (88%) were sourced from the GCAT|Genomes for Life Study; the GCAT cohort study includes middle-aged participants (40-65 years of age) who are resident in Catalonia, whose recruitment started in 2015 32. All GCAT participants have Electronic Health Records linkage and genotypes are available for a subset of participants. For data completeness and homogeneity for this study we retained only those individuals sourced from the GCAT cohort.
Briefly, COVICAT harmonized data of all cohorts after the first wave of the COVID-19 pandemic in Spain in March 2020, and the majority of participants (99.7%) were contacted between 28 May 2020 and 15 August 2020. Data collection was primarily completed on a study portal website. Some complementary telephone interviews were conducted for participants unfamiliar with web-based approaches (6.1%). All participants provided informed consent and ethical approval for the study was obtained from the Parc de Salut Mar Ethics Committee (CEIm-PS MAR, no. 2020/9307/I) and the Hospital Universitari Germans Trias i Pujol Ethics Committee (CEI no. PI-20-182).
Out of the eligible participants who were contacted, 10,862 (61.5%) agreed to participate; of these, 10,087 (92.9%) participants completed the interview satisfactorily. In the present report we excluded 34 participants

Outcome
Depression outcome was assessed by symptoms of depression using the Depression Subscale of the Hospital Anxiety and Depression Scale (HADS-D) 33. The subscale comprises 7 items and ranges between 0 and 21. The survey's digitized format forced users to respond to all items of the scale before submitting, which prevented missing items as well as missing scores. To examine the association between exposures (such as genetic and environmental factors) and depression, we categorized depression into levels of severity based on HADS-D thresholds reported in validation studies in Spanish populations 33, defining three outcome categories as binary variables: mild depression (HADS-D < 5 vs. HADS-D ≥ 5), moderate depression (HADS-D < 8 vs. HADS-D ≥ 8) and severe depression (HADS-D < 11 vs. HADS-D ≥ 11).
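The thresholding described above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's code: the function name `hads_d_outcomes` is hypothetical, and only the cut-offs (5, 8 and 11) and the 0 to 21 range are taken from the text.

```python
def hads_d_outcomes(score: int) -> dict:
    """Map a HADS-D total (0-21) to the three binary outcomes."""
    if not 0 <= score <= 21:
        raise ValueError("HADS-D total must lie between 0 and 21")
    return {
        "mild": int(score >= 5),      # HADS-D >= 5 vs. < 5
        "moderate": int(score >= 8),  # HADS-D >= 8 vs. < 8
        "severe": int(score >= 11),   # HADS-D >= 11 vs. < 11
    }
```

Because the three outcomes are nested, a participant scoring 11 or more is positive for all three.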
To account for the effect of previous diagnosis of depression (and anxiety), available pre-pandemic Electronic Health Records (EHR) were analyzed. Lifetime diagnosis and pharmacy dispensation through EHR were combined with self-reported diagnosis. Pre-pandemic cases of anxiety and depression were defined by EHR-ICD-9 codes (Depression: 296, 311; Anxiety: 300), self-reported diagnosis by survey (i.e. "had you ever been diagnosed by a doctor") and pharmacotherapy by EHR-ATC codes (antidepressants (N06A) and anxiolytics (N05B)) dispensed at least 12 times in the last 10 years (details in Supplementary Table S1). For this study, a pre-pandemic mental health score was calculated by assigning one point for diagnosis (EHR or self-reported) and one point for medication. The score ranged from 0 = no evidence of pre-pandemic mental health disorders, to 2 = individuals diagnosed and taking medication for mental health disorders. Only cases with a score of 2 were considered positive for pre-pandemic depression/anxiety.
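As a hedged illustration of the scoring rule just described, the sketch below assigns one point for a diagnosis and one for medication. The function and argument names are hypothetical; the 12-dispensation threshold and the "score of 2" case definition come from the text.

```python
def prepandemic_score(ehr_diagnosis: bool, self_reported: bool,
                      dispensations_10y: int) -> int:
    """Pre-pandemic mental health score in the range 0-2."""
    # One point for a diagnosis, whether from EHR codes or self-report.
    diagnosis_point = int(ehr_diagnosis or self_reported)
    # One point for medication dispensed at least 12 times in 10 years.
    medication_point = int(dispensations_10y >= 12)
    return diagnosis_point + medication_point


def is_prepandemic_case(score: int) -> bool:
    """Only a score of 2 counts as pre-pandemic depression/anxiety."""
    return score == 2
```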

Exposome and the poly-environmental risk scores (PERS)
The exposome approach was used to capture the cumulative environmental influences after the first wave of the COVID-19 pandemic in several domains: lifestyle, social, environmental and health. Briefly, a survey was used and data were extracted for risk factors including lifestyle, household, social support and health. Then, environmental variables (i.e. exposure to air pollutants and green spaces, including the normalized difference vegetation index (NDVI)) were estimated from the participants' residential addresses using models developed by the ELAPSE project and MODIS, respectively 31,34. Finally, cases of COVID-19 were defined by information about a positive test for SARS-CoV-2 infection and COVID-19 hospitalization as described elsewhere 30. The exposome selection included 18 variables: loneliness, interpersonal conflicts in the household, caregiving, living alone, being unemployed after the first outbreak, struggling to pay the rent/food, physical activity, alcohol intake, current smoker, sleep hours, media exposure, access to outdoor spaces during the lockdown, natural views from the household, urbanization, greenness (NDVI), air pollution (nitrogen dioxide levels), any chronic disease, and diagnosis of COVID-19. A full description of the assessment methods and the distribution of exposure measures are presented in Supplementary Tables S2 and S3.
A high correlation was observed between individual atmospheric pollutants: nitrogen dioxide (NO2); particulate matter with an aerodynamic diameter of less than 2.5 μm (PM2.5); tropospheric ozone (O3) and black carbon (BC). Similarly, the percentage of green spaces within a census tract and within 300 m, 500 m and 1000 m buffers was highly correlated with the normalized difference vegetation index (NDVI). Variables that measured a very similar exposure and had r > 0.8 were considered a group, and one variable was included in the analysis as representative of the exposure in the domain 35. A final low correlation (r < 0.3) was observed among the retained pollutant variables.
Due to the heterogeneity of the exposure variables, we created a binary variable for each of the 18 individual risk exposures, where 0 = absence of the risk factor and 1 = presence of the risk factor. A cut-off of 1 was used for categorical variables. For numerical variables, the cut-off was set using the lowest 25% of the total sample. Descriptions of the binary variables and cut-offs are presented in Supplementary Table S2. The resulting binary variables were then grouped.
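The grouping rule (r > 0.8, one representative per group) can be sketched as follows. This is an assumption-laden illustration: the text does not say how the representative was chosen, so the greedy rule here simply keeps the first variable of each group, and the variable names and toy values in the usage note are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def group_correlated(data: dict, threshold: float = 0.8) -> list:
    """Greedily group variables whose r with a group's first member
    exceeds the threshold; the first member acts as representative
    (an assumption, not the authors' stated rule)."""
    groups = []
    for name, values in data.items():
        for group in groups:
            if pearson(data[group[0]], values) > threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups
```

With toy exposure values such as `{"no2": [10, 20, 30, 40, 50], "pm25": [5, 9, 16, 20, 26], "ndvi": [0.3, 0.1, 0.4, 0.1, 0.5]}`, the two pollutants fall into one group and NDVI forms its own.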
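A minimal sketch of the binarization step is given below. Two points are assumptions rather than statements from the text: the quartile is computed with the nearest-rank definition, and values at or below the lowest-quartile cut-off are flagged as the risk category (for exposures where high values are harmful, the variable would first need to be reversed). All function names are hypothetical.

```python
from math import ceil


def lower_quartile(values):
    """Nearest-rank 25th percentile of a non-empty list (assumed definition)."""
    ordered = sorted(values)
    rank = ceil(0.25 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


def binarize_numeric(values):
    """Flag values in the lowest 25% of the sample as 1 (risk present)."""
    cutoff = lower_quartile(values)
    return [int(v <= cutoff) for v in values]


def binarize_categorical(values, cutoff=1):
    """Flag categorical levels at or above the cut-off of 1 (assumed direction)."""
    return [int(v >= cutoff) for v in values]
```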
We initially used an agnostic approach; thus variables were grouped together without any preconceived notions or biases. Correlation measures were used to determine which variables tended to co-vary with each other. Hierarchical cluster analysis, using Ward's method, was used first to cluster variables (see Supplementary Fig. S1). Then, to ensure that the resulting categories were interpretable, the variables were grouped into meaningful categories according to previous observations 13,31. Briefly, living alone, which affects the social support networks that individuals may rely on during pandemics, was grouped with other socioeconomic factors such as caregiving responsibilities, loneliness, interpersonal conflicts, unemployment, and struggling to meet basic needs like rent or food. Access to outdoor space was categorized within the behaviour domain along with other factors such as physical activity, alcohol intake, smoking, media exposure, and sleep. These behaviours can significantly affect individual health outcomes and responses to the pandemic. Access to outdoor spaces in particular may be influenced by individuals' perceptions of COVID-19 risks and their behaviours in response to those perceptions. In the environment domain, chronic disease and COVID-19 were grouped with other factors including air pollution, built environment and natural views. Together with the high correlation between air pollution, natural views and built environment, we have reported an association of air pollution and COVID-19 outcomes, possibly due to respiratory health impact 31.
Finally, to summarize the contribution of selected environmental exposures we generated cumulative scores for each domain: the Poly-Environmental Risk Score (PERS). First, all binary exposures were added up to build a total PERS including all non-genetic exposure variables (PERS Total); then we also generated three sub-categories of PERS in three domains to facilitate the interpretability of risks: (1) PERS Soc, for all social and household-related factors (sum of loneliness, interpersonal conflicts, caregiving, living alone, being unemployed, struggling to pay rent or food), (2) PERS Life, for lifestyle and behavioural factors (low physical activity, high alcohol intake, current smoker, low sleep, high media exposure, low access to outdoor spaces), and (3) PERS Env, for wider environment and health-related factors (low natural views, high urbanization, low NDVI, high NO2, chronic disease and COVID-19 case).
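Since each PERS is an unweighted sum of binary exposures, the construction can be illustrated with a short sketch. The dictionary keys are hypothetical labels for the 18 exposures listed above; the domain membership follows the text.

```python
# Hypothetical variable labels; domain membership follows the text.
PERS_DOMAINS = {
    "Soc": ["loneliness", "household_conflict", "caregiving",
            "living_alone", "unemployed", "struggling_to_pay"],
    "Life": ["low_physical_activity", "high_alcohol", "current_smoker",
             "low_sleep", "high_media_exposure", "low_outdoor_access"],
    "Env": ["low_natural_views", "high_urbanization", "low_ndvi",
            "high_no2", "chronic_disease", "covid19_case"],
}


def compute_pers(flags: dict) -> dict:
    """Sum the 0/1 exposure flags per domain and overall (equal weights)."""
    scores = {
        f"PERS_{domain}": sum(flags[name] for name in names)
        for domain, names in PERS_DOMAINS.items()
    }
    # The total is the sum of the three domain scores, i.e. all 18 flags.
    scores["PERS_Total"] = sum(scores.values())
    return scores
```

Each domain score therefore ranges from 0 to 6 and the total from 0 to 18.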
To support the validity of the constructed PERS, we conducted a sensitivity analysis using a known group validity test, examining the association between the three distinct PERS and key variables within our cohort. Employing linear regression models, we investigated the relationship between the PERS Soc score and participants' education levels, the PERS Life score and self-perceived health, and the PERS Env score and the deprivation index. Regressions were adjusted for sex to account for potential confounding effects. By conducting known group validity tests we sought to assess whether our derived scores effectively captured meaningful differences in sociodemographic and health-related characteristics, thereby establishing the credibility and applicability of our scoring system within the context of our study population.

Genome and the polygenic risk scores (PRS)
Genotypic information was accessible for a subsample of 2442 of the COVICAT individuals sourced from the GCAT cohort 32. In brief, genome-wide genotypes were generated using the Infinium Expanded Multi-Ethnic Genotyping Array (MEGAEx) (ILLUMINA, San Diego, California, USA), then imputed, accounting for a total of ~20 M unique autosomal variants 36. All GCAT participants that passed strict quality control were included. Variants with minor allele frequency (MAF) > 0.001 and imputation quality score (R2) > 0.3 were retained for subsequent analysis. All included subjects were Iberian of White-Western European ancestry based on self-reported data and PC analysis 36. Genotypes are available at EGA (European Genome-phenome Archive; https://ega-archive.org/) under accession ID EGAD00010001664.
To improve the robustness of the approach, we considered different dimensions of depression, from symptoms to clinical diagnosis, by drawing on multiple sources of depression-related phenotypic data. These sources span different measurement methods and indicators, potentially capturing a more comprehensive view of the genetic basis of the condition and enhancing the predictive power of the PRS instruments. They include lifetime depression (MHQ); quantitative endorsement (up to five) depression phenotypes ('help-seeking', 'self-reported depression', 'antidepressant usage', 'depression' 37, or 'hospital (ICD-10)' 38); broad depression 39; and several depression measures ranging from minimal phenotyping (using data from questionnaires) to EHR definition of depression (using ICD-10 codes) or strictly defined depression (using an online mental health follow-up 40). Polygenic risk scores for all phenotypic measures for depression [38][39][40] were derived using either the GWAS summary statistic data or the weights from the PGS catalogue 41 (Supplementary Table S4). Posterior SNP effect sizes for those phenotypic measures that were derived using GWAS summary statistic data were computed using PRScs 42. Weights from the PGS catalogue were processed by removing strand-ambiguous SNPs and discordant SNP alleles. Then, the cumulative scores for the 14 polygenic risk scores were computed for each genotyped individual using PLINK 1.9 43. Raw PRS were converted into standardized z-scores to compare odds ratios across analyses. Because of the overlapping dimensions of depression used, and to help the interpretation of the results, pairwise correlations between the PRS of different depression measures and data sources were assessed by computing a Pearson correlation matrix with hierarchical clustering using Ward's method between all the phenotypic measures (Supplementary Fig. S2).
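The arithmetic behind a PRS and its standardization can be sketched as below. This is only an illustration of the weighted-sum idea, not a reproduction of the PRScs or PLINK pipelines: the function names and the SNP identifiers in the usage note are hypothetical, and treating a missing genotype as a dosage of 0 is an assumption.

```python
from statistics import mean, pstdev


def raw_prs(dosages: dict, weights: dict) -> float:
    """Weighted sum of effect-allele dosages (0-2) over the scored SNPs.
    A missing genotype is treated as dosage 0 (an assumption)."""
    return sum(w * dosages.get(snp, 0.0) for snp, w in weights.items())


def standardize(scores: list) -> list:
    """Convert raw scores to z-scores (mean 0, SD 1) across the sample."""
    mu, sigma = mean(scores), pstdev(scores)
    return [(s - mu) / sigma for s in scores]
```

For example, with weights `{"rsA": 0.1, "rsB": -0.2}` and dosages `{"rsA": 2, "rsB": 1}` the two contributions cancel and the raw score is 0. After standardization, an odds ratio refers to a one-standard-deviation increase in the score, which is what makes the ORs of different PRS comparable.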

Statistical analysis
We conducted regression analyses to investigate the association between various factors and different thresholds of depression (mild, moderate, and severe). To examine the association between exposures and the outcome, we used binary variables based on the HADS-D thresholds: mild depression (HADS-D ≥ 5), moderate depression (HADS-D ≥ 8), and severe depression (HADS-D ≥ 11). Backward feature selection with logistic regression was performed to identify the best features associated with each depression threshold. The selection process was repeated through bootstrap resampling with 1000 iterations to ensure robustness. Multinomial regression analyses were then conducted for each level of severity with the outcome categorized as follows: HADS-D < 5 indicating no depression, 5-7 indicating mild depression, 8-10 indicating moderate depression, and ≥ 11 indicating severe depression, using the no depression category as the reference. Prediction accuracy was assessed using Nagelkerke's pseudo R2. Given the well-documented sex differences in depressive symptoms, all analyses were adjusted for sex and age 44. Similarly, education level was used as a proxy for socioeconomic status and included as a covariate, given its association with depressive symptoms 45, as well as the first ten principal components (PC). All models presented here include all individuals with environmental and genetic data (n = 2442).
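For readers unfamiliar with Nagelkerke's pseudo R2, a minimal sketch of the standard formula is given below. It rescales the Cox-Snell R2 by its maximum attainable value. The function name is hypothetical, and the text does not state which null model was used for the reported values, so the choice of `loglik_null` is left to the caller.

```python
from math import exp


def nagelkerke_r2(loglik_null: float, loglik_model: float, n: int) -> float:
    """Nagelkerke's pseudo R2 from two log-likelihoods and the sample size."""
    # Cox-Snell R2: 1 - (L_null / L_model)^(2/n), written with log-likelihoods.
    cox_snell = 1.0 - exp(2.0 * (loglik_null - loglik_model) / n)
    # Upper bound of Cox-Snell R2, reached when the model fits perfectly.
    max_cox_snell = 1.0 - exp(2.0 * loglik_null / n)
    return cox_snell / max_cox_snell
```

A model that does not improve on the null yields 0, and a perfectly fitting model (log-likelihood of 0) yields 1.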
In addition to multinomial analysis, we first computed the main effect of genetic factors (genome model) and environmental exposures (exposome model) by analyzing the association between the Polygenic Risk Score (PRS) and the Poly-Environmental Risk Score (PERS) measures (PERS Total, PERS Soc, PERS Life, and PERS Env) and depression outcomes (mild, moderate, severe). These models consider the genetic and environmental contributions to depression individually. All models were adjusted for the covariates listed above. Secondly, the multinomial analysis was conducted for the G + E (genome plus exposome) analysis. As outlined above, we employed a stepwise regression approach with backward elimination based on the Akaike Information Criterion (AIC). Initially, we fitted a model including all covariates, genetic, and environmental factors and then iteratively removed variables that did not contribute significantly to the model, as determined by the AIC. This process was repeated with 1000 bootstraps with resampling to enhance the robustness of the G + E model, and the final model was constructed with the variables that were retained in 60% of the bootstraps. The retained factors resulting from the bootstrapped models were subsequently analysed to provide insights into the stability and reliability of the G + E associations using a multinomial regression to assess the effects of these variables across all three depression thresholds. Third, with the PRS and PERS retained in the second step, we computed a multiplicative interaction model to explore potential interaction effects between genetic and environmental factors. This model, known as G × E (genome by exposome), evaluates whether the combined effects of genetic and environmental factors on depression outcomes are greater than or different from the sum of their individual effects. We applied a False Discovery Rate (FDR) correction to the main effects results to control comprehensively for the possibility of false positive results due to multiple comparisons. Significance was set at an FDR-adjusted p value of 0.05. Forest plots were generated using the forestplot Python package (version 0.3.3).
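The FDR adjustment can be illustrated with a short sketch. The text does not name the procedure used; the Benjamini-Hochberg step-up method is assumed here because it is the most common choice, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues: list) -> list:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

An association would then be called significant when its adjusted value is below 0.05. For the raw p-values 0.01, 0.04, 0.03 and 0.20, only the first survives (adjusted value 0.04).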

Ethics approval and consent to participate
All research was carried out in accordance with relevant national and European guidelines and regulations. The GCAT study was carried out using anonymized data provided by the Catalan Agency for Quality and Health Assessment, within the framework of the PADRIS Program. The participants provided informed written consent.
The research conformed to the principles of the Helsinki Declaration.

Results
We present the results of the analyses in the following sections. First, we provide a detailed description of the sociodemographic characteristics, the distribution of the HADS-D subscale and the exposome risks in our sample. Second, we present the results of the regression models examining the main effect of each exposome score (PERS Total, PERS Soc, PERS Life and PERS Env) on depression outcomes (mild, moderate, severe). Here we introduce the novelty of grouping environmental factors into cumulative scores: poly-environmental risk scores. Third, we show results of the regression models exploring the main effect of genetic factors. Here we introduce the exploratory use of PRS derived from complementary definitions of depression. Fourth, we investigate genome-exposome addition (G + E) and genome-exposome interaction (G × E) models.

Sociodemographic characteristics and HADS-D scores
The mean age of the participants in the COVICAT study was 54.6 years (SD = 7.1), and 59.4% of them were female. Almost half of the sample (46.9%) had a "higher level of education" and 10.3% had a "pre-pandemic mental health diagnosis". Two subsamples used in the analysis (i.e. Subsample A, participants with genetic data, and Subsample B, participants with confirmed pre-pandemic mental disorder) showed differences in age-gender distribution as well as in educational level compared to the whole sample (Tables 1 and 2). Subsample A was older, with a lower education level, a higher proportion of males, a lower prevalence of pre-pandemic mental health diagnosis and similar levels of depression. Subsample B was older, with a lower education level and a higher proportion of females. All these differences were accounted for in the following analysis.

Exposome factors in the full sample
A relevant proportion of the sample reported exposure to known environmental risk factors during the lockdown. For example, loneliness was identified in 75.3% of participants, 9.5% reported interpersonal conflicts in the household, 7.3% lost their jobs after the first outbreak of the pandemic, 8.4% struggled to pay for rent or food, and over 10% had no access to outdoor spaces during the lockdown. A chronic disease was reported by 32.3% of the participants and 4.9% had a positive COVID-19 diagnosis. Tetrachoric correlation between individual variables revealed imperfect correlations within environmental and socioeconomic variables (Supplementary Fig. S1), with only a few absolute pairwise correlations r > 0.5: living alone-caregiving, NDVI-urbanization, and NO2-urbanization. Ward's clustering analysis seemed to support the existence of three distinct categories or clusters, particularly for the cumulative scores related to PERS Soc and PERS Env. The variable "access to outdoor spaces" was clustered with other environmental risk factors, possibly due to its high correlation with natural views, but for coherence with other behavioural factors we decided to group this variable with other lifestyle/behavioural factors in PERS Life. Living alone and caregiving showed a strong negative pairwise correlation and were grouped in a separate cluster, but due to the high correlation they exhibit with other socioeconomic factors, they were treated as socioeconomic factors. Finally, the health-related variables (i.e. chronic disease and COVID-19), which correlated poorly with most risk factors, were included in PERS Env, based on evidence reported by our group correlating COVID-19 and air pollutants 31. The mean scores for the estimated PERS were: PERS Soc = 1.01 (SD = 0.97), PERS Life = 1.44 (SD = 1.04), PERS Env = 1.92 (SD = 1.3), and PERS Total = 4.37 (SD = 1.92).
We assumed equal weights for ease of calculation, even though this does not fully capture the variability in the impact of individual factors. Reliability of PERS was confirmed by a known group validity test. The known group validity tests yielded statistically significant associations, capturing meaningful variations in education levels, self-perceived health, and deprivation index within the study cohort. Results support the robustness of our composite scores and their further use as instrumental tools. Complete results from the known group validity test are presented in Supplementary Table S5.

Non-genetic contribution to depression outcomes
The distribution of depression scores in relation to cumulative PERS for the three outcomes considered (mild, moderate and severe depression) shows a consistent increase in depression scores with higher PERS, suggesting that higher exposure to environmental stressors may contribute to increased depressive symptoms. Furthermore, the likelihood of depression rises at all severity levels as cumulative exposure to environmental stressors increases. This trend is consistent for the combined score (PERS Total) and for each individual domain (PERS Soc, PERS Life, and PERS Env); see Fig. 2 and Supplementary Fig. S3, respectively. Regression models incorporating exposome measures provided a significantly better fit compared to a null model that only included covariates. The significant associations observed for all PERS with mild, moderate and severe depression (Supplementary Table S6) underscore the potential impact of social, lifestyle, and broader environmental influences on depression outcomes. Overall, all environmental cumulative scores are associated with a higher depression risk, where the inclusive PERS Total (OR 1.52, 95% CI 1.43-1.63) explained the largest proportion of variance associated with depression regardless of the level of severity (mild: R2 = 0.086; moderate: R2 = 0.120; severe: R2 = 0.140). PERS Total showed the largest effect size for severe depression (OR 1.61, 95% CI 1.47-1.77), but the most significant association was with moderate depression (OR 1.52, 95% CI 1.43-1.63). Regarding partitioned cumulative scores, PERS Soc showed the largest effect at all assessed levels of severity (mild: OR 1.73, 95% CI 1.57-1.91; moderate: OR 1.87, 95% CI 1.66-2.10; severe: OR 1.99, 95% CI 1.68-2.36). For moderate depression, the social domain (PERS Soc; OR 1.87, 95% CI 1.66-2.10) showed the largest effect, with lower estimates for the lifestyle and behaviour domain (PERS Life; OR 1.63, 95% CI 1.46-1.81) and the environmental domain (PERS Env; OR 1.36, 95% CI 1.22-1.53). Detailed information for all the models, including the mild and severe depression forms, is presented in Supplementary Table S6.

Genetic contribution to depression outcomes
We assessed the association of each Polygenic Risk Score (PRS) with moderate depression, defined as a HADS-D score of 8 or higher. Two of the assessed PRS were significantly associated with an increased risk of moderate depression: PRS B (broad depression; OR 1.18, 95% CI 1.05-1.33, P = 2.15 × 10^-2), a generalized measure of depression, and PRS E1 (endorsed measures of depression (one); OR 1.17, 95% CI 1.04-1.32, P = 2.97 × 10^-2), based on quantitative definitions derived from endorsed measures of depression. Despite the associations observed with moderate depression, none of the assessed PRS were associated with the mild or severe forms of depression (see detailed information for all PRS in Supplementary Table S6). This suggests that the genetic factors captured by these PRS may specifically influence the risk of moderate depression but not the risk of developing milder or more severe forms of the condition. HADS-D scores increase with PRS percentile, suggesting a positive association between genetic predisposition to depression, as captured by PRS, and the severity of depressive symptoms measured by HADS-D scores (Fig. 3). Information for all other assessed PRS is shown in Supplementary Fig. S4.
The effect size of PRS B on HADS-D scores is consistent with reported effects on a broad depression phenotype, as noted in a previous study by Halldorsdottir et al. 3. This suggests that the genetic factors captured by PRS B are indeed associated with a broad range of depressive symptoms, as previously observed. We also observed a moderate positive correlation (r = 0.55) between PRS B and PRS E1, suggesting some degree of shared genetic influence between the two PRS. The correlation is far from complete, however, which also suggests that each PRS captures distinct genetic effects associated with different aspects or measures of depression, as has been previously suggested by Cai et al. 40.

Additive and Interaction risk in severity level of depression
We employed a backwards feature selection method with logistic regression for each severity level model (mild, moderate, and severe), retaining features that were consistently present in 60% of the iterations for each model (see Supplementary Fig. S5 for a detailed description of retained features for all the models). Subsequently, we conducted a multinomial regression incorporating variables retained specifically for each model. In addition to age, gender and education level, PERS Soc, PERS Life, PERS Env, PRS B, PRS E2, PRS LMDDRec and PRS GP were initially retained in one of the models.
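The retention rule can be sketched independently of the model fitting itself. The sketch below assumes that each bootstrap iteration has already produced the set of features surviving backward elimination, and that "present in 60% of the iterations" means a selection frequency of at least 0.6; both the function name and this reading of the threshold are assumptions.

```python
from collections import Counter


def retained_features(selections: list, min_fraction: float = 0.6) -> list:
    """Return the features selected in at least `min_fraction` of the
    bootstrap iterations. `selections` holds one collection of feature
    names per iteration."""
    counts = Counter(f for selected in selections for f in set(selected))
    n_iterations = len(selections)
    return sorted(f for f, c in counts.items()
                  if c / n_iterations >= min_fraction)
```

For instance, if five iterations all select `PERS_Soc`, three select `PRS_B` and one selects `PRS_GP`, only the first two are retained.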
Figure 3. Above, PRS B and PRS E1 distributions for the moderate depression outcome. Below, relative prevalence by PRS percentile for all three outcomes (mild, moderate, severe). Individuals were classified according to the PRS percentile and the relative prevalence of individuals with a HADS-D higher than the threshold was computed for each percentile. A LOESS curve was fitted to visualize the trend of the relative prevalence by PRS percentile.
We further investigated the relationship between environmental factors and genetic risk scores and present results for the moderate depression model (defined as a HADS-D score of 8 or higher). This model incorporated age, gender, retained principal components, PERS Soc, PERS Life, PERS Env, PRS B and PRS LMDDRec. The overall variance explained by this multinomial regression model was R2 = 0.169. PERS Soc, PERS Life, and PERS Env scores were significantly associated with depression regardless of the level of severity, suggesting that various aspects of environmental exposures, including social, lifestyle, and broader environmental factors, play a role in influencing the risk of depression. The PERS Soc risk increased significantly as depression severity worsened: we observed a 1.6-fold increase in the PERS Soc OR between mild and severe depression. Specifically, the OR for PERS Soc was 1.48 (95% CI 1.31-1.66) for mild depression and increased to 2.36 (95% CI 1.97-2.83) for severe depression (Fig. 4). Similarly, PERS Life and PERS Env exhibited increasing associations, with a 1.6-fold and 1.2-fold increase in OR, respectively, from mild to severe depression. Regarding genetic scores, we found a significant association only between PRS B and moderate depression (OR 1.2, 95% CI 1.03-1.40) (Fig. 4).
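The fold increase quoted for PERS Soc is simply the ratio of the severe to the mild odds ratio. The small check below uses the two ORs reported above; the helper name `fold_change` is hypothetical.

```python
def fold_change(or_reference: float, or_comparison: float) -> float:
    """Ratio of two odds ratios, e.g. severe relative to mild depression."""
    return or_comparison / or_reference


# PERS Soc: OR 1.48 (mild) and OR 2.36 (severe), as reported in the text.
pers_soc_fold = fold_change(or_reference=1.48, or_comparison=2.36)
print(round(pers_soc_fold, 2))  # 1.59, reported as 1.6-fold
```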
The multinomial regression analysis results incorporating variables retained from the backwards feature selection method for mild and severe level (mild depression, severe depression model) and the analysis with all retained variables combined (any depression model) is presented in Supplementary Figs.S6-S8.Detailed results for each multinomial model assessed is presented in Supplementary Table S7.
Based on these results, we further explored a possible multiplicative interaction effect between the different PERS and PRS B .We observed that only the interaction between PERS Env and the PRS attained statistical significance; specifically for mild and moderate depression.Strikingly, the odds ratios for these interactions were consistently smaller than 1, OR 0.87, 95% CI 0.77-0.98,OR 0.84, 95% CI 0.72-0.99,and OR 0.92, 95% CI 0.77-1.12for mild, moderate, and severe depression, respectively, showing a potential mitigating effect (Supplementary Table S8) and should be interpreted with caution.Sensitivity analysis, in which one variable used to compute PERS Env was omitted at a time, identified NO 2 as a key factor, the interaction not being significant when NO 2 was removed from the score.This suggests that NO 2 may play a driving role in this interaction (Supplementary Table S9).
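The leave-one-out step of this sensitivity analysis can be sketched as follows. Only the recomputation of the reduced scores is shown; refitting the interaction model with each reduced score is not. The component labels and the function name are hypothetical.

```python
# Hypothetical labels for the six PERS Env components named in the Methods.
ENV_COMPONENTS = ["low_natural_views", "high_urbanization", "low_ndvi",
                  "high_no2", "chronic_disease", "covid19_case"]


def leave_one_out_scores(flags: dict) -> dict:
    """For one participant, return PERS Env recomputed with each
    component omitted in turn, keyed by the omitted component."""
    full_score = sum(flags[c] for c in ENV_COMPONENTS)
    return {omitted: full_score - flags[omitted] for omitted in ENV_COMPONENTS}
```

Each of the six reduced scores would then replace PERS Env in the G × E model to see whether the interaction term remains significant.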
Finally, to understand the impact of previously diagnosed mental disorders, we stratified the analysis by the absence or presence of pre-pandemic mental health conditions. We found significant positive associations for PERS Soc, PERS Life, and PERS Env scores across all levels of depression severity, regardless of individuals' pre-pandemic mental health status. Polygenic risk scores showed no significant associations; a trend for PRS B was observed only in the absence of a pre-pandemic mental health diagnosis (Supplementary Table S10). This suggests that environmental factors such as social interactions, lifestyle choices, and broader environmental influences continue to play a significant role in influencing mental health outcomes, irrespective of pre-existing mental health conditions.

Discussion
Heritability estimates of depression are approximately 30-40% 9. This leaves ample room for the contribution of non-genetic factors to inter-individual variability of the phenotype. In a single cohort, during the COVID-19 lockdowns, we comprehensively analyzed environmental factors that collectively influence susceptibility to depression, focusing on understanding the impact of the entire exposome 46. Our approach considered the combined effects of both genetic and environmental factors, and explored the use of cumulative scores of environmental exposures and genetic susceptibility that jointly impact depression.
We identified significant positive associations between non-genetic risk factors and depression levels, with a considerable impact of environmental factors on mental health outcomes. After accounting for age, about 16.9% of the differences in depression levels among the individuals in our sample can be explained by PERS Soc, PERS Life, PERS Env, and PRS B. In addition, for all three domains, HADS-D scores increase with PERS percentile, suggesting a cumulative effect of multiple environmental stressors on depression risk and underlining the importance of considering their combined impact.
Specifically, PERS Soc and PERS Life were each associated with a 1.6-fold increase in risk from mild to severe depression, while PERS Env showed a 1.2-fold increase in risk across severity levels. Notably, PERS Soc, reflecting social and household-related factors, and PERS Life emerged as pivotal determinants, explaining the major portion of the variance in depression levels. In contrast, wider environmental and health-related factors (PERS Env) account for a relatively smaller share of the variance. This underscores the critical role of interpersonal relationships, social networks, and personal well-being in shaping mental health outcomes. Social isolation and social satisfaction play a mediating role in the effect of interpersonal relationships 47, which increases susceptibility to negative emotions such as depression and anxiety; moreover, stress exposure limits the development of healthy interpersonal relationships 48. The remaining variability in depression levels may be due to other factors not included in our model or to random variation. For example, sensitive time windows may shape the outcome, as suggested by the strong increase in the incidence of mental health problems in adolescents after the COVID-19 pandemic 49.
In our study, genetics explained a very low proportion of the variance in depression, and only PRS B (Broad depression), a generalized measure of depression, was associated with depression, with a higher impact on moderate levels of depression than on milder or more severe forms. These loci may be less specific to the biological underpinnings of clinical depression, and closer to neurobiological mechanisms of symptom onset through susceptibility to exposure to non-genetic factors. The effect size reported here for PRS B (OR 1.2, 95% CI 1.03-1.40) is consistent with reported effects on a broad depression phenotype 3, suggesting that the genetic factors captured are indeed associated with a broad range of depressive symptoms. This suggests that the depression phenotype captured in self-reported scales is a valid measure of depression and aligns with findings of strong genetic correlation between both measures 39,50. It is possible that other genetic factors play a more prominent role in the development or exacerbation of mild and severe depression; indeed, our analyses indicate that broad definitions of depression showed moderate correlations with more stringent measures of Major Depressive Disorder (MDD). However, the results also support the need for careful consideration of the cut-off scores used to identify clinically relevant individuals in a population-based sample. Substantial discrepancies have been reported between self-assessment using symptom scales and diagnoses made through strict diagnostic criteria 40. The cut-off scores proposed for the HADS-D depression subscale vary between 5 and 11, where a score of 11 or higher is considered a strong indicator of clinical depression according to established guidelines 33. Because more stringent definitions of depression show larger heritability 40, we expected a stronger association between PRS LMDDRec (Lifetime Major Depressive Disorder with recurrence) and higher severity of depression. This was not the case. It is possible that the relatively smaller number of individuals with higher severity limited statistical power.
In our model, broad depression (PRS B) and environmental and health dimension stressors (PERS Env) showed an interaction with a mitigating effect for mild and moderate depression. Within PERS Env, air pollution (NO₂) was identified as the key factor driving this interaction. However, these results are counterintuitive considering previous observations. There is emerging evidence of associations between poor air quality and poor mental health, as well as specific mental disorders 15. Furthermore, pre-existing long-term conditions appear to deteriorate, requiring more healthcare 51. We also observed that more severe COVID-19 disease was associated with mental health disorders 29 and exposure to air pollutants 31. Overall, this result suggests that complex non-linear relationships may exist, as well as unmeasured variables or uncontrolled confounders that could lead to misleading conclusions.
Globally, the prevalence of depression has increased significantly after the COVID-19 pandemic 52. We previously identified that 5% of the participants in the COVICAT cohort with no pre-pandemic history of mental health disorder presented clinically relevant depression two months after the first lockdown 29. Here we observed that the strong association of poly-environmental scores was independent of pre-existing mental health disorders. Notably, while this holds true for cumulative PERS, the effect of certain individual factors seems to differ by pre-pandemic mental health status, as reported in Goldberg et al. 29, where Living alone was identified as a risk factor for severe depression only for those with pre-pandemic mental health diagnoses, whereas Household interpersonal conflicts and Financial instability were predictors of severity in people without a pre-pandemic mental health diagnosis.
Together, all this evidence challenges a widespread perception that more severe and/or more strictly defined symptoms of depression are closely related to genetic susceptibility, while social, lifestyle, or environmental factors mainly shape moderate or non-specific symptomatology 40,53. Our results show that non-genetic exposures are strong and significant predictors of severe depression, and their capacity to shape the mental health of populations should not be underestimated.
Our study has several strengths that contribute to the robustness of the findings, the most relevant being the high-quality individual-level data, the low proportion of missing data, the wide range of depression phenotypes assessed, and the thorough examination of environmental stressors. Nevertheless, several important limitations should be considered when interpreting the findings. While the assumption of potential causality is reasonable since exposures precede the outcome, caution is warranted due to the possibility of bidirectional relationships between variables. The presence of healthy bias 54 and limited representation of ancestry 55 in the study sample may affect the generalizability of the findings to broader populations; however, this does not invalidate the relationships found between exposures and depression outcomes. Finally, the limited sample size may affect the statistical power, particularly in interaction analyses. Replication studies with larger sample sizes are needed to validate the observed interactions and enhance confidence in the findings.
The findings of the study indeed provide valuable insights into addressing multiple stressors simultaneously to mitigate depression risk and promote resilience 56 .However, it is crucial to interpret cumulative scores and interaction effects cautiously, considering unaccounted factors and potential confounding variables that may influence the observed associations.

Figure 1. Distribution of total scores of the Depression Subscale HADS-D for (a) the total sample, (b) by gender, and (c) by pre-pandemic mental health diagnosis. HADS-D thresholds for levels of depression are depicted with dashed red lines. A similar distribution is observed by gender, with higher HADS-D scores in females than males. Higher HADS-D scores are observed in participants with previous diagnoses of depression or anxiety.

Figure 2. PERS total distribution for each HADS-D threshold. Above, PERS is presented for each of the three outcomes, defined by the HADS-D thresholds for mild depression (HADS-D ≥ 5), moderate depression (HADS-D ≥ 8), and severe depression (HADS-D ≥ 11); below, relative prevalence of depression. Relative prevalence was computed as the number of individuals with a HADS-D higher than the threshold for each PERS Total score.
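The relative prevalence described in this caption can be sketched as a simple grouped proportion. The example below assumes that "relative prevalence" means the share of individuals at or above the HADS-D threshold within each score group, and it uses the ≥ 5, ≥ 8, and ≥ 11 thresholds given in the caption; the function name and the toy data are invented for illustration and do not come from the cohort.

```python
from collections import defaultdict

# HADS-D thresholds as defined in the caption.
THRESHOLDS = {"mild": 5, "moderate": 8, "severe": 11}


def relative_prevalence(scores, hads_d, threshold):
    """Map each distinct score to the share of individuals with HADS-D >= threshold.

    ASSUMPTION: the denominator is the number of individuals with that score.
    """
    if len(scores) != len(hads_d):
        raise ValueError("scores and hads_d must have the same length")
    totals = defaultdict(int)
    cases = defaultdict(int)
    for score, value in zip(scores, hads_d):
        totals[score] += 1
        if value >= threshold:
            cases[score] += 1
    return {score: cases[score] / totals[score] for score in sorted(totals)}


# Toy data (invented): one PERS Total score and one HADS-D score per individual.
pers_total = [0, 0, 1, 1, 1, 2, 2, 3]
hads_scores = [2, 6, 4, 8, 9, 11, 7, 12]

result = relative_prevalence(pers_total, hads_scores, THRESHOLDS["moderate"])
for score, share in result.items():
    print(f"PERS Total {score}: {share:.2f}")
```

The same function would apply to PRS percentiles by passing percentile bins instead of PERS Total scores.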

Figure 4. Forest plot depicting the results of the multinomial regression for each depression level including the variables obtained from the backwards selection procedure for the HADS-D threshold for moderate depression.
and chronic diseases, known for potentially exacerbating the severity of COVID-19 symptoms further underscoring the interconnectedness of health and environmental factors.

Table 1. Distribution of sociodemographic characteristics and depression (HADS-D) categories in the whole sample and subsample A (sample with genetic data).

Table 2. Distribution of sociodemographic characteristics in the whole sample and subsample B (sample with pre-pandemic mental health diagnosis).