Dissecting depression symptoms: Multi-omics clustering uncovers immune-related subgroups and cell-type specific dysregulation

35 In a subset of patients with mental disorders, such as depression, low-grade inflammation 36 and altered immune marker concentrations are observed. However, these immune 37 alterations are often assessed by only one data type and small marker panels. Here, we 38 used a transdiagnostic approach and combined data from two cohorts to define subgroups 39 of depression symptoms across the diagnostic spectrum through a large-scale multi-omics 40 clustering approach in 237 individuals. The method incorporated age, body mass index 41 (BMI), 43 plasma immune markers and RNA-seq data from peripheral mononuclear blood 42 cells (PBMCs). Our initial clustering revealed four clusters, including two immune-related 43 depression symptom clusters characterized by elevated BMI, higher depression severity and 44 elevated levels of immune markers such as interleukin-1 receptor antagonist (IL-1RA), C-45 reactive protein (CRP) and C-C motif chemokine 2 (CCL2 or MCP-1). In contrast, the RNA-46 seq data mostly differentiated a cluster with low depression severity, enriched in brain 47 related gene sets. This cluster was also distinguished by electrocardiography data, while 48 structural imaging data revealed differences in ventricle volumes across the clusters. 49 Incorporating predicted cell type proportions into the clustering resulted in three clusters, 50 with one showing elevated immune marker concentrations. The cell type proportion and 51 genes related to cell types were most pronounced in an intermediate depression symptoms 52 cluster, suggesting that RNA-seq and immune markers measure different aspects of immune 53 dysregulation. Lastly, we found a dysregulation of the SERPINF1 /VEGF-A pathway that was 54 specific to dendritic cells by integrating immune marker and RNA-seq data. This shows the 55 advantages of combining different data modalities and highlights possible markers for further 56 stratification research of depression symptoms.


Introduction
Mental disorders significantly impact human health, health care systems, and economies, accounting for up to 16 % of global disabilityadjusted life years (Arias et al., 2022).Despite this substantial impact, the development of effective and innovative treatments remains a challenge (O'Donnell et al., 2019).Major depressive disorder (MDD) exemplifies these challenges, being the foremost psychiatric contributor to global disability (Vos et al., 2020).
The difficulty in understanding the underlying molecular causes of MDD arises from its biological heterogeneity.MDD manifests with a wide range of symptoms and is often comorbid with other mental disorders (Kaufman and Charney, 2000;Kessler et al., 2017), which complicates its precise characterization.While the heritability of depression has been known for decades (McGuffin et al., 1996), genomewide associations studies (GWASs) using minimal phenotyping have only yielded significant insights with extremely large sample sizes (Howard et al., 2019).Adopting a transdiagnostic approach that includes disorders across diagnostic categories and uses biological pathways, revealed associations between GWAS findings from MDD, schizophrenia and bipolar disorder with cytokine and immune pathways (O'Dushlaine et al., 2015).
Further evidence supports the pivotal role of the immune system in the pathology of MDD.Stress, a risk factor for depression, can lead to increased inflammation (Hammen, 2015;Rohleder, 2019).Accordingly, a subgroup of roughly 30 % of MDD patients shows a proinflammatory profile − often referred to as low-grade inflammation − characterized by elevated levels of C-reactive protein (CRP) (Osimo et al., 2019).Our recent research corroborated this observation of immune-related depression, demonstrating genetic correlation between increased CRP level and depressive symptoms, e.g., tiredness, changes in appetite, anhedonia and feelings of inadequacy (Kappelmann et al., 2021).Immune markers such as interleukin (IL)-6, IL-1beta and tumor necrosis factor (TNF) are often measured to evaluate immune dysregulation (Haapakoski et al., 2015) and contain genetic variants relevant for depression (Barnes et al., 2017).Several studies have provided evidence that an inflammatory stimulus can lead to depressed mood, and that the pre-existing inflammatory status can predict the severity (Cho et al., 2019;Lasselin et al., 2020).For an overview on how cytokines and immune cells may mediate depressive symptoms see Miller and Raison (2016).
Many studies have employed case-control designs to find differences in immune markers between individuals with depression and healthy controls (Sforzini et al., 2023;Sørensen et al., 2023;Wittenberg et al., 2020).At the same time, various studies have established the link between specific depressive symptom profiles and inflammation.Increased inflammation has been observed in patients with dysregulated metabolism (immuno-metabolic depression) and linked to anhedonia (Felger et al., 2016;Lamers et al., 2013;Lucido et al., 2021;Milaneschi et al., 2020).A study by Franklyn et al. (2022) found that inflammation is associated with symptoms like altered eating patterns, appetite and tiredness.Lynall et al. (2020) focusing on immune cell counts, identified a subgroup of depressed patients with increased levels of monocytes, CD4 + T cells and neutrophils.This subgroup also demonstrated increased CRP and IL-6 levels, correlating with more severe depressive symptoms.
The potential of an inflammatory subtype goes beyond symptom profiles, as a study by Cattaneo et al. (2020) identified differences in whole blood gene expression related to inflammation that differentiate patients with treatment-resistant depression from those responding to treatment.Such findings highlight the potential of precisely characterizing immune dysfunction within MDD, aiming to identify patients who might benefit significantly from targeted immunomodulation.In recent years, several randomized control trials evaluated the effectiveness of adding an anti-inflammatory medication to antidepressants.Some of these trials adopted elevated CRP concentration as an inclusion criterion or a measure for secondary analyses.While there is some evidence for beneficial effects (Köhler-Forsberg et al., 2019;Nettis et al., 2021;Savitz et al., 2018), several other studies reported no discernible differences between patients treated with anti-inflammatory medication or those receiving placebos (Hellmann-Regen et al., 2022;Husain et al., 2020), even when stratified by CRP (Baune et al., 2021).Inspired by advancements in cancer research, Miller and Raison (2023) suggest shifting from traditional diagnostic evaluations to focusing on symptoms influenced by inflammation, such as anhedonia, changes in appetite and sleep (Drevets et al., 2022).This shift underpins our use of a transdiagnostic cohort, aiming to transcend diagnostic boundaries and explore common biological underpinnings across psychiatric conditions.This push towards a more integrative and holistic understanding aligns with the broader field of psychoneuroimmunology.Traditionally, many studies in this area have relied on single omics approaches (e.g., transcriptomics or protein concentrations of peripheral immune markers).However, with the recent advancements in multi-omics techniques such as clustering or factor analysis methods (Li et al., 2021), there is an opportunity to integrate these individual omics data sets.Together with data from psychological assessments, imaging, and wearable devices (Moshe et al., 2021;Seppälä et al., 2019), we can further elucidate the complex interplay between the immune system and mental disorders by correlating molecular disease pathologies with clinical data (Halaris et al., 2019;Mengelkoch et al., 2023).
Building on the link between inflammation and depression, we explored this relationship in a transdiagnostic sample (n = 237) using multi-omics clustering.We hypothesized that distinct biological subgroups, defined by immune and transcriptomic profiles, would exhibit unique clinical and behavioral characteristics.To achieve this, we employed a two-step approach.First, we focused on biological features − a broad panel of 43 immune markers and whole transcriptome data from PBMCs (12210 genes) − as well as age and body mass index (BMI), given their known influence on inflammation (Chung et al., 2019;Karczewski et al., 2018), to identify these subgroups.Subsequently, we characterized these clusters using in-depth clinical phenotyping to reveal their distinct symptom profiles.This strategy allowed us to identify novel biological signatures associated with depression heterogeneity and explore their link to specific clinical presentations (Fig. 1).

Methods
For detailed methods see the Supplementary Methods.

Sample selection
The study sample comprised 246 participants from the Biological Classification of Mental Disorders (BeCOME) study (ClinicalTrials.gov:NCT03984084, (Bruckl et al., 2020)) and 115 participants from the OPtimized Treatment Identification at the MAx Planck Institute (OP-TIMA) study (ClinicalTrials.gov: NCT03287362, (Kopf-Beck et al., 2024)).The Munich-Composite International Diagnostic Interview (DIA-X/M− CIDI) (Wittchen et al., 1995;Wittchen and Pfister, 1997) was employed to assess all participants.Out of these with immune marker measurement, 237 affected participants (BeCOME = 134, OP-TIMA = 103, Figure S1) met either threshold or subthreshold DSM-IV DIA-X/M− CIDI criteria for any substance use, affective or anxiety disorder including post-traumatic stress disorder and obsessive-compulsive disorder within the last 12 months.192 of these participants had a (subthreshold) DSM-IV diagnosis of major depression or dysthymia.We also included 36 mentally healthy participants from the BeCOME study without any DSM-IV DIA-X/M− CIDI diagnosis as controls.A replication sample consisted of 11 affected participants (BeCOME = 8, OPTIMA = 3) that met the study criteria but had no genotypes available.9 out of these participants had a DSM-IV diagnosis of major depression or dysthymia.

Ethics approval and informed consent
The BeCOME and OPTIMA studies were approved by the ethics committee of the Ludwig Maximilians University, in Munich, Germany, under the reference numbers 350-14 and 17-395, respectively.Written informed consent was obtained from all participants before study enrollment.

Questionnaire data
All participants were assessed by the Beck Depression Inventory (BDI)-II (Hautzinger et al., 2006) and the Montgomery-Åsberg Depression Rating Scale (MADRS) (Schmidtke et al., 1988).An overview about the sample characteristics for this subset are provided in Table 1 and for the replication sample in Table 2, differences between the cohorts in Table S2.Somatic comorbidities including auto-immune disorders were assessed for BeCOME and part of the OPTIMA study, non-psychiatric medication use was only available for the BeCOME study (Table S1).

Blood collection
Blood was collected in the morning under fasted conditions, peripheral blood mononuclear cells (PBMCs) were extracted and stored at the biobank.

Immune marker measurements
We used the V-PLEX Human Biomarker 54-Plex Kit (Meso Scale Diagnostics (MSD), Rockville, USA, Cat.No. K15248G-2) to measure immune markers in plasma, following the manufacturer's instructions and including three internal controls run in duplicate to assess the coefficient of variation (Table S3).For markers available in both low and high sensitivity (LS and HS) formats within the V-PLEX, we exclusively utilized HS versions for enhanced accuracy.Markers with more than 16 % missing values were excluded (p = 18 markers, details see Table S3 and Figure S2).Additional measurements for critical immune markers were conducted using enzyme-linked immunosorbent assays (ELISAs) to ensure data completeness.This included high-sensitivity C-reactive protein (hsCRP, Tecan Group Ltd., Männedorf, Switzerland, Cat. No. EU59151), cortisol (Tecan Group Ltd., Männedorf, Switzerland, Cat No. RE52061), interleukin (IL)-6 (Thermo Fisher Scientific, Waltham, USA, Cat.No. BMS213HS), IL-6 soluble receptor (sIL-6R, Thermo Fisher Scientific, Waltham, USA, Cat.No. BMS214) and IL-13 (Thermo Fisher Scientific, Waltham, USA, Cat.No. BMS231-3).Participants with hsCRP concentration higher than 20 mg/L were excluded from the sample as this indicates an acute infection (n = 15).This resulted in 43 markers and 273 participants remaining for analysis (Table S3 and Figure S1).

Electrocardiography data assessment
149 affected participants (BeCOME = 108 and OPTIMA = 41) and 32 controls without a DSM-IV diagnosis had an electrocardiography (ECG) recording that spanned 24 h including a sleeping period with the portable ECG-device Faros 180 (Mega Electronics Ltd, Kuopio, Finland) at a sampling frequency of 500 Hz available.

Table 1
Cohort overview with either percent of participants and absolute numbers in brackets or mean and standard deviation in brackets.The age and BMI differences between case and control participants without a DSM-IV diagnosis were significant (t(58.8)= 4.106, p = 0.000126 and t(77.8)= 2.2761, p = 0.02559, respectively), while the sex difference was not (Fisher's exact test, p = 0.8564).The depression diagnosis was assessed by the Munich-Composite International Diagnostic Interview and includes full and subthreshold cases.Psychotropic drugs include antidepressants, mood stabilizer, neuroleptics, tranquilizer and herbal psychotropics.BDI-II = Beck Depression InventoryII.Information on other diagnoses of mental disorders is available in Figure S7, on somatic diseases and medication in Table S1  2.4.Data analysis

Immune marker analysis
The immune markers were quantile-normalized and corrected for the biobank storage position.For each immune marker, a linear model was fitted in R version 4.0.2(R Core Team, 2023) to regress out the batch variable.The residuals from this regression were then used for downstream analysis.
Quality control and exclusion of genes with few counts or high GC content influence left 229 affected participants, 33 controls without a DSM-IV diagnosis and 12,210 genes (Figure S1).The data was sequentially corrected for the GC content and a preparation batch variable with ComBat_seq from the sva package version 3.38.0(Leek et al., 2021) and normalized with vst from DESeq2 version 1.30.1 (Love et al., 2021).

Cell type deconvolution
The filtered RNA-seq count data, which was not batch-corrected, was converted to TPM and deconvoluted using the granulator package version 1.6.0(Pfister et al., 2023) in R version 4.2.0 (R Core Team 2023).All cell types with a standard variation of less than 0.5 were excluded, resulting in 14 cell types.

Multi-omics clustering and statistical analysis of cluster differences
Each variable of the normalized and batch corrected immune marker and RNA-seq data as well as age and BMI was standardized, the distance between the participants calculated and the clustering method SUMO version 0.3.0(Sienkiewicz et al., 2022) was applied repeatedly on a subsample to calculate stability metrics for selecting the number of clusters.
If not otherwise stated, we used R version 4.0.2(R Core Team, 2023) and determined the variable importance based on the F-value from ANOVA tests.A higher variable importance indicates a more distinct separation between the clusters.For continuous variables not used in the clustering, we applied the Tukey Honest Significant Differences (TukeyHSD) test and reported adjusted p-values, for categorical variables the Fisher's exact test.
We did not report p-values for group differences for variables that were used in the clustering as differences were anticipated (Kriegeskorte et al., 2009).

Replication analysis
A multinomial model containing age, BMI, IL-1RA, CRP, PlGF, CCL2, NRCAM, NAP1L2, FBLN2 and PXYLP1 was estimated to differentiate the clusters from the initial clustering using 227 participants with complete data and nnet version 7.13-15 (Venables and Ripley, 2002).The model was used to predict the cluster membership of the replication participants.

Gene set analysis
For each participant, a score was calculated for every biological process and molecular function GO term gene set using GSVA version 1.36.3.(Hänzelmann et al., 2013) based on the batch-corrected and normalized RNA-seq data.The distribution of these gene set scores across clusters was tested using an ANOVA and F-values reported as variable importance.

Prediction of responder status
Mehta et al. ( 2013) measured the response status to infliximab (anti TNF antibody) treatment of 28 patients with treatment-resistant depression over 12 weeks and defined treatment response as a 50 % reduction in the 17-item Hamilton Depression Rating Scale at any point during the study (see Table 3 for details).The baseline values from gene expression data with accession number GSE45468 were mapped to ENSEMBL IDs.8431 genes overlapped with our data and were used in a lasso model using glmnet version 4.1-8 (Friedman et al., 2023) to predict the responder status, the p-value was empirically calculated based on random genes.

Imaging analysis
Morphological differences were investigated at the global, regional (FreeSurfer) and voxel level.FreeSurfer v7.1.1 (Fischl, 2012) based phenotypes were restricted to those with strong meta-analytical evidence for an effect in MDD (Schmaal et al., 2017(Schmaal et al., , 2016)).Analysis of covariance (ANCOVA) was applied to the clusters covarying for intracranial volume, age and sex, and cluster effects were FDR corrected at 5 %.

Electrocardiography analysis
Raw data was preprocessed using the PhysioNet Cardiovascular Signal Toolbox (Vest et al., 2018) in Matlab version 2020b (The Math-Works Inc., 2020).Time domain metrics were calculated for every minute using a sliding window of 5 min.

Single-cell RNA-seq data analysis
Single cell PBMC RNA-seq data from 14 individuals with depression and other mental disorders with accession number GSE185714 was processed with scanpy version 1.9.3 (Wolf et al., 2018) according to Schmid et al. (2021) and their cell type labels used.

Multi-omics clustering with age, BMI, immune markers and RNA-seq data identified 4 clusters with different depression severity and inflammation status
To discover subgroups with immune-related depression symptoms, we leveraged a comprehensive multi-omics approach, assessing a panel of 43 immune markers along with matched whole transcriptome RNAseq data (p = 12210 genes) in 237 participants (Figure S1) diagnosed with MDD or another stress-related mental disorder within the last 12 months.This approach allowed us to identify novel immune-related gene expression patterns associated with depression heterogeneity beyond traditional immune markers.The mean age of the affected participants was 39 years and the mean BMI 25 (see Table 1 for sample characteristics).Besides the biological data, the participants underwent phenotypic assessment.Our approach involved clustering the participants based on multiple biological datasets and subsequently characterizing these subgroups using psychometric data.According to our hypothesis, we initially clustered the participants based on age, BMI, the immune markers and RNA-seq data.Subsequently, we employed a secondary analysis, clustering the immune markers and RNA-seq data after correcting for age, BMI and sex due to the observed influence of these factors on the initial clustering.We also performed an exploratory analysis where we added deconvoluted cell type proportions to the initial clustering features, given their reported relevance in previous studies (Lynall et al., 2020).

Table 3
Cohort overview of the data by Mehta et al. (2013) used to predict the responder status to infliximab treatment with either percent of participants and absolute numbers in brackets or mean and standard deviation in brackets.
status n female BMI nonresponder 14 64 % (9) 30.9 (9.1) responder 14 64 % (9) 30.9 (4.3) that provides cluster stability metrics via consensus clustering.This method organizes variables into layers and computes the distance between the participants within each layer, which subsequently informs the clustering process (see section 2.4.4 for details).For our study, age and BMI were combined as one layer while the immune markers and RNA-seq data were each defined as separate layers.According to the built-in stability metrics (Figure S3A-B), this approach yielded a four cluster solution.These clusters included 121, 58, 39 and 19 participants, respectively (Table S4).In light of the findings described below, we termed cluster 1 mild depression symptoms cluster (MIDS), cluster 2 high immune-related depression symptoms cluster (HIRDS) 1, cluster 3 low immune-related depression symptoms cluster (LIRDS) and cluster 4 HIRDS2.Each of the identified clusters showed a different pattern of immune markers and contained participants from both the BeCOME and OPTIMA cohorts (Fig. 2A), the significantly different distribution of the cohorts across the clusters (Fisher's exact test, p = 0.000003) was driven by the MIDS cluster; between the LIRDS and HIRDS clusters there was no different distribution (Fisher's exact test, p = 0.507320).The OPTIMA cohort contains hospitalized and predominantly medicated participants with a higher disease severity and age compared to BeCOME participants (Table S2).The distribution of somatic comorbidities and medication usage across the clusters is shown in Figure S5.The sex distribution across the clusters was significantly different (Fisher's exact test, p = 0.004607, Figure S4).
Next, we aimed to provide insights into the distribution of age, BMI, and CRP concentration across the clusters due to their importance in the clustering and the established role of CRP as a standard marker for inflammation.Notably, in all clusters except HIRDS2, a positive correlation was observed between CRP concentration and BMI.Specifically, for the MIDS cluster the Pearson correlation was 0.13 (t-test, p = 0.155); for HIRDS1, it was 0.40 (t-test, p = 0.002); for LIRDS, it was 0.39 (t-test, p = 0.015); and for HIRDS2 it was − 0.61 (t-test, p = 0.006), as illustrated in Fig. 3A.It also shows that the HIRDS clusters predominantly contained individuals with elevated CRP concentration and elevated BMI.As illustrated in Fig. 3B, no significant correlation between age and CRP was discernible within the clusters, with Pearson correlations ranging between − 0.16 and 0.01.Interestingly, the LIRDS cluster contained individuals with a restricted age range.Further examination revealed no strong correlation between age and BMI except for HIRDS2, with Pearson correlations ranging from 0.09 to 0.40, which were not significant (Fig. 3C).Notably, while the individuals in HIRDS1 spanned a wide age and BMI range and contained more males than females (Figure S4), HIRDS2 consisted mostly of females aged 50 or above with a BMI between 24 and 30.

Clinical phenotypes, imaging features and heart rate data identified clusters with elevated depression severity
Next, we explored how the clusters, which were defined solely based on biological data (including immune markers, RNA-seq data, BMI, and age) without considering depression severity or symptoms, corresponded to these measures of depression post-clustering.Interestingly, the LIRDS and HIRDS clusters exhibited greater depression severity, as measured by the Beck Depression Inventory-II (BDI-II), than the MIDS cluster (Tukey HSD test, all adjusted p-values < 0.05), and control participants without a DSM-IV diagnosis (Tukey HSD test, all adjusted pvalues < 0.000001).This difference in severity was especially pronounced in the HIRDS clusters (Fig. 3D).A similar result was observed using the Montgomery-Åsberg Depression Rating Scale (MADRS) (Tukey HSD test, all adjusted p-values < 0.05).The MIDS cluster had the highest proportion of participants without a depression or dysthymia diagnosis (Fisher's exact test, p = 0.000500, Figure S7) and the lowest overall depression severity.All BDI-II items differentiated the clusters from the controls without a DSM-IV diagnosis (Fisher's exact test, all pvalues FDR adjusted < 0.05).When comparing only the clusters, 14 BDI-II items, including changes in sleeping behavior (Fisher's exact test, adjusted p-value 0.006997), were significantly different (FDR<0.05,Table S7).Notably, the HIRDS clusters showed increased sleeping and a trend towards increased appetite (Figure S8A-B).
To further assess if the identified symptom differences were mirrored in other biological data, we analyzed heart rate data collected from portable ECG devices (n = 149) and structural imaging-derived features (n = 151) in a subset of the participants for whom these data were available.The median maximum normal-to-normal intervals between heartbeats were higher in the controls without a DSM-IV diagnosis compared to HIRDS1 (Tukey HSD test, adjusted p-value 0.039302) (Figure S8C) and the heart rate variability metrics − for short-term (RMSSD) and long-term (SDNN) − were lower in the LIRDS and HIRDS clusters in comparison to the MIDS cluster and controls without a DSM-IV diagnosis (Fig. 3E).The p-values for these differences ranged from 0.000087 to 0.020397 for RMSSD and from 0.000228 to 0.012328 for SDNN (Tukey HSD test with adjusted p-values).The clusters including the controls without a DSM-IV diagnosis showed a trend of different total gray matter (TGM) (F = 2.186, p = 0.073), with LIRDS showing significantly higher TGM compared to the HIRDS1 and controls; a similar pattern was seen for total brain volume (TBV) and total white matter (TWM) where HIRDS2 showed the lowest volumes (Table S8).Regional analyses of 17 MDD-related phenotypes revealed robust effects on the lateral ventricle volume (F = 4.32, FDR adjusted p = 0.040, with the lowest volume in LIRDS, see Fig. 3F), and nominal significant effects for the right rostral anterior cingulate, the right posterior cingulate and the bilateral hippocampus (Table S9).Voxel-based morphometry revealed no cluster main effects.

Immune markers and genes revealed differences in immune system and brain related pathways between clusters
We wanted to understand how the immune markers and RNA-seq data each contributed to defining the clusters, and if these different data types contained distinct information.Interestingly, except for CCL13, all the top important immune markers were elevated in the HIRDS clusters (Fig. 4A).These clusters not only showed increased BMI but also the highest BDI-II.This distinctive pattern was reflected in CCL13′s low correlation with key markers such as IL-1RA and CRP (Fig. 4D).While still discriminating the HIRDS clusters from the others, PlGF also had a similar low correlation with the top markers IL-1RA and CRP (Fig. 4D).All top immune markers showed a positive correlation with BMI (Figure S9).
In contrast, the most discriminating genes mainly differentiated the MIDS cluster, defined by a low depression severity, from the other clusters (Fig. 4B), while several other genes discriminated the HIRDS clusters (Figure S10).Accordingly, the correlations between the genes were stronger than those among immune markers, with the exception of the correlation between EDA and NAP1L2.On average, the absolute correlation between the top genes was 0.36, while it was 0.27 between the top immune markers (Fig. 4E).To link the findings to clinical data, we investigated if the top discriminating genes could predict the responder status in 28 patients with depression to treatment with an antibody against TNF in an external data set (Mehta et al., 2013).Based on the area under the curve (AUC), the top 10 and top 20 genes did not perform better than random genes (permutation test, p = 0.139 and p = 0.159).The most important gene for the 20-gene-model was vesicle associated membrane protein 1 (VAMP1), which was also included in the 10-gene prediction model, although it was not statistically significant.Additionally, we found that 22 genes differentially expressed between responders and non-responders at baseline in the Infliximab study (e.g., SOX4 and ZNF577) recapitulated the differences between our HIRDS clusters and the MIDS/LIRDS clusters in our data.However, a PCA dimension reduction of our RNA-seq data using these differentially expressed genes did not reveal distinct cluster separation, mirroring the results from our prediction of Infliximab responder status.
To understand the underlying molecular mechanisms, we analyzed not only individual genes but also their coordinated expression patterns, examining these genes in the context of pathways.We applied gene set enrichment analysis using GSVA (Hänzelmann et al., 2013) to calculate gene set scores for every participant and compared the distribution across clusters (Table S10).As depicted in Fig. 4C, the gene sets that stood out prominently were enriched in the LIRDS and HIRDS clusters.Interestingly, these gene sets included those related to the brain (e.g., anterograde axonal transport, VI = 17.3, 38 genes), immune system (regulation of macrophage activation, VI = 12.5, 32 genes), stress response (response to mineralocorticoid, VI = 12.4, 15 genes) and (ion) transport (e.g., regulation of calcium ion transmembrane transporter activity, VI = 13.1, 44 genes).

Replication of the initial clustering in a holdout sample
To assess the robustness of the clustering, we trained a multinomial regression model on the initial clustering using the 10 most important variables and predicted the cluster membership in a replication sample of 11 participants not used in the clustering.The relative number of participants per cluster was similar compared to the initial clustering and we could replicate the age and BMI distribution (Figure S11 A-B).In general, the mean BDI-II was higher in the replication sample which was also reflected in a higher median BDI-II in the predicted MIDS cluster (Figure S11C).The predicted LIRDS cluster with only one participant showed the highest BDI-II, while the predicted HIRDS1 cluster showed a higher median BDI-II than the predicted MIDS cluster.The differences of the top seven immune markers and genes can be replicated between the predicted MIDS and HIRDS1 clusters except for TNF and SH3RF3 (Figure S11D-E).

Secondary analysis of the initial clustering adjusted for age, BMI and sex showed no phenotypic differences
Considering the strong impact of BMI and age on the clustering outcomes, we performed a secondary clustering where we first adjusted the biological data (immune markers and RNA-seq data) for age, BMI and sex using a linear model.We then applied the clustering algorithm to this adjusted data set.This secondary analysis yielded a three cluster solution (Fig. S3C-D), which differed from the four clusters identified in the initial, unadjusted analysis (Figure S12A).Notably, in the secondary analysis, the variable importance of the immune markers was diminished relative to the RNA-seq data (Table S11).Among the immune makers, IL-27 stood out with the highest immune marker variable importance of 7.4.Contrastingly, the gene CKLF like MARVEL transmembrane domain containing 6 (CMTM6) showed a much higher variable importance of 77.4.Moreover, the primary discriminative immune markers or genes in this analysis were different from those in our initial clustering (Figure S12B).Compared to the genes, the distribution of the immune markers across the clusters was more uniform (Figure S12C-D).There were no significant differences of age, BMI, BDI-II or MADRS sum scores between the clusters, while the controls without a DSM-IV diagnosis had lower BDI-II or MADRS sum scores compared to all clusters and a lower age than clusters 2 and 3 (Tukey HSD test with adjusted pvalues, p = 0.025967 and 0.005217) (Figure S12E-G).

Exploratory analysis of the initial clustering including cell types identified three clusters predominantly driven by RNA-seq data
Motivated by previous findings that highlight the significance of blood cell type composition in discerning depression subgroups (Lynall et al., 2020), we conducted an exploratory analysis.In this analysis, we integrated deconvoluted cell types proportions (derived from the RNAseq data) into the clustering model, aligning them in one layer with age and BMI.This allowed us to investigate the potential contribution of cell type composition to the clustering outcome, while recognizing that the cell type proportions are estimated rather than directly measured.The predominant cell types in our sample were "T cells CD4 memory resting" (median fraction of 0.41) and "Macrophages M1" (median fraction of 0.15, see Fig. 5A).This exploratory analysis, incorporating cell type proportions, led to a three cluster solution (Fig. S3E-F).Cluster 1 contained mostly (85.7 %) individuals initially assigned to the MIDS cluster (without cell types).Cluster 3 contained predominantly individuals from the initial clusters characterized by high BMI and high BDI-II (HIRDS clusters, 69.2 %).Cluster 2 presented a mixed composition, drawing from different initial clusters: 54.2 % from MIDS, 17.7 % from HIRDS1, 24 % from LIRDS and 4.2 % from HIRDS2 (Fig. 5B).Accordingly, we termed these exploratory clusters reMIDS, intermediate depression symptoms cluster (reIDS) and reHIRDS, respectively, where "re" indicated re-evaluated clusters that incorporate cell type proportions.

Exploratory analysis revealed T cell proportions and RNA-seq data as primary drivers of the clustering
The distribution of age and BMI demonstrated an increasing trend across the three clusters identified in the exploratory analysis.Notably, BMI was markedly increased in the reHIRDS cluster (mean = 29.1)compared to the reMIDS (mean = 22.0) and reIDS (mean = 23.7)clusters (Figure S13A-B).In line with the pattern seen in the initial clustering, the reHIRDS cluster showed significantly elevated BDI-II compared to the other clusters (Tukey HSD test with adjusted p-values, p = 0.000072 for reMIDS and p = 0.008531 for reIDS, Figure S13C), and the MADRS in the reHIRDS cluster was elevated compared to the reMIDS cluster (Tukey HSD test with adjusted p-values, p = 0.014189) as well.All clusters showed higher BDI-II and MADRS compared to the controls without a DSM-IV diagnosis (Tukey HSD test with adjusted p-values, p < 0.000000).
In this exploratory analysis incorporating cell type proportions, we observed a shift in the variables driving the clustering compared to the initial findings.While age and BMI were the most important variables in the initial clustering, their impact was reduced.BMI still remained the second most important biological variable (VI = 53.5,Fig. 5C and Table S12), ranking 77th overall, but age (VI = 19.5)dropped considerably in importance to the 2037th position.The top 7 immune markers were the same as in the initial clustering except CXCL10 (IP-10, VI = 29.3)replacing CRP (VI = 21.3).However, in the reclustering of the exploratory analysis the immune markers ranked 240th to 1484th in the variable importance, whereas in the initial clustering 3rd to 24th.The cell type "T cells CD4 memory resting" emerged as the most important biological variable (VI = 79.1), also being the most prevalent cell type in our sample (Fig. 5A).Other notable cell types included "Dendritic cells activated" (VI = 31.7)and "Monocytes" (VI = 26.6).In the reMIDS cluster, "T cells CD4 memory resting" showed an increased mean proportion of 0.48 compared to 0.36 and 0.40 in the other clusters, while the "Dendritic cells activated" and "Monocytes" were decreased in the reMIDS cluster (Fig. 5D).Furthermore, Fig. 5C shows that genes were the essential variables to discriminate between the clusters (see Supplementary Results for more details).

Exploratory analysis revealed immune system involvement in the gene set enrichment of re-clusters
To further understand the biological processes differentiating the reclusters identified in the exploratory analysis, we performed gene set enrichment analysis.As highlighted in Fig. 5E, the most discriminative gene sets were downregulated in the reMIDS cluster, upregulated in the reIDS cluster and moderately upregulated in the reHIRDS cluster.Closer examination revealed that these gene sets were enriched for terms related to the immune system (response to molecule of bacterial origin, VI = 119, 186 genes), secretion (regulation of secretion, VI = 118.1,297 genes), transmembrane transport (regulation of transmembrane transport, VI = 108, 237 genes) and neurons (positive regulation of neuron death, VI = 103.7,58 genes) among others (Table S13).

Integration of protein and RNA-seq data with single cell expression showed cell type specific regulation of inflammation
To further elucidate the characteristics of the high depression severity clusters from our initial clustering model delineated in Fig. 2, we analyzed the genes and immune markers that differentiated the LIRDS and HIRDS clusters.Two prominent variables emerged: the gene serpin family F member 1 (SERPINF1, VI = 9.1) and the protein vascular endothelial growth factor A (VEGF-A, VI = 10.1).As represented in Fig. 6A and 6B, SERPINF1 expression was elevated in the MIDS and LIRDS clusters, coinciding with low VEGF-A protein concentrations.In contrast, the HIRDS clusters displayed an opposite pattern.Interestingly, we did not find differences in the gene expression of VEGFA between the clusters (Fig. 6C) and in general low correlation between RNA-seq and corresponding protein level (see Supplementary Results and Figure S14 for more details).
The importance of different cell types was underlined by single cell gene expression analysis.PBMC RNA-seq data from 14 individuals with depression and other mental disorders indicated a predominant SER-PINF1 expression in dendritic cells (Fig. 6D and 6F), while VEGFA was mainly expressed in monocytes (Fig. 6E-F), showing the importance of cell types even when not included directly in the clustering.This cell type specificity was confirmed in healthy subjects of the human protein atlas (Uhlen et al., 2019).

Discussion
In this study, we analyzed a transdiagnostic sample of individuals with stress-related mental disorders from two cohorts to identify patterns of depression symptoms in relation to immune alterations, thereby broadening the understanding of immune-related subgroups in depression.Traditionally, such subgroups have been defined by specific immune markers such as CRP or TNF, known to be increased in some depressed patients (Haapakoski et al., 2015;Osimo et al., 2020), especially those resistant to antidepressant treatment (Chamberlain et al., 2019;Strawbridge et al., 2015;Yang et al., 2019).Our large-scale multiomics analysis, integrating gene expression data of 12,210 genes with immune marker profiling of 43 makers including CRP, suggests a more complex immune involvement in depression symptoms than achieved by examining a limited set of markers like CRP alone.Our approach uncovered four distinct clusters dissecting depression symptoms across the diagnostic spectrum that were partly replicated in a small replication sample: two high immune-related-depression symptoms (HIRDS) clusters with increased immune markers, BMI, and depression severity; a mild depression symptoms (MIDS) cluster, predominantly younger participants with lower depression severity and minimal immune marker elevation; and a low immune-related-depression symptoms (LIRDS) cluster with elevated depression severity but without increased immune markers.This nuanced classification offers deeper insights into the interplay between immune function and depression symptoms.
Incorporating a multi-omics integration clustering approach has clearly demonstrated its value, providing unique insights through the combination of phenotypic data, immune markers and the entire transcriptome.This was especially evident in identifying HIRDS clusters with elevated immune markers such as CRP, TNF, IL-1RA, PlGF, CCL2, CCL4, and CCL13.Measuring IL-1 beta posed a challenge due to its low concentration, even with high-sensitivity assays, highlighting the difficulties in detecting certain cytokines.Interestingly, IL-1RA, typically anti-inflammatory, is produced under the same inflammatory conditions as the proinflammatory IL-1 beta, linking it to an increased risk of developing depressive symptoms in elderly (Milaneschi et al., 2009;Osimo et al., 2020).Consistent with this, we found heightened IL-1RA levels in our HIRDS clusters, which also exhibit altered appetite patterns, aligning with findings from Simmons et al. (2020).These clusters further show a clear elevation in overall depression severity and BMI, along with differences in sleeping patterns and a trend towards increased appetite.These patterns support the concept of immunometabolic depression, where heightened inflammation correlates with disrupted energy balance, manifesting as obesity and fatigue (Lamers et al., 2020;Milaneschi et al., 2020;Simmons et al., 2020).
Our research sheds light on the role of less-studied chemokines in depression.Elevated CCL2 in blood, CSF, and in post-mortem brain tissues of depressed patients (Eyre et al., 2016;Sørensen et al., 2023;Torres-Platas et al., 2014;Young et al., 2014), alongside mixed data for CCL4 (Camacho-Arroyo et al., 2021;Leighton et al., 2018;Sørensen et al., 2023) and limited research on CCL13, with only one study that found lower levels in suicide attempters (Janelidze et al., 2013), indicate their intricate roles in this symptom spectrum.Additionally, it highlighted PlGF, part of the vascular endothelial growth factor family (De Falco, 2012), implicated in angiogenesis, the immune response, and obesity-related processes (Lijnen et al., 2006;Oura et al., 2003), which was previously found to have lower levels in depressed patients (Yue et al., 2016).By employing a multi-panel approach to simultaneously measure a wide array of immune markers, we elucidated the complex roles of chemokines and established markers in depression symptom spectrum.This strategy not only identifies specific immune markers associated with depression symptoms but also advances our understanding of the disorder's immunological aspects, suggesting a promising direction for future research.
The immune markers were crucial for differentiating between the HIRDS clusters and others, while key genes and the gene set analysis were instrumental in separating between clusters of varying depression severity.The panel of important genes includes genes that are associated with neurodevelopmental disorders (NRCAM) (Kurolap et al., 2022), neuronal development (NAP1L2) (Attia et al., 2007) or that were downregulated in post mortem choroid plexus from patients with MDD (FBLN2) (Turner et al., 2014).This underlines that depression not only affects the brain but other tissues as well and demonstrates our approach's effectiveness at identifying genes differentiating depression severity.Our gene set analysis further supports this, revealing a prevalence of pathways related to neurons and axonal transport in the high depression severity clusters.While immune system pathways were enriched in the genes differentiating the clusters, the immune marker data primarily defined the immune-related depression symptom clusters, underscoring the value of this integrative approach.
The significance of BMI and age in relation to depression symptoms in our analysis added to the ongoing debate about whether BMI adjustments in immune marker analyses clarify or confound the relationship with depression (Horn et al., 2018;Moriarity et al., 2023).Several studies suggest that CRP's association with depression is more pronounced without BMI-adjustments (Chae et al., 2022;Figueroa-Hall et al., 2022;Horn et al., 2018).Recently, a simulation study demonstrated that including BMI as a covariate can lead to reduced precision in estimating the relationship of inflammation on depression (Moriarity et al., 2023).Consistently, our secondary clustering analysis adjusting for BMI, sex and age led to a different cluster solution and our findings revealed no significant differences in depression severity or inflammation status across the clusters, underscoring the complex interplay among these factors.Rather than viewing age and BMI simply as confounding factors, our results suggest these factors should be considered integral components of the depression phenotype.This has important implications for clinical practice, as it suggests that interventions targeting obesity and promoting healthy aging may not only improve physical health but also reduce the risk or severity of depression.Furthermore, while elevated BMI and age indicate an inflammatory phenotype, biological measurements such as immune markers and gene expression data are needed to determine the exact inflammation status and possible treatment options on an individual level.
Our analysis extended beyond clinical depression severity to heart rate variability and neuroimaging, differentiating the clusters.While not corrected for covariates, heart rate variability (SDNN) was lower in the LIRDS and HIRDS clusters and could discriminate them from the MIDS cluster, aligning with previous findings that link reduced heart rate variability to current and former depression, as well as dysphoria (Dell'Acqua et al., 2020;Hartmann et al., 2019;Koch et al., 2019), which suggests broader physiological implications of depression.Structural neuroimaging revealed a trend of relatively higher TGM volume in the LIRDS cluster compared with the HIRDS and MIDS clusters, with similar effects for TBV and TWM analyzed separately.Regional analyses confirmed inverse changes of cerebrospinal fluid (CSF), specifically of the lateral ventricle volume.In the presence of acute depressive symptoms, it seems that the LIRDS constellation protects from CSF enlargement.Larger ventricles have been observed in MDD with early disease onset before 21 years of age (Schmaal et al., 2016) and in bipolar disease (Hibar et al., 2016) which could suggest that stronger immune activation might represent subclinical or masked bipolar traits.Regional effects as detected for the bilateral hippocampi and the right rostral anterior cingulate cortex, were heterogeneous between the two HIRDS clusters.While these structures were reported before in immune-based stratifications (Savitz et al., 2013) and are plausibly involved in stress response regulation (Herman et al., 2016), larger samples are needed to fully map the anatomical characteristics of immune-related depression.
Our exploratory analysis supports the hypothesis that depression is associated with alterations in cell type composition and immune marker levels (Foley et al., 2022;Lynall et al., 2020).Integrating cell type proportions into the clustering analysis refined our initial clusters, revealing three re-clusters with varying depression severity and aligning well with the pattern observed in the initial clustering.This highlights the robustness of our subgroups.The reHIRDS cluster, marked by elevated BMI and immune markers, showed an intermediate alteration in T cells CD4 memory resting proportion deviating from Foley et al. (2022) who reported no significant CD4 + T cells proportion changes in depression.Conversely, our reHIRDS and intermediate depression symptoms (reIDS) clusters reflected their findings of decreased lymphocyte percentage in depression.In contrast to the results from Lynall et al. (2020) who observed elevated immune markers and absolute immune cell counts in their immune-related depression cluster, our reHIRDS cluster did not exhibit the strongest cell type phenotype.This discrepancy could stem from methodological differences, such as direct cell counts versus deconvolution.This suggests a complex interplay between immune markers and PBMC composition in depression.Possible sources of immune markers like neutrophils (Tecchio et al., 2014) and adipose tissue (Shelton and Miller, 2011;Fain, 2006;Shelton et al., 2015) highlight the importance of obesity in immune-related depression.Our findings emphasize the need for further research into the associations of immune cell composition and immune markers in this context.
The advantages of a multi-omics integration were further demonstrated by revealing cell type specific inflammation of immune-related depression symptom clusters.VEGF-A was elevated in the HIRDS clusters, contrasting the SERPINF1 expression, which showed the opposite pattern.This aligns with its gene product, pigment epithelium-derived factor (PEDF), known to inhibit VEGF-A (Dawson et al., 1999).Of note, PEDF is neurotrophic (Tombran-Tink and Barnstable, 2003) and was shown to ameliorate depression-like symptoms in mice (Tian et al., 2020) and was increased in depression patients after electroconvulsive treatment (Ryan et al., 2017).This suggests a link of the dysregulated immune pathway in the HIRDS clusters and the observed elevated depression severity.
Furthermore, our single cell data revealed that SERPINF1 is mainly expressed in dendritic cells, suggesting a potential reduction of these cells within immune-related depression symptom clusters.This finding merits additional exploration, as it could open new pathways for understanding and potentially targeting the immune-related aspects of depression symptoms.
Our study has some limitations worth noting.The heterogeneity of our sample, which includes a transdiagnostic sample of both in-and outpatients with varying medication usage and comorbid conditions, may affect the generalizability of our findings.While the MIDS cluster was mainly driven by healthier and younger BeCOME participants, the LIRDS and HIRDS clusters did not have a significantly different cohort distribution.Additionally, incomplete assessment of somatic comorbidities and medication use for the entire sample could influence results.The imaging and psychophysiology data represents only a subset of participants (63 %), and the cell type composition was inferred rather than being directly measured, potentially impacting accuracy.Furthermore, the top discriminating genes did not effectively predict treatment response in an external dataset.Future research with larger, more diverse and longitudinal datasets is needed to validate our findings, further explore the predictive utility of our identified clusters and gene signatures, and investigate the causal relationships between these factors and treatment response.
Taken together, our multi-omics integration analysis successfully discovered two clusters with immune-related depression symptoms, supporting the immuno-metabolic depression hypothesis and highlighting the importance of biological variables such as age and BMI.Incorporating single cell data, we uncovered cell type specific inflammation dysregulation involving SERPINF1 and VEGF-A, both previously implicated in depression.This integrated approach, recognizing depression's heterogeneity, enhances our understanding of its complexity by exploring symptoms across diagnoses.It highlights novel immune markers and genes as potential targets for clinical stratification and new therapeutic intervention.

Fig. 1 .
Fig. 1.Analysis overview: 237 participants with a DSM-IV (subthreshold) diagnosis according to the Munich-Composite International Diagnostic Interview were clustered with a multi-omics clustering.The different layers contained phenotypes (age and BMI, orange), 43 immune markers measured in plasma (green) and RNAseq from peripheral blood mononuclear cells (PBMCs, blue).The clustering was performed repeatedly as a consensus clustering on subsets of the data to obtain stability metrics for the optimal number of clusters.The stratification into clusters was characterized by several phenotypes.MIDS = mild depression symptoms cluster, HIRDS = high immune-related depression symptoms cluster, LIRDS = low immune-related depression symptoms cluster.

Fig. 2 .
Fig. 2. Initial clustering with age, BMI, immune markers and RNA-seq data.(A) Heatmap of immune markers where the columns are participants while the cohort and cluster are noted at the top.The color denotes the normalized concentration with green values missing.(B) Barplot displaying the variable importance of the top 40 variables that most differentiate the clusters, calculated using the F-value from an ANOVA model with the clusters as independent variables.These variable importance values should not be interpreted as p-values, because the variables were used in the clustering themselves and therefore inflate the p-values.Violin plots depicting the distribution of (C) age and (D) BMI across the clusters.N = 36 controls without a DSM-IV diagnosis not used in the clustering, MIDS = mild depression symptoms cluster, HIRDS = high immune-related depression symptoms cluster, LIRDS = low immune-related depression symptoms cluster, cohort 1 = OPTIMA, cohort 2 = BeCOME.

Fig. 3 .
Fig. 3. Distribution of biological and phenotype data across the clusters in the initial clustering with age, BMI, immune markers and RNA-seq data.(A) − (C) Scatterplots illustrate the associations between C-reactive protein (CRP), BMI and age across the four clusters, as determined by locally estimated scatterplot smoothing (LOESS).(D) Violin plots showing the distribution of Beck Depression Inventory (BDI)-II across the clusters.Significant differences (p < 0.05) were observed when comparing HIRDS1, LIRDS and HIRDS2 to the MIDS cluster and the controls without a DSM-IV diagnosis.(E) Violin plots illustrating the Standard Deviation of normal-to-normal intervals (SDNN), a measure for heart rate variability, for 149 participants.The significant differences (p < 0.05) were observed between HIRDS1, LIRDS and HIRDS2 compared to the MIDS cluster.(F) Violin plots illustrating the cluster-specific effect (global intercept + cluster-specific effect + residuals) of the lateral ventricle volume from a model including age, sex and intracranial volume (ICV) derived from structural magnetic resonance imaging for 151 participants.A significant difference (p < 0.05) was observed between the clusters.MIDS = mild depression symptoms cluster, HIRDS = high immune-related depression symptoms cluster, LIRDS = low immune-related depression symptoms cluster.

Fig. 4 .
Fig. 4. Immune marker and RNA-seq data contains complementary information in the initial clustering with age, BMI, immune markers and RNA-seq data.(A) Violin plots showing the distribution of the seven most differentiating immune markers across the clusters.(B) Violin plots showing the distribution of the seven most differentiating genes across the clusters.(C) Heatmap of the 20 most differentiating GO gene sets ordered by their variable importance that are enriched in the clusters.These gene sets were identified through Gene Set Variation Analysis (GSVA).(D) Heatmap showing the Pearson correlation between the top immune markers.(E) Heatmap showing the Pearson correlation between the top genes.MIDS = mild depression symptoms cluster, HIRDS = high immune-related depression symptoms cluster, LIRDS = low immune-related depression symptoms cluster.

Fig. 5 .
Fig. 5. Exploratory analysis with age, BMI, cell types, immune markers and RNA-seq data.(A) Stacked barplot illustrating the proportion of various cell types predicted from the RNA-seq data for all participants.(B) Mapping of the participants from the initial clustering to the re-evaluated clustering that additionally contains cell type proportions.(C) Barplot showing the variable importance of the top 30 variables that most differentiate the clusters, along with the 10 most differentiating biological variables and immune markers.(D) Violin plots illustrating the predicted cell type proportions across the clusters.(E) Heatmap of the 20 most differentiating GO gene sets, ordered by their variable importance, that are enriched in the clusters.These gene sets were identified through Gene Set Variation Analysis (GSVA).reMIDS = re mild depression symptoms cluster, reIDS = re intermediate depression symptoms cluster, reHIRDS = re high immune-related depression symptoms cluster, NK = natural killer cells.

Fig. 6 .
Fig. 6.Cell-type specific dysregulation in the initial clustering with age, BMI, immune markers and RNA-seq data.Violin plots displaying: (A) the gene expression of SERPINF1 across the clusters, (B) normalized protein concentration of VEGF-A across the clusters, and (C) gene expression of VEGFA across the clusters.(D) − (F) Single-cell RNA-seq data of peripheral blood mononuclear cells (PBMCs) from 14 patients with depression and other mental disorders visualized using UMAP plots.(D) Demonstrates the gene expression of SERPINF1.(E) Shows the gene expression of VEGFA.(F) Illustrates the cell types identified within the PBMCs.NK = natural killer cells, MIDS = mild depression symptoms cluster, HIRDS = high immune-related depression symptoms cluster, LIRDS = low immune-related depression symptoms cluster. .

Table 2
Replication sample overview with either percent of participants and absolute numbers in brackets or mean and standard deviation in brackets.No variable is significantly different distributed compared to the participants used for the clustering.