A network analysis of depressive symptoms and metabolomics

Background Depression is associated with metabolic alterations including lipid dysregulation, whereby associations may vary across individual symptoms. Evaluating these associations using a network perspective yields a more complete insight than single outcome-single predictor models. Methods We used data from the Netherlands Study of Depression and Anxiety (N = 2498) and leveraged networks capturing associations between 30 depressive symptoms (Inventory of Depressive Symptomatology) and 46 metabolites. Analyses involved 4 steps: creating a network with Mixed Graphical Models; calculating centrality measures; bootstrapping for stability testing; validating central, stable associations by extra covariate-adjustment; and validation using another data wave collected 6 years later. Results The network yielded 28 symptom-metabolite associations. There were 15 highly-central variables (8 symptoms, 7 metabolites), and 3 stable links involving the symptoms Low energy (fatigue), and Hypersomnia. Specifically, fatigue showed consistent associations with higher mean diameter for VLDL particles and lower estimated degree of (fatty acid) unsaturation. These remained present after adjustment for lifestyle and health-related factors and using another data wave. Conclusions The somatic symptoms Fatigue and Hypersomnia and cholesterol and fatty acid measures showed central, stable, and consistent relationships in our network. The present analyses showed how metabolic alterations are more consistently linked to specific symptom profiles.


Introduction
Major Depressive Disorder (MDD) is a condition with a high burden of disease, which is partly due to comorbidity with other highly disabling somatic conditions such as cardiometabolic disorders (Gold et al., 2020;Liu et al., 2020;Otte et al., 2016).The metabolic alterations involved in cardiometabolic disorders may represent a common biological pathway connecting cardiometabolic disorders to MDD as well (Milaneschi, Lamers, Berk, & Penninx, 2020;Milaneschi, Simmons, van Rossum, & Penninx, 2019).Consistent with this hypothesis, metabolomics research using lipidomic panels identified associations between depression and higher levels of very low density lipoprotein (VLDL) and triglyceride particles, and negative associations with high density lipoprotein (HDL) (Bot et al., 2020).Furthermore, such metabolic signatures appear to be more consistently associated with the presence and severity of depression rather than anxiety (De Kluiver et al., 2021).
Depression is highly heterogeneous, and the link with metabolic alterations may vary as a function of the symptom profile expressed, with certain depressive symptoms being more strongly linked to metabolic alterations.Indeed, studies show that metabolic alterations (e.g.high visceral adipose tissue, inflammation, low HDL levels) are more strongly connected with atypical symptoms that characterise altered energy balance, such as Sleeping too much, Increased appetite, and Low energy levels (Alshehri et al., 2021;Beijers et al., 2019;Brydges et al., 2022;Lamers, Milaneschi, De Jonge, Giltay, & Penninx, 2018;Milaneschi et al., 2020).
The majority of studies investigating associations between depression and metabolic markers have examined mainly univariate associations (e.g.single outcome-single predictor models).Merely studying individual associations, however, does not do justice to the complexity and heterogeneous nature of depression.A growing field of research attempts to incorporate the heterogeneity of depression by applying network theory and models to these data (Borsboom, 2017;Borsboom et al., 2021;Borsboom & Cramer, 2013;Boschloo, van Borkulo, Borsboom, & Schoevers, 2016;Fried et al., 2017Fried et al., , 2020;;Isvoranu et al., 2020;Van Borkulo et al., 2015).This approach has the advantage that it allows us to find associations in a system, conditioning on all variables simultaneously.Using this analytical approach, Kappelmann et al. (2021) studied links between 5 different polygenic scores of immunemetabolic traits and 7 depressive symptoms using a network model, finding that the C-Reactive Protein (CRP) polygenic score was associated with change in appetite, and to a lesser extent with anhedonia.Other studies found associations between CRP, Tumour Necrosis Factor α (TNF-α), and Interleukin-6 (IL-6) with depressive symptoms of Sleep problems and Appetite/ Weight changes (Fried et al., 2020;Moriarity & Alloy, 2020;Moriarity, Horn, Kautz, Haslbeck, & Alloy, 2021).Whereas these networks capture a bigger picture of associations between psychopathological symptoms and (small sets of) biomarkers, a networkwide approach has yet to be applied to large panels of metabolic markers (i.e.metabolomics).
To close above gaps, this paper explored associations between depressive symptoms and a metabolomic panel of 46 markers, using Mixed Graphical Models to create a network with these variables.Lipid-based metabolites and individual depressive symptoms are expected to be highly intra-and inter-correlated; to disentangle the broad correlation matrix we applied network centrality measures allowing to identify the most relevant metabolite-symptom links in the network.One-time only assessment may not be representative of long lasting underlying connections between metabolites and depressive symptoms, because such measures could vary over time and be influenced by several factors such as medication, smoking, and BMI.Furthermore, these types of networks are prone to overfitting.Therefore, in order to identify the more reliable and consistent symptommetabolite links we conducted a stability analysis of the edges in the network and we validated the most stable connections using similar data collected from the same subjects 6 years later, additionally adjusting for different sets of covariates.

Sample
Data are from participants from the Netherlands Study for Depression and Anxiety (NESDA), an ongoing longitudinal cohort study on the long-term course and consequences of depressive and anxiety disorders.Penninx et al. (2008) and Penninx et al. (2021) provide a description of the study rationale, design, and methods.At baseline, 2981 participants were included between the ages of 18 and 65, who were recruited between 2004 and 2007 from the community (19.0%), primary care (54.0%), and specialised mental health care setting (27.0%).The 6-year follow-up (age range 23-72) consisted of 2256 participants.
Participants were either healthy controls or had a current or prior history of depressive and/or anxiety disorder.Diagnoses of depression and anxiety disorders were determined with the use of the Composite Interview Diagnostic Instrument (CIDI)lifetime version 2.1, which assesses diagnostic criteria of the Diagnostic and Statistical Manual of Mental Disorders (DSM)-IV.Participants were excluded when they did not speak fluent Dutch, or when they had a primary clinically overt other psychiatric diagnosis, such as bipolar, psychotic, obsessive compulsive, or severe addictive disorder.The Ethical Committee of all universities participating approved the NESDA project, and all participants provided written informed consent.Data collection included an extensive interview and blood collection.
The network model required complete data on all measures; thus, we selected 2498 participants with complete baseline data on all metabolites and depressive symptoms for the main analyses.For 1745 of these participants complete data was available at 6-year follow-up.

Depressive symptoms
Depressive symptoms (in the week before the baseline interview and the 6-year follow-up interview) were measured using the 30-item self-report Inventory of Depressive Symptomatology (IDS-SR30) (Rush, Gullion, Basco, Jarrett, & Trivedi, 1996).Each symptom is represented by 4 statements representing an ordinal scale in the range 0-3 (e.g.0 = 'I do not feel sad', 3 = 'I feel sad nearly all the time').For interpretation, symptoms were divided into 2 groups (online Supplementary Table S1): 13 somatic (e.g.Aches and pains, Sleeping too much) and 16 mood/cognition (e.g.Feeling sad, General interest) symptoms.This grouping is based on a factor analysis performed by Wardenaar et al. (2010), where the symptoms were clustered into 3 factors (i.e.'mood/cognition', 'anxiety/arousal', and 'sleep').As previously done by Vreijling et al. (2021), the 'anxiety/arousal' and 'sleep' factors were merged into a new factor labelled 'somatic', as the sleep profile contained only 4 variables.The symptom Diurnal variation of mood was removed from either category, as it associates poorly with both groups (Rush et al., 1996;Vreijling et al., 2021;Wardenaar et al., 2010).The symptoms Weight change and Appetite change were split into 4 separate variables: Weight or Appetite increase and decrease.

Metabolomic biomarkers measurement
Participants were asked to provide overnight fasted blood samples during the baseline interview in the morning (between 08.30 and 09.30).Plasma samples were stored in the ethylenediaminetetraacetic acid (EDTA) detergent, and kept at a temperature of −80 °C until they were assayed.The baseline samples were shipped in 2 batches (April and December 2014, referred to as batch 1 and batch 2 respectively); the 6-year follow-up samples were shipped in a single batch.Assessments were based on a proton nuclear magnetic resonance (NMR) platform (Nightingale Health Ltd., Helsinki, Finland) (Soininen, Kangas, Würtz, Suna, & Ala-Korpela, 2015) quantified concentrations of several lipid-related metabolites and their ratios.
Among the measures provided by the platform, we selected concentrations of 51 lipids, fatty acids, and low molecular weight metabolites as in a previous study by De Kluiver et al. (2021).This study also found 13 metabolites to be batch-sensitive, meaning the metabolite levels were different across batches; we adjusted for batch.Additional sub-measures and ratios of these lipoproteins (i.e.lipid composition and particle concentration measures of lipoprotein subclasses and lipid and fatty acids ratios) were not included in the current analyses because of redundancy.Furthermore, when more than 2.5% of the participants had a missing for the same metabolite, that metabolite was removed from the dataset.This resulted in the removal of 5 metabolites (columns).Subsequently, there were 79 subjects (rows) with missing values (2.7% of the total number of subjects); these subjects were removed from the dataset.Applying the above selection criteria resulted in 46 metabolites for final analysis.These metabolites were classified into 11 subcategories (online Supplementary Table S1) according to the classification provided by Soininen et al. (2015).
Data were processed according to the manufacturer's standardised protocol previously applied in other large scale studies (Bot et al., 2020).A value of 1 was added to each value, and the natural log transformation was applied.Values that deviated >5 S.D.s from the mean were set as missing.

Descriptive variables and covariates
Baseline sex, age and sample shipment (batch 1 and 2) were used to adjust metabolite levels for the network analysis, For the regression analyses in the validation step, different sets of covariates were used: sex, age, statin use (Anatomic Therapeutic chemical C10AA, abbreviation ATC), antidepressant (AD) use (tricyclic, ATC N06AA; selective serotonin reuptake inhibitors, ATC N06AB; and other antidepressants, N06AF), Body Mass Index (BMI) based on measured height and weight, and current smoking status (yes/no).Medication info was obtained based on container inspection during the interview.Similar variables were available at the 6-year follow-up.
The analyses involved 4 steps: creating the network with MGM in the baseline data; calculating 5 network centrality measures for all variables in the network; testing the links for stability; and lastly performing a linear regression on the stable links connected to central nodes, additionally correcting for sets of covariates and replicating in the 6-year follow-up dataset.We elaborate on these 4 steps below.

Mixed graphical models
Depressive symptoms (categorical) and residualised metabolite values (Gaussian) were fed into the model using the MGM package (Haslbeck & Waldorp, 2015) (see online Supplementary Fig. S2 for distributions).Each variable is treated as a random variable, and the distribution of the whole set of variables is assumed to be factorised according to an undirected graph G.The connection of a (multivariate) probability distribution and a graph is formalised by the Global Markov Property.The variables (represented as nodes) are connected through conditional distributions.These distributions are part of the exponential family.The parameters belonging to this family function as the edge weights of the network.Some elaboration of the mathematics is provided in the online Supplemental materials.
We did not apply cross-validation nor an EBIC parameter, which are used as a penalty parameter to prohibit overfitting; the associations between symptoms and metabolites were likely to be weaker than within-group associations; by applying a penalty we would not have found these more subtle associations.The nodes were grouped together a clustering function within the package.

Network centrality measures
The network visualisation provides an intuitive overview of how variables (nodes) are connected with one another, although the interpretation of the associations (edges) is not directly straightforward.Therefore, to add to the visual inspection, 5 different network measures were used.These capture different properties of the network nodes.The 5 measures were scaled between 0 and 1, and ranked individually.Mathematical definitions can be found in Rodrigues (2019).
Betweenness -Betweenness counts how often a node is in the shortest path between all other node pairs.If a node often appears on the shortest path, it may have a mediating function for many node sets.
Closeness -Closeness for a node is the inverted sum of its average distance (according to some predefined metric) to all other nodes.If a node has a high closeness it is very central in the network, which is an indication of being able to spread 'information' efficiently through a network.
Weighted degree -The weighted degree sums the weight of all edges connected to a node, and informs us about the connectedness to other nodes.Thicker edges mean stronger associations If a node has many small incoming edges, it might have a similar weighted degree as a node with one big edge.The measure can help identify hubs.
Ratio betweenness and degree -If a node has a high betweenness, but also a high weighted degree, then the ratio of the two measures is low, and that node is relatively node-stable.In contrast if a node has high betweenness, but low degree, the ratio is high: perturbation in the system regarding that area of the network can have a big effect on the system.Consequently, nodes with high betweenness/degree ratio could be interesting as target for an intervention.This is under the assumption that overall degree scales proportionally with in-degree.
Ratio closeness and degree -Similar to how betweenness and degree can say something about node-stability, closeness and degree can do the same thing.A high ratio means this node is less stable, and therefore could be interesting in terms of having a big effect on the system when the node changes (again under the assumption that in-degree and total degree are positively correlated).The links in these networks are correlations, and although correlation does not imply causation (the links do not need to be causal), causation does imply correlation; in this network, causal connections could be forming a subnetwork.These 2 ratio measures have potential to capture some of the causal connections.

Stability analysis
Stability analyses were performed on all potential edges in the network through bootstrapping (Haslbeck & Waldorp, 2015).This method runs the model 50 times for potential edges, and resulted in an edge-stability assessment for 2850 links †1 .We selected edges which were connected to variables scoring highest on at least 1 of the 5 network measures.For these edges we showed the 95% confidence interval (2.5th and 97.5th percentiles) and mean of the edge strengths.Additionally, we provided the fraction of times that the edge was present in the bootstraps.This provides an edgestability measure for the links that were found to be important in the centrality assessments.

Validation
As a validation step for the stable links connected to central nodes, we performed a linear regression.Because these links might arise from unknown mechanisms, we corrected for different sets of covariates: no additional covariates (only sex and age); statin use; AD use; BMI; BMI and smoking; BMI, smoking and AD use; and BMI, smoking AD and statin use.Lastly, we compared these associations with the 6-year follow-up dataset as an additional step to confirm consistency in the findings.

Descriptives
The baseline sample included 2498 subjects with complete data on metabolites and depressive symptoms; the 6-year follow-up sample included 1745.The descriptives and covariates are provided in Table 1.Properties of the datasets after cleaning were similar to the original datasets, apart from sample size, which was reduced by 16% in the baseline, and 23% in the 6-year follow-up datasets.
Online Supplementary Fig. S1 shows the distribution of the 30 IDS-SR items at baseline.The symptoms are ranked from most prevalent to least prevalent.The most endorsed symptoms included Problems sleeping during the night, Leaden paralysis, and Problems falling asleep.The distribution of the 46 metabolites can be found in online Supplementary Fig. S2.

Network analyses
Figure 1a show the MGM network of depressive symptoms and metabolites (see online Supplementary Table S1 for extended legend).As expected, the MGM shows a clustering of symptoms, and a clustering of metabolites.A total of 28 bridge associations (associations between a metabolite and a depressive symptom) were visible.These bridge associations are highlighted in Fig. 1b by leaving out all intra-group associations.Figure 1b shows 8 somatic symptoms (61.5% of all somatic symptoms) and 8 mood/cognition symptoms (50.0% of all mood/cognition symptoms) are connected to metabolites.
There are no direct connections of symptoms with apolipoproteins (dark-green) cholesterol (pink), glycerides and phospholipids (light-orange) and fatty acids (dark-orange).However, connections are visible between depressive symptoms and amino acids (light-green); lipoprotein particle size (red); total fatty acids and saturation measures (light-purple); fluid balance (dark-purple); glycolysis related metabolites (yellow); inflammation (brown) and lastly, ketone bodies (light-blue).

Network centrality measures
Mere visualisation of number of connections alone might miss which variables have high centrality measures, or have a high connectedness to the other nodes in the network.Figure 2 shows the variables that score high on the 5 network measures, that is those that are among the top 3 for at least 1 of the 5 centrality measures (8 depressive symptoms of which 3 somatic, and 7 metabolites).The measure values have been scaled for comparability.
Increase in weight, Sleeping too much, and Low energy levels stand out for the depressive symptoms.For the metabolites we see glucose has high centrality measures, histidine has the highest score for closeness/degree, and mean diameter size for VLDL particles has a high closeness value.In the next step we show that some of these variables also had a high stability.

Stability analysis
For the variables with a high centrality score we evaluated the edge-stability, an important measure when evaluating associations in a network.Bootstraps were performed on the node pairs of which at least one had the highest score for one of the five network measures, including: Sleeping too much; Low energy level/fatigue; Self-criticism and blame; glucose, citrate; and histidine.The model was bootstrapped 50 times for each potential link.
In Fig. 3, the bars show the average edge-strength for each node pair and the 95% confidence interval.The numbers in the graph stand for the fraction of how often an edge was present in one of the 50 bootstraps.The edges are ranked on average edgestrength.What stands out in Fig. 5 are the top two links, Low energy level and mean diameter of VLDL particles (16-49), and Low energy level and Estimated degree of unsaturation (16-68): these have an edge-stability of 98% and 92%, respectively.The third in rank, Sleeping too much and histidine (4-32), also has a relatively high and stable association.

Validation through regression baseline and 6-year follow-up
Lastly, we performed a validation step.We estimated the 3 most stable associations through a linear regression adjusting for different sets of covariates on baseline dataset.As a consistency check the regressions were performed using 6-year follow-up data.The linear regressions are the last step of the statistical analyses, done to check what the influence of covariates was on the associations found in the network and stability analysis, and to check whether the association is consistent over time.As shown in Table 2, there was a significant association for all three pairs in the baseline dataset: Low energy level was positively associated with mean diameter of VLDL particles and negatively with estimated degree of unsaturation; Sleeping too much was negatively associated with histidine.The associations of Low energy level with mean diameter of VLDL particles and estimated degree of unsaturation were confirmed in the 6-year follow-up data.Overall, adjustment for lifestyle and medications partially reduced the strength of the associations, indicating a role of these factors in the link between depressive symptoms and metabolites.

Discussion
In light of high associations between depression and cardiovascular disease risk, this study aimed to examine co-expression of metabolic markers and depressive symptoms using network modelling.The network contained a 46 parameter lipidomics panel and 30 depressive symptoms.Analyses showed patterning of associations across symptoms and metabolites whereby some depressive symptoms showed a relatively more central role (e.g.Low energy level, Sleeping too much and Weight increase).The connections between these symptoms and specific metabolites were stable, and consistent across two different time assessments, most notably associations between Low energy level and VLDL particle size and fatty acid saturation.These associations replicate prior findings (Brydges et al., 2022;Drevets, Wittenberg, Bullmore, & Manji, 2022;Grela et al., 2016;Knowles et al., 2017;Lamers et al., 2018;Milaneschi et al., 2020).VLDL particle size summarises the size of cholesterols and is also highly correlated with several cholesterol variables (Bowden, Wilson, & Beaujean, 2011;Sacks & Campos, 2003).
Depressive symptoms such as Sleeping too much, Low energy level and Increase in weight form a subgroup referred to as atypical, energy-related symptoms.These share associations with cardiometabolic diseases (Alshehri et al., 2021;Lamers et al., 2018), and inflammatory and metabolic dysregulations (Brydges et al., 2022;Lamers et al., 2020;Milaneschi et al., 2020), which is congruent with the present findings.This clustering of atypical energy-related symptoms sharing associations with heightened levels of inflammation is referred to as Immunometabolic Depression (IMD).Network centrality measures can give an interpretation of the potential role of a node in the network.Low energy level scored high on weighted degree, implying the node is a stable factor in the network.Furthermore, the node scored high on closeness and betweenness, which could mean Low energy might have a mediating role in its spreading effects to other variables.Mean diameter for VLDL particles showed mostly a high score for closeness, which is a measure that indicates the spread of information through a system.
Sleeping too much scored high on the two ratio centrality measures, implying this variable might have potential to alter the system.Lastly, histidine scored highest on the ratio measure closeness/degree, which suggests that histidine can spread information efficiently and is node-unstable, meaning it can perturb the system when changed.Histidine is a semi-essential amino acid involved in protein interactions and synthesis (Brosnan & Brosnan, 2020;Liao, Du, Meng, Pang, & Huang, 2013).Lowered histidine levels and sleep deprivation are correlated (Nakamura et al., 2022;Sasahara, Fujimura, Nozawa, Furuhata, & Sato, 2015;Thalacker-Mercer & Gheller, 2020).
It is generally known that biomarkers are not independent of each other, and neither are depressive symptoms; applying a network to the data respects these underlying distributions and interdependencies of the variables.Furthermore, we did not only show the resulting networks (which are prone to overfitting), but also performed a stability test on all the links that had a high centrality.This is an important step in the network analysis as spurious links should be filtered out.The validation confirms that the associations did not merely arise through confounding factors.NESDA does not only provide a large dataset, it also allowed us to check for consistency using the 6-year follow-up, which significantly strengthens the legitimacy of the found associations.Networks can better capture which variables are associated, as depression is known to be a highly heterogeneous disorder.Lastly, this approach allows for more layers of disease outcomes or biomarkers to be added.Some critical remarks are warranted.While the centrality measures are intuitive, and the implementation is straight-forward; more specialised bridging-measures could be applied in future work.Examples are Jensen et al. (2016) and Valente and Fujimoto (2010).These are more complex to implement but give a more fine-tuned interpretation of the network's nodes.We also did not make use of a penalisation parameter, which is applied to prohibit the finding of spurious associations in the network.We did this because we were interested in the associations between symptoms and metabolites, which are substantially weaker than symptom-symptom and metabolite-metabolite associations.Penalising all the links similarly would have resulted close to no symptom-metabolite links.However, we ensured elimination of potential spurious results through the stability analysis and validation steps.When analysing networks with either only continuous, or only discrete random variables, we recommend the Network Comparison Test by Van Borkulo et al. (2022).This is a useful tool to test for stability; however, our data was mixed, so this package could not be applied to this network.Lastly, the methodology is new, but associations between depressive symptoms and metabolites have been studied before.
The added benefit would be to also add extra layers of biomarkers or disease outcomes.An obvious limitation of studying crosssectional data is that causality is not inferable.This work offers preliminary insights in the associations between metabolites and depressive symptoms, and invites further investigations, e.g., longitudinal analyses based on the findings of this paper.We illustrate that applying the network approach to biological and symptom data, provides useful insights in core aspects of the biology-symptom links.Some symptoms are more linked to certain metabolites than others, and serve as bridge-associations between the symptoms-group and metabolite-group.This should be used in future large-scale biological and psychiatric datasets.Lastly, our findings suggest studying metabolic dysregulations in a subgroup of patients suffering from atypical, energy-related symptoms.
Due to the complexity of the interplay of symptoms and metabolites, and especially the differential expression of depressive symptoms, the application of adequate statistical techniques is essential.By creating a network of individual depressive symptoms and a full lipidomic panel we found a subset of symptoms and markers showing strong connections.Our results thereby add to the needed insights of MDD's heterogeneity and illustrate the important roles of Low energy level, VLDL cholesterol particle size, and saturation of fatty acids.

Figure 1 .
Figure1.MGM network visualisation of pairwise associations between depressive symptoms and metabolites.

Figure 2 .
Figure 2. Variables (depressive symptoms and metabolites) which are ranked in the top-3 for at least one of the 5 centrality measures.

Table 1 .
Descriptives and covariates of the NESDA baseline and six-year follow-up datasets

Table 2 .
Results of the linear regression