Low-dose interleukin 2 antidepressant potentiation in unipolar and bipolar depression: Safety, efficacy, and immunological biomarkers

Immune-inflammatory mechanisms are promising targets for antidepressant pharmacology. Immune cell abnormalities have been reported in mood disorders, showing a partial T cell defect. Following this line of reasoning, we defined an antidepressant potentiation treatment with add-on low-dose interleukin 2 (IL-2). IL-2 is a T-cell growth factor which has proven anti-inflammatory efficacy in autoimmune conditions, increasing thymic production of naïve CD4+ T cells, and possibly correcting the partial T cell defect observed in mood disorders. We performed a single-center, randomised, double-blind, placebo-controlled phase II trial evaluating the safety, clinical efficacy and biological responses of low-dose IL-2 in depressed patients with major depressive disorder (MDD) or bipolar disorder (BD). Thirty-six consecutively recruited inpatients at the Mood Disorder Unit were randomised in a 2:1 ratio to receive either aldesleukin (12 MDD and 12 BD) or placebo (6 MDD and 6 BD). Active treatment significantly potentiated antidepressant response to ongoing SSRI/SNRI treatment in both diagnostic groups, and expanded the population of T regulatory and T helper 2 cells, and the percentage of naïve CD4+/CD8+ immune cells. Changes in cell frequencies were rapidly induced in the first five days of treatment, and predicted the later improvement of depression severity. No serious adverse effect was observed. This is the first randomised controlled trial (RCT) evidence supporting the hypothesis that treatment to strengthen the T cell system could be a successful way to correct the immuno-inflammatory abnormalities associated with mood disorders, and potentiate antidepressant response.


Introduction
Despite tremendous improvement in antidepressant psychopharmacology based on drugs directly affecting neurotransmitter function, current consensus is that one-third of patients with Major Depressive Disorder (MDD) do not achieve full symptomatic remission, and that in individuals with ineffective initial treatment, many relapses are observed despite continuing apparently effective subsequent treatment, paving the way to treatment-resistant depression (TRD) (Sforzini et al., 2022). Outcomes are even worse in Bipolar Disorder (BD), which has been associated with extremely low success rates of antidepressant drugs in naturalistic settings (Post et al., 2011; Post et al., 2012): hence the need for continuous research on pathogenetic mechanisms to address the clinical need for more precisely targeted and effective antidepressant treatment.
Consistent evidence supports immune cellular and inflammatory mechanisms as possible targets for antidepressant pharmacology. Postmortem neuropathology detected increased microglia density and activation, and lymphocyte infiltration, in the brain of suicides (Naggan et al., 2023; Schlaaff et al., 2020; Steiner et al., 2013). Patients with depression show high levels of inflammatory compounds in circulating blood and in CSF; higher baseline immuno-inflammatory setpoints related to circulating cytokines, chemokines, and leukocyte gene expression hamper response to antidepressants; the number of failed treatment trials is associated with higher levels of inflammatory markers (Haroon et al., 2018); effective antidepressant treatment decreases inflammatory markers; and add-on treatment with immune-modulatory and anti-inflammatory drugs (e.g., minocycline, celecoxib, infliximab) can promote response in TRD (Benedetti and Vai, 2023; Benedetti et al., 2022; Branchi et al., 2021; Kappelmann et al., 2018). However, response to these kinds of treatments has been suggested to be exclusive to patients with signs of inflammation (Raison et al., 2013), thus possibly explaining why some studies with these same compounds (i.e., minocycline, infliximab, cox-inhibitors) failed to observe significant improvements (Husain et al., 2020; McIntyre et al., 2019).
Specific immunopsychiatric antidepressant treatment is however in its infancy, with yet uncertain targets and predictors of response, elusive mechanisms, and a paucity of randomized controlled trials (RCT). Patients with mood disorders show signs of systemic low-grade inflammation due to decreased adaptive and increased innate immunity, with higher macrophage/monocyte inflammatory activation, and higher neutrophil to lymphocyte counts. Immune cellular abnormalities have been described, with a lifetime dynamic pattern of premature immunosenescence and partial T cell defect associated with premature T cell aging. This involves a reduction of naïve CD4+ T cells and an expansion of memory and senescent T cells (Bauer et al., 2015; Becking et al., 2018; Bekhbat, 2022; Poletti et al., 2017; Simon et al., 2023; Simon et al., 2021a; Swallow et al., 2013). The immune aberrancies trigger a cascade of events which leads to decreased monoaminergic neurotransmitter function, altered brain glutamatergic activity, and brain structural and functional abnormalities (Benedetti et al., 2020; Bravi et al., 2022; Comai et al., 2022; Haroon et al., 2017; Poletti et al., 2020).
Based on this evidence, we hypothesized that treatments able to strengthen the T cell system could be a successful way to correct these abnormalities, and possibly potentiate antidepressant response. We considered interleukin 2 (IL-2) a good candidate for this purpose. IL-2 is a T-cell growth factor, increasing thymic production of naïve CD4+ T cells even in severe immunodeficiency (Carcelain et al., 2003), and may therefore correct the partial T cell defect observed in mood disorders. IL-2 is essential for CD4+ Th2 differentiation (Cote-Sierra et al., 2004), controls Th1 and Th2 fate decisions in antigen receptor-activated CD4+ T cells (Ross and Cantrell, 2018), and could thus correct the Th1/Th2 shift reported in chronic patients with BD (Brambilla et al., 2014). Starting from 2011 (Koreth et al., 2011; Saadoun et al., 2011), several clinical trials in patients with autoimmune conditions showed that low-dose IL-2 specifically expands and activates CD4+ Treg cell populations and can control inflammation, with a favorable safety profile (Graßhoff et al., 2021; Klatzmann and Abbas, 2015; Rosenzwajg et al., 2019): this effect might also be useful to correct the reported Treg insufficiency in MDD (Ellul et al., 2018; Grosse et al., 2016), thanks to the Treg constitutive expression of high levels of the heterotrimeric high affinity IL-2 receptor complex, which in other CD4+ T cells, CD8+ T cells or NK cells is expressed only upon robust activation (Graßhoff et al., 2021). Finally, IL-2 acts directly as a trophic factor on both neurons and oligodendrocytes (de Araujo et al., 2009), and might then antagonize the detrimental link between low-grade inflammation and brain homeostasis observed in mood disorders.
Low-dose IL-2 has never been tested in psychiatric conditions, but in mice it attenuated depression-like behaviors in a chronic stress-induced model of depression (Huang et al., 2022), correcting the Th17/Treg balance and peripheral signs of low-grade inflammation. We then performed a randomized, placebo-controlled trial to test low-dose IL-2 as an adjunctive antidepressant treatment in depressed patients with MDD or BD, focusing on alterations in the CD4+ and CD8+ naïve and memory cells, Treg cells, Th17 cells, Th1 cells and Th2 cells, and serum parameters of low-grade inflammation (CRP, IL-6), T cell regulation (sCD25, IL-7) and neurotrophism (BDNF).

Study design, participants and treatment
The research program on the effects of low-dose IL-2 in depression was defined in 2017 in the context of the H2020 MoodStratification project (Drexhage, 2018), and articulated in two parallel trials, one to be conducted in MDD and in BD at the Mood Disorder Unit of Ospedale San Raffaele with the commercially available Aldesleukin (Acronym: IL-2REG; EudraCT N.: 2019-001696-36), and the other in BD at the Assistance Publique - Hôpitaux de Paris with ILT101 (Acronym: DEPIL-2, ClinicalTrials.gov: NCT04133233). Here we present the outcomes of the IL-2REG trial (Fig. 1).
The study was a single-center, randomised, double-blind, placebo-controlled phase II trial evaluating the safety, clinical efficacy and biological responses of low-dose IL-2 in depressed patients with MDD or BD. IL-2REG was designed and conducted in accordance with the International Conference on Harmonization Good Clinical Practice guidelines and the Declaration of Helsinki, and the study protocol, patient information sheets, and informed consent forms were approved by the Ospedale San Raffaele ethical committee (Prot. N. 77/int/2019) and by the Italian regulatory authority (AIFA). Monitoring was provided by external monitors with independent oversight (https://research.hsr.it/en/clinical-trial-center.html).
Data from previous low-dose IL-2 studies suggest a non-Gaussian distribution of the main criterion and a large effect size (Hartemann et al., 2013; Rosenzwajg et al., 2015; Rosenzwajg et al., 2019). Thus, we used the method proposed by Noether (Noether, 1986) for power calculation in the case of non-parametric tests. We calculated that a power at least equal to 80 % to detect a relevant relative effect of 0.8 (i.e., large effect size according to Cohen's criteria) in increasing Treg cell percentages after Aldesleukin vs Placebo would be achieved with a sample size of N = 36 patients, randomized on a 2:1 active/placebo ratio (24:12), with a 5 % two-sided alpha risk. A 10 % drop-out rate was considered in the calculation.
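As an illustration of this kind of calculation, the following is a minimal sketch of Noether's sample size formula for a two-sided Wilcoxon-Mann-Whitney comparison. It assumes that the "relative effect of 0.8" can be read as the probability that a randomly chosen treated patient shows a larger Treg change than a randomly chosen placebo patient; that reading, and the function and parameter names, are assumptions made here for illustration and are not taken from the study protocol.

```python
from math import ceil
from statistics import NormalDist


def noether_total_n(p_effect, alloc_fraction, alpha=0.05, power=0.80):
    """Total sample size from Noether's formula for a two-sided
    Wilcoxon-Mann-Whitney test.

    p_effect       -- assumed P(X < Y), the probability that an observation in
                      one arm exceeds an observation in the other (0.5 = no effect)
    alloc_fraction -- fraction of the total sample allocated to one arm
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = z.inv_cdf(power)           # quantile for the target power
    denom = 12 * alloc_fraction * (1 - alloc_fraction) * (p_effect - 0.5) ** 2
    return (z_alpha + z_beta) ** 2 / denom


# Illustrative inputs: 2:1 allocation, assumed P(X < Y) = 0.8
n_raw = noether_total_n(p_effect=0.8, alloc_fraction=2 / 3)
n_total = ceil(n_raw)  # round up to whole patients, before any drop-out allowance
```

Under these assumed inputs the formula returns roughly 33 patients before any allowance for drop-outs, which is of the same order as the planned N = 36; the exact inputs used by the investigators are not reported in this section.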
After confirmation of their eligibility at baseline, 36 consecutively recruited inpatients at the Mood Disorder Unit were randomised in a 2:1 ratio to receive either aldesleukin (n = 24; 12 MDD and 12 BD) or placebo (n = 12; 6 MDD and 6 BD). Treatment randomisation was performed following a randomization list created through the statistical software SPSS. Patients, investigators, nurses, people involved in the evaluation of patients and data managers were kept blind to the treatment allocation for the whole duration of the study and up to the final database lock. The study sponsor remained masked to the individual treatment arm allocation up to the freezing of the database at the end of the study.
Human recombinant IL-2 was provided by the Hospital Pharmacy as Aldesleukin, and administered for 5 weeks as add-on to ongoing antidepressant treatment, with no wash-out period. Aldesleukin at a dose of 1 M IU/day or placebo was administered by subcutaneous injection in the morning, every day for 5 consecutive days (induction phase), and then once a week from day 15 to day 36 (maintenance phase). Patients were then followed up until day 60. Dose and scheme of IL-2 administration were in agreement with established protocols for the treatment of inflammatory disorders (Louapre et al., 2023).
All patients were on antidepressant and/or mood stabilizer treatment, and continued it. Treatments had been decided by the psychiatrists in charge, independent of participation in the trial, and were kept unchanged unless significant worsening in the symptomatology of the patients required a change. According to our standard treatment protocols, selective serotonin reuptake inhibitors (SSRIs) were preferentially administered; drugs acting on 5-HT and norepinephrine (SNRIs) and tricyclic antidepressants were administered to patients who had not responded to SSRIs in their previous clinical history (https://www.nice.org.uk/guidance/cg90) (Middleton et al., 2005). Patients with BD were also taking lithium. Notwithstanding the psychiatric clinical management of their condition, including daily interviews with the psychiatrist in charge during hospitalization and at every visit thereafter, participants in IL-2REG were not administered a specific psychotherapeutic course of treatment during the study.
The first patient was recruited in January 2020, and the last visit was completed in April 2023. Eligible patients were aged 18-65 years, with a diagnosis of MDD or BD (DSM-5 criteria) and an ongoing depressive episode without suicidal ideation. Patients were already on an antidepressant and/or mood stabilizer, and had a Montgomery-Åsberg Depression Rating Scale (MADRS) score > 17. Exclusion criteria were: hypersensitivity to active substance or excipient; active infection requiring antibiotic therapy; organ failure (e.g., liver, kidney, lung and heart) or previous history of organ transplantation; immunosuppressive treatment; hepatotoxic, nephrotoxic, myelotoxic or cardiotoxic drugs; any chronic disease; leukocytes < 4000/mm³, platelets < 100 000/mm³, hemoglobin < 10.0 g/dL, red blood cells < 3.5 × 10⁶/mm³; use of anti-inflammatory medication on a regular basis for a chronic inflammatory/autoimmune condition (corticosteroids, NSAID, immunosuppressant IV-Ig); uncontrolled diabetes type I or II; cancer or history of cancer in the last 5 years; existing or planned pregnancy or lactation; for women of child-bearing potential, not using a highly effective method of contraception during treatment, including oral or injectable contraceptives or intrauterine device or system, during participation in the study; immediate risk for suicidal behaviour (3 on HDRS or 5 on MADRS); known HIV infection or clinically manifest AIDS; Parkinson's or Alzheimer's disease, or any other serious condition likely to interfere with the conduct of the trial; participation in an interventional study concomitantly or within 30 days prior to this study.

Data collection
The primary endpoint was the change in the relative concentration of peripheral blood Tregs (CD3+CD4+FoxP3+CD127loCD25hi Treg). Secondary endpoints were changes in Th1 and Th2, Naïve T cells, Central memory T cells; changes in a set of peripheral molecules associated with low-grade inflammation in mood disorders; safety and antidepressant efficacy of aldesleukin at day 36 (end of treatment) and at day 60 (end of follow-up).
At each visit (day 0-5, 15, 22, 29, 36, 60) severity of depression was rated on the Montgomery-Åsberg Depression Rating Scale (MADRS), Hamilton rating scale for depression (HDRS), and Inventory for Depressive Symptomatology Self-Rated (IDS-SR), and safety was assessed based on reported adverse events and changes in concomitant medications. Plasma samples and live peripheral blood mononuclear cells (PBMC) for immunological phenotyping were collected at baseline, day 5, and day 60 (Fig. 1).
For immunophenotyping, blood was collected by venipuncture and gradient-purified PBMC were cryopreserved until used. All samples were then processed in one batch. Data were compensated and analysed using FlowJo v.10.8.1 (FlowJo LLC, Ashland, OR). An immune profile was generated on PBMC using multiparametric flow cytometry. A 28-color flow cytometry panel was used to characterize lymphocyte phenotype and function (panel below). Percentages of viable lymphocytes, T cells (CD3+) and T cell subpopulations (T-cytotoxic CD8+ and T-helper CD4+) were identified. CD45RA and CCR7 markers were included in the panel to identify naïve and memory T cells. Antibodies that measure T cell function through detection of cytokines (IL-17, IFNγ, TNFα, GM-CSF, IL-2, IL-4) were also included. Percentage of Tregs was assessed (detailed procedure in Supplementary Methods). All cell populations are presented as relative percentages of lymphocytes.
For measuring peripheral analytes, blood was collected in 6 ml BD Serum tubes, with increased silica act clot activator and silicone-coated interior. Patients' serum was used for the quantitative determination of Interleukin 6 (IL-6), Interleukin 7 (IL-7), C-Reactive Protein (CRP), Brain-Derived Neurotrophic Factor (BDNF), and Soluble Interleukin 2 Receptor (sIL-2R/sCD25) by means of apDia (Advanced Practical Diagnostics Bv, Turnhout, Belgium) ELISA kits. For each analyte (IL-6, IL-7, CRP, BDNF, sIL-2R/sCD25) an independent ELISA protocol was performed. A standard curve was obtained by plotting the absorbance values versus the corresponding calibrator values. The concentration of the specific analyte was determined by interpolation from the calibration curve (detailed procedure in Supplementary Methods).
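The interpolation step can be sketched as follows. This is a minimal illustration that assumes piecewise-linear interpolation between adjacent calibrators; the kits' actual curve-fitting model is described only in the Supplementary Methods and may differ, and the calibrator values and function name below are hypothetical.

```python
def interpolate_concentration(absorbance, calibrators):
    """Read a concentration off a calibration curve by linear interpolation.

    calibrators -- list of (concentration, absorbance) pairs, assumed monotonic.
    Returns None when the absorbance lies outside the calibrated range,
    i.e. the sample cannot be quantified from this curve.
    """
    points = sorted(calibrators, key=lambda pair: pair[1])  # order by absorbance
    if not points[0][1] <= absorbance <= points[-1][1]:
        return None
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 <= absorbance <= a1:
            if a1 == a0:  # flat segment: avoid division by zero
                return c0
            return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)


# Hypothetical calibrators: (concentration, absorbance)
curve = [(0.0, 0.05), (10.0, 0.25), (50.0, 0.85), (100.0, 1.45)]
sample = interpolate_concentration(0.55, curve)       # falls between 10 and 50
out_of_range = interpolate_concentration(0.01, curve)  # below the lowest calibrator
```

Returning `None` for readings outside the calibrated range keeps samples below the sensitivity limit distinct from true zero values.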

Statistics
All the statistical analyses were performed with a commercially available software package (StatSoft Statistica 12, Tulsa, OK, USA) and following standard computational procedures (Dobson, 1990; Hill and Lewicki, 2006), both in the per-protocol (PP) group and in the intention-to-treat (ITT) group. The PP group included all patients who had no major violations, had been administered five treatment injections in the induction period and at least 85 % of the maintenance period injections, and had completed the study. The ITT group included all randomized patients who had taken at least one treatment injection, using the last observation carried forward (LOCF) for the assessment of depression severity.
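For readers unfamiliar with LOCF, the imputation rule can be sketched in a few lines. The function name and the example scores are hypothetical and are not data from the trial.

```python
def locf(scores):
    """Last observation carried forward.

    scores -- ratings in visit order, with None marking a missed visit.
    Each missing value is replaced by the most recent observed value;
    missing values before the first observation stay None.
    """
    filled = []
    last_seen = None
    for score in scores:
        if score is not None:
            last_seen = score
        filled.append(last_seen)
    return filled


# Hypothetical rating-scale series for a patient who dropped out after the fourth visit
example = locf([31, 24, None, 19, None, None])
```

In this hypothetical series the last observed score (19) is carried forward to the remaining visits, so the patient contributes an endpoint value to the ITT analysis despite dropping out.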
We tested the normality of the distributions of outcome variables with the Shapiro-Wilk W test, and the homogeneity of variances for group effects with the Levene test. Most biomarkers were not normally distributed at baseline, at day 5, or in their delta variation (Supplementary Table 1). In agreement with the statistical plan of the RCT, non-parametric statistics were then chosen over traditional parametric ANCOVA for analyzing treatment effects.
To account for the non-normal distribution of inflammatory biomarkers, the multiple covarying variables, and the a priori expected significant interaction with several independent factors (age, sex, diagnosis), we tested the effect of predictors on outcomes by combining non-parametric Generalized Linear Model (GLZM) analyses of variance. We added linear machine-learning (ML) multiple regression techniques to perform feature reduction and prediction of effects. This robust approach has been shown to successfully capture the complex relationship between immunological variables and clinical phenotypes and outcomes in the field of mood disorders (Benedetti et al., 2021; Benedetti and Vai, 2023; Mazza et al., 2021; Poletti et al., 2021).
Thus, for explanatory purposes the effects of treatment, sex, and diagnosis, and their expected interaction in influencing outcomes, were estimated on changes in biomarkers, and on changes in depression severity (delta values baseline-after treatment), by entering independent variables into a GLZM analysis of homogeneity of slopes with an identity link function (McCullagh and Nelder, 1989). Parameter estimates were obtained with iterative re-weighted least squares maximum likelihood procedures. The significance of the effects was calculated with the likelihood ratio (LR) statistic, by performing sequential tests for the effects in the model of the factors on the dependent variable, at each step adding an additional effect into the model contributing to the incremental χ² statistic, thus providing a test of the increment in the log-likelihood attributable to each current estimated effect; or with the Wald W² test as appropriate (Agresti, 1996; Dobson, 1990), which is valid for testing treatment effects in RCTs (Kim et al., 2021; Li et al., 2021). To assess the magnitude of the observed effect, we also calculated the effect size of the difference (partial η²) between Aldesleukin and Placebo in delta changes of rating scales at endpoint, considering age, sex, diagnosis, and baseline severity as covariates, and following standard interpretation guidelines (partial η² ≥ 0.01 for a small effect; partial η² ≥ 0.06 for a medium effect; partial η² ≥ 0.14 for a large effect) (Cohen, 2013).
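To make the reported statistics easier to read, the sketch below shows how a likelihood ratio χ² or Wald W² value maps to a p value, and how a partial η² maps to the interpretation bands quoted above. It assumes one degree of freedom for the tested effect (as for a two-level treatment factor); that assumption, the function names, and the "negligible" label for values below 0.01 are introduced here for illustration only.

```python
from math import erfc, sqrt


def chi2_1df_pvalue(statistic):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom.

    For 1 df the chi-square variable is the square of a standard normal,
    so the survival function reduces to erfc(sqrt(x / 2)).
    """
    return erfc(sqrt(statistic / 2.0))


def partial_eta2_label(value):
    """Map a partial eta squared to the bands quoted in the text
    (>= 0.01 small, >= 0.06 medium, >= 0.14 large)."""
    if value >= 0.14:
        return "large"
    if value >= 0.06:
        return "medium"
    if value >= 0.01:
        return "small"
    return "negligible"  # hypothetical label for values below the smallest band


p_example = chi2_1df_pvalue(4.603)       # statistic taken from the Treg result in the text
label_example = partial_eta2_label(0.09)
```

With one degree of freedom, a statistic of 4.603 corresponds to a p value of about 0.032, in line with the value reported for the day0-day5 Treg effect.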
To assess global response to treatment and perform a feature reduction by selecting the factors of interest in predicting it, we also used partial least squares regression (PLS), a ML technique that models the relationships between sets of observed variables with latent variables, to define a linear regression model by projecting the predicted variables and the observable variables to a new space. Changes in rating scales for depression (IDS-SR, MADRS, HDRS) from baseline to day 36 (end of treatment) and day 60 (end of study) were entered as dependent variables, and clinical (treatment, diagnosis, sex) and biological (delta cell percentages and levels of analytes) variables were entered in the model as predictors. Accuracy and significance of the predictive value of the model were assessed by using the Nonlinear Iterative Partial Least Squares (NIPALS) algorithm (Wold, 1966) and optimizing by cross-validation the number of PLS components to extract (A), then calculating R²X (an A-dimensional vector, to record the explained variance of the data matrix of predictors by each PLS component), R²Y (an A-dimensional vector, to record the explained variance of response variables by each PLS component), Q² (predicted variation, to measure R²Y applied to a test set with a cross-validation procedure, in order to assess the predictive relevance of the endogenous constructs); and for each variable predictive weights (w), and the variable importance in projection (VIP) values, to estimate the contribution of each variable to the explanation of Y variance and the direction of effect. K-fold cross-validation was performed by randomly splitting the data into k folds of roughly equal size, in order to estimate the error rate of the predictive algorithm by fitting the model on k-1 folds and then assessing the performance on the remaining fold. Significance of the model and of variable contributions were defined by Q² > 0 and VIP > 1 (Akarachantachote et al., 2014; Chong and Jun, 2005; Hill and Lewicki, 2006; Palermo et al., 2009). Given the preliminary nature of the study, a cutoff value at VIP > 0.8 for mining variables potentially contributing to prediction was however considered (SAS Institute, 2017).
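The cross-validation logic behind Q² can be sketched as follows. This is not the PLS/NIPALS procedure used in the study: a single-predictor least-squares line stands in for the PLS model so that the example stays self-contained, and all function names and data are hypothetical. The sketch computes Q² as 1 minus the ratio of the cross-validated prediction error (PRESS) to the total sum of squares.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Randomly split indices 0..n-1 into k folds of roughly equal size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(x, y):
    """Ordinary least-squares line; a simple stand-in for the PLS model."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return slope, mean_y - slope * mean_x


def cross_validated_q2(x, y, k=7, seed=0):
    """Q2 = 1 - PRESS / TSS, with each prediction made by a model
    fitted on the k-1 folds that exclude the predicted observation."""
    predictions = [0.0] * len(y)
    for fold in k_fold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in fold:
            predictions[i] = slope * x[i] + intercept
    mean_y = sum(y) / len(y)
    press = sum((obs - pred) ** 2 for obs, pred in zip(y, predictions))
    tss = sum((obs - mean_y) ** 2 for obs in y)
    return 1.0 - press / tss


# Hypothetical, perfectly linear data: held-out predictions are exact
x_demo = [float(i) for i in range(28)]
y_demo = [2.0 * v + 1.0 for v in x_demo]
q2_demo = cross_validated_q2(x_demo, y_demo)
```

A Q² above zero indicates that the model predicts held-out observations better than the overall mean does, which is the sense of the Q² > 0 criterion stated above.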

Results
Clinical and demographic characteristics of the participants are summarized in Table 1. Between January 2020 and April 2023, 108 patients were screened for eligibility, 36 were enrolled, and 28 patients concluded the study, thus allowing assessment of the primary endpoint (Fig. 1). All patients met criteria for TRD (Sforzini et al., 2022) except 3 patients with BD, 2 treated with placebo and 1 with aldesleukin.
All patients completed the induction phase. Treatment was generally well tolerated and no serious adverse reactions (SARs) nor serious adverse events (SAEs) were observed. Only transient and mild events were observed, the most frequent being injection site reactions, with two patients showing a mild allergic reaction with a rapid resolution without treatment, and 1 patient with a mild allergic reaction which resolved with antihistamine treatment (thus excluded from the study). Drop-outs occurred because of the direct effect of COVID restrictions (lockdown) in 4 patients, with 2 additional patients lost because they were unwilling to come to the hospital during the pandemic. One patient required a change in antidepressant drug treatment. All drop-outs except one occurred before the end of the treatment phase. Patients who completed the trial did not significantly differ in administered combined antidepressant drugs, which had been stable for a mean of 11.7 ± 27.7 weeks before starting the experimental treatment (Placebo: 9.2 ± 18.2; Aldesleukin: 12.4 ± 30.7; t = 0.30, p = 0.76).

Biological outcomes
Treatment influenced the percentage of immune cells (Table 2). Patterns of change of Treg and CD4+ and percentage of CD8+ Naïve cells did not follow parallel slopes of time course in the two treatment groups (Fig. 2).
In the 28 participants who completed the trial (PP group), treatment with aldesleukin, but not with placebo, caused a significant expansion of Treg cells at the end of the induction phase (day 5), as confirmed by a GLZM homogeneity of slopes analysis showing a significant effect of treatment on delta day0-day5 Treg cells (LR χ² = 4.603, p = 0.0320) with no main effects of, nor interactions with, age, sex, and diagnosis. The effect was no longer apparent at the end of the follow-up (day 60), when cell frequencies no longer differed significantly from baseline levels.
In the ITT group (n = 36) at day 5, we observed a significant interaction between treatment and diagnosis (higher effect of aldesleukin in increasing Treg cell percentages in patients with BD, W² = 4.127, p = 0.0422) and a Treatment x Group x Age interaction (W² = 4.865, p = 0.0274): age was positively associated with an increase in Treg cell percentages (W² = 6.841, p = 0.0090) after placebo, but not after Aldesleukin, which, in turn, caused a higher increase independent of age (Supplementary Fig. 1).
In the PP group, CD4+ Naïve cells increased with aldesleukin in MDD patients but not in BD patients, while they decreased in placebo-treated patients, yielding a significant effect of treatment in the whole group (LR χ² = 6.241, p = 0.0125) on day0-day5 cell percentages, which was not maintained over time. Similar effects were observed for CD8+ Naïve cells, which were higher in MDD at baseline, decreased in placebo-treated patients, but remained substantially stable in aldesleukin-treated patients, again showing a significant treatment effect on day0-day5 changes (LR χ² = 11.115, p = 0.0009).
With an opposite trend (Supplementary Fig. 2), central memory CD4+CD27+ CM cell percentages showed a decrease with aldesleukin and an increase with placebo after the induction phase in MDD, but not in BD patients, who showed an opposite trend persisting at Day 60. This yielded a significant Treatment x Diagnosis interaction (LR χ² = 5.273, p = 0.0217), with no effect of treatment alone in the whole sample. Similar effects were observed for CD8+CD27+ CM cells (treatment effect in the whole group: W² = 4.100, p = 0.0430). These effects led to an increase in the CD4+ Naïve/CM ratio (W² = 4.208, p = 0.0402), and to a nominal, but non-significant, increase in the CD8+ Naïve/CM ratio.
Changes of Th1, Th2, and Th17 cell percentages in the PP group are shown in Fig. 3. The percentage of Th17 cells showed no significant changes in the studied groups, with a trend to increase in all groups except MDD patients treated with placebo. CD4+IFNγ+ cells (Th1) decreased in MDD, and increased or remained stable in BD during the induction phase, with a significant effect of diagnosis on delta day0-day5 (LR χ² = 5.277, p = 0.0216) and no effect of treatment. CD4+IL-4+ cells (Th2) increased during the induction phase in patients treated with aldesleukin, but not with placebo, with a significant effect of treatment (LR χ² = 7.963, p = 0.0048) and not of diagnosis.
Finally, levels of sCD25 (sIL-2Rα) significantly increased during the induction phase in MDD patients treated with aldesleukin, and in BD patients irrespective of treatment options, yielding a significant effect of treatment in the whole sample (LR χ² = 11.115, p = 0.0009). CRP showed a significantly higher increase after aldesleukin during the induction phase (LR χ² = 9.749, p = 0.0018). Aldesleukin did not significantly affect levels of BDNF and IL-7 (Table 3, Supplementary Fig. 3). Baseline levels of IL-6 were below the sensitivity limits in 17/28 participants, and in 4/28 at the end of the induction phase, with a significant increase (Wilcoxon Z = 2.915, p = 0.0036) independent of treatment.

Antidepressant efficacy of low dose IL-2 treatment
Aldesleukin add-on treatment was followed by a significantly greater improvement in depression severity than placebo in the PP group. Inspection of patterns of change in severity of depression in the PP group (Table 4 and Fig. 4), and of improvement (delta scores) from baseline to the end of treatment (Day 36) and to the end of follow-up (Day 60) (Fig. 5), shows erratic changes during the first weeks, and then an advantage for low-dose IL-2 over placebo at Day 36 and 60.
A PLS linear regression in completers (PP group, n = 28) with treatment, diagnosis, sex, and age as predictors of the global improvement at Day 60 defined a significant model predicting the efficacy of treatment when considering all the rated dimensions of depression (IDS-SR, MADRS, HDRS), with one significant component (coefficient = 0.562) explaining 19 % of variance in improvement: R²X = 0.300; R²Y = 0.185; Q² = 0.031. The variables selected as relevant to predict outcomes were treatment (aldesleukin vs placebo) (VIP = 1.37) and age (VIP = 1.61), while sex (VIP = 0.47) and diagnosis (VIP = 0.29) were dismissed as not relevant.
Using an explanatory approach on single dimensions of improvement in the PP group, separate GLZM homogeneity of slopes ANOVAs showed a significant effect of treatment both at day 36 and at day 60, and on all rated dimensions of depression severity. At day 36, treatment significantly affected changes in MADRS (W² = 4.723, p = 0.0298), HDRS (W² = 8.365, p = 0.0038), and IDS-SR (W² = 6.367, p = 0.0116) scores (better effect for aldesleukin; Fig. 5, top) when considering together MDD and BD patients, with a significant interaction with Diagnosis at HDRS (better effect in MDD than in BD, W² = 7.002, p = 0.0081). At day 60, GLZM homogeneity of slopes again detected: a significant main effect of treatment on MADRS (W² = 6.571, p = 0.0104), HDRS (W² = 9.359, p = 0.0022), and IDS-SR (W² = 8.094, p = 0.0044) scores when considering together MDD and BD patients; a significant main effect of diagnosis (lower final scores in MDD, irrespective of treatment) at MADRS (W² = 5.016, p = 0.0251) and HDRS (W² = 7.699, p = 0.0055); and no significant Treatment x Diagnosis interaction, meaning (in agreement with PLS results) that the benefit from aldesleukin versus placebo was not significantly different in the two diagnostic categories (Fig. 5, bottom). An analysis of the effect size of the Aldesleukin-Placebo difference in changes of severity of depression showed medium effects for all variables at day 60, considering age, sex, diagnosis, and baseline severity as covariates (MADRS partial η² = 0.09; HDRS partial η² = 0.06; IDS-SR partial η² = 0.11).
Better improvement with aldesleukin than placebo was confirmed in the ITT group (n = 36), where the same PLS regression analysis used in the PP group produced similar results, with one component (R²X = 0.299; R²Y = 0.116; Q² = -0.057) including, as relevant variables, only treatment (VIP = 1.94) and age (VIP = 1.43). Explanatory GLZM analyses again showed a significant main effect of treatment on changes of depression severity on all rating scales at day 60, when considering together MDD and BD patients: MADRS (W² = 4.502, p = 0.0338), HDRS (W² = 11.822, p = 0.0006), and IDS-SR (W² = 9.292, p = 0.0023) scores. However, in the ITT group we observed significant Treatment x Diagnosis interactions at MADRS and HDRS (W² = 5.198, p = 0.0226; W² = 8.454, p = 0.0110, respectively), and Treatment x Age interactions (W² = 4.304, p = 0.0380; W² = 8.722, p = 0.0031) (Fig. 6): while the improvement at MADRS was superior for aldesleukin in both MDD and BD patients, but significantly better in MDD than BD, the improvement at HDRS was superior for aldesleukin in MDD, but not for BD patients.
In summary, at day 60 aldesleukin was superior to placebo on all rating scales, both in the PP and in the ITT samples, when considering together patients with MDD and patients with BD. When separating them in the two diagnostic categories, (i) in the 28 completers (PP sample) better effects of aldesleukin on all rating scales were observed, without significant differences between MDD and BD (Fig. 5); and (ii) in the ITT sample, better effects of aldesleukin, without significant differences between MDD and BD, were observed at IDS-SR; but better effects of aldesleukin in MDD than in BD were observed at MADRS and HDRS, the latter improving even less than placebo in BD (Fig. 6).

Biological predictors of antidepressant efficacy
Effects of treatment on immune cell frequencies during the induction phase predicted the subsequent improvement in severity of depression.
Post-hoc, separate GLZM homogeneity-of-slopes ANOVAs performed on single outcomes (delta improvement in severity at rating scales) at the two time points, using the features selected as relevant by the linear PLS regression (delta changes in biomarkers during the induction phase), confirmed the statistical significance of delta CD4+ Tregs and delta CD4+ Naïve cell percentages in predicting outcomes, and yielded significant differential effects of the other biomarkers at different times and outcomes (Table 5).

Discussion
We observed significant effects of treatment with low-dose IL-2 vs placebo in expanding the population of Treg, Th2, and Naive CD4+/CD8+ immune cells, and in potentiating treatment efficacy when added on to ongoing antidepressant drugs. Changes in relative cell percentages were rapidly induced in the first five days of treatment, and predicted the later improvement of depression severity.
The association of immune changes with the clinical improvement of depression in the weeks following the induction phase, persisting in the month after the end of treatment, points to complex relationships between the immune system and the clinical phenotype of mood disorders. Indeed, a multifaceted role of the immune system, and in particular of T cell populations, is emerging, highlighting the importance of the immune system in plasticity, learning, and the maintenance of brain homeostasis, but also the existence of distinct immunophenotypes of inflammation in depression which could determine the outcome of immune-targeted treatments (Felger and Miller, 2020). The pattern of variation of self- and observer-rated scales for depression showed a consistent trend toward amelioration in the first month of treatment, followed by a trend to relapse in the second month, consistent with current consensus about clinical outcomes of TRD, where a very high relapse rate is expected while continuing treatment even if it was apparently effective at the beginning (Sforzini et al., 2022). Counteracting this trend, which was clearly evident in placebo-treated patients, the main effect of IL-2 was to enhance response and prevent the subsequent relapse, thus promoting and consolidating improvement over time.
The observed effect size for the difference between Aldesleukin and Placebo at day 60 was medium, thus in line with the best effects observed with the most used antidepressant drugs (Cipriani et al., 2018). When separating the evaluation of the effects of aldesleukin versus placebo in the PP sample and in the ITT sample and in the two diagnostic groups (MDD and BD), the superiority of aldesleukin versus placebo was comparable in MDD and BD at all rating scales in the 28 completers (PP sample); in the ITT sample, it was evident at all rating scales in MDD, but only at IDS-SR and MADRS (and not at HDRS) in BD. This discrepancy between MADRS and HDRS has already emerged in RCTs, and has been discussed in independent evaluations as due to the unidimensional structure of MADRS, which is more focused on the core of depressive psychopathology, more sensitive to its change during treatment, and thus considered the gold standard clinician rating scale for depression (Carmody et al., 2006; Jauhar and Morrison, 2019). Together with the specific reason for most dropouts (COVID-19 lockdowns at the beginning of the treatment, with very long LOCF intervals in the ITT sample), the psychometric superiority of MADRS could explain these effects.
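To illustrate why long LOCF intervals matter for the ITT results discussed above, here is a minimal sketch of last-observation-carried-forward imputation. All scores and the visit schedule are hypothetical illustrations, not trial data, and the function `locf` is an assumed helper rather than the procedure actually used by the authors.

```python
from typing import List, Optional


def locf(scores: List[Optional[float]]) -> List[Optional[float]]:
    """Fill each missing visit with the most recent observed score."""
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for score in scores:
        if score is not None:
            last = score  # remember the latest observed value
        filled.append(last)  # stays None if nothing has been observed yet
    return filled


# Hypothetical severity scores at four visits; None marks a missed visit.
completer = [30.0, 26.0, 14.0, 12.0]
early_dropout = [31.0, 29.0, None, None]

for label, series in (("completer", completer), ("early dropout", early_dropout)):
    imputed = locf(series)
    change = imputed[0] - imputed[-1]  # baseline minus final (imputed) score
    print(f"{label}: imputed = {imputed}, baseline-to-end change = {change}")
```

In this toy example the early dropout's final score is simply the last early observation, so the imputed baseline-to-end change reflects only the first days of treatment. When many dropouts occur early, the ITT estimate at the final visit is therefore pulled toward early values.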
Several non-mutually exclusive biological mechanisms could contribute to the observed antidepressant effect of IL-2. IL-2 is a T cell growth factor, which at low doses can particularly increase the percentages of Tregs, an effect repeatedly reported in human trials in immune diseases and associated with clinical benefits (see Introduction). Here this effect predicted the antidepressant potentiation. Moreover, IL-2 increased the percentage of Th2 cells, which support the intrinsic anti-inflammatory properties of the brain (Gimsa et al., 2001), and again this effect predicted antidepressant improvement. In a previous report we found Treg frequencies inversely associated with the pro-inflammatory state of monocytes (Grosse et al., 2016), and it is generally assumed that high Treg cell frequencies correlate with reduced inflammation. However, the possible anti-inflammatory correlate of this effect was not revealed by CRP and IL-6, which have been considered promising baseline markers of inflammation associated with major depression (Arteaga-Henriquez et al., 2019; Pitharouli et al., 2021). Counterintuitively, these markers of low-grade inflammation increased with low-dose IL-2 treatment and also predicted its efficacy. Interestingly, such an effect has previously been observed after treatment with SSRI/SNRI antidepressant drugs, proportional to the reduction in depression severity (Carboni et al., 2019; Hannestad et al., 2011). A transient increase in CRP levels has been previously reported in patients undergoing treatment with IL-2 for cancer therapy (Broom et al., 1992; Porter et al., 2009; Rosenzweig et al., 1990), especially in responders. A possible explanation for this effect comes from the ability of IL-2 to cause the release of pro-inflammatory cytokines, including TNF-α, IL-1β, and IL-6 (Heaton et al., 1993); IL-6, in particular, is known to be associated with CRP levels (Felger et al., 2020). This effect could be characteristic of IL-2 because of its immunomodulatory role, and may not be present in patients treated with drugs with an immunosuppressant function such as infliximab, although they seem to exert similar effects in rebalancing T cell populations (Bekhbat et al., 2022).
Due to its effect on Treg cells, low-dose IL-2 treatment is also expected to decrease the Th17/Treg ratio, which is higher in depression, and has been proposed as a hallmark of severity and suicidality in MDD, and as a target for antidepressant treatment (Cui et al., 2021; Schiweck et al., 2022). In our study, the percentage of Th17 cells tended to increase on average with low-dose IL-2 treatment, but within this average pattern of change it was a relative negative trend (a decrease or a smaller increase) that predicted better final antidepressant improvement. A similar pattern was observed in patients with psoriatic arthritis, with low-dose IL-2 significantly increasing Th17 and Treg, but with Treg rising more rapidly, thus re-balancing the Th17 and Treg proportions (Wang et al., 2020). Th17 cells might indeed have complex interactions with the depressed brain. This is also illustrated by the reports that Th17 cells are essential for hippocampal neuroplasticity (Niebling et al., 2014), and that high percentages of Th17 cells correlated significantly with better integrity of brain white matter (WM) in BD patients and healthy controls (Poletti et al., 2017). Further research is needed to clarify their role in mood disorders.
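The re-balancing argument above rests on simple arithmetic: if both Th17 and Treg percentages rise but Treg rises proportionally more, the Th17/Treg ratio falls. The sketch below works through this with purely hypothetical percentages chosen for illustration; they are not values from this trial or from Wang et al. (2020), and `th17_treg_ratio` is an assumed helper name.

```python
def th17_treg_ratio(th17_pct: float, treg_pct: float) -> float:
    """Ratio of Th17 to Treg percentages (both as % of CD4+ cells)."""
    if treg_pct <= 0:
        raise ValueError("Treg percentage must be positive")
    return th17_pct / treg_pct


# Hypothetical percentages before and after treatment (illustration only).
th17_before, treg_before = 2.0, 5.0
th17_after, treg_after = 2.4, 7.5  # Th17 rises by 20 %, Treg rises by 50 %

ratio_before = th17_treg_ratio(th17_before, treg_before)
ratio_after = th17_treg_ratio(th17_after, treg_after)

print(f"Th17/Treg before: {ratio_before:.2f}")
print(f"Th17/Treg after:  {ratio_after:.2f}")
print("Ratio decreased" if ratio_after < ratio_before else "Ratio did not decrease")
```

The ratio depends only on the relative growth of the two populations, which is why an average increase in Th17 is compatible with a lower Th17/Treg ratio.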
IL-2 is also known to increase thymic production of naïve CD4+ T cells (Carcelain et al., 2003), and it indeed increased the relative percentage of CD4+ and CD8+ Naïve cells in this trial, both effects predicting antidepressant improvement. A reduction of naïve T cells and an expansion of memory and senescent T cells are core characteristics of the immunological imbalance associated with MDD, and are associated with abnormally activated monocytic/macrophagic innate immunity setpoints (Bekhbat et al., 2022; Simon et al., 2021b). The imbalance between lower naïve and higher memory T cells was reported to be proportional to the severity of MDD, also characterizing suicide risk (Schiweck et al., 2020). Accordingly, here we showed that MDD patients had an increase in the percentage of naïve T cells and a decrease in the percentage of memory T cells, suggesting that IL-2 treatment may promote a general rebalancing of innate/adaptive immunity in MDD. The lack of such a finding in BD may be explained by the different immune profile, which is characterized by normal or even raised levels of T-helper cells (Becking et al., 2015).
Interestingly, IL-2 may also act as a trophic factor on both neurons and oligodendrocytes (de Araujo et al., 2009). Studies on neurons and astrocytes consistently showed that IL-2 promoted survival and neurite extension of cultured cortical, hippocampal, septal, striatal, and cerebellar neurons (Hanisch and Quirion, 1995). In vivo animal models of neuroinflammatory-neurodegenerative disorders showed induction of astrocytic activation and improved synaptic plasticity and spine density in the hippocampus (Alves et al., 2016), also promoting hippocampal neurogenesis (Liu et al., 2014). Reduced hippocampal volumes are one of the most robust findings in brain imaging studies of depressed patients (Schmaal et al., 2016), and are also among the few consistent brain structural predictors of poor treatment response (Enneking et al., 2020). We recently showed that lower hippocampal volume predicted worse response to SSRIs and SNRIs administered upon clinical need in a hospital setting (Paolini et al., 2023a), and that it partially mediated the detrimental effect of peripheral low-grade inflammation on antidepressant response (Paolini et al., 2023b). It can then be surmised that the neurotrophic effect of IL-2 could contribute to correcting these effects.
Mood disorders are associated with signs of disrupted brain WM integrity, which are influenced by inflammatory markers, and correlate with severity and outcome of the disease (Benedetti et al., 2016; Benedetti et al., 2011; Favre et al., 2019; Van Velzen et al., 2020). Human oligodendrocytes express the IL-2 receptor, and while in vitro at higher concentration a combined IL-1/IL-2 administration hampered oligodendrocyte progenitor cell proliferation (Saneto et al., 1986), at low concentration IL-2 alone markedly stimulated the proliferation of normal human oligodendrocytes (Otero and Merrill, 1997). IL-2 could also promote myelin regeneration by stimulating Treg cells, which promoted oligodendrocyte progenitor cell differentiation and myelination in vitro (Dombrowski et al., 2017). We also showed that in vivo the Th17 and Treg relative frequencies play key roles in myelin maintenance and regeneration, with higher Th17 associated with higher fractional anisotropy in the core WM skeleton of the brain (Poletti et al., 2017). WM integrity predicts antidepressant response both in MDD and in BD (Bollettini et al., 2015; Gerlach et al., 2022). Rapid antidepressant response is associated with a rapid improvement in WM integrity (Melloni et al., 2020). Low-dose IL-2 expanded both Treg and Th17 in the present trial: it can be surmised that the combined direct and indirect effects of IL-2 on WM integrity could contribute to correcting the abnormalities hampering antidepressant response.
The trophic effects of IL-2 on brain cells are likely to be mediated by the IL-2 activation of PI3 kinase (PI3K) activity, an upstream regulator of glycogen synthase kinase 3-β (GSK-3β), acting via Akt to phosphorylate Ser9 and leading to inactivation of GSK-3β (Braunstein et al., 2008). GSK-3β is involved in the control of gene expression, cell behavior, cell adhesion, neuronal polarity, and in the regulation of neurodevelopment, neuronal plasticity and cell survival (Grimes and Jope, 2001). Increased GSK-3β activity was reported in post mortem brain tissue of depressed suicides, and GSK-3β inhibition is a unique common feature of the mood stabilizers lithium and valproate, serotonergic antidepressants, and some antipsychotics of proven efficacy in BD (Beaulieu et al., 2009; Li and Jope, 2010). In turn, in animal models the GSK-3β inhibitor lithium prolonged T cell proliferation and increased IL-2 production (Ohteki et al., 2000), and the serotonergic antidepressant fluoxetine increased IL-2 (Fazzino et al., 2009), also restoring T cell proliferation and IL-2 levels after chronic stress-driven immune system depression (Frick et al., 2009). It can then be hypothesized that IL-2, also promoted by lithium and antidepressant treatment, could be synergistic with ongoing treatment in rescuing the impaired neuroplasticity associated with mood disorders (Machado-Vieira et al., 2013) by promoting cortical and hippocampal neuroplasticity as influenced by GSK-3β (Manji et al., 2003), thus paving the way to antidepressant response.
As for the safety of low-dose IL-2 treatment, we observed only mild adverse events, in line with most of the studies using low dosages of IL-2 to treat autoimmune diseases. In contrast to studies using high doses of IL-2, which are associated with severe adverse events, our findings suggest that low dosage is safe and well tolerated.
Strengths of the present study include a focused research question and state-of-the-art methods, but our results must be viewed in light of some limitations. The COVID pandemic occurred during the study, limiting access to the hospital infrastructures and directly increasing attrition. No patient was drug-naive, and the drug treatments administered during the course of the illness and of the current episode could have influenced biological outcomes; in particular, future trials will need patients to be on a stable treatment in order to assess the possible usefulness of IL-2 in TRD. Recruitment was in a single center and in a single ethnic group, thus raising the possibility of population stratification. Further, although cryopreservation is useful to store biological samples for long periods of time, it may have some limitations. Water migration can cause extracellular ice formation and cellular dehydration, and such stresses can damage the cells directly, so that some subpopulations may be selectively lost at different points during the cryopreservation process, including the "thaw" before processing cells for flow cytometry. In our sample we observed on average 80% live cells.
Baseline levels of circulating compounds and/or immune cells were not used to select or stratify patients. There is some consensus that only 25-50% of depressed patients exhibit immune abnormalities, and recent studies showed that treatments acting on the immune system are more effective in patients showing immune abnormalities (Felger and Miller, 2020; Husain et al., 2020; McIntyre et al., 2019).
Including patients without immune abnormalities could have influenced our results, possibly reducing the observed effect; once the antidepressant efficacy of adjunctive aldesleukin is established, it will be of interest to define possible cutoffs for baseline values in order to select subgroups of patients who could most benefit from it. These limitations, however, do not bias the main finding of an effect of low-dose IL-2 in modulating the immune system and in promoting antidepressant response in patients with MDD and BD, thus providing the first RCT evidence supporting the hypothesis that treatment to strengthen the T cell system could be a successful way to correct the immuno-inflammatory abnormalities associated with mood disorders, and potentiate antidepressant response. Interest is warranted for combining multimodal approaches and techniques from computational, molecular and biological psychiatry to deepen the understanding of the immune-brain interaction, disentangle the intertwined signaling pathways which trigger and shape the phenotype of the disorder and its outcome, and define predictors and correlates of response to cluster patients for a precision, personalized treatment approach.

Fig. 2. Changes in CD4+ Treg cell percentages in participants who completed the trial (A, n = 28) or who completed the induction phase (B, n = 36). Changes in CD4+ Naive (C) and CD8+ Naive (D) cell percentages during treatment (frequencies of CD4+ and CD8+). Red = Aldesleukin; Blue = Placebo. Points are means, whiskers are SEM. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 3. Changes in Th17 (A), CD4+ IFNγ+ (B), CD4+ IL-4+ (C), and CD4+ IL-2+ (D) cell percentages during treatment (frequencies of CD4+) in the PP sample. Red = Aldesleukin; Blue = Placebo. Points are means, whiskers are SEM. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 4. Pattern of change of severity of depression as rated at MADRS (top), HDRS (middle), and IDS-SR (bottom) during treatment and follow-up in the PP sample. Red = Aldesleukin; Blue = Placebo. Points are means, whiskers are SEM. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 6. Improvement in severity of depression as rated at IDS-SR (left), MADRS (center), and HDRS (right), from baseline to Day 60, in the ITT sample. Pale grey = Placebo; Dark grey = Aldesleukin. Bars are means, whiskers are SEM.

Fig. 7. Variables predicting the global improvement in depression severity at study end (Day 60) in the PP sample, as rated at MADRS, HDRS, and IDS-SR during treatment, listed according to their relative predictive power at linear regression; and direction of the effect contributing to improvement (increase vs decrease).

Table 1
Clinical and demographic characteristics of the patients, and levels of significance of the differences based on treatment options.Values are mean ± SD.

Table 2
Changes of cell counts (frequency of CD4+) during treatment.

Table 3
Changes of inflammatory biomarkers during treatment (pg/ml).

Table 5
Levels of significance of the effects of biological predictors of improvement (delta changes during the induction phase, Day0-Day5) on the improvement in severity of depression at Day 36 and at Day 60, as measured on MADRS, HDRS, and IDS-SR.