Efficacy and moderators of short-term psychodynamic psychotherapy for depression: A systematic review and meta-analysis of individual participant data

Background

Keywords: Short-term psychodynamic psychotherapy; Individual participant data; Meta-analysis
Methods: PubMed, PsycInfo, Embase, and Cochrane Library were searched on September 1st, 2022, to identify randomized trials comparing STPP to control conditions for adults with depression. IPD were requested and analyzed using mixed-effects models. Results: IPD were obtained from 11 of the 13 (84.6%) studies identified (n = 771/837, 92.1%; mean age = 40.8, SD = 13.3; 79.3% female). STPP resulted in significantly lower depressive symptom levels than control conditions at post-treatment (d = −0.62, 95% CI [−0.76, −0.47], p < .001). At post-treatment, STPP was more efficacious for participants with longer rather than shorter current depressive episode durations. Conclusions: These results support the evidence base of STPP for depression and indicate episode duration as an effect modifier. This moderator finding, however, is observational and requires prospective validation in future large-scale trials.
Affecting >264 million adults globally, depression is one of the most prevalent mental disorders (James et al., 2018). Associated with decreased quality of life (Bromet et al., 2011), loss of workforce (Stewart, Ricci, Chee, Hahn, & Morganstein, 2003), increased mortality (Cuijpers et al., 2014), and elevated health care costs (Greenberg, Fournier, Sisitsky, Pike, & Kessler, 2015), depression ranks as the leading cause of disability worldwide (World Health Organization, 2017). While antidepressant medications are most often used to treat depression, many patients prefer psychotherapy (van Schaik et al., 2004). Next to cognitive behavioural therapy (CBT), short-term psychodynamic psychotherapy (STPP) is a frequently used treatment for depression in clinical practice (Norcross & Rogan, 2013). Conventional meta-analyses have found STPP to be superior to control conditions in reducing depressive symptoms (Abbass et al., 2014; Barber, Muran, McCarthy, & Keefe, 2013; Cuijpers, Karyotaki, de Wit, & Ebert, 2020; Driessen et al., 2015). Although effects were not consistently present in all pairwise comparisons, two network meta-analyses have also reported STPP to be more efficacious than waitlist and care-as-usual control conditions (Barth et al., 2013; Cuijpers et al., 2021). Additionally, moderate to large effects of STPP relative to control conditions have been shown on measures of anxiety, general psychopathology, and quality of life (Driessen et al., 2015). These conventional meta-analyses, however, are limited by their dependence on the quality of study-level information reported in publications, which can lead to an overestimation of treatment effects (Tudur Smith et al., 2016).
Furthermore, there are indications that certain patients may benefit specifically from STPP for their depression, but research is scarce and replications have not yet been conducted (Barber, Barrett, Gallop, Rynn, & Rickels, 2012). A conventional meta-analysis of STPP versus control conditions reported larger effect sizes in the subgroup of studies including patients with diagnosed mood disorders than in the subgroup of studies including patients with elevated depressive symptom scores (Driessen et al., 2015). Moderation analyses alongside conventional meta-analyses, however, are prone to ecological bias, such that the association between study-level characteristics and effect sizes might not be representative of the true relationships in the data at the individual level (Tudur Smith et al., 2016). Thus, it remains largely unclear which patients might benefit specifically from STPP for depression.
Individual participant data (IPD) meta-analysis is an alternative approach for evidence synthesis that gathers and pools participant-level data from all available studies. IPD meta-analyses have several advantages over conventional meta-analyses: data analysis methods can be standardized across studies, rare outcomes can be examined, results of primary studies can be verified, and data that were not reported in the publications can be analyzed. Furthermore, IPD meta-analyses allow for examining potential moderators on the participant-level with increased statistical power due to larger sample sizes (Tudur Smith et al., 2016). Because of these advantages and the resulting increased precision of the effect estimates, IPD meta-analyses are considered the current "gold standard" in evidence synthesis (Stewart & Tierney, 2002).
This IPD meta-analysis examined the efficacy and moderators of STPP versus control conditions for adults with depression. More specifically, STPP and control conditions in randomized controlled trials (RCTs) were compared on measures of depression, anxiety, general psychopathology, interpersonal problems, quality of life, and physical health. Furthermore, several baseline participant characteristics were investigated as potential moderators of depressive symptom outcomes.

Design
This IPD meta-analysis is part of a larger project, the protocol of which was published (Driessen et al., 2018) and registered at the PROSPERO International prospective register of systematic reviews (No. CRD42017056029).

Search strategy
Relevant studies were identified via systematic literature searches in the online databases PubMed, PsycINFO, Embase.com, Web of Science, and Cochrane's Central Register of Controlled Trials. Additionally, databases of grey literature (GLIN) and digital dissertations (ProQuest), and a clinical trial register (ISRCTNR) were searched. The search strings comprised index and free-text terms with synonyms for "Psychodynamic Psychotherapy" and "Depression" (Appendix Table A.1). Additionally, relevant studies were identified via references of STPP efficacy reviews, consultations with psychodynamic researchers, and the METAPSY database of randomized depression psychotherapy trials (https://www.metapsy.org/). These searches were performed on June 17th, 2017. In order to identify recent studies, the METAPSY database was searched from inception to September 1st, 2022. This database is developed through comprehensive literature searches in PubMed, PsycINFO, Embase.com, and Cochrane Library (for the exact search terms see https://osf.io/nv3ea). It has been used in a series of meta-analyses and is updated every four months.

Study selection
Relevant studies were RCTs comparing STPP with a control condition for adults with depression. Studies had to include at least 10 participants and report treatment outcomes on standardized measures. STPP needed to be time-limited a priori, based on psychoanalytic/psychodynamic theories, and delivered verbally. Control conditions comprised non-specific controls, waitlist, low-intensity treatment, pill placebo, and treatment-as-usual. Participants needed to be at least 18 years old with no upper age restriction. Depression was defined as meeting diagnostic criteria for a unipolar mood disorder or scoring above the 'no depression' cut-off on a standardized measure of depression.
Two raters independently applied the eligibility criteria to the study citations. Full-text papers were requested for studies that could not be definitely excluded and were examined by two independent raters. Last, two expert STPP researcher-clinicians independently confirmed that identified studies fulfilled the STPP criteria. Disagreement between raters was resolved by consensus. If consensus could not be reached, a third rater was consulted.

Data collection
Using a multi-step contact protocol (Driessen et al., 2018), anonymized IPD for all outcome and all potential moderator variables assessed in the studies were requested from the authors. If authors could not be reached after following the complete protocol, declined to share their data, or if IPD had not been retained, the study's data were considered unavailable.

Measures
The pre-specified primary outcome was post-treatment depressive symptoms, defined as the study's primary continuous depression measure assessed at the study's primary end point. Other pre-specified outcomes were post-treatment anxiety, quality of life, and interpersonal functioning (Driessen et al., 2018). Additional measures and follow-up outcomes were included if assessed in at least two studies. Outcomes were transformed into individual z-scores within study and time point if different instruments were used to assess them across studies (Appendix Table A.2).
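The within-study, within-time-point standardization described above can be illustrated with a minimal Python sketch. The record layout, the function name `zscore_within_groups`, and the toy scores are hypothetical, and the use of the sample standard deviation (n − 1) is an assumption, since the text does not state which estimator was used.

```python
from collections import defaultdict
from statistics import mean, stdev


def zscore_within_groups(records):
    """Return copies of the records with a 'z' field added.

    Scores are standardized separately for each (study, time_point) group,
    so that different instruments end up on a common scale.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["study"], rec["time_point"])].append(rec["score"])

    # Mean and sample SD per group (assumption: n - 1 denominator).
    # Each group needs at least two distinct scores for the SD to be usable.
    stats = {key: (mean(vals), stdev(vals)) for key, vals in groups.items()}

    result = []
    for rec in records:
        m, sd = stats[(rec["study"], rec["time_point"])]
        result.append({**rec, "z": (rec["score"] - m) / sd})
    return result


# Hypothetical toy data: two studies using instruments with different ranges.
records = [
    {"study": "A", "time_point": "post", "score": 10.0},
    {"study": "A", "time_point": "post", "score": 20.0},
    {"study": "A", "time_point": "post", "score": 30.0},
    {"study": "B", "time_point": "post", "score": 2.0},
    {"study": "B", "time_point": "post", "score": 4.0},
    {"study": "B", "time_point": "post", "score": 6.0},
]
standardized = zscore_within_groups(records)
```

After this step, the lowest, middle, and highest scorer in each toy study receive the same z-scores even though the raw scales differ, which is what makes pooling across instruments possible.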
Variables qualified as potential moderators if they were measured before treatment start and were assessed in at least two studies. Pre-specified moderator categories were sociodemographic (e.g., age), clinical (e.g., previous treatment), and psychological (e.g., attachment style) participant characteristics. Continuous moderators were transformed into z-scores within study and categorical moderators were recoded into similar categories if primary studies used different assessment methods (Appendix Table A.3).

Data integrity
It was checked whether the received IPD matched the data reported in the publications and whether outcome and moderator variables had out-of-range, invalid, or inconsistent scores. Discrepancies were resolved with the original authors, which occurred in five studies.

Risk-of-bias assessment
Using the Cochrane risk-of-bias tool for RCTs (Higgins et al., 2011), two independent raters assessed selection bias and detection bias based on the published articles and attrition bias based on the IPD. If necessary information was not reported in the publications, it was requested from the authors. Performance bias was not rated, as it is considered impossible to blind participants and therapists to treatment in psychotherapy research. Selective reporting bias was considered not applicable, as all outcome measures assessed were requested.

Data analysis
One-stage IPD meta-analyses were conducted using mixed model analyses with a three-level structure (study, participant, time points) and restricted maximum likelihood estimation. The approach described by Twisk et al. (2018, eq. 2c) was adopted to adequately account for baseline differences in outcome measures and because of its favorable properties of handling missing data. The normality of the residual distribution was checked with histograms and between-study heterogeneity was assessed with the I² statistic.
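For readers less familiar with the heterogeneity statistic mentioned above, the following Python sketch shows the conventional definition of I² in terms of Cochran's Q and its degrees of freedom. This is an illustration only: the text does not specify how I² was derived within the one-stage models, and the input values below are hypothetical.

```python
def i_squared(q, df):
    """Conventional I^2 (in percent) from Cochran's Q and its degrees of freedom.

    I^2 = max(0, (Q - df) / Q) * 100, i.e. the share of total variability
    in effect estimates attributed to between-study heterogeneity.
    """
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical values: Q = 25 across 11 studies (df = 10).
example = i_squared(25.0, 10)

# When Q does not exceed its degrees of freedom, I^2 is truncated at zero.
no_heterogeneity = i_squared(8.0, 10)
```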
Treatment outcome models included a main effect for time and a time-by-treatment interaction, with a random intercept for study (to account for clustering of participants within studies), a random intercept for participants (to account for clustering of repeated measures within participants), and fixed slopes. A −2 log-likelihood change evaluation was used to decide whether to include a random slope for the time-by-treatment interaction on study level. A p-value of <0.05 for the time-by-treatment interaction's regression coefficient was considered an indication of a significant treatment effect. Effect sizes of ≤0.32 were considered small, 0.33–0.55 moderate, and ≥0.56 large (Lipsey & Wilson, 1993). Treatment outcome models were conducted in MLwiN (version 3.05).
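As a reading aid, the treatment outcome model described in this paragraph can be written out for a single post-baseline time point. The notation is an assumption introduced here (it is not taken from the paper or from Twisk et al., 2018): t indexes time points, i participants, and j studies, with time coded 0 at baseline and 1 at the later assessment.

```latex
y_{tij} = \beta_0 + \beta_1\,\mathrm{time}_{t}
        + \beta_2\,(\mathrm{time}_{t} \times \mathrm{treatment}_{ij})
        + u_{j} + v_{ij} + \varepsilon_{tij},
\qquad
u_{j} \sim N(0, \sigma_u^2),\quad
v_{ij} \sim N(0, \sigma_v^2),\quad
\varepsilon_{tij} \sim N(0, \sigma_\varepsilon^2)
```

Here u_j is the random intercept for study and v_ij the random intercept for participant. Under this sketch the model contains no treatment main effect, so both conditions share the baseline mean β₀ and β₂ carries the treatment effect at the later time point; with several follow-up assessments, time would presumably be represented by one indicator per time point.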
Moderator models included an additional moderator main effect, time-by-moderator interaction, and time-by-moderator-by-treatment 3-way interaction. A significant 3-way interaction after Bonferroni correction for multiple testing (p < .0025, 20 tests) was considered an indication of a moderator effect. Because 3-way interactions require larger samples and more statistical power to show significance and therefore have a heightened risk of type II errors (Heo & Leon, 2010), moderators with an associated p value of <0.05 were also reported but interpreted with caution. All statistically significant moderators were modeled simultaneously to test whether their effects were independent. Finally, for the purpose of graphical representation, the remaining significant continuous moderator variables were probed with simple slope analyses for low (minus one standard deviation) and high (plus one standard deviation) levels of the moderator in each condition (Aiken, West, & Reno, 1991). To facilitate the graphical representations' interpretability, z-scores were standardized across time points for these analyses. The moderator analyses were conducted using R (version 4.0.3; R Core Team, 2020) and the lme4 package (version 1.1-27.1; Bates, Mächler, Bolker, & Walker, 2015).
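The decision rule and the probing step described above can be sketched in a few lines of Python. Only the alpha level (0.05) and the number of tests (20) come from the text; the function names and the coefficient values are hypothetical. The simple-slope function is also a simplification: it evaluates the treatment contrast at a given moderator level, whereas the analyses described here probed slopes within each condition.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def classify_moderator(p_value, alpha=0.05, n_tests=20):
    """Apply the two-tier rule: corrected threshold first, then nominal alpha."""
    if p_value < bonferroni_threshold(alpha, n_tests):
        return "significant after correction"
    if p_value < alpha:
        return "report with caution"
    return "not significant"


def simple_slope(treatment_effect, interaction, moderator_z):
    """Treatment effect at a moderator level expressed in z-score units."""
    return treatment_effect + interaction * moderator_z


threshold = bonferroni_threshold(0.05, 20)

# Hypothetical coefficients: average treatment effect and 3-way interaction.
effect_low = simple_slope(-0.5, -0.2, -1.0)   # moderator at minus one SD
effect_high = simple_slope(-0.5, -0.2, 1.0)   # moderator at plus one SD
```

With these invented coefficients, the treatment effect is larger in magnitude at the high moderator level than at the low level, which is the pattern a plot of the two probed levels would display.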
Several pre-specified sensitivity analyses were also performed to investigate the robustness of findings: a) risk of bias items, b) STPP characteristics, and c) study design characteristics were added as covariates to the models, and d) analyses were repeated including only studies with low risk of bias scores on all criteria. Additionally, one post-hoc sensitivity analysis was conducted excluding one outlier study (López Rodríguez, López Butrón, Vargas Terrez, & Villamil Salcedo, 2004), whose 95% confidence interval (CI) did not overlap with the pooled treatment effect's 95% CI. Furthermore, post-hoc meta-regression and conventional meta-analysis subgroup analyses were conducted to examine whether post-treatment depression effect sizes varied at study-level as a function of number of STPP sessions, therapy format (individual vs. online), and type of control condition (non-specific, waitlist, low-intensity treatment, pill placebo, treatment-as-usual), using the R meta package (Balduzzi, Rücker, & Schwarzer, 2019).
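The outlier criterion used for the post-hoc sensitivity analysis (a study CI that does not overlap the pooled CI) can be expressed as a small check. In the Python sketch below, the pooled interval is the post-treatment depression estimate reported in this paper, while the study interval and the function name are invented for illustration.

```python
def intervals_overlap(ci_a, ci_b):
    """True if two closed intervals (lower, upper) share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


# Pooled post-treatment depression effect, 95% CI, as reported in this paper.
pooled_ci = (-0.76, -0.47)

# Hypothetical study-level 95% CI, invented for illustration only.
hypothetical_study_ci = (-2.10, -1.20)

is_outlier = not intervals_overlap(hypothetical_study_ci, pooled_ci)
```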
Data-availability bias was investigated by comparing studies for which IPD were and were not available regarding study characteristics and effect sizes using, respectively, SPSS (version 26.0.0.0) and Comprehensive Meta-Analysis (version 3.0). Effect sizes were calculated based on data extracted from publications or, if not reported, were calculated from IPD, and analyzed with a random effects model. Publication bias was investigated by contour-enhanced funnel plots and Egger's test of the intercept using the R meta package (Balduzzi et al., 2019).
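The idea behind Egger's test of the intercept can be shown with a minimal Python sketch of its classic regression formulation: each study's standardized effect (effect divided by its standard error) is regressed on its precision (one divided by the standard error), and an intercept far from zero suggests funnel plot asymmetry. This is not the R meta package's implementation; the function name and data are hypothetical, and the significance test on the intercept is omitted.

```python
from statistics import mean


def egger_intercept(effects, standard_errors):
    """Return (intercept, slope) of the classic Egger regression.

    x = precision (1 / SE), y = standardized effect (effect / SE),
    fitted by ordinary least squares.
    """
    x = [1.0 / se for se in standard_errors]
    y = [d / se for d, se in zip(effects, standard_errors)]
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: identical effects regardless of precision, i.e. a
# perfectly symmetric funnel, so the intercept should be (numerically) zero.
effects = [-0.5, -0.5, -0.5, -0.5]
standard_errors = [0.10, 0.20, 0.25, 0.40]
intercept, slope = egger_intercept(effects, standard_errors)
```

In this symmetric toy case the slope recovers the common effect size, while small-study effects would show up as a non-zero intercept.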
Study characteristics are shown in Table 1. Of the 11 studies for which IPD were obtained, nine (81.8%) investigated individual face-to-face STPP and two (18.2%) online STPP. The majority of studies (81.8%) included participants meeting DSM-IV or ICD-10 criteria for a unipolar mood disorder, although two studies (18.2%) included participants with elevated depressive symptom scores. While nine studies (81.8%) investigated depressed adults in general, one study (9.1%) researched women with post-partum depression and one (9.1%) investigated women with breast cancer and depression. The studies included 20 to 157 participants and STPP consisted of 7.4 to 20 sessions. Nine studies (81.8%) conducted follow-up assessments, ranging from 5.5 months to 2 years.

Bias assessments
The risk of bias assessment is presented in Table 2. While all studies applied adequate random sequence generation, one study (9.1%) did not employ adequate allocation concealment procedures, four studies (36.4%) did not blind outcome assessors to treatment condition, and three studies (27.3%) did not retain the complete intention-to-treat data.
Five studies (45.5%) were rated as low risk of bias on all criteria assessed.

Treatment outcomes
Results of all treatment outcome analyses are summarized in Table 3. Adding the risk of bias items, STPP characteristics, and study design characteristics as covariates to the models did not change the pattern of results (Appendix Table A.8). However, when repeating the analyses in low risk of bias studies only, STPP was no longer more efficacious than control conditions on follow-up measures of depression (p = .602). Post-treatment depression effect sizes did not vary at study-level as a function of number of STPP sessions (β = −0.02, 95% CI [−0.10, 0.06], p = .667), therapy format (Q = 0.00, df = 1, p = .972), or type of control condition (Q = 0.24, df = 4, p = .994; Appendix Table A.9).

Moderators
Table 4 shows the STPP versus control condition effect sizes on depression outcomes across the different moderator levels. Length of the current depressive episode was found to moderate post-treatment depression levels, such that STPP was more efficacious for participants reporting longer rather than shorter episode durations (d = −0.006, 95% CI [−0.01, −0.001], p = .002). Furthermore, age of depression onset moderated treatment effects, such that STPP was more efficacious relative to control conditions for participants with younger rather than older ages of depression onset at post-treatment (d = 0.03, 95% CI [0.01, 0.05], p = .013) and follow-up (d = 0.03, 95% CI [0.003, 0.06], p = .030).
When the moderators were modeled simultaneously (Appendix Table A.10), only length of current depressive episode remained a significant moderator of post-treatment outcomes (d = −0.006, 95% CI [−0.01, −0.001], p = .013). Probing this finding revealed that while participants with shorter episode durations showed similar decreases in depression severity in the two conditions (Fig. 1, Panel A), participants with longer episode durations showed larger decreases in depression severity in STPP compared to the control condition (Fig. 1, Panel B). None of the sensitivity analyses changed the moderator findings (Appendix Table A.11).

Discussion
This systematic review and IPD meta-analysis examined the efficacy of STPP for adults with depression compared to control conditions and investigated moderators of treatment effects. STPP was more efficacious than control conditions on post-treatment measures of depression, anxiety, general psychopathology, and quality of life, as well as on follow-up measures of depression. Episode duration moderated depression treatment effects, such that STPP was more efficacious for participants with longer depressive episodes.
Previous conventional meta-analyses also found STPP superior to control conditions on post-treatment measures of depression, anxiety, general psychopathology, and quality of life (Abbass et al., 2014; Barber et al., 2013; Cuijpers et al., 2020; Driessen et al., 2015). Effect sizes in the current study were smaller than some of those reported in prior conventional meta-analyses for post-treatment measures of anxiety (d = 0.29 in the current study vs. d = 0.48 in Driessen et al., 2015) and general psychopathology (d = 0.38 in the current study vs. d = 0.48 in Driessen et al., 2015). These discrepancies might be explained by the current study working with IPD. This allowed for conducting intention-to-treat analyses for a larger proportion of trials, which have been shown to produce more conservative effect size estimates compared to per-protocol analyses (Tudur Smith et al., 2016). Additionally, the current study included a smaller proportion of studies with waitlist conditions than previous meta-analyses (Abbass et al., 2014; Driessen et al., 2015), which have been associated with increased treatment effects relative to care-as-usual controls (Cuijpers et al., 2013). For these reasons, the effects reported in this study, albeit sometimes smaller, might be considered more valid estimates of the efficacy of STPP for depression.
Note (Table 4): For categorical moderators, significance indicates differential treatment efficacy between the moderator levels. For continuous moderators, significance of the "Per … increase" indicates the added effect of each unit increase in baseline values, while "Average" reflects the treatment effect for participants who score at the average of the study sample.
The superiority of STPP on depressive symptom measures at follow-up was not replicated in low risk of bias studies, nor were follow-up effects found on any of the other outcome measures. These results are in line with a previous meta-analysis that did not find STPP more efficacious in reducing depressive symptoms at follow-up compared to control conditions (Abbass et al., 2014). Null findings may be explained by differences in follow-up lengths of primary studies, which potentially confound effect sizes if treatment effects change or deteriorate as a function of time passed (Cuijpers et al., 2013). Alternatively, the inability to control for additional treatment in the follow-up period might have diminished treatment effects.
Moderator analyses revealed that STPP was particularly efficacious relative to control conditions for participants with longer episode durations. These findings are in line with another IPD meta-analysis, which found that adding STPP to antidepressants was more efficacious for participants with longer episode durations (Driessen et al., 2022). Episode duration has also been observed to moderate the effect of antidepressants combined with STPP versus antidepressants combined with CBT (Driessen et al., 2016), such that combined treatment with STPP was more efficacious for participants with episode durations ≥1 year. It has been speculated, in this regard, that individuals with longer episode durations have depressive symptoms that are more influenced by their personality structure, resulting in more complex working alliances and transference feelings; psychodynamic therapists are trained to elaborate on these therapeutic relational aspects if necessary (Driessen et al., 2016; Driessen et al., 2022). However, the strength of evidence for episode duration as a moderator is limited by the p value exceeding the Bonferroni correction. At post-treatment and follow-up, STPP was also found particularly efficacious for individuals with younger age of onset. However, the moderation effect of age of onset appeared to be largely accounted for by episode duration. Future studies will need to determine whether this moderation finding is specific to STPP.

Strengths and limitations
This study has two major strengths. First, IPD allowed for conducting intention-to-treat analyses for most studies, standardizing data analysis, appropriately adjusting for baseline differences in all studies, and including a trial that was excluded from a previous meta-analysis because effect size data were not reported in the publication (Driessen et al., 2015). For these reasons, the current treatment effect estimates might be more reliable than those reported in past conventional meta-analyses. Second, IPD allowed for studying moderators on the participant-level with increased statistical power. To the authors' knowledge, this study is the first to investigate moderators across trials comparing STPP for depression to control conditions.
A number of limitations of this study have to be noted. First is the mid-sized sample (comprising predominantly middle-aged women), which was further reduced in analyses of secondary outcomes and clinical moderators due to trials not having assessed the relevant variables. For the same reason, not all potential moderators of interest could be examined (e.g., childhood trauma). Also, while this study found evidence for episode duration moderating treatment outcomes, the p value exceeded the Bonferroni correction and the study might have been unable to identify weaker moderator relationships. Second, not all moderator variables were assessed in all studies. Thus, the individual moderator models can relate to different subgroups of studies, which might not be representative of the total sample of studies. Third, not all studies were free from selection, detection, and attrition bias, though the main findings appeared robust against controls for these risks of bias. Included studies also differed with regard to the STPP model used and follow-up length. Regardless of these differences, moderator effects could be identified in the combined studies' data. Fourth, IPD were not obtained for two studies, which differed systematically from the other included studies in being dissertations. However, as effect sizes did not differ significantly between studies for which IPD were and were not available, it is unlikely that the treatment effect estimates in this study were biased. Fifth, two studies used a waitlist control condition, which has been argued to potentially inflate treatment effects due to the nocebo effect (control condition participants not expecting and therefore not experiencing improvements while waiting for treatment; Furukawa et al., 2014). However, effect sizes were not found to be higher in the two studies with a waitlist control condition (Appendix Table A.8) and these two studies comprised a relatively small proportion of the participant sample (7.8%), suggesting that their influence might have been limited. Sixth, and most important, moderator findings are of observational nature, which means that these findings need validation in prospective trials before they can be used to guide treatment selection.

Clinical and research implications
The findings of the current study indicate that STPP is an efficacious treatment for depression, leading to a reduction in depression, anxiety and general psychopathology, and increased quality of life. Though future research is needed to determine the lasting effects of these benefits over time, these findings support the current inclusion of STPP as a recommended treatment option in practice guidelines for (severe) depression (American Psychological Association, 2019; National Institute for Health and Care Excellence, 2022). Individuals with longer depressive episode durations appear to benefit specifically from STPP. However, the findings of this study cannot be taken to imply that such individuals should necessarily receive STPP, as this study does not speak to the effects of STPP versus other well-established depression treatments (e.g., antidepressant medication).
Given the limitations of this study, further research examining the efficacy of STPP for depression and moderators of treatment outcome is warranted. More specifically, there is a need for future large-scale rigorously conducted RCTs of STPP for depression compared to control conditions assessing a range of outcome measures at post-treatment, but particularly at follow-up. Additional help-seeking in the follow-up period should be routinely assessed to examine its potential effect on longer-term outcomes. Moreover, a broad range of patient characteristics should be assessed at baseline to facilitate further research of moderator effects. Such future studies and IPD meta-analyses may provide additional support for the evidence base of STPP and offer further insight into which individuals might benefit specifically from this frequently used depression treatment.

Table 3 (for results of the individual studies see Appendix Table A.7). At post-treatment, STPP was significantly more efficacious than control

Table 1
Characteristics of identified studies.

Table 2
Risk of Bias Assessment of the Primary Studies.
+ = low risk of bias, − = high risk of bias.

Table 3
Treatment effects of STPP for depression compared to control conditions at post-treatment and follow-up.

Table 4
Cohen's d effect sizes on depressive symptom measures of STPP versus control conditions for the different moderator levels. Depressive Experience Questionnaire; STPP = Short-term psychodynamic psychotherapy. Negative effect sizes indicate a superiority of STPP compared to control conditions. Statistical significance (p < .05) of the time-by-moderator-by-treatment 3-way interaction is marked by bold printed numbers.