The Association of Therapeutic Alliance With Long-Term Outcome in a Guided Internet Intervention for Depression: Secondary Analysis From a Randomized Control Trial

Background: Therapeutic alliance has been well established as a robust predictor of face-to-face psychotherapy outcomes. Although initial evidence positioned alliance as a relevant predictor of internet intervention success, some conceptual and methodological concerns were raised regarding the methods and instruments used to measure the alliance in internet interventions and its association with outcomes.
Objective: The aim of this study was to explore the alliance-outcome association in a guided internet intervention using a measure of alliance especially developed for and adapted to guided internet interventions, showing evidence of good psychometric properties.
Methods: A sample of 223 adult participants with moderate depression received an internet intervention (ie, Deprexis) and email support. They completed the Working Alliance Inventory for Guided Internet Intervention (WAI-I) and a measure of treatment satisfaction at treatment termination and measures of depression severity and well-being at termination and 3- and 9-month follow-ups. For data analysis, we used two-level hierarchical linear modeling that included two subscales of the WAI-I (ie, tasks and goals agreement with the program and bond with the supporting therapist) as predictors of the estimated values of the outcome variables at the end of follow-up and their rate of change during the follow-up period. The same models were also used controlling for the effect of patient satisfaction with treatment.
Results: We found significant effects of the tasks and goals subscale of the WAI-I on the estimated values of residual depressive symptoms (γ02=−1.74, standard error [SE]=0.40, 95% CI −2.52 to −0.96, t206=−4.37, P<.001) and patient well-being (γ02=3.10, SE=1.14, 95% CI 0.87-5.33, t198=2.72, P=.007) at the end of follow-up. A greater score in this subscale was related to lower levels of residual depressive symptoms and a higher level of well-being. However, there were no significant effects of the tasks and goals subscale on the rate of change in these variables during follow-up (depressive symptoms, P=.48; patient well-being, P=.26). The effects of the bond subscale were also nonsignificant when predicting the estimated values of depressive symptoms and well-being at the end of follow-up and the rate of change during that period (depressive symptoms, P=.08; patient well-being, P=.68).
Gómez Penedo et al. J Med Internet Res 2020;22(3):e15824. https://www.jmir.org/2020/3/e15824


Introduction
Several meta-analyses positioned therapeutic alliance as a robust predictor of outcomes in face-to-face psychotherapy [1-3]. However, alliance effects do not seem to be limited to the field of traditional psychotherapy. Alliance also predicted outcomes in other health-related interventions, such as pharmacotherapy [4,5]. The increasing development of internet interventions and the evidence for their efficacy and effectiveness in treating diverse mental disorders [6-8] raised the question of what role therapeutic alliance might play in such treatments, especially in those providing guidance from trained supporters (guided internet interventions) [9].
In recent years, several studies have addressed this question. Although some authors found that the alliance could be less important in internet interventions than in traditional face-to-face therapies [10], a recent meta-analysis reported similar effect sizes (r=0.275) for the alliance-outcome relationship in online interventions as in traditional face-to-face therapies [1]. Among the 18 studies included in that meta-analysis [1], 15 analyzed the alliance specifically in guided internet interventions.
To measure the alliance, most of the studies used the same instruments usually administered in face-to-face psychotherapy (ie, the Working Alliance Inventory [WAI]) [11] with very slight modifications (eg, talking about a treatment instead of therapy) and focused on the effects of the relationship between the patient and the supporting therapist [10,12-21]. This approach to measuring alliance follows its classical conceptualization in psychotherapy research as tripartite, consisting of (1) the patient-therapist emotional bond, (2) patient agreement with the tasks of therapy, and (3) patient agreement with the therapeutic goals that the patient and therapist seek in treatment [22,23]. However, these studies did not consider that treatment tasks and goals were not set in collaboration with the supporting therapist but were proposed by the online program, which might be a limiting factor. Other studies took the opposite approach, using adapted instruments to measure the alliance between the patient and the online intervention only [18,24-26].
Regardless of whether the measuring instruments focused on the supporting therapist or the program, all the abovementioned studies have one thing in common: they lack an exploration of the psychometric properties (ie, validity and reliability) of the measuring instruments used in the specific context of guided online interventions. Only Kiluk et al [24] presented an exploratory analysis of the adapted version of the WAI (ie, WAI-Tech) with some evidence of internal consistency (Cronbach alpha) and external validity (ie, significant correlations with the original WAI and with patient treatment satisfaction). However, the small sample size of that study (n=34) limited the interpretability of the findings and prevented the provision of further evidence for the psychometric properties of the scale (eg, construct validity). Thus, besides the conceptual issues identified in most previous alliance-outcome studies of guided internet interventions, concerns might be raised regarding the validity and reliability of the instruments used to measure the alliance construct.
In this context, Berger et al [27], as well as Scherer et al [28], presented a compromise between the two previous approaches of analyzing alliance-outcome associations in guided internet interventions (focus on the supporting therapist or the online program). The authors took the original version of the WAI and adapted it to guided internet interventions, exploring the bond with the supporting therapist but the tasks and goals with the online program. With this version of the instrument, they captured the most relevant aspects of the alliance, considering both the importance of the relationship with the supporting therapist and patient attunement with the online program. Recently, Gómez Penedo et al [29] systematized the efforts by Berger et al [27] and Scherer et al [28], presenting the Working Alliance Inventory for Guided Internet Interventions (WAI-I) and exploring the psychometric properties of the scale. The findings provide evidence for adequate internal consistency, external validity, and construct validity (based on a confirmatory factor analysis) of the WAI-I.
Given the cumulative evidence showing that residual depressive symptoms are some of the main predictors of relapse [30-32], in this study, we will analyze how the alliance during treatment is associated with long-term outcomes (ie, 9-month follow-up) after a guided internet intervention for patients with moderate depressive symptoms. We will focus on both analyzing the effects of the alliance on the changes produced during the follow-up period (ie, deterioration or further improvement) and evaluating the residual depressive symptoms at the 9-month follow-up. Furthermore, we will analyze the same effects on the well-being of patients. Besides responding to a general call for further studies clarifying the role of alliance in online interventions [1,9], especially with adapted and psychometrically sound instruments [29], the aim of this study was to explore the alliance-outcome association using the WAI-I, a measure of alliance especially developed for and adapted to guided internet interventions.

Participants
For this study, we drew on a dataset from the EVIDENT study [33,34], a large multicenter randomized controlled trial analyzing the effects of the internet intervention Deprexis [35]. In that trial, 509 patients were assigned to the online intervention. Of these 509 participants, 317 presented with moderate depressive symptoms (ie, score between 10 and 14 in the Patient Health Questionnaire [PHQ]) [36] and therefore additionally received weekly email support from trained clinicians. In our study, we analyzed a sample of 223 patients who received both Deprexis and email support, and completed the WAI-I at treatment termination. Patients in this study were aged between 18 and 65 years (mean 44.48 [SD 10.68] years) and were German speaking. Furthermore, most participants were female (157/223, 70.4%), had high school education (107/223, 48.0%), and were in a romantic relationship at the beginning of the treatment (151/223, 67.7%). The exclusion criteria were a lifetime diagnosis of a bipolar disorder or schizophrenia and acute suicidality established by a telephone diagnostic interview.

Internet Intervention
Deprexis is an internet intervention that demonstrated effectiveness when treating depression, showing medium effect sizes at posttreatment [37]. Deprexis has 10 modules (in addition to one introductory and one summary module) developed consistent with cognitive-behavioral treatment manuals. Within the modules, the program provides simulated dialogues that explain different techniques and key concepts and deliver examples and illustrations for easy understanding. Participants were requested to complete different exercises within the program and provide feedback in order to attune the interventions to each participant. Although the modules were presented sequentially, patients could repeat them as often as they wanted. In this sample, the participants spent a mean of 520 (SD 314) minutes in the program and completed a mean of 9.74 (SD 4.51) modules.
All the participants in this sample received standardized email support that consisted of weekly feedback regarding their activity on Deprexis during the last week. The main goal of this support was to enhance participants' motivation and engagement with the internet intervention. The support was implemented via a secured email system included in the internet intervention and was delivered by master's students in clinical psychology and psychotherapy, psychotherapists in training, and licensed psychotherapists who received an intensive 4-hour training in the program and feedback strategies, using example cases. The instructions provided to the supporters were in line with those used in a similar previous trial [38]. An expert on internet interventions supervised their tasks by periodically reviewing the messages from the supporters and providing feedback to them. The study participants were able to contact the supporters directly or respond to their messages.
The sample of this study received a mean of 12.11 (SD 3.23) messages from the supporting clinicians and read a mean of 9.65 (SD 4.88) of those messages. Additionally, the patients sent a mean of 1.99 (SD 2.84) messages to the supporting clinicians (54.7% of the sample sent at least one message). Further details on the internet intervention Deprexis are presented in articles by Meyer et al [35] and Klein et al [33,34].

Working Alliance Inventory for Guided Internet Interventions
The WAI-I is an instrument derived from the Working Alliance Inventory-Short revised [39], and it was specifically adapted to guided internet interventions [27,29]. The WAI-I is a 12-item self-reported measure rated on a 5-point Likert scale ranging from 1 (never) to 5 (always). The instrument has two dimensions. One dimension explores the emotional bond between the patient and the supporting therapist, with items like "I feel that the psychologist who supports me in the online program appreciates me." The second dimension analyzes patient agreement with the tasks and goals of the internet intervention, with items like "I believe the way the online program is working with my problem is correct" and "The goals of the online program are important goals for me." The instrument showed evidence of adequate internal consistency, external validity, and construct validity for a two-factor solution (based on a confirmatory factor analysis) [29]. In this sample, the Cronbach alpha of the bond subscale was .89, whereas that of the tasks and goals subscale was .93.
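As an illustration of the internal consistency index reported for the subscales, the following is a minimal sketch of how Cronbach alpha can be computed from item-level responses. The function name and the example ratings are hypothetical and are not taken from the study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("Cronbach alpha needs at least two items")
    # Sample variance of each item across respondents
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of the respondents' total scores
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings from four respondents on three 1-5 Likert items
example = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3]]
alpha = cronbach_alpha(example)
```

Values closer to 1 indicate that the items of a subscale vary together, which is the sense in which the bond and the tasks and goals subscales are described as internally consistent.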

Patient Satisfaction Questionnaire (Zurich Satisfaction Questionnaire-8)
To measure patient satisfaction with treatment, we used the Zurich Satisfaction Questionnaire-8 (ZUF-8) [40] adapted for internet interventions [33]. This instrument is a self-reported measure of eight items rated on a 4-point Likert scale ranging from 1 (low satisfaction) to 4 (high satisfaction). The ZUF-8 adapted to the German culture showed good psychometric properties with evidence of internal consistency, concurrent validity, and construct validity [40]. In the sample of this study, the ZUF-8 showed a Cronbach alpha of .92.

Patient Health Questionnaire-9
The PHQ-9 is a widely used outcome measure for the treatment of depression [41]. It has nine self-reported items representing the nine Diagnostic and Statistical Manual of Mental Disorders-version IV criteria of depression that are completed on a 4-point Likert scale from 0 (not at all) to 3 (nearly every day). Previous studies showed that it is a reliable and valid instrument to measure depression severity [41]. In this study, the PHQ-9 presented good internal consistency during follow-up, with a Cronbach alpha of .82.

Short-Form Health Survey-12
The Short-Form Health Survey-12 (SF-12) is an instrument for measuring health-related quality of life [42]. For this study, we used the mental health subscale (SF-P) of this measure, which consisted of six items, with higher scores representing a higher quality of life in terms of mental health. The items from SF-12 have a Likert scale that varies from 3 to 6 response categories, depending on the item. The individual item responses are then transformed into a 0 to 100 scale and then aggregated into different dimensions or subscales, with higher scores representing greater health well-being. Thereafter, the aggregated scores are standardized according to a normative population by computing t-scores that have a mean of 50 and an SD of 10 [34]. The SF-12 showed evidence of reliability and validity [42]. In this sample, the SF-P presented good internal consistency, with a Cronbach alpha of .84.
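The final standardization step described above can be sketched as follows. This covers only the conversion of an aggregated 0 to 100 score into a t-score with a mean of 50 and an SD of 10; the normative mean and SD used in the example are hypothetical placeholders, and the SF-12 item weighting itself is not reproduced.

```python
def to_t_score(raw, norm_mean, norm_sd):
    """Standardize an aggregated scale score against a normative population."""
    if norm_sd <= 0:
        raise ValueError("normative SD must be positive")
    # z-score relative to the norm, rescaled to mean 50 and SD 10
    return 50 + 10 * (raw - norm_mean) / norm_sd


# Hypothetical norm values (mean 50, SD 20 on the 0-100 scale)
t_value = to_t_score(70, norm_mean=50, norm_sd=20)
```

Under this scaling, a score of 40 lies one normative SD below the population mean.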

Procedure
Patients completed the alliance measure (WAI-I) only once at posttreatment. We decided to measure alliance only at the end of therapy, as some studies showed that patients might have difficulty completing it early in treatment because of the limited interaction with the supporter during the intervention [16]. At treatment termination, patients also completed the ZUF-8 as a general measure of patient satisfaction. Furthermore, they completed the PHQ-9 and SF-12 at posttreatment and at 3- and 9-month follow-ups. The Ethics Committee of the German Psychological Association approved the procedure of the study (Deutsche Gesellschaft für Psychologie, reference number SM 04_2012). All patients completed an electronic informed consent form before baseline assessments.

Analytic Strategy
For the analyses in this study, we used hierarchical linear models (HLMs) to deal with the dependency of the observations owing to the nestedness of the data [43]. Considering that repeated measures during follow-up were nested within patients, we ran two-level HLMs, accounting for within-patient and between-patient variabilities. These models accommodate missing data, so that all patients with at least one measurement point can be retained in the analyses, which mimics an intent-to-treat approach.
We first ran two-level fully unconditional models with PHQ-9 and SF-12 values during follow-up as the dependent variables. In the next step, we ran an unconditional time-as-only predictor model, with time as a level 1 predictor centered at the 9-month follow-up and representing the evolution during the follow-up period (posttreatment=−1; end of follow-up=0). Thereafter, we ran a conditional model that included WAI-I scores in the bond and tasks and goals subscales as separate level 2 predictors of the intercept (ie, estimated score of the outcome variables at the 9-month follow-up) and the linear slope of time (ie, evolution of the outcome variables during follow-up). Finally, we ran the exact same models but included either (1) patient satisfaction with treatment or (2) participant-supporter interaction indicators (ie, number of messages sent by the participant, number of messages sent by the supporter, and number of messages read by the participant) as a level 2 predictor to control for its effect.
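For orientation, the conditional model described above can be sketched in conventional two-level notation. The coefficient labels γ00, γ10, γ02, and γ12 follow the results reported in this article; the indices assigned to the bond subscale (γ01, γ11) and the residual terms (e, u0, u1) are assumed here for illustration.

```latex
\begin{align*}
\text{Level 1:}\quad Y_{ti} &= \beta_{0i} + \beta_{1i}\,\mathrm{Time}_{ti} + e_{ti} \\
\text{Level 2:}\quad \beta_{0i} &= \gamma_{00} + \gamma_{01}\,\mathrm{Bond}_{i} + \gamma_{02}\,\mathrm{TasksGoals}_{i} + u_{0i} \\
\beta_{1i} &= \gamma_{10} + \gamma_{11}\,\mathrm{Bond}_{i} + \gamma_{12}\,\mathrm{TasksGoals}_{i} + u_{1i}
\end{align*}
```

Because time is coded as 0 at the 9-month follow-up, the intercept β0i represents the estimated outcome of patient i at the end of follow-up, and β1i represents that patient's change over the follow-up period. In the models controlling for patient satisfaction, an additional level 2 term enters both equations (γ03 and γ13).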

Sample Details
To characterize the sample, we calculated the mean and SD of each of the targeted variables at posttreatment. We have presented these descriptive statistics in Table 1.

Fully Unconditional Model
We present the results of all conducted models in Table 2. The fully unconditional model estimated a mean level of 7.14 units in the PHQ-9 during follow-up (γ00=7.14

Unconditional Time-as-Only Predictor Model
When predicting PHQ-9 scores, the inclusion of the time variable as a level 1 predictor significantly improved the fit of the fully unconditional model (χ²(3)=9.70, P=.02). The time-as-only predictor model for PHQ-9 estimated a residual depression symptom score of 6.85 at the 9-month follow-up (γ00=6.85, SE=0.29, 95% CI 6.28-7.42, t209=54.84, P<.001). This model also showed that the change in the PHQ-9 score during follow-up approached significance (γ10=−0.49, SE=0.27, 95% CI −1.02 to 0.04, t203=−1.80, P=.08). The computation of the CIs for the random effects showed significant random effects for both the estimated residual depressive symptoms at the 9-month follow-up (SD 3.33, 95% CI 2.91-3.98) and the change during that period (SD 1.73, 95% CI 0.74-2.80). These results indicate that participants varied significantly around the average estimated residual depressive symptoms at the end of the 9-month follow-up and around the average rate of change during follow-up, which supports the inclusion of level 2 predictors to explain this variance.
For the models predicting SF-P, the inclusion of the time variable as a level 1 predictor did not significantly increase the fit of the fully unconditional model (χ²(3)=4.17, P=.24). This time-as-only predictor model showed an estimated value of 40.04 for the SF-P at the 9-month follow-up (γ00=40.04, SE=0.80, 95% CI 38.47-41.61, t200=50.19, P<.001). Furthermore, the change in the SF-P during follow-up approached significance (γ10=1.33, SE=0.75, 95% CI −0.14 to 2.80, t190=1.77, P=.08). The calculation of CIs showed significant random effects for both the estimated well-being at the end of follow-up (SD 8.88, 95% CI 7.35-10.55) and the rate of change during follow-up (SD 3.20, 95% CI 0.09-6.32). Thus, participants varied significantly around the average estimated well-being at the end of follow-up and around the average change during follow-up, which supports the inclusion of level 2 predictors to explain this variance.
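The 95% CIs reported for the fixed effects are consistent with a simple estimate ± 1.96 × SE interval. The sketch below reproduces that arithmetic for the PHQ-9 intercept; using the normal quantile 1.96 is an assumption made for illustration, as the article does not state which critical value was used.

```python
def wald_ci(estimate, se, z=1.96):
    """Approximate 95% CI for a fixed effect from its estimate and standard error."""
    half_width = z * se
    return estimate - half_width, estimate + half_width


# PHQ-9 intercept from the time-as-only predictor model (estimate 6.85, SE 0.29)
low, high = wald_ci(6.85, 0.29)
print(f"{low:.2f} to {high:.2f}")
```

The same calculation applied to the slope (estimate −0.49, SE 0.27) gives an interval that includes zero, which matches the nonsignificant change reported for the follow-up period.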

Conditional Models: Alliance Main Effects
The conditional model with the PHQ-9 as an outcome variable and the alliance subscales as level 2 predictors significantly improved the fit of the time-as-only predictor model (χ²(4)
Furthermore, the conditional model with the SF-P as the outcome variable and the alliance subscales as level 2 predictors significantly improved the model fit as compared with the time-as-only predictor model (χ²(4)=20.59, P<.001). This conditional model also showed a significant effect of the tasks and goals subscale on the estimated SF-P score at the end of follow-up (γ02=3.10, SE=1.14, 95% CI 0.87-5.33, t198=2.72, P=.007). A 1-unit greater tasks and goals score at posttreatment was associated with a 3.10-unit higher score in the SF-P at the end of the 9-month follow-up. There was no significant effect of the tasks and goals subscale on the development of the SF-P during follow-up (γ12

Conditional Models: Alliance Main Effects Controlling for Patient Satisfaction and Patient-Supporter Interaction
The results of the conditional models estimating the alliance effects controlling for either patient satisfaction or patient-supporter interaction are presented in Multimedia Appendix 2. When the conditional models predicting PHQ-9 scores presented above were rerun controlling for patient satisfaction with treatment, there was a significant improvement in the conditional model fit (χ²(2)=8.61, P=.01). This model comparison test suggested the importance of controlling for patient satisfaction when estimating the alliance effects on the PHQ-9. The results of this model showed that there was still a significant effect of the tasks and goals subscale on the estimated PHQ-9 value at the end of follow-up (γ02=−1.78, SE=0.58, 95% CI −2.92 to −0.64, t203=−3.09, P=.002). The other effects of the alliance were nonsignificant, as in the previous models. Furthermore, the effect of patient satisfaction was not significant when predicting the estimated PHQ-9 scores at the 9-month follow-up (γ03=0.09, SE=0.85, 95% CI −1.58 to 1.76, t200=0.11, P=.91) but was significant when predicting the change produced in the PHQ-9 during follow-up (γ13=2.00, SE=0.81, 95% CI 0.41-3.59, t193=2.48, P=.01). When controlling for the alliance subscale effects, a 1-unit greater score in the ZUF-8 at posttreatment (ie, patient satisfaction with treatment) was associated with a 2-unit increase in the PHQ-9 score during the follow-up period.
However, when running a conditional model exploring alliance effects on the PHQ-9 controlling for patient-supporter interaction indicators (ie, number of messages sent by the participant, number of messages sent by the supporter, and number of messages read by the participant), there was no significant improvement in the model fit (χ²(6)=0.56, P=.99). This test suggested that participant-supporter interaction indicators need not be included as covariates in conditional alliance models. Nevertheless, it is worth highlighting that the model controlling for participant-supporter interaction indicators still showed significant effects of the tasks and goals subscale on the estimated PHQ-9 value at the end of follow-up (γ02=−1.97, SE=0.44, 95% CI −2.82 to −1.12, t141=−4.49, P<.001).
Furthermore, the inclusion of patient satisfaction in the conditional models predicting SF-P did not significantly improve the model fit compared with the conditional models that included only the alliance subscales (χ²(2)=0.39, P=.82). This model comparison test again suggested that patient satisfaction need not be included when estimating alliance effects on the SF-P during follow-up, keeping as the final models the conditional models introduced in the section presented above (ie, conditional models with alliance-only main effects). As can be seen in Multimedia Appendix 1, the results of the model predicting the SF-P and controlling for patient satisfaction suggested no significant effects of either alliance subscales or patient satisfaction.
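The model comparison tests above contrast nested models through a chi-square statistic whose degrees of freedom equal the number of added parameters. As a sketch of how the reported P values follow from those statistics, the function below evaluates the upper tail of the chi-square distribution using the closed form that holds for even degrees of freedom; the function name is hypothetical, and the fitting software used in the study is not reproduced here.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square statistic (even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form holds only for positive even df")
    if x < 0:
        raise ValueError("chi-square statistic cannot be negative")
    half = x / 2.0
    # Sum of (x/2)^k / k! for k = 0 .. df/2 - 1
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Adding patient satisfaction to the PHQ-9 and SF-P models (2 extra parameters each)
p_phq = chi2_sf_even_df(8.61, 2)
p_sfp = chi2_sf_even_df(0.39, 2)
```

A small P value, as for the PHQ-9 comparison, indicates that the added covariate improves fit; a large one, as for the SF-P comparison, indicates that the simpler model can be retained.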
Additionally, the models controlling for participant-supporter interaction did not improve the conditional model fit when predicting the SF-P (χ 2

Discussion
Responding to a general call for further studies clarifying the role of alliance in online interventions [1,9], particularly with psychometrically sound instruments [29], the aim of this study was to analyze the alliance-outcome association in a guided internet intervention for participants with moderate depression, using an adapted version of the WAI to measure the alliance in this type of approach. The results of the models showed significant effects of tasks and goals on the estimated scores of the PHQ-9 and SF-P at the end of the 9-month follow-up. These results suggest that when participants report a greater agreement with the therapeutic activities and goals proposed by the internet intervention at posttreatment, their residual depressive symptoms are lower and psychological well-being is higher at the end of follow-up. These findings are in line with the results of several studies showing the overall importance of an alliance for internet interventions [1] and the specific relevance of the alliance with an online program for treatment outcomes [18,25]. However, unlike previous studies that analyzed overall alliance with internet interventions, and following suggestions from several authors [9,27,28], this study measured alliance by disaggregating the bond with trained supporters from the agreement with internet interventions regarding tasks and goals. Furthermore, in this study, the effects of the alliance were associated with residual depressive symptoms and patient well-being 9 months after treatment termination. However, the tasks and goals subscale did not exhibit a significant effect on the rate of change during the follow-up period.
Additionally, the results of this study did not show significant effects of the bond subscale on PHQ-9 or SF-P-estimated values at the end of follow-up or the rate of change during the follow-up period. These findings are in line with theories suggesting that in internet interventions, the agreement of tasks and goals with the intervention might be more relevant than the bond with trained supporters [9,27].
In conclusion, the results of this study point out the importance of attuning internet interventions with patients' expectations and preferences in order to enhance their agreement with the tasks and goals of the treatment. Being responsive to patients' needs has been presented as a fundamental process of change in face-to-face psychotherapy [44-46]. The results of this study further support this notion in internet interventions. Some online programs, such as Deprexis, use patient feedback to select specific content presented to patients [35]. However, this treatment personalization is only within modules to treat depression. Considering the high rates of comorbidities in patients with depression [47], one extension that could enhance the intervention is the incorporation of modules that address other relevant symptomatologies in these patients (eg, anxiety) [27]. In recent years, there have been several developments in this direction, tailoring internet interventions to the symptom profile of the patient and offering individually prescribed treatment modules accordingly [27,48-50]. The inclusion of these personalization strategies to account for likely comorbidities beyond depressive symptoms might enhance participant agreement with the tasks and goals of the program, because the intervention would further include meaningful activities to address relevant problems for the patients. Indeed, in a study comparing tailored and standardized disorder-specific internet interventions, participants who received the tailored condition rated the agreement of the tasks and goals with the program substantially and significantly higher (P<.001) as compared with those who received the standardized condition [27]. Future studies might need to further explore evidence-based responsiveness and treatment goals according to patient markers and analyze their associations with internet intervention processes, such as agreement on the tasks and goals of the intervention, and with acute as well as long-term outcomes.
Several limitations characterize this study. For instance, with only one assessment point for the alliance, differential effects of the general level of the alliance during treatment (ie, between-patient effects or trait-like components of alliance) and effects of the modifications of the alliance during the intervention (ie, within-patient effects or state-like components of alliance) could not be established [51,52]. Furthermore, as shown by Crits-Christoph et al [53], a single measure of alliance might be less reliable than an aggregation of several measurements. The reason for assessing alliance only once at posttreatment was that contact with the supporting therapist was minimal and an adequate dose of interaction between the patient and the supporting therapist was necessary for reliable evaluation. However, this argument applies mainly to the bond subscale but not necessarily to the subscale on agreement with the tasks and goals of the internet intervention (the component of the alliance that showed significant effects on long-term outcomes). Future studies would benefit from the analysis of the agreement of the tasks and goals with the intervention early in treatment and the use of repeated measures that would allow for a more fine-grained and sound analysis of the association between alliance and outcome in guided internet interventions. Additionally, although we measured alliance only at posttreatment to maximize contact between the participants and supporters before the assessment, their contact during the intervention might be too limited to fully capture the potential effects of the bond with the supporter. Future studies would need to further explore the effects of the therapeutic relationship or bond with the supporter in internet interventions where there is greater participant-supporter contact.
Furthermore, the analyses of the study were conducted using a subsample of the EVIDENT randomized controlled trial (ie, patients with moderate depression who received Deprexis with email support). Further studies will need to explore whether the association between task and goal agreement and long-term outcomes can be generalized to other populations with different diagnoses (eg, anxiety disorders), different levels of severity (eg, mild or severe depression), other internet interventions, and other forms of therapist support (eg, phone support).