Feedback-informed treatment in emergency psychiatry: a randomised controlled trial

Background: Immediate patient feedback has been shown to improve outcomes for patients in mild distress but it is unclear whether psychiatric patients in severe distress benefit equally from feedback. This study investigates the efficacy of an immediate feedback instrument in the treatment of patients with acute and severe psychosocial or psychiatric problems referred in the middle of a crisis.
Methods: A naturalistic mixed diagnosis sample of patients (N = 370) at a Psychiatric Emergency Centre was randomised to a Treatment-as-Usual (TAU) or a Feedback (FB) condition. In the FB condition, feedback on patient progress was provided on a session-by-session basis to both therapists and patients. Outcomes of the two treatment conditions were compared using repeated measures MANCOVA, Last Observation Carried Forward and multilevel analysis.
Results: After 3 months, symptom improvement in FB (ES 0.60) did not significantly differ from TAU (ES 0.71) (p = 0.505). After 6 weeks, FB patients (ES 0.31) actually improved less than TAU patients (ES 0.56) (p = 0.019).
Conclusions: Patients with psychiatric problems and severe distress seeking emergency psychiatric help did not benefit from direct feedback.
Trial registration: Dutch Trial Register, NTR3168, date of registration 1-9-2009.
Electronic supplementary material: The online version of this article (doi:10.1186/s12888-016-0811-z) contains supplementary material, which is available to authorized users.


Background
Feedback systems have been developed in recent decades that provide therapists, patients or both with information about patient progress on a session-to-session basis [1][2][3][4]. The assumption in this 'feedback-informed treatment' is that clients feel more engaged in the therapy process and that therapists are better able to adapt their therapeutic approach when feedback information suggests that treatment is unsuccessful [3,5,6]. In a meta-analysis incorporating nine studies, Lambert & Shimokawa [7] found effect sizes varying from .23 to .33. They concluded that the number of psychotherapy patients who deteriorate in routine care (5-10 % in adult psychotherapy, 14-24 % in child psychotherapy) can be reduced by half using their feedback method.
Most feedback studies have been performed in psychotherapeutic settings and in samples of patients in mild distress who are generally not suffering from major psychiatric disorders.
The available studies in psychiatric samples in the last decade show that feedback improves outcomes for those with more severe mental health problems but that effect sizes are reduced [8]. However, feedback systems differ enormously with respect to the measures used and the frequency of administration [2], and the small number of studies and the heterogeneity of both the studies and the feedback systems make it hard to draw general conclusions [8].
Adding feedback may prove particularly valuable in this psychiatric population since the non-attendance levels in psychiatry are substantial, especially in the group with severe distress [5,9]. Duncan et al. [5] state that patients who have lost their sense of mastery and their faith in therapy can be expected to feel empowered when their views and preferences are explicitly taken into account in a feedback process.
Since using feedback may prove to be a valuable tool in psychiatric treatment, we aimed to investigate, in an RCT, whether applying a feedback system can be effective in a psychiatric setting involving intensive outpatient care following a crisis evaluation.

Setting
The study setting was a Crisis Intervention & Brief Therapy team (CIBT team) in Amsterdam where patients with severe psychiatric and psychosocial problems are treated on an outpatient basis for a maximum of 6 months.
Patients are referred by GPs, mental-health workers and the police. Indication for treatment by the CIBT team is based upon the need for immediate help felt by the patient or referring professional. 'Crisis' is defined as: the patient needs help within 24 h due to a risk of suicide, serious behavioural problems, problems with the law and safety concerns, a sudden loss of social support and/or need for involuntary admission.
The CIBT works from a transdiagnostic perspective, which means that the assessment is not based solely on the diagnostic category but on the overall presentation of symptoms, and the needs and capacity of the patient and relatives. The need for acute help or treatment is integrated with a diagnostic screening and interventions are initiated immediately if necessary. The group of participating therapists consists of a highly experienced permanent staff of six psychiatrists, ten social psychiatric nurses, two psychologists and a family and marital therapist. In addition, the team includes a group of, on average, eight experienced and intensively supervised residents in psychiatry who each work at the CIBT for a period of six months. Clients are assigned to the duty therapist. No selection is made based on the diagnosis of the client or the discipline of the therapist. In total, 32 residents participated over the study period of about three years. The team uses a systemic approach that incorporates supportive and behavioural interventions [10]. All patients undergo a full clinical psychiatric examination. Treatment may involve pharmacotherapy and psycho-education and includes outreach care if needed.

Study design, randomisation and inclusion criteria
This study was designed as a randomised controlled trial in 'routine emergency care' comparing Treatment As Usual (TAU) with a Feedback condition (FB). The difference between TAU and FB is that, in every session in FB, feedback was obtained from the patients about progress in their functioning and about the therapeutic alliance [11], and this feedback was discussed by the therapist and the patient together. In the TAU condition, feedback was obtained every six weeks without feeding the results back to the patient or the therapist.
As the emergency setting made it impossible to distinguish in advance between patients who would be treated in the CIBT team and patients who would be referred to other treatment settings after the first contact, we conducted a pre-randomisation procedure for including patients in the study sample [12]. A random allocation sequence was generated using the SPSS random number generator. Patients were assigned to the FB or TAU condition by a research assistant who knew the allocation sequence but had no information about the patients.

Intervention
Prior to the first session, a research assistant explained the principles of the feedback system, the Partners for Change Outcome Management System (PCOMS), to all patients who had been randomised to the FB condition. Before each session, patients scored their well-being using the Outcome Rating Scale (ORS) and immediately received the printed score on a clipboard. The scores on the ORS form were discussed with the therapist at the beginning of each session. At the end of the session, the patient evaluated the therapy session using the Session Rating Scale (SRS) and also discussed the score with the therapist. When the crosses on the 'What did you think of the session?' form indicated reticence or plain dissatisfaction, the reasons for being dissatisfied were discussed with the therapist. When scores indicated general satisfaction, as indicated by a sum score exceeding 36 [13], the therapist asked for comments about how to improve the therapy.
The research assistant invited patients in the TAU condition to complete the ORS form at intake and every six weeks after that. The score was recorded in the database, and was not accessible to therapists or patients. In both conditions, patients were asked to complete the BSI and OQ45 questionnaires, first upon entering the service, and then every 6 weeks up to a maximum of 24 weeks.

Training of therapists and application of feedback
Staff therapists were trained to administer, score and provide feedback to patients on the basis of the training manual provided for the ORS and SRS [14] before the study started. Follow-up supervision sessions were organised regularly during the course of the research project to maintain adherence. Therapists were trained to discuss the SRS score and encourage patients to express any comments and concerns about the session by making suggestions about how to improve collaboration and therefore address potential breaches in the alliance. Therapists were given the discretion to decide how to interpret and best integrate scores during the course of the treatment. However, if the ORS curve showed no improvement during the initial sessions, therapists were required to consult a colleague and consider other treatment options.

Independent variables
The data collected at baseline (the emergency consultation) were: age, gender, domestic situation, ethnicity and main DSM IV diagnostic category.

Outcome measures
The number of therapy sessions and the duration of treatment were derived from the patient registration systems of Arkin Mental Health Care in Amsterdam. The link to the database of this system was established with an encrypted code based on gender, date of birth and the first two letters of the family name. This link made it possible to deduce data for unique patients.
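The linkage step can be illustrated with a minimal sketch. The text only states that an encrypted code was built from gender, date of birth and the first two letters of the family name; the keyed hash (HMAC-SHA256) used here, the field formatting and the function name `linkage_key` are all assumptions made for illustration and are not the registry's actual method.

```python
import hashlib
import hmac


def linkage_key(gender: str, date_of_birth: str, family_name: str, secret: bytes) -> str:
    """Build a pseudonymous linkage code from three patient attributes.

    Hypothetical helper: a keyed hash stands in for the unspecified
    'encrypted code' mentioned in the text.
    """
    # Normalise the inputs so that trivial spelling differences do not matter.
    raw = f"{gender.strip().upper()}|{date_of_birth.strip()}|{family_name.strip()[:2].upper()}"
    # HMAC-SHA256 with a secret key; the raw attributes cannot be read back from the code.
    return hmac.new(secret, raw.encode("utf-8"), hashlib.sha256).hexdigest()


# Illustrative, made-up records (not study data).
key_a = linkage_key("F", "1980-02-14", "Jansen", b"demo-secret")
key_b = linkage_key("F", "1980-02-14", "Jacobs", b"demo-secret")
key_c = linkage_key("F", "1981-02-14", "Jansen", b"demo-secret")
```

In this sketch `key_a` and `key_b` coincide, because only the first two letters of the family name enter the code, so a key of this kind identifies patients only approximately.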

Choice of feedback system
In meta-analyses, three elements which make feedback more effective were identified [2,7]: when information about patient progress (by contrast with information about patient status only) was supplied, when feedback was reported frequently (more than twice over the course of treatment), and when both the patient and the therapist were informed about progress. A feedback system that incorporates these elements is the Partners for Change Outcome Management System (PCOMS) [13,15]. An advantage of PCOMS is that it uses much shorter score lists than other systems, which is important for psychiatric patients with short attention spans. Three randomised controlled studies have been performed with PCOMS [16][17][18]. In these studies, which took place in student and family counselling settings, patients and couples in the feedback condition were found to improve more than patients receiving treatment as usual.

The Partners for Change Outcome Management System (PCOMS)
PCOMS [13,15] comprises two very short visual analogue (VAS) scales consisting of four items each: the Outcome Rating Scale (ORS), which assesses change in three areas of client functioning (individual or symptomatic functioning, interpersonal relationships, and social role performance), and the Session Rating Scale (SRS) for scoring the quality of the working alliance.
The psychometric properties of the American and Dutch versions of this instrument have been evaluated [13,15,19,20], resulting in coefficient alpha values ranging from 0.84 to 0.93 for the ORS and from 0.80 to 0.90 for the SRS for both the American and Dutch versions. ORS test-retest reliability coefficients (Pearson's r) for both Dutch and American versions were reported ranging from 0.49 to 0.66, and from 0.49 to 0.65 for the SRS. With respect to this relatively weak test-retest reliability, Hafkenscheid et al. [19] point out that correlations between subsequent administrations are an inappropriate operational definition of test-retest reliability for instruments designed to be sensitive to a client's perception of subjective change.

Outcome Questionnaire 45 (OQ45)
The OQ45 [21] consists of 45 statements in three subscales that assess Symptom Distress (SD), Social-Role functioning (SR) and Interpersonal Relationships (IR). Jong et al. [22] conducted a psychometric evaluation of the Dutch version of the questionnaire. Internal consistency (alpha) for the Total score obtained with the Dutch OQ-45 ranges from 0.92 to 0.96. Test-retest reliability (Pearson's r) ranges from 0.79 to 0.82.

The Brief Symptom Inventory (BSI)
The BSI [23] is the concise version (53 statements) of the Symptom Checklist 90 for measuring symptoms of psychopathology in adults. Reliability (alpha coefficient) for the Dutch version of the scale as a whole is 0.96 [24]; test-retest reliability (Pearson's r) is 0.90.

Attitude survey
In order to check for bias resulting from changes in therapists' attitudes to applying feedback, therapists were asked to complete an attitude survey [16] at the start and finish of the study consisting of 19 statements reflecting therapist opinions about PCOMS, examples being 'I consider this instrument useful' or 'I don't think this instrument is useful for clients'. This survey has not been evaluated psychometrically (Additional file 1).
Changes in attitudes towards the feedback process were tested between baseline and 12 weeks (paired t-test).

Adherence survey
After one year (halfway through the study), staff therapists were asked to complete an anonymous survey about the extent to which they had been able to apply the feedback as intended (Additional file 2), in order to check for bias in the results due to lack of adherence. This survey was designed by the first author and contains three items: a) the percentage of sessions in which the therapists applied the feedback measures adequately (results categorised in: '10-40 %', '40-70 %' and 'more than 70 % of the sessions adequately applied'); b) the time spent (in minutes) discussing the ORS; c) the time spent discussing the SRS.

Data analysis
Sample size calculation
With two groups of 90 patients and an alpha of 0.05 (one-tailed), a difference of about 0.3 on the BSI total score (Global Severity Index) at 12 weeks (mean EXP group = 1.0; mean TAU = 1.3; standard deviation at week 12 = 0.80) can be detected with a statistical power of 80 %. Sample size was calculated a priori. No separate power analysis was performed for the ORS and OQ45 since the BSI was the primary outcome measure, as established beforehand [11]. Analysis was performed according to the intention-to-treat principle. Baseline characteristics were compared using Chi-square tests, ANOVA and Mann-Whitney tests. The proportions of early treatment termination and non-response (patients still in treatment without measurement) were compared at 6, 12, 18 and 24 weeks using Chi-square tests. To test for selective drop-out at these measurement points, patient characteristics (including diagnostic categories), baseline measurements and the number of sessions were compared.
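The sample size statement at the start of this section can be checked with a short sketch. It assumes a normal approximation to the two-group comparison of means, which is not necessarily the procedure the authors used; the function name `power_two_sample` is hypothetical. With the stated means and standard deviation, the standardised difference is (1.3 - 1.0) / 0.80, roughly 0.38.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample(mean_diff: float, sd: float, n_per_group: int,
                     alpha: float = 0.05, one_tailed: bool = True) -> float:
    """Approximate power for comparing two group means (normal approximation)."""
    d = mean_diff / sd                          # standardised effect size
    z = NormalDist()                            # standard normal distribution
    tail_alpha = alpha if one_tailed else alpha / 2
    z_crit = z.inv_cdf(1 - tail_alpha)          # critical value of the test
    ncp = d * sqrt(n_per_group / 2)             # expected value of the test statistic
    # For a two-sided test the negligible opposite tail is ignored.
    return z.cdf(ncp - z_crit)


# Values taken from the text: means 1.0 and 1.3, SD 0.80, 90 patients per group.
power = power_two_sample(mean_diff=1.3 - 1.0, sd=0.80, n_per_group=90)
print(f"approximate power: {power:.2f}")
```

Under these assumptions the approximation lands close to the 80 % power reported above.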
Outcomes of the two treatment conditions (observed cases) were compared using repeated measures MANCOVA with the number of sessions as a covariate. In this analysis, each subsequent measurement was compared separately with the baseline measurement. An identical analysis was performed on a dataset on which Last Observation Carried Forward (LOCF) had been performed.
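As a minimal sketch of what Last Observation Carried Forward does to one patient's series of measurements: missing values are replaced by the most recent observed value. The function name `locf` and the example scores are illustrative only; the analyses in this study were run in SPSS.

```python
from typing import List, Optional


def locf(scores: List[Optional[float]]) -> List[Optional[float]]:
    """Carry the last observed value forward over missing (None) entries."""
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for score in scores:
        if score is not None:
            last = score          # remember the most recent observation
        filled.append(last)       # a missing entry takes the remembered value
    return filled


# Made-up GSI series for T0, T6, T12, T18 and T24; the patient stopped after T6.
gsi_series = [1.9, 1.4, None, None, None]
gsi_filled = locf(gsi_series)
```

A series that starts with a missing value stays missing until the first observation, since there is nothing to carry forward.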
Furthermore, multilevel analysis (MLwiN v2.25) was used to establish time by treatment interactions in ORS, BSI and OQ45. Three levels were included: patient, therapist and time. First, the measurements from start to week 12 (T0,T6,T12) were analysed, followed by the measurements from start to week 24 (T0,T6,T12,T18,T24). The number of sessions was included as a covariate.
Analyses conducted to compare the outcomes of completed treatments were based primarily on observed cases. LOCF and multilevel analysis were used as an additional method, primarily with the aim of comparing the results of terminated treatments with different durations. Secondly, both LOCF and multilevel analysis were used to handle data that were incomplete due to missing scores and 'drop-out' (in other words, clients who terminated treatment without mutual consent).
Pretreatment-posttreatment effect sizes for each treatment group were calculated by dividing the mean difference by the pooled standard deviation of the baseline measurement and the measurement point concerned.
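The effect size described here can be sketched as follows. The pooling formula (square root of the mean of the two variances) is an assumption, since the text does not specify how the standard deviations were pooled; the function name and the example values are hypothetical.

```python
from math import sqrt


def pre_post_effect_size(mean_pre: float, mean_post: float,
                         sd_pre: float, sd_post: float) -> float:
    """Pre-post effect size: mean difference divided by a pooled SD."""
    # Assumed pooling: equal weight for baseline and follow-up variance.
    pooled_sd = sqrt((sd_pre ** 2 + sd_post ** 2) / 2)
    # Baseline minus follow-up, so a drop in symptom score gives a positive value.
    return (mean_pre - mean_post) / pooled_sd


# Illustrative values only (not study data).
es = pre_post_effect_size(mean_pre=2.0, mean_post=1.5, sd_pre=1.0, sd_post=1.0)
```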
In addition, the numbers of patients profiting from treatment in both conditions were compared. Based on Cohen's d [25] a cut-off was established at an effect size of 0.5 (which means a 'medium' effect). Clients improving by more than 0.5 SD on the GSI were classified as 'improved', clients changing by less than 0.5 SD in either direction as 'not changed' and clients worsening by more than 0.5 SD as 'deteriorated'. In all analyses α = 0.05 (two-sided) was used as the level of significance. All statistical analyses, except multilevel analysis, were conducted in SPSS 17.0.
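The three-way classification can be sketched as below. Two assumptions are made: that 'improved' means a reduction in GSI of more than 0.5 SD (a lower GSI means fewer symptoms), and that the baseline standard deviation is the reference SD, which the text does not state. The function name `classify_change` is hypothetical.

```python
def classify_change(gsi_baseline: float, gsi_followup: float,
                    sd_reference: float, cutoff: float = 0.5) -> str:
    """Classify a patient's GSI change relative to a 0.5 SD cut-off."""
    # Positive values mean fewer symptoms at follow-up than at baseline.
    change_in_sd = (gsi_baseline - gsi_followup) / sd_reference
    if change_in_sd > cutoff:
        return "improved"
    if change_in_sd < -cutoff:
        return "deteriorated"
    return "not changed"


# Illustrative values only (not study data), with a reference SD of 1.0.
labels = [
    classify_change(2.0, 1.0, 1.0),   # large drop in symptoms
    classify_change(2.0, 1.8, 1.0),   # small drop, within the cut-off
    classify_change(1.0, 2.0, 1.0),   # large rise in symptoms
]
```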

Patient sample
Between 2009 and 2012, a total of 861 patients were referred to the Psychiatric Emergency Centre. The 222 patients who were unable to fill out a questionnaire at intake were excluded. A group of 269 patients were offered only one session for crisis evaluation, resulting in either immediate admission to a psychiatric hospital or referral to the patient's own general practitioner/therapist (when no indication for acute psychiatric help was found). In 370 patients the crisis intervention was followed by brief therapy, which was defined as more than two sessions (including the first crisis evaluation session). Of these patients, 83 terminated treatment within six weeks, making it impossible to assess their progress at the first time point (T6). The study sample therefore included 287 patients (Fig. 1). As 94 patients terminated treatment before T12, 49 (17.1 %) did not complete the questionnaires at this time and 15 (5.2 %) refused to participate, a total of 129 patients had received either TAU (57) or TAU + FB (72) at 12 weeks.
In conclusion, score evaluation at T6 and T12 was not possible for some of the study sample of 287 patients. LOCF and multilevel analyses were performed to correct for missing data.

Sample characteristics and representativeness testing
Of the participants in the total study sample (n = 287; FB and TAU conditions combined), 135 (47 %) were men and 152 (53 %) were women. The mean age was 38 years, with the majority (58 %) being in the 30-49 age category (Table 1). The most common diagnostic categories were adjustment disorder (21 %), depression (19 %) and psychosis (15 %). Some 42 % of the patients were Dutch-born; the remainder (about 60 %) had their roots elsewhere. A substantial proportion (45 %) were living alone. On average, patients suffered from severe distress, as indicated by a mean BSI score of 1.84 at T0, which is significantly higher than the mean BSI score in Dutch clinical populations (1.23) found by de Beurs [24]. No differences for any baseline characteristic (including diagnostic categories) were found (Table 1), indicating a successful randomisation procedure.
The average number of treatment sessions offered to all patients was 9.3 (SD 5.05). No differences were found between conditions. The average duration of treatment was 105 days (range 0-231 days). About half of the patients (49.9 %) ended treatment within three months, more than half (55.8 %) finished treatment within eight sessions and about half (49.5 %) had 4-8 sessions. There were no significant differences in treatment duration between the two conditions: the mean was 105.4 days (SD 51.81) for patients in TAU and 103.5 days (SD 50.23) for patients in FB. In addition, no relationship was found between diagnostic categories and the average number of sessions or duration of treatment (data not shown).

Does systematic session-by-session feedback improve outcome?
After six weeks of treatment, patients in the TAU condition (based on observed cases analysis) had improved by .58 on the GSI score (from 1.88 at T0 to 1.30 at T6); patient scores in the FB condition had improved by .31 (from 1.80 at T0 to 1.51 at T6). Patients in TAU achieved significantly higher treatment gains than patients in FB (p = 0.020). LOCF (p = 0.021) and multilevel analysis (p = 0.006) produced similar GSI results at six weeks (Table 2).
The OQ45 total score at 6 weeks was significantly different with LOCF (p = 0.035) and multilevel analysis (p = 0.047), and also favoured TAU.
At the predetermined primary measurement point, the GSI at 12 weeks [11], no significant difference in treatment gains was found between the conditions: on the basis of observed cases, patients in the TAU condition reported mean treatment gains of 0.62 (GSI decreased from 1.88 at T0 to 1.26 at T12); this gain was 0.58 in the FB condition (GSI decreased from 1.80 at T0 to 1.22 at T12). The ORS score at 12 weeks did not indicate any significant difference favouring FB. LOCF and multilevel analysis also produced no significant differences or trends in total scores at 12 weeks. Table 3 shows that patients in TAU at 6 weeks did significantly better on the BSI subscales (depression, hostility, somatic complaints and anxiety) and the OQ45 subscale (severity). Six other scales/subscales indicated non-significant differences favouring TAU (from p = 0.053 to p = 0.093). Only two subscales favoured FB, and neither did so significantly (p = 0.055 and p = 0.069).
To test for selective treatment termination in the first period, we looked for differences between the total study sample at 12 weeks (N = 129) and the group of clients eliminated from analysis due to early treatment termination or not filling out the forms (N = 158). This check involved the same items, plus the number of sessions and duration of treatment. Furthermore, we looked for differences between TAU (N = 57) and FB (N = 72) at 12 weeks, checking for the percentage of non-responding clients still in treatment (N = 64) as well as the percentage of all non-responding clients, including those who terminated treatment before 12 weeks (N = 158). We also checked for differences between the two conditions in terms of the percentage of clients who terminated the treatment without mutual consent (in other words, drop-out patients), looking at the total percentages in both conditions and at the separate measurement points. None of these comparisons revealed significant differences, suggesting that differences in treatment gains between TAU and FB were not affected by selective early treatment termination, missing data or drop-out.

What do the differences in treatment gains mean in clinical practice?
To identify the significance of the differences in treatment gains for clinical practice, final outcomes were categorised according to the percentages of patients who did and did not benefit from treatment according to the GSI (Table 4). At T6, significantly (p = 0.006) more patients in FB (68.8 %) had undergone no change or deterioration (in other words, they had not improved by more than 0.5 SD on the GSI) than in TAU (48.9 %). At twelve weeks these differences were no longer statistically significant: in FB 48.6 % of the patients showed 'no change' or 'deterioration', as opposed to 45.7 % in TAU. In the full sample 52.7 % of patients had improved at T12 and 67.3 % had done so at T24. Table 5 presents the effect sizes (ES) of treatment. The ES for the TAU group at week 6 was .56 (sd 0.70), as opposed to .31 (sd 0.76) for the FB group, which is a significant difference in favour of the TAU group (p = 0.019). There were no significant differences in ES at other measuring points.

Therapist attitude and adherence
Fifty-one therapists (19 staff members and 32 residents) completed an adherence and attitude survey at both the beginning and the end of the study. The mean score at the beginning was 73.88 (SD 9.29) out of 95 and 71.96 (SD 8.01) at the end, indicating that therapists' attitudes to feedback were very positive on average, even though the initially high motivation of the therapists eroded slightly, albeit not significantly, over time (p = 0.06). In the adherence survey, 67 % of the staff therapists reported that they had applied PCOMS adequately in more than 70 % of the sessions; 14 % had applied it in 40-70 % of the sessions, and 19 % in 10-40 %. On average, therapists (N = 21) estimated that they spent 3.5 min on the ORS and 4 min on the SRS. Almost all patients completed the ORS and SRS forms: only one patient did not fill out a single ORS form, and two patients did not fill out a single SRS form.

Discussion
This study was set up to determine whether the positive results of immediate feedback described in psychotherapy studies could also be demonstrated in short-term psychiatric treatment delivered in an outpatient emergency centre for a range of problems and disorders. Contrary to what we expected, we found no positive effect of immediate feedback at the predetermined end point of our study at twelve weeks. Furthermore, the effect was negative at six weeks because there was significantly less improvement in the FB condition than in the TAU condition. More patients in the FB condition were 'not on track' (showing no change or deterioration) during the treatment process. No difference was found between the TAU and FB groups with respect to duration and number of sessions. Comparing non-responding clients revealed no significant differences, suggesting that providing feedback did not influence selective drop-out in a positive way.
It can be concluded that we found no advantage to including feedback in an emergency psychiatry setting. This result clearly contrasts with most of the earlier studies of PCOMS, which have found substantial benefits with feedback in other treatment settings [16][17][18]. Three possible clinical explanations can be offered for our findings.

Reduced ability to reflect during crisis
Characteristically, in crisis situations, people's ability to consider alternatives and reflect on their situation is impaired [10]. Since the ability to reflect and consider alternatives are precisely the elements needed to benefit from feedback, it is plausible that the impairment of those abilities is responsible (in whole or in part) for the lack of effect of feedback.
Apart from this, patients in crisis desperately look for solutions, including someone who can offer a way out. Explicitly stressing shared responsibilities, as well as possibly introducing insecurity about different treatment options and outcome at the outset in a crisis situation might actually burden the therapeutic relationship. Formalising feedback may disturb the process of subtly balancing between sharing responsibilities for the content of treatment and taking responsibility for the form of the treatment process, which is part of the art of crisis intervention [10].
The finding that differences in favour of TAU emerged in the first six weeks can be interpreted in line with this explanation since, in the initial weeks, a crisis is more severe and the ability to reflect is poor. Later on, as patients stabilise more, immediate feedback is probably more acceptable to patients, even though it still does not lead to better outcomes.

Low level of functioning and severity of psychiatric problems interfere with feedback effects
Simon et al. [26] found that feedback had less effect (d = .12 versus d = .30) in a sample with lower pre-test scores (OQ score 83.72) than in a sample with higher pre-test scores (mean OQ 88.8) in a previous study in the same clinic [27]. They suggested that the difference in pre-test scores could account for the reduced effect of feedback, and that 'feedback interventions do not work as well with more disturbed patients as with the less disturbed'.
The pre-test scores found in our study indicate a higher level of distress than the pre-test scores in other feedback studies. The average ORS score was 13.3 (sd 9.14), as compared with 18.33 to 23.7 in other studies [15][16][17][18]; the mean pre-test OQ45 sum score was 91.78 (sd 27.80), as compared with 68 to 78 in other studies [28][29][30] (a lower OQ score means less severe complaints).
The low pre-test scores in our study may have been confronting for patients in FB and discouraging for patients who dysfunction in several life domains. Any positive effect of the feedback process would not seem to compensate for this drawback.

Relatively high efficacy of TAU
It should be noted that the effect size in the TAU condition in our study (0.71 at twelve weeks) is relatively high by comparison with the TAU groups in other feedback studies: Harmon et al. [29] report 0.43, Hawkins [27] .63 and Reese [18] .38. It is possible that this did not leave an adequate margin for further improvement as a result of adding feedback to this treatment.

Method: strengths and limitations
Limitations
This study took place in a naturalistic crisis setting and the implementation of the study was therefore challenging in several ways. Firstly, a pre-randomisation procedure had to be conducted instead of random assignment with a full evaluation of inclusion and exclusion criteria before initiating randomisation. Secondly, therapies inevitably differed in duration and intensity, and there was sometimes a change of therapist in the course of therapy. However, analyses based on observed cases, LOCF and multilevel analysis (with the latter two adjusting for missing data) led to consistent results, suggesting that the overall conclusions are sound. A drawback in the design was that patients in the TAU group completed the ORS forms only every six weeks (to prevent bias coming from frequent 'feedback-like' reflection on progress in the TAU group), making it impossible to compare on-track/not-on-track trajectories in the two conditions. As a consequence, no conclusions can be drawn about the specific effect of feedback on the group of not-on-track patients. Nevertheless, the finding that a comparison of early termination and non-response in both conditions did not reveal differences suggests that early identification of not-on-track patients did not improve outcomes.
Another limitation is that data were not collected about co-existing treatment and the use of medication during the study and so it is not known whether these factors have influenced the outcome or selective drop-out. Even so, no differences were found in the drop-out rates for the different diagnostic categories and it therefore seems unlikely that medication use, which is generally linked to diagnostic categories, affected the outcomes. Sub-analyses by diagnostic group did not reveal significant differences. However, observed power was limited and so it is difficult to draw conclusions from these analyses. Finally, the adherence of the therapist to the feedback model was monitored by self-report and peer supervision but not measured systematically.

Strengths
The setting and population of this study are unique. To our knowledge, no feedback study has yet been performed in patients in crisis suffering from severe psychiatric and psychosocial problems. The design ensures that possible differences in therapist characteristics are not responsible for differences in outcome: patients were allocated randomly to the different study conditions, and therapists, all of whom were experienced and qualified care providers, participated in both conditions and therefore treated approximately 50 % of their patients using PCOMS and 50 % on the basis of TAU. Given the fact that differences between therapists are usually more pronounced than differences between therapeutic methods, it is important to eliminate the therapist variable [31]. The flip side to this strength is a possible spill-over effect that may occur if therapists fail to distinguish clearly between both conditions. Another benefit of this design is that it is not very likely that allegiance factors (in other words, therapists believing in the effect of feedback) affected outcomes because therapists with both higher and lower levels of motivation delivered treatment to the Feedback condition.
A final strength of the study is that, contrary to previous studies, independent outcome measures (BSI and OQ45) were provided instead of using the feedback measure (ORS) itself as an outcome measure. Providing outcome information to patients may result in 'demand characteristics' (patients responding to incidental hints about the therapists' expectations) that favour the feedback condition [32,33]. In line with this, Janse [34] has recently argued that, although PCOMS is a useful feedback instrument, its validity is limited and therefore other instruments should be added to corroborate progress. Ideally, studies should therefore use an independent outcome measure that is not discussed with the therapist.
In our study, the adverse effect of applying feedback would not have been revealed if BSI and OQ45 had not been added. This finding could suggest that ORS outcomes have indeed been influenced by 'socially desirable' scoring.

Conclusions
To our knowledge, this is the first study suggesting that immediate progress feedback in psychiatric practice does not improve outcome and that it may even be counterproductive.
Perhaps patients did not benefit from feedback because they were unable or reluctant to think about the treatment process, and confronting them repeatedly with their low level of functioning may have demoralised them. If this is true, it may be better not to subject some patients to immediate feedback. Future research could determine whether pre-treatment functioning and the patient's ability to reflect influence the success of feedback. In studies of this kind, independent outcome measures should be added to control for 'socially desirable' scoring during the feedback process.

Availability of data and materials
The data and materials used in this study are available on request.

Consent for publication
Not applicable.

Ethics approval and consent to participate
The study protocol and informed consent procedure were evaluated in 2009 by the ethics committee for Dutch Mental Health Institutions (Kamer Noord of the METiGG) (approval nr. 9219, 1-9-2009). The Committee concluded that, since feedback does not fall under the jurisdiction of the WMO (the Dutch law on scientific medical research on human subjects), the regular clinical procedure for informed consent at the department could be followed. The study was then explained to the patients, written information was provided and patients were asked to participate on a voluntary basis, which was noted in the medical file.

Endnotes
1 For a more detailed description of treatment elements, randomisation procedure, training of therapists and interpretation of feedback measures, see van Oenen et al. [10,11].
2 Scores for alliance scales (Session Rating Scale and HAQ2) will be reported and discussed in a separate paper.