Is the Total Score of the Hamilton Depression Rating Scale Associated with Suicide Attempts or Suicides?

ABSTRACT
Background: Most evidence behind interventions for depression is, in essence, based on the total score of the 17-item Hamilton Depression Rating Scale (HDRS). We identified no systematic review or meta-analysis examining whether the total 17-item HDRS score is associated with suicide attempts or suicides in depressive patients.
Methodology: Based on a systematic literature search of CENTRAL in The Cochrane Library, MEDLINE, EMBASE, PsycInfo, and Science Citation Index Expanded, we systematically reviewed and meta-analysed observational studies examining whether the total 17-item HDRS score is associated with suicide attempts or suicides.
Results: We identified and included ten cohort studies, seven retrospective and three prospective. All the studies were assessed as high risk of bias. Meta-analysis of the HDRS scores from three retrospective studies showed that depressive patients with a suicide attempt during the on-going depressive episode had a significantly higher HDRS score than patients without such an attempt (mean difference 6.31 HDRS points; 95% CI 4.72 to 7.91). Meta-analyses of the HDRS scores from three prospective studies and from four studies reporting retrospectively on lifetime suicide attempts showed no significant differences between patients with and without a suicide attempt or suicide.
Conclusion: A total score on the HDRS does not seem to be associated with past or future suicide attempts or suicides. There seems to be a need for other assessment tools to predict and explain risks of suicidality.


INTRODUCTION
For 40 years the 17-item Hamilton Depression Rating Scale (HDRS) [1] has been the gold standard for quantifying depressive symptoms in clinical trials [2], and systematic reviews have shown that the vast majority of trials assessing the effects of interventions for depression use a total score on the HDRS as the primary outcome measure [3][4][5][6].
We believe that a scale sensitive enough to measure the severity of depressive symptoms should be able to differentiate between depressive patients with and without suicidal tendencies. However, we identified no relevant systematic reviews or meta-analyses examining whether a total score on the HDRS is associated with suicide attempts or suicides.

Aims of the Study
We conducted a systematic review with meta-analysis of observational studies to answer the question: is the total score on the 17-item HDRS associated with suicide attempts and suicides in the past, the present, or the future?

METHODOLOGY
Our systematic review with meta-analysis was conducted according to the MOOSE guidelines for reporting of meta-analyses of observational studies [7], adopting methodology from The Cochrane Handbook for Systematic Reviews of Interventions [8]. Our protocol was completed before the literature search began and was published on our website (www.ctu.dk). We included all observational studies examining the association between a total 17-item HDRS score and suicide attempts or suicides. We wanted to compare homogeneous patient groups and therefore did not include studies with a control group recruited from a different setting than the experimental group, no matter how well the groups were matched. For example, we did not include studies comparing outpatients with inpatients even if the groups were matched regarding age, sex, etc. We chose not to include studies examining the association between suicidal impulses and HDRS, because suicidal inclination is often assessed via item 3 on the HDRS or other continuous outcome scales [1,9,10]. We wanted to avoid assessing the validity of the HDRS via an item from the HDRS itself (item 3) or via another continuous outcome scale of questionable validity. Studies were included irrespective of language, publication status, publication year, and publication type, based on searches in CENTRAL in The Cochrane Library, MEDLINE, EMBASE, PsycInfo, and Science Citation Index Expanded. We also searched other relevant publications for references to relevant studies. The search covered all studies published before October 2011.

Types of Studies
We chose to include prospective and retrospective cohort studies, case-control studies, and cross-sectional studies. We classified the studies into the following three categories in prioritized order, the first category having the highest weight according to 'levels of evidence' [11]:

Prospective studies
One kind of prospective study was included: studies examining prospectively whether the mean score on the HDRS differed between depressive patients who did and did not attempt or commit suicide during a follow-up period.

Retrospective studies
Two kinds of retrospective studies were included:
1. Studies examining retrospectively if the mean score on the HDRS differed between depressive patients with and without a suicide attempt during the on-going depressive episode.
2. Studies examining retrospectively if the mean score on the HDRS differed between depressive patients with and without a lifetime history of a suicide attempt.
A number of assessment tools exist to assess the risk of bias in observational studies [12].
We assessed all studies according to 'levels of evidence' [11] and the STROBE guidelines for reporting observational studies [12,13].

Outcomes and Statistical Methods
Our only outcome measure was the mean total score on the 17-item HDRS [1] for patients with and without a suicide attempt, regardless of whether the attempt was fatal. The meta-analysis was undertaken according to the recommendations in The Cochrane Handbook for Systematic Reviews of Interventions [8]. We used the mean difference (MD) with a 95% confidence interval [14], and used RevMan version 5.0 for statistical calculations [15]. We reported statistical heterogeneity as I², which describes the percentage of the variability in effect estimates that is due to heterogeneity rather than sampling error (chance) [8].
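The pooling described above can be illustrated with a minimal sketch. It assumes inverse-variance weighting under a fixed-effect model; the text does not state whether a fixed-effect or random-effects model was used, so this is one plausible reading rather than the authors' exact procedure. The function names and the study values in the usage lines are hypothetical and are not data from the included studies.

```python
import math

def mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Mean difference between two independent groups and its standard error."""
    md = m1 - m2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return md, se

def pool_fixed(effects, z_crit=1.96):
    """Inverse-variance fixed-effect pooling of (md, se) pairs.

    Returns the pooled MD, its 95% CI, and I² in percent.
    """
    weights = [1.0 / se ** 2 for _, se in effects]
    total_w = sum(weights)
    pooled = sum(w * md for w, (md, _) in zip(weights, effects)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (md - pooled) ** 2 for w, (md, _) in zip(weights, effects))
    df = len(effects) - 1
    # I² = (Q - df) / Q, truncated at zero
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    ci = (pooled - z_crit * pooled_se, pooled + z_crit * pooled_se)
    return pooled, ci, i2

# Hypothetical illustration only: (mean, SD, n) for attempters vs non-attempters
studies = [
    mean_difference(22.0, 6.0, 30, 20.0, 6.0, 120),
    mean_difference(21.0, 5.0, 25, 20.5, 5.5, 200),
]
pooled_md, ci95, i2_percent = pool_fixed(studies)
```

An I² of 0 means the observed spread between study estimates is no larger than sampling error alone would produce; larger values indicate that a growing share of the variability reflects real differences between studies.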

Prospective studies
We included three prospective cohort studies with a total of 1314 patients [16,17,24]. Schneider et al. followed 280 consecutively admitted patients with a DSM-III-R diagnosis of major depressive disorder [24]. The patients were interviewed with HDRS at admission. Five years later information about suicides was obtained for 99.3% of the patients [24]. The investigators obtained the information about suicides by contacting patients, relatives, friends, general practitioners, psychiatrists, and from medical records [24].
Coryell et al. followed 785 depressive patients up to 20 years and reported baseline HDRS mean values for patients who did or did not commit suicide during the follow-up period [16]. The diagnosis of depression was made by the Research Diagnostic Criteria [25]. The authors searched The National Death Index to determine if a death certificate indicating suicide existed for each individual after the follow-up period.
Holma et al. prospectively followed a cohort of depressed patients for five years and reported 'maximum HDRS score during follow-up' for patients with and without a suicide attempt during the follow-up period [17]. A total of 269 psychiatric patients were diagnosed as having DSM-IV major depressive disorder [26]. Information on 249 of the 269 participants (92.6%) was reported. Occurrence of a suicide attempt during the follow-up period was based on both a clinical interview and psychiatric records.
Meta-analysis of the total HDRS scores from these three prospective studies showed no significant difference between suicide attempters and non-attempters (mean difference of 1.68 HDRS points, with a higher score for patients with suicidal tendencies; 95% CI -0.21 to 3.57; P = 0.08; I² = 36%).

Retrospective studies
We included three studies reporting if a total mean score on the HDRS differed between depressive patients with and without a history of a suicide attempt during the on-going depressive episode [10,18,19]. The patients were not followed prospectively after the HDRS score was assessed. The three studies included a total of 340 depressed patients.
Meta-analysis of the HDRS scores from these three studies showed that depressed patients with a suicide attempt during the on-going depressive episode had a significantly higher score on the HDRS than depressed patients without a suicide attempt during the on-going depressive episode (mean difference 6.31 HDRS points; 95% CI 4.72 to 7.91; P < 0.00001; I² = 0%).
We included four studies reporting if a total mean score on the HDRS differed between depressed patients with and without a lifetime history of a suicide attempt [20][21][22][23]. The four studies included a total of 1033 depressed patients. The meta-analysis of the total HDRS scores from these four studies showed no significant difference between patients with and without a lifetime history of a suicide attempt (mean difference of 0.78 HDRS points, with a higher score for patients with a history of a suicide attempt; 95% CI -0.01 to 1.57; P = 0.05; I² = 56%).
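As a reading aid, the reported P values can be cross-checked against the reported confidence intervals. The sketch below assumes a normal approximation, under which a 95% CI spans the estimate plus or minus 1.96 standard errors; the function name is hypothetical, and the inputs are the pooled estimates quoted in the text.

```python
import math

def z_and_p_from_ci(md, lower, upper, z_crit=1.96):
    """Recover the standard error, z statistic, and two-sided P value
    from a mean difference and its 95% confidence interval."""
    se = (upper - lower) / (2 * z_crit)
    z = md / se
    # Two-sided P under the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Pooled estimates as reported in the text
prospective = z_and_p_from_ci(1.68, -0.21, 3.57)
ongoing_episode = z_and_p_from_ci(6.31, 4.72, 7.91)
lifetime = z_and_p_from_ci(0.78, -0.01, 1.57)
```

Under this approximation the recovered P values round to 0.08 for the prospective studies and 0.05 for the lifetime-history studies, and fall well below 0.00001 for the on-going-episode studies, in line with the figures reported above.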

Risk of Bias
All of the studies were assessed as having high risk of bias (systematic error) [12,13].

DISCUSSION
A total score on the 17-item HDRS does not seem to be significantly associated with future suicide attempts or suicides or with a lifetime history of suicide attempts, although the association between HDRS and a lifetime history of suicide attempts was of borderline significance. In retrospective studies, patients with a suicide attempt during the on-going depressive episode seem to have a higher HDRS score than patients without a suicide attempt during the on-going depressive episode.
Our study has both strengths and limitations. First of all, we published a protocol for the systematic review before the literature search began, and we conducted thorough systematic literature searches in all relevant databases. The studies were included irrespective of language, publication status, publication year, and publication type, enabling a more complete inclusion of relevant studies. All studies were assessed according to the STROBE guidelines [12,13]. We included all observational studies comparing patient groups recruited from the same setting. We did not include studies with a control group recruited from a different setting than the experimental group, because such studies have a higher risk of biased results [27]. We have thereby reduced the risk of biased results in the present review. Furthermore, we followed the MOOSE guidelines for reporting of meta-analyses of observational studies [7]. However, our results are based on a limited number of studies and participants. All of the studies were assessed as having 'high risk of bias', so our results may be questionable. According to levels of evidence [11], prospective studies have more evidential weight than retrospective studies. We identified and included only three prospective studies examining and reporting HDRS baseline scores and prospectively assessing suicidality. The three studies differed in their choice of suicidality outcome measure, length of follow-up, and number and timing of assessments. Two of the studies reported baseline scores on the HDRS and followed the patients for 5 to 20 years regarding suicides [16,24]. It may be difficult for any assessment method to predict risks of suicide attempts so far into the future. However, we believe the results from these two studies should be considered the most clinically valid.
The last of the prospective studies performed repeated HDRS interviews, reported only the highest HDRS score during the five years of follow-up, and related these scores to suicide attempts [17]. In this study a significant association was found between HDRS and suicide attempts, but some of the HDRS scores may have been recorded after a suicide attempt. We performed meta-analysis on the results from the three prospective studies even though a suicide attempt and a suicide clearly differ in severity. This is of course a limitation of our review, but it was necessary in order to be able to meta-analyse, and we chose to report suicide attempts regardless of the outcome of the attempt before embarking on data identification, data extraction, and analysis. It can be argued that it is impossible for any assessment instrument to predict suicide attempts many years in advance (see follow-up periods for prospective studies). We believe that the relatively long follow-up periods in the three prospective studies are an advantage because the relationship between the total HDRS score and both short-term and long-term suicidal tendencies can be assessed. Major depressive disorder is often a chronic illness [28,29], and any long-term association between HDRS and suicide attempts or suicides is important to clarify. Furthermore, if there is a short-term association between HDRS score and suicidal tendencies, this association will be evident only if relatively large studies are conducted. Even with long-term follow-up of 5 to 20 years, only 6.5% of the patients included in the prospective studies had a suicidal event. The proportion was considerably higher (19.6% and 31.5%) in the retrospective studies. The retrospective studies also differed substantially regarding choice of outcome measure, length of follow-up, and assessment method.
The heterogeneity of the included studies is reflected in the statistical heterogeneity, and makes our results more difficult to interpret.
Our results could, in theory, indicate that suicidality is not associated with the severity of the depressive illness. However, the suicide rate in depressed patients is considerable [30], and we believe that depressed patients with suicidal tendencies must, on average, tend to have a more severe depression than non-suicidal patients. If HDRS were a sensitive measure of the severity of depressive symptoms, we would therefore have expected to find a relatively large mean difference between patients with and without suicidal tendencies. We found a significant difference only in the studies examining and reporting retrospectively whether a total mean score on the HDRS differed between depressed patients with and without a suicide attempt during the on-going depressive episode. It is difficult to conclude anything for certain from retrospective studies because the compared patient groups in such studies might differ systematically. Furthermore, when we assess patients with a score we need to gain prognostic information about the future, not about past events that can be identified by asking the patient or relatives or by reading the patient records. Moreover, a suicide attempt is a traumatic life event, and participants with suicidal tendencies might have a relatively high HDRS score because of this. This bias mechanism could also affect the results from one of the prospective studies in which the patients were rated a number of times and only the highest HDRS score during the follow-up period was reported [17].
Assessment of suicidality is complex and an area that needs to be further examined [31,32]. Future studies could assess the association between HDRS and specific suicidality assessment scales (e.g., the Columbia Suicide Severity Rating Scale or the Scale for Suicide Ideation) [31][32][33]. This could also be a future research topic for a systematic review.
According to the studies we could identify, the total score of the HDRS was not able to significantly predict future risk of suicide attempts or suicides. The HDRS therefore does not seem to be a useful screening instrument for assessing risk of suicidal behaviour. Item 3 of the HDRS or other HDRS sub-scores might be associated with suicide attempts or suicides, but we were not able to assess these associations because we did not have individual data from each study. Our results show only the association between the total HDRS score and suicide attempts and suicides. The HDRS was not developed as an assessment method to predict suicidality. However, the HDRS total score is used widely throughout healthcare systems, and we wanted to relate the total score to past or future suicidality to assess this aspect of the HDRS. We believe suicide attempts and suicides are the most important outcomes regarding depressive illnesses. The lack of predictive power of the HDRS could be due to the fact that only a few studies with a limited number of participants have been conducted. We need more studies on the association between HDRS and future suicidality.
Systematic reviews have shown that interventions for depression benefit patients by only a few HDRS points compared with different control interventions, and the clinically significant effects of these interventions have therefore been questioned [3][4][5]. We believe that the results from our present review should further question whether a mean difference of, e.g., three points on the HDRS has clinical significance. The National Institute for Health and Clinical Excellence (NICE) has recommended that a mean difference, between antidepressants and placebo, of three points on the HDRS is needed in order for an intervention to be considered significantly effective for clinical practice [6,34]. HDRS does not seem, on the face of it, to be a sensitive outcome measure of the severity of depression, and HDRS does not seem to be able to discriminate between depressed suicide attempters and non-attempters. Other publications have concluded that the HDRS scale is heterogeneous and that the scale is psychometrically and conceptually flawed [2,25,35]. During the past 40 years the HDRS has been the gold standard for quantifying depressive symptoms in clinical trials [2]. There seems to be a need for other, more clinically relevant assessment methods, such as reporting of suicide attempts, suicides, and adverse events.

CONCLUSIONS
A total score on the HDRS does not seem to be associated with past or future suicide attempts or suicides. However, there might be an association between the total HDRS score and a previous suicide attempt during the on-going depressive episode. There seems to be a need for other assessment tools to predict or explain risks of suicidality.