Psychometric Properties of the Concise Associated Symptom Tracking Scale and Validation of Clinical Utility in the EMBARC Study

Objective: The authors aimed to evaluate psychometric properties of the Concise Associated Symptom Tracking (CAST) Scale and validate the clinical utility of measuring irritability by updating and replicating a previously published outcome calculator from the Combining Medications to Enhance Depression Outcomes (CO‐MED) trial.
Methods: Participants were 292 adults from the Establishing Moderators and Biosignatures of Antidepressant Response in Clinical Care (EMBARC) study who had completed the CAST scale at baseline. The scale's five‐domain (irritability, anxiety, mania, insomnia, and panic) structure was evaluated with confirmatory factor analysis. Correlations with other clinical measures were used to confirm convergent and divergent validity. Logistic regression analyses from CO‐MED were used to estimate individual outcomes in EMBARC.
Results: Cronbach's alpha for the CAST scale was 0.78. Model fit for the five‐domain structure was adequate (goodness of fit index=0.93, comparative fit index=0.92, root mean square error of approximation=0.06). Scores on irritability, anxiety, panic, insomnia, and mania were correlated with scores on the Anger Attack Questionnaire irritability item (rs=0.50), Hamilton Rating Scale for Depression anxiety subscale (rs=0.24), Mood and Anxiety Symptoms Questionnaire anxious arousal scale (rs=0.44), Quick Inventory of Depressive Symptomatology Self‐Report insomnia items (rs=0.38), and Altman Self‐Rating Mania Scale (rs=0.39), respectively. Individual outcomes of remission (area under the curve [AUC]=0.805) and no meaningful benefit (AUC=0.779) were predicted with high accuracy among EMBARC participants using their baseline and week 4 scores for depression and irritability and model estimates from CO‐MED.
Conclusions: Measuring irritability may help predict clinical course. The CAST scale is a valid measure of depression‐associated symptoms, including irritability.

Patients diagnosed as having major depressive disorder experience a range of symptoms and functional impairments (1-5). Irritability in particular remains understudied among adults with the disorder (6,7). Although irritability is reported by more than half of adult patients with major depression (6,7), it is neither included as a diagnostic criterion in the DSM-5 (8) nor assessed by commonly used measures of depression severity (9,10). Recently, the Concise Associated Symptom Tracking (CAST) Scale (11) was used to demonstrate the clinical utility of measuring irritability (12). That research found that irritability improved early (from baseline to week 4) with antidepressant treatment and that early improvement predicted higher rates of remission (no or minimal depression) and lower rates of no meaningful benefit (<30% reduction in depression) at week 8, independent of changes in depression severity (12). Finally, that research used baseline and week 4 measures of irritability and depression to develop an interactive calculator in one sample (Combining Medications to Enhance Depression Outcomes [CO-MED]) and to validate it in a separate sample of outpatients with major depression (Suicide Assessment and Methodology Study [SAMS]) (12).
In the present study, we sought to extend these previous findings by evaluating the CAST scale's psychometric properties and by validating its clinical utility in an unrelated sample of outpatients with major depression from the Establishing Moderators and Biosignatures of Antidepressant Response for Clinical Care for Depression (EMBARC) study. To test the CAST scale's psychometric properties, we evaluated its five-domain (anxiety, irritability, mania, panic, and insomnia) structure with confirmatory factor analysis, measured internal consistency with Cronbach's alpha coefficient, and assessed construct validity through correlation of the CAST domains with other clinical assessments at baseline. We then validated the CAST scale's clinical utility in measuring irritability as a symptom of major depression by updating the previously published CO-MED calculator (12) and testing the accuracy of the calculator in predicting individual-level outcomes in a separate sample (EMBARC) of adult outpatients with major depression.
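For readers less familiar with the internal-consistency statistic mentioned above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = (k/(k-1)) × (1 - sum of item variances / variance of total scores). The function name and the toy responses are illustrative assumptions and are not taken from the EMBARC data or from the authors' SAS code.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Sample variance of each item across respondents
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up responses (4 respondents x 3 items), for illustration only
toy_responses = [[1, 2, 2], [2, 3, 3], [3, 3, 4], [4, 5, 5]]
toy_alpha = cronbach_alpha(toy_responses)
```

Values closer to 1 indicate that the items vary together; the toy data here are deliberately highly consistent and say nothing about the CAST scale itself.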

Study Design and Participants
EMBARC study. As previously described (13,14), the EMBARC study (NCT01407094) enrolled 309 participants with major depressive disorder at four sites. Of these participants, 10 were excluded because they were part of a feasibility sample, and three were randomly assigned but were then found ineligible for the study (13). Of the 296 participants randomly assigned to receive either sertraline or placebo, four did not complete the CAST scale at baseline. Thus, the modified intent-to-treat sample for the present study consisted of 292 participants with major depressive disorder. Institutional review boards at each site approved the EMBARC study, and all participants provided written informed consent prior to beginning any study-related procedures. Inclusion and exclusion criteria for the EMBARC study have been described (13) in detail (https://clinicaltrials.gov/ct2/show/NCT01407094). Briefly, EMBARC participants were ages 18-65, met criteria for a current episode of major depressive disorder on the Structured Clinical Interview for DSM-IV Axis I Disorders, scored ≥14 on the 16-item Quick Inventory of Depressive Symptomatology Self-Report (QIDS-SR) at both screening and randomization visits, did not meet criteria for a failed antidepressant trial during the current episode as measured by the Massachusetts General Hospital Antidepressant Treatment Response Questionnaire (15), and agreed to and were eligible for all biomarker procedures (electroencephalography, psychological testing, magnetic resonance imaging, and blood draws).
Participants were excluded if they did not tolerate sertraline or bupropion in the past; were pregnant, breastfeeding, or planning to become pregnant; were medically or psychiatrically unstable; had ever met criteria for psychotic and/or bipolar disorder; had experienced substance abuse in the past 2 months or substance dependence in the past 6 months; or were taking prohibited concomitant medications (antipsychotics, anticonvulsants, mood stabilizers, central nervous system stimulants, daily use of benzodiazepines or hypnotics, or antidepressants).

CO-MED trial.
For the present study, we used data from the CO-MED trial participants to update the previously published logistic regression analyses of remission and no meaningful benefit as clinical outcomes (12). The CO-MED trial (16) recruited treatment-seeking outpatients ages 18-75 from six primary and nine psychiatric sites. Participants (N=665) had major depressive disorder with at least moderately severe (score of ≥16 on the 17-item Hamilton Depression Rating Scale [HAMD-17]) nonpsychotic chronic or recurrent depression. All participants provided written informed consent, and institutional review board approval was granted from each participating site (16). Detailed eligibility criteria have been reported (16) and are available on the Internet (https://clinicaltrials.gov/ct2/show/NCT00590863). At baseline, participants were randomly assigned to treatment with either escitalopram plus placebo, sustained-release bupropion plus escitalopram, or extended-release venlafaxine plus mirtazapine. Postrandomization visits were conducted at weeks 1, 2, 4, 6, 8, 10, and 12 for the acute phase and at weeks 16, 20, 24, and 28 for the continuation phase.

Measurements
HAMD-17. Clinicians conducted the structured interview (17) for HAMD-17 to assess depression severity of patients at each visit of the EMBARC study. Previous reports have found concurrent validity between the HAMD-17 and other measures of depression severity, such as the 30-item Inventory of Depressive Symptomatology-Clinician Rated (IDS-C) (18). Six items of the HAMD-17 (psychic anxiety, somatic anxiety, gastrointestinal somatic symptoms, general somatic symptoms, hypochondriasis, and insight) (19,20) have been used to establish an anxiety subscale.
Quick Inventory of Depressive Symptomatology Self-Report (QIDS-SR). The 16 items (each scored from 0 to 3) of the QIDS-SR are based on the nine symptom domains of major depressive disorder (10). Total scores for this tool range from 0 to 27. The QIDS-SR correlates highly with the HAMD-17 (r=0.86-0.93) and has high inter-item correlation (Cronbach's α=0.86-0.87) (10). Because the first three items of the QIDS-SR assess insomnia, we combined them to assess severity of insomnia for the present study. Participants completed the QIDS-SR only during screening and at the baseline visit of EMBARC.

Altman Self-Rating Mania Scale (ASRM).
The ASRM is a five-item self-reported scale designed to evaluate for the presence and severity of manic and hypomanic symptoms over the past 7 days. Each item consists of five possible responses, with scores ranging from 0 to 4. Item scores are added for a total score; 0 is the lowest possible score and 20 is the maximum possible (22).

Mood and Anxiety Symptoms Questionnaire (MASQ).
The 30-item short-form adaptation of the MASQ (23) was used to assess the participants' negative affect, positive affect, and somatic arousal. Each item of the MASQ covers a recall period of 1 week and is rated on a 5-point Likert scale from 1, not at all, to 5, extremely. Cronbach's alpha for the 30-item MASQ in a previous study ranged from 0.85 to 0.95 (23). Furthermore, factor analyses have confirmed the 30-item MASQ's three-factor structure with the following three scales: general distress, anhedonic depression, and anxious arousal (23).
Anger attacks question. At the baseline visit of the EMBARC study, participants completed the Massachusetts General Hospital Anger Attack Questionnaire (AAQ) (24). We used the responses to the first item of the AAQ, "Over the past six months, have you felt irritable or easily angered," to test convergent validity of the irritability domain.

Adaptation of the CO-MED Outcome Calculator
In the CO-MED trial, separate logistic regression analyses were used to predict individual outcomes of remission (no or minimal depression) and no meaningful benefit (<30% reduction from baseline) at week 8 by using scores for depression (QIDS-C) and irritability (CAST-IRR [i.e., the CAST scale's irritability domain]) at baseline and week 4 (12). Model β estimates from these logistic regression analyses were then used in a separate sample (the SAMS study) of outpatients with depression to estimate individual-level probability of remission or no meaningful benefit to build an interactive calculator (12). For the present report, we had to update the logistic regression analyses used in the CO-MED trial, because measures of depression severity differed between EMBARC (HAMD-17) (13) and CO-MED (IDS-C and QIDS-SR). We used a formula by Vittengl et al. (18) to convert IDS-C scores to HAMD-17 scores: HAMD-17=0.11+0.53×(IDS-C). (The results of our logistic regression analyses, which used converted HAMD-17 scores as the measure of depression severity and CAST-IRR as the measure of irritability, are available in the online supplement to this article.)
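The score conversion described in this section is a simple linear transformation, HAMD-17 = 0.11 + 0.53 × (IDS-C). A minimal sketch follows; the function name and the example input are illustrative assumptions, and no rounding is applied because the text does not say whether converted scores were rounded.

```python
def ids_c_to_hamd17(ids_c_total):
    """Approximate HAMD-17 total from an IDS-C total: 0.11 + 0.53 * IDS-C."""
    return 0.11 + 0.53 * ids_c_total


# Illustrative input only: an IDS-C total of 30 maps to about 16.01
example_hamd17 = ids_c_to_hamd17(30)
```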

Statistical Analysis
As previously mentioned, the modified intent-to-treat sample for the present study consisted of all EMBARC study participants who were randomly assigned to receive sertraline or placebo and who had completed the CAST scale at baseline of EMBARC (N=292). Psychometric properties of the CAST scale were validated with EMBARC participants only. We used a confirmatory factor analysis implemented in PROC CALIS in SAS to validate the scale's five-domain structure.
We defined acceptable model fit a priori as a goodness-of-fit index ≥0.90, comparative fit index ≥0.90, and root mean square error of approximation ≤0.08 (25). We estimated the Pearson product-moment correlation coefficient (r) to evaluate association among the scale's five domains. We used separate item response theory (IRT) analyses, based on a graded response model (26) for each domain, to evaluate the performance of individual items. The slope for each item provides an estimate of that item's ability to discriminate between differences in levels of specific domains, and the thresholds indicate the item's sensitivity at different levels.
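The a priori fit criteria above amount to three threshold checks. The sketch below encodes them and applies them to the fit indices reported for the five-domain model (0.93, 0.92, and 0.06); the function name is an illustrative assumption, and the fit indices themselves would come from the factor-analysis software, not from this code.

```python
def meets_fit_criteria(gfi, cfi, rmsea):
    """Check each a priori criterion: GFI >= 0.90, CFI >= 0.90, RMSEA <= 0.08."""
    checks = {
        "gfi": gfi >= 0.90,
        "cfi": cfi >= 0.90,
        "rmsea": rmsea <= 0.08,
    }
    return checks, all(checks.values())


# Fit indices reported for the CAST five-domain structure
fit_checks, fit_acceptable = meets_fit_criteria(gfi=0.93, cfi=0.92, rmsea=0.06)
```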
To validate the scale's clinical utility, we updated previously described logistic regression analyses from the CO-MED trial, which had remission and no meaningful benefit at week 8 as outcomes (12), by replacing QIDS-C with computed HAMD-17, using the formula HAMD-17=0.11+0.53×(IDS-C) per Vittengl et al. (18). For EMBARC study participants with HAMD-17 scores available at week 8 (N=240), remission and no meaningful benefit at week 8 were defined as HAMD-17 ≤7 and <30% reduction in HAMD-17 from baseline, respectively. By using the intercept and β estimates from updated logistic regression analyses of the CO-MED trial and baseline and baseline-to-week 4 percentage changes in CAST-IRR and HAMD-17 scores from the EMBARC study, we estimated individual probabilities of remission and no meaningful benefit for the present study. We then calculated individual-level probability (p) of remission and no meaningful benefit among our EMBARC participants by multiplying the β estimates obtained from the CO-MED trial with the observed scores in the EMBARC study to solve the following equation: log(p/(1−p)) = intercept + β[baseline depression from CO-MED] × (baseline depression in EMBARC) + β[baseline irritability from CO-MED] × (baseline irritability in EMBARC) + β[percent change in depression from CO-MED] × (percent change in depression in EMBARC) + β[percent change in irritability from CO-MED] × (percent change in irritability in EMBARC). We then compared these estimated probabilities with observed occurrence of these outcomes by using receiver operating characteristic (ROC) curves. We conducted all analyses with SAS, version 9.4; threshold of significance was set at p<0.05.
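The calculation described in this paragraph can be sketched as follows. This is an illustration under stated assumptions, not the authors' SAS implementation: the intercept and β values are hypothetical placeholders (the actual estimates are in the online supplement), the function and key names are invented for this example, and percent change is assumed to be computed as 100 × (week 4 − baseline) / baseline, a sign convention the text does not specify.

```python
import math


def percent_change(baseline, week4):
    # Assumed convention: negative values indicate improvement
    return 100.0 * (week4 - baseline) / baseline


def predicted_probability(intercept, betas, baseline_dep, baseline_irr, pct_dep, pct_irr):
    """Solve log(p / (1 - p)) = intercept + sum(beta * predictor) for p."""
    logit = (
        intercept
        + betas["baseline_dep"] * baseline_dep
        + betas["baseline_irr"] * baseline_irr
        + betas["pct_dep"] * pct_dep
        + betas["pct_irr"] * pct_irr
    )
    return 1.0 / (1.0 + math.exp(-logit))


def observed_outcomes(hamd_baseline, hamd_week8):
    """Week 8 outcomes as defined in the text (HAMD-17 <= 7; < 30% reduction)."""
    reduction = 100.0 * (hamd_baseline - hamd_week8) / hamd_baseline
    return {"remission": hamd_week8 <= 7, "no_meaningful_benefit": reduction < 30.0}


# Hypothetical coefficients, NOT the CO-MED estimates
hypothetical_betas = {"baseline_dep": -0.05, "baseline_irr": -0.02,
                      "pct_dep": -0.04, "pct_irr": -0.01}
p_remission = predicted_probability(
    intercept=-1.0, betas=hypothetical_betas,
    baseline_dep=20, baseline_irr=10,
    pct_dep=percent_change(20, 10), pct_irr=percent_change(10, 6),
)
```

With real coefficients, one such probability would be computed per participant and per outcome, then compared with the observed week 8 outcomes.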

Validation of the CAST Scale's Psychometric Properties
Five-domain structure. In confirmatory factor analyses, goodness of fit, comparative fit index, and root mean square error of approximation were 0.93, 0.92, and 0.06, respectively, for the CAST scale's five-domain structure. Because all three a priori defined criteria were met, the model fit was deemed acceptable. The standardized factor loadings for the anxiety, irritability, mania, insomnia, and panic domains ranged from 0.36 to 0.85, 0.48 to 0.85, 0.45 to 0.78, 0.62 to 0.68, and 0.61 to 0.93, respectively (Table 2). The anxiety domain was moderately correlated with the other domains (r=0.30-0.43). The irritability domain was associated only with anxiety (r=0.43) and panic (r=0.34). (See Table 2 for correlations among the five domains.)
IRT analyses. In polychoric correlation matrices from IRT analyses, only the first factor of each domain had an eigenvalue exceeding 1.00, supporting the unidimensionality of each domain. The eigenvalues of the first factors of the anxiety, irritability, mania, panic, and insomnia domains were 1.95, 2.86, 2.47, 1.66, and 1.50, respectively. Furthermore, for each domain, the slopes of all items exceeded 1.0 (except for the first item of the anxiety domain), indicating that these items provided adequate discrimination. Table 3 presents the item slopes and thresholds of difficulty for each domain.
Construct validity. The irritability, anxiety, insomnia, and panic domains were positively, albeit modestly, correlated with measures of overall depression severity, namely the QIDS-SR (rs=0.17-0.24) and HAMD-17 (rs=0.11-0.34). The mania domain was not significantly correlated with these measures of depression severity (Table 4). The anxiety

Validation of Clinical Utility
By using the baseline scores of the HAMD-17 and CAST-IRR, along with the baseline-to-week-4 changes in these measures among our EMBARC study participants, and model estimates from the logistic regression models (see online supplement) in the CO-MED trial (12), we found that the accuracy of individual-level prediction of remission (area under the curve [AUC]=0.805) and no meaningful benefit (AUC=0.779) in the EMBARC study was similar to that in the CO-MED trial (remission AUC=0.804; no meaningful benefit AUC=0.764) (Figure 1).
This finding provides validation of the CO-MED calculator in an independent sample.
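As a reminder of what the AUC values summarize, the area under the ROC curve equals the probability that a randomly chosen participant who had the outcome received a higher predicted probability than a randomly chosen participant who did not (ties counted as one half). The sketch below computes it by direct pairwise comparison; the function name and the toy labels and scores are illustrative assumptions, not EMBARC data.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Each pair contributes 1 if the positive case scores higher, 0.5 on a tie
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up outcomes (1 = remission observed) and predicted probabilities
toy_labels = [1, 0, 1, 0, 1]
toy_scores = [0.9, 0.3, 0.6, 0.7, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)  # 5 of 6 pairs ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.805 and 0.779 should be read.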

DISCUSSION
In this study of a large sample of adult outpatients with major depressive disorder, we found confirmatory evidence for the psychometric properties and five-domain structure of the 16-item CAST scale. Furthermore, we extended the clinical utility of CAST as a measure of irritability by updating and validating a previously reported individual-level calculator for prediction of acute-phase treatment outcomes of remission and no meaningful benefit. Our findings are consistent with previous studies that have found the CAST scale to have sound psychometric properties (11,21). Additionally, consistent with previous reports, we found moderate association between measures of irritability and anxiety (2,30,31). Future studies are needed to identify the shared versus unique components of these domains as well as the overlap between self-reported symptoms of irritability and overt behavior, such as anger attacks (32). Strengths of this study include validity of the CAST scale's irritability domain as a self-report measure of irritability and replication of its clinical utility by prediction of . The resultant estimated probabilities were then compared with the observed occurrence of these outcomes by using receiver operating characteristic curves. The areas under the curve (AUCs) were comparable for CO-MED and EMBARC.