Differential Item Functioning of the Arabic Version of the Depression Anxiety Stress Scale-21 (DASS-21)

Introduction
Symptoms of depression and anxiety are widespread, chronic, and recurrent, and recovery from them is poor. They are associated with decreased work productivity, dissatisfaction in social relations, low quality of life, and high risk of suicide [1-3]. Negative emotions play a major role in the development and continuation of substance use disorders (SUDs) [4]. Individuals turn to illicit drugs to regulate their emotions, i.e., to cope with depression, anxiety, anger, and trauma, and to cover up their poor social skills [5,6]. Evidence indicates that adolescents with both recent and chronic depression and anxiety are more likely to use illicit drugs in the future [7,8]. Correspondingly, use of illicit drugs in adolescence is associated with higher psychological distress and continued drug use in adulthood [9]. Furthermore, withdrawal symptoms, intensity and frequency of craving, failure of pharmacotherapy, and relapse increase during emotional distress, even during and after treatment [10,11]. Conversely, a drop in negative emotions during treatment is associated with less drug use [12].
Early identification of youth with high emotional negativity is necessary in order to provide timely protective measures that prevent the future onset of debilitating, life-long mental problems such as drug use [2]. Similarly, it is crucial to address negative affect among drug users so as to enhance treatment outcomes and lower relapse [11]. Symptoms of depression and anxiety can go unnoticed; therefore, rigorous measures are needed to identify those with high levels of psychological distress [13]. Several measures of depression and anxiety exist. However, the overlap between the two sets of symptoms casts doubt on the discriminant validity of standalone depression or anxiety measures [14]. The Depression Anxiety Stress Scale (DASS) was designed to minimize measurement overlap by measuring the distinct features of depression and anxiety [15]. Its three subscales depict symptoms of depression, anxiety, and stress (which is common to both depression and anxiety) [16]. Hence, it measures a distinct syndrome: emotional distress [17,18]. It is broadly used to assess symptom severity and response to treatment [14,15]. The full version has sound psychometric qualities, but reports on the psychometrics of the 21-item DASS are inconsistent, especially for translated versions [15,19-21].
Validation testing addresses the implications of scores with respect to the presumed underlying latent trait [22]. DASS scores will be less useful if people respond to the items on the basis of factors that the scores are not intended to reflect, such as age and gender [13,23]. Differential item functioning (DIF) means that individuals in different subgroups defined by such extraneous factors respond differently to a given question despite having the same level of the latent trait [24]. For example, at high levels of anxiety, those without obstructive sleep apnea (OSA) were more likely to endorse 'dry mouth' and 'heart beat without effort' than those with OSA, i.e., both items caused overestimation of the latent trait [25]. This is known as uniform DIF: the between-group difference is the same at all levels of the trait. Non-uniform DIF occurs when the between-group difference in the likelihood of endorsing an item is inconsistent across levels of the latent trait [26].
Although the Arabic DASS-42 was tested in Australia among Arab immigrants of different nationalities [27], psychometric evaluations of the Arabic DASS-21 are lacking. To bridge that gap, we report here the results of a differential item functioning analysis of the DASS-21 in a sample of Egyptian drug users. In this study, we assume that all participants rate DASS-21 items according to their corresponding level of the latent trait (emotional distress). We hypothesize that items do not show DIF between different subgroups.

Material and Methods
This study involves a secondary analysis of a dataset collected from 149 Egyptian inpatient drug users. A description of the procedure is available elsewhere [28]. The original study was approved by the Alexandria University board of research ethics.

Instrument
The Depression Anxiety Stress Scale-21 (DASS-21) [29] consists of 3 subscales, each comprising 7 items. The subscales assess depressive symptoms (e.g., life was meaningless), anxiety symptoms (e.g., finding it difficult to relax), and general stress symptoms (e.g., feeling rather touchy). Items are rated on a 4-point scale (0 = did not apply to me at all, 3 = applied to me most of the time). Higher scores signify more severe psychological distress.

Statistical Analysis
We examined the DASS-21 for differential item functioning (DIF) by age, gender, marital status, education, employment, income, chronicity, length of hospital stay, and history of mental illness. For this purpose, we used a macro developed by IBM that performs a set of 3 ordinal logistic regressions for each item [30]. In the first step, each item's score was regressed on the DASS-21 total score. In the second step, the group variable (see Table 1 for subgroups) was added to the model. In the third step, the regression equation also included a term for the interaction between the group variable and the DASS-21 total score. The criterion of Jodoin & Gierl [31] was then used to detect DIF. Specifically, DIF was judged to be present if an item had a statistically significant χ2 statistic and an effect size (ES) of at least 0.035. Effect sizes were computed by subtracting the pseudo-R2 obtained in the first step from that obtained in the third step [31]. Data were analyzed in SPSS version 22, and the level of significance was set to 0.05 (two-tailed).
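The decision rule described above can be sketched in a few lines of Python. This is only an illustration, not the IBM macro used in the study: it assumes the pseudo-R2 values and the χ2 p-value for each item have already been obtained from the regressions, the names `ItemResult` and `classify_dif` are hypothetical, and the 0.070 cut-off for "large" DIF is an assumption taken from the commonly cited Jodoin & Gierl bands rather than from this paper. The numbers in the usage example are made up.

```python
from dataclasses import dataclass

ALPHA = 0.05          # significance level stated in the text (two-tailed)
MODERATE_CUT = 0.035  # minimum effect size for DIF stated in the text
LARGE_CUT = 0.070     # ASSUMPTION: lower bound of "large" DIF, not given in the text


@dataclass
class ItemResult:
    """Hypothetical container for one item tested against one grouping variable."""
    item: int
    r2_step1: float  # pseudo-R2 of step 1 (item regressed on total score)
    r2_step3: float  # pseudo-R2 of step 3 (total score, group, and interaction)
    p_value: float   # p-value of the chi-square test for the added terms


def effect_size(res: ItemResult) -> float:
    """Effect size = pseudo-R2 of step 3 minus pseudo-R2 of step 1."""
    return res.r2_step3 - res.r2_step1


def classify_dif(res: ItemResult) -> str:
    """Label an item as 'none', 'negligible', 'moderate', or 'large' DIF."""
    if res.p_value >= ALPHA:
        return "none"          # chi-square test not significant
    es = effect_size(res)
    if es < MODERATE_CUT:
        return "negligible"    # significant, but below the 0.035 criterion
    return "moderate" if es < LARGE_CUT else "large"


if __name__ == "__main__":
    # Made-up values for illustration only; they are not results from the study.
    examples = [
        ItemResult(item=1, r2_step1=0.30, r2_step3=0.40, p_value=0.20),
        ItemResult(item=2, r2_step1=0.30, r2_step3=0.31, p_value=0.01),
        ItemResult(item=5, r2_step1=0.40, r2_step3=0.45, p_value=0.049),
    ]
    for res in examples:
        print(res.item, round(effect_size(res), 3), classify_dif(res))
```

Keeping the significance test and the effect-size threshold as two separate checks mirrors the text: an item can reach statistical significance and still be classed as having negligible DIF.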

Participants' characteristics
The majority of the participants were men (95.3%). The mean age was 32.5 years (SD = 6.8 years, age range: 19-60 years).
Participants reported abuse of several drugs; however, heroin, pharmaceutical agents, and cannabis were the most frequently abused drugs (80.5%, 79.2%, and 75.2%, respectively) (see Table 1 for sociodemographic and clinical characteristics of the participants).

Differential item functioning
Table 2 shows that more than half the items exhibited some DIF. Items 2 and 5 had uniform DIF by age; participants younger than 30 years were significantly more likely to endorse item 2. Meanwhile, items 3, 14, and 15 had uniform DIF by gender. Items 2, 7, and 20 had uniform DIF by marital status. It was significantly easier for widowed and divorced participants to endorse items 2 and 16, whereas married and single participants were significantly more likely to endorse item 10. Items 6 and 13 significantly inflated the DASS-21 total score by chronicity. The effect size was negligible for item 2 by education, item 12 by income, and item 21 by history of mental illness. Item 6 had uniform DIF by length of hospital stay. Item 5 showed a marginally significant interaction effect among those with a hospital stay of less than 2 weeks; however, the effect size was moderate (0.050). Meanwhile, it was significantly easier for those hospitalized for 2-4 weeks to endorse item 21. Overall, item 2 exhibited more DIF than the other items.

Discussion
This study is an initial attempt to examine the psychometric potential of the Arabic DASS-21. More than half the items functioned differently across subgroups. Such DIF indicates that the DASS-21 is subject to systematic measurement error. The main effect was inflated for some items among some groups.
The literature reports conflicting findings on the psychometric properties of the DASS-21 [19,32-35]. Consistent with the current study, nationality affected the performance of DASS-21 items: the Chinese version differentiated less well between depression and anxiety than the English scale. Further, its discrimination was lower among Australian-Chinese samples than among Chinese samples in Hong Kong [36]. Likewise, Shea et al. [2] reported DIF by gender for items 13 and 16 among adults attending a stress resilience program. Similarly, another study reported some DIF in all the DASS-21 subscales; 2 items on the anxiety subscale significantly lowered the severity scores of patients with obstructive sleep apnea [25].
The higher incidence of DIF in our study may be because we examined both uniform and non-uniform DIF across 8 grouping variables, whereas the other studies usually described uniform DIF only and used two or three grouping variables. In addition, the samples in those studies were different (English-speaking). The large number of items with DIF indicates scale errors that were not caused by chance. Extraneous factors such as gender were a source of bias that caused these items to fail to reflect the latent trait accurately.
When DIF is present, corrective action should follow (e.g., removal of problematic items). Considering that 50% of the items had DIF, all these problematic items need further revision. We recommend that future studies examine the DASS-21 in larger and more diverse samples, using robust techniques such as item response theory (IRT), to support decisions about eliminating dysfunctional items. This study has some limitations. Data were prone to self-report bias. The sample was small, participants' education was low, and those who participated had a longer hospital stay. In addition, only 7 women participated in the study, although the literature supports an association between emotional negativity and gender.