Validation of the Davidson Trauma Scale in its original and a new shorter version in people exposed to the F-27 earthquake in Chile

Background On February 27, 2010 (F-27), an earthquake and tsunami struck Chile, with a significant impact on the mental health of the population and an increase in cases of post-traumatic stress disorder (PTSD). Objectives Within this context, the Davidson Trauma Scale (DTS) was validated for the first time in Chile using three samples (each consisting of 200 participants), two of them drawn at random from the Chilean population. Results Reliability analyses (i.e., α=0.933), concurrent validity (63% of the items are significantly correlated with the criterion variable “degree of damage to home”) and construct validity (i.e., CMIN = 3.754, RMSEA = 0.118, NFI = 0.808, CFI = 0.850 and PNFI = 0.689) indicate fair to good validity for the DTS. However, a new short version of the scale (DTS-SF), created using the items with the highest factor loadings, showed better fit (CMIN = 2.170, RMSEA = 0.077, NFI = 0.935, CFI = 0.963, PNFI = 0.697). Discussion Finally, the usefulness of DTS and DTS-SF is discussed, the latter being briefer, valid and having better psychometric characteristics.

On February 27, 2010 (F-27), an 8.8 Richter scale earthquake occurred, the sixth most powerful seismic movement recorded worldwide since 1900 (USGS, 2013). Later, a tsunami devastated several cities and towns along 300 km of the central coast, between the cities of Constitucion and Talcahuano (PAHO, 2010). Moreover, almost 3 million people were heavily affected by F-27 in terms of damage to their cultural heritage, education systems, and homes, and loss of human life (PAHO, 2010).
Based on this information, we can say that an earthquake and tsunami like those of F-27 are events with a very high impact on people, regardless of their ethnicity or socio-economic situation. Several studies have shown that F-27 had significant negative effects on the health of the inhabitants of central Chile (Figueroa, González, & Torres, 2010; Leiva, 2010; Leiva & Quintana, 2010; Mendez, Leiva, Bustos, Ramos, & Moyano-Díaz, 2010; MIDEPLAN, 2011; ONEMI, 2010). Therefore, valid diagnostic tools and effective methods to quantify these effects are very important, especially in order to evaluate the most important mental health problem after a disaster: post-traumatic stress disorder (PTSD; APA, 2005; Rodríguez, Zaccarelli, & Pérez, 2006; Solvason, Ernst, & Roth, 2003).
Specifically, PTSD is an anxiety disorder that can arise after direct or indirect exposure (hearing stories, seeing pictures or movies) to extremely stressful and traumatic events (e.g., F-27). The traumatic event is re-experienced through rumination and uncontrollable, distressing memories or dreams, accompanied by images, thoughts, or perceptions. This produces intense distress associated with persistent avoidance of what was experienced, numbing (listlessness), behavioral activation and physiological responses. These responses appear especially when the person is exposed to internal or external cues that symbolize an aspect of the traumatic event. Some symptoms of PTSD are related to insomnia, inability to focus attention, irritability, anger, hyper-vigilance, and exaggerated startle response. These changes may last longer than 1 month and cause clinically significant distress or impairment in social, occupational, or other important areas of functioning (Lopez-Ibor & Valdes, 2008). Note that these symptoms do not always occur immediately after a disaster. The symptoms should appear at least 1 month after an event (APA, 2005). Some people affected by PTSD improve over time, while others may maintain the disorder for 4 years or more (Goenjian et al., 2000; Priebe et al., 2009).
The prevalence of PTSD after disasters may range between 10 and 30% (Başoglu, Kılıç, Şalcıoglu, & Livanou, 2004; Başoǧlu, Salcioǧlu, & Livanou, 2005; Bland et al., 2005; Bulut, 2006; Cairo, Dutta, & Nawaz, 2010; Lai, Chang, Connor, Lee, & Davidson, 2004; McMillen, North, & Smith, 2000; Sharan, Chaudhary, Kavathekar, & Saxena, 1996). In Chile, after F-27, the prevalence of PTSD was 12%: 6% for men and 15% for women (MIDEPLAN, 2011). Also, Leiva-Bianchi (2011) indicates that after 6 months, 36% of the inhabitants of Constitucion (a city significantly affected by the earthquake and tsunami) would be affected by post-disaster stress, a type of PTSD that includes symptoms of depression and altered daily functioning (Norris, Hamblen, Brown, & Schinka, 2008). Furthermore, it is expected that between 10 and 20% of health care personnel will have symptoms of PTSD, as will between 30 and 40% of people living in camps after losing their homes in F-27 (Figueroa et al., 2010). Damage to the home, therefore, is a highly relevant variable in terms of event exposure. More symptoms of PTSD are observed in low-income groups because they experience a larger material impact (damage to their homes and loss of personal belongings) and because they have fewer resources with which to receive treatment (MIDEPLAN, 2011).
In regard to its measurement, there are a number of relatively short (screening) scales for performing a quick diagnosis of PTSD. One of the most common is the "Davidson Trauma Scale" (DTS). Davidson et al. (1997) proposed a scale composed of 17 items, each one related to DSM-IV symptoms. Regardless of the participants and their cultural characteristics, the DTS has shown both construct validity (generally in terms of three or four factors) and convergent validity (with other PTSD measurements), as well as very good internal consistency and test-retest reliability (Bobes et al., 2000; Chen, Lin, Tang, Shen, & Lu, 2001; Davidson et al., 1997; McDonald, Beckham, Morey, & Calhoun, 2009; Villafañe, Milanesio, Marcellino, & Amodei, 2003). For example, Villafañe and his collaborators (2003) indicate that DTS has very high reliability (α = 0.890) and a structure composed of four factors very consistent with the original structure and other previous validity studies. It is probably for these reasons that DTS is widely used after potentially traumatic events.
However, DTS has some problems. Each of the 17 items requires two answers, one evaluating frequency and the other evaluating intensity, which could cause confusion and fatigue in people who suffer from PTSD. Although this has not caused problems in scale validity (Chen et al., 2001), it could be a practical problem when applying DTS to people who have experienced potentially traumatic events and who are not emotionally prepared to respond to very long instruments. Another problem is that DTS has not been validated in a Chilean context, although it has been applied on at least two occasions, both times after F-27: 3 months later as part of the Post Earthquake Questionnaire applied together with the National Socio-Economic Characterization Questionnaire (Encuesta de Caracterizacion Socioeconomica Nacional, CASEN) of the Social Development Ministry (MIDEPLAN, 2011); and 7 months later as a criterion measure in order to validate the post-disaster stress scale (SPRINT-E; Leiva-Bianchi & Gallardo, 2013). In neither of the two cases were DTS reliability or validity indicators reported. We conducted this research in order to validate the DTS for the first time in Chile, taking data from both studies.

Method
Sample and procedure
We selected three samples of 200 participants each, belonging to two different databases. We chose 200 participants per sample given that it is the maximum limit recommended with which to carry out scale validation (Barret, 2007; Hair, Anderson, Tatham, & Black, 2004). Participants of the first and second samples belong to two regions particularly affected by F-27 (21 and 20% belong to the Metropolitan Region, and 79 and 80% to the Maule Region in both samples, respectively). The samples were selected randomly from the 2010 CASEN Post Earthquake Questionnaire database. This survey is unique in that its sample was selected from a representative subsample of the households interviewed. The selection was performed via a random sample stratified by sections and carried out in two phases. The sample was based on close to 27,000 participants (interviewed directly) throughout the country. Such a size provides for a margin of error of no more than 8% in all regions and provinces affected by the earthquake and avoids the problem of non-response (MIDEPLAN, 2011). Such rigor in sampling is not common in scale validation studies, and even less so in samples of people affected by the same stressful and potentially traumatic event.
The third sample corresponds to 200 participants from the Metropolitan (21%) and Maule (79%) Regions. This time, it was a non-probability convenience sampling.
Similar to the two previous samples, we were interested in groups of people belonging to regions affected by F-27, although in different degrees. These people were interviewed 7 months after the earthquake occurred.

Davidson Trauma Scale
We used the DTS version validated in an Argentinean population, which has a good reliability index and good construct validity (Villafañe et al., 2003). Items are classified according to DSM-IV criteria for PTSD diagnosis: Criterion B "re-experiencing" (RE; items 1-5); Criterion C "avoidance and numbing" (AN; items 6-12); Criterion D "hyper-activation" (HA; items 13-17). For each item, the person performs two evaluations, both on a scale of 0 (never/nothing) to 4 (daily/extreme) points: one for frequency (number of times it has happened) and the other for intensity (magnitude or gravity) of the symptom experienced. The lowest possible score is 0, and the maximum is 136. A score of over 40 points is considered to indicate a high probability that the person suffers from PTSD (Davidson et al., 1997).
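The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `score_dts` and the dictionary layout are hypothetical, the item-to-criterion mapping is the one given in the text, and the cutoff is applied as "over 40 points".

```python
N_ITEMS = 17      # number of DTS items, per the text
MAX_RATING = 4    # each rating runs from 0 to 4
CUTOFF = 40       # a total above this suggests probable PTSD (Davidson et al., 1997)

# Item-to-subscale mapping as described in the text (1-based item numbers).
SUBSCALES = {
    "RE": range(1, 6),    # re-experiencing, items 1-5
    "AN": range(6, 13),   # avoidance and numbing, items 6-12
    "HA": range(13, 18),  # hyper-activation, items 13-17
}


def score_dts(frequency, intensity):
    """Return subscale scores, total score and cutoff flag for one respondent.

    `frequency` and `intensity` are sequences of 17 ratings (0-4) each.
    This helper is hypothetical; it is not taken from the original study.
    """
    for name, ratings in (("frequency", frequency), ("intensity", intensity)):
        if len(ratings) != N_ITEMS:
            raise ValueError(f"{name} must contain {N_ITEMS} ratings")
        if any(r not in range(MAX_RATING + 1) for r in ratings):
            raise ValueError(f"{name} ratings must be integers from 0 to {MAX_RATING}")

    # Each item contributes its frequency rating plus its intensity rating.
    item_scores = [f + s for f, s in zip(frequency, intensity)]
    result = {
        name: sum(item_scores[i - 1] for i in items)
        for name, items in SUBSCALES.items()
    }
    result["total"] = sum(item_scores)
    result["above_cutoff"] = result["total"] > CUTOFF
    return result


highest = score_dts([4] * N_ITEMS, [4] * N_ITEMS)
lowest = score_dts([0] * N_ITEMS, [0] * N_ITEMS)
```

With every rating at its ceiling the total is 17 x 2 x 4 = 136, which matches the maximum stated in the text.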

Degree of damage to home
In order to determine the degree of damage to the home of each participant, we used the question "As a result of the earthquake, what damage did your home incur?" We evaluated the degree of damage using the following levels: "no damage" (0), "light damage such as cracking" (1), "heavy damage such as fallen walls or ceilings" (2), and "total loss" (3).

Data analysis
Before starting the validation process itself, the reliability of the instrument was tested by Cronbach's alpha (α) for the 17 items of the DTS. For this test, a value above 0.9 is considered excellent (Pardo & San Martin, 1998). In order to perform the reliability analysis, we used each of the three previously described samples separately.
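For readers unfamiliar with the statistic, the following is a minimal sketch of Cronbach's alpha using the standard formula α = k/(k-1) · (1 - Σ item variances / variance of total scores). The function name and the toy data are illustrative assumptions; the study itself reports using SPSS.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data (invented for illustration): 4 respondents, 2 items.
toy = [[1, 2], [2, 1], [3, 3], [4, 4]]
alpha = cronbach_alpha(toy)
```

In the study's setting, each row would hold one participant's 17 item scores, and the result would be compared with the 0.9 threshold cited above.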
Then, to assess construct validity, we performed an exploratory factor analysis (EFA) with a generalized least squares extraction method, a free number of factors (criterion of eigenvalues greater than 1) and varimax rotation. Although we expect related factors, varimax rotation was performed because it is simple to interpret and because we will carry out a confirmatory factor analysis (CFA), which estimates the relations between the factors more accurately. The EFA was performed using 200 randomly selected participants from the original sample of 26,737. The analysis is appropriate and the model fits well if the following tests show values within these limits: Kaiser-Meyer-Olkin (KMO) > 0.5; Bartlett's sphericity test with p < 0.01; χ² with p > 0.05 (Ximenez & San Martin, 2004).
To confirm the existence of the pattern obtained from the EFA, we conducted a CFA through a structural equation model with the 17 items of DTS. CFA was performed using another sample of participants that was not used in the EFA (n = 200). Considering the arguments of Barret (2007), in this case a model has an appropriate fit if the following indicators have values approximately within the limits: CMIN/DF < 3, RMSEA < 0.05, TLI > 0.9, CFI > 0.9 and PNFI > 0.5 (Hair et al., 2004; Schreiber, Nora, Stage, Barlow, & King, 2006). Once the single model is obtained, it is once again submitted to CFA, this time with a third sample: 200 participants selected via a non-probabilistic sampling, evaluated 6 months after F-27. The model fit will be evaluated according to the same indicators mentioned above.
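The fit criteria can be expressed as a small lookup, sketched below. It assumes the conventional reading of the thresholds (smaller is better for CMIN/DF and RMSEA, larger is better for TLI, CFI and PNFI); the names `FIT_CRITERIA` and `check_fit` are hypothetical. The example values are those the paper reports for the original 17-item model in Sample 2.

```python
import operator

# Threshold for each index: (comparison, limit). Directions are an assumption
# based on the conventional interpretation of these indices.
FIT_CRITERIA = {
    "CMIN/DF": (operator.lt, 3.0),
    "RMSEA": (operator.lt, 0.05),
    "TLI": (operator.gt, 0.9),
    "CFI": (operator.gt, 0.9),
    "PNFI": (operator.gt, 0.5),
}


def check_fit(indices):
    """Return, for each reported index, whether it meets its threshold."""
    return {
        name: compare(indices[name], limit)
        for name, (compare, limit) in FIT_CRITERIA.items()
        if name in indices
    }


# Values reported in the paper for the original model (Sample 2).
original_model = {
    "CMIN/DF": 3.754,
    "RMSEA": 0.118,
    "TLI": 0.824,
    "CFI": 0.850,
    "PNFI": 0.689,
}
verdict = check_fit(original_model)
```

Because the text asks for values "approximately within the limits", a strict pass/fail table like this is only a first screen, not a substitute for judgment about overall fit.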
Finally, to determine concurrent validity, Pearson correlations were computed between the DTS items and the home damage level. The more DTS items are related to home damage, the better the criterion validity will be. All these correlations must be statistically significant (p < 0.05).
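A minimal sketch of this item-by-criterion check follows. It computes Pearson's r directly and, as a simplifying assumption, approximates the two-sided p-value with the normal distribution, which is reasonable only for samples as large as those used here (n = 200); statistical packages use the exact t distribution. All function names and the toy data are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length (at least 3)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    return max(-1.0, min(1.0, sxy / sqrt(sxx * syy)))


def approx_p_value(r, n):
    """Two-sided p-value for r, using a large-sample normal approximation."""
    remainder = 1 - r * r
    if remainder <= 0:
        return 0.0
    t = r * sqrt((n - 2) / remainder)
    return 2 * (1 - NormalDist().cdf(abs(t)))


def share_significant(items, criterion, alpha=0.05):
    """Percentage of items whose correlation with the criterion has p < alpha."""
    hits = sum(
        approx_p_value(pearson_r(item, criterion), len(criterion)) < alpha
        for item in items
    )
    return 100 * hits / len(items)


# Toy data (invented): one item tracks the criterion, one does not.
criterion = list(range(20))
items = [list(range(20)), [0, 1] * 10]
share = share_significant(items, criterion)
```

In the study's setting, `items` would hold the responses to each of the 17 DTS items and `criterion` the home damage level (0 to 3) for the same participants.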
Both the reliability analysis and the EFA were performed using SPSS version 15. CFA was performed using AMOS version 16.
As for concurrent validity, Table 1 shows the Pearson correlations between the DTS items and the home damage item, indicating that 94, 41, and 53% of items are correlated with it (p < 0.01) in the three samples, respectively. There are also 11 items that are correlated with the criterion in at least two of the three samples. In total, 63% of items correlate with the criterion question.
Construct validity: EFA
In the first sample (n = 200), EFA was performed to begin the DTS construct validity analysis. First, we analyzed the relevance of the factor solution and whether there is a structure of relations among the items suitable for extracting factors. In this regard, the KMO (0.915) and Bartlett's sphericity (χ² = 2600.963, p < 0.01) tests indicated that the structure of correlations was adequate.
The factor structure of this solution was then analyzed. We obtained a three-factor solution that explained 63% of the total variance. This structure is similar to that found by other authors (Chen et al., 2001; McDonald et al., 2009; Villafañe et al., 2003) and to that which was originally proposed (Davidson et al., 1997). However, upon examining the rotated component matrix, we found that items 6 and 7, originally belonging to the dimension AN, now loaded much more on RE. Something similar occurs with item 13 of HA, which loaded more on RE (see Table 2). This, in light of the fact that these items could have a certain semantic coherence that would link them with the RE factor, motivates us to propose two models to be confirmed: one based on the EFA (empirical) and the other based on the original DTS dimensions (original).
Construct validity: CFA
Given the above results, we conducted a CFA to specify the fit of the empirical model and compare it with that of the original. In both cases, the maximum likelihood estimation method was used. The analysis was performed on the second sample (n = 200). To begin, in the original model, all factor loadings were significant (p < 0.001). However, it has only a fair overall fit (CMIN/DF = 3.754 and RMSEA = 0.118) and incremental fit (TLI = 0.824 and CFI = 0.850), although it provided a good parsimony fit (PNFI = 0.689; see Fig. 1). Regarding the empirical model, all factor loadings were significant too (p < 0.001). This model also had a fair general fit (CMIN/DF = 3.844 and RMSEA = 0.120) and incremental fit (TLI = 0.808 and CFI = 0.845). However, the parsimony index (PNFI = 0.685) shows an appropriate fit (see Fig. 2).
With the previous results, we can establish that both models showed the same fit. However, we cannot ignore that the fit of the theoretical model appeared only fair. Therefore, we designed a fourth model based on the one originally proposed, without items 3, 6, 10, 11, 13, and 14, which had the model's lowest loadings (λ < 0.7). This is how we arrived at a model that contains 11 items. By testing the model's fit with Sample 2, we obtained a fair general fit (CMIN/DF = 3.567 and RMSEA = 0.114), although the incremental (TLI = 0.896 and CFI = 0.923) and parsimony fit (Fig. 3).
creation of a newly reduced version of the same scale (DTS-SF) produced two important findings from this investigation. We confirmed the structure found by other authors based on three factors (Bobes et al., 2000; Chen et al., 2001; Seo et al., 2008; Villafañe et al., 2003) for the original version and that of 11 items. Furthermore, given the important correlations between latent variables or scale dimensions, it is possible to establish a second-order variable that we could call PTSD (Figs. 4 and 5). Both models show the same fit indicators as their predecessors, and their existence reinforces the validity of the scale in both its versions. We could say, then, that both DTS and DTS-SF are valid scales for measuring PTSD.

Reliability of 11-item DTS version
It is noteworthy that DTS-SF shows somewhat better indicators than the original version. Although there is another short version called SPAN (named for its four items Startle, Physiological arousal, Anger and Numbness; Meltzer-Brody, Churchill, & Davidson, 1999), we did not find any validation study of it with EFA and/or CFA (Chen, Shen, Tan, Chou, & Lu, 2003; Seo et al., 2011; Yeager, Magruder & Frueh, 2007), which makes us doubt its construct validity. This represents an opportunity, as it is possible to apply a shorter version of DTS that is as valid as the original and has the same factor structure. Therefore, with DTS-SF, we would have more certainty of measuring the same construct. DTS-SF could be very useful in situations where participants present a higher degree of emotional stress or a lower degree of reading comprehension, or in which the application of considerably long instruments is difficult. In addition, in order to confirm the validity of this version, we propose performing other investigations that compare the functioning of DTS-SF with other PTSD measures serving as criteria (e.g., PCL-C, TOP-8, SPRINT-E; Bobes et al., 2000; Norris et al., 2008; Leiva-Bianchi & Gallardo, 2013; Weathers, Litz, Herman, Huska, & Keane, 1993). We also recommend testing the construct validity of SPAN and comparing it with the validity of DTS and DTS-SF.
In regard to criterion measures, we used only one in our investigation: the degree of damage to the home. The relationship between DTS and damage to the home, despite being limited in extent (63% of items) and weak (r < 0.3), is useful in order to confirm scale validity. As we know, the probability of suffering from PTSD increases with the person's exposure to the event, whether through heavy damage to their belongings after a catastrophe (MIDEPLAN, 2011) or through other socio-demographic and environmental factors (Goenjian et al., 2000). Nevertheless, it is important to note that this is not a causal relationship, which is to say that not all those who suffered a high degree of exposure to the event will necessarily suffer from a psychopathology such as PTSD. In fact, the majority recover spontaneously, and even experience personal growth after the event (Tedeschi & Calhoun, 2004).
Here, we can mention the study's first limitation. There are probably other more precise criterion measures than the one used in this case. In fact, the PTSD and post-disaster stress scales mentioned are a good alternative for this purpose. Unfortunately, the CASEN Survey did not include these types of measurements as criteria. However, the third sample used in this study did include two other criterion measures: the SPRINT-E and a checklist to determine the presence or absence of panic attack symptoms. As a complement, we would like to note that upon computing Pearson correlations between the 17 DTS items and both the SPRINT-E scale totals and the panic attack symptoms, all items proved to be significantly correlated with both criteria (p < 0.01).
Another limitation might be the use of maximum likelihood as the estimation method. While this method is widely used in CFA and provides statistical tests to estimate model parameters (Martinez, Hernandez, & Hernandez, 2006), it may not be the most appropriate when variables are ordinal or when they do not meet the normality assumption (Brown, 2006; McIntosh, 2007), as is the case for the DTS items (see Table 3). Therefore, we recommend performing the analysis with another method, such as unweighted least squares (ULS; Brown, 2006). As a complement to the presented results, and based on Sample 3, we performed the analysis of DTS and DTS-SF by ULS. This procedure confirms the results for DTS (RMR = 0.252, GFI = 0.990, NFI = 0.988 and PNFI = 0.843) and DTS-SF (RMR = 0.177, GFI = 0.996, NFI = 0.994, and PNFI = 0.745). Applying a transformation to the items (e.g., log or square root) could have been another solution. However, this could hinder the interpretation of results, since it requires changing from original to transformed scores.
Although this is not the first DTS validation in Spanish, it is the first in which two samples have been taken via a random sampling procedure. This strength is not common in the DTS validation studies reviewed, or in any other validation study that we know of, for that matter. It allows us to arrive at a conclusion regarding the DTS structure that is most representative of the population affected by the same potentially traumatic event. This, together with the high validity of the scale criteria and the fact that validation was carried out after a single stressful event common to all participants (F-27), allows for a decrease in the margin of error of estimates and greater assurance of the accuracy of the validation results. In addition, this effort is relevant for obtaining representative cross-cultural findings. Given the particular characteristics of the 2010 CASEN Survey database, further analysis offers at least three interesting psychometric opportunities. The first pertains to performing DTS validation with those participants that received a PTSD diagnosis (a total score of 40 points or more on the scale) and not with the general population, as was the case in this study. The second consists of dividing the total sample (26,737) into 138 samples of 200 participants each, validating each sample and finally obtaining an average or proportion fit indicator for DTS. The third opportunity is related to performing analyses using item response theory, given the large number of participants that responded to the DTS as part of the CASEN Survey.
Another task still pending is the creation of norms that confirm (or not) the 40-point cutoff originally proposed (Davidson et al., 1997). For this, we propose applying DTS accompanied by a structured clinical interview (the gold standard for validation) in order to generate the ROC curves necessary for establishing its sensitivity. The emphasis placed here on the search for a better model for DTS came at the expense of these important aspects. Nevertheless, in order to partially mitigate this weakness and provide practical criteria for decision making in the clinical environment, we used the 40-point cutoff mentioned for calculating PTSD prevalence in the population. We found that for the three samples analyzed, there would be 10, 6, and 28% prevalence of PTSD, respectively, that is, an average of 15%. This concurs with the prevalence results mentioned at the beginning. However, given the variation of PTSD prevalence in the three samples, we suggest further investigation of factor structures across groups with high, mid and low levels of symptoms.
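The prevalence calculation can be sketched as follows. The function name `prevalence` and the toy scores are illustrative assumptions, and the cutoff is applied as "over 40 points", following the description of the scale; the three percentages are those reported in the text.

```python
def prevalence(total_scores, cutoff=40):
    """Percentage of respondents whose DTS total exceeds the cutoff."""
    if not total_scores:
        raise ValueError("no scores supplied")
    cases = sum(score > cutoff for score in total_scores)
    return 100 * cases / len(total_scores)


# Toy totals (invented): two of four respondents exceed 40 points.
toy_prevalence = prevalence([0, 41, 40, 100])

# Prevalence reported in the text for the three samples, in percent.
reported = [10, 6, 28]
mean_prevalence = sum(reported) / len(reported)
```

The unrounded mean of the three reported figures is about 14.7%, which rounds to the 15% quoted in the text.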
For its part, the new 11-item version does not have norms either. However, we note as a reference that the equivalent of the 40 points on the original scale corresponds to scores of 28, 30, and 27 for the three samples on the 11-item scale applied here. Therefore, scores above 28 points on DTS-SF imply a higher risk of the presence of PTSD. Finally, we recommend incorporating DSM-5 criteria for diagnosing PTSD in future evaluations. In this regard, it is important to include assessments of negative moods and dissociated thoughts. Differentiating the symptoms of PTSD from those of panic attacks could be relevant in order to identify cases of PTSD (Friedman, Resick, Bryant, & Brewin, 2011). Studies of comorbidity and differential diagnosis of disorders based on fear and anxiety (e.g., panic attacks), depression, and dissociative disorders would be very useful in this regard.

Author's note
The submitted article has not been previously published or orally presented, and is not under consideration elsewhere.
The corresponding author is not a member of ESTSS.