Validation of a German Version of the Stress Overload Scale and Comparison of Different Time Frames in the Instructions

Introduction: The Stress Overload Scale (SOS; Amirkhan, 2012, 2018) was introduced as a two-factorial self-report measure of stress to overcome limitations of other scales. Methods: We developed a German translation of the SOS and validated it in addition to a short version and an extra-short version. Furthermore, we tested whether manipulating the time frame introduced as retention interval in the instructions affected its psychometric properties. Results: Using two independent age-heterogeneous convenience samples (N total = 1,239), we found good psychometric properties for a modified German short version of the SOS-S (SOS-S-G) and a new extra-short version (SOS-XS-G), but not for the German long version of the SOS. Moreover, manipulating the time frame of the SOS did not affect its psychometric quality. Discussion: The SOS enriches the repertoire of self-report measures of stress as it captures the nonpathological core facets of stress in line with theoretical stress conceptualizations.

Stress is a central construct in psychology and health sciences as it has been associated with reduced physical and mental health, and various physical and mental disorders (McEwen, 1998). Stress is commonly defined as a mismatch between environmental demands and personal resources (Lazarus & Folkman, 1984; McEwen, 2000) that threatens an organism's homeostasis (McEwen, 1998, 2000). Several self-report measures have been developed to assess the subjective experience of stress, including the Stress Overload Scale (SOS; Amirkhan, 2012) and its short form (SOS-S; Amirkhan, 2018), which have been proposed to address the shortcomings of previous scales (e.g., alignment of factor structure and theoretical foundation). In this study, we validated a German translation of the SOS and the SOS-S and developed a new extra-short version (called SOS-XS-G). Furthermore, as existing research used different time frames in the instructions of the SOS (e.g., Amirkhan et al., 2015; Hartsell & Neupert, 2019), we tested whether changing the time frame in the instruction of the SOS changes its psychometric properties.

Theoretical Background
Stress
The stress construct features two nonhierarchical, but related, dimensions referred to as environmental demands and personal resources which reflect a dynamic interplay between person-environment characteristics (Lazarus, 1990). Environmental demands represent situations that cause threat or challenge to an individual, whereas personal resources allow an individual to cope with the environmental demands (Lazarus, 1990). Stress can be assessed objectively (e.g., cortisol levels; Hellhammer et al., 2009), but the subjective experience of stress is an important and distinct aspect of the stress construct. This is underlined by weak-to-moderate correlations between subjective stress reports and objective markers of stress (Schlotz et al., 2008; Weckesser et al., 2019). Moreover, whereas the assessment of objective stress measures can be time-consuming, assessing subjective stress is convenient and applicable in large cohorts (Amirkhan, 2012). Furthermore, subjective stress reports predict changes in physical health, mental health, and mortality, underlining the relevance and utility of self-report measures of stress (e.g., Novak et al., 2013).

The Stress Overload Scale as a New Measure of Subjective Stress
The SOS is intended to measure the core concept of stress as a state in which current environmental demands exceed an individual's personal resources (Lazarus & Folkman, 1984; McEwen, 2000). In line with this definition, the SOS comprises two subscales. The event load subscale refers to the external, environmental demands and responsibilities that an individual faces. The personal vulnerability subscale captures the subjective feeling of stress and overstrain. Hence, the SOS features a correlated two-factor structure that allows a holistic measurement of an individual's stress level by integration of the two subscales (Amirkhan, 2012, 2018). Correlations between the two subscales are strong (i.e., r ≈ .55; Amirkhan, 2012). Importantly, the two-factorial conceptualization of the stress construct underlying the SOS can be considered an advantage compared to other stress scales that focus on either environmental demands or personal resources when assessing an individual's stress level (for an overview, see Amirkhan, 2012). In line with that, psychometric properties of some existing measures of subjective stress have been criticized (B. P. Dohrenwend, 2006; Hough et al., 1976), including the most commonly used stress scales, such as the Perceived Stress Scale (PSS) or the Screening Scale of Chronic Stress (SSCS; Amirkhan, 2012; Schmidt et al., 2020; Schulz & Schlotz, 1999).

Nomological Net
With respect to convergent validity, moderate-to-strong positive associations have been reported between the SOS and anxiety, depression, and general illness (Amirkhan, 2012; Amirkhan et al., 2015; Duan & Mu, 2018). Furthermore, the SOS has been positively associated with the number of experienced life events and negatively associated with resilience (i.e., coping abilities that allow an individual to withstand conditions of risk and adversity; Amirkhan, 2012; Amirkhan et al., 2015, 2018). However, in line with the theoretical conceptualization of the SOS, the size of these associations differed between the two subscales. The event load subscale of the SOS was more strongly related to the number of experienced life events than the personal vulnerability subscale, whereas the personal vulnerability subscale was more strongly related to resilience than the event load subscale (Amirkhan, 2012). We expected that the mentioned correlations would replicate in our German translation. Furthermore, we expected the SOS to be negatively related to life satisfaction (people's cognitive evaluation of life; Diener, 1984) and affective well-being (positive and negative feelings; Diener, 1984).
With respect to discriminant validity, the SOS was only weakly associated with social desirability in prior research (Amirkhan, 2012, 2018). Social desirability is the tendency to align responses on self-report measures with what is perceived as being socially accepted (King & Bruner, 2000). Self-report measures are intended to reduce socially desirable response biases to a minimum. To do so, the SOS features additional filler items which do not belong to one of the two subscales measuring stress, but which are intended to conceal the true target variable of the SOS (i.e., stress). The SOS-S, however, omits these filler items for the sake of brevity (Amirkhan, 2018). In line with the existing empirical evidence, we expected to find only weak correlations between social desirability and our German translations of the SOS.

Criterion Validity
The SOS has been associated with various health outcomes (Amirkhan, 2021; Amirkhan et al., 2015). Higher SOS scores have been shown to reliably predict (1) physical symptoms commonly associated with stress (e.g., headaches; Amirkhan, 2018), (2) a higher cortisol reactivity in response to an acute laboratory stressor, and (3) the onset of illness after prolonged exposure to elevated stress (Amirkhan et al., 2015). The high and consistent criterion validity across multiple criteria is a unique advantage of the SOS, as other measures of self-reported stress often lack strong associations with (mental) health outcomes (Amirkhan et al., 2015; B. S. Dohrenwend et al., 1984; Schmidt et al., 2020). Consequently, we expected to find strong positive correlations between our German translation of the SOS and measures of cognitive and behavioral symptoms as well as self-reported health.

Stress Overload Scale: Intended Use
The SOS is intended to be used for the assessment of short-term stress experience (i.e., stress experience of the past week). The SOS can be applied in general population samples and for estimates of nonpathological stress experience, but it should not be used for any kind of diagnosis in clinical settings (Amirkhan, 2012). To achieve sufficient variability in item responses in the general population, the items of the SOS should have a medium item difficulty (Moosbrugger & Kelava, 2012). The SOS might also serve as an initial screening tool to identify at-risk individuals in the general population since it is shorter than other self-report measures of stress (SOS: 30 items, SOS-S: 10 items vs., e.g., Daily Hassles Inventory: 117 items, or Stress and Adversity Inventory: 220 items; Kanner et al., 1981; Slavich & Shields, 2018). Given the above-mentioned high criterion validity of the SOS, the SOS may be considered an ideal scale to predict health consequences of stress in the general population (Amirkhan, 2021).

Stress Overload Scale: Target Population
The SOS was empirically developed and validated in multiple heterogeneous community-based samples in the United States (Amirkhan, 2012). Therefore, the SOS offers application in large samples as for epidemiologic research or in longitudinal cohort studies with multiple measurements (Amirkhan, 2012). In this regard, the SOS may be a valuable advancement since other self-report measures of stress, such as the PSS-10, have been criticized for being validated mainly in college students or workers (Lee, 2012). Application in larger cohorts and epidemiological studies may be further encouraged due to the brevity of the SOS and the SOS-S.
Beyond the English version, the SOS so far has been validated for a Setswana-speaking community in South Africa (Wilson et al., 2018), for Chinese populations (Duan & Mu, 2018), and an Arabic version of the SOS has been developed (Bashmi & Amirkhan, 2018). Since a German translation of the SOS has not yet been validated, the first aim of the current study was to translate the SOS and the SOS-S to German language and to test the psychometric quality of this translated version. Furthermore, we developed and validated a new extra-short version of the SOS (SOS-XS-G).

Time Frame in the Instruction of Self-Report Measures of Stress
Existing self-report measures of stress use different time frames in their instructions as retention interval to evaluate the items (e.g., the PSS asks respondents to rate their stress level over the past month). The SOS refers to an individual's stress level of the past week and has been validated only for this particular time frame (Amirkhan, 2012). However, Hartsell and Neupert (2019), for instance, altered the time frame of the SOS so that participants were asked to evaluate their stress level with regard to the past year.
Currently, it is unknown whether the psychometric quality of self-report measures of stress depends on specific time frames and whether manipulating the time frames alters the psychometric quality of these measures.
In general, manipulating the time frames of stress scales may prove meaningful with respect to the prediction of defined criteria as it is conceivable that effects of stress differ as a function of stress exposure time and temporal distance to the stress exposure (Lam et al., 2019). For instance, recent stress exposure seems to better predict cognitive deficits (Shields et al., 2017), whereas chronic stress is more strongly associated with biological aging (Epel et al., 2004). With respect to the SOS, Amirkhan et al. (2018) showed that participants' stress levels were associated with acute and delayed physical and behavioral symptoms assessed over different time periods. From a psychometric perspective, however, one may question whether manipulations of the time frame change the validity or reliability of the responses. Retrospective self-reports are restricted by the participants' capacity to aggregate and remember their past experiences (Weckesser et al., 2019), implying that self-reports might be prone to certain biases originating from current appraisal and from the accessibility of past contextual details (e.g., Geng et al., 2013). It is therefore important to examine whether the time frame in the instruction of self-report measures is related to their psychometric properties and whether the time frame can be modified according to the needs of a particular research project. Consequently, we aimed to determine whether the time frame in the instructions of the SOS impacted its psychometric properties.

The Current Study
The aims of this study were to (1) validate German versions of the SOS and the SOS-S and (2) investigate potential effects of different time frames. The current investigation consisted of two studies using two large age-heterogeneous convenience samples (N total = 1,239). In Study 1, we first validated a German version of the SOS and the SOS-S by translating the original English scale to German language and testing the psychometric properties of the translated scale. As our translated versions had poor fit in confirmatory factor analyses in Study 1, we also developed and validated a modified version of the SOS-S (called SOS-S-G) and a new extra-short version of the SOS (called SOS-XS-G). Second, we explored whether the psychometric quality of the different versions of the SOS changed when varying the time frame in the instruction. For Study 2, we used an independent sample to validate our modified German version of the SOS-S and the newly developed extra-short version. Furthermore, we again tested the effects of varying time frames in the instruction of the SOS.

Study Design
Study 1 was based on data from the study Well-Being After One Year of the Corona Pandemic. People interested in this study first had to register for participation. Registration included providing informed consent and age verification (minimum: 18 years). After the registration, participants were invited to two measurement occasions 1 week apart (henceforward called T1 and T2). At T1, participants rated several indicators of their well-being, health, and stress (including our German translation of the SOS). Furthermore, participants provided demographic information. At T2, participants again completed several indicators of their well-being, health, and stress (SOS, SSCS, and PSS). Furthermore, at T2, we manipulated the time frame used in the instructions of the SOS. Participants randomly received the SOS with one of four different instructions: past day, past week, past month, and past year.

Sample
Based on the results of the English version of the SOS, power analyses suggested that approximately 500 participants were required to achieve a power of .80 for the statistical tests described below (see the study design preregistration for details). However, data collected in the study Well-Being After One Year of the Corona Pandemic were intended to be used in different projects which partly required larger sample sizes (N = 1,000). Therefore, as pre-registered, recruitment was stopped after reaching the required sample size for all intended projects. Participants were recruited online via social media and e-mail lists.
In total, N = 1,046 participants provided informed consent to participate in Study 1. To ensure data quality, we excluded participants who completed measurement occasions in less than 40% of the expected duration and who provided no or incorrect answers on instructed response items (e.g., "To ensure data quality, please select the response option often"). Applying these exclusion criteria led to a final sample size of N = 812 participants for Study 1. The mean age of our sample was 34.87 years (SD = 12.15), and 72% of our sample were female.
Translation of the SOS
Translation of the SOS to German language followed procedures previously used to translate the SOS to other languages (Duan & Mu, 2018; Wilson et al., 2018). Moreover, we considered general recommendations for the cross-cultural adaptation of questionnaires (e.g., using translation and back-translation procedures, reaching consensus on translations through an expert committee, conducting pilot-testing of preliminary versions; Beaton et al., 2000). The translation process is illustrated in Figure 1 and further described in the supplementary material. Table 1 lists our translated items of the SOS alongside the original English ones.

Measures
All measures used for the present analyses beyond the German translation of the SOS are summarized in Table 2.

Statistical Analyses
Aim 1: Psychometric Properties of the German SOS
We performed several steps to test the psychometric properties of our German translations of the SOS. First, we conducted confirmatory factor analyses using the R package lavaan (Rosseel, 2012). As for the original scales, we specified a two-factor model and evaluated model fit using goodness-of-fit indices (acceptable: CFI > .95, TLI > .95, RMSEA < .08; Schermelleh-Engel et al., 2003). We used the indicator variable method and the robust WLSMV estimator for model estimation. Second, we computed Cronbach's α and the 1-week test-retest reliability to estimate the reliability of the SOS. Third, to evaluate convergent, discriminant, and criterion validity, we computed zero-order correlations between the German translations of the SOS and several other measures (see Table 2). Fourth, using Hittner's test for dependent correlations (Diedenhofen & Musch, 2015), we tested whether the zero-order correlations with the Life Event Checklist and the Resilience Scale differed between the two subscales of the SOS. We expected the correlation with the Resilience Scale to be stronger for the personal vulnerability subscale than for the event load subscale, and the correlation with the Life Event Checklist to be stronger for the event load subscale than for the personal vulnerability subscale. Fifth, to evaluate the criterion validity, we statistically compared zero-order correlations of the SOS and other self-report measures of stress (PSS and SSCS) with health-related outcomes using Hittner's test for dependent correlations (Diedenhofen & Musch, 2015). As the PSS and the SSCS were only assessed at T2, we used T2 data for these analyses. Since we manipulated the time frame in the instructions of the SOS at T2, these analyses were restricted to participants who either responded to the SOS using its original time frame (i.e., past week) or to participants who responded to the SOS using a 1-month time frame (i.e., in this case, all stress measures referred to the same time frame in their instructions).
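The internal-consistency step can be illustrated with a minimal sketch of Cronbach's α. This is not the authors' code (the analyses were run in R); it is a plain Python illustration of the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of the sum score). The function name `cronbach_alpha` and the response matrix are hypothetical.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the unweighted sum score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses (5 respondents x 4 items on a 1-5 rating scale);
# these are illustration values, not data from the study.
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 5, 4, 4],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

With real data, each row would hold one participant's responses to the items of one subscale or of the full scale.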
Aim 2: Time Frame Used in the Instructions of the SOS
Effects of varying time frames in the instruction of the SOS were evaluated with the T2 data. First, we checked for measurement invariance between the different time frames as weak measurement invariance constitutes a precondition for the subsequent analyses. This precondition was fulfilled (Table S2 in the supplementary materials). Then, we compared Cronbach's α across different time frames using the R package cocron (Diedenhofen, 2016). Next, we compared convergent, discriminant, and criterion validity of the SOS among different time frames by means of two nested regression models. In Model A, we used the scores of the measures to assess the convergent, discriminant, or criterion validity of the SOS as outcome and the SOS scores and the time frame as predictors. In Model B, we additionally included interactions between the SOS scores and the time frames. The two models were compared using an F test for nested regression models. A significant test indicated that including the interactions significantly improved the model and consequently that the time frame in the instruction influenced the psychometric quality of the SOS.
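The nested model comparison can be sketched as follows. This is an illustrative Python example on simulated data, not the authors' R code. The dummy coding with "day" as reference category, the helper `ols_rss`, and all simulated values are assumptions. The F statistic is computed as F = ((RSS_A − RSS_B) / (p_B − p_A)) / (RSS_B / (n − p_B)), where p denotes the number of estimated coefficients.

```python
import random

def ols_rss(X, y):
    """Residual sum of squares of an OLS fit, solved via the normal equations."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(X, y))

# Simulated data (hypothetical): the SOS slope is identical in all time frames.
random.seed(1)
frames = ["day", "week", "month", "year"]
rows_a, rows_b, y = [], [], []
for _ in range(400):
    sos = random.gauss(0, 1)
    frame = random.choice(frames)
    dummies = [1.0 if frame == f else 0.0 for f in frames[1:]]  # "day" = reference
    rows_a.append([1.0, sos] + dummies)                              # Model A
    rows_b.append([1.0, sos] + dummies + [sos * d for d in dummies])  # Model B
    y.append(0.6 * sos + random.gauss(0, 1))

rss_a, rss_b = ols_rss(rows_a, y), ols_rss(rows_b, y)
df_num = len(rows_b[0]) - len(rows_a[0])  # number of interaction terms
df_den = len(y) - len(rows_b[0])          # residual df of Model B
f_stat = ((rss_a - rss_b) / df_num) / (rss_b / df_den)
print(df_num, df_den, round(f_stat, 3))
```

A p value would follow from the upper tail of the F distribution with `df_num` and `df_den` degrees of freedom (for example via `scipy.stats.f.sf`, which is not used here to keep the sketch dependency-free).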

Results
Aim 1: Psychometric Properties of the German SOS
The factorial validity of our translations was evaluated with confirmatory factor analyses. Neither the SOS nor the SOS-S had acceptable fit using a correlated two-factor structure (Table 3). Thus, as pre-registered, we examined modification indices of the confirmatory factor analyses. However, the results suggested that the poor fit was not due to specific items (i.e., more than 10 items were involved in modification indices larger than 10 for the SOS).
Therefore, we conducted a broader evaluation of our translated items to check whether other items than the ones comprising the English SOS-S may be used for a modified German version of the SOS-S. Based on current recommendations for creating short scales (e.g., Rammstedt & Beierlein, 2014), we evaluated different criteria to select the best-suited items for a German SOS-S. First, we conducted three exploratory factor analyses, extracting one, two, or three factors. However, the three-factor solution was dropped from further analyses since no item had a substantial loading on the third factor. Second, we computed descriptive coefficients of item quality (M, SD, item-total correlation). Third, we estimated test-retest reliability and average convergent validity per item. Fourth, the two first authors independently judged the content validity of each item. Fifth, we examined standardized loadings in confirmatory factor analyses. Table S3 in the supplementary materials summarizes the results of this item evaluation. In the German translation, some of the items seemed to not clearly belong to one subscale (e.g., Item 13, Item 14, or Item 20 had medium-sized loadings > .25 on both factors in the exploratory factor analysis; Costello & Osborne, 2005), while other items loaded only onto one factor (e.g., Item 15, Item 23, or Item 24). We therefore selected 10 items that clearly loaded onto one factor for a modified German version of the SOS-S (henceforward called SOS-S-G; see Table 1). Furthermore, we decided to keep those items that did not clearly load onto one factor as they seemed to be representative of the overall construct (e.g., these items had the highest loadings using a one factorial model). Using these items, we created a new unidimensional extra-short scale (henceforward called SOS-XS-G; see Table 1) measuring the overall construct stress overload with only four items. 
Formatted versions of both the SOS-S-G and the SOS-XS-G including the German translations of the instructions are provided in the supplementary material.
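One of the item-evaluation criteria described above, namely whether an item loads on one or on both factors in the two-factor exploratory solution, can be sketched as a simple classification. The loadings below are hypothetical and are not the values reported in Table S3. Only the .25 cutoff is taken from the text, and the actual item selection additionally relied on the other criteria listed above.

```python
# Hypothetical standardized loadings (factor 1, factor 2) from a two-factor EFA;
# illustration values only, not the study's results.
loadings = {
    "item_a": (0.72, 0.08),
    "item_b": (0.05, 0.66),
    "item_c": (0.41, 0.38),
    "item_d": (0.12, 0.10),
}

def classify(loadings, cutoff=0.25):
    """Group items by the number of factors on which |loading| exceeds the cutoff."""
    groups = {"no_loading": [], "single_factor": [], "cross_loading": []}
    labels = ("no_loading", "single_factor", "cross_loading")
    for item, (f1, f2) in loadings.items():
        n_salient = sum(abs(value) > cutoff for value in (f1, f2))
        groups[labels[n_salient]].append(item)
    return groups

groups = classify(loadings)
print(groups)
```

In this sketch, items in `single_factor` correspond to candidates for a two-factor short form, whereas items in `cross_loading` correspond to candidates for a unidimensional scale.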
Note (Table 1). … Amirkhan (2012). However, the six filler items that were used in the original publication to mask the purpose of the scale are not displayed here. German translations that are validated in the present paper are presented as the SOS-S-G and the SOS-XS-G. The German SOS-S-G comprises different items than the original English SOS-S. Please note that reuse of the original English items of the SOS requires permission from the copyright holders. EL = event load. PV = personal vulnerability.
We then evaluated the psychometric properties of the SOS-S-G and the SOS-XS-G. First, using a correlated two-factor structure for the SOS-S-G and a one-factorial structure for the SOS-XS-G, both scales had acceptable factorial validity as indicated by model fit in confirmatory factor analyses. The models fitted the data well at T1 and at T2 (Table 3). Second, Cronbach's α was in a good range (α > .70; Cortina, 1993) for both scales and for the two subscales of the SOS-S-G (Table 4). The 1-week test-retest reliability was also good (r > .70; Moosbrugger & Kelava, 2012) for the SOS-S-G (r = .82) and for the SOS-XS-G (r = .81). Third, the SOS-S-G and the SOS-XS-G correlated significantly and in the expected direction with our measures used to assess convergent validity, such as life satisfaction or depression (.14 ≤ |r| ≤ .74, all p values < .001; Table 4). Fourth, for the SOS-S-G, we compared the correlations with the Life Events Checklist and the Resilience Scale between the two subscales event load and personal vulnerability. As expected, the correlation between the personal vulnerability subscale and the Resilience Scale (r = −.43) was significantly stronger than the respective correlation between the event load subscale and the Resilience Scale (r = −.23), z = 6.93, p < .001. However, contrary to our expectations, the correlation between the event load subscale and the Life Event Checklist (r = .12) was not significantly stronger than the respective correlation between the personal vulnerability subscale and the Life Event Checklist (r = .14), z = 0.65, p = .516. Fifth, regarding discriminant validity, the SOS-S-G (r = −.12) and the SOS-XS-G (r = −.12) correlated significantly with social desirability. As hypothesized in the pre-registration, both correlations were only weak (i.e., |r| ≈ .10; Funder & Ozer, 2019) and of the same strength as those found for the original scale (Amirkhan, 2012, 2018). Sixth, all correlations between the SOS-S-G and the SOS-XS-G and our measures used to assess the criterion validity of the SOS (e.g., cognitive symptoms) were significant and in the expected direction (.31 ≤ |r| ≤ .69; Table 4). However, these correlations with our measures of criterion validity did not significantly differ between the SOS and other measures of self-reported stress (see Table S4 in the supplementary materials for details).
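The comparison of two dependent correlations that share one variable (e.g., the correlations of both subscales with the Resilience Scale) can be sketched as below. The sketch implements the Hittner, May, and Silver (2003) modification of the Dunn and Clark z test as we understand it from the literature. It is a Python illustration, not the authors' R code, and the function name and input values are hypothetical. Note that the test also requires the correlation between the two compared variables (here `r_kh`), which is why the z values reported above cannot be reproduced from the two correlations alone.

```python
import math

def hittner2003(r_jk, r_jh, r_kh, n):
    """z and two-sided p for H0: rho_jk == rho_jh, where both share variable j.

    r_kh is the correlation between the two non-shared variables k and h;
    n is the sample size.
    """
    z_jk, z_jh = math.atanh(r_jk), math.atanh(r_jh)  # Fisher z transformation
    # Back-transformed average of the two Fisher z values.
    r_bar = math.tanh((z_jk + z_jh) / 2)
    # Covariance term of the two z values under the pooled correlation.
    c = (r_kh * (1 - 2 * r_bar**2)
         - 0.5 * r_bar**2 * (1 - 2 * r_bar**2 - r_kh**2)) / (1 - r_bar**2) ** 2
    z = (z_jk - z_jh) * math.sqrt(n - 3) / math.sqrt(2 - 2 * c)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from the standard normal
    return z, p

# Hypothetical inputs (not the study's values).
z, p = hittner2003(r_jk=-0.40, r_jh=-0.20, r_kh=0.50, n=500)
print(round(z, 2), round(p, 4))
```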

Aim 2: Time Frame Used in the Instructions of the SOS
Our second aim was to evaluate manipulations of the time frame of the SOS with respect to its psychometric properties (internal consistency, convergent, discriminant, and criterion validity). For the SOS-S-G and the SOS-XS-G, the results were similar: Different time frames did not lead to significant changes in psychometric properties (Table S5 in the supplementary materials).

Exploratory Analyses: Psychometric Properties of the German Translations Using the Originally Proposed Scoring Scheme
In response to the comments of an anonymous reviewer, we further explored the psychometric quality of the original scoring scheme of the SOS and the SOS-S in our German translations (henceforward called SOS-GO and SOS-S-GO). Although our German translations of the SOS did not fit well in the confirmatory factor analyses, it might be the case that these original scoring schemes obtain comparable or superior results in the other analyses. In general, the results of the SOS-GO and the SOS-S-GO were similar to the results of the SOS-S-G and the SOS-XS-G described above (see Tables S6-S9 in the supplementary materials for details). The SOS-GO and the SOS-S-GO had good reliability, moderate-to-high correlations with the scales used to assess convergent validity, low correlations with social desirability, and high correlations with measures used to assess criterion validity (Tables S6 and S7 in the supplementary materials). However, as for the SOS-S-G and the SOS-XS-G and contrary to our expectations, the correlations between the event load subscale and the Life Event Checklist were not significantly stronger than the respective correlations between the personal vulnerability subscale and the Life Event Checklist (SOS-GO: z = −0.79, p = .428; SOS-S-GO: z = 1.09, p = .278). Furthermore, correlations with measures of criterion validity did not differ significantly between the SOS-GO or the SOS-S-GO and other measures of self-reported stress (Table S8 in the supplementary materials). Finally, except for associations with life satisfaction, the psychometric properties of the SOS-GO and the SOS-S-GO did not differ significantly across different time frames used in the instructions of the SOS (Table S9 in the supplementary materials). Thus, apart from poorer fit in the confirmatory analyses, the psychometric quality of the SOS-GO and the SOS-S-GO was neither better nor worse than that of the SOS-S-G and the SOS-XS-G.
Note (Table 3). For the SOS, the SOS-S, and the SOS-S-G, a correlated two-factor model was tested. For the SOS-XS-G, a unidimensional measurement model was specified. Model fit at T2 for Study 1 and for Study 2 was evaluated only with participants who received the original instruction of the SOS-S-G and the SOS-XS-G.

Study 2
Study 2 had two aims. First, as we had developed a revised version of the SOS-S (SOS-S-G) and a new extra-short version of the SOS (SOS-XS-G) in Study 1, we wanted to test the psychometric properties of the SOS-S-G and the SOS-XS-G in a second, independent sample. Second, we aimed to further investigate the effects of the manipulation of the time frame in the instructions. The finding that changing the time frames in the instructions of the SOS in Study 1 did not affect the psychometric properties of the SOS-S-G and the SOS-XS-G allows for different interpretations. One interpretation is that researchers can flexibly adapt the time frames used in the instructions of the SOS to match their research needs. However, an alternative interpretation is that participants did not pay attention to the time frames used in the instructions of the SOS-S and the SOS-XS-G. In Study 2, we aimed to disentangle these two interpretations by including a manipulation check (i.e., checking whether participants were able to recall the time frame they received in the instructions of the SOS).

Study Design
Study 2 comprised only one measurement occasion. First, participants provided informed consent and demographic information. Then, we assessed several indicators of their mental health. Finally, we presented the items of the SOS-S and the SOS-XS-G with one of four different time frames in the instructions (past day, past week, past month, past year, i.e., the same manipulation as in Study 1). On the next survey page, participants had to select which time frame they had received in the instructions of the SOS (i.e., our manipulation check): past day, past week, past 2 weeks, past month, past 2 months, past 3 months, past 6 months, past year, past 2 years, or past 10 years.

Sample
As for Study 1, we aimed for a sample size of 500 participants. However, due to time constraints, the final sample size for Study 2 was somewhat below this goal (N = 427). The mean age of our sample was 29.71 years (SD = 10.96); 84% of the sample were female.

Measures
All measures used for the present analyses beyond the SOS-S-G and the SOS-XS-G are summarized in Table 2.

Statistical Analyses
Aim 1: Psychometric Quality of the SOS-S-G and the SOS-XS-G
In Study 2, we evaluated factorial validity, internal consistency, and convergent validity of the SOS-S-G and the SOS-XS-G with the same statistical methods as described for Study 1. However, to be consistent with the results provided for Study 1, we restricted these analyses to participants who received the SOS-S-G and SOS-XS-G with the original time frame in the instructions (i.e., past week).

Aim 2: Time Frame Used in the Instructions
First, we calculated the proportion of participants who correctly specified the time frame in the instructions of the SOS in our manipulation check. All following analyses were then based on those participants who passed this manipulation check (i.e., participants who correctly specified the time frame in the instructions of the SOS). As in Study 1, we first checked for measurement invariance among the four different time frames. Details on this analysis and the results are provided in the supplementary materials (Table S2). Then, we examined whether Cronbach's α differed among the different time frames of the SOS (Diedenhofen, 2016). Finally, we used nested model comparisons, as described for Study 1, to test whether the associations between the SOS-S-G and the SOS-XS-G with the measures indicating convergent validity differed among the different time frames used in the instructions.

Results
Aim 1: Psychometric Properties of the SOS-S-G and the SOS-XS-G To test the factorial validity of the SOS-S-G and the SOS-XS-G, we used confirmatory factor analysis. As summarized in Table 3, the one-factor structure had an excellent fit for the SOS-XS-G and the correlated two-factor structure had an acceptable fit for the SOS-S-G. Furthermore, Cronbach's α was in a good range for both scales and for the two subscales of the SOS-S-G (Table 5). Finally, we evaluated convergent validity of the SOS-S-G and SOS-XS-G by examining zero-order correlations with symptoms of depression and anxiety (Table 5). For both versions of the SOS, the correlations were high for depression (.75 ≤ r ≤ .78) and anxiety (.70 ≤ r ≤ .73).
Aim 2: Time Frame Used in the Instructions
Using the above-described manipulation check, we found that only 71% of participants were able to correctly specify the time frame they had received in the instructions of the SOS. We restricted the following analyses to those participants who had passed our manipulation check (i.e., who correctly specified the time frame in the instructions of the SOS) to test whether the different time frames change the psychometric properties of the SOS among those participants paying attention to the time frame. As in Study 1, we found that Cronbach's α and associations of the SOS-S-G and the SOS-XS-G with other scales assessing convergent validity did not differ significantly among the different time frames in the instructions (see Table S10 in the supplementary materials).

Discussion
The present paper had two aims. First, we evaluated the psychometric properties of our German translations of the SOS and the SOS-S. We did not replicate the original two-factor structure in our German translations. Instead, we created and validated a new two-factorial short version of the SOS (SOS-S-G, which comprises different items than the English SOS-S) and a unidimensional extra-short scale (SOS-XS-G). Both scales showed good psychometric properties in our two studies. Second, we compared the psychometric properties of the SOS-S-G and the SOS-XS-G among different time frames used in the instructions. In two studies, we found no significant differences among the different time frames (apart from one significant effect for associations between the SOS-S-G and life satisfaction).

Psychometric Quality of the German SOS
Our German translation of the SOS did not replicate the two-factor structure for the long version of the SOS and the original SOS-S. The results indicated that some items did not clearly load on one subscale in our German translation. One potential reason for this result might be that our translation of the English items did not adequately reflect their actual meaning. In fact, during translation of the SOS into German, single items posed challenges for accurate translation. For instance, in English, the item "powerless" (Item 17) may be interpreted in the sense of a lack of authority or in the sense of a lack of force/strength. In German, there is no word simultaneously covering both meanings, so we had to choose between the two interpretations. In this particular case, we opted to translate the item in the sense of a lack of authority (German: "machtlos"). Similarly, the item "swamped by your responsibilities" (Item 8) in English describes a literal flooding. We could have translated it literally into German. However, in German, one commonly refers to a state of suffocating among tasks. Hence, we preferred a translation based on this German-specific idiomatic expression, thereby inducing a change in metaphoric modality (i.e., in our translated version, the item now refers to a gaseous modality, "als würden Sie in Aufgaben ersticken"). Finally, some of the items that represent a longer phrase in the English original were translated by using one word in the German translation (e.g., English: "like you were rushed," German: "gehetzt"). Conversely, some items that consist of only one word in the English original were translated using a longer phrase in the German version (e.g., English: "overcommitted," German: "als hätten Sie sich übernommen") since we perceived that more words were needed to convey the English meaning in German.
Ultimately, we cannot rule out that such slight differences between our translations and the English original version might have affected understanding of and thereby responses to the translated items. However, we relied on well-established recommendations for the translation of self-report measures that explicitly highlight the need to weigh verbatim translations against language-specific, idiomatic translations, and further cross-cultural adaptations (Beaton et al., 2000).
In this context, it should be further emphasized that problems in replicating the exact factor structure of the SOS in German may originate not only from purely translational issues but also from cultural differences in the perception of stress and in the stress-illness relation (e.g., Chun et al., 2006; Sinha & Watson, 2007) that have been reported previously (Han et al., 2022; Sinha & Watson, 2007). Potential moderating variables underlying cultural influences may be differences in coping styles, perceived locus of control, self-esteem, and social support, which particularly arise between individualistic and collectivistic societies (Kuo, 2013; Sinha & Watson, 2007). Hence, culture has been conceptualized as a groundwork for the perception and regulation of stressful states, thereby affecting both person and environment variables (Chun et al., 2006). Consequently, the personal vulnerability and event load scales of the SOS could be prone to cultural influences. Along these lines, translations of the SOS into other languages partly faced similar problems with replicating the two-factor structure. For example, Wilson et al. (2018) also found a poor model fit for the long version of the SOS for a translation into Setswana (South Africa).
To sum up, different items of the SOS may be best suited in different languages to capture the theoretically implied two-factor structure of the stress concept. In line with this assumption, we were able to replicate the two-factor structure for the SOS-S-G when using different items than in the English SOS-S. This approach resulted in good overall psychometric quality, which was comparable to existing self-report measures of stress. The SOS-S-G may even surpass other inventories in several aspects. First, adhering to the two-factor structure, it acknowledges theoretical underpinnings on the concept of stress more strongly than existing scales (Amirkhan, 2012). Second, compared to the few measures that already assess both environmental demands and personal resources, the SOS-S-G is significantly shorter and was validated in more heterogeneous samples for the English original version and for the present German version.
Furthermore, we developed the SOS-XS-G as a unidimensional self-report measure of stress. This extra-short scale waives the advantage of the theoretically derived two-factor structure for the sake of being even shorter by comprising only four items. We argue that the SOS-XS-G still captures the concept of stress quite broadly as it comprises items from both original subscales of the SOS and as it correlates strongly with both subscales of the SOS-S-G. In summary, the SOS-S-G and the SOS-XS-G enrich the assessment repertoire of self-report measures of stress as they are well-validated instruments that assess stress in line with its theoretical conceptualization (Lazarus & Folkman, 1984; McEwen, 2000) and are highly cost-effective, which makes them well suited to assessing stress in large-scale surveys in the general population. Our exploratory analyses in Study 1 indicated that, apart from poor fit in confirmatory factor analyses, our German translations of the SOS using the original scoring scheme (SOS-GO and SOS-S-GO) had neither better nor worse psychometric properties than our modified German versions. Thus, depending on the research goal, researchers may still decide to use the originally proposed items when using our German translations of the SOS. For example, the originally proposed versions might be preferred for cross-cultural comparisons, provided that the misfit of the SOS-GO and SOS-S-GO is taken into account.

Time Frame Used in the Instructions of the SOS
With one exception, we did not find evidence for effects of different time frames used in the instructions of the SOS-S-G and the SOS-XS-G on their psychometric properties in our first study. The results of Study 2 demonstrated that 29% of participants were not able to correctly recall the time frame used in the instructions of the SOS. This raises the question of which information was used by participants who failed the manipulation check when rating items of the SOS-S-G and the SOS-XS-G. Furthermore, our finding underlines the difficulty of ensuring that participants pay attention to the time frame given in the instructions, as common strategies, such as highlighting it in bold (as we did in the present study), may not be sufficient.
Restricting our analyses of Study 2 to participants who did pay attention to the time frame given in the instructions, we replicated the results from Study 1 and showed that changing the time frame in the instructions did not affect the psychometric properties of the SOS-S-G and the SOS-XS-G. On the one hand, this might be interpreted in favor of the SOS, underlining the robustness and the general validity of its psychometric properties. In line with this interpretation, one may conclude that researchers can flexibly adapt the time frames used in self-report measures of stress to match the needs of their studies (e.g., assess recent vs. chronic stress). On the other hand, it is conceivable that although participants perceived the time frame given in the instructions (as shown by passing our manipulation check), they may not have been able to apply this information when evaluating their stress level. As self-reports are prone to biases originating from current appraisal (Levine & Safer, 2002; Weckesser et al., 2019), self-reported stress may mainly reflect participants' current stress level independent of the time frame given in the instructions. Again, this poses a problem for understanding whether and how such a bias can be reduced. Theoretical approaches suggest that mental time travel, for instance, could help participants reaccess past emotional states (e.g., Debus, 2014). Moreover, one might reduce the bias by first explicitly asking participants to evaluate their current stress level (e.g., by means of an appropriate self-report measure) and subsequently instructing participants to consciously dissociate from this current emotional state when completing a following scale asking for past emotional experiences.

Limitations and Future Research
Our study had some limitations. First, although our samples were more heterogeneous than samples in other validation studies of stress measures (e.g., Lee, 2012), they were still not representative of the German adult population. Specifically, in both studies, the majority of participants were female and highly educated. Moreover, since recruitment for our studies was mainly conducted on social media, our samples may be restricted to participants being active online (e.g., on Facebook) and having regular internet access. Such limitations in sample composition might have given rise to certain biases (e.g., the amount and kind of stressors experienced by the participants; Cohen & Janicki-Deverts, 2012).
Second, our studies were conducted during the COVID-19 pandemic. Although data collection for both studies was completed in phases with rather minor restrictions in Germany (e.g., bars and restaurants were open, and larger sport events were allowed), we cannot rule out that the results generated in our final sample are impacted by this historical era. With regard to the two-factor structure of the SOS, the pandemic led to increases in environmental demands (captured by the event load subscale), for example, due to additional responsibilities for parents because of school closures (Cluver et al., 2020). Furthermore, the pandemic likely resulted in reductions in personal resources (captured by the personal vulnerability subscale), for example, due to financial insecurity or less social support during lockdowns (e.g., Cheng et al., 2021; Saltzman et al., 2020). Similarly, measures of criterion validity assessing stress-related symptoms may have been confounded with COVID-19-related stress. Thus, the historical era may be a reason why we did not entirely replicate findings of the original English versions.
Third, we could not replicate the two-factor structure of the long SOS in German, and our SOS-S-G comprises different items than the English SOS-S. Thus, comparisons with the literature based on English versions of the SOS might be impeded. As discussed, translational issues (i.e., deviations in meaning between English and German items originating from differences in language) may be a reason for the nonreplication of the original factor structure. For cross-cultural research, one might use the SOS-GO and the SOS-S-GO despite the misfit in the confirmatory factor analyses. The German versions of the SOS using the original items had psychometric properties similar to those of the SOS-S-G and the SOS-XS-G.
Fourth, we did not include any objective indicators of stress (e.g., cortisol levels) in our study. Thus, we were not able to examine the relationship between our German translations of the SOS and such objective stress markers (which has been done for the English version of the SOS; Amirkhan et al., 2015). Future research should, for example, address the relative contribution of objective stress assessments and the SOS in predicting important criterion variables, such as health symptoms.

Conclusion
Stress can have severe effects on mental and physical health, and it is important to measure stress accurately. The SOS has been introduced to overcome several weaknesses of existing self-report measures of stress. In this study, we validated a German translation of the SOS-S and developed a new extra-short scale (SOS-XS-G). Both scales show good psychometric properties and can be used to measure stress in large-scale samples. Furthermore, we showed that varying the time frames in the instructions overall did not change the psychometric properties of the scales.