Advancement of the German version of the moral distress scale for acute care nurses—A mixed methods study

Abstract

Aim: Moral distress experienced by nurses in acute care hospitals can adversely impact the affected nurses, their patients and their hospitals; therefore, it is advisable for organizations to establish internal monitoring of moral distress. However, until now, no suitable questionnaire has been available for use in German‐speaking contexts. Hence, the aim of this study was to develop and psychometrically test a German‐language version of the Moral Distress Scale.

Design: We chose a sequential explanatory mixed methods design, followed by a second quantitative cross‐sectional survey.

Methods: An American moral distress scale was chosen, translated, culturally adapted, tested in a pilot study and subsequently used in 2011 to conduct an initial web‐based quantitative cross‐sectional survey of nurses in all inpatient units at five hospitals in Switzerland's German‐speaking region. Data were analysed descriptively and via a Rasch analysis. In 2012, four focus group interviews were conducted with 26 nurses and then evaluated using knowledge maps. The results were used to improve the questionnaire. In 2015, using the revised German‐language instrument, a second survey and Rasch analysis were conducted.

Results: The descriptive results of the first survey's participants (n = 2153; response rate: 44%) indicated that moral distress is a salient phenomenon in Switzerland. The data from the focus group interviews and the Rasch analysis produced information valuable for the questionnaire's further development. Based on the data from the second survey's participants (n = 1965; response rate: 40%), the Rasch analysis confirmed that the earlier psychometric deficiencies had been eliminated. A Rasch‐scaled German version of the Moral Distress Scale is now available for use.


| Background
The literature describes moral distress in nurses as a phenomenon occurring where a nurse knows what action would be correct based on his or her professional ethical principles, but for various reasons is unable either to act in accordance with these principles or to prevent potential harm (Epstein & Hamric, 2009;Fenton, 1988;Wilkinson, 1987/88). In contrast to other forms of stress, affected nurses feel that their personal moral integrity is threatened or harmed, leading them to experience moral distress (Corley, 2002;Hardingham, 2004;Webster & Baylis, 2000).
In 2010, when this study was under development, no German-language conceptual definition existed for moral distress as experienced by nurses. For this reason, we settled on the following literature-based working definition (translated from German): Moral distress describes the burden felt by a nurse who believes he or she knows what the professionally ethical behaviour would be in a particular care situation but, due to impediments, is unable to act accordingly (Kleinknecht-Dolf et al., 2014; Spirig et al., 2014).
According to this definition, the principles and values associated with moral distress are of the utmost importance. For this reason, the professional ethical principles delineated by the Swiss Association of Nurses served as the foundation for our definition (Schweizer Berufsverband der Pflegefachfrauen und Pflegefachmänner (SBK), 2013):
Professional ethical principles describe the objective to offer professional, high-quality, safe and equitable care. Patients shall be protected from harm, their needs, preferences and resources shall be respected and they shall be supported in reaching their health-related goals (Kleinknecht-Dolf et al., 2014).
Professional ethical values are embedded in cultural and contextual factors (Clark, 1997;Horton, Tschudin, & Forget, 2007). It follows that this is also the case for the ethical decision-making associated with moral distress and its impact on personal experience (Goethals, Gastmans, & Dierckx de Casterle, 2010;Varcoe, Pauly, Webster, & Storch, 2012).
Individual factors, factors relating to the work environment as well as those relating to a particular practice setting may trigger moral distress (Hamric, Davis, & Childress, 2006). Whether or to what extent a nurse experiences moral distress depends primarily on his or her moral resilience (Lützén & Ewalds-Kvist, 2013;Monteverde, 2014;Rushton, 2016).
Depending on the effectiveness of the affected nurse's coping strategies, moral distress may lead to either psychological or physical symptoms (Hamric, Borchers, & Epstein, 2012; Huffman & Rittenmeyer, 2012; Schreuder et al., 2012). Additionally, the sense of burden can lead to job dissatisfaction or to the desire to leave the position or even the profession (Huffman & Rittenmeyer, 2012; Rushton, Kaszniak, & Halifax, 2013). Affected nurses may also withdraw emotionally from patient interactions and relationships in an effort to protect themselves (De Villers & DeVon, 2013; Evanovich Zavotsky & Chan, 2016; Whitehead, Herbertson, Hamric, Epstein, & Fisher, 2015). This may manifest itself as intolerance towards patients or the avoidance of certain interventions (Corley, 2002; Gutierrez, 2005; Hamric et al., 2006), negatively impacting the quality of treatment and care.
Considering the effects of moral distress on the nurses affected by it, their patients and their organizations, the literature recommends internal monitoring of situations that commonly trigger moral distress (American Association of Critical-Care Nurses (AACN), 2008; Pendry, 2007;Wilson, Goettemoeller, Bevan, & McCord, 2013). At the time this study was developed, no German-language instrument had been published for institutional measurement of moral distress amongst nurses in acute care hospitals.
Hence, this study's aim was to develop an easily understandable, valid instrument for measuring and monitoring moral distress amongst nurses on inpatient units in Swiss acute care hospitals.

| Design
A mixed methods design was chosen, starting with an initial cross-sectional survey, followed by a qualitative, explanatory substudy and a second cross-sectional survey (Creswell & Plano Clark, 2006). This type of design is well suited for developing a conceptual understanding both of particular phenomena and of the instruments used to measure them (Creswell, Plano Clark, Gutmann, & Hanson, 2003; Onwuegbuzie & Collins, 2007). To adapt the English-language moral distress scale to our needs, we decided on this sequential explanatory design (Creswell & Plano Clark, 2006; Ivankova, Creswell, & Stick, 2006).
During the study's initial development phase, an established moral distress scale for nurses was identified in the literature, translated and tested via a pilot study (preparation). The quantitative phase that followed (quantitative phase I) consisted of a web-based cross-sectional survey carried out using questionnaire version 1. Based on the results, focus group interviews were carried out in the qualitative study phase (qualitative phase I). The quantitative and qualitative results were then systematically integrated and interpreted (Creswell & Plano Clark, 2006; Zhang & Creswell, 2013). The information gained was used to refine the German-language version of the questionnaire into version 2 (Creswell, Klassen, Plano Clark, & Clegg Smith, 2011; Greene, Caracelli, & Graham, 1989), which was used for a second web-based quantitative survey (quantitative phase II). Figure 1 shows the study's sequences.

| Methodological considerations of questionnaire development
Carried out in five hospitals in Switzerland's German-speaking region, our research was part of a larger study aimed at developing a tool to monitor nursing-relevant context factors in hospital work environments. One of the monitoring model's underlying context factors is moral distress. In our planning phase, we identified an established American instrument for measuring this factor in nurses in acute care hospitals. Consequently, while developing the questionnaire, our focus was on producing an accurate translation, adapting it culturally, modifying its content as necessary and finally, testing the German-language version's psychometric properties. Because moral distress is a latent variable and we intended to produce an interval scale, we carried out a Rasch analysis as an alternative to the processes suggested by classical test theory (van Alphen, Halfens, Hasman, & Imbos, 1994; DeVellis, 2012). Rasch analysis belongs to the family of item response theory models and is used in constructing interval-scaled measures of latent traits (Hagquist, Bruce, & Gustavsson, 2009). To determine face validity, the translated and modified questionnaire was submitted several times to an expert panel. Construct validity was assessed by analysing participant results (Bannigan & Watson, 2009; DeVon et al., 2007).
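To illustrate why a Rasch model yields an interval scale, the following minimal sketch shows the simplest (dichotomous) form of the model. It is an illustration only: the items in this study are polytomous, the analysis was run in RUMM2030, and the function names and parameter values below are assumptions introduced for the example. In the model, the log-odds of endorsing an item equal the difference between the person location (theta) and the item location (delta), so a difference of one logit between two persons means the same thing on every item.

```python
import math


def rasch_probability(theta: float, delta: float) -> float:
    """Probability of endorsing an item under the dichotomous Rasch model.

    theta: person location (e.g. level of the latent trait), in logits
    delta: item location (item difficulty), in logits
    """
    return 1.0 / (1.0 + math.exp(-(theta - delta)))


def log_odds(p: float) -> float:
    """Convert a probability into log-odds (logits)."""
    return math.log(p / (1.0 - p))


# Hypothetical values: two persons one logit apart, two items of different difficulty.
for delta in (-0.5, 1.2):
    low = log_odds(rasch_probability(0.0, delta))
    high = log_odds(rasch_probability(1.0, delta))
    # The person difference in logits is the same for both items.
    print(f"item at {delta:+.1f}: person difference = {high - low:.3f} logits")
```

The constant person difference across items is what is meant by an interval-scaled measure of a latent trait; raw sum scores do not have this property in general.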

| Preparation
The objective of this preparation phase was to determine whether a well-examined and established instrument for assessing moral distress in nurses in acute care hospitals already existed that we could use as a template for our instrument.

| Choosing a questionnaire
Following an extensive literature search in autumn 2010 on the concept of moral distress in nurses at acute care hospitals and the associated instruments, Hamric's version of Corley's "Moral Distress Scale" (MDS) was chosen (Corley, 1995;Hamric & Blackhall, 2007).
Of the scales identified, the MDS conformed most closely to our working definition of moral distress. It was also the one most studied and most widely used with nurses in acute care hospitals. Measured by Cronbach's alpha, its internal consistency was 0.83 (Hamric & Blackhall, 2007).
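For readers unfamiliar with the coefficient, the sketch below shows how Cronbach's alpha is commonly computed from item responses. It is a minimal illustration under stated assumptions: the response matrix is invented for the example and has no connection to the data behind the reported value of 0.83, and the function name is hypothetical.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items matrix of scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(rows[0])  # number of items
    item_variances: List[float] = [
        variance([row[i] for row in rows]) for i in range(k)
    ]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of five nurses to three items rated 0-4.
responses = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 3, 2],
    [3, 3, 4],
    [4, 4, 3],
]
print(round(cronbach_alpha(responses), 2))
```

Values closer to 1 indicate that the items vary together, which is usually read as higher internal consistency.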

| Translation and adaptation of the questionnaire (version 1)
After obtaining the authors' consent for use of the MDS, an expert panel of three clinical nurse specialists reduced the questionnaire, which was originally designed for use in intensive care, from 21 items to nine, adopting only the questions relevant to all medical specialties. The remaining nine items were then translated into German using standard methods for research translations (Jones, Lee, Phillips, Zhang, & Jaceldo, 2001; Martin, Vincenzi, & Spirig, 2007).
We then supplemented the translated questionnaire with one additional item pertaining to professional ethical behaviour. The rationale behind this addition was that work-related moral distress in nursing is inextricably linked, at the conceptual level, to the relevance of nurses' professional ethical values (Bentzen et al., 2013; Corley, 2002). Each of the 10 items was then assessed by 10 clinical nurse specialists for importance, comprehensibility and feasibility.

| Questionnaire design
To aid participants' understanding, in addition to its questions, our MDS included brief definitions of professional ethical principles and moral distress.
For the item on the importance of professional ethical principles in daily practice, the frequency had to be indicated on a 5-point verbal rating scale (0 = never to 4 = very often).
Similarly, each of the nine items on moral distress used the same verbal rating scale response format to assess frequency. In addition, for each of the nine items on moral distress, participants assessed their levels of disturbance on a second 5-point verbal rating scale (0 = none to 4 = very high). For items describing situations the participants had never experienced (frequency = 0), they were asked to indicate hypothetical levels of disturbance. In accordance with Hamric and Blackhall's guideline, it was also specified that the reporting period for each item covered the previous 12 months (Hamric & Blackhall, 2007).

| Pilot study
In April 2011, a pilot study involving 294 nurses was conducted in eight units of one of the participating hospitals. The aim was to assess the comprehensibility and apparent content validity of the questionnaire. The details of the procedure and the results have

| Objective
The objectives of this sequence were to collect data about the relevance of the professional ethical principles as well as the frequency of occurrence and the related burden of moral distress in nursing practice. In addition to measuring moral distress amongst nurses in acute care hospitals, the goal of the first cross-sectional survey was to test the psychometric properties of version 1 of our MDS.

FIGURE 1 Flow chart of the sequential explanatory design procedures with repetition of the quantitative cross-sectional survey in accordance with Ivankova et al. (2006)

| Participants and procedure
In November 2011, all RNs and clinical nurse specialists (n = 4950) involved in direct patient care in inpatient units (n = 204) at three university hospitals and two cantonal hospitals were invited to fill out the questionnaire. The web-based cross-sectional survey was conducted according to current European guidelines for "Good Clinical Practice" (European Medicines Agency, 2002). Details of the procedure are described in an earlier publication (Kleinknecht-Dolf, Spichiger, et al., 2015).

| Data analysis
A descriptive data analysis was carried out in spring 2012 using SPSS, Version 18 (SPSS INC, 2009). For psychometric testing, the items including disturbance assessments were subjected to a Rasch analysis using RUMM2030 (Andrich, Lyne, Sheridan, & Luo, 2010).
For this analysis, we used only the responses of nurses who had actually experienced the given moral distress-inducing situations (frequency > "never").
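The selection rule described above can be sketched as follows. This is an illustrative sketch only, not the procedure used in SPSS or RUMM2030: the function names and the example respondent are assumptions, and only the coding of the frequency scale (0 = never) is taken from the questionnaire description.

```python
from typing import List, Optional

NEVER = 0  # frequency code for "never" on the 0-4 verbal rating scale


def disturbance_for_analysis(frequency: int, disturbance: Optional[int]) -> Optional[int]:
    """Keep a disturbance rating only if the situation was actually experienced."""
    if frequency == NEVER:
        return None  # treated as missing, even if a hypothetical rating was given
    return disturbance


def filter_respondent(
    frequencies: List[int], disturbances: List[Optional[int]]
) -> List[Optional[int]]:
    """Apply the rule item by item to one respondent's answers."""
    return [disturbance_for_analysis(f, d) for f, d in zip(frequencies, disturbances)]


# Hypothetical respondent: item 1 never experienced, but rated hypothetically as 3.
print(filter_respondent([0, 2, 4], [3, 1, 4]))
```

Under this rule, the hypothetical rating on the first item is set to missing, so the analysis reflects only disturbance from situations the respondent has encountered.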

| Objective
The objective of this phase was to gain further insight into the constituent elements of the concept of moral distress in the given context of nursing practice, and to obtain more detailed information for interpreting the results of quantitative phase I, to deepen our understanding of the concept.

| Participants and procedure
Drawing on the results of the quantitative data analysis, four focus group interviews were carried out in the autumn of 2012. The focus groups' 26 members included RNs, clinical nurse specialists and unit managers from one of the study's participating university hospitals.
To be included, prospective participants had to have participated in the quantitative survey. Participants were recruited via an invitation circulated internally in such a way that all specialty fields were represented. This type of purposive sampling is described in the literature in connection with studies using mixed methods designs for the development of concepts or instruments (Greene et al., 1989;Teddlie & Yu, 2007). The procedure for conducting our focus group interviews is described in an earlier publication (Kleinknecht-Dolf, Haubner, Wild, & Spirig, 2015).

| Method
Each focus group interview was moderated by two researchers, following an interview guideline based on the quantitative results of quantitative phase I. In addition to discussing the importance of professional ethical principles in clinical practice, focus group participants were asked to consider the roots of moral distress. We hoped to learn, for example, whether the questionnaire fully and comprehensively described all of the most important situations that could trigger moral distress. Regarding the quantitative results, one target outcome was the groups' explanation for instances where event frequencies for an item were equal but participants indicated widely different levels of disturbance. The focus group interviews were audio recorded and field notes taken.

| Data analysis
During the focus group interviews, in addition to the moderators, a third researcher was present to analyse the participants' statements on an ongoing basis and to depict them as knowledge maps. In the focus group interview context, analytical knowledge mapping delivers a map that highlights essential terms or topics and the relationships between them as they arise (Ebener et al., 2006; Pelz, Schmitt, & Meis, 2004). The focus group participants assessed the knowledge maps at the end of each interview for completeness and accuracy. Via qualitative content analysis, each knowledge map was reduced to its core categories (Mayring, 2008). All main points of each knowledge map were compared and generalized. The generalizations were then reduced further to yield core categories. The field notes were used to better understand the points in the context in which they arose.

| Objective
The objective of this study sequence was to systematically integrate the results of the quantitative and qualitative phase I to strengthen our knowledge of the concept of moral distress as well as to obtain information for the further development of the questionnaire.
Immediately following the analysis of the qualitative data in summer 2013, the integration of the quantitative and qualitative results (with respect to the field notes) began. To guide the process of integration, additional research questions were formulated (Farmer, Robinson, Elliott, & Eyles, 2006). To answer these, the qualitative results were

| Development of version 2 of the questionnaire
Based on the insights gained through the integration process, the questionnaire was reviewed for content and all items were examined semantically. The questionnaire was revised beginning in winter 2013. In summer 2014, version 2 of our MDS was submitted for critical review to the same expert panel that had examined version 1.

| Objective
The objectives of this phase were to repeat the cross-sectional survey of phase I, that is, to measure moral distress among nurses in the same acute care hospitals 4 years later, and to test the psychometric properties of version 2 of our MDS.

| Participants and procedure
In November 2015, for the second cross-sectional survey, version 2 of our MDS was presented to all RNs and those clinical nurse specialists involved in direct patient care (n = 4867) in all inpatient units (n = 189) of the three university hospitals and the two cantonal hospitals. As with the data collection in quantitative phase I, the questionnaire was administered in electronic form and a web-

| Data analysis
Descriptive data analyses were carried out using SPSS, Version 22 (IBM Corporation, 2013). Again, the items relating to disturbance underwent a Rasch analysis using RUMM2030 (Andrich et al., 2010).
This process incorporated all responses of the participating nurses who had actually experienced the listed moral distress-inducing situations (frequency > "never").

| Quantitative phase I
The final survey received responses from 2153 nurses (response rate 44%). The participants' sociodemographic data are shown in Table 1. The descriptive quantitative results are shown in Table 2.
These results are described in detail in an earlier publication (Kleinknecht-Dolf, Spichiger, et al., 2015). The Rasch analysis indicated differential item functioning (DIF) of several items. This means that some subgroups of nurses responded in a different manner to these items despite equally severe levels of moral distress. The analysis also showed that participants could not differentiate sufficiently between the response options of disturbance, which resulted in disordered thresholds, that is, the failure of respondents to use the response options in a way consistent with the level of distress being measured. In addition, the targeting was not optimal. There was a lack of very difficult items, that is, situations that are assessed as not being so distressing even by highly morally stressed persons. The results of the tests on unidimensionality and on local independence of items were satisfactory, as was the Person Separation Index (PSI), an index frequently used in Rasch analysis that is similar to Cronbach's alpha.
The results of the Rasch analysis gave us valuable indications for improving the wording of the statements and the response scales. Table 3 shows the sociodemographic data of the 26 focus group participants interviewed following the quantitative data analysis.
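The notion of disordered thresholds can be made concrete with a small sketch. It assumes a generic polytomous Rasch (partial credit) formulation for a 0-4 response scale; the exact parameterisation used by RUMM2030 is not given in this article, and the threshold values and function names below are invented for illustration. When thresholds are ordered, each response category is the most probable one somewhere along the latent continuum; when they are disordered, at least one category is never the most probable.

```python
import math
from typing import List, Set


def category_probabilities(theta: float, thresholds: List[float]) -> List[float]:
    """Category probabilities for one item under a partial credit model.

    theta: person location in logits
    thresholds: one threshold per step between adjacent categories, in logits
    """
    logits = [0.0]  # category 0 is the reference category
    for tau in thresholds:
        logits.append(logits[-1] + (theta - tau))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]


def modal_categories(
    thresholds: List[float], lo: float = -4.0, hi: float = 4.0, steps: int = 81
) -> Set[int]:
    """Categories that are the most probable response at some grid point of theta."""
    seen: Set[int] = set()
    for i in range(steps):
        theta = lo + (hi - lo) * i / (steps - 1)
        probs = category_probabilities(theta, thresholds)
        seen.add(max(range(len(probs)), key=probs.__getitem__))
    return seen


# Hypothetical thresholds for a five-category (0-4) disturbance scale.
ordered = [-2.0, -1.0, 1.0, 2.0]
disordered = [-2.0, 1.0, -1.0, 2.0]
print(sorted(modal_categories(ordered)))
print(sorted(modal_categories(disordered)))
```

With the disordered thresholds in this example, category 2 is never the most probable response at any person location, which is one way of seeing that respondents are not using the categories in line with the underlying level of distress.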

| Qualitative phase I
Most focus group participants described the questionnaire as generally comprehensible and agreed that the items' content was both important and semantically applicable. However, several noted that certain statements were imprecisely formulated or difficult to understand. Regarding completeness, participants mentioned that the questionnaire omitted several important moral distress-inducing situations. Specifically, they cited non-collegial collaboration, dependence on inadequate orders from physicians and the informal assumption of responsibility for other hospital workers' tasks.
Regarding the adequacy of the 5-point response scale for frequency and level of disturbance, the participants explained that the assessment of how frequently a given situation occurs is highly dependent on a subjective evaluation of that situation's potential impacts. Hence, identical responses to different statements do not necessarily convey the same degree of frequency. Added to this, the employment status of the person making the assessment and the size of the unit also played roles. Therefore, several participants recommended making the response categories less subjective.
Similarly, the participants described their perceptions of the level of disturbance as dependent not only on their subjective assessment of the risk involved but also on the degree to which their own moral integrity was threatened or harmed. They also emphasized that their perception of disturbance could depend, for example, on the extent to which they were constrained from taking action, on their own state of health, on work pressures, their mood, or the length of their current sequence of working days. Here also, the focus group noted that identical values for disturbance did not convey identical meaning for each item. For this reason, they suggested that the disturbance scale should also include more specific assessment terms.
Regarding the relationship between the frequency with which a stress-inducing situation occurs and the intensity of the disturbance associated with it, the participants described various viewpoints. They explained that, in cases where it is possible to cope with a particular situation, it is possible to keep the level of disturbance from increasing, even if the situation is ongoing or escalates. However, if this coping ability is not learned, the level of disturbance due to moral distress may increase. A more extensive description of the results of the focus group interviews can be found in a previously published article (Kleinknecht-Dolf, Haubner, et al., 2015).

| Results of integration of the quantitative and qualitative results of phase I
Regarding interpretation of the quantitative results on moral distress, the qualitative focus group data revealed that the degree of frequency assigned to a particular item does not correlate consistently with the degree of distress engendered. For the same reason, the levels of disturbance assigned to different cases are not directly comparable. This fact complicates the interpretation of the results.
Applying an integration matrix confirmed that the statements on the questionnaire are applicable and relevant. However, additional items are required to cover non-collegial collaboration, inadequate physicians' orders and informal assumptions of responsibility for other hospital workers' tasks. Table 4 shows an example of the integration procedure that led to these results. Regarding the response scale for frequency and level of disturbance, our integration matrix indicated that improvements to the response category descriptions would ease the response process and improve the validity of the results. The resulting integration and corresponding results are shown in Table 5.

| Version 2 of our MDS
The insights gained from the integration process were used to refine all items semantically. In addition, three items (Items 10, 11 and 12) were added to the questionnaire. Finally, each response  We refined the wording of the scale for assessing the level of disturbance. Additionally, to each distress level response category, we added a "smiley" icon corresponding to that particular level of disturbance. While the idea to add "smileys" was proposed in our focus group interviews, other studies have also found that adults readily accept smiley icons as a graphic aid for numerical values in scales recording latent variables such as pain (Jäger, 2004;Wong & Baker, 2001).
After these adaptations were in place, version 2 of our MDS was once again assessed by a statistician and an expert panel. Based on discussions regarding these assessments, the length of the retrospective assessment was shortened from 12 to 3 months. Also, considering our new insights, we concluded, as did the original authors of the MDS (Corley, Elswick, Gorman, & Clor, 2001), that in cases where frequency was assessed at 0 ("never"), the assessment scale for disturbance should be left blank (value = missing). This technique limits assessment to disturbance arising from situations actually experienced by the respondents. The structure and revised items of version 2 of our MDS are shown in Table 6.

| Quantitative phase II
In total, 1965 nurses (response rate: 40%) took part in the survey using version 2 of our questionnaire. The sociodemographic data of the participants are shown in Table 1.
The Rasch analysis of the revised items showed that all items functioned as intended (no DIF remained) and that participants used the response categories in a consistent way. Once again, the results of the tests on unidimensionality and on local independence of items were good, as was the PSI. Targeting was also improved.

| DISCUSSION
This study's aim was to develop a comprehensible and valid German-language instrument to measure moral distress in nurses at acute care hospitals. The results of the individual phases of our mixed methods research indicate that the strategy chosen fulfilled this aim. Following translation of Hamric's MDS (Hamric & Blackhall, 2007), our addition of several items, including one on the importance of professional ethical principles, was judged adequate and appropriate by an expert panel of clinical nurse specialists, as the additions strengthen the face validity of the item set chosen (Houser, 2008).
The results of our pilot study in April 2011 showed that the translated items were fundamentally comprehensible and relevant. As an indication of the questionnaire's construct validity, the results produced by our MDS correlated with the responses expected from the participating nurses (Wampold, Davis, & Good, 1990).
Response rates of 44% (2011) and 40% (2015) for the two cross-sectional surveys are common for this type of web-based cross-sectional survey (Cook, Heath, & Thompson, 2000). A response rate of at least 40% is considered a prerequisite for obtaining reliable evidence on the unit level (Kramer, Schmalenberg, Brewer, Verran, & Keller-Unger, 2009).
Both item use and response variability are important indicators of questionnaire quality (DeVellis, 2012). Given that, for all items of our MDS, the full range of offered response options was used in both cross-sectional surveys, with reasonable variation between respondents, the response categories represent diverse subjective assessments for the individual item statements and are adequately sensitive within the various scopes of application.
In line with similar studies, the quantitative results of our 2011 survey showed that professional ethical principles play a key role in all areas of routine nursing, with a pronounced influence on nursing practice (Bentzen et al., 2013; Kangasniemi, Pakkanen, & Korhonen, 2015). Supporting corresponding results of other studies using Hamric's MDS, the results for the nine items on moral distress-inducing situations that we selected from it show that moral distress is experienced in all practice areas, sometimes to a high degree (Fernandez-Parsons, Rodriguez, & Goyal, 2013; Hamric et al., 2012). However, the interpretation of these quantitative results is limited by the fact that respondents also assessed levels of disturbance for situations that they did not actually experience. These hypothetical assessments distort the levels of disturbance indicated by those nurses actually affected by moral distress and complicate discussions on possible measures to prevent or reduce moral distress. For this reason, we decided to follow the former application guideline set by the original authors of the MDS and to abandon hypothetical answers of disturbance (Corley et al., 2001).
The results of the Rasch analysis of the 2011 cross-sectional survey data provided important material with which to refine our initial translation of the MDS. In contrast to classical test theory, Rasch analysis is well suited to developing questionnaires involving latent constructs (van Alphen et al., 1994; Hagquist et al., 2009). The Rasch analysis indicated the need for items that even nurses with high levels of disturbance did not assess as particularly disturbing. It also indicated that the formulation of certain existing items needed revision.
Our qualitative results confirmed that the items selected from

The meaning of frequency assessments (e.g. Assessment 3) may vary for every statement, even where different respondents' scores are identical. Individual valuations are dependent on the details of the situation and the potential patient harm or associated suffering. As one participant expressed it: "There are cases where just 1 occurrence would be frequent, because it is simply never supposed to happen". The assessment of frequency is dependent on the level of disturbance felt by the nurse affected and their individual disposition. Several participants observed: "Depending on the situation I'm in at that moment … it's more obvious to me, I notice it much more". The participants observed that the assessment of frequency also depends on the extent of their professional experience and for this reason, specific information on frequency would be helpful. As one participant explained: "Is it 10 times in the last 6 months, is it 5 times … is it 100 times … for me, often is if it happened more than 20-30 times, for others it would be 5 times".
Regarding the assessments indicated on the questionnaire's response scale, differences can arise from the characteristics of the individual, the content and the situation. At this point, the formulation of the response scales is not precise enough to allow participants to determine which value they should choose for their assessment. The assessment of disturbance is dependent on the work environment, the individual's disposition and the respondent's assessment of potential patient suffering or harm. In addition, the assessment of disturbance depends on how that particular situation is handled. One participant described the relationship as follows: "Can I come to terms with it? Something might happen just twice … but have so grave an impact that I can't cope with it. On the other hand, I know that every 6 months a new assistant will arrive. That's annoying, but that's just the way it is". Depending on the situation, low frequency can be accompanied by a high level of disturbance or vice versa. One participant said: "Sometimes if I have to provide care that I don't feel I'm qualified to deliver, that is very disturbing for me. It's just the opposite if I have to just follow a medical order and repeat a CTG, for example… I have to do that a lot, but it doesn't bother me because I know, it's ok, the CTG". Also, the participants explained that although situations may be identical, they are not always equally disturbing, as the level of disturbance fluctuates. For example, it also depends on the number of days the nurse is working. Several participants said that more specific information on disturbance would help with the assessment and that, for example, the assessment tool for pain could be a good resource.
Regarding the assessments made using the response scale, some differences were based on the individual, the content and the situation. At this point, the formulation of the response scales is not precise enough to allow the participants to determine which value they should choose for their assessments. The results of this response scale are a fluid, subjective assessment of the situation affected by the individual's current state of mind and working conditions, as well as their ability to recall. This makes interpretation of the results more difficult.
The response scale needed to be brought into a format that was inter-subjectively comprehensible and made filling it out as clear and self-evident as possible.
To minimize unreliability and distortion in recalling when and how often the situations were experienced, the observation period needed to be shortened.
"Please indicate how often you encountered each of the following situations in the past 3 months and the level of disturbance you felt from each. If you have never experienced a particular situation, please choose the value 0 ('never') for 'Frequency' and do not fill out the column for 'Level of Disturbance'".
Variations in respondents' moral resilience or coping mechanisms may partially explain why the frequency/disturbance relationship manifests itself in opposite directions. However, our focus groups emphasized that the moral distress-inducing situations listed are interpreted and assessed in the context of potential adverse effects on specific patients; that is, across different care contexts, identical frequencies can yield different levels of distress. For this reason, response values that are identical do not necessarily mean the same thing. Therefore, it is essential that the qualifiers provided in the response scales be as unambiguous as possible. This observation is also a strong argument for why the individual responses on frequency and disturbance should not be combined into one mathematical product that is then totalled to create an overall score intended to express the overall level of disturbance.
Several studies on the MDS describe this algorithm for calculating an overall score (Hamric et al., 2012;Lazzarin, Biondi, & Di Mauro, 2012;Wiggleton et al., 2010). In contrast, our Rasch analysis showed that it is possible to generate an interval-scaled Rasch score just from the individual responses on disturbance that represents the overall level of disturbance, making it possible to compare the total scores of individual nurses while accounting for item difficulties. The results of the frequency scale can be used to express the prevalence of each listed situation and to monitor its occurrence. Several studies have shown the use and usability of similar MDSs (Borhani, Abbaszadeh, Nakhaee, & Roshanzadeh, 2014;Kleinknecht-Dolf et al., 2014;Piers et al., 2012).
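The concern about the product-based overall score can be illustrated with a minimal sketch. It assumes that each item is rated from 0 to 4 on both scales and that a situation never experienced contributes 0 to the total; the function name and the two respondents are hypothetical and are not taken from the study data.

```python
from typing import Sequence


def composite_score(frequency: Sequence[int], disturbance: Sequence[int]) -> int:
    """Return the sum over items of frequency x disturbance.

    Illustrative only: assumes both ratings use a 0-4 scale and that
    items never experienced are coded 0 on both scales.
    """
    if len(frequency) != len(disturbance):
        raise ValueError("frequency and disturbance must have the same length")
    return sum(f * d for f, d in zip(frequency, disturbance))


# Hypothetical respondent A: one situation, rare but gravely disturbing.
rare_but_grave = composite_score(frequency=[1, 0, 0], disturbance=[4, 0, 0])

# Hypothetical respondent B: one situation, frequent but barely disturbing.
frequent_but_mild = composite_score(frequency=[4, 0, 0], disturbance=[1, 0, 0])

# Both response patterns collapse to the same total.
print(rare_but_grave, frequent_but_mild)
```

Under these assumptions the two respondents receive the same total although their experiences differ, which is the ambiguity described by the focus group participants. The sketch does not reproduce the Rasch scoring, which requires a fitted measurement model.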
Overall, our findings show that, through the integration of quantitative and qualitative results and in accordance with the literature, we were able to add materially to previous knowledge of the concept of moral distress, as well as to improve the structure and content of the associated questionnaire for our study context (Creswell et al., 2011). Repetition of the Rasch analysis using the data from the second cross-sectional survey showed substantial improvements to our MDS version's psychometric properties, making it suitable for future cross-sectional surveys of nurses in acute care hospitals.

| Limitations
Our study has various limitations. Although the quantitative results are based on surveys at each of the five hospitals participating in the study, resource constraints dictated that the qualitative data had to be gathered from focus group interviews held at only one. The extent to which that hospital's nurses represent the views of those in the other four and the extent to which the results may be applicable to them are debatable.
Furthermore, although the response rate was within a reasonable range for this type of study, we know nothing about how the nurses who did not take part think about or experience moral distress and the situations associated with it. It must also be noted that interpretation of these results may be limited by social desirability bias regarding professional ethics, with a corresponding distortion of the results (Holmes, 2009).

| CONCLUSIONS
The results reported here form a compelling argument that moral distress should be incorporated into the monitoring of nursing-relevant context factors in hospital work settings. The chosen mixed methods design benefitted us considerably in developing our questionnaire on moral distress and provided a theoretical foundation on which, with the help of the Rasch analysis, we calculate an overall score in the form of an interval scale. In future studies relating to ongoing monitoring of nursing-relevant factors of the hospital work setting, this will increase the MDS's usefulness. Finally, by supporting nurse managers in developing appropriate interventions to reduce the incidence, severity and consequences of moral distress, its results will help improve the quality of the work environment and nursing care.

TABLE 6 The structure and items of Moral Distress Scale version 2 on measuring moral distress in nurses at acute care hospitals

ACKNOWLEDGEMENT
We thank all the nurses and nurse managers at the hospitals that kindly agreed to participate in this study. We also thank all the funding