Rasch Analysis of Lebanese Nurses’ Responses to the EIS Questionnaire

This study examined the psychometric characteristics of a 32-item modified version of the Ethical Issues Scale (EIS). Data were collected from 59 registered nurses at the American University of Beirut Medical Centre (AUBMC). Data were analyzed using WINSTEPS Rasch analysis software. The four-category EIS rating scale needs modification for future studies in Lebanon. All EIS scale items need rewording prior to translation into Arabic to avoid confusion among Lebanese nurses. Principal component analysis (PCA) of residuals indicated the possible presence of additional dimensions. Additional EIS items are needed to improve targeting.


Introduction
Researchers investigating ethical dilemmas in nursing practice in new settings can choose between two broad strategies. They can develop new measures responsive to local needs, or they can adapt existing measures to local requirements. Developing new measures is time-consuming, costly, and can duplicate effort unnecessarily. In countries where funds for research conducted by nurses are scarce, the second strategy is more cost-effective. However, when measures have been developed in a different language for use in a dissimilar culture, investigators face the challenge of determining appropriateness. We faced this challenge as we worked toward validating a measurement scale for investigating ethical dilemmas experienced by registered nurses (RNs) in Lebanon. We are aware from our different backgrounds (English/Australian, Lebanese) that languages, religions, ethical assumptions, and knowledge of human rights principles vary among cultures. As a result, we resisted the temptation to choose a measuring instrument solely on the basis of validity and reliability reported in Western countries. Instead, we examined the cultural fit and psychometric properties of the 32-item Ethical Issues Scale (EIS survey questionnaire; Damrosch & Fry, 1993; Fry & Duffy, 2001) to determine its suitability (Switzer, Wisniewski, Belle, Dew, & Schultz, 1999) for a national study we intend to conduct in Lebanon.

Background
Lebanon is a small country on the eastern shore of the Mediterranean Sea, best known for its beauty and for its troubled history since independence from France in 1943. The official language of the population of 4 million is the Lebanese dialect of Arabic, although English and French are widely spoken.
The Lebanese health care system is predominantly hospital based. The American University of Beirut Medical Centre (AUBMC) has a high percentage of Lebanese physicians educated in the United States and is accredited by the Joint Commission International (JCI). AUBMC's nursing services are Magnet designated by the American Nurses Credentialing Center. Two other private medical centers have JCI accreditation. The Ministry of Public Health accredits hospitals in the public sector.
Approximately 6,000 nurses are licensed by the Lebanese Order of Nurses. University schools of nursing offer baccalaureate degrees and master's degrees. Most licensed nurses working in hospitals other than academic medical centers have a technical rather than a university education. All hospitals employ assistant nurses. The proportion of the workforce made up of assistant nurses depends on the location of the hospital. Hospitals outside Beirut are poorly funded and employ a higher proportion of assistant nurses.
We are preparing to conduct a national study to investigate the ethical dilemmas Lebanese nurses face in daily practice. We identified the EIS survey questionnaire as a possible research instrument and conducted an initial study to assess its suitability for translation into Arabic.

The Rasch Measurement Model
The Rasch Measurement Model was developed by the Danish mathematician Georg Rasch (1960/1980) and subsequently modified, applied, and developed by Wright (1977), Andrich (1978), Wright and Masters (1982), Masters (1982), and Wright and Stone (1999). The model conceptualizes responses to assessment questions or questionnaire items as a special case of the generalized linear model (GLM). The Rasch model specifies the probability of correct responses to test items and strength of endorsement of rating scale items. The ability of respondents (aptitude for selecting correct answers/tendency to endorse rating scale items) and the difficulty of items (probability of a correct answer or of endorsing a particular category in a rating scale) are modeled on a continuous latent variable measured in logits (additive log-odds units of equal measurement). When a person's ability equals an item's difficulty, the expected probability of a correct answer, or of endorsing either category of a dichotomous item, is .5. For Rasch analyses of responses to polytomous items, the data are fit to the following mathematical model: $\log_e(P_{nij}/P_{ni(j-1)}) = B_n - D_i - F_j$, where $\log_e$ is the natural logarithm, $P_{nij}$ is the probability of person $n$ of ability $B_n$ endorsing category $j$ in response to a scale item of difficulty $D_i$, and $P_{ni(j-1)}$ is the probability of the same person endorsing the next lower category $(j-1)$. For example, if $j$ is the endorsement of "4" on a 5-point Likert-type scale, $(j-1)$ would be the endorsement of "3," the adjacent lower category. In this formulation, the parameter $F_j$ defines the same rating scale structure for all items (Linacre, 2012a). In the partial credit model (Adams & Khoo, 1993), the assumption of a fixed rating scale is relaxed to allow items with different rating scales to be grouped for analysis. The Rasch rating scale model was used for this study.
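As an illustration of the rating scale model, the following minimal Python sketch computes category probabilities for one person and one item. The function name and all numeric values are hypothetical and are not taken from the study; the thresholds F_j are assumed to be shared by all items, as the rating scale model requires.

```python
import math


def rsm_category_probabilities(ability, difficulty, thresholds):
    """Return P(category 0..m) for one person-item pair under the rating scale model."""
    # The log-numerator of category j is the running sum of (B_n - D_i - F_k), k = 1..j.
    log_numerators = [0.0]
    for threshold in thresholds:
        log_numerators.append(log_numerators[-1] + ability - difficulty - threshold)
    # Shift by the maximum before exponentiating to avoid overflow.
    peak = max(log_numerators)
    weights = [math.exp(value - peak) for value in log_numerators]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical four-category item (never, rarely, sometimes, frequently).
thresholds = [-1.0, 0.0, 1.0]
probs = rsm_category_probabilities(ability=0.5, difficulty=-0.2, thresholds=thresholds)

# The log-odds of adjacent categories recover B_n - D_i - F_j.
adjacent_log_odds = [math.log(probs[j] / probs[j - 1]) for j in range(1, len(probs))]
print([round(p, 3) for p in probs])
print([round(v, 3) for v in adjacent_log_odds])

# Dichotomous special case: ability equal to difficulty gives probability .5.
print(rsm_category_probabilities(ability=1.0, difficulty=1.0, thresholds=[0.0]))
```

The adjacent-category log-odds printed by the sketch equal the ability minus the item difficulty minus each threshold, which is the defining property of the model.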
Unlike raw scores for a test instrument or questionnaire that have unknown intervals, the Rasch model enables investigators to calibrate item difficulty and measure personal ability using standardized intervals of measurement along a common continuum or latent trait. It is important to note that comparisons between individuals on the measured latent trait are independent of which test or questionnaire items are used, and similarly, that items that measure the same latent trait return results that are independent of the individuals in the particular sample.
More specifically, Rasch analysis examines how well items in a test or rating scale contribute to the useful measurement of an assumed one-dimensional latent variable (Rasch, 1960/1980). Rasch fit statistics indicate how well respondents and their responses fit the response pattern predicted by the Rasch measurement model. Infit and outfit statistics are calculated as chi-square values that range from zero to infinity. Infit and outfit values for an item that perfectly matches the Rasch model have a mean square value (MNSQ) of 1. Items with MNSQ values greater than 1 underfit the model because they lack precision. Items with values less than 1 overfit the model: they are too predictable and may not achieve successful measurement. An MNSQ of 1.2 indicates that there is 20% more randomness in the data than the model expects. An MNSQ of 0.5 indicates that there is a 50% deficiency in model-predicted randomness (Linacre, 2012a). Outfit and infit MNSQs in the range 0.77 to 1.3 are acceptable for most purposes (McNamara, 1996). An alternative rule of thumb is to accept MNSQ values in the range 0.6 to 1.4 (Frantom, Green, & Hoffman, 2002). For exploratory analysis, a range of 0.5 to 1.5 is acceptable (Linacre, 2002). In Rasch analysis, the dimensionality of rating scales and subscales can be checked by principal component analysis (PCA) of residuals (Wright, 1996). The Rasch measurement model is the dimension of first comparison. Second and subsequent dimensions with eigenvalues greater than 2 suggest second and higher order dimensions that need investigation (Linacre, 2012a). What matters in the analysis of second and subsequent dimensions is not the value of item loadings themselves, but whether there are patterns of loading that are interpretable as response patterns distinct from the Rasch dimension (Linacre, 2009).
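To make the two mean-square statistics concrete, the sketch below computes them for a single dichotomous item using the usual definitions: outfit is the unweighted mean of squared standardized residuals, and infit is the sum of squared residuals divided by the sum of model variances. The function name, the model probabilities, and the responses are invented for illustration and do not come from the study data.

```python
def fit_mean_squares(observed, expected, variance):
    """Return (infit MNSQ, outfit MNSQ) for one item across persons."""
    squared_residuals = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / w for r, w in zip(squared_residuals, variance)) / len(observed)
    # Infit: information-weighted mean square.
    infit = sum(squared_residuals) / sum(variance)
    return infit, outfit


# Hypothetical model probabilities of success for five persons on one item.
expected = [0.9, 0.7, 0.5, 0.3, 0.1]
variance = [p * (1 - p) for p in expected]  # binomial variance of a dichotomous response
# The last person succeeds despite a model probability of only 0.1.
observed = [1, 1, 1, 0, 1]

infit, outfit = fit_mean_squares(observed, expected, variance)
print(round(infit, 2), round(outfit, 2))
```

Because the surprising response comes from a person located far from the item, it inflates outfit more than infit, which is why outfit is described as outlier sensitive.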

Aim of the Study
Our study was designed to examine the suitability of the EIS for use in Lebanon. Our specific objectives were to (1) explore the psychometric characteristics of the EIS and (2) identify items to delete or modify prior to translating the EIS survey questionnaire into Arabic.

Design
Our sample of 59 nurses exceeds the sample size of 50 that was sufficient (Linacre, 1994) for this initial study.

Sample
All RNs involved in direct patient care at AUBMC were eligible to participate. We recruited our sample by posting flyers at nurses' stations and by visiting units and departments to explain the study. We left packages at nurses' stations. Each package contained the EIS survey questionnaire, an information sheet, a consent document, and a sealable envelope in which to return questionnaires. Nurses confirmed their voluntary informed consent to participate in our study by returning completed questionnaires to conveniently located drop boxes. Sixty questionnaires were returned. We excluded one questionnaire from our analysis because the respondent had replied to fewer than 50% of the EIS items.

Instrument
The EIS was developed from a 32-item scale used in a survey of Maryland nurses (Damrosch & Fry, 1994). Psychometric evaluation of a 35-item revised version of the scale was later conducted with a sample of 2,090 RNs practicing in the six New England states (Fry & Duffy, 2001). Split validation random samples of approximately equal size were used to establish and separately validate underlying scale components (Damrosch & Fry, 1994). The three components confirmed by the study were end of life treatment decisions (EOLD, Cronbach's α = .85), patient care issues (PCI, Cronbach's α = .82), and human rights issues (HRI, Cronbach's α = .74). Cronbach's α for the total scale was .91 (Fry & Duffy, 2001). The EIS has four response categories (never, rarely, sometimes, and frequently) and measures the frequency with which nurses experience ethical issues in clinical practice.

Cultural Fit
We used a panel of five expert nurses to consider the cultural fit of EIS items. On the advice of the panel, we modified four items. The four reworded items are identified with asterisks in Table 1. We substituted the word "futile" for the word "inappropriate" in item EOLD 1 because panel members thought it more specific. We replaced item EOLD 13 "Participating/not participating in euthanasia or assisted suicide" with "Following/not following physicians' Do Not Resuscitate (DNR) orders" because both euthanasia and suicide are illegal in Lebanon. We changed "managed care" in item PCI 4 and in item PCI 14 to "national health policies" because Lebanon does not have managed care. We deleted item HRI 2 "Following/not following Advanced Directives (e.g., living will, durable power of attorney for health care )" because living wills have no legal status in Lebanon. We replaced item HRI 2 with a new item "Following/not following patient's wishes regarding treatment."

Data Collection
The EIS was administered by survey as described. The nurses were encouraged to fill out the survey at home to ensure privacy and to avoid taking time out from patient care.

Ethical Considerations
The American University of Beirut Social and Behavioral Sciences Institutional Review Board, the AUBMC Medical Director, and the AUBMC Director of Nursing approved the study. The survey was anonymous. We further protected the anonymity of the participants by not collecting demographic data or information about practice settings. We maintained confidentiality by providing envelopes in which the nurses could seal completed questionnaires. Drop boxes were emptied several times a day. The sealed envelopes were opened in a private office by a member of the research team. Data were entered on a password protected computer to which only one member of the research team had access. The questionnaires are stored in a locked cupboard in the private office of the principal investigator. The data were analyzed on the password protected computer of the principal investigator.

Data Analysis
We conducted separate Rasch analyses on the three EIS subscales using WINSTEPS version 3.75.0 (Linacre, 2012b). We examined response ordering to determine whether response categories had been interpreted and used correctly. Then, we examined the dimensionality of EIS subscales to identify secondary and higher order dimensions. Next, we assessed local dependence to find out whether responses to individual items overly influenced responses to other items. Interitem standardized residual correlations of ≥.7 were taken as evidence of high local dependence, indicating that roughly 50% or more of the residual variance is shared between items (Linacre, 2012a). We then examined targeting to assess the match between the nurses' ability to report the frequency of ethical dilemmas and the difficulty of endorsing rating scale categories.
Then, we examined MNSQ values for the two fit statistics provided by Rasch analysis. The outlier-sensitive fit statistic (outfit MNSQ) indicates the discrepancy between observed and Rasch expected responses, irrespective of how far the response is from the person's ability level. The inlier-pattern sensitive statistic (infit MNSQ) indicates an unexpected response near to the person's level of ability (Linacre, 2012a). In this study, we regarded an item as too imprecise if outfit MNSQ values were >1.4. We regarded an item as overly predictable if outfit MNSQ and infit MNSQ were <0.6.
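The cut-offs adopted here can be expressed as a simple decision rule. The sketch below is illustrative only: the function name and labels are hypothetical, and the example values are the outfit MNSQ values reported in the Item Fit Analyses section, with an ideal value of 1.00 added for contrast.

```python
def classify_fit(mnsq, lower=0.6, upper=1.4):
    """Label a mean-square value using the cut-offs adopted in this study."""
    if mnsq > upper:
        return "underfit (too unpredictable)"
    if mnsq < lower:
        return "overfit (too predictable)"
    return "acceptable"


# Outfit values from the Item Fit Analyses section; 1.00 is added for contrast.
outfit_values = {
    "PCI Item 2": 1.47,
    "PCI Item 9": 1.42,
    "HRI Item 3": 1.79,
    "ideal item": 1.00,
}
for label, value in outfit_values.items():
    print(label, value, classify_fit(value))
```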
We concluded our analysis by examining person separation indices and item reliability coefficients to assess EIS precision. We looked for person separation indices of ≥2.0 and item reliability coefficients of ≥0.8 (Linacre, 2012a).

Response Ordering
When we examined response category ordering for the EOLD items, we found they were disordered for Items 8, 10, 12, and 13. The categories never and sometimes were disordered for Item 8. The disordered categories for Items 10, 12, and 13 were sometimes and frequently. These results indicate that the EIS response categories require modification to better suit nurses in Lebanon, possibly because the distinctions between the existing response categories are too nuanced for nurses whose first language is Arabic.
Then, we examined category ordering for the PCI items. We found the response categories sometimes and frequently were disordered for Items 3, 8, 9, 10, 11, 13, and 14, which again indicates that the nurses in our sample may have misinterpreted the EIS response categories.
We examined response category ordering for the items in the HRI scale and found that responses to all five items were properly ordered, which indicates that the four response categories are suitable for retention.
Category disorder was corrected to facilitate further analysis.
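One common diagnostic for category disorder is to check whether the estimated step thresholds advance monotonically across the rating scale. The sketch below uses invented threshold values and a hypothetical function name; the text does not report the thresholds or state which diagnostic or correction was applied, so this is an illustration of the general idea rather than the procedure used in the study.

```python
def disordered_steps(thresholds):
    """Return indices j where threshold j is not above threshold j-1."""
    return [j for j in range(1, len(thresholds)) if thresholds[j] <= thresholds[j - 1]]


# Hypothetical thresholds (logits) for the steps
# never->rarely, rarely->sometimes, sometimes->frequently.
ordered_item = [-1.2, 0.1, 1.1]
disordered_item = [-1.2, 0.9, 0.3]  # the last step falls below the one before it

print(disordered_steps(ordered_item))
print(disordered_steps(disordered_item))
```

An empty list means the steps advance in order; a non-empty list flags the steps whose neighbouring categories may need to be reworded or combined.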

Dimensionality
We examined the dimensionality of the 13 items in the EOLD subscale (Table 2); 40.2% of the raw variance was explained by the measures. The first contrast explained 11.6% of the unexplained variance, suggesting the presence of a second dimension (eigenvalue 2.5), with Items 10, 11, and 12 loading >.4 at one pole of the dimension, in contrast to Items 4, 1, and 7, which loaded −.68 to −.35 at the opposite pole. This second dimension may differentiate responses that relate to indecision about procedures and confusion about acting in accordance with personal moral principles. We then analyzed the dimensionality of the 14 items in the PCI subscale (Table 2). The measures explained 40.8% of the raw variance. Within the unexplained variance, 12.0% was explained by the first contrast, which suggests a second dimension (eigenvalue 2.8), with Items 7, 5, 1, 6, and 8 loading −.73 to −.44 at the unethical practice pole of a sub-optimal care dimension, in contrast to Items 13, 9, and 2, which loaded >.4 at a failure to report and advocate pole.
We examined the dimensionality of the five items in the HRI subscale (Table 2). The measures explained 46.8% of the raw variance in subscale scores. As there are only five items in the scale, a second dimension was not expected, and none was found (first contrast eigenvalue 1.6).
Finally, we examined the dimensionality of the EIS as a whole. The measures explained 30.8% of the raw variance. Four contrasts had an eigenvalue greater than 2.0 (range = 2.1 to 3.3).
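The relation between a contrast's eigenvalue and its percentage of variance can be checked with simple arithmetic. The sketch below assumes the convention, used in WINSTEPS output, that the unexplained variance expressed in eigenvalue units equals the number of items; the function name is hypothetical. Under that assumption, the EOLD and PCI percentages are approximately reproduced when the share is expressed relative to the total raw variance.

```python
def contrast_share_of_total(eigenvalue, n_items, explained_fraction):
    """Percentage of total raw variance attributable to one residual contrast."""
    # Unexplained variance is n_items eigenvalue units,
    # so total variance = n_items / (1 - explained fraction).
    total_units = n_items / (1.0 - explained_fraction)
    return 100.0 * eigenvalue / total_units


print(round(contrast_share_of_total(2.5, 13, 0.402), 1))  # EOLD subscale
print(round(contrast_share_of_total(2.8, 14, 0.408), 1))  # PCI subscale
```

Small differences from the reported 11.6% and 12.0% are to be expected, because the eigenvalues and explained-variance figures in the text are themselves rounded.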

Local Dependence
We analyzed local dependence separately for each of the three subscales. The largest standardized residual correlations for the items in the EOLD subscale were between .46 and −.29, which indicates that no pairs of items shared half or more of their random variance. Consequently, no items are locally dependent, and none need to be removed from the EOLD subscale. The largest standardized residual correlations for the PCI subscale ranged in descending order from .45 to −.33, which again indicates insufficient dependence to justify removing items. The largest standardized residual correlations for the HRI subscale ranged from .42 to −.07, again indicating no local dependence. For the EIS as a whole, largest standardized residual correlations ranged from .57 to −.34.
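The link between the .7 criterion and "half of the variance" is that the proportion of variance two residual series share is the square of their correlation. A minimal sketch follows, using the criterion and the largest positive correlations reported above; the function name is hypothetical.

```python
def shared_variance_percent(correlation):
    """Percentage of variance two residual series share, given their correlation."""
    return 100.0 * correlation ** 2


# .70 is the criterion; the remaining values are the largest positive
# residual correlations reported for the EIS, EOLD, PCI, and HRI analyses.
for r in (0.70, 0.57, 0.46, 0.45, 0.42):
    print(r, round(shared_variance_percent(r), 1))
```

A correlation of .70 corresponds to about 49% shared variance, whereas every correlation reported here falls well below that level.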

Targeting
We found the items in the EOLD subscale well matched to 70% of the nurses in the sample; person mean discrimination −0.17, root-mean-square standard error (RMSE) 0.43 logits, item mean discrimination 0.00, RMSE 0.20 logits (Figure 1). The nurses ranged in ability to report frequency of ethical dilemmas at end of life from −3.20 to 2.22 logits. The mean difficulty of the items was only marginally greater than the mean reporting ability of the nurses, indicating that the items were not difficult for the nurses to endorse. The test information curve for the EOLD subscale is shown in Figure 2. When we examined the PCI subscale, the mean discrimination of the items indicated that the nurses found them harder to endorse than the items in the EOLD subscale (−0.67 ± 0.39 logits for mean person, 0.00 ± 0.17 logits for mean item; Figure 1). The range of ability to identify PCI was −3.49 to 1.21 logits, lower and narrower than that for the EOLD subscale. The 14 items in the scale matched the ability levels of 68% of the nurses in the sample. The test information curve for the PCI subscale is shown in Figure 2.
The five items in the HRI subscale were easier for the nurses to endorse than those in the EOLD and PCI subscales, were less well matched to the sample than those in the EOLD scale, and better matched to the sample than those in the PCI subscale; person mean discrimination 0.34 ± 0.71 logits, item mean discrimination 0.00 ± 0.20 logits (Figure 1). The range of ability to report frequency of HRIs was −1.39 to 2.42 logits, higher and narrower than those for the EOLD and PCI subscales. The five HRI items were well targeted to 71% of the nurses in the sample. The test information curve for the HRI subscale is shown in Figure 2.
For the EIS as a whole, items were well matched to 73% of the sample; person mean discrimination −0.30, RMSE 0.22 logits, item mean discrimination 0.00, RMSE 0.16 logits. The nurses ranged in ability to report the frequency of ethical dilemmas in nursing practice from −1.37 to 0.91 logits. The mean difficulty of the items only moderately exceeded the ability of the sample. No items directly matched the 16 nurses (27% of the sample) with ability levels <−0.68 logits.

Item Fit Analyses
All infit and outfit MNSQ values for the EOLD items were lower than the selected cut-off value of 1.4. None of the items in the EOLD subscale had an overfit value of less than 0.6, which indicates that all the items contributed to successful measurement.
Two PCI subscale items had outfit MNSQ values greater than 1.4: Item 2 (1.47) and Item 9 (1.42). This indicates that the two items might measure something other than the ability to report the frequency of ethical issues in patient care. There were no items with an overfit MNSQ value of less than 0.6.
Item 3 in the HRI subscale had an outfit MNSQ value of 1.79, indicating that it might be measuring a variable other than the ability to report the frequency of human rights issues in nursing practice. None of the items in the HRI subscale were overly predictable.

Separation and Reliability
We examined separation and reliability for the three subscales. The person separation index of 1.85 for the EOLD subscale indicates that it may not be sensitive enough to distinguish between nurses with low and high levels of ability to report the frequency of ethical dilemmas in end of life care. The item separation index of 2.73 indicates four levels of item difficulty (easy, moderate, difficult, and very difficult). Real person reliability was .77, and Cronbach's α for person raw score test reliability was .78.
The person separation index of 1.88 for the PCI subscale indicates that it may not be sensitive enough to discriminate between nurses with low and high levels of ability to report the frequency of ethical issues in patient care. The item separation index of 3.46 indicates five levels of item difficulty (easy, moderate, average, difficult, very difficult). In this initial analysis, real person reliability was .78 and Cronbach's α for person raw score test reliability was .81, higher than that for the EOLD subscale due to the larger number of items in the PCI subscale.
The person separation index of 0.76 for the HRI subscale indicates that it lacks sufficient sensitivity to distinguish among nurses with different levels of ability to report human rights issues in nursing practice. The item separation index of 4.42 indicates at least five levels of item difficulty (easy, moderate, average, difficult, very difficult). In this initial analysis, real person reliability was .36 and Cronbach's α for person raw score test reliability was .39, low values that reflect the small number of items in the subscale.
For the EIS as a whole, the person separation index of 2.43 indicates that it has sufficient sensitivity to distinguish among three to four levels of nurses' ability (poor, moderate, good, very good) to report the frequency of ethical issues in nursing practice. The item separation index of 2.82 suggests four levels of item difficulty (easy, moderate, difficult, very difficult). Because the value of Cronbach's α for person raw score test reliability was .87, which is below the rule of thumb cutoff value of .90, the sample of 59 nurses was not large enough to determine the item difficulty hierarchy of the 32 EIS items.
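Separation indices, reliabilities, and the number of distinguishable levels are linked by two commonly used relations: reliability R = G²/(1 + G²) for a separation index G, and strata H = (4G + 1)/3. The sketch below applies them to the person separation indices reported in this section. The function names are hypothetical, and whether WINSTEPS derived the reported figures in exactly this way is an assumption, so small rounding differences should be expected.

```python
def reliability_from_separation(separation):
    """Reliability implied by a separation index G: G^2 / (1 + G^2)."""
    return separation ** 2 / (1.0 + separation ** 2)


def strata(separation):
    """Number of statistically distinct levels: (4G + 1) / 3."""
    return (4.0 * separation + 1.0) / 3.0


# Person separation indices reported in this section.
person_separation = {"EOLD": 1.85, "PCI": 1.88, "HRI": 0.76, "EIS": 2.43}
for scale, g in person_separation.items():
    print(scale, round(reliability_from_separation(g), 2), round(strata(g), 1))
```

For the EOLD and PCI subscales, the implied reliabilities agree with the real person reliabilities of .77 and .78 reported above, and the strata value for the whole EIS falls between three and four levels.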

Discussion
We solicited responses to EIS items with the following statement: The following statements concern ethics or human rights issues in which you may have been directly involved in your nursing practice. Circle 0 = never; 1 = rarely; 2 = sometimes; or 3 = frequently to indicate how often you have been involved with the issue during the last 12 months.
Consequently, the nurses' responses indicate their 'ability' to report the frequency of their involvement in the issues surveyed by the EIS survey questionnaire. Our use of the term ability here, which is consistent with the application of the Rasch measurement model, requires clarification because we want to be explicit about how we interpreted responses to the EIS items. If the response to an item was "never," should we have interpreted this to mean the respondent has not been involved with the issue or that the respondent is not aware of personal involvement with the issue? The results we report are based on the first interpretation; that is, we are assuming that our respondents understood the moral dilemmas referred to in the EIS items and were able to report the frequency with which they experienced them.

Rating Scale Categories
The results of our response ordering analyses suggest that nurses in the sample may have been confused by the EIS rating scale categories. The distinctions between never and rarely, and between sometimes and frequently, may have been too nuanced for the nurses to apply consistently. The results for both the EOLD subscale and the PCI subscale showed that response categories were disordered for several items, indicating that the measurement properties of the scale might benefit from changing the four ordinal categories to a 5-point Likert scale.

Wording and Language
During our study, five participants contacted us for clarification of the EIS items. The EIS items are currently phrased as ethical dilemmas (Table 1); that is, they require responses to either/or statements. For example, Item 1 in the EOLD subscale poses the ethical dilemma involved in prolonging or not prolonging life when further intervention is likely to be futile. Phrasing questions in this way confused some participants because they were not sure what they were endorsing; that is, they were unsure about whether a response such as "frequently" indicated that they were endorsing or not endorsing intervention, when they were, in fact, being asked to indicate how often they face this dilemma. For our national study, EIS items will be rephrased to avoid ambiguity (Table 1).
Figure 2. Ethical Issues Scale: Targeting for EOLD, PCI, and HRI subscales. Mean severity discrimination (difficulty) was well matched to the mean severity (ability) of the participants for EOLD subscale (A). Targeting for the HRI subscale (C) was less well matched than for (A) but better matched than for the PCI subscale (B).
In our national study, our modified version of the EIS will be administered in Arabic.

Dimensionality
The dimensions of the EIS scale require more analysis in future studies. In our national study, we will conduct exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) on independent samples of nurses to determine whether the three-dimensional structure of the existing EIS scale is appropriate for studies in Lebanon.
Adding items to the EIS to improve targeting is another reason for re-examining the dimensionality of the EIS. Cardsort procedures (McKeown & Thomas, 1988) will be used to select extra items to improve item targeting for Lebanese nurses. Additional items are needed for ability levels for which there are no items within 0.5 logits. Item fit analyses will be conducted following our national study to remove misfitting items (items with an outlier-sensitive fit statistic [MNSQ] value greater than 1.4) unless there are strong conceptual reasons for retaining them. The inclusion of additional EIS items will enable re-examination of person separation and item reliability. The change from use of a four-category ordinal rating scale to a 5-point Likert scale and better targeting of EIS items is likely to improve person reliability. Items better targeted to larger samples with a wider range of ability are likely to improve item reliability (Linacre, 2012a).
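The targeting criterion mentioned above (no item within 0.5 logits of a person's ability) can be sketched as follows. The person and item measures are hypothetical, because individual estimates are not listed in the text, and the function name is illustrative.

```python
def persons_without_nearby_item(person_measures, item_difficulties, window=0.5):
    """Return person measures with no item difficulty within `window` logits."""
    return [
        b for b in person_measures
        if all(abs(b - d) > window for d in item_difficulties)
    ]


# Hypothetical measures in logits, not taken from the study data.
persons = [-1.37, -0.90, -0.30, 0.20, 0.91]
items = [-0.35, -0.10, 0.00, 0.25, 0.60]

print(persons_without_nearby_item(persons, items))
```

Ability levels returned by such a check mark the regions of the latent variable where additional items would improve targeting.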

Conclusion
Based on our analysis, we intend to retain all 32 EIS items for our national validation study. We will substitute a 5-point Likert scale for the existing response categories. We will expand the number of subscale items to improve targeting. We will conduct EFA and CFA on data from separate samples large enough to support EFA and CFA. We will administer the scale in the Lebanese Arabic dialect. From our national validation study, we will determine which subscale items are redundant and which to retain for use in further studies of ethical issues in nursing practice in Lebanon. Selected measures will be included in our national study to establish the convergent and discriminant validity of the Arabic version of the EIS.