Content Analysis of Work Limitation, Stanford Presenteeism, and Work Instability Questionnaires Using International Classification of Functioning, Disability, and Health and Item Perspective Framework

Background. Presenteeism refers to reduced performance or productivity while at work due to health reasons. WLQ-26, SPS-6, and RA-WIS are the commonly used self-report presenteeism questionnaires. These questionnaires have acceptable psychometric properties but have not been subjected to structured content analysis that would define their conceptual basis. Objective. To describe the conceptual basis of the three questionnaires using ICF and IPF and then compare the distribution and content of codes to those in the vocational rehabilitation core set. Methods. Two researchers independently linked the items of the WLQ-26, SPS-6, and RA-WIS to the ICF and IPF following the established linking rules. The percentage agreement on coding was calculated between the researchers. Results. WLQ-26 was linked to 62 ICF codes, SPS-6 was linked to 17 ICF codes, and RA-WIS was linked to 74 ICF codes. Most of these codes belonged to the activity and participation domains. All the concepts were classified by the IPF, and most were rational appraisals within the social domain. Only 12% of codes of the core set for vocational rehabilitation were used in this study to code these questionnaires. Conclusion. The specific nature of work disability that was included in these three questionnaires was difficult to explain using ICF since many aspects of content were not confined. The core set for vocational rehabilitation covered very limited content of the WLQ-26, SPS-6, and RA-WIS.


Introduction
Rehabilitation is based on an understanding that health and function extend beyond the presence or absence of disease to include the ability to participate in life activities and roles. Similarly, we now recognize that work functioning extends beyond the presence or absence of being at work to include the ability to engage in work activities and roles. Presenteeism refers to reduced performance or productivity while at work due to health reasons [1]. In a study conducted in Sweden, one-third of the surveyed labor force reported going to work two or more times in the past year in spite of their health being so bad that they should have taken leave [2]. Presenteeism is a complex issue that is affected by individual, work, and workplace factors, as well as health and health behaviours. Previous studies have tried to identify determinants of presenteeism and have identified factors such as low monthly income, psychological stress, initial health, time pressure, and finding a replacement, amongst others [1][2][3][4][5][6][7][8].
During rehabilitation, ability to return to work is often a major concern. Vocational rehabilitation is a specific subtype of rehabilitation that focuses on helping those with disabilities to regain skills and abilities that allow them to acquire or retain employment. It is important to have questionnaires that allow one to quantify the amount of difficulty experienced at work to monitor the success of these rehabilitative processes.

Rehabilitation Research and Practice
People can return to work for a variety of reasons such as financial responsibilities or social responsibility to coworkers. Thus, return to work cannot be the only indicator of a successful outcome of vocational rehabilitation. Presenteeism imposes a substantial burden on the employee and the employer in lost productivity. The economic burden of sickness presenteeism is attributable to work impairment, disability, and lost productivity time [9]. Studies that identified the economic burden of sickness presenteeism due to depression found that the direct and indirect costs exceeded 18.2 billion in the USA and 15.1 billion in the UK [10].
The three presenteeism questionnaires that have the most supporting psychometric evidence are (1) 26-Item Work Limitations Questionnaire (WLQ-26) [11], (2) Stanford Presenteeism Scale (SPS-6) [12], and (3) Rheumatoid Arthritis Work Instability Scale (RA-WIS) [13]. The recent systematic review of the psychometric properties of these questionnaires revealed that all three have been assessed in various populations and have demonstrated acceptable levels of validity, reliability, and responsiveness [14].
The International Classification of Functioning, Disability, and Health (ICF) is an international classification developed by the World Health Organization (WHO) [15]. The domains of the ICF are classified into body structures, body functions, activity limitations, and participation restrictions. Contextual factors are also taken into account; these include environmental factors and personal factors, although the latter are not classified [15]. The ICF provides a theoretical model that underpins a substantial component of rehabilitation research. It also provides a classification system and language to help communicate about disability. Processes have been developed by Cieza et al. to link content from questionnaires to this hierarchical coding system [16].
The core set for vocational rehabilitation [17] was developed through a rigorous multistep process that included engaging stakeholders through international consensus, informed by qualitative and quantitative research findings, with the goal of establishing the most relevant codes for specific areas of practice or health problems. There is a comprehensive and a brief core set to provide options across diverse applications. The comprehensive core set has ninety codes while the brief core set has thirteen codes. These core sets were established to describe the functioning and participation of individuals, for instance, those who can participate in multidisciplinary vocational rehabilitation. Since vocational rehabilitation aims at successful return to work and participation in all life activities, sports, and so forth, presenteeism questionnaires may be highly relevant to the outcome of research and practice in this area. Hence, it is necessary to code the concepts of the presenteeism questionnaires with the codes of the ICF, which would ensure comparability and interpretability of the presenteeism questionnaire scores across studies, transcending language and cultural barriers [18,19].
The Item Perspective Framework (IPF) is a classification system developed to classify the content of individual questionnaire items according to the kind of decisions respondents have to make in responding to a question [20]. The IPF was derived from the philosophical work on how individuals appraise "value" or "quality" in life by Pirsig [21] and McWatt [22]. The IPF was constructed to classify fundamental qualities of the items in a patient-reported outcome measure, such as (1) the type of appraisal presented (rational or emotional), (2) the nature and form of concepts under evaluation, and (3) the types of relationships that occur among multiple concepts [20]. The developer has proposed a novel way of using "item perspective" in conjunction with the ICF classification [20]. The ICF is concerned with the aspect of functioning addressed but does not code the perspective [23], nor does it have a mechanism to address the relationship between different concepts [24]. These two issues are important for understanding the content of questionnaires and are included in the "item perspective" classification. The IPF can provide this information, making the content analysis more meaningful.
Content validation can be enhanced by using rigorous methods of evaluating content. This can be particularly significant for complex concepts like presenteeism. The purposes of the current study are threefold: (1) to link the three presenteeism questionnaires to the ICF, (2) to use the IPF in conjunction with the ICF classification to classify each item as emotional or rational and to categorize it into the biological, inorganic, psychological, or social domain, and (3) to compare the distribution or content of codes included in these questionnaires to those in the vocational rehabilitation core set.

Methodology

Work Limitations Questionnaire (WLQ-26).
The WLQ is available in a 26-item version [11] and an 8-item version [25]. The 26-item version was used in the study. The main purpose of this questionnaire is to measure the impact of chronic health problems and/or treatment on a person's perceived ability to handle work demands [11]. It also measures health-related productivity loss. It includes 26 items under 4 subscales addressing four dimensions of job demands and uses a 5-point ordinal response scale with an additional sixth option, "does not apply to my job." The physical demand subscale contains six items covering physical demands, energy drive, moving from place to place, flexibility tasks, and coordination of the hand. The time demand subscale contains five items on managing, scheduling, and completing a job. The mental/interpersonal demands subscale contains nine items that assess cognition and on-the-job social interactions. The output demand subscale contains six items determining the quality and productivity at work. The total scores range from 0 to 100% [26].
The WLQ has been validated in a variety of patient populations with chronic conditions such as rheumatoid arthritis (RA), depression, osteoarthritis, back pain, migraine, and epilepsy [11,27]. It has been shown to have high internal consistency, with Cronbach's alpha values ranging from 0.88 to 0.97 [11,27]. In the initial validation study, the intraclass correlation coefficient (ICC) for 2-week test-retest reliability in specialty clinic patients ranged from 0.69 to 0.80 for the 4 subscales [9]. The WLQ has shown good construct validity by correlating significantly with arthritis, pain, stiffness, functional limitation, and self-reported work productivity [27]. Reliability coefficients are reported to range from 0.70 to 0.90 for all items and from 0.88 to 0.91 for items within each scale.

Stanford Presenteeism Scale (SPS-6).
The SPS-6 is a 6-item self-report questionnaire [12]. It measures a worker's perceived ability to concentrate on work tasks despite the distractions of health impairments and pain. The response scale for the SPS-6 is a 5-point Likert scale, with response options ranging from 1 (strongly agree) to 5 (strongly disagree), giving a total score that can range between 6 and 30. A total score is not calculated if the response to any of the items is missing.

Rheumatoid Arthritis Work Instability Scale (RA-WIS).
The RA-WIS is a 23-item self-report questionnaire that assesses work instability. Work instability is the consequence of a mismatch between an individual's functional ability and his/her work tasks that places the individual at risk for work disability (lowered productivity, premature job loss, etc.) [13]. It has no subscales, and it has a dichotomous response option of yes or no only. The total score ranges from 0 to 23. The WIS can be subgrouped into 3 bands indicating low (less than 10), medium (10-17), and high (above 17) risk of work disability.
The RA-WIS was specifically developed for patients with rheumatoid arthritis [13], but it has been validated for use in other disease groups such as osteoarthritis (OA) [29,30]. It has excellent test-retest reliability of 0.89 (Spearman's rho) [13]. Beaton et al. found that the RA-WIS exhibits excellent correlations with other presenteeism questionnaires [28]. In workers with OA, it exhibited moderate to high correlations (0.55-0.79) [31]. It has also been found to have predictive validity (relative risk = 1.05). In terms of responsiveness, the RA-WIS has been shown to exhibit small to moderate standardized response means (SRM) and effect sizes (ES) [28].

2.2. Procedure. Two academic physical therapists, highly qualified and trained in ICF coding, independently linked the items of the WLQ-26, the SPS-6, and the RA-WIS to the ICF and IPF. Percentage agreement was calculated between the raters for the linking process.
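The percentage agreement between raters can be illustrated with a minimal sketch. It assumes that each rater's result for an item is stored as a set of ICF codes and that agreement is counted item by item as an exact match; the text does not state the unit of agreement, so this is only one plausible reading. The function name and the sample codes are hypothetical and are not taken from the study data.

```python
def percentage_agreement(rater_a, rater_b):
    """Percentage of items for which both raters assigned the same set of codes.

    Assumption: agreement is judged per item by exact match of the code sets.
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must code the same number of items")
    if not rater_a:
        raise ValueError("At least one item is required")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if set(a) == set(b))
    return 100.0 * matches / len(rater_a)


# Hypothetical linking results for four items (illustrative only).
rater_a = [{"d850", "d230"}, {"d850", "b140"}, {"d850"}, {"d850", "d410"}]
rater_b = [{"d850", "d230"}, {"d850", "b140"}, {"d850"}, {"d850", "d415"}]

print(percentage_agreement(rater_a, rater_b))  # 75.0
```

The raters disagree only on the last item, so three of four items match.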

ICF Linking.
The ICF linking procedures were carried out following the eight standardized linking rules proposed by Cieza et al. [16,19]. As per these linking rules, meaningful concepts should be identified from the items of the questionnaire, and the identified concepts must be linked to the most precise ICF category. Items with insufficient information about the meaningful concepts, or about the precise ICF category to which they should be linked, are marked "nd" (not definable). Meaningful concepts that are related to health, physical health, or mental (emotional) health in general are assigned "nd-gh" (not definable-general health), "nd-ph" (not definable-physical health), or "nd-mh" (not definable-mental health), respectively. Meaningful concepts related to quality of life are assigned "nd-qol" (not definable-quality of life). When the meaningful concept is not covered by the codes of the ICF and is not a personal factor, it is assigned "n/c" (not covered). If the meaningful concept refers to a diagnosis or health condition, it is assigned "hc" (health condition) [16]. Codes d840-d859 deal with work and employment. Since all the items of all three questionnaires deal with work and employment, we used these codes as mandatory codes for all items of the three questionnaires.

Classification Using IPF.
The items were classified based on the guidelines proposed by Rosa [20]. The first three steps of the proposed five-step process were used in this study. In the first step, the context of the item, that is, the declared purpose of the questionnaire, was determined. In the second step, the type of appraisal presented with the item (rational or emotional) was determined. Items were classified as presenting "emotional" appraisals (E) only when they assess the respondent's emotions or feelings at the present time. Any inquiries into emotions or feelings that have occurred in the past or "in general" are classified as rational appraisals (R) since they require retrieval of memories pertaining to previous psychological states. In the third step, the concept domains represented in the item were identified. According to the IPF, there are four concept domains that represent all subjective and objective evolutionary levels of reality that are amenable to human perception: inorganic (I), for example, "does your chair have a proper arm rest?"; biological (B), for example, "rate the level of your shoulder pain"; social (S), for example, "has pain affected your social life?"; and psychological (P), for example, "are you depressed and you feel less capable because of your shoulder pain?" [20]. These concept domains have a hierarchical order reflecting McWatt's hypothesis that inorganic matter gives rise to (and supports) biological organisms; biological organisms self-organize and interact with one another in a manner that gives rise to social behaviors; and psychological functioning occurs as a result of increasing complexity in social behavior [22].
To compare the distribution/content of codes in these questionnaires to those in the vocational rehabilitation core set, we used the following indicators.
(1) Alignment with core set (questionnaire to brief or comprehensive core set absolute linkage): the percentage of items from a questionnaire that could be linked to ICF core set codes:
$$\text{Alignment} = \frac{\text{Questionnaire items that are linked to codes appearing in the core set}}{\text{Total number of items on the measure}} \times 100\%$$
(2) Scope (core set representation): the percentage of core set codes that appear when the measure's items are linked to ICF codes. This represents the extent to which the entire scope of content defined by the core set is represented on the measure.
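The two indicators can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the procedure used in the study: each item is represented as a set of linked ICF codes, the core set as a set of codes, and "alignment without mandatory codes" is modeled by removing the mandatory codes from the core set before matching. The function names, the `mandatory` parameter, and the sample data are hypothetical.

```python
def alignment(item_codes, core_set, mandatory=frozenset()):
    """Percentage of items linked to at least one core set code.

    Assumption: codes listed in `mandatory` are excluded from the core set
    before matching, to mimic alignment "without mandatory codes".
    """
    core = set(core_set) - set(mandatory)
    linked = sum(1 for codes in item_codes if set(codes) & core)
    return 100.0 * linked / len(item_codes)


def scope(item_codes, core_set):
    """Percentage of core set codes that appear among the codes linked to the items."""
    used = set().union(*map(set, item_codes))
    return 100.0 * len(used & set(core_set)) / len(core_set)


# Hypothetical example: four items and a five-code core set (illustrative only).
items = [{"d850", "d230"}, {"d850", "b140"}, {"d850", "d410"}, {"d850"}]
core_set = {"d850", "d230", "b140", "e330", "d845"}

print(alignment(items, core_set))                      # 100.0
print(alignment(items, core_set, mandatory={"d850"}))  # 50.0
print(scope(items, core_set))                          # 60.0
```

In this toy example every item carries the mandatory work code, so alignment is 100% until that code is set aside, which mirrors the pattern of alignment with and without mandatory codes.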
Radar plots were used to describe the percentage of questionnaire items that fell under each of the domains of the ICF and IPF. Similarities and differences between the questionnaires as to the domains into which the items fall and a percentage of ICF codes or IPF codes used were identified. Content overlap between the questionnaires was also identified using these plots.
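The percentages behind such a plot can be derived from the first letter of each ICF code, which identifies its component (b for body functions, s for body structures, d for activities and participation, e for environmental factors). The sketch below is a hedged illustration only: it computes the distribution over linked codes rather than over items, the function name is hypothetical, and the input is a placeholder list that keeps only the component prefix.

```python
from collections import Counter

# ICF component prefixes.
ICF_COMPONENTS = {
    "b": "body functions",
    "s": "body structures",
    "d": "activities and participation",
    "e": "environmental factors",
}


def component_distribution(codes):
    """Percentage of codes falling under each ICF component, rounded to one decimal."""
    counts = Counter(ICF_COMPONENTS[code[0]] for code in codes)
    total = len(codes)
    return {name: round(100.0 * counts[name] / total, 1)
            for name in ICF_COMPONENTS.values()}


# Placeholder input: 6 body function codes and 11 activity and participation
# codes, with only the prefix kept (not actual study codes).
codes = ["b"] * 6 + ["d"] * 11
print(component_distribution(codes))
```

Each value in the returned dictionary would correspond to one axis of the radar plot.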

WLQ-26
3.1.1. Linking to ICF. The 26 items were linked to 70 ICF codes, of which 7 belong to ICF body functions (b codes), 3 belong to ICF environmental factors (e codes), and 60 belong to ICF activity and participation (d codes) (see Figure 1 and Table 1). Some observations follow. Two items in the questionnaire (items 15 and 16) each used 3 codes. Three items (17, 18, and part of 19) had the same codes (d230, d2301, and b140). Item 9 was not definable ("nd") because the ICF has no code for the meaningful concept "sense of accomplishment".

Percentage Agreement.
The between-rater agreement was 84% for the ICF linking and 100% for the IPF classification.
Distribution of Codes in WLQ-26 to Those in Vocational Rehabilitation Core Set. The WLQ-26 was linked to 70 ICF codes. The alignment of the questionnaire was 100% with the mandatory work (d840-d859) codes being used. However, the alignment came down to 7% (brief core set) and 88% (comprehensive core set) when the mandatory codes were not considered (see Table 4). In terms of scope, 20 (28.6%) codes appeared in the comprehensive core set for vocational rehabilitation. Only 5 (7.1%) codes from the brief core set for vocational rehabilitation were used to code the concepts in WLQ-26 (see Table 4).

SPS-6
Linking to ICF. The 6 items on the questionnaire were linked to 17 ICF codes from 2 chapters, of which 6 belong to ICF body functions (b codes) and 11 belong to ICF activity and participation (d codes) (see Table 2 and the radar plot in Figure 1). No meaningful concepts referred to body structures or environmental factors.

Classification Based on IPF.
Four (30.8%) concepts required rational appraisals and 9 (69.2%) required emotional appraisals. Six (46.2%) concepts fell within the social domain, 4 (30.8%) within the biological domain, and 3 (23.1%) within the psychological domain. There were no concepts that fell within the inorganic domain.
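The reported percentages follow directly from the counts, which sum to 13 classified concepts. A short arithmetic check is sketched below; the helper name `percentages` is hypothetical, while the counts are those stated in the text.

```python
def percentages(counts):
    """Convert a mapping of category counts to percentages rounded to one decimal."""
    total = sum(counts.values())
    return {category: round(100.0 * n / total, 1) for category, n in counts.items()}


# Counts as reported in the text above.
appraisal = {"rational": 4, "emotional": 9}
domain = {"social": 6, "biological": 4, "psychological": 3, "inorganic": 0}

print(percentages(appraisal))  # {'rational': 30.8, 'emotional': 69.2}
print(percentages(domain))
```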

Percentage Agreement.
While coding the SPS-6, both raters agreed in all instances, giving a between-rater percentage agreement of 100% for both the ICF linking and the IPF classification.

Distribution of Codes in SPS-6 to Those in Vocational Rehabilitation Core Set. The SPS-6 was linked to 11 ICF codes. The alignment of the questionnaire was 100% with the mandatory work codes being used. The alignment came down slightly to 83% (brief core set) and remained the same in the comprehensive core set when the mandatory codes were not considered (see Table 4). In terms of the scope of the questionnaire, 6 (54.5%) codes appeared in the comprehensive core set for vocational rehabilitation. Only 2 (18.2%) codes from the brief core set for vocational rehabilitation were used to code the concepts in SPS-6 (see Table 4).

RA-WIS
Linking to ICF. One or more meaningful concepts from 18 items could not be coded using the ICF (see Table 3); in these items, the first part was codable whereas the second part was not codable (nc). The code d850 ("Work and employment") was used to code all the items in this questionnaire. There were concepts that the ICF could not capture in this questionnaire because a number of items address emotional issues related to work. The ICF considers emotional control as a body function but does not consider emotional perspectives.

Percentage Agreement.
The between-rater percentage agreement for both ICF linking and IPF classification was 100%.

Distribution of Codes in RA-WIS to Those in Vocational Rehabilitation Core Set. The RA-WIS was linked to 35 ICF codes. The alignment of the questionnaire was 100% with the mandatory work codes being used. However, the alignment came down to 9% (brief core set) and 30% (comprehensive core set) when the mandatory codes were not considered (see Table 4). With regard to the scope of the questionnaire, 8 (22.9%) of the RA-WIS codes appeared in the comprehensive core set for vocational rehabilitation. Only 4 (11.4%) codes from the brief core set for vocational rehabilitation were used to code the concepts in RA-WIS (see Table 4).

Dimensionality and Content Overlap of the Three Presenteeism Questionnaires
3.5.1. ICF. In terms of dimensionality, more than 80% of the items of the WLQ-26 and more than 60% of the SPS-6 and the RA-WIS items fell within the activity and participation domain (see Figure 1), while more than 30% of SPS-6 items

Discussion
By linking the items of the most commonly used presenteeism questionnaires, the 26-Item Work Limitations Questionnaire (WLQ-26), the Stanford Presenteeism Scale (SPS-6), and the Rheumatoid Arthritis Work Instability Scale (RA-WIS), to the ICF codes, we have shown that there is substantial variability in the content that each addresses. Further, classification using the IPF reveals that the items of these three questionnaires contain many emotional appraisals that the ICF would not normally capture. We also found that many of the codes used to classify the presenteeism questionnaires were not reflected in the core set for vocational rehabilitation. Similarly, many codes from the core set were not reflected in the questionnaires.
The results of the current study show that none of the three questionnaires covered all four domains of the ICF conceptual framework. The environmental domain is important for work since it includes many factors that would influence successful return to work, including labor and workplace policies and the social environment. Other questionnaires have been developed to assess workplace and practice policies [1], and given the scope of environmental issues that can affect work, it may be appropriate that environment is addressed in entirely separate questionnaires. More than fifty percent of the items in the three questionnaires are linked to the participation restriction domain of the ICF. A small percentage (less than ten percent) of concepts mapped onto the body functions and environmental factors domains, and none of the concepts mapped onto the body structure domain. These findings are consistent with the conceptual basis of presenteeism or at-work disability since it focuses on a specific form of participation.
Our results indicate that these questionnaires tap into a limited number of concepts endorsed by the WHO when establishing the vocational rehabilitation core set. The WLQ-26 scale provided the most coverage of core set codes. It is not reasonable to expect a questionnaire to address all possible components of the construct it measures, but an adequate sampling of the key disability concepts should be present on a disability questionnaire that is targeted to the same conceptual basis. Whether a brief measure like the SPS-6 adequately addresses the scope of presenteeism is not clear. However, attention should be paid to what elements of the concept are covered and how they match the context when questionnaires are relied upon to evaluate the impacts of health problems or the benefits of interventions. It may be necessary to change or supplement questions to achieve an appropriate alignment between the context for measuring and the tools selected, particularly for vocational rehabilitation.
All three questionnaires had items that were linked to multiple meaningful concepts. For example, item 15 of the WLQ-26, "Sit, stand, or stay in one position for longer than 15 minutes while working," and item 16 of the WLQ-26, "bend, twist, or reach while working," each contain 3 different tasks. Respondents would have variable levels of difficulty with these tasks. Having multiple concepts within a single item on a questionnaire can be a way to bring together a group of activities or tasks that have a common etiological link to specific disorders. This can improve the efficiency of measuring difficulty since it may tap into different manifestations of a particular problem. However, in other cases this can reflect poor item design, since it can create confusion for people who have difficulty with some elements of the item and not others, and can create a lack of discrimination if there are functional subgroups that manifest differently in that item.
Seventeen items of the RA-WIS could not be coded ("nc") as the ICF did not have codes that were directly linked to the meaningful concepts. ICF linking does not incorporate coding related to personal factors and quality of life. The items that were "not codable" mostly pertained to personal factors, for example, quitting work, being worried about work, and having good or bad days at work, which have more of an emotional component attached to them.
The IPF was included because it classifies the decision that the respondent has to make to respond to an item (rational/emotional). The IPF was able to classify all items in all three questionnaires. The developer has provided a means of linking the IPF codes to the ICF. In this study, the IPF augmented the ICF classification of items in a way that could not have been achieved with the ICF alone. However, the IPF has only recently been developed, and its role in the evaluation of content validity is not yet clear. Hence, more studies are needed to see whether the use of the IPF to augment the ICF provides a useful method for assessing content.
The radar plot revealed that there was an overlap in the content of the three questionnaires with the ICF coding, with more items falling within the activity and participation domain and very minimal distribution across the other domains. However, the radar plot for the IPF classification showed that the questionnaires were quite different in the perspective from which these domains were elicited from respondents, especially the SPS-6, which takes more of an emotional perspective. The vocational rehabilitation core set developed through international consensus may serve as a benchmark for the important concepts in vocational rehabilitation. We assessed alignment with the core set as the extent to which items on the scale assessed core set concepts, and scope of coverage as the percentage of core set codes addressed by the measure. A brief measure like the SPS-6 might be very well aligned but address a small scope of the core set. However, the SPS-6 was the measure that was best aligned and had the best scope. Our results show that, on average, only 12% of the codes in the vocational rehabilitation core set were used in coding the presenteeism questionnaires. This suggests that none of the questionnaires we evaluated would be sufficient for assessment in vocational rehabilitation, since only 20 percent of total codes were from the vocational rehabilitation core set. For example, in a review of the core set, there were a few items of the questionnaires that are not amenable to self-report, such as stating good or bad days at work without defining the criteria for those good or bad days. Some concepts that are deemed important by the ICF were missing in the questionnaires.
This study informs our understanding of the content of these three questionnaires but has limitations. The use of only two independent coders may have affected the interpretation of coding although there was high consistency between raters. The concepts that were difficult to code because they contain content that was not assessed by ICF may have been approached differently by different raters. The strengths of the study are the use of IPF to augment and enhance the utility of ICF and the use of radar plots to represent diagrammatically where the meaningful concepts of the questionnaire items fall within the domains of ICF.
Implications for clinical practice, research, and policy are as follows. This study has compared the content and perspective of these three presenteeism questionnaires. This may help researchers and clinicians decide which outcome measure to select to quantify a particular end point. This could be of great help in conducting international multicenter clinical trials by facilitating the production of data that are comparable across borders and languages. This study may inform policy by identifying outcome measures that can produce comparable and reliable data to provide population estimates of presenteeism. This can aid in planning disability and sick leave benefits.

Conclusion
Our study concluded that the three presenteeism questionnaires vary considerably in the content they address, in their relationship to the vocational rehabilitation core set, and in the proportion of content that could not be classified by the ICF. The IPF provided a different perspective on items. In particular, many of the items and concepts, such as the emotional ones, could not be classified by the ICF, which does not deal with how people feel about specific constructs. The IPF illustrates a preponderance of focus on the social domain. We recommend further studies to look into the content of these presenteeism questionnaires and how they link to the vocational rehabilitation core set of the ICF.