Background

General health and physical functioning are frequently assessed in injured patients using patient-reported outcome measures (PROMs) [1,2,3,4,5,6]. For clinicians, it is important to be able to evaluate to what extent patients have returned to their pre-injury health status. To assess changes in the health status of injured patients, information about their pre- and post-injury health state values is needed. However, in acute-onset conditions such as acute traumatic injuries (as opposed to chronic conditions), data about pre-injury health status are usually not available. Although prospective collection of data about the pre-onset health status of patients who will become injured would be preferred, it is not feasible in day-to-day clinical practice.

Although not a measurement property of a PROM (like validity and reliability), interpretability is a prerequisite for a proper use of a measurement instrument. Interpretability is the degree to which one can assign qualitative meaning (i.e., clinical or commonly understood connotations) to an instrument’s quantitative scores or change in scores [7]. To interpret the change in health status due to injury, different methods may be used. First, population-based normative data can be used as reference of pre-onset health status. Second, recalled pre-injury health state values reported shortly after sustaining the traumatic injury can be used as a proxy for pre-injury health status [8, 9]. Finally, health state values of a matched non-injured group of patients can be used to assess changes in health of injured patients [10].

Studies that compared recalled pre-injury health status to the general population using generic Health-related Quality of Life (HRQoL) questionnaires generally reported that recalled pre-injury health status was higher than the health status of the general population [8, 9, 11,12,13]. In these studies, it was suggested that injured patients may not be accurately reflected by population norms. However, it is not known whether these findings can be generalized to more specific domains of health status, such as physical functioning, which is usually more affected in injured patients. Furthermore, previous studies compared pre-injury health status and normative data without adjusting for differences in general characteristics [8, 9, 11,12,13]. In other words, it is not known whether the reported differences remain after such adjustment.

Two frequently used PROMs are the Short Musculoskeletal Function Assessment (SMFA) and the EQ-5D. The SMFA is a condition-specific questionnaire that was developed to assess physical functioning of patients with a variety of musculoskeletal disorders [14]. The EQ-5D is a generic HRQoL questionnaire that can be used to evaluate general health status.

The aims of this study were (1) to evaluate and report recalled pre-injury health status of injured patients using both the condition-specific SMFA and the generic HRQoL instrument EQ-5D, and (2) to investigate whether differences in health state values existed between injured patients and the Dutch population normative data.

Materials and methods

Patients

A prospective cohort study design was used. Injured patients were recruited at the emergency department of the University Medical Center Groningen (The Netherlands), a level 1 trauma center with an emergency department that is also open to self-referrals. Patients who presented with an acute injury due to trauma were approached for inclusion. Patients had a broad range of acute injuries including wounds, fractures, or organ injury such as liver rupture or pneumothorax. Patients were identified as injured by a triage nurse of the emergency department and were treated by a surgery resident or trauma surgeon. Exclusion criteria were age ≤ 17 or > 75 years, inability to read and write Dutch, severe mental disabilities, traumatic brain injury with neurological symptoms, and residence outside of The Netherlands. Eligible patients were requested to complete the SMFA-NL and EQ-5D questionnaires on paper within 2 weeks after the injury. Patients were asked to report their health status of the week before their injury. Non-responders were reminded once.

There are no clear guidelines regarding the required sample size for the comparison of normative data of PROMs to other samples. It has been recommended to use a sample size of at least 50 per age group to establish normative data [15]. The methods employed in this study were reviewed by the local Institutional Review Board, which waived the need for further approval (METc2012.104). The study was carried out in compliance with the principles outlined in the Declaration of Helsinki on ethical principles for medical research involving human subjects.

Questionnaires

The original American SMFA consists of 46 items, divided into two indices: the function index (34 items) and the bother index (12 items) [14]. Reininga et al. cross-culturally adapted the SMFA into Dutch (SMFA-NL) and showed that it consists of four subscales: the upper extremity dysfunction (6 items), lower extremity dysfunction (12 items), problems with daily activities (20 items), and mental and emotional problems (8 items) subscales [16]. The original division into two indices is applicable to the Dutch SMFA-NL as well. Items were scored on a 5-point Likert scale (1 to 5). The SMFA-NL has been shown to be valid and reliable in injured patients [16]. In accordance with the SMFA-NL normative data, SMFA-NL scores were transposed to a 0 to 100 scale, with higher scores representing better function of patients in the explored domain.
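The transposition to a 0 to 100 scale can be illustrated with a minimal sketch. It assumes a simple linear rescaling of the summed item scores, with a raw item score of 1 as the best response and 5 as the worst; the published scoring rule, including its handling of missing items, is not reproduced here, and the function name smfa_nl_score is hypothetical.

```python
def smfa_nl_score(item_scores):
    """Rescale summed 1-5 item scores to 0-100, where higher means better function.

    Assumptions (not taken from the source): a raw item score of 1 is the
    best response and 5 the worst, and the rescaling is linear in the sum.
    """
    if not item_scores:
        raise ValueError("at least one item score is required")
    if any(s < 1 or s > 5 for s in item_scores):
        raise ValueError("item scores must lie between 1 and 5")
    n = len(item_scores)
    raw = sum(item_scores)
    lowest, highest = n * 1, n * 5
    # Distance from the worst possible sum, as a percentage of the full range,
    # so that the best possible answers map to 100.
    return 100.0 * (highest - raw) / (highest - lowest)
```

Under these assumptions, a respondent answering 1 on every item of a subscale obtains 100 points, which is the score counted as maximal in the ceiling-effect analysis.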

The EQ-5D consists of 5 items (mobility, self-care, daily activities, pain, and anxiety or depressive symptoms), which are scored on a 3-point Likert scale (1 to 3) [17, 18]. All five items load on one index value, calculated by the Dutch EQ-5D scoring algorithm [17]. Scores range from − 0.33 to 1.00, where 0.00 represents death and 1.00 represents the best possible health state. Scores below 0.00, representing a possible health state worse than death, are a consequence of the time trade-off method scoring algorithm [17, 19]. The EQ-5D has been shown to be valid and reliable in injured patients and is available in Dutch [18, 20,21,22].

Patient-reported demographic characteristics were gender, age, relationship status, and educational level. Patients were asked to report the presence of 12 common chronic health conditions (migraine, hypertension, asthma or COPD, severe spinal conditions, severe gut-related diseases, osteoarthritis, rheumatoid arthritis, diabetes mellitus, stroke, myocardial infarction, severe non-infarct cardiac conditions, and malignant disease) as used in the health surveys of Statistics Netherlands [23, 24].

Normative data

The SMFA-NL pre-injury scores were compared to the Dutch population normative data of the SMFA-NL [25]. The Dutch normative data of the SMFA-NL were published in 2015 and were based on a population sample of 875 Dutch citizens. Participants were recruited by e-mail and completed the web-based questionnaire. The sample was considered an accurate reflection of the Dutch population based on the distribution of gender, age, educational level, relationship status, and prevalence of comorbidities. The dataset of the SMFA-NL population normative data was obtained and was used in the statistical analysis of this study. EQ-5D scores gathered in this study were compared to the Dutch normative data of the EQ-5D, published by Stolk et al. [26]. The EQ-5D normative data originate from 2009 and consisted of a sample of 2667 Dutch citizens. The majority of these normative data were sampled through a web form. A small fraction of the data (n = 309) was obtained through an interview. The sample was considered an accurate reflection of the Dutch population. The original EQ-5D dataset could not be obtained; hence, all analyses were performed using the data provided in the original publication [26].

Data analysis

Demographic characteristics, injury type, and injury mechanism were presented as frequencies and proportions. The average number of chronic health conditions per patient was calculated. Means, standard deviations, and 95% confidence intervals were calculated for indices and subscales of both questionnaires. Six age groups were constructed (18–24, 25–34, 35–44, 45–54, 55–64, 65–75). The last age group did not follow the 10-year age bands, in order to match the SMFA-NL normative data. The EQ-5D normative data were originally stratified in 5-year age groups [26]. The means and standard deviations of the EQ-5D scores of the normative data were pooled, weighted by the number of participants in each 5-year age group, to create the following age groups: 20–24, 25–34, 35–44, 45–54, 55–64, and 65–74 [27]. When 15% or more of the patients reported a maximal score on a subscale, a ceiling effect was considered to be present [28].
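The pooling of stratum-level summary statistics and the 15% ceiling criterion can be sketched as follows. This is an illustration under stated assumptions rather than the procedure of reference [27]: the pooled standard deviation here combines the within-stratum and between-stratum sums of squares, and the function names pool_groups and has_ceiling_effect are hypothetical.

```python
import math

def pool_groups(groups):
    """Combine (n, mean, sd) triples from several strata into one (n, mean, sd).

    Assumption: sd values are sample standard deviations (n - 1 denominator).
    """
    total_n = sum(n for n, _, _ in groups)
    # Mean weighted by the number of participants in each stratum.
    pooled_mean = sum(n * m for n, m, _ in groups) / total_n
    # Total sum of squares = within-stratum part + between-stratum part.
    within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    between = sum(n * (m - pooled_mean) ** 2 for n, m, _ in groups)
    pooled_sd = math.sqrt((within + between) / (total_n - 1))
    return total_n, pooled_mean, pooled_sd

def has_ceiling_effect(scores, max_score=100.0, threshold=0.15):
    """True when at least `threshold` of respondents report the maximal score."""
    at_max = sum(1 for s in scores if s >= max_score)
    return at_max / len(scores) >= threshold
```

Applied to two 5-year strata, pool_groups returns the sample size, mean, and standard deviation that would have been obtained had the two strata been analysed as a single 10-year group.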

Statistical analysis

For each subscale of the SMFA-NL and for the EQ-5D, the unadjusted difference in score between the injured patients and the Dutch population was tested using independent t tests. Multivariable linear regression analyses were used to evaluate the adjusted differences in the SMFA-NL subscale scores between the injured patients and the Dutch general population. The overall mean differences in scores between the injured patients and the Dutch population were adjusted for the following covariables: gender, age, relationship status, educational level, and the number of chronic health conditions. The adjusted differences could not be calculated for the EQ-5D since the original dataset could not be obtained.
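Both comparisons can be illustrated with a short sketch. The first function compares two groups from summary statistics only, as is necessary when an original dataset is unavailable; it uses an unequal-variance standard error and a normal approximation to the t distribution, which are assumptions of this sketch and not necessarily the variant used in the study. The second function fits an ordinary least-squares model in which the coefficient of a group indicator is the adjusted difference. All names and the toy data are hypothetical, and only one covariable is included for brevity.

```python
import math
from statistics import NormalDist

def unadjusted_difference(n1, mean1, sd1, n2, mean2, sd2):
    """Difference in means with a 95% CI and an approximate two-sided p value."""
    diff = mean1 - mean2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    # Normal approximation to the t distribution; adequate for large samples.
    p = 2.0 * (1.0 - NormalDist().cdf(abs(diff / se)))
    half = NormalDist().inv_cdf(0.975) * se
    return diff, (diff - half, diff + half), p

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y], solved by Gauss-Jordan elimination.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Hypothetical toy data: columns are intercept, injured (1) vs population (0),
# and number of chronic health conditions. Scores are constructed, not observed.
rows = [[1, 0, 0], [1, 0, 1], [1, 0, 2], [1, 0, 2],
        [1, 1, 0], [1, 1, 0], [1, 1, 1]]
scores = [80 + 5 * injured - 3 * chronic for _, injured, chronic in rows]
intercept, adjusted_diff, per_condition = ols(rows, scores)
```

In this constructed example the crude difference in group means is 7.75 points, whereas the coefficient of the group indicator is 5 points, because the hypothetical injured group reports fewer chronic health conditions. This mirrors the way in which adjustment for covariables can reduce an unadjusted difference.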

Sensitivity analysis

A two-part model approach was used to investigate the difference between injured patients and the Dutch population with respect to possible ceiling effects [29, 30]. In the first part of the two-part model, a multivariable logistic regression was used to estimate the (adjusted) difference in probability of achieving the maximum SMFA-NL score, between the injured patients and the Dutch population. The second part was a multivariable linear regression analysis to evaluate the differences in the SMFA-NL scores between the injured patients and the Dutch general population, among those with a sub-maximal SMFA-NL score (less than 100 points). In both parts of the two-part model, the covariables were gender, age, relationship status, educational level, and the number of chronic health conditions. The sensitivity analysis was performed for all indices and subscales of the SMFA-NL.
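The logic of the two-part approach can be sketched in a simplified, unadjusted form. The study fitted multivariable logistic and linear regression models with covariables; the sketch below omits all covariables and only shows how the data are split into the two parts. The function name two_part_summary is hypothetical.

```python
def two_part_summary(injured, population, max_score=100.0):
    """Unadjusted two-part comparison of two lists of scores.

    Part one: odds ratio of reaching the maximal score (injured vs population).
    Part two: difference in mean score among respondents below the maximum.
    Assumes each group contains respondents both at and below the maximum.
    """
    def split(scores):
        at_max = [s for s in scores if s >= max_score]
        below = [s for s in scores if s < max_score]
        return at_max, below

    inj_max, inj_below = split(injured)
    pop_max, pop_below = split(population)
    # Odds = respondents at the maximum / respondents below it.
    odds_ratio = (len(inj_max) / len(inj_below)) / (len(pop_max) / len(pop_below))
    mean_diff = sum(inj_below) / len(inj_below) - sum(pop_below) / len(pop_below)
    return odds_ratio, mean_diff
```

Separating the two parts keeps respondents at the ceiling from masking or inflating the difference estimated among those whose scores lie within the measurable range of the scale.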

Missing values were handled listwise. Items that were answered incorrectly were handled as missing. A p value smaller than 0.05 was considered statistically significant. To correct for multiple comparisons in the multivariable regression analyses, a Bonferroni correction was used and the p value was set at 0.0083 (0.05/6).

Results

General characteristics

Between October 2012 and February 2014, a total of 596 patients filled in the questionnaires (response rate: 43%). All age groups contained at least 51 patients. Demographic characteristics, injury types, and injury mechanisms of the study sample are described in Table 1. The study sample contained more males (60%, n = 359) than females. Upper and lower extremity fractures were the most prevalent injuries (21% and 19%, respectively). Most patients sustained the injury in a traffic accident (22%), fall (22%), or during sports (21%). Of the injured patients, 54% reported that they did not have any chronic health condition (Table 1). The general characteristics of the SMFA-NL and EQ-5D normative data samples are also shown in Table 1 [25, 26].

Table 1 General characteristics

Difference between pre-injury health status of injured patients and health status of the Dutch population

Unadjusted pre-injury scores of the injured patients were significantly better on all indices and subscales of the SMFA-NL, compared to the Dutch normative data (Table 2). Mean differences ranged from + 2.4 to + 8.6 points (all p values < 0.001, Table 2). The pre-injury EQ-5D score of the total group of patients was 0.05 points higher than that of the Dutch population (p < 0.001, Table 2).

Table 2 Unadjusted differences in pre-injury scores and the Dutch population norms

The adjusted mean differences between pre-injury scores of injured patients and the Dutch population ranged from + 0.8 to + 6.8 points (Tables 3 and 4). At the Bonferroni-corrected alpha level, the adjusted differences between the injured patients and the Dutch population were significant for all subscales, except for the upper extremity dysfunction subscale (+ 0.8 points [95% CI − 0.4 to 2.1], p = 0.2). For all subscales, the number of chronic health conditions was found to be the strongest confounder of the difference in health status between injured patients and the general population. Adjustment for chronic health conditions reduced the estimate of the difference in score between injured patients and the Dutch population, with reductions ranging from 32% on the mental and emotional problems subscale to 65% on the upper extremity dysfunction subscale.

Table 3 Adjusted difference between injured patients and the Dutch population for the indices of the SMFA-NL
Table 4 Adjusted difference between injured patients and the Dutch population for the subscales of the SMFA-NL

Sensitivity analysis

In part one of the sensitivity analysis, injured patients had a significantly higher likelihood of scoring the maximum SMFA-NL score, on all indices and subscales (Appendix Tables 5, 6, 7), compared to the Dutch population. Odds ratios ranged from 1.95 [95% CI 1.2–4.4], p < 0.001 on the bother index, to 3.96 [95% CI 2.92–5.37], p < 0.001 on the lower extremity dysfunction subscale. For all subscales, chronic health conditions significantly decreased the probability of scoring the maximum SMFA score (Appendix Tables 5, 6, 7).

In part two of the sensitivity analysis (only patients with a sub-maximal score), injured patients showed a significantly better score on the function index (2.8 points [95% CI 1.2–4.4], p < 0.001, Appendix Table 5) and the mental and emotional problems subscale (4.9 points [95% CI 2.9–6.9], p < 0.001, Appendix Table 7), compared to the Dutch population. The difference in score between the injured patients and the Dutch population was not significant for the bother index or for the upper extremity dysfunction, lower extremity dysfunction, and problems with daily activities subscales (Appendix Tables 5, 6, 7). For all subscales, the presence of chronic health conditions was significantly associated with reporting a lower score (Appendix Tables 5, 6, 7).

Discussion

The present study showed that injured patients reported significantly better pre-injury scores compared to the Dutch population for both the condition-specific SMFA-NL and the generic EQ-5D questionnaires. Adjustment for general characteristics reduced the differences between the pre-injury health status of injured patients and the health status of the Dutch population, yet the differences remained statistically significant. The reduction of this difference in health status between both samples was mainly due to the lower number of chronic health conditions reported by injured patients.

It is important to evaluate whether the differences in health status are clinically relevant. To the best of our knowledge, there is no known minimally important difference (MID) value of the SMFA [31]. Hence, there is no clear reference available that can be used to indicate which difference between groups may be considered clinically relevant. However, the differences were smaller than the standard error of measurement of the SMFA-NL, which ranged from 7.8 points for the function index, to 11.3 points for the mental and emotional problems subscale [16]. We think that the adjusted differences in health status between the injured patients and the Dutch population were too small to reflect a clinically relevant difference. This was supported by part two of the sensitivity analysis, which showed that among patients with a sub-maximal score, there was no evidence of a difference in health status between injured patients and the Dutch population for four of the six scales.

Though there was little evidence of a difference in health status between the injured patients and the Dutch population among patients with a sub-maximal score (part two of the sensitivity analysis), this conclusion may not be directly translated to patients who reached the limit of the scale (i.e., a score of 100 points). The sensitivity analysis (part one) showed that injured patients were significantly more likely to reach the maximal score than the Dutch population. The increased likelihood of reaching the maximal score may indicate that there could be a difference in health status between the injured patients and the Dutch population ‘above’ the maximal SMFA-NL score of 100 points. However, since 100 points was the upper limit of the scale, the difference in health status between both groups could not be further quantified. This was a limitation of this study and may be the subject of further research using a questionnaire that is less susceptible to ceiling effects.

Regarding the EQ-5D, one MID value of 0.08 points has been reported to compare groups of patients with musculoskeletal conditions [32, 33]. This value was not reported in an injury-specific study population, but was calculated from a sample of patients undergoing total hip arthroplasty. Based on this MID, the difference between injured patients and the normative data of the EQ-5D found in our study (an unadjusted difference of 0.05 points) was not considered clinically important. In addition, the EQ-5D score difference was not adjusted for patient characteristics and may be smaller after such adjustment.

The unadjusted differences found in the present study are in line with previous research on generic HRQoL instruments. In a systematic review, Scholten et al. concluded that recalled pre-injury health status consistently exceeded population norms in patients with traumatic injuries [34]. In a sample of patients with a broad range of traumatic injuries, Watson et al. used the SF-36 and reported higher pre-injury scores on both the physical and mental domains [12]. The differences found in the study of Watson et al. were of a similar magnitude to the unadjusted differences found in the present study. Wilson et al. used the EQ-5D in a large sample of 2842 patients that sustained various traumatic injuries, and reported that pre-injury health status was 0.12 points higher than the health status of the general population [8].

In several previous studies, it has been discussed that the (unadjusted) difference between injured patients and the general population may be explained in terms of recall bias or response shift [8, 9, 12, 34]. In this context, response shift means that the experience of poorer health status after the injury may have inflated the patient’s valuation of recalled pre-injury health status [34, 35]. Alternatively, it was hypothesized that injured patients may be a specific sub-sample of the general population [8, 9, 12, 34]. However, in these studies, the differences were never adjusted for patient characteristics. The present study showed that controlling for patient characteristics led to a reduction of the difference in pre-injury health status and health status of the general population. Having one or more chronic health conditions was of greater influence on the difference in health status, than originating either from the group of injured patients or the Dutch population. Hence, though the present study was not able to quantify response shift or recall bias, the findings imply that the differences between recalled pre-injury health status and general population norms may for an important part be explained by differences in general characteristics and in particular the number of chronic health conditions.

Prospective evaluation of pre-injury health status is preferred, since it is not subject to bias and response shift due to sustaining the injury [34]. However, in clinical practice, prospective evaluation is generally not feasible. The use of normative data has been advocated, since it provides pre-injury estimates that are free of recall bias and response shift [34]. In addition, the use of normative data relieves the administrative burden on patients. However, the use of normative data relies on the assumption that the population norms are an accurate reflection of injured patients. The (adjusted) difference in health status between patients with a broad range of traumatic injuries and the general population norms is small [34]. However, this may not be applicable to all injured patients. In specific samples, such as patients with hip fractures, pre-injury health status is worse than that of the general population [36, 37]. In contrast, patients with gun-shot injuries and traumatic brain injury report a high pre-injury health status [38, 39]. It has been suggested that patients with specific injuries are likely to have a poorer or better general health than the general population, depending on the injury, in terms of socioeconomic status or comorbidities [34]. Because of the assumptions underlying the use of normative data, the representativeness of the normative data for the study sample should be considered carefully before they are used, especially in patients with specific injuries. If population norms are used as a proxy for pre-injury health status, they should be adjusted for differences in general characteristics.

Recalled pre-injury scores, on the other hand, are also subject to debate. As outlined earlier, they are susceptible to two biasing factors. First, patients may remember their pre-injury health state incorrectly, thereby inducing recall bias. Recall bias may lead to an overestimation of patients’ pre-injury health status [40, 41]. However, when patients recall their pre-injury health status shortly after the injury, recall bias may be limited. A two-week interval is generally considered appropriate to limit recall bias [28, 42]. Second, response shift may operate. Since patients evaluate their pre-injury health status after the injury, the injury itself may have changed patients’ perception of their pre-injury health status, due to a change in internal valuation of what health is [35]. This may inflate the recalled pre-injury health status. In the absence of prospectively assessed pre-injury health status, it is not possible to quantify response shift. Nonetheless, others have argued that post-injury assessment of pre-injury health status may have its advantages. It enables patients to value their pre-injury health status based on newly learned information that could not have been gained before the injury and is not present in population norms [34, 43]. In addition, recalled pre-injury health status allows pre- and post-injury health status to be evaluated on the basis of the same set of internal values, which has been suggested to be preferable in terms of validity and reliability [34, 43, 44].

Limitations of the present study

One of the limitations of this study was that the two PROMs that were used were susceptible to ceiling effects. This is a known limitation of both the SMFA-NL and EQ-5D [14, 16, 45]. Because pre-injury and general population health status were considered relatively ‘healthy’ conditions, ceiling effects were expected. A sensitivity analysis by means of a two-part model was used to account for the ceiling effects on the SMFA-NL. The sensitivity analysis could not be performed for the EQ-5D since the original dataset could not be obtained.

Additional differences between injured patients and the general population may be explained by other variables, such as socioeconomic status, and additional chronic health conditions such as kidney disease, levels of pre-injury physical activity, and mental health [34]. However, these variables were not available in this study.

The sample size of the study was considered adequate and the response rate of 43% was considered reasonable for an injured patient population. However, the response rate may have introduced selection bias [46].

The differences in the applied methods of administration of the SMFA-NL and EQ-5D might be considered a limitation. The injured patients completed the questionnaires on paper, while the normative data of the SMFA-NL were administered electronically [25]. The EQ-5D normative data were mainly sampled using internet web forms [26]. In a meta-analysis, it was concluded that there is extensive evidence of the equivalence of on-paper and electronically administered PROMs [47]. We believe that the mode of administration had no influence on the differences between the study samples.

To obtain pre-injury health status, patients were asked to report their health status of the week before their injury. The recall period of both PROMs was slightly changed from that of the original PROM. This was considered a limitation of this study, since it is preferable to completely re-evaluate the validity and reliability of a PROM when any change is made to it [48, 49]. Though no standard recall period exists, shorter recall periods are typically preferred, and the recall period must be based on the purpose of the assessment [50]. The recall interval of the adjusted question was considered very similar to that of the original question, appropriate for both measures, and short enough that the effects on the validity, reliability, and recall bias of both questionnaires would be limited.

In future studies where pre-injury data are not available, adjusted normative data may be used to compare groups of patients who sustained general trauma. Prospective (population-wide) studies may provide insight into the effects of recall bias and response shift on pre-injury health status.

Conclusion

This study provided insight into differences in population characteristics and pre-injury health status of injured patients, compared to the Dutch general population. For both the generic HRQoL and condition-specific measures, injured patients reported a better pre-injury health status than the general population. However, general characteristics explained an important part of the difference in health status between injured patients and the general population. Within the detectable range of the scale, adjusted differences between the recalled pre-injury health status of injured patients and the general population were considered not clinically relevant.