Modified Scoring of the QuickDASH Can Achieve Previously-unattained Interval-level Measurement in Dupuytren Disease and Carpal Tunnel Syndrome

Background: Rasch measurement theory (RMT) can be used to identify scales within questionnaires and to map responses to more precise continuous scales. The aim of this article was to use RMT to refine the scoring of the QuickDASH in patients with Dupuytren disease and carpal tunnel syndrome (CTS). Methods: Data were collected between 2013 and 2019 from a single center in the UK. Preoperative QuickDASH responses from patients diagnosed with Dupuytren disease and CTS were used. RMT was used to reduce the number of items in the QuickDASH and examine the reliability and validity of each subscale. Results: The preoperative QuickDASH responses of 750 patients with Dupuytren disease and 1916 patients with CTS were used. The median age of participants was 61 years, and 46% were men. Exploratory factor analysis suggested two distinct subscales within the QuickDASH: task items 1–6 and symptom items 9–11. These items were fitted to the Rasch model, and disordered response thresholds were collapsed. In Dupuytren disease, the two worst response options for each item were disordered. After collapsing these options, good Rasch model fit was demonstrated. CTS responses fitted without modification. Item targeting was more appropriate for CTS than Dupuytren disease. Conclusions: This study proposes a modification to the scoring system for the QuickDASH that provides high-quality, continuous, and condition-specific scales. The identification of distinct subscales within the QuickDASH can be used to identify distinct improvements in hand function and/or symptoms in previous, current, and future work.

and/or symptoms. 5 Both the DASH and the QuickDASH assume that all items reflect a single underlying entity (are "unidimensional") and involve fixed scoring of items using ordinal scales, which assumes that score intervals are always equally spaced. If unidimensionality is assumed inappropriately, and ordinal scores are treated as continuous data, clinically relevant changes may not be detected, or may be implied incorrectly.
Both issues may affect the QuickDASH. Firstly, previous work has indicated that different items that comprise the QuickDASH may measure different health constructs (or "factors"): for example, the degree of difficulty performing daily tasks or the severity of hand symptoms. 6,7 Although symptoms and function may be related, they are conceptually different constructs. Summing symptom item scores and function item scores into a single overall score can make the interpretation of the QuickDASH problematic: positive change in one factor may be masked by negative change in another, for example.
Secondly, ordinal scales like QuickDASH are less accurate, precise, and interpretable than continuous scales because their measurement intervals are not equally spaced. For example, the difference in function between having mild and moderate problems holding a shopping bag is not necessarily equal to the difference in function between having moderate and severe problems holding a shopping bag. It cannot be assumed that the difference in function between a QuickDASH score of 5 and 10 is the same as the difference between a score of 10 and 15. Despite this, QuickDASH scores are often treated as continuous data in clinical practice and research, and so may fail to capture clinically relevant differences, or imply differences erroneously.
Rasch measurement theory (RMT) is a branch of psychometrics developed in the 1960s but only recently applied to surgical outcome measurement. 8 It assumes unidimensionality and addresses the problem of ordinal scoring by mapping item responses to continuous scales through statistical models. In RMT, the probability of endorsing a given response is a function of the measured trait (eg, hand function). In other words, RMT can calculate the probability distribution of endorsing a certain response option, given the respondent's level of hand function, or it can predict the respondent's level of hand function on a continuous scale given a set of item responses.
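As an illustration only, the relationship between the measured trait and the probability of each response option can be sketched in Python. The sketch assumes a partial credit parameterization of the polytomous Rasch model, which the text above does not specify; the function name and the threshold values are hypothetical and are not estimates from this study.

```python
import math


def pcm_category_probabilities(theta, thresholds):
    """Probability of each response category (0..m) for one item.

    Assumes a partial credit model: `theta` is the person's trait level in
    logits and `thresholds` are the m item thresholds in logits.
    """
    # Cumulative sums of (theta - threshold); the lowest category has sum 0.
    cumulative = [0.0]
    for tau in thresholds:
        cumulative.append(cumulative[-1] + (theta - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical, ordered thresholds for a five-option item (not study values).
example_thresholds = [-2.0, -0.5, 0.5, 2.0]

for theta in (-3.0, 0.0, 3.0):
    probabilities = pcm_category_probabilities(theta, example_thresholds)
    most_likely = probabilities.index(max(probabilities))
    print(theta, most_likely, [round(p, 3) for p in probabilities])
```

Under these assumed thresholds, the most probable category moves from the lowest-scored option to the highest-scored option as the trait level increases, which is the behavior that item characteristic curves display graphically.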
Dupuytren contracture and carpal tunnel syndrome (CTS) are two of the most common conditions encountered in elective hand surgery 9 and represent distinct symptom profiles. The aim of this study was to use RMT to determine whether the QuickDASH questionnaire could provide true continuous measurement in patients with these conditions.

Patient Cohort
The study setting was a regional hand surgery center. A total of 750 patients who had surgical treatment for Dupuytren contracture and 1916 patients who underwent carpal tunnel decompression between 2013 and 2019 were identified. The median age was 61 years (interquartile range: 51-71 years), and there were 1217 male patients (46%).
Complete preoperative QuickDASH item responses were available for 731 patients (97%) with Dupuytren disease and 1851 patients (95%) with CTS. Procedures were done under the care of a single consultant hand surgeon.

RMT Analysis
A previous exploratory factor analysis demonstrated unidimensionality of items 1-6 ("task-based items") and items 9-11 ("symptoms-based items"). 7 We undertook a separate RMT analysis for each group of items. The responses to these items are assessed on a five-point Likert scale ("no difficulty," "mild difficulty," "moderate difficulty," "severe difficulty," and "unable").
Analyses were undertaken using R, version 4.0.3. First, the ability to consistently order items, a requirement for subsequently fitting to the Rasch model, was investigated using nonparametric item response theory-based Mokken analysis. 10 In this analysis, item scalability was assessed using the Loevinger H coefficient, with a coefficient of greater than 0.3 considered acceptable. 10 Local dependency (LD), where items are excessively related to each other, was measured by Yen's Q3 statistic: values of more than 0.2 were accepted as indicating LD. 11 Fit to the Rasch model assumes that LD is not present.
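The local dependency screen can be sketched as follows. This is a minimal Python illustration, not the code used in the study (which was run in R): it assumes that Yen's Q3 is computed as the Pearson correlation between the model residuals (observed minus expected score) of each item pair, and it applies the 0.2 cutoff stated above. The residual values and item names are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def q3_flags(residuals, cutoff=0.2):
    """Return item pairs whose residual correlation (Q3) exceeds the cutoff.

    `residuals` maps an item name to its list of per-respondent residuals
    (observed score minus model-expected score).
    """
    flagged = {}
    for item_a, item_b in combinations(sorted(residuals), 2):
        q3 = pearson(residuals[item_a], residuals[item_b])
        if q3 > cutoff:
            flagged[(item_a, item_b)] = q3
    return flagged


# Hypothetical residuals for three items and five respondents.
example_residuals = {
    "item1": [0.5, -0.3, 0.2, -0.4, 0.1],
    "item2": [0.4, -0.2, 0.3, -0.5, 0.0],
    "item3": [-0.2, 0.4, -0.3, 0.1, 0.0],
}

example_flags = q3_flags(example_residuals)
print(example_flags)
```

In this made-up example only the item1-item2 pair is flagged, because the residuals of those two items move together after the trait has been accounted for.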
Item characteristic curves (ICCs) were plotted; item response thresholds were determined to investigate successive scoring of response options. The fit of each item to the Rasch model was assessed using a chi-square test, as well as infit and outfit mean squares. Chi-square tests that were nonsignificant at the level of P greater than 0.05 were deemed to represent appropriate model fit, along with infit and outfit mean squares between 0.5 and 1.7.
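For readers unfamiliar with these statistics, a minimal Python sketch is given here. It assumes the conventional definitions, with outfit as the unweighted mean of squared standardized residuals and infit as the information-weighted equivalent, and it applies the 0.5 to 1.7 acceptance range stated above. The observed scores, expected scores, and variances are hypothetical; in practice they come from the fitted Rasch model.

```python
def infit_outfit(observed, expected, variance):
    """Infit and outfit mean squares for one item.

    `observed`, `expected`, and `variance` hold one value per respondent:
    the observed item score, the model-expected score, and the model
    variance of the score.
    """
    squared_residuals = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: mean of squared standardized residuals (sensitive to outliers).
    outfit_value = sum(
        s / w for s, w in zip(squared_residuals, variance)
    ) / len(observed)
    # Infit: squared residuals weighted by the model variance (information).
    infit_value = sum(squared_residuals) / sum(variance)
    return infit_value, outfit_value


def within_range(mean_square, low=0.5, high=1.7):
    """True if a mean square lies inside the acceptance range used here."""
    return low <= mean_square <= high


# Hypothetical values for four respondents (not study data).
infit, outfit = infit_outfit(
    observed=[1, 2, 3, 2],
    expected=[1.5, 2.0, 2.5, 2.5],
    variance=[0.4, 0.6, 0.5, 0.25],
)
print(round(infit, 3), within_range(infit))
print(round(outfit, 3), within_range(outfit))
```

Mean squares below 0.5 indicate responses that are more predictable than the model expects, which is the pattern reported for some items in the Results.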
Close or disordered ICC thresholds were collapsed, and RMT analysis was repeated. Where item-level misfit still existed after response option collapse, we attempted to understand why misfit was occurring by using a generalized additive model 12 to plot smoothed regression lines for item response probability functions in the mirt R package (version 1.33.2). 13 Item-person plots were created to compare the targeting of items to the score distribution in our sample.

Takeaways
Question: Can the QuickDASH provide true continuous measurement of function and symptoms using Rasch measurement theory?
Findings: Preoperative QuickDASH responses from patients with carpal tunnel syndrome (n = 1916) and Dupuytren disease (n = 750) were used. Exploratory factor analysis suggested two distinct subscales (function items 1-6 and symptoms items 9-11). These items were fitted to the Rasch model, and good fit was demonstrated.
Meaning: This study proposes a modification to the scoring system of the QuickDASH. The identification of distinct subscales with interval-level scoring can be used to identify improvements in hand function and/or symptoms that may not have been detected through composite scoring.
To assess scale-level fit, five model fit statistics were determined and reported: chi-square (χ²), comparative fit index (CFI), Tucker-Lewis index (TLI), root mean squared error of approximation (RMSEA), and the standardized root mean squared residual (SRMR). The following values were considered to indicate good Rasch model fit: χ² P greater than 0.05, CFI greater than or equal to 0.950, TLI greater than or equal to 0.950, RMSEA less than 0.060, and SRMR less than or equal to 0.080. 14 We anticipated type I errors in the item-level and scale-level χ² P values, as these are almost always significant with sample sizes as large as ours. 15 Internal consistency was assessed and reported as Cronbach alpha: a value of more than 0.7 represents acceptable internal consistency, while values of 0.95 or more can suggest item redundancy. 16 After RMT analysis, we calculated conversion tables to translate the "raw" ordinal sum score of the items to a continuous scale between 0 and 100. This was achieved by cross-walking matched sum scores and Rasch scores that were calculated via an expected a posteriori approach. 12
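The internal consistency criterion can be illustrated with a short Python sketch. It uses the standard formula for Cronbach alpha (based on sample variances) and the interpretation thresholds stated above; the response matrix is hypothetical, and the function names are illustrative rather than taken from the study code.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    total_variance = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


def interpret_alpha(alpha):
    """Apply the thresholds used in this article (more than 0.7; 0.95 or more)."""
    if alpha >= 0.95:
        return "possible item redundancy"
    if alpha > 0.7:
        return "acceptable"
    return "below acceptable"


# Hypothetical responses: five respondents, three items (not study data).
example_responses = [
    [1, 1, 2],
    [2, 2, 2],
    [3, 3, 4],
    [4, 4, 4],
    [5, 4, 5],
]

example_alpha = cronbach_alpha(example_responses)
print(round(example_alpha, 3), interpret_alpha(example_alpha))
```

The made-up items in this example are very highly correlated, so the sketch returns an alpha above 0.95, which under the criteria above would prompt a check for item redundancy.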

RMT Analysis in Patients with Dupuytren Disease
Both groups of items demonstrated scalability with a Loevinger H coefficient of more than 0.3 (overall scale 0.733 for task-based; 0.721 for symptoms-based). No item pairs were locally dependent.
ICCs generated in the initial RMT assessment of QuickDASH items 1-6 in Dupuytren disease are presented in Figure 1. For all six items, the thresholds between the two most negative response options ("severe difficulty" and "unable") were close or disordered. In real-world terms, this means that the difference between these two responses, in units of hand function, was either very small or nonexistent. We handled this by scoring these two response options equally. After rescoring the participants' QuickDASH responses in this way, item-level fit was generally good, except for χ² P values (Table 1). Items 2 (do heavy household chores), 3 (carry a shopping bag or briefcase), and 5 (use a knife to cut food) showed poorer fit (outfit <0.5). Generalized additive model response curve estimates suggested that these items were poorly discriminative at the higher end of the scale even after collapsing the most severe response options; people generally did not have poor enough hand function to report severe difficulty in doing heavy chores, carrying shopping bags, or using a knife to cut food. (See figure 1.)
In our Dupuytren disease cohort, the item-person plot (IPP) showed a negatively skewed score distribution, with items generally targeted toward discriminating between respondents with poorer hand function than were present in the sample (Fig. 3). Items 2, 3, and 5 were the most mis-targeted. The Rasch score conversions for items 1-6 are presented in Table 2.
ICCs generated in the initial RMT assessment of QuickDASH items 9-11 (rate the severity of pain in the last week, rate the severity of tingling in the past week, and during the past week how much difficulty have you had sleeping) in Dupuytren disease are presented in Figure 4. For items 9-11, scale-level fit statistics were as follows: χ² P less than 0.001, CFI = 0.980, TLI = 0.980, RMSEA = 0.086, and SRMR = 0.095. QuickDASH items 9-11 showed inappropriate targeting and a suboptimal Rasch model fit. As a result, no further modification was undertaken, although the Rasch score conversions for items 9-11 are presented in Table 3 for completeness.
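To make the comparison against the prespecified criteria explicit, the reported statistics can be checked programmatically. The Python sketch below encodes the thresholds given in the Methods (CFI and TLI of 0.950 or more, RMSEA less than 0.060, SRMR of 0.080 or less) and applies them to the values reported above for items 9-11 in Dupuytren disease. The χ² P value is left out because it was expected to be significant at this sample size; the function name is illustrative.

```python
def check_scale_fit(cfi, tli, rmsea, srmr):
    """Compare scale-level fit statistics with the criteria in the Methods."""
    return {
        "CFI": cfi >= 0.950,
        "TLI": tli >= 0.950,
        "RMSEA": rmsea < 0.060,
        "SRMR": srmr <= 0.080,
    }


# Values reported in the text for items 9-11 in Dupuytren disease.
dupuytren_items_9_11 = check_scale_fit(
    cfi=0.980, tli=0.980, rmsea=0.086, srmr=0.095
)

for statistic, meets_criterion in dupuytren_items_9_11.items():
    print(statistic, "meets criterion" if meets_criterion else "does not meet criterion")
```

With these inputs, CFI and TLI meet their criteria while RMSEA and SRMR do not, which is consistent with the description of a suboptimal fit.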

RMT Analysis in Patients with CTS
Loevinger H was more than 0.3 for items 1-6 (overall scale 0.704). There was no LD: all item pairs had a Yen Q3 of 0.2 or less.
ICCs from our CTS cohort are presented in Figure 5. No disordered thresholds were observed, and therefore, no modification was made to the scoring. Item-level fit statistics, except for χ² P values, suggested good model fit for all items (Table 4). Scale-level fit statistics were as follows: χ² P less than 0.001, CFI = 0.962, TLI = 0.962, RMSEA = 0.128, and SRMR = 0.062.
The IPP demonstrated excellent targeting, with no floor or ceiling effects, and appropriately spaced thresholds (Fig. 3). The Rasch score conversions for items 1-6 are presented in Table 2; conversions for items 9-11 are presented in Table 3.
To generate valid Dupuytren disease Rasch scores from QuickDASH items 1-6, clinicians and researchers should allocate a score of 4 to both the "severe difficulty" and "unable" response options. The item scores can then be summed and converted using our Rasch conversion tables. In CTS, no modifications are required at the item level: the scores from items 1-6 can be summed and converted with the table presented. For the symptoms-based items 9 and 10 in CTS, "no difficulty" should remain with a score of 1, "mild difficulty" should also be scored as 1, "moderate difficulty" should be scored as 2, "severe difficulty" should be scored as 3, and "unable" should be scored as 4. The same scoring system should be applied for item 11, but "unable" should also be scored as 3. Following this modified scoring, the conversion tables for items 9-11 can then be consulted. We do not recommend the use of items 9-11 in patients with Dupuytren disease. The Rasch scores are not interchangeable between Dupuytren disease and CTS, and the disease-specific conversion scores must be used. For completeness, the ICCs for unmodified QuickDASH items 9-11 in CTS, for the modified QuickDASH items 9-11 in Dupuytren disease, and for unmodified QuickDASH items 9-11 in Dupuytren disease are presented (Figs. 6-8).
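The rescoring step for Dupuytren disease can be sketched in Python as follows. The sketch assumes that raw QuickDASH responses are coded 1 ("no difficulty") to 5 ("unable"), and it implements only the collapse described above for items 1-6, in which "severe difficulty" and "unable" both receive a score of 4. The names are illustrative, and the final conversion to a Rasch score still requires the published conversion table (Table 2), which is not reproduced here.

```python
# Recoding for Dupuytren disease, items 1-6: raw options 4 and 5 both score 4.
# Assumes raw responses are coded 1 (no difficulty) to 5 (unable).
DUPUYTREN_TASK_RECODE = {1: 1, 2: 2, 3: 3, 4: 4, 5: 4}


def dupuytren_task_sum(raw_responses):
    """Sum of the rescored task items (QuickDASH items 1-6).

    `raw_responses` is a list of six raw item responses, each from 1 to 5.
    The returned sum is the value to look up in the Dupuytren-specific
    conversion table.
    """
    if len(raw_responses) != 6:
        raise ValueError("Expected responses to QuickDASH items 1-6.")
    if any(response not in DUPUYTREN_TASK_RECODE for response in raw_responses):
        raise ValueError("Responses must be integers from 1 to 5.")
    return sum(DUPUYTREN_TASK_RECODE[response] for response in raw_responses)


# Hypothetical respondent: the two "unable" answers are scored as 4.
example_sum = dupuytren_task_sum([1, 2, 3, 4, 5, 5])
print(example_sum)
```

Because the Rasch scores are condition-specific, a separate recoding map and conversion table would be needed for CTS; this sketch should not be applied to CTS responses.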

Discussion
We have previously suggested that QuickDASH items 1-6 can be used as a stand-alone, unidimensional measure of hand function in CTS and Dupuytren disease. 7 This study has built on these results by providing a new condition-specific scoring system for these items, which allows accurate measurement of hand function on a continuous scale in patients with CTS and Dupuytren disease.
Rasch model scales detailed in the present study can be applied prospectively or retrospectively to QuickDASH response data to achieve more precise, accurate, and interpretable measurement of hand function in CTS and Dupuytren disease. This has immediate implications for clinicians at an individual patient level. For example, a patient presenting with CTS and a raw QuickDASH score of 8 (for items 1-6) has an equivalent Rasch QuickDASH score of 16. This is clinically relevant: conventional scoring indicates the patient is likely asymptomatic, whereas Rasch scoring indicates the patient is likely to have problems working. The primary strength of this study is that these modified scoring systems can be retrospectively applied to existing datasets. The increasing use of PROMs in different hand conditions has resulted in large datasets of generic PROM responses, although these instruments have not all been validated to contemporary psychometric standards. If large volumes of data have previously been collected, then it is desirable to use these data for further study, if possible. Furthermore, modification of such datasets that leads to optimization of existing hand surgery PROMs has clear benefits, such as minimizing further inconvenience to patients and streamlining research, while avoiding the significant costs that can be associated with the development, validation, and application of new PROMs. Such an approach could prove valuable while we await the development of new instruments that meet the criteria for validity and reliability set out in the Consensus-based Standards for the Selection of Health Measurement Instruments (COSMIN) statement. 17
There are also limitations to our study. Although this study could improve the structural validity of the QuickDASH in CTS and Dupuytren disease, we are unable to address the content validity of this PROM in these conditions. This represents the main limitation of this study. The QuickDASH is a short-form derivative of the DASH PROM 18 developed through item reduction. The DASH was originally developed as a generic site-specific tool, before the introduction of the COSMIN checklist. Maintaining content validity is an important consideration when modifying pre-existing PROMs 19: because the QuickDASH is not a condition-specific PROM, its validity may vary between conditions, 20 with worse preoperative QuickDASH scores observed in patients with CTS than in those with Dupuytren disease.

Fig. 2. ICCs for items 1-6 in Dupuytren disease, following modification of the scoring system by collapsing response options 4 and 5.

Fig. 3. Item-person plots for the modified Rasch model for items 1-6 in CTS (top plot) and Dupuytren disease (bottom plot).

Table 1. Item-level Fit Statistics for Dupuytren Item Responses
DASH6: Recreational activities where you take some force through your hand; 0.684; 0.926; <0.01