Original Article
Simulated computerized adaptive tests for measuring functional status were efficient with good discriminant validity in patients with hip, knee, or foot/ankle impairments
Introduction
Computerized adaptive testing (CAT) has transformed the process of estimating latent traits [1]. Latent traits or abilities cannot be directly observed, but can be estimated by analyzing a person's performance on a set of items [2]. For the purpose of this study of patients with lower extremity impairments, the latent trait of interest is lower extremity functional status (FS), which we operationally define as the patient's perception of their ability to perform functional tasks described in the FS items. FS is of interest because many people seek rehabilitation to improve functional deficits caused by lower extremity impairments [3].
CAT has its origins in mental [4], educational [5], and military [6] testing, and the availability of inexpensive, powerful computers has facilitated development of computerized adaptive tests (CATs) [1], [6]. CATs have recently emerged in the medical [7], [8] and rehabilitation [9], [10] fields, and development of CAT measures of function in rehabilitation has been recommended [11], [12], [13].
CATs offer advantages over computer-administered or paper-and-pencil outcomes instruments. CATs (1) administer informative items, the difficulty of which is matched to the patient's level of ability, reducing the number of inappropriate items administered; (2) administer fewer items, reducing respondent burden with little reduction in precision of patient ability estimates; (3) allow the level of measurement precision to be established before testing, improving control of measurement error during testing; and (4) simplify test revision by allowing new items to be added and tested as needed [6], [14]. CATs provide an efficient alternative to traditional paper-and-pencil or computer-administered tests, and allow outcomes data to be collected during the clinical encounter with reduced patient and scoring burden. Therefore, CAT facilitates management of a central conflict in scale development, achieving good measurement precision with low response burden [6], [7], and is applicable to assessment of outcomes, that is, change in FS in patients receiving rehabilitation [9], [10], [15], [16]. Recent symposia in health outcomes methodology and computer-based testing have emphasized the need to (1) improve outcomes assessment for advancing the science and practice of treatment-effectiveness evaluation [17], and (2) chart a path to development of better computer-based tests [18].
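The adaptive logic described above (choose the most informative remaining item, re-estimate ability, stop once a preset precision is reached) can be illustrated with a minimal Python sketch. It is not the procedure used in this study: it assumes a dichotomous Rasch model for simplicity, a normal prior on ability to keep estimates finite, and hypothetical names (`run_cat`, `estimate_theta`, `se_target`) and item difficulties chosen only for illustration.

```python
import math

def p_endorse(theta, b):
    """Rasch probability of a positive response to an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def information(theta, b):
    """Fisher information of a dichotomous Rasch item at ability theta."""
    p = p_endorse(theta, b)
    return p * (1.0 - p)

def estimate_theta(administered, prior_sd=2.0, iterations=25):
    """MAP ability estimate and its standard error (assumed N(0, prior_sd^2) prior).

    administered: list of (difficulty, response) pairs, response in {0, 1}.
    """
    prior_precision = 1.0 / prior_sd ** 2
    theta = 0.0
    for _ in range(iterations):
        grad = -theta * prior_precision
        info = prior_precision
        for b, x in administered:
            p = p_endorse(theta, b)
            grad += x - p
            info += p * (1.0 - p)
        # Newton-Raphson step, clamped to avoid overshooting
        theta += max(-1.0, min(1.0, grad / info))
    info = prior_precision + sum(information(theta, b) for b, _ in administered)
    return theta, 1.0 / math.sqrt(info)

def run_cat(bank, answer, se_target=0.6, max_items=None):
    """Administer items adaptively until the SE target or item limit is reached.

    bank: dict mapping item name -> difficulty (hypothetical calibrations).
    answer: callable taking an item name and returning 0 or 1.
    Returns (theta, standard_error, number_of_items_used).
    """
    remaining = dict(bank)
    administered = []
    theta, se = 0.0, float("inf")
    while remaining and se > se_target:
        if max_items is not None and len(administered) >= max_items:
            break
        # Pick the item that is most informative at the current estimate
        name = max(remaining, key=lambda n: information(theta, remaining[n]))
        b = remaining.pop(name)
        administered.append((b, answer(name)))
        theta, se = estimate_theta(administered)
    return theta, se, len(administered)

# Hypothetical 21-item bank with difficulties spread from -2 to +2 logits
bank = {f"item{i}": -2.0 + 0.2 * i for i in range(21)}
theta, se, used = run_cat(bank, lambda name: 1, max_items=8)
print(f"theta={theta:.2f}, SE={se:.2f}, items used={used}")
```

Lowering `se_target` trades more items for more precision, which is the precision-versus-burden conflict the paragraph describes; with polytomous items each response carries more information, so fewer items would typically be needed than in this dichotomous sketch.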
The foundation of CAT lies in Item Response Theory (IRT) methods [19], [20], [21], [22]. Briefly, IRT comprises a set of mathematical models and associated statistical procedures that connect observed survey responses to a person's location on an unmeasured, underlying latent trait like FS. IRT models produce item and latent trait estimates that do not vary with population characteristics with respect to the underlying trait, standard errors conditional on trait level, and trait estimates linked to item content. IRT also facilitates evaluation of whether items measure the trait of interest similarly in different subgroups of respondents, that is, differential item functioning (DIF), and assessment of data fit to the model [23].
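To make the link between a latent trait and observed responses concrete, the following Python sketch computes category probabilities under a rating scale model of the kind this study applies to the LEFS items. It is an illustration only: the function names and the threshold values are hypothetical, and the parameterization shown (one difficulty per item, thresholds shared across items) is the standard Andrich form rather than anything taken from the study's calibration.

```python
import math

def rsm_probs(theta, b, taus):
    """Category probabilities for one item under a rating scale model.

    theta: person location on the latent trait (logits).
    b: item difficulty (logits).
    taus: step thresholds shared by all items; len(taus) + 1 categories.
    """
    # Exponent for category k is the sum of (theta - b - tau_j) for j <= k;
    # category 0 has exponent 0 by convention.
    exponents = [0.0]
    for tau in taus:
        exponents.append(exponents[-1] + (theta - b - tau))
    largest = max(exponents)  # subtract the max for numerical stability
    numerators = [math.exp(e - largest) for e in exponents]
    total = sum(numerators)
    return [n / total for n in numerators]

def expected_score(theta, b, taus):
    """Expected item score (0 .. number of thresholds) at trait level theta."""
    return sum(k * p for k, p in enumerate(rsm_probs(theta, b, taus)))

def item_information(theta, b, taus):
    """Item information, which for this model equals the score variance."""
    probs = rsm_probs(theta, b, taus)
    mean = sum(k * p for k, p in enumerate(probs))
    return sum((k - mean) ** 2 * p for k, p in enumerate(probs))

# Hypothetical thresholds for a five-category (0 to 4) item
taus = [-1.5, -0.5, 0.5, 1.5]
for theta in (-2.0, 0.0, 2.0):
    probs = rsm_probs(theta, b=0.0, taus=taus)
    print(theta, [round(p, 3) for p in probs],
          round(item_information(theta, 0.0, taus), 3))
```

Because information depends on the trait level, the standard error of a trait estimate is conditional on where the person sits on the scale, which is the property the paragraph above refers to.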
This article describes development of CATs using items from the Lower Extremity Functional Scale (LEFS), a common paper-and-pencil outcomes instrument for patients with lower extremity impairments receiving rehabilitation [3]. No articles have described IRT analyses or CAT applications of the LEFS. The overall purpose of this study was to develop CATs of the LEFS. Specific purposes were to (1) test unidimensionality and local independence of the LEFS items, (2) test LEFS item DIF, (3) develop CATs using LEFS items, and (4) compare the discriminant validity of FS measures generated using all LEFS items analyzed with an IRT rating scale model with measures generated from the simulated CATs.
Study design and setting
A secondary analysis of retrospective data collected from patients with lower extremity impairments prior to rehabilitation was conducted. The Focus On Therapeutic Outcomes, Inc. (FOTO) Institutional Review Board approved the project.
Subjects
Data from patients (n = 1,772; age 48 ± 17 years, range 14 to 89 years; 64% female) with lower extremity impairments were analyzed (Table 1). Patients, who represent a sample of convenience, received rehabilitation in 81 outpatient clinics in 20 states (United States) in the consecutive 24
Unidimensionality and local independence
Exploratory factor analysis (EFA) of the 1,772 patients with complete scores on all 20 LEFS items produced a scree plot that supported one dominant factor (first three eigenvalues = 13.1, 1.7, 0.7), with the first three factors explaining 66, 9, and 4% of data variance. In confirmatory factor analysis (CFA), a three-factor model fit better than a one-factor model, but the correlations between the three factors were high (>0.62), suggesting one dominant factor. Fit statistics from the one- to three-factor models were CFI = 0.93, 0.94, 0.94, TLI = 0.98,
Discussion
Results support that (1) body part-specific CATs, that is, hip, knee, and foot/ankle CATs, can be generated from LEFS items; (2) measures of lower extremity FS generated using these CATs can discriminate known groups of patients in clinically logical ways; (3) θCAT measures were similar to θIRT measures in their discriminating abilities; but (4) because θCAT measures were estimated using on average six LEFS items, the CATs were 67% more efficient compared to using 18 unidimensional LEFS items and 70% more
Acknowledgments
The authors thank John M. Linacre, PhD, for his comments regarding statistical analyses, and Karon F. Cook, PhD, for her insightful comments regarding statistical analyses, results, and manuscript edits.
References (72)
A computer adaptive testing simulation applied to the FIM instrument motor component. Arch Phys Med Rehabil (2003)
et al. Score comparability of short forms and computerized adaptive testing: simulation study with the activity measure for post-acute care. Arch Phys Med Rehabil (2004)
Conceptualization and measurement of health-related quality of life: comments on an evolving field. Arch Phys Med Rehabil (2003)
et al. Development of an index of physical functional health status in rehabilitation. Arch Phys Med Rehabil (2002)
et al. Evaluation of the MOS SF-36 physical functioning scale (PF-10): I. Unidimensionality and reproducibility of the Rasch item scale. J Clin Epidemiol (1994)
et al. Assessing unidimensionality of the Oswestry Low Back Pain Disability Questionnaire. Arch Phys Med Rehabil (2002)
et al. Rasch scoring of outcomes of total hip replacement. J Clin Epidemiol (2003)
et al. Evaluation of the MOS SF-36 physical functioning scale (PF-10): II. Comparison of relative precision using Likert and Rasch scoring methods. J Clin Epidemiol (1997)
et al. Differential item functioning in the Danish translation of the SF-36. J Clin Epidemiol (1998)
Use of item response theory to link 3 modules of functional status from the Asset and Health Dynamics Among the Oldest Old Study. Arch Phys Med Rehabil (2002)
Introduction and history
Emergence of item response modeling in instrument development and data analysis. Med Care
The lower extremity functional scale (LEFS): scale development, measurement properties, and clinical application. Phys Ther
Statistical theories of mental test scores
Some test theory for tailored testing
Practical implications of Item Response Theory and computerized adaptive testing. A brief summary of ongoing studies of widely used headache impact scales. Med Care
Applications of computerized adaptive testing (CAT) to the assessment of headache impact. Qual Life Res
Extending the frontier of rehabilitation outcome measurement and research. J Rehabil Outcome Meas
The use of Rasch analysis to produce scale-free measurement of functional activity. Am J Occup Ther
Computerized adaptive testing with polytomous items. Appl Psychol Meas
Development and psychometric evaluation of the Flexilevel Scale of Shoulder Function. Med Care
Convening health outcomes methodologists. Med Care
Applications of Item Response Theory to practical testing problems
Handbook of modern Item Response Theory
Item Response Theory for psychologists
Fundamentals of Item Response Theory
Item response theory and health outcomes measurement in the 21st century. Med Care
Using clinical outcomes to identify expert physical therapists. Phys Ther
Validation of the Lower Extremity Functional Scale on athletic subjects with ankle sprains. Physiother Can
Validation of the LEFS on patients with total joint arthroplasty. Physiother Can
Getting more from the literature: estimating the standard error of measurement from reliability studies. Physiother Can
International classification of functioning, disability and health
Latent structure analysis
Item response theory, item calibration, and proficiency estimation