Abstract
Purpose
Most multidimensional patient-reported outcome (PRO) measures are lengthy to complete. Computerized adaptive testing (CAT), which selects the most informative items, can potentially reduce respondent burden without sacrificing measurement accuracy. However, the commonly used maximum Fisher information item selection method has been reported to lead to highly unbalanced item bank usage and potentially imprecise trait estimation. This study applies a content-balancing strategy to item selection in a bifactor-modeled CAT and examines its impact on measurement accuracy and item bank usage.
Methods
Item responses from a population-based SF-36 survey were first calibrated using the bifactor graded response model. Four post hoc CATs were then created using items and responses from the SF-36 data set. The content-balancing strategy was incorporated into the item selection procedure of the bifactor-modeled CAT. Measurement accuracy and item usage were compared between CATs with and without the content-balancing strategy.
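To make the contrast between the two selection rules concrete, the following is a minimal sketch rather than the procedure used in the study. It assumes a unidimensional graded response model instead of the bifactor model, a fixed trait estimate that is not updated between items, and a simple "most under-represented domain first" balancing rule, since the abstract does not specify which content-balancing rule was applied. The item bank, its parameters, the target proportions, and all function names are hypothetical.

```python
import math


def grm_item_information(theta, a, thresholds):
    """Fisher information of one graded-response item at theta.

    Assumption: unidimensional graded response model with discrimination `a`
    and ordered category thresholds (a simplification of the bifactor model).
    """
    # Cumulative probabilities P*(X >= k), padded with 1 (k = 0) and 0 (k = K).
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    info = 0.0
    for k in range(len(cum) - 1):
        p_k = cum[k] - cum[k + 1]  # probability of category k
        # Derivative of the category probability with respect to theta.
        dp_k = a * (cum[k] * (1 - cum[k]) - cum[k + 1] * (1 - cum[k + 1]))
        if p_k > 0:
            info += dp_k ** 2 / p_k
    return info


def select_next_item(theta, bank, administered, targets, content_balanced=True):
    """Pick the next item by maximum information, optionally within one domain.

    bank: item_id -> (domain, a, thresholds); targets: domain -> target share.
    """
    candidates = [i for i in bank if i not in administered]
    if not candidates:
        raise ValueError("item bank exhausted")
    if content_balanced:
        n = len(administered)
        counts = {d: 0 for d in targets}
        for i in administered:
            counts[bank[i][0]] += 1
        # Only domains that still have unused items can be chosen.
        open_domains = sorted({bank[i][0] for i in candidates})
        # Hypothetical rule: take the domain furthest below its target share.
        domain = max(
            open_domains,
            key=lambda d: targets[d] - (counts[d] / n if n else 0.0),
        )
        candidates = [i for i in candidates if bank[i][0] == domain]
    return max(
        candidates,
        key=lambda i: grm_item_information(theta, bank[i][1], bank[i][2]),
    )


def run_cat(theta, bank, targets, length, content_balanced):
    """Administer `length` items at a fixed theta (no trait re-estimation)."""
    administered = []
    for _ in range(length):
        administered.append(
            select_next_item(theta, bank, administered, targets, content_balanced)
        )
    return administered


# Hypothetical toy bank: physical items discriminate more than mental items.
bank = {
    "PF1": ("physical", 2.5, [-1.0, 0.0]),
    "PF2": ("physical", 2.2, [-0.5, 0.5]),
    "PF3": ("physical", 2.0, [-1.5, 1.0]),
    "MH1": ("mental", 1.0, [-1.0, 0.0, 1.0]),
    "MH2": ("mental", 0.9, [-0.5, 0.5]),
}
targets = {"physical": 0.5, "mental": 0.5}

unbalanced = run_cat(0.0, bank, targets, 4, content_balanced=False)
balanced = run_cat(0.0, bank, targets, 4, content_balanced=True)
print("max information only:", unbalanced)
print("content balanced:   ", balanced)
```

In this toy bank the unconstrained rule uses all three physical items before any mental item, while the balanced rule draws two items from each domain. A real post hoc CAT would also re-estimate the trait after every response.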
Results
The results indicate that the CAT implemented with the content-balancing strategy offers better overall measurement accuracy for both the general health status and the two health domains (physical and mental) of the SF-36.
Conclusions
The content-balancing strategy helps the CAT–PRO balance item selection and achieve improved measurement accuracy. Its implementation in real-time CAT administration to measure multidimensional PRO traits merits further study.
Cite this article
Zheng, Y., Chang, CH. & Chang, HH. Content-balancing strategy in bifactor computerized adaptive patient-reported outcome measurement. Qual Life Res 22, 491–499 (2013). https://doi.org/10.1007/s11136-012-0179-6
DOI: https://doi.org/10.1007/s11136-012-0179-6