Abstract
This paper presents an empirical challenge to the assumption that an item-response theory analysis always yields a better measure of a clinical construct. We summarize results from two measurement development studies that showed that such an analysis lost important content reflecting the conceptual model (“conceptual validity”). The cost of parsimony may thus be too high. Conceptual models that form the foundation of QOL measurement reflect the patient’s experience. This experience may include concepts and items that are psychometrically “redundant” but capture distinct features of the concept. Good measurement is likely a balance between relying on IRT’s quantitative metrics and recognizing the importance of conceptual validity and clinical utility.
Data availability
Not applicable.
Notes
“Scoring rules” refers to the scoring algorithm recommended by the measure developers. It may be a simple sum or an IRT score based on the statistical model used.
All modifications were done with written permission from Dr. David Cella, head of the PROMIS and Neuro-QOL endeavors.
In the published DMD paper, references to the EFA were removed because the reviewers believed that running an EFA and a CFA on the same sample, in training and validation analyses, was somehow improper. Rather than argue, we removed the references. In future work, however, we will retain mention of EFAs, as running training and validation analyses on different subsets of the sample is good practice and is standard in psychometric development. Using CFA to obtain model fit statistics for the EFA model judged to be the best is also good practice.
Based on Alchemer survey engine’s testing software.
Abbreviations
- CFA: Confirmatory factor analysis
- CFI: Comparative fit index
- CPAP: Continuous positive airway pressure
- DMD: Duchenne muscular dystrophy
- EFA: Exploratory factor analysis
- FDA: Food and Drug Administration
- HD: Hemorrhoid disease
- IRT: Item-response theory
- NIH: National Institutes of Health
- PRO: Patient-reported outcome
- PROMIS: Patient-Reported Outcomes Measurement Information System
- RMSEA: Root-mean-square error of approximation
- TLI: Tucker–Lewis index
- QOL: Quality of life
References
Nunnally, J. C. (1994). Psychometric Theory (3rd ed.). Tata McGraw-Hill Education.
Reeve, B. B., Wyrwich, K. W., Wu, A. W., Velikova, G., Terwee, C. B., Snyder, C. F., et al. (2013). ISOQOL recommends minimum standards for patient-reported outcome measures used in patient-centered outcomes and comparative effectiveness research. Quality of Life Research, 22(8), 1889–1905.
Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Lawrence Erlbaum Associates.
Reise, S. P., & Revicki, D. A. (2014). Handbook of item response theory modeling: Applications to typical performance assessment. Routledge.
Blanchin, M., Guilleux, A., Hardouin, J.-B., & Sébille, V. (2020). Comparison of structural equation modelling, item response theory and Rasch measurement theory-based methods for response shift detection at item level: A simulation study. Statistical Methods in Medical Research, 29(4), 1015–1029.
Assessment Center. (2023). HealthMeasures Scoring Service powered by Assessment Center℠. Available from: https://www.assessmentcenter.net/ac_scoringservice
PROMIS Health Organization. (2023). Welcome to the PHO. Available from: https://www.promishealth.org/
Reise, S. P., & Waller, N. G. (2009). Item response theory and clinical measurement. Annual Review of Clinical Psychology, 5, 27–48.
Fayers, P., & Hand, D. (1997). Factor analysis, causal indicators and quality of life. Quality of Life Research, 6, 139–150.
Fayers, P., Groenvold, M., Hand, D. J., & Bjordal, K. (1998). Clinical impact versus factor analysis for quality of life questionnaire construction. Journal of Clinical Epidemiology, 51(3), 285–286.
Fayers, P. M., Hand, D. J., Bjordal, K., & Groenvold, M. (1997). Causal indicators in quality of life research. Quality of Life Research, 6, 393–406.
Fayers, P. M., & Hand, D. J. (2002). Causal variables, indicator variables and measurement scales: An example from quality of life. Journal of the Royal Statistical Society: Series A (Statistics in Society), 165(2), 233–253.
Bollen, K. A., & Bauldry, S. (2011). Three Cs in measurement models: Causal indicators, composite indicators, and covariates. Psychological Methods, 16(3), 265–284.
Juniper, E. F., Guyatt, G. H., Streiner, D. L., & King, D. R. (1997). Clinical impact versus factor analysis for quality of life questionnaire construction. Journal of Clinical Epidemiology, 50(3), 233–238.
Schwartz, C. E., Merriman, M. P., Reed, G., & Byock, I. (2005). Evaluation of the Missoula-VITAS Quality of Life Index - Revised: Research tool or clinical tool? Journal of Palliative Medicine, 8(1), 121–135.
Schwartz, C. E., Stark, R. B., Cella, D., Borowiec, K., Gooch, K. L., & Audhya, I. F. (2021). Measuring Duchenne muscular dystrophy impact: Development of a proxy-reported measure derived from PROMIS item banks. Orphanet Journal of Rare Diseases, 16, 487.
Schwartz, C. E., & Borowiec, K. (2024). Development and validation of the HDSIM™ tool: A measure of hemorrhoid disease symptom impact. Quality of Life Research (in press).
Willis, G. B. (2004). Cognitive interviewing: A tool for improving questionnaire design. Sage Publications.
DeVellis, R. F., & Thorpe, C. T. (2021). Scale development: Theory and applications. Sage publications.
Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. ERIC.
De Vet, H. C., Terwee, C. B., Mokkink, L. B., & Knol, D. L. (2011). Measurement in medicine: A practical guide. Cambridge University Press.
Food and Drug Administration. Guidance for industry: patient-reported outcome measures: use in medical product development to support labeling claims. Silver Spring, MD: US Department of Health and Human Services Food and Drug Administration; 2009.
Hu, L.-T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55.
Cook, K. F., Kallen, M. A., & Amtmann, D. (2009). Having a fit: Impact of number of items and distribution of data on traditional criteria for assessing IRT’s unidimensionality assumption. Quality of Life Research, 18, 447–460.
Szabo, S. M., Salhany, R. M., Deighton, A., Harwood, M., Mah, J., & Gooch, K. L. (2021). The clinical course of Duchenne muscular dystrophy in the corticosteroid treatment era: A systematic literature review. Orphanet Journal of Rare Diseases, 16(1), 1–13.
Pandya, S., James, K. A., Westfield, C., Thomas, S., Fox, D. J., Ciafaloni, E., & Moxley, R. T. (2018). Health profile of a cohort of adults with Duchenne muscular dystrophy. Muscle & Nerve, 58(2), 219–223.
Acknowledgements
We are grateful to the handling editor and reviewers from Quality of Life Research for their thoughtful constructive feedback on an earlier draft of this Commentary.
Funding
This work was not funded by an external organization.
Author information
Contributions
CES, KB, and BDR conceptualized the Commentary. CES wrote the paper, and KB and BDR edited the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflicts of interest
All authors declare that they have no potential conflicts of interest and report no disclosures.
Ethical approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Schwartz, C.E., Borowiec, K. & Rapkin, B.D. When better is the enemy of good: two cautionary tales of conceptual validity versus parsimony in clinical psychometric research. Qual Life Res (2024). https://doi.org/10.1007/s11136-024-03617-z
DOI: https://doi.org/10.1007/s11136-024-03617-z