When better is the enemy of good: two cautionary tales of conceptual validity versus parsimony in clinical psychometric research

  • Commentary
  • Quality of Life Research

Abstract

This paper presents an empirical challenge to the assumption that an item-response theory (IRT) analysis always yields a better measure of a clinical construct. We summarize results from two measurement development studies in which such an analysis lost important content reflecting the conceptual model (“conceptual validity”). The cost of parsimony may thus be too high. Conceptual models that form the foundation of quality-of-life (QOL) measurement reflect the patient’s experience. This experience may include concepts and items that are psychometrically “redundant” but capture distinct features of the concept. Good measurement is likely a balance between relying on IRT’s quantitative metrics and recognizing the importance of conceptual validity and clinical utility.

Data availability

Not applicable.

Notes

  1. “Scoring rules” refers to the scoring algorithm recommended by the measure developers. It may be a simple sum or an IRT score based on the statistical model used.

  2. All modifications were done with written permission from Dr. David Cella, head of the PROMIS and Neuro-QOL endeavors.

  3. In the published DMD paper, references to the EFA were removed because the reviewers believed that running an EFA and a CFA on the same sample, in training and validation analyses, was improper. Rather than argue, we removed them. In future work, however, we will retain mention of EFAs, because running training and validation analyses on different subsets of the sample is good practice and standard in psychometric development. Using CFA to obtain model fit statistics for the EFA model judged to be the best is also good practice.

  4. Based on Alchemer survey engine’s testing software.

Abbreviations

CFA:

Confirmatory factor analysis

CFI:

Comparative fit index

CPAP:

Continuous positive airway pressure

DMD:

Duchenne Muscular Dystrophy

EFA:

Exploratory factor analysis

FDA:

Food and Drug Administration

HD:

Hemorrhoid disease

IRT:

Item-response theory

NIH:

National Institutes of Health

PRO:

Patient-reported outcome

PROMIS:

Patient-reported Outcomes Measurement Information System

RMSEA:

Root-mean-square error of approximation

TLI:

Tucker–Lewis index

QOL:

Quality of life

Acknowledgements

We are grateful to the handling editor and reviewers from Quality of Life Research for their thoughtful, constructive feedback on an earlier draft of this Commentary.

Funding

This work was not funded by an external organization.

Author information

Contributions

CES, KB, and BDR conceptualized the Commentary. CES wrote the paper, and KB and BDR edited the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Carolyn E. Schwartz.

Ethics declarations

Conflicts of interest

All authors declare that they have no potential conflicts of interest and report no disclosures.

Ethical approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

About this article

Cite this article

Schwartz, C.E., Borowiec, K. & Rapkin, B.D. When better is the enemy of good: two cautionary tales of conceptual validity versus parsimony in clinical psychometric research. Qual Life Res (2024). https://doi.org/10.1007/s11136-024-03617-z

  • DOI: https://doi.org/10.1007/s11136-024-03617-z
