Detecting Score Drift in a High-Stakes Performance-Based Assessment

Advances in Health Sciences Education

Abstract

Although studies have examined the effects of a variety of factors on the comparability of scores obtained from standardized patient examinations (SPE), little research has specifically investigated the challenge of detecting drift in case difficulty estimates over time, particularly for large-scale, performance-based assessments. The purpose of the current study was to investigate the use of a procedure to detect drift in the difficulty estimates for a large-scale, high-stakes SPE. The results suggest that, for particular performance tasks, mean scores varied somewhat over time. These findings indicate that, although it is feasible to create a bank of case-standardized patient (case-SP) means and link scores back to these fixed estimates, special attention must be paid to the standardization of exam materials over time. This is essential to ensure comparability of scores and pass-fail decisions for candidates who are assessed on multiple test forms throughout the year.
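
The idea of linking scores back to a bank of fixed case-SP means can be illustrated with a minimal sketch. The abstract does not describe the authors' actual procedure, so the check below is an assumption: it compares each administration period's mean score for one case-SP combination against its banked mean using a simple z-statistic. The function name `flag_drift`, the threshold `z_crit`, and all score values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def flag_drift(banked_mean, period_scores, z_crit=1.96):
    """Compare each period's mean score to a fixed (banked) case-SP mean.

    Assumption: a one-sample z-style check stands in for the drift
    procedure; it is not the method reported in the article.
    Returns {period: (period_mean, z, flagged)}.
    """
    results = {}
    for period, scores in period_scores.items():
        if len(scores) < 2:
            continue  # not enough data to estimate a standard error
        period_mean = mean(scores)
        std_err = stdev(scores) / sqrt(len(scores))
        # z is undefined when every score in the period is identical
        z = (period_mean - banked_mean) / std_err if std_err > 0 else None
        flagged = z is not None and abs(z) > z_crit
        results[period] = (period_mean, z, flagged)
    return results


# Hypothetical scores for one case-SP combination in two periods
scores_by_period = {
    "Q1": [68, 72, 71, 69, 70],
    "Q2": [60, 62, 61, 59, 63],
}
results = flag_drift(banked_mean=70.0, period_scores=scores_by_period)
for period, (m, z, flagged) in results.items():
    print(period, round(m, 2), flagged)
```

In this sketch a flagged period would only prompt a review of the case materials and SP portrayal for that period; it does not by itself establish that the case difficulty has changed.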



Author information


Corresponding author

Correspondence to Danette W. McKinley.


About this article

Cite this article

McKinley, D.W., Boulet, J.R. Detecting Score Drift in a High-Stakes Performance-Based Assessment. Adv Health Sci Educ Theory Pract 9, 29–38 (2004). https://doi.org/10.1023/B:AHSE.0000012214.40340.03


  • DOI: https://doi.org/10.1023/B:AHSE.0000012214.40340.03
