Abstract
We developed a criterion-referenced student rating of instruction (SRI) to facilitate formative assessment of teaching. It involves four dimensions of teaching quality that are grounded in current instructional design principles: Organization and structure, Assessment and feedback, Personal interactions, and Academic rigor. Using item response theory and Wright mapping methods, we describe teaching characteristics at various points along the latent continuum for each scale. These maps enable criterion-referenced score interpretation by making an explicit connection between test performance and the theoretical framework. We explain how our Wright maps can enhance an instructor’s ability to interpret scores and identify ways to refine teaching. Although our work is aimed at improving score interpretation, a criterion-referenced test is not immune to factors that may bias test scores; the SRI literature is filled with research on factors unrelated to teaching that may bias scores. Therefore, we also used multilevel models to evaluate the extent to which student and course characteristics may affect scores and compromise score interpretation. Results indicated that student anger and the interaction between student gender and instructor gender are significant effects that account for a small amount of variance in SRI scores. All things considered, our criterion-referenced approach to SRIs is a viable way to describe teaching quality, help instructors refine pedagogy, and facilitate course development.
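The multilevel analysis described above, with students nested in courses and fixed effects for student-level predictors, can be sketched as follows. This is a minimal illustration on simulated data, not the authors' actual model or dataset: the variable names (`anger`, `stu_female`, `inst_female`) and all effect sizes are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate SRI scores for students nested within courses.
rng = np.random.default_rng(0)
n_courses, n_students = 40, 25
rows = []
for c in range(n_courses):
    course_effect = rng.normal(0, 0.5)   # random course intercept
    inst_female = c % 2                  # instructor gender (course-level)
    for _ in range(n_students):
        anger = rng.normal(0, 1)         # student-level predictor
        stu_female = rng.integers(0, 2)  # student gender
        score = (3.5 + course_effect
                 - 0.15 * anger
                 + 0.10 * stu_female * inst_female
                 + rng.normal(0, 0.8))
        rows.append(dict(course=c, score=score, anger=anger,
                         stu_female=stu_female, inst_female=inst_female))
df = pd.DataFrame(rows)

# Multilevel model: random intercept for course, fixed effects for
# student anger and the student-by-instructor gender interaction.
model = smf.mixedlm("score ~ anger + stu_female * inst_female",
                    data=df, groups="course")
fit = model.fit()
print(fit.summary())
```

A small but statistically reliable coefficient in such a model (e.g., on `anger` or the gender interaction) is consistent with the paper's finding that these effects are significant yet explain only a small share of score variance.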
Notes
The complete measure is available upon request. For brevity, we did not include it in this paper.
An unpublished manuscript about the original study is available upon request.
A Wright map is also referred to as an item map.
Acknowledgments
We thank Emily Bowling, Fares Karam, Bo Odom, and Laura Tortorelli for their work on the original version of this measure. They developed the original teaching framework and wrote the initial pool of items as part of a course project.
Meyer, J.P., Doromal, J.B., Wei, X. et al. A Criterion-Referenced Approach to Student Ratings of Instruction. Res High Educ 58, 545–567 (2017). https://doi.org/10.1007/s11162-016-9437-8