Abstract
The recent increase in participation in international large-scale assessments (ILSAs) has coincided with growing economic diversity among participating education systems, with many low-income countries performing significantly lower than their wealthier peers. This diversity has brought with it a number of challenges for testing organizations. Specifically, can one assessment provide valid and reliable results for high-, medium-, and low-performing systems? In this chapter, we take up this question, suggesting that the methods currently used to measure students’ background and achievement in ILSAs limit what and who can be measured. We first provide an overview of our research group’s work on the methodological limits of current ILSA designs for measuring students’ backgrounds, including how a one-size-fits-all approach to background questionnaires may not yield comparable indicators. We conclude by discussing what we see as the major methodological challenges facing ILSAs given new assessment designs.
Copyright information
© 2022 Springer Nature Switzerland AG
About this entry
Cite this entry
Rutkowski, D., Rutkowski, L. (2022). Designing Measurement for All Students in ILSAs. In: Nilsen, T., Stancel-Piątak, A., Gustafsson, J.E. (eds) International Handbook of Comparative Large-Scale Studies in Education. Springer International Handbooks of Education. Springer, Cham. https://doi.org/10.1007/978-3-030-88178-8_27
Print ISBN: 978-3-030-88177-1
Online ISBN: 978-3-030-88178-8
eBook Packages: Education; Reference Module Humanities and Social Sciences; Reference Module Education