Multistage Test Design Considerations in International Large-Scale Assessments of Educational Achievement

  • Reference work entry
International Handbook of Comparative Large-Scale Studies in Education

Abstract

Numerous choices exist for designing and implementing a multistage test (MST) for dozens of heterogeneous educational systems internationally. In this chapter, we review recent research that focuses on MST in an international large-scale assessment (ILSA) context. To do so, we first describe the inherent heterogeneity and associated measurement challenges of ILSAs, describing how MST offers a means for tailoring assessments to better measure the full achievement distribution while minimizing test burden. We then emphasize design choices and how these impact item and person parameter estimates as well as item exposure rates. We also discuss the tension between fully realizing the promise of an MST design with the primacy of stable trend estimates. Specifically, we discuss the design choices with respect to the structure of MST and panels, related routing decisions within MST, routing methods, module lengths, and position effects.

Acknowledgments

This project was partially funded by a grant from the Norwegian Research Council, FINNUT program, # 255246.

Author information

Corresponding author

Correspondence to Leslie Rutkowski.

Appendix A: Some Technical Details for SLRR19 and RLSR20

RLSR20

Data generation and estimation were conducted in R (R Core Team, 2020). We used mstR 1.2 (Magis et al., 2018) for MST simulation and TAM 3.1-45 (Robitzsch, Kiefer, & Wu, 2019) for item calibration and population modeling. (The mstR function was custom modified by Magis to allow for a probabilistic routing element.) One hundred replications were performed within each condition. We elaborate on our simulation and analysis below.

We calibrated items using a 2PL model with the following technical specifications: we used marginal maximum likelihood obtained via the EM algorithm (Bock & Aitkin, 1981); we did not specify prior item probabilities or starting values; we used quasi-Monte Carlo integration with 1000 iterations; and our convergence criterion was set at 0.0001. Item parameters were estimated via a multigroup model that assumed a different normal distribution for each of the nine populations; for model identification, we fixed the latent variable mean and variance of the first population at 0 and 1, respectively. Population achievement distributions were estimated using latent regression; to identify the model, we again assumed a mean and variance of 0 and 1, respectively, for one population. Because of this identification restriction, the item and person parameters were on a scale determined by the selection of this population. To put the item parameters back onto the generating scale, we used a mean/sigma linking approach.
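As an illustration of the mean/sigma linking step, the following is a minimal Python sketch rather than the authors' R code. It assumes the 2PL parameterization a(θ − b) and computes the linking constants from the means and standard deviations of the estimated and generating item difficulties. The function names and the toy difficulty values are hypothetical and are not taken from the study.

```python
from statistics import mean, pstdev


def mean_sigma_constants(b_estimated, b_generating):
    """Slope A and intercept B that map the estimated scale onto the generating scale.

    Assumes theta_generating = A * theta_estimated + B, so difficulties
    transform the same way: b_generating = A * b_estimated + B.
    """
    A = pstdev(b_generating) / pstdev(b_estimated)
    B = mean(b_generating) - A * mean(b_estimated)
    return A, B


def link_items(a_estimated, b_estimated, A, B):
    """Transform 2PL item parameters; a(theta - b) is unchanged by the mapping."""
    a_linked = [a / A for a in a_estimated]
    b_linked = [A * b + B for b in b_estimated]
    return a_linked, b_linked


# Hypothetical toy values: the estimated scale is a linear
# transformation of the generating scale.
b_generating = [-1.0, 0.0, 1.0, 2.0]
a_generating = [0.8, 1.0, 1.2, 1.5]
b_estimated = [(b - 0.5) / 2.0 for b in b_generating]
a_estimated = [a * 2.0 for a in a_generating]

A, B = mean_sigma_constants(b_estimated, b_generating)
a_linked, b_linked = link_items(a_estimated, b_estimated, A, B)
```

In this toy case the estimated parameters are an exact linear transformation of the generating ones, so linking recovers the generating values up to floating-point error. With estimated parameters from a real calibration, sampling error remains after linking.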

SLRR19

We used a Monte Carlo simulation study to address our research questions, relying on the R (R Core Team, 2020) package mstR 1.2 (Magis et al., 2018) for MST simulation and analyses, and the mirt package (Chalmers, 2012) for item parameter calibration. (The mstR function was custom modified by Magis to allow for a probabilistic routing element.) We calibrated items using a 2PL model with the following technical specifications: we used marginal maximum likelihood obtained via the EM algorithm (Bock & Aitkin, 1981) with 500 cycles; we assumed a Gaussian distribution for θ; we did not specify prior item probabilities or starting values; we used 61 quadrature points; and our model convergence criterion was set at 0.0001. We used expected a posteriori (EAP) estimation for the person parameters (Bock & Mislevy, 1982). For item calibration, we assumed a latent variable mean and variance of 0 and 1, respectively, for model identification.
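The person-parameter step cited to Bock and Mislevy (1982) is an expected a posteriori (EAP) estimate: the posterior mean of θ given a response pattern, approximated on a quadrature grid. The Python sketch below illustrates the idea under a 2PL model with a standard normal prior and 61 equally spaced nodes. The grid range of −6 to 6, the function names, and the example item parameters are assumptions made for illustration. They do not reproduce the mirt implementation or the study's items.

```python
import math


def p_2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def eap_estimate(responses, a_params, b_params, n_points=61, lo=-6.0, hi=6.0):
    """EAP estimate and posterior SD of theta on an equally spaced grid.

    The prior is standard normal; grid range [lo, hi] is an assumption.
    """
    step = (hi - lo) / (n_points - 1)
    nodes = [lo + k * step for k in range(n_points)]
    # Unnormalized N(0, 1) density; the constant cancels in the ratio below.
    prior = [math.exp(-0.5 * t * t) for t in nodes]

    posterior = []
    for t, w in zip(nodes, prior):
        likelihood = 1.0
        for x, a, b in zip(responses, a_params, b_params):
            p = p_2pl(t, a, b)
            likelihood *= p if x == 1 else 1.0 - p
        posterior.append(likelihood * w)

    total = sum(posterior)
    eap = sum(t * p for t, p in zip(nodes, posterior)) / total
    var = sum((t - eap) ** 2 * p for t, p in zip(nodes, posterior)) / total
    return eap, math.sqrt(var)


# Hypothetical three-item module with difficulties symmetric about zero.
a_params = [1.0, 1.0, 1.0]
b_params = [-1.0, 0.0, 1.0]
eap_all_correct, sd_all_correct = eap_estimate([1, 1, 1], a_params, b_params)
eap_all_wrong, sd_all_wrong = eap_estimate([0, 0, 0], a_params, b_params)
```

Because the prior pulls estimates toward zero, EAP stays finite even for all-correct or all-incorrect response patterns, where a maximum likelihood estimate does not exist.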

Copyright information

© 2022 Springer Nature Switzerland AG

About this entry

Cite this entry

Rutkowski, L., Rutkowski, D., Svetina Valdivia, D. (2022). Multistage Test Design Considerations in International Large-Scale Assessments of Educational Achievement. In: Nilsen, T., Stancel-Piątak, A., Gustafsson, JE. (eds) International Handbook of Comparative Large-Scale Studies in Education. Springer International Handbooks of Education. Springer, Cham. https://doi.org/10.1007/978-3-030-88178-8_63
