A Hierarchical Multi-Unidimensional IRT Approach for Analyzing Sparse, Multi-Group Data for Integrative Data Analysis

Huo, Yan; de la Torre, Jimmy; Mun, Eun-Young; Kim, Su-Young; Ray, Anne E.; Jiao, Yang; White, Helene R.

doi:10.1007/s11336-014-9420-2

Published: 30 September 2014

Psychometrika, Volume 80, pages 834–855 (2015)

Yan Huo¹,
Jimmy de la Torre¹,
Eun-Young Mun¹,
Su-Young Kim²,
Anne E. Ray³,
Yang Jiao³ &
Helene R. White³

Abstract

The present paper proposes a hierarchical, multi-unidimensional two-parameter logistic item response theory (2PL-MUIRT) model extended for a large number of groups. The proposed model was motivated by a large-scale integrative data analysis (IDA) study which combined data (N = 24,336) from 24 independent alcohol intervention studies. IDA projects face unique challenges that are different from those encountered in individual studies, such as the need to establish a common scoring metric across studies and to handle missingness in the pooled data. To address these challenges, we developed a Markov chain Monte Carlo (MCMC) algorithm for a hierarchical 2PL-MUIRT model for multiple groups in which not only were the item parameters and latent traits estimated, but the means and covariance structures for multiple dimensions were also estimated across different groups. Compared to a few existing MCMC algorithms for multidimensional IRT models that constrain the item parameters to facilitate estimation of the covariance matrix, we adapted an MCMC algorithm so that we could directly estimate the correlation matrix for the anchor group without any constraints on the item parameters. The feasibility of the MCMC algorithm and the validity of the basic calibration procedure were examined using a simulation study. Results showed that model parameters could be adequately recovered, and estimated latent trait scores closely approximated true latent trait scores. The algorithm was then applied to analyze real data (69 items across 20 studies for 22,608 participants). The posterior predictive model check showed that the model fit all items well, and the correlations between the MCMC scores and original scores were overall quite high. An additional simulation study demonstrated robustness of the MCMC procedures in the context of the high proportion of missingness in data. 
The Bayesian hierarchical IRT model using the MCMC algorithms developed in the current study has the potential to be widely implemented for IDA studies or multi-site studies, and can be further refined to meet more complicated needs in applied research.
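
The data structure described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes the common slope-threshold form of the 2PL, P(X = 1) = 1 / (1 + exp(-a_j (theta_d(j) - b_j))), in which each item loads on exactly one dimension, and every name and value in it (prob_2pl_mu, dim_of_item, the group means and covariances, the item-by-group design) is hypothetical. The paper's own parameterization may differ.

```python
import numpy as np

def prob_2pl_mu(theta, a, b, dim_of_item):
    """Endorsement probabilities under a multi-unidimensional 2PL (assumed form).

    theta: (N, D) latent traits; a, b: (J,) discriminations and thresholds;
    dim_of_item: (J,) index of the single dimension each item measures.
    """
    theta_item = theta[:, dim_of_item]            # (N, J): trait relevant to each item
    return 1.0 / (1.0 + np.exp(-a * (theta_item - b)))

rng = np.random.default_rng(0)
n_groups, n_per_group, n_items = 3, 200, 6
dim_of_item = np.array([0, 0, 0, 1, 1, 1])        # items 0-2 -> dim 0, items 3-5 -> dim 1
a = rng.uniform(0.8, 2.0, n_items)
b = rng.normal(0.0, 1.0, n_items)

# Hypothetical group-level structure: group 0 plays the role of the anchor group
# (zero means, correlation matrix); the other groups have free means and covariances.
means = np.array([[0.0, 0.0], [0.3, -0.2], [-0.4, 0.5]])
covs = np.array([[[1.0, 0.5], [0.5, 1.0]],
                 [[1.2, 0.4], [0.4, 0.9]],
                 [[0.8, 0.3], [0.3, 1.1]]])

# Each group (study) administers only a subset of the item pool.
administered = np.array([[1, 1, 1, 1, 1, 1],
                         [1, 1, 0, 1, 0, 1],
                         [0, 1, 1, 1, 1, 0]], dtype=bool)

blocks = []
for g in range(n_groups):
    theta = rng.multivariate_normal(means[g], covs[g], size=n_per_group)
    p = prob_2pl_mu(theta, a, b, dim_of_item)
    x = (rng.random(p.shape) < p).astype(float)   # Bernoulli draws
    x[:, ~administered[g]] = np.nan               # items not given are missing by design
    blocks.append(x)

data = np.vstack(blocks)                          # pooled, sparse response matrix
missing_rate = np.isnan(data).mean()
```

Pooling the groups yields a response matrix in which whole item columns are missing for some studies, which is the kind of sparseness the proposed model is meant to handle.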

Notes

Multiple shorter chains were used instead of a single longer chain. Once the chains converged, the magnitude of the autocorrelation did not affect the estimates, so it was not necessary to compute the autocorrelation.
Each of the 20 studies administered a subset of the 66 items (16 items in the study with the fewest items and 52 in the study with the most).
Several computer programs are available, such as mlirt and WinBUGS. However, these programs are not specifically designed for the problems we faced. For example, mlirt is more appropriate for analyzing the within and between variability in multilevel IRT models. WinBUGS can be used for a variety of Bayesian IRT models, but it did not meet our needs. The MCMC algorithms we programmed gave us full control over every aspect of estimation (e.g., determining the candidate variances for greater convergence efficiency), which allowed us to tailor our program to the specific problems in our work.
Latent trait scores can be estimated simultaneously with the other structural model parameters. However, for computational efficiency we split the MCMC procedure into two stages: calibration and scoring. Three studies had much larger samples than the others (together accounting for more than half of the total sample), so computing time was much longer when all observations were used in a single combined stage. Using only 10% of the sample from these three studies, together with all participants from the remaining studies, in the first stage was computationally more efficient, especially because we fine-tuned the algorithms several times along the way. We therefore needed a second stage to score all respondents using the same MCMC procedure. Once the calibration using the baseline subsample was completed, we used the structural parameter estimates from the calibration stage to derive latent trait scores for all participants, not only at baseline but also at all subsequent follow-ups.
The amount of bias can be affected by group sizes and the magnitudes of the parameters. In our study, groups with relatively small sample sizes were more susceptible to this problem given that considerable missingness existed in our data. The five largest biases were observed in three small studies.
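
The first note refers to running multiple shorter chains and judging their convergence. One widely used diagnostic for that setup is the Gelman-Rubin potential scale reduction factor, which compares between-chain and within-chain variance. The sketch below is a generic illustration for a single scalar parameter; this excerpt does not state which convergence criterion the authors applied, and the function name and simulated chains are hypothetical.

```python
import numpy as np

def potential_scale_reduction(chains):
    """Gelman-Rubin R-hat for one scalar parameter.

    chains: array of shape (m, n), m parallel chains of n post-burn-in draws each.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    within = chains.var(axis=1, ddof=1).mean()    # W: average within-chain variance
    between = n * chain_means.var(ddof=1)         # B: between-chain variance
    var_hat = (n - 1) / n * within + between / n  # pooled estimate of posterior variance
    return np.sqrt(var_hat / within)

rng = np.random.default_rng(1)
mixed = rng.normal(0.0, 1.0, size=(4, 2000))      # four chains sampling the same target
stuck = mixed + np.array([[0.0], [0.0], [3.0], [3.0]])  # two chains centered elsewhere

rhat_mixed = potential_scale_reduction(mixed)
rhat_stuck = potential_scale_reduction(stuck)
```

Values close to 1 indicate that the chains agree with one another, while values well above 1 (as for the shifted chains here) indicate that they have not yet converged to a common distribution.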

Acknowledgments

We would like to thank the following investigators who generously contributed their data to Project INTEGRATE: John S. Baer, Department of Psychology, The University of Washington, and Veterans’ Affairs Puget Sound Health Care System; Nancy P. Barnett, Center for Alcohol and Addiction Studies, Brown University; M. Dolores Cimini, University Counseling Center, The University at Albany, State University of New York; William R. Corbin, Department of Psychology, Arizona State University; Kim Fromme, Department of Psychology, The University of Texas, Austin; Joseph W. LaBrie, Department of Psychology, Loyola Marymount University; Mary E. Larimer, Department of Psychiatry and Behavioral Sciences, The University of Washington; Matthew P. Martens, Department of Educational, School, and Counseling Psychology, The University of Missouri; James G. Murphy, Department of Psychology, The University of Memphis; Scott T. Walters, Department of Behavioral and Community Health, The University of North Texas Health Science Center; Helene R. White, Center of Alcohol Studies, Rutgers, The State University of New Jersey; and Mark D. Wood, Department of Psychology, The University of Rhode Island. The project described was supported by Award Number R01 AA019511 from the National Institute on Alcohol Abuse and Alcoholism (NIAAA). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIAAA or the National Institutes of Health.

Author information

Authors and Affiliations

Graduate School of Education, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
Yan Huo, Jimmy de la Torre & Eun-Young Mun
Department of Psychology, Ewha Womans University, Seoul, Republic of Korea
Su-Young Kim
Center of Alcohol Studies, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
Anne E. Ray, Yang Jiao & Helene R. White

Corresponding author

Correspondence to Yan Huo.

About this article

Cite this article

Huo, Y., de la Torre, J., Mun, EY. et al. A Hierarchical Multi-Unidimensional IRT Approach for Analyzing Sparse, Multi-Group Data for Integrative Data Analysis. Psychometrika 80, 834–855 (2015). https://doi.org/10.1007/s11336-014-9420-2

Received: 29 January 2013
Published: 30 September 2014
Issue Date: September 2015
DOI: https://doi.org/10.1007/s11336-014-9420-2
