Multilevel Models for the Analysis of Comparative Survey Data: Common Problems and Some Solutions

Schmidt-Catran, Alexander W.; Fairbrother, Malcolm; Andreß, Hans-Jürgen

doi:10.1007/s11577-019-00607-9

Mehrebenenmodelle zur Analyse von vergleichenden Umfragedaten: Häufige Probleme und ausgewählte Lösungsansätze

Abhandlungen
Published: 06 May 2019

Volume 71, pages 99–128, (2019)
KZfSS Kölner Zeitschrift für Soziologie und Sozialpsychologie

Alexander W. Schmidt-Catran¹,
Malcolm Fairbrother² &
Hans-Jürgen Andreß³

Abstract

This paper provides an overview of the application of mixed models (multilevel models) to comparative survey data in which the context units of interest are countries. Such analyses have gained much popularity over the last two decades, but they also come with a variety of challenges, some of which are discussed here. The focus lies on the small-N problem, influential cases (outliers), and omitted variables at the country level. Summarizing the methodological literature, the paper provides recommendations for applied researchers where possible and otherwise points to the more detailed literature. Some solutions to the small-N problem and to omitted variable bias are discussed in detail; the paper recommends pooling multiple survey waves to increase statistical power and to allow for the estimation of within-country effects, thereby controlling for unobserved heterogeneity. All issues are illustrated using an empirical example with data from the European Social Survey. The online appendix provides detailed syntax for adapting the presented procedures to researchers’ own data.

Summary

This paper provides an overview of the application of multilevel models to internationally comparative survey data. Multilevel analyses in which the relevant context units are countries have become widespread over the last two decades, but from a statistical perspective they are problematic in several respects. This article addresses some of the problems that arise when multilevel analyses are applied to international survey data. The focus lies on the small-N problem, influential cases (“outliers”), and the problem of unobserved heterogeneity at the country level. The paper summarizes the methodological literature on multilevel models and tries to give researchers recommendations that are as concrete as possible or, where this is not possible, to point to the more detailed literature. Approaches to solving the small-N problem and the problem of unobserved heterogeneity are discussed in detail. This discussion leads to the recommendation to pool the available waves of internationally comparative survey data. For illustration, the article uses an empirical example based on data from the European Social Survey. The online appendix contains detailed syntax for these examples, which can easily be adapted to other data and research questions.

Notes

The online appendix is available at www.schmidt-catran.de/mixedmodels.html.
Note that the country-level random effects now have an additional subscript (0,1, … ,k), indicating to which fixed effect the random effect belongs.
A symmetric variance-covariance matrix of size m contains \(m(m+1)/2\) unique entries, m of which are variances and the rest covariances.
A detailed description of all involved variables, their descriptive statistics and correlations, can be found in the online appendix to this paper.
But see Heisig et al. (2017), who argue for the inclusion of random slopes even if the research interest is not in cross-level interactions, i.e. in explaining differences in individual-level effects by country-level characteristics. Barr et al. (2013) and Bell et al. (2019) also demonstrate and discuss the importance of random slopes.
The issue of nonrandom missing values, i.e. sample selection effects at the individual level, is left aside here.
This is equivalent to the introduction of country dummies, i.e. country fixed effects.
The data set has been obtained from the cumulative data wizard, which excludes Albania, Kosovo and Latvia.
Ignoring for now the fact that two EU members are not in the sample: Malta and Latvia.
See Bowers and Drake (2005) for more information on how to use exploratory data analysis and visualization when the number of level 2 units is small.
With anticonservative tests, the risk of falsely rejecting the null hypothesis of no effect increases. In other words, results look too significant.
To be precise, RML also allows nested models to be compared, but only if they differ in their random part and not in their fixed part.
Brännström (2008); Sutton (2012); Stadelmann-Steffen (2012); Stegmueller et al. (2012); Giger (2012); Mewes (2014).
In the example data, the country-level variables are not too strongly related. The average (absolute) correlation across the four variables amounts to 0.31 (min = 0.19, max = 0.47), so collinearity is not a pressing issue. However, it is much stronger than the average (absolute) correlation across the individual-level variables which is 0.09 (min = 0.01, max = 0.25).
An example of such a paper is Semyonov et al. (2006, p. 437): “Because of restrictions associated with the limited degrees of freedom at the country level, only three hierarchical linear model equations are estimated […], with each equation including only one country-level variable.”
Technically, there is perfect collinearity between country-level variables and country-dummies.
Note that this is an oversimplification. Technically, a variable is not measured at one specific level; its level depends on how its variance components are distributed across the levels.
Except for the fact that we now include six additional countries which have been in the ESS at some point but not in the 2014 wave used in Table 1.
The idea to identify an effect solely by within-unit variation and thereby to control for any time-constant unobserved variables originates from the analyses of panel data. Readers who want to get a detailed understanding of this may want to read this literature: Allison (2009); Andress et al. (2013, Chap. 4); Bell and Jones (2015).
For two-level models, Stata users can use the mlt ado-package to calculate Cook’s D and DFBETAs (Möhring and Schmidt 2013). In the online appendix we provide syntax for three-level models that is very general and can easily be adapted to researchers’ own applications.
DFBETAs can of course also be calculated for individual-level variables (x), but in the context of multilevel modeling their application to country-level variables (z) is typically of interest.
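
As a worked instance of the count given in the note on symmetric variance-covariance matrices, assume (purely for illustration) a model with a random intercept and two random slopes, so that m = 3:

\[
\frac{m(m+1)}{2} = \frac{3 \cdot 4}{2} = 6, \qquad \text{of which } m = 3 \text{ are variances and } \frac{m(m-1)}{2} = 3 \text{ are covariances.}
\]

The number of covariances grows quadratically: with m = 5 there are already 15 unique entries, 10 of them covariances.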
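
The leave-one-country-out logic behind the DFBETAs mentioned in the last two notes can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the authors' syntax from the online appendix: it uses plain OLS on simulated country-level data instead of a mixed model, the scaling (change in the coefficient divided by the standard error from the fit without the country) is one common variant, and all names (`ols`, `leave_one_out_dfbetas`, the country codes `C00` ... `C19`) are hypothetical.

```python
import numpy as np


def ols(X, y):
    """Return OLS coefficients and their standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(sigma2 * np.diag(XtX_inv))
    return beta, se


def leave_one_out_dfbetas(X, y, groups, coef_index):
    """DFBETAS-style influence of each group on one coefficient.

    For every group g the model is refitted without g; the change in the
    coefficient is divided by its standard error from the reduced fit.
    """
    beta_full, _ = ols(X, y)
    influence = {}
    for g in np.unique(groups):
        keep = groups != g
        beta_g, se_g = ols(X[keep], y[keep])
        influence[g] = (beta_full[coef_index] - beta_g[coef_index]) / se_g[coef_index]
    return influence


# Simulated data (assumption): 20 hypothetical countries, one row each.
rng = np.random.default_rng(0)
n = 20
countries = np.array([f"C{j:02d}" for j in range(n)])
z = rng.normal(size=n)                       # country-level predictor
y = 0.5 * z + rng.normal(scale=0.5, size=n)  # country-level outcome
z[0], y[0] = 4.0, -6.0                       # turn country C00 into an outlier

X = np.column_stack([np.ones(n), z])         # intercept plus predictor
dfb = leave_one_out_dfbetas(X, y, countries, coef_index=1)
most_influential = max(dfb, key=lambda c: abs(dfb[c]))
print(most_influential, round(float(dfb[most_influential]), 2))
```

In a real application the two calls to `ols` would be replaced by fits of the multilevel model with and without the country in question. A frequently used rule of thumb flags cases whose absolute DFBETAS exceeds 2/√n, with n here being the number of countries.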

References

Allison, Paul D. 2009. Fixed effects regression models. Thousand Oaks: Sage.
Andress, Hans-Jürgen, Katrin Golsch and Alexander W. Schmidt. 2013. Applied panel data analysis for economic and social surveys. Berlin, Heidelberg: Springer.
Arceneaux, Kevin, and Gregory A. Huber. 2007. What to do (and not do) with multicollinearity in state politics research. State Politics & Policy Quarterly 7:81–101.
Babones, Salvatore J. 2013. Methods for quantitative macro-comparative research. London: Sage.
Barr, Dale J., Roger Levy, Christoph Scheepers and Harry J. Tily. 2013. Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language 68:255–278.
Beck, Nathaniel. 2001. Time-series–cross-section data: What have we learned in the past few years? Annual Review of Political Science 4:271–293.
Bell, Andrew J., and Kelvyn Jones. 2015. Explaining fixed effects: Random effects modeling of time-series cross-sectional and panel data. Political Science Research and Methods 3:133–153.
Bell, Andrew J., Kelvyn Jones and Malcolm Fairbrother. 2018. Understanding and misunderstanding group mean centering: A commentary on Kelley et al.’s dangerous practice. Quality & Quantity 52:2031–2046.
Bell, Andrew, Malcolm Fairbrother and Kelvyn Jones. 2019. Fixed and random effects models: Making an informed choice. Quality & Quantity 53:1051–1074.
Bell, Bethany A., Grant B. Morgan, Jason A. Schoeneberger, Jeffrey D. Kromrey and John M. Ferron. 2014. How low can you go? Methodology 10:1–11.
Belsley, David A., Edwin Kuh and Roy E. Welsch. 1980. Regression diagnostics: Identifying influential data and sources of collinearity. New York: John Wiley.
Bowers, Jake, and Katherine W. Drake. 2005. EDA for HLM: Visualization when probabilistic inference fails. Political Analysis 13:301–326.
Brännström, Lars. 2008. Making their mark: The effects of neighbourhood and upper secondary school on educational achievement. European Sociological Review 24:463–478.
Browne, William J., and David Draper. 2000. Implementation and performance issues in the Bayesian and likelihood fitting of multilevel models. Computational Statistics 15:391–420.
Bryan, Mark L., and Stephen P. Jenkins. 2016. Multilevel modelling of country effects: A cautionary tale. European Sociological Review 32:3–22.
Cook, R. Dennis. 1977. Detection of influential observation in linear regression. Technometrics 19:15–18.
Draper, David. 2008. Bayesian multilevel analysis and MCMC. In Handbook of multilevel analysis, eds. Jan De Leeuw and Erik Meijer, 77–139. New York: Springer.
Ebbinghaus, Bernhard. 2005. When less is more: Selection problems in large-N and small-N cross-national comparisons. International Sociology 20:133–152.
Elff, Martin, Jan P. Heisig, Merlin Schaeffer and Susumu Shikano. 2016. No need to turn Bayesian in multilevel analysis with few clusters: How frequentist methods provide unbiased estimates and accurate inference. Open Science Framework (osf.io/fkn3u).
Elwert, Felix. 2013. Graphical causal models. In Handbook of causal analysis for social research, ed. Stephen L. Morgan, 245–273. Dordrecht: Springer Netherlands.
Enders, Craig K., and Davood Tofighi. 2007. Centering predictor variables in cross-sectional multilevel models: A new look at an old issue. Psychological Methods 12:121–138.
European Social Survey Cumulative File, ESS 1–7 (2016). Data file edition 1.0. NSD – Norwegian Centre for Research Data, Norway – Data Archive and distributor of ESS data for ESS ERIC.
Fairbrother, Malcolm. 2013. Rich people, poor people, and environmental concern: Evidence across nations and time. European Sociological Review 29:910–922.
Fairbrother, Malcolm. 2014. Two multilevel modeling techniques for analyzing comparative longitudinal survey datasets. Political Science Research and Methods 2:119–140.
Fairbrother, Malcolm. 2016. Trust and public support for environmental protection in diverse national contexts. Sociological Science 3:359–382.
Finseraas, H. 2012. Poverty, ethnic minorities among the poor, and preferences for redistribution in European regions. Journal of European Social Policy 22:164–180.
Fontaine, Johnny R.J. 2015. Traditional and multilevel approaches in cross-cultural research: An integration of methodological frameworks. In Multilevel analysis of individuals and cultures, eds. Fons J.R. Van de Vijver, Dianne A. Van Hemert and Ype H. Poortinga, 65–93. New York: Psychology Press.
Gelman, Andrew. 2005. Two-stage regression and multilevel modeling: A commentary. Political Analysis 13:459–461.
Gelman, Andrew. 2006. Prior distributions for variance parameters in hierarchical models. Bayesian Analysis 1:72–91.
Gelman, Andrew, and Jennifer Hill. 2007. Data analysis using regression and multilevel/hierarchical models. New York: Cambridge University Press.
Giesselmann, Marco, and Alexander W. Schmidt-Catran. 2018. Getting the within estimator of cross-level interactions in multilevel models with pooled cross-sections: Why country dummies (sometimes) do not do the job. Sociological Methodology. https://doi.org/10.1177/0081175018809150
Giger, Nathalie. 2012. Is social policy retrenchment unpopular? How welfare reforms affect government popularity. European Sociological Review 28:691–700.
Goerres, Achim, Markus B. Siewert and Claudius Wagemann. 2019. Internationally comparative research designs in the social sciences: Fundamental issues, case selection logics, and research limitations. In Cross-national comparative research – analytical strategies, results and explanations. Sonderheft Kölner Zeitschrift für Soziologie und Sozialpsychologie. Eds. Hans-Jürgen Andreß, Detlef Fetchenhauer and Heiner Meulemann. Wiesbaden: Springer VS. https://doi.org/10.1007/s11577-019-00600-2
Goldthorpe, John H. 1997. Current issues in comparative macrosociology: A debate on methodological issues. Comparative Social Research 16:1–26.
Te Grotenhuis, Manfred, Marijn Scholte, Nan Dirk de Graaf and Ben Pelzer. 2015. The between and within effects of social security on church attendance in Europe 1980–1998: The danger of testing hypotheses cross-nationally. European Sociological Review 31:643–654.
Heisig, Jan P., Merlin Schaeffer and Johannes Giesecke. 2017. The costs of simplicity: Why multilevel models may benefit from accounting for cross-cluster differences in the effects of controls. American Sociological Review 82:796–827.
Heisig, Jan Paul, and Merlin Schaeffer. 2019. Why you should always include a random slope for the lower-level variable involved in a cross-level interaction. European Sociological Review 35:258–279.
Hox, Joop J. 2010. Multilevel analysis: Techniques and applications, 2nd edition. New York: Routledge.
Immerzeel, Tim, and Frank Van Tubergen. 2013. Religion as reassurance? Testing the insecurity theory in 26 European countries. European Sociological Review 29:359–372.
Jackman, Simon D. 2009. Bayesian analysis for the social sciences. New York: John Wiley.
Jaeger, Mads M. 2013. The effect of macroeconomic and social conditions on the demand for redistribution: A pseudo panel approach. Journal of European Social Policy 23:149–163.
Kim, Jee-Seon, and Edward W. Frees. 2006. Omitted variables in multilevel models. Psychometrika 71:659–690.
Maas, Cora J. M., and Joop J. Hox. 2005. Sufficient sample sizes for multilevel modeling. Methodology: European Journal of Research Methods for the Behavioral and Social Sciences 1:86–92.
Van der Meer, Tom, Manfred Te Grotenhuis and Ben Pelzer. 2010. Influential cases in multilevel modeling: A methodological comment. American Sociological Review 75:173–178.
Mewes, Jan. 2014. Gen(d)eralized trust: Women, work, and trust in strangers. European Sociological Review 30:373–386.
Möhring, Katja, and Alexander W. Schmidt. 2013. MLT: Stata module to provide multilevel tools (Statistical Software Components S457577). Boston, MA: Boston College Department of Economics.
Mundlak, Yair. 1978. Pooling of time-series and cross-section data. Econometrica 46:69–85.
Patterson, H. Desmond, and Robin Thompson. 1971. Recovery of inter-block information when block sizes are unequal. Biometrika 58:545–554.
Rabe-Hesketh, Sophia, and Anders Skrondal. 2012. Multilevel and longitudinal modeling using Stata, 3rd edition. College Station, TX: Stata Press.
Schmidt-Catran, Alexander W. 2016. Economic inequality and demand for redistribution: Cross-sectional and longitudinal evidence. Socio-Economic Review 14:119–140.
Schmidt-Catran, Alexander W., and Malcolm Fairbrother. 2016. The random effects in multilevel models: Getting them wrong and getting them right. European Sociological Review 32:23–38.
Semyonov, Moshe, Rebeca Raijman and Anastasia Gorodzeisky. 2006. The rise of anti-foreigner sentiment in European societies, 1988–2000. American Sociological Review 71:426–449.
Snijders, Tom A.B., and Johannes Berkhof. 2008. Diagnostic checks for multilevel models. In Handbook of multilevel analysis, eds. Jan De Leeuw and Erik Meijer, 457–514. New York: Springer.
Snijders, Tom A.B., and Roel J. Bosker. 2012. Multilevel analysis: An introduction to basic and advanced multilevel modelling, 2nd edition. London: Sage.
Stadelmann-Steffen, Isabelle. 2012. Education policy and educational inequality – evidence from the Swiss laboratory. European Sociological Review 28:379–393.
Stegmueller, Daniel. 2013. How many countries for multilevel modeling? A comparison of frequentist and Bayesian approaches. American Journal of Political Science 57:748–761.
Stegmueller, Daniel, Peer Scheepers, Sigrid Roßteutscher and Eelke de Jong. 2012. Support for redistribution in western Europe: Assessing the role of religion. European Sociological Review 28:482–497.
Sutton, John R. 2012. Imprisonment and opportunity structures: A Bayesian hierarchical analysis. European Sociological Review 28:12–27.
Van Erp, Sara, Joris Mulder and Daniel L. Oberski. 2017. Prior sensitivity analysis in default Bayesian structural equation modeling. Psychological Methods 23:363–388.
Wilkes, Rima, Neil Guppy and Lily Farris. 2007. Right-wing parties and anti-foreigner sentiment in Europe. American Sociological Review 72:831–840.
Wooldridge, Jeffrey M. 2013. Introductory econometrics: A modern approach. Mason, OH: South-Western Cengage Learning.
Wulfgramm, M. 2014. Life satisfaction effects of unemployment in Europe: The moderating influence of labour market policy. Journal of European Social Policy 24:258–272.

Author information

Authors and Affiliations

Institut für Soziologie, Lehrstuhl für Soziologie mit dem Schwerpunkt Methoden der quantitativen empirischen Sozialforschung, Goethe-Universität Frankfurt, Theodor-W.-Adorno-Platz 6, Campus Westend, 60323, Frankfurt am Main, Germany
Alexander W. Schmidt-Catran
Department of Sociology, Umeå University, Norra Beteendevetarhuset, Umeå universitet, 901 87, Umeå, Sweden
Malcolm Fairbrother
Fakultät für Wirtschafts- und Sozialwissenschaften, Institut für Soziologie und Sozialpsychologie, Lehrstuhl für empirische Sozial- und Wirtschaftsforschung, Universität zu Köln, Albertus-Magnus-Platz, 50923, Cologne, Germany
Hans-Jürgen Andreß

Corresponding author

Correspondence to Alexander W. Schmidt-Catran.

Additional information

Online Appendix: http://www.schmidt-catran.de/mixedmodels.html

Appendix

Table 4 Sample sizes of example data—European Social Survey (ESS) rounds 1 to 7

Table 5 Cook’s D of fixed part and DFBETAs of within-effect of social spending from Model M6

About this article

Cite this article

Schmidt-Catran, A.W., Fairbrother, M. & Andreß, H.-J. Multilevel Models for the Analysis of Comparative Survey Data: Common Problems and Some Solutions. Köln Z Soziol 71 (Suppl 1), 99–128 (2019). https://doi.org/10.1007/s11577-019-00607-9

Published: 06 May 2019
Issue Date: 03 June 2019
DOI: https://doi.org/10.1007/s11577-019-00607-9
