Marginal Models: An Overview

Chapter in: Trends and Challenges in Categorical Data Analysis

Part of the book series: Statistics for Social and Behavioral Sciences (SSBS)

Abstract

Marginal models involve restrictions on the conditional and marginal association structure of a set of categorical variables. They generalize log-linear models for contingency tables, which are the fundamental tools for modelling the conditional association structure. This chapter gives an overview of the development of marginal models during the past 20 years. After providing some motivating examples, the first few sections focus on the definition and characteristics of marginal models. Specifically, we show how their fundamental properties can be understood from the properties of marginal log-linear parameterizations. Algorithms for estimating marginal models are discussed, focussing on the maximum likelihood and the generalized estimating equations approaches. It is shown how marginal models can help to understand directed graphical and path models, and a description is given of marginal models with latent variables.

Notes

  1. The design also has disadvantages, of course. These include panel attrition and the fact that, even if originally selected appropriately, the sample will over time come to differ in composition from the current population.

  2. See Bergsma et al. [12], Chapter 5, for applications in causal analysis and the relationship with structural equation models.

  3. The neighbours of a node A are those nodes with which A is connected by an edge.

  4. For alternative parameterizations see Sect. 3.5.

  5. Note that formula (iii) in Theorem 3.1 in Ghosh and Vellaisamy [38] appears to have a typo.

  6. A model is called smooth if it admits a smooth parameterization.

  7. The notation for the variables is clarified later.

  8. Setting a parameter to zero means setting it to zero for all category combinations of the variables in the effect.

References

  1. Agresti, A.: Categorical Data Analysis, 3rd edn. Wiley, London (2013)

  2. Andersson, S.A., Madigan, D., Perlman, M.D.: Alternative Markov properties for chain graphs. Scand. J. Stat. 28, 33–85 (2001)

  3. Aitchison, J., Silvey, S.D.: Maximum likelihood estimation of parameters subject to restraints. Ann. Math. Stat. 29, 813–828 (1958)

  4. Barndorff-Nielsen, O.: Information and Exponential Families. Wiley, New York (1978)

  5. Bartolucci, F., Forcina, A.: A class of latent marginal models for capture–recapture data with continuous covariates. J. Am. Stat. Assoc. 101, 786–794 (2006)

  6. Bartolucci, F., Colombi, R., Forcina, A.: An extended class of marginal link functions for modelling contingency tables by equality and inequality constraints. Stat. Sin. 691–711 (2007)

  7. Bartolucci, F., Scaccia, L., Farcomeni, A.: Bayesian inference through encompassing priors and importance sampling for a class of marginal models for categorical data. Comput. Stat. Data Anal. 56, 4067–4080 (2012)

  8. Bergsma, W.P.: Marginal Models for Categorical Data. Tilburg University Press, Tilburg (1997)

  9. Bergsma, W., Rudas, T.: Marginal models for categorical data. Ann. Stat. 30, 140–159 (2002)

  10. Bergsma, W., Rudas, T.: Variation independent parameterizations of multivariate categorical distributions. In: Cuadras, C.M., Fortiana, J., Rodriguez-Lallena, J.A. (eds.) Distributions with Given Marginals and Statistical Modelling, pp. 21–27. Kluwer, Dordrecht (2002)

  11. Bergsma, W., Rudas, T.: On conditional and marginal association. Annales de la Faculte des Sciences de Toulouse 6(11), 455–468 (2003)

  12. Bergsma, W., Croon, M., Hagenaars, J.A.: Marginal Models For Dependent, Clustered and Longitudinal Categorical Data. Springer, New York (2009)

  13. Bergsma, W.P., Croon, M.A., Hagenaars, J.A.: Advancements in marginal modeling for categorical data. Sociol. Methodol. 43(1), 1–41 (2013)

  14. Bergsma, W.P., Rapcsák, T.: An exact penalty method for smooth equality constrained optimization with application to maximum likelihood estimation. Eurandom Technical Report (2005)

  15. Bishop, Y.M.M., Fienberg, S.E., Holland, P.W.: Discrete Multivariate Analysis: Theory and Practice. MIT Press, Cambridge (1975)

  16. Bon, J., Baffour, B., Spallek, M., Haynes, M.: Analysing sensitive data from dynamically-generated overlapping contingency tables. J. Off. Stat. 36, 275–296 (2020)

  17. Cocchi, M. (ed.): Data Fusion Methodology and Applications. Elsevier, Amsterdam (2019)

  18. Colombi, R., Forcina, A.: Marginal regression models for the analysis of positive association of ordinal response variables. Biometrika 88, 1007–1019 (2001)

  19. Colombi, R., Forcina, A.: A class of smooth models satisfying marginal and context specific conditional independencies. J. Multivariate Anal. 126, 75–85 (2014)

  20. Colombi, R., Giordano, S.: Multiple hidden Markov models for categorical time series. J. Multivariate Anal. 140, 19–30 (2015)

  21. Colombi, R., Forcina, A.: Testing order restrictions in contingency tables. Metrika 79, 73–90 (2016)

  22. Colombi, R., Giordano, S., Cazzaro, M.: hmmm: an R package for hierarchical multinomial marginal models. J. Stat. Softw. 59(11), 1–25 (2014)

  23. Colombi, R., Giordano, S., Gottard, A., Iannario, M.: Hierarchical marginal models with latent uncertainty. Scand. J. Stat. 46, 595–620 (2019)

  24. Cox, D.R., Wermuth, N.: Multivariate Dependencies. Chapman and Hall, London (1996)

  25. Dardanoni, V., Fiorini, M., Forcina, A.: Stochastic monotonicity in intergenerational mobility tables. J. Appl. Econom. 27, 85–107 (2012)

  26. D’Orazio, M., Di Zio, M., Scanu, M.: Statistical matching for categorical data: displaying uncertainty and using logical constraints. J. Off. Stat. 22, 137–157 (2006)

  27. Drton, M.: Discrete chain graph models. Bernoulli 15(3), 736–753 (2009)

  28. Eppmann, H., Krügener, S., Schäfer, J.: First German register based census in 2011. Allgemeines Statistisches Archiv 90(3), 465–482 (2006)

  29. Evans, R.J.: Smoothness of marginal log-linear parameterization. Electron. J. Stat. 9, 475–491 (2015)

  30. Evans, R.J., Forcina, A.: Two algorithms for fitting constrained marginal models. Comput. Stat. Data Anal. 66, 1–7 (2013)

  31. Evans, R.J., Richardson, T.S.: Marginal log-linear parameters for graphical Markov models. J. R. Stat. Soc. B: Stat. Methodol. 75(4), 743 (2013)

  32. Forcina, A.: Identifiability of extended latent class models with individual covariates. Comput. Stat. Data Anal. 52, 5263–5268 (2008)

  33. Forcina, A.: Smoothness of conditional independence. J. Multivariate Anal. 106, 49–56 (2012)

  34. Forcina, A., Kateri, M.: A new general class of RC association models: estimation and main properties. J. Multivariate Anal. 184, 1–16 (2021)

  35. Forcina, A., Lupparelli, M., Marchetti, G.M.: Marginal parameterizations of discrete models defined by a set of conditional independencies. J. Multivariate Anal. 101, 2519–2527 (2010)

  36. Frees, E.W., Kim, J.-S.: Panel studies. In: Rudas, T. (ed.) Handbook of Probability: Theory and Applications, pp. 205–224. Sage, Thousand Oaks (2008)

  37. Frydenberg, M.: The chain graph Markov property. Scand. J. Stat. 17, 333–353 (1990)

  38. Ghosh, S., Vellaisamy, P.: Marginal log-linear parameters and their collapsibility for categorical data (2019). arXiv 1711.00680v4

  39. Glonek, G.F.V.: A class of regression models for multivariate categorical responses. Biometrika 83, 15–28 (1996)

  40. Glonek, G.F.V., McCullagh, P.: Multivariate logistic models. J. R. Stat. Soc. B 57, 533–546 (1995)

  41. Goodman, L.A.: The analysis of multidimensional contingency tables when some variables are posterior to others: a modified path analysis approach. Biometrika 60, 179–192 (1973)

  42. Hagenaars, J.A., Bergsma, W., Croon, M.: Nonloglinear marginal latent class models. In: Advances in Latent Class Analysis: A Festschrift in Honor of C. Mitchell Dayton, vol. 61 (2019)

  43. Huber, P.J.: The behavior of maximum likelihood estimates under nonstandard conditions. Proc. Fifth Berkeley Symp. Math. Statist. Probab. 1, 221–233 (1967)

  44. Kuijpers, R.E., Ark, L.A., Croon, M.A.: Testing hypotheses involving Cronbach’s alpha using marginal models. Br. J. Math. Stat. Psychol. 66, 503–520 (2013)

  45. Kuijpers, R.E., Ark, L.A., Croon, M.A.: Standard errors and confidence intervals for scalability coefficients in Mokken scale analysis using marginal models. Sociol. Methodol. 43, 42–69 (2013)

  46. Lang, J.B.: Maximum likelihood methods for a generalized class of log-linear models. Ann. Stat. 24, 726–752 (1996)

  47. Lang, J.B.: On the comparison of multinomial and Poisson log-linear models. J. R. Stat. Soc. B (Methodological) 58(1), 253–266 (1996b)

  48. Lang, J.B.: Multinomial-Poisson homogeneous models for contingency tables. Ann. Stat. 32, 340–383 (2004)

  49. Lang, J.B., Agresti, A.: Simultaneously modelling the joint and marginal distributions of multivariate categorical responses. J. Am. Stat. Assoc. 89, 625–632 (1994)

  50. Lauritzen, S.L.: Graphical Models. Clarendon Press, Oxford (1996)

  51. Lauritzen, S.L., Dawid, A.P., Larsen, B.N., Leimer, H.-G.: Independence properties of directed Markov fields. Networks 20(5), 491–505 (1990)

  52. Lauritzen, S.L., Wermuth, N.: Graphical models for associations between variables, some of which are qualitative and some quantitative. Ann. Stat. 17, 31–57 (1989)

  53. Liang, K.Y., Zeger, S.L.: Longitudinal data analysis using generalized linear models. Biometrika 73, 13–22 (1986)

  54. Lipsitz, S.R., Laird, N.M., Harrington, D.P.: Generalized estimating equations for correlated binary data: using the odds ratio as a measure of association. Biometrika 78(1), 153–160 (1991)

  55. Little, R., Rubin, D.: Statistical Analysis with Missing Data, 3rd edn. Wiley, New York (2019)

  56. Lupparelli, M., Marchetti, G.M., Bergsma, W.P.: Parameterizations and fitting of bi-directed graph models to categorical data. Scand. J. Stat. 36(3), 559–576 (2009)

  57. Lupparelli, M., Roverato, A.: Log-mean linear regression models for binary responses with an application to multimorbidity. J. R. Stat. Soc. C (Applied Statistics) 66(2), 227–252 (2017)

  58. Marchetti, G.M., Lupparelli, M.: Chain graph models of multivariate regression type for categorical data. Bernoulli 17(3), 827–844 (2011)

  59. McCullagh, P., Nelder, J.A.: Generalized Linear Models, 2nd edn. Chapman and Hall, London (1989)

  60. Molenberghs, G., Verbeke, G.: Models for Discrete Longitudinal Data (2005)

  61. Németh, R., Rudas, T.: On the application of discrete marginal graphical models. Sociol. Methodol. 43, 70–100 (2013)

  62. Németh, R., Rudas, T.: Discrete graphical models in social mobility research—a comparative analysis of American, Czechoslovakian and Hungarian mobility before the collapse of state socialism. Bull. Soc. Methodol. 118, 5–21 (2013)

  63. Nicolussi, F., Cazzaro, M.: Context-specific independencies in hierarchical multinomial marginal models. Stat. Methods Appl. 29, 767–786 (2020)

  64. Nicolussi, F., Colombi, R.: Type II chain graph models for categorical data: a smooth subclass. Bernoulli 23, 863–883 (2017)

  65. Ntzoufras, I., Tarantola, C., Lupparelli, M.: Probability based independence sampler for Bayesian quantitative learning in graphical log-linear marginal models. Bayesian Anal. 14, 777–803 (2019)

  66. Pan, W.: Akaike’s information criterion in generalized estimating equations. Biometrics 57(1), 120–125 (2001)

  67. Qaqish, B.F., Ivanova, T.: Multivariate logistic models. Biometrika 93, 1011–1017 (2006)

  68. Rhemtulla, M., Little, T.: Tool of the trade: planned missing data designs for research in cognitive development. J. Cogn. Dev. 13(4), 10 (2012). https://doi.org/10.1080/15248372.2012.717340

  69. Rotnitzky, A., Jewell, N.P.: Hypothesis testing of regression parameters in semiparametric generalized linear models for cluster correlated data. Biometrika 77(3), 485–497 (1990)

  70. Roverato, A., Lupparelli, M., La Rocca, L.: Log-mean linear models for binary data. Biometrika 100(2), 485–494 (2013)

  71. Rudas, T.: Prescribed conditional interaction structure models with application to the analysis of mobility tables. Qual. Quant. 25, 345–358 (1991)

  72. Rudas, T.: Odds Ratios in the Analysis of Contingency Tables. No 119, Quantitative Applications in the Social Sciences. Sage, Thousand Oaks (1998)

  73. Rudas, T.: Informative allocation and consistent treatment selection. Stat. Methodol. 7, 323–337 (2010)

  74. Rudas, T.: Directionally collapsible parameterizations of multivariate binary distributions. Stat. Methodol. 27, 132–145 (2015)

  75. Rudas, T.: Lectures on Categorical Data Analysis. Springer, New York (2018)

  76. Rudas, T., Bergsma, W.: On applications of marginal models to categorical data. Metron 42, 15–37 (2004)

  77. Rudas, T., Bergsma, W., Németh, R.: Parameterization and estimation of path models for categorical data. In: Rizzi, A., Vichi, M. (eds.) COMPSTAT 2006, pp. 383–394. Physica Verlag, Heidelberg (2006)

  78. Rudas, T., Bergsma, W., Németh, R.: Marginal log-linear parameterization of conditional independence models. Biometrika 97, 1006–1012 (2010)

  79. Rudas, T., Leimer, H.-G.: Analysis of contingency tables with known conditional odds ratios or known log-linear parameters. In: Francis, B., Seeberg, G.U.H., van der Heijden, P.G.M., Jansen, W. (eds.) Statistical Modelling, pp. 313–322. Elsevier, Amsterdam (1992)

  80. Shpitser, I., Evans, R.J., Richardson, T.S., Robins, J.M.: Sparse nested Markov models with loglinear parameters. In: Twenty-ninth Conference on Uncertainty in Artificial Intelligence, pp. 576–585 (2013)

  81. Stanghellini, E., van der Heijden, P.G.: A multiple-record systems estimation method that takes observed and unobserved heterogeneity into account. Biometrics 60(2), 510–516 (2004)

  82. Touloumis, A., Agresti, A., Kateri, M.: GEE for multinomial responses using a local odds ratios parameterization. Biometrics 69(3), 633–640 (2013)

  83. Turner, E.L.: Marginal Modelling of Capture-Recapture Data. Ph.D. Thesis, McGill University, Montreal (2007)

  84. Wedderburn, R.W.: Quasi-likelihood functions, generalized linear models, and the Gauss–Newton method. Biometrika 61(3), 439–447 (1974)

Author information

Corresponding author

Correspondence to Wicher Bergsma.

Copyright information

© 2023 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Rudas, T., Bergsma, W. (2023). Marginal Models: An Overview. In: Kateri, M., Moustaki, I. (eds) Trends and Challenges in Categorical Data Analysis. Statistics for Social and Behavioral Sciences. Springer, Cham. https://doi.org/10.1007/978-3-031-31186-4_3
