Bayesian Variable Selection in Generalized Linear Mixed Models

Chapter in: Random Effect and Latent Variable Model Selection

Part of the book series: Lecture Notes in Statistics (LNS, volume 192)

Abstract

Repeated measures and longitudinal data are commonly collected for analysis in epidemiology, clinical trials, biology, sociology, and the economic sciences. In such studies, a response is measured repeatedly over time for each subject, and the number and timing of the observations often vary among subjects. In contrast to cross-sectional studies, which collect a single measurement per subject, longitudinal studies have the extra complication of within-subject dependence in the repeated measures. Such dependence can be thought of as arising from the impact of unmeasured predictors. Main effects of unmeasured predictors lead to variation in the average level of response among subjects, while interactions with measured predictors lead to heterogeneity in the regression coefficients. This justification has motivated random effects models, which allow the intercept and slopes in a regression model to be subject-specific. Random effects models are broadly useful for modeling dependence not only in longitudinal data but also in multicenter studies, meta-analysis, and functional data analysis.


References

  • Albert, J.H. and Chib, S. (1997). Bayesian Test and Model Diagnostics in Conditionally Independent Hierarchical Models. Journal of the American Statistical Association 92, 916–925.
  • Breslow, N.E. and Clayton, D.G. (1993). Approximate Inference in Generalized Linear Mixed Models. Journal of the American Statistical Association 88, 9–25.
  • Cai, B. and Dunson, D.B. (2006). Bayesian Covariance Selection in Generalized Linear Mixed Models. Biometrics 62, 446–457.
  • Cai, B. and Dunson, D.B. (2007). Bayesian Variable Selection in Nonparametric Random Effects Models. Biometrika, under revision.
  • Chen, Z. and Dunson, D.B. (2003). Random Effects Selection in Linear Mixed Models. Biometrics 59, 762–769.
  • Chen, M., Ibrahim, J.G., Shao, Q., and Weiss, R.E. (2003). Prior Elicitation for Model Selection and Estimation in Generalized Linear Mixed Models. Journal of Statistical Planning and Inference 111, 57–76.
  • Chipman, H., George, E.I., and McCulloch, R.E. (2002). Bayesian Treed Models. Machine Learning 48, 299–320.
  • Chipman, H., George, E.I., and McCulloch, R.E. (2003). Bayesian Treed Generalized Linear Models. Bayesian Statistics 7 (J.M. Bernardo, M.J. Bayarri, J.O. Berger, A.P. Dawid, D. Heckerman, A.F.M. Smith, and M. West, eds), Oxford: Oxford University Press, 323–349.
  • Daniels, M.J. and Kass, R.E. (1999). Nonconjugate Bayesian Estimation of Covariance Matrices and Its Use in Hierarchical Models. Journal of the American Statistical Association 94, 1254–1263.
  • Daniels, M.J. and Pourahmadi, M. (2002). Bayesian Analysis of Covariance Matrices and Dynamic Models for Longitudinal Data. Biometrika 89, 553–566.
  • Daniels, M.J. and Zhao, Y.D. (2003). Modelling the Random Effects Covariance Matrix in Longitudinal Data. Statistics in Medicine 22, 1631–1647.
  • Gelman, A. and Rubin, D.B. (1992). Inference from Iterative Simulation using Multiple Sequences. Statistical Science 7, 457–472.
  • George, E.I. and McCulloch, R.E. (1993). Variable Selection via Gibbs Sampling. Journal of the American Statistical Association 88, 881–889.
  • Geweke, J. (1992). Evaluating the Accuracy of Sampling-Based Approaches to the Calculation of Posterior Moments. Bayesian Statistics 4 (J.M. Bernardo, J.O. Berger, A.P. Dawid, and A.F.M. Smith, eds), Oxford: Oxford University Press, 169–193.
  • Geweke, J. (1996). Variable Selection and Model Comparison in Regression. Bayesian Statistics 5 (J.O. Berger, J.M. Bernardo, A.P. Dawid, and A.F.M. Smith, eds), Oxford: Oxford University Press, 609–620.
  • Gilks, W.R., Best, N.G., and Tan, K.K.C. (1995). Adaptive Rejection Metropolis Sampling within Gibbs Sampling. Applied Statistics 44, 455–472.
  • Gilks, W.R., Neal, R.M., Best, N.G., and Tan, K.K.C. (1997). Corrigendum: Adaptive Rejection Metropolis Sampling. Applied Statistics 46, 541–542.
  • Kinney, S.K. and Dunson, D.B. (2007). Fixed and Random Effects Selection in Linear and Logistic Models. Biometrics 63, 690–698.
  • Laird, N.M. and Ware, J.H. (1982). Random Effects Models for Longitudinal Data. Biometrics 38, 963–974.
  • Liechty, J.C., Liechty, M.W., and Muller, P. (2004). Bayesian Correlation Estimation. Biometrika 91, 1–14.
  • Lin, X. (1997). Variance Component Testing in Generalised Linear Models with Random Effects. Biometrika 84, 309–326.
  • McCulloch, C.E. (1997). Maximum Likelihood Algorithms for Generalized Linear Mixed Models. Journal of the American Statistical Association 92, 162–170.
  • McCulloch, C.E. and Searle, S. (2001). Generalized Linear and Mixed Models. New York: Wiley.
  • McGilchrist, C.A. (1994). Estimation in Generalized Mixed Models. Journal of the Royal Statistical Society B 56, 61–69.
  • Meyer, M.C. and Laud, P.W. (2002). Predictive Variable Selection in Generalized Linear Models. Journal of the American Statistical Association 97, 859–871.
  • Nott, D.J. and Leonte, D. (2004). Sampling Schemes for Bayesian Variable Selection in Generalized Linear Models. Journal of Computational and Graphical Statistics 13, 362–382.
  • Ntzoufras, I., Dellaportas, P., and Forster, J.J. (2003). Bayesian Variable Selection and Link Determination for Generalised Linear Models. Journal of Statistical Planning and Inference 111, 165–180.
  • Raftery, A. (1996). Approximate Bayes Factors and Accounting for Model Uncertainty in Generalized Linear Models. Biometrika 83, 251–266.
  • Raftery, A.E. and Lewis, S. (1992). How Many Iterations in the Gibbs Sampler? Bayesian Statistics 4 (J.M. Bernardo, J.O. Berger, A.P. Dawid, and A.F.M. Smith, eds), Oxford: Oxford University Press, 763–773.
  • Raftery, A.E., Madigan, D., and Volinsky, C.T. (1996). Accounting for Model Uncertainty in Survival Analysis Improves Predictive Performance. Bayesian Statistics 5 (J.M. Bernardo, J.O. Berger, A.P. Dawid, and A.F.M. Smith, eds), Oxford: Oxford University Press, 323–349.
  • Rowland, A.S., Baird, D.D., Weinberg, C.R., Shore, D.L., Shy, C.M., and Wilcox, A.J. (1992). Reduced Fertility Among Women Employed as Dental Assistants Exposed to High Levels of Nitrous Oxide. The New England Journal of Medicine 327, 993–997.
  • Schall, R. (1991). Estimation in Generalized Linear Mixed Models with Random Effects. Biometrika 78, 719–727.
  • Sinharay, S. and Stern, H.S. (2001). Bayes Factors for Variance Component Testing in Generalized Linear Mixed Models. In Bayesian Methods with Applications to Science, Policy and Official Statistics (ISBA 2000 Proceedings), 507–516.
  • Solomon, P.J. and Cox, D.R. (1992). Nonlinear Component of Variance Models. Biometrika 79, 1–11.
  • Wong, F., Carter, C.K., and Kohn, R. (2003). Efficient Estimation of Covariance Selection Models. Biometrika 90, 809–830.
  • Zeger, S.L. and Karim, M.R. (1991). Generalized Linear Models with Random Effects: A Gibbs Sampling Approach. Journal of the American Statistical Association 86, 79–86.


Corresponding author

Correspondence to Bo Cai.


Appendix

The normal linear mixed model of Laird and Ware (1982) is a special case of a GLMM having \(g(\mu_{ij}) = \mu_{ij} = \eta_{ij} = \mathbf{x}'_{ij}\beta + \mathbf{z}'_{ij}\zeta_i\), \(\phi = \sigma^2\), and \(b(\theta_{ij}) = \eta_{ij}^2/2\). In this case, \(\frac{\partial l(\beta,\phi,\zeta\,|\,y)}{\partial\eta}\frac{\partial l(\beta,\phi,\zeta\,|\,y)}{\partial\eta'} = (y-\eta)(y-\eta)'/\sigma^2\) and \(\frac{\partial^2 l(\beta,\phi,\zeta\,|\,y)}{\partial\eta\,\partial\eta'} = -1_N 1'_N/\sigma^2\). Therefore we have

$$ \begin{array}{c} B_{i,k}^{(1)} = \left\{ (\mathbf{y}_i - X_i\beta)' \mathbf{Z}_{ik} \right\}^2 - \mathbf{Z}'_{ik}\mathbf{Z}_{ik}, \\ B_{i,m,k}^{(2)} = (\mathbf{y}_i - X_i\beta)' \mathbf{Z}_{im}\mathbf{Z}'_{ik} (\mathbf{y}_i - X_i\beta) - \mathbf{Z}'_{ik}\mathbf{Z}_{im}, \\ \end{array} $$

where \(\mathbf{Z}_{ik}\) denotes the \(k\)th column of \(\mathbf{Z}_i\), and

$$ L_0 = \exp\left\{ -\frac{1}{2\sigma^2} \sum_{i=1}^n \sum_{j=1}^{n_i} \left( y_{ij} - \mathbf{x}'_{ij}\beta \right)^2 \right\}. $$
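As a concrete illustration, the normal-case quantities above can be evaluated directly with NumPy. This is only a minimal sketch: the data \(y_i, X_i, Z_i\) and the values of \(\beta\) and \(\sigma^2\) below are hypothetical, and any \(\sigma^2\) scaling is absorbed as in the expressions above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data for one subject i: n_i = 5 observations,
# p = 2 fixed-effect predictors, q = 3 candidate random effects.
n_i, p, q = 5, 2, 3
X_i = rng.normal(size=(n_i, p))
Z_i = rng.normal(size=(n_i, q))
beta = np.array([1.0, -0.5])
sigma2 = 1.0
y_i = X_i @ beta + rng.normal(scale=np.sqrt(sigma2), size=n_i)

r_i = y_i - X_i @ beta          # residual vector y_i - X_i * beta

def B1(k):
    """B_{i,k}^{(1)} = {(y_i - X_i beta)' Z_ik}^2 - Z_ik' Z_ik."""
    z = Z_i[:, k]
    return (r_i @ z) ** 2 - z @ z

def B2(m, k):
    """B_{i,m,k}^{(2)} = (y_i - X_i beta)' Z_im Z_ik' (y_i - X_i beta) - Z_ik' Z_im."""
    zm, zk = Z_i[:, m], Z_i[:, k]
    return (r_i @ zm) * (zk @ r_i) - zk @ zm

# Subject i's contribution to the null likelihood L0 (normal case).
L0_i = np.exp(-0.5 / sigma2 * np.sum(r_i ** 2))
```

Note that setting \(m = k\) in \(B_{i,m,k}^{(2)}\) recovers \(B_{i,k}^{(1)}\), which gives a quick check of an implementation.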

When \(y_{ij}\) are 0–1 random variables, the logistic regression model is obtained from the canonical link function \(g(\pi_{ij}) = \log\frac{\pi_{ij}}{1-\pi_{ij}} = \eta_{ij} = \mathbf{x}'_{ij}\beta + \mathbf{z}'_{ij}\zeta_i\), with \(\phi = 1\) and \(b(\theta_{ij}) = \log(1 + e^{\eta_{ij}}) = -\log(1-\pi_{ij})\); hence \(\frac{\partial l(\beta,\phi,\zeta\,|\,y)}{\partial\eta}\frac{\partial l(\beta,\phi,\zeta\,|\,y)}{\partial\eta'} = (y-\pi)(y-\pi)'\) and \(\frac{\partial^2 l(\beta,\phi,\zeta\,|\,y)}{\partial\eta\,\partial\eta'} = -\pi(1_N-\pi)'\). Then

$$ \begin{array}{c} B_{i,k}^{(1)} = \left\{ (\mathbf{y}_i - \pi_i)' \mathbf{Z}_{ik} \right\}^2 - \pi'_i\,\mathrm{DG}(\mathbf{Z}_{ik}\mathbf{Z}'_{ik})(1_{n_i} - \pi_i), \\ B_{i,m,k}^{(2)} = (\mathbf{y}_i - \pi_i)' \mathbf{Z}_{im}\mathbf{Z}'_{ik} (\mathbf{y}_i - \pi_i) - \pi'_i\,\mathrm{DG}(\mathbf{Z}_{im}\mathbf{Z}'_{ik})(1_{n_i} - \pi_i), \\ \end{array} $$

where \(\pi_i = (\pi_{i1},\ldots,\pi_{in_i})'\) with \(\pi_{ij} = \exp(\mathbf{x}'_{ij}\beta)/\{1 + \exp(\mathbf{x}'_{ij}\beta)\}\), and

$$ L_0 = \exp\left\{ \sum_{i=1}^n \sum_{j=1}^{n_i} \left[ y_{ij} \log\frac{\pi_{ij}}{1-\pi_{ij}} + \log\left(1 - \pi_{ij}\right) \right] \right\}. $$
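The logistic-case quantities can be sketched in the same way, reading \(\mathrm{DG}(A)\) as the diagonal of \(A\), so that \(\pi'_i\,\mathrm{DG}(A)(1_{n_i}-\pi_i) = \sum_j \pi_{ij} A_{jj} (1-\pi_{ij})\). Data and parameter values are again hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data for one subject i with Bernoulli responses.
n_i, p, q = 5, 2, 3
X_i = rng.normal(size=(n_i, p))
Z_i = rng.normal(size=(n_i, q))
beta = np.array([0.8, -0.3])

eta = X_i @ beta
pi_i = np.exp(eta) / (1.0 + np.exp(eta))   # pi_ij under the null (no random effects)
y_i = rng.binomial(1, pi_i).astype(float)
r_i = y_i - pi_i

def B1(k):
    """B_{i,k}^{(1)} for the logistic case."""
    z = Z_i[:, k]
    dg = z * z                              # diagonal of Z_ik Z_ik'
    return (r_i @ z) ** 2 - pi_i @ (dg * (1.0 - pi_i))

def B2(m, k):
    """B_{i,m,k}^{(2)} for the logistic case."""
    zm, zk = Z_i[:, m], Z_i[:, k]
    dg = zm * zk                            # diagonal of Z_im Z_ik'
    return (r_i @ zm) * (zk @ r_i) - pi_i @ (dg * (1.0 - pi_i))

# Subject i's contribution to the null likelihood L0 (Bernoulli case).
L0_i = np.exp(np.sum(y_i * np.log(pi_i / (1 - pi_i)) + np.log(1 - pi_i)))
```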

Similarly, when \(y_{ij}\) are counts with mean \(\lambda_{ij}\), the Poisson regression model is obtained from the canonical link function \(g(\lambda_{ij}) = \log\lambda_{ij} = \eta_{ij} = \mathbf{x}'_{ij}\beta + \mathbf{z}'_{ij}\zeta_i\), with \(\phi = 1\) and \(b(\theta_{ij}) = e^{\eta_{ij}} = \lambda_{ij}\), so that \(\frac{\partial l(\beta,\phi,\zeta\,|\,y)}{\partial\eta}\frac{\partial l(\beta,\phi,\zeta\,|\,y)}{\partial\eta'} = (y-\lambda)(y-\lambda)'\) and \(\frac{\partial^2 l(\beta,\phi,\zeta\,|\,y)}{\partial\eta\,\partial\eta'} = -\lambda 1'_N\). Then we obtain

$$ \begin{array}{c} B_{i,k}^{(1)} = \left\{ (\mathbf{y}_i - \lambda_i)' \mathbf{Z}_{ik} \right\}^2 - \lambda'_i\,\mathrm{DG}(\mathbf{Z}_{ik}\mathbf{Z}'_{ik})\,1_{n_i}, \\ B_{i,m,k}^{(2)} = (\mathbf{y}_i - \lambda_i)' \mathbf{Z}_{im}\mathbf{Z}'_{ik} (\mathbf{y}_i - \lambda_i) - \lambda'_i\,\mathrm{DG}(\mathbf{Z}_{im}\mathbf{Z}'_{ik})\,1_{n_i}, \\ \end{array} $$

where \(\lambda_i = (\lambda_{i1},\ldots,\lambda_{in_i})'\) with \(\lambda_{ij} = \exp(\mathbf{x}'_{ij}\beta)\), and
$$ L_0 = \exp\left\{ \sum_{i=1}^n \sum_{j=1}^{n_i} \left( y_{ij}\log\lambda_{ij} - \lambda_{ij} - \log y_{ij}! \right) \right\}. $$
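The Poisson case follows the same pattern; here the \(\mathrm{DG}\) term reduces to \(\sum_j \lambda_{ij} z_{ijm} z_{ijk}\). As before, this is a sketch with hypothetical data and parameter values, computing one subject's contribution on the log scale to avoid underflow.

```python
import numpy as np
from math import lgamma

rng = np.random.default_rng(2)

# Hypothetical data for one subject i with count responses.
n_i, p, q = 5, 2, 3
X_i = rng.normal(size=(n_i, p))
Z_i = rng.normal(size=(n_i, q))
beta = np.array([0.4, 0.2])

lam_i = np.exp(X_i @ beta)                 # lambda_ij under the null (no random effects)
y_i = rng.poisson(lam_i).astype(float)
r_i = y_i - lam_i

def B1(k):
    """B_{i,k}^{(1)} for the Poisson case."""
    z = Z_i[:, k]
    return (r_i @ z) ** 2 - lam_i @ (z * z)    # diag of Z_ik Z_ik' is z*z

def B2(m, k):
    """B_{i,m,k}^{(2)} for the Poisson case."""
    zm, zk = Z_i[:, m], Z_i[:, k]
    return (r_i @ zm) * (zk @ r_i) - lam_i @ (zm * zk)

# log of subject i's contribution to L0; log y! computed via lgamma(y + 1).
log_yfact = sum(lgamma(v + 1.0) for v in y_i)
log_L0 = float(np.sum(y_i * np.log(lam_i) - lam_i)) - log_yfact
```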


Copyright information

© 2008 Springer Science+Business Media, LLC

About this chapter

Cai, B., Dunson, D.B. (2008). Bayesian Variable Selection in Generalized Linear Mixed Models. In: Dunson, D.B. (eds) Random Effect and Latent Variable Model Selection. Lecture Notes in Statistics, vol 192. Springer, New York, NY. https://doi.org/10.1007/978-0-387-76721-5_4