Combining models in longitudinal data analysis

Annals of the Institute of Statistical Mathematics

Abstract

Model selection uncertainty in longitudinal data analysis is often much more serious than in simpler regression settings. When this uncertainty is high, the validity of conclusions drawn from a single selected model is in doubt. We advocate the use of appropriate model selection diagnostics to formally assess the degree of uncertainty in variable/model selection as well as in estimating a quantity of interest. We propose a model combining method and examine its theoretical properties. Simulations and real data examples demonstrate its advantage over popular model selection methods.

Author information

Corresponding author

Correspondence to Yuhong Yang.

About this article

Cite this article

Liu, S., Yang, Y. Combining models in longitudinal data analysis. Ann Inst Stat Math 64, 233–254 (2012). https://doi.org/10.1007/s10463-010-0306-5

  • DOI: https://doi.org/10.1007/s10463-010-0306-5