Abstract
Clustered competing risks data are a complex form of failure time data. Their main characteristics are the cluster structure, which implies a latent within-cluster dependence among the cluster's elements, and the multiple causes competing to be the one responsible for the occurrence of an event, the failure. To handle this kind of data, we propose a full likelihood approach based on generalized linear mixed models instead of the usual complex frailty models. We model the competing causes on the probability scale, in terms of the cumulative incidence function (CIF). A multinomial distribution is assumed for the competing causes and censoring, conditional on latent effects that follow a multivariate Gaussian distribution. The CIF is specified as the product of an instantaneous risk level function and a failure time trajectory level function. Estimation is performed with the R package Template Model Builder (TMB), a C++-based framework with efficient Laplace approximation and automatic differentiation routines. A large simulation study was performed, based on different latent structure formulations. Model fitting was challenging, and our results indicate that a latent structure in which both the risk and the failure time trajectory levels are correlated is required to reach reasonable estimates.
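The product form of the CIF can be illustrated with a small numerical sketch. The abstract does not state the functional forms, so everything below is an assumption made for illustration: a multinomial-logit risk level with "no failure" as the reference class, a probit-type trajectory level on an arctanh-transformed time scale bounded by a hypothetical maximum follow-up `delta`, and latent effects already folded into the linear predictors. The function names are hypothetical and are not part of TMB or of the authors' code.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def risk_levels(linear_predictors):
    """Assumed risk level: multinomial logit with 'no failure' as reference.

    Each entry of `linear_predictors` stands for fixed effects plus a latent
    (cluster-level) effect for one competing cause.
    """
    expd = [math.exp(v) for v in linear_predictors]
    denom = 1.0 + sum(expd)
    return [e / denom for e in expd]


def trajectory_level(t, delta, w, shift):
    """Assumed trajectory level: probit of a transformed time, 0 < t < delta.

    `shift` stands for covariate plus latent effects on the time scale.
    """
    g = math.atanh(2.0 * t / delta - 1.0)  # maps (0, delta) onto the real line
    return normal_cdf(w * g - shift)


def cif(t, delta, risk_lp, w, shift):
    """Cause-specific CIFs as risk level times trajectory level."""
    pis = risk_levels(risk_lp)
    return [p * trajectory_level(t, delta, wk, sk)
            for p, wk, sk in zip(pis, w, shift)]


# Two competing causes, all effects set to zero, evaluated at mid follow-up.
values = cif(t=5.0, delta=10.0, risk_lp=[0.0, 0.0], w=[1.0, 1.0], shift=[0.0, 0.0])
print(values)
```

Under these assumptions the risk level caps each curve (here at 1/3 per cause) while the trajectory level only controls how quickly that cap is approached, which is why the two components can carry separate, possibly correlated, latent effects.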
Additional information
Communicated by Enea G. Bongiorno.
About this article
Cite this article
Laureano, H.A., Petterle, R.R., Silva, G.P.d. et al. A multinomial generalized linear mixed model for clustered competing risks data. Comput Stat 39, 1417–1434 (2024). https://doi.org/10.1007/s00180-023-01353-5
DOI: https://doi.org/10.1007/s00180-023-01353-5