Abstract
Missing values in predictors are a common problem in survival analysis. In this paper, we review estimation methods for accelerated failure time models with missing predictors, and apply a new method called subsample ignorable likelihood (IL), proposed by Little and Zhang (J R Stat Soc 60:591–605, 2011), to this class of models. The approach applies a likelihood-based method to a subsample of observations that are complete on a subset of the covariates, with the subset chosen based on assumptions about the missing data mechanism. We give conditions on the missing data mechanism under which the subsample IL method is consistent, while both complete-case analysis and ignorable maximum likelihood are inconsistent. We illustrate the properties of the proposed method by simulation and apply the method to a real dataset.
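As a rough illustration of the subsample-selection step only (not the authors' implementation), the sketch below contrasts the rows retained by complete-case analysis with the rows retained by a subsample that is complete on a chosen subset of the covariates. The covariate names `w` and `x`, the records, and the helper `complete_on` are all hypothetical. In practice, the choice of covariates that define the subsample would follow from assumptions about the missing data mechanism, and a likelihood-based method would then be fitted to the retained rows.

```python
# Hypothetical records: log survival time, event indicator, two covariates.
# None marks a missing covariate value.
records = [
    {"log_time": 2.1, "event": 1, "w": 0.5, "x": 1.2},
    {"log_time": 1.7, "event": 0, "w": None, "x": 0.3},
    {"log_time": 3.0, "event": 1, "w": 1.1, "x": None},
    {"log_time": 2.4, "event": 1, "w": None, "x": None},
    {"log_time": 1.9, "event": 0, "w": 0.2, "x": 0.8},
]


def complete_on(rows, covariates):
    """Keep the rows that have no missing value among the named covariates."""
    return [r for r in rows if all(r[c] is not None for c in covariates)]


# Complete-case analysis keeps rows observed on every covariate.
complete_cases = complete_on(records, ["w", "x"])

# The subsample keeps rows observed on the chosen subset (here, assumed to be "w");
# "x" may still be missing in these rows and is left to the likelihood-based step.
subsample = complete_on(records, ["w"])

print(len(complete_cases), len(subsample))
```

The subsample is never smaller than the complete-case set, since it drops rows only for missingness in the chosen covariates.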
References
Antonovsky A (1967) Social class, life expectancy, and overall mortality. Milbank Meml Fund Q 45:31–73
Bedrick EJ, Christensen R, Johnson WO (2000) Bayesian accelerated failure time analysis with application to veterinary epidemiology. Stat Med 19:221–237
Black D, Morris JN, Smith C, Townsend P (1982) Inequalities in health: the black report. Penguin, Middlesex
Buckley J, James I (1979) Linear regression with censored data. Biometrika 66:429–436
Cho M, Schenker N (1999) Fitting the log-F accelerated failure time model with incomplete covariate data. Biometrics 55:826–833
David M, Little RJA, Samuhel ME, Triest RK (1986) Alternative methods for CPS income imputation. J Am Stat Assoc 81:29–41
Giorgi R, Belot A, Gaudart J, Launoy G (2008) The performance of multiple imputation for missing covariate data within the context of regression relative survival analysis. Stat Med 27:6310–6331
Haan M, Kaplan GA, Camacho T (1987) Poverty and health: prospective evidence from the Alameda County study. Am J Epidemiol 125:989–998
Heitjan DF, Rubin DB (1991) Ignorability and coarse data. Ann Stat 19:2244–2253
Hemming K, Hutton JL (2010) Bayesian sensitivity models for missing covariates in the analysis of survival data. J Eval Clin Pract 18:238–246
Herring AH, Ibrahim JG, Lipsitz SR (2002) Maximum likelihood estimation in random effects cure rate models with nonignorable missing covariates. Biostatistics 3:387–405
Jin Z, Lin DY, Ying Z (2006) On least-squares regression with censored data. Biometrika 93:147–161
Lillard L, Smith JP, Welch F (1986) What do we really know about wages: the importance of nonreporting and census imputation. J Polit Econ 94:489–506
Link BG, Phelan J (1995) Social conditions as fundamental causes of disease. J Health Soc Behav 32:80–94
Lipsitz SR, Ibrahim JG (1996a) A conditional model for incomplete covariates in parametric regression models. Biometrika 83:916–922
Lipsitz SR, Ibrahim JG (1996b) Using the EM-algorithm for survival data with incomplete categorical covariates. Lifetime Data Anal 2:5–14
Little RJA, Rubin DB (2002) Statistical analysis with missing data, 2nd edn. Wiley, Hoboken
Little RJA, Zhang N (2011) Subsample ignorable likelihood for regression analysis with missing data. J R Stat Soc 60:591–605
Miller R, Halpern J (1982) Regression with censored data. Biometrika 69:521–531
Meng X, Schenker N (1999) Maximum likelihood estimation for linear regression models with right censored outcome and missing predictors. Comput Stat Data Anal 29:471–483
Rubin DB (1976) Inference and missing data. Biometrika 63:581–592
Sorlie PD, Backlund E, Keller JB (1995) US mortality by economic, demographic, and social characteristics: the National Longitudinal Mortality Study. Am J Public Health 85:949–956
White IR, Royston P (2009) Imputing missing covariate values for the Cox model. Stat Med 28:1982–1998
Yan T, Curtin R, Jans M (2010) Trends in income nonresponse over two decades. J Off Stat 26:145–164
Acknowledgments
This paper uses data supplied by the National Heart, Lung, and Blood Institute, NIH, DHHS from the National Longitudinal Mortality Study. The views expressed in this paper are those of the authors and do not necessarily reflect the views of the National Heart, Lung, and Blood Institute, the Bureau of the Census, or the National Center for Health Statistics.
Cite this article
Zhang, N., Little, R.J. Subsample ignorable likelihood for accelerated failure time models with missing predictors. Lifetime Data Anal 21, 457–469 (2015). https://doi.org/10.1007/s10985-014-9304-x
DOI: https://doi.org/10.1007/s10985-014-9304-x