Abstract
Multiply robust inference has recently attracted much attention in the context of missing response data. An estimation procedure is multiply robust if it can incorporate information from multiple candidate models while the resulting estimator remains consistent as long as one of the candidate models is correctly specified. This property is appealing because it gives the user a flexible modeling strategy with better protection against model misspecification. We explore this property for regression models with a binary covariate that is missing at random. Starting from a reformulation of the celebrated augmented inverse probability weighted estimating equation, we propose a novel combination of least squares and empirical likelihood to handle separately the two types of multiple candidate models, one for the missing variable regression and the other for the missingness mechanism. Owing to this separation, all the working models are fused concisely and effectively. The asymptotic normality of our estimator is established through the theory of estimating functions with plugged-in nuisance parameter estimates. The finite-sample performance of our procedure is illustrated through simulation studies and the analysis of a dementia data set collected by the National Alzheimer’s Coordinating Center.
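For orientation, the augmented inverse probability weighted (AIPW) estimating equation mentioned in the abstract can be sketched in its standard textbook form for a missing binary covariate. The notation is an assumption introduced here for illustration and is not taken from the abstract: Y is the response, X the binary covariate subject to missingness, Z the fully observed covariates, R the indicator that X is observed, U the full-data estimating function for the regression parameter β, π(Y, Z) a working model for P(R = 1 | Y, Z), and p(Y, Z) a working model for P(X = 1 | Y, Z). This is the generic form only, not the reformulation proposed in the paper.

```latex
% Standard AIPW estimating equation (generic form; notation assumed, not the paper's).
\sum_{i=1}^{n} \left[ \frac{R_i}{\pi(Y_i, Z_i)}\, U(\beta; Y_i, X_i, Z_i)
  - \frac{R_i - \pi(Y_i, Z_i)}{\pi(Y_i, Z_i)}\,
    E\{ U(\beta; Y_i, X_i, Z_i) \mid Y_i, Z_i \} \right] = 0,

% Because X is binary, the conditional expectation is a two-term sum.
E\{ U(\beta; Y, X, Z) \mid Y, Z \}
  = p(Y, Z)\, U(\beta; Y, 1, Z) + \{ 1 - p(Y, Z) \}\, U(\beta; Y, 0, Z).
```

Under missingness at random, the left-hand side has mean zero at the true β if either π or p is correctly specified, which is the double robustness that multiply robust procedures extend to several candidate models of each type.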
Cite this article
Duan, X., Wang, Z. A fusion of least squares and empirical likelihood for regression models with a missing binary covariate. Sci. China Math. 59, 2027–2036 (2016). https://doi.org/10.1007/s11425-016-5156-z
DOI: https://doi.org/10.1007/s11425-016-5156-z