
A fusion of least squares and empirical likelihood for regression models with a missing binary covariate

Science China Mathematics

Abstract

Multiply robust inference has recently attracted much attention in the context of missing response data. An estimation procedure is multiply robust if it can incorporate information from multiple candidate models and the resulting estimator is consistent as long as one of the candidate models is correctly specified. This property is appealing because it gives the user a flexible modeling strategy with better protection against model misspecification. We explore this property for regression models with a binary covariate that is missing at random. Starting from a reformulation of the celebrated augmented inverse probability weighted estimating equation, we propose a novel combination of least squares and empirical likelihood that separately handles each of the two types of multiple candidate models, one for the missing variable regression and the other for the missingness mechanism. Because of this separation, all the working models are fused concisely and effectively. The asymptotic normality of our estimator is established through the theory of estimating functions with plugged-in nuisance parameter estimates. The finite-sample performance of our procedure is illustrated through simulation studies and the analysis of a dementia data set collected by the National Alzheimer’s Coordinating Center.
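
For orientation, the augmented inverse probability weighted (AIPW) estimating equation mentioned in the abstract can be written in its standard textbook form for a binary covariate that is missing at random. This is only a sketch under assumed notation: the symbols Y, X, Z, R, U, \pi, p and m below are illustrative choices, not taken from the article, and the article's own reformulation is not reproduced here.

```latex
% Assumed notation (illustrative, not the article's):
%   Y = response, X = always-observed covariates,
%   Z \in \{0,1\} = binary covariate that may be missing,
%   R = 1 if Z is observed and 0 otherwise,
%   U(Y, X, Z; \beta) = full-data estimating function for the regression,
%   \pi(Y, X) = P(R = 1 \mid Y, X)   (missingness mechanism),
%   p(Y, X)   = P(Z = 1 \mid Y, X)   (missing variable regression).
\[
  \sum_{i=1}^{n} \left[
    \frac{R_i}{\pi(Y_i, X_i)} \, U(Y_i, X_i, Z_i; \beta)
    - \frac{R_i - \pi(Y_i, X_i)}{\pi(Y_i, X_i)} \, m(Y_i, X_i; \beta)
  \right] = 0,
\]
% Because Z is binary, the conditional mean of U given (Y, X)
% reduces to a two-term sum:
\[
  m(Y, X; \beta)
  = p(Y, X) \, U(Y, X, 1; \beta)
  + \{1 - p(Y, X)\} \, U(Y, X, 0; \beta).
\]
```

In this standard form the solution is consistent if either \pi or p is correctly specified (double robustness). Multiple robustness, as described in the abstract, extends this protection to several candidate models of each type.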

Author information

Corresponding author

Correspondence to XiaoGang Duan.

Cite this article

Duan, X., Wang, Z. A fusion of least squares and empirical likelihood for regression models with a missing binary covariate. Sci. China Math. 59, 2027–2036 (2016). https://doi.org/10.1007/s11425-016-5156-z

  • DOI: https://doi.org/10.1007/s11425-016-5156-z
