Original Article

Analysis of Incomplete Data Using Inverse Probability Weighting and Doubly Robust Estimators

Stijn Vansteelandt

Ghent University, Belgium

James Carpenter

London School of Hygiene and Tropical Medicine, UK

Michael G. Kenward

London School of Hygiene and Tropical Medicine, UK

Published Online: January 20, 2010

https://doi.org/10.1027/1614-2241/a000005

Abstract

This article reviews inverse probability weighting methods and doubly robust estimation methods for the analysis of incomplete data sets. We first consider methods for estimating a population mean when the outcome is missing at random, in the sense that measured covariates can explain whether or not the outcome is observed. We then sketch the rationale of these methods and elaborate on their usefulness in the presence of influential inverse weights. We finally outline how to apply these methods in a variety of settings, such as for fitting regression models with incomplete outcomes or covariates, emphasizing the use of standard software programs.
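As a rough illustration of the first setting mentioned in the abstract (a population mean with an outcome that is missing at random given measured covariates), the sketch below contrasts a complete-case mean, an inverse probability weighted (IPW) mean, and a doubly robust (augmented IPW) mean. It is not taken from the article: the simulated data, the function names (`simulate`, `fit_logistic`, `estimate_means`), the single covariate, and the choice of a logistic response model with a linear outcome model are all assumptions made for the example, and only NumPy is used rather than any particular statistical package.

```python
import numpy as np


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))


def simulate(n, seed):
    """Hypothetical data: y depends on x, and so does the chance y is observed."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 2.0 + 1.5 * x + rng.normal(size=n)   # true population mean of y is 2
    r = rng.binomial(1, expit(0.5 + x))      # r = 1 if y is observed (MAR given x)
    y = np.where(r == 1, y, np.nan)          # mask the unobserved outcomes
    return x, y, r


def fit_logistic(X, r, n_iter=25):
    """Logistic regression of r on X by Newton-Raphson; returns coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        grad = X.T @ (r - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def estimate_means(x, y, r):
    """Complete-case, IPW, and doubly robust estimates of the mean of y."""
    X = np.column_stack([np.ones_like(x), x])
    y0 = np.where(r == 1, y, 0.0)            # placeholder 0 where y is missing

    # Response (propensity) model: P(r = 1 | x), assumed logistic in x.
    pi_hat = expit(X @ fit_logistic(X, r))

    # Outcome model: E(y | x), assumed linear in x, fitted on complete cases.
    gamma, *_ = np.linalg.lstsq(X[r == 1], y[r == 1], rcond=None)
    m_hat = X @ gamma

    return {
        "complete_case": y[r == 1].mean(),
        # Horvitz-Thompson form: weight each observed y by 1 / pi_hat.
        "ipw": np.mean(r * y0 / pi_hat),
        # Augmented IPW: outcome-model prediction plus weighted residual.
        "dr": np.mean(m_hat + r * (y0 - m_hat) / pi_hat),
    }


if __name__ == "__main__":
    x, y, r = simulate(20000, seed=0)
    for name, value in estimate_means(x, y, r).items():
        print(f"{name}: {value:.3f}")
```

In this simulated setting both working models are correctly specified, so the IPW and doubly robust estimates should both land near the true mean, while the complete-case mean is pulled upward because subjects with larger `x` are more likely to respond. The doubly robust form stays consistent if only one of the two working models is correct, which is the property its name refers to.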

Volume 6, Issue 1, January 2010

ISSN: 1614-1881, eISSN: 1614-2241

Acknowledgments:

The authors are grateful to the guest editor and two referees for thorough and very helpful comments. They acknowledge support from IAP research network Grant No. P06/03 from the Belgian government (Belgian Science Policy). James Carpenter and Mike Kenward are partially supported by a grant from the Medical Research Council, UK, G0600599.
