Abstract
Mendelian randomization (MR) is a class of instrumental variable methods that use genetic variants as instruments. It has become popular in epidemiological studies as a way to account for unmeasured confounders when estimating the effect of an exposure on an outcome. The validity of MR depends on three critical assumptions, which are difficult to verify. Sensitivity analysis methods are therefore needed to evaluate results and draw plausible conclusions. We propose a general and easy-to-apply approach to sensitivity analysis for MR studies. Bound et al. (J Am Stat Assoc 90:443–450. 10.2307/2291055, 1995) derived a formula for the asymptotic bias of the instrumental variable estimator. Building on their work, we derive a new sensitivity analysis formula. The parameters in the formula include sensitivity parameters such as the correlation between the instruments and the unmeasured confounder, the direct effect of the instruments on the outcome, and the strength of the instruments. In our simulation studies, we examine the approach in various scenarios using either individual SNPs or an unweighted allele score as instruments. Using a previously published dataset from a bone mineral density study, we demonstrate that the proposed method is a useful tool for MR studies, and that investigators can combine their domain knowledge with it to obtain bias-corrected results and reach informed conclusions about the scientific plausibility of their findings.
References
Auerbach J et al (2018) Causal modeling in a multi-omic setting: insights from GAW20. BMC Genet 19:74. https://doi.org/10.1186/s12863-018-0645-4
Basmann RL (1957) A generalized classical method of linear estimation of coefficients in a structural equation. Econometrica 25:77–83
Bauchet M et al (2007) Measuring European population stratification with microarray genotype data. Am J Hum Genet 80:948–956. https://doi.org/10.1086/513477
Bound J, Jaeger D, Baker R (1995) Problems with instrumental variables estimation when the correlation between the instruments and the endogenous explanatory variable is weak. J Am Stat Assoc 90:443–450. https://doi.org/10.2307/2291055
Burgess S, Bowden J, Fall T, Ingelsson E, Thompson SG (2017) Sensitivity analyses for robust causal inference from mendelian randomization analyses with multiple genetic variants. Epidemiology 28:30–42. https://doi.org/10.1097/EDE.0000000000000559
Burgess S, Thompson SG (2014) Mendelian randomization: methods for using genetic variants in causal estimation. Chapman & Hall/CRC interdisciplinary statistics series. Chapman & Hall/CRC, Boca Raton
Burgess S, Thompson SG (2015) Mendelian randomization: methods for using genetic variants in causal estimation. Chapman & Hall/CRC interdisciplinary statistics series. CRC Press, Taylor & Francis Group, Boca Raton
Chao J, Swanson NR (2007) Alternative approximations of the bias and MSE of the IV estimator under weak identification with an application to bias correction. J Econom 137:515–555. https://doi.org/10.1016/j.jeconom.2005.09.002
Conley TG, Hansen CB, Rossi PE (2012) Plausibly exogenous. Rev Econ Stat 94:260–272. https://doi.org/10.1162/REST_a_00139
Cornfield J, Haenszel W, Hammond EC, Lilienfeld AM, Shimkin MB, Wynder EL (1959) Smoking and lung cancer: recent evidence and a discussion of some questions. J Natl Cancer Inst 22:173–203
Davey Smith G, Ebrahim S (2005) What can mendelian randomisation tell us about modifiable behavioural and environmental exposures? BMJ 330:1076–1079. https://doi.org/10.1136/bmj.330.7499.1076
Harding DJ (2003) Counterfactual models of neighborhood effects: the effect of neighborhood poverty on dropping out and teenage pregnancy. Am J Sociol 109:676–719. https://doi.org/10.1086/379217
Dimitri P (2018) Fat and bone in children—where are we now? Ann Pediatr Endocrinol Metab 23:62–69. https://doi.org/10.6065/apem.2018.23.2.62
Gastwirth JL, Krieger AM, Rosenbaum PR (1998) Dual and simultaneous sensitivity analysis for matched pairs. Biometrika 85:907–920. https://doi.org/10.1093/biomet/85.4.907
Goh WWB, Wang W, Wong L (2017) Why batch effects matter in omics data, and how to avoid them. Trends Biotechnol 35:498–507. https://doi.org/10.1016/j.tibtech.2017.02.012
Golding J (1990) Children of the nineties. A longitudinal study of pregnancy and childhood based on the population of Avon (ALSPAC). West Engl Med J 105:80–82
Greenland S (1996) Basic methods for sensitivity analysis of biases. Int J Epidemiol 25:1107–1116
Haavelmo T (1944) The probability approach in econometrics. Econometrica 12:1–15
Hackinger S, Zeggini E (2017) Statistical methods to detect pleiotropy in human complex traits. Open Biol. https://doi.org/10.1098/rsob.170125
Katan MB (2004) Apolipoprotein E isoforms, serum cholesterol, and cancer. 1986. Int J Epidemiol 33:9. https://doi.org/10.1093/ije/dyh312
Kolesár M, Chetty R, Friedman J, Glaeser E, Imbens GW (2015) Identification and inference with many invalid instruments. J Bus Econ Stat 33:474–484. https://doi.org/10.1080/07350015.2014.978175
Leek JT et al (2010) Tackling the widespread and critical impact of batch effects in high-throughput data. Nat Rev Genet 11:733–739. https://doi.org/10.1038/nrg2825
Lin DY, Psaty BM, Kronmal RA (1998) Assessing the sensitivity of regression results to unmeasured confounders in observational studies. Biometrics 54:948–963
Listgarten J, Kadie C, Schadt EE, Heckerman D (2010) Correction for hidden confounders in the genetic analysis of gene expression. Proc Natl Acad Sci USA 107:16465–16470. https://doi.org/10.1073/pnas.1002425107
Matthew H, Jerry H, Christopher JP (2016) Finite sample bias corrected IV estimation for weak and many instruments. Adv Econom 36:245–273
Michaelson JJ, Loguercio S, Beyer A (2009) Detection and interpretation of expression quantitative trait loci (eQTL). Methods 48:265–276. https://doi.org/10.1016/j.ymeth.2009.03.004
Neuman JA, Isakov O, Shomron N (2013) Analysis of insertion-deletion from deep-sequencing data: software evaluation for optimal detection. Brief Bioinform 14:46–55. https://doi.org/10.1093/bib/bbs013
Novembre J et al (2008) Genes mirror geography within Europe. Nature 456:98–101. https://doi.org/10.1038/nature07331
Palmer TM et al (2012) Using multiple genetic variants as instrumental variables for modifiable risk factors. Stat Methods Med Res 21:223–242. https://doi.org/10.1177/0962280210394459
Rosenbaum PR (1987) Sensitivity analysis for certain permutation inferences in matched observational studies. Biometrika 74:13–26. https://doi.org/10.2307/2336017
Rosenbaum PR, Rubin DB (1983) Assessing sensitivity to an unobserved binary covariate in an observational study with binary outcome. J R Stat Soc Ser B (Methodol) 45:212–218
Seldin MF, Price AL (2008) Application of ancestry informative markers to association studies in European Americans. PLoS Genet 4:e5. https://doi.org/10.1371/journal.pgen.0040005
Sivakumaran S et al (2011) Abundant pleiotropy in human complex diseases and traits. Am J Hum Genet 89:607–618. https://doi.org/10.1016/j.ajhg.2011.10.004
Small DS (2007) Sensitivity analysis for instrumental variables regression with overidentifying restrictions. J Am Stat Assoc 102:1049–1058. https://doi.org/10.1198/016214507000000608
Smith GD, Ebrahim S (2003) ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol 32:1–22
Smith GD, Ebrahim S (2004) Mendelian randomization: prospects, potentials, and limitations. Int J Epidemiol 33:30–42. https://doi.org/10.1093/ije/dyh132
Theil H (1953a) Estimation and simultaneous correlation in complete equation systems. Central Planning Bureau. Mimeo, The Hague
Theil H (1953b) Repeated least squares applied to complete equation systems. Central Planning Bureau. Mimeo, The Hague
Theil H (1958) Economic forecasts and policy. Central Planning Bureau. Mimeo, The Hague
Timpson NJ, Sayers A, Davey-Smith G, Tobias JH (2009) How does body fat influence bone mass in childhood? A Mendelian randomization approach. J Bone Miner Res 24:522–533. https://doi.org/10.1359/jbmr.081109
Vanderweele TJ, Arah OA (2011) Bias formulas for sensitivity analysis of unmeasured confounding for general outcomes, treatments, and confounders. Epidemiology 22:42–52. https://doi.org/10.1097/EDE.0b013e3181f74493
Wang X, Jiang Y, Zhang NR, Small DS (2018) Sensitivity analysis and power for instrumental variable studies. Biometrics. https://doi.org/10.1111/biom.12873
Wosje KS, Khoury PR, Claytor RP, Copeland KA, Kalkwarf HJ, Daniels SR (2009) Adiposity and TV viewing are related to less bone accrual in young children. J Pediatr 154:79–85.e72. https://doi.org/10.1016/j.jpeds.2008.06.031
Wright PG (1928) The tariff on animal and vegetable oils. The Institute of Economics Investigations in international commercial policies, vol 26. MacMillan, New York
Wu MC, Lee S, Cai T, Li Y, Boehnke M, Lin X (2011) Rare-variant association testing for sequencing data with the sequence kernel association test. Am J Hum Genet 89:82–93. https://doi.org/10.1016/j.ajhg.2011.05.029
Wu Y et al (2018) Integrative analysis of omics summary data reveals putative mechanisms underlying complex traits. Nat Commun 9:918. https://doi.org/10.1038/s41467-018-03371-0
Zhang W, Ghosh D (2017) On the use of kernel machines for Mendelian randomization. Quant Biol 5:368–379. https://doi.org/10.1007/s40484-017-0124-3
Zhu Z et al (2016) Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat Genet 48:481–487. https://doi.org/10.1038/ng.3538
Acknowledgements
This research was supported by the National Science Foundation under Grant No. NSF ABI 1457935 and the National Institutes of Health under Grant R01 GM117946.
Ethics declarations
Conflict of interest
Weiming Zhang and Debashis Ghosh declared that they have no conflict of interest.
Appendix 1
We assume that the p genetic variants are independent. We first expand \({\sigma }_{\widehat{x},\varepsilon }\), the covariance between the fitted exposure and the error term in Eq. (5).
Hence,
When there is a single SNP (p = 1), the bias simplifies to
When multiple SNPs are used in MR, it has been suggested that a summary score be used in place of the individual SNPs to reduce finite-sample bias. In that situation, the simplified equation may be applied by treating the summary score as the single instrument.
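As a rough illustration of the single-instrument case, the Python sketch below simulates an unweighted allele score that is used as the only instrument while violating two MR assumptions: it has a direct effect on the outcome and it is correlated with the unmeasured confounder. This is not the authors' formula or simulation design. The linear data-generating model, the function name `simulate_wald_bias`, and the parameters `gamma` (instrument strength), `delta` (direct effect), `tau` (instrument-confounder correlation), `alpha_x` and `alpha_y` (confounder effects) are all assumptions made for this example. Under that assumed model, the ratio estimator cov(Z, Y)/cov(Z, X) converges to beta plus (delta·var(Z) + alpha_y·cov(Z, U)) / (gamma·var(Z) + alpha_x·cov(Z, U)), and the code compares the simulated estimate with that limit.

```python
import numpy as np


def simulate_wald_bias(n=200_000, p=10, maf=0.3, beta=0.5, gamma=0.2,
                       delta=0.05, tau=0.1, alpha_x=1.0, alpha_y=1.0, seed=0):
    """Return (ratio estimate, large-sample limit) under an assumed linear model.

    Assumed model (illustrative only, not taken from the paper):
        Z = sum of p independent SNPs, each Binomial(2, maf)
        U = tau * standardized(Z) + sqrt(1 - tau^2) * noise
        X = gamma * Z + alpha_x * U + noise
        Y = beta * X + delta * Z + alpha_y * U + noise
    """
    rng = np.random.default_rng(seed)

    # Independent SNPs coded 0/1/2, summed into an unweighted allele score.
    snps = rng.binomial(2, maf, size=(n, p))
    z = snps.sum(axis=1).astype(float)

    # Population mean and variance of the score under independence.
    mean_z = 2 * p * maf
    var_z = 2 * p * maf * (1 - maf)
    z_std = (z - mean_z) / np.sqrt(var_z)

    # Unmeasured confounder with correlation tau to the score.
    u = tau * z_std + np.sqrt(1 - tau**2) * rng.standard_normal(n)

    # Exposure and outcome; delta is the direct (pleiotropic) effect.
    x = gamma * z + alpha_x * u + rng.standard_normal(n)
    y = beta * x + delta * z + alpha_y * u + rng.standard_normal(n)

    # Single-instrument IV (ratio) estimate.
    estimate = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

    # Large-sample limit implied by the assumed model.
    cov_zu = tau * np.sqrt(var_z)
    limit = beta + (delta * var_z + alpha_y * cov_zu) / (gamma * var_z + alpha_x * cov_zu)
    return estimate, limit


if __name__ == "__main__":
    est, lim = simulate_wald_bias()
    print(f"estimate = {est:.3f}, limit = {lim:.3f}, true effect = 0.5")
```

Setting `delta=0` and `tau=0` gives a valid instrument in this toy model, so the limit reduces to `beta`. Varying the two parameters over a plausible grid mimics, in a simplified way, how a sensitivity analysis explores the impact of assumption violations.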
Cite this article
Zhang, W., Ghosh, D. A General Approach to Sensitivity Analysis for Mendelian Randomization. Stat Biosci 13, 34–55 (2021). https://doi.org/10.1007/s12561-020-09280-5
DOI: https://doi.org/10.1007/s12561-020-09280-5