
Abstract

Unmeasured confounding is one of the main sources of bias in observational studies. A popular way to reduce confounding bias is to use sibling comparisons, which implicitly adjust for several factors in the early environment or upbringing without requiring them to be measured or known. In this article we provide a broad exposition of the statistical analysis methods for sibling comparison studies. We further discuss a number of methodological challenges that arise in sibling comparison studies.
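The way a sibling comparison implicitly adjusts for shared factors can be illustrated with a small simulation. The sketch below is not taken from the article; the data-generating model, the effect sizes, and all function names are assumptions chosen for illustration. Each family has an unmeasured factor shared by both siblings that raises both the exposure and the outcome. A naive regression across all individuals absorbs that factor into the exposure estimate, whereas subtracting the family mean from each sibling's exposure and outcome (a within-family, or fixed-effects, comparison) cancels it.

```python
import random


def simulate_sibling_pairs(n_families=5000, true_effect=1.0, seed=0):
    """Simulate sibling pairs with an unmeasured confounder shared within family.

    Hypothetical model (an assumption for illustration):
        x = u + noise
        y = true_effect * x + 2 * u + noise
    where u is the family-level confounder.
    """
    rng = random.Random(seed)
    rows = []
    for family in range(n_families):
        u = rng.gauss(0, 1)  # shared by both siblings, never observed
        for _ in range(2):
            x = u + rng.gauss(0, 1)
            y = true_effect * x + 2.0 * u + rng.gauss(0, 1)
            rows.append((family, x, y))
    return rows


def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def naive_estimate(rows):
    """Regress outcome on exposure across all individuals, ignoring family."""
    return ols_slope([x for _, x, _ in rows], [y for _, _, y in rows])


def within_sibling_estimate(rows):
    """Regress family-demeaned outcome on family-demeaned exposure."""
    by_family = {}
    for family, x, y in rows:
        by_family.setdefault(family, []).append((x, y))
    xs, ys = [], []
    for members in by_family.values():
        mean_x = sum(x for x, _ in members) / len(members)
        mean_y = sum(y for _, y in members) / len(members)
        for x, y in members:
            # Anything constant within the family, including u, cancels here.
            xs.append(x - mean_x)
            ys.append(y - mean_y)
    return ols_slope(xs, ys)


if __name__ == "__main__":
    rows = simulate_sibling_pairs()
    # Under this setup the naive slope targets 2.0 rather than the true 1.0.
    print("naive estimate:         ", round(naive_estimate(rows), 3))
    print("within-sibling estimate:", round(within_sibling_estimate(rows), 3))
```

In this simulated setting the within-sibling estimate lands close to the true effect because the only confounder is fully shared. Confounders that differ between siblings would not cancel in the same way.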

/content/journals/10.1146/annurev-statistics-040120-024521
2022-03-07
2024-04-28

  • Article Type: Review Article