A general framework for identifying oligogenic combinations of rare variants in complex disorders

  Santhosh Girirajan (1, 2, 3)

  1. Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, Pennsylvania 16802, USA
  2. Bioinformatics and Genomics Graduate Program, Huck Institutes of the Life Sciences, University Park, Pennsylvania 16802, USA
  3. Department of Anthropology, Pennsylvania State University, University Park, Pennsylvania 16802, USA
  • Corresponding author: sxg47{at}psu.edu
  • Abstract

    Genetic studies of complex disorders such as autism and intellectual disability (ID) are often based on enrichment of individual rare variants or their aggregate burden in affected individuals compared to controls. However, these studies overlook the influence of combinations of rare variants that may not be deleterious on their own. Two statistical challenges cause this gap: the rarity of the variants and the combinatorial explosion that arises when enumerating variant combinations. Together they limit our ability to study the oligogenic basis of these disorders. Here, we present RareComb, a framework that combines the Apriori algorithm and statistical inference to identify specific combinations of mutated genes associated with complex phenotypes. RareComb overcomes computational barriers and exhaustively evaluates variant combinations to identify nonadditive relationships between simultaneously mutated genes. Using RareComb, we analyzed 6189 individuals with autism and identified 718 combinations significantly associated with ID; carriers of these combinations showed lower IQ than expected in an independent cohort of 1878 individuals. These combinations were enriched for nervous system genes such as NIN and NGF, showed complex inheritance patterns, and were depleted in unaffected siblings. We found that an affected individual can carry many oligogenic combinations, each contributing to the same phenotype or to distinct phenotypes with varying effect sizes. We also used this framework to identify combinations associated with multiple comorbid phenotypes, including mutations of COL28A1 and MFSD2B for ID and schizophrenia, and of ABCA4, DNAH10, and MC1R for ID and anxiety/depression. Our framework identifies a key component of missing heritability and provides a novel paradigm to untangle the genetic architecture of complex disorders.
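
The general idea of pairing Apriori-style pruning with a statistical test can be illustrated with a small sketch. This is not RareComb's implementation, which is not shown here. The function names (`frequent_pairs`, `binomial_upper_tail`, `score_pairs`) and the input layout are hypothetical, and the one-sided binomial test against an independence expectation is an assumed stand-in for the "statistical inference" step. The case-versus-control comparison and multiple-testing correction are omitted.

```python
from itertools import combinations
from math import comb


def frequent_pairs(carriers, min_count):
    """Apriori-style search for gene pairs mutated together in >= min_count samples.

    carriers: dict mapping sample ID -> set of mutated genes (hypothetical layout).
    A pair can only reach min_count if each of its genes does, so genes below
    the threshold are pruned before any pairs are enumerated.
    """
    gene_counts = {}
    for genes in carriers.values():
        for gene in genes:
            gene_counts[gene] = gene_counts.get(gene, 0) + 1
    frequent = {g for g, c in gene_counts.items() if c >= min_count}

    pair_counts = {}
    for genes in carriers.values():
        for pair in combinations(sorted(genes & frequent), 2):
            pair_counts[pair] = pair_counts.get(pair, 0) + 1
    kept = {p: c for p, c in pair_counts.items() if c >= min_count}
    return gene_counts, kept


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def score_pairs(carriers, min_count=2):
    """Return {pair: p-value} for co-occurrence above the independence expectation.

    Assumption: under independence, the probability that a sample carries
    mutations in both genes is the product of the two carrier frequencies.
    """
    n = len(carriers)
    gene_counts, pairs = frequent_pairs(carriers, min_count)
    p_values = {}
    for (a, b), observed in pairs.items():
        expected_p = (gene_counts[a] / n) * (gene_counts[b] / n)
        p_values[(a, b)] = binomial_upper_tail(observed, n, expected_p)
    return p_values
```

Called on a dictionary of per-sample gene sets, `score_pairs` returns a p-value for each gene pair that survives pruning. Extending the sketch to larger combinations would repeat the same prune-then-count step at each combination size.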

    Footnotes

    • Received November 1, 2021.
    • Accepted March 15, 2022.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
