Characterizing the targets of transcription regulators by aggregating ChIP-seq and perturbation expression data sets

  • Paul Pavlidis (affiliations 1, 2)
  1. Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada;
  2. Department of Psychiatry, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada;
  3. Graduate Program in Bioinformatics, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
  • Corresponding author: paul{at}msl.ubc.ca
    Abstract

    Mapping the gene targets of chromatin-associated transcription regulators (TRs) is a major goal of genomics research. ChIP-seq of TRs and experiments that perturb a TR and measure the differential abundance of gene transcripts are a primary means by which direct relationships are tested on a genomic scale. Evidence from different gene regulation strategies has been reported to overlap poorly, emphasizing the need for integrating results from multiple experiments. Although research consortia interested in gene regulation have produced a valuable trove of high-quality data, there is an even greater volume of TR-specific data throughout the literature. In this study, we present a workflow for the identification, uniform processing, and aggregation of ChIP-seq and TR perturbation experiments for the ultimate purpose of ranking human and mouse TR–target interactions. Focusing on an initial set of eight regulators (ASCL1, HES1, MECP2, MEF2C, NEUROD1, PAX6, RUNX1, and TCF4), we identified 497 experiments suitable for analysis. We used this corpus to examine data concordance, to identify systematic patterns of the two data types, and to identify putative orthologous interactions between human and mouse. We build upon commonly used strategies to put forward a procedure for aggregating and combining these two genomic methodologies, assessing these rankings against independent literature-curated evidence. Beyond a framework extensible to other TRs, our work also provides empirically ranked TR–target listings, as well as transparent experiment-level gene summaries for community use.

    Footnotes

    • Received August 31, 2022.
    • Accepted April 26, 2023.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
