Computing prokaryotic gene ubiquity: Rescuing the core from extinction

Robert L. Charlebois and W. Ford Doolittle (1)
Genome Atlantic, and Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada B3H 1X5

Abstract

The genomic core concept has found several uses in comparative and evolutionary genomics. Defined as the set of all genes common to (ubiquitous among) all genomes in a phylogenetically coherent group, the core shrinks as the number and phylogenetic diversity of the genomes in that group increase. Here, we focus on methods for defining the size and composition of the core of all genes shared by sequenced genomes of prokaryotes (Bacteria and Archaea). There are few (almost certainly fewer than 50) genes shared by all of the 147 genomes compared, surely too few to conduct all essential functions. Sequencing and annotation errors are responsible for the apparent absence of some genes, while very limited but genuine disappearances (from just one or a few genomes) can account for several others. Core size will continue to decrease as more genome sequences appear, unless the requirement for ubiquity is relaxed. Such relaxation seems consistent with any reasonable biological purpose for seeking a core, but it makes the core harder to define. We propose an alternative approach (the phylogenetically balanced core), which preserves some of the biological utility of the core concept. Cores, however delimited, preferentially contain informational rather than operational genes; we present a new hypothesis for why this might be so.
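
As an illustration of the definitions above, the strict core can be thought of as the intersection of the gene sets of all genomes, and a relaxed core as the set of genes present in at least some chosen fraction of them. The following is a minimal sketch under stated assumptions, not the authors' method: the function names, the `min_fraction` parameter, and the toy gene-family labels are all hypothetical, and the sketch does not attempt the phylogenetically balanced core, whose details are not given in the abstract.

```python
from collections import Counter


def strict_core(genomes):
    """Return the genes present in every genome (strict ubiquity)."""
    gene_sets = [set(genes) for genes in genomes.values()]
    return set.intersection(*gene_sets) if gene_sets else set()


def relaxed_core(genomes, min_fraction):
    """Return the genes present in at least min_fraction of the genomes.

    min_fraction is a hypothetical threshold between 0 and 1; a value of
    1.0 reproduces the strict core.
    """
    counts = Counter(gene for genes in genomes.values() for gene in set(genes))
    n_genomes = len(genomes)
    return {gene for gene, c in counts.items() if c >= min_fraction * n_genomes}


# Toy, made-up data: each genome maps to the gene families detected in it.
genomes = {
    "genome_A": {"fam1", "fam2", "fam3", "fam4"},
    "genome_B": {"fam1", "fam2", "fam3"},
    "genome_C": {"fam1", "fam2", "fam4"},
    "genome_D": {"fam1", "fam3", "fam4"},
}

# The strict core shrinks as genomes are added one at a time.
names = list(genomes)
core_sizes = [
    len(strict_core({name: genomes[name] for name in names[:k]}))
    for k in range(1, len(names) + 1)
]

print(core_sizes)
print(sorted(strict_core(genomes)))
print(sorted(relaxed_core(genomes, 0.75)))
```

In this toy data set, a single missing gene in one genome (whether a genuine loss or an annotation error) is enough to remove that gene from the strict core, whereas the relaxed threshold retains it. The price is that the threshold itself becomes an arbitrary choice.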

Footnotes

  • Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.3024704.

  • 1 Corresponding author. E-mail Ford{at}dal.ca; fax (902) 494-1355.

    • Accepted October 7, 2004.
    • Received July 20, 2004.