The Complete Human Olfactory Subgenome

Gustavo Glusman (1,4), Itai Yanai (1,2), Irit Rubin (3), and Doron Lancet (1,5)

  1. Department of Molecular Genetics and the Crown Human Genome Center, The Weizmann Institute of Science, Rehovot 76100, Israel
  2. Bioinformatics Graduate Program, Boston University, Boston, Massachusetts 02215, USA
  3. Department of Biological Regulation, The Weizmann Institute of Science, Rehovot 76100, Israel

Abstract

Olfactory receptors likely constitute the largest gene superfamily in the vertebrate genome. Here we present the nearly complete human olfactory subgenome elucidated by mining the genome draft with gene discovery algorithms. Over 900 olfactory receptor genes and pseudogenes (ORs) were identified, two-thirds of which were not annotated previously. The number of extrapolated ORs is in good agreement with previous theoretical predictions. The sequence of at least 63% of the ORs is disrupted by what appears to be a random process of pseudogene formation. ORs constitute 17 gene families, 4 of which contain more than 100 members each. “Fish-like” Class I ORs, previously considered a relic in higher tetrapods, constitute as much as 10% of the human repertoire, all in one large cluster on chromosome 11. Their lower pseudogene fraction suggests a functional significance. ORs are disposed on all human chromosomes except 20 and Y, and nearly 80% are found in clusters of 6–138 genes. A novel comparative cluster analysis was used to trace the evolutionary path that may have led to OR proliferation and diversification throughout the genome. The results of this analysis suggest the following genome expansion history: first, the generation of a “tetrapod-specific” Class II OR cluster on chromosome 11 by local duplication, then a single-step duplication of this cluster to chromosome 1, and finally an avalanche of duplication events out of chromosome 1 to most other chromosomes. The results of the data mining and characterization of ORs can be accessed at the Human Olfactory Receptor Data Exploratorium Web site (http://bioinfo.weizmann.ac.il/HORDE).

Footnotes

  • 4 Present address: The Institute for Systems Biology, 4225 Roosevelt Way NE, Seattle, WA 98105, USA.

  • 5 Corresponding author.

  • E-MAIL doron.lancet@weizmann.ac.il; FAX 972-8-9344487.

  • Article and publication are at www.genome.org/cgi/doi/10.1101/gr.171001.

    • Received November 13, 2000.
    • Accepted March 13, 2001.