  • Letter

Estimate of human gene number provided by genome-wide analysis using Tetraodon nigroviridis DNA sequence

Abstract

The number of genes in the human genome is unknown, with estimates ranging from 50,000 to 90,000 (refs 1, 2), and to more than 140,000 according to unpublished sources. We have developed ‘Exofish’, a procedure based on homology searches, to identify human genes quickly and reliably. This method relies on the sequence of another vertebrate, the pufferfish Tetraodon nigroviridis, to detect conserved sequences with a very low background. Similar to Fugu rubripes, a marine pufferfish proposed by Brenner et al. (ref. 3) as a model for genomic studies, T. nigroviridis is a more practical alternative (ref. 4) with a genome also eight times more compact than that of human. Many comparisons have been made between F. rubripes and human DNA that demonstrate the potential of comparative genomics using the pufferfish genome (ref. 5). Application of Exofish to the December version of the working draft sequence of the human genome and to Unigene showed that the human genome contains 28,000–34,000 genes, and that Unigene contains less than 40% of the protein-coding fraction of the human genome.
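The idea of using a second genome to flag conserved stretches of human DNA can be illustrated with a toy sketch. This is not the Exofish procedure, which the abstract describes only as being based on homology searches. Exact k-mer matching is a deliberately simple stand-in for a real homology search, and the function name, the parameter `k` and the example sequences are all hypothetical.

```python
# Toy illustration only: NOT the Exofish procedure. Exact k-mer sharing is
# used here as a crude stand-in for a real homology search.

def shared_kmer_regions(human: str, fish: str, k: int = 8) -> list[tuple[int, int]]:
    """Return merged [start, end) intervals of `human` covered by
    k-mers that also occur somewhere in `fish`."""
    # Index every k-mer of the (hypothetical) pufferfish sequence.
    fish_kmers = {fish[i:i + k] for i in range(len(fish) - k + 1)}

    regions: list[tuple[int, int]] = []
    for i in range(len(human) - k + 1):
        if human[i:i + k] in fish_kmers:
            if regions and i <= regions[-1][1]:
                # Overlaps or touches the previous hit: extend that interval.
                regions[-1] = (regions[-1][0], i + k)
            else:
                regions.append((i, i + k))
    return regions


# Made-up sequences that share the 8-base stretch "ACGTTGCA".
human_seq = "TTTTACGTTGCATTTT"
fish_seq = "GGGACGTTGCAGGG"
conserved = shared_kmer_regions(human_seq, fish_seq, k=6)
print(conserved, [human_seq[s:e] for s, e in conserved])
```

In this sketch only the stretch present in both sequences is reported, and the flanking sequence is ignored. A real comparison would need to tolerate mismatches and choose its thresholds so that chance matches stay rare.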

Figure 1: Construction of Exofish.
Figure 2
Figure 3: Examples of chromosome 22 results.
Figure 4: Distribution of genes and ecores on individual human chromosomes according to the EST physical map (ref. 8) and Exofish.

References

  1. Fields, C., Adams, M.D., White, O. & Venter, J.C. How many genes in the human genome? Nature Genet. 7, 345–346 (1994).

  2. Antequera, F. & Bird, A. Number of CpG islands and genes in human and mouse. Proc. Natl Acad. Sci. USA 90, 11995–11999 (1993).

  3. Brenner, S. et al. Characterization of the pufferfish (Fugu) genome as a compact model vertebrate genome. Nature 366, 265–268 (1993).

  4. Crnogorac-Jurcevic, T., Brown, J.R., Lehrach, H. & Schalkwyk, L.C. Tetraodon fluviatilis, a new puffer fish model for genome studies. Genomics 41, 177–184 (1997).

  5. Elgar, G. et al. Generation and analysis of 25 Mb of genomic DNA from the pufferfish Fugu rubripes by sequence scanning. Genome Res. 9, 960–971 (1999).

  6. Schuler, G.D. et al. A gene map of the human genome. Science 274, 540–546 (1996).

  7. Dunham, I. et al. The DNA sequence of human chromosome 22. Nature 402, 489–495 (1999).

  8. Deloukas, P. et al. A physical map of 30,000 human genes. Science 282, 744–746 (1998).

  9. Roest Crollius, H. et al. Characterization and repeat analysis of the compact genome of the freshwater pufferfish Tetraodon nigroviridis. Genome Res. (in press).

  10. Jin, L., Zhong, Y. & Chakraborty, R. The exact numbers of possible microsatellite motifs. Am. J. Hum. Genet. 55, 582–583 (1994).

  11. Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580 (1999).

  12. Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).

  13. Smith, T.F. & Waterman, M.S. Identification of common molecular subsequences. J. Mol. Biol. 147, 195–197 (1981).

  14. Glemet, E. & Codani, J. LASSAP, a large scale sequence comparisons package. Comput. Appl. Biosci. 13, 137–143 (1997).

Acknowledgements

We thank the sequencing and template preparation team at Genoscope; Sun Microsystems for access to the SUN benchmark centre; and F. Francis for critical reading of the manuscript. This work would not have been possible without the public availability of a large fraction of the sequence of the human genome, and we thank all contributing genome centres.

Author information

Corresponding author

Correspondence to Jean Weissenbach.

About this article

Cite this article

Roest Crollius, H., Jaillon, O., Bernot, A. et al. Estimate of human gene number provided by genome-wide analysis using Tetraodon nigroviridis DNA sequence. Nat Genet 25, 235–238 (2000). https://doi.org/10.1038/76118

  • DOI: https://doi.org/10.1038/76118
