Large Scale of Human Duplicate Genes Divergence

Journal of Molecular Evolution

Abstract

Proteome complexity increases during evolution mostly through gene duplication followed by divergence. In this genome-scale study of the human genome, I show that the density distribution of duplicate gene pairs along the axis of protein divergence between pair members forms two main peaks, with a small peak and a plateau before the first main peak. This pattern indicates the existence of three stages in duplicate gene evolution. The analysis of various functional parameters (gene expression level and breadth, transcription factor targets, protein interaction networks) suggests that subfunctionalization (partition of function) is the predominant mode of divergence in the first main peak, whereas neofunctionalization (acquisition of novel functions) prevails in the second main peak. The young duplicate pairs show a much higher expression level than singleton genes and more diverged duplicates, which indicates that a requirement for high gene dosage is important for the retention of duplicates just after the duplication event. Thus, the prevailing route of duplicate evolution seems to be high gene dosage–subfunctionalization–neofunctionalization. This adaptationist model suggests that an organism evolves in the direction of its most intensively used functions.

Acknowledgments

This study was supported by the Russian Foundation for Basic Research (RFBR). I thank two anonymous reviewers for valuable comments.

Author information

Corresponding author

Correspondence to Alexander E. Vinogradov.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 775 kb)

About this article

Cite this article

Vinogradov, A.E. Large Scale of Human Duplicate Genes Divergence. J Mol Evol 75, 25–33 (2012). https://doi.org/10.1007/s00239-012-9516-1

  • DOI: https://doi.org/10.1007/s00239-012-9516-1
