Abstract
Comparative genomics detects similarities and differences between extant genomes and, based on more or less formalized hypotheses about the evolutionary processes involved, infers ancestral states that explain the similarities and an evolutionary history that explains the differences. In this chapter, we focus on reconstructing the organization of ancient genomes into chromosomes. We review methodological approaches and software that have been applied to a wide range of datasets from different kingdoms of life and at different evolutionary depths. We discuss the relationship with genome assembly, and potential approaches to validating computational predictions about ancient genomes, which are almost always accessible only through these predictions.
Notes
1. See, for example, the GOLD database: https://gold.jgi.doe.gov/statistics
Acknowledgment
C.C. is funded by the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant 249834. E.T., S.B., and Y.A. are funded by the French Agence Nationale de la Recherche (ANR) through PIA Grant ANR-10-BINF-01-01 “Ancestrome”. N.L. is funded by the International DFG Research Training Group GRK 1906/1.
Copyright information
© 2018 Springer Science+Business Media LLC
Cite this protocol
Anselmetti, Y., Luhmann, N., Bérard, S., Tannier, E., Chauve, C. (2018). Comparative Methods for Reconstructing Ancient Genome Organization. In: Setubal, J., Stoye, J., Stadler, P. (eds) Comparative Genomics. Methods in Molecular Biology, vol 1704. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7463-4_13
DOI: https://doi.org/10.1007/978-1-4939-7463-4_13
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-7461-0
Online ISBN: 978-1-4939-7463-4
eBook Packages: Springer Protocols