Homology Search and Multiple Alignment

Introduction to Evolutionary Genomics

Part of the book series: Computational Biology (COBO, volume 17)

Chapter Summary

This chapter discusses how to detect evolutionary homology among nucleotide and amino acid sequences and how to analyze the resulting homologous sequences. Topics include homology search, pairwise alignment, multiple alignment, and genome-wide sequence viewing.

Author information

Corresponding author

Correspondence to Naruya Saitou.

Copyright information

© 2018 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Saitou, N. (2018). Homology Search and Multiple Alignment. In: Introduction to Evolutionary Genomics. Computational Biology, vol 17. Springer, Cham. https://doi.org/10.1007/978-3-319-92642-1_15

  • DOI: https://doi.org/10.1007/978-3-319-92642-1_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-92641-4

  • Online ISBN: 978-3-319-92642-1

  • eBook Packages: Computer Science, Computer Science (R0)
