Parametric Analysis of Alignment and Phylogenetic Uncertainty

  • Original Article
Bulletin of Mathematical Biology

Abstract

To infer a phylogenetic tree from a set of DNA sequences, a multiple alignment is typically constructed first to identify homologous bases. The inferred phylogeny can be very sensitive to how the alignment was created. We develop tools for analyzing the robustness of an inferred phylogeny to perturbations of the alignment parameters in the Needleman-Wunsch (NW) algorithm. Our main tool is parametric alignment, with novel improvements that are of general interest in parametric inference. Using parametric alignment and a Gaussian distribution on the alignment parameters, we derive probabilities of optimal alignment summaries and of inferred phylogenies. We apply our method to intronic sequences from Drosophila flies. We show that phylogeny estimates can be sensitive to the choice of alignment parameters, and that parametric alignment elucidates the relationship between alignment parameters and reconstructed trees.
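The idea of putting a Gaussian distribution on alignment parameters and asking how probable each optimal alignment summary is can be illustrated with a small pairwise sketch. The code below is not the method of the article, which works with parametric alignment rather than sampling; it is a Monte Carlo approximation under stated assumptions. The function names `nw_summary` and `summary_probabilities`, the two-parameter scoring scheme (a mismatch cost and a per-space gap cost, with matches costing zero), the default means and standard deviation, and the rejection of non-positive draws are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def nw_summary(a, b, mismatch, gap):
    """Needleman-Wunsch global alignment of a and b.

    Minimizes mismatch * (#mismatches) + gap * (#spaces), matches cost 0,
    and returns the summary (#mismatches, #spaces) of one optimal alignment.
    Ties are broken deterministically by tuple comparison.
    """
    n, m = len(a), len(b)
    # best[i][j] = (cost, mismatches, spaces) for aligning a[:i] with b[:j]
    best = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = (0.0, 0, 0)
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            candidates = []
            if i > 0 and j > 0:
                cost, x, s = best[i - 1][j - 1]
                d = 0 if a[i - 1] == b[j - 1] else 1
                candidates.append((cost + d * mismatch, x + d, s))
            if i > 0:  # a[i-1] aligned to a space
                cost, x, s = best[i - 1][j]
                candidates.append((cost + gap, x, s + 1))
            if j > 0:  # b[j-1] aligned to a space
                cost, x, s = best[i][j - 1]
                candidates.append((cost + gap, x, s + 1))
            best[i][j] = min(candidates)
    _, x, s = best[n][m]
    return x, s


def summary_probabilities(a, b, mean=(1.0, 2.0), sd=0.3,
                          n_samples=2000, seed=0):
    """Estimate P(summary is optimal) when (mismatch, gap) is drawn from
    independent Gaussians; non-positive draws are rejected (an assumption)."""
    rng = random.Random(seed)
    counts = Counter()
    accepted = 0
    while accepted < n_samples:
        mismatch = rng.gauss(mean[0], sd)
        gap = rng.gauss(mean[1], sd)
        if mismatch <= 0 or gap <= 0:
            continue
        counts[nw_summary(a, b, mismatch, gap)] += 1
        accepted += 1
    return {summary: c / n_samples for summary, c in counts.items()}


if __name__ == "__main__":
    probs = summary_probabilities("ACGTTGCA", "ACTTGGCA")
    for summary, p in sorted(probs.items()):
        print(summary, round(p, 3))
```

Each key in the returned dictionary is an alignment summary, and its value estimates how much of the parameter distribution selects that summary as optimal. A summary with probability near one suggests the alignment is robust to the parameter choice; several summaries with comparable probability suggest that downstream tree inference may be sensitive to it.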

Author information

Corresponding author

Correspondence to Peter Huggins.

About this article

Cite this article

Malaspinas, AS., Eriksson, N. & Huggins, P. Parametric Analysis of Alignment and Phylogenetic Uncertainty. Bull Math Biol 73, 795–810 (2011). https://doi.org/10.1007/s11538-010-9610-8

  • DOI: https://doi.org/10.1007/s11538-010-9610-8
