Abstract
We study the problem of constructing phylogenetic trees for a given set of species. The problem is formulated as that of finding a minimum Steiner tree on n points over the Boolean hypercube of dimension d. It is known that an optimal tree can be found in linear time [1] if the given dataset has a perfect phylogeny, i.e. the cost of the optimal phylogeny is exactly d. Moreover, if the data has a near-perfect phylogeny, i.e. the cost of the optimal Steiner tree is d + q, it is known [2] that an exact solution can be found in running time that is polynomial in the number of species and d, yet exponential in q. In this work, we give a polynomial-time algorithm (in both d and q) that finds a phylogenetic tree of cost \(d + O(q^2)\). This provides the best guarantees known, namely a (1 + o(1))-approximation, for the case \(\log(d) \ll q \ll \sqrt{d}\), broadening the range of settings for which near-optimal solutions can be efficiently found. We also discuss the motivation and reasoning for studying such additive approximations.
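To make the formulation concrete, the following is a minimal Python sketch, not the algorithm of this paper. It assumes species are given as equal-length 0/1 tuples in which every coordinate varies, so that d is a lower bound on the cost of any tree. The helper names (`varying_sites`, `passes_four_gamete_test`, `mst_cost`) are hypothetical. The sketch uses the standard four-gamete condition to test whether binary data admits a perfect phylogeny, and computes a minimum spanning tree under Hamming distance, which gives only a crude upper bound (at most twice the optimum) on the Steiner tree cost.

```python
from itertools import combinations


def varying_sites(species):
    """Count coordinates on which at least two species differ."""
    return sum(1 for column in zip(*species) if len(set(column)) > 1)


def passes_four_gamete_test(species):
    """True if no pair of coordinates shows all of 00, 01, 10 and 11."""
    columns = list(zip(*species))
    return all(len(set(zip(a, b))) < 4 for a, b in combinations(columns, 2))


def hamming(u, v):
    """Number of coordinates in which u and v differ."""
    return sum(x != y for x, y in zip(u, v))


def mst_cost(species):
    """Prim's algorithm on the complete graph with Hamming edge weights.

    The result is an upper bound on the optimal Steiner tree cost,
    since a spanning tree uses no Steiner (ancestral) nodes.
    """
    points = list(dict.fromkeys(species))  # drop duplicate species
    if not points:
        return 0
    # best[p] = cheapest edge from p to the tree built so far
    best = {p: hamming(points[0], p) for p in points[1:]}
    total = 0
    while best:
        nxt = min(best, key=best.get)
        total += best.pop(nxt)
        for p in best:
            best[p] = min(best[p], hamming(nxt, p))
    return total


# Perfect case: d = 3 varying sites and a tree of cost exactly 3 exists.
perfect = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 0, 1)]
# Imperfect case: all four gametes appear, so the cost must exceed d = 2.
imperfect = [(0, 0), (0, 1), (1, 0), (1, 1)]
```

On the `imperfect` input the spanning tree has cost 3 = d + q with q = 1, which illustrates the near-perfect setting described in the abstract.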
This work was supported in part by the National Science Foundation under grant CCF-1116892, by an NSF Graduate Fellowship, and by the MSR-CMU Center for Computational Thinking.
References
Ding, Z., Filkov, V., Gusfield, D.: A Linear-Time Algorithm for the Perfect Phylogeny Haplotyping (PPH) Problem. In: Miyano, S., Mesirov, J., Kasif, S., Istrail, S., Pevzner, P.A., Waterman, M. (eds.) RECOMB 2005. LNCS (LNBI), vol. 3500, pp. 585–600. Springer, Heidelberg (2005)
Blelloch, G.E., Dhamdhere, K., Halperin, E., Ravi, R., Schwartz, R., Sridhar, S.: Fixed Parameter Tractability of Binary Near-Perfect Phylogenetic Tree Reconstruction. In: Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds.) ICALP 2006. LNCS, vol. 4051, pp. 667–678. Springer, Heidelberg (2006)
Gusfield, D.: Algorithms on strings, trees, and sequences: computer science and computational biology. Cambridge University Press (1997)
Semple, C., Steel, M.: Phylogenetics. Oxford lecture series in mathematics and its applications. Oxford University Press (2003)
Hinds, D.A., Stuve, L.L., Nilsen, G.B., Halperin, E., Eskin, E., Ballinger, D.G., Frazer, K.A., Cox, D.R.: Whole-genome patterns of common DNA variation in three human populations. Science 307(5712), 1072–1079 (2005)
The International HapMap Project. Nature 426(6968), 789–796 (2003)
Alon, N., Chor, B., Pardi, F., Rapoport, A.: Approximate maximum parsimony and ancestral maximum likelihood. IEEE/ACM Trans. Comput. Biol. Bioinformatics 7, 183–187 (2010)
Robins, G., Zelikovsky, A.: Improved Steiner tree approximation in graphs. In: SODA, pp. 770–779. Society for Industrial and Applied Mathematics (2000)
Robins, G., Zelikovsky, A.: Improved Steiner tree approximation in graphs (2000)
Robins, G., Zelikovsky, A.: Tighter bounds for graph Steiner tree approximation. SIAM Journal on Discrete Mathematics 19, 122–134 (2005)
Misra, N., Blelloch, G., Ravi, R., Schwartz, R.: Generalized Buneman Pruning for Inferring the Most Parsimonious Multi-state Phylogeny. In: Berger, B. (ed.) RECOMB 2010. LNCS, vol. 6044, pp. 369–383. Springer, Heidelberg (2010)
Fernández-Baca, D., Lagergren, J.: A polynomial-time algorithm for near-perfect phylogeny. SIAM J. Comput. 32, 1115–1127 (2003)
Karp, R.M.: Reducibility among combinatorial problems. In: Miller, R.E., Thatcher, J.W. (eds.) Complexity of Computer Computations, pp. 85–103. Plenum, New York (1972)
Foulds, L.R., Graham, R.L.: The Steiner problem in phylogeny is NP-complete. Adv. Appl. Math. 3 (1982)
Sridhar, S., Dhamdhere, K., Blelloch, G., Halperin, E., Ravi, R., Schwartz, R.: Algorithms for efficient near-perfect phylogenetic tree reconstruction in theory and practice. IEEE/ACM Trans. Comput. Biol. Bioinformatics 4, 561–571 (2007)
Damaschke, P.: Parameterized enumeration, transversals, and imperfect phylogeny reconstruction. Theor. Comput. Sci. 351, 337–350 (2006)
Agarwala, R., Fernandez-Baca, D.: A polynomial-time algorithm for the perfect phylogeny problem when the number of character states is fixed. In: SFCS, pp. 140–147 (November 1993)
Byrka, J., Grandoni, F., Rothvoß, T., Sanità, L.: An improved LP-based approximation for Steiner tree. In: STOC. ACM (2010)
Bodlaender, H.L., Fellows, M.R., Warnow, T.: Two Strikes against Perfect Phylogeny. In: Kuich, W. (ed.) ICALP 1992. LNCS, vol. 623, pp. 273–283. Springer, Heidelberg (1992)
Takahashi, H., Matsuyama, A.: An approximate solution for the Steiner problem in graphs. Mathematica Japonica 24, 573–577 (1980)
Berman, P., Ramaiyer, V.: Improved approximations for the Steiner tree problem. In: SODA, pp. 325–334 (1992)
Prömel, H.J., Steger, A.: RNC-Approximation Algorithms for the Steiner Problem. In: Reischuk, R., Morvan, M. (eds.) STACS 1997. LNCS, vol. 1200, pp. 559–570. Springer, Heidelberg (1997)
Karpinski, M., Zelikovsky, A.: New approximation algorithms for the Steiner tree problems. Journal of Combinatorial Optimization 1, 47–65 (1995)
Zelikovsky, A.: Better approximation bounds for the network and Euclidean Steiner tree problems. Technical report (1996)
Hougardy, S., Prömel, H.J.: A 1.598 approximation algorithm for the Steiner problem in graphs. In: SODA, pp. 448–453 (1999)
Borchers, A., Du, D.Z.: The k-Steiner ratio in graphs. In: STOC, pp. 641–649. ACM (1995)
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Awasthi, P., Blum, A., Morgenstern, J., Sheffet, O. (2012). Additive Approximation for Near-Perfect Phylogeny Construction. In: Gupta, A., Jansen, K., Rolim, J., Servedio, R. (eds) Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques. APPROX RANDOM 2012. Lecture Notes in Computer Science, vol 7408. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32512-0_3
DOI: https://doi.org/10.1007/978-3-642-32512-0_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32511-3
Online ISBN: 978-3-642-32512-0
eBook Packages: Computer Science, Computer Science (R0)