Abstract
Comparing phylogenetic trees is a fundamental task in computational biology, and many metrics are available for it. In this paper, we focus on one such metric, the \(\ell^\infty\)-cophenetic metric introduced by Cardona et al. This metric represents a phylogenetic tree with n labeled leaves as a point in \(\mathbb {R}^{n(n+1)/2}\), known as the cophenetic vector, and then compares the two resulting Euclidean points using the \(\ell^\infty\) distance. Meanwhile, the interleaving distance is a formal categorical construction generalized from the definition of Chazal et al., originally introduced to compare persistence modules arising in topological data analysis. We show that the \(\ell^\infty\)-cophenetic metric is an example of an interleaving distance. To do this, we define phylogenetic trees as a category of merge trees with some additional structure, namely, labelings on the leaves plus a requirement that morphisms respect these labels. We then use the definition of a flow on this category to give an interleaving distance. Finally, we show that, because of the additional structure given by the categories defined, the map sending a labeled merge tree to its cophenetic vector is in fact an isometric embedding, thus proving that the \(\ell^\infty\)-cophenetic metric is an interleaving distance.
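As a rough illustration of the metric described in the abstract, the following Python sketch builds cophenetic vectors for two small labeled trees and compares them with the \(\ell^\infty\) distance. It assumes the convention of Cardona et al., where the entry for a pair of labels is the depth of their lowest common ancestor and the entry for a single label is the depth of its leaf. In the labeled merge tree setting, the `value` map would hold function values instead. The tree encoding (a `parent` map, a `value` map, a `leaf_of` map) and all function names are hypothetical choices made for this example, not taken from the chapter.

```python
from itertools import combinations_with_replacement


def ancestors(parent, node):
    """Return the path from node up to the root, inclusive."""
    path = [node]
    while parent[node] is not None:
        node = parent[node]
        path.append(node)
    return path


def lca(parent, a, b):
    """Lowest common ancestor of a and b; lca(a, a) is a itself."""
    seen = set(ancestors(parent, a))
    for node in ancestors(parent, b):
        if node in seen:
            return node
    raise ValueError("nodes do not share a root")


def cophenetic_vector(parent, value, leaf_of):
    """List value[lca(i, j)] over all label pairs i <= j.

    For n labels this yields n(n+1)/2 entries, ordered by sorted label.
    """
    labels = sorted(leaf_of)
    return [
        value[lca(parent, leaf_of[i], leaf_of[j])]
        for i, j in combinations_with_replacement(labels, 2)
    ]


def linf_distance(u, v):
    """Largest coordinate-wise absolute difference between u and v."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return max(abs(a - b) for a, b in zip(u, v))


# Two hypothetical trees on the labels 1, 2, 3; value = depth from the root.
# Tree 1 groups labels 1 and 2 under node x; tree 2 groups labels 2 and 3.
leaf_of = {1: "a", 2: "b", 3: "c"}

parent1 = {"r": None, "x": "r", "a": "x", "b": "x", "c": "r"}
value1 = {"r": 0, "x": 1, "a": 2, "b": 2, "c": 1}

parent2 = {"r": None, "x": "r", "a": "r", "b": "x", "c": "x"}
value2 = {"r": 0, "x": 1, "a": 1, "b": 2, "c": 2}

v1 = cophenetic_vector(parent1, value1, leaf_of)  # [2, 1, 0, 2, 0, 1]
v2 = cophenetic_vector(parent2, value2, leaf_of)  # [1, 0, 0, 2, 1, 2]
d = linf_distance(v1, v2)                         # 1
```

Both trees must carry the same label set so that the two vectors share coordinates. The ancestor walk is quadratic in the worst case and is chosen here only for readability.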
Notes
1. This is also known as a [0, ∞)-actegory, but category with a flow is both easier to say and fails to generate a flurry of questions about assumed typos.
2. Note that traditionally, a Lawvere metric does not require the axiom of symmetry. However, as all of our constructions are symmetric, we regularly drop the word “symmetric” for simplicity.
3. The analogy between category and meta-category is like the comparison of sets and classes.
4. This category is equivalently thought of as the slice category \(\mathbf {Top} \downarrow \mathbb {R}\).
References
P.K. Agarwal, K. Fox, A. Nath, A. Sidiropoulos, Y. Wang, Computing the Gromov-Hausdorff distance for metric trees. ACM Trans. Algorithms 14(2), 1–20 (2018). https://doi.org/10.1145/3185466
R. Alberich, G. Cardona, F. Rosselló, G. Valiente, An algebraic metric for phylogenetic trees. Appl. Math. Lett. 22(9), 1320–1324 (2009). https://doi.org/10.1016/j.aml.2009.03.003
A. Babu, Zigzag coarsenings, mapper stability and gene network analyses, Ph.D. thesis, Stanford University, 2013
U. Bauer, X. Ge, Y. Wang, Measuring distance between Reeb graphs, in Annual Symposium on Computational Geometry - SOCG 14 (ACM Press, New York, 2014). https://doi.org/10.1145/2582112.2582169
U. Bauer, E. Munch, Y. Wang, Strong equivalence of the interleaving and functional distortion metrics for Reeb graphs, in 31st International Symposium on Computational Geometry (SoCG 2015), Leibniz International Proceedings in Informatics (LIPIcs), vol. 34, pp. 461–475 (Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, 2015). https://doi.org/10.4230/LIPIcs.SOCG.2015.461. http://drops.dagstuhl.de/opus/volltexte/2015/5146
U. Bauer, B. Di Fabio, C. Landi, An edit distance for Reeb graphs (2016). https://doi.org/10.6092/unibo/amsacta/4705
K. Beketayev, D. Yeliussizov, D. Morozov, G.H. Weber, B. Hamann, Measuring the distance between merge trees, in Mathematics and Visualization (Springer, Cham, 2014), pp. 151–165. https://doi.org/10.1007/978-3-319-04099-8_10
S. Biasotti, D. Giorgi, M. Spagnuolo, B. Falcidieno, Reeb graphs for shape analysis and applications. Theor. Comput. Sci. Comput. Algebraic Geom. Appl. 392(1–3), 5–22 (2008). https://doi.org/10.1016/j.tcs.2007.10.018. http://www.sciencedirect.com/science/article/pii/S0304397507007396
L.J. Billera, S.P. Holmes, K. Vogtmann, Geometry of the space of phylogenetic trees. Adv. Appl. Math. 27(4), 733–767 (2001). https://doi.org/10.1006/aama.2001.0759
H.B. Bjerkevik, M.B. Botnan, Computational complexity of the interleaving distance, in 34th International Symposium on Computational Geometry (SoCG 2018) (Schloss Dagstuhl - Leibniz-Zentrum für Informatik, Wadern, 2018)
D. Bryant, J. Tsang, P.E. Kearney, M. Li, Computing the quartet distance between evolutionary trees, in Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’00, pp. 285–286 (Society for Industrial and Applied Mathematics, Philadelphia, 2000). http://dl.acm.org/citation.cfm?id=338219.338264
P. Bubenik, J.A. Scott, Categorification of persistent homology. Discret. Comput. Geom. 51(3), 600–627 (2014). https://doi.org/10.1007/s00454-014-9573-x
P. Bubenik, V. de Silva, J. Scott, Metrics for generalized persistence modules. Found. Comput. Math. 15(6), 1501–1531 (2014). https://doi.org/10.1007/s10208-014-9229-5
G. Cardona, A. Mir, F. Rosselló, L. Rotger, D. Sánchez, Cophenetic metrics for phylogenetic trees, after Sokal and Rohlf. BMC Bioinforma. 14(1), 3 (2013). https://doi.org/10.1186/1471-2105-14-3
M. Carrière, S. Oudot, Structure and stability of the one-dimensional mapper. Found. Comput. Math. (2017). https://doi.org/10.1007/s10208-017-9370-z
F. Chazal, D. Cohen-Steiner, M. Glisse, L.J. Guibas, S.Y. Oudot, Proximity of persistence modules and their diagrams, in Proceedings of the 25th Annual Symposium on Computational Geometry, SCG ’09, pp. 237–246 (ACM, New York, 2009). https://doi.org/10.1145/1542362.1542407. http://doi.acm.org/10.1145/1542362.1542407
F. Chazal, V. de Silva, M. Glisse, S. Oudot, The Structure and Stability of Persistence Modules (Springer, New York, 2016). https://doi.org/10.1007/978-3-319-42545-0
J. Curry, Sheaves, cosheaves and applications, Ph.D. thesis, University of Pennsylvania, 2014
V. de Silva, E. Munch, A. Patel, Categorified Reeb graphs. Discret. Comput. Geom. 1–53 (2016). https://doi.org/10.1007/s00454-016-9763-9
V. de Silva, E. Munch, A. Stefanou, Theory of interleavings on categories with a flow. Theory Appl. Categories 33(21), 583–607 (2018). http://www.tac.mta.ca/tac/volumes/33/21/33-21.pdf
B. Di Fabio, C. Landi, The edit distance for Reeb graphs of surfaces. Discrete Comput. Geom. 55(2), 423–461 (2016). https://doi.org/10.1007/s00454-016-9758-6
P.W. Diaconis, S.P. Holmes, Matchings and phylogenetic trees. Proc. Natl. Acad. Sci. 95(25), 14600–14602 (1998). http://www.pnas.org/content/95/25/14600.abstract
J. Eldridge, M. Belkin, Y. Wang, Beyond Hartigan consistency: merge distortion metric for hierarchical clustering, in Proceedings of The 28th Conference on Learning Theory, ed. by P. Grünwald, E. Hazan, S. Kale. Proceedings of Machine Learning Research, vol. 40, pp. 588–606 (PMLR, Paris, 2015). http://proceedings.mlr.press/v40/Eldridge15.html
H. Fernau, M. Kaufmann, M. Poths, Comparing trees via crossing minimization. J. Comput. Syst. Sci. 76(7), 593–608 (2010). https://doi.org/10.1016/j.jcss.2009.10.014
F.W. Lawvere, Metric spaces, generalized logic, and closed categories. Rendiconti del seminario matematico e fisico di Milano 43(1), 135–166 (1973). Republished in: Reprints in Theory and Applications of Categories, No. 1 (2002), pp. 1–37
B. Lin, A. Monod, R. Yoshida, Tropical foundations for probability & statistics on phylogenetic tree space (2018). arXiv:1805.12400v2
T. Mailund, C.N.S. Pedersen, QDist–quartet distance between evolutionary trees. Bioinformatics 20(10), 1636–1637 (2004). https://doi.org/10.1093/bioinformatics/bth097
D. Morozov, K. Beketayev, G. Weber, Interleaving distance between merge trees, in Proceedings of TopoInVis (2013)
V. Moulton, T. Wu, A parsimony-based metric for phylogenetic trees. Adv. Appl. Math. 66, 22–45 (2015). https://doi.org/10.1016/j.aam.2015.02.002
E. Munch, B. Wang, Convergence between categorical representations of Reeb space and mapper, in 32nd International Symposium on Computational Geometry (SoCG 2016) ed. by S. Fekete, A. Lubiw Leibniz International Proceedings in Informatics (LIPIcs), vol. 51, pp. 53:1–53:16 (Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, 2016). https://doi.org/10.4230/LIPIcs.SoCG.2016.53. http://drops.dagstuhl.de/opus/volltexte/2016/5945
M. Owen, Computing geodesic distances in tree space. SIAM J. Discret. Math. 25(4), 1506–1529 (2011). https://doi.org/10.1137/090751396
G. Reeb, Sur les points singuliers d’une forme de Pfaff complètement intégrable ou d’une fonction numérique. C.R. Acad. Sci. 222, 847–849 (1946)
E. Riehl, Category Theory in Context (Courier Dover Publications, New York, 2017)
D. Robinson, L. Foulds, Comparison of weighted labelled trees, in Combinatorial Mathematics VI (Springer, Berlin, 1979), pp. 119–126. https://doi.org/10.1007/BFb0102690
D. Robinson, L. Foulds, Comparison of phylogenetic trees. Math. Biosci. 53(1–2), 131–147 (1981). https://doi.org/10.1016/0025-5564(81)90043-2
G. Singh, F. Mémoli, G.E. Carlsson, Topological methods for the analysis of high dimensional data sets and 3D object recognition, in SPBG, pp. 91–100 (2007)
A. Stefanou, Dynamics on categories and applications, Ph.D. thesis, University at Albany, State University of New York, 2018
G. Valiente, An efficient bottom-up distance between trees, in SPIRE (IEEE, Piscataway, 2001), p. 0212
Acknowledgements
The authors gratefully thank two anonymous reviewers whose feedback substantially increased the quality of the paper. The work of EM was supported in part by NSF Grant Nos. DMS-1800446 and CMMI-1800466. AS was partially supported both by the National Science Foundation through grant NSF-CCF-1740761 TRIPODS TGDA@OSU and by the Mathematical Biosciences Institute at the Ohio State University.
Copyright information
© 2019 The Author(s) and the Association for Women in Mathematics
About this chapter
Cite this chapter
Munch, E., Stefanou, A. (2019). The ℓ∞-Cophenetic Metric for Phylogenetic Trees As an Interleaving Distance. In: Gasparovic, E., Domeniconi, C. (eds) Research in Data Science. Association for Women in Mathematics Series, vol 17. Springer, Cham. https://doi.org/10.1007/978-3-030-11566-1_5
DOI: https://doi.org/10.1007/978-3-030-11566-1_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11565-4
Online ISBN: 978-3-030-11566-1
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)