The space of ultrametric phylogenetic trees

The reliability of a phylogenetic inference method from genomic sequence data is ensured by its statistical consistency. Bayesian inference methods produce a sample of phylogenetic trees from the posterior distribution given sequence data. Hence the question of statistical consistency of such methods is equivalent to the consistency of the summary of the sample. More generally, statistical consistency is ensured by the tree space used to analyse the sample. In this paper, we consider two standard parameterisations of phylogenetic time-trees used in evolutionary models: inter-coalescent interval lengths and absolute times of divergence events. For each of these parameterisations we introduce a natural metric space on ultrametric phylogenetic trees. We compare the introduced spaces with existing models of tree space, formulate several formal requirements that a metric space on phylogenetic trees must satisfy in order to be a satisfactory space for statistical analysis, and justify them. We show that only a few known constructions of the space of phylogenetic trees satisfy these requirements. However, our results suggest that these basic requirements are not enough to distinguish between the two metric spaces we introduce and that the choice between metric spaces requires additional properties to be considered. In particular, the summary tree minimising the square distance to the trees from the sample might be different for different parameterisations. This suggests that further fundamental insight is needed into the problem of statistical consistency of phylogenetic inference methods.

This paper lies in the broad scope of research on the following two phylogenetic problems, which are of much more general interest, as we demonstrate in this work. The first is the problem of introducing a satisfactory parameterisation for statistical analysis of the space of phylogenetic trees. The nature of the space of phylogenies is that it encapsulates the structure of a manifold as well as a discrete structure of the tree, the latter of which is combinatorially complicated [22]. This mix of a continuous and a discrete structure is what makes statistical analysis of the space complicated. The second problem is that of summarising a finite set of phylogenetic trees [14,15,16]. This problem arises in different settings of phylogenetic analysis, the most important of which is the problem of computing a statistically consistent summary of a sample from the posterior distribution [8,4].
An extensive body of research exists on the space of phylogenetic trees in the general setting when the phylogenetic distance between taxa is given by the lengths of the edges of the tree [3,22]. As we demonstrate in this paper, this general setting sometimes leads to computationally intractable models when applied to the space of ultrametric trees (a special case of time-trees). Ultrametric trees are the only satisfactory model for a great body of research in phylogenetics and epidemiology, especially when divergence time dating is the objective, and the taxa were all sampled contemporaneously. In these cases the time-tree, which is ultrametric, is considered separately from the rates of evolution across lineages, which may vary from one branch to the next.
The aim of this paper is to introduce a mathematically satisfactory model of the space of equidistant phylogenetic trees. The notion of a 'mathematically satisfactory model' will be clarified and made exact later in the paper with an eye towards the two general problems mentioned above. Our work is inspired by that of Billera, Holmes, and Vogtmann [3], and is similar to it in the sense that we use polyhedral complexes with unique geodesics to describe a metric space. The investigation of the tree space from a geometric point of view was initiated in [3] with the introduction of a parameterisation that has later become known as BHV. Due to several nice geometric and algorithmic properties, it was recently suggested [2] that BHV is the space for statistical, and particularly MCMC, analysis of phylogenetic trees. Our results presented in this paper show how crucial the way a tree is parameterised can be for geometric, algorithmic, and statistical properties of the space. In particular, we demonstrate that the summary tree that is suggested in [2] will be different for different parameterisations of the tree space. Our results leave open the question of which parameterisation should be chosen.
We call the same type of trees 'ultrametric' in the title of the paper and 'equidistant' in the previous paragraph. The reason for this ambiguity is the following. The type of objects we are going to consider in this paper is commonly known to evolutionary biologists as an ultrametric tree, while for mathematical biologists, this term is not appropriate for the following reasons:
(1) The popular handbook of a mathematical biologist [22] calls these trees equidistant.
(2) An equidistant tree gives rise to a metric on the taxa. This metric is always an ultrametric, which is a standard notion of geometry.
(3) Every finite ultrametric space X can be presented by a unique equidistant phylogenetic tree on X as taxa.
(4) There are non-equidistant phylogenetic trees that give rise to an ultrametric space on the taxa.
Hence, we have used the term 'ultrametric' in the title and abstract of this paper, to make it possible for both communities to understand what the paper is about, and we will be using the term 'equidistant' from now on for reasons (1) and (4) above.
Unless explicitly stated otherwise, by a tree we mean an equidistant phylogenetic tree, that is, a binary rooted tree with distinguished tips and branch lengths such that the distance from the root is the same to every tip.
We follow [22] for phylogenetic terminology and [23,5] for geometric combinatorics terminology. The structure of our paper is the following. In Section 1. Introduction, we describe the state of the art. Particularly, we present the metric spaces with which we will be comparing our spaces. We note that all the metric spaces, apart from BHV, are not actually used as geometries but as numerical characteristics for estimating errors instead [14,16]. Indeed, some of those characteristics are not even metrics; they are more general objects called dissimilarity maps. In this section, we establish some geometric properties of the described spaces (that theoretically justify their sometimes poor fit for both statistical analysis and summarising the posterior). We also describe a way of adapting the BHV space for the space of equidistant trees. Section 2. τ-space is one of the main contributions of the paper. Here we introduce the space and establish its geometric properties that make τ-space attractive for modelling equidistant phylogenetic trees. We also describe effectiveness properties of the space. Section 3. t-space is where we describe the t-space, establish its geometric properties and compare τ- and t-spaces. The seemingly inessential difference between the two spaces of how a tree is parameterised results in a large impact on geometric and algorithmic properties of the space, as we demonstrate in this section, which is one of the main contributions of the paper.

Introduction
Although the space of equidistant phylogenetic trees is well understood by biologists, mathematicians, and statisticians, there are a large number of ways to define this space formally. As a well-chosen formal definition is the main and crucial step for various types of analyses of the space, including mathematical, statistical, and computational, it is very important to carefully address the balance between the generality of the definition, its mathematical clarity, and its applicability in the analysis of real evolutionary processes. Formal definitions become especially important in modern research because of the massive use of computational tools, for which equivalent definitions can change the complexity of a problem from incomputable on any type of abstract digital computer to rapidly computable on available machines. In this paper, we consider several approaches to defining the space of equidistant phylogenetic trees and compare their mathematical, computational, and statistical tractability.
It is standard practice in evolutionary biology to model real biological processes by mathematical abstractions [22]. Particularly, as biologists are often interested in comparing different hypotheses about an evolutionary process modelled by phylogenetic trees, it is natural to work within the space of such trees. It is also a common practice to introduce different types of measures on the space of trees as a formal way of comparing them. One of the most general and commonly used ways of measuring the similarity between two trees is given by the notion of distance, or metric as it is widely known in mathematics. In order to measure the distance between trees, the tree space has to be parameterised, that is, some real-valued parameters have to be assigned to trees.
Formally, this scenario can be described as follows. Let T be the space of phylogenetic trees on n taxa¹. A parameterisation of the space T is an embedding p : T → M of the tree space T into a metric space M, which we call a model metric space. By embedding here, we mean a function that maps different trees to different points of the metric space M. The embedding p plays the role of the assignment of parameters (points of the space M, which could be tuples of real numbers, for example). The existence of such an embedding makes the space T itself a metric space. Indeed, the distance between two trees T and R is given by the distance between their images under the embedding p, that is, d_T(T, R) is defined to be d_M(p(T), p(R)). We say in this case that the metric d_T is induced by the parameterisation p.
¹ We use n to denote the number of taxa throughout the paper.
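The definition of an induced metric can be illustrated with a minimal Python sketch. It is not taken from the paper: the names `induced_metric`, `euclidean` and `p_times`, and the toy encoding of a tree by a tuple of its internal node times (for one fixed ranked topology), are assumptions made only for illustration.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, ...]


def euclidean(x: Point, y: Point) -> float:
    """Distance d_M in the model metric space M (here: Euclidean space)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def induced_metric(p: Callable, d_M: Callable[[Point, Point], float]) -> Callable:
    """Return d_T defined by d_T(T, R) = d_M(p(T), p(R))."""
    def d_T(T, R) -> float:
        return d_M(p(T), p(R))
    return d_T


def p_times(tree: Sequence[float]) -> Point:
    """Hypothetical toy parameterisation: a tree with a fixed ranked topology
    is represented by the times of its internal nodes, and p simply returns
    that tuple. Distinct trees map to distinct points, so p is an embedding
    on this toy domain."""
    return tuple(tree)


d_T = induced_metric(p_times, euclidean)

# Two toy trees on 3 taxa that differ only in the time of the younger node.
dist = d_T((1.0, 3.0), (2.0, 3.0))
```

Any other embedding p and model metric d_M can be passed in the same way; the geometric properties of the resulting d_T depend on both choices.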
As is known [14,15,16], the existence of a parameterisation alone is already a fruitful property of the space of phylogenies, as it allows one to address questions such as: How far are two trees from each other? How far is an estimate from the true tree? Given two algorithms, which one produces trees that are closer to the true tree? Sometimes it is even possible to extract an objective function whose minimisation leads to a practical way of summarising posteriors [14]. We will present some of these parameterisations later in this section.
Often biologists and especially phylogeneticists are interested in more subtle properties of the space of trees, such as: What tree is in the middle between two given trees²? What is the path from one tree to another? What are the mean and the variance of a set of (sampled) trees? The last question is of prominent importance, as this is the very basic question for statistical analysis of data sets given by trees. Furthermore, this question is very important in testing whether two probability distributions on the tree space are the same, a task that is commonly used in model selection in statistics. More sophisticated questions include, for example: How can standard phylogenetic models such as coalescent and birth-death be described under a given parameterisation? Can more efficient proposal mechanisms, such as Hamiltonian Monte Carlo, be employed in Bayesian analysis of phylogenetic data?
A more detailed mathematical analysis is needed in order to approach questions such as these. In what follows, we summarise several basic properties of parameterisations, which we suggest are desirable to advance research on the problems mentioned.
It is often the case that the metric space M that is used to parameterise the tree space T is greatly different from the metric space T with the induced metric d_T. The key reason for this is the nature of the parameterisation p. As we will see later in the paper, some parameterisations p induce metrics that share almost no geometric properties with the original metric space M that was used in the parameterisation p. Particularly, those parameterisations are far from being bijective, that is, from allowing one to recover a tree from an arbitrary point of the space M. The lack of this property can lead to situations where, for example, there are infinitely many trees all of which minimise the total square distance to a given set of trees [14]. A sphere, for example, has such a property: there are infinitely many points minimising the total square distance to two antipodal points. Furthermore, there are two points a and b both of which minimise this distance, but the distance between a and b is as large as possible.
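The sphere example can be checked numerically. The following Python sketch is illustrative only: it assumes the unit sphere with the great-circle distance and takes the two antipodal points to be the poles, and all function names are hypothetical.

```python
import math


def great_circle_distance(u, v):
    """Geodesic distance between two unit vectors on the unit sphere."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against rounding


def total_square_distance(x, points):
    """Sum of squared geodesic distances from x to the given points."""
    return sum(great_circle_distance(x, q) ** 2 for q in points)


def point(colatitude, longitude):
    """Unit vector with the given spherical coordinates."""
    return (
        math.sin(colatitude) * math.cos(longitude),
        math.sin(colatitude) * math.sin(longitude),
        math.cos(colatitude),
    )


north = (0.0, 0.0, 1.0)
south = (0.0, 0.0, -1.0)

# A point at colatitude c has total square distance c^2 + (pi - c)^2, which
# is minimal at c = pi/2, so every point of the equator is a minimiser.
equator_values = [
    total_square_distance(point(math.pi / 2, 2 * math.pi * k / 8), [north, south])
    for k in range(8)
]
off_equator = total_square_distance(point(math.pi / 3, 0.0), [north, south])

# Two minimisers that are themselves antipodal, i.e. as far apart as possible.
a = point(math.pi / 2, 0.0)
b = point(math.pi / 2, math.pi)
dist_ab = great_circle_distance(a, b)
```

All sampled equator points attain the same value, while the point off the equator gives a strictly larger one, so a minimiser of the total square distance is far from unique here.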
Although the parameterisations we introduce in Sections 2 and 3 of this paper are bijective, the requirement of being bijective is somewhat too strict in the sense that many desirable properties can be achieved without the parameterisation being bijective. We continue by introducing less strict requirements that allow one to carry the analysis of the space M over to the space of trees T.
For the statistical analysis of a space, one needs to define probability distributions on the space, e.g. for Bayesian analysis the first step is to define a prior. A continuous probability distribution defined on the metric space M has to remain the same continuous distribution when pulled back to the space of trees T under the parameterisation p. In order to achieve this, one has to be able to continuously move from one tree to another by a path that stays within the tree space. In other words, any two trees have to be connected by a path.
Formally, a metric space X is called path-connected if for each pair of points x, y in the space, there exists a continuous map γ (with respect to the standard topologies) from the unit real segment [0, 1] to the space X such that γ(0) = x and γ(1) = y.
Thus, the first property a satisfactory parameterisation of the tree space must satisfy is
Image(p) is path-connected. (P1)
The existence of paths between any two points alone is not necessarily enough to easily convert paths between points in the space M into paths between trees in T, because it could well be that the pre-image of the shortest path in M is not a path in T. Hence, we would like to have an easy way of testing which paths in M remain paths when pulled back to T. Furthermore, it could be that shortest paths in the tree space are not unique even when the model metric space M has unique shortest paths. The uniqueness of shortest paths is often desirable, as spaces with unique shortest paths possess the uniqueness of several important summary characteristics such as means and barycentres. Requirements of this sort can be fulfilled by the convexity of Image(p) in M, which means that every shortest path between every two points of Image(p) stays within Image(p).
Formally, a subspace X of a space Y is called convex if for every pair of points x, y ∈ X, every shortest path γ between x and y, and every real number s ∈ [0, 1], it follows that γ(s) ∈ X.
Hence, our second property is
Image(p) is convex in M. (P2)
Suppose one has a probability distribution D on the metric space M with parameterisation p and one wants to define a probability distribution on the tree space T by pulling D back to T. In this case, the image Image(p) has to be a non-trivial part of M; particularly, it has to accumulate non-zero probability mass. Since a subspace of strictly smaller dimension than the dimension³ of the space has measure zero, our next requirement is
Image(p) has the same dimension as M. (P3)
The properties P1-P3 guarantee that nice geometric properties of the space M will be inherited by the induced metric on the tree space T, but none of the requirements causes those properties to exist. They have to be postulated. Hence, we now go on to the properties of the space M. It is important to note that the following properties only make sense if the requirements P1-P3 are fulfilled.
The notion of convexity refers to a shortest path between two points. This is because it could well be that the shortest path is not unique. The uniqueness of shortest paths implies the uniqueness of several types of means, the soundness of the notion of a variance, and the existence and uniqueness of summary trees obtained by minimising an objective function of square distance. Hence, our next property is the following.
We say that a metric space possesses unique geodesics if there exists a unique shortest path between every two points in the space. This shortest path is called a geodesic⁴.
The metric space M possesses unique geodesics. (P4)
In some cases even both the existence and the uniqueness of geodesics are not enough for the comprehensive analysis of real data sets, because the geodesics are incomputable for some metric spaces⁵. Hence our next and last property of the model space M is
The geodesics in the space M are computable. (P5)
The requirement for geodesics to be computable is reasonable for the theoretical investigation of presentations of the tree space, but could still be insufficient for the analysis to be carried out on a real computer. Hence if one has an appropriate notion of 'efficiently computable', then the property P5 can be strengthened to
The geodesics in the space M are efficiently computable. (P5′)
Statisticians and biologists are normally interested in the efficient computability⁶ of various characteristics of a data set such as its mean, variance, diversity, confidence regions, and so on, rather than the geodesics themselves, but the computational complexity of most of the algorithms for computing these types of characteristics is polynomially equivalent to the complexity of computing the geodesics. In other words, computing the geodesic is the most computationally expensive step of the algorithms [1,3,21].
Our work is motivated by the lack of parameterisations in the literature that enjoy all properties P1-P5. Indeed, all known summary tree estimators operate in spaces larger than the space of equidistant rooted binary trees, hence breaking the requirement P3. For instance, Heled and Bouckaert [14] and Huggins et al. [16] use the so-called Rooted Branch Score (RBS) metric space for producing a summary tree given a sample from the posterior distribution. The idea of the RBS space is to encode a tree on n taxa by a (2^n − 1)-dimensional real vector, find an optimum in the (2^n − 1)-dimensional Euclidean space, and find the nearest point in the Euclidean space that can be pulled back to the tree space. Although this approach proved to be fruitful in several applied scenarios [14], it lacks properties P1-P3. Moreover, a tree that minimises the RBS distance to a (finite) set of trees is not unique: there could be infinitely many such trees. This optimisation problem is computationally intractable even for not very large values of n. In implementations of this method, the inefficiency is overcome by restricting the search only to topologies that are present in the posterior sample, that is, in the given set of trees. Furthermore, the topologies and the branch lengths have to be summarised separately in order to make the model computationally tractable [14].
⁵ It is not hard to see that the halting problem for Turing machines can be reduced to the problem of computing shortest paths in graphs. More precisely, there exists a computable graph G such that any algorithm that computes shortest paths between vertices in G solves the halting problem.
⁶ They normally define the term 'efficient' differently for every problem at every point in time, to mean computable reasonably fast on the devices available at that point in time.
Other metrics used in [16] employ projections to lower-dimensional spaces to overcome the absence of properties P1-P3. Those metrics share the same pathologies as RBS. Moreover, the use of projections for estimating means can lead to unbounded errors, as witnessed by the following proposition, which claims that the projection of the mean can be as far from the mean of the projections as possible.
where d_E is the Euclidean distance, pr_D(x) is the projection of the point x ∈ E onto D, and mean_X(x_1, …, x_s) is the Fréchet mean of x_1, …, x_s in the space X.
Proof. We prove the proposition for k = s = 2. The general case is analogous. Let ℓ be the line through x_1 and x_2 in E and ℓ_0 be a line parallel to ℓ at a distance M from ℓ. Consider a parabola D which has its vertex on the line ℓ_0 and crosses the line ℓ at some points a and b both of which are between x_1 and x_2. It is not hard to see that for large enough M, we get the required separation between the projection of the mean and the mean of the projections.
It might appear that the construction used in the proof is slightly artificial, but this is actually very similar to what is going on in such parameterisations as RBS and dissimilarity map distance [16], where the conditions on the set of points that correspond to trees are non-trivial [6]. The dissimilarity map distance [16] between two trees is defined as the distance between the distance matrices of the trees in the space of square matrices. That is, the parameterisation p maps a tree to its distance matrix, and the model metric space M is the space of n × n matrices with the pointwise distance. This space is geometrically similar to RBS in that none of the properties P1-P3 are satisfied. In [6], the Image(p) is characterised for the case when the trees are not necessarily equidistant. This characterisation fulfils the requirements P1-P3. An attempt to carry this characterisation over to the space of equidistant trees has the same complication as the BHV space, which we discuss below.
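The failure of convexity (P2) for the distance-matrix parameterisation of equidistant trees can be seen already on three taxa. The Python sketch below is illustrative only and makes two assumptions: the pointwise distance is taken to be the Euclidean distance on matrix entries, so that straight segments are the shortest paths in M, and matrices are stored as dictionaries keyed by unordered pairs of taxa. It uses the standard three-point condition for ultrametrics (among any three pairwise distances, the two largest are equal); all names are hypothetical.

```python
from itertools import combinations


def is_ultrametric(d, taxa, tol=1e-12):
    """Three-point condition: for every triple of taxa, the two largest of
    the three pairwise distances must be equal."""
    for a, b, c in combinations(taxa, 3):
        _, middle, largest = sorted(
            [d[frozenset((a, b))], d[frozenset((a, c))], d[frozenset((b, c))]]
        )
        if abs(largest - middle) > tol:
            return False
    return True


taxa = ["a", "b", "c"]

# Distance matrix of the equidistant tree ((a,b),c) with node heights 1 and 2.
tree_1 = {frozenset("ab"): 2.0, frozenset("ac"): 4.0, frozenset("bc"): 4.0}
# Distance matrix of the equidistant tree ((a,c),b) with the same node heights.
tree_2 = {frozenset("ab"): 4.0, frozenset("ac"): 2.0, frozenset("bc"): 4.0}

# Midpoint of the straight segment between the two matrices.
midpoint = {pair: (tree_1[pair] + tree_2[pair]) / 2 for pair in tree_1}
```

The midpoint has distances 3, 3 and 4, so its two largest entries differ and it is not an ultrametric. Since an equidistant tree always induces an ultrametric on the taxa, the midpoint lies outside the image of the equidistant trees, although both endpoints lie inside it.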
The most geometrically attractive parameterisation of non-equidistant tree space is the BHV space [3]. This is the only parameterisation we are aware of that fulfils all the properties P1-P5 [3,21]. This parameterisation employs a (2n − 2)-dimensional cubical complex with unique geodesics as the model metric space M; a bijective correspondence between the space of all phylogenetic trees and the complex M is then established. Trees of fixed topology are parameterised by a (2n − 2)-dimensional vector given by the lengths of the branches, and correspond to a cube. The adjacent cubes of the complex correspond to NNI-adjacent trees. Although it took 10 years to establish property P5 for the parameterisation, the polynomial algorithm designed in [21] appears to be quite practical.
As we demonstrate in the rest of this section, it is somewhat involved to apply the BHV model to the space of equidistant trees. A possible (naive) approach could be to simply restrict the BHV space to the set of equidistant trees. Unfortunately, this simple adaptation lacks all the properties P1-P3, so the algorithms developed in [21] become inapplicable.
Another (less naive) approach is to make the edge lengths dependent and demand that, given the lengths of all internal edges and the shortest pendant edge, the lengths of the remaining pendant edges are computed so that the resulting tree is equidistant. This 'less naive adaptation' of the BHV space is similar to the 'bounded BHV' adaptation, which we consider at the end of this section.
A fundamental issue of all BHV-like spaces is that the subspaces corresponding to different ranked tree topologies have different volumes, which results in complications for statistical analysis of the space, particularly, for introducing prior probability distributions on the space.
In the rest of this section, we model the space of trees by a set of bounded polyhedral complexes indexed by the set of positive reals. We assume here that the reader is familiar with BHV space [3]. Otherwise, the rest of this section (excluding the next paragraph) can be skipped, as the following sections of the paper are self-contained.
Since the complexity of presentations is not the matter of this paper, we shall make no distinction between the tree space T and the model metric space M used in the parameterisation p of T , in the case when p is a bijection. For instance, when M has unique geodesics and p is a bijection, we shall simply say that T has unique geodesics (under this parameterisation). A parameterisation p is called strict if p is a bijection.
Consider the space BHV•, that is, the BHV space where pendant branches are ignored. The orthants in BHV• are unbounded and the space is a polyhedral complex of dimension n − 2, where n is the number of taxa. If each axis of each orthant were bounded by the same number, the complex would be cubical and would have the same geometric properties as the original BHV space. We restrict each orthant of the space BHV• to the set {T | T has height at most H}, where H is a fixed real number, and denote the space thus obtained by BHV•_H. The space BHV•_H can be seen as the space of trees of height H, because every tree from BHV•_H can be extended in a unique way to a tree of height H by attaching pendant edges of appropriate lengths to the places where they were in the original BHV space. Thus, the polyhedral complex BHV•_H is a strict parameterisation of the space of equidistant trees of height H. By varying H over the set of positive reals, we get a strict parameterisation of the tree space as a set of bounded polyhedral complexes indexed by positive reals. We call this space bounded BHV space.
Although the space BHV•_H is not a cubical complex, it is geometrically and algorithmically similar to the BHV space. Indeed, since in a neighbourhood of the origin the space BHV•_H is a cubical complex, it possesses efficiently computable unique geodesics in the same way as BHV does. This can be seen by noticing the following. Suppose C is a cubical complex with unique geodesics such that each cube is given by inequalities Then S has unique geodesics. Furthermore, if geodesics in C are efficiently⁷ computable then so are geodesics in S. Neither statement is hard to prove, but the proofs go beyond the scope of this paper.
The first and most obvious complication of this parameterisation is the lack of independence between coordinates. The last coordinate, the length of the tree, cannot be smaller than the sum of coordinates corresponding to the internal edges. This results in technicalities in the study of the geometry of the space, and more problems in implementing algorithms. Another feature of this space is that a change of the length of only one internal branch causes a change of the length of all pendant edges. Hence, if the edge length is interpreted as time, which is the case for many phylogenetic applications, then a change of an older divergence time impacts the times of most recent divergence events for each taxon.
Although these technical issues are not very pleasant to deal with, they can be overcome. There are issues with parameterising the tree space using the bounded BHV that are more fundamental. If, for instance, one wants to bound the lengths of some internal edges using, for example, confidence intervals for those lengths, and perform the analysis under the assumption that these lengths vary only within the intervals, it will result in non-trivial boundary conditions on the lengths of the other internal edges as well as on the last parameter H. Another issue, which we mentioned before, is nonuniform distribution of the volume among different ranked tree topologies.
To overcome these and similar issues is the goal of the further sections of our paper.

The τ-space
In this section, we model the space of equidistant trees by a cubical complex, which we call τ-space, with efficiently computable unique geodesics and establish several geometric and algorithmic properties of the space.

The construction of the space
Those readers who are familiar with BHV space will probably not need much extra explanation other than Figure 1, which depicts one third of the 4-dimensional τ-space T_4, where each orthant is projected onto the subspace with the first coordinate τ_1 fixed. Although this projected space cannot be embedded into 3-dimensional Euclidean space, it can be visualised by imagining the other two thirds of the space. The figure is also helpful in understanding the fact that the Cartan-Alexandrov-Toponogov axiom (for k = 0), which we explain in Definition 4, holds for the space, that is, triangles are thin.
We now give a formal construction of the space to complement the figure. Since one of the main reasons for our interest in equidistant trees is that they allow accurate modelling of evolutionary processes, and in most of those models the height of a node is interpreted as time, we will be using the words 'height' and 'time' interchangeably.
Let T be an equidistant tree on n taxa with times assigned to its nodes. Assuming that the times of all internal nodes are pairwise distinct, we denote the set of such trees by T_n. We parameterise the tree T by a pair that consists of the ranked topology of the tree and the differences between the times of the tree's consecutive nodes. We proceed by defining this parameterisation formally. Let us order the internal nodes of T according to their times: v_2, …, v_n. Note that the node v_n must be the root in this case. Denote the difference between the time of node v_{i+1} and the time of node v_i by τ_i for all i ∈ {2, …, n − 1}. We call τ_i the coordinate of the node v_i. Since the tree is equidistant, the differences between the time of v_2 and the times of external nodes are all the same; denote this difference by τ_1. The coordinates of the tree T are given by the n-tuple (rt(T), τ̄), where rt(T) is the ranked topology of the tree T and τ̄ is the tuple (τ_1, …, τ_{n−1}) from R^{n−1}_{≥0} that consists of the coordinates of the nodes of T. By R^{n−1}_{≥0} we denote the (n − 1)-dimensional non-negative orthant {(r_1, …, r_{n−1}) | r_i ∈ R & r_i ≥ 0}, where R is the set of reals. Figure 2 depicts an example of τ-parameterisation of a tree from T_5. Consider now the set RT_n of all ranked topologies on n taxa such that all internal nodes have different ranks. We recall that there are (n − 1)!·n!/2^{n−1} such topologies [22], and we denote this number by m throughout the paper.
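The continuous part of the τ-parameterisation and the count m can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: the tips are assumed to sit at time 0, the ranked topology rt(T) is not represented at all, and the function names are hypothetical.

```python
from itertools import accumulate
from math import factorial


def tau_coordinates(internal_node_times, tip_time=0.0):
    """Return (tau_1, ..., tau_{n-1}) for the given internal node times.

    tau_1 is the time from the tips to the youngest internal node; each
    further tau_i is the gap between consecutive internal nodes.
    """
    taus = []
    previous = tip_time
    for t in sorted(internal_node_times):
        if t <= previous:
            raise ValueError("node times must be distinct and above the tips")
        taus.append(t - previous)
        previous = t
    return tuple(taus)


def node_times(taus, tip_time=0.0):
    """Inverse map: recover the internal node times by cumulative sums."""
    return tuple(tip_time + s for s in accumulate(taus))


def ranked_topology_count(n):
    """Number m = (n - 1)! * n! / 2^(n - 1) of ranked topologies on n taxa."""
    return factorial(n) * factorial(n - 1) // 2 ** (n - 1)


# A tree on 4 taxa whose internal nodes have times 1.0, 2.5 and 4.0.
taus = tau_coordinates([4.0, 1.0, 2.5])   # (1.0, 1.5, 1.5)
times = node_times(taus)                  # (1.0, 2.5, 4.0)
m_4 = ranked_topology_count(4)            # 18
```

The cumulative sums in `node_times` give the absolute node times, which is the other parameterisation considered in the paper; within a single orthant the two coordinate systems differ by this linear change of variables.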
Thus, we have constructed a disjoint union of m (n − 1)-dimensional polyhedra S = {(rt(T), τ̄) | T ∈ T_n, τ̄ ∈ R^{n−1}_{≥0}}. Specifically, the polyhedra are orthants indexed by tree topologies. It is clear that the set T_n is in a bijective correspondence with the interior of S. It is also obvious how to establish a bijection between the faces of the polyhedra in S and the set of ranked (multifurcating) tree topologies on n taxa which have at least two internal nodes of the same rank. Indeed, if we consider such a tree, the coordinates τ_i that are between two nodes of the same rank have to be 0, and the faces of the polyhedra in S are precisely the tuples (rt(T), τ̄) where some of the coordinates τ_i are 0. It remains to note that faces for which τ_1 = 0 (possibly with some τ_2, …, τ_i also 0) will correspond to the trees where the most recent coalescent event (possibly with the second, …, i-th most recent coalescent events) occurs at the origin of time⁸.
We now want to create a polyhedral complex in the obvious way, that is, by gluing together the faces that correspond to the same ranked (not necessarily fully resolved) tree topologies. We proceed formally as follows. We define an equivalence relation ∼ on the set of faces of polyhedra in S. We say that two faces F and G are equivalent, written F ∼ G, if they correspond to the same ranked tree topology. Now, consider the set S′ that consists of the union of the set S and the set of all faces of elements from S. The polyhedral complex is then the quotient set S′/∼.
Since the trees are in a bijective correspondence with this complex, the parameterisation is strict and we shall refer to the space of trees T_n as a polyhedral complex, slightly abusing the notation⁹.
The behaviour of polyhedral complexes where the polyhedra are orthants is the same as the behaviour of cubical complexes. By this we mean that whenever one is interested only in local properties of the space, that is, in properties that can be observed at a finite distance from the origin rather than properties that arise at infinity, one can always assume that all coordinates of the orthants are bounded by a large enough constant. This bound makes the polyhedral complex T_n a cubical complex, which is a standard and well-studied object of geometric combinatorics [5,23]. This does not, in any sense, imply that the unbounded polyhedra are of lesser interest. There are whole branches of mathematics that investigate geometric properties at infinity, e.g. Gromov's influential programme on large scale geometry [13]. We make this restriction simply because for our goals, as well as for the vast majority of biological applications, cubical complexes are a more appropriate object to model the space of trees, since we investigate local properties only. All of the results in this paper remain true in the unbounded case.
Thus, the space of trees T n is a cubical complex. This, particularly, implies that the space T n is a metric space with geodesics. It is easy to construct an example of a cubical complex where geodesics are not unique. Hence, the immediate question is whether geodesics are unique in τ -space. They are, and we are going to prove this, but before that let us consider some geometric properties of the space and compare them with those of BHV space.
It is convenient to think of the τ-space as a set of points (trees) that move freely within orthants without leaving them as long as all the coordinates τ i are strictly positive. Whenever one of the τ i becomes 0, the point lands on a boundary face of dimension one smaller. The point can now either move along the boundary by varying all the other τ i , or leave the boundary by increasing the τ i that reached 0. The boundary corresponds to a facet¹⁰ F, and there could be several orthants that have this facet F. It is not hard to see that the possible numbers of orthants which share a common facet are one, two, and three. Indeed, we have seen an example of a polyhedron and its facet that is not a face of any other polyhedron, namely when τ 1 = 0. If a facet does not correspond to a multifurcation (see the facet between polyhedra corresponding to trees T and E in Figure 1), there will be two orthants that share this facet. If it does (as the other facet of the orthant corresponding to the tree T in Figure 1), then the number is three.

Some geometric properties
At first glance, it might seem that the BHV and τ-space are very similar¹¹. It might even appear that they are the same space. The graph in Figure 3 depicts the link of the origin of the T 4 space. The link is indeed very similar to that of the BHV space on four taxa, but it already suggests several differences that we would like to investigate. In this subsection, we establish several geometric properties of the two spaces to better understand the differences and similarities between them in order to answer the question of whether the algorithms developed for BHV space [21] are applicable in τ-space.
The first property we want to point out is the following. For every tree topology, the dimensions of corresponding orthants in BHV n and T n are different. This is because the pendant edges add n to the dimension of the BHV-orthant and add 1 to the dimension of the τ-orthant. One might suggest that the spaces BHV • n and T • n , the corresponding spaces where the pendant edges are omitted, are geometrically similar. They are indeed: they share a number of geometric properties. But they appear quite different if one attempts to uniformly map one distance to the other. If such a mapping existed, all the geometric and algorithmic results for BHV could be directly applied to τ-space. We formalise this assertion in the following two propositions.
¹⁰ By a facet of a polyhedron here and throughout the paper, we mean a face whose dimension is one smaller than the dimension of the polyhedron, that is, a face of codimension one.
¹¹ Again, the subsection can be skipped by those who are not familiar with BHV space, as the rest of the paper does not depend on this subsection.
Proposition 2. The spaces BHV • n and T • n are not isometric.
Proof. This follows from the fact that isometries preserve angles. Indeed, let us fix a non-caterpillar topology and consider the corresponding orthants in BHV and τ-space. We may notice that there will be several orthants in the τ-space and only one orthant in the BHV space. One can use an appropriate number of hyperplanes to partition the BHV-orthant in such a way that every member of the partition corresponds to the trees in precisely one τ-orthant. Clearly, no embedding of these subspaces preserves angles.
The proof above can be understood intuitively by trying to establish an isometry between the orthants that correspond to the trees T and E in Figure 1. The corresponding τ- and BHV-subspaces can be drawn as in Figure 4 (note that the objects depicted are flat). One might still wonder in what way the distances are related. It might even appear that the BHV-distance dominates the τ-distance. While it is obvious that the BHV- and the τ-coordinates are easily computable from each other¹², the following proposition indicates that the dependence of the coordinates is non-monotonic with respect to distances. It is important to note that since the dimensions of BHV n and τ n are different, we are ignoring the external branches here and considering BHV • n and τ • n .
Proof. Consider the trees T , R, and E depicted in Figure 1. We finish the proof by setting (1) All sigmas, mus, and taus to 1. In this case: It might seem that an inequality of the second type can only be obtained in the quadrants that are present in the τ-space but not in the BHV space. Although it is not necessary for the proof, we demonstrate that this is not the case by setting (2) τ 2 = 1, τ 3 = 2, µ 2 = 3, µ 3 = 4. In this case:

Uniqueness and efficiency of geodesics
The geometric property of main interest to us is the uniqueness of geodesics.
We aim at efficient procedures for computing several geometric characteristics, such as the Fréchet mean, standard deviation, and convex hulls, for which the uniqueness of geodesics is crucial. We derive the existence of those procedures for τ-space from general properties of cubical complexes. Although most of the properties of the space, including the uniqueness of geodesics, can be shown directly, we will establish those properties in this section using Gromov's CAT theory [12]. Our main motivation for doing this is that this approach is, to our taste, more elegant than one that uses a direct construction of geodesics.
We recall that a metric space X is called geodesic if every pair of points from X is connected by a shortest path. A geodesic metric space has unique geodesics if the geodesic between every two points is unique.
Definition 4. A geodesic metric space X is said to satisfy the Cartan-Alexandrov-Toponogov axiom, or to be CAT(0), if the following property holds.
For all triples $x_1, x_2, x_3 \in X$ and all points $y$ on a geodesic from $x_1$ to $x_2$, the inequality $d_X(x_3, y) \le d_E(x'_3, y')$ holds, where $x'_1, x'_2, x'_3$ are three points on the Euclidean plane such that $d_E(x'_i, x'_j) = d_X(x_i, x_j)$ for all $i, j$, and $y'$ is the point on the segment $[x'_1, x'_2]$ with $d_E(x'_1, y') = d_X(x_1, y)$. In other words, a metric space X is CAT(0) if no triangle ∆ in X is thicker than a Euclidean triangle ∆ E of the same size as ∆.
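The comparison in the CAT(0) inequality can be checked numerically. The following minimal sketch computes the Euclidean side of the inequality from the three side lengths of a triangle by Stewart's theorem. The function name `comparison_distance` and the tripod example (three unit segments glued at a common endpoint) are illustrative assumptions and do not appear in the paper.

```python
import math


def comparison_distance(a, b, c, s):
    """Distance from x3' to y' in the Euclidean comparison triangle.

    a = d(x1, x2), b = d(x1, x3), c = d(x2, x3); y' lies on the segment
    [x1', x2'] at distance s from x1'. Uses Stewart's theorem.
    """
    lam = s / a
    squared = (1 - lam) * b**2 + lam * c**2 - lam * (1 - lam) * a**2
    return math.sqrt(max(squared, 0.0))


# Tripod: three legs of length 1 glued at a centre, with leaves x1, x2, x3.
# The midpoint y of the geodesic from x1 to x2 is the centre, so d_X(x3, y) = 1.
d_tripod = 1.0
d_euclid = comparison_distance(2.0, 2.0, 2.0, 1.0)
print(d_tripod <= d_euclid)
```

In the tripod the triangle is "thinner" than its Euclidean counterpart, which is the behaviour the axiom asks for. A single triangle is of course only an illustration, not a verification of the CAT(0) property.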
It follows from the definition of a CAT(0) metric space that it has unique geodesics. Indeed, let X be a CAT(0) space and a, b two points from X. Consider a point x on a geodesic γ from a to b and consider a degenerate Euclidean triangle $a', x', b'$, where $x'$ lies on the segment $[a', b']$ at the same distance from $a'$ as x is from a in X. The CAT(0) axiom then implies that $d_X(x, y) \le d_E(x', x')$, where y is the point on any geodesic from a to b at the same distance from a as x. Since $d_E(x', x') = 0$, we get $d_X(x, y) = 0$, and every geodesic from a to b coincides with γ because we have chosen x arbitrarily.
Due to this observation, the fact that τ -space has unique geodesics is derived from the following theorem.
Theorem 5 (Gromov [12]). A cubical complex C with the intrinsic $\ell_2$-metric is CAT(0) if and only if C is connected, simply connected, and for all natural numbers k, if three (k + 2)-cubes of C share a common k-cube and pairwise share common different (k + 1)-cubes, then they are contained in a (k + 3)-cube of C.
Clearly, the τ -space is a cubical complex which is connected and simply connected. For the last requirement of the theorem, we note that the configuration when three (k + 2)-cubes pairwise share three common different (k + 1)-cubes can only happen when the following property is satisfied. The topology corresponding to one of the three (k + 2)-cubes has some vertices v i , . . . , v i+k , k ≥ 2, such that the coordinates τ i+1 = . . . = τ i+k−1 = 0, and both τ i and τ i+k are positive. Furthermore, the topology corresponding to one of the other two cubes has τ i = 0 and τ i+1 > 0, and the topology corresponding to the remaining cube has τ i+k−1 > 0 and τ i+k = 0. Clearly, all these cubes are contained in the (k + 3)-cube where all of the τ i , τ i+1 , τ i+k−1 , τ i+k are positive.
Thus, we have established the following result.
Theorem 6. The τ-space is a CAT(0) space and, in particular, has unique geodesics.
This property is fundamental for summarising sets of trees, because the uniqueness of geodesics implies that several geometric centres are unique. For example, such objects as the Fréchet mean, barycentre [20], convex hull, and many others are well-defined.
The immediate question that arises once the existence and uniqueness of such geometric characteristics is established is the question of effectiveness. Can different types of means, barycentres, and hulls be efficiently computed? The answer is positive for the τ-space. Indeed, τ-space is not just a space with unique geodesics, it is a cubical complex. For cubical complexes, low-degree polynomial algorithms are known for computing geodesics, means, barycentres, variances and convex hulls [19]. Thus, we suggest that τ-space serves as a tool for statistical analysis of sets of trees, in particular for computing the summary tree of a posterior sample obtained using, for example, MCMC.

The t-space
A natural question one could ask is why we take the time differences instead of the times themselves as coordinates of trees in the orthants of τ-space. One of the reasons for this is that the τ-coordinates are independent, as opposed to the t-coordinates, that is, the times of the nodes. The independence of coordinates is a very useful property of parameterisations because it avoids constant checking of dependency conditions while performing the analysis (both geometric and algorithmic) of the space. We have seen in previous sections an example of a parameterisation where coordinates are dependent in a way which causes complications in working with the space. Another reason is that the τ-parameterisation is used in several phylogenetic models, such as the coalescent.
In some cases boundary conditions are unavoidable. For example, some boundary conditions are present implicitly in τ-space: we require the coordinates to be positive and bound the first coordinate to the set of topologies of trees. The boundary condition that makes orthants cubes is explicit. Since these conditions are easily satisfiable and checkable, they do not cause any serious complication, as we have seen in Section 2. Furthermore, as absolute divergence times are often the object of interest, the parameterisation of trees using the times of their nodes is natural for several phylogenetic models, e.g. the birth-death model [18]. As birth-death priors are one of the main classes of priors used in Bayesian inference, we address this parameterisation in more detail.
The purpose of this section is twofold. First, as we mentioned, we would like to study the geometric and efficiency properties of one of the prominent parameterisations in evolutionary biology. Secondly, we demonstrate how radically these properties can change after a seemingly negligible change in parameterisation. Namely, converting τ-coordinates to their partial sums, that is, to the times of coalescent events, makes fundamental results from combinatorial geometry, such as Gromov's theorem used to prove Theorem 6, inapplicable, along with the algorithmic results from [21] and [19].
It is probably already clear how we are going to introduce the t-space, but we would like to be explicit about this and do so formally. Let us consider an equidistant tree T with ranked topology rt(T ) with no nodes of the same rank. For each node v i from T , let t i be the distance from v i to the closest taxon. In this way, we assign times to all nodes of T , with all taxa being of time 0. Let us order all internal vertices of T according to their times: v 1 , . . . , v n−1 . Then the coordinates of the tree T in t-space are (rt(T ), t 1 , . . . , t n−1 ).
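The passage between the two coordinate systems can be sketched in a few lines of Python. The sketch assumes that the ranked topology is held fixed, so that only the numeric coordinates change; the function names and the two example trees are hypothetical. It also compares the length of the straight segment between the two trees in each coordinate system, which already differs even though the trees are the same.

```python
from itertools import accumulate
import math


def tau_to_t(tau):
    """Node times are the partial sums of the interval lengths."""
    return list(accumulate(tau))


def t_to_tau(t):
    """Interval lengths are the successive differences of the node times."""
    return [later - earlier for earlier, later in zip([0] + list(t), t)]


def euclidean(u, v):
    """Length of the straight segment between two coordinate vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Two hypothetical trees with the same ranked topology.
tau_a, tau_b = [1, 2, 1], [2, 1, 1]
t_a, t_b = tau_to_t(tau_a), tau_to_t(tau_b)

print(t_a, t_b)
print(euclidean(tau_a, tau_b), euclidean(t_a, t_b))
```

Because the conversion is linear, coordinate-wise averages of trees with a common ranked topology agree in both systems, whereas segment lengths, as the example shows, do not.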
We note that if we vary the times of the nodes of T while keeping the ranked topology preserved, we get a simplex {(t 1 , . . . , t n−1 ) | 0 ≤ t 1 ≤ . . . ≤ t n−1 ≤ H}, where the upper bound H on the height of the tree is introduced for the same reason it was introduced in τ-space. Figure 5 depicts one such simplex. We create a simplicial complex out of $\frac{(n-1)!\,n!}{2^{n-1}}$ such simplices corresponding to different ranked topologies on n taxa in a similar way to how the complexes are created in BHV and τ-spaces, namely, we identify faces of simplices corresponding to the same tree topologies. The first substantial difference is that the edge of the complex that is shared by all the simplices is not an axis; rather, it is the line t 1 = . . . = t n−1 . Furthermore, the faces are defined by some of the coordinates being equal, t i = t k , rather than some of the coordinates being 0. We denote the t-space on n taxa by T n . Figure 6 depicts the space T 3 in full. In this figure, three coloured triangles of the simplicial complex correspond to the three depicted topologies. The triangles share a line that corresponds to the unresolved tree on three taxa. The upper bound of the triangles is the artificial bound H we put on the height of trees to make the polyhedral complex simplicial.
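As a quick sanity check of the count of simplices and of the simplex condition, here is a minimal Python sketch. The function names are introduced here for illustration only.

```python
from math import factorial


def ranked_topology_count(n):
    """Number of simplices in t-space on n taxa: (n-1)! * n! / 2**(n-1)."""
    return factorial(n - 1) * factorial(n) // 2 ** (n - 1)


def in_simplex(t, height):
    """True if 0 <= t_1 <= ... <= t_{n-1} <= height."""
    bounded = [0, *t, height]
    return all(a <= b for a, b in zip(bounded, bounded[1:]))


print(ranked_topology_count(3), ranked_topology_count(4))
print(in_simplex([1, 2, 3], 5), in_simplex([2, 1, 3], 5))
```

For three taxa the count agrees with the three triangles of the complex T 3 described in the text.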
Figure 7 depicts a part of the t-space T 4 . Unlike in Figure 1, we do not project the simplices onto a 2-dimensional subspace but draw them as 3-dimensional pyramids. Three such pyramids are depicted in Figure 7 in white, blue, and grey. The white pyramid shares a facet with both the grey and the blue pyramids. The grey and blue pyramids share only the edge t 1 = t 2 = t 3 of the complex. This is one-sixth of the space T 4 corresponding to the topologies depicted.
Our next step is naturally to ask whether t-space has unique geodesics. This question cannot be answered in the same simple way as for BHV and τ-space using Gromov's CAT theory, because t-space is not a cubical complex. It is reasonable to expect that the literature contains a result similar to Gromov's theorem for simplicial complexes. Indeed, there is one. A simplicial complex is called k-large if every cycle in the link of every vertex, such that no two consecutive edges of the cycle are contained in a 2-simplex of the complex, has length at least k. As noted in [17], although no combinatorial formulation of the CAT(0) property of the standard piecewise Euclidean metric on a simplicial complex is known, the following theorem holds.
It is not hard to see that t-space T n is 6-large and not 7-large when n ≥ 4, hence the theorem above is not applicable. Simplicial complexes that are connected, simply connected, and 6-large are called systolic. Systolic complexes have been studied extensively [9], and it was shown that in dimension two¹³ a simplicial complex is systolic exactly when the standard piecewise Euclidean metric is CAT(0), while in higher dimensions being systolic is neither stronger nor weaker than the standard piecewise Euclidean metric being CAT(0). Hence for t-space, being systolic does not necessarily imply having unique geodesics.
Although it is truly amazing how the minor change in parameterisation that led us from τ-space to t-space results in such a dramatic change in the geometry of the space, which in particular made two very powerful theorems from geometric combinatorics inapplicable, t-space possesses unique geodesics. We will sketch the proof of this fact later in this paper.
The change in the parameterisation results not only in the question of uniqueness becoming more complicated. What is more important is that the algorithms used for computing geodesics in BHV and τ -space cannot directly be applied in t-space. Moreover, their existence has to be questioned, and rightly so, as the following theorem demonstrates.
Theorem 8. The problem of computing geodesics in t-space is NP-hard.
We will reduce the problem of computing NNI-distance to the problem of computing geodesics in t-space, but before going on to the proof of this result, we would like to develop some intuition of why t-space is so different from both BHV and τ-space. The key property behind this difference is that the cone-path is rarely a geodesic in t-space. Indeed, in both BHV and τ-space the position of two cubes can result in a cone-path being the geodesic between any pair of trees from these cubes. In particular, the measure of the set of pairs of trees between which the cone-path is a geodesic is positive. For example, if two trees T and R have topologies with no compatible splits, then the geodesic between T and R is a cone-path [3]. No property of this kind holds in t-space. It will follow from the observations below that the set of pairs of trees between which the geodesic is a cone-path in t-space has measure 0. Let us illustrate this effect by the following example. Consider the trees T and R depicted in Figure 8. Since the trees do not have compatible splits, the geodesic is a cone-path in both BHV and τ-space. It is not hard to see that the shortest cone-path in t-space passes through the star-tree of height 6 and has length 2
We note that we still have not established the uniqueness of geodesics in t-space. For technical reasons, it is easier to prove Theorem 8 first and then use the approach developed in the proof to derive the uniqueness of geodesics. Hence, we continue with the proof of Theorem 8 that does not assume that the geodesics in t-space are unique.
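Returning to the cone-paths discussed above, the length of the shortest cone-path in t-space can be computed by a one-dimensional minimisation over the height h of the star-tree. The sketch below assumes that a cone-path consists of two straight segments, one from each tree to the point t 1 = . . . = t n−1 = h on the shared line, and it uses a ternary search because the total length is a convex function of h. The function names and the example coordinates are hypothetical and are not the trees of Figure 8.

```python
import math


def cone_path_length(t, r, h):
    """Length of the path from t to r through the star-tree of height h."""
    to_star = math.sqrt(sum((x - h) ** 2 for x in t))
    from_star = math.sqrt(sum((x - h) ** 2 for x in r))
    return to_star + from_star


def shortest_cone_path(t, r, iterations=200):
    """Ternary search for the star-tree height minimising the path length."""
    lo, hi = min(*t, *r), max(*t, *r)  # the optimum lies in this range
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if cone_path_length(t, r, m1) <= cone_path_length(t, r, m2):
            hi = m2
        else:
            lo = m1
    h = (lo + hi) / 2
    return h, cone_path_length(t, r, h)


# Hypothetical node times of two trees on four taxa.
height, length = shortest_cone_path([1, 2, 3], [1, 2, 3])
print(height, length)
```

This only finds the best path through the shared line; according to the text, such a path is almost never the geodesic in t-space, so the value is an upper bound on the distance rather than the distance itself.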
Proof of Theorem 8. Let us introduce the following notation, which makes the statement of the theorem more formal and will be used throughout the proof.
By a partition with attached time-coordinate, we mean an object of the form (N 1 | . . . | N q ) : t, where N 1 | . . . | N q is a partition of the taxa that can be obtained by cutting the tree along the line obtained by fixing the time-coordinate, and t is the least value of the time-coordinate that produces this partition. For example, the left-hand side tree on
Removing one or more partitions from a set of partitions that defines a completely resolved tree gives rise to a non-resolved tree or a tree with two or more internal nodes of the same rank. For example, if we remove the partition (12 | 3 | 4) from the left-hand side tree in Figure 8 then we get the tree ((1, 2), (3, 4)) of height 9 with both internal nodes being of height 8.
We note here that, as we consider only trees with all the taxa being at time 0, the partition (1 | 2 | . . . | n) : 0 is assumed to be (invisibly) present everywhere¹⁴. Clearly, a tree is unambiguously defined by its set of partitions with attached time-coordinates, and a set of partitions defines a tree if and only if one member of every pair of partitions from the set refines the other and the time-coordinates of the partitions are monotonic under these refinements.
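The criterion for a set of partitions with attached time-coordinates to define a tree can be written as a short check. The sketch below assumes a hypothetical encoding (a dictionary from partitions to times) and reads "monotonic" as: the finer of two distinct partitions carries the strictly smaller time. Both the encoding and this strict reading are assumptions made for illustration.

```python
from itertools import combinations


def partition(*blocks):
    """Build a partition of the taxa from its blocks."""
    return frozenset(frozenset(b) for b in blocks)


def refines(p, q):
    """True if every block of p is contained in some block of q."""
    return all(any(block <= other for other in q) for block in p)


def defines_tree(timed_partitions):
    """timed_partitions: dict mapping partition -> attached time-coordinate."""
    for (p, tp), (q, tq) in combinations(timed_partitions.items(), 2):
        if refines(p, q):
            ordered = tp < tq  # the finer partition must appear earlier
        elif refines(q, p):
            ordered = tq < tp
        else:
            return False  # neither partition refines the other
        if not ordered:
            return False
    return True


# Hypothetical examples on four taxa.
good = {
    partition({1, 2}, {3}, {4}): 1,
    partition({1, 2, 3}, {4}): 2,
    partition({1, 2, 3, 4}): 3,
}
bad_nesting = {partition({1, 2}, {3}, {4}): 1, partition({1, 3}, {2}, {4}): 2}
bad_times = {partition({1, 2}, {3}, {4}): 2, partition({1, 2, 3}, {4}): 1}

print(defines_tree(good), defines_tree(bad_nesting), defines_tree(bad_times))
```

The check is quadratic in the number of partitions, which is enough for illustration since a tree on n taxa has at most n − 1 of them.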
It can be shown in the same way as in [19] that the coordinates in t-space change at a fixed rate along geodesics. This justifies the following definition.
We say that a geodesic γ between trees T and R is computable in polynomial time if a polynomial and an algorithm exist such that, given the t-coordinates of trees T and R, the algorithm outputs, after a number of steps bounded by the polynomial in n, an array of sets of partitions with time-coordinates attached to every partition such that
(1) The set of partitions ∪ i A i from the first row along with the attached time-coordinates define the tree T .
(2) The set of partitions ∪ i B i from the last row along with the attached time-coordinates define the tree R.
(3) Every row of the array defines the tree on the face of a simplex where the geodesic γ crosses the face.
(4) Every tree that is the intersection of the geodesic γ with a face of a simplex corresponds to a unique row of the table.
In terms of simplices, the number of elements in each A i , which is equal to the number of elements in the corresponding B i , is the codimension of the face corresponding to the i th row. In terms of trees, this number is the number of multifurcations plus the number of non-resolved ranks of internal nodes of the tree corresponding to the i th row. Note that these properties imply that all the rows of the array are pairwise different, all time-coordinates attached to the partitions from the same row are pairwise different, and the time-coordinates attached to the same partition in different rows may be different. Clearly, every geodesic is unambiguously defined by an array of partitions satisfying these properties.
We say that the geodesic γ is an NNI-path if every A i contains precisely one partition¹⁵. Note that the properties above imply that every B i has to contain precisely one partition as well. We reduce the problem of computing the NNI-distance between two tree topologies, which is known to be NP-complete [7], to the problem of computing geodesics in t-space. The example that precedes this proof is a helpful illustration of why such a reduction is possible. We need the following lemma, which appears technical at first glance, but is actually the key to both the algorithmic hardness of geodesics and their uniqueness.
Lemma 9. Let T and R be two trees and γ a geodesic in t-space between them. Then there exist a tree T′ and a geodesic γ′ between T′ and R such that
(1) The trees T and T′ have the same ranked topology.
(2) The geodesic γ′ is an NNI-path.
(3) The tree T′ is computable in polynomial time from T and γ.
(4) The set of partitions that defines the geodesic γ′ is computable in polynomial time from T and γ.

Proof. Let
be the geodesic γ from T to R. We adjust the time-coordinates of T to obtain T′ with the desired properties. Let s be the least natural number such that the set $A_s$ contains more than one element. If such s does not exist, we are done and γ is an NNI-path. Suppose first that the set $A_s$ contains precisely two elements, $A_s = \{q_1 : t_1, q_2 : t_2\}$. In this case, $B_s = \{r_1 : v_1, r_2 : v_2\}$. We note that the partition $q_1$ is present in $A_s$ in all rows where $A_s$ appears; in particular, it is present in the tree T, perhaps with a different time-coordinate $t'_1$. Since the time-coordinate changes at a constant rate along the geodesic and the distance function is continuous, it is possible to find a positive real number ε such that if we vary the time-coordinate $t'_1$ by ε, then there will be a geodesic from the tree $T_1$ thus obtained to the tree R having all partitions the same as γ up to $A_s$ and having two sets of partitions $A'_s = \{q_1 : t_1\}$ and $A''_s = \{q_2 : t_2\}$ instead of $A_s$. Indeed, suppose first that $t'_1 < t_1$. Then the time-coordinate of the partition $q_1$ changes continuously from $t'_1$ to $t_1$, so if we change $t'_1$ to $t'_1 + \varepsilon$ for small enough ε, the tree $T_1$ thus obtained and a geodesic from $T_1$ to R will satisfy the specified property. If $t'_1 > t_1$ then we repeat the argument but take $t'_1 - \varepsilon$ instead of $t'_1 + \varepsilon$. Finally, if $t'_1 = t_1$ then the time-coordinate $t_2$ and the time-coordinate $t'_2$ which corresponds to the partition $q_2$ in T must be different, so in this case we apply the above argument but take $t'_2$ and $q_2$ instead of $t'_1$ and $q_1$.
Informally, this means that we can adjust the time-coordinate of the partition $q_1$ (or $q_2$) in T so that instead of passing through the face of codimension 2 corresponding to $A_s$, the geodesic first passes through the face of codimension 1 corresponding to $A'_s$ and then through another face, corresponding to $A''_s$, of codimension one smaller than that of $A_s$. This happens because the change of the coordinate $t'_1$ in T results in the geodesic developing faster in the direction of $t_1$, so the geodesic first reaches the face corresponding to $A'_s$ and then the one corresponding to $A''_s$, rather than reaching both simultaneously as the geodesic γ does.
If $A_s$ has more than two partitions, we repeat the construction above until the set $A_s$ is replaced by several one-element sets. The only thing to note here is that there can be at most one partition $q_i$ in $A_s$ such that the time-coordinate $t_i$ of the partition $q_i$ in row s is the same as the time-coordinate of $q_i$ in the first row. Once the set $A_s$ is replaced by one-element sets, we proceed to the next non-one-element set of partitions until all the sets $A_i$ become one-element.
Clearly, the complexity of this algorithm for computing T′ is polynomially equivalent to the maximum of the complexities of T and γ. It remains to note that in the course of adjusting the time-coordinates of the tree T , we computed all the partitions that define the geodesic γ′. We note that we have not specified the time-coordinates of those partitions, but this is not required in the lemma.
We note that it follows from the proof of this lemma that the measure of the set U of pairs of trees with a geodesic being a cone-path is zero. Indeed, if we fix a tree T then the set of trees R for which a geodesic from T to R is a cone-path has measure 0, according to the proof of the lemma. That is, the measure of the set {R | (T, R) ∈ U } is 0 for all T . Then it follows from Fubini's theorem that the (product) measure of the set U is 0.
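The application of Fubini's theorem above can be spelled out in one line. The display below is a sketch that assumes the set U is measurable with respect to the product measure and writes ν for the measure on the space of trees; the symbol ν is introduced here and does not appear in the paper.

```latex
(\nu \times \nu)(U)
  = \int \nu\bigl(\{ R \mid (T, R) \in U \}\bigr)\, d\nu(T)
  = \int 0 \, d\nu(T)
  = 0 .
```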
Let us get back to the proof of Theorem 8. Consider two tree topologies tt(T ) and tt(R). Assign arbitrarily the coordinates to internal nodes of the topologies so that the trees T and R are fully resolved equidistant trees with all internal nodes of different ranks. We reduce the problem of computing NNI-distance between tree topologies to the problem of computing a geodesic between corresponding trees in t-space. To do so we first compute a geodesic γ from T to R in t-space. Then we apply Lemma 9 to compute T′ and the sets of partitions that define γ′ as in the lemma. It remains to note that given the set of partitions for γ′, we can compute the NNI-distance between tt(T′), which coincides with tt(T ), and tt(R) in polynomial time. Indeed, given the set of partitions for the geodesic γ′, we can recognise which rows of the partition representation array of γ′ correspond to NNI moves and which are just changes in the ranks of nodes of the topology. Since γ′ is an NNI-path, the number of NNI moves obtained in this way is the NNI-distance. This last claim follows from the fact that if the number of NNI moves thus obtained is greater than the NNI-distance between the tree topologies, then the geodesic that follows the NNI-path with the smaller number of NNI moves must be shorter. Indeed, any NNI-path in t-space has to change the topology of the trees from one to the other using only NNI moves. Clearly, every extra NNI move adds positive length to the corresponding path in t-space. Hence, every t-geodesic which is an NNI-path has to follow a shortest path in the NNI graph.
Although algorithmic hardness is generally bad news for practical applications, it is not the end of the story, as in practice much depends on the specific problems and data sets. Furthermore, even NP-hard problems can sometimes have efficient solutions for a satisfactorily large number of taxa. For example, the NP-completeness of the SPR-distance, of which the NNI-distance is a refinement, did not prevent a fast enough algorithm from existing for several practical problems, nor did it stop Whidden et al. [24] from finding that algorithm.
We finish our paper by sketching the proof of the uniqueness of geodesics. Since we are not going to present a complete proof of the result, we leave it as a conjecture. The result can be established using a similar technique of analysing local properties of geodesics as in the proof of Theorem 8.
Conjecture 10. The t-space is a space with unique geodesics.
Roadmap of the proof. A careful consideration of the proof of Lemma 9 shows that the statement of the lemma can be strengthened by additionally requiring that the lengths of γ and γ′ are the same.
Assume that there exist two trees T and R and two different geodesics γ 1 and γ 2 between them. Applying the lemma to both of the geodesics, we can assume that they are NNI-paths. It is not hard to see that no two NNI-paths can be geodesics between two completely resolved trees. This is a contradiction.

Conclusion and further directions
We have considered two standard parameterisations of the space of ultrametric phylogenetic trees: (1) using lengths of coalescent intervals and (2) using times of divergence events. By considering suitable polyhedral complexes, we have found two possible representations of the space of trees, called τ-space and t-space respectively. Despite their similarity, the two parameterisations have significantly different geometric and algorithmic properties. We proved that shortest paths, Fréchet mean, standard variance, and some other geometric and statistical characteristics are efficiently computable in τ-space but not in t-space. We have established that the problem of computing shortest paths between trees is NP-hard in t-space. Although t-space has a high algorithmic complexity for computing shortest paths, the space has several properties that are desirable for statistical analysis of tree space. For instance, we proved that the paths that traverse a star-tree are often shortest in τ-space and are almost never shortest in t-space. This feature of t-space is a desirable property for phylogenetic applications, and particularly for summarising posterior samples by a point estimate. Indeed, one of the unpleasant features of BHV space is that parts of the summary tree are often close to the star-tree when incompatible subtrees are supported by the posterior. As we have demonstrated in this paper, this feature is a consequence of a fundamental property of the space. The property is that the measure (volume) of the set of pairs of trees for which the shortest path traverses a star-tree is positive in τ-space (and in BHV space [3]), while in t-space the measure of this set is zero. Thus we expect summary trees produced using t-space to be more informative and realistic.
An obvious direction of further research is to design and implement the algorithms the existence of which is proven in this paper. It will be important to test these algorithms on simulated and real data sets, compare them with known algorithms, and suggest what extra formal properties of a parameterisation of the tree space are desirable. As is suggested in our work, there are other possible ways that equidistant tree space can be parameterised. We have considered two obvious parameterisations and established that they are already quite different. One can easily come up with many other ways the space could be parameterised. The question arises:
Question 1. Is there, in some sense, a single optimal parameterisation of the tree space? If not, what is the class of acceptable parameterisations?
Our paper suggests a number of directions for further theoretical investigations. An important statistical question is
Question 2. What parameterisation should be used for the coalescent model? Birth-death model? Must the parameterisations used for these two models be different?
The first step towards the answer to this question is obviously to consider the coalescent and the birth-death priors in τ- and t-spaces. Are these priors continuous in these spaces? Can the distance between two trees be made a simple function of their prior probabilities?
Although much work has been done investigating CAT(0) simplicial complexes, no satisfactory characterisation of the complexes is known [9]. Further research is needed with an eye towards effectiveness properties. The problem in general is expected to be hard because even constructing non-trivial examples of CAT(0) simplicial complexes requires significant effort and only a few such examples are known [9]. In this paper, we have provided such an example, namely the t-space. Hence the following question, which we ask for t-space, is also important for CAT(0) simplicial complexes in general.
Question 3. Is there a low-degree exponential algorithm for computing shortest paths between trees in t-space? If yes, how large a data set can be handled using the algorithm?
As we have established in this paper, the measure (volume) of the set of pairs of trees between which the shortest path traverses a star-tree is 0 in t-space and is positive in τ-space. This measure is positive in BHV space [3] too. Hence the obvious question is to find this measure. More precisely:
Question 4. Let µ n be the measure of the set of pairs of trees on n taxa between which the geodesic is a cone-path¹⁶. What is the value of µ n for BHV space? For τ-space? Is the sequence {µ n } n∈ω convergent? If so, what is the limit lim n µ n ? What is the meaning of this limit? Clearly, µ 3 = 1 in both BHV and τ-space. To find µ 4 is an entertaining exercise.