Heading in the right direction? Using head moves to traverse phylogenetic network space

Head moves are a type of rearrangement move for phylogenetic networks. They have mostly been studied as part of more encompassing types of moves, such as rSPR moves. Here, we study head moves in their own right. We show that the tiers ($k>0$) of phylogenetic network space are connected by local head moves. Then we show that tail moves and head moves are closely related: sequences of tail moves can be converted to sequences of head moves and vice versa, changing the length by at most a constant factor. Because the tiers of network space are connected by rSPR moves, this gives a second proof of the connectivity of these tiers. Furthermore, we show that these tiers have small diameter by reproving the connectivity a third time. As the head move neighbourhood is in general small, this makes head moves a good candidate for local search heuristics. Finally, we prove that finding the shortest sequence of head moves between two networks is NP-hard.


Introduction
For biologists, it is vital to know the evolutionary history of the species they study. Evolutionary histories are, among other things, needed to find the reservoir/initial infection for some disease [e.g., 10,23], or to learn about the evolution of genes, giving us insight into how they work [e.g., 27,31,18,11,1].
These histories are traditionally represented as phylogenetic trees. This focus on trees has recently started shifting towards phylogenetic networks, in which more biological processes can be represented (Figure 1). These biological processes, such as hybridization and horizontal gene transfer, are collectively known as reticulate evolutionary events, because they cause a reticulate structure in the representation of the history.
Phylogenetic networks are generally harder to reconstruct than trees. Reconstruction most often takes the form of an optimization problem. Certain phylogenetic optimization problems can still be solved quickly, even if they involve networks [e.g., 30]. However, most of these problems are already hard when they involve trees, and they do not get easier when networks are introduced as well (e.g., ML based reconstruction [25]). In such cases, some kind of local search is often employed [6,22,24,23]. This is a process where the goal is to find a (close to) optimal tree by exploring the space of trees making only small changes, called rearrangement moves.
Several rearrangement moves have long been studied for phylogenetic trees. The most prominent ones are Nearest Neighbour Interchange (NNI), Subtree Prune and Regraft (SPR), and Tree Bisection and Reconnection (TBR) [6,26]. The last decade has seen a surge in research on rearrangement moves for phylogenetic networks based on these moves for trees. There are several ways of generalizing the moves to networks, which means there is a relatively large number of moves for networks, including rSPR moves [9], rNNI moves [9], SNPR moves [2], tail moves [17], and head moves ( Figure 2).
All these moves are similar in that they only change the location of one edge in the network. Hence, many properties of the search spaces defined by different moves can be related. In this paper we study relations of such properties for the spaces corresponding to tail and head moves. The results we obtain can also be used in the study of rSPR moves, because rSPR moves consist of tail moves and head moves; and they can be used for the study of SNPR moves for the same reason.
We start by proving that each tier of phylogenetic network space is connected by distance-2 head moves, but not by distance-1 head moves (Section 3). Then, in Section 4 we prove that each head move can be replaced by at most 16 tail moves, and each tail move can be replaced by at most 15 head moves. This not only reproves connectivity of tiers of head move space, but also gives relations for distances between two networks measured by different rearrangement moves (including head moves and tail moves). In Section 5, we prove the upper bound 6n + 6k − 1 for the diameter of tier-k of network space with n taxa. Lastly, in Section 6, we prove that computing the head move distance between two networks is NP-hard.

Figure 2: Top: the tail move (u, v) to (x, y); bottom: the head move (u, v) to (x, y). On the left, the starting networks in which the moving edges are coloured. The right networks are the resulting networks after the moves, with the moved edge coloured differently. The middle graph is a combination of the left and the right network, with the moving edge coloured differently. The solid coloured edge is the moving edge of the network before the move, the dashed coloured edge is the moving edge of the network after the move. We distinguish the moves with edge colours: blue is a tail move, orange is a head move.
Incoming edges of reticulation nodes are called reticulation edges. The reticulation number of N is the number of reticulation nodes. The set of all networks with reticulation number k is called tier-k of phylogenetic network space.
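As an illustration of these notions, the following minimal sketch computes reticulation nodes, reticulation edges and the reticulation number. It assumes a network is stored as a set of directed edges `(tail, head)` and that reticulation nodes are exactly the nodes with indegree 2; the function names and the example network are hypothetical and not taken from the paper.

```python
from collections import Counter

def reticulation_nodes(edges):
    """Nodes with indegree 2 (assumed here to be the reticulation nodes)."""
    indegree = Counter(head for _, head in edges)
    return {node for node, degree in indegree.items() if degree == 2}

def reticulation_edges(edges):
    """Incoming edges of reticulation nodes."""
    reticulations = reticulation_nodes(edges)
    return {(u, v) for (u, v) in edges if v in reticulations}

def reticulation_number(edges):
    """The tier of network space in which the network lies."""
    return len(reticulation_nodes(edges))

# Hypothetical network on leaves a and b with a single reticulation r.
example = {("root", "t"), ("t", "s"), ("t", "r"),
           ("s", "r"), ("s", "a"), ("r", "b")}
print(reticulation_number(example))
```

Under these assumptions the example network lies in tier-1, and its two reticulation edges are `(t, r)` and `(s, r)`.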
For simplicity, we will often refer to binary phylogenetic networks as phylogenetic networks or as networks. Phylogenetic trees are phylogenetic networks without reticulation nodes. Many phylogenetic problems start with a set of phylogenetic trees and ask for a 'good' network for this set of trees. This often means that the trees must be contained in this network in the following sense.
Definition 2 Subdividing an edge (u, v) consists of deleting it and adding a node x and edges (u, x) and (x, v). A subdivision of a digraph G is any graph obtained from G by repeatedly subdividing edges.
The reverse operation is suppressing an indegree-1, outdegree-1 node. Let x be such a node with in-edge (u, x) and out-edge (x, v), then suppressing x consists of removing the edges (u, x) and (x, v) and the node x, and then adding an edge (u, v).
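The two operations can be sketched as follows, again assuming that a digraph is stored as a set of directed edges (so parallel edges cannot be represented); the helper names are hypothetical.

```python
def subdivide(edges, edge, new_node):
    """Delete edge (u, v) and add the edges (u, x) and (x, v) for a new node x."""
    if edge not in edges:
        raise ValueError("edge to subdivide is not in the digraph")
    u, v = edge
    return (set(edges) - {edge}) | {(u, new_node), (new_node, v)}

def suppress(edges, x):
    """Suppress the indegree-1, outdegree-1 node x."""
    parents = [a for (a, b) in edges if b == x]
    children = [b for (a, b) in edges if a == x]
    if len(parents) != 1 or len(children) != 1:
        raise ValueError("node must have indegree 1 and outdegree 1")
    u, v = parents[0], children[0]
    return (set(edges) - {(u, x), (x, v)}) | {(u, v)}

# Subdividing an edge and then suppressing the new node restores the digraph.
tree = {("root", "t"), ("t", "a"), ("t", "b")}
subdivided = subdivide(tree, ("t", "a"), "x")
restored = suppress(subdivided, "x")
```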
Definition 3 A tree T can be embedded in a network N if there exists an X-labelled subgraph of N that is a subdivision of T . We say T is an embedded tree of N . The corresponding map, which sends nodes of T to nodes of N and edges of T to directed paths in N , is called an embedding of T into N .
Because a network is a DAG, there are unambiguous ancestry relations between the nodes. We draw networks with their root at the top, and the leaves at the bottom. This induces the following terminology.
Definition 4 Let u, v be nodes in a network N . Then we say:
• u is above v, and v is below u, if there is a directed path from u to v in N ;
• u is directly above v, and v is directly below u, if there is an edge (u, v) in N .
Similarly, we say an edge (u, v) is above a node w or above an edge (w, z) if v is above w. An edge (u, v) is below a node z or an edge (w, z) if u is below z.
Alternatively, if u is above v, we say that u is an ancestor of v, and if u is directly above v, we say that u is a parent of v and v a child of u. A Lowest Common Ancestor (LCA) of two nodes u and v is a node z which is above both u and v, such that there are no other such nodes below z.
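The ancestry relations and the LCA definition can be made concrete with a short sketch. It assumes a network is a set of directed edges and, as a labelled assumption, treats a path of length zero as a directed path, so every node is above itself; the names and the example network are hypothetical.

```python
def reachable(edges, start):
    """Nodes reachable from start by a directed path (start included)."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for tail, head in edges:
            if tail == node and head not in seen:
                seen.add(head)
                stack.append(head)
    return seen

def is_above(edges, u, v):
    """True if there is a directed path from u to v (length 0 allowed)."""
    return v in reachable(edges, u)

def lowest_common_ancestors(edges, u, v):
    """Common ancestors z of u and v with no other common ancestor below z."""
    nodes = {node for edge in edges for node in edge}
    common = {z for z in nodes if is_above(edges, z, u) and is_above(edges, z, v)}
    return {z for z in common
            if not any(w != z and is_above(edges, z, w) for w in common)}

# Hypothetical tier-2 network in which the leaves a and b share two parents
# of reticulations, p and q, neither of which is above the other.
network = {("root", "t"), ("t", "p"), ("t", "q"),
           ("p", "r1"), ("p", "r2"), ("q", "r1"), ("q", "r2"),
           ("r1", "a"), ("r2", "b")}
lcas = lowest_common_ancestors(network, "a", "b")
```

In this example both p and q satisfy the definition for the pair a, b.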
Unlike for trees, there may not be a unique LCA for two nodes in a network. An important substructure in a network is the triangle. This structure has a big impact on the rearrangements we can do in a network.
Definition 5 Let N be a network and t, s, r nodes of N . If there are edges (t, s), (t, r) and (s, r) in N , then we say t, s and r form a triangle in N . We call t the top of the triangle, s the side of the triangle, and the reticulation r the bottom of the triangle.
The following observation describes one property of triangles that we will use frequently.

JGAA, 25(1) 263-310 (2021) 267
Observation 1 Let N be a network with a triangle u, s, v, where s is a tree node. Then reversing the direction of the edge (s, v) gives a network N'. We say the direction of the triangle is reversed. This can be achieved by the distance-1 head move which moves (u, v) to the outgoing edge of s that is not part of the triangle (Figure 3).

Definition 6 Let X = {x_1, ..., x_n} be an ordered set of labels. The caterpillar C(X) is the tree defined by the Newick string $(\cdots((x_1, x_2), x_3) \cdots, x_n);$
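The Newick string of a caterpillar can be generated directly from the ordered label set. The following is a minimal sketch; the function name is hypothetical and at least two labels are assumed.

```python
def caterpillar_newick(labels):
    """Newick string of the caterpillar C(X) for an ordered list of labels."""
    if len(labels) < 2:
        raise ValueError("a caterpillar needs at least two labels")
    newick = f"({labels[0]},{labels[1]})"
    for label in labels[2:]:
        # Each further label is attached as a pendant leaf above the current tree.
        newick = f"({newick},{label})"
    return newick + ";"

print(caterpillar_newick(["x1", "x2", "x3", "x4"]))  # (((x1,x2),x3),x4);
```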

Rearrangement moves
The main topics of this paper are head and tail moves, which are types of rearrangement moves on phylogenetic networks. Several types of moves have been defined for rooted phylogenetic networks. The most notable ones are tail moves [17], rooted Subtree Prune and Regraft (rSPR) and rooted Nearest Neighbour Interchange (rNNI) moves [9] and SubNet Prune and Regraft (SNPR) moves [2]. These moves typically change the location of one endpoint of an edge, or they remove or add an edge. We now introduce the basic notions of head and tail moves, following the presentation of [17]. A tail move of an edge (u, v) to an edge f consists of the following steps:
1. delete (u, v);
2. subdivide f with a new node u';
3. suppress the indegree-1 outdegree-1 node u;
4. add the edge (u', v).
Tail moves are only allowed if the resulting digraph is still a network (Definition 1). We say that a tail move is a distance-d tail move if, after step 2, a shortest path from u to u' in the underlying undirected graph has length at most d + 1. The networks before and after a tail or head move always lie in the same tier. Hence, we say these moves are horizontal moves; this is to contrast them with moves that change the reticulation number, which we call vertical moves. Note that rSPR moves are horizontal moves as well. In fact, an rSPR move is either a head move or a tail move. Similarly, rNNI moves are distance-1 rSPR moves (i.e., distance-1 head or tail moves). SNPR moves, however, may be vertical as well: an SNPR move is either a tail move, or a vertical move that simply removes or adds an edge.
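The steps of a tail move translate almost literally into code. The sketch below assumes networks are sets of directed edges, and it assumes that a head move is the mirror image of a tail move (delete the edge, subdivide the target with a new head, suppress the old head, add the edge to the new head). All function names are hypothetical, and no validity check is performed here: because edges are stored in a set, a move that would create parallel edges is silently collapsed.

```python
def subdivide(edges, edge, new_node):
    if edge not in edges:
        raise ValueError("target edge is not in the network")
    u, v = edge
    return (edges - {edge}) | {(u, new_node), (new_node, v)}

def suppress(edges, x):
    (u,) = [a for (a, b) in edges if b == x]  # fails unless indegree is 1
    (v,) = [b for (a, b) in edges if a == x]  # fails unless outdegree is 1
    return (edges - {(u, x), (x, v)}) | {(u, v)}

def tail_move(edges, moving, target, new_tail):
    u, v = moving
    edges = set(edges) - {moving}                 # delete (u, v)
    edges = subdivide(edges, target, new_tail)    # subdivide f with u'
    edges = suppress(edges, u)                    # suppress u
    return edges | {(new_tail, v)}                # add (u', v)

def head_move(edges, moving, target, new_head):
    u, v = moving
    edges = set(edges) - {moving}                 # delete (u, v)
    edges = subdivide(edges, target, new_head)    # subdivide f with v'
    edges = suppress(edges, v)                    # suppress v
    return edges | {(u, new_head)}                # add (u, v')

# Hypothetical triangle t, s, r: move the head of (t, r) to the edge (s, a),
# the outgoing edge of s that is not part of the triangle (cf. Observation 1).
triangle = {("root", "t"), ("t", "s"), ("t", "r"),
            ("s", "r"), ("s", "a"), ("r", "b")}
after_head_move = head_move(triangle, ("t", "r"), ("s", "a"), "r2")

# Hypothetical tree: move the tail of (u, a) to the edge (t, c).
tree = {("root", "t"), ("t", "u"), ("t", "c"), ("u", "a"), ("u", "b")}
after_tail_move = tail_move(tree, ("u", "a"), ("t", "c"), "u2")
```

Subdividing the target before suppressing the old endpoint keeps the sketch well defined even when the target edge is incident to that endpoint.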

Validity of moves
As we want to use rearrangement moves to traverse network space, each move must result in a phylogenetic network. The definitions in the previous subsection enforce that this always happens for tail and head moves. In this paper, we often propose a sequence of moves by stating: move the tail of edge e to edge f , then move the head of edge e to f and so forth. We then check whether these moves are valid, or allowed; that is, whether applying the steps in the definitions of the previous subsection produces a phylogenetic network. A necessary condition for a rearrangement move to be valid is that the moving edge is 'movable', which ensures that 'detaching' the edge does not create parallel edges.

Definition 9 Let (u, v) be an edge in a network N , then (u, v) is tail-movable if u is a tree node with parent p and other child c, and there is no edge (p, c) in N .
This is equivalent to saying that an edge with tail u is tail-movable if u is a tree node and u is not the side of a triangle. We give a similar definition for head moves.
Definition 10 Let (u, v) be an edge in a network N , then (u, v) is head-movable if v is a reticulation node with other parent p and child c, and there is no edge (p, c) in N .
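Definitions 9 and 10 can be checked mechanically. The sketch below assumes a network is a set of directed edges, that tree nodes have indegree 1 and outdegree 2, and that reticulation nodes have indegree 2 and outdegree 1; the function names and example networks are hypothetical.

```python
def parents(edges, x):
    return [a for (a, b) in edges if b == x]

def children(edges, x):
    return [b for (a, b) in edges if a == x]

def is_tail_movable(edges, edge):
    """Definition 9: u is a tree node and there is no edge (p, c)."""
    u, v = edge
    ps, cs = parents(edges, u), children(edges, u)
    if edge not in edges or len(ps) != 1 or len(cs) != 2:
        return False
    p = ps[0]
    c = next(x for x in cs if x != v)  # the other child of u
    return (p, c) not in edges

def is_head_movable(edges, edge):
    """Definition 10: v is a reticulation and there is no edge (p, c)."""
    u, v = edge
    ps, cs = parents(edges, v), children(edges, v)
    if edge not in edges or len(ps) != 2 or len(cs) != 1:
        return False
    p = next(x for x in ps if x != u)  # the other parent of v
    c = cs[0]
    return (p, c) not in edges

# Hypothetical triangle t, s, r: detaching (s, a) would leave parallel edges (t, r).
triangle = {("root", "t"), ("t", "s"), ("t", "r"),
            ("s", "r"), ("s", "a"), ("r", "b")}
# Hypothetical network where the other parent p of v also has an edge to the child c of v.
stacked = {("root", "t"), ("t", "u"), ("t", "p"), ("u", "v"), ("u", "a"),
           ("p", "v"), ("p", "c"), ("v", "c"), ("c", "b")}
```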
When the type of move is clear from context, we will simply use the term movable. Using the concept of movability, we can now succinctly give sufficient conditions for a move to be valid. Besides movability, we need additional conditions to make sure that reattaching the edge does not create parallel edges, and that the resulting network has no cycles. These correspond to the second and third conditions in the following lemma.
Lemma 1 A tail move (u, v) to (s, t) is valid if all of the following hold:
• (u, v) is tail-movable;
• v ≠ t;
• v is not above s.
Proof: Because (u, v) is tail-movable, the removal of (u, v) and subsequent suppression of u does not create parallel edges. Because v ≠ t, subdividing (s, t) with a node u' and adding the edge (u', v) does not create parallel edges either. Hence, the resulting digraph N' of the tail move contains no parallel edges. Now suppose N' has a cycle. As each path that does not use (u', v) corresponds to a path in N , the cycle must use (u', v). This means that there is a path from v to u' in N'. Because u' is a tree node with parent s, there must also be a path from v to s in N'. This implies there was also a path from v to s in N , but this contradicts the third condition: v is not above s. We conclude that N' is a DAG. As all labelled nodes remain unchanged by the tail move, N' is a phylogenetic network and the tail move is valid.
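The three conditions used in this proof (the moving edge is tail-movable, v differs from t, and v is not above s) give a simple sufficient test for validity. The sketch below assumes networks are sets of directed edges and, as a labelled assumption, counts a node as above itself; the function names and the example tree are hypothetical.

```python
def reachable(edges, start):
    """Nodes reachable from start by a directed path (start included)."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for tail, head in edges:
            if tail == node and head not in seen:
                seen.add(head)
                stack.append(head)
    return seen

def is_tail_movable(edges, edge):
    u, v = edge
    ps = [a for (a, b) in edges if b == u]
    cs = [b for (a, b) in edges if a == u]
    if len(ps) != 1 or len(cs) != 2:
        return False  # u is not a tree node
    other_child = next(x for x in cs if x != v)
    return (ps[0], other_child) not in edges

def tail_move_is_valid(edges, moving, target):
    """Sufficient conditions for the tail move of `moving` to `target`."""
    if moving not in edges or target not in edges:
        return False
    (u, v), (s, t) = moving, target
    return (is_tail_movable(edges, moving)
            and v != t                          # no parallel edges on reattaching
            and s not in reachable(edges, v))   # v is not above s: no cycle

# Hypothetical tree used to exercise the three conditions.
tree = {("root", "t"), ("t", "u"), ("t", "c"), ("u", "a"),
        ("u", "w"), ("w", "x"), ("w", "y")}
print(tail_move_is_valid(tree, ("u", "a"), ("t", "c")))  # True
print(tail_move_is_valid(tree, ("t", "u"), ("w", "x")))  # False: u is above w
```

Since the conditions are only sufficient, a `False` result from this sketch does not by itself show that a move is invalid.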
The proof of the corresponding lemma for head moves is completely analogous.
Lemma 2 A head move (u, v) to (s, t) is valid if all of the following hold:
• (u, v) is head-movable;
• u ≠ s;
• t is not above u.
We will very frequently use the following corollary of this lemma, which makes it very easy to check whether some moves are valid.
Corollary 1 Let (u, v) be a tail-movable edge, then moving the tail of (u, v) to an edge above u is allowed. We also say that moving the tail of (u, v) up is allowed. Similarly, moving the head of a head-movable edge down is allowed.

Lemma 3 Let N be a network and $\mathcal{T}$ its set of embedded trees, and let N' with embedded trees $\mathcal{T}'$ be the result of one head move in N . Then there is a tree $T \in \mathcal{T}$ which is embedded in N'. Furthermore, for each $T' \in \mathcal{T}'$ there is a tree $T \in \mathcal{T}$ at most one tail move away from T'.
Proof: Let (u, v) be the edge that is moved in the head move from N to N'. Then v is a reticulation and it has another incoming edge (w, v). There is an embedded tree T of N that uses this edge, and therefore does not use (u, v). This means that changing the location of (u, v) does not change the fact that T is embedded in the network.
For the second part: first suppose the embedding of T' in N' does not use the new edge (u, v'). Then clearly T' can be embedded in N' without the edge (u, v'). This means it can also be embedded in N . Now suppose the embedding of the edge (t, z) of T' in N' uses the new edge (u, v'). Let P be the path through the embedding of T' in N' starting at the image of t, and ending at v'. Note that this path passes through (u, v'). Now consider the tree obtained by taking the embedding of T', removing P and adding a path of reticulation edges leading from a node s in the embedding of T' to v' via the other incoming edge of v'. This tree only uses edges that are also in N . Hence, it is embedded in N and it is at most one tail move away from T': the one that moves the subtree below v' to the edge of T' whose embedding is a path containing s.

Phylogenetic network spaces
Using rearrangement moves, we can not only consider phylogenetic networks as sets [e.g. 8], but also as spaces. These spaces can be defined as graphs, where each network is represented by a node, and there is an edge between two networks N and N' if there exists a rearrangement move changing N into N'. Several properties of these phylogenetic network spaces have been studied.
The most basic of these properties is connectivity. The introduction of each network rearrangement move was followed by a proof of connectivity of the corresponding spaces [rSPR and rNNI moves: 9] [NNI, SPR and TBR moves: 13] [SNPR moves: 2] [tail moves : 17].
Note that spaces that take the shape of a connected graph come with a metric: the distance between two nodes. Hence, a natural follow up question is to ask about the distances between pairs of networks.

Definition 11
Let N and N' be phylogenetic networks with the same leaf set in the same tier. We denote by $d_M(N, N')$ the distance between phylogenetic networks N and N' using rearrangement moves of type M . That is, $d_M(N, N')$ is the minimum number of M -moves needed to change N into N'.
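Viewed on the space graph, this distance is an ordinary shortest-path length, which a breadth-first search computes. The sketch below is generic: `neighbours` is a hypothetical callback returning the networks one move away, and the toy space is an invented adjacency table, not a real network space. In practice networks would need a canonical form up to isomorphism, and the space grows too quickly for exhaustive search on all but tiny instances.

```python
from collections import deque

def move_distance(start, goal, neighbours):
    """Minimum number of moves from start to goal, or None if unreachable."""
    if start == goal:
        return 0
    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours(current):
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                if nxt == goal:
                    return distance[nxt]
                queue.append(nxt)
    return None

# Toy space with hypothetical network names; N5 lies in a different component.
toy_space = {"N1": ["N2"], "N2": ["N1", "N3"], "N3": ["N2", "N4"],
             "N4": ["N3"], "N5": []}
print(move_distance("N1", "N4", toy_space.__getitem__))  # 3
```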
For phylogenetic trees, the distance between two trees is nicely characterized by a concept known as agreement forests. Recently, agreement forest analogues for networks have been introduced [21,20], which bound distances between networks but do not give the exact distances, except in some special cases. However, not much more is known about such distances for a given pair of networks. The only other known bounds relate to the diameters: the maximal distance between any pair of networks in a phylogenetic network space.
Definition 12 Let $k \in \mathbb{Z}_{\geq 0}$ be the number of reticulations, $n \in \mathbb{Z}_{\geq 2}$ be the number of leaves and M a type of rearrangement move. We denote with $\Delta^M_k(n)$ the diameter of tier-k of phylogenetic network space with n leaves using moves of type M :
$$\Delta^M_k(n) = \max_{N, N'} d_M(N, N'),$$
where N, N' are tier-k networks with n leaves.
For all previously introduced moves, some asymptotic bounds on the diameters are known ( [13,17,2]). For SNPR moves, the diameter is unbounded as each vertical move can only change the reticulation number by at most one. For all moves, the diameter of each tier of phylogenetic networks is finite.
The last property we discuss is the neighbourhood size of a phylogenetic network: the number of networks that can be reached using one rearrangement move. The size of the neighbourhood is important for local search heuristics, as it gives the number of networks that need to be considered at each step. For networks, the only rearrangement move neighbourhood that has been studied is that of the SNPR move [19].

Connectivity
In this section, we consider the connectivity of tiers of network space under local head moves. One might hope that distance-1 head moves are enough to reach any network from an arbitrary other network in the same tier. For tail moves, such a result has been proved [17], so it seems reasonable to expect a similar result for head moves. We prove that this is, unfortunately, not the case. However, we will show that distance-2 head moves do suffice.

Distance-1 is not enough
We show by example that distance-1 head moves are not enough to connect the tiers of phylogenetic network space ( Figure 6). For tier-1 networks, this example can easily be checked, as there are no distance-1 head moves in the left network that result in a different network. For higher tiers, however, there remain many valid distance-1 head moves. Using the following lemma, we will show that the reticulations remain roughly at the same place in all the resulting networks.
Lemma 4 Let N be a network, and N' the result of a distance-1 head move in N . If, in N , all reticulations and their parents are below some tree node s, then the same holds for N'.
Proof: Suppose the head move between N and N' moves (u, v) from (x, y) to (z, w). Let q be a reticulation or a parent of a reticulation in N , with q not equal to v. Then there is a path from s to q in N . Furthermore, we may assume that this path does not pass through (u, v), as it could alternatively use a path using the other in-edge of v. Hence, after the head move, q is still below s. Now note that the parent of v in N' is below s. If v itself were above s in N', there would be a cycle in N'. Hence, v is below s in N' as well. We conclude that all reticulations and their parents are below s in N'.
Proposition 1 In all tiers of phylogenetic space with n ≥ 3 leaves, there exist two networks not connected by a sequence of distance-1 head moves.
Proof: For tier-0, no head moves are possible, but there are non-isomorphic networks (trees) with no reticulations and at least 3 leaves. This proves the proposition for tier-k where k = 0. We now prove the proposition for tier-k for an arbitrary k > 0.
Let T be a caterpillar on n ≥ 3 leaves, and let s be the common parent of two of the leaves, and let t be the highest tree node in T . Now construct the network N by adding k reticulation edges between the outgoing edges of s, and the network N' by adding k reticulation edges between the outgoing edges of t (Figure 6). In N , all of the reticulations and their parents are below s. Lemma 4 implies that, using distance-1 head moves, only networks with all reticulations below s can be reached. Furthermore, because no part above s can ever be involved, the caterpillar structure above s will remain intact. Hence, any network reachable from N using distance-1 head moves consists of a chain of pendant leaves followed by the node s, which must still be above all reticulations and parents of reticulations. Now note that N' is not such a network. We conclude that there is no distance-1 head move sequence between N and N'.
Proposition 2 All tiers of phylogenetic network space with one leaf are connected by distance-1 head moves.
Proof: This follows from the fact that all tiers of phylogenetic network space with one leaf are connected by distance-1 tail moves [17]. Indeed, if one reverses the direction of all edges, a network with one leaf becomes another network with one leaf, and each distance-1 tail move in the reversed network is a distance-1 head move in the original network.
This shows one should not use only distance-1 head moves in local-search heuristics. However, if one wants to use them as part of a search strategy, it would still be interesting to know how disconnected spaces of distance-1 head moves are. For now, we will leave this question open, and pragmatically turn our attention to distance-2 head moves.

Distance-2 suffices
To prove the connectivity of tiers of network space using distance-2 head moves, we present a procedure to generate a sequence between any two networks in the same tier. This sequence first turns both networks into networks that look like a tree, with all reticulations collected at the top. Next, the tree structure of these networks is adjusted, by simulating rSPR moves on the trees using distance-2 head moves.

Collecting the reticulations at the top
In this subsection, we show how all reticulations can be collected at the top of the network using distance-2 head moves. This will be achieved by creating triangles, and moving these through the network. First, we define what it means for all reticulations to be at the top.

3) the edges (c, a_1) and (c, b_1);

We say there are k reticulations neatly at the top if they are all directed to the same edge, i.e. we replace point 4) with

Examples are shown in Figure 7. The following lemma ensures that the top reticulations can be directed neatly using local head moves. The moves are similar to the one used to change the direction of a triangle (cf. Observation 1).

Definition 14
Let N be a network with k reticulations at the top. Changing the direction of an edge (a_i, b_i) (as in Definition 13) consists of changing N into a network N' that is isomorphic to N when (a_i, b_i) is replaced by (b_i, a_i). Note that labels a_j and b_j do not coincide between N and N'. Changing the direction of a set of such edges at the same time is defined analogously.
Lemma 5 Let N be a network with k reticulations at the top. Then the reticulations can be redirected so that they are neatly on top (directed to either edge) with at most k distance-1 head moves. The network below a_k and b_k (notation as in Definition 13) is not altered in this process.
Proof: We redirect the top reticulations starting with the lowest one. The move (u_{i-1}, b_i) to (a_i, v_{i+1}), with u_{i-1} the parent of b_i that is not a_i and v_{i+1} the child of a_i that is not b_i (Figure 8), changes the direction of the chosen edge (a_i, b_i) and all the reticulation edges (a_j, b_j) above; it leaves all other edges as they were.
274 Remie Janssen Heading in the right direction?
Definition 15 Let N be a network with k reticulations at the top (notation as in Definition 13) and a tree node x directly below a_k. Moving a triangle from the top consists of creating a triangle at x by a head move of (a_k, b_k) to one of the outgoing edges (x, c(x)) of x. Moving a triangle to the top is the reverse of this operation.
Lemma 6 Triangles can be moved along between tree nodes, and to/from the top using distance-2 head moves.
Proof: Suppose the network has a triangle consisting of the edges (u, v), (u, w), and (v, w) with u the child of a tree node s. Let the other children of s, v, and w be a, b, and c respectively. To move the triangle up to s, we use the following sequence of distance-2 head moves. Move (v, w) to (s, a), move (s, w ) to (u, v), move (u, w ) to (v, b), and move (v, w ) to (s, u) ( Figure 9). None of the intermediate networks in the sequence contain a directed cycle or parallel edges, unless a = b. However, in that case, the move (u, w) to (s, a) is a distance-2 head move that moves the triangle up. Now suppose the network has k reticulations at the top, and there is a triangle (u, v, w) below s = a k . To move the triangle to the top, first move the triangle up using the previous sequence of moves; then reverse the direction of the triangle using the distance-1 head move (s, w ) to (v, b k ), resulting in the triangle (s, v, w ); and, lastly, move (v, w ) to the outgoing edge of b k .
If the restriction to distance-2 moves is relaxed, the triangle can also be moved using one distance-3 head move: (u, w) to (s, a).
Lemma 7 Let N be a network and v a highest reticulation below the top reticulations. Suppose (u, v) to (x, y) is a valid head move resulting in a network N . Then there is a sequence of distance-2 head moves from N to N .
Proof: Pick an up-down path from v to (x, y) not via (u, v). Note that if there is a part of this path above u, it is also above v and therefore only contains tree nodes. Sequentially move the head of (u, v) to the pendant branches of this path as in Figure 10. Two places on the path need special care: the point where u is on the up-down tree path (the obvious move is a distance-3 move), and the top.
Note that at the top, we need to move the head to the lowest reticulation edge at the top. This is of course only possible if this reticulation edge is directed away from u. If it is not, we redirect it using one distance-1 head move (Lemma 5), and redirect it back after we move the moving head down to the other branch of the up-down tree path.
If u is on the up-down path, we use Lemma 6 to pass this point: Let c(u) be the other child of u (not v) and p(u) the parent node of u; moving the head from a child edge of c(u) to the other child edge (p(u), w) of p(u) is equivalent to moving the triangle at c(u) to a triangle at p(u).
We have to be careful, because if the child of u is not a tree node, this sequence of moves does not work. However, if c(u) is a reticulation node, there exists a different up-down path from v to (x, y) not through u: such a path may use the other incoming edge of c(u).
At all other parts of the up-down path, the head may be simply moved to the edges on the path. Using these steps, we can move the head of (u, v) to (x, y) with only distance-2 head moves.
Using these lemmas, it is easy to prove we can use distance-2 head moves to move reticulations to the top.

Lemma 8 Let N be a tier-k network, then there is a sequence of distance-2 head moves turning N into a network with all reticulations at the top.
Proof: Note that the network induces a partial order on the reticulation nodes. Suppose N has l < k reticulations at the top. Let r be a highest reticulation node that is not yet at the top. One of the two corresponding reticulation edges is head-movable. Let this be the edge (s, r). If s is a child of a_l or b_l (as in Definition 13; i.e., s is directly below the top reticulations), then one head move suffices to get this reticulation to the top. By Lemma 7, this move can be replaced by a sequence of distance-2 head moves. Otherwise, there is at least one node between s and the top. Let t be the lowest such node, so that t is the parent of s. Because r is a highest reticulation that is not at the top, t is a tree node and there are edges (t, s) and (t, q). Moving the head of (s, r) to (t, q) is a valid head move that creates a triangle. By Lemma 7, this head move can be replaced by a sequence of distance-2 head moves. Now we move this triangle to the top using distance-2 head moves as in Lemma 6. This increases the number of reticulations at the top by one.
Figure 10 (caption): Note that on the side of the tree containing the tail of the moving edge, we use the side branches to avoid cycles. The numbers represent the order of the distance-2 head moves. Note that the move from position 0 to position 1 is not a distance-2 head move; in this case we use the sequence of moves described in Lemma 6. Also note that position 4 is only allowed when the lowest reticulation at the top is directed away from the tail of the moving edge.

Changing the tree
Networks with all reticulations at the top have exactly one embedded tree. As such networks are essentially determined by this tree, we need to change this embedded tree. To achieve this, we use the lowest reticulation edge (a k , b k ) to create a triangle that can move around the lower part of the network. Using the reticulation in this triangle, we simulate rSPR moves on the embedded tree.
Lemma 9 Let N and N' be tier-k networks on the same leaf set with k − 1 reticulations at the top and the k-th reticulation at the bottom of a triangle. Suppose N and N' have the same embedded trees, then there exists a sequence of distance-2 head moves from N to N'.
Proof: Note that the network consists of k − 1 reticulations at the top, and two pendant subtrees, isomorphic to the two pendant subtrees below the highest tree node of the embedded tree, one of which contains a triangle. The triangle can be moved through one of these subtrees using Lemma 6.
To move the triangle anywhere, we need to be able to move it between the pendant subtrees as well. This can be done by moving the triangle to the top, and then moving it down on the other side after redirecting all the top reticulations, using Lemma 6. None of these triangle moves change the embedded tree: each of the intermediate networks has exactly one embedded tree, and doing a head move keeps at least one embedded tree. Hence, moving the triangle to the right place and then redirecting the triangle and the top reticulations as needed gives a sequence from N to N'.
Lemma 10 Let N and N' be tier-k networks (k > 0) on the same leaf set with all reticulations neatly at the top. Then there exists a sequence of distance-2 head moves turning N into N'.
Proof: Note that N and N' both have exactly one embedded tree, T and T' respectively, and we aim to change this embedded tree. It suffices to prove this for any T' that is one rSPR move away from T , because the space of phylogenetic trees with the same leaf set is connected by rSPR moves. Hence, let (u, v) to (x, y) be the rSPR move that transforms T into T'. First suppose the rSPR move does not involve the root edge of the embedded trees. We can move triangles anywhere below the k − 1 reticulations at the top by Lemma 9. Hence, there is a sequence of head moves transforming N into a network M with the following properties: the tree T can be embedded in M ; M has a reticulation edge (a, b) where a lies on the image of (x, y) in M , and the head b lies on the image of the other outgoing edge (x, z) of x if x is not the root and on the image of one of the child edges (y, z') of y otherwise.
This creates a situation where there are edges (x, a), (a, b), (p, b), (a, y) and (b, ζ) with p = x and ζ = z, or p = y and ζ = z'. The case p = x is depicted in Figure 11. Now do a head move of (a, b) to the image of the T -edge (u, v). This is allowed because any reticulation edge in a tier-1 network is movable; b is not equal to the image of u, as b is a reticulation node and the image of u a tree node; and the image of v is not above a, as otherwise the tail move (u, v) to (x, y) could not be valid. In the resulting network, the embedded tree using the new reticulation edge is T'.
Next, suppose the rSPR move does involve the root edge of the embedded tree. Let (u, v) to (ρ, c) be such an rSPR move to the root edge of the tree, and let x and y be the nodes directly below the top: x on the side of the reticulations b i , and y on the side of the tree nodes a i . Do the rSPR move of (u, v) to (a k , x), that is, to an edge directly below the top. This produces the network with k − 1 reticulations at the top, and (in Newick notation) embedded tree where T ↓ z denotes the part of tree T below z. Then do the rSPR move (u , x) to (b k , y), the other side of the top, producing a network with k − 1 reticulations at the top and embedded tree This creates the desired network with v below one side of the top, and x and y on the other side. Both these rSPR moves are performed as in the previous case, which did not involve the root edge. After the rSPR moves, we move the triangle back to the top without changing the embedded tree, and redirect the top reticulations as needed to produce N .
Theorem 1 Tier-k of phylogenetic networks is connected by distance-2 head moves, for all k > 0.
Proof: Let N and N' be two arbitrary networks in the same tier with the same leaf set. Use Lemma 8 and Lemma 5 to change N and N' into networks N n and N' n with all reticulations neatly at the top using only distance-2 head moves. Now, Lemma 10 tells us that there is a sequence of distance-2 head moves from N n to N' n . Hence, tier-k of phylogenetic network space is connected by distance-2 head moves.
Corollary 2 Tier-k of phylogenetic networks is connected by head moves, for all k > 0.

Relation to tail moves
In this section, we show that each tail move can be replaced by a sequence of at most 15 head moves, and each head move can be replaced by a sequence of at most 16 tail moves.

Tail move replaced by head moves
Here, we show how to replace a tail move by a sequence of head moves (Theorem 2). The proof works by case distinction, where the main cases represent different types of tail moves. The first two lemmas prove that we can replace certain types of distance-1 tail moves: in Lemma 11, we replace a distance-1 tail move between the two outgoing arcs of a tree node, and in Lemma 12, we replace a distance-1 tail move between the two incoming arcs of a reticulation. Then, we turn to the remaining cases, where the tail move (u, v) from (x L , a L ) to (x R , a R ) is such that a L ≠ a R and a L is not above a R (Lemma 15), or a L ≠ a R and a L is above a R (Lemma 16). The former case is split into two lemmas, depending on where the head-movable arcs are located in the network in relation to a R (Lemmas 13 and 14).
In this section, unless stated otherwise, each move is a head move and movable means head-movable.
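As a concrete illustration of what a single head move does to a network, the following minimal Python sketch applies a head move to a network stored as a list of directed edges. It assumes the usual description of a head move of (u, v) to (x, y): delete (u, v), suppress the reticulation v, subdivide (x, y) with a new node, and attach the head of the moved edge to that node. The edge-list representation, the function names `is_acyclic` and `head_move`, and the example network are illustrative choices made here, not part of the paper. Validity is checked directly on the result (no parallel edges, no directed cycles) rather than through Lemma 2.

```python
from collections import Counter


def is_acyclic(edges):
    """Kahn's algorithm: True if the directed graph has no directed cycle."""
    nodes = {n for edge in edges for n in edge}
    indegree = Counter(child for _, child in edges)
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
    queue = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while queue:
        node = queue.pop()
        visited += 1
        for child in children.get(node, []):
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return visited == len(nodes)


def head_move(edges, moving, target, new_node):
    """Move the head of moving = (u, v) to target = (x, y).

    Returns the new edge list, or None if the move is invalid or is not
    handled by this sketch (target edges incident to v are skipped).
    Assumes `edges` describes a valid network and `new_node` is unused.
    """
    u, v = moving
    x, y = target
    if moving not in edges or target not in edges or v in target:
        return None
    parents = [p for p, c in edges if c == v]
    kids = [c for p, c in edges if p == v]
    if len(parents) != 2 or len(kids) != 1:
        return None  # v is not a reticulation
    other_parent = parents[0] if parents[1] == u else parents[1]
    child = kids[0]
    removed = {moving, (other_parent, v), (v, child), target}
    result = [e for e in edges if e not in removed]
    # Suppress v, subdivide the target edge, and re-attach the moved head.
    result += [(other_parent, child), (x, new_node), (new_node, y), (u, new_node)]
    if len(set(result)) != len(result) or not is_acyclic(result):
        return None  # parallel edges or a directed cycle
    return result


# Hypothetical tier-1 example: reticulation r with parents b and c.
network = [("rho", "a"), ("a", "b"), ("a", "c"), ("b", "r"),
           ("c", "r"), ("b", "l1"), ("c", "l2"), ("r", "l3")]
moved = head_move(network, ("b", "r"), ("c", "l2"), "r2")
blocked = head_move(network, ("b", "r"), ("a", "b"), "r2")
```

In this example the first call succeeds, while the second is rejected because the bottom node of the target edge is above the tail of the moving edge, so the move would create a directed cycle.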
to (x, a R ) be a valid tail move in a tier k > 0 network N resulting in a network N'. Then there exists a sequence of head moves from N to N' of length at most 6.
Proof: To prove this, we have to find a reticulation somewhere in the network that we can use, as the described part of the network might not contain any reticulations.
Note that there exists a head-movable reticulation edge (t, r) in N with t not below both a L and a R . To see this, look for a highest reticulation node below a L and a R . If it exists, one of its incoming edges is movable, and this edge cannot be below both. If there is no such reticulation, then there is a reticulation r that is not below both a L and a R , and so the same holds for its movable edge.
First assume we find a head-movable edge (t, r) with t = x. Note that r cannot be the same node as u, as u is a tree node and r is a reticulation. This means that r = a R , and (x, a R ) is movable. Move (x, a R ) to (u, a L ), which is allowed because x ≠ u, a L is not above x, and (t, r) = (x, a R ) is movable. Now moving (u, r ) to (s, z), the edge created by suppressing a R after the previous move, we get network N'.
Now assume we find a head-movable edge (t, r) with x ≠ t. Suppose w.l.o.g. that (t, r) is not below a L . Then we can use the following sequence of 4 moves, except in the cases we mention below the steps. For this lemma, we call this sequence the 'normal' sequence (Figure 12). The validity of each move is checked using Lemma 2.
which it is by choice of (t, r); t ≠ u, but we note that t = u may occur; and, a L is not above t, which is true by choice of (t, r).
• Move (u, r ) to (x, a R ), creating edge (t, a L ) as (x, a R ) ≠ (t, r ) and (x, a R ) ≠ (r , a L ). For this move, note that (u, r ) is movable, except when (t, a L ) ∈ N; u ≠ x as these nodes are distinct in the original network; and a R is not above u, as it is not above x.
• Move (x, r ) to (t, a L ). The edge (x, r ) is movable because v ≠ a R (otherwise the tail move would not be valid); t ≠ x, as we have assumed so for the first move; and a L is not above x, as it is below x.
• Move (t, r ) back to its original position. This move is allowed because this produces the network N'.
We now look at the situations where t = x, t = u, or (t, a L ) ∈ N separately. We split into cases to keep the proof clear. Recall that (t, r) is a movable edge in N.
(a) r = a L . Move (t, r) to (x, a R ), then move (x, r ) back to the original position of r, creating N . This is a sequence of 2 head moves.
, creating a triangle at x. Now reverse the triangle by moving (x, r ) to (t, a L ). Now create N by moving (t, r ) back to the original position of r. This is a sequence of 3 head moves.
2. t = u. Note that we can assume that (u, a L ) is not movable, as otherwise we are in the previous case. Because t = u, we may also assume (t, a L ) ∈ N , otherwise this is no special case and we can use the sequence of moves from the start of this proof. Hence, a L is a reticulation node on the side of a triangle formed by t, a L , and the child c(a L ) of a L .
280 Remie Janssen Heading in the right direction?
i. t is below v. Since t is below both a R and v, there is a highest reticulation s strictly above t and below both a R and v. Since s is strictly above t, it is strictly above a L . Therefore we are either in the 'normal' case, or in Case 1b of this analysis with movable edge (p(s), s). This means this situation can be solved using at most 4 head moves. ii. t is not below v. As (t, a L ) is a reticulation edge in the triangle, it is movable, and because t is not below v, the head move (t, a L = r) to (u, v) is allowed. Now the tail move (u, r ) to (x, a R ) is still allowed, because v is not above x. As (u, c(a L )) is movable in this new network, we can simulate this tail move like in Case 1a. Afterwards, we can put the triangle back in its place with one head move, which is allowed because it produces N . All this takes 6 head moves.
Therefore we can do the 'normal' sequence of moves from the start of this proof in reverse order, effectively switching the roles of a L and a R . Because we use the 'normal' sequence of moves, this case takes at most 4 head moves.
To prove the case of a more general tail move, we need to treat another simple case first.
be a valid tail move in a network N turning it into N'. Then there is a sequence of head moves from N to N' of length at most 4.
Proof: Let z be the child of r, and note that the nodes described need not all be distinct. The only possible identifications are x L = x R and v = z; other identifications create cycles. First note that in the situation x L = x R , the networks N and N' before and after the tail move are isomorphic. Hence we can restrict our attention to the case that x L ≠ x R . To prove the result, we distinguish two cases.
1. z = v. This case can be solved with two head moves: (x R , r) to (x L , u) creating new reticulation node r above u followed by (x L , u) to (u, z). The first head move is allowed because v = z, so (x R , r) is head-movable; x R = x L ; and u is not above x R because both its children aren't: z is below a R , and if v is above x R , the tail move N → N is not allowed. The second head move is allowed because it produces the valid network N . Hence the tail move can be simulated by at most 2 head moves ( Figure 13).
The proposed moves of the previous case are not valid here, because they lead to parallel edges in the intermediate network.
To prevent these, we reduce to the previous case by moving (u, z) to any edge e that is not above z and with e ≠ (z, c(z)) (hence neither above x L nor above x R ), and moving it back afterwards. Note that if there is such an edge e, then the head move (u, z) to e is allowed. Let the new node subdividing e be v'. Then the tail move (u, v') to (x R , r) is 'still' allowed and can therefore be simulated by 2 head moves as in the previous case. The last head move, moving (u', v') 'back', is allowed because it creates the DAG N', which is a network. Such a sequence of moves uses 4 head moves (Figure 14).
It remains to prove that there is such a location (not above z and excluding (z, c(z))) to move (u, z) to. Recall that we assume any network has at least two leaves. Let l be a leaf not equal to c(z), then its incoming edge (p(l), l) is not above c(z) and not equal to (z, c(z)).
Hence this edge e = (p(l), l) suffices as a location for the first head move.  We conclude that any tail move of the form (u, v) from (x L , r) to (x R , r) can be simulated by 4 head moves.
be a valid tail move in a network N resulting in a network N'. Suppose a L ≠ a R , a L is not above a R , and there exists a movable reticulation edge (t, r) not below a R . Then there exists a sequence of head moves from N to N' of length at most 7.
Proof: Note that v cannot be above either of x L and x R . The only possible identifications within the nodes a L , a R , x L , x R , u, v are a L = a R , x L = x R and a R = x L (but not simultaneously); all other identifications lead to parallel edges, cycles in either N or N', a contradiction with the condition "a L is not above a R ", or a trivial situation where the tail move leads to an isomorphic network. The first two of these identifications have been treated in the previous two lemmas, so we may assume a L ≠ a R and x L ≠ x R . We now distinguish several cases to prove the tail move can be simulated by a constant number of head moves in all cases.
(a) r = x R . As (t, r) is movable and not below a L or v, we can move the head of this edge to (x L , u). The head move (x L , r ) down to (u, a L ) is then allowed. Let s be the parent of r in N that is not t. Since u = s (otherwise the original tail move was not allowed), the head move (u, r ) to (s, a R ) is allowed, where s is the other parent of r in N (i.e., not t). Lastly (s, r ) to (t, u) gives the desired network N .
(b) r = x R . In this case, we can move (t, r) to (x R , a R ) in N (if t = x R then (t, a R ) ∈ N , contradicting the assumptions of this case). Because neither a L nor v can be above x R and x L = x R , we can now move (x R , r ) to (x L , u). Then we move down the head (x L , r ) to (u, a L ), followed by (u, r ) to (t, a R ). If u = t and r = v, the last move is not allowed, and if u = t and r = a L these last two moves are not allowed. In these cases, we simply skip these moves. Lastly, we move (t, r ) to (s, z) to arrive at N', where s and z are the other parent and the child of r in N. Hence the tail move of this situation can be simulated by 5 head moves (Figure 15).
(a) z = a R . Note first that in this case, we must have either x R = t or x R = r, otherwise one of the edges (t, a R ) and (r, z) is not in N.
i. x R = t This case is quite easy, and can be solved with 3 head moves. Because r and t = x R are distinct, a R = z is a reticulation node with movable edge (t = x R , z = a R ). The sequence of moves is: (t, z) to (x L , u), then (x L , z ) to (u, a L ), then (u, z ) to (r, c(z)). ii. x R = r Note that the tail move (u, v) to (t, a R ) is also allowed in this case because (u, v) is tail-movable, v = a R and v not above t (otherwise the tail move to (r, z) is not allowed either). This tail move is of the type of the previous case, and takes at most 3 head moves. Now the move (u , v) to (r, z) is of the type of Lemma 12, which takes at most 4 head moves to simulate. We conclude any tail move of this case can be simulated with 7 head moves.
i. a R = r.
A. x R = t and v = r. We can move the tail of (u, v) to (x R , r) with a sequence of four head moves like the sequence in Case 2(a)i. The resulting DAG is a network because v is not above x R and v = r. Call the resulting new location of the tail u . We can get to N with two head moves (Lemma 11 Case 1a): (u , r) to (t, a R ), then (t, r ) to (s, z). This case therefore takes at most 6 head moves. B. x R = t and v = r. The following sequence of four head moves suffices: (t = x R , r = v) to (x L , u), then (x L , r ) to (u, a L ), then (u, r ) to (t, a R ) and finally (t, r ) to (u, z). Hence this case takes at most 4 head moves. C. x R = t and a R = s. Because a R = s, the edge (x R , a R ) is movable. Also, because x R = x L and u not above x R , (x R , a R ) can be moved to (x L , u). Now (x L , a R ) is movable, and it can be moved down to (u, a L ). Finally, the head move (u, a R ) to (t, c(a R )) results in N . Hence in this case we need at most 3 head moves. D. x R = t and a R = s. The following sequence of five head moves suffices: (t, r) to (u, a L ), then (x R , s) to (x L , u), then (x L , s ) to (u, r ), then (u, s ) to (t, c(r)), and finally (t, r ) to (s , c(r)). Hence this case takes at most 5 head moves. ii. a R = r. In this case either x R = t or x R = s.
A. x R = t. This case is easily solved with 3 head moves: (x R , a R ) to (x L , u), then (x L , a R ) to (u, a L ), then (u, a R ) to (s, z).
there is no edge (t, z), then we can relabel t ↔ s and treat this like the previous case. Otherwise, there is an edge (t, z) and we use the following sequence of moves: (t, a R ) to (u, a L ), then (x R = s, z) to (x L , u), then (x L , z ) to (u, a R ), then (u, z ) to (t, c(z)), then (t, a R ) to (z , c(z)). The tail move of this situation can therefore be replaced by 5 head moves.
to (x R , a R ) be a valid tail move in a network N resulting in a network N'. Suppose a L ≠ a R , a L is not above a R , and all movable reticulation edges are below a R . Then there exists a sequence of head moves from N to N' of length at most 15.
Proof: Like in the proof of the last lemma, we assume that a L is not above a R . Because the network has at least one reticulation, we can pick a highest reticulation r in the network; let (t, r) be its movable edge. As each movable reticulation edge is below a R , so is (t, r). Let us denote the root of N by ρ, and distinguish two subcases:
1. x R ≠ ρ. Because x R is above a R , it must be a tree node, so it has another child edge (x R , b) with b ≠ a R not above t: if b were above t, there would have to be a reticulation above r, contradicting our choice of r.
(a) r = b. In this case, we can move (t, r) to (x R , b) in both N and N , producing networks M and M . Now (x R , r ) is movable in M , and by relabelling t = x R we can see that there is one tail move between M and M of the same type as Case 2(b)i of Lemma 13. To see this, take r as the relevant reticulation with movable edge (t , r ) and consider the tail move (u, v) to (x R , a R ) producing M . This case can therefore be solved with at most 5 + 2 = 7 head moves.
(b) r = b and (t, c(r)) ∉ N. In this case, (x R , r) is movable, and not below a R , contradicting our assumptions.
(c) r = b and (t, c(r)) ∈ N. Because N has at least two leaves, there must either be at least 2 leaves below r, or there is a leaf not below r. Let l be an arbitrary leaf below r in the first case, or a leaf not below r in the second case. Note that the head move (t, c(r)) to the incoming edge of l is allowed, and makes (x R , r) movable. Now the tail move (u, v) to (x R , a R ) is still allowed, because v ≠ a R , v is not above x R and (u, v) is tail-movable. For this tail move we are in a case of Lemma 13 because (x R , r) is not below a R , hence this tail move takes at most 7 moves. After this move, we can do one head move to put (t, c(r)) back. Hence this case takes at most 9 moves.
2. x R = ρ. Let y, z be the children of a R . Now first do the tail move of (u, v) to one of the child edges (a R , z) of a R . This is allowed because a R is the top tree node. The sequence of head moves used to do this tail move is as in the previous case. Note that N' is now one tail move away: (u', z) to (a R , y). This is a horizontal tail move along a tree node as in Lemma 11, which takes at most 6 head moves. As the previous case took at most 9 head moves, this case takes at most 15 head moves in total.
be a valid tail move in a network N resulting in a network N'. Suppose a L ≠ a R and a L is not above a R . Then there exists a sequence of head moves from N to N' of length at most 15.
Proof: This is a direct consequence of the previous two lemmas.
Lemma 16 Let (u, v) from (x L , a L ) to (x R , a R ) be a valid tail move in a network N resulting in a network N'. Suppose a L ≠ a R and a L is above a R . Then there exists a sequence of head moves from N to N' of length at most 15.
Proof: Note that in this case a R is not above a L in N . Reversing the labels x L ↔ x R and a L ↔ a R , we are in the situation of Lemma 15 for the reverse tail move from N' to N. This implies the tail move can be replaced by a sequence of at most 15 head moves.
Theorem 2 Any tail move can be replaced by a sequence of at most 15 head moves.
Proof: This follows from the previous lemmas.

Head move replaced by tail moves
In this section, we show how to replace a head move (u, v) from (x 1 , y 1 ) to (x 2 , y 2 ) by a sequence of at most 16 tail moves (Theorem 3). In the proof, we first show how to efficiently replace downward head moves by tail moves (i.e., when y 1 is above x 2 ; Section 4.2.2). This is then used repeatedly to simulate arbitrary head moves in Section 4.2.3. Unless stated otherwise, each move in this section is a tail move and movable means tail-movable.

Distance-1 head moves
We first recall a result from [17]: any distance-1 head move can be replaced by a constant number of tail moves, so the following result holds.
Lemma 17 Let (u, v) from (x 1 , y 1 ) to (x 2 , y 2 ) be a valid distance-1 head move in a network N resulting in a network N'. Then there is a sequence of at most 4 tail moves between N and N', except if N and N' are different networks with two leaves and one reticulation.
And there is the following special case, for which we repeat the proof here.
Lemma 18 Let (u, v) from (x 1 , y 1 ) to (x 2 , y 2 ) be a valid head move in a network N resulting in a network N'. Suppose that y 1 = x 2 and x 2 is a tree node. Then there is a sequence of at most 1 tail move between N and N'.
Proof: Let c(x 2 ) be the other child of x 2 (not y 2 ), then the tail move (x 2 , c(x 2 )) to (x 1 , v) suffices.

Downward head moves
Now, we prove that the head move can be replaced by a sequence of constant length if y 1 is above x 2 . We start by considering the case that x 2 is a tree node. In the proof we use a constant number of moves to create a situation where we simply need to do a distance-1 downward head move.

Lemma 19
Let (u, v) from (x 1 , y 1 ) to (x 2 , y 2 ) be a valid head move in a network N resulting in a network N'. Suppose that y 1 is above x 2 , y 1 ≠ x 2 , and x 2 is a tree node. Then there is a sequence of at most 4 tail moves between N and N'.
Proof: We split this proof into two cases: (x 2 , y 2 ) is movable, or it is not. We prove that in both cases there exists a constant-length sequence of tail moves between N and N'.
1. (x 2 , y 2 ) is tail-movable. Tail move (x 2 , y 2 ) up to (v, y 1 ); this is allowed because any tail move up is allowed if the moving edge is tail-movable (Corollary 1). Now (u, v) is still head-movable, hence we can move it down to (x 2 , y 2 ). As this is exactly the situation of Lemma 18, we can replace this head move by one tail move. Now tail-moving (x 2 , v ) back down results in N', so this move is allowed, too. Hence there is a sequence of 3 tail moves between N and N'.
2. (x 2 , y 2 ) is not tail-movable. Because x 2 is a tree node and (x 2 , y 2 ) is not movable, there has to be a triangle with x 2 at the side, formed by the parent p of x 2 and the other child c of x 2 . Note that (p, x 2 ) is tail-movable, and that it can be moved up to (v, y 1 ). After this move, Lemma 18 tells us we can head-move (u, v) to (p', x 2 ) using one tail move. The next step is to tail move (p', v') back down to the original position of p. The resulting network is allowed because it is one valid distance-1 head move away from N' (as c is not above u). Lastly, we do this distance-1 head move, which again can be simulated by one tail move by Lemma 18. Note that this sequence is also valid if p = y 1 . Hence there is a sequence of at most 4 tail moves between N and N' (Figure 16).
Lemma 20 Let (u, v) from (x 1 , y 1 ) to (x 2 , y 2 ) be a valid head move in a network N resulting in a network N'. Suppose that y 1 is (strictly) above x 2 and x 2 is a reticulation. Then there are networks M and M' such that the following hold:
1. turning N into M takes at most one tail move;
2. turning N' into M' takes at most one tail move;
3. there is a head move between M and M', moving the head down to an edge whose top node is a reticulation;
4. there is a tail-movable edge (s, t) in M with t not above x 2 .
Proof: Note that we have to find a sequence consisting of a tail move followed by a head move and finally a tail move again, between N and N', such that the head move is of the desired type and the network after the first tail move has a movable edge not above the top node x 2 of the receiving edge of the head move. Note that if there is a tail-movable edge (s, t) in N with t not above x 2 , we are done by the previous lemmas: take M := N and M' := N'. Hence we may assume that there is no such edge in N. Suppose all leaves (of which there are at least 2) are below y 2 ; then there must also be a tree node below y 2 . As one of its child edges is movable, there is a tail-movable edge below y 2 (and hence not above x 2 ). So if all leaves are below y 2 , we can again choose M := N and M' := N'.
Because our networks have at least 2 leaves, the remaining part is to show the lemma assuming that there is a leaf l 1 not below y 2 . Note that there also exists a leaf l 2 below y 2 . Now consider an LCA j of l 1 and l 2 . We note that j is a tree node of which at least one outgoing edge (j, m) is not above x 2 . If (j, m) is tail-movable, then M := N and M' := N' suffices, so assume (j, m) is not tail-movable. Let i be the parent of j, and k be the other child of j; because j is a tree node and (j, m) is not movable, i, j and k form a triangle (Figure 17).
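Since several proofs start by picking "an LCA" of two nodes, and a network may contain more than one, a small Python sketch may help make the notion concrete. It assumes that an LCA of a and b is a common ancestor of both (a node counts as its own ancestor) with no child that is also a common ancestor. The edge-list representation, the function names, and the two example networks are hypothetical illustrations, not taken from the paper.

```python
def ancestors(edges, node):
    """All nodes above `node` in the DAG, including `node` itself."""
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, []).append(parent)
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def lowest_common_ancestors(edges, a, b):
    """Common ancestors of a and b with no child that is also a common ancestor.

    The set of common ancestors is closed under taking parents, so it is
    enough to inspect the children of each candidate.
    """
    common = ancestors(edges, a) & ancestors(edges, b)
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
    return {n for n in common
            if not any(c in common for c in children.get(n, []))}


# One reticulation r with parents b and c.
tier1 = [("rho", "a"), ("a", "b"), ("a", "c"), ("b", "r"),
         ("c", "r"), ("b", "l1"), ("c", "l2"), ("r", "l3")]
# Two reticulations r and s sharing both parents: the LCA is not unique.
tier2 = [("rho", "a"), ("a", "b"), ("a", "c"), ("b", "r"), ("c", "r"),
         ("b", "s"), ("c", "s"), ("r", "l1"), ("s", "l2")]

print(lowest_common_ancestors(tier1, "l1", "l2"))
print(lowest_common_ancestors(tier2, "l1", "l2"))
```

The second example shows why the text speaks of "an LCA" rather than "the LCA": both parents shared by the two reticulations qualify.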
The idea is to 'break' this triangle with one tail move in N and N' simultaneously, meaning we either move one of the edges of the triangle, or we move a tail to an edge of the triangle. If we can break the triangle in both networks keeping (u, v) movable, creating new networks M and M', then choosing (s, t) := (j, m) in M will work. The last part of this proof shows how we do this. We have to split into two cases:
• i is the child of the root. In this case we break the triangle by moving a tail to the triangle. As v is a reticulation and there is no path from any node below m to v (if so, there is a path from m to x 2 ), there must be a tree node p below k and (not necessarily strictly) above both parents of v. At least one of the outgoing edges (p, q) is movable in N. If v is a child of p and (p, v) is movable, then we choose q = v, otherwise any choice of (p, q) will suffice.
Because (p, q) is movable (by choice of (p, q)) and k is above p, the tail move (p, q) to (j, k) is valid. Now the head move (u, v) (or (u', v) if p = u in N) to (x 2 , y 2 ) is valid, because x 2 is below v, and (u, v) is movable because (u, v) was movable in N. To see the latter, note that the only ways to create a triangle with v on the side with one tail move are:
- suppressing one node of a four-cycle that includes v to create a triangle, by moving the outgoing edge of that node that is not included in the four-cycle. As this node is p, and p is above both parents of v, the suppressed node must be on the incoming edge of v in the four-cycle (Figure 18 top). However, in that case v is a child of p and (p, v) is tail-movable, so we choose to move (p, v) up for the first move, which keeps (u, v) head-movable.
- moving the other incoming edge of v (not (u, v)) to the other incoming edge of the child c(v) of v (so not (v, c(v))). But as the tail move moves (p, q) to (j, k), we see that k = c(v), which contradicts the fact that v is strictly below k in N. Hence this cannot result in a triangle with v on the side (Figure 18 bottom left).
- moving the other incoming edge of the child c(v) of v (so not (v, c(v))) to the incoming edge of v that is not (u, v). As we move (p, q) to (j, k), we see that v = k and u = i. But then c(v) = q must be below the other child m of j, and as x 2 is below q, this contradicts the fact that (j, m) is not above x 2 . Hence this cannot result in a triangle with v on the side (Figure 18 bottom right).
Figure 18: The ways of making (u, v) not head-movable in Lemma 20. Top: creating a triangle by suppressing a node in a four-cycle. The first two of these are invalid because p is not above both parents of v. The right one does not give any contradictions, but forces us to choose to move (p, v), so that no triangle is produced. Bottom: creating a triangle by moving an edge to become part of the triangle. Both these options contradict our assumptions.
The preceding shows that (u, v) is still head-movable after the first tail move. Because p is above x 2 through two paths, y 1 is still above x 2 after the tail move (p, q) to (i, j). Also we did not change x 2 , so it still is a reticulation. This means that the head move (u, v) to (x 2 , y 2 ) is still valid and of the right type. Furthermore (j, m) is a tail-movable edge with m not above x 2 . Now note that after the head move (u, v) to (x 2 , y 2 ), we can move (p', q) back to its original position to obtain N'.
Hence we produce M by tail-moving (p, q) to (i, j) and M' by moving the corresponding edge to (i, j) in N'. We can do this because (i, j) is still an edge in N': indeed it is not subdivided by the head move, and i and j are both tree nodes, so they do not disappear either. So this case is proven.
• i is not the child of the root. In this case we can move the tail of (i, k) (possibly equal to (u, v)) up to the root in N. Now note that j is a tree node, so the tail move cannot create any triangles with a reticulation on the side. This means that (u, v) is still movable after the tail move. Furthermore, after the tail move x 2 is still a reticulation node below y 1 , and (j, m) is movable and not above x 2 . Hence the head move (u, v) to (x 2 , y 2 ) is allowed and of the appropriate type. Now moving the tail of (i', k) back to the incoming edge of j, we get N'.
Hence this case works with M being the network obtained by moving (i, k) up to the root edge in N, and M' the network obtained by moving (i, j) up to the root edge in N'.

Lemma 21
Let (u, v) from (x 1 , y 1 ) to (x 2 , y 2 ) be a valid head move in a network N resulting in a network N'. Suppose that y 1 is above x 2 and x 2 is a reticulation. Then there is a sequence of at most 8 tail moves between N and N'.
Proof: By Lemma 20, at a cost of 2 tail moves, we can assume there is a tail-movable edge (s, t) that can be moved to (x 2 , y 2 ). Make this the first move of the sequence. Because the head move (u, v) to (s', y 2 ) goes down, and (u, v) is head-movable, this head move is allowed. By Lemma 19, there is a sequence of at most 4 tail moves simulating this head move. Now we need one more tail move to arrive at N': the move putting (s, t) back to its original position. This all takes at most 8 moves.
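The count in this proof can be tallied step by step. The following is only a bookkeeping sketch; the labels under the braces are added here for readability and are not notation from the paper.

```latex
\[
\underbrace{2}_{\text{Lemma 20: } N \to M,\; N' \to M'}
\;+\; \underbrace{1}_{\text{move } (s,t) \text{ to } (x_2,y_2)}
\;+\; \underbrace{4}_{\text{Lemma 19}}
\;+\; \underbrace{1}_{\text{move } (s,t) \text{ back}}
\;=\; 8 .
\]
```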
All previous lemmas together give us the following result.
to (x 2 , y 2 ) be a valid head move in a network N resulting in a network N'. Suppose that y 1 is above x 2 or y 2 is above x 1 . Then there is a sequence of at most 8 tail moves between N and N'.

Non-downward head moves
Finally, we consider head moves where the original position of the head and the location it moves to are incomparable.
to (x 2 , y 2 ) be a valid head move in a network N resulting in a network N', where N and N' are not networks with two leaves and one reticulation. Suppose that y 1 is not above x 2 and y 2 is not above x 1 . Then there is a sequence of at most 16 tail moves between N and N'.
Proof: Find an LCA s of x 1 and x 2 . We split into different cases for the rest of the proof:
1. s ≠ x 1 , x 2 . One of the outgoing edges (s, t) of s is tail-movable and it is not above one of x 1 and x 2 . Suppose t is not above x 1 ; then we can do the following (Figure 19):
• Tail move (s, t) to (x 1 , v); allowed because t ≠ v, (s, t) is movable, and t is not above x 1 .
• Distance-1 head move (u, v) to (s', y 2 ); no parallel edges by removal: if so, they are between s and y 1 = y 2 , but then the move actually resolves this; no parallel edges by placing: u ≠ s'; no cycles: y 2 is not above u, otherwise there is a cycle in N'.
• Move (s', y 1 ) back up to (x 1 , t); moving a tail up is allowed if the tail is movable.
• Move (s', t) back up to its original position; moving a tail up is allowed if the tail is movable.
As the head move used in this sequence is a distance-1 move, it can be simulated with at most 4 tail moves. Hence the sequence for this case takes at most 8 tail moves.
(a) u is not below x 1 . Head move (u, v) to the other child edge of x 1 ; this takes at most 4 tail moves by Lemma 17. Now we have to move the head of (u, v') down to create N'; this takes at most 8 tail moves by Proposition 3. Hence for this case we need at most 12 tail moves.
(b) u is below x 1 . In this case the previous approach is not directly applicable, as moving the head of (u, v) to the other child edge of x 1 creates a cycle. Hence we need to take a different approach, where we distinguish the following cases:
i. (x 1 , v) is tail-movable. Tail move (x 1 , v) down to (x 2 , y 2 ); this is allowed because y 1 is not above x 2 . Then do the sideways head move (u, v) to (x 1 , y 2 ); this takes at most 4 tail moves. Then move (x 1 , y 1 ) back up to create N'. This takes at most 6 moves.
ii. (u, v) is tail-movable. Move (u, v) up to the incoming edge (t, s) of s. The head move (u', v) to (x 2 , y 2 ) is still allowed, except if s, u, v form a triangle with the child of u as well as of v being y 1 in N, but in that case x 1 was not the LCA of x 1 and x 2 . Hence we can simulate the head move with at most 12 tail moves by Case 2a of this analysis. As afterwards we can move the tail of (u', v') back to its original position, this case takes at most 16 moves.
iii. Neither (x 1 , v) nor (u, v) is tail-movable. We create the situation of Case 1 by reversing the direction of the triangle at x 1 ; this takes at most 4 tail moves because it is a head move. Only if the bottom node of the triangle is x 2 do we not get this situation, but then the head move is composed of two head moves, so it can be simulated with 8 tail moves. If we are actually in the situation of Case 1, simulate the head move with at most 8 moves as done in that case. This is allowed because it produces N' with the direction of a triangle reversed, which is a valid network. Then reverse the direction of the triangle again using at most 4 tail moves. This way we obtain N' with at most 16 tail moves (Figure 20).
3. s = x 2 . This can be achieved with the reverse sequence for the previous case.
Head move diameter and neighbourhoods

Diameter bounds
There are some obvious results concerning the diameter of head move space found using results from Section 4 and existing bounds on the rSPR diameter. Each rSPR sequence of length l can be replaced by a sequence of head moves of length at most 15l. Hence, we get upper bounds $\Delta_k^{\mathrm{Head}} \le 15\,\Delta_k^{\mathrm{rSPR}}$ on head move diameters. Furthermore, each sequence of head moves is also an rSPR sequence, hence the rSPR diameter gives lower bounds $\Delta_k^{\mathrm{rSPR}} \le \Delta_k^{\mathrm{Head}}$. Similarly the rSPR bounds give bounds on the tail move diameters. These bounds for tail move diameters are inferior to the bounds in [17]. The tail move diameter bounds from that paper are obtained using a technique where an isomorphism is built incrementally.
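The factor 15 can be spelled out as follows. This is a sketch that assumes, as is standard, that every rSPR move is either a head move or a tail move; the symbol h, denoting the number of head moves in an rSPR sequence of length l, is introduced here only for illustration, and the placement of the indices on the diameters is a notational assumption.

```latex
\[
\underbrace{h}_{\text{head moves, kept as they are}}
\;+\; \underbrace{15\,(l-h)}_{\text{tail moves, replaced via Theorem 2}}
\;\le\; 15\,l ,
\qquad\text{so}\qquad
\Delta_k^{\mathrm{rSPR}} \;\le\; \Delta_k^{\mathrm{Head}} \;\le\; 15\,\Delta_k^{\mathrm{rSPR}} .
\]
```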
In this section we prove a bound for the head move diameter (Theorem 4). The proof employs a technique similar to the one used for tail moves: for any pair of networks, we build an isomorphism between growing subnetworks where in each step we only have to use a small number of moves to grow the isomorphism (Lemma 22). For tail moves and rSPR moves, it is convenient to build this isomorphism bottom-up. Head moves are essentially upside-down tail moves. Hence, for head moves, we build an isomorphism starting at the top. Doing this, we ignore the leaf labels. Consequently, to prove the bound we permute the leaf labels using a small number of head moves (Lemma 23).

Remie Janssen Heading in the right direction?
Each move in this section is a head move, unless stated otherwise. As we need to explicitly work with the vertices and edges of different networks, we denote a network with nodes V and edges A as N = (V, A). We first define a few structures that we use extensively: upward closed sets, isomorphisms, and induced graphs. Let N = (V, A) be a network with Y ⊆ V a subset of the vertices. We say that Y is upward closed if for each u ∈ Y the parents of u are also in Y .
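The definition of an upward closed set can be checked mechanically. The following is a minimal sketch, assuming a network is stored as a dictionary mapping each node to the set of its parents; the function name `is_upward_closed` and the toy network are hypothetical and only serve as an illustration.

```python
from typing import Dict, Hashable, Iterable, Set


def is_upward_closed(parents: Dict[Hashable, Set[Hashable]],
                     Y: Iterable[Hashable]) -> bool:
    """Return True if, for every node u in Y, all parents of u are in Y."""
    Y = set(Y)
    return all(parents.get(u, set()) <= Y for u in Y)


# Hypothetical toy network (an assumption for illustration only):
# rho -> t, t -> u, t -> w, u -> a, u -> r, w -> r, w -> b, r -> c,
# where r is a reticulation with parents u and w.
toy_parents = {
    "rho": set(),
    "t": {"rho"},
    "u": {"t"},
    "w": {"t"},
    "r": {"u", "w"},
    "a": {"u"},
    "b": {"w"},
    "c": {"r"},
}

closed = is_upward_closed(toy_parents, {"rho", "t", "u"})      # parents all present
not_closed = is_upward_closed(toy_parents, {"rho", "t", "r"})  # parents u, w missing
```

In this toy network the set {rho, t, u} is upward closed, whereas {rho, t, r} is not, because the parents u and w of the reticulation r are missing.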

Definition 16
Lemma 22 Let N 1 and N 2 be tier k > 0 networks with label set X of size n, then there exists a pair of head move sequences S 1 on N 1 and S 2 on N 2 such that the resulting networks are unlabelled isomorphic and the total length is |S 1 | + |S 2 | ≤ 4n + 6k − 4.

Proof: We incrementally build upward closed sets Y 1 and Y 2 such that N 1 [Y 1 ] and N 2 [Y 2 ] are unlabelled isomorphic with isomorphism φ. Starting with Y 1 = {ρ 1 } and Y 2 = {ρ 2 }, the roots only, we set the isomorphism φ : ρ 1 → ρ 2 . Next we increase the size of Y 1 and Y 2 by changing the networks slightly with a constant number of head moves, and then adding a node to Y 1 and Y 2 and extending the isomorphism. We will add all the leaves to the isomorphism last.
1. There is a highest node x 1 of N 1 not in Y 1 such that x 1 is a tree node. Because x 1 is a highest node not in Y 1 , the parent p 1 of x 1 is in Y 1 and there is a corresponding node p 2 := φ(p 1 ) in Y 2 . This node must have at least one child x 2 that is not in Y 2 , as otherwise the degrees of p 1 and p 2 in N 1 [Y 1 ] and N 2 [Y 2 ] do not coincide.
(a) The node x 2 is a tree node. In this case we can add x 1 and x 2 to Y 1 and Y 2 and set φ : x 1 → x 2 to get an extended isomorphism. We do not have to use any head moves to do this extension.
(b) The node x 2 is a reticulation. We make sure p 2 has a tree node y 2 as a child not in Y 2 , using at most 3 head moves. We can then add x 1 to Y 1 and y 2 to Y 2 and extend the isomorphism with x 1 → y 2 . To create this tree node, we use a tree node c 2 ∈ N 2 \ Y 2 , which exists because there is a tree node in N 1 \ Y 1 .
i. The edge (p 2 , x 2 ) is movable. Move (p 2 , x 2 ) to the incoming edge (t 2 , c 2 ) of the tree node c 2 . This move is valid because c 2 cannot be above p 2 (otherwise c 2 ∈ Y 2 , a contradiction), and t 2 ≠ p 2 as otherwise p 2 would have a tree node child not in Y 2 . Now the edge (t 2 , x 2 ) is movable to any of the outgoing edges of c 2 . After this move, p 2 has child node c 2 , which is a tree node, so we can extend the isomorphism with a tree node φ : x 1 → c 2 using at most 2 head moves (Figure 21).
Figure 21: The moves and incremented isomorphism for Lemma 22 Case 1(b)i. For nodes outside of the shaded region, it is not known whether they are in Y 2 .
ii. The edge (p 2 , x 2 ) is not movable. This means that x 2 is on the side of a triangle. Denote by d 2 the child of x 2 and the other parent of x 2 with z 2 . Now note that (z 2 , d 2 ) is movable, and can be moved to an edge (u 2 , v 2 ) with v 2 not in Y 2 and (u 2 , v 2 ) distinct from both (x 2 , d 2 ) and from the outgoing edge of d 2 . Such an edge exists: pick a leaf l not equal to the child of d 2 (if that node is a leaf); as we add all leaves to the isomorphism last, the leaf is not in Y 2 , furthermore, l is not above z 2 , and the incoming edge of l is not equal to (x 2 , d 2 ) nor to the outgoing edge of d 2 .
Doing the head move (z 2 , d 2 ) to the incoming edge of l creates the situation of the previous case (Case 1(b)i), and we can use 2 more head moves to create a network with a tree node c 2 below p 2 which maintains the isomorphism of the upper part Y 2 . Hence we can extend the isomorphism with a tree node φ : x 1 → c 2 using at most 3 head moves.
(c) The node x 2 is a leaf. Again, note there is a tree node c 2 in N 2 \ Y 2 , and let its parent be t 2 . Note also that N 2 has a reticulation node r 2 with incoming edge (s 2 , r 2 ) which is movable to (p 2 , x 2 ) (if p 2 = s 2 , then the other incoming edge (s′ 2 , r 2 ) is also movable, and can instead be moved to (p 2 , x 2 )).
i. The nodes s 2 and t 2 are different nodes. First move (s 2 , r 2 ) to (p 2 , x 2 ). Now the edge (p 2 , r 2 ) is movable, and can be moved to (t 2 , c 2 ), because c 2 is not above p 2 and p 2 ≠ t 2 (otherwise p 2 has a tree node as child). This makes (t 2 , r 2 ) movable, and we can move it to (s 2 , x 2 ) because s 2 ≠ t 2 and x 2 is a leaf, so it is not above t 2 . Lastly, we restore the reticulation by moving (s 2 , r 2 ) back to its original position. Hence, in this situation, 4 head moves suffice to make p 2 the parent of a tree node c 2 , so that we can extend the isomorphism by φ : x 1 → c 2 with a tree node (Figure 22).
Figure 22: The moves and incremented isomorphism for Lemma 22 Case 1(c)i. For nodes outside of the shaded region, it is not known whether they are in Y 2 .
ii. The nodes s 2 and t 2 are the same. Note that a child of t 2 is a tree node and a child of s 2 is a reticulation. This means that s 2 = t 2 is a tree node, as it has two distinct children. The edge (s 2 , r 2 ) can be moved to the pendant edge (p 2 , x 2 ). Now the new edge (p 2 , r 2 ) can be moved to (s 2 , c 2 ), because p 2 ≠ s 2 and c 2 is not above p 2 (otherwise c 2 has to be in Y 2 , contradicting our assumption). Now we can move (s 2 , r 2 ) back to its original position. This all takes three head moves, and makes sure that a child c 2 of p 2 is a tree node. This means we can extend the isomorphism by setting φ : x 1 → c 2 (and if r 2 was in Y 2 , changing φ : φ −1 (r 2 ) → r 2 to φ : φ −1 (r 2 ) → r′ 2 ) using at most 3 head moves to add a tree node.
2. There is a highest node x 2 of N 2 not in Y 2 such that x 2 is a tree node. Do the same as in the previous case (Case 1), switching the roles of N 1 and N 2 .
3. Each highest node x 1 of N 1 not in Y 1 and x 2 of N 2 not in Y 2 is a reticulation node or a leaf.
(a) There exists a highest node x 1 of N 1 not in Y 1 which is a reticulation node. This means the two parents p 1 and q 1 of x 1 are in Y 1 , and consequently have corresponding nodes p 2 and q 2 in Y 2 . Both these nodes also have at least one child not in Y 2 , say c p 2 and c q 2 .
i. The children of p 2 and q 2 are equal (i.e., c p 2 = c q 2 ). In this case, we can immediately extend the isomorphism with φ : x 1 → c p 2 .
ii. Both nodes c p 2 and c q 2 are reticulations. Assume without loss of generality that c p 2 is not below c q 2 .
A. The edge (p 2 , c p 2 ) is movable. Move this edge to (q 2 , c q 2 ), which is allowed because c q 2 is not above p 2 , and p 2 ≠ q 2 . Now p 2 and q 2 have a common child x 2 := c p 2 , so we can add one reticulation to Y 1 and Y 2 and extend the isomorphism by φ : x 1 → x 2 using 1 head move.
B. The edge (p 2 , c p 2 ) is not movable. Because (p 2 , c p 2 ) is not movable, c p 2 must be the side node of a triangle, and therefore its outgoing edge (c p 2 , z) is movable. By our assumption, c q 2 is not above c p 2 , so we can move (c p 2 , z) to (q 2 , c q 2 ). Now the other incoming edge (t, c p 2 ) of c p 2 becomes movable, and we can move it down to (z′ , c q 2 ). Now p 2 and q 2 have a common child x 2 := z′ , and the isomorphism can be extended with one reticulation by setting φ : x 1 → x 2 using at most 2 head moves (Figure 23).
iii. The node c p 2 is a reticulation, and c q 2 is a leaf. The subcases here work exactly like the previous subcases in Case 3(a)ii.
A. The edge (p 2 , c p 2 ) is movable. Move this edge to (q 2 , c q 2 ), which is allowed because c q 2 is not above p 2 , and p 2 ≠ q 2 . Now p 2 and q 2 have a common child x 2 := c p 2 , so we can add one reticulation to Y 1 and Y 2 and extend the isomorphism by φ : x 1 → x 2 using one head move.
B. The edge (p 2 , c p 2 ) is not movable. Because (p 2 , c p 2 ) is not movable, c p 2 must be the side node of a triangle, and therefore its outgoing edge (c p 2 , z) is movable. Because c q 2 is a leaf, it is not above c p 2 , so we can move (c p 2 , z) to (q 2 , c q 2 ). Now the other incoming edge (t, c p 2 ) of c p 2 becomes movable, and we can move it down to (z′ , c q 2 ). Now p 2 and q 2 have a common child x 2 := z′ , and the isomorphism can be extended with one reticulation by setting φ : x 1 → x 2 using at most 2 head moves.
iv. The node c q 2 is a reticulation, and c p 2 is a leaf. Switch the roles of p 2 and q 2 and do as in the previous case.
v. Both nodes c p 2 and c q 2 are leaves. Note that because x 1 is a reticulation node not in Y 1 , there must also be a reticulation node r 2 ∈ N 2 not in Y 2 . Let its movable incoming edge be (s 2 , r 2 ). As p 2 ≠ q 2 we know that s 2 can be equal to at most one of p 2 and q 2 , hence we can assume without loss of generality that s 2 ≠ p 2 . Then the head move (s 2 , r 2 ) to (p 2 , c p 2 ) is allowed, because the leaf c p 2 cannot be above s 2 . Now (p 2 , r 2 ) is movable because the child of r 2 is a leaf, and it can be moved to (q 2 , c q 2 ) because p 2 ≠ q 2 and c q 2 is a leaf, and hence not above p 2 . After this head move, p 2 and q 2 have a common child x 2 := r 2 , and the isomorphism can be extended with one reticulation by setting φ : x 1 → x 2 using at most 2 head moves.
(b) There exists a highest node x 2 of N 2 not in Y 2 which is a reticulation node.
Do the same as in the previous case, switching the roles of N 1 and N 2 .
(c) All highest nodes of N 1 not in Y 1 and of N 2 not in Y 2 are leaves. In this case, the networks are already unlabelled isomorphic: N 1 [Y 1 ] and N 2 [Y 2 ] are isomorphic, and the only nodes not part of the isomorphism are leaves, hence there is only one way (ignoring symmetries of cherries) to complete the isomorphism.
Note that this procedure first adds all tree nodes and reticulations to the isomorphism, using at most four moves per tree node and at most two moves per reticulation node. Then finally it adds all the leaves, without changing the networks any more. Noting that the number of tree nodes is n + k − 1 and the number of reticulations is k, we see that we need to do at most 4(n + k − 1) + 2k = 4n + 6k − 4 moves in N 1 and N 2 to get networks N′ 1 and N′ 2 which are unlabelled isomorphic.

Lemma 23
Let N and N′ be tier k > 0 networks with label set X of size n, which are unlabelled isomorphic. Then there is a head move sequence from N to N′ of length at most 2n.
Proof: Note that the only difference between N and N′ is a permutation of the leaves, say $\pi = (l^1_1, \ldots, l^1_{\Pi_1})(l^2_1, \ldots, l^2_{\Pi_2}) \cdots (l^q_1, \ldots, l^q_{\Pi_q})$ to get from N to N′ (where all $l^j_i$ are distinct). Note also that there is a reticulation in N with a head-movable edge (t, r), which is movable to the incoming edge of any leaf. A sequence of moves from N to N′ consists of the moves
• . . .
• $(p(l^j_2), r^{(\Pi_j - 1)})$ to $(p(l^j_1), l^j_1)$;
• $(p(l^j_1), r^{(\Pi_j)})$ to $(t, l^j_{\Pi_j})$;
• $(t, r^{(\Pi_j + 1)})$ to $(s, c)$,
for each cycle ($1 \le j \le q$) of π, where c is the child of r in N and s is the other parent of r in N . This permutes the leaves in N by π so that the resulting network is N′. The sequence is allowed provided no two subsequent leaves in a cycle have a common parent (e.g., $p(l^j_i) = p(l^j_{i-1})$). There is always a permutation in which this does not happen. Indeed, if this were to happen, the two leaves would be in a cherry, and swapping the two leaves of a cherry does not change the network. The worst case is attained when there is a maximal number of cycles in the permutation, which happens when π consists of only 2-cycles. In such a case there are n/2 cycles of length 2. Each such cycle takes four moves. An upper bound on the length of the sequence is therefore 4(n/2) = 2n.
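The counting argument at the end of this proof can be illustrated with a short sketch. It decomposes a leaf permutation into its non-trivial cycles and charges each cycle of length m a cost of m + 2 head moves. This per-cycle cost is an assumption read off from the indices of the move sequence above (the last move of a cycle moves $r^{(\Pi_j + 1)}$), and it agrees with the four moves stated for a 2-cycle. The function names are hypothetical.

```python
def nontrivial_cycles(perm):
    """Decompose a permutation, given as a dict leaf -> image, into cycles of length > 1."""
    seen, result = set(), []
    for start in perm:
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        if len(cycle) > 1:
            result.append(cycle)
    return result


def head_move_cost(perm):
    """Assumed cost model: a cycle of length m is realised with m + 2 head moves."""
    return sum(len(cycle) + 2 for cycle in nontrivial_cycles(perm))


# Worst case for n = 4 leaves: two 2-cycles, each costing four moves.
worst = {"a": "b", "b": "a", "c": "d", "d": "c"}
# One 3-cycle and one fixed leaf.
other = {"a": "b", "b": "c", "c": "a", "d": "d"}

worst_cost = head_move_cost(worst)
other_cost = head_move_cost(other)
```

Under this cost model the cost per permuted leaf is (m + 2)/m, which is largest for m = 2; this is why a permutation consisting only of 2-cycles attains the bound 2n.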
A direct corollary of the previous two lemmas is the following theorem, giving an upper bound on the diameter of head move space. To see this, note that any head move is reversible, and hence we can concatenate sequences in different directions.
Theorem 4 Let N and N′ be tier k > 0 networks with label set X of size n. Then there is a head move sequence of length at most 6n + 6k − 4 between N and N′.
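The arithmetic behind Theorem 4 can be sketched as follows: the bound of Lemma 22 (at most four moves per tree node and two per reticulation) is added to the bound of Lemma 23 for permuting the leaves. The function names below are hypothetical; the formulas are the ones stated in the two lemmas.

```python
def lemma22_bound(n: int, k: int) -> int:
    """Moves needed to make two tier-k networks unlabelled isomorphic."""
    tree_nodes = n + k - 1
    reticulations = k
    return 4 * tree_nodes + 2 * reticulations  # equals 4n + 6k - 4


def lemma23_bound(n: int) -> int:
    """Moves needed to permute the leaves of unlabelled isomorphic networks."""
    return 2 * n


def theorem4_bound(n: int, k: int) -> int:
    """Upper bound on the head move diameter: 6n + 6k - 4."""
    return lemma22_bound(n, k) + lemma23_bound(n)


example = theorem4_bound(5, 2)
```

For n = 5 leaves and k = 2 reticulations this gives 28 + 10 = 38 moves, matching 6n + 6k − 4.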

Neighbourhood size
In this subsection, we consider a third property of phylogenetic network space: the neighbourhood size. We start by giving simple upper bounds on the head move neighbourhood size, and then compare these to known bounds for other moves. For complete comparisons, we need information about the smallest and the largest neighbourhood size in each tier. However, giving lower and upper bounds for both of these lies beyond the scope of this paper. Hence, we focus on upper bounds for the largest neighbourhood in a tier, as these would, in practice, be limiting.

Proposition 5
The size of the head move neighbourhood of a network with n leaves and k reticulations is at most $4kn + 6k^2 - 2k$, the size of the distance-1 head move neighbourhood is at most $8k$, and the size of the distance-2 head move neighbourhood is at most $24k$.
Proof: Head moves can only move reticulation edges, of which there are $2k$ in a tier-k network. Furthermore, there are $2n + 3k - 1$ edges in a tier-k network with n leaves. Hence, an upper bound on the head move neighbourhood size in a tier-k network with n leaves is $2k(2n + 3k - 1) = 4kn + 6k^2 - 2k$. An upper bound on the size of the distance-1 head move neighbourhood is $8k$: there are $2k$ heads that can be moved, and for each head, there are at most four adjacent edges it can be moved to. Similarly, an upper bound on the size of the distance-2 head move neighbourhood is $24k$, as there are at most twelve edges within distance 2 of a node.
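The three bounds of Proposition 5 are products of simple counts, which the following sketch makes explicit. The function name and the dictionary keys are hypothetical; the counts (2k reticulation edges, 2n + 3k − 1 edges, at most four and twelve target edges for local moves) are taken from the proof above.

```python
def head_neighbourhood_bounds(n: int, k: int) -> dict:
    """Upper bounds on head move neighbourhood sizes for n leaves, k reticulations."""
    movable_heads = 2 * k            # only reticulation edges can be head-moved
    total_edges = 2 * n + 3 * k - 1  # possible target edges for a non-local move
    return {
        "any_distance": movable_heads * total_edges,  # 4kn + 6k^2 - 2k
        "distance_1": movable_heads * 4,              # 8k
        "distance_2": movable_heads * 12,             # 24k
    }


bounds = head_neighbourhood_bounds(5, 2)
```

For n = 5 and k = 2 the bounds are 60, 16 and 48 respectively. Only the first one depends on the number of leaves.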

Comparison with tail move neighbourhood
As not much is known about neighbourhood sizes for other rearrangement moves, we will only give a rough comparison. We start with tail moves, as the spaces are the same for head moves as for tail moves. Like for head moves, there are obvious upper bounds on the neighbourhood size: a network with n leaves and k reticulations has at most $4n^2 + 8nk + 3k^2 - 6n - 7k + 2$ tail move neighbours. Indeed, this is the number of edges (u, v) where u is a tree node ($2n + k - 2$) multiplied by the total number of edges ($2n + 3k - 1$).
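For completeness, the product of the two edge counts can be expanded term by term; this is only a worked version of the multiplication described above.

```latex
\begin{align*}
(2n + k - 2)(2n + 3k - 1)
  &= 4n^2 + 6nk - 2n + 2nk + 3k^2 - k - 4n - 6k + 2 \\
  &= 4n^2 + 8nk + 3k^2 - 6n - 7k + 2 .
\end{align*}
```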
Although the bounds for both the head move and the tail move neighbourhoods are quadratic in some sense, we point out that for tail moves, there is quadratic dependence on the number of leaves (a term $n^2$) which is absent in the head move neighbourhood bound. When the reticulation number is small, this implies the head move neighbourhood will likely be smaller than the tail move neighbourhood. Note that we have not proven this quadratic dependence for tail moves; we only conjecture it to be present on the basis of the following arguments. First, the simple upper bound above contains a quadratic term. Secondly, the tail move neighbourhood for trees also has a quadratic dependence on the number of leaves (Corollary 4.2 of [28]). One might try to use this last fact to show that the quadratic term is actually necessary, for example by showing that each neighbour of a tree contained in a network is contained in a neighbour of the network. However, this is not true: the network N in Figure 24 contains only the tree T , and one of its neighbours T′ is not contained in any of the neighbours of N . A third hint for the quadratic term in the tail move neighbourhood size can be found in the SNPR neighbourhood bounds computed in [19]. The bounds for the largest neighbourhood for a network with n leaves in that paper are quadratic in n. However, the comparison is not fair, as that paper considers only tree-child networks (networks with a restricted structure) and it includes vertical moves as well. Lastly, there is a quadratic term in the SPR neighbourhood size for unrooted networks as well ([8], above Proposition 4). Again, a direct comparison is not possible, as not every neighbour of the underlying unrooted network of N may have an orientation which is a neighbour of N .
JGAA, 25(1) 263-310 (2021)

Comparing local neighbourhoods
For local moves, we again start by comparing tail moves and head moves. As before, we can get an upper bound on the distance-1 tail move neighbourhood for a network with n leaves and k reticulations by counting the number of tails at a tree node, and multiplying by the number of edges at distance one. There are 2n + k − 2 edges (u, v) where u is a tree node, and at most 4 edges at distance one from these tails. Hence, an upper bound on the distance-1 tail move neighbourhood is 8n + 4k − 8. This bound can probably be improved using the approach of Proposition 2 in [9], but it will remain linear in the number of nodes using that technique.
Like for non-local moves, the tail move neighbourhood bound has a strong dependence on the number of leaves, which is absent for head moves. Compared to non-local moves, even less is known about neighbourhoods of local moves. The size of the rNNI neighbourhood is linear in the number of leaves, which indicates that the linear term n for tail moves is necessary, but it does not prove it. Like for tail moves (cf. Figure 24), an rNNI move on an embedded tree of N may not correspond to a distance-1 tail move in N ( Figure 25). Although this section contains arguments for bounds on the neighbourhood sizes, the lack of formal proofs is striking. Hence, if anything, this highlights the open questions that remain for all rearrangement moves. We know very little about the neighbourhood sizes of phylogenetic networks. It would be very useful to have bounds or exact sizes to compare different rearrangement moves, and, possibly, to get better bounds on the diameters for spaces defined by these moves.

Hardness of computing head move distance
In this section, we prove that the problem Head Distance of computing the head move distance between two networks is NP-hard. The proof uses a reduction from Tree rSPR Distance, which is the problem of finding the rSPR distance between two rooted trees. The rough idea is to convert rSPR moves on trees into head moves on specifically constructed networks.
Because rSPR moves change the location of the tail and not the head of an edge, we have to use a trick: we turn the tree upside down, which turns each tail into a head, and hence a tail move into a head move. Just reversing the direction of the edges of the tree is not sufficient, as this gives a graph with multiple roots and one leaf. Hence, we connect all these roots and add a second leaf to create a phylogenetic network. This construction is formalized in the following definitions.
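The observation that reversing the edges of a tree yields a graph with multiple roots and one leaf can be checked on a small example. The sketch below is an illustration only: the edge-list representation, the function names, and the toy tree (with a root rho of outdegree 1) are assumptions made here.

```python
def reverse_edges(edges):
    """Reverse the direction of every edge of a digraph given as (tail, head) pairs."""
    return [(v, u) for u, v in edges]


def sources_and_sinks(edges):
    """Return the sorted nodes of indegree 0 (roots) and of outdegree 0 (leaves)."""
    nodes = {x for edge in edges for x in edge}
    heads = {v for _, v in edges}
    tails = {u for u, _ in edges}
    return sorted(nodes - heads), sorted(nodes - tails)


# Hypothetical rooted tree with leaves a, b, c.
tree = [("rho", "u"), ("u", "v"), ("u", "c"), ("v", "a"), ("v", "b")]

roots_before, leaves_before = sources_and_sinks(tree)
roots_after, leaves_after = sources_and_sinks(reverse_edges(tree))
```

Before the reversal the only source is rho and the sinks are a, b and c; afterwards the roles are swapped, so the reversed graph has three roots and a single leaf. This is why the construction connects the roots and adds a second leaf.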
After these definitions, we will show that the minimal number of head moves between two upside down trees is equal to the number of rSPR moves between the two original trees (Lemma 27). For the proof, we show that each sequence of moves between a pair of upside down trees gives an upside down agreement forest for these networks (Lemma 26); and each such upside down agreement forest gives a regular agreement forest for the original trees (Lemma 25).

Definition 19
Let T be a phylogenetic tree with labels $X = \{x_1, \ldots, x_n\}$. The upside down version of T is a network T with $2n^2 + 2$ leaves ($e_{x,i}$ for x ∈ X and i ∈ [2n], y, and ρ) constructed by:
1. Creating the labelled digraph S, which is T with all the edges reversed;
2. Creating the tree D by taking C(X ∪ {y}) and adding 2n pendant edges with leaves labelled $e_{x,1}, \ldots, e_{x,2n}$ to each pendant edge e = (·, x) of C(X);
3. Taking the disjoint union of D and S;
4. Identifying the node labelled $x_i$ in D with the node labelled $x_i$ in S and subsequently suppressing this node for all i.
The bottom part of T is the subgraph of T below (and including) the parents of the $e_{x,1}$. The rSPR distance between two trees can be characterized alternatively as the size of an agreement forest [3]. Here, we use this alternative description as part of the reduction. To define agreement forests, we need the following definitions, which we have generalized slightly to work well for networks.

Definition 20
Let T be a tree (for digraphs: the underlying undirected graph is a tree) with its degree-1 nodes labelled bijectively with X and let Y ⊆ X be a subset of the labels. Then T | Y is the subtree of T induced by Y ; that is, it is the union of all shortest (undirected) paths between nodes of Y .
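The restriction T | Y can be computed without enumerating paths. The sketch below uses a standard pruning technique (not taken from this paper): in a tree, the union of all paths between nodes of Y is what remains after repeatedly deleting degree-1 nodes that are not in Y. The function name `restrict_tree` and the undirected edge-list representation are assumptions for illustration.

```python
from collections import defaultdict


def restrict_tree(edges, Y):
    """Return (nodes, edges) of the union of all paths between nodes of Y.

    `edges` are the undirected edges of a tree; returned edges are frozensets.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    keep = set(Y)
    # Nodes of degree at most 1 outside Y cannot lie on a path between nodes of Y.
    prunable = [x for x in adj if len(adj[x]) <= 1 and x not in keep]
    while prunable:
        x = prunable.pop()
        if x not in adj:  # already removed earlier
            continue
        for nb in adj.pop(x):
            adj[nb].discard(x)
            if len(adj[nb]) <= 1 and nb not in keep:
                prunable.append(nb)
    nodes = set(adj)
    kept_edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    return nodes, kept_edges


# Hypothetical tree with leaves a, b, c and root rho (directions ignored).
tree = [("rho", "u"), ("u", "v"), ("u", "c"), ("v", "a"), ("v", "b")]
nodes_ab, edges_ab = restrict_tree(tree, {"a", "b"})
nodes_ac, edges_ac = restrict_tree(tree, {"a", "c"})
```

For Y = {a, b} only the nodes v, a and b survive, while for Y = {a, c} the path a, v, u, c is kept. Note that the result may still contain degree-2 nodes, since suppression is a separate operation.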

Definition 21
Let G and G′ be labelled digraphs. Suppose G and G′ are labelled isomorphic after suppression of all their redundant nodes (indegree-1 outdegree-1 nodes). Then we write G ≡ G′, or say G is s-isomorphic to G′ (for suppressed isomorphic).
An embedding of a graph H in G is an s-isomorphism H ≡ S of H with a subgraph S of G. We say that H can be embedded in G if an embedding of H in G exists. Note that any subgraph H of G can be embedded in G as H ≡ H.
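The suppression step used in the definition of s-isomorphism can be sketched as follows. This is an illustrative helper, not part of the paper: it assumes an acyclic digraph given as a collection of (tail, head) pairs, and because edges are stored in a set, parallel edges created by a suppression would be merged (which does not occur in forests).

```python
def suppress_redundant(edges):
    """Suppress all indegree-1 outdegree-1 nodes of an acyclic digraph."""
    edges = set(edges)
    while True:
        incoming, outgoing = {}, {}
        for u, v in edges:
            outgoing.setdefault(u, []).append(v)
            incoming.setdefault(v, []).append(u)
        redundant = next(
            (x for x in incoming
             if len(incoming[x]) == 1 and len(outgoing.get(x, [])) == 1),
            None,
        )
        if redundant is None:
            return edges
        parent, child = incoming[redundant][0], outgoing[redundant][0]
        # Replace the path parent -> redundant -> child by the edge parent -> child.
        edges -= {(parent, redundant), (redundant, child)}
        edges.add((parent, child))


path = [("a", "x"), ("x", "y"), ("y", "b")]
fork = [("u", "v"), ("u", "w")]
suppressed_path = suppress_redundant(path)
suppressed_fork = suppress_redundant(fork)
```

Here the path a, x, y, b collapses to the single edge (a, b), so it is s-isomorphic to that edge, while the fork has no redundant nodes and is left unchanged.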
Now we look at an important property of embeddings relating to subgraphs, which implies that being embeddable is transitive.
Proof: The s-isomorphism A ≡ B is an isomorphism of graphs (topological minors) without redundant nodes. This isomorphism is a bijection between the non-redundant nodes of A and the non-redundant nodes of B. The map of the edges is a map of paths of A to paths of B, where the internal nodes of these paths may only be redundant nodes. Now consider the subgraph H of A, and note that the non-redundant nodes of H are non-redundant nodes of A as well. Indeed, the only way to create new non-redundant leaves by taking a subgraph is to create a leaf from a redundant node, but L(H) ⊆ L(A) = L(B), so each degree-1 node of H corresponds to a degree-1 node of A and of B. This means each non-redundant node of H corresponds to a non-redundant node of B, and each edge of H to a path between such nodes in B, and there is an s-isomorphism of H with the subgraph of B formed by these nodes and edges.
Now we turn to the definition of an agreement forest, which, as mentioned earlier, characterizes the rSPR distance. Following the definition of the agreement forest, we define a tool similar to an agreement forest tailored to upside down versions of trees. This upside down agreement forest (udAF) can be turned into an agreement forest of the two original trees.

Definition 22
Let T 1 and T 2 be phylogenetic trees with labels X and root ρ. Then a partition P = {P i } of X ∪ {ρ} is an agreement forest (AF) for T 1 and T 2 if the following hold:
• T t | P i and T t | P j are node-disjoint for all pairs i, j with i ≠ j and fixed t ∈ {1, 2}.

Definition 23
Let T be a phylogenetic tree with label set X, where |X| = n. Then an upside down agreement forest (udAF) for the upside down version of T is a directed graph F such that:
• The underlying undirected graph of F is an (undirected) forest;
• F is a leaf-labelled graph with label set $\{e_{x,i} : x \in X, i \in [2n]\} \cup \{\rho\}$, where each label appears at most once;
• F ≡ S for some subgraph S of the bottom part of the upside down version of T .
Note that the third requirement implies the first (Figure 27).

Lemma 25
Let T and T′ be phylogenetic trees with label set X. If F is an udAF for the upside down version of T and for the upside down version of T′, then there exists an AF of T and T′ of size at most |F |, where |F | denotes the number of components of F .
Proof: Let K be the set of components of F . For each K ∈ K we define the following part of the agreement forest: where e x,i indicates the i-th leaf of T corresponding to x. The agreement forest consists of these parts (ignoring the empty ones, resulting from components that have no complete sets of leaves), together with one part for each leaf that is in none of these parts, i.e.
Note that each component Y of AF corresponds either uniquely to a component K of F which has all e x,i for some leaf x, or it corresponds to a leaf x for which not all e x,i are contained in one component of F . In the last case, there is a component of F consisting of one leaf e x,i for some i. Note that this correspondence AF → K must therefore be injective, and AF has size at most |F |. What remains to prove is that AF is indeed an agreement forest for T and T .
Let F be the subgraph of F where each component K is restricted to the subgraph consisting of all paths between the leaves in K AF . As (per definition of an udAF) F can be embedded in the bottom part of T , F can also be embedded in the bottom part of T as it is a subgraph of F (Lemma 24). This embedding must be unique, because it is of a labelled forest into a labelled tree.
Let E x be the subgraph of T induced by the leaves e x,i and their parents for all i and a fixed x. Now replace each subgraph E x with one leaf x in both F and in the bottom part of T . Let the resulting graphs be F s and B s . Subsequently reverse the direction of each edge in both B s and in F s with resulting graphs B r and F r . Note that the resulting graphs are B r = T and the union F r = ∪ K∈K T | K AF , and all the restricted trees T | K AF are node disjoint. We repeat this argument for T , and note that the modifications from F to F r are independent of T , so we have the equality F r = ∪ K∈K T | K AF , where the parts T | K AF are again node disjoint. This means T | K AF ≡ T | K AF for each K ∈ AF corresponding to a non-trivial component of F , and T | Pi and T | Pj are node disjoint for all nontrivial parts P i and P j of AF (similarly for T ). Hence so far the elements of AF corresponding to non-trivial components of F , meet all the requirements of an AF.
The only other elements of AF contain only one label, each of which is not in any of the nontrivial components of AF . Hence, for any such label x, the restriction T | {x} consists of only the node labelled x, which is not contained in any other component by definition (and similarly for T ). Furthermore, the s-isomorphism T | {x} ≡ T | {x} is trivial. Hence, AF is indeed an agreement forest.
The preceding lemma shows that an udAF for two upside down trees gives an AF for the original trees of the same size. We still lack a connection between the number of head moves and an udAF, however. The following lemma shows that appropriate head move sequences correspond to udAFs of size related to the number of head moves.
Lemma 26 Let T and T′ be trees with label set X, and |X| = n. Suppose S is a sequence of head moves N 0 , . . . , N |S| of length |S| < 2n, where N 0 is the upside down version of T and N |S| is the upside down version of T′. Then there is an udAF F of these two upside down versions with at most |S| + 1 components.
Proof: Let B be the bottom part of N 0 , the upside down version of T . We use induction on the number of moves to prove that there exist subgraphs F i of N i which can be embedded in B and have |F i | ≤ i + 1 components. Finally we prove that the subgraph F |S| of N |S| , the upside down version of T′, must actually be a subgraph of the bottom part of N |S| . As a base of the induction, set F 0 = B, which is connected, can clearly be embedded in itself, and is a subgraph of N 0 . Now suppose we have subgraphs F i of N i with embeddings of F i in B and |F i | ≤ i + 1 for all i < j ≤ |S|. We prove that there also exists a subgraph F j of N j with at most j + 1 components that can be embedded in B.
Note that F j−1 is a subgraph of N j−1 and therefore the moving edge e j = (u, v) can be either an edge of F j−1 , or it is in the complement N j−1 \ F j−1 . In the last case e j can have only its endpoints in F j−1 . Now construct F j as follows: • remove edge e j = (u, v) from F j−1 if it was contained in it; • clean up the resulting graph by removing all edges not contained in any undirected path between two leaves, and suppressing v if it is a degree 2 vertex after removal of (u, v).
• add the new endpoint if necessary. That is: let the target edge of the move be t, if t is contained in the graph after cleaning up, subdivide t.
Note that F j can be embedded in F j−1 because the only operations were: restriction to a subset of labels, subdivision, and suppression (Lemma 24). Because F j−1 embeds in B, there is also an embedding of F j into B. Furthermore, F j is a subgraph of N j by construction: the three steps correspond exactly to the three steps of a head move in N j−1 . Lastly, F j has at most one more component than F j−1 , because the only operation

Discussion
When generalizing rSPR moves on rooted trees to rooted networks, it is natural to consider tail moves, because each rSPR move in a tree is a tail move. However, when taking the view that an rSPR move is a move that changes one of the endpoints of an edge, head moves also belong to the generalization of rSPR moves [9]. In this view, it is equally natural to only consider head moves, as to only consider tail moves.
We have shown that head moves are sufficient to connect all tiers of phylogenetic network space except tier-0. This might be surprising because head moves are relatively limited compared to tail moves: the head move neighbourhood is small compared to the tail move neighbourhood. On the other hand, when one reverses all the edges of a network, each tail move becomes a head move. This makes the difference between connectivity results for these types of moves just a mathematical difference in numbers of roots, reticulations, and leaves, instead of a fundamental difference in biological interpretation.
To unify these connectivity results, one could consider head or tail moves in a broader class of networks, which may have multiple roots and at least one leaf (instead of at least two) (Figure 28). For such multi-rooted networks, connectivity results for head moves and for tail moves could easily be related. This reason for studying multi-rooted networks is mathematically inspired. Another reason to study multi-rooted networks is inspired by biology: these networks could be interesting on their own as subnetworks of ordinary phylogenetic networks. The advantage is that one does not have to make assumptions about how these roots are connected higher up, that is, about the evolutionary history before the existence of these root genes or species [12]. Additionally, a famous but slightly dated view of the evolutionary history is the net of life by Doolittle, which features multiple roots [5]. A third reason becomes apparent when we take a broader view of phylogenetic networks that includes pedigrees: these often start with multiple individuals that may coalesce in the distant past.
While we focussed mostly on head and tail moves of any distance, we have proven the connectivity of tiers of phylogenetic network space by distance-2 head moves. Distance-1 head moves are not sufficient in general because heads cannot move past their own tails. It would be interesting to see which networks are actually connected by distance-1 head moves.
Figure 29: A distance-2 head move in a network and the displayed trees before (left) and after (right) the head move. The top displayed tree is the same before and after the head move. The bottom tree before disappears, and is replaced by the tree bottom-right, which is a distance-3 tail move away from the top tree.
It is unclear whether this connectivity result for distance-2 moves is useful; especially in the context of embedded gene trees, its relevance may be disputed. After all, local head moves do not generally correspond to local moves in the displayed trees (Figure 29). Problems studying the relation between displayed trees, which are interpreted as gene trees, and phylogenetic networks are often quite hard [4,16,15]. Hence, strategies for solving these problems could benefit from local search heuristics.
As mentioned, an important motivation for studying rearrangement moves is their possible use in local search strategies for phylogenetic networks. As such, it is important to understand the topological and geometric properties of the tiers of phylogenetic networks space. In this paper, we started this study by giving bounds on the head move diameters, and finding additional connections between head move distance, tail move distance, and rSPR distance.
Although the bounds for the head move diameter we found are already quite good (both the upper bound and the lower bound are linear in the number of leaves and the reticulation number, just like for tail moves and for rSPR moves [17]), they could possibly be improved.
As future research, one could try to discover the exact diameters. Another direction would be to try to apply our techniques for bounding the diameter to other types of moves, such as SNPR and PR moves [2,20]. Because these classes of moves also include vertical moves, this might be quite challenging.
The other property of phylogenetic network space defined by head moves we touched on was the neighbourhood size. The head move neighbourhood is relatively small. Nevertheless, it is still possible to reach any network quite quickly, as the diameter still grows linearly. This means head moves might be very well suited for local search heuristics.
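To make the role of neighbourhood size concrete, the following is a minimal sketch of a generic steepest-descent local search. It is not taken from this paper: the names `hill_climb`, `neighbours` and `score` are hypothetical, and the sketch assumes that the caller supplies a function enumerating the head move neighbourhood of a network and an objective to minimize. Each step evaluates the whole neighbourhood once, so a small neighbourhood keeps every step cheap, while the number of steps needed to reach a given network is bounded below by the distance to it.

```python
from typing import Callable, Iterable, Tuple, TypeVar

N = TypeVar("N")  # stands for a network; any Python object works in this sketch


def hill_climb(
    start: N,
    neighbours: Callable[[N], Iterable[N]],  # hypothetical: e.g. all head move neighbours
    score: Callable[[N], float],             # hypothetical objective, lower is better
    max_steps: int = 1000,
) -> Tuple[N, float]:
    """Steepest-descent local search; returns the last network visited and its score."""
    current, current_score = start, score(start)
    for _ in range(max_steps):
        best, best_score = None, current_score
        # One step costs one score evaluation per neighbour.
        for candidate in neighbours(current):
            candidate_score = score(candidate)
            if candidate_score < best_score:
                best, best_score = candidate, candidate_score
        if best is None:
            break  # no strictly better neighbour: a local optimum
        current, current_score = best, best_score
    return current, current_score


# Toy stand-in for network space: integers, where a "move" changes the value by one.
result = hill_climb(0, lambda n: [n - 1, n + 1], lambda n: (n - 7) ** 2)
```

In the toy example the search walks from 0 to 7 one step at a time. On a real objective the same loop may stop at a local optimum that is not globally optimal, which is why the shape of the optimization landscape matters in addition to neighbourhood size and diameter.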
Of course, there are other factors to consider. Head moves might not have the proper relation to the studied phylogenetic objectives; for example, they could give irregular optimization landscapes. In phylogenetic tree space, for instance, NNI moves give local optima (not globally optimal) for maximum parsimony, whereas SPR moves only give a global optimum for perfect sequence data [29]. It would be interesting to analyse such relations for networks, too, for example by studying the occurrence of local optima for different kinds of parsimony [7,14] using the existing types of rearrangement moves.
Another possible complicating factor in the relation between head moves and the optimization objective could be that head moves might be too restrictive for some types of networks. Indeed, we have not studied head moves for subclasses of networks. It might be useful to see whether head moves also connect tiers of tree-child networks, for example. Such questions have been answered for other moves [2].
Lastly, in this paper we have studied the problem of computing the head move distance between two networks. For tail moves and rSPR moves [17], and for SNPR moves [21], it was already known that computing the distance between two networks is NP-hard. For the first two of these, we additionally know that computation of distances is hard for each tier. Here, we have shown that computing head move distance is also NP-hard, although we have not shown this for each tier separately. A first step in proving hardness in each tier might be to study head move distance computation in tier-1.
It could also be interesting to find an efficient algorithm for the task of finding a shortest head move sequence, or to characterize the exact distance between two phylogenetic networks in a more abstract way. No efficient (FPT) algorithm for this task is known, nor are there any exact characterizations of distances between networks given by rearrangement moves. A first attempt was recently made using a generalization of agreement forests; however, this approach currently yields only exact distances between trees and networks, and no exact distances between two networks [20,21].