Comparing the topology of phylogenetic network generators

Phylogenetic networks represent evolutionary history of species and can record natural reticulate evolutionary processes such as horizontal gene transfer and gene recombination. This makes phylogenetic networks a more comprehensive representation of evolutionary history compared to phylogenetic trees. Stochastic processes for generating random trees or networks are important tools in evolutionary analysis, especially in phylogeny reconstruction where they can be utilized for validation or serve as priors for Bayesian methods. However, as more network generators are developed, there is a lack of discussion or comparison for different generators. To bridge this gap, we compare a set of phylogenetic network generators by profiling topological summary statistics of the generated networks over the number of reticulations and comparing the topological profiles.


Introduction
A phylogenetic network is a directed acyclic graph that represents the evolutionary history of a set of species [26]. With the recent advancements in the theory of phylogenetic networks and the accumulation of evolutionary data, it is becoming increasingly viable and common to represent the evolutionary hierarchical relations with phylogenetic networks instead of phylogenetic trees, which cannot record reticulate evolutionary processes such as horizontal gene transfer (also known as lateral gene transfer), hybrid speciation, hybrid introgression, and recombination [17,4,5].
In evolutionary analysis, the stochastic process that generates random trees and random networks play an important role, especially in the reconstruction of phylogenies. Similar to phylogenetic tree reconstruction, there are two main types of methods for rebuilding evolutionary histories as phylogenetic networks: combinatorial methods and model based methods. Combinatorial methods are based on, for example, distances between taxa [7,9,31] or gene trees [2,38,50,29,28], while model based methods involve Bayesian statistics [53,55] or likelihood calculations [52,47] in models of sequence and lineage evolution. For both types of methods, random phylogenetic networks are needed in validating the reconstruction methods. Moreover, for Bayesian methods, the stochastic processes used to generate random networks also serve as priors [53]. In this paper, we call the stochastic processes that generate random phylogenetic networks the network generators. Note that this is different from the generator of a network defined in [27], which encodes the underlying reticulate structure of a phylogenetic network.
There are a few approaches that existing network generators usually take to constructing random networks. One approach is to alter a tree generator and add hybrid edges, for example, network generators [29,43] that are based on the Yule-Harding model [23] or on a continuoustime Markov model of speciation to a birth-hybridization Markov model [53,42]. The hybrid edges can either be added during the process of building a tree or after a tree is generated. For these methods, there is also an option to add the hybrid edges in a way that favours shortdistance (local) hybrid edges [42,43].
Other approaches that do not rely on tree generating processes include assembling a network directly from degree constraints [54] and sampling a network in the network space by random local surgeries, e.g., rearrangement moves which move one endpoint of an edge [21,32]. The latter is used in Bayesian methods for sampling posteriors [53,55]. In order to present a more comprehensive comparison of network generators, we include two new network generators in our analysis, in addition to the ones already available in the literature. The first is based on the beta-splitting model for trees [1], which we extend to networks by adding edges to a generated tree in various ways. The second is based on a birth-death process [24] which we extend with distance dependent hybrid speciation events.
For phylogenetic trees, it is well discussed which stochastic process generates phylogenetic trees that resemble trees reconstructed from data [6], and random tree generators are well compared from different points of view [48,40,12,3]. However, as increasingly many network generators are developed, there is a lack of discussion about the differences and similarities of generated random networks. To bridge this gap, we examine and compare network generators by profiling topological summary statistics of the generated networks over the number of reticulations in the networks and comparing the topological profiles. The topological summary statistics used for the comparison include network balance, number of cherries, number of reticulated cherries, number, size and level of blobs as well as network class. Moreover, we generalize the recent polynomial representation for trees [39] to networks and leverage the high-resolution polynomial representation in comparing small phylogenetic networks.
We show that generators based on different types of approaches can generate networks with similar topological profiles. If a network generator is based on a tree generator, the topological characteristics are dominated by the topology of the underlying trees when the number of reticulations is small; and as the number of reticulations increases, the topological characteristics may diverge with respect to the integrated reticulation structures and the locality of added reticulations.

Topological profiles
A phylogenetic network or simply a network is a rooted directed acyclic graph whose out-degree zero nodes (leaf nodes) have in-degree one. We study binary networks, that is, every internal node in the network has in-degree one and out-degree two (tree nodes) or in-degree two and outdegree one (reticulation nodes). In a phylogenetic network, a reticulation event is represented by a reticulation node and the edges pointing to the reticulation node which are called hybrid edges. A reticulation node together with the hybrid edges incident to it is a reticulation structure.
The topological profile of a network generator is defined to be the topological summary statistics of generated networks as a function of number of reticulations. In other words, the topological profile of a network generator describes how the topological summary statistics change as the number of reticulations increases. We choose commonly used topological characteristics of phylogenetic networks for our comparison. These characteristics include network balance, number of cherries and reticulated cherries, size and level of blobs as well as proportions of generated networks that belong to various network classes.

Network balance
The balance of phylogenetic trees is a well studied topological characteristic. The balance for phylogenetic networks is recently revisited in [3], where the B 2 balance for phylogenetic networks [46] is reintroduced. The B 2 balance is an index based on the entropy of probability distribution on leaf nodes. Specifically, let N be a network with the set of leaf nodes denoted by L and p l be the probability of a directed random walk starting from the root and ending at the leaf node l, then the B 2 balance of N can be computed by B 2 (N) = − ∑ l∈L p l log 2 (p l ).
The maximal value of B 2 balance for a network with n leaf nodes is log 2 (n), which corresponds to the most balanced case where every leaf node has probability 1/n. For our comparison, we use the normalized B 2 balance, which is computed by dividing the B 2 balance a of network with n leaf nodes by log 2 (n).

Cherries and reticulated cherries
The number of small subgraphs such as cherries and pitchforks is another well-studied topological characteristics for phylogenetic trees [12]. A cherry of a network is a subgraph consisting of two leaf nodes x and y that share a common parent node. For our comparison of phylogentic networks, in addition to the number of cherries, we also compute the number of reticulated cherries which are subgraphs consisting of two leaf nodes x and y, a reticulation node w as the parent node of x (or y), and the common parent of y and w (or x and w) [7].

Blobs
A blob in a network is a maximal connected subgraph without cut edges and consisting of at least two nodes [22]. The size of a blob in a network is the number of nodes in the blob; the level of a blob is the number of reticulation nodes in the blob; and the level of a network is the maximal level over all its blobs.

Network classes
To overcome the obstacles faced in extending the mathematical theory of phylogenetic trees to phylogenetic networks, some subclasses of phylogenetic networks were introduced. These network classes include tree-child networks, stack-free networks, orchard networks, and treebased networks. We examine the proportion of generated networks that belong to each of the classes as another topological characteristic in the profile of a generator.
A network is tree-based if it has an embedded spanning tree whose leaf nodes are exactly the leaf nodes of the network [20]. A network is orchard if it can be reduced to a network consisting of a single vertex by a sequence of cherry reduction operations [33,18]. A stack in a phylogenetic network is a pair of adjacent reticulation nodes. A network is stack-free if there are no stacks in the network [45]. A network is tree-child if each internal node of the network has at least one child that is not a reticulation node [11].
For binary networks, a tree-child network is also a stack-free network as well as an orchard network [33]. Moreover, every stack-free network or orchard network is also tree-based [54,25,30].

Polynomial comparison
A polynomial representation for phylogenetic trees was introduced in [40]. The polynomial representation is an isomorphic invariant for trees, that is, two trees are isomorphic if and only if they have the same polynomial representation. Moreover, the polynomials for rooted trees are irreducible, that is, they cannot be factored as a multiplication of smaller polynomials [39]. We leverage these results to extend the polynomial representation for further comparing phylogenetic network generators.
A natural way to connect phylogenetic networks with trees is by considering their sets of spanning trees. Let N be a phylogenetic network and T N be the deck of rooted spanning trees of N, that is, T N may contain identical elements. It is natural to represent N with the polynomial P(N, x, y) = ∏ T∈T N P(T, x, y). It is unknown in general whether graphs can be reconstructed from their decks of spanning trees. However, it has been proved that two tree-child networks are isomorphic if and only if they have the same set of embedded spanning trees [19]. If we compute the polynomial for tree-child networks based on the set of embedded spanning trees, it is immediate that two tree-child networks are isomorphic if and only if they have the same polynomial. However, although not all networks are tree-child networks, it is still useful to compute their polynomials for comparison.
To compare the networks, We use the polynomial distance defined in [40], that is, the Canberra distance [37] between the polynomial coefficient vectors of the networks. Then we use multi-dimensional scaling [13] to visualize the distances, and we deploy k-medoids clustering [35] on the distances to determine if networks generated by different generators can be distinguished by the polynomial distance.

Network generators
Phylogenetic network generators typically alter tree generators by introducing reticulation event to the process. An important phylogenetic tree generator is the beta-splitting model which is a generalization of the Yule model and the PDA model [1,6].
For our comparison, we select the LGT network generator [43] and the ZODS network generator [53] as representations of network generators based on the Yule model (i.e., beta-splitting model with β = 0). We compare these network generators to a new network generator which is an extension of Heath's tree generator [24]. Another network generator we choose is the NTK network generator [54] which is described not in terms of a branching process, but it utilizes the degree constraints of networks for direct sampling. Moreover, for comparison, we also introduce two additional network generators as references. The first is an extension of the beta-splitting tree generator. Instead of adding reticulations in the process, the beta-splitting network generator adds reticulations to generated trees in various ways. The second is the MCMC network sampler which is commonly used in Bayesian methods for sampling posteriors, but we employ it as a standalone generator.
For network generators that are based on a phylogenetic tree generator, we will discuss three types of reticulation structure integrated in the branching process. See Figure 1. For a pair of leaf nodes x and y of the network under generation at the current state, the n-type reticulation adds an edge from x to a new leaf node x , an edge from y to a new leaf node y and a horizontal edge from x to y (or y to x); the y-type reticulation glues the leaf node x and y and produces a new leaf node z; the m-type reticulation branches at x and y and glues a pair of new leaf nodes producing three new leaf nodes x , y and z. The n-type reticulation structure has a natural interpretation as an HGT or an hybrid introgression event, and the m-type can be interpreted as a hybrid speciation event.
x y leaf nodes x y x y n-type m-type x y x y z x y z y-type

LGT network generator
The LGT network generator is a network generator designed to model the lateral gene transfer (LGT) events. The generator is introduced in [43], and, from its description, the speciation events are designed as in the Yule model, that is, a leaf node at the current state is chosen uniformly at random for branching. The LGT network generator uses n-type reticulations and has two parameters governing the introduction of reticulation events: a parameter α ∈ [0, 1] controls the probability of the next event being a reticulation event instead of a speciation event, and a parameter γ ≥ 0 regulates the probability of a reticulation event being within an existing blob, where smaller γ corresponds to higher probability of a local reticulation event. Hence, if the number of events is fixed, the parameter α influences the number of reticulations in the network, and the parameter γ influences the level of the network. For our comparison, we modified the code of the generator so that we have networks with the same number of reticulations and number of leaf nodes, making the α parameter irrelevant for our comparison.

ZODS network generator
The ZODS network generator is a birth-hybridization process introduced in [53] as a prior in the Bayesian method. The generator uses y-type reticulations and has a constant speciation rate λ and a constant hybridization rate ν, that is, at a step when there are k extant leaf nodes in a network, the speciation rate is λk and hybridization rate k(k − 1)ν/2. Besides the reticulation type, another difference between the ZODS network generator and the LGT network generator is that the hybridization rate. The hybridization rate in the ZODS network generator depends quadratically on the number of leaf nodes, whereas in the LGT network generator, the hybridization rate depends linearly on the number of leaf nodes.

NTK network generator
A binary phylogenetic network has the degree constraints that a tree node has in-degree one and out-degree two, a leaf node has in-degree one and out-degree zero, and a reticulation node has in-degree two and out-degree one. The NTK network generator introduced in [54] does not rely on a branching process but utilizes these degree constraints and generates networks as follows. To generate a random network of a certain size, the generator selects uniformly at random n nodes as leaf nodes and r nodes as reticulation nodes from 2(n + r) − 1 nodes, and randomly inserting edges between a pair of nodes so that the degree constraints are met.

Beta-splitting network generator
The beta-splitting network generator is based on the beta-splitting model for trees [1,6,44]. The beta-splitting model for trees is a generalization of the Yule model (β = 0) and the PDA model (β = −1.5). Here, we introduce three methods to create a network from a generated tree by repeatedly adding edges to the tree. The first method picks two edges uniformly at random, subdivides these two edges with two vertices respectively and adds another new edge between these two vertices with a random direction that does not introduce cycles (Algorithm 1); the second chooses edges more locally governed by a single parameter P stop ∈ [0, 1] that stochastically determines how local the cycles are, a high P stop leads to more local reticulations (Algorithm 2); the third picks two leaf nodes, subdivides their incoming edges with two vertices and adds a horizontal edge between the two vertices with a random direction (Algorithm 3). We call the process of generating networks based on the PDA trees in these ways the PDA network generator, and the generated networks the PDA networks.

MCMC network sampler
The Markov-chain-Monte-Carlo (MCMC) network sampler performs a random walk through the space of phylogenetic networks by applying rSPR moves, small changes to the network that move one endpoint of an edge to another edge [21]. By selecting networks after sufficiently many random rSPR moves, the MCMC network sampler is able to sample (almost) uniformly from the set of (internally labelled) networks, as this space is connected [32]. The MCMC network sampler lies at the basis of a common method in Bayesian analysis for sampling posteriors, as in PhyloNet [49] and SpeciesNetwork [53].

Heath network generator
Finally, we present a new network generator based on the branching process introduced in [24], which uses auto-correlated speciation and extinction rates. For our analysis, we set the extinction rate to zero. The branching process is extended by adding hybrid speciation events, i.e., we integrate the m-type reticulation in this model. We introduce a new distance dependent hybridization rate f hyb (d) between two leaf nodes at the current state, which depends on the weighted distance d between the leaf nodes, where d is the sum of the lengths of all up-down paths between the two leaves, weighted by the probability of the up-down path, i.e., the product of the edge-probabilities on the path. The function f hyb (d) is piecewise linear, and it is linearly decreasing in d (for h l ≤ d ≤ h r ) with a minimum and maximum rate h lr and h rr .
See Appendix A.2.1 for more details of the generator, and for an efficient method to update the weighted distance d after several types of evolutionary events. Using these, our distance dependent extension is also viable for other tree generators than the Heath tree generator.

Data
To compare the network generators, we generate a large number of networks with each of them, varying the parameters if there are any. To keep the comparison straightforward, we generate networks with 100 leaf nodes for this purpose, and we vary the number of reticulations. When using the polynomial to compare networks, we use smaller networks with 10 leaves.
LGT networks. For the LGT network generator, we use three different parameter values (0.01, 0.1, and 1.0) for the parameter γ that regulated the locality of the reticulations. The parameter α becomes irrelevant as we fix the number of reticulation nodes and leaf nodes as mentioned before. For each of these parameter values, we generate 500 networks with r = 1, 2, 3, 4, 5, 10, 20, 50, 100, 200, 500 reticulations. This results in a set of 16500 networks.
ZODS networks. Setting the number of leaf nodes and reticulation nodes, the parameters become irrelevant as we have argued before. Hence, we generate networks with a set number of reticulations using a reticulation rate that gives a large probability to generate such a network. We used the parameters ν = 0.0005, 0.001, 0.002 to create 500 networks for each reticulation number r = 1, . . . , 5; we used ν = 0.005 to create 500 networks with 10 and 20 reticulations, ν = 0.01 for 500 networks with 20 and 50 reticulations, ν = 0.015 for 500 networks with 50 and 100 reticulations, and, finally, ν = 0.02 for 500 networks with 100 and 200 reticulations. Hence, the full set of networks totalled 23 · 500 = 11500 networks.
MCMC sampled networks. We sample networks with r = 1, 2, 3, 4, 5, 10, 20, 50, 100, 200, 500 reticulations, starting with a network sampled by the beta-splitting network generator with β = −1.0, −1.5 adding edges uniformly. For each of these parameter combinations, we have sampled 500 networks. This results in a set of 1000 networks for each reticulation number, and 11000 networks in total.
Each sample is taken after 5000 proposals. Because about half of the proposals were rejected as invalid rSPR moves, we took about 2500 steps through the space of networks between each sample. As the diameter of the space of networks with n leaf nodes and r reticulation nodes is at most 2n + 3r − 2; see [34]. This is reasonable to make the samples independent for n = 100 and r ≤ 500.
Note that we do not generate networks with 100, 200, and 500 reticulations with this generator. This is because the Heath network generator uses m-type reticulation events, which add a leaf to the network. Hence, to generate a network with r reticulations, the network must contain at least r + 2 leaves, so it is impossible to generate networks with 100 leaves and 100 reticulations.

Topological profiles 3.1.1. Balance
It is displayed in Figure 2 that the balance of the generated networks depends primarily on the balance of the underlying tree model. For the generators based on the Yule model, that is, the LGT network generator and the ZODS network generator, the B 2 balance of networks with few reticulations are similar to the beta-splitting networks at β = 0 (Yule model).
Similarly, for beta-splitting networks, the values of the parameter β determining the balance of the underlying trees also determine the B 2 balance of the network when the number of reticulations is small. As the number of reticulations increases, the balance differs with respect to the method of adding reticulations. Adding reticulations horizontally at leaf nodes makes the network more balanced, while adding reticulations locally leads to less balanced networks.
Regarding the MCMC network sampler, networks with more reticulations are more balanced on average than networks with fewer reticulations. This is because, in the MCMC network sampler, the networks with few reticulations are relatively unbalanced compared to the other network generators, as unbalanced networks are over-represented among all distinguishable arrangements. In fact, when samples are taken after sufficiently many steps, the MCMC network sampler theoretically behaves exactly the same as the PDA model (beta-splitting model with β = −1.5) on trees, which is known to be relatively unbalanced.

Cherries and reticulated cherries
The relations between the number of cherries, the number of reticulated cherries and the number of reticulations for different network generators are displayed in Figure 3. For all network generators examined, the number of cherries decreases as the number of reticulations increases. This is because when adding hybrid edges, existing cherries can be removed while new  cherries can never be created. It is clear that for network generators based on a branching processes, the initial number of cherries depends on the underlying tree shapes. In particular, the network generators based on the Yule model, namely the LGT and ZODS network generators, have similar initial number of cherries to the beta-splitting network with β = 0 (Yule model). It can also be observed that the initial number of cherries of the NTK network generator is similar to the beta-splitting network with β = 0 as well, and the number of cherries of the MCMC network sample resembles the PDA networks (the beta-splitting network with β = 0).
For the LGT, ZODS, and Heath network generators, the number of reticulated cherries increases with the number of reticulations. The rate with which the number of reticulated cherries increases depends on the reticulation structure used in the process. Specifically, the ZODS network generator uses y-type reticulation structure, and the number of reticulated cherries increases the slowest; The LGT network generator uses n-type reticulation structure, and the number of reticulated cherries increases faster; the Heath network generator uses m-type reticulation structure, and the number of reticulated cherries increases the fastest. Also note that adding reticulations locally makes the number of reticulated cherries go up faster in the LGT network generator.

Blobs
We examine the relations between the number of blobs, the level, the blob size of networks and the number of reticulations in the networks generated by different generators. Figure 4 displays the results. For all the generators examined, The blob size increases slower than the number of reticulations, but eventually the ratio between the blob size and the number of reticulations becomes constant as the number of blobs in the networks becomes one. More precisely, in a network with n leaf nodes and r reticulations, as r increases and the number of blobs in the networks becomes one, the number of nodes in the largest blobs converges to n + 2r ≈ 2r and the ratio between the level of the network and the number of reticulations r converges to 1.
Trivially, adding edges locally results in more and smaller blobs. The type of locality does affect the profile for the number of blobs subtly as well. The LGT network generator explicitly forces local reticulation events to stay within a blob. Hence, blobs cannot merge in a local event and networks often have multiple blobs even when they have a large number of reticulations. The beta-splitting network generator and the Heath network generator enforce locality differently, where blobs are allowed to merge, which nearly always leads to networks with one blob when a large number of reticulations are added.
Despite the qualitative similarity in the profiles, the Heath network generator with m-type reticulations has distinct profiles with a more significant change in the number of blobs. Similarly, the beta-splitting network generator with different parameter β also generates qualitatively similar profiles but with differences in the quantities; see Figure C.2

Network classes
Lastly, We study the proportions of generated networks that belong to each of the network classes. Figure 5 displays the proportions with respect to the number of reticulations. As the number of reticulations increases, most of the generated networks fall out of any of the four classes.
For the beta-splitting networks, the beta parameter had negligible influence, so we combined data from all values of betas; see the left panel in Figure 5. The curves for all different methods of adding reticulations are very similar. One exception is that adding edges horizontally always leads to networks that are orchard, and thus also tree-based. In Lemma Appendix B.2, we show that it is in fact impossible for this method to produce non-orchard networks.
For the other network generators, the general results are similar. The exceptions are LGT networks and Heath networks. The LGT network generator, like the beta-splitting model that adds edges horizontally, only produces orchard networks. The Heath network generator is the most restricted, because these only produce tree-child networks as a result of its m-type reticulations.

Polynomial comparison of small networks
We deploy the polynomial as an alternative comparison between generated networks. We compare the network generators with similar profiles, namely the ZODS network generator versus the NTK network generator and the MCMC network sampler versus PDA network generator (with reticulations added uniformly). As a reference, we also compare the LGT network generator (with reticulations added uniformly, i.e. γ = 1.0) vs the NTK network generator. Note that  other than the network classes, the LGT network generator and the NTK network generator have similar topological profiles. Figure 6 displays the misclassification rates from k-medoids clustering sets of networks, the experiment is performed on networks with 10 leaves, and repeated 50 times for each number of reticulations. A misclassification rate of 0.5 means that the two sets of generated networks cannot be distinguished, while a misclassification rate of 0 means the two sets of generated networks can be completely distinguished. When the number of reticulations in the networks reaches 10, the polynomial distance can distinguish the LGT networks and NTK networks. However, the polynomial distance cannot successfully distinguish the ZODS networks from the NTK netowrks or MCMC networks from the PDA networks.

Discussion
We have compared the topological profiles of several network generators. In general, the topological characteristics are related to the underlying tree shapes when the number of retic-ulations is small. For generators based on a branching process, as the number of reticulations increases, the topological characteristics change with respect to the type of integrated reticulation structures and the locality of added reticulations. For example, y-type reticulations (ZODS network generator) can be used to create any network in any network class, and they lead to a relatively large number of cherries which seems to persist even for very large numbers of reticulations. This large number of cherries can be explained by the relatively large number of speciation events needed to generate networks with larger numbers of reticulations. Using ntype reticulations (LGT network generator) always leads to orchard networks, and a relatively high balance compared to the other generators. While using m-type reticulations (Heath network generator) always leads to tree-child networks, and to a sharp increase in the number of reticulated cherries, because each m-type reticulation involves two reticulated cherries.
For generators that are not based on a branching process, we have found resemblances between the NTK network generator and the ZODS network generator, as well as between MCMC network sampler and the PDA network generator (with reticulations added uniformly). The former resemblance could be explained by seeing the ZODS network generator as having a predefined set of nodes with degree constraints, which are connected in a top-down fashion. This makes it quite similar to the NTK network generator, but whose description does not explicitly include a top-down order for connecting the nodes. The resemblance between the MCMC network sampler and the PDA network generator is less obvious, as there only a clear relation between these methods when there are no reticulations. It remains an open problem whether the theoretical equivalence for these models on trees can be extended formally to an equivalence on networks.
These resemblances have been further analyzed by comparing small networks generated by the set of generators using the polynomial distance. We found that among the pairs of network generators with similar topological profiles, the polynomial distance can distinguish the LGT networks from the NTK networks, but cannot successfully distinguish ZODS networks from the NTK networks or MCMC networks from PDA networks. We limited our analysis to small networks due to the computation complexity of the polynomial. For larger phylogenetic networks, we can compute the average of coefficients over the set of spanning-tree polynomials to avoid too many multiplications of polynomials. Consider the polynomial coefficients as vectors, we can also compute the diversity [10] of the set of spanning-tree polynomials of a network. This would provide additional information about the topology of a network.
Perhaps surprisingly, we observed that adding reticulations locally does not only affect the number of blobs and their sizes, but also affects the balance of the networks. This effect is more prominent in the beta-splitting and Heath network generators than in the LGT network generator. It would be interesting to see whether it can be explained mathematically why local reticulations lead to networks with lower balance, and why this effect is not as strong as in the LGT network generator.
As we mentioned in the introduction, there are two main applications for network generators in phylogeny reconstruction: as a tool to validate network reconstruction methods, and as prior for Bayesian methods. The profiles and the analysis of the network topology presented in this paper can help select suitable generators in these tasks. It is an important question in validation which generator produces realistic networks that resemble networks reconstructed from data. This question is discussed for phylogenetic trees [6], but is not investigated for phylogenetic networks. One of the gaps in this investigation is the lack of realistic networks reconstructed from data. Once sufficient realistic networks become available, for example through reconstructions [41,51] or through simulated evolution [14,15,16], the topological profiles and the polynomial can be utilized to compare the natural evolutionary processes with simulated evolution models or the random network generators, and networks reconstructed from data with the generated random networks.
To provide a more flexible framework for adding hybridization events to tree generators, we also developed a new network generator based on the Heath branching process. This network generator uses the m-type reticulation structure, and hybrid events are more likely for closely related species, which is modelled by a distance that we update efficiently after each event. Consequently, the Heath network generator has different topological profiles from the other network generators. For all these network generators based on a branching process, we did not include extinction events, which would change the topological profiles; see Appendix C.1. Moreover, extinction would allow network generators using m-type reticulations to have more reticulations than leaves, and network generators with m-type and n-type reticulations to generate networks outside the class of tree-based networks.
In the perspective of Bayesian methods, our results suggest that the priors introduced in [49] and [53] can behave quite differently in terms of balance and number of (reticulated) cherries, for in the former an explicit prior is absent so the actual prior is the MCMC network sampler, and in the latter, the prior is the ZODS network generator. Moreover, if a generator that is unable to generate all networks (e.g. the LGT network generator) is used as prior, then there exist networks that cannot be traversed by the random walk in sampling posterior distribution. Using such a prior thus requires new investigations on the connectedness of the spaces of networks that can be generated by such a generator, like for tree-child networks in [8,36].

Algorithm 3: ADDEDGEHORIZONTAL(N)
Data: A network N Result: A network N obtained from N by adding one arc. 1 Set N = N; 2 Pick two leaves l 1 , l 2 of N uniformly at random; 3 Subdivide the incoming edges of these leaves with nodes v 1 , v 2 ; 4 Add the arc (v 1 , v 2 ) to N ; 5 return N ;

. Heath networks
In this section, we give more details pertaining the implementation of our network generator based on the Heath branching process [24]. Algorithms 4 and 5 contain the pseudo-code for the model, and in the next subsection, we show that updating the distances as we do in Algorithm 5 is correct.
Data: A rate r, parameters r α , r β for the prior gamma distribution of r, and parameters u α , u β for the update gamma distribution. Result: A new rate r 1 Sample u from Γ(u α , u β ); 2 Let r = ur be the proposed new rate; 3 With probability min(1, pdf Γ(r α ,r β ) (r )/ pdf Γ(r α ,r β ) (r )), return r ; 4 Otherwise, return r; Appendix A.2.1. Updating distances As an example of an extension of a Markov model to a more flexible birth-hybridization model, we extend the model presented in [24], which uses auto-correlated speciation and extinction rates. To create networks, we add the options of including hybridization events. These events can be added to other Markov splitting models in the same way. So, if different or more realistic splitting models are desired (e.g., with speciation bursts) hybridization can easily be added to those as well.
Hybridization in this model should be thought of as hybrid speciation, and not hybrid introgression, that is, m-type reticulation model are used. Indeed, a hybridization event in this generator takes two extant taxa and combines them into a new taxon. The two parental taxa stay present in the network, so the number of taxa in the network increases by one.
In our model, the hybridization rate between two taxa depends on their distance. To keep it simple, we use the piece-wise linear function Recomputing the total rates for speciation sp(N) = ∑ x sp(x) and extinction ext(N) = ∑ x ext(x) is quite simple, as it only requires one summation over the individual speciation and extinction rates sp(x) and ext(x) of all the extant taxa x. Fully recomputing hyb(N) would entail finding all up-down paths between each pair of leaves. As there can be exponentially many up-down paths between a given pair of leaves, this full recomputation is not feasible.
Hence, to efficiently compute hyb(N) in our implementation, we keep track of the weighted distances between all pairs of leaves. Updating these distances requires few calculations for each of the types of events, as we will show next.

11
Attach two new edges with length 0 to this leaf; 12 Update the rates of l to get rates for the new leaves; 13 Replace l with the two new leaves in the set of extant leaves; 14 else if extinction then 15 Pick a random extant leaf l, weighted by by their extinction rates; 16 Remove l from the set of extant leaves; 17 else if hgt then 18 Pick a random extant leaf l, weighted by by their hgt rates; 19 Pick another leaf m uniformly at random; Sample t + = from an exponential distribution with parameter 1/(sp(N) + ext(N) + hgt(N) + hyb(N));

return N;
We will now show how to efficiently update the distances between all pairs of leaves efficiently after each of the following events: speciation, extinction, horizontal gene transfer/hybrid introgression, and hybrid speciation. In all these events, we assume all new edges have length 0 and are extended only later.
can be generated as it has r + 1 reticulations and a leaf directly below a reticulation. We conclude that any network with r + 1 reticulations can be produced using merging and speciation events. By induction, this implies that all networks can be generated thusly.

Lemma Appendix B.2.
Using HGT (adding reticulations n-type at the leaves) together with speciation generates only orchard networks.
Proof. Note that speciation and adding reticulations using the n-type are exactly the methods to "add a pair to a network" in the sense of [33]. The lemma now follows immediately, because these additions of pairs can be reversed, which gives a cherry-picking sequence for the network.
Lemma Appendix B.3. Using hybridization (adding reticulations m-type at the leaves) together with speciation generates only temporal tree-child networks, and each (binary) temporal tree-child network can be generated this way.
Proof. The first part follows because we can label each internal node with the number of the step (event) it was introduced in. This gives a temporal labelling of the network. Moreover, the such a network must be tree-child because an m-type reticulation event can never lead to two neighbouring reticulations or a tree node with two reticulation children. For the second part, note that the temporal labelling can be modified so that for each time t, the internal nodes with label t either represent a speciation event, or an m-type hybridization event.
Lemma Appendix B.4. HGT (n-type) or hybridization (m-type) together with extinction and speciation can generate any network.
Proof. This follows directly from Lemma Appendix B.1, because each merging event can be substituted for a HGT event and one extinction event, or a hybridization event and two extinction events.

Appendix C. Supplementary figures and tables
In this section, we provide additional data in the form of tables and figures. In Table C.1, for each generator, we give an estimate for the number of reticulations at which half of the generated networks are in each network class. This value is found by finding the intersection of the horizontal line where the fraction is a half with the lines in Figure 5. In Table C.2, we show the average balance for each of the generators and each number of reticulations. The stars show that adding reticulations significantly impacts the balance of the networks compared to the networks with 1 reticulation for the same network generator. Figure C.1 shows the profiles for the number of (reticulated) cherries in the beta-splitting model. Note that the initial number of cherries depends primarily on the beta parameter, while all profiles, except when edges are added horizontally, are qualitatively the same. Figure C.2 shows the dependence of the number of blobs and their sizes for the beta-splitting model. Note that the profiles are qualitatively the same, although the exact values differ slightly, as a result of the underlying tree shape. Table C.1: The number of reticulations at which half the networks are in a class for the different generators and parameters. Obtained by interpolation from the data with log-transformed number of reticulations, because in those data, the curves look more or less sigmoid ( Figure 5), and so that we do not underestimate the effects at small numbers of reticulations.

Appendix C.1. Heath Parameters and Extinction
In the main text of this paper, we have not considered extinction. Even though our implementation of the Heath network generator has an option to add extinction. In Figures C.3, C.4, C.5, and C.6, we show the effect of extinction on the profiles, when using the parameters for the Heath network generator as detailed in Table C.3. We have two methods of implementing extinction: simply generating a network where the desired number of taxa consists of both extinct and extant taxa; or generating a network with extinct taxa, and removing them to obtain a network with the desired number of extant taxa. When extinct networks are removed, Table C.1 does not show reticulation numbers at which half of the generated networks are orchard or treebased. This is because, for the reticulation numbers we could test for this model, none gave a fraction of less than half for these classes ( Figure C.3). Perhaps, this is because our networks have a small number of extinct taxa, and removing extinct taxa is necessary to produce networks that are not orchard or tree-based when adding reticulations as hybrid speciation. with different methods of handling extinction. The lines for "Heath" and "Heath keep extinct" are exactly the same, as these generators only produce tree-child networks; only when extinct lineages are removed, the Heath network generator can produce networks that are not tree-child, and even not tree-based.    [1,2,3,4,5] [10, 15,20] [30, 40,50] [1, 2,3,4,5] [10, 15,20] [30, 40,50] [1, 2,3,4,5] [10, 15,20]