Fast Generation of Unlabelled Free Trees using Weight Sequences

In this paper, we introduce a new representation for ordered trees, the weight sequence representation. We then use this to construct new representations for both rooted trees and free trees, namely the canonical weight sequence representation. We construct algorithms for generating the weight sequence representations for all rooted and free trees of order n, and then add a number of modifications to improve the efficiency of the algorithms. Python implementations of the algorithms incorporate further improvements by using generators to avoid having to store the long lists of trees returned by the recursive calls, as well as caching the lists for rooted trees of small order, thereby eliminating many of the recursive calls. We further show how the algorithm can be modified to generate adjacency list and adjacency matrix representations for free trees. We compared the run-times of our Python implementation for generating free trees with the Python implementation of the well-known WROM algorithm taken from NetworkX. The implementation of our algorithm is over four times as fast as the implementation of the WROM algorithm. The run-times for generating adjacency lists and matrices are somewhat longer than those for weight sequences, but are still over three times as fast as the corresponding implementations of the WROM algorithm.


Introduction
The enumeration of trees, whether ordered, rooted or free, has been well-studied. Indeed, "Cayley's formula", which states that there are precisely $n^{n-2}$ free trees on n labelled vertices, dates back to Carl Wilhelm Borchardt in the middle of the nineteenth century [3]. In 1948, Otter [11] derived asymptotic estimates for the numbers of both unlabelled free and rooted trees. In addition, generating functions for the numbers of both unlabelled free and rooted trees have been obtained (see [5]). The exact counts for unlabelled free trees with n vertices, for n ≤ 36, are listed as Sequence A000055 in the OEIS [9].
One of the first efficient algorithms for generating unlabelled rooted trees was developed by Beyer and Hedetniemi [2] using a level sequence representation. This algorithm was extended by Wright, Richmond, Odlyzko and McKay [13] to generate all unlabelled free trees. This algorithm is referred to informally as the WROM algorithm, and is by far the most commonly used algorithm to generate non-isomorphic free trees. An alternative algorithm was constructed by Li and Ruskey [7] using the parent sequence representation. Indeed, a good survey of this topic can be found in Li's thesis [6]. Other work in this area has recently been conducted by Sawada [12], who presented algorithms to generate both rooted and free plane (i.e., ordered) trees.
In this paper, we construct new algorithms for generating rooted trees and free trees. Our algorithms take a different approach from that of previous authors. We introduce a new representation, the canonical weight sequence, and use this rather than level or parent sequences. We introduce a number of modifications to improve the efficiency of the basic algorithms. We implemented the algorithms in Python, incorporating further improvements by using generators to avoid having to store the long lists of trees returned by the recursive calls, as well as caching the lists for rooted trees of small order, thereby eliminating many of the recursive calls. The major improvements in efficiency that we introduce are made possible because the weight sequence representation preserves referential transparency for subtrees. This is not the case for the level sequence and parent sequence representations. We further show how the algorithm can be amended to generate adjacency list and adjacency matrix representations for free trees.
We compared the run-times of our Python implementation for generating free trees with the Python implementation of the well-known WROM algorithm taken from NetworkX [8], the popular Python graph and network library. The Python implementation of our new algorithm is over four times as fast as the corresponding implementation of the WROM algorithm.
The programs were all written in Python 3.7 and executed using the PyPy3 compiler, although the pseudo-code we present can easily be translated into other languages. The Python code can be found in the appendices. Any graph-theoretic terminology and notation not explicitly defined can be found in Bondy and Murty's text [1].
In Section 2, we introduce the weight sequence representations for ordered trees, rooted trees and free trees. In Sections 3 and 4, we present our algorithms for generating rooted and free trees, respectively. Then, in Section 5, we discuss improvements to the algorithms and their implementations, as well as the modifications required to generate the adjacency list and matrix representations of the trees. In Section 6, we compare the run-times of the Python implementations of our algorithm with those of the WROM algorithm, and Section 7 contains our concluding remarks.

Notation
A free tree T is a connected undirected graph that contains no cycles (conventionally, just called a tree in the graph theory literature). The degree of a vertex v of T is the number of vertices adjacent to v. A leaf of T is a vertex of degree 1; all other vertices of T are called branch vertices. It is easy to show that there is a unique path between any pair of vertices of T.
A rooted tree R is a free tree with a distinguished vertex called its root. Let v be a vertex of R. Any other vertex u on the path from the root to v is an ancestor of v, and v is a descendant of u. A descendant w of v that is adjacent to v is a child of v, and v is the parent of w. Any other child of v is a sibling of w. By definition, the root has no parent.
Let v be any descendant of the root of R. The subtree of R that consists of v together with all of its descendants can clearly be considered to be a rooted tree with root v. We denote this subtree by R(v) and define wt(v), the weight of v, to be the order of R(v); so the weight of the root of R is the order of R. If v is a leaf then wt(v) = 1 and R(v) contains just the vertex v. R − R(v) is the rooted tree, with the same root as R, obtained from R by deleting the subtree R(v) together with the edge between v and its parent.
An ordered tree, sometimes called a plane tree [12], is a rooted tree in which there is an ordering defined on the children of each vertex. By convention, when drawing a rooted tree, the root is placed at the top of the diagram and, for an ordered tree, the order of the children is from left to right. So we may refer to the first (left-most) or last (right-most) child of its parent. Similarly, for any vertex v that is not the last child of its parent, we may refer to the next sibling of v. We note that, if R is an ordered tree, the subtrees R(v) and R − R(v) are considered to be ordered trees, inheriting the ordering of the sets of children from R. For convenience, when w is a child of v, instead of saying that R(w) is a subtree of R(v), we often say that R(w) is a subtree of v.
A tree is called a labelled tree if each vertex is assigned a unique label. For any unlabelled ordered tree R with n vertices, we conventionally label the vertices as $v_1, v_2, \ldots, v_n$ in pre-order, where $v_1$ is the root of the tree. Pre-order is the total ordering of the vertices of R defined recursively as follows: for any vertex u with children $u_1, u_2, \ldots, u_p$, pre-order for the subtree R(u) starts with u, followed by the vertices of $R(u_1)$ in pre-order (if p ≥ 1), then the vertices of $R(u_2)$ in pre-order (if p ≥ 2), etc. We note that $v_2$ is the first child of the root $v_1$. It trivially follows that, for any vertex $v_k$ of R, the pre-order of the vertices of $R(v_k)$ is a contiguous subsequence of the pre-order of the vertices of R.
Two labelled free trees are isomorphic if there is a bijection between their vertex sets that preserves adjacency and non-adjacency; two labelled rooted trees are isomorphic if there exists an isomorphism between their underlying free trees that maps the root of one onto the root of the other; two labelled ordered trees are isomorphic if there exists an isomorphism between their underlying rooted trees that preserves the orderings of the children of each vertex. We say that two trees (whether ordered, rooted or free) are f-isomorphic if their underlying free trees are isomorphic, and that two trees (whether ordered or rooted) are r-isomorphic if their underlying rooted trees are isomorphic. For completeness, we will also say that two isomorphic ordered trees are o-isomorphic.
An integer sequence s is an (ordered) list of integers $s_1 s_2 \ldots s_n$. In this paper, we shall assume that every element $s_i$ in s is positive, and denote the length of s by |s|; so in this case |s| = n. If $t = t_1 t_2 \ldots t_m$ is another integer sequence, we denote the concatenation of the two sequences by s ⊕ t, i.e., $s \oplus t = s_1 s_2 \ldots s_n t_1 t_2 \ldots t_m$. For simplicity, we do not distinguish between a sequence of length one and a single integer, e.g., we may write $s_1 \oplus t$.
We say that s is lexicographically greater than or equal to t, denoted s ≥ t, if and only if either (a) or (b) below holds:
(a) $s_i = t_i$, for 1 ≤ i < j, and $s_j > t_j$, for some j, 1 ≤ j ≤ min(n, m);
(b) $s_i = t_i$ for 1 ≤ i ≤ min(n, m) and n ≥ m.
Strict lexicographical inequality s > t holds if s ≥ t and s ≠ t. We note that this defines a total ordering on the set of integer sequences.

Weight sequences of ordered trees
A common way to represent an ordered tree is by a suitable integer sequence obtained by traversing the tree in some specified order (usually pre-order) and recording some particular property of each vertex as it is visited. The resulting sequence is called a representation sequence for the tree. A valid representation for ordered trees is a representation by integer sequences such that any two ordered trees that have the same representation sequence are o-isomorphic. For example, consider the ordered tree of order 10 shown in Figure 1, in which the vertices are labelled in pre-order. If we record the level (where we define the level of the root to be 1, the level of its children to be 2, etc.) of each vertex in a pre-order traversal, we obtain the following sequence: 1 2 3 4 4 4 3 2 3 2. This is called the level sequence of the tree. Similarly, if we record the index of the label of the parent of each vertex, we obtain its parent sequence: 1 2 3 3 3 2 1 8 1 (note there is no parent for the root in the parent sequence representation). Both of these sequence representations are well-known and have been shown to be valid representations for ordered trees (see [2] [4]). They have been used in the design of algorithms for generating rooted trees and free trees by Beyer and Hedetniemi [2], Wright et al. [13], Li and Ruskey [7], Sawada [12] and Cook [4].
In this paper, we introduce a new representation sequence. This is constructed by recording the weight of each vertex in a pre-order traversal of the tree. We call this representation the weight sequence of the tree, and denote the weight sequence of any ordered tree R by ws(R). For example, for the tree R in Figure 1, ws(R) = 10 6 4 1 1 1 1 2 1 1.
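As an illustration, the weight sequence can be computed directly from the level sequence given above. The following Python sketch is not taken from the paper's appendices; the function name `weights_from_levels` is hypothetical, and it assumes the level sequence lists the vertices in pre-order.

```python
def weights_from_levels(levels):
    """Return the weight sequence of the ordered tree with this level sequence.

    Assumes the vertices are listed in pre-order, so every vertex adds one
    to the weight of itself and of each of its ancestors.
    """
    weights = [1] * len(levels)
    stack = []  # indices of the ancestors of the current vertex
    for k, level in enumerate(levels):
        # vertices at the same or a deeper level are not ancestors of vertex k
        while stack and levels[stack[-1]] >= level:
            stack.pop()
        for ancestor in stack:
            weights[ancestor] += 1
        stack.append(k)
    return weights


# Level sequence of the tree in Figure 1
print(weights_from_levels([1, 2, 3, 4, 4, 4, 3, 2, 3, 2]))
# prints [10, 6, 4, 1, 1, 1, 1, 2, 1, 1]
```

The output agrees with ws(R) for the tree R in Figure 1.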
Lemma 2.1. Let R be an ordered tree of order n with weight sequence $\mathrm{ws}(R) = s_1 s_2 \ldots s_n$, where the vertices are labelled $v_1, v_2, \ldots, v_n$ in pre-order. Then
(a) $s_1 = n$;
(b) for all vertices $v_k$ of R,
(i) $s_k = \mathrm{wt}(v_k)$, i.e., the order of $R(v_k)$;
(ii) $\mathrm{ws}(R) = x \oplus \mathrm{ws}(R(v_k)) \oplus y$ for some integer sequences x and y;
(iii) $\mathrm{ws}(R(v_k)) = s_k s_{k+1} \ldots s_{k+s_k-1}$;
(c) $\mathrm{ws}(R - R(v_2)) = (n - s_2)\, s_{s_2+2}\, s_{s_2+3} \ldots s_n$.
Proof (a), (b)(i) and (b)(ii) follow immediately since the vertices of R are labelled in pre-order.
(b)(iii) The weight of any vertex in $R(v_k)$ is the same as in R. Since $R(v_k)$ is of order $s_k$, it then follows that $\mathrm{ws}(R(v_k)) = s_k s_{k+1} \ldots s_{k+s_k-1}$.
(c) Since $R(v_2)$ is of order $s_2$, the result follows easily from (b)(iii). ✷
Corollary 2.2. Let R be an ordered tree of order n. Suppose that $u_1, u_2, \ldots, u_p$ are the children of the root of R, where $u_{i+1}$ is the next sibling of $u_i$ for all i, 1 ≤ i ≤ p − 1. Then
$\mathrm{ws}(R) = n \oplus \mathrm{ws}(R(u_1)) \oplus \mathrm{ws}(R(u_2)) \oplus \cdots \oplus \mathrm{ws}(R(u_p))$. (1)
It therefore follows from Lemma 2.1(b) that, for any ordered tree R, the subsequence of ws(R) that corresponds to $R(v_k)$ is just $\mathrm{ws}(R(v_k))$, where $R(v_k)$ is considered as an ordered tree in its own right. This is the main reason why the weight sequence is a particularly useful representation for the generation of trees of order n: we can construct the weight sequence of any ordered tree of order n directly from the weight sequences of its subtrees. So, if r is the order of $R(u_1)$, it follows from Lemma 2.1 and Corollary 2.2 that one way to accomplish this is to take the weight sequence of an ordered tree of order r (corresponding to $\mathrm{ws}(R(u_1))$), and combine it appropriately with the weight sequence of an ordered tree of order n − r (corresponding to $\mathrm{ws}(R - R(u_1))$). We shall elaborate on this in Sections 3 and 4.
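The contiguity property in Lemma 2.1(b)(iii) makes it straightforward to slice a weight sequence into the weight sequences of its subtrees. The sketch below is illustrative only; the helper names `child_sequences` and `parent_sequence` are hypothetical. It splits a weight sequence at the root and, as a check, recovers the parent sequence quoted earlier for the tree in Figure 1.

```python
def child_sequences(ws):
    """Split a weight sequence into the weight sequences of the root's subtrees."""
    children = []
    k = 1  # 0-based position of the first child; position 0 is the root
    while k < len(ws):
        children.append(ws[k:k + ws[k]])  # the subtree occupies ws[k] entries
        k += ws[k]
    return children


def parent_sequence(ws):
    """Return the 1-based pre-order labels of the parents of v_2, ..., v_n."""
    parents = [0] * len(ws)
    for k in range(len(ws)):
        j = k + 1
        while j < k + ws[k]:  # j runs over the children of vertex k
            parents[j] = k + 1
            j += ws[j]        # skip over the whole subtree of that child
    return parents[1:]


ws = [10, 6, 4, 1, 1, 1, 1, 2, 1, 1]  # the tree in Figure 1
print(child_sequences(ws))   # prints [[6, 4, 1, 1, 1, 1], [2, 1], [1]]
print(parent_sequence(ws))   # prints [1, 2, 3, 3, 3, 2, 1, 8, 1]
```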
We note that, since the weight sequence of a tree is well defined, any o-isomorphic trees must have the same weight sequence.
Lemma 2.3. The weight sequence is a valid representation for ordered trees.
Proof By inspection, the result clearly holds when the order is less than four. So suppose that the result holds for all ordered trees of order less than n, where n ≥ 4. Let R and R′ be labelled ordered trees of order n such that ws(R) = ws(R′), where the vertices of the trees are labelled $v_1, v_2, \ldots, v_n$ and $v'_1, v'_2, \ldots, v'_n$, respectively, in pre-order. By Lemma 2.1(b) and (c), $\mathrm{ws}(R(v_2)) = \mathrm{ws}(R'(v'_2))$ and $\mathrm{ws}(R - R(v_2)) = \mathrm{ws}(R' - R'(v'_2))$. Since these trees are of order less than n, it follows from the inductive hypothesis that $R(v_2)$ is o-isomorphic to $R'(v'_2)$ and that $R - R(v_2)$ is o-isomorphic to $R' - R'(v'_2)$. So, since $v_2$ and $v'_2$ are the first children of the roots of R and R′, respectively, it follows that R is o-isomorphic to R′. Hence the weight sequence is a valid representation for ordered trees. ✷
The following lemma will be used in Section 2.3.
Lemma 2.4. Let s and t be weight sequences of trees. If s > t then x ⊕ s ⊕ y > x ⊕ t ⊕ z, for any integer sequences x, y and z.
Proof This follows immediately from Lemma 2.1(a) and the definition of lexicographical order. ✷

Canonical weight sequences of rooted trees
We extend the definition of a valid representation by integer sequences to rooted trees: a valid representation for rooted trees is a well-defined representation such that any two rooted trees that have the same representation sequence are r-isomorphic.
Now, since the weight sequence is a valid representation for ordered trees by Lemma 2.3, two r-isomorphic ordered trees that are not o-isomorphic must have different weight sequences. For example, the two r-isomorphic ordered trees in Figure 2 have weight sequences 10 1 6 1 4 1 1 1 2 1 and 10 2 1 1 6 1 4 1 1 1, respectively (they are also r-isomorphic but not o-isomorphic to the ordered tree in Figure 1). So, in order to define a valid representation for rooted trees using weight sequences, we need to choose a unique representative from each r-isomorphism class of ordered trees.
An ordered tree R of order n is canonically ordered if ws(R(u)) ≥ ws(R(v)), for each vertex u of R having a next sibling v. Clearly, if R is canonically ordered then so is R(v), for each vertex v of R. It is easy to see that the ordered tree in Figure 1 is canonically ordered, but those in Figure 2 are not.
Lemma 2.5. Let R and R′ be canonically ordered trees. If R and R′ are r-isomorphic, then they are o-isomorphic.
Proof Let n be the order of R and R′. It is easy to see, by inspection, that the result holds when n ≤ 3. So suppose that n ≥ 4 and that the result holds for all pairs of trees of order less than n. Let $u_1, u_2, \ldots, u_p$ be the children of the root of R, where $u_{i+1}$ is the next sibling of $u_i$ for each $u_i$. Let θ be an r-isomorphism from R to R′. Clearly, θ maps the children of the root of R to the children of the root of R′. So the subtrees of the root of R′ are precisely the subtrees $\theta(R(u_i))$ in some order. Each $\theta(R(u_i))$ is canonically ordered, r-isomorphic to $R(u_i)$ and of order less than n, so the inductive hypothesis applies to each such pair.
Hence $\mathrm{ws}(R(u_i)) = \mathrm{ws}(\theta(R(u_i)))$ for each $u_i$. Since R and R′ are both canonically ordered, it is now easy to see from (1) that ws(R) = ws(R′). Hence R and R′ are o-isomorphic by Lemma 2.3. ✷
Clearly, for any ordered tree R, by suitably permuting the subtrees of each vertex, we can obtain a canonically ordered tree that is r-isomorphic to R. We therefore define cws(R), the canonical weight sequence of R, to be the weight sequence of any canonically ordered tree that is r-isomorphic to R. By Lemma 2.5, cws(R) is well defined.
Lemma 2.6. The canonical weight sequence is a valid representation for rooted trees.
Proof Let $R_1$ and $R_2$ be rooted trees such that $\mathrm{cws}(R_1) = \mathrm{cws}(R_2)$. Let $\hat{R}_1$ and $\hat{R}_2$ be canonically ordered trees that are r-isomorphic to $R_1$ and $R_2$, respectively. Then $\mathrm{ws}(\hat{R}_1) = \mathrm{cws}(R_1) = \mathrm{cws}(R_2) = \mathrm{ws}(\hat{R}_2)$. So $\hat{R}_1$ and $\hat{R}_2$ are o-isomorphic by Lemma 2.3, and thus r-isomorphic. Therefore $R_1$ and $R_2$ are r-isomorphic. ✷
It immediately follows from this result that, subject to labelling, we may represent any rooted tree by a unique canonically ordered tree.
It is straightforward to show that the ordered tree R max that has the lexicographically largest weight sequence of all ordered trees r-isomorphic to R is canonically ordered, and that cws(R) = ws(R max ).
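The permutation of subtrees described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the implementation from the appendices; the function name `canonical` is hypothetical. It relies on the fact that Python compares lists lexicographically, with a proper prefix comparing as smaller, which matches the order ≥ defined in the Notation section.

```python
def canonical(ws):
    """Return cws(R) for the ordered tree R with weight sequence ws."""
    subtrees = []
    k = 1
    while k < len(ws):
        # canonically order each subtree of the root first
        subtrees.append(canonical(ws[k:k + ws[k]]))
        k += ws[k]
    # then arrange the subtrees in non-increasing lexicographic order
    subtrees.sort(reverse=True)
    return [ws[0]] + [w for sub in subtrees for w in sub]


# The two ordered trees in Figure 2 are r-isomorphic to the tree in Figure 1
print(canonical([10, 1, 6, 1, 4, 1, 1, 1, 2, 1]))
print(canonical([10, 2, 1, 1, 6, 1, 4, 1, 1, 1]))
# both print [10, 6, 4, 1, 1, 1, 1, 2, 1, 1]
```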

Free trees
We extend the definition of a valid representation by integer sequences to free trees: a valid representation for free trees is a well-defined representation such that any two free trees that have the same representation sequence are f -isomorphic.
Now, since the canonical weight sequence is a valid representation for rooted trees by Lemma 2.6, two f -isomorphic rooted trees that are not r-isomorphic must have different canonical weight sequences. So, in order to define a valid representation for free trees using weight sequences, we need to choose a unique representative from each f -isomorphism class of rooted trees.
Let T be a free tree of order n. Most algorithms for generating free trees of a given order choose the root of T to be a central vertex (T contains either a single central vertex or two adjacent central vertices). Instead, in keeping with our choice of the use of the weight sequence rather than the level or parent sequence, we choose the root of T to be the centroid when T is unicentroidal; when T is bicentroidal, we represent T as an ordered pair of subtrees rooted at the two centroidal vertices.
A centroidal vertex u of T is a vertex such that each component of the forest T − u is of order at most n/2. It is well known that a tree is either unicentroidal, having a single centroidal vertex (in which case the largest component of T − u is of order at most (n−1)/2), or bicentroidal, having two adjacent centroidal vertices (in which case the largest component of T − u is of order n/2); see [1]. Moreover, it is easy to show that the centroids of two f-isomorphic free trees must map to each other under any f-isomorphism. We therefore consider the two types of free tree separately.
Suppose first that T is unicentroidal. We now define the free weight sequence fws(T ) of T to be the canonical weight sequence of any tree R that is rooted at its centroid and is f -isomorphic to T ; so fws(T ) = cws(R). We note that, since the centroid consists of a single vertex and the canonical weight sequence is well defined, the free weight sequence is well defined for all unicentroidal trees. It immediately follows from Lemma 2.6 that, subject to labelling, we may represent any unicentroidal tree by a unique canonically ordered tree rooted at its centroid.
For example, suppose that the tree in Figure 1 is a free tree T (so not rooted). It is easy to see that v 2 is the unique centroidal vertex of T , and therefore T is f -isomorphic to the canonically ordered tree in Figure 3, which is rooted at its centroid u. Therefore fws(T ) = 10 4 2 1 1 4 1 1 1 1.
Lemma 2.7. The free weight sequence is a valid representation for unicentroidal free trees.
Proof Let T and T′ be unicentroidal free trees, and let R and R′ be two rooted trees, rooted at their centroids, that are f-isomorphic to T and T′, respectively. Suppose that fws(T) = fws(T′). Then cws(R) = cws(R′), so R and R′ are r-isomorphic by Lemma 2.6. Hence T is f-isomorphic to T′. ✷
We now consider the case when T is bicentroidal with centroidal vertices u and v. If we delete the edge between u and v, we obtain disjoint trees $T_u$ and $T_v$ of order n/2, which we may consider to be rooted at u and v, respectively. We may therefore represent T as the ordered pair $\langle T_u, T_v \rangle$ when $\mathrm{cws}(T_u) \geq \mathrm{cws}(T_v)$, or $\langle T_v, T_u \rangle$ when $\mathrm{cws}(T_v) \geq \mathrm{cws}(T_u)$. We define fws(T), the free weight sequence of T, to be $\mathrm{cws}(T_u) \oplus \mathrm{cws}(T_v)$ in the former case, and $\mathrm{cws}(T_v) \oplus \mathrm{cws}(T_u)$ in the latter case. We note that the first and ((n+2)/2)th elements of fws(T) correspond to u and v, and are both equal to n/2. Since the canonical weight sequence is well defined for rooted trees, it follows that the free weight sequence is well defined for bicentroidal trees. It immediately follows from Lemma 2.6 that, subject to labelling, we may represent any bicentroidal tree of order n by a unique ordered pair of canonically ordered trees of order n/2 (not generally rooted at their centroids). For example, the path $P_8$ is f-isomorphic to the tree in Figure 4 with centroidal vertices u and v. Therefore $\mathrm{fws}(P_8)$ = 4 3 2 1 4 3 2 1.
Lemma 2.8. The free weight sequence is a valid representation for bicentroidal free trees.
Proof Let T and T′ be bicentroidal free trees, let {u, v} and {u′, v′} be the centroidal vertices of T and T′, respectively, and let $\langle T_u, T_v \rangle$ and $\langle T'_{u'}, T'_{v'} \rangle$ be the representations of T and T′, respectively, described above. Suppose that fws(T) = fws(T′). Then $\mathrm{cws}(T_u) = \mathrm{cws}(T'_{u'})$ and $\mathrm{cws}(T_v) = \mathrm{cws}(T'_{v'})$. So, by Lemma 2.6, $T_u$ and $T_v$ are r-isomorphic to $T'_{u'}$ and $T'_{v'}$, respectively. Since we can recover T from $T_u$ and $T_v$ by adding an edge between u and v, and similarly for T′, it immediately follows that T is f-isomorphic to T′. ✷
Lemma 2.9. The free weight sequence is a valid representation for free trees.
Proof If two free trees are isomorphic then they are both unicentroidal or both bicentroidal. The result then follows from Lemmas 2.7 and 2.8. ✷
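To make the two cases concrete, the following Python sketch computes the free weight sequence of a free tree given as an edge list. It is an illustration under stated assumptions, not the paper's implementation: the function name `free_weight_sequence` and the 0-based vertex numbering are hypothetical choices, and the centroid search is deliberately naive (quadratic) for clarity.

```python
def free_weight_sequence(n, edges):
    """Return fws(T) for the free tree T on vertices 0..n-1 with the given edges."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    def cws(root, parent):
        # canonical weight sequence of the part of T hanging below root
        subs = sorted((cws(c, root) for c in adj[root] if c != parent),
                      reverse=True)
        body = [w for sub in subs for w in sub]
        return [len(body) + 1] + body

    def largest_component(u):
        # order of the largest component of the forest T - u
        return max((cws(c, u)[0] for c in adj[u]), default=0)

    centroids = [u for u in range(n) if 2 * largest_component(u) <= n]
    if len(centroids) == 1:
        return cws(centroids[0], None)
    # bicentroidal: delete the edge uv and put the larger sequence first
    u, v = centroids
    first, second = sorted([cws(u, v), cws(v, u)], reverse=True)
    return first + second


# The tree of Figure 1 (vertex v_k is numbered k - 1) and the path P_8
figure1 = [(1, 0), (2, 1), (3, 2), (4, 2), (5, 2), (6, 1), (7, 0), (8, 7), (9, 0)]
print(free_weight_sequence(10, figure1))  # prints [10, 4, 2, 1, 1, 4, 1, 1, 1, 1]
path8 = [(i, i + 1) for i in range(7)]
print(free_weight_sequence(8, path8))     # prints [4, 3, 2, 1, 4, 3, 2, 1]
```

Both outputs agree with the examples given for Figure 3 and Figure 4.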

Rooted tree generation
By Lemma 2.6, the canonical weight sequence is a valid representation for rooted trees. So, to generate all rooted trees of order n, we only need to generate every possible canonical weight sequence of length n.
An ordered set of integer sequences [a 1 , a 2 , . . . , a p ] is said to be reverse lexicographically (relex) ordered if a i ≥ a j when i < j, for all i and j. Let B(n) denote the relex ordered set of the canonical weight sequences of all rooted trees of order n. It follows from Lemmas 2.6 and 2.1 that, for each element s = s 1 s 2 . . . s n of B(n), there exists a unique canonically ordered tree R, with vertices labelled v 1 , v 2 , . . . , v n in pre-order, such that ws(R) = s, where s k = wt(v k ) for all k, 1 ≤ k ≤ n.
If s = s 1 s 2 . . . s n is an integer sequence, we let s # = s 2 s 3 . . . s n , i.e., s # is s with the first element s 1 removed. So if s is the weight sequence of an ordered tree R, then s # is the weight sequence of the ordered forest obtained by removing the root of R.
We write $s \succeq t$ if t is some other integer sequence such that either s ≥ t or s is a prefix of t, i.e., t = s ⊕ x for some integer sequence x.
Let $A_q(n)$ be the set of all ordered pairs <a, b> in B(q) × B(n − q) such that $a \succeq b^\#$, and let $A(n, k) = \bigcup_{1 \leq q \leq k} A_q(n)$. We recall that if <a, b> is in $A_q(n)$ then the first element of a is q, |a| = q, the first element of b is n − q and |b| = n − q.
Lemma 3.1. There is a bijection β from A(n, n − 1) to B(n) defined by
$\beta(\langle a, b \rangle) = n \oplus a \oplus b^\#$. (2)
Proof Suppose that <a, b> is in $A_q(n)$, for some q, 1 ≤ q ≤ n − 1. We first show that β(<a, b>) ∈ B(n).
Let $R_1$ be a canonically ordered tree rooted at v such that $\mathrm{ws}(R_1) = a$, and let $R_2$ be a canonically ordered tree rooted at u such that $\mathrm{ws}(R_2) = b$. Let $u_1, u_2, \ldots, u_p$ be the children of u in order, and let R be a new ordered tree rooted at u with children $v, u_1, u_2, \ldots, u_p$, i.e., R is obtained from $R_2$ by adding $R_1$ as the new first subtree of u. Now <a, b> is in $A_q(n)$, so $a \succeq b^\#$, and thus $\mathrm{wt}(v) \geq \mathrm{wt}(u_1)$. Therefore R is canonically ordered as both $R_1$ and $R_2$ are canonically ordered. So ws(R) is in B(n) and, moreover, $\mathrm{ws}(R) = n \oplus a \oplus b^\#$ by Corollary 2.2. Therefore β(<a, b>) ∈ B(n).
Suppose that $\langle a_0, b_0 \rangle$ is in $A_r(n)$, for some r, and that $n \oplus a_0 \oplus b_0^\# = n \oplus a \oplus b^\#$. Then r = q since the first element of $a_0$ must be equal to the first element of a. It follows that $a_0 = a$ and $b_0 = b$, as $|a_0| = |a|$. Hence β is injective.
Now suppose that $s = s_1 s_2 \ldots s_n$ is an element of B(n), and let R be the unique canonically ordered tree such that ws(R) = s. By Lemma 2.1(b) and (c), $\mathrm{ws}(R(v_2)) = s_2 s_3 \ldots s_{s_2+1}$ and $\mathrm{ws}(R - R(v_2)) = t\, s_{s_2+2}\, s_{s_2+3} \ldots s_n$, where $t = n - s_2$. Clearly, since R is canonically ordered, so are $R(v_2)$ and $R - R(v_2)$. Hence $\mathrm{ws}(R(v_2)) \in B(s_2)$ and $\mathrm{ws}(R - R(v_2)) \in B(n - s_2)$.
Moreover, since R is canonically ordered, it follows from Corollary 2.2 and the definition of $\succeq$ that $s_2 s_3 \ldots s_{s_2+1} \succeq s_{s_2+2}\, s_{s_2+3} \ldots s_n$. Therefore $\langle \mathrm{ws}(R(v_2)), \mathrm{ws}(R - R(v_2)) \rangle$ is in $A_{s_2}(n)$. Hence β is onto, and is therefore a bijection. ✷
Corollary 3.2. For any n, B(n) can be constructed from the sets B(q), where 1 ≤ q ≤ n − 1.
Proof It is easy to construct all rooted trees, and therefore B(n), when n ≤ 3. The result then follows using equation (2) and induction on n. ✷
The image of $A_q(n)$ under the bijection β defined in (2) is denoted by $B_q(n)$, i.e., $B_q(n)$ corresponds to those rooted trees of order n for which the first subtree of the root is of order q. So $B_q(n)$ contains those sequences in B(n) for which the second element is q. Clearly
$B(n) = B_{n-1}(n) \oplus B_{n-2}(n) \oplus \cdots \oplus B_1(n)$.
Following along the lines of the proofs of Lemma 3.1 and Corollary 3.2, we now construct a simple recursive algorithm to generate the elements of B(n). For each q, 1 ≤ q ≤ n − 1, and for each a in B(q), we need to find those elements b in B(n − q) for which $a \succeq b^\#$. We then form the integer sequence $n \oplus a \oplus b^\#$ to obtain the appropriate element of B(n). We can avoid searching the whole of B(n − q) for those elements b for which $a \succeq b^\#$, by noting that we only need to consider those elements that are in $B_r(n - q)$, where 1 ≤ r ≤ min(n − q − 1, q).
In the pseudocode we use in the rest of the paper, we represent lists in square brackets; we use ⊕ for concatenating lists, and continue to use ⊕ for concatenating integer sequences. If L is a list, then L[start ...] denotes the sublist beginning at element L[start] and ending at the last element of L.
The following function RootedTrees(n) generates B(n). It makes use of the helper function RTHelper1(n, q) that generates B q (n).

Function RootedTrees(n)
if n = 1 then return [1]

There are two key points to note about the recursive calls in RTHelper1. Firstly, the length of the subsequence corresponding to the first subtree of the root must be smaller than the order of the tree itself; so we always have q < n. Secondly, if q = n − 1 then the sequence represents a tree in which the root has only one subtree; so we simply return n concatenated with the subsequence that corresponds to this subtree.
We note that $B_r(n - q)$, the list returned by RTHelper1(n − q, r), will clearly require too much space for most values of r when n − q is large. This problem is addressed by returning a generator instead of a list (see Section 5.3). We note further that, in the loops in RootedTrees and RTHelper1, the variables q and r are counting down, so Bn and Bqn will be relex ordered, as required.
These correspond to the canonically ordered trees in Figure 6, where the label indicates which pair of trees in Figure 5 (corresponding to a and b in equation (2)) are used to construct the tree.
We discuss some optimisations of the function RTHelper1 in Section 5.
[Figure 6: canonically ordered trees, labelled ah, bh, ch, dh, eg, fg, ge, gf, hd.]
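The recursive scheme described in this section can be sketched in Python as follows. This is a plain, unoptimised reading of the description of RootedTrees and RTHelper1, not the code from the appendices; the names `precedes`, `rt_helper1` and `rooted_trees` are hypothetical, and lists are used where the paper's implementation uses generators.

```python
def precedes(a, b_tail):
    """True if a >= b_tail lexicographically, or a is a prefix of b_tail."""
    return a >= b_tail or b_tail[:len(a)] == a


def rt_helper1(n, q):
    """Return B_q(n): sequences of B(n) whose first subtree has order q."""
    if q == n - 1:
        # the root has a single subtree
        return [[n] + a for a in rooted_trees(q)]
    result = []
    for a in rooted_trees(q):
        # only B_r(n - q) with r <= min(n - q - 1, q) can contain suitable b
        for r in range(min(n - q - 1, q), 0, -1):
            for b in rt_helper1(n - q, r):
                if precedes(a, b[1:]):          # b[1:] plays the role of b#
                    result.append([n] + a + b[1:])
    return result


def rooted_trees(n):
    """Return B(n) as a list of lists in reverse lexicographic order."""
    if n == 1:
        return [[1]]
    result = []
    for q in range(n - 1, 0, -1):               # q counts down, as in the text
        result.extend(rt_helper1(n, q))
    return result


for s in rooted_trees(5):
    print(*s)
```

For n = 5 the sketch yields nine sequences, one for each of the nine labels in Figure 6, beginning with 5 4 3 2 1 and ending with 5 1 1 1 1.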

Free tree generation
By Lemma 2.9, the free weight sequence is a valid representation for free trees. So, in order to generate all free trees of order n, we only need to generate every possible free weight sequence of length n. We recall that a free tree is either unicentroidal or bicentroidal, which have slightly different definitions of the free weight sequence.
We denote the relex ordered set of the free weight sequence representations of all free trees, unicentroidal free trees and bicentroidal free trees of order n by F(n), F U (n) and F B (n), respectively. So F(n) = F U (n) ⊕ F B (n), i.e., the elements of F U (n) followed by those of F B (n).

Unicentroidal
We recall from Section 2.4 that the free weight sequence fws(T) of a unicentroidal free tree T is the canonical weight sequence of any tree rooted at its centroid that is f-isomorphic to T. So $F_U(n) \subseteq B(n)$. We can therefore generate $F_U(n)$ using a simple modification of the algorithm RootedTrees from Section 3: the canonically ordered tree R that represents a unicentroidal free tree T is rooted at its centroid, so the subtrees of the root are of order at most (n−1)/2. It follows that |a| ≤ (n−1)/2 for every pair <a, b> in A(n, n − 1) for which β(<a, b>) is in $F_U(n)$.
Lemma 4.1. The mapping β defined in equation (2) is a bijection from A(n, (n−1)/2) to $F_U(n)$.
Proof We may represent any unicentroidal free tree T by a unique canonically ordered tree R in which the weight of each child of the root of R is at most (n−1)/2. So the result can be proved in a similar manner to Lemma 3.1, with the additional restriction that |a| ≤ (n−1)/2, i.e., we use A(n, (n−1)/2) instead of A(n, n − 1). ✷
Corollary 4.2. For any n, $F_U(n)$ can be constructed from the sets B(q), where 1 ≤ q ≤ n − 1. ✷
The following function UFT(n) generates the set $F_U(n)$. It also makes use of the helper function RTHelper1(n, q).

Function UFT(n)
if n = 1 then return [1]

For example, we can construct $F_U(8)$ using the call UFT(8).

Bicentroidal
We recall from Section 2.4 that the free weight sequence of a bicentroidal free tree with centroidal vertices u and v is $\mathrm{cws}(T_u) \oplus \mathrm{cws}(T_v)$, where $\mathrm{cws}(T_u) \geq \mathrm{cws}(T_v)$.
This corresponds to the set of ordered pairs of canonically ordered rooted trees of order 4 (see Figure 5), with an additional edge joining their roots, as shown in Figure 8.
By combining the unicentroidal and bicentroidal free tree algorithms, we can generate all free trees of order n using the following function FreeTrees(n).
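A compact Python sketch of the combined scheme is given below. It is a reading of the description in this section under stated assumptions, not the authors' code: the names `b_q`, `rooted`, `uft`, `bft` and `free_trees` are hypothetical, sequences are stored as tuples, and `functools.lru_cache` stands in for the caching discussed later in the paper.

```python
from functools import lru_cache


def precedes(a, tail):
    # a >= tail lexicographically, or a is a prefix of tail
    return a >= tail or tail[:len(a)] == a


@lru_cache(maxsize=None)
def b_q(n, q):
    """B_q(n) as a tuple of tuples (hypothetical helper mirroring RTHelper1)."""
    if q == n - 1:
        return tuple((n,) + a for a in rooted(q))
    return tuple((n,) + a + b[1:]
                 for a in rooted(q)
                 for r in range(min(n - q - 1, q), 0, -1)
                 for b in b_q(n - q, r)
                 if precedes(a, b[1:]))


@lru_cache(maxsize=None)
def rooted(n):
    """B(n) in reverse lexicographic order."""
    if n == 1:
        return ((1,),)
    return tuple(s for q in range(n - 1, 0, -1) for s in b_q(n, q))


def uft(n):
    """F_U(n): the first subtree of the root has order at most (n - 1) // 2."""
    if n == 1:
        return [(1,)]
    return [s for q in range((n - 1) // 2, 0, -1) for s in b_q(n, q)]


def bft(n):
    """F_B(n): pairs of rooted trees of order n / 2, larger sequence first."""
    if n % 2:
        return []
    half = rooted(n // 2)
    return [a + b for i, a in enumerate(half) for b in half[i:]]


def free_trees(n):
    """F(n): the elements of F_U(n) followed by those of F_B(n)."""
    return uft(n) + bft(n)


print([len(free_trees(n)) for n in range(1, 11)])
# prints [1, 1, 1, 2, 3, 6, 11, 23, 47, 106]
```

The printed counts agree with the opening terms of OEIS sequence A000055 for n ≥ 1.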

Improvements and implementation of the algorithms
We now outline some of the changes we have made to improve the efficiency of the functions described in Sections 3 and 4 above, and their implementations in Python. We use $a^k$ to denote the integer sequence that is formed by the concatenation of k copies of the integer sequence a.

Improvements to RTHelper1
Firstly we note that, since there is only one rooted tree of order 1 and one of order 2, having canonical weight sequences 1 and 2 1, respectively, we may compute the result in a more efficient and explicit manner when q is 1 or 2. If q = 1 then the function should return the single sequence $n \oplus 1^{n-1}$, and if q = 2 then it should return the ordered set of sequences $n \oplus (2\,1)^t \oplus 1^{n-1-2t}$ for t from $\lfloor (n-1)/2 \rfloor$ down to 1. We also note that, when q = n − 2, the second subtree of the root contains just a single vertex, so $b^\#$ is just 1 in this case. These observations enable us to remove the recursive calls to RTHelper1 when q ∈ {1, 2, n − 2}, as in the more efficient function RTHelper2(n, q) below. (In practice, in the implementation, we subsume the case q = 1 into the case q = 2 by reducing the lower limit of t from 1 to 0, and correspondingly increasing the lower limit of q from 1 to 2 in RootedTrees.)
Next we note that, during the execution of RTHelper1, checking whether $a \succeq b^\#$ is only necessary when the order r of the second child of the root is the same as the order q of the first child. We further note that, once $a \succeq b^\#$ holds for the first time, it will also hold for all subsequent sequences b, since B(n) is relex ordered. This removes the necessity to check whether $a \succeq b^\#$ from then on.
We shall assume from now on that the functions RootedTrees and UFT use the helper function RTHelper2 instead of RTHelper1.
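The explicit forms for q = 1 and q = 2 can be written down directly. The sketch below is illustrative only; the function names `b1_explicit` and `b2_explicit` are hypothetical, and it assumes n ≥ 2 for q = 1 and n ≥ 3 for q = 2. The leading n is included so that every sequence has length n.

```python
def b1_explicit(n):
    """B_1(n): the single sequence n followed by n - 1 ones (a star)."""
    return [[n] + [1] * (n - 1)]


def b2_explicit(n):
    """B_2(n): t subtrees of order 2 followed by leaves, largest t first."""
    return [[n] + [2, 1] * t + [1] * (n - 1 - 2 * t)
            for t in range((n - 1) // 2, 0, -1)]


for s in b2_explicit(7):
    print(*s)
# prints
# 7 2 1 2 1 2 1
# 7 2 1 2 1 1 1
# 7 2 1 1 1 1 1
```

Letting t run down to 0 instead of 1 appends the sequence for q = 1, which is the merging of the two cases mentioned above.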

Caching of B(k) for smaller values of k
We now discuss how we can further improve the efficiency of the tree generation algorithms by caching B(k) for small values of k.
RTHelper2(n, q) calls RootedTrees(q) and RTHelper2(n − q, r), where r ≤ q, and RootedTrees(q) calls RTHelper2(q, q ′ ), where q ′ < q. It follows that n ′ ≤ q in all the calls to RootedTrees(n ′ ) made by RTHelper2(n, q), whether directly or indirectly. It therefore follows that we can obtain a significant increase in the efficiency of the function call RTHelper2(n, q) if we cache in memory B(k), for 1 ≤ k ≤ q. This will increase the efficiency of both RootedTrees and UFT.
For rooted tree generation, RootedTrees(n) makes calls to RTHelper2(n, q), where 1 ≤ q ≤ n − 1. However, for large values of n, the space requirements to cache B(k), for 1 ≤ k ≤ n − 1, would be prohibitive. For unicentroidal free tree generation, q ≤ ⌊(n−1)/2⌋ for all calls to RTHelper2(n, q) made by UFT(n). Furthermore, since BFT(n) only makes calls to RootedTrees(n/2), caching B(k) for 1 ≤ k ≤ ⌊n/2⌋ would avoid all calls of RootedTrees for both UFT(n) and BFT(n), and thus also FreeTrees(n). So, for example, to generate all 109,972,410,221 free trees of order 32, this would mean caching B(k) for 1 ≤ k ≤ 16, which is perfectly feasible since there are only 235,381 rooted trees of order 16. On the other hand, in order to generate all rooted trees of order 32 whilst avoiding all calls of RootedTrees, we would need to cache B(k) for 1 ≤ k ≤ 31. Since there are nearly 10^{12} rooted trees of order 31, the cache space requirements for generating all rooted trees of order 32 would be of the order of at least 10 terabytes, and thus infeasible on practically all current computers (see [9] and [10] for tree counts).
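The tree counts used in this argument can be recomputed with a short Python sketch. It does not use the generation algorithms of this paper; it assumes the standard counting recurrence for unlabelled rooted trees (OEIS A000081) together with Otter's relation between the rooted and free counts. The function names are hypothetical.

```python
def rooted_tree_counts(n_max):
    """Return r, where r[k] is the number of unlabelled rooted trees of order k."""
    r = [0, 1]
    s = [0, 1]  # s[k] = sum of d * r[d] over the divisors d of k
    for n in range(1, n_max):
        # Standard recurrence: n * r[n+1] = sum_{k=1..n} s[k] * r[n+1-k].
        total = sum(s[k] * r[n + 1 - k] for k in range(1, n + 1))
        r.append(total // n)
        m = n + 1
        s.append(sum(d * r[d] for d in range(1, m + 1) if m % d == 0))
    return r


def free_tree_count(n, r):
    """Number of unlabelled free trees of order n, from the rooted counts r."""
    pairs = sum(r[i] * r[n - i] for i in range(1, n))
    if n % 2 == 0:
        pairs -= r[n // 2]
    return r[n] - pairs // 2


r = rooted_tree_counts(32)
print(r[16], r[31], free_tree_count(32, r))
```

The printed values should agree with the counts quoted above: the number of rooted trees of order 16, the number of rooted trees of order 31, and the number of free trees of order 32.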
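We store the cached sets in a list RTList, where RTList[k] holds B(k) for 1 ≤ k ≤ L, and L is the order of the largest tree for which we cache the representations. We can then replace all calls of RootedTrees(q) in RTHelper2(n, q) by references to RTList[q], provided L ≥ q.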
As there are very few rooted trees of order less than five, we explicitly create B(1), ..., B(4). In this initialisation, when computing RootedTrees(k), we note that we will already have computed the previous elements of RTList; so the recursive calls of RootedTrees in RTHelper2 may be replaced by references to RTList.
As explained above, we can replace all calls to RootedTrees from FreeTrees(n) by references to RTList if L ≥ ⌊n/2⌋. In practice, as explained below, in order to improve the efficiency of the code when q = ⌊(n−1)/2⌋, we henceforth assume that L ≥ ⌊n/2⌋ + 1. To avoid all calls to RootedTrees(q) in RTHelper2(n, q), we require that L ≥ q. This is clearly always true for FreeTrees. We will see that we can also avoid the recursive calls to RTHelper2(n − q, r) when q ≥ n − L.
When q, the order of the first subtree of the root, is at least ⌊(n−1)/2⌋, then newq = n − q − 1 ≤ L. So the r-loop (where r is the order of the second subtree of the root) can be dispensed with by letting b iterate through RTList[n − q]. Now, when q is at least ⌊(n+1)/2⌋, then r ≤ newq < q, so we can dispense with checking the ordering condition on a and b^#. When q = ⌊(n−1)/2⌋ and r = q, we can also avoid this check by skipping the initial elements of RTList[n − q], as we now explain.
Suppose that r = q = ⌊(n−1)/2⌋ and a = RTList[q][k]. When n is odd, n − q − 1 = q, so we can start with the element b for which b^# = a; this is easily seen to be RTList[n − q][k]. When n is even, n − q − 1 = q + 1 = n/2, so we can skip the |B(n/2)| elements for which the first subtree is of order n/2, and start with the element b for which b^# = a ⊕ 1; this is easily seen to be RTList[n − q][|B(n/2)| + k]. We note that, following the above changes, we can replace newq by q when q < n − L, since q ≤ L. Making these changes to RTHelper2 yields the algorithm RTHelper3.
In practice, as well as caching the B(k), we also cache B^#(k), which is the relex ordered set of sequences that is obtained by replacing each sequence b in B(k) by b^#. This removes the necessity to remove the first element of b each time.
We shall assume from now on that the algorithms RootedTrees and UFT use the helper function RTHelper3 instead of RTHelper2.
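The caching idea and the skipping rule for q = ⌊(n−1)/2⌋ can be checked on small orders with the following Python sketch. It is a simplified illustration rather than RTHelper3: sequences are tuples, lists are indexed from 0, and the ordering test is assumed to be that the first subtree of b^# must not exceed a lexicographically. The names build_rt_list and first_valid_index are hypothetical, and the starting index |B(n/2)| + k for even n is our reading of the rule above.

```python
def build_rt_list(L):
    """rt_list[k] = relex ordered canonical weight sequences of order k, 1 <= k <= L."""
    rt_list = [None, [(1,)]]
    for n in range(2, L + 1):
        trees = []
        for q in range(n - 1, 0, -1):        # order of the first subtree, largest first
            for a in rt_list[q]:             # cached, so no recursive call is needed
                for b in rt_list[n - q]:     # cached; b = (n - q) followed by b#
                    b_sharp = b[1:]
                    # Assumed test: the first subtree of b# must not exceed a.
                    if b_sharp[:q] <= a:
                        trees.append((n,) + a + b_sharp)
        rt_list.append(trees)
    return rt_list


def first_valid_index(n, k, rt_list):
    """Index in rt_list[n - q] of the first b that passes the test, q = (n - 1) // 2."""
    if n % 2 == 1:
        return k
    return len(rt_list[n // 2]) + k


rt_list = build_rt_list(6)
print([len(trees) for trees in rt_list[1:]])
```

For n = 8 and q = 3, for instance, the first four elements of rt_list[5] have a first subtree of order 4 and are skipped, so iteration for a = rt_list[3][k] can start at index 4 + k.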

Generators
The size of B(n) grows exponentially, so the list Bqn may become prohibitively large for large values of n, except when q is small. Therefore, to avoid creating and returning the list Bqn in RTHelper3, we instead return a generator. The changes necessary to effect this are, in essence, to simply replace all the assignments of the form Bqn ← Bqn ⊕ [c] by the statement yield c, and make corresponding changes to the other algorithms.
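The effect of replacing list accumulation by yield can be illustrated with a minimal Python sketch. It is a plain recursive generator, not RTHelper3: it performs no caching, and it assumes the same ordering test as before, namely that the first subtree of b^# must not exceed a lexicographically. The function name rooted_trees is hypothetical.

```python
from itertools import islice


def rooted_trees(n):
    """Lazily yield the canonical weight sequences of order n, largest first."""
    if n == 1:
        yield (1,)
        return
    for q in range(n - 1, 0, -1):            # order of the first subtree
        for a in rooted_trees(q):
            for b in rooted_trees(n - q):    # b = (n - q) followed by b#
                b_sharp = b[1:]
                if b_sharp[:q] <= a:
                    # Where a list version would append c, we yield it instead.
                    yield (n,) + a + b_sharp


# Only the requested sequences are produced; the full list is never built.
print(list(islice(rooted_trees(12), 2)))
print(sum(1 for _ in rooted_trees(7)))
```

Because nothing is stored, this version recomputes the smaller trees many times; combining generators with cached lists for small orders, as described above, avoids most of that repeated work.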

Strings for sequences
We store the weight sequences of the trees as alphanumeric strings, instead of lists, both to save storage and to create the canonical weight sequences more efficiently. We use the digits 1 to 9 for the corresponding weights, and the letters A, B, C, ... for weights 10, 11, 12, ..., respectively. So the weight sequence of the free tree T in Figure 3 is denoted by the string "A421141111" instead of the sequence 10 4 2 1 1 4 1 1 1 1.
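A minimal Python sketch of this encoding is given below. The function names are hypothetical, and the sketch assumes that only the letters A to Z are used, so it covers weights up to 35; how larger weights would be encoded is not addressed here.

```python
def ws_to_string(ws):
    """Encode weights 1..9 as digits and 10..35 as the letters A..Z."""
    return "".join(str(w) if w < 10 else chr(ord("A") + w - 10) for w in ws)


def string_to_ws(s):
    """Decode a string produced by ws_to_string back into a list of weights."""
    # Base 36 maps '1'..'9' to 1..9 and 'A'..'Z' to 10..35.
    return [int(ch, 36) for ch in s]


encoded = ws_to_string([10, 4, 2, 1, 1, 4, 1, 1, 1, 1])
print(encoded)
print(string_to_ws(encoded))
```

Since the digits precede the upper-case letters in ASCII, comparing two encoded strings gives the same result as comparing the underlying weight sequences lexicographically, so the relex ordering can be checked directly on the strings.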

Adjacency lists and matrices
Although weight sequences are useful for generating trees, for most purposes a more conventional representation is required, such as adjacency lists or adjacency matrices. Most other tree generation algorithms also initially generate the trees using non-conventional representations (e.g., level sequences or parent sequences, as mentioned in the introduction). The adjacency lists or matrices are then constructed from the particular representation used.
We now give a brief explanation of how we can incorporate the construction of the adjacency lists of the free trees of order n into our algorithm FreeTrees, using a caching approach similar to that outlined in Section 5.2. We assume that the vertices are labelled 1 to n in preorder.
The algorithm AdjListFromWS below returns the adjacency list of a single free tree given its weight sequence. In the algorithm, we denote the jth element of the weight sequence ws by ws[j], and the list of n empty lists by [ ]^n. We note that, given the weight sequence of any ordered tree (or indeed any ordered forest), whether canonically ordered or not, this algorithm will return its adjacency list if we remove the assignment A[j] ← A[j] ⊕ [i] and the if statement (which, for a bicentroidal tree, adds the edge between the two centroids).
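The conversion from a weight sequence to an adjacency list can be sketched in Python as follows. This is an illustrative reconstruction and not the pseudocode of AdjListFromWS: it assumes that the weight sequence is a list of integers, that vertices are labelled 0 to n − 1 in preorder (rather than 1 to n), and that a bicentroidal tree is given as two rooted trees of order n/2 written one after the other. The function name is hypothetical.

```python
def adj_list_from_ws(ws):
    """Adjacency list of the free tree with weight sequence ws (0-based preorder labels)."""
    n = len(ws)
    adj = [[] for _ in range(n)]
    stack = []  # pairs (vertex, index just past the end of its subtree)
    for v, w in enumerate(ws):
        # Discard ancestors whose subtrees ended before v.
        while stack and stack[-1][1] <= v:
            stack.pop()
        if stack:
            parent = stack[-1][0]
            adj[parent].append(v)
            adj[v].append(parent)
        stack.append((v, v + w))
    if ws[0] < n:
        # Assumed bicentroidal input: join the two centroids 0 and ws[0].
        adj[0].append(ws[0])
        adj[ws[0]].append(0)
    return adj


print(adj_list_from_ws([10, 4, 2, 1, 1, 4, 1, 1, 1, 1]))
```

Each vertex is pushed and popped at most once, so the conversion takes time linear in n.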
We extend the procedure InitialiseRTList to construct the adjacency list representations of the rooted trees, by calling the function AdjListFromWS on each weight sequence in RTList[k]. We store these representations in a hash table (implemented as a Python dictionary) using the weight sequence as the key.
We can now construct the adjacency list representation of all the free trees of order n while we construct their weight sequences: for each subtree of the root, we look up its adjacency list representation in the hash table, and then increase the label of each vertex by a suitable offset value. For a unicentroidal free tree represented by the integer sequence n ⊕ a ⊕ b^#, we offset the labels of the vertices of the subtree corresponding to a by 1, and those of the vertices of the forest corresponding to b^# by |a| + 1. For a bicentroidal free tree, we only need to offset the labels of the vertices corresponding to the subtree rooted at the second bicentroid by n/2. It is fairly straightforward to modify the above procedure in order to generate adjacency matrices instead of adjacency lists in a similar manner. The Python code for generating both the adjacency list and matrix representations is included in the appendices.
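The offsetting of cached adjacency lists can be sketched in Python as follows. This is a simplified illustration under several assumptions: vertices are labelled from 0 in preorder, the cache is a dictionary keyed by the weight sequence (as a tuple) of a rooted tree, and the forest b^# is obtained from the cached tree b by identifying the root of b with the new root, so its remaining vertices are shifted by |a|. All function names are hypothetical.

```python
def shift(adj, offset):
    """Copy of adj with every vertex label increased by offset."""
    return [[u + offset for u in nbrs] for nbrs in adj]


def unicentroidal_adj(a, b, cache):
    """Adjacency list of the tree n + a + b#, where b = (n - q) + b# and q = len(a)."""
    q = len(a)
    sub_a = shift(cache[a], 1)        # the vertices of a become 1..q
    sub_a[0] = [0] + sub_a[0]         # join the root of a to the new root 0
    adj_b = cache[b]
    root = [1] + [u + q for u in adj_b[0]]
    # Non-root vertices of b move up by q; references to the root of b stay 0.
    rest = [[u + q if u else 0 for u in nbrs] for nbrs in adj_b[1:]]
    return [root] + sub_a + rest


def bicentroidal_adj(a, b, cache):
    """Adjacency list of the bicentroidal tree formed from rooted trees a and b."""
    half = len(a)
    adj = [list(nbrs) for nbrs in cache[a]] + shift(cache[b], half)
    adj[0].append(half)               # edge between the two centroids
    adj[half].append(0)
    return adj


# Small hand-written cache of rooted trees, keyed by weight sequence.
cache = {(1,): [[]], (2, 1): [[1], [0]], (3, 1, 1): [[1, 2], [0], [0]]}
print(unicentroidal_adj((2, 1), (3, 1, 1), cache))
print(bicentroidal_adj((2, 1), (2, 1), cache))
```

Both functions copy the cached lists before modifying them, so the same cache entry can be reused for every tree in which that subtree occurs.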

Time tests to generate
We now present an empirical comparison of our algorithm with the popular WROM algorithm. We implemented our algorithms in Python and compared these with the Python implementation of the WROM algorithm taken from NetworkX. All computations were performed using Python 3.7 and the JIT compiler PyPy3.6-v7.3.1, running on a Pentium i7 with 16GB RAM; all times are in seconds. We set L, the order of the largest tree for which we cache the representations, to be ⌊n/2⌋ + 1. Table 1 shows the times to generate all free trees of order n and return the count of the number of trees, without saving the representations. BRFE refers to the algorithm FreeTrees described above and WROM to the algorithm described in [13]. BRFE(ls) and BRFE(mat) include converting the weight sequences into the adjacency list and matrix representations, respectively; WROM(ls) and WROM(mat) are defined similarly. As can be seen, the run-times for generating the weight sequences using BRFE are less than a quarter of those for generating the level sequences using WROM. The speed-ups for the times to create the adjacency list and matrix representations are similar. Due to the excessive times involved, we have not run some of the algorithms for the larger values of n.
We note that the run-times for BRFE are about four times as long using the standard CPython implementation as those in Table 1, and the run-times for WROM are about ten times as long. We further note that, by increasing the value of L, we could significantly reduce the run-times of our algorithms for larger values of n.
Li and Ruskey presented an alternative algorithm in [6] [7] that generates parent sequences, and compared a PASCAL implementation of their algorithm and the WROM algorithm. It can be seen from Table 5.2 in [6] that the run-time of their algorithm is about 70% of that of WROM. We can deduce from this that BRFE would take about a third of the time of their algorithm.

Conclusion
In this paper we have presented new canonical representations for ordered, rooted and free trees. We constructed recursive algorithms for generating all rooted trees and all free trees of order n using these representations; each of these algorithms returns a list of the trees generated. We made a number of improvements to the algorithms and their Python implementations, including using generators to avoid having to explicitly construct and store the long lists of trees returned by the recursive calls. Moreover, in order to eliminate many of the recursive calls for small values of n, we cached the lists of rooted trees of small order. Our main interest is in the generation of free trees and, in this case, in order to eliminate a large proportion of the recursive calls, it is only necessary to cache the lists of rooted trees up to order around n/2. We then described how the algorithm could be modified to generate the adjacency list or matrix representations of the trees.
We compared our Python implementation of the algorithm for generating free trees with the Python implementation of the well-known WROM algorithm taken from NetworkX. We used our algorithm to generate the free trees of order n, for 18 ≤ n ≤ 29, but because of the longer run-times, we only ran the WROM algorithm up to n = 27. It can be seen from Table 1 that the run-times for the new algorithm are less than a quarter of those for the WROM algorithm (the improvement in the run-times for the algorithms that generate adjacency lists or matrices is similar). From the comparisons in [6], we may deduce that our algorithm would take less than a third of the time of the algorithm presented there.

Appendices
For the Python code, please email the authors.