Discussion on ‘Tabelliscolex (Cricocosmiidae: Palaeoscolecidomorpha) from the early Cambrian Chengjiang Biota and the evolution of seriation in Ecdysozoa’ by Shi et al. 2021 (JGS, jgs2021-060)

Supplementary material: Code necessary to reproduce analyses from the paper is available at https://doi.org/10.6084/m9.figshare.16419522


Introduction
Palaeoscolecids are a widespread Palaeozoic group of armoured subcylindrical worms whose phylogenetic position has attracted much debate. Their regionalized pharyngeal armature and posterior hooks recall those of priapulans (Harvey et al. 2010;Smith 2015;, but the possibility that such features may have been present in the ancestral ecdysozoan (Smith and Caron 2015) permits relationships elsewhere among the moulting animals, whether close to the nematomorphs, recognizing similarities in cuticle structure (Hou and Bergström 1994;Wills et al. 2012); as sister to the panarthropods (Dzik and Krumbiegel 1989;Han et al. 2007a); or close to the root of Superphylum Ecdysozoa (Budd 2001).
Precise anatomical observations of additional fossil material, such as the valuable description of new cricocosmiid specimens by Shi et al. (2021a), are needed in order to resolve this uncertainty. Tabelliscolex and its close relatives Cricocosmia and Tylotites have previously been linked to lobopodians on the basis of their phosphatized dorsal sclerites (Han et al. 2007a, b), which potentially correspond to net-like sclerites in an equivalent position on early-diverging lobopodians such as Microdictyon, Onychodictyon and Cardiodictyon (Ramsköld and Chen 1998). Shi et al. (2021a) describe serially repeated projections that lie ventrally opposite these dorsal plates. These structures fit neatly into the appealing, if speculative, framework proposed by Dzik and Krumbiegel (1989), in which lobopodians evolved from a palaeoscolecid-like ancestor via the extrusion of ventral legs opposite dorsal plates. This scenario rests on the assumption that plates in lobopodians and palaeoscolecids are homologous, and could imply that newly described ventral projections represent precursors to the lobopod limbs of lobopodians. Shi et al. (2021a) display commendable caution by testing this model in a phylogenetic framework, combining characters from two established phylogenetic datasets to test palaeoscolecid relationships within a modern framework of ecdysozoan evolution, encompassing recent fossil finds and conceptual advances (Wills et al. 2012;Smith and Ortega-Hernández 2014;Smith and Caron 2015;Yang et al. 2015;Zhang et al. 2016;Howard et al. 2020).
This combined dataset is characterized by a high proportion of character states that cannot logically be coded; 7155 of 17 005 entries are marked as 'inapplicable'. As an example, a taxon without sclerites cannot be meaningfully coded for the sub-character 'sclerite ornamentation' (character 86, Shi et al. 2021b), as neither of the two states 'net-like' or 'scaly' applies. Fitch (1971) parsimony and the Mk model (Lewis 2001) treat these inapplicable states as though they were simply uncertain, an approach that is known to materially affect the identification of optimal topologies (Maddison 1993;Brazeau et al. 2019). By way of example, a transition between 'net-like' and 'scaly' dorsal sclerites might be inferred in an ancestor that lacks sclerites. The inference of an evolutionary step that could not logically have occurred may add to an overall parsimony score that would otherwise be optimal. Workable solutions to the problemeach with their own strengths and limitationshave recently been proposed (De Laet 2005;Brazeau et al. 2019;Tarasov 2019;Goloboff et al. 2021;Hopkins and St. John 2021).

Methods
We explore the impact of inapplicable data on the results of Shi et al. (2021) by reanalysing their morphological matrix using four methods: Bayesian inference and maximum likelihood under the Mk model (Lewis 2001); standard Fitch (1971) parsimony; and the 'inapplicable-aware' parsimony algorithm of Brazeau et al. (2019) ('BGS').
The first three methods seek to replicate and extend the results of Shi et al. (2021); we follow these authors' methodology, making appropriate selections for unspecified analytical parameters.
For Bayesian inference, we conduct three independent runs of six chains in MrBayes 3.2.7a (Ronquist et al. 2012), sampling every 10 000 generations for 6 000 000 generations, discarding the first 25% of samples as burn-in. We use the Mk model with gammadistributed rate variation across characters (Lewis 2001), and a Dirichlet prior distribution on branch lengths Zhang et al. 2012). Convergence between runs is indicated by minimum estimated sample sizes > 1800 and potential scale reduction factors = 1.000 for the tree length and alpha parameters.
We calculate the maximum likelihood tree using IQ-tree 1.6.12 (Nguyen et al. 2015). ModelFinder (Kalyaanamoorthy et al. 2017) identifies the Mk model (Lewis 2001), with a four-category gamma model of rate heterogeneity and a correction for ascertainment bias, as the most appropriate model under the Bayesian information criterion. Correspondingly, the five invariant sites are removed prior to analysis. This model estimates 188 parameters from 174 variable sites, meaning that branch length estimatesand consequently reconstructed tree topologiesare likely to be unreliable (Burnham and Anderson 2002). We further note that IQ-tree treats partially ambiguous states (e.g. {0, 2}) as fully ambiguous (i.e. ?).
Finally, we apply the 'Morphy' (Brazeau et al. 2017) implementation of the BGS inapplicable treatment (Brazeau et al. 2019) using the R (R Core Team 2021) package 'TreeSearch' (Smith 2018(Smith , 2021, which conducts tree search using the parsimony ratchet heuristic (Nixon 1999). Implied weighting scores are calculated using the formula e=(e þ k) (Goloboff 1993), where e is the extra score associated with each character (i.e. the 'Morphy' score minus the minimum score possible), and k is the concavity constant. At each concavity constant, we run the search until tree score has not improved for 18 consecutive ratchet iterations, retaining up to 171 most parsimonious trees for each analysis.
Because six characters in the Shi et al. (2021a) matrix contain 6-13 sub-characters, and 14 sub-characters are contingent on more than one character, the TNT implementation of the Goloboff et al. (2021) algorithm (which supports a maximum of five subcharacters) cannot be used. The BGS algorithm, whilst approximate and imperfect (Brazeau et al. 2019;Goloboff et al. 2021), is the only inapplicable-aware phylogenetic method that supports arbitrarily many contingent sub-characters, and for which a computationally tractable implementation is presently available. This allows us to analyse the Shi et al. (2021a) matrix without modification, simplifying the interpretation of our results.
We explore differences between analytical methods by constructing a tree space (Hillis et al. 2005; Smith in press a) on a subset of 20 most parsimonious trees from each parsimony analysis, 100 Fig. 1. Summary of Bayesian posterior trees. Cricocosmiids (bold) are among the least stable taxa in Bayesian analyses, but are most likely to fall as sister to a clade comprising nematoids and panarthropods. The presence of paired ventral structures in the common ancestor of these taxa is not resolved by probabilistic methods. Tree represents the majority-rule consensus of the posterior distribution after removing the rogue taxon Acosmia, and is arbitrarily rooted on Loricifera + Kinorhyncha. Taxa are coloured according to their relative instability (Smith in press b): darker text denotes taxa that occupy a consistent position in all sampled trees. Edges are labelled with the posterior probability of the associated bipartition split; pie charts at nodes denote the reconstructed probability that serially repeated paired ventral structures (char. 77, Shi et al. 2021b) were present at each ancestral node. trees from the posterior distribution of each of MrBayes run, and the single most likely tree.
We define our tree space based on quartet distances between unrooted trees (Estabrook et al. 1985;Sand et al. 2014;Smith 2019b), which map faithfully into two dimensions via principal coordinates analysis (Gower 1966) (trustworthiness × continuity > 0.9) (Venna and Kaski 2001;Kaski et al. 2003). The minimum spanning tree (Gower and Ross 1969), which shows the shortest graph connecting all trees, highlights distortions in the mapping (Smith in press a). We verified that the results of tree space analysis are robust to the choice of distance metric by repeating the analysis using the clustering information distance (Smith 2020).
Morphological datasets often include taxa with a poorly constrained phylogenetic position, removal of which allows additional groupings to be resolved in a consensus tree (Wilkinson 1994(Wilkinson , 1996. We evaluate the stability of leaves within the posterior distribution of trees in order to identify rogue taxa using the heuristic approaches of Smith (in press b).
To evaluate the evolutionary implications of different tree topologies, we reconstruct ancestral states using the BGS algorithm ( parsimony trees) and maximum likelihood (Bayesian trees), using the R packages 'ape' and 'TreeSearch' (Paradis and Schliep 2019; Smith 2021).

Results
Our Bayesian analysis reproduces the results presented by Shi et al. (2021a). Analysis of the Bayesian posterior tree sample identifies Acosmia (Howard et al. 2020) as a rogue taxon whose variable position between posterior trees reduces the resolution of the majority-rule consensus tree. A majority-rule summary of the Bayesian posterior after removing Acosmia from each tree topology ( Fig. 1) thus yields additional phylogenetic information (Smith in press b). Most pertinently, the posterior tree sample suggests (with p . 0:7) that cricocosmiids belong in the stem group of a clade comprising nematoid worms and panarthropods, which are likely (p . 0:6) to be sister taxa.
Despite the tendency of phylogenetic analyses to overestimate Bayesian posterior probabilities, particularly in the presence of ambiguous (or inapplicable) data (Suzuki et al. 2002;Erixon et al. 2003;Yang and Rannala 2005;Lemmon et al. 2009), the results of this analysis are inconclusive as to whether the common ancestor of cricocosmiids and panarthropods had serially repeated paired ventral structures (char. 77, Shi et al. 2021b; Fig. 1); indeed, this character is reconstructed with a degree of uncertainty at all nodes. Dorsal epidermal specializations (char. 79, Shi et al. 2021b) are reconstructed as absent at this ancestral node with p ¼ 0:8.
Trees identified as optimal under the maximum likelihood and parsimony criteria fall at or beyond the fringes of the area of tree space sampled by the Bayesian posterior distribution (Fig. 2; Supplementary Data). Their high distance from the more central and densely sampled regions of the Bayesian posterior distribution indicates that these trees do not correspond to the most plausible trees under the Mk model.
Even if the most likely tree has an low likelihood of being exactly correct (p ¼ e À2390:0492 ), it can in principle serve as a point estimate of the most likely tree topologies in situations where a single bifurcating tree is desired (Brown et al. 2017). In this case, however, the large distances between the maximum likelihood tree and the majority of the Bayesian sample (Fig. 2) indicate that it inadequately represents the distribution of plausible trees and the associated phylogenetic uncertainty.
Parsimony analysis under the Fitch algorithm replicates the results of Shi et al. (2021a), though our extended searches recover  Fitch (1971) parsimony (+) and maximum likelihood (×) resolve cricocosmiids as sister to Priapulida (squares); BGS parsimony [Brazeau et al. (2019); open shapes] as sister to Panarthropoda (triangles); the Bayesian posterior sample [Lewis (2001); filled shapes] encompass both these possibilities as well as a position as sister to Panarthropoda + Nematoida (circles). p denotes the posterior probability of each topological hypothesis. Low (light shading) v. high (dark shading) concavity constants return a distinct set of trees in both parsimony treatments. additional most-parsimonious trees (k = 3: 82 rather than four; equal weights: 4 981 rather than six; tree scores listed in Supplementary Data). Values of the concavity constant close to TNT's default of 3, which is understood to penalize homology too severely in many settings (Goloboff et al. 2008b(Goloboff et al. , 2018Smith 2019a), recover trees that are very different from those under more appropriate concavity constants (higher values; darker shades in Fig. 2), which in turn resemble those obtained under maximum likelihood and equal weights; the most prominent differences concern whether Microdictyon and similar taxa fall in the stem group to Panarthropoda or Onychophora (Supplementary Data).
BGS parsimony recovers very different trees to Fitch parsimony ( Fig. 2; tree scores listed in Supplementary Data), again with strong differences between low concavity constants (in which Acosmia plots with kinorhynchs and loriciferans; Supplementary Data) and high ones (where Acosmia plots with nematoids; Fig. 3). All BGS trees recover the clade comprising panarthropods and lobopodians emerging from a paraphyletic 'Cricocosmiidae' (Fig. 3). The most parsimonious character reconstruction indicates that serially repeated paired ventral structures (char. 77, Shi et al. 2021b Fig. 3) and dorsal epidermal specializations (char. 79, Shi et al. 2021b) evolved in the common ancestor of Tabelliscolex and panarthropods, and are homologous between these groups. The potential homology of the paired ventral structures in Mafangscolex remains uncertain under the BGS algorithm.

Discussion
The central conclusion of Shi et al. (2021a) that dorsal sclerotized plates and spinose ventral appendages evolved independently in lobopodians and cricocosmiidsis supported only when cricocosmiids are resolved as stem-group priapulans. Parsimony analyses only recover this situation under unsuitable treatments of inapplicable characters. Bayesian analysis, which cannot yet readily account for inapplicable characters, assigns a low posterior probability (5.5%) to a stem-group priapulan relationship.
Our reanalyses indicate that it is more parsimonious to reconstruct cricocosmiids in the stem group to Panarthropoda, and that this relationshipwith the possible inclusion of the nematoidsis substantially more likely than any other. This view is consistent with the origin of panarthropods (and potentially nematoids) from within a palaeoscolecid grade, and the homology of their dorsal and ventral repeated structures (Dzik and Krumbiegel 1989). Nevertheless, in light of the strong impact of methodology on the relationships of the palaeoscolecids, and the considerable uncertainty in their superphylum-level affinity indicated by Bayesian analyses, we feel that it is premature to offer a decisive statement on the potential homologies between serially repeated structures in lobopodians and palaeoscolecids. Competing interests The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Data availability All data generated or analysed during this study are included in this published article (and its supplementary information files).