Placozoa and Cnidaria are sister taxa

The phylogenetic placement of the morphologically simple placozoans is crucial to understanding the evolution of complex animal traits. Here, we examine the influence of adding new genomes from placozoans to a large dataset designed to study the deepest splits in the animal phylogeny. Using site-heterogeneous substitution models, we show that it is possible to obtain strong support, in both amino acid and reduced-alphabet matrices, for either a sister-group relationship between Cnidaria and Placozoa, or for Cnidaria and Bilateria (=Planulozoa), also seen in most published work to date, depending on the orthologues selected to construct the matrix. We demonstrate that a majority of genes show evidence of compositional heterogeneity, and that the support for Planulozoa can be assigned to this source of systematic error. In interpreting this placozoan-cnidarian clade, we caution against a peremptory reading of placozoans as secondarily reduced forms of little relevance to broader discussions of early animal evolution.


Introduction
The discovery 1 and mid-20 th century rediscovery 2 of the enigmatic, amoeba-like placozoan Trichoplax adhaerens did much to ignite the imagination of zoologists interested in early animal evolution 3 . As 48 microscopic animals adapted to extracellular grazing on the biofilms over which they creep 4 , placozoans have a simple anatomy suited to exploit passive diffusion for many physiological needs, with only six morphological cell types discernible even to intensive scrutiny 5,6 , and have no conventional muscular, digestive, or nervous systems, yet show tightly-coordinated behavior 7,8 . They 52 proliferate through fission and somatic growth. Evidence for sexual reproduction remains elusive, despite genetic evidence of recombination 9 and descriptions of early abortive embryogenesis 10,11 , with the possibility that sexual phases of the life cycle may occur only under poorly understood field conditions 12,13 56 Given their simple, puzzling morphology and dearth of embryological clues, molecular data are crucial in placing placozoans phylogenetically. The position of Placozoa in the animal tree proved recalcitrant to early standard-marker analyses [14][15][16] , although this paradigm did reveal a large degree of molecular diversity in placozoan isolates from around the globe, clearly indicating the existence of 60 many cryptic species 12,17,18 with up to 27% genetic distance in 16S rRNA alignments 19 . An apparent answer to the question of placozoan affinities was provided by analysis of a nuclear genome assembly 9 , which strongly supported a position as the sister group of a clade of Cnidaria+Bilateria (sometimes called Planulozoa). However, this effort also revealed a surprisingly bilaterian-like 20 64 developmental gene toolkit in placozoans, a paradox for such a simple animal.
As metazoan phylogenetics has pressed onward into the genomic era, perhaps the largest controversy has been the debate over the identity of the sister group to the remaining metazoans, traditionally thought to be Porifera, but considered to be Ctenophora by Dunn et al. 21 and 68 subsequently by additional studies [22][23][24][25][26] . Others have suggested that this result arises from inadequate taxon sampling, flawed matrix husbandry, and/or use of poorly fitting substitution 4 models [27][28][29][30][31] . A third view has emphasized that using different sets of genes can lead to different conclusions, with only a small number sometimes sufficient to drive one result or another 32,33 . This 72 controversy, regardless of its eventual resolution, has spurred serious contemplation of possibly independent origins of several hallmark traits such as striated muscles, digestive systems, and nervous systems 23,34-39 . Driven by this controversy, new genomic and transcriptomic data from sponges, 76 ctenophores, and metazoan outgroups have accrued, while new sequences and analyses focusing on the position of Placozoa have been slow to emerge. Here, we provide a novel test of the phylogenetic position of placozoans, adding draft genomes from three putative species that span the root of this clade's known diversity 17 , and critically assessing the role of systematic error in placing of 80 these enigmatic organisms.

Results and Discussion
Orthology assignment on sets of predicted proteomes derived from 59 genome and 84 transcriptome assemblies yielded 4,294 gene trees with at least 20 sequences each, sampling all 5 major metazoan clades and outgroups, from which we obtained 1,388 well-aligned orthologues.
Within this set, individual maximum-likelihood (ML) gene trees were constructed, and a set of 430 most-informative orthologues were selected on the basis of tree-likeness scores 40 . This yielded an 88 amino-acid matrix of 73,547 residues with 37.55% gaps or missing data, with an average of 371.92 and 332.75 orthologues represented for Cnidaria and Placozoa, respectively (with a maximum of 383 orthologues present for the newly sequenced placozoan H4 clade representative; Figure 1 Cnidaria+Placozoa, albeit with more marginal resampling support. Both Bayesian and ML analyses show little internal branch diversity within Placozoa. Accordingly, deleting all newly-added placozoan 96 genomes from our analysis has no effect on topology and only a marginal effect on support in ML analysis ( Figure 1 -Supplemental Figure 2). Quartet-based concordance analyses 42 show no evidence of strong phylogenetic conflicts among ML gene trees in this 430-gene set (Figure 1), although internode certainty metrics are close to 0 for many key clades including Cnidaria+Placozoa, 100 indicating that support for some ancient relationships may be masked by gene-tree estimation errors, emerging only in combined analysis 43 .
Compositional heterogeneity of amino-acid frequencies along the tree is a source of phylogenetic error not modelled by even complex site-heterogeneous substitution models such as 104 CAT+GTR [44][45][46][47] . Furthermore, previous analyses 32 have shown that placozoans and choanoflagellates in particular, both of which taxa our matrix samples intensively, deviate strongly from the mean amino-acid composition of Metazoa, perhaps as a result of genomic GC content discrepancies. As a measure to at least partially ameliorate such nonstationary substitution, we recoded the amino-acid 108 matrix into the 6 "Dayhoff" categories, a common strategy previously shown to reduce the effect of compositional variation among taxa, albeit the Dayhoff-6 groups represent only one of many plausible recoding strategies, all of which sacrifice information 31,[48][49][50] . Analysis of this recoded matrix under the CAT+GTR model again recovered full support (pp=1) for Cnidaria+Placozoa ( Figure 2). 112 Indeed, under Dayhoff-6 recoding, the only major change is in the relative positions of Ctenophora and Porifera, with the latter here constituting the sister group to all other animals with full support.
Similar recoding-driven effects on relative positions of Porifera and Ctenophora have also been seen in other recent work 31 . 116 Many research groups, using good taxon sampling and genome-scale datasets, and even recently including data from a new divergent placozoan species 26,31,51 , have consistently reported strong support for Planulozoa under the CAT+GTR model. Indeed, when we construct a supermatrix 6 from our predicted peptide catalogues using a different strategy, relying on complete sequences of 120 303 pan-eukaryote "Benchmarking Universal Single-Copy Orthologs" (BUSCOs) 52 , we also see full support in a CAT+GTR+Г analysis for Planulozoa, in both amino-acid ( Figure 3a) and Dayhoff-6 recoded alphabets (Figure 3b). Which phylogeny is correct, and what process drives support for the incorrect topology? Posterior predictive tests, which compare the observed among-taxon usage of 124 amino-acid frequencies to expected distributions simulated using the sampled posterior distribution and a single composition vector, may provide insight 46 . Both the initial 430-gene matrix and the 303gene BUSCO matrix fail these tests, but the BUSCO matrix fails it more profoundly, with z-scores (measuring mean-squared across-taxon heterogeneity) scoring in the range of 330-340, in contrast 128 to the range of 176-187 seen in the 430-gene matrix. Furthermore, inspecting z-scores for individual taxa in representative chains from both matrices shows that a large amount of this global difference in z-scores can be attributed to placozoans, with additional contributions from choanoflagellates and select isolated representatives of other clades ( Figure 3C). 132 As a final measure to describe the influence of compositional heterogeneity in this dataset, we applied a null-simulation test for compositional bias to each alignment in our set of 1,388 orthologues. This test, which compares the real data to a null distribution of amino-acid frequencies simulated along assumed gene trees with a substitution model using a single composition vector, is 136 less prone to Type II errors than the more conventional Χ 2 test 45 . Remarkably, at a conservative significance threshold of α=0.10, the majority (764 genes or ~55%) of this gene set is identified as compositionally biased by this test, highlighting the importance of using appropriate statistical tests to control this source of systematic error, rather than applying arbitrary heuristic approaches 53 . 140 Building informative matrices from gene sets on either side of this significance threshold, and again applying both CAT+GTR mixture models and ML profile mixtures, we see strong support for   Figure 4D), and also diminishes support for Placozoa+Cnidaria, perhaps reflecting the inherent information loss of using a reduced amino-acid alphabet for this relatively shorter matrix. In light of these observations, we question the assertion that the support for Porifera-sister seen in some 152 matrices and recoding strategies can be per se attributed to a lessened influence of compositional heterogeneity 31 .
The previously cryptic phylogenetic link between cnidarians and placozoans seen in gene sets less influenced by compositional bias will continue to raise questions on the homology of 156 certain traits across non-bilaterians. Many workers, citing the incompletely known development 10,12 and relatively bilaterian-like gene content of placozoans 9,51 , presume that these organisms must have a still-unobserved, more typical development and life cycle, or else are merely oddities that have experienced wholesale secondary simplification, having scant significance to any evolutionary 160 path outside their own. Indeed, it is tempting to interpret this new phylogenetic position as further bolstering such hypotheses, as much work on cnidarian models in the evo-devo paradigm is predicated on the notion that cnidarians and bilaterians share, more or less, many homologous morphological features, viz. bilaterality 54 , nervous systems 36,37,55-57 , basement-membrane lined 164 epithelia 58,59 , musculature 39 , embryonic germ-layer organisation 60 , and internal digestion 38,61-63 .
While we do not argue, as some have done 64,65 , that placozoans resemble hypothetical metazoan ancestors, we hesitate to dismiss them a priori as irrelevant to understanding early bilaterian evolution in particular: although apparently simpler and less diverse, placozoans nonetheless have 168 equal status to cnidarians as an immediate extant outgroup. Rather, we see value in testing assumed hypotheses of homology, character by character, by extending pairwise comparisons between bilaterians and cnidarians to include placozoans, an agenda which demands reducing the large 8 disparity in embryological, physiological, and molecular genetic knowledge between these taxa. 172 Conversely, we emphasize another implication of this phylogeny: characters that can be validated as homologous at any level between Bilateria and Cnidaria must have originated earlier in animal evolution than previously appreciated, and should either cryptically occur in modern placozoans or else have been lost at some point in their ancestry. In this light, paleobiological scenarios of early 176 animal evolution founded on inherently phylogenetically-informed interpretations of Ediacaran fossil forms [66][67][68][69][70] and molecular clock estimates 71-73 may require re-examination.

Materials and Methods 180
Sampling, sequencing, and assembling reference genomes from previously unsampled placozoans Orthologue alignment was performed using the MAFFT v7.271 'E-INS-i' algorithm, and probabilistic masking scores were assigned with ZORRO 88 , removing all sites in each alignment with scores below 5 as described previously 83 . 31 orthologues with retained lengths less than 50 amino acids were discarded, leaving 1,388 well-aligned orthologues. 268

Matrix assembly
A full concatenation of all retained 1,388 orthogroups was performed with the 'geneStitcher.py' script distributed with UPhO available at https://github.com/ballesterus/PhyloUtensils. However, such a matrix would be too large for 272 tractably inferring a phylogeny under well-fitting mixture models such as CAT+GTR; therefore we used MARE v0.1.2 40 to extract an informative subset of genes using tree-likeness scores, running with '-t 100' to retain all taxa and using '-d 1' as a tuning parameter on alignment length. This yielded our 430-orthologue, 73,547 site matrix. 276 As a check on the above procedure, which is agnostic to the identity of the genes assigned into orthologue groups, we also sought to construct a matrix using complete, single-copy sequences identified by the BUSCO v3.0.1 algorithm 52 , using the 303-gene eukaryote_odb9 orthologue set.
BUSCO was run independently on each peptide FASTA file used as input to OrthoFinder, and a 280 custom python script ('extract.py') was used to parse the full output table from each species, selecting only those entries identified as complete-length, single-copy representatives of each BUSCO orthologue, and grouping these into unix directories, facilitating downstream alignment, probabilistic masking, and concatenation, as described for the OrthoFinder matrix. This 303-gene 284 BUSCO matrix had a total length of 94,444 amino acids, with 39.6% of sites representing gaps or missing data.
Within the gene bins nominated by the test of compositional heterogeneity (see below), matrices were constructed again by concatenating and reducing matrices with MARE, using '-t 100' 288 to retain all taxa and setting '-d 0.5' to yield a matrix of an optimal size for inferring a phylogeny under the CAT+GTR model. This procedure gave a 349-gene matrix of 80,153 amino acids within the test-failing gene set, and a 348-gene matrix of 55,426 amino acids within the test-passing set ( Figure   4). 292

Phylogenetic Inference
Individual ML gene trees were constructed on all 1,388 orthologues in IQ-tree v1.6beta, with '-m MFP -b 100' passed as parameters to perform automatic model selection and 100 standard 296 nonparametric bootstraps on each gene tree.
For inference on the initial 430-gene matrix, we proceeded as follows: ML inference on the concatenated matrix (Figure 1 -Supplemental Figure 1) was performed with IQ-tree v1.6beta, passing '-m C60+LG+FO+R4 -bb 1000' as parameters to specify a profile mixture model and retain 300 1000 trees for ultrafast bootstrapping; the '-bnni' flag was used to incorporate NNI correction during UF bootstrapping, an approach shown to control misleading inflated support arising from model misspecification 89 . ML inference using only the H1 haplotype as a representative of Placozoa ( showed a maximum bipartition discrepancy (maxdiff) of 0.042 after removing the first 1000 generations as burn-in (Figure 2). QuartetScores 42 was used to measure internode certainty metrics including the reported EQP-IC, using the 430 gene trees from those orthologues used to derive the matrix as evaluation trees, and using the amino-acid CAT+GTR+Г4 tree as the reference to be 316 annotated ( Figure 1).
For inference on the BUSCO 303 gene set, we ran 4 chains of the CAT+GTR+Г4 mixture model with PhyloBayes MPI v1.7a, applying the -dc flag again to remove constant sites, but here not specifying a starting tree; chains were run from 1873 to 2361 generations. Unfortunately, no pair of 320 chains reached strict convergence on the amino-acid version of this matrix (with all pairs showing a maxdiff = 1 at every burn-in proportion examined), perhaps indicating problems mixing among the four chains we ran. However, all chains showed full posterior support for identical relationships among the 5 major animal groups, with differences among chains assignable to minor differences in 324 the internal relationships within Choanoflagellata and Bilateria. Accordingly, the posterior consensus tree in Figure 3A

Tests of compositional heterogeneity
For posterior predictive tests of compositional heterogeneity using MCMC samples under 348 CAT+GTR, we used PhyloBayes MPI v1.7a to test two chains from the initial 430-gene matrix, and 3 chains from the 303-gene BUSCO matrix, removing 2000 and 1000 generations as burn-in, respectively. Results from tests on representative chains were selected for plotting in Figure 3C; however, results from all chains tested are deposited in the Data Dryad accession. 352 For the per-gene null simulation tests of compositional bias 45 , we used the p4 package (https://github.com/pgfoster/p4-phylogenetics), inputting the ML trees inferred by IQ-tree for each of the 1,388 alignments, and assuming an LG+Γ4 substitution model with a single empirical frequency vector for each gene; this test was implemented with a simple wrapper script 356 ('p4_compo_test_multiproc.py') leveraging the python multiprocessing module. We opted not to model-test each gene individually in p4, both because the range of models implemented in p4 are more limited than those tested for in IQ-tree, and because, as a practical matter, LG (usually with variant of the FreeRates model of rate heterogeneity) was chosen as the best-fitting model in the IQ-360 tree model tests for a large majority of genes, suggesting that LG+Γ4 would be a reasonable approximation for the purposes of this test. We selected an α-threshold of 0.10 for dividing genes into test-passing and -failing bins as a conservative measure; however, we emphasize that even at a less conservative α=0.05, 47% of genes would still be detected as falling outside the null expectation. 364 Systems Infrastructure team provided essential support on the EBI compute cluster. Allen Collins, Scott Nichols, and particularly Andreas Hejnol provided useful comments on an earlier version of this manuscript. 372

Competing Interests
The authors declare that they have no conflicting interests relating to this work. CEL, JCM and GG conceptualized and initiated this work, and supervised throughout. All authors read 392 and contributed to the final manuscript.