Phylogeny and divergence times of suckers (Cypriniformes: Catostomidae) inferred from Bayesian total-evidence analyses of molecules, morphology, and fossils

Catostomidae (“suckers”) is a diverse (76 species) and broadly distributed family of Holarctic freshwater fishes with a rich fossil record and a considerable number (∼35%) of threatened and imperiled species. We integrate DNA sequences (three mitochondrial genes, three nuclear genes), morphological data, and fossil information to infer sucker phylogenetic relationships and divergence times using Bayesian “total-evidence” methods, and then test hypotheses about the temporal diversification of the group. Our analyses resolved many nodes within subfamilies and clarified Catostominae relationships to be of the form ((Thoburniini, Moxostomatini), (Erimyzonini, Catostomini)). Patterns of subfamily relationships were incongruent, but mainly supported two placements of the Myxocyprininae; distinguishing these using Bayes factors lent strongest support to a model with Myxocyprininae sister to all remaining sucker lineages. We improved our Bayesian total-evidence dating analysis by excluding problematic characters, using a clock-partitioning scheme identified by Bayesian model selection, and employing a fossilized birth-death tree prior accommodating morphological data and fossils. The resulting chronogram showed that suckers evolved since the Late Cretaceous–Eocene, and that the Catostomini and Moxostomatini clades have accumulated species diversity since the early to mid-Miocene. These results agree with the fossil record and confirm previous hypotheses about dates for the origins of Catostomidae and catostomine diversification, but reject previous molecular hypotheses about the timing of divergence of ictiobines, and between Asian–North American lineages.
Overall, our findings from a synthesis of multiple data types enhance understanding of the phylogenetic relationships, taxonomic classification, and temporal diversification of suckers, while also highlighting practical methods for improving Bayesian divergence dating models by coupling phylogenetic informativeness profiling with relaxed-clock partitioning.

[...] declines in PI towards the root of our phylogeny from the final divergence time results presented below. We compared three resolution probabilities, including probability correct, probability polytomy, and probability incorrect or 'phylogenetic noise' (equations 11-13 in Townsend [...]

[...] molecular and morphological characters in the total-evidence dataset (Table 3). Proportions of missing data and parsimony informative characters ranged from 0.3-36.4% and from 21.9-97.6%, respectively, across datasets (Table 3). PartitionFinder identified seven unique DNA sequence subsets (scheme BIC = 141854.71523), and the best-fit DNA substitution model for each subset is listed in Table S1. Morphological character subsets (in subset M) were assigned Mkv+Γ models, as described above. None of the mtDNA genes sequenced in this study showed signs of NUMTs, and we found no indels in the IRBP sequences; however, GHI and RPS7 genes aligned with ~32 and ~16 ingroup indels/gaps, respectively. We archived our sequence [...]

[...] Harris & Mayden (2001). This 'subfamily pattern 2' relationship was strongly supported with BPP = 1 in our total-evidence consensus topology from BEAST (Fig. 4) and resolved with low support in the four-locus topology (Figs. 2B and S4). Distinguishing between these two conflicting sets of subfamily relationships is difficult, because each is supported by molecular and total-evidence topologies herein and agrees with at least one previous molecular study. To objectively determine the arrangement of these subfamilies with the greatest weight of evidence conditional on our total-evidence dataset, we compared subfamily patterns 1 and 2 using Bayes factors. We ran MrBayes as described above, except employing topological constraints set to subfamily pattern 2, and then used stepping-stone sampling ( [...]

[...] support for the H. etowanum-H. nigricans node (Fig. S3).
We resolved Moxostoma as monophyletic with BPP = 0.89-1, except for a paraphyletic pattern in the morphology consensus tree. Within the Erimyzonini, Erimyzon was monophyletic (BPP = 1) and sister to Minytrema (e.g. Fig. 3). None of the molecular or total-evidence topologies we inferred resolved Catostomus as monophyletic relative to Chasmistes, Deltistes, or Xyrauchen. Here, yet again, results from the morphology tree departed from our other results, failing to resolve relationships among these or virtually any other catostomine lineages with strong support (Figs. 1C and S3). As a result, we do not discuss the morphology consensus topology further in this section.
Alternative topologies inferred for this clade involved rearrangements placing P. santaanae sister to P. clarkii, but with non-significant posterior support (e.g. Figs. S1 and S4).
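As a rough illustration of the Bayes factor comparison of subfamily patterns 1 and 2 described earlier in this section, the following Python sketch computes 2 ln BF from two log-marginal likelihoods, such as stepping-stone estimates. The function names, the placeholder input values, and the use of the Kass & Raftery (1995) verbal scale are assumptions made for illustration only; none of them is taken from this study.

```python
def two_ln_bf(ln_ml_a: float, ln_ml_b: float) -> float:
    """Return 2*ln(BF) for model A over model B from log-marginal likelihoods."""
    return 2.0 * (ln_ml_a - ln_ml_b)


def evidence_label(value: float) -> str:
    """Verbal strength of evidence for the favoured model.

    Thresholds follow the Kass & Raftery (1995) scale for 2*ln(BF);
    whether the study used this exact scale is an assumption.
    """
    magnitude = abs(value)
    if magnitude < 2:
        return "not worth more than a bare mention"
    if magnitude < 6:
        return "positive"
    if magnitude < 10:
        return "strong"
    return "very strong"


def compare(ln_ml: dict[str, float], a: str, b: str) -> tuple[str, float, str]:
    """Compare models a and b; return (favoured model, 2*ln(BF), label)."""
    stat = two_ln_bf(ln_ml[a], ln_ml[b])
    favoured = a if stat >= 0 else b
    return favoured, stat, evidence_label(stat)


# Placeholder log-marginal likelihoods (hypothetical, NOT values from the study).
example = {"pattern_1": -1000.0, "pattern_2": -1004.5}
print(compare(example, "pattern_1", "pattern_2"))
```

The same calculation extends to ranking several candidate models, for instance alternative clock-partitioning schemes, by comparing each against the model with the highest log-marginal likelihood.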
Bayesian total-evidence dating and relaxed-clock partitioning

Bayes factor comparisons of 12 clock-partitioning models showed that removing mtDNA 1st and 3rd codon sites deemed to be problematic during PI profiling (see below) progressively improved model log-marginal likelihoods and posterior evidence ( [...] (Fig. 4), and the [...]

List of sequences used in the present study including museum/field numbers and GenBank accession numbers.

Characteristics of each of the seven morphological and molecular datasets analyzed in this study.
Within tip labels, dagger symbols indicate extinct taxa, and museum (voucher) or field numbers are followed by the number of the individual sequenced (