Integrated phylogenomics and fossil data illuminate the evolution of beetles

Beetles constitute the most biodiverse animal order with over 380 000 described species and possibly several million more yet unnamed. Recent phylogenomic studies have arrived at considerably incongruent topologies and widely varying estimates of divergence dates for major beetle clades. Here, we use a dataset of 68 single-copy nuclear protein-coding (NPC) genes sampling 129 out of the 193 recognized extant families as well as the first comprehensive set of fully justified fossil calibrations to recover a refined timescale of beetle evolution. Using phylogenetic methods that counter the effects of compositional and rate heterogeneity, we recover a topology congruent with morphological studies, which we use, combined with other recent phylogenomic studies, to propose several formal changes in the classification of Coleoptera: Scirtiformia and Scirtoidea sensu nov., Clambiformia ser. nov. and Clamboidea sensu nov., Rhinorhipiformia ser. nov., Byrrhoidea sensu nov., Dryopoidea stat. res., Nosodendriformia ser. nov. and Staphyliniformia sensu nov., and Erotyloidea stat. nov., Nitiduloidea stat. nov. and Cucujoidea sensu nov., alongside changes below the superfamily level. Our divergence time analyses recovered a late Carboniferous origin of Coleoptera, a late Palaeozoic origin of all modern beetle suborders and a Triassic–Jurassic origin of most extant families, while fundamental divergences within beetle phylogeny did not coincide with the hypothesis of a Cretaceous Terrestrial Revolution.


Introduction
Beetles (Coleoptera) are a textbook example of a hyperdiverse clade, known from more than 380 000 living species and upwards of 1.5 million awaiting description [1,2], that display extraordinary morphological, taxonomic and ecological diversity [3]. Constituting nearly a quarter of extant animal diversity on our shared planet, beetles play indispensable roles in nearly all terrestrial and freshwater ecosystems. The ecological dominance of beetles is reflected by their fossil record. The earliest unequivocal stem beetles are early Permian [4,5], while crown beetles belonging to extant suborders (Adephaga, Archostemata, Myxophaga and Polyphaga) first occur in the late Permian [6,7], and most extant families are first encountered in the fossil record in the Jurassic to Cretaceous [8][9][10]. A multitude of hypotheses have been proposed to explain beetle megadiversity, focusing principally on the importance of key anatomical innovations, co-diversification with other clades such as angiosperms during the Cretaceous Terrestrial Revolution [11] and mass extinction events [9,[12][13][14][15]. Tests of these hypotheses of the causes and consequences of beetle diversification require a robust time-calibrated phylogeny. However, considering the long evolutionary history, exceptional species richness and unparalleled morphological disparity as well as apparent morphological convergence of beetles, resolving the phylogeny and timescale of Coleoptera evolution has proven challenging.
The lack of a consensus on higher level relationships within Coleoptera has compromised attempts to derive an evolutionary timescale. To date, the majority of molecular phylogenetic studies of beetles with the most comprehensive taxon sampling [9,[16][17][18] have sampled eight or fewer genes, with a matrix length of less than 10 000 nucleotides [19]. While this limited gene sampling has helped shed light on the relationships of some subfamilies and families, it is insufficient to accurately resolve the deep evolutionary relationships within Coleoptera. As exemplified in the most comprehensive studies based on morphology [20], eight-gene markers [16], mitochondrial genomes [21,22], phylogenomic datasets [10,23] and transcriptomes [23], the interrelationships of the suborders and most series and superfamilies of the most diverse beetle suborder, Polyphaga, still lack consistency and sufficient statistical support. As such, relationships among many families remain effectively unresolved [19]. Compositional and rate heterogeneity are among the most common sources of phylogenetic incongruence [24][25][26]. While models such as WAG and LG account for replacement rate heterogeneity and can account for across-site rate heterogeneity when combined with a Gamma distribution (e.g. the WAG + G and LG + G models), these models cannot account for across-site compositional heterogeneity. Reducing site compositional heterogeneity in datasets combined with the utilization of evolutionary models accounting also for compositional heterogeneity (such as the CAT-based models [27]) has been shown to improve the fit of the model to the data and reduce attraction artefacts that are frequent source of error in phylogenetics [27,28]. Consequently, site and rate heterogeneous models have been widely used to resolve difficult phylogenomic problems, such as deep and rapid radiations that are otherwise hard to resolve [29][30][31][32]. In different groups of beetles, compositionally site-heterogeneous models have recovered topologies that are highly congruent, sometimes identical, to traditional morphology-based classification schemes [21,25,33], thus contributing to resolving the often perceived 'conflict' between morphological and molecular phylogenies [34]. However, the compositionally site-heterogeneous CAT model has not been used widely for the analysis of proteincoding sequences of beetles.
Reconstructing the timescale of beetle evolution also has to explicitly account for sources of error in molecular clock estimates and the biases of the fossil record. Despite the rich fossil record of beetles, molecular clock analyses have generally incorporated few fossil calibrations, ranging from 7 to 34 [9,10,16,23,35], a practice that may lead to inaccurate divergence time estimates [36,37] as the congruence between molecular clock estimates and the fossil record tends to increase logarithmically with an increasing number of calibrations used [38]. The selection and phylogenetic placement of fossil calibrations is another significant factor influencing divergence dates [39,40], and past studies have either not followed best practice in justifying the phylogenetic position and stratigraphic age of the fossils [41], or used fossils to calibrate more derived nodes than they truly represent [9,10,16], skewing divergence time estimates. Finally, the commonplace use of arbitrary statistical distributions to establish a prior on clade age unduly biases clade age estimates [42]. Fossils can only directly inform minimum constraints on clade ages [43], which is particularly important for taxa such as beetles that are restricted to a small number of fossil deposits with exceptional preservation [44], making the criteria used to define maximum age constraints on when nodes are imposed an important consideration for molecular clock studies. These variables have contributed to uncertainties in dating key events in beetle evolution, such as the timing of origin of the beetle clade or the timing of Polyphaga radiation relative to the diversification of angiosperms-a key tenet of the Cretaceous Terrestrial Revolution hypothesis [11].
Here, we infer a timescale for beetle evolution that integrates the fossil record and refined sampling of 68 nuclear protein-coding (NPC) genes (16 206 amino acid sites) generated from Zhang et al. [10,45] with the addition of Rhinorhipus [46]. We established 57 new calibrations established in accordance with best practice recommendations [41], fully justified with respect to their stratigraphic age and systematic position, more than have been used in any previous analysis of the timing of beetle evolution. We provide a well-resolved phylogeny of beetles with a comprehensive taxon sampling based on the siteheterogeneous CAT-GTR + G4 model. This phylogeny is more consistent with morphological data [20,25] and whole-genome analyses [23] and we use it to propose formal changes to the classificatory scheme of beetles.

Dataset collation
We used the published NPC gene sequences from Zhang et al. [10] supplemented with the Rhinorhipus sequences from Kusy et al. [46], a morphologically peculiar genus suspected to represent an isolated early diverging polyphagan lineage. Zhang et al. [10] presented and analysed both nucleotide and amino acid alignments of their data; their decisive alignment was based upon a concatenated amino acid dataset of 95 NPC genes. We excluded 27 genes that contain up to 21 copies in some beetle genomes and could not be homologized confidently, following [46], keeping only single-copy orthologues. All NPC genes were individually aligned using the 'Translation Align' option with the FFT-NS-i-× 2 algorithm of MAFFT 7.2 [47]. Ambiguously aligned regions were trimmed with BMGE 1.1 (-m BLOSUM30) [48]. The sequences were then concatenated into a supermatrix (376 taxa, 16 206 sites) using FASconCAT [49]. The concatenated supermatrix (

Compositional heterogeneity and model-specific artefacts
General statistics for the dataset, compositional homogeneity tests (x 2 test of heterogeneity) and compositional heterogeneity (relative composition frequency variability (RCFV)) were computed using BaCoCa v. 1.105 [50]. To assess the prevalence of model-specific incongruences in beetle phylogeny, we generated a reduced dataset with 136 taxa, sampling only a single representative for each family, and applied the site-homogeneous models LG, JTT and WAG alongside the site-heterogeneous models C10 and C30. Analyses were conducted in IQ-TREE 1.6.3 [51] with 1000 bootstraps. The fit of these models to the dataset was assessed using ModelFinder [52] implemented in IQ-TREE 1.6.3.

Model selection and phylogenetic analysis
We tested the impact of accounting for compositional heterogeneity on phylogenetic inferences of Coleoptera. We tested the replacement rate heterogeneously and compositionally and rate siteheterogeneous infinite mixture model CAT-GTR + G4, which has been shown both theoretically and empirically to be frequently effective at suppressing attraction artefacts [24,28], and the sitehomogeneous model GTR. In addition, we tested the use of LG + C20, a pre-computed replacement rate heterogeneously and compositionally and rate site-heterogeneous model, which has been used in recent phylogenomic studies of Coleoptera [10,16,23]. The C20 component in the LG-C20 model accounts for site-specific compositional heterogeneity (it is a CAT model), but differs from CAT (as used in CAT-GTR + G4) as it offers only 20 substitutional categories, while the CAT model estimates the optimal number of categories from the data. Similarly, the LG component is a pre-computed GTR matrix where replacement rates are not estimated from the data. To test whether the CAT-GTR + G4 model fits the decisive amino acid alignment better than the GTR model, a variant of which was used by Zhang et al.
[10], a Bayesian cross-validation analysis was run with 10 replicates in PhyloBayes. CAT-GTR + G4 was run in PhyloBayes MPI 1.7 [53]; two independent Markov chain Monte Carlo (MCMC) chains were run until convergence maxdiff less than 0.3 for up to 14 months of real time. The bpcomp program was used to generate output of the largest (maxdiff ) and mean (meandiff ) discrepancies observed across all bipartitions. The models GTR and LG + C20 were implemented in IQ-TREE 1.6.3 [51] with 1000 bootstraps.

Fossil calibrations
We selected 57 calibrations spread throughout the tree of beetles based on a combination of fossil, phylogenetic, stratigraphic, geochronological and biogeographic evidence (electronic supplementary material, table S2). Fossil calibration choice was conducted conservatively following the criteria set out by Parham et al. [41]. Care was taken to select fossils that are the earliest members of their respective clades and are distributed equitably throughout the tree. The fossils were selected to calibrate nodes based on the presence of unambiguous synapomorphies. Soft maximum node age constraints were based on the absence of fossils in well-studied Lagerstätten to circumvent the use of arbitrary statistical distributions with uniformed maxima. Soft maxima are often established arbitrarily but objectively on key fossil Lagerstätten that demonstrate evidence of the absence of clade members based on the presence of taphonomic, ecologic and/or biogeographic counterparts. Note, however, that this conclusion is not based only on the named deposit. Rather, the soft maximum is based on the cumulative evidence of the absence of clade members (and presence of taphonomic, ecologic and/or biogeographic counterparts) in deposits intermediate in age between the minimum and soft maximum constraints. The deposit on which the soft maximum is defined is simply an arbitrary but objective reference datum reflecting our view that there is very little (but nonzero-hence it is a soft maximum) prior probability that the clade originated before this time.

Markov chain Monte Carlo tree analysis
Divergence time estimation was performed using the approximate likelihood calculation in MCMCtree implemented in PAML 4.7 [54], incorporating softbound fossil calibrations on nodes on the tree [55]. We used MCMCtree in place of competing Bayesian relaxed molecular clock software because of its computational efficiency, facilitating exploration of the impact of methodological variables. We obtained 200 000 trees with a sampling frequency of 50 and discarded 10 000 as burn-in. Default parameters were set as follows: 'cleandata = 0', 'BDparas = 1 1 0', 'kappa_gamma = 6 2', 'alpha_gamma = 1 1', 'rgene_gamma = 2 20', 'sigma2_gamma = 1 10' and 'finetune = 1: 0.1 0.1 0.1 0.01 0.5'. Convergence was tested in Tracer [56] by comparing estimates from the two independent chains. Analyses were run using both autocorrelated rate (AC) and independent rate (IR) clock models using uniform prior distributions to reflect ignorance of the true clade between the minimum and soft maximum constraints. In all analyses, the soft minimum and maximum bounds were augmented by a 2.5% tail probability. To ensure our priors were appropriate, we ran the MCMCtree analysis without sequence data to establish whether the effective priors are compatible with the specified priors.

Systematic bias in beetle phylogenomics
Prior to phylogenetic reconstruction, we subjected our 68-gene dataset to tests to determine possible sources of systematic error. Of the 376 analysed taxa, 115 failed the compositional homogeneity test ( p < 0.05) (electronic supplementary material, table S1 available on Mendeley Data). Moreover, the dataset displayed a relatively high RCFV value (0.0229) [57]. High RCFV values are indicative of pronounced amino acid compositional heterogeneity, which is characteristic of fast-evolving taxa [58] (long branches) and may lead to misleading phylogenetic inference [24][25][26]. This was confirmed in preliminary analyses with reduced taxon sampling, which showed that the recovery of relationships at deep nodes was heavily influenced by the model used in phylogenetic reconstruction. The site-homogeneous models LG, WAG, GTR and JTT recovered topologies with high statistical support, but differed profoundly from the better fitting site-heterogeneous model C10 (electronic supplementary material, table S1), with respect to the placement of numerous polyphagan groups, namely within Cucujiformia (electronic supplementary material, figures S1-S5). The recovered model-dependent phylogenetic signal is strongly suggestive of systemic incongruence of phylogenetic signal, attributable in part to long-branch attraction (LBA) artefacts and other sources of systematic error such as the heterogeneity of the substitution process. Hence, we expect systematic bias to be an important source of incongruence in beetle phylogenomics, corroborating previous phylogenomic studies of beetle mitochondrial genomes [25,33,59].

Phylogenomic reconstruction
To overcome bias associated with compositional heterogeneity, we tested site-homogeneous and siteheterogeneous models for their fit to the whole analysed dataset with 376 taxa by running a crossvalidation analysis in PhyloBayes. The site-heterogeneous model CAT-GTR + G4 fitted the dataset better than the site-homogeneous GTR model, variations of which have been used in some previous analyses of beetle phylogeny (CAT-GTR + G4 > GTR; CV score = 8094.35 ± 341.776). This is in line with previous studies that show that site-heterogeneous models consistently outperform site-homogeneous ones, albeit at the expense of a greater computational burden [30,32,60]. As such, we present the topology reconstructed by the site-heterogeneous model CAT-GTR + G4 implemented in PhyloBayes as our main tree. Topologies recovered with the site-homogeneous model (GTR, electronic supplementary material, figure S7) and a variant of the site-heterogeneous CAT model (LG + C20, electronic supplementary material, figure S8) varied with respect to the support of key deep clades and recovered relationships. For example, the GTR and LG + C20 models recovered the suborders Archostemata and Myxophaga as sister to Adephaga, an alternative branching order of superfamilies within Cucujiformia inconsistent with recent analyses [10,16,23], and differed with respect to the relationships among some polyphagan families such as Boganiidae. Nonetheless, the monophyly of major series and superfamilies has been supported by the site-homogeneous models as well as CAT-GTR + G4, which is further discussed below.
Our PhyloBayes analyses with the CAT-GTR + G4 model (figures 1 and 2) recovered deep relationships among suborders with Adephaga as the earliest diverging lineage, Adephaga (Polyphaga (Myxophaga, Archostemata)), a novel hypothesis [10,16], albeit with low support (figure 3). As noted in Zhang et al. [10], the two smallest coleopteran suborders, Archostemata and Myxophaga, were insufficiently sampled (each represented by a single taxon) and this may, in part, account for such a result. Despite extensive morphology-based phylogenetic studies [61][62][63], the intersubordinal relationships should be considered tentative, awaiting future phylogenomic studies [19]. Consistent with Zhang et al. [10], McKenna et al. [23] and an analysis of ultraconserved elements [64] within Adephaga, we recovered a paraphyletic 'Hydradephaga' with the aquatic Haliplidae and Gyrinidae forming a clade sister to other aquatic and terrestrial adephagans. A paraphyletic 'Hydradephaga' with Gyrinidae as the earliest diverging adephagan branch was also recovered by a number of morphological analyses of adults and larvae [65].
Among Polyphaga, five families formed three branching clades sister to the remaining polyphagan families (figure 2), in congruence with all recent studies [9,10,16]. As in previous analyses [9,10,16,23], Derodontidae was recovered as sister to Clambidae + Eucinetidae. In contrast with Zhang et al. [10], our analysis also included the enigmatic Australian endemic family Rhinorhipidae (superfamily Rhinorhipoidea), sequenced by Kusy et al. [46], which was strongly supported as sister to the remaining polyphagans. Thus, we can reject the recent hypothesis that it may be sister to royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 211771 Nosodendridae or to Elateriformia based on mitogenomes and Sanger sequence data analysed under site-homogeneous models [46]. This finding is congruent with the genome-scale analysis of McKenna et al. [23] and selected analyses of Kusy et al. [46].
Next to Rhinorhipidae, the monophyletic series Elateriformia was recovered as the fourth polyphagan branch with strong support, a result consistent with genomic analyses [21,22,46], rejecting a more derived position based on an eight-gene phylogeny [16,19]. Within Elateriformia, Dascilloidea were corroborated as the sister group to the remaining elateriform superfamilies, as in Zhang et al. [  royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 211771 A polyphyletic Byrrhoidea was also recovered based on mitogenomes [21], whole genomes [23] and a small number of genes [9], in which the name Dryopoidea was used for defining Byrrhoidea minus Byrrhidae, and Byrrhoidea s. str. for the moss-feeding Byrrhidae. As supported by numerous plesiomorphic characters (especially larval traits), Byrrhoidea s. str. and Buprestoidea have been long  regarded to be isolated from other elateriforms [67][68][69], in concert with our new topology. Additionally, Dryopoidea have been well defined by a unique rearrangement of tRNA gene order [70]. The interfamilial relationships among Dryopoidea are exactly the same as those based on a sitehomogeneous model [10], providing the best-supported relationships and clarifying uncertainties in the phylogeny of Dryopoidea based on a few genes [16,71]. Within Elateroidea, the clade of 'higher elateroids' sensu Kundrata et al. [72] was strongly supported, which is consistent with previous studies [9,10,16]. The clade Lycidae + Elateridae (Lissomini) was strongly supported to occupy a more basal position compared with Cantharidae, which is inconsistent with Zhang et al. [10], but consistent with McKenna et al. [16]. Elateridae were recovered non-monophyletic as in other recent studies [10,16,23,73]. Next to Elateriformia, Nosodendridae was strongly supported as sister to the remaining polyphagans (Staphyliniformia, Bostrichiformia and Cucujiformia), in congruence with McKenna et al. [23] and rejecting its placement within Elateriformia [9,16,23,72].

Fossil calibrations
We derived 57 calibrations spread throughout the tree of beetles based on a combination of fossil, phylogenetic, stratigraphic, geochronological and biogeographic evidence. Fossil calibration choice was conducted conservatively following best practice [41]. Care was taken to calibrate nodes distributed equitably throughout the tree, informed by fossils that are the earliest phylogenetically secure members of their respective clades based on the presence of unambiguous synapomorphies. For example, we selected Ponomarenkium belmonthense from the late Permian Newcastle Coal Measures at Belmont, Australia, to calibrate the node representing crown Coleoptera [82]. The type series is deposited in a public institutional collection, the Australian Museum in Sydney (holotype, 40278; paratype, 41618) [82]. The fossils possess open procoxal cavities, a narrow prosternal process and lack a broad prothoracic postcoxal bridge; these characteristics place them into crown Coleoptera. However, the combination of characters present in Ponomarenkium excludes it from the crown group of all four extant beetle suborders: internalized metatrochantin, metanepisternum only marginally part of the closure of the mesocoxal cavity, elytra without window punctures, antennae moniliform (Archostemata); transverse ridge of the mesoventrite, short metacoxae not reaching the hind margin of abdominal sternite III, absence of coxal plates (Adephaga); absence of a broad contact between the meso-and meta-ventrites (Myxophaga); and exposed propleuron (Polyphaga) [82]. The age of the Tatarian insect beds in the Newcastle Coal Measures at Belmont, from which the fossils originate, is derived from stratigraphic correlation and high-precision CA-TIMS U-Pb zircon dating and hence a conservative minimum age for the fossil can be taken from the top of the Changhsingian, dated at 251.902 Ma ± 0.024 Myr [82,83], thus, 251.878 Ma. The maximum constraint on the root of Coleoptera was 307.1 Ma, based on the age of the Mazon Creek Lagerstätte [84]. Despite its highly diverse fossil insect assemblage that has been intensively studied for decades [84], no unequivocal crown beetles are known from Mazon Creek or other younger Carboniferous insect Lagerstätte such as Commentry in France, Wettin Formation in Germany, Obora in Czechia, Chekarda, the Kuznetsk Basin, Soyana, Tikhie Gory and Isady in Russia. As such, the choice of soft maximum constraint was based on the absence of a clade from several wellexplored deposits older than that on which the minimum is defined, which were typically studied for decades and/or comprehensively monographed. Crucially, these deposits are biogeographically diverse and preserve insects, demonstrating that were representatives of crown Coleoptera present, they would have been preserved. While Adiphlebia from Mazon Creek has been interpreted as the earliest beetle [85], this has since been shown to be erroneous, based on clay artefacts attached to the wing membrane [86]. Full details of all 57 calibrations used are provided in the electronic supplementary material. This comprehensive set of fully justified calibrations serves as a basis for future divergence time analyses in Coleoptera, as well as our own. The suitability of our specified priors was evaluated by running an analysis without molecular data, yielding the effective priors for comparison [87].

Divergence time estimates
To derive an integrative timescale of beetles, we combined the results of alternative molecular clock models: IR and AC. Both models yielded comparable results, with the IR clocks proposing earlier dates on average (table 1; electronic supplementary material, figures S9 and S10). In the following discussion, results from both AC and IR analyses are combined to provide a single date range, since we cannot discriminate between the efficacies of these clock models.
We recovered a late Carboniferous origin of crown Coleoptera (322-306 Ma). This inferred divergence time is somewhat older than the age of the earliest unequivocal stem-group coleopteran, Coleopsis archaica from the Sakmarian Meisenheim Formation in Germany (approx. 290 Ma) [5], which has not been used as a calibration point. Despite a controversial Carboniferous record of putative stem Coleoptera and Coleopterida (i.e. Coleoptera + Strepsiptera) [85,88] that have been recommended for calibration in some studies [89], the affinities of these fossils have been questioned as they do not preserve unambiguous coleopteran apomorphies [4,86]. Polyphaga was estimated to have originated in a relatively narrow interval between the latest Carboniferous and Early Permian (307-286 Ma). These dates are significantly older than those estimated by previous studies [9,10,16], which inferred a Permian to Triassic origin of the clade, but fall within the range estimated by Toussaint et al. [35] and McKenna et al. [23]. The two earliest diverging coleopteran clades including Scirtidae and Clambidae diverged in the Late Triassic-Late Cretaceous (227-77 Ma) and Permian-latest Triassic (286-202 Ma), respectively. Despite their position on the polyphagan tree, these clades are scarcely represented in the pre-Palaeogene fossil record [90], which may contribute to their broad credibility intervals (CIs). The lineage comprising the isolated family Rhinorhipidae was inferred to be Permian in age (287-264 Ma).
The species-rich series Elateriformia diverged in the Late Permian to Middle Triassic (269-246 Ma). The last common ancestor of the Histeroidea, Hydrophiloidea and Staphylinoidea clade diverged in the Early to Late Triassic (258-238 Ma). Staphylinidae, the most diverse beetle family [91], originated between the Late Triassic and Early Jurassic

Taxonomic implications
Molecular phylogenetic studies conducted over the last decade have lent support to several deep relationships among coleopteran groups suspected by earlier workers on the basis of morphological data alone. Considering that some of these relationships are strongly supported in different molecular Table 1. Divergence dates (95% CI of the posterior distribution of age estimates, in Ma) of the crown groups of beetle suborders and series from IR and AC molecular clock analyses with uniform prior distributions. studies [10,19,21,23,77] and are thereby robust to taxon and gene sampling as well as to LBA artefacts, we herein propose several changes to the taxonomy of Coleoptera. All taxonomic changes, including morphological diagnoses, lists of included taxa and detailed commentaries, are provided in the electronic supplementary material and are briefly summarized below. In total, 234 families are recognized of which 194 are extant. A hypothesis of the relationships among the higher taxonomic units within Coleoptera is outlined in figure 3. The traditional series Derodontiformia, classically defined as containing the three problematic families Derodontidae, Nosodendridae and Jacobsoniidae, does not form a natural group in our analyses and other previous phylogenomic studies [10,23]. The monophyly of the group has been doubted based on morphological characters as well [20,68,92]. We consider Jacobsoniidae a member Hunt et al. [9] McKenna et al. [16] Toussaint et al. [35]   royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 211771 of Staphylinoidea, and Nosodendridae as belonging to a series of its own. Consequently, Scirtiformia and Scirtoidea sensu nov., and Clambiformia Cai and Tihelka ser. nov. and Clamboidea sensu nov. (with Clamboidea Fischer, 1821 taking priority over Derodontoidea LeConte, 1861) are redefined, the former comprising the extant families Decliniidae and Scirtidae, and the latter Clambidae, Derodontidae and Eucinetidae.
The enigmatic Rhinorhipidae, represented in the recent fauna by only a single species endemic to northeastern Australia, are resolved with strong support as an isolated branch falling outside of Elateriformia (where it was placed by Lawrence and Newton [93]). We propose to include the family in a series of its own, Rhinorhipiformia Cai, Engel and Tihelka ser. nov.
Byrrhoidea in the current sense was recovered as paraphyletic, with Byrrhidae forming the sister group to Buprestidae and the remainder of the superfamily forming a sister group to Elateroidea. This split roughly corresponds to Crowson's concept [67] of Byrrhoidea as containing only a single family, Byrrhidae, and is not unexpected as the monophyly of Byrrhoidea has been doubted [23,71]. Thus, we regard Byrrhidae as constituting Byrrhoidea sensu nov. and treat the remaining former byrrhoid families within Dryopoidea stat. res. Relationships among clades within Elateroidea are not resolved with sufficient support to justify taxonomic revision, which is consistent with the results based on the anchored hybrid enrichment method using 2260 single-copy orthologous genes by Douglas et al. [73].
In congruence with other phylogenomic analyses [10,23], Nosodendridae were recovered as an isolated lineage sister to Staphyliniformia, Bostrichiformia and Cucujiformia, outside of Clambiformia ser. nov., where the family was provisionally placed by previous workers [68]. To maintain the monophyly of the remaining well-defined series, and considering the long-established morphological distinctiveness of the family, a new series is instituted for Nosodendridae, Nosodendriformia Cai and Tihelka ser. nov.
We recovered Scarabaeiformia (Scarabaeoidea) nested within Staphyliniformia, as a sister group to Staphylinoidea, in line with other genomic analyses [10,23]. As this would render the well-defined Staphyliniformia polyphyletic, we integrate Scarabaeiformia within Staphyliniformia sensu nov. This newly defined Staphyliniformia essentially corresponds to the Haplogastra concept proposed more than a century ago by Kolbe [94] which was subsequently supported by a range of morphological studies of adult and larval characters (see electronic supplementary material for full discussion).
Our results support the downgrading of the carrion beetles from a family to a subfamily of Staphylinidae sensu. nov., as Silphinae stat. nov., and reconsideration of other recent phylogenetic studies supports the restoration of Colonidae stat. nov. as a separate family, not subfamily of Leiodidae sensu nov. Moreover, click beetles (Elateridae), minute marsh-loving beetles (Limnichidae), earthboring dung beetles (Geotrupidae), scarabs (Scarabaeidae), bark-gnawing beetles (Trogossitidae), antlike beetles (Anthicidae), ironclad beetles (Zopheridae) and false darkling beetles (Melandryidae) were not recovered as natural groups, in congruence with some previous analyses [66,97]. These latter issues remain to be formally treated and will require a more extensive taxon sampling within and outside these groups to help redefine their boundaries.

Carboniferous origin of Coleoptera
Despite a history of problematic purportedly coleopteran fossils from the Carboniferous [85,88], the earliest unequivocal stem-group beetles appear in the fossil record in the early Permian [5], providing few clues into the initial history of the order. Molecular clock studies have broadly converged on two principal models of beetle origins: the explosive model, suggesting that total-group Coleoptera originated and diversified in the Permian [9,10], while the long-fuse model argues for a much earlier Carboniferous origin [35], implying a long cryptic evolutionary history of the group undocumented in royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 211771 the fossil record. Our molecular clock studies lend support to the latter model, recovering a fully Carboniferous origin of Coleoptera and suggesting a 55-134 Myr gap in the known fossil record between the origin of Coleoptera and the earliest fossil record of the order. The paucity of early beetle fossils may be because of the rarity of early beetles, their narrow ecological niche, lower fossilization potential, insufficient sampling of fossiliferous formations of this age, as well as secular biases in the rock record. The former two appear especially likely, since stem-beetles were rare in the Permian and coleopterans only came to dominate fossil insect assemblages in the Mesozoic [98]. Moreover, stembeetles were apparently not considerably more morphologically diverse than modern beetles [99,100]. The early Carboniferous is part of the 'Hexapod gap', a stratigraphic interval known for its lack of insect fossils [101], which may further contribute to the lack a pre-Permian fossil record of beetles.
Adaptative radiations of successful clades in the fossil record are often associated with new morphological innovations that enable the colonization of new niches [102]. The heavily sclerotized bodies of beetles with forewings modified into hardened protective elytra are often cited to explain the diversification of beetles [8,9,14]. However, the long-fuse model, with a long cryptic history of Carboniferous-Permian beetles that apparently were neither abundant [98] nor morphologically disparate [100], argues for a more complex scenario of beetle diversification.

Early diversification of beetles
The recovery of Adephaga as the earliest diverging clade of Coleoptera, and the aquatic families Gyrinidae and Haliplidae as the earliest diverging families within Adephaga, implies that the ancestral coleopteran may have been aquatic. This seems reasonable based on the fossil record because the Permian stem-coleopteran families †Tshekardocoleidae and †Phoroschizidae may have been fully or partly aquatic [103] even though the life history of †Alphacoleoptera Engel, Cai and Tihelka subord. nov. (refer to electronic supplementary material) remains mysterious [104]. Different phylogenetic reconstructions among the four suborders and the internal relationships of Polyphaga may refute the hypothesis for an aquatic ancestor [16]. Consistent with Zhang et al. [10] and McKenna et al. [23], Scirtiformia are resolved as the basal-most polyphagans and are a group with mainly aquatic and semi-aquatic larvae [105], though some forms are fully terrestrial [106]. Their placement supports the reconstruction of an aquatic ancestor for Archostemata + Myxophaga and the Polyphaga. An understanding of the phylogenetic relationships within scirtiforms [107] is critical for testing the aquatic-ancestor hypothesis.
Under the aquatic-ancestor hypothesis, beetles subsequently invaded terrestrial ecosystems independently four times, once each in the ancestor of Geadephaga, Archostemata, some Myxophaga and Polyphaga, although many beetles have subsequently returned to fresh water again. The radiation of basal Coleoptera was rapid and occurred over a period of 17-78 Ma; the split between Archostemata and Myxophaga was estimated as Permian (269-246 Ma), while Polyphaga diverged at the Carboniferous-Permian boundary (314-236 Ma). The basal diversification of crown Coleoptera and their invasion of land occurred in the Permian, a time of drastic environmental change that saw the replacement of Carboniferous hygrophytic and permanently wetted terrestrial ecosystems inhabited by (semi)aquatic alphacoleopterans with seasonally dry ecosystems [108]. Palaeozoic plant families were replaced by more modern Mesozoic groups and ecological communities became more complex, with a higher number of trophic levels [108,109]. It is therefore possible that the Permian ecosystem change played a pivotal role in shaping the early diversification of beetles [5,103]. Early diverging beetle lineages survived the End-Permian mass extinction event to diversify in the Mesozoic, most beetle superfamilies having diverged by the Jurassic.

Cretaceous co-diversification with angiosperms
Angiosperms replaced the previously dominant gymnosperms during the Cretaceous, in a period known as the Cretaceous Terrestrial Revolution (KTR) [110,111]. Co-diversification with flowering plants has been proposed as a mechanism explaining the species richness of herbivorous clades such as the weevils [9,15] and beetles have been regarded as the earliest pollinators of angiosperms [15,112,113]. Our molecular clock analyses suggest that major beetle clades were present before the KTR. This is corroborated by the beetle fossil record [12]; elateroids, staphylinoids and weevils in particular have a diverse fossil record since the Jurassic [8]. Nonetheless, some scarabaeoid and cucujiform clades underwent diversification during the Late Jurassic to Early Cretaceous, partly overlapping with the diversification of major angiosperms clades in the Early to mid-Cretaceous [114,115].
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 211771 Besides directly affecting herbivorous lineages, the diversification of the angiosperms in the Cretaceous precipitated a diversification of vertebrate herbivores and predators [11]. Our molecular clock estimates corroborate the controversial idea, famously portrayed in numerous palaeontological reconstructions, that coprophagous beetles, namely geotrupids (dung beetles) and scarabaeoids (scarabs), may have been associated with Cretaceous herbivorous dinosaurs [23,116,117]. Our analyses also corroborate a Cretaceous origin of the bioluminescent lampyroid clade [118], temporally overlapping with the diversification of visually hunting predators such as anurans and stem-group birds during the KTR [119]. At the same time, some Mesozoic beetle families have their last appearance in the fossil record during the KTR, highlighting complex dynamics of transitioning from a gymnosperm-to angiosperm-dominated world [120].
Overall, our results provide support for a more nuanced view [15] of the KTR as an event that did not increase the superfamilial diversity of beetles, since most major beetle clades had already diverged by this time. Instead, the diversification of angiosperms was followed by clade-specific radiations in some beetle groups, such as scarabs [112] or weevils [13], in response to newly formed niches.

Cretaceous-Palaeogene mass extinction
The impacts of the Cretaceous-Palaeogene (K-Pg) mass extinction on beetles remain controversial, as herbivorous insects would be expected to have experienced elevated extinction levels given their close association with their host plants [121]. Our results suggest no diversification of beetles, at the level of families, in the aftermath of the K-Pg crisis, corroborating previous palaeontological macroevolutionary studies suggesting that the mass extinction was hardly devastating for beetles [12,98,122]. It is, however, possible that the K-Pg extinction may have had different impacts on lower taxonomic ranks [121] that have to be assessed by future studies focusing on specific herbivorous clades.

Summary
Our new beetle phylogeny corroborates many relationships inferred by phylomitogenomic analyses using models accounting for compositional and rate heterogeneity [21], but the resolution is significantly improved by a careful selection of single-copy NPC genes analysed with methods accounting for compositional and rate heterogeneity. We recovered with credibility the phylogenetic positions of the enigmatic Rhinorhipidae and recently recognized Coccinelloidea, as well as the paraphyly of Cucujoidea sensu Robertson et al. [78]. We also unravelled the isolated position of the peculiar Byrrhidae, consistent with the traditional classification of elateriform beetles [67,68,123]. Our molecular clock analyses suggest a Carboniferous origin of Coleoptera and a Palaeozoic origin of all four beetle suborders. A Carboniferous origin of Coleoptera implies a 55-134 Ma long 'beetle gap' in the fossil record. Early diverging beetle lineages survived the End-Permian mass extinction event to diversify in the Mesozoic, with major clades having originated by the Jurassic. Most major beetle clades were present by the Late Jurassic, although some groups diversified during the Cretaceous, in concert with the radiation of angiosperms.
We provide a revised treatment of the higher classification of Coleoptera that reflects the findings of phylogenomic studies conducted over the past decade. Scirtiformia and Scirtoidea sensu nov. are restricted to Decliniidae and Scirtidae, while Clambiformia Cai and Tihelka ser. nov. and its single constituent superfamily Clamboidea sensu nov. is considered to contain the extant families Clambidae, Derodontidae and Eucinetidae. The other members of the former Derodontoidea, Nosodendridae and Jacobsoniidae are placed into Nosodendriformia Cai and Tihelka ser. nov. and Staphylinoidea, respectively. To maintain the monophyly of the well-defined Elateriformia, we erect the new series Rhinorhipiformia Cai, Engel and Tihelka ser. nov. for the family Rhinorhipidae. Scarabaeoidea is formally incorporated into Staphyliniformia sensu nov. The former Cucujoidea is divided into three superfamilies: Erotyloidea stat. nov., Nitiduloidea stat. nov. and Cucujoidea sensu nov. Silphinae stat. nov. are treated as a subfamily of Staphylinidae sensu. nov., and Colonidae stat. nov. is restored as an independent family and not a subfamily of Leiodidae. In addition, we recognize that the extinct suborder †Protocoleoptera (established for †Protocoleidae) comprises a group of polyneopterans allied to earwigs (Dermaptera) rather than beetles, and that the younger †Archecoleoptera were defined on the basis of a temporal fauna rather than any characters, tax, or phylogenetic details. Accordingly, the earliest fossil beetles are here included in the new extinct suborder †Alphacoleoptera subord. nov. (refer to electronic supplementary material).