A near-complete and time-calibrated phylogeny of the Old World flycatchers, robins and chats (Aves, Muscicapidae)

The Old World flycatchers, robins and chats (Aves, Muscicapidae) are a diverse songbird family with over three hundred species. Despite continuous efforts over the past two decades, there is still no comprehensive and well-resolved species-level phylogeny for Muscicapidae. Here we present a supermatrix phylogeny that includes all 50 currently recognized genera and ca. 92% of all the species, built using data from up to 15 mitochondrial and 13 nuclear loci. In addition to assembling nucleotide sequences available in public databases


Introduction
Muscicapidae, the chats, robins and Old World flycatchers, is one of the largest bird families in the world, with 49-57 genera proposed and 303-343 species recognized, depending on the taxonomy (e.g., Dickinson and Christidis, 2014; Clements et al., 2021; Gill et al., 2022). It is widely distributed in various habitats across the entire Old World and adjacent Australasia and Nearctic (Winkler et al., 2020). Birds of this family exhibit great diversity in morphology, behavior, vocalizations, and life history, which makes Muscicapidae an excellent study group for addressing questions on evolution, diversity and biogeography. However, this same diversity also makes resolving the phylogeny of this species-rich family challenging.
Two of the latest efforts to build a large-scale species-level phylogeny for Muscicapidae, Sangster et al. (2010) and Zuccon and Ericson (2010a), laid the foundation for the classification of Muscicapidae and resolved many taxonomic uncertainties, although their taxon sampling was limited and many deep nodes were poorly supported. Barve and Mason (2015) assembled another large-scale phylogeny for Muscicapidae using a supermatrix with both increased taxon sampling (252 taxa) and molecular data (11 molecular markers). That study supports many lineages delineated in previous studies and provides some new insights; however, it lacks comprehensive taxonomic reconciliations, assessments of the phylogenetic relationships, or suggestions for taxonomic change.
The goal of this study is to reconstruct a comprehensive species-level phylogeny for Muscicapidae with more extensive taxon and locus sampling than previous studies, which incorporates and synthesizes molecular data from numerous resources (i.e., nucleotide databases, genome assemblies, raw sequencing reads, and our own unpublished Sanger sequencing data). The resulting phylogenetic tree includes 301 species (ca. 92% species coverage based on Clements/eBird Checklist of Birds of the World v.2021) from all currently recognized genera and provides a robust basis for the taxonomic classification of Muscicapidae as well as a resource for comparative studies.

Taxon name reconciliation
We downloaded all the sequences from the NCBI nucleotide database (as of Nov 10, 2020) for Muscicapidae and Turdidae, since many muscicapids are placed under Turdidae in GenBank. As GenBank taxon names do not follow a specific standardized avian taxonomy, we reconciled the GenBank names to match the Clements/eBird Checklist of Birds of the World v.2021, which includes 49 genera and 327 species within Muscicapidae (Clements et al., 2021), based on information from Avibase, online searches, and results presented in Hosner et al. (2022). We standardized the names based on the Clements/eBird checklist because it is updated regularly and is intrinsically linked to trait databases such as Birds of the World (https://birdsoftheworld.org/) and Macaulay Library (https://macaulaylibrary.org/), which facilitates downstream phylogenetic comparative analyses. We also implemented the taxonomic updates suggested by IOC World Bird List v12.1 (Gill et al., 2022) that are shown in Table 1. Collectively, this provided us with a target set of 326 species from 50 genera.

Molecular data collection
After taxon name reconciliation, we performed an all-by-all BLASTN search on the downloaded nucleotide sequences (Camacho et al., 2009).
For the BLASTN search, each nuclear accession was written to a single FASTA file, and for the mitochondrial sequences, each individual coding or rRNA region was written to its own FASTA file. Any pair of sequences with an e-value of at most 1.0e-5 and a pairwise alignment covering at least 50% of the length of both sequences was considered homologous. We then performed single-linkage clustering to obtain all groups of sequences (representing homologous loci) that were linked by significant BLAST hits. If a cluster contained multiple sequences from a species, we retained only the longest sequence. We also manually searched the GenBank nucleotide database for recently published sequences or sequences that may have been missed by the clustering (as of July 2022).
As noted by Hosner et al. (2022), sequences from closely related species may still be placed under the previous name in GenBank if the species were split recently. Therefore, we also searched the associated literature and museum vouchers for species with recently proposed splits. In this way, we identified additional sequences, for example, Ficedula luzoniensis from Ficedula hyperythra (Moyle et al., 2015), Ficedula riedeli from Ficedula dumetoria (Outlaw and Voelker, 2006), Calliope tschebaiewi from Calliope pectoralis (Liu et al., 2016), Cyornis pelingensis from Cyornis colonus (Garg et al., 2018), and Copsychus superciliaris from Copsychus luzoniensis (Lim et al., 2010). These sequences were added to the supermatrix to represent their corresponding species.
To further enrich the data matrix, we extracted molecular markers from genome assemblies and raw sequencing reads (Table 2). We extracted nuclear loci from the six muscicapids that have genome assemblies available on GenBank (i.e., Erithacus rubecula, Saxicola maurus, Cercotrichas coryphoeus, Oenanthe oenanthe, Copsychus sechellarum, and Ficedula albicollis) using the Extract_seq pipeline (https://github.com/aakanksha12/Extract_seq), which incorporates the nhmmer tool (Wheeler and Eddy, 2013) to improve the detection of remote homologs over standard BLASTN (Camacho et al., 2009). We used MitoFinder (Allio et al., 2020) to extract mitochondrial regions from 27 mitochondrial genomes, as well as from the UCE target enrichment sequencing reads for Muscicapa striata that are available in the NCBI SRA database. The mitochondrial genomes for Muscicapa griseisticta (NC_045181) and Cyanoptila cyanomelana (NC_015232) were misidentified according to Sangster and Luksenburg (2021); therefore, we did not include them in our data matrix. We also created a reference FASTA file using the nuclear loci from Erithacus rubecula and mapped whole genome sequencing reads in the NCBI SRA database for 20 muscicapids, including Cichladusa arquata, which had no data available in the NCBI nucleotide database (as of July 2022). We applied the same process as described above to collect sequence data for three Turdidae species as the outgroup (Catharus fuscescens, Sialia sialis, and Turdus rufiventris), all with genome assembly data available on GenBank.
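The homology filter, single-linkage clustering, and longest-sequence rule described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function names, the sequence identifiers, and the toy hit table are all hypothetical, and a union-find structure is assumed as one convenient way to implement single linkage.

```python
from collections import defaultdict

def is_homologous(evalue, aln_len, len_a, len_b, max_evalue=1e-5, min_cov=0.5):
    """Criteria from the text: e-value cutoff and >= 50% coverage of both sequences."""
    return (evalue <= max_evalue
            and aln_len >= min_cov * len_a
            and aln_len >= min_cov * len_b)

def single_linkage(seq_ids, links):
    """Union-find: sequences joined by any chain of accepted hits share a cluster."""
    parent = {s: s for s in seq_ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(list)
    for s in seq_ids:
        clusters[find(s)].append(s)
    return sorted(sorted(c) for c in clusters.values())

def longest_per_species(cluster, species, length):
    """Keep only the longest sequence for each species in a cluster."""
    best = {}
    for s in cluster:
        sp = species[s]
        if sp not in best or length[s] > length[best[sp]]:
            best[sp] = s
    return sorted(best.values())

# Hypothetical toy data: (query, subject, e-value, alignment length)
lengths = {"s1": 1000, "s2": 900, "s3": 950, "s4": 700, "s5": 400}
species = {"s1": "sp_A", "s2": "sp_A", "s3": "sp_B", "s4": "sp_C", "s5": "sp_C"}
hits = [
    ("s1", "s2", 1e-50, 850),  # accepted
    ("s2", "s3", 1e-40, 800),  # accepted, links s3 to s1 through s2
    ("s4", "s5", 1e-3, 390),   # rejected: e-value above the cutoff
    ("s3", "s4", 1e-20, 300),  # rejected: covers < 50% of s3
]
links = [(a, b) for a, b, e, n in hits
         if is_homologous(e, n, lengths[a], lengths[b])]
clusters = single_linkage(lengths, links)
retained = [longest_per_species(c, species, lengths) for c in clusters]
```

Note that single linkage places s1 and s3 in the same cluster even though they share no direct hit, which is why a manual check of the resulting gene trees remains useful.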

Data verification
We used MAFFT 7.407 (Katoh and Standley, 2013) to align each locus individually, and we manually checked alignments using MEGA X (Kumar et al., 2018). We then estimated gene trees in IQ-TREE 2.1.2 (Minh et al., 2020) with 1000 ultrafast bootstrap replicates (-ufboot 1000) by considering all standard substitution models and allowing for invariable sites and a discrete Gamma model for rate heterogeneity (-m TEST). Gene trees were visualized with FigTree 1.4.4 (Rambaut, 2018).
We examined each gene tree for exceptionally long branches and for any species that appeared far apart from its congeners or in an unexpected position. Such patterns could be caused by sequences that were too short, of poor quality, or mislabeled. We removed the problematic sequences and replaced them with alternate sequences if available. For example, in the initial COI gene tree, Cyornis oscillans appeared close to the outgroup (Turdidae) instead of in the genus Cyornis or subfamily Niltavinae. The COI sequence in the BOLD System that was used to build the gene tree (id: BBIND665-08) was BLASTed and turned out to be most similar to Pachycephala homeyeri (Pachycephalidae). As there were no other molecular data available for this species, it was removed from the data matrix. A second example of mislabeling is that both the Cytb and NADH dehydrogenase subunit 2 (ND2) sequences of Phoenicurus fuliginosus (GenBank accessions: KJ024172 & KJ024236) likely belong to Niltava macgrigoriae. Thus, they were replaced with correctly labeled sequences. Additionally, we followed the findings in Moyle et al. (2014) and removed/replaced problematic sequences for Ficedula bonthaina, F. buruensis and F. subrubra. Only the loci that had data available for a minimum of 30 taxa were kept for further phylogenetic analyses (Table 3).
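The final filtering step (keeping loci with data for at least 30 taxa) could be expressed as in the sketch below. The function names and the toy alignments are hypothetical, and treating gap, "?" and "N" characters as missing data is an assumption rather than something stated in the text.

```python
MISSING = set("-?Nn")  # assumption: characters treated as missing data

def taxa_with_data(alignment):
    """Count taxa whose sequence has at least one non-missing character."""
    return sum(1 for seq in alignment.values()
               if any(c not in MISSING for c in seq))

def filter_loci(alignments, min_taxa=30):
    """Return the names of loci sampled for at least `min_taxa` taxa."""
    return sorted(locus for locus, aln in alignments.items()
                  if taxa_with_data(aln) >= min_taxa)

# Hypothetical toy alignments; a threshold of 2 is used only to keep the example small
toy = {
    "locus_A": {"sp1": "ACGT", "sp2": "AC-T", "sp3": "----"},
    "locus_B": {"sp1": "ACGT", "sp2": "????"},
}
kept_toy = filter_loci(toy, min_taxa=2)
kept_default = filter_loci(toy)  # default threshold of 30, as in the text
```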

Phylogenetic analyses
After data verification, we re-aligned the sequences and re-built gene trees as described above. The concatenated alignment and NEXUS data block were built using phyutility 2.7.1 (Smith and Dunn, 2008). Mitochondrial protein-coding genes were partitioned by codon position (i.e., 1st, 2nd, and 3rd codon positions). We then performed a partitioned analysis of the concatenated data in IQ-TREE (Minh et al., 2020), which infers phylogenetic trees by maximum likelihood and uses a greedy strategy to find the partitioning scheme with the best model fit (-m TESTMERGE). Tree estimation was run with 1000 ultrafast bootstrap replicates (-ufboot 1000) and edge-proportional branch lengths between partitions, allowing each partition to have its own substitution rate (-p). We also ran separate partitioned analyses of the mitochondrial and nuclear loci using IQ-TREE with the same parameter settings as described above.
We used IQ-TREE to run model testing (-m TESTMERGEONLY -mset mrbayes) similar to PartitionFinder (Lanfear et al., 2012). The best-fitting partitions and models were then used for a partitioned analysis with Bayesian inference in MrBayes 3.2.6 (Ronquist and Huelsenbeck, 2003) on the CIPRES Science Gateway (Miller et al., 2010). MCMC chains were run for 50 million generations (with four chains, a chain temperature of 0.2, and two runs) and sampled every thousand generations. The first 20% of trees were discarded before summarizing the trees into a 50-majority-rule consensus tree (contype = Halfcompat). The mixing of chains was checked in Tracer 1.7.1 (Rambaut et al., 2018) to ensure all parameters had effective sample sizes (ESSs) > 200. Separate analyses were performed for all loci, mitochondrial loci only, and nuclear loci only.
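To make the codon-position partitioning concrete, the sketch below writes NEXUS charset definitions for a concatenated alignment, using the `start-end\3` notation for every third site. The gene names, lengths, and ordering are hypothetical placeholders, and the helper is an illustration rather than the phyutility output used in the study.

```python
def codon_charsets(genes):
    """Build a NEXUS sets block.

    genes: list of (name, length, is_coding) in concatenation order.
    Coding genes get three charsets (one per codon position);
    other loci get a single charset.
    """
    lines = ["#nexus", "begin sets;"]
    start = 1
    for name, length, is_coding in genes:
        end = start + length - 1
        if is_coding:
            for pos in range(3):
                # e.g. "1-1041\3" selects sites 1, 4, 7, ... up to 1041
                lines.append(f"  charset {name}_pos{pos + 1} = {start + pos}-{end}\\3;")
        else:
            lines.append(f"  charset {name} = {start}-{end};")
        start = end + 1
    lines.append("end;")
    return "\n".join(lines)

# Hypothetical layout: one protein-coding gene followed by one rRNA region
partition_block = codon_charsets([("ND2", 1041, True), ("rRNA12S", 900, False)])
print(partition_block)
```

A file of this form is the kind of input that a partition-merging search can start from, with each codon position treated as an initial partition.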
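As a rough check on the size of the posterior sample implied by these settings, the arithmetic can be sketched as below. The function is hypothetical, and it assumes that generation 0 is also sampled and that the burn-in count is rounded down; the exact rounding applied by the software may differ.

```python
def mcmc_sample_budget(generations, sample_freq, burnin_frac, n_runs):
    """Return (samples per run, discarded per run, total retained across runs)."""
    samples_per_run = generations // sample_freq + 1  # assumes generation 0 is sampled
    discarded = int(samples_per_run * burnin_frac)    # assumes rounding down
    retained_total = (samples_per_run - discarded) * n_runs
    return samples_per_run, discarded, retained_total

# Settings from the text: 50 million generations, sampled every 1000, 20% burn-in, two runs
budget = mcmc_sample_budget(50_000_000, 1_000, 0.20, 2)
print(budget)
```

Under these assumptions, roughly 40,000 trees per run remain after burn-in to build the consensus tree.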

Divergence time estimation
To estimate divergence times for Muscicapidae, we used TreePL under penalized likelihood (Smith and O'Meara, 2012), which allows for different rates across branches but penalizes rate differences over the tree with a rate smoothing parameter. Given that the ML tree based on all loci overall yielded better estimates of the topology than the Bayesian tree, we used the rooted ML tree inferred from all loci as the phylogeny for TreePL time calibration. A secondary calibration point for the split between Muscicapidae and Turdidae [min = 13.5 million years ago (Mya), max = 22 Mya] from Oliveros et al. (2019) was applied. We used the thorough option, which allowed the program to continue to iterate until convergence, and the prime option, which tested different optimization possibilities. The optimal rate smoothing parameter was identified through cross-validation that tested seven values (0.001, 0.01, 0.1, 1, 10, 100, 1000); 0.001 was the optimal smoothing parameter that we used in this analysis.

Taxon sampling and genetic data
We assembled genetic data for 301 Muscicapidae species, covering all genera targeted and 92% of recognized species (out of 326 species and 50 genera). The supermatrix included 2460 sequences from a total of 28 molecular markers (15 mitochondrial and 13 nuclear; Table 3) with ca. 30% data coverage and an average of eight loci for each sampled species. Among them, 263 species were sampled with both mitochondrial and nuclear loci, 37 species with only mitochondrial loci, and one species with only nuclear loci. The resulting concatenated dataset contained 25,098 characters.
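The summary figures above follow from simple ratios, which the short sketch below reproduces. It assumes that the 2460 sequences refer to the 301 ingroup species only; if outgroup sequences are included in that count, the matrix coverage would be marginally lower. All variable names are illustrative.

```python
n_sampled, n_target = 301, 326   # sampled species vs. target species
n_loci, n_sequences = 28, 2460   # markers and sequences in the supermatrix

species_coverage = n_sampled / n_target              # fraction of target species sampled
matrix_coverage = n_sequences / (n_sampled * n_loci)  # filled cells in the species-by-locus matrix
mean_loci_per_species = n_sequences / n_sampled

# The three sampling categories should add up to the number of sampled species
both, mito_only, nuclear_only = 263, 37, 1
categories_total = both + mito_only + nuclear_only

print(f"species coverage: {species_coverage:.1%}")
print(f"matrix coverage: {matrix_coverage:.1%}")
print(f"mean loci per species: {mean_loci_per_species:.2f}")
```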

Phylogenetic relationships
Phylogenetic analyses using IQ-TREE (ML tree, Fig. 1) and MrBayes (the 50-majority-rule consensus tree, hereafter the Bayesian tree, Fig. 2) based on all taxa and all loci overall yielded similar topologies. Both trees showed four major lineages with strong support (> 95%), which corresponded to the four recognized subfamilies: Muscicapinae ( support ≥ 90%, and only 6.6% of the nodes had bootstrap support lower than 75%. The Bayesian tree was less resolved for some of the intrageneric relationships (e.g., Niltava, Cyornis and Ficedula) and failed to resolve some of the deep nodes in Niltavinae and Cossyphinae (Fig. 2); however, it agreed with the ML tree for most intergeneric relationships. The two tribes within Muscicapinae, Muscicapini (A1) and Copsychini (A2), are better supported by the ML tree based on all loci [bootstrap (BS) = (100, 90)] than by the Bayesian tree [posterior probability (PP) = (1.00, 0.76)]. Within Saxicolinae, some of the deep nodes were connected by short internal branches, which were poorly supported in both the ML and Bayesian trees (Figs. 1 & 2); however, the shallower relationships were more strongly supported by both analyses.
Mitochondrial and nuclear trees from both ML and Bayesian analyses all recovered four major clades (Supplementary Figs. S1-S4); however, they differed in the placement of Niltavinae and Cossyphinae. Both nuclear trees supported Niltavinae as sister to Saxicolinae and Cossyphinae (BS = 100; PP = 1.00); the same topology was recovered in our trees based on all loci, as well as in previous studies (e.g., Sangster et al., 2010; Zuccon and Ericson, 2010a; Barve and Mason, 2015). In contrast, in the mitochondrial trees, Niltavinae was found to be sister to Saxicolinae with only moderate support (BS = 78; PP = 0.72). For intergeneric relationships within each subfamily, mitochondrial and nuclear trees largely agreed with each other, but did exhibit a few topological discordances. For example, the nuclear trees supported Alethe as sister to the rest of the Muscicapinae species, rendering the tribe Copsychini (A2) paraphyletic. The mitochondrial trees, on the other hand, placed Alethe within Copsychini.
Most of the taxa that have not been included in previous phylogenetic trees were placed in their corresponding genera as defined by Clements et al. (2021). Twenty-seven of the 34 polytypic genera were monophyletic in both ML and Bayesian trees based on all loci with high support (> 90%); however, Muscicapa, Bradornis, Cercotrichas, Vauriella, Sheppardia, Cossypha and Myrmecocichla were not reciprocally monophyletic (Figs. 1 & 2), with species assigned to Vauriella falling into different subfamilies, which will be discussed in detail below. All original tree files are available in the supplementary data. Gene trees estimated by IQ-TREE are presented in Supplementary Figs. S5-S32.

Fig. 2.
The 50-majority-rule consensus tree reconstructed using Bayesian inference based on a partitioned dataset containing up to 15 mitochondrial and 13 nuclear loci for 301 Muscicapidae species and three Turdidae species as outgroup. Genera that form reciprocally monophyletic clades have been collapsed and are in bold. Values at nodes show posterior probabilities; full support is indicated with an asterisk. Red branches denote lineages that are in different places than in the ML tree in Fig. 1. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Divergence time estimation
The three TreePL analyses provided broadly similar estimates of divergence times across Muscicapidae (Table 4). Muscicapidae was estimated to have diverged from Turdidae in the early Miocene, and the most recent common ancestors of the four subfamilies all arose around the middle Miocene. The TreePL tree with a crown age for Muscicapidae estimated at 17.43 Mya (Fig. 3; Supplementary Fig. S33) has divergence times most similar to those in Selvatti et al. (2015), so its dates are used in the Discussion below.

Discussion
This study provides the most comprehensive species-level time-calibrated phylogeny for Muscicapidae. With molecular data combined from various sources, most of the nodes were strongly supported by the phylogenetic analyses. For taxa that exhibited conflicts in topology among analyses, we discuss the conflicts in detail below and propose taxonomic changes (Table 5). We also reconciled the taxon names in IOC World Bird List 12.1 that differ from those in the Clements/eBird 2021 checklist (Table 6).

Muscicapinae
Clade A1: paraphyly of Muscicapa flycatchers. The inclusion of the Grand Comoro Flycatcher Humblotia flavirostris renders Muscicapa paraphyletic. Although only Cytb was sampled in the data matrix for this species (GenBank accession: MH307512), concatenated (ML & Bayesian) and single-locus analyses all strongly support that H. flavirostris is nested within Muscicapa and is sister to the clade containing M. aquatica, M. cassini, M. gambagae and M. striata (BS = 97; PP = 1.00; BS = 100 in Cytb gene tree). This sample of H. flavirostris was collected from Grande Comore, where there are currently no Muscicapa flycatchers according to eBird sightings (https://ebird.org/region/KM-G?yr=all). As this species is sufficiently distinct from Muscicapa flycatchers in morphology, we conclude that it is not a mislabeled Muscicapa species. Based on our results, Humblotia Milne-Edwards and Oustalet, 1885 should be subsumed into Muscicapa Brisson, 1760. Therefore, we propose to rename Humblotia flavirostris as Muscicapa flavirostris. This is the first time that this species has been included in a comprehensive phylogenetic study.
Clade A1: polyphyly of Vauriella jungle-flycatchers. The Mindanao Jungle Flycatcher V. goodfellowi is recovered in the Muscicapinae clade as most closely related to Muscicapini flycatchers, whereas the rest of the sampled Vauriella species (V. gularis and V. insignis) unexpectedly formed a well-supported clade with the Bagobo Robin Leonardina woodi (BS = 100; PP = 0.98) in the Saxicolinae clade, rendering Vauriella polyphyletic. The ML tree based on all loci placed V. goodfellowi as part of Muscicapini (BS = 100), whereas the Bayesian tree placed it within Bradornis, forming a polytomy with B. ussheri and B. fuliginosus (PP = 0.54). Two previous studies also reported that V. goodfellowi appeared more closely related to the African flycatchers (Cibois and Cracraft, 2004; Barve and Mason, 2015), but both of these were based on the same RAG1 sequence (GenBank accession: AY307211). There is now also a mitochondrial COI sequence of V. goodfellowi available on GenBank (KC354954), and the COI gene tree supports a sister relationship between V. goodfellowi and all Muscicapa flycatchers (BS = 99; Supplementary Fig. S9), while the RAG1 gene tree supports V. goodfellowi as sister to the rest of Muscicapini (BS = 100; Supplementary Fig. S30), as in our ML tree. These two sequences were collected from different individuals and sequenced by different institutions (DNA Barcoding of Philippine Birds project, 2011, voucher MGPF3; Field Museum, 1992, voucher FMNH 357498), which diminishes the possibility that this "misplacement" was solely due to mislabeling. Although the exact position of V. goodfellowi is not fully resolved, it is clear that it should be placed within Muscicapini instead of in Saxicolinae. Since V. insignis is the type species of Vauriella Wolters, 1980, a new generic name is warranted for V. goodfellowi, as is a phylogenetic analysis of the unsampled V. albigularis, which is provisionally assumed to be closely related to the two other current Vauriella species (see Clade D, below).
Clade A1: taxonomic notes for Bradornis. The Bayesian trees based on all loci (Fig. 2) and mitochondrial loci (Supplementary Fig. S3) placed Vauriella goodfellowi within Bradornis, rendering it paraphyletic. However, both concatenated nuclear trees recovered Bradornis as a monophyletic clade (BS = 96; PP = 0.98; Supplementary Figs. S2 & S4), as did the ML tree based on all loci (BS = 93; Fig. 1), with an estimated crown age of 10.26 Mya. Considering that the position of V. goodfellowi within Muscicapini was unstable, we also ran a Bayesian analysis based on all loci concatenated for just clade A1 but excluding V. goodfellowi. We found that the six sampled Bradornis species formed a strongly supported clade (PP = 1.00; Supplementary Fig. S34). Despite the enigma of V. goodfellowi, it is appropriate to continue using Bradornis as the valid genus name for B. ussheri, B. fuliginosus, B. boehmi, B. mariquensis, B. microrhynchus and B. comitatus.
Clade A2: paraphyly of Cercotrichas scrub-robins. Our results showed that Cercotrichas was paraphyletic, as five of the 10 currently recognized species form a strongly supported clade that is sister to Copsychus magpie-robins while the remaining five are sister to that Copsychus plus Cercotrichas clade. The subdivision of Cercotrichas in our study agrees with the findings in recent phylogenetic work on this group (Sangster et al., 2010; Voelker et al., 2014). As suggested by Sangster et al. (2010)

Cossyphinae
Clade C: paraphyly of Sheppardia akalats. The clade comprising the Sheppardia akalats and the Gray-winged Robin-Chat Cossypha polioptera was strongly supported by both ML and Bayesian trees (BS = 100; PP = 0.99). All trees based on concatenation support a sister relationship between C. polioptera and S. bocagei, except for the concatenated nuclear trees (Supplementary Figs. S2 & S4), as S. bocagei lacks nuclear data. In agreement with several previous studies (Sangster et al., 2010; Voelker et al., 2010; del Hoyo et al., 2016; Fjeldså and Bowie, 2021), our results support the placement of C. polioptera in Sheppardia.
Clade C: polyphyly of Cossypha robin-chats. The phylogenetic relationships among these African robin-chats remain highly inconsistent across different studies, likely due to incomplete taxon sampling (Zuccon, 2011). Our study sampled 13 Cossypha species (all currently recognized species except for C. heinrichi), which were placed in five

Table 4
Divergence time (million years ago) of major nodes estimated by TreePL using three different calibration points for our study and results from two previous studies. An asterisk represents where we put the secondary calibration point in the analysis. Dates in bold were used in the Discussion.

Fig. 3. Divergence time in million years ago (Mya) estimated by TreePL using the topology of the ML tree based on all loci. A secondary calibration point for the split between Muscicapidae and Turdidae (min = 13.5 Mya, max = 22 Mya) from Oliveros et al. (2019) was applied. Dotted vertical lines were drawn every five million years. Genera that form reciprocally monophyletic clades have been collapsed. Tip labels in bold are the taxa that were placed in different positions by the ML and Bayesian analyses. The tree with all collapsed clades expanded is available in Supplementary Fig. S33.

The Bayesian analysis generally agreed with the above placements, except that C. cyanocampter was sister to the clade containing six Cossypha species (PP = 0.93; Fig. 2), and that C. humeralis was sister to Xenocopsychus (PP = 0.90). Given that the branch uniting C. humeralis and Xenocopsychus appears to be exceptionally long in the Bayesian tree (two long branches in the ML tree), the sister relationship between these two species likely resulted from long-branch attraction (Bergsten, 2005). Both ML and Bayesian mitochondrial trees supported a sister relationship between C. cyanocampter and the six Cossypha species with strong support (BS = 99; PP = 0.94; Supplementary Figs. S1 & S3). However, the two nuclear trees placed C. cyanocampter within this clade as sister to C. niveicapilla (BS = 98; PP = 0.86; Supplementary Figs. S2 and S4), as in Beresford (2003), but not in Sangster et al. (2010). It is possible that C. cyanocampter might be more closely related to the clade containing the six Cossypha species than to the other Cossypha robin-chats. However, due to the disagreement among mitochondrial, nuclear and all loci concatenated trees, the exact position of C. cyanocampter remains unresolved.
Our concatenated trees all recovered Cossypha isabellae and Cossyphicula roberti as sister taxa with strong support (BS = 100; PP = 0.97). Cossypha isabellae was sampled for Cytb and RAG1, and Cossyphicula only for Cytb. The RAG1 gene tree (Supplementary Fig. S30) puts C. isabellae outside the clade containing Cossypha and Chamaetylas (BS = 94), whereas the Cytb gene tree supports a sister relationship for C. isabellae and Cossyphicula (BS = 99). We tentatively propose to place C. isabellae in Cossyphicula. Future studies with increased molecular sampling should better evaluate their relationship.
Our results suggest that C. caffra, C. archeri, C. anomala and C. humeralis should have a separate generic status apart from the rest of the Cossypha robin-chats. Zuccon (2011) pointed out that the name Caffrornis Roberts, 1922 (type species C. caffra) is available for C. caffra, C. archeri, and C. anomala, which in our analyses form a well-supported clade (BS = 100; PP = 1.00) with an estimated most recent common ancestor at 5.93 Mya. There is strong (BS = 99) or fairly strong (PP = 0.91) support for inclusion of C. humeralis in the same clade as the three previous species, although it is more anciently diverged from these, with an estimated divergence time of 9.79 Mya. If these four species are placed in the same genus, the name Bessonornis (Smith, 1844) (the corrected spelling for Dessonornis A. Smith, 1836) has priority over Caffrornis Roberts, 1922. Since Xenocopsychus ansorgei is supported as either sister to the four others (BS = 94) or as sister to C. humeralis (PP = 0.90), this species is also appropriately placed in the same genus, which in that case would be Bessonornis A. Smith, 1836, which has priority over Xenocopsychus Hartert, 1907. An alternative would be to recognize three genera, Caffrornis (including C. caffer, C. archeri and C. anomalus), Bessonornis (monotypic: B. humeralis) and Xenocopsychus (monotypic: X. ansorgei).
Clade C: additional taxonomic notes. The ML and Bayesian trees based on all loci (Figs. 1 & 2) strongly supported Pogonocichla stellata and Swynnertonia swynnertoni as sister taxa (BS = 96; PP = 1.00). This sister relationship was also reported in Sangster et al. (2010). Our results suggest that Swynnertonia Roberts, 1922 could be subsumed in Pogonocichla Cabanis, 1847, and S. swynnertoni renamed as Pogonocichla swynnertoni. However, in view of the deep divergence between these (11.75 Mya), continued treatment as monotypic genera is equally valid.

Saxicolinae
Although most of the relationships were well resolved and strongly supported by the concatenated trees, a few deep nodes in Saxicolinae remained poorly supported, especially the nodes uniting Ficedula flycatchers and the forest understory birds that are now mainly distributed in subtropical and tropical Asia (e.g., Enicurus, Myophonus, Cinclidium, Larvivora, Brachypteryx, Myiomela, Calliope, etc.). These groups may have evolved through rapid radiations and therefore may not have had time to accumulate sufficient substitutions, which makes fully resolving their phylogenetic relationships challenging.
Clade D: paraphyly of Myrmecocichla chats. In the ML tree based on all loci concatenated (Fig. 1), the Mocking Cliff-Chat Thamnolaea cinnamomeiventris is nested within Myrmecocichla and sister to M. monticola, rendering Myrmecocichla paraphyletic, although the sister relationship between T. cinnamomeiventris and M. monticola was poorly supported (BS = 61). Sangster et al. (2010) and Barve and Mason (2015) also reported T. cinnamomeiventris and M. monticola as sister taxa. However, the mitochondrial ML tree (Supplementary Fig. S1) and the Bayesian tree based on all loci (Fig. 2) strongly support a sister relationship between T. cinnamomeiventris and all Myrmecocichla chats (BS = 100; PP = 1.00), as did some previous studies (Aliabadian et al., 2012; Voelker et al., 2012; Alaei Kakhki et al., 2016). In our nuclear concatenated trees (Supplementary Figs. S2 & S4), T. cinnamomeiventris is part of an unresolved polytomy with Myrmecocichla chats and the Moorland Chat Pinarochroa sordida. Collectively, this suggests that Thamnolaea is more likely to be sister to all Myrmecocichla chats. Future phylogenetic work with increased molecular sampling for Thamnolaea and Myrmecocichla chats, as well as increased taxon sampling within the polytypic T. cinnamomeiventris, may be warranted.

Taxa in need of further phylogenetic study
Based on the Clements/eBird 2021 checklist (excluding Pinarornis plumosus), there are 25 unsampled Muscicapidae species in our study (Supplementary Table S2). Most of them either occur only on islands or are under threat of extinction. About 30% of the missing taxa are in the genus Cyornis, including the critically endangered island-dwelling C. ruckii. Advances in next-generation sequencing technologies (such as probe capture for target enrichment) have made it possible to obtain substantial amounts of molecular data from toepad samples in museum collections, which could be a promising future step for sampling these difficult species.
Additionally, 17 subspecies in the Clements/eBird 2021 checklist, for example Larvivora komadori namiyei (L. namiyei) and Thamnolaea cinnamomeiventris coronata (T. coronata), have been elevated to the species level in IOC World Bird List 12.1 (Supplementary Table S3). Although these taxa were not included in our study, future studies can easily incorporate them, since most of them have molecular data available in GenBank. Careful curation of taxon name, sampling locality and associated literature may be required to accurately assign additional GenBank sequences to the correct taxa in cases of taxonomic change (see Hosner et al., 2022).

Conclusion
Our study provides the most comprehensive species-level phylogeny for Muscicapidae, covering 92% of the species from all recognized genera. Most of the nodes are well resolved and strongly supported. For genera that appeared to be non-monophyletic in our analyses, we proposed taxonomic revisions based on results both from our study and previous work. Given the number of taxonomic changes already observed, it is likely that more complete taxon sampling, including expanded sampling from subspecies and distinct populations, may further clarify patterns of diversification within the Muscicapidae. In addition, use of next-generation sequencing approaches may be particularly helpful to resolve those nodes with limited support, particularly within the Saxicolinae.

Fig. 1. Maximum likelihood tree estimated by IQ-TREE using a partitioned dataset containing up to 15 mitochondrial and 13 nuclear loci from 301 Muscicapidae and three Turdidae species. Clades A-C (top panel) and clade D (bottom panel) represent the four subfamilies. A1 represents the tribe Muscicapini, and A2 represents the tribe Copsychini. Values at nodes show bootstrap support; 100% bootstrap support is indicated with an asterisk. Red branches denote lineages that are in different places than in the Bayesian tree in Fig. 2. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Table 1
Taxonomic updates based on two popular standardized avian taxonomies.

Table 2
Data sources of the molecular markers used. For details of data accessions, see Supplementary Table S1.

Table 3
Molecular markers included in the supermatrix. Taxa number refers to the number of Muscicapidae species with sequences from each locus. Locus length shows the length of the final alignment.

Table 5
Recommended taxonomic changes given results in our study and previous work. For taxon names in IOC World Bird List 12.1 that differ from those in the Clements/eBird 2021 checklist and from the results shown in the present study, see Table 6 for reconciliations.
* See alternative suggestion in Discussion.
** Provisionally assumed to be closely related to L. gularis and L. insignis, pending further phylogenetic analysis.

Table 6
Recommended taxonomic changes for taxon names in IOC World Bird List 12.1 which differ from those in the Clements/eBird 2021 checklist.

relatively shallow divergence (MRCA at 10.99 Mya) from the previous three species (MRCA at 9.99 Mya), and based on this it seems reasonable to include all of these in a single genus. One option is to combine V. gularis, V. insignis, and tentatively V. albigularis, with H. calligyna and L. woodi in Leonardina Mearns, 1905, by priority (Heinrichia was introduced by Stresemann 1931 and Vauriella by Wolters with