Mycobacteriales taxonomy using network analysis-aided, context-uniform phylogenomic approach for non-subjective genus demarcation

ABSTRACT Genomics allows accurately pinpointing microbial ancestry for taxonomic purposes, yet there is currently no consensus on genus definition based on genome data. Different metrics and criteria are used, with an increasing trend toward genera over-splitting. Here, we report a method for prokaryotic genus assignment that combines normalized tree clustering and network analysis of several genomic relatedness indices. Objectivity is maximized by linear application of the same partitioning thresholds across an entire middle/high taxonomic rank context (order level), with the classical (“pre-genomic”) genera as demarcation reference, ensuring continuity in taxonomy and nomenclature. We tested the method with the Mycobacteriales, where recent examples of genus fragmentation divided Mycobacterium into five genera, or made Rhodococcus paraphyletic by creating the genus Prescottella for the sublineage containing the animal and human pathogen, Rhodococcus equi. Our approach did not support the mycobacterial five-genus split or the Prescottella nested genus, but identified a basal branch in each of the mycobacterial and rhodococcal radiations warranting genus status (Mycobacteroides, and novel genus Rhodococcoides for the Rhodococcus fascians clade, respectively). The median average amino acid identity (AAI) between the demarcated genera was 56% to 59%, consistent with the <65% AAI genus boundary standard. Shifting the demarcation threshold to the Prescottella/mycobacterial five-genus level systematically elevated the intrageneric sublineages to genus rank, leading to taxonomic atomization (≈threefold increase in potential Mycobacteriales genera). The proposed approach provides a standardizable methodological framework for non-subjective prokaryotic genus delineation.

IMPORTANCE A robust taxonomy is essential for the organized study of prokaryotes and the effective communication of microbial knowledge. The genus rank is the mainstay of biological classification as it brings together under a common name a group of closely related organisms sharing the same recent ancestry and similar characteristics. Despite the unprecedented resolution afforded by whole-genome sequencing in defining evolutionary relationships, a consensus approach for phylogenomics-based prokaryotic genus delineation remains elusive. Taxonomists use different demarcation criteria, sometimes leading to genus rank over-splitting and the creation of multiple new genera. This work reports a simple, reliable, and standardizable method that seeks to minimize subjectivity in genomics-based demarcation of prokaryotic genera, exemplified through application to the order Mycobacteriales. Formal descriptions of proposed taxonomic changes based on our study are included.

• Supplemental Material: pdf file with Supplemental Text, Figures S1 to S6, and Tables S1 and S2.
• Supplemental Dataset: ML distance and genomic relatedness index (GRI) matrices used in network analyses in xls file.

SUPPLEMENTAL TEXT
Nocardiaceae phylogeny. The Nocardiaceae ML tree (166 genomes) shows two main lines of descent: the genus Nocardia, and the rhodococci, comprising the genus Rhodococcus "sensu stricto" and the "fascians" clade (genus Rhodococcoides gen. nov. proposed herein). There is also a minor grouping at the base of the Nocardia radiation comprising the monospecific genera Skermania and Aldersonia, and the three-species genus Antrihabitans (Fig. S4).
In the ML phylogenies, the rhodococci appear as an earlier-evolving monophyletic grouping with greater internal diversity and longer distances between its members compared to the nocardiae. This is also reflected in the network analyses, where the rhodococcal nodes are more loosely interconnected (Figs. 4, S6). In contrast, the Nocardia genus is consistently organized as a closely packed spherical cluster with equidistant nodes. This suggests a younger diversification, a more homogeneous ecological niche, fewer internal evolutionary bottlenecks, or a combination thereof.
Based on these differences, classifying the rhodococcal clade as a separate family of the Mycobacteriales order could be justified. Each of the two main clades of this Rhodococcaceae family, i.e.
These differences in DNA content are characteristic of the rhodococci and are mirrored by the unique metabolic versatility and niche adaptability of this bacterial group (1–7). The rhodococcal genome plasticity is critically underpinned by pools of circular and linear conjugative replicons of different sizes whose backbones have been actively exchanged and have acquired specific traits during niche-adaptive evolution (6,8).

FIG S2. Network analysis of Nocardiaceae taxonomic relationships based on POCP genomic index. The figure illustrates that the POCP metric (the sum of BlastP orthologs identified in two-way genome comparisons divided by the total number of CDSs in the two genomes), as previously cautioned (9), could be potentially biased by differences in genome size. This is evident in the shown network graphs, where rhodococcal species with larger genomes (>8 Mbp) are clustered away from other members of their same sublineage (no. 4) despite all being phylogenetically closely related (see Figs. 1, 2). This observation underscores that genome size, although linked to the bacterial phylogeny at a broad scale (10), does not necessarily have taxonomic value at lower rank (e.g., genus) levels, reflecting that differential niche-adaptive genome expansion or contraction phenomena may occur in closely related bacteria (4,11,12).

FIG S3. Taxonomic network analysis based on gANI genomic index at genus and supra-genus level. Top panel, Mycobacteriales lineage I; bottom panel, Nocardiaceae. At genus level, the gANI-based clustering gives subnetwork partitions fully consistent with those based on the AAI, AF and ML distance matrices (Figs. 3, 4), albeit with a smaller clustering threshold (ct) dynamic range, i.e., network over-fragmentation is rapidly reached as ct values increase. However, at low ct values, the gANI metric affords a good visualization of the taxonomic interconnections at supra-genus (i.e., suborder or family) level. See Movies S3 to S7 for 3D network animations showing the higher rank-level clustering resolution of AF, AAI, ML distance, gANI, and ANI.
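
The POCP definition quoted in the FIG S2 legend (two-way BlastP orthologs over the total number of CDSs in both genomes) and its sensitivity to genome size can be illustrated with a minimal Python sketch. The function name, the multiplication by 100 to express the ratio as a percentage, and all counts below are assumptions made for illustration; they are not values or code from this study.

```python
def pocp(orthologs_a_to_b: int, orthologs_b_to_a: int,
         cds_a: int, cds_b: int) -> float:
    """Percentage of conserved proteins between genomes A and B.

    Sum of the orthologs found in the two one-way comparisons, divided by
    the total number of CDSs in the two genomes. Scaling by 100 is an
    assumption of this sketch.
    """
    if cds_a <= 0 or cds_b <= 0:
        raise ValueError("CDS counts must be positive")
    return 100.0 * (orthologs_a_to_b + orthologs_b_to_a) / (cds_a + cds_b)


# Hypothetical counts: the same 3,000 shared proteins in both scenarios.
similar_sizes = pocp(3000, 3000, cds_a=4000, cds_b=4000)
expanded_genome = pocp(3000, 3000, cds_a=4000, cds_b=9000)

print(f"similar genome sizes: {similar_sizes:.1f}%")
print(f"one expanded genome:  {expanded_genome:.1f}%")
```

With identical shared content, the score drops when one genome carries many additional CDSs, which is the kind of size-driven bias the legend cautions about.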

FIG S5. Split-network analysis of the Nocardiaceae ML tree in Fig. S4. Alternative branchings are shown in red and indicated by black arrows. Tree plotted using SplitsTree5 v5.3.0. (A) The presence of Millisia brevis NBRC 105863T and Smaragdicoccus niigatensis DSM 44881T in the phylogenetic analysis causes a generalized reduction in the bootstrap support of basal branches of the tree. The genomic indices of these two species indicate they are distantly related to the other Nocardiaceae, and both are located on relatively long branches in the ML tree (Fig. 1). This likely results in a Long Branch Attraction (LBA) effect due to random rather than phylogenetically informative substitutions, causing tree instability. (B) Same tree after removal of M. brevis NBRC 105863T and S. niigatensis DSM 44881T from the concatenated supermatrix.

FIG S6. Taxonomic network analysis based on AAI, AF, gANI and ML distance indices showing the relationships of the monospecific genera Aldersonia, Millisia and Skermania with the rest of the Nocardiaceae. Clustering threshold (ct) applied at family level. Even at the lowest (least stringent) ct values, the more distant Smaragdicoccus niigatensis DSM 44881T appears as a singleton. With all metrics, raising the ct to genus-level values isolates the Aldersonia, Millisia and Skermania nodes as singletons.
legend). Genus-level tree clustering cutoff indicated by dotted arrow. RED scale, relative evolutionary divergence. Possible new species of the genus are
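
The effect of the clustering threshold (ct) described in the FIG S6 legend, where raising ct progressively isolates distant nodes as singletons, can be sketched as thresholded connected components on a pairwise similarity matrix. This is a simplified illustration under stated assumptions, not the network software or parameters used in this study: the node labels and AAI-like values are invented, an edge is kept when the similarity is at or above ct, and a distance-type index such as ML distance would need the comparison inverted.

```python
def subnetworks(nodes, similarity, ct):
    """Partition nodes into connected components, keeping only edges
    whose similarity value is >= ct (hypothetical clustering rule)."""
    adjacency = {node: set() for node in nodes}
    for (a, b), value in similarity.items():
        if value >= ct:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen, clusters = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first traversal collects everything reachable from start.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Invented example: three closely related genomes and one distant genome.
nodes = ["N1", "N2", "N3", "S1"]
aai = {
    ("N1", "N2"): 80.0, ("N1", "N3"): 78.0, ("N2", "N3"): 79.0,
    ("N1", "S1"): 62.0, ("N2", "S1"): 61.0, ("N3", "S1"): 60.0,
}

for ct in (55.0, 65.0, 85.0):
    print(ct, subnetworks(nodes, aai, ct))
```

In this toy data set a low ct keeps all four nodes in one subnetwork, an intermediate ct splits off the distant node as a singleton, and a very high ct fragments the network completely, mirroring the over-fragmentation behavior noted for narrow-range indices.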

Table S1 (cont.)
a Mycobacterial names remain listed in NCBI databases as per the five-genus scheme of Gupta et al. (ref. 14), i.e., Mycobacterium, Mycobacteroides, Mycolicibacter, Mycolicibacillus, Mycolicibacterium, although the Mycobacteroides, Mycolicibacter, Mycolicibacillus, and Mycolicibacterium circumscriptions were reclassified in 2021 back into Mycobacterium, and corresponding nomenclature emended, by Meehan et al. (ref. 15).
b Emendation to Meehan et al. nomenclature (ref. 15) whereby the genus Mycobacteroides proposed by Gupta et al. (ref. 14) is maintained for the basal mycobacterial sublineage containing Mycobacterium abscessus.
c Emended name as per Meehan et al. (ref. 15), confirmed by our study (except for the Mycobacteroides clade, which should have genus status).
d Revised rhodococcal nomenclature based on the findings in this study.