Krumholzibacteriota and Deltaproteobacteria contain rare genetic potential to liberate carbon from monoaromatic compounds in subsurface coal seams

ABSTRACT Biogenic methane in subsurface coal seam environments is produced by diverse consortia of microbes. Although this methane is useful for global energy security, it remains unclear which microbes can liberate carbon from the coal. Most of this carbon is relatively resistant to biodegradation, as it is contained within aromatic rings. Thus, to search for coal-degrading taxa in the subsurface, this study reconstructed relevant metagenome-assembled genomes (MAGs) from coal seams by using a key genomic marker for the anaerobic degradation of monoaromatic compounds as a guide: the benzoyl-CoA reductase gene (bcrABCD). Three MAGs were identified with this genetic potential. The first represented a novel taxon from the Krumholzibacteriota phylum, which this study is the first to describe. This Krumholzibacteriota MAG contained a full set of genes for benzoyl-CoA dearomatization, in addition to other genes for anaerobic catabolism of monoaromatics. Analysis of Krumholzibacteriota MAGs from other environments revealed that this genetic potential may be common, and thus, Krumholzibacteriota may be important organisms for the liberation of recalcitrant carbon in a broad range of environments. Moreover, the assembly and characterization of two Syntrophorhabdus aromaticivorans MAGs from different continents and a Syntrophaceae sp. MAG implicate the Deltaproteobacteria class in coal seam monoaromatic degradation. Each of these taxa is a potential rate-limiting organism for subsurface coal-to-methane biodegradation. Their description here provides some understanding of their function within the coal seam microbiome and will help inform future efforts in coal bed methane stimulation, anoxic bioremediation of organic pollutants, and assessments of anoxic, subsurface carbon cycling and emissions.
IMPORTANCE Subsurface coal seams are highly anoxic, oligotrophic environments, where the main source of carbon is “locked away” within aromatic rings. Despite these challenges, many coal seams accumulate biogenic methane, implying that the coal seam microbiome is “unlocking” this carbon source in situ. For over two decades, researchers have endeavored to understand which organisms perform these processes. This study provides the first descriptions of organisms with this genetic potential from the coal seam environment. Here, we report metagenomic insights into carbon liberation from aromatic molecules and the degradation pathways involved and describe a Krumholzibacteriota, two Syntrophorhabdus aromaticivorans, and a Syntrophaceae MAG that contain this genetic potential. This is also the first time that the Krumholzibacteriota phylum has been implicated in anaerobic dearomatization of aromatic hydrocarbons. This potential is identified here in numerous MAGs from other terrestrial and marine subsurface habitats, implicating the Krumholzibacteriota in carbon-cycling processes across a broad range of environments.

The global energy transition from hydrocarbons to renewables requires low-emission fuels to maintain energy security during the shift. Methane is one such lower-emission alternative to coal (1). Methane gas provides a dispatchable source of energy that, unlike coal, produces neither particulates nor harmful nitrous and sulfur oxides during combustion (2,3). Thus, significant research interest exists in enhancing rates of methane gas production in subsurface coal by using the coal seam microbiome (4-6).
Within the coal itself, carbon is primarily (~60% to 100%) contained within aromatic rings, which increase in abundance with thermal maturity (7). Thus, microorganisms capable of aromatic ring degradation may be the dominant contributors of carbon to the coal seam microbiome, especially in more thermally mature coals. Over the last decade, an understanding has been formed of the range of microbes that occur in subsurface coal seams, demonstrating that these communities are typically characterized by a few relatively abundant taxa and a long-tailed distribution of rarer taxa (4,8,9). Coal seam microbiome studies have also revealed that the dominant organisms, such as taxa within Desulfuromonas and Desulfovibrio, are not involved in aromatic degradation but rather degrade simpler intermediates (8,10). Aromatic-degrading taxa therefore probably occur within the rarer taxa.
Identification of aromatic-degrading taxa is important both for enhancing applied outcomes such as industrial gas production and for understanding carbon mobilization from carbon-rich regions of the lithosphere. Indeed, the research effort to identify these taxa has been ongoing in the subsurface coal microbiology field for the last three decades (4,5,8,11). Some taxa have previously been identified with the genetic tools for aerobic aromatic ring degradation within coal seams (12-14); however, aerobic pathway contributions are likely restricted to very shallow regions of meteoric water infiltration, since subsurface coal seams are overall highly anoxic environments (14). Within other anoxic hydrocarbon-degrading environments, such as oil reservoirs (15), the catabolism of aromatic substrates progresses at least in part via intermediate aromatic compounds. Although a wide range of monoaromatic biodegradation pathways are known, these pathways depend upon a relatively small range of central intermediate aromatic compounds. One intermediate in particular, benzoyl coenzyme A (benzoyl-CoA), is central to the anaerobic degradation of a particularly wide range of monoaromatics (16). Further catabolism from this central intermediate requires the benzoyl-CoA reductase enzyme, which is responsible for the first dearomatization steps of the ring (17). This dearomatization step is crucial for accessing the carbon contained within the ring structures, as the thermodynamic stability of the aromatic ring renders it highly resistant to degradation (15). The ability to encode this enzyme, and the enzymes acting on the subsequent metabolites in the benzoyl-CoA pathway, would provide organisms with access to much of the carbon locked up in the organic substrates present in coal.
To date, numerous studies have attempted to identify aromatic-degrading taxa in coal seams using a range of strategies. Some of these have included the enrichment of microbes capable of degrading aromatic compounds, using either a variety of mono- and polyaromatic compounds or organic matter of differing maturities as sole sources of carbon (18,19). These studies identified putative aromatic degraders among the enriched taxa but also obscured primary aromatic degraders amid the numerous taxa subsequently enriched on their downstream degradative products.
Metagenome-assembled genomes (MAGs) have also been used to identify putative aromatic-degrading taxa by using known genes required for hydrocarbon degradation, performed alongside methods such as bio-orthogonal non-canonical amino acid tagging to differentiate active from inactive cells (12,13). Within the active taxa, this strategy has identified a range of genes used for rearranging the substituents of the aromatic ring in peripheral pathways above the benzoyl-CoA intermediate, for example, genes for catabolism of ethylbenzene to acetophenone in Chlorobiota and Geobacter taxa (12). Although these studies identify an abundance of genes and taxa actively involved in the coal-to-methane degradation pathways, the specific anaerobic monoaromatic-degrading genes identified are not involved in catalyzing dearomatization reactions or capable of liberating carbon from their targeted substrates (15,20,21). Consequently, the identification of taxa containing the genes for dearomatization of these monoaromatic substrates, such as the benzoyl-CoA reductase gene, remains an unanswered and critical step for understanding the in situ liberation of carbon from coal.
Recently, Syntrophorhabdaceae and Syntrophaceae were implicated as potentially important coal-degrading families using the linear discriminant analysis effect size statistical method on a group of algal-amended coal seam microbiomes from the Powder River Basin, USA (22). Syntrophorhabdaceae is a monotypic family, of which Syntrophorhabdus aromaticivorans is presently the sole described species (23). S. aromaticivorans has previously been identified within an enriched Surat Basin (Australia) coal seam microbial community, which had likely responded to the increased surface area of the provided coal that had undergone solvent extraction (24). Outside the coal seam environment, an S. aromaticivorans isolated from an anaerobic digester was demonstrated to anaerobically catabolize multiple monoaromatic substrates to acetate and to utilize a model organic electron acceptor, and it was proposed to be capable of interspecies electron transfer in partnership with a hydrogenotrophic methanogen (23). S. aromaticivorans is thus a promising candidate for monoaromatic degradation capabilities within the coal seam environment. In contrast, Syntrophaceae spp. are more commonly associated with aliphatic degradation than aromatic degradation; however, some taxa within the family can utilize a limited range of more labile monoaromatic substrates (25,26). Although Syntrophaceae spp. and S. aromaticivorans have been implicated as taxa that may be important for in situ coal degradation, there has been neither a demonstration that strains from this environment have the genetic tools required to access aromatic compounds nor a culture-based demonstration of their activity against aromatic constituents of coal.
One alternative strategy to those outlined above is to "mine" metagenomic data for specific aromatic-degrading genes and then use binning techniques to reassemble genomes of potential aromatic degraders. Accordingly, this study aimed to identify putative aromatic-degrading taxa by mining metagenomic data from Australia and North America through first identifying assembled DNA sequences that contained benzoyl-CoA reductase gene subunits and then using these as references to assemble MAGs for the identification of these taxa and their potential capabilities.

Sources of metagenomic DNA
For further details of metagenomic sequence data sourcing, see reference 27. Briefly, whole-genome shotgun sequences selected for use were required to (i) be from a subsurface coal seam, (ii) state the coal seam or associated geological basin, (iii) have been sequenced using Illumina (28), and (iv) be available as unassembled sequence reads. Nine metagenomes from North America and four from Australia were found to be suitable (Table S1). Eight of the North American metagenomes were from the Nance, Flowers-Goodale, and Terret coal seams of the Powder River Basin, USA, and one was from the Pocahontas No. 3 Coal of the central Appalachian Basin, USA. From Australia, three metagenomes were from the Walloon Subgroup of the Surat Basin, and one was from the Bandanna Formation of the underlying Bowen Basin; all four were from the state of Queensland, eastern Australia.

Reassembly, annotation, and detection of contigs containing benzoyl-CoA reductase gene subunits
All metagenomes were downloaded as unassembled reads and passed through the same error correction, assembly, and annotation pipeline. These reads were error corrected using Blue v2 (29) prior to assembly using SPAdes v3.13.2 with the meta flag (30). Resultant contigs were annotated using prokkaMeta v1.14.5 (31), and these annotation descriptions were then searched for any class-I enzyme system benzoyl-CoA reductase subunits, which are associated with facultative anaerobes such as Magnetospirillum and Thauera spp. (15,32). Contigs with benzoyl-CoA reductase gene subunits were then also explored, using the Prokka annotations, for other genes in the benzoyl-CoA pathway. The contigs on the final list selected for further analysis all contained at least one benzoyl-CoA reductase gene subunit (bcrABCD), were of lengths greater than 7 kbp, and had coverages greater than five.
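The three selection criteria for reference contigs can be illustrated with a minimal Python sketch. It is not the pipeline used in this study. It assumes that contig names follow the usual SPAdes pattern `NODE_<n>_length_<bp>_cov_<x>`, that annotated gene names per contig have already been collected into a dictionary (the helper names `parse_spades_header` and `select_bcr_contigs` are hypothetical), and that bcr subunits appear as gene names such as `bcrB` or `bcrA_1`, whereas the study searched the annotation descriptions.

```python
import re

# Assumed SPAdes-style contig name: NODE_<n>_length_<bp>_cov_<coverage>
SPADES_HEADER = re.compile(r"^NODE_\d+_length_(\d+)_cov_([\d.]+)")
# Assumed gene naming for bcr subunits, with an optional numeric suffix
BCR_GENE = re.compile(r"^bcr[ABCD](_\d+)?$")


def parse_spades_header(name):
    """Return (length in bp, k-mer coverage) parsed from a contig name."""
    match = SPADES_HEADER.match(name)
    if match is None:
        raise ValueError(f"not a SPAdes-style contig name: {name!r}")
    return int(match.group(1)), float(match.group(2))


def select_bcr_contigs(genes_by_contig, min_length=7000, min_coverage=5.0):
    """Keep contigs with a bcr subunit, length > 7 kbp, and coverage > 5."""
    selected = []
    for contig, genes in genes_by_contig.items():
        length, coverage = parse_spades_header(contig)
        has_bcr = any(BCR_GENE.match(gene) for gene in genes)
        if has_bcr and length > min_length and coverage > min_coverage:
            selected.append(contig)
    return selected


# Hypothetical annotations, for illustration only
example = {
    "NODE_1_length_12000_cov_8.5": ["bcrB", "badA"],
    "NODE_2_length_5000_cov_20.0": ["bcrC"],      # too short
    "NODE_3_length_9000_cov_3.1": ["bcrA_1"],     # coverage too low
    "NODE_4_length_15000_cov_10.0": ["recA"],     # no bcr subunit
}
print(select_bcr_contigs(example))
```

Both thresholds are strict inequalities, matching the wording "greater than 7 kbp" and "greater than five".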

Generation of trimer signatures, correlations, bin quality control, and accessioning
In order to identify other contigs within the metagenome from the same taxon, a trimer approach was used (33). In brief, the Python programming language (34) was used to count the proportion of each of the 64 possible trimers in all contigs containing at least 1,000 base pairs. The trimer signature for those contigs containing benzoyl-CoA reductase gene subunits was then used as a reference, and the trimer signatures for all contigs in the metagenome were examined against these reference contigs using a Pearson correlation coefficient in SciPy 1.7.3 (35). Those contigs with Pearson's R values greater than 0.95 were collected into bins for quality control. As a subsequent step in bin quality control, the coverages of each contig within the bin were inspected manually, and contigs with aberrant coverage were removed. Each bin was then subjected to inspection for completeness and contamination using CheckM v1.1.3 (36).
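The trimer-signature binning step can be sketched as follows. This is an illustration under stated assumptions rather than the study's script: trimers are counted in overlapping windows on the given strand only, windows containing non-ACGT characters are skipped, and a plain-Python Pearson coefficient is swapped in for the SciPy call so that the example has no dependencies. The function names are hypothetical.

```python
from itertools import product
from math import sqrt

# The 64 possible trimers in a fixed order
TRIMERS = ["".join(p) for p in product("ACGT", repeat=3)]


def trimer_signature(seq):
    """Proportion of each of the 64 trimers, counted in overlapping windows."""
    seq = seq.upper()
    counts = dict.fromkeys(TRIMERS, 0)
    for i in range(len(seq) - 2):
        trimer = seq[i:i + 3]
        if trimer in counts:  # skips windows with N or other ambiguity codes
            counts[trimer] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no countable trimers")
    return [counts[t] / total for t in TRIMERS]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / sqrt(var_x * var_y)


def bin_contigs(contigs, reference_id, min_length=1000, min_r=0.95):
    """Collect contigs whose signature correlates with the reference contig."""
    reference = trimer_signature(contigs[reference_id])
    return [
        contig_id
        for contig_id, seq in contigs.items()
        if len(seq) >= min_length
        and pearson_r(reference, trimer_signature(seq)) > min_r
    ]
```

A call such as `bin_contigs(contigs, "NODE_1_length_12000_cov_8.5")` would return the reference contig together with every contig of at least 1,000 bp whose signature has R greater than 0.95 against it; the manual coverage inspection and CheckM assessment described above would still follow.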

Characterization of the genomic content of the bins
Summary contig statistics (total bin length, total contigs, mean contig length, median contig length, N50, maximum contig length, and GC content) were determined using Python. The bins were then submitted to online tools to further characterize the genomic content of each bin. For use with KEGG Mapper (genome.jp/kegg/mapper) (37), the Prokka-annotated amino acid files were submitted to the BlastKOALA v2.2 online tool and run against the "species_prokaryotes" database (38). The Prokka-annotated amino acid files were also used to determine which transport proteins were present in each bin, by submission to the TransportDB 2.0 TransAAP (membranetransport.org) for transporter annotation (39). In order to characterize the carbohydrate active enzymes, dbCAN HMMdb v10.0 (bcb.unl.edu/dbCAN2) was run on the unannotated contigs within each bin (40). Similarly, the CRISPR sequences and cas genes were identified in the contigs within each bin using CRISPRCasFinder online (crisprcas.i2bc.paris-saclay.fr) (41). Resultant data from these tools were summarized and used to characterize the genomes in each bin and estimate their ecological roles within the coal seam environment.
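The summary statistics listed above can be computed with a few lines of Python. The sketch below is illustrative only and makes two assumptions that the text does not state: N50 is taken as the length of the contig at which the cumulative length of the size-sorted contigs first reaches half of the total bin length, and GC content is expressed as a fraction of unambiguous (A, C, G, T) bases. The function names are hypothetical.

```python
from statistics import mean, median


def n50(lengths):
    """Contig length at which the sorted cumulative length reaches 50%."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs supplied")


def contig_summary(sequences):
    """Summary statistics for a bin given as a list of contig sequences."""
    lengths = [len(seq) for seq in sequences]
    if not lengths:
        raise ValueError("no contigs supplied")
    joined = "".join(sequences).upper()
    gc = joined.count("G") + joined.count("C")
    acgt = gc + joined.count("A") + joined.count("T")
    return {
        "total_length": sum(lengths),
        "total_contigs": len(lengths),
        "mean_length": mean(lengths),
        "median_length": median(lengths),
        "n50": n50(lengths),
        "max_length": max(lengths),
        "gc_content": gc / acgt if acgt else 0.0,
    }
```

For contigs of 10, 8, 6, 4, and 2 bp (30 bp in total), the cumulative length reaches 15 bp at the second contig, so the N50 is 8 bp.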

Identification of the bins
As all bins lacked 16S ribosomal RNA (rRNA) genes, the elongation factor G genes were used to identify the closest known relatives of each bin. For this, BLASTx searches of the non-redundant protein database were used (https://blast.ncbi.nlm.nih.gov/Blast.cgi). Subsequently, putative identities from elongation factor G genes were compared with BLASTN searches of 16S rRNA (V4) genes obtained from each metagenome using the Earth Microbiome Project V4 region primers with Kelpie (42-44). This manual comparison between the elongation factor G taxonomy and Kelpie-derived 16S rRNA genes was used to identify putative taxa at the level of family or lower, with a minimum per-taxon abundance of 10 applied for the 16S rRNA genes. In addition, all 16S rRNA genes generated using Kelpie were then compared to the coal seam microbiome (CSMB) reference set, using USEARCH v11.0.667 at 97% identity, to determine any associated CSMB reference taxa (8,45).
Where close elongation factor G sequence percent identities were not found with BLASTx, the phylogeny of the bin was estimated against representative genomes using multilocus sequence analysis (MLSA). Relevant representative genome assemblies and MAGs were sourced from the NCBI Assembly database (ncbi.nlm.nih.gov/assembly/) (46) and annotated using Prokka. Common housekeeping genes were then compiled using a Python script (34), with their selection restricted to those housekeeping genes present within both the bin and the majority of the representative genomes. The constructed fasta files of the selected housekeeping genes were aligned and clustered in Mega v11.0.13 (47) with the ClustalW function and edited as Newick tree format files in FigTree (tree.bio.ed.ac.uk/software/figtree/).
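The gene selection rule for the MLSA (present in the bin and in the majority of the representative genomes) amounts to a simple set operation. The following sketch is one possible reading of that rule, not the study's script; it assumes that "majority" means strictly more than half of the representative genomes and that the candidate housekeeping genes have already been reduced to gene names. The function name and the example genome labels are hypothetical.

```python
def shared_housekeeping_genes(bin_genes, reference_genes):
    """Genes found in the bin and in more than half of the reference genomes.

    bin_genes: set of gene names annotated in the bin.
    reference_genes: dict mapping genome ID to its set of gene names.
    """
    majority = len(reference_genes) / 2
    selected = []
    for gene in sorted(bin_genes):
        n_present = sum(1 for genes in reference_genes.values() if gene in genes)
        if n_present > majority:
            selected.append(gene)
    return selected


# Hypothetical annotations, for illustration only
bin_annotation = {"recA", "fusA", "efp", "mdh"}
references = {
    "genome_1": {"recA", "fusA"},
    "genome_2": {"recA", "efp"},
    "genome_3": {"recA", "fusA", "atpD"},
}
print(shared_housekeeping_genes(bin_annotation, references))
```

In this toy case only `fusA` and `recA` pass, since `efp` occurs in one of three references, `mdh` in none, and `atpD` is absent from the bin.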

Distribution of benzoyl-CoA reductase genes across the examined metagenomes
Subunits of the benzoyl-CoA reductase gene (bcrABCD) were detected in the Prokka annotations of all examined metagenomes except for the Appalachian Basin metagenome (Table S2a). Relative to the total coding sequences (CDS) identified in each metagenome, the proportion of bcrABCD subunits per million CDS ranged from 17 to 52, with Powder River 85 (Flowers-Goodale coal seam) containing the lowest proportion and Powder River 10 (Nance coal seam) containing the highest relative proportion. This is not a measure of relative abundance of these genes but rather an indication of the number of times the bcrABCD subunits were identified within larger distinct fragments of assembled contiguous sequences (contigs). Overall, subunits bcrB and bcrC were identified far more often than subunits bcrA and bcrD; no bcrA or bcrD subunits were detected in the Australian metagenomes, and the proportion of bcrBC subunits was an order of magnitude higher than bcrAD in the Powder River Basin metagenomes (218 and 27 per million CDS, respectively).
Twenty-one bins were obtained that contained benzoyl-CoA reductase gene subunits, and pair-wise comparison of these indicated three groupings of similar genomes (Table S3). A representative bin was chosen from each of these groupings, namely the bin with the highest completeness and lowest contamination scores in the CheckM results for that group. Although groups 1 and 3 contained bins only from the Powder River Basin, group 2 contained bins from the Powder River, Bowen, and Surat basins, so a representative bin from each continent was selected for further characterization (bins 1.1, 2.2, 2.6, and 3.1; Fig. 1; Table 1). The remaining 17 bins were determined either to be chimeric, as they were too large to represent a single genome, or to have unacceptably low completeness and/or unacceptably high contamination scores in the CheckM results (Table S3).

Phylogeny of the four contig bins that were chosen for further analysis
16S rRNA genes were not recovered in any of the four contig bins selected for further analysis or in the other related 17 bins. In place of this, phylogenetic analyses were performed using BLASTx to identify elongation factor G gene sequences with high percent identities to the four contig bins and then infer the associated 16S rRNA gene operational taxonomic units (OTUs). These results indicated that the genomic content of these four bins came from a Syntrophaceae species (bin 1.1), two Syntrophorhabdus aromaticivorans (bins 2.2 and 2.6), and a novel member of the Fibrobacterota-Chlorobiota-Bacteroidota (FCB) superphylum (bin 3.1). Although the closest relative of bin 1.1 by its elongation factor G gene was a Syntrophaceae sp., possible Syntrophaceae matches for this bin in the 16S rRNA gene results from the metagenome were limited to two distinct taxa (OTU_64 and OTU_114; detected 12 and 10 times, respectively). Both of these have >90% identities to type sequence Smithella propionica LYP (96.84% and 94.86%, respectively; GenBank accession NR_024989.1) as well as to each of the three formally described species within the Syntrophus genus … (42,43). The nomenclature used in the present study does not reflect these proposed updates and instead aligns with current entries for Deltaproteobacteria and Syntrophaceae in the LPSN (https://lpsn.dsmz.de/). Although 16S rRNA sequences for the Deltaproteobacteria MAGs could be found by correlating close relatives between the Kelpie-produced 16S rRNA gene results and the annotated elongation factor G genes (Syntrophaceae sp. bin 1.1 and Syntrophorhabdus aromaticivorans bins 2.2 and 2.6), the novelty of bin 3.1 impeded identification through close relatives.
[Figure caption fragment: … (15,32). See Table S2 for number of occurrences of each gene in the MAGs and metagenomes. The displayed benzoyl-CoA pathway was adapted from genome.jp/pathway/map00362 (37).]
MLSA was performed to further characterize bin 3.1 after it was placed loosely within the FCB superphylum by low percent identity NCBI BLAST results using its elongation factor G gene. MLSA was performed first on 37 representative genomes and MAGs from the FCB superphylum (50) and then was again performed on 47 MAGs from the Krumholzibacteriota, Delphinibacteriota, and Latescibacterota candidate phyla once this specific region of the FCB superphylum had been identified as most closely related to bin 3.1 (Fig. S2; Table S4). These phyla are not yet well resolved, and the Genome Taxonomy Database classifies the Delphinibacteriota and some Latescibacterota within the Krumholzibacteriota phylum (51). From these MAGs, six housekeeping genes were selected for analysis: adenylosuccinate synthase, ATP synthase subunit beta, elongation factor G, elongation factor P, malate dehydrogenase, and protein RecA genes, which were identified in bin 3.1 in addition to a maximum number of the representative genomes. Importantly, each of these genes occurred within two or more genomes from Delphinibacteriota and Krumholzibacteriota, the two phyla with which bin 3.1 had the highest percent identities in the BLASTx results for the elongation factor G gene. The results of the MLSA indicate that bin 3.1 is likely a novel member of the Krumholzibacteriota phylum, of which the closest known relative appears to be a marine sediment enrichment culture MAG from the Bothnian Sea, in the Scandinavian region (52). Four other closely related Krumholzibacteriota MAGs were also identified, coming from a stratified freshwater reservoir of the Cañas River in Puerto Rico and from Lake Lovojärvi in Finland (Fig. 2; Table S4b; NCBI Assemblies GCA_903834005, GCA_903859215, GCA_903898915, and GCA_903925875, all from project number PRJEB38681) (53).
To aid cross-study analysis of Krumholzibacteriota sp. bin 3.1, the four MAGs in the MLSA results that contained 16S rRNA genes (Fig. 2) were used to determine both the most probable 16S rRNA gene OTU in the present study and the most probable CSMB reference set sequence (8). These four MAGs consisted of a Delphinibacteriota, two Krumholzibacteriota, and a Latescibacterota MAG (NCBI Assemblies GCA_002747875, GCA_022711765, GCA_024276305, and GCA_012103555, respectively). Each of these was compared against the metagenomic 16S rRNA gene OTUs and the CSMB reference set using USEARCH (90% minimum threshold) (45). Although no 16S rRNA gene OTU or CSMB identities were found for Krumholzibacteriota GCA_024276305 or Latescibacterota GCA_012103555, the USEARCH results indicated that Krumholzibacteriota sp. bin 3.1 may correspond to OTU_57 (94.1% and 92.9% identities to GCA_022711765 and GCA_002747875), which also agrees with the distribution of similar group 3 MAGs across the metagenomes in the present study (Table S3; assuming the absence of 16S rRNA genes from rarer taxa is due to the shallower sequencing depth overall in Powder River 9). The CSMB identities indicated that Krumholzibacteriota sp. bin 3.1 may correspond to CSMB_1092 (95.5% and 93.9% identities to GCA_022711765 and GCA_002747875), which has been previously reported from coal seams in the Bowen, Surat, and Sydney basins, Australia (Table S5).

General description of the Syntrophaceae sp., S. aromaticivorans, and Krumholzibacteriota sp. genome bins
[Figure caption fragment: Thirty other MAGs from the initial analysis were missing the selected housekeeping genes and thus are not included here (Table S4) (52,54-59). Environmental sources are indicated for each MAG; circle size is proportional to the number of times each environment applied within the same phylum.]
The four bins chosen for further characterization ranged in size from 1.96 to 4.18 Mbp, with S. aromaticivorans bin 2.6 having the smallest bin size and the Krumholzibacteriota sp. having the largest bin size (Table 1). GC content was broadly similar for the three Deltaproteobacteria MAGs (bins 1.1, 2.2, and 2.6), with an average of 46.7%. Krumholzibacteriota sp. bin 3.1, however, had a higher GC content of 71.1% (Table 1), consistent with the other closely related Krumholzibacteriota MAGs from freshwater lakes and marine sediment (69.2%-70.8%; Fig. 2; Table S4) (52,53). Across the four bins, CheckM (36) results indicated that contamination was very low (<2%). Although it was only taxonomically resolved to the phylum level, Krumholzibacteriota sp. bin 3.1 had the greatest genome completeness (95%), whereas genome completeness was lowest for the Powder River Basin-sourced S. aromaticivorans bin 2.6 (55%; Table 1; Table S3). Overall, these bins are medium- to high-quality draft MAGs, based on the combination of completeness and contamination scores with the varying presence of key marker genes within the assemblies (60). Although completeness and contamination scores for the Krumholzibacteriota sp. bin 3.1 would support its classification as a high-quality draft, it is instead classified as a medium-quality draft due to the lack of recovered 16S rRNA genes.
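The draft-quality reasoning above can be made explicit with a short sketch. It assumes the thresholds of the widely used MIMAG standard (high quality: >90% completeness, <5% contamination, rRNA genes present, and at least 18 tRNAs; medium quality: at least 50% completeness and <10% contamination; low quality: <50% completeness and <10% contamination). Whether reference 60 applies exactly these cut-offs is an assumption, and the function name and the tRNA counts used in the example are hypothetical.

```python
def mag_quality(completeness, contamination, has_rrna_genes, trna_count):
    """Classify a draft MAG using assumed MIMAG-style thresholds (percentages)."""
    if contamination >= 10:
        return "unclassified"  # outside the three draft categories
    if (completeness > 90 and contamination < 5
            and has_rrna_genes and trna_count >= 18):
        return "high"
    if completeness >= 50:
        return "medium"
    return "low"


# Bin 3.1: 95% complete and <2% contamination, but no 16S rRNA gene recovered.
# The contamination value and tRNA count are placeholders for illustration.
print(mag_quality(95, 1.9, has_rrna_genes=False, trna_count=20))
```

Under these assumptions, a bin with the completeness and contamination of bin 3.1 drops from high to medium quality solely because the rRNA criterion is not met, which mirrors the classification given in the text.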

Dearomatization genes and other notable ecophysiological MAG characteristics
The most commonly detected benzoyl-CoA reductase gene subunits were those of bcrABCD from the class-I enzyme system, associated with facultative anaerobes such as Magnetospirillum and Thauera spp. (15,32). The Prokka (31) annotations contained the highest number of these; however, the BlastKOALA (38) annotations included an additional enzyme system (Table 2; Table S2). This additional enzyme system, found in S. aromaticivorans bin 2.2, was the ATP-independent class-II benzoyl-CoA reductase gene subunit, bamB, associated with obligate anaerobes (Fig. 1) (32).
In addition to their use for MLSA, the Prokka annotations of the representative MAGs from the FCB superphylum, including those from the Krumholzibacteriota, Latescibacterota, and Delphinibacteriota phyla, were searched for benzoyl-CoA pathway genes. No genes of either the bcr or bam enzyme systems were detected in the 37 representative FCB superphylum genome assemblies and MAGs from the NCBI Assembly database (Table S4a) (46). Other genes in the benzoyl-CoA pathway were also rare, with the highest number (two) being found in the MAG of Longimicrobium terrae CECT 8660, from the Gemmatimonadota phylum, for the production of benzoyl-CoA from 4-hydroxybenzoyl-CoA and benzoate (hcrABC and badA, respectively). In contrast, the bcr gene and other benzoyl-CoA reductase genes were far more common among the 40 representative Krumholzibacteriota MAGs from GenBank that were used for the second MLSA during phylogenetic classification of bin 3.1. Of these representative Krumholzibacteriota MAGs, 16 contained one or more subunits from bcrABCD, 12 contained all four subunits, and 21 contained one or more genes or subunits from the benzoyl-CoA pathway as depicted in Fig. 1 (Table S4b; Fig. 2) (52,54,55). Indeed, 11 of these Krumholzibacteriota MAGs contained complete or near-complete (missing only one gene) pathways from benzoyl-CoA to 3-hydroxypimeloyl-CoA, as was identified in the Krumholzibacteriota MAG (bin 3.1) from the present study. These 11 representative Krumholzibacteriota MAGs that contained the genes for the benzoyl-CoA pathway are recorded as being sourced from a hydrothermal vent chimney along a mid-ocean ridge in the southwest Indian Ocean and from freshwater lakes in Puerto Rico, Switzerland, and Finland (hydrothermal vents [61]: GCA_024275485, GCA_024276305, and GCA_024277025; freshwater lakes [53]: GCA_903834005, GCA_903845975, GCA_903847545, GCA_903849715, GCA_903850775, GCA_903851775, GCA_903859215, GCA_903898915, GCA_903912525, GCA_903916975, and GCA_903925875).
Aside from the benzoyl-CoA pathway genes, the most notable BlastKOALA results concerned flagellar assembly and chemotaxis, antimicrobial biosynthesis, and nutrient and electron acceptor scavenging (Table S2c). Substantial numbers of genes for flagellar assembly and chemotaxis were detected within Krumholzibacteriota sp. bin 3.1 as well as genes for biosynthesis of antimicrobial substances. Genes for nutrient and electron acceptor scavenging of amino acids, nitrogen compounds, and sulfur compounds were most commonly detected within the S. aromaticivorans bins 2.2 and 2.6. The substrates for these nitrogen and sulfur scavenging genes included elemental sulfur, sulfite, trithionate, thiosulfate, ammonia, and nitrogen gas. Krumholzibacteriota sp. bin 3.1 also contained genes for removing amine and phosphate functional groups from monoaromatic compounds.

Carbohydrate-active enzymes
Each of the four bins contained a small number of carbohydrate-active enzymes (Table 2). The highest number of carbohydrate-active enzymes was detected in Krumholzibacteriota sp. bin 3.1 (143 total), of which the majority were glycosyltransferases and glycoside hydrolases. Of the glycoside hydrolase genes in Krumholzibacteriota bin 3.1, those for starch/glycogen catabolism were most abundant, although genes for oligosaccharides, fructan, cellulose, and other plant and animal polysaccharides were also detected (62). The lowest number of carbohydrate-active enzymes was found in S. aromaticivorans bin 2.6 (44 total), more than half of which were glycosyltransferases. Overall, glycosyltransferases were the most commonly detected carbohydrate-active enzymes in each bin. No polysaccharide lyases were detected in any of the bins.

Membrane transporter proteins
Numerous transporter protein genes were detected in each bin (Table 2; Table S2d). Syntrophorhabdus aromaticivorans bins 2.2 and 2.6, along with Krumholzibacteriota sp. bin 3.1, contained higher numbers of transporter proteins (totaling 269, 254, and 266, respectively). In contrast, Syntrophaceae sp. bin 1.1 contained a lower number of transporter proteins (totaling 171). Across all four bins, the ATP-binding cassette transporters were the most commonly detected; however, it is also notable that the tripartite ATP-independent periplasmic transporters were unusually abundant within the S. aromaticivorans bins 2.2 and 2.6 (32 and 36, respectively). Efflux transporter genes were also found in all bins, although they were twice as abundant in Krumholzibacteriota sp. bin 3.1, including three associated with cobalt-zinc-cadmium resistance.

CRISPRs and cas genes
Relevant to viral predation and defense, three of the four bins included multiple CRISPR loci, and two of these (bins 1.1 and 2.2) contained several cas genes (Table 2; Table S2e

DISCUSSION
For microbiologists working to understand the catabolism of coal in the subsurface, the identification of taxa engaged in degrading aromatic compounds has proved elusive. Indeed, microbial communities from the coal seam environment tend to be numerically dominated by one or two methanogens (12,63), a handful of bacterial taxa presumably engaging in syntrophic partnerships with these methanogens, and a long-tailed distribution of rarer taxa (64). Since it has generally been the view that these bacterial syntrophs cannot degrade complex aromatic or aliphatic compounds (4,65,66), this leaves the relatively unexplored long tail of the coal seam microbiome as a likely place to find those microbes capable of coal degradation. This unexplored long tail comprises low-abundance, diverse taxa, collectively known as the "rare biosphere" (67-69). It is ecologically reasonable to hypothesize that the unexplored rare biosphere of the coal seam microbiome contains microbes with a more diverse array of nutritional strategies, since presumably, there are abundant and less-highly competed nutritional niches within the diverse heterogeneity of the organic matter in coal. Data presented here demonstrate for the first time the identity of multiple lineages with crucial monoaromatic-degrading genes from subsurface coal seams on two continents. Four MAGs were explored in this study, identified as a Syntrophaceae sp. (bin 1.1), two Syntrophorhabdus aromaticivorans (bins 2.2 and 2.6), and a novel taxon from the Krumholzibacteriota phylum in the FCB superphylum (bin 3.1). The Krumholzibacteriota sp. represented by this MAG is likely engaged in monoaromatic degradation within the Flowers-Goodale coal seam of the Powder River Basin and had not previously been identified in the coal seam environment, nor had it been implicated as having a role in carbon liberation from aromatic compounds present in coal. While it has been previously stated that S. aromaticivorans may be important for coal degradation (22), this is the first demonstration that coal seam MAGs from this taxon carry genes for monoaromatic degradation. For the Syntrophaceae sp., taxa within this family are well-known hydrocarbon degraders in subsurface environments, such as aliphatic molecule degradation in oil reservoirs (26); however, they had not previously been implicated in aromatic degradation in the coal seam environment.

Ecological characteristics of the Krumholzibacteriota sp. bin 3.1 MAG
Based on its genetic potential, the Krumholzibacteriota sp. bin 3.1 MAG represents a primary coal degrader with the ability to anaerobically catabolize a wide range of monoaromatic compounds (Fig. 3; Table S2). This implicates the Krumholzibacteriota candidate phylum in aromatic degradation for the first time, both in the coal seam environment and more broadly. Although the 16S rRNA gene for Krumholzibacteriota sp. bin 3.1 (OTU_57; CSMB_1092) could be inferred using 16S rRNA genes from four closely related MAGs of the same phylum, downloaded from the NCBI Assembly database (46), further confirmation of the 16S rRNA gene specific to this organism could improve cross-study analysis in the coal seam environment as well as in other environments where similar taxa within the Krumholzibacteriota phylum may play a key ecological role.
In addition to aromatic degradation, the Krumholzibacteriota sp. represented by bin 3.1 may also compete in other ecological niches such as microbial biomass or necromass recycling, since it has a substantially higher number of glycoside hydrolase genes than the three Deltaproteobacteria MAGs (Table 2). Glycoside hydrolase enzymes are used for recycling complex carbohydrates and play an essential role in both ecosystem-scale and global carbon cycling (62). When compared against other biomass-recycling taxa from the coal seam environment (64), however, Krumholzibacteriota sp. bin 3.1 contains slightly fewer of these genes and of other carbohydrate-active enzyme genes, suggesting that it is unlikely to specialize in biomass recycling. Biomass recycling may, however, be used to supplement its extraction of carbon from coal as well as to obtain nitrogen and phosphorus (Table S2). Krumholzibacteriota sp. bin 3.1 may obtain nitrogen and phosphorus from biomass recycling and possibly from the coal itself, since this MAG contained a near-complete set of genes for the catabolism of benzamide and benzoyl phosphate to acetate via the benzoyl-CoA pathway (Fig. 3; Table S2).
Other genetic characteristics relevant to the ecological role and competitive success of Krumholzibacteriota sp. bin 3.1 include indicators of lower viral predation, a larger complement of multidrug efflux pump genes, and flagella-driven motility (Table S2). Relative to the other MAGs described here, the Krumholzibacteriota sp. MAG contained far fewer CRISPR sequences and no cas genes and thus does not appear to experience the same degree of viral predation. Furthermore, the Krumholzibacteriota sp. contained approximately twice as many multidrug efflux pump genes as the other MAGs examined here, spanning three different transporter superfamilies, although primarily from the resistance/nodulation/cell division superfamily. These genes are associated with resistance to toxic metals such as cobalt, zinc, and cadmium as well as resistance to toxic hydrocarbons and other antimicrobial substances, which the pumps remove from the cell. Finally, it is also the only MAG described here to contain most of the genes necessary for flagellar assembly and chemotaxis, required for locomotion in response to chemical cues, which would likely assist survival within the coal seam environment. Cell locomotion could also provide a competitive advantage over other potential aromatic compound degraders such as S. aromaticivorans, the type strain of which is nonmotile (23); no genetic potential for motility was found in the S. aromaticivorans MAGs examined here either (Table S2c). Overall, these findings implicate the Krumholzibacteriota sp. bin 3.1 MAG as containing the genes for a range of competitive advantages over S. aromaticivorans (bins 2.2 and 2.6) and Syntrophaceae sp. (bin 1.1) for catabolism of monoaromatic compounds in the coal seam environment. Further studies to verify the function of this microbe in the coal seam environment, such as axenic culturing to test its response to complex carbon substrates, would help confirm these genome-based inferences. Axenic culturing could also allow observations of cell morphology and other phenotypic traits of the organism represented by bin 3.1, which may aid in understanding its ability to utilize different niches within the coal seam environment. The genetic characteristics provided here may serve as a guide to assist in obtaining this taxon in pure culture.

Krumholzibacteriota taxa may liberate recalcitrant carbon in aquatic environments globally
Within the Krumholzibacteriota phylum, dearomatization and downstream catabolism of monoaromatic compounds in anoxic environments may be relatively common (Fig. 2). Of the 40 other GenBank-sourced Krumholzibacteriota MAGs used for MLSA, 13 contained a complete or near-complete set of genes for catabolism of benzoyl-CoA to 3-hydroxypimeloyl-CoA (Table S4b) (46). The environmental sources of these MAGs are listed as a deep cold seep fluid in the South China Sea (70), a hydrothermal vent in the southwest Indian Ocean (61), and freshwater lakes in Finland, Puerto Rico, and Switzerland (53). In addition to this genetic potential for anaerobic dearomatization, aerobic dearomatization has recently been suggested for a Krumholzibacteriota MAG that contained one subunit of the gene encoding benzoate/toluate 1,2-dioxygenase reductase (benC-xylZ) and was sourced from an oil-contaminated environment in the Persian Gulf (71). In the present study, Krumholzibacteriota sp. bin 3.1 came from a subsurface coal seam (aquifer; high-similarity MAGs recovered from four of the five Flowers-Goodale coal seam metagenomes) in the USA, and the inferred 16S rRNA gene (OTU_57; CSMB_1092) has previously been identified in amplicon sequences from other subsurface coal seams in three different geological basins in eastern Australia (Table S5). In combination, these findings suggest that the role of Krumholzibacteriota taxa merits consideration when attempting to understand biodegradation of aromatic compounds in a wide range of aquatic environments and may be of relevance when attempting to understand contributions from these environments to the global carbon cycle. Again, axenic studies of putative carbon-liberating Krumholzibacteriota would be beneficial in clarifying their phylogenetic and metabolic capabilities and their ecological functions in environments such as marine sediments, deep marine cold seep fluids, hydrothermal vents, and surface freshwaters. Indeed, this clarification could result in a clearer understanding of potential rate-limiting ecological roles of Krumholzibacteriota spp. in the subsurface coal seam environment.

Syntrophorhabdus aromaticivorans in the coal seam environment
Unlike the novel Krumholzibacteriota taxon, S. aromaticivorans has been identified within the coal seam environment on numerous previous occasions. Indeed, the CSMB reference set presently lists six distinct taxa from the Syntrophorhabdus genus, and one or more of these have been identified from coal-bearing basins of eastern Australia on 23 different occasions (Table S5) (8, 32). Recently, Syntrophorhabdus have also been detected in the Powder River Basin, USA, and implicated as potential primary degraders of organic matter in coal based on both their increase in abundance after algal amendment and the published syntrophic degradative abilities for monoaromatic compounds of the type species when grown in coculture with a methanogen (23). It should be noted, however, that only this single isolate has been described from the genus and that it was obtained two decades ago from anaerobic sludge in terephthalic acid manufacturing wastewater (23). It was thus unclear whether other strains of S. aromaticivorans, such as those identified in the coal seam environment, were capable of aromatic degradation. Taken together, these data indicate for the first time that two coal seam-sourced MAGs of S. aromaticivorans also have the genetic potential to directly access the carbon contained within the monoaromatic molecules of coal, such as 4-hydroxybenzoate and benzoate (Fig. 1).
In addition to the genetic potential for catabolism of monoaromatic compounds, the S. aromaticivorans MAGs described here have noteworthy membrane transporters, viral predation indicators, and an absence of genes for biomass recycling. Both MAGs have an unusually large array of tripartite ATP-independent periplasmic transporters, which may indicate that dicarboxylates (such as fumarate) are an important source of carbon for S. aromaticivorans (Fig. 3). Although neither MAG had a high completeness score (Table 1), the recovered sequences did not indicate potential for biomass recycling as an alternative source of carbon, as neither strain contained substantial numbers of genes for glycoside hydrolases and other carbohydrate-active enzymes. Interestingly, the Australian S. aromaticivorans genome (bin 2.2) contains abundant CRISPR spacers, suggesting substantial pressure on this taxon from viral predation in the Bowen Basin, eastern Australia. Viral predation has been implicated as an important process in the overlying Surat Basin (14), and numerous other taxa from this environment have been previously demonstrated to harbor substantial numbers of CRISPR spacers arrayed against a host of viruses (10, 72). In contrast, the lack of CRISPR sequences detected in the North American genome (bin 2.6) may indicate little to no viral predation stress on this taxon in the Powder River Basin; however, these sequences may simply not have been recovered due to the lower completeness of this bin (Table 1). Lastly, the ability of S. aromaticivorans to use organic electron acceptors (9,10-anthraquinone-2,6-disulfonate) when isolated from wastewater is intriguing (23), and it may be that aromatic compounds from the coal seam environment could also act as alternative electron acceptors, although experimental work with the organism in culture would be important to confirm this speculation. Regardless, certainly for Australian strains, there is clear evidence that this species has the genetic potential to catabolize aromatic compounds from the coal seam, and if these taxa are also subject to viral predation, viral lysis of S. aromaticivorans cells may be a mechanism by which this carbon is made available to a wider range of taxa.

Syntrophaceae taxa may utilize aromatic carbon in coal seams
The other Deltaproteobacteria MAG represented a Syntrophaceae sp. from the Powder River Basin and contained genes associated with monoaromatic degradation. Data from the CSMB reference set reveal that the Syntrophaceae family is commonly present within coal seams in Australia, in all of the North American coal seams in the present study, and also in the Ishikari Basin, Japan (Table S5) (8). The Syntrophaceae family, and Smithella propionica specifically, is well known for its use of aliphatic hydrocarbons (26, 73). Indeed, Smithella spp. are well-known alkane degraders from anoxic environments such as oil reservoirs, where they can grow in syntrophy with hydrogenotrophic methanogens (26). Despite the prevalence of this family in hydrocarbon-rich environments, Syntrophaceae sp. bin 1.1 represents the first indication that it may be involved in primary degradation of aromatic substrates in the coal seam environment. Given the lower MAG completeness score and the absence of other benzoyl-CoA pathway genes in Syntrophaceae sp. bin 1.1, further analysis is needed to confirm the annotation and genetic potential, and axenic studies of this taxon would be especially useful for validation. Although the MAG is incomplete, the organism represented by Syntrophaceae sp. bin 1.1 appears to experience considerable pressure from viral predation, as this MAG contains a high number of CRISPR spacers relative to the other Powder River Basin MAGs examined here (Fig. 3; Table 2). Viral predation may, therefore, be an important process in the Powder River Basin, as it is in other subsurface habitats.

Ecological functions and implications for understanding the coal seam microbiome
In terms of their life strategies in the coal seam, the organisms represented by the Krumholzibacteriota, S. aromaticivorans, and Syntrophaceae MAGs likely have stress-tolerant ecological profiles, in the sense of Grime (74), in that they contain genes for a range of metabolic functions or adaptations to aid survival in this challenging environment. Each appears to possess specialized genetic tools for the degradation of a plentiful, but difficult to access, carbon source within subsurface coal seams. Interestingly, as well as hosting these genes for monoaromatic degradation, Krumholzibacteriota sp. bin 3.1 hosts an array of other genes associated with accessing nutrients in moribund cells or plant material (Fig. 3; Table 2). As described earlier, coal seams are oligotrophic environments, and access to other macronutrients, such as nitrogen, in coal organic matter may be important for competition in this environment. Furthermore, while it appears that the Krumholzibacteriota organism described here may have a relatively high number of genes involved in carbohydrate metabolism, it has relatively few compared to truly ruderal taxa that occur in coal seams (14).
Importantly for scientific efforts to enhance or control methane yields from coal, the four putative aromatic-degrading taxa described here may hold key rate-limiting roles in the biodegradation of coal to methane in the subsurface. If the Krumholzibacteriota sp., the two S. aromaticivorans, and the Syntrophaceae sp. represented by these MAGs are all indeed capable of direct access to the carbon within coal, further study of their metabolic strategies may provide important tools for altering biodegradation of coal and other complex carbon in the subsurface.
Much of the previous research into the coal seam microbiome has centered around either descriptive studies of species distributions or the effect of nutrient mixtures in an effort to enhance gas yields. While both of these approaches are valuable, they provide comparatively little information on the function of individual microbes in these communities. In contrast, this study describes four MAGs from the coal seam environment with likely roles in extracting carbon from monoaromatic compounds. Studies that seek to elucidate processes upstream of monoaromatic degradation, involving the liberation of soluble organic matter from the insoluble coal macromolecule, would further our understanding of this unusual environment.

FIG 1
FIG 1 Benzoyl-CoA pathway genes and subunits, from 4-hydroxybenzoyl-CoA and benzoate to 3-hydroxypimeloyl-CoA, and their presence in the selected contig bins (MAGs). Benzoyl-CoA reductase genes for two enzyme systems are displayed: the class-I, ATP-dependent bcrABCD (associated with facultative anaerobes), as well as the class-II, ATP-independent bamB (associated with obligate anaerobes) (15, 32). See Table S2 for number of occurrences of each gene in

FIG 2
FIG 2 Phylogenetic relationships and environmental sources of selected Krumholzibacteriota, Delphinibacteriota*, and Latescibacterota* MAGs. Maximum likelihood phylogeny of bin 3.1 and 17 representative MAGs using multilocus sequence analysis (MLSA; see Fig. S2 for the equivalent FCB superphylum tree).

TABLE 1
Bin statistics

Syntrophaceae sp., being closely related to the Smithella and Syntrophus genera. Under recent alternative phylogenomic classifications of the Deltaproteobacteria, S. aromaticivorans and the Syntrophaceae sp. fall within a phylum named either Thermodesulfobacteriota or Desulfobacterota, and the lowest common rank of close relatives to bin 1.1 is the Syntrophales order

TABLE 2
The abundance of genes or genetic elements identified in the MAGs

Syntrophaceae sp. bin 1.1 had four CRISPR loci containing 18 spacers, S. aromaticivorans bin 2.2 had three CRISPR loci and 30 spacers, and Krumholzibacteriota sp. bin 3.1 had two CRISPR loci and seven spacers. Notably, while Syntrophaceae sp. bin 1.1 and S. aromaticivorans bin 2.2 contained the aforementioned cas genes (7 and 9, respectively), no cas genes were detected within Krumholzibacteriota sp. bin 3.1, and neither CRISPR loci nor cas genes were detected in S. aromaticivorans bin 2.6.
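
As a reading aid, the CRISPR and cas counts reported above can be tabulated in a few lines of Python. This is a minimal sketch and not part of the study's analysis: the names `CrisprSummary`, `spacers_per_locus`, and `rank_by_spacers` are hypothetical, the values are transcribed from the text, and spacers per locus is an illustrative ratio rather than a metric used by the authors. Because bin completeness differs (Table 1), the zero counts for bin 2.6 may reflect unrecovered sequence rather than true absence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrisprSummary:
    """CRISPR-related counts for one MAG, transcribed from the text above."""
    bin_id: str
    taxon: str
    loci: int
    spacers: int
    cas_genes: int

    def spacers_per_locus(self) -> float:
        # Illustrative ratio only; returns 0.0 when no loci were detected.
        return self.spacers / self.loci if self.loci else 0.0


# Counts as stated in the text; bin 2.6 had no detected CRISPR loci or cas genes.
SUMMARIES = [
    CrisprSummary("1.1", "Syntrophaceae sp.", loci=4, spacers=18, cas_genes=7),
    CrisprSummary("2.2", "S. aromaticivorans", loci=3, spacers=30, cas_genes=9),
    CrisprSummary("2.6", "S. aromaticivorans", loci=0, spacers=0, cas_genes=0),
    CrisprSummary("3.1", "Krumholzibacteriota sp.", loci=2, spacers=7, cas_genes=0),
]


def rank_by_spacers(summaries):
    """Return the summaries ordered from most to fewest CRISPR spacers."""
    return sorted(summaries, key=lambda s: s.spacers, reverse=True)


if __name__ == "__main__":
    for s in rank_by_spacers(SUMMARIES):
        print(
            f"bin {s.bin_id} ({s.taxon}): {s.loci} loci, {s.spacers} spacers, "
            f"{s.cas_genes} cas genes, {s.spacers_per_locus():.1f} spacers/locus"
        )
```

Ranking by raw spacer count places bin 2.2 first and bin 3.1 last among the bins with detected loci, in line with the interpretation that the Krumholzibacteriota sp. shows fewer indicators of viral predation.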