Atribacteria Reproducing over Millions of Years in the Atlantic Abyssal Subseafloor

The deep subseafloor sedimentary biosphere is one of the largest ecosystems on Earth, where microbes subsist under energy-limited conditions over long timescales. It remains poorly understood how mechanisms of microbial metabolism promote increased fitness in these settings. We discovered that the candidate bacterial phylum “Candidatus Atribacteria” dominated a deep-sea subseafloor ecosystem, where it exhibited increased transcription of genes associated with acetogenic fermentation and reproduction in million-year-old sediment. We attribute its improved fitness after burial in the seabed to its capabilities to derive energy from increasingly oxidized metabolites via a bacterial microcompartment and utilize a potentially reversible Wood-Ljungdahl pathway to help meet anabolic and catabolic requirements for growth. Our findings show that “Ca. Atribacteria” can perform all the necessary catabolic and anabolic functions necessary for cellular reproduction, even under energy limitation in anoxic sediments that are millions of years old.

M arine sediments contain a ubiquitous "deep biosphere" (1) extending at least as far as 2,500 m below the seafloor (mbsf) (2), which consists of active and dormant cells (3)(4)(5) with measurable impacts on subseafloor biogeochemical processes (6). At abyssal water depths in the deep sea, the subseafloor communities generally are less sampled (7) than those in continental shelf sediments that have higher activities and rates of microbial sulfate reduction (8)(9)(10). At abyssal depths under the oligotrophic ocean gyres, the sedimentation rates are low, ranging from 1 to 5 m of sediment deposited per million years (11,12). These abyssal subseafloor communities have extremely low metabolic activity (13) and live near the energy limit to life (6,14). As a result of this, the deep biosphere of marine sediment is generally characterized by net death, and it remains poorly understood to what extent microbial activity is translated into cellular reproduction in the deep subseafloor (15,16).
Many subseafloor microbes exhibit viability, since they actively take up carbon and nitrogen in incubation experiments (4,17), indicating potential for microbial growth in energy-limited anoxic subseafloor sediments. Microbial activities can also be stimulated at redox interfaces deep below the seafloor over geological timescales (3,18). However, the capacity of microbes to reproduce in abyssal subseafloor ecosystems close to the energy limit to life (6,14) is particularly unconstrained, given the extreme scarcity of organic substrates in these settings (19). There is reason to suspect that cellular reproduction in the abyssal subseafloor is minimal, since microbial biomass tends to decrease an order of magnitude over the top 10 m of sediment in all abyssal locations yet sampled, reaching the detection limit for life at relatively shallow subseafloor depths of ca. 15 mbsf (12). This follows the global trend whereby subseafloor microbes tend to die faster than they grow (1), particularly in the top 10 m of marine sediment.
Here, we report an exception to this global trend in anoxic deep-sea clay recovered from an abyssal water depth of Ͼ5,500 m in the North Atlantic, characterized by an ultraslow sedimentation rate of ca. 3 m per million years. In contrast to oxic abyssal red clay where microbial abundance decreases several orders of magnitude over the top 10 mbsf (12,20), we show here that microbial abundance in the anoxic abyssal clay increases an order of magnitude from the seafloor down to 15 mbsf (spanning ca. 5 million years). We then proceeded to use metatranscriptomics to further investigate the anaerobic metabolic mechanisms that explain this net growth in the size of the subseafloor microbial ecosystem over multimillion-year timescales.

RESULTS AND DISCUSSION
Sediment biogeochemistry. We obtained deep-sea clay sediment from a 5,515-m water depth in the ultraoligotrophic open ocean of the North Atlantic. This coring site (KN223-15) is characterized by a mean sedimentation rate of ca. 3 m per million years (11). Samples ranged from 0.1 to 30 m below seafloor (mbsf). Given the mean sedimentation rate, the deepest sample has an approximate age of 9 to 10 million years. Oxygen and nitrate penetration into the sediment is restricted to the top millimeter of sediment, as they were detectable in the bottom water but below detection in the uppermost portion of the core at 0.02 and 0.03 mbsf, respectively (see Fig. S1 in the supplemental material). The abyssal sediments of the North Atlantic are typically oxic red clay that tend to have O 2 penetrating many meters into the seafloor (11,21), but the subseafloor microbial ecosystem sampled here is unique in the sense that the sediments are anoxic despite having ultraslow sedimentation rates. Moreover, while the sediment of our abyssal subseafloor core displays sulfate (SO 4 2Ϫ ) concentrations of approximately 29 mM at 0.02 mbsf and is fully anoxic downward, the rates of anaerobic microbial SO 4 2Ϫ reduction over the top 10 m of sampled sediment are low (Ϫ3.8 ϫ 10 Ϫ4 mol SO 4 2Ϫ m 3 yr Ϫ1 ) and below detection underneath, resulting in pore water SO 4 2Ϫ remaining Ͼ20 mM throughout the core (Fig. 1A). This profile is very similar to profiles observed previously in anoxic sediments from other oligotrophic regions, such as the Eastern Equatorial Pacific, where sediments are anoxic but community metabolic activity is too slow to consume all of the available SO 4 2Ϫ (6,22). Abundance and diversity of the microbial communities. The density of 16S rRNA gene copies per gram of wet sediment exhibits a subsurface peak in abundance, increasing 1 order of magnitude (2.5 ϫ 10 5 to 2.3 ϫ 10 6 copies) from 0.2 to 3.5 mbsf and remaining Ͼ10 6 between 3 and 10 mbsf. Thereafter, microbial 16S rRNA gene abundances decrease gradually to a minimum of 9.9 ϫ 10 4 copies at 17 mbsf (Fig. 1B). In the uppermost sediment samples (0.1 to 0.4 mbsf), Actinobacteria, Planctomycetes, Deltaproteobacteria, and Gammaproteobacteria dominate the community (Fig. 1C). Below this depth, starting at 0.5 mbsf, the relative abundance of "Candidatus Atribacteria" rapidly increased from Ͻ5% to Ͼ40% ( Fig. 1B and C) in all four biological replicates sampled (see Fig. S2 and S3). The "Ca. Atribacteria" is only represented by 3 operational taxonomic units (OTUs), with one single OTU accounting for up to 40% of the whole community throughout the record ( Fig. 2A and B), which was consistent across three to four biological replicates (Fig. S2). This dominant OTU ( Fig. 2A) was most closely affiliated with unpublished 16S rRNA gene sequences from deep subseafloor sediments from the Nankai Trough (23) and a potential gas hydrate region of southwest Taiwan subseafloor sediments (Fig. 2B).
According to a protocol for quantitative normalization of barcoded 16S gene sequence data (24), we normalized group-specific 16S rRNA gene abundances by dividing the total number of 16S rRNA gene copies determined via quantitative PCR (qPCR) by their affiliated 16S rRNA gene relative abundances (fractional percentage of 2Ϫ reduction rates and dissolved SO 4 2Ϫ concentrations. (B) Quantitative PCR (qPCR) of 16S rRNA (black dots) and dsrB (orange stars) genes, and the summed qPCR-normalized abundance of 16S rRNA gene OTUs affiliated with "Ca. Atribacteria" (blue), Chloroflexi (green), and Deltaproteobacteria (light orange). Error bars correspond to standard deviations (2 ) based on four biological replicates. (C) Diversity of 16S rRNA genes based on three to four biological replicates. (D) Number of ORFs attributed to "Ca. Atribacteria" in the metatranscriptomes as a function of depth. Light blue circles are the number of ORFs detected in individual metatranscriptome replicates, and the larger dark blue circles are the total number of unique ORFs detected when summing across all replicates. The top two samples in panels B and C (0.1 to 0.3 mbsf) were recovered by gravity coring, the deeper samples (0.5 to 29 mbsf) were recovered via long-piston coring. Note in panels B and C that an increase in abundance of "Ca. Atribacteria" in the upper 1 mbsf coincides with their higher level of gene expression in panel D. total sequences per sample). This showed that "Ca. Atribacteria" dominates the community (Fig. 1B), which is attributed to a single 16S OTU (Fig. 2). We considered the potential influence of multiple copies of 16S rRNA genes (25) on the result and searched for all "Ca. Atribacteria" genomes in the JGI database sequenced to date, which revealed that all sequenced genomes from this group have only one 16S rRNA gene copy, whereas Deltaproteobacteria and Chloroflexi have a median of two 16S rRNA gene copies (up to four) per genome (25). Thus, despite having an average lower 16S rRNA gene copy number than Chloroflexi and Deltaproteobacteria, "Ca. Atribacteria" relative abundance was high in our 16S rRNA gene sequence data set, even suggesting that our data actually underestimate the abundance of "Ca. Atribacteria." These profiles show exponential net increases of "Ca. Atribacteria" and Chloroflexi within the upper 10 m of sediment, whereas those of Deltaproteobacteria increase at 0.4 mbsf but recede rapidly over time from 10 4 to 10 3 copies or less, reaching our limit of detection (10 2 gene copies g Ϫ1 [wet weight] sediment) at the bottom of the core. The vertical profile of the beta subunit of the dissimilatory sulfate reductase (dsrB) gene runs parallel to that of Deltaproteobacteria 16S rRNA genes, which points to Deltaproteobacteria as the main sulfate-reducing bacteria (SRB) in this anoxic sediment. This inference is supported by the abrupt 10-fold drop in dsrB and Deltaproteobacteria 16S rRNA gene sequences at 10 mbsf (Fig. 1B), the same depth at which net SO 4 2Ϫ reduction decreases to below detection levels (Fig. 1A). The abundance of SRB is lower than in organic-rich shelf sediments (26,27), which may be related to the relatively low sedimentation rate at our sampling location, since SO 4 2Ϫ reduction rates are correlated with the square of the sedimentation rate (9). The relative slowness of SO 4 2Ϫ reduction at our sampling location is readily evident in the profile of dissolved SO 4 2Ϫ concentrations, which remained Ͼ20 mM throughout the core (Fig. 1A).
Metagenomic analysis. Metagenomes from five depths were sequenced at an average depth of 8.4 million reads (Ϯ 2.5 million) (see Table S3). Because the DNA was amplified by multiple displacement amplification (MDA), we only considered presence/ absence of genes and did not interpret the relative abundance of genes in the MDA-amplified metagenomes to be indicative of taxon abundance. De novo "binning"based methods for creating metagenome-assembled genomes (MAGs) are useful for discovering new taxa when distantly related genomes in databases preclude similaritybased searches (28)(29)(30)(31). Because of the ultralow DNA concentrations extractable from these abyssal clay sediments, we were only able to obtain metagenomic data after amplifying the extracted DNA using multiple displacement amplification (MDA). Our attempts at de novo binning of the MDA products revealed a selective amplification of short fragments that precluded binning and completion of high-quality MAGs (see Table S2). Specifically, manual curation of MaxBin results using Anvi'o (32) produced 12 bins at relatively low levels of genome completeness (17% to 44%) that were able to be assigned to "Ca. Atribacteria" (see Fig. S4). Thus, in order to increase the annotation of putative functions of open reading frames (ORFs) in metagenomes from the "Ca. Atribacteria" that could not be assembled into bins, we also applied a bioinformatics pipeline, whereby ORFs harbored by de novo assembled contigs were searched for similarity against a large aggregated genome database of predicted proteins from all atribacterial MAGs and single-cell genomes (SAGs) sequenced to date, including all published data from subsurface metagenomics studies (see Materials and Methods). We then extracted all the ORFs having a predicted protein from a "Ca. Atribacteria" genome as a top BLASTp hit and ran a phylogenomic analysis based on 31 phylogenomic markers using AMPHORA (33). This phylogenomic analysis demonstrates that our previously published method (34,35) recovers ORFs with high similarity (Ͼ95% amino acid similarity) to predicted proteins in previously sequenced "Ca. Atribacteria" genomes (Fig. 2C). The phylogenomic analysis shows that the atribacterial ORFs from all sampled depths form a monophyletic clade sister to those from deep-sea sediments of the Atlantic Ocean, Ross Sea, and Pacific Ocean methane hydrates within the JS1 clade (Fig. 2C). This high level of similarity to existing "Ca. Atribacteria" genomes enabled similarity-based assignment of "Ca. Atribacteria" ORFs in our samples (see Fig. S5). As further evidence of this, the median similarity of ORFs to their top hit in the database in all metatranscriptome and metagenome data sets of Ͼ60% indicates that the ORFs in our samples had relatively high similarity to existing predicted proteins in the database (see Fig. S6).
We confirmed the accuracy of this previously published approach (34,35) by an in silico test for true-and false-positive annotations based on 151 randomly selected peptide fragments extracted from "Ca. Atribacteria" predicted proteomes as well as other bacterial and archaeal genomes (see Fig. S7). In this analysis, 50 randomly selected predicted proteins were randomly cut into peptide fragments ranging from 20 to 140 amino acid residues in length in order to replicate partial ORFs typically recovered in metagenomes. The random peptide fragments were then searched against the large aggregate database (34,35) for their top hits with BLASTp. This in silico test showed that 100% of all randomly cut peptide fragments from atribacteria were true positives; they had the same atribacterial genome as a top BLASTp hit. This shows that our use of a similarity-based approach for ORF annotations (34)(35)(36) is adequate for assigning ORFs harbored by de novo assembled contigs to groups at high taxonomic levels, including those derived from the candidate phylum Atribacteria.
Gene expression analysis. Metatranscriptomes from eight depths were produced in biological replicates (two to seven replicates per depth) (Fig. 1D) and sequenced at an average depth of 4.0 million reads (Ϯ1.5 million) with 91,199 contigs across all samples sequenced (Table S3). Similar to that for the 16S rRNA gene abundances, there was a subsurface peak in the number of unique expressed ORFs assigned to "Ca. Atribacteria" that increases exponentially between 3.5 and 10 mbsf, which was consistent across multiple replicate metatranscriptomes from each depth, and at 16 m, our RNA limit of detection was reached (Fig. 1D). We defined this depth as our RNA detection limit, because the number of unique ORFs assigned to "Ca. Atribacteria" was no longer detectable and, below this depth, the only ORFs annotated were those assigned to groups of known contaminants from molecular kits, including those from human skin and soil (37). Many of these same groups are common laboratory contaminants found in dust samples from our lab in 16S rRNA gene surveys (38) and include Pseudomonas, Rhizobium, Acinetobacter, and Staphylococcus. Presumably, the detection limit was reached because a smaller amount of extracted RNA from the in situ active community becomes overprinted by background "noise" from contaminating DNA (or RNA) derived from the kits, aerosols, or other laboratory contaminants.
There was a statistically significant correlation (r values ϭ 0.59 and 0.61, P values ϭ 0.016 and 0.014) between the abundance of atribacterial 16S rRNA genes and expressed ORFs (see Table S1). The number of expressed ORFs correlated with 16S rRNA gene quantities, which was consistent for the entire data set (total bacteria) and when comparing the number of ORFs expressed per group to the qPCR normalized abundance of the same taxonomic group for the four groups with the highest numbers of detected ORFs in the metatranscriptomes (atribacteria, Deltaproteobacteria, Chloroflexi, and Archaea) (Fig. 3A).
Evidence for subseafloor reproduction. The steadily increasing abundance of "Ca. Atribacteria" over time (since sediment deposition) is apparently due to cells undergoing cytokinesis, as the 16S rRNA gene abundances correlated with the transcription of genes encoding proteins involved in cell division and cell shape determination such as FtsAEKQWZ, MreBC, and RodA ( Fig. 3B) (39,40). The expressed genes encoding the Fts proteins form the "divisome" (41), which includes the FtsZ ring, a cytokinetic protein ring that localizes at the cell division site prior to cytokinesis in dividing bacterial cells and pumps the replicated chromosome into the daughter cell (42). These genes are expressed during cellular division in "Ca. Atribacteria," indicating the exponential growth phase (43). Thus, while we do not report measurements of biomass turnover such as amino acid racemization (44,45) or stable isotope probing (4,17), the steadily increasing microbial biomass of "Ca. Atribacteria" with increasing sediment depth together with the detected correlation between their abundance and transcriptional activity (Fig. 3A) and the transcription of ORFs encoding FtsZ ring proteins (Fig. 3B) strongly suggest that the higher abundances of "Ca. Atribacteria" between 0.5 and 10 mbsf are due to actively dividing cells. In this zone of apparent increased reproduction, the number of unique expressed ORFs correlated significantly with the number of 16S rRNA gene copies from the dominant groups ( Fig. 1 and 3). The most parsimonious explanation for this is that more expressed ORFs from atribacteria are detected at these depths because there are higher numbers of metabolically active atribacteria producing mRNA transcripts.
The sediment ages in this interval span several million years; thus, this higher number of metabolically active microbes originated from "Ca. Atribacteria" that slowly reproduced through binary fission and cytokinesis over multimillion-year timescales.
Because genes involved in the formation of the divisome, FstZ ring, and cytokinesis were only expressed at depths where the highest 16S rRNA gene copies were found (Fig. 4B), the higher number of 16S rRNA gene copies at those depths is likely due to higher numbers of actively reproducing atribacterial cells. Active cells have higher numbers of ribosomes and thus higher copies of 16S rRNA per cell (46), but our qPCR assay targeted DNA (the bacterial chromosome), not RNA (expressed genes), thus discarding active rRNA synthesis as a confounding factor in our bacterial abundance estimations. Since "Ca. Atribacteria" has only one 16S rRNA gene copy, it is possible to conclude that the observed 10-fold increase in atribacterial 16S rRNA gene copy numbers over the top 10 mbsf requires chromosomal (genome) replication. In bacteria, genome replication occurs in actively dividing cells prior to cytokinesis and binary fission (46), assuming steady-state microbial input in sediment over millions of years. The only alternative explanation to our observations would require an increase in 16S rRNA copy number in the chromosome of "Ca. Atribacteria" over the top 10 mbsf followed by its subsequent decrease in the same chromosome below 10 mbsf. Such an incredibly high rate of genome evolution affecting the highly conserved 16S gene and then acting in a reversible way after 10 mbsf is inconceivable, especially considering this gene evolves at a rate of 1% every 100 million years (47) and our deepest sampled depth is roughly 10 million years old. Thus, the most likely explanation is that the detection of more 16S rRNA gene copies from the dominant OTU of "Ca. Atribacteria" ( Fig. 2A) is due to more cells, each containing one chromosome with one copy of the 16S rRNA gene. The correlation of these 16S rRNA genes with higher numbers of expressed genes from "Ca. Atribacteria" over the top 15 mbsf (Fig. 1D and 3) can be attributed to a single clade (Fig. 2) that has been slowly reproducing and increasing in abundance over millions of years.
Predicted metabolism for "Ca. Atribacteria." Our finding that "Ca. Atribacteria" expressed the highest number of protein-encoding genes among all other groups (Fig. 3) is furthermore consistent with their high levels of gene transcription in deep subseafloor sediments of the Baltic Sea (48). Below 15 mbsf, the microbial ecosystem transitions to net death, since microbial abundances decrease by 2 orders of magni- Subsurface Microbes Reproducing over Millions of Years ® tude, RNA levels decline to below detection, and the net SO 4 2Ϫ reduction rate is below our detection limit (Fig. 1). However, "Ca. Atribacteria" remains the dominant group throughout the core, even below 10 mbsf. Its continued dominance is consistent with its abundance in other deep subseafloor settings (49)(50)(51)(52) and the findings that dominant taxa in the subseafloor community do not necessarily require cellular reproduction to outcompete other taxa but can reach higher relative abundances from lower mortality than their competitors over long timescales (53,54). This is supported by the lack of expression of ORFs by atribacteria harboring cellular division proteins below 10 mbsf, where atribacterial 16S rRNA gene abundances are Ͻ10 5 copies per g (Fig. 3B). Similar to what has been predicted from genomic studies (55)(56)(57)(58), the gene transcription data strongly indicate that the "Ca. Atribacteria" dominating throughout the core actively utilizes a sugar-based acetogenic metabolism (Fig. 4A). This is also consistent with gene transcription from "Ca. Atribacteria" in deep subseafloor Baltic Sea sediments that showed active utilization of trehalose (48).
The Wood-Ljungdahl pathway (WLP) has been found to be a widespread and potentially important metabolic pathway in subsurface microbial life (57,59,60). The WLP can be used catabolically to achieve redox balance and regenerate NAD ϩ and oxidized ferredoxin to thereby increase anaerobic metabolic efficiency (61). In acetogenic bacteria, this is coupled to the Rnf complex at the membrane that utilizes either ferredoxin or NAD ϩ as the terminal electron acceptor, driving a Na ϩ ion gradient and ATP synthesis (Fig. 5) (62). Several lines of evidence in our gene transcription data indicate that "Ca. Atribacteria" utilizes a similar metabolism.
Specifically, "Ca. Atribacteria" expressed transcripts encoding ORFs with similarity to proteins involved in sugar fermentation and the WLP (Fig. 4B). Genes encoding enzymes of the WLP were all present (Fig. 4A) and differently expressed by "Ca. Atribacteria" as were those encoding proteins involved in glycolysis, fermentation, and electron bifurcation (Fig. 4B). Transcription of ORFs by "Ca. Atribacteria" with similarity to enzymes known to be involved in electron bifurcation that results in production of molecular hydrogen included the beta subunit of the heterodisulfide reductase, formate hydrogen lyase (63), gamma subunit of the NiFe hydrogenase (64), and subunit alpha of the reversible formate dehydrogenase (65). The formate hydrogen lyase complex combines NiFe hydrogenase and soluble formate dehydrogenase to couple  (57), as well as the function of the FstZ ring and divisome (41). Colored dots correspond to the successive enzymatic steps (as listed in Fig. 4), e.g., fermentation corresponds to glycolysis (11 dots) and lactate fermentation (3 dots). Actively expressed genes encoding the divisome complex (i.e., FtsAEKQWZ) are indicated at the top right. nPPP, nonoxidative pentose phosphate pathway; SLP, substrate-level phosphorylation; DERA, 2-deoxy-D-ribose 5-phosphate aldolase; ALDH, aldehyde dehydrogenase; ADH, alcohol dehydrogenase; FBEB, flavin-based electron bifurcation; FHL, formate hydrogen lyase; WLP, Wood-Ljungdahl pathway.
Subsurface Microbes Reproducing over Millions of Years ® formate and/or CO-dependent hydrogen production to the generation of the Na ϩ motive force to generate ATP. This enzymatic complex allows growth on formate disproportionation (66), during which two electrons are passed to hydrogenase and two protons are neutralized (67). The gene expression data indicate that the "Ca. Atribacteria" produces the chemiosmotic Na ϩ gradient using the Rnf complex for ATP synthesis (61). Atribacterial ORFs encoding the Rnf complex were expressed at the majority of depths (Fig. 4B), indicating this is indeed an important mechanism of anaerobic ATP synthesis.
We also detected active transcription of genes encoding pyruvate-formate lyase, acetate kinase, and formate dehydrogenase, which is reversible (65); therefore, the WLP could be used for either catabolic or anabolic purposes (Fig. 5). However, the expression of an H 4 F:B 12 methyltransferase from "Ca. Atribacteria" is a direct indication that the WLP is functioning as an acetogenic pathway as opposed to one for methanogenesis, because acetogens use 5-methyltetrahydrofolate:corrinoid (H 4 F:B 12 ) methyltransferase, whereas methanogens use tetrahydromethanopterin methyltransferase in the last step of methyl synthesis prior to acetyl synthesis (68). Expression of the H 4 F:B 12 methyltransferase from "Ca. Atribacteria" was relatively high in the deeper samples at 8.2 and 9.7 mbsf (Fig. 4B), where the steadily increasing abundance of 16S rRNA genes with depth reached their peak values (Fig. 1B).
"Ca. Atribacteria" expression of transporter-encoding genes for multiple types of sugar (i.e., hexose, hexulose, pentose, pentulose, and ribose) and glycosidases (Fig. 4B) indicates a potential preference for sugar-based substrates. Sugars are in anoxic sediments as an energy substrate from bacterial necromass (e.g., ribose-containing nucleic acids DNA and RNA) and could be used as a fermentation substrate (69). The only detectable gene expression from "Ca. Atribacteria" observed at a 15.9-m depth indicates sugar transport, peptidases, fructose biphosphate, and phosphogluconate aldolase activities (Fig. 4B), again pointing that utilization of necromass (sugars and peptides from dead cells) is an important activity for long-term survival.
Fermentation products and other oxidized organic substrates (i.e., alcohols, ketones, aldehydes, and carboxylic acids) become toxic if they accumulate in the cell (70,71). The gene expression from "Ca. Atribacteria" shows a metabolic potential for aldehydes to be reoxidized in secondary fermentations via a bacterial microcompartment (BMC), regenerating additional reduced ferredoxin and NADH in the process (72,73). We obtained three contigs from "Ca. Atribacteria" with ORFs encoding BMC shell proteins syntenous with Rnf, aldehyde dehydrogenase, and 2-deoxy-D-ribose 5-phosphate aldolase (DERA) (Fig. 4C), similar to the already reported genome assembled from "Ca. Atribacteria" (57). These genes were coexpressed at multiple depths (Fig. 4B), suggesting that the BMC and Rnf-based establishment of the Na ϩ ion gradient for ATP production are connected (Fig. 5). Collectively, these findings indicate that "Ca. Atribacteria" may use a BMC to recycle toxic intermediates produced during secondary fermentations for additional energy (Fig. 4 and 5). The ability to use a BMC to sustain secondary fermentations, in addition to primary fermentation of sugars ( Fig. 4 and 5), may optimize energy conservation (e.g., increase ATP yields per mole substrate oxidized) under the extreme energy-limiting subseafloor conditions. Similar to our results, a dominance of "Ca. Atribacteria" in organic lean sediments has been observed previously during the IODP expedition 313 to the New Jersey shelf (27). However, we acknowledge that biases inherent to all methods for determining the abundance of Bacteria and Archaea make quantitative comparisons difficult (5) and thus interpret the qPCR data showing a relatively lower abundance of Archaea (Fig. 1B) with caution.
Although fermenters and acetogens are often outcompeted by SRB in organic-rich subseafloor sediments, the poor fitness of SRB in the sampled environment ( Fig. 1B and  C), compared to that of "Ca. Atribacteria," may be related to the relatively low SO 4 2Ϫ reduction rates (Fig. 1A) and relatively low rates of organic matter burial at our sampled site (11,19). In contrast, Chloroflexi related to Dehalococcoidia, which are predicted (homo)acetogens with metabolic potential to degrade complex organic compounds (59,60), are relatively abundant in these deep-sea anoxic clays ( Fig. 1B and C) and transcribe genes involved in cell division (Fig. 3B). Our results lead us to speculate that organic-lean, anoxic abyssal subseafloor sediment thus represents a niche that favors the reproduction of Chloroflexi and "Ca. Atribacteria" over that of other microbes, including SRB. Moreover, in culture experiments from the first cultivated representative of the atribacteria, the presence of an H 2 -consuming methanogenic partner increased the growth rate of atribacteria by more than 100-fold (43). Based on this experimental result, we speculate that the relatively high abundance of the atribacteria in our sampled ancient anoxic sediment could be attributed to an as-of-yet-unidentified syntrophic or semisyntrophic partner organism, from the Chloroflexi, for example, Dehalococcoidia (74,75).
Our data show that reproduction can occur in the subseafloor over multimillion-year timescales. Although RNA from the "Ca. Atribacteria" was no longer detectable below 16 mbsf, metagenomes and 16S sequencing show that they subsist and remain the dominant group down to at least 29 mbsf. The extractable DNA with our protocol does not target extracellular DNA (20); thus, the DNA in these deeper sediments likely does not derive from "dead" DNA preserved from once-living cells. Therefore, we assume that "Ca. Atribacteria" remains viable, and potentially active, in the deeper sediments but at cell abundances and transcriptional activities that are too low for our current RNA-based methods. Thus, the survival of the "Ca. Atribacteria," associated with gene expression at relatively low levels, supports the hypothesis that reduced metabolic activity is a fitness advantage (16) in the energy-starved subseafloor.

MATERIALS AND METHODS
Sampling. All samples were taken by cruise KN223 of the R/V Knorr in the North Atlantic, from 26 October to 3 December 2014 (Woods Hole, MA, to Woods Hole, MA). At site KN223-15 (33°29.0=N, 54°10.0=W; water depth, 5,515 m), successively longer sediment cores were retrieved using a multicorer (ϳ0.4 m), gravity corer (ϳ3 m), and the Woods Hole Oceanographic Institution (WHOI) piston-coring device (ϳ28 m). Additional details of sampling are published (11). Dissolved oxygen concentrations in the core sections (see Fig. S1 in the supplemental material) were measured with optical O 2 sensors as described previously (76). Concentrations of dissolved SO 4 2Ϫ were measured as published previously (6), and SO 4 2Ϫ reduction rates were calculated as previously described (14). Sediment subcores were retrieved on the ship aseptically using end-cut sterile syringes and kept frozen at Ϫ80°C until extraction in the home laboratory.
DNA extraction, quantitative PCR, and 16S rRNA gene libraries. For each sampled depth, we performed four biological replicates of DNA extraction. Total DNA was extracted from 0.7 g of sediment as previously described (20). DNA templates were diluted 10 times in ultrapure PCR water (Roche) and used in qPCR amplifications with updated 16S rRNA gene primer pair 515F (5=-GTG YCA GCM GCC GCG GTA A-3=) and 806R (5=-GGA CTA CNV GGG TWT CTA AT-3=) to increase our coverage of Archaea and marine clades (77) and run as previously described (38). All qPCRs were set up in 20-l volumes with 4 l of DNA template, 20 l SsoAdvanced SYBR green Supermix (Bio-Rad), 4.8 l nuclease-free H 2 O (Roche), 0.4 l primers (10 M; biomers.net), and 0.4 l MgCl 2 and carried out on a CFX-Connect qPCR machine for gene quantification. For 16S rRNA genes, we ran 40 PCR cycles of two steps corresponding to denaturation at 95°C for 15 s and annealing and extension at 55°C for 30 s. To measure the abundance of dsrB genes, we used a previously described assay (78) with the primer pair dsrB4-R (5=-GTG TAG CAG TTA CCG CA-3=) and dsrB2060F (5=-CAA CAT CGT YCA YAC CCA GGG-3=). All qPCRs were set up in 20-l volumes with 4 l of DNA template and performed as previously described (79). Gel purified amplicons of the 16S rRNA, dsrB, and mcrA genes were quantified in triplicates using QuantiT dsDNA reagent (Life Technologies) and used as a standard. An EpMotion 5070 automated liquid handler (Eppendorf) was used to set up all qPCRs and prepare the standard curve dilution series spanning from 10 7 to 10 1 gene copies. Reaction efficiency values in all qPCR assays were between 90% and 110% with R 2 values of Ͼ0.95% for the standards.
For 16S rRNA gene library preparation, qPCR runs were performed with barcoded primer pair 515F and 806R. All 16S rRNA gene amplicons were purified from 1.5% agarose gels using the QIAquick gel extraction kit (Qiagen), quantified with the Qubit dsDNA HS assay kit (Thermo Fisher Scientific), normalized to 1 nM solutions, and pooled (38). Library preparation was carried out according to the MiniSeq System Denature and Dilute Libraries guide (Illumina). Sequencing was performed on all four biological replicates (Fig. S2) on the Illumina MiniSeq platform at the Geo-Bio LMU Center. We used USEARCH version 10.0.240 for MiniSeq read trimming and assembly, OTU picking, and 97% sequence identity clustering (80), which we showed previously captures an accurate diversity represented within mock communities sequenced on the same platform (38). OTU representative sequences were identified by BLASTn searches against SILVA database version 132 (81). To identify contaminants, 16S rRNA genes from extraction blanks and dust samples from the lab were also sequenced in triplicates (38). These 16S rRNA gene sequences were used to identify any contaminating bacteria (e.g., Acinetobacter, Bacillus, and Staphylococcus) and selectively curate the OTU plotted in a maximum likelihood RAxML phylogenetic tree (83) (Fig. S3) against the SILVA 16S rRNA SSU NR99 reference database version 132 (81) using ARB (84). Closest environmental sequences with nearly full-length sequences (Ͼ1,400 bp) were selected as taxonomic references and used to calculate trees implemented with the bacterial and archaeal filter and advanced bootstrap refinement selecting the best tree among 300 replicates (84). Partial OTU sequences were added to the tree using the maximum parsimony algorithm without allowing changes of tree topology ( Fig. 2 and S3). Metagenome libraries. Whole-genome amplifications were performed on DNA extracts at 10ϫ dilutions through a multiple displacement amplification (MDA) step of 6 to 7 h, using the REPLI-g Midi kit (Qiagen) and according to the manufacturer's instructions. Dilution of the extracted DNA was necessary, because the MDA reaction was inhibited in undiluted DNA extracts (presumably from coextracted inhibitory chemicals). We added SYBR green I (Invitrogen) at 1,000ϫ concentration to visualize the amplification in real time on the CFX-Connect qPCR machine with a fluorescence reading taken every 10 min. Amplification was stopped after reaching the exponential increase by heating to 65°C for 3 min. Metagenomic libraries were prepared using the Nextera XT DNA Library Prep kit (Illumina), quantified on an Agilent 2100 bioanalyzer system (Agilent Genomics), and normalized with the Selecta-Size DNA Clean and Concentrator MagBead kit (Zymo Research) as previously described (20).
RNA extractions and metatranscriptome libraries. Total RNA extractions were obtained according to a previously published protocol (85). In brief, RNA was extracted from 3 g of sediments using the FastRNA Pro Soil-Direct kit (MP Biomedicals) according to the manufacturer's instructions, with the addition of 4 l glycogen (0.1 g ml Ϫ1 ) to increase yield during precipitation of the RNA pellet and final elution in 40 l PCR-grade water (Roche). Extraction blanks (no RNA added) were processed alongside to assess laboratory contamination, and sequencing of these contamination controls revealed common laboratory contaminants found in dust samples from our lab in 16S rRNA gene surveys (38), including Pseudomonas, Rhizobium, Acinetobacter, and Staphylococcus. RNA extracts were quantified using the QuBit RNA HS assay kit (Thermo Fisher Scientific). DNase treatment, synthesis of cDNA, and library construction were performed on the same day from 10 l of RNA templates using the Trio RNA-Seq kit protocol (NuGEN Technologies). Libraries were quantified as described above. All libraries were diluted to 1 nM and pooled for further sequencing on the MiniSeq platform (Illumina).
Gene identification and normalization in metagenomes. The SqueezeMeta (86) metagenomic analysis pipeline was used for downstream analysis of metagenomic reads in coassembly mode. For adapter removing, trimming, and quality filtering, we used Trimmomatic set to the following: leading, 8; trailing, 8; sliding window, 10:15; and minimum length, 30 (87). Contigs were assembled to minimum length of 200 bp using Megahit assembler (88). ORFs for genes and rRNAs were called using Prodigal (89), and rRNA genes were determined by barrnap (90). RDP classifier were used for the classification of 16S rRNA genes (91). Diamond software (92) was deployed for taxonomic assignment of retrieved gene homologies against the GenBank, eggNOG v. 4.5 (93), and KEGG (94) databases. Cutoff values for assigning hits to specific taxa were performed at an E value of 1 ϫ e Ϫ3 , minimum amino acid similarity of 40 for taxa and 30 for functional assignment, using SqueezeMeta with default settings. Reads were mapped onto contigs and genes by using Bowtie2 (95). Coverage and transcripts per million (TPM) values were calculated using SqueezeMeta. For binning, we used MaxBin 2.0 (96) and MetaBAT (97), and bins generated by the two different algorithms were merged into one single data set using DAS Tool (98). Bin completeness and contamination were checked using CheckM (99). Further analysis of metagenomeassembled genomes (MAGs) based on results from SqueezeMeta was achieved using Anvi'o v. 6.2 (32) and MAGs selected by DAS tool further refined manually based on hierarchical clustering of contigs. The complete workflow using DAS Tool separates 14 bins whose taxonomic affiliations are uncertain. Manual curation of these results points to "Candidatus Aerophobetes," Chloroflexi, and "Ca. Atribacteria" as potential bin affiliations (Table S2). In comparison, from the manual curation of MaxBin results, we produced 31 bins (17% to 59%), among which, 12 bins are clearly assigned to "Ca. Atribacteria" with different levels of genome completeness (Fig. S4).
Because the MDA step produced many short fragments that did not allow high-quality binning and full genome completion, assigning a taxonomic affiliation to metagenomic and metatranscriptomic data is challenging (100). For annotating putative functions of ORFs in metagenomes and metatranscriptomes from particular "higher-level" taxonomic groups of microorganisms (34,35), we also applied a bioinformatics pipeline whereby protein-encoding ORFs were extracted from de novo assembled contigs using FragGeneScan v. 1.30 (101) and functionally annotated against a large aggregated genome database (20,34,35) containing predicted proteins from all protist, fungal, bacterial, and archaeal genomes and MAGs in the JGI and NCBI databases using DIAMOND version 0.9.24 (92). This database, which we refer to as MetaProt, also contained all ORFs from all of the transcriptomes of microbial eukaryotes from the MMETS project (102), and we removed any hits to photosynthetic eukaryotic algae as contaminants. This custom MetaProt database that we used for this study is available as a single 32 GB amino acid fasta file on the LMU Open Data website (https://data.ub.uni-muenchen.de/183/). Cutoff values for assigning hits to specific taxa were performed at a minimum bit score of 50, minimum amino acid similarity of 60, and an alignment length of 50 residues. All scripts and code used to produce the analysis have been posted on GitHub (https://github.com/williamorsi/MetaProt-database), and we provide a link to the MetaProt on the GitHub page as well as instructions within the scripts regarding how to conduct the workflows that we used. This approach assigns ORFs to higher-level taxonomic groups (35). As is the case in all metagenomic studies, the incomplete nature of genomes in databases, together with the lower representation of sequenced genomes from candidate clades than from cultured ones, makes it likely that our pipeline misses annotation of ORFs that are derived from as-yet-unsequenced atribacterial genomes. We acknowledge that some genes in databases annotated as being present in "Ca. Atribacteria" might have been assigned to bins according to criteria that differ from study to study (Fig. S7).
Gene identification and normalization in metatranscriptomes. Paired-end reads were trimmed and assembled into contigs using CLC Genomics Workbench 9.5.4 (Qiagen, Hilden, Germany), using a word size of 20, bubble size of 50, and a minimum contig length of 300 nucleotides. Reads were then mapped to the contigs using the following parameters (mismatch penalty, 3; insertion penalty, 3; deletion penalty, 3; minimum alignment length, 50% of read length; minimum percent identity, 95%). Coverage values were obtained from the number of reads mapped to a contig divided by its length (i.e., average coverage). Only contigs with an average coverage of Ͼ5 were selected for ORF searches and downstream analysis (20,34,85). This protocol does not assemble rRNA; thus, results are only discussed in terms of mRNA. We then performed even further stringency controls by removing any contig that had less than 5ϫ coverage, e.g., reads per kilobase mapped (RPKM). The final resulting data set of contigs was then used for ORF searches and annotation against the MetaProt aggregate database, as described above.
For the metatranscriptomes, normalization of the relative abundance of ORFs was based on the number of unique ORFs assigned to a group (e.g., "Ca. Atribacteria"), a fractional percentage of total ORFs detected. We normalized expression in this manner as opposed to more conventional procedures such as RPKM because we found that the transcriptome sequencing (RNA-seq) kit we used has an amplification step (SPIA amplification, Trio RNAseq Ovation kit; NuGen) that biases the relative abundance of reads mapping to contigs when normalized using RPKM. For example, the RPKM value for the same ORF across technical replicates was found to have very large (orders of magnitude) variability in RPKM. In contrast, the total number of unique ORFs (e.g., presence/absence of an expressed ORF) assigned to specific groups (e.g., "Ca. Atribacteria") was highly consistent between technical replicates. We assume that this technical variation in the RNA-seq data is associated with randomized SPIA amplification of different transcripts and/or fluctuations in the number of mRNA molecules in technical replicate tubes due to the highly labile nature of RNA during the extraction and library prep procedure. For this reason, we normalized the relative abundance of ORFs assigned to a specific group based on presence/absence of expressed ORFs, which was highly consistent between technical replicates despite the SPIA amplification. If significantly higher numbers of unique ORFs are detected from a particular group than from other groups, it can be attributed to relatively higher transcriptional activity. COG categories were assigned by searching the ORFs against the COG database (103) using BLASTp. Metagenomic raw reads and metatranscriptomic atribacterial contigs were mapped with high stringency against a previously sequenced subseafloor MAG of "Ca. Atribacteria" that has relatively high completeness (88%) as a reference (104), using Geneious 8.1.9. The average coverage after mapping against GenBank reference no. NCRO00000000.1 to 0.700 (104) was 224 (Ϯ131), and 93% of ORFs in the reference MAG were detected in our subseafloor metagenomes and metatranscriptomes. Consensus sequences were exported and annotated using the online tool RAST v. 2.0 (105). Taxonomic assignment of protein-encoding genes to "Ca. Atribacteria" clade JS1 was further confirmed in our metagenomes (Fig. S6) by selecting and aligning 31 phylogenomic markers from the corresponding metagenomic ORFs and 36 atribacterial reference genomes obtained from the NCBI database by using AMPHORA2 (33). Statistical analyses of beta-diversity were performed using RStudio v. 3.3.3 with the Bioconductor package (106).
Data availability. Data are publicly available through NCBI BioProject PRJNA590088. Metagenomes and metatranscriptomes have accession numbers SAMN13317858 to SAMN13317880. The 16S data are available in SRA BioSample accessions SAMN10929403 to SAMN10929517 and SAMN13324854 to SAMN13324920. Additional data related to this paper may be requested from the authors.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only.   We declare no competing interests.