Ecological and Genomic Attributes of Novel Bacterial Taxa That Thrive in Subsurface Soil Horizons

use carbon monoxide (CO) as a supplemental energy source, and the ability to form spores. Together these attributes likely allow members of the candidate phylum Dormibacteraeota to flourish in deeper soils and provide insight into the survival and growth strategies employed by the microbes that thrive in oligotrophic soil environments.
IMPORTANCE Soil profiles are rarely homogeneous. Resource availability and microbial abundances typically decrease with soil depth, but microbes found in deeper horizons are still important components of terrestrial ecosystems. By studying 20 soil profiles across the United States, we documented consistent changes in soil bacterial and archaeal communities with depth. Deeper soils harbored communities distinct from those of the more commonly studied surface horizons. Most notably, we found that the candidate phylum Dormibacteraeota (formerly AD3) was often dominant in subsurface soils, and we used genomes from uncultivated members of this group to identify why these taxa are able to thrive in such resource-limited environments. Simply digging deeper into soil can reveal a surprising number of novel microbes with unique adaptations to oligotrophic subsurface conditions.
KEYWORDS soil microbiology, metagenomics, microbial traits, critical zone, microbial ecology
Subsurface soils often differ from surface horizons with respect to their pH, texture, moisture levels, nutrient concentrations, clay mineralogy, pore networks, redox state, and bulk densities. Globally, the top 20 cm of soil contains nearly five times more organic carbon (C) than soil in the bottom 20 cm of meter-deep profiles (1). In addition, residence times of organic C pools are typically far longer in deeper soil horizons (2), suggesting that much of the soil organic matter found in the subsurface is not readily utilized by microbes.
Unsurprisingly, the strong resource gradient observed through most soil profiles is generally associated with large declines in microbial biomass (3-8); per gram soil, microbial biomass is typically 1 to 2 orders of magnitude lower in the subsurface than in surface horizons (4, 6, 7). Although microbial abundances in deeper soils are relatively low on a per gram soil basis, the cumulative biomass of microbes inhabiting deeper soil horizons can be on par with that living in surface soils, owing to the large mass and volume of subsurface horizons (3, 5). Moreover, those microbes living in deeper horizons can play important roles in mediating a myriad of biogeochemical processes, including processes associated with soil C and nitrogen (N) dynamics (9, 10), soil formation (11), iron redox reactions (12, 13), and pollutant degradation (14).
Given that soil properties typically change dramatically with depth, it is not surprising that the composition of soil microbial communities also generally changes with depth through a given profile (4-6, 8, 15, 16). In some cases, the differences observed in microbial communities with depth through a single soil profile can be large enough to be evident even at the phylum level of resolution. For example, both Chloroflexi (15, 17) and Nitrospirae (15) may increase in relative abundance with depth. However, while previous work suggests that particular taxa can be relatively more abundant in deeper soils, it is unclear if such patterns are consistent across distinct soil and ecosystem types. We hypothesized that there are specific groups of soil bacteria and archaea that are typically rare in surface horizons but more abundant in deeper soils. Taxa that are proportionally more abundant in deeper soil horizons likely have slow-growing, oligotrophic life history strategies due to the lack of disturbance at depth and the low-resource conditions typical of most deeper soil horizons (18). Likewise, we expect deeper soils to harbor higher proportions of novel and undescribed microbial lineages, given that oligotrophic taxa are typically less amenable to in vitro, cultivation-based investigations (19).
We designed a comprehensive study to investigate how soil bacterial and archaeal communities change with soil profile depth, to identify taxa that are consistently more abundant in deeper horizons, and to determine what life history strategies enable these taxa to thrive in the resource-limited conditions typical of most subsurface horizons. We collected soil samples at 10-cm increments from 20 soil profiles representing a wide range of ecosystem types throughout the United States, with most of the profiles sampled to 1 m in depth. We examined the bacterial and archaeal communities of these soil profiles by pairing amplicon 16S rRNA gene sequencing with shotgun metagenomic sequencing on a subset of samples. We found that deeper soil horizons typically harbored more undescribed bacterial and archaeal lineages, and we identified specific phyla (including Dormibacteraeota, GAL15, Chloroflexi, Euryarchaeota, and Nitrospirae) that consistently increased in relative abundance with depth across multiple profiles. Moreover, we found one candidate phylum (Dormibacteraeota, formerly AD3) to be particularly abundant in deeper soil horizons with low organic C concentrations. From our metagenomic data, we were able to assemble genomes from representative members of this candidate phylum and document the life history strategies, including low maximum growth rates and spore-forming potential, that are likely advantageous under low-resource conditions.

RESULTS AND DISCUSSION
Sample descriptions and soil properties linked to soil depth. We collected soils from a network of 10 current and former Critical Zone Observatories (CZOs) located across the United States (Fig. 1A) that span a broad range of hydrogeological provinces, soil orders, and ecosystem types, including tropical forest, temperate forest, grassland, and cropland sites. Soils were sampled from two distinct profiles per CZO for a total of 20 different soil profiles. Details of the site characteristics and edaphic properties for each of the 20 soil profiles are provided in Data Set S1 in the supplemental material. Soils were collected from the first meter (where possible) of freshly excavated profiles, sampling at 10-cm increments and focusing on mineral soil horizons only (O horizons, if present, were not sampled). Together, this collection yielded 179 individual soil samples collected across sites with a wide range of different climatic conditions (e.g., mean annual temperatures ranging between 5 and 23°C and mean annual precipitation ranging from 26 to 402 cm yr⁻¹) (Data Set S1). The sampled profiles ranged from poorly developed Entisols and Inceptisols to highly developed Oxisols and Ultisols (per the USDA soil taxonomy system) and reflected an extremely broad range of soil properties. For example, in the 0- to 10-cm depth increment, soil pH ranged from 3.3 to 9.8, organic carbon concentrations ranged from 1.3% to 21.6%, and texture ranged from 0% to 45% silt plus clay across the profiles.
FIG 1 (D) The proportion of 16S rRNA gene sequences from the sampled soils for which representative genome data are available decreases with depth. We matched our 16S rRNA gene amplicon sequences to 16S rRNA genes from finished bacterial and archaeal genomes in the NCBI database. At deeper soil depths, we found that fewer taxa in our data set had matches to publicly available genomes, indicating that the bacterial and archaeal taxa found in deeper soil horizons are less represented in genomic databases than those found in surface soils. More details on these analyses are presented in Materials and Methods. The purple trend lines represent smoothed conditional means, generated using the loess modeling method.
Some soil properties changed consistently with depth across all 20 profiles. Total N and organic C concentrations were both negatively correlated with soil depth, in agreement with previous observations (1, 20) (depth versus %C rho = -0.61, P < 0.001; depth versus %N rho = -0.56, P < 0.001; Spearman). On average, soil total organic C concentrations below 50 cm were 4.4 times lower than in surface soils, while total N concentrations were 6.3 times lower. While we measured a suite of additional chemical and soil properties (Data Set S1), only clay concentrations exhibited consistent changes with depth (with percent clay generally increasing with depth; rho = 0.29, P < 0.001; Spearman). Given that our sampling effort included a wide range of different soil types and the expectedly high degree of variability in inter- and intraprofile edaphic characteristics, our goal was not to determine if distinct soil samples harbored distinct microbial communities or to characterize the factors related to shifts in overall community composition. Rather, our goal was to determine if there were any consistent changes in soil microbial communities with depth across the 20 sampled profiles.
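The depth relationships above are reported as Spearman rank correlations. The sketch below shows in plain Python how such a coefficient can be computed. It is not the code used in this study (the analyses were run in R), and the function names and the example depth and carbon values are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman rho is the Pearson correlation of the rank-transformed data."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical profile: midpoint depth (cm) and percent organic C.
depth_cm = [5, 15, 25, 35, 45]
percent_c = [4.0, 2.1, 1.0, 0.6, 0.5]
rho = spearman_rho(depth_cm, percent_c)
```

Because the coefficient depends only on ranks, it is insensitive to the strongly nonlinear decline of organic C with depth that is typical of soil profiles.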
Community characteristics linked to soil depth. Unsurprisingly, we found that the location of each soil profile had a strong influence on the composition of soil bacterial and archaeal communities, as determined by 16S rRNA gene amplicon sequencing (r = 0.47, P < 0.001, permutational multivariate analysis of variance [PERMANOVA]). Individual soil profiles generally harbored distinct microbial communities (Fig. 2; Fig. S1). In addition to this variation across the profiles, soil depth also had a significant effect on the composition of the bacterial and archaeal communities within individual profiles (P < 0.01 for 16 of 20 profiles, rho values ranging from 0.24 to 0.45). In general, the variation in community composition with depth within a given profile, while significant, was less than the differences in soil communities observed across different profiles when all profiles and soil depths were examined together (depth, r = 0.02, P < 0.001; location, r = 0.47, P < 0.001, PERMANOVA).
FIG 2 Here, we show the relative abundances of the eight most abundant phyla identified from our 16S rRNA gene amplicon data. Not all profiles were sampled to 1 m due to variable bedrock depth. Note that the two profiles sampled from each CZO site were selected to represent distinct soil types (details on soil characteristics are available in Data Set S1 in the supplemental material).
Several characteristics of the bacterial and archaeal communities changed consistently with depth despite the high degree of heterogeneity observed across the different soil profiles. As soil depth increased, microbial communities became increasingly dissimilar to those found in surface horizons (Fig. 1B). When we analyzed the entire sample set together, dissimilarity to surface soils (0- to 10-cm depth) was positively correlated with depth (P < 0.001, rho = 0.73, Spearman). This trend also held for 17 out of 20 individual soil profiles (depth was not a significant predictor at either Eel River site or at IML site 1). We also found that, in general, the diversity of microbial communities decreased with depth, with several CZOs exhibiting stronger declines with depth than others (Calhoun, Luquillo, and South Sierra) (Fig. 1C). Lastly, when we compared the 16S rRNA gene sequences from this study to the 16S rRNA gene sequences from finished bacterial and archaeal genomes in the NCBI database, we found that the proportion of taxa for which genomic data are available declined with depth (from 6.2 to 26.1% in surface soils to 1.9 to 18.0% in the deepest horizons sampled) (Fig. 1D). Although representative genomes are currently unavailable for the majority of soil bacterial and archaeal taxa (21), we found that this problem is exacerbated for taxa living in deep soils.
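The dissimilarity and diversity metrics discussed above are Bray-Curtis dissimilarity on Hellinger-transformed tables and Shannon diversity. As a hedged illustration, the following Python sketch implements the standard definitions of these three quantities. It is a stand-in for explanation, not the R code used in this study, and the sample counts are invented.

```python
import math


def hellinger(counts):
    """Hellinger transform: square root of each taxon's relative abundance."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no sequences")
    return [math.sqrt(c / total) for c in counts]


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator


def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i), skipping absent taxa."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical ESV counts for a surface sample and a deep sample.
surface = [120, 80, 40, 10, 0]
deep = [5, 10, 0, 60, 175]

dissimilarity_to_surface = bray_curtis(hellinger(surface), hellinger(deep))
surface_diversity = shannon(surface)
deep_diversity = shannon(deep)
```

Computing the dissimilarity of every depth increment against the 0- to 10-cm sample of the same profile gives one value per sample, which can then be correlated with depth.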
Taxonomic shifts with soil depth. Although each soil profile harbored distinct microbial communities (Fig. 2), we identified five phyla that consistently increased in abundance with soil depth, as measured by Spearman correlations across the entire data set: Chloroflexi, Euryarchaeota, Nitrospirae, and the candidate phyla Dormibacteraeota and GAL15 (Fig. 3) (false discovery rate [FDR]-corrected P values < 0.02, rho > 0.22). For example, Dormibacteraeota were, on average, 27 times more abundant in soils at 90 cm than in surface horizons. The candidate phylum Dormibacteraeota, Chloroflexi, and Nitrospirae have previously been found to increase in abundance with increasing soil depth in individual profiles (15, 17), while candidate phylum GAL15 has been shown to be abundant in oxic subsurface sediments (22). Members of these phyla are likely oligotrophic taxa adapted to survive under the resource-limited conditions found in deeper horizons. Indeed, soil Euryarchaeota (23), Chloroflexi, and Nitrospirae (24) have been shown to decrease in relative abundance upon soil fertilization. These five phyla are also underrepresented in public genome databases; together, they account for only 2.8% of bacterial and archaeal genomes deposited in the IMG database (as of December 2018), reinforcing our observation (highlighted in Fig. 1D) that bacteria and archaea living in deeper soils are underrepresented in genome databases.
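Testing many phyla against depth requires a correction for multiple comparisons, and the P values above are FDR corrected. The following Python sketch shows the Benjamini-Hochberg procedure, the standard FDR adjustment. It illustrates the technique only, and the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    n = len(pvalues)
    # Walk from the largest P value down so a running minimum enforces
    # monotonicity of the adjusted values.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = n - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for four phylum-versus-depth correlations.
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adjusted_p]
```

Each raw P value is scaled by the number of tests divided by its rank, so the smallest P values are penalized most heavily.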
Community-level shotgun metagenomic analyses. We selected one soil profile from each of the CZOs for metagenomic sequencing, selecting the profile that displayed the most community dissimilarity through depth (Eel River samples were not analyzed for logistical reasons). In total, we obtained shotgun metagenomic data from 67 soil samples with an average of 7.84 million quality-filtered reads per sample (see Table S1 for details). We first used these metagenomic data to quantify changes in the relative abundances of the bacterial, archaeal, and eukaryotic domains with depth. The overwhelming majority of rRNA gene sequences that we detected were from bacteria (89.2% to 98.7% of reads), followed by archaea (0.03% to 7.70%), and then eukaryotes (0.04% to 4.27%). Interestingly, we found that the proportion of eukaryotic sequences in our samples decreased with depth (rho = -0.32, P = 0.05). Most of these eukaryotic rRNA gene reads were classified as Fungi (58%), followed by Charophyta (16%), Metazoa (9.3%), and Cercozoa (7.0%). These results are in line with previous work showing that the contributions of eukaryotes, most notably fungi, to microbial biomass pools typically decrease with soil depth (25).
We also directly compared the results obtained from our 16S rRNA amplicon and shotgun metagenomic sequencing across the same set of samples. We did this to check whether our PCR primers introduced significant biases in the estimation of taxon relative abundances. We found that the shotgun and amplicon-based estimations of abundance for the most ubiquitous and abundant phyla we observed across the sampled profiles (Fig. 2) were well correlated (Fig. S2, mean rho = 0.70). Next, we checked whether our primers missed any major groups of bacteria or archaea, as it has been noted that many taxa from the Candidate Phyla Radiation (CPR; recently assigned to the superphylum Patescibacteria) (26) are not detectable with the primer set used here (27). While we found that our primer pair did fail to recover sequences from the superphylum Patescibacteria, these taxa were rare in our data; the entire superphylum accounted for only 0.5% of 16S rRNA gene reads across the whole metagenomic data set.
Candidate phylum Dormibacteraeota is relatively more abundant in soils with low organic carbon. We found that members of candidate phylum Dormibacteraeota were consistently more abundant in deeper soil horizons and particularly abundant in subsurface horizons from the Calhoun and Shale Hills CZOs (Fig. 4). In these soils, Dormibacteraeota dominated the microbial communities; in some samples, over 60% of 16S rRNA sequences were classified as belonging to members of the Dormibacteraeota candidate phylum. The high abundances of Dormibacteraeota were confirmed with shotgun metagenomic analyses (Fig. S2), indicating that the abundances of this phylum were not inflated by PCR primer biases. The candidate phylum Dormibacteraeota was first observed in a sandy, highly weathered soil from Virginia, United States (28), and does not yet have a representative cultured isolate. The phylum was previously known as "AD3" but was renamed Dormibacteraeota after three genomes from the phylum were assembled from Antarctic soils (29). Other representative genomes from this phylum have also become available with the recent addition of 47 genomes assembled from thawing permafrost (30). The phylum Dormibacteraeota has been observed in subsurface soil horizons previously (31, 32), and its relative abundance has been found to be negatively correlated with water content, C, N, and total potential enzyme activities (17).
While the abundance of members of the phylum Dormibacteraeota was generally positively correlated with depth across all samples included in this study (rho = 0.22, P = 0.02, Spearman), this pattern did not hold for all profiles (Fig. 4). Instead, we found organic C concentrations to be the best predictor of the abundance of Dormibacteraeota in these soil communities (Fig. S3); Dormibacteraeota were typically eight times more abundant in soils with less than 1% organic C than in soils where organic C concentrations were greater than 2%. Because soil depth and organic C concentrations were correlated across the profiles studied here, we used an independent data set of surface soils (0- to 10-cm depth) collected from 1,006 sites across Australia to determine if the abundances of Dormibacteraeota were also correlated with organic C concentrations when analyses were restricted to a broad range of distinct surface soils (33). Indeed, we found that the relative abundances of Dormibacteraeota in Australian surface soils (which ranged from 0.0 to 7.0% of 16S rRNA gene sequences) were also negatively correlated with soil organic carbon concentrations (Fig. S3). Together, these results indicate that members of the Dormibacteraeota phylum are typically most abundant in surface or subsurface soils where organic C concentrations are relatively low.
Dormibacteraeota draft genomes recovered from metagenomic data. To gain more insight into the potential traits and genomic attributes of soil Dormibacteraeota, we conducted deeper shotgun metagenomic sequencing on several soils where Dormibacteraeota were found to be particularly abundant (Fig. 4A), with the goal of assembling draft genomes from members of this group. We assembled two Dormibacteraeota genomes, both from deep soils (Fig. 4). These genomes are considered medium-quality drafts, according to published genome reporting standards for metagenome-assembled genomes (MAGs) (34); bin 3 is estimated to be 72.6% complete at 3.43 Mb, while bin JG-37 is 69.9% complete at 2.48 Mb (see further genome details in Table S1). These genomes are similar in size to those previously assembled from the phylum (range of 3.0 to 5.3 Mb, all >90% complete [29]; range of 1.6 to
FIG 4 (A) The 16S rRNA gene relative abundance of phylum Dormibacteraeota is variable across different soil profiles but generally increases with depth. The samples used for the Dormibacteraeota genome assemblies are noted with stars. The trend lines represent smoothed conditional means, generated using the loess modeling method. (B) The two Dormibacteraeota genomes we assembled from soil profile metagenomic data cluster phylogenetically with previously published Dormibacteraeota genomes. Our deep soil genomes also fall near the known sister phyla Chloroflexi and Armatimonadetes, validating their identity as members of candidate phylum Dormibacteraeota. This tree was created using the concatenated marker gene phylogeny generated from GTDBTk (26) and was plotted using iTOL (70). Only closely related phyla are included in the tree. Genomes assembled in this study are indicated in red, and all other AD3/Dormibacteraeota genomes originated from either reference 29 or 30. The family groupings for the Dormibacteraeota tree were first presented in reference 30.
Analyses of the Dormibacteraeota genomes that we recovered indicate that members of this phylum are aerobic heterotrophs adapted to nutrient-poor conditions. Both Dormibacteraeota genomes encode high-affinity terminal oxidases, indicative of an aerobic metabolism (cbb3 oxidase, bin JG-37; bd oxidase, bin 3). These genomes contain no markers of an autotrophic metabolism, with no RuBisCO or hydrogenase genes detected in either of the assembled genomes. Both genomes encode glycosyl hydrolases (with bin 3 containing 14 of these genes in total), indicating an ability to use polysaccharides for growth. Specifically, both genomes contain genes for glycogen breakdown (alpha-amylase, glucoamylases) and synthesis (glycogen synthase). The ability to synthesize, store, and break down glycogen has been shown to promote the survival of bacteria during periods of starvation (36, 37). Additionally, both Dormibacteraeota genomes contain the trehalose 6-phosphate synthase gene, a key gene in the pathway for the synthesis of trehalose, a C storage compound that also confers resistance to osmotic stress and heat shock (37) and can protect cells from oxidative damage, freezing, thermal injury, or desiccation (38). These attributes likely confer an advantage in resource-limited soils, as the ability to store C for later use may be advantageous in environments where organic C is infrequently available or of low quality.
Based on several lines of evidence, soil-dwelling Dormibacteraeota appear to be oligotrophic taxa with low maximum growth rates. First, as mentioned above, these taxa have the highest relative abundances in soils with low organic C concentrations, where we would expect oligotrophic lifestyles to be advantageous. Second, both Dormibacteraeota genomes appear to contain a single rRNA operon, a feature often linked to low maximum potential growth rates (39). Third, although we cannot directly measure the maximum growth rate of uncultivated bacterial cells, we can estimate maximum growth rate from genomes by measuring codon usage bias with the ΔENC′ metric (40). ΔENC′ is a measure of codon bias in highly expressed genes and has been shown to correlate strongly with growth rate for both bacteria and archaea (41). We calculated ΔENC′ for our Dormibacteraeota genomes, the Antarctic Dormibacteraeota genomes (29), the thawing permafrost Dormibacteraeota genomes (30), and a set of bacterial and archaeal genomes which matched the 16S rRNA gene amplicon sequences recovered from our soil profile samples at ≥99% sequence similarity. The ΔENC′ values for all the Dormibacteraeota genomes clustered together toward the lower end of the spectrum for our set of soil bacteria and archaea, indicating that members of the phylum Dormibacteraeota are likely to exhibit low potential growth rates (Fig. S4).
To our knowledge, all previous Dormibacteraeota genomes were recovered from either Antarctic desert (29) or permafrost soils (30), while our genomes hail from subsurface soils collected from temperate regions. Despite these disparate origins, some central characteristics of the phylum Dormibacteraeota appear to be consistent. Similar to the Antarctic and permafrost Dormibacteraeota genomes, our Dormibacteraeota genomes also contained carbon monoxide (CO) dehydrogenase genes. There are two forms of CO dehydrogenases, which differ in their ability to oxidize CO and the rate at which they do so (42). While the active site of form I is specific to CO dehydrogenases, form II active sites also occur in many molybdenum hydroxylases that do not accept CO as a substrate (42). Using sequence data from our assembled Dormibacteraeota genomes, the Antarctic and permafrost Dormibacteraeota genomes, and selected CO dehydrogenase large-subunit sequences (CoxL), we generated a phylogenetic tree based on the amino acid sequence of CoxL (Fig. S5). With these analyses, we found that both of the Dormibacteraeota genomes recovered here possess form II CO dehydrogenase genes, as do many of the Antarctic and permafrost Dormibacteraeota genomes. Although it has been shown that form II CO dehydrogenases can permit growth with CO as a sole C and energy source in some cases (43), further work is needed to determine whether the form II CO dehydrogenase genes allow Dormibacteraeota to actively oxidize CO or if these genes code for molybdenum-containing hydroxylases responsible for other metabolic processes (44). Interestingly, one Antarctic and many permafrost Dormibacteraeota genomes also encode form I CoxL, indicating that some members of this phylum are capable of CO oxidation (Fig. S5). CO oxidation with form I CO dehydrogenases has been shown to improve the survival of bacterial cells under nutrient-limited conditions (45).
Analyses of our assembled Dormibacteraeota genomes also reveal that these soil bacteria may be capable of spore formation. Altogether, our Dormibacteraeota genomes contain 34 spore-related genes scattered across a variety of spore generation phases (Table S2). We also found spore-forming genes among the Antarctic and permafrost Dormibacteraeota genomes, most notably those encoding SpoIIE, SpoIIM, SpoIIIE, and SpoVS. Nutrient-limiting conditions are known to trigger spore formation (46), and sporulation can allow bacterial cells to persist until environmental conditions become more favorable. Additionally, members of the Chloroflexi, a sister phylum to Dormibacteraeota, are capable of spore formation (47). Because there are no Dormibacteraeota isolates available to test for sporulation, we adapted a method previously used in stool samples (48) to identify potential spore-forming taxa by using a culture-independent approach. We incubated three soil samples from our study in 70% ethanol to kill vegetative cells and then used propidium monoazide (PMA) to block the amplification of DNA from these dead cells (49). We then sequenced these soils using our standard 16S rRNA gene amplicon method both with and without the ethanol and PMA treatment. We found that the abundances of the two dominant Dormibacteraeota phylotypes were significantly higher in the spore-selected treatment than in the untreated controls (Table S3). Other known spore formers were enriched in the spore selection treatment as well, including taxa from the orders Actinomycetales, Bacillales (48), Myxococcales (50), and Thermogemmatisporales (51). While the enrichment of Dormibacteraeota in ethanol-treated samples shows that these cells are hardy, it is not conclusive proof of spore formation, and further testing is needed to verify our findings (bacteria can resist ethanol by other means, such as biofilm formation [52] and residence inside other cells [53]).
Conclusions. Our results indicate that as soil depth increases, not only do bacterial and archaeal communities become less diverse and change in composition, but novel, understudied taxa become proportionally more abundant in deeper soil horizons. We identified five poorly studied bacterial and archaeal phyla that become more abundant in deeper soils across a broad range of locations and investigated one of these further (the candidate phylum Dormibacteraeota, formerly AD3) to determine what characteristics may allow Dormibacteraeota to survive in resource-limited soil environments. We found that members of Dormibacteraeota are likely slow-growing aerobic heterotrophs capable of persisting under low-resource conditions by putatively storing and processing glycogen and trehalose. Members of this candidate phylum also contain form I and form II carbon monoxide dehydrogenases, which can potentially enable the use of trace amounts of CO as a supplemental energy source. We also found that soil-dwelling Dormibacteraeota are likely capable of sporulation, another trait that may allow cells to persist during periods of limited resource availability. More generally, analyses of these novel members of understudied phyla suggest life history strategies and traits that may be employed by oligotrophic microbes to thrive under resource-limited soil conditions.

MATERIALS AND METHODS
Volunteers were asked to sample in 10-cm increments to a depth of at least 100 cm or to refusal. Site details are available in Data Set S1 in the supplemental material.
All soil samples were sent to the University of California, Riverside, for processing. A portion of each field sample was sieved (<2 mm, ASTM no. 10), homogenized, and divided into subsamples for further analyses, with subsamples stored at either 4°C, -20°C, or -80°C. For some soils (particularly some wet, finely textured depth intervals), sieving was not practical. These samples were homogenized by mixing, with larger root and rock fragments removed by hand. In addition, as samples from Shale Hills site 2 (70- to 100-cm depth) consisted almost entirely of medium-sized rocks, soil was collected by manually crushing rocks with a ceramic mortar and pestle; this material was then passed through a 2-mm sieve.
DNA was extracted from subsamples frozen at -20°C using the DNeasy PowerLyzer PowerSoil kit (Qiagen, Germantown, MD, USA) according to the manufacturer's instructions, with minor modifications to increase yield and final DNA concentration based on the assumption that some sites and depths would have a relatively low microbial biomass. Specifically, 0.25 g of soil was weighed in triplicate (i.e., three 0.25-g aliquots = 0.75 g total soil per sample) from one frozen aliquot of sieved soil. Extractions on each 0.25-g replicate aliquot proceeded in parallel, until the stage when DNA was eluted onto the spin filter; replicates were pooled at this point onto a single filter, and extractions proceeded from this point as a single sample. In addition, the final step of elution of the DNA from the filter was conducted with 50 µl of elution buffer instead of 100 µl; the initial flowthrough was reapplied to the filter once to increase yield.
Soil characteristics. Frozen subsamples (stored at -20°C) were shipped to the University of Illinois at Urbana-Champaign for characterization of soil physicochemical properties. Soil C and N concentrations were measured on freeze-dried, sieved, and ground subsamples using a Vario Micro Cube elemental analyzer (Elementar, Hanau, Germany). Approximately 1 g of each subsample was also extracted in 30 ml of 0.5 N HCl for determination of Fe(III) and Fe(II) concentrations by using a modified ferrozine assay (54). Soil texture was measured for oven-dried and sieved soil in accordance with the method of Gee and Bauder (55).
Soil pH and gravimetric water content were measured using modified Long Term Ecological Research (LTER) protocols, as described by Robertson et al. (56). Soil pH was determined using 15 g of field-wet soil and 15 ml of Milli-Q water (Millipore Sigma, Burlington, MA) and was measured on a Hanna Instruments (Woonsocket, RI) HI 3220 pH meter with an HI 1053B pH electrode, designed for use with semisolids. For determining gravimetric water content, we oven-dried 7 g of soil at 105°C for a minimum of 24 h.
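Gravimetric water content is conventionally expressed as the mass of water lost on drying divided by the oven-dry soil mass. The short Python sketch below shows that arithmetic under this assumed convention; the text above does not state the exact formula used. The function name and the dry mass in the example are hypothetical.

```python
def gravimetric_water_content(wet_mass_g, dry_mass_g):
    """Return grams of water per gram of oven-dry soil.

    Assumes the dry-mass basis: (wet - dry) / dry.
    """
    if dry_mass_g <= 0:
        raise ValueError("dry mass must be positive")
    if wet_mass_g < dry_mass_g:
        raise ValueError("wet mass cannot be smaller than dry mass")
    return (wet_mass_g - dry_mass_g) / dry_mass_g


# Hypothetical example: 7 g of field-wet soil weighs 5.6 g after drying.
water_content = gravimetric_water_content(7.0, 5.6)  # about 0.25 g per g
```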
Amplicon-based 16S rRNA gene analyses. To characterize the bacterial and archaeal communities in each sample, we used the barcoded primer pair 515f/806r for sequencing the V4-V5 region of the 16S rRNA gene. We amplified this gene region three times per sample, combined these products, and normalized the concentration of each sample to 25 ng using SequalPrep normalization plate kits (Thermo Fisher Scientific, Waltham, MA). All samples were then pooled and sequenced on the Illumina MiSeq (2 × 150 paired-end chemistry) at the University of Colorado next-generation sequencing facility. The sample pool included several kit controls and no-template controls to check for possible contamination.
Sequences were processed using a combination of QIIME (57) and USEARCH (58) commands to demultiplex, quality-filter, remove singletons, and merge paired-end reads. Sequences were classified into exact sequence variants (ESVs) using UNOISE2 (59) with default settings, and taxonomy was assigned against the Greengenes 13_8 database (60) using the RDP classifier (61). ESVs with greater than 1% average abundance across all sequenced controls were classified as contaminants and removed from further analyses, along with ESVs identified as mitochondria and chloroplasts. The entire data set was then rarefied to 3,400 sequences per sample. All statistical analyses were done in R version 3.5.1 (62), and all figures were created with ggplot2 (63) unless otherwise noted. We used the R package Vegan (64) to calculate Bray-Curtis dissimilarity (vegdist, method="bray") on Hellinger-transformed ESV tables (decostand, method="hellinger") and to calculate Shannon diversity (diversity, index="shannon"). We calculated Spearman and Pearson correlations (cor.test) and corrected P values using base R functions (p.adjust, method="fdr").
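The contaminant filter and rarefaction step described above can be sketched as follows. This is an illustrative Python stand-in, not the pipeline used in this study: the function names and example counts are hypothetical, and rarefaction is assumed here to mean random subsampling without replacement, with samples below the target depth dropped.

```python
import random


def contaminant_esvs(control_tables, threshold=0.01):
    """Flag ESVs whose mean relative abundance across controls exceeds threshold."""
    all_esvs = set().union(*control_tables)
    flagged = set()
    for esv in all_esvs:
        fractions = []
        for table in control_tables:
            total = sum(table.values())
            fractions.append(table.get(esv, 0) / total if total else 0.0)
        if sum(fractions) / len(fractions) > threshold:
            flagged.add(esv)
    return flagged


def rarefy(table, depth, rng):
    """Subsample a count table to `depth` reads without replacement.

    Returns None when the sample has fewer than `depth` reads.
    """
    pool = [esv for esv, count in table.items() for _ in range(count)]
    if len(pool) < depth:
        return None
    rarefied = {}
    for esv in rng.sample(pool, depth):
        rarefied[esv] = rarefied.get(esv, 0) + 1
    return rarefied


# Hypothetical control and sample count tables (ESV id -> read count).
controls = [{"contam": 995, "rare": 5}, {"contam": 1000}]
sample = {"esv_a": 3000, "esv_b": 2000, "contam": 500}

flagged = contaminant_esvs(controls)
cleaned = {esv: n for esv, n in sample.items() if esv not in flagged}
rarefied = rarefy(cleaned, 3400, random.Random(0))
```

Filtering contaminants before rarefying means the 3,400 retained reads per sample all come from ESVs that pass the control screen.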
We checked if our amplicon sequences had "representative genomes" in public databases by matching the 16S rRNA gene amplicon sequences generated in this study to 16S rRNA genes from finished bacterial and archaeal genomes in the NCBI database using the USEARCH10 command "usearch_global." We considered 16S rRNA gene amplicons to have a "representative genome" if they matched a genome sequence with ≥97% identity.
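Once each ESV has a best-hit identity against the genome-derived 16S rRNA genes, the proportion of a sample's taxa with a representative genome follows directly from the ≥97% rule. The Python sketch below assumes the search results have already been parsed into a dictionary of best-hit percent identities. The function name and inputs are hypothetical, and counting each ESV once regardless of its read count is an assumption.

```python
def fraction_with_genome(sample_counts, best_identity, min_identity=97.0):
    """Fraction of ESVs present in a sample whose best genome hit meets the cutoff.

    sample_counts: ESV id -> read count in the sample.
    best_identity: ESV id -> best-hit percent identity (missing means no hit).
    """
    present = [esv for esv, count in sample_counts.items() if count > 0]
    if not present:
        return 0.0
    matched = sum(
        1 for esv in present if best_identity.get(esv, 0.0) >= min_identity
    )
    return matched / len(present)


# Hypothetical sample: e3 is absent, e4 has no database hit at all.
counts = {"e1": 10, "e2": 5, "e3": 0, "e4": 1}
identities = {"e1": 99.2, "e2": 96.9, "e3": 100.0}
proportion = fraction_with_genome(counts, identities)
```

Running this per sample and plotting the result against depth yields the kind of trend summarized in Fig. 1D.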
Shotgun metagenomic analyses. One soil profile from each CZO was selected for shotgun sequencing; we chose the profile that exhibited the most dissimilarity in microbial community composition through depth. The Eel River CZO samples were not collected in time to be included in these analyses. Using the same DNA as used for the amplicon sequencing, we generated metagenomic libraries with the TruSeq DNA LT library preparation kit (Illumina, San Diego, CA). All samples were pooled and sequenced on an Illumina NextSeq run using 2 × 150-bp paired-end chemistry at the University of Colorado next-generation sequencing facility. Prior to downstream analysis, we merged and quality filtered the paired-end metagenomic reads with USEARCH. After quality filtering, we had an average of 8.8 million quality-filtered reads per sample (range, 1.9 to 15.4 million reads; we included only samples with at least 1 million reads). These sequences were uploaded to MG-RAST (65) for public access. We used Metaxa2 (66) with default settings to extract small-subunit (SSU) rRNA gene sequences (bacterial, archaeal, and eukaryotic) in each sample and assigned taxonomy as described above using the Greengenes 13_8 database (60) and the RDP classifier (61).