Polymetallic nodules, sediments, and deep waters in the equatorial North Pacific exhibit highly diverse and distinct bacterial, archaeal, and microeukaryotic communities

Abstract Concentrated seabed deposits of polymetallic nodules, which are rich in economically valuable metals (e.g., copper, nickel, cobalt, manganese), occur over vast areas of the abyssal Pacific Ocean floor. Little is currently known about the diversity of microorganisms inhabiting abyssal habitats. In this study, sediment, nodule, and water column samples were collected from the Clarion‐Clipperton Zone of the Eastern North Pacific. The diversities of prokaryote and microeukaryote communities associated with these habitats were examined. Microbial community composition and diversity varied with habitat type, water column depth, and sediment horizon. Thaumarchaeota were relatively enriched in the sediments and nodules compared to the water column, whereas Gammaproteobacteria were the most abundant sequences associated with nodules. Among the Eukaryota, rRNA genes belonging to the Cryptomonadales were relatively most abundant among organisms associated with nodules, whereas rRNA gene sequences deriving from members of the Alveolata were relatively enriched in sediments and the water column. Nine operational taxonomic units (OTUs) were identified that occur in all nodules in this dataset, as well as in all nodules found in a study 3000–9000 km from our site. Microbial communities in the sediments had the highest diversity, followed by nodules, and then by the water column, which had less than one-third as many OTUs as the sediments.

seabed to the ocean's surface, and (3) release mining tailings composed of nodule-free sediments back into the water column, thereby potentially impacting much larger areas than due to the deep water plume (Rolinski, Segschneider, & Sundermann, 2001), and significantly disturbing large areas of the abyssal seafloor and overlying water column. To fully understand potential impacts on microbial processes that may be disturbed by mining activities, it is important to characterize the microbial communities of the abyssal seafloor and the overlying water column since both will likely be impacted by mining.
In the CCZ, the low particulate organic-carbon flux from the overlying waters compared to continental margins results in oxygenated sediments to depths of 2-3 m (Mewes et al., 2014; Smith, De Leo, Bernardino, Sweetman, & Arbizu, 2008). The abyssal plain can be a relatively stable environment over a period of many years, and benthic production can be low as it depends on input of organic particles from the euphotic zone. The physical and chemical structure of polymetallic nodules potentially provides a unique niche for bacterial, archaeal, and microeukaryotic communities to colonize. Little is known about nodule formation, although it has been hypothesized to be an abiotic process (Kerr, 1984). However, a recent study posited a microbially mediated mechanism for nodule initiation because X-ray and microscopy studies indicated high concentrations of bacteria in Mn-rich micronodules (Wang, Schlossmacher, Wiens, Schroeder, & Mueller, 2009). Indeed, an early study of polymetallic nodules using light and scanning electron microscopy revealed biofilms and filamentous microorganisms associated with nodule surfaces (Burnett & Nealson, 1981). Three recent studies (Blothe et al., 2015; Tully & Heidelberg, 2013; Wu et al., 2013), relying on gene-based surveys, identified unique bacterial and archaeal operational taxonomic units (OTUs) specifically associated with nodules compared to surrounding sediments in the eastern North Pacific Subtropical Gyre (NPSG), the central South Pacific Gyre, and the central and western NPSG. A fourth study investigated bacterial diversity only in the sediments of the CCZ, not in nodules (Wang et al., 2010). The nodule-associated protistan community has thus far not been investigated beyond morphological identification of selected foraminifera (e.g., Mullineaux, 1987).
Here we present findings on the bacterial, archaeal, and microeukaryotic communities associated with a polymetallic nodule field based on amplification and sequencing of rRNA genes. These analyses included 75 sediment samples, 24 water column samples, and 20 individual nodules from 11 stations randomly distributed over a 30 × 30 km stratum (AB-01; Table 1, Figure 1a) within the United

TABLE 1 Sampling site locations in the UK-1 claim area, dates, and depths for this study

three stations for seawater. Seawater samples were collected from eight discrete depths within the water column (5, 150, 300, 500, 1000, 2000, 3000 m, and near-bottom waters) using a conductivity-temperature-depth (CTD; SBE 911plus; Sea-Bird Electronics) rosette sampler equipped with twenty-four 10 L sampling bottles. The rosette sensor package also included a fluorometer (Seapoint Chlorophyll Fluorometer; Seapoint Sensors, Inc.) and a dissolved oxygen (O2) sensor (SBE 43). Seawater (2 L each from 5, 150, 300, and 500 m and 8 L from 1000, 2000, 3000 m, and near-bottom waters) was subsampled from the rosette bottles into polycarbonate carboys (4.5 L carboys for 2 L samples and 10 L carboys for 8 L samples) and immediately filtered using a peristaltic pump onto in-line 25 mm diameter, 0.2 μm pore-size Supor filters. Filtration times varied from 40 min to 2.5 hr depending on the volume. Filters were flash-frozen in liquid nitrogen and stored at −80°C until shore-based laboratory processing. Water samples (1.5 ml) for subsequent flow cytometric analyses of picoplankton abundance were fixed with 0.22 μm-filtered formaldehyde (2% final concentration), incubated at 4°C for 15 min, flash-frozen in liquid nitrogen, and stored at −80°C until shore-based analyses.
Nodules and sediments were aseptically sampled from 0.25 m2 box cores (nodules) or 80 cm2 megacore tubes (nodules and sediments) (see Glover, Dahlgren, Wiklund, Mohrbeck, & Smith, 2016, for boxcoring and megacoring equipment and sampling protocols). Nodules were mostly found within the 0-5 cm fraction of the box core and megacore, although some were recovered from our maximum sampling depth of 10 cm. All nodules found in megacore tubes designated for microbiology (usually 2 per deployment) were collected, as was a random subset of nodules from boxcores. From this collection of megacore and boxcore nodules, a random subset was selected for subsequent DNA extraction, and amplification and sequencing of 16S and 18S rRNA genes. In total, extracts from 20 nodules were used for 16S rRNA amplification and sequencing, with 18 of these nodule-derived DNA extracts also used for 18S rRNA gene amplification and sequencing. Subcores of sediments were obtained using sterile 20 mL syringes with the tip ends cut off, in each of four sediment horizons: 0-5 cm below seafloor (cmbf), 5-6 cmbf, 6-8 cmbf, and 8-10 cmbf.
Sediment subcores were stored in sterile Whirl-Pak bags (Nasco, Fort Atkinson, Wisconsin) at −80°C. Nodules were rinsed with 0.2 μm-filtered ambient bottom water to remove sediment adhering to the surface and stored whole in sterile Whirl-Pak bags at −80°C.

| DNA extraction
In the shore-based laboratory, genomic DNA was extracted from seawater samples using a DNeasy Plant Mini Kit (Qiagen) following a modified protocol (Paerl, Foster, Jenkins, Montoya, & Zehr, 2008).
Briefly, the filters were subjected to chemical and physical (bead-beating with both 0.1 and 0.5 mm beads) cell disruption. Total lysates were purified using the DNeasy Mini spin column procedure (Qiagen) following the manufacturer's recommendations.
Under sterile laboratory conditions, the polymetallic nodules were rinsed with 0.2 μm-filtered, autoclaved, bottom (~4000 m) seawater, returned to Whirl-Pak bags, and broken while still in the bag using an autoclaved mortar and pestle. Two ~500 mg pieces from the interior of each nodule were subjected to DNA extraction. Extraction of DNA from nodules and sediments was performed using the FastDNA Spin Kit for Soil (MP Biomedicals, USA) following the manufacturer's protocol, modified as follows: homogenization was performed in a Mini-Beadbeater-16 (Biospec Products, Bartlesville, Oklahoma) and centrifugation following homogenization was extended to 15 min. An extraction blank (FastDNA Spin Kit for Soil spin column with no sample added) was processed alongside samples. DNA concentrations were determined from 4 μl of each sample using the Qubit 2.0 Fluorometer and the Qubit dsDNA High Sensitivity Assay kit (Life Technologies).
Extracts with DNA concentrations >0.1 ng/μl were purified and concentrated using the Zymo Clean & Concentrator-5 (2:1 DNA Binding Buffer) kit with the resulting DNA eluted in sterile, DNase-free water.

| PCR amplification and Illumina sequencing of 16S and 18S rRNA genes
The V4 region of the 16S rRNA gene was amplified by the polymerase chain reaction (PCR) using the oligonucleotide primer pair 515f/806r, which includes the Illumina flowcell adapter sequences and a sample-specific barcode, exactly as described in Caporaso et al. (2011, 2012). Of the 20 nodules sampled for 16S rRNA genes, most were sampled in duplicate (i.e., two separate ~500 mg pieces from the same nodule were subjected to DNA extraction, PCR amplification, and sequencing). Initial 16S rRNA gene results from the duplicate sam-

| Bioinformatic analyses of sequences
Illumina paired-end 16S rRNA gene reads were joined using the bioinformatic software fastq-join (Aronesty, 2013) and sequences were processed, including an initial quality filtering and sequence sample-mapping by barcode, using QIIME version 1.8.0 (Caporaso, Kuczynski, et al., 2010). Potentially chimeric sequences were identified using the UCHIME algorithm within the USEARCH package (Edgar, 2010) and removed from further analysis. Open-reference OTU picking was performed using the UCLUST algorithm (Edgar, 2010), one of the principal clustering algorithms in the QIIME package, at a 97% sequence similarity cutoff against the Greengenes rRNA gene database release 13_8 (DeSantis et al., 2006). OTUs that occurred as absolute singletons or were observed in the extraction and/or PCR blanks were filtered from the experimental samples. Taxonomy was assigned based on the Greengenes taxonomy (McDonald et al., 2012; Werner et al., 2012) using a UCLUST-based consensus taxonomy assigner (Bokulich et al., 2015). A total of 13,835,715 high-quality sequences were generated, with an average of 101,133 sequences/sample (minimum sequences/sample = 16,753; maximum sequences/sample = 236,702). These data were normalized to 16,000 reads/sample to account for uneven sampling depth using the script single_rarefaction.py, which randomly subsamples the input OTU table without replacement, and this normalized OTU table was used in subsequent analyses unless otherwise specified. The script summarize_otu_by_cat.py in the QIIME package was used to collapse this OTU table by sample type and/or depth when necessary.
The only exceptions were the differential abundance analysis in which the full dataset was used, and alpha diversity analyses in which samples were collapsed by sample type (water column, nodules, or sediments) and the dataset was subsampled randomly multiple times at different depths, with a maximum depth of 2,401,000 sequences in order to take maximum advantage of this large dataset.
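The rarefaction step described above can be illustrated with a minimal sketch. This is not the QIIME single_rarefaction.py implementation; the function name `rarefy`, the toy OTU identifiers, and the fixed random seed are assumptions made for illustration only. The sketch draws a fixed number of reads from one sample without replacement and drops samples that fall below the chosen depth.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Randomly subsample `depth` reads without replacement from one sample.

    counts: dict mapping OTU id -> read count for a single sample.
    Returns a dict of subsampled counts, or None when the sample holds
    fewer than `depth` reads (such a sample would be excluded).
    """
    total = sum(counts.values())
    if total < depth:
        return None
    # Expand the counts into one label per read, then draw without replacement.
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)  # hypothetical fixed seed, for reproducibility
    return dict(Counter(rng.sample(reads, depth)))


# Hypothetical toy sample; OTU ids and counts are made up.
sample = {"OTU_1": 50, "OTU_2": 30, "OTU_3": 20}
sub = rarefy(sample, depth=40)
print(sum(sub.values()))  # 40 reads remain after subsampling
```

Because reads are drawn without replacement, no OTU can end up with more reads than it had in the original sample, and rarefying to the full sample size returns the sample unchanged.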
Illumina 5′ 18S rRNA gene reads were processed similarly to 16S rRNA reads, except reference-based OTU picking was performed against the SILVA 119 rRNA gene database (Quast et al., 2013).
Taxonomy was assigned based on the SILVA 119 taxonomy (Yilmaz et al., 2014) using BLAST (Altschul, Gish, Miller, Myers, & Lipman, 1990). The resulting OTUs were filtered to exclude 38,523 bacterial OTUs, 23,733 archaeal OTUs, and 3,126 OTUs that could not be identified at the domain level. A total of 54,819 Eukaryota OTUs comprising 5,353,354 high-quality sequences remained, with an average of 45,367 sequences/sample (minimum sequences/sample = 5,450; maximum sequences/sample = 154,747). These data were normalized to 5,400 reads/sample to account for uneven sampling depth, and either this normalized OTU table, or a table normalized to relative abundance, was used in subsequent analyses. The exceptions were the differential abundance analysis, in which the full dataset was used, and alpha diversity analyses, in which samples were collapsed by sample type (water column, nodules, or sediments) and the dataset was subsampled randomly multiple times at different depths, with a maximum depth of 100,100 sequences. Joined, quality-filtered 16S fastq files and 5′, quality-filtered 18S fastq files have been deposited in the NCBI's Sequence Read Archive under BioProject ID PRJNA281530, SRA ID SRP057408.
The nodule prokaryotic core microbiome was computed using the script compute_core_microbiome.py within the QIIME package. The Wu et al. dataset was downloaded from NCBI and OTUs were picked and taxonomy assigned as described for our dataset. OTUs within our core microbiome that hit to Greengenes were compared to the newly created Wu et al. OTU table in order to identify reference-based OTUs that were present in both datasets.
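As a minimal sketch of the core microbiome logic described above (not the compute_core_microbiome.py script itself), the following treats an OTU as "core" when it has a nonzero count in every sample, then intersects that core with the OTUs seen in a second dataset. The function name `core_otus`, the sample identifiers, and all counts are hypothetical.

```python
def core_otus(table, min_fraction=1.0):
    """Return the OTU ids present (count > 0) in at least `min_fraction`
    of the samples.

    table: dict mapping sample id -> dict of OTU id -> read count.
    """
    n_samples = len(table)
    presence = {}
    for counts in table.values():
        for otu, count in counts.items():
            if count > 0:
                presence[otu] = presence.get(otu, 0) + 1
    return {otu for otu, k in presence.items() if k >= min_fraction * n_samples}


# Hypothetical toy tables standing in for two reference-based OTU tables.
ours = {
    "N1": {"A": 5, "B": 2, "C": 1},
    "N2": {"A": 3, "B": 0, "C": 4},
    "N3": {"A": 1, "C": 2, "D": 7},
}
other = {
    "W1": {"A": 2, "C": 1},
    "W2": {"A": 9},
    "W3": {"A": 1, "D": 3},
}

ours_core = core_otus(ours)                      # present in 100% of our samples
other_any = {otu for counts in other.values()    # present in at least one
             for otu, count in counts.items() if count > 0}
shared_any = ours_core & other_any               # core OTUs seen at least once elsewhere
shared_all = ours_core & core_otus(other)        # core OTUs seen in every other sample
print(sorted(shared_any), sorted(shared_all))
```

The two intersections mirror the two comparisons reported later in the text: core OTUs retrieved in at least one nodule of the other study, and core OTUs retrieved in all of them.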

| Statistical methodologies
Principal Coordinates Analysis (PCoA) was used to visualize patterns in microbial community structure based on sample type within the CCZ.
Analysis of similarities (ANOSIM; Chapman & Underwood, 1999) was performed on weighted UniFrac distance measurements of both 16S and 18S gene sequences, and implemented using the compare_categories.py script within the QIIME package. Briefly, UniFrac calculates a distance measure based on the fraction of branch length shared between two communities within a phylogenetic tree; weighted UniFrac additionally takes into account the differences in relative abundances of taxa within each community (Lozupone, Lladser, Knights, Stombaugh, & Knight, 2011). The prokaryotic phylogenetic tree used for UniFrac was built using FastTree (Price, Dehal, & Arkin, 2010) from representative sequences aligned with PyNAST (Caporaso, Desantis, et al., 2010), as implemented in the pick_open_reference_otus.py workflow, and is available as Figure S8; sequences that failed to align were omitted from both the tree and the OTU table. The eukaryotic phylogenetic tree was created similarly from representative sequences aligned with Infernal (Nawrocki, Kolbe, & Eddy, 2009) and is available as Figure S9. A heatmap (Figure 5) was created using the function heatmap.2 in the R package gplots (R Core Team, 2015; Warnes et al., 2016). A Bray-Curtis dissimilarity matrix was created from an OTU table containing the 10 most abundant OTUs in each habitat, average linkage hierarchical clustering was performed, and a dendrogram was created using the R package vegan (Oksanen et al., 2016). Colors came from the R package RColorBrewer (Neuwirth, 2014). Average linkage hierarchical clustering was also performed on the full dataset and the results were similar; that is, sediments, nodules, and the water column each formed groups (Figure S10). To create Figure 9, a differential analysis of count data using shrinkage estimation (DESeq2; Love, Huber, & Anders, 2014) was implemented on the full dataset (not rarefied) within the phyloseq package (McMurdie & Holmes, 2014).
Differential OTUs which had a base mean of ≥100 (prokaryotes) or ≥10 (eukaryotes) were reported and visualized using the R package ggplot2 (Wickham, 2009).
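The clustering behind the heatmap dendrograms was done in R with vegan and gplots. Purely as an illustration of the two named techniques, the sketch below computes Bray-Curtis dissimilarities and performs average linkage agglomerative clustering in plain Python. The function names, sample names, and count vectors are hypothetical and the code is not a port of the R functions.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two equal-length count vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0


def mean_distance(ca, cb, dist):
    """Average of all pairwise distances between members of two clusters."""
    total = sum(dist[frozenset((i, j))] for i in ca for j in cb)
    return total / (len(ca) * len(cb))


def average_linkage(names, vectors):
    """Average linkage clustering; returns merges as (names_a, names_b, distance)."""
    dist = {frozenset((i, j)): bray_curtis(vectors[i], vectors[j])
            for i, j in combinations(range(len(names)), 2)}
    clusters = [(i,) for i in range(len(names))]
    merges = []
    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest average distance.
        ca, cb = min(combinations(clusters, 2),
                     key=lambda pair: mean_distance(pair[0], pair[1], dist))
        merges.append((tuple(names[i] for i in ca),
                       tuple(names[i] for i in cb),
                       mean_distance(ca, cb, dist)))
        clusters = [c for c in clusters if c not in (ca, cb)] + [ca + cb]
    return merges


# Hypothetical OTU counts for four samples (columns are four made-up OTUs).
names = ["water1", "water2", "sed1", "sed2"]
vectors = [[10, 0, 0, 5], [8, 1, 0, 6], [0, 9, 7, 0], [1, 8, 9, 0]]
merges = average_linkage(names, vectors)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

In this toy case the two sediment samples and the two water samples each pair up before the final merge, which is the kind of habitat grouping the dendrograms are meant to reveal.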

| Flow cytometric cell abundances
Seawater samples for flow cytometric analyses were thawed and 250 μl aliquots were transferred to 96-well plates and stained with SYBR Green I (final concentration of 1X). Abundances of picoplanktonic cells were determined using an Attune Acoustic Focusing Cytometer (Life Technologies, Carlsbad, CA) at a flow rate of 100 μl min⁻¹, with excitation at 488 nm and detection using a 530/30 bandpass filter and side scatter.

Bran+Luebbe Autoanalyzer III (Karl et al., 2001). DOC analyses relied on high-temperature combustion using a Shimadzu TOC-V (DOM Analytical Lab, Santa Barbara, CA; Carlson et al., 2010).

| Chemical characterization of UK-1 claim area
Seawater nutrient concentrations and microbial cell abundances were  Table 2). One of the most prominent features was the presence of a large, relatively shallow, oxygen minimum zone (OMZ), where dissolved oxygen concentrations declined to <10 μmol L⁻¹ (<0.2 ml/L) between ~50 m and ~1000 m (Figure S1).
The water column microeukaryote community was dominated by Protalveolata (Sar: Alveolata; 35%), Discicristata (Excavata: Discoba; 20%), and Retaria (Sar: Rhizaria; 15%), with the relative abundances of these lineages varying with depth (Figure S2b). Nearly all the Protalveolata sequences clustered among the Syndiniales, a presumably exclusively endoparasitic group common in culture-independent surveys of marine environments (Guillou et al., 2008; de Vargas et al., 2015). Group II Syndiniales dominated the anoxic and suboxic OMZ, whereas Groups I and II coexisted in the photic and oxygenated bathypelagic zones (Figure 3d). The Discicristata were solely Euglenozoa, mostly Diplonemea, a group of heterotrophic flagellates with major uncultured clades distributed throughout the deep sea (Lara, Moreira, Vereshchaka, & Lopez-Garcia, 2009). This was reflected in our data, as the Diplonemea (Discicristata) were nearly absent from the euphotic zone samples and increased in relative abundance with depth (Figures 3j and S2b). The Retaria were represented by Radiolaria, mainly the Acantharia (4%), Polycystinea (6%), and members of radiolarian sequence group RAD B (5%). It has been hypothesized that one or more of these Radiolarian groups

| Seabed-associated communities
Nodule-associated 16S rRNA gene sequences were dominated by members of the Gammaproteobacteria (23%), Thaumarchaeota (21%), and Alphaproteobacteria (18%; Figure S3), consistent with previous studies examining nodule-associated archaeal and bacterial communities (Tully & Heidelberg, 2013; Wu et al., 2013). The same three major groups of prokaryotes were also dominant in the sediment samples (Figure 4): Thaumarchaeota (28%),

FIGURE 4 Major class-level lineages of prokaryotic sediment taxa present at ≥2% relative rRNA gene abundances in at least two samples. Vertical lines represent the individual samples collected from each sediment horizon. Category "Other" represents all named taxa that did not reach the ≥2% relative abundance cutoff

In addition to the Thaumarchaeota, the class Nitrospira, containing potential chemolithoautotrophs, comprised a minor but measurable (1%) portion of the sediment community (Figure 4).
Among the Gammaproteobacteria, sequences clustering among unclassified genera within the family Piscirickettsiaceae demonstrated the greatest relative abundance in both the sediments and nodules (12% and 13% relative abundances, respectively), with relative abundances decreasing with depth in the sediment. The Piscirickettsiaceae are a family of aerobic, aquatic bacteria, and a recent study indicated these organisms were enriched in seawater microcosms treated with cadmium (Wang et al., 2015). Generalized resistance to metal toxicity may explain their relatively high abundances within the nodule field. Additionally, sequences classified as belonging to the order Chromatiales comprised 2-5% of the total 16S rRNA genes throughout the sediments. Previous studies in the central and western Pacific have recovered rRNA genes belonging to the Chromatiales from sediments and nodules (Wu et al., 2013), as well as sediments associated with cobalt-rich sediment crusts (Liao et al., 2011).

The dominant Alphaproteobacteria in both nodules and sediments were an unclassified genus within the family Rhodospirillaceae, occurring at 8% and 7% relative abundance, respectively, which was not present in the water column (Figure 5). Although Rhodospirillaceae are often found in anaerobic environments, this family contains the genus Magnetospirillum, a microaerophilic heterotroph with relatives known from sediments previously collected in the Pacific Nodule Province (Xu, Wang, Meng, & Xiao, 2007).
All the major eukaryotic supergroups were represented in the nodule and sediment datasets (Figures S4, 6, and 3b-l). On nodules, the groups Geminigera (25%; a genus of cryptophytes), Fungi (14%), and Retaria (13%; a clade within Rhizaria) demonstrated the greatest relative abundances (Figure S4). Fungi were almost exclusively comprised of

FIGURE 5 Heatmap of the 10 most abundant prokaryotic rRNA gene OTUs from each sample type. In some cases, the most abundant rRNA gene OTUs from sediments and nodules were the same, so the total number of OTUs depicted is 25. Dendrograms were created via average linkage hierarchical clustering on a Bray-Curtis dissimilarity matrix of the selected dataset. The dendrogram on the Y-axis is color-coded by sample type as throughout the manuscript (blue = water, green = sediments, orange = nodules). The dendrogram on the X-axis clusters OTUs that occur most frequently together. Heatmap color represents the number of OTUs found in each sample after normalization to 16,000 reads/sample. OTU, operational taxonomic unit

| Alpha diversity of Clarion-Clipperton Zone microbial communities
After pooling by habitat and rarefaction to 2,401,000 sequences per habitat in order to account for differences in sequencing depth while still utilizing a large portion of the dataset, 33,732 prokaryotic OTUs were found in the water column, 93,790 prokaryotic OTUs were found in the nodules, and 111,413 prokaryotic OTUs were found in the sediments (Table S1). Species accumulation curves indicate that OTUs were still accumulating in the sediments and nodules at this depth of sequencing, whereas sequence diversity in the water column appeared to plateau (Figure 7).
Chao1 predicts a species richness of 35,083 prokaryotic OTUs in the water column, 118,552 prokaryotic OTUs in the nodules, and 184,335 prokaryotic OTUs in the sediments (Table S1). For eukaryotes, at a sampling depth of 100,100 sequences (pooled by habitat), the accumulation curves do not appear to approach an asymptote for nodules, the water column, or the sediments, indicating undersampling of all three habitats (Figure 7). There were 6,704 observed eukaryotic OTUs in the water column, 4,744 in the nodules, and 9,004 in the sediments, with Chao1 richness estimates of 13,373, 8,831, and 15,344 OTUs, respectively (Table S1).
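For readers unfamiliar with the Chao1 estimator, a minimal sketch follows. It uses the bias-corrected form, S_obs + F1(F1 − 1) / (2(F2 + 1)), where F1 and F2 are the numbers of OTUs observed exactly once and exactly twice. Which variant the analysis software applied here is an assumption on our part, and the function name and count vector are hypothetical.

```python
def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-OTU read counts.

    counts: iterable of read counts, one per OTU, for a single community.
    """
    counts = list(counts)
    s_obs = sum(1 for c in counts if c > 0)   # observed OTUs
    f1 = sum(1 for c in counts if c == 1)     # singletons
    f2 = sum(1 for c in counts if c == 2)     # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical community: 7 observed OTUs, 3 singletons, 2 doubletons.
community = [10, 5, 2, 2, 1, 1, 1, 0]
print(chao1(community))  # 8.0
```

The estimate exceeds the observed richness whenever more than one OTU is seen exactly once, which is why Chao1 values are larger than observed OTU counts in undersampled habitats.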
FIGURE 6 Eukaryotic sediment taxa from the 0-5 cm, 5-6 cm, 6-8 cm, and 8-10 cm horizons, averaged across all samples taken at a given depth; rRNA gene taxa present at ≥2% relative abundance (on average) in at least one horizon are depicted. Category "Other" represents all named taxa that did not reach the ≥2% relative abundance cutoff

Based on the three different measures of diversity assessed in this study (Chao1, exponential of Shannon's, and observed OTUs), nodule and sediment prokaryotic communities harbored greater alpha diversity than water column communities when pooled by habitat (Table S1). This held true when the alpha diversity of individual samples within these habitats was considered as well (rarefied to 16,000 sequences/sample, nonparametric t-test; observed OTUs, p = .003 for both nodule vs. water column and sediment vs. water column; Chao1, p = .003 for both nodule vs. water column and sediment vs. water column; exponential of Shannon's, p = .003 for both nodule vs. water column and sediment vs. water column). Additionally, on average, prokaryotic sediment communities demonstrated greater alpha diversity than nodule communities (nonparametric t-test, observed OTUs, p = .006).
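The exponential of Shannon's index converts the entropy into an effective number of equally abundant OTUs. The sketch below assumes the natural logarithm is used for Shannon's index (an assumption; some software defaults to base 2, in which case the matching base must be used when exponentiating). The function name and count vectors are hypothetical.

```python
import math


def shannon_effective_otus(counts):
    """Exponential of Shannon's index (natural log) from per-OTU read counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    # Shannon entropy H = -sum(p * ln p) over OTUs with nonzero counts.
    h = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return math.exp(h)


# Hypothetical communities with the same observed richness (4 OTUs).
even = [25, 25, 25, 25]
uneven = [97, 1, 1, 1]
print(round(shannon_effective_otus(even), 3))
print(round(shannon_effective_otus(uneven), 3))
```

Both toy communities contain four observed OTUs, but only the even one has an effective number of OTUs close to four; strong dominance by a single OTU pulls the value toward one.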

| Beta and gamma diversity of Clarion Clipperton Zone microbial communities
Principal Coordinates Analysis (PCoA) of weighted UniFrac (Lozupone & Knight, 2005) distances of the 16S and 18S rRNA gene amplicon communities revealed that the seawater, sediments, and nodules each harbored distinct prokaryotic (Figure S5a

| Trends in microbes differentially represented in sediments and nodules
A differential analysis of count data (Love et al., 2014) was performed to determine whether specific OTUs discriminated between nodules and sediment samples. Ninety-three prokaryotic OTUs with assigned taxonomy were identified as differentiating the 75 sediment samples from the 36 nodule samples (Figure 9a). Ninety eukaryotic OTUs differentiated the 74 sediment samples from the 20 pooled nodule samples (Figure 9b). Although the prokaryotic OTUs that differentiated the sediment and nodule samples came from diverse phyla, some general trends emerged. All the differential Chloroflexi OTUs recovered (4 OTUs; 4% of differential OTUs) were more abundant in sediment samples than nodule samples (Figure 9a). This agrees with data from the German mining claim area, to the west of our study site, where Chloroflexi were found in sediments but were not associated with nodules (Blothe et al., 2015). Fifteen (16%) of the differentially abundant OTUs were Alphaproteobacteria (Figure 9a). Of those Alphaproteobacteria more highly represented on the nodules, two fell into the family Hyphomicrobiaceae within the Rhizobiales, a group that contains members known to be involved in manganese cycling (Larsen, Sly, & McEwan, 1999). The Alphaproteobacterial OTUs identified as more abundant in the sediments than nodules were either Rhodospirillales or unclassified beyond the class level; none of them were classified as Rhizobiales, potentially indicating a unique niche for Rhizobiales on the nodules. Five of the seven deltaproteobacterial OTUs were overrepresented in the sediments relative to the nodules.
Eukaryotic communities in both the various sediment samples and on collected nodules demonstrated considerable sample-to-sample heterogeneity (Figure S4), but differential abundance analysis identified several taxa that appear to prefer either sediments or nodules heterotrophic protists, were found solely in the sediments (Figure 9b).
Fifteen (23%) of the OTUs that were more abundant on the nodules derived from the Opisthokonta, the supergroup containing both the Fungi and the Metazoa (Figure 9b). This included one metazoan OTU unique to this study classified as Enoplea (a nematode), and 14 fungal OTUs, three of which were related to the yeast-like fungus Pseudozyma.

| Comparisons to other polymetallic nodule datasets
In addition to describing the microbial communities associated with the sediments, nodules, and overlying waters, we compared our results to previously published studies from geographically diverse sites within the Pacific Ocean to place our observations in a basin-scale context of known nodule-associated microbial communities. The large number of nodules collected in this study allowed us to compute a "nodule core microbiome" for the AB-01 region in order to compare it to nodules across the CCZ. In total, 196 prokaryotic OTUs were present in 100% of our nodules sampled; 168 of these fell into OTUs with representative sequences within the Greengenes database whereas 28 were novel (Table S2). In order to look at connectivity across the CCZ, we compared the 168 reference-based OTUs to OTUs identified in the three nodules sampled from two sites at distances ~3000 km and >9000 km from AB-01, within the North Pacific (Wu et al., 2013).
Forty-seven (28%) of these OTUs were also retrieved in at least one of the nodules studied by Wu et al.; nine of these OTUs were found in all three nodules studied by Wu et al. (Table 3). Many of these core OTUs fell into taxa that our differential abundance analyses revealed to prefer nodules rather than sediments, such as the Cytophagia and the Rhizobiales (Table 3, Figure 9a). These results indicate that certain stable and consistent associations between microbes and nodules exist over thousands of kilometers of abyssal ocean.
A recent study of the microbes associated with two polymetallic nodules in the east German license area, ~300 km from our study site, found that members of the gammaproteobacterial genera Colwellia (Gillan, Speksnijder, Zwart, & De Ridder, 1998; Ivanova et al., 2004). The greater abundances of these Alteromonadales in various nodules from other studies and complete absence in this study (as well as absence from individual nodules in the Tully and Heidelberg (2013) study) may reflect regional differences within the CCZ in the colonization of nodules by invertebrates and their associated microflora, or simply a high degree of endemism at the abyssal seafloor when examined over larger spatial scales (>300 km) (Bienhold, Zinger, Boetius, & Ramette, 2016). In addition, differences among these studies could reflect biases associated with the choice of PCR primers; for this study, the forward (515f) and reverse (805r) PCR primers we relied on for prokaryotic identification can yield single-nucleotide mismatches to the 16S rRNA genes of members of the Thaumarchaeota Marine Group I (Parada, Needham, & Fuhrman, 2015) and the SAR11 clade (Apprill, McNally, Parsons, & Weber, 2015), respectively. However, this would not explain the lack of the Gammaproteobacteria Colwellia and Shewanella associated with nodules sampled in our study. We further checked these primers using Silva TestPrime 1.0 and found that the primer pair 515f/805r had coverage of 92% within the Shewanella and 89% within the Colwellia when no mismatches were allowed. When one mismatch was allowed, this coverage rose to 96% and 94%, respectively.
Therefore, it seems unlikely that the primers used for amplification in our study are biased against these genera. We utilized the same primers for all the samples collected for this study, facilitating comparative assessment of prokaryotic diversity across the different types of habitats (seawater, sediment, and nodules) and stations sampled. Finally, there may have been differences in the physical and/or chemical structure of the nodules within the German claim area and those nodules sampled from the UK claim area, which might promote differences in the microbiota observed between these studies.
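The idea of primer coverage with and without allowed mismatches can be illustrated with a minimal sketch. This is not how Silva TestPrime is implemented; it assumes the candidate binding sites have already been extracted and aligned to the primer at equal length, and the primer and site sequences below are made up for illustration (they are not the 515f or 805r sequences).

```python
# IUPAC nucleotide codes mapped to the bases each code may represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}


def mismatches(primer, site):
    """Count positions where the site base is not allowed by the primer code."""
    return sum(1 for p, s in zip(primer, site) if s not in IUPAC[p])


def coverage(primer, sites, max_mismatches=0):
    """Fraction of binding sites matched with at most `max_mismatches`."""
    if not sites:
        return 0.0
    hits = sum(1 for s in sites if mismatches(primer, s) <= max_mismatches)
    return hits / len(sites)


# Hypothetical degenerate primer and pre-aligned binding sites.
primer = "ACGTMGC"
sites = ["ACGTAGC", "ACGTCGC", "ACGTGGC", "TCGTGGC"]
print(coverage(primer, sites, max_mismatches=0))  # 0.5
print(coverage(primer, sites, max_mismatches=1))  # 0.75
```

As in the comparison above, allowing one mismatch can only keep coverage the same or raise it, since every site that matches perfectly also matches under the relaxed threshold.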

| CONCLUSIONS
We provide strong evidence supporting the growing view that polymetallic nodules, surrounding sediments, and the overlying water column constitute distinct microbial habitats with characteristic microbial assemblages. Microbial assemblages differ with depth into the sediment such that surface sediment removal and/or accelerated burial by resettling from a near-bottom sediment plume are likely to fundamentally alter microbial community structure. Sediments and nodules are major reservoirs of microbial diversity distinct from even the deep water column, suggesting that large-scale removal of nodules and sediments might alter local, and even regional, patterns of microbial diversity and ultimately modify specific ecosystem functions.
Over the ~30 km scales of our study, each of these habitats appears to harbor similar microbial assemblages. Additional work is needed to determine if microbial assemblages in the sediments and nodules vary over the ~6 million square km expanse of the Clarion-Clipperton Zone, as suggested by the lack of Colwellia and Shewanella sequences in our samples versus within the German claim area ~300 km away.
However, our current study provides the first evidence of a widespread core nodule microbial community across large regions of the Pacific Ocean. Additionally, many of the prominent prokaryotic genera retrieved from the sediment and nodules in our study and others suggest an important role for chemoautotrophy within and above the nodule field. The energy sources sustaining such metabolisms remain unknown. Finally, future work is needed to understand the stability and resilience of these microbial ecosystems to perturbations such as those likely to result from nodule-mining operations.
TABLE 3 Core nodule OTUs found in all nodules in this study and all nodules in Wu et al., at 3000–9000 km distance from the AB-01 site

ACKNOWLEDGMENTS
We thank all members of the ABYSSLINE science party for logistical help at sea and on land, particularly D. Amon, as well as the captain, officers, and crew of the R/V Melville for their assistance on MV1313.
We are grateful to S. Curless, L. Fujieki, and F. Santiago-Mandujano for their assistance with nutrient analyses and CTD data processing; E. DeLong for computational resources; and B. Pedler and S. Goldberg for guidance on the flow cytometric analyses. We also thank three anonymous reviewers for their helpful and thorough comments and suggestions, which improved the manuscript. Funding for this research derived from a contract from UK Seabed Resources, LTD. (UKSR) to C.R. Smith and M. Church of the University of Hawaii. UKSR had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.