Metagenomic profiling of ammonia- and methane-oxidizing microorganisms in two sequential rapid sand filters

Elevated concentrations of ammonium and methane in groundwater are often associated with microbiological, chemical and sanitary problems during drinking water production and distribution. To avoid their accumulation, raw water in the Netherlands and many other countries is purified by sand filtration. These drinking water filtration systems select for microbial communities that mediate the biodegradation of organic and inorganic compounds. In this study, the top layers and wall biofilm of a Dutch drinking water treatment plant (DWTP) were sampled from the filtration units of the plant over three years. We used high-throughput sequencing in combination with differential coverage and sequence compositionbased binning to recover 56 near-complete metagenome-assembled genomes (MAGs) with an estimated completion of ≥70% and with ≤10% redundancy. These MAGs were used to characterize the microbial communities involved in the conversion of ammonia and methane. The methanotrophic microbial communities colonizing the wall biofilm (WB) and the granular material of the primary rapid sand filter (P-RSF) were dominated by members of the Methylococcaceae and Methylophilaceae . The abundance of these bacteria drastically decreased in the secondary rapid sand filter (S-RSF) samples. In all samples, complete ammonia-oxidizing (comammox) Nitrospira were the most abundant nitrifying guild. Clade A comammox Nitrospira dominated the P-RSF, while clade B was most abundant in WB and S-RSF, where ammonium concentrations were much lower. In conclusion, the knowledge obtained in this study contributes to understanding the role of microorganisms in the removal of carbon and nitrogen compounds during drinking water production. We furthermore found that drinking water treatment plants represent valuable model systems to study microbial community function and interaction. © 2020 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license. ( http://creativecommons.org/licenses/by/4.0/ )


Introduction
About 97% of all available water on earth is saline. The remaining 3% is freshwater, of which more than two-thirds is frozen in ice sheets. Thus, only a small fraction of the global freshwater exists as ground and surface water that is available for drinking water production. According to the European Commission ( EC, 2016 ), about 50% of drinking water in Europe is produced from groundwater and 37% from surface water.
Groundwater has a relatively constant composition and may contain high concentrations of iron (Fe 2 + ; ≤7.8 mg/L), manganese (Mn 2 + ; ≤0.56 mg/L), ammonium (NH 4 + ; ≤0.5 mg/L), and some organic compounds such as methane (CH 4 ; ≤37 mg/L) ( Albers et al., 2015 ;Li and Carlson, 2014 ;Osborn et al., 2011 ). Elevated con-communities that are introduced via the source water ( Yang et al., 2016 ) and are shaped by the configuration of the treatment process Pinto et al., 2012 ). In the filtration, units microbial growth is stimulated on filter material, mediating the biodegradation of organic and inorganic compounds ( Proctor and Hammes, 2015 ). Gasses such as methane, hydrogen sulfide, carbon dioxide, and other volatile compounds are removed from the groundwater through gas exchange systems ( Trussell et al., 2012 ). The increased dissolved oxygen in the water caused by this mechanical aeration step serves as an electron acceptor in microbially mediated oxidative reactions, which may ensure the near-complete nutrient removal in the biologically active sand filters.
Naturally occurring ammonium concentrations in deep groundwater are below 0.2 mg/L ( WHO, 2003 ). However, excessive fertilization can lead to increased ammonium concentrations and contamination of groundwater ( Kabala et al., 2017 ). In engineered systems, such as drinking water treatment plants (DWTP), ammonium removal is achieved by the activity of nitrifying microorganisms that oxidize ammonia to nitrate via a series of intermediates. While canonical nitrifying guilds perform ammonia-and nitriteoxidation in a tight interplay, complete ammonia-oxidizing (comammox) bacteria of the genus Nitrospira possess all proteins necessary to perform nitrification on their own ( Daims et al., 2015 ;van Kessel et al., 2015 ). Nitrifying microbial communities of rapid sand filters have been studied before and seem to be represented by different groups of nitrifiers ( Albers et al., 2015 ;Fowler et al., 2018 ;Gülay et al., 2016 ;Oh et al., 2018 ;Palomo et al., 2016 ;Pinto et al., 2015 ;van der Wielen et al., 2009 ). In these systems, nitrification can be limited by the availability of vital nutrients for this process, such as phosphate and copper. This can cause incomplete nitrification ( de Vet et al., 2012 ;Wagner et al., 2016 ), leading to incomplete ammonium removal and/or nitrite accumulation ( Wilczak et al., 1996 ), and microbial after-growth in the distribution network ( Rittmann et al., 2012 ).
Methane is a colorless and odorless gas that usually does not present a health risk in drinking water. However, methane gas is highly flammable and can be explosive at elevated concentrations, and it also can serve as substrate for growth of microorganisms in distribution systems. Most of the methane is removed by the mechanical aeration step and remaining amounts are oxidized by aerobic methanotrophs within the sand beds of the filter units. Correspondingly, a number of studies reported methane-oxidizing bacteria colonizing the granular material of the sand filters ( Albers et al., 2015 ;Gülay et al., 2016 ;Palomo et al., 2016 ). However, in contrast to nitrifying microbial communities, methanotrophs in these engineered systems are traditionally less well studied. Methane is a potent greenhouse gas, with a global warming potential of 34 CO 2 equivalents over 100 years ( Stocker et al., 2013 ) and the removal of most methane from the drinking water via aeration causes methane emissions to the atmosphere, contributing to global warming and climate change ( Maksimavi čius and Roslev, 2020 ).
Like in many European countries, groundwater is the primary source (65%) for drinking water production in the Netherlands ( Geudens and van Grootveld, 2017 ). In this study, we used genomeresolved metagenomics and gene-centric approaches to analyze microbial communities in rapid sand filters of a Dutch DWTP, with a special focus on ammonia-and methane-oxidizing microorganisms. The groundwater entering this DWTP contains elevated concentrations of organic (CH 4 , 5.2 mg/L) and inorganic (NH 4 + , 0.64 mg/L; Fe 2 + , 8.5 mg/L; Mn 2 + , 0.16 mg/L) nutrients, which are removed in two sequential rapid sand filters. Thus, this DWTP represents an interesting model system to study microbial communities involved in the conversion and removal of these compounds. Metagenomic analysis of samples taken at different stages within this DWTP revealed the key microorganisms involved in ammonia-and methane-oxidation, including novel methanotrophic bacteria from the Methylophilaceae family , which were assumed previously to comprise only methylotrophic bacteria.

Sample collection and trace element analysis
Samples were obtained from the pumping station Breehei, a drinking water treatment facility located in Venray, the Netherlands (51 °28 54.6 N; 5 °59 10.2 E), operated by NV Waterleiding Maatschappij Limburg, which produces drinking water from groundwater. The Breehei DWTP is situated in a rural area with forest (nature reserve) and agricultural land. The raw water entering the DWTP is circumneutral (pH 6.8-7.3) with a temperature of 10 °C. Groundwater is extracted at a depth of 60 m and sprinkled on the saturated filter with a surface of 12.7 m 2 and bed thickness of 1 m. The surface load over the filter is 3 m 3 (m 2 h) −1 .Each filter consists of sand with a grain size of 1.4 to 2.0 mm. At two time points (June 2016 and September 2018) single samples were collected from the top layers of the primary and secondary rapid sand filter (P-RSF and S-RSF) in 50 ml sterile falcon tubes. One sample each from the biofilm formed on the walls of the primary sand filter (WB) was taken in June 2016, May 2017 and September 2018 ( Fig. 1 ). All samples were transferred to the laboratory within 4 h, and were stored at 4 °C for further analysis. Water quality parameters were determined by Aqualab Zuid ( https://www.aqualab.nl ) over the period 20 0 0-2016 ( Table 1 ).

Methane uptake
Methane-oxidizing capacity was determined for the P-RSF and WB samples collected in 2016. For the P-RSF sample, 2.5, 5, 10, and 20 g of sand material were mixed with 20 ml top water. After settling of the sand, overlaying water samples (20 ml) were transferred into 120 ml serum bottles with ±14.5 mg/L CH 4 in the headspace and incubated at room temperature in a shaking incubator (200 rpm). For the WB sample, the incubations were conducted in triplicates using 1 and 2.5 g of biomass. CH 4 concentrations were determined by sampling 300 μL headspace, which were injected in triplicates into a HP 5890 gas chromatograph (Hewlett Packard, Palo Alto, CA).

DNA extractions
DNA from samples for Illumina sequencing was extracted using two different methods to obtain differential abundance information. DNA from 2016 samples was extracted with DNeasy Blood & Tissue Kit (Qiagen Ltd., West Sussex, United Kingdom) and Pow-erSoil DNA Isolation Kit (MO BIO Laboratories, Carlsbad, CA, USA). Samples (0.5 g) were mechanically disrupted using a TissueLyser (Qiagen) for 2 × 30 s at 30 Hz, followed by DNA extractions according to the manufacturer's instructions. For the extraction of DNA from samples collected in 2017 and 2018, the DNeasy Blood & Tissue kit was replaced by ammonium acetate ( Kowalchuk et al., 2004 ) and CTAB ( Zhou et al., 1996 ) based extraction methods, respectively.
For long-read Nanopore sequencing, DNA was extracted from P-RSF samples collected in 2016 and 2018 using the CTAB-based extraction method. To avoid shearing of genomic DNA, all beadbeating and vortexing steps were replaced by carefully inverting the tubes several times, and all the pipetting steps were performed using cut-off pipette tips. Genomic DNA was purified twice by phenol:chloroform:isoamyl alcohol (25:24:1) phase extraction ( Zhou et al., 1996 ). Extracted DNA was resuspended in nucleasefree water and stored at 4 °C.

Illumina library preparation and sequencing
For Illumina library preparation, the Nextera XT kit (Illumina, San Diego, CA, USA) was used according to the manufacturer's instructions. Enzymatic tagmentation was performed using 1 ng of DNA per sample, followed by incorporation of the indexed adapters and library amplification. After subsequent purification using AM-Pure XP beads (Beckman Coulter, Indianapolis, IN, USA), libraries were checked for quality and size distribution using the 2100 Bioanalyzer with the High Sensitivity DNA kit (Agilent, Santa Clara, CA, USA). Library quantitation was performed by Qubit using the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific, Waltham, MA, USA). After dilution to 4 nM final concentration, the libraries were pooled, denatured, and sequenced on an Illumina Miseq. Pairedend sequencing of 2 × 300 base pairs was performed using the Illumina MiSeq Reagent Kit v3 according to the manufacturer's protocol.

Nanopore library preparation and sequencing
The DNA Library preparation for Nanopore sequencing was performed using the Ligation Sequencing Kit 1D (SQK-LSK108) in combination with the Native Barcoding Expansion Kit (EXP-NBD103 or EXP-NBD104) according to the manufacturer's protocol (Oxford Nanopore Technologies, Oxford, UK). For the detailed protocol, see Supplementary Methods. The libraries were loaded on a Flow Cell (R9.4.1) and run on a MinION device (Oxford Nanopore Technologies, Oxford, UK), according to manufacturer's instructions. Base calling after sequencing was done using Albacore v2.1.10, (Oxford Nanopore Technologies) for the 2016 and guppy_basecaller in combination with gGuppy barcoder (Oxford Nanopore Technologies; Version 2.3.7+e041753) for the 2018 sample.

Metagenome binning
Differential coverage information was determined by separately mapping the sequencing reads from each sample and DNA extraction method against the obtained co-assemblies, using Burrows-Wheeler Aligner v0.7.15 (BWA) ( Li and Durbin, 2010 ) and employing the "mem" algorithm. For the long-read Canu assembly, only the Illumina reads from P-RSF 2018 samples were mapped. The generated sequence alignment map (SAM) files were converted to binary format (BAM) using SAMtools ( Li et al., 2009 ). Metagenome binning was performed using anvi o v5.3, followed by manual bin refinement using the anvi o interactive interface ( Delmont and Eren, 2016 ;Eren et al., 2015 ). For details, see Supplementary Methods. Completeness and contamination (referred to as redundancy in this study) of bins was assessed by CheckM v1.01.11 ( Parks et al., 2015 ). Based on the suggested standards ( Bowers et al., 2017 ), the bins were defined as high-quality ( ≥90% completeness and ≤5% redundancy, complete small subunit rRNA operon, ≥18 tR-NAs) and medium-quality ( ≥70% complete and ≤10% redundancy) metagenome-assembled genomes (MAGs). MAGs were dereplicated using dRep v2.2.3 ( Olm et al., 2017 ) at 99% average nucleotide identity (ANI) used for clustering. Within each cluster, the best MAG was selected based on completeness, redundancy, N50 of contigs, and fragmentation. GTDB-Tk v0.3.2 ( Parks et al., 2018 ) was used for taxonomic assignment of the final MAGs. Phyla are named according to the recently suggested nomenclature ( Whitman et al., 2018 ) using standardized phylum suffix -ota .

Table 1
Average water quality parameters of the incoming groundwater and at different stages along the treatment train.

Abundance estimation of MAGs
To calculate the relative abundance of the dereplicated MAGs in each sample, reads from all samples were individually mapped to each co-assembly using BWA v0.7.15 ( Li and Durbin, 2010 ) as described above. The coverage of each MAG was calculated using CheckM v1.01.11 (minimum alignment length 0.95) ( Parks et al., 2015 ) and was normalized by multiplying this coverage with a normalization factor (sequencing depth of the largest sample divided by the sequencing depth of each individual sample). The distribution of MAGs was calculated as percentages by dividing a MAG's coverage in each sample by the total coverage of the respective MAG in all samples.

Functional analysis
For the gene-centric approach, co-assembly across all samples was performed using MEGAHIT v1.1.1-2 ( Li et al., 2015 ). Open reading frames (ORFs) in the new co-assembly and the MAGs obtained above were predicted using Prodigal v2.6.3 ( Hyatt, 2010 ), which was set to include partial ORFs. Custom-build hidden Markov models (HMMs) ( Eddy, 2011 ) of specific marker proteins were used (Supplementary Methods) to annotate all ORFs using hmmsearch (HMMER v3.1b2; http://hmmer.org ). The HMM for RNA polymerase subunit beta (RpoB) was downloaded from FunGene ( Fish et al., 2013 ). Remaining ORFs in the MAGs were annotated using Prokka v1.12-beta ( Seemann, 2014 ). The annotations of all genes discussed in this study were confirmed by BLAST against the TrEMBL, Swiss-Prot and NCBI nr databases. Subcellular localization of the proteins was predicted by SignalP 5.0 ( Armenteros et al., 2019 ) and TMHMM 2.0 ( Krogh et al., 2001 ).
Functional gene-based abundances of ammonia-and methaneoxidizing microorganisms were estimated using competitive metagenomic read recruitment to ensure unique mapping. For this, reads from each metagenomic sample were mapped using bowtie2 v2.3.1 ( Langmead and Salzberg, 2012 ) in '-very-sensitive' mode against extracted partial and complete sequences of rpoB and the ammonia as well as the particulate and soluble methane monooxygenase subunit A genes ( amoA, pmoA and mmoX , respectively). SAMtools v1.6 ( Li et al., 2009 ) was used to obtain the number of mapped reads. Reads per kilo base per million mapped reads (RPKM)-values were used to correct for differences in sequencing depth and gene length. To estimate the relative abundance of microorganisms encoding ammonia and methane monooxygenases in each sample, the normalized read counts were calculated as fraction of the normalized read counts of the identified rpoB genes.

Phylogenomic and phylogenetic analyses
The up-to-date bacterial core gene (UBCG) pipeline ( Na et al., 2018 ) with default parameters was used to extract and concatenate bacterial core gene sets. To infer the phylogeny of the archaeal MAGs, the anvi'o phylogenomic workflow ( http:// merenlab.org/2017/06/07/phylogenomics ) was used to individually align and concatenate 162 archaeal single-copy genes. Maximumlikelihood trees were calculated using RAxML version 8.2.10 ( Stamatakis, 2014 ) on the CIPRES science gateway ( Miller et al., 2010 ). For details, see Supplementary Methods. Amino acid sequences of type II DMSO reductase-family enzymes, and of ammonia and methane monooxygenases were aligned using ARB v5.5 ( Ludwig et al., 2004 ). The maximum-likelihood trees were calculated using RAxML HPC -HYBRID v.8.2.12 on the CIPRES and IQtree webserver ( Trifinopoulos et al., 2016 ) as described in Supplementary Methods. All phylogenetic trees were visualized in iTOL ( Letunic and Bork, 2016 ).

Recovery of metagenome-assembled genomes
Over the period 2016 to 2018 a total of 7 samples from DWTP Breehei were collected ( Fig. 1 ) and sequenced. This resulted in a total of 125 million paired-end Illumina sequencing reads. For each sample location more than 70% of the respective reads could be co-assembled into ~413, 184, and 249 Mb sequencing data for WB, P-RSF, and S-RSF, respectively (Table S1). In addition, P-RSF samples were also sequenced using the Oxford Nanopore long-read platform to improve the assembly of the most abundant microorganisms. Overall, binning of 4 individual metagenome assemblies based on sequence composition and differential coverage patterns resulted in 78 near-complete Illumina and 7 Nanopore MAGs (Table S2). All MAGs obtained from Illumina co-assemblies as well as the Nanopore assembly were dereplicated at strain level (99% ANI), which yielded 50 medium and 6 high-quality (MIMAG standards; Bowers et al., 2017 ) non-redundant MAGs ( Table 2 , Table S2) that were used for downstream analyses. Given the high number of single nucleotide variants observed during bin refinement using the anvi o interactive interface, 15 MAGs were categorized as population-level genomes ( Table 2 ). All MAGs were classified at the lowest possible taxonomic level using GTDB-tk ( Parks et al., 2018 ), indicating affiliation with 1 archaeal and 12 different bacterial phyla ( Fig. 2 , Table 2 ).

Distribution and taxonomic composition of the DWTP microbiome
The influence of sampling location, sampling time, and DNA extraction method on the distribution of recovered MAGs was analyzed by hierarchical clustering using Euclidean distance metrics with a Ward linkage algorithm, allowing grouping of the MAGs based on their occurrence patterns in the different samples ( Fig. 2 ).
Overall, the choice of a DNA extraction method had no pronounced influence on the distribution of MAGs across samples, except for the sample from P-RSF in 2016. In this specific sample, the normalized coverage values indicated a strong extraction bias for DNA extracted using the Power soil (P-RSF16_PS) compared to the Blood and Tissue kit (P-RSF16_BT; Table S2). Notably, this bias was not observed for the 2016 S-RSF samples extracted with the same kits, or for any other DNA extraction performed using the Power soil kit. It thus is difficult to conceive that the extraction method was affecting specific members only in the 2016 P-RSF microbial community, and the reason for the observed bias remains unclear. Overall, the sampling location had the most substantial effect on microbial diversity and abundance recovered in our study. While some MAGs obtained from WB and P-RSF were also present in S-RSF, the overall community of the S-RSF community clearly differed from the other sampling locations ( Fig. 2 ). The microbial communities of WB and P-RSF were generally more similar, but sampling location and time influenced the relative abundance of MAGs across all samples.
To gain insights into the overall microbial community structure and diversity of the DWTP Breehei, 16S rRNA gene sequences were retrieved directly from the metagenomic assemblies. Both full-length and partial 16S rRNA gene sequences were used for further analyses, since only 21 to 32% of 16S rRNA reads were fully assembled ( Figure S2). Subsequently, microbial community composition was analyzed at the phylum and family levels ( Figure S3). Although changes in abundance were observed, phylum level classification revealed no differences in microbial community composition between the sampling locations ( Figure S3A). At the family level, the S-RSF microbial community was clearly distinct from the WB and P-RSF community ( Figure S3B), corroborating the MAG cooccurrence pattern-based observations when samples were organized using Euclidian distance and Ward ordination ( Fig. 2 ).

Genome functional profiling
During the drinking water treatment process, contaminants like iron, manganese, reduced sulfur species, ammonium and methane are removed. However, our understanding of the microbial and geochemical processes contributing to the removal of these compounds during sand filtration is still limited. To learn more about the microorganisms driving the removal of methane and ammonium in this rapid sand filtration system,all recovered MAGs were screened for functional marker proteins involved in nitrification, and methane and other one-carbon (C1) compound oxidation ( Fig. 3 , Table S3) using custom-build HMM models. Details on microorganisms involved in iron and sulfur cycling are given in the Supplementary Material.

Ammonia and nitrite oxidation
16S rRNA gene sequence analysis revealed the presence of nitrifying bacteria affiliated with the families Nitrospiraceae and Nitrosomonadaceae in the DWTP Breehei. Nitrospiraceae dominated the nitrifying microbial community in all samples, but their abundance patterns differed along the different sam pling locations, with the lowest abundance in P-RSF. In contrast, canonical ammonia-oxidizing bacteria (AOB) affiliated with the Nitrosomonaceae showed very low abundance in all samples ( Figure S3). Metagenomic consensus binning allowed the recovery of five Nitrospira MAGs from WB and P-RSF and one MAG of an ammoniaoxidizing archaeum (AOA) affiliated with the genus Nitrosoarchaeum . Despite the detection of Nitrosomonadaceae in WB and S-RSF samples in 16S rRNA-based analyses, no metagenomic bin of this taxonomic group was recovered.
All MAGs were screened for key genes of autotrophic nitrification, including the gene sets encoding ammonia monooxygenase (AMO) and nitrite oxidoreductase (NXR). AmoA is often used as a functional and phylogenetic marker for ammonia-oxidizing microorganisms ( Junier et al., 2008 ;Pester et al., 2012 ;Pjevac et al., 2017 ) and thus was used to examine the full ammonia oxidation potential in the DWTP Breehei. Phylogenetic analysis placed our metagenome-derived amoA sequences into five divergent groups ( Fig. 4 A). The observed variations in the nitrifying microbial community structure and abundance between the different samples might be influenced by seasonality, with the largest relative abundance in spring 2016 and the lowest during fall 2018 ( Fig. 4 B). However, to confirm this trend, more frequent samples across a Bin ID reflects the co-assembly (assembly) the MAGs were recovered from. IL -Illumina co-assemblies, NP -Nanopore assembly. Population-level genomes are indicated by an asterisk. b Based on GTDB classification ( Parks et al., 2018 ). c Based on lineage-specific marker sets determined with CheckM ( Parks et al., 2015 ). d Defined by ( Bowers et al., 2017 ).
longer time span would be required, also considering variations in physicochemical parameters. Of the five Nitrospira MAGs recovered in this study, one (P-RSF-IL-20) contained amoA genes affiliated with clade A, and two (WB-IL-04, WB-IL-16) with clade B ( Fig. 3 A, Table S3) . In proteinbased phylogenetic analyses, most of the clade A and B comammox AmoA sequences most closely clustered with sequences derived from DWTP metagenomes ( Figure S4). Clade A comammox amoA genes were detected in all samples from the DWTP Breehei (0.1-3.6% of the total community) and were the most abun-dant amoA type in P-RSF. The clade A comammox Nitrospira MAG (P-RSF-IL-20) had the highest coverage in the 2016 P-RSF sample extracted with the Blood & Tissue Kit, while this coverage decreased drastically with another DNA extraction method, indicating a strong extraction bias as discussed above (Table S2). Consequently, the average coverage of this MAG was higher in S-RSF than in P-RSF. In contrast, clade B comammox amoA recruited the largest number of all assembled reads in WB (0.8-5.9%) and S-RSF samples (1.7-4.1%; Fig. 6 A), but none of the two clade B-affiliated MAGs was detected in S-RSF, suggesting that not all clade B co- mammox Nitrospira genomes were recovered. The low abundance of clade B comammox amoA in P-RSF ( Fig. 4 B) indicates an adaptation of clade B comammox Nitrospira to specific niches. Except for one MAG (P-RSF-IL-22), all Nitrospira MAGs contained a nxrAB gene cluster encoding for the alpha and beta subunit of the NXR complex ( Fig. 3 A, Table S3). MAG P-RSF-IL-22 was of medium quality (74.4% estimated completeness; Table 2 ) and did not contain any of the genes required for nitrification (Table S3). Consistent with previous studies ( Palomo et al., 2019 ;Poghosyan et al., 2019 ), phylogenomic analysis using a concatenated alignment of 91 singlecopy core genes showed that clade A and B comammox Nitrospira formed monophyletic clades within Nitrospira lineage II, and all amoA -containing MAGs were correctly affiliated with their respective comammox clade ( Fig. 5 A).
One MAG (WB-IL-22) affiliated with the genus Nitrosoarchaeum (AOA; Fig. 5 B) was retrieved from the WB metagenome; however, it lacked an amoA sequence. Since a contig ( < 1200 bp) containing a Nitrosoarchaeum -like amoA was identified in the metagenome, the gene most likely is missing from this MAG due to the size cutoff used during binning (1500 bp). Consistent with gene-based analyses (0.1-1.7%; Fig. 4 B) the Nitrosoarchaeum MAG was found solely in WB, where it accounted for merely ~0.2% of all the as-sembled reads in the WB datasets ( Fig. 2 , Table S2). The recovered archaeal AmoA sequence had high similarities to Nitrosarchaeum koreense and sequences derived from metagenomic analyses of diverse habitats ( Figure S4). Betaproteobacterial amoA sequences of members of the genera Nitrosomonas and Nitrosospira were identified in all samples, but were of low abundance. In addition to the characterized ammonia-oxidizing clades, recently the unclassified Gammaproteobacteria "MBAE14" genome was found to contain putative AMO genes ( Mori et al., 2019 ). From the DWTP Breehei metagenomes, three putative AmoA sequences clustered in this novel sequence group, albeit with low similarities to the putative enzyme of MBAE14 ( Fig. 4 A, S4).

Methane and one-carbon metabolism
Water analyses indicated that most of the methane present in the incoming groundwater was removed during the aeration step, and the remainder was oxidized in the P-RSF ( Table 1 ). Consequently, we determined microbial methane uptake rates in the P-RSF and WB samples. Complete methane oxidation was achieved in all samples within 5 to 8 days of incubation, except for the control containing only raw water ( Figure S1). As expected, methane consumption in the P-RSF sample increased with an increasing amount of sand material, whereas no significant difference in methane uptake was detected between the incubations with 1 and 2.5 g of WB biomass ( Figure S1). These identical oxidation rates in the WB incubations seem counterintuitive, but may be caused by sample inhomogeneity and uneven distribution of methanotrophic bacteria within the biofilm.
Taxonomic profiling of the extracted 16S rRNA genes ( Figure  S3) identified Gammaproteobacteria as the most dominant taxa in WB (40-55%) and P-RSF (61-67%) samples. On the family level, Methylomonadaceae and Methylophilaceae formed the most abundant groups in WB and P-RSF ( Figure S3B). In total, 14 MAGs belonging to these two phylogenetic groups were recovered, which were mainly present in WB and P-RSF samples ( Fig. 2 ).
Phylogenomic analysis of the 14 Methylomonadaceae and Methylophilaceae -affiliated MAGs using a concatenated alignment of 92 single-copy core genes showed that three of the seven Methylomonadaceae MAGs were affiliated with the genus Methyloglobulus ( Fig. 7 A). Of the remaining MAGs, two (P-RSF-IL-13 and WB-IL-06) were distantly related to Methylobacter and one  to Crenothrix , while P-RSF-IL-24 clustered separately from the known genera within this family ( Fig. 7 A). Six out of seven Methylophilaceae MAGs were affiliated with the genus Methylotenera, whereas P-RSF-NP-02 formed a separate branch within the Methylophilaceae family ( Fig. 7 B).

DWTP microbiome
The DWTP Breehei microbiome was evaluated over a time period of three years and characterized using genome-resolved and gene-centric metagenomic approaches. The microbial community structure was studied for the filter material of two sequential rapid sand filters and the biofilm formed on the walls of the primary sand filter ( Fig. 1 ). Similar to some Danish DWTPs ( Gülay et al., 2016 ), the microbial community composition of the two sequential sand filters differed substantially. In total, 56 dereplicated nearcomplete MAGs were recovered, comprising 23, 64, and 14% of the total assembled reads for WB, P-RSF, and S-RSF, respectively (Table S1). These MAGs expand our knowledge on the genomic inventory of the main microorganisms involved in contaminant removal from groundwater to produce drinking water. However, assembly statistics indicated that the obtained metagenomic information especially for WB and S-RSF only covers a part of the diversity. In general, the genome-centric approach is a powerful tool to analyze the functional potential of an environmental sample based on recovered MAGs. However, due to low abundance or strain diversity, it often is difficult to obtain good-quality genomes for many microorganisms ( Sczyrba et al., 2017 ). Thus, to examine the full ammonia and methane oxidation potential, we also used a genecentric approach (see below). More information on the microorganisms involved in the removal of other contaminants is given in the Supplementary Results and Discussion.

Ammonia and nitrite metabolism
The removal of ammonium and nitrite is a vital step in drinking water treatment. Although the drinking water produced in DWTP Breehei is of high quality and free of any nitrogen compounds, the ranges of nitrite concentrations in the effluents of the P-RSF and S-RSF indicate intermediate nitrite accumulation, which can be caused by incomplete nitrification ( de Vet et al., 2012 ;Wagner et al., 2016 ). Thus, profound insights into the microbial key players and their metabolic capabilities and limitations are crucial for optimizing and stabilizing N removal in these systems. The genomic potential for nitrification was mainly identified in MAGs af-filiated with the genus Nitrospira , which however did not display as high abundances as observed in other DWTP systems ( Albers et al., 2015 ;Gülay et al., 2016 ;Palomo et al., 2016 ). Comammox Nitrospira were the most abundant nitrifying guild in all DWTP Breehei samples ( Fig. 4 B), corroborating previous findings of Nitrospira dominating the nitrifying microbial community in DWTPs ( Albers et al., 2015 ;Gülay et al., 2016 ). Until recently, the dominance of this groupwas puzzling since Nitrospira were always regarded as strictly nitrite-oxidizing bacteria. By now, this imbalance in abundance of Nitrospira and canonical AOB can be explained by the presence of complete nitrifying Nitrospira in many sand filtration systems Pinto et al., 2015 ;Wang et al., 2017 ). Notably, complete nitrifiers are the most abundant nitrifying group in several Danish RSFs  and were identified as key drivers of ammonia and nitrite oxidation in these systems ( Gülay et al., 2019 ). Consistent with these previous results, comammox Nitrospira also apparently outcompeted canonical AOB and AOA in DWTP Breehei. Comammox clade A was found to dominate in P-RSF samples and clade B in WB and S-RSF ( Fig. 4 B). Thus, sampling location substantially influenced the abundance of the two comammox clades, indicating potential niche partitioning between them. The habitat preferences of different nitrifiers might be explained by their ammonia oxidation kinetics and substrate affinities ( Kits et al., 2017 ;Martens-Habbena et al., 2009 ;Prosser and Nicol, 2012 ). According to the kinetic theory of optimal pathway length ( Costa et al., 2006 ) and physiological analyses of N. inopinata ( Kits et al., 2017 ), comammox Nitrospira are K-strategists that have a competitive advantage in environments with very low ammonium fluxes. However, since no comammox clade B enrichment culture is currently available, we can only speculate about the niche-defining metabolic capabilities of this group. It has been shown that in Danish groundwater and Chinese surface water-treating sand filters with influent ammonium concentrations ranging from 0.01-0.53 mg-N/L and ~0.01-0.035 mg-N/L, respectively, clade B comammox Nitrospira dominated the nitrifying microbial populations Hu et al., 2020 ). A recent study also reported that nitrification activity in forest and paddy soils when subjected to ammonium limitation is associated with clade B rather than clade A comammox Nitrospira ( Wang et al., 2019 ). Comammox clade B appeared to dominate also in forest soil under increasing nitrogen load and decreasing pH ( Shi et al., 2018 ). Under acidic conditions, ammonia (NH 3 ) is increasingly protonated to ammonium (NH 4 + ), resulting in extremely low concentrations of bioavailable ammonia, the substrate of AMO. The higher abundance of clade B comammox Nitrospira in S-RSF compared to P-RSF observed here also suggests their adaptation to ammoniumdepleted environments. These results indicate that clade B comammox Nitrospira may exhibit an even lower half-saturation constant (K s ) and higher substrate affinities than clade A species and thrive at extremely low ammonium concentrations. However, it will be required to obtain clade B comammox Nitrospira in culture to ascertain the physiology of these enigmatic bacteria.
In addition to comammox Nitrospira , also two canonical Nitrospira MAGs were retrieved from the Breehei DWTP metagenomes, indicating that these nitrite oxidizers can interact with canonical ammonia oxidizers as well as with complete nitrifiers. Moreover, we identified MAGs affiliated with the Verrucomicrobiota and Planctomycetota phyla ( Fig. 3 ) that contained NXR-like sequences with similarity to the enzymes of known NOBs. In phylogenetic analyses of NxrA, they clustered with Nitrospira, Nitrospina, and anammox bacteria ( Figure S4B). The apparent periplasmic orientation of the NxrA and the lack of transmembrane helices in the NxrC subunit suggests that the NXR of these putative NOBs may be soluble, as has also been proposed for Nitrospina gracilis ( Lücker et al., 2013 ) and "Ca. Nitrotoga fabula" ( Kitzinger et al., 2018 ). However, further studies are needed to analyze the poten-tial nitrite-oxidizing capacity of these bacteria. Although in a very low abundances, Nitrobacter species are also detected in some RSFs ( Tatari et al., 2017 ), which we however did not observe in DWTP Breehei.

Methane and one-carbon metabolism
In DWTPs, methane stripping via aeration is preferred over bacterial methane oxidation since microbial activity and growth cause accumulation of extracellular polymeric substances that lead to clogging of the biofilter material ( Streese and Stegmann, 2003 ). However, both WB and P-RSF samples showed high methaneoxidizing capacity ( Figure S1). Especially the wall biofilm might counteract methane blowout by oxidizing methane before it leaves the filter and thus reduce methane emissions to the atmosphere. In the global carbon cycle, methane-oxidizing microorganisms play a significant role ( Cicerone and Oremland, 1988 ) as they represent the only known biological methane sink ( Aronson et al., 2013 ). These organisms are able to grow with methane as sole carbon and energy source. The first step of methane oxidation, methane activation and conversion to methanol, is catalyzed either by soluble (sMMO) or particulate (pMMO) methane monooxygenases ( Tavormina et al., 2011 ;Trotsenko and Murrell, 2008 ). Especially the sMMO is not universal to methanotrophs and only certain phylogenetic groups are known to encode this methane monooxygenase type ( Verbeke et al., 2019 ), usually together with pMMO.
Several studies have shown that the majority of the methaneoxidizing bacteria colonizing the granular material of RSFs are affiliated with the gammaproteobacterial Methylococcaceae family ( Albers et al., 2015 ;Gülay et al., 2016 ;Palomo et al., 2016 ). Recently, this family was reclassified and split into the Methylomonadaceae, Methylococcaceae, and Methylothermaceae . Especially members of the Methylomonadaceae have been found in many natural and engineered systems ( Flynn et al., 2016 ;Hoefman et al., 2014 ;Kalyuzhnaya et al., 2015 ;Kits et al., 2013 ;Ogisoet al., 2012 ;Oswald et al., 2017 ;Parks et al., 2017 ;Svenning et al., 2011 ). In this study, all recovered Methylomonadaceae MAGs contained pMMO, and some of them additionally sMMO ( Fig. 3 , Table S2). In addition, several methanotrophic MAGs also contained the highly divergent pXMO enzyme complex encoded by the pxmABC gene cluster ( Tavormina et al., 2011 ). Previous studies have shown that pXMO is involved in methane oxidation under hypoxic denitrifying conditions in Methylomonadaceae strains ( Kits et al., 2015a( Kits et al., , 2015b, but its exact function in the sand filters remains to be determined. Methane oxidation results in the production of various C1 intermediates including methanol and formate, which can be used as substrates by methylotrophic bacteria. In DWTP Breehei, the P-RSF samples were dominated by members of the Methylophilaceae family, which was assumed to accommodate only methylotrophic bacteria incapable of growth on methane. Surprisingly, one of the Methylophilaceae MAGs (P-RSF-NP-02) harbored a complete operon encoding a sMMO ( Fig. 3 , Table S2), indicating a methanotrophic potential. This finding is substantiated by an earlier study, which demonstrated that a member of methylotrophic genus Methyloceanibacter can become methanotrophic by acquiring sMMO ( Vekeman et al., 2016 ). The phylogenomic analysis and low ANI value to other members of this family (77.26%) suggest that P-RSF-NP-02 represents a novel species within the Methylophilaceae ( Fig. 7 B) and the extremely high coverage of this MAG indicates a major role in methane removal in the DWTP Breehei. The high abundance of this potential new novel methane-oxidizing bacterium in the P-RSF will be facilitated by the high iron content of the influent water, as the sMMO contains a diiron cluster in the active site ( Jasniewski and Que, 2018 ; Wallar and Lipscomb, 1996 ).

Conclusion
• Metagenomic analyses enabled us to identify key microbial populations involved in the removal of ammonium and methane. While we did not analyze the influence of physicochemical parameters like temperature, pH, or organic carbon content, we found the location within the DWTP Breehei to be a major influential factor shaping the microbial community. • Clade A comammox Nitrospira dominated the nitrifying microbial community in P-RSF, while clade B was most abundant in S-RSF, where ammonium concentrations are the lowest, and in the biofilm (WB), which is a predicted niche for comammox bacteria. • The methanotrophic community was dominated by sMMOcontaining bacteria, particularly by one novel Methylophilaceae member, which might be facilitated by a high iron concentration in the groundwater.

Author contributions
LP and HOdC collected and processed samples. LP and HK analyzed and interpreted the data. JF and GC contributed to bioinformatics analyses. TvA and GC performed Illumina and Nanopore sequencing. SL, HOdC, MJ and MAHJvK were involved in project discussion and data interpretation. SL conceived the research project. LP, HK, and SL wrote the manuscript with input from all the authors.

Data availability
The genome sequences of the 56 MAGs recovered in this study and the raw sequencing data have been deposited in GenBank under BioProject number PRJNA622654. Nucleotide and amino acid sequences of amoA / pmoA, mmoX and nxrA / narG genes extracted from the full co-assembly and the recovered MAGs are supplied as supplementary files (Additional data files 1-3, respectively).

Declaration of Competing Interest
The authors declare that they have no competing interests.