Thaumarchaea Genome Sequences from a High Arctic Active Layer

The role of archaeal ammonia oxidizers often exceeds that of bacterial ammonia oxidizers in marine and terrestrial environments but has been understudied in permafrost, where thawing has the potential to release ammonia. Here, three thaumarchaea genomes were assembled and annotated from metagenomic data sets from carbon-poor Canadian High Arctic active-layer cryosols.

Recent studies have shown that ammonia-oxidizing archaea (AOA) often outnumber ammonia-oxidizing bacteria (AOB) (1–3) and that AOA account for most of the ammonia oxidation in terrestrial soils (3–9). However, their taxonomic diversity and their potential role in both fixed nitrogen losses and the release of ozone-depleting nitric oxide or nitrous oxide are poorly understood in terrestrial permafrost ecosystems (10–12). Here, we report three draft genome sequences of thaumarchaea that are potential chemolithoautotrophic ammonia oxidizers and may play a significant role in the nitrogen cycle of the Canadian High Arctic.
These three genomes were constructed through the analysis of 21 metagenomes from mineral cryosols at 5-cm depth retrieved in our previously published studies (13,14). Metagenomic libraries were prepared using the Illumina Nextera DNA library preparation kit (Illumina, Inc., San Diego, CA), followed by 100-bp paired-end DNA sequencing on an Illumina HiSeq 2000 platform (13). A total of 498,483,227 forward and reverse reads were filtered separately for quality using tools available on the Princeton University Galaxy server as follows: reads in which 90% of the bases had a Phred score of <30 were removed using "filter by quality" v1.0.0; Nextera transposase adaptor sequences were trimmed using cutadapt v1.6; FASTQ Trimmer v1.0.0 was used to remove the last five bases at the 3′ end; and trimmed reads with fewer than 50 nucleotides (nt) were removed. Because forward and reverse reads were filtered separately, one read of a pair may have been discarded; in that case, the remaining read is referred to as a single read. Default parameters were used for all software unless otherwise specified.
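The filtering rules above can be illustrated with a minimal Python sketch. It is not the Galaxy workflow used in this study: the function names, the Phred+33 quality encoding, and the literal reading of the 90% rule are assumptions made for illustration, and adaptor trimming (performed with cutadapt in the actual workflow) is omitted.

```python
from typing import Optional, Tuple

PHRED_OFFSET = 33  # assumption: Phred+33 quality encoding

Read = Tuple[str, str]  # (sequence, quality string)


def low_quality_fraction(qual: str, min_phred: int = 30) -> float:
    """Return the fraction of bases whose Phred score is below min_phred."""
    if not qual:
        return 1.0
    low = sum(1 for ch in qual if ord(ch) - PHRED_OFFSET < min_phred)
    return low / len(qual)


def filter_read(
    read: Read,
    min_phred: int = 30,
    max_low_fraction: float = 0.9,
    trim_3prime: int = 5,
    min_length: int = 50,
) -> Optional[Read]:
    """Apply the quality, 3' trimming, and length rules; return None if the read is removed."""
    seq, qual = read
    # Literal reading of the text: remove reads in which 90% of bases are below Phred 30.
    if low_quality_fraction(qual, min_phred) >= max_low_fraction:
        return None
    # Adaptor trimming would happen here in the real workflow (omitted in this sketch).
    if trim_3prime > 0:
        seq, qual = seq[:-trim_3prime], qual[:-trim_3prime]
    # Remove trimmed reads shorter than the minimum length.
    if len(seq) < min_length:
        return None
    return seq, qual


def classify_pair(forward: Read, reverse: Read) -> str:
    """Label a pair as 'paired', 'single' (one mate survived), or 'discarded'."""
    kept = [filter_read(forward) is not None, filter_read(reverse) is not None]
    if all(kept):
        return "paired"
    if any(kept):
        return "single"
    return "discarded"
```

In this sketch, a pair in which only one mate passes all filters is labeled "single", matching the definition of a single read given above.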
Individual metagenomes were assembled using IDBA-UD v1.1.1 (15), and the scaffolds were sorted into taxonomic bins using MetaBAT v0.32.4 in "very specific" mode (16). The bins recovered from each of the 21 metagenomes were analyzed using CheckM v1.0.7 to assess their quality and taxonomy (17). The taxa appearing multiple times across the 21 metagenomes included 10 Nitrosopumilales bins within the phylum Thaumarchaeota (completeness, 3.88 to 97.73%; contamination, 0 to 2.02%). Of these, three Nitrosopumilales bins were 71.62 to 97.73% complete. The quality-filtered reads from the three corresponding metagenomes (NCBI Sequence Read Archive accession numbers SRR1586318, SRR1586310, and SRR1586268) were mapped onto their respective genome bins using Bowtie 2 v2.3.2 (18), and the mapped reads were extracted and reassembled separately using IDBA-UD v1.1.1. The number of mapped reads was used to calculate the depth of coverage. The quality and taxonomy of the draft genomes were evaluated using CheckM v1.0.7. The statistics of these three Nitrosopumilales draft genome sequences are presented in Table 1.
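The exact formula used to derive the depth of coverage from the number of mapped reads is not given here. A common approximation, shown below as a sketch under that assumption, divides the total number of mapped bases by the assembly length. The function name and the input values in the usage note are hypothetical and are not taken from Table 1.

```python
def coverage_depth(mapped_reads: int, mean_read_length: float, assembly_length: int) -> float:
    """Approximate mean depth of coverage as mapped bases divided by assembly length.

    Assumption: every mapped read contributes mean_read_length bases to the assembly.
    """
    if assembly_length <= 0:
        raise ValueError("assembly_length must be positive")
    if mapped_reads < 0 or mean_read_length < 0:
        raise ValueError("mapped_reads and mean_read_length must be non-negative")
    return mapped_reads * mean_read_length / assembly_length
```

For example, with purely illustrative values, 200,000 mapped reads of 95 nt on a 1.9-Mbp assembly would give `coverage_depth(200_000, 95, 1_900_000)`, i.e., an approximate depth of 10×.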
Data availability. This genome assembly project and the three high-completion genome sequences have been deposited at NCBI GenBank under the accession numbers WJXC00000000, WJXD00000000, and WJXE00000000 (BioSample numbers SAMN11973980, SAMN11973981, and SAMN11973982 and BioProject number PRJNA548371). The versions described in this paper are the first versions, WJXC01000000, WJXD01000000, and WJXE01000000. The raw reads were deposited at the NCBI Sequence Read Archive under the accession number SRP047512 (13).

ACKNOWLEDGMENTS
We thank the staff at Research Computing, Office of Information Technology, Princeton University, for their technical support with the computational analyses. We also thank the GEO523 2016 class for filtering the metagenomic reads on the Galaxy platform.