Soil metagenome datasets underneath the Arecibo Observatory reflector dish

The Arecibo Observatory (AO), located in Arecibo, Puerto Rico, is the most sensitive, powerful and active planetary radar system in the world [1]. One of its principal components is the 305 m-diameter spherical reflector dish (AORD), which is exposed to high-frequency electromagnetic waves. To characterize the microbial communities that inhabit this environment, soil samples were collected from underneath the AORD, and DNA was extracted and sequenced using Illumina MiSeq. Taxonomic and functional profiles were generated using the MG-RAST server. The most abundant domain was Bacteria (91%), followed by Viruses (8%), Archaea (0.9%) and Eukaryota (0.9%). The most abundant phylum was Proteobacteria (54%), followed by Actinobacteria (8%), Bacteroidetes (5%) and Firmicutes (4%). In terms of functions, the most abundant category in the metagenome was phages, transposable elements and plasmids (16%), followed by clustering-based subsystems (11%), carbohydrates (10%), and amino acids and derivatives (9%). This is the first soil metagenomic dataset from dish antennas and radar systems, specifically from underneath the AORD. The data can be used to explore the effect of high-frequency electromagnetic waves on soil microbial composition, as well as the possibility of finding bioprospects with potential biomedical and biotechnological applications.


Data
The Arecibo Observatory (AO), located in Arecibo, Puerto Rico (18.3442, −66.7526), is the most sensitive, powerful and active planetary radar system in the world [1]. One of the principal components of the AO is its 305 m-diameter spherical reflector dish (AORD). It is made of ~40,000 perforated aluminum panels supported by steel cables over a natural karst sinkhole in a lowland moist and wet seasonal evergreen and semi-deciduous forest [2,3]. This reflective surface focuses radio emissions originating from the sky into the antennas and redirects radar waves to objects in the solar system. The AO operates at frequencies from 50 to 10,000 MHz, which have been shown to have adverse effects on microbial growth [4–6]. Even though these electromagnetic waves might not reach the microorganisms in the soil underneath the AORD, the effect of the influx (due to rain) of microorganisms that are exposed on top of the dish is unknown. For this study, we sampled the soil underneath the AORD (Fig. 3), sequenced the metagenome, and described the taxonomic diversity (Fig. 1) and functional (Fig. 2) profiles of the microbial communities. This dataset, containing raw FASTQ files and figures, is part of the first study that assesses the microbial community underneath the AORD.

Sampling
Eight soil samples were collected (20 g each at 5 cm depth) from underneath the AORD: four from the periphery (a) 18. 3454

DNA extraction
Metagenomic DNA was extracted individually from each of the eight soil samples using the PowerSoil® DNA Isolation Kit (MO BIO Laboratories) following the manufacturer's protocol, except that 0.30 g of soil was used per sample and the DNA was resuspended in 50 µL of 1X TE buffer (Tris-EDTA: 10 mM Tris-HCl, 1 mM EDTA; pH 8.0).

Metagenome sequencing
The extracted DNA was pooled and then sequenced at the Molecular Research DNA Laboratory (MR DNA, Shallowater, TX, USA, www.mrdnalab.com). The initial concentration of the pooled sample was 13.20 ng/µL, measured using the Qubit™ dsDNA HS Assay Kit (Life Technologies). A genomic library was constructed with 50 ng of the pooled sample using the Nextera DNA Sample Preparation Kit (Illumina). After fragmentation and addition of adapter sequences, the final concentration of the library was 14.60 ng/µL, measured using the Qubit™ dsDNA HS Assay Kit (Life Technologies), and the average fragment length was 1273 bp, determined using the Agilent 2100 Bioanalyzer (Agilent Technologies). The library was diluted to 12.0 pM and paired-end sequenced using the Illumina MiSeq Reagent Kit v3 for 600 cycles. Sequences were preprocessed using FastQC [7] as a quality control check, and the FASTX toolkit [8] to remove adapter sequences (FASTQ Clipper) and to trim portions of sequences with a Phred score < 30 (FASTQ Quality Trimmer).
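The quality-trimming step can be illustrated with a minimal sketch. It is not the FASTX implementation; it assumes Phred+33 quality encoding and trimming from the 3' end only, and the function name `trim_3prime` and the example read are hypothetical.

```python
def trim_3prime(seq, qual, threshold=30, offset=33):
    """Remove trailing bases whose Phred score is below `threshold`.

    `qual` is the FASTQ quality string, assumed to be Phred+33 encoded
    (an assumption; check the encoding of the actual files).
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    end = len(seq)
    # Walk back from the 3' end until a base meets the threshold.
    while end > 0 and ord(qual[end - 1]) - offset < threshold:
        end -= 1
    return seq[:end], qual[:end]


# Illustrative read (not from the dataset): 'I' = Q40, '?' = Q30,
# '5' = Q20, '#' = Q2, so the last two bases are removed.
trimmed_seq, trimmed_qual = trim_3prime("ACGTACGT", "IIIII?5#")
print(trimmed_seq, trimmed_qual)
```

With a threshold of 30, a base at exactly Q30 is retained, which matches the "Phred score < 30" criterion stated above.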

Taxonomic and functional profiling
Processed sequences were uploaded to the Metagenomics Rapid Annotation using Subsystems Technology server (MG-RAST, www.mg-rast.org) [9]. The in silico profile generated with MG-RAST for the microbial community underneath the AORD includes classification based on taxonomic diversity (RefSeq) and gene function (Subsystems, level 1) (Figs. 1 and 2).
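Percentages such as those reported for domains, phyla and functional categories are relative abundances, i.e. the hits assigned to a category divided by all assigned hits. The sketch below shows that calculation under the assumption that per-category hit counts have been exported to a simple mapping; the function name `relative_abundance` and the counts are placeholders, not values from this dataset or from the MG-RAST API.

```python
def relative_abundance(counts):
    """Convert per-category hit counts to percentages, largest first."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no hits to normalize")
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return {name: 100.0 * hits / total for name, hits in ranked}


# Placeholder counts for illustration only (not measured data).
example_counts = {"taxon_B": 30, "taxon_A": 60, "taxon_C": 10}
for name, percent in relative_abundance(example_counts).items():
    print(f"{name}: {percent:.1f}%")
```

Because the percentages are normalized to the total number of assigned hits, unassigned reads are excluded from the denominator in this sketch.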