High-throughput amplicon sequencing datasets of microbial community in soils irrigated by quicklime and fly ash-treated acid mine drainage water

In water-stressed regions, the use of treated acid mine drainage (AMD) water for irrigated agriculture has been suggested as an alternative to address the shortage of fresh water sources. However, the short- and long-term impact of using such (un)treated AMD water on soil health, particularly the microbiome structure and functional capacity, is not known. We present high-throughput amplicon sequence (HTS) datasets of purified microbial metacommunity DNA of soils under Irish potato production irrigated with quicklime- and fly ash-treated AMD water. The irrigation treatments included quicklime-treated AMD water (A1Q and A2Q; n = 16), quicklime and fly ash-treated AMD water (AFQ; n = 5), untreated AMD water (uAMD; n = 7), and a control group using tap water (n = 5). The V1-V3 hypervariable region of the 16S rRNA gene from each sample was sequenced on an Illumina MiSeq to generate these HTS datasets. The raw sequences underwent quality-checking, demultiplexing into FASTQ files, and processing using the MOTHUR pipeline (v1.40.0). The quality reads were classified into taxonomic ranks (phylum, class, order, family, and genus) using the Naïve Bayesian classifier algorithm against the SILVA database (v132) and were assigned to operational taxonomic units (OTUs) based on the pairwise distance matrix (Euclidean distance matrix). The applicability of the HTS datasets was confirmed by microbial taxa at the phylum level. All HTS datasets are available through the BioSample Submission Portal under the BioProject ID PRJNA974836 (https://www.ncbi.nlm.nih.gov/bioproject/974836).



Value of the Data
• The HTS dataset provides valuable insights into the impact of treated acid mine drainage (AMD) water on soil health and agricultural productivity in irrigated agriculture, especially considering the expected increase in AMD water usage due to global climate change.
• The metagenome datasets demonstrate the presence of diverse microbial communities and their relative abundances, including some that are unique to the samples of different AMD water irrigation treatments, thus confirming the relevance and applicability of the data in capturing the distinct microbial signatures associated with various irrigation strategies.
• By sharing the raw HTS metagenome datasets through the publicly accessible sequence repository (NCBI), researchers can use the data to identify the key factors influencing perturbations in the soil microbiome within agricultural ecosystems irrigated with treated AMD water.
• The raw HTS metagenome datasets are valuable resources for the broader scientific community, serving as a baseline for monitoring the stability of agricultural soils exposed to high heavy metal and sulfate levels. Researchers can utilize these data for their own research objectives.

Objective
The primary goal of the HTS datasets was to delineate the complex interaction between the soil microbiome and agricultural practices, specifically focusing on the impacts of quicklime- and fly ash-treated AMD water irrigation on potato plots. The specific objectives were to comprehend the mechanisms optimizing bacterial richness and to explore the interrelationship between bacterial function and the soil environment in both treated and untreated AMD water-irrigated agriculture. The assessed quality and relevance of these HTS datasets provide a substantial amount of information that can be harnessed for future research endeavors, enabling a deeper understanding of the soil microbiome in agricultural systems and shedding light on the potential effects of both treated and untreated AMD water irrigation treatments.

Sample
community structure among the various irrigation treatments. Overall, the dataset's ability to capture and differentiate these microbial patterns underscores its significance for studying the impacts of (un)treated AMD water irrigation strategies on soil microbiomes in agricultural systems.

Experimental Design and Sampling
AMD water samples were collected from the mine tailings dam of the Sibanye Gold Mine located in Randfontein, Gauteng Province, South Africa, as previously described [34]. The AMD water was subjected to three different treatments with quicklime and fly ash before use as irrigation water in the experiments: 10 g quicklime + 990 mL AMD water (A1Q); 20 g quicklime + 980 mL AMD water (A2Q); and 10 g quicklime + 250 g fly ash + 740 mL AMD water (AFQ). The AMD water was treated with quicklime and fly ash based on the protocol by Othman et al. [35].
The greenhouse experiment was carried out at the Ceres Greenhouse Facility, University of South Africa (UNISA), Florida Science Campus, Roodepoort, Johannesburg, Gauteng Province (26°10′30″ S, 27°55′22.8″ E). The experiment was conducted over two cropping seasons, from September to December 2018 (Season 1) and from March to June 2019 (Season 2). The layout of the experiment was a completely randomized block design with at least five replications for each irrigation water treatment (Control, AMD, A1Q, A2Q, and AFQ) tested for each potato cultivar. Potatoes were sown in 5 L pots filled with 3:1:1 Culterra topsoil + vermiculite + river sand, the typical soil composition in the Gauteng region, South Africa. The typical physicochemical parameters of the soil substrate at the start of the experiment were as follows: pH 7.26, bulk density (BD) 1.23 g cm−3, 15.63 g kg−1 soil organic carbon (SOC), 0.81 g kg−1 total nitrogen (TN), 33.2 mg kg−1 available phosphorus (AP), and 163.3 mg kg−1 available potassium (AK). Potato tubers were planted in each experimental pot at a depth of 7.5 cm and irrigated with 500 mL of water every 2 days until harvesting. The mean temperature, relative humidity, and light intensity during the study period were 25 °C, 50%, and 1100 μmol s−1 m−2, respectively.
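The three treatment recipes each sum to a nominal 1000 units, which is where labels such as "1% quicklime" come from. The following minimal sketch reproduces that arithmetic. It assumes that 1 mL of AMD water can be counted as 1 g so that grams and millilitres can be added; that simplification, and the names `RECIPES` and `nominal_percent`, are illustrative and not part of the original protocol.

```python
# Treatment recipes as stated in the text: grams of solids and mL of AMD water.
RECIPES = {
    "A1Q": {"quicklime_g": 10, "fly_ash_g": 0, "amd_ml": 990},
    "A2Q": {"quicklime_g": 20, "fly_ash_g": 0, "amd_ml": 980},
    "AFQ": {"quicklime_g": 10, "fly_ash_g": 250, "amd_ml": 740},
}


def nominal_percent(recipe, component):
    """Percent of one component in the nominal mixture.

    Assumption (not from the source): 1 mL of AMD water is counted as 1 g,
    so that solid masses and water volume can be summed directly.
    """
    total = recipe["quicklime_g"] + recipe["fly_ash_g"] + recipe["amd_ml"]
    return 100.0 * recipe[component] / total


for name, recipe in RECIPES.items():
    quicklime = nominal_percent(recipe, "quicklime_g")
    fly_ash = nominal_percent(recipe, "fly_ash_g")
    print(f"{name}: quicklime {quicklime:.1f}%, fly ash {fly_ash:.1f}%")
```

Under this assumption A1Q and AFQ both carry a nominal 1% quicklime, A2Q carries 2%, and AFQ additionally carries 25% fly ash.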

High-Throughput Sequencing and Taxonomic Profiling
Metagenomic DNA was extracted from triplicate soil samples for each treatment, each weighing 0.5 g, using the PowerSoil® DNA isolation kit (MoBio Laboratory, CA, USA) according to the manufacturer's instructions. PCR amplification of the bacterial 16S rRNA V1-V3 hypervariable region was performed using the 27F (5'-AATGATACGGCGACCACCGAGATCTACAC TATGGCGAGTGA AGAGTTTGATCMTGGCTCAG-3') and 519R (5'-CAAGCAGAAGACGGCATACGAGAT AGTCAGTCAGGG GWATTACCGCGGCKGCTG-3') primer pair, fused with MiSeq adapters (underlined) and heterogeneity spacers (bolded) compatible with Illumina indexes for multiplex sequencing, as described by Ogola et al. [36]. The generated libraries were sequenced using the Illumina MiSeq 2 × 300-bp paired-end platform at the University of South Africa, Florida Science Campus. The raw sequences of the HTS datasets have been deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) under BioProject ID PRJNA974836 (https://www.ncbi.nlm.nih.gov/bioproject/974836).
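The primers contain IUPAC degenerate bases (M, W, K), so each primer represents a small pool of concrete oligonucleotides. The sketch below expands the final space-separated segment of each primer string into the sequences it encodes. Treating that last segment as the 16S-binding portion is an assumption, since the underline and bold formatting is not preserved in this text; the helper names `gene_specific` and `expand` are illustrative.

```python
from itertools import product

# IUPAC codes needed for these primers: plain bases plus M (A/C), W (A/T), K (G/T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "M": "AC", "W": "AT", "K": "GT"}

# Primer strings copied from the text; spaces separate the three segments.
PRIMER_27F = "AATGATACGGCGACCACCGAGATCTACAC TATGGCGAGTGA AGAGTTTGATCMTGGCTCAG"
PRIMER_519R = "CAAGCAGAAGACGGCATACGAGAT AGTCAGTCAGGG GWATTACCGCGGCKGCTG"


def gene_specific(primer):
    """Return the last space-separated segment.

    Assumption: this segment is the part that anneals to the 16S rRNA gene.
    """
    return primer.split()[-1]


def expand(seq):
    """List every non-degenerate sequence encoded by an IUPAC string."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in seq))]


for name, primer in (("27F", PRIMER_27F), ("519R", PRIMER_519R)):
    variants = expand(gene_specific(primer))
    print(name, len(variants), variants)
```

One degenerate position in the 27F segment yields two variants, and two degenerate positions in the 519R segment yield four.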
The raw FASTQ data files obtained from the MiSeq sequencing platform were initially processed to remove PCR artefacts, Illumina tags, and low-quality reads. This trimming was performed using the ngsShoRT trimmer algorithm [37]. The resulting trimmed reads were then merged and further processed using the MOTHUR pipeline, version 1.40.0 [38]. Within the MOTHUR pipeline, the sequence reads underwent quality filtering so that only high-quality reads were retained. Chimeric sequences and potential eukaryotic contaminants were identified and removed using the UCHIME algorithm [39]. To classify the remaining sequences, the Naïve Bayesian classifier algorithm [40] was employed, with the SILVA database version 132 [33] as the reference database. Operational taxonomic units (OTUs) were determined by grouping the aligned sequences based on a pairwise distance matrix using the Euclidean distance measure [38]. The clustering was performed at a sequence similarity threshold of 97% at various taxonomic levels, including phylum, class, order, family, and genus.
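To illustrate what a 97% similarity threshold means for OTU assignment, the following toy sketch groups aligned reads with a simple greedy, seed-based rule. This is not the MOTHUR clustering procedure or the distance measure used for the datasets; it is a deliberately simplified stand-in, and the function names and example reads are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs, threshold=0.97):
    """Assign each sequence to the first OTU whose seed it matches at >= threshold.

    A sequence that matches no existing seed becomes the seed of a new OTU.
    Returns one OTU index per input sequence, in input order.
    """
    seeds, labels = [], []
    for s in seqs:
        for i, seed in enumerate(seeds):
            if identity(s, seed) >= threshold:
                labels.append(i)
                break
        else:
            seeds.append(s)
            labels.append(len(seeds) - 1)
    return labels


# Hypothetical 100-base aligned reads: the second differs from the first at
# 2 positions (98% identity), the third at 10 positions (90% identity).
reads = ["A" * 100, "A" * 98 + "CC", "A" * 90 + "C" * 10]
print(greedy_otus(reads))  # [0, 0, 1]
```

The first two reads fall into the same OTU because they are 98% identical, while the third starts a new OTU. Greedy assignment depends on input order, which is one reason production pipelines use more careful clustering methods.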
Bulk soil samples were collected from each experimental pot producing Irish potatoes, irrigated with quicklime-treated AMD water (A1Q and A2Q), quicklime and fly ash-treated AMD water (AFQ), untreated AMD water (uAMD), and tap water (Control) over two cropping seasons between September 2018 and June 2019. Total DNA was purified from the replicate soil samples of each treatment using standardized methods, and DNA libraries were sequenced on an Illumina MiSeq platform with 2 × 300 bp sequencing. Raw data were processed using the MOTHUR pipeline (v1.40.

Table 2
Relative abundance of the representative high-throughput sequences from bulk soils irrigated with (un)treated AMD water. Percentages were calculated across the sample groups: uAMD = untreated AMD water; A1Q = 1% quicklime-treated AMD water; A2Q = 2% quicklime-treated AMD water; AFQ = quicklime and fly ash-treated AMD water; and Control = tap water.