High-throughput sequencing data and antibiotic resistance mechanisms of soil microbial communities in non-irrigated and irrigated soils with raw sewage in African cities

High-throughput sequencing data of soil microbial communities in non-irrigated soils and in soils irrigated with raw sewage in African cities are presented in this report. These data were collected to study the potential of wastewater use in urban agriculture to disseminate bacterial resistance in soil. Soil samples were collected in three cities in two African countries. Each city had two sectors (irrigated and non-irrigated). After collection, biomass samples were purified, and soil DNA was extracted, quantified and sequenced using multiplex Illumina high-throughput sequencing. The sequence count of the six metagenome datasets ranges from 3,258,523,350 bp to 4,120,454,250 bp; the mean sequence length post quality control was 149 ± 3 bp. The mechanisms of resistance encoded by the identified antibiotic resistance genes (ARGs) in the metagenomic data were dominated by antibiotic inactivation enzymes (64.7% and 71.9%), followed by antibiotic target replacement (14.7% and 12.5%), antibiotic target protection (11.8% and 9.4%) and efflux pumps (8.8% and 6.3%) in bacterial DNA isolated from irrigated and non-irrigated fields, respectively. The datasets will be useful for the scientific community working in the area of bacterial resistance dissemination from the environment. They can be used for further understanding of bacterial drug-resistance gene prevalence and acquisition in wastewater-irrigated soils. The data reported herein were used for the article titled “Raw wastewater irrigation for urban agriculture in three African cities increases the abundance of transferable antibiotic resistance genes in soil, including those encoding Extended spectrum β-lactamase (ESBLs)”, Bougnom et al. (2020) [1].

© 2019 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Value of the Data
The data provide insight into the changes in microbial diversity and function after raw sewage irrigation. The data will be useful for the scientific community working in the area of bacterial drug-resistance gene dissemination in the environment. The data can be used for further understanding of bacterial drug-resistance acquisition in wastewater-irrigated soils, and thus for assessing the public health implications of urban agriculture in low- and middle-income countries.

Data
In the present work, we report DNA sequence read metrics of six metagenomic samples from soil obtained from non-irrigated fields (NIR) and their corresponding fields irrigated with raw sewage (IRI) in three cities (Table 1), in two African countries (Fig. 1) [1]. The sequence counts of the metagenome datasets post quality control (QC) ranged from 3,309,468,880 bp to 3,649,105,747 bp in irrigated fields and from 3,159,665,932 bp to 3,682,552,830 bp in non-irrigated fields. The mean GC content post QC ranged from 60 ± 12% to 65 ± 10% in irrigated fields and from 62 ± 12% to 66 ± 9% in non-irrigated fields. The mean sequence length post QC was 149 ± 3 bp. The mechanisms of drug resistance encoded by the identified antibiotic resistance genes (ARGs) in the metagenome data were dominated by antibiotic inactivation enzymes (64.7% and 71.9%), followed by antibiotic target replacement (14.7% and 12.5%), antibiotic target protection (11.8% and 9.4%) and efflux pumps (8.8% and 6.3%) in irrigated and non-irrigated fields, respectively (Fig. 2). The proportion of ARGs encoding drug resistance through antibiotic inactivation enzymes was about 7 percentage points higher in non-irrigated fields (71.9% versus 64.7%), whereas the proportion encoding each of the other mechanisms of resistance was about 2 percentage points higher in irrigated fields.
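The comparison between irrigated and non-irrigated fields is simple arithmetic on the reported mechanism shares, and can be checked with a minimal sketch. The dictionary and function names are illustrative and are not part of the published dataset. The efflux pump shares are taken here as 8.8% (irrigated) and 6.3% (non-irrigated), the assignment under which each column sums to roughly 100%.

```python
# Reported share (%) of ARGs per resistance mechanism, as (irrigated, non-irrigated).
# Assumption: efflux pumps are 8.8% (IRI) and 6.3% (NIR), so that each column sums to ~100%.
shares = {
    "antibiotic inactivation": (64.7, 71.9),
    "antibiotic target replacement": (14.7, 12.5),
    "antibiotic target protection": (11.8, 9.4),
    "efflux pumps": (8.8, 6.3),
}


def column_totals(mechanism_shares):
    """Sum the shares per field type as a sanity check (each should be close to 100)."""
    iri = sum(pair[0] for pair in mechanism_shares.values())
    nir = sum(pair[1] for pair in mechanism_shares.values())
    return round(iri, 1), round(nir, 1)


def differences(mechanism_shares):
    """Difference in percentage points, irrigated minus non-irrigated, per mechanism."""
    return {name: round(iri - nir, 1) for name, (iri, nir) in mechanism_shares.items()}


if __name__ == "__main__":
    print("column totals (IRI, NIR):", column_totals(shares))
    for name, delta in differences(shares).items():
        print(f"{name}: {delta:+.1f} percentage points (IRI - NIR)")
```

A negative difference means the mechanism accounts for a larger share of the ARGs in non-irrigated fields. These are differences in relative share, not in absolute gene abundance.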

Experimental design, materials, and methods
Soil samples were collected in three cities in two African countries, namely Ouagadougou (46°38′0″ N, 11°29′0″) in Burkina Faso, and Ngaoundere (46°38′0″ N, 11°29′0″) and Yaounde (46°38′0″ N, 11°29′0″) in Cameroon (Fig. 1). In each city, there were two sectors: three agricultural fields that were irrigated (IRI) with raw wastewater and, as control soils 500 m away, three non-irrigated agricultural fields (NIR) with comparable soil properties. This gave samples from Ouagadougou (IRI1 and NIR1), Ngaoundere (IRI2 and NIR2), and Yaounde (IRI3 and NIR3). Wastewater was collected from canals. The canals are natural open-air drainage canals that serve as collection points for different transects. They receive wastewater from habitations, hospitals, agriculture, markets and slaughterhouses. Salad and tomatoes were the crops grown in the fields. The agricultural fields were approximately 0.2 ha each and were watered manually twice per day with watering cans. In each field, 100 g of soil was sampled at each of 10 randomly chosen places from 0–20 cm depth, using soil cores. Replicate samples were pooled, giving 1 kg composite samples. The samples were transported on ice and stored at −80 °C until further analysis.
To collect the bacterial cells from the different soils, soil biomass purification was conducted according to Sentchilo et al. (2013) [2]. Briefly, 15 g soil samples were homogenized by magnetic stirring for 15 min in ice-cold PBAE buffer (10 mM Na-phosphate, 10 mM ascorbate, 5 mM EDTA, pH 7.0), at 10 mL g⁻¹ of soil. Low-speed centrifugation in 50-mL conical tubes at 160 × g for 6 min was used to remove coarse particles, large eukaryotic cells and bacterial flocs. The collected supernatants were centrifuged at 10,000 × g for 5 min to pellet the microbial biomass for further analysis.
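The quantities in this purification step can be worked out with a short sketch. It computes the buffer volume for a given soil mass at the stated ratio, and converts a relative centrifugal force into a rotor speed using the standard relation RCF = 1.118 × 10⁻⁵ × r × rpm² (r in cm). The rotor radius used below is a hypothetical value, since the protocol does not state which rotor was used.

```python
import math


def buffer_volume_ml(soil_mass_g, ml_per_g=10.0):
    """Buffer volume needed at the stated ratio (10 mL of buffer per g of soil)."""
    return soil_mass_g * ml_per_g


def rpm_for_rcf(rcf, rotor_radius_cm):
    """Rotor speed (rpm) that yields a given RCF (x g) at a given rotor radius in cm."""
    return math.sqrt(rcf / (1.118e-5 * rotor_radius_cm))


if __name__ == "__main__":
    HYPOTHETICAL_RADIUS_CM = 10.0  # assumption: not given in the protocol
    print("buffer for 15 g of soil:", buffer_volume_ml(15.0), "mL")
    for rcf in (160, 10_000):
        rpm = rpm_for_rcf(rcf, HYPOTHETICAL_RADIUS_CM)
        print(f"{rcf} x g at r = {HYPOTHETICAL_RADIUS_CM} cm: about {rpm:.0f} rpm")
```

Because speed depends on rotor radius, the same force corresponds to different rpm settings on different centrifuges, which is why the protocol reports force rather than speed.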
Soil DNA was extracted using the DNeasy PowerSoil Kit (Qiagen, Germany) according to the manufacturer's instructions. DNA concentration was determined using the Quant-iT PicoGreen dsDNA Assay Kit and the Qubit™ 3.0 Fluorometer (Qubit, Life Technologies, USA). The three DNA samples extracted from the fields of each sector were pooled in equal nanogram quantities. Six DNA samples representative of the three cities were sent to Edinburgh Genomics for high-throughput sequencing.

ARGs in the metagenomes were profiled with ShortBRED [3]. ShortBRED profiles protein family abundance in metagenomes in two steps: (i) ShortBRED-Identify isolates representative peptide sequences (markers) for the protein families, and (ii) ShortBRED-Quantify maps metagenomic reads against these markers to determine the relative abundance of their corresponding families, expressed as reads per kilobase per million reads (RPKM). Hits with a minimum identity of 95% and a minimum fragment length of 30 amino acids were considered positive. ARGs were identified with the Comprehensive Antibiotic Resistance Database (CARD) (McArthur et al., 2013) [4]. ARG markers were generated using the comprehensive and non-redundant UniProt reference clusters UniRef50 as a reference protein database. Antibiotic resistance ontology (ARO) numbers in CARD were used to aggregate, annotate and associate the ARGs with the corresponding resistance family.
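The hit filter and the RPKM normalization described above can be illustrated with a minimal sketch. This is a generic RPKM calculation with hypothetical inputs, not ShortBRED's own code. The function names, the hit tuples and the family labels are assumptions made for illustration only.

```python
from collections import defaultdict


def is_positive_hit(percent_identity, aligned_length_aa,
                    min_identity=95.0, min_length_aa=30):
    """Apply the stated thresholds: at least 95% identity over at least 30 amino acids."""
    return percent_identity >= min_identity and aligned_length_aa >= min_length_aa


def rpkm(read_count, marker_length_bp, total_reads):
    """Generic RPKM: reads per kilobase of marker per million reads in the sample."""
    return read_count * 1e9 / (marker_length_bp * total_reads)


def family_abundance(hits, marker_info, total_reads):
    """Count positive hits per marker, convert to RPKM, and sum per resistance family.

    hits: iterable of (marker_id, percent_identity, aligned_length_aa)
    marker_info: dict of marker_id -> (family_label, marker_length_bp)
    """
    counts = defaultdict(int)
    for marker_id, identity, length_aa in hits:
        if is_positive_hit(identity, length_aa):
            counts[marker_id] += 1
    per_family = defaultdict(float)
    for marker_id, count in counts.items():
        family, length_bp = marker_info[marker_id]
        per_family[family] += rpkm(count, length_bp, total_reads)
    return dict(per_family)


if __name__ == "__main__":
    # Hypothetical markers and hits, for illustration only.
    markers = {"m1": ("family A", 300), "m2": ("family B", 150)}
    example_hits = [("m1", 98.0, 40), ("m1", 90.0, 45), ("m2", 99.0, 25), ("m2", 96.0, 33)]
    print(family_abundance(example_hits, markers, total_reads=1_000_000))
```

Normalizing by marker length and sequencing depth makes abundances comparable across markers of different sizes and across samples with different read counts.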