Metagenomic profiling dataset of bacterial communities of a drinking water supply system (DWSS) in the arid Namaqualand region, South Africa: Source (lower Orange River) to point-of-use (O'Kiep)

The metagenomic data presented herein contain the bacterial community profile of a drinking water supply system (DWSS) supplying O'Kiep, Namaqualand, South Africa. Representative samples from the source (Orange River) to the point of use (O'Kiep), collected along a 150 km DWSS used for drinking water distribution, were analysed for bacterial content. Following DNA extraction, the V1–V3 regions of the 16S rRNA gene were PCR amplified using oligonucleotide primers 27F and 518R. The PCR amplicons were processed using Illumina® reaction kits as per the manufacturer's guidelines and sequenced on the Illumina® MiSeq-2000 using a MiSeq V3 kit. The data obtained were processed with the QIIME bioinformatics software using a compatible FASTA nucleic acid (fna) file. The raw sequences were deposited at the National Center for Biotechnology Information (NCBI) and in its Sequence Read Archive (SRA) database, obtaining accession numbers for each species identified.

© 2019 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Data
The presented data contain the microbial composition of a drinking water supply system (DWSS) for O'Kiep, Namaqualand, South Africa. Table 1 presents the bacterial composition at the source point on the lower Orange River, while Table 2 shows the microbial composition of the treated water distributed by a state-owned agency responsible for water management activities in the region. Table 3 presents the microbial composition of a local municipal reservoir at O'Kiep, which stores the treated water from the water agency before it is distributed to individual households in O'Kiep. Tables 4–10 present the microbial composition at the point of use, i.e. household taps.

Specification
The research article that is associated with this article is still under construction.

Value of the data
This data demonstrates the extent of bacterial contamination of a drinking water supply system in an arid region of Namaqualand, South Africa.
This data can be used to determine the role of the detected bacteria in the clinical abnormalities experienced by the O'Kiep community.
This data can also be used to develop mitigation techniques that will ensure that the drinking water is free of microbial contamination and suitable for drinking purposes.

Sample collection
The DWSS samples were obtained from a 100 km long pipe system designed to deliver a flow of 18 ML/day. Freshwater is sourced from the lower Orange River by a regional water supply system and delivered to nearby towns, including O'Kiep, which is located in the Northern Cape, Namaqualand region of South Africa [29°35′45″ S, 17°52′51″ E]. DWSS samples (n = 9) were collected in April 2017 from the source to the point of use, i.e. at numerous household taps, in non-transparent 500 mL sterile polyethylene bottles, which were immediately placed on ice prior to transportation to the laboratory. A composite sample (n = 1) was first collected from the lower Orange River (Table 1). The second sample (n = 1) consisted of treated water prior to distribution, taken at the local water supply agency reservoir (Table 2). A similar composite sample (n = 1) was collected from the local municipal reservoir (Table 3), and further samples (n = 6) were randomly collected from household taps (Tables 4–10). All samples were handled according to the guidelines used for drinking water quality standard quantification [2,3].

DNA extraction and sequencing
The samples were filtered through a 0.22-µm micropore cellulose membrane (Merck Millipore, USA) and the membrane was pre-washed with a sterile saline solution, followed by the isolation of [...]. The purified DNA was PCR amplified using the 16S rRNA bacterial forward primer 27F-16S (5′-AGAGTTTGATCMTGGCTCAG-3′) and reverse primer 518R-16S (5′-ATTACCGCGGCTGCTGG-3′) [4], which target the V1–V3 regions of the 16S rRNA gene. The PCR amplicons were sent for sequencing to Inqaba Biotechnical Industries (Pretoria, South Africa), a commercial NGS service provider. Briefly, the PCR amplicons were gel purified and end repaired, and Illumina®-specific adapter sequences were ligated to each amplicon. Following quantification, the samples were individually indexed and then purified. Amplicons were then sequenced on the Illumina® MiSeq-2000 using a MiSeq V3 (600 cycle) kit. Generally, 20 Mb of data (2 × 300 bp paired-end reads) [5] were produced for each sample. The Basic Local Alignment Search Tool (BLAST)-based data analysis was performed using a data analysis system developed in-house by Inqaba Biotech (Pretoria, South Africa). Overall, sequences were deposited in two databases, i.e. the National Center for Biotechnology Information (NCBI) and the Sequence Read Archive (SRA) database, prior to the generation of accession numbers for individual bacterial species.
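For readers who wish to locate the primer binding sites in reference 16S rRNA gene sequences, the following is a minimal Python sketch and not part of the workflow used to generate this dataset. It assumes exact primer matching with no mismatches allowed. The primer sequences are taken from the text above, and the degenerate base M in 27F is expanded to A or C according to the standard IUPAC nucleotide code. The function names (primer_to_regex, reverse_complement, find_amplicon) are hypothetical and introduced only for illustration.

```python
import re

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as reported in the text (5' to 3').
PRIMER_27F = "AGAGTTTGATCMTGGCTCAG"
PRIMER_518R = "ATTACCGCGGCTGCTGG"


def primer_to_regex(primer):
    """Compile a primer into a regex, expanding degenerate IUPAC bases."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return re.compile("".join(parts))


def reverse_complement(seq):
    """Reverse complement of a sequence containing only A, C, G and T."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_amplicon(template, forward=PRIMER_27F, reverse=PRIMER_518R):
    """Return the region from the forward primer site through the reverse
    primer site on the given strand, or None if either site is absent.

    Assumption: exact matches only, and a non-degenerate reverse primer.
    """
    template = template.upper()
    fwd = primer_to_regex(forward).search(template)
    if fwd is None:
        return None
    # The reverse primer anneals to the opposite strand, so on the given
    # strand its site appears as the reverse complement of the primer.
    rev_site = reverse_complement(reverse)
    end = template.find(rev_site, fwd.end())
    if end == -1:
        return None
    return template[fwd.start():end + len(rev_site)]
```

Calling find_amplicon on a full-length 16S rRNA gene sequence would return the V1–V3 stretch bounded by the two primer sites. Real templates often carry mismatches within primer sites, so a tolerant aligner would be more appropriate for production use.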