Metagenomic insights into the variation of bacterial communities and potential pathogenic bacteria in drinking water treatment and distribution systems

High-throughput sequencing of 16S rRNA gene amplicons was conducted to characterize the changing patterns of the bacterial community and potential pathogens in full-scale drinking water treatment and distribution systems. Results showed that Actinobacteria was the predominant phylum in source water, while Proteobacteria dominated after chlorine disinfection and its relative abundance increased from 40.88%±9.45% to 67.86%±27.10%. The genera Pseudarthrobacter, Arenimonas, and Limnohabitans were effectively removed by chlorination, while Phreatobacter, Undibacterium, Pseudomonas, and Sphingomonas within the Proteobacteria phylum were greatly enriched after chlorination. Metagenomic analyses revealed the occurrence of 56 species of potential pathogenic bacteria within 17 genera in drinking water, mainly including Pseudomonas fluorescens and five mycobacteria species, which were also persistent in tap water samples. The bacteria were found to be involved in various pathways, among which considerable groups were related to


INTRODUCTION
Microbiological safety is an essential requirement for drinking water worldwide, and potential pathogens in drinking water have become a global public health issue. Notably, a growing number of studies have demonstrated the occurrence and prevalence of potential pathogens in drinking water [1,2], suggesting increased human exposure to potential pathogens and a considerable health risk. More seriously, various potential pathogenic bacteria acquire antibiotic resistance genes and show antibiotic resistance. Antimicrobial resistance is common in both sterilized and unsterilized drinking water systems [3]. These potentially antibiotic-resistant pathogens make therapeutic treatment more difficult and compromise the therapeutic effect.
Sedimentation and filtration, which have been widely used in the purification of drinking water, cannot separate all microbes from source water, usually leading to the regrowth and proliferation of microbes in drinking water pipelines [4]. Disinfection is the most robust barrier for controlling the growth of potential pathogens and ensuring the microbiological safety of drinking water, and it has been proven effective in preventing potential pathogens from spreading to tap water [2]. Chlorine is usually added to drinking water in the final disinfection step in the waterworks, and a constant concentration of residual chlorine is also necessary in the drinking water distribution system [5][6][7]. The microbial composition in tap water is usually remarkably different from that in source water and sand-filtered water, because chlorination and pipeline transportation have a great impact on bacterial populations [2,8]. However, various bacteria and some potential pathogens possess adaptive features that serve as survival mechanisms in extreme drinking water conditions, such as chlorine resistance, slow growth, and biofilm formation [9], so some bacteria and potential pathogens are still found in tap water, including Acinetobacter, Burkholderia and Pseudomonas [10][11][12]. Tap water is a direct pathway for human exposure to potential pathogenic bacteria, usually via skin exposure or inhalation of aerosols [13], so the bacterial populations involved in human diseases and the potential pathogens in drinking water can pose health risks to humans, especially those with immunodeficiency [14]. Moreover, the functional profiles of the bacterial populations, especially those related to pathogenicity, will change following the structural changes in the bacterial community during drinking water treatment and distribution. Li et al. [15] predicted the function of the biofilm in a drinking water distribution system based on microbial structure analyses and indicated that an important proportion of the microbial communities was associated with human infectious diseases. Thus, it is urgent and necessary to gain comprehensive insight into the bacterial communities and potential pathogenic bacteria in drinking water treatment and distribution systems.
Culturable microbial indicators are often monitored to evaluate treatment efficiency and water quality. Culture-based methods are simple and cost-effective, and they are considered the gold standard that has been applied globally to monitor pathogenic bacteria [16]. However, no appropriate media are available for culturing all the target potential pathogens, and these methods cannot comprehensively assess the bacterial diversity and function in drinking water [17]. Recently, molecular detection methods such as polymerase chain reaction (PCR)-based methods have been used to explore the occurrence and abundance of certain potential pathogenic bacteria in drinking water [18,19]. However, PCR-based methods usually detect only known and representative bacterial pathogens because of the limited number of validated primers, which makes it difficult to comprehensively explore the microorganisms in the environment. Notably, technologies based on high-throughput DNA sequencing have proved to be promising and powerful tools for examining complex microbial communities and the related functional information [20][21][22]. Among them, 16S rRNA gene amplicon sequencing is a feasible, quick, and cost-efficient approach to investigate the bacterial community comprehensively and to directly detect various potential pathogenic bacteria in different environmental media, such as soil [23], surface water [24], activated sludge [25], and drinking water [26].
In this study, 16S rRNA gene amplicon sequencing was used to provide metagenomic insights into the bacterial community structure and potential pathogens in full-scale drinking water treatment and distribution systems, and to reveal the main roles of water chlorination and distribution in shaping the bacterial community and potential pathogenic bacteria. The study also sought to examine the bacterial functional profiles along the treatment and distribution systems. This study comprehensively characterized the changing patterns of the bacterial community and potential pathogenic bacteria from source water to tap water, focusing especially on the impact of chlorination disinfection. The results may provide a valuable reference for future research on how to effectively control the dissemination of potential pathogens in drinking water and reduce human health risks.

Water sample collection and DNA extraction
Four groups of water samples were collected in a waterworks (Nanjing City, Jiangsu Province, China) with a supply quantity of 1,200,000 t/d, including 2 L of source water from water pumping station I (SW), 4 L of effluent water from the sedimentation tank (ES), 25 L of filtered water from the filtration tank (FW), and 1000 L of chlorine-disinfected water from the clear water tank (DW) (Supplementary Figure S1). Three tap water sampling points (TWA, TWB and TWC) were located along the transportation pipeline, 10, 30 and 40 kilometers away from the waterworks, respectively (Supplementary Figure S1). To minimize experimental error, samples were collected in triplicate from each sampling site (a total of 21 samples) at the same time in June 2017. For pretreatment, 0.22-μm micropore membranes were used to filter the SW, ES, and FW samples, while about 1000 L of DW, TWA, TWB, and TWC water was filtered within 24 h using water purifiers to trap bacterial cells, because of the low biomass in these samples. The hollow fiber membranes inside the purifiers were subsequently removed and mixed with 30 mL of purified water. The bacterial cells were separated from the membranes by ultrasonic oscillation. The mixture was then centrifuged at 7500 r min⁻¹ for 15 min. Total genomic DNA was extracted from the resulting pellets with the FastDNA Soil Kit (MP Biomedicals, CA, USA), and DNA concentration and purity were measured by microspectrophotometry (NanoDrop ND-2000, NanoDrop Technologies, Wilmington, DE, USA).

Bioinformatics analysis
Mothur software was used to further process the raw sequences, including quality control, denoising, and chimera removal, and 885,690 clean reads were generated in total. To avoid biases between samples during comparison, the sequencing reads were normalized to 29,523 per sample before picking operational taxonomic units (OTUs), calculating richness and diversity indices, and annotating the sequences. OTUs were defined at a similarity of 97% using USEARCH implemented in Quantitative Insights into Microbial Ecology (QIIME). For taxonomic classification, the sequences were aligned by the Ribosomal Database Project (RDP) classifier against the Silva SSU database (Release 123) with a minimum confidence threshold of 80% [28]. Richness and diversity indices such as the number of OTUs, Chao 1 index, and Shannon index were calculated using Mothur software.
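The normalization and diversity calculations described above can be sketched in a few lines of Python. This is a minimal illustration rather than the Mothur implementation: it assumes random subsampling without replacement for normalization, the bias-corrected form of the Chao 1 estimator, and a Shannon index computed with the natural logarithm. The function names and the toy counts are hypothetical.

```python
import math
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample OTU read counts without replacement to a fixed depth."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    return Counter(rng.sample(pool, depth))


def chao1(counts):
    """Bias-corrected Chao 1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    values = [n for n in counts.values() if n > 0]
    s_obs = len(values)                      # observed OTUs
    f1 = sum(1 for n in values if n == 1)    # singletons
    f2 = sum(1 for n in values if n == 2)    # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over observed OTUs."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total)
                for n in counts.values() if n > 0)


# Toy OTU table for one sample (hypothetical values, not study data).
toy_sample = {"otu1": 4, "otu2": 2, "otu3": 1, "otu4": 1}
normalized = rarefy(toy_sample, depth=5)
```

In the study itself the depth would be 29,523 reads per sample; the toy depth of 5 only keeps the example small.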

Detection of potential pathogenic bacteria
The species and strains of human pathogenic bacteria (HPB) were compiled from the HPB virulence factor database (http://www.mgc.ac.cn/VFs/) [29] and other reference databases [30,31]. The HPB 16S rRNA gene sequence database was created by extracting the sequences from NCBI GenBank (http://www.ncbi.nlm.nih.gov/), and it included 1525 16S rRNA gene sequences assigned to 527 HPB species. The BLASTN program was used to search the target sequences from each sample against the offline HPB 16S rRNA gene sequence database [32]. The BLAST outputs were further screened using strict criteria of identity ≥99%, alignment length ≥150 bp, and mismatch ≤1 bp [23]. The percentage of screened sequences among the total sequencing reads (29,523 per sample in this study) was taken as the relative abundance of potential pathogenic bacteria.
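The screening criteria above translate directly into a filter over BLAST hits. The sketch below assumes the hits were written in BLAST tabular format (outfmt 6, whose first five columns are query ID, subject ID, percent identity, alignment length, and mismatches); the text does not state which output format was used. It also assumes that each read is counted once even if it matches several reference sequences. The function names are hypothetical.

```python
def passes_screen(pident, length, mismatch):
    """Criteria from the text: identity >= 99%, length >= 150 bp, mismatch at most 1."""
    return pident >= 99.0 and length >= 150 and 1 >= mismatch


def screened_reads(blast_lines):
    """Return the IDs of reads with at least one hit passing the screen."""
    hits = set()
    for line in blast_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 5:  # skip blank or malformed lines
            pident = float(fields[2])
            length = int(fields[3])
            mismatch = int(fields[4])
            if passes_screen(pident, length, mismatch):
                hits.add(fields[0])
    return hits


def relative_abundance(blast_lines, total_reads=29523):
    """Percentage of screened reads among the normalized reads of a sample."""
    return 100.0 * len(screened_reads(blast_lines)) / total_reads
```

Collecting read IDs in a set avoids double counting when one read aligns equally well to several closely related reference sequences.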

Functional composition analysis
In this study, PICRUSt based on the KEGG database was used for functional prediction of the bacterial communities [33]. First, Mothur software was used to convert the 16S rRNA OTU shared file (with identity of over 97%) into BIOM format [34]. The resulting OTU table was then uploaded to the online PICRUSt server on Galaxy (http://huttenhower.sph.harvard.edu/galaxy), and the functions of the OTUs were predicted following the developer-recommended instructions (http://picrust.github.io/picrust). The default three levels of the KEGG pathway hierarchy were used, and the hierarchical data were collapsed to a specified level for the PICRUSt predictions.
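Collapsing hierarchical predictions to a chosen level amounts to summing the level 3 abundances that share a parent category. The following sketch illustrates that idea only; it is not the PICRUSt code. The three-entry hierarchy and the abundance values are hypothetical stand-ins for the full KEGG mapping, and `collapse_to_level` is an illustrative name.

```python
from collections import defaultdict

# Hypothetical excerpt of the KEGG hierarchy: level 3 -> (level 1, level 2).
HIERARCHY = {
    "Tuberculosis": ("Human Diseases", "Infectious Diseases"),
    "Alzheimer's disease": ("Human Diseases", "Neurodegenerative Diseases"),
    "ABC transporters": ("Environmental Information Processing",
                         "Membrane Transport"),
}


def collapse_to_level(level3_abundance, hierarchy, level):
    """Sum level 3 abundances into their level 1 or level 2 parents."""
    if level not in (1, 2):
        raise ValueError("level must be 1 or 2")
    collapsed = defaultdict(float)
    for pathway, value in level3_abundance.items():
        parent = hierarchy[pathway][level - 1]
        collapsed[parent] += value
    return dict(collapsed)


# Toy predicted abundances for one sample (not study data).
toy_level3 = {"Tuberculosis": 2.0, "Alzheimer's disease": 1.0,
              "ABC transporters": 7.0}
by_level1 = collapse_to_level(toy_level3, HIERARCHY, 1)
```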

Statistical analyses
The differences in microbial community structure in drinking water before and after chlorination were assessed with linear discriminant analysis effect size (LEfSe) online (http://huttenhower.sph.harvard.edu/galaxy), which highlights both biological relevance and statistical significance [35]. To explore the differences in functional community structure among the samples, principal coordinates analysis (PCoA) was conducted using PAST software ver 3.15 (http://folk.uio.no/ohammer/past/). A Venn diagram reflecting the number and proportion of shared potential pathogenic bacteria among all the tap water samples was generated using Venny v2.1.0 [36]. The biodiversity indices and relative abundances of the bacterial community were obtained by averaging the triplicate samples and are expressed as the mean±standard deviation. The variation among different drinking water samples was analyzed in SPSS software using one-way analysis of variance (ANOVA), and P<0.05 was considered statistically significant.
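The summary statistics described above can be illustrated with the Python standard library. The sketch assumes the sample standard deviation (n - 1 denominator) for the mean±standard deviation and computes only the one-way ANOVA F statistic with its degrees of freedom; obtaining the P value from the F distribution, which SPSS reports directly, is left out. The function names and the triplicate values in the usage note are hypothetical.

```python
from statistics import mean, stdev


def mean_sd(values):
    """Mean and sample standard deviation of replicate measurements."""
    return mean(values), stdev(values)


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

For example, `one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])` compares three made-up triplicate groups; the resulting F value would then be checked against the F distribution with 2 and 6 degrees of freedom.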

Bacterial community shift during drinking water treatment and distribution
The biodiversity of the drinking water samples during the treatment and distribution processes was analyzed by calculating the number of OTUs, Chao 1 index, and Shannon index at a 3% cutoff. In terms of the number of OTUs, SW had the highest richness (4179±1189 OTUs), followed by FW (3257±891 OTUs) and ES (3062±418 OTUs), while DW and the TWs displayed considerably lower richness (from 568±19 to 1087±313 OTUs) (Supplementary Figure S2). Notably, the number of OTUs was significantly reduced after chlorination (P<0.05). This result was also supported by the analyses of the Chao 1 index and Shannon index, which revealed that biodiversity slightly decreased after sedimentation and sand filtration, then significantly decreased after chlorination (P<0.05), and finally remained relatively stable with only small fluctuations during transportation (Supplementary Figures S3 and S4). The biodiversity shift revealed the important role of drinking water treatment and distribution (especially chlorination) in the variation of species richness.
At the genus level, HgcI_clade and CL500-29_marine_group had the highest relative abundance in the prechlorination samples, but only a small fraction remained after chlorination (Figure 1B). Moreover, other predominant genera, including Pseudarthrobacter, Arenimonas and Limnohabitans, also clearly decreased after chlorination. In contrast, the relative abundance of several bacterial genera within the Proteobacteria phylum, such as Sphingomonas, Phreatobacter, Undibacterium, and Pseudomonas, evidently increased after chlorination (Figure 1B).

Persistent and discriminative bacterial genera in drinking water
In this study, 17 bacterial genera occurred in all the drinking water samples with a relative abundance of over 0.001% (Figure 1B). We defined them as the persistent bacterial genera since they were not eliminated by drinking water treatment and remained in the tap water. In general, the persistent bacterial genera accounted for 61.09%±0.04% of the total relative abundance of the detectable bacterial genera in FW, and for 99.93%±0.00%, 97.12%±0.05%, 94.25%±0.10% and 95.80%±0.06% of the total relative abundance in DW, TWA, TWB, and TWC, respectively (Figure 1B). It is worth noting that the total relative abundance of the persistent genera clearly increased after chlorination. Moreover, LEfSe analysis revealed significant differences between the samples before and after chlorination at several taxonomic levels. In total, 40 bacterial genera showed logarithmic LDA scores higher than 2.0, indicating statistically significant differences after chlorination. In the samples before chlorination, most genera within the phyla Bacteroidetes, Actinobacteria and Acidobacteria were abundant. Several genera belonging to the phylum Planctomycetes were prevalent in the samples after chlorination. The genera Phreatobacter and Sphingomonas also dominated in the samples after chlorination (Supplementary Figure S5).
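The definition of persistent genera used here (detected in every sample at a relative abundance above 0.001%) can be expressed as a set intersection. The sketch below is illustrative only; the sample dictionary layout and the function names are assumptions, and any abundance values passed in would be the per-sample relative abundances in percent.

```python
def persistent_genera(abundance_by_sample, threshold=0.001):
    """Genera whose relative abundance (%) exceeds the threshold in every sample."""
    detected = [
        {genus for genus, pct in genera.items() if pct > threshold}
        for genera in abundance_by_sample.values()
    ]
    return set.intersection(*detected) if detected else set()


def persistent_share(genera, persistent):
    """Share (%) of a sample's total detectable abundance held by persistent genera."""
    total = sum(genera.values())
    kept = sum(pct for genus, pct in genera.items() if genus in persistent)
    return 100.0 * kept / total
```

Applying `persistent_share` to each sample in turn gives the kind of per-sample percentages reported in the paragraph above.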

Potential pathogenic bacteria in drinking water treatment and distribution systems
The variation of potential pathogenic bacteria in drinking water treatment and distribution systems was observed at both the genus (Figure 3A) and species (Figure 3B) levels. Alignment against the human pathogenic bacteria database revealed that 17 genera of potential pathogenic bacteria, including Pseudomonas, Mycobacterium, Acinetobacter and Aeromonas, were detected across the water samples (Figure 3A). Stenotrophomonas, Staphylococcus, Morganella, Escherichia, Citrobacter and Enterococcus were found only in SW and disappeared completely after sedimentation (Figure 3A). Acinetobacter (from 0.010%±0.018% to 0.002%±0.004%), Aeromonas (from 0.011%±0.004% to 0.001%±0.002%) and Mycobacterium (from 0.252%±0.436% to 0.014%±0.009%) appeared more sensitive to chlorine, since their relative abundance clearly decreased after chlorination (Figure 3A). In contrast, the relative abundance of Rhodococcus, Pseudomonas, and Bacillus greatly increased after chlorination (Figure 3A). It is worth noting that Mycobacterium could regrow once chlorine stress decreased in the water distribution pipeline, while Acinetobacter and Aeromonas could not (Figure 3A). Furthermore, Mycobacterium, Pseudomonas, Rhodococcus and Bacillus were also detected in TW samples after pipeline distribution, but Rhodococcus and Bacillus disappeared in TWC (Figure 3A).
In total, 56 species of potential pathogenic bacteria were detected across the water samples, among which 17 had a relative abundance of over 0.01% (Figure 3B). The potential pathogen Aeromonas veronii dominated in SW, followed by Rhodococcus erythropolis, Pseudomonas fluorescens and Aeromonas hydrophila (Figure 3B). The relative abundance of these bacteria decreased after sedimentation and filtration (Figure 3B). It is also noteworthy that chlorination increased the relative abundance of some species, including Pseudomonas fluorescens, Rhodococcus erythropolis, Rhodococcus fascians, and Bacillus thuringiensis, among which Pseudomonas fluorescens increased most significantly (P<0.05) (Figure 3B). The most abundant species in TWA, TWB and TWC were Pseudomonas fluorescens, Mycobacterium mucogenicum and Mycobacterium smegmatis, respectively.
Furthermore, a total of 26 species of potential pathogenic bacteria were detected across TWA, TWB and TWC. The number of species detected in TWA, TWB and TWC was 13, 19 and 12, respectively (Figure 4). Six potential pathogenic bacteria, namely Pseudomonas fluorescens, Mycobacterium gordonae, Mycobacterium mucogenicum, Mycobacterium smegmatis, Mycobacterium fortuitum and Mycobacterium chelonae, were shared by all the tap water samples. We defined them as persistent potential pathogenic bacteria; they are closely related to human health and need to be taken seriously. Among the persistent ones, Pseudomonas fluorescens, Mycobacterium mucogenicum and Mycobacterium smegmatis were the most abundant species in TWA, TWB and TWC, respectively (Figure 3B).
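The counts behind a three-set Venn diagram such as Figure 4 can be reproduced by recording, for every species, which tap water samples contain it. The sketch below is a generic illustration and does not reproduce the Venny tool; the function name is hypothetical, and the per-sample species lists from the study would have to be supplied by the reader.

```python
from collections import Counter


def venn_regions(sets_by_sample):
    """Count species per Venn region, keyed by the tuple of samples sharing them."""
    names = sorted(sets_by_sample)
    regions = Counter()
    for species in set().union(*sets_by_sample.values()):
        members = tuple(n for n in names if species in sets_by_sample[n])
        regions[members] += 1
    return dict(regions)
```

The region keyed by all three sample names corresponds to the shared (persistent) species, and the region counts sum to the total number of distinct species across the samples.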

Functional profile of bacteria in drinking water
Functional contributions of bacteria in the treatment and distribution systems were explored using PICRUSt on Galaxy based on the OTU profiles. PICRUSt predicted 328 groups at level 3 KEGG Orthology in total, belonging to 41 groups at level 2 and 7 groups at level 1. The overall variation in the functional profiles of the bacterial communities before and after chlorination was explored by PCoA (Figure 5), which showed no obvious differences among the samples before chlorination. However, great functional changes took place after chlorination and continued during transportation. In general, transporters, ABC transporters, DNA repair and recombination proteins, purine metabolism, secretion system, bacterial motility proteins, ribosome, oxidative phosphorylation, and peptidases were the most abundant functional classes at level 3 KEGG Orthology. However, we mainly focused on the functional classes involved in human diseases because of their importance to human health (Supplementary Figure S6). Notably, the bacterial communities in drinking water were mainly associated with Alzheimer's disease, Huntington's disease, tuberculosis, Parkinson's disease, and even pathways in cancer (Supplementary Figure S6). Moreover, the PICRUSt results also revealed that the chlorinated water samples, especially those after distribution, harbored the highest abundance of human disease functional classes (Supplementary Figure S6).

DISCUSSION
In this study, metagenomic analyses were conducted to determine how the treatment and distribution processes influence the diversity and relative abundance of the bacterial community in urban drinking water systems. Generally, this study showed that the diversity and abundance of the bacterial populations changed along the treatment and distribution processes, which is consistent with previous studies demonstrating that bacterial diversity may be influenced by chlorination disinfection and pipeline distribution [37,38]. The Chao 1 index, Shannon index and number of OTUs also revealed variations in bacterial diversity during the treatment and transportation processes. Bacterial diversity was clearly reduced after chlorination and fluctuated only slightly during transportation, a finding also reported in a previous study [39]. Notably, chlorination disinfection had the most considerable influence on the microbial community structure of drinking water. Similarly, previous studies have also indicated that Proteobacteria is the most abundant phylum in water treatment and distribution systems [8,19,40,41]. At the genus level, Phreatobacter was found to be the most abundant genus in the water samples after chlorination, which agrees with Li et al. [15], who reported that Phreatobacter spp. dominated in the samples after chlorination. These results indicated that chlorine was the major factor affecting the composition and diversity of the drinking water bacterial community. Differences in the sensitivity of bacterial species to chlorine may be the main contributor to the bacterial community shift [42]. Many bacteria within the phylum Actinobacteria, such as Pseudarthrobacter, were sensitive to chlorine, while Phreatobacter in the phylum Proteobacteria was not.
The provision of safe drinking water is necessary for public health. In past years, opportunistic pathogens have caused numerous outbreaks of waterborne diseases in developed countries [9]. This study revealed that a total of 17 genera of potential pathogenic bacteria, including Pseudomonas, Mycobacterium, Acinetobacter, and Aeromonas, were detected in the water samples, among which Pseudomonas was the most abundant. Similarly, previous studies also reported that Pseudomonas had the highest abundance in filtered water in a drinking water treatment plant [4] and that the genus also dominated in post-chlorination water samples [39]. Acinetobacter, Aeromonas, and Mycobacterium appeared to be more sensitive to chlorine, since their relative abundances clearly decreased after chlorination. It is also noteworthy that the relative abundance of Pseudomonas, Rhodococcus, and Bacillus increased greatly after chlorination, indicating that chlorination cannot effectively remove these three genera of potential pathogenic bacteria from drinking water. Likewise, Huang et al. [4] showed that chlorination was not effective in removing Pseudomonas from filtered water. Previous studies also demonstrated that the species Pseudomonas aeruginosa was frequently detected in post-chlorination drinking water [39,43]. However, in this study, Pseudomonas fluorescens was the most abundant bacterium in the post-chlorination samples and was one of the persistent potential pathogenic bacteria in tap water samples. A possible reason is that the two species belong to the same genus and show similar resistance to chlorine.
As for the persistent potential pathogenic bacteria in tap water samples, several species of mycobacteria were detected, which is similar to the result of a previous study investigating opportunistic pathogens in secondary water supply systems [44]. However, Mycobacterium avium, a species frequently detected in drinking water in the United States, was not detected in the water samples in this study [45][46][47][48]. This disparity may be related to geographical isolation and unique local epidemiological characteristics [44]. Moreover, it is interesting that the mycobacteria were not detected in source water but were frequently detected in all the tap water samples, which is similar to a previous study in which mycobacteria were commonly detected in treated water [49]. The main reasons may include their growth during treatment and distribution, the much smaller sampling volume of source water than of treated water due to clogging of membrane filters, and the chance collection of samples containing no specific mycobacteria due to the heterogeneous distribution of bacterial cells in source water [50]. Mycobacterium can cause wound, skin and soft tissue infections, as well as lung disease, especially among children, the elderly, and those with weakened immune systems [51,52]. Thus, it is necessary to further scrutinize mycobacterial species, especially persistent pathogenic species such as Mycobacterium chelonae [46], so as to clarify the exposure risk of the associated mycobacteria in local drinking water systems.
Furthermore, functional profiling of the drinking water microbiome using PICRUSt revealed that the bacterial community was involved in various pathways. Although most of the bacteria were associated with metabolism and environmental information processing, considerable groups were related to human diseases, including infectious diseases, metabolic diseases, neurodegenerative diseases, immune system diseases, and even cancers, among which the group of infectious diseases was predominant. The PICRUSt-predicted functional profiles showed relatively little functional variation among the samples before chlorination, suggesting that sedimentation and filtration had no great effect on bacterial community function, which may be related to the inconspicuous change in microbial composition and the functional redundancy of drinking water microbial communities [53]. However, the functional profiles changed significantly after chlorination disinfection because of the variation in bacterial community structure. Although culture-independent methods such as 16S rRNA gene sequencing cannot determine whether the genes associated with human diseases were carried by viable bacteria, there was still a potential risk of infection because of the uptake of free DNA in the drinking water supply piping environment and horizontal gene transfer among bacterial populations [54].

CONCLUSIONS
In conclusion, this study comprehensively revealed a wide and diverse bacterial population and potential pathogenic bacteria in the water samples from urban drinking water treatment and distribution systems, indicating that chlorination was the main factor affecting the bacterial community and potential pathogenic bacteria in drinking water. Chlorination clearly reduced the bacterial diversity while evidently increasing the total relative abundance of the persistent genera. In total, 56 species within 17 genera of potential pathogenic bacteria were found in the drinking water, among which Pseudomonas fluorescens and five mycobacteria species were the persistent potential pathogenic bacteria in tap water samples. It is worth noting that a considerable number of the functional profiles of the bacterial community predicted by PICRUSt were related to human diseases, including tuberculosis, Parkinson's disease, and even cancer. Therefore, it is necessary to seek effective methods to control potential pathogenic bacteria, especially Pseudomonas and Mycobacterium, in drinking water. This study may provide a reference for future research to reduce the health risks caused by pathogenic bacteria in drinking water.

Figure 2 PCoA plot comparing the bacterial communities in source water (SW), effluent of sedimentation tank (ES), filtered water (FW), chlorine-disinfected water (DW), tap water A (TWA), tap water B (TWB) and tap water C (TWC) at genus level.

Figure 4 Venn diagram showing the shared and unique potential pathogenic bacteria in tap water A (TWA), tap water B (TWB) and tap water C (TWC).