Large-scale biogeographical patterns of bacterial antibiotic resistome in the waterbodies of China

Antibiotic resistance genes (ARGs) are widespread in aquatic environments, but we know little about their biogeographical distribution and occurrence at national scales. Here we analyzed the patterns of ARGs from 42 natural waterbodies (natural lakes and reservoirs) across China using high-throughput approaches. The major ARGs were multidrug genes and the main resistance mechanism was the efflux pump. Although the absolute abundance of ARGs (gene copies/L) in the south/central waterbodies was similar to the northern waterbodies, the normalized abundance of ARGs (ARGs/16S rRNA gene copy number) was higher in the south/central waterbodies than in the north (mainly because of the aminoglycoside and multidrug resistance genes). Human activities strongly correlated with the normalized abundance of ARGs. The composition of ARGs in the waterbodies of south/central China was different from that in the north, and ARGs showed a distance-decay relationship. Anthropogenic factors had the most significant effects on this spatial distribution of ARG composition, followed by the spatial, bacterial and physicochemical factors. These indicate that the ARGs exhibited biogeographical patterns and that multiple ecological mechanisms such as environmental selection (human activities and local physicochemical parameters) and dispersal limitation influence distribution of ARGs in these waters. In general, our results provide a valuable ecological insight to explain the large-scale dispersal patterns in ARGs, thereby having potential applications for both public health and environmental management.


Introduction
Antibiotics have been widely used to treat bacterial infections since the 1940s and have revolutionized global medicine (Allen et al. 2010;Marti et al. 2014). However, the widespread occurrence of microbial resistance to antibiotics has become 'possibly the best example of human-driven evolution in action' at a global scale (Gillings and Paulsen 2014), with the first cases of antibiotic resistance typically being reported only a few years after the first use of a particular antibiotic (Pruden 2014). This has been considered as a serious public health problem by some scientists for several decades, and has led to the frighteningly plausible predictions of the end of effective antibiotics in medicine (Levin and Andreasen 1999). Clearly, the control of antibiotic resistance in medical settings, such as hospitals, is important, but it has become apparent that the wider environment has become a potentially major pool of, and source of, antibiotic resistance genes (ARGs) (Pehrsson et al. 2016). In particular, the extensive use of antibiotics by humans, and their subsequent release into aquatic ecosystems in treated and untreated sewage, hospital waste, aquaculture discharges, and agricultural runoff have made aquatic environments crucial for the occurrence, exchange, evolution and spread of ARGs (Allen et al. 2010;Pruden 2014;Rodriguez-Mozaz et al. 2015;Y.G. Zhu et al. 2017c). However, there is still a lack of fundamental data on the distributions and effects of ARGs in natural waters (Czekalski et al. 2015).
Previous biogeographical studies have largely used 16S rRNA gene (via either sequencing or fingerprinting) as a standard approach for determining microbial diversity and differences in composition between communities, but it can be difficult to know how variation in taxonomic composition relates to ecosystem function (Louca et al. 2016). More recently, there has been a growing consensus on the need for a better representation of functional traits in biogeography -an approach which is better developed in macro-organism studies than in microorganisms (Violle et al. 2014). One of the lines of evidence that microorganisms display spatial biogeographical patterns is the distancedecay relationship (i.e. A decline in similarity with increasing geographical distance) (Hanson et al. 2012). This relationship shows that different locations harbor microorganisms that differ in taxonomic and functional composition. In functional biogeography, some studies found no significant correlation between geographical distance and microbial functional differences, implying that the functional traits available to a specific community have no physical constraints on their dispersal (Angermeyer et al. 2016;Raes et al. 2011). However, other studies suggest microbial functional genes have a biogeographical provincialism, as these functional genes correlated to geographical distance (Haggerty and Dinsdale 2017;Kelly et al. 2014). To date, we still know little about the whether the ARGs show a distance-decay relationship, and how the four fundamental processes that drive the biogeography of microbial taxonomic communities -selection, drift, dispersal and mutation -affect the ARG biogeographical patterns (Hanson et al. 2012). Studies on the biogeography of ARGs can enhance our knowledge on the dispersal mechanism of ARG across ecosystems. It can also contribute to strategies for prevention and control of the emergence, spread, enrichment and exchange of ARGs in both natural waters and manmade ecosystems (Berendonk et al. 2015).
In this study, we used high-throughput quantitative PCR (HT-qPCR) to investigate the ARGs along a latitudinal gradient ranging from 24 to 50°N (over 2700 km) in 42 lakes and reservoirs across China. China has a greater area than all of the countries of Europe put together -because of the size of the country, this is effectively a continental scale survey of the distribution of ARGs in waterbodies. We aimed to determine the abundance and diversity patterns of ARGs at a large-scale in the Chinese lakes/reservoirs and explore how the ecological processes and mechanisms that drive these biogeographical patterns of ARGs.

Study area, sampling and physicochemical analysis
A total of 42 Chinese lakes and reservoirs were sampled during July and August 2012 ( Fig. 1 and Table S1). As the emphasis in this study was on maximizing the number of different sites across China, only one collection was taken at each lake/reservoir. In order to guarantee the representation of each sample, first, the sample site was located at the center of each lake/reservoir where the water was mixed well. Second, we collected several water samples every 50 cm in the epilimnion, and then mixed them very well. Only one mixed sample was taken at the center of each waterbody (natural lake or reservoir). These waterbodies spanned a south-north gradient of over 2700 km and ranged from approximately 24 to 50°N. The study covered several major climate types; comprising five regions based on their climate and geographical characteristics: FJ (included 5 reservoirs) -Fujian province, southeast China; CJ (9 lakes) -the lower and middle reaches of Changjiang River, China; ECC (6 lakes) -east central China, IM (13 lakes) -Inner Mongolia, north China; NEC (9 lakes) -northeast China. All of these lakes have been featured in our previous work on large scale patterns in microbial taxonomic community diversity (Ju et al., 2014;Liu et al. 2015). We further classified these five regions into two large regions, that is, south/central China (included FJ, CJ, and ECC) and north China (included IM and NEC) based on the human population, economy and climate. Water samples were collected using a polypropylene bottle. The bottle was sterilized in an autoclave and pre-rinse by lake/reservoir water. All water samples were stored in the dark at 4°С and filtered within two hours.
Water temperature (WT), electrical conductivity (EC), pH, dissolved oxygen (DO) and turbidity (Turb) of the epilimnion layer, water depth of sampling site, water transparency (Trans) and concentrations of chlorophyll a (Chl a), total nitrogen (TN) and total phosphorus (TP) were measured using the methods described in our previous study .

DNA extraction
Water samples of approximately 500 ml were collected for ARG analyses and filtered through a 0.22-μm pore size polycarbonate filter (47 mm diameter, Millipore, Billerica, MA, USA) as described previously . Total lake/reservoir microbial DNA was extracted directly from the filter using the FastDNA SPIN Kit and the FastPrep Instrument (MP Biomedicals, Santa Ana, CA, USA) according to the manufacturer's instructions. All DNA samples were checked for quality using the agarose gel electrophoresis and quantified using the NanoDrop ND-100 device (Thermo Fisher Scientific, Waltham, MA, USA).

High-throughput quantitative PCR of ARGs
We analyzed the ARGs in 42 lakes and reservoirs using highthroughput quantitative PCR (HT-qPCR). HT-qPCR of ARGs was performed using the SmartChip Real-time PCR (Wafergen Biosystems, Fremont, CA, USA) as described previously (Wang et al. 2014;Takara Bio USA, 2017). A total of 296 primer pairs were used to target one 16S rRNA gene, 285 ARGs (resistance to major classes of antibiotics), 8 transposase genes, one universal class I integron-integrase gene (intI), and one clinical class 1 integron-integrase gene (cintI) (Table S2). These 285 ARGs confer resistance to major classes of antibiotics (Table S2). All these primer sets were designed, used, and validated in the previous studies (Looft et al. 2012;Su et al., 2015;Zhu et al. 2013, YG Zhu et al., 2017c. Antibiotic resistance-gene reference sequences were harvested from (1) the Antibiotic Resistance Genes Online database (http://ardb. cbcb.umd.edu/); (2) a National Center for Biotechnology Information (NCBI) search for resistance-gene sequences; and (3) literature searches. All qPCRs were performed in triplicate, and a non-template negative control was included. The results of the HT-qPCR were analyzed with the standard settings: (1) any wells with multiple melting peaks and/or amplification efficiency < 1.8 or > 2.2 were discarded, and (2) a threshold cycle (Ct) 31 was used as the detection limit and only sample with three replicates simultaneously (Ct < 31) were regarded as positive. The relative copy number of ARGs, MGEs and 16S rRNA gene was calculated according to a previous study (Eq. 1) (Looft et al. 2012;Muurinen et al. 2017).
where Ct refers to high-throughput quantitative PCR results, 31 refers to the detection limit, and 10/3 refers to the 10-fold difference in gene copy numbers is 10/3 cycles, if the efficiency is close to 100%. To obtained the absolute abundance of ARGs, we quantified the absolute copy number of 16S rRNA gene (gene copies/L) using a Lightcycler 480 instrument (Roche, Basel, Switzerland) with a SYBR® Green approach as previously described (Ouyang et al. 2015). A significantly high correlation between the relative copy number of 16S rRNA gene from SmartChip Real Time System and the absolute copy number of 16S rRNA gene from Roche 480 was observed (r = 0.95, P < 0.01). Therefore, we transformed the relative copy number of ARGs to absolute copy number according to a previous study (Eq. 2) (Ouyang et al. 2015

Anthropogenic factors
The anthropogenic factors (population, GDP, aquatic production, meat production, municipal domestic sewage, number of patients diagnosed and treated, and number of residential patients) were collected or estimated according the main administrative region covered by each basin (Table S3). These values and the explanation of these anthropogenic data collected, or estimated, in this study can be found in the supplementary information (Tables S3, S4 and S5). Given the difference in area between the 42 lake/reservoir basins, all anthropogenic factors were normalized by the area of administrative regions (km 2 ) where were mainly covered by each basin.

Statistics
The Bray-Curtis similarity matrix of ARGs profile was constructed based on the normalized abundance of ARGs (ARGs/16S rRNA gene copy number). Non-metric multidimensional scaling (NMDS) ordination and analysis of similarities (ANOSIM) were then used to investigate differences in Bray-Curtis similarity of composition of ARGs between sites (Clarke and Gorley, 2015). Further, the similarity percentage (SIMPER) procedure was used to identify ARGs that contributed most to the dissimilarity between sites.
The Shannon-Wiener index was calculated according to Eq. 3.
where H is Shannon-Wiener index, n is the total number of ARGs (richness of ARGs) and P i is the proportion of copy number of ARG i in a sample.
We used variation partitioning to determine the relative contribution of anthropogenic, spatial, physicochemical and bacterial factors to the distribution of ARG composition based on the normalized abundance of ARGs. Before the variation partitioning, we used a forward selection procedure to select anthropogenic, spatial and physicochemical variables using the 'ordiR2step' function from vegan R package (Blanchet et al. 2008). All variables where P > 0.05 were eliminated in further analyses. Spatial factors were generated through the use of principal coordinates of neighbor matrices (PCNM) analysis based on the geodetic coordinates from latitude and longitude of each site (using the pcnm R package) (Borcard and Legendre 2002). Bacterial factors are the relative proportion (the number of sequences of a bacterial group/total number of sequences of all bacterial groups in a sample) of major groups of abundant and rare bacteria (Actinobacteria, Bacteroidetes, Cyanobacteria, Firmicutes, Planctomycetes, Proteobacteria, Verrucomicrobia and unclassified bacteria) . The abundant bacterial OTUs were defined as the OTUs with a representation of ≥1% within a sample and had a mean relative abundance of ≥0.1%. The rare bacterial OTUs were defined as having an abundance of < 0.01% within a sample and having a mean relative abundance of < 0.001% .
To visualize the correlations between bacterial taxa and ARGs, we constructed a network between bacterial OTUs and ARGs by calculating all possible pairwise Spearman's rank correlations based on the sequence number of bacterial OTUs and normalized abundance of ARGs. The bacterial OTUs were characterized by Illumina sequencing (MiSeq platform) of the 16S rRNA gene as described by Liu et al. (2015). All 16S rRNA gene sequence data in this paper have been deposited in the public NCBI Sequence Read Archive database under the accession number SRX525963. A correlation between two items was considered statistically robust if the Spearman's correlation P value was < 0.001. All P values were adjusted using the Benjamini and Hochberg false discovery rate (FDR) procedure (the multtest R package) to reduce the chances of obtaining false-positive results with FDR-adjusted P value < 0.05 (Benjamini and Hochberg 1995). To further reduce false-positives, we restricted our analysis to bacterial OTUs and ARGs present both in > 30% of the samples. The bacteria OTUs, which we were unable to classify into family level, were excluded in this analysis. All these analyses were performed in PRIMER 7.0 (Clarke and Gorley, 2015) and R 3.2.3 (R Core Team 2015).
In total 41 lakes/reservoirs were included in the variation partitioning and network analyses, because we failed to detect ARGs by high-throughput qPCR in Lake Chaohu, and failed to obtain bacterial taxonomic OTUs by high-throughput sequencing in Lake Poyang .

Spatial distribution of ARGs in waterbodies
The absolute abundance of ARGs (i.e. the copy number of ARGs per litre water) varied from 5.95 × 10 6 (Quansanhaizi Lake) to 1.85 × 10 9 (Shijiu Lake) with mean of 4.33 × 10 8 ± 7.19 × 10 7 standard error (s.e.) (n = 42). The absolute abundance of ARGs and mobile genetic elements (MGEs) (i.e. the copy number of MGEs per litre water) and the normalized abundance of MGEs (i.e. the MGEs/16S rRNA gene copy number) did not show significant difference between the southern/ central and northern waterbodies ( Fig. 2A and Fig. S1). However, the normalized abundance of ARGs was significantly higher in the south/ central than in the north China (Fig. 2B). Moreover, the genes conferring resistance to aminoglycoside were significantly higher in the southern/central than in the northern waterbodies, in both absolute and normalized abundance. The multidrug resistance genes were significantly higher in the south/central than in the north waterbodies in normalized abundance (Table S6).
The spatial pattern of ARG composition was not clearly separated in terms of the five regions (global R = 0.206, P = 0.004). Interestingly, most of the sites from north China (i.e. IM and NEC) were placed on the upper region of the ordination, while most of the sites from south and central China (i.e. ECC, CJ and FJ) were clustered in the lower region (global R = 0.268, P = 0.001) (Fig. 2E). The SIMPER analysis further indicated the mean dissimilarities between north and south/central China was 82.73% ± 2.34%, and the main contribution to these separations was the high abundance of the mexF gene (multidrug resistance gene, 42.93% ± 7.28%) and blaCTX-M-04 gene (β-lactamase, 7.43% ± 4.01%) in the south and central China, and fox5 gene (βlactamase, 12.32% ± 3.68%) in the north China.
About 76.0% ARGs (127) were shared between the south/central and north regions (Fig. 2E inset). However, 12 ARGs were unique in the south/central Chinese waterbodies (two aminoglycoside, four β-lactamase, two MLSB (macrolide-lincosamide-streptogramin B), two tetracycline and two multidrug resistance genes) and 28 ARGs were unique in the north Chinese waterbodies (one aminoglycoside, seven β-lactamase, six MLSB, one sulfonamide, three tetracycline, five vancomycin and four multidrug resistance genes, while one ARGs which confer resistance to triclosan). Importantly, the shared ARGs account for 98.71% of the total absolute abundance of ARGs (Table S7).
Both absolute and normalized abundances of ARGs showed a significant positive correlation with the number of locations at which they were found, indicating a strong abundance-dependent effect on the biogeographical distribution of ARGs at the continental scale -the ARGs with higher abundance were detected in more samples (Fig. S5).

Relationships between anthropogenic, spatial, environmental factors and ARGs
We consider that the human-induced changes of antibiotic selection pressure between the lakes and reservoirs are one of the possible factors that drive the normalized abundance of ARGs. Seven anthropogenic factorspopulation, GDP, municipal domestic sewage, number of patients diagnosed and treated, number of residential patients, meat production and aquatic production were significantly, strongly and positively correlated with the normalized abundance of ARGs. However, only three physicochemical factors (water depth, water temperature and total phosphorus) showed significant correlations with the normalized abundance of ARGs (Table 1).
We found ARGs showed a significant distance-decay relationship (Fig. 3). Moreover, variation partitioning showed that pure anthropogenic factors (14.93%) had the most significant effects on the spatial distribution of ARG composition, followed by the pure spatial (8.93%), pure bacterial (5.49%) and pure physicochemical (0.44%) factors (Fig. 4). However, 43.55% of spatial variation of ARG composition was unexplained.

Correlations between the abundance of ARGs and MGEs
The correlation between absolute abundance of ARGs and MGEs was significant (Table S8). However, the normalized abundance of ARGs and MGEs did not show a significant correlation (Table S9).
For the absolute abundance, the integrase gene intI-1 exhibited a high correlation with the gene classes conferring resistance to aminoglycoside, β-lactamase and MLSB. The transposase gene tnpA-04 showed a high correlation with aminoglycoside, FCA (fluoroquinolone, quinolone, florfenicol, chloramphenicol, and amphenicol), MLSB, sulfonamide, tetracycline, vancomycin and multidrug resistance genes. In addition, significant correlations were observed between the transposase genes tnpA-01, tnpA-02 and tnpA-05 and some ARG groups (Table  S10). For the normalized abundance, no significant correlation was observed between the integrase genes and ARG groups. However, the transposase genes tnpA-03, tnpA-04 tnpA-05 showed high correlation with some ARG groups (Table S11). The correlation of transposase gene with ARG groups suggests that some transposase genes may contribute significantly to the dissemination of specific ARG groups.

Co-occurrence patterns between bacterial taxa and ARGs
Our network analysis showed co-occurrence relationships between the bacterial families and ARGs. Forty-nine bacterial families were strongly correlated with eight types of ARGs. The multidrug resistance genes significantly correlated with 35 bacterial OTUs which were affiliated with 24 bacterial families. The aminoglycoside resistance genes significantly correlated with 29 bacterial OTUs affiliated with 19 bacterial families. The MLSB resistance genes significantly correlated with 17 bacterial OTUs affiliated with 13 bacterial families. The β-lactamase, sulfonamide, vancomycin, tetracycline and FCA resistance genes significantly correlated with some of bacterial families (Fig. S6). Chitinophagaceae bacterial OTUs had the most edges with the ARGs, flowed by the Planctomycetaceae and Sphingomonadaceae bacteria.

The biogeography of ARGs
This is the first national/continental scale study using highthroughput methods to investigate human-induced environment effects on the abundance and composition of ARGs in inland waterbodies. We found the normalized abundance and composition of ARGs exhibited biogeographical patterns in natural water ecosystems at a national scale -the ARGs normalized abundance was significantly higher in the south/ central than in the north Chinese waterbodies, and the ARGs showed a significant distance-decay relationship. As a functional trait, the ARGs have a biogeography at a large spatial scale suggesting that they are most likely subjected to some ecological processes and environmental filtering (Hanson et al. 2012).
First, our data suggest that environmental selection associated with human activities may act strongly both on the spatial distribution of ARG composition and normalized abundance (Table 1 and Fig. 4). Although many antibiotics and ARGs are a natural part of microbial ecology (Allen et al. 2010;Wiener 1996), and can be shown to have existed before human antibiotic use (D'Costa et al. 2011), human activities have been demonstrated to make a strong contribution to the occurrence, spread and enrichment of antibiotics and antibiotic resistance (Jones et al. 2008;Qiao et al. 2018). A diverse mixture of antibiotics with other pollutants and their metabolites can reach the natural rivers, lakes and reservoirs through the urban wastewater, aquaculture discharges, and agricultural runoff, altering the distribution and magnitude of ARGs in both natural and man-made aquatic environments (Marti et al. 2014;YG Zhu et al. 2017a;2017b). More importantly, it has been shown that the annual emissions of antibiotics in watersheds of south and central China was higher than in north China . Therefore, the increase of human activity and intensity from north to south/central China may be an important reason for the increase of emissions of antibiotics and thus may result in the increase of normalized abundance of ARGs in the south/central lakes and reservoirs. Second, environmental selection associating with local physicochemical factors has an influence on the spatial distribution of ARG composition. These physicochemical factors may have indirect effects on ARGs due to their effect on the host bacterial abundance and community composition . Finally, our distance-decay and variation partitioning analyses (pure spatial factors significantly correlated with the distribution of ARG composition) suggest dispersal limitation of ARGs across the waterbodies (Fig. 4). The most likely reason is the low abundance of ARGs in the natural waterbodies. For example, the mean of ARGs to 16S rRNA gene copy number ratio was about 1.5% in the waterbodies. The low abundance of functional genes (e.g. ARGs) may result in low dispersal chances of these functional genes though host species dispersal, horizontal gene transfer and gene flow across ecosystems Raes et al. 2011). Moreover, the Actinobacteria and Firmicutes phyla were closely related to the ARG composition in the waterbodies. This result agrees with our previous study of urban sewage sludge composting . However, this result also indicates that the rare bacteria may play more important roles in the spatial distribution of ARG composition compared to abundant bacteria.
It should be noted that some of our anthropogenic data refer to the corresponding values in the province where each waterbody is located. Therefore, these data may not fully and precisely reflect the characteristics of basin where each lake or reservoir is located. However, our anthropogenic data reflect the broad anthropogenic characteristics of the different geographical regions across China, and these data closely correlated with the normalized abundance of ARGs in these waters. In addition, it should be noted that the large variation (43.55%) in composition of ARGs cannot be explained by variation partitioning. We assume that this high proportion of unexplained variation of ARGs could be related to the antibiotic concentration, and other biotic and abiotic factors not measured in this study. This might also reflect the complex mechanisms that generate the distribution of ARG composition in these waterbodies.

The abundance and composition of ARGs and MGEs
The absolute abundance of ARGs ranged from 5.95 × 10 6 to 1.85 × 10 9 copies/L with mean of 4.33 × 10 8 ± 7.19 × 10 7 s.e. in the waterbodies. Previous studies using HT-qPCR investigated ARGs in urban areas of Wenruitang River (Zhejiang, China) , Jiulong River (Fujian, China) (Ouyang et al. 2015), East Tiaoxi River (Zhejiang, China)  and Qiantang River (Zhejiang, China) (Xu et al., 2016). These studies found the absolute abundance of ARGs in urban areas of rivers reached 10 9 -10 11 copies/L, which was one to three orders of magnitude higher than that in our waterbodies. However, most of our sampling sites are located within suburban areas and in rural locations surrounding major urban areas, these have lower human activity compared to urban areas. Ouyang et al. (2015) found the absolute abundance of ARGs in a suburban site of Jiulong River was about 10 8 copies/L, which is similar with the result in this study.
MGE marker genes intI-1 and tnpA-04 were detected at high concentrations in most of our samples, and they exhibited high correlations with ARG groups. The intI-1 significantly correlated with aminoglycoside, β-lactamase and MLSB resistance genes, while tnpA-04 significantly associated with aminoglycoside, FCA, MLSB, multidrug, sulfonamide, tetracycline and vancomycin resistance genes. These indicate the important roles of these two MGE genes in ARGs transfer in the waterbodies. Previous studies using the HT-qPCR array found tnpA-04 gene is the most enriched MGEs in treated water generated from drinking water treatment plants (DWTPs) (Xu et al., 2016) and in park soils irrigated with reclaimed water (Wang et al. 2014). While other studies also found the intI-1 was persistent with high abundance in estuarine sediments (YG Zhu et al. 2017c). The class 1 integrons found in the clinical context are able to capture different ARGs in the form of circular cassette to construct a tandem array of ARGs (YG Zhu et al. 2017c). The persistence of integrons (intI-1) and the close relationship between transposons (tnpA-04) and ARGs in the waterbodies could further emphasize the risk of ARG transfers from the natural water to human pathogens (Djordjevic et al. 2013;Domingues et al. 2012;Gillings et al. 2008).
It is well known that bacteria have many efflux pumps in their membranes. Some of these pumps are able to transport antibiotics out of the bacterial cell, allowing bacteria to survive at higher concentrations of antibiotics, and thus leading to resistance. Some efflux pumps can transport antibiotics of more than one chemical class, thereby being associated with multidrug (antibiotic) resistance (Piddock 2006). In this study, multidrug resistance genes (e.g. qacEΔ gene) were highly abundant in the waterbodies, perhaps due to the extensive use of different antibiotics within hospitals, communities and animal production. For example, the qacEΔ gene was abundant and detected in most of samples. This result is congruent with our previous finding in estuarine sediments (YG Zhu et al. 2017c). The qacEΔ1 is found specifically in class 1 integrons and specify gene for resistance to quaternary ammonium compounds (Alekshun and Levy 2007). Given the serious threat of multidrug resistance genes, this highlights the need for the scientific community and public authorities to collaborate to prevent and control the development and dissemination of ARGs in aquatic environments. Moreover, we detected 13 of the 26 tetracycline resistance genes targeted on our array. The most abundant tet genes in this study were tetG, tetB, tetR, tetX and tetA. Tetracyclines (TCs) are commonly used as veterinary medicines (Jechalke et al. 2014). In the present study, we found that the proportion of tetracycline resistance genes were apparently higher in Fujian province than in four other regions (Fig. S2). The tetracycline resistance genes in Hubian Reservoir (Fujian) were two orders of magnitude higher than that in other Fujian reservoirs. Hubian Reservoir is a backup water source for Xiamen City and completely surrounded by urban development. In the dry season, this reservoir water is supplied by Jiulong River when its water level falls below the normal value. Previous work has shown that tetracycline antibiotics were highest among the measured fourteen antibiotics (tetracyclines, sulfonamides and quinolones) in water samples along the Jiulong River (Zhang et al. 2012). Another study investigated five classes of antibiotic resistance genes along the Jiulong River and found the high concentration of tetA and tetG in water samples (Zhang et al. 2013). Therefore, the Hubian Reservoir may be influenced by the tetracycline residues and tetracycline-resistance genes from the Jiulong River water, which is a human-impact river (Zhang et al. 2013).

The co-occurrence relationship between ARGs and bacterial taxa
Previous studies used network analysis to investigate the co-occurrence relationship between ARGs and bacterial taxa and hypothesized that the co-occurrence patterns between ARGs and bacterial taxa could possibly indicate host information of AGRs Li et al. 2015;B Zhu et al. 2017). For example, Li et al. (2015) found both Escherichia and Bacteroides were the host of tetQ, and this result has been verified in other studies (Forslund et al. 2013;Shoemaker et al. 2001;Zhang et al. 2009). In the present study, we also found co-occurrence relationships between ARGs and 23 bacterial families in the waterbodies. This result is inconsistent with a previous study on soil resistomes (B Zhu et al. 2017). In that study, the authors used network analysis and found Chitinophagaceae was a possible host of a vancomycin (vanC-03), a tetracycline (tetR-02) and a MLSB (mphA-02) resistance genes. However, in our study, we found Chitinophagaceae were correlated with four aminoglycoside, three MLSB (ereA, erm(36), ermF), four multidrug and a FCA resistance genes. Nevertheless, the network analysis can provide information regarding the correlations of the ARGs with some bacterial taxa, but further study should verify bacterial hosts of the ARGs in these aquatic ecosystems.

Potential limitations
There are, however, potential limitations in our approach that merit further discussion. We only covered a limited number of ARG types using PCR-based approach although 285 ARGs were investigated. Therefore, some ARG types would likely be missed. Current next-generation sequencing and metagenomics approach are able to investigate ARGs across a broader spectrum and obtain a more complete picture of the ARG profiles. However, our high-capacity quantitative PCR arrays are more efficient than most of the previous studies which only targeted a few classes of ARGs .
Moreover, it should be noted that the results from the absolute abundance and normalized abundance of ARGs did not match exactly. There are two possible reasons for this phenomenon. First, the normalized abundance of total ARGs in the waterbodies was low (mean = 1.5%, s.e. = 0.2%, n = 42), and therefore resulted in the relatively decoupled relationship between bacterial taxa and ARGs. In contrast, we found the normalized abundance of total ARGs was high during the urban sewage sludge composting (mean = 32.7%, s.e. = 3.4%, n = 23), where the bacterial showed a tight correlation with ARGs . Second, the correlation between the 16S rRNA gene abundance from the SmartChip Real Time System and from the Roche Lightcycler 480 may have some effects on the consistency of results from the absolute abundance and from the normalized abundance of ARGs.
In this study, only one mixed sample was taken at each lake/reservoir. To make these samples as representative as possible, the sample site was located at the center of each waterbody. We then collected several water samples every 50 cm in the epilimnion, and mixed the water as one sample. There is an obvious trade-off between numbers of samples from a site and number of sites studied. This work prioritized number of sites (across China) -while this has many advantages it obviously have disadvantages when it comes to replicates from individual lakes. There is always a risk that a single collection differs quite a bit from the mean and the values for some lakes may not be representative. Therefore, the more intensive sample strategies should be considered in future study.
In addition, it is worth noting that our sampling was performed only during the summer months (i.e. July and August). There are likely seasonal differences in bacterial abundance, diversity as well as the genes. Likewise, our study sites are located over different climate zones and would likely be affected by the different hydrothermal conditions. For example, the cyanobacteria tend to bloom in summer months and may strongly affect the whole bacterial communities so influencing the ARGs (Eiler and Bertilsson 2004;Guo et al. 2018). Therefore, our results are only strictly applicable to the time-frame in which the samples were collected, and that there is a possibility that seasonal differences in ARG composition and abundance could occur.

Conclusions
Our results demonstrated that, at a national (effectively continental) scale, the ARGs had a distinct biogeographical pattern in the studied Chinese lakes and reservoirs -the composition of ARGs in the waterbodies of south/central China was separated from that in the north, and the ARGs showed a significant distance-decay relationship. Anthropogenic (the highest explanation factors), spatial and physicochemical factors exhibited significant effects on this spatial distribution of ARG composition. The south/central China had higher normalized abundance of ARGs (mainly because of aminoglycoside and multidrug resistance genes) than north (northwest and northeast) China in the waterbodies as the effect of human activities. In general, the selection process associated with human activities and mobile genetic elements were closely related to the spatial distribution of ARGs in the waters, while the local physicochemical parameters and dispersal limitation also played important roles in ARG distribution. Overall, these results indicate that the root-cause of accumulation/enrichment and spread/ dissemination of ARGs is intensive human activities, this threatens both public health and natural ecosystem health.

Conflict of interest
All authors have no conflict of interest to declare.

Author contributions
J.Y. and Y.G.Z. designed the research. J.Y. and L.L. collected the samples from lakes and reservoirs. L.L., J.Q.S. and Y.G. performed the laboratory experiments. L.L., J.Q.S., J.Y. and Y.G.Z. contributed the new reagents/analytic tools, and analyzed the data. All authors discussed the interpretation of the results and wrote the manuscript. J.Y. conceived the initial concept.