Spatio-temporal dynamics of bacterial communities in the shoreline of Laurentian great Lake Erie and Lake St. Clair’s large freshwater ecosystems

Long-term trends in freshwater bacterial community composition (BCC) and dynamics are not yet well characterized, particularly in large lake ecosystems. We addressed this gap by temporally (15 months) and spatially (6 sampling locations) characterizing BCC variation in lakes Erie and St. Clair; two connected ecosystems in the Laurentian Great Lakes. We found a spatial variation of the BCC between the two lakes and among the sampling locations (significant changes in the relative abundance of 16% of the identified OTUs at the sampling location level). We observed five distinct temporal clusters (UPGMA broad-scale temporal variation) corresponding to seasonal variation over the 15 months of sampling. Temporal variation among months was high, with significant variation in the relative abundance of 69% of the OTUs. We identified significant differences in taxonomic composition between summer months of 2016 and 2017, with a corresponding significant reduction in the diversity of BCC in summer 2017. As bacteria play a key role in biogeochemical cycling, and hence in healthy ecosystem function our study defines the scope for temporal and spatial variation in large lake ecosystems. Our data also show that freshwater BCC could serve as an effective proxy and monitoring tool to access large lake health.

algal blooms (cHABs) and hypoxia [5]. Lake St. Clair is also heavily impacted by densely populated urban areas, and because of its location upstream [6]. Lake St. Clair is very shallow and highly affected by recurrent eutrophication symptoms [7].
Microbes play fundamental roles in transforming organic carbon and reintroducing it into the food web, thus characterizing temporal and spatial changes in bacterial community composition (BCC) can provide deeper insight into the processes and mechanisms operating in lake ecosystems and ultimately improve our basic knowledge and ability to predict BCC dynamics and function. BCC temporal variation can occur hourly [8] to seasonal [9] and interannual [10]. Cyclic abiotic factors such as light [11] and temperature [12], as well as biotic factors such as bacteriophages [13], may contribute to daily, weekly and seasonal cycles, but temporal variation goes beyond such straightforward cyclic relationships. BCC spatial variation also has reported in fine-scale (within pounds) [14] to large-scale; between the lakes [15] and in the lakes [16]. Despite these studies, the temporal (long-term continuous sampling) and spatial (between the lakes and in the lakes across multiple locations) dynamics of freshwater bacterial communities still needs more characterization in the Great Lakes.
Despite numerous studies addressing the biogeographical distribution of microbial species [8,10,15,16], microbial ecologists lack a basic understanding of the characteristic scales of temporal (long-term) and spatial (between the lakes and in the lakes across multiple locations) variation in aquatic BCC; as BCC form the cornerstone of whole freshwater ecosystems. Arguably, this is a key gap in our basic understanding of aquatic bacterial diversity that hinders our ability to develop theories about how microbially mediated function and the stability of those functions are maintained across space and time. Considering previous studies [8,10,15,16], we must integrate long-term temporal sampling with large scale spatial sampling to allow not only assessment of change over time and space, but also the potential for the interaction between location and date of sampling. To address this knowledge gap, we sampled bi-weekly 6 recreational beaches in Lake Erie (4 locations) and St. Clair (2 locations) at Windsor-Essex County (Windsor, Ontario, Canada) from June 2016 to August 2017 to characterize the BCC (Fig. 1). We used 16S rRNA Fig. 1 The sampling sites used for bacterial community composition in Lake Erie (Cedar Beach; CB, Colchester Harbour Beach; CH; Holiday Beach; HB and Point Pelee Beach; PP), and Lake St. Clair (Lakeview Park Beach; LP and Sand Point Beach; SP) in Windsor-Essex County. The map generated by the authors of this paper metabarcoding via next-generation sequencing (NGS) to ensure accurate and complete BCC characterization. We hypothesize both significant temporal (bi-weekly, monthly and seasonal) and spatial (sampling location and lake) variation in freshwater BCC, given the spatial and temporal scale of this study. Specifically, we predicted stronger temporal than spatial effects, primarily due to the expected large seasonal effects, but due to environmental similarity and connectivity among the sampled sites, only subtle spatial effects. More specifically, we expected to observe highly divergent BCCs among the seasons, but with the two summer season samples more similar. We also expected that environmental parameters (water temperature, season, precipitation and day light hours) have influence on the BCC dynamic. The outcome of this study will increase our basic understanding of how freshwater BCC changes at different scales of time and space which is critical for monitoring the ecological service of BCC and aquatic ecosystem health.

Global spatial and temporal effect
After quality control, 5.1 million Ion Torrent sequence reads remained across all 6 sample locations and 15 months. Each sample (replicate) had between 2102 and 8509 reads, with an average of 4789 reads. In total, 27,643 OTUs were detected. After removing singleton and doubleton sequence reads, as well as OTUs with ≤20 reads from the data set, 2100 OTUs were included in this study. The OTU table was rarefied to 2000 reads/ sample.
We collected 2 samples per location and time and found that replicates had no significant differences in the BCC (p > 0.05). We also observed no significant replicate effect with 3-9 samples/location in our recently published study of aquatic bacterial community dynamics in north temperate lakes [17], thus we combined sequence read data of the two replicates for each week at each location to increase the read depth for all further statistical analyses.
Based on our global GLMM, we found that lake (as a broad spatial factor) had significant effects (p < 0.05) only on Chao1 and Shannon indexes; however, month (as a broad temporal factor) had significant effects (p < 0.05) on Chao1, Shannon, PCo1, PCo2 and PCo3 (Supplementary Table 1). In the lake-specific models (two models), sampling location had a significant effect (p < 0.05) only on Chao1 and Shannon indexes, but month had a significant impact on Chao1, Shannon, PCo1, PCo2 and PCo3 (Supplementary Table 1). The interactions of sampling location with month also had significant (p < 0.05) effects on Chao1, Shannon, PCo1, PCo2 and PCo3 in two lake-specific models (Supplementary Table 1). Our Kruskal-Wallis analyses showed that out of 2100 OTUs, the relative abundance of 336 (16%) of the OTUs were significantly affected by location (6 sampling locations), while the relative abundance of 1453 (69%) of the OTUs were significantly affected by month (15 sampling months) (Fig. 2). The interaction of location and month had a significant effect on the relative abundance of 311 (14%) OTUs (Fig. 2). Supplementary Table 2 presented top 20 OTUs which were mostly affected by month, location (6 sampling locations) and their interaction. It is important to note that significance was not corrected for multiple simultaneous comparisons; however, the goal of this analysis was to show the pattern of effects on OTU relative abundance, highlighting the dominance of temporal effects relative to spatial and interaction effects (Fig. 2).

Spatial variation Broad spatial variation
There was no significant difference in the Bray-Curtis dissimilarity matrix for the BCCs of Lake Erie and Lake St. Clair across the full sampling period (PERMANOVA; df = 1, F = 2.064, p = 0.07) ( Supplementary Fig. 1). Similarly, we found no significant variation in the mean of the Chao1 (F = 1.2), PCo1 (F = 0.42), PCo2 (F = 0.92) and PCo3 (F = 0.94) values between two lakes by one-way ANOVA (df = 1, p > 0.05). Out of the 30 detected classes of bacteria across all samples, the relative abundance of only 5 of the classes was significantly higher (LDA; p < 0.05) in the BCCs of Lake Erie in comparison to Lake St. Clair ( Supplementary Fig. 2).

Spatial variation among different locations
There was no significant difference in the Bray-Curtis dissimilarity matrix for the BCCs of the 6 sampling locations (PERMANOVA; df = 5, F = 1.06, p = 0.34) ( Supplementary  Fig. 1). One-way ANOVA showed a significant effect of sampling locations on the Chao1 index (F = 5.32, p < 0.001). Tukey post-hoc test revealed that only the mean of Chao1 index in CB was significantly higher (p < 0.05) than HB, PP and SP but not from CH and LP. Using one-way ANOVA, there was no significant effect of sampling locations on PCo1 (df = 5, F = 0.644, p = 0.66), PCo2 (df = 5, F = 0.38, p = 0.85) and PCo3 (df = 5, F = 0.39, p = 0.88). Out of the 2100 OTUs, 336 OTUs (4 highly abundant, 21 moderately abundant and 311 rare OTUs) showed significant variation among the 6 sampling locations (Fig. 2). We identified 1-5 classes of bacteria with significant divergence for some of the pairwise comparisons at the class level between the BCC of different sampling locations (Supplementary Table 3). We identified a maximum 5 classes of bacteria with significant variation in their relative abundance between CB and LP; between CB and SP; between CH and SP; between HB and SP and between LP and SP (Supplementary Table 3). Further analysis at order and family levels shown that some bacteria had common patterns among different sampling locations (Supplementary Table 3). For example, order Aeromonadales (phylum Proteobacteria and class Gammaproteobacteria) had significantly higher relative abundance in LP and SP in comparison to CH, HP, and PP (Supplementary Table 3). Also, order Acidimicrobiales (phylum Actinobacteria and class Acidimicrobiia) had significantly lower relative abundance in LP and SP in comparison to CB, CH, and PP (Supplementary Table 3).

Broad temporal variation BCCs variation
As we were only interested in long term temporal variation, we combined bi-weekly data from each location within each month. We thus had a total of 90 samples (15 months × 6 locations) for our cluster analysis. UPGMA clustering showed five major clusters diverging at between 50 and 60% similarity based on the Bray-Curtis similarity index (Fig. 4). The BCCs of summer 2016 (June, July and August) were grouped in cluster 1. The BCCs of December (2016) and January (2017) were grouped as cluster 2. The BCCs of 5 months including February, March, April, May and June 2017 clustered together as cluster 3. The BCCs of July and August (summer 2017) were grouped as cluster 4 and the BCCs of fall 2016 (September, October and November) grouped into cluster 5 (Fig. 4). There was significant variation among the BCCs of 5 clusters (PERMANOVA test; df = 4, F = 9.57, p = 0.0001) and their pairwise comparisons. The overall average dissimilarity of the BCCs of 5 broad clusters was 51% using SIMPER analysis (Supplementary Table 4).

Diversity variation among clusters
The clusters differed significantly (df = 4, p < 0.0001) in Chao1 (F = 27.42), PCo1 (F = 28.2,), PCo2 (F = 27.9) and PCo3 (F = 20.2) using one-way ANOVA. Figure 5 presented variation of Chao1, PCo1 and PCo2 over 15 months of sampling. Tukey post-hoc analysis showing that Chao1 and PCo1 of cluster 1 were significantly (p < 0.05) higher than all 4 other clusters. PCo2 of cluster 2 was significantly (p < 0.05) lower than all 4 other clusters. More interestingly, the mean of Chao1 for cluster 3 was significantly lower than all other clusters and the mean of PCo1 also was the lowest among all clusters but significantly lower than clusters 1 and 4 (Supplementary Fig. 4 and Supplementary Table 5).
Only OTUs 2 and 3 (phylum Actinobacteria and family ACK-M1), OTU4 (phylum Firmicutes and family Exiguobacteraceae) and OTU6 (phylum Proteobacteria and family Comamonadaceae) were abundant across all clusters. The relative abundance of many OTUs significantly varied among the clusters, as expected given that the clusters were defined based on variation in BCC. For example, the relative abundance of OTU2 (family ACK-M1) was significantly higher in the BCC of cluster 1 (9.5%) than all other clusters (2.5-3.8%). Out of 2100 OTUs; 75, 90, 79, 61 and 79 moderately abundant OTUs were detected in the BCCs of clusters 1, 2, 3, 4 and 5, respectively.
While the BCCs of the clusters (1-5) were dominated by four phyla, the relative proportions of these phyla varied substantially (Fig. 4). Actinobacteria was the most common taxon (~50%) versus Firmicutes (7.16%) as the less common taxon in the BCC of cluster 1. In the BCC of clusters 4 and 5, Actinobacteria (~32% in both clusters 4 and 5) was dominant but Firmicutes (~32% in cluster 4 and~30% in cluster 5) became the second . The BCCs grouped into five broad temporal community clusters, all clusters were above 50-60% Bray-Curtis similarity (Y-axis) and show considerable taxonomic divergence (pie charts showing phyla level composition). For this analysis, the Bray-Curtis similarity matrix was generated using the 300 most abundant OTUs most common taxon. In the BCC of cluster 2, Proteobacteria (~39%) was the most common taxon and Firmicutes (~19%) became the third most common phylum after Actinobacteria (~26%). More interestingly, Cyanobacteria levels were elevated in the BCC of cluster 2 (2.25%) compared to the other clusters (0.75 ± 0.35%). The BCC of cluster 3 was enriched for Firmicutes (3 5%) as the dominant phylum and by Proteobacteria (3 0%) as the second most common phylum. We observed substantial variation in the relative abundance of 30 classes of bacteria. For example, the relative abundance of Bacilli was significantly lower in the BCC of cluster 1 compared to all other clusters, while some classes such as Actinobacteria and Thermoleophilia (belonging to Actinobacteria) and Cytophagia and Saprospirae (belonging to Bacteroidetes) conversely had significantly higher relative abundance in the BCC of cluster 1 relative to all the other clusters (Supplementary Table 6).

OTUs and taxa variation
Out of 2100 OTUs, the relative abundance of 1453 (69%) OTUs changed significantly (Fig. 2) across the months. The BCCs of July (2016) had the lowest number (6 OTUs) of highly abundant OTUs while the BCCs of February (2017) and June (2017) had the highest number of highly abundant OTUs (12 OTUs) among the 15 months of sampling. OTUs 2, 3 (family ACK-M1), 4 (family Exiguobacteraceae) and 6 (family Comamonadaceae) were highly abundant across all 15 months of sampling, but the relative abundance of others with high relative abundance dropped to ≤1% in some months (Fig. 6). Some highly abundant OTUs showed highly variable patterns over 15 months of sampling. For example, the relative abundance of OTU2 was 8.2, 10.8 and 9.5% in June, July and August (2016) respectively but dropped to between 2.0 and 5.7% across all other months. In contrast, the relative abundance of OTU13 (phylum Bacteroidetes and family Cyclobacteriaceae) was 0.02-4.4% in the BCCs of June 2016 -June 2017 but increased up to 10.7 and 7.2% in the BCCs of July and August (2017) respectively (Fig. 6A). Many of even the highly abundant taxa exhibited unpredictable variation across the study period, highlighting the chaotic nature of BCCs. Interestingly, the relative abundance of family Flavobacteriaceae belonging to Bacteroidetes (OTU50) and Oscillatoriaceae belonging to Cyanobacteria (OTUs 65 and 90) were elevated in the January BCCs relative to all other months (Fig. 6B). At the phyla level, the BCCs of June (47.56%), July (49.83%) and August (53.37%) in the summer of 2016, were enriched by Actinobacteria compared to June (27.47%), July (30%) and August (34.78%) in summer 2017 and all other months. In contrast, taxa belong to Firmicutes was not common in the BCCs of June (6.93%), July (7.24%) and August (7.31%) in summer 2016 while they were consistently common in the composition of the BCCs in other months (30.43 ± 6.88%). The relative abundance of taxa belonging to class Proteobacteria increased in January (34.72%) and February (43.34%) compared to other months (26.21 ± 3.84%).
At the class level, significant variation (p < 0.05) was detected among months by pairwise LDA among the 30 identified taxonomic classes. Only 2 bacterial classes showed significant differences between the BCCs of June and July (2016), meanwhile, 24 bacterial classes showed significant variation between the BCCs of July 2016 and June 2017. Classes Actinobacteria, Acidimicrobiia, Thermoleophilia (phylum Actinobacteria) and Saprospirae and Cytophagia (phylum Bacteroidetes) showed high relative abundance in the June and July and August (2016) BCCs compared to other months ( Supplementary  Fig. 5). Conversely, class Bacilli had low relative abundance in the June and July and August (2016) BCCs compared to all other sampling months ( Supplementary  Fig. 5). Chloroplast became a highly abundant taxa in the BCCs of cold months (November and December; 2016 and January; 2017). We also observed a noticeable increase in the relative abundance of classes Beta, Delta and Gammaproteobacteria in the BCC of January (2017) and a shift in the relative abundance of class Planctomycetia from highly to moderately abundant in the cold months (November-May) ( Supplementary Fig. 5).
Surveys of heterotrophic bacteria revealed significant changes in the composition of heterotrophic bacteria between the two summers (2016 and 2017). The change was characterized by significant reductions in the relative abundance of families such as C111 and ACK-M1 (Actinobacteria) in summer 2017 relative to summer 2016, coupled with increased abundance of Cyclobacteriaceae (Bacteroidetes), Enterobacteriaceae (Gammaproteobacteria) and Bacillaceae (Bacilli) in summer 2017 (Supplementary Fig. 6). Surveys of phototrophic bacteria revealed noticeably higher abundance of Oscillatoriaceae (Cyanobacteria) in colder and lower sunlight months which coincided with an increase in the abundance of Flavobacteriaceae family (a heterotrophic bacteria) in the colder months; November-March ( Supplementary Fig. 6).

Discussion
Our results showed spatial variation (between lakes and among sampling locations). It has been reported that variation of environmental parameters such as salinity, redox conditions and dissolved organic matters (DOM), etc. [18], as well as habitat variation [14], among sampling locations, are strong drivers of the spatial variation of the microbial communities. We did not measure abiotic parameters such as nutrient levels at our sampling locations; however, due to the connectivity of the two lakes by the Detroit River [19], the short distances among the sampling locations and the eutrophication of Lake Erie [5] and St. Clair [20], our sampling locations might have similar habitat features which consequently resulted in the little spatial variation of the BCCs in our study. The chao 1 index of CB was higher than other sampling sites, potentially due to presence of more greenhouse agriculture area and consequently higher bioavailability of nutrients; however, more studies are needed to address the influence of greenhouse agriculture area on BC diversity. Comparisons of upper Great Lakes (lakes Superior and Huron) BCC data with Lake Erie returned only a few OTUs (383 out of~13,000 OTUs) with significant differences in their abundance [16], consistent with our observation of only relatively minor spatial variation effects on BCC sampled at the same time. A report of only minor spatial variation in the metabolic profiles in 8 carbon substrates out of 31 in the sediment microbial communities of Lake St. Clair [21] highlighted the lack of substantial spatial variation among sites even at the functional level. Three distinctly different bacterial assemblages were reported in the upper, middle and lower Yenisei River (1800 km) [22] that study also reported matching nutrient spatial variation. The reported presence of distinct BCCs across large spatial scales versus our weak spatial effects across short distances may reflect nutrients gradients that may occur across long distance and drives microbial habitat variation.
We observed strong temporal variation over our sampling effort that included more than a year (15 months), which captured vast temporal (seasonal) variation. Clustering of the BCCs for the 15 sampling months resulted in five highly divergent BCC clades, which closely matched to seasonal patterns. Many studies have reported high prokaryotic microbial diversity in summer relative to winter [23], which we also observed. More interestingly, we found significant variation between the BCC of summer 2016 and 2017, with a significant decrease in the diversity indexes of the BCC of summer 2017. As the water temperature of two summers was not significantly different, the variation of the two summer BCCs is likely related to other abiotic (such nutrients bioavailability) and biotic factors that differed between the two summers [24,25]. Perhaps not surprisingly, we found significant correlations between selected environmental factors (daylight hours, precipitation, water temperature and season/cluster) and the BCC; however, all these environmental factors only explained 30% of the total variation in BCC across the 15 months. Our environmental factors are indeed confounded with our temporal variation (so, for example daylight, precipitation and temperature all co-vary with season), indicating that the covariance of the environmental factors with the temporal variation (season/cluster) may reflect the mechanisms for the temporal effects. We did not measure the nutrients, oxygen content or other environmental parameters in this study which could be consider one of the limitations of this study. It will be interesting to measure those environmental parameters to better understand the reasons behind the BCC variation with the time and locations.
This study was not designed to test for annual effects; however, significant differences in the BCC in the two sampled summers (2016 and 2017) highlights the potential for unpredicted annual temporal variation along with a seasonal and monthly temporal variation of freshwater BCCs. Few studies have characterized monthly temporal variation in freshwater BCC. However, one study reported monthly monitoring of BCC of Lake Taihu over 3 years (2009-2011) at four different sites, and showed significant monthly (and consequently seasonal) variation of diversity indexes of the BCC [26]. In line with our observation of strong seasonal variation in BCC, drastic seasonal transitions of microbial abundance and diversity have been reported in lakes [27]. Reported high levels of variability in BCC of bog lakes over 5 years (with unique communities in each year of sampling) [28] was also consistent with our limited results regarding annual (summer) BCC variation in large freshwater lake ecosystems.
Few microbial taxa showed higher abundance in Lake Erie relative to Lake St. Clair, despite Lake Erie is a substantially larger and deeper lake. The few that were more abundant in Lake Erie mostly belong to Actinobacteria. Actinobacteria are often the numerically dominant phylum in lakes [25], but their abundance decreases with oxygen limitation [29] and overloading of the nutrients [30]. Low abundance of this phylum in Lake St. Clair is likely due to the low level of oxygen or higher loading of nutrients in the smaller, shallower Lake St. Clair. We found little evidence for taxonomic variation (at the class level) among sampling locations in Lake Erie perhaps reflecting relatively uniform microbial habitat characteristics among the three sampling locations (all public beaches). Interestingly, the abundance of Bacteroidetes and Verrucomicrobia (two taxa associated with high-nutrient environments) [25] and Planctomycetes (a key taxon in anaerobic ammonium oxidation) [31] were significantly different between the two sampling locations in Lake St. Clair potentially due to overloading of nutrients from an adjacent urban tributary near one of the sampling sites (LP).
Actinobacteria has been reported as an important components of the microbial community in Lake Erie, particularly in the summer [32] which matched our finding for 2016 but not in 2017. The summer 2017 BCC showed a significant reduction of Actinobacteria compared to summer 2016, potentially due to changes in abiotic variables such as reduced oxygen levels [29] or nutrient overloading [30] in summer 2017. Cyclic high abundance of Actinobacteria within the bacterioplankton (89%) has been reported recently by late spring in years 2013 and 2014 in parallel to high abundance of zooplankton grazing at Astatic soda pans [33]. The abundance of Actinobacteria did not show cyclic pattern among 2 years (summer 2016 and 2017) in our study but decreased from summer 2016 to summer 2017 continuously which potentially could not relate to the zooplankton grazing. OTUs belonging to Proteobacteria were in the top three most abundant phyla in the BCCs across different months and seasons. For example, Proteobacteria was the most abundant phylum in January and June (2017); the second most common in summer 2016 (June, July and August) and third most common in July (2017). Proteobacteria are reported as very abundant in many different freshwater lake habitats, but their relative abundance varies among lakes, within lakes and over time [25]. Curiously, one ubiquitous group of metabolically versatile bacterial was observed at high abundance in the coldest months of our study; order Pseudomonadales (such as families Moraxellaceae and Pseudomonadaceae) in January and February and Enterobacteriaceae (such as Gammaproteobacteria) in January, March, May, June and July (2017). Freshwater lake Bacteroidetes are often found in high abundance during periods following cyanobacterial blooms [25]. It has been reported that Flavobacterium spp. (belonging to Bacteroidetes) is the dominant taxa of the winter community in Lake Erie [32], and while we also found a significant elevation of Flavobacteriaceae over the cold months, we identified Proteobacteria as the dominant phylum in the winter (cluster 2). In our data set, the relative abundance of Cyanobacteria (family Oscillatoriaceae) and Bacteroidetes (family Flavobacteriaceae) exhibited correlated abundance in the BCC of November to March, likely reflecting the dependency of Bacteroidetes on the organic matter loading by Cyanobacteria [34]. Previous studies have also noted high levels of Cyanobacteria during winter months in freshwater reservoirs [35]. It has been suggested that high concentrations of overwintering vegetative Cyanobacteria cells may provide a large inoculum for blooms during warmer seasons [36], but the impact of family Oscillatoriaceae on algal bloom dynamics in Lake Erie and St. Clair is not well known.
Cytophaga is well known to be proficient in degrading biopolymers such as cellulose and chitin, part of the high molecular mass fraction of DOM [25]. In our study, the relative abundance of Cytophaga (phylum Bacteroidetes) changed from being a rare component of the community in June, July and August (summer 2016) to high abundance in May, July and August (2017), indicating potentially elevated availability of DOM in the summer of 2017. Although Firmicutes is generally a minor freshwater lake community taxon [25], in our study the relative abundance of OTUs belonging to Firmicutes increased over time. Indeed, this phylum became one of the most dominant phyla across all sampling points after the summer of 2016. Similar to our finding, a high abundance of Firmicutes (23%) was reported from water samples collected from freshwater public beaches (Ohio, Madison lake) [37]. In that study, Exiguobacterium and Paenisporosarcina were the most dominant Firmicutes genera [37], while in our study Bacillaceae (September 2016-August 2017) and Exiguobacteraceae (December, February and March 2016-17) were the most abundant genera in Lake Erie and St. Clair.
We noted a composition shift of the freshwater BCC from a community enriched by Actinobacteria (sensitive to nutrient overloading and low oxygen level) to one enriched by Proteobacteria (adapted to nutrient overloading) [25], Bacteroidetes (proficient in the degradation of complex biopolymers and DOM) [25] and Firmicutes (diverse metabolic capabilities and resistant to oxygen limitation) [38] over time. Although we did not measure nutrients in our study, the observed pattern of BCC change indicated likely increases in the loading of nutrients into both lakes from fall 2016 onwards. However, the mechanism(s) responsible for the observed BCC shift requires further investigation. Furthermore, we observed temporal variation in Enterobacteriaceae abundance; a family that includes many waterborne pathogens and fecal indicator bacteria (FIBs) [39], Pseudomonadales; a taxon which may act as an opportunistic pathogen in fish [40] and humans [41] and Oscillatoriaceae (Cyanobacteria), all of which reflect variation in potential pathogens and health risk, particularly over the summer.

Conclusion
Our results showed that although freshwater BCC may have a cyclic seasonal or annual variation, the details of the composition of the community can change unpredictably over the temporal and spatial scales included in our study. The observed BCC variation could be linked to the functional activity of the community, making additional studies necessary to characterize the consequence of this variation on the ecological services of BCC in a large freshwater ecosystem. Our results also showed that long term monitoring of the bacterial community could serve as a sensitive proxy of freshwater ecosystems health and perhaps even function.

Study sites and sample collection
Freshwater samples were collected bi-weekly over 15 months from June 2016 to August 2017 from shorelines at 6 locations, including four locations from Lake Erie (Cedar Beach (CB), Colchester Harbour (CH) Beach, Holiday Beach (HB), and Point Pelee (PP) Beach) and 2 locations from Lake St. Clair (Lakeview Park (LP) Beach and Sand Point (SP) Beach) at Windsor-Essex County (Windsor, Ontario, Canada). LP and SP are in urban areas, while CB, CH, HB and PP are located in agricultural areas. Among these 4 sampling sites, CB is more impacted by greenhouse agriculture area. LP is near an urban tributary (the Belle River joins Lake St. Clair at LP beach), while SP and HB are near the inlet and outlet of the Detroit River respectively. Two water samples (each 250 mL) were collected at 0.5 m depth at each location and 2-3 m far from shore. Only for LP; the samples were collected from 3 m distance of shorline to meet 0.5 m depth of sampling. In total, we collected 60 samples (2 replicates * 2 weeks/month * 15 months = 60 samples) at each location between 8:00-11: 00 AM. Water samples were transported to the laboratory on ice and were filtered using 0.2 μm polycarbonate membranes (Millipore, USA), and the filter immediately stored at − 20°C until DNA extractions were performed. Water temperature was measured at the time of sample collection at each location. Other environmental variables, such as precipitation and daylight hours, were collected from Environment Canada (http://climate.weather.gc.ca/historical_data/ search_historic_data_e.html) according to the sampling date.
DNA extraction, PCR, library preparation and NGS DNA was extracted following Shahraki et al. [42]. The extracted DNA was used as a template to amplify the V5-V6 region (~350 bp) of the 16S rRNA gene using V5F and V6R primers [43]. Then sample barcode and adaptor sequences were ligated to each PCR product by a second, ligation, PCR [43]. Second-round PCR products were pooled and purified using the QIAquick Gel Extraction Kit (QIAGEN, Toronto, ON, Canada). The library was then diluted to 60 pmol/L and sequenced on an Ion PGM™ System (Thermo Fisher Scientific, Burlington, ON, Canada) with 400-base read length chemistry.

Sequence handling
We used the Quantitative Insights into Microbial Ecology (QIIME V. 1.9.1) bioinformatics pipeline to demultiplex the sequences, quality filters and trim of the adaptor and primers [44]. Briefly, a length cut-off of 200 bp was selected for quality assurance of the sequences. We used usearch quality filter pipeline (usearch_qf script) using USEARCH integrated into QIIME V. 1.9.1 to perform filtering of noisy sequences, chimera checking, and operational taxonomic units (OTUs) picking on a set of de-multiplexed (post split_libraries.py) sequences. Sequences were initially sorted by length, and then de-replication was performed using -max_rejects = 500, followed by sorting by abundance. The sequences were clustered using 97% sequence identity to filter noisy reads. Chimera checking was performed using UCHIME and chimera-free sequences were used for OTU assignment using the Basic Local Alignment Search Tool (BLAST) against Greengenes 16S rRNA database as a reference data file [45]. The representative sequence for each OTU was selected using the most abundant method for assigning taxonomy in the Ribosomal Database Project (RDP) classifier program with a minimum 80% confidence level [46]. To minimize the impact of zero-inflation, after removing of single and double read OTUs using QIIME command (filter_otus_ from_otu_table.py), we further removed the OTUs with ≤20 reads from the whole library manually. Then we used arcsine square root transformation to normalize the compositional and library size variation of the biological data (OTU table). The OTU table was rarefied to 2000 quality passed sequences for each sample to calculate alpha diversity. The original OTU table (non-rarefied) was used to calculate relative abundance. We defined an OTU as "abundant" when it had a relative abundance above 1% of the community, "moderate" when the relative abundance was between 0.1-0.99% and "rare" when the abundance was below 0.1% [47]. Original fastq files with metadata are deposited in NCBI Sequence Read Archive (ID PRJNA662419).

Statistical analysis Global spatial and temporal effects
We used nested ANOVA in R environment (version 3.1.1) [48] to determine replicate effect on the BCC. Alpha diversity indexes (Chao1 and Shannon) and the first (PCo1) and second (PCo2) principal coordinates from the principal coordinates analysis (PCoA) across all samples (including replicates) were used as a dependent variable in the nested ANOVA to test specifically for a replicate effect.
We used a generalized linear mixed model (GLMM) with Maximum Likelihood (ML) method in R environment (package lme4) [49] to test the effect of sampling location, lake (Erie and St Clair), month and the their interactions effects on Chao1, Shannon, PCo1 and PCo2 as dependent variables. Chao1 and Shannon were calculated using the rarified OTU table. The Bray-Curtis dissimilarity matrix was calculated using Primer-e software version 7.0.13 and after PCoA, we selected PCo1 (16.1%) and PCo2 (8.1%) and PCo3 (6.5%), which represented the most variation in the BCC PCoA. For each dependent variable, we run a global GLMM model by nesting sampling locations (6 locations) within the lake (2 lakes) and sampling weeks (2 weeks/month) within sampling month (15 months) to determine the effect size of month and lake on the BCCs. We used replicates as a random factor and sampling month and lake as fixed factors. To evaluate the effect of the sampling location by month interaction, we ran GLMM models on data from each lake separately. In these models, we considered month (weeks nested within 15 sampling months) and sampling locations as a fixed factor and replicates as a random factor. We also measured the impact of spatial (sampling locations) and temporal (sampling months) variation on the relative abundance of each OTU (n = 2100) using the Kruskal-Wallis one-way analysis of variance using R environment (version 3.1.1) [48].

Spatial variation
To characterize BCC spatial variation (between lakes and sampling locations), we used i) permutational multivariate analysis of variance (PERMANOVA) with 9999 permutations using Primer-e software to compare the Bray-Curtis dissimilarity of the BCCs, ii) PCoA of the Bray-Curtis similarity matrix to visualize the pattern of the BCC variation using Primer-e software, iii) one-way ANOVA to compare the mean of diversity indexes and PCo1 and PCo2 of the BCCs using R environment (version 3.1.1) [48], and iv) linear discriminant analysis (LDA) using the LEfSe method [50] to compare the relative abundance of taxa (class, order and family levels) in the bacterial communities.

Temporal variation
We applied hierarchical agglomerative clustering on the Bray-Curtis similarity matrix of the BCCs (top 300 highly abundant OTUs) using the group average method using Primer-e software to explore the possibility of seasonal clustering in terms of BCC. Once we identified clear clusters, we tested for differences in BCC among the clusters and the 15 months of sampling following the same approach as we used for the spatial variation (see above). Moreover, we used SIMPER analysis on the Bray-Curtis dissimilarity matrix to compare the overall dissimilarity among clusters and the 15 months of sampling in the vegan package [51]. Plots and graphs were generated using R (version 3.1.1) [48].

Environmental effects
We applied a RELATE analysis (Spearman's p correlation coefficient) on the Bray-Curtis similarity matrix calculated from whole data sets and the matrix of Euclidean distances calculated from normalized environmental data (daylight hours, precipitation, water temperature and season) as the environmental matrix to evaluate the relationship between the BCC and environmental factors. A distance-based linear model (distLM)