Metabarcoding for bacterial diversity assessment: looking inside Didymosphenia geminata mats in Patagonian aquatic ecosystems

The number of organisms that spread and invade new habitats has increased in recent decades as a result of drastic environmental changes such as climate change and anthropogenic activities. Microbial species invasions occur worldwide in terrestrial and aquatic systems and represent an emerging challenge to our understanding of the interplay between biodiversity and ecosystem functioning. Due to the difficulty of detecting and evaluating non-indigenous microorganisms, little is known about them and the processes that drive successful microbial invasions – especially when compared to macroinvasive species. Microalgae are one of the most abundant microorganisms in aquatic systems, and some are able to produce massive proliferations (mats) with significant impact on biodiversity and economic activities. Among microalgae invaders, Didymosphenia geminata is a benthic diatom that constitutes a major global threat for freshwater ecosystem conservation. Despite two decades of research, the cause of mat proliferations remains uncertain. It has been proposed that bacterial biofilm composition may contribute to successful attachment and consequently to proliferation. The aim of this work was to assess the bacterial diversity associated with the mat-forming diatom D. geminata in three aquatic ecosystems of the Chilean Patagonia by implementing genomic-based tools. Using a metabarcoding approach, we determined a core microbiota represented by 4 phyla, 16 families, and 20 genera. Proteobacteria (Alpha and Beta) and Bacteroidetes were the dominant phyla, followed by Cyanobacteria and Planctomycetes. At the lower taxonomic level, unidentified genera from the Comamonadacea family were the most abundant bacteria of the core microbiota. The bacterial composition we found was very similar, with some relative abundance changes, to that reported in a previous study of the bacterial diversity of biofilms from rivers contaminated with D. geminata in New Zealand. This geographical co-occurrence pattern between bacteria and D. geminata in different independent studies suggests that a specific microbiota may be associated with D. geminata distributions, establishment and proliferation. Our work serves as the starting point to design an experimental study that aims to determine whether these specific bacteria facilitate the establishment of the microalgae by creating favorable conditions or are the result of the diatom invasion.


Introduction
Biological invasions constitute a major threat to biodiversity, bringing serious ecological consequences. Most studies have focused on macroinvasive species (animals and plants); meanwhile, invasions by microorganisms (eukaryotes, prokaryotes and viruses) are less well understood, and the investigations are biased towards those microorganisms that could affect human health (Litchman 2010). Microbial species invasions involve both pathogenic and free-living taxa and constitute an important subject to better understand the interplay between biodiversity and ecosystem functioning (Acosta et al. 2015). Detection and evaluation of non-indigenous microorganisms are difficult to carry out due to their small size, the inability to cultivate a great number of them, insufficient taxonomic resolution and disagreements on what constitutes a species (Wilk-Woźniak et al. 2016). Nevertheless, microorganisms are essential components of the ecosystems and are primarily responsible for biogeochemical cycles and key environmental processes (Amalfitano et al. 2015).
Dispersal limitation and stochasticity play major roles in the assembly of natural communities (Becks et al. 2005). The distribution of microbial species is not uniform at local, regional or global scales, and some emerging evidence suggests that microorganisms may exhibit large-scale biogeographical patterns (Logares et al. 2013). Microbial community assembly processes are dynamic and influenced by different factors (Langenheder and Lindström 2019). For instance, in complex aquatic bacterial communities, it was found that the initial random arrival of a species has an effect on the community assembly by modifying the immigration success of later-arriving species (Rummens et al. 2018), a phenomenon called the priority effect (Fukami 2015). On the other hand, biological invasion may involve a dispersal event of the invasive species together with the associated microbial community (Zatoń et al. 2019), which may have a role as mediators for preparing a suitable environment for the invader. Indeed, the establishment of complex bacterial interactions may result in shorter generation times and enhance the efficiency of the system (Jousset et al. 2013). Therefore, it is necessary to consider the complexity of the interactions among microbes to better understand bioinvasions.
Microalgae are one of the most abundant microorganisms in aquatic systems (Finkel et al. 2010), and several studies have shown that invasive microalgae may be highly recurrent in aquatic environments (Amalfitano et al. 2015); in some cases they produce massive blooms with significant impact on biodiversity and economic activities (Anderson et al. 2012). Among microalgae invaders, Didymosphenia geminata (Lyngbye) M. Schmidt (Schmidt, 1899) is a freshwater benthic diatom that attaches to rock or plant substrates and grows into the water column on stalks composed of extracellular carbohydrate material. In suitable conditions, the microalgae have the ability to develop massive blooms, which form a dense mat (Whitton et al. 2009). The species has been able to establish in rivers and lakes in North America, Europe and Asia (Sherbot and Bothwell 1993;Whitton et al. 2009;Sanmiguel et al. 2016) as well as in the Southern Hemisphere (Montecino et al. 2016;Sastre et al. 2013). In the Southern Hemisphere, this diatom was first detected in New Zealand (Kilroy 2004); a few years later, it was found in Argentina (Merino et al. 2011;Beamud et al. 2013) and Chile (Segura 2011). The invasion of D. geminata in the Southern Hemisphere has been characterized as one of the most aggressive, with potentially severe ecological and economic impacts, because of the rate of expansion and the number of rivers affected (Blanco and Ector 2009;Kilroy and Unwin 2011;Reid et al. 2012). For instance, the D. geminata invasion directly affects sectors such as commercial fisheries, recreation and tourism. Besides, changes in the composition of native macrospecies have been reported, illustrating one of the ecological impacts (Ladrera et al. 2018). Despite two decades of research, the cause of mat proliferations remains uncertain. Introduction alone of the microalgae cells to a new habitat is not enough to produce the proliferation (Bothwell et al. 2014) as it has been observed that some tributaries of D. geminata-infested rivers remain free of contamination despite having suitable conditions for establishment (Sutherland et al. 2007). Blooms have been associated with low-nutrient, oligotrophic streams. Experimental evidence suggests that D. geminata produces extensive stalk material when it is phosphorus (P)-limited; this may be a strategy to expose cells to the water column for greater acquisition of phosphorus (Bothwell and Kilroy 2011;Kilroy and Bothwell 2012;Cullis et al. 2012;Bothwell et al. 2014). However, in a recent publication, data suggested that D. geminata blooms do not always form when soluble reactive phosphorus (SRP) is low, even when cells are present (West et al. 2020). Together, these findings suggest that water chemistry alone is insufficient to explain distribution patterns. Climatic variables such as temperature have a strong effect in the global spatial and temporal population dynamic of D. geminata (Montecino et al. 2016). Moreover, substrate composition may determine whether the diatom can adhere to a surface as it has been shown that rocks with intact biofilm are more likely to be colonized by D. geminata (Bergey et al. 2010). Research on D. geminata and other diatoms suggests that bacterial biofilm composition may contribute to successful attachment (Gärdes et al. 2011;Brandes et al. 2016). Few works have evaluated changes in the community composition associated with the massive growth of the diatom; rather, they have mainly focused on diversity of other diatoms and macroinvertebrates (Gillis and Chalifour 2010;Figueroa et al. 2018;Ladrera et al. 2018). There is a dearth of studies that investigate the effect of D. geminata proliferation on the bacterial communities or determine the bacterial diversity associated with the mat and the functionality of diatom-bacterial interactions. Hence, to establish the factors that potentially promote this microalgae invasion in aquatic ecosystems, it is necessary to further investigate the diversity of microbial communities associated with D. geminata mats.
The aim of this work was to assess the bacterial diversity associated with the mat-forming diatom D. geminata in three aquatic ecosystems of the Chilean Patagonia by implementing genomic-based tools. To do that, we used a metabarcoding approach, which implies amplification and highthroughput sequencing of a specific genomic region from total DNA extracted from an environmental sample or tissue (Taberlet et al. 2012). Because metabarcoding provides thousands to millions of sequences from a bulk sample, it has the potential to assess a great percentage of the diversity of a given community (Cristescu 2014). This work is the first to reveal the microbiota associated with the mat formation of D. geminata.

Study site and sampling collection
Samples of D. geminata mats attached to rocks were collected during spring in December 2017 and January 2018 in three freshwater ecosystems: two rivers and one lake located in the Chilean Patagonia. Different locations, separated by kilometers, were surveyed for each ecosystem. At both Río Serrano (RS) (51°S and 73°W) and Río Grande (RG) (53°S and 68°W), 3 locations were sampled, whereas 2 locations were sampled at Lago Blanco (LB) (54°S and 68°W) (Supplementary material Table S1). Within each location, three randomly selected biological replicates were collected. Each replicate consisted of two to three D. geminata mats of ca. 1 cm 2 removed from a rock, then combined and preserved in ethanol 70%. A total of 24 samples were collected, transported to the laboratory and stored at −20 °C prior to further analysis. At each point, water temperature, pH, dissolved oxygen (DO) and turbidity were recorded using a multiprobe (HI 9829-13202, Hanna Instruments, USA). Nutrient concentrations such as ammonium (NH 4 ), soluble reactive phosphorous (SRP), dissolved organic carbon (DOC) and silicates were measured in the laboratory (Table S1).

DNA extraction
Total bacterial DNA was extracted from 500 mg of D. geminata mat using DNeasy Plant Mini Kit (Qiagen) as per the manufacturer's protocol. The quality and concentration of the extracted DNA were measured using a fluorometer Qubit 2.0 (Invitrogen).

Amplicon library preparation and sequencing
The V4-V5 region of the bacterial 16S rRNA gene was amplified using the specific primers: 515F (5´-CTT TCC CTA CAC GAC GCT CTT CCG ATC TGT GYC AGC MGC CGC GGT A-3´) and 928R (5´-GGA GTT CAG ACG TGT GCT CTT CCG ATC TCC CCG YCA ATT CMT TTR AGT-3´). Indexed Illumina adapters are underlined. Total DNA was subjected to PCR amplification, and all PCRs were conducted in 25 µL of volume and prepared under sterile conditions. Each PCR reaction contained 2.5 µL of 10X Buffer (Sigma-Aldrich), 0.5 µL of 10 mM dNTPs, 1 µL of 10 µM of each primer, 0.25 µL of 5 U/µL Taq polymerase (Sigma-Aldrich), 10 ng of DNA and sterile water to reach the final volume. PCR amplifications were carried out following these steps: 95 °C for 2 minutes, followed by 30 cycles of 94 °C for 60 seconds, 65 °C for 40 seconds, 72 °C for 30 seconds, and a final elongation step at 72 °C for 10 minutes. Finally, paired-end sequencing was conducted using an Illumina MiSeq platform (Illumina, Inc.) at the Genome and Transcriptome Platform, INRA, France.

Data analyses
The DNA sequence analysis was performed using the pipeline Quantitative Insights Into Microbial Ecology (QIIME) version 1.9.1 (Caporaso et al. 2010). The Illumina forward and reverse reads were assembled using the join_paired_ends.py script with a minimum overlap of 50 bp, 100% identical. After the paired-end alignment and primer trimming, the reads were filtered for quality with a Q30 Phred quality threshold and demultiplexed using split_libraries.py. Chimeric sequences were detected and subsequently removed using the USEARCH 61 algorithm implemented in QIIME with the identify_chimeric_seqs.py script (Edgar 2010). The retained reads with a median sequence length of 413 bp were clustered into Operational Taxonomic Units (OTUs) based on 97% sequence similarity against the Greengenes reference database version 13_8 (DeSantis et al. 2006) using the pick_open_reference_otus.py script. OTUs with less than 50 sequences per OTU were removed. Taxonomy was assigned to each OTU with an uclust-based consensus taxonomy assigner implemented in QIIME using the Greengenes reference. Chloroplast and mitochondria sequences affiliated with D. geminata were removed using the filter_taxa_from_otu_table.py script. To determine the bacterial composition of each ecosystem, relatively rare OTUs were removed; only those representing more than or equal to 0.05% of the total sequences were selected. The core microbiota associated with D. geminata mats was determined by OTUs representing more than 0.5% of the total sequences and present in all 24 samples analyzed.

Bacterial community diversity
To quantify the bacterial richness in each aquatic ecosystem, four different alpha-diversity estimators were calculated by QIIME from rarefied samples: the Shannon entropy and the Simpson's index, which emphasize diversity, and the number of observed OTUs and the Chao1 estimator, which emphasize species richness. To highlight the differences among environments, we calculated the beta-diversity using the unweighted and weighted UniFrac distances (Lozupone and Knight 2005) and the Bray-Curtis (BC) dissimilarity (Bray and Curtis 1957). The Principal Coordinates Analysis (PCoA) was used to visualize differences between microbial communities in samples according to weighted and unweighted UniFrac distances and BC distances.

Statistical Analysis
To calculate the statistical significance of alpha-diversity indexes, the nonparametric Mann-Whitney U test was used. The statistical significance of between-groups' distances was assessed using PERMANOVA (Permutational Multivariate Analysis of Variance Using Distance Matrices) (Anderson 2001) (with 999 permutations), implemented in the adonis function in the vegan R-package. To compare the physicochemical parameters among ecosystems, the normality of the distributions of all variables was tested using Shapiro-Wilk tests. Due to the non-normal distribution of data for some variables, significant differences were determined using the Kruskal-Wallis (KW) test for non-parametric data (P < 0.05, KW). When significant differences were found, a Kruskal-Wallis multiple comparison test (KWZ, post-hoc analysis) was performed. Differences were considered significant at Z > 1.96. Statistical analyses were conducted using the NCSS 11 Statistical Analysis System software (Number Cruncher Statistical Systems, USA). The R software was used to perform a canonical correspondence analysis (CCA) so as to explore the contribution of environmental variables to the beta-diversity of microbial communities.

Data availability
All data generated or analyzed during this study are included in this published article (and its Supplementary materials files).

Analysis of 16S-rRNA gene sequencing results
After paired-end reads alignment, primer trimming and merging, a total of 636,665 reads were quality filtered with a median length of 413 bp: 248,106 from RS (9 samples with a mean of 27,567 reads per sample); 231,712 from RG (9 samples with a mean of 25,746 reads per sample); and 156,847 from LB (6 samples with a mean of 26,141 reads per sample). 34,593 chimeras were identified and removed, resulting in 602,072 high-quality reads that were clustered into OTUs at 97% similarity using a subsampled openreference strategy with the USEARCH 61 algorithm against the Greengenes reference set (Table 1 and Table S2). Samples were rarefied (without replacement) at 10,800 reads per sample. Rarefaction curves began to plateau, indicating a good representation of the bacterial community in all samples. As expected, the consistency of estimation of taxa abundances grows as the sample depth increases ( Figure 1A).

Bacterial composition of D. geminata mats
The diversity of bacterial community structure in D. geminata mat samples was estimated by alpha-and beta-diversity indices. The Shannon entropy, number of observed OTUs, Simpson's index, and Chao1 estimator indicated difference in species richness among samples. Significantly higher alpha-diversity was observed in LB and RG compared to RS (P < 0.05, Mann-Whitney U test), while LB and RG were not statistically different ( Figure 1B and Table S3); this indicates that in terms of species richness, LB and RG were similar and both were statistically different from RS. To further investigate whether the three ecosystems had a differentiated bacterial community associated with D. geminata mats, the beta-diversity of the samples was computed using the weighted UniFrac, unweighted UniFrac and Bray-Curtis dissimilarity. A PCoA analysis of the calculated weighted UniFrac distance, which accounts for the OTUs'abundance distribution between samples, indicated that the three principal components explain 81.3% of the total variability among samples. In the plot, three distinct clusters corresponding to each aquatic environment were observed ( Figure 1C). This result is confirmed by the PCoA plot that uses the unweighted UniFrac distance (presence/absence) ( Figure 1D). Furthermore, a PERMANOVA (999 permutations) showed that the three bacterial populations corresponding to LB, RG, and RS were significantly different from one another (P < 0.001 for Bray-Curtis dissimilarity, Table S4). Heatmaps displaying the relative abundances of OTUs at different cut-off points showed consistency across the three biological replicates at each location ( Figure S1). The distribution patterns of abundance of individual OTUs support the hypotheses of the existence of distinct microbial communities at each location that are likely influenced by the environmental factors characteristic of each site. OTUs with relative abundances representing more than 0.05% of total sequences were selected. 268 OTUs were obtained (Table S5), and their taxonomical classification consists of 13 phyla, 25 classes, 47 orders, 65 families, and 99 genera (Tables S6-S8 (relative abundances)). At the Phylum level, 95.8% of the overall composition of the bacterial community is represented by Proteobacteria (52.9%), Bacteroidetes (37.3%), Cyanobacteria (3.4%) and Planctomycetes (2.2%) (Figure 2A). At the Family level, 5 families represent more than 59% of the sequences: Comanonadaceae (19.2%), Cytophagaceae (16.9%), Chitinophagaceae (8.7%), Flavobacteriaceae (7.7%) and Sphingomonadaceae (7.4%) ( Figure 2B). Each ecosystem showed specific OTUs that were highly enriched or only observed at that particular environment; 10 OTUs at LB, 18 at RG, and 16 at RS (Table S9 (number of  reads)). For instance, New.Reference OTU132, which corresponds to the genus Kaistobacter, was exclusively detected in LB samples; OTU 551438 classified under Actinobacteria phylum was specific to RG, and an unidentified Cyanobacteria from OUT 94786 was observed only at RS. The key label is displayed for the 15 most abundant families. 36 families with relative abundances between 0.05% and 0.5% were grouped and displayed as others*. Percentage of relative abundances represents the average abundance among biological replicates.

Core microbiota associated with D. geminata mats
Despite the geographical distance, the differences in environmental factors and the variation in the microbiota composition of the three aquatic ecosystems, 32 OTUs were present in all 24 samples analyzed; they form the core microbiota associated with D. geminata mats (Table S10). The taxonomic profile consists of 4 phyla, where Proteobacteria and Bacteroidetes represent more than 94% of all sequences. Most of the bacteria belong to Betaproteobacteria, Cytophagia or Alphaproteobacteria classes, and the Burkholderiales order is the most evenly represented across the samples. At the family level, Comamonadaceae dominated the community followed by Cytophagaceae and Flavobacteriaceae. Organisms from the Pseudanabaenaceae family are significantly more abundant at RS, though they were observed in all samples (Figure 3 and Table S11).
Twenty taxonomic groups are listed in the core microbiota at lower taxonomic levels. Nine of these groups are highly abundant at few locations but poorly represented (< 1%) in others. For instance, Oxalobacteraceae makes up 30.6% of the community at RG1, but with the other seven locations it is present in less than 0.2%. Cyanobacteria, represented by the Leptolyngbya genus, have a relative abundance range between 16.8% of total reads at RS5 and 0.04% at LB1. The Leadbetterella genus from the Bacteroidetes phylum is significantly more abundant at LB3 (10.5%), and the Methylotenera genus from the Proteobacteria phylum is highly represented at RG2 (6.3%) and RG3 (5.6%), whereas in the other locations  it makes up < 0.5% (Table S12). These 9 genera together represent 17.6% of the core microbiota and are called "others" in Table 2. Of the remaining 11 genera, the core microbiota was largely composed of unidentified genera of the Comamonadaceae family that alone accounted for an average of 27.8% of the reads (28% LB, 28.1% RG and 27.2% RS) (Figure 4). Close to 30% of D. geminata-associated microbiota is represented by the Flectobacillus and Flavobacterium genera, both from the Bacteroidetes phylum (Table 2). Flectobacillus represents 20.3% of total sequences; however, it is significantly more abundant at RS than at LB or RG. Flavobacterium's relative abundance is similar in all samples and constitutes an average of 9.4% of the microbiota. Bacteria belonging to the Pirellulaceae family make up 2.8% of the microbiota and are the only representatives from the Planctomycetes phylum.

Physicochemical analysis and nutrient concentration of the aquatic ecosystems
During sampling, water temperature, pH and DO were recorded at each point. Water pH was similar at each location (P > 0.05, KW; mean 7.28). In contrast, water temperature was significantly lower at RS (8.2 °C) than at LB (11.3 °C) and RG (13.8 °C, P < 0.05, KW). DO was also different across ecosystems (P < 0.05, KW) with RS showing the highest values (D0 = 12.1 mg/L). Several nutrient concentrations were measured in the laboratory (Table S1). In general, the LB and RG environments were similar to each other, but both displayed differences compared to RS. For instance, RS had significantly lower concentrations of DOC (< 1.6 mg/L; P < 0.05, KW), a higher concentration of NH 4 (> 0.1 mg/L; P < 0.05, KW) and a lower concentration of silicate (< 2.6 mg/L; P < 0.05, KW) than those of LB and RG. The CCA ordination showed that more than 93% of inertia was explained by the whole set of variables in which turbidity, DOC, DO, pH and temperature explained the majority of the observed bacterial distribution pattern (P < 0.05; Table 3; Figure S2).

Discussion
Microscale interactions between phytoplankton and bacteria represent a fundamental ecological relationship in aquatic environments. Despite bloom variations in terms of phytoplankton composition and environmental  (Seymour et al. 2017). For instance, aquatic heterotrophic bacteria consume algae-derived organic material such as dissolved organic carbon (DOC) (Muhlenbruch et al. 2018) and complex algal products like mucilage and polysaccharides (Teeling et al. 2012). From the perspective of the microalgae, bacteria can enhance micronutrients' bioavailability and supply vitamins (Amin et al. 2012) and other limiting macronutrients such as nitrogen and phosphorus. In the context of aquatic invasions by microalgae, assessing these sophisticated interactions with the bacteria community could help to clarify whether bacteria have a role during the invasion process and to find biological alternatives for controlling the establishment and dispersion of invasive species such as D. geminata. Accordingly, it is important first to know the bacterial community associated with the alien species. Metabarcoding methodologies have been combined with extensive sampling of microbiomes to expand our understanding of bacterial diversity associated with a specific environment; they represent great molecular tools for the study of microbial diversity's role in the success of biological invasions.
In the present work, we assessed bacterial diversity using a metabarcoding approach and established a core microbiota associated with D. geminata mats from the Chilean Patagonia to gain insight into the diatom's invasion process. Another research group previously investigated the bacterial composition of rocks' biofilms (thickness < 1mm) from rivers invaded by D. geminata in New Zealand (Brandes et al. 2016). There, the most abundant bacterial classes were Sphingobacteria (Bacteroidetes), followed by Proteobacteria (Beta and Alpha) and Cyanobacteria. We found a very similar bacterial composition, although the relative abundances differ since Proteobacteria were the most abundant phylum in the mats. Among Proteobacteria, betaproteobacteria belonging to the Burkholderiales order (Comamonadacea family) were detected in both studies; however, it was more abundant on the mat than in the rock biofilm. An explanation for these differences in relative abundances might be offered by the stage of biofilm succession that was investigated. That is, early to mid-phase biofilm formation versus a late-phase, established mat. All together, these results suggest a relationship between D. geminata and the microbiota found in the diatom's mat that may have a role in the establishment and/or proliferation of this microalgae. Further work is required to investigate the nature of diatom-bacteria relationships and to determine whether the bacteria have a role at early stages during the establishment of the diatom or if they appear later as a result of D. geminata proliferation. Our data serve as a starting point for designing experimental studies to determine the characteristics of the ecological relationships between the microbiota and the microalgae.
The dominant phyla Proteobacteria (Alpha and Beta) and Bacteroidetes showed similar relative abundance and were evenly distributed among the studied ecosystems. This is in line with previous experimental studies that reported the association of Proteobacteria and Bacteroidetes with benthic diatoms (Grossart et al. 2005;Koedooder et al. 2019), suggesting that members of these clades appear to be particularly adapted to bloom conditions, most likely as a result of their ability to degrade complex macromolecules and alga-derived metabolites. The other two phyla found in the core microbiota were Cyanobacteria and Planctomycetes; together, they represented close to 6% of the core microbiota but showed spatial abundance variations. Cyanobacteria, represented by the Leptolyngbya genus, have a higher relative abundance at RS5. Aquatic nitrogen-fixing Cyanobacteria were associated with blooms of D. geminata in New Zealand, suggesting that the success of the diatom could be mediated by indigenous nitrogen-fixing Cyanobacteria (Novis et al. 2016). In contrast to Cyanobacteria distribution, we found that Planctomycetes, represented by bacteria from the Pirellulaceae family, were more abundant at RG and LB than RS. Changes in water temperature between ecosystems might help to explain these relative abundance differences since the RS temperature was below 8.5 °C while in LB and RG temperatures ranged from 10.5 to 14.2 °C. A previous experimental study showed that changes in temperature influenced Planctomycetes abundance, especially that of Pirellulaceae, the relative abundance of which was higher at warmer temperatures (Hicks et al. 2018). Besides, we observed significant differences in turbidity, DOC, DO, NH 4 and silicate concentration between RS and the other two ecosystems; all together, these differences may influence the community composition and abundance of each ecosystem. Indeed, the CCA ordination showed that environmental variables have a strong contribution to the bacterial distribution pattern in which turbidity, DOC, pH, temperature, DO, and silicate are some of the main factors. Interestingly, changes in these factors are related to D. geminata's presence given the carbohydrate growing mats, the silica cell wall and the photosynthetic activity. For instance, it has been shown that turbidity, silicate and total phosphorus were relatively lower at rivers with D. geminata (James et al. 2014). Turbidity strongly influences the light availability and therefore the photosynthetic activity of benthic phototrophs, having a major impact on the presence of D. geminata in streams. Silicate is required for frustule formation and phosphorus was identified as one of the major environmental drivers for D. geminata invasion, since mat formation is inversely related to concentrations of available phosphorus . These factors affect D. geminata growth and excretion, and thus indirectly may influence the associated microbiata. DOC and DO are probably drivers for heterotrophic activity, and thus have potentially a direct influence on microbiota structure. Water pH and temperature might affect both diatom and bacteria. Indeed, it has been established that both water pH and temperature gradients influence D. geminata's broad-scale distribution and capacity for establishing and forming blooms in rivers ). These findings suggest that environmental filtering and biotic interactions play a greater role in bacterial community composition than do dispersal limitations.
At the lower taxonomic level, unidentified genera from the Comamonadaceae family (Betaproteobacteria), distributed among 7 OTUs, were the most abundant members of the core microbiota. A BLAST analysis of the Comamonadaceae reference sequence showed a 99.76% identity with a sequence (GenBank: KM159573.1) previously reported as part of the microbial community associated with the freely floating pumice originated from a volcanic eruption in Lake Espejo and Lake Nahuel Huapi in southern Neuquén Province, Argentina (Elser et al. 2015). Interestingly, D. geminata has been detected since 2011 in several lakes of the Neuquén Province, including Lake Nahuel Huapi (Beamud et al. 2013). The same study reported the presence of bacteria belonging to the Sphingomonadales order and the Alphaproteobacteria Rhodobacter, both of which were also found in our work; they represent 7.6 and 2.5% of the core microbiota, respectively. The Flectobacillus genus from the Cytophagacea family were the second most abundant bacteria of the core microbiota. The reference sequence had a 95.6% identification with bacteria (GenBank: KT714214.1) found in the microbial biofilm of a stream in the Czech Republic (Brablcová et al. 2013) where several rivers have been contaminated with D. geminata since 2002 (Gágyorová and Marvan 2002). Similarly, our findings are consistent with previous data, in which Cytophagaceae together with Proteobacteria and Cyanobacteria were detected in samples of biofilms from rivers with D. geminata in New Zealand (Brandes et al. 2016). This geographical co-occurrence pattern between bacteria and D. geminata in different independent studies points to the possibility of functional interactions between these bacteria and the microalgae and highlights the necessity of further functional studies to establish whether bacteria from these taxonomical groups have a role in D. geminata's bloom.
The Flavobacterium genus, another bacteria from the Bacteroidetes phyla, represents 9.4% of the core microbiota. These bacteria exhibit enormous metabolic diversity; for instance, Flavobacterium often associated with freshwater and marine phytoplankton blooms may degrade algal fatty acids (Eiler and Bertilsson 2007). Microbiota communities work in synchrony, as was demonstrated in a metaproteomic study in water samples from coastal areas of East Antarctica. The authors reported that Proteobacteria can degrade labile compounds produced by Flavobacteria from algal-derived polymers (Williams et al. 2013). Moreover, experimental studies performed with algal strains cultivated in a laboratory suggest that these bacteria can use the available carbon sources produced by algae and, in turn, promote their own grown (Terashima et al. 2017). Now, whether these bacterial chemical processes have an effect specifically on the D. geminata invasive process must be determined.
Bacteria belonging to the Novosphingobium genus were also detected in the three aquatic ecosystems with relative abundances ranging from 0.5% to 2.7%. Novosphingobium are often associated with the biodegradation of aromatic compounds such as phenol (Liu et al. 2005). Didymosphenia geminata is rich in antioxidants like polyphenols (Olivares-Ferretti et al. 2019) that could serve as a carbon source for these bacteria. Bacteria from the Methylotenera genus were significantly represented at the RG2 and RG3 locations; there, the concentrations of nitrate and soluble reactive phosphorus were lower than at the other six locations. Methylotrophic bacteria play a major role in maintaining the balance of one-carbon molecules like methane in aerobic environments. Methylotenera have been shown to perform methane oxidation in arctic lake sediment (He et al. 2012). A BLAST analysis of the reference sequences from these two bacteria once again matched with bacteria reported in Lake Espejo and Lake Naheul Huapi, Argentina (Elser et al. 2015), suggesting that a similar bacterial community was found in freshwater ecosystems from two different countries that had D. geminata as a common factor.
To acquire a more precise picture of the biodiversity of specific environments, metabarcoding and metagenomics approaches offer great advantages over cultivation-dependent traditional methods because less than 2% of bacteria can be cultivated in the laboratory (Wade 2002). The present work explores the potential of metabarcoding methodologies for the study of diversity and abundance of those microorganisms associated with and spread by invasive species; the potential of this tool to study invasive species is globally accepted (Rey et al. 2020;Thakur et al. 2019). Pieces of evidence have shown that invasive species could also work as vectors for dispersing microorganisms to new aquatic ecosystems (Zatoń et al. 2019). The use of this molecular tool greatly facilitates the identification of microorganisms and the determination of their abundance; in the future, it may also be used for early detection in new ecosystem even before the establishment of the invasive species (Comtet et al. 2015;Ficetola et al. 2019). In the context of aquatic invasions by microalgae, assessing the associated microbiota biofilm is the first step in fully understanding the functional interactions between bacteria and microalgae that might play major roles in the establishment and dispersion of the invasive aquatic species.
The following supplementary material is available for this article: Figure S1. Rough OTUs composition plot. Heatmaps showing relative abundance of OTUs at different cut-offs, those representing more than X% of the total sequences. Figure S2. Results of canonical correspondence analysis (CCA). CCA plot depicting how the first five environmental gradients (Table 3) explained some of the variation in the operational taxonomic unit (OTU) relative abundances. Vectors indicating loading of environmental variables are plotted. Table S1. Environmental data. Table S2. Reads per sample. Table S3. Alpha diversity data. Table S4. Beta diversity data.